From llvm-commits at lists.llvm.org Mon Oct 7 00:08:04 2019
From: llvm-commits at lists.llvm.org (James Molloy via llvm-commits)
Date: Mon, 7 Oct 2019 08:08:04 +0100
Subject: [PATCH] D67968: [TableGen] Introduce a generic automaton (DFA) backend
In-Reply-To: <57a5afc34cdc59112423803f4ac59582@localhost.localdomain>
References: <57a5afc34cdc59112423803f4ac59582@localhost.localdomain>
Message-ID:

Thanks Mikael,

That looks easier than plumbing the type name into that code. And thanks for pointing at the bot, I will commit the change and watch it for greenness.

(Just arriving into work, was going to look at this in a few minutes anyway :))

On Mon, 7 Oct 2019, 07:47 Mikael Holmén via Phabricator, <reviews at reviews.llvm.org> wrote:

> uabelho added a comment.
>
> clang-cuda-build buildbot failed too:
>
> http://lab.llvm.org:8011/builders/clang-cuda-build/builds/37865/steps/ninja%20check%201/logs/stdio
>
> I think it can be fixed with
>
> @@ -371,20 +371,20 @@ uint64_t Transition::transitionFrom(uint64_t State) {
>  void CustomDfaEmitter::printActionType(raw_ostream &OS) { OS << TypeName; }
>
>  void CustomDfaEmitter::printActionValue(action_type A, raw_ostream &OS) {
>    const ActionTuple &AT = Actions[A];
>    if (AT.size() > 1)
> -    OS << "{";
> +    OS << "std::make_tuple(";
>    bool First = true;
>    for (const auto &SingleAction : AT) {
>      if (!First)
>        OS << ", ";
>      First = false;
>      SingleAction.print(OS);
>    }
>    if (AT.size() > 1)
> -    OS << "}";
> +    OS << ")";
>  }
>
>  namespace llvm {
>
>  void EmitAutomata(RecordKeeper &RK, raw_ostream &OS) {
>
> similar to the fix in r372384 (169cb6347 <https://reviews.llvm.org/rG169cb63478aa047451786d8ccf6af4b721e3b271>).
>
>
> Repository:
>   rL LLVM
>
> CHANGES SINCE LAST ACTION
>   https://reviews.llvm.org/D67968/new/
>
> https://reviews.llvm.org/D67968

From llvm-commits at lists.llvm.org Mon Oct 7 00:25:04 2019
From: llvm-commits at lists.llvm.org (Tanya Lattner via llvm-commits)
Date: Mon, 07 Oct 2019 07:25:04 -0000
Subject: [www] r373879 - Add online schedule.
Message-ID: <20191007072504.D2F2485C0C@lists.llvm.org>

Author: tbrethou
Date: Mon Oct 7 00:25:04 2019
New Revision: 373879

URL: http://llvm.org/viewvc/llvm-project?rev=373879&view=rev
Log:
Add online schedule.

Modified:
    www/trunk/devmtg/2019-10/index.html

Modified: www/trunk/devmtg/2019-10/index.html
URL: http://llvm.org/viewvc/llvm-project/www/trunk/devmtg/2019-10/index.html?rev=373879&r1=373878&r2=373879&view=diff
==============================================================================
--- www/trunk/devmtg/2019-10/index.html (original)
+++ www/trunk/devmtg/2019-10/index.html Mon Oct 7 00:25:04 2019
@@ -125,7 +125,7 @@ More details will be coming soon but
 Program
-An online schedule with dates/times will be posted closer to the conference.
+The online schedule may be found here: https://llvmdevmtg2019.sched.com. Bookmark this on your mobile device.

 Keynotes

From llvm-commits at lists.llvm.org Mon Oct 7 00:26:49 2019
From: llvm-commits at lists.llvm.org (Mikael Holmén via llvm-commits)
Date: Mon, 7 Oct 2019 07:26:49 +0000
Subject: [PATCH] D67968: [TableGen] Introduce a generic automaton (DFA) backend
In-Reply-To:
References: <57a5afc34cdc59112423803f4ac59582@localhost.localdomain>,
Message-ID:

Sounds good!

Thanks,
Mikael

________________________________________
From: James Molloy
Sent: Monday, October 7, 2019 9:08 AM
To: reviews+D67968+public+7a365629e84f6a39 at reviews.llvm.org
Cc: Tim Northover; daniel_l_sanders at apple.com; david.majnemer at gmail.com; Mikael Holmén; wan.yu at ibm.com; llvm-dev at redking.me.uk; notstina at gmail.com; mgorny at gentoo.org; llvm-commits at lists.llvm.org; jun.l at samsung.com
Subject: Re: [PATCH] D67968: [TableGen] Introduce a generic automaton (DFA) backend

Thanks Mikael,

That looks easier than plumbing the type name into that code. And thanks for pointing at the bot, I will commit the change and watch it for greenness.

(Just arriving into work, was going to look at this in a few minutes anyway :))

On Mon, 7 Oct 2019, 07:47 Mikael Holmén via Phabricator wrote:

uabelho added a comment.

clang-cuda-build buildbot failed too:

http://lab.llvm.org:8011/builders/clang-cuda-build/builds/37865/steps/ninja%20check%201/logs/stdio

I think it can be fixed with

@@ -371,20 +371,20 @@ uint64_t Transition::transitionFrom(uint64_t State) {
 void CustomDfaEmitter::printActionType(raw_ostream &OS) { OS << TypeName; }

 void CustomDfaEmitter::printActionValue(action_type A, raw_ostream &OS) {
   const ActionTuple &AT = Actions[A];
   if (AT.size() > 1)
-    OS << "{";
+    OS << "std::make_tuple(";
   bool First = true;
   for (const auto &SingleAction : AT) {
     if (!First)
       OS << ", ";
     First = false;
     SingleAction.print(OS);
   }
   if (AT.size() > 1)
-    OS << "}";
+    OS << ")";
 }

 namespace llvm {

 void EmitAutomata(RecordKeeper &RK, raw_ostream &OS) {

similar to the fix in r372384 (169cb6347).
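
The change quoted above swaps a braced initializer for std::make_tuple in the text that the emitter prints. A plausible reason, stated here as an assumption and not taken from the thread, is that some older standard libraries declare the std::tuple constructor explicit, so a generated table entry such as {1, 2} fails to compile there while std::make_tuple(1, 2) does not. The sketch below mirrors the emitter logic with std::string in place of raw_ostream; formatActionValue, ActionTy and kActions are hypothetical names, not part of the patch.

```cpp
#include <string>
#include <tuple>
#include <vector>

// Hypothetical stand-in for CustomDfaEmitter::printActionValue: builds the
// initializer text for one action. Actions with more than one element are
// wrapped in std::make_tuple(...) instead of braces.
std::string formatActionValue(const std::vector<std::string> &Parts) {
  std::string Out;
  if (Parts.size() > 1)
    Out += "std::make_tuple(";
  bool First = true;
  for (const std::string &Part : Parts) {
    if (!First)
      Out += ", ";
    First = false;
    Out += Part;
  }
  if (Parts.size() > 1)
    Out += ")";
  return Out;
}

// Illustrative shape of a generated table after the change (assumed, not
// copied from real TableGen output). std::make_tuple returns a tuple value
// directly, so no implicit conversion from a braced list is needed.
using ActionTy = std::tuple<int, int>;
static const ActionTy kActions[] = {
    std::make_tuple(1, 2),
    std::make_tuple(3, 4),
};
```

A single-element action is returned unwrapped, which matches the AT.size() > 1 guard in the quoted patch.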

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D67968/new/

https://reviews.llvm.org/D67968

From llvm-commits at lists.llvm.org Mon Oct 7 00:31:50 2019
From: llvm-commits at lists.llvm.org (Djordje Todorovic via llvm-commits)
Date: Mon, 07 Oct 2019 07:31:50 -0000
Subject: [llvm] r373880 - [llvm-locstats] Fix a typo in the documentation; NFC
Message-ID: <20191007073150.14B49862A9@lists.llvm.org>

Author: djtodoro
Date: Mon Oct 7 00:31:49 2019
New Revision: 373880

URL: http://llvm.org/viewvc/llvm-project?rev=373880&view=rev
Log:
[llvm-locstats] Fix a typo in the documentation; NFC

Modified:
    llvm/trunk/docs/CommandGuide/llvm-locstats.rst

Modified: llvm/trunk/docs/CommandGuide/llvm-locstats.rst
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/docs/CommandGuide/llvm-locstats.rst?rev=373880&r1=373879&r2=373880&view=diff
==============================================================================
--- llvm/trunk/docs/CommandGuide/llvm-locstats.rst (original)
+++ llvm/trunk/docs/CommandGuide/llvm-locstats.rst Mon Oct 7 00:31:49 2019
@@ -60,7 +60,7 @@ OUTPUT EXAMPLE
    20-29%  0  0%
    30-39%  0  0%
    40-49%  0  0%
-   50-99%  1  16%
+   50-59%  1  16%
    60-69%  0  0%
    70-79%  0  0%
    80-89%  1  16%

From llvm-commits at lists.llvm.org Mon Oct 7 00:32:18 2019
From: llvm-commits at lists.llvm.org (Fangrui Song via Phabricator via llvm-commits)
Date: Mon, 07 Oct 2019 07:32:18 +0000 (UTC)
Subject: [PATCH] D68477: Add an off-by-default option to enable testing for gdb pretty printers.
In-Reply-To:
References:
Message-ID: <2bea5187816fc632bd3d241e6eff1799@localhost.localdomain>

MaskRay added a comment.

I think the latest code is Python 3 compatible. The problem is that STL on some platforms have different behaviors.
http://lab.llvm.org:8011/builders/libcxx-libcxxabi-libunwind-armv7-linux/builds/1050/steps/test.libcxx/logs/FAIL%3A%20libc%2B%2B%3A%3Agdb_pretty_printer_test.sh.cpp GDB printed: 'std::bitset<15> = {[2] = 1, [3] = 1, [4] = 1, [5] = 1, [6] = 1, [7] = 1, [8] = 1, [9] = 1, [10] = 1, [11] = 1, [12] = 1, [13] = 1, [14] = 1, [15] = 1, [16] = 1, [17] = 1, [18] = 1, [19] = 1, [20] = 1, [21] = 1, [22] = 1, [23] = 1, [24] = 1, [25] = 1, [26] = 1, [27] = 1, [28] = 1, [29] = 1, [30] = 1, [31] = 1, [32] = 1, [33] = 1, [34] = 1, [35] = 1, [36] = 1, [37] = 1, [38] = 1, [39] = 1, [40] = 1, [41] = 1, [42] = 1, [43] = 1, [44] = 1, [45] = 1, [46] = 1, [47] = 1, [48] = 1, [49] = 1, [50] = 1, [51] = 1, [52] = 1, [53] = 1, [54] = 1, [55] = 1, [56] = 1, [57] = 1, [58] = 1, [59] = 1, [60] = 1, [61] = 1, [62] = 1, [63] = 1, [64] = 1, [65] = 1, [66] = 1, [67] = 1, [68] = 1, [69] = 1, [70] = 1, [71] = 1, [72] = 1, [73] = 1, [74] = 1, [75] = 1, [76] = 1, [77] = 1, [78] = 1, [79] = 1, [80] = 1, [81] = 1, [82] = 1, [83] = 1, [84] = 1, [85] = 1, [86] = 1, [87] = 1, [88] = 1, [89] = 1, [90] = 1, [91] = 1, [92] = 1, [93] = 1, [94] = 1, [95] = 1, [96] = 1, [97] = 1, [98] = 1, [99] = 1, [100] = 1, [101] = 1, [102] = 1, [103] = 1, [104] = 1, [105] = 1, [106] = 1, [107] = 1, [108] = 1, [109] = 1, [110] = 1, [111] = 1, [112] = 1, [113] = 1, [114] = 1, [115] = 1, [116] = 1, [117] = 1, [118] = 1, [119] = 1, [120] = 1, [121] = 1, [122] = 1, [123] = 1, [124] = 1, [125] = 1, [126] = 1, [127] = 1, [128] = 1, [129] = 1, [130] = 1, [131] = 1, [132] = 1, [133] = 1, [134] = 1, [135] = 1, [136] = 1, [137] = 1, [138] = 1, [139] = 1, [140] = 1, [141] = 1, [142] = 1, [143] = 1, [144] = 1, [145] = 1, [146] = 1, [147] = 1, [148] = 1, [149] = 1, [150] = 1, [151] = 1, [152] = 1, [153] = 1, [154] = 1, [155] = 1, [156] = 1, [157] = 1, [158] = 1, [159] = 1, [160] = 1, [161] = 1, [162] = 1, [163] = 1, [164] = 1, [165] = 1, [166] = 1, [167] = 1, [168] = 1, [169] = 1, [170] = 1, [171] = 1, [172] = 1, [173] = 1, [174] = 1, 
[175] = 1, [176] = 1, [177] = 1, [178] = 1, [179] = 1, [180] = 1, [181] = 1, [182] = 1, [183] = 1, [184] = 1, [185] = 1, [186] = 1, [187] = 1, [188] = 1, [189] = 1, [190] = 1, [191] = 1, [192] = 1, [193] = 1, [194] = 1, [195] = 1, [196] = 1, [197] = 1, [198] = 1, [199] = 1, [200] = 1, [201] = 1...}' Value should match: 'std::bitset<15> = {[2] = 1, [3] = 1, [4] = 1, [5] = 1, [6] = 1, [7] = 1, [8] = 1, [9] = 1}' Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68477/new/ https://reviews.llvm.org/D68477 From llvm-commits at lists.llvm.org Mon Oct 7 00:41:26 2019 From: llvm-commits at lists.llvm.org (David Stuttard via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 07:41:26 +0000 (UTC) Subject: [PATCH] D51932: [AMDGPU] Fix-up cases where writelane has 2 SGPR operands In-Reply-To: References: Message-ID: <7001248da0c0cbe348aceda055ce395b@localhost.localdomain> dstuttard added a comment. ping Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D51932/new/ https://reviews.llvm.org/D51932 From llvm-commits at lists.llvm.org Mon Oct 7 00:45:39 2019 From: llvm-commits at lists.llvm.org (=?utf-8?q?Christian_K=C3=BChnel_via_Phabricator?= via llvm-commits) Date: Mon, 07 Oct 2019 07:45:39 +0000 (UTC) Subject: [PATCH] D68560: frist test commit for build server Message-ID: kuhnel created this revision. Herald added a project: LLVM. Herald added a subscriber: llvm-commits. test the interaction with the build server. DO NOT MERGE! Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D68560 Files: DELETEME.txt Index: DELETEME.txt =================================================================== --- /dev/null +++ DELETEME.txt @@ -0,0 +1,4 @@ +just for testing. delete this file if you see it... + + + -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68560.223452.patch Type: text/x-patch Size: 194 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 00:47:52 2019 From: llvm-commits at lists.llvm.org (=?utf-8?q?Mikael_Holm=C3=A9n_via_Phabricator?= via llvm-commits) Date: Mon, 07 Oct 2019 07:47:52 +0000 (UTC) Subject: [PATCH] D68460: [MachineSink] Don't preserve MachineLoopInfo In-Reply-To: References: Message-ID: <408d75374b7e1e941521601630e30242@localhost.localdomain> uabelho added a comment. In D68460#1695283 , @kuhar wrote: > Would it be possible to add a `verifyAnalysis` function to MLI and check if a freshly calculated one matches the 'preserved' one? I don't know but it sounds like a good idea to me. Not sure I can find the time to dig into that myself though (especially since I don't really know MLI). CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68460/new/ https://reviews.llvm.org/D68460 From llvm-commits at lists.llvm.org Mon Oct 7 00:47:59 2019 From: llvm-commits at lists.llvm.org (Fangrui Song via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 07:47:59 +0000 (UTC) Subject: [PATCH] D68561: [ELF][MIPS] Use lld::elf::{read,write}* instead of llvm::support::endian::{read,write}* Message-ID: MaskRay created this revision. MaskRay added reviewers: atanasyan, ruiu. Herald added subscribers: llvm-commits, jrtc27, arichardson, sdardis, emaste. Herald added a reviewer: espindola. Herald added a project: LLVM. This allows us to delete `using namespace llvm::support::endian` and simplify D68323 . This change adds runtime config->endianness check but the overhead should be negligible. Repository: rLLD LLVM Linker https://reviews.llvm.org/D68561 Files: ELF/Arch/Mips.cpp -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68561.223453.patch Type: text/x-patch Size: 11893 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 00:52:09 2019 From: llvm-commits at lists.llvm.org (Rui Ueyama via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 07:52:09 +0000 (UTC) Subject: [PATCH] D68561: [ELF][MIPS] Use lld::elf::{read,write}* instead of llvm::support::endian::{read,write}* In-Reply-To: References: Message-ID: <9d6d8afeb8eba304bc3f56f542248cb5@localhost.localdomain> ruiu accepted this revision. ruiu added a comment. This revision is now accepted and ready to land. LGTM I think you can de-template some functions now, but that can be done in a follow-up patch. Repository: rLLD LLVM Linker CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68561/new/ https://reviews.llvm.org/D68561 From llvm-commits at lists.llvm.org Mon Oct 7 01:00:46 2019 From: llvm-commits at lists.llvm.org (Kristof Beyls via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 08:00:46 +0000 (UTC) Subject: [PATCH] D29011: [IR] Add Freeze instruction In-Reply-To: References: Message-ID: <027b94563f15b784d5d4721dde28b350@localhost.localdomain> kristof.beyls added a comment. Given this seems to add a new IR instruction, shouldn't there also be good quality documentation for this new instruction in docs/LangRef.rst? CHANGES SINCE LAST ACTION https://reviews.llvm.org/D29011/new/ https://reviews.llvm.org/D29011 From llvm-commits at lists.llvm.org Mon Oct 7 01:01:52 2019 From: llvm-commits at lists.llvm.org (Roman Lebedev via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 08:01:52 +0000 (UTC) Subject: [PATCH] D29011: [IR] Add Freeze instruction In-Reply-To: References: Message-ID: <413ae3d98a1edf3a11e32494dc3dd881@localhost.localdomain> lebedev.ri added a comment. In D29011#1696898 , @kristof.beyls wrote: > Given this seems to add a new IR instruction, shouldn't there also be good quality documentation for this new instruction in docs/LangRef.rst? 

See D29121

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D29011/new/

https://reviews.llvm.org/D29011

From llvm-commits at lists.llvm.org Mon Oct 7 01:06:37 2019
From: llvm-commits at lists.llvm.org (Sam Parker via Phabricator via llvm-commits)
Date: Mon, 07 Oct 2019 08:06:37 +0000 (UTC)
Subject: [PATCH] D68461: [ARM][MVE] Enable truncating masked stores
In-Reply-To:
References:
Message-ID: <8f47c9109f79c5b090caff78bd59f232@localhost.localdomain>

samparker updated this revision to Diff 223455.
samparker added a comment.

Rebased and added a '1' shift value for strh.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68461/new/

https://reviews.llvm.org/D68461

Files:
  lib/Target/ARM/ARMInstrMVE.td
  lib/Target/ARM/ARMTargetTransformInfo.cpp
  test/CodeGen/Thumb2/mve-masked-store.ll

From llvm-commits at lists.llvm.org Mon Oct 7 01:09:57 2019
From: llvm-commits at lists.llvm.org (Tanya Lattner via llvm-commits)
Date: Mon, 07 Oct 2019 08:09:57 -0000
Subject: [www] r373881 - Fix up round table registration.
Message-ID: <20191007080957.5C3BB8383E@lists.llvm.org>

Author: tbrethou
Date: Mon Oct 7 01:09:57 2019
New Revision: 373881

URL: http://llvm.org/viewvc/llvm-project?rev=373881&view=rev
Log:
Fix up round table registration.

Modified:
    www/trunk/devmtg/2019-10/index.html

Modified: www/trunk/devmtg/2019-10/index.html
URL: http://llvm.org/viewvc/llvm-project/www/trunk/devmtg/2019-10/index.html?rev=373881&r1=373880&r2=373881&view=diff
==============================================================================
--- www/trunk/devmtg/2019-10/index.html (original)
+++ www/trunk/devmtg/2019-10/index.html Mon Oct 7 01:09:57 2019
@@ -590,9 +590,9 @@ More details will be coming soon but

-Travel Grants for Students
+Round Tables

-https://forms.gle/EaFWhzeyJK6AYHus9Round Table Registration
+Round Table Registration

Round tables are informal get togethers where attendees talk about a specific subject. Each group will be given a round table and a flip chart to discuss and brainstorm ideas on a specific topic. If you are interested in organizing a round table discussion, please use the link above.

From llvm-commits at lists.llvm.org Mon Oct 7 01:09:32 2019
From: llvm-commits at lists.llvm.org (Sam Parker via Phabricator via llvm-commits)
Date: Mon, 07 Oct 2019 08:09:32 +0000 (UTC)
Subject: [PATCH] D68337: [ARM][MVE] Enable extending masked loads
In-Reply-To:
References:
Message-ID:

samparker updated this revision to Diff 223456.
samparker added a comment.

I had missed the shift value on the input patterns.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68337/new/

https://reviews.llvm.org/D68337

Files:
  lib/CodeGen/SelectionDAG/DAGCombiner.cpp
  lib/Target/ARM/ARMISelLowering.cpp
  lib/Target/ARM/ARMInstrMVE.td
  lib/Target/ARM/ARMTargetTransformInfo.cpp
  test/CodeGen/Thumb2/LowOverheadLoops/mve-tail-data-types.ll
  test/CodeGen/Thumb2/mve-masked-ldst.ll
  test/CodeGen/Thumb2/mve-masked-load.ll

From llvm-commits at lists.llvm.org Mon Oct 7 01:12:02 2019
From: llvm-commits at lists.llvm.org (Fangrui Song via Phabricator via llvm-commits)
Date: Mon, 07 Oct 2019 08:12:02 +0000 (UTC)
Subject: [PATCH] D67867: [libc] Add few docs and implementation of strcpy and strcat.
In-Reply-To:
References:
Message-ID: <30450ffef74004f163ecd501add712fe@localhost.localdomain>

MaskRay added a comment.

The commit was done in a hurry. Many points raised in the review process were just shrugged off.

- Proper cmake review
- Detailed summary. The commit message should at least reference some previous discussions on the mailing list, especially this is a brand new project.
- The llvm-objcopy issue definitely needs more consideration. This may interfere badly with instrumentation tools, which is a selling point of the llvm libc.
- Why `__llvm_libc` is necessary is not well explained.
- Some necessary options `-ffreestanding -nostdinc` are absent.
- C++ should not get `#define __restrict restrict`
- ...

This is a post-commit review anyway so many points are probably moot.

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D67867/new/

https://reviews.llvm.org/D67867

From llvm-commits at lists.llvm.org Mon Oct 7 01:12:45 2019
From: llvm-commits at lists.llvm.org (Hideto Ueno via Phabricator via llvm-commits)
Date: Mon, 07 Oct 2019 08:12:45 +0000 (UTC)
Subject: [PATCH] D65402: [Attributor][MustExec] Deduce dereferenceable and nonnull attribute using MustBeExecutedContextExplorer
In-Reply-To:
References:
Message-ID: <10cea7fb1de9a7dd3c40f12cdd01c9bd@localhost.localdomain>

uenoku added a comment.

In D65402#1696345, @jdoerfert wrote:

> In D65402#1696036, @uenoku wrote:
>
> > In D65402#1695253, @jdoerfert wrote:
> >
> > > In D65402#1694872, @uenoku wrote:
> > >
> > > > Currently, any use is not tracked so nothing about A[1] or A[-2] is deduced.
> > >
> > > As long as we have a test that is fine.
> > >
> > > > This would be solved once making it track gep instruction. But beforehand, I strongly suggest separating deduction for known/assumption respectively.
> > >
> > > What do you mean by the separation part?
> >
> > I mean, running two different deduction schemes (known, assumption) might cause an unpredictable result.
> >
> >   define i32* @test_for_minus_index(i32* %p) {
> >     %q = getelementptr inbounds i32, i32* %p, i32 -2
> >     store i32 1, i32* %q
> >     ret i32* %q
> >   }
> >
> > AANonNullArgument is composed of AAArgumentFromCallSiteArguments, AAFromMustBeExecutedContext.
> > Assume that gep is tracked in `followUse`.
> >
> > Iteration 1 :
> >
> > - AAFromMustBeExecutedContext will traverse uses of `%p` and prepare uses of `%q` for next iteration.
> > - AAArgumentFromCallSiteArguments will call `indicatePessimisticFixpoint` because the function is not `internal` function. AANonNullArgument has already reached to pessimistic fixpoint so nonnull won't be deduced.
> >
> > This example is so simple that we can debug them but it is hard to debug more complex ones.
>
> I see. You can explore uses exhaustively though that is a "local" solution to a general problem.
> I think we need to keep known & assumed together but we should provide a way for AAs that have multiple deduction strategies to exhaust them separately, e.g., `AAArgumentFromCallSiteArguments` is known to have 2 schemes so we should track their "fixpoints" separate somehow.
> In addition, or as an alternative, we could allow updates for AAs in a fixpoint if they opt-in to it. They would do so if they can improve based on known-information around them.

Looks good. Anyway, the current code maintains soundness (a "general" solution might not be reached) so I'll commit it if there is no problem.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D65402/new/

https://reviews.llvm.org/D65402

From llvm-commits at lists.llvm.org Mon Oct 7 01:21:38 2019
From: llvm-commits at lists.llvm.org (Martin Storsjo via llvm-commits)
Date: Mon, 07 Oct 2019 08:21:38 -0000
Subject: [llvm] r373882 - Revert "[SLP] avoid reduction transform on patterns that the backend can load-combine"
Message-ID: <20191007082138.1A17385FC7@lists.llvm.org>

Author: mstorsjo
Date: Mon Oct 7 01:21:37 2019
New Revision: 373882

URL: http://llvm.org/viewvc/llvm-project?rev=373882&view=rev
Log:
Revert "[SLP] avoid reduction transform on patterns that the backend can load-combine"

This reverts SVN r373833, as it caused a failed assert "Non-zero loop cost expected" on building numerous projects, see PR43582 for details and reproduction samples.

Modified: llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h llvm/trunk/lib/Analysis/TargetTransformInfo.cpp llvm/trunk/lib/Transforms/Vectorize/SLPVectorizer.cpp llvm/trunk/test/Transforms/SLPVectorizer/X86/bad-reduction.ll Modified: llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h?rev=373882&r1=373881&r2=373882&view=diff ============================================================================== --- llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h (original) +++ llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h Mon Oct 7 01:21:37 2019 @@ -1129,16 +1129,6 @@ private: /// Returns -1 if the cost is unknown. int getInstructionThroughput(const Instruction *I) const; - /// Given an input value that is an element of an 'or' reduction, check if the - /// reduction is composed of narrower loaded values. Assuming that a - /// legal-sized reduction of shifted/zexted loaded values can be load combined - /// in the backend, create a relative cost that accounts for the removal of - /// the intermediate ops and replacement by a single wide load. - /// TODO: If load combining is allowed in the IR optimizer, this analysis - /// may not be necessary. - Optional getLoadCombineCost(unsigned Opcode, - ArrayRef Args) const; - /// The abstract base class used to type erase specific TTI /// implementations. 
class Concept; Modified: llvm/trunk/lib/Analysis/TargetTransformInfo.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/TargetTransformInfo.cpp?rev=373882&r1=373881&r2=373882&view=diff ============================================================================== --- llvm/trunk/lib/Analysis/TargetTransformInfo.cpp (original) +++ llvm/trunk/lib/Analysis/TargetTransformInfo.cpp Mon Oct 7 01:21:37 2019 @@ -571,64 +571,11 @@ TargetTransformInfo::getOperandInfo(Valu return OpInfo; } -Optional -TargetTransformInfo::getLoadCombineCost(unsigned Opcode, - ArrayRef Args) const { - if (Opcode != Instruction::Or) - return llvm::None; - if (Args.empty()) - return llvm::None; - - // Look past the reduction to find a source value. Arbitrarily follow the - // path through operand 0 of any 'or'. Also, peek through optional - // shift-left-by-constant. - const Value *ZextLoad = Args.front(); - while (match(ZextLoad, m_Or(m_Value(), m_Value())) || - match(ZextLoad, m_Shl(m_Value(), m_Constant()))) - ZextLoad = cast(ZextLoad)->getOperand(0); - - // Check if the input to the reduction is an extended load. - Value *LoadPtr; - if (!match(ZextLoad, m_ZExt(m_Load(m_Value(LoadPtr))))) - return llvm::None; - - // Require that the total load bit width is a legal integer type. - // For example, <8 x i8> --> i64 is a legal integer on a 64-bit target. - // But <16 x i8> --> i128 is not, so the backend probably can't reduce it. - Type *WideType = ZextLoad->getType(); - Type *EltType = LoadPtr->getType()->getPointerElementType(); - unsigned WideWidth = WideType->getIntegerBitWidth(); - unsigned EltWidth = EltType->getIntegerBitWidth(); - if (!isTypeLegal(WideType) || WideWidth % EltWidth != 0) - return llvm::None; - - // Calculate relative cost: {narrow load+zext+shl+or} are assumed to be - // removed and replaced by a single wide load. - // FIXME: This is not accurate for the larger pattern where we replace - // multiple narrow load sequences with just 1 wide load. 
We could - // remove the addition of the wide load cost here and expect the caller - // to make an adjustment for that. - int Cost = 0; - Cost -= getMemoryOpCost(Instruction::Load, EltType, 0, 0); - Cost -= getCastInstrCost(Instruction::ZExt, WideType, EltType); - Cost -= getArithmeticInstrCost(Instruction::Shl, WideType); - Cost -= getArithmeticInstrCost(Instruction::Or, WideType); - Cost += getMemoryOpCost(Instruction::Load, WideType, 0, 0); - return Cost; -} - - int TargetTransformInfo::getArithmeticInstrCost( unsigned Opcode, Type *Ty, OperandValueKind Opd1Info, OperandValueKind Opd2Info, OperandValueProperties Opd1PropInfo, OperandValueProperties Opd2PropInfo, ArrayRef Args) const { - // Check if we can match this instruction as part of a larger pattern. - Optional LoadCombineCost = getLoadCombineCost(Opcode, Args); - if (LoadCombineCost) - return LoadCombineCost.getValue(); - - // Fallback to implementation-specific overrides or base class. int Cost = TTIImpl->getArithmeticInstrCost(Opcode, Ty, Opd1Info, Opd2Info, Opd1PropInfo, Opd2PropInfo, Args); assert(Cost >= 0 && "TTI should not produce negative costs!"); Modified: llvm/trunk/lib/Transforms/Vectorize/SLPVectorizer.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Vectorize/SLPVectorizer.cpp?rev=373882&r1=373881&r2=373882&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Vectorize/SLPVectorizer.cpp (original) +++ llvm/trunk/lib/Transforms/Vectorize/SLPVectorizer.cpp Mon Oct 7 01:21:37 2019 @@ -6499,19 +6499,10 @@ private: int ScalarReduxCost = 0; switch (ReductionData.getKind()) { - case RK_Arithmetic: { - // Note: Passing in the reduction operands allows the cost model to match - // load combining patterns for this reduction. 
- auto *ReduxInst = cast(ReductionRoot); - SmallVector OperandList; - for (Value *Operand : ReduxInst->operands()) - OperandList.push_back(Operand); - ScalarReduxCost = TTI->getArithmeticInstrCost(ReductionData.getOpcode(), - ScalarTy, TargetTransformInfo::OK_AnyValue, - TargetTransformInfo::OK_AnyValue, TargetTransformInfo::OP_None, - TargetTransformInfo::OP_None, OperandList); + case RK_Arithmetic: + ScalarReduxCost = + TTI->getArithmeticInstrCost(ReductionData.getOpcode(), ScalarTy); break; - } case RK_Min: case RK_Max: case RK_UMin: Modified: llvm/trunk/test/Transforms/SLPVectorizer/X86/bad-reduction.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/SLPVectorizer/X86/bad-reduction.ll?rev=373882&r1=373881&r2=373882&view=diff ============================================================================== --- llvm/trunk/test/Transforms/SLPVectorizer/X86/bad-reduction.ll (original) +++ llvm/trunk/test/Transforms/SLPVectorizer/X86/bad-reduction.ll Mon Oct 7 01:21:37 2019 @@ -15,37 +15,31 @@ define i64 @load_bswap(%v8i8* %p) { ; CHECK-NEXT: [[G5:%.*]] = getelementptr inbounds [[V8I8]], %v8i8* [[P]], i64 0, i32 5 ; CHECK-NEXT: [[G6:%.*]] = getelementptr inbounds [[V8I8]], %v8i8* [[P]], i64 0, i32 6 ; CHECK-NEXT: [[G7:%.*]] = getelementptr inbounds [[V8I8]], %v8i8* [[P]], i64 0, i32 7 -; CHECK-NEXT: [[T0:%.*]] = load i8, i8* [[G0]] -; CHECK-NEXT: [[T1:%.*]] = load i8, i8* [[G1]] -; CHECK-NEXT: [[T2:%.*]] = load i8, i8* [[G2]] -; CHECK-NEXT: [[T3:%.*]] = load i8, i8* [[G3]] +; CHECK-NEXT: [[TMP1:%.*]] = bitcast i8* [[G0]] to <4 x i8>* +; CHECK-NEXT: [[TMP2:%.*]] = load <4 x i8>, <4 x i8>* [[TMP1]], align 1 ; CHECK-NEXT: [[T4:%.*]] = load i8, i8* [[G4]] ; CHECK-NEXT: [[T5:%.*]] = load i8, i8* [[G5]] ; CHECK-NEXT: [[T6:%.*]] = load i8, i8* [[G6]] ; CHECK-NEXT: [[T7:%.*]] = load i8, i8* [[G7]] -; CHECK-NEXT: [[Z0:%.*]] = zext i8 [[T0]] to i64 -; CHECK-NEXT: [[Z1:%.*]] = zext i8 [[T1]] to i64 -; CHECK-NEXT: [[Z2:%.*]] = zext i8 [[T2]] to i64 -; 
CHECK-NEXT: [[Z3:%.*]] = zext i8 [[T3]] to i64 +; CHECK-NEXT: [[TMP3:%.*]] = zext <4 x i8> [[TMP2]] to <4 x i64> ; CHECK-NEXT: [[Z4:%.*]] = zext i8 [[T4]] to i64 ; CHECK-NEXT: [[Z5:%.*]] = zext i8 [[T5]] to i64 ; CHECK-NEXT: [[Z6:%.*]] = zext i8 [[T6]] to i64 ; CHECK-NEXT: [[Z7:%.*]] = zext i8 [[T7]] to i64 -; CHECK-NEXT: [[SH0:%.*]] = shl nuw i64 [[Z0]], 56 -; CHECK-NEXT: [[SH1:%.*]] = shl nuw nsw i64 [[Z1]], 48 -; CHECK-NEXT: [[SH2:%.*]] = shl nuw nsw i64 [[Z2]], 40 -; CHECK-NEXT: [[SH3:%.*]] = shl nuw nsw i64 [[Z3]], 32 +; CHECK-NEXT: [[TMP4:%.*]] = shl nuw <4 x i64> [[TMP3]], ; CHECK-NEXT: [[SH4:%.*]] = shl nuw nsw i64 [[Z4]], 24 ; CHECK-NEXT: [[SH5:%.*]] = shl nuw nsw i64 [[Z5]], 16 ; CHECK-NEXT: [[SH6:%.*]] = shl nuw nsw i64 [[Z6]], 8 -; CHECK-NEXT: [[OR01:%.*]] = or i64 [[SH0]], [[SH1]] -; CHECK-NEXT: [[OR012:%.*]] = or i64 [[OR01]], [[SH2]] -; CHECK-NEXT: [[OR0123:%.*]] = or i64 [[OR012]], [[SH3]] -; CHECK-NEXT: [[OR01234:%.*]] = or i64 [[OR0123]], [[SH4]] -; CHECK-NEXT: [[OR012345:%.*]] = or i64 [[OR01234]], [[SH5]] -; CHECK-NEXT: [[OR0123456:%.*]] = or i64 [[OR012345]], [[SH6]] -; CHECK-NEXT: [[OR01234567:%.*]] = or i64 [[OR0123456]], [[Z7]] -; CHECK-NEXT: ret i64 [[OR01234567]] +; CHECK-NEXT: [[RDX_SHUF:%.*]] = shufflevector <4 x i64> [[TMP4]], <4 x i64> undef, <4 x i32> +; CHECK-NEXT: [[BIN_RDX:%.*]] = or <4 x i64> [[TMP4]], [[RDX_SHUF]] +; CHECK-NEXT: [[RDX_SHUF1:%.*]] = shufflevector <4 x i64> [[BIN_RDX]], <4 x i64> undef, <4 x i32> +; CHECK-NEXT: [[BIN_RDX2:%.*]] = or <4 x i64> [[BIN_RDX]], [[RDX_SHUF1]] +; CHECK-NEXT: [[TMP5:%.*]] = extractelement <4 x i64> [[BIN_RDX2]], i32 0 +; CHECK-NEXT: [[TMP6:%.*]] = or i64 [[TMP5]], [[SH4]] +; CHECK-NEXT: [[TMP7:%.*]] = or i64 [[TMP6]], [[SH5]] +; CHECK-NEXT: [[TMP8:%.*]] = or i64 [[TMP7]], [[SH6]] +; CHECK-NEXT: [[OP_EXTRA:%.*]] = or i64 [[TMP8]], [[Z7]] +; CHECK-NEXT: ret i64 [[OP_EXTRA]] ; %g0 = getelementptr inbounds %v8i8, %v8i8* %p, i64 0, i32 0 %g1 = getelementptr inbounds %v8i8, %v8i8* %p, i64 0, i32 
1 @@ -103,38 +97,18 @@ define i64 @load_bswap_nop_shift(%v8i8* ; CHECK-NEXT: [[G5:%.*]] = getelementptr inbounds [[V8I8]], %v8i8* [[P]], i64 0, i32 5 ; CHECK-NEXT: [[G6:%.*]] = getelementptr inbounds [[V8I8]], %v8i8* [[P]], i64 0, i32 6 ; CHECK-NEXT: [[G7:%.*]] = getelementptr inbounds [[V8I8]], %v8i8* [[P]], i64 0, i32 7 -; CHECK-NEXT: [[T0:%.*]] = load i8, i8* [[G0]] -; CHECK-NEXT: [[T1:%.*]] = load i8, i8* [[G1]] -; CHECK-NEXT: [[T2:%.*]] = load i8, i8* [[G2]] -; CHECK-NEXT: [[T3:%.*]] = load i8, i8* [[G3]] -; CHECK-NEXT: [[T4:%.*]] = load i8, i8* [[G4]] -; CHECK-NEXT: [[T5:%.*]] = load i8, i8* [[G5]] -; CHECK-NEXT: [[T6:%.*]] = load i8, i8* [[G6]] -; CHECK-NEXT: [[T7:%.*]] = load i8, i8* [[G7]] -; CHECK-NEXT: [[Z0:%.*]] = zext i8 [[T0]] to i64 -; CHECK-NEXT: [[Z1:%.*]] = zext i8 [[T1]] to i64 -; CHECK-NEXT: [[Z2:%.*]] = zext i8 [[T2]] to i64 -; CHECK-NEXT: [[Z3:%.*]] = zext i8 [[T3]] to i64 -; CHECK-NEXT: [[Z4:%.*]] = zext i8 [[T4]] to i64 -; CHECK-NEXT: [[Z5:%.*]] = zext i8 [[T5]] to i64 -; CHECK-NEXT: [[Z6:%.*]] = zext i8 [[T6]] to i64 -; CHECK-NEXT: [[Z7:%.*]] = zext i8 [[T7]] to i64 -; CHECK-NEXT: [[SH0:%.*]] = shl nuw i64 [[Z0]], 56 -; CHECK-NEXT: [[SH1:%.*]] = shl nuw nsw i64 [[Z1]], 48 -; CHECK-NEXT: [[SH2:%.*]] = shl nuw nsw i64 [[Z2]], 40 -; CHECK-NEXT: [[SH3:%.*]] = shl nuw nsw i64 [[Z3]], 32 -; CHECK-NEXT: [[SH4:%.*]] = shl nuw nsw i64 [[Z4]], 24 -; CHECK-NEXT: [[SH5:%.*]] = shl nuw nsw i64 [[Z5]], 16 -; CHECK-NEXT: [[SH6:%.*]] = shl nuw nsw i64 [[Z6]], 8 -; CHECK-NEXT: [[SH7:%.*]] = shl nuw nsw i64 [[Z7]], 0 -; CHECK-NEXT: [[OR01:%.*]] = or i64 [[SH0]], [[SH1]] -; CHECK-NEXT: [[OR012:%.*]] = or i64 [[OR01]], [[SH2]] -; CHECK-NEXT: [[OR0123:%.*]] = or i64 [[OR012]], [[SH3]] -; CHECK-NEXT: [[OR01234:%.*]] = or i64 [[OR0123]], [[SH4]] -; CHECK-NEXT: [[OR012345:%.*]] = or i64 [[OR01234]], [[SH5]] -; CHECK-NEXT: [[OR0123456:%.*]] = or i64 [[OR012345]], [[SH6]] -; CHECK-NEXT: [[OR01234567:%.*]] = or i64 [[OR0123456]], [[SH7]] -; CHECK-NEXT: ret i64 
[[OR01234567]] +; CHECK-NEXT: [[TMP1:%.*]] = bitcast i8* [[G0]] to <8 x i8>* +; CHECK-NEXT: [[TMP2:%.*]] = load <8 x i8>, <8 x i8>* [[TMP1]], align 1 +; CHECK-NEXT: [[TMP3:%.*]] = zext <8 x i8> [[TMP2]] to <8 x i64> +; CHECK-NEXT: [[TMP4:%.*]] = shl nuw <8 x i64> [[TMP3]], +; CHECK-NEXT: [[RDX_SHUF:%.*]] = shufflevector <8 x i64> [[TMP4]], <8 x i64> undef, <8 x i32> +; CHECK-NEXT: [[BIN_RDX:%.*]] = or <8 x i64> [[TMP4]], [[RDX_SHUF]] +; CHECK-NEXT: [[RDX_SHUF1:%.*]] = shufflevector <8 x i64> [[BIN_RDX]], <8 x i64> undef, <8 x i32> +; CHECK-NEXT: [[BIN_RDX2:%.*]] = or <8 x i64> [[BIN_RDX]], [[RDX_SHUF1]] +; CHECK-NEXT: [[RDX_SHUF3:%.*]] = shufflevector <8 x i64> [[BIN_RDX2]], <8 x i64> undef, <8 x i32> +; CHECK-NEXT: [[BIN_RDX4:%.*]] = or <8 x i64> [[BIN_RDX2]], [[RDX_SHUF3]] +; CHECK-NEXT: [[TMP5:%.*]] = extractelement <8 x i64> [[BIN_RDX4]], i32 0 +; CHECK-NEXT: ret i64 [[TMP5]] ; %g0 = getelementptr inbounds %v8i8, %v8i8* %p, i64 0, i32 0 %g1 = getelementptr inbounds %v8i8, %v8i8* %p, i64 0, i32 1 @@ -194,36 +168,30 @@ define i64 @load64le(i8* %arg) { ; CHECK-NEXT: [[G6:%.*]] = getelementptr inbounds i8, i8* [[ARG]], i64 6 ; CHECK-NEXT: [[G7:%.*]] = getelementptr inbounds i8, i8* [[ARG]], i64 7 ; CHECK-NEXT: [[LD0:%.*]] = load i8, i8* [[ARG]], align 1 -; CHECK-NEXT: [[LD1:%.*]] = load i8, i8* [[G1]], align 1 -; CHECK-NEXT: [[LD2:%.*]] = load i8, i8* [[G2]], align 1 -; CHECK-NEXT: [[LD3:%.*]] = load i8, i8* [[G3]], align 1 -; CHECK-NEXT: [[LD4:%.*]] = load i8, i8* [[G4]], align 1 +; CHECK-NEXT: [[TMP1:%.*]] = bitcast i8* [[G1]] to <4 x i8>* +; CHECK-NEXT: [[TMP2:%.*]] = load <4 x i8>, <4 x i8>* [[TMP1]], align 1 ; CHECK-NEXT: [[LD5:%.*]] = load i8, i8* [[G5]], align 1 ; CHECK-NEXT: [[LD6:%.*]] = load i8, i8* [[G6]], align 1 ; CHECK-NEXT: [[LD7:%.*]] = load i8, i8* [[G7]], align 1 ; CHECK-NEXT: [[Z0:%.*]] = zext i8 [[LD0]] to i64 -; CHECK-NEXT: [[Z1:%.*]] = zext i8 [[LD1]] to i64 -; CHECK-NEXT: [[Z2:%.*]] = zext i8 [[LD2]] to i64 -; CHECK-NEXT: [[Z3:%.*]] = zext i8 
[[LD3]] to i64 -; CHECK-NEXT: [[Z4:%.*]] = zext i8 [[LD4]] to i64 +; CHECK-NEXT: [[TMP3:%.*]] = zext <4 x i8> [[TMP2]] to <4 x i64> ; CHECK-NEXT: [[Z5:%.*]] = zext i8 [[LD5]] to i64 ; CHECK-NEXT: [[Z6:%.*]] = zext i8 [[LD6]] to i64 ; CHECK-NEXT: [[Z7:%.*]] = zext i8 [[LD7]] to i64 -; CHECK-NEXT: [[S1:%.*]] = shl nuw nsw i64 [[Z1]], 8 -; CHECK-NEXT: [[S2:%.*]] = shl nuw nsw i64 [[Z2]], 16 -; CHECK-NEXT: [[S3:%.*]] = shl nuw nsw i64 [[Z3]], 24 -; CHECK-NEXT: [[S4:%.*]] = shl nuw nsw i64 [[Z4]], 32 +; CHECK-NEXT: [[TMP4:%.*]] = shl nuw nsw <4 x i64> [[TMP3]], ; CHECK-NEXT: [[S5:%.*]] = shl nuw nsw i64 [[Z5]], 40 ; CHECK-NEXT: [[S6:%.*]] = shl nuw nsw i64 [[Z6]], 48 ; CHECK-NEXT: [[S7:%.*]] = shl nuw i64 [[Z7]], 56 -; CHECK-NEXT: [[O1:%.*]] = or i64 [[S1]], [[Z0]] -; CHECK-NEXT: [[O2:%.*]] = or i64 [[O1]], [[S2]] -; CHECK-NEXT: [[O3:%.*]] = or i64 [[O2]], [[S3]] -; CHECK-NEXT: [[O4:%.*]] = or i64 [[O3]], [[S4]] -; CHECK-NEXT: [[O5:%.*]] = or i64 [[O4]], [[S5]] -; CHECK-NEXT: [[O6:%.*]] = or i64 [[O5]], [[S6]] -; CHECK-NEXT: [[O7:%.*]] = or i64 [[O6]], [[S7]] -; CHECK-NEXT: ret i64 [[O7]] +; CHECK-NEXT: [[RDX_SHUF:%.*]] = shufflevector <4 x i64> [[TMP4]], <4 x i64> undef, <4 x i32> +; CHECK-NEXT: [[BIN_RDX:%.*]] = or <4 x i64> [[TMP4]], [[RDX_SHUF]] +; CHECK-NEXT: [[RDX_SHUF1:%.*]] = shufflevector <4 x i64> [[BIN_RDX]], <4 x i64> undef, <4 x i32> +; CHECK-NEXT: [[BIN_RDX2:%.*]] = or <4 x i64> [[BIN_RDX]], [[RDX_SHUF1]] +; CHECK-NEXT: [[TMP5:%.*]] = extractelement <4 x i64> [[BIN_RDX2]], i32 0 +; CHECK-NEXT: [[TMP6:%.*]] = or i64 [[TMP5]], [[S5]] +; CHECK-NEXT: [[TMP7:%.*]] = or i64 [[TMP6]], [[S6]] +; CHECK-NEXT: [[TMP8:%.*]] = or i64 [[TMP7]], [[S7]] +; CHECK-NEXT: [[OP_EXTRA:%.*]] = or i64 [[TMP8]], [[Z0]] +; CHECK-NEXT: ret i64 [[OP_EXTRA]] ; %g1 = getelementptr inbounds i8, i8* %arg, i64 1 %g2 = getelementptr inbounds i8, i8* %arg, i64 2 @@ -279,38 +247,18 @@ define i64 @load64le_nop_shift(i8* %arg) ; CHECK-NEXT: [[G5:%.*]] = getelementptr inbounds i8, i8* [[ARG]], 
i64 5 ; CHECK-NEXT: [[G6:%.*]] = getelementptr inbounds i8, i8* [[ARG]], i64 6 ; CHECK-NEXT: [[G7:%.*]] = getelementptr inbounds i8, i8* [[ARG]], i64 7 -; CHECK-NEXT: [[LD0:%.*]] = load i8, i8* [[ARG]], align 1 -; CHECK-NEXT: [[LD1:%.*]] = load i8, i8* [[G1]], align 1 -; CHECK-NEXT: [[LD2:%.*]] = load i8, i8* [[G2]], align 1 -; CHECK-NEXT: [[LD3:%.*]] = load i8, i8* [[G3]], align 1 -; CHECK-NEXT: [[LD4:%.*]] = load i8, i8* [[G4]], align 1 -; CHECK-NEXT: [[LD5:%.*]] = load i8, i8* [[G5]], align 1 -; CHECK-NEXT: [[LD6:%.*]] = load i8, i8* [[G6]], align 1 -; CHECK-NEXT: [[LD7:%.*]] = load i8, i8* [[G7]], align 1 -; CHECK-NEXT: [[Z0:%.*]] = zext i8 [[LD0]] to i64 -; CHECK-NEXT: [[Z1:%.*]] = zext i8 [[LD1]] to i64 -; CHECK-NEXT: [[Z2:%.*]] = zext i8 [[LD2]] to i64 -; CHECK-NEXT: [[Z3:%.*]] = zext i8 [[LD3]] to i64 -; CHECK-NEXT: [[Z4:%.*]] = zext i8 [[LD4]] to i64 -; CHECK-NEXT: [[Z5:%.*]] = zext i8 [[LD5]] to i64 -; CHECK-NEXT: [[Z6:%.*]] = zext i8 [[LD6]] to i64 -; CHECK-NEXT: [[Z7:%.*]] = zext i8 [[LD7]] to i64 -; CHECK-NEXT: [[S0:%.*]] = shl nuw nsw i64 [[Z0]], 0 -; CHECK-NEXT: [[S1:%.*]] = shl nuw nsw i64 [[Z1]], 8 -; CHECK-NEXT: [[S2:%.*]] = shl nuw nsw i64 [[Z2]], 16 -; CHECK-NEXT: [[S3:%.*]] = shl nuw nsw i64 [[Z3]], 24 -; CHECK-NEXT: [[S4:%.*]] = shl nuw nsw i64 [[Z4]], 32 -; CHECK-NEXT: [[S5:%.*]] = shl nuw nsw i64 [[Z5]], 40 -; CHECK-NEXT: [[S6:%.*]] = shl nuw nsw i64 [[Z6]], 48 -; CHECK-NEXT: [[S7:%.*]] = shl nuw i64 [[Z7]], 56 -; CHECK-NEXT: [[O1:%.*]] = or i64 [[S1]], [[S0]] -; CHECK-NEXT: [[O2:%.*]] = or i64 [[O1]], [[S2]] -; CHECK-NEXT: [[O3:%.*]] = or i64 [[O2]], [[S3]] -; CHECK-NEXT: [[O4:%.*]] = or i64 [[O3]], [[S4]] -; CHECK-NEXT: [[O5:%.*]] = or i64 [[O4]], [[S5]] -; CHECK-NEXT: [[O6:%.*]] = or i64 [[O5]], [[S6]] -; CHECK-NEXT: [[O7:%.*]] = or i64 [[O6]], [[S7]] -; CHECK-NEXT: ret i64 [[O7]] +; CHECK-NEXT: [[TMP1:%.*]] = bitcast i8* [[ARG]] to <8 x i8>* +; CHECK-NEXT: [[TMP2:%.*]] = load <8 x i8>, <8 x i8>* [[TMP1]], align 1 +; CHECK-NEXT: 
[[TMP3:%.*]] = zext <8 x i8> [[TMP2]] to <8 x i64> +; CHECK-NEXT: [[TMP4:%.*]] = shl nuw <8 x i64> [[TMP3]], +; CHECK-NEXT: [[RDX_SHUF:%.*]] = shufflevector <8 x i64> [[TMP4]], <8 x i64> undef, <8 x i32> +; CHECK-NEXT: [[BIN_RDX:%.*]] = or <8 x i64> [[TMP4]], [[RDX_SHUF]] +; CHECK-NEXT: [[RDX_SHUF1:%.*]] = shufflevector <8 x i64> [[BIN_RDX]], <8 x i64> undef, <8 x i32> +; CHECK-NEXT: [[BIN_RDX2:%.*]] = or <8 x i64> [[BIN_RDX]], [[RDX_SHUF1]] +; CHECK-NEXT: [[RDX_SHUF3:%.*]] = shufflevector <8 x i64> [[BIN_RDX2]], <8 x i64> undef, <8 x i32> +; CHECK-NEXT: [[BIN_RDX4:%.*]] = or <8 x i64> [[BIN_RDX2]], [[RDX_SHUF3]] +; CHECK-NEXT: [[TMP5:%.*]] = extractelement <8 x i64> [[BIN_RDX4]], i32 0 +; CHECK-NEXT: ret i64 [[TMP5]] ; %g1 = getelementptr inbounds i8, i8* %arg, i64 1 %g2 = getelementptr inbounds i8, i8* %arg, i64 2 From llvm-commits at lists.llvm.org Mon Oct 7 01:20:14 2019 From: llvm-commits at lists.llvm.org (=?utf-8?q?Martin_Storsj=C3=B6_via_Phabricator?= via llvm-commits) Date: Mon, 07 Oct 2019 08:20:14 +0000 (UTC) Subject: [PATCH] D67841: [SLP] avoid reduction transform on patterns that the backend can load-combine In-Reply-To: References: Message-ID: <079011e04cf6c1e903e626ae2e076d3a@localhost.localdomain> mstorsjo added a comment. This caused lots of failed asserts in building many different projects, see https://bugs.llvm.org/show_bug.cgi?id=43582, so I went ahead and reverted it for now. 
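For readers following the test changes above: the `load64le` IR is roughly what a compiler produces for a byte-by-byte little-endian load, that is, eight byte loads, each zero-extended, shifted into place, and OR-ed together. A backend can fold that whole chain into one 64-bit load, which is the load-combine pattern named in the patch title. The C++ function below is only an illustrative sketch of that source pattern; it is an assumption made for this note and is not part of D67841 or its tests.

```cpp
#include <cstdint>

// Hypothetical source-level counterpart of the `load64le` test function:
// assemble a little-endian 64-bit value from eight consecutive bytes.
// Each iteration is one "load i8, zext to i64, shl, or" step in the IR.
uint64_t load64le(const uint8_t *arg) {
  uint64_t result = 0;
  for (int i = 0; i < 8; ++i)
    result |= static_cast<uint64_t>(arg[i]) << (8 * i);
  return result;
}
```

If SLP vectorizes the OR reduction first (the `<8 x i64>` shuffle/or sequences in the CHECK lines), the backend no longer sees the scalar chain and has nothing to merge into a single load.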
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67841/new/ https://reviews.llvm.org/D67841 From llvm-commits at lists.llvm.org Mon Oct 7 01:23:20 2019 From: llvm-commits at lists.llvm.org (James Molloy via llvm-commits) Date: Mon, 07 Oct 2019 08:23:20 -0000 Subject: [llvm] r373883 - [TableGen] Pacify gcc-5.4 more Message-ID: <20191007082320.B7C80816EC@lists.llvm.org> Author: jamesm Date: Mon Oct 7 01:23:20 2019 New Revision: 373883 URL: http://llvm.org/viewvc/llvm-project?rev=373883&view=rev Log: [TableGen] Pacify gcc-5.4 more Followup to a previous pacification, this performs the same workaround to the TableGen generated code for tuple automata. Modified: llvm/trunk/utils/TableGen/DFAEmitter.cpp Modified: llvm/trunk/utils/TableGen/DFAEmitter.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/TableGen/DFAEmitter.cpp?rev=373883&r1=373882&r2=373883&view=diff ============================================================================== --- llvm/trunk/utils/TableGen/DFAEmitter.cpp (original) +++ llvm/trunk/utils/TableGen/DFAEmitter.cpp Mon Oct 7 01:23:20 2019 @@ -373,7 +373,7 @@ void CustomDfaEmitter::printActionType(r void CustomDfaEmitter::printActionValue(action_type A, raw_ostream &OS) { const ActionTuple &AT = Actions[A]; if (AT.size() > 1) - OS << "{"; + OS << "std::make_tuple("; bool First = true; for (const auto &SingleAction : AT) { if (!First) @@ -382,7 +382,7 @@ void CustomDfaEmitter::printActionValue( SingleAction.print(OS); } if (AT.size() > 1) - OS << "}"; + OS << ")"; } namespace llvm { From llvm-commits at lists.llvm.org Mon Oct 7 01:30:46 2019 From: llvm-commits at lists.llvm.org (Fangrui Song via llvm-commits) Date: Mon, 07 Oct 2019 08:30:46 -0000 Subject: [lld] r373884 - [ELF][MIPS] Use lld::elf::{read, write}* instead of llvm::support::endian::{read, write}* Message-ID: <20191007083046.9B9BB83B7F@lists.llvm.org> Author: maskray Date: Mon Oct 7 01:30:46 2019 New Revision: 373884 URL: 
http://llvm.org/viewvc/llvm-project?rev=373884&view=rev Log: [ELF][MIPS] Use lld::elf::{read,write}* instead of llvm::support::endian::{read,write}* This allows us to delete `using namespace llvm::support::endian` and simplify D68323. This change adds runtime config->endianness check but the overhead should be negligible. Reviewed By: ruiu Differential Revision: https://reviews.llvm.org/D68561 Modified: lld/trunk/ELF/Arch/Mips.cpp Modified: lld/trunk/ELF/Arch/Mips.cpp URL: http://llvm.org/viewvc/llvm-project/lld/trunk/ELF/Arch/Mips.cpp?rev=373884&r1=373883&r2=373884&view=diff ============================================================================== --- lld/trunk/ELF/Arch/Mips.cpp (original) +++ lld/trunk/ELF/Arch/Mips.cpp Mon Oct 7 01:30:46 2019 @@ -14,11 +14,9 @@ #include "Thunks.h" #include "lld/Common/ErrorHandler.h" #include "llvm/Object/ELF.h" -#include "llvm/Support/Endian.h" using namespace llvm; using namespace llvm::object; -using namespace llvm::support::endian; using namespace llvm::ELF; using namespace lld; using namespace lld::elf; @@ -199,7 +197,7 @@ void MIPS::writeGotPlt(uint8_t *bu uint64_t va = in.plt->getVA(); if (isMicroMips()) va |= 1; - write32(buf, va); + write32(buf, va); } template static uint32_t readShuffle(const uint8_t *loc) { @@ -209,7 +207,7 @@ template static uint32_t // as early as possible. To do so, little-endian binaries keep 16-bit // words in a big-endian order. That is why we have to swap these // words to get a correct value. 
- uint32_t v = read32(loc); + uint32_t v = read32(loc); if (E == support::little) return (v << 16) | (v >> 16); return v; @@ -218,10 +216,10 @@ template static uint32_t template static void writeValue(uint8_t *loc, uint64_t v, uint8_t bitsSize, uint8_t shift) { - uint32_t instr = read32(loc); + uint32_t instr = read32(loc); uint32_t mask = 0xffffffff >> (32 - bitsSize); uint32_t data = (instr & ~mask) | ((v >> shift) & mask); - write32(loc, data); + write32(loc, data); } template @@ -241,10 +239,10 @@ static void writeShuffleValue(uint8_t *l template static void writeMicroRelocation16(uint8_t *loc, uint64_t v, uint8_t bitsSize, uint8_t shift) { - uint16_t instr = read16(loc); + uint16_t instr = read16(loc); uint16_t mask = 0xffff >> (16 - bitsSize); uint16_t data = (instr & ~mask) | ((v >> shift) & mask); - write16(loc, data); + write16(loc, data); } template void MIPS::writePltHeader(uint8_t *buf) const { @@ -255,53 +253,53 @@ template void MIPS::w // Overwrite trap instructions written by Writer::writeTrapInstr. memset(buf, 0, pltHeaderSize); - write16(buf, isMipsR6() ? 0x7860 : 0x7980); // addiupc v1, (GOTPLT) - . - write16(buf + 4, 0xff23); // lw $25, 0($3) - write16(buf + 8, 0x0535); // subu16 $2, $2, $3 - write16(buf + 10, 0x2525); // srl16 $2, $2, 2 - write16(buf + 12, 0x3302); // addiu $24, $2, -2 - write16(buf + 14, 0xfffe); - write16(buf + 16, 0x0dff); // move $15, $31 + write16(buf, isMipsR6() ? 0x7860 : 0x7980); // addiupc v1, (GOTPLT) - . 
+ write16(buf + 4, 0xff23); // lw $25, 0($3) + write16(buf + 8, 0x0535); // subu16 $2, $2, $3 + write16(buf + 10, 0x2525); // srl16 $2, $2, 2 + write16(buf + 12, 0x3302); // addiu $24, $2, -2 + write16(buf + 14, 0xfffe); + write16(buf + 16, 0x0dff); // move $15, $31 if (isMipsR6()) { - write16(buf + 18, 0x0f83); // move $28, $3 - write16(buf + 20, 0x472b); // jalrc $25 - write16(buf + 22, 0x0c00); // nop + write16(buf + 18, 0x0f83); // move $28, $3 + write16(buf + 20, 0x472b); // jalrc $25 + write16(buf + 22, 0x0c00); // nop relocateOne(buf, R_MICROMIPS_PC19_S2, gotPlt - plt); } else { - write16(buf + 18, 0x45f9); // jalrc $25 - write16(buf + 20, 0x0f83); // move $28, $3 - write16(buf + 22, 0x0c00); // nop + write16(buf + 18, 0x45f9); // jalrc $25 + write16(buf + 20, 0x0f83); // move $28, $3 + write16(buf + 22, 0x0c00); // nop relocateOne(buf, R_MICROMIPS_PC23_S2, gotPlt - plt); } return; } if (config->mipsN32Abi) { - write32(buf, 0x3c0e0000); // lui $14, %hi(&GOTPLT[0]) - write32(buf + 4, 0x8dd90000); // lw $25, %lo(&GOTPLT[0])($14) - write32(buf + 8, 0x25ce0000); // addiu $14, $14, %lo(&GOTPLT[0]) - write32(buf + 12, 0x030ec023); // subu $24, $24, $14 - write32(buf + 16, 0x03e07825); // move $15, $31 - write32(buf + 20, 0x0018c082); // srl $24, $24, 2 + write32(buf, 0x3c0e0000); // lui $14, %hi(&GOTPLT[0]) + write32(buf + 4, 0x8dd90000); // lw $25, %lo(&GOTPLT[0])($14) + write32(buf + 8, 0x25ce0000); // addiu $14, $14, %lo(&GOTPLT[0]) + write32(buf + 12, 0x030ec023); // subu $24, $24, $14 + write32(buf + 16, 0x03e07825); // move $15, $31 + write32(buf + 20, 0x0018c082); // srl $24, $24, 2 } else if (ELFT::Is64Bits) { - write32(buf, 0x3c0e0000); // lui $14, %hi(&GOTPLT[0]) - write32(buf + 4, 0xddd90000); // ld $25, %lo(&GOTPLT[0])($14) - write32(buf + 8, 0x25ce0000); // addiu $14, $14, %lo(&GOTPLT[0]) - write32(buf + 12, 0x030ec023); // subu $24, $24, $14 - write32(buf + 16, 0x03e07825); // move $15, $31 - write32(buf + 20, 0x0018c0c2); // srl $24, $24, 3 + 
write32(buf, 0x3c0e0000); // lui $14, %hi(&GOTPLT[0]) + write32(buf + 4, 0xddd90000); // ld $25, %lo(&GOTPLT[0])($14) + write32(buf + 8, 0x25ce0000); // addiu $14, $14, %lo(&GOTPLT[0]) + write32(buf + 12, 0x030ec023); // subu $24, $24, $14 + write32(buf + 16, 0x03e07825); // move $15, $31 + write32(buf + 20, 0x0018c0c2); // srl $24, $24, 3 } else { - write32(buf, 0x3c1c0000); // lui $28, %hi(&GOTPLT[0]) - write32(buf + 4, 0x8f990000); // lw $25, %lo(&GOTPLT[0])($28) - write32(buf + 8, 0x279c0000); // addiu $28, $28, %lo(&GOTPLT[0]) - write32(buf + 12, 0x031cc023); // subu $24, $24, $28 - write32(buf + 16, 0x03e07825); // move $15, $31 - write32(buf + 20, 0x0018c082); // srl $24, $24, 2 + write32(buf, 0x3c1c0000); // lui $28, %hi(&GOTPLT[0]) + write32(buf + 4, 0x8f990000); // lw $25, %lo(&GOTPLT[0])($28) + write32(buf + 8, 0x279c0000); // addiu $28, $28, %lo(&GOTPLT[0]) + write32(buf + 12, 0x031cc023); // subu $24, $24, $28 + write32(buf + 16, 0x03e07825); // move $15, $31 + write32(buf + 20, 0x0018c082); // srl $24, $24, 2 } uint32_t jalrInst = config->zHazardplt ? 0x0320fc09 : 0x0320f809; - write32(buf + 24, jalrInst); // jalr.hb $25 or jalr $25 - write32(buf + 28, 0x2718fffe); // subu $24, $24, 2 + write32(buf + 24, jalrInst); // jalr.hb $25 or jalr $25 + write32(buf + 28, 0x2718fffe); // subu $24, $24, 2 uint64_t gotPlt = in.gotPlt->getVA(); writeValue(buf, gotPlt + 0x8000, 16, 16); @@ -319,16 +317,16 @@ void MIPS::writePlt(uint8_t *buf, memset(buf, 0, pltEntrySize); if (isMipsR6()) { - write16(buf, 0x7840); // addiupc $2, (GOTPLT) - . - write16(buf + 4, 0xff22); // lw $25, 0($2) - write16(buf + 8, 0x0f02); // move $24, $2 - write16(buf + 10, 0x4723); // jrc $25 / jr16 $25 + write16(buf, 0x7840); // addiupc $2, (GOTPLT) - . 
+ write16(buf + 4, 0xff22); // lw $25, 0($2) + write16(buf + 8, 0x0f02); // move $24, $2 + write16(buf + 10, 0x4723); // jrc $25 / jr16 $25 relocateOne(buf, R_MICROMIPS_PC19_S2, gotPltEntryAddr - pltEntryAddr); } else { - write16(buf, 0x7900); // addiupc $2, (GOTPLT) - . - write16(buf + 4, 0xff22); // lw $25, 0($2) - write16(buf + 8, 0x4599); // jrc $25 / jr16 $25 - write16(buf + 10, 0x0f02); // move $24, $2 + write16(buf, 0x7900); // addiupc $2, (GOTPLT) - . + write16(buf + 4, 0xff22); // lw $25, 0($2) + write16(buf + 8, 0x4599); // jrc $25 / jr16 $25 + write16(buf + 10, 0x0f02); // move $24, $2 relocateOne(buf, R_MICROMIPS_PC23_S2, gotPltEntryAddr - pltEntryAddr); } return; @@ -339,10 +337,10 @@ void MIPS::writePlt(uint8_t *buf, : (config->zHazardplt ? 0x03200408 : 0x03200008); uint32_t addInst = ELFT::Is64Bits ? 0x65f80000 : 0x25f80000; - write32(buf, 0x3c0f0000); // lui $15, %hi(.got.plt entry) - write32(buf + 4, loadInst); // l[wd] $25, %lo(.got.plt entry)($15) - write32(buf + 8, jrInst); // jr $25 / jr.hb $25 - write32(buf + 12, addInst); // [d]addiu $24, $15, %lo(.got.plt entry) + write32(buf, 0x3c0f0000); // lui $15, %hi(.got.plt entry) + write32(buf + 4, loadInst); // l[wd] $25, %lo(.got.plt entry)($15) + write32(buf + 8, jrInst); // jr $25 / jr.hb $25 + write32(buf + 12, addInst); // [d]addiu $24, $15, %lo(.got.plt entry) writeValue(buf, gotPltEntryAddr + 0x8000, 16, 16); writeValue(buf + 4, gotPltEntryAddr, 16, 0); writeValue(buf + 12, gotPltEntryAddr, 16, 0); @@ -379,16 +377,16 @@ int64_t MIPS::getImplicitAddend(co case R_MIPS_GPREL32: case R_MIPS_TLS_DTPREL32: case R_MIPS_TLS_TPREL32: - return SignExtend64<32>(read32(buf)); + return SignExtend64<32>(read32(buf)); case R_MIPS_26: // FIXME (simon): If the relocation target symbol is not a PLT entry // we should use another expression for calculation: // ((A << 2) | (P & 0xf0000000)) >> 2 - return SignExtend64<28>(read32(buf) << 2); + return SignExtend64<28>(read32(buf) << 2); case R_MIPS_GOT16: case 
R_MIPS_HI16: case R_MIPS_PCHI16: - return SignExtend64<16>(read32(buf)) << 16; + return SignExtend64<16>(read32(buf)) << 16; case R_MIPS_GPREL16: case R_MIPS_LO16: case R_MIPS_PCLO16: @@ -396,7 +394,7 @@ int64_t MIPS::getImplicitAddend(co case R_MIPS_TLS_DTPREL_LO16: case R_MIPS_TLS_TPREL_HI16: case R_MIPS_TLS_TPREL_LO16: - return SignExtend64<16>(read32(buf)); + return SignExtend64<16>(read32(buf)); case R_MICROMIPS_GOT16: case R_MICROMIPS_HI16: return SignExtend64<16>(readShuffle(buf)) << 16; @@ -410,21 +408,21 @@ int64_t MIPS::getImplicitAddend(co case R_MICROMIPS_GPREL7_S2: return SignExtend64<9>(readShuffle(buf) << 2); case R_MIPS_PC16: - return SignExtend64<18>(read32(buf) << 2); + return SignExtend64<18>(read32(buf) << 2); case R_MIPS_PC19_S2: - return SignExtend64<21>(read32(buf) << 2); + return SignExtend64<21>(read32(buf) << 2); case R_MIPS_PC21_S2: - return SignExtend64<23>(read32(buf) << 2); + return SignExtend64<23>(read32(buf) << 2); case R_MIPS_PC26_S2: - return SignExtend64<28>(read32(buf) << 2); + return SignExtend64<28>(read32(buf) << 2); case R_MIPS_PC32: - return SignExtend64<32>(read32(buf)); + return SignExtend64<32>(read32(buf)); case R_MICROMIPS_26_S1: return SignExtend64<27>(readShuffle(buf) << 1); case R_MICROMIPS_PC7_S1: - return SignExtend64<8>(read16(buf) << 1); + return SignExtend64<8>(read16(buf) << 1); case R_MICROMIPS_PC10_S1: - return SignExtend64<11>(read16(buf) << 1); + return SignExtend64<11>(read16(buf) << 1); case R_MICROMIPS_PC16_S1: return SignExtend64<17>(readShuffle(buf) << 1); case R_MICROMIPS_PC18_S3: @@ -494,7 +492,7 @@ static uint64_t fixupCrossModeJump(uint8 switch (type) { case R_MIPS_26: { - uint32_t inst = read32(loc) >> 26; + uint32_t inst = read32(loc) >> 26; if (inst == 0x3 || inst == 0x1d) { // JAL or JALX writeValue(loc, 0x1d << 26, 32, 0); return val; @@ -552,12 +550,12 @@ void MIPS::relocateOne(uint8_t *lo case R_MIPS_GPREL32: case R_MIPS_TLS_DTPREL32: case R_MIPS_TLS_TPREL32: - write32(loc, val); + 
write32(loc, val); break; case R_MIPS_64: case R_MIPS_TLS_DTPREL64: case R_MIPS_TLS_TPREL64: - write64(loc, val); + write64(loc, val); break; case R_MIPS_26: writeValue(loc, val, 26, 2); @@ -643,12 +641,12 @@ void MIPS::relocateOne(uint8_t *lo // Replace jalr/jr instructions by bal/b if the target // offset fits into the 18-bit range. if (isInt<18>(val)) { - switch (read32(loc)) { + switch (read32(loc)) { case 0x0320f809: // jalr $25 => bal sym - write32(loc, 0x04110000 | ((val >> 2) & 0xffff)); + write32(loc, 0x04110000 | ((val >> 2) & 0xffff)); break; case 0x03200008: // jr $25 => b sym - write32(loc, 0x10000000 | ((val >> 2) & 0xffff)); + write32(loc, 0x10000000 | ((val >> 2) & 0xffff)); break; } } From llvm-commits at lists.llvm.org Mon Oct 7 01:28:38 2019 From: llvm-commits at lists.llvm.org (Fangrui Song via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 08:28:38 +0000 (UTC) Subject: [PATCH] D68561: [ELF][MIPS] Use lld::elf::{read,write}* instead of llvm::support::endian::{read,write}* In-Reply-To: References: Message-ID: This revision was automatically updated to reflect the committed changes. Closed by commit rL373884: [ELF][MIPS] Use lld::elf::{read,write}* instead of llvm::support::endian::{read… (authored by MaskRay, committed by ). Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68561/new/ https://reviews.llvm.org/D68561 Files: lld/trunk/ELF/Arch/Mips.cpp -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68561.223458.patch Type: text/x-patch Size: 11923 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 01:31:18 2019 From: llvm-commits at lists.llvm.org (Fangrui Song via llvm-commits) Date: Mon, 07 Oct 2019 08:31:18 -0000 Subject: [lld] r373885 - [ELF] Wrap things in `namespace lld { namespace elf {`, NFC Message-ID: <20191007083118.AEC2E8B10C@lists.llvm.org> Author: maskray Date: Mon Oct 7 01:31:18 2019 New Revision: 373885 URL: http://llvm.org/viewvc/llvm-project?rev=373885&view=rev Log: [ELF] Wrap things in `namespace lld { namespace elf {`, NFC This makes it clear `ELF/**/*.cpp` files define things in the `lld::elf` namespace and simplifies `elf::foo` to `foo`. Reviewed By: atanasyan, grimar, ruiu Differential Revision: https://reviews.llvm.org/D68323 Modified: lld/trunk/ELF/Arch/AArch64.cpp lld/trunk/ELF/Arch/AMDGPU.cpp lld/trunk/ELF/Arch/ARM.cpp lld/trunk/ELF/Arch/AVR.cpp lld/trunk/ELF/Arch/Hexagon.cpp lld/trunk/ELF/Arch/MSP430.cpp lld/trunk/ELF/Arch/Mips.cpp lld/trunk/ELF/Arch/MipsArchTree.cpp lld/trunk/ELF/Arch/PPC.cpp lld/trunk/ELF/Arch/PPC64.cpp lld/trunk/ELF/Arch/RISCV.cpp lld/trunk/ELF/Arch/SPARCV9.cpp lld/trunk/ELF/Arch/X86.cpp lld/trunk/ELF/Arch/X86_64.cpp lld/trunk/ELF/CallGraphSort.cpp lld/trunk/ELF/DWARF.cpp lld/trunk/ELF/Driver.cpp lld/trunk/ELF/DriverUtils.cpp lld/trunk/ELF/EhFrame.cpp lld/trunk/ELF/ICF.cpp lld/trunk/ELF/InputFiles.cpp lld/trunk/ELF/InputFiles.h lld/trunk/ELF/InputSection.cpp lld/trunk/ELF/LTO.cpp lld/trunk/ELF/LinkerScript.cpp lld/trunk/ELF/MapFile.cpp lld/trunk/ELF/MarkLive.cpp lld/trunk/ELF/OutputSections.cpp lld/trunk/ELF/Relocations.cpp lld/trunk/ELF/ScriptLexer.cpp lld/trunk/ELF/ScriptParser.cpp lld/trunk/ELF/SymbolTable.cpp lld/trunk/ELF/Symbols.cpp lld/trunk/ELF/Symbols.h lld/trunk/ELF/SyntheticSections.cpp lld/trunk/ELF/Target.cpp lld/trunk/ELF/Writer.cpp Modified: lld/trunk/ELF/Arch/AArch64.cpp URL: 
http://llvm.org/viewvc/llvm-project/lld/trunk/ELF/Arch/AArch64.cpp?rev=373885&r1=373884&r2=373885&view=diff ============================================================================== --- lld/trunk/ELF/Arch/AArch64.cpp (original) +++ lld/trunk/ELF/Arch/AArch64.cpp Mon Oct 7 01:31:18 2019 @@ -17,13 +17,14 @@ using namespace llvm; using namespace llvm::support::endian; using namespace llvm::ELF; -using namespace lld; -using namespace lld::elf; + +namespace lld { +namespace elf { // Page(Expr) is the page address of the expression Expr, defined // as (Expr & ~0xFFF). (This applies even if the machine page size // supported by the platform has a different value.) -uint64_t elf::getAArch64Page(uint64_t expr) { +uint64_t getAArch64Page(uint64_t expr) { return expr & ~static_cast(0xFFF); } @@ -679,4 +680,7 @@ static TargetInfo *getTargetInfo() { return &t; } -TargetInfo *elf::getAArch64TargetInfo() { return getTargetInfo(); } +TargetInfo *getAArch64TargetInfo() { return getTargetInfo(); } + +} // namespace elf +} // namespace lld Modified: lld/trunk/ELF/Arch/AMDGPU.cpp URL: http://llvm.org/viewvc/llvm-project/lld/trunk/ELF/Arch/AMDGPU.cpp?rev=373885&r1=373884&r2=373885&view=diff ============================================================================== --- lld/trunk/ELF/Arch/AMDGPU.cpp (original) +++ lld/trunk/ELF/Arch/AMDGPU.cpp Mon Oct 7 01:31:18 2019 @@ -17,8 +17,9 @@ using namespace llvm; using namespace llvm::object; using namespace llvm::support::endian; using namespace llvm::ELF; -using namespace lld; -using namespace lld::elf; + +namespace lld { +namespace elf { namespace { class AMDGPU final : public TargetInfo { @@ -107,7 +108,10 @@ RelType AMDGPU::getDynRel(RelType type) return R_AMDGPU_NONE; } -TargetInfo *elf::getAMDGPUTargetInfo() { +TargetInfo *getAMDGPUTargetInfo() { static AMDGPU target; return &target; } + +} // namespace elf +} // namespace lld Modified: lld/trunk/ELF/Arch/ARM.cpp URL: 
http://llvm.org/viewvc/llvm-project/lld/trunk/ELF/Arch/ARM.cpp?rev=373885&r1=373884&r2=373885&view=diff ============================================================================== --- lld/trunk/ELF/Arch/ARM.cpp (original) +++ lld/trunk/ELF/Arch/ARM.cpp Mon Oct 7 01:31:18 2019 @@ -18,8 +18,9 @@ using namespace llvm; using namespace llvm::support::endian; using namespace llvm::ELF; -using namespace lld; -using namespace lld::elf; + +namespace lld { +namespace elf { namespace { class ARM final : public TargetInfo { @@ -600,7 +601,10 @@ int64_t ARM::getImplicitAddend(const uin } } -TargetInfo *elf::getARMTargetInfo() { +TargetInfo *getARMTargetInfo() { static ARM target; return &target; } + +} // namespace elf +} // namespace lld Modified: lld/trunk/ELF/Arch/AVR.cpp URL: http://llvm.org/viewvc/llvm-project/lld/trunk/ELF/Arch/AVR.cpp?rev=373885&r1=373884&r2=373885&view=diff ============================================================================== --- lld/trunk/ELF/Arch/AVR.cpp (original) +++ lld/trunk/ELF/Arch/AVR.cpp Mon Oct 7 01:31:18 2019 @@ -36,8 +36,9 @@ using namespace llvm; using namespace llvm::object; using namespace llvm::support::endian; using namespace llvm::ELF; -using namespace lld; -using namespace lld::elf; + +namespace lld { +namespace elf { namespace { class AVR final : public TargetInfo { @@ -70,7 +71,10 @@ void AVR::relocateOne(uint8_t *loc, RelT } } -TargetInfo *elf::getAVRTargetInfo() { +TargetInfo *getAVRTargetInfo() { static AVR target; return &target; } + +} // namespace elf +} // namespace lld Modified: lld/trunk/ELF/Arch/Hexagon.cpp URL: http://llvm.org/viewvc/llvm-project/lld/trunk/ELF/Arch/Hexagon.cpp?rev=373885&r1=373884&r2=373885&view=diff ============================================================================== --- lld/trunk/ELF/Arch/Hexagon.cpp (original) +++ lld/trunk/ELF/Arch/Hexagon.cpp Mon Oct 7 01:31:18 2019 @@ -19,8 +19,9 @@ using namespace llvm; using namespace llvm::object; using namespace llvm::support::endian; using namespace 
llvm::ELF; -using namespace lld; -using namespace lld::elf; + +namespace lld { +namespace elf { namespace { class Hexagon final : public TargetInfo { @@ -318,7 +319,10 @@ RelType Hexagon::getDynRel(RelType type) return R_HEX_NONE; } -TargetInfo *elf::getHexagonTargetInfo() { +TargetInfo *getHexagonTargetInfo() { static Hexagon target; return &target; } + +} // namespace elf +} // namespace lld Modified: lld/trunk/ELF/Arch/MSP430.cpp URL: http://llvm.org/viewvc/llvm-project/lld/trunk/ELF/Arch/MSP430.cpp?rev=373885&r1=373884&r2=373885&view=diff ============================================================================== --- lld/trunk/ELF/Arch/MSP430.cpp (original) +++ lld/trunk/ELF/Arch/MSP430.cpp Mon Oct 7 01:31:18 2019 @@ -26,8 +26,9 @@ using namespace llvm; using namespace llvm::object; using namespace llvm::support::endian; using namespace llvm::ELF; -using namespace lld; -using namespace lld::elf; + +namespace lld { +namespace elf { namespace { class MSP430 final : public TargetInfo { @@ -87,7 +88,10 @@ void MSP430::relocateOne(uint8_t *loc, R } } -TargetInfo *elf::getMSP430TargetInfo() { +TargetInfo *getMSP430TargetInfo() { static MSP430 target; return &target; } + +} // namespace elf +} // namespace lld Modified: lld/trunk/ELF/Arch/Mips.cpp URL: http://llvm.org/viewvc/llvm-project/lld/trunk/ELF/Arch/Mips.cpp?rev=373885&r1=373884&r2=373885&view=diff ============================================================================== --- lld/trunk/ELF/Arch/Mips.cpp (original) +++ lld/trunk/ELF/Arch/Mips.cpp Mon Oct 7 01:31:18 2019 @@ -18,9 +18,9 @@ using namespace llvm; using namespace llvm::object; using namespace llvm::ELF; -using namespace lld; -using namespace lld::elf; +namespace lld { +namespace elf { namespace { template class MIPS final : public TargetInfo { public: @@ -721,7 +721,7 @@ template bool MIPS::u } // Return true if the symbol is a PIC function. 
-template bool elf::isMipsPIC(const Defined *sym) { +template bool isMipsPIC(const Defined *sym) { if (!sym->isFunc()) return false; @@ -739,17 +739,20 @@ template bool elf::isMipsPI return file->getObj().getHeader()->e_flags & EF_MIPS_PIC; } -template TargetInfo *elf::getMipsTargetInfo() { +template TargetInfo *getMipsTargetInfo() { static MIPS target; return &target; } -template TargetInfo *elf::getMipsTargetInfo(); -template TargetInfo *elf::getMipsTargetInfo(); -template TargetInfo *elf::getMipsTargetInfo(); -template TargetInfo *elf::getMipsTargetInfo(); - -template bool elf::isMipsPIC(const Defined *); -template bool elf::isMipsPIC(const Defined *); -template bool elf::isMipsPIC(const Defined *); -template bool elf::isMipsPIC(const Defined *); +template TargetInfo *getMipsTargetInfo(); +template TargetInfo *getMipsTargetInfo(); +template TargetInfo *getMipsTargetInfo(); +template TargetInfo *getMipsTargetInfo(); + +template bool isMipsPIC(const Defined *); +template bool isMipsPIC(const Defined *); +template bool isMipsPIC(const Defined *); +template bool isMipsPIC(const Defined *); + +} // namespace elf +} // namespace lld Modified: lld/trunk/ELF/Arch/MipsArchTree.cpp URL: http://llvm.org/viewvc/llvm-project/lld/trunk/ELF/Arch/MipsArchTree.cpp?rev=373885&r1=373884&r2=373885&view=diff ============================================================================== --- lld/trunk/ELF/Arch/MipsArchTree.cpp (original) +++ lld/trunk/ELF/Arch/MipsArchTree.cpp Mon Oct 7 01:31:18 2019 @@ -23,8 +23,8 @@ using namespace llvm; using namespace llvm::object; using namespace llvm::ELF; -using namespace lld; -using namespace lld::elf; +namespace lld { +namespace elf { namespace { struct ArchTreeEdge { @@ -294,7 +294,7 @@ static uint32_t getArchFlags(ArrayRef uint32_t elf::calcMipsEFlags() { +template uint32_t calcMipsEFlags() { std::vector v; for (InputFile *f : objectFiles) v.push_back({f, cast>(f)->getObj().getHeader()->e_flags}); @@ -350,8 +350,7 @@ static StringRef 
getMipsFpAbiName(uint8_ } } -uint8_t elf::getMipsFpAbiFlag(uint8_t oldFlag, uint8_t newFlag, - StringRef fileName) { +uint8_t getMipsFpAbiFlag(uint8_t oldFlag, uint8_t newFlag, StringRef fileName) { if (compareMipsFpAbi(newFlag, oldFlag) >= 0) return newFlag; if (compareMipsFpAbi(oldFlag, newFlag) < 0) @@ -367,7 +366,7 @@ template static bool isN32A return false; } -bool elf::isMipsN32Abi(const InputFile *f) { +bool isMipsN32Abi(const InputFile *f) { switch (config->ekind) { case ELF32LEKind: return isN32Abi(f); @@ -382,14 +381,17 @@ bool elf::isMipsN32Abi(const InputFile * } } -bool elf::isMicroMips() { return config->eflags & EF_MIPS_MICROMIPS; } +bool isMicroMips() { return config->eflags & EF_MIPS_MICROMIPS; } -bool elf::isMipsR6() { +bool isMipsR6() { uint32_t arch = config->eflags & EF_MIPS_ARCH; return arch == EF_MIPS_ARCH_32R6 || arch == EF_MIPS_ARCH_64R6; } -template uint32_t elf::calcMipsEFlags(); -template uint32_t elf::calcMipsEFlags(); -template uint32_t elf::calcMipsEFlags(); -template uint32_t elf::calcMipsEFlags(); +template uint32_t calcMipsEFlags(); +template uint32_t calcMipsEFlags(); +template uint32_t calcMipsEFlags(); +template uint32_t calcMipsEFlags(); + +} // namespace elf +} // namespace lld Modified: lld/trunk/ELF/Arch/PPC.cpp URL: http://llvm.org/viewvc/llvm-project/lld/trunk/ELF/Arch/PPC.cpp?rev=373885&r1=373884&r2=373885&view=diff ============================================================================== --- lld/trunk/ELF/Arch/PPC.cpp (original) +++ lld/trunk/ELF/Arch/PPC.cpp Mon Oct 7 01:31:18 2019 @@ -16,8 +16,9 @@ using namespace llvm; using namespace llvm::support::endian; using namespace llvm::ELF; -using namespace lld; -using namespace lld::elf; + +namespace lld { +namespace elf { namespace { class PPC final : public TargetInfo { @@ -61,7 +62,7 @@ static void writeFromHalf16(uint8_t *loc write32(config->isLE ? 
loc : loc - 2, insn); } -void elf::writePPC32GlinkSection(uint8_t *buf, size_t numEntries) { +void writePPC32GlinkSection(uint8_t *buf, size_t numEntries) { // On PPC Secure PLT ABI, bl foo@plt jumps to a call stub, which loads an // absolute address from a specific .plt slot (usually called .got.plt on // other targets) and jumps there. @@ -435,7 +436,10 @@ void PPC::relaxTlsIeToLe(uint8_t *loc, R } } -TargetInfo *elf::getPPCTargetInfo() { +TargetInfo *getPPCTargetInfo() { static PPC target; return &target; } + +} // namespace elf +} // namespace lld Modified: lld/trunk/ELF/Arch/PPC64.cpp URL: http://llvm.org/viewvc/llvm-project/lld/trunk/ELF/Arch/PPC64.cpp?rev=373885&r1=373884&r2=373885&view=diff ============================================================================== --- lld/trunk/ELF/Arch/PPC64.cpp (original) +++ lld/trunk/ELF/Arch/PPC64.cpp Mon Oct 7 01:31:18 2019 @@ -16,8 +16,9 @@ using namespace llvm; using namespace llvm::object; using namespace llvm::support::endian; using namespace llvm::ELF; -using namespace lld; -using namespace lld::elf; + +namespace lld { +namespace elf { static uint64_t ppc64TocOffset = 0x8000; static uint64_t dynamicThreadPointerOffset = 0x8000; @@ -59,7 +60,7 @@ enum DFormOpcd { ADDI = 14 }; -uint64_t elf::getPPC64TocBase() { +uint64_t getPPC64TocBase() { // The TOC consists of sections .got, .toc, .tocbss, .plt in that order. The // TOC starts where the first of these sections starts. 
We always create a // .got when we see a relocation that uses it, so for us the start is always @@ -73,7 +74,7 @@ uint64_t elf::getPPC64TocBase() { return tocVA + ppc64TocOffset; } -unsigned elf::getPPC64GlobalEntryToLocalEntryOffset(uint8_t stOther) { +unsigned getPPC64GlobalEntryToLocalEntryOffset(uint8_t stOther) { // The offset is encoded into the 3 most significant bits of the st_other // field, with some special values described in section 3.4.1 of the ABI: // 0 --> Zero offset between the GEP and LEP, and the function does NOT use @@ -98,7 +99,7 @@ unsigned elf::getPPC64GlobalEntryToLocal return 0; } -bool elf::isPPC64SmallCodeModelTocReloc(RelType type) { +bool isPPC64SmallCodeModelTocReloc(RelType type) { // The only small code model relocations that access the .toc section. return type == R_PPC64_TOC16 || type == R_PPC64_TOC16_DS; } @@ -153,8 +154,8 @@ getRelaTocSymAndAddend(InputSectionBase // ld/lwa 3, 0(3) # load the value from the address // // Returns true if the relaxation is performed. 
-bool elf::tryRelaxPPC64TocIndirection(RelType type, const Relocation &rel, - uint8_t *bufLoc) { +bool tryRelaxPPC64TocIndirection(RelType type, const Relocation &rel, + uint8_t *bufLoc) { assert(config->tocOptimize); if (rel.addend < 0) return false; @@ -458,7 +459,7 @@ void PPC64::relaxTlsLdToLe(uint8_t *loc, } } -unsigned elf::getPPCDFormOp(unsigned secondaryOp) { +unsigned getPPCDFormOp(unsigned secondaryOp) { switch (secondaryOp) { case LBZX: return LBZ; @@ -1093,7 +1094,10 @@ bool PPC64::adjustPrologueForCrossSplitS return true; } -TargetInfo *elf::getPPC64TargetInfo() { +TargetInfo *getPPC64TargetInfo() { static PPC64 target; return &target; } + +} // namespace elf +} // namespace lld Modified: lld/trunk/ELF/Arch/RISCV.cpp URL: http://llvm.org/viewvc/llvm-project/lld/trunk/ELF/Arch/RISCV.cpp?rev=373885&r1=373884&r2=373885&view=diff ============================================================================== --- lld/trunk/ELF/Arch/RISCV.cpp (original) +++ lld/trunk/ELF/Arch/RISCV.cpp Mon Oct 7 01:31:18 2019 @@ -14,8 +14,9 @@ using namespace llvm; using namespace llvm::object; using namespace llvm::support::endian; using namespace llvm::ELF; -using namespace lld; -using namespace lld::elf; + +namespace lld { +namespace elf { namespace { @@ -436,7 +437,10 @@ void RISCV::relocateOne(uint8_t *loc, co } } -TargetInfo *elf::getRISCVTargetInfo() { +TargetInfo *getRISCVTargetInfo() { static RISCV target; return &target; } + +} // namespace elf +} // namespace lld Modified: lld/trunk/ELF/Arch/SPARCV9.cpp URL: http://llvm.org/viewvc/llvm-project/lld/trunk/ELF/Arch/SPARCV9.cpp?rev=373885&r1=373884&r2=373885&view=diff ============================================================================== --- lld/trunk/ELF/Arch/SPARCV9.cpp (original) +++ lld/trunk/ELF/Arch/SPARCV9.cpp Mon Oct 7 01:31:18 2019 @@ -16,8 +16,9 @@ using namespace llvm; using namespace llvm::support::endian; using namespace llvm::ELF; -using namespace lld; -using namespace lld::elf; + +namespace lld { 
+namespace 
elf { namespace { class SPARCV9 final : public TargetInfo { @@ -143,7 +144,10 @@ void SPARCV9::writePlt(uint8_t *buf, uin relocateOne(buf + 4, R_SPARC_WDISP19, -(off + 4 - pltEntrySize)); } -TargetInfo *elf::getSPARCV9TargetInfo() { +TargetInfo *getSPARCV9TargetInfo() { static SPARCV9 target; return &target; } + +} // namespace elf +} // namespace lld Modified: lld/trunk/ELF/Arch/X86.cpp URL: http://llvm.org/viewvc/llvm-project/lld/trunk/ELF/Arch/X86.cpp?rev=373885&r1=373884&r2=373885&view=diff ============================================================================== --- lld/trunk/ELF/Arch/X86.cpp (original) +++ lld/trunk/ELF/Arch/X86.cpp Mon Oct 7 01:31:18 2019 @@ -16,8 +16,9 @@ using namespace llvm; using namespace llvm::support::endian; using namespace llvm::ELF; -using namespace lld; -using namespace lld::elf; + +namespace lld { +namespace elf { namespace { class X86 : public TargetInfo { @@ -539,7 +540,7 @@ void RetpolineNoPic::writePlt(uint8_t *b write32le(buf + 22, -off - 26); } -TargetInfo *elf::getX86TargetInfo() { +TargetInfo *getX86TargetInfo() { if (config->zRetpolineplt) { if (config->isPic) { static RetpolinePic t; @@ -552,3 +553,6 @@ TargetInfo *elf::getX86TargetInfo() { static X86 t; return &t; } + +} // namespace elf +} // namespace lld Modified: lld/trunk/ELF/Arch/X86_64.cpp URL: http://llvm.org/viewvc/llvm-project/lld/trunk/ELF/Arch/X86_64.cpp?rev=373885&r1=373884&r2=373885&view=diff ============================================================================== --- lld/trunk/ELF/Arch/X86_64.cpp (original) +++ lld/trunk/ELF/Arch/X86_64.cpp Mon Oct 7 01:31:18 2019 @@ -18,8 +18,9 @@ using namespace llvm; using namespace llvm::object; using namespace llvm::support::endian; using namespace llvm::ELF; -using namespace lld; -using namespace lld::elf; + +namespace lld { +namespace elf { namespace { class X86_64 : public TargetInfo { @@ -698,4 +699,7 @@ static TargetInfo *getTargetInfo() { return &t; } -TargetInfo *elf::getX86_64TargetInfo() { return 
getTargetInfo(); } +TargetInfo *getX86_64TargetInfo() { return getTargetInfo(); } + +} // namespace elf +} // namespace lld Modified: lld/trunk/ELF/CallGraphSort.cpp URL: http://llvm.org/viewvc/llvm-project/lld/trunk/ELF/CallGraphSort.cpp?rev=373885&r1=373884&r2=373885&view=diff ============================================================================== --- lld/trunk/ELF/CallGraphSort.cpp (original) +++ lld/trunk/ELF/CallGraphSort.cpp Mon Oct 7 01:31:18 2019 @@ -48,8 +48,9 @@ #include using namespace llvm; -using namespace lld; -using namespace lld::elf; + +namespace lld { +namespace elf { namespace { struct Edge { @@ -264,6 +265,9 @@ DenseMap // This first builds a call graph based on the profile data then merges sections // according to the C³ huristic. All clusters are then sorted by a density // metric to further improve locality. -DenseMap elf::computeCallGraphProfileOrder() { +DenseMap computeCallGraphProfileOrder() { return CallGraphSort().run(); } + +} // namespace elf +} // namespace lld Modified: lld/trunk/ELF/DWARF.cpp URL: http://llvm.org/viewvc/llvm-project/lld/trunk/ELF/DWARF.cpp?rev=373885&r1=373884&r2=373885&view=diff ============================================================================== --- lld/trunk/ELF/DWARF.cpp (original) +++ lld/trunk/ELF/DWARF.cpp Mon Oct 7 01:31:18 2019 @@ -22,9 +22,9 @@ using namespace llvm; using namespace llvm::object; -using namespace lld; -using namespace lld::elf; +namespace lld { +namespace elf { template LLDDwarfObj::LLDDwarfObj(ObjFile *obj) { for (InputSectionBase *sec : obj->getSections()) { if (!sec) @@ -124,7 +124,10 @@ Optional LLDDwarfObjtemplate rels()); } -template class elf::LLDDwarfObj; -template class elf::LLDDwarfObj; -template class elf::LLDDwarfObj; -template class elf::LLDDwarfObj; +template class LLDDwarfObj; +template class LLDDwarfObj; +template class LLDDwarfObj; +template class LLDDwarfObj; + +} // namespace elf +} // namespace lld Modified: lld/trunk/ELF/Driver.cpp URL: 
http://llvm.org/viewvc/llvm-project/lld/trunk/ELF/Driver.cpp?rev=373885&r1=373884&r2=373885&view=diff ============================================================================== --- lld/trunk/ELF/Driver.cpp (original) +++ lld/trunk/ELF/Driver.cpp Mon Oct 7 01:31:18 2019 @@ -66,17 +66,16 @@ using namespace llvm::object; using namespace llvm::sys; using namespace llvm::support; -using namespace lld; -using namespace lld::elf; +namespace lld { +namespace elf { -Configuration *elf::config; -LinkerDriver *elf::driver; +Configuration *config; +LinkerDriver *driver; static void setConfigs(opt::InputArgList &args); static void readConfigs(opt::InputArgList &args); -bool elf::link(ArrayRef args, bool canExitEarly, - raw_ostream &error) { +bool link(ArrayRef args, bool canExitEarly, raw_ostream &error) { errorHandler().logName = args::getFilenameWithoutExe(args[0]); errorHandler().errorLimitExceededMsg = "too many errors emitted, stopping now (use " @@ -1970,3 +1969,6 @@ template void LinkerDriver: // Write the result to the file. 
writeResult(); } + +} // namespace elf +} // namespace lld Modified: lld/trunk/ELF/DriverUtils.cpp URL: http://llvm.org/viewvc/llvm-project/lld/trunk/ELF/DriverUtils.cpp?rev=373885&r1=373884&r2=373885&view=diff ============================================================================== --- lld/trunk/ELF/DriverUtils.cpp (original) +++ lld/trunk/ELF/DriverUtils.cpp Mon Oct 7 01:31:18 2019 @@ -30,8 +30,8 @@ using namespace llvm; using namespace llvm::sys; using namespace llvm::opt; -using namespace lld; -using namespace lld::elf; +namespace lld { +namespace elf { // Create OptTable @@ -143,7 +143,7 @@ opt::InputArgList ELFOptTable::parse(Arr return args; } -void elf::printHelp() { +void printHelp() { ELFOptTable().PrintHelp( outs(), (config->progName + " [options] file...").str().c_str(), "lld", false /*ShowHidden*/, true /*ShowAllAliases*/); @@ -165,7 +165,7 @@ static std::string rewritePath(StringRef // Reconstructs command line arguments so that so that you can re-run // the same command with the same inputs. This is for --reproduce. -std::string elf::createResponseFile(const opt::InputArgList &args) { +std::string createResponseFile(const opt::InputArgList &args) { SmallString<0> data; raw_svector_ostream os(data); os << "--chroot .\n"; @@ -216,7 +216,7 @@ static Optional findFile(St return None; } -Optional elf::findFromSearchPaths(StringRef path) { +Optional findFromSearchPaths(StringRef path) { for (StringRef dir : config->searchPaths) if (Optional s = findFile(dir, path)) return s; @@ -225,7 +225,7 @@ Optional elf::findFromSearc // This is for -l. We'll look for lib.so or lib.a from // search paths. -Optional elf::searchLibraryBaseName(StringRef name) { +Optional searchLibraryBaseName(StringRef name) { for (StringRef dir : config->searchPaths) { if (!config->isStatic) if (Optional s = findFile(dir, "lib" + name + ".so")) @@ -237,17 +237,20 @@ Optional elf::searchLibrary } // This is for -l. 
-Optional elf::searchLibrary(StringRef name) { - if (name.startswith(":")) - return findFromSearchPaths(name.substr(1)); - return searchLibraryBaseName (name); +Optional searchLibrary(StringRef name) { + if (name.startswith(":")) + return findFromSearchPaths(name.substr(1)); + return searchLibraryBaseName(name); } // If a linker/version script doesn't exist in the current directory, we also // look for the script in the '-L' search paths. This matches the behaviour of // '-T', --version-script=, and linker script INPUT() command in ld.bfd. -Optional elf::searchScript(StringRef name) { +Optional searchScript(StringRef name) { if (fs::exists(name)) return name.str(); return findFromSearchPaths(name); } + +} // namespace elf +} // namespace lld Modified: lld/trunk/ELF/EhFrame.cpp URL: http://llvm.org/viewvc/llvm-project/lld/trunk/ELF/EhFrame.cpp?rev=373885&r1=373884&r2=373885&view=diff ============================================================================== --- lld/trunk/ELF/EhFrame.cpp (original) +++ lld/trunk/ELF/EhFrame.cpp Mon Oct 7 01:31:18 2019 @@ -30,9 +30,8 @@ using namespace llvm::ELF; using namespace llvm::dwarf; using namespace llvm::object; -using namespace lld; -using namespace lld::elf; - +namespace lld { +namespace elf { namespace { class EhReader { public: @@ -57,7 +56,7 @@ private: }; } -size_t elf::readEhRecordSize(InputSectionBase *s, size_t off) { +size_t readEhRecordSize(InputSectionBase *s, size_t off) { return EhReader(s, s->data().slice(off)).readEhRecordSize(); } @@ -149,7 +148,7 @@ void EhReader::skipAugP() { d = d.slice(size); } -uint8_t elf::getFdeEncoding(EhSectionPiece *p) { +uint8_t getFdeEncoding(EhSectionPiece *p) { return EhReader(p->sec, p->data()).getFdeEncoding(); } @@ -195,3 +194,6 @@ uint8_t EhReader::getFdeEncoding() { } return DW_EH_PE_absptr; } + +} // namespace elf +} // namespace lld Modified: lld/trunk/ELF/ICF.cpp URL: http://llvm.org/viewvc/llvm-project/lld/trunk/ELF/ICF.cpp?rev=373885&r1=373884&r2=373885&view=diff 
============================================================================== --- lld/trunk/ELF/ICF.cpp (original) +++ lld/trunk/ELF/ICF.cpp Mon Oct 7 01:31:18 2019 @@ -88,12 +88,12 @@ #include #include -using namespace lld; -using namespace lld::elf; using namespace llvm; using namespace llvm::ELF; using namespace llvm::object; +namespace lld { +namespace elf { namespace { template class ICF { public: @@ -512,9 +512,12 @@ template void ICF::ru } // ICF entry point function. -template void elf::doIcf() { ICF().run(); } +template void doIcf() { ICF().run(); } -template void elf::doIcf(); -template void elf::doIcf(); -template void elf::doIcf(); -template void elf::doIcf(); +template void doIcf(); +template void doIcf(); +template void doIcf(); +template void doIcf(); + +} // namespace elf +} // namespace lld Modified: lld/trunk/ELF/InputFiles.cpp URL: http://llvm.org/viewvc/llvm-project/lld/trunk/ELF/InputFiles.cpp?rev=373885&r1=373884&r2=373885&view=diff ============================================================================== --- lld/trunk/ELF/InputFiles.cpp (original) +++ lld/trunk/ELF/InputFiles.cpp Mon Oct 7 01:31:18 2019 @@ -37,18 +37,31 @@ using namespace llvm::sys; using namespace llvm::sys::fs; using namespace llvm::support::endian; -using namespace lld; -using namespace lld::elf; +namespace lld { +// Returns "", "foo.a(bar.o)" or "baz.o". 
+std::string toString(const elf::InputFile *f) { + if (!f) + return ""; + if (f->toStringCache.empty()) { + if (f->archiveName.empty()) + f->toStringCache = f->getName(); + else + f->toStringCache = (f->archiveName + "(" + f->getName() + ")").str(); + } + return f->toStringCache; +} + +namespace elf { bool InputFile::isInGroup; uint32_t InputFile::nextGroupId; -std::vector elf::binaryFiles; -std::vector elf::bitcodeFiles; -std::vector elf::lazyObjFiles; -std::vector elf::objectFiles; -std::vector elf::sharedFiles; +std::vector binaryFiles; +std::vector bitcodeFiles; +std::vector lazyObjFiles; +std::vector objectFiles; +std::vector sharedFiles; -std::unique_ptr elf::tar; +std::unique_ptr tar; static ELFKind getELFKind(MemoryBufferRef mb, StringRef archiveName) { unsigned char size; @@ -88,7 +101,7 @@ InputFile::InputFile(Kind k, MemoryBuffe ++nextGroupId; } -Optional elf::readFile(StringRef path) { +Optional readFile(StringRef path) { // The --chroot option changes our virtual root directory. // This is useful when you are dealing with files created by --reproduce. if (!config->chroot.empty() && path.startswith("/")) @@ -188,7 +201,7 @@ template static void doPars } // Add symbols in File to the symbol table. -void elf::parseFile(InputFile *file) { +void parseFile(InputFile *file) { switch (config->ekind) { case ELF32LEKind: doParseFile(file); @@ -356,20 +369,6 @@ Optional ObjFile::getD return None; } -// Returns "", "foo.a(bar.o)" or "baz.o". 
-std::string lld::toString(const InputFile *f) { - if (!f) - return ""; - - if (f->toStringCache.empty()) { - if (f->archiveName.empty()) - f->toStringCache = f->getName(); - else - f->toStringCache = (f->archiveName + "(" + f->getName() + ")").str(); - } - return f->toStringCache; -} - ELFFileBase::ELFFileBase(Kind k, MemoryBufferRef mb) : InputFile(k, mb) { ekind = getELFKind(mb, ""); @@ -1530,8 +1529,8 @@ void BinaryFile::parse() { STV_DEFAULT, STT_OBJECT, data.size(), 0, nullptr}); } -InputFile *elf::createObjectFile(MemoryBufferRef mb, StringRef archiveName, - uint64_t offsetInArchive) { +InputFile *createObjectFile(MemoryBufferRef mb, StringRef archiveName, + uint64_t offsetInArchive) { if (isBitcode(mb)) return make(mb, archiveName, offsetInArchive); @@ -1622,7 +1621,7 @@ template void LazyObjFile:: } } -std::string elf::replaceThinLTOSuffix(StringRef path) { +std::string replaceThinLTOSuffix(StringRef path) { StringRef suffix = config->thinLTOObjectSuffixReplace.first; StringRef repl = config->thinLTOObjectSuffixReplace.second; @@ -1641,12 +1640,15 @@ template void LazyObjFile::parse(); template void LazyObjFile::parse(); -template class elf::ObjFile; -template class elf::ObjFile; -template class elf::ObjFile; -template class elf::ObjFile; +template class ObjFile; +template class ObjFile; +template class ObjFile; +template class ObjFile; template void SharedFile::parse(); template void SharedFile::parse(); template void SharedFile::parse(); template void SharedFile::parse(); + +} // namespace elf +} // namespace lld Modified: lld/trunk/ELF/InputFiles.h URL: http://llvm.org/viewvc/llvm-project/lld/trunk/ELF/InputFiles.h?rev=373885&r1=373884&r2=373885&view=diff ============================================================================== --- lld/trunk/ELF/InputFiles.h (original) +++ lld/trunk/ELF/InputFiles.h Mon Oct 7 01:31:18 2019 @@ -33,15 +33,13 @@ class InputFile; } // namespace llvm namespace lld { -namespace elf { -class InputFile; -class 
InputSectionBase; -} // Returns "", "foo.a(bar.o)" or "baz.o". std::string toString(const elf::InputFile *f); namespace elf { +class InputFile; +class InputSectionBase; using llvm::object::Archive; Modified: lld/trunk/ELF/InputSection.cpp URL: http://llvm.org/viewvc/llvm-project/lld/trunk/ELF/InputSection.cpp?rev=373885&r1=373884&r2=373885&view=diff ============================================================================== --- lld/trunk/ELF/InputSection.cpp (original) +++ lld/trunk/ELF/InputSection.cpp Mon Oct 7 01:31:18 2019 @@ -37,16 +37,15 @@ using namespace llvm::support; using namespace llvm::support::endian; using namespace llvm::sys; -using namespace lld; -using namespace lld::elf; - -std::vector elf::inputSections; - +namespace lld { // Returns a string to construct an error message. -std::string lld::toString(const InputSectionBase *sec) { +std::string toString(const elf::InputSectionBase *sec) { return (toString(sec->file) + ":(" + sec->name + ")").str(); } +namespace elf { +std::vector inputSections; + template static ArrayRef getSectionContents(ObjFile &file, const typename ELFT::Shdr &hdr) { @@ -619,7 +618,7 @@ static int64_t getTlsTpOffset(const Symb // Variant 2. Static TLS blocks, followed by alignment padding are placed // before TP. The alignment padding is added so that (TP - padding - // p_memsz) is congruent to p_vaddr modulo p_align. - elf::PhdrEntry *tls = Out::tlsPhdr; + PhdrEntry *tls = Out::tlsPhdr; switch (config->emachine) { // Variant 1. 
case EM_ARM: @@ -1082,7 +1081,7 @@ void InputSectionBase::adjustSplitStackF end, f->stOther)) continue; if (!getFile()->someNoSplitStack) - error(lld::toString(this) + ": " + f->getName() + + error(toString(this) + ": " + f->getName() + " (with -fsplit-stack) calls " + rel.sym->getName() + " (without -fsplit-stack), but couldn't adjust its prologue"); } @@ -1345,3 +1344,6 @@ template void EhInputSection::split(); template void EhInputSection::split(); template void EhInputSection::split(); + +} // namespace elf +} // namespace lld Modified: lld/trunk/ELF/LTO.cpp URL: http://llvm.org/viewvc/llvm-project/lld/trunk/ELF/LTO.cpp?rev=373885&r1=373884&r2=373885&view=diff ============================================================================== --- lld/trunk/ELF/LTO.cpp (original) +++ lld/trunk/ELF/LTO.cpp Mon Oct 7 01:31:18 2019 @@ -42,8 +42,8 @@ using namespace llvm; using namespace llvm::object; using namespace llvm::ELF; -using namespace lld; -using namespace lld::elf; +namespace lld { +namespace elf { // Creates an empty file to store a list of object files for final // linking of distributed ThinLTO. 
@@ -303,3 +303,6 @@ std::vector BitcodeCompiler ret.push_back(createObjectFile(*file)); return ret; } + +} // namespace elf +} // namespace lld Modified: lld/trunk/ELF/LinkerScript.cpp URL: http://llvm.org/viewvc/llvm-project/lld/trunk/ELF/LinkerScript.cpp?rev=373885&r1=373884&r2=373885&view=diff ============================================================================== --- lld/trunk/ELF/LinkerScript.cpp (original) +++ lld/trunk/ELF/LinkerScript.cpp Mon Oct 7 01:31:18 2019 @@ -43,10 +43,10 @@ using namespace llvm; using namespace llvm::ELF; using namespace llvm::object; using namespace llvm::support::endian; -using namespace lld; -using namespace lld::elf; -LinkerScript *elf::script; +namespace lld { +namespace elf { +LinkerScript *script; static uint64_t getOutputSectionVA(SectionBase *sec) { OutputSection *os = sec->getOutputSection(); @@ -1202,3 +1202,6 @@ std::vector LinkerScript::getPhd } return ret; } + +} // namespace elf +} // namespace lld Modified: lld/trunk/ELF/MapFile.cpp URL: http://llvm.org/viewvc/llvm-project/lld/trunk/ELF/MapFile.cpp?rev=373885&r1=373884&r2=373885&view=diff ============================================================================== --- lld/trunk/ELF/MapFile.cpp (original) +++ lld/trunk/ELF/MapFile.cpp Mon Oct 7 01:31:18 2019 @@ -34,9 +34,8 @@ using namespace llvm; using namespace llvm::object; -using namespace lld; -using namespace lld::elf; - +namespace lld { +namespace elf { using SymbolMapTy = DenseMap>; static constexpr char indent8[] = " "; // 8 spaces @@ -139,7 +138,7 @@ static void printEhFrame(raw_ostream &os } } -void elf::writeMapFile() { +void writeMapFile() { if (config->mapFile.empty()) return; @@ -228,7 +227,7 @@ static void print(StringRef a, StringRef // // In this case, strlen is defined by libc.so.6 and used by other two // files. 
-void elf::writeCrossReferenceTable() { +void writeCrossReferenceTable() { if (!config->cref) return; @@ -259,3 +258,6 @@ void elf::writeCrossReferenceTable() { print("", toString(file)); } } + +} // namespace elf +} // namespace lld Modified: lld/trunk/ELF/MarkLive.cpp URL: http://llvm.org/viewvc/llvm-project/lld/trunk/ELF/MarkLive.cpp?rev=373885&r1=373884&r2=373885&view=diff ============================================================================== --- lld/trunk/ELF/MarkLive.cpp (original) +++ lld/trunk/ELF/MarkLive.cpp Mon Oct 7 01:31:18 2019 @@ -37,11 +37,11 @@ using namespace llvm; using namespace llvm::ELF; using namespace llvm::object; -using namespace llvm::support::endian; -using namespace lld; -using namespace lld::elf; +namespace endian = llvm::support::endian; +namespace lld { +namespace elf { namespace { template class MarkLive { public: @@ -141,7 +141,7 @@ void MarkLive::scanEhFrameSection( if (firstRelI == (unsigned)-1) continue; - if (read32(piece.data().data() + 4) == 0) { + if (endian::read32(piece.data().data() + 4) == 0) { // This is a CIE, we only need to worry about the first relocation. It is // known to point to the personality function. resolveReloc(eh, rels[firstRelI], false); @@ -317,7 +317,7 @@ template void MarkLive void elf::markLive() { +template void markLive() { // If -gc-sections is not given, no sections are removed. 
if (!config->gcSections) { for (InputSectionBase *sec : inputSections) @@ -379,7 +379,10 @@ template void elf::markLive message("removing unused section " + toString(sec)); } -template void elf::markLive(); -template void elf::markLive(); -template void elf::markLive(); -template void elf::markLive(); +template void markLive(); +template void markLive(); +template void markLive(); +template void markLive(); + +} // namespace elf +} // namespace lld Modified: lld/trunk/ELF/OutputSections.cpp URL: http://llvm.org/viewvc/llvm-project/lld/trunk/ELF/OutputSections.cpp?rev=373885&r1=373884&r2=373885&view=diff ============================================================================== --- lld/trunk/ELF/OutputSections.cpp (original) +++ lld/trunk/ELF/OutputSections.cpp Mon Oct 7 01:31:18 2019 @@ -27,9 +27,8 @@ using namespace llvm::object; using namespace llvm::support::endian; using namespace llvm::ELF; -using namespace lld; -using namespace lld::elf; - +namespace lld { +namespace elf { uint8_t *Out::bufferStart; uint8_t Out::first; PhdrEntry *Out::tlsPhdr; @@ -39,7 +38,7 @@ OutputSection *Out::preinitArray; OutputSection *Out::initArray; OutputSection *Out::finiArray; -std::vector elf::outputSections; +std::vector outputSections; uint32_t OutputSection::getPhdrFlags() const { uint32_t ret = 0; @@ -226,7 +225,7 @@ static void sortByOrder(MutableArrayRef< in[i] = v[i].second; } -uint64_t elf::getHeaderSize() { +uint64_t getHeaderSize() { if (config->oFormatBinary) return 0; return Out::elfHeader->size + Out::programHeaders->size; @@ -446,7 +445,7 @@ void OutputSection::sortCtorsDtors() { // If an input string is in the form of "foo.N" where N is a number, // return N. Otherwise, returns 65536, which is one greater than the // lowest priority. 
-int elf::getPriority(StringRef s) { +int getPriority(StringRef s) { size_t pos = s.rfind('.'); if (pos == StringRef::npos) return 65536; @@ -456,7 +455,7 @@ int elf::getPriority(StringRef s) { return v; } -std::vector elf::getInputSections(OutputSection *os) { +std::vector getInputSections(OutputSection *os) { std::vector ret; for (BaseCommand *base : os->sectionCommands) if (auto *isd = dyn_cast(base)) @@ -497,3 +496,6 @@ template void OutputSection::maybeCompre template void OutputSection::maybeCompress(); template void OutputSection::maybeCompress(); template void OutputSection::maybeCompress(); + +} // namespace elf +} // namespace lld Modified: lld/trunk/ELF/Relocations.cpp URL: http://llvm.org/viewvc/llvm-project/lld/trunk/ELF/Relocations.cpp?rev=373885&r1=373884&r2=373885&view=diff ============================================================================== --- lld/trunk/ELF/Relocations.cpp (original) +++ lld/trunk/ELF/Relocations.cpp Mon Oct 7 01:31:18 2019 @@ -62,9 +62,8 @@ using namespace llvm::ELF; using namespace llvm::object; using namespace llvm::support::endian; -using namespace lld; -using namespace lld::elf; - +namespace lld { +namespace elf { static Optional getLinkerScriptLocation(const Symbol &sym) { for (BaseCommand *base : script->sectionCommands) if (auto *cmd = dyn_cast(base)) @@ -823,7 +822,7 @@ static void reportUndefinedSymbol(const error(msg); } -template void elf::reportUndefinedSymbols() { +template void reportUndefinedSymbols() { // Find the first "undefined symbol" diagnostic for each diagnostic, and // collect all "referenced from" lines at the first diagnostic. 
DenseMap firstRef; @@ -1405,7 +1404,7 @@ static void scanRelocs(InputSectionBase }); } -template void elf::scanRelocations(InputSectionBase &s) { +template void scanRelocations(InputSectionBase &s) { if (s.areRelocsRela) scanRelocs(s, s.relas()); else @@ -1832,11 +1831,14 @@ bool ThunkCreator::createThunks(ArrayRef return addressesChanged; } -template void elf::scanRelocations(InputSectionBase &); -template void elf::scanRelocations(InputSectionBase &); -template void elf::scanRelocations(InputSectionBase &); -template void elf::scanRelocations(InputSectionBase &); -template void elf::reportUndefinedSymbols(); -template void elf::reportUndefinedSymbols(); -template void elf::reportUndefinedSymbols(); -template void elf::reportUndefinedSymbols(); +template void scanRelocations(InputSectionBase &); +template void scanRelocations(InputSectionBase &); +template void scanRelocations(InputSectionBase &); +template void scanRelocations(InputSectionBase &); +template void reportUndefinedSymbols(); +template void reportUndefinedSymbols(); +template void reportUndefinedSymbols(); +template void reportUndefinedSymbols(); + +} // namespace elf +} // namespace lld Modified: lld/trunk/ELF/ScriptLexer.cpp URL: http://llvm.org/viewvc/llvm-project/lld/trunk/ELF/ScriptLexer.cpp?rev=373885&r1=373884&r2=373885&view=diff ============================================================================== --- lld/trunk/ELF/ScriptLexer.cpp (original) +++ lld/trunk/ELF/ScriptLexer.cpp Mon Oct 7 01:31:18 2019 @@ -36,9 +36,9 @@ #include "llvm/ADT/Twine.h" using namespace llvm; -using namespace lld; -using namespace lld::elf; +namespace lld { +namespace elf { // Returns a whole line containing the current token. 
StringRef ScriptLexer::getLine() { StringRef s = getCurrentMB().getBuffer(); @@ -298,3 +298,6 @@ MemoryBufferRef ScriptLexer::getCurrentM return mb; llvm_unreachable("getCurrentMB: failed to find a token"); } + +} // namespace elf +} // namespace lld Modified: lld/trunk/ELF/ScriptParser.cpp URL: http://llvm.org/viewvc/llvm-project/lld/trunk/ELF/ScriptParser.cpp?rev=373885&r1=373884&r2=373885&view=diff ============================================================================== --- lld/trunk/ELF/ScriptParser.cpp (original) +++ lld/trunk/ELF/ScriptParser.cpp Mon Oct 7 01:31:18 2019 @@ -37,9 +37,9 @@ using namespace llvm; using namespace llvm::ELF; using namespace llvm::support::endian; -using namespace lld; -using namespace lld::elf; +namespace lld { +namespace elf { namespace { class ScriptParser final : ScriptLexer { public: @@ -1268,7 +1268,7 @@ Expr ScriptParser::readPrimary() { return [=] { return cmd->size; }; } if (tok == "SIZEOF_HEADERS") - return [=] { return elf::getHeaderSize(); }; + return [=] { return getHeaderSize(); }; // Tok is the dot. 
if (tok == ".") @@ -1511,18 +1511,19 @@ std::pair ScriptPars return {flags, negFlags}; } -void elf::readLinkerScript(MemoryBufferRef mb) { +void readLinkerScript(MemoryBufferRef mb) { ScriptParser(mb).readLinkerScript(); } -void elf::readVersionScript(MemoryBufferRef mb) { +void readVersionScript(MemoryBufferRef mb) { ScriptParser(mb).readVersionScript(); } -void elf::readDynamicList(MemoryBufferRef mb) { - ScriptParser(mb).readDynamicList(); -} +void readDynamicList(MemoryBufferRef mb) { ScriptParser(mb).readDynamicList(); } -void elf::readDefsym(StringRef name, MemoryBufferRef mb) { +void readDefsym(StringRef name, MemoryBufferRef mb) { ScriptParser(mb).readDefsym(name); } + +} // namespace elf +} // namespace lld Modified: lld/trunk/ELF/SymbolTable.cpp URL: http://llvm.org/viewvc/llvm-project/lld/trunk/ELF/SymbolTable.cpp?rev=373885&r1=373884&r2=373885&view=diff ============================================================================== --- lld/trunk/ELF/SymbolTable.cpp (original) +++ lld/trunk/ELF/SymbolTable.cpp Mon Oct 7 01:31:18 2019 @@ -27,10 +27,9 @@ using namespace llvm; using namespace llvm::object; using namespace llvm::ELF; -using namespace lld; -using namespace lld::elf; - -SymbolTable *elf::symtab; +namespace lld { +namespace elf { +SymbolTable *symtab; void SymbolTable::wrap(Symbol *sym, Symbol *real, Symbol *wrap) { // Swap symbols as instructed by -wrap. @@ -265,3 +264,6 @@ void SymbolTable::scanVersionScript() { // --dynamic-list. 
handleDynamicList(); } + +} // namespace elf +} // namespace lld Modified: lld/trunk/ELF/Symbols.cpp URL: http://llvm.org/viewvc/llvm-project/lld/trunk/ELF/Symbols.cpp?rev=373885&r1=373884&r2=373885&view=diff ============================================================================== --- lld/trunk/ELF/Symbols.cpp (original) +++ lld/trunk/ELF/Symbols.cpp Mon Oct 7 01:31:18 2019 @@ -23,9 +23,20 @@ using namespace llvm; using namespace llvm::object; using namespace llvm::ELF; -using namespace lld; -using namespace lld::elf; +namespace lld { +// Returns a symbol for an error message. +static std::string demangle(StringRef symName) { + if (elf::config->demangle) + return demangleItanium(symName); + return symName; +} +std::string toString(const elf::Symbol &b) { return demangle(b.getName()); } +std::string toELFString(const Archive::Symbol &b) { + return demangle(b.getName()); +} + +namespace elf { Defined *ElfSym::bss; Defined *ElfSym::etext1; Defined *ElfSym::etext2; @@ -42,19 +53,6 @@ Defined *ElfSym::relaIpltEnd; Defined *ElfSym::riscvGlobalPointer; Defined *ElfSym::tlsModuleBase; -// Returns a symbol for an error message. -static std::string demangle(StringRef symName) { - if (config->demangle) - return demangleItanium(symName); - return symName; -} -namespace lld { -std::string toString(const Symbol &b) { return demangle(b.getName()); } -std::string toELFString(const Archive::Symbol &b) { - return demangle(b.getName()); -} -} // namespace lld - static uint64_t getSymVA(const Symbol &sym, int64_t &addend) { switch (sym.kind()) { case Symbol::DefinedKind: { @@ -298,7 +296,7 @@ bool Symbol::includeInDynsym() const { } // Print out a log message for --trace-symbol. 
-void elf::printTraceSymbol(const Symbol *sym) { +void printTraceSymbol(const Symbol *sym) { std::string s; if (sym->isUndefined()) s = ": reference to "; @@ -314,7 +312,7 @@ void elf::printTraceSymbol(const Symbol message(toString(sym->file) + s + sym->getName()); } -void elf::maybeWarnUnorderableSymbol(const Symbol *sym) { +void maybeWarnUnorderableSymbol(const Symbol *sym) { if (!config->warnSymbolOrdering) return; @@ -655,3 +653,6 @@ void Symbol::resolveShared(const SharedS referenced = true; } } + +} // namespace elf +} // namespace lld Modified: lld/trunk/ELF/Symbols.h URL: http://llvm.org/viewvc/llvm-project/lld/trunk/ELF/Symbols.h?rev=373885&r1=373884&r2=373885&view=diff ============================================================================== --- lld/trunk/ELF/Symbols.h (original) +++ lld/trunk/ELF/Symbols.h Mon Oct 7 01:31:18 2019 @@ -21,6 +21,13 @@ #include "llvm/Object/ELF.h" namespace lld { +std::string toString(const elf::Symbol &); + +// There are two different ways to convert an Archive::Symbol to a string: +// One for Microsoft name mangling and one for Itanium name mangling. +// Call the functions toCOFFString and toELFString, not just toString. +std::string toELFString(const llvm::object::Archive::Symbol &); + namespace elf { class CommonSymbol; class Defined; @@ -30,16 +37,6 @@ class LazyObject; class SharedSymbol; class Symbol; class Undefined; -} // namespace elf - -std::string toString(const elf::Symbol &); - -// There are two different ways to convert an Archive::Symbol to a string: -// One for Microsoft name mangling and one for Itanium name mangling. -// Call the functions toCOFFString and toELFString, not just toString. -std::string toELFString(const elf::Archive::Symbol &); - -namespace elf { // This is a StringRef-like container that doesn't run strlen(). 
// Modified: lld/trunk/ELF/SyntheticSections.cpp URL: http://llvm.org/viewvc/llvm-project/lld/trunk/ELF/SyntheticSections.cpp?rev=373885&r1=373884&r2=373885&view=diff ============================================================================== --- lld/trunk/ELF/SyntheticSections.cpp (original) +++ lld/trunk/ELF/SyntheticSections.cpp Mon Oct 7 01:31:18 2019 @@ -45,13 +45,12 @@ using namespace llvm::ELF; using namespace llvm::object; using namespace llvm::support; -using namespace lld; -using namespace lld::elf; - using llvm::support::endian::read32le; using llvm::support::endian::write32le; using llvm::support::endian::write64le; +namespace lld { +namespace elf { constexpr size_t MergeNoTailSection::numShards; static uint64_t readUint(uint8_t *buf) { @@ -82,7 +81,7 @@ static ArrayRef getVersion() { // With this feature, you can identify LLD-generated binaries easily // by "readelf --string-dump .comment ". // The returned object is a mergeable string section. -MergeInputSection *elf::createCommentSection() { +MergeInputSection *createCommentSection() { return make(SHF_MERGE | SHF_STRINGS, SHT_PROGBITS, 1, getVersion(), ".comment"); } @@ -138,7 +137,7 @@ MipsAbiFlagsSection *MipsAbiFlagsS flags.ases |= s->ases; flags.flags1 |= s->flags1; flags.flags2 |= s->flags2; - flags.fp_abi = elf::getMipsFpAbiFlag(flags.fp_abi, s->fp_abi, filename); + flags.fp_abi = getMipsFpAbiFlag(flags.fp_abi, s->fp_abi, filename); }; if (create) @@ -252,7 +251,7 @@ MipsReginfoSection *MipsReginfoSec return make>(reginfo); } -InputSection *elf::createInterpSection() { +InputSection *createInterpSection() { // StringSaver guarantees that the returned string ends with '\0'. 
StringRef s = saver.save(config->dynamicLinker); ArrayRef contents = {(const uint8_t *)s.data(), s.size() + 1}; @@ -261,8 +260,8 @@ InputSection *elf::createInterpSection() ".interp"); } -Defined *elf::addSyntheticLocal(StringRef name, uint8_t type, uint64_t value, - uint64_t size, InputSectionBase &section) { +Defined *addSyntheticLocal(StringRef name, uint8_t type, uint64_t value, + uint64_t size, InputSectionBase &section) { auto *s = make(section.file, name, STB_LOCAL, STV_DEFAULT, type, value, size, &section); if (in.symTab) @@ -1274,7 +1273,7 @@ static uint64_t addPltRelSz() { // Add remaining entries to complete .dynamic contents. template void DynamicSection::finalizeContents() { - elf::Partition &part = getPartition(); + Partition &part = getPartition(); bool isMain = part.name.empty(); for (StringRef s : config->filterList) @@ -2940,7 +2939,7 @@ bool VersionTableSection::isNeeded() con return getPartition().verDef || getPartition().verNeed->isNeeded(); } -void elf::addVerneed(Symbol *ss) { +void addVerneed(Symbol *ss) { auto &file = cast(*ss->file); if (ss->verdefIndex == VER_NDX_GLOBAL) { ss->versionId = VER_NDX_GLOBAL; @@ -3123,16 +3122,16 @@ void MergeNoTailSection::finalizeContent }); } -MergeSyntheticSection *elf::createMergeSynthetic(StringRef name, uint32_t type, - uint64_t flags, - uint32_t alignment) { +MergeSyntheticSection *createMergeSynthetic(StringRef name, uint32_t type, + uint64_t flags, + uint32_t alignment) { bool shouldTailMerge = (flags & SHF_STRINGS) && config->optimize >= 2; if (shouldTailMerge) return make(name, type, flags, alignment); return make(name, type, flags, alignment); } -template void elf::splitSections() { +template void splitSections() { // splitIntoPieces needs to be called on each MergeInputSection // before calling finalizeContents().
parallelForEach(inputSections, [](InputSectionBase *sec) { @@ -3486,7 +3485,7 @@ static uint8_t getAbiVersion() { return 0; } -template void elf::writeEhdr(uint8_t *buf, Partition &part) { +template void writeEhdr(uint8_t *buf, Partition &part) { // For executable segments, the trap instructions are written before writing // the header. Setting Elf header bytes to zero ensures that any unused bytes // in header are zero-cleared, instead of having trap instructions. @@ -3512,7 +3511,7 @@ template void elf::write } } -template void elf::writePhdrs(uint8_t *buf, Partition &part) { +template void writePhdrs(uint8_t *buf, Partition &part) { // Write the program header table. auto *hBuf = reinterpret_cast(buf); for (PhdrEntry *p : part.phdrs) { @@ -3587,87 +3586,90 @@ void PartitionIndexSection::writeTo(uint } } -InStruct elf::in; +InStruct in; -std::vector elf::partitions; -Partition *elf::mainPart; +std::vector partitions; +Partition *mainPart; template GdbIndexSection *GdbIndexSection::create(); template GdbIndexSection *GdbIndexSection::create(); template GdbIndexSection *GdbIndexSection::create(); template GdbIndexSection *GdbIndexSection::create(); -template void elf::splitSections(); -template void elf::splitSections(); -template void elf::splitSections(); -template void elf::splitSections(); +template void splitSections(); +template void splitSections(); +template void splitSections(); +template void splitSections(); template void PltSection::addEntry(Symbol &Sym); template void PltSection::addEntry(Symbol &Sym); template void PltSection::addEntry(Symbol &Sym); template void PltSection::addEntry(Symbol &Sym); -template class elf::MipsAbiFlagsSection; -template class elf::MipsAbiFlagsSection; -template class elf::MipsAbiFlagsSection; -template class elf::MipsAbiFlagsSection; - -template class elf::MipsOptionsSection; -template class elf::MipsOptionsSection; -template class elf::MipsOptionsSection; -template class elf::MipsOptionsSection; - -template class 
elf::MipsReginfoSection; -template class elf::MipsReginfoSection; -template class elf::MipsReginfoSection; -template class elf::MipsReginfoSection; - -template class elf::DynamicSection; -template class elf::DynamicSection; -template class elf::DynamicSection; -template class elf::DynamicSection; - -template class elf::RelocationSection; -template class elf::RelocationSection; -template class elf::RelocationSection; -template class elf::RelocationSection; - -template class elf::AndroidPackedRelocationSection; -template class elf::AndroidPackedRelocationSection; -template class elf::AndroidPackedRelocationSection; -template class elf::AndroidPackedRelocationSection; - -template class elf::RelrSection; -template class elf::RelrSection; -template class elf::RelrSection; -template class elf::RelrSection; - -template class elf::SymbolTableSection; -template class elf::SymbolTableSection; -template class elf::SymbolTableSection; -template class elf::SymbolTableSection; - -template class elf::VersionNeedSection; -template class elf::VersionNeedSection; -template class elf::VersionNeedSection; -template class elf::VersionNeedSection; - -template void elf::writeEhdr(uint8_t *Buf, Partition &Part); -template void elf::writeEhdr(uint8_t *Buf, Partition &Part); -template void elf::writeEhdr(uint8_t *Buf, Partition &Part); -template void elf::writeEhdr(uint8_t *Buf, Partition &Part); - -template void elf::writePhdrs(uint8_t *Buf, Partition &Part); -template void elf::writePhdrs(uint8_t *Buf, Partition &Part); -template void elf::writePhdrs(uint8_t *Buf, Partition &Part); -template void elf::writePhdrs(uint8_t *Buf, Partition &Part); - -template class elf::PartitionElfHeaderSection; -template class elf::PartitionElfHeaderSection; -template class elf::PartitionElfHeaderSection; -template class elf::PartitionElfHeaderSection; - -template class elf::PartitionProgramHeadersSection; -template class elf::PartitionProgramHeadersSection; -template class 
elf::PartitionProgramHeadersSection; -template class elf::PartitionProgramHeadersSection; +template class MipsAbiFlagsSection; +template class MipsAbiFlagsSection; +template class MipsAbiFlagsSection; +template class MipsAbiFlagsSection; + +template class MipsOptionsSection; +template class MipsOptionsSection; +template class MipsOptionsSection; +template class MipsOptionsSection; + +template class MipsReginfoSection; +template class MipsReginfoSection; +template class MipsReginfoSection; +template class MipsReginfoSection; + +template class DynamicSection; +template class DynamicSection; +template class DynamicSection; +template class DynamicSection; + +template class RelocationSection; +template class RelocationSection; +template class RelocationSection; +template class RelocationSection; + +template class AndroidPackedRelocationSection; +template class AndroidPackedRelocationSection; +template class AndroidPackedRelocationSection; +template class AndroidPackedRelocationSection; + +template class RelrSection; +template class RelrSection; +template class RelrSection; +template class RelrSection; + +template class SymbolTableSection; +template class SymbolTableSection; +template class SymbolTableSection; +template class SymbolTableSection; + +template class VersionNeedSection; +template class VersionNeedSection; +template class VersionNeedSection; +template class VersionNeedSection; + +template void writeEhdr(uint8_t *Buf, Partition &Part); +template void writeEhdr(uint8_t *Buf, Partition &Part); +template void writeEhdr(uint8_t *Buf, Partition &Part); +template void writeEhdr(uint8_t *Buf, Partition &Part); + +template void writePhdrs(uint8_t *Buf, Partition &Part); +template void writePhdrs(uint8_t *Buf, Partition &Part); +template void writePhdrs(uint8_t *Buf, Partition &Part); +template void writePhdrs(uint8_t *Buf, Partition &Part); + +template class PartitionElfHeaderSection; +template class PartitionElfHeaderSection; +template class 
PartitionElfHeaderSection; +template class PartitionElfHeaderSection; + +template class PartitionProgramHeadersSection; +template class PartitionProgramHeadersSection; +template class PartitionProgramHeadersSection; +template class PartitionProgramHeadersSection; + +} // namespace elf +} // namespace lld Modified: lld/trunk/ELF/Target.cpp URL: http://llvm.org/viewvc/llvm-project/lld/trunk/ELF/Target.cpp?rev=373885&r1=373884&r2=373885&view=diff ============================================================================== --- lld/trunk/ELF/Target.cpp (original) +++ lld/trunk/ELF/Target.cpp Mon Oct 7 01:31:18 2019 @@ -34,19 +34,19 @@ using namespace llvm; using namespace llvm::object; using namespace llvm::ELF; -using namespace lld; -using namespace lld::elf; -const TargetInfo *elf::target; - -std::string lld::toString(RelType type) { +namespace lld { +std::string toString(elf::RelType type) { StringRef s = getELFRelocationTypeName(elf::config->emachine, type); if (s == "Unknown") return ("Unknown (" + Twine(type) + ")").str(); return s; } -TargetInfo *elf::getTarget() { +namespace elf { +const TargetInfo *target; + +TargetInfo *getTarget() { switch (config->emachine) { case EM_386: case EM_IAMCU: @@ -103,7 +103,7 @@ template static ErrorPlace return {}; } -ErrorPlace elf::getErrorPlace(const uint8_t *loc) { +ErrorPlace getErrorPlace(const uint8_t *loc) { switch (config->ekind) { case ELF32LEKind: return getErrPlace(loc); @@ -179,3 +179,6 @@ uint64_t TargetInfo::getImageBase() cons return *config->imageBase; return config->isPic ? 
0 : defaultImageBase; } + +} // namespace elf +} // namespace lld Modified: lld/trunk/ELF/Writer.cpp URL: http://llvm.org/viewvc/llvm-project/lld/trunk/ELF/Writer.cpp?rev=373885&r1=373884&r2=373885&view=diff ============================================================================== --- lld/trunk/ELF/Writer.cpp (original) +++ lld/trunk/ELF/Writer.cpp Mon Oct 7 01:31:18 2019 @@ -36,9 +36,8 @@ using namespace llvm::object; using namespace llvm::support; using namespace llvm::support::endian; -using namespace lld; -using namespace lld::elf; - +namespace lld { +namespace elf { namespace { // The writer writes a SymbolTable result to a file. template class Writer { @@ -92,7 +91,7 @@ static bool isSectionPrefix(StringRef pr return name.startswith(prefix) || name == prefix.drop_back(); } -StringRef elf::getOutputSectionName(const InputSectionBase *s) { +StringRef getOutputSectionName(const InputSectionBase *s) { if (config->relocatable) return s->name; @@ -140,7 +139,7 @@ static bool needsInterpSection() { script->needsInterpSection(); } -template void elf::writeResult() { Writer().run(); } +template void writeResult() { Writer().run(); } static void removeEmptyPTLoad(std::vector &phdrs) { llvm::erase_if(phdrs, [&](const PhdrEntry *p) { @@ -153,7 +152,7 @@ static void removeEmptyPTLoad(std::vecto }); } -void elf::copySectionsIntoPartitions() { +void copySectionsIntoPartitions() { std::vector newSections; for (unsigned part = 2; part != partitions.size() + 1; ++part) { for (InputSectionBase *s : inputSections) { @@ -175,7 +174,7 @@ void elf::copySectionsIntoPartitions() { newSections.end()); } -void elf::combineEhSections() { +void combineEhSections() { for (InputSectionBase *&s : inputSections) { // Ignore dead sections and the partition end marker (.part.end), // whose partition number is out of bounds. @@ -216,7 +215,7 @@ static Defined *addAbsolute(StringRef na // The linker is expected to define some symbols depending on // the linking result. 
This function defines such symbols. -void elf::addReservedSymbols() { +void addReservedSymbols() { if (config->emachine == EM_MIPS) { // Define _gp for MIPS. st_value of _gp symbol will be updated by Writer // so that it points to an absolute address which by default is relative @@ -309,7 +308,7 @@ static OutputSection *findSection(String return nullptr; } -template void elf::createSyntheticSections() { +template void createSyntheticSections() { // Initialize all pointers with NULL. This is needed because // you can call lld::elf::main more than once as a library. memset(&Out::first, 0, sizeof(Out)); @@ -2737,12 +2736,15 @@ template void Writer: part.buildId->writeBuildId(buildId); } -template void elf::createSyntheticSections(); -template void elf::createSyntheticSections(); -template void elf::createSyntheticSections(); -template void elf::createSyntheticSections(); - -template void elf::writeResult(); -template void elf::writeResult(); -template void elf::writeResult(); -template void elf::writeResult(); +template void createSyntheticSections(); +template void createSyntheticSections(); +template void createSyntheticSections(); +template void createSyntheticSections(); + +template void writeResult(); +template void writeResult(); +template void writeResult(); +template void writeResult(); + +} // namespace elf +} // namespace lld From llvm-commits at lists.llvm.org Mon Oct 7 01:49:34 2019 From: llvm-commits at lists.llvm.org (Merge Guard [bot] via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 08:49:34 +0000 (UTC) Subject: [PATCH] D68560: Test for the build server -- DO NOT MERGE! In-Reply-To: References: Message-ID: <68d2f54823d471620bd6b86d774d27c0@localhost.localdomain> merge_guards_bot added a comment. Build results are available at http://results.llvm-merge-guard.org/Phabricator-7 See http://jenkins.llvm-merge-guard.org/job/Phabricator/7/ for more details.
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68560/new/ https://reviews.llvm.org/D68560 From llvm-commits at lists.llvm.org Mon Oct 7 01:52:08 2019 From: llvm-commits at lists.llvm.org (Fangrui Song via llvm-commits) Date: Mon, 07 Oct 2019 08:52:08 -0000 Subject: [lld] r373886 - [ELF][MIPS] De-template writeValue. NFC Message-ID: <20191007085208.21D8C8B18D@lists.llvm.org> Author: maskray Date: Mon Oct 7 01:52:07 2019 New Revision: 373886 URL: http://llvm.org/viewvc/llvm-project?rev=373886&view=rev Log: [ELF][MIPS] De-template writeValue. NFC Depends on D68561. Modified: lld/trunk/ELF/Arch/Mips.cpp Modified: lld/trunk/ELF/Arch/Mips.cpp URL: http://llvm.org/viewvc/llvm-project/lld/trunk/ELF/Arch/Mips.cpp?rev=373886&r1=373885&r2=373886&view=diff ============================================================================== --- lld/trunk/ELF/Arch/Mips.cpp (original) +++ lld/trunk/ELF/Arch/Mips.cpp Mon Oct 7 01:52:07 2019 @@ -213,7 +213,6 @@ template static uint32_t return v; } -template static void writeValue(uint8_t *loc, uint64_t v, uint8_t bitsSize, uint8_t shift) { uint32_t instr = read32(loc); @@ -230,7 +229,7 @@ static void writeShuffleValue(uint8_t *l if (E == support::little) std::swap(words[0], words[1]); - writeValue(loc, v, bitsSize, shift); + writeValue(loc, v, bitsSize, shift); if (E == support::little) std::swap(words[0], words[1]); @@ -246,7 +245,6 @@ static void writeMicroRelocation16(uint8 } template void MIPS::writePltHeader(uint8_t *buf) const { - const endianness e = ELFT::TargetEndianness; if (isMicroMips()) { uint64_t gotPlt = in.gotPlt->getVA(); uint64_t plt = in.plt->getVA(); @@ -302,16 +300,15 @@ template void MIPS::w write32(buf + 28, 0x2718fffe); // subu $24, $24, 2 uint64_t gotPlt = in.gotPlt->getVA(); - writeValue(buf, gotPlt + 0x8000, 16, 16); - writeValue(buf + 4, gotPlt, 16, 0); - writeValue(buf + 8, gotPlt, 16, 0); + writeValue(buf, gotPlt + 0x8000, 16, 16); + writeValue(buf + 4, gotPlt, 16, 
0); + writeValue(buf + 8, gotPlt, 16, 0); } template void MIPS::writePlt(uint8_t *buf, uint64_t gotPltEntryAddr, uint64_t pltEntryAddr, int32_t index, unsigned relOff) const { - const endianness e = ELFT::TargetEndianness; if (isMicroMips()) { // Overwrite trap instructions written by Writer::writeTrapInstr. memset(buf, 0, pltEntrySize); @@ -341,9 +338,9 @@ void MIPS::writePlt(uint8_t *buf, write32(buf + 4, loadInst); // l[wd] $25, %lo(.got.plt entry)($15) write32(buf + 8, jrInst); // jr $25 / jr.hb $25 write32(buf + 12, addInst); // [d]addiu $24, $15, %lo(.got.plt entry) - writeValue(buf, gotPltEntryAddr + 0x8000, 16, 16); - writeValue(buf + 4, gotPltEntryAddr, 16, 0); - writeValue(buf + 12, gotPltEntryAddr, 16, 0); + writeValue(buf, gotPltEntryAddr + 0x8000, 16, 16); + writeValue(buf + 4, gotPltEntryAddr, 16, 0); + writeValue(buf + 12, gotPltEntryAddr, 16, 0); } template @@ -494,7 +491,7 @@ static uint64_t fixupCrossModeJump(uint8 case R_MIPS_26: { uint32_t inst = read32(loc) >> 26; if (inst == 0x3 || inst == 0x1d) { // JAL or JALX - writeValue(loc, 0x1d << 26, 32, 0); + writeValue(loc, 0x1d << 26, 32, 0); return val; } break; @@ -558,17 +555,17 @@ void MIPS::relocateOne(uint8_t *lo write64(loc, val); break; case R_MIPS_26: - writeValue(loc, val, 26, 2); + writeValue(loc, val, 26, 2); break; case R_MIPS_GOT16: // The R_MIPS_GOT16 relocation's value in "relocatable" linking mode // is updated addend (not a GOT index). In that case write high 16 bits // to store a correct addend value. 
if (config->relocatable) { - writeValue(loc, val + 0x8000, 16, 16); + writeValue(loc, val + 0x8000, 16, 16); } else { checkInt(loc, val, 16, type); - writeValue(loc, val, 16, 0); + writeValue(loc, val, 16, 0); } break; case R_MICROMIPS_GOT16: @@ -595,7 +592,7 @@ void MIPS::relocateOne(uint8_t *lo case R_MIPS_PCLO16: case R_MIPS_TLS_DTPREL_LO16: case R_MIPS_TLS_TPREL_LO16: - writeValue(loc, val, 16, 0); + writeValue(loc, val, 16, 0); break; case R_MICROMIPS_GPREL16: case R_MICROMIPS_TLS_GD: @@ -621,7 +618,7 @@ void MIPS::relocateOne(uint8_t *lo case R_MIPS_PCHI16: case R_MIPS_TLS_DTPREL_HI16: case R_MIPS_TLS_TPREL_HI16: - writeValue(loc, val + 0x8000, 16, 16); + writeValue(loc, val + 0x8000, 16, 16); break; case R_MICROMIPS_CALL_HI16: case R_MICROMIPS_GOT_HI16: @@ -631,10 +628,10 @@ void MIPS::relocateOne(uint8_t *lo writeShuffleValue(loc, val + 0x8000, 16, 16); break; case R_MIPS_HIGHER: - writeValue(loc, val + 0x80008000, 16, 32); + writeValue(loc, val + 0x80008000, 16, 32); break; case R_MIPS_HIGHEST: - writeValue(loc, val + 0x800080008000, 16, 48); + writeValue(loc, val + 0x800080008000, 16, 48); break; case R_MIPS_JALR: val -= 4; @@ -657,25 +654,25 @@ void MIPS::relocateOne(uint8_t *lo case R_MIPS_PC16: checkAlignment(loc, val, 4, type); checkInt(loc, val, 18, type); - writeValue(loc, val, 16, 2); + writeValue(loc, val, 16, 2); break; case R_MIPS_PC19_S2: checkAlignment(loc, val, 4, type); checkInt(loc, val, 21, type); - writeValue(loc, val, 19, 2); + writeValue(loc, val, 19, 2); break; case R_MIPS_PC21_S2: checkAlignment(loc, val, 4, type); checkInt(loc, val, 23, type); - writeValue(loc, val, 21, 2); + writeValue(loc, val, 21, 2); break; case R_MIPS_PC26_S2: checkAlignment(loc, val, 4, type); checkInt(loc, val, 28, type); - writeValue(loc, val, 26, 2); + writeValue(loc, val, 26, 2); break; case R_MIPS_PC32: - writeValue(loc, val, 32, 0); + writeValue(loc, val, 32, 0); break; case R_MICROMIPS_26_S1: case R_MICROMIPS_PC26_S1: From llvm-commits at lists.llvm.org 
Mon Oct 7 01:53:59 2019 From: llvm-commits at lists.llvm.org (Pavel Labath via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 08:53:59 +0000 (UTC) Subject: [PATCH] D67589: Fix crash on SBCommandReturnObject & assignment In-Reply-To: References: Message-ID: <922d275658901d1798f7fbcb38085564@localhost.localdomain> labath added a comment. (Ideally, all of these tests would be just (gtest) unit tests. There's no need to pull in python to do something our build system already knows perfectly well to do.) Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67589/new/ https://reviews.llvm.org/D67589 From llvm-commits at lists.llvm.org Mon Oct 7 01:53:33 2019 From: llvm-commits at lists.llvm.org (Rui Ueyama via llvm-commits) Date: Mon, 7 Oct 2019 17:53:33 +0900 Subject: [lld] r373886 - [ELF][MIPS] De-template writeValue. NFC In-Reply-To: <20191007085208.21D8C8B18D@lists.llvm.org> References: <20191007085208.21D8C8B18D@lists.llvm.org> Message-ID: Thanks! On Mon, Oct 7, 2019 at 5:49 PM Fangrui Song via llvm-commits < llvm-commits at lists.llvm.org> wrote: > Author: maskray > Date: Mon Oct 7 01:52:07 2019 > New Revision: 373886 > > URL: http://llvm.org/viewvc/llvm-project?rev=373886&view=rev > Log: > [ELF][MIPS] De-template writeValue. NFC > > Depends on D68561. 
> > Modified: > lld/trunk/ELF/Arch/Mips.cpp > > Modified: lld/trunk/ELF/Arch/Mips.cpp > URL: > http://llvm.org/viewvc/llvm-project/lld/trunk/ELF/Arch/Mips.cpp?rev=373886&r1=373885&r2=373886&view=diff > > ============================================================================== > --- lld/trunk/ELF/Arch/Mips.cpp (original) > +++ lld/trunk/ELF/Arch/Mips.cpp Mon Oct 7 01:52:07 2019 > @@ -213,7 +213,6 @@ template static uint32_t > return v; > } > > -template > static void writeValue(uint8_t *loc, uint64_t v, uint8_t bitsSize, > uint8_t shift) { > uint32_t instr = read32(loc); > @@ -230,7 +229,7 @@ static void writeShuffleValue(uint8_t *l > if (E == support::little) > std::swap(words[0], words[1]); > > - writeValue(loc, v, bitsSize, shift); > + writeValue(loc, v, bitsSize, shift); > > if (E == support::little) > std::swap(words[0], words[1]); > @@ -246,7 +245,6 @@ static void writeMicroRelocation16(uint8 > } > > template void MIPS::writePltHeader(uint8_t *buf) const > { > - const endianness e = ELFT::TargetEndianness; > if (isMicroMips()) { > uint64_t gotPlt = in.gotPlt->getVA(); > uint64_t plt = in.plt->getVA(); > @@ -302,16 +300,15 @@ template void MIPS::w > write32(buf + 28, 0x2718fffe); // subu $24, $24, 2 > > uint64_t gotPlt = in.gotPlt->getVA(); > - writeValue(buf, gotPlt + 0x8000, 16, 16); > - writeValue(buf + 4, gotPlt, 16, 0); > - writeValue(buf + 8, gotPlt, 16, 0); > + writeValue(buf, gotPlt + 0x8000, 16, 16); > + writeValue(buf + 4, gotPlt, 16, 0); > + writeValue(buf + 8, gotPlt, 16, 0); > } > > template > void MIPS::writePlt(uint8_t *buf, uint64_t gotPltEntryAddr, > uint64_t pltEntryAddr, int32_t index, > unsigned relOff) const { > - const endianness e = ELFT::TargetEndianness; > if (isMicroMips()) { > // Overwrite trap instructions written by Writer::writeTrapInstr. 
> memset(buf, 0, pltEntrySize); > @@ -341,9 +338,9 @@ void MIPS::writePlt(uint8_t *buf, > write32(buf + 4, loadInst); // l[wd] $25, %lo(.got.plt entry)($15) > write32(buf + 8, jrInst); // jr $25 / jr.hb $25 > write32(buf + 12, addInst); // [d]addiu $24, $15, %lo(.got.plt entry) > - writeValue(buf, gotPltEntryAddr + 0x8000, 16, 16); > - writeValue(buf + 4, gotPltEntryAddr, 16, 0); > - writeValue(buf + 12, gotPltEntryAddr, 16, 0); > + writeValue(buf, gotPltEntryAddr + 0x8000, 16, 16); > + writeValue(buf + 4, gotPltEntryAddr, 16, 0); > + writeValue(buf + 12, gotPltEntryAddr, 16, 0); > } > > template > @@ -494,7 +491,7 @@ static uint64_t fixupCrossModeJump(uint8 > case R_MIPS_26: { > uint32_t inst = read32(loc) >> 26; > if (inst == 0x3 || inst == 0x1d) { // JAL or JALX > - writeValue(loc, 0x1d << 26, 32, 0); > + writeValue(loc, 0x1d << 26, 32, 0); > return val; > } > break; > @@ -558,17 +555,17 @@ void MIPS::relocateOne(uint8_t *lo > write64(loc, val); > break; > case R_MIPS_26: > - writeValue(loc, val, 26, 2); > + writeValue(loc, val, 26, 2); > break; > case R_MIPS_GOT16: > // The R_MIPS_GOT16 relocation's value in "relocatable" linking mode > // is updated addend (not a GOT index). In that case write high 16 > bits > // to store a correct addend value. 
> if (config->relocatable) { > - writeValue(loc, val + 0x8000, 16, 16); > + writeValue(loc, val + 0x8000, 16, 16); > } else { > checkInt(loc, val, 16, type); > - writeValue(loc, val, 16, 0); > + writeValue(loc, val, 16, 0); > } > break; > case R_MICROMIPS_GOT16: > @@ -595,7 +592,7 @@ void MIPS::relocateOne(uint8_t *lo > case R_MIPS_PCLO16: > case R_MIPS_TLS_DTPREL_LO16: > case R_MIPS_TLS_TPREL_LO16: > - writeValue(loc, val, 16, 0); > + writeValue(loc, val, 16, 0); > break; > case R_MICROMIPS_GPREL16: > case R_MICROMIPS_TLS_GD: > @@ -621,7 +618,7 @@ void MIPS::relocateOne(uint8_t *lo > case R_MIPS_PCHI16: > case R_MIPS_TLS_DTPREL_HI16: > case R_MIPS_TLS_TPREL_HI16: > - writeValue(loc, val + 0x8000, 16, 16); > + writeValue(loc, val + 0x8000, 16, 16); > break; > case R_MICROMIPS_CALL_HI16: > case R_MICROMIPS_GOT_HI16: > @@ -631,10 +628,10 @@ void MIPS::relocateOne(uint8_t *lo > writeShuffleValue(loc, val + 0x8000, 16, 16); > break; > case R_MIPS_HIGHER: > - writeValue(loc, val + 0x80008000, 16, 32); > + writeValue(loc, val + 0x80008000, 16, 32); > break; > case R_MIPS_HIGHEST: > - writeValue(loc, val + 0x800080008000, 16, 48); > + writeValue(loc, val + 0x800080008000, 16, 48); > break; > case R_MIPS_JALR: > val -= 4; > @@ -657,25 +654,25 @@ void MIPS::relocateOne(uint8_t *lo > case R_MIPS_PC16: > checkAlignment(loc, val, 4, type); > checkInt(loc, val, 18, type); > - writeValue(loc, val, 16, 2); > + writeValue(loc, val, 16, 2); > break; > case R_MIPS_PC19_S2: > checkAlignment(loc, val, 4, type); > checkInt(loc, val, 21, type); > - writeValue(loc, val, 19, 2); > + writeValue(loc, val, 19, 2); > break; > case R_MIPS_PC21_S2: > checkAlignment(loc, val, 4, type); > checkInt(loc, val, 23, type); > - writeValue(loc, val, 21, 2); > + writeValue(loc, val, 21, 2); > break; > case R_MIPS_PC26_S2: > checkAlignment(loc, val, 4, type); > checkInt(loc, val, 28, type); > - writeValue(loc, val, 26, 2); > + writeValue(loc, val, 26, 2); > break; > case R_MIPS_PC32: > - writeValue(loc, 
val, 32, 0); > + writeValue(loc, val, 32, 0); > break; > case R_MICROMIPS_26_S1: > case R_MICROMIPS_PC26_S1: > > > _______________________________________________ > llvm-commits mailing list > llvm-commits at lists.llvm.org > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits > From llvm-commits at lists.llvm.org Mon Oct 7 01:59:45 2019 From: llvm-commits at lists.llvm.org (Christian Kühnel via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 08:59:45 +0000 (UTC) Subject: [PATCH] D68560: Test for the build server -- DO NOT MERGE! In-Reply-To: References: Message-ID: <7587515296f7eea9d7a4b70ec4096204@localhost.localdomain> kuhnel updated this revision to Diff 223460. kuhnel added a comment. - second change Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68560/new/ https://reviews.llvm.org/D68560 Files: DELETEME.txt Index: DELETEME.txt =================================================================== --- /dev/null +++ DELETEME.txt @@ -0,0 +1,4 @@ +just for testing. delete this file if you see it... + +This is my second change. + -------------- next part -------------- A non-text attachment was scrubbed... Name: D68560.223460.patch Type: text/x-patch Size: 219 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 02:17:43 2019 From: llvm-commits at lists.llvm.org (Thomas Preud'homme via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 09:17:43 +0000 (UTC) Subject: [PATCH] D68472: [test] Depend on C.UTF-8 dependency for mri-utf8.test In-Reply-To: References: Message-ID: <1763cef4f6bc3d820e752ff6ac6c68f4@localhost.localdomain> thopre marked an inline comment as done. thopre added a comment. In D68472#1695743, @hubert.reinterpretcast wrote: > In D68472#1695156, @thopre wrote: > > > I believe that's because it is a builtin locale (much like the C locale).
There wasn't one in /usr/share/locale on the Ubuntu docker image I've tested this but it did work while trying with en_US.UTF-8 did not. > > > I mean that using `"C.UTF-8"` with `setlocale` gets me a null pointer, and using `"en_US.UTF-8"` gets me a string with the following: > > #include <locale.h> > extern int printf(const char *, ...); > void trylocale(const char *locale) { > const char *ret = setlocale(LC_ALL, locale); > printf("setlocale(\"%s\") returned \"%s\".\n", locale, ret ? ret : "(null)"); > } > int main(void) { > trylocale("C.UTF-8"); > trylocale("en_US.UTF-8"); > } > > > On AIX: > > setlocale("C.UTF-8") returned "(null)". > setlocale("en_US.UTF-8") returned "en_US.UTF-8 en_US.UTF-8 en_US.UTF-8 en_US.UTF-8 en_US.UTF-8 en_US.UTF-8". > > > On RHEL 7: > > setlocale("C.UTF-8") returned "(null)". > setlocale("en_US.UTF-8") returned "en_US.UTF-8". > Mmh so much for C.UTF-8 then. Good thing Fangrui Song came up with a better idea. ================ Comment at: llvm/test/tools/llvm-ar/mri-utf8.test:26 +# and linux vs windows. The C.UTF-8 locale is chosen +RUN: env LANG=C.UTF-8 %python -c "assert open(u'\xA3.txt', 'rb').read() == b'contents\n'" ---------------- hubert.reinterpretcast wrote: > MaskRay wrote: > > Just delete the comments and avoid python. > > > > ``` > > RUN: FileCheck --input-file £.txt --match-full-lines > > CHECK: contents > > ``` > As it is, the file contains nothing aside from this last RUN line and its associated comment block that indicates that U+00A3 is the intended interpretation of the bytes `\xC2\xA3`. Note: There is no BOM in the file. > > In addition to making the intent clear, I believe that the current approach has more of an ability to detect cases where the instances of `\xC2\xA3` in the file are misinterpreted.
> > That said, if the file redirection to create the file works, then `FileCheck` can be invoked with use of file redirection: > ``` > RUN: FileCheck <£.txt --match-full-lines %s > ``` > Is UTF-8 encoding really the desired behavior or just non ascii? I know the test is named mri-utf8 but the first comment says "Test non-ascii archive members". Besides as I mentioned in the patch description Windows encodes it in UTF-16 so UTF-8 is already not possible there. I do like the approach of using FileCheck with an input redirection. It is consistent with the echo line above so if one works the other one will as well. I feel ashamed I didn't think of that good old FileCheck. I'll revise the patch accordingly. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68472/new/ https://reviews.llvm.org/D68472 From llvm-commits at lists.llvm.org Mon Oct 7 02:26:26 2019 From: llvm-commits at lists.llvm.org (Juneyoung Lee via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 09:26:26 +0000 (UTC) Subject: [PATCH] D29011: [IR] Add Freeze instruction In-Reply-To: References: Message-ID: aqjune marked 9 inline comments as done. aqjune added a comment. Updated the patch so UnaryOperator is used for Instruction::Freeze op. Currently I see an error from Phabricator while uploading a new diff: Unhandled Exception ("PhabricatorFileUploadException") Unable to write file: failed to write to temporary directory. I'll retry uploading it later. ================ Comment at: include/llvm/CodeGen/GlobalISel/IRTranslator.h:487 + bool translateFreeze(const User &U, MachineIRBuilder &MIRBuilder) { + return false; + } ---------------- lebedev.ri wrote: > Is this a correctness issue? > Or does returning `false` here results in IRTranslator "aborting"? Yep, returning false makes it abort, if my understanding is correct. 
CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D29011/new/

https://reviews.llvm.org/D29011

From llvm-commits at lists.llvm.org Mon Oct 7 02:32:00 2019
From: llvm-commits at lists.llvm.org (Merge Guard [bot] via Phabricator via llvm-commits)
Date: Mon, 07 Oct 2019 09:32:00 +0000 (UTC)
Subject: [PATCH] D68560: Test for the build server -- DO NOT MERGE!
In-Reply-To:
References:
Message-ID:

merge_guards_bot added a comment.

Build results are available at http://results.llvm-merge-guard.org/Phabricator-9

See http://jenkins.llvm-merge-guard.org/job/Phabricator/9/ for more details.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68560/new/

https://reviews.llvm.org/D68560

From llvm-commits at lists.llvm.org Mon Oct 7 02:47:17 2019
From: llvm-commits at lists.llvm.org (Juneyoung Lee via Phabricator via llvm-commits)
Date: Mon, 07 Oct 2019 09:47:17 +0000 (UTC)
Subject: [PATCH] D29011: [IR] Add Freeze instruction
In-Reply-To:
References:
Message-ID:

aqjune updated this revision to Diff 223461.
aqjune marked an inline comment as done.
aqjune edited the summary of this revision.
Herald added a subscriber: jfb.
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D29011/new/ https://reviews.llvm.org/D29011 Files: include/llvm-c/Core.h include/llvm/Bitcode/LLVMBitCodes.h include/llvm/CodeGen/GlobalISel/IRTranslator.h include/llvm/IR/IRBuilder.h include/llvm/IR/Instruction.def include/llvm/IR/PatternMatch.h lib/AsmParser/LLLexer.cpp lib/AsmParser/LLParser.cpp lib/AsmParser/LLParser.h lib/AsmParser/LLToken.h lib/Bitcode/Reader/BitcodeReader.cpp lib/Bitcode/Writer/BitcodeWriter.cpp lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp lib/CodeGen/SelectionDAG/SelectionDAGBuilder.h lib/CodeGen/TargetLoweringBase.cpp lib/IR/ConstantFold.cpp lib/IR/Core.cpp lib/IR/Instruction.cpp lib/IR/Instructions.cpp lib/IR/Verifier.cpp test/Bindings/llvm-c/freeze.ll test/Bitcode/compatibility.ll test/Bitcode/freeze-pointer.ll tools/llvm-c-test/echo.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D29011.223461.patch Type: text/x-patch Size: 22370 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 02:54:53 2019 From: llvm-commits at lists.llvm.org (Bill Wendling via llvm-commits) Date: Mon, 07 Oct 2019 09:54:53 -0000 Subject: [llvm] r373888 - [IA] Recognize hexadecimal escape sequences Message-ID: <20191007095453.EA23E87B2E@lists.llvm.org> Author: void Date: Mon Oct 7 02:54:53 2019 New Revision: 373888 URL: http://llvm.org/viewvc/llvm-project?rev=373888&view=rev Log: [IA] Recognize hexadecimal escape sequences Summary: Implement support for hexadecimal escape sequences to match how GNU 'as' handles them. I.e., read all hexadecimal characters and truncate to the lower 16 bits. 
Reviewers: nickdesaulniers Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68483 Modified: llvm/trunk/lib/MC/MCParser/AsmParser.cpp llvm/trunk/test/MC/AsmParser/directive_ascii.s Modified: llvm/trunk/lib/MC/MCParser/AsmParser.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/MC/MCParser/AsmParser.cpp?rev=373888&r1=373887&r2=373888&view=diff ============================================================================== --- llvm/trunk/lib/MC/MCParser/AsmParser.cpp (original) +++ llvm/trunk/lib/MC/MCParser/AsmParser.cpp Mon Oct 7 02:54:53 2019 @@ -2914,11 +2914,26 @@ bool AsmParser::parseEscapedString(std:: } // Recognize escaped characters. Note that this escape semantics currently - // loosely follows Darwin 'as'. Notably, it doesn't support hex escapes. + // loosely follows Darwin 'as'. ++i; if (i == e) return TokError("unexpected backslash at end of string"); + // Recognize hex sequences similarly to GNU 'as'. + if (Str[i] == 'x' || Str[i] == 'X') { + if (!isHexDigit(Str[i + 1])) + return TokError("invalid hexadecimal escape sequence"); + + // Consume hex characters. GNU 'as' reads all hexadecimal characters and + // then truncates to the lower 16 bits. Seems reasonable. + unsigned Value = 0; + while (isHexDigit(Str[i + 1])) + Value = Value * 16 + hexDigitValue(Str[++i]); + + Data += (unsigned char)(Value & 0xFF); + continue; + } + // Recognize octal sequences. if ((unsigned)(Str[i] - '0') <= 7) { // Consume up to three octal characters. 
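Editorial sketch of the hex-escape handling in the AsmParser.cpp hunk above, written as standalone C++ rather than against LLVM's own helpers. The name decodeHexEscapes and the helper hexValue are hypothetical, and std::isxdigit stands in for LLVM's isHexDigit/hexDigitValue. Although the log and the code comment speak of the lower 16 bits, the patch masks with 0xFF, so only the low 8 bits of the accumulated value end up in the output; the sketch follows the code. Escapes other than \x and \X are out of scope here and are copied through unchanged, which is an assumption of this sketch and not what AsmParser does.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: numeric value of a character already known to be a
// hexadecimal digit.
static unsigned hexValue(char C) {
  if (C >= '0' && C <= '9')
    return static_cast<unsigned>(C - '0');
  return static_cast<unsigned>(
      std::tolower(static_cast<unsigned char>(C)) - 'a' + 10);
}

// Hypothetical standalone analogue of the hex branch added to
// AsmParser::parseEscapedString. Returns false on a malformed escape.
bool decodeHexEscapes(const std::string &Str, std::string &Data) {
  Data.clear();
  for (std::size_t i = 0, e = Str.size(); i != e; ++i) {
    if (Str[i] != '\\') {
      Data += Str[i];
      continue;
    }
    ++i;
    if (i == e)
      return false; // unexpected backslash at end of string
    if (Str[i] != 'x' && Str[i] != 'X') {
      // Not a hex escape: keep it verbatim (simplification of this sketch).
      Data += '\\';
      Data += Str[i];
      continue;
    }
    // At least one hex digit must follow the 'x'.
    if (i + 1 == e || !std::isxdigit(static_cast<unsigned char>(Str[i + 1])))
      return false; // invalid hexadecimal escape sequence
    // Consume every following hex digit, however many there are. Unsigned
    // arithmetic wraps, which is harmless because only the low byte is kept.
    unsigned Value = 0;
    while (i + 1 != e && std::isxdigit(static_cast<unsigned char>(Str[i + 1])))
      Value = Value * 16 + hexValue(Str[++i]);
    Data += static_cast<char>(Value & 0xFF);
  }
  return true;
}
```

Fed the string from the new TEST7 case, \x64 gives 'd' and \Xa6B gives 0xa6B & 0xFF = 0x6B, i.e. 'k', which is consistent with the `.ascii "dk"` that the test checks for.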
Modified: llvm/trunk/test/MC/AsmParser/directive_ascii.s URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/MC/AsmParser/directive_ascii.s?rev=373888&r1=373887&r2=373888&view=diff ============================================================================== --- llvm/trunk/test/MC/AsmParser/directive_ascii.s (original) +++ llvm/trunk/test/MC/AsmParser/directive_ascii.s Mon Oct 7 02:54:53 2019 @@ -39,3 +39,8 @@ TEST5: # CHECK: .byte 0 TEST6: .string "B", "C" + +# CHECK: TEST7: +# CHECK: .ascii "dk" +TEST7: + .ascii "\x64\Xa6B" From llvm-commits at lists.llvm.org Mon Oct 7 02:56:00 2019 From: llvm-commits at lists.llvm.org (Jay Foad via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 09:56:00 +0000 (UTC) Subject: [PATCH] D68563: [AMDGPU] Disable a test that was relying on misched behavior Message-ID: foad created this revision. foad added reviewers: arsenm, rampitec, vpykhtin, mareko. Herald added subscribers: t-tye, tpr, dstuttard, yaxunl, nhaehnle, wdng, jvesely, kzhuravl, qcolombet. Herald added a project: LLVM. This test only passed because misched ordered the instructions in a way that happened to work. With other orderings, register scavenging would fail and llc would fail assertions or crash. I've demonstrated this by disabling misched, which makes the test fail, and then disabling the test itself. I'm told that the SGPR spill to smem path was never fully completed, and should probably be expected to be buggy. 
Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D68563 Files: llvm/test/CodeGen/AMDGPU/attr-amdgpu-num-sgpr-spill-to-smem.ll Index: llvm/test/CodeGen/AMDGPU/attr-amdgpu-num-sgpr-spill-to-smem.ll =================================================================== --- llvm/test/CodeGen/AMDGPU/attr-amdgpu-num-sgpr-spill-to-smem.ll +++ llvm/test/CodeGen/AMDGPU/attr-amdgpu-num-sgpr-spill-to-smem.ll @@ -1,4 +1,5 @@ -; RUN: llc -mtriple=amdgcn--amdhsa -mcpu=fiji -amdgpu-spill-sgpr-to-smem=1 -verify-machineinstrs < %s | FileCheck -check-prefix=TOSMEM -check-prefix=ALL %s +; REQUIRES: disabled +; RUN: llc -mtriple=amdgcn--amdhsa -mcpu=fiji -enable-misched=false -amdgpu-spill-sgpr-to-smem=1 -verify-machineinstrs < %s | FileCheck -check-prefix=TOSMEM -check-prefix=ALL %s ; FIXME: SGPR-to-SMEM requires an additional SGPR always to scavenge m0 -------------- next part -------------- A non-text attachment was scrubbed... Name: D68563.223463.patch Type: text/x-patch Size: 723 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 02:56:10 2019 From: llvm-commits at lists.llvm.org (Bill Wendling via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 09:56:10 +0000 (UTC) Subject: [PATCH] D68483: [IA] Recognize hexadecimal escape sequences In-Reply-To: References: Message-ID: <5d9f2db590b36c93bc5697d79a4c8266@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rL373888: [IA] Recognize hexadecimal escape sequences (authored by void, committed by ). 
Changed prior to commit: https://reviews.llvm.org/D68483?vs=223338&id=223466#toc Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68483/new/ https://reviews.llvm.org/D68483 Files: llvm/trunk/lib/MC/MCParser/AsmParser.cpp llvm/trunk/test/MC/AsmParser/directive_ascii.s Index: llvm/trunk/test/MC/AsmParser/directive_ascii.s =================================================================== --- llvm/trunk/test/MC/AsmParser/directive_ascii.s +++ llvm/trunk/test/MC/AsmParser/directive_ascii.s @@ -39,3 +39,8 @@ # CHECK: .byte 0 TEST6: .string "B", "C" + +# CHECK: TEST7: +# CHECK: .ascii "dk" +TEST7: + .ascii "\x64\Xa6B" Index: llvm/trunk/lib/MC/MCParser/AsmParser.cpp =================================================================== --- llvm/trunk/lib/MC/MCParser/AsmParser.cpp +++ llvm/trunk/lib/MC/MCParser/AsmParser.cpp @@ -2914,11 +2914,26 @@ } // Recognize escaped characters. Note that this escape semantics currently - // loosely follows Darwin 'as'. Notably, it doesn't support hex escapes. + // loosely follows Darwin 'as'. ++i; if (i == e) return TokError("unexpected backslash at end of string"); + // Recognize hex sequences similarly to GNU 'as'. + if (Str[i] == 'x' || Str[i] == 'X') { + if (!isHexDigit(Str[i + 1])) + return TokError("invalid hexadecimal escape sequence"); + + // Consume hex characters. GNU 'as' reads all hexadecimal characters and + // then truncates to the lower 16 bits. Seems reasonable. + unsigned Value = 0; + while (isHexDigit(Str[i + 1])) + Value = Value * 16 + hexDigitValue(Str[++i]); + + Data += (unsigned char)(Value & 0xFF); + continue; + } + // Recognize octal sequences. if ((unsigned)(Str[i] - '0') <= 7) { // Consume up to three octal characters. -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68483.223466.patch Type: text/x-patch Size: 1564 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 02:58:30 2019 From: llvm-commits at lists.llvm.org (Sjoerd Meijer via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 09:58:30 +0000 (UTC) Subject: [PATCH] D68082: [LV] Emitting SCEV checks with OptForSize In-Reply-To: References: Message-ID: <57b4623836be25c57a87b1e84c9058b8@localhost.localdomain> SjoerdMeijer updated this revision to Diff 223465. SjoerdMeijer added a comment. Comments addressed: - cleaned up the test case a bit. - couldn't reuse an existing run-line, I guess because of -mcpu=skx, but a separate run line seems fine to me. > The other suggested fix in LoopAccessInfo::collectStridedAccess() indeed deserves a separate patch Thanks for that suggestions, and I will address this separately. I have unfinished business in the vectorizer, and will add this to me my list of things to do next. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68082/new/ https://reviews.llvm.org/D68082 Files: llvm/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp llvm/test/Transforms/LoopVectorize/X86/optsize.ll Index: llvm/test/Transforms/LoopVectorize/X86/optsize.ll =================================================================== --- llvm/test/Transforms/LoopVectorize/X86/optsize.ll +++ llvm/test/Transforms/LoopVectorize/X86/optsize.ll @@ -4,6 +4,7 @@ ; attributes. This is a target-dependent version of the test. ; RUN: opt < %s -loop-vectorize -force-vector-width=64 -S -mtriple=x86_64-unknown-linux -mcpu=skx | FileCheck %s ; RUN: opt < %s -loop-vectorize -S -mtriple=x86_64-unknown-linux -mcpu=skx | FileCheck %s --check-prefix AUTOVF +; RUN: opt < %s -loop-vectorize -S | FileCheck %s --check-prefix=NO-SCEV-PREDS target datalayout = "E-m:e-p:32:32-i64:32-f64:32:64-a:0:32-n32-S128" @@ -196,3 +197,42 @@ while.cond.loopexit: ret i32 0 } + +; PR43371: don't run into an assert due to emitting SCEV runtime checks +; with OptForSize. 
+; + at cm_array = external global [2592 x i16], align 1 + +define void @pr43371() optsize { +; +; NO-SCEV-PREDS-LABEL: @pr43371 +; +; We do not want to generate SCEV predicates when optimising for size, because +; that will lead to extra code generation such as the SCEV overflow runtime +; checks. Not generating SCEV predicates can still result in vectorisation as +; the non-consecutive loads/stores can be scalarized: +; +; NO-SCEV-PREDS: vector.body: +; NO-SCEV-PREDS: store i16 0, i16* %{{.*}}, align 1 +; NO-SCEV-PREDS: store i16 0, i16* %{{.*}}, align 1 +; NO-SCEV-PREDS: br i1 {{.*}}, label %vector.body +; +entry: + br label %header + +header: + br label %for.body29 + +for.cond.cleanup28: + unreachable + +for.body29: + %i24.0170 = phi i16 [ 0, %header ], [ %inc37, %for.body29] + %add33 = add i16 undef, %i24.0170 + %idxprom34 = zext i16 %add33 to i32 + %arrayidx35 = getelementptr [2592 x i16], [2592 x i16] * @cm_array, i32 0, i32 %idxprom34 + store i16 0, i16 * %arrayidx35, align 1 + %inc37 = add i16 %i24.0170, 1 + %cmp26 = icmp ult i16 %inc37, 756 + br i1 %cmp26, label %for.body29, label %for.cond.cleanup28 +} Index: llvm/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp =================================================================== --- llvm/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp +++ llvm/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp @@ -409,7 +409,8 @@ const ValueToValueMap &Strides = getSymbolicStrides() ? *getSymbolicStrides() : ValueToValueMap(); - int Stride = getPtrStride(PSE, Ptr, TheLoop, Strides, true, false); + bool CanAddPredicate = !TheLoop->getHeader()->getParent()->hasOptSize(); + int Stride = getPtrStride(PSE, Ptr, TheLoop, Strides, CanAddPredicate, false); if (Stride == 1 || Stride == -1) return Stride; return 0; -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68082.223465.patch Type: text/x-patch Size: 2678 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 03:05:20 2019 From: llvm-commits at lists.llvm.org (Momchil Velikov via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 10:05:20 +0000 (UTC) Subject: [PATCH] D68468: [AArch64] Do not untag before returning via a `resume` instruction In-Reply-To: References: Message-ID: <764ba32c8d3770246cf34286e7e87b84@localhost.localdomain> chill planned changes to this revision. chill added a comment. Indeed, that patch is a bit premature. I've kicked up a discussion about supplementing the (EH)ABI, and this patch need to wait for the outcome. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68468/new/ https://reviews.llvm.org/D68468 From llvm-commits at lists.llvm.org Mon Oct 7 03:06:23 2019 From: llvm-commits at lists.llvm.org (Roman Lebedev via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 10:06:23 +0000 (UTC) Subject: [PATCH] D53877: [IR] Strawman for dedicated FNeg IR instruction In-Reply-To: References: Message-ID: <8008762a1344fd4c4d18d72a8eda7835@localhost.localdomain> lebedev.ri added inline comments. ================ Comment at: llvm/trunk/include/llvm/Bitcode/LLVMBitCodes.h:373 +enum UnaryOpcodes { + UNOP_NEG = 0 +}; ---------------- @cameron.mcinally also, shouldn't this be `UNOP_FNEG`? Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D53877/new/ https://reviews.llvm.org/D53877 From llvm-commits at lists.llvm.org Mon Oct 7 03:09:49 2019 From: llvm-commits at lists.llvm.org (Thomas Preud'homme via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 10:09:49 +0000 (UTC) Subject: [PATCH] D68104: [LNT] Python 3 support: adapt secret computation In-Reply-To: References: Message-ID: <437028de0944d08cefabfbb052ced1c8@localhost.localdomain> thopre updated this revision to Diff 223468. thopre added a comment. 
Use approach suggested by Hubert to adapt the existing code rather than use a new way of generating random bits to not change the security strength of the code. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68104/new/ https://reviews.llvm.org/D68104 Files: lnt/lnttool/create.py Index: lnt/lnttool/create.py =================================================================== --- lnt/lnttool/create.py +++ lnt/lnttool/create.py @@ -113,6 +113,7 @@ * INSTANCE_PATH should point to a directory that will keep LNT configuration. """ + from builtins import bytes from .common import init_logger import hashlib import lnt.server.db.migrate @@ -137,8 +138,12 @@ tmp_path = os.path.join(basepath, tmp_dir) wsgi_path = os.path.join(basepath, wsgi) schemas_path = os.path.join(basepath, "schemas") - secret_key = (secret_key or - hashlib.sha1(str(random.getrandbits(256))).hexdigest()) + secret_key = ( + secret_key + or hashlib.sha1( + bytes(str(random.getrandbits(256)), encoding="ascii") + ).hexdigest() + ) os.mkdir(instance_path) os.mkdir(tmp_path) -------------- next part -------------- A non-text attachment was scrubbed... Name: D68104.223468.patch Type: text/x-patch Size: 876 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 03:12:52 2019 From: llvm-commits at lists.llvm.org (Dave Green via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 10:12:52 +0000 (UTC) Subject: [PATCH] D68337: [ARM][MVE] Enable extending masked loads In-Reply-To: References: Message-ID: <73777c91ffcd5a0ce52ce69133642020@localhost.localdomain> dmgreen added subscribers: RKSimon, craig.topper. dmgreen added a comment. Nice. I think this is looking good, just some details to sort out, like what to do about the target independent parts. We will presumably want to add the pre and post inc to these in the future too, which will probably bring up the same kinds of questions. 
================ Comment at: lib/CodeGen/SelectionDAG/DAGCombiner.cpp:9283 + ISD::NodeType ExtOpc) { + if (!TLI.isLoadExtLegal(ExtLoadType, VT, N0.getValueType())) + return SDValue(); ---------------- samparker wrote: > dmgreen wrote: > > Is it true that whenever you have a legal extending load, you will also have the equivalent legal extending masked load? (For MVE we do, but is that true for all archs?) > > > > Do we need to add an extra set of flags for this? Or is isVectorLoadExtDesirable good enough to handle these cases when there is an asymmetry? > Yes, we can't expect that it's true for everything. I don't understand why the APIs generally like to pass lots of arguments instead of just passing, say the load that you'd want to inspect... So hopefully both these calls will cover all cases and I'd like to avoid adding another flag. That or I could just change isLoadExtLegal to take the LoadSDNode, but I've assumed these calls are designed like they are for reason... They refer back to the LoadExtActions, which are set by setLoadExtAction in ISel. We may need more flags on there to specify the difference between the masked loads and the normal loads. ================ Comment at: lib/Target/ARM/ARMISelLowering.cpp:8887 // zero too, and other values are lowered to a select. SDValue ZeroVec = DAG.getNode(ARMISD::VMOVIMM, dl, VT, DAG.getTargetConstant(0, dl, MVT::i32)); ---------------- This is creating a zero vector of size VT, which is the size of what the masked loads returns. Should it instead be the size of the memory being loaded (because the extend happens to the passthru as well)? What happens if that isn't a legal value type? ================ Comment at: lib/Target/ARM/ARMInstrMVE.td:5196 def : MVE_vector_maskedload_typed; + // Extending masked loads. + def : Pat<(v8i16 (sextmaskedload8 t2addrmode_imm7<0>:$addr, VCCR:$pred, ---------------- There likely needs to be an anyext too. 
Can (or is it beneficial for) these be merged into the MVEExtLoad multiclass below?

================
Comment at: lib/Target/ARM/ARMInstrMVE.td:5203
+ (v4i32 (MVE_VLDRBS32 t2addrmode_imm7<0>:$addr, (i32 1), VCCR:$pred))>;
+ def : Pat<(v4i32 (sextmaskedload16 t2addrmode_imm7<0>:$addr, VCCR:$pred,
+ (v4i32 NEONimmAllZerosV))),
----------------
dmgreen wrote:
> t2addrmode_imm7<0> -> t2addrmode_imm7<1>, for a VLDRH. Same below.
Edit: You beat me to it. Can you add some tests?

================
Comment at: lib/Target/ARM/ARMTargetTransformInfo.cpp:511
+ // Only support extending integers if the memory is aligned.
+ if ((EltWidth == 16 && Alignment < 2) ||
+ (EltWidth == 32 && Alignment < 4))
----------------
samparker wrote:
> dmgreen wrote:
> > If this is coming from codegen, can the alignment here be 0? I think in ISel it is always set (and clang will always set it), but it may not be guaranteed in llvm in general.
> I can't see anything in the spec for any guarantees of these intrinsics, but for normal loads, it becomes defined by the target ABI. It's always safe for us to use an i8* accessor, so I don't see 0 being a problem here.
Yeah. Alignment of 0 means ABI alignment, which means 8, not unaligned.

I think it may be better to just check this alignment is always the case, getting rid of that weird "use i8's to load unaligned masked loads" thing. That was probably a bad idea, more trouble than it's worth.

I think what will happen here at the moment is that the Vectorizer will call isLegalMaskedLoad with a scalar type and an alignment (which, let's say, is unaligned). That alignment won't be checked, so the masked loads and stores will be created. Then when we get to the backend, the legalizer will call this with a vector type and we'll hit this check, expanding out the masked load into that very inefficient bunch of code. Which is probably something that we want to avoid.
================ Comment at: test/CodeGen/Thumb2/mve-masked-load.ll:903 ; CHECK-LE: @ %bb.0: @ %entry -; CHECK-LE-NEXT: vmov.i32 q1, #0x0 ; CHECK-LE-NEXT: vpt.s8 gt, q0, zr ; CHECK-LE-NEXT: vldrbt.u8 q0, [r0] ---------------- Nice :) CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68337/new/ https://reviews.llvm.org/D68337 From llvm-commits at lists.llvm.org Mon Oct 7 03:29:39 2019 From: llvm-commits at lists.llvm.org (George Rimar via llvm-commits) Date: Mon, 07 Oct 2019 10:29:39 -0000 Subject: [llvm] r373890 - [llvm-readelf/llvm-objdump] - Improve/refactor the implementation of SHT_LLVM_ADDRSIG section dumping. Message-ID: <20191007102939.1D460833FE@lists.llvm.org> Author: grimar Date: Mon Oct 7 03:29:38 2019 New Revision: 373890 URL: http://llvm.org/viewvc/llvm-project?rev=373890&view=rev Log: [llvm-readelf/llvm-objdump] - Improve/refactor the implementation of SHT_LLVM_ADDRSIG section dumping. This patch: * Adds a llvm-readobj/llvm-readelf test file for SHT_LLVM_ADDRSIG sections. (we do not have any) * Enables dumping of SHT_LLVM_ADDRSIG with --all. * Changes the logic to report a warning instead of an error when something goes wrong during dumping (allows to continue dumping SHT_LLVM_ADDRSIG and other sections on error). * Refactors a piece of logic to a new toULEB128Array helper which might be used for GNU-style dumping implementation. 
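Editorial sketch of what the last bullet describes: decoding a whole section payload into a list of ULEB128 values and stopping with a message instead of aborting when the data is malformed. This is standalone C++ and does not use LLVM's decodeULEB128 or Expected; the name decodeULEB128Array, the bool-plus-out-parameter interface and the "too big" message are assumptions made for illustration. It is also stricter than a real decoder might be, since it rejects any encoding that continues past 64 bits, even if the extra bytes only carry zero padding.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical standalone analogue of the toULEB128Array helper. On success
// the decoded values are appended to Out; on failure Err holds a message and
// the function returns false.
bool decodeULEB128Array(const std::vector<uint8_t> &Data,
                        std::vector<uint64_t> &Out, std::string &Err) {
  std::size_t Cur = 0;
  while (Cur != Data.size()) {
    uint64_t Value = 0;
    unsigned Shift = 0;
    while (true) {
      if (Cur == Data.size()) {
        // The previous byte had its continuation bit set, but no byte follows.
        Err = "malformed uleb128, extends past end";
        return false;
      }
      uint8_t Byte = Data[Cur++];
      uint64_t Slice = Byte & 0x7f; // low 7 bits carry the payload
      // Reject anything that would not fit in 64 bits. The Shift test comes
      // first so that the shifts below are never performed with Shift >= 64.
      if (Shift >= 64 || (Slice << Shift) >> Shift != Slice) {
        Err = "uleb128 too big for uint64";
        return false;
      }
      Value |= Slice << Shift;
      Shift += 7;
      if ((Byte & 0x80) == 0) // high bit clear: last byte of this value
        break;
    }
    Out.push_back(Value);
  }
  return true;
}
```

With this shape, a payload of the two bytes 01 02 decodes to the indices 1 and 2, while the single byte FF (the "Content" used by the malformed test case in this commit) is reported as extending past the end, so a caller can print a warning and carry on with other sections.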
Differential revision: https://reviews.llvm.org/D68383 Added: llvm/trunk/test/tools/llvm-readobj/elf-addrsig.test Modified: llvm/trunk/test/tools/llvm-readobj/all.test llvm/trunk/tools/llvm-readobj/ELFDumper.cpp llvm/trunk/tools/llvm-readobj/llvm-readobj.cpp Modified: llvm/trunk/test/tools/llvm-readobj/all.test URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/llvm-readobj/all.test?rev=373890&r1=373889&r2=373890&view=diff ============================================================================== --- llvm/trunk/test/tools/llvm-readobj/all.test (original) +++ llvm/trunk/test/tools/llvm-readobj/all.test Mon Oct 7 03:29:38 2019 @@ -14,6 +14,7 @@ # ALL: Version symbols { # ALL: SHT_GNU_verdef { # ALL: SHT_GNU_verneed { +# ALL: Addrsig [ # ALL: Notes [ # ALL: StackSizes [ Added: llvm/trunk/test/tools/llvm-readobj/elf-addrsig.test URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/llvm-readobj/elf-addrsig.test?rev=373890&view=auto ============================================================================== --- llvm/trunk/test/tools/llvm-readobj/elf-addrsig.test (added) +++ llvm/trunk/test/tools/llvm-readobj/elf-addrsig.test Mon Oct 7 03:29:38 2019 @@ -0,0 +1,84 @@ +## Show that llvm-readobj can dump SHT_LLVM_ADDRSIG sections. 
+ +# RUN: yaml2obj --docnum=1 %s -o %t1.o +# RUN: llvm-readobj --addrsig %t1.o | FileCheck -DFILE=%t1.o %s --check-prefix LLVM +# RUN: not llvm-readelf --addrsig %t1.o 2>&1 | FileCheck -DFILE=%t1.o %s --check-prefix GNU + +# LLVM: Addrsig [ +# LLVM-NEXT: Sym: foo (1) +# LLVM-NEXT: Sym: bar (2) +# LLVM-NEXT: ] + +# GNU: error: '[[FILE]]': --addrsig: not implemented + +--- !ELF +FileHeader: + Class: ELFCLASS64 + Data: ELFDATA2LSB + Type: ET_DYN + Machine: EM_X86_64 +Sections: + - Name: .llvm_addrsig + Type: SHT_LLVM_ADDRSIG + Symbols: + - Name: foo + - Name: bar +Symbols: + - Name: foo + - Name: bar + +## Check that llvm-readobj dumps any SHT_LLVM_ADDRSIG section when --all +## is specified for LLVM style, but not for GNU style. +## TODO: Refine the llvm-readelf check when GNU-style dumping is implemented. + +# RUN: llvm-readobj --all %t1.o | FileCheck %s --check-prefix LLVM +# RUN: llvm-readelf --all %t1.o 2>&1 | FileCheck %s --implicit-check-not=warning --implicit-check-not=error + +## Check we report a warning when SHT_LLVM_ADDRSIG is broken (e.g. contains a malformed uleb128). + +# RUN: yaml2obj --docnum=2 %s -o %t2.o +# RUN: llvm-readobj --addrsig %t2.o 2>&1 | FileCheck %s -DFILE=%t2.o --check-prefix=MALFORMED + +# MALFORMED: warning: '[[FILE]]': malformed uleb128, extends past end + +--- !ELF +FileHeader: + Class: ELFCLASS64 + Data: ELFDATA2LSB + Type: ET_DYN + Machine: EM_X86_64 +Sections: + - Name: .llvm_addrsig + Type: SHT_LLVM_ADDRSIG + Content: "FF" + +## Check we report a warning when SHT_LLVM_ADDRSIG references a symbol that can't be +## dumped (e.g. the index value is larger than the number of symbols in .symtab). 
+ +# RUN: yaml2obj --docnum=3 %s -o %t3.o +# RUN: llvm-readobj --addrsig %t3.o 2>&1 | FileCheck %s -DFILE=%t3.o --check-prefix=INVALID-INDEX + +# INVALID-INDEX: Addrsig [ +# INVALID-INDEX-NEXT: Sym: foo (1) +# INVALID-INDEX-EMPTY: +# INVALID-INDEX-NEXT: warning: '[[FILE]]': unable to get symbol from section [index 2]: invalid symbol index (255) +# INVALID-INDEX-NEXT: Sym: (255) +# INVALID-INDEX-NEXT: Sym: bar (2) +# INVALID-INDEX-NEXT: ] + +--- !ELF +FileHeader: + Class: ELFCLASS64 + Data: ELFDATA2LSB + Type: ET_DYN + Machine: EM_X86_64 +Sections: + - Name: .llvm_addrsig + Type: SHT_LLVM_ADDRSIG + Symbols: + - Index: 1 + - Index: 255 + - Index: 2 +Symbols: + - Name: foo + - Name: bar Modified: llvm/trunk/tools/llvm-readobj/ELFDumper.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-readobj/ELFDumper.cpp?rev=373890&r1=373889&r2=373890&view=diff ============================================================================== --- llvm/trunk/tools/llvm-readobj/ELFDumper.cpp (original) +++ llvm/trunk/tools/llvm-readobj/ELFDumper.cpp Mon Oct 7 03:29:38 2019 @@ -302,7 +302,7 @@ public: void getSectionNameIndex(const Elf_Sym *Symbol, const Elf_Sym *FirstSym, StringRef &SectionName, unsigned &SectionIndex) const; - std::string getStaticSymbolName(uint32_t Index) const; + Expected getStaticSymbolName(uint32_t Index) const; std::string getDynamicString(uint64_t Value) const; StringRef getSymbolVersionByIndex(StringRef StrTab, uint32_t VersionSymbolIndex, @@ -754,17 +754,22 @@ static std::string maybeDemangle(StringR } template -std::string ELFDumper::getStaticSymbolName(uint32_t Index) const { +Expected +ELFDumper::getStaticSymbolName(uint32_t Index) const { const ELFFile *Obj = ObjF->getELFFile(); - StringRef StrTable = unwrapOrError( - ObjF->getFileName(), Obj->getStringTableForSymtab(*DotSymtabSec)); - Elf_Sym_Range Syms = - unwrapOrError(ObjF->getFileName(), Obj->symbols(DotSymtabSec)); - if (Index >= Syms.size()) - reportError(createError("Invalid symbol 
index"), ObjF->getFileName()); - const Elf_Sym *Sym = &Syms[Index]; - return maybeDemangle( - unwrapOrError(ObjF->getFileName(), Sym->getName(StrTable))); + Expected SymOrErr = + Obj->getSymbol(DotSymtabSec, Index); + if (!SymOrErr) + return SymOrErr.takeError(); + + Expected StrTabOrErr = Obj->getStringTableForSymtab(*DotSymtabSec); + if (!StrTabOrErr) + return StrTabOrErr.takeError(); + + Expected NameOrErr = (*SymOrErr)->getName(*StrTabOrErr); + if (!NameOrErr) + return NameOrErr.takeError(); + return maybeDemangle(*NameOrErr); } template @@ -4047,7 +4052,7 @@ void GNUStyle::printCGProfile(cons template void GNUStyle::printAddrsig(const ELFFile *Obj) { - OS << "GNUStyle::printAddrsig not implemented\n"; + reportError(createError("--addrsig: not implemented"), this->FileName); } static StringRef getGenericNoteTypeName(const uint32_t NT) { @@ -5723,14 +5728,35 @@ void LLVMStyle::printCGProfile(con this->dumper()->getDotCGProfileSec())); for (const Elf_CGProfile &CGPE : CGProfile) { DictScope D(W, "CGProfileEntry"); - W.printNumber("From", this->dumper()->getStaticSymbolName(CGPE.cgp_from), - CGPE.cgp_from); - W.printNumber("To", this->dumper()->getStaticSymbolName(CGPE.cgp_to), - CGPE.cgp_to); + W.printNumber( + "From", + unwrapOrError(this->FileName, + this->dumper()->getStaticSymbolName(CGPE.cgp_from)), + CGPE.cgp_from); + W.printNumber( + "To", + unwrapOrError(this->FileName, + this->dumper()->getStaticSymbolName(CGPE.cgp_to)), + CGPE.cgp_to); W.printNumber("Weight", CGPE.cgp_weight); } } +static Expected> toULEB128Array(ArrayRef Data) { + std::vector Ret; + const uint8_t *Cur = Data.begin(); + const uint8_t *End = Data.end(); + while (Cur != End) { + unsigned Size; + const char *Err; + Ret.push_back(decodeULEB128(Cur, &Size, End, &Err)); + if (Err) + return createError(Err); + Cur += Size; + } + return Ret; +} + template void LLVMStyle::printAddrsig(const ELFFile *Obj) { ListScope L(W, "Addrsig"); @@ -5739,18 +5765,20 @@ void LLVMStyle::printAddrsig(const 
ArrayRef Contents = unwrapOrError( this->FileName, Obj->getSectionContents(this->dumper()->getDotAddrsigSec())); - const uint8_t *Cur = Contents.begin(); - const uint8_t *End = Contents.end(); - while (Cur != End) { - unsigned Size; - const char *Err; - uint64_t SymIndex = decodeULEB128(Cur, &Size, End, &Err); - if (Err) - reportError(createError(Err), this->FileName); + Expected> V = toULEB128Array(Contents); + if (!V) { + reportWarning(V.takeError(), this->FileName); + return; + } - W.printNumber("Sym", this->dumper()->getStaticSymbolName(SymIndex), - SymIndex); - Cur += Size; + for (uint64_t Sym : *V) { + Expected NameOrErr = this->dumper()->getStaticSymbolName(Sym); + if (NameOrErr) { + W.printNumber("Sym", *NameOrErr, Sym); + continue; + } + reportWarning(NameOrErr.takeError(), this->FileName); + W.printNumber("Sym", "", Sym); } } Modified: llvm/trunk/tools/llvm-readobj/llvm-readobj.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-readobj/llvm-readobj.cpp?rev=373890&r1=373889&r2=373890&view=diff ============================================================================== --- llvm/trunk/tools/llvm-readobj/llvm-readobj.cpp (original) +++ llvm/trunk/tools/llvm-readobj/llvm-readobj.cpp Mon Oct 7 03:29:38 2019 @@ -691,8 +691,10 @@ int main(int argc, const char *argv[]) { opts::UnwindInfo = true; opts::SectionGroups = true; opts::HashHistogram = true; - if (opts::Output == opts::LLVM) + if (opts::Output == opts::LLVM) { + opts::Addrsig = true; opts::PrintStackSizes = true; + } } if (opts::Headers) { From llvm-commits at lists.llvm.org Mon Oct 7 03:28:37 2019 From: llvm-commits at lists.llvm.org (Merge Guard [bot] via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 10:28:37 +0000 (UTC) Subject: [PATCH] D68560: Test for the build server -- DO NOT MERGE! In-Reply-To: References: Message-ID: merge_guards_bot added a comment. 
Build results are available at http://results.llvm-merge-guard.org/Phabricator-11

See http://jenkins.llvm-merge-guard.org/job/Phabricator/11/ for more details.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68560/new/

https://reviews.llvm.org/D68560

From llvm-commits at lists.llvm.org Mon Oct 7 03:30:14 2019
From: llvm-commits at lists.llvm.org (Joel Jones via Phabricator via llvm-commits)
Date: Mon, 07 Oct 2019 10:30:14 +0000 (UTC)
Subject: [PATCH] D53927: [AArch64] Enable libm vectorized functions via SLEEF
In-Reply-To:
References:
Message-ID:

This revision was automatically updated to reflect the committed changes.
Closed by commit rG5f533c5fe1e2: [AArch64] Enable libm vectorized functions via SLEEF (authored by joelkevinjones).
Herald added a subscriber: hiraditya.

Changed prior to commit:
  https://reviews.llvm.org/D53927?vs=187507&id=223473#toc

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D53927/new/

https://reviews.llvm.org/D53927

Files:
  llvm/include/llvm/Analysis/TargetLibraryInfo.h
  llvm/include/llvm/IR/Intrinsics.td
  llvm/lib/Analysis/TargetLibraryInfo.cpp
  llvm/test/Transforms/LoopVectorize/AArch64/sleef-calls-aarch64.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D53927.223473.patch
Type: text/x-patch
Size: 44495 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Mon Oct 7 03:31:28 2019
From: llvm-commits at lists.llvm.org (Dave Green via Phabricator via llvm-commits)
Date: Mon, 07 Oct 2019 10:31:28 +0000 (UTC)
Subject: [PATCH] D68461: [ARM][MVE] Enable truncating masked stores
In-Reply-To:
References:
Message-ID: <12a1b448245fd4966f7ccab012e6fb58@localhost.localdomain>

dmgreen added a comment.

Nice one.
================ Comment at: lib/Target/ARM/ARMInstrMVE.td:5218 def : Pat<(pre_truncstvi16_align2 (v4i32 MQPR:$Rt), tGPR:$Rn, t2am_imm7_offset<1>:$addr), (MVE_VSTRH32_pre MQPR:$Rt, tGPR:$Rn, t2am_imm7_offset<1>:$addr)>; } ---------------- Maybe put them here, with the other trunc stores? ================ Comment at: lib/Target/ARM/ARMTargetTransformInfo.cpp:498 - if (DataTy->isVectorTy()) { - // We don't yet support narrowing or widening masked loads/stores. Expand - // them for the moment. - unsigned VecWidth = DataTy->getPrimitiveSizeInBits(); - if (VecWidth != 128) + unsigned EltWidth = DataTy->getScalarSizeInBits(); + if (auto *VecTy = dyn_cast(DataTy)) { ---------------- This is the same as in the load patch? CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68461/new/ https://reviews.llvm.org/D68461 From llvm-commits at lists.llvm.org Mon Oct 7 03:32:15 2019 From: llvm-commits at lists.llvm.org (Craig Topper via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 10:32:15 +0000 (UTC) Subject: [PATCH] D53876: Preserve loop metadata when splitting exit blocks In-Reply-To: References: Message-ID: <03691d1debca92839b162061d4062663@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rG3c87c2a3c50b: Preserve loop metadata when splitting exit blocks (authored by craig.topper). Herald added a subscriber: hiraditya. Herald added a project: LLVM. 
Changed prior to commit: https://reviews.llvm.org/D53876?vs=182694&id=223475#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D53876/new/ https://reviews.llvm.org/D53876 Files: llvm/lib/Transforms/Utils/LoopUtils.cpp llvm/test/Transforms/LoopSimplify/preserve-llvm-loop-metadata2.ll Index: llvm/test/Transforms/LoopSimplify/preserve-llvm-loop-metadata2.ll =================================================================== --- /dev/null +++ llvm/test/Transforms/LoopSimplify/preserve-llvm-loop-metadata2.ll @@ -0,0 +1,48 @@ +; RUN: opt -S -loop-simplify < %s | FileCheck %s + +; Two-loop nest with llvm.loop metadata on each loop. +; inner.header exits to outer.header. inner.header is a latch for the outer +; loop, and contains the outer loop's metadata. +; After loop-simplify, a new block "outer.header.loopexit" is created between +; inner.header and outer.header. The metadata from inner.header must be moved +; to the new block, as the new block becomes the outer loop latch. +; The metadata on the inner loop's latch should be untouched. 
+ +; CHECK: outer.header.loopexit: +; CHECK-NEXT: llvm.loop [[UJAMTAG:.*]] +; CHECK-NOT: br i1 {{.*}}, label {{.*}}, label %outer.header.loopexit, !llvm.loop +; CHECK: br label %inner.header, !llvm.loop [[UNROLLTAG:.*]] + +; CHECK: distinct !{[[UJAMTAG]], [[UJAM:.*]]} +; CHECK: [[UJAM]] = !{!"llvm.loop.unroll_and_jam.count", i32 17} +; CHECK: distinct !{[[UNROLLTAG]], [[UNROLL:.*]]} +; CHECK: [[UNROLL]] = !{!"llvm.loop.unroll.count", i32 1} + + +define dso_local void @loopnest() local_unnamed_addr #0 { +entry: + br label %outer.header + +outer.header: ; preds = %inner.header, %entry + %ii.0 = phi i64 [ 2, %entry ], [ %add, %inner.header ] + %cmp = icmp ult i64 %ii.0, 64 + br i1 %cmp, label %inner.header, label %outer.header.cleanup + +outer.header.cleanup: ; preds = %outer.header + ret void + +inner.header: ; preds = %outer.header, %inner.body + %j.0 = phi i64 [ %add10, %inner.body ], [ %ii.0, %outer.header ] + %add = add nuw nsw i64 %ii.0, 16 + %cmp2 = icmp ult i64 %j.0, %add + br i1 %cmp2, label %inner.body, label %outer.header, !llvm.loop !2 + +inner.body: ; preds = %inner.header + %add10 = add nuw nsw i64 %j.0, 1 + br label %inner.header, !llvm.loop !4 +} + +!2 = distinct !{!2, !3} +!3 = !{!"llvm.loop.unroll_and_jam.count", i32 17} +!4 = distinct !{!4, !5} +!5 = !{!"llvm.loop.unroll.count", i32 1} Index: llvm/lib/Transforms/Utils/LoopUtils.cpp =================================================================== --- llvm/lib/Transforms/Utils/LoopUtils.cpp +++ llvm/lib/Transforms/Utils/LoopUtils.cpp @@ -74,9 +74,41 @@ if (IsDedicatedExit) return false; + // With nested loops, the inner loop might exit to the header of an + // enclosing loop, and the in-loop-predecessor is a latch for that + // enclosing loop. If we insert a block between the latch and the header, + // that block becomes the new latch. Any loop metadata from the old latch + // needs to be moved to the new one. 
+ MDNode *OuterLoopMD = nullptr; + + // If the exit block is a header of a different loop, get that loop's + // metadata before we split the block. + if (LI->isLoopHeader(BB)) + OuterLoopMD = LI->getLoopFor(BB)->getLoopID(); + auto *NewExitBB = SplitBlockPredecessors( BB, InLoopPredecessors, ".loopexit", DT, LI, nullptr, PreserveLCSSA); + // If OuterLoopMD is non-null, we know that the exit block BB is a + // loop header for a different loop, with metadata on its back edges. + // If NewExitBB is a member of that loop, then NewExitBB is a latch, + // and the loop's metadata needs to be copied to NewExitBB. + if (NewExitBB && OuterLoopMD && + LI->getLoopFor(NewExitBB) == LI->getLoopFor(BB)) { + // The preds of NewExitBB are all former latches of the outer loop. + // Remove their metadata. + for (auto *PredLoopBB : InLoopPredecessors) { + Instruction *TI = PredLoopBB->getTerminator(); + // All the latches should have the same metadata (ensured by + // getLoopID()). + assert(TI->getMetadata(LLVMContext::MD_loop) == OuterLoopMD && + "exit edge to other loop doesn't contain expected metadata"); + TI->setMetadata(LLVMContext::MD_loop, nullptr); + } + NewExitBB->getTerminator()->setMetadata(LLVMContext::MD_loop, + OuterLoopMD); + } + if (!NewExitBB) LLVM_DEBUG( dbgs() << "WARNING: Can't create a dedicated exit block for loop: " -------------- next part -------------- A non-text attachment was scrubbed... Name: D53876.223475.patch Type: text/x-patch Size: 4406 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 03:33:29 2019 From: llvm-commits at lists.llvm.org (Pirama Arumuga Nainar via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 10:33:29 +0000 (UTC) Subject: [PATCH] D54125: [LTO] Drop non-prevailing definitions only if linkage is not local or appending In-Reply-To: References: Message-ID: This revision was automatically updated to reflect the committed changes. 
Closed by commit rGe61652a38427: [LTO] Drop non-prevailing definitions only if linkage is not local or appending (authored by pirama). Herald added a subscriber: hiraditya. Herald added a project: LLVM. Changed prior to commit: https://reviews.llvm.org/D54125?vs=173208&id=223476#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D54125/new/ https://reviews.llvm.org/D54125 Files: llvm/include/llvm/LTO/LTO.h llvm/include/llvm/Transforms/IPO/FunctionImport.h llvm/lib/LTO/LTO.cpp llvm/lib/LTO/LTOBackend.cpp llvm/lib/LTO/ThinLTOCodeGenerator.cpp llvm/lib/Transforms/IPO/FunctionImport.cpp llvm/test/LTO/Resolution/X86/dead-strip-fulllto.ll llvm/test/ThinLTO/X86/Inputs/strong_non_prevailing.ll llvm/test/ThinLTO/X86/funcimport.ll llvm/test/ThinLTO/X86/strong_non_prevailing.ll llvm/test/Transforms/FunctionImport/funcimport_var.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D54125.223476.patch Type: text/x-patch Size: 14743 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 03:34:53 2019 From: llvm-commits at lists.llvm.org (Oliver Stannard (Linaro) via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 10:34:53 +0000 (UTC) Subject: [PATCH] D57529: Add .dword direcrive support for aarch64 mc In-Reply-To: References: Message-ID: <2ae9b3e51c0e470ec3bbf914c54c7c9e@localhost.localdomain> ostannard added a comment. A patch adding the same functionality was added back in May: D61719 , rL360381 . Reviewers tend to assume that patch authors have commit access, if you don't then just say so when the patch is accepted and the reviewer will commit it for you. 
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D57529/new/ https://reviews.llvm.org/D57529 From llvm-commits at lists.llvm.org Mon Oct 7 03:36:33 2019 From: llvm-commits at lists.llvm.org (=?utf-8?q?Martin_Storsj=C3=B6_via_Phabricator?= via llvm-commits) Date: Mon, 07 Oct 2019 10:36:33 +0000 (UTC) Subject: [PATCH] D53066: [Driver] Use forward slashes in most linker arguments In-Reply-To: References: Message-ID: <313985a312e4c7446115a6ea9c848925@localhost.localdomain> This revision was not accepted when it landed; it landed in state "Needs Review". This revision was automatically updated to reflect the committed changes. Closed by commit rGcbd73574e43e: Reapply: [Driver] Use forward slashes in most linker arguments (authored by mstorsjo). Herald added a project: clang. Changed prior to commit: https://reviews.llvm.org/D53066?vs=171253&id=223478#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D53066/new/ https://reviews.llvm.org/D53066 Files: clang/include/clang/Driver/ToolChain.h clang/lib/Driver/Driver.cpp clang/lib/Driver/ToolChain.cpp clang/lib/Driver/ToolChains/Clang.cpp clang/lib/Driver/ToolChains/CommonArgs.cpp clang/lib/Driver/ToolChains/Gnu.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D53066.223478.patch Type: text/x-patch Size: 6962 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 03:41:30 2019 From: llvm-commits at lists.llvm.org (Jeroen Dobbelaere via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 10:41:30 +0000 (UTC) Subject: [PATCH] D68491: [PATCH 08/38] [noalias] [IR] IRBuilder support for noalias intrinsics. In-Reply-To: References: Message-ID: <2a4167101e47d958d347cdc18c0c7779@localhost.localdomain> jeroen.dobbelaere marked 5 inline comments as done. jeroen.dobbelaere added inline comments. 
================
Comment at: llvm/include/llvm/IR/IRBuilder.h:651
+ AllocaPtr,
+ ConstantInt::get(IntegerType::getInt32Ty(getContext()), ObjId), Scope);
+ }
----------------
lebedev.ri wrote:
> But `ObjId` is `uint64_t`?
I am preparing an update where I use uint64_t more consistently.

================
Comment at: llvm/include/llvm/IR/IntrinsicInst.h:891-896
+ return (lhs->getOperand(Intrinsic::SideNoAliasScopeArg) ==
+ rhs->getOperand(Intrinsic::SideNoAliasScopeArg)) &&
+ (lhs->getOperand(Intrinsic::SideNoAliasIdentifyPObjIdArg) ==
+ rhs->getOperand(Intrinsic::SideNoAliasIdentifyPObjIdArg)) &&
+ (lhs->getOperand(Intrinsic::SideNoAliasIdentifyPArg) ==
+ rhs->getOperand(Intrinsic::SideNoAliasIdentifyPArg));
----------------
lebedev.ri wrote:
> `std::tie(<...>) == std::tie(<...>)` ?
You mean: std::forward_as_tuple ?

================
Comment at: llvm/lib/IR/IRBuilder.cpp:514
+IRBuilderBase::CreateNoAliasCopyGuard(Value *BasePtr, Value *NoAliasDecl,
+ ArrayRef EncodedIndices,
+ MDNode *ScopeTag, const Twine &Name) {
----------------
lebedev.ri wrote:
> Should this be `uint64_t`?
No. It should be int64_t.

================
Comment at: llvm/lib/IR/IRBuilder.cpp:565-566
+ // For the metadata info, types must not be added:
+ for (auto *MD : MDNodes) {
+ Ops.push_back(MetadataAsValue::get(Context, MD));
+ }
----------------
lebedev.ri wrote:
> Ops.insert(Ops.end(), MDNodes.begin(), MDNodes.end()) ?
That won't work.

================
Comment at: llvm/lib/IR/IRBuilder.cpp:569
+ for (auto *MDV : MDValues) {
+ Ops.push_back(MDV);
+ }
----------------
lebedev.ri wrote:
> same
But here it will.
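
On the `std::tie` versus `std::forward_as_tuple` exchange for the IntrinsicInst.h comparison above, a minimal sketch of the difference may help. `std::tie` takes non-const lvalue references, so it cannot bind to values returned by a call such as `getOperand(...)`, whereas `std::forward_as_tuple` accepts temporaries. `FakeCall`, its `getOperand`, and `sameOperands` are hypothetical stand-ins for illustration only, not the LLVM classes from the patch.

```cpp
#include <tuple>

// Hypothetical stand-in for an instruction; getOperand() returns by value,
// so each call yields a temporary (a prvalue).
struct FakeCall {
  int Ops[3];
  int getOperand(unsigned Idx) const { return Ops[Idx]; }
};

// std::tie(LHS.getOperand(0), ...) would not compile here, because std::tie
// needs lvalues. std::forward_as_tuple builds a tuple of references to the
// temporaries, which stay alive until the end of the full expression.
inline bool sameOperands(const FakeCall &LHS, const FakeCall &RHS) {
  return std::forward_as_tuple(LHS.getOperand(0), LHS.getOperand(1),
                               LHS.getOperand(2)) ==
         std::forward_as_tuple(RHS.getOperand(0), RHS.getOperand(1),
                               RHS.getOperand(2));
}
```

The tuples are compared element by element, left to right, which matches the chained `==`/`&&` form quoted in the review. Whether the tuple form reads better than the explicit chain is a style call for the patch author.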
================ Comment at: llvm/lib/IR/IRBuilder.cpp:573-577 + auto *FnIntrinsic = Intrinsic::getDeclaration(M, ID, Types); + Instruction *Ret = createCallHelper(FnIntrinsic, Ops, this, Name); if (Ret->getType() != Ptr->getType()) { + BitCastInst *BCI = new BitCastInst(Ret, Ptr->getType(), Name + ".cast"); ---------------- lebedev.ri wrote: > This should be in the parent patch that added that Maybe. I tried to not do that kind of changes in the rebased versions. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68491/new/ https://reviews.llvm.org/D68491 From llvm-commits at lists.llvm.org Mon Oct 7 03:44:39 2019 From: llvm-commits at lists.llvm.org (Roman Lebedev via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 10:44:39 +0000 (UTC) Subject: [PATCH] D29121: [Docs] Add LangRef documention for freeze instruction In-Reply-To: References: Message-ID: lebedev.ri added a comment. Should there be a constantexpr version of `freeze`? Also, @nlopes, just to put a bold stop to this question - in the end, we want `freeze` to be fully agnostic, it should not care *at all* what the type is, right? CHANGES SINCE LAST ACTION https://reviews.llvm.org/D29121/new/ https://reviews.llvm.org/D29121 From llvm-commits at lists.llvm.org Mon Oct 7 03:45:32 2019 From: llvm-commits at lists.llvm.org (Phabricator via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 10:45:32 +0000 (UTC) Subject: [PATCH] D51470: Add flag to llvm-profdata to allow symbols in profile data to be remapped, andadd a tool to generate symbol remapping files. In-Reply-To: References: Message-ID: <3df71a1e34a76efaf39ce3eeece60811@localhost.localdomain> This revision was not accepted when it landed; it landed in state "Needs Revision". This revision was automatically updated to reflect the committed changes. Closed by commit rG3164fcfd273b: Add flag to llvm-profdata to allow symbols in profile data to be remapped, and… (authored by Richard Smith <richard-llvm at metafoo.co.uk>). 
Changed prior to commit: https://reviews.llvm.org/D51470?vs=165365&id=223481#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D51470/new/ https://reviews.llvm.org/D51470 Files: llvm/docs/CommandGuide/index.rst llvm/docs/CommandGuide/llvm-cxxmap.rst llvm/docs/CommandGuide/llvm-profdata.rst llvm/test/tools/llvm-cxxmap/Inputs/after.sym llvm/test/tools/llvm-cxxmap/Inputs/ambiguous.sym llvm/test/tools/llvm-cxxmap/Inputs/before.sym llvm/test/tools/llvm-cxxmap/Inputs/expected llvm/test/tools/llvm-cxxmap/Inputs/incomplete.sym llvm/test/tools/llvm-cxxmap/Inputs/remap.map llvm/test/tools/llvm-cxxmap/ambiguous.test llvm/test/tools/llvm-cxxmap/incomplete.test llvm/test/tools/llvm-cxxmap/remap.test llvm/test/tools/llvm-profdata/Inputs/instr-remap.expected llvm/test/tools/llvm-profdata/Inputs/instr-remap.proftext llvm/test/tools/llvm-profdata/Inputs/instr-remap.remap llvm/test/tools/llvm-profdata/Inputs/sample-remap.expected llvm/test/tools/llvm-profdata/Inputs/sample-remap.proftext llvm/test/tools/llvm-profdata/Inputs/sample-remap.remap llvm/test/tools/llvm-profdata/instr-remap.test llvm/test/tools/llvm-profdata/sample-remap.test llvm/tools/llvm-cxxmap/CMakeLists.txt llvm/tools/llvm-cxxmap/LLVMBuild.txt llvm/tools/llvm-cxxmap/llvm-cxxmap.cpp llvm/tools/llvm-profdata/llvm-profdata.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D51470.223481.patch Type: text/x-patch Size: 24836 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 03:49:22 2019 From: llvm-commits at lists.llvm.org (Sanjay Patel via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 10:49:22 +0000 (UTC) Subject: [PATCH] D45842: [Reassociate] swap binop operands to increase factoring potential In-Reply-To: References: Message-ID: This revision was not accepted when it landed; it landed in state "Changes Planned". This revision was automatically updated to reflect the committed changes. 
Closed by commit rGca36eb4e33e4: [Reassociate] swap binop operands to increase factoring potential (authored by spatel). Herald added a subscriber: hiraditya. Changed prior to commit: https://reviews.llvm.org/D45842?vs=163644&id=223484#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D45842/new/ https://reviews.llvm.org/D45842 Files: llvm/include/llvm/Transforms/Scalar/Reassociate.h llvm/lib/Transforms/Scalar/Reassociate.cpp llvm/test/Transforms/Reassociate/matching-binops.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D45842.223484.patch Type: text/x-patch Size: 12983 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 03:57:41 2019 From: llvm-commits at lists.llvm.org (Jay Foad via llvm-commits) Date: Mon, 07 Oct 2019 10:57:41 -0000 Subject: [llvm] r373893 - [AMDGPU] Fix test checks Message-ID: <20191007105741.EEC278B4FC@lists.llvm.org> Author: foad Date: Mon Oct 7 03:57:41 2019 New Revision: 373893 URL: http://llvm.org/viewvc/llvm-project?rev=373893&view=rev Log: [AMDGPU] Fix test checks The GFX10-DENORM-STRICT checks were only passing by accident. Fix them to make the test more robust in the face of scheduling or register allocation changes. 
Modified: llvm/trunk/test/CodeGen/AMDGPU/fmuladd.f16.ll Modified: llvm/trunk/test/CodeGen/AMDGPU/fmuladd.f16.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/fmuladd.f16.ll?rev=373893&r1=373892&r2=373893&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/AMDGPU/fmuladd.f16.ll (original) +++ llvm/trunk/test/CodeGen/AMDGPU/fmuladd.f16.ll Mon Oct 7 03:57:41 2019 @@ -331,7 +331,8 @@ define amdgpu_kernel void @mad_sub_f16(h ; GFX10-FLUSH: v_mul_f16_e32 [[TMP:v[0-9]+]], [[REGA]], [[REGB]] ; GFX10-FLUSH: v_sub_f16_e32 [[RESULT:v[0-9]+]], [[REGC]], [[TMP]] ; GFX10-FLUSH: global_store_short v{{\[[0-9]+:[0-9]+\]}}, [[RESULT]] -; GFX10-DENORM: global_store_short v{{\[[0-9]+:[0-9]+\]}}, [[REGC]] +; GFX10-DENORM-STRICT: global_store_short v{{\[[0-9]+:[0-9]+\]}}, [[RESULT]] +; GFX10-DENORM-CONTRACT: global_store_short v{{\[[0-9]+:[0-9]+\]}}, [[REGC]] define amdgpu_kernel void @mad_sub_inv_f16(half addrspace(1)* noalias nocapture %out, half addrspace(1)* noalias nocapture readonly %ptr) #1 { %tid = tail call i32 @llvm.amdgcn.workitem.id.x() #0 %tid.ext = sext i32 %tid to i64 @@ -439,7 +440,8 @@ define amdgpu_kernel void @mad_sub_fabs_ ; GFX10-FLUSH: v_mul_f16_e32 [[TMP:v[0-9]+]], [[REGA]], [[REGB]] ; GFX10-FLUSH: v_add_f16_e32 [[RESULT:v[0-9]+]], [[REGC]], [[TMP]] ; GFX10-FLUSH: global_store_short v{{\[[0-9]+:[0-9]+\]}}, [[RESULT]] -; GFX10-DENORM: global_store_short v{{\[[0-9]+:[0-9]+\]}}, [[REGC]] +; GFX10-DENORM-STRICT: global_store_short v{{\[[0-9]+:[0-9]+\]}}, [[RESULT]] +; GFX10-DENORM-CONTRACT: global_store_short v{{\[[0-9]+:[0-9]+\]}}, [[REGC]] define amdgpu_kernel void @neg_neg_mad_f16(half addrspace(1)* noalias nocapture %out, half addrspace(1)* noalias nocapture readonly %ptr) #1 { %tid = tail call i32 @llvm.amdgcn.workitem.id.x() #0 %tid.ext = sext i32 %tid to i64 From llvm-commits at lists.llvm.org Mon Oct 7 04:00:39 2019 From: llvm-commits at lists.llvm.org (Dave Green via 
Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 11:00:39 +0000 (UTC) Subject: [PATCH] D68566: [ARM] VQADD instructions Message-ID: dmgreen created this revision. dmgreen added reviewers: t.p.northover, simon_tatham, SjoerdMeijer, samparker, ostannard. Herald added subscribers: hiraditya, kristof.beyls. Herald added a project: LLVM. This selects MVE VQADD from the vector llvm.sadd.sat or llvm.uadd.sat intrinsics. The signed versions seem hard to get from C, but the unsigned are simple enough. And both are available from llvm, obviously. https://reviews.llvm.org/D68566 Files: llvm/lib/Target/ARM/ARMISelLowering.cpp llvm/lib/Target/ARM/ARMInstrMVE.td llvm/test/CodeGen/Thumb2/mve-saturating-arith.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D68566.223486.patch Type: text/x-patch Size: 7586 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 04:01:44 2019 From: llvm-commits at lists.llvm.org (Roman Lebedev via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 11:01:44 +0000 (UTC) Subject: [PATCH] D29011: [IR] Add Freeze instruction In-Reply-To: References: Message-ID: <592da4f338fe064e70e78fe3aa805246@localhost.localdomain> lebedev.ri marked an inline comment as done. lebedev.ri added a comment. Thanks, this looks about right. I'm not sure about backend part of this patch, but that is likely okay. Two main questions: should there be a constantexpr `freeze`, and should `freeze` be fully type-agnostic, like it is stated in D29121 ? ================ Comment at: include/llvm/CodeGen/GlobalISel/IRTranslator.h:487 + bool translateFreeze(const User &U, MachineIRBuilder &MIRBuilder) { + return false; + } ---------------- aqjune wrote: > lebedev.ri wrote: > > Is this a correctness issue? > > Or does returning `false` here results in IRTranslator "aborting"? > Yep, returning false makes it abort, if my understanding is correct. Okay. 
================ Comment at: lib/AsmParser/LLParser.cpp:3407-3430 // Unary Operators. case lltok::kw_fneg: { unsigned Opc = Lex.getUIntVal(); Constant *Val; Lex.Lex(); if (ParseToken(lltok::lparen, "expected '(' in unary constantexpr") || ParseGlobalTypeAndValue(Val) || ---------------- Should there be `constantexpr` version of `freeze`? ================ Comment at: lib/AsmParser/LLParser.cpp:6317-6338 /// ParseUnaryOp /// ::= UnaryOp TypeAndValue ',' Value /// -/// If IsFP is false, then any integer operand is allowed, if it is true, any fp -/// operand is allowed. +/// If IsFP is true, then any fp operand is allowed. +// If IsInt is true, then any integer operand is allowed. bool LLParser::ParseUnaryOp(Instruction *&Inst, PerFunctionState &PFS, + unsigned Opc, bool IsFP, bool IsInt) { ---------------- I see no restrictions on the type to `freeze` in D29121, so i think you just don't want any checking for `freeze` here. And existing `ParseUnaryOp` is only called with `/*IsFP*/true` to parse `fneg`. So let's change this to ``` bool Valid = !IsFPOnly || LHS->getType()->isFPOrFPVectorTy(); ``` ? ================ Comment at: lib/Bitcode/Writer/BitcodeWriter.cpp:2448-2457 case Instruction::FNeg: { assert(CE->getNumOperands() == 1 && "Unknown constant expr!"); Code = bitc::CST_CODE_CE_UNOP; Record.push_back(getEncodedUnaryOpcode(CE->getOpcode())); Record.push_back(VE.getValueID(C->getOperand(0))); uint64_t Flags = getOptimizationFlags(CE); if (Flags != 0) ---------------- What about constantexpr `freeze`? 
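
The `ParseUnaryOp` suggestion above (`!IsFPOnly || ...isFPOrFPVectorTy()`) can be sketched in isolation. This is only an illustration of the predicate: `Kind` and `isValidUnaryOperand` are hypothetical names, not LLVM's `Type` API, and `IsFPOnly` is the flag name proposed in the review comment.

```cpp
// Hypothetical, simplified operand classification (not LLVM's Type class).
enum class Kind { Integer, Float, Pointer };

// With IsFPOnly set (the fneg case) only floating-point operands pass.
// With IsFPOnly clear (the freeze case, if it is to be type-agnostic)
// every operand kind passes.
inline bool isValidUnaryOperand(Kind OperandKind, bool IsFPOnly) {
  return !IsFPOnly || OperandKind == Kind::Float;
}
```

Under this shape a single flag replaces the `IsFP`/`IsInt` pair from the patch, on the assumption that `freeze` really should accept any type, which is the open question raised for D29121.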
================ Comment at: lib/IR/Instructions.cpp:2240-2242 + assert((getType()->isIntOrIntVectorTy() || getType()->isFPOrFPVectorTy()) && + "Tried to create a freeze operation on a " + "non-integer, non-floating-point type!"); ---------------- I'm not seeing this restriction in D29121 ================ Comment at: lib/IR/Verifier.cpp:3141-3144 + case Instruction::Freeze: + Assert(U.getType()->isIntOrIntVectorTy() || U.getType()->isFPOrFPVectorTy(), + "Freeze operator only works with float/int types!", &U); + break; ---------------- I'm not seeing this restriction in D29121 CHANGES SINCE LAST ACTION https://reviews.llvm.org/D29011/new/ https://reviews.llvm.org/D29011 From llvm-commits at lists.llvm.org Mon Oct 7 04:03:20 2019 From: llvm-commits at lists.llvm.org (Dave Green via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 11:03:20 +0000 (UTC) Subject: [PATCH] D68567: [ARM] VQSUB instruction Message-ID: dmgreen created this revision. dmgreen added reviewers: t.p.northover, simon_tatham, SjoerdMeijer, samparker, ostannard. Herald added subscribers: hiraditya, kristof.beyls. Herald added a project: LLVM. Same as VQADD, VQSUB can be selected from llvm.ssub.sat intrinsics. https://reviews.llvm.org/D68567 Files: llvm/lib/Target/ARM/ARMISelLowering.cpp llvm/lib/Target/ARM/ARMInstrMVE.td llvm/test/CodeGen/Thumb2/mve-saturating-arith.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D68567.223489.patch Type: text/x-patch Size: 5278 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 04:04:55 2019 From: llvm-commits at lists.llvm.org (Roman Lebedev via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 11:04:55 +0000 (UTC) Subject: [PATCH] D45842: [Reassociate] swap binop operands to increase factoring potential In-Reply-To: References: Message-ID: lebedev.ri added a comment. I don't think this just relanded, phab gone mad due to the disk space issues? 
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D45842/new/ https://reviews.llvm.org/D45842 From llvm-commits at lists.llvm.org Mon Oct 7 04:07:16 2019 From: llvm-commits at lists.llvm.org (Jeroen Dobbelaere via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 11:07:16 +0000 (UTC) Subject: [PATCH] D68499: [PATCH 16/38] [noalias] Loop vectorizer: learn about noalias intrinsics In-Reply-To: References: Message-ID: jeroen.dobbelaere marked 2 inline comments as done. jeroen.dobbelaere added inline comments. ================ Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:3223-3227 + // Compute corresponding vector type for return value and arguments. + Type *RetTy = ToVectorTy(ScalarRetTy, VF); + for (Type *ScalarTy : ScalarTys) + Tys.push_back(ToVectorTy(ScalarTy, VF)); + ---------------- lebedev.ri wrote: > Why was this moved? Just as an optimization in case we early-return inbetween? By moving this later, we avoid trying to get the vector variant of MetadataAsValue for the noalias intrinsics. (and also the corresponding assertion) CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68499/new/ https://reviews.llvm.org/D68499 From llvm-commits at lists.llvm.org Mon Oct 7 04:08:48 2019 From: llvm-commits at lists.llvm.org (Sean Fertile via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 11:08:48 +0000 (UTC) Subject: [PATCH] D40425: Extending CFGPrinter and CallPrinter with Heat Colors In-Reply-To: References: Message-ID: This revision was automatically updated to reflect the committed changes. Closed by commit rG3b0535b424ac: Extend CFGPrinter and CallPrinter with Heat Colors (authored by sfertile). Herald added a subscriber: hiraditya. Herald added a project: LLVM. 
Changed prior to commit: https://reviews.llvm.org/D40425?vs=158745&id=223492#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D40425/new/ https://reviews.llvm.org/D40425 Files: llvm/include/llvm/Analysis/CFGPrinter.h llvm/include/llvm/Analysis/HeatUtils.h llvm/lib/Analysis/CFGPrinter.cpp llvm/lib/Analysis/CMakeLists.txt llvm/lib/Analysis/CallPrinter.cpp llvm/lib/Analysis/DomPrinter.cpp llvm/lib/Analysis/HeatUtils.cpp llvm/lib/Analysis/RegionPrinter.cpp llvm/lib/Passes/PassRegistry.def llvm/lib/Transforms/Scalar/NewGVN.cpp llvm/llvm/Analysis/HeatUtils.h llvm/test/Other/2007-06-05-PassID.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D40425.223492.patch Type: text/x-patch Size: 47523 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 04:12:49 2019 From: llvm-commits at lists.llvm.org (Sam McCall via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 11:12:49 +0000 (UTC) Subject: [PATCH] D45842: [Reassociate] swap binop operands to increase factoring potential In-Reply-To: References: Message-ID: <1007babef0190f3dae06a641f18c8d00@localhost.localdomain> sammccall reopened this revision. sammccall added a comment. This revision is now accepted and ready to land. In D45842#1697098 , @lebedev.ri wrote: > I don't think this just relanded, phab gone mad due to the disk space issues? Yes. It's decided to reimport all the reviews, so anything that was committed and reopened will probably be autoclosed :-( Sorry, not really sure how to stop it without leaving everything in an unimported state. 
Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D45842/new/

https://reviews.llvm.org/D45842

From llvm-commits at lists.llvm.org Mon Oct 7 04:15:45 2019
From: llvm-commits at lists.llvm.org (George Rimar via Phabricator via llvm-commits)
Date: Mon, 07 Oct 2019 11:15:45 +0000 (UTC)
Subject: [PATCH] D68462: [llvm-readobj/llvm-readelf] - Add checks for GNU-style to "all.test" test case.
In-Reply-To:
References:
Message-ID: <39247647e9878b2a360a53a605028b0b@localhost.localdomain>

grimar marked 14 inline comments as done.
grimar added inline comments.

================
Comment at: test/tools/llvm-readobj/all.test:27-28
+# GNU-ALL: Symbol table '.symtab' contains {{.*}} entries:
+# GNU-ALL: EH_FRAME Header [
+# GNU-ALL: Dynamic section at offset {{.*}} contains {{.*}} entries:
+# GNU-ALL: Program Headers:
----------------
jhenderson wrote:
> These two appear to be missing from the LLVM list above?
Yes. LLVM output seems to be inconsistent, incomplete and ugly sometimes.
I'd review and refine it separately.

================
Comment at: test/tools/llvm-readobj/all.test:33-34
+# GNU-ALL: Version needs section '.gnu.version_r' contains {{.*}} entries:
+# GNU-ALL: COMDAT group section [ {{.*}}] `.group' [foo] contains {{.*}} sections:
+# GNU-ALL: Histogram for bucket list length (total of 1 buckets)
+# GNU-ALL: Displaying notes found at file offset {{.*}} with length {{.*}}:
----------------
jhenderson wrote:
> These two are also missing from the LLVM list.
The same.

================
Comment at: test/tools/llvm-readobj/all.test:49
+ Relocations:
+ - Name: .gnu.version
+ Type: SHT_GNU_versym
----------------
jhenderson wrote:
> I take it the version stuff is needed to make GNU mode print anything?
Yep, GNU style prints nothing if there is no version section.
See: https://github.com/llvm-mirror/llvm/blob/master/tools/llvm-readobj/ELFDumper.cpp#L3785 ================ Comment at: test/tools/llvm-readobj/all.test:63 + Entries: [] + - Name: .group + Type: SHT_GROUP ---------------- jhenderson wrote: > Do we need group information to print a header? Yes. Logic is: 1) Collect all group sections: https://github.com/llvm-mirror/llvm/blob/master/tools/llvm-readobj/ELFDumper.cpp#L2842 2) Dump them (header is printed on this step): https://github.com/llvm-mirror/llvm/blob/master/tools/llvm-readobj/ELFDumper.cpp#L2886 ================ Comment at: test/tools/llvm-readobj/all.test:75-76 + Entries: + - Tag: DT_HASH + Value: 0x1100 + - Tag: DT_NULL ---------------- jhenderson wrote: > Can we just get away with the DT_NULL tag, or is DT_HASH required for the hash histogram behaviour? > is DT_HASH required for the hash histogram behaviour? Yes, we need it to get the `DT_HASH` content: https://github.com/llvm-mirror/llvm/blob/master/tools/llvm-readobj/ELFDumper.cpp#L1712 ================ Comment at: test/tools/llvm-readobj/all.test:86-90 + - Name: .eh_frame_hdr + Type: SHT_PROGBITS +## An arbitrary linker-generated valid content. + Content: 011b033b140000000100000000f0ffff30000000 + - Name: .eh_frame ---------------- jhenderson wrote: > Same comments as earlier. Can these be empty? No. We need to have something valid here, otherwise any error triggered will fail the dumping. (e.g. https://github.com/llvm-mirror/llvm/blob/master/tools/llvm-readobj/DwarfCFIEHPrinter.h#L127). ================ Comment at: test/tools/llvm-readobj/all.test:100 +## An arbitrary linker-generated valid content. + Content: 040000001000000003000000474E55004FCB712AA6387724A9F465A32CD8C14B +Symbols: ---------------- jhenderson wrote: > This could probably just be an arbitrary note, and much simpler. Probably. But it is already short enough and I do not want to spend time on optimising it until we have a way to describe it with YAML. 
Having a raw content is anyways not optimal. What do you think? CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68462/new/ https://reviews.llvm.org/D68462 From llvm-commits at lists.llvm.org Mon Oct 7 04:17:08 2019 From: llvm-commits at lists.llvm.org (Vitaly Buka via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 11:17:08 +0000 (UTC) Subject: [PATCH] D47751: [lsan] Do not check for leaks in the forked process In-Reply-To: References: Message-ID: <21e59d20cd6b9faef991a2f29d64f507@localhost.localdomain> This revision was not accepted when it landed; it landed in state "Changes Planned". This revision was automatically updated to reflect the committed changes. Closed by commit rGb89704fa6f6f: [lsan] Do not check for leaks in the forked process (authored by vitalybuka). Herald added projects: Sanitizers, LLVM. Herald added a subscriber: Sanitizers. Changed prior to commit: https://reviews.llvm.org/D47751?vs=150012&id=223496#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D47751/new/ https://reviews.llvm.org/D47751 Files: compiler-rt/lib/lsan/lsan_common.cc compiler-rt/test/lsan/TestCases/Linux/fork_with_threads.cc Index: compiler-rt/test/lsan/TestCases/Linux/fork_with_threads.cc =================================================================== --- /dev/null +++ compiler-rt/test/lsan/TestCases/Linux/fork_with_threads.cc @@ -0,0 +1,35 @@ +// Test forked process does not run lsan. 
+// RUN: %clangxx_lsan %s -o %t && %run %t 2>&1 | FileCheck %s + +#include +#include +#include +#include + +static pthread_barrier_t barrier; + +// CHECK-NOT: SUMMARY: {{(Leak|Address)}}Sanitizer: +static void *thread_func(void *arg) { + void *buffer = malloc(1337); + pthread_barrier_wait(&barrier); + for (;;) + pthread_yield(); + return 0; +} + +int main() { + pthread_barrier_init(&barrier, 0, 2); + pthread_t tid; + int res = pthread_create(&tid, 0, thread_func, 0); + pthread_barrier_wait(&barrier); + pthread_barrier_destroy(&barrier); + + pid_t pid = fork(); + if (pid > 0) { + int status = 0; + waitpid(pid, &status, 0); + } + return 0; +} + +// CHECK: WARNING: LeakSanitizer is disabled in forked process Index: compiler-rt/lib/lsan/lsan_common.cc =================================================================== --- compiler-rt/lib/lsan/lsan_common.cc +++ compiler-rt/lib/lsan/lsan_common.cc @@ -100,6 +100,8 @@ static InternalMmapVector *root_regions; +static uptr initialized_for_pid; + InternalMmapVector const *GetRootRegions() { return root_regions; } void InitializeRootRegions() { @@ -113,6 +115,7 @@ } void InitCommonLsan() { + initialized_for_pid = internal_getpid(); InitializeRootRegions(); if (common_flags()->detect_leaks) { // Initialization which can fail or print warnings should only be done if @@ -568,6 +571,12 @@ static bool CheckForLeaks() { if (&__lsan_is_turned_off && __lsan_is_turned_off()) return false; + if (initialized_for_pid != internal_getpid()) { + // If process was forked and it had threads we fail to detect references + // from other threads. + Report("WARNING: LeakSanitizer is disabled in forked process.\n"); + return false; + } EnsureMainThreadIDIsCorrect(); CheckForLeaksParam param; param.success = false; -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D47751.223496.patch Type: text/x-patch Size: 2186 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 04:20:57 2019 From: llvm-commits at lists.llvm.org (Roman Lebedev via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 11:20:57 +0000 (UTC) Subject: [PATCH] D46814: [InstCombine] Fold unfolded masked merge pattern with variable mask! In-Reply-To: References: Message-ID: <8d1a34b1f0b5869c79d2a311295cdf20@localhost.localdomain> This revision was not accepted when it landed; it landed in state "Changes Planned". This revision was automatically updated to reflect the committed changes. Closed by commit rG6b6c553bb895: [InstCombine] Fold unfolded masked merge pattern with variable mask! (authored by lebedev.ri). Herald added a subscriber: hiraditya. Herald added a project: LLVM. Changed prior to commit: https://reviews.llvm.org/D46814?vs=148250&id=223497#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D46814/new/ https://reviews.llvm.org/D46814 Files: llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp llvm/test/Transforms/InstCombine/and-or-not.ll llvm/test/Transforms/InstCombine/masked-merge-add.ll llvm/test/Transforms/InstCombine/masked-merge-and-of-ors.ll llvm/test/Transforms/InstCombine/masked-merge-or.ll llvm/test/Transforms/InstCombine/masked-merge-xor.ll llvm/test/Transforms/InstCombine/vec_sext.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D46814.223497.patch Type: text/x-patch Size: 30822 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 04:25:37 2019 From: llvm-commits at lists.llvm.org (Juneyoung Lee via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 11:25:37 +0000 (UTC) Subject: [PATCH] D29011: [IR] Add Freeze instruction In-Reply-To: References: Message-ID: <64d4391f46a95f51c81b9608a88ee065@localhost.localdomain> aqjune added a comment. 
> should there be a constantexpr freeze Yep, constantexpr freeze makes sense, as freeze is a scalar operation (like fneg). :) > and should freeze be fully type-agnostic, like it is stated in D29121 ? I think this is a hard question, especially due to the existence of the undef pointer. A pointer value tracks which memory block it is pointing to. If `freeze i8* undef` is defined to yield a random pointer to any pre-defined memory block, this will limit free moving of freeze, e.g: p = malloc() // p is created ptr0 = freeze i8* undef // ptr0 can point to block p => ptr0 = freeze i8* undef // ptr0 can't point to block p, because p is not allocated yet p = malloc() CHANGES SINCE LAST ACTION https://reviews.llvm.org/D29011/new/ https://reviews.llvm.org/D29011 From llvm-commits at lists.llvm.org Mon Oct 7 04:30:17 2019 From: llvm-commits at lists.llvm.org (Kuba (Brecka) Mracek via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 11:30:17 +0000 (UTC) Subject: [PATCH] D44246: [sanitizer] Generalize atomic_uint8_t, atomic_uint16_t, ... into a template. NFC. In-Reply-To: References: Message-ID: <99716446efd1d801a817d3640ab00307@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rG1707fa337468: [sanitizer] Generalize atomic_uint8_t, atomic_uint16_t, ... into a template. (authored by kubamracek). Herald added a subscriber: jfb. Herald added a project: LLVM. 
Changed prior to commit: https://reviews.llvm.org/D44246?vs=143181&id=223500#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D44246/new/ https://reviews.llvm.org/D44246 Files: compiler-rt/lib/sanitizer_common/sanitizer_atomic.h Index: compiler-rt/lib/sanitizer_common/sanitizer_atomic.h =================================================================== --- compiler-rt/lib/sanitizer_common/sanitizer_atomic.h +++ compiler-rt/lib/sanitizer_common/sanitizer_atomic.h @@ -27,36 +27,18 @@ memory_order_seq_cst = 1 << 5 }; -struct atomic_uint8_t { - typedef u8 Type; - volatile Type val_dont_use; -}; - -struct atomic_uint16_t { - typedef u16 Type; - volatile Type val_dont_use; -}; - -struct atomic_sint32_t { - typedef s32 Type; - volatile Type val_dont_use; -}; - -struct atomic_uint32_t { - typedef u32 Type; - volatile Type val_dont_use; -}; - -struct atomic_uint64_t { - typedef u64 Type; - // On 32-bit platforms u64 is not necessary aligned on 8 bytes. - volatile ALIGNED(8) Type val_dont_use; +template +struct atomic { + typedef T Type; + volatile Type ALIGNED(sizeof(Type)) val_dont_use; }; -struct atomic_uintptr_t { - typedef uptr Type; - volatile Type val_dont_use; -}; +typedef atomic atomic_uint8_t; +typedef atomic atomic_uint16_t; +typedef atomic atomic_sint32_t; +typedef atomic atomic_uint32_t; +typedef atomic atomic_uint64_t; +typedef atomic atomic_uintptr_t; } // namespace __sanitizer -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D44246.223500.patch Type: text/x-patch Size: 1250 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 04:33:28 2019 From: llvm-commits at lists.llvm.org (Tim Corringham via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 11:33:28 +0000 (UTC) Subject: [PATCH] D45246: Add AMDPAL Code Conventions section to AMD docs In-Reply-To: References: Message-ID: <974af278821e88a47a658d8b10ec4dfe@localhost.localdomain> This revision was not accepted when it landed; it landed in state "Needs Revision". This revision was automatically updated to reflect the committed changes. Closed by commit rGaf2dfc697bbb: Add AMDPAL Code Conventions section to AMD docs (authored by timcorringham). Herald added a subscriber: jvesely. Herald added a project: LLVM. Changed prior to commit: https://reviews.llvm.org/D45246?vs=140952&id=223502#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D45246/new/ https://reviews.llvm.org/D45246 Files: llvm/docs/AMDGPUUsage.rst -------------- next part -------------- A non-text attachment was scrubbed... Name: D45246.223502.patch Type: text/x-patch Size: 5431 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 04:36:39 2019 From: llvm-commits at lists.llvm.org (James Henderson via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 11:36:39 +0000 (UTC) Subject: [PATCH] D68210: Object/minidump: Add support for the MemoryInfoList stream In-Reply-To: References: Message-ID: jhenderson added a comment. Can't comment too much on the file format details, but I've made some more general comments. FYI, I'll be away from end of day Wednesday for 2 and a half weeks, so won't be able to further review after that point until I'm back. 
================ Comment at: include/llvm/BinaryFormat/Minidump.h:81 +#include "llvm/BinaryFormat/MinidumpConstants.def" + LLVM_MARK_AS_BITMASK_ENUM(/*LargestValue = */ 0xffffffffu), +}; ---------------- I believe if you format this line as: ``` LLVM_MARK_AS_BITMASK_ENUM(/*LargestValue=*/0xffffffffu), ``` clang-format will leave it unedited. I believe it has special rules for `/*=*/` to label parameters. ================ Comment at: lib/Object/Minidump.cpp:58 +MinidumpFile::getMemoryInfoList() const { + auto OptionalStream = getRawStream(StreamType::MemoryInfoList); + if (!OptionalStream) ---------------- I probably should have picked up on this in previous reviews, but this is too much `auto` for my liking, as it's not obvious from the call site what `getRawStream` returns. ================ Comment at: lib/Object/Minidump.cpp:66 + const minidump::MemoryInfoListHeader &H = ExpectedHeader.get()[0]; + auto ExpectedData = getDataSlice(*OptionalStream, H.SizeOfHeader, + H.SizeOfEntry * H.NumberOfEntries); ---------------- Ditto. ================ Comment at: unittests/Object/MinidumpTest.cpp:617-618 + // MemoryInfoListHeader + 16, 0, 0, 0, 48, 0, 0, 0, // SizeOfHeader, SizeOfEntry + 1, 0, 0, 0, // ??? + }; ---------------- I might make the data here be of size 15 to test the edge case. It's probably also worth a test case where the header size as specified by SizeOfHeader fits in the data but is smaller than the expected value. ================ Comment at: unittests/Object/MinidumpTest.cpp:634 + // MemoryInfoListHeader + 16, 0, 0, 0, 52, 0, 0, 0, // SizeOfHeader, SizeOfEntry + 1, 0, 0, 0, 0, 0, 0, 0, // NumberOfEntries ---------------- I might go for a value of 49 to test the edge value here. 
Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68210/new/ https://reviews.llvm.org/D68210 From llvm-commits at lists.llvm.org Mon Oct 7 04:43:53 2019 From: llvm-commits at lists.llvm.org (Roman Lebedev via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 11:43:53 +0000 (UTC) Subject: [PATCH] D29011: [IR] Add Freeze instruction In-Reply-To: References: Message-ID: lebedev.ri added a comment. In D29011#1697170 , @aqjune wrote: > > should there be a constantexpr freeze > > Yep, constantexpr freeze makes sense, as freeze is a scalar operation (like fneg). :) See inline comments then :) >> and should freeze be fully type-agnostic, like it is stated in D29121 ? > > I think this is a hard question, especially due to the existence of the undef pointer. > A pointer value tracks which memory block it is pointing to. If `freeze i8* undef` is defined to yield a random pointer to any pre-defined memory block, this will limit free moving of freeze, e.g: > > p = malloc() // p is created > ptr0 = freeze i8* undef // ptr0 can point to block p > => > ptr0 = freeze i8* undef // ptr0 can't point to block p, because p is not allocated yet > p = malloc() I guess D29121 needs to explicitly single-out the pointers then. 
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D29011/new/ https://reviews.llvm.org/D29011 From llvm-commits at lists.llvm.org Mon Oct 7 04:46:27 2019 From: llvm-commits at lists.llvm.org (Nico Weber via llvm-commits) Date: Mon, 07 Oct 2019 11:46:27 -0000 Subject: [llvm] r373898 - Revert r373888 "[IA] Recognize hexadecimal escape sequences" Message-ID: <20191007114627.0809F8B908@lists.llvm.org> Author: nico Date: Mon Oct 7 04:46:26 2019 New Revision: 373898 URL: http://llvm.org/viewvc/llvm-project?rev=373898&view=rev Log: Revert r373888 "[IA] Recognize hexadecimal escape sequences" It broke MC/AsmParser/directive_ascii.s on all bots: Assertion failed: (Index < Length && "Invalid index!"), function operator[], file ../../llvm/include/llvm/ADT/StringRef.h, line 243. Modified: llvm/trunk/lib/MC/MCParser/AsmParser.cpp llvm/trunk/test/MC/AsmParser/directive_ascii.s Modified: llvm/trunk/lib/MC/MCParser/AsmParser.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/MC/MCParser/AsmParser.cpp?rev=373898&r1=373897&r2=373898&view=diff ============================================================================== --- llvm/trunk/lib/MC/MCParser/AsmParser.cpp (original) +++ llvm/trunk/lib/MC/MCParser/AsmParser.cpp Mon Oct 7 04:46:26 2019 @@ -2914,26 +2914,11 @@ bool AsmParser::parseEscapedString(std:: } // Recognize escaped characters. Note that this escape semantics currently - // loosely follows Darwin 'as'. + // loosely follows Darwin 'as'. Notably, it doesn't support hex escapes. ++i; if (i == e) return TokError("unexpected backslash at end of string"); - // Recognize hex sequences similarly to GNU 'as'. - if (Str[i] == 'x' || Str[i] == 'X') { - if (!isHexDigit(Str[i + 1])) - return TokError("invalid hexadecimal escape sequence"); - - // Consume hex characters. GNU 'as' reads all hexadecimal characters and - // then truncates to the lower 16 bits. Seems reasonable. 
- unsigned Value = 0; - while (isHexDigit(Str[i + 1])) - Value = Value * 16 + hexDigitValue(Str[++i]); - - Data += (unsigned char)(Value & 0xFF); - continue; - } - // Recognize octal sequences. if ((unsigned)(Str[i] - '0') <= 7) { // Consume up to three octal characters. Modified: llvm/trunk/test/MC/AsmParser/directive_ascii.s URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/MC/AsmParser/directive_ascii.s?rev=373898&r1=373897&r2=373898&view=diff ============================================================================== --- llvm/trunk/test/MC/AsmParser/directive_ascii.s (original) +++ llvm/trunk/test/MC/AsmParser/directive_ascii.s Mon Oct 7 04:46:26 2019 @@ -39,8 +39,3 @@ TEST5: # CHECK: .byte 0 TEST6: .string "B", "C" - -# CHECK: TEST7: -# CHECK: .ascii "dk" -TEST7: - .ascii "\x64\Xa6B" From llvm-commits at lists.llvm.org Mon Oct 7 04:44:48 2019 From: llvm-commits at lists.llvm.org (Nico Weber via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 11:44:48 +0000 (UTC) Subject: [PATCH] D68483: [IA] Recognize hexadecimal escape sequences In-Reply-To: References: Message-ID: thakis added a comment. I reverted this in r373898 since MC/AsmParser/directive_ascii.s failed on bots. I didn't look into it, but maybe it's because there's no bounds checking on the `i + 1` index. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68483/new/ https://reviews.llvm.org/D68483 From llvm-commits at lists.llvm.org Mon Oct 7 04:46:12 2019 From: llvm-commits at lists.llvm.org (Haicheng Wu via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 11:46:12 +0000 (UTC) Subject: [PATCH] D36104: [AArch64] Coalesce Copy Zero during instruction selection In-Reply-To: References: Message-ID: This revision was automatically updated to reflect the committed changes. Closed by commit rGaed6e52b3c3f: [AArch64] Coalesce Copy Zero during instruction selection (authored by haicheng). Herald added a subscriber: hiraditya. Herald added a project: LLVM. 
Changed prior to commit: https://reviews.llvm.org/D36104?vs=134832&id=223511#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D36104/new/ https://reviews.llvm.org/D36104 Files: llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp llvm/test/CodeGen/AArch64/arm64-addr-type-promotion.ll llvm/test/CodeGen/AArch64/arm64-cse.ll llvm/test/CodeGen/AArch64/copy-zero-reg.ll llvm/test/CodeGen/AArch64/i128-fast-isel-fallback.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D36104.223511.patch Type: text/x-patch Size: 4759 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 04:46:44 2019 From: llvm-commits at lists.llvm.org (James Henderson via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 11:46:44 +0000 (UTC) Subject: [PATCH] D68462: [llvm-readobj/llvm-readelf] - Add checks for GNU-style to "all.test" test case. In-Reply-To: References: Message-ID: <9be4b97e6b75baebbd717585004d44e7@localhost.localdomain> jhenderson added inline comments. ================ Comment at: test/tools/llvm-readobj/all.test:63 + Entries: [] + - Name: .group + Type: SHT_GROUP ---------------- grimar wrote: > jhenderson wrote: > > Do we need group information to print a header? > Yes. > Logic is: > 1) Collect all group sections: > https://github.com/llvm-mirror/llvm/blob/master/tools/llvm-readobj/ELFDumper.cpp#L2842 > 2) Dump them (header is printed on this step): > https://github.com/llvm-mirror/llvm/blob/master/tools/llvm-readobj/ELFDumper.cpp#L2886 I see. Could you just check the "There are no section groups in the file" message instead? (https://github.com/llvm-mirror/llvm/blob/master/tools/llvm-readobj/ELFDumper.cpp#L2911) I think that would be sufficient for this test case. ================ Comment at: test/tools/llvm-readobj/all.test:86-90 + - Name: .eh_frame_hdr + Type: SHT_PROGBITS +## An arbitrary linker-generated valid content. 
+ Content: 011b033b140000000100000000f0ffff30000000 + - Name: .eh_frame ---------------- grimar wrote: > jhenderson wrote: > > Same comments as earlier. Can these be empty? > No. We need to have something valid here, otherwise any > error triggered will fail the dumping. > (e.g. https://github.com/llvm-mirror/llvm/blob/master/tools/llvm-readobj/DwarfCFIEHPrinter.h#L127). You don't actually need the .eh_frame_hdr at all, looking at the code, I think, just the .eh_frame section. That said, this appears to be different from GNU readelf. For reference, GNU readelf prints "There are no unwind sections in this file" if there are no .eh_frame_hdr sections even if there is a .eh_frame section (not looked to see what happens if there is a PT_GNU_EH_FRAME program header). ================ Comment at: test/tools/llvm-readobj/all.test:100 +## An arbitrary linker-generated valid content. + Content: 040000001000000003000000474E55004FCB712AA6387724A9F465A32CD8C14B +Symbols: ---------------- grimar wrote: > jhenderson wrote: > > This could probably just be an arbitrary note, and much simpler. > Probably. But it is already short enough and I do not want to spend time on optimising it until we have a way to describe > it with YAML. Having a raw content is anyways not optimal. What do you think? Time to implement SHT_NOTE sections in yaml2obj :) But happy for that to be later. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68462/new/ https://reviews.llvm.org/D68462 From llvm-commits at lists.llvm.org Mon Oct 7 04:48:14 2019 From: llvm-commits at lists.llvm.org (Kuba (Brecka) Mracek via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 11:48:14 +0000 (UTC) Subject: [PATCH] D40032: [compiler-rt] Replace forkpty with posix_spawn In-Reply-To: References: Message-ID: This revision was automatically updated to reflect the committed changes. Closed by commit rG3ecf9dcaf487: [compiler-rt] Replace forkpty with posix_spawn (authored by kubamracek). Herald added a project: LLVM. 
Changed prior to commit: https://reviews.llvm.org/D40032?vs=133795&id=223512#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D40032/new/ https://reviews.llvm.org/D40032 Files: compiler-rt/lib/sanitizer_common/sanitizer_mac.cc compiler-rt/lib/sanitizer_common/sanitizer_posix.h compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_internal.h compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_libcdep.cc compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_mac.cc compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_posix_libcdep.cc -------------- next part -------------- A non-text attachment was scrubbed... Name: D40032.223512.patch Type: text/x-patch Size: 9634 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 04:48:53 2019 From: llvm-commits at lists.llvm.org (Hans Wennborg via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 11:48:53 +0000 (UTC) Subject: [PATCH] D68570: Unify the two CRC implementations Message-ID: hans created this revision. hans added reviewers: majnemer, evgeny777, rnk. Herald added subscribers: seiya, jakehehrlich, hiraditya, mgorny. Herald added a reviewer: alexshap. Herald added a reviewer: rupprecht. Herald added a reviewer: jhenderson. Herald added a project: LLVM. David added the JamCRC implementation in r246590. More recently, Eugene added a CRC-32 implementation in r357901, which falls back to zlib's crc32 function if present. These checksums are essentially the same, so having multiple implementations seems unnecessary. This replaces the CRC-32 implementation with the simpler one from JamCRC, and implements the JamCRC interface in terms of CRC-32 since this means it can use zlib's implementation when available, saving a few bytes and potentially making it faster. JamCRC took an ArrayRef argument, and CRC-32 took a StringRef. This patch changes it to ArrayRef which I think is the best choice, simplifies a few of the callers nicely. Please take a look! 
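As a rough illustration of why the two checksums can be called "essentially the same" (a sketch only, not the code from this patch; `sketchCrc32` and `sketchJamCrc` are made-up names, not LLVM APIs): JamCRC is commonly described as CRC-32 without the final bit inversion, so one can be obtained from the other by complementing the result.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: bitwise CRC-32 using the reflected polynomial
// 0xEDB88320. Pass Crc = 0 to start a new checksum, or a previous result
// to continue one (the same convention zlib's crc32 uses).
inline uint32_t sketchCrc32(uint32_t Crc, const uint8_t *Data, size_t Size) {
  Crc ^= 0xFFFFFFFFu; // Re-enter the internal (inverted) state.
  for (size_t I = 0; I < Size; ++I) {
    Crc ^= Data[I];
    for (int Bit = 0; Bit < 8; ++Bit)
      Crc = (Crc & 1u) ? ((Crc >> 1) ^ 0xEDB88320u) : (Crc >> 1);
  }
  return Crc ^ 0xFFFFFFFFu; // Final inversion that JamCRC omits.
}

// Hypothetical helper: JamCRC expressed in terms of CRC-32. It uses the
// same polynomial and initial state and only skips the final inversion,
// which is equivalent to complementing the CRC-32 result.
inline uint32_t sketchJamCrc(const uint8_t *Data, size_t Size) {
  return ~sketchCrc32(0, Data, Size);
}
```

A table-driven or zlib-backed CRC-32 could be dropped in behind the same interface without changing the JamCRC wrapper, which is the appeal of layering it this way.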
https://reviews.llvm.org/D68570 Files: lld/COFF/PDB.cpp llvm/include/llvm/Support/CRC.h llvm/include/llvm/Support/JamCRC.h llvm/lib/DebugInfo/PDB/Native/Hash.cpp llvm/lib/DebugInfo/PDB/Native/PDBFileBuilder.cpp llvm/lib/DebugInfo/PDB/Native/TpiHashing.cpp llvm/lib/DebugInfo/Symbolize/Symbolize.cpp llvm/lib/MC/WinCOFFObjectWriter.cpp llvm/lib/Support/CMakeLists.txt llvm/lib/Support/CRC.cpp llvm/lib/Support/JamCRC.cpp llvm/lib/Transforms/Instrumentation/PGOInstrumentation.cpp llvm/tools/llvm-objcopy/COFF/COFFObjcopy.cpp llvm/tools/llvm-objcopy/CopyConfig.cpp llvm/unittests/Support/CRCTest.cpp llvm/utils/gn/secondary/llvm/lib/Support/BUILD.gn -------------- next part -------------- A non-text attachment was scrubbed... Name: D68570.223504.patch Type: text/x-patch Size: 21027 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 04:53:25 2019 From: llvm-commits at lists.llvm.org (Easwaran Raman via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 11:53:25 +0000 (UTC) Subject: [PATCH] D42212: [ThinLTO] Add call edges' relative block frequency to per-module summary. In-Reply-To: References: Message-ID: <6bba2e83be97a5b5eaa53457e1c113fe@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rG5f7aff9a0a8b: [ThinLTO] Add call edges' relative block frequency to per-module summary. (authored by eraman). Herald added subscribers: arphaman, dexonsmith, steven_wu, hiraditya. Herald added a project: LLVM. 
Changed prior to commit: https://reviews.llvm.org/D42212?vs=131366&id=223514#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D42212/new/ https://reviews.llvm.org/D42212 Files: llvm/include/llvm/Bitcode/LLVMBitCodes.h llvm/include/llvm/IR/ModuleSummaryIndex.h llvm/lib/Analysis/ModuleSummaryAnalysis.cpp llvm/lib/Bitcode/Reader/BitcodeReader.cpp llvm/lib/Bitcode/Writer/BitcodeWriter.cpp llvm/test/Bitcode/thinlto-function-summary-callgraph-relbf.ll llvm/tools/llvm-bcanalyzer/llvm-bcanalyzer.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D42212.223514.patch Type: text/x-patch Size: 12731 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 05:02:38 2019 From: llvm-commits at lists.llvm.org (Shahid via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 12:02:38 +0000 (UTC) Subject: [PATCH] D36130: [SLP] Vectorize jumbled memory loads. In-Reply-To: References: Message-ID: This revision was automatically updated to reflect the committed changes. Closed by commit rGdbd30edb7ff8: [SLP] Vectorize jumbled memory loads. (authored by ashahid). Herald added a subscriber: hiraditya. Herald added a project: LLVM. Changed prior to commit: https://reviews.llvm.org/D36130?vs=136311&id=223515#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D36130/new/ https://reviews.llvm.org/D36130 Files: llvm/include/llvm/Analysis/LoopAccessAnalysis.h llvm/lib/Analysis/LoopAccessAnalysis.cpp llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp llvm/test/Transforms/SLPVectorizer/X86/jumbled-load-multiuse.ll llvm/test/Transforms/SLPVectorizer/X86/jumbled-load-shuffle-placement.ll llvm/test/Transforms/SLPVectorizer/X86/jumbled-load-used-in-phi.ll llvm/test/Transforms/SLPVectorizer/X86/jumbled-load.ll llvm/test/Transforms/SLPVectorizer/X86/store-jumbled.ll -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D36130.223515.patch Type: text/x-patch Size: 61058 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 05:06:10 2019 From: llvm-commits at lists.llvm.org (Oliver Stannard via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 12:06:10 +0000 (UTC) Subject: [PATCH] D36747: [Asm, ARM] Add fallback diag for multiple invalid operands In-Reply-To: References: Message-ID: <8cfa41337f923f1f5f21d9feeff6eaeb@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rG7cd4db94f8ca: [Asm, ARM] Add fallback diag for multiple invalid operands (authored by olista01). Herald added a subscriber: hiraditya. Herald added a project: LLVM. Changed prior to commit: https://reviews.llvm.org/D36747?vs=125308&id=223516#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D36747/new/ https://reviews.llvm.org/D36747 Files: llvm/include/llvm/MC/MCParser/MCTargetAsmParser.h llvm/lib/Target/ARM/AsmParser/ARMAsmParser.cpp llvm/test/MC/ARM/diagnostics.s llvm/test/MC/ARM/invalid-fp-armv8.s llvm/test/MC/ARM/invalid-neon-v8.s llvm/test/MC/ARM/ldrd-strd-gnu-arm-bad-regs.s llvm/test/MC/ARM/ldrd-strd-gnu-bad-inst.s llvm/test/MC/ARM/ldrd-strd-gnu-sp.s llvm/test/MC/ARM/ldrd-strd-gnu-thumb-bad-regs.s llvm/test/MC/ARM/thumb-mov.s llvm/test/MC/ARM/thumb2-diagnostics.s llvm/test/MC/ARM/vfp4.s llvm/utils/TableGen/AsmMatcherEmitter.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D36747.223516.patch Type: text/x-patch Size: 21911 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 05:06:42 2019 From: llvm-commits at lists.llvm.org (Jatin Bhateja via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 12:06:42 +0000 (UTC) Subject: [PATCH] D35014: [X86] Improvement in CodeGen instruction selection for LEAs. In-Reply-To: References: Message-ID: This revision was not accepted when it landed; it landed in state "Needs Review". 
This revision was automatically updated to reflect the committed changes. Closed by commit rG328199ec2643: [X86] Improvement in CodeGen instruction selection for LEAs. (authored by jbhateja). Herald added a subscriber: hiraditya. Herald added a project: LLVM. Changed prior to commit: https://reviews.llvm.org/D35014?vs=125122&id=223517#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D35014/new/ https://reviews.llvm.org/D35014 Files: llvm/include/llvm/CodeGen/MachineInstr.h llvm/include/llvm/CodeGen/SelectionDAG.h llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp llvm/lib/Target/X86/X86ISelDAGToDAG.cpp llvm/lib/Target/X86/X86OptimizeLEAs.cpp llvm/test/CodeGen/X86/GlobalISel/callingconv.ll llvm/test/CodeGen/X86/GlobalISel/gep.ll llvm/test/CodeGen/X86/GlobalISel/memop-scalar.ll llvm/test/CodeGen/X86/lea-opt-cse1.ll llvm/test/CodeGen/X86/lea-opt-cse2.ll llvm/test/CodeGen/X86/lea-opt-cse3.ll llvm/test/CodeGen/X86/lea-opt-cse4.ll llvm/test/CodeGen/X86/mul-constant-i16.ll llvm/test/CodeGen/X86/mul-constant-i32.ll llvm/test/CodeGen/X86/mul-constant-i64.ll llvm/test/CodeGen/X86/mul-constant-result.ll llvm/test/CodeGen/X86/umul-with-overflow.ll llvm/test/Transforms/LoopStrengthReduce/X86/ivchain-X86.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D35014.223517.patch Type: text/x-patch Size: 53074 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 05:06:53 2019 From: llvm-commits at lists.llvm.org (Nico Weber via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 12:06:53 +0000 (UTC) Subject: [PATCH] D68570: Unify the two CRC implementations In-Reply-To: References: Message-ID: <4aa46386b02f1bc7d3caaa61fae62345@localhost.localdomain> thakis accepted this revision. thakis added a comment. This revision is now accepted and ready to land. Nice! 
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68570/new/ https://reviews.llvm.org/D68570 From llvm-commits at lists.llvm.org Mon Oct 7 05:08:01 2019 From: llvm-commits at lists.llvm.org (Sjoerd Meijer via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 12:08:01 +0000 (UTC) Subject: [PATCH] D68566: [ARM] VQADD instructions In-Reply-To: References: Message-ID: <070cfdc52ab0073c2f75f0a5137fb9f0@localhost.localdomain> SjoerdMeijer accepted this revision. SjoerdMeijer added a comment. This revision is now accepted and ready to land. Looks like a nice bit of isel to me. ================ Comment at: llvm/lib/Target/ARM/ARMInstrMVE.td:1624 + foreach instr = [MVE_VQADDu8, MVE_VQADDu16, MVE_VQADDu32] in + foreach VT = [instr.VT] in + def : Pat<(VT (uaddsat (VT MQPR:$Qm), (VT MQPR:$Qn))), ---------------- And looking at this, I almost start to like tablegen. :-) CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68566/new/ https://reviews.llvm.org/D68566 From llvm-commits at lists.llvm.org Mon Oct 7 05:11:39 2019 From: llvm-commits at lists.llvm.org (Sam Parker via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 12:11:39 +0000 (UTC) Subject: [PATCH] D68566: [ARM] VQADD instructions In-Reply-To: References: Message-ID: samparker added a comment. Just wondering, who generates the intrinsics? From the little that I remember from the last time I looked, I thought it was clang, but that it was x86-specific? Do we have some hooks somewhere saying that we support them? CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68566/new/ https://reviews.llvm.org/D68566 From llvm-commits at lists.llvm.org Mon Oct 7 05:14:38 2019 From: llvm-commits at lists.llvm.org (Sanjay Patel via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 12:14:38 +0000 (UTC) Subject: [PATCH] D68470: [InstCombine][NFC] dropRedundantMaskingOfLeftShiftInput(): change how we deal with mask In-Reply-To: References: Message-ID: spatel added a comment.
The diff as shown includes D68239 rather than building on top of it? Commit the other patch and rebase, so we are current with trunk? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68470/new/ https://reviews.llvm.org/D68470 From llvm-commits at lists.llvm.org Mon Oct 7 05:17:37 2019 From: llvm-commits at lists.llvm.org (Hans Wennborg via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 12:17:37 +0000 (UTC) Subject: [PATCH] D68570: Unify the two CRC implementations In-Reply-To: References: Message-ID: hans updated this revision to Diff 223518. hans added a comment. Herald added subscribers: MaskRay, arichardson, emaste. Herald added a reviewer: espindola. Found a call in LLDB too. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68570/new/ https://reviews.llvm.org/D68570 Files: lld/COFF/PDB.cpp lldb/source/Plugins/ObjectFile/ELF/ObjectFileELF.cpp llvm/include/llvm/Support/CRC.h llvm/include/llvm/Support/JamCRC.h llvm/lib/DebugInfo/PDB/Native/Hash.cpp llvm/lib/DebugInfo/PDB/Native/PDBFileBuilder.cpp llvm/lib/DebugInfo/PDB/Native/TpiHashing.cpp llvm/lib/DebugInfo/Symbolize/Symbolize.cpp llvm/lib/MC/WinCOFFObjectWriter.cpp llvm/lib/Support/CMakeLists.txt llvm/lib/Support/CRC.cpp llvm/lib/Support/JamCRC.cpp llvm/lib/Transforms/Instrumentation/PGOInstrumentation.cpp llvm/tools/llvm-objcopy/COFF/COFFObjcopy.cpp llvm/tools/llvm-objcopy/CopyConfig.cpp llvm/unittests/Support/CRCTest.cpp llvm/utils/gn/secondary/llvm/lib/Support/BUILD.gn -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68570.223518.patch Type: text/x-patch Size: 22018 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 05:20:16 2019 From: llvm-commits at lists.llvm.org (Keno Fischer via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 12:20:16 +0000 (UTC) Subject: [PATCH] D39297: [DynamicLibrary] Fix build on musl libc In-Reply-To: References: Message-ID: <2a4210974dfcfda6738eb5b7cff49ac9@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rG1c43ad09650a: [DynamicLibrary] Fix build on musl libc (authored by loladiro). Herald added a subscriber: hiraditya. Herald added a project: LLVM. Changed prior to commit: https://reviews.llvm.org/D39297?vs=120439&id=223520#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D39297/new/ https://reviews.llvm.org/D39297 Files: llvm/lib/Support/Unix/DynamicLibrary.inc Index: llvm/lib/Support/Unix/DynamicLibrary.inc =================================================================== --- llvm/lib/Support/Unix/DynamicLibrary.inc +++ llvm/lib/Support/Unix/DynamicLibrary.inc @@ -71,7 +71,7 @@ // Must declare the symbols in the global namespace. static void *DoSearch(const char* SymbolName) { #define EXPLICIT_SYMBOL(SYM) \ - extern void *SYM; if (!strcmp(SymbolName, #SYM)) return &SYM + extern void *SYM; if (!strcmp(SymbolName, #SYM)) return (void*)&SYM // If this is darwin, it has some funky issues, try to solve them here. Some // important symbols are marked 'private external' which doesn't allow -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D39297.223520.patch
Type: text/x-patch
Size: 653 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Mon Oct 7 05:20:28 2019
From: llvm-commits at lists.llvm.org (Nikola Prica via Phabricator via llvm-commits)
Date: Mon, 07 Oct 2019 12:20:28 +0000 (UTC)
Subject: [PATCH] D67556: [ARM][AArch64][DebugInfo] Improve call site instruction interpretation
In-Reply-To:
References:
Message-ID: <6171dc64ef3e67ec516c834ceb6a8c53@localhost.localdomain>

NikolaPrica marked an inline comment as done.
NikolaPrica added inline comments.

================
Comment at: include/llvm/CodeGen/TargetInstrInfo.h:888
+  /// If the specific machine instruction is an instruction that adds an
+  /// immediate value to its first operand and stores it in the first, return
+  /// true along with @Source machine operand to which @Offset has been
----------------
dstenb wrote:
> dstenb wrote:
> > I wonder if the hook should allow the source and destination to be different, as we then for example could describe cases like this:
> >
> > ```
> > $reg0 = add $frame-ptr, -13
> > ```
> >
> > If so, would it then make sense to move the LEA part of X86's `describeLoadedValue()` hook into this hook instead?
> If so, should we perhaps also consider generalizing the hook so that it has a Destination out-parameter, e.g. same as `isCopyInstr()`? That could probably be helpful if we make the `describeLoadedValue()` hook aware of which register it should describe, as we discussed in D67225.
> I wonder if the hook should allow the source and destination to be different

In fact we should only rely on situations where the source and destination operands are different. Such a restriction should be applied in the general part of `describeLoadedValue()`. There is no use in describing situations like

  $reg0 = add $reg0, 4

This case would require a recursive description of $reg0. Describing such an instruction is a different story.
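The restriction described above can be sketched as follows. This is a minimal, self-contained illustration and not the code under review: `ToyInstr`, `isAddImmediate` and `canDescribeAsSourcePlusOffset` are hypothetical names, and the real hook works on machine instructions and operands rather than strings. The matcher reports destination, source and offset through out-parameters, and the caller then rejects the case where destination and source are the same register.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical stand-in for a machine instruction; not LLVM's MachineInstr.
struct ToyInstr {
  std::string Opcode; // e.g. "add"
  std::string Dest;   // destination register
  std::string Src;    // source register
  bool HasImmediate;  // true for the "register + constant" form
  int64_t Imm;        // the constant, meaningful only if HasImmediate
};

// Recognizes only the plain "Dest = add Src, Imm" shape and reports its
// parts through out-parameters (assumed interface, for illustration).
bool isAddImmediate(const ToyInstr &MI, std::string &Destination,
                    std::string &Source, int64_t &Offset) {
  if (MI.Opcode != "add" || !MI.HasImmediate)
    return false;
  Destination = MI.Dest;
  Source = MI.Src;
  Offset = MI.Imm;
  return true;
}

// Caller-side policy: "$reg0 = add $reg0, 4" is skipped, because describing
// it would require the previous value of $reg0.
bool canDescribeAsSourcePlusOffset(const ToyInstr &MI) {
  std::string Destination, Source;
  int64_t Offset = 0;
  if (!isAddImmediate(MI, Destination, Source, Offset))
    return false;
  return Destination != Source;
}
```

Keeping the "destination differs from source" check in the caller, rather than in the matcher, leaves the matcher reusable for other clients that do not need that restriction.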
> If so, would it then make sense to move the LEA part of X86's describeLoadedValue() hook into this hook instead?

The LEA instruction is more complex than an add-immediate instruction. In one case it can be viewed as an add immediate, but it can also add multiple source registers together with some multiplication. IMHO it would be better to keep this API function clean, in the sense that it recognizes only a plain add-immediate instruction, for the purpose of further use.

> If so, should we perhaps also consider generalizing the hook so that it has a Destination out-parameter, e.g. same as isCopyInstr()? That could probably be helpful if we make the describeLoadedValue() hook aware of which register it should describe, as we discussed in D67225.

In the first version of this function I added a destination operand, but I removed it since there was no use for it at the time. For further flexibility I will add it.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D67556/new/

https://reviews.llvm.org/D67556

From llvm-commits at lists.llvm.org Mon Oct 7 05:20:56 2019
From: llvm-commits at lists.llvm.org (Max Kazantsev via Phabricator via llvm-commits)
Date: Mon, 07 Oct 2019 12:20:56 +0000 (UTC)
Subject: [PATCH] D39228: [SCEV] Enhance SCEVFindUnsafe for division
In-Reply-To:
References:
Message-ID:

This revision was automatically updated to reflect the committed changes.
Closed by commit rGb6d40067af8e: [SCEV] Enhance SCEVFindUnsafe for division (authored by mkazantsev).
Herald added subscribers: javed.absar, hiraditya.
Herald added a project: LLVM.
Changed prior to commit: https://reviews.llvm.org/D39228?vs=120229&id=223521#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D39228/new/ https://reviews.llvm.org/D39228 Files: llvm/lib/Analysis/ScalarEvolutionExpander.cpp llvm/test/Transforms/IndVarSimplify/udiv.ll Index: llvm/test/Transforms/IndVarSimplify/udiv.ll =================================================================== --- llvm/test/Transforms/IndVarSimplify/udiv.ll +++ llvm/test/Transforms/IndVarSimplify/udiv.ll @@ -130,11 +130,11 @@ ; IndVars doesn't emit a udiv in for.body.preheader since SCEVExpander::expand will ; find out there's already a udiv in the original code. -; CHECK-LABEL: @foo( +; CHECK-LABEL: @foo_01( ; CHECK: for.body.preheader: ; CHECK-NOT: udiv -define void @foo(double* %p, i64 %n) nounwind { +define void @foo_01(double* %p, i64 %n) nounwind { entry: %div0 = udiv i64 %n, 7 ; [#uses=1] %div1 = add i64 %div0, 1 @@ -160,3 +160,39 @@ for.end: ; preds = %for.end.loopexit, %entry ret void } + +; Same as foo_01, but we divide by non-constant value. 
+ +; CHECK-LABEL: @foo_02( +; CHECK: for.body.preheader: +; CHECK-NOT: udiv + +define void @foo_02(double* %p, i64 %n, i64* %lp) nounwind { +entry: + %denom = load i64, i64* %lp, align 4, !range !0 + %div0 = udiv i64 %n, %denom ; [#uses=1] + %div1 = add i64 %div0, 1 + %cmp2 = icmp ult i64 0, %div1 ; [#uses=1] + br i1 %cmp2, label %for.body.preheader, label %for.end + +for.body.preheader: ; preds = %entry + br label %for.body + +for.body: ; preds = %for.body.preheader, %for.body + %i.03 = phi i64 [ %inc, %for.body ], [ 0, %for.body.preheader ] ; [#uses=2] + %arrayidx = getelementptr inbounds double, double* %p, i64 %i.03 ; [#uses=1] + store double 0.000000e+00, double* %arrayidx + %inc = add i64 %i.03, 1 ; [#uses=2] + %divx = udiv i64 %n, %denom ; [#uses=1] + %div = add i64 %divx, 1 + %cmp = icmp ult i64 %inc, %div ; [#uses=1] + br i1 %cmp, label %for.body, label %for.end.loopexit + +for.end.loopexit: ; preds = %for.body + br label %for.end + +for.end: ; preds = %for.end.loopexit, %entry + ret void +} + +!0 = !{i64 1, i64 10} Index: llvm/lib/Analysis/ScalarEvolutionExpander.cpp =================================================================== --- llvm/lib/Analysis/ScalarEvolutionExpander.cpp +++ llvm/lib/Analysis/ScalarEvolutionExpander.cpp @@ -2250,10 +2250,6 @@ // only needed when the expression includes some subexpression that is not IV // derived. // -// Currently, we only allow division by a nonzero constant here. If this is -// inadequate, we could easily allow division by SCEVUnknown by using -// ValueTracking to check isKnownNonZero(). -// // We cannot generally expand recurrences unless the step dominates the loop // header. 
The expander handles the special case of affine recurrences by
 // scaling the recurrence outside the loop, but this technique isn't generally
@@ -2268,13 +2264,11 @@
   bool follow(const SCEV *S) {
     if (const SCEVUDivExpr *D = dyn_cast<SCEVUDivExpr>(S)) {
-      const SCEVConstant *SC = dyn_cast<SCEVConstant>(D->getRHS());
-      if (!SC || SC->getValue()->isZero()) {
+      if (!SE.isKnownNonZero(D->getRHS())) {
         IsUnsafe = true;
         return false;
       }
-    }
-    if (const SCEVAddRecExpr *AR = dyn_cast<SCEVAddRecExpr>(S)) {
+    } else if (const SCEVAddRecExpr *AR = dyn_cast<SCEVAddRecExpr>(S)) {
       const SCEV *Step = AR->getStepRecurrence(SE);
       if (!AR->isAffine() && !SE.dominates(Step, AR->getLoop()->getHeader())) {
         IsUnsafe = true;

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D39228.223521.patch
Type: text/x-patch
Size: 3637 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Mon Oct 7 05:27:21 2019
From: llvm-commits at lists.llvm.org (Florian Hahn via Phabricator via llvm-commits)
Date: Mon, 07 Oct 2019 12:27:21 +0000 (UTC)
Subject: [PATCH] D68571: [Remarks] Pass StringBlockValue as StringRef.
Message-ID:

fhahn created this revision.
fhahn added reviewers: thegameg, anemet.
Herald added a subscriber: hiraditya.
Herald added a project: LLVM.

After changing the remark serialization, we now pass StringRefs to the serializer. We should use StringRef for StringBlockVal, to avoid creating temporary objects, which then cause StringBlockVal.Value to point to invalid memory.

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D68571

Files:
  llvm/lib/Remarks/YAMLRemarkSerializer.cpp

Index: llvm/lib/Remarks/YAMLRemarkSerializer.cpp
===================================================================
--- llvm/lib/Remarks/YAMLRemarkSerializer.cpp
+++ llvm/lib/Remarks/YAMLRemarkSerializer.cpp
@@ -103,7 +103,7 @@
 /// newlines in strings.
 struct StringBlockVal {
   StringRef Value;
-  StringBlockVal(const std::string &Value) : Value(Value) {}
+  StringBlockVal(StringRef R) : Value(R) {}
 };

 template <> struct BlockScalarTraits<StringBlockVal> {

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68571.223523.patch
Type: text/x-patch
Size: 469 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Mon Oct 7 05:32:33 2019
From: llvm-commits at lists.llvm.org (Dinar Temirbulatov via Phabricator via llvm-commits)
Date: Mon, 07 Oct 2019 12:32:33 +0000 (UTC)
Subject: [PATCH] D28907: [SLP] Fix for PR30787: Failure to beneficially vectorize 'copyable' elements in integer binary ops.
In-Reply-To:
References:
Message-ID: <0be3ce74bc122da7ad292dc2e4b38368@localhost.localdomain>

This revision was not accepted when it landed; it landed in state "Needs Review".
This revision was automatically updated to reflect the committed changes.
Closed by commit rGe2358b53bc09: [SLPVectorizer] Failure to beneficially vectorize 'copyable' elements in… (authored by dtemirbulatov).
Herald added a subscriber: hiraditya.
Herald added a project: LLVM.

Changed prior to commit:
  https://reviews.llvm.org/D28907?vs=182576&id=223526#toc

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D28907/new/

https://reviews.llvm.org/D28907

Files:
  llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
  llvm/test/Transforms/SLPVectorizer/X86/vect_copyable_in_binops.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D28907.223526.patch Type: text/x-patch Size: 64803 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 05:41:09 2019 From: llvm-commits at lists.llvm.org (Roman via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 12:41:09 +0000 (UTC) Subject: [PATCH] D35761: [Polly][WIP] Use SCEV information for the second level aliasing In-Reply-To: References: Message-ID: This revision was automatically updated to reflect the committed changes. Closed by commit rG1563f039f504: Use SCEV information for the second level aliasing (authored by gareevroman). Herald added a subscriber: javed.absar. Herald added a project: LLVM. Changed prior to commit: https://reviews.llvm.org/D35761?vs=110221&id=223531#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D35761/new/ https://reviews.llvm.org/D35761 Files: polly/include/polly/CodeGen/IRBuilder.h polly/lib/CodeGen/IRBuilder.cpp polly/test/ScheduleOptimizer/kernel_gemm___%for.body---%for.end24.jscop polly/test/ScheduleOptimizer/kernel_gemm___%for.body---%for.end24.jscop.transformed polly/test/ScheduleOptimizer/pattern-matching-based-opts_14.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D35761.223531.patch Type: text/x-patch Size: 10091 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 05:49:39 2019 From: llvm-commits at lists.llvm.org (Thomas Preud'homme via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 12:49:39 +0000 (UTC) Subject: [PATCH] D68472: [test] Depend on C.UTF-8 dependency for mri-utf8.test In-Reply-To: References: Message-ID: thopre updated this revision to Diff 223532. thopre added a comment. 
Use file redirection + FileCheck to test file content Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68472/new/ https://reviews.llvm.org/D68472 Files: llvm/test/tools/llvm-ar/mri-nonascii.test llvm/test/tools/llvm-ar/mri-utf8.test Index: llvm/test/tools/llvm-ar/mri-utf8.test =================================================================== --- llvm/test/tools/llvm-ar/mri-utf8.test +++ /dev/null @@ -1,23 +0,0 @@ -# Test non-ascii archive members -# XFAIL: system-darwin - -RUN: rm -rf %t && mkdir -p %t/extracted - -RUN: echo "contents" > %t/£.txt - -RUN: echo "CREATE %t/mri.ar" > %t/script.mri -RUN: echo "ADDMOD %t/£.txt" >> %t/script.mri -RUN: echo "SAVE" >> %t/script.mri - -RUN: llvm-ar -M < %t/script.mri -RUN: cd %t/extracted && llvm-ar x %t/mri.ar - -# This works around problems launching processess that -# include arguments with non-ascii characters. -# Python on Linux defaults to ASCII encoding unless the -# environment specifies otherwise, so it is explicitly set. -# The reliance the test has on this locale is not ideal, -# however alternate solutions have been difficult due to -# behaviour differences with python 2 vs python 3, -# and linux vs windows. -RUN: env LANG=en_US.UTF-8 %python -c "assert open(u'\U000000A3.txt', 'rb').read() == b'contents\n'" Index: llvm/test/tools/llvm-ar/mri-nonascii.test =================================================================== --- /dev/null +++ llvm/test/tools/llvm-ar/mri-nonascii.test @@ -0,0 +1,19 @@ +# Test non-ascii archive members +# XFAIL: system-darwin + +RUN: rm -rf %t && mkdir -p %t/extracted + +RUN: echo "contents" > %t/£.txt + +RUN: echo "CREATE %t/mri.ar" > %t/script.mri +RUN: echo "ADDMOD %t/£.txt" >> %t/script.mri +RUN: echo "SAVE" >> %t/script.mri + +RUN: llvm-ar -M < %t/script.mri +RUN: cd %t/extracted && llvm-ar x %t/mri.ar + +# Use input redirection to work around problems launching processess that +# include arguments with non-ascii characters. 
+RUN: FileCheck --strict-whitespace %s <£.txt +CHECK:{{^}} +CHECK-SAME:{{^}}contents{{$}} -------------- next part -------------- A non-text attachment was scrubbed... Name: D68472.223532.patch Type: text/x-patch Size: 1809 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 05:49:40 2019 From: llvm-commits at lists.llvm.org (Weiming Zhao via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 12:49:40 +0000 (UTC) Subject: [PATCH] D34918: [libc++] Refactoring __sync_* builtins; NFC In-Reply-To: References: Message-ID: <4a01822d24a47c678393cd6262ea40bf@localhost.localdomain> This revision was not accepted when it landed; it landed in state "Needs Review". This revision was automatically updated to reflect the committed changes. Closed by commit rGf7850fa8b64d: [libc++] Refactoring __sync_* builtins; NFC (Reland) (authored by weimingz). Herald added subscribers: libcxx-commits, jfb, ldionne, christof. Herald added a project: libc++. Changed prior to commit: https://reviews.llvm.org/D34918?vs=105924&id=223533#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D34918/new/ https://reviews.llvm.org/D34918 Files: libcxx/include/__atomic_support libcxx/include/__refstring libcxx/src/locale.cpp libcxx/src/support/runtime/exception_fallback.ipp libcxx/src/support/runtime/new_handler_fallback.ipp -------------- next part -------------- A non-text attachment was scrubbed... Name: D34918.223533.patch Type: text/x-patch Size: 5283 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 05:49:49 2019 From: llvm-commits at lists.llvm.org (Farhana Aleen via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 12:49:49 +0000 (UTC) Subject: [PATCH] D34478: [BasicAliasAnalysis] Allow idAddofNonZero() for values coming from the same loop iteration. 
In-Reply-To: References: Message-ID: <3d3bed2dc1f6372bd58ecf3cf7d60b2c@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rG2ff973f2a5ad: Avoid doing conservative phi checks in aliasSameBasePointerGEPs() if no phis… (authored by Farhana). Herald added a subscriber: hiraditya. Herald added a project: LLVM. Changed prior to commit: https://reviews.llvm.org/D34478?vs=105908&id=223534#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D34478/new/ https://reviews.llvm.org/D34478 Files: llvm/include/llvm/Analysis/BasicAliasAnalysis.h llvm/lib/Analysis/BasicAliasAnalysis.cpp llvm/lib/Analysis/ValueTracking.cpp Index: llvm/lib/Analysis/ValueTracking.cpp =================================================================== --- llvm/lib/Analysis/ValueTracking.cpp +++ llvm/lib/Analysis/ValueTracking.cpp @@ -1873,7 +1873,7 @@ if (Known.countMaxLeadingZeros() < BitWidth - ShiftVal) return true; // Are all the bits to be shifted out known zero? - if (Known.countMinTrailingZeros() >= ShiftVal) + if (Known.isUnknown() || Known.countMinTrailingZeros() >= ShiftVal) return isKnownNonZero(X, Depth, Q); } } Index: llvm/lib/Analysis/BasicAliasAnalysis.cpp =================================================================== --- llvm/lib/Analysis/BasicAliasAnalysis.cpp +++ llvm/lib/Analysis/BasicAliasAnalysis.cpp @@ -922,11 +922,11 @@ /// Provide ad-hoc rules to disambiguate accesses through two GEP operators, /// both having the exact same pointer operand. 
-static AliasResult aliasSameBasePointerGEPs(const GEPOperator *GEP1, - uint64_t V1Size, - const GEPOperator *GEP2, - uint64_t V2Size, - const DataLayout &DL) { +AliasResult BasicAAResult::aliasSameBasePointerGEPs(const GEPOperator *GEP1, + uint64_t V1Size, + const GEPOperator *GEP2, + uint64_t V2Size, + const DataLayout &DL) { assert(GEP1->getPointerOperand()->stripPointerCastsAndBarriers() == GEP2->getPointerOperand()->stripPointerCastsAndBarriers() && @@ -1006,7 +1006,7 @@ // Because they cannot partially overlap and because fields in an array // cannot overlap, if we can prove the final indices are different between // GEP1 and GEP2, we can conclude GEP1 and GEP2 don't alias. - + // If the last indices are constants, we've already checked they don't // equal each other so we can exit early. if (C1 && C2) @@ -1014,11 +1014,15 @@ { Value *GEP1LastIdx = GEP1->getOperand(GEP1->getNumOperands() - 1); Value *GEP2LastIdx = GEP2->getOperand(GEP2->getNumOperands() - 1); - if (isa(GEP1LastIdx) || isa(GEP2LastIdx)) { + if ((isa(GEP1LastIdx) || isa(GEP2LastIdx)) && + !VisitedPhiBBs.empty()) { // If one of the indices is a PHI node, be safe and only use // computeKnownBits so we don't make any assumptions about the // relationships between the two indices. This is important if we're // asking about values from different loop iterations. See PR32314. + // But, with empty visitedPhiBBs we can guarantee that the values are + // from the same iteration. Therefore, we can avoid doing this + // conservative check. // TODO: We may be able to change the check so we only do this when // we definitely looked through a PHINode. 
if (GEP1LastIdx != GEP2LastIdx && Index: llvm/include/llvm/Analysis/BasicAliasAnalysis.h =================================================================== --- llvm/include/llvm/Analysis/BasicAliasAnalysis.h +++ llvm/include/llvm/Analysis/BasicAliasAnalysis.h @@ -183,6 +183,12 @@ uint64_t V2Size, const AAMDNodes &V2AAInfo, const Value *UnderlyingV1, const Value *UnderlyingV2); + AliasResult aliasSameBasePointerGEPs(const GEPOperator *GEP1, + uint64_t V1Size, + const GEPOperator *GEP2, + uint64_t V2Size, + const DataLayout &DL); + AliasResult aliasPHI(const PHINode *PN, uint64_t PNSize, const AAMDNodes &PNAAInfo, const Value *V2, uint64_t V2Size, const AAMDNodes &V2AAInfo, -------------- next part -------------- A non-text attachment was scrubbed... Name: D34478.223534.patch Type: text/x-patch Size: 4057 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 05:50:59 2019 From: llvm-commits at lists.llvm.org (Wei Mi via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 12:50:59 +0000 (UTC) Subject: [PATCH] D34583: [LSR] Narrow search space by filtering non-optimal formulae with the same ScaledReg and Scale. In-Reply-To: References: Message-ID: <4c0205735df7f10dc0d1aa1566e8f3ad@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rG90707394e37f: [LSR] Narrow search space by filtering non-optimal formulae with the same… (authored by wmi). Herald added a subscriber: hiraditya. Herald added a project: LLVM. 
Changed prior to commit: https://reviews.llvm.org/D34583?vs=105443&id=223535#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D34583/new/ https://reviews.llvm.org/D34583 Files: llvm/lib/Transforms/Scalar/LoopStrengthReduce.cpp llvm/test/CodeGen/X86/regalloc-reconcile-broken-hints.ll llvm/test/Transforms/LoopStrengthReduce/2013-01-14-ReuseCast.ll llvm/test/Transforms/LoopStrengthReduce/X86/lsr-filtering-scaledreg.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D34583.223535.patch Type: text/x-patch Size: 9624 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 05:52:07 2019 From: llvm-commits at lists.llvm.org (Thomas Preud'homme via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 12:52:07 +0000 (UTC) Subject: [PATCH] D68472: [test] Depend on C.UTF-8 dependency for mri-utf8.test In-Reply-To: References: Message-ID: <8b1d768a9154b9a5598b0aaf3b556473@localhost.localdomain> thopre added a comment. In D68472#1697316 , @thopre wrote: > Use file redirection + FileCheck to test file content Can people with mac & Windows test this new version works for them? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68472/new/ https://reviews.llvm.org/D68472 From llvm-commits at lists.llvm.org Mon Oct 7 05:57:42 2019 From: llvm-commits at lists.llvm.org (=?utf-8?q?Christian_K=C3=BChnel_via_Phabricator?= via llvm-commits) Date: Mon, 07 Oct 2019 12:57:42 +0000 (UTC) Subject: [PATCH] D68560: Test for the build server -- DO NOT MERGE! In-Reply-To: References: Message-ID: <2106fe4442234de749c4e0ef4767765d@localhost.localdomain> kuhnel added a comment. 
new test comment Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68560/new/ https://reviews.llvm.org/D68560 From llvm-commits at lists.llvm.org Mon Oct 7 06:06:28 2019 From: llvm-commits at lists.llvm.org (Nirav Dave via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 13:06:28 +0000 (UTC) Subject: [PATCH] D30471: [SDAG] Relax conditions under stores of loaded values can be merged In-Reply-To: References: Message-ID: This revision was not accepted when it landed; it landed in state "Needs Revision". This revision was automatically updated to reflect the committed changes. Closed by commit rGa38c049fc5c7: [SDAG] Relax conditions under stores of loaded values can be merged (authored by niravd). Herald added a subscriber: hiraditya. Herald added a project: LLVM. Changed prior to commit: https://reviews.llvm.org/D30471?vs=102187&id=223538#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D30471/new/ https://reviews.llvm.org/D30471 Files: llvm/include/llvm/CodeGen/ISDOpcodes.h llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp llvm/test/CodeGen/X86/merge_store_duplicated_loads.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D30471.223538.patch Type: text/x-patch Size: 5374 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 06:07:19 2019 From: llvm-commits at lists.llvm.org (Nico Weber via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 13:07:19 +0000 (UTC) Subject: [PATCH] D68572: gn build: use better triple on windows Message-ID: thakis created this revision. thakis added a reviewer: hans. Herald added a project: LLVM. The CMake build uses "x86_64-pc-windows-msvc". The "-msvc" suffix is important because e.g. clang/test/lit.cfg.py matches against the suffix "windows-msvc" to compute the presence of the "ms-sdk" and the absence of the "LP64" feature. 
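To illustrate why the suffix matters, here is a minimal sketch of a suffix test on a triple string. It is written in C++ for illustration only: the actual check lives in clang/test/lit.cfg.py and is not reproduced here, and `hasSuffix` and `looksLikeMsvcTriple` are hypothetical helper names.

```cpp
#include <cassert>
#include <string>

// Returns true if S ends with Suffix (hypothetical helper).
bool hasSuffix(const std::string &S, const std::string &Suffix) {
  return S.size() >= Suffix.size() &&
         S.compare(S.size() - Suffix.size(), Suffix.size(), Suffix) == 0;
}

// Assumed stand-in for a "windows-msvc" suffix test on a target triple.
bool looksLikeMsvcTriple(const std::string &Triple) {
  return hasSuffix(Triple, "windows-msvc");
}
```

Under this sketch the old gn value "x86_64-pc-windows" does not pass the test, while the CMake-style value "x86_64-pc-windows-msvc" does.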
https://reviews.llvm.org/D68572 Files: llvm/utils/gn/secondary/llvm/triples.gni Index: llvm/utils/gn/secondary/llvm/triples.gni =================================================================== --- llvm/utils/gn/secondary/llvm/triples.gni +++ llvm/utils/gn/secondary/llvm/triples.gni @@ -10,7 +10,7 @@ } else if (current_os == "mac") { llvm_current_triple = "x86_64-apple-darwin" } else if (current_os == "win") { - llvm_current_triple = "x86_64-pc-windows" + llvm_current_triple = "x86_64-pc-windows-msvc" } } else if (current_cpu == "arm64") { if (current_os == "android") { -------------- next part -------------- A non-text attachment was scrubbed... Name: D68572.223539.patch Type: text/x-patch Size: 523 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 06:07:24 2019 From: llvm-commits at lists.llvm.org (Florian Hahn via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 13:07:24 +0000 (UTC) Subject: [PATCH] D68573: [LoopRotate] Unconditionally get ScalarEvolution. Message-ID: fhahn created this revision. fhahn added reviewers: anemet, asbirlea. Herald added a subscriber: hiraditya. Herald added a project: LLVM. LoopRotate is a loop pass and SE should always be available. Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D68573 Files: llvm/lib/Transforms/Scalar/LoopRotation.cpp Index: llvm/lib/Transforms/Scalar/LoopRotation.cpp =================================================================== --- llvm/lib/Transforms/Scalar/LoopRotation.cpp +++ llvm/lib/Transforms/Scalar/LoopRotation.cpp @@ -96,15 +96,14 @@ auto *AC = &getAnalysis().getAssumptionCache(F); auto *DTWP = getAnalysisIfAvailable(); auto *DT = DTWP ? &DTWP->getDomTree() : nullptr; - auto *SEWP = getAnalysisIfAvailable(); - auto *SE = SEWP ? 
&SEWP->getSE() : nullptr; + auto &SE = getAnalysis().getSE(); const SimplifyQuery SQ = getBestSimplifyQuery(*this, F); Optional MSSAU; if (EnableMSSALoopDependency) { MemorySSA *MSSA = &getAnalysis().getMSSA(); MSSAU = MemorySSAUpdater(MSSA); } - return LoopRotation(L, LI, TTI, AC, DT, SE, + return LoopRotation(L, LI, TTI, AC, DT, &SE, MSSAU.hasValue() ? MSSAU.getPointer() : nullptr, SQ, false, MaxHeaderSize, false); } -------------- next part -------------- A non-text attachment was scrubbed... Name: D68573.223540.patch Type: text/x-patch Size: 1119 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 06:07:31 2019 From: llvm-commits at lists.llvm.org (Simon Pilgrim via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 13:07:31 +0000 (UTC) Subject: [PATCH] D28907: [SLP] Fix for PR30787: Failure to beneficially vectorize 'copyable' elements in integer binary ops. In-Reply-To: References: Message-ID: <7e92f5bf237ffea4dda8a8f6fe1ac6a9@localhost.localdomain> RKSimon reopened this revision. RKSimon added a comment. reopening - phab seems to be a bit broken Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D28907/new/ https://reviews.llvm.org/D28907 From llvm-commits at lists.llvm.org Mon Oct 7 06:08:52 2019 From: llvm-commits at lists.llvm.org (=?utf-8?q?Christian_K=C3=BChnel_via_Phabricator?= via llvm-commits) Date: Mon, 07 Oct 2019 13:08:52 +0000 (UTC) Subject: [PATCH] D68560: Test for the build server -- DO NOT MERGE! In-Reply-To: References: Message-ID: kuhnel added a comment. Build FAILED!. 
ninja_check_all.log Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68560/new/ https://reviews.llvm.org/D68560 From llvm-commits at lists.llvm.org Mon Oct 7 06:09:00 2019 From: llvm-commits at lists.llvm.org (Roman Lebedev via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 13:09:00 +0000 (UTC) Subject: [PATCH] D68470: [InstCombine][NFC] dropRedundantMaskingOfLeftShiftInput(): change how we deal with mask In-Reply-To: References: Message-ID: <398ecc2a35d73b14d00945d6915a3256@localhost.localdomain> lebedev.ri added a comment. In D68470#1697275 , @spatel wrote: > The diff as shown includes D68239 rather than building on top of it? Commit the other patch and rebase, so we are current with trunk? No, this is fully properly rebased. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68470/new/ https://reviews.llvm.org/D68470 From llvm-commits at lists.llvm.org Mon Oct 7 06:09:16 2019 From: llvm-commits at lists.llvm.org (Hubert Tong via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 13:09:16 +0000 (UTC) Subject: [PATCH] D67882: [LNT] Python 3 support: make leaf classes inherit from object In-Reply-To: References: Message-ID: hubert.reinterpretcast added a comment. Perhaps others are more well-versed in this than I am, but I think that having a link in the commit message and using the terminology used by the documentation ("new-style" and "classic") would be useful here: https://docs.python.org/2/reference/datamodel.html#newstyle. Also, I am not sure that switching these to be new-style classes in Python 2 is necessary. I believe the commit message should give additional rationale, e.g., using new-style classes helps make the Python 2 and Python 3 behaviour of the code more similar. 
================ Comment at: lnt/server/reporting/analysis.py:111 - @property - def stddev_mean(self): ---------------- This seems to work fine if the name of the caching variable and the name of the property is not the same. ================ Comment at: lnt/server/reporting/analysis.py:121 + + stddev_mean = property(__get_stddev_mean, __set_stddev_mean) ---------------- This makes it possible to assign to the property, which was read-only before this change. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67882/new/ https://reviews.llvm.org/D67882 From llvm-commits at lists.llvm.org Mon Oct 7 06:10:12 2019 From: llvm-commits at lists.llvm.org (=?utf-8?q?Christian_K=C3=BChnel_via_Phabricator?= via llvm-commits) Date: Mon, 07 Oct 2019 13:10:12 +0000 (UTC) Subject: [PATCH] D68560: Test for the build server -- DO NOT MERGE! In-Reply-To: References: Message-ID: <460cadcf8d086e5150c496812ea9eb24@localhost.localdomain> kuhnel added a comment. Build FAILED! cmake.log ninja_check_all.log Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68560/new/ https://reviews.llvm.org/D68560 From llvm-commits at lists.llvm.org Mon Oct 7 06:10:33 2019 From: llvm-commits at lists.llvm.org (=?utf-8?q?Christian_K=C3=BChnel_via_Phabricator?= via llvm-commits) Date: Mon, 07 Oct 2019 13:10:33 +0000 (UTC) Subject: [PATCH] D68560: Test for the build server -- DO NOT MERGE! In-Reply-To: References: Message-ID: <098e3c927826f54d77030f3a1919a5fe@localhost.localdomain> kuhnel added a comment. Build SUCCESSFUL! 
cmake.log ninja_check_all.log Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68560/new/ https://reviews.llvm.org/D68560 From llvm-commits at lists.llvm.org Mon Oct 7 06:13:32 2019 From: llvm-commits at lists.llvm.org (Nico Weber via llvm-commits) Date: Mon, 07 Oct 2019 13:13:32 -0000 Subject: [llvm] r373899 - gn build: use better triple on windows Message-ID: <20191007131332.31CF28BB69@lists.llvm.org> Author: nico Date: Mon Oct 7 06:13:31 2019 New Revision: 373899 URL: http://llvm.org/viewvc/llvm-project?rev=373899&view=rev Log: gn build: use better triple on windows The CMake build uses "x86_64-pc-windows-msvc". The "-msvc" suffix is important because e.g. clang/test/lit.cfg.py matches against the suffix "windows-msvc" to compute the presence of the "ms-sdk" and the absence of the "LP64" feature. Differential Revision: https://reviews.llvm.org/D68572 Modified: llvm/trunk/utils/gn/secondary/llvm/triples.gni Modified: llvm/trunk/utils/gn/secondary/llvm/triples.gni URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/gn/secondary/llvm/triples.gni?rev=373899&r1=373898&r2=373899&view=diff ============================================================================== --- llvm/trunk/utils/gn/secondary/llvm/triples.gni (original) +++ llvm/trunk/utils/gn/secondary/llvm/triples.gni Mon Oct 7 06:13:31 2019 @@ -10,7 +10,7 @@ if (current_cpu == "x86") { } else if (current_os == "mac") { llvm_current_triple = "x86_64-apple-darwin" } else if (current_os == "win") { - llvm_current_triple = "x86_64-pc-windows" + llvm_current_triple = "x86_64-pc-windows-msvc" } } else if (current_cpu == "arm64") { if (current_os == "android") { From llvm-commits at lists.llvm.org Mon Oct 7 06:14:06 2019 From: llvm-commits at lists.llvm.org (Eli Friedman via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 13:14:06 +0000 (UTC) Subject: [PATCH] D32239: [SCEV] Make SCEV or modeling more aggressive. 
In-Reply-To: References: Message-ID: <1eb80dd8315bca87db6dd2c8664730ef@localhost.localdomain> This revision was not accepted when it landed; it landed in state "Changes Planned". This revision was automatically updated to reflect the committed changes. Closed by commit rGe77d2b86b478: [SCEV] Make SCEV or modeling more aggressive. (authored by efriedma). Herald added a subscriber: hiraditya. Herald added a project: LLVM. Changed prior to commit: https://reviews.llvm.org/D32239?vs=95808&id=223542#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D32239/new/ https://reviews.llvm.org/D32239 Files: llvm/lib/Analysis/ScalarEvolution.cpp llvm/test/Analysis/ScalarEvolution/or-as-add.ll Index: llvm/test/Analysis/ScalarEvolution/or-as-add.ll =================================================================== --- /dev/null +++ llvm/test/Analysis/ScalarEvolution/or-as-add.ll @@ -0,0 +1,38 @@ +; RUN: opt < %s -analyze -scalar-evolution | FileCheck %s + +declare void @z(i32) +declare void @z2(i64) + +define void @fun(i1 %bool, i32 %x) { +entry: + br label %body +body: + %i = phi i32 [ 0, %entry ], [ %i.next, %body ] + %bottom_zero = mul i32 %i, 2 + %a = or i32 %bottom_zero, 1 + call void @z(i32 %a) + %bool_ext = zext i1 %bool to i32 + %b = or i32 %bool_ext, %bottom_zero + call void @z(i32 %b) + %shifted = lshr i32 %x, 31 + %c = or i32 %shifted, %bottom_zero + call void @z(i32 %c) + %i_ext = zext i32 %i to i64 + %d = or i64 %i_ext, 4294967296 + call void @z2(i64 %d) + %i.next = add i32 %i, 1 + %cond = icmp eq i32 %i.next, 10 + br i1 %cond, label %exit, label %body +exit: + ret void +} + +; CHECK: %a = or i32 %bottom_zero, 1 +; CHECK-NEXT: --> {1,+,2}<%body> +; CHECK: %b = or i32 %bool_ext, %bottom_zero +; CHECK-NEXT: --> {(zext i1 %bool to i32),+,2} +; CHECK: %c = or i32 %shifted, %bottom_zero +; CHECK-NEXT: --> {(%x /u -2147483648),+,2}<%body> +; CHECK: %d = or i64 %i_ext, 4294967296 +; CHECK-NEXT: --> {4294967296,+,1}<%body> + Index: 
llvm/lib/Analysis/ScalarEvolution.cpp =================================================================== --- llvm/lib/Analysis/ScalarEvolution.cpp +++ llvm/lib/Analysis/ScalarEvolution.cpp @@ -5328,28 +5328,12 @@ break; case Instruction::Or: - // If the RHS of the Or is a constant, we may have something like: - // X*4+1 which got turned into X*4|1. Handle this as an Add so loop - // optimizations will transparently handle this case. - // - // In order for this transformation to be safe, the LHS must be of the - // form X*(2^n) and the Or constant must be less than 2^n. - if (ConstantInt *CI = dyn_cast<ConstantInt>(BO->RHS)) { - const SCEV *LHS = getSCEV(BO->LHS); - const APInt &CIVal = CI->getValue(); - if (GetMinTrailingZeros(LHS) >= - (CIVal.getBitWidth() - CIVal.countLeadingZeros())) { - // Build a plain add SCEV. - const SCEV *S = getAddExpr(LHS, getSCEV(CI)); - // If the LHS of the add was an addrec and it has no-wrap flags, - // transfer the no-wrap flags, since an or won't introduce a wrap. - if (const SCEVAddRecExpr *NewAR = dyn_cast<SCEVAddRecExpr>(S)) { - const SCEVAddRecExpr *OldAR = cast<SCEVAddRecExpr>(LHS); - const_cast<SCEVAddRecExpr *>(NewAR)->setNoWrapFlags( - OldAR->getNoWrapFlags()); - } - return S; - } + // Use ValueTracking to check whether this is actually an add. + if (haveNoCommonBitsSet(BO->LHS, BO->RHS, getDataLayout(), &AC, + nullptr, &DT)) { + // There aren't any common bits set, so the add can't wrap. + auto Flags = SCEV::NoWrapFlags(SCEV::FlagNUW | SCEV::FlagNSW); + return getAddExpr(getSCEV(BO->LHS), getSCEV(BO->RHS), Flags); } break; From llvm-commits at lists.llvm.org Mon Oct 7 06:20:01 2019 From: llvm-commits at lists.llvm.org (Kevin P.
Neal via llvm-commits) Date: Mon, 07 Oct 2019 13:20:01 -0000 Subject: [llvm] r373900 - [FPEnv] Add constrained intrinsics for lrint and lround Message-ID: <20191007132001.469A98B9E5@lists.llvm.org> Author: kpn Date: Mon Oct 7 06:20:00 2019 New Revision: 373900 URL: http://llvm.org/viewvc/llvm-project?rev=373900&view=rev Log: [FPEnv] Add constrained intrinsics for lrint and lround Earlier in the year intrinsics for lrint, llrint, lround and llround were added to llvm. The constrained versions are now implemented here. Reviewed by: andrew.w.kaylor, craig.topper, cameron.mcinally Approved by: craig.topper Differential Revision: https://reviews.llvm.org/D64746 Modified: llvm/trunk/docs/LangRef.rst llvm/trunk/include/llvm/CodeGen/ISDOpcodes.h llvm/trunk/include/llvm/CodeGen/SelectionDAGNodes.h llvm/trunk/include/llvm/CodeGen/TargetLowering.h llvm/trunk/include/llvm/IR/IntrinsicInst.h llvm/trunk/include/llvm/IR/Intrinsics.td llvm/trunk/include/llvm/Target/TargetSelectionDAG.td llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp llvm/trunk/lib/CodeGen/TargetLoweringBase.cpp llvm/trunk/lib/IR/IntrinsicInst.cpp llvm/trunk/lib/IR/Verifier.cpp llvm/trunk/test/CodeGen/X86/fp-intrinsics.ll llvm/trunk/test/Feature/fp-intrinsics.ll Modified: llvm/trunk/docs/LangRef.rst URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/docs/LangRef.rst?rev=373900&r1=373899&r2=373900&view=diff ============================================================================== --- llvm/trunk/docs/LangRef.rst (original) +++ llvm/trunk/docs/LangRef.rst Mon Oct 7 06:20:00 2019 @@ -15940,6 +15940,102 @@ mode is determined by the runtime floati mode argument is only intended as information to the compiler. 
+'``llvm.experimental.constrained.lrint``' Intrinsic +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Syntax: +""""""" + +:: + + declare + @llvm.experimental.constrained.lrint( , + metadata , + metadata ) + +Overview: +""""""""" + +The '``llvm.experimental.constrained.lrint``' intrinsic returns the first +operand rounded to the nearest integer. An inexact floating-point exception +will be raised if the operand is not an integer. An invalid exception is +raised if the result is too large to fit into a supported integer type, +and in this case the result is undefined. + +Arguments: +"""""""""" + +The first argument is a floating-point number. The return value is an +integer type. Not all types are supported on all targets. The supported +types are the same as the ``llvm.lrint`` intrinsic and the ``lrint`` +libm functions. + +The second and third arguments specify the rounding mode and exception +behavior as described above. + +Semantics: +"""""""""" + +This function returns the same values as the libm ``lrint`` functions +would, and handles error conditions in the same way. + +The rounding mode is described, not determined, by the rounding mode +argument. The actual rounding mode is determined by the runtime floating-point +environment. The rounding mode argument is only intended as information +to the compiler. + +If the runtime floating-point environment is using the default rounding mode +then the results will be the same as the llvm.lrint intrinsic. + + +'``llvm.experimental.constrained.llrint``' Intrinsic +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Syntax: +""""""" + +:: + + declare + @llvm.experimental.constrained.llrint( , + metadata , + metadata ) + +Overview: +""""""""" + +The '``llvm.experimental.constrained.llrint``' intrinsic returns the first +operand rounded to the nearest integer. An inexact floating-point exception +will be raised if the operand is not an integer. 
An invalid exception is +raised if the result is too large to fit into a supported integer type, +and in this case the result is undefined. + +Arguments: +"""""""""" + +The first argument is a floating-point number. The return value is an +integer type. Not all types are supported on all targets. The supported +types are the same as the ``llvm.llrint`` intrinsic and the ``llrint`` +libm functions. + +The second and third arguments specify the rounding mode and exception +behavior as described above. + +Semantics: +"""""""""" + +This function returns the same values as the libm ``llrint`` functions +would, and handles error conditions in the same way. + +The rounding mode is described, not determined, by the rounding mode +argument. The actual rounding mode is determined by the runtime floating-point +environment. The rounding mode argument is only intended as information +to the compiler. + +If the runtime floating-point environment is using the default rounding mode +then the results will be the same as the llvm.llrint intrinsic. + + '``llvm.experimental.constrained.nearbyint``' Intrinsic ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ @@ -16162,6 +16258,82 @@ This function returns the same values as would and handles error conditions in the same way. +'``llvm.experimental.constrained.lround``' Intrinsic +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Syntax: +""""""" + +:: + + declare + @llvm.experimental.constrained.lround( , + metadata ) + +Overview: +""""""""" + +The '``llvm.experimental.constrained.lround``' intrinsic returns the first +operand rounded to the nearest integer with ties away from zero. It will +raise an inexact floating-point exception if the operand is not an integer. +An invalid exception is raised if the result is too large to fit into a +supported integer type, and in this case the result is undefined. + +Arguments: +"""""""""" + +The first argument is a floating-point number. The return value is an +integer type. 
Not all types are supported on all targets. The supported +types are the same as the ``llvm.lround`` intrinsic and the ``lround`` +libm functions. + +The second argument specifies the exception behavior as described above. + +Semantics: +"""""""""" + +This function returns the same values as the libm ``lround`` functions +would and handles error conditions in the same way. + + +'``llvm.experimental.constrained.llround``' Intrinsic +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Syntax: +""""""" + +:: + + declare + @llvm.experimental.constrained.llround( , + metadata ) + +Overview: +""""""""" + +The '``llvm.experimental.constrained.llround``' intrinsic returns the first +operand rounded to the nearest integer with ties away from zero. It will +raise an inexact floating-point exception if the operand is not an integer. +An invalid exception is raised if the result is too large to fit into a +supported integer type, and in this case the result is undefined. + +Arguments: +"""""""""" + +The first argument is a floating-point number. The return value is an +integer type. Not all types are supported on all targets. The supported +types are the same as the ``llvm.llround`` intrinsic and the ``llround`` +libm functions. + +The second argument specifies the exception behavior as described above. + +Semantics: +"""""""""" + +This function returns the same values as the libm ``llround`` functions +would and handles error conditions in the same way. 
+ + '``llvm.experimental.constrained.trunc``' Intrinsic ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Modified: llvm/trunk/include/llvm/CodeGen/ISDOpcodes.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/CodeGen/ISDOpcodes.h?rev=373900&r1=373899&r2=373900&view=diff ============================================================================== --- llvm/trunk/include/llvm/CodeGen/ISDOpcodes.h (original) +++ llvm/trunk/include/llvm/CodeGen/ISDOpcodes.h Mon Oct 7 06:20:00 2019 @@ -301,6 +301,7 @@ namespace ISD { STRICT_FEXP, STRICT_FEXP2, STRICT_FLOG, STRICT_FLOG10, STRICT_FLOG2, STRICT_FRINT, STRICT_FNEARBYINT, STRICT_FMAXNUM, STRICT_FMINNUM, STRICT_FCEIL, STRICT_FFLOOR, STRICT_FROUND, STRICT_FTRUNC, + STRICT_LROUND, STRICT_LLROUND, STRICT_LRINT, STRICT_LLRINT, /// STRICT_FP_TO_[US]INT - Convert a floating point value to a signed or /// unsigned integer. These have the same semantics as fptosi and fptoui Modified: llvm/trunk/include/llvm/CodeGen/SelectionDAGNodes.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/CodeGen/SelectionDAGNodes.h?rev=373900&r1=373899&r2=373900&view=diff ============================================================================== --- llvm/trunk/include/llvm/CodeGen/SelectionDAGNodes.h (original) +++ llvm/trunk/include/llvm/CodeGen/SelectionDAGNodes.h Mon Oct 7 06:20:00 2019 @@ -701,12 +701,16 @@ public: case ISD::STRICT_FLOG: case ISD::STRICT_FLOG10: case ISD::STRICT_FLOG2: + case ISD::STRICT_LRINT: + case ISD::STRICT_LLRINT: case ISD::STRICT_FRINT: case ISD::STRICT_FNEARBYINT: case ISD::STRICT_FMAXNUM: case ISD::STRICT_FMINNUM: case ISD::STRICT_FCEIL: case ISD::STRICT_FFLOOR: + case ISD::STRICT_LROUND: + case ISD::STRICT_LLROUND: case ISD::STRICT_FROUND: case ISD::STRICT_FTRUNC: case ISD::STRICT_FP_TO_SINT: Modified: llvm/trunk/include/llvm/CodeGen/TargetLowering.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/CodeGen/TargetLowering.h?rev=373900&r1=373899&r2=373900&view=diff 
============================================================================== --- llvm/trunk/include/llvm/CodeGen/TargetLowering.h (original) +++ llvm/trunk/include/llvm/CodeGen/TargetLowering.h Mon Oct 7 06:20:00 2019 @@ -953,12 +953,16 @@ public: case ISD::STRICT_FLOG: EqOpc = ISD::FLOG; break; case ISD::STRICT_FLOG10: EqOpc = ISD::FLOG10; break; case ISD::STRICT_FLOG2: EqOpc = ISD::FLOG2; break; + case ISD::STRICT_LRINT: EqOpc = ISD::LRINT; break; + case ISD::STRICT_LLRINT: EqOpc = ISD::LLRINT; break; case ISD::STRICT_FRINT: EqOpc = ISD::FRINT; break; case ISD::STRICT_FNEARBYINT: EqOpc = ISD::FNEARBYINT; break; case ISD::STRICT_FMAXNUM: EqOpc = ISD::FMAXNUM; break; case ISD::STRICT_FMINNUM: EqOpc = ISD::FMINNUM; break; case ISD::STRICT_FCEIL: EqOpc = ISD::FCEIL; break; case ISD::STRICT_FFLOOR: EqOpc = ISD::FFLOOR; break; + case ISD::STRICT_LROUND: EqOpc = ISD::LROUND; break; + case ISD::STRICT_LLROUND: EqOpc = ISD::LLROUND; break; case ISD::STRICT_FROUND: EqOpc = ISD::FROUND; break; case ISD::STRICT_FTRUNC: EqOpc = ISD::FTRUNC; break; case ISD::STRICT_FP_TO_SINT: EqOpc = ISD::FP_TO_SINT; break; Modified: llvm/trunk/include/llvm/IR/IntrinsicInst.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/IR/IntrinsicInst.h?rev=373900&r1=373899&r2=373900&view=diff ============================================================================== --- llvm/trunk/include/llvm/IR/IntrinsicInst.h (original) +++ llvm/trunk/include/llvm/IR/IntrinsicInst.h Mon Oct 7 06:20:00 2019 @@ -273,12 +273,16 @@ namespace llvm { case Intrinsic::experimental_constrained_log: case Intrinsic::experimental_constrained_log10: case Intrinsic::experimental_constrained_log2: + case Intrinsic::experimental_constrained_lrint: + case Intrinsic::experimental_constrained_llrint: case Intrinsic::experimental_constrained_rint: case Intrinsic::experimental_constrained_nearbyint: case Intrinsic::experimental_constrained_maxnum: case Intrinsic::experimental_constrained_minnum: case 
Intrinsic::experimental_constrained_ceil: case Intrinsic::experimental_constrained_floor: + case Intrinsic::experimental_constrained_lround: + case Intrinsic::experimental_constrained_llround: case Intrinsic::experimental_constrained_round: case Intrinsic::experimental_constrained_trunc: return true; Modified: llvm/trunk/include/llvm/IR/Intrinsics.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/IR/Intrinsics.td?rev=373900&r1=373899&r2=373900&view=diff ============================================================================== --- llvm/trunk/include/llvm/IR/Intrinsics.td (original) +++ llvm/trunk/include/llvm/IR/Intrinsics.td Mon Oct 7 06:20:00 2019 @@ -703,6 +703,14 @@ let IntrProperties = [IntrInaccessibleMe [ LLVMMatchType<0>, llvm_metadata_ty, llvm_metadata_ty ]>; + def int_experimental_constrained_lrint : Intrinsic<[ llvm_anyint_ty ], + [ llvm_anyfloat_ty, + llvm_metadata_ty, + llvm_metadata_ty ]>; + def int_experimental_constrained_llrint : Intrinsic<[ llvm_anyint_ty ], + [ llvm_anyfloat_ty, + llvm_metadata_ty, + llvm_metadata_ty ]>; def int_experimental_constrained_maxnum : Intrinsic<[ llvm_anyfloat_ty ], [ LLVMMatchType<0>, LLVMMatchType<0>, @@ -721,6 +729,12 @@ let IntrProperties = [IntrInaccessibleMe [ LLVMMatchType<0>, llvm_metadata_ty, llvm_metadata_ty ]>; + def int_experimental_constrained_lround : Intrinsic<[ llvm_anyint_ty ], + [ llvm_anyfloat_ty, + llvm_metadata_ty ]>; + def int_experimental_constrained_llround : Intrinsic<[ llvm_anyint_ty ], + [ llvm_anyfloat_ty, + llvm_metadata_ty ]>; def int_experimental_constrained_round : Intrinsic<[ llvm_anyfloat_ty ], [ LLVMMatchType<0>, llvm_metadata_ty, Modified: llvm/trunk/include/llvm/Target/TargetSelectionDAG.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Target/TargetSelectionDAG.td?rev=373900&r1=373899&r2=373900&view=diff ============================================================================== --- llvm/trunk/include/llvm/Target/TargetSelectionDAG.td 
(original) +++ llvm/trunk/include/llvm/Target/TargetSelectionDAG.td Mon Oct 7 06:20:00 2019 @@ -506,12 +506,20 @@ def strict_flog2 : SDNode<"ISD::STR SDTFPUnaryOp, [SDNPHasChain]>; def strict_frint : SDNode<"ISD::STRICT_FRINT", SDTFPUnaryOp, [SDNPHasChain]>; +def strict_lrint : SDNode<"ISD::STRICT_LRINT", + SDTFPToIntOp, [SDNPHasChain]>; +def strict_llrint : SDNode<"ISD::STRICT_LLRINT", + SDTFPToIntOp, [SDNPHasChain]>; def strict_fnearbyint : SDNode<"ISD::STRICT_FNEARBYINT", SDTFPUnaryOp, [SDNPHasChain]>; def strict_fceil : SDNode<"ISD::STRICT_FCEIL", SDTFPUnaryOp, [SDNPHasChain]>; def strict_ffloor : SDNode<"ISD::STRICT_FFLOOR", SDTFPUnaryOp, [SDNPHasChain]>; +def strict_lround : SDNode<"ISD::STRICT_LROUND", + SDTFPToIntOp, [SDNPHasChain]>; +def strict_llround : SDNode<"ISD::STRICT_LLROUND", + SDTFPToIntOp, [SDNPHasChain]>; def strict_fround : SDNode<"ISD::STRICT_FROUND", SDTFPUnaryOp, [SDNPHasChain]>; def strict_ftrunc : SDNode<"ISD::STRICT_FTRUNC", @@ -1339,6 +1347,12 @@ def any_flog2 : PatFrags<(ops node: def any_frint : PatFrags<(ops node:$src), [(strict_frint node:$src), (frint node:$src)]>; +def any_lrint : PatFrags<(ops node:$src), + [(strict_lrint node:$src), + (lrint node:$src)]>; +def any_llrint : PatFrags<(ops node:$src), + [(strict_llrint node:$src), + (llrint node:$src)]>; def any_fnearbyint : PatFrags<(ops node:$src), [(strict_fnearbyint node:$src), (fnearbyint node:$src)]>; @@ -1348,6 +1362,12 @@ def any_fceil : PatFrags<(ops node: def any_ffloor : PatFrags<(ops node:$src), [(strict_ffloor node:$src), (ffloor node:$src)]>; +def any_lround : PatFrags<(ops node:$src), + [(strict_lround node:$src), + (lround node:$src)]>; +def any_llround : PatFrags<(ops node:$src), + [(strict_llround node:$src), + (llround node:$src)]>; def any_fround : PatFrags<(ops node:$src), [(strict_fround node:$src), (fround node:$src)]>; Modified: llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp URL: 
http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp?rev=373900&r1=373899&r2=373900&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp (original) +++ llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp Mon Oct 7 06:20:00 2019 @@ -1103,6 +1103,16 @@ void SelectionDAGLegalize::LegalizeOp(SD return; } break; + case ISD::STRICT_LRINT: + case ISD::STRICT_LLRINT: + case ISD::STRICT_LROUND: + case ISD::STRICT_LLROUND: + // These pseudo-ops are the same as the other STRICT_ ops except + // they are registered with setOperationAction() using the input type + // instead of the output type. + Action = TLI.getStrictFPOperationAction(Node->getOpcode(), + Node->getOperand(1).getValueType()); + break; case ISD::SADDSAT: case ISD::UADDSAT: case ISD::SSUBSAT: @@ -2141,6 +2151,9 @@ SDValue SelectionDAGLegalize::ExpandArgF RTLIB::Libcall Call_F80, RTLIB::Libcall Call_F128, RTLIB::Libcall Call_PPCF128) { + if (Node->isStrictFPOpcode()) + Node = DAG.mutateStrictFPToFP(Node); + RTLIB::Libcall LC; switch (Node->getOperand(0).getValueType().getSimpleVT().SimpleTy) { default: llvm_unreachable("Unexpected request for libcall!"); @@ -2895,30 +2908,6 @@ bool SelectionDAGLegalize::ExpandNode(SD return true; } break; - case ISD::LROUND: - Results.push_back(ExpandArgFPLibCall(Node, RTLIB::LROUND_F32, - RTLIB::LROUND_F64, RTLIB::LROUND_F80, - RTLIB::LROUND_F128, - RTLIB::LROUND_PPCF128)); - break; - case ISD::LLROUND: - Results.push_back(ExpandArgFPLibCall(Node, RTLIB::LLROUND_F32, - RTLIB::LLROUND_F64, RTLIB::LLROUND_F80, - RTLIB::LLROUND_F128, - RTLIB::LLROUND_PPCF128)); - break; - case ISD::LRINT: - Results.push_back(ExpandArgFPLibCall(Node, RTLIB::LRINT_F32, - RTLIB::LRINT_F64, RTLIB::LRINT_F80, - RTLIB::LRINT_F128, - RTLIB::LRINT_PPCF128)); - break; - case ISD::LLRINT: - Results.push_back(ExpandArgFPLibCall(Node, RTLIB::LLRINT_F32, - RTLIB::LLRINT_F64, 
RTLIB::LLRINT_F80, - RTLIB::LLRINT_F128, - RTLIB::LLRINT_PPCF128)); - break; case ISD::VAARG: Results.push_back(DAG.expandVAArg(Node)); Results.push_back(Results[0].getValue(1)); @@ -3712,10 +3701,25 @@ bool SelectionDAGLegalize::ExpandNode(SD // the "strict" properties. For now, we just fall back to the non-strict // version if that is legal on the target. The actual mutation of the // operation will happen in SelectionDAGISel::DoInstructionSelection. - if (TLI.getStrictFPOperationAction(Node->getOpcode(), - Node->getValueType(0)) - == TargetLowering::Legal) - return true; + switch (Node->getOpcode()) { + default: + if (TLI.getStrictFPOperationAction(Node->getOpcode(), + Node->getValueType(0)) + == TargetLowering::Legal) + return true; + break; + case ISD::STRICT_LRINT: + case ISD::STRICT_LLRINT: + case ISD::STRICT_LROUND: + case ISD::STRICT_LLROUND: + // These are registered by the operand type instead of the value + // type. Reflect that here. + if (TLI.getStrictFPOperationAction(Node->getOpcode(), + Node->getOperand(1).getValueType()) + == TargetLowering::Legal) + return true; + break; + } } // Replace the original node with the legalized result. 
@@ -3959,6 +3963,34 @@ void SelectionDAGLegalize::ConvertNodeTo RTLIB::POW_F80, RTLIB::POW_F128, RTLIB::POW_PPCF128)); break; + case ISD::LROUND: + case ISD::STRICT_LROUND: + Results.push_back(ExpandArgFPLibCall(Node, RTLIB::LROUND_F32, + RTLIB::LROUND_F64, RTLIB::LROUND_F80, + RTLIB::LROUND_F128, + RTLIB::LROUND_PPCF128)); + break; + case ISD::LLROUND: + case ISD::STRICT_LLROUND: + Results.push_back(ExpandArgFPLibCall(Node, RTLIB::LLROUND_F32, + RTLIB::LLROUND_F64, RTLIB::LLROUND_F80, + RTLIB::LLROUND_F128, + RTLIB::LLROUND_PPCF128)); + break; + case ISD::LRINT: + case ISD::STRICT_LRINT: + Results.push_back(ExpandArgFPLibCall(Node, RTLIB::LRINT_F32, + RTLIB::LRINT_F64, RTLIB::LRINT_F80, + RTLIB::LRINT_F128, + RTLIB::LRINT_PPCF128)); + break; + case ISD::LLRINT: + case ISD::STRICT_LLRINT: + Results.push_back(ExpandArgFPLibCall(Node, RTLIB::LLRINT_F32, + RTLIB::LLRINT_F64, RTLIB::LLRINT_F80, + RTLIB::LLRINT_F128, + RTLIB::LLRINT_PPCF128)); + break; case ISD::FDIV: Results.push_back(ExpandFPLibCall(Node, RTLIB::DIV_F32, RTLIB::DIV_F64, RTLIB::DIV_F80, RTLIB::DIV_F128, Modified: llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp?rev=373900&r1=373899&r2=373900&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp (original) +++ llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp Mon Oct 7 06:20:00 2019 @@ -7756,12 +7756,16 @@ SDNode* SelectionDAG::mutateStrictFPToFP case ISD::STRICT_FLOG: NewOpc = ISD::FLOG; break; case ISD::STRICT_FLOG10: NewOpc = ISD::FLOG10; break; case ISD::STRICT_FLOG2: NewOpc = ISD::FLOG2; break; + case ISD::STRICT_LRINT: NewOpc = ISD::LRINT; break; + case ISD::STRICT_LLRINT: NewOpc = ISD::LLRINT; break; case ISD::STRICT_FRINT: NewOpc = ISD::FRINT; break; case ISD::STRICT_FNEARBYINT: NewOpc = ISD::FNEARBYINT; break; case ISD::STRICT_FMAXNUM: NewOpc = 
ISD::FMAXNUM; break; case ISD::STRICT_FMINNUM: NewOpc = ISD::FMINNUM; break; case ISD::STRICT_FCEIL: NewOpc = ISD::FCEIL; break; case ISD::STRICT_FFLOOR: NewOpc = ISD::FFLOOR; break; + case ISD::STRICT_LROUND: NewOpc = ISD::LROUND; break; + case ISD::STRICT_LLROUND: NewOpc = ISD::LLROUND; break; case ISD::STRICT_FROUND: NewOpc = ISD::FROUND; break; case ISD::STRICT_FTRUNC: NewOpc = ISD::FTRUNC; break; case ISD::STRICT_FP_ROUND: NewOpc = ISD::FP_ROUND; break; Modified: llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp?rev=373900&r1=373899&r2=373900&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp (original) +++ llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp Mon Oct 7 06:20:00 2019 @@ -6104,12 +6104,16 @@ void SelectionDAGBuilder::visitIntrinsic case Intrinsic::experimental_constrained_log: case Intrinsic::experimental_constrained_log10: case Intrinsic::experimental_constrained_log2: + case Intrinsic::experimental_constrained_lrint: + case Intrinsic::experimental_constrained_llrint: case Intrinsic::experimental_constrained_rint: case Intrinsic::experimental_constrained_nearbyint: case Intrinsic::experimental_constrained_maxnum: case Intrinsic::experimental_constrained_minnum: case Intrinsic::experimental_constrained_ceil: case Intrinsic::experimental_constrained_floor: + case Intrinsic::experimental_constrained_lround: + case Intrinsic::experimental_constrained_llround: case Intrinsic::experimental_constrained_round: case Intrinsic::experimental_constrained_trunc: visitConstrainedFPIntrinsic(cast<ConstrainedFPIntrinsic>(I)); @@ -6935,6 +6939,12 @@ void SelectionDAGBuilder::visitConstrain case Intrinsic::experimental_constrained_log2: Opcode = ISD::STRICT_FLOG2; break; + case Intrinsic::experimental_constrained_lrint: + Opcode = ISD::STRICT_LRINT; + break; + case
Intrinsic::experimental_constrained_llrint: + Opcode = ISD::STRICT_LLRINT; + break; case Intrinsic::experimental_constrained_rint: Opcode = ISD::STRICT_FRINT; break; @@ -6953,6 +6963,12 @@ void SelectionDAGBuilder::visitConstrain case Intrinsic::experimental_constrained_floor: Opcode = ISD::STRICT_FFLOOR; break; + case Intrinsic::experimental_constrained_lround: + Opcode = ISD::STRICT_LROUND; + break; + case Intrinsic::experimental_constrained_llround: + Opcode = ISD::STRICT_LLROUND; + break; case Intrinsic::experimental_constrained_round: Opcode = ISD::STRICT_FROUND; break; Modified: llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp?rev=373900&r1=373899&r2=373900&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp (original) +++ llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp Mon Oct 7 06:20:00 2019 @@ -333,9 +333,13 @@ std::string SDNode::getOperationName(con case ISD::FP16_TO_FP: return "fp16_to_fp"; case ISD::FP_TO_FP16: return "fp_to_fp16"; case ISD::LROUND: return "lround"; + case ISD::STRICT_LROUND: return "strict_lround"; case ISD::LLROUND: return "llround"; + case ISD::STRICT_LLROUND: return "strict_llround"; case ISD::LRINT: return "lrint"; + case ISD::STRICT_LRINT: return "strict_lrint"; case ISD::LLRINT: return "llrint"; + case ISD::STRICT_LLRINT: return "strict_llrint"; // Control flow instructions case ISD::BR: return "br"; Modified: llvm/trunk/lib/CodeGen/TargetLoweringBase.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/TargetLoweringBase.cpp?rev=373900&r1=373899&r2=373900&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/TargetLoweringBase.cpp (original) +++ llvm/trunk/lib/CodeGen/TargetLoweringBase.cpp Mon Oct 7 06:20:00 2019 @@ -709,10 +709,14 @@ 
void TargetLoweringBase::initActions() { setOperationAction(ISD::STRICT_FLOG, VT, Expand); setOperationAction(ISD::STRICT_FLOG10, VT, Expand); setOperationAction(ISD::STRICT_FLOG2, VT, Expand); + setOperationAction(ISD::STRICT_LRINT, VT, Expand); + setOperationAction(ISD::STRICT_LLRINT, VT, Expand); setOperationAction(ISD::STRICT_FRINT, VT, Expand); setOperationAction(ISD::STRICT_FNEARBYINT, VT, Expand); setOperationAction(ISD::STRICT_FCEIL, VT, Expand); setOperationAction(ISD::STRICT_FFLOOR, VT, Expand); + setOperationAction(ISD::STRICT_LROUND, VT, Expand); + setOperationAction(ISD::STRICT_LLROUND, VT, Expand); setOperationAction(ISD::STRICT_FROUND, VT, Expand); setOperationAction(ISD::STRICT_FTRUNC, VT, Expand); setOperationAction(ISD::STRICT_FMAXNUM, VT, Expand); Modified: llvm/trunk/lib/IR/IntrinsicInst.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/IR/IntrinsicInst.cpp?rev=373900&r1=373899&r2=373900&view=diff ============================================================================== --- llvm/trunk/lib/IR/IntrinsicInst.cpp (original) +++ llvm/trunk/lib/IR/IntrinsicInst.cpp Mon Oct 7 06:20:00 2019 @@ -200,10 +200,14 @@ bool ConstrainedFPIntrinsic::isUnaryOp() case Intrinsic::experimental_constrained_log: case Intrinsic::experimental_constrained_log10: case Intrinsic::experimental_constrained_log2: + case Intrinsic::experimental_constrained_lrint: + case Intrinsic::experimental_constrained_llrint: case Intrinsic::experimental_constrained_rint: case Intrinsic::experimental_constrained_nearbyint: case Intrinsic::experimental_constrained_ceil: case Intrinsic::experimental_constrained_floor: + case Intrinsic::experimental_constrained_lround: + case Intrinsic::experimental_constrained_llround: case Intrinsic::experimental_constrained_round: case Intrinsic::experimental_constrained_trunc: return true; Modified: llvm/trunk/lib/IR/Verifier.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/IR/Verifier.cpp?rev=373900&r1=373899&r2=373900&view=diff 
============================================================================== --- llvm/trunk/lib/IR/Verifier.cpp (original) +++ llvm/trunk/lib/IR/Verifier.cpp Mon Oct 7 06:20:00 2019 @@ -4308,12 +4308,16 @@ void Verifier::visitIntrinsicCall(Intrin case Intrinsic::experimental_constrained_log: case Intrinsic::experimental_constrained_log10: case Intrinsic::experimental_constrained_log2: + case Intrinsic::experimental_constrained_lrint: + case Intrinsic::experimental_constrained_llrint: case Intrinsic::experimental_constrained_rint: case Intrinsic::experimental_constrained_nearbyint: case Intrinsic::experimental_constrained_maxnum: case Intrinsic::experimental_constrained_minnum: case Intrinsic::experimental_constrained_ceil: case Intrinsic::experimental_constrained_floor: + case Intrinsic::experimental_constrained_lround: + case Intrinsic::experimental_constrained_llround: case Intrinsic::experimental_constrained_round: case Intrinsic::experimental_constrained_trunc: visitConstrainedFPIntrinsic(cast(Call)); @@ -4766,6 +4770,31 @@ void Verifier::visitConstrainedFPIntrins HasRoundingMD = true; break; + case Intrinsic::experimental_constrained_lrint: + case Intrinsic::experimental_constrained_llrint: { + Assert((NumOperands == 3), "invalid arguments for constrained FP intrinsic", + &FPI); + Type *ValTy = FPI.getArgOperand(0)->getType(); + Type *ResultTy = FPI.getType(); + Assert(!ValTy->isVectorTy() && !ResultTy->isVectorTy(), + "Intrinsic does not support vectors", &FPI); + HasExceptionMD = true; + HasRoundingMD = true; + } + break; + + case Intrinsic::experimental_constrained_lround: + case Intrinsic::experimental_constrained_llround: { + Assert((NumOperands == 2), "invalid arguments for constrained FP intrinsic", + &FPI); + Type *ValTy = FPI.getArgOperand(0)->getType(); + Type *ResultTy = FPI.getType(); + Assert(!ValTy->isVectorTy() && !ResultTy->isVectorTy(), + "Intrinsic does not support vectors", &FPI); + HasExceptionMD = true; + break; + } + case 
Intrinsic::experimental_constrained_fma: Assert((NumOperands == 5), "invalid arguments for constrained FP intrinsic", &FPI); Modified: llvm/trunk/test/CodeGen/X86/fp-intrinsics.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/fp-intrinsics.ll?rev=373900&r1=373899&r2=373900&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/fp-intrinsics.ll (original) +++ llvm/trunk/test/CodeGen/X86/fp-intrinsics.ll Mon Oct 7 06:20:00 2019 @@ -342,6 +342,82 @@ entry: ret double %result } +; CHECK-LABEL: f23 +; COMMON: jmp lrint +define i32 @f23(double %x) #0 { +entry: + %result = call i32 @llvm.experimental.constrained.lrint.i32.f64(double %x, + metadata !"round.dynamic", + metadata !"fpexcept.strict") #0 + ret i32 %result +} + +; CHECK-LABEL: f24 +; COMMON: jmp lrintf +define i32 @f24(float %x) #0 { +entry: + %result = call i32 @llvm.experimental.constrained.lrint.i32.f32(float %x, + metadata !"round.dynamic", + metadata !"fpexcept.strict") #0 + ret i32 %result +} + +; CHECK-LABEL: f25 +; COMMON: jmp llrint +define i64 @f25(double %x) #0 { +entry: + %result = call i64 @llvm.experimental.constrained.llrint.i64.f64(double %x, + metadata !"round.dynamic", + metadata !"fpexcept.strict") #0 + ret i64 %result +} + +; CHECK-LABEL: f26 +; COMMON: jmp llrintf +define i64 @f26(float %x) { +entry: + %result = call i64 @llvm.experimental.constrained.llrint.i64.f32(float %x, + metadata !"round.dynamic", + metadata !"fpexcept.strict") #0 + ret i64 %result +} + +; CHECK-LABEL: f27 +; COMMON: jmp lround +define i32 @f27(double %x) #0 { +entry: + %result = call i32 @llvm.experimental.constrained.lround.i32.f64(double %x, + metadata !"fpexcept.strict") #0 + ret i32 %result +} + +; CHECK-LABEL: f28 +; COMMON: jmp lroundf +define i32 @f28(float %x) #0 { +entry: + %result = call i32 @llvm.experimental.constrained.lround.i32.f32(float %x, + metadata !"fpexcept.strict") #0 + ret i32 %result +} + +; CHECK-LABEL: 
f29 +; COMMON: jmp llround +define i64 @f29(double %x) #0 { +entry: + %result = call i64 @llvm.experimental.constrained.llround.i64.f64(double %x, + metadata !"fpexcept.strict") #0 + ret i64 %result +} + +; CHECK-LABEL: f30 +; COMMON: jmp llroundf +define i64 @f30(float %x) #0 { +entry: + %result = call i64 @llvm.experimental.constrained.llround.i64.f32(float %x, + metadata !"fpexcept.strict") #0 + ret i64 %result +} + attributes #0 = { strictfp } @llvm.fp.env = thread_local global i8 zeroinitializer, section "llvm.metadata" @@ -368,3 +444,11 @@ declare i32 @llvm.experimental.constrain declare i32 @llvm.experimental.constrained.fptoui.i32.f64(double, metadata) declare float @llvm.experimental.constrained.fptrunc.f32.f64(double, metadata, metadata) declare double @llvm.experimental.constrained.fpext.f64.f32(float, metadata) +declare i32 @llvm.experimental.constrained.lrint.i32.f64(double, metadata, metadata) +declare i32 @llvm.experimental.constrained.lrint.i32.f32(float, metadata, metadata) +declare i64 @llvm.experimental.constrained.llrint.i64.f64(double, metadata, metadata) +declare i64 @llvm.experimental.constrained.llrint.i64.f32(float, metadata, metadata) +declare i32 @llvm.experimental.constrained.lround.i32.f64(double, metadata) +declare i32 @llvm.experimental.constrained.lround.i32.f32(float, metadata) +declare i64 @llvm.experimental.constrained.llround.i64.f64(double, metadata) +declare i64 @llvm.experimental.constrained.llround.i64.f32(float, metadata) Modified: llvm/trunk/test/Feature/fp-intrinsics.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Feature/fp-intrinsics.ll?rev=373900&r1=373899&r2=373900&view=diff ============================================================================== --- llvm/trunk/test/Feature/fp-intrinsics.ll (original) +++ llvm/trunk/test/Feature/fp-intrinsics.ll Mon Oct 7 06:20:00 2019 @@ -289,6 +289,90 @@ entry: ret double %result } +; Verify that lrint(42.1) isn't simplified when the rounding mode is unknown. 
+; CHECK-LABEL: f22 +; CHECK: call i32 @llvm.experimental.constrained.lrint +define i32 @f22() #0 { +entry: + %result = call i32 @llvm.experimental.constrained.lrint.i32.f64(double 42.1, + metadata !"round.dynamic", + metadata !"fpexcept.strict") #0 + ret i32 %result +} + +; Verify that lrintf(42.0) isn't simplified when the rounding mode is unknown. +; CHECK-LABEL: f23 +; CHECK: call i32 @llvm.experimental.constrained.lrint +define i32 @f23() #0 { +entry: + %result = call i32 @llvm.experimental.constrained.lrint.i32.f32(float 42.0, + metadata !"round.dynamic", + metadata !"fpexcept.strict") #0 + ret i32 %result +} + +; Verify that llrint(42.1) isn't simplified when the rounding mode is unknown. +; CHECK-LABEL: f24 +; CHECK: call i64 @llvm.experimental.constrained.llrint +define i64 @f24() #0 { +entry: + %result = call i64 @llvm.experimental.constrained.llrint.i64.f64(double 42.1, + metadata !"round.dynamic", + metadata !"fpexcept.strict") #0 + ret i64 %result +} + +; Verify that llrint(42.0) isn't simplified when the rounding mode is unknown. +; CHECK-LABEL: f25 +; CHECK: call i64 @llvm.experimental.constrained.llrint +define i64 @f25() #0 { +entry: + %result = call i64 @llvm.experimental.constrained.llrint.i64.f32(float 42.0, + metadata !"round.dynamic", + metadata !"fpexcept.strict") #0 + ret i64 %result +} + +; Verify that lround(42.1) isn't simplified when the rounding mode is unknown. +; CHECK-LABEL: f26 +; CHECK: call i32 @llvm.experimental.constrained.lround +define i32 @f26() #0 { +entry: + %result = call i32 @llvm.experimental.constrained.lround.i32.f64(double 42.1, + metadata !"fpexcept.strict") #0 + ret i32 %result +} + +; Verify that lround(42.0) isn't simplified when the rounding mode is unknown. 
+; CHECK-LABEL: f27 +; CHECK: call i32 @llvm.experimental.constrained.lround +define i32 @f27() #0 { +entry: + %result = call i32 @llvm.experimental.constrained.lround.i32.f32(float 42.0, + metadata !"fpexcept.strict") #0 + ret i32 %result +} + +; Verify that llround(42.1) isn't simplified when the rounding mode is unknown. +; CHECK-LABEL: f28 +; CHECK: call i64 @llvm.experimental.constrained.llround +define i64 @f28() #0 { +entry: + %result = call i64 @llvm.experimental.constrained.llround.i64.f64(double 42.1, + metadata !"fpexcept.strict") #0 + ret i64 %result +} + +; Verify that llround(42.0) isn't simplified when the rounding mode is unknown. +; CHECK-LABEL: f29 +; CHECK: call i64 @llvm.experimental.constrained.llround +define i64 @f29() #0 { +entry: + %result = call i64 @llvm.experimental.constrained.llround.i64.f32(float 42.0, + metadata !"fpexcept.strict") #0 + ret i64 %result +} + attributes #0 = { strictfp } @llvm.fp.env = thread_local global i8 zeroinitializer, section "llvm.metadata" @@ -313,3 +397,11 @@ declare i32 @llvm.experimental.constrain declare i32 @llvm.experimental.constrained.fptoui.i32.f64(double, metadata) declare float @llvm.experimental.constrained.fptrunc.f32.f64(double, metadata, metadata) declare double @llvm.experimental.constrained.fpext.f64.f32(float, metadata) +declare i32 @llvm.experimental.constrained.lrint.i32.f64(double, metadata, metadata) +declare i32 @llvm.experimental.constrained.lrint.i32.f32(float, metadata, metadata) +declare i64 @llvm.experimental.constrained.llrint.i64.f64(double, metadata, metadata) +declare i64 @llvm.experimental.constrained.llrint.i64.f32(float, metadata, metadata) +declare i32 @llvm.experimental.constrained.lround.i32.f64(double, metadata) +declare i32 @llvm.experimental.constrained.lround.i32.f32(float, metadata) +declare i64 @llvm.experimental.constrained.llround.i64.f64(double, metadata) +declare i64 @llvm.experimental.constrained.llround.i64.f32(float, metadata) From llvm-commits at 
lists.llvm.org Mon Oct 7 06:21:00 2019 From: llvm-commits at lists.llvm.org (Nirav Dave via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 13:21:00 +0000 (UTC) Subject: [PATCH] D30471: [SDAG] Relax conditions under stores of loaded values can be merged In-Reply-To: References: Message-ID: This revision was not accepted when it landed; it landed in state "Needs Revision". This revision was automatically updated to reflect the committed changes. Closed by commit rGa38c049fc5c7: [SDAG] Relax conditions under stores of loaded values can be merged (authored by niravd). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D30471/new/ https://reviews.llvm.org/D30471 Files: llvm/include/llvm/CodeGen/ISDOpcodes.h llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp llvm/test/CodeGen/X86/merge_store_duplicated_loads.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D30471.223544.patch Type: text/x-patch Size: 5374 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 06:23:29 2019 From: llvm-commits at lists.llvm.org (Evgenii Stepanov via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 13:23:29 +0000 (UTC) Subject: [PATCH] D30121: [asan] Fix dead stripping of globals on Linux. In-Reply-To: References: Message-ID: <1f948fbe22fdd981d78e112fd32760fd@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rGc5aa6b94115d: [asan] Fix dead stripping of globals on Linux. (authored by eugenis). Herald added subscribers: dexonsmith, steven_wu, hiraditya. Herald added a project: LLVM. 
Changed prior to commit: https://reviews.llvm.org/D30121?vs=93024&id=223547#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D30121/new/ https://reviews.llvm.org/D30121 Files: llvm/include/llvm/Transforms/Utils/ModuleUtils.h llvm/lib/Transforms/IPO/ThinLTOBitcodeWriter.cpp llvm/lib/Transforms/Instrumentation/AddressSanitizer.cpp llvm/lib/Transforms/Utils/ModuleUtils.cpp llvm/test/Instrumentation/AddressSanitizer/global_metadata.ll llvm/test/Instrumentation/AddressSanitizer/global_metadata_darwin.ll llvm/test/Instrumentation/AddressSanitizer/instrument_global.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D30121.223547.patch Type: text/x-patch Size: 17093 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 06:24:44 2019 From: llvm-commits at lists.llvm.org (Peter Collingbourne via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 13:24:44 +0000 (UTC) Subject: [PATCH] D30770: Ensure that prefix data is preserved with subsections-via-symbols In-Reply-To: References: Message-ID: <707ca92c7bb7abae8c262b0bc8a5a7ca@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rG7f6e2c97b889: Ensure that prefix data is preserved with subsections-via-symbols (authored by pcc). Herald added subscribers: bollu, hiraditya. Herald added a project: LLVM. Changed prior to commit: https://reviews.llvm.org/D30770?vs=91819&id=223548#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D30770/new/ https://reviews.llvm.org/D30770 Files: llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp llvm/test/CodeGen/AArch64/prefixdata.ll llvm/test/CodeGen/X86/prefixdata.ll -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D30770.223548.patch Type: text/x-patch Size: 3257 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 06:28:01 2019 From: llvm-commits at lists.llvm.org (Dwight Guth via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 13:28:01 +0000 (UTC) Subject: [PATCH] D67855: [X86] Add new calling convention that guarantees tail call optimization In-Reply-To: References: Message-ID: dwightguth added a comment. @rnk @paquette what does this need to move forward? I think I addressed all your comments. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67855/new/ https://reviews.llvm.org/D67855 From llvm-commits at lists.llvm.org Mon Oct 7 06:29:43 2019 From: llvm-commits at lists.llvm.org (David Stenberg via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 13:29:43 +0000 (UTC) Subject: [PATCH] D68465: [DebugInfo] Trim call-clobbered location list entries when tuning for GDB In-Reply-To: References: Message-ID: dstenb added a comment. In D68465#1695482 , @dblaikie wrote: > Thanks for bringing this up! > > A few thoughts from me: > > 1. Yeah, I tend to agree with the DWARF Committee folks & the fact that LLDB can do the right thing without this change sort of points to this being a "fix it in GDB" situation. Have you tried asking the GDB folks about it/submitting patches there rather than here? No, we have not done that yet. > 2. Do you have a small example of GCC producing this kind of output? extern int value(void); extern void call(int); int main() { int local = value(); call(local); return 0; } compiled using -O1 -g with GCC 8.3.0 gives: (gdb) info addr local Symbol "local" is multi-location: Range 0x5555555550ee-0x5555555550f4: a variable in $rax . (gdb) disas main [...] 0x00005555555550f0 <+11>: callq 0x5555555550ff 0x00005555555550f5 <+16>: mov $0x0,%eax [...] As seen, the location list entry ends one byte before the return address. > 3. 
The ability to use an offset from a debug_addr entry is actually something that's quite desirable - but not quite in the way you're suggesting. Actually the goal would be to not use another debug_addr entry, but the ability to refer to an addr entry + offset in a DIE. Of course this would require an extension to DWARF (non-standard, or eventually standard) which would also mean updating the DWARF consumer... which probably defeats the point of your work, which I imagine is intended to avoid changing the consumer. Though perhaps GDB would be more inclined to accept a patch for an addr+offset form compared to support for the register details. Okay! What would that look like, and what would that be used for? > 4. If GDB can do the right thing when printing in a backtrace, then it seems like it should do the same/similar thing when in a frame The same problem exists for backtraces. For example, in PR39752 the outer frames' parameters are printed using the inner-most register value: (gdb) bt #0 fn3 (p3=) at test.c:11 #1 0x000000000040050f in fn2 (p2=999) at test.c:15 #2 0x000000000040052f in fn1 (p1=999) at test.c:21 #3 0x000000000040054e in main () at test.c:26 Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68465/new/ https://reviews.llvm.org/D68465 From llvm-commits at lists.llvm.org Mon Oct 7 06:34:52 2019 From: llvm-commits at lists.llvm.org (Michał Górny via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 13:34:52 +0000 (UTC) Subject: [PATCH] D28213: [Frontend] Correct values of ATOMIC_*_LOCK_FREE to match builtin In-Reply-To: References: Message-ID: <52cd058afb44466e01c10efe62aedbae@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rGdc155744c82f: [Frontend] Correct values of ATOMIC_*_LOCK_FREE to match builtin (authored by mgorny). Herald added a subscriber: dexonsmith. Herald added a project: clang.
Changed prior to commit: https://reviews.llvm.org/D28213?vs=83677&id=223551#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D28213/new/ https://reviews.llvm.org/D28213 Files: clang/lib/Frontend/InitPreprocessor.cpp clang/test/Sema/atomic-ops.c Index: clang/test/Sema/atomic-ops.c =================================================================== --- clang/test/Sema/atomic-ops.c +++ clang/test/Sema/atomic-ops.c @@ -14,11 +14,7 @@ _Static_assert(__GCC_ATOMIC_SHORT_LOCK_FREE == 2, ""); _Static_assert(__GCC_ATOMIC_INT_LOCK_FREE == 2, ""); _Static_assert(__GCC_ATOMIC_LONG_LOCK_FREE == 2, ""); -#ifdef __i386__ -_Static_assert(__GCC_ATOMIC_LLONG_LOCK_FREE == 1, ""); -#else _Static_assert(__GCC_ATOMIC_LLONG_LOCK_FREE == 2, ""); -#endif _Static_assert(__GCC_ATOMIC_POINTER_LOCK_FREE == 2, ""); _Static_assert(__c11_atomic_is_lock_free(1), ""); Index: clang/lib/Frontend/InitPreprocessor.cpp =================================================================== --- clang/lib/Frontend/InitPreprocessor.cpp +++ clang/lib/Frontend/InitPreprocessor.cpp @@ -286,12 +286,12 @@ /// Get the value the ATOMIC_*_LOCK_FREE macro should have for a type with /// the specified properties. -static const char *getLockFreeValue(unsigned TypeWidth, unsigned TypeAlign, - unsigned InlineWidth) { +static const char *getLockFreeValue(unsigned TypeWidth, unsigned InlineWidth) { // Fully-aligned, power-of-2 sizes no larger than the inline // width will be inlined as lock-free operations. - if (TypeWidth == TypeAlign && (TypeWidth & (TypeWidth - 1)) == 0 && - TypeWidth <= InlineWidth) + // Note: we do not need to check alignment since _Atomic(T) is always + // appropriately-aligned in clang. + if ((TypeWidth & (TypeWidth - 1)) == 0 && TypeWidth <= InlineWidth) return "2"; // "always lock free" // We cannot be certain what operations the lib calls might be // able to implement as lock-free on future processors. 
@@ -881,7 +881,6 @@ #define DEFINE_LOCK_FREE_MACRO(TYPE, Type) \ Builder.defineMacro("__GCC_ATOMIC_" #TYPE "_LOCK_FREE", \ getLockFreeValue(TI.get##Type##Width(), \ - TI.get##Type##Align(), \ InlineWidthBits)); DEFINE_LOCK_FREE_MACRO(BOOL, Bool); DEFINE_LOCK_FREE_MACRO(CHAR, Char); @@ -894,7 +893,6 @@ DEFINE_LOCK_FREE_MACRO(LLONG, LongLong); Builder.defineMacro("__GCC_ATOMIC_POINTER_LOCK_FREE", getLockFreeValue(TI.getPointerWidth(0), - TI.getPointerAlign(0), InlineWidthBits)); #undef DEFINE_LOCK_FREE_MACRO } -------------- next part -------------- A non-text attachment was scrubbed... Name: D28213.223551.patch Type: text/x-patch Size: 2487 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 06:36:51 2019 From: llvm-commits at lists.llvm.org (Zachary Turner via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 13:36:51 +0000 (UTC) Subject: [PATCH] D27780: Make OptionDefinition structure store a StringRef In-Reply-To: References: Message-ID: This revision was automatically updated to reflect the committed changes. Closed by commit rG182b4652e542: [StringRef] Add enable-if to StringLiteral. (authored by zturner). Herald added subscribers: llvm-commits, dexonsmith. Herald added a project: LLVM. Changed prior to commit: https://reviews.llvm.org/D27780?vs=81625&id=223552#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D27780/new/ https://reviews.llvm.org/D27780 Files: llvm/include/llvm/ADT/StringRef.h Index: llvm/include/llvm/ADT/StringRef.h =================================================================== --- llvm/include/llvm/ADT/StringRef.h +++ llvm/include/llvm/ADT/StringRef.h @@ -838,22 +838,21 @@ /// A wrapper around a string literal that serves as a proxy for constructing /// global tables of StringRefs with the length computed at compile time. - /// Using this class with a non-literal char array is considered undefined - /// behavior. 
To prevent this, it is recommended that StringLiteral *only* - /// be used in a constexpr context, as such: + /// In order to avoid the invocation of a global constructor, StringLiteral + /// should *only* be used in a constexpr context, as such: /// /// constexpr StringLiteral S("test"); /// - /// Note: There is a subtle behavioral difference in the constructor of - /// StringRef and StringLiteral, as illustrated below: - /// - /// constexpr StringLiteral S("a\0b"); // S.size() == 3 - /// StringRef S("a\0b"); // S.size() == 1 - /// class StringLiteral : public StringRef { public: template <size_t N> - constexpr StringLiteral(const char (&Str)[N]) : StringRef(Str, N - 1) {} + constexpr StringLiteral(const char (&Str)[N]) +#if __has_attribute(enable_if) + __attribute((enable_if(__builtin_strlen(Str) == N - 1, + "invalid string literal"))) +#endif + : StringRef(Str, N - 1) { + } }; /// @name StringRef Comparison Operators @@ -865,9 +864,7 @@ } LLVM_ATTRIBUTE_ALWAYS_INLINE - inline bool operator!=(StringRef LHS, StringRef RHS) { - return !(LHS == RHS); - } + inline bool operator!=(StringRef LHS, StringRef RHS) { return !(LHS == RHS); } inline bool operator<(StringRef LHS, StringRef RHS) { return LHS.compare(RHS) == -1; -------------- next part -------------- A non-text attachment was scrubbed... Name: D27780.223552.patch Type: text/x-patch Size: 1815 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 06:39:56 2019 From: llvm-commits at lists.llvm.org (Kevin P. Neal via llvm-commits) Date: Mon, 07 Oct 2019 13:39:56 -0000 Subject: [llvm] r373902 - Fix sphinx warnings. Message-ID: <20191007133956.F23FA8291F@lists.llvm.org> Author: kpn Date: Mon Oct 7 06:39:56 2019 New Revision: 373902 URL: http://llvm.org/viewvc/llvm-project?rev=373902&view=rev Log: Fix sphinx warnings.
Differential Revision: https://reviews.llvm.org/D64746 Modified: llvm/trunk/docs/LangRef.rst Modified: llvm/trunk/docs/LangRef.rst URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/docs/LangRef.rst?rev=373902&r1=373901&r2=373902&view=diff ============================================================================== --- llvm/trunk/docs/LangRef.rst (original) +++ llvm/trunk/docs/LangRef.rst Mon Oct 7 06:39:56 2019 @@ -15941,7 +15941,7 @@ mode argument is only intended as inform '``llvm.experimental.constrained.lrint``' Intrinsic -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Syntax: """"""" @@ -15989,7 +15989,7 @@ then the results will be the same as the '``llvm.experimental.constrained.llrint``' Intrinsic -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Syntax: """"""" From llvm-commits at lists.llvm.org Mon Oct 7 06:38:41 2019 From: llvm-commits at lists.llvm.org (Evgeny Stupachenko via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 13:38:41 +0000 (UTC) Subject: [PATCH] D26877: Minor fixes in Loop Strength Reduction In-Reply-To: References: Message-ID: <7df18a28e33bc52ba49c40fc9d7e8849@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rG0c4300fac7e0: Fix LSR best register search algorithm. (authored by evstupac). Herald added a subscriber: hiraditya. Herald added a project: LLVM. 
Changed prior to commit: https://reviews.llvm.org/D26877?vs=78784&id=223554#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D26877/new/ https://reviews.llvm.org/D26877 Files: llvm/lib/Transforms/Scalar/LoopStrengthReduce.cpp Index: llvm/lib/Transforms/Scalar/LoopStrengthReduce.cpp =================================================================== --- llvm/lib/Transforms/Scalar/LoopStrengthReduce.cpp +++ llvm/lib/Transforms/Scalar/LoopStrengthReduce.cpp @@ -4178,9 +4178,10 @@ for (const SCEV *Reg : RegUses) { if (Taken.count(Reg)) continue; - if (!Best) + if (!Best) { Best = Reg; - else { + BestNum = RegUses.getUsedByIndices(Reg).count(); + } else { unsigned Count = RegUses.getUsedByIndices(Reg).count(); if (Count > BestNum) { Best = Reg; -------------- next part -------------- A non-text attachment was scrubbed... Name: D26877.223554.patch Type: text/x-patch Size: 610 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 06:40:52 2019 From: llvm-commits at lists.llvm.org (Evgeny Stupachenko via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 13:40:52 +0000 (UTC) Subject: [PATCH] D21719: Unroll restructure In-Reply-To: References: Message-ID: <9b978e6ec2173c102f4f01c437436f8c@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rGc2698cd90313: Minor unroll pass refacoring. (authored by evstupac). Herald added subscribers: zzheng, hiraditya. Herald added a project: LLVM. Changed prior to commit: https://reviews.llvm.org/D21719?vs=70640&id=223556#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D21719/new/ https://reviews.llvm.org/D21719 Files: llvm/include/llvm/Analysis/TargetTransformInfo.h llvm/include/llvm/CodeGen/BasicTTIImpl.h llvm/lib/Transforms/Scalar/LoopUnrollPass.cpp -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D21719.223556.patch Type: text/x-patch Size: 9456 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 06:41:21 2019 From: llvm-commits at lists.llvm.org (Sam McCall via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 13:41:21 +0000 (UTC) Subject: [PATCH] D65677: [VirtualFileSystem] Make the RedirectingFileSystem hold on to its own working directory. In-Reply-To: References: Message-ID: sammccall added a comment. Mostly LG, just a couple of possible logic bugs. Apologies, I was out on vacation and hoped someone else would see this. ================ Comment at: llvm/include/llvm/Support/VirtualFileSystem.h:650 + bool Fallthrough() const { return ExternalFSValidWD && IsFallthrough; } + ---------------- this name seems less than ideal: - it's very similar to `IsFallthrough` but has different semantics - "fallthrough" itself is not a very clear description of this functionality IMO - it's spelled wrong per the style guide I'd suggest `shouldUseExternalFS()` or so ================ Comment at: llvm/lib/Support/VirtualFileSystem.cpp:1057 + auto EC = ExternalFS->setCurrentWorkingDirectory(Path); + ExternalFSValidWD = static_cast<bool>(EC); + } ---------------- this seems backwards - error_code converts to true if it's an *error* add tests? ================ Comment at: llvm/lib/Support/VirtualFileSystem.cpp:1061 + // Don't change the working directory if the path doesn't exist. + if (!exists(Path)) + return errc::no_such_file_or_directory; ---------------- this seems like it should go at the top? cd to a nonexistent directory should avoid changing state and return an error (which means not marking ExternalFSValidWD as false, I think) ================ Comment at: llvm/lib/Support/VirtualFileSystem.cpp:1065 + // Non-absolute paths are relative to the current working directory.
+ if (!sys::path::is_absolute(Path)) { + SmallString<128> AbsolutePath; ---------------- makeAbsolute already does this check CHANGES SINCE LAST ACTION https://reviews.llvm.org/D65677/new/ https://reviews.llvm.org/D65677 From llvm-commits at lists.llvm.org Mon Oct 7 06:46:17 2019 From: llvm-commits at lists.llvm.org (Sanjay Patel via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 13:46:17 +0000 (UTC) Subject: [PATCH] D68470: [InstCombine][NFC] dropRedundantMaskingOfLeftShiftInput(): change how we deal with mask In-Reply-To: References: Message-ID: <76c5c76d40208499ba77cd41703ff957@localhost.localdomain> spatel accepted this revision. spatel added a comment. This revision is now accepted and ready to land. In D68470#1697370 , @lebedev.ri wrote: > In D68470#1697275 , @spatel wrote: > > > The diff as shown includes D68239 rather than building on top of it? Commit the other patch and rebase, so we are current with trunk? > > > No, this is fully properly rebased. I see. I was distracted by the cosmetic diffs caused by different indentation. This is already NFC, so there's not much point in making sub-patches. LGTM. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68470/new/ https://reviews.llvm.org/D68470 From llvm-commits at lists.llvm.org Mon Oct 7 06:47:46 2019 From: llvm-commits at lists.llvm.org (Hubert Tong via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 13:47:46 +0000 (UTC) Subject: [PATCH] D68472: [test] Depend on C.UTF-8 dependency for mri-utf8.test In-Reply-To: References: Message-ID: hubert.reinterpretcast added inline comments. ================ Comment at: llvm/test/tools/llvm-ar/mri-nonascii.test:6 + +RUN: echo "contents" > %t/£.txt + ---------------- I am not particularly thrilled with having a file containing non-ASCII characters that are ambiguous with regards to their interpretation. Is this `£`, `Β£`, or something else? Is there an objection to adding a BOM? 
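As an aside to the encoding question above, a minimal sketch of why the bytes are ambiguous without a BOM or other encoding marker. The helper name `encodeUtf8Below0x800` is hypothetical and not part of the patch under review; it only assumes the standard UTF-8 rules for one- and two-byte sequences.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not from the patch): encode a code point below
// U+0800 as UTF-8, which needs one or two bytes.
std::string encodeUtf8Below0x800(unsigned CodePoint) {
  assert(CodePoint < 0x800 && "sketch only handles 1- and 2-byte sequences");
  std::string Out;
  if (CodePoint < 0x80) {
    // ASCII range: the byte is the code point itself.
    Out.push_back(static_cast<char>(CodePoint));
  } else {
    // Two-byte form: 110xxxxx 10xxxxxx.
    Out.push_back(static_cast<char>(0xC0 | (CodePoint >> 6)));
    Out.push_back(static_cast<char>(0x80 | (CodePoint & 0x3F)));
  }
  return Out;
}
```

For U+00A3 this yields the two bytes 0xC2 0xA3. A reader that assumes ISO-8859-1 would display the same bytes as `Â£`, and one that assumes ISO-8859-7 would display them as `Β£`, which is the kind of misreading the comment is concerned about.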
================ Comment at: llvm/test/tools/llvm-ar/mri-nonascii.test:15 + +# Use input redirection to work around problems launching processess that +# include arguments with non-ascii characters. ---------------- Minor nit: s/processess/processes/; ================ Comment at: llvm/test/tools/llvm-ar/mri-utf8.test:26 +# and linux vs windows. The C.UTF-8 locale is chosen +RUN: env LANG=C.UTF-8 %python -c "assert open(u'\xA3.txt', 'rb').read() == b'contents\n'" ---------------- thopre wrote: > hubert.reinterpretcast wrote: > > MaskRay wrote: > > > Just delete the comments and avoid python. > > > > > > ``` > > > RUN: FileCheck --input-file £.txt --match-full-lines > > > CHECK: contents > > > ``` > > As it is, the file contains nothing aside from this last RUN line and its associated comment block that indicates that U+00A3 is the intended interpretation of the bytes `\xC2\xA3`. Note: There is no BOM in the file. > > > > In addition to making the intent clear, I believe that the current approach has more of an ability to detect cases where the instances of `\xC2\xA3` in the file are misinterpreted. > > > > That said, if the file redirection to create the file works, then `FileCheck` can be invoked with use of file redirection: > > ``` > > RUN: FileCheck <£.txt --match-full-lines %s > > ``` > > > Is UTF-8 encoding really the desired behavior or just non ascii? I know the test is named mri-utf8 but the first comment says "Test non-ascii archive members". Besides as I mentioned in the patch description Windows encodes it in UTF-16 so UTF-8 is already not possible there. > > I do like the approach of using FileCheck with an input redirection. It is consistent with the echo line above so if one works the other one will as well. I feel ashamed I didn't think of that good old FileCheck. I'll revise the patch accordingly. 
The description in the Windows case indicates that the file that ends up on the filesystem is named, in terms of what a user might see in a directory listing, `£.txt` (as opposed to something else). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68472/new/ https://reviews.llvm.org/D68472 From llvm-commits at lists.llvm.org Mon Oct 7 06:49:51 2019 From: llvm-commits at lists.llvm.org (Roger Ferrer Ibanez via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 13:49:51 +0000 (UTC) Subject: [PATCH] D20561: Warn when taking address of packed member In-Reply-To: References: Message-ID: <913cff3869d4e56fa9c9cf31b1ee9a1e@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rGf7b9f3149b76: Add missing tests (authored by rogfer01). Herald added a project: clang. Changed prior to commit: https://reviews.llvm.org/D20561?vs=67807&id=223564#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D20561/new/ https://reviews.llvm.org/D20561 Files: clang/test/Sema/address-packed-member-memops.c clang/test/Sema/address-packed.c clang/test/SemaCXX/address-packed-member-memops.cpp clang/test/SemaCXX/address-packed.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D20561.223564.patch Type: text/x-patch Size: 8603 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 06:50:29 2019 From: llvm-commits at lists.llvm.org (Fangrui Song via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 13:50:29 +0000 (UTC) Subject: [PATCH] D68472: [test] Depend on C.UTF-8 dependency for mri-utf8.test In-Reply-To: References: Message-ID: MaskRay added inline comments. ================ Comment at: llvm/test/tools/llvm-ar/mri-nonascii.test:15 + +# Use input redirection to work around problems launching processess that +# include arguments with non-ascii characters. 
---------------- hubert.reinterpretcast wrote: > Minor nit: s/processess/processes/; What problems do you work around? POSIX.1-2017 3.282 Portable Filename Character Set consists of the classical Latin alphabet, 0~9, <period>, <underscore>, and <hyphen-minus>. A filename consisting of the UTF-8 byte sequence 0xc2 0xa3 (£) may be disallowed by some implementations but it is unlikely that the implementation can arbitrarily reinterpret the byte sequence and cause the test to fail. I suggest deleting the comment. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68472/new/ https://reviews.llvm.org/D68472 From llvm-commits at lists.llvm.org Mon Oct 7 06:51:00 2019 From: llvm-commits at lists.llvm.org (Charles Davis via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 13:51:00 +0000 (UTC) Subject: [PATCH] D19908: [X86] Support the "ms-hotpatch" attribute. In-Reply-To: References: Message-ID: This revision was automatically updated to reflect the committed changes. Closed by commit rG0822aa118eaf: [X86] Support the "ms-hotpatch" attribute. (authored by cdavis5x). Herald added a subscriber: hiraditya. Herald added a project: LLVM. Changed prior to commit: https://reviews.llvm.org/D19908?vs=68393&id=223567#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D19908/new/ https://reviews.llvm.org/D19908 Files: llvm/docs/LangRef.rst llvm/include/llvm/Analysis/TargetTransformInfo.h llvm/include/llvm/Analysis/TargetTransformInfoImpl.h llvm/lib/Analysis/TargetTransformInfo.cpp llvm/lib/CodeGen/PatchableFunction.cpp llvm/lib/Target/X86/X86AsmPrinter.cpp llvm/lib/Target/X86/X86AsmPrinter.h llvm/lib/Target/X86/X86FrameLowering.cpp llvm/lib/Target/X86/X86ISelLowering.cpp llvm/lib/Target/X86/X86TargetTransformInfo.cpp llvm/lib/Target/X86/X86TargetTransformInfo.h llvm/test/CodeGen/X86/ms-hotpatch-attr.ll -------------- next part -------------- A non-text attachment was scrubbed...
Name: D19908.223567.patch Type: text/x-patch Size: 16176 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 06:53:17 2019 From: llvm-commits at lists.llvm.org (David Stenberg via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 13:53:17 +0000 (UTC) Subject: [PATCH] D67556: [ARM][AArch64][DebugInfo] Improve call site instruction interpretation In-Reply-To: References: Message-ID: <7ab8110d3f279efe21adf0d66a847f80@localhost.localdomain> dstenb added inline comments. ================ Comment at: include/llvm/CodeGen/TargetInstrInfo.h:888 + /// If the specific machine instruction is an instruction that adds an + /// immediate value to its first operand and stores it in the first, return + /// true along with @Source machine operand to which @Offset has been ---------------- NikolaPrica wrote: > dstenb wrote: > > dstenb wrote: > > > I wonder if the hook should allow the source and destination to be different, as we then for example could describe cases like this: > > > > > > ``` > > > $reg0 = add $frame-ptr, -13 > > > ``` > > > > > > If so, would it then make sense to move the LEA part of X86's `describeLoadedValue()` hook into this hook instead? > > If so, should we perhaps also consider generalizing the hook so that it has a Destination out-parameter, e.g. same as `isCopyInstr()`? That could probably be helpful if we make the `describeLoadedValue()` hook aware of which register it should describe, as we discussed in D67225. > > I wonder if the hook should allow the source and destination to be different > > > In fact we should only rely on situations where source and destination operands are different. Such a restriction should be applied in the general part of `describeLoadedValue()`. There is no use in describing situations like > > $reg0 = add $reg0, 4 > > This case would require recursive description of $reg0. Describing such an instruction is a different story.
> > > If so, would it then make sense to move the LEA part of X86's describeLoadedValue() hook into this hook instead? > > The LEA instruction is more complex than an add-immediate instruction. It could be observed as an add immediate in one case, but it could also add multiple source registers with some multiplication operations. IMHO it would be better to keep this API function clean, in the sense of recognizing only a clear add-immediate instruction, for the purpose of further usage. > > > If so, should we perhaps also consider generalizing the hook so that it has a Destination out-parameter, e.g. same as isCopyInstr()? That could probably be helpful if we make the describeLoadedValue() hook aware of which register it should describe, as we discussed in D67225. > > In the first version of this function I added a destination operand, but I removed it since there was no current use for it. But for further flexibility I will add it. > > In fact we should only rely on situations where source and destination operands are different. Such a restriction should be applied in the general part of describeLoadedValue(). There is no use in describing situations like > > $reg0 = add $reg0, 4 In previous revisions of the downstream target we develop for we had to resort to: ``` $reg0 = mov $frame-ptr $reg0 = add $reg0, $offset ``` instead of loading the frame pointer with an offset in one instruction. Perhaps there is some upstream target that requires the same? > This case would require recursive description of $reg0. Describing such an instruction is a different story. Is that due to the issue with expressions in collectCallSiteParameters() which we discussed earlier in this patch? > The LEA instruction is more complex than an add-immediate instruction. It could be observed as an add immediate in one case, but it could also add multiple source registers with some multiplication operations.
IMHO it would be better to keep this API function clean, in the sense of recognizing only a clear add-immediate instruction, for the purpose of further usage. Okay, that sounds fair. Moving some parts of the LEA implementation to this hook, and keeping the rest in `describeLoadedValue()` would probably not be ideal. > In the first version of this function I added a destination operand, but I removed it since there was no current use for it. But for further flexibility I will add it. Okay, thanks! CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67556/new/ https://reviews.llvm.org/D67556 From llvm-commits at lists.llvm.org Mon Oct 7 06:53:18 2019 From: llvm-commits at lists.llvm.org (Jeroen Dobbelaere via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 13:53:18 +0000 (UTC) Subject: [PATCH] D68484: [PATCH 01/38] [noalias] LangRef: noalias intrinsics and noalias_sidechannel documentation. In-Reply-To: References: Message-ID: <43d60c7bcf2d73a7484f31c7954d7e8b@localhost.localdomain> jeroen.dobbelaere updated this revision to Diff 223569. jeroen.dobbelaere added a comment. Document the type of the noalias_sidechannel for load and store. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68484/new/ https://reviews.llvm.org/D68484 Files: llvm/docs/LangRef.rst -------------- next part -------------- A non-text attachment was scrubbed... Name: D68484.223569.patch Type: text/x-patch Size: 23955 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 06:54:30 2019 From: llvm-commits at lists.llvm.org (Roman Lebedev via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 13:54:30 +0000 (UTC) Subject: [PATCH] D68470: [InstCombine][NFC] dropRedundantMaskingOfLeftShiftInput(): change how we deal with mask In-Reply-To: References: Message-ID: <15ba229e3bfcf724b1978a629ad78ca1@localhost.localdomain> lebedev.ri added a comment.
In D68470#1697437, @spatel wrote: > In D68470#1697370, @lebedev.ri wrote: > > > In D68470#1697275, @spatel wrote: > > > > > The diff as shown includes D68239 rather than building on top of it? Commit the other patch and rebase, so we are current with trunk? > > > > > > No, this is fully properly rebased. > > > I see. I was distracted by the cosmetic diffs caused by different indentation. This is already NFC, so there's not much point in making sub-patches. LGTM. Thank you for the review! Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68470/new/ https://reviews.llvm.org/D68470 From llvm-commits at lists.llvm.org Mon Oct 7 06:57:13 2019 From: llvm-commits at lists.llvm.org (whitequark via llvm-commits) Date: Mon, 07 Oct 2019 13:57:13 -0000 Subject: [llvm] r373903 - [LLVM-C] Add bindings to create macro debug info Message-ID: <20191007135713.C2E808BF9A@lists.llvm.org> Author: whitequark Date: Mon Oct 7 06:57:13 2019 New Revision: 373903 URL: http://llvm.org/viewvc/llvm-project?rev=373903&view=rev Log: [LLVM-C] Add bindings to create macro debug info Summary: The C API doesn't have the bindings to create macro debug information.
Reviewers: whitequark, CodaFi, deadalnix Reviewed By: whitequark Subscribers: aprantl, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58334 Modified: llvm/trunk/include/llvm-c/DebugInfo.h llvm/trunk/lib/IR/DebugInfo.cpp llvm/trunk/test/Bindings/llvm-c/debug_info.ll llvm/trunk/tools/llvm-c-test/debuginfo.c Modified: llvm/trunk/include/llvm-c/DebugInfo.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm-c/DebugInfo.h?rev=373903&r1=373902&r2=373903&view=diff ============================================================================== --- llvm/trunk/include/llvm-c/DebugInfo.h (original) +++ llvm/trunk/include/llvm-c/DebugInfo.h Mon Oct 7 06:57:13 2019 @@ -170,6 +170,19 @@ typedef unsigned LLVMMetadataKind; typedef unsigned LLVMDWARFTypeEncoding; /** + * Describes the kind of macro declaration used for LLVMDIBuilderCreateMacro. + * @see llvm::dwarf::MacinfoRecordType + * @note Values are from DW_MACINFO_* constants in the DWARF specification. + */ +typedef enum { + LLVMDWARFMacinfoRecordTypeDefine = 0x01, + LLVMDWARFMacinfoRecordTypeMacro = 0x02, + LLVMDWARFMacinfoRecordTypeStartFile = 0x03, + LLVMDWARFMacinfoRecordTypeEndFile = 0x04, + LLVMDWARFMacinfoRecordTypeVendorExt = 0xff +} LLVMDWARFMacinfoRecordType; + +/** * The current debug metadata version number. */ unsigned LLVMDebugMetadataVersion(void); @@ -522,6 +535,38 @@ LLVMDIBuilderCreateSubroutineType(LLVMDI LLVMDIFlags Flags); /** + * Create debugging information entry for a macro. + * @param Builder The DIBuilder. + * @param ParentMacroFile Macro parent (could be NULL). + * @param Line Source line number where the macro is defined. + * @param MacroType DW_MACINFO_define or DW_MACINFO_undef. + * @param Name Macro name. + * @param NameLen Macro name length. + * @param Value Macro value. + * @param ValueLen Macro value length. 
+ */ +LLVMMetadataRef LLVMDIBuilderCreateMacro(LLVMDIBuilderRef Builder, + LLVMMetadataRef ParentMacroFile, + unsigned Line, + LLVMDWARFMacinfoRecordType RecordType, + const char *Name, size_t NameLen, + const char *Value, size_t ValueLen); + +/** + * Create debugging information temporary entry for a macro file. + * List of macro node direct children will be calculated by DIBuilder, + * using the \p ParentMacroFile relationship. + * @param Builder The DIBuilder. + * @param ParentMacroFile Macro parent (could be NULL). + * @param Line Source line number where the macro file is included. + * @param File File descriptor containing the name of the macro file. + */ +LLVMMetadataRef +LLVMDIBuilderCreateTempMacroFile(LLVMDIBuilderRef Builder, + LLVMMetadataRef ParentMacroFile, unsigned Line, + LLVMMetadataRef File); + +/** * Create debugging information entry for an enumerator. * @param Builder The DIBuilder. * @param Name Enumerator name. Modified: llvm/trunk/lib/IR/DebugInfo.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/IR/DebugInfo.cpp?rev=373903&r1=373902&r2=373903&view=diff ============================================================================== --- llvm/trunk/lib/IR/DebugInfo.cpp (original) +++ llvm/trunk/lib/IR/DebugInfo.cpp Mon Oct 7 06:57:13 2019 @@ -929,6 +929,26 @@ const char *LLVMDIFileGetSource(LLVMMeta return ""; } +LLVMMetadataRef LLVMDIBuilderCreateMacro(LLVMDIBuilderRef Builder, + LLVMMetadataRef ParentMacroFile, + unsigned Line, + LLVMDWARFMacinfoRecordType RecordType, + const char *Name, size_t NameLen, + const char *Value, size_t ValueLen) { + return wrap( + unwrap(Builder)->createMacro(unwrapDI<DIMacroFile>(ParentMacroFile), Line, + static_cast<MacinfoRecordType>(RecordType), + {Name, NameLen}, {Value, ValueLen})); +} + +LLVMMetadataRef +LLVMDIBuilderCreateTempMacroFile(LLVMDIBuilderRef Builder, + LLVMMetadataRef ParentMacroFile, unsigned Line, + LLVMMetadataRef File) { + return wrap(unwrap(Builder)->createTempMacroFile( + unwrapDI<DIMacroFile>(ParentMacroFile), Line,
unwrapDI<DIFile>(File))); +} + LLVMMetadataRef LLVMDIBuilderCreateEnumerator(LLVMDIBuilderRef Builder, const char *Name, size_t NameLen, int64_t Value, Modified: llvm/trunk/test/Bindings/llvm-c/debug_info.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Bindings/llvm-c/debug_info.ll?rev=373903&r1=373902&r2=373903&view=diff ============================================================================== --- llvm/trunk/test/Bindings/llvm-c/debug_info.ll (original) +++ llvm/trunk/test/Bindings/llvm-c/debug_info.ll Mon Oct 7 06:57:13 2019 @@ -3,13 +3,13 @@ ; CHECK: ; ModuleID = 'debuginfo.c' ; CHECK-NEXT: source_filename = "debuginfo.c" -; CHECK: define i64 @foo(i64 %0, i64 %1, <10 x i64> %2) !dbg !20 { +; CHECK: define i64 @foo(i64 %0, i64 %1, <10 x i64> %2) !dbg !31 { ; CHECK-NEXT: entry: -; CHECK-NEXT: call void @llvm.dbg.declare(metadata i64 0, metadata !27, metadata !DIExpression()), !dbg !32 -; CHECK-NEXT: call void @llvm.dbg.declare(metadata i64 0, metadata !28, metadata !DIExpression()), !dbg !32 -; CHECK-NEXT: call void @llvm.dbg.declare(metadata i64 0, metadata !29, metadata !DIExpression()), !dbg !32 +; CHECK-NEXT: call void @llvm.dbg.declare(metadata i64 0, metadata !38, metadata !DIExpression()), !dbg !43 +; CHECK-NEXT: call void @llvm.dbg.declare(metadata i64 0, metadata !39, metadata !DIExpression()), !dbg !43 +; CHECK-NEXT: call void @llvm.dbg.declare(metadata i64 0, metadata !40, metadata !DIExpression()), !dbg !43 ; CHECK: vars: ; No predecessors!
-; CHECK-NEXT: call void @llvm.dbg.value(metadata i64 0, metadata !30, metadata !DIExpression(DW_OP_constu, 0, DW_OP_stack_value)), !dbg !33 +; CHECK-NEXT: call void @llvm.dbg.value(metadata i64 0, metadata !41, metadata !DIExpression(DW_OP_constu, 0, DW_OP_stack_value)), !dbg !44 ; CHECK-NEXT: } ; CHECK: ; Function Attrs: nounwind readnone speculatable @@ -21,39 +21,51 @@ ; CHECK: attributes #0 = { nounwind readnone speculatable willreturn } ; CHECK: !llvm.dbg.cu = !{!0} -; CHECK-NEXT: !FooType = !{!16} +; CHECK-NEXT: !FooType = !{!28} +; CHECK-NEXT: !EnumTest = !{!3} -; CHECK: !0 = distinct !DICompileUnit(language: DW_LANG_C, file: !1, producer: "llvm-c-test", isOptimized: false, runtimeVersion: 0, emissionKind: FullDebug, enums: !2, globals: !3, imports: !12, splitDebugInlining: false) +; CHECK: !0 = distinct !DICompileUnit(language: DW_LANG_C, file: !1, producer: "llvm-c-test", isOptimized: false, runtimeVersion: 0, emissionKind: FullDebug, enums: !2, globals: !11, imports: !19, macros: !23, splitDebugInlining: false) ; CHECK-NEXT: !1 = !DIFile(filename: "debuginfo.c", directory: ".") -; CHECK-NEXT: !2 = !{} -; CHECK-NEXT: !3 = !{!4, !8} -; CHECK-NEXT: !4 = !DIGlobalVariableExpression(var: !5, expr: !DIExpression(DW_OP_constu, 0, DW_OP_stack_value)) -; CHECK-NEXT: !5 = distinct !DIGlobalVariable(name: "globalClass", scope: !6, file: !1, line: 1, type: !7, isLocal: true, isDefinition: true) -; CHECK-NEXT: !6 = !DIModule(scope: null, name: "llvm-c-test", includePath: "/test/include/llvm-c-test.h") -; CHECK-NEXT: !7 = !DICompositeType(tag: DW_TAG_structure_type, name: "TestClass", scope: !1, file: !1, line: 42, size: 64, flags: DIFlagObjcClassComplete, elements: !2) -; CHECK-NEXT: !8 = !DIGlobalVariableExpression(var: !9, expr: !DIExpression(DW_OP_constu, 0, DW_OP_stack_value)) -; CHECK-NEXT: !9 = distinct !DIGlobalVariable(name: "global", scope: !6, file: !1, line: 1, type: !10, isLocal: true, isDefinition: true) -; CHECK-NEXT: !10 = !DIDerivedType(tag: 
DW_TAG_typedef, name: "int64_t", scope: !1, file: !1, line: 42, baseType: !11) -; CHECK-NEXT: !11 = !DIBasicType(name: "Int64", size: 64) -; CHECK-NEXT: !12 = !{!13, !15} -; CHECK-NEXT: !13 = !DIImportedEntity(tag: DW_TAG_imported_module, scope: !6, entity: !14, file: !1, line: 42) -; CHECK-NEXT: !14 = !DIModule(scope: null, name: "llvm-c-test-import", includePath: "/test/include/llvm-c-test-import.h") -; CHECK-NEXT: !15 = !DIImportedEntity(tag: DW_TAG_imported_module, scope: !6, entity: !13, file: !1, line: 42) -; CHECK-NEXT: !16 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !17, size: 192, dwarfAddressSpace: 0) -; CHECK-NEXT: !17 = !DICompositeType(tag: DW_TAG_structure_type, name: "MyStruct", scope: !18, file: !1, size: 192, elements: !19, runtimeLang: DW_LANG_C89, identifier: "MyStruct") -; CHECK-NEXT: !18 = !DINamespace(name: "NameSpace", scope: !6) -; CHECK-NEXT: !19 = !{!11, !11, !11} -; CHECK-NEXT: !20 = distinct !DISubprogram(name: "foo", linkageName: "foo", scope: !1, file: !1, line: 42, type: !21, scopeLine: 42, spFlags: DISPFlagLocalToUnit | DISPFlagDefinition, unit: !0, retainedNodes: !26) -; CHECK-NEXT: !21 = !DISubroutineType(types: !22) -; CHECK-NEXT: !22 = !{!11, !11, !23} -; CHECK-NEXT: !23 = !DICompositeType(tag: DW_TAG_array_type, baseType: !11, size: 640, flags: DIFlagVector, elements: !24) -; CHECK-NEXT: !24 = !{!25} -; CHECK-NEXT: !25 = !DISubrange(count: 10) -; CHECK-NEXT: !26 = !{!27, !28, !29, !30} -; CHECK-NEXT: !27 = !DILocalVariable(name: "a", arg: 1, scope: !20, file: !1, line: 42, type: !11) -; CHECK-NEXT: !28 = !DILocalVariable(name: "b", arg: 2, scope: !20, file: !1, line: 42, type: !11) -; CHECK-NEXT: !29 = !DILocalVariable(name: "c", arg: 3, scope: !20, file: !1, line: 42, type: !23) -; CHECK-NEXT: !30 = !DILocalVariable(name: "d", scope: !31, file: !1, line: 43, type: !11) -; CHECK-NEXT: !31 = distinct !DILexicalBlock(scope: !20, file: !1, line: 42) -; CHECK-NEXT: !32 = !DILocation(line: 42, scope: !20) -; CHECK-NEXT: !33 
= !DILocation(line: 43, scope: !20) +; CHECK-NEXT: !2 = !{!3} +; CHECK-NEXT: !3 = !DICompositeType(tag: DW_TAG_enumeration_type, name: "EnumTest", scope: !4, file: !1, baseType: !6, size: 64, elements: !7) +; CHECK-NEXT: !4 = !DINamespace(name: "NameSpace", scope: !5) +; CHECK-NEXT: !5 = !DIModule(scope: null, name: "llvm-c-test", includePath: "/test/include/llvm-c-test.h") +; CHECK-NEXT: !6 = !DIBasicType(name: "Int64", size: 64) +; CHECK-NEXT: !7 = !{!8, !9, !10} +; CHECK-NEXT: !8 = !DIEnumerator(name: "Test_A", value: 0, isUnsigned: true) +; CHECK-NEXT: !9 = !DIEnumerator(name: "Test_B", value: 1, isUnsigned: true) +; CHECK-NEXT: !10 = !DIEnumerator(name: "Test_B", value: 2, isUnsigned: true) +; CHECK-NEXT: !11 = !{!12, !16} +; CHECK-NEXT: !12 = !DIGlobalVariableExpression(var: !13, expr: !DIExpression(DW_OP_constu, 0, DW_OP_stack_value)) +; CHECK-NEXT: !13 = distinct !DIGlobalVariable(name: "globalClass", scope: !5, file: !1, line: 1, type: !14, isLocal: true, isDefinition: true) +; CHECK-NEXT: !14 = !DICompositeType(tag: DW_TAG_structure_type, name: "TestClass", scope: !1, file: !1, line: 42, size: 64, flags: DIFlagObjcClassComplete, elements: !15) +; CHECK-NEXT: !15 = !{} +; CHECK-NEXT: !16 = !DIGlobalVariableExpression(var: !17, expr: !DIExpression(DW_OP_constu, 0, DW_OP_stack_value)) +; CHECK-NEXT: !17 = distinct !DIGlobalVariable(name: "global", scope: !5, file: !1, line: 1, type: !18, isLocal: true, isDefinition: true) +; CHECK-NEXT: !18 = !DIDerivedType(tag: DW_TAG_typedef, name: "int64_t", scope: !1, file: !1, line: 42, baseType: !6) +; CHECK-NEXT: !19 = !{!20, !22} +; CHECK-NEXT: !20 = !DIImportedEntity(tag: DW_TAG_imported_module, scope: !5, entity: !21, file: !1, line: 42) +; CHECK-NEXT: !21 = !DIModule(scope: null, name: "llvm-c-test-import", includePath: "/test/include/llvm-c-test-import.h") +; CHECK-NEXT: !22 = !DIImportedEntity(tag: DW_TAG_imported_module, scope: !5, entity: !20, file: !1, line: 42) +; CHECK-NEXT: !23 = !{!24} +; CHECK-NEXT: !24 
= !DIMacroFile(file: !1, nodes: !25) +; CHECK-NEXT: !25 = !{!26, !27} +; CHECK-NEXT: !26 = !DIMacro(type: DW_MACINFO_define, name: "SIMPLE_DEFINE") +; CHECK-NEXT: !27 = !DIMacro(type: DW_MACINFO_define, name: "VALUE_DEFINE", value: "1") +; CHECK-NEXT: !28 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !29, size: 192, dwarfAddressSpace: 0) +; CHECK-NEXT: !29 = !DICompositeType(tag: DW_TAG_structure_type, name: "MyStruct", scope: !4, file: !1, size: 192, elements: !30, runtimeLang: DW_LANG_C89, identifier: "MyStruct") +; CHECK-NEXT: !30 = !{!6, !6, !6} +; CHECK-NEXT: !31 = distinct !DISubprogram(name: "foo", linkageName: "foo", scope: !1, file: !1, line: 42, type: !32, scopeLine: 42, spFlags: DISPFlagLocalToUnit | DISPFlagDefinition, unit: !0, retainedNodes: !37) +; CHECK-NEXT: !32 = !DISubroutineType(types: !33) +; CHECK-NEXT: !33 = !{!6, !6, !34} +; CHECK-NEXT: !34 = !DICompositeType(tag: DW_TAG_array_type, baseType: !6, size: 640, flags: DIFlagVector, elements: !35) +; CHECK-NEXT: !35 = !{!36} +; CHECK-NEXT: !36 = !DISubrange(count: 10) +; CHECK-NEXT: !37 = !{!38, !39, !40, !41} +; CHECK-NEXT: !38 = !DILocalVariable(name: "a", arg: 1, scope: !31, file: !1, line: 42, type: !6) +; CHECK-NEXT: !39 = !DILocalVariable(name: "b", arg: 2, scope: !31, file: !1, line: 42, type: !6) +; CHECK-NEXT: !40 = !DILocalVariable(name: "c", arg: 3, scope: !31, file: !1, line: 42, type: !34) +; CHECK-NEXT: !41 = !DILocalVariable(name: "d", scope: !42, file: !1, line: 43, type: !6) +; CHECK-NEXT: !42 = distinct !DILexicalBlock(scope: !31, file: !1, line: 42) +; CHECK-NEXT: !43 = !DILocation(line: 42, scope: !31) +; CHECK-NEXT: !44 = !DILocation(line: 43, scope: !31) Modified: llvm/trunk/tools/llvm-c-test/debuginfo.c URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-c-test/debuginfo.c?rev=373903&r1=373902&r2=373903&view=diff ============================================================================== --- llvm/trunk/tools/llvm-c-test/debuginfo.c (original) +++ 
llvm/trunk/tools/llvm-c-test/debuginfo.c Mon Oct 7 06:57:13 2019 @@ -170,6 +170,27 @@ int llvm_test_dibuilder(void) { LLVMDIBuilderInsertDbgValueAtEnd(DIB, FooVal1, FooVar1, FooVarValueExpr, FooVarsLocation, FooVarBlock); + LLVMMetadataRef MacroFile = + LLVMDIBuilderCreateTempMacroFile(DIB, NULL, 0, File); + LLVMDIBuilderCreateMacro(DIB, MacroFile, 0, LLVMDWARFMacinfoRecordTypeDefine, + "SIMPLE_DEFINE", 13, NULL, 0); + LLVMDIBuilderCreateMacro(DIB, MacroFile, 0, LLVMDWARFMacinfoRecordTypeDefine, + "VALUE_DEFINE", 12, "1", 1); + + LLVMMetadataRef EnumeratorTestA = + LLVMDIBuilderCreateEnumerator(DIB, "Test_A", strlen("Test_A"), 0, true); + LLVMMetadataRef EnumeratorTestB = + LLVMDIBuilderCreateEnumerator(DIB, "Test_B", strlen("Test_B"), 1, true); + LLVMMetadataRef EnumeratorTestC = + LLVMDIBuilderCreateEnumerator(DIB, "Test_B", strlen("Test_C"), 2, true); + LLVMMetadataRef EnumeratorsTest[] = {EnumeratorTestA, EnumeratorTestB, + EnumeratorTestC}; + LLVMMetadataRef EnumTest = LLVMDIBuilderCreateEnumerationType( + DIB, NameSpace, "EnumTest", strlen("EnumTest"), File, 0, 64, 0, + EnumeratorsTest, 3, Int64Ty); + LLVMAddNamedMetadataOperand( + M, "EnumTest", LLVMMetadataAsValue(LLVMGetModuleContext(M), EnumTest)); + LLVMDIBuilderFinalize(DIB); char *MStr = LLVMPrintModuleToString(M); From llvm-commits at lists.llvm.org Mon Oct 7 06:56:15 2019 From: llvm-commits at lists.llvm.org (Etienne Bergeron via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 13:56:15 +0000 (UTC) Subject: [PATCH] D21101: [exceptions] Upgrade exception handlers when stack protector is used In-Reply-To: References: Message-ID: <2d166b3d16e0a31b2f485dbb0fab16b1@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rG47cf4eabe6e2: [exceptions] Upgrade exception handlers when stack protector is used (authored by etienneb). Herald added a subscriber: hiraditya. Herald added a project: LLVM. 
Changed prior to commit: https://reviews.llvm.org/D21101?vs=62373&id=223576#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D21101/new/ https://reviews.llvm.org/D21101 Files: llvm/lib/CodeGen/WinEHPrepare.cpp llvm/test/CodeGen/WinEH/wineh-promote-eh.ll Index: llvm/test/CodeGen/WinEH/wineh-promote-eh.ll =================================================================== --- /dev/null +++ llvm/test/CodeGen/WinEH/wineh-promote-eh.ll @@ -0,0 +1,16 @@ +; RUN: opt -mtriple=i686-windows-msvc -S -winehprepare %s | FileCheck %s + +declare i32 @_except_handler3(...) + +define void @test1a() personality i32 (...)* @_except_handler3 { +; CHECK: define void @test1a() personality i32 (...)* @_except_handler3 +entry: + ret void +} + +define void @test1b() ssp personality i32 (...)* @_except_handler3 { +; CHECK: define void @test1b() [[attr:.*]] personality i32 (...)* @_except_handler4 +entry: + ret void +} + Index: llvm/lib/CodeGen/WinEHPrepare.cpp =================================================================== --- llvm/lib/CodeGen/WinEHPrepare.cpp +++ llvm/lib/CodeGen/WinEHPrepare.cpp @@ -20,6 +20,7 @@ #include "llvm/ADT/DenseMap.h" #include "llvm/ADT/MapVector.h" #include "llvm/ADT/STLExtras.h" +#include "llvm/ADT/StringSwitch.h" #include "llvm/Analysis/CFG.h" #include "llvm/Analysis/EHPersonalities.h" #include "llvm/CodeGen/MachineBasicBlock.h" @@ -67,6 +68,7 @@ } private: + void promoteEHPersonality(Function &F); void insertPHIStores(PHINode *OriginalPHI, AllocaInst *SpillSlot); void insertPHIStore(BasicBlock *PredBlock, Value *PredVal, AllocaInst *SpillSlot, @@ -464,6 +466,39 @@ return FuncInfo.ClrEHUnwindMap.size() - 1; } +static Value *getStackGuardEHPersonality(Value *Pers) { + Function *F = + Pers ? dyn_cast<Function>(Pers->stripPointerCasts()) : nullptr; + if (!F) + return nullptr; + + // TODO(etienneb): Upgrade exception handlers when they are working.
+ StringRef NewName = llvm::StringSwitch<StringRef>(F->getName()) + .Case("_except_handler3", "_except_handler4") + .Default(""); + if (NewName.empty()) + return nullptr; + + Module *M = F->getParent(); + return M->getOrInsertFunction("_except_handler4", F->getFunctionType(), + F->getAttributes()); +} + +void WinEHPrepare::promoteEHPersonality(Function &F) { + // Promote the exception handler when stack protection is activated. + if (!F.hasFnAttribute(Attribute::StackProtect) && + !F.hasFnAttribute(Attribute::StackProtectReq) && + !F.hasFnAttribute(Attribute::StackProtectStrong)) + return; + + if (Value *PersonalityFn = F.getPersonalityFn()) { + if (Value *Personality = getStackGuardEHPersonality(PersonalityFn)) { + Function* PromotedFn = cast<Function>(Personality); + F.setPersonalityFn(PromotedFn); + } + } +} + void llvm::calculateClrEHStateNumbers(const Function *Fn, WinEHFuncInfo &FuncInfo) { // Return if it's already been done. @@ -1028,6 +1063,10 @@ } bool WinEHPrepare::prepareExplicitEH(Function &F) { + // When stack-protector is present, some exception handlers need to be + // promoted to a compatible handlers. + promoteEHPersonality(F); + // Remove unreachable blocks. It is not valuable to assign them a color and // their existence can trick us into thinking values are alive when they are // not. -------------- next part -------------- A non-text attachment was scrubbed... Name: D21101.223576.patch Type: text/x-patch Size: 3147 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 06:56:22 2019 From: llvm-commits at lists.llvm.org (whitequark via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 13:56:22 +0000 (UTC) Subject: [PATCH] D58334: [LLVM-C] Add bindings to create macro debug info In-Reply-To: References: Message-ID: <806321ccd2f12e9d95c2d58669410418@localhost.localdomain> whitequark added a comment. Done (rG373903).
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D58334/new/ https://reviews.llvm.org/D58334 From llvm-commits at lists.llvm.org Mon Oct 7 06:57:48 2019 From: llvm-commits at lists.llvm.org (Simon Atanasyan via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 13:57:48 +0000 (UTC) Subject: [PATCH] D68548: [Mips] Fix evaluating J-format branch targets In-Reply-To: References: Message-ID: <48e8a4a3812cad191beeeb2eda5d52d7@localhost.localdomain> atanasyan accepted this revision. atanasyan added a comment. This revision is now accepted and ready to land. LGTM. Thanks for the patch. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68548/new/ https://reviews.llvm.org/D68548 From llvm-commits at lists.llvm.org Mon Oct 7 06:57:58 2019 From: llvm-commits at lists.llvm.org (Simon Atanasyan via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 13:57:58 +0000 (UTC) Subject: [PATCH] D68542: [Mips] Always save RA when disabling frame pointer elimination In-Reply-To: References: Message-ID: <2fbae9b977d97dec559bd8520933ae6c@localhost.localdomain> atanasyan accepted this revision. atanasyan added a comment. This revision is now accepted and ready to land. LGTM. Thanks for the patch. 
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68542/new/ https://reviews.llvm.org/D68542 From llvm-commits at lists.llvm.org Mon Oct 7 07:01:22 2019 From: llvm-commits at lists.llvm.org (Simon Atanasyan via llvm-commits) Date: Mon, 07 Oct 2019 14:01:22 -0000 Subject: [llvm] r373906 - [Mips] Fix evaluating J-format branch targets Message-ID: <20191007140122.78FA28B8A8@lists.llvm.org> Author: atanasyan Date: Mon Oct 7 07:01:22 2019 New Revision: 373906 URL: http://llvm.org/viewvc/llvm-project?rev=373906&view=rev Log: [Mips] Fix evaluating J-format branch targets J/JAL/JALX/JALS are absolute branches, but stay within the current 256 MB-aligned region, so we must include the high bits of the instruction address when calculating the branch target. Patch by James Clarke. Differential Revision: https://reviews.llvm.org/D68548 Added: llvm/trunk/test/MC/Mips/micromips-jump-pc-region.s llvm/trunk/test/MC/Mips/mips-jump-pc-region.s Modified: llvm/trunk/lib/Target/Mips/MCTargetDesc/MipsMCTargetDesc.cpp Modified: llvm/trunk/lib/Target/Mips/MCTargetDesc/MipsMCTargetDesc.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/Mips/MCTargetDesc/MipsMCTargetDesc.cpp?rev=373906&r1=373905&r2=373906&view=diff ============================================================================== --- llvm/trunk/lib/Target/Mips/MCTargetDesc/MipsMCTargetDesc.cpp (original) +++ llvm/trunk/lib/Target/Mips/MCTargetDesc/MipsMCTargetDesc.cpp Mon Oct 7 07:01:22 2019 @@ -143,12 +143,15 @@ public: return false; switch (Info->get(Inst.getOpcode()).OpInfo[NumOps - 1].OperandType) { case MCOI::OPERAND_UNKNOWN: - case MCOI::OPERAND_IMMEDIATE: - // jal, bal ... 
- Target = Inst.getOperand(NumOps - 1).getImm(); + case MCOI::OPERAND_IMMEDIATE: { + // j, jal, jalx, jals + // Absolute branch within the current 256 MB-aligned region + uint64_t Region = Addr & ~uint64_t(0xfffffff); + Target = Region + Inst.getOperand(NumOps - 1).getImm(); return true; + } case MCOI::OPERAND_PCREL: - // b, j, beq ... + // b, beq ... Target = Addr + Inst.getOperand(NumOps - 1).getImm(); return true; default: Added: llvm/trunk/test/MC/Mips/micromips-jump-pc-region.s URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/MC/Mips/micromips-jump-pc-region.s?rev=373906&view=auto ============================================================================== --- llvm/trunk/test/MC/Mips/micromips-jump-pc-region.s (added) +++ llvm/trunk/test/MC/Mips/micromips-jump-pc-region.s Mon Oct 7 07:01:22 2019 @@ -0,0 +1,17 @@ +# RUN: llvm-mc -triple=mips -mcpu=mips32 -mattr=+micromips -filetype=obj < %s \ +# RUN: | llvm-objdump -d - | FileCheck %s + +.set noreorder + +# Force us into the second 256 MB region with a non-zero instruction index +.org 256*1024*1024 + 12 +# CHECK-LABEL: 1000000c foo: +# CHECK-NEXT: 1000000c: d4 00 00 06 j 12 +# CHECK-NEXT: 10000010: f4 00 00 08 jal 16 +# CHECK-NEXT: 10000014: f0 00 00 05 jalx 20 +# CHECK-NEXT: 10000018: 74 00 00 0c jals 24 +foo: + j 12 + jal 16 + jalx 20 + jals 24 Added: llvm/trunk/test/MC/Mips/mips-jump-pc-region.s URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/MC/Mips/mips-jump-pc-region.s?rev=373906&view=auto ============================================================================== --- llvm/trunk/test/MC/Mips/mips-jump-pc-region.s (added) +++ llvm/trunk/test/MC/Mips/mips-jump-pc-region.s Mon Oct 7 07:01:22 2019 @@ -0,0 +1,17 @@ +# RUN: llvm-mc -triple=mips -mcpu=mips32 -filetype=obj < %s \ +# RUN: | llvm-objdump -d - | FileCheck %s +# RUN: llvm-mc -triple=mips64 -mcpu=mips64 -filetype=obj < %s \ +# RUN: | llvm-objdump -d - | FileCheck %s + +.set noreorder + +# Force us into the second 256 MB region 
with a non-zero instruction index +.org 256*1024*1024 + 12 +# CHECK-LABEL: 1000000c foo: +# CHECK-NEXT: 1000000c: 08 00 00 03 j 12 +# CHECK-NEXT: 10000010: 0c 00 00 04 jal 16 +# CHECK-NEXT: 10000014: 74 00 00 05 jalx 20 +foo: + j 12 + jal 16 + jalx 20 From llvm-commits at lists.llvm.org Mon Oct 7 06:59:05 2019 From: llvm-commits at lists.llvm.org (Digger via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 13:59:05 +0000 (UTC) Subject: [PATCH] D68575: implement parsing overflow section Message-ID: DiggerLin created this revision. DiggerLin added reviewers: sfertile, hubert.reinterpretcast, jasonliu. Herald added subscribers: llvm-commits, seiya, rupprecht. Herald added a project: LLVM. In XCOFF, if the number of relocation entries or line number entries of a section overflows (is greater than or equal to 65535), an overflow section is emitted for it. An overflow section header is interpreted differently from a generic section header; this patch implements parsing of the overflow section. Repository: rL LLVM https://reviews.llvm.org/D68575 Files: llvm/test/tools/llvm-readobj/Inputs/xcoff-reloc-overflow.o llvm/test/tools/llvm-readobj/xcoff-overflow-section.test llvm/tools/llvm-readobj/XCOFFDumper.cpp -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68575.223570.patch Type: text/x-patch Size: 5927 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 07:01:38 2019 From: llvm-commits at lists.llvm.org (Simon Atanasyan via llvm-commits) Date: Mon, 07 Oct 2019 14:01:38 -0000 Subject: [llvm] r373907 - [Mips] Always save RA when disabling frame pointer elimination Message-ID: <20191007140138.164FD8C5EC@lists.llvm.org> Author: atanasyan Date: Mon Oct 7 07:01:37 2019 New Revision: 373907 URL: http://llvm.org/viewvc/llvm-project?rev=373907&view=rev Log: [Mips] Always save RA when disabling frame pointer elimination This ensures that frame-based unwinding will continue to work when calling a noreturn function; there is not much use having the caller's frame pointer saved if you don't also have the caller's program counter. Patch by James Clarke. Differential Revision: https://reviews.llvm.org/D68542 Added: llvm/trunk/test/CodeGen/Mips/no-frame-pointer-elim.ll Modified: llvm/trunk/lib/Target/Mips/MipsSEFrameLowering.cpp llvm/trunk/test/CodeGen/Mips/cconv/vector.ll llvm/trunk/test/CodeGen/Mips/dynamic-stack-realignment.ll llvm/trunk/test/CodeGen/Mips/frame-address.ll llvm/trunk/test/CodeGen/Mips/tnaked.ll llvm/trunk/test/CodeGen/Mips/v2i16tof32.ll Modified: llvm/trunk/lib/Target/Mips/MipsSEFrameLowering.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/Mips/MipsSEFrameLowering.cpp?rev=373907&r1=373906&r2=373907&view=diff ============================================================================== --- llvm/trunk/lib/Target/Mips/MipsSEFrameLowering.cpp (original) +++ llvm/trunk/lib/Target/Mips/MipsSEFrameLowering.cpp Mon Oct 7 07:01:37 2019 @@ -865,12 +865,15 @@ void MipsSEFrameLowering::determineCalle const TargetRegisterInfo *TRI = MF.getSubtarget().getRegisterInfo(); MipsFunctionInfo *MipsFI = MF.getInfo<MipsFunctionInfo>(); MipsABIInfo ABI = STI.getABI(); + unsigned RA = ABI.IsN64() ? Mips::RA_64 : Mips::RA; unsigned FP = ABI.GetFramePtr(); unsigned BP = ABI.IsN64() ? 
Mips::S7_64 : Mips::S7; - // Mark $fp as used if function has dedicated frame pointer. - if (hasFP(MF)) + // Mark $ra and $fp as used if function has dedicated frame pointer. + if (hasFP(MF)) { + setAliasRegs(MF, SavedRegs, RA); setAliasRegs(MF, SavedRegs, FP); + } // Mark $s7 as used if function has dedicated base pointer. if (hasBP(MF)) setAliasRegs(MF, SavedRegs, BP); Modified: llvm/trunk/test/CodeGen/Mips/cconv/vector.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/Mips/cconv/vector.ll?rev=373907&r1=373906&r2=373907&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/Mips/cconv/vector.ll (original) +++ llvm/trunk/test/CodeGen/Mips/cconv/vector.ll Mon Oct 7 07:01:37 2019 @@ -50,23 +50,25 @@ define <2 x i8> @i8_2(<2 x i8> %a, <2 x ; ; MIPS32R5EB-LABEL: i8_2: ; MIPS32R5EB: # %bb.0: -; MIPS32R5EB-NEXT: addiu $sp, $sp, -48 -; MIPS32R5EB-NEXT: .cfi_def_cfa_offset 48 -; MIPS32R5EB-NEXT: sw $fp, 44($sp) # 4-byte Folded Spill -; MIPS32R5EB-NEXT: .cfi_offset 30, -4 +; MIPS32R5EB-NEXT: addiu $sp, $sp, -64 +; MIPS32R5EB-NEXT: .cfi_def_cfa_offset 64 +; MIPS32R5EB-NEXT: sw $ra, 60($sp) # 4-byte Folded Spill +; MIPS32R5EB-NEXT: sw $fp, 56($sp) # 4-byte Folded Spill +; MIPS32R5EB-NEXT: .cfi_offset 31, -4 +; MIPS32R5EB-NEXT: .cfi_offset 30, -8 ; MIPS32R5EB-NEXT: move $fp, $sp ; MIPS32R5EB-NEXT: .cfi_def_cfa_register 30 ; MIPS32R5EB-NEXT: addiu $1, $zero, -16 ; MIPS32R5EB-NEXT: and $sp, $sp, $1 -; MIPS32R5EB-NEXT: sw $5, 36($sp) -; MIPS32R5EB-NEXT: sw $4, 40($sp) -; MIPS32R5EB-NEXT: lbu $1, 37($sp) +; MIPS32R5EB-NEXT: sw $5, 48($sp) +; MIPS32R5EB-NEXT: sw $4, 52($sp) +; MIPS32R5EB-NEXT: lbu $1, 49($sp) ; MIPS32R5EB-NEXT: sw $1, 28($sp) -; MIPS32R5EB-NEXT: lbu $1, 36($sp) +; MIPS32R5EB-NEXT: lbu $1, 48($sp) ; MIPS32R5EB-NEXT: sw $1, 20($sp) -; MIPS32R5EB-NEXT: lbu $1, 41($sp) +; MIPS32R5EB-NEXT: lbu $1, 53($sp) ; MIPS32R5EB-NEXT: sw $1, 12($sp) -; MIPS32R5EB-NEXT: lbu $1, 40($sp) +; 
MIPS32R5EB-NEXT: lbu $1, 52($sp) ; MIPS32R5EB-NEXT: sw $1, 4($sp) ; MIPS32R5EB-NEXT: ld.d $w0, 16($sp) ; MIPS32R5EB-NEXT: ld.d $w1, 0($sp) @@ -74,12 +76,13 @@ define <2 x i8> @i8_2(<2 x i8> %a, <2 x ; MIPS32R5EB-NEXT: shf.w $w0, $w0, 177 ; MIPS32R5EB-NEXT: copy_s.w $1, $w0[1] ; MIPS32R5EB-NEXT: copy_s.w $2, $w0[3] -; MIPS32R5EB-NEXT: sb $2, 33($sp) -; MIPS32R5EB-NEXT: sb $1, 32($sp) -; MIPS32R5EB-NEXT: lhu $2, 32($sp) +; MIPS32R5EB-NEXT: sb $2, 45($sp) +; MIPS32R5EB-NEXT: sb $1, 44($sp) +; MIPS32R5EB-NEXT: lhu $2, 44($sp) ; MIPS32R5EB-NEXT: move $sp, $fp -; MIPS32R5EB-NEXT: lw $fp, 44($sp) # 4-byte Folded Reload -; MIPS32R5EB-NEXT: addiu $sp, $sp, 48 +; MIPS32R5EB-NEXT: lw $fp, 56($sp) # 4-byte Folded Reload +; MIPS32R5EB-NEXT: lw $ra, 60($sp) # 4-byte Folded Reload +; MIPS32R5EB-NEXT: addiu $sp, $sp, 64 ; MIPS32R5EB-NEXT: jr $ra ; MIPS32R5EB-NEXT: nop ; @@ -151,35 +154,38 @@ define <2 x i8> @i8_2(<2 x i8> %a, <2 x ; ; MIPS32R5EL-LABEL: i8_2: ; MIPS32R5EL: # %bb.0: -; MIPS32R5EL-NEXT: addiu $sp, $sp, -48 -; MIPS32R5EL-NEXT: .cfi_def_cfa_offset 48 -; MIPS32R5EL-NEXT: sw $fp, 44($sp) # 4-byte Folded Spill -; MIPS32R5EL-NEXT: .cfi_offset 30, -4 +; MIPS32R5EL-NEXT: addiu $sp, $sp, -64 +; MIPS32R5EL-NEXT: .cfi_def_cfa_offset 64 +; MIPS32R5EL-NEXT: sw $ra, 60($sp) # 4-byte Folded Spill +; MIPS32R5EL-NEXT: sw $fp, 56($sp) # 4-byte Folded Spill +; MIPS32R5EL-NEXT: .cfi_offset 31, -4 +; MIPS32R5EL-NEXT: .cfi_offset 30, -8 ; MIPS32R5EL-NEXT: move $fp, $sp ; MIPS32R5EL-NEXT: .cfi_def_cfa_register 30 ; MIPS32R5EL-NEXT: addiu $1, $zero, -16 ; MIPS32R5EL-NEXT: and $sp, $sp, $1 -; MIPS32R5EL-NEXT: sw $5, 36($sp) -; MIPS32R5EL-NEXT: sw $4, 40($sp) -; MIPS32R5EL-NEXT: lbu $1, 37($sp) +; MIPS32R5EL-NEXT: sw $5, 48($sp) +; MIPS32R5EL-NEXT: sw $4, 52($sp) +; MIPS32R5EL-NEXT: lbu $1, 49($sp) ; MIPS32R5EL-NEXT: sw $1, 24($sp) -; MIPS32R5EL-NEXT: lbu $1, 36($sp) +; MIPS32R5EL-NEXT: lbu $1, 48($sp) ; MIPS32R5EL-NEXT: sw $1, 16($sp) -; MIPS32R5EL-NEXT: lbu $1, 41($sp) +; MIPS32R5EL-NEXT: 
lbu $1, 53($sp) ; MIPS32R5EL-NEXT: sw $1, 8($sp) -; MIPS32R5EL-NEXT: lbu $1, 40($sp) +; MIPS32R5EL-NEXT: lbu $1, 52($sp) ; MIPS32R5EL-NEXT: sw $1, 0($sp) ; MIPS32R5EL-NEXT: ld.d $w0, 16($sp) ; MIPS32R5EL-NEXT: ld.d $w1, 0($sp) ; MIPS32R5EL-NEXT: addv.d $w0, $w1, $w0 ; MIPS32R5EL-NEXT: copy_s.w $1, $w0[0] ; MIPS32R5EL-NEXT: copy_s.w $2, $w0[2] -; MIPS32R5EL-NEXT: sb $2, 33($sp) -; MIPS32R5EL-NEXT: sb $1, 32($sp) -; MIPS32R5EL-NEXT: lhu $2, 32($sp) +; MIPS32R5EL-NEXT: sb $2, 45($sp) +; MIPS32R5EL-NEXT: sb $1, 44($sp) +; MIPS32R5EL-NEXT: lhu $2, 44($sp) ; MIPS32R5EL-NEXT: move $sp, $fp -; MIPS32R5EL-NEXT: lw $fp, 44($sp) # 4-byte Folded Reload -; MIPS32R5EL-NEXT: addiu $sp, $sp, 48 +; MIPS32R5EL-NEXT: lw $fp, 56($sp) # 4-byte Folded Reload +; MIPS32R5EL-NEXT: lw $ra, 60($sp) # 4-byte Folded Reload +; MIPS32R5EL-NEXT: addiu $sp, $sp, 64 ; MIPS32R5EL-NEXT: jr $ra ; MIPS32R5EL-NEXT: nop ; @@ -312,36 +318,38 @@ define <2 x i8> @i8x2_7(<2 x i8> %a, <2 ; MIPS32R5EB: # %bb.0: # %entry ; MIPS32R5EB-NEXT: addiu $sp, $sp, -144 ; MIPS32R5EB-NEXT: .cfi_def_cfa_offset 144 -; MIPS32R5EB-NEXT: sw $fp, 140($sp) # 4-byte Folded Spill -; MIPS32R5EB-NEXT: .cfi_offset 30, -4 +; MIPS32R5EB-NEXT: sw $ra, 140($sp) # 4-byte Folded Spill +; MIPS32R5EB-NEXT: sw $fp, 136($sp) # 4-byte Folded Spill +; MIPS32R5EB-NEXT: .cfi_offset 31, -4 +; MIPS32R5EB-NEXT: .cfi_offset 30, -8 ; MIPS32R5EB-NEXT: move $fp, $sp ; MIPS32R5EB-NEXT: .cfi_def_cfa_register 30 ; MIPS32R5EB-NEXT: addiu $1, $zero, -16 ; MIPS32R5EB-NEXT: and $sp, $sp, $1 -; MIPS32R5EB-NEXT: sw $5, 132($sp) -; MIPS32R5EB-NEXT: sw $4, 136($sp) -; MIPS32R5EB-NEXT: lbu $1, 133($sp) +; MIPS32R5EB-NEXT: sw $5, 128($sp) +; MIPS32R5EB-NEXT: sw $4, 132($sp) +; MIPS32R5EB-NEXT: lbu $1, 129($sp) ; MIPS32R5EB-NEXT: sw $1, 76($sp) -; MIPS32R5EB-NEXT: lbu $1, 132($sp) +; MIPS32R5EB-NEXT: lbu $1, 128($sp) ; MIPS32R5EB-NEXT: sw $1, 68($sp) -; MIPS32R5EB-NEXT: lbu $1, 137($sp) +; MIPS32R5EB-NEXT: lbu $1, 133($sp) ; MIPS32R5EB-NEXT: sw $1, 60($sp) -; 
MIPS32R5EB-NEXT: lbu $1, 136($sp) +; MIPS32R5EB-NEXT: lbu $1, 132($sp) ; MIPS32R5EB-NEXT: sw $1, 52($sp) ; MIPS32R5EB-NEXT: ld.d $w0, 64($sp) ; MIPS32R5EB-NEXT: ld.d $w1, 48($sp) ; MIPS32R5EB-NEXT: addv.d $w0, $w1, $w0 -; MIPS32R5EB-NEXT: sw $6, 128($sp) -; MIPS32R5EB-NEXT: lbu $1, 129($sp) +; MIPS32R5EB-NEXT: sw $6, 124($sp) +; MIPS32R5EB-NEXT: lbu $1, 125($sp) ; MIPS32R5EB-NEXT: sw $1, 92($sp) -; MIPS32R5EB-NEXT: lbu $1, 128($sp) +; MIPS32R5EB-NEXT: lbu $1, 124($sp) ; MIPS32R5EB-NEXT: sw $1, 84($sp) ; MIPS32R5EB-NEXT: ld.d $w1, 80($sp) ; MIPS32R5EB-NEXT: addv.d $w0, $w0, $w1 -; MIPS32R5EB-NEXT: sw $7, 124($sp) -; MIPS32R5EB-NEXT: lbu $1, 125($sp) +; MIPS32R5EB-NEXT: sw $7, 120($sp) +; MIPS32R5EB-NEXT: lbu $1, 121($sp) ; MIPS32R5EB-NEXT: sw $1, 108($sp) -; MIPS32R5EB-NEXT: lbu $1, 124($sp) +; MIPS32R5EB-NEXT: lbu $1, 120($sp) ; MIPS32R5EB-NEXT: sw $1, 100($sp) ; MIPS32R5EB-NEXT: ld.d $w1, 96($sp) ; MIPS32R5EB-NEXT: addv.d $w0, $w0, $w1 @@ -366,11 +374,12 @@ define <2 x i8> @i8x2_7(<2 x i8> %a, <2 ; MIPS32R5EB-NEXT: shf.w $w0, $w0, 177 ; MIPS32R5EB-NEXT: copy_s.w $1, $w0[1] ; MIPS32R5EB-NEXT: copy_s.w $2, $w0[3] -; MIPS32R5EB-NEXT: sb $2, 121($sp) -; MIPS32R5EB-NEXT: sb $1, 120($sp) -; MIPS32R5EB-NEXT: lhu $2, 120($sp) +; MIPS32R5EB-NEXT: sb $2, 117($sp) +; MIPS32R5EB-NEXT: sb $1, 116($sp) +; MIPS32R5EB-NEXT: lhu $2, 116($sp) ; MIPS32R5EB-NEXT: move $sp, $fp -; MIPS32R5EB-NEXT: lw $fp, 140($sp) # 4-byte Folded Reload +; MIPS32R5EB-NEXT: lw $fp, 136($sp) # 4-byte Folded Reload +; MIPS32R5EB-NEXT: lw $ra, 140($sp) # 4-byte Folded Reload ; MIPS32R5EB-NEXT: addiu $sp, $sp, 144 ; MIPS32R5EB-NEXT: jr $ra ; MIPS32R5EB-NEXT: nop @@ -550,36 +559,38 @@ define <2 x i8> @i8x2_7(<2 x i8> %a, <2 ; MIPS32R5EL: # %bb.0: # %entry ; MIPS32R5EL-NEXT: addiu $sp, $sp, -144 ; MIPS32R5EL-NEXT: .cfi_def_cfa_offset 144 -; MIPS32R5EL-NEXT: sw $fp, 140($sp) # 4-byte Folded Spill -; MIPS32R5EL-NEXT: .cfi_offset 30, -4 +; MIPS32R5EL-NEXT: sw $ra, 140($sp) # 4-byte Folded Spill +; 
MIPS32R5EL-NEXT: sw $fp, 136($sp) # 4-byte Folded Spill +; MIPS32R5EL-NEXT: .cfi_offset 31, -4 +; MIPS32R5EL-NEXT: .cfi_offset 30, -8 ; MIPS32R5EL-NEXT: move $fp, $sp ; MIPS32R5EL-NEXT: .cfi_def_cfa_register 30 ; MIPS32R5EL-NEXT: addiu $1, $zero, -16 ; MIPS32R5EL-NEXT: and $sp, $sp, $1 -; MIPS32R5EL-NEXT: sw $5, 132($sp) -; MIPS32R5EL-NEXT: sw $4, 136($sp) -; MIPS32R5EL-NEXT: lbu $1, 133($sp) +; MIPS32R5EL-NEXT: sw $5, 128($sp) +; MIPS32R5EL-NEXT: sw $4, 132($sp) +; MIPS32R5EL-NEXT: lbu $1, 129($sp) ; MIPS32R5EL-NEXT: sw $1, 72($sp) -; MIPS32R5EL-NEXT: lbu $1, 132($sp) +; MIPS32R5EL-NEXT: lbu $1, 128($sp) ; MIPS32R5EL-NEXT: sw $1, 64($sp) -; MIPS32R5EL-NEXT: lbu $1, 137($sp) +; MIPS32R5EL-NEXT: lbu $1, 133($sp) ; MIPS32R5EL-NEXT: sw $1, 56($sp) -; MIPS32R5EL-NEXT: lbu $1, 136($sp) +; MIPS32R5EL-NEXT: lbu $1, 132($sp) ; MIPS32R5EL-NEXT: sw $1, 48($sp) ; MIPS32R5EL-NEXT: ld.d $w0, 64($sp) ; MIPS32R5EL-NEXT: ld.d $w1, 48($sp) ; MIPS32R5EL-NEXT: addv.d $w0, $w1, $w0 -; MIPS32R5EL-NEXT: sw $6, 128($sp) -; MIPS32R5EL-NEXT: lbu $1, 129($sp) +; MIPS32R5EL-NEXT: sw $6, 124($sp) +; MIPS32R5EL-NEXT: lbu $1, 125($sp) ; MIPS32R5EL-NEXT: sw $1, 88($sp) -; MIPS32R5EL-NEXT: lbu $1, 128($sp) +; MIPS32R5EL-NEXT: lbu $1, 124($sp) ; MIPS32R5EL-NEXT: sw $1, 80($sp) ; MIPS32R5EL-NEXT: ld.d $w1, 80($sp) ; MIPS32R5EL-NEXT: addv.d $w0, $w0, $w1 -; MIPS32R5EL-NEXT: sw $7, 124($sp) -; MIPS32R5EL-NEXT: lbu $1, 125($sp) +; MIPS32R5EL-NEXT: sw $7, 120($sp) +; MIPS32R5EL-NEXT: lbu $1, 121($sp) ; MIPS32R5EL-NEXT: sw $1, 104($sp) -; MIPS32R5EL-NEXT: lbu $1, 124($sp) +; MIPS32R5EL-NEXT: lbu $1, 120($sp) ; MIPS32R5EL-NEXT: sw $1, 96($sp) ; MIPS32R5EL-NEXT: ld.d $w1, 96($sp) ; MIPS32R5EL-NEXT: addv.d $w0, $w0, $w1 @@ -603,11 +614,12 @@ define <2 x i8> @i8x2_7(<2 x i8> %a, <2 ; MIPS32R5EL-NEXT: addv.d $w0, $w0, $w1 ; MIPS32R5EL-NEXT: copy_s.w $1, $w0[0] ; MIPS32R5EL-NEXT: copy_s.w $2, $w0[2] -; MIPS32R5EL-NEXT: sb $2, 121($sp) -; MIPS32R5EL-NEXT: sb $1, 120($sp) -; MIPS32R5EL-NEXT: lhu $2, 120($sp) +; 
MIPS32R5EL-NEXT: sb $2, 117($sp) +; MIPS32R5EL-NEXT: sb $1, 116($sp) +; MIPS32R5EL-NEXT: lhu $2, 116($sp) ; MIPS32R5EL-NEXT: move $sp, $fp -; MIPS32R5EL-NEXT: lw $fp, 140($sp) # 4-byte Folded Reload +; MIPS32R5EL-NEXT: lw $fp, 136($sp) # 4-byte Folded Reload +; MIPS32R5EL-NEXT: lw $ra, 140($sp) # 4-byte Folded Reload ; MIPS32R5EL-NEXT: addiu $sp, $sp, 144 ; MIPS32R5EL-NEXT: jr $ra ; MIPS32R5EL-NEXT: nop @@ -952,8 +964,10 @@ define <8 x i8> @i8_8(<8 x i8> %a, <8 x ; MIPS32R5EB: # %bb.0: ; MIPS32R5EB-NEXT: addiu $sp, $sp, -48 ; MIPS32R5EB-NEXT: .cfi_def_cfa_offset 48 -; MIPS32R5EB-NEXT: sw $fp, 44($sp) # 4-byte Folded Spill -; MIPS32R5EB-NEXT: .cfi_offset 30, -4 +; MIPS32R5EB-NEXT: sw $ra, 44($sp) # 4-byte Folded Spill +; MIPS32R5EB-NEXT: sw $fp, 40($sp) # 4-byte Folded Spill +; MIPS32R5EB-NEXT: .cfi_offset 31, -4 +; MIPS32R5EB-NEXT: .cfi_offset 30, -8 ; MIPS32R5EB-NEXT: move $fp, $sp ; MIPS32R5EB-NEXT: .cfi_def_cfa_register 30 ; MIPS32R5EB-NEXT: addiu $1, $zero, -16 @@ -1019,7 +1033,8 @@ define <8 x i8> @i8_8(<8 x i8> %a, <8 x ; MIPS32R5EB-NEXT: copy_s.w $2, $w0[1] ; MIPS32R5EB-NEXT: copy_s.w $3, $w0[3] ; MIPS32R5EB-NEXT: move $sp, $fp -; MIPS32R5EB-NEXT: lw $fp, 44($sp) # 4-byte Folded Reload +; MIPS32R5EB-NEXT: lw $fp, 40($sp) # 4-byte Folded Reload +; MIPS32R5EB-NEXT: lw $ra, 44($sp) # 4-byte Folded Reload ; MIPS32R5EB-NEXT: addiu $sp, $sp, 48 ; MIPS32R5EB-NEXT: jr $ra ; MIPS32R5EB-NEXT: nop @@ -1088,8 +1103,10 @@ define <8 x i8> @i8_8(<8 x i8> %a, <8 x ; MIPS32R5EL: # %bb.0: ; MIPS32R5EL-NEXT: addiu $sp, $sp, -48 ; MIPS32R5EL-NEXT: .cfi_def_cfa_offset 48 -; MIPS32R5EL-NEXT: sw $fp, 44($sp) # 4-byte Folded Spill -; MIPS32R5EL-NEXT: .cfi_offset 30, -4 +; MIPS32R5EL-NEXT: sw $ra, 44($sp) # 4-byte Folded Spill +; MIPS32R5EL-NEXT: sw $fp, 40($sp) # 4-byte Folded Spill +; MIPS32R5EL-NEXT: .cfi_offset 31, -4 +; MIPS32R5EL-NEXT: .cfi_offset 30, -8 ; MIPS32R5EL-NEXT: move $fp, $sp ; MIPS32R5EL-NEXT: .cfi_def_cfa_register 30 ; MIPS32R5EL-NEXT: addiu $1, $zero, -16 @@ 
-1155,7 +1172,8 @@ define <8 x i8> @i8_8(<8 x i8> %a, <8 x ; MIPS32R5EL-NEXT: copy_s.w $2, $w0[0] ; MIPS32R5EL-NEXT: copy_s.w $3, $w0[2] ; MIPS32R5EL-NEXT: move $sp, $fp -; MIPS32R5EL-NEXT: lw $fp, 44($sp) # 4-byte Folded Reload +; MIPS32R5EL-NEXT: lw $fp, 40($sp) # 4-byte Folded Reload +; MIPS32R5EL-NEXT: lw $ra, 44($sp) # 4-byte Folded Reload ; MIPS32R5EL-NEXT: addiu $sp, $sp, 48 ; MIPS32R5EL-NEXT: jr $ra ; MIPS32R5EL-NEXT: nop @@ -1471,23 +1489,25 @@ define <2 x i16> @i16_2(<2 x i16> %a, <2 ; ; MIPS32R5EB-LABEL: i16_2: ; MIPS32R5EB: # %bb.0: -; MIPS32R5EB-NEXT: addiu $sp, $sp, -48 -; MIPS32R5EB-NEXT: .cfi_def_cfa_offset 48 -; MIPS32R5EB-NEXT: sw $fp, 44($sp) # 4-byte Folded Spill -; MIPS32R5EB-NEXT: .cfi_offset 30, -4 +; MIPS32R5EB-NEXT: addiu $sp, $sp, -64 +; MIPS32R5EB-NEXT: .cfi_def_cfa_offset 64 +; MIPS32R5EB-NEXT: sw $ra, 60($sp) # 4-byte Folded Spill +; MIPS32R5EB-NEXT: sw $fp, 56($sp) # 4-byte Folded Spill +; MIPS32R5EB-NEXT: .cfi_offset 31, -4 +; MIPS32R5EB-NEXT: .cfi_offset 30, -8 ; MIPS32R5EB-NEXT: move $fp, $sp ; MIPS32R5EB-NEXT: .cfi_def_cfa_register 30 ; MIPS32R5EB-NEXT: addiu $1, $zero, -16 ; MIPS32R5EB-NEXT: and $sp, $sp, $1 -; MIPS32R5EB-NEXT: sw $5, 36($sp) -; MIPS32R5EB-NEXT: sw $4, 40($sp) -; MIPS32R5EB-NEXT: lhu $1, 38($sp) +; MIPS32R5EB-NEXT: sw $5, 48($sp) +; MIPS32R5EB-NEXT: sw $4, 52($sp) +; MIPS32R5EB-NEXT: lhu $1, 50($sp) ; MIPS32R5EB-NEXT: sw $1, 28($sp) -; MIPS32R5EB-NEXT: lhu $1, 36($sp) +; MIPS32R5EB-NEXT: lhu $1, 48($sp) ; MIPS32R5EB-NEXT: sw $1, 20($sp) -; MIPS32R5EB-NEXT: lhu $1, 42($sp) +; MIPS32R5EB-NEXT: lhu $1, 54($sp) ; MIPS32R5EB-NEXT: sw $1, 12($sp) -; MIPS32R5EB-NEXT: lhu $1, 40($sp) +; MIPS32R5EB-NEXT: lhu $1, 52($sp) ; MIPS32R5EB-NEXT: sw $1, 4($sp) ; MIPS32R5EB-NEXT: ld.d $w0, 16($sp) ; MIPS32R5EB-NEXT: ld.d $w1, 0($sp) @@ -1495,12 +1515,13 @@ define <2 x i16> @i16_2(<2 x i16> %a, <2 ; MIPS32R5EB-NEXT: shf.w $w0, $w0, 177 ; MIPS32R5EB-NEXT: copy_s.w $1, $w0[1] ; MIPS32R5EB-NEXT: copy_s.w $2, $w0[3] -; MIPS32R5EB-NEXT: 
sh $2, 34($sp) -; MIPS32R5EB-NEXT: sh $1, 32($sp) -; MIPS32R5EB-NEXT: lw $2, 32($sp) +; MIPS32R5EB-NEXT: sh $2, 46($sp) +; MIPS32R5EB-NEXT: sh $1, 44($sp) +; MIPS32R5EB-NEXT: lw $2, 44($sp) ; MIPS32R5EB-NEXT: move $sp, $fp -; MIPS32R5EB-NEXT: lw $fp, 44($sp) # 4-byte Folded Reload -; MIPS32R5EB-NEXT: addiu $sp, $sp, 48 +; MIPS32R5EB-NEXT: lw $fp, 56($sp) # 4-byte Folded Reload +; MIPS32R5EB-NEXT: lw $ra, 60($sp) # 4-byte Folded Reload +; MIPS32R5EB-NEXT: addiu $sp, $sp, 64 ; MIPS32R5EB-NEXT: jr $ra ; MIPS32R5EB-NEXT: nop ; @@ -1532,35 +1553,38 @@ define <2 x i16> @i16_2(<2 x i16> %a, <2 ; ; MIPS32R5EL-LABEL: i16_2: ; MIPS32R5EL: # %bb.0: -; MIPS32R5EL-NEXT: addiu $sp, $sp, -48 -; MIPS32R5EL-NEXT: .cfi_def_cfa_offset 48 -; MIPS32R5EL-NEXT: sw $fp, 44($sp) # 4-byte Folded Spill -; MIPS32R5EL-NEXT: .cfi_offset 30, -4 +; MIPS32R5EL-NEXT: addiu $sp, $sp, -64 +; MIPS32R5EL-NEXT: .cfi_def_cfa_offset 64 +; MIPS32R5EL-NEXT: sw $ra, 60($sp) # 4-byte Folded Spill +; MIPS32R5EL-NEXT: sw $fp, 56($sp) # 4-byte Folded Spill +; MIPS32R5EL-NEXT: .cfi_offset 31, -4 +; MIPS32R5EL-NEXT: .cfi_offset 30, -8 ; MIPS32R5EL-NEXT: move $fp, $sp ; MIPS32R5EL-NEXT: .cfi_def_cfa_register 30 ; MIPS32R5EL-NEXT: addiu $1, $zero, -16 ; MIPS32R5EL-NEXT: and $sp, $sp, $1 -; MIPS32R5EL-NEXT: sw $5, 36($sp) -; MIPS32R5EL-NEXT: sw $4, 40($sp) -; MIPS32R5EL-NEXT: lhu $1, 38($sp) +; MIPS32R5EL-NEXT: sw $5, 48($sp) +; MIPS32R5EL-NEXT: sw $4, 52($sp) +; MIPS32R5EL-NEXT: lhu $1, 50($sp) ; MIPS32R5EL-NEXT: sw $1, 24($sp) -; MIPS32R5EL-NEXT: lhu $1, 36($sp) +; MIPS32R5EL-NEXT: lhu $1, 48($sp) ; MIPS32R5EL-NEXT: sw $1, 16($sp) -; MIPS32R5EL-NEXT: lhu $1, 42($sp) +; MIPS32R5EL-NEXT: lhu $1, 54($sp) ; MIPS32R5EL-NEXT: sw $1, 8($sp) -; MIPS32R5EL-NEXT: lhu $1, 40($sp) +; MIPS32R5EL-NEXT: lhu $1, 52($sp) ; MIPS32R5EL-NEXT: sw $1, 0($sp) ; MIPS32R5EL-NEXT: ld.d $w0, 16($sp) ; MIPS32R5EL-NEXT: ld.d $w1, 0($sp) ; MIPS32R5EL-NEXT: addv.d $w0, $w1, $w0 ; MIPS32R5EL-NEXT: copy_s.w $1, $w0[0] ; MIPS32R5EL-NEXT: copy_s.w 
$2, $w0[2] -; MIPS32R5EL-NEXT: sh $2, 34($sp) -; MIPS32R5EL-NEXT: sh $1, 32($sp) -; MIPS32R5EL-NEXT: lw $2, 32($sp) +; MIPS32R5EL-NEXT: sh $2, 46($sp) +; MIPS32R5EL-NEXT: sh $1, 44($sp) +; MIPS32R5EL-NEXT: lw $2, 44($sp) ; MIPS32R5EL-NEXT: move $sp, $fp -; MIPS32R5EL-NEXT: lw $fp, 44($sp) # 4-byte Folded Reload -; MIPS32R5EL-NEXT: addiu $sp, $sp, 48 +; MIPS32R5EL-NEXT: lw $fp, 56($sp) # 4-byte Folded Reload +; MIPS32R5EL-NEXT: lw $ra, 60($sp) # 4-byte Folded Reload +; MIPS32R5EL-NEXT: addiu $sp, $sp, 64 ; MIPS32R5EL-NEXT: jr $ra ; MIPS32R5EL-NEXT: nop %1 = add <2 x i16> %a, %b @@ -1622,8 +1646,10 @@ define <4 x i16> @i16_4(<4 x i16> %a, <4 ; MIPS32R5EB: # %bb.0: ; MIPS32R5EB-NEXT: addiu $sp, $sp, -48 ; MIPS32R5EB-NEXT: .cfi_def_cfa_offset 48 -; MIPS32R5EB-NEXT: sw $fp, 44($sp) # 4-byte Folded Spill -; MIPS32R5EB-NEXT: .cfi_offset 30, -4 +; MIPS32R5EB-NEXT: sw $ra, 44($sp) # 4-byte Folded Spill +; MIPS32R5EB-NEXT: sw $fp, 40($sp) # 4-byte Folded Spill +; MIPS32R5EB-NEXT: .cfi_offset 31, -4 +; MIPS32R5EB-NEXT: .cfi_offset 30, -8 ; MIPS32R5EB-NEXT: move $fp, $sp ; MIPS32R5EB-NEXT: .cfi_def_cfa_register 30 ; MIPS32R5EB-NEXT: addiu $1, $zero, -16 @@ -1665,7 +1691,8 @@ define <4 x i16> @i16_4(<4 x i16> %a, <4 ; MIPS32R5EB-NEXT: copy_s.w $2, $w0[1] ; MIPS32R5EB-NEXT: copy_s.w $3, $w0[3] ; MIPS32R5EB-NEXT: move $sp, $fp -; MIPS32R5EB-NEXT: lw $fp, 44($sp) # 4-byte Folded Reload +; MIPS32R5EB-NEXT: lw $fp, 40($sp) # 4-byte Folded Reload +; MIPS32R5EB-NEXT: lw $ra, 44($sp) # 4-byte Folded Reload ; MIPS32R5EB-NEXT: addiu $sp, $sp, 48 ; MIPS32R5EB-NEXT: jr $ra ; MIPS32R5EB-NEXT: nop @@ -1710,8 +1737,10 @@ define <4 x i16> @i16_4(<4 x i16> %a, <4 ; MIPS32R5EL: # %bb.0: ; MIPS32R5EL-NEXT: addiu $sp, $sp, -48 ; MIPS32R5EL-NEXT: .cfi_def_cfa_offset 48 -; MIPS32R5EL-NEXT: sw $fp, 44($sp) # 4-byte Folded Spill -; MIPS32R5EL-NEXT: .cfi_offset 30, -4 +; MIPS32R5EL-NEXT: sw $ra, 44($sp) # 4-byte Folded Spill +; MIPS32R5EL-NEXT: sw $fp, 40($sp) # 4-byte Folded Spill +; MIPS32R5EL-NEXT: 
.cfi_offset 31, -4 +; MIPS32R5EL-NEXT: .cfi_offset 30, -8 ; MIPS32R5EL-NEXT: move $fp, $sp ; MIPS32R5EL-NEXT: .cfi_def_cfa_register 30 ; MIPS32R5EL-NEXT: addiu $1, $zero, -16 @@ -1753,7 +1782,8 @@ define <4 x i16> @i16_4(<4 x i16> %a, <4 ; MIPS32R5EL-NEXT: copy_s.w $2, $w0[0] ; MIPS32R5EL-NEXT: copy_s.w $3, $w0[2] ; MIPS32R5EL-NEXT: move $sp, $fp -; MIPS32R5EL-NEXT: lw $fp, 44($sp) # 4-byte Folded Reload +; MIPS32R5EL-NEXT: lw $fp, 40($sp) # 4-byte Folded Reload +; MIPS32R5EL-NEXT: lw $ra, 44($sp) # 4-byte Folded Reload ; MIPS32R5EL-NEXT: addiu $sp, $sp, 48 ; MIPS32R5EL-NEXT: jr $ra ; MIPS32R5EL-NEXT: nop @@ -1962,8 +1992,10 @@ define <2 x i32> @i32_2(<2 x i32> %a, <2 ; MIPS32R5EB: # %bb.0: ; MIPS32R5EB-NEXT: addiu $sp, $sp, -48 ; MIPS32R5EB-NEXT: .cfi_def_cfa_offset 48 -; MIPS32R5EB-NEXT: sw $fp, 44($sp) # 4-byte Folded Spill -; MIPS32R5EB-NEXT: .cfi_offset 30, -4 +; MIPS32R5EB-NEXT: sw $ra, 44($sp) # 4-byte Folded Spill +; MIPS32R5EB-NEXT: sw $fp, 40($sp) # 4-byte Folded Spill +; MIPS32R5EB-NEXT: .cfi_offset 31, -4 +; MIPS32R5EB-NEXT: .cfi_offset 30, -8 ; MIPS32R5EB-NEXT: move $fp, $sp ; MIPS32R5EB-NEXT: .cfi_def_cfa_register 30 ; MIPS32R5EB-NEXT: addiu $1, $zero, -16 @@ -1979,7 +2011,8 @@ define <2 x i32> @i32_2(<2 x i32> %a, <2 ; MIPS32R5EB-NEXT: copy_s.w $2, $w0[1] ; MIPS32R5EB-NEXT: copy_s.w $3, $w0[3] ; MIPS32R5EB-NEXT: move $sp, $fp -; MIPS32R5EB-NEXT: lw $fp, 44($sp) # 4-byte Folded Reload +; MIPS32R5EB-NEXT: lw $fp, 40($sp) # 4-byte Folded Reload +; MIPS32R5EB-NEXT: lw $ra, 44($sp) # 4-byte Folded Reload ; MIPS32R5EB-NEXT: addiu $sp, $sp, 48 ; MIPS32R5EB-NEXT: jr $ra ; MIPS32R5EB-NEXT: nop @@ -2010,8 +2043,10 @@ define <2 x i32> @i32_2(<2 x i32> %a, <2 ; MIPS32R5EL: # %bb.0: ; MIPS32R5EL-NEXT: addiu $sp, $sp, -48 ; MIPS32R5EL-NEXT: .cfi_def_cfa_offset 48 -; MIPS32R5EL-NEXT: sw $fp, 44($sp) # 4-byte Folded Spill -; MIPS32R5EL-NEXT: .cfi_offset 30, -4 +; MIPS32R5EL-NEXT: sw $ra, 44($sp) # 4-byte Folded Spill +; MIPS32R5EL-NEXT: sw $fp, 40($sp) # 4-byte 
Folded Spill +; MIPS32R5EL-NEXT: .cfi_offset 31, -4 +; MIPS32R5EL-NEXT: .cfi_offset 30, -8 ; MIPS32R5EL-NEXT: move $fp, $sp ; MIPS32R5EL-NEXT: .cfi_def_cfa_register 30 ; MIPS32R5EL-NEXT: addiu $1, $zero, -16 @@ -2026,7 +2061,8 @@ define <2 x i32> @i32_2(<2 x i32> %a, <2 ; MIPS32R5EL-NEXT: copy_s.w $2, $w0[0] ; MIPS32R5EL-NEXT: copy_s.w $3, $w0[2] ; MIPS32R5EL-NEXT: move $sp, $fp -; MIPS32R5EL-NEXT: lw $fp, 44($sp) # 4-byte Folded Reload +; MIPS32R5EL-NEXT: lw $fp, 40($sp) # 4-byte Folded Reload +; MIPS32R5EL-NEXT: lw $ra, 44($sp) # 4-byte Folded Reload ; MIPS32R5EL-NEXT: addiu $sp, $sp, 48 ; MIPS32R5EL-NEXT: jr $ra ; MIPS32R5EL-NEXT: nop @@ -2312,8 +2348,10 @@ define void @float_2(<2 x float> %a, <2 ; MIPS32R5: # %bb.0: ; MIPS32R5-NEXT: addiu $sp, $sp, -48 ; MIPS32R5-NEXT: .cfi_def_cfa_offset 48 -; MIPS32R5-NEXT: sw $fp, 44($sp) # 4-byte Folded Spill -; MIPS32R5-NEXT: .cfi_offset 30, -4 +; MIPS32R5-NEXT: sw $ra, 44($sp) # 4-byte Folded Spill +; MIPS32R5-NEXT: sw $fp, 40($sp) # 4-byte Folded Spill +; MIPS32R5-NEXT: .cfi_offset 31, -4 +; MIPS32R5-NEXT: .cfi_offset 30, -8 ; MIPS32R5-NEXT: move $fp, $sp ; MIPS32R5-NEXT: .cfi_def_cfa_register 30 ; MIPS32R5-NEXT: addiu $1, $zero, -16 @@ -2331,7 +2369,8 @@ define void @float_2(<2 x float> %a, <2 ; MIPS32R5-NEXT: swc1 $f1, 4($2) ; MIPS32R5-NEXT: swc1 $f0, %lo(float_res_v2f32)($1) ; MIPS32R5-NEXT: move $sp, $fp -; MIPS32R5-NEXT: lw $fp, 44($sp) # 4-byte Folded Reload +; MIPS32R5-NEXT: lw $fp, 40($sp) # 4-byte Folded Reload +; MIPS32R5-NEXT: lw $ra, 44($sp) # 4-byte Folded Reload ; MIPS32R5-NEXT: addiu $sp, $sp, 48 ; MIPS32R5-NEXT: jr $ra ; MIPS32R5-NEXT: nop @@ -2794,8 +2833,10 @@ define <8 x i8> @ret_8_i8() { ; MIPS32R5EB: # %bb.0: ; MIPS32R5EB-NEXT: addiu $sp, $sp, -32 ; MIPS32R5EB-NEXT: .cfi_def_cfa_offset 32 -; MIPS32R5EB-NEXT: sw $fp, 28($sp) # 4-byte Folded Spill -; MIPS32R5EB-NEXT: .cfi_offset 30, -4 +; MIPS32R5EB-NEXT: sw $ra, 28($sp) # 4-byte Folded Spill +; MIPS32R5EB-NEXT: sw $fp, 24($sp) # 4-byte Folded Spill +; 
MIPS32R5EB-NEXT: .cfi_offset 31, -4 +; MIPS32R5EB-NEXT: .cfi_offset 30, -8 ; MIPS32R5EB-NEXT: move $fp, $sp ; MIPS32R5EB-NEXT: .cfi_def_cfa_register 30 ; MIPS32R5EB-NEXT: addiu $1, $zero, -16 @@ -2810,7 +2851,8 @@ define <8 x i8> @ret_8_i8() { ; MIPS32R5EB-NEXT: copy_s.w $2, $w0[1] ; MIPS32R5EB-NEXT: copy_s.w $3, $w0[3] ; MIPS32R5EB-NEXT: move $sp, $fp -; MIPS32R5EB-NEXT: lw $fp, 28($sp) # 4-byte Folded Reload +; MIPS32R5EB-NEXT: lw $fp, 24($sp) # 4-byte Folded Reload +; MIPS32R5EB-NEXT: lw $ra, 28($sp) # 4-byte Folded Reload ; MIPS32R5EB-NEXT: addiu $sp, $sp, 32 ; MIPS32R5EB-NEXT: jr $ra ; MIPS32R5EB-NEXT: nop @@ -2829,8 +2871,10 @@ define <8 x i8> @ret_8_i8() { ; MIPS32R5EL: # %bb.0: ; MIPS32R5EL-NEXT: addiu $sp, $sp, -32 ; MIPS32R5EL-NEXT: .cfi_def_cfa_offset 32 -; MIPS32R5EL-NEXT: sw $fp, 28($sp) # 4-byte Folded Spill -; MIPS32R5EL-NEXT: .cfi_offset 30, -4 +; MIPS32R5EL-NEXT: sw $ra, 28($sp) # 4-byte Folded Spill +; MIPS32R5EL-NEXT: sw $fp, 24($sp) # 4-byte Folded Spill +; MIPS32R5EL-NEXT: .cfi_offset 31, -4 +; MIPS32R5EL-NEXT: .cfi_offset 30, -8 ; MIPS32R5EL-NEXT: move $fp, $sp ; MIPS32R5EL-NEXT: .cfi_def_cfa_register 30 ; MIPS32R5EL-NEXT: addiu $1, $zero, -16 @@ -2845,7 +2889,8 @@ define <8 x i8> @ret_8_i8() { ; MIPS32R5EL-NEXT: copy_s.w $2, $w0[0] ; MIPS32R5EL-NEXT: copy_s.w $3, $w0[2] ; MIPS32R5EL-NEXT: move $sp, $fp -; MIPS32R5EL-NEXT: lw $fp, 28($sp) # 4-byte Folded Reload +; MIPS32R5EL-NEXT: lw $fp, 24($sp) # 4-byte Folded Reload +; MIPS32R5EL-NEXT: lw $ra, 28($sp) # 4-byte Folded Reload ; MIPS32R5EL-NEXT: addiu $sp, $sp, 32 ; MIPS32R5EL-NEXT: jr $ra ; MIPS32R5EL-NEXT: nop @@ -2965,8 +3010,10 @@ define <4 x i16> @ret_4_i16() { ; MIPS32R5EB: # %bb.0: ; MIPS32R5EB-NEXT: addiu $sp, $sp, -32 ; MIPS32R5EB-NEXT: .cfi_def_cfa_offset 32 -; MIPS32R5EB-NEXT: sw $fp, 28($sp) # 4-byte Folded Spill -; MIPS32R5EB-NEXT: .cfi_offset 30, -4 +; MIPS32R5EB-NEXT: sw $ra, 28($sp) # 4-byte Folded Spill +; MIPS32R5EB-NEXT: sw $fp, 24($sp) # 4-byte Folded Spill +; 
MIPS32R5EB-NEXT: .cfi_offset 31, -4 +; MIPS32R5EB-NEXT: .cfi_offset 30, -8 ; MIPS32R5EB-NEXT: move $fp, $sp ; MIPS32R5EB-NEXT: .cfi_def_cfa_register 30 ; MIPS32R5EB-NEXT: addiu $1, $zero, -16 @@ -2981,7 +3028,8 @@ define <4 x i16> @ret_4_i16() { ; MIPS32R5EB-NEXT: copy_s.w $2, $w0[1] ; MIPS32R5EB-NEXT: copy_s.w $3, $w0[3] ; MIPS32R5EB-NEXT: move $sp, $fp -; MIPS32R5EB-NEXT: lw $fp, 28($sp) # 4-byte Folded Reload +; MIPS32R5EB-NEXT: lw $fp, 24($sp) # 4-byte Folded Reload +; MIPS32R5EB-NEXT: lw $ra, 28($sp) # 4-byte Folded Reload ; MIPS32R5EB-NEXT: addiu $sp, $sp, 32 ; MIPS32R5EB-NEXT: jr $ra ; MIPS32R5EB-NEXT: nop @@ -3000,8 +3048,10 @@ define <4 x i16> @ret_4_i16() { ; MIPS32R5EL: # %bb.0: ; MIPS32R5EL-NEXT: addiu $sp, $sp, -32 ; MIPS32R5EL-NEXT: .cfi_def_cfa_offset 32 -; MIPS32R5EL-NEXT: sw $fp, 28($sp) # 4-byte Folded Spill -; MIPS32R5EL-NEXT: .cfi_offset 30, -4 +; MIPS32R5EL-NEXT: sw $ra, 28($sp) # 4-byte Folded Spill +; MIPS32R5EL-NEXT: sw $fp, 24($sp) # 4-byte Folded Spill +; MIPS32R5EL-NEXT: .cfi_offset 31, -4 +; MIPS32R5EL-NEXT: .cfi_offset 30, -8 ; MIPS32R5EL-NEXT: move $fp, $sp ; MIPS32R5EL-NEXT: .cfi_def_cfa_register 30 ; MIPS32R5EL-NEXT: addiu $1, $zero, -16 @@ -3016,7 +3066,8 @@ define <4 x i16> @ret_4_i16() { ; MIPS32R5EL-NEXT: copy_s.w $2, $w0[0] ; MIPS32R5EL-NEXT: copy_s.w $3, $w0[2] ; MIPS32R5EL-NEXT: move $sp, $fp -; MIPS32R5EL-NEXT: lw $fp, 28($sp) # 4-byte Folded Reload +; MIPS32R5EL-NEXT: lw $fp, 24($sp) # 4-byte Folded Reload +; MIPS32R5EL-NEXT: lw $ra, 28($sp) # 4-byte Folded Reload ; MIPS32R5EL-NEXT: addiu $sp, $sp, 32 ; MIPS32R5EL-NEXT: jr $ra ; MIPS32R5EL-NEXT: nop @@ -3098,8 +3149,10 @@ define <2 x i32> @ret_2_i32() { ; MIPS32R5EB: # %bb.0: ; MIPS32R5EB-NEXT: addiu $sp, $sp, -32 ; MIPS32R5EB-NEXT: .cfi_def_cfa_offset 32 -; MIPS32R5EB-NEXT: sw $fp, 28($sp) # 4-byte Folded Spill -; MIPS32R5EB-NEXT: .cfi_offset 30, -4 +; MIPS32R5EB-NEXT: sw $ra, 28($sp) # 4-byte Folded Spill +; MIPS32R5EB-NEXT: sw $fp, 24($sp) # 4-byte Folded Spill +; 
MIPS32R5EB-NEXT: .cfi_offset 31, -4 +; MIPS32R5EB-NEXT: .cfi_offset 30, -8 ; MIPS32R5EB-NEXT: move $fp, $sp ; MIPS32R5EB-NEXT: .cfi_def_cfa_register 30 ; MIPS32R5EB-NEXT: addiu $1, $zero, -16 @@ -3114,7 +3167,8 @@ define <2 x i32> @ret_2_i32() { ; MIPS32R5EB-NEXT: copy_s.w $2, $w0[1] ; MIPS32R5EB-NEXT: copy_s.w $3, $w0[3] ; MIPS32R5EB-NEXT: move $sp, $fp -; MIPS32R5EB-NEXT: lw $fp, 28($sp) # 4-byte Folded Reload +; MIPS32R5EB-NEXT: lw $fp, 24($sp) # 4-byte Folded Reload +; MIPS32R5EB-NEXT: lw $ra, 28($sp) # 4-byte Folded Reload ; MIPS32R5EB-NEXT: addiu $sp, $sp, 32 ; MIPS32R5EB-NEXT: jr $ra ; MIPS32R5EB-NEXT: nop @@ -3133,8 +3187,10 @@ define <2 x i32> @ret_2_i32() { ; MIPS32R5EL: # %bb.0: ; MIPS32R5EL-NEXT: addiu $sp, $sp, -32 ; MIPS32R5EL-NEXT: .cfi_def_cfa_offset 32 -; MIPS32R5EL-NEXT: sw $fp, 28($sp) # 4-byte Folded Spill -; MIPS32R5EL-NEXT: .cfi_offset 30, -4 +; MIPS32R5EL-NEXT: sw $ra, 28($sp) # 4-byte Folded Spill +; MIPS32R5EL-NEXT: sw $fp, 24($sp) # 4-byte Folded Spill +; MIPS32R5EL-NEXT: .cfi_offset 31, -4 +; MIPS32R5EL-NEXT: .cfi_offset 30, -8 ; MIPS32R5EL-NEXT: move $fp, $sp ; MIPS32R5EL-NEXT: .cfi_def_cfa_register 30 ; MIPS32R5EL-NEXT: addiu $1, $zero, -16 @@ -3149,7 +3205,8 @@ define <2 x i32> @ret_2_i32() { ; MIPS32R5EL-NEXT: copy_s.w $2, $w0[0] ; MIPS32R5EL-NEXT: copy_s.w $3, $w0[2] ; MIPS32R5EL-NEXT: move $sp, $fp -; MIPS32R5EL-NEXT: lw $fp, 28($sp) # 4-byte Folded Reload +; MIPS32R5EL-NEXT: lw $fp, 24($sp) # 4-byte Folded Reload +; MIPS32R5EL-NEXT: lw $ra, 28($sp) # 4-byte Folded Reload ; MIPS32R5EL-NEXT: addiu $sp, $sp, 32 ; MIPS32R5EL-NEXT: jr $ra ; MIPS32R5EL-NEXT: nop @@ -6073,8 +6130,10 @@ define float @mixed_i8(<2 x float> %a, i ; MIPS32R5: # %bb.0: # %entry ; MIPS32R5-NEXT: addiu $sp, $sp, -64 ; MIPS32R5-NEXT: .cfi_def_cfa_offset 64 -; MIPS32R5-NEXT: sw $fp, 60($sp) # 4-byte Folded Spill -; MIPS32R5-NEXT: .cfi_offset 30, -4 +; MIPS32R5-NEXT: sw $ra, 60($sp) # 4-byte Folded Spill +; MIPS32R5-NEXT: sw $fp, 56($sp) # 4-byte Folded Spill +; 
MIPS32R5-NEXT: .cfi_offset 31, -4 +; MIPS32R5-NEXT: .cfi_offset 30, -8 ; MIPS32R5-NEXT: move $fp, $sp ; MIPS32R5-NEXT: .cfi_def_cfa_register 30 ; MIPS32R5-NEXT: addiu $1, $zero, -16 @@ -6098,7 +6157,8 @@ define float @mixed_i8(<2 x float> %a, i ; MIPS32R5-NEXT: splati.w $w1, $w0[1] ; MIPS32R5-NEXT: add.s $f0, $f0, $f1 ; MIPS32R5-NEXT: move $sp, $fp -; MIPS32R5-NEXT: lw $fp, 60($sp) # 4-byte Folded Reload +; MIPS32R5-NEXT: lw $fp, 56($sp) # 4-byte Folded Reload +; MIPS32R5-NEXT: lw $ra, 60($sp) # 4-byte Folded Reload ; MIPS32R5-NEXT: addiu $sp, $sp, 64 ; MIPS32R5-NEXT: jr $ra ; MIPS32R5-NEXT: nop Modified: llvm/trunk/test/CodeGen/Mips/dynamic-stack-realignment.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/Mips/dynamic-stack-realignment.ll?rev=373907&r1=373906&r2=373907&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/Mips/dynamic-stack-realignment.ll (original) +++ llvm/trunk/test/CodeGen/Mips/dynamic-stack-realignment.ll Mon Oct 7 07:01:37 2019 @@ -163,8 +163,9 @@ entry: ; GP32-M: addiu $sp, $sp, -1024 ; GP32-MMR2: addiusp -1024 ; GP32-MMR6: addiu $sp, $sp, -1024 - ; GP32: sw $fp, 1020($sp) - ; GP32: sw $23, 1016($sp) + ; GP32: sw $ra, 1020($sp) + ; GP32: sw $fp, 1016($sp) + ; GP32: sw $23, 1012($sp) ; ; GP32: move $fp, $sp ; GP32: addiu $[[T0:[0-9]+|gp]], $zero, -512 @@ -177,8 +178,9 @@ entry: ; epilogue ; GP32: move $sp, $fp - ; GP32: lw $23, 1016($sp) - ; GP32: lw $fp, 1020($sp) + ; GP32: lw $23, 1012($sp) + ; GP32: lw $fp, 1016($sp) + ; GP32: lw $ra, 1020($sp) ; GP32-M: addiu $sp, $sp, 1024 ; GP32-MMR2: addiusp 1024 ; GP32-MMR6: addiu $sp, $sp, 1024 @@ -201,8 +203,9 @@ entry: ; FIXME: We are currently over-allocating stack space. 
; N32: addiu $sp, $sp, -1024 ; N64: daddiu $sp, $sp, -1024 - ; GP64: sd $fp, 1016($sp) - ; GP64: sd $23, 1008($sp) + ; GP64: sd $ra, 1016($sp) + ; GP64: sd $fp, 1008($sp) + ; GP64: sd $23, 1000($sp) ; ; GP64: move $fp, $sp ; GP64: addiu $[[T0:[0-9]+|gp]], $zero, -512 @@ -215,8 +218,9 @@ entry: ; epilogue ; GP64: move $sp, $fp - ; GP64: ld $23, 1008($sp) - ; GP64: ld $fp, 1016($sp) + ; GP64: ld $23, 1000($sp) + ; GP64: ld $fp, 1008($sp) + ; GP64: ld $ra, 1016($sp) ; N32: addiu $sp, $sp, 1024 ; N64: daddiu $sp, $sp, 1024 Modified: llvm/trunk/test/CodeGen/Mips/frame-address.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/Mips/frame-address.ll?rev=373907&r1=373906&r2=373907&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/Mips/frame-address.ll (original) +++ llvm/trunk/test/CodeGen/Mips/frame-address.ll Mon Oct 7 07:01:37 2019 @@ -1,17 +1,26 @@ +; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py ; RUN: llc -march=mipsel < %s | FileCheck %s declare i8* @llvm.frameaddress(i32) nounwind readnone define i8* @f() nounwind uwtable { +; CHECK-LABEL: f: +; CHECK: # %bb.0: # %entry +; CHECK-NEXT: addiu $sp, $sp, -8 +; CHECK-NEXT: .cfi_def_cfa_offset 8 +; CHECK-NEXT: sw $ra, 4($sp) # 4-byte Folded Spill +; CHECK-NEXT: sw $fp, 0($sp) # 4-byte Folded Spill +; CHECK-NEXT: .cfi_offset 31, -4 +; CHECK-NEXT: .cfi_offset 30, -8 +; CHECK-NEXT: move $fp, $sp +; CHECK-NEXT: .cfi_def_cfa_register 30 +; CHECK-NEXT: move $2, $fp +; CHECK-NEXT: move $sp, $fp +; CHECK-NEXT: lw $fp, 0($sp) # 4-byte Folded Reload +; CHECK-NEXT: lw $ra, 4($sp) # 4-byte Folded Reload +; CHECK-NEXT: jr $ra +; CHECK-NEXT: addiu $sp, $sp, 8 entry: %0 = call i8* @llvm.frameaddress(i32 0) ret i8* %0 - -; CHECK: .cfi_startproc -; CHECK: .cfi_def_cfa_offset 8 -; CHECK: .cfi_offset 30, -4 -; CHECK: move $fp, $sp -; CHECK: .cfi_def_cfa_register 30 -; CHECK: move $2, $fp -; CHECK: .cfi_endproc } Added: 
llvm/trunk/test/CodeGen/Mips/no-frame-pointer-elim.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/Mips/no-frame-pointer-elim.ll?rev=373907&view=auto ============================================================================== --- llvm/trunk/test/CodeGen/Mips/no-frame-pointer-elim.ll (added) +++ llvm/trunk/test/CodeGen/Mips/no-frame-pointer-elim.ll Mon Oct 7 07:01:37 2019 @@ -0,0 +1,37 @@ +; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py +; RUN: llc -march=mips64 -relocation-model=static < %s \ +; RUN: | FileCheck %s --check-prefix STATIC +; RUN: llc -march=mips64 -relocation-model=pic < %s \ +; RUN: | FileCheck %s --check-prefix PIC + +declare dso_local void @callee() noreturn nounwind + +define dso_local void @caller() nounwind "no-frame-pointer-elim-non-leaf" { +; STATIC-LABEL: caller: +; STATIC: # %bb.0: # %entry +; STATIC-NEXT: daddiu $sp, $sp, -16 +; STATIC-NEXT: sd $ra, 8($sp) # 8-byte Folded Spill +; STATIC-NEXT: sd $fp, 0($sp) # 8-byte Folded Spill +; STATIC-NEXT: move $fp, $sp +; STATIC-NEXT: jal callee +; STATIC-NEXT: nop +; +; PIC-LABEL: caller: +; PIC: # %bb.0: # %entry +; PIC-NEXT: daddiu $sp, $sp, -32 +; PIC-NEXT: sd $ra, 24($sp) # 8-byte Folded Spill +; PIC-NEXT: sd $fp, 16($sp) # 8-byte Folded Spill +; PIC-NEXT: sd $gp, 8($sp) # 8-byte Folded Spill +; PIC-NEXT: move $fp, $sp +; PIC-NEXT: lui $1, %hi(%neg(%gp_rel(caller))) +; PIC-NEXT: daddu $1, $1, $25 +; PIC-NEXT: daddiu $gp, $1, %lo(%neg(%gp_rel(caller))) +; PIC-NEXT: ld $25, %call16(callee)($gp) +; PIC-NEXT: .reloc .Ltmp0, R_MIPS_JALR, callee +; PIC-NEXT: .Ltmp0: +; PIC-NEXT: jalr $25 +; PIC-NEXT: nop +entry: + tail call void @callee() + unreachable +} Modified: llvm/trunk/test/CodeGen/Mips/tnaked.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/Mips/tnaked.ll?rev=373907&r1=373906&r2=373907&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/Mips/tnaked.ll 
(original) +++ llvm/trunk/test/CodeGen/Mips/tnaked.ll Mon Oct 7 07:01:37 2019 @@ -21,7 +21,7 @@ entry: ; CHECK: .ent tnonaked ; CHECK-LABEL: tnonaked: ; CHECK: .frame $fp,8,$ra -; CHECK: .mask 0x40000000,-4 +; CHECK: .mask 0xc0000000,-4 ; CHECK: .fmask 0x00000000,0 ; CHECK: addiu $sp, $sp, -8 Modified: llvm/trunk/test/CodeGen/Mips/v2i16tof32.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/Mips/v2i16tof32.ll?rev=373907&r1=373906&r2=373907&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/Mips/v2i16tof32.ll (original) +++ llvm/trunk/test/CodeGen/Mips/v2i16tof32.ll Mon Oct 7 07:01:37 2019 @@ -9,8 +9,10 @@ define float @f(<8 x i16>* %a) { ; CHECK: # %bb.0: # %entry ; CHECK-NEXT: addiu $sp, $sp, -32 ; CHECK-NEXT: .cfi_def_cfa_offset 32 -; CHECK-NEXT: sw $fp, 28($sp) # 4-byte Folded Spill -; CHECK-NEXT: .cfi_offset 30, -4 +; CHECK-NEXT: sw $ra, 28($sp) # 4-byte Folded Spill +; CHECK-NEXT: sw $fp, 24($sp) # 4-byte Folded Spill +; CHECK-NEXT: .cfi_offset 31, -4 +; CHECK-NEXT: .cfi_offset 30, -8 ; CHECK-NEXT: move $fp, $sp ; CHECK-NEXT: .cfi_def_cfa_register 30 ; CHECK-NEXT: addiu $1, $zero, -16 @@ -25,7 +27,8 @@ define float @f(<8 x i16>* %a) { ; CHECK-NEXT: sw $1, 4($sp) ; CHECK-NEXT: mtc1 $2, $f0 ; CHECK-NEXT: move $sp, $fp -; CHECK-NEXT: lw $fp, 28($sp) # 4-byte Folded Reload +; CHECK-NEXT: lw $fp, 24($sp) # 4-byte Folded Reload +; CHECK-NEXT: lw $ra, 28($sp) # 4-byte Folded Reload ; CHECK-NEXT: jr $ra ; CHECK-NEXT: addiu $sp, $sp, 32 entry: From llvm-commits at lists.llvm.org Mon Oct 7 06:59:15 2019 From: llvm-commits at lists.llvm.org (whitequark via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 13:59:15 +0000 (UTC) Subject: [PATCH] D65070: [LLVM-C][OCaml] Add a fast linker binding In-Reply-To: References: Message-ID: whitequark added a comment. @CodaFi Could you please comment on the current state of the patch? 
Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D65070/new/ https://reviews.llvm.org/D65070 From llvm-commits at lists.llvm.org Mon Oct 7 07:01:46 2019 From: llvm-commits at lists.llvm.org (whitequark via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 14:01:46 +0000 (UTC) Subject: [PATCH] D52239: [OCaml] Add OCaml APIs to access DebugLoc info In-Reply-To: References: Message-ID: <601a7bcd2b61a5eeeebb1b8e42f83709@localhost.localdomain> whitequark added a comment. @jberdine I personally highly prefer the approach in D60902 . In general, I believe that the right direction for the OCaml bindings is to evolve towards rich and type-safe accessors. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D52239/new/ https://reviews.llvm.org/D52239 From llvm-commits at lists.llvm.org Mon Oct 7 07:04:20 2019 From: llvm-commits at lists.llvm.org (Zachary Turner via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 14:04:20 +0000 (UTC) Subject: [PATCH] D19634: Read the rest of the substreams from DBI, and parse source file information In-Reply-To: References: Message-ID: This revision was not accepted when it landed; it landed in state "Needs Review". This revision was automatically updated to reflect the committed changes. Closed by commit rG84c3a8ba3dfc: Read the rest of the DBI substreams, and parse source info. (authored by zturner). Herald added a subscriber: hiraditya. Herald added a project: LLVM. Changed prior to commit: https://reviews.llvm.org/D19634?vs=55357&id=223579#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D19634/new/ https://reviews.llvm.org/D19634 Files: llvm/include/llvm/DebugInfo/PDB/Raw/ModInfo.h llvm/include/llvm/DebugInfo/PDB/Raw/PDBDbiStream.h llvm/lib/DebugInfo/PDB/Raw/PDBDbiStream.cpp llvm/test/DebugInfo/PDB/pdbdump-headers.test llvm/tools/llvm-pdbdump/llvm-pdbdump.cpp -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D19634.223579.patch Type: text/x-patch Size: 12527 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 07:04:22 2019 From: llvm-commits at lists.llvm.org (George Rimar via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 14:04:22 +0000 (UTC) Subject: [PATCH] D68462: [llvm-readobj/llvm-readelf] - Add checks for GNU-style to "all.test" test case. In-Reply-To: References: Message-ID: grimar updated this revision to Diff 223578. grimar marked 14 inline comments as done. grimar added a comment. - Addressed review comments. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68462/new/ https://reviews.llvm.org/D68462 Files: test/tools/llvm-readobj/all.test -------------- next part -------------- A non-text attachment was scrubbed... Name: D68462.223578.patch Type: text/x-patch Size: 4089 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 07:04:31 2019 From: llvm-commits at lists.llvm.org (George Rimar via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 14:04:31 +0000 (UTC) Subject: [PATCH] D68462: [llvm-readobj/llvm-readelf] - Add checks for GNU-style to "all.test" test case. In-Reply-To: References: Message-ID: <053c3c71892db20a1c0c1e8239ca79df@localhost.localdomain> grimar added inline comments. ================ Comment at: test/tools/llvm-readobj/all.test:63 + Entries: [] + - Name: .group + Type: SHT_GROUP ---------------- jhenderson wrote: > grimar wrote: > > jhenderson wrote: > > > Do we need group information to print a header? > > Yes. > > Logic is: > > 1) Collect all group sections: > > https://github.com/llvm-mirror/llvm/blob/master/tools/llvm-readobj/ELFDumper.cpp#L2842 > > 2) Dump them (header is printed on this step): > > https://github.com/llvm-mirror/llvm/blob/master/tools/llvm-readobj/ELFDumper.cpp#L2886 > I see. Could you just check the "There are no section groups in the file" message instead? 
(https://github.com/llvm-mirror/llvm/blob/master/tools/llvm-readobj/ELFDumper.cpp#L2911) > > I think that would be sufficient for this test case. Done. ================ Comment at: test/tools/llvm-readobj/all.test:86-90 + - Name: .eh_frame_hdr + Type: SHT_PROGBITS +## An arbitrary linker-generated valid content. + Content: 011b033b140000000100000000f0ffff30000000 + - Name: .eh_frame ---------------- jhenderson wrote: > grimar wrote: > > jhenderson wrote: > > > Same comments as earlier. Can these be empty? > > No. We need to have something valid here, otherwise any > > error triggered will fail the dumping. > > (e.g. https://github.com/llvm-mirror/llvm/blob/master/tools/llvm-readobj/DwarfCFIEHPrinter.h#L127). > You don't actually need the .eh_frame_hdr at all, looking at the code, I think, just the .eh_frame section. That said, this appears to be different from GNU readelf. > > For reference, GNU readelf prints "There are no unwind sections in this file" if there are no .eh_frame_hdr sections even if there is a .eh_frame section (not looked to see what happens if there is a PT_GNU_EH_FRAME program header).

I am a bit confused. Imagine we have the code and invocations below:

"1.s":
```
.section foo,"ax",@progbits
.cfi_startproc
nop
.cfi_endproc
```

```
as 1.s -o 1.o
ld.bfd 1.o -o with_hdr --eh-frame-hdr
ld.bfd 1.o -o wo_hdr
```

For both of them I see neither the `.eh_frame_hdr` nor the `.eh_frame` section dumped with `-a`. I see ".eh_frame" dumped when I add `-wf` though (but still no `.eh_frame_hdr`). e.g.:

```
umb@ubuntu:~/tests/81$ readelf -v
GNU readelf (GNU Binutils for Ubuntu) 2.31.1
Copyright (C) 2018 Free Software Foundation, Inc.
umb@ubuntu:~/tests/81$ readelf -a with_hdr
ELF Header:
  Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
  Class: ELF64
  Data: 2's complement, little endian
  Version: 1 (current)
  OS/ABI: UNIX - System V
  ABI Version: 0
  Type: EXEC (Executable file)
  Machine: Advanced Micro Devices X86-64
  Version: 0x1
  Entry point address: 0x401000
  Start of program headers: 64 (bytes into file)
  Start of section headers: 8584 (bytes into file)
  Flags: 0x0
  Size of this header: 64 (bytes)
  Size of program headers: 56 (bytes)
  Number of program headers: 3
  Size of section headers: 64 (bytes)
  Number of section headers: 7
  Section header string table index: 6

Section Headers:
  [Nr] Name Type Address Offset Size EntSize Flags Link Info Align
  [ 0] NULL 0000000000000000 00000000 0000000000000000 0000000000000000 0 0 0
  [ 1] foo PROGBITS 0000000000401000 00001000 0000000000000001 0000000000000000 AX 0 0 1
  [ 2] .eh_frame_hdr PROGBITS 0000000000402000 00002000 0000000000000014 0000000000000000 A 0 0 4
  [ 3] .eh_frame PROGBITS 0000000000402018 00002018 000000000000002c 0000000000000000 A 0 0 8
  [ 4] .symtab SYMTAB 0000000000000000 00002048 00000000000000d8 0000000000000018 5 5 8
  [ 5] .strtab STRTAB 0000000000000000 00002120 000000000000002c 0000000000000000 0 0 1
  [ 6] .shstrtab STRTAB 0000000000000000 0000214c 0000000000000037 0000000000000000 0 0 1
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings), I (info), L (link order), O (extra OS processing required), G (group), T (TLS), C (compressed), x (unknown), o (OS specific), E (exclude), l (large), p (processor specific)

There are no section groups in this file.

Program Headers:
  Type Offset VirtAddr PhysAddr FileSiz MemSiz Flags Align
  LOAD 0x0000000000001000 0x0000000000401000 0x0000000000401000 0x0000000000000001 0x0000000000000001 R E 0x1000
  LOAD 0x0000000000002000 0x0000000000402000 0x0000000000402000 0x0000000000000044 0x0000000000000044 R 0x1000
  GNU_EH_FRAME 0x0000000000002000 0x0000000000402000 0x0000000000402000 0x0000000000000014 0x0000000000000014 R 0x4

 Section to Segment mapping:
  Segment Sections...
   00 foo
   01 .eh_frame_hdr .eh_frame
   02 .eh_frame_hdr

There is no dynamic section in this file.

There are no relocations in this file.

The decoding of unwind sections for machine type Advanced Micro Devices X86-64 is not currently supported.

Symbol table '.symtab' contains 9 entries:
  Num: Value Size Type Bind Vis Ndx Name
  0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
  1: 0000000000401000 0 SECTION LOCAL DEFAULT 1
  2: 0000000000402000 0 SECTION LOCAL DEFAULT 2
  3: 0000000000402018 0 SECTION LOCAL DEFAULT 3
  4: 0000000000402000 0 NOTYPE LOCAL DEFAULT 2 __GNU_EH_FRAME_HDR
  5: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND _start
  6: 0000000000404000 0 NOTYPE GLOBAL DEFAULT 3 __bss_start
  7: 0000000000404000 0 NOTYPE GLOBAL DEFAULT 3 _edata
  8: 0000000000404000 0 NOTYPE GLOBAL DEFAULT 3 _end

No version information found in this file.
umb@ubuntu:~/tests/81$ readelf -a with_hdr -wf
ELF Header:
  Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
  Class: ELF64
  Data: 2's complement, little endian
  Version: 1 (current)
  OS/ABI: UNIX - System V
  ABI Version: 0
  Type: EXEC (Executable file)
  Machine: Advanced Micro Devices X86-64
  Version: 0x1
  Entry point address: 0x401000
  Start of program headers: 64 (bytes into file)
  Start of section headers: 8584 (bytes into file)
  Flags: 0x0
  Size of this header: 64 (bytes)
  Size of program headers: 56 (bytes)
  Number of program headers: 3
  Size of section headers: 64 (bytes)
  Number of section headers: 7
  Section header string table index: 6

Section Headers:
  [Nr] Name Type Address Offset Size EntSize Flags Link Info Align
  [ 0] NULL 0000000000000000 00000000 0000000000000000 0000000000000000 0 0 0
  [ 1] foo PROGBITS 0000000000401000 00001000 0000000000000001 0000000000000000 AX 0 0 1
  [ 2] .eh_frame_hdr PROGBITS 0000000000402000 00002000 0000000000000014 0000000000000000 A 0 0 4
  [ 3] .eh_frame PROGBITS 0000000000402018 00002018 000000000000002c 0000000000000000 A 0 0 8
  [ 4] .symtab SYMTAB 0000000000000000 00002048 00000000000000d8 0000000000000018 5 5 8
  [ 5] .strtab STRTAB 0000000000000000 00002120 000000000000002c 0000000000000000 0 0 1
  [ 6] .shstrtab STRTAB 0000000000000000 0000214c 0000000000000037 0000000000000000 0 0 1
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings), I (info), L (link order), O (extra OS processing required), G (group), T (TLS), C (compressed), x (unknown), o (OS specific), E (exclude), l (large), p (processor specific)

There are no section groups in this file.

Program Headers:
  Type Offset VirtAddr PhysAddr FileSiz MemSiz Flags Align
  LOAD 0x0000000000001000 0x0000000000401000 0x0000000000401000 0x0000000000000001 0x0000000000000001 R E 0x1000
  LOAD 0x0000000000002000 0x0000000000402000 0x0000000000402000 0x0000000000000044 0x0000000000000044 R 0x1000
  GNU_EH_FRAME 0x0000000000002000 0x0000000000402000 0x0000000000402000 0x0000000000000014 0x0000000000000014 R 0x4

 Section to Segment mapping:
  Segment Sections...
   00 foo
   01 .eh_frame_hdr .eh_frame
   02 .eh_frame_hdr

There is no dynamic section in this file.

There are no relocations in this file.

The decoding of unwind sections for machine type Advanced Micro Devices X86-64 is not currently supported.

Symbol table '.symtab' contains 9 entries:
  Num: Value Size Type Bind Vis Ndx Name
  0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
  1: 0000000000401000 0 SECTION LOCAL DEFAULT 1
  2: 0000000000402000 0 SECTION LOCAL DEFAULT 2
  3: 0000000000402018 0 SECTION LOCAL DEFAULT 3
  4: 0000000000402000 0 NOTYPE LOCAL DEFAULT 2 __GNU_EH_FRAME_HDR
  5: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND _start
  6: 0000000000404000 0 NOTYPE GLOBAL DEFAULT 3 __bss_start
  7: 0000000000404000 0 NOTYPE GLOBAL DEFAULT 3 _edata
  8: 0000000000404000 0 NOTYPE GLOBAL DEFAULT 3 _end

No version information found in this file.
Contents of the .eh_frame section:

00000000 0000000000000014 00000000 CIE
  Version: 1
  Augmentation: "zR"
  Code alignment factor: 1
  Data alignment factor: -8
  Return address column: 16
  Augmentation data: 1b
  DW_CFA_def_cfa: r7 (rsp) ofs 8
  DW_CFA_offset: r16 (rip) at cfa-8
  DW_CFA_nop
  DW_CFA_nop

00000018 0000000000000010 0000001c FDE cie=00000000 pc=0000000000401000..0000000000401001
  DW_CFA_nop
  DW_CFA_nop
  DW_CFA_nop
```

I also see ".eh_frame" dumped when there is no ".eh_frame_hdr":

```
umb@ubuntu:~/tests/81$ readelf -a wo_hdr -wf
ELF Header:
  Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
  Class: ELF64
  Data: 2's complement, little endian
  Version: 1 (current)
  OS/ABI: UNIX - System V
  ABI Version: 0
  Type: EXEC (Executable file)
  Machine: Advanced Micro Devices X86-64
  Version: 0x1
  Entry point address: 0x401000
  Start of program headers: 64 (bytes into file)
  Start of section headers: 8480 (bytes into file)
  Flags: 0x0
  Size of this header: 64 (bytes)
  Size of program headers: 56 (bytes)
  Number of program headers: 2
  Size of section headers: 64 (bytes)
  Number of section headers: 6
  Section header string table index: 5

Section Headers:
  [Nr] Name Type Address Offset Size EntSize Flags Link Info Align
  [ 0] NULL 0000000000000000 00000000 0000000000000000 0000000000000000 0 0 0
  [ 1] foo PROGBITS 0000000000401000 00001000 0000000000000001 0000000000000000 AX 0 0 1
  [ 2] .eh_frame PROGBITS 0000000000402000 00002000 000000000000002c 0000000000000000 A 0 0 8
  [ 3] .symtab SYMTAB 0000000000000000 00002030 00000000000000a8 0000000000000018 4 3 8
  [ 4] .strtab STRTAB 0000000000000000 000020d8 0000000000000019 0000000000000000 0 0 1
  [ 5] .shstrtab STRTAB 0000000000000000 000020f1 0000000000000029 0000000000000000 0 0 1
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings), I (info), L (link order), O (extra OS processing required), G (group), T (TLS), C (compressed), x (unknown), o (OS specific), E (exclude), l (large), p (processor specific)

There are no section groups in
this file.

Program Headers:
  Type Offset VirtAddr PhysAddr FileSiz MemSiz Flags Align
  LOAD 0x0000000000001000 0x0000000000401000 0x0000000000401000 0x0000000000000001 0x0000000000000001 R E 0x1000
  LOAD 0x0000000000002000 0x0000000000402000 0x0000000000402000 0x000000000000002c 0x000000000000002c R 0x1000

 Section to Segment mapping:
  Segment Sections...
   00 foo
   01 .eh_frame

There is no dynamic section in this file.

There are no relocations in this file.

The decoding of unwind sections for machine type Advanced Micro Devices X86-64 is not currently supported.

Symbol table '.symtab' contains 7 entries:
  Num: Value Size Type Bind Vis Ndx Name
  0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
  1: 0000000000401000 0 SECTION LOCAL DEFAULT 1
  2: 0000000000402000 0 SECTION LOCAL DEFAULT 2
  3: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND _start
  4: 0000000000404000 0 NOTYPE GLOBAL DEFAULT 2 __bss_start
  5: 0000000000404000 0 NOTYPE GLOBAL DEFAULT 2 _edata
  6: 0000000000404000 0 NOTYPE GLOBAL DEFAULT 2 _end

No version information found in this file.

Contents of the .eh_frame section:

00000000 0000000000000014 00000000 CIE
  Version: 1
  Augmentation: "zR"
  Code alignment factor: 1
  Data alignment factor: -8
  Return address column: 16
  Augmentation data: 1b
  DW_CFA_def_cfa: r7 (rsp) ofs 8
  DW_CFA_offset: r16 (rip) at cfa-8
  DW_CFA_nop
  DW_CFA_nop

00000018 0000000000000010 0000001c FDE cie=00000000 pc=0000000000401000..0000000000401001
  DW_CFA_nop
  DW_CFA_nop
  DW_CFA_nop
}
```

Since we have such differences in the behavior, should we just test the current behavior atm? I.e. before this diff I tested "EH_FRAME Header [", now I also added a check for ".eh_frame section at offset...". Both of them are dumped at the top level currently. Seems reasonable to test the fact we do that (with just `-all`) and the order, probably? (I am ok to change it in any way actually, but just wanted to clarify this before doing anything with it.)
================ Comment at: test/tools/llvm-readobj/all.test:100 +## An arbitrary linker-generated valid content. + Content: 040000001000000003000000474E55004FCB712AA6387724A9F465A32CD8C14B +Symbols: ---------------- jhenderson wrote: > grimar wrote: > > jhenderson wrote: > > > This could probably just be an arbitrary note, and much simpler. > > Probably. But it is already short enough and I do not want to spend time on optimising it until we have a way to describe > > it with YAML. Having a raw content is anyways not optimal. What do you think? > Time to implement SHT_NOTE sections in yaml2obj :) > > But happy for that to be later. Yep, probably `SHT_NOTE` will be the next. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68462/new/ https://reviews.llvm.org/D68462 From llvm-commits at lists.llvm.org Mon Oct 7 07:04:52 2019 From: llvm-commits at lists.llvm.org (Zachary Turner via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 14:04:52 +0000 (UTC) Subject: [PATCH] D19445: Refactor some more PDB reading code into DebugInfoPDB In-Reply-To: References: Message-ID: This revision was not accepted when it landed; it landed in state "Needs Review". This revision was automatically updated to reflect the committed changes. Closed by commit rGf34e01624a95: Refactor some more PDB reading code into DebugInfoPDB. (authored by zturner). Herald added subscribers: hiraditya, mgorny. Herald added a project: LLVM. 
Changed prior to commit: https://reviews.llvm.org/D19445?vs=54743&id=223580#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D19445/new/ https://reviews.llvm.org/D19445 Files: llvm/include/llvm/DebugInfo/PDB/Raw/PDBFile.h llvm/include/llvm/DebugInfo/PDB/Raw/PDBInfoStream.h llvm/include/llvm/DebugInfo/PDB/Raw/PDBNameMap.h llvm/include/llvm/DebugInfo/PDB/Raw/PDBRawConstants.h llvm/lib/DebugInfo/PDB/CMakeLists.txt llvm/lib/DebugInfo/PDB/Raw/PDBInfoStream.cpp llvm/lib/DebugInfo/PDB/Raw/PDBNameMap.cpp llvm/test/DebugInfo/PDB/pdbdump-headers.test llvm/tools/llvm-pdbdump/llvm-pdbdump.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D19445.223580.patch Type: text/x-patch Size: 17389 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 07:06:36 2019 From: llvm-commits at lists.llvm.org (Josef Eisl via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 14:06:36 +0000 (UTC) Subject: [PATCH] D68213: [LTO] Support for embedding bitcode section during LTO In-Reply-To: References: Message-ID: <82000c0445526bf9b19adb0a4fae82e3@localhost.localdomain> zapster added a comment. (ping) CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68213/new/ https://reviews.llvm.org/D68213 From llvm-commits at lists.llvm.org Mon Oct 7 07:10:21 2019 From: llvm-commits at lists.llvm.org (Amaury Sechet via llvm-commits) Date: Mon, 07 Oct 2019 14:10:21 -0000 Subject: [llvm] r373908 - Regenerate ptr-rotate.ll . NFC Message-ID: <20191007141021.B1AC68A9D2@lists.llvm.org> Author: deadalnix Date: Mon Oct 7 07:10:21 2019 New Revision: 373908 URL: http://llvm.org/viewvc/llvm-project?rev=373908&view=rev Log: Regenerate ptr-rotate.ll . 
NFC Modified: llvm/trunk/test/CodeGen/X86/ptr-rotate.ll Modified: llvm/trunk/test/CodeGen/X86/ptr-rotate.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/ptr-rotate.ll?rev=373908&r1=373907&r2=373908&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/ptr-rotate.ll (original) +++ llvm/trunk/test/CodeGen/X86/ptr-rotate.ll Mon Oct 7 07:10:21 2019 @@ -1,11 +1,16 @@ +; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py ; RUN: llc -mtriple=i386-apple-darwin -mcpu=corei7 -o - < %s | FileCheck %s define i32 @func(i8* %A) nounwind readnone { +; CHECK-LABEL: func: +; CHECK: ## %bb.0: ## %entry +; CHECK-NEXT: movl {{[0-9]+}}(%esp), %eax +; CHECK-NEXT: roll $27, %eax +; CHECK-NEXT: retl entry: %tmp = ptrtoint i8* %A to i32 %shr = lshr i32 %tmp, 5 %shl = shl i32 %tmp, 27 %or = or i32 %shr, %shl -; CHECK: roll $27 ret i32 %or } From llvm-commits at lists.llvm.org Mon Oct 7 07:10:23 2019 From: llvm-commits at lists.llvm.org (Jeroen Dobbelaere via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 14:10:23 +0000 (UTC) Subject: [PATCH] D68488: [PATCH 05/38] [noalias] [IR] Introduce noalias_sidechannel for LoadInst/StoreInst In-Reply-To: References: Message-ID: jeroen.dobbelaere updated this revision to Diff 223582. jeroen.dobbelaere added a comment. Herald added a subscriber: jfb. Added assert message; clang-format CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68488/new/ https://reviews.llvm.org/D68488 Files: llvm/include/llvm/IR/InstVisitor.h llvm/include/llvm/IR/Instructions.h llvm/include/llvm/IR/User.h llvm/include/llvm/IR/Value.h llvm/lib/IR/AsmWriter.cpp llvm/lib/IR/Instructions.cpp llvm/lib/IR/User.cpp llvm/lib/IR/Value.cpp llvm/unittests/IR/IRBuilderTest.cpp -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68488.223582.patch Type: text/x-patch Size: 19219 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 07:14:46 2019 From: llvm-commits at lists.llvm.org (Kevin P. Neal via llvm-commits) Date: Mon, 07 Oct 2019 14:14:46 -0000 Subject: [llvm] r373909 - Fix another sphinx warning. Message-ID: <20191007141446.664F186AA8@lists.llvm.org> Author: kpn Date: Mon Oct 7 07:14:46 2019 New Revision: 373909 URL: http://llvm.org/viewvc/llvm-project?rev=373909&view=rev Log: Fix another sphinx warning. Differential Revision: https://reviews.llvm.org/D64746 Modified: llvm/trunk/docs/LangRef.rst Modified: llvm/trunk/docs/LangRef.rst URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/docs/LangRef.rst?rev=373909&r1=373908&r2=373909&view=diff ============================================================================== --- llvm/trunk/docs/LangRef.rst (original) +++ llvm/trunk/docs/LangRef.rst Mon Oct 7 07:14:46 2019 @@ -16259,7 +16259,7 @@ would and handles error conditions in th '``llvm.experimental.constrained.lround``' Intrinsic -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Syntax: """"""" @@ -16297,7 +16297,7 @@ would and handles error conditions in th '``llvm.experimental.constrained.llround``' Intrinsic -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Syntax: """"""" From llvm-commits at lists.llvm.org Mon Oct 7 07:13:42 2019 From: llvm-commits at lists.llvm.org (Nikola Prica via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 14:13:42 +0000 (UTC) Subject: [PATCH] D67556: [ARM][AArch64][DebugInfo] Improve call site instruction interpretation In-Reply-To: References: Message-ID: NikolaPrica marked an inline comment as done. NikolaPrica added inline comments. 
================ Comment at: include/llvm/CodeGen/TargetInstrInfo.h:888 + /// If the specific machine instruction is an instruction that adds an + /// immediate value to its first operand and stores it in the first, return + /// true along with @Source machine operand to which @Offset has been ---------------- dstenb wrote: > NikolaPrica wrote: > > dstenb wrote: > > > dstenb wrote: > > > > I wonder if the hook should allow the source and destination to be different, as we then for example could describe cases like this: > > > > > > > > ``` > > > > $reg0 = add $frame-ptr, -13 > > > > ``` > > > > > > > > If so, would it then make sense to move the LEA part of X86's `describeLoadedValue()` hook into this hook instead? > > > If so, should we perhaps also consider generalizing the hook so that it has a Destination out-parameter, e.g. same as `isCopyInstr()`? That could probably be helpful if we make the `describeLoadedValue()` hook aware of which register it should describe, as we discussed in D67225. > > > I wonder if the hook should allow the source and destination to be different > > > > > > In fact we should only rely on situations where the source and destination operands are different. Such a restriction should be applied in the general part of `describeLoadedValue()`. There is no use in describing situations like > > > > $reg0 = add $reg0, 4 > > > > This case would require a recursive description of $reg0. Describing such an instruction is a different story. > > > > > If so, would it then make sense to move the LEA part of X86's describeLoadedValue() hook into this hook instead? > > > > The LEA instruction is more complex than an add immediate instruction. It could be observed as an add immediate in one case, but it could also add multiple source registers with some multiplication operations. IMHO it would be better to keep this API function clean, in the sense of recognizing only a clear add immediate instruction for the purpose of further usage.
> > > > > If so, should we perhaps also consider generalizing the hook so that it has a Destination out-parameter, e.g. same as isCopyInstr()? That could probably be helpful if we make the describeLoadedValue() hook aware of which register it should describe, as we discussed in D67225. > > > > In the first version of this function I've added a destination operand but I've removed it since there was no current use of it. But for further flexibility I will add it. > > > > In fact we should only rely on situations where the source and destination operands are different. Such a restriction should be applied in the general part of describeLoadedValue(). There is no use in describing situations like > > > > $reg0 = add $reg0, 4 > > In previous revisions of the downstream target we develop for we had to resort to: > > ``` > $reg0 = mov $frame-ptr > $reg0 = add $reg0, $offset > ``` > > instead of loading the frame pointer with an offset in one instruction. Perhaps there is some upstream target that requires the same? > > > This case would require a recursive description of $reg0. Describing such an instruction is a different story. > > Is that due to the issue with expressions in collectCallSiteParameters() which we discussed earlier in this patch? > > > The LEA instruction is more complex than an add immediate instruction. It could be observed as an add immediate in one case, but it could also add multiple source registers with some multiplication operations. IMHO it would be better to keep this API function clean, in the sense of recognizing only a clear add immediate instruction for the purpose of further usage. > > Okay, that sounds fair. Moving some parts of the LEA implementation to this hook, and keeping the rest in `describeLoadedValue()` would probably not be ideal. > > > In the first version of this function I've added a destination operand but I've removed it since there was no current use of it. But for further flexibility I will add it. > > Okay, thanks!
> Is that due to the issue with expressions in collectCallSiteParameters() which we discussed earlier in this patch? Yes. You are right. Such cases should be handled the way we discussed there. But until such support is provided such instructions ($reg0 = add $reg0, 4) should be omitted. I will emphasize that as TODO comment. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67556/new/ https://reviews.llvm.org/D67556 From llvm-commits at lists.llvm.org Mon Oct 7 07:13:57 2019 From: llvm-commits at lists.llvm.org (Sanjay Patel via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 14:13:57 +0000 (UTC) Subject: [PATCH] D67841: [SLP] avoid reduction transform on patterns that the backend can load-combine In-Reply-To: References: Message-ID: spatel added a comment. In D67841#1696910 , @mstorsjo wrote: > This caused lots of failed asserts in building many different projects, see https://bugs.llvm.org/show_bug.cgi?id=43582, so I went ahead and reverted it for now. Thanks. I looked at the test cases attached to the bug report, and this patch causes scary behavior: LV: Found an estimated cost of 4294967293 for VF 1 For instruction: %or75 = or i32 %shl74, %shl71 The loop vectorizer assumes that costs are always positive (it converts the value returned by the cost model to an *unsigned* value). This matches the assert in the getArithmeticInstrCost() implementation that we tried to bypass: assert(Cost >= 0 && "TTI should not produce negative costs!"); But we want SLP to weigh the *relative* cost of scalar code (that will be reduced) vs. vector code. I think we should use the earlier revision of this patch that created a dedicated function for estimating a load combining pattern. Ie, we tried to squeeze this into the more general getArithmeticInstrCost() API, but it does not belong there. Existing callers have made assumptions about using that cost model API, and we violated the contract: /// This is an approximation of reciprocal throughput of a math/logic op. 
/// A higher cost indicates less expected throughput. /// From Agner Fog's guides, reciprocal throughput is "the average number of /// clock cycles per instruction when the instructions are not part of a /// limiting dependency chain." /// Therefore, costs should be scaled to account for multiple execution units /// on the target that can process this type of instruction. For example, if /// there are 5 scalar integer units and 2 vector integer units that can /// calculate an 'add' in a single cycle, this model should indicate that the /// cost of the vector add instruction is 2.5 times the cost of the scalar /// add instruction. /// \p Args is an optional argument which holds the instruction operands /// values so the TTI can analyze those values searching for special /// cases or optimizations based on those values. int getArithmeticInstrCost( Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67841/new/ https://reviews.llvm.org/D67841 From llvm-commits at lists.llvm.org Mon Oct 7 07:14:48 2019 From: llvm-commits at lists.llvm.org (Jeroen Dobbelaere via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 14:14:48 +0000 (UTC) Subject: [PATCH] D68491: [PATCH 08/38] [noalias] [IR] IRBuilder support for noalias intrinsics. In-Reply-To: References: Message-ID: <0c069d10b34851e9d72a60ae7761196c@localhost.localdomain> jeroen.dobbelaere updated this revision to Diff 223586. jeroen.dobbelaere added a comment. Treat objId as uint64_t everywhere; treat indices as int64_t; use std::forward_as_tuple; .use insert instead of for-loop+push_back. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68491/new/ https://reviews.llvm.org/D68491 Files: llvm/include/llvm/IR/IRBuilder.h llvm/include/llvm/IR/IntrinsicInst.h llvm/include/llvm/IR/Intrinsics.h llvm/lib/IR/IRBuilder.cpp llvm/lib/IR/Verifier.cpp llvm/unittests/IR/IRBuilderTest.cpp -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68491.223586.patch Type: text/x-patch Size: 22710 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 07:16:01 2019 From: llvm-commits at lists.llvm.org (Jeroen Dobbelaere via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 14:16:01 +0000 (UTC) Subject: [PATCH] D68494: [PATCH 11/38] [noalias] D9377: llvm.noalias - don't block EarlyCSE In-Reply-To: References: Message-ID: <04a7a5c68140c6ac5b26f52bc03e41ca@localhost.localdomain> jeroen.dobbelaere updated this revision to Diff 223589. jeroen.dobbelaere added a comment. Fix bad rebase. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68494/new/ https://reviews.llvm.org/D68494 Files: llvm/lib/Transforms/Scalar/EarlyCSE.cpp llvm/test/Transforms/EarlyCSE/basic.ll Index: llvm/test/Transforms/EarlyCSE/basic.ll =================================================================== --- llvm/test/Transforms/EarlyCSE/basic.ll +++ llvm/test/Transforms/EarlyCSE/basic.ll @@ -54,6 +54,16 @@ ; CHECK: ret i32 0 } +; CHECK-LABEL: @test2b( +define i32 @test2b(i32 *%P, i1 %b) { + %V1 = load i32, i32* %P + call i8* @llvm.noalias.p0i8(i8* undef, metadata !1) + %V2 = load i32, i32* %P + %Diff = sub i32 %V1, %V2 + ret i32 %Diff + ; CHECK: ret i32 0 +} + ;; Cross block load value numbering. ; CHECK-LABEL: @test3( define i32 @test3(i32 *%P, i1 %Cond) { @@ -134,6 +144,15 @@ ; CHECK: ret i32 42 } +; CHECK-LABEL: @test6b( +define i32 @test6b(i32 *%P, i1 %b) { + store i32 42, i32* %P + call i8* @llvm.noalias.p0i8(i8* undef, metadata !1) + %V1 = load i32, i32* %P + ret i32 %V1 + ; CHECK: ret i32 42 +} + ;; Trivial dead store elimination. 
; CHECK-LABEL: @test7( define void @test7(i32 *%P) { @@ -291,3 +310,8 @@ store i32 2, i32* @c, align 4 ret void } + +declare i8* @llvm.noalias.p0i8(i8*, metadata) nounwind + +!0 = !{!0, !"some domain"} +!1 = !{!1, !0, !"some scope"} Index: llvm/lib/Transforms/Scalar/EarlyCSE.cpp =================================================================== --- llvm/lib/Transforms/Scalar/EarlyCSE.cpp +++ llvm/lib/Transforms/Scalar/EarlyCSE.cpp @@ -932,8 +932,10 @@ } // Skip sideeffect intrinsics, for the same reason as assume intrinsics. - if (match(Inst, m_Intrinsic())) { - LLVM_DEBUG(dbgs() << "EarlyCSE skipping sideeffect: " << *Inst << '\n'); + // Likewise, noalias intrinsics don't actually write. + if (match(Inst, m_CombineOr(m_Intrinsic(), + m_Intrinsic()))) { + LLVM_DEBUG(dbgs() << "EarlyCSE skipping intrinsic: " << *Inst << '\n'); continue; } -------------- next part -------------- A non-text attachment was scrubbed... Name: D68494.223589.patch Type: text/x-patch Size: 1892 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 07:16:54 2019 From: llvm-commits at lists.llvm.org (Thomas Preud'homme via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 14:16:54 +0000 (UTC) Subject: [PATCH] D68472: [test] Depend on C.UTF-8 dependency for mri-utf8.test In-Reply-To: References: Message-ID: thopre updated this revision to Diff 223591. thopre marked an inline comment as done. thopre added a comment. 
Fix typo and add BOM Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68472/new/ https://reviews.llvm.org/D68472 Files: llvm/test/tools/llvm-ar/mri-nonascii.test llvm/test/tools/llvm-ar/mri-utf8.test Index: llvm/test/tools/llvm-ar/mri-utf8.test =================================================================== --- llvm/test/tools/llvm-ar/mri-utf8.test +++ /dev/null @@ -1,23 +0,0 @@ -# Test non-ascii archive members -# XFAIL: system-darwin - -RUN: rm -rf %t && mkdir -p %t/extracted - -RUN: echo "contents" > %t/£.txt - -RUN: echo "CREATE %t/mri.ar" > %t/script.mri -RUN: echo "ADDMOD %t/£.txt" >> %t/script.mri -RUN: echo "SAVE" >> %t/script.mri - -RUN: llvm-ar -M < %t/script.mri -RUN: cd %t/extracted && llvm-ar x %t/mri.ar - -# This works around problems launching processess that -# include arguments with non-ascii characters. -# Python on Linux defaults to ASCII encoding unless the -# environment specifies otherwise, so it is explicitly set. -# The reliance the test has on this locale is not ideal, -# however alternate solutions have been difficult due to -# behaviour differences with python 2 vs python 3, -# and linux vs windows. -RUN: env LANG=en_US.UTF-8 %python -c "assert open(u'\U000000A3.txt', 'rb').read() == b'contents\n'" Index: llvm/test/tools/llvm-ar/mri-nonascii.test =================================================================== --- /dev/null +++ llvm/test/tools/llvm-ar/mri-nonascii.test @@ -0,0 +1,19 @@ +# Test non-ascii archive members +# XFAIL: system-darwin + +RUN: rm -rf %t && mkdir -p %t/extracted + +RUN: echo "contents" > %t/£.txt + +RUN: echo "CREATE %t/mri.ar" > %t/script.mri +RUN: echo "ADDMOD %t/£.txt" >> %t/script.mri +RUN: echo "SAVE" >> %t/script.mri + +RUN: llvm-ar -M < %t/script.mri +RUN: cd %t/extracted && llvm-ar x %t/mri.ar + +# Use input redirection to work around problems launching processes that +# include arguments with non-ascii characters. 
+RUN: FileCheck --strict-whitespace %s <£.txt +CHECK:{{^}} +CHECK-SAME:{{^}}contents{{$}} -------------- next part -------------- A non-text attachment was scrubbed... Name: D68472.223591.patch Type: text/x-patch Size: 1811 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 07:17:04 2019 From: llvm-commits at lists.llvm.org (Thomas Preud'homme via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 14:17:04 +0000 (UTC) Subject: [PATCH] D68472: [test] Depend on C.UTF-8 dependency for mri-utf8.test In-Reply-To: References: Message-ID: <8ae8fffa05cb126f7190107473ca28ad@localhost.localdomain> thopre marked an inline comment as done. thopre added inline comments. ================ Comment at: llvm/test/tools/llvm-ar/mri-nonascii.test:15 + +# Use input redirection to work around problems launching processess that +# include arguments with non-ascii characters. ---------------- MaskRay wrote: > hubert.reinterpretcast wrote: > > Minor nit: s/processess/processes/; > What problems do you work around? POSIX.1-2017 3.282 Portable Filename Character Set consists of the classical Latin alphabet, 0~9, <period>, <underscore>, and <hyphen-minus>. A filename consisting of the UTF-8 byte sequence 0xc2 0xa3 (£) may be disallowed by some implementations, but it is unlikely that the implementation can arbitrarily reinterpret the byte sequence and cause the test to fail. > > I suggest deleting the comment. The original message is not mine, so I'm not sure what it referred to. It might be that arguments are passed down to the program being invoked without interpretation, thus the filename would be UTF-8 encoded since that is what mri-utf8.test is encoded in. This would fail on Windows, where filenames must be UTF-16 and the output redirection of the earlier line would have created a filename in UTF-16. I'll let Owen confirm.
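For reference, a minimal sketch of the two encodings mentioned in the comment above. It is only an illustration and is not part of the patch or the test; the variable names are made up for this example, and it assumes the Windows form is UTF-16 little-endian. It shows the bytes that the name `£.txt` would occupy as UTF-8 (what the test file contains) and as UTF-16.

```python
# Illustration only: byte sequences for the non-ASCII file name in the test.
# The variable names are hypothetical and do not appear in the patch.
name = "\u00a3.txt"  # the pound sign followed by ".txt"

# UTF-8: the pound sign becomes the two bytes 0xc2 0xa3.
utf8_bytes = name.encode("utf-8")

# UTF-16 (little-endian, assumed here for Windows): every character takes two bytes.
utf16_bytes = name.encode("utf-16-le")

print(utf8_bytes.hex())   # c2a32e747874
print(utf16_bytes.hex())  # a3002e00740078007400
```

The two byte strings differ, which is the mismatch the comment speculates about when a UTF-8 encoded argument is passed through without interpretation.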
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68472/new/ https://reviews.llvm.org/D68472 From llvm-commits at lists.llvm.org Mon Oct 7 07:17:09 2019 From: llvm-commits at lists.llvm.org (Sam Parker via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 14:17:09 +0000 (UTC) Subject: [PATCH] D68337: [ARM][MVE] Enable extending masked loads In-Reply-To: References: Message-ID: <39ce1c47ec42a39d10ed82ff1e2ab2c2@localhost.localdomain> samparker marked 2 inline comments as done. samparker added inline comments. ================ Comment at: lib/Target/ARM/ARMISelLowering.cpp:8887 // zero too, and other values are lowered to a select. SDValue ZeroVec = DAG.getNode(ARMISD::VMOVIMM, dl, VT, DAG.getTargetConstant(0, dl, MVT::i32)); ---------------- dmgreen wrote: > This is creating a zero vector of size VT, which is the size of what the masked load returns. Should it instead be the size of the memory being loaded (because the extend happens to the passthru as well)? What happens if that isn't a legal value type? Well, surely the result VT of the masked load has to match the VT of the passthru input. The passthru is not about what memory is accessed, but what is written to the destination register. VMOVIMM will also generate the same zero value for all full width vector types, so for vector widths less than 128-bits, the higher elements will be zeroed and that makes sense. For vectors wider than 128-bits, I think something would have gone wrong before here. I'll add some tests for both these cases. ================ Comment at: lib/Target/ARM/ARMTargetTransformInfo.cpp:511 + // Only support extending integers if the memory is aligned. + if ((EltWidth == 16 && Alignment < 2) || + (EltWidth == 32 && Alignment < 4)) ---------------- dmgreen wrote: > samparker wrote: > > dmgreen wrote: > > > If this is coming from codegen, can the alignment here be 0?
I think in ISel it is always set (and clang will always set it), but it may not be guaranteed in llvm in general. > > I can't see anything in the spec for any guarantees of these intrinsics, but for normal loads, it becomes defined by the target ABI. It's always safe for us to use an i8* accessor, so I don't see 0 being a problem here. > Yeah. Alignment of 0 means ABI alignment, which means 8, not unaligned. > > I think it may be better to just check this alignment is always the case, getting rid of that weird "use i8's to load unaligned masked loads" thing. That was probably a bad idea, more trouble than it's worth. > > I think what will happen here at the moment is that the Vectorizer will call isLegalMaskedLoad with a scalar type and an alignment (which, let's say, is unaligned). That alignment won't be checked, so the masked loads and stores will be created. Then when we get to the backend the legalizer will call this with a vector type and we'll hit this check, expanding out the masked load into that very inefficient bunch of code. Which is probably something that we want to avoid. Hmmm, okay. I also can't see removing unaligned support having a big negative effect. Sounds like I need to add some vectorization tests too, unless we already have them? CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68337/new/ https://reviews.llvm.org/D68337 From llvm-commits at lists.llvm.org Mon Oct 7 07:17:10 2019 From: llvm-commits at lists.llvm.org (Jeroen Dobbelaere via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 14:17:10 +0000 (UTC) Subject: [PATCH] D68495: [PATCH 12/38] [noalias] EarlyCSE: learn about noalias intrinsics In-Reply-To: References: Message-ID: jeroen.dobbelaere updated this revision to Diff 223590. jeroen.dobbelaere added a comment.
Adapt to changed rebase in D68494 CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68495/new/ https://reviews.llvm.org/D68495 Files: llvm/lib/Transforms/Scalar/EarlyCSE.cpp llvm/test/Transforms/EarlyCSE/basic.ll Index: llvm/test/Transforms/EarlyCSE/basic.ll =================================================================== --- llvm/test/Transforms/EarlyCSE/basic.ll +++ llvm/test/Transforms/EarlyCSE/basic.ll @@ -57,7 +57,17 @@ ; CHECK-LABEL: @test2b( define i32 @test2b(i32 *%P, i1 %b) { %V1 = load i32, i32* %P - call i8* @llvm.noalias.p0i8(i8* undef, metadata !1) + call i8* @llvm.noalias.p0i8.p0i8.p0p0i8.i32(i8* undef, i8* null, i8** null, i32 0, metadata !1) + %V2 = load i32, i32* %P + %Diff = sub i32 %V1, %V2 + ret i32 %Diff + ; CHECK: ret i32 0 +} + +; CHECK-LABEL: @test2c( +define i32 @test2c(i32 *%P, i1 %b) { + %V1 = load i32, i32* %P + call i8* @llvm.side.noalias.p0i8.p0i8.p0p0i8.p0p0i8.i32(i8* undef, i8* null, i8** null, i8** null, i32 0, metadata !1) %V2 = load i32, i32* %P %Diff = sub i32 %V1, %V2 ret i32 %Diff @@ -147,7 +157,16 @@ ; CHECK-LABEL: @test6b( define i32 @test6b(i32 *%P, i1 %b) { store i32 42, i32* %P - call i8* @llvm.noalias.p0i8(i8* undef, metadata !1) + call i8* @llvm.noalias.p0i8.p0i8.p0p0i8.i32(i8* undef, i8* null, i8** null, i32 0, metadata !1) + %V1 = load i32, i32* %P + ret i32 %V1 + ; CHECK: ret i32 42 +} + +; CHECK-LABEL: @test6c( +define i32 @test6c(i32 *%P, i1 %b) { + store i32 42, i32* %P + call i8* @llvm.side.noalias.p0i8.p0i8.p0p0i8.p0p0i8.i32(i8* undef, i8* null, i8** null, i8** null, i32 0, metadata !1) %V1 = load i32, i32* %P ret i32 %V1 ; CHECK: ret i32 42 @@ -311,7 +330,8 @@ ret void } -declare i8* @llvm.noalias.p0i8(i8*, metadata) nounwind +declare i8* @llvm.noalias.p0i8.p0i8.p0p0i8.i32(i8*, i8*, i8**, i32, metadata ) nounwind +declare i8* @llvm.side.noalias.p0i8.p0i8.p0p0i8.p0p0i8.i32(i8*, i8*, i8**, i8**, i32, metadata ) nounwind !0 = !{!0, !"some domain"} !1 = !{!1, !0, !"some scope"} Index: 
llvm/lib/Transforms/Scalar/EarlyCSE.cpp =================================================================== --- llvm/lib/Transforms/Scalar/EarlyCSE.cpp +++ llvm/lib/Transforms/Scalar/EarlyCSE.cpp @@ -927,15 +927,23 @@ << '\n'); AvailableValues.insert(CondI, ConstantInt::getTrue(BB->getContext())); } else - LLVM_DEBUG(dbgs() << "EarlyCSE skipping assumption: " << *Inst << '\n'); + LLVM_DEBUG(dbgs() << "EarlyCSE skipping intrinsic: " << *Inst << '\n'); continue; } - // Skip sideeffect intrinsics, for the same reason as assume intrinsics. // Likewise, noalias intrinsics don't actually write. - if (match(Inst, m_CombineOr(m_Intrinsic(), - m_Intrinsic()))) { - LLVM_DEBUG(dbgs() << "EarlyCSE skipping intrinsic: " << *Inst << '\n'); + if (match(Inst, m_Intrinsic()) || + match(Inst, m_Intrinsic()) || + match(Inst, m_Intrinsic()) || + match(Inst, m_Intrinsic())) { + LLVM_DEBUG(dbgs() << "EarlyCSE skipping noalias intrinsic: " << *Inst + << '\n'); + continue; + } + + // Skip sideeffect intrinsics, for the same reason as assume intrinsics. + if (match(Inst, m_Intrinsic())) { + LLVM_DEBUG(dbgs() << "EarlyCSE skipping sideeffect: " << *Inst << '\n'); continue; } -------------- next part -------------- A non-text attachment was scrubbed... Name: D68495.223590.patch Type: text/x-patch Size: 3350 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 07:19:49 2019 From: llvm-commits at lists.llvm.org (Sam Parker via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 14:19:49 +0000 (UTC) Subject: [PATCH] D68461: [ARM][MVE] Enable truncating masked stores In-Reply-To: References: Message-ID: <41aa2a3a8ba2bfc74e4577bc4bfcf39e@localhost.localdomain> samparker marked an inline comment as done. samparker added inline comments. ================ Comment at: lib/Target/ARM/ARMTargetTransformInfo.cpp:498 - if (DataTy->isVectorTy()) { - // We don't yet support narrowing or widening masked loads/stores. Expand - // them for the moment. 
- unsigned VecWidth = DataTy->getPrimitiveSizeInBits(); - if (VecWidth != 128) + unsigned EltWidth = DataTy->getScalarSizeInBits(); + if (auto *VecTy = dyn_cast(DataTy)) { ---------------- dmgreen wrote: > This is the same as in the load patch? Yes, sorry. I've got two separate downstream branches and forgot to keep this part off this patch. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68461/new/ https://reviews.llvm.org/D68461 From llvm-commits at lists.llvm.org Mon Oct 7 07:19:56 2019 From: llvm-commits at lists.llvm.org (Jeroen Dobbelaere via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 14:19:56 +0000 (UTC) Subject: [PATCH] D68509: [PATCH 26/38] [noalias] Use noalias intrinsics when inlining and keep metadata up to date. In-Reply-To: References: Message-ID: jeroen.dobbelaere updated this revision to Diff 223593. jeroen.dobbelaere added a comment. Adapt tests to i64 p.objId. (Was i32) CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68509/new/ https://reviews.llvm.org/D68509 Files: llvm/include/llvm/Transforms/Utils/NoAliasUtils.h llvm/lib/Transforms/Utils/CMakeLists.txt llvm/lib/Transforms/Utils/CloneFunction.cpp llvm/lib/Transforms/Utils/InlineFunction.cpp llvm/lib/Transforms/Utils/NoAliasUtils.cpp llvm/test/Transforms/Inline/noalias-calls.ll llvm/test/Transforms/Inline/noalias-scopes.ll llvm/test/Transforms/Inline/noalias.ll llvm/test/Transforms/Inline/noalias2.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D68509.223593.patch Type: text/x-patch Size: 50000 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 07:20:04 2019 From: llvm-commits at lists.llvm.org (Digger via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 14:20:04 +0000 (UTC) Subject: [PATCH] D67008: implement parsing relocation information for 32-bit xcoff objectfile In-Reply-To: References: Message-ID: <28e38a47a81e112540a083b40e4e430f@localhost.localdomain> DiggerLin marked 5 inline comments as done. 
DiggerLin added inline comments. ================ Comment at: llvm/lib/Object/XCOFFObjectFile.cpp:560 +// to the discussion of overflow headers in "Sections and Section Headers". +uint32_t XCOFFObjectFile::getLogicalNumberOfRelocationEntries( + const XCOFFSectionHeader32 &Sec, uint16_t SectionIndex) const { ---------------- sfertile wrote: > hubert.reinterpretcast wrote: > > @sfertile, suggested that this be a separate patch. Could we land that first (with an update to how `STYP_OVRFLO` section headers are printed)? > Yes Please. I have created a new patch for it. https://reviews.llvm.org/D68575 . implement parsing overflow section header. ================ Comment at: llvm/test/tools/llvm-readobj/reloc_overflow.ll:1 +# RUN: llvm-readobj --sections %p/Inputs/xcoff-reloc-overflow.o | \ +# RUN: FileCheck --check-prefix=SECOVERFLOW %s ---------------- sfertile wrote: > The `.ll` suffix implies the test is written in LLVM IR. Use `.test` instead. I will change the name ================ Comment at: llvm/tools/llvm-readobj/XCOFFDumper.cpp:140 + // Only the .text, .data, .tdata, and STYP_DWARF sections have relocation. + if (Sec.Flags != XCOFF::STYP_TEXT && Sec.Flags != XCOFF::STYP_DATA && + Sec.Flags != XCOFF::STYP_TDATA && Sec.Flags != XCOFF::STYP_DWARF) ---------------- sfertile wrote: > Is this specified in the docs? I wasn't able to find it specified anywhere. What about the exception section? I don't know anything about the exception implementation on AIX so I could be wrong, but I suspect it might contain relocations. > > I did find the specification of the special relocations in the loader table, and that they are a different format from the 'normal' relocations implemented in this patch. Does the loader section use the relocation pointer and relocation count in the section header table for these different relocations, or do we find them through fields defined in the loader section itself? from the xcoff document. 
s_relptr Recognized for the .text, .data, .tdata, and STYP_DWARF sections only. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67008/new/ https://reviews.llvm.org/D67008 From llvm-commits at lists.llvm.org Mon Oct 7 07:20:13 2019 From: llvm-commits at lists.llvm.org (James Henderson via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 14:20:13 +0000 (UTC) Subject: [PATCH] D68462: [llvm-readobj/llvm-readelf] - Add checks for GNU-style to "all.test" test case. In-Reply-To: References: Message-ID: jhenderson accepted this revision. jhenderson added a comment. This revision is now accepted and ready to land. LGTM. ================ Comment at: test/tools/llvm-readobj/all.test:86-90 + - Name: .eh_frame_hdr + Type: SHT_PROGBITS +## An arbitrary linker-generated valid content. + Content: 011b033b140000000100000000f0ffff30000000 + - Name: .eh_frame ---------------- grimar wrote: > jhenderson wrote: > > grimar wrote: > > > jhenderson wrote: > > > > Same comments as earlier. Can these be empty? > > > No. We need to have something valid here, otherwise any > > > error triggered will fail the dumping. > > > (e.g. https://github.com/llvm-mirror/llvm/blob/master/tools/llvm-readobj/DwarfCFIEHPrinter.h#L127). > > You don't actually need the .eh_frame_hdr at all, looking at the code, I think, just the .eh_frame section. That said, this appears to be different from GNU readelf. > > > > For reference, GNU readelf prints "There are no unwind sections in this file" if there are no .eh_frame_hdr sections even if there is a .eh_frame section (not looked to see what happens if there is a PT_GNU_EH_FRAME program header). > I am a bit confused. 
> > Imagine we have the code and invocations below: > > "1.s": > ``` > .section foo,"ax", at progbits > .cfi_startproc > nop > .cfi_endproc > ``` > > ``` > as 1.s -o 1.o > ld.bfd 1.o -o with_hdr --eh-frame-hdr > ld.bfd 1.o -o wo_hdr > ``` > > For both of them I do not see neither `.eh_frame_hdr` nor `.eh_frame` section dumped with `-a`. > I see ".eh_frame" dumped when I add `-wf` though (but still no `.eh_frame_hdr`). > e.g.: > > > ``` > umb at ubuntu:~/tests/81$ readelf -v > GNU readelf (GNU Binutils for Ubuntu) 2.31.1 > Copyright (C) 2018 Free Software Foundation, Inc. > > umb at ubuntu:~/tests/81$ readelf -a with_hdr > ELF Header: > Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00 > Class: ELF64 > Data: 2's complement, little endian > Version: 1 (current) > OS/ABI: UNIX - System V > ABI Version: 0 > Type: EXEC (Executable file) > Machine: Advanced Micro Devices X86-64 > Version: 0x1 > Entry point address: 0x401000 > Start of program headers: 64 (bytes into file) > Start of section headers: 8584 (bytes into file) > Flags: 0x0 > Size of this header: 64 (bytes) > Size of program headers: 56 (bytes) > Number of program headers: 3 > Size of section headers: 64 (bytes) > Number of section headers: 7 > Section header string table index: 6 > > Section Headers: > [Nr] Name Type Address Offset > Size EntSize Flags Link Info Align > [ 0] NULL 0000000000000000 00000000 > 0000000000000000 0000000000000000 0 0 0 > [ 1] foo PROGBITS 0000000000401000 00001000 > 0000000000000001 0000000000000000 AX 0 0 1 > [ 2] .eh_frame_hdr PROGBITS 0000000000402000 00002000 > 0000000000000014 0000000000000000 A 0 0 4 > [ 3] .eh_frame PROGBITS 0000000000402018 00002018 > 000000000000002c 0000000000000000 A 0 0 8 > [ 4] .symtab SYMTAB 0000000000000000 00002048 > 00000000000000d8 0000000000000018 5 5 8 > [ 5] .strtab STRTAB 0000000000000000 00002120 > 000000000000002c 0000000000000000 0 0 1 > [ 6] .shstrtab STRTAB 0000000000000000 0000214c > 0000000000000037 0000000000000000 0 0 1 > Key to 
Flags: > W (write), A (alloc), X (execute), M (merge), S (strings), I (info), > L (link order), O (extra OS processing required), G (group), T (TLS), > C (compressed), x (unknown), o (OS specific), E (exclude), > l (large), p (processor specific) > > There are no section groups in this file. > > Program Headers: > Type Offset VirtAddr PhysAddr > FileSiz MemSiz Flags Align > LOAD 0x0000000000001000 0x0000000000401000 0x0000000000401000 > 0x0000000000000001 0x0000000000000001 R E 0x1000 > LOAD 0x0000000000002000 0x0000000000402000 0x0000000000402000 > 0x0000000000000044 0x0000000000000044 R 0x1000 > GNU_EH_FRAME 0x0000000000002000 0x0000000000402000 0x0000000000402000 > 0x0000000000000014 0x0000000000000014 R 0x4 > > Section to Segment mapping: > Segment Sections... > 00 foo > 01 .eh_frame_hdr .eh_frame > 02 .eh_frame_hdr > > There is no dynamic section in this file. > > There are no relocations in this file. > > The decoding of unwind sections for machine type Advanced Micro Devices X86-64 is not currently supported. > > Symbol table '.symtab' contains 9 entries: > Num: Value Size Type Bind Vis Ndx Name > 0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND > 1: 0000000000401000 0 SECTION LOCAL DEFAULT 1 > 2: 0000000000402000 0 SECTION LOCAL DEFAULT 2 > 3: 0000000000402018 0 SECTION LOCAL DEFAULT 3 > 4: 0000000000402000 0 NOTYPE LOCAL DEFAULT 2 __GNU_EH_FRAME_HDR > 5: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND _start > 6: 0000000000404000 0 NOTYPE GLOBAL DEFAULT 3 __bss_start > 7: 0000000000404000 0 NOTYPE GLOBAL DEFAULT 3 _edata > 8: 0000000000404000 0 NOTYPE GLOBAL DEFAULT 3 _end > > No version information found in this file. 
> > umb at ubuntu:~/tests/81$ readelf -a with_hdr -wf > ELF Header: > Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00 > Class: ELF64 > Data: 2's complement, little endian > Version: 1 (current) > OS/ABI: UNIX - System V > ABI Version: 0 > Type: EXEC (Executable file) > Machine: Advanced Micro Devices X86-64 > Version: 0x1 > Entry point address: 0x401000 > Start of program headers: 64 (bytes into file) > Start of section headers: 8584 (bytes into file) > Flags: 0x0 > Size of this header: 64 (bytes) > Size of program headers: 56 (bytes) > Number of program headers: 3 > Size of section headers: 64 (bytes) > Number of section headers: 7 > Section header string table index: 6 > > Section Headers: > [Nr] Name Type Address Offset > Size EntSize Flags Link Info Align > [ 0] NULL 0000000000000000 00000000 > 0000000000000000 0000000000000000 0 0 0 > [ 1] foo PROGBITS 0000000000401000 00001000 > 0000000000000001 0000000000000000 AX 0 0 1 > [ 2] .eh_frame_hdr PROGBITS 0000000000402000 00002000 > 0000000000000014 0000000000000000 A 0 0 4 > [ 3] .eh_frame PROGBITS 0000000000402018 00002018 > 000000000000002c 0000000000000000 A 0 0 8 > [ 4] .symtab SYMTAB 0000000000000000 00002048 > 00000000000000d8 0000000000000018 5 5 8 > [ 5] .strtab STRTAB 0000000000000000 00002120 > 000000000000002c 0000000000000000 0 0 1 > [ 6] .shstrtab STRTAB 0000000000000000 0000214c > 0000000000000037 0000000000000000 0 0 1 > Key to Flags: > W (write), A (alloc), X (execute), M (merge), S (strings), I (info), > L (link order), O (extra OS processing required), G (group), T (TLS), > C (compressed), x (unknown), o (OS specific), E (exclude), > l (large), p (processor specific) > > There are no section groups in this file. 
> > Program Headers: > Type Offset VirtAddr PhysAddr > FileSiz MemSiz Flags Align > LOAD 0x0000000000001000 0x0000000000401000 0x0000000000401000 > 0x0000000000000001 0x0000000000000001 R E 0x1000 > LOAD 0x0000000000002000 0x0000000000402000 0x0000000000402000 > 0x0000000000000044 0x0000000000000044 R 0x1000 > GNU_EH_FRAME 0x0000000000002000 0x0000000000402000 0x0000000000402000 > 0x0000000000000014 0x0000000000000014 R 0x4 > > Section to Segment mapping: > Segment Sections... > 00 foo > 01 .eh_frame_hdr .eh_frame > 02 .eh_frame_hdr > > There is no dynamic section in this file. > > There are no relocations in this file. > > The decoding of unwind sections for machine type Advanced Micro Devices X86-64 is not currently supported. > > Symbol table '.symtab' contains 9 entries: > Num: Value Size Type Bind Vis Ndx Name > 0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND > 1: 0000000000401000 0 SECTION LOCAL DEFAULT 1 > 2: 0000000000402000 0 SECTION LOCAL DEFAULT 2 > 3: 0000000000402018 0 SECTION LOCAL DEFAULT 3 > 4: 0000000000402000 0 NOTYPE LOCAL DEFAULT 2 __GNU_EH_FRAME_HDR > 5: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND _start > 6: 0000000000404000 0 NOTYPE GLOBAL DEFAULT 3 __bss_start > 7: 0000000000404000 0 NOTYPE GLOBAL DEFAULT 3 _edata > 8: 0000000000404000 0 NOTYPE GLOBAL DEFAULT 3 _end > > No version information found in this file. 
> Contents of the .eh_frame section: > > > 00000000 0000000000000014 00000000 CIE > Version: 1 > Augmentation: "zR" > Code alignment factor: 1 > Data alignment factor: -8 > Return address column: 16 > Augmentation data: 1b > DW_CFA_def_cfa: r7 (rsp) ofs 8 > DW_CFA_offset: r16 (rip) at cfa-8 > DW_CFA_nop > DW_CFA_nop > > 00000018 0000000000000010 0000001c FDE cie=00000000 pc=0000000000401000..0000000000401001 > DW_CFA_nop > DW_CFA_nop > DW_CFA_nop > > > ``` > > I also see ".eh_frame" dumped when there is no ".eh_frame_hdr": > > > ``` > umb at ubuntu:~/tests/81$ readelf -a wo_hdr -wf > ELF Header: > Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00 > Class: ELF64 > Data: 2's complement, little endian > Version: 1 (current) > OS/ABI: UNIX - System V > ABI Version: 0 > Type: EXEC (Executable file) > Machine: Advanced Micro Devices X86-64 > Version: 0x1 > Entry point address: 0x401000 > Start of program headers: 64 (bytes into file) > Start of section headers: 8480 (bytes into file) > Flags: 0x0 > Size of this header: 64 (bytes) > Size of program headers: 56 (bytes) > Number of program headers: 2 > Size of section headers: 64 (bytes) > Number of section headers: 6 > Section header string table index: 5 > > Section Headers: > [Nr] Name Type Address Offset > Size EntSize Flags Link Info Align > [ 0] NULL 0000000000000000 00000000 > 0000000000000000 0000000000000000 0 0 0 > [ 1] foo PROGBITS 0000000000401000 00001000 > 0000000000000001 0000000000000000 AX 0 0 1 > [ 2] .eh_frame PROGBITS 0000000000402000 00002000 > 000000000000002c 0000000000000000 A 0 0 8 > [ 3] .symtab SYMTAB 0000000000000000 00002030 > 00000000000000a8 0000000000000018 4 3 8 > [ 4] .strtab STRTAB 0000000000000000 000020d8 > 0000000000000019 0000000000000000 0 0 1 > [ 5] .shstrtab STRTAB 0000000000000000 000020f1 > 0000000000000029 0000000000000000 0 0 1 > Key to Flags: > W (write), A (alloc), X (execute), M (merge), S (strings), I (info), > L (link order), O (extra OS processing required), G 
(group), T (TLS), > C (compressed), x (unknown), o (OS specific), E (exclude), > l (large), p (processor specific) > > There are no section groups in this file. > > Program Headers: > Type Offset VirtAddr PhysAddr > FileSiz MemSiz Flags Align > LOAD 0x0000000000001000 0x0000000000401000 0x0000000000401000 > 0x0000000000000001 0x0000000000000001 R E 0x1000 > LOAD 0x0000000000002000 0x0000000000402000 0x0000000000402000 > 0x000000000000002c 0x000000000000002c R 0x1000 > > Section to Segment mapping: > Segment Sections... > 00 foo > 01 .eh_frame > > There is no dynamic section in this file. > > There are no relocations in this file. > > The decoding of unwind sections for machine type Advanced Micro Devices X86-64 is not currently supported. > > Symbol table '.symtab' contains 7 entries: > Num: Value Size Type Bind Vis Ndx Name > 0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND > 1: 0000000000401000 0 SECTION LOCAL DEFAULT 1 > 2: 0000000000402000 0 SECTION LOCAL DEFAULT 2 > 3: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND _start > 4: 0000000000404000 0 NOTYPE GLOBAL DEFAULT 2 __bss_start > 5: 0000000000404000 0 NOTYPE GLOBAL DEFAULT 2 _edata > 6: 0000000000404000 0 NOTYPE GLOBAL DEFAULT 2 _end > > No version information found in this file. > Contents of the .eh_frame section: > > > 00000000 0000000000000014 00000000 CIE > Version: 1 > Augmentation: "zR" > Code alignment factor: 1 > Data alignment factor: -8 > Return address column: 16 > Augmentation data: 1b > DW_CFA_def_cfa: r7 (rsp) ofs 8 > DW_CFA_offset: r16 (rip) at cfa-8 > DW_CFA_nop > DW_CFA_nop > > 00000018 0000000000000010 0000001c FDE cie=00000000 pc=0000000000401000..0000000000401001 > DW_CFA_nop > DW_CFA_nop > DW_CFA_nop > > } > ``` > > Since we have such differences in the behavior, should we just test the current behavior atm? > I.e. before this diff I tested "EH_FRAME Header [", now I also added a check for ".eh_frame section at offset...". > Both of them are dumped at the top level currently. 
Seems reasonable to test the fact we do that (with just `-all`) > and the order, probably? > (I am ok to change it in any way actually, but just wanted to clarify this before doing anything with it.) I'm confused too. Your output above even appears to be conflicted. Note that it mentions "The decoding of unwind sections for machine type Advanced Micro Devices X86-64 is not currently supported." Anyway, probably best to file a bug to record the issue and then do as you're doing here (i.e. test the current behaviour). CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68462/new/ https://reviews.llvm.org/D68462 From llvm-commits at lists.llvm.org Mon Oct 7 07:21:11 2019 From: llvm-commits at lists.llvm.org (Jeroen Dobbelaere via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 14:21:11 +0000 (UTC) Subject: [PATCH] D68515: [PATCH 31/38] [noalias] SROA/PromoteMemoryToRegister: Learn how to handle noalias intrinsics In-Reply-To: References: Message-ID: <9f9803aaae257385662bcb98ca82fbac@localhost.localdomain> jeroen.dobbelaere updated this revision to Diff 223594. jeroen.dobbelaere added a comment. Adapt test to i64 p.objId. (was i32) CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68515/new/ https://reviews.llvm.org/D68515 Files: llvm/lib/Transforms/Scalar/SROA.cpp llvm/lib/Transforms/Utils/PromoteMemoryToRegister.cpp llvm/test/Transforms/SROA/noalias.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D68515.223594.patch Type: text/x-patch Size: 55930 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 07:21:56 2019 From: llvm-commits at lists.llvm.org (Dave Green via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 14:21:56 +0000 (UTC) Subject: [PATCH] D67158: [ARM] Add IR intrinsics for a sample of MVE instructions. In-Reply-To: References: Message-ID: <1485a9d36346d305e2d54f5737e8f46e@localhost.localdomain> dmgreen added a comment. 
This is a bit large to review in a single patch, and I don't think all the parts are necessarily interrelated. Mind pulling a few logically separable parts out into separate patches, to make what's left simpler? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67158/new/ https://reviews.llvm.org/D67158 From llvm-commits at lists.llvm.org Mon Oct 7 07:24:21 2019 From: llvm-commits at lists.llvm.org (Dave Green via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 14:24:21 +0000 (UTC) Subject: [PATCH] D68566: [ARM] VQADD instructions In-Reply-To: References: Message-ID: <47f601d82eacb7f74c44f39810c38bda@localhost.localdomain> dmgreen marked an inline comment as done. dmgreen added a comment. The usat version can be idiom recognised from C: https://godbolt.org/z/9knBnP ssat is more difficult like I mentioned, but I don't think it should be impossible to do the same thing there too. ================ Comment at: llvm/lib/Target/ARM/ARMInstrMVE.td:1624 + foreach instr = [MVE_VQADDu8, MVE_VQADDu16, MVE_VQADDu32] in + foreach VT = [instr.VT] in + def : Pat<(VT (uaddsat (VT MQPR:$Qm), (VT MQPR:$Qn))), ---------------- SjoerdMeijer wrote: > And looking at this, I almost start to like tablegen. :-) This is from D67158. The idea is that an intrinsic can be easily added in here, same as for vadds in that patch. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68566/new/ https://reviews.llvm.org/D68566 From llvm-commits at lists.llvm.org Mon Oct 7 07:24:23 2019 From: llvm-commits at lists.llvm.org (Joerg Sonnenberger via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 14:24:23 +0000 (UTC) Subject: [PATCH] D68570: Unify the two CRC implementations In-Reply-To: References: Message-ID: joerg added a comment. Why go back to the large tables for crc32? Just because JamCRC had that bug doesn't mean it should persist. 
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68570/new/ https://reviews.llvm.org/D68570 From llvm-commits at lists.llvm.org Mon Oct 7 07:25:52 2019 From: llvm-commits at lists.llvm.org (Jeroen Dobbelaere via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 14:25:52 +0000 (UTC) Subject: [PATCH] D68522: [PATCH 37/38] [noalias] Inlining: enable --use-noalias-intrinsic-during-inlining by default In-Reply-To: References: Message-ID: jeroen.dobbelaere updated this revision to Diff 223595. jeroen.dobbelaere added a comment. Adapt to i64 p.objId. (was i32) CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68522/new/ https://reviews.llvm.org/D68522 Files: llvm/lib/Transforms/Utils/InlineFunction.cpp llvm/test/Transforms/Coroutines/ArgAddr.ll llvm/test/Transforms/Coroutines/coro-retcon-resume-values.ll llvm/test/Transforms/Coroutines/coro-retcon-value.ll llvm/test/Transforms/Coroutines/coro-retcon.ll llvm/test/Transforms/Coroutines/ex3.ll llvm/test/Transforms/Inline/launder.invariant.group.ll llvm/test/Transforms/Inline/parallel-loop-md-merge.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D68522.223595.patch Type: text/x-patch Size: 12150 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 07:28:02 2019 From: llvm-commits at lists.llvm.org (Khem Raj via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 14:28:02 +0000 (UTC) Subject: [PATCH] D57529: Add .dword direcrive support for aarch64 mc In-Reply-To: References: Message-ID: <2d23496703255e9b8aa65d73c6388053@localhost.localdomain> raj.khem added a comment. In D57529#1697029 , @ostannard wrote: > A patch adding the same functionality was added back in May: D61719 , rL360381 . > > Reviewers tend to assume that patch authors have commit access, if you don't then just say so when the patch is accepted and the reviewer will commit it for you. thanks, as long as the issues are fixed this is fine. 
Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D57529/new/

https://reviews.llvm.org/D57529

From llvm-commits at lists.llvm.org Mon Oct 7 07:28:10 2019
From: llvm-commits at lists.llvm.org (Nemanja Ivanovic via Phabricator via llvm-commits)
Date: Mon, 07 Oct 2019 14:28:10 +0000 (UTC)
Subject: [PATCH] D68576: [PowerPC] Fix VSX clobbers of CSR registers
Message-ID:

nemanjai created this revision.
nemanjai added reviewers: hfinkel, PowerPC.
Herald added subscribers: shchenz, jsji, MaskRay, kbarton.
Herald added a project: LLVM.

If an inline asm statement clobbers a VSX register that overlaps with a callee-saved Altivec register or FPR, we will not record the clobber and will therefore violate the ABI. This is clearly a bug so this patch fixes it.

Repository:
  rL LLVM

https://reviews.llvm.org/D68576

Files:
  lib/Target/PowerPC/PPCISelLowering.cpp
  test/CodeGen/PowerPC/inline-asm-vsx-clobbers.ll

Index: test/CodeGen/PowerPC/inline-asm-vsx-clobbers.ll
===================================================================
--- test/CodeGen/PowerPC/inline-asm-vsx-clobbers.ll
+++ test/CodeGen/PowerPC/inline-asm-vsx-clobbers.ll
@@ -0,0 +1,32 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mcpu=pwr9 -mtriple=powerpc64le-unknown-unknown \
+; RUN: -enable-ppc-quad-precision -ppc-vsr-nums-as-vr \
+; RUN: -ppc-asm-full-reg-names < %s | FileCheck %s
+
+define dso_local void @clobberVR(<4 x i32> %a, <4 x i32> %b) local_unnamed_addr {
+; CHECK-LABEL: clobberVR:
+; CHECK: # %bb.0: # %entry
+; CHECK-NEXT: stxv v22, -160(r1) # 16-byte Folded Spill
+; CHECK-NEXT: #APP
+; CHECK-NEXT: nop
+; CHECK-NEXT: #NO_APP
+; CHECK-NEXT: lxv v22, -160(r1) # 16-byte Folded Reload
+; CHECK-NEXT: blr
+entry:
+  tail call void asm sideeffect "nop", "~{vs54}"()
+  ret void
+}
+
+define dso_local void @clobberFPR(<4 x i32> %a, <4 x i32> %b) local_unnamed_addr {
+; CHECK-LABEL: clobberFPR:
+; CHECK: # %bb.0: # %entry
+; CHECK-NEXT: stfd f14, -144(r1) # 8-byte Folded Spill
+; CHECK-NEXT: #APP
+; CHECK-NEXT: nop
+; CHECK-NEXT: #NO_APP
+; CHECK-NEXT: lfd f14, -144(r1) # 8-byte Folded Reload
+; CHECK-NEXT: blr
+entry:
+  tail call void asm sideeffect "nop", "~{vs14}"()
+  ret void
+}
Index: lib/Target/PowerPC/PPCISelLowering.cpp
===================================================================
--- lib/Target/PowerPC/PPCISelLowering.cpp
+++ lib/Target/PowerPC/PPCISelLowering.cpp
@@ -14309,6 +14309,17 @@
       return std::make_pair(0U, &PPC::VSFRCRegClass);
   }
 
+  // If we name a VSX register, we can't defer to the base class because it
+  // will not recognize the correct register (their names will be VSL{0-31}
+  // and V{0-31} so they won't match). So we match them here.
+  if (Constraint.size() > 3 && Constraint[1] == 'v' && Constraint[2] == 's') {
+    int VSNum = atoi(Constraint.data() + 3);
+    assert(VSNum >= 0 && VSNum <= 63 &&
+           "Attempted to access a vsr out of range");
+    if (VSNum < 32)
+      return std::make_pair(PPC::VSL0 + VSNum, &PPC::VSRCRegClass);
+    return std::make_pair(PPC::V0 + VSNum - 32, &PPC::VSRCRegClass);
+  }
   std::pair R =
       TargetLowering::getRegForInlineAsmConstraint(TRI, Constraint, VT);
 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68576.223596.patch
Type: text/x-patch
Size: 2389 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Mon Oct 7 07:28:13 2019
From: llvm-commits at lists.llvm.org (George Rimar via Phabricator via llvm-commits)
Date: Mon, 07 Oct 2019 14:28:13 +0000 (UTC)
Subject: [PATCH] D68462: [llvm-readobj/llvm-readelf] - Add checks for GNU-style to "all.test" test case.
In-Reply-To:
References:
Message-ID: <1e47e09e006d5c003a8e4e5f932ace21@localhost.localdomain>

grimar marked an inline comment as done.
grimar added inline comments.


================
Comment at: test/tools/llvm-readobj/all.test:86-90
+  - Name: .eh_frame_hdr
+    Type: SHT_PROGBITS
+## An arbitrary linker-generated valid content.
+ Content: 011b033b140000000100000000f0ffff30000000 + - Name: .eh_frame ---------------- jhenderson wrote: > grimar wrote: > > jhenderson wrote: > > > grimar wrote: > > > > jhenderson wrote: > > > > > Same comments as earlier. Can these be empty? > > > > No. We need to have something valid here, otherwise any > > > > error triggered will fail the dumping. > > > > (e.g. https://github.com/llvm-mirror/llvm/blob/master/tools/llvm-readobj/DwarfCFIEHPrinter.h#L127). > > > You don't actually need the .eh_frame_hdr at all, looking at the code, I think, just the .eh_frame section. That said, this appears to be different from GNU readelf. > > > > > > For reference, GNU readelf prints "There are no unwind sections in this file" if there are no .eh_frame_hdr sections even if there is a .eh_frame section (not looked to see what happens if there is a PT_GNU_EH_FRAME program header). > > I am a bit confused. > > > > Imagine we have the code and invocations below: > > > > "1.s": > > ``` > > .section foo,"ax", at progbits > > .cfi_startproc > > nop > > .cfi_endproc > > ``` > > > > ``` > > as 1.s -o 1.o > > ld.bfd 1.o -o with_hdr --eh-frame-hdr > > ld.bfd 1.o -o wo_hdr > > ``` > > > > For both of them I do not see neither `.eh_frame_hdr` nor `.eh_frame` section dumped with `-a`. > > I see ".eh_frame" dumped when I add `-wf` though (but still no `.eh_frame_hdr`). > > e.g.: > > > > > > ``` > > umb at ubuntu:~/tests/81$ readelf -v > > GNU readelf (GNU Binutils for Ubuntu) 2.31.1 > > Copyright (C) 2018 Free Software Foundation, Inc. 
> > > > umb at ubuntu:~/tests/81$ readelf -a with_hdr > > ELF Header: > > Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00 > > Class: ELF64 > > Data: 2's complement, little endian > > Version: 1 (current) > > OS/ABI: UNIX - System V > > ABI Version: 0 > > Type: EXEC (Executable file) > > Machine: Advanced Micro Devices X86-64 > > Version: 0x1 > > Entry point address: 0x401000 > > Start of program headers: 64 (bytes into file) > > Start of section headers: 8584 (bytes into file) > > Flags: 0x0 > > Size of this header: 64 (bytes) > > Size of program headers: 56 (bytes) > > Number of program headers: 3 > > Size of section headers: 64 (bytes) > > Number of section headers: 7 > > Section header string table index: 6 > > > > Section Headers: > > [Nr] Name Type Address Offset > > Size EntSize Flags Link Info Align > > [ 0] NULL 0000000000000000 00000000 > > 0000000000000000 0000000000000000 0 0 0 > > [ 1] foo PROGBITS 0000000000401000 00001000 > > 0000000000000001 0000000000000000 AX 0 0 1 > > [ 2] .eh_frame_hdr PROGBITS 0000000000402000 00002000 > > 0000000000000014 0000000000000000 A 0 0 4 > > [ 3] .eh_frame PROGBITS 0000000000402018 00002018 > > 000000000000002c 0000000000000000 A 0 0 8 > > [ 4] .symtab SYMTAB 0000000000000000 00002048 > > 00000000000000d8 0000000000000018 5 5 8 > > [ 5] .strtab STRTAB 0000000000000000 00002120 > > 000000000000002c 0000000000000000 0 0 1 > > [ 6] .shstrtab STRTAB 0000000000000000 0000214c > > 0000000000000037 0000000000000000 0 0 1 > > Key to Flags: > > W (write), A (alloc), X (execute), M (merge), S (strings), I (info), > > L (link order), O (extra OS processing required), G (group), T (TLS), > > C (compressed), x (unknown), o (OS specific), E (exclude), > > l (large), p (processor specific) > > > > There are no section groups in this file. 
> > > > Program Headers: > > Type Offset VirtAddr PhysAddr > > FileSiz MemSiz Flags Align > > LOAD 0x0000000000001000 0x0000000000401000 0x0000000000401000 > > 0x0000000000000001 0x0000000000000001 R E 0x1000 > > LOAD 0x0000000000002000 0x0000000000402000 0x0000000000402000 > > 0x0000000000000044 0x0000000000000044 R 0x1000 > > GNU_EH_FRAME 0x0000000000002000 0x0000000000402000 0x0000000000402000 > > 0x0000000000000014 0x0000000000000014 R 0x4 > > > > Section to Segment mapping: > > Segment Sections... > > 00 foo > > 01 .eh_frame_hdr .eh_frame > > 02 .eh_frame_hdr > > > > There is no dynamic section in this file. > > > > There are no relocations in this file. > > > > The decoding of unwind sections for machine type Advanced Micro Devices X86-64 is not currently supported. > > > > Symbol table '.symtab' contains 9 entries: > > Num: Value Size Type Bind Vis Ndx Name > > 0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND > > 1: 0000000000401000 0 SECTION LOCAL DEFAULT 1 > > 2: 0000000000402000 0 SECTION LOCAL DEFAULT 2 > > 3: 0000000000402018 0 SECTION LOCAL DEFAULT 3 > > 4: 0000000000402000 0 NOTYPE LOCAL DEFAULT 2 __GNU_EH_FRAME_HDR > > 5: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND _start > > 6: 0000000000404000 0 NOTYPE GLOBAL DEFAULT 3 __bss_start > > 7: 0000000000404000 0 NOTYPE GLOBAL DEFAULT 3 _edata > > 8: 0000000000404000 0 NOTYPE GLOBAL DEFAULT 3 _end > > > > No version information found in this file. 
> > > > umb at ubuntu:~/tests/81$ readelf -a with_hdr -wf > > ELF Header: > > Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00 > > Class: ELF64 > > Data: 2's complement, little endian > > Version: 1 (current) > > OS/ABI: UNIX - System V > > ABI Version: 0 > > Type: EXEC (Executable file) > > Machine: Advanced Micro Devices X86-64 > > Version: 0x1 > > Entry point address: 0x401000 > > Start of program headers: 64 (bytes into file) > > Start of section headers: 8584 (bytes into file) > > Flags: 0x0 > > Size of this header: 64 (bytes) > > Size of program headers: 56 (bytes) > > Number of program headers: 3 > > Size of section headers: 64 (bytes) > > Number of section headers: 7 > > Section header string table index: 6 > > > > Section Headers: > > [Nr] Name Type Address Offset > > Size EntSize Flags Link Info Align > > [ 0] NULL 0000000000000000 00000000 > > 0000000000000000 0000000000000000 0 0 0 > > [ 1] foo PROGBITS 0000000000401000 00001000 > > 0000000000000001 0000000000000000 AX 0 0 1 > > [ 2] .eh_frame_hdr PROGBITS 0000000000402000 00002000 > > 0000000000000014 0000000000000000 A 0 0 4 > > [ 3] .eh_frame PROGBITS 0000000000402018 00002018 > > 000000000000002c 0000000000000000 A 0 0 8 > > [ 4] .symtab SYMTAB 0000000000000000 00002048 > > 00000000000000d8 0000000000000018 5 5 8 > > [ 5] .strtab STRTAB 0000000000000000 00002120 > > 000000000000002c 0000000000000000 0 0 1 > > [ 6] .shstrtab STRTAB 0000000000000000 0000214c > > 0000000000000037 0000000000000000 0 0 1 > > Key to Flags: > > W (write), A (alloc), X (execute), M (merge), S (strings), I (info), > > L (link order), O (extra OS processing required), G (group), T (TLS), > > C (compressed), x (unknown), o (OS specific), E (exclude), > > l (large), p (processor specific) > > > > There are no section groups in this file. 
> > > > Program Headers: > > Type Offset VirtAddr PhysAddr > > FileSiz MemSiz Flags Align > > LOAD 0x0000000000001000 0x0000000000401000 0x0000000000401000 > > 0x0000000000000001 0x0000000000000001 R E 0x1000 > > LOAD 0x0000000000002000 0x0000000000402000 0x0000000000402000 > > 0x0000000000000044 0x0000000000000044 R 0x1000 > > GNU_EH_FRAME 0x0000000000002000 0x0000000000402000 0x0000000000402000 > > 0x0000000000000014 0x0000000000000014 R 0x4 > > > > Section to Segment mapping: > > Segment Sections... > > 00 foo > > 01 .eh_frame_hdr .eh_frame > > 02 .eh_frame_hdr > > > > There is no dynamic section in this file. > > > > There are no relocations in this file. > > > > The decoding of unwind sections for machine type Advanced Micro Devices X86-64 is not currently supported. > > > > Symbol table '.symtab' contains 9 entries: > > Num: Value Size Type Bind Vis Ndx Name > > 0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND > > 1: 0000000000401000 0 SECTION LOCAL DEFAULT 1 > > 2: 0000000000402000 0 SECTION LOCAL DEFAULT 2 > > 3: 0000000000402018 0 SECTION LOCAL DEFAULT 3 > > 4: 0000000000402000 0 NOTYPE LOCAL DEFAULT 2 __GNU_EH_FRAME_HDR > > 5: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND _start > > 6: 0000000000404000 0 NOTYPE GLOBAL DEFAULT 3 __bss_start > > 7: 0000000000404000 0 NOTYPE GLOBAL DEFAULT 3 _edata > > 8: 0000000000404000 0 NOTYPE GLOBAL DEFAULT 3 _end > > > > No version information found in this file. 
> > Contents of the .eh_frame section: > > > > > > 00000000 0000000000000014 00000000 CIE > > Version: 1 > > Augmentation: "zR" > > Code alignment factor: 1 > > Data alignment factor: -8 > > Return address column: 16 > > Augmentation data: 1b > > DW_CFA_def_cfa: r7 (rsp) ofs 8 > > DW_CFA_offset: r16 (rip) at cfa-8 > > DW_CFA_nop > > DW_CFA_nop > > > > 00000018 0000000000000010 0000001c FDE cie=00000000 pc=0000000000401000..0000000000401001 > > DW_CFA_nop > > DW_CFA_nop > > DW_CFA_nop > > > > > > ``` > > > > I also see ".eh_frame" dumped when there is no ".eh_frame_hdr": > > > > > > ``` > > umb at ubuntu:~/tests/81$ readelf -a wo_hdr -wf > > ELF Header: > > Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00 > > Class: ELF64 > > Data: 2's complement, little endian > > Version: 1 (current) > > OS/ABI: UNIX - System V > > ABI Version: 0 > > Type: EXEC (Executable file) > > Machine: Advanced Micro Devices X86-64 > > Version: 0x1 > > Entry point address: 0x401000 > > Start of program headers: 64 (bytes into file) > > Start of section headers: 8480 (bytes into file) > > Flags: 0x0 > > Size of this header: 64 (bytes) > > Size of program headers: 56 (bytes) > > Number of program headers: 2 > > Size of section headers: 64 (bytes) > > Number of section headers: 6 > > Section header string table index: 5 > > > > Section Headers: > > [Nr] Name Type Address Offset > > Size EntSize Flags Link Info Align > > [ 0] NULL 0000000000000000 00000000 > > 0000000000000000 0000000000000000 0 0 0 > > [ 1] foo PROGBITS 0000000000401000 00001000 > > 0000000000000001 0000000000000000 AX 0 0 1 > > [ 2] .eh_frame PROGBITS 0000000000402000 00002000 > > 000000000000002c 0000000000000000 A 0 0 8 > > [ 3] .symtab SYMTAB 0000000000000000 00002030 > > 00000000000000a8 0000000000000018 4 3 8 > > [ 4] .strtab STRTAB 0000000000000000 000020d8 > > 0000000000000019 0000000000000000 0 0 1 > > [ 5] .shstrtab STRTAB 0000000000000000 000020f1 > > 0000000000000029 0000000000000000 0 0 1 > > Key to Flags: > 
> W (write), A (alloc), X (execute), M (merge), S (strings), I (info), > > L (link order), O (extra OS processing required), G (group), T (TLS), > > C (compressed), x (unknown), o (OS specific), E (exclude), > > l (large), p (processor specific) > > > > There are no section groups in this file. > > > > Program Headers: > > Type Offset VirtAddr PhysAddr > > FileSiz MemSiz Flags Align > > LOAD 0x0000000000001000 0x0000000000401000 0x0000000000401000 > > 0x0000000000000001 0x0000000000000001 R E 0x1000 > > LOAD 0x0000000000002000 0x0000000000402000 0x0000000000402000 > > 0x000000000000002c 0x000000000000002c R 0x1000 > > > > Section to Segment mapping: > > Segment Sections... > > 00 foo > > 01 .eh_frame > > > > There is no dynamic section in this file. > > > > There are no relocations in this file. > > > > The decoding of unwind sections for machine type Advanced Micro Devices X86-64 is not currently supported. > > > > Symbol table '.symtab' contains 7 entries: > > Num: Value Size Type Bind Vis Ndx Name > > 0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND > > 1: 0000000000401000 0 SECTION LOCAL DEFAULT 1 > > 2: 0000000000402000 0 SECTION LOCAL DEFAULT 2 > > 3: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND _start > > 4: 0000000000404000 0 NOTYPE GLOBAL DEFAULT 2 __bss_start > > 5: 0000000000404000 0 NOTYPE GLOBAL DEFAULT 2 _edata > > 6: 0000000000404000 0 NOTYPE GLOBAL DEFAULT 2 _end > > > > No version information found in this file. 
> > Contents of the .eh_frame section: > > > > > > 00000000 0000000000000014 00000000 CIE > > Version: 1 > > Augmentation: "zR" > > Code alignment factor: 1 > > Data alignment factor: -8 > > Return address column: 16 > > Augmentation data: 1b > > DW_CFA_def_cfa: r7 (rsp) ofs 8 > > DW_CFA_offset: r16 (rip) at cfa-8 > > DW_CFA_nop > > DW_CFA_nop > > > > 00000018 0000000000000010 0000001c FDE cie=00000000 pc=0000000000401000..0000000000401001 > > DW_CFA_nop > > DW_CFA_nop > > DW_CFA_nop > > > > } > > ``` > > > > Since we have such differences in the behavior, should we just test the current behavior atm? > > I.e. before this diff I tested "EH_FRAME Header [", now I also added a check for ".eh_frame section at offset...". > > Both of them are dumped at the top level currently. Seems reasonable to test the fact we do that (with just `-all`) > > and the order, probably? > > (I am ok to change it in any way actually, but just wanted to clarify this before doing anything with it.) > I'm confused too. Your output above even appears to be conflicted. Note that it mentions "The decoding of unwind sections for machine type Advanced Micro Devices X86-64 is not currently supported." > > Anyway, probably best to file a bug to record the issue and then do as you're doing here (i.e. test the current behaviour). > Note that it mentions "The decoding of unwind sections for machine type Advanced Micro Devices X86-64 is not currently supported." Yes, that looks strange. I'll build the latest binutils from source tomorrow and check what it does, then probably file a bug or prepare a patch. Thanks for the review! CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68462/new/ https://reviews.llvm.org/D68462 From llvm-commits at lists.llvm.org Mon Oct 7 07:34:47 2019 From: llvm-commits at lists.llvm.org (Cameron McInally via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 14:34:47 +0000 (UTC) Subject: [PATCH] D67749: [AArch64] Stackframe accesses to SVE objects. 
In-Reply-To:
References:
Message-ID: <3fd153e59ad1c74b3f59bed0671bf5ad@localhost.localdomain>

cameron.mcinally added inline comments.


================
Comment at: lib/Target/AArch64/AArch64InstrInfo.cpp:3366
+}
+
 int llvm::isAArch64FrameOffsetLegal(const MachineInstr &MI,
----------------
I'm not an LLVM coding standards expert, but does this need an llvm_unreachable()? I think it does...


================
Comment at: lib/Target/AArch64/AArch64InstrInfo.cpp:3453
+    SOffset = StackOffset(Offset, MVT::i8) +
+              StackOffset(SOffset.getScalableBytes(), MVT::nxv1i8);
   return AArch64FrameOffsetCanUpdate |
----------------
Would you shed some light on what this change is doing? `IsMulVL` indicates there are scalable objects on the stack, right? What is the reason for the behavior change of the legacy code when `!IsMulVL`, i.e. the addition of `StackOffset(SOffset.getScalableBytes(), MVT::nxv1i8)` in the else block?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D67749/new/

https://reviews.llvm.org/D67749

From llvm-commits at lists.llvm.org Mon Oct 7 07:37:20 2019
From: llvm-commits at lists.llvm.org (David Greene via llvm-commits)
Date: Mon, 07 Oct 2019 14:37:20 -0000
Subject: [llvm] r373912 - Allow update_test_checks.py to not scrub names.
Message-ID: <20191007143720.6FFFC8C957@lists.llvm.org>

Author: greened
Date: Mon Oct 7 07:37:20 2019
New Revision: 373912

URL: http://llvm.org/viewvc/llvm-project?rev=373912&view=rev
Log:
Allow update_test_checks.py to not scrub names.

Add a --preserve-names option to tell the script not to replace IR names. Sometimes tests want those names. For example if a test is looking for a modification to an existing instruction we'll want to make the names.
Differential Revision: https://reviews.llvm.org/D68081

Modified:
    llvm/trunk/utils/UpdateTestChecks/common.py
    llvm/trunk/utils/update_test_checks.py

Modified: llvm/trunk/utils/UpdateTestChecks/common.py
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/UpdateTestChecks/common.py?rev=373912&r1=373911&r2=373912&view=diff
==============================================================================
--- llvm/trunk/utils/UpdateTestChecks/common.py (original)
+++ llvm/trunk/utils/UpdateTestChecks/common.py Mon Oct 7 07:37:20 2019
@@ -267,10 +267,12 @@ def add_checks(output_lines, comment_mar
         output_lines.append(comment_marker)
       break
 
-def add_ir_checks(output_lines, comment_marker, prefix_list, func_dict, func_name):
+def add_ir_checks(output_lines, comment_marker, prefix_list, func_dict,
+                  func_name, preserve_names):
   # Label format is based on IR string.
   check_label_format = '{} %s-LABEL: @%s('.format(comment_marker)
-  add_checks(output_lines, comment_marker, prefix_list, func_dict, func_name, check_label_format, False, False)
+  add_checks(output_lines, comment_marker, prefix_list, func_dict, func_name,
+             check_label_format, False, preserve_names)
 
 def add_analyze_checks(output_lines, comment_marker, prefix_list, func_dict, func_name):
   check_label_format = '{} %s-LABEL: \'%s\''.format(comment_marker)

Modified: llvm/trunk/utils/update_test_checks.py
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/update_test_checks.py?rev=373912&r1=373911&r2=373912&view=diff
==============================================================================
--- llvm/trunk/utils/update_test_checks.py (original)
+++ llvm/trunk/utils/update_test_checks.py Mon Oct 7 07:37:20 2019
@@ -64,6 +64,8 @@ def main():
                       '--function', help='The function in the test file to update')
   parser.add_argument('-u', '--update-only', action='store_true',
                       help='Only update test if it was already autogened')
+  parser.add_argument('-p', '--preserve-names', action='store_true',
+                      help='Do not scrub IR names')
   parser.add_argument('tests', nargs='+')
   args = parser.parse_args()
 
@@ -174,7 +176,8 @@ def main():
             continue
 
           # Print out the various check lines here.
-          common.add_ir_checks(output_lines, ';', prefix_list, func_dict, func_name)
+          common.add_ir_checks(output_lines, ';', prefix_list, func_dict,
+                               func_name, args.preserve_names)
           is_in_function_start = False
 
         if is_in_function:

From llvm-commits at lists.llvm.org Mon Oct 7 07:37:11 2019
From: llvm-commits at lists.llvm.org (James Y Knight via Phabricator via llvm-commits)
Date: Mon, 07 Oct 2019 14:37:11 +0000 (UTC)
Subject: [PATCH] D28213: [Frontend] Correct values of ATOMIC_*_LOCK_FREE to match builtin
In-Reply-To:
References:
Message-ID: <8081241eb77c3e6bc5cbfae844c27d65@localhost.localdomain>

jyknight reopened this revision.
jyknight added a comment.
This revision is now accepted and ready to land.

The close was due to phabricator problem, reopening.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D28213/new/

https://reviews.llvm.org/D28213

From llvm-commits at lists.llvm.org Mon Oct 7 07:37:44 2019
From: llvm-commits at lists.llvm.org (Thomas Preud'homme via Phabricator via llvm-commits)
Date: Mon, 07 Oct 2019 14:37:44 +0000 (UTC)
Subject: [PATCH] D67882: [LNT] Python 3 support: make leaf classes inherit from object
In-Reply-To:
References:
Message-ID: <60ab930daca4cce11c19c9feeeacc79f@localhost.localdomain>

thopre marked 2 inline comments as done.
thopre added a comment.

In D67882#1697371, @hubert.reinterpretcast wrote:

> Perhaps others are more well-versed in this than I am, but I think that having a link in the commit message and using the terminology used by the documentation ("new-style" and "classic") would be useful here: https://docs.python.org/2/reference/datamodel.html#newstyle. Also, I am not sure that switching these to be new-style classes in Python 2 is necessary.
I believe the commit message should give additional rationale, e.g., using new-style classes helps make the Python 2 and Python 3 behaviour of the code more similar.

Good point about terminology. I'm not sure I understand your point about it not being necessary in Python 2. None of the changes I'm making in this Python 3 support series are necessary for Python 2; they are here to get a codebase that works across both versions of Python.

In D67882#1697371, @hubert.reinterpretcast wrote:

> Perhaps others are more well-versed in this than I am, but I think that having a link in the commit message and using the terminology used by the documentation ("new-style" and "classic") would be useful here: https://docs.python.org/2/reference/datamodel.html#newstyle. Also, I am not sure that switching these to be new-style classes in Python 2 is necessary. I believe the commit message should give additional rationale, e.g., using new-style classes helps make the Python 2 and Python 3 behaviour of the code more similar.

I hadn't realized that explicit inheritance from object is not a requirement of Python 3. So, as you say, it only serves to make the behaviour more similar between Python 2 and 3, but given the timeframe and the low number of commits on LNT I don't think that is necessary. I think we should make LNT Python 3 only as soon as it can work in that mode, so the less compatibility code we add the better. Do you agree with that approach?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D67882/new/

https://reviews.llvm.org/D67882

From llvm-commits at lists.llvm.org Mon Oct 7 07:38:45 2019
From: llvm-commits at lists.llvm.org (Sam Parker via Phabricator via llvm-commits)
Date: Mon, 07 Oct 2019 14:38:45 +0000 (UTC)
Subject: [PATCH] D68566: [ARM] VQADD instructions
In-Reply-To:
References:
Message-ID: <6cdec8802a04d54a69e7f1b718bbcce9@localhost.localdomain>

samparker added a comment.

Great!
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68566/new/ https://reviews.llvm.org/D68566 From llvm-commits at lists.llvm.org Mon Oct 7 07:38:50 2019 From: llvm-commits at lists.llvm.org (Thomas Preud'homme via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 14:38:50 +0000 (UTC) Subject: [PATCH] D67882: [LNT] Python 3 support: make leaf classes inherit from object In-Reply-To: References: Message-ID: thopre updated this revision to Diff 223603. thopre added a comment. Remove class style compatibility change. Remove stddev_mean property code altogether since it is not used anywhere. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67882/new/ https://reviews.llvm.org/D67882 Files: lnt/server/reporting/analysis.py Index: lnt/server/reporting/analysis.py =================================================================== --- lnt/server/reporting/analysis.py +++ lnt/server/reporting/analysis.py @@ -97,7 +97,6 @@ self.stddev = None self.MAD = None - self.stddev_mean = None # Only calculate this if needed. self.failed = cur_failed self.prev_failed = prev_failed self.samples = samples @@ -106,14 +105,6 @@ self.confidence_lv = confidence_lv self.bigger_is_better = bigger_is_better - @property - def stddev_mean(self): - """The mean around stddev for current sampples. Cached after first call. - """ - if not self.stddev_mean: - self.stddev_mean = stats.mean(self.samples) - return self.stddev_mean - def __repr__(self): """Print this ComparisonResult's constructor. -------------- next part -------------- A non-text attachment was scrubbed... Name: D67882.223603.patch Type: text/x-patch Size: 899 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 07:40:01 2019 From: llvm-commits at lists.llvm.org (Gil Rapaport via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 14:40:01 +0000 (UTC) Subject: [PATCH] D68577: [LV] Apply sink-after & interleave-groups as VPlan transformations (NFC) Message-ID: gilr created this revision.
gilr added reviewers: hsaito, rengolin, dcaballe, fhahn, Ayal. Herald added subscribers: llvm-commits, psnobl, rogfer01, rkruppe, tschuett, bollu, hiraditya. Herald added a project: LLVM. The sink-after and interleave-group vectorization decisions were so far applied to VPlan during initial VPlan construction, which complicates VPlan construction – also because of their inter-dependence. This patch refactors buildVPlanWithRecipes() to construct a simpler initial VPlan and later apply both these vectorization decisions, in order, as VPlan-to-VPlan transformations. Repository: rL LLVM https://reviews.llvm.org/D68577 Files: llvm/lib/Transforms/Vectorize/LoopVectorizationPlanner.h llvm/lib/Transforms/Vectorize/LoopVectorize.cpp llvm/lib/Transforms/Vectorize/VPRecipeBuilder.h llvm/lib/Transforms/Vectorize/VPlan.cpp llvm/lib/Transforms/Vectorize/VPlan.h -------------- next part -------------- A non-text attachment was scrubbed... Name: D68577.223600.patch Type: text/x-patch Size: 21081 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 07:43:08 2019 From: llvm-commits at lists.llvm.org (Hans Wennborg via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 14:43:08 +0000 (UTC) Subject: [PATCH] D68570: Unify the two CRC implementations In-Reply-To: References: Message-ID: <95131d8c07df02d9c2ff3a32c4409865@localhost.localdomain> hans added a comment. In D68570#1697588 , @joerg wrote: > Why go back to the large tables for crc32? Just because JamCRC had that bug doesn't mean it should persist. Because just using the table is much simpler and we already have it: no need for any run-time initialization and fancy code like call_once. Why do you consider it a bug? Generating a constant table like this at run-time -- again and again for each invocation of the program -- seems less than ideal to me. 
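[Editorial sketch, not part of the thread or of D68570.] The comment above weighs a hard-coded CRC table against one built at run time behind call_once. For readers following along, a third shape is possible when a C++14 compiler can be assumed: let the compiler build the table in a constexpr function, so there is neither a 256-entry literal nor any run-time initialization. All names below (Crc32Table, makeCrc32Table, kCrc32Table, crc32) are hypothetical and chosen only for illustration; the polynomial is the usual reflected CRC-32 constant 0xEDB88320.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical illustration only; these names do not come from the patch.
struct Crc32Table {
  std::uint32_t entries[256];
};

// Builds the reflected CRC-32 table (polynomial 0xEDB88320) at compile time.
constexpr Crc32Table makeCrc32Table() {
  Crc32Table table{};
  for (std::uint32_t i = 0; i < 256; ++i) {
    std::uint32_t value = i;
    for (int bit = 0; bit < 8; ++bit)
      value = (value & 1u) ? (0xEDB88320u ^ (value >> 1)) : (value >> 1);
    table.entries[i] = value;
  }
  return table;
}

// Constant-initialized: no call_once and no start-up cost.
constexpr Crc32Table kCrc32Table = makeCrc32Table();

// Table-driven CRC-32 over a byte buffer; crc is the running value (0 to start).
constexpr std::uint32_t crc32(const char *data, std::size_t size,
                              std::uint32_t crc = 0) {
  crc = ~crc;
  for (std::size_t i = 0; i < size; ++i) {
    const std::uint32_t byte = static_cast<unsigned char>(data[i]);
    crc = kCrc32Table.entries[(crc ^ byte) & 0xFFu] ^ (crc >> 8);
  }
  return ~crc;
}

// Standard CRC-32 check value for the ASCII string "123456789".
static_assert(crc32("123456789", 9) == 0xCBF43926u, "CRC-32 check value");
```

Whether this is preferable to a literal table is a judgment call: the literal is easier to read in a debugger and works with older compilers, while the constexpr form keeps the source short and cannot drift from the polynomial.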
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68570/new/ https://reviews.llvm.org/D68570 From llvm-commits at lists.llvm.org Mon Oct 7 07:44:16 2019 From: llvm-commits at lists.llvm.org (Dragan Mladjenovic via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 14:44:16 +0000 (UTC) Subject: [PATCH] D68542: [Mips] Always save RA when disabling frame pointer elimination In-Reply-To: References: Message-ID: draganm added a comment. I don't think you can have frame-pointer based stack unwinding under current Mips ABIs, albeit this might be useful for some stack scan based unwind. Not sure tho. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68542/new/ https://reviews.llvm.org/D68542 From llvm-commits at lists.llvm.org Mon Oct 7 07:48:27 2019 From: llvm-commits at lists.llvm.org (Sanjay Patel via llvm-commits) Date: Mon, 07 Oct 2019 14:48:27 -0000 Subject: [llvm] r373913 - [LoopVectorize] add test that asserted after cost model change (PR43582); NFC Message-ID: <20191007144827.D47A480696@lists.llvm.org> Author: spatel Date: Mon Oct 7 07:48:27 2019 New Revision: 373913 URL: http://llvm.org/viewvc/llvm-project?rev=373913&view=rev Log: [LoopVectorize] add test that asserted after cost model change (PR43582); NFC Added: llvm/trunk/test/Transforms/LoopVectorize/X86/cost-model-assert.ll Added: llvm/trunk/test/Transforms/LoopVectorize/X86/cost-model-assert.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/LoopVectorize/X86/cost-model-assert.ll?rev=373913&view=auto ============================================================================== --- llvm/trunk/test/Transforms/LoopVectorize/X86/cost-model-assert.ll (added) +++ llvm/trunk/test/Transforms/LoopVectorize/X86/cost-model-assert.ll Mon Oct 7 07:48:27 2019 @@ -0,0 +1,127 @@ +; NOTE: Assertions have been autogenerated by utils/update_test_checks.py +; RUN: opt < %s -loop-vectorize -S | FileCheck %s + +; This is a bugpoint reduction of a test from PR43582: 
+; https://bugs.llvm.org/show_bug.cgi?id=43582 + +; ...but it's over-simplifying the underlying question: +; TODO: Should this be vectorized rather than allowing the backend to load combine? +; The original code is a bswap pattern. + +target datalayout = "e-m:w-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128" +target triple = "x86_64-w64-windows-gnu" + +define void @cff_index_load_offsets(i1 %cond, i8 %x, i8* %p) #0 { +; CHECK-LABEL: @cff_index_load_offsets( +; CHECK-NEXT: entry: +; CHECK-NEXT: br i1 [[COND:%.*]], label [[IF_THEN:%.*]], label [[EXIT:%.*]] +; CHECK: if.then: +; CHECK-NEXT: br i1 true, label [[SCALAR_PH:%.*]], label [[VECTOR_PH:%.*]] +; CHECK: vector.ph: +; CHECK-NEXT: [[BROADCAST_SPLATINSERT:%.*]] = insertelement <4 x i8> undef, i8 [[X:%.*]], i32 0 +; CHECK-NEXT: [[BROADCAST_SPLAT:%.*]] = shufflevector <4 x i8> [[BROADCAST_SPLATINSERT]], <4 x i8> undef, <4 x i32> zeroinitializer +; CHECK-NEXT: br label [[VECTOR_BODY:%.*]] +; CHECK: vector.body: +; CHECK-NEXT: [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ] +; CHECK-NEXT: [[TMP0:%.*]] = add i64 [[INDEX]], 0 +; CHECK-NEXT: [[TMP1:%.*]] = mul i64 [[TMP0]], 4 +; CHECK-NEXT: [[NEXT_GEP:%.*]] = getelementptr i8, i8* null, i64 [[TMP1]] +; CHECK-NEXT: [[TMP2:%.*]] = zext <4 x i8> [[BROADCAST_SPLAT]] to <4 x i32> +; CHECK-NEXT: [[TMP3:%.*]] = shl nuw <4 x i32> [[TMP2]], +; CHECK-NEXT: [[TMP4:%.*]] = load i8, i8* [[P:%.*]], align 1, !tbaa !1 +; CHECK-NEXT: [[TMP5:%.*]] = load i8, i8* [[P]], align 1, !tbaa !1 +; CHECK-NEXT: [[TMP6:%.*]] = load i8, i8* [[P]], align 1, !tbaa !1 +; CHECK-NEXT: [[TMP7:%.*]] = load i8, i8* [[P]], align 1, !tbaa !1 +; CHECK-NEXT: [[TMP8:%.*]] = insertelement <4 x i8> undef, i8 [[TMP4]], i32 0 +; CHECK-NEXT: [[TMP9:%.*]] = insertelement <4 x i8> [[TMP8]], i8 [[TMP5]], i32 1 +; CHECK-NEXT: [[TMP10:%.*]] = insertelement <4 x i8> [[TMP9]], i8 [[TMP6]], i32 2 +; CHECK-NEXT: [[TMP11:%.*]] = insertelement <4 x i8> [[TMP10]], i8 
[[TMP7]], i32 3 +; CHECK-NEXT: [[TMP12:%.*]] = zext <4 x i8> [[TMP11]] to <4 x i32> +; CHECK-NEXT: [[TMP13:%.*]] = shl nuw nsw <4 x i32> [[TMP12]], +; CHECK-NEXT: [[TMP14:%.*]] = or <4 x i32> [[TMP13]], [[TMP3]] +; CHECK-NEXT: [[TMP15:%.*]] = load i8, i8* undef, align 1, !tbaa !1 +; CHECK-NEXT: [[TMP16:%.*]] = load i8, i8* undef, align 1, !tbaa !1 +; CHECK-NEXT: [[TMP17:%.*]] = load i8, i8* undef, align 1, !tbaa !1 +; CHECK-NEXT: [[TMP18:%.*]] = load i8, i8* undef, align 1, !tbaa !1 +; CHECK-NEXT: [[TMP19:%.*]] = or <4 x i32> [[TMP14]], zeroinitializer +; CHECK-NEXT: [[TMP20:%.*]] = or <4 x i32> [[TMP19]], zeroinitializer +; CHECK-NEXT: [[TMP21:%.*]] = extractelement <4 x i32> [[TMP20]], i32 0 +; CHECK-NEXT: store i32 [[TMP21]], i32* undef, align 4, !tbaa !4 +; CHECK-NEXT: [[TMP22:%.*]] = extractelement <4 x i32> [[TMP20]], i32 1 +; CHECK-NEXT: store i32 [[TMP22]], i32* undef, align 4, !tbaa !4 +; CHECK-NEXT: [[TMP23:%.*]] = extractelement <4 x i32> [[TMP20]], i32 2 +; CHECK-NEXT: store i32 [[TMP23]], i32* undef, align 4, !tbaa !4 +; CHECK-NEXT: [[TMP24:%.*]] = extractelement <4 x i32> [[TMP20]], i32 3 +; CHECK-NEXT: store i32 [[TMP24]], i32* undef, align 4, !tbaa !4 +; CHECK-NEXT: [[INDEX_NEXT]] = add i64 [[INDEX]], 4 +; CHECK-NEXT: [[TMP25:%.*]] = icmp eq i64 [[INDEX_NEXT]], 0 +; CHECK-NEXT: br i1 [[TMP25]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop !6 +; CHECK: middle.block: +; CHECK-NEXT: [[CMP_N:%.*]] = icmp eq i64 1, 0 +; CHECK-NEXT: br i1 [[CMP_N]], label [[SW_EPILOG:%.*]], label [[SCALAR_PH]] +; CHECK: scalar.ph: +; CHECK-NEXT: [[BC_RESUME_VAL:%.*]] = phi i8* [ null, [[MIDDLE_BLOCK]] ], [ null, [[IF_THEN]] ] +; CHECK-NEXT: br label [[FOR_BODY68:%.*]] +; CHECK: for.body68: +; CHECK-NEXT: [[P_359:%.*]] = phi i8* [ [[ADD_PTR86:%.*]], [[FOR_BODY68]] ], [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ] +; CHECK-NEXT: [[CONV70:%.*]] = zext i8 [[X]] to i32 +; CHECK-NEXT: [[SHL71:%.*]] = shl nuw i32 [[CONV70]], 24 +; CHECK-NEXT: [[TMP26:%.*]] = load i8, i8* 
[[P]], align 1, !tbaa !1 +; CHECK-NEXT: [[CONV73:%.*]] = zext i8 [[TMP26]] to i32 +; CHECK-NEXT: [[SHL74:%.*]] = shl nuw nsw i32 [[CONV73]], 16 +; CHECK-NEXT: [[OR75:%.*]] = or i32 [[SHL74]], [[SHL71]] +; CHECK-NEXT: [[TMP27:%.*]] = load i8, i8* undef, align 1, !tbaa !1 +; CHECK-NEXT: [[SHL78:%.*]] = shl nuw nsw i32 undef, 8 +; CHECK-NEXT: [[OR79:%.*]] = or i32 [[OR75]], [[SHL78]] +; CHECK-NEXT: [[CONV81:%.*]] = zext i8 undef to i32 +; CHECK-NEXT: [[OR83:%.*]] = or i32 [[OR79]], [[CONV81]] +; CHECK-NEXT: store i32 [[OR83]], i32* undef, align 4, !tbaa !4 +; CHECK-NEXT: [[ADD_PTR86]] = getelementptr inbounds i8, i8* [[P_359]], i64 4 +; CHECK-NEXT: [[CMP66:%.*]] = icmp ult i8* [[ADD_PTR86]], undef +; CHECK-NEXT: br i1 [[CMP66]], label [[FOR_BODY68]], label [[SW_EPILOG]], !llvm.loop !8 +; CHECK: sw.epilog: +; CHECK-NEXT: unreachable +; CHECK: Exit: +; CHECK-NEXT: ret void +; +entry: + br i1 %cond, label %if.then, label %Exit + +if.then: ; preds = %entry + br label %for.body68 + +for.body68: ; preds = %for.body68, %if.then + %p.359 = phi i8* [ %add.ptr86, %for.body68 ], [ null, %if.then ] + %conv70 = zext i8 %x to i32 + %shl71 = shl nuw i32 %conv70, 24 + %0 = load i8, i8* %p, align 1, !tbaa !1 + %conv73 = zext i8 %0 to i32 + %shl74 = shl nuw nsw i32 %conv73, 16 + %or75 = or i32 %shl74, %shl71 + %1 = load i8, i8* undef, align 1, !tbaa !1 + %shl78 = shl nuw nsw i32 undef, 8 + %or79 = or i32 %or75, %shl78 + %conv81 = zext i8 undef to i32 + %or83 = or i32 %or79, %conv81 + store i32 %or83, i32* undef, align 4, !tbaa !4 + %add.ptr86 = getelementptr inbounds i8, i8* %p.359, i64 4 + %cmp66 = icmp ult i8* %add.ptr86, undef + br i1 %cmp66, label %for.body68, label %sw.epilog + +sw.epilog: ; preds = %for.body68 + unreachable + +Exit: ; preds = %entry + ret void +} + +attributes #0 = { "use-soft-float"="false" } + +!llvm.ident = !{!0} + +!0 = !{!"clang version 10.0.0 (https://github.com/llvm/llvm-project.git 0fedc26a0dc0066f3968b9fea6a4e1f746c8d5a4)"} +!1 = !{!2, !2, i64 0} +!2 = 
!{!"omnipotent char", !3, i64 0} +!3 = !{!"Simple C/C++ TBAA"} +!4 = !{!5, !5, i64 0} +!5 = !{!"long", !2, i64 0} From llvm-commits at lists.llvm.org Mon Oct 7 07:53:34 2019 From: llvm-commits at lists.llvm.org (Jeroen Dobbelaere via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 14:53:34 +0000 (UTC) Subject: [PATCH] D68521: [PATCH 36/38] [noalias] Clang CodeGen for restrict-qualified pointers In-Reply-To: References: Message-ID: <730c7d40367347a4129f3e94e4545dc1@localhost.localdomain> jeroen.dobbelaere updated this revision to Diff 223606. jeroen.dobbelaere added a comment. Adapt CodeGen to std::vector indices (was std::vector). Adapt tests to i64 p.objId (was i32). CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68521/new/ https://reviews.llvm.org/D68521 Files: clang/include/clang/AST/Type.h clang/include/clang/Basic/CodeGenOptions.def clang/include/clang/Driver/CC1Options.td clang/lib/AST/Type.cpp clang/lib/CodeGen/Address.h clang/lib/CodeGen/CGCall.cpp clang/lib/CodeGen/CGDecl.cpp clang/lib/CodeGen/CGExpr.cpp clang/lib/CodeGen/CGExprAgg.cpp clang/lib/CodeGen/CGStmt.cpp clang/lib/CodeGen/CodeGenFunction.cpp clang/lib/CodeGen/CodeGenFunction.h clang/lib/Driver/ToolChains/Clang.cpp clang/lib/Frontend/CompilerInvocation.cpp clang/test/CodeGen/noalias.c clang/test/CodeGen/restrict/arg_reuse.c clang/test/CodeGen/restrict/array.c clang/test/CodeGen/restrict/basic.c clang/test/CodeGen/restrict/basic_opt_01.c clang/test/CodeGen/restrict/basic_opt_02.c clang/test/CodeGen/restrict/basic_opt_03.c clang/test/CodeGen/restrict/basic_opt_04.c clang/test/CodeGen/restrict/escape_through_volatile.c clang/test/CodeGen/restrict/inlining_01.c clang/test/CodeGen/restrict/inlining_02.c clang/test/CodeGen/restrict/side_noalias_reduction_01.c clang/test/CodeGen/restrict/struct.c clang/test/CodeGen/restrict/struct_member_01.c clang/test/CodeGen/restrict/struct_member_02.c clang/test/CodeGen/restrict/struct_member_03.c clang/test/CodeGen/restrict/struct_member_04.c 
clang/test/CodeGen/restrict/struct_member_05.c clang/test/CodeGen/restrict/struct_member_06.c clang/test/CodeGen/restrict/struct_member_07.c clang/test/CodeGen/restrict/struct_member_08.cpp clang/test/OpenMP/taskloop_firstprivate_codegen.cpp clang/test/OpenMP/taskloop_lastprivate_codegen.cpp clang/test/OpenMP/taskloop_private_codegen.cpp clang/test/OpenMP/taskloop_simd_firstprivate_codegen.cpp clang/test/OpenMP/taskloop_simd_lastprivate_codegen.cpp clang/test/OpenMP/taskloop_simd_private_codegen.cpp llvm/lib/IR/IRBuilder.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D68521.223606.patch Type: text/x-patch Size: 105809 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 07:53:40 2019 From: llvm-commits at lists.llvm.org (Simon Atanasyan via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 14:53:40 +0000 (UTC) Subject: [PATCH] D68542: [Mips] Always save RA when disabling frame pointer elimination In-Reply-To: References: Message-ID: <71462d0ff1de4d5730a0e1afe38652e1@localhost.localdomain> atanasyan added a comment. In D68542#1697636 , @draganm wrote: > I don't think you can have frame-pointer based stack unwinding under current Mips ABIs, albeit this might be useful for some stack scan based unwind. Not sure tho. Agreed. But saving RA has a low cost, might be useful and as far as I can see - gcc saves RA in the same cases. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68542/new/ https://reviews.llvm.org/D68542 From llvm-commits at lists.llvm.org Mon Oct 7 07:55:39 2019 From: llvm-commits at lists.llvm.org (Hubert Tong via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 14:55:39 +0000 (UTC) Subject: [PATCH] D67879: [LNT] Python 3 support: import object when inheriting from it In-Reply-To: References: Message-ID: <7fd740ec1260b96093c29551870212b2@localhost.localdomain> hubert.reinterpretcast added a comment. 
My understanding is that this patch has no effect for Python 3. In Python 2, `object` from `builtins` (as provided by the `future` package) is used to enable use of some Python 3 coding patterns. Absent further changes that make use of such enablement, I am not sure that this patch is necessary. If this patch is needed to support later patches, then I suggest applying the fixer at that point. Applying the fixer before applying D67882 seems odd anyway. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67879/new/ https://reviews.llvm.org/D67879 From llvm-commits at lists.llvm.org Mon Oct 7 07:55:49 2019 From: llvm-commits at lists.llvm.org (Peter Collingbourne via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 14:55:49 +0000 (UTC) Subject: [PATCH] D10548: Teach LTOModule to emit linker flags for dllexported symbols, plus interface cleanup. In-Reply-To: References: Message-ID: <8faf9dbe32d2d2b328939a017bb4a74e@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rGaef3659e1888: Teach LTOModule to emit linker flags for dllexported symbols, plus interface… (authored by pcc). Herald added subscribers: mstorsjo, dang, dexonsmith, steven_wu, hiraditya. Herald added a project: LLVM. Changed prior to commit: https://reviews.llvm.org/D10548?vs=28718&id=223608#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D10548/new/ https://reviews.llvm.org/D10548 Files: llvm/include/llvm-c/lto.h llvm/include/llvm/CodeGen/TargetLoweringObjectFileImpl.h llvm/include/llvm/LTO/LTOModule.h llvm/include/llvm/Target/TargetLoweringObjectFile.h llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp llvm/lib/LTO/LTOModule.cpp llvm/lib/Target/X86/X86AsmPrinter.cpp llvm/lib/Target/X86/X86AsmPrinter.h llvm/test/CodeGen/X86/dllexport-x86_64.ll llvm/test/CodeGen/X86/dllexport.ll llvm/tools/lto/lto.cpp -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D10548.223608.patch Type: text/x-patch Size: 19184 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 07:59:08 2019 From: llvm-commits at lists.llvm.org (Thomas Preud'homme via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 14:59:08 +0000 (UTC) Subject: [PATCH] D67879: [LNT] Python 3 support: import object when inheriting from it In-Reply-To: References: Message-ID: <97de2165b9a40f9228d718a6e2d9cd08@localhost.localdomain> thopre abandoned this revision. thopre added a comment. In D67879#1697652 , @hubert.reinterpretcast wrote: > My understanding is that this patch has no effect for Python 3. In Python 2, `object` from `builtins` (as provided by the `future` package) is used to enable use of some Python 3 coding patterns. Absent further changes that make use of such enablement, I am not sure that this patch is necessary. If this patch is needed to support later patches, then I suggest applying the fixer at that point. Applying the fixer before applying D67882 seems odd anyway. I wholly agree (see what I wrote in D67882 ). And yes the ordering in which futurize do these changes is odd. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67879/new/ https://reviews.llvm.org/D67879 From llvm-commits at lists.llvm.org Mon Oct 7 07:59:25 2019 From: llvm-commits at lists.llvm.org (Sjoerd Meijer via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 14:59:25 +0000 (UTC) Subject: [PATCH] D68579: [HardwareLoops] Optimisation remarks Message-ID: SjoerdMeijer created this revision. SjoerdMeijer added reviewers: hfinkel, samparker, shchenz, nemanjai, steven.zhang. Herald added a subscriber: hiraditya. Herald added a project: LLVM. This adds the initial plumbing to support optimisation remarks in the IR hardware-loop pass. I have left a TODO in a comment where we can improve the reporting, but I will iterate on that once we have this initial support in. 
https://reviews.llvm.org/D68579 Files: llvm/lib/CodeGen/HardwareLoops.cpp llvm/test/CodeGen/ARM/O3-pipeline.ll llvm/test/Transforms/HardwareLoops/ARM/opt-remarks.ll llvm/test/Transforms/HardwareLoops/unconditional-latch.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D68579.223607.patch Type: text/x-patch Size: 9344 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 08:06:41 2019 From: llvm-commits at lists.llvm.org (Alexandre Ganea via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 15:06:41 +0000 (UTC) Subject: [PATCH] D68352: [lld] Handle sections without chunks during PDB generation In-Reply-To: References: Message-ID: <12ceb6f55e3bd22096bd4a3105f23529@localhost.localdomain> aganea added a subscriber: mstorsjo. aganea added a comment. Your explanation makes sense, a hint-less EXE would bind to a very specific set of runtime libraries, which never happens on Windows, but make sense in case of embedded development. Indeed the hint table does not need null termination. I would assume the following in `lld//COFF/Writer.cpp` would solve your issue: if (!idata.hints.empty()) add(".idata$6", idata.hints); Remains to see how the modern (Windows 10) NT loader handles non-existing hint tables. Could you possibly add a test please, with a mention explaining your use-case on Xbox? You will need to create a ordinal-only library in the test, then try linking a "hello world" with that library. Binary are best avoided in the tests (being able to reproduce the data for the test is another good thing). Ensure all tests pass afterwards - `ninja check-lld` should do it in a MINGW32 shell (or in a regular cmd.exe if you have the GnuWin32 tools installed and in the %PATH%). 
Repository: rLLD LLVM Linker CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68352/new/ https://reviews.llvm.org/D68352 From llvm-commits at lists.llvm.org Mon Oct 7 08:07:24 2019 From: llvm-commits at lists.llvm.org (Sebastian Pop via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 15:07:24 +0000 (UTC) Subject: [PATCH] D67990: [aarch64] fix generation of fp16 fmls In-Reply-To: References: Message-ID: sebpop added a comment. Ping. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67990/new/ https://reviews.llvm.org/D67990 From llvm-commits at lists.llvm.org Mon Oct 7 08:13:06 2019 From: llvm-commits at lists.llvm.org (Hubert Tong via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 15:13:06 +0000 (UTC) Subject: [PATCH] D68472: [test] Depend on C.UTF-8 dependency for mri-utf8.test In-Reply-To: References: Message-ID: <71541644271f860f0dae5a3433a157d5@localhost.localdomain> hubert.reinterpretcast added a comment. Thanks for adding the BOM. With the BOM, would it make sense to leave `mri-utf8.test` as the name of the file? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68472/new/ https://reviews.llvm.org/D68472 From llvm-commits at lists.llvm.org Mon Oct 7 08:13:13 2019 From: llvm-commits at lists.llvm.org (Sam Parker via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 15:13:13 +0000 (UTC) Subject: [PATCH] D68337: [ARM][MVE] Enable extending masked loads In-Reply-To: References: Message-ID: <415248e3111eba8011adace87773c409@localhost.localdomain> samparker marked an inline comment as done. samparker added inline comments. ================ Comment at: lib/Target/ARM/ARMInstrMVE.td:5196 def : MVE_vector_maskedload_typed; + // Extending masked loads. + def : Pat<(v8i16 (sextmaskedload8 t2addrmode_imm7<0>:$addr, VCCR:$pred, ---------------- dmgreen wrote: > There likely needs to be an anyext too. Can (or is it beneficial for) these be merged into the MVEExtLoad multiclass below? 
As much as I don't like copy-paste, I do appreciate being able to read the code! I think adding to that multiclass is more hassle than it's worth :) CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68337/new/ https://reviews.llvm.org/D68337 From llvm-commits at lists.llvm.org Mon Oct 7 08:13:50 2019 From: llvm-commits at lists.llvm.org (Christian Kühnel via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 15:13:50 +0000 (UTC) Subject: [PATCH] D68560: Test for the build server -- DO NOT MERGE! In-Reply-To: References: Message-ID: <75819c4f92d955e40a5098664dcb7172@localhost.localdomain> kuhnel updated this revision to Diff 223611. kuhnel added a comment. - second change - third diff Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68560/new/ https://reviews.llvm.org/D68560 Files: DELETEME.txt Index: DELETEME.txt =================================================================== --- /dev/null +++ DELETEME.txt @@ -0,0 +1,4 @@ +just for testing. delete this file if you see it... + +This is my second change. +3rd one -------------- next part -------------- A non-text attachment was scrubbed... Name: D68560.223611.patch Type: text/x-patch Size: 226 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 08:21:31 2019 From: llvm-commits at lists.llvm.org (Hubert Tong via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 15:21:31 +0000 (UTC) Subject: [PATCH] D67882: [LNT] Python 3 support: make leaf classes inherit from object In-Reply-To: References: Message-ID: <96c5d5fe46024fe0559335ccbf0f1d48@localhost.localdomain> hubert.reinterpretcast added a comment. In D67882#1697616 , @thopre wrote: > I think we should make LNT Python 3 only as soon as it can work in that mode, so the less compatibility code we add the better. Do you agree with that approach? I agree with the approach of minimizing compatibility code. I am not sure about proactively dropping Python 2 support.
As you said, the amount of commits going into LNT is low, so keeping Python 2 compatibility once the Python 3 mode works won't cost much development effort. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67882/new/ https://reviews.llvm.org/D67882 From llvm-commits at lists.llvm.org Mon Oct 7 08:25:04 2019 From: llvm-commits at lists.llvm.org (Cameron McInally via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 15:25:04 +0000 (UTC) Subject: [PATCH] D53877: [IR] Strawman for dedicated FNeg IR instruction In-Reply-To: References: Message-ID: <6e2cf9f972a93e2878938d3dc66d9edb@localhost.localdomain> cameron.mcinally marked 2 inline comments as done. cameron.mcinally added inline comments. ================ Comment at: llvm/trunk/include/llvm-c/Core.h:1523-1524 macro(UndefValue) \ macro(Instruction) \ macro(BinaryOperator) \ macro(CallInst) \ ---------------- lebedev.ri wrote: > @cameron.mcinally Should anything have been added here for `UnaryOperator` ? Yes, I believe you're correct. Will add that under a separate Diff. Thanks. ================ Comment at: llvm/trunk/include/llvm/Bitcode/LLVMBitCodes.h:373 +enum UnaryOpcodes { + UNOP_NEG = 0 +}; ---------------- lebedev.ri wrote: > @cameron.mcinally also, shouldn't this be `UNOP_FNEG`? I'm not sure. The BINOPs are overloaded for INT/FP types. E.g. BINOP_ADD is also FP. I suppose there are no plans for an INT UNOP_NEG though. Do you feel strongly about this change? Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D53877/new/ https://reviews.llvm.org/D53877 From llvm-commits at lists.llvm.org Mon Oct 7 08:25:39 2019 From: llvm-commits at lists.llvm.org (Florian Hahn via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 15:25:39 +0000 (UTC) Subject: [PATCH] D68144: [LoopInterchange] Improve inner exit loop safety checks. In-Reply-To: References: Message-ID: fhahn added a comment. 
ping Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68144/new/ https://reviews.llvm.org/D68144 From llvm-commits at lists.llvm.org Mon Oct 7 08:26:35 2019 From: llvm-commits at lists.llvm.org (James Clarke via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 15:26:35 +0000 (UTC) Subject: [PATCH] D68542: [Mips] Always save RA when disabling frame pointer elimination In-Reply-To: References: Message-ID: <3a2517167ba1def38e604c95acd0f90d@localhost.localdomain> jrtc27 added a comment. In D68542#1697636 , @draganm wrote: > I don't think you can have frame-pointer based stack unwinding under current Mips ABIs, albeit this might be useful for some stack scan based unwind. Not sure tho. You can most of the time, you just have to scan backwards to find the function prologue. Yes, it can break, but unless you have full DWARF info you can't do much better. Both FreeBSD (sys/mips/mips/db_trace.c) and Linux (arch/mips/kernel/process.c) do instruction-based unwinding on MIPS to get a good-enough backtrace on panic, so without this they can end up terminating the backtrace early. In particular, if you want a specific instance of the issue that motivated this patch, on FreeBSD, they have a `panic` which calls `vpanic` (much like `printf` vs `vprintf`), but due to being marked `noreturn`, `$ra` is dead and thus being clobbered by the call doesn't force a save like normal, so *every* panic ends up with a useless backtrace terminating at `panic`. 
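[Editorial sketch, not part of the thread.] To make the instruction-based unwinding described above concrete, here is a heavily simplified model of the backward scan. It is not the FreeBSD or Linux code; the function and array names are hypothetical, and it recognizes only two MIPS32 encodings: `addiu $sp, $sp, -N` (0x27BDxxxx with a negative immediate) as the frame allocation and `sw $ra, off($sp)` (0xAFBFxxxx) as the return-address save. If the scan reaches the prologue without seeing the save, a real unwinder of this kind has no stack slot to read the caller's address from, which is the early-terminating backtrace described in the comment.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical, simplified model; real kernels handle many more cases.

// addiu $sp, $sp, imm with a negative immediate: the frame allocation.
constexpr bool isStackAdjust(std::uint32_t insn) {
  return (insn & 0xFFFF0000u) == 0x27BD0000u && (insn & 0x8000u) != 0;
}

// sw $ra, offset($sp): the return-address save.
constexpr bool isSaveRA(std::uint32_t insn) {
  return (insn & 0xFFFF0000u) == 0xAFBF0000u;
}

// Sign-extends the low 16 bits of an instruction word.
constexpr std::int32_t signExtend16(std::uint32_t insn) {
  return static_cast<std::int32_t>(insn & 0xFFFFu) -
         ((insn & 0x8000u) ? 0x10000 : 0);
}

struct ScanResult {
  bool foundPrologue;    // saw the stack adjustment
  bool foundSavedRA;     // saw "sw $ra" before reaching the prologue
  std::int32_t raOffset; // meaningful only if foundSavedRA
};

// Walks backwards from code[pcIndex] until the stack adjustment is found.
constexpr ScanResult scanBackwards(const std::uint32_t *code,
                                   std::size_t pcIndex) {
  ScanResult result{false, false, 0};
  for (std::size_t i = pcIndex + 1; i-- > 0;) {
    if (isSaveRA(code[i])) {
      result.foundSavedRA = true;
      result.raOffset = signExtend16(code[i]);
    }
    if (isStackAdjust(code[i])) {
      result.foundPrologue = true;
      break;
    }
  }
  return result;
}

// addiu $sp,$sp,-32; sw $ra,28($sp); jal 0; nop
constexpr std::uint32_t kSavesRA[] = {0x27BDFFE0u, 0xAFBF001Cu, 0x0C000000u,
                                      0x00000000u};
// Same shape without the save, as in the noreturn case described above.
constexpr std::uint32_t kNoSave[] = {0x27BDFFE0u, 0x0C000000u, 0x00000000u};

static_assert(scanBackwards(kSavesRA, 3).foundSavedRA, "RA slot located");
static_assert(!scanBackwards(kNoSave, 2).foundSavedRA, "no RA slot to read");
```

In the second array the scan still finds the frame allocation, so the frame size is known, but there is no saved `$ra` to load; always saving RA when frame pointer elimination is disabled gives such a scan something to find.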
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68542/new/ https://reviews.llvm.org/D68542 From llvm-commits at lists.llvm.org Mon Oct 7 08:28:30 2019 From: llvm-commits at lists.llvm.org (Sjoerd Meijer via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 15:28:30 +0000 (UTC) Subject: [PATCH] D67990: [aarch64] fix generation of fp16 fmls In-Reply-To: References: Message-ID: <0906ac3f7971af5a64a2b0e5783b0923@localhost.localdomain> SjoerdMeijer added inline comments. ================ Comment at: llvm/test/CodeGen/AArch64/fp16-fmla.ll:163 +; CHECK: fneg {{v[0-9]+}}.8h, {{v[0-9]+}}.8h +; CHECK: fmla {{v[0-9]+}}.8h, {{v[0-9]+}}.8h, {{v[0-9]+}}.8h entry: ---------------- Why are we not generating a fmls? And a nit, but perhaps actually just using registers v0, v1, and v2 here makes things clearer? CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67990/new/ https://reviews.llvm.org/D67990 From llvm-commits at lists.llvm.org Mon Oct 7 08:32:57 2019 From: llvm-commits at lists.llvm.org (Tony Tye via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 15:32:57 +0000 (UTC) Subject: [PATCH] D45246: Add AMDPAL Code Conventions section to AMD docs In-Reply-To: References: Message-ID: <6e51375a7d170a5cc8407156eb0b0e53@localhost.localdomain> t-tye reopened this revision. t-tye added inline comments. ================ Comment at: llvm/docs/AMDGPUUsage.rst:3808-3810 +Note that there are always 10 available *user data entries* in registers - +entries beyond that limit must be fetched from memory (via the spill table +pointer) by the shader. ---------------- Clarify that this is User SGPR Registers. Also should this define the System SGPR and VGPR Registers. 
================
Comment at: llvm/docs/AMDGPUUsage.rst:3816
+ ============= ================================ + User Register Description + ============= ================================
----------------
User SGPR Registers

================
Comment at: llvm/docs/AMDGPUUsage.rst:3829
+ +Graphics pipelines support a much more flexible user data mapping: +
----------------
Add System SGPR and VGPR mapping information.

================
Comment at: llvm/docs/AMDGPUUsage.rst:3835
+ ============= ================================ + User Register Description + ============= ================================
----------------
User SGPR Registers

================
Comment at: llvm/docs/AMDGPUUsage.rst:3836-3845
+ ============= ================================ + 0 Global Internal Table (32-bit pointer) + + Per-Shader Internal Table (32-bit pointer) + + 1-15 Application Controlled User Data + (1-15 Contiguous 32-bit Values in Registers) + + Spill Table (32-bit pointer) + + Draw Index (First Stage Only)
----------------
Need to remove the "+" as that is making a bulleted list. I think this is what you want, which puts a list of the possible values in the 1-15 table row:

  ============= ================================
  User Register Description
  ============= ================================
  0             :ref:`amdpal_global_internal_table` (32-bit pointer)
  1-15          Application Controlled User Data
                (1-15 Contiguous 32-bit Values in Registers)
                - Per-Shader Internal Table (32-bit pointer)
                - Spill Table (32-bit pointer)
                - Draw Index (First Stage Only)
                - Vertex Offset (First Stage Only)
                - Instance Offset (First Stage Only)
  ============= ================================

Define these fields and how the metadata sets them up.

================
Comment at: llvm/docs/AMDGPUUsage.rst:3858-3862
+ * The application-controlled user data range supports compaction remapping, so + only *entries* that are actually consumed by the shader must be assigned to + corresponding *registers*. Note that in order to support an efficient runtime + implementation, the remapping must pack *registers* in the same order as + *entries*, with unused *entries* removed.
----------------
Define how the mapping and re-mapping is conveyed and what the rules are. From this description I know there is a mapping but am unclear on how it is expressed.

================
Comment at: llvm/docs/AMDGPUUsage.rst:3864
+ +.. _pal_global_internal_table: +
----------------
To be consistent with the section name:

  .. _amdpal_global_internal_table:

================
Comment at: llvm/docs/AMDGPUUsage.rst:3870
+The global internal table is a table of *shader resource descriptors* (SRDs) that +define how certain engine-wide, runtime-managed resources should be accessed +from a shader. The majority of these resources have HW-defined formats, and it
----------------
Where is the concept of an "engine" defined? Is this just another term for the PAL runtime? If so I would stick with saying PAL runtime. I would clarify use of runtime with PAL Runtime since the higher levels also have runtimes.

================
Comment at: llvm/docs/AMDGPUUsage.rst:3872
+from a shader. The majority of these resources have HW-defined formats, and it +is up to the compiler to write/read data as required by the target hardware. +
----------------
Would be helpful to reference where the target hardware format is defined.

================
Comment at: llvm/docs/AMDGPUUsage.rst:3879-3896
+ ============= ================================ + Offset Description + ============= ================================ + 0-3 Graphics Scratch SRD + 4-7 Compute Scratch SRD + 8-11 ES/GS Ring Output SRD + 12-15 ES/GS Ring Input SRD
----------------
Add sections to define the other structures mentioned:

  Per-Shader Internal Table (32-bit pointer)
  Spill Table (32-bit pointer)

Section to define the metadata and how it specifies how these structures are set up.
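
[Editorial illustration, not part of the review.] The packing rule quoted in the comment on AMDGPUUsage.rst:3858-3862 (registers packed in the same order as entries, with unused entries removed) can be read as an order-preserving compaction. The sketch below is only one possible reading: the function name and the bitmask input are assumptions made for illustration, and it says nothing about how the AMDPAL metadata actually conveys the mapping, which is the open question raised in the review.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical illustration only: the function name and the bitmask input
// are assumptions, not part of D45246 or of the AMDPAL metadata.
//
// Returns, for each packed register slot, the index of the user data entry
// it holds. Entries are visited in ascending order, so the packed registers
// keep the same relative order as the entries; unused entries are dropped.
inline std::vector<unsigned> packUsedEntries(uint32_t UsedEntryMask) {
  std::vector<unsigned> SlotToEntry;
  for (unsigned Entry = 0; Entry < 32; ++Entry)
    if (UsedEntryMask & (uint32_t{1} << Entry))
      SlotToEntry.push_back(Entry);
  return SlotToEntry;
}

// Usage: entries 1, 3 and 5 are consumed, so they land in slots 0, 1 and 2.
inline void packUsedEntriesExample() {
  const std::vector<unsigned> Slots = packUsedEntries(0x2Au);
  assert((Slots == std::vector<unsigned>{1, 3, 5}));
}
```

Because the loop walks entries in ascending order, a runtime could in principle fill the registers with a single pass over the entries, which seems to be the efficiency argument the quoted text makes.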
================ Comment at: llvm/docs/AMDGPUUsage.rst:3898-3901 + The pointer to the global internal table passed to the shader as user data + is a 32-bit pointer. The top 32 bits should be assumed to be the same as + the top 32 bits of the pipeline, so the shader may use the program + counter's top 32 bits. ---------------- Suggest being more explicit. What does "top 32 bits should be assumed to be the same as the top 32 bits of the pipeline" mean? Presumably it is the loaded PC address of the code for the shader pipeline. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D45246/new/ https://reviews.llvm.org/D45246 From llvm-commits at lists.llvm.org Mon Oct 7 08:33:21 2019 From: llvm-commits at lists.llvm.org (Sanjay Patel via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 15:33:21 +0000 (UTC) Subject: [PATCH] D67841: [SLP] avoid reduction transform on patterns that the backend can load-combine In-Reply-To: References: Message-ID: spatel updated this revision to Diff 223613. spatel added a comment. Patch updated: Jump back to the earlier revision which created a new method for cost of a load-combine pattern. This is independent of the existing arithmetic instruction cost API, so we can be sure that there is no conflict with other cost model users. It's also more accurate because we can add the cost of a wider load just once for the entire pattern. Test case for the loop vectorizer crash was added at rL373913 . CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67841/new/ https://reviews.llvm.org/D67841 Files: llvm/include/llvm/Analysis/TargetTransformInfo.h llvm/include/llvm/Analysis/TargetTransformInfoImpl.h llvm/include/llvm/CodeGen/BasicTTIImpl.h llvm/lib/Analysis/TargetTransformInfo.cpp llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp llvm/test/Transforms/SLPVectorizer/X86/bad-reduction.ll -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D67841.223613.patch Type: text/x-patch Size: 20744 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 08:34:55 2019 From: llvm-commits at lists.llvm.org (Roman Lebedev via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 15:34:55 +0000 (UTC) Subject: [PATCH] D53877: [IR] Strawman for dedicated FNeg IR instruction In-Reply-To: References: Message-ID: <426503660a42e9484b71cf1ef1e639c5@localhost.localdomain> lebedev.ri added inline comments. ================ Comment at: llvm/trunk/include/llvm/Bitcode/LLVMBitCodes.h:373 +enum UnaryOpcodes { + UNOP_NEG = 0 +}; ---------------- cameron.mcinally wrote: > lebedev.ri wrote: > > @cameron.mcinally also, shouldn't this be `UNOP_FNEG`? > I'm not sure. The BINOPs are overloaded for INT/FP types. E.g. BINOP_ADD is also FP. I suppose there are no plans for an INT UNOP_NEG though. Do you feel strongly about this change? I don't expect we'll ever have integer neg, so the current `UNOP_NEG` looks at least inconsistent (you use `fneg`) elsewhere. So yes please, let's change this :) Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D53877/new/ https://reviews.llvm.org/D53877 From llvm-commits at lists.llvm.org Mon Oct 7 08:36:14 2019 From: llvm-commits at lists.llvm.org (Dave Green via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 15:36:14 +0000 (UTC) Subject: [PATCH] D68337: [ARM][MVE] Enable extending masked loads In-Reply-To: References: Message-ID: dmgreen added inline comments. ================ Comment at: lib/Target/ARM/ARMISelLowering.cpp:8887 // zero too, and other values are lowered to a select. SDValue ZeroVec = DAG.getNode(ARMISD::VMOVIMM, dl, VT, DAG.getTargetConstant(0, dl, MVT::i32)); ---------------- samparker wrote: > dmgreen wrote: > > This is creating a zero vector of size VT, which is the size of what the masked loads returns. Should it instead be the size of the memory being loaded (because the extend happens to the passthru as well)? 
What happens if that isn't a legal value type? > Well, surely the result VT of the masked load has to match the VT of the passthru input. passthru is not about what memory is accessed, but what is written to the destination register. VMOVIMM will also generate the same zero value for all full width vector types so for vector widths less than 128-bits, the higher elements will be zeroed and that makes sense. For vectors wider than 128-bits, I think something would have gone before here. I'll add some tests for both these cases. Hmmm. Yeah OK. I see. The PassThru is explicitly extended in tryToFoldExtOfMaskedLoad? That makes sense, and the tests look OK. (There's one that is both sext and zext the same value, but that looks correct for where it is used). Test for masked loads/stores longer than 128 bits sounds like a good idea. We should ideally be able to deal with longer vector by splitting them just fine. ================ Comment at: lib/Target/ARM/ARMInstrMVE.td:5196 def : MVE_vector_maskedload_typed; + // Extending masked loads. + def : Pat<(v8i16 (sextmaskedload8 t2addrmode_imm7<0>:$addr, VCCR:$pred, ---------------- samparker wrote: > dmgreen wrote: > > There likely needs to be an anyext too. Can (or is it beneficial for) these be merged into the MVEExtLoad multiclass below? > As much as I don't like copy-paste, I do appreciate being able to read the code! I think adding to that multiclass is more hassle than it's worth :) Ha, Fair. I will agree with you there that sometimes more code is simpler. ================ Comment at: lib/Target/ARM/ARMTargetTransformInfo.cpp:511 + // Only support extending integers if the memory is aligned. + if ((EltWidth == 16 && Alignment < 2) || + (EltWidth == 32 && Alignment < 4)) ---------------- samparker wrote: > dmgreen wrote: > > samparker wrote: > > > dmgreen wrote: > > > > If this is coming from codegen, can the alignment here be 0?
I think in ISel it is always set (and clang will always set it), but it may not be guaranteed in llvm in general. > > > I can't see anything in the spec for any guarantees of these intrinsics, but for normal loads, it becomes defined by the target ABI. It's always safe for us to use an i8* accessor, so I don't see 0 being a problem here. > > Yeah. Alignment of 0 means ABI alignment, which means 8, not unaligned. > > > > I think it may be better to just check this alignment is always the case, getting rid of that weird "use i8's to load unaligned masked loads" thing. That was probably a bad idea, more trouble than it's worth. > > > > I think what will happen here at the moment is that the Vectorizer will call isLegalMaskedLoad with a scalar type and an alignment (which, let's say, is unaligned). That alignment won't be checked so the masked loads and stores will be created. Then when we get to the backend the legalizer will call this with a vector type and we'll hit this check, expanding out the masked load into that very inefficient bunch of code. Which is probably something that we want to avoid. > Hmmm, okay. I also can't see removing unaligned support having a big negative effect. Sounds like I need to add some vectorization tests too, unless we already have them? There was one added to the vectoriser tests, but not for alignment checks as far as I remember. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68337/new/ https://reviews.llvm.org/D68337 From llvm-commits at lists.llvm.org Mon Oct 7 08:41:45 2019 From: llvm-commits at lists.llvm.org (Pavel Labath via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 15:41:45 +0000 (UTC) Subject: [PATCH] D68210: Object/minidump: Add support for the MemoryInfoList stream In-Reply-To: References: Message-ID: <5565a3d10e3f77a3135eed841f25e82b@localhost.localdomain> labath updated this revision to Diff 223616. labath marked 6 inline comments as done. labath added a comment.
Address review comments Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68210/new/ https://reviews.llvm.org/D68210 Files: include/llvm/BinaryFormat/Minidump.h include/llvm/BinaryFormat/MinidumpConstants.def include/llvm/Object/Minidump.h lib/Object/Minidump.cpp unittests/Object/MinidumpTest.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D68210.223616.patch Type: text/x-patch Size: 21931 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 08:41:47 2019 From: llvm-commits at lists.llvm.org (Matt Arsenault via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 15:41:47 +0000 (UTC) Subject: [PATCH] D68582: GlobalISel: Add target pre-isel instructions Message-ID: arsenm created this revision. arsenm added reviewers: aemerson, aditya_nandakumar, paquette, dsanders, qcolombet. Herald added subscribers: Petar.Avramovic, volkan, rovka, nhaehnle, wdng, jvesely. Allows targets to introduce regbankselectable pseudo-instructions. Currently the closest feature to this is an intrinsic. However, this requires creating a public intrinsic declaration. This litters the public intrinsic namespace with operations we don't necessarily want to expose to IR producers, and would rather leave as private to the backend. Use a new instruction bit. A previous attempt tried to keep using enum value ranges, but it turned into a mess.
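
[Editorial illustration, not part of the patch description.] The contrast drawn in the D68582 summary, a per-instruction bit versus opcode enum value ranges, can be sketched roughly as follows. Every name and number here is hypothetical and chosen only for illustration; none of it is taken from the patch or from LLVM's MCInstrDesc.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified stand-ins; none of these names are taken from
// D68582 or from LLVM's MCInstrDesc.
namespace sketch {

// One bit per instruction property, stored in a descriptor flags word.
enum Flag : uint64_t {
  PreISelOpcode = uint64_t{1} << 0, // assumed name for the new bit
  MayLoad = uint64_t{1} << 1,
};

struct InstrDesc {
  unsigned Opcode;
  uint64_t Flags;

  // The property travels with the descriptor, wherever the opcode falls in
  // the opcode enum.
  bool isPreISelOpcode() const { return (Flags & PreISelOpcode) != 0; }
};

// Range-based alternative: correct only while every such opcode sits in one
// contiguous block of enum values (the bounds here are made up).
constexpr unsigned FirstGenericOpcode = 10;
constexpr unsigned LastGenericOpcode = 20;

inline bool isGenericByRange(unsigned Opcode) {
  return Opcode >= FirstGenericOpcode && Opcode <= LastGenericOpcode;
}

// Usage: a target opcode far outside the generic range can still be marked.
inline void example() {
  const InstrDesc TargetPseudo{500, PreISelOpcode};
  assert(TargetPseudo.isPreISelOpcode());
  assert(!isGenericByRange(TargetPseudo.Opcode));
}

} // namespace sketch
```

Under these assumptions, a target-defined pseudo can carry the bit without its opcode having to be placed inside a reserved numeric range, which is presumably what made the range-based attempt awkward.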
https://reviews.llvm.org/D68582 Files: include/llvm/CodeGen/MachineInstr.h include/llvm/MC/MCInstrDesc.h include/llvm/Target/GenericOpcodes.td include/llvm/Target/Target.td lib/CodeGen/GlobalISel/RegBankSelect.cpp lib/Target/AMDGPU/AMDGPUGISel.td lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp lib/Target/AMDGPU/SIInstrInfo.cpp lib/Target/AMDGPU/SIInstructions.td test/CodeGen/AMDGPU/GlobalISel/inst-select-amdgpu-ffbh-u32.mir test/CodeGen/AMDGPU/GlobalISel/regbankselect-amdgpu-ffbh-u32.mir utils/TableGen/CodeGenInstruction.cpp utils/TableGen/CodeGenInstruction.h utils/TableGen/InstrInfoEmitter.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D68582.223615.patch Type: text/x-patch Size: 10822 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 08:41:48 2019 From: llvm-commits at lists.llvm.org (Wei Mi via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 15:41:48 +0000 (UTC) Subject: [PATCH] D68440: [llvm-profdata] Minor format fix In-Reply-To: References: Message-ID: wmi accepted this revision. wmi added a comment. This revision is now accepted and ready to land. LGTM. Thanks for the fix. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68440/new/ https://reviews.llvm.org/D68440 From llvm-commits at lists.llvm.org Mon Oct 7 08:43:12 2019 From: llvm-commits at lists.llvm.org (Matt Arsenault via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 15:43:12 +0000 (UTC) Subject: [PATCH] D68583: AMDGPU: Fix i16 arithmetic pattern redundancy Message-ID: arsenm created this revision. arsenm added reviewers: rampitec, kzhuravl. Herald added subscribers: t-tye, tpr, dstuttard, yaxunl, nhaehnle, wdng, jvesely. There were 2 problems here. First, these patterns were duplicated to handle the inverted shift operands instead of using the commuted PatFrags. 
Second, the point of the zext folding patterns doesn't apply to the non-0ing high subtargets. They should be skipped instead of inserting the extension. The zeroing high code would be emitted when necessary anyway. This was also emitting unnecessary zexts in cases where the high bits were undefined. https://reviews.llvm.org/D68583 Files: lib/Target/AMDGPU/VOP2Instructions.td test/CodeGen/AMDGPU/GlobalISel/inst-select-ashr.s16.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-lshr.s16.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-shl.s16.mir test/CodeGen/AMDGPU/idot2.ll test/CodeGen/AMDGPU/idot4s.ll test/CodeGen/AMDGPU/idot4u.ll test/CodeGen/AMDGPU/idot8s.ll test/CodeGen/AMDGPU/idot8u.ll test/CodeGen/AMDGPU/preserve-hi16.ll test/CodeGen/AMDGPU/sdwa-peephole.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D68583.223618.patch Type: text/x-patch Size: 62074 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 08:46:11 2019 From: llvm-commits at lists.llvm.org (Pavel Labath via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 15:46:11 +0000 (UTC) Subject: [PATCH] D68210: Object/minidump: Add support for the MemoryInfoList stream In-Reply-To: References: Message-ID: <9fb9f507bdd332475dee6a3ba6f37ce3@localhost.localdomain> labath added a comment. Thanks for the review. I am fairly confident in the minidump details, as I based this code on the existing functional implementation in lldb, which I have also cross-referenced with the publicly available Microsoft documentation. @amccarth, @clayborg: do you want to have a look at the minidump details?
================ Comment at: lib/Object/Minidump.cpp:58 +MinidumpFile::getMemoryInfoList() const { + auto OptionalStream = getRawStream(StreamType::MemoryInfoList); + if (!OptionalStream) ---------------- jhenderson wrote: > I probably should have picked up on this in previous reviews, but this is too much `auto` for my liking, as it's not obvious from the call site what `getRawStream` returns. Done. I've also changed the other calls to getRawStream. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68210/new/ https://reviews.llvm.org/D68210 From llvm-commits at lists.llvm.org Mon Oct 7 08:46:55 2019 From: llvm-commits at lists.llvm.org (Erich Keane via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 15:46:55 +0000 (UTC) Subject: [PATCH] D68584: Fix Calling Convention through aliases Message-ID: erichkeane created this revision. erichkeane added reviewers: rnk, pcc. Herald added a subscriber: hiraditya. Herald added a project: LLVM. r369697 changed the behavior of stripPointerCasts to no longer include aliases. However, the code in CGDeclCXX.cpp's createAtExitStub counted on the looking through aliases to properly set the calling convention of a call. The result of the change was that the calling convention mismatch of the call would be replaced with a llvm.trap, causing a runtime crash. https://reviews.llvm.org/D68584 Files: clang/lib/CodeGen/CGDeclCXX.cpp clang/test/CodeGenCXX/call-conv-thru-alias.cpp llvm/include/llvm/IR/Value.h llvm/lib/IR/Value.cpp -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68584.223617.patch Type: text/x-patch Size: 4052 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 08:49:07 2019 From: llvm-commits at lists.llvm.org (Aditya Nandakumar via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 15:49:07 +0000 (UTC) Subject: [PATCH] D68582: GlobalISel: Add target pre-isel instructions In-Reply-To: References: Message-ID: <3e1a72ef2789566a3cb96d6c74a2be46@localhost.localdomain> aditya_nandakumar added a comment. I really like the approach here. Thanks for working on this. LGTM. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68582/new/ https://reviews.llvm.org/D68582 From llvm-commits at lists.llvm.org Mon Oct 7 08:49:39 2019 From: llvm-commits at lists.llvm.org (Sam Parker via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 15:49:39 +0000 (UTC) Subject: [PATCH] D68337: [ARM][MVE] Enable extending masked loads In-Reply-To: References: Message-ID: <93b4895508b32071fd8178c883c5c5d4@localhost.localdomain> samparker marked an inline comment as done. samparker added inline comments. ================ Comment at: lib/Target/ARM/ARMISelLowering.cpp:8887 // zero too, and other values are lowered to a select. SDValue ZeroVec = DAG.getNode(ARMISD::VMOVIMM, dl, VT, DAG.getTargetConstant(0, dl, MVT::i32)); ---------------- dmgreen wrote: > samparker wrote: > > dmgreen wrote: > > > This is creating a zero vector of size VT, which is the size of what the masked loads returns. Should it instead be the size of the memory being loaded (because the extend happens to the passthru as well)? What happens if that isn't a legal value type? > > Well, surely the result VT of the masked load has to match the VT of the passthru input. passthru is not about what memory is accessed, but what is written to the destination register. VMOVIMM will also generate the same zero value for all full width vector types so for vector widths less than 128-bits, the higher elements will be zeroed and that makes sense.
For vectors wider than 128-bits, I think something would have gone before here. I'll add some tests for both these cases. > Hmmm. Yeah OK. I see. The PassThru is explicitly extended in tryToFoldExtOfMaskedLoad? > > That makes sense, and the tests look OK. (There's one that is both sext and zext the same value, but that looks correct for where it is used). > > Test for masked loads/stores longer than 128 bits sounds like a good idea. We should ideally be able to deal with longer vector by splitting them just fine. At some point, I was extending passthru... but it seems that is no longer the case! Our VMOVIMM is probably keeping us correct and if I extend it in dag combine, hopefully we won't need the bitcast handling here anymore. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68337/new/ https://reviews.llvm.org/D68337 From llvm-commits at lists.llvm.org Mon Oct 7 08:51:41 2019 From: llvm-commits at lists.llvm.org (Derek Schuff via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 15:51:41 +0000 (UTC) Subject: [PATCH] D68553: [WebAssembly] Add memory intrinsics handling to mayThrow() In-Reply-To: References: Message-ID: dschuff accepted this revision. dschuff added a comment. This revision is now accepted and ready to land. LGTM. It does seem probably worthwhile to go ahead and try out a patch for CallLoweringInfo at some point. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68553/new/ https://reviews.llvm.org/D68553 From llvm-commits at lists.llvm.org Mon Oct 7 08:54:04 2019 From: llvm-commits at lists.llvm.org (Dragan Mladjenovic via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 15:54:04 +0000 (UTC) Subject: [PATCH] D68542: [Mips] Always save RA when disabling frame pointer elimination In-Reply-To: References: Message-ID: <65101f21fa9a94a5b339d4dcd55df941@localhost.localdomain> draganm added a comment. 
In D68542#1697732, @jrtc27 wrote: > In D68542#1697636, @draganm wrote: > > > I don't think you can have frame-pointer based stack unwinding under current Mips ABIs, albeit this might be useful for some stack scan based unwind. Not sure tho. > > > You can most of the time, you just have to scan backwards to find the function prologue. Yes, it can break, but unless you have full DWARF info you can't do much better. Both FreeBSD (sys/mips/mips/db_trace.c) and Linux (arch/mips/kernel/process.c) do instruction-based unwinding on MIPS to get a good-enough backtrace on panic, so without this they can end up terminating the backtrace early. In particular, if you want a specific instance of the issue that motivated this patch, on FreeBSD, they have a `panic` which calls `vpanic` (much like `printf` vs `vprintf`), but due to being marked `noreturn`, `$ra` is dead and thus being clobbered by the call doesn't force a save like normal, so *every* panic ends up with a useless backtrace terminating at `panic`. I see. I haven't looked further, but just removing nounwind from the callee makes the caller save $ra. Thanks for the concrete example. The patch makes sense to me now. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68542/new/ https://reviews.llvm.org/D68542 From llvm-commits at lists.llvm.org Mon Oct 7 08:55:39 2019 From: llvm-commits at lists.llvm.org (Sam Parker via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 15:55:39 +0000 (UTC) Subject: [PATCH] D68337: [ARM][MVE] Enable extending masked loads In-Reply-To: References: Message-ID: samparker marked an inline comment as done. samparker added inline comments. ================ Comment at: lib/Target/ARM/ARMISelLowering.cpp:8887 // zero too, and other values are lowered to a select.
SDValue ZeroVec = DAG.getNode(ARMISD::VMOVIMM, dl, VT, DAG.getTargetConstant(0, dl, MVT::i32)); ---------------- samparker wrote: > dmgreen wrote: > > samparker wrote: > > > dmgreen wrote: > > > > This is creating a zero vector of size VT, which is the size of what the masked loads returns. Should it instead be the size of the memory being loaded (because the extend happens to the passthru as well)? What happens if that isn't a legal value type? > > > Well, surely the result VT of the masked load has to match the VT of the passthru input. passthru is not about what memory is accessed, but what is written to the destination register. VMOVIMM will also generate the same zero value for all full width vector types so for vector widths less than 128-bits, the higher elements will be zeroed and that makes sense. For vectors wider than 128-bits, I think something would have gone before here. I'll add some tests for both these cases. > > Hmmm. Yeah OK. I see. The PassThru is explicitly extended in tryToFoldExtOfMaskedLoad? > > > > That makes sense, and the tests look OK. (There's one that is both sext and zext the same value, but that looks correct for where it is used). > > > > Test for masked loads/stores longer than 128 bits sounds like a good idea. We should ideally be able to deal with longer vector by splitting them just fine. > At some point, I was extending passthru... but it seems that is no longer the case! Our VMOVIMM is probably keeping us correct and if I extend it in dag combine, hopefully we won't need the bitcast handling here anymore. Ah, no. I was just being blind, passthru is extended.
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68337/new/ https://reviews.llvm.org/D68337 From llvm-commits at lists.llvm.org Mon Oct 7 08:55:52 2019 From: llvm-commits at lists.llvm.org (Steven Wu via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 15:55:52 +0000 (UTC) Subject: [PATCH] D59709: [ThinLTO] Auto-hide prevailing linkonce_odr only when all copies eligible In-Reply-To: References: Message-ID: steven_wu added inline comments. Herald added a subscriber: hiraditya. ================ Comment at: llvm/trunk/lib/LTO/LTO.cpp:334 + S->setCanAutoHide(VI.canAutoHide() && + !GUIDPreservedSymbols.count(VI.getGUID())); + } ---------------- tejohnson wrote: > steven_wu wrote: > > tejohnson wrote: > > > steven_wu wrote: > > > > tejohnson wrote: > > > > > steven_wu wrote: > > > > > > tejohnson wrote: > > > > > > > tejohnson wrote: > > > > > > > > steven_wu wrote: > > > > > > > > > The regression I saw is exactly the situation mentioned in the comments but I am not so sure about the conclusion here. For ld64, it can see whether a symbol can be auto hide or not, and it prefers the one that is not auto hide. This means if there are other copies of the symbol that is not autohide outside the summary and not autohide, the one can be autohide should never be prevailing. > > > > > > > > > > > > > > > > > > Is that the case for other linker? > > > > > > > > > The regression I saw is exactly the situation mentioned in the comments but I am not so sure about the conclusion here. > > > > > > > > > > > > > > > > By the situation mentioned in the comments I assume you mean the part about symbols in the GUIDPreservedSymbols set because they are visible outside the summary? In your situation, why are there so many symbols in the preserved symbols set? Are there a lot of symbols in native code being linked in? 
> > > > > > > > > > > > > > > > I guess it is different in lld since in the case I encountered (in an internal application), the linkonce_odr copy was prevailing, not the weak_odr copy. If ld64 guarantees that the autohide (linkonce_odr) copy will never be prevailing if there is an autohide (weak_odr) copy available somewhere, then we may need to figure out a way to communicate that. > > > > > > > > If ld64 guarantees that the autohide (linkonce_odr) copy will never be prevailing if there is an autohide (weak_odr) copy available somewhere > > > > > > > > > > > > > > I meant if there is a *non*-autohide (weak_odr) copy available of course... > > > > > > > By the situation mentioned in the comments I assume you mean the part about symbols in the GUIDPreservedSymbols set because they are visible outside the summary? In your situation, why are there so many symbols in the preserved symbols set? Are there a lot of symbols in native code being linked in? > > > > > > > > > > > > > > > > > > > Yes. The situation is the code is linking a static c++ library (native code) with linkonce autohide libcxx symbols. Since ld64 decides to coalesce away the copy in the native code and use the version inside the LTO, it has to add it to mustPreserve symbols. Now even both copies can be autohide, it is not autohide because it is in the preserve list. > > > > > > > > > > > > > > > > > > > I guess it is different in lld since in the case I encountered (in an internal application), the linkonce_odr copy was prevailing, not the weak_odr copy. If ld64 guarantees that the autohide (linkonce_odr) copy will never be prevailing if there is an autohide (weak_odr) copy available somewhere, then we may need to figure out a way to communicate that. > > > > > > > > > > > > > > > > > > > This sounds like a linker bug but I am not familiar with lld LTO pipeline to be sure. Should lld mark weak_odr copy as prevailing? 
I also would like to understand how lld is using mustPreserve symbols because I thought lld doesn't need it. > > > > > > For ld64, it uses mustPreserve symbol to communicate if the weak/linkonce symbols are prevailing in this case. > > > > > > > > > > > In the new LTO API, symbols are added to the GUIDPreservedSymbols set if they are visible outside the summary (either in a bitcode file without a summary or a native object). > > > > > > > > > > We could do better if the linker indicated whether these symbols in objects without summaries were eligible for auto hide, which in the compiler is based both on the linkage type and on it having the global unnamed addr flag - does the linker know both of these for native objects? > > > > > > > > > > I'm not an lld expert, but looking back at my original internal discussion with @pcc on this issue, it appears to be a difference in ld64 (Mach-O) semantics and ELF semantics. Here is an excerpt: > > > > > > > > > > pcc wrote: > > > > > > tejohnson wrote: > > > > > >> Is there some other criteria that should be used to prevent the symbol from being marked Hidden in this case? > > > > > > > > > > > > It seems that the rule should be: if we see at least one weak_odr definition (i.e. explicit instantiation) that should prevent the relaxation from default to hidden. The reason is that the weak_odr makes it possible to have another translation unit that does not implicitly instantiate the function, including a translation unit in another DSO. Similar logic is implemented in ld64: if one symbol is weak_odr unnamed_addr (i.e. non-auto-hide) and the other is linkonce_odr unnamed_addr (i.e. auto-hide), the linker selects the non-auto-hide one. 
> > > > > > https://github.com/apple-opensource/ld64/blob/master/src/ld/SymbolTable.cpp#L228 > > > > > > > > > > > > Unfortunately, this is at odds with ELF semantics where the "most hidden" visibility wins, so I think this means that we would not be able to mark linkonce_odr unnamed_addr symbols as hidden in the non-LTO case, as was suggested at one point on llvm-dev I believe. > > > > > > > > > > > > > > > So perhaps the best thing to do in this case is have an interface for the linker to indicate the correct semantics with regard to whether most or least hidden wins? > > > > When you say visible outside the summary, what exactly is the definition of `visible`? I failed the find the relevant in lld that interacts with GUIDPreservedSymbols. For ld64, weak symbols are only in the GUIDPreservedSymbols when it is in the final image, so I am not sure if we share the same definition. If the weak symbols are `visible` outside the summary just to get coalesced away later, it doesn't have to be preserved. > > > > > > > > I remember there were few other semantics difference in symbol resolution rules. IRLinker follows ELF visibility rules as well. It might be a good idea to clean that up sometimes. > > > > > > > > If the ELF is choosing the most hidden version, why is the case described in the commit message a problem? Is it because weak_odr beats linkonce_odr before visibility comes into consideration? > > > > When you say visible outside the summary, what exactly is the definition of visible? I failed the find the relevant in lld that interacts with GUIDPreservedSymbols. For ld64, weak symbols are only in the GUIDPreservedSymbols when it is in the final image, so I am not sure if we share the same definition. If the weak symbols are visible outside the summary just to get coalesced away later, it doesn't have to be preserved. 
> > > > > > lld doesn't create this set directly, it is done in the new LTO API here: > > > http://llvm-cs.pcc.me.uk/lib/LTO/LTO.cpp#910 > > > > > > And VisibleOutsideSummary is set here: > > > http://llvm-cs.pcc.me.uk/lib/LTO/LTO.cpp#539 > > > > > > VisibleToRegularObj is set by lld if the symbol is used in a regular (native) object > > > InSummary is true of the bitcode object had a summary > > > > > > > If the ELF is choosing the most hidden version, why is the case described in the commit message a problem? Is it because weak_odr beats linkonce_odr before visibility comes into consideration? > > > > > > ELF is picking the linkonce_odr symbol (most hidden) as prevailing. If you look at the change in FunctionImport.cpp, before this patch we were unconditionally marking the prevailing linkonce_odr copy as hidden when "promoting" to weak_odr. > > > > > > In the case that was failing we were first linking into a (native) shared library. The translation units in the shared library link contained both some implicit instantiations (linkonce_odr) and an explicit instantiation (weak_odr). Without LTO, the weak_odr would have resulted in a non-hidden weak definition in the shared library. Then this shared library was linked with code that contained references to this symbol that were expecting to be resolved by the explicit instantiation. If we instead mark that symbol as hidden in the shared library, those references are no longer satisfied. > > > > > > ELF is picking the linkonce_odr symbol (most hidden) as prevailing. If you look at the change in FunctionImport.cpp, before this patch we were unconditionally marking the prevailing linkonce_odr copy as hidden when "promoting" to weak_odr. > > > > > > In the case that was failing we were first linking into a (native) shared library. The translation units in the shared library link contained both some implicit instantiations (linkonce_odr) and an explicit instantiation (weak_odr). 
Without LTO, the weak_odr would have resulted in a non-hidden weak definition in the shared library. Then this shared library was linked with code that contained references to this symbol that were expecting to be resolved by the explicit instantiation. If we instead mark that symbol as hidden in the shared library, those references are no longer satisfied. > > > > > > > I guess I was missing something here for how ELF handles weak links during static link time. It sounds it is correct the the linkonce_odr version is prevailing under ELF rule but I don't understand the reason why the weak_odr version is expected to be exported? > > > > Here is my understanding for ELF linking and let me know if it is correct or not. > > > > For linking without LTO: > > * There is no bits in ELF object file to indicate auto hide. linkonce_odr and weak_odr are both weak external symbols so you get a weak copy in the dylib. > > > > For linking with fullLTO: > > * IRLinker will pick weak_odr linkage, remove unamed_addr, resulting the same output as native > > > > For linking with thinLTO (no native or fullLTO): > > * autohide disabled because weak_odr is cannot autohide, weak external symbol produced. > > Since every situation produces weak external symbol, would it be easier just make lld mark weak_odr as prevailing (can you tell linkonce_odr from weak_odr in ELF object)? > > > > Because native object for ELF doesn't have autohide optimization, missing autohide during thinLTO is not a regression. > Sorry I forgot I hadn't responded here. @pcc should be better able to answer some of your specific questions about ELF and lld. > > > For linking with thinLTO (no native or fullLTO): > > - autohide disabled because weak_odr is cannot autohide, weak external symbol produced. > > This should also be the same as a non-LTO after the link completes - there is a non-hidden weak symbol in the library (the prevailing linkonce_odr that was "promoted" to weak_odr and now no longer marked hidden). 
> > > Since every situation produces weak external symbol, would it be easier just make lld mark weak_odr as prevailing (can you tell linkonce_odr from weak_odr in ELF object)? > > I am not an expert on ELF semantics, deferring to @pcc here. I also forgot to mention this here: I figured out later that even for ld64, it should behave correctly even when libLTO is not sure whether the symbol can autohide or not. It should just treat such symbols the same as local_unnamed_addr. I don't think we need to touch the logic here. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D59709/new/ https://reviews.llvm.org/D59709 From llvm-commits at lists.llvm.org Mon Oct 7 08:57:49 2019 From: llvm-commits at lists.llvm.org (Francis Visoiu Mistrih via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 15:57:49 +0000 (UTC) Subject: [PATCH] D68571: [Remarks] Pass StringBlockValue as StringRef. In-Reply-To: References: Message-ID: <74ed1813e10d1e42b136dbbb88eafb1c@localhost.localdomain> thegameg accepted this revision. thegameg added a comment. This revision is now accepted and ready to land. LGTM, thanks. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68571/new/ https://reviews.llvm.org/D68571 From llvm-commits at lists.llvm.org Mon Oct 7 08:58:33 2019 From: llvm-commits at lists.llvm.org (Victor Huang via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 15:58:33 +0000 (UTC) Subject: [PATCH] D63676: Disable hosting MI to hotter basic blocks In-Reply-To: References: Message-ID: NeHuang added a comment. - Merged to latest code base and compiled with -O3 and PGO - Collected stats for SPECInt and SPECFP benchmarks (SPEC2017) with baseline and patch.
| **Benchmark** | **Number of machine instructions hoisted out of loops (Baseline)** | **Number of machine instructions hoisted out of loops (Patch)** | **Number of instructions not hoisted to hotter destination (Patch)** |
| SPECInt | 206914 | 132007 | 476157 |
| SPECFP | 42591 | 36383 | 68215 |

- With the feature enabled, found performance gain for SPEC benchmarks, e.g. 2.1% for perlbench_r and 1.6% for povray_r. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D63676/new/ https://reviews.llvm.org/D63676 From llvm-commits at lists.llvm.org Mon Oct 7 09:05:27 2019 From: llvm-commits at lists.llvm.org (Cameron McInally via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 16:05:27 +0000 (UTC) Subject: [PATCH] D53877: [IR] Strawman for dedicated FNeg IR instruction In-Reply-To: References: Message-ID: <1edf4b59a99216b310768686ae9e9e5d@localhost.localdomain> cameron.mcinally marked an inline comment as done. cameron.mcinally added inline comments. ================ Comment at: llvm/trunk/include/llvm-c/Core.h:1523-1524 macro(UndefValue) \ macro(Instruction) \ macro(BinaryOperator) \ macro(CallInst) \ ---------------- cameron.mcinally wrote: > lebedev.ri wrote: > > @cameron.mcinally Should anything have been added here for `UnaryOperator` ? > Yes, I believe you're correct. Will add that under a separate Diff. Thanks. @lebedev.ri, do you know of any existing tests for these macros? I see a number of uses, but no unittests/etc. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D53877/new/ https://reviews.llvm.org/D53877 From llvm-commits at lists.llvm.org Mon Oct 7 09:07:31 2019 From: llvm-commits at lists.llvm.org (David Greene via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 16:07:31 +0000 (UTC) Subject: [PATCH] D68272: [UpdateCCTestChecks] Detect function mangled name on separate line In-Reply-To: References: Message-ID: <8fe3f16cc1ec702bb08983ea5f0da212@localhost.localdomain> greened added a comment.
In D68272#1690823, @MaskRay wrote:

> Can you give an example demonstrating the issue?

This test:

  /***********************************/
  /* */
  /* A test. */
  /* */
  /***********************************/
  int foo(void) { return 1; }

  c-index-test -write-pch test.pch test.c
  c-index-test -test-print-mangle test.pch

Produces this:

  FunctionDecl=foo:6:5 (Definition) RawComment=[/***********************************/] RawCommentRange=[5:1 - 5:38] BriefComment=[********************************] FullCommentAsHTML=[

********************************

] FullCommentAsXML=[fooc:@F at fooint foo()********************************] // CHECK: CommentAST=[ // CHECK: (CXComment_FullComment // CHECK: (CXComment_Paragraph // CHECK: (CXComment_Text Text=[********************************])))] [mangled=foo] Note that `mangled=foo` appears on a separate line. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68272/new/ https://reviews.llvm.org/D68272 From llvm-commits at lists.llvm.org Mon Oct 7 09:12:37 2019 From: llvm-commits at lists.llvm.org (Wei Mi via llvm-commits) Date: Mon, 07 Oct 2019 16:12:37 -0000 Subject: [llvm] r373914 - [SampleFDO] Add compression support for any section in ExtBinary profile format Message-ID: <20191007161237.B1A938CAC0@lists.llvm.org> Author: wmi Date: Mon Oct 7 09:12:37 2019 New Revision: 373914 URL: http://llvm.org/viewvc/llvm-project?rev=373914&view=rev Log: [SampleFDO] Add compression support for any section in ExtBinary profile format Previously ExtBinary profile format only supports compression using zlib for profile symbol list. In this patch, we extend the compression support to any section. User can select some or all of the sections to compress. In an experiment, for a 45M profile in ExtBinary format, compressing name table reduced its size to 24M, and compressing all the sections reduced its size to 11M. 
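As a rough illustration of the per-section selection described in the log above, the sketch below mirrors the flag helpers that the diff adds to SampleProf.h, using simplified stand-in types so that it compiles without LLVM. The free functions makeDefaultLayout, setToCompressSection and setToCompressAllSections are hypothetical stand-alone versions written for this example (in the patch the latter two are member functions of SampleProfileWriterExtBinaryBase); this is not the actual LLVM code.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Simplified stand-ins for the types touched by the patch. The real
// definitions live in llvm/ProfileData/SampleProf.h.
enum SecType {
  SecProfSummary,
  SecNameTable,
  SecLBRProfile,
  SecProfileSymbolList
};

struct SecHdrTableEntry {
  SecType Type;
  uint64_t Flags;
  uint64_t Offset;
  uint64_t Size;
};

enum SecFlags { SecFlagInValid = 0, SecFlagCompress = (1 << 0) };

// Flag helpers, following the shape of the ones added by the patch.
inline void addSecFlags(SecHdrTableEntry &Entry, uint64_t Flags) {
  Entry.Flags |= Flags;
}

inline void removeSecFlags(SecHdrTableEntry &Entry, uint64_t Flags) {
  Entry.Flags &= ~Flags;
}

inline bool hasSecFlag(const SecHdrTableEntry &Entry, SecFlags Flag) {
  return (Entry.Flags & Flag) != 0;
}

// Hypothetical helper: the default section order used by the ExtBinary
// writer in the patch, with all flags cleared.
inline std::vector<SecHdrTableEntry> makeDefaultLayout() {
  return {{SecProfSummary, 0, 0, 0},
          {SecNameTable, 0, 0, 0},
          {SecLBRProfile, 0, 0, 0},
          {SecProfileSymbolList, 0, 0, 0}};
}

// Hypothetical free-function version: mark one section type for compression.
inline void setToCompressSection(std::vector<SecHdrTableEntry> &Layout,
                                 SecType Type) {
  for (auto &Entry : Layout)
    if (Entry.Type == Type)
      addSecFlags(Entry, SecFlagCompress);
}

// Hypothetical free-function version: mark every section for compression.
inline void setToCompressAllSections(std::vector<SecHdrTableEntry> &Layout) {
  for (auto &Entry : Layout)
    addSecFlags(Entry, SecFlagCompress);
}
```

Under these assumptions, a caller would build a layout with makeDefaultLayout(), call setToCompressSection(Layout, SecNameTable) to compress only the name table, or setToCompressAllSections(Layout) to compress everything, and then check hasSecFlag(Entry, SecFlagCompress) when deciding whether a section's bytes need to go through zlib.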
Differential Revision: https://reviews.llvm.org/D68253 Added: llvm/trunk/test/Transforms/SampleProfile/profile-format-compress.ll llvm/trunk/test/tools/llvm-profdata/profile-symbol-list-compress.test llvm/trunk/test/tools/llvm-profdata/roundtrip-compress.test Modified: llvm/trunk/include/llvm/ProfileData/SampleProf.h llvm/trunk/include/llvm/ProfileData/SampleProfReader.h llvm/trunk/include/llvm/ProfileData/SampleProfWriter.h llvm/trunk/lib/ProfileData/SampleProf.cpp llvm/trunk/lib/ProfileData/SampleProfReader.cpp llvm/trunk/lib/ProfileData/SampleProfWriter.cpp llvm/trunk/test/Transforms/SampleProfile/compressed-profile-symbol-list.ll llvm/trunk/test/Transforms/SampleProfile/uncompressed-profile-symbol-list.ll llvm/trunk/tools/llvm-profdata/llvm-profdata.cpp Modified: llvm/trunk/include/llvm/ProfileData/SampleProf.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/ProfileData/SampleProf.h?rev=373914&r1=373913&r2=373914&view=diff ============================================================================== --- llvm/trunk/include/llvm/ProfileData/SampleProf.h (original) +++ llvm/trunk/include/llvm/ProfileData/SampleProf.h Mon Oct 7 09:12:37 2019 @@ -145,11 +145,25 @@ static inline std::string getSecName(Sec // and SampleProfileExtBinaryBaseWriter. struct SecHdrTableEntry { SecType Type; - uint64_t Flag; + uint64_t Flags; uint64_t Offset; uint64_t Size; }; +enum SecFlags { SecFlagInValid = 0, SecFlagCompress = (1 << 0) }; + +static inline void addSecFlags(SecHdrTableEntry &Entry, uint64_t Flags) { + Entry.Flags |= Flags; +} + +static inline void removeSecFlags(SecHdrTableEntry &Entry, uint64_t Flags) { + Entry.Flags &= ~Flags; +} + +static inline bool hasSecFlag(SecHdrTableEntry &Entry, SecFlags Flag) { + return Entry.Flags & Flag; +} + /// Represents the relative location of an instruction. 
/// /// Instruction locations are specified by the line offset from the @@ -643,9 +657,9 @@ public: unsigned size() { return Syms.size(); } void setToCompress(bool TC) { ToCompress = TC; } + bool toCompress() { return ToCompress; } - std::error_code read(uint64_t CompressSize, uint64_t UncompressSize, - const uint8_t *Data); + std::error_code read(const uint8_t *Data, uint64_t ListSize); std::error_code write(raw_ostream &OS); void dump(raw_ostream &OS = dbgs()) const; Modified: llvm/trunk/include/llvm/ProfileData/SampleProfReader.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/ProfileData/SampleProfReader.h?rev=373914&r1=373913&r2=373914&view=diff ============================================================================== --- llvm/trunk/include/llvm/ProfileData/SampleProfReader.h (original) +++ llvm/trunk/include/llvm/ProfileData/SampleProfReader.h Mon Oct 7 09:12:37 2019 @@ -488,6 +488,14 @@ public: /// possible to define other types of profile inherited from /// SampleProfileReaderExtBinaryBase/SampleProfileWriterExtBinaryBase. 
class SampleProfileReaderExtBinaryBase : public SampleProfileReaderBinary { +private: + std::error_code decompressSection(const uint8_t *SecStart, + const uint64_t SecSize, + const uint8_t *&DecompressBuf, + uint64_t &DecompressBufSize); + + BumpPtrAllocator Allocator; + protected: std::vector SecHdrTable; std::unique_ptr ProfSymList; @@ -518,7 +526,7 @@ private: virtual std::error_code verifySPMagic(uint64_t Magic) override; virtual std::error_code readOneSection(const uint8_t *Start, uint64_t Size, SecType Type) override; - std::error_code readProfileSymbolList(); + std::error_code readProfileSymbolList(uint64_t Size); public: SampleProfileReaderExtBinary(std::unique_ptr B, LLVMContext &C, Modified: llvm/trunk/include/llvm/ProfileData/SampleProfWriter.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/ProfileData/SampleProfWriter.h?rev=373914&r1=373913&r2=373914&view=diff ============================================================================== --- llvm/trunk/include/llvm/ProfileData/SampleProfWriter.h (original) +++ llvm/trunk/include/llvm/ProfileData/SampleProfWriter.h Mon Oct 7 09:12:37 2019 @@ -143,14 +143,16 @@ class SampleProfileWriterRawBinary : pub class SampleProfileWriterExtBinaryBase : public SampleProfileWriterBinary { using SampleProfileWriterBinary::SampleProfileWriterBinary; - public: virtual std::error_code write(const StringMap &ProfileMap) override; + void setToCompressAllSections(); + void setToCompressSection(SecType Type); + protected: - uint64_t markSectionStart(); - uint64_t addNewSection(SecType Sec, uint64_t SectionStart); + uint64_t markSectionStart(SecType Type); + std::error_code addNewSection(SecType Sec, uint64_t SectionStart); virtual void initSectionLayout() = 0; virtual std::error_code writeSections(const StringMap &ProfileMap) = 0; @@ -158,34 +160,52 @@ protected: // Specifiy the section layout in the profile. 
Note that the order in // SecHdrTable (order to collect sections) may be different from the // order in SectionLayout (order to write out sections into profile). - SmallVector SectionLayout; + SmallVector SectionLayout; private: void allocSecHdrTable(); std::error_code writeSecHdrTable(); virtual std::error_code writeHeader(const StringMap &ProfileMap) override; - + void addSectionFlags(SecType Type, SecFlags Flags); + SecHdrTableEntry &getEntryInLayout(SecType Type); + std::error_code compressAndOutput(); + + // We will swap the raw_ostream held by LocalBufStream and that + // held by OutputStream if we try to add a section which needs + // compression. After the swap, all the data written to output + // will be temporarily buffered into the underlying raw_string_ostream + // originally held by LocalBufStream. After the data writing for the + // section is completed, compress the data in the local buffer, + // swap the raw_ostream back and write the compressed data to the + // real output. + std::unique_ptr LocalBufStream; // The location where the output stream starts. uint64_t FileStart; // The location in the output stream where the SecHdrTable should be // written to. uint64_t SecHdrTableOffset; + // Initial Section Flags setting. 
std::vector SecHdrTable; }; class SampleProfileWriterExtBinary : public SampleProfileWriterExtBinaryBase { - using SampleProfileWriterExtBinaryBase::SampleProfileWriterExtBinaryBase; - public: + SampleProfileWriterExtBinary(std::unique_ptr &OS) + : SampleProfileWriterExtBinaryBase(OS) { + initSectionLayout(); + } + virtual void setProfileSymbolList(ProfileSymbolList *PSL) override { ProfSymList = PSL; }; private: virtual void initSectionLayout() override { - SectionLayout = {SecProfSummary, SecNameTable, SecLBRProfile, - SecProfileSymbolList}; + SectionLayout = {{SecProfSummary}, + {SecNameTable}, + {SecLBRProfile}, + {SecProfileSymbolList}}; }; virtual std::error_code writeSections(const StringMap &ProfileMap) override; Modified: llvm/trunk/lib/ProfileData/SampleProf.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/ProfileData/SampleProf.cpp?rev=373914&r1=373913&r2=373914&view=diff ============================================================================== --- llvm/trunk/lib/ProfileData/SampleProf.cpp (original) +++ llvm/trunk/lib/ProfileData/SampleProf.cpp Mon Oct 7 09:12:37 2019 @@ -15,7 +15,6 @@ #include "llvm/Config/llvm-config.h" #include "llvm/IR/DebugInfoMetadata.h" #include "llvm/Support/Compiler.h" -#include "llvm/Support/Compression.h" #include "llvm/Support/Debug.h" #include "llvm/Support/Error.h" #include "llvm/Support/ErrorHandling.h" @@ -198,66 +197,34 @@ FunctionSamples::findFunctionSamples(con LLVM_DUMP_METHOD void FunctionSamples::dump() const { print(dbgs(), 0); } #endif -std::error_code ProfileSymbolList::read(uint64_t CompressSize, - uint64_t UncompressSize, - const uint8_t *Data) { +std::error_code ProfileSymbolList::read(const uint8_t *Data, + uint64_t ListSize) { const char *ListStart = reinterpret_cast(Data); - // CompressSize being non-zero means the profile is compressed and - // needs to be uncompressed first. 
- if (CompressSize) { - if (!llvm::zlib::isAvailable()) - return sampleprof_error::zlib_unavailable; - - StringRef CompressedStrings(reinterpret_cast(Data), - CompressSize); - char *Buffer = Allocator.Allocate(UncompressSize); - size_t UCSize = UncompressSize; - llvm::Error E = zlib::uncompress(CompressedStrings, Buffer, UCSize); - if (E) - return sampleprof_error::uncompress_failed; - ListStart = Buffer; - } - uint64_t Size = 0; - while (Size < UncompressSize) { + while (Size < ListSize) { StringRef Str(ListStart + Size); add(Str); Size += Str.size() + 1; } + if (Size != ListSize) + return sampleprof_error::malformed; return sampleprof_error::success; } std::error_code ProfileSymbolList::write(raw_ostream &OS) { - // Sort the symbols before doing compression. It will make the - // compression much more effective. + // Sort the symbols before output. If doing compression. + // It will make the compression much more effective. std::vector SortedList; SortedList.insert(SortedList.begin(), Syms.begin(), Syms.end()); llvm::sort(SortedList); - std::string UncompressedStrings; + std::string OutputString; for (auto &Sym : SortedList) { - UncompressedStrings.append(Sym.str()); - UncompressedStrings.append(1, '\0'); + OutputString.append(Sym.str()); + OutputString.append(1, '\0'); } - if (ToCompress) { - if (!llvm::zlib::isAvailable()) - return sampleprof_error::zlib_unavailable; - SmallString<128> CompressedStrings; - llvm::Error E = zlib::compress(UncompressedStrings, CompressedStrings, - zlib::BestSizeCompression); - if (E) - return sampleprof_error::compress_failed; - encodeULEB128(UncompressedStrings.size(), OS); - encodeULEB128(CompressedStrings.size(), OS); - OS << CompressedStrings.str(); - } else { - encodeULEB128(UncompressedStrings.size(), OS); - // If profile symbol list is not compressed, we will still save - // a compressed size value, but the value of the size is 0. 
- encodeULEB128(0, OS); - OS << UncompressedStrings; - } + OS << OutputString; return sampleprof_error::success; } Modified: llvm/trunk/lib/ProfileData/SampleProfReader.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/ProfileData/SampleProfReader.cpp?rev=373914&r1=373913&r2=373914&view=diff ============================================================================== --- llvm/trunk/lib/ProfileData/SampleProfReader.cpp (original) +++ llvm/trunk/lib/ProfileData/SampleProfReader.cpp Mon Oct 7 09:12:37 2019 @@ -26,6 +26,7 @@ #include "llvm/IR/ProfileSummary.h" #include "llvm/ProfileData/ProfileCommon.h" #include "llvm/ProfileData/SampleProf.h" +#include "llvm/Support/Compression.h" #include "llvm/Support/ErrorOr.h" #include "llvm/Support/LEB128.h" #include "llvm/Support/LineIterator.h" @@ -471,6 +472,7 @@ std::error_code SampleProfileReaderExtBinary::readOneSection(const uint8_t *Start, uint64_t Size, SecType Type) { Data = Start; + End = Start + Size; switch (Type) { case SecProfSummary: if (std::error_code EC = readSummary()) @@ -487,7 +489,7 @@ SampleProfileReaderExtBinary::readOneSec } break; case SecProfileSymbolList: - if (std::error_code EC = readProfileSymbolList()) + if (std::error_code EC = readProfileSymbolList(Size)) return EC; break; default: @@ -496,27 +498,43 @@ SampleProfileReaderExtBinary::readOneSec return sampleprof_error::success; } -std::error_code SampleProfileReaderExtBinary::readProfileSymbolList() { - auto UncompressSize = readNumber(); - if (std::error_code EC = UncompressSize.getError()) +std::error_code +SampleProfileReaderExtBinary::readProfileSymbolList(uint64_t Size) { + if (!ProfSymList) + ProfSymList = std::make_unique(); + + if (std::error_code EC = ProfSymList->read(Data, Size)) return EC; + Data = Data + Size; + return sampleprof_error::success; +} + +std::error_code SampleProfileReaderExtBinaryBase::decompressSection( + const uint8_t *SecStart, const uint64_t SecSize, + const uint8_t *&DecompressBuf, uint64_t 
&DecompressBufSize) { + Data = SecStart; + End = SecStart + SecSize; + auto DecompressSize = readNumber(); + if (std::error_code EC = DecompressSize.getError()) + return EC; + DecompressBufSize = *DecompressSize; + auto CompressSize = readNumber(); if (std::error_code EC = CompressSize.getError()) return EC; - if (!ProfSymList) - ProfSymList = std::make_unique(); + if (!llvm::zlib::isAvailable()) + return sampleprof_error::zlib_unavailable; - if (std::error_code EC = - ProfSymList->read(*CompressSize, *UncompressSize, Data)) - return EC; - - // CompressSize is zero only when ProfileSymbolList is not compressed. - if (*CompressSize == 0) - Data = Data + *UncompressSize; - else - Data = Data + *CompressSize; + StringRef CompressedStrings(reinterpret_cast(Data), + *CompressSize); + char *Buffer = Allocator.Allocate(DecompressBufSize); + llvm::Error E = + zlib::uncompress(CompressedStrings, Buffer, DecompressBufSize); + if (E) + return sampleprof_error::uncompress_failed; + DecompressBuf = reinterpret_cast(Buffer); return sampleprof_error::success; } @@ -528,11 +546,35 @@ std::error_code SampleProfileReaderExtBi // Skip empty section. if (!Entry.Size) continue; + const uint8_t *SecStart = BufStart + Entry.Offset; - if (std::error_code EC = readOneSection(SecStart, Entry.Size, Entry.Type)) + uint64_t SecSize = Entry.Size; + + // If the section is compressed, decompress it into a buffer + // DecompressBuf before reading the actual data. The pointee of + // 'Data' will be changed to buffer hold by DecompressBuf + // temporarily when reading the actual data. 
+ bool isCompressed = hasSecFlag(Entry, SecFlagCompress); + if (isCompressed) { + const uint8_t *DecompressBuf; + uint64_t DecompressBufSize; + if (std::error_code EC = decompressSection( + SecStart, SecSize, DecompressBuf, DecompressBufSize)) + return EC; + SecStart = DecompressBuf; + SecSize = DecompressBufSize; + } + + if (std::error_code EC = readOneSection(SecStart, SecSize, Entry.Type)) return EC; - if (Data != SecStart + Entry.Size) + if (Data != SecStart + SecSize) return sampleprof_error::malformed; + + // Change the pointee of 'Data' from DecompressBuf to original Buffer. + if (isCompressed) { + Data = BufStart + Entry.Offset; + End = BufStart + Buffer->getBufferSize(); + } } return sampleprof_error::success; @@ -621,10 +663,10 @@ std::error_code SampleProfileReaderExtBi return EC; Entry.Type = static_cast(*Type); - auto Flag = readUnencodedNumber(); - if (std::error_code EC = Flag.getError()) + auto Flags = readUnencodedNumber(); + if (std::error_code EC = Flags.getError()) return EC; - Entry.Flag = *Flag; + Entry.Flags = *Flags; auto Offset = readUnencodedNumber(); if (std::error_code EC = Offset.getError()) Modified: llvm/trunk/lib/ProfileData/SampleProfWriter.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/ProfileData/SampleProfWriter.cpp?rev=373914&r1=373913&r2=373914&view=diff ============================================================================== --- llvm/trunk/lib/ProfileData/SampleProfWriter.cpp (original) +++ llvm/trunk/lib/ProfileData/SampleProfWriter.cpp Mon Oct 7 09:12:37 2019 @@ -21,6 +21,7 @@ #include "llvm/ADT/StringRef.h" #include "llvm/ProfileData/ProfileCommon.h" #include "llvm/ProfileData/SampleProf.h" +#include "llvm/Support/Compression.h" #include "llvm/Support/Endian.h" #include "llvm/Support/EndianStream.h" #include "llvm/Support/ErrorOr.h" @@ -72,21 +73,58 @@ SampleProfileWriter::write(const StringM return sampleprof_error::success; } +SecHdrTableEntry & 
+SampleProfileWriterExtBinaryBase::getEntryInLayout(SecType Type) { + auto SecIt = std::find_if( + SectionLayout.begin(), SectionLayout.end(), + [=](const auto &Entry) -> bool { return Entry.Type == Type; }); + return *SecIt; +} + /// Return the current position and prepare to use it as the start /// position of a section. -uint64_t SampleProfileWriterExtBinaryBase::markSectionStart() { - return OutputStream->tell(); +uint64_t SampleProfileWriterExtBinaryBase::markSectionStart(SecType Type) { + uint64_t SectionStart = OutputStream->tell(); + auto &Entry = getEntryInLayout(Type); + // Use LocalBuf as a temporary output for writting data. + if (hasSecFlag(Entry, SecFlagCompress)) + LocalBufStream.swap(OutputStream); + return SectionStart; +} + +std::error_code SampleProfileWriterExtBinaryBase::compressAndOutput() { + if (!llvm::zlib::isAvailable()) + return sampleprof_error::zlib_unavailable; + std::string &UncompressedStrings = + static_cast(LocalBufStream.get())->str(); + if (UncompressedStrings.size() == 0) + return sampleprof_error::success; + auto &OS = *OutputStream; + SmallString<128> CompressedStrings; + llvm::Error E = zlib::compress(UncompressedStrings, CompressedStrings, + zlib::BestSizeCompression); + if (E) + return sampleprof_error::compress_failed; + encodeULEB128(UncompressedStrings.size(), OS); + encodeULEB128(CompressedStrings.size(), OS); + OS << CompressedStrings.str(); + UncompressedStrings.clear(); + return sampleprof_error::success; } -/// Add a new section into section header table. Return the position -/// of SectionEnd. -uint64_t -SampleProfileWriterExtBinaryBase::addNewSection(SecType Sec, +/// Add a new section into section header table. 
+std::error_code +SampleProfileWriterExtBinaryBase::addNewSection(SecType Type, uint64_t SectionStart) { - uint64_t SectionEnd = OutputStream->tell(); - SecHdrTable.push_back( - {Sec, 0, SectionStart - FileStart, SectionEnd - SectionStart}); - return SectionEnd; + auto Entry = getEntryInLayout(Type); + if (hasSecFlag(Entry, SecFlagCompress)) { + LocalBufStream.swap(OutputStream); + if (std::error_code EC = compressAndOutput()) + return EC; + } + SecHdrTable.push_back({Type, Entry.Flags, SectionStart - FileStart, + OutputStream->tell() - SectionStart}); + return sampleprof_error::success; } std::error_code SampleProfileWriterExtBinaryBase::write( @@ -94,6 +132,8 @@ std::error_code SampleProfileWriterExtBi if (std::error_code EC = writeHeader(ProfileMap)) return EC; + std::string LocalBuf; + LocalBufStream = std::make_unique(LocalBuf); if (std::error_code EC = writeSections(ProfileMap)) return EC; @@ -105,28 +145,38 @@ std::error_code SampleProfileWriterExtBi std::error_code SampleProfileWriterExtBinary::writeSections( const StringMap &ProfileMap) { - uint64_t SectionStart = markSectionStart(); + uint64_t SectionStart = markSectionStart(SecProfSummary); computeSummary(ProfileMap); if (auto EC = writeSummary()) return EC; - SectionStart = addNewSection(SecProfSummary, SectionStart); + if (std::error_code EC = addNewSection(SecProfSummary, SectionStart)) + return EC; // Generate the name table for all the functions referenced in the profile. 
+ SectionStart = markSectionStart(SecNameTable); for (const auto &I : ProfileMap) { addName(I.first()); addNames(I.second); } writeNameTable(); - SectionStart = addNewSection(SecNameTable, SectionStart); + if (std::error_code EC = addNewSection(SecNameTable, SectionStart)) + return EC; + SectionStart = markSectionStart(SecLBRProfile); if (std::error_code EC = writeFuncProfiles(ProfileMap)) return EC; - SectionStart = addNewSection(SecLBRProfile, SectionStart); + if (std::error_code EC = addNewSection(SecLBRProfile, SectionStart)) + return EC; + + if (ProfSymList && ProfSymList->toCompress()) + setToCompressSection(SecProfileSymbolList); + SectionStart = markSectionStart(SecProfileSymbolList); if (ProfSymList && ProfSymList->size() > 0) if (std::error_code EC = ProfSymList->write(*OutputStream)) return EC; - addNewSection(SecProfileSymbolList, SectionStart); + if (std::error_code EC = addNewSection(SecProfileSymbolList, SectionStart)) + return EC; return sampleprof_error::success; } @@ -308,6 +358,23 @@ std::error_code SampleProfileWriterBinar return sampleprof_error::success; } +void SampleProfileWriterExtBinaryBase::setToCompressAllSections() { + for (auto &Entry : SectionLayout) + addSecFlags(Entry, SecFlagCompress); +} + +void SampleProfileWriterExtBinaryBase::setToCompressSection(SecType Type) { + addSectionFlags(Type, SecFlagCompress); +} + +void SampleProfileWriterExtBinaryBase::addSectionFlags(SecType Type, + SecFlags Flags) { + for (auto &Entry : SectionLayout) { + if (Entry.Type == Type) + addSecFlags(Entry, Flags); + } +} + void SampleProfileWriterExtBinaryBase::allocSecHdrTable() { support::endian::Writer Writer(*OutputStream, support::little); @@ -342,9 +409,9 @@ std::error_code SampleProfileWriterExtBi // to adjust the order in SecHdrTable to be consistent with // SectionLayout when we write SecHdrTable to the memory. 
for (uint32_t i = 0; i < SectionLayout.size(); i++) { - uint32_t idx = IndexMap[static_cast(SectionLayout[i])]; + uint32_t idx = IndexMap[static_cast(SectionLayout[i].Type)]; Writer.write(static_cast(SecHdrTable[idx].Type)); - Writer.write(static_cast(SecHdrTable[idx].Flag)); + Writer.write(static_cast(SecHdrTable[idx].Flags)); Writer.write(static_cast(SecHdrTable[idx].Offset)); Writer.write(static_cast(SecHdrTable[idx].Size)); } @@ -362,7 +429,6 @@ std::error_code SampleProfileWriterExtBi FileStart = OS.tell(); writeMagicIdent(Format); - initSectionLayout(); allocSecHdrTable(); return sampleprof_error::success; } Modified: llvm/trunk/test/Transforms/SampleProfile/compressed-profile-symbol-list.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/SampleProfile/compressed-profile-symbol-list.ll?rev=373914&r1=373913&r2=373914&view=diff ============================================================================== --- llvm/trunk/test/Transforms/SampleProfile/compressed-profile-symbol-list.ll (original) +++ llvm/trunk/test/Transforms/SampleProfile/compressed-profile-symbol-list.ll Mon Oct 7 09:12:37 2019 @@ -1,5 +1,5 @@ ; REQUIRES: zlib ; Append inline.prof with profile symbol list and save it after compression. 
-; RUN: llvm-profdata merge --sample --prof-sym-list=%S/Inputs/profile-symbol-list.text --compress-prof-sym-list=true --extbinary %S/Inputs/inline.prof --output=%t.profdata +; RUN: llvm-profdata merge --sample --prof-sym-list=%S/Inputs/profile-symbol-list.text --compress-all-sections=true --extbinary %S/Inputs/inline.prof --output=%t.profdata ; RUN: opt < %S/Inputs/profile-symbol-list.ll -sample-profile -profile-accurate-for-symsinlist -sample-profile-file=%t.profdata -S | FileCheck %S/Inputs/profile-symbol-list.ll ; RUN: opt < %S/Inputs/profile-symbol-list.ll -passes=sample-profile -profile-accurate-for-symsinlist -sample-profile-file=%t.profdata -S | FileCheck %S/Inputs/profile-symbol-list.ll Added: llvm/trunk/test/Transforms/SampleProfile/profile-format-compress.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/SampleProfile/profile-format-compress.ll?rev=373914&view=auto ============================================================================== --- llvm/trunk/test/Transforms/SampleProfile/profile-format-compress.ll (added) +++ llvm/trunk/test/Transforms/SampleProfile/profile-format-compress.ll Mon Oct 7 09:12:37 2019 @@ -0,0 +1,123 @@ +; REQUIRES: zlib +; RUN: opt < %s -sample-profile -sample-profile-file=%S/Inputs/inline.prof -S | FileCheck %s +; RUN: opt < %s -passes=sample-profile -sample-profile-file=%S/Inputs/inline.prof -S | FileCheck %s +; RUN: llvm-profdata merge -sample -extbinary -compress-all-sections %S/Inputs/inline.prof -o %t.compress.extbinary.afdo +; RUN: opt < %s -sample-profile -sample-profile-file=%t.compress.extbinary.afdo -S | FileCheck %s +; RUN: opt < %s -passes=sample-profile -sample-profile-file=%t.compress.extbinary.afdo -S | FileCheck %s + +; Original C++ test case +; +; #include +; +; int sum(int x, int y) { +; return x + y; +; } +; +; int main() { +; int s, i = 0; +; while (i++ < 20000 * 20000) +; if (i != 100) s = sum(i, s); else s = 30; +; printf("sum is %d\n", s); +; return 0; +; } +; + at .str = private 
unnamed_addr constant [11 x i8] c"sum is %d\0A\00", align 1 + +; Check sample-profile phase using compressed extbinary format profile +; will annotate the IR with exactly the same result as using text format. +; CHECK: br i1 %cmp, label %while.body, label %while.end{{.*}} !prof ![[IDX1:[0-9]*]] +; CHECK: br i1 %cmp1, label %if.then, label %if.else{{.*}} !prof ![[IDX2:[0-9]*]] +; CHECK: call i32 (i8*, ...) @printf{{.*}} !prof ![[IDX3:[0-9]*]] +; CHECK: = !{!"TotalCount", i64 26781} +; CHECK: = !{!"MaxCount", i64 5553} +; CHECK: ![[IDX1]] = !{!"branch_weights", i32 5392, i32 163} +; CHECK: ![[IDX2]] = !{!"branch_weights", i32 5280, i32 113} +; CHECK: ![[IDX3]] = !{!"branch_weights", i32 1} + +; Function Attrs: nounwind uwtable +define i32 @_Z3sumii(i32 %x, i32 %y) !dbg !4 { +entry: + %x.addr = alloca i32, align 4 + %y.addr = alloca i32, align 4 + store i32 %x, i32* %x.addr, align 4 + store i32 %y, i32* %y.addr, align 4 + %0 = load i32, i32* %x.addr, align 4, !dbg !11 + %1 = load i32, i32* %y.addr, align 4, !dbg !11 + %add = add nsw i32 %0, %1, !dbg !11 + ret i32 %add, !dbg !11 +} + +; Function Attrs: uwtable +define i32 @main() !dbg !7 { +entry: + %retval = alloca i32, align 4 + %s = alloca i32, align 4 + %i = alloca i32, align 4 + store i32 0, i32* %retval + store i32 0, i32* %i, align 4, !dbg !12 + br label %while.cond, !dbg !13 + +while.cond: ; preds = %if.end, %entry + %0 = load i32, i32* %i, align 4, !dbg !14 + %inc = add nsw i32 %0, 1, !dbg !14 + store i32 %inc, i32* %i, align 4, !dbg !14 + %cmp = icmp slt i32 %0, 400000000, !dbg !14 + br i1 %cmp, label %while.body, label %while.end, !dbg !14 + +while.body: ; preds = %while.cond + %1 = load i32, i32* %i, align 4, !dbg !16 + %cmp1 = icmp ne i32 %1, 100, !dbg !16 + br i1 %cmp1, label %if.then, label %if.else, !dbg !16 + + +if.then: ; preds = %while.body + %2 = load i32, i32* %i, align 4, !dbg !18 + %3 = load i32, i32* %s, align 4, !dbg !18 + %call = call i32 @_Z3sumii(i32 %2, i32 %3), !dbg !18 + store i32 %call, 
i32* %s, align 4, !dbg !18 + br label %if.end, !dbg !18 + +if.else: ; preds = %while.body + store i32 30, i32* %s, align 4, !dbg !20 + br label %if.end + +if.end: ; preds = %if.else, %if.then + br label %while.cond, !dbg !22 + +while.end: ; preds = %while.cond + %4 = load i32, i32* %s, align 4, !dbg !24 + %call2 = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([11 x i8], [11 x i8]* @.str, i32 0, i32 0), i32 %4), !dbg !24 + ret i32 0, !dbg !25 +} + +declare i32 @printf(i8*, ...) #2 + +!llvm.dbg.cu = !{!0} +!llvm.module.flags = !{!8, !9} +!llvm.ident = !{!10} + +!0 = distinct !DICompileUnit(language: DW_LANG_C_plus_plus, producer: "clang version 3.5 ", isOptimized: false, emissionKind: NoDebug, file: !1, enums: !2, retainedTypes: !2, globals: !2, imports: !2) +!1 = !DIFile(filename: "calls.cc", directory: ".") +!2 = !{} +!4 = distinct !DISubprogram(name: "sum", line: 3, isLocal: false, isDefinition: true, virtualIndex: 6, flags: DIFlagPrototyped, isOptimized: false, unit: !0, scopeLine: 3, file: !1, scope: !5, type: !6, retainedNodes: !2) +!5 = !DIFile(filename: "calls.cc", directory: ".") +!6 = !DISubroutineType(types: !2) +!7 = distinct !DISubprogram(name: "main", line: 7, isLocal: false, isDefinition: true, virtualIndex: 6, flags: DIFlagPrototyped, isOptimized: false, unit: !0, scopeLine: 7, file: !1, scope: !5, type: !6, retainedNodes: !2) +!8 = !{i32 2, !"Dwarf Version", i32 4} +!9 = !{i32 1, !"Debug Info Version", i32 3} +!10 = !{!"clang version 3.5 "} +!11 = !DILocation(line: 4, scope: !4) +!12 = !DILocation(line: 8, scope: !7) +!13 = !DILocation(line: 9, scope: !7) +!14 = !DILocation(line: 9, scope: !15) +!15 = !DILexicalBlockFile(discriminator: 2, file: !1, scope: !7) +!16 = !DILocation(line: 10, scope: !17) +!17 = distinct !DILexicalBlock(line: 10, column: 0, file: !1, scope: !7) +!18 = !DILocation(line: 10, scope: !19) +!19 = !DILexicalBlockFile(discriminator: 2, file: !1, scope: !17) +!20 = !DILocation(line: 10, scope: !21) +!21 = 
!DILexicalBlockFile(discriminator: 4, file: !1, scope: !17) +!22 = !DILocation(line: 10, scope: !23) +!23 = !DILexicalBlockFile(discriminator: 6, file: !1, scope: !17) +!24 = !DILocation(line: 11, scope: !7) +!25 = !DILocation(line: 12, scope: !7) Modified: llvm/trunk/test/Transforms/SampleProfile/uncompressed-profile-symbol-list.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/SampleProfile/uncompressed-profile-symbol-list.ll?rev=373914&r1=373913&r2=373914&view=diff ============================================================================== --- llvm/trunk/test/Transforms/SampleProfile/uncompressed-profile-symbol-list.ll (original) +++ llvm/trunk/test/Transforms/SampleProfile/uncompressed-profile-symbol-list.ll Mon Oct 7 09:12:37 2019 @@ -1,4 +1,4 @@ ; Append inline.prof with profile symbol list and save it without compression. -; RUN: llvm-profdata merge --sample --prof-sym-list=%S/Inputs/profile-symbol-list.text --compress-prof-sym-list=false --extbinary %S/Inputs/inline.prof --output=%t.profdata +; RUN: llvm-profdata merge --sample --prof-sym-list=%S/Inputs/profile-symbol-list.text --compress-all-sections=false --extbinary %S/Inputs/inline.prof --output=%t.profdata ; RUN: opt < %S/Inputs/profile-symbol-list.ll -sample-profile -profile-accurate-for-symsinlist -sample-profile-file=%t.profdata -S | FileCheck %S/Inputs/profile-symbol-list.ll ; RUN: opt < %S/Inputs/profile-symbol-list.ll -passes=sample-profile -profile-accurate-for-symsinlist -sample-profile-file=%t.profdata -S | FileCheck %S/Inputs/profile-symbol-list.ll Added: llvm/trunk/test/tools/llvm-profdata/profile-symbol-list-compress.test URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/llvm-profdata/profile-symbol-list-compress.test?rev=373914&view=auto ============================================================================== --- llvm/trunk/test/tools/llvm-profdata/profile-symbol-list-compress.test (added) +++ 
llvm/trunk/test/tools/llvm-profdata/profile-symbol-list-compress.test Mon Oct 7 09:12:37 2019 @@ -0,0 +1,6 @@ +REQUIRES: zlib +; RUN: llvm-profdata merge -sample -extbinary -compress-all-sections -prof-sym-list=%S/Inputs/profile-symbol-list-1.text %S/Inputs/sample-profile.proftext -o %t.1.output +; RUN: llvm-profdata merge -sample -extbinary -compress-all-sections -prof-sym-list=%S/Inputs/profile-symbol-list-2.text %S/Inputs/sample-profile.proftext -o %t.2.output +; RUN: llvm-profdata merge -sample -extbinary -compress-all-sections %t.1.output %t.2.output -o %t.3.output +; RUN: llvm-profdata show -sample -show-prof-sym-list %t.3.output > %t.4.output +; RUN: diff %S/Inputs/profile-symbol-list.expected %t.4.output Added: llvm/trunk/test/tools/llvm-profdata/roundtrip-compress.test URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/llvm-profdata/roundtrip-compress.test?rev=373914&view=auto ============================================================================== --- llvm/trunk/test/tools/llvm-profdata/roundtrip-compress.test (added) +++ llvm/trunk/test/tools/llvm-profdata/roundtrip-compress.test Mon Oct 7 09:12:37 2019 @@ -0,0 +1,10 @@ +REQUIRES: zlib +# Round trip from text --> compressed extbinary --> text +RUN: llvm-profdata merge --sample --extbinary -compress-all-sections -output=%t.1.profdata %S/Inputs/sample-profile.proftext +RUN: llvm-profdata merge --sample --text -output=%t.1.proftext %t.1.profdata +RUN: diff %t.1.proftext %S/Inputs/sample-profile.proftext +# Round trip from text --> binary --> compressed extbinary --> text +RUN: llvm-profdata merge --sample --binary -output=%t.2.profdata %S/Inputs/sample-profile.proftext +RUN: llvm-profdata merge --sample --extbinary -compress-all-sections -output=%t.3.profdata %t.2.profdata +RUN: llvm-profdata merge --sample --text -output=%t.2.proftext %t.3.profdata +RUN: diff %t.2.proftext %S/Inputs/sample-profile.proftext Modified: llvm/trunk/tools/llvm-profdata/llvm-profdata.cpp URL: 
http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-profdata/llvm-profdata.cpp?rev=373914&r1=373913&r2=373914&view=diff ============================================================================== --- llvm/trunk/tools/llvm-profdata/llvm-profdata.cpp (original) +++ llvm/trunk/tools/llvm-profdata/llvm-profdata.cpp Mon Oct 7 09:12:37 2019 @@ -439,12 +439,35 @@ static void populateProfileSymbolList(Me PSL.add(symbol); } +static void handleExtBinaryWriter(sampleprof::SampleProfileWriter &Writer, + ProfileFormat OutputFormat, + MemoryBuffer *Buffer, + sampleprof::ProfileSymbolList &WriterList, + bool CompressAllSections) { + populateProfileSymbolList(Buffer, WriterList); + if (WriterList.size() > 0 && OutputFormat != PF_Ext_Binary) + warn("Profile Symbol list is not empty but the output format is not " + "ExtBinary format. The list will be lost in the output. "); + + Writer.setProfileSymbolList(&WriterList); + + if (CompressAllSections) { + if (OutputFormat != PF_Ext_Binary) { + warn("-compress-all-section is ignored. Specify -extbinary to enable it"); + } else { + auto ExtBinaryWriter = + static_cast(&Writer); + ExtBinaryWriter->setToCompressAllSections(); + } + } +} + static void mergeSampleProfile(const WeightedFileVector &Inputs, SymbolRemapper *Remapper, StringRef OutputFilename, ProfileFormat OutputFormat, StringRef ProfileSymbolListFile, - bool CompressProfSymList, FailureMode FailMode) { + bool CompressAllSections, FailureMode FailMode) { using namespace sampleprof; StringMap ProfileMap; SmallVector, 5> Readers; @@ -496,17 +519,12 @@ static void mergeSampleProfile(const Wei if (std::error_code EC = WriterOrErr.getError()) exitWithErrorCode(EC, OutputFilename); + auto Writer = std::move(WriterOrErr.get()); // WriterList will have StringRef refering to string in Buffer. // Make sure Buffer lives as long as WriterList. 
auto Buffer = getInputFileBuf(ProfileSymbolListFile); - populateProfileSymbolList(Buffer.get(), WriterList); - WriterList.setToCompress(CompressProfSymList); - if (WriterList.size() > 0 && OutputFormat != PF_Ext_Binary) - warn("Profile Symbol list is not empty but the output format is not " - "ExtBinary format. The list will be lost in the output. "); - - auto Writer = std::move(WriterOrErr.get()); - Writer->setProfileSymbolList(&WriterList); + handleExtBinaryWriter(*Writer, OutputFormat, Buffer.get(), WriterList, + CompressAllSections); Writer->write(ProfileMap); } @@ -630,9 +648,10 @@ static int merge_main(int argc, const ch "prof-sym-list", cl::init(""), cl::desc("Path to file containing the list of function symbols " "used to populate profile symbol list")); - cl::opt CompressProfSymList( - "compress-prof-sym-list", cl::init(false), cl::Hidden, - cl::desc("Compress profile symbol list before write it into profile. ")); + cl::opt CompressAllSections( + "compress-all-sections", cl::init(false), cl::Hidden, + cl::desc("Compress all sections when writing the profile (only " + "meaningful for -extbinary)")); cl::ParseCommandLineOptions(argc, argv, "LLVM profile data merger\n"); @@ -666,8 +685,8 @@ static int merge_main(int argc, const ch OutputFormat, OutputSparse, NumThreads, FailureMode); else mergeSampleProfile(WeightedInputs, Remapper.get(), OutputFilename, - OutputFormat, ProfileSymbolListFile, - CompressProfSymList, FailureMode); + OutputFormat, ProfileSymbolListFile, CompressAllSections, + FailureMode); return 0; } From llvm-commits at lists.llvm.org Mon Oct 7 09:15:20 2019 From: llvm-commits at lists.llvm.org (Simon Pilgrim via llvm-commits) Date: Mon, 07 Oct 2019 16:15:20 -0000 Subject: [llvm] r373915 - [X86][SSE] getTargetShuffleInputs - move VT.isSimple/isVector checks inside. NFCI. 
Message-ID: <20191007161520.EA0B381663@lists.llvm.org> Author: rksimon Date: Mon Oct 7 09:15:20 2019 New Revision: 373915 URL: http://llvm.org/viewvc/llvm-project?rev=373915&view=rev Log: [X86][SSE] getTargetShuffleInputs - move VT.isSimple/isVector checks inside. NFCI. Stop all the callers from having to check the value type before calling getTargetShuffleInputs. Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=373915&r1=373914&r2=373915&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp (original) +++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Mon Oct 7 09:15:20 2019 @@ -7259,6 +7259,10 @@ static bool getTargetShuffleInputs(SDVal SmallVectorImpl &Mask, SelectionDAG &DAG, unsigned Depth, bool ResolveZero) { + EVT VT = Op.getValueType(); + if (!VT.isSimple() || !VT.isVector()) + return false; + APInt KnownUndef, KnownZero; if (getTargetShuffleAndZeroables(Op, Mask, Inputs, KnownUndef, KnownZero)) { for (int i = 0, e = Mask.size(); i != e; ++i) { @@ -7280,6 +7284,10 @@ static bool getTargetShuffleInputs(SDVal SmallVectorImpl &Mask, SelectionDAG &DAG, unsigned Depth = 0, bool ResolveZero = true) { + EVT VT = Op.getValueType(); + if (!VT.isSimple() || !VT.isVector()) + return false; + unsigned NumElts = Op.getValueType().getVectorNumElements(); APInt DemandedElts = APInt::getAllOnesValue(NumElts); return getTargetShuffleInputs(Op, DemandedElts, Inputs, Mask, DAG, Depth, @@ -34574,8 +34582,8 @@ bool X86TargetLowering::SimplifyDemanded // Get target/faux shuffle mask. 
SmallVector OpMask; SmallVector OpInputs; - if (!VT.isSimple() || !getTargetShuffleInputs(Op, DemandedElts, OpInputs, - OpMask, TLO.DAG, Depth, false)) + if (!getTargetShuffleInputs(Op, DemandedElts, OpInputs, OpMask, TLO.DAG, + Depth, false)) return false; // Shuffle inputs must be the same size as the result. @@ -34954,8 +34962,7 @@ SDValue X86TargetLowering::SimplifyMulti SmallVector ShuffleMask; SmallVector ShuffleOps; - if (VT.isSimple() && VT.isVector() && - getTargetShuffleInputs(Op, ShuffleOps, ShuffleMask, DAG, Depth)) { + if (getTargetShuffleInputs(Op, ShuffleOps, ShuffleMask, DAG, Depth)) { // If all the demanded elts are from one operand and are inline, // then we can use the operand directly. int NumOps = ShuffleOps.size(); From llvm-commits at lists.llvm.org Mon Oct 7 09:20:34 2019 From: llvm-commits at lists.llvm.org (Paul Robinson via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 16:20:34 +0000 (UTC) Subject: [PATCH] D68465: [DebugInfo] Trim call-clobbered location list entries when tuning for GDB In-Reply-To: References: Message-ID: <397fc0a4194533c5ff2510876d719fe6@localhost.localdomain> probinson added a comment. Debugger tuning should not be used directly this way. There should be a DwarfDebug flag, and a CL option, and the default set in the DwarfDebug ctor based on tuning. This allows the defaulting to work how you want, but can be overridden easily for experimentation and testing. There are lots of examples of doing this in the ctor already. Also, if it turns out some other debugger also needs this, it's trivial to fix up the ctor to handle it with no code changes needed elsewhere. @dblaikie I'm also not clear what you're suggesting about .debug_addr entry plus offset. DW_LLE_offset_pair does this, derived from the base address, which ought to be available for any given function, assuming DWARF v5. Can you explain more clearly what's missing?
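
The flag-plus-default pattern described in the first paragraph of this comment can be sketched roughly as follows. This is a standalone toy model, not LLVM code: `Tuning`, `Override`, `LocListEmitter` and `trimsCallClobbered` are hypothetical names made up for illustration, and the tri-state `Override` merely stands in for a command-line option that may be left unset.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration only; not LLVM's types.
enum class Tuning { GDB, LLDB, SCE };
// Tri-state override, modelling a command-line option left unset or forced.
enum class Override { Unset, Enable, Disable };

class LocListEmitter {
  // The behaviour is stored as a plain flag so that the rest of the code
  // never has to ask which debugger is being tuned for.
  bool TrimCallClobbered;

public:
  LocListEmitter(Tuning T, Override O)
      // Default comes from tuning; an explicit override always wins.
      : TrimCallClobbered(O == Override::Unset ? T == Tuning::GDB
                                               : O == Override::Enable) {}

  bool trimsCallClobbered() const { return TrimCallClobbered; }
};
```

With this shape, supporting another debugger would only mean changing the default expression in the constructor, and a test can force either behaviour regardless of tuning.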
Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68465/new/ https://reviews.llvm.org/D68465 From llvm-commits at lists.llvm.org Mon Oct 7 09:35:01 2019 From: llvm-commits at lists.llvm.org (Kevin P. Neal via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 16:35:01 +0000 (UTC) Subject: [PATCH] D64746: Add constrained intrinsics for lrint and lround In-Reply-To: References: Message-ID: kpn closed this revision. kpn added a comment. Changes pushed to r373900. I don't know why the ticket was left open. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D64746/new/ https://reviews.llvm.org/D64746 From llvm-commits at lists.llvm.org Mon Oct 7 09:36:31 2019 From: llvm-commits at lists.llvm.org (Alina Sbirlea via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 16:36:31 +0000 (UTC) Subject: [PATCH] D68535: Fix loop unrolling initialization in the new pass manager In-Reply-To: References: Message-ID: <473bd03d7b8842d6aed322180f708d68@localhost.localdomain> asbirlea added a comment. Maybe elaborate in the patch description what `determine when and how we will unroll loops.` means? e.g.: "The default before and after this patch is for LoopUnroll to be enabled, and for it to use a cost model to determine whether to unroll the loop (`OnlyWhenForced = false`). Before this patch, disabling loop unroll would not run the LoopUnroll pass. After this patch, the LoopUnroll pass is being run, but it restricts unrolling to only the loops marked by a pragma (`OnlyWhenForced = true`). In addition, this patch disables the UnrollAndJam pass when disabling unrolling." Otherwise LGTM. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68535/new/ https://reviews.llvm.org/D68535 From llvm-commits at lists.llvm.org Mon Oct 7 09:45:47 2019 From: llvm-commits at lists.llvm.org (Wei Mi via llvm-commits) Date: Mon, 07 Oct 2019 16:45:47 -0000 Subject: [llvm] r373919 - Fix build errors caused by rL373914. 
Message-ID: <20191007164547.3CC358294D@lists.llvm.org> Author: wmi Date: Mon Oct 7 09:45:47 2019 New Revision: 373919 URL: http://llvm.org/viewvc/llvm-project?rev=373919&view=rev Log: Fix build errors caused by rL373914. Modified: llvm/trunk/include/llvm/ProfileData/SampleProfWriter.h llvm/trunk/lib/ProfileData/SampleProfReader.cpp Modified: llvm/trunk/include/llvm/ProfileData/SampleProfWriter.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/ProfileData/SampleProfWriter.h?rev=373919&r1=373918&r2=373919&view=diff ============================================================================== --- llvm/trunk/include/llvm/ProfileData/SampleProfWriter.h (original) +++ llvm/trunk/include/llvm/ProfileData/SampleProfWriter.h Mon Oct 7 09:45:47 2019 @@ -202,10 +202,10 @@ public: private: virtual void initSectionLayout() override { - SectionLayout = {{SecProfSummary}, - {SecNameTable}, - {SecLBRProfile}, - {SecProfileSymbolList}}; + SectionLayout = {{SecProfSummary, 0, 0, 0}, + {SecNameTable, 0, 0, 0}, + {SecLBRProfile, 0, 0, 0}, + {SecProfileSymbolList, 0, 0, 0}}; }; virtual std::error_code writeSections(const StringMap &ProfileMap) override; Modified: llvm/trunk/lib/ProfileData/SampleProfReader.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/ProfileData/SampleProfReader.cpp?rev=373919&r1=373918&r2=373919&view=diff ============================================================================== --- llvm/trunk/lib/ProfileData/SampleProfReader.cpp (original) +++ llvm/trunk/lib/ProfileData/SampleProfReader.cpp Mon Oct 7 09:45:47 2019 @@ -530,8 +530,9 @@ std::error_code SampleProfileReaderExtBi StringRef CompressedStrings(reinterpret_cast(Data), *CompressSize); char *Buffer = Allocator.Allocate(DecompressBufSize); + size_t UCSize = DecompressBufSize; llvm::Error E = - zlib::uncompress(CompressedStrings, Buffer, DecompressBufSize); + zlib::uncompress(CompressedStrings, Buffer, UCSize); if (E) return sampleprof_error::uncompress_failed; DecompressBuf = 
reinterpret_cast(Buffer); From llvm-commits at lists.llvm.org Mon Oct 7 09:49:44 2019 From: llvm-commits at lists.llvm.org (Cyndy Ishida via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 16:49:44 +0000 (UTC) Subject: [PATCH] D67529: [TextAPI] Introduce TBDv4 In-Reply-To: References: Message-ID: <9cce7c2fc99dcf5b6fc2e164b5a68a1a@localhost.localdomain> cishida updated this revision to Diff 223621. cishida added a comment. Reduce inplace limit for TargetList Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67529/new/ https://reviews.llvm.org/D67529 Files: llvm/include/llvm/TextAPI/MachO/InterfaceFile.h llvm/include/llvm/TextAPI/MachO/Symbol.h llvm/include/llvm/TextAPI/MachO/Target.h llvm/lib/TextAPI/MachO/Target.cpp llvm/lib/TextAPI/MachO/TextStub.cpp llvm/lib/TextAPI/MachO/TextStubCommon.cpp llvm/unittests/TextAPI/CMakeLists.txt llvm/unittests/TextAPI/TextStubV4Tests.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D67529.223621.patch Type: text/x-patch Size: 48798 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 09:54:18 2019 From: llvm-commits at lists.llvm.org (Julian Lettner via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 16:54:18 +0000 (UTC) Subject: [PATCH] D68529: [lit] Move argument parsing/validation to separate file In-Reply-To: References: Message-ID: yln updated this revision to Diff 223622. yln marked 2 inline comments as done. yln added a comment. Ensure refactored code follows Python style guide. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68529/new/ https://reviews.llvm.org/D68529 Files: llvm/utils/lit/lit/cl_arguments.py llvm/utils/lit/lit/main.py -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68529.223622.patch Type: text/x-patch Size: 19216 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 09:54:54 2019 From: llvm-commits at lists.llvm.org (Julian Lettner via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 16:54:54 +0000 (UTC) Subject: [PATCH] D68529: [lit] Move argument parsing/validation to separate file In-Reply-To: References: Message-ID: <856832a984f655bb9b2225e3aaaf77df@localhost.localdomain> yln added inline comments. ================ Comment at: llvm/utils/lit/lit/main.py:237 elif opts.incremental: - sort_by_incremental_cache(run) + run.tests.sort(key = by_mtime, reverse = True) else: ---------------- serge-sans-paille wrote: > nitpicking: PEP8 recommends ``sort(key=by_mtime, reverse=True)`` Fixed. Thanks! :) Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68529/new/ https://reviews.llvm.org/D68529 From llvm-commits at lists.llvm.org Mon Oct 7 09:55:44 2019 From: llvm-commits at lists.llvm.org (=?utf-8?q?Christian_K=C3=BChnel_via_Phabricator?= via llvm-commits) Date: Mon, 07 Oct 2019 16:55:44 +0000 (UTC) Subject: [PATCH] D68560: Test for the build server -- DO NOT MERGE! In-Reply-To: References: Message-ID: <85476accf26b9b23d88bd26a512062ed@localhost.localdomain> kuhnel updated this revision to Diff 223623. kuhnel added a comment. not sure where this ends up... Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68560/new/ https://reviews.llvm.org/D68560 Files: DELETEME.txt Index: DELETEME.txt =================================================================== --- /dev/null +++ DELETEME.txt @@ -0,0 +1,4 @@ +just for testing. delete this file if you see it... + +This is my second change. +3rd one -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68560.223623.patch Type: text/x-patch Size: 226 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 09:55:45 2019 From: llvm-commits at lists.llvm.org (Juergen Ributzka via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 16:55:45 +0000 (UTC) Subject: [PATCH] D67529: [TextAPI] Introduce TBDv4 In-Reply-To: References: Message-ID: ributzka added inline comments. ================ Comment at: llvm/lib/TextAPI/MachO/Target.cpp:35 + .Case("bridgeos", PlatformKind::bridgeOS) + .Case("maccatalyst", PlatformKind::macCatalyst) + .Default(PlatformKind::unknown); ---------------- Sorry, I just noticed that the simulators are missing in this list. Please add support for them and also matching tests. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67529/new/ https://reviews.llvm.org/D67529 From llvm-commits at lists.llvm.org Mon Oct 7 09:58:37 2019 From: llvm-commits at lists.llvm.org (Juergen Ributzka via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 16:58:37 +0000 (UTC) Subject: [PATCH] D67646: [TextAPI] Add Multiple Document Support to TBDv3 In-Reply-To: References: Message-ID: <8ae47955adf58003a7daec1bb231fea5@localhost.localdomain> ributzka added a comment. Does nm print now all symbols - including the inlined ones? That would be an unexpected change, because we don't do the same for MachOs that have re-exported frameworks. I think this feature should be guarded by an option. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67646/new/ https://reviews.llvm.org/D67646 From llvm-commits at lists.llvm.org Mon Oct 7 10:00:17 2019 From: llvm-commits at lists.llvm.org (Roman Lebedev via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 17:00:17 +0000 (UTC) Subject: [PATCH] D53877: [IR] Strawman for dedicated FNeg IR instruction In-Reply-To: References: Message-ID: <40dec9daec5f295f6ecf1a55e61eea64@localhost.localdomain> lebedev.ri added inline comments. 
================ Comment at: llvm/trunk/include/llvm-c/Core.h:1523-1524 macro(UndefValue) \ macro(Instruction) \ macro(BinaryOperator) \ macro(CallInst) \ ---------------- cameron.mcinally wrote: > cameron.mcinally wrote: > > lebedev.ri wrote: > > > @cameron.mcinally Should anything have been added here for `UnaryOperator` ? > > Yes, I believe you're correct. Will add that under a separate Diff. Thanks. > @lebedev.ri, do you know of any existing tests for these macros? I see a number of uses, but no unittests/etc. I don't know how this can be reached, but based on the name (`LLVMCCoreValues`) i can guess this is for C API. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D53877/new/ https://reviews.llvm.org/D53877 From llvm-commits at lists.llvm.org Mon Oct 7 10:05:09 2019 From: llvm-commits at lists.llvm.org (Florian Hahn via llvm-commits) Date: Mon, 07 Oct 2019 17:05:09 -0000 Subject: [llvm] r373923 - [Remarks] Pass StringBlockValue as StringRef. Message-ID: <20191007170509.4624A8261E@lists.llvm.org> Author: fhahn Date: Mon Oct 7 10:05:09 2019 New Revision: 373923 URL: http://llvm.org/viewvc/llvm-project?rev=373923&view=rev Log: [Remarks] Pass StringBlockValue as StringRef. After changing the remark serialization, we now pass StringRefs to the serializer. We should use StringRef for StringBlockVal, to avoid creating temporary objects, which then cause StringBlockVal.Value to point to invalid memory. 
Reviewers: thegameg, anemet Reviewed By: thegameg Differential Revision: https://reviews.llvm.org/D68571 Modified: llvm/trunk/lib/Remarks/YAMLRemarkSerializer.cpp Modified: llvm/trunk/lib/Remarks/YAMLRemarkSerializer.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Remarks/YAMLRemarkSerializer.cpp?rev=373923&r1=373922&r2=373923&view=diff ============================================================================== --- llvm/trunk/lib/Remarks/YAMLRemarkSerializer.cpp (original) +++ llvm/trunk/lib/Remarks/YAMLRemarkSerializer.cpp Mon Oct 7 10:05:09 2019 @@ -103,7 +103,7 @@ template <> struct MappingTraits struct BlockScalarTraits { From llvm-commits at lists.llvm.org Mon Oct 7 10:06:09 2019 From: llvm-commits at lists.llvm.org (Jordan Rupprecht via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 17:06:09 +0000 (UTC) Subject: [PATCH] D68570: Unify the two CRC implementations In-Reply-To: References: Message-ID: <66f1b8fe073450aa91e6db5def37a1bc@localhost.localdomain> rupprecht added a comment. In D68570#1697633 , @hans wrote: > In D68570#1697588 , @joerg wrote: > > > Why go back to the large tables for crc32? Just because JamCRC had that bug doesn't mean it should persist. > > > Because just using the table is much simpler and we already have it: no need for any run-time initialization and fancy code like call_once. Why do you consider it a bug? Generating a constant table like this at run-time -- again and again for each invocation of the program -- seems less than ideal to me. Do you have any benchmarks? A table is simpler in some regards, but also less readable in another sense (what are these random hex values?). Having benchmark results helps settle that debate. 
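
For readers wondering where the "random hex values" in such a table come from: assuming the table under review is the standard reflected CRC-32 table (polynomial 0xEDB88320, as used by zlib), it can be regenerated with a few lines of code. The sketch below is illustrative only; `makeCRC32Table` and `crc32Of` are hypothetical helper names, not functions from the patch.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Reflected form of the CRC-32 polynomial 0x04C11DB7.
constexpr uint32_t Polynomial = 0xEDB88320u;

// Builds the 256-entry lookup table: entry I is the CRC remainder of the
// single byte I, computed one bit at a time.
std::array<uint32_t, 256> makeCRC32Table() {
  std::array<uint32_t, 256> Table{};
  for (uint32_t I = 0; I < 256; ++I) {
    uint32_t Value = I;
    for (int Bit = 0; Bit < 8; ++Bit)
      Value = (Value & 1) ? (Value >> 1) ^ Polynomial : Value >> 1;
    Table[I] = Value;
  }
  return Table;
}

// Table-driven CRC-32 over a buffer, with the usual initial and final
// inversion of the running value.
uint32_t crc32Of(const unsigned char *Data, size_t Size) {
  static const std::array<uint32_t, 256> Table = makeCRC32Table();
  uint32_t CRC = 0xFFFFFFFFu;
  for (size_t I = 0; I < Size; ++I)
    CRC = Table[(CRC ^ Data[I]) & 0xFF] ^ (CRC >> 8);
  return CRC ^ 0xFFFFFFFFu;
}
```

A unit test along these lines could compare a hard-coded table against the generated one, which would cover both the "how was it generated" and the "are the values correct" questions without any run-time initialization in the shipped code.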
That said, general +1 to removing code complexity/duplication ================ Comment at: llvm/lib/Support/CRC.cpp:26 -uint32_t llvm::crc32(uint32_t CRC, StringRef S) { - static llvm::once_flag InitFlag; - static CRC32Table Tbl; - llvm::call_once(InitFlag, initCRC32Table, &Tbl); +static const uint32_t CRCTable[256] = { + 0x00000000, 0x77073096, 0xee0e612c, 0x990951ba, ---------------- Can you leave a comment how this table was generated/how it could be regenerated if needed in the future? And/or a unit test to assert the values are correct? CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68570/new/ https://reviews.llvm.org/D68570 From llvm-commits at lists.llvm.org Mon Oct 7 10:09:25 2019 From: llvm-commits at lists.llvm.org (Alexander Richardson via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 17:09:25 +0000 (UTC) Subject: [PATCH] D68542: [Mips] Always save RA when disabling frame pointer elimination In-Reply-To: References: Message-ID: arichardson added a comment. In D68542#1697732 , @jrtc27 wrote: > In D68542#1697636 , @draganm wrote: > > > I don't think you can have frame-pointer based stack unwinding under current Mips ABIs, albeit this might be useful for some stack scan based unwind. Not sure tho. > > > You can most of the time, you just have to scan backwards to find the function prologue. Yes, it can break, but unless you have full DWARF info you can't do much better. Both FreeBSD (sys/mips/mips/db_trace.c) and Linux (arch/mips/kernel/process.c) do instruction-based unwinding on MIPS to get a good-enough backtrace on panic, so without this they can end up terminating the backtrace early. 
In particular, if you want a specific instance of the issue that motivated this patch, on FreeBSD, they have a `panic` which calls `vpanic` (much like `printf` vs `vprintf`), but due to being marked `noreturn`, `$ra` is dead and thus being clobbered by the call doesn't force a save like normal, so *every* panic ends up with a useless backtrace terminating at `panic`. Would probably be useful to include this example in the commit message. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68542/new/ https://reviews.llvm.org/D68542 From llvm-commits at lists.llvm.org Mon Oct 7 10:14:12 2019 From: llvm-commits at lists.llvm.org (Filipe Cabecinhas via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 17:14:12 +0000 (UTC) Subject: [PATCH] D67985: CFI: wrong type passed to llvm.type.test with multiple inheritance devirtualization In-Reply-To: References: Message-ID: filcab added subscribers: pcc, filcab. filcab added a comment. It seems there's a FIXME anticipating this problem. @pcc: Can you double-check, please? Thank you, Filipe Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67985/new/ https://reviews.llvm.org/D67985 From llvm-commits at lists.llvm.org Mon Oct 7 10:16:29 2019 From: llvm-commits at lists.llvm.org (Peter Collingbourne via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 17:16:29 +0000 (UTC) Subject: [PATCH] D67985: CFI: wrong type passed to llvm.type.test with multiple inheritance devirtualization In-Reply-To: References: Message-ID: <184773682ce5afa6feaef6c9e437f490@localhost.localdomain> pcc added a comment. Can you add a CodeGenCXX test as well, please? 
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67985/new/ https://reviews.llvm.org/D67985 From llvm-commits at lists.llvm.org Mon Oct 7 10:18:36 2019 From: llvm-commits at lists.llvm.org (Peter Collingbourne via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 17:18:36 +0000 (UTC) Subject: [PATCH] D68584: Fix Calling Convention through aliases In-Reply-To: References: Message-ID: <3e5682c663f916d851d6a91264d7b0aa@localhost.localdomain> pcc accepted this revision. pcc added a comment. This revision is now accepted and ready to land. LGTM ================ Comment at: clang/lib/CodeGen/CGDeclCXX.cpp:251 // Make sure the call and the callee agree on calling convention. - if (llvm::Function *dtorFn = - dyn_cast(dtor.getCallee()->stripPointerCasts())) + if (llvm::Function *dtorFn = dyn_cast( + dtor.getCallee()->stripPointerCastsAndAliases())) ---------------- Nit: while here you could change this to `auto *`. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68584/new/ https://reviews.llvm.org/D68584 From llvm-commits at lists.llvm.org Mon Oct 7 10:28:04 2019 From: llvm-commits at lists.llvm.org (Erich Keane via llvm-commits) Date: Mon, 07 Oct 2019 17:28:04 -0000 Subject: [llvm] r373929 - Fix Calling Convention through aliases Message-ID: <20191007172804.1B0FA8D09E@lists.llvm.org> Author: erichkeane Date: Mon Oct 7 10:28:03 2019 New Revision: 373929 URL: http://llvm.org/viewvc/llvm-project?rev=373929&view=rev Log: Fix Calling Convention through aliases r369697 changed the behavior of stripPointerCasts to no longer include aliases. However, the code in CGDeclCXX.cpp's createAtExitStub counted on looking through aliases to properly set the calling convention of a call. The result of the change was that the calling convention mismatch of the call would be replaced with an llvm.trap, causing a runtime crash.
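
The failure mode in this log message can be illustrated with a small standalone model. Everything below is a toy: `Value`, `Function`, `Alias`, `stripAliases`, `calleeWithoutAliases` and `calleeThroughAliases` are hypothetical stand-ins written for this sketch and do not reproduce LLVM's classes or its cast machinery.

```cpp
#include <cassert>

// Toy value hierarchy, for illustration only.
struct Value {
  virtual ~Value() = default;
};
struct Function : Value {
  int CallingConv = 0;
};
struct Alias : Value {
  Value *Aliasee;
  explicit Alias(Value *A) : Aliasee(A) {}
};

// Follows a chain of aliases until something that is not an alias is reached.
const Value *stripAliases(const Value *V) {
  while (const auto *A = dynamic_cast<const Alias *>(V))
    V = A->Aliasee;
  return V;
}

// Without looking through aliases, a callee that is an alias is not seen as
// a function, so its calling convention is never consulted.
const Function *calleeWithoutAliases(const Value *Callee) {
  return dynamic_cast<const Function *>(Callee);
}

// Looking through aliases recovers the underlying function.
const Function *calleeThroughAliases(const Value *Callee) {
  return dynamic_cast<const Function *>(stripAliases(Callee));
}
```

In this model, a caller that copies the calling convention only when the lookup succeeds would silently skip that step for an aliased callee, which corresponds to the mismatch the commit describes.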
Differential Revision: https://reviews.llvm.org/D68584 Modified: llvm/trunk/include/llvm/IR/Value.h llvm/trunk/lib/IR/Value.cpp Modified: llvm/trunk/include/llvm/IR/Value.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/IR/Value.h?rev=373929&r1=373928&r2=373929&view=diff ============================================================================== --- llvm/trunk/include/llvm/IR/Value.h (original) +++ llvm/trunk/include/llvm/IR/Value.h Mon Oct 7 10:28:03 2019 @@ -523,6 +523,16 @@ public: static_cast(this)->stripPointerCasts()); } + /// Strip off pointer casts, all-zero GEPs, address space casts, and aliases. + /// + /// Returns the original uncasted value. If this is called on a non-pointer + /// value, it returns 'this'. + const Value *stripPointerCastsAndAliases() const; + Value *stripPointerCastsAndAliases() { + return const_cast( + static_cast(this)->stripPointerCastsAndAliases()); + } + /// Strip off pointer casts, all-zero GEPs and address space casts /// but ensures the representation of the result stays the same. /// Modified: llvm/trunk/lib/IR/Value.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/IR/Value.cpp?rev=373929&r1=373928&r2=373929&view=diff ============================================================================== --- llvm/trunk/lib/IR/Value.cpp (original) +++ llvm/trunk/lib/IR/Value.cpp Mon Oct 7 10:28:03 2019 @@ -455,6 +455,7 @@ namespace { // Various metrics for how much to strip off of pointers. 
enum PointerStripKind { PSK_ZeroIndices, + PSK_ZeroIndicesAndAliases, PSK_ZeroIndicesSameRepresentation, PSK_ZeroIndicesAndInvariantGroups, PSK_InBoundsConstantIndices, @@ -475,6 +476,7 @@ static const Value *stripPointerCastsAnd if (auto *GEP = dyn_cast(V)) { switch (StripKind) { case PSK_ZeroIndices: + case PSK_ZeroIndicesAndAliases: case PSK_ZeroIndicesSameRepresentation: case PSK_ZeroIndicesAndInvariantGroups: if (!GEP->hasAllZeroIndices()) @@ -497,6 +499,8 @@ static const Value *stripPointerCastsAnd // TODO: If we know an address space cast will not change the // representation we could look through it here as well. V = cast(V)->getOperand(0); + } else if (StripKind == PSK_ZeroIndicesAndAliases && isa(V)) { + V = cast(V)->getAliasee(); } else { if (const auto *Call = dyn_cast(V)) { if (const Value *RV = Call->getReturnedArgOperand()) { @@ -526,6 +530,10 @@ const Value *Value::stripPointerCasts() return stripPointerCastsAndOffsets(this); } +const Value *Value::stripPointerCastsAndAliases() const { + return stripPointerCastsAndOffsets(this); +} + const Value *Value::stripPointerCastsSameRepresentation() const { return stripPointerCastsAndOffsets(this); } From llvm-commits at lists.llvm.org Mon Oct 7 10:25:40 2019 From: llvm-commits at lists.llvm.org (Sanjay Patel via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 17:25:40 +0000 (UTC) Subject: [PATCH] D67841: [SLP] avoid reduction transform on patterns that the backend can load-combine In-Reply-To: References: Message-ID: <47d34734b31d3fe6524fcd98dea1c05e@localhost.localdomain> spatel updated this revision to Diff 223625. spatel added a comment. Patch updated: Moved "using PatternMatch" line within function that uses that API. Side note: filed https://bugs.llvm.org/show_bug.cgi?id=43591 for larger questions about the cost model. I really don't want to hold up the underlying motivating bugs for this patch while we try to untangle that mess. 
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67841/new/ https://reviews.llvm.org/D67841 Files: llvm/include/llvm/Analysis/TargetTransformInfo.h llvm/include/llvm/Analysis/TargetTransformInfoImpl.h llvm/include/llvm/CodeGen/BasicTTIImpl.h llvm/lib/Analysis/TargetTransformInfo.cpp llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp llvm/test/Transforms/SLPVectorizer/X86/bad-reduction.ll From llvm-commits at lists.llvm.org Mon Oct 7 10:30:56 2019 From: llvm-commits at lists.llvm.org (Quentin Colombet via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 17:30:56 +0000 (UTC) Subject: [PATCH] D68582: GlobalISel: Add target pre-isel instructions In-Reply-To: References: Message-ID: qcolombet accepted this revision. qcolombet added inline comments. ================ Comment at: lib/CodeGen/GlobalISel/RegBankSelect.cpp:693 + // + // TODO: Remove opcode check. Should copy and others be marked pre-isel? + if (isTargetSpecificOpcode(MI.getOpcode()) && !MI.isPreISelOpcode()) ---------------- Replying to the TODO: No, I don't think they should because after ISel the expectation is that pre-isel opcodes are not present anymore. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68582/new/ https://reviews.llvm.org/D68582 From llvm-commits at lists.llvm.org Mon Oct 7 10:31:23 2019 From: llvm-commits at lists.llvm.org (Pavel Labath via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 17:31:23 +0000 (UTC) Subject: [PATCH] D68289: [lldb-server/android] Show more processes by relaxing some checks In-Reply-To: References: Message-ID: <301e2f176ef24993255b67f58d9694c5@localhost.localdomain> labath added a comment.
In D68289#1696654 , @jankratochvil wrote: > It has regressed on Linux Fedora 30 x86_64: > > lldb-Suite :: commands/process/attach/TestProcessAttach.py > lldb-Suite :: tools/lldb-vscode/attach/TestVSCode_attach.py > > > F10185566: 1 Should be fixed by r373925. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68289/new/ https://reviews.llvm.org/D68289 From llvm-commits at lists.llvm.org Mon Oct 7 10:33:46 2019 From: llvm-commits at lists.llvm.org (Kostya Kortchinsky via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 17:33:46 +0000 (UTC) Subject: [PATCH] D68471: [scudo][standalone] Correct releaseToOS behavior In-Reply-To: References: Message-ID: <148bac3cebda7b8f859db4388a94e3a6@localhost.localdomain> cryptoad marked 3 inline comments as done. cryptoad added inline comments. ================ Comment at: lib/scudo/standalone/tests/primary_test.cpp:195 +// test for an error in how the release criteria were computed. +template static void testReleaseToOS() { + auto Deleter = [](Primary *P) { ---------------- morehouse wrote: > cryptoad wrote: > > morehouse wrote: > > > Does this test the two cases mentioned in the description? > > > > > > - `< 1` page in use, `> 1` page in free list (should release) > > > - `< 1` page in free list (shouldn't release) > > > > > > > > > > > > > > Indeed, this tests the aforementioned first case, **not** the second one. > > I don't think I have a way to test that from here. One of the indicators being `LastReleaseAtNs` being updated (without released bytes), and it's not accessible. > > I'll see if I can toy with the prototype some more to bubble the information up the chain, unless you have an idea. > > > > > Is there a way to glean unmapped/released bytes from `GlobalStats`? If not, it seems like something worth adding to Scudo's telemetry. > > I also see there's already a `printStats` method. Perhaps we could modify it to print into a buffer where we could extract the released metadata. 
So the plan of record on my side:
- Landing this as is; fixing the issue at hand takes precedence.
- I am going to have to revisit how to fit the release stats in the mix: the primary's `releaseToOS` doesn't grab a cache, so we can't fit the stats in there. We could put the total of released bytes in the regions' data, but that departs from the model of all the other stats being in the caches. Maybe do that globally indeed; that will require more thought.
- For the `printStats` suggestion: it is something I have to do as well, for `mallocz` purposes for example. Technically this could be leveraged through a lit test in its current form I think, but right now I only have unit tests. So lit tests are also on my plate.

Repository: rCRT Compiler Runtime CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68471/new/ https://reviews.llvm.org/D68471 From llvm-commits at lists.llvm.org Mon Oct 7 10:37:39 2019 From: llvm-commits at lists.llvm.org (Kostya Kortchinsky via llvm-commits) Date: Mon, 07 Oct 2019 17:37:39 -0000 Subject: [compiler-rt] r373930 - [scudo][standalone] Correct releaseToOS behavior Message-ID: <20191007173739.35A6A8CC6A@lists.llvm.org> Author: cryptoad Date: Mon Oct 7 10:37:39 2019 New Revision: 373930 URL: http://llvm.org/viewvc/llvm-project?rev=373930&view=rev Log: [scudo][standalone] Correct releaseToOS behavior

Summary: There was an issue in `releaseToOSMaybe`: one of the criteria to decide if we should proceed with the release was wrong. Namely:

```
const uptr N = Sci->Stats.PoppedBlocks - Sci->Stats.PushedBlocks;
if (N * BlockSize < PageSize)
  return; // No chance to release anything.
```

I meant to check if the amount of bytes in the free list was lower than a page, but this actually checks if the amount of **in use** bytes was lower than a page.
The correct code is:

```
const uptr BytesInFreeList =
    Region->AllocatedUser -
    (Region->Stats.PoppedBlocks - Region->Stats.PushedBlocks) * BlockSize;
if (BytesInFreeList < PageSize)
  return 0; // No chance to release anything.
```

Consequences of the bug:
- if a class size has less than a page worth of in-use bytes (allocated or in a cache), reclaiming would not occur, whatever the amount of blocks in the free list; in real world scenarios this is unlikely to happen and be impactful;
- if a class size had less than a page worth of free bytes (and enough in-use bytes, etc), then reclaiming would be attempted, with likely no result. This means the reclaiming was overzealous at times.

I didn't have a good way to test for this, so I changed the prototype of the function to return the number of bytes released, which gives the test the information it needs. The test added fails with the initial criteria.

Another issue is that `ReleaseToOsInterval` can actually be 0, meaning we always try to release (side note: it's terrible for performance), so change a `> 0` check to `>= 0`.

Additionally, decrease the `CanRelease` threshold to `PageSize / 32`. I still have to make that configurable but I will do it at another time.

Finally, rename some variables in `printStats`: I feel like "available" was too ambiguous, so change it to "total".
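To make the two criteria from the summary concrete, here is a minimal standalone sketch. It is not the Scudo code: `ClassStats` and the helper functions are hypothetical stand-ins that only mirror the field names used in the commit (`AllocatedUser`, `PoppedBlocks`, `PushedBlocks`), and it assumes `AllocatedUser` counts all bytes carved into blocks for the class.

```cpp
#include <cassert>
#include <cstdint>

using uptr = std::uintptr_t;

// Hypothetical, simplified stand-in for the per-class bookkeeping.
struct ClassStats {
  uptr AllocatedUser; // bytes carved into blocks so far
  uptr PoppedBlocks;  // blocks taken out of the free list
  uptr PushedBlocks;  // blocks put back into the free list
};

// Bytes currently handed out (allocated or sitting in a cache).
uptr bytesInUse(const ClassStats &S, uptr BlockSize) {
  return (S.PoppedBlocks - S.PushedBlocks) * BlockSize;
}

// Bytes currently sitting in the free list.
uptr bytesInFreeList(const ClassStats &S, uptr BlockSize) {
  return S.AllocatedUser - bytesInUse(S, BlockSize);
}

// Original criterion: bails out when the *in-use* bytes are below a page.
bool oldCriterionSkips(const ClassStats &S, uptr BlockSize, uptr PageSize) {
  return bytesInUse(S, BlockSize) < PageSize;
}

// Corrected criterion: bails out when the *free-list* bytes are below a page.
bool newCriterionSkips(const ClassStats &S, uptr BlockSize, uptr PageSize) {
  return bytesInFreeList(S, BlockSize) < PageSize;
}
```

With 256-byte blocks, a 4096-byte page, and 64 blocks carved out, a class with a single block in use has 256 in-use bytes and 16128 free bytes: the old check skips the release even though almost four pages are free, while the new one proceeds. Conversely, with 60 blocks in use only 1024 bytes are free: the old check attempts a release that cannot succeed, and the new one skips it.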
Reviewers: morehouse, hctim, eugenis, vitalybuka, cferris Reviewed By: morehouse Subscribers: delcypher, #sanitizers, llvm-commits Tags: #llvm, #sanitizers Differential Revision: https://reviews.llvm.org/D68471 Modified: compiler-rt/trunk/lib/scudo/standalone/primary32.h compiler-rt/trunk/lib/scudo/standalone/primary64.h compiler-rt/trunk/lib/scudo/standalone/tests/primary_test.cpp Modified: compiler-rt/trunk/lib/scudo/standalone/primary32.h URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/scudo/standalone/primary32.h?rev=373930&r1=373929&r2=373930&view=diff ============================================================================== --- compiler-rt/trunk/lib/scudo/standalone/primary32.h (original) +++ compiler-rt/trunk/lib/scudo/standalone/primary32.h Mon Oct 7 10:37:39 2019 @@ -72,9 +72,9 @@ public: SizeClassInfo *Sci = getSizeClassInfo(I); Sci->RandState = getRandomU32(&Seed); // See comment in the 64-bit primary about releasing smaller size classes. - Sci->CanRelease = (ReleaseToOsInterval > 0) && + Sci->CanRelease = (ReleaseToOsInterval >= 0) && (I != SizeClassMap::BatchClassId) && - (getSizeByClassId(I) >= (PageSize / 16)); + (getSizeByClassId(I) >= (PageSize / 32)); } ReleaseToOsIntervalMs = ReleaseToOsInterval; } @@ -161,14 +161,16 @@ public: printStats(I, 0); } - void releaseToOS() { + uptr releaseToOS() { + uptr TotalReleasedBytes = 0; for (uptr I = 0; I < NumClasses; I++) { if (I == SizeClassMap::BatchClassId) continue; SizeClassInfo *Sci = getSizeClassInfo(I); ScopedLock L(Sci->Mutex); - releaseToOSMaybe(Sci, I, /*Force=*/true); + TotalReleasedBytes += releaseToOSMaybe(Sci, I, /*Force=*/true); } + return TotalReleasedBytes; } private: @@ -339,35 +341,38 @@ private: AvailableChunks, Rss >> 10); } - NOINLINE void releaseToOSMaybe(SizeClassInfo *Sci, uptr ClassId, + NOINLINE uptr releaseToOSMaybe(SizeClassInfo *Sci, uptr ClassId, bool Force = false) { const uptr BlockSize = getSizeByClassId(ClassId); const uptr PageSize = 
getPageSizeCached(); CHECK_GE(Sci->Stats.PoppedBlocks, Sci->Stats.PushedBlocks); - const uptr N = Sci->Stats.PoppedBlocks - Sci->Stats.PushedBlocks; - if (N * BlockSize < PageSize) - return; // No chance to release anything. + const uptr BytesInFreeList = + Sci->AllocatedUser - + (Sci->Stats.PoppedBlocks - Sci->Stats.PushedBlocks) * BlockSize; + if (BytesInFreeList < PageSize) + return 0; // No chance to release anything. if ((Sci->Stats.PushedBlocks - Sci->ReleaseInfo.PushedBlocksAtLastRelease) * BlockSize < PageSize) { - return; // Nothing new to release. + return 0; // Nothing new to release. } if (!Force) { const s32 IntervalMs = ReleaseToOsIntervalMs; if (IntervalMs < 0) - return; + return 0; if (Sci->ReleaseInfo.LastReleaseAtNs + static_cast(IntervalMs) * 1000000ULL > getMonotonicTime()) { - return; // Memory was returned recently. + return 0; // Memory was returned recently. } } // TODO(kostyak): currently not ideal as we loop over all regions and // iterate multiple times over the same freelist if a ClassId spans multiple // regions. But it will have to do for now. 
+ uptr TotalReleasedBytes = 0; for (uptr I = MinRegionIndex; I <= MaxRegionIndex; I++) { if (PossibleRegions[I] == ClassId) { ReleaseRecorder Recorder(I * RegionSize); @@ -377,10 +382,12 @@ private: Sci->ReleaseInfo.PushedBlocksAtLastRelease = Sci->Stats.PushedBlocks; Sci->ReleaseInfo.RangesReleased += Recorder.getReleasedRangesCount(); Sci->ReleaseInfo.LastReleasedBytes = Recorder.getReleasedBytes(); + TotalReleasedBytes += Sci->ReleaseInfo.LastReleasedBytes; } } } Sci->ReleaseInfo.LastReleaseAtNs = getMonotonicTime(); + return TotalReleasedBytes; } SizeClassInfo SizeClassInfoArray[NumClasses]; Modified: compiler-rt/trunk/lib/scudo/standalone/primary64.h URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/scudo/standalone/primary64.h?rev=373930&r1=373929&r2=373930&view=diff ============================================================================== --- compiler-rt/trunk/lib/scudo/standalone/primary64.h (original) +++ compiler-rt/trunk/lib/scudo/standalone/primary64.h Mon Oct 7 10:37:39 2019 @@ -79,9 +79,9 @@ public: // memory accesses which ends up being fairly costly. The current lower // limit is mostly arbitrary and based on empirical observations. 
// TODO(kostyak): make the lower limit a runtime option - Region->CanRelease = (ReleaseToOsInterval > 0) && + Region->CanRelease = (ReleaseToOsInterval >= 0) && (I != SizeClassMap::BatchClassId) && - (getSizeByClassId(I) >= (PageSize / 16)); + (getSizeByClassId(I) >= (PageSize / 32)); Region->RandState = getRandomU32(&Seed); } ReleaseToOsIntervalMs = ReleaseToOsInterval; @@ -167,14 +167,16 @@ public: printStats(I, 0); } - void releaseToOS() { + uptr releaseToOS() { + uptr TotalReleasedBytes = 0; for (uptr I = 0; I < NumClasses; I++) { if (I == SizeClassMap::BatchClassId) continue; RegionInfo *Region = getRegionInfo(I); ScopedLock L(Region->Mutex); - releaseToOSMaybe(Region, I, /*Force=*/true); + TotalReleasedBytes += releaseToOSMaybe(Region, I, /*Force=*/true); } + return TotalReleasedBytes; } private: @@ -259,7 +261,7 @@ private: const uptr MappedUser = Region->MappedUser; const uptr TotalUserBytes = Region->AllocatedUser + MaxCount * Size; // Map more space for blocks, if necessary. - if (LIKELY(TotalUserBytes > MappedUser)) { + if (TotalUserBytes > MappedUser) { // Do the mmap for the user memory. const uptr UserMapSize = roundUpTo(TotalUserBytes - MappedUser, MapSizeIncrement); @@ -325,43 +327,44 @@ private: if (Region->MappedUser == 0) return; const uptr InUse = Region->Stats.PoppedBlocks - Region->Stats.PushedBlocks; - const uptr AvailableChunks = - Region->AllocatedUser / getSizeByClassId(ClassId); + const uptr TotalChunks = Region->AllocatedUser / getSizeByClassId(ClassId); Printf("%s %02zu (%6zu): mapped: %6zuK popped: %7zu pushed: %7zu inuse: " - "%6zu avail: %6zu rss: %6zuK releases: %6zu last released: %6zuK " + "%6zu total: %6zu rss: %6zuK releases: %6zu last released: %6zuK " "region: 0x%zx (0x%zx)\n", Region->Exhausted ? 
"F" : " ", ClassId, getSizeByClassId(ClassId), Region->MappedUser >> 10, Region->Stats.PoppedBlocks, - Region->Stats.PushedBlocks, InUse, AvailableChunks, Rss >> 10, + Region->Stats.PushedBlocks, InUse, TotalChunks, Rss >> 10, Region->ReleaseInfo.RangesReleased, Region->ReleaseInfo.LastReleasedBytes >> 10, Region->RegionBeg, getRegionBaseByClassId(ClassId)); } - NOINLINE void releaseToOSMaybe(RegionInfo *Region, uptr ClassId, + NOINLINE uptr releaseToOSMaybe(RegionInfo *Region, uptr ClassId, bool Force = false) { const uptr BlockSize = getSizeByClassId(ClassId); const uptr PageSize = getPageSizeCached(); CHECK_GE(Region->Stats.PoppedBlocks, Region->Stats.PushedBlocks); - const uptr N = Region->Stats.PoppedBlocks - Region->Stats.PushedBlocks; - if (N * BlockSize < PageSize) - return; // No chance to release anything. + const uptr BytesInFreeList = + Region->AllocatedUser - + (Region->Stats.PoppedBlocks - Region->Stats.PushedBlocks) * BlockSize; + if (BytesInFreeList < PageSize) + return 0; // No chance to release anything. if ((Region->Stats.PushedBlocks - Region->ReleaseInfo.PushedBlocksAtLastRelease) * BlockSize < PageSize) { - return; // Nothing new to release. + return 0; // Nothing new to release. } if (!Force) { const s32 IntervalMs = ReleaseToOsIntervalMs; if (IntervalMs < 0) - return; + return 0; if (Region->ReleaseInfo.LastReleaseAtNs + static_cast(IntervalMs) * 1000000ULL > getMonotonicTime()) { - return; // Memory was returned recently. + return 0; // Memory was returned recently. 
} } @@ -377,6 +380,7 @@ private: Region->ReleaseInfo.LastReleasedBytes = Recorder.getReleasedBytes(); } Region->ReleaseInfo.LastReleaseAtNs = getMonotonicTime(); + return Recorder.getReleasedBytes(); } }; Modified: compiler-rt/trunk/lib/scudo/standalone/tests/primary_test.cpp URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/scudo/standalone/tests/primary_test.cpp?rev=373930&r1=373929&r2=373930&view=diff ============================================================================== --- compiler-rt/trunk/lib/scudo/standalone/tests/primary_test.cpp (original) +++ compiler-rt/trunk/lib/scudo/standalone/tests/primary_test.cpp Mon Oct 7 10:37:39 2019 @@ -188,3 +188,32 @@ TEST(ScudoPrimaryTest, PrimaryThreaded) testPrimaryThreaded>(); testPrimaryThreaded>(); } + +// Through a simple allocation that spans two pages, verify that releaseToOS +// actually releases some bytes (at least one page worth). This is a regression +// test for an error in how the release criteria were computed. 
+template static void testReleaseToOS() { + auto Deleter = [](Primary *P) { + P->unmapTestOnly(); + delete P; + }; + std::unique_ptr Allocator(new Primary, Deleter); + Allocator->init(/*ReleaseToOsInterval=*/-1); + typename Primary::CacheT Cache; + Cache.init(nullptr, Allocator.get()); + const scudo::uptr Size = scudo::getPageSizeCached() * 2; + EXPECT_TRUE(Primary::canAllocate(Size)); + const scudo::uptr ClassId = + Primary::SizeClassMap::getClassIdBySize(Size); + void *P = Cache.allocate(ClassId); + EXPECT_NE(P, nullptr); + Cache.deallocate(ClassId, P); + Cache.destroy(nullptr); + EXPECT_GT(Allocator->releaseToOS(), 0U); +} + +TEST(ScudoPrimaryTest, ReleaseToOS) { + using SizeClassMap = scudo::DefaultSizeClassMap; + testReleaseToOS>(); + testReleaseToOS>(); +} From llvm-commits at lists.llvm.org Mon Oct 7 10:35:35 2019 From: llvm-commits at lists.llvm.org (Stanislav Mekhanoshin via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 17:35:35 +0000 (UTC) Subject: [PATCH] D68583: AMDGPU: Fix i16 arithmetic pattern redundancy In-Reply-To: References: Message-ID: rampitec accepted this revision. rampitec added a comment. This revision is now accepted and ready to land. LGTM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68583/new/ https://reviews.llvm.org/D68583 From llvm-commits at lists.llvm.org Mon Oct 7 10:35:44 2019 From: llvm-commits at lists.llvm.org (Momchil Velikov via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 17:35:44 +0000 (UTC) Subject: [PATCH] D68469: [AArch64] Ensure no tagged memory is left in the unallocated portion of the stack In-Reply-To: References: Message-ID: chill updated this revision to Diff 223626. chill edited the summary of this revision. chill added a comment. Updated to not require dominance/post-dominance unconditionally. Removed dependency on the parent patch and will place untag operations in front of `resume`, for now. 
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68469/new/ https://reviews.llvm.org/D68469 Files: llvm/lib/Target/AArch64/AArch64StackTagging.cpp llvm/test/CodeGen/AArch64/stack-tagging-ex-1.ll llvm/test/CodeGen/AArch64/stack-tagging-ex-2.ll llvm/test/CodeGen/AArch64/stack-tagging-untag-placement.ll From llvm-commits at lists.llvm.org Mon Oct 7 10:44:46 2019 From: llvm-commits at lists.llvm.org (Daniel Sanders via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 17:44:46 +0000 (UTC) Subject: [PATCH] D68538: GlobalISel: Partially implement lower for G_INSERT In-Reply-To: References: Message-ID: <7d61a3a7cda21e6a4c2ebb79d0eae890@localhost.localdomain> dsanders accepted this revision. dsanders added a comment. This revision is now accepted and ready to land. LGTM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68538/new/ https://reviews.llvm.org/D68538 From llvm-commits at lists.llvm.org Mon Oct 7 10:55:55 2019 From: llvm-commits at lists.llvm.org (Vedant Kumar via llvm-commits) Date: Mon, 07 Oct 2019 10:55:55 -0700 Subject: [PATCH] D68351: [profile] Add a mode to continuously sync counter updates to a file In-Reply-To: References: <21d87b18b3a2775fd21385551c3dad8b@localhost.localdomain> <93B7AFA0-4603-4798-9F48-104BA466F9EA@apple.com> <01855620-E786-4FEE-87CD-8D1994E60DDD@apple.com> Message-ID: <383C7811-167A-4EBB-801F-1D6B7C3A4D2C@apple.com> > On Oct 5, 2019, at 1:03 PM, Petr Hosek wrote: > > I've considered omitting the `__llvm_prf_cnts` section, the only downside of doing so is you need to make sure that the space for counters is allocated before you execute any instrumented code. That makes it tricky if you want to instrument libc which is something we do today.
> > We can work around it though: a simple solution is to avoid collecting profiles for libc, a better solution would be to carefully annotate all libc functions that are executed before constructors to ensure that they aren't instrumented e.g. by introducing a no_profile attribute, this is the same approach we use ASan by annotating all function on the early startup path with __attribute__((no_sanitize("address"))). I think the `no_profile` approach makes sense. One downside is that in non-continuous mode, annotated libc functions will not be profiled. Some alternatives are 1) having a more specific `no_profile_in_continuous_mode` annotation, or 2) to not instrument libc in continuous mode and rely on libc unittests which run in non-continuous mode only. It’s probably a bit easier to support value profiling with (2), if that’s a concern. vedant > > On Fri, Oct 4, 2019 at 4:43 PM Xinliang David Li > wrote: > > > On Fri, Oct 4, 2019 at 4:35 PM > wrote: > > >> On Oct 4, 2019, at 3:43 PM, Petr Hosek > wrote: >> >> On Fri, Oct 4, 2019 at 1:44 PM Xinliang David Li > wrote: >> I will let Petr chime in explaining more details -- the reply I provided are based on my understanding of 'Reloc' proposal, which may be quite off. >> >> David >> >> On Fri, Oct 4, 2019 at 1:28 PM > wrote: >> >> >>> On Oct 4, 2019, at 10:30 AM, Xinliang David Li > wrote: >>> >>> Petr's method is more like 'relocating' the counter memory from static allocated memory to mmapped memory, but better be named 'reloc' method. >> >> Thank you, that's a much better name. >> >> >>> Now lets do some comparison: >>> >>> 1) reloc method allows runtime to mmap from the start of the raw profile, so there is no need to page align the counter data in the file; >> >> Let me see if I follow. Is the idea to allocate a buffer large enough to contain the entire contiguous raw profile, in each instrumented image? 
I can see how it would be possible to mmap() a single such buffer onto a raw profile without any extra padding or format changes. However, this buffer allocation entails doubling the memory overhead of the `__llvm_prf_{cnts,data,names}` sections, as these would all need to be copied into the buffer-to-be-mmap'd. This may not work on well on iOS, as the devices we support are quite memory-constrained. The per-process memory limit on iOS is ~1.4GB on many devices -- clients like News.app or Music.app already get very close to this limit. >> >> Also, with more than one instrumented image (DSOs), I believe page-alignment padding is still necessary so that each in-memory raw profile buffer can be mmap()'d onto a page-aligned file offset. >> >> Perhaps that is a non-issue, as it can be solved by simply creating N different raw profiles -- one per instrumented image. Actually this should work for both proposed schemes. >> >> David's understanding is correct, my plan is to mmap the entire raw profile from the start which means that there's no requirement for page alignment. Alternative would be to mmap only a page-aligned portion of the raw file of file that contains counters, i.e. we would mmap [__llvm_prf_cnts & -PAGE_SIZE, (__llvm_prf_cnts + __llvm_prf_cnts_length + PAGE_SIZE - 1) & -PAGE_SIZE). There's no need for buffer allocation, we can simply mmap the file directly as read/writable region. >>> 2) reloc method does have more instrumentation overhead, however it is not as large as we think. >>> a) the bias load can be done at the start of the function >>> b) on x86, the addressing mode kicks in, there is actually no size difference in terms of updating: >>> addq $1, counter+16(%rip) >>> vs >>> addq $1, counter+16(%rax) >>> >>> both are 7 bytes long. >>> >>> so the overhead will be a few bytes more per function. >> >> I see, thanks for this correction! >> >> Thanks for the suggestion David! 
The prototype implementation I have loads the bias/reloc on every counter update but loading the bias/reloc at the start of the function is going to be more efficient. I'll update the implementation and try to get some numbers by compiling instrumented Clang. >>> 3) reloc method produces smaller sized raw profile >> >> I'm interested in understanding how major of an issue this is for Google's use cases. On iOS/macOS, the worst case overhead of section alignment padding is < 32K/8K respectively per image (due to 16K/4K pages). What's the expected file size overhead for Google? Is a much larger page size in use? If so, is switching to a smaller page size when running instrumented binaries an option? >> >> In addition to understanding this, per my last comment to Petr, I would also like to understand the specific security concerns w.r.t mapping a section onto a file. If this is something Darwin should not allow, I would like to let our kernel team know! >> >> It's not a specific security concern, rather it's a consequence of Fuchsia's design/capability model. In Fuchsia, the address space (sub)regions are objects which are accessed and "operated on" through handles (capabilities). When loading modules (executables and DSOs), our dynamic linker creates a new subregion for each one, mmaps individual segments into those subregions and then closes all subregion handles. That doesn't unmap those files, but it means that the address space layout can never be modified because there's no way to recover those handles. That's a nice property because it means that all module mapping is immutable (e.g. it's impossible to make any of the executable segments writable or even readable if you use execute-only memory on AArch64). However, it also means that we cannot mmap anything over those segments as would be needed for the continuous mode. 
The only way to make this work is to avoid closing the subregion handles and keep those for later use, but that would also make our dynamic linker a potential attack vector. >> >> More generally, relying on the overmap behavior means tying the implementation to the OS-specific API which is available on Linux and Darwin, but may not be available on other OSes (e.g. I'm not even sure if this is possible on Windows). It'd be nice to implement a solution that could be made to work on all OSes. > > Ok, I'm convinced that the reloc mode is the more portable approach. Per David's comments (the one about not actually emitting `__llvm_prf_cnts` in the binary //especially//, although I haven't quoted it here), reloc mode can probably be implemented reasonably efficiently on Linux/Darwin. The runtime can inspect `__llvm_prf_data` to determine the number of counters, mmap() a sufficient amount of memory, and then update the CounterPtr fields of all the data entries. This means the instrumentation would need to look like: https://godbolt.org/z/-PvObE . > > > incr3 version looks great. > > @Petr I'd be happy to work together on this. Let me know if you'd like any help prototyping/testing, perhaps we can split up some of the work. > > One thing we haven't really resolved is the conversation about not changing the raw profile format. Fwiw, I don't think avoiding changes to the raw profile format should be a design goal, as the point of having separate raw/indexed formats is that the raw format should be cheap to rev. I think that I've demonstrated that the file size overhead from adding section alignment padding is small: I suspect that getting rid of this would introduce a considerable amount of complexity to any kind of mmap()-mode implementation. > > > if there is a need to change raw format to make things more efficient, go for it. > > David > > > Anyway, I'm excited to see your results. Please keep me in the loop! 
> > thanks, > vedant > > >>> 4) The continuous method has the advantage of being more efficient in running speed. >> >> As pointed out above, the reduced memory overhead is another (perhaps more significant) factor. >> >> >>> Also, I believe both methods should support on-line merging -- as the profile update happens 'In-place' -- the online merging code can be simplified (no locking, reading, updating are needed). >> >> Thanks for pointing this out. In my last experiment, I added a `while (true) { print-counters; sleep(...) }` loop to `darwin-proof-of-concept.c` after it set up the mmap(), and then edited the counter section of the profile using a hex editor. The updates did not appear in the `proof-of-concept` process. But perhaps there is a more sophisticated way to share the physical pages backing `__llvm_prf_cnts` between processes, while they are also mapped onto the filesystem. I need to run more experiments to understand this. >> >> vedant >> >> >>> >>> David >>> >>> >>> >>> >>> >>> >>> >>> >>> On Thu, Oct 3, 2019 at 4:18 PM Vedant Kumar via Phabricator via llvm-commits > wrote: >>> vsk added a comment. >>> >>> In D68351#1693693 >, @phosek wrote: >>> >>> > In D68351#1693307 >, @davidxl wrote: >>> > >>> > > +petr who has similar needs for Fuchia platform >>> > >>> > >>> > Thank you for including me, we have exactly the same use case in Fuchsia and this is the solution I've initially considered and been experimenting with locally based on the discussion with @davidxl. However, after internal discussion we've decided against this solution for two reasons: >>> > >>> > 1. This requires the ability to mmap the output file over the (portion of) binary which is something we don't allow in Fuchsia for security reasons; once all modules have been mapped in by the dynamic linker, we don't allow any further changes to their mapping and using this solution would require special dynamic linker which is possible but highly undesirable. >>> > 2. 
It introduces bloat to both the binary and the output file. It also complicates the runtime implementation due to alignment and padding requirements. >>> > >>> > The alternative solution we've came up and that I've now been implementing is to change the instrumentation to allow extra level of indirection. Concretely, the instrumentation conceptually does the following: `c = __profc_foo[idx]; c++; __profc_foo[idx] = c`. We'd like to change this `c = *(&__profc[idx] + *bias); c++; *(&__profc[idx] + *bias) = c` where `bias` is a global variable set by the runtime to be the offset between the `__llvm_prf_cnts` section and the corresponding location in the file that's mmapped in. Initially, that offset would be `0` which would result in exactly the same behavior as today, but the runtime can mmap the output file into address space and then change the offset to make the counters be continuously updated. >>> > >>> > The advantage of this solution is that there are no changes needed to the profile format. It also doesn't require mmapping the output file over the binary, the output file can be mmapped anywhere in the address space. The disadvantage is extra overhead since instrumentation is going to be slightly more complicated, although we don't have any numbers yet to quantify how much slower it's going to be. The implementation should be fairly minimal and my tentative plan was to gate it on compiler switch, so it wouldn't affect existing in any way (modulo introducing one new variable in the runtime to hold the bias). I'm hoping to have the prototype ready and uploaded for review within the next few days. What's your opinion on this idea? Would this be something that you'd be interested in as an alternative approach? >>> >>> >>> @phosek thank you for sharing this alternative. I hope you don't mind my calling this the bias method, after the term in the proposed instrumentation. >>> >>> The TLDR is that I don't think the bias method is a good fit for Linux/Darwin. 
Imho, this method won't reduce the complexity of continuous mode, and its instrumentation overhead is likely to outweigh the savings from reduced section alignment. Let me try and justify these claims :). >>> >>> First and most critically, note that (on Linux & Darwin, at least), mmap() requires that the file `offset` argument be page-aligned. So, I don't believe there's a way to avoid changing the raw profile format, or to avoid the concomitant complexity necessary to calculate padding bytes in the runtime. I don't see how the bias method solves this problem: this seems to be a fundamental limitation of mmap(). >>> >>> Second, note that the size overhead of counter increment instrumentation dwarfs the alignment overhead for `__llvm_prf_{cnts,data}` for all but the smallest of programs. Assuming 16K pages, in the worst case, the section alignment costs 32K per image. Assuming an increment can be encoded in 7 bytes as `incq l___profc_main(%rip)`, this section alignment cost is equivalent to ~4700 increments. For comparison, this is roughly the number of counter increments in //FileCheck.cpp.o//. If the size overhead of counter instrumentation were to double, as I believe it would with the bias method, it would rapidly erase the memory savings from eliminating the section alignment requirement. >>> >>> Third, I'm not sure I understand the security constraints in Fuchsia that make it undesirable to map the contents of a section onto a file. Is the concern that an attacker process may update the file, thereby changing the in-memory contents of the section? I'll note that on Darwin, at least, the kernel does not permit this kind of sharing. If an attacker process modifies a file on disk that has the in-memory contents of a section MAP_SHARED over it, the in-memory contents of the mapped section are not changed (I verified this by making a small modification to `darwin-proof-of-concept.c`). 
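The size comparison in the second point above can be checked with a short calculation. The sketch below only restates the figures from the message (16K pages, a 32K worst case per image, 7 bytes per increment). The constant names are hypothetical, and reading the 32K figure as one page for each of the two aligned sections is an assumption.

```cpp
#include <cstdint>

// Figures taken from the message; the names are illustrative only.
constexpr std::uint64_t PageSize = 16 * 1024;  // assumed 16K pages
constexpr std::uint64_t AlignedSections = 2;   // assumption: __llvm_prf_cnts and __llvm_prf_data
constexpr std::uint64_t WorstCaseAlignmentCost = AlignedSections * PageSize;  // 32768 bytes
constexpr std::uint64_t BytesPerIncrement = 7; // `incq l___profc_main(%rip)`

// Number of counter increments whose code size matches the alignment cost.
constexpr std::uint64_t equivalentIncrements() {
  return WorstCaseAlignmentCost / BytesPerIncrement;  // 32768 / 7 == 4681
}
```

The result, 4681, matches the "~4700 increments" quoted in the message.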
If there is some security aspect to the problem I'm missing here, could you please elaborate on it? I'm hoping that the bias method will not be necessary to support continuous mode on Fuchsia, but it seems this depends on the answer to the security question. Either way, istm that the high level approach taken in this patch as-written is a better fit for Linux/Darwin, and moreover probably provides support that can be used to implement the bias method. >>> >>> I appreciate your feedback and think we can definitely work together to build some kind of common solution! >>> >>> >>> CHANGES SINCE LAST ACTION >>> https://reviews.llvm.org/D68351/new/ >>> >>> https://reviews.llvm.org/D68351 >>> >>> >>> >>> _______________________________________________ >>> llvm-commits mailing list >>> llvm-commits at lists.llvm.org >>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits -------------- next part -------------- An HTML attachment was scrubbed... URL: From llvm-commits at lists.llvm.org Mon Oct 7 10:56:29 2019 From: llvm-commits at lists.llvm.org (Matt Arsenault via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 17:56:29 +0000 (UTC) Subject: [PATCH] D68585: AMDGPU/GlobalISel: Handle flat/global G_ATOMIC_CMPXCHG Message-ID: arsenm created this revision. arsenm added reviewers: tstellar, nhaehnle, kerbowa. Herald added subscribers: Petar.Avramovic, jfb, t-tye, tpr, dstuttard, rovka, yaxunl, wdng, jvesely, kzhuravl. Custom lower this to a target instruction with the merge operands. I think it might be better to directly select this and emit a REG_SEQUENCE, but this would be more work since it would require splitting the tablegen patterns for these cases from the other atomics. 
https://reviews.llvm.org/D68585 Files: lib/Target/AMDGPU/AMDGPUInstructions.td lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp lib/Target/AMDGPU/FLATInstructions.td test/CodeGen/AMDGPU/GlobalISel/inst-select-amdgpu-atomic-cmpxchg-global.mir test/CodeGen/AMDGPU/GlobalISel/legalize-atomic-cmpxchg-with-success.mir -------------- next part -------------- A non-text attachment was scrubbed... Name: D68585.223627.patch Type: text/x-patch Size: 36900 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 11:04:14 2019 From: llvm-commits at lists.llvm.org (Jan Korous via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 18:04:14 +0000 (UTC) Subject: [PATCH] D68093: [clang-scan-deps][static analyzer] Support for clang --analyze in scan-deps In-Reply-To: References: Message-ID: <97c3d3cdbadf00c748a59dd1464fb9b6@localhost.localdomain> jkorous updated this revision to Diff 223628. jkorous marked 3 inline comments as done. jkorous added a comment. Addressed comment. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68093/new/ https://reviews.llvm.org/D68093 Files: clang-tools-extra/clang-tidy/ClangTidy.cpp clang/include/clang/Driver/CC1Options.td clang/include/clang/Lex/PreprocessorOptions.h clang/lib/Driver/ToolChains/Clang.cpp clang/lib/Frontend/CompilerInvocation.cpp clang/lib/Frontend/InitPreprocessor.cpp clang/test/Analysis/preprocessor-setup.c clang/test/ClangScanDeps/Inputs/static-analyzer-cdb.json clang/test/ClangScanDeps/static-analyzer.c llvm/utils/lit/lit/llvm/config.py -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68093.223628.patch Type: text/x-patch Size: 7862 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 11:05:27 2019 From: llvm-commits at lists.llvm.org (Matt Arsenault via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 18:05:27 +0000 (UTC) Subject: [PATCH] D68309: GlobalISel: Implement widenScalar for G_INSERT_VECTOR_ELT In-Reply-To: References: Message-ID: arsenm added a comment. ping CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68309/new/ https://reviews.llvm.org/D68309 From llvm-commits at lists.llvm.org Mon Oct 7 11:07:38 2019 From: llvm-commits at lists.llvm.org (Nico Weber via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 18:07:38 +0000 (UTC) Subject: [PATCH] D68570: Unify the two CRC implementations In-Reply-To: References: Message-ID: <8af6dcc9bbe2434561a500e38e92f16c@localhost.localdomain> thakis added a comment. Also, in practice most clients will build against zlib and not see the tables. +1 to the current approach :) CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68570/new/ https://reviews.llvm.org/D68570 From llvm-commits at lists.llvm.org Mon Oct 7 11:08:49 2019 From: llvm-commits at lists.llvm.org (Cyndy Ishida via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 18:08:49 +0000 (UTC) Subject: [PATCH] D67529: [TextAPI] Introduce TBDv4 In-Reply-To: References: Message-ID: cishida updated this revision to Diff 223629. cishida added a comment. Add simulator + tests Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67529/new/ https://reviews.llvm.org/D67529 Files: llvm/include/llvm/TextAPI/MachO/InterfaceFile.h llvm/include/llvm/TextAPI/MachO/Symbol.h llvm/include/llvm/TextAPI/MachO/Target.h llvm/lib/TextAPI/MachO/Target.cpp llvm/lib/TextAPI/MachO/TextStub.cpp llvm/lib/TextAPI/MachO/TextStubCommon.cpp llvm/unittests/TextAPI/CMakeLists.txt llvm/unittests/TextAPI/TextStubV4Tests.cpp -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D67529.223629.patch Type: text/x-patch Size: 50961 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 11:14:24 2019 From: llvm-commits at lists.llvm.org (Jordan Rose via llvm-commits) Date: Mon, 07 Oct 2019 18:14:24 -0000 Subject: [llvm] r373935 - Second attempt to add iterator_range::empty() Message-ID: <20191007181424.DB26A883ED@lists.llvm.org> Author: jrose Date: Mon Oct 7 11:14:24 2019 New Revision: 373935 URL: http://llvm.org/viewvc/llvm-project?rev=373935&view=rev Log: Second attempt to add iterator_range::empty() Doing this makes MSVC complain that `empty(someRange)` could refer to either C++17's std::empty or LLVM's llvm::empty, which previously we avoided via SFINAE because std::empty is defined in terms of an empty member rather than begin and end. So, switch callers over to the new method as it is added. https://reviews.llvm.org/D68439 Modified: llvm/trunk/include/llvm/ADT/iterator_range.h llvm/trunk/lib/Analysis/LazyCallGraph.cpp llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.cpp llvm/trunk/lib/CodeGen/GlobalISel/InstructionSelector.cpp llvm/trunk/lib/CodeGen/GlobalISel/LegalizerInfo.cpp llvm/trunk/lib/CodeGen/GlobalISel/RegBankSelect.cpp llvm/trunk/lib/CodeGen/GlobalISel/RegisterBankInfo.cpp llvm/trunk/lib/CodeGen/MachineModuleInfo.cpp llvm/trunk/lib/ExecutionEngine/Orc/ExecutionUtils.cpp llvm/trunk/lib/IR/DebugInfo.cpp llvm/trunk/lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp llvm/trunk/lib/Target/BPF/BPFAbstractMemberAccess.cpp llvm/trunk/lib/Target/BPF/BPFAsmPrinter.cpp llvm/trunk/lib/Target/PowerPC/PPCInstrInfo.cpp llvm/trunk/lib/Transforms/IPO/PartialInlining.cpp llvm/trunk/lib/Transforms/Scalar/IndVarSimplify.cpp llvm/trunk/lib/Transforms/Scalar/NewGVN.cpp llvm/trunk/lib/Transforms/Utils/PredicateInfo.cpp llvm/trunk/lib/Transforms/Utils/SimplifyCFG.cpp Modified: llvm/trunk/include/llvm/ADT/iterator_range.h URL: 
http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/ADT/iterator_range.h?rev=373935&r1=373934&r2=373935&view=diff ============================================================================== --- llvm/trunk/include/llvm/ADT/iterator_range.h (original) +++ llvm/trunk/include/llvm/ADT/iterator_range.h Mon Oct 7 11:14:24 2019 @@ -44,6 +44,7 @@ public: IteratorT begin() const { return begin_iterator; } IteratorT end() const { return end_iterator; } + bool empty() const { return begin_iterator == end_iterator; } }; /// Convenience function for iterating over sub-ranges. Modified: llvm/trunk/lib/Analysis/LazyCallGraph.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/LazyCallGraph.cpp?rev=373935&r1=373934&r2=373935&view=diff ============================================================================== --- llvm/trunk/lib/Analysis/LazyCallGraph.cpp (original) +++ llvm/trunk/lib/Analysis/LazyCallGraph.cpp Mon Oct 7 11:14:24 2019 @@ -632,7 +632,7 @@ LazyCallGraph::RefSCC::switchInternalEdg // If the merge range is empty, then adding the edge didn't actually form any // new cycles. We're done. - if (empty(MergeRange)) { + if (MergeRange.empty()) { // Now that the SCC structure is finalized, flip the kind to call. SourceN->setEdgeKind(TargetN, Edge::Call); return false; // No new cycle. Modified: llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.cpp?rev=373935&r1=373934&r2=373935&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.cpp (original) +++ llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.cpp Mon Oct 7 11:14:24 2019 @@ -1054,7 +1054,7 @@ void DwarfDebug::finalizeModuleInfo() { // If we're splitting the dwarf out now that we've got the entire // CU then add the dwo id to it. 
auto *SkCU = TheCU.getSkeleton(); - if (useSplitDwarf() && !empty(TheCU.getUnitDie().children())) { + if (useSplitDwarf() && !TheCU.getUnitDie().children().empty()) { finishUnitAttributes(TheCU.getCUNode(), TheCU); TheCU.addString(TheCU.getUnitDie(), dwarf::DW_AT_GNU_dwo_name, Asm->TM.Options.MCOptions.SplitDwarfFile); @@ -1106,7 +1106,7 @@ void DwarfDebug::finalizeModuleInfo() { // is a bit pessimistic under LTO. if (!AddrPool.isEmpty() && (getDwarfVersion() >= 5 || - (SkCU && !empty(TheCU.getUnitDie().children())))) + (SkCU && !TheCU.getUnitDie().children().empty()))) U.addAddrTableBase(); if (getDwarfVersion() >= 5) { Modified: llvm/trunk/lib/CodeGen/GlobalISel/InstructionSelector.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/GlobalISel/InstructionSelector.cpp?rev=373935&r1=373934&r2=373935&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/GlobalISel/InstructionSelector.cpp (original) +++ llvm/trunk/lib/CodeGen/GlobalISel/InstructionSelector.cpp Mon Oct 7 11:14:24 2019 @@ -79,5 +79,5 @@ bool InstructionSelector::isObviouslySaf return true; return !MI.mayLoadOrStore() && !MI.mayRaiseFPException() && - !MI.hasUnmodeledSideEffects() && empty(MI.implicit_operands()); + !MI.hasUnmodeledSideEffects() && MI.implicit_operands().empty(); } Modified: llvm/trunk/lib/CodeGen/GlobalISel/LegalizerInfo.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/GlobalISel/LegalizerInfo.cpp?rev=373935&r1=373934&r2=373935&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/GlobalISel/LegalizerInfo.cpp (original) +++ llvm/trunk/lib/CodeGen/GlobalISel/LegalizerInfo.cpp Mon Oct 7 11:14:24 2019 @@ -433,7 +433,7 @@ LegalizeRuleSet &LegalizerInfo::getActio std::initializer_list Opcodes) { unsigned Representative = *Opcodes.begin(); - assert(!empty(Opcodes) && Opcodes.begin() + 1 != Opcodes.end() && + 
assert(!llvm::empty(Opcodes) && Opcodes.begin() + 1 != Opcodes.end() && "Initializer list must have at least two opcodes"); for (auto I = Opcodes.begin() + 1, E = Opcodes.end(); I != E; ++I) Modified: llvm/trunk/lib/CodeGen/GlobalISel/RegBankSelect.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/GlobalISel/RegBankSelect.cpp?rev=373935&r1=373934&r2=373935&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/GlobalISel/RegBankSelect.cpp (original) +++ llvm/trunk/lib/CodeGen/GlobalISel/RegBankSelect.cpp Mon Oct 7 11:14:24 2019 @@ -139,7 +139,7 @@ bool RegBankSelect::repairReg( "need new vreg for each breakdown"); // An empty range of new register means no repairing. - assert(!empty(NewVRegs) && "We should not have to repair"); + assert(!NewVRegs.empty() && "We should not have to repair"); MachineInstr *MI; if (ValMapping.NumBreakDowns == 1) { Modified: llvm/trunk/lib/CodeGen/GlobalISel/RegisterBankInfo.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/GlobalISel/RegisterBankInfo.cpp?rev=373935&r1=373934&r2=373935&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/GlobalISel/RegisterBankInfo.cpp (original) +++ llvm/trunk/lib/CodeGen/GlobalISel/RegisterBankInfo.cpp Mon Oct 7 11:14:24 2019 @@ -455,7 +455,7 @@ void RegisterBankInfo::applyDefaultMappi "This mapping is too complex for this function"); iterator_range::const_iterator> NewRegs = OpdMapper.getVRegs(OpIdx); - if (empty(NewRegs)) { + if (NewRegs.empty()) { LLVM_DEBUG(dbgs() << " has not been repaired, nothing to be done\n"); continue; } Modified: llvm/trunk/lib/CodeGen/MachineModuleInfo.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/MachineModuleInfo.cpp?rev=373935&r1=373934&r2=373935&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/MachineModuleInfo.cpp 
(original) +++ llvm/trunk/lib/CodeGen/MachineModuleInfo.cpp Mon Oct 7 11:14:24 2019 @@ -346,7 +346,7 @@ char MachineModuleInfoWrapperPass::ID = bool MachineModuleInfoWrapperPass::doInitialization(Module &M) { MMI.initialize(); MMI.TheModule = &M; - MMI.DbgInfoAvailable = !empty(M.debug_compile_units()); + MMI.DbgInfoAvailable = !M.debug_compile_units().empty(); return false; } @@ -361,6 +361,6 @@ MachineModuleInfo MachineModuleAnalysis: ModuleAnalysisManager &) { MachineModuleInfo MMI(TM); MMI.TheModule = &M; - MMI.DbgInfoAvailable = !empty(M.debug_compile_units()); + MMI.DbgInfoAvailable = !M.debug_compile_units().empty(); return MMI; } Modified: llvm/trunk/lib/ExecutionEngine/Orc/ExecutionUtils.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/ExecutionEngine/Orc/ExecutionUtils.cpp?rev=373935&r1=373934&r2=373935&view=diff ============================================================================== --- llvm/trunk/lib/ExecutionEngine/Orc/ExecutionUtils.cpp (original) +++ llvm/trunk/lib/ExecutionEngine/Orc/ExecutionUtils.cpp Mon Oct 7 11:14:24 2019 @@ -88,7 +88,7 @@ iterator_range getDest } void CtorDtorRunner::add(iterator_range CtorDtors) { - if (empty(CtorDtors)) + if (CtorDtors.empty()) return; MangleAndInterner Mangle( Modified: llvm/trunk/lib/IR/DebugInfo.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/IR/DebugInfo.cpp?rev=373935&r1=373934&r2=373935&view=diff ============================================================================== --- llvm/trunk/lib/IR/DebugInfo.cpp (original) +++ llvm/trunk/lib/IR/DebugInfo.cpp Mon Oct 7 11:14:24 2019 @@ -279,7 +279,7 @@ bool DebugInfoFinder::addScope(DIScope * } static MDNode *stripDebugLocFromLoopID(MDNode *N) { - assert(!empty(N->operands()) && "Missing self reference?"); + assert(!N->operands().empty() && "Missing self reference?"); // if there is no debug location, we do not have to rewrite this MDNode. 
if (std::none_of(N->op_begin() + 1, N->op_end(), [](const MDOperand &Op) { Modified: llvm/trunk/lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp?rev=373935&r1=373934&r2=373935&view=diff ============================================================================== --- llvm/trunk/lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp (original) +++ llvm/trunk/lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp Mon Oct 7 11:14:24 2019 @@ -1588,7 +1588,7 @@ void AMDGPURegisterBankInfo::applyMappin if (DstTy != LLT::vector(2, 16)) break; - assert(MI.getNumOperands() == 3 && empty(OpdMapper.getVRegs(0))); + assert(MI.getNumOperands() == 3 && OpdMapper.getVRegs(0).empty()); substituteSimpleCopyRegs(OpdMapper, 1); substituteSimpleCopyRegs(OpdMapper, 2); @@ -1644,7 +1644,7 @@ void AMDGPURegisterBankInfo::applyMappin case AMDGPU::G_EXTRACT_VECTOR_ELT: { SmallVector DstRegs(OpdMapper.getVRegs(0)); - assert(empty(OpdMapper.getVRegs(1)) && empty(OpdMapper.getVRegs(2))); + assert(OpdMapper.getVRegs(1).empty() && OpdMapper.getVRegs(2).empty()); if (DstRegs.empty()) { applyDefaultMapping(OpdMapper); @@ -1708,9 +1708,9 @@ void AMDGPURegisterBankInfo::applyMappin case AMDGPU::G_INSERT_VECTOR_ELT: { SmallVector InsRegs(OpdMapper.getVRegs(2)); - assert(empty(OpdMapper.getVRegs(0))); - assert(empty(OpdMapper.getVRegs(1))); - assert(empty(OpdMapper.getVRegs(3))); + assert(OpdMapper.getVRegs(0).empty()); + assert(OpdMapper.getVRegs(1).empty()); + assert(OpdMapper.getVRegs(3).empty()); if (InsRegs.empty()) { applyDefaultMapping(OpdMapper); @@ -1785,8 +1785,8 @@ void AMDGPURegisterBankInfo::applyMappin case Intrinsic::amdgcn_readlane: { substituteSimpleCopyRegs(OpdMapper, 2); - assert(empty(OpdMapper.getVRegs(0))); - assert(empty(OpdMapper.getVRegs(3))); + assert(OpdMapper.getVRegs(0).empty()); + assert(OpdMapper.getVRegs(3).empty()); // Make sure the index is an SGPR. 
It doesn't make sense to run this in a // waterfall loop, so assume it's a uniform value. @@ -1794,9 +1794,9 @@ void AMDGPURegisterBankInfo::applyMappin return; } case Intrinsic::amdgcn_writelane: { - assert(empty(OpdMapper.getVRegs(0))); - assert(empty(OpdMapper.getVRegs(2))); - assert(empty(OpdMapper.getVRegs(3))); + assert(OpdMapper.getVRegs(0).empty()); + assert(OpdMapper.getVRegs(2).empty()); + assert(OpdMapper.getVRegs(3).empty()); substituteSimpleCopyRegs(OpdMapper, 4); // VGPR input val constrainOpWithReadfirstlane(MI, MRI, 2); // Source value @@ -1818,7 +1818,7 @@ void AMDGPURegisterBankInfo::applyMappin case Intrinsic::amdgcn_ds_ordered_add: case Intrinsic::amdgcn_ds_ordered_swap: { // This is only allowed to execute with 1 lane, so readfirstlane is safe. - assert(empty(OpdMapper.getVRegs(0))); + assert(OpdMapper.getVRegs(0).empty()); substituteSimpleCopyRegs(OpdMapper, 3); constrainOpWithReadfirstlane(MI, MRI, 2); // M0 return; Modified: llvm/trunk/lib/Target/BPF/BPFAbstractMemberAccess.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/BPF/BPFAbstractMemberAccess.cpp?rev=373935&r1=373934&r2=373935&view=diff ============================================================================== --- llvm/trunk/lib/Target/BPF/BPFAbstractMemberAccess.cpp (original) +++ llvm/trunk/lib/Target/BPF/BPFAbstractMemberAccess.cpp Mon Oct 7 11:14:24 2019 @@ -147,7 +147,7 @@ bool BPFAbstractMemberAccess::runOnModul LLVM_DEBUG(dbgs() << "********** Abstract Member Accesses **********\n"); // Bail out if no debug info. 
- if (empty(M.debug_compile_units())) + if (M.debug_compile_units().empty()) return false; return doTransformation(M); Modified: llvm/trunk/lib/Target/BPF/BPFAsmPrinter.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/BPF/BPFAsmPrinter.cpp?rev=373935&r1=373934&r2=373935&view=diff ============================================================================== --- llvm/trunk/lib/Target/BPF/BPFAsmPrinter.cpp (original) +++ llvm/trunk/lib/Target/BPF/BPFAsmPrinter.cpp Mon Oct 7 11:14:24 2019 @@ -59,7 +59,7 @@ bool BPFAsmPrinter::doInitialization(Mod AsmPrinter::doInitialization(M); // Only emit BTF when debuginfo available. - if (MAI->doesSupportDebugInformation() && !empty(M.debug_compile_units())) { + if (MAI->doesSupportDebugInformation() && !M.debug_compile_units().empty()) { BTF = new BTFDebug(this); Handlers.push_back(HandlerInfo(std::unique_ptr(BTF), "emit", "Debug Info Emission", "BTF", Modified: llvm/trunk/lib/Target/PowerPC/PPCInstrInfo.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/PowerPC/PPCInstrInfo.cpp?rev=373935&r1=373934&r2=373935&view=diff ============================================================================== --- llvm/trunk/lib/Target/PowerPC/PPCInstrInfo.cpp (original) +++ llvm/trunk/lib/Target/PowerPC/PPCInstrInfo.cpp Mon Oct 7 11:14:24 2019 @@ -2273,7 +2273,7 @@ void PPCInstrInfo::replaceInstrOperandWi Register InUseReg = MI.getOperand(OpNo).getReg(); MI.getOperand(OpNo).ChangeToImmediate(Imm); - if (empty(MI.implicit_operands())) + if (MI.implicit_operands().empty()) return; // We need to make sure that the MI didn't have any implicit use Modified: llvm/trunk/lib/Transforms/IPO/PartialInlining.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/IPO/PartialInlining.cpp?rev=373935&r1=373934&r2=373935&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/IPO/PartialInlining.cpp (original) +++ 
llvm/trunk/lib/Transforms/IPO/PartialInlining.cpp Mon Oct 7 11:14:24 2019 @@ -1264,7 +1264,7 @@ std::pair PartialInlin if (PSI->isFunctionEntryCold(F)) return {false, nullptr}; - if (empty(F->users())) + if (F->users().empty()) return {false, nullptr}; OptimizationRemarkEmitter ORE(F); @@ -1370,7 +1370,7 @@ bool PartialInlinerImpl::tryPartialInlin return false; } - assert(empty(Cloner.OrigFunc->users()) && + assert(Cloner.OrigFunc->users().empty() && "F's users should all be replaced!"); std::vector Users(Cloner.ClonedFunc->user_begin(), Modified: llvm/trunk/lib/Transforms/Scalar/IndVarSimplify.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/IndVarSimplify.cpp?rev=373935&r1=373934&r2=373935&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Scalar/IndVarSimplify.cpp (original) +++ llvm/trunk/lib/Transforms/Scalar/IndVarSimplify.cpp Mon Oct 7 11:14:24 2019 @@ -2789,7 +2789,7 @@ bool IndVarSimplify::optimizeLoopExits(L // have already been removed; TODO: generalize BasicBlock *ExitBlock = BI->getSuccessor(L->contains(BI->getSuccessor(0)) ? 1 : 0); - if (!empty(ExitBlock->phis())) + if (!ExitBlock->phis().empty()) return true; const SCEV *ExitCount = SE->getExitCount(L, ExitingBB); Modified: llvm/trunk/lib/Transforms/Scalar/NewGVN.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/NewGVN.cpp?rev=373935&r1=373934&r2=373935&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Scalar/NewGVN.cpp (original) +++ llvm/trunk/lib/Transforms/Scalar/NewGVN.cpp Mon Oct 7 11:14:24 2019 @@ -1754,7 +1754,7 @@ NewGVN::performSymbolicPHIEvaluation(Arr return true; }); // If we are left with no operands, it's dead. - if (empty(Filtered)) { + if (Filtered.empty()) { // If it has undef at this point, it means there are no-non-undef arguments, // and thus, the value of the phi node must be undef. 
if (HasUndef) { Modified: llvm/trunk/lib/Transforms/Utils/PredicateInfo.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Utils/PredicateInfo.cpp?rev=373935&r1=373934&r2=373935&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Utils/PredicateInfo.cpp (original) +++ llvm/trunk/lib/Transforms/Utils/PredicateInfo.cpp Mon Oct 7 11:14:24 2019 @@ -556,7 +556,7 @@ Value *PredicateInfo::materializeStack(u if (isa(ValInfo)) { IRBuilder<> B(getBranchTerminator(ValInfo)); Function *IF = getCopyDeclaration(F.getParent(), Op->getType()); - if (empty(IF->users())) + if (IF->users().empty()) CreatedDeclarations.insert(IF); CallInst *PIC = B.CreateCall(IF, Op, Op->getName() + "." + Twine(Counter++)); @@ -568,7 +568,7 @@ Value *PredicateInfo::materializeStack(u "Should not have gotten here without it being an assume"); IRBuilder<> B(PAssume->AssumeInst); Function *IF = getCopyDeclaration(F.getParent(), Op->getType()); - if (empty(IF->users())) + if (IF->users().empty()) CreatedDeclarations.insert(IF); CallInst *PIC = B.CreateCall(IF, Op); PredicateMap.insert({PIC, ValInfo}); Modified: llvm/trunk/lib/Transforms/Utils/SimplifyCFG.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Utils/SimplifyCFG.cpp?rev=373935&r1=373934&r2=373935&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Utils/SimplifyCFG.cpp (original) +++ llvm/trunk/lib/Transforms/Utils/SimplifyCFG.cpp Mon Oct 7 11:14:24 2019 @@ -5314,7 +5314,7 @@ static bool SwitchToLookupTable(SwitchIn // Figure out the corresponding result for each case value and phi node in the // common destination, as well as the min and max case values. 
- assert(!empty(SI->cases())); + assert(!SI->cases().empty()); SwitchInst::CaseIt CI = SI->case_begin(); ConstantInt *MinCaseVal = CI->getCaseValue(); ConstantInt *MaxCaseVal = CI->getCaseValue(); From llvm-commits at lists.llvm.org Mon Oct 7 11:12:24 2019 From: llvm-commits at lists.llvm.org (Jordan Rose via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 18:12:24 +0000 (UTC) Subject: [PATCH] D68439: Second attempt to add iterator_range::empty() In-Reply-To: References: Message-ID: jordan_rose closed this revision. jordan_rose added a comment. Committed in rL373935 . Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68439/new/ https://reviews.llvm.org/D68439 From llvm-commits at lists.llvm.org Mon Oct 7 11:15:36 2019 From: llvm-commits at lists.llvm.org (=?utf-8?q?Micha=C5=82_G=C3=B3rny_via_Phabricator?= via llvm-commits) Date: Mon, 07 Oct 2019 18:15:36 +0000 (UTC) Subject: [PATCH] D68570: Unify the two CRC implementations In-Reply-To: References: Message-ID: mgorny added a comment. I'd personally prefer either the non-table approach or having the tables generated at build time. Given this is only going to be used rarely, I don't think we should clutter the code with big tables. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68570/new/ https://reviews.llvm.org/D68570 From llvm-commits at lists.llvm.org Mon Oct 7 11:17:18 2019 From: llvm-commits at lists.llvm.org (Juergen Ributzka via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 18:17:18 +0000 (UTC) Subject: [PATCH] D67529: [TextAPI] Introduce TBDv4 In-Reply-To: References: Message-ID: <06a1cd566dc2dba720e82eb004cec334@localhost.localdomain> ributzka accepted this revision. ributzka added a comment. This revision is now accepted and ready to land. Thanks Cyndy. 
LGTM Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67529/new/ https://reviews.llvm.org/D67529 From llvm-commits at lists.llvm.org Mon Oct 7 11:20:19 2019 From: llvm-commits at lists.llvm.org (Cyndy Ishida via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 18:20:19 +0000 (UTC) Subject: [PATCH] D67529: [TextAPI] Introduce TBDv4 In-Reply-To: References: Message-ID: cishida updated this revision to Diff 223630. cishida added a comment. Fix a few typos in comments Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67529/new/ https://reviews.llvm.org/D67529 Files: llvm/include/llvm/TextAPI/MachO/InterfaceFile.h llvm/include/llvm/TextAPI/MachO/Symbol.h llvm/include/llvm/TextAPI/MachO/Target.h llvm/lib/TextAPI/MachO/Target.cpp llvm/lib/TextAPI/MachO/TextStub.cpp llvm/lib/TextAPI/MachO/TextStubCommon.cpp llvm/unittests/TextAPI/CMakeLists.txt llvm/unittests/TextAPI/TextStubV4Tests.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D67529.223630.patch Type: text/x-patch Size: 50977 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 11:20:35 2019 From: llvm-commits at lists.llvm.org (Vitaly Buka via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 18:20:35 +0000 (UTC) Subject: [PATCH] D47751: [lsan] Do not check for leaks in the forked process In-Reply-To: References: Message-ID: <11a7f332f9b75dc6cc057077a3573739@localhost.localdomain> vitalybuka abandoned this revision. vitalybuka added a comment. It was reverted. But I am not planning to work on this soon.
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D47751/new/ https://reviews.llvm.org/D47751 From llvm-commits at lists.llvm.org Mon Oct 7 11:24:45 2019 From: llvm-commits at lists.llvm.org (Sebastian Pop via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 18:24:45 +0000 (UTC) Subject: [PATCH] D67990: [aarch64] fix generation of fp16 fmls In-Reply-To: References: Message-ID: sebpop marked an inline comment as done. sebpop added inline comments. ================ Comment at: llvm/test/CodeGen/AArch64/fp16-fmla.ll:163 +; CHECK: fneg {{v[0-9]+}}.8h, {{v[0-9]+}}.8h +; CHECK: fmla {{v[0-9]+}}.8h, {{v[0-9]+}}.8h, {{v[0-9]+}}.8h entry: ---------------- SjoerdMeijer wrote: > Why are we not generating a fmls? > > And a nit, but perhaps actually just using registers v0, v1, and v2 here makes things clearer? That is part of the problem that Tim pointed out: when the multiply is the first operand of `fsub`, i.e., ``` %sub = fsub fast <8 x half> %mul, %a ``` that should not generate a fused multiply sub. With this patch, for `b * c - a` we negate the value of a and generate a fused multiply add `-a + b * c`. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67990/new/ https://reviews.llvm.org/D67990 From llvm-commits at lists.llvm.org Mon Oct 7 11:27:25 2019 From: llvm-commits at lists.llvm.org (Pavel Labath via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 18:27:25 +0000 (UTC) Subject: [PATCH] D68270: DWARFDebugLoc: Add a function to get the address range of an entry In-Reply-To: References: Message-ID: <37a8f331d2f7b6804d9aff01fdb2b9b7@localhost.localdomain> labath marked 3 inline comments as done. labath added inline comments. 
================ Comment at: lib/DebugInfo/DWARF/DWARFDebugLoc.cpp:291-295 + EntryIterator Absolute = + getAbsoluteLocations( + SectionedAddress{BaseAddr, SectionedAddress::UndefSection}, + LookupPooledAddress) + .begin(); ---------------- dblaikie wrote: > labath wrote: > > This parallel iteration is not completely nice, but I think it's worth being able to reuse the absolute range computation code. I'm open to ideas for improvement though. > Ah, I see - this is what you meant about "In particular it makes it possible to reuse this stuff in the dumping code, which would have been pretty hard with callbacks.". > > I'm wondering if that might be worth revisiting somewhat. A full iterator abstraction for one user here (well, two once you include lldb - but I assume it's likely going to build its own data structure from the iteration anyway, right? (it's not going to keep the iterator around, do anything interesting like partial iterations, re-iterate/etc - such that a callback would suffice)) > > I could imagine two callback APIs for this - one that gets entries and locations and one that only gets locations by filtering on the entry version. > > eg: > > // for non-verbose output: > LL.forEachEntry([&](const Entry &E, Expected L) { > if (Verbose && actually dumping debug_loc) > print(E) // print any LLE_*, raw parameters, etc > if (L) > print(*L) // print the resulting address range, section name (if verbose), > else > print(error stuff) > }); > > One question would be "when/where do we print the DWARF expression" - if there's an error computing the address range, we can still print the expression, so maybe that happens unconditionally at the end of the callback, using the expression in the Entry? 
(then, arguably, the expression doesn't need to be in the DWARFLocation - and I'd say make the DWARFLocation a sectioned range, exactly the same type as for ranges so that part of the dumping code, etc, can be maximally reused) Actually, what lldb currently does is that it does not build any data structures at all (except storing the pointer to the right place in the debug_loc section). Then, whenever it wants to do something to the loclist, it parses it afresh. I don't know why it does this exactly, but I assume it has something to do with most locations never being used, or being used only a couple of times, and the actual parsing being fairly fast. What this means is that lldb is not really a single "user", but there are like four or five places where it iterates through the list, depending on what it actually wants to do with it. It also does partial iteration where it stops as soon as it finds the entry it was interested in. Now, all of that is possible with a callback (though I am generally trying to avoid them), but it does resurface the issue of what should be the value of the second argument for DW_LLE_base_address entries (the thing which I originally used an error type for). Maybe this should actually be one callback API, taking two callback functions, with one of them being invoked for base_address entries, and one for others? However, if we stick to the current approaches in both LLE and RLE of making the address pool resolution function a parameter (which I'd like to keep, as it makes my job in lldb easier), then this would actually be three callbacks, which starts to get unwieldy. Though one of those callbacks could be removed with the "DWARFUnit implementing an AddrOffsetResolver interface" idea, which I really like.
:) ================ Comment at: test/CodeGen/X86/debug-loclists.ll:16 ; CHECK-NEXT: 0x00000000: -; CHECK-NEXT: [0x0000000000000000, 0x0000000000000004): DW_OP_breg5 RDI+0 -; CHECK-NEXT: [0x0000000000000004, 0x0000000000000012): DW_OP_breg3 RBX+0 - -; There is no way to use llvm-dwarfdump atm (2018, october) to verify the DW_LLE_* codes emited, -; because dumper is not yet implements that. Use asm code to do this check instead. -; -; RUN: llc -mtriple=x86_64-pc-linux -filetype=asm < %s -o - | FileCheck %s --check-prefix=ASM -; ASM: .section .debug_loclists,"", at progbits -; ASM-NEXT: .long .Ldebug_loclist_table_end0-.Ldebug_loclist_table_start0 # Length -; ASM-NEXT: .Ldebug_loclist_table_start0: -; ASM-NEXT: .short 5 # Version -; ASM-NEXT: .byte 8 # Address size -; ASM-NEXT: .byte 0 # Segment selector size -; ASM-NEXT: .long 0 # Offset entry count -; ASM-NEXT: .Lloclists_table_base0: -; ASM-NEXT: .Ldebug_loc0: -; ASM-NEXT: .byte 4 # DW_LLE_offset_pair -; ASM-NEXT: .uleb128 .Lfunc_begin0-.Lfunc_begin0 # starting offset -; ASM-NEXT: .uleb128 .Ltmp0-.Lfunc_begin0 # ending offset -; ASM-NEXT: .byte 2 # Loc expr size -; ASM-NEXT: .byte 117 # DW_OP_breg5 -; ASM-NEXT: .byte 0 # 0 -; ASM-NEXT: .byte 4 # DW_LLE_offset_pair -; ASM-NEXT: .uleb128 .Ltmp0-.Lfunc_begin0 # starting offset -; ASM-NEXT: .uleb128 .Ltmp1-.Lfunc_begin0 # ending offset -; ASM-NEXT: .byte 2 # Loc expr size -; ASM-NEXT: .byte 115 # DW_OP_breg3 -; ASM-NEXT: .byte 0 # 0 -; ASM-NEXT: .byte 0 # DW_LLE_end_of_list -; ASM-NEXT: .Ldebug_loclist_table_end0: +; CHECK-NEXT: [DW_LLE_offset_pair ]: 0x0000000000000000, 0x0000000000000004 => [0x0000000000000000, 0x0000000000000004) DW_OP_breg5 RDI+0 +; CHECK-NEXT: [DW_LLE_offset_pair ]: 0x0000000000000004, 0x0000000000000012 => [0x0000000000000004, 0x0000000000000012) DW_OP_breg3 RBX+0 ---------------- dblaikie wrote: > labath wrote: > > This tries to follow the RLE format as closely as possible, but I think something like > > ``` > > [DW_LLE_offset_pair, 
0x0000000000000000, 0x0000000000000004] => [0x0000000000000000, 0x0000000000000004): DW_OP_breg5 RDI+0 > > ``` > > would make more sense (both here and for RLE). > Yep, that'd make more sense to me - are you planning to unify the codepaths for this? I think that'd be for the best. > > If I were picking a printing from scratch, I might go with: > > DW_LLE_offset_pair(0x0000, 0x0004) => [0x0000, 0x0004): DW_OP_breg5 RDI+0 > > Making it look a bit more like a function call and function arguments. Though the () might be confusing with the range notation. > > I'm also undecided on the " => " separator. Whether a ':' might be better/fine, etc. > > Totally open to ideas, but mostly I'd really love these to use loclist and ranges to use the same code as much as possible, so we can get consistency and any readability benefits, etc in both. I like the function call format. I'm hoping to get some code reuse, though it's still not fully clear to me how to achieve that. ================ Comment at: test/DebugInfo/X86/dwarfdump-debug-loclists.test:7 # CHECK: DW_AT_location [DW_FORM_sec_offset] (0x0000000c -# CHECK-NEXT: [0x0000000000000010, 0x0000000000000020): DW_OP_breg5 RDI+0 -# CHECK-NEXT: [0x0000000000000530, 0x0000000000000540): DW_OP_breg6 RBP-8, DW_OP_deref -# CHECK-NEXT: [0x0000000000000700, 0x0000000000000710): DW_OP_breg5 RDI+0 +# CHECK-NEXT: [DW_LLE_offset_pair ]: 0x0000000000000000, 0x0000000000000010 => [0x0000000000000010, 0x0000000000000020) DW_OP_breg5 RDI+0 +# CHECK-NEXT: [DW_LLE_base_address ]: 0x0000000000000500 ---------------- dblaikie wrote: > I don't think the inline dumping should print the encoding - I'd borrow a lot from/try to unify with the ranges printing, which doesn't. I think verbose ranges print the same as non-verbose except they also add the section name/number.
Sure, I can do that, though I think that means there won't be a single place where one can see both the raw encodings and their interpretation -- section-based dumping will not show the interpretation (would you want me to still show them if they happen to be interpretable without the base address or the address pool?), and the debug_info dumping will not show the encoding. Is that bad? -- I don't know... Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68270/new/ https://reviews.llvm.org/D68270 From llvm-commits at lists.llvm.org Mon Oct 7 11:29:02 2019 From: llvm-commits at lists.llvm.org (Reid Kleckner via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 18:29:02 +0000 (UTC) Subject: [PATCH] D68584: Fix Calling Convention through aliases In-Reply-To: References: Message-ID: rnk accepted this revision. rnk added a comment. lgtm CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68584/new/ https://reviews.llvm.org/D68584 From llvm-commits at lists.llvm.org Mon Oct 7 11:32:55 2019 From: llvm-commits at lists.llvm.org (Hideki Saito via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 18:32:55 +0000 (UTC) Subject: [PATCH] D67948: [LV] Interleaving should not exceed estimated loop trip count. In-Reply-To: References: Message-ID: hsaito accepted this revision. hsaito added a comment. This revision is now accepted and ready to land. In D67948#1695326 , @hsaito wrote: > Vectorizer code change looks fine to me. I'd like to see the comments updated, though. Any more changes needed for the LIT tests? LGTM. Please wait a few more days to give others a chance for another look.
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67948/new/ https://reviews.llvm.org/D67948 From llvm-commits at lists.llvm.org Mon Oct 7 11:39:57 2019 From: llvm-commits at lists.llvm.org (Yonghong Song via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 18:39:57 +0000 (UTC) Subject: [PATCH] D67980: [BPF] do compile-once run-everywhere relocation for bitfields In-Reply-To: References: Message-ID: <777be10a2c69a5e463a7a46c453cf58f@localhost.localdomain> yonghong-song updated this revision to Diff 223633. yonghong-song retitled this revision from "[CLANG][BPF] do compile-once run-everywhere relocation for bitfields" to "[BPF] do compile-once run-everywhere relocation for bitfields". yonghong-song edited the summary of this revision. yonghong-song added a comment. Herald added a subscriber: ormris. add test cases. Also handle filed_info for field array elements properly. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67980/new/ https://reviews.llvm.org/D67980 Files: clang/include/clang/Basic/BuiltinsBPF.def clang/include/clang/Basic/DiagnosticSemaKinds.td clang/include/clang/Basic/TargetBuiltins.h clang/include/clang/Sema/Sema.h clang/include/clang/module.modulemap clang/lib/Basic/Targets/BPF.cpp clang/lib/Basic/Targets/BPF.h clang/lib/CodeGen/CGBuiltin.cpp clang/lib/CodeGen/CGExpr.cpp clang/lib/CodeGen/CodeGenFunction.h clang/lib/Sema/SemaChecking.cpp clang/test/CodeGen/builtins-bpf-preserve-field-info-1.c clang/test/CodeGen/builtins-bpf-preserve-field-info-2.c clang/test/Sema/builtins-bpf.c llvm/include/llvm/IR/IntrinsicsBPF.td llvm/lib/Target/BPF/BPF.h llvm/lib/Target/BPF/BPFAbstractMemberAccess.cpp llvm/lib/Target/BPF/BPFCORE.h llvm/lib/Target/BPF/BPFTargetMachine.cpp llvm/lib/Target/BPF/BTF.h llvm/lib/Target/BPF/BTFDebug.cpp llvm/lib/Target/BPF/BTFDebug.h llvm/test/CodeGen/BPF/CORE/intrinsic-array.ll llvm/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-byte-size-1.ll 
llvm/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-byte-size-2.ll llvm/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-byte-size-3.ll llvm/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-existence-1.ll llvm/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-existence-2.ll llvm/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-existence-3.ll llvm/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-lshift-1.ll llvm/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-lshift-2.ll llvm/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-lshift-3.ll llvm/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-rshift-1.ll llvm/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-rshift-2.ll llvm/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-rshift-3.ll llvm/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-signedness-1.ll llvm/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-signedness-2.ll llvm/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-signedness-3.ll llvm/test/CodeGen/BPF/CORE/intrinsic-struct.ll llvm/test/CodeGen/BPF/CORE/intrinsic-union.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-access-str.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-basic.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-cast-array-1.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-cast-array-2.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-cast-struct-1.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-cast-struct-2.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-cast-struct-3.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-cast-union-1.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-cast-union-2.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-end-load.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-end-ret.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-fieldinfo-1.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-fieldinfo-2.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-global-1.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-global-2.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-global-3.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-ignore.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-middle-chain.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-multi-array-1.ll 
llvm/test/CodeGen/BPF/CORE/offset-reloc-multi-array-2.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-multilevel.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-pointer-1.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-pointer-2.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-struct-anonymous.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-struct-array.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-typedef-array.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-typedef-struct.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-typedef-union.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-typedef.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-union.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D67980.223633.patch Type: text/x-patch Size: 232956 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 11:40:26 2019 From: llvm-commits at lists.llvm.org (Jordan Rose via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 18:40:26 +0000 (UTC) Subject: [PATCH] D68586: Save a word in every StringSet entry Message-ID: jordan_rose created this revision. jordan_rose added reviewers: mmpozulp, aaron.ballman. Herald added subscribers: llvm-commits, dexonsmith. Herald added a project: LLVM. Add a specialization to StringMap (actually StringMapEntry) for a value type of NoneType (the type of llvm::None), and use it for StringSet. This'll save us a word from every entry in a StringSet, used for alignment with the size_t that stores the string length. I could have gone all the way to some kind of empty base class optimization , but that seemed like overkill. Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D68586 Files: llvm/include/llvm/ADT/StringMap.h llvm/include/llvm/ADT/StringSet.h -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68586.223632.patch Type: text/x-patch Size: 4338 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 11:43:30 2019 From: llvm-commits at lists.llvm.org (Matt Arsenault via llvm-commits) Date: Mon, 07 Oct 2019 18:43:30 -0000 Subject: [llvm] r373937 - GlobalISel: Add target pre-isel instructions Message-ID: <20191007184330.3F38F8CDEF@lists.llvm.org> Author: arsenm Date: Mon Oct 7 11:43:29 2019 New Revision: 373937 URL: http://llvm.org/viewvc/llvm-project?rev=373937&view=rev Log: GlobalISel: Add target pre-isel instructions Allows targets to introduce regbankselectable pseudo-instructions. Currently the closest feature to this is an intrinsic. However, this requires creating a public intrinsic declaration. This litters the public intrinsic namespace with operations we don't necessarily want to expose to IR producers, and would rather leave as private to the backend. Use a new instruction bit. A previous attempt tried to keep using enum value ranges, but it turned into a mess.
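The "new instruction bit" idea in the log above can be illustrated with a small standalone sketch. This is not the LLVM code: `Flag`, `InstrDesc`, `TargetSpecific` and `shouldAssignRegBank` are hypothetical stand-ins, assumed here only to show how a per-instruction flag bit lets a pass treat a target-defined pseudo like a generic pre-isel opcode without relying on opcode value ranges.

```cpp
#include <cstdint>

// Hypothetical stand-in for a flag enum: each enumerator is a bit index
// into a 64-bit flags word.
enum Flag : unsigned { PreISelOpcode = 0, Variadic, Pseudo };

// Hypothetical stand-in for an instruction descriptor.
struct InstrDesc {
  uint64_t Flags;      // one bit per Flag enumerator
  bool TargetSpecific; // assumed precomputed: opcode belongs to a target

  // True if the descriptor carries the pre-isel bit.
  bool isPreISelOpcode() const {
    return (Flags & (1ULL << PreISelOpcode)) != 0;
  }
};

// Sketch of a register-bank-assignment filter: target-specific
// instructions are skipped unless they are marked as pre-isel opcodes.
bool shouldAssignRegBank(const InstrDesc &D) {
  if (D.TargetSpecific && !D.isPreISelOpcode())
    return false;
  return true;
}
```

With this shape, a target pseudo built with the `PreISelOpcode` bit set passes the filter even though it is target-specific, while an already-selected target instruction with no such bit is skipped.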
Added: llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/inst-select-amdgpu-ffbh-u32.mir llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/regbankselect-amdgpu-ffbh-u32.mir Modified: llvm/trunk/include/llvm/CodeGen/MachineInstr.h llvm/trunk/include/llvm/MC/MCInstrDesc.h llvm/trunk/include/llvm/Target/GenericOpcodes.td llvm/trunk/include/llvm/Target/Target.td llvm/trunk/lib/CodeGen/GlobalISel/RegBankSelect.cpp llvm/trunk/lib/Target/AMDGPU/AMDGPUGISel.td llvm/trunk/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp llvm/trunk/lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp llvm/trunk/lib/Target/AMDGPU/SIInstrInfo.cpp llvm/trunk/lib/Target/AMDGPU/SIInstructions.td llvm/trunk/utils/TableGen/CodeGenInstruction.cpp llvm/trunk/utils/TableGen/CodeGenInstruction.h llvm/trunk/utils/TableGen/InstrInfoEmitter.cpp Modified: llvm/trunk/include/llvm/CodeGen/MachineInstr.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/CodeGen/MachineInstr.h?rev=373937&r1=373936&r2=373937&view=diff ============================================================================== --- llvm/trunk/include/llvm/CodeGen/MachineInstr.h (original) +++ llvm/trunk/include/llvm/CodeGen/MachineInstr.h Mon Oct 7 11:43:29 2019 @@ -618,6 +618,12 @@ public: return hasPropertyInBundle(1ULL << MCFlag, Type); } + /// Return true if this is an instruction that should go through the usual + /// legalization steps. + bool isPreISelOpcode(QueryType Type = IgnoreBundle) const { + return hasProperty(MCID::PreISelOpcode, Type); + } + /// Return true if this instruction can have a variable number of operands. 
/// In this case, the variable operands will be after the normal /// operands but before the implicit definitions and uses (if any are Modified: llvm/trunk/include/llvm/MC/MCInstrDesc.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/MC/MCInstrDesc.h?rev=373937&r1=373936&r2=373937&view=diff ============================================================================== --- llvm/trunk/include/llvm/MC/MCInstrDesc.h (original) +++ llvm/trunk/include/llvm/MC/MCInstrDesc.h Mon Oct 7 11:43:29 2019 @@ -129,7 +129,8 @@ namespace MCID { /// not use these directly. These all correspond to bitfields in the /// MCInstrDesc::Flags field. enum Flag { - Variadic = 0, + PreISelOpcode = 0, + Variadic, HasOptionalDef, Pseudo, Return, @@ -242,6 +243,10 @@ public: /// Return flags of this instruction. uint64_t getFlags() const { return Flags; } + /// \returns true if this instruction is emitted before instruction selection + /// and should be legalized/regbankselected/selected. + bool isPreISelOpcode() const { return Flags & (1ULL << MCID::PreISelOpcode); } + /// Return true if this instruction can have a variable number of /// operands. In this case, the variable operands will be after the normal /// operands but before the implicit definitions and uses (if any are Modified: llvm/trunk/include/llvm/Target/GenericOpcodes.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Target/GenericOpcodes.td?rev=373937&r1=373936&r2=373937&view=diff ============================================================================== --- llvm/trunk/include/llvm/Target/GenericOpcodes.td (original) +++ llvm/trunk/include/llvm/Target/GenericOpcodes.td Mon Oct 7 11:43:29 2019 @@ -15,7 +15,9 @@ // Unary ops. 
//------------------------------------------------------------------------------ -class GenericInstruction : StandardPseudoInstruction; +class GenericInstruction : StandardPseudoInstruction { + let isPreISelOpcode = 1; +} // Extend the underlying scalar type of an operation, leaving the high bits // unspecified. Modified: llvm/trunk/include/llvm/Target/Target.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Target/Target.td?rev=373937&r1=373936&r2=373937&view=diff ============================================================================== --- llvm/trunk/include/llvm/Target/Target.td (original) +++ llvm/trunk/include/llvm/Target/Target.td Mon Oct 7 11:43:29 2019 @@ -492,6 +492,10 @@ class Instruction : InstructionEncoding // Added complexity passed onto matching pattern. int AddedComplexity = 0; + // Indicates if this is a pre-isel opcode that should be + // legalized/regbankselected/selected. + bit isPreISelOpcode = 0; + // These bits capture information about the high-level semantics of the // instruction. bit isReturn = 0; // Is this instruction a return instruction? Modified: llvm/trunk/lib/CodeGen/GlobalISel/RegBankSelect.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/GlobalISel/RegBankSelect.cpp?rev=373937&r1=373936&r2=373937&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/GlobalISel/RegBankSelect.cpp (original) +++ llvm/trunk/lib/CodeGen/GlobalISel/RegBankSelect.cpp Mon Oct 7 11:43:29 2019 @@ -687,8 +687,9 @@ bool RegBankSelect::runOnMachineFunction // iterator before hand. MachineInstr &MI = *MII++; - // Ignore target-specific instructions: they should use proper regclasses. - if (isTargetSpecificOpcode(MI.getOpcode())) + // Ignore target-specific post-isel instructions: they should use proper + // regclasses. 
+ if (isTargetSpecificOpcode(MI.getOpcode()) && !MI.isPreISelOpcode()) continue; if (!assignInstr(MI)) { Modified: llvm/trunk/lib/Target/AMDGPU/AMDGPUGISel.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/AMDGPUGISel.td?rev=373937&r1=373936&r2=373937&view=diff ============================================================================== --- llvm/trunk/lib/Target/AMDGPU/AMDGPUGISel.td (original) +++ llvm/trunk/lib/Target/AMDGPU/AMDGPUGISel.td Mon Oct 7 11:43:29 2019 @@ -116,6 +116,7 @@ def : GINodeEquiv; def : GINodeEquiv; +def : GINodeEquiv; class GISelSop2Pat < SDPatternOperator node, Modified: llvm/trunk/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp?rev=373937&r1=373936&r2=373937&view=diff ============================================================================== --- llvm/trunk/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp (original) +++ llvm/trunk/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp Mon Oct 7 11:43:29 2019 @@ -1650,7 +1650,7 @@ bool AMDGPUInstructionSelector::select(M if (I.isPHI()) return selectPHI(I); - if (!isPreISelGenericOpcode(I.getOpcode())) { + if (!I.isPreISelOpcode()) { if (I.isCopy()) return selectCOPY(I); return true; Modified: llvm/trunk/lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp?rev=373937&r1=373936&r2=373937&view=diff ============================================================================== --- llvm/trunk/lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp (original) +++ llvm/trunk/lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp Mon Oct 7 11:43:29 2019 @@ -2305,6 +2305,7 @@ AMDGPURegisterBankInfo::getInstrMapping( case AMDGPU::G_FCANONICALIZE: case AMDGPU::G_INTRINSIC_TRUNC: case AMDGPU::G_INTRINSIC_ROUND: + case AMDGPU::G_AMDGPU_FFBH_U32: return getDefaultMappingVOP(MI); case AMDGPU::G_UMULH: case 
AMDGPU::G_SMULH: { Modified: llvm/trunk/lib/Target/AMDGPU/SIInstrInfo.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/SIInstrInfo.cpp?rev=373937&r1=373936&r2=373937&view=diff ============================================================================== --- llvm/trunk/lib/Target/AMDGPU/SIInstrInfo.cpp (original) +++ llvm/trunk/lib/Target/AMDGPU/SIInstrInfo.cpp Mon Oct 7 11:43:29 2019 @@ -3117,7 +3117,8 @@ static bool shouldReadExec(const Machine return true; } - if (SIInstrInfo::isGenericOpcode(MI.getOpcode()) || + if (MI.isPreISelOpcode() || + SIInstrInfo::isGenericOpcode(MI.getOpcode()) || SIInstrInfo::isSALU(MI) || SIInstrInfo::isSMRD(MI)) return false; Modified: llvm/trunk/lib/Target/AMDGPU/SIInstructions.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/SIInstructions.td?rev=373937&r1=373936&r2=373937&view=diff ============================================================================== --- llvm/trunk/lib/Target/AMDGPU/SIInstructions.td (original) +++ llvm/trunk/lib/Target/AMDGPU/SIInstructions.td Mon Oct 7 11:43:29 2019 @@ -1982,3 +1982,13 @@ def : FP16Med3Pat; defm : Int16Med3Pat; defm : Int16Med3Pat; } // End Predicates = [isGFX9Plus] + +class AMDGPUGenericInstruction : GenericInstruction { + let Namespace = "AMDGPU"; +} + +def G_AMDGPU_FFBH_U32 : AMDGPUGenericInstruction { + let OutOperandList = (outs type0:$dst); + let InOperandList = (ins type1:$src); + let hasSideEffects = 0; +} Added: llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/inst-select-amdgpu-ffbh-u32.mir URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/inst-select-amdgpu-ffbh-u32.mir?rev=373937&view=auto ============================================================================== --- llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/inst-select-amdgpu-ffbh-u32.mir (added) +++ llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/inst-select-amdgpu-ffbh-u32.mir Mon Oct 7 11:43:29 2019 @@ -0,0 +1,68 @@ +# NOTE: Assertions have been 
autogenerated by utils/update_mir_test_checks.py +# RUN: llc -march=amdgcn -run-pass=instruction-select -verify-machineinstrs -global-isel-abort=0 %s -o - | FileCheck %s + +--- + +name: ffbh_u32_s32_s_s +legalized: true +regBankSelected: true +tracksRegLiveness: true + +body: | + bb.0: + liveins: $sgpr0 + + ; CHECK-LABEL: name: ffbh_u32_s32_s_s + ; CHECK: liveins: $sgpr0 + ; CHECK: [[COPY:%[0-9]+]]:sreg_32 = COPY $sgpr0 + ; CHECK: [[S_FLBIT_I32_B32_:%[0-9]+]]:sreg_32 = S_FLBIT_I32_B32 [[COPY]] + ; CHECK: S_ENDPGM 0, implicit [[S_FLBIT_I32_B32_]] + %0:sgpr(s32) = COPY $sgpr0 + %1:sgpr(s32) = G_AMDGPU_FFBH_U32 %0 + S_ENDPGM 0, implicit %1 + +... + +--- + +name: ffbh_u32_s32_v_v +legalized: true +regBankSelected: true +tracksRegLiveness: true + +body: | + bb.0: + liveins: $vgpr0 + + ; CHECK-LABEL: name: ffbh_u32_s32_v_v + ; CHECK: liveins: $vgpr0 + ; CHECK: [[COPY:%[0-9]+]]:vgpr(s32) = COPY $vgpr0 + ; CHECK: [[AMDGPU_FFBH_U32_:%[0-9]+]]:vgpr(s32) = G_AMDGPU_FFBH_U32 [[COPY]](s32) + ; CHECK: S_ENDPGM 0, implicit [[AMDGPU_FFBH_U32_]](s32) + %0:vgpr(s32) = COPY $vgpr0 + %1:vgpr(s32) = G_AMDGPU_FFBH_U32 %0 + S_ENDPGM 0, implicit %1 + +... + +--- + +name: ffbh_u32_v_s +legalized: true +regBankSelected: true +tracksRegLiveness: true + +body: | + bb.0: + liveins: $sgpr0 + + ; CHECK-LABEL: name: ffbh_u32_v_s + ; CHECK: liveins: $sgpr0 + ; CHECK: [[COPY:%[0-9]+]]:sgpr(s32) = COPY $sgpr0 + ; CHECK: [[AMDGPU_FFBH_U32_:%[0-9]+]]:vgpr(s32) = G_AMDGPU_FFBH_U32 [[COPY]](s32) + ; CHECK: S_ENDPGM 0, implicit [[AMDGPU_FFBH_U32_]](s32) + %0:sgpr(s32) = COPY $sgpr0 + %1:vgpr(s32) = G_AMDGPU_FFBH_U32 %0 + S_ENDPGM 0, implicit %1 + +... 
Added: llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/regbankselect-amdgpu-ffbh-u32.mir URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/regbankselect-amdgpu-ffbh-u32.mir?rev=373937&view=auto ============================================================================== --- llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/regbankselect-amdgpu-ffbh-u32.mir (added) +++ llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/regbankselect-amdgpu-ffbh-u32.mir Mon Oct 7 11:43:29 2019 @@ -0,0 +1,32 @@ +# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py +# RUN: llc -march=amdgcn -mcpu=fiji -run-pass=regbankselect %s -verify-machineinstrs -o - -regbankselect-fast | FileCheck %s +# RUN: llc -march=amdgcn -mcpu=fiji -run-pass=regbankselect %s -verify-machineinstrs -o - -regbankselect-greedy | FileCheck %s + +--- +name: ffbh_u32_s +legalized: true + +body: | + bb.0: + liveins: $sgpr0 + + ; CHECK-LABEL: name: ffbh_u32_s + ; CHECK: [[COPY:%[0-9]+]]:sgpr(s32) = COPY $sgpr0 + ; CHECK: [[AMDGPU_FFBH_U32_:%[0-9]+]]:vgpr(s32) = G_AMDGPU_FFBH_U32 [[COPY]](s32) + %0:_(s32) = COPY $sgpr0 + %1:_(s32) = G_AMDGPU_FFBH_U32 %0 +... + +--- +name: ffbh_u32_v +legalized: true + +body: | + bb.0: + liveins: $vgpr0_vgpr1 + ; CHECK-LABEL: name: ffbh_u32_v + ; CHECK: [[COPY:%[0-9]+]]:vgpr(s32) = COPY $vgpr0 + ; CHECK: [[AMDGPU_FFBH_U32_:%[0-9]+]]:vgpr(s32) = G_AMDGPU_FFBH_U32 [[COPY]](s32) + %0:_(s32) = COPY $vgpr0 + %1:_(s32) = G_AMDGPU_FFBH_U32 %0 +... 
Modified: llvm/trunk/utils/TableGen/CodeGenInstruction.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/TableGen/CodeGenInstruction.cpp?rev=373937&r1=373936&r2=373937&view=diff ============================================================================== --- llvm/trunk/utils/TableGen/CodeGenInstruction.cpp (original) +++ llvm/trunk/utils/TableGen/CodeGenInstruction.cpp Mon Oct 7 11:43:29 2019 @@ -363,6 +363,7 @@ CodeGenInstruction::CodeGenInstruction(R Namespace = R->getValueAsString("Namespace"); AsmString = R->getValueAsString("AsmString"); + isPreISelOpcode = R->getValueAsBit("isPreISelOpcode"); isReturn = R->getValueAsBit("isReturn"); isEHScopeReturn = R->getValueAsBit("isEHScopeReturn"); isBranch = R->getValueAsBit("isBranch"); Modified: llvm/trunk/utils/TableGen/CodeGenInstruction.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/TableGen/CodeGenInstruction.h?rev=373937&r1=373936&r2=373937&view=diff ============================================================================== --- llvm/trunk/utils/TableGen/CodeGenInstruction.h (original) +++ llvm/trunk/utils/TableGen/CodeGenInstruction.h Mon Oct 7 11:43:29 2019 @@ -231,6 +231,7 @@ template class ArrayRef; std::vector ImplicitDefs, ImplicitUses; // Various boolean values we track for the instruction. + bool isPreISelOpcode : 1; bool isReturn : 1; bool isEHScopeReturn : 1; bool isBranch : 1; Modified: llvm/trunk/utils/TableGen/InstrInfoEmitter.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/TableGen/InstrInfoEmitter.cpp?rev=373937&r1=373936&r2=373937&view=diff ============================================================================== --- llvm/trunk/utils/TableGen/InstrInfoEmitter.cpp (original) +++ llvm/trunk/utils/TableGen/InstrInfoEmitter.cpp Mon Oct 7 11:43:29 2019 @@ -662,6 +662,7 @@ void InstrInfoEmitter::emitRecord(const CodeGenTarget &Target = CDP.getTargetInfo(); // Emit all of the target independent flags... 
+ if (Inst.isPreISelOpcode) OS << "|(1ULL< Author: arsenm Date: Mon Oct 7 11:43:31 2019 New Revision: 373938 URL: http://llvm.org/viewvc/llvm-project?rev=373938&view=rev Log: AMDGPU/GlobalISel: Select more G_INSERT cases At minimum handle the s64 insert type, which is emitted in real cases during legalization. We really need TableGen to emit something like the inverse of composeSubRegIndices to determine the subreg index to use. Modified: llvm/trunk/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/inst-select-insert.mir Modified: llvm/trunk/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp?rev=373938&r1=373937&r2=373938&view=diff ============================================================================== --- llvm/trunk/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp (original) +++ llvm/trunk/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp Mon Oct 7 11:43:31 2019 @@ -555,39 +555,97 @@ bool AMDGPUInstructionSelector::selectG_ return false; } +// FIXME: TableGen should generate something to make this manageable for all +// register classes. At a minimum we could use the opposite of +// composeSubRegIndices and go up from the base 32-bit subreg.
+static unsigned getSubRegForSizeAndOffset(const SIRegisterInfo &TRI, + unsigned Size, unsigned Offset) { + switch (Size) { + case 32: + return TRI.getSubRegFromChannel(Offset / 32); + case 64: { + switch (Offset) { + case 0: + return AMDGPU::sub0_sub1; + case 32: + return AMDGPU::sub1_sub2; + case 64: + return AMDGPU::sub2_sub3; + case 96: + return AMDGPU::sub4_sub5; + case 128: + return AMDGPU::sub5_sub6; + case 160: + return AMDGPU::sub7_sub8; + // FIXME: Missing cases up to 1024 bits + default: + return AMDGPU::NoSubRegister; + } + } + case 96: { + switch (Offset) { + case 0: + return AMDGPU::sub0_sub1_sub2; + case 32: + return AMDGPU::sub1_sub2_sub3; + case 64: + return AMDGPU::sub2_sub3_sub4; + } + } + default: + return AMDGPU::NoSubRegister; + } +} + bool AMDGPUInstructionSelector::selectG_INSERT(MachineInstr &I) const { MachineBasicBlock *BB = I.getParent(); + + Register DstReg = I.getOperand(0).getReg(); Register Src0Reg = I.getOperand(1).getReg(); Register Src1Reg = I.getOperand(2).getReg(); LLT Src1Ty = MRI->getType(Src1Reg); - if (Src1Ty.getSizeInBits() != 32) - return false; + + unsigned DstSize = MRI->getType(DstReg).getSizeInBits(); + unsigned InsSize = Src1Ty.getSizeInBits(); int64_t Offset = I.getOperand(3).getImm(); if (Offset % 32 != 0) return false; - unsigned SubReg = TRI.getSubRegFromChannel(Offset / 32); + unsigned SubReg = getSubRegForSizeAndOffset(TRI, InsSize, Offset); + if (SubReg == AMDGPU::NoSubRegister) + return false; + + const RegisterBank *DstBank = RBI.getRegBank(DstReg, *MRI, TRI); + const TargetRegisterClass *DstRC = + TRI.getRegClassForSizeOnBank(DstSize, *DstBank, *MRI); + if (!DstRC) + return false; + + const RegisterBank *Src0Bank = RBI.getRegBank(Src0Reg, *MRI, TRI); + const RegisterBank *Src1Bank = RBI.getRegBank(Src1Reg, *MRI, TRI); + const TargetRegisterClass *Src0RC = + TRI.getRegClassForSizeOnBank(DstSize, *Src0Bank, *MRI); + const TargetRegisterClass *Src1RC = + TRI.getRegClassForSizeOnBank(InsSize, *Src1Bank, *MRI); + 
+ // Deal with weird cases where the class only partially supports the subreg + // index. + Src0RC = TRI.getSubClassWithSubReg(Src0RC, SubReg); + if (!Src0RC) + return false; + + if (!RBI.constrainGenericRegister(DstReg, *DstRC, *MRI) || + !RBI.constrainGenericRegister(Src0Reg, *Src0RC, *MRI) || + !RBI.constrainGenericRegister(Src1Reg, *Src1RC, *MRI)) + return false; + const DebugLoc &DL = I.getDebugLoc(); + BuildMI(*BB, &I, DL, TII.get(TargetOpcode::INSERT_SUBREG), DstReg) + .addReg(Src0Reg) + .addReg(Src1Reg) + .addImm(SubReg); - MachineInstr *Ins = BuildMI(*BB, &I, DL, TII.get(TargetOpcode::INSERT_SUBREG)) - .addDef(I.getOperand(0).getReg()) - .addReg(Src0Reg) - .addReg(Src1Reg) - .addImm(SubReg); - - for (const MachineOperand &MO : Ins->operands()) { - if (!MO.isReg()) - continue; - if (Register::isPhysicalRegister(MO.getReg())) - continue; - - const TargetRegisterClass *RC = - TRI.getConstrainedRegClassForOperand(MO, *MRI); - if (!RC) - continue; - RBI.constrainGenericRegister(MO.getReg(), *RC, *MRI); - } I.eraseFromParent(); return true; } Modified: llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/inst-select-insert.mir URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/inst-select-insert.mir?rev=373938&r1=373937&r2=373938&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/inst-select-insert.mir (original) +++ llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/inst-select-insert.mir Mon Oct 7 11:43:31 2019 @@ -1,32 +1,35 @@ -# RUN: llc -march=amdgcn -run-pass=instruction-select -verify-machineinstrs -global-isel %s -o - | FileCheck %s +# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py +# RUN: llc -march=amdgcn -run-pass=instruction-select -verify-machineinstrs %s -o - | FileCheck %s + --- -name: insert512 +name: insert_s512_s32 legalized: true regBankSelected: true -# CHECK-LABEL: insert512 -# CHECK: [[BASE:%[0-9]+]]:sreg_512 = 
IMPLICIT_DEF -# CHECK: [[VAL:%[0-9]+]]:sreg_32_xm0 = IMPLICIT_DEF -# CHECK: [[BASE0:%[0-9]+]]:sreg_512 = INSERT_SUBREG [[BASE]], [[VAL]], %subreg.sub0 -# CHECK: [[BASE1:%[0-9]+]]:sreg_512 = INSERT_SUBREG [[BASE0]], [[VAL]], %subreg.sub1 -# CHECK: [[BASE2:%[0-9]+]]:sreg_512 = INSERT_SUBREG [[BASE1]], [[VAL]], %subreg.sub2 -# CHECK: [[BASE3:%[0-9]+]]:sreg_512 = INSERT_SUBREG [[BASE2]], [[VAL]], %subreg.sub3 -# CHECK: [[BASE4:%[0-9]+]]:sreg_512 = INSERT_SUBREG [[BASE3]], [[VAL]], %subreg.sub4 -# CHECK: [[BASE5:%[0-9]+]]:sreg_512 = INSERT_SUBREG [[BASE4]], [[VAL]], %subreg.sub5 -# CHECK: [[BASE6:%[0-9]+]]:sreg_512 = INSERT_SUBREG [[BASE5]], [[VAL]], %subreg.sub6 -# CHECK: [[BASE7:%[0-9]+]]:sreg_512 = INSERT_SUBREG [[BASE6]], [[VAL]], %subreg.sub7 -# CHECK: [[BASE8:%[0-9]+]]:sreg_512 = INSERT_SUBREG [[BASE7]], [[VAL]], %subreg.sub8 -# CHECK: [[BASE9:%[0-9]+]]:sreg_512 = INSERT_SUBREG [[BASE8]], [[VAL]], %subreg.sub9 -# CHECK: [[BASE10:%[0-9]+]]:sreg_512 = INSERT_SUBREG [[BASE9]], [[VAL]], %subreg.sub10 -# CHECK: [[BASE11:%[0-9]+]]:sreg_512 = INSERT_SUBREG [[BASE10]], [[VAL]], %subreg.sub11 -# CHECK: [[BASE12:%[0-9]+]]:sreg_512 = INSERT_SUBREG [[BASE11]], [[VAL]], %subreg.sub12 -# CHECK: [[BASE13:%[0-9]+]]:sreg_512 = INSERT_SUBREG [[BASE12]], [[VAL]], %subreg.sub13 -# CHECK: [[BASE14:%[0-9]+]]:sreg_512 = INSERT_SUBREG [[BASE13]], [[VAL]], %subreg.sub14 -# CHECK: [[BASE15:%[0-9]+]]:sreg_512 = INSERT_SUBREG [[BASE14]], [[VAL]], %subreg.sub15 - body: | bb.0: + ; CHECK-LABEL: name: insert_s512_s32 + ; CHECK: [[DEF:%[0-9]+]]:sreg_512 = IMPLICIT_DEF + ; CHECK: [[DEF1:%[0-9]+]]:sreg_32_xm0 = IMPLICIT_DEF + ; CHECK: [[INSERT_SUBREG:%[0-9]+]]:sreg_512 = INSERT_SUBREG [[DEF]], [[DEF1]], %subreg.sub0 + ; CHECK: [[INSERT_SUBREG1:%[0-9]+]]:sreg_512 = INSERT_SUBREG [[INSERT_SUBREG]], [[DEF1]], %subreg.sub1 + ; CHECK: [[INSERT_SUBREG2:%[0-9]+]]:sreg_512 = INSERT_SUBREG [[INSERT_SUBREG1]], [[DEF1]], %subreg.sub2 + ; CHECK: [[INSERT_SUBREG3:%[0-9]+]]:sreg_512 = INSERT_SUBREG 
[[INSERT_SUBREG2]], [[DEF1]], %subreg.sub3 + ; CHECK: [[INSERT_SUBREG4:%[0-9]+]]:sreg_512 = INSERT_SUBREG [[INSERT_SUBREG3]], [[DEF1]], %subreg.sub4 + ; CHECK: [[INSERT_SUBREG5:%[0-9]+]]:sreg_512 = INSERT_SUBREG [[INSERT_SUBREG4]], [[DEF1]], %subreg.sub5 + ; CHECK: [[INSERT_SUBREG6:%[0-9]+]]:sreg_512 = INSERT_SUBREG [[INSERT_SUBREG5]], [[DEF1]], %subreg.sub6 + ; CHECK: [[INSERT_SUBREG7:%[0-9]+]]:sreg_512 = INSERT_SUBREG [[INSERT_SUBREG6]], [[DEF1]], %subreg.sub7 + ; CHECK: [[INSERT_SUBREG8:%[0-9]+]]:sreg_512 = INSERT_SUBREG [[INSERT_SUBREG7]], [[DEF1]], %subreg.sub8 + ; CHECK: [[INSERT_SUBREG9:%[0-9]+]]:sreg_512 = INSERT_SUBREG [[INSERT_SUBREG8]], [[DEF1]], %subreg.sub9 + ; CHECK: [[INSERT_SUBREG10:%[0-9]+]]:sreg_512 = INSERT_SUBREG [[INSERT_SUBREG9]], [[DEF1]], %subreg.sub10 + ; CHECK: [[INSERT_SUBREG11:%[0-9]+]]:sreg_512 = INSERT_SUBREG [[INSERT_SUBREG10]], [[DEF1]], %subreg.sub11 + ; CHECK: [[INSERT_SUBREG12:%[0-9]+]]:sreg_512 = INSERT_SUBREG [[INSERT_SUBREG11]], [[DEF1]], %subreg.sub12 + ; CHECK: [[INSERT_SUBREG13:%[0-9]+]]:sreg_512 = INSERT_SUBREG [[INSERT_SUBREG12]], [[DEF1]], %subreg.sub13 + ; CHECK: [[INSERT_SUBREG14:%[0-9]+]]:sreg_512 = INSERT_SUBREG [[INSERT_SUBREG13]], [[DEF1]], %subreg.sub14 + ; CHECK: [[INSERT_SUBREG15:%[0-9]+]]:sreg_512 = INSERT_SUBREG [[INSERT_SUBREG14]], [[DEF1]], %subreg.sub15 + ; CHECK: $sgpr0_sgpr1_sgpr2_sgpr3_sgpr4_sgpr5_sgpr6_sgpr7_sgpr8_sgpr9_sgpr10_sgpr11_sgpr12_sgpr13_sgpr14_sgpr15 = COPY [[INSERT_SUBREG15]] + ; CHECK: SI_RETURN_TO_EPILOG $sgpr0_sgpr1_sgpr2_sgpr3_sgpr4_sgpr5_sgpr6_sgpr7_sgpr8_sgpr9_sgpr10_sgpr11_sgpr12_sgpr13_sgpr14_sgpr15 %0:sgpr(s512) = G_IMPLICIT_DEF %1:sgpr(s32) = G_IMPLICIT_DEF %2:sgpr(s512) = G_INSERT %0:sgpr, %1:sgpr(s32), 0 @@ -47,3 +50,403 @@ body: | %17:sgpr(s512) = G_INSERT %16:sgpr, %1:sgpr(s32), 480 $sgpr0_sgpr1_sgpr2_sgpr3_sgpr4_sgpr5_sgpr6_sgpr7_sgpr8_sgpr9_sgpr10_sgpr11_sgpr12_sgpr13_sgpr14_sgpr15 = COPY %17:sgpr(s512) SI_RETURN_TO_EPILOG 
$sgpr0_sgpr1_sgpr2_sgpr3_sgpr4_sgpr5_sgpr6_sgpr7_sgpr8_sgpr9_sgpr10_sgpr11_sgpr12_sgpr13_sgpr14_sgpr15 + +--- + +name: insert_v_s64_v_s32_0 +legalized: true +regBankSelected: true + +body: | + bb.0: + liveins: $vgpr0_vgpr1, $vgpr2 + %0:vgpr(s64) = COPY $vgpr0_vgpr1 + %1:vgpr(s32) = COPY $vgpr2 + %2:vgpr(s64) = G_INSERT %0, %1, 0 + S_ENDPGM 0, implicit %2 +... + +--- + +name: insert_v_s64_v_s32_32 +legalized: true +regBankSelected: true + +body: | + bb.0: + liveins: $vgpr0_vgpr1, $vgpr2 + ; CHECK-LABEL: name: insert_v_s64_v_s32_32 + ; CHECK: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1 + ; CHECK: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2 + ; CHECK: [[INSERT_SUBREG:%[0-9]+]]:vreg_64 = INSERT_SUBREG [[COPY]], [[COPY1]], %subreg.sub1 + ; CHECK: S_ENDPGM 0, implicit [[INSERT_SUBREG]] + %0:vgpr(s64) = COPY $vgpr0_vgpr1 + %1:vgpr(s32) = COPY $vgpr2 + %2:vgpr(s64) = G_INSERT %0, %1, 32 + S_ENDPGM 0, implicit %2 +... + +--- + +name: insert_s_s64_s_s32_0 +legalized: true +regBankSelected: true + +body: | + bb.0: + liveins: $sgpr0_sgpr1, $sgpr2 + ; CHECK-LABEL: name: insert_s_s64_s_s32_0 + ; CHECK: [[COPY:%[0-9]+]]:sreg_64_xexec = COPY $sgpr0_sgpr1 + ; CHECK: [[COPY1:%[0-9]+]]:sreg_32_xm0 = COPY $sgpr2 + ; CHECK: [[INSERT_SUBREG:%[0-9]+]]:sreg_64_xexec = INSERT_SUBREG [[COPY]], [[COPY1]], %subreg.sub0 + ; CHECK: S_ENDPGM 0, implicit [[INSERT_SUBREG]] + %0:sgpr(s64) = COPY $sgpr0_sgpr1 + %1:sgpr(s32) = COPY $sgpr2 + %2:sgpr(s64) = G_INSERT %0, %1, 0 + S_ENDPGM 0, implicit %2 +... 
+ +--- + +name: insert_s_s64_s_s32_32 +legalized: true +regBankSelected: true + +body: | + bb.0: + liveins: $sgpr0_sgpr1, $sgpr2 + ; CHECK-LABEL: name: insert_s_s64_s_s32_32 + ; CHECK: [[COPY:%[0-9]+]]:sreg_64_xexec = COPY $sgpr0_sgpr1 + ; CHECK: [[COPY1:%[0-9]+]]:sreg_32_xm0 = COPY $sgpr2 + ; CHECK: [[INSERT_SUBREG:%[0-9]+]]:sreg_64_xexec = INSERT_SUBREG [[COPY]], [[COPY1]], %subreg.sub1 + ; CHECK: S_ENDPGM 0, implicit [[INSERT_SUBREG]] + %0:sgpr(s64) = COPY $sgpr0_sgpr1 + %1:sgpr(s32) = COPY $sgpr2 + %2:sgpr(s64) = G_INSERT %0, %1, 32 + S_ENDPGM 0, implicit %2 +... + +--- + +name: insert_s_s64_v_s32_32 +legalized: true +regBankSelected: true + +body: | + bb.0: + liveins: $sgpr0_sgpr1, $vgpr0 + ; CHECK-LABEL: name: insert_s_s64_v_s32_32 + ; CHECK: [[COPY:%[0-9]+]]:sreg_64_xexec = COPY $sgpr0_sgpr1 + ; CHECK: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2 + ; CHECK: [[INSERT_SUBREG:%[0-9]+]]:vreg_64 = INSERT_SUBREG [[COPY]], [[COPY1]], %subreg.sub1 + ; CHECK: S_ENDPGM 0, implicit [[INSERT_SUBREG]] + %0:sgpr(s64) = COPY $sgpr0_sgpr1 + %1:vgpr(s32) = COPY $vgpr2 + %2:vgpr(s64) = G_INSERT %0, %1, 32 + S_ENDPGM 0, implicit %2 +... + +--- + +name: insert_v_s64_s_s32_32 +legalized: true +regBankSelected: true + +body: | + bb.0: + liveins: $vgpr0_vgpr1, $sgpr0 + ; CHECK-LABEL: name: insert_v_s64_s_s32_32 + ; CHECK: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1 + ; CHECK: [[COPY1:%[0-9]+]]:sreg_32_xm0 = COPY $sgpr0 + ; CHECK: [[INSERT_SUBREG:%[0-9]+]]:vreg_64 = INSERT_SUBREG [[COPY]], [[COPY1]], %subreg.sub1 + ; CHECK: S_ENDPGM 0, implicit [[INSERT_SUBREG]] + %0:vgpr(s64) = COPY $vgpr0_vgpr1 + %1:sgpr(s32) = COPY $sgpr0 + %2:vgpr(s64) = G_INSERT %0, %1, 32 + S_ENDPGM 0, implicit %2 +... 
+ +--- + +name: insert_v_s96_v_s64_0 +legalized: true +regBankSelected: true + +body: | + bb.0: + liveins: $vgpr0_vgpr1_vgpr2, $vgpr3_vgpr4 + ; CHECK-LABEL: name: insert_v_s96_v_s64_0 + ; CHECK: [[COPY:%[0-9]+]]:vreg_96 = COPY $vgpr0_vgpr1_vgpr2 + ; CHECK: [[COPY1:%[0-9]+]]:vreg_64 = COPY $vgpr3_vgpr4 + ; CHECK: [[INSERT_SUBREG:%[0-9]+]]:vreg_96 = INSERT_SUBREG [[COPY]], [[COPY1]], %subreg.sub0_sub1 + ; CHECK: S_ENDPGM 0, implicit [[INSERT_SUBREG]] + %0:vgpr(s96) = COPY $vgpr0_vgpr1_vgpr2 + %1:vgpr(s64) = COPY $vgpr3_vgpr4 + %2:vgpr(s96) = G_INSERT %0, %1, 0 + S_ENDPGM 0, implicit %2 +... + +--- + +name: insert_v_s96_v_s64_32 +legalized: true +regBankSelected: true + +body: | + bb.0: + liveins: $vgpr0_vgpr1_vgpr2, $vgpr3_vgpr4 + ; CHECK-LABEL: name: insert_v_s96_v_s64_32 + ; CHECK: [[COPY:%[0-9]+]]:vreg_96 = COPY $vgpr0_vgpr1_vgpr2 + ; CHECK: [[COPY1:%[0-9]+]]:vreg_64 = COPY $vgpr3_vgpr4 + ; CHECK: [[INSERT_SUBREG:%[0-9]+]]:vreg_96 = INSERT_SUBREG [[COPY]], [[COPY1]], %subreg.sub1_sub2 + ; CHECK: S_ENDPGM 0, implicit [[INSERT_SUBREG]] + %0:vgpr(s96) = COPY $vgpr0_vgpr1_vgpr2 + %1:vgpr(s64) = COPY $vgpr3_vgpr4 + %2:vgpr(s96) = G_INSERT %0, %1, 32 + S_ENDPGM 0, implicit %2 +... + +--- + +name: insert_s_s96_s_s64_0 +legalized: true +regBankSelected: true + +body: | + bb.0: + liveins: $sgpr0_sgpr1_sgpr2, $sgpr4_sgpr5 + ; CHECK-LABEL: name: insert_s_s96_s_s64_0 + ; CHECK: [[COPY:%[0-9]+]]:sgpr_96_with_sub0_sub1 = COPY $sgpr0_sgpr1_sgpr2 + ; CHECK: [[COPY1:%[0-9]+]]:sreg_64_xexec = COPY $sgpr4_sgpr5 + ; CHECK: [[INSERT_SUBREG:%[0-9]+]]:sreg_96 = INSERT_SUBREG [[COPY]], [[COPY1]], %subreg.sub0_sub1 + ; CHECK: S_ENDPGM 0, implicit [[INSERT_SUBREG]] + %0:sgpr(s96) = COPY $sgpr0_sgpr1_sgpr2 + %1:sgpr(s64) = COPY $sgpr4_sgpr5 + %2:sgpr(s96) = G_INSERT %0, %1, 0 + S_ENDPGM 0, implicit %2 +... 
+ +--- + +name: insert_s_s96_s_s64_32 +legalized: true +regBankSelected: true + +body: | + bb.0: + liveins: $sgpr0_sgpr1_sgpr2, $sgpr4_sgpr5 + ; CHECK-LABEL: name: insert_s_s96_s_s64_32 + ; CHECK: [[COPY:%[0-9]+]]:sgpr_96_with_sub1_sub2 = COPY $sgpr0_sgpr1_sgpr2 + ; CHECK: [[COPY1:%[0-9]+]]:sreg_64_xexec = COPY $sgpr4_sgpr5 + ; CHECK: [[INSERT_SUBREG:%[0-9]+]]:sreg_96 = INSERT_SUBREG [[COPY]], [[COPY1]], %subreg.sub1_sub2 + ; CHECK: S_ENDPGM 0, implicit [[INSERT_SUBREG]] + %0:sgpr(s96) = COPY $sgpr0_sgpr1_sgpr2 + %1:sgpr(s64) = COPY $sgpr4_sgpr5 + %2:sgpr(s96) = G_INSERT %0, %1, 32 + S_ENDPGM 0, implicit %2 +... + +--- + +name: insert_s_s128_s_s64_0 +legalized: true +regBankSelected: true + +body: | + bb.0: + liveins: $sgpr0_sgpr1_sgpr2_sgpr3, $sgpr4_sgpr5 + ; CHECK-LABEL: name: insert_s_s128_s_s64_0 + ; CHECK: [[COPY:%[0-9]+]]:sreg_128 = COPY $sgpr0_sgpr1_sgpr2_sgpr3 + ; CHECK: [[COPY1:%[0-9]+]]:sreg_64_xexec = COPY $sgpr4_sgpr5 + ; CHECK: [[INSERT_SUBREG:%[0-9]+]]:sreg_128 = INSERT_SUBREG [[COPY]], [[COPY1]], %subreg.sub0_sub1 + ; CHECK: S_ENDPGM 0, implicit [[INSERT_SUBREG]] + %0:sgpr(s128) = COPY $sgpr0_sgpr1_sgpr2_sgpr3 + %1:sgpr(s64) = COPY $sgpr4_sgpr5 + %2:sgpr(s128) = G_INSERT %0, %1, 0 + S_ENDPGM 0, implicit %2 +... + +# --- + +# name: insert_s_s128_s_s64_32 +# legalized: true +# regBankSelected: true + +# body: | +# bb.0: +# liveins: $sgpr0_sgpr1_sgpr2_sgpr3, $sgpr4_sgpr5 +# %0:sgpr(s128) = COPY $sgpr0_sgpr1_sgpr2_sgpr3 +# %1:sgpr(s64) = COPY $sgpr4_sgpr5 +# %2:sgpr(s128) = G_INSERT %0, %1, 32 +# S_ENDPGM 0, implicit %2 +# ... 
+ +--- + +name: insert_s_s128_s_s64_64 +legalized: true +regBankSelected: true + +body: | + bb.0: + liveins: $sgpr0_sgpr1_sgpr2_sgpr3, $sgpr4_sgpr5 + ; CHECK-LABEL: name: insert_s_s128_s_s64_64 + ; CHECK: [[COPY:%[0-9]+]]:sreg_128 = COPY $sgpr0_sgpr1_sgpr2_sgpr3 + ; CHECK: [[COPY1:%[0-9]+]]:sreg_64_xexec = COPY $sgpr4_sgpr5 + ; CHECK: [[INSERT_SUBREG:%[0-9]+]]:sreg_128 = INSERT_SUBREG [[COPY]], [[COPY1]], %subreg.sub2_sub3 + ; CHECK: S_ENDPGM 0, implicit [[INSERT_SUBREG]] + %0:sgpr(s128) = COPY $sgpr0_sgpr1_sgpr2_sgpr3 + %1:sgpr(s64) = COPY $sgpr4_sgpr5 + %2:sgpr(s128) = G_INSERT %0, %1, 64 + S_ENDPGM 0, implicit %2 +... + +--- + +name: insert_s_s256_s_s64_96 +legalized: true +regBankSelected: true + +body: | + bb.0: + liveins: $sgpr0_sgpr1_sgpr2_sgpr3_sgpr4_sgpr5_sgpr6_sgpr7, $sgpr8_sgpr9 + ; CHECK-LABEL: name: insert_s_s256_s_s64_96 + ; CHECK: [[COPY:%[0-9]+]]:sreg_256 = COPY $sgpr0_sgpr1_sgpr2_sgpr3_sgpr4_sgpr5_sgpr6_sgpr7 + ; CHECK: [[COPY1:%[0-9]+]]:sreg_64_xexec = COPY $sgpr8_sgpr9 + ; CHECK: [[INSERT_SUBREG:%[0-9]+]]:sreg_256 = INSERT_SUBREG [[COPY]], [[COPY1]], %subreg.sub4_sub5 + ; CHECK: S_ENDPGM 0, implicit [[INSERT_SUBREG]] + %0:sgpr(s256) = COPY $sgpr0_sgpr1_sgpr2_sgpr3_sgpr4_sgpr5_sgpr6_sgpr7 + %1:sgpr(s64) = COPY $sgpr8_sgpr9 + %2:sgpr(s256) = G_INSERT %0, %1, 96 + S_ENDPGM 0, implicit %2 +... + +# --- + +# name: insert_s_s256_s_s64_128 +# legalized: true +# regBankSelected: true + +# body: | +# bb.0: +# liveins: $sgpr0_sgpr1_sgpr2_sgpr3_sgpr4_sgpr5_sgpr6_sgpr7, $sgpr8_sgpr9 +# %0:sgpr(s256) = COPY $sgpr0_sgpr1_sgpr2_sgpr3_sgpr4_sgpr5_sgpr6_sgpr7 +# %1:sgpr(s64) = COPY $sgpr4_sgpr5 +# %2:sgpr(s256) = G_INSERT %0, %1, 128 +# S_ENDPGM 0, implicit %2 +# ... 
+ +# --- + +# name: insert_s_s256_s_s64_160 +# legalized: true +# regBankSelected: true + +# body: | +# bb.0: +# liveins: $sgpr0_sgpr1_sgpr2_sgpr3_sgpr4_sgpr5_sgpr6_sgpr7, $sgpr8_sgpr9 +# %0:sgpr(s256) = COPY $sgpr0_sgpr1_sgpr2_sgpr3_sgpr4_sgpr5_sgpr6_sgpr7 +# %1:sgpr(s64) = COPY $sgpr4_sgpr5 +# %2:sgpr(s256) = G_INSERT %0, %1, 160 +# S_ENDPGM 0, implicit %2 +# ... + +--- + +name: insert_s_s128_s_s96_0 +legalized: true +regBankSelected: true + +body: | + bb.0: + liveins: $sgpr0_sgpr1_sgpr2_sgpr3, $sgpr6_sgpr7_sgpr8 + ; CHECK-LABEL: name: insert_s_s128_s_s96_0 + ; CHECK: [[COPY:%[0-9]+]]:sgpr_128_with_sub0_sub1_sub2 = COPY $sgpr0_sgpr1_sgpr2_sgpr3 + ; CHECK: [[COPY1:%[0-9]+]]:sreg_96 = COPY $sgpr6_sgpr7_sgpr8 + ; CHECK: [[INSERT_SUBREG:%[0-9]+]]:sreg_128 = INSERT_SUBREG [[COPY]], [[COPY1]], %subreg.sub0_sub1_sub2 + ; CHECK: S_ENDPGM 0, implicit [[INSERT_SUBREG]] + %0:sgpr(s128) = COPY $sgpr0_sgpr1_sgpr2_sgpr3 + %1:sgpr(s96) = COPY $sgpr6_sgpr7_sgpr8 + %2:sgpr(s128) = G_INSERT %0, %1, 0 + S_ENDPGM 0, implicit %2 +... + +--- + +name: insert_s_s128_s_s96_32 +legalized: true +regBankSelected: true + +body: | + bb.0: + liveins: $sgpr0_sgpr1_sgpr2_sgpr3, $sgpr6_sgpr7_sgpr8 + ; CHECK-LABEL: name: insert_s_s128_s_s96_32 + ; CHECK: [[COPY:%[0-9]+]]:sgpr_128_with_sub1_sub2_sub3 = COPY $sgpr0_sgpr1_sgpr2_sgpr3 + ; CHECK: [[COPY1:%[0-9]+]]:sreg_96 = COPY $sgpr6_sgpr7_sgpr8 + ; CHECK: [[INSERT_SUBREG:%[0-9]+]]:sreg_128 = INSERT_SUBREG [[COPY]], [[COPY1]], %subreg.sub1_sub2_sub3 + ; CHECK: S_ENDPGM 0, implicit [[INSERT_SUBREG]] + %0:sgpr(s128) = COPY $sgpr0_sgpr1_sgpr2_sgpr3 + %1:sgpr(s96) = COPY $sgpr6_sgpr7_sgpr8 + %2:sgpr(s128) = G_INSERT %0, %1, 32 + S_ENDPGM 0, implicit %2 +... 
+ +--- + +name: insert_s_s160_s_s96_0 +legalized: true +regBankSelected: true + +body: | + bb.0: + liveins: $sgpr0_sgpr1_sgpr2_sgpr3_sgpr4, $sgpr6_sgpr7_sgpr8 + ; CHECK-LABEL: name: insert_s_s160_s_s96_0 + ; CHECK: [[COPY:%[0-9]+]]:sgpr_160_with_sub0_sub1_sub2 = COPY $sgpr0_sgpr1_sgpr2_sgpr3_sgpr4 + ; CHECK: [[COPY1:%[0-9]+]]:sreg_96 = COPY $sgpr6_sgpr7_sgpr8 + ; CHECK: [[INSERT_SUBREG:%[0-9]+]]:sreg_160 = INSERT_SUBREG [[COPY]], [[COPY1]], %subreg.sub0_sub1_sub2 + ; CHECK: S_ENDPGM 0, implicit [[INSERT_SUBREG]] + %0:sgpr(s160) = COPY $sgpr0_sgpr1_sgpr2_sgpr3_sgpr4 + %1:sgpr(s96) = COPY $sgpr6_sgpr7_sgpr8 + %2:sgpr(s160) = G_INSERT %0, %1, 0 + S_ENDPGM 0, implicit %2 +... + +--- + +name: insert_s_s160_s_s96_32 +legalized: true +regBankSelected: true + +body: | + bb.0: + liveins: $sgpr0_sgpr1_sgpr2_sgpr3_sgpr4, $sgpr6_sgpr7_sgpr8 + ; CHECK-LABEL: name: insert_s_s160_s_s96_32 + ; CHECK: [[COPY:%[0-9]+]]:sgpr_160_with_sub1_sub2_sub3 = COPY $sgpr0_sgpr1_sgpr2_sgpr3_sgpr4 + ; CHECK: [[COPY1:%[0-9]+]]:sreg_96 = COPY $sgpr6_sgpr7_sgpr8 + ; CHECK: [[INSERT_SUBREG:%[0-9]+]]:sreg_160 = INSERT_SUBREG [[COPY]], [[COPY1]], %subreg.sub1_sub2_sub3 + ; CHECK: S_ENDPGM 0, implicit [[INSERT_SUBREG]] + %0:sgpr(s160) = COPY $sgpr0_sgpr1_sgpr2_sgpr3_sgpr4 + %1:sgpr(s96) = COPY $sgpr6_sgpr7_sgpr8 + %2:sgpr(s160) = G_INSERT %0, %1, 32 + S_ENDPGM 0, implicit %2 +... 
+ +--- + +name: insert_s_s160_s_s96_64 +legalized: true +regBankSelected: true + +body: | + bb.0: + liveins: $sgpr0_sgpr1_sgpr2_sgpr3_sgpr4, $sgpr6_sgpr7_sgpr8 + ; CHECK-LABEL: name: insert_s_s160_s_s96_64 + ; CHECK: [[COPY:%[0-9]+]]:sgpr_160_with_sub2_sub3_sub4 = COPY $sgpr0_sgpr1_sgpr2_sgpr3_sgpr4 + ; CHECK: [[COPY1:%[0-9]+]]:sreg_96 = COPY $sgpr6_sgpr7_sgpr8 + ; CHECK: [[INSERT_SUBREG:%[0-9]+]]:sreg_160 = INSERT_SUBREG [[COPY]], [[COPY1]], %subreg.sub2_sub3_sub4 + ; CHECK: S_ENDPGM 0, implicit [[INSERT_SUBREG]] + %0:sgpr(s160) = COPY $sgpr0_sgpr1_sgpr2_sgpr3_sgpr4 + %1:sgpr(s96) = COPY $sgpr6_sgpr7_sgpr8 + %2:sgpr(s160) = G_INSERT %0, %1, 64 + S_ENDPGM 0, implicit %2 +... From llvm-commits at lists.llvm.org Mon Oct 7 11:41:28 2019 From: llvm-commits at lists.llvm.org (Matt Arsenault via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 18:41:28 +0000 (UTC) Subject: [PATCH] D68582: GlobalISel: Add target pre-isel instructions In-Reply-To: References: Message-ID: arsenm closed this revision. arsenm marked an inline comment as done. arsenm added a comment. r373937 CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68582/new/ https://reviews.llvm.org/D68582 From llvm-commits at lists.llvm.org Mon Oct 7 11:41:43 2019 From: llvm-commits at lists.llvm.org (Matt Arsenault via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 18:41:43 +0000 (UTC) Subject: [PATCH] D68416: AMDGPU/GlobalISel: Select more G_INSERT cases In-Reply-To: References: Message-ID: <857e93a39da621c62b8dd05339352ced@localhost.localdomain> arsenm closed this revision. arsenm added a comment. 
r373938 CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68416/new/ https://reviews.llvm.org/D68416 From llvm-commits at lists.llvm.org Mon Oct 7 11:43:22 2019 From: llvm-commits at lists.llvm.org (Jordan Rose via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 18:43:22 +0000 (UTC) Subject: [PATCH] D68586: Save a word in every StringSet entry In-Reply-To: References: Message-ID: jordan_rose added a comment. Hm, doesn't quite work yet but I'll get there. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68586/new/ https://reviews.llvm.org/D68586 From llvm-commits at lists.llvm.org Mon Oct 7 11:43:37 2019 From: llvm-commits at lists.llvm.org (Jake Ehrlich via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 18:43:37 +0000 (UTC) Subject: [PATCH] D68146: [tests] Output of od can be lower or upper case (llvm-objcopy/yaml2obj). In-Reply-To: References: Message-ID: <3dfd1c7d832ebaf56064c291197c1f6b@localhost.localdomain> jakehehrlich added a comment. Seems like a good idea to me! CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68146/new/ https://reviews.llvm.org/D68146 From llvm-commits at lists.llvm.org Mon Oct 7 11:44:24 2019 From: llvm-commits at lists.llvm.org (Austin Kerbow via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 18:44:24 +0000 (UTC) Subject: [PATCH] D68437: AMDGPU/GlobalISel: Use S_MOV_B64 for inline constants In-Reply-To: References: Message-ID: kerbowa accepted this revision. kerbowa added a comment. This revision is now accepted and ready to land. LGTM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68437/new/ https://reviews.llvm.org/D68437 From llvm-commits at lists.llvm.org Mon Oct 7 11:45:52 2019 From: llvm-commits at lists.llvm.org (Austin Kerbow via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 18:45:52 +0000 (UTC) Subject: [PATCH] D68540: AMDGPU/GlobalISel: Handle more G_INSERT cases In-Reply-To: References: Message-ID: kerbowa accepted this revision. kerbowa added a comment. 
This revision is now accepted and ready to land. LGTM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68540/new/ https://reviews.llvm.org/D68540 From llvm-commits at lists.llvm.org Mon Oct 7 11:51:56 2019 From: llvm-commits at lists.llvm.org (Dmitry Mikulin via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 18:51:56 +0000 (UTC) Subject: [PATCH] D67985: CFI: wrong type passed to llvm.type.test with multiple inheritance devirtualization In-Reply-To: References: Message-ID: dmikulin updated this revision to Diff 223634. dmikulin added a comment. Added a new CodeGenCXX test case CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67985/new/ https://reviews.llvm.org/D67985 Files: clang/lib/CodeGen/CGExprCXX.cpp clang/test/CodeGenCXX/cfi-multiple-inheritance.cpp compiler-rt/test/cfi/multiple-inheritance2.cpp Index: compiler-rt/test/cfi/multiple-inheritance2.cpp =================================================================== --- /dev/null +++ compiler-rt/test/cfi/multiple-inheritance2.cpp @@ -0,0 +1,38 @@ +// Test that virtual functions of the derived class can be called through +// pointers of both base classes without CFI errors. +// Related to Bugzilla 43390.
+ +// RUN: %clangxx_cfi -o %t1 %s +// RUN: %run %t1 2>&1 | FileCheck --check-prefix=CFI %s + +// CFI: In f1 +// CFI: In f2 +// CFI-NOT: control flow integrity check + +// REQUIRES: cxxabi + +#include + +class A1 { +public: + virtual void f1() = 0; +}; + +class A2 { +public: + virtual void f2() = 0; +}; + + +class B : public A1, public A2 { +public: + void f2() final { fprintf(stderr, "In f2\n"); } + void f1() final { fprintf(stderr, "In f1\n"); } +}; + +int main() { + B b; + + static_cast(&b)->f1(); + static_cast(&b)->f2(); +} Index: clang/test/CodeGenCXX/cfi-multiple-inheritance.cpp =================================================================== --- /dev/null +++ clang/test/CodeGenCXX/cfi-multiple-inheritance.cpp @@ -0,0 +1,31 @@ +// Test that correct vtable ptr and type metadata are passed to llvm.type.test +// Related to Bugzilla 43390. + +// RUN: %clang_cc1 -triple x86_64-unknown-linux -fvisibility hidden -std=c++11 -fsanitize=cfi-nvcall -emit-llvm -o - %s | FileCheck %s + +class A1 { +public: + virtual int f1() = 0; +}; + +class A2 { +public: + virtual int f2() = 0; +}; + + +class B : public A1, public A2 { +public: + int f2() final { return 1; } + int f1() final { return 2; } +}; + +// CHECK-LABEL: define hidden i32 @_Z3foov +int foo() { + B b; + return static_cast(&b)->f2(); + // CHECK: [[P:%[^ ]*]] = bitcast %class.B* %b to i8** + // CHECK: [[V:%[^ ]*]] = load i8*, i8** [[P]], align 8 + // CHECK: call i1 @llvm.type.test(i8* [[V]], metadata !"_ZTS1B") + // CHECK: call i1 @llvm.type.test(i8* [[V]], metadata !"all-vtables") +} Index: clang/lib/CodeGen/CGExprCXX.cpp =================================================================== --- clang/lib/CodeGen/CGExprCXX.cpp +++ clang/lib/CodeGen/CGExprCXX.cpp @@ -382,7 +382,7 @@ const CXXRecordDecl *RD; std::tie(VTable, RD) = CGM.getCXXABI().LoadVTablePtr(*this, This.getAddress(), - MD->getParent()); + CalleeDecl->getParent()); EmitVTablePtrCheckForCall(RD, VTable, CFITCK_NVCall, CE->getBeginLoc()); } 
-------------- next part -------------- A non-text attachment was scrubbed... Name: D67985.223634.patch Type: text/x-patch Size: 2545 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 12:05:58 2019 From: llvm-commits at lists.llvm.org (Matt Arsenault via llvm-commits) Date: Mon, 07 Oct 2019 19:05:58 -0000 Subject: [llvm] r373942 - AMDGPU/GlobalISel: Widen 16-bit G_MERGE_VALUEs sources Message-ID: <20191007190559.268A48A9B5@lists.llvm.org> Author: arsenm Date: Mon Oct 7 12:05:58 2019 New Revision: 373942 URL: http://llvm.org/viewvc/llvm-project?rev=373942&view=rev Log: AMDGPU/GlobalISel: Widen 16-bit G_MERGE_VALUEs sources Continue making a mess of merge/unmerge legality. Modified: llvm/trunk/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-load-constant-32bit.mir llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-load-constant.mir llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-load-flat.mir llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-load-global.mir llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-load-local.mir llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-load-private.mir llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-merge-values.mir Modified: llvm/trunk/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp?rev=373942&r1=373941&r2=373942&view=diff ============================================================================== --- llvm/trunk/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp (original) +++ llvm/trunk/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp Mon Oct 7 12:05:58 2019 @@ -988,7 +988,7 @@ AMDGPULegalizerInfo::AMDGPULegalizerInfo return false; }; - getActionDefinitionsBuilder(Op) + auto &Builder = getActionDefinitionsBuilder(Op) .widenScalarToNextPow2(LitTyIdx, /*Min*/ 16) // Clamp the little scalar to s8-s256 and make it a power of 2. 
It's not // worth considering the multiples of 64 since 2*192 and 2*384 are not @@ -1007,25 +1007,36 @@ AMDGPULegalizerInfo::AMDGPULegalizerInfo [=](const LegalityQuery &Query) { return notValidElt(Query, 1); }, scalarize(1)) .clampScalar(BigTyIdx, S32, S1024) - .lowerFor({{S16, V2S16}}) - .widenScalarIf( + .lowerFor({{S16, V2S16}}); + + if (Op == G_MERGE_VALUES) { + Builder.widenScalarIf( + // TODO: Use 16-bit shifts if legal for 8-bit values? [=](const LegalityQuery &Query) { - const LLT &Ty = Query.Types[BigTyIdx]; - return !isPowerOf2_32(Ty.getSizeInBits()) && - Ty.getSizeInBits() % 16 != 0; + const LLT Ty = Query.Types[LitTyIdx]; + return Ty.getSizeInBits() < 32; }, - [=](const LegalityQuery &Query) { - // Pick the next power of 2, or a multiple of 64 over 128. - // Whichever is smaller. - const LLT &Ty = Query.Types[BigTyIdx]; - unsigned NewSizeInBits = 1 << Log2_32_Ceil(Ty.getSizeInBits() + 1); - if (NewSizeInBits >= 256) { - unsigned RoundedTo = alignTo<64>(Ty.getSizeInBits() + 1); - if (RoundedTo < NewSizeInBits) - NewSizeInBits = RoundedTo; - } - return std::make_pair(BigTyIdx, LLT::scalar(NewSizeInBits)); - }) + changeTo(LitTyIdx, S32)); + } + + Builder.widenScalarIf( + [=](const LegalityQuery &Query) { + const LLT Ty = Query.Types[BigTyIdx]; + return !isPowerOf2_32(Ty.getSizeInBits()) && + Ty.getSizeInBits() % 16 != 0; + }, + [=](const LegalityQuery &Query) { + // Pick the next power of 2, or a multiple of 64 over 128. + // Whichever is smaller. 
+ const LLT &Ty = Query.Types[BigTyIdx]; + unsigned NewSizeInBits = 1 << Log2_32_Ceil(Ty.getSizeInBits() + 1); + if (NewSizeInBits >= 256) { + unsigned RoundedTo = alignTo<64>(Ty.getSizeInBits() + 1); + if (RoundedTo < NewSizeInBits) + NewSizeInBits = RoundedTo; + } + return std::make_pair(BigTyIdx, LLT::scalar(NewSizeInBits)); + }) .legalIf([=](const LegalityQuery &Query) { const LLT &BigTy = Query.Types[BigTyIdx]; const LLT &LitTy = Query.Types[LitTyIdx]; Modified: llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-load-constant-32bit.mir URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-load-constant-32bit.mir?rev=373942&r1=373941&r2=373942&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-load-constant-32bit.mir (original) +++ llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-load-constant-32bit.mir Mon Oct 7 12:05:58 2019 @@ -39,8 +39,12 @@ body: | ; CI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C5]](s32) ; CI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[SHL1]](s32) ; CI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[TRUNC3]] - ; CI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; CI: $vgpr0 = COPY [[MV1]](s32) + ; CI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C7]](s32) + ; CI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; CI: $vgpr0 = COPY [[OR2]](s32) %0:_(p6) = COPY $vgpr0 %1:_(s32) = G_LOAD %0 :: (load 4, align 1, addrspace 6) $vgpr0 = COPY %1 Modified: llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-load-constant.mir URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-load-constant.mir?rev=373942&r1=373941&r2=373942&view=diff 
============================================================================== --- llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-load-constant.mir (original) +++ llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-load-constant.mir Mon Oct 7 12:05:58 2019 @@ -383,53 +383,78 @@ body: | ; CI-LABEL: name: test_load_constant_s32_align2 ; CI: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 ; CI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p4) :: (load 2, addrspace 4) - ; CI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; CI: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; CI: [[GEP:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C]](s64) ; CI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p4) :: (load 2, addrspace 4) - ; CI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; CI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; CI: $vgpr0 = COPY [[MV]](s32) + ; CI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; CI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; CI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; CI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; CI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; CI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; CI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; CI: $vgpr0 = COPY [[OR]](s32) ; VI-LABEL: name: test_load_constant_s32_align2 ; VI: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p4) :: (load 2, addrspace 4) - ; VI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; VI: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; VI: [[GEP:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C]](s64) ; VI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p4) :: (load 2, addrspace 4) - ; VI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; VI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; VI: $vgpr0 = COPY [[MV]](s32) + ; VI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; VI: 
[[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; VI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; VI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; VI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; VI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; VI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; VI: $vgpr0 = COPY [[OR]](s32) ; GFX9-LABEL: name: test_load_constant_s32_align2 ; GFX9: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 ; GFX9: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p4) :: (load 2, addrspace 4) - ; GFX9: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; GFX9: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; GFX9: [[GEP:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C]](s64) ; GFX9: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p4) :: (load 2, addrspace 4) - ; GFX9: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; GFX9: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; GFX9: $vgpr0 = COPY [[MV]](s32) + ; GFX9: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; GFX9: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; GFX9: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; GFX9: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; GFX9: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; GFX9: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; GFX9: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; GFX9: $vgpr0 = COPY [[OR]](s32) ; CI-MESA-LABEL: name: test_load_constant_s32_align2 ; CI-MESA: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 ; CI-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p4) :: (load 2, addrspace 4) - ; CI-MESA: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; CI-MESA: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; CI-MESA: [[GEP:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C]](s64) ; CI-MESA: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p4) :: (load 2, addrspace 4) - ; CI-MESA: [[TRUNC1:%[0-9]+]]:_(s16) = 
G_TRUNC [[LOAD1]](s32) - ; CI-MESA: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; CI-MESA: $vgpr0 = COPY [[MV]](s32) + ; CI-MESA: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; CI-MESA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; CI-MESA: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; CI-MESA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; CI-MESA: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; CI-MESA: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-MESA: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; CI-MESA: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; CI-MESA: $vgpr0 = COPY [[OR]](s32) ; GFX9-MESA-LABEL: name: test_load_constant_s32_align2 ; GFX9-MESA: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 ; GFX9-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p4) :: (load 2, addrspace 4) - ; GFX9-MESA: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; GFX9-MESA: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; GFX9-MESA: [[GEP:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C]](s64) ; GFX9-MESA: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p4) :: (load 2, addrspace 4) - ; GFX9-MESA: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; GFX9-MESA: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; GFX9-MESA: $vgpr0 = COPY [[MV]](s32) + ; GFX9-MESA: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; GFX9-MESA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; GFX9-MESA: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; GFX9-MESA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; GFX9-MESA: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; GFX9-MESA: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9-MESA: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; GFX9-MESA: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; GFX9-MESA: $vgpr0 = COPY [[OR]](s32) %0:_(p4) = COPY $vgpr0_vgpr1 %1:_(s32) = G_LOAD %0 :: (load 4, align 2, addrspace 4) $vgpr0 = COPY %1 @@ -471,8 +496,12 @@ body: | ; CI: 
[[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) ; CI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[SHL1]](s32) ; CI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[TRUNC3]] - ; CI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; CI: $vgpr0 = COPY [[MV]](s32) + ; CI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C6]](s32) + ; CI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; CI: $vgpr0 = COPY [[OR2]](s32) ; VI-LABEL: name: test_load_constant_s32_align1 ; VI: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p4) :: (load 1, addrspace 4) @@ -499,8 +528,12 @@ body: | ; VI: [[AND3:%[0-9]+]]:_(s16) = G_AND [[TRUNC3]], [[C3]] ; VI: [[SHL1:%[0-9]+]]:_(s16) = G_SHL [[AND3]], [[C4]](s16) ; VI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[SHL1]] - ; VI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; VI: $vgpr0 = COPY [[MV]](s32) + ; VI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; VI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; VI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C5]](s32) + ; VI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; VI: $vgpr0 = COPY [[OR2]](s32) ; GFX9-LABEL: name: test_load_constant_s32_align1 ; GFX9: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 ; GFX9: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p4) :: (load 1, addrspace 4) @@ -527,8 +560,12 @@ body: | ; GFX9: [[AND3:%[0-9]+]]:_(s16) = G_AND [[TRUNC3]], [[C3]] ; GFX9: [[SHL1:%[0-9]+]]:_(s16) = G_SHL [[AND3]], [[C4]](s16) ; GFX9: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[SHL1]] - ; GFX9: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; GFX9: $vgpr0 = COPY [[MV]](s32) + ; GFX9: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; 
GFX9: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C5]](s32) + ; GFX9: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; GFX9: $vgpr0 = COPY [[OR2]](s32) ; CI-MESA-LABEL: name: test_load_constant_s32_align1 ; CI-MESA: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 ; CI-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p4) :: (load 1, addrspace 4) @@ -559,8 +596,12 @@ body: | ; CI-MESA: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) ; CI-MESA: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[SHL1]](s32) ; CI-MESA: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[TRUNC3]] - ; CI-MESA: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; CI-MESA: $vgpr0 = COPY [[MV]](s32) + ; CI-MESA: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI-MESA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI-MESA: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-MESA: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C6]](s32) + ; CI-MESA: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; CI-MESA: $vgpr0 = COPY [[OR2]](s32) ; GFX9-MESA-LABEL: name: test_load_constant_s32_align1 ; GFX9-MESA: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 ; GFX9-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p4) :: (load 1, addrspace 4) @@ -587,8 +628,12 @@ body: | ; GFX9-MESA: [[AND3:%[0-9]+]]:_(s16) = G_AND [[TRUNC3]], [[C3]] ; GFX9-MESA: [[SHL1:%[0-9]+]]:_(s16) = G_SHL [[AND3]], [[C4]](s16) ; GFX9-MESA: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[SHL1]] - ; GFX9-MESA: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; GFX9-MESA: $vgpr0 = COPY [[MV]](s32) + ; GFX9-MESA: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9-MESA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9-MESA: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9-MESA: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C5]](s32) + ; GFX9-MESA: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; GFX9-MESA: $vgpr0 = COPY [[OR2]](s32) %0:_(p4) = COPY $vgpr0_vgpr1 
%1:_(s32) = G_LOAD %0 :: (load 4, align 1, addrspace 4) $vgpr0 = COPY %1 @@ -712,92 +757,142 @@ body: | ; CI-LABEL: name: test_load_constant_s64_align2 ; CI: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 ; CI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p4) :: (load 2, addrspace 4) - ; CI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; CI: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; CI: [[GEP:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C]](s64) ; CI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p4) :: (load 2, addrspace 4) - ; CI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; CI: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 ; CI: [[GEP1:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C1]](s64) ; CI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p4) :: (load 2, addrspace 4) - ; CI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; CI: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 6 ; CI: [[GEP2:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C2]](s64) ; CI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p4) :: (load 2, addrspace 4) - ; CI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; CI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) + ; CI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; CI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; CI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C3]] + ; CI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; CI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C3]] + ; CI: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C4]](s32) + ; CI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; CI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; CI: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C3]] + ; CI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; CI: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C3]] + ; CI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) + ; CI: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; CI: [[MV:%[0-9]+]]:_(s64) = 
G_MERGE_VALUES [[OR]](s32), [[OR1]](s32) ; CI: $vgpr0_vgpr1 = COPY [[MV]](s64) ; VI-LABEL: name: test_load_constant_s64_align2 ; VI: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p4) :: (load 2, addrspace 4) - ; VI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; VI: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; VI: [[GEP:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C]](s64) ; VI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p4) :: (load 2, addrspace 4) - ; VI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; VI: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 ; VI: [[GEP1:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C1]](s64) ; VI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p4) :: (load 2, addrspace 4) - ; VI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; VI: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 6 ; VI: [[GEP2:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C2]](s64) ; VI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p4) :: (load 2, addrspace 4) - ; VI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; VI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) + ; VI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; VI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; VI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C3]] + ; VI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; VI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C3]] + ; VI: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C4]](s32) + ; VI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; VI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; VI: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C3]] + ; VI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; VI: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C3]] + ; VI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) + ; VI: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; VI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s32), 
[[OR1]](s32) ; VI: $vgpr0_vgpr1 = COPY [[MV]](s64) ; GFX9-LABEL: name: test_load_constant_s64_align2 ; GFX9: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 ; GFX9: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p4) :: (load 2, addrspace 4) - ; GFX9: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; GFX9: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; GFX9: [[GEP:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C]](s64) ; GFX9: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p4) :: (load 2, addrspace 4) - ; GFX9: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; GFX9: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 ; GFX9: [[GEP1:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C1]](s64) ; GFX9: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p4) :: (load 2, addrspace 4) - ; GFX9: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; GFX9: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 6 ; GFX9: [[GEP2:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C2]](s64) ; GFX9: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p4) :: (load 2, addrspace 4) - ; GFX9: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; GFX9: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) + ; GFX9: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; GFX9: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; GFX9: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C3]] + ; GFX9: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; GFX9: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C3]] + ; GFX9: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C4]](s32) + ; GFX9: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; GFX9: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; GFX9: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C3]] + ; GFX9: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; GFX9: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C3]] + ; GFX9: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) + ; GFX9: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; GFX9: 
[[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32) ; GFX9: $vgpr0_vgpr1 = COPY [[MV]](s64) ; CI-MESA-LABEL: name: test_load_constant_s64_align2 ; CI-MESA: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 ; CI-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p4) :: (load 2, addrspace 4) - ; CI-MESA: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; CI-MESA: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; CI-MESA: [[GEP:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C]](s64) ; CI-MESA: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p4) :: (load 2, addrspace 4) - ; CI-MESA: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; CI-MESA: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 ; CI-MESA: [[GEP1:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C1]](s64) ; CI-MESA: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p4) :: (load 2, addrspace 4) - ; CI-MESA: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; CI-MESA: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 6 ; CI-MESA: [[GEP2:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C2]](s64) ; CI-MESA: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p4) :: (load 2, addrspace 4) - ; CI-MESA: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; CI-MESA: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) + ; CI-MESA: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; CI-MESA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; CI-MESA: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C3]] + ; CI-MESA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; CI-MESA: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C3]] + ; CI-MESA: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-MESA: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C4]](s32) + ; CI-MESA: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; CI-MESA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; CI-MESA: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C3]] + ; CI-MESA: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; CI-MESA: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C3]] + ; CI-MESA: 
[[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) + ; CI-MESA: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; CI-MESA: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32) ; CI-MESA: $vgpr0_vgpr1 = COPY [[MV]](s64) ; GFX9-MESA-LABEL: name: test_load_constant_s64_align2 ; GFX9-MESA: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 ; GFX9-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p4) :: (load 2, addrspace 4) - ; GFX9-MESA: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; GFX9-MESA: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; GFX9-MESA: [[GEP:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C]](s64) ; GFX9-MESA: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p4) :: (load 2, addrspace 4) - ; GFX9-MESA: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; GFX9-MESA: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 ; GFX9-MESA: [[GEP1:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C1]](s64) ; GFX9-MESA: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p4) :: (load 2, addrspace 4) - ; GFX9-MESA: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; GFX9-MESA: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 6 ; GFX9-MESA: [[GEP2:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C2]](s64) ; GFX9-MESA: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p4) :: (load 2, addrspace 4) - ; GFX9-MESA: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; GFX9-MESA: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) + ; GFX9-MESA: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; GFX9-MESA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; GFX9-MESA: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C3]] + ; GFX9-MESA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; GFX9-MESA: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C3]] + ; GFX9-MESA: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9-MESA: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C4]](s32) + ; GFX9-MESA: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; GFX9-MESA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; GFX9-MESA: 
[[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C3]] + ; GFX9-MESA: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; GFX9-MESA: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C3]] + ; GFX9-MESA: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) + ; GFX9-MESA: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; GFX9-MESA: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32) ; GFX9-MESA: $vgpr0_vgpr1 = COPY [[MV]](s64) %0:_(p4) = COPY $vgpr0_vgpr1 %1:_(s64) = G_LOAD %0 :: (load 8, align 2, addrspace 4) @@ -868,7 +963,16 @@ body: | ; CI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C8]](s32) ; CI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) ; CI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; CI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) + ; CI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C10]](s32) + ; CI: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; CI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; CI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C10]](s32) + ; CI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; CI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) ; CI: $vgpr0_vgpr1 = COPY [[MV]](s64) ; VI-LABEL: name: test_load_constant_s64_align1 ; VI: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 @@ -920,7 +1024,16 @@ body: | ; VI: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C7]] ; VI: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C8]](s16) ; VI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; VI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) + ; VI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; VI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; VI: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT 
i32 16 + ; VI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C9]](s32) + ; VI: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; VI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; VI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; VI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C9]](s32) + ; VI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; VI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) ; VI: $vgpr0_vgpr1 = COPY [[MV]](s64) ; GFX9-LABEL: name: test_load_constant_s64_align1 ; GFX9: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 @@ -972,7 +1085,16 @@ body: | ; GFX9: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C7]] ; GFX9: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C8]](s16) ; GFX9: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; GFX9: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) + ; GFX9: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C9]](s32) + ; GFX9: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; GFX9: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; GFX9: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; GFX9: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C9]](s32) + ; GFX9: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; GFX9: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) ; GFX9: $vgpr0_vgpr1 = COPY [[MV]](s64) ; CI-MESA-LABEL: name: test_load_constant_s64_align1 ; CI-MESA: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 @@ -1032,7 +1154,16 @@ body: | ; CI-MESA: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C8]](s32) ; CI-MESA: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) ; CI-MESA: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; CI-MESA: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) + ; CI-MESA: [[ZEXT:%[0-9]+]]:_(s32) = 
G_ZEXT [[OR]](s16) + ; CI-MESA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI-MESA: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-MESA: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C10]](s32) + ; CI-MESA: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; CI-MESA: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; CI-MESA: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI-MESA: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C10]](s32) + ; CI-MESA: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; CI-MESA: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) ; CI-MESA: $vgpr0_vgpr1 = COPY [[MV]](s64) ; GFX9-MESA-LABEL: name: test_load_constant_s64_align1 ; GFX9-MESA: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 @@ -1084,7 +1215,16 @@ body: | ; GFX9-MESA: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C7]] ; GFX9-MESA: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C8]](s16) ; GFX9-MESA: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; GFX9-MESA: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) + ; GFX9-MESA: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9-MESA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9-MESA: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9-MESA: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C9]](s32) + ; GFX9-MESA: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; GFX9-MESA: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; GFX9-MESA: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; GFX9-MESA: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C9]](s32) + ; GFX9-MESA: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; GFX9-MESA: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) ; GFX9-MESA: $vgpr0_vgpr1 = COPY [[MV]](s64) %0:_(p4) = COPY $vgpr0_vgpr1 %1:_(s64) = G_LOAD %0 :: (load 8, align 1, addrspace 4) @@ -1193,132 +1333,202 @@ body: | ; CI-LABEL: name: test_load_constant_s96_align2 ; CI: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 ; 
CI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p4) :: (load 2, addrspace 4) - ; CI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; CI: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; CI: [[GEP:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C]](s64) ; CI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p4) :: (load 2, addrspace 4) - ; CI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; CI: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 ; CI: [[GEP1:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C1]](s64) ; CI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p4) :: (load 2, addrspace 4) - ; CI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; CI: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 6 ; CI: [[GEP2:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C2]](s64) ; CI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p4) :: (load 2, addrspace 4) - ; CI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) ; CI: [[C3:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 ; CI: [[GEP3:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C3]](s64) ; CI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p4) :: (load 2, addrspace 4) - ; CI: [[TRUNC4:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD4]](s32) ; CI: [[C4:%[0-9]+]]:_(s64) = G_CONSTANT i64 10 ; CI: [[GEP4:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C4]](s64) ; CI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p4) :: (load 2, addrspace 4) - ; CI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) - ; CI: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16), [[TRUNC4]](s16), [[TRUNC5]](s16) + ; CI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; CI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; CI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C5]] + ; CI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; CI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C5]] + ; CI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C6]](s32) + ; CI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; CI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; CI: 
[[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C5]] + ; CI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; CI: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C5]] + ; CI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C6]](s32) + ; CI: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; CI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD4]](s32) + ; CI: [[AND4:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C5]] + ; CI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) + ; CI: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C5]] + ; CI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[C6]](s32) + ; CI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[AND4]], [[SHL2]] + ; CI: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32), [[OR2]](s32) ; CI: $vgpr0_vgpr1_vgpr2 = COPY [[MV]](s96) ; VI-LABEL: name: test_load_constant_s96_align2 ; VI: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p4) :: (load 2, addrspace 4) - ; VI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; VI: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; VI: [[GEP:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C]](s64) ; VI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p4) :: (load 2, addrspace 4) - ; VI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; VI: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 ; VI: [[GEP1:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C1]](s64) ; VI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p4) :: (load 2, addrspace 4) - ; VI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; VI: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 6 ; VI: [[GEP2:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C2]](s64) ; VI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p4) :: (load 2, addrspace 4) - ; VI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) ; VI: [[C3:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 ; VI: [[GEP3:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C3]](s64) ; VI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p4) :: (load 2, addrspace 4) - ; VI: [[TRUNC4:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD4]](s32) ; VI: [[C4:%[0-9]+]]:_(s64) = G_CONSTANT 
i64 10 ; VI: [[GEP4:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C4]](s64) ; VI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p4) :: (load 2, addrspace 4) - ; VI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) - ; VI: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16), [[TRUNC4]](s16), [[TRUNC5]](s16) + ; VI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; VI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; VI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C5]] + ; VI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; VI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C5]] + ; VI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C6]](s32) + ; VI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; VI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; VI: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C5]] + ; VI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; VI: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C5]] + ; VI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C6]](s32) + ; VI: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; VI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD4]](s32) + ; VI: [[AND4:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C5]] + ; VI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) + ; VI: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C5]] + ; VI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[C6]](s32) + ; VI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[AND4]], [[SHL2]] + ; VI: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32), [[OR2]](s32) ; VI: $vgpr0_vgpr1_vgpr2 = COPY [[MV]](s96) ; GFX9-LABEL: name: test_load_constant_s96_align2 ; GFX9: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 ; GFX9: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p4) :: (load 2, addrspace 4) - ; GFX9: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; GFX9: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; GFX9: [[GEP:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C]](s64) ; GFX9: [[LOAD1:%[0-9]+]]:_(s32) = 
G_LOAD [[GEP]](p4) :: (load 2, addrspace 4) - ; GFX9: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; GFX9: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 ; GFX9: [[GEP1:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C1]](s64) ; GFX9: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p4) :: (load 2, addrspace 4) - ; GFX9: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; GFX9: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 6 ; GFX9: [[GEP2:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C2]](s64) ; GFX9: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p4) :: (load 2, addrspace 4) - ; GFX9: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) ; GFX9: [[C3:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 ; GFX9: [[GEP3:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C3]](s64) ; GFX9: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p4) :: (load 2, addrspace 4) - ; GFX9: [[TRUNC4:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD4]](s32) ; GFX9: [[C4:%[0-9]+]]:_(s64) = G_CONSTANT i64 10 ; GFX9: [[GEP4:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C4]](s64) ; GFX9: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p4) :: (load 2, addrspace 4) - ; GFX9: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) - ; GFX9: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16), [[TRUNC4]](s16), [[TRUNC5]](s16) + ; GFX9: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; GFX9: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; GFX9: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C5]] + ; GFX9: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; GFX9: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C5]] + ; GFX9: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C6]](s32) + ; GFX9: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; GFX9: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; GFX9: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C5]] + ; GFX9: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; GFX9: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C5]] + ; GFX9: [[SHL1:%[0-9]+]]:_(s32) = G_SHL 
[[AND3]], [[C6]](s32) + ; GFX9: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; GFX9: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD4]](s32) + ; GFX9: [[AND4:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C5]] + ; GFX9: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) + ; GFX9: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C5]] + ; GFX9: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[C6]](s32) + ; GFX9: [[OR2:%[0-9]+]]:_(s32) = G_OR [[AND4]], [[SHL2]] + ; GFX9: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32), [[OR2]](s32) ; GFX9: $vgpr0_vgpr1_vgpr2 = COPY [[MV]](s96) ; CI-MESA-LABEL: name: test_load_constant_s96_align2 ; CI-MESA: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 ; CI-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p4) :: (load 2, addrspace 4) - ; CI-MESA: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; CI-MESA: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; CI-MESA: [[GEP:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C]](s64) ; CI-MESA: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p4) :: (load 2, addrspace 4) - ; CI-MESA: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; CI-MESA: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 ; CI-MESA: [[GEP1:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C1]](s64) ; CI-MESA: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p4) :: (load 2, addrspace 4) - ; CI-MESA: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; CI-MESA: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 6 ; CI-MESA: [[GEP2:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C2]](s64) ; CI-MESA: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p4) :: (load 2, addrspace 4) - ; CI-MESA: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) ; CI-MESA: [[C3:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 ; CI-MESA: [[GEP3:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C3]](s64) ; CI-MESA: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p4) :: (load 2, addrspace 4) - ; CI-MESA: [[TRUNC4:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD4]](s32) ; CI-MESA: [[C4:%[0-9]+]]:_(s64) = G_CONSTANT i64 10 ; CI-MESA: [[GEP4:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C4]](s64) ; CI-MESA: 
[[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p4) :: (load 2, addrspace 4) - ; CI-MESA: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) - ; CI-MESA: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16), [[TRUNC4]](s16), [[TRUNC5]](s16) + ; CI-MESA: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; CI-MESA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; CI-MESA: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C5]] + ; CI-MESA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; CI-MESA: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C5]] + ; CI-MESA: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-MESA: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C6]](s32) + ; CI-MESA: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; CI-MESA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; CI-MESA: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C5]] + ; CI-MESA: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; CI-MESA: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C5]] + ; CI-MESA: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C6]](s32) + ; CI-MESA: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; CI-MESA: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD4]](s32) + ; CI-MESA: [[AND4:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C5]] + ; CI-MESA: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) + ; CI-MESA: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C5]] + ; CI-MESA: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[C6]](s32) + ; CI-MESA: [[OR2:%[0-9]+]]:_(s32) = G_OR [[AND4]], [[SHL2]] + ; CI-MESA: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32), [[OR2]](s32) ; CI-MESA: $vgpr0_vgpr1_vgpr2 = COPY [[MV]](s96) ; GFX9-MESA-LABEL: name: test_load_constant_s96_align2 ; GFX9-MESA: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 ; GFX9-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p4) :: (load 2, addrspace 4) - ; GFX9-MESA: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; GFX9-MESA: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; GFX9-MESA: 
[[GEP:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C]](s64) ; GFX9-MESA: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p4) :: (load 2, addrspace 4) - ; GFX9-MESA: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; GFX9-MESA: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 ; GFX9-MESA: [[GEP1:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C1]](s64) ; GFX9-MESA: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p4) :: (load 2, addrspace 4) - ; GFX9-MESA: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; GFX9-MESA: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 6 ; GFX9-MESA: [[GEP2:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C2]](s64) ; GFX9-MESA: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p4) :: (load 2, addrspace 4) - ; GFX9-MESA: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) ; GFX9-MESA: [[C3:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 ; GFX9-MESA: [[GEP3:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C3]](s64) ; GFX9-MESA: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p4) :: (load 2, addrspace 4) - ; GFX9-MESA: [[TRUNC4:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD4]](s32) ; GFX9-MESA: [[C4:%[0-9]+]]:_(s64) = G_CONSTANT i64 10 ; GFX9-MESA: [[GEP4:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C4]](s64) ; GFX9-MESA: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p4) :: (load 2, addrspace 4) - ; GFX9-MESA: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) - ; GFX9-MESA: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16), [[TRUNC4]](s16), [[TRUNC5]](s16) + ; GFX9-MESA: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; GFX9-MESA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; GFX9-MESA: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C5]] + ; GFX9-MESA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; GFX9-MESA: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C5]] + ; GFX9-MESA: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9-MESA: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C6]](s32) + ; GFX9-MESA: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; GFX9-MESA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) 
+ ; GFX9-MESA: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C5]] + ; GFX9-MESA: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; GFX9-MESA: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C5]] + ; GFX9-MESA: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C6]](s32) + ; GFX9-MESA: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; GFX9-MESA: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD4]](s32) + ; GFX9-MESA: [[AND4:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C5]] + ; GFX9-MESA: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) + ; GFX9-MESA: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C5]] + ; GFX9-MESA: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[C6]](s32) + ; GFX9-MESA: [[OR2:%[0-9]+]]:_(s32) = G_OR [[AND4]], [[SHL2]] + ; GFX9-MESA: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32), [[OR2]](s32) ; GFX9-MESA: $vgpr0_vgpr1_vgpr2 = COPY [[MV]](s96) %0:_(p4) = COPY $vgpr0_vgpr1 %1:_(s96) = G_LOAD %0 :: (load 12, align 2, addrspace 4) @@ -1417,7 +1627,20 @@ body: | ; CI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[C12]](s32) ; CI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL5]](s32) ; CI: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] - ; CI: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16), [[OR4]](s16), [[OR5]](s16) + ; CI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI: [[C14:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C14]](s32) + ; CI: [[OR6:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL6]] + ; CI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; CI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C14]](s32) + ; CI: [[OR7:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL7]] + ; CI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; CI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR5]](s16) + ; CI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C14]](s32) + ; CI: [[OR8:%[0-9]+]]:_(s32) = 
G_OR [[ZEXT4]], [[SHL8]] + ; CI: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR6]](s32), [[OR7]](s32), [[OR8]](s32) ; CI: $vgpr0_vgpr1_vgpr2 = COPY [[MV]](s96) ; VI-LABEL: name: test_load_constant_s96_align1 ; VI: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 @@ -1493,7 +1716,20 @@ body: | ; VI: [[AND11:%[0-9]+]]:_(s16) = G_AND [[TRUNC11]], [[C11]] ; VI: [[SHL5:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C12]](s16) ; VI: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL5]] - ; VI: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16), [[OR4]](s16), [[OR5]](s16) + ; VI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; VI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; VI: [[C13:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C13]](s32) + ; VI: [[OR6:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL6]] + ; VI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; VI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; VI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C13]](s32) + ; VI: [[OR7:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL7]] + ; VI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; VI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR5]](s16) + ; VI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C13]](s32) + ; VI: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL8]] + ; VI: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR6]](s32), [[OR7]](s32), [[OR8]](s32) ; VI: $vgpr0_vgpr1_vgpr2 = COPY [[MV]](s96) ; GFX9-LABEL: name: test_load_constant_s96_align1 ; GFX9: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 @@ -1569,7 +1805,20 @@ body: | ; GFX9: [[AND11:%[0-9]+]]:_(s16) = G_AND [[TRUNC11]], [[C11]] ; GFX9: [[SHL5:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C12]](s16) ; GFX9: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL5]] - ; GFX9: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16), [[OR4]](s16), [[OR5]](s16) + ; GFX9: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9: 
[[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9: [[C13:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C13]](s32) + ; GFX9: [[OR6:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL6]] + ; GFX9: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; GFX9: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; GFX9: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C13]](s32) + ; GFX9: [[OR7:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL7]] + ; GFX9: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; GFX9: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR5]](s16) + ; GFX9: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C13]](s32) + ; GFX9: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL8]] + ; GFX9: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR6]](s32), [[OR7]](s32), [[OR8]](s32) ; GFX9: $vgpr0_vgpr1_vgpr2 = COPY [[MV]](s96) ; CI-MESA-LABEL: name: test_load_constant_s96_align1 ; CI-MESA: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 @@ -1657,7 +1906,20 @@ body: | ; CI-MESA: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[C12]](s32) ; CI-MESA: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL5]](s32) ; CI-MESA: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] - ; CI-MESA: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16), [[OR4]](s16), [[OR5]](s16) + ; CI-MESA: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI-MESA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI-MESA: [[C14:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-MESA: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C14]](s32) + ; CI-MESA: [[OR6:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL6]] + ; CI-MESA: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; CI-MESA: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI-MESA: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C14]](s32) + ; CI-MESA: [[OR7:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL7]] + ; CI-MESA: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; CI-MESA: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR5]](s16) + ; CI-MESA: 
[[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C14]](s32) + ; CI-MESA: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL8]] + ; CI-MESA: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR6]](s32), [[OR7]](s32), [[OR8]](s32) ; CI-MESA: $vgpr0_vgpr1_vgpr2 = COPY [[MV]](s96) ; GFX9-MESA-LABEL: name: test_load_constant_s96_align1 ; GFX9-MESA: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 @@ -1733,7 +1995,20 @@ body: | ; GFX9-MESA: [[AND11:%[0-9]+]]:_(s16) = G_AND [[TRUNC11]], [[C11]] ; GFX9-MESA: [[SHL5:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C12]](s16) ; GFX9-MESA: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL5]] - ; GFX9-MESA: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16), [[OR4]](s16), [[OR5]](s16) + ; GFX9-MESA: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9-MESA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9-MESA: [[C13:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9-MESA: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C13]](s32) + ; GFX9-MESA: [[OR6:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL6]] + ; GFX9-MESA: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; GFX9-MESA: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; GFX9-MESA: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C13]](s32) + ; GFX9-MESA: [[OR7:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL7]] + ; GFX9-MESA: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; GFX9-MESA: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR5]](s16) + ; GFX9-MESA: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C13]](s32) + ; GFX9-MESA: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL8]] + ; GFX9-MESA: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR6]](s32), [[OR7]](s32), [[OR8]](s32) ; GFX9-MESA: $vgpr0_vgpr1_vgpr2 = COPY [[MV]](s96) %0:_(p4) = COPY $vgpr0_vgpr1 %1:_(s96) = G_LOAD %0 :: (load 12, align 1, addrspace 4) @@ -2007,7 +2282,24 @@ body: | ; CI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[C16]](s32) ; CI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) ; CI: [[OR7:%[0-9]+]]:_(s16) = G_OR 
[[AND14]], [[TRUNC15]] - ; CI: [[MV:%[0-9]+]]:_(s128) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16), [[OR4]](s16), [[OR5]](s16), [[OR6]](s16), [[OR7]](s16) + ; CI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI: [[C18:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C18]](s32) + ; CI: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL8]] + ; CI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; CI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C18]](s32) + ; CI: [[OR9:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL9]] + ; CI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; CI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR5]](s16) + ; CI: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C18]](s32) + ; CI: [[OR10:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL10]] + ; CI: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; CI: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; CI: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C18]](s32) + ; CI: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; CI: [[MV:%[0-9]+]]:_(s128) = G_MERGE_VALUES [[OR8]](s32), [[OR9]](s32), [[OR10]](s32), [[OR11]](s32) ; CI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[MV]](s128) ; VI-LABEL: name: test_load_constant_s128_align1 ; VI: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 @@ -2107,7 +2399,24 @@ body: | ; VI: [[AND15:%[0-9]+]]:_(s16) = G_AND [[TRUNC15]], [[C15]] ; VI: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C16]](s16) ; VI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL7]] - ; VI: [[MV:%[0-9]+]]:_(s128) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16), [[OR4]](s16), [[OR5]](s16), [[OR6]](s16), [[OR7]](s16) + ; VI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; VI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; VI: [[C17:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], 
[[C17]](s32) + ; VI: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL8]] + ; VI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; VI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; VI: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C17]](s32) + ; VI: [[OR9:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL9]] + ; VI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; VI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR5]](s16) + ; VI: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C17]](s32) + ; VI: [[OR10:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL10]] + ; VI: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; VI: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; VI: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C17]](s32) + ; VI: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; VI: [[MV:%[0-9]+]]:_(s128) = G_MERGE_VALUES [[OR8]](s32), [[OR9]](s32), [[OR10]](s32), [[OR11]](s32) ; VI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[MV]](s128) ; GFX9-LABEL: name: test_load_constant_s128_align1 ; GFX9: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 @@ -2207,7 +2516,24 @@ body: | ; GFX9: [[AND15:%[0-9]+]]:_(s16) = G_AND [[TRUNC15]], [[C15]] ; GFX9: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C16]](s16) ; GFX9: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL7]] - ; GFX9: [[MV:%[0-9]+]]:_(s128) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16), [[OR4]](s16), [[OR5]](s16), [[OR6]](s16), [[OR7]](s16) + ; GFX9: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9: [[C17:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C17]](s32) + ; GFX9: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL8]] + ; GFX9: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; GFX9: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; GFX9: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C17]](s32) + ; GFX9: [[OR9:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL9]] + ; GFX9: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; 
GFX9: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR5]](s16) + ; GFX9: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C17]](s32) + ; GFX9: [[OR10:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL10]] + ; GFX9: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; GFX9: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; GFX9: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C17]](s32) + ; GFX9: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; GFX9: [[MV:%[0-9]+]]:_(s128) = G_MERGE_VALUES [[OR8]](s32), [[OR9]](s32), [[OR10]](s32), [[OR11]](s32) ; GFX9: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[MV]](s128) ; CI-MESA-LABEL: name: test_load_constant_s128_align1 ; CI-MESA: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 @@ -2323,7 +2649,24 @@ body: | ; CI-MESA: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[C16]](s32) ; CI-MESA: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) ; CI-MESA: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] - ; CI-MESA: [[MV:%[0-9]+]]:_(s128) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16), [[OR4]](s16), [[OR5]](s16), [[OR6]](s16), [[OR7]](s16) + ; CI-MESA: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI-MESA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI-MESA: [[C18:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-MESA: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C18]](s32) + ; CI-MESA: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL8]] + ; CI-MESA: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; CI-MESA: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI-MESA: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C18]](s32) + ; CI-MESA: [[OR9:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL9]] + ; CI-MESA: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; CI-MESA: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR5]](s16) + ; CI-MESA: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C18]](s32) + ; CI-MESA: [[OR10:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL10]] + ; CI-MESA: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; CI-MESA: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT 
[[OR7]](s16) + ; CI-MESA: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C18]](s32) + ; CI-MESA: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; CI-MESA: [[MV:%[0-9]+]]:_(s128) = G_MERGE_VALUES [[OR8]](s32), [[OR9]](s32), [[OR10]](s32), [[OR11]](s32) ; CI-MESA: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[MV]](s128) ; GFX9-MESA-LABEL: name: test_load_constant_s128_align1 ; GFX9-MESA: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 @@ -2423,7 +2766,24 @@ body: | ; GFX9-MESA: [[AND15:%[0-9]+]]:_(s16) = G_AND [[TRUNC15]], [[C15]] ; GFX9-MESA: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C16]](s16) ; GFX9-MESA: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL7]] - ; GFX9-MESA: [[MV:%[0-9]+]]:_(s128) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16), [[OR4]](s16), [[OR5]](s16), [[OR6]](s16), [[OR7]](s16) + ; GFX9-MESA: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9-MESA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9-MESA: [[C17:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9-MESA: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C17]](s32) + ; GFX9-MESA: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL8]] + ; GFX9-MESA: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; GFX9-MESA: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; GFX9-MESA: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C17]](s32) + ; GFX9-MESA: [[OR9:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL9]] + ; GFX9-MESA: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; GFX9-MESA: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR5]](s16) + ; GFX9-MESA: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C17]](s32) + ; GFX9-MESA: [[OR10:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL10]] + ; GFX9-MESA: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; GFX9-MESA: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; GFX9-MESA: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C17]](s32) + ; GFX9-MESA: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; GFX9-MESA: [[MV:%[0-9]+]]:_(s128) = G_MERGE_VALUES [[OR8]](s32), 
[[OR9]](s32), [[OR10]](s32), [[OR11]](s32) ; GFX9-MESA: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[MV]](s128) %0:_(p4) = COPY $vgpr0_vgpr1 %1:_(s128) = G_LOAD %0 :: (load 16, align 1, addrspace 4) @@ -2587,7 +2947,16 @@ body: | ; CI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C8]](s32) ; CI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) ; CI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; CI: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) + ; CI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C10]](s32) + ; CI: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; CI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; CI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C10]](s32) + ; CI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; CI: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) ; CI: $vgpr0_vgpr1 = COPY [[MV]](p1) ; VI-LABEL: name: test_load_constant_p1_align1 ; VI: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 @@ -2639,7 +3008,16 @@ body: | ; VI: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C7]] ; VI: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C8]](s16) ; VI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; VI: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) + ; VI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; VI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; VI: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C9]](s32) + ; VI: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; VI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; VI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; VI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C9]](s32) + ; VI: [[OR5:%[0-9]+]]:_(s32) = 
G_OR [[ZEXT2]], [[SHL5]] + ; VI: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) ; VI: $vgpr0_vgpr1 = COPY [[MV]](p1) ; GFX9-LABEL: name: test_load_constant_p1_align1 ; GFX9: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 @@ -2691,7 +3069,16 @@ body: | ; GFX9: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C7]] ; GFX9: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C8]](s16) ; GFX9: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; GFX9: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) + ; GFX9: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C9]](s32) + ; GFX9: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; GFX9: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; GFX9: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; GFX9: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C9]](s32) + ; GFX9: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; GFX9: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) ; GFX9: $vgpr0_vgpr1 = COPY [[MV]](p1) ; CI-MESA-LABEL: name: test_load_constant_p1_align1 ; CI-MESA: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 @@ -2751,7 +3138,16 @@ body: | ; CI-MESA: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C8]](s32) ; CI-MESA: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) ; CI-MESA: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; CI-MESA: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) + ; CI-MESA: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI-MESA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI-MESA: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-MESA: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C10]](s32) + ; CI-MESA: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; CI-MESA: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; CI-MESA: 
[[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI-MESA: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C10]](s32) + ; CI-MESA: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; CI-MESA: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) ; CI-MESA: $vgpr0_vgpr1 = COPY [[MV]](p1) ; GFX9-MESA-LABEL: name: test_load_constant_p1_align1 ; GFX9-MESA: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 @@ -2803,7 +3199,16 @@ body: | ; GFX9-MESA: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C7]] ; GFX9-MESA: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C8]](s16) ; GFX9-MESA: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; GFX9-MESA: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) + ; GFX9-MESA: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9-MESA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9-MESA: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9-MESA: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C9]](s32) + ; GFX9-MESA: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; GFX9-MESA: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; GFX9-MESA: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; GFX9-MESA: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C9]](s32) + ; GFX9-MESA: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; GFX9-MESA: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) ; GFX9-MESA: $vgpr0_vgpr1 = COPY [[MV]](p1) %0:_(p4) = COPY $vgpr0_vgpr1 %1:_(p1) = G_LOAD %0 :: (load 8, align 1, addrspace 4) @@ -2912,92 +3317,142 @@ body: | ; CI-LABEL: name: test_load_constant_p4_align2 ; CI: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 ; CI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p4) :: (load 2, addrspace 4) - ; CI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; CI: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; CI: [[GEP:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C]](s64) ; CI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p4) :: (load 2, addrspace 4) - ; CI: [[TRUNC1:%[0-9]+]]:_(s16) = 
G_TRUNC [[LOAD1]](s32) ; CI: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 ; CI: [[GEP1:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C1]](s64) ; CI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p4) :: (load 2, addrspace 4) - ; CI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; CI: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 6 ; CI: [[GEP2:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C2]](s64) ; CI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p4) :: (load 2, addrspace 4) - ; CI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; CI: [[MV:%[0-9]+]]:_(p4) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) + ; CI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; CI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; CI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C3]] + ; CI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; CI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C3]] + ; CI: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C4]](s32) + ; CI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; CI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; CI: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C3]] + ; CI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; CI: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C3]] + ; CI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) + ; CI: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; CI: [[MV:%[0-9]+]]:_(p4) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32) ; CI: $vgpr0_vgpr1 = COPY [[MV]](p4) ; VI-LABEL: name: test_load_constant_p4_align2 ; VI: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p4) :: (load 2, addrspace 4) - ; VI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; VI: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; VI: [[GEP:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C]](s64) ; VI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p4) :: (load 2, addrspace 4) - ; VI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; VI: 
[[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 ; VI: [[GEP1:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C1]](s64) ; VI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p4) :: (load 2, addrspace 4) - ; VI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; VI: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 6 ; VI: [[GEP2:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C2]](s64) ; VI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p4) :: (load 2, addrspace 4) - ; VI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; VI: [[MV:%[0-9]+]]:_(p4) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) + ; VI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; VI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; VI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C3]] + ; VI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; VI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C3]] + ; VI: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C4]](s32) + ; VI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; VI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; VI: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C3]] + ; VI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; VI: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C3]] + ; VI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) + ; VI: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; VI: [[MV:%[0-9]+]]:_(p4) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32) ; VI: $vgpr0_vgpr1 = COPY [[MV]](p4) ; GFX9-LABEL: name: test_load_constant_p4_align2 ; GFX9: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 ; GFX9: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p4) :: (load 2, addrspace 4) - ; GFX9: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; GFX9: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; GFX9: [[GEP:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C]](s64) ; GFX9: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p4) :: (load 2, addrspace 4) - ; GFX9: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; GFX9: 
[[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 ; GFX9: [[GEP1:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C1]](s64) ; GFX9: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p4) :: (load 2, addrspace 4) - ; GFX9: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; GFX9: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 6 ; GFX9: [[GEP2:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C2]](s64) ; GFX9: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p4) :: (load 2, addrspace 4) - ; GFX9: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; GFX9: [[MV:%[0-9]+]]:_(p4) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) + ; GFX9: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; GFX9: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; GFX9: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C3]] + ; GFX9: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; GFX9: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C3]] + ; GFX9: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C4]](s32) + ; GFX9: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; GFX9: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; GFX9: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C3]] + ; GFX9: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; GFX9: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C3]] + ; GFX9: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) + ; GFX9: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; GFX9: [[MV:%[0-9]+]]:_(p4) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32) ; GFX9: $vgpr0_vgpr1 = COPY [[MV]](p4) ; CI-MESA-LABEL: name: test_load_constant_p4_align2 ; CI-MESA: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 ; CI-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p4) :: (load 2, addrspace 4) - ; CI-MESA: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; CI-MESA: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; CI-MESA: [[GEP:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C]](s64) ; CI-MESA: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p4) :: (load 2, addrspace 4) - ; CI-MESA: 
[[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; CI-MESA: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 ; CI-MESA: [[GEP1:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C1]](s64) ; CI-MESA: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p4) :: (load 2, addrspace 4) - ; CI-MESA: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; CI-MESA: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 6 ; CI-MESA: [[GEP2:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C2]](s64) ; CI-MESA: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p4) :: (load 2, addrspace 4) - ; CI-MESA: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; CI-MESA: [[MV:%[0-9]+]]:_(p4) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) + ; CI-MESA: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; CI-MESA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; CI-MESA: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C3]] + ; CI-MESA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; CI-MESA: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C3]] + ; CI-MESA: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-MESA: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C4]](s32) + ; CI-MESA: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; CI-MESA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; CI-MESA: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C3]] + ; CI-MESA: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; CI-MESA: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C3]] + ; CI-MESA: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) + ; CI-MESA: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; CI-MESA: [[MV:%[0-9]+]]:_(p4) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32) ; CI-MESA: $vgpr0_vgpr1 = COPY [[MV]](p4) ; GFX9-MESA-LABEL: name: test_load_constant_p4_align2 ; GFX9-MESA: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 ; GFX9-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p4) :: (load 2, addrspace 4) - ; GFX9-MESA: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; GFX9-MESA: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; GFX9-MESA: 
[[GEP:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C]](s64) ; GFX9-MESA: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p4) :: (load 2, addrspace 4) - ; GFX9-MESA: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; GFX9-MESA: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 ; GFX9-MESA: [[GEP1:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C1]](s64) ; GFX9-MESA: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p4) :: (load 2, addrspace 4) - ; GFX9-MESA: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; GFX9-MESA: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 6 ; GFX9-MESA: [[GEP2:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C2]](s64) ; GFX9-MESA: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p4) :: (load 2, addrspace 4) - ; GFX9-MESA: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; GFX9-MESA: [[MV:%[0-9]+]]:_(p4) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) + ; GFX9-MESA: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; GFX9-MESA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; GFX9-MESA: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C3]] + ; GFX9-MESA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; GFX9-MESA: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C3]] + ; GFX9-MESA: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9-MESA: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C4]](s32) + ; GFX9-MESA: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; GFX9-MESA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; GFX9-MESA: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C3]] + ; GFX9-MESA: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; GFX9-MESA: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C3]] + ; GFX9-MESA: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) + ; GFX9-MESA: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; GFX9-MESA: [[MV:%[0-9]+]]:_(p4) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32) ; GFX9-MESA: $vgpr0_vgpr1 = COPY [[MV]](p4) %0:_(p4) = COPY $vgpr0_vgpr1 %1:_(p4) = G_LOAD %0 :: (load 8, align 2, addrspace 4) @@ -3068,7 +3523,16 @@ body: | ; CI: 
[[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C8]](s32) ; CI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) ; CI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; CI: [[MV:%[0-9]+]]:_(p4) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) + ; CI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C10]](s32) + ; CI: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; CI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; CI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C10]](s32) + ; CI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; CI: [[MV:%[0-9]+]]:_(p4) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) ; CI: $vgpr0_vgpr1 = COPY [[MV]](p4) ; VI-LABEL: name: test_load_constant_p4_align1 ; VI: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 @@ -3120,7 +3584,16 @@ body: | ; VI: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C7]] ; VI: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C8]](s16) ; VI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; VI: [[MV:%[0-9]+]]:_(p4) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) + ; VI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; VI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; VI: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C9]](s32) + ; VI: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; VI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; VI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; VI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C9]](s32) + ; VI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; VI: [[MV:%[0-9]+]]:_(p4) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) ; VI: $vgpr0_vgpr1 = COPY [[MV]](p4) ; GFX9-LABEL: name: test_load_constant_p4_align1 ; GFX9: [[COPY:%[0-9]+]]:_(p4) = COPY 
$vgpr0_vgpr1 @@ -3172,7 +3645,16 @@ body: | ; GFX9: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C7]] ; GFX9: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C8]](s16) ; GFX9: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; GFX9: [[MV:%[0-9]+]]:_(p4) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) + ; GFX9: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C9]](s32) + ; GFX9: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; GFX9: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; GFX9: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; GFX9: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C9]](s32) + ; GFX9: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; GFX9: [[MV:%[0-9]+]]:_(p4) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) ; GFX9: $vgpr0_vgpr1 = COPY [[MV]](p4) ; CI-MESA-LABEL: name: test_load_constant_p4_align1 ; CI-MESA: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 @@ -3232,7 +3714,16 @@ body: | ; CI-MESA: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C8]](s32) ; CI-MESA: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) ; CI-MESA: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; CI-MESA: [[MV:%[0-9]+]]:_(p4) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) + ; CI-MESA: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI-MESA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI-MESA: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-MESA: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C10]](s32) + ; CI-MESA: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; CI-MESA: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; CI-MESA: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI-MESA: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C10]](s32) + ; CI-MESA: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; CI-MESA: [[MV:%[0-9]+]]:_(p4) = 
G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) ; CI-MESA: $vgpr0_vgpr1 = COPY [[MV]](p4) ; GFX9-MESA-LABEL: name: test_load_constant_p4_align1 ; GFX9-MESA: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 @@ -3284,7 +3775,16 @@ body: | ; GFX9-MESA: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C7]] ; GFX9-MESA: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C8]](s16) ; GFX9-MESA: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; GFX9-MESA: [[MV:%[0-9]+]]:_(p4) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) + ; GFX9-MESA: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9-MESA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9-MESA: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9-MESA: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C9]](s32) + ; GFX9-MESA: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; GFX9-MESA: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; GFX9-MESA: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; GFX9-MESA: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C9]](s32) + ; GFX9-MESA: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; GFX9-MESA: [[MV:%[0-9]+]]:_(p4) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) ; GFX9-MESA: $vgpr0_vgpr1 = COPY [[MV]](p4) %0:_(p4) = COPY $vgpr0_vgpr1 %1:_(p4) = G_LOAD %0 :: (load 8, align 1, addrspace 4) @@ -3331,53 +3831,83 @@ body: | ; CI-LABEL: name: test_load_constant_p5_align2 ; CI: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 ; CI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p4) :: (load 2, addrspace 4) - ; CI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; CI: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; CI: [[GEP:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C]](s64) ; CI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p4) :: (load 2, addrspace 4) - ; CI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; CI: [[MV:%[0-9]+]]:_(p5) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; CI: $vgpr0 = COPY [[MV]](p5) + ; CI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; CI: 
[[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; CI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; CI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; CI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; CI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; CI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; CI: [[INTTOPTR:%[0-9]+]]:_(p5) = G_INTTOPTR [[OR]](s32) + ; CI: $vgpr0 = COPY [[INTTOPTR]](p5) ; VI-LABEL: name: test_load_constant_p5_align2 ; VI: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p4) :: (load 2, addrspace 4) - ; VI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; VI: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; VI: [[GEP:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C]](s64) ; VI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p4) :: (load 2, addrspace 4) - ; VI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; VI: [[MV:%[0-9]+]]:_(p5) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; VI: $vgpr0 = COPY [[MV]](p5) + ; VI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; VI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; VI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; VI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; VI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; VI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; VI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; VI: [[INTTOPTR:%[0-9]+]]:_(p5) = G_INTTOPTR [[OR]](s32) + ; VI: $vgpr0 = COPY [[INTTOPTR]](p5) ; GFX9-LABEL: name: test_load_constant_p5_align2 ; GFX9: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 ; GFX9: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p4) :: (load 2, addrspace 4) - ; GFX9: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; GFX9: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; GFX9: [[GEP:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C]](s64) ; GFX9: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p4) 
:: (load 2, addrspace 4) - ; GFX9: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; GFX9: [[MV:%[0-9]+]]:_(p5) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; GFX9: $vgpr0 = COPY [[MV]](p5) + ; GFX9: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; GFX9: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; GFX9: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; GFX9: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; GFX9: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; GFX9: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; GFX9: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; GFX9: [[INTTOPTR:%[0-9]+]]:_(p5) = G_INTTOPTR [[OR]](s32) + ; GFX9: $vgpr0 = COPY [[INTTOPTR]](p5) ; CI-MESA-LABEL: name: test_load_constant_p5_align2 ; CI-MESA: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 ; CI-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p4) :: (load 2, addrspace 4) - ; CI-MESA: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; CI-MESA: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; CI-MESA: [[GEP:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C]](s64) ; CI-MESA: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p4) :: (load 2, addrspace 4) - ; CI-MESA: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; CI-MESA: [[MV:%[0-9]+]]:_(p5) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; CI-MESA: $vgpr0 = COPY [[MV]](p5) + ; CI-MESA: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; CI-MESA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; CI-MESA: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; CI-MESA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; CI-MESA: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; CI-MESA: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-MESA: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; CI-MESA: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; CI-MESA: [[INTTOPTR:%[0-9]+]]:_(p5) = G_INTTOPTR [[OR]](s32) + ; CI-MESA: $vgpr0 = COPY [[INTTOPTR]](p5) ; 
GFX9-MESA-LABEL: name: test_load_constant_p5_align2 ; GFX9-MESA: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 ; GFX9-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p4) :: (load 2, addrspace 4) - ; GFX9-MESA: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; GFX9-MESA: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; GFX9-MESA: [[GEP:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C]](s64) ; GFX9-MESA: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p4) :: (load 2, addrspace 4) - ; GFX9-MESA: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; GFX9-MESA: [[MV:%[0-9]+]]:_(p5) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; GFX9-MESA: $vgpr0 = COPY [[MV]](p5) + ; GFX9-MESA: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; GFX9-MESA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; GFX9-MESA: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; GFX9-MESA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; GFX9-MESA: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; GFX9-MESA: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9-MESA: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; GFX9-MESA: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; GFX9-MESA: [[INTTOPTR:%[0-9]+]]:_(p5) = G_INTTOPTR [[OR]](s32) + ; GFX9-MESA: $vgpr0 = COPY [[INTTOPTR]](p5) %0:_(p4) = COPY $vgpr0_vgpr1 %1:_(p5) = G_LOAD %0 :: (load 4, align 2, addrspace 4) $vgpr0 = COPY %1 @@ -3419,8 +3949,13 @@ body: | ; CI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) ; CI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[SHL1]](s32) ; CI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[TRUNC3]] - ; CI: [[MV:%[0-9]+]]:_(p5) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; CI: $vgpr0 = COPY [[MV]](p5) + ; CI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C6]](s32) + ; CI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; CI: [[INTTOPTR:%[0-9]+]]:_(p5) = G_INTTOPTR 
[[OR2]](s32) + ; CI: $vgpr0 = COPY [[INTTOPTR]](p5) ; VI-LABEL: name: test_load_constant_p5_align1 ; VI: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p4) :: (load 1, addrspace 4) @@ -3447,8 +3982,13 @@ body: | ; VI: [[AND3:%[0-9]+]]:_(s16) = G_AND [[TRUNC3]], [[C3]] ; VI: [[SHL1:%[0-9]+]]:_(s16) = G_SHL [[AND3]], [[C4]](s16) ; VI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[SHL1]] - ; VI: [[MV:%[0-9]+]]:_(p5) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; VI: $vgpr0 = COPY [[MV]](p5) + ; VI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; VI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; VI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C5]](s32) + ; VI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; VI: [[INTTOPTR:%[0-9]+]]:_(p5) = G_INTTOPTR [[OR2]](s32) + ; VI: $vgpr0 = COPY [[INTTOPTR]](p5) ; GFX9-LABEL: name: test_load_constant_p5_align1 ; GFX9: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 ; GFX9: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p4) :: (load 1, addrspace 4) @@ -3475,8 +4015,13 @@ body: | ; GFX9: [[AND3:%[0-9]+]]:_(s16) = G_AND [[TRUNC3]], [[C3]] ; GFX9: [[SHL1:%[0-9]+]]:_(s16) = G_SHL [[AND3]], [[C4]](s16) ; GFX9: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[SHL1]] - ; GFX9: [[MV:%[0-9]+]]:_(p5) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; GFX9: $vgpr0 = COPY [[MV]](p5) + ; GFX9: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C5]](s32) + ; GFX9: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; GFX9: [[INTTOPTR:%[0-9]+]]:_(p5) = G_INTTOPTR [[OR2]](s32) + ; GFX9: $vgpr0 = COPY [[INTTOPTR]](p5) ; CI-MESA-LABEL: name: test_load_constant_p5_align1 ; CI-MESA: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 ; CI-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p4) :: (load 1, addrspace 4) @@ -3507,8 
+4052,13 @@ body: | ; CI-MESA: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) ; CI-MESA: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[SHL1]](s32) ; CI-MESA: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[TRUNC3]] - ; CI-MESA: [[MV:%[0-9]+]]:_(p5) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; CI-MESA: $vgpr0 = COPY [[MV]](p5) + ; CI-MESA: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI-MESA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI-MESA: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-MESA: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C6]](s32) + ; CI-MESA: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; CI-MESA: [[INTTOPTR:%[0-9]+]]:_(p5) = G_INTTOPTR [[OR2]](s32) + ; CI-MESA: $vgpr0 = COPY [[INTTOPTR]](p5) ; GFX9-MESA-LABEL: name: test_load_constant_p5_align1 ; GFX9-MESA: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 ; GFX9-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p4) :: (load 1, addrspace 4) @@ -3535,8 +4085,13 @@ body: | ; GFX9-MESA: [[AND3:%[0-9]+]]:_(s16) = G_AND [[TRUNC3]], [[C3]] ; GFX9-MESA: [[SHL1:%[0-9]+]]:_(s16) = G_SHL [[AND3]], [[C4]](s16) ; GFX9-MESA: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[SHL1]] - ; GFX9-MESA: [[MV:%[0-9]+]]:_(p5) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; GFX9-MESA: $vgpr0 = COPY [[MV]](p5) + ; GFX9-MESA: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9-MESA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9-MESA: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9-MESA: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C5]](s32) + ; GFX9-MESA: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; GFX9-MESA: [[INTTOPTR:%[0-9]+]]:_(p5) = G_INTTOPTR [[OR2]](s32) + ; GFX9-MESA: $vgpr0 = COPY [[INTTOPTR]](p5) %0:_(p4) = COPY $vgpr0_vgpr1 %1:_(p5) = G_LOAD %0 :: (load 4, align 1, addrspace 4) $vgpr0 = COPY %1 @@ -6390,166 +6945,256 @@ body: | ; CI-LABEL: name: test_load_constant_v2s64_align2 ; CI: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 ; CI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p4) :: (load 2, 
addrspace 4) - ; CI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; CI: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; CI: [[GEP:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C]](s64) ; CI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p4) :: (load 2, addrspace 4) - ; CI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; CI: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 ; CI: [[GEP1:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C1]](s64) ; CI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p4) :: (load 2, addrspace 4) - ; CI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; CI: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 6 ; CI: [[GEP2:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C2]](s64) ; CI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p4) :: (load 2, addrspace 4) - ; CI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; CI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) - ; CI: [[C3:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 - ; CI: [[GEP3:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C3]](s64) + ; CI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; CI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; CI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C3]] + ; CI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; CI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C3]] + ; CI: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C4]](s32) + ; CI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; CI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; CI: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C3]] + ; CI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; CI: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C3]] + ; CI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) + ; CI: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; CI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32) + ; CI: [[C5:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 + ; CI: [[GEP3:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C5]](s64) ; CI: 
[[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p4) :: (load 2, addrspace 4) - ; CI: [[TRUNC4:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD4]](s32) ; CI: [[GEP4:%[0-9]+]]:_(p4) = G_GEP [[GEP3]], [[C]](s64) ; CI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p4) :: (load 2, addrspace 4) - ; CI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) ; CI: [[GEP5:%[0-9]+]]:_(p4) = G_GEP [[GEP3]], [[C1]](s64) ; CI: [[LOAD6:%[0-9]+]]:_(s32) = G_LOAD [[GEP5]](p4) :: (load 2, addrspace 4) - ; CI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; CI: [[GEP6:%[0-9]+]]:_(p4) = G_GEP [[GEP3]], [[C2]](s64) ; CI: [[LOAD7:%[0-9]+]]:_(s32) = G_LOAD [[GEP6]](p4) :: (load 2, addrspace 4) - ; CI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) - ; CI: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[TRUNC4]](s16), [[TRUNC5]](s16), [[TRUNC6]](s16), [[TRUNC7]](s16) + ; CI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD4]](s32) + ; CI: [[AND4:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C3]] + ; CI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) + ; CI: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C3]] + ; CI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[C4]](s32) + ; CI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[AND4]], [[SHL2]] + ; CI: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LOAD6]](s32) + ; CI: [[AND6:%[0-9]+]]:_(s32) = G_AND [[COPY7]], [[C3]] + ; CI: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) + ; CI: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY8]], [[C3]] + ; CI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C4]](s32) + ; CI: [[OR3:%[0-9]+]]:_(s32) = G_OR [[AND6]], [[SHL3]] + ; CI: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR2]](s32), [[OR3]](s32) ; CI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s64>) = G_BUILD_VECTOR [[MV]](s64), [[MV1]](s64) ; CI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BUILD_VECTOR]](<2 x s64>) ; VI-LABEL: name: test_load_constant_v2s64_align2 ; VI: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p4) :: (load 2, addrspace 4) - ; VI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; VI: 
[[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; VI: [[GEP:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C]](s64) ; VI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p4) :: (load 2, addrspace 4) - ; VI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; VI: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 ; VI: [[GEP1:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C1]](s64) ; VI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p4) :: (load 2, addrspace 4) - ; VI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; VI: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 6 ; VI: [[GEP2:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C2]](s64) ; VI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p4) :: (load 2, addrspace 4) - ; VI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; VI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) - ; VI: [[C3:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 - ; VI: [[GEP3:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C3]](s64) + ; VI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; VI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; VI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C3]] + ; VI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; VI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C3]] + ; VI: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C4]](s32) + ; VI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; VI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; VI: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C3]] + ; VI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; VI: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C3]] + ; VI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) + ; VI: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; VI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32) + ; VI: [[C5:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 + ; VI: [[GEP3:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C5]](s64) ; VI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p4) :: (load 2, addrspace 4) - ; VI: 
[[TRUNC4:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD4]](s32) ; VI: [[GEP4:%[0-9]+]]:_(p4) = G_GEP [[GEP3]], [[C]](s64) ; VI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p4) :: (load 2, addrspace 4) - ; VI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) ; VI: [[GEP5:%[0-9]+]]:_(p4) = G_GEP [[GEP3]], [[C1]](s64) ; VI: [[LOAD6:%[0-9]+]]:_(s32) = G_LOAD [[GEP5]](p4) :: (load 2, addrspace 4) - ; VI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; VI: [[GEP6:%[0-9]+]]:_(p4) = G_GEP [[GEP3]], [[C2]](s64) ; VI: [[LOAD7:%[0-9]+]]:_(s32) = G_LOAD [[GEP6]](p4) :: (load 2, addrspace 4) - ; VI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) - ; VI: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[TRUNC4]](s16), [[TRUNC5]](s16), [[TRUNC6]](s16), [[TRUNC7]](s16) + ; VI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD4]](s32) + ; VI: [[AND4:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C3]] + ; VI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) + ; VI: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C3]] + ; VI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[C4]](s32) + ; VI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[AND4]], [[SHL2]] + ; VI: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LOAD6]](s32) + ; VI: [[AND6:%[0-9]+]]:_(s32) = G_AND [[COPY7]], [[C3]] + ; VI: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) + ; VI: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY8]], [[C3]] + ; VI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C4]](s32) + ; VI: [[OR3:%[0-9]+]]:_(s32) = G_OR [[AND6]], [[SHL3]] + ; VI: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR2]](s32), [[OR3]](s32) ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s64>) = G_BUILD_VECTOR [[MV]](s64), [[MV1]](s64) ; VI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BUILD_VECTOR]](<2 x s64>) ; GFX9-LABEL: name: test_load_constant_v2s64_align2 ; GFX9: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 ; GFX9: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p4) :: (load 2, addrspace 4) - ; GFX9: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; GFX9: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; GFX9: [[GEP:%[0-9]+]]:_(p4) = G_GEP 
[[COPY]], [[C]](s64) ; GFX9: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p4) :: (load 2, addrspace 4) - ; GFX9: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; GFX9: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 ; GFX9: [[GEP1:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C1]](s64) ; GFX9: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p4) :: (load 2, addrspace 4) - ; GFX9: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; GFX9: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 6 ; GFX9: [[GEP2:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C2]](s64) ; GFX9: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p4) :: (load 2, addrspace 4) - ; GFX9: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; GFX9: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) - ; GFX9: [[C3:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 - ; GFX9: [[GEP3:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C3]](s64) + ; GFX9: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; GFX9: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; GFX9: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C3]] + ; GFX9: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; GFX9: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C3]] + ; GFX9: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C4]](s32) + ; GFX9: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; GFX9: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; GFX9: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C3]] + ; GFX9: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; GFX9: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C3]] + ; GFX9: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) + ; GFX9: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; GFX9: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32) + ; GFX9: [[C5:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 + ; GFX9: [[GEP3:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C5]](s64) ; GFX9: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p4) :: (load 2, addrspace 4) - ; GFX9: 
[[TRUNC4:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD4]](s32) ; GFX9: [[GEP4:%[0-9]+]]:_(p4) = G_GEP [[GEP3]], [[C]](s64) ; GFX9: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p4) :: (load 2, addrspace 4) - ; GFX9: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) ; GFX9: [[GEP5:%[0-9]+]]:_(p4) = G_GEP [[GEP3]], [[C1]](s64) ; GFX9: [[LOAD6:%[0-9]+]]:_(s32) = G_LOAD [[GEP5]](p4) :: (load 2, addrspace 4) - ; GFX9: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; GFX9: [[GEP6:%[0-9]+]]:_(p4) = G_GEP [[GEP3]], [[C2]](s64) ; GFX9: [[LOAD7:%[0-9]+]]:_(s32) = G_LOAD [[GEP6]](p4) :: (load 2, addrspace 4) - ; GFX9: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) - ; GFX9: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[TRUNC4]](s16), [[TRUNC5]](s16), [[TRUNC6]](s16), [[TRUNC7]](s16) + ; GFX9: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD4]](s32) + ; GFX9: [[AND4:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C3]] + ; GFX9: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) + ; GFX9: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C3]] + ; GFX9: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[C4]](s32) + ; GFX9: [[OR2:%[0-9]+]]:_(s32) = G_OR [[AND4]], [[SHL2]] + ; GFX9: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LOAD6]](s32) + ; GFX9: [[AND6:%[0-9]+]]:_(s32) = G_AND [[COPY7]], [[C3]] + ; GFX9: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) + ; GFX9: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY8]], [[C3]] + ; GFX9: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C4]](s32) + ; GFX9: [[OR3:%[0-9]+]]:_(s32) = G_OR [[AND6]], [[SHL3]] + ; GFX9: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR2]](s32), [[OR3]](s32) ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s64>) = G_BUILD_VECTOR [[MV]](s64), [[MV1]](s64) ; GFX9: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BUILD_VECTOR]](<2 x s64>) ; CI-MESA-LABEL: name: test_load_constant_v2s64_align2 ; CI-MESA: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 ; CI-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p4) :: (load 2, addrspace 4) - ; CI-MESA: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; CI-MESA: 
[[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; CI-MESA: [[GEP:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C]](s64) ; CI-MESA: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p4) :: (load 2, addrspace 4) - ; CI-MESA: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; CI-MESA: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 ; CI-MESA: [[GEP1:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C1]](s64) ; CI-MESA: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p4) :: (load 2, addrspace 4) - ; CI-MESA: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; CI-MESA: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 6 ; CI-MESA: [[GEP2:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C2]](s64) ; CI-MESA: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p4) :: (load 2, addrspace 4) - ; CI-MESA: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; CI-MESA: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) - ; CI-MESA: [[C3:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 - ; CI-MESA: [[GEP3:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C3]](s64) + ; CI-MESA: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; CI-MESA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; CI-MESA: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C3]] + ; CI-MESA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; CI-MESA: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C3]] + ; CI-MESA: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-MESA: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C4]](s32) + ; CI-MESA: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; CI-MESA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; CI-MESA: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C3]] + ; CI-MESA: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; CI-MESA: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C3]] + ; CI-MESA: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) + ; CI-MESA: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; CI-MESA: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32) + ; CI-MESA: [[C5:%[0-9]+]]:_(s64) = G_CONSTANT i64 
8 + ; CI-MESA: [[GEP3:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C5]](s64) ; CI-MESA: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p4) :: (load 2, addrspace 4) - ; CI-MESA: [[TRUNC4:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD4]](s32) ; CI-MESA: [[GEP4:%[0-9]+]]:_(p4) = G_GEP [[GEP3]], [[C]](s64) ; CI-MESA: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p4) :: (load 2, addrspace 4) - ; CI-MESA: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) ; CI-MESA: [[GEP5:%[0-9]+]]:_(p4) = G_GEP [[GEP3]], [[C1]](s64) ; CI-MESA: [[LOAD6:%[0-9]+]]:_(s32) = G_LOAD [[GEP5]](p4) :: (load 2, addrspace 4) - ; CI-MESA: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; CI-MESA: [[GEP6:%[0-9]+]]:_(p4) = G_GEP [[GEP3]], [[C2]](s64) ; CI-MESA: [[LOAD7:%[0-9]+]]:_(s32) = G_LOAD [[GEP6]](p4) :: (load 2, addrspace 4) - ; CI-MESA: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) - ; CI-MESA: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[TRUNC4]](s16), [[TRUNC5]](s16), [[TRUNC6]](s16), [[TRUNC7]](s16) + ; CI-MESA: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD4]](s32) + ; CI-MESA: [[AND4:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C3]] + ; CI-MESA: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) + ; CI-MESA: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C3]] + ; CI-MESA: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[C4]](s32) + ; CI-MESA: [[OR2:%[0-9]+]]:_(s32) = G_OR [[AND4]], [[SHL2]] + ; CI-MESA: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LOAD6]](s32) + ; CI-MESA: [[AND6:%[0-9]+]]:_(s32) = G_AND [[COPY7]], [[C3]] + ; CI-MESA: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) + ; CI-MESA: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY8]], [[C3]] + ; CI-MESA: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C4]](s32) + ; CI-MESA: [[OR3:%[0-9]+]]:_(s32) = G_OR [[AND6]], [[SHL3]] + ; CI-MESA: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR2]](s32), [[OR3]](s32) ; CI-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s64>) = G_BUILD_VECTOR [[MV]](s64), [[MV1]](s64) ; CI-MESA: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BUILD_VECTOR]](<2 x s64>) ; GFX9-MESA-LABEL: name: 
test_load_constant_v2s64_align2 ; GFX9-MESA: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 ; GFX9-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p4) :: (load 2, addrspace 4) - ; GFX9-MESA: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; GFX9-MESA: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; GFX9-MESA: [[GEP:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C]](s64) ; GFX9-MESA: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p4) :: (load 2, addrspace 4) - ; GFX9-MESA: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; GFX9-MESA: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 ; GFX9-MESA: [[GEP1:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C1]](s64) ; GFX9-MESA: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p4) :: (load 2, addrspace 4) - ; GFX9-MESA: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; GFX9-MESA: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 6 ; GFX9-MESA: [[GEP2:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C2]](s64) ; GFX9-MESA: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p4) :: (load 2, addrspace 4) - ; GFX9-MESA: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; GFX9-MESA: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) - ; GFX9-MESA: [[C3:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 - ; GFX9-MESA: [[GEP3:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C3]](s64) - ; GFX9-MESA: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p4) :: (load 2, addrspace 4) - ; GFX9-MESA: [[TRUNC4:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD4]](s32) + ; GFX9-MESA: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; GFX9-MESA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; GFX9-MESA: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C3]] + ; GFX9-MESA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; GFX9-MESA: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C3]] + ; GFX9-MESA: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9-MESA: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C4]](s32) + ; GFX9-MESA: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; GFX9-MESA: [[COPY3:%[0-9]+]]:_(s32) = COPY 
[[LOAD2]](s32) + ; GFX9-MESA: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C3]] + ; GFX9-MESA: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; GFX9-MESA: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C3]] + ; GFX9-MESA: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) + ; GFX9-MESA: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; GFX9-MESA: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32) + ; GFX9-MESA: [[C5:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 + ; GFX9-MESA: [[GEP3:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C5]](s64) + ; GFX9-MESA: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p4) :: (load 2, addrspace 4) ; GFX9-MESA: [[GEP4:%[0-9]+]]:_(p4) = G_GEP [[GEP3]], [[C]](s64) ; GFX9-MESA: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p4) :: (load 2, addrspace 4) - ; GFX9-MESA: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) ; GFX9-MESA: [[GEP5:%[0-9]+]]:_(p4) = G_GEP [[GEP3]], [[C1]](s64) ; GFX9-MESA: [[LOAD6:%[0-9]+]]:_(s32) = G_LOAD [[GEP5]](p4) :: (load 2, addrspace 4) - ; GFX9-MESA: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; GFX9-MESA: [[GEP6:%[0-9]+]]:_(p4) = G_GEP [[GEP3]], [[C2]](s64) ; GFX9-MESA: [[LOAD7:%[0-9]+]]:_(s32) = G_LOAD [[GEP6]](p4) :: (load 2, addrspace 4) - ; GFX9-MESA: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) - ; GFX9-MESA: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[TRUNC4]](s16), [[TRUNC5]](s16), [[TRUNC6]](s16), [[TRUNC7]](s16) + ; GFX9-MESA: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD4]](s32) + ; GFX9-MESA: [[AND4:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C3]] + ; GFX9-MESA: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) + ; GFX9-MESA: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C3]] + ; GFX9-MESA: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[C4]](s32) + ; GFX9-MESA: [[OR2:%[0-9]+]]:_(s32) = G_OR [[AND4]], [[SHL2]] + ; GFX9-MESA: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LOAD6]](s32) + ; GFX9-MESA: [[AND6:%[0-9]+]]:_(s32) = G_AND [[COPY7]], [[C3]] + ; GFX9-MESA: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) + ; GFX9-MESA: 
[[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY8]], [[C3]] + ; GFX9-MESA: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C4]](s32) + ; GFX9-MESA: [[OR3:%[0-9]+]]:_(s32) = G_OR [[AND6]], [[SHL3]] + ; GFX9-MESA: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR2]](s32), [[OR3]](s32) ; GFX9-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s64>) = G_BUILD_VECTOR [[MV]](s64), [[MV1]](s64) ; GFX9-MESA: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BUILD_VECTOR]](<2 x s64>) %0:_(p4) = COPY $vgpr0_vgpr1 @@ -6621,9 +7266,18 @@ body: | ; CI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C8]](s32) ; CI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) ; CI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; CI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) - ; CI: [[C10:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 - ; CI: [[GEP7:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C10]](s64) + ; CI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C10]](s32) + ; CI: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; CI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; CI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C10]](s32) + ; CI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; CI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) + ; CI: [[C11:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 + ; CI: [[GEP7:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C11]](s64) ; CI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p4) :: (load 1, addrspace 4) ; CI: [[GEP8:%[0-9]+]]:_(p4) = G_GEP [[GEP7]], [[C]](s64) ; CI: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p4) :: (load 1, addrspace 4) @@ -6644,34 +7298,42 @@ body: | ; CI: [[COPY8:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI: [[COPY9:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32) ; CI: [[AND9:%[0-9]+]]:_(s32) = G_AND [[COPY9]], [[C9]] - ; CI: 
[[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY8]](s32) - ; CI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) - ; CI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] + ; CI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY8]](s32) + ; CI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) + ; CI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] ; CI: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; CI: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C7]] ; CI: [[COPY10:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI: [[COPY11:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32) ; CI: [[AND11:%[0-9]+]]:_(s32) = G_AND [[COPY11]], [[C9]] - ; CI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY10]](s32) - ; CI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL5]](s32) - ; CI: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] + ; CI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY10]](s32) + ; CI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) + ; CI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] ; CI: [[TRUNC12:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD12]](s32) ; CI: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C7]] ; CI: [[COPY12:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI: [[COPY13:%[0-9]+]]:_(s32) = COPY [[LOAD13]](s32) ; CI: [[AND13:%[0-9]+]]:_(s32) = G_AND [[COPY13]], [[C9]] - ; CI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY12]](s32) - ; CI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) - ; CI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] + ; CI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY12]](s32) + ; CI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL8]](s32) + ; CI: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] ; CI: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; CI: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C7]] ; CI: [[COPY14:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI: [[COPY15:%[0-9]+]]:_(s32) = COPY [[LOAD15]](s32) ; CI: [[AND15:%[0-9]+]]:_(s32) = G_AND [[COPY15]], [[C9]] - ; CI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL 
[[AND15]], [[COPY14]](s32) - ; CI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) - ; CI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] - ; CI: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16), [[OR6]](s16), [[OR7]](s16) + ; CI: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY14]](s32) + ; CI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL9]](s32) + ; CI: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] + ; CI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; CI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; CI: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C10]](s32) + ; CI: [[OR10:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL10]] + ; CI: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR8]](s16) + ; CI: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; CI: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C10]](s32) + ; CI: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; CI: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR10]](s32), [[OR11]](s32) ; CI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s64>) = G_BUILD_VECTOR [[MV]](s64), [[MV1]](s64) ; CI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BUILD_VECTOR]](<2 x s64>) ; VI-LABEL: name: test_load_constant_v2s64_align1 @@ -6724,9 +7386,18 @@ body: | ; VI: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C7]] ; VI: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C8]](s16) ; VI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; VI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) - ; VI: [[C9:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 - ; VI: [[GEP7:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C9]](s64) + ; VI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; VI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; VI: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C9]](s32) + ; VI: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; VI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; VI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; VI: 
[[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C9]](s32) + ; VI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; VI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) + ; VI: [[C10:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 + ; VI: [[GEP7:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C10]](s64) ; VI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p4) :: (load 1, addrspace 4) ; VI: [[GEP8:%[0-9]+]]:_(p4) = G_GEP [[GEP7]], [[C]](s64) ; VI: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p4) :: (load 1, addrspace 4) @@ -6746,27 +7417,35 @@ body: | ; VI: [[AND8:%[0-9]+]]:_(s16) = G_AND [[TRUNC8]], [[C7]] ; VI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD9]](s32) ; VI: [[AND9:%[0-9]+]]:_(s16) = G_AND [[TRUNC9]], [[C7]] - ; VI: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) - ; VI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL4]] + ; VI: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) + ; VI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL6]] ; VI: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; VI: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C7]] ; VI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD11]](s32) ; VI: [[AND11:%[0-9]+]]:_(s16) = G_AND [[TRUNC11]], [[C7]] - ; VI: [[SHL5:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) - ; VI: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL5]] + ; VI: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) + ; VI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL7]] ; VI: [[TRUNC12:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD12]](s32) ; VI: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C7]] ; VI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD13]](s32) ; VI: [[AND13:%[0-9]+]]:_(s16) = G_AND [[TRUNC13]], [[C7]] - ; VI: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C8]](s16) - ; VI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL6]] + ; VI: [[SHL8:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C8]](s16) + ; VI: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL8]] ; VI: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; VI: [[AND14:%[0-9]+]]:_(s16) 
= G_AND [[TRUNC14]], [[C7]] ; VI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD15]](s32) ; VI: [[AND15:%[0-9]+]]:_(s16) = G_AND [[TRUNC15]], [[C7]] - ; VI: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C8]](s16) - ; VI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL7]] - ; VI: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16), [[OR6]](s16), [[OR7]](s16) + ; VI: [[SHL9:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C8]](s16) + ; VI: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL9]] + ; VI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; VI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; VI: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C9]](s32) + ; VI: [[OR10:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL10]] + ; VI: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR8]](s16) + ; VI: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; VI: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C9]](s32) + ; VI: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; VI: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR10]](s32), [[OR11]](s32) ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s64>) = G_BUILD_VECTOR [[MV]](s64), [[MV1]](s64) ; VI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BUILD_VECTOR]](<2 x s64>) ; GFX9-LABEL: name: test_load_constant_v2s64_align1 @@ -6819,9 +7498,18 @@ body: | ; GFX9: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C7]] ; GFX9: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C8]](s16) ; GFX9: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; GFX9: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) - ; GFX9: [[C9:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 - ; GFX9: [[GEP7:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C9]](s64) + ; GFX9: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C9]](s32) + ; GFX9: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; GFX9: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT 
[[OR2]](s16) + ; GFX9: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; GFX9: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C9]](s32) + ; GFX9: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; GFX9: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) + ; GFX9: [[C10:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 + ; GFX9: [[GEP7:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C10]](s64) ; GFX9: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p4) :: (load 1, addrspace 4) ; GFX9: [[GEP8:%[0-9]+]]:_(p4) = G_GEP [[GEP7]], [[C]](s64) ; GFX9: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p4) :: (load 1, addrspace 4) @@ -6841,27 +7529,35 @@ body: | ; GFX9: [[AND8:%[0-9]+]]:_(s16) = G_AND [[TRUNC8]], [[C7]] ; GFX9: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD9]](s32) ; GFX9: [[AND9:%[0-9]+]]:_(s16) = G_AND [[TRUNC9]], [[C7]] - ; GFX9: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) - ; GFX9: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL4]] + ; GFX9: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) + ; GFX9: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL6]] ; GFX9: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; GFX9: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C7]] ; GFX9: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD11]](s32) ; GFX9: [[AND11:%[0-9]+]]:_(s16) = G_AND [[TRUNC11]], [[C7]] - ; GFX9: [[SHL5:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) - ; GFX9: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL5]] + ; GFX9: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) + ; GFX9: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL7]] ; GFX9: [[TRUNC12:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD12]](s32) ; GFX9: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C7]] ; GFX9: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD13]](s32) ; GFX9: [[AND13:%[0-9]+]]:_(s16) = G_AND [[TRUNC13]], [[C7]] - ; GFX9: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C8]](s16) - ; GFX9: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL6]] + ; GFX9: [[SHL8:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C8]](s16) + ; GFX9: 
[[OR8:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL8]] ; GFX9: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; GFX9: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C7]] ; GFX9: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD15]](s32) ; GFX9: [[AND15:%[0-9]+]]:_(s16) = G_AND [[TRUNC15]], [[C7]] - ; GFX9: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C8]](s16) - ; GFX9: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL7]] - ; GFX9: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16), [[OR6]](s16), [[OR7]](s16) + ; GFX9: [[SHL9:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C8]](s16) + ; GFX9: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL9]] + ; GFX9: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; GFX9: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; GFX9: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C9]](s32) + ; GFX9: [[OR10:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL10]] + ; GFX9: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR8]](s16) + ; GFX9: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; GFX9: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C9]](s32) + ; GFX9: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; GFX9: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR10]](s32), [[OR11]](s32) ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s64>) = G_BUILD_VECTOR [[MV]](s64), [[MV1]](s64) ; GFX9: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BUILD_VECTOR]](<2 x s64>) ; CI-MESA-LABEL: name: test_load_constant_v2s64_align1 @@ -6922,9 +7618,18 @@ body: | ; CI-MESA: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C8]](s32) ; CI-MESA: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) ; CI-MESA: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; CI-MESA: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) - ; CI-MESA: [[C10:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 - ; CI-MESA: [[GEP7:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C10]](s64) + ; CI-MESA: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI-MESA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; 
CI-MESA: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-MESA: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C10]](s32) + ; CI-MESA: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; CI-MESA: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; CI-MESA: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI-MESA: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C10]](s32) + ; CI-MESA: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; CI-MESA: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) + ; CI-MESA: [[C11:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 + ; CI-MESA: [[GEP7:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C11]](s64) ; CI-MESA: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p4) :: (load 1, addrspace 4) ; CI-MESA: [[GEP8:%[0-9]+]]:_(p4) = G_GEP [[GEP7]], [[C]](s64) ; CI-MESA: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p4) :: (load 1, addrspace 4) @@ -6945,34 +7650,42 @@ body: | ; CI-MESA: [[COPY8:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-MESA: [[COPY9:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32) ; CI-MESA: [[AND9:%[0-9]+]]:_(s32) = G_AND [[COPY9]], [[C9]] - ; CI-MESA: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY8]](s32) - ; CI-MESA: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) - ; CI-MESA: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] + ; CI-MESA: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY8]](s32) + ; CI-MESA: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) + ; CI-MESA: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] ; CI-MESA: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; CI-MESA: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C7]] ; CI-MESA: [[COPY10:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-MESA: [[COPY11:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32) ; CI-MESA: [[AND11:%[0-9]+]]:_(s32) = G_AND [[COPY11]], [[C9]] - ; CI-MESA: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY10]](s32) - ; CI-MESA: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL5]](s32) - ; CI-MESA: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] + ; CI-MESA: 
[[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY10]](s32) + ; CI-MESA: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) + ; CI-MESA: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] ; CI-MESA: [[TRUNC12:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD12]](s32) ; CI-MESA: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C7]] ; CI-MESA: [[COPY12:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-MESA: [[COPY13:%[0-9]+]]:_(s32) = COPY [[LOAD13]](s32) ; CI-MESA: [[AND13:%[0-9]+]]:_(s32) = G_AND [[COPY13]], [[C9]] - ; CI-MESA: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY12]](s32) - ; CI-MESA: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) - ; CI-MESA: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] + ; CI-MESA: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY12]](s32) + ; CI-MESA: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL8]](s32) + ; CI-MESA: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] ; CI-MESA: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; CI-MESA: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C7]] ; CI-MESA: [[COPY14:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-MESA: [[COPY15:%[0-9]+]]:_(s32) = COPY [[LOAD15]](s32) ; CI-MESA: [[AND15:%[0-9]+]]:_(s32) = G_AND [[COPY15]], [[C9]] - ; CI-MESA: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY14]](s32) - ; CI-MESA: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) - ; CI-MESA: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] - ; CI-MESA: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16), [[OR6]](s16), [[OR7]](s16) + ; CI-MESA: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY14]](s32) + ; CI-MESA: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL9]](s32) + ; CI-MESA: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] + ; CI-MESA: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; CI-MESA: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; CI-MESA: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C10]](s32) + ; CI-MESA: [[OR10:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL10]] + ; CI-MESA: 
[[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR8]](s16) + ; CI-MESA: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; CI-MESA: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C10]](s32) + ; CI-MESA: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; CI-MESA: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR10]](s32), [[OR11]](s32) ; CI-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s64>) = G_BUILD_VECTOR [[MV]](s64), [[MV1]](s64) ; CI-MESA: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BUILD_VECTOR]](<2 x s64>) ; GFX9-MESA-LABEL: name: test_load_constant_v2s64_align1 @@ -7025,9 +7738,18 @@ body: | ; GFX9-MESA: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C7]] ; GFX9-MESA: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C8]](s16) ; GFX9-MESA: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; GFX9-MESA: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) - ; GFX9-MESA: [[C9:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 - ; GFX9-MESA: [[GEP7:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C9]](s64) + ; GFX9-MESA: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9-MESA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9-MESA: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9-MESA: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C9]](s32) + ; GFX9-MESA: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; GFX9-MESA: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; GFX9-MESA: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; GFX9-MESA: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C9]](s32) + ; GFX9-MESA: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; GFX9-MESA: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) + ; GFX9-MESA: [[C10:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 + ; GFX9-MESA: [[GEP7:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C10]](s64) ; GFX9-MESA: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p4) :: (load 1, addrspace 4) ; GFX9-MESA: [[GEP8:%[0-9]+]]:_(p4) = G_GEP [[GEP7]], [[C]](s64) ; GFX9-MESA: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p4) :: (load 
1, addrspace 4) @@ -7047,27 +7769,35 @@ body: | ; GFX9-MESA: [[AND8:%[0-9]+]]:_(s16) = G_AND [[TRUNC8]], [[C7]] ; GFX9-MESA: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD9]](s32) ; GFX9-MESA: [[AND9:%[0-9]+]]:_(s16) = G_AND [[TRUNC9]], [[C7]] - ; GFX9-MESA: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) - ; GFX9-MESA: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL4]] + ; GFX9-MESA: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) + ; GFX9-MESA: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL6]] ; GFX9-MESA: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; GFX9-MESA: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C7]] ; GFX9-MESA: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD11]](s32) ; GFX9-MESA: [[AND11:%[0-9]+]]:_(s16) = G_AND [[TRUNC11]], [[C7]] - ; GFX9-MESA: [[SHL5:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) - ; GFX9-MESA: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL5]] + ; GFX9-MESA: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) + ; GFX9-MESA: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL7]] ; GFX9-MESA: [[TRUNC12:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD12]](s32) ; GFX9-MESA: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C7]] ; GFX9-MESA: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD13]](s32) ; GFX9-MESA: [[AND13:%[0-9]+]]:_(s16) = G_AND [[TRUNC13]], [[C7]] - ; GFX9-MESA: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C8]](s16) - ; GFX9-MESA: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL6]] + ; GFX9-MESA: [[SHL8:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C8]](s16) + ; GFX9-MESA: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL8]] ; GFX9-MESA: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; GFX9-MESA: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C7]] ; GFX9-MESA: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD15]](s32) ; GFX9-MESA: [[AND15:%[0-9]+]]:_(s16) = G_AND [[TRUNC15]], [[C7]] - ; GFX9-MESA: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C8]](s16) - ; GFX9-MESA: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL7]] - ; GFX9-MESA: 
[[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16), [[OR6]](s16), [[OR7]](s16) + ; GFX9-MESA: [[SHL9:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C8]](s16) + ; GFX9-MESA: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL9]] + ; GFX9-MESA: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; GFX9-MESA: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; GFX9-MESA: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C9]](s32) + ; GFX9-MESA: [[OR10:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL10]] + ; GFX9-MESA: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR8]](s16) + ; GFX9-MESA: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; GFX9-MESA: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C9]](s32) + ; GFX9-MESA: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; GFX9-MESA: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR10]](s32), [[OR11]](s32) ; GFX9-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s64>) = G_BUILD_VECTOR [[MV]](s64), [[MV1]](s64) ; GFX9-MESA: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BUILD_VECTOR]](<2 x s64>) %0:_(p4) = COPY $vgpr0_vgpr1 @@ -7225,9 +7955,18 @@ body: | ; CI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C8]](s32) ; CI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) ; CI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; CI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) - ; CI: [[C10:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 - ; CI: [[GEP7:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C10]](s64) + ; CI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C10]](s32) + ; CI: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; CI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; CI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C10]](s32) + ; CI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; CI: [[MV:%[0-9]+]]:_(s64) = 
G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) + ; CI: [[C11:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 + ; CI: [[GEP7:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C11]](s64) ; CI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p4) :: (load 1, addrspace 4) ; CI: [[GEP8:%[0-9]+]]:_(p4) = G_GEP [[GEP7]], [[C]](s64) ; CI: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p4) :: (load 1, addrspace 4) @@ -7248,36 +7987,44 @@ body: | ; CI: [[COPY8:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI: [[COPY9:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32) ; CI: [[AND9:%[0-9]+]]:_(s32) = G_AND [[COPY9]], [[C9]] - ; CI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY8]](s32) - ; CI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) - ; CI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] + ; CI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY8]](s32) + ; CI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) + ; CI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] ; CI: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; CI: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C7]] ; CI: [[COPY10:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI: [[COPY11:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32) ; CI: [[AND11:%[0-9]+]]:_(s32) = G_AND [[COPY11]], [[C9]] - ; CI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY10]](s32) - ; CI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL5]](s32) - ; CI: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] + ; CI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY10]](s32) + ; CI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) + ; CI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] ; CI: [[TRUNC12:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD12]](s32) ; CI: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C7]] ; CI: [[COPY12:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI: [[COPY13:%[0-9]+]]:_(s32) = COPY [[LOAD13]](s32) ; CI: [[AND13:%[0-9]+]]:_(s32) = G_AND [[COPY13]], [[C9]] - ; CI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY12]](s32) - ; CI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) - ; CI: 
[[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] + ; CI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY12]](s32) + ; CI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL8]](s32) + ; CI: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] ; CI: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; CI: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C7]] ; CI: [[COPY14:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI: [[COPY15:%[0-9]+]]:_(s32) = COPY [[LOAD15]](s32) ; CI: [[AND15:%[0-9]+]]:_(s32) = G_AND [[COPY15]], [[C9]] - ; CI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY14]](s32) - ; CI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) - ; CI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] - ; CI: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16), [[OR6]](s16), [[OR7]](s16) - ; CI: [[C11:%[0-9]+]]:_(s64) = G_CONSTANT i64 16 - ; CI: [[GEP15:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C11]](s64) + ; CI: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY14]](s32) + ; CI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL9]](s32) + ; CI: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] + ; CI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; CI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; CI: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C10]](s32) + ; CI: [[OR10:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL10]] + ; CI: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR8]](s16) + ; CI: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; CI: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C10]](s32) + ; CI: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; CI: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR10]](s32), [[OR11]](s32) + ; CI: [[C12:%[0-9]+]]:_(s64) = G_CONSTANT i64 16 + ; CI: [[GEP15:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C12]](s64) ; CI: [[LOAD16:%[0-9]+]]:_(s32) = G_LOAD [[GEP15]](p4) :: (load 1, addrspace 4) ; CI: [[GEP16:%[0-9]+]]:_(p4) = G_GEP [[GEP15]], [[C]](s64) ; CI: [[LOAD17:%[0-9]+]]:_(s32) = G_LOAD [[GEP16]](p4) :: (load 1, addrspace 4) @@ 
-7298,34 +8045,42 @@ body: | ; CI: [[COPY16:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI: [[COPY17:%[0-9]+]]:_(s32) = COPY [[LOAD17]](s32) ; CI: [[AND17:%[0-9]+]]:_(s32) = G_AND [[COPY17]], [[C9]] - ; CI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[AND17]], [[COPY16]](s32) - ; CI: [[TRUNC17:%[0-9]+]]:_(s16) = G_TRUNC [[SHL8]](s32) - ; CI: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[TRUNC17]] + ; CI: [[SHL12:%[0-9]+]]:_(s32) = G_SHL [[AND17]], [[COPY16]](s32) + ; CI: [[TRUNC17:%[0-9]+]]:_(s16) = G_TRUNC [[SHL12]](s32) + ; CI: [[OR12:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[TRUNC17]] ; CI: [[TRUNC18:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD18]](s32) ; CI: [[AND18:%[0-9]+]]:_(s16) = G_AND [[TRUNC18]], [[C7]] ; CI: [[COPY18:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI: [[COPY19:%[0-9]+]]:_(s32) = COPY [[LOAD19]](s32) ; CI: [[AND19:%[0-9]+]]:_(s32) = G_AND [[COPY19]], [[C9]] - ; CI: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[AND19]], [[COPY18]](s32) - ; CI: [[TRUNC19:%[0-9]+]]:_(s16) = G_TRUNC [[SHL9]](s32) - ; CI: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[TRUNC19]] + ; CI: [[SHL13:%[0-9]+]]:_(s32) = G_SHL [[AND19]], [[COPY18]](s32) + ; CI: [[TRUNC19:%[0-9]+]]:_(s16) = G_TRUNC [[SHL13]](s32) + ; CI: [[OR13:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[TRUNC19]] ; CI: [[TRUNC20:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD20]](s32) ; CI: [[AND20:%[0-9]+]]:_(s16) = G_AND [[TRUNC20]], [[C7]] ; CI: [[COPY20:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI: [[COPY21:%[0-9]+]]:_(s32) = COPY [[LOAD21]](s32) ; CI: [[AND21:%[0-9]+]]:_(s32) = G_AND [[COPY21]], [[C9]] - ; CI: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[AND21]], [[COPY20]](s32) - ; CI: [[TRUNC21:%[0-9]+]]:_(s16) = G_TRUNC [[SHL10]](s32) - ; CI: [[OR10:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[TRUNC21]] + ; CI: [[SHL14:%[0-9]+]]:_(s32) = G_SHL [[AND21]], [[COPY20]](s32) + ; CI: [[TRUNC21:%[0-9]+]]:_(s16) = G_TRUNC [[SHL14]](s32) + ; CI: [[OR14:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[TRUNC21]] ; CI: [[TRUNC22:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD22]](s32) ; CI: [[AND22:%[0-9]+]]:_(s16) = G_AND 
[[TRUNC22]], [[C7]] ; CI: [[COPY22:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI: [[COPY23:%[0-9]+]]:_(s32) = COPY [[LOAD23]](s32) ; CI: [[AND23:%[0-9]+]]:_(s32) = G_AND [[COPY23]], [[C9]] - ; CI: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[AND23]], [[COPY22]](s32) - ; CI: [[TRUNC23:%[0-9]+]]:_(s16) = G_TRUNC [[SHL11]](s32) - ; CI: [[OR11:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[TRUNC23]] - ; CI: [[MV2:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR8]](s16), [[OR9]](s16), [[OR10]](s16), [[OR11]](s16) + ; CI: [[SHL15:%[0-9]+]]:_(s32) = G_SHL [[AND23]], [[COPY22]](s32) + ; CI: [[TRUNC23:%[0-9]+]]:_(s16) = G_TRUNC [[SHL15]](s32) + ; CI: [[OR15:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[TRUNC23]] + ; CI: [[ZEXT8:%[0-9]+]]:_(s32) = G_ZEXT [[OR12]](s16) + ; CI: [[ZEXT9:%[0-9]+]]:_(s32) = G_ZEXT [[OR13]](s16) + ; CI: [[SHL16:%[0-9]+]]:_(s32) = G_SHL [[ZEXT9]], [[C10]](s32) + ; CI: [[OR16:%[0-9]+]]:_(s32) = G_OR [[ZEXT8]], [[SHL16]] + ; CI: [[ZEXT10:%[0-9]+]]:_(s32) = G_ZEXT [[OR14]](s16) + ; CI: [[ZEXT11:%[0-9]+]]:_(s32) = G_ZEXT [[OR15]](s16) + ; CI: [[SHL17:%[0-9]+]]:_(s32) = G_SHL [[ZEXT11]], [[C10]](s32) + ; CI: [[OR17:%[0-9]+]]:_(s32) = G_OR [[ZEXT10]], [[SHL17]] + ; CI: [[MV2:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR16]](s32), [[OR17]](s32) ; CI: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s64>) = G_BUILD_VECTOR [[MV]](s64), [[MV1]](s64), [[MV2]](s64) ; CI: [[DEF:%[0-9]+]]:_(<4 x s64>) = G_IMPLICIT_DEF ; CI: [[INSERT:%[0-9]+]]:_(<4 x s64>) = G_INSERT [[DEF]], [[BUILD_VECTOR]](<3 x s64>), 0 @@ -7380,9 +8135,18 @@ body: | ; VI: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C7]] ; VI: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C8]](s16) ; VI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; VI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) - ; VI: [[C9:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 - ; VI: [[GEP7:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C9]](s64) + ; VI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; VI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; VI: 
[[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C9]](s32) + ; VI: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; VI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; VI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; VI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C9]](s32) + ; VI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; VI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) + ; VI: [[C10:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 + ; VI: [[GEP7:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C10]](s64) ; VI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p4) :: (load 1, addrspace 4) ; VI: [[GEP8:%[0-9]+]]:_(p4) = G_GEP [[GEP7]], [[C]](s64) ; VI: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p4) :: (load 1, addrspace 4) @@ -7402,29 +8166,37 @@ body: | ; VI: [[AND8:%[0-9]+]]:_(s16) = G_AND [[TRUNC8]], [[C7]] ; VI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD9]](s32) ; VI: [[AND9:%[0-9]+]]:_(s16) = G_AND [[TRUNC9]], [[C7]] - ; VI: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) - ; VI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL4]] + ; VI: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) + ; VI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL6]] ; VI: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; VI: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C7]] ; VI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD11]](s32) ; VI: [[AND11:%[0-9]+]]:_(s16) = G_AND [[TRUNC11]], [[C7]] - ; VI: [[SHL5:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) - ; VI: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL5]] + ; VI: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) + ; VI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL7]] ; VI: [[TRUNC12:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD12]](s32) ; VI: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C7]] ; VI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD13]](s32) ; VI: [[AND13:%[0-9]+]]:_(s16) = G_AND [[TRUNC13]], [[C7]] - ; VI: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND13]], 
[[C8]](s16) - ; VI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL6]] + ; VI: [[SHL8:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C8]](s16) + ; VI: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL8]] ; VI: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; VI: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C7]] ; VI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD15]](s32) ; VI: [[AND15:%[0-9]+]]:_(s16) = G_AND [[TRUNC15]], [[C7]] - ; VI: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C8]](s16) - ; VI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL7]] - ; VI: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16), [[OR6]](s16), [[OR7]](s16) - ; VI: [[C10:%[0-9]+]]:_(s64) = G_CONSTANT i64 16 - ; VI: [[GEP15:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C10]](s64) + ; VI: [[SHL9:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C8]](s16) + ; VI: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL9]] + ; VI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; VI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; VI: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C9]](s32) + ; VI: [[OR10:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL10]] + ; VI: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR8]](s16) + ; VI: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; VI: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C9]](s32) + ; VI: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; VI: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR10]](s32), [[OR11]](s32) + ; VI: [[C11:%[0-9]+]]:_(s64) = G_CONSTANT i64 16 + ; VI: [[GEP15:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C11]](s64) ; VI: [[LOAD16:%[0-9]+]]:_(s32) = G_LOAD [[GEP15]](p4) :: (load 1, addrspace 4) ; VI: [[GEP16:%[0-9]+]]:_(p4) = G_GEP [[GEP15]], [[C]](s64) ; VI: [[LOAD17:%[0-9]+]]:_(s32) = G_LOAD [[GEP16]](p4) :: (load 1, addrspace 4) @@ -7444,27 +8216,35 @@ body: | ; VI: [[AND16:%[0-9]+]]:_(s16) = G_AND [[TRUNC16]], [[C7]] ; VI: [[TRUNC17:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD17]](s32) ; VI: [[AND17:%[0-9]+]]:_(s16) = G_AND [[TRUNC17]], [[C7]] - ; VI: 
[[SHL8:%[0-9]+]]:_(s16) = G_SHL [[AND17]], [[C8]](s16) - ; VI: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[SHL8]] + ; VI: [[SHL12:%[0-9]+]]:_(s16) = G_SHL [[AND17]], [[C8]](s16) + ; VI: [[OR12:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[SHL12]] ; VI: [[TRUNC18:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD18]](s32) ; VI: [[AND18:%[0-9]+]]:_(s16) = G_AND [[TRUNC18]], [[C7]] ; VI: [[TRUNC19:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD19]](s32) ; VI: [[AND19:%[0-9]+]]:_(s16) = G_AND [[TRUNC19]], [[C7]] - ; VI: [[SHL9:%[0-9]+]]:_(s16) = G_SHL [[AND19]], [[C8]](s16) - ; VI: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[SHL9]] + ; VI: [[SHL13:%[0-9]+]]:_(s16) = G_SHL [[AND19]], [[C8]](s16) + ; VI: [[OR13:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[SHL13]] ; VI: [[TRUNC20:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD20]](s32) ; VI: [[AND20:%[0-9]+]]:_(s16) = G_AND [[TRUNC20]], [[C7]] ; VI: [[TRUNC21:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD21]](s32) ; VI: [[AND21:%[0-9]+]]:_(s16) = G_AND [[TRUNC21]], [[C7]] - ; VI: [[SHL10:%[0-9]+]]:_(s16) = G_SHL [[AND21]], [[C8]](s16) - ; VI: [[OR10:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[SHL10]] + ; VI: [[SHL14:%[0-9]+]]:_(s16) = G_SHL [[AND21]], [[C8]](s16) + ; VI: [[OR14:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[SHL14]] ; VI: [[TRUNC22:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD22]](s32) ; VI: [[AND22:%[0-9]+]]:_(s16) = G_AND [[TRUNC22]], [[C7]] ; VI: [[TRUNC23:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD23]](s32) ; VI: [[AND23:%[0-9]+]]:_(s16) = G_AND [[TRUNC23]], [[C7]] - ; VI: [[SHL11:%[0-9]+]]:_(s16) = G_SHL [[AND23]], [[C8]](s16) - ; VI: [[OR11:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[SHL11]] - ; VI: [[MV2:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR8]](s16), [[OR9]](s16), [[OR10]](s16), [[OR11]](s16) + ; VI: [[SHL15:%[0-9]+]]:_(s16) = G_SHL [[AND23]], [[C8]](s16) + ; VI: [[OR15:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[SHL15]] + ; VI: [[ZEXT8:%[0-9]+]]:_(s32) = G_ZEXT [[OR12]](s16) + ; VI: [[ZEXT9:%[0-9]+]]:_(s32) = G_ZEXT [[OR13]](s16) + ; VI: [[SHL16:%[0-9]+]]:_(s32) = G_SHL [[ZEXT9]], [[C9]](s32) + ; VI: [[OR16:%[0-9]+]]:_(s32) 
= G_OR [[ZEXT8]], [[SHL16]] + ; VI: [[ZEXT10:%[0-9]+]]:_(s32) = G_ZEXT [[OR14]](s16) + ; VI: [[ZEXT11:%[0-9]+]]:_(s32) = G_ZEXT [[OR15]](s16) + ; VI: [[SHL17:%[0-9]+]]:_(s32) = G_SHL [[ZEXT11]], [[C9]](s32) + ; VI: [[OR17:%[0-9]+]]:_(s32) = G_OR [[ZEXT10]], [[SHL17]] + ; VI: [[MV2:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR16]](s32), [[OR17]](s32) ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s64>) = G_BUILD_VECTOR [[MV]](s64), [[MV1]](s64), [[MV2]](s64) ; VI: [[DEF:%[0-9]+]]:_(<4 x s64>) = G_IMPLICIT_DEF ; VI: [[INSERT:%[0-9]+]]:_(<4 x s64>) = G_INSERT [[DEF]], [[BUILD_VECTOR]](<3 x s64>), 0 @@ -7519,9 +8299,18 @@ body: | ; GFX9: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C7]] ; GFX9: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C8]](s16) ; GFX9: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; GFX9: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) - ; GFX9: [[C9:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 - ; GFX9: [[GEP7:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C9]](s64) + ; GFX9: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C9]](s32) + ; GFX9: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; GFX9: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; GFX9: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; GFX9: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C9]](s32) + ; GFX9: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; GFX9: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) + ; GFX9: [[C10:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 + ; GFX9: [[GEP7:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C10]](s64) ; GFX9: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p4) :: (load 1, addrspace 4) ; GFX9: [[GEP8:%[0-9]+]]:_(p4) = G_GEP [[GEP7]], [[C]](s64) ; GFX9: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p4) :: (load 1, addrspace 4) @@ -7541,29 +8330,37 @@ body: | ; GFX9: 
[[AND8:%[0-9]+]]:_(s16) = G_AND [[TRUNC8]], [[C7]] ; GFX9: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD9]](s32) ; GFX9: [[AND9:%[0-9]+]]:_(s16) = G_AND [[TRUNC9]], [[C7]] - ; GFX9: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) - ; GFX9: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL4]] + ; GFX9: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) + ; GFX9: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL6]] ; GFX9: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; GFX9: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C7]] ; GFX9: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD11]](s32) ; GFX9: [[AND11:%[0-9]+]]:_(s16) = G_AND [[TRUNC11]], [[C7]] - ; GFX9: [[SHL5:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) - ; GFX9: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL5]] + ; GFX9: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) + ; GFX9: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL7]] ; GFX9: [[TRUNC12:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD12]](s32) ; GFX9: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C7]] ; GFX9: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD13]](s32) ; GFX9: [[AND13:%[0-9]+]]:_(s16) = G_AND [[TRUNC13]], [[C7]] - ; GFX9: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C8]](s16) - ; GFX9: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL6]] + ; GFX9: [[SHL8:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C8]](s16) + ; GFX9: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL8]] ; GFX9: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; GFX9: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C7]] ; GFX9: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD15]](s32) ; GFX9: [[AND15:%[0-9]+]]:_(s16) = G_AND [[TRUNC15]], [[C7]] - ; GFX9: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C8]](s16) - ; GFX9: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL7]] - ; GFX9: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16), [[OR6]](s16), [[OR7]](s16) - ; GFX9: [[C10:%[0-9]+]]:_(s64) = G_CONSTANT i64 16 - ; GFX9: [[GEP15:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C10]](s64) + ; 
GFX9: [[SHL9:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C8]](s16) + ; GFX9: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL9]] + ; GFX9: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; GFX9: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; GFX9: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C9]](s32) + ; GFX9: [[OR10:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL10]] + ; GFX9: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR8]](s16) + ; GFX9: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; GFX9: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C9]](s32) + ; GFX9: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; GFX9: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR10]](s32), [[OR11]](s32) + ; GFX9: [[C11:%[0-9]+]]:_(s64) = G_CONSTANT i64 16 + ; GFX9: [[GEP15:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C11]](s64) ; GFX9: [[LOAD16:%[0-9]+]]:_(s32) = G_LOAD [[GEP15]](p4) :: (load 1, addrspace 4) ; GFX9: [[GEP16:%[0-9]+]]:_(p4) = G_GEP [[GEP15]], [[C]](s64) ; GFX9: [[LOAD17:%[0-9]+]]:_(s32) = G_LOAD [[GEP16]](p4) :: (load 1, addrspace 4) @@ -7583,27 +8380,35 @@ body: | ; GFX9: [[AND16:%[0-9]+]]:_(s16) = G_AND [[TRUNC16]], [[C7]] ; GFX9: [[TRUNC17:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD17]](s32) ; GFX9: [[AND17:%[0-9]+]]:_(s16) = G_AND [[TRUNC17]], [[C7]] - ; GFX9: [[SHL8:%[0-9]+]]:_(s16) = G_SHL [[AND17]], [[C8]](s16) - ; GFX9: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[SHL8]] + ; GFX9: [[SHL12:%[0-9]+]]:_(s16) = G_SHL [[AND17]], [[C8]](s16) + ; GFX9: [[OR12:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[SHL12]] ; GFX9: [[TRUNC18:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD18]](s32) ; GFX9: [[AND18:%[0-9]+]]:_(s16) = G_AND [[TRUNC18]], [[C7]] ; GFX9: [[TRUNC19:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD19]](s32) ; GFX9: [[AND19:%[0-9]+]]:_(s16) = G_AND [[TRUNC19]], [[C7]] - ; GFX9: [[SHL9:%[0-9]+]]:_(s16) = G_SHL [[AND19]], [[C8]](s16) - ; GFX9: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[SHL9]] + ; GFX9: [[SHL13:%[0-9]+]]:_(s16) = G_SHL [[AND19]], [[C8]](s16) + ; GFX9: [[OR13:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[SHL13]] ; 
GFX9: [[TRUNC20:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD20]](s32) ; GFX9: [[AND20:%[0-9]+]]:_(s16) = G_AND [[TRUNC20]], [[C7]] ; GFX9: [[TRUNC21:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD21]](s32) ; GFX9: [[AND21:%[0-9]+]]:_(s16) = G_AND [[TRUNC21]], [[C7]] - ; GFX9: [[SHL10:%[0-9]+]]:_(s16) = G_SHL [[AND21]], [[C8]](s16) - ; GFX9: [[OR10:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[SHL10]] + ; GFX9: [[SHL14:%[0-9]+]]:_(s16) = G_SHL [[AND21]], [[C8]](s16) + ; GFX9: [[OR14:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[SHL14]] ; GFX9: [[TRUNC22:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD22]](s32) ; GFX9: [[AND22:%[0-9]+]]:_(s16) = G_AND [[TRUNC22]], [[C7]] ; GFX9: [[TRUNC23:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD23]](s32) ; GFX9: [[AND23:%[0-9]+]]:_(s16) = G_AND [[TRUNC23]], [[C7]] - ; GFX9: [[SHL11:%[0-9]+]]:_(s16) = G_SHL [[AND23]], [[C8]](s16) - ; GFX9: [[OR11:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[SHL11]] - ; GFX9: [[MV2:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR8]](s16), [[OR9]](s16), [[OR10]](s16), [[OR11]](s16) + ; GFX9: [[SHL15:%[0-9]+]]:_(s16) = G_SHL [[AND23]], [[C8]](s16) + ; GFX9: [[OR15:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[SHL15]] + ; GFX9: [[ZEXT8:%[0-9]+]]:_(s32) = G_ZEXT [[OR12]](s16) + ; GFX9: [[ZEXT9:%[0-9]+]]:_(s32) = G_ZEXT [[OR13]](s16) + ; GFX9: [[SHL16:%[0-9]+]]:_(s32) = G_SHL [[ZEXT9]], [[C9]](s32) + ; GFX9: [[OR16:%[0-9]+]]:_(s32) = G_OR [[ZEXT8]], [[SHL16]] + ; GFX9: [[ZEXT10:%[0-9]+]]:_(s32) = G_ZEXT [[OR14]](s16) + ; GFX9: [[ZEXT11:%[0-9]+]]:_(s32) = G_ZEXT [[OR15]](s16) + ; GFX9: [[SHL17:%[0-9]+]]:_(s32) = G_SHL [[ZEXT11]], [[C9]](s32) + ; GFX9: [[OR17:%[0-9]+]]:_(s32) = G_OR [[ZEXT10]], [[SHL17]] + ; GFX9: [[MV2:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR16]](s32), [[OR17]](s32) ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s64>) = G_BUILD_VECTOR [[MV]](s64), [[MV1]](s64), [[MV2]](s64) ; GFX9: [[DEF:%[0-9]+]]:_(<4 x s64>) = G_IMPLICIT_DEF ; GFX9: [[INSERT:%[0-9]+]]:_(<4 x s64>) = G_INSERT [[DEF]], [[BUILD_VECTOR]](<3 x s64>), 0 @@ -7666,9 +8471,18 @@ body: | ; CI-MESA: [[SHL3:%[0-9]+]]:_(s32) = G_SHL 
[[AND7]], [[C8]](s32) ; CI-MESA: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) ; CI-MESA: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; CI-MESA: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) - ; CI-MESA: [[C10:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 - ; CI-MESA: [[GEP7:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C10]](s64) + ; CI-MESA: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI-MESA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI-MESA: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-MESA: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C10]](s32) + ; CI-MESA: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; CI-MESA: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; CI-MESA: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI-MESA: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C10]](s32) + ; CI-MESA: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; CI-MESA: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) + ; CI-MESA: [[C11:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 + ; CI-MESA: [[GEP7:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C11]](s64) ; CI-MESA: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p4) :: (load 1, addrspace 4) ; CI-MESA: [[GEP8:%[0-9]+]]:_(p4) = G_GEP [[GEP7]], [[C]](s64) ; CI-MESA: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p4) :: (load 1, addrspace 4) @@ -7689,36 +8503,44 @@ body: | ; CI-MESA: [[COPY8:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-MESA: [[COPY9:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32) ; CI-MESA: [[AND9:%[0-9]+]]:_(s32) = G_AND [[COPY9]], [[C9]] - ; CI-MESA: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY8]](s32) - ; CI-MESA: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) - ; CI-MESA: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] + ; CI-MESA: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY8]](s32) + ; CI-MESA: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) + ; CI-MESA: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] ; CI-MESA: [[TRUNC10:%[0-9]+]]:_(s16) = 
G_TRUNC [[LOAD10]](s32) ; CI-MESA: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C7]] ; CI-MESA: [[COPY10:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-MESA: [[COPY11:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32) ; CI-MESA: [[AND11:%[0-9]+]]:_(s32) = G_AND [[COPY11]], [[C9]] - ; CI-MESA: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY10]](s32) - ; CI-MESA: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL5]](s32) - ; CI-MESA: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] + ; CI-MESA: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY10]](s32) + ; CI-MESA: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) + ; CI-MESA: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] ; CI-MESA: [[TRUNC12:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD12]](s32) ; CI-MESA: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C7]] ; CI-MESA: [[COPY12:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-MESA: [[COPY13:%[0-9]+]]:_(s32) = COPY [[LOAD13]](s32) ; CI-MESA: [[AND13:%[0-9]+]]:_(s32) = G_AND [[COPY13]], [[C9]] - ; CI-MESA: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY12]](s32) - ; CI-MESA: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) - ; CI-MESA: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] + ; CI-MESA: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY12]](s32) + ; CI-MESA: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL8]](s32) + ; CI-MESA: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] ; CI-MESA: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; CI-MESA: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C7]] ; CI-MESA: [[COPY14:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-MESA: [[COPY15:%[0-9]+]]:_(s32) = COPY [[LOAD15]](s32) ; CI-MESA: [[AND15:%[0-9]+]]:_(s32) = G_AND [[COPY15]], [[C9]] - ; CI-MESA: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY14]](s32) - ; CI-MESA: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) - ; CI-MESA: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] - ; CI-MESA: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16), [[OR6]](s16), [[OR7]](s16) 
- ; CI-MESA: [[C11:%[0-9]+]]:_(s64) = G_CONSTANT i64 16 - ; CI-MESA: [[GEP15:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C11]](s64) + ; CI-MESA: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY14]](s32) + ; CI-MESA: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL9]](s32) + ; CI-MESA: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] + ; CI-MESA: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; CI-MESA: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; CI-MESA: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C10]](s32) + ; CI-MESA: [[OR10:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL10]] + ; CI-MESA: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR8]](s16) + ; CI-MESA: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; CI-MESA: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C10]](s32) + ; CI-MESA: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; CI-MESA: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR10]](s32), [[OR11]](s32) + ; CI-MESA: [[C12:%[0-9]+]]:_(s64) = G_CONSTANT i64 16 + ; CI-MESA: [[GEP15:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C12]](s64) ; CI-MESA: [[LOAD16:%[0-9]+]]:_(s32) = G_LOAD [[GEP15]](p4) :: (load 1, addrspace 4) ; CI-MESA: [[GEP16:%[0-9]+]]:_(p4) = G_GEP [[GEP15]], [[C]](s64) ; CI-MESA: [[LOAD17:%[0-9]+]]:_(s32) = G_LOAD [[GEP16]](p4) :: (load 1, addrspace 4) @@ -7739,34 +8561,42 @@ body: | ; CI-MESA: [[COPY16:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-MESA: [[COPY17:%[0-9]+]]:_(s32) = COPY [[LOAD17]](s32) ; CI-MESA: [[AND17:%[0-9]+]]:_(s32) = G_AND [[COPY17]], [[C9]] - ; CI-MESA: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[AND17]], [[COPY16]](s32) - ; CI-MESA: [[TRUNC17:%[0-9]+]]:_(s16) = G_TRUNC [[SHL8]](s32) - ; CI-MESA: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[TRUNC17]] + ; CI-MESA: [[SHL12:%[0-9]+]]:_(s32) = G_SHL [[AND17]], [[COPY16]](s32) + ; CI-MESA: [[TRUNC17:%[0-9]+]]:_(s16) = G_TRUNC [[SHL12]](s32) + ; CI-MESA: [[OR12:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[TRUNC17]] ; CI-MESA: [[TRUNC18:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD18]](s32) ; CI-MESA: [[AND18:%[0-9]+]]:_(s16) = 
G_AND [[TRUNC18]], [[C7]] ; CI-MESA: [[COPY18:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-MESA: [[COPY19:%[0-9]+]]:_(s32) = COPY [[LOAD19]](s32) ; CI-MESA: [[AND19:%[0-9]+]]:_(s32) = G_AND [[COPY19]], [[C9]] - ; CI-MESA: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[AND19]], [[COPY18]](s32) - ; CI-MESA: [[TRUNC19:%[0-9]+]]:_(s16) = G_TRUNC [[SHL9]](s32) - ; CI-MESA: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[TRUNC19]] + ; CI-MESA: [[SHL13:%[0-9]+]]:_(s32) = G_SHL [[AND19]], [[COPY18]](s32) + ; CI-MESA: [[TRUNC19:%[0-9]+]]:_(s16) = G_TRUNC [[SHL13]](s32) + ; CI-MESA: [[OR13:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[TRUNC19]] ; CI-MESA: [[TRUNC20:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD20]](s32) ; CI-MESA: [[AND20:%[0-9]+]]:_(s16) = G_AND [[TRUNC20]], [[C7]] ; CI-MESA: [[COPY20:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-MESA: [[COPY21:%[0-9]+]]:_(s32) = COPY [[LOAD21]](s32) ; CI-MESA: [[AND21:%[0-9]+]]:_(s32) = G_AND [[COPY21]], [[C9]] - ; CI-MESA: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[AND21]], [[COPY20]](s32) - ; CI-MESA: [[TRUNC21:%[0-9]+]]:_(s16) = G_TRUNC [[SHL10]](s32) - ; CI-MESA: [[OR10:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[TRUNC21]] + ; CI-MESA: [[SHL14:%[0-9]+]]:_(s32) = G_SHL [[AND21]], [[COPY20]](s32) + ; CI-MESA: [[TRUNC21:%[0-9]+]]:_(s16) = G_TRUNC [[SHL14]](s32) + ; CI-MESA: [[OR14:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[TRUNC21]] ; CI-MESA: [[TRUNC22:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD22]](s32) ; CI-MESA: [[AND22:%[0-9]+]]:_(s16) = G_AND [[TRUNC22]], [[C7]] ; CI-MESA: [[COPY22:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-MESA: [[COPY23:%[0-9]+]]:_(s32) = COPY [[LOAD23]](s32) ; CI-MESA: [[AND23:%[0-9]+]]:_(s32) = G_AND [[COPY23]], [[C9]] - ; CI-MESA: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[AND23]], [[COPY22]](s32) - ; CI-MESA: [[TRUNC23:%[0-9]+]]:_(s16) = G_TRUNC [[SHL11]](s32) - ; CI-MESA: [[OR11:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[TRUNC23]] - ; CI-MESA: [[MV2:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR8]](s16), [[OR9]](s16), [[OR10]](s16), [[OR11]](s16) + ; CI-MESA: [[SHL15:%[0-9]+]]:_(s32) = G_SHL 
[[AND23]], [[COPY22]](s32) + ; CI-MESA: [[TRUNC23:%[0-9]+]]:_(s16) = G_TRUNC [[SHL15]](s32) + ; CI-MESA: [[OR15:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[TRUNC23]] + ; CI-MESA: [[ZEXT8:%[0-9]+]]:_(s32) = G_ZEXT [[OR12]](s16) + ; CI-MESA: [[ZEXT9:%[0-9]+]]:_(s32) = G_ZEXT [[OR13]](s16) + ; CI-MESA: [[SHL16:%[0-9]+]]:_(s32) = G_SHL [[ZEXT9]], [[C10]](s32) + ; CI-MESA: [[OR16:%[0-9]+]]:_(s32) = G_OR [[ZEXT8]], [[SHL16]] + ; CI-MESA: [[ZEXT10:%[0-9]+]]:_(s32) = G_ZEXT [[OR14]](s16) + ; CI-MESA: [[ZEXT11:%[0-9]+]]:_(s32) = G_ZEXT [[OR15]](s16) + ; CI-MESA: [[SHL17:%[0-9]+]]:_(s32) = G_SHL [[ZEXT11]], [[C10]](s32) + ; CI-MESA: [[OR17:%[0-9]+]]:_(s32) = G_OR [[ZEXT10]], [[SHL17]] + ; CI-MESA: [[MV2:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR16]](s32), [[OR17]](s32) ; CI-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s64>) = G_BUILD_VECTOR [[MV]](s64), [[MV1]](s64), [[MV2]](s64) ; CI-MESA: [[DEF:%[0-9]+]]:_(<4 x s64>) = G_IMPLICIT_DEF ; CI-MESA: [[INSERT:%[0-9]+]]:_(<4 x s64>) = G_INSERT [[DEF]], [[BUILD_VECTOR]](<3 x s64>), 0 @@ -7821,9 +8651,18 @@ body: | ; GFX9-MESA: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C7]] ; GFX9-MESA: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C8]](s16) ; GFX9-MESA: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; GFX9-MESA: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) - ; GFX9-MESA: [[C9:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 - ; GFX9-MESA: [[GEP7:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C9]](s64) + ; GFX9-MESA: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9-MESA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9-MESA: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9-MESA: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C9]](s32) + ; GFX9-MESA: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; GFX9-MESA: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; GFX9-MESA: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; GFX9-MESA: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C9]](s32) + ; GFX9-MESA: 
[[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; GFX9-MESA: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) + ; GFX9-MESA: [[C10:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 + ; GFX9-MESA: [[GEP7:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C10]](s64) ; GFX9-MESA: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p4) :: (load 1, addrspace 4) ; GFX9-MESA: [[GEP8:%[0-9]+]]:_(p4) = G_GEP [[GEP7]], [[C]](s64) ; GFX9-MESA: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p4) :: (load 1, addrspace 4) @@ -7843,29 +8682,37 @@ body: | ; GFX9-MESA: [[AND8:%[0-9]+]]:_(s16) = G_AND [[TRUNC8]], [[C7]] ; GFX9-MESA: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD9]](s32) ; GFX9-MESA: [[AND9:%[0-9]+]]:_(s16) = G_AND [[TRUNC9]], [[C7]] - ; GFX9-MESA: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) - ; GFX9-MESA: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL4]] + ; GFX9-MESA: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) + ; GFX9-MESA: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL6]] ; GFX9-MESA: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; GFX9-MESA: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C7]] ; GFX9-MESA: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD11]](s32) ; GFX9-MESA: [[AND11:%[0-9]+]]:_(s16) = G_AND [[TRUNC11]], [[C7]] - ; GFX9-MESA: [[SHL5:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) - ; GFX9-MESA: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL5]] + ; GFX9-MESA: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) + ; GFX9-MESA: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL7]] ; GFX9-MESA: [[TRUNC12:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD12]](s32) ; GFX9-MESA: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C7]] ; GFX9-MESA: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD13]](s32) ; GFX9-MESA: [[AND13:%[0-9]+]]:_(s16) = G_AND [[TRUNC13]], [[C7]] - ; GFX9-MESA: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C8]](s16) - ; GFX9-MESA: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL6]] + ; GFX9-MESA: [[SHL8:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C8]](s16) + ; GFX9-MESA: 
[[OR8:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL8]] ; GFX9-MESA: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; GFX9-MESA: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C7]] ; GFX9-MESA: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD15]](s32) ; GFX9-MESA: [[AND15:%[0-9]+]]:_(s16) = G_AND [[TRUNC15]], [[C7]] - ; GFX9-MESA: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C8]](s16) - ; GFX9-MESA: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL7]] - ; GFX9-MESA: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16), [[OR6]](s16), [[OR7]](s16) - ; GFX9-MESA: [[C10:%[0-9]+]]:_(s64) = G_CONSTANT i64 16 - ; GFX9-MESA: [[GEP15:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C10]](s64) + ; GFX9-MESA: [[SHL9:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C8]](s16) + ; GFX9-MESA: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL9]] + ; GFX9-MESA: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; GFX9-MESA: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; GFX9-MESA: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C9]](s32) + ; GFX9-MESA: [[OR10:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL10]] + ; GFX9-MESA: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR8]](s16) + ; GFX9-MESA: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; GFX9-MESA: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C9]](s32) + ; GFX9-MESA: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; GFX9-MESA: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR10]](s32), [[OR11]](s32) + ; GFX9-MESA: [[C11:%[0-9]+]]:_(s64) = G_CONSTANT i64 16 + ; GFX9-MESA: [[GEP15:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C11]](s64) ; GFX9-MESA: [[LOAD16:%[0-9]+]]:_(s32) = G_LOAD [[GEP15]](p4) :: (load 1, addrspace 4) ; GFX9-MESA: [[GEP16:%[0-9]+]]:_(p4) = G_GEP [[GEP15]], [[C]](s64) ; GFX9-MESA: [[LOAD17:%[0-9]+]]:_(s32) = G_LOAD [[GEP16]](p4) :: (load 1, addrspace 4) @@ -7885,27 +8732,35 @@ body: | ; GFX9-MESA: [[AND16:%[0-9]+]]:_(s16) = G_AND [[TRUNC16]], [[C7]] ; GFX9-MESA: [[TRUNC17:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD17]](s32) ; GFX9-MESA: 
[[AND17:%[0-9]+]]:_(s16) = G_AND [[TRUNC17]], [[C7]] - ; GFX9-MESA: [[SHL8:%[0-9]+]]:_(s16) = G_SHL [[AND17]], [[C8]](s16) - ; GFX9-MESA: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[SHL8]] + ; GFX9-MESA: [[SHL12:%[0-9]+]]:_(s16) = G_SHL [[AND17]], [[C8]](s16) + ; GFX9-MESA: [[OR12:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[SHL12]] ; GFX9-MESA: [[TRUNC18:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD18]](s32) ; GFX9-MESA: [[AND18:%[0-9]+]]:_(s16) = G_AND [[TRUNC18]], [[C7]] ; GFX9-MESA: [[TRUNC19:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD19]](s32) ; GFX9-MESA: [[AND19:%[0-9]+]]:_(s16) = G_AND [[TRUNC19]], [[C7]] - ; GFX9-MESA: [[SHL9:%[0-9]+]]:_(s16) = G_SHL [[AND19]], [[C8]](s16) - ; GFX9-MESA: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[SHL9]] + ; GFX9-MESA: [[SHL13:%[0-9]+]]:_(s16) = G_SHL [[AND19]], [[C8]](s16) + ; GFX9-MESA: [[OR13:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[SHL13]] ; GFX9-MESA: [[TRUNC20:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD20]](s32) ; GFX9-MESA: [[AND20:%[0-9]+]]:_(s16) = G_AND [[TRUNC20]], [[C7]] ; GFX9-MESA: [[TRUNC21:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD21]](s32) ; GFX9-MESA: [[AND21:%[0-9]+]]:_(s16) = G_AND [[TRUNC21]], [[C7]] - ; GFX9-MESA: [[SHL10:%[0-9]+]]:_(s16) = G_SHL [[AND21]], [[C8]](s16) - ; GFX9-MESA: [[OR10:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[SHL10]] + ; GFX9-MESA: [[SHL14:%[0-9]+]]:_(s16) = G_SHL [[AND21]], [[C8]](s16) + ; GFX9-MESA: [[OR14:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[SHL14]] ; GFX9-MESA: [[TRUNC22:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD22]](s32) ; GFX9-MESA: [[AND22:%[0-9]+]]:_(s16) = G_AND [[TRUNC22]], [[C7]] ; GFX9-MESA: [[TRUNC23:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD23]](s32) ; GFX9-MESA: [[AND23:%[0-9]+]]:_(s16) = G_AND [[TRUNC23]], [[C7]] - ; GFX9-MESA: [[SHL11:%[0-9]+]]:_(s16) = G_SHL [[AND23]], [[C8]](s16) - ; GFX9-MESA: [[OR11:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[SHL11]] - ; GFX9-MESA: [[MV2:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR8]](s16), [[OR9]](s16), [[OR10]](s16), [[OR11]](s16) + ; GFX9-MESA: [[SHL15:%[0-9]+]]:_(s16) = G_SHL [[AND23]], [[C8]](s16) + ; 
GFX9-MESA: [[OR15:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[SHL15]] + ; GFX9-MESA: [[ZEXT8:%[0-9]+]]:_(s32) = G_ZEXT [[OR12]](s16) + ; GFX9-MESA: [[ZEXT9:%[0-9]+]]:_(s32) = G_ZEXT [[OR13]](s16) + ; GFX9-MESA: [[SHL16:%[0-9]+]]:_(s32) = G_SHL [[ZEXT9]], [[C9]](s32) + ; GFX9-MESA: [[OR16:%[0-9]+]]:_(s32) = G_OR [[ZEXT8]], [[SHL16]] + ; GFX9-MESA: [[ZEXT10:%[0-9]+]]:_(s32) = G_ZEXT [[OR14]](s16) + ; GFX9-MESA: [[ZEXT11:%[0-9]+]]:_(s32) = G_ZEXT [[OR15]](s16) + ; GFX9-MESA: [[SHL17:%[0-9]+]]:_(s32) = G_SHL [[ZEXT11]], [[C9]](s32) + ; GFX9-MESA: [[OR17:%[0-9]+]]:_(s32) = G_OR [[ZEXT10]], [[SHL17]] + ; GFX9-MESA: [[MV2:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR16]](s32), [[OR17]](s32) ; GFX9-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s64>) = G_BUILD_VECTOR [[MV]](s64), [[MV1]](s64), [[MV2]](s64) ; GFX9-MESA: [[DEF:%[0-9]+]]:_(<4 x s64>) = G_IMPLICIT_DEF ; GFX9-MESA: [[INSERT:%[0-9]+]]:_(<4 x s64>) = G_INSERT [[DEF]], [[BUILD_VECTOR]](<3 x s64>), 0 @@ -8043,9 +8898,18 @@ body: | ; CI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C8]](s32) ; CI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) ; CI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; CI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) - ; CI: [[C10:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 - ; CI: [[GEP7:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C10]](s64) + ; CI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C10]](s32) + ; CI: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; CI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; CI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C10]](s32) + ; CI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; CI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) + ; CI: [[C11:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 + ; 
CI: [[GEP7:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C11]](s64) ; CI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p4) :: (load 1, addrspace 4) ; CI: [[GEP8:%[0-9]+]]:_(p4) = G_GEP [[GEP7]], [[C]](s64) ; CI: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p4) :: (load 1, addrspace 4) @@ -8066,36 +8930,44 @@ body: | ; CI: [[COPY8:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI: [[COPY9:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32) ; CI: [[AND9:%[0-9]+]]:_(s32) = G_AND [[COPY9]], [[C9]] - ; CI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY8]](s32) - ; CI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) - ; CI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] + ; CI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY8]](s32) + ; CI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) + ; CI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] ; CI: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; CI: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C7]] ; CI: [[COPY10:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI: [[COPY11:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32) ; CI: [[AND11:%[0-9]+]]:_(s32) = G_AND [[COPY11]], [[C9]] - ; CI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY10]](s32) - ; CI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL5]](s32) - ; CI: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] + ; CI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY10]](s32) + ; CI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) + ; CI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] ; CI: [[TRUNC12:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD12]](s32) ; CI: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C7]] ; CI: [[COPY12:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI: [[COPY13:%[0-9]+]]:_(s32) = COPY [[LOAD13]](s32) ; CI: [[AND13:%[0-9]+]]:_(s32) = G_AND [[COPY13]], [[C9]] - ; CI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY12]](s32) - ; CI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) - ; CI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] + ; CI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL 
[[AND13]], [[COPY12]](s32) + ; CI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL8]](s32) + ; CI: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] ; CI: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; CI: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C7]] ; CI: [[COPY14:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI: [[COPY15:%[0-9]+]]:_(s32) = COPY [[LOAD15]](s32) ; CI: [[AND15:%[0-9]+]]:_(s32) = G_AND [[COPY15]], [[C9]] - ; CI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY14]](s32) - ; CI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) - ; CI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] - ; CI: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16), [[OR6]](s16), [[OR7]](s16) - ; CI: [[C11:%[0-9]+]]:_(s64) = G_CONSTANT i64 16 - ; CI: [[GEP15:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C11]](s64) + ; CI: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY14]](s32) + ; CI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL9]](s32) + ; CI: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] + ; CI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; CI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; CI: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C10]](s32) + ; CI: [[OR10:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL10]] + ; CI: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR8]](s16) + ; CI: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; CI: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C10]](s32) + ; CI: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; CI: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR10]](s32), [[OR11]](s32) + ; CI: [[C12:%[0-9]+]]:_(s64) = G_CONSTANT i64 16 + ; CI: [[GEP15:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C12]](s64) ; CI: [[LOAD16:%[0-9]+]]:_(s32) = G_LOAD [[GEP15]](p4) :: (load 1, addrspace 4) ; CI: [[GEP16:%[0-9]+]]:_(p4) = G_GEP [[GEP15]], [[C]](s64) ; CI: [[LOAD17:%[0-9]+]]:_(s32) = G_LOAD [[GEP16]](p4) :: (load 1, addrspace 4) @@ -8116,36 +8988,44 @@ body: | ; CI: [[COPY16:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI: 
[[COPY17:%[0-9]+]]:_(s32) = COPY [[LOAD17]](s32) ; CI: [[AND17:%[0-9]+]]:_(s32) = G_AND [[COPY17]], [[C9]] - ; CI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[AND17]], [[COPY16]](s32) - ; CI: [[TRUNC17:%[0-9]+]]:_(s16) = G_TRUNC [[SHL8]](s32) - ; CI: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[TRUNC17]] + ; CI: [[SHL12:%[0-9]+]]:_(s32) = G_SHL [[AND17]], [[COPY16]](s32) + ; CI: [[TRUNC17:%[0-9]+]]:_(s16) = G_TRUNC [[SHL12]](s32) + ; CI: [[OR12:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[TRUNC17]] ; CI: [[TRUNC18:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD18]](s32) ; CI: [[AND18:%[0-9]+]]:_(s16) = G_AND [[TRUNC18]], [[C7]] ; CI: [[COPY18:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI: [[COPY19:%[0-9]+]]:_(s32) = COPY [[LOAD19]](s32) ; CI: [[AND19:%[0-9]+]]:_(s32) = G_AND [[COPY19]], [[C9]] - ; CI: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[AND19]], [[COPY18]](s32) - ; CI: [[TRUNC19:%[0-9]+]]:_(s16) = G_TRUNC [[SHL9]](s32) - ; CI: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[TRUNC19]] + ; CI: [[SHL13:%[0-9]+]]:_(s32) = G_SHL [[AND19]], [[COPY18]](s32) + ; CI: [[TRUNC19:%[0-9]+]]:_(s16) = G_TRUNC [[SHL13]](s32) + ; CI: [[OR13:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[TRUNC19]] ; CI: [[TRUNC20:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD20]](s32) ; CI: [[AND20:%[0-9]+]]:_(s16) = G_AND [[TRUNC20]], [[C7]] ; CI: [[COPY20:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI: [[COPY21:%[0-9]+]]:_(s32) = COPY [[LOAD21]](s32) ; CI: [[AND21:%[0-9]+]]:_(s32) = G_AND [[COPY21]], [[C9]] - ; CI: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[AND21]], [[COPY20]](s32) - ; CI: [[TRUNC21:%[0-9]+]]:_(s16) = G_TRUNC [[SHL10]](s32) - ; CI: [[OR10:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[TRUNC21]] + ; CI: [[SHL14:%[0-9]+]]:_(s32) = G_SHL [[AND21]], [[COPY20]](s32) + ; CI: [[TRUNC21:%[0-9]+]]:_(s16) = G_TRUNC [[SHL14]](s32) + ; CI: [[OR14:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[TRUNC21]] ; CI: [[TRUNC22:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD22]](s32) ; CI: [[AND22:%[0-9]+]]:_(s16) = G_AND [[TRUNC22]], [[C7]] ; CI: [[COPY22:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI: 
[[COPY23:%[0-9]+]]:_(s32) = COPY [[LOAD23]](s32) ; CI: [[AND23:%[0-9]+]]:_(s32) = G_AND [[COPY23]], [[C9]] - ; CI: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[AND23]], [[COPY22]](s32) - ; CI: [[TRUNC23:%[0-9]+]]:_(s16) = G_TRUNC [[SHL11]](s32) - ; CI: [[OR11:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[TRUNC23]] - ; CI: [[MV2:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR8]](s16), [[OR9]](s16), [[OR10]](s16), [[OR11]](s16) - ; CI: [[C12:%[0-9]+]]:_(s64) = G_CONSTANT i64 24 - ; CI: [[GEP23:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C12]](s64) + ; CI: [[SHL15:%[0-9]+]]:_(s32) = G_SHL [[AND23]], [[COPY22]](s32) + ; CI: [[TRUNC23:%[0-9]+]]:_(s16) = G_TRUNC [[SHL15]](s32) + ; CI: [[OR15:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[TRUNC23]] + ; CI: [[ZEXT8:%[0-9]+]]:_(s32) = G_ZEXT [[OR12]](s16) + ; CI: [[ZEXT9:%[0-9]+]]:_(s32) = G_ZEXT [[OR13]](s16) + ; CI: [[SHL16:%[0-9]+]]:_(s32) = G_SHL [[ZEXT9]], [[C10]](s32) + ; CI: [[OR16:%[0-9]+]]:_(s32) = G_OR [[ZEXT8]], [[SHL16]] + ; CI: [[ZEXT10:%[0-9]+]]:_(s32) = G_ZEXT [[OR14]](s16) + ; CI: [[ZEXT11:%[0-9]+]]:_(s32) = G_ZEXT [[OR15]](s16) + ; CI: [[SHL17:%[0-9]+]]:_(s32) = G_SHL [[ZEXT11]], [[C10]](s32) + ; CI: [[OR17:%[0-9]+]]:_(s32) = G_OR [[ZEXT10]], [[SHL17]] + ; CI: [[MV2:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR16]](s32), [[OR17]](s32) + ; CI: [[C13:%[0-9]+]]:_(s64) = G_CONSTANT i64 24 + ; CI: [[GEP23:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C13]](s64) ; CI: [[LOAD24:%[0-9]+]]:_(s32) = G_LOAD [[GEP23]](p4) :: (load 1, addrspace 4) ; CI: [[GEP24:%[0-9]+]]:_(p4) = G_GEP [[GEP23]], [[C]](s64) ; CI: [[LOAD25:%[0-9]+]]:_(s32) = G_LOAD [[GEP24]](p4) :: (load 1, addrspace 4) @@ -8166,34 +9046,42 @@ body: | ; CI: [[COPY24:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI: [[COPY25:%[0-9]+]]:_(s32) = COPY [[LOAD25]](s32) ; CI: [[AND25:%[0-9]+]]:_(s32) = G_AND [[COPY25]], [[C9]] - ; CI: [[SHL12:%[0-9]+]]:_(s32) = G_SHL [[AND25]], [[COPY24]](s32) - ; CI: [[TRUNC25:%[0-9]+]]:_(s16) = G_TRUNC [[SHL12]](s32) - ; CI: [[OR12:%[0-9]+]]:_(s16) = G_OR [[AND24]], [[TRUNC25]] + ; CI: 
[[SHL18:%[0-9]+]]:_(s32) = G_SHL [[AND25]], [[COPY24]](s32) + ; CI: [[TRUNC25:%[0-9]+]]:_(s16) = G_TRUNC [[SHL18]](s32) + ; CI: [[OR18:%[0-9]+]]:_(s16) = G_OR [[AND24]], [[TRUNC25]] ; CI: [[TRUNC26:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD26]](s32) ; CI: [[AND26:%[0-9]+]]:_(s16) = G_AND [[TRUNC26]], [[C7]] ; CI: [[COPY26:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI: [[COPY27:%[0-9]+]]:_(s32) = COPY [[LOAD27]](s32) ; CI: [[AND27:%[0-9]+]]:_(s32) = G_AND [[COPY27]], [[C9]] - ; CI: [[SHL13:%[0-9]+]]:_(s32) = G_SHL [[AND27]], [[COPY26]](s32) - ; CI: [[TRUNC27:%[0-9]+]]:_(s16) = G_TRUNC [[SHL13]](s32) - ; CI: [[OR13:%[0-9]+]]:_(s16) = G_OR [[AND26]], [[TRUNC27]] + ; CI: [[SHL19:%[0-9]+]]:_(s32) = G_SHL [[AND27]], [[COPY26]](s32) + ; CI: [[TRUNC27:%[0-9]+]]:_(s16) = G_TRUNC [[SHL19]](s32) + ; CI: [[OR19:%[0-9]+]]:_(s16) = G_OR [[AND26]], [[TRUNC27]] ; CI: [[TRUNC28:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD28]](s32) ; CI: [[AND28:%[0-9]+]]:_(s16) = G_AND [[TRUNC28]], [[C7]] ; CI: [[COPY28:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI: [[COPY29:%[0-9]+]]:_(s32) = COPY [[LOAD29]](s32) ; CI: [[AND29:%[0-9]+]]:_(s32) = G_AND [[COPY29]], [[C9]] - ; CI: [[SHL14:%[0-9]+]]:_(s32) = G_SHL [[AND29]], [[COPY28]](s32) - ; CI: [[TRUNC29:%[0-9]+]]:_(s16) = G_TRUNC [[SHL14]](s32) - ; CI: [[OR14:%[0-9]+]]:_(s16) = G_OR [[AND28]], [[TRUNC29]] + ; CI: [[SHL20:%[0-9]+]]:_(s32) = G_SHL [[AND29]], [[COPY28]](s32) + ; CI: [[TRUNC29:%[0-9]+]]:_(s16) = G_TRUNC [[SHL20]](s32) + ; CI: [[OR20:%[0-9]+]]:_(s16) = G_OR [[AND28]], [[TRUNC29]] ; CI: [[TRUNC30:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD30]](s32) ; CI: [[AND30:%[0-9]+]]:_(s16) = G_AND [[TRUNC30]], [[C7]] ; CI: [[COPY30:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI: [[COPY31:%[0-9]+]]:_(s32) = COPY [[LOAD31]](s32) ; CI: [[AND31:%[0-9]+]]:_(s32) = G_AND [[COPY31]], [[C9]] - ; CI: [[SHL15:%[0-9]+]]:_(s32) = G_SHL [[AND31]], [[COPY30]](s32) - ; CI: [[TRUNC31:%[0-9]+]]:_(s16) = G_TRUNC [[SHL15]](s32) - ; CI: [[OR15:%[0-9]+]]:_(s16) = G_OR [[AND30]], [[TRUNC31]] - ; CI: 
[[MV3:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR12]](s16), [[OR13]](s16), [[OR14]](s16), [[OR15]](s16) + ; CI: [[SHL21:%[0-9]+]]:_(s32) = G_SHL [[AND31]], [[COPY30]](s32) + ; CI: [[TRUNC31:%[0-9]+]]:_(s16) = G_TRUNC [[SHL21]](s32) + ; CI: [[OR21:%[0-9]+]]:_(s16) = G_OR [[AND30]], [[TRUNC31]] + ; CI: [[ZEXT12:%[0-9]+]]:_(s32) = G_ZEXT [[OR18]](s16) + ; CI: [[ZEXT13:%[0-9]+]]:_(s32) = G_ZEXT [[OR19]](s16) + ; CI: [[SHL22:%[0-9]+]]:_(s32) = G_SHL [[ZEXT13]], [[C10]](s32) + ; CI: [[OR22:%[0-9]+]]:_(s32) = G_OR [[ZEXT12]], [[SHL22]] + ; CI: [[ZEXT14:%[0-9]+]]:_(s32) = G_ZEXT [[OR20]](s16) + ; CI: [[ZEXT15:%[0-9]+]]:_(s32) = G_ZEXT [[OR21]](s16) + ; CI: [[SHL23:%[0-9]+]]:_(s32) = G_SHL [[ZEXT15]], [[C10]](s32) + ; CI: [[OR23:%[0-9]+]]:_(s32) = G_OR [[ZEXT14]], [[SHL23]] + ; CI: [[MV3:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR22]](s32), [[OR23]](s32) ; CI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s64>) = G_BUILD_VECTOR [[MV]](s64), [[MV1]](s64), [[MV2]](s64), [[MV3]](s64) ; CI: $vgpr0_vgpr1_vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7 = COPY [[BUILD_VECTOR]](<4 x s64>) ; VI-LABEL: name: test_load_constant_v4s64_align1 @@ -8246,9 +9134,18 @@ body: | ; VI: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C7]] ; VI: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C8]](s16) ; VI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; VI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) - ; VI: [[C9:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 - ; VI: [[GEP7:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C9]](s64) + ; VI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; VI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; VI: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C9]](s32) + ; VI: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; VI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; VI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; VI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C9]](s32) + ; VI: [[OR5:%[0-9]+]]:_(s32) 
= G_OR [[ZEXT2]], [[SHL5]] + ; VI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) + ; VI: [[C10:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 + ; VI: [[GEP7:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C10]](s64) ; VI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p4) :: (load 1, addrspace 4) ; VI: [[GEP8:%[0-9]+]]:_(p4) = G_GEP [[GEP7]], [[C]](s64) ; VI: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p4) :: (load 1, addrspace 4) @@ -8268,29 +9165,37 @@ body: | ; VI: [[AND8:%[0-9]+]]:_(s16) = G_AND [[TRUNC8]], [[C7]] ; VI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD9]](s32) ; VI: [[AND9:%[0-9]+]]:_(s16) = G_AND [[TRUNC9]], [[C7]] - ; VI: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) - ; VI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL4]] + ; VI: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) + ; VI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL6]] ; VI: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; VI: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C7]] ; VI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD11]](s32) ; VI: [[AND11:%[0-9]+]]:_(s16) = G_AND [[TRUNC11]], [[C7]] - ; VI: [[SHL5:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) - ; VI: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL5]] + ; VI: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) + ; VI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL7]] ; VI: [[TRUNC12:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD12]](s32) ; VI: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C7]] ; VI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD13]](s32) ; VI: [[AND13:%[0-9]+]]:_(s16) = G_AND [[TRUNC13]], [[C7]] - ; VI: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C8]](s16) - ; VI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL6]] + ; VI: [[SHL8:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C8]](s16) + ; VI: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL8]] ; VI: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; VI: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C7]] ; VI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC 
[[LOAD15]](s32) ; VI: [[AND15:%[0-9]+]]:_(s16) = G_AND [[TRUNC15]], [[C7]] - ; VI: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C8]](s16) - ; VI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL7]] - ; VI: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16), [[OR6]](s16), [[OR7]](s16) - ; VI: [[C10:%[0-9]+]]:_(s64) = G_CONSTANT i64 16 - ; VI: [[GEP15:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C10]](s64) + ; VI: [[SHL9:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C8]](s16) + ; VI: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL9]] + ; VI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; VI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; VI: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C9]](s32) + ; VI: [[OR10:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL10]] + ; VI: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR8]](s16) + ; VI: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; VI: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C9]](s32) + ; VI: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; VI: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR10]](s32), [[OR11]](s32) + ; VI: [[C11:%[0-9]+]]:_(s64) = G_CONSTANT i64 16 + ; VI: [[GEP15:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C11]](s64) ; VI: [[LOAD16:%[0-9]+]]:_(s32) = G_LOAD [[GEP15]](p4) :: (load 1, addrspace 4) ; VI: [[GEP16:%[0-9]+]]:_(p4) = G_GEP [[GEP15]], [[C]](s64) ; VI: [[LOAD17:%[0-9]+]]:_(s32) = G_LOAD [[GEP16]](p4) :: (load 1, addrspace 4) @@ -8310,29 +9215,37 @@ body: | ; VI: [[AND16:%[0-9]+]]:_(s16) = G_AND [[TRUNC16]], [[C7]] ; VI: [[TRUNC17:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD17]](s32) ; VI: [[AND17:%[0-9]+]]:_(s16) = G_AND [[TRUNC17]], [[C7]] - ; VI: [[SHL8:%[0-9]+]]:_(s16) = G_SHL [[AND17]], [[C8]](s16) - ; VI: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[SHL8]] + ; VI: [[SHL12:%[0-9]+]]:_(s16) = G_SHL [[AND17]], [[C8]](s16) + ; VI: [[OR12:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[SHL12]] ; VI: [[TRUNC18:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD18]](s32) ; VI: [[AND18:%[0-9]+]]:_(s16) = G_AND [[TRUNC18]], [[C7]] ; VI: 
[[TRUNC19:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD19]](s32) ; VI: [[AND19:%[0-9]+]]:_(s16) = G_AND [[TRUNC19]], [[C7]] - ; VI: [[SHL9:%[0-9]+]]:_(s16) = G_SHL [[AND19]], [[C8]](s16) - ; VI: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[SHL9]] + ; VI: [[SHL13:%[0-9]+]]:_(s16) = G_SHL [[AND19]], [[C8]](s16) + ; VI: [[OR13:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[SHL13]] ; VI: [[TRUNC20:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD20]](s32) ; VI: [[AND20:%[0-9]+]]:_(s16) = G_AND [[TRUNC20]], [[C7]] ; VI: [[TRUNC21:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD21]](s32) ; VI: [[AND21:%[0-9]+]]:_(s16) = G_AND [[TRUNC21]], [[C7]] - ; VI: [[SHL10:%[0-9]+]]:_(s16) = G_SHL [[AND21]], [[C8]](s16) - ; VI: [[OR10:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[SHL10]] + ; VI: [[SHL14:%[0-9]+]]:_(s16) = G_SHL [[AND21]], [[C8]](s16) + ; VI: [[OR14:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[SHL14]] ; VI: [[TRUNC22:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD22]](s32) ; VI: [[AND22:%[0-9]+]]:_(s16) = G_AND [[TRUNC22]], [[C7]] ; VI: [[TRUNC23:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD23]](s32) ; VI: [[AND23:%[0-9]+]]:_(s16) = G_AND [[TRUNC23]], [[C7]] - ; VI: [[SHL11:%[0-9]+]]:_(s16) = G_SHL [[AND23]], [[C8]](s16) - ; VI: [[OR11:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[SHL11]] - ; VI: [[MV2:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR8]](s16), [[OR9]](s16), [[OR10]](s16), [[OR11]](s16) - ; VI: [[C11:%[0-9]+]]:_(s64) = G_CONSTANT i64 24 - ; VI: [[GEP23:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C11]](s64) + ; VI: [[SHL15:%[0-9]+]]:_(s16) = G_SHL [[AND23]], [[C8]](s16) + ; VI: [[OR15:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[SHL15]] + ; VI: [[ZEXT8:%[0-9]+]]:_(s32) = G_ZEXT [[OR12]](s16) + ; VI: [[ZEXT9:%[0-9]+]]:_(s32) = G_ZEXT [[OR13]](s16) + ; VI: [[SHL16:%[0-9]+]]:_(s32) = G_SHL [[ZEXT9]], [[C9]](s32) + ; VI: [[OR16:%[0-9]+]]:_(s32) = G_OR [[ZEXT8]], [[SHL16]] + ; VI: [[ZEXT10:%[0-9]+]]:_(s32) = G_ZEXT [[OR14]](s16) + ; VI: [[ZEXT11:%[0-9]+]]:_(s32) = G_ZEXT [[OR15]](s16) + ; VI: [[SHL17:%[0-9]+]]:_(s32) = G_SHL [[ZEXT11]], [[C9]](s32) + ; VI: [[OR17:%[0-9]+]]:_(s32) = G_OR 
[[ZEXT10]], [[SHL17]] + ; VI: [[MV2:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR16]](s32), [[OR17]](s32) + ; VI: [[C12:%[0-9]+]]:_(s64) = G_CONSTANT i64 24 + ; VI: [[GEP23:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C12]](s64) ; VI: [[LOAD24:%[0-9]+]]:_(s32) = G_LOAD [[GEP23]](p4) :: (load 1, addrspace 4) ; VI: [[GEP24:%[0-9]+]]:_(p4) = G_GEP [[GEP23]], [[C]](s64) ; VI: [[LOAD25:%[0-9]+]]:_(s32) = G_LOAD [[GEP24]](p4) :: (load 1, addrspace 4) @@ -8352,27 +9265,35 @@ body: | ; VI: [[AND24:%[0-9]+]]:_(s16) = G_AND [[TRUNC24]], [[C7]] ; VI: [[TRUNC25:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD25]](s32) ; VI: [[AND25:%[0-9]+]]:_(s16) = G_AND [[TRUNC25]], [[C7]] - ; VI: [[SHL12:%[0-9]+]]:_(s16) = G_SHL [[AND25]], [[C8]](s16) - ; VI: [[OR12:%[0-9]+]]:_(s16) = G_OR [[AND24]], [[SHL12]] + ; VI: [[SHL18:%[0-9]+]]:_(s16) = G_SHL [[AND25]], [[C8]](s16) + ; VI: [[OR18:%[0-9]+]]:_(s16) = G_OR [[AND24]], [[SHL18]] ; VI: [[TRUNC26:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD26]](s32) ; VI: [[AND26:%[0-9]+]]:_(s16) = G_AND [[TRUNC26]], [[C7]] ; VI: [[TRUNC27:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD27]](s32) ; VI: [[AND27:%[0-9]+]]:_(s16) = G_AND [[TRUNC27]], [[C7]] - ; VI: [[SHL13:%[0-9]+]]:_(s16) = G_SHL [[AND27]], [[C8]](s16) - ; VI: [[OR13:%[0-9]+]]:_(s16) = G_OR [[AND26]], [[SHL13]] + ; VI: [[SHL19:%[0-9]+]]:_(s16) = G_SHL [[AND27]], [[C8]](s16) + ; VI: [[OR19:%[0-9]+]]:_(s16) = G_OR [[AND26]], [[SHL19]] ; VI: [[TRUNC28:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD28]](s32) ; VI: [[AND28:%[0-9]+]]:_(s16) = G_AND [[TRUNC28]], [[C7]] ; VI: [[TRUNC29:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD29]](s32) ; VI: [[AND29:%[0-9]+]]:_(s16) = G_AND [[TRUNC29]], [[C7]] - ; VI: [[SHL14:%[0-9]+]]:_(s16) = G_SHL [[AND29]], [[C8]](s16) - ; VI: [[OR14:%[0-9]+]]:_(s16) = G_OR [[AND28]], [[SHL14]] + ; VI: [[SHL20:%[0-9]+]]:_(s16) = G_SHL [[AND29]], [[C8]](s16) + ; VI: [[OR20:%[0-9]+]]:_(s16) = G_OR [[AND28]], [[SHL20]] ; VI: [[TRUNC30:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD30]](s32) ; VI: [[AND30:%[0-9]+]]:_(s16) = G_AND [[TRUNC30]], [[C7]] ; VI: 
[[TRUNC31:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD31]](s32) ; VI: [[AND31:%[0-9]+]]:_(s16) = G_AND [[TRUNC31]], [[C7]] - ; VI: [[SHL15:%[0-9]+]]:_(s16) = G_SHL [[AND31]], [[C8]](s16) - ; VI: [[OR15:%[0-9]+]]:_(s16) = G_OR [[AND30]], [[SHL15]] - ; VI: [[MV3:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR12]](s16), [[OR13]](s16), [[OR14]](s16), [[OR15]](s16) + ; VI: [[SHL21:%[0-9]+]]:_(s16) = G_SHL [[AND31]], [[C8]](s16) + ; VI: [[OR21:%[0-9]+]]:_(s16) = G_OR [[AND30]], [[SHL21]] + ; VI: [[ZEXT12:%[0-9]+]]:_(s32) = G_ZEXT [[OR18]](s16) + ; VI: [[ZEXT13:%[0-9]+]]:_(s32) = G_ZEXT [[OR19]](s16) + ; VI: [[SHL22:%[0-9]+]]:_(s32) = G_SHL [[ZEXT13]], [[C9]](s32) + ; VI: [[OR22:%[0-9]+]]:_(s32) = G_OR [[ZEXT12]], [[SHL22]] + ; VI: [[ZEXT14:%[0-9]+]]:_(s32) = G_ZEXT [[OR20]](s16) + ; VI: [[ZEXT15:%[0-9]+]]:_(s32) = G_ZEXT [[OR21]](s16) + ; VI: [[SHL23:%[0-9]+]]:_(s32) = G_SHL [[ZEXT15]], [[C9]](s32) + ; VI: [[OR23:%[0-9]+]]:_(s32) = G_OR [[ZEXT14]], [[SHL23]] + ; VI: [[MV3:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR22]](s32), [[OR23]](s32) ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s64>) = G_BUILD_VECTOR [[MV]](s64), [[MV1]](s64), [[MV2]](s64), [[MV3]](s64) ; VI: $vgpr0_vgpr1_vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7 = COPY [[BUILD_VECTOR]](<4 x s64>) ; GFX9-LABEL: name: test_load_constant_v4s64_align1 @@ -8425,9 +9346,18 @@ body: | ; GFX9: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C7]] ; GFX9: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C8]](s16) ; GFX9: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; GFX9: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) - ; GFX9: [[C9:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 - ; GFX9: [[GEP7:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C9]](s64) + ; GFX9: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C9]](s32) + ; GFX9: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; 
GFX9: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; GFX9: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; GFX9: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C9]](s32) + ; GFX9: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; GFX9: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) + ; GFX9: [[C10:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 + ; GFX9: [[GEP7:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C10]](s64) ; GFX9: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p4) :: (load 1, addrspace 4) ; GFX9: [[GEP8:%[0-9]+]]:_(p4) = G_GEP [[GEP7]], [[C]](s64) ; GFX9: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p4) :: (load 1, addrspace 4) @@ -8447,29 +9377,37 @@ body: | ; GFX9: [[AND8:%[0-9]+]]:_(s16) = G_AND [[TRUNC8]], [[C7]] ; GFX9: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD9]](s32) ; GFX9: [[AND9:%[0-9]+]]:_(s16) = G_AND [[TRUNC9]], [[C7]] - ; GFX9: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) - ; GFX9: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL4]] + ; GFX9: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) + ; GFX9: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL6]] ; GFX9: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; GFX9: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C7]] ; GFX9: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD11]](s32) ; GFX9: [[AND11:%[0-9]+]]:_(s16) = G_AND [[TRUNC11]], [[C7]] - ; GFX9: [[SHL5:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) - ; GFX9: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL5]] + ; GFX9: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) + ; GFX9: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL7]] ; GFX9: [[TRUNC12:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD12]](s32) ; GFX9: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C7]] ; GFX9: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD13]](s32) ; GFX9: [[AND13:%[0-9]+]]:_(s16) = G_AND [[TRUNC13]], [[C7]] - ; GFX9: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C8]](s16) - ; GFX9: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL6]] + ; GFX9: [[SHL8:%[0-9]+]]:_(s16) 
= G_SHL [[AND13]], [[C8]](s16) + ; GFX9: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL8]] ; GFX9: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; GFX9: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C7]] ; GFX9: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD15]](s32) ; GFX9: [[AND15:%[0-9]+]]:_(s16) = G_AND [[TRUNC15]], [[C7]] - ; GFX9: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C8]](s16) - ; GFX9: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL7]] - ; GFX9: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16), [[OR6]](s16), [[OR7]](s16) - ; GFX9: [[C10:%[0-9]+]]:_(s64) = G_CONSTANT i64 16 - ; GFX9: [[GEP15:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C10]](s64) + ; GFX9: [[SHL9:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C8]](s16) + ; GFX9: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL9]] + ; GFX9: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; GFX9: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; GFX9: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C9]](s32) + ; GFX9: [[OR10:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL10]] + ; GFX9: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR8]](s16) + ; GFX9: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; GFX9: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C9]](s32) + ; GFX9: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; GFX9: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR10]](s32), [[OR11]](s32) + ; GFX9: [[C11:%[0-9]+]]:_(s64) = G_CONSTANT i64 16 + ; GFX9: [[GEP15:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C11]](s64) ; GFX9: [[LOAD16:%[0-9]+]]:_(s32) = G_LOAD [[GEP15]](p4) :: (load 1, addrspace 4) ; GFX9: [[GEP16:%[0-9]+]]:_(p4) = G_GEP [[GEP15]], [[C]](s64) ; GFX9: [[LOAD17:%[0-9]+]]:_(s32) = G_LOAD [[GEP16]](p4) :: (load 1, addrspace 4) @@ -8489,29 +9427,37 @@ body: | ; GFX9: [[AND16:%[0-9]+]]:_(s16) = G_AND [[TRUNC16]], [[C7]] ; GFX9: [[TRUNC17:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD17]](s32) ; GFX9: [[AND17:%[0-9]+]]:_(s16) = G_AND [[TRUNC17]], [[C7]] - ; GFX9: [[SHL8:%[0-9]+]]:_(s16) = G_SHL [[AND17]], [[C8]](s16) - ; 
GFX9: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[SHL8]] + ; GFX9: [[SHL12:%[0-9]+]]:_(s16) = G_SHL [[AND17]], [[C8]](s16) + ; GFX9: [[OR12:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[SHL12]] ; GFX9: [[TRUNC18:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD18]](s32) ; GFX9: [[AND18:%[0-9]+]]:_(s16) = G_AND [[TRUNC18]], [[C7]] ; GFX9: [[TRUNC19:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD19]](s32) ; GFX9: [[AND19:%[0-9]+]]:_(s16) = G_AND [[TRUNC19]], [[C7]] - ; GFX9: [[SHL9:%[0-9]+]]:_(s16) = G_SHL [[AND19]], [[C8]](s16) - ; GFX9: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[SHL9]] + ; GFX9: [[SHL13:%[0-9]+]]:_(s16) = G_SHL [[AND19]], [[C8]](s16) + ; GFX9: [[OR13:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[SHL13]] ; GFX9: [[TRUNC20:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD20]](s32) ; GFX9: [[AND20:%[0-9]+]]:_(s16) = G_AND [[TRUNC20]], [[C7]] ; GFX9: [[TRUNC21:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD21]](s32) ; GFX9: [[AND21:%[0-9]+]]:_(s16) = G_AND [[TRUNC21]], [[C7]] - ; GFX9: [[SHL10:%[0-9]+]]:_(s16) = G_SHL [[AND21]], [[C8]](s16) - ; GFX9: [[OR10:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[SHL10]] + ; GFX9: [[SHL14:%[0-9]+]]:_(s16) = G_SHL [[AND21]], [[C8]](s16) + ; GFX9: [[OR14:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[SHL14]] ; GFX9: [[TRUNC22:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD22]](s32) ; GFX9: [[AND22:%[0-9]+]]:_(s16) = G_AND [[TRUNC22]], [[C7]] ; GFX9: [[TRUNC23:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD23]](s32) ; GFX9: [[AND23:%[0-9]+]]:_(s16) = G_AND [[TRUNC23]], [[C7]] - ; GFX9: [[SHL11:%[0-9]+]]:_(s16) = G_SHL [[AND23]], [[C8]](s16) - ; GFX9: [[OR11:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[SHL11]] - ; GFX9: [[MV2:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR8]](s16), [[OR9]](s16), [[OR10]](s16), [[OR11]](s16) - ; GFX9: [[C11:%[0-9]+]]:_(s64) = G_CONSTANT i64 24 - ; GFX9: [[GEP23:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C11]](s64) + ; GFX9: [[SHL15:%[0-9]+]]:_(s16) = G_SHL [[AND23]], [[C8]](s16) + ; GFX9: [[OR15:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[SHL15]] + ; GFX9: [[ZEXT8:%[0-9]+]]:_(s32) = G_ZEXT [[OR12]](s16) + ; GFX9: [[ZEXT9:%[0-9]+]]:_(s32) 
= G_ZEXT [[OR13]](s16) + ; GFX9: [[SHL16:%[0-9]+]]:_(s32) = G_SHL [[ZEXT9]], [[C9]](s32) + ; GFX9: [[OR16:%[0-9]+]]:_(s32) = G_OR [[ZEXT8]], [[SHL16]] + ; GFX9: [[ZEXT10:%[0-9]+]]:_(s32) = G_ZEXT [[OR14]](s16) + ; GFX9: [[ZEXT11:%[0-9]+]]:_(s32) = G_ZEXT [[OR15]](s16) + ; GFX9: [[SHL17:%[0-9]+]]:_(s32) = G_SHL [[ZEXT11]], [[C9]](s32) + ; GFX9: [[OR17:%[0-9]+]]:_(s32) = G_OR [[ZEXT10]], [[SHL17]] + ; GFX9: [[MV2:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR16]](s32), [[OR17]](s32) + ; GFX9: [[C12:%[0-9]+]]:_(s64) = G_CONSTANT i64 24 + ; GFX9: [[GEP23:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C12]](s64) ; GFX9: [[LOAD24:%[0-9]+]]:_(s32) = G_LOAD [[GEP23]](p4) :: (load 1, addrspace 4) ; GFX9: [[GEP24:%[0-9]+]]:_(p4) = G_GEP [[GEP23]], [[C]](s64) ; GFX9: [[LOAD25:%[0-9]+]]:_(s32) = G_LOAD [[GEP24]](p4) :: (load 1, addrspace 4) @@ -8531,27 +9477,35 @@ body: | ; GFX9: [[AND24:%[0-9]+]]:_(s16) = G_AND [[TRUNC24]], [[C7]] ; GFX9: [[TRUNC25:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD25]](s32) ; GFX9: [[AND25:%[0-9]+]]:_(s16) = G_AND [[TRUNC25]], [[C7]] - ; GFX9: [[SHL12:%[0-9]+]]:_(s16) = G_SHL [[AND25]], [[C8]](s16) - ; GFX9: [[OR12:%[0-9]+]]:_(s16) = G_OR [[AND24]], [[SHL12]] + ; GFX9: [[SHL18:%[0-9]+]]:_(s16) = G_SHL [[AND25]], [[C8]](s16) + ; GFX9: [[OR18:%[0-9]+]]:_(s16) = G_OR [[AND24]], [[SHL18]] ; GFX9: [[TRUNC26:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD26]](s32) ; GFX9: [[AND26:%[0-9]+]]:_(s16) = G_AND [[TRUNC26]], [[C7]] ; GFX9: [[TRUNC27:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD27]](s32) ; GFX9: [[AND27:%[0-9]+]]:_(s16) = G_AND [[TRUNC27]], [[C7]] - ; GFX9: [[SHL13:%[0-9]+]]:_(s16) = G_SHL [[AND27]], [[C8]](s16) - ; GFX9: [[OR13:%[0-9]+]]:_(s16) = G_OR [[AND26]], [[SHL13]] + ; GFX9: [[SHL19:%[0-9]+]]:_(s16) = G_SHL [[AND27]], [[C8]](s16) + ; GFX9: [[OR19:%[0-9]+]]:_(s16) = G_OR [[AND26]], [[SHL19]] ; GFX9: [[TRUNC28:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD28]](s32) ; GFX9: [[AND28:%[0-9]+]]:_(s16) = G_AND [[TRUNC28]], [[C7]] ; GFX9: [[TRUNC29:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD29]](s32) ; GFX9: 
[[AND29:%[0-9]+]]:_(s16) = G_AND [[TRUNC29]], [[C7]] - ; GFX9: [[SHL14:%[0-9]+]]:_(s16) = G_SHL [[AND29]], [[C8]](s16) - ; GFX9: [[OR14:%[0-9]+]]:_(s16) = G_OR [[AND28]], [[SHL14]] + ; GFX9: [[SHL20:%[0-9]+]]:_(s16) = G_SHL [[AND29]], [[C8]](s16) + ; GFX9: [[OR20:%[0-9]+]]:_(s16) = G_OR [[AND28]], [[SHL20]] ; GFX9: [[TRUNC30:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD30]](s32) ; GFX9: [[AND30:%[0-9]+]]:_(s16) = G_AND [[TRUNC30]], [[C7]] ; GFX9: [[TRUNC31:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD31]](s32) ; GFX9: [[AND31:%[0-9]+]]:_(s16) = G_AND [[TRUNC31]], [[C7]] - ; GFX9: [[SHL15:%[0-9]+]]:_(s16) = G_SHL [[AND31]], [[C8]](s16) - ; GFX9: [[OR15:%[0-9]+]]:_(s16) = G_OR [[AND30]], [[SHL15]] - ; GFX9: [[MV3:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR12]](s16), [[OR13]](s16), [[OR14]](s16), [[OR15]](s16) + ; GFX9: [[SHL21:%[0-9]+]]:_(s16) = G_SHL [[AND31]], [[C8]](s16) + ; GFX9: [[OR21:%[0-9]+]]:_(s16) = G_OR [[AND30]], [[SHL21]] + ; GFX9: [[ZEXT12:%[0-9]+]]:_(s32) = G_ZEXT [[OR18]](s16) + ; GFX9: [[ZEXT13:%[0-9]+]]:_(s32) = G_ZEXT [[OR19]](s16) + ; GFX9: [[SHL22:%[0-9]+]]:_(s32) = G_SHL [[ZEXT13]], [[C9]](s32) + ; GFX9: [[OR22:%[0-9]+]]:_(s32) = G_OR [[ZEXT12]], [[SHL22]] + ; GFX9: [[ZEXT14:%[0-9]+]]:_(s32) = G_ZEXT [[OR20]](s16) + ; GFX9: [[ZEXT15:%[0-9]+]]:_(s32) = G_ZEXT [[OR21]](s16) + ; GFX9: [[SHL23:%[0-9]+]]:_(s32) = G_SHL [[ZEXT15]], [[C9]](s32) + ; GFX9: [[OR23:%[0-9]+]]:_(s32) = G_OR [[ZEXT14]], [[SHL23]] + ; GFX9: [[MV3:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR22]](s32), [[OR23]](s32) ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s64>) = G_BUILD_VECTOR [[MV]](s64), [[MV1]](s64), [[MV2]](s64), [[MV3]](s64) ; GFX9: $vgpr0_vgpr1_vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7 = COPY [[BUILD_VECTOR]](<4 x s64>) ; CI-MESA-LABEL: name: test_load_constant_v4s64_align1 @@ -8612,9 +9566,18 @@ body: | ; CI-MESA: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C8]](s32) ; CI-MESA: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) ; CI-MESA: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; CI-MESA: 
[[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) - ; CI-MESA: [[C10:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 - ; CI-MESA: [[GEP7:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C10]](s64) + ; CI-MESA: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI-MESA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI-MESA: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-MESA: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C10]](s32) + ; CI-MESA: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; CI-MESA: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; CI-MESA: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI-MESA: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C10]](s32) + ; CI-MESA: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; CI-MESA: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) + ; CI-MESA: [[C11:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 + ; CI-MESA: [[GEP7:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C11]](s64) ; CI-MESA: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p4) :: (load 1, addrspace 4) ; CI-MESA: [[GEP8:%[0-9]+]]:_(p4) = G_GEP [[GEP7]], [[C]](s64) ; CI-MESA: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p4) :: (load 1, addrspace 4) @@ -8635,36 +9598,44 @@ body: | ; CI-MESA: [[COPY8:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-MESA: [[COPY9:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32) ; CI-MESA: [[AND9:%[0-9]+]]:_(s32) = G_AND [[COPY9]], [[C9]] - ; CI-MESA: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY8]](s32) - ; CI-MESA: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) - ; CI-MESA: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] + ; CI-MESA: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY8]](s32) + ; CI-MESA: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) + ; CI-MESA: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] ; CI-MESA: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; CI-MESA: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C7]] ; CI-MESA: [[COPY10:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-MESA: 
[[COPY11:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32) ; CI-MESA: [[AND11:%[0-9]+]]:_(s32) = G_AND [[COPY11]], [[C9]] - ; CI-MESA: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY10]](s32) - ; CI-MESA: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL5]](s32) - ; CI-MESA: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] + ; CI-MESA: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY10]](s32) + ; CI-MESA: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) + ; CI-MESA: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] ; CI-MESA: [[TRUNC12:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD12]](s32) ; CI-MESA: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C7]] ; CI-MESA: [[COPY12:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-MESA: [[COPY13:%[0-9]+]]:_(s32) = COPY [[LOAD13]](s32) ; CI-MESA: [[AND13:%[0-9]+]]:_(s32) = G_AND [[COPY13]], [[C9]] - ; CI-MESA: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY12]](s32) - ; CI-MESA: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) - ; CI-MESA: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] + ; CI-MESA: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY12]](s32) + ; CI-MESA: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL8]](s32) + ; CI-MESA: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] ; CI-MESA: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; CI-MESA: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C7]] ; CI-MESA: [[COPY14:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-MESA: [[COPY15:%[0-9]+]]:_(s32) = COPY [[LOAD15]](s32) ; CI-MESA: [[AND15:%[0-9]+]]:_(s32) = G_AND [[COPY15]], [[C9]] - ; CI-MESA: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY14]](s32) - ; CI-MESA: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) - ; CI-MESA: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] - ; CI-MESA: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16), [[OR6]](s16), [[OR7]](s16) - ; CI-MESA: [[C11:%[0-9]+]]:_(s64) = G_CONSTANT i64 16 - ; CI-MESA: [[GEP15:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C11]](s64) + ; CI-MESA: 
[[SHL9:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY14]](s32) + ; CI-MESA: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL9]](s32) + ; CI-MESA: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] + ; CI-MESA: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; CI-MESA: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; CI-MESA: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C10]](s32) + ; CI-MESA: [[OR10:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL10]] + ; CI-MESA: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR8]](s16) + ; CI-MESA: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; CI-MESA: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C10]](s32) + ; CI-MESA: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; CI-MESA: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR10]](s32), [[OR11]](s32) + ; CI-MESA: [[C12:%[0-9]+]]:_(s64) = G_CONSTANT i64 16 + ; CI-MESA: [[GEP15:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C12]](s64) ; CI-MESA: [[LOAD16:%[0-9]+]]:_(s32) = G_LOAD [[GEP15]](p4) :: (load 1, addrspace 4) ; CI-MESA: [[GEP16:%[0-9]+]]:_(p4) = G_GEP [[GEP15]], [[C]](s64) ; CI-MESA: [[LOAD17:%[0-9]+]]:_(s32) = G_LOAD [[GEP16]](p4) :: (load 1, addrspace 4) @@ -8685,36 +9656,44 @@ body: | ; CI-MESA: [[COPY16:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-MESA: [[COPY17:%[0-9]+]]:_(s32) = COPY [[LOAD17]](s32) ; CI-MESA: [[AND17:%[0-9]+]]:_(s32) = G_AND [[COPY17]], [[C9]] - ; CI-MESA: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[AND17]], [[COPY16]](s32) - ; CI-MESA: [[TRUNC17:%[0-9]+]]:_(s16) = G_TRUNC [[SHL8]](s32) - ; CI-MESA: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[TRUNC17]] + ; CI-MESA: [[SHL12:%[0-9]+]]:_(s32) = G_SHL [[AND17]], [[COPY16]](s32) + ; CI-MESA: [[TRUNC17:%[0-9]+]]:_(s16) = G_TRUNC [[SHL12]](s32) + ; CI-MESA: [[OR12:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[TRUNC17]] ; CI-MESA: [[TRUNC18:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD18]](s32) ; CI-MESA: [[AND18:%[0-9]+]]:_(s16) = G_AND [[TRUNC18]], [[C7]] ; CI-MESA: [[COPY18:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-MESA: [[COPY19:%[0-9]+]]:_(s32) = COPY 
[[LOAD19]](s32) ; CI-MESA: [[AND19:%[0-9]+]]:_(s32) = G_AND [[COPY19]], [[C9]] - ; CI-MESA: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[AND19]], [[COPY18]](s32) - ; CI-MESA: [[TRUNC19:%[0-9]+]]:_(s16) = G_TRUNC [[SHL9]](s32) - ; CI-MESA: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[TRUNC19]] + ; CI-MESA: [[SHL13:%[0-9]+]]:_(s32) = G_SHL [[AND19]], [[COPY18]](s32) + ; CI-MESA: [[TRUNC19:%[0-9]+]]:_(s16) = G_TRUNC [[SHL13]](s32) + ; CI-MESA: [[OR13:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[TRUNC19]] ; CI-MESA: [[TRUNC20:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD20]](s32) ; CI-MESA: [[AND20:%[0-9]+]]:_(s16) = G_AND [[TRUNC20]], [[C7]] ; CI-MESA: [[COPY20:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-MESA: [[COPY21:%[0-9]+]]:_(s32) = COPY [[LOAD21]](s32) ; CI-MESA: [[AND21:%[0-9]+]]:_(s32) = G_AND [[COPY21]], [[C9]] - ; CI-MESA: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[AND21]], [[COPY20]](s32) - ; CI-MESA: [[TRUNC21:%[0-9]+]]:_(s16) = G_TRUNC [[SHL10]](s32) - ; CI-MESA: [[OR10:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[TRUNC21]] + ; CI-MESA: [[SHL14:%[0-9]+]]:_(s32) = G_SHL [[AND21]], [[COPY20]](s32) + ; CI-MESA: [[TRUNC21:%[0-9]+]]:_(s16) = G_TRUNC [[SHL14]](s32) + ; CI-MESA: [[OR14:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[TRUNC21]] ; CI-MESA: [[TRUNC22:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD22]](s32) ; CI-MESA: [[AND22:%[0-9]+]]:_(s16) = G_AND [[TRUNC22]], [[C7]] ; CI-MESA: [[COPY22:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-MESA: [[COPY23:%[0-9]+]]:_(s32) = COPY [[LOAD23]](s32) ; CI-MESA: [[AND23:%[0-9]+]]:_(s32) = G_AND [[COPY23]], [[C9]] - ; CI-MESA: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[AND23]], [[COPY22]](s32) - ; CI-MESA: [[TRUNC23:%[0-9]+]]:_(s16) = G_TRUNC [[SHL11]](s32) - ; CI-MESA: [[OR11:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[TRUNC23]] - ; CI-MESA: [[MV2:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR8]](s16), [[OR9]](s16), [[OR10]](s16), [[OR11]](s16) - ; CI-MESA: [[C12:%[0-9]+]]:_(s64) = G_CONSTANT i64 24 - ; CI-MESA: [[GEP23:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C12]](s64) + ; CI-MESA: [[SHL15:%[0-9]+]]:_(s32) = G_SHL 
[[AND23]], [[COPY22]](s32) + ; CI-MESA: [[TRUNC23:%[0-9]+]]:_(s16) = G_TRUNC [[SHL15]](s32) + ; CI-MESA: [[OR15:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[TRUNC23]] + ; CI-MESA: [[ZEXT8:%[0-9]+]]:_(s32) = G_ZEXT [[OR12]](s16) + ; CI-MESA: [[ZEXT9:%[0-9]+]]:_(s32) = G_ZEXT [[OR13]](s16) + ; CI-MESA: [[SHL16:%[0-9]+]]:_(s32) = G_SHL [[ZEXT9]], [[C10]](s32) + ; CI-MESA: [[OR16:%[0-9]+]]:_(s32) = G_OR [[ZEXT8]], [[SHL16]] + ; CI-MESA: [[ZEXT10:%[0-9]+]]:_(s32) = G_ZEXT [[OR14]](s16) + ; CI-MESA: [[ZEXT11:%[0-9]+]]:_(s32) = G_ZEXT [[OR15]](s16) + ; CI-MESA: [[SHL17:%[0-9]+]]:_(s32) = G_SHL [[ZEXT11]], [[C10]](s32) + ; CI-MESA: [[OR17:%[0-9]+]]:_(s32) = G_OR [[ZEXT10]], [[SHL17]] + ; CI-MESA: [[MV2:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR16]](s32), [[OR17]](s32) + ; CI-MESA: [[C13:%[0-9]+]]:_(s64) = G_CONSTANT i64 24 + ; CI-MESA: [[GEP23:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C13]](s64) ; CI-MESA: [[LOAD24:%[0-9]+]]:_(s32) = G_LOAD [[GEP23]](p4) :: (load 1, addrspace 4) ; CI-MESA: [[GEP24:%[0-9]+]]:_(p4) = G_GEP [[GEP23]], [[C]](s64) ; CI-MESA: [[LOAD25:%[0-9]+]]:_(s32) = G_LOAD [[GEP24]](p4) :: (load 1, addrspace 4) @@ -8735,34 +9714,42 @@ body: | ; CI-MESA: [[COPY24:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-MESA: [[COPY25:%[0-9]+]]:_(s32) = COPY [[LOAD25]](s32) ; CI-MESA: [[AND25:%[0-9]+]]:_(s32) = G_AND [[COPY25]], [[C9]] - ; CI-MESA: [[SHL12:%[0-9]+]]:_(s32) = G_SHL [[AND25]], [[COPY24]](s32) - ; CI-MESA: [[TRUNC25:%[0-9]+]]:_(s16) = G_TRUNC [[SHL12]](s32) - ; CI-MESA: [[OR12:%[0-9]+]]:_(s16) = G_OR [[AND24]], [[TRUNC25]] + ; CI-MESA: [[SHL18:%[0-9]+]]:_(s32) = G_SHL [[AND25]], [[COPY24]](s32) + ; CI-MESA: [[TRUNC25:%[0-9]+]]:_(s16) = G_TRUNC [[SHL18]](s32) + ; CI-MESA: [[OR18:%[0-9]+]]:_(s16) = G_OR [[AND24]], [[TRUNC25]] ; CI-MESA: [[TRUNC26:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD26]](s32) ; CI-MESA: [[AND26:%[0-9]+]]:_(s16) = G_AND [[TRUNC26]], [[C7]] ; CI-MESA: [[COPY26:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-MESA: [[COPY27:%[0-9]+]]:_(s32) = COPY [[LOAD27]](s32) ; CI-MESA: 
[[AND27:%[0-9]+]]:_(s32) = G_AND [[COPY27]], [[C9]] - ; CI-MESA: [[SHL13:%[0-9]+]]:_(s32) = G_SHL [[AND27]], [[COPY26]](s32) - ; CI-MESA: [[TRUNC27:%[0-9]+]]:_(s16) = G_TRUNC [[SHL13]](s32) - ; CI-MESA: [[OR13:%[0-9]+]]:_(s16) = G_OR [[AND26]], [[TRUNC27]] + ; CI-MESA: [[SHL19:%[0-9]+]]:_(s32) = G_SHL [[AND27]], [[COPY26]](s32) + ; CI-MESA: [[TRUNC27:%[0-9]+]]:_(s16) = G_TRUNC [[SHL19]](s32) + ; CI-MESA: [[OR19:%[0-9]+]]:_(s16) = G_OR [[AND26]], [[TRUNC27]] ; CI-MESA: [[TRUNC28:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD28]](s32) ; CI-MESA: [[AND28:%[0-9]+]]:_(s16) = G_AND [[TRUNC28]], [[C7]] ; CI-MESA: [[COPY28:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-MESA: [[COPY29:%[0-9]+]]:_(s32) = COPY [[LOAD29]](s32) ; CI-MESA: [[AND29:%[0-9]+]]:_(s32) = G_AND [[COPY29]], [[C9]] - ; CI-MESA: [[SHL14:%[0-9]+]]:_(s32) = G_SHL [[AND29]], [[COPY28]](s32) - ; CI-MESA: [[TRUNC29:%[0-9]+]]:_(s16) = G_TRUNC [[SHL14]](s32) - ; CI-MESA: [[OR14:%[0-9]+]]:_(s16) = G_OR [[AND28]], [[TRUNC29]] + ; CI-MESA: [[SHL20:%[0-9]+]]:_(s32) = G_SHL [[AND29]], [[COPY28]](s32) + ; CI-MESA: [[TRUNC29:%[0-9]+]]:_(s16) = G_TRUNC [[SHL20]](s32) + ; CI-MESA: [[OR20:%[0-9]+]]:_(s16) = G_OR [[AND28]], [[TRUNC29]] ; CI-MESA: [[TRUNC30:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD30]](s32) ; CI-MESA: [[AND30:%[0-9]+]]:_(s16) = G_AND [[TRUNC30]], [[C7]] ; CI-MESA: [[COPY30:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-MESA: [[COPY31:%[0-9]+]]:_(s32) = COPY [[LOAD31]](s32) ; CI-MESA: [[AND31:%[0-9]+]]:_(s32) = G_AND [[COPY31]], [[C9]] - ; CI-MESA: [[SHL15:%[0-9]+]]:_(s32) = G_SHL [[AND31]], [[COPY30]](s32) - ; CI-MESA: [[TRUNC31:%[0-9]+]]:_(s16) = G_TRUNC [[SHL15]](s32) - ; CI-MESA: [[OR15:%[0-9]+]]:_(s16) = G_OR [[AND30]], [[TRUNC31]] - ; CI-MESA: [[MV3:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR12]](s16), [[OR13]](s16), [[OR14]](s16), [[OR15]](s16) + ; CI-MESA: [[SHL21:%[0-9]+]]:_(s32) = G_SHL [[AND31]], [[COPY30]](s32) + ; CI-MESA: [[TRUNC31:%[0-9]+]]:_(s16) = G_TRUNC [[SHL21]](s32) + ; CI-MESA: [[OR21:%[0-9]+]]:_(s16) = G_OR [[AND30]], 
[[TRUNC31]] + ; CI-MESA: [[ZEXT12:%[0-9]+]]:_(s32) = G_ZEXT [[OR18]](s16) + ; CI-MESA: [[ZEXT13:%[0-9]+]]:_(s32) = G_ZEXT [[OR19]](s16) + ; CI-MESA: [[SHL22:%[0-9]+]]:_(s32) = G_SHL [[ZEXT13]], [[C10]](s32) + ; CI-MESA: [[OR22:%[0-9]+]]:_(s32) = G_OR [[ZEXT12]], [[SHL22]] + ; CI-MESA: [[ZEXT14:%[0-9]+]]:_(s32) = G_ZEXT [[OR20]](s16) + ; CI-MESA: [[ZEXT15:%[0-9]+]]:_(s32) = G_ZEXT [[OR21]](s16) + ; CI-MESA: [[SHL23:%[0-9]+]]:_(s32) = G_SHL [[ZEXT15]], [[C10]](s32) + ; CI-MESA: [[OR23:%[0-9]+]]:_(s32) = G_OR [[ZEXT14]], [[SHL23]] + ; CI-MESA: [[MV3:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR22]](s32), [[OR23]](s32) ; CI-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s64>) = G_BUILD_VECTOR [[MV]](s64), [[MV1]](s64), [[MV2]](s64), [[MV3]](s64) ; CI-MESA: $vgpr0_vgpr1_vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7 = COPY [[BUILD_VECTOR]](<4 x s64>) ; GFX9-MESA-LABEL: name: test_load_constant_v4s64_align1 @@ -8815,9 +9802,18 @@ body: | ; GFX9-MESA: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C7]] ; GFX9-MESA: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C8]](s16) ; GFX9-MESA: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; GFX9-MESA: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) - ; GFX9-MESA: [[C9:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 - ; GFX9-MESA: [[GEP7:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C9]](s64) + ; GFX9-MESA: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9-MESA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9-MESA: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9-MESA: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C9]](s32) + ; GFX9-MESA: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; GFX9-MESA: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; GFX9-MESA: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; GFX9-MESA: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C9]](s32) + ; GFX9-MESA: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; GFX9-MESA: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) + 
; GFX9-MESA: [[C10:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 + ; GFX9-MESA: [[GEP7:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C10]](s64) ; GFX9-MESA: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p4) :: (load 1, addrspace 4) ; GFX9-MESA: [[GEP8:%[0-9]+]]:_(p4) = G_GEP [[GEP7]], [[C]](s64) ; GFX9-MESA: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p4) :: (load 1, addrspace 4) @@ -8837,29 +9833,37 @@ body: | ; GFX9-MESA: [[AND8:%[0-9]+]]:_(s16) = G_AND [[TRUNC8]], [[C7]] ; GFX9-MESA: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD9]](s32) ; GFX9-MESA: [[AND9:%[0-9]+]]:_(s16) = G_AND [[TRUNC9]], [[C7]] - ; GFX9-MESA: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) - ; GFX9-MESA: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL4]] + ; GFX9-MESA: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) + ; GFX9-MESA: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL6]] ; GFX9-MESA: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; GFX9-MESA: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C7]] ; GFX9-MESA: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD11]](s32) ; GFX9-MESA: [[AND11:%[0-9]+]]:_(s16) = G_AND [[TRUNC11]], [[C7]] - ; GFX9-MESA: [[SHL5:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) - ; GFX9-MESA: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL5]] + ; GFX9-MESA: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) + ; GFX9-MESA: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL7]] ; GFX9-MESA: [[TRUNC12:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD12]](s32) ; GFX9-MESA: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C7]] ; GFX9-MESA: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD13]](s32) ; GFX9-MESA: [[AND13:%[0-9]+]]:_(s16) = G_AND [[TRUNC13]], [[C7]] - ; GFX9-MESA: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C8]](s16) - ; GFX9-MESA: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL6]] + ; GFX9-MESA: [[SHL8:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C8]](s16) + ; GFX9-MESA: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL8]] ; GFX9-MESA: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; GFX9-MESA: 
[[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C7]] ; GFX9-MESA: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD15]](s32) ; GFX9-MESA: [[AND15:%[0-9]+]]:_(s16) = G_AND [[TRUNC15]], [[C7]] - ; GFX9-MESA: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C8]](s16) - ; GFX9-MESA: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL7]] - ; GFX9-MESA: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16), [[OR6]](s16), [[OR7]](s16) - ; GFX9-MESA: [[C10:%[0-9]+]]:_(s64) = G_CONSTANT i64 16 - ; GFX9-MESA: [[GEP15:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C10]](s64) + ; GFX9-MESA: [[SHL9:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C8]](s16) + ; GFX9-MESA: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL9]] + ; GFX9-MESA: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; GFX9-MESA: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; GFX9-MESA: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C9]](s32) + ; GFX9-MESA: [[OR10:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL10]] + ; GFX9-MESA: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR8]](s16) + ; GFX9-MESA: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; GFX9-MESA: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C9]](s32) + ; GFX9-MESA: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; GFX9-MESA: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR10]](s32), [[OR11]](s32) + ; GFX9-MESA: [[C11:%[0-9]+]]:_(s64) = G_CONSTANT i64 16 + ; GFX9-MESA: [[GEP15:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C11]](s64) ; GFX9-MESA: [[LOAD16:%[0-9]+]]:_(s32) = G_LOAD [[GEP15]](p4) :: (load 1, addrspace 4) ; GFX9-MESA: [[GEP16:%[0-9]+]]:_(p4) = G_GEP [[GEP15]], [[C]](s64) ; GFX9-MESA: [[LOAD17:%[0-9]+]]:_(s32) = G_LOAD [[GEP16]](p4) :: (load 1, addrspace 4) @@ -8879,29 +9883,37 @@ body: | ; GFX9-MESA: [[AND16:%[0-9]+]]:_(s16) = G_AND [[TRUNC16]], [[C7]] ; GFX9-MESA: [[TRUNC17:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD17]](s32) ; GFX9-MESA: [[AND17:%[0-9]+]]:_(s16) = G_AND [[TRUNC17]], [[C7]] - ; GFX9-MESA: [[SHL8:%[0-9]+]]:_(s16) = G_SHL [[AND17]], [[C8]](s16) - ; GFX9-MESA: 
[[OR8:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[SHL8]] + ; GFX9-MESA: [[SHL12:%[0-9]+]]:_(s16) = G_SHL [[AND17]], [[C8]](s16) + ; GFX9-MESA: [[OR12:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[SHL12]] ; GFX9-MESA: [[TRUNC18:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD18]](s32) ; GFX9-MESA: [[AND18:%[0-9]+]]:_(s16) = G_AND [[TRUNC18]], [[C7]] ; GFX9-MESA: [[TRUNC19:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD19]](s32) ; GFX9-MESA: [[AND19:%[0-9]+]]:_(s16) = G_AND [[TRUNC19]], [[C7]] - ; GFX9-MESA: [[SHL9:%[0-9]+]]:_(s16) = G_SHL [[AND19]], [[C8]](s16) - ; GFX9-MESA: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[SHL9]] + ; GFX9-MESA: [[SHL13:%[0-9]+]]:_(s16) = G_SHL [[AND19]], [[C8]](s16) + ; GFX9-MESA: [[OR13:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[SHL13]] ; GFX9-MESA: [[TRUNC20:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD20]](s32) ; GFX9-MESA: [[AND20:%[0-9]+]]:_(s16) = G_AND [[TRUNC20]], [[C7]] ; GFX9-MESA: [[TRUNC21:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD21]](s32) ; GFX9-MESA: [[AND21:%[0-9]+]]:_(s16) = G_AND [[TRUNC21]], [[C7]] - ; GFX9-MESA: [[SHL10:%[0-9]+]]:_(s16) = G_SHL [[AND21]], [[C8]](s16) - ; GFX9-MESA: [[OR10:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[SHL10]] + ; GFX9-MESA: [[SHL14:%[0-9]+]]:_(s16) = G_SHL [[AND21]], [[C8]](s16) + ; GFX9-MESA: [[OR14:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[SHL14]] ; GFX9-MESA: [[TRUNC22:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD22]](s32) ; GFX9-MESA: [[AND22:%[0-9]+]]:_(s16) = G_AND [[TRUNC22]], [[C7]] ; GFX9-MESA: [[TRUNC23:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD23]](s32) ; GFX9-MESA: [[AND23:%[0-9]+]]:_(s16) = G_AND [[TRUNC23]], [[C7]] - ; GFX9-MESA: [[SHL11:%[0-9]+]]:_(s16) = G_SHL [[AND23]], [[C8]](s16) - ; GFX9-MESA: [[OR11:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[SHL11]] - ; GFX9-MESA: [[MV2:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR8]](s16), [[OR9]](s16), [[OR10]](s16), [[OR11]](s16) - ; GFX9-MESA: [[C11:%[0-9]+]]:_(s64) = G_CONSTANT i64 24 - ; GFX9-MESA: [[GEP23:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C11]](s64) + ; GFX9-MESA: [[SHL15:%[0-9]+]]:_(s16) = G_SHL [[AND23]], [[C8]](s16) + ; GFX9-MESA: 
[[OR15:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[SHL15]] + ; GFX9-MESA: [[ZEXT8:%[0-9]+]]:_(s32) = G_ZEXT [[OR12]](s16) + ; GFX9-MESA: [[ZEXT9:%[0-9]+]]:_(s32) = G_ZEXT [[OR13]](s16) + ; GFX9-MESA: [[SHL16:%[0-9]+]]:_(s32) = G_SHL [[ZEXT9]], [[C9]](s32) + ; GFX9-MESA: [[OR16:%[0-9]+]]:_(s32) = G_OR [[ZEXT8]], [[SHL16]] + ; GFX9-MESA: [[ZEXT10:%[0-9]+]]:_(s32) = G_ZEXT [[OR14]](s16) + ; GFX9-MESA: [[ZEXT11:%[0-9]+]]:_(s32) = G_ZEXT [[OR15]](s16) + ; GFX9-MESA: [[SHL17:%[0-9]+]]:_(s32) = G_SHL [[ZEXT11]], [[C9]](s32) + ; GFX9-MESA: [[OR17:%[0-9]+]]:_(s32) = G_OR [[ZEXT10]], [[SHL17]] + ; GFX9-MESA: [[MV2:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR16]](s32), [[OR17]](s32) + ; GFX9-MESA: [[C12:%[0-9]+]]:_(s64) = G_CONSTANT i64 24 + ; GFX9-MESA: [[GEP23:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C12]](s64) ; GFX9-MESA: [[LOAD24:%[0-9]+]]:_(s32) = G_LOAD [[GEP23]](p4) :: (load 1, addrspace 4) ; GFX9-MESA: [[GEP24:%[0-9]+]]:_(p4) = G_GEP [[GEP23]], [[C]](s64) ; GFX9-MESA: [[LOAD25:%[0-9]+]]:_(s32) = G_LOAD [[GEP24]](p4) :: (load 1, addrspace 4) @@ -8921,27 +9933,35 @@ body: | ; GFX9-MESA: [[AND24:%[0-9]+]]:_(s16) = G_AND [[TRUNC24]], [[C7]] ; GFX9-MESA: [[TRUNC25:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD25]](s32) ; GFX9-MESA: [[AND25:%[0-9]+]]:_(s16) = G_AND [[TRUNC25]], [[C7]] - ; GFX9-MESA: [[SHL12:%[0-9]+]]:_(s16) = G_SHL [[AND25]], [[C8]](s16) - ; GFX9-MESA: [[OR12:%[0-9]+]]:_(s16) = G_OR [[AND24]], [[SHL12]] + ; GFX9-MESA: [[SHL18:%[0-9]+]]:_(s16) = G_SHL [[AND25]], [[C8]](s16) + ; GFX9-MESA: [[OR18:%[0-9]+]]:_(s16) = G_OR [[AND24]], [[SHL18]] ; GFX9-MESA: [[TRUNC26:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD26]](s32) ; GFX9-MESA: [[AND26:%[0-9]+]]:_(s16) = G_AND [[TRUNC26]], [[C7]] ; GFX9-MESA: [[TRUNC27:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD27]](s32) ; GFX9-MESA: [[AND27:%[0-9]+]]:_(s16) = G_AND [[TRUNC27]], [[C7]] - ; GFX9-MESA: [[SHL13:%[0-9]+]]:_(s16) = G_SHL [[AND27]], [[C8]](s16) - ; GFX9-MESA: [[OR13:%[0-9]+]]:_(s16) = G_OR [[AND26]], [[SHL13]] + ; GFX9-MESA: [[SHL19:%[0-9]+]]:_(s16) = G_SHL 
[[AND27]], [[C8]](s16) + ; GFX9-MESA: [[OR19:%[0-9]+]]:_(s16) = G_OR [[AND26]], [[SHL19]] ; GFX9-MESA: [[TRUNC28:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD28]](s32) ; GFX9-MESA: [[AND28:%[0-9]+]]:_(s16) = G_AND [[TRUNC28]], [[C7]] ; GFX9-MESA: [[TRUNC29:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD29]](s32) ; GFX9-MESA: [[AND29:%[0-9]+]]:_(s16) = G_AND [[TRUNC29]], [[C7]] - ; GFX9-MESA: [[SHL14:%[0-9]+]]:_(s16) = G_SHL [[AND29]], [[C8]](s16) - ; GFX9-MESA: [[OR14:%[0-9]+]]:_(s16) = G_OR [[AND28]], [[SHL14]] + ; GFX9-MESA: [[SHL20:%[0-9]+]]:_(s16) = G_SHL [[AND29]], [[C8]](s16) + ; GFX9-MESA: [[OR20:%[0-9]+]]:_(s16) = G_OR [[AND28]], [[SHL20]] ; GFX9-MESA: [[TRUNC30:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD30]](s32) ; GFX9-MESA: [[AND30:%[0-9]+]]:_(s16) = G_AND [[TRUNC30]], [[C7]] ; GFX9-MESA: [[TRUNC31:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD31]](s32) ; GFX9-MESA: [[AND31:%[0-9]+]]:_(s16) = G_AND [[TRUNC31]], [[C7]] - ; GFX9-MESA: [[SHL15:%[0-9]+]]:_(s16) = G_SHL [[AND31]], [[C8]](s16) - ; GFX9-MESA: [[OR15:%[0-9]+]]:_(s16) = G_OR [[AND30]], [[SHL15]] - ; GFX9-MESA: [[MV3:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR12]](s16), [[OR13]](s16), [[OR14]](s16), [[OR15]](s16) + ; GFX9-MESA: [[SHL21:%[0-9]+]]:_(s16) = G_SHL [[AND31]], [[C8]](s16) + ; GFX9-MESA: [[OR21:%[0-9]+]]:_(s16) = G_OR [[AND30]], [[SHL21]] + ; GFX9-MESA: [[ZEXT12:%[0-9]+]]:_(s32) = G_ZEXT [[OR18]](s16) + ; GFX9-MESA: [[ZEXT13:%[0-9]+]]:_(s32) = G_ZEXT [[OR19]](s16) + ; GFX9-MESA: [[SHL22:%[0-9]+]]:_(s32) = G_SHL [[ZEXT13]], [[C9]](s32) + ; GFX9-MESA: [[OR22:%[0-9]+]]:_(s32) = G_OR [[ZEXT12]], [[SHL22]] + ; GFX9-MESA: [[ZEXT14:%[0-9]+]]:_(s32) = G_ZEXT [[OR20]](s16) + ; GFX9-MESA: [[ZEXT15:%[0-9]+]]:_(s32) = G_ZEXT [[OR21]](s16) + ; GFX9-MESA: [[SHL23:%[0-9]+]]:_(s32) = G_SHL [[ZEXT15]], [[C9]](s32) + ; GFX9-MESA: [[OR23:%[0-9]+]]:_(s32) = G_OR [[ZEXT14]], [[SHL23]] + ; GFX9-MESA: [[MV3:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR22]](s32), [[OR23]](s32) ; GFX9-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s64>) = G_BUILD_VECTOR [[MV]](s64), [[MV1]](s64), 
[[MV2]](s64), [[MV3]](s64) ; GFX9-MESA: $vgpr0_vgpr1_vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7 = COPY [[BUILD_VECTOR]](<4 x s64>) %0:_(p4) = COPY $vgpr0_vgpr1 @@ -9137,9 +10157,18 @@ body: | ; CI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C8]](s32) ; CI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) ; CI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; CI: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) - ; CI: [[C10:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 - ; CI: [[GEP7:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C10]](s64) + ; CI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C10]](s32) + ; CI: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; CI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; CI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C10]](s32) + ; CI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; CI: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) + ; CI: [[C11:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 + ; CI: [[GEP7:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C11]](s64) ; CI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p4) :: (load 1, addrspace 4) ; CI: [[GEP8:%[0-9]+]]:_(p4) = G_GEP [[GEP7]], [[C]](s64) ; CI: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p4) :: (load 1, addrspace 4) @@ -9160,34 +10189,42 @@ body: | ; CI: [[COPY8:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI: [[COPY9:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32) ; CI: [[AND9:%[0-9]+]]:_(s32) = G_AND [[COPY9]], [[C9]] - ; CI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY8]](s32) - ; CI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) - ; CI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] + ; CI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY8]](s32) + ; CI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) + ; CI: 
[[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] ; CI: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; CI: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C7]] ; CI: [[COPY10:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI: [[COPY11:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32) ; CI: [[AND11:%[0-9]+]]:_(s32) = G_AND [[COPY11]], [[C9]] - ; CI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY10]](s32) - ; CI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL5]](s32) - ; CI: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] + ; CI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY10]](s32) + ; CI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) + ; CI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] ; CI: [[TRUNC12:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD12]](s32) ; CI: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C7]] ; CI: [[COPY12:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI: [[COPY13:%[0-9]+]]:_(s32) = COPY [[LOAD13]](s32) ; CI: [[AND13:%[0-9]+]]:_(s32) = G_AND [[COPY13]], [[C9]] - ; CI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY12]](s32) - ; CI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) - ; CI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] + ; CI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY12]](s32) + ; CI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL8]](s32) + ; CI: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] ; CI: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; CI: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C7]] ; CI: [[COPY14:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI: [[COPY15:%[0-9]+]]:_(s32) = COPY [[LOAD15]](s32) ; CI: [[AND15:%[0-9]+]]:_(s32) = G_AND [[COPY15]], [[C9]] - ; CI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY14]](s32) - ; CI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) - ; CI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] - ; CI: [[MV1:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16), [[OR6]](s16), [[OR7]](s16) + ; CI: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[AND15]], 
[[COPY14]](s32) + ; CI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL9]](s32) + ; CI: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] + ; CI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; CI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; CI: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C10]](s32) + ; CI: [[OR10:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL10]] + ; CI: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR8]](s16) + ; CI: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; CI: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C10]](s32) + ; CI: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; CI: [[MV1:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR10]](s32), [[OR11]](s32) ; CI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x p1>) = G_BUILD_VECTOR [[MV]](p1), [[MV1]](p1) ; CI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BUILD_VECTOR]](<2 x p1>) ; VI-LABEL: name: test_load_constant_v2p1_align1 @@ -9240,9 +10277,18 @@ body: | ; VI: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C7]] ; VI: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C8]](s16) ; VI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; VI: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) - ; VI: [[C9:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 - ; VI: [[GEP7:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C9]](s64) + ; VI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; VI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; VI: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C9]](s32) + ; VI: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; VI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; VI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; VI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C9]](s32) + ; VI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; VI: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) + ; VI: [[C10:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 + ; VI: [[GEP7:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C10]](s64) ; VI: 
[[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p4) :: (load 1, addrspace 4) ; VI: [[GEP8:%[0-9]+]]:_(p4) = G_GEP [[GEP7]], [[C]](s64) ; VI: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p4) :: (load 1, addrspace 4) @@ -9262,27 +10308,35 @@ body: | ; VI: [[AND8:%[0-9]+]]:_(s16) = G_AND [[TRUNC8]], [[C7]] ; VI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD9]](s32) ; VI: [[AND9:%[0-9]+]]:_(s16) = G_AND [[TRUNC9]], [[C7]] - ; VI: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) - ; VI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL4]] + ; VI: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) + ; VI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL6]] ; VI: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; VI: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C7]] ; VI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD11]](s32) ; VI: [[AND11:%[0-9]+]]:_(s16) = G_AND [[TRUNC11]], [[C7]] - ; VI: [[SHL5:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) - ; VI: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL5]] + ; VI: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) + ; VI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL7]] ; VI: [[TRUNC12:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD12]](s32) ; VI: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C7]] ; VI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD13]](s32) ; VI: [[AND13:%[0-9]+]]:_(s16) = G_AND [[TRUNC13]], [[C7]] - ; VI: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C8]](s16) - ; VI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL6]] + ; VI: [[SHL8:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C8]](s16) + ; VI: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL8]] ; VI: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; VI: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C7]] ; VI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD15]](s32) ; VI: [[AND15:%[0-9]+]]:_(s16) = G_AND [[TRUNC15]], [[C7]] - ; VI: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C8]](s16) - ; VI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL7]] - ; VI: [[MV1:%[0-9]+]]:_(p1) = 
G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16), [[OR6]](s16), [[OR7]](s16) + ; VI: [[SHL9:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C8]](s16) + ; VI: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL9]] + ; VI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; VI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; VI: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C9]](s32) + ; VI: [[OR10:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL10]] + ; VI: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR8]](s16) + ; VI: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; VI: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C9]](s32) + ; VI: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; VI: [[MV1:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR10]](s32), [[OR11]](s32) ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x p1>) = G_BUILD_VECTOR [[MV]](p1), [[MV1]](p1) ; VI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BUILD_VECTOR]](<2 x p1>) ; GFX9-LABEL: name: test_load_constant_v2p1_align1 @@ -9335,9 +10389,18 @@ body: | ; GFX9: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C7]] ; GFX9: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C8]](s16) ; GFX9: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; GFX9: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) - ; GFX9: [[C9:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 - ; GFX9: [[GEP7:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C9]](s64) + ; GFX9: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C9]](s32) + ; GFX9: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; GFX9: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; GFX9: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; GFX9: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C9]](s32) + ; GFX9: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; GFX9: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) + ; GFX9: [[C10:%[0-9]+]]:_(s64) = 
G_CONSTANT i64 8 + ; GFX9: [[GEP7:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C10]](s64) ; GFX9: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p4) :: (load 1, addrspace 4) ; GFX9: [[GEP8:%[0-9]+]]:_(p4) = G_GEP [[GEP7]], [[C]](s64) ; GFX9: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p4) :: (load 1, addrspace 4) @@ -9357,27 +10420,35 @@ body: | ; GFX9: [[AND8:%[0-9]+]]:_(s16) = G_AND [[TRUNC8]], [[C7]] ; GFX9: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD9]](s32) ; GFX9: [[AND9:%[0-9]+]]:_(s16) = G_AND [[TRUNC9]], [[C7]] - ; GFX9: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) - ; GFX9: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL4]] + ; GFX9: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) + ; GFX9: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL6]] ; GFX9: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; GFX9: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C7]] ; GFX9: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD11]](s32) ; GFX9: [[AND11:%[0-9]+]]:_(s16) = G_AND [[TRUNC11]], [[C7]] - ; GFX9: [[SHL5:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) - ; GFX9: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL5]] + ; GFX9: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) + ; GFX9: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL7]] ; GFX9: [[TRUNC12:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD12]](s32) ; GFX9: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C7]] ; GFX9: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD13]](s32) ; GFX9: [[AND13:%[0-9]+]]:_(s16) = G_AND [[TRUNC13]], [[C7]] - ; GFX9: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C8]](s16) - ; GFX9: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL6]] + ; GFX9: [[SHL8:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C8]](s16) + ; GFX9: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL8]] ; GFX9: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; GFX9: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C7]] ; GFX9: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD15]](s32) ; GFX9: [[AND15:%[0-9]+]]:_(s16) = G_AND [[TRUNC15]], [[C7]] - ; GFX9: 
[[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C8]](s16) - ; GFX9: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL7]] - ; GFX9: [[MV1:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16), [[OR6]](s16), [[OR7]](s16) + ; GFX9: [[SHL9:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C8]](s16) + ; GFX9: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL9]] + ; GFX9: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; GFX9: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; GFX9: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C9]](s32) + ; GFX9: [[OR10:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL10]] + ; GFX9: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR8]](s16) + ; GFX9: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; GFX9: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C9]](s32) + ; GFX9: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; GFX9: [[MV1:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR10]](s32), [[OR11]](s32) ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x p1>) = G_BUILD_VECTOR [[MV]](p1), [[MV1]](p1) ; GFX9: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BUILD_VECTOR]](<2 x p1>) ; CI-MESA-LABEL: name: test_load_constant_v2p1_align1 @@ -9438,9 +10509,18 @@ body: | ; CI-MESA: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C8]](s32) ; CI-MESA: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) ; CI-MESA: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; CI-MESA: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) - ; CI-MESA: [[C10:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 - ; CI-MESA: [[GEP7:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C10]](s64) + ; CI-MESA: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI-MESA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI-MESA: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-MESA: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C10]](s32) + ; CI-MESA: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; CI-MESA: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; CI-MESA: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI-MESA: 
[[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C10]](s32) + ; CI-MESA: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; CI-MESA: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) + ; CI-MESA: [[C11:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 + ; CI-MESA: [[GEP7:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C11]](s64) ; CI-MESA: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p4) :: (load 1, addrspace 4) ; CI-MESA: [[GEP8:%[0-9]+]]:_(p4) = G_GEP [[GEP7]], [[C]](s64) ; CI-MESA: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p4) :: (load 1, addrspace 4) @@ -9461,34 +10541,42 @@ body: | ; CI-MESA: [[COPY8:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-MESA: [[COPY9:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32) ; CI-MESA: [[AND9:%[0-9]+]]:_(s32) = G_AND [[COPY9]], [[C9]] - ; CI-MESA: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY8]](s32) - ; CI-MESA: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) - ; CI-MESA: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] + ; CI-MESA: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY8]](s32) + ; CI-MESA: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) + ; CI-MESA: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] ; CI-MESA: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; CI-MESA: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C7]] ; CI-MESA: [[COPY10:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-MESA: [[COPY11:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32) ; CI-MESA: [[AND11:%[0-9]+]]:_(s32) = G_AND [[COPY11]], [[C9]] - ; CI-MESA: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY10]](s32) - ; CI-MESA: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL5]](s32) - ; CI-MESA: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] + ; CI-MESA: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY10]](s32) + ; CI-MESA: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) + ; CI-MESA: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] ; CI-MESA: [[TRUNC12:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD12]](s32) ; CI-MESA: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C7]] ; CI-MESA: 
[[COPY12:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-MESA: [[COPY13:%[0-9]+]]:_(s32) = COPY [[LOAD13]](s32) ; CI-MESA: [[AND13:%[0-9]+]]:_(s32) = G_AND [[COPY13]], [[C9]] - ; CI-MESA: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY12]](s32) - ; CI-MESA: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) - ; CI-MESA: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] + ; CI-MESA: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY12]](s32) + ; CI-MESA: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL8]](s32) + ; CI-MESA: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] ; CI-MESA: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; CI-MESA: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C7]] ; CI-MESA: [[COPY14:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-MESA: [[COPY15:%[0-9]+]]:_(s32) = COPY [[LOAD15]](s32) ; CI-MESA: [[AND15:%[0-9]+]]:_(s32) = G_AND [[COPY15]], [[C9]] - ; CI-MESA: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY14]](s32) - ; CI-MESA: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) - ; CI-MESA: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] - ; CI-MESA: [[MV1:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16), [[OR6]](s16), [[OR7]](s16) + ; CI-MESA: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY14]](s32) + ; CI-MESA: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL9]](s32) + ; CI-MESA: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] + ; CI-MESA: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; CI-MESA: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; CI-MESA: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C10]](s32) + ; CI-MESA: [[OR10:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL10]] + ; CI-MESA: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR8]](s16) + ; CI-MESA: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; CI-MESA: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C10]](s32) + ; CI-MESA: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; CI-MESA: [[MV1:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR10]](s32), [[OR11]](s32) ; CI-MESA: 
[[BUILD_VECTOR:%[0-9]+]]:_(<2 x p1>) = G_BUILD_VECTOR [[MV]](p1), [[MV1]](p1) ; CI-MESA: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BUILD_VECTOR]](<2 x p1>) ; GFX9-MESA-LABEL: name: test_load_constant_v2p1_align1 @@ -9541,9 +10629,18 @@ body: | ; GFX9-MESA: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C7]] ; GFX9-MESA: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C8]](s16) ; GFX9-MESA: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; GFX9-MESA: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) - ; GFX9-MESA: [[C9:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 - ; GFX9-MESA: [[GEP7:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C9]](s64) + ; GFX9-MESA: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9-MESA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9-MESA: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9-MESA: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C9]](s32) + ; GFX9-MESA: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; GFX9-MESA: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; GFX9-MESA: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; GFX9-MESA: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C9]](s32) + ; GFX9-MESA: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; GFX9-MESA: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) + ; GFX9-MESA: [[C10:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 + ; GFX9-MESA: [[GEP7:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C10]](s64) ; GFX9-MESA: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p4) :: (load 1, addrspace 4) ; GFX9-MESA: [[GEP8:%[0-9]+]]:_(p4) = G_GEP [[GEP7]], [[C]](s64) ; GFX9-MESA: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p4) :: (load 1, addrspace 4) @@ -9563,27 +10660,35 @@ body: | ; GFX9-MESA: [[AND8:%[0-9]+]]:_(s16) = G_AND [[TRUNC8]], [[C7]] ; GFX9-MESA: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD9]](s32) ; GFX9-MESA: [[AND9:%[0-9]+]]:_(s16) = G_AND [[TRUNC9]], [[C7]] - ; GFX9-MESA: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) - ; GFX9-MESA: 
[[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL4]] + ; GFX9-MESA: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) + ; GFX9-MESA: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL6]] ; GFX9-MESA: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; GFX9-MESA: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C7]] ; GFX9-MESA: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD11]](s32) ; GFX9-MESA: [[AND11:%[0-9]+]]:_(s16) = G_AND [[TRUNC11]], [[C7]] - ; GFX9-MESA: [[SHL5:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) - ; GFX9-MESA: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL5]] + ; GFX9-MESA: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) + ; GFX9-MESA: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL7]] ; GFX9-MESA: [[TRUNC12:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD12]](s32) ; GFX9-MESA: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C7]] ; GFX9-MESA: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD13]](s32) ; GFX9-MESA: [[AND13:%[0-9]+]]:_(s16) = G_AND [[TRUNC13]], [[C7]] - ; GFX9-MESA: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C8]](s16) - ; GFX9-MESA: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL6]] + ; GFX9-MESA: [[SHL8:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C8]](s16) + ; GFX9-MESA: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL8]] ; GFX9-MESA: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; GFX9-MESA: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C7]] ; GFX9-MESA: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD15]](s32) ; GFX9-MESA: [[AND15:%[0-9]+]]:_(s16) = G_AND [[TRUNC15]], [[C7]] - ; GFX9-MESA: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C8]](s16) - ; GFX9-MESA: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL7]] - ; GFX9-MESA: [[MV1:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16), [[OR6]](s16), [[OR7]](s16) + ; GFX9-MESA: [[SHL9:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C8]](s16) + ; GFX9-MESA: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL9]] + ; GFX9-MESA: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; GFX9-MESA: [[ZEXT5:%[0-9]+]]:_(s32) = 
G_ZEXT [[OR7]](s16) + ; GFX9-MESA: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C9]](s32) + ; GFX9-MESA: [[OR10:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL10]] + ; GFX9-MESA: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR8]](s16) + ; GFX9-MESA: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; GFX9-MESA: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C9]](s32) + ; GFX9-MESA: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; GFX9-MESA: [[MV1:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR10]](s32), [[OR11]](s32) ; GFX9-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x p1>) = G_BUILD_VECTOR [[MV]](p1), [[MV1]](p1) ; GFX9-MESA: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BUILD_VECTOR]](<2 x p1>) %0:_(p4) = COPY $vgpr0_vgpr1 @@ -9689,9 +10794,14 @@ body: | ; CI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) ; CI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[SHL1]](s32) ; CI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[TRUNC3]] - ; CI: [[MV:%[0-9]+]]:_(p3) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; CI: [[C6:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 - ; CI: [[GEP3:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C6]](s64) + ; CI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C6]](s32) + ; CI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; CI: [[INTTOPTR:%[0-9]+]]:_(p3) = G_INTTOPTR [[OR2]](s32) + ; CI: [[C7:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 + ; CI: [[GEP3:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C7]](s64) ; CI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p4) :: (load 1, addrspace 4) ; CI: [[GEP4:%[0-9]+]]:_(p4) = G_GEP [[GEP3]], [[C]](s64) ; CI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p4) :: (load 1, addrspace 4) @@ -9704,19 +10814,23 @@ body: | ; CI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) ; CI: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C5]] - ; CI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], 
[[COPY4]](s32) - ; CI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL2]](s32) - ; CI: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] + ; CI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY4]](s32) + ; CI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) + ; CI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] ; CI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; CI: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; CI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) ; CI: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY7]], [[C5]] - ; CI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY6]](s32) - ; CI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) - ; CI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; CI: [[MV1:%[0-9]+]]:_(p3) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) - ; CI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x p3>) = G_BUILD_VECTOR [[MV]](p3), [[MV1]](p3) + ; CI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY6]](s32) + ; CI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) + ; CI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] + ; CI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; CI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C6]](s32) + ; CI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; CI: [[INTTOPTR1:%[0-9]+]]:_(p3) = G_INTTOPTR [[OR5]](s32) + ; CI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x p3>) = G_BUILD_VECTOR [[INTTOPTR]](p3), [[INTTOPTR1]](p3) ; CI: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<2 x p3>) ; VI-LABEL: name: test_load_constant_v2p3_align1 ; VI: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 @@ -9744,9 +10858,14 @@ body: | ; VI: [[AND3:%[0-9]+]]:_(s16) = G_AND [[TRUNC3]], [[C3]] ; VI: [[SHL1:%[0-9]+]]:_(s16) = G_SHL [[AND3]], [[C4]](s16) ; VI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[SHL1]] - ; VI: [[MV:%[0-9]+]]:_(p3) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; VI: [[C5:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 - ; VI: 
[[GEP3:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C5]](s64) + ; VI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; VI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; VI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C5]](s32) + ; VI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; VI: [[INTTOPTR:%[0-9]+]]:_(p3) = G_INTTOPTR [[OR2]](s32) + ; VI: [[C6:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 + ; VI: [[GEP3:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C6]](s64) ; VI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p4) :: (load 1, addrspace 4) ; VI: [[GEP4:%[0-9]+]]:_(p4) = G_GEP [[GEP3]], [[C]](s64) ; VI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p4) :: (load 1, addrspace 4) @@ -9758,16 +10877,20 @@ body: | ; VI: [[AND4:%[0-9]+]]:_(s16) = G_AND [[TRUNC4]], [[C3]] ; VI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) ; VI: [[AND5:%[0-9]+]]:_(s16) = G_AND [[TRUNC5]], [[C3]] - ; VI: [[SHL2:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) - ; VI: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL2]] + ; VI: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) + ; VI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL3]] ; VI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; VI: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; VI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) ; VI: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C3]] - ; VI: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) - ; VI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; VI: [[MV1:%[0-9]+]]:_(p3) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) - ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x p3>) = G_BUILD_VECTOR [[MV]](p3), [[MV1]](p3) + ; VI: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) + ; VI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL4]] + ; VI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; VI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; VI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C5]](s32) + ; VI: [[OR5:%[0-9]+]]:_(s32) = G_OR 
[[ZEXT2]], [[SHL5]] + ; VI: [[INTTOPTR1:%[0-9]+]]:_(p3) = G_INTTOPTR [[OR5]](s32) + ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x p3>) = G_BUILD_VECTOR [[INTTOPTR]](p3), [[INTTOPTR1]](p3) ; VI: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<2 x p3>) ; GFX9-LABEL: name: test_load_constant_v2p3_align1 ; GFX9: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 @@ -9795,9 +10918,14 @@ body: | ; GFX9: [[AND3:%[0-9]+]]:_(s16) = G_AND [[TRUNC3]], [[C3]] ; GFX9: [[SHL1:%[0-9]+]]:_(s16) = G_SHL [[AND3]], [[C4]](s16) ; GFX9: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[SHL1]] - ; GFX9: [[MV:%[0-9]+]]:_(p3) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; GFX9: [[C5:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 - ; GFX9: [[GEP3:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C5]](s64) + ; GFX9: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C5]](s32) + ; GFX9: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; GFX9: [[INTTOPTR:%[0-9]+]]:_(p3) = G_INTTOPTR [[OR2]](s32) + ; GFX9: [[C6:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 + ; GFX9: [[GEP3:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C6]](s64) ; GFX9: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p4) :: (load 1, addrspace 4) ; GFX9: [[GEP4:%[0-9]+]]:_(p4) = G_GEP [[GEP3]], [[C]](s64) ; GFX9: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p4) :: (load 1, addrspace 4) @@ -9809,16 +10937,20 @@ body: | ; GFX9: [[AND4:%[0-9]+]]:_(s16) = G_AND [[TRUNC4]], [[C3]] ; GFX9: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) ; GFX9: [[AND5:%[0-9]+]]:_(s16) = G_AND [[TRUNC5]], [[C3]] - ; GFX9: [[SHL2:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) - ; GFX9: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL2]] + ; GFX9: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) + ; GFX9: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL3]] ; GFX9: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; GFX9: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; GFX9: 
[[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) ; GFX9: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C3]] - ; GFX9: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) - ; GFX9: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; GFX9: [[MV1:%[0-9]+]]:_(p3) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) - ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x p3>) = G_BUILD_VECTOR [[MV]](p3), [[MV1]](p3) + ; GFX9: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) + ; GFX9: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL4]] + ; GFX9: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; GFX9: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; GFX9: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C5]](s32) + ; GFX9: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; GFX9: [[INTTOPTR1:%[0-9]+]]:_(p3) = G_INTTOPTR [[OR5]](s32) + ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x p3>) = G_BUILD_VECTOR [[INTTOPTR]](p3), [[INTTOPTR1]](p3) ; GFX9: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<2 x p3>) ; CI-MESA-LABEL: name: test_load_constant_v2p3_align1 ; CI-MESA: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 @@ -9850,9 +10982,14 @@ body: | ; CI-MESA: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) ; CI-MESA: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[SHL1]](s32) ; CI-MESA: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[TRUNC3]] - ; CI-MESA: [[MV:%[0-9]+]]:_(p3) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; CI-MESA: [[C6:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 - ; CI-MESA: [[GEP3:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C6]](s64) + ; CI-MESA: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI-MESA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI-MESA: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-MESA: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C6]](s32) + ; CI-MESA: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; CI-MESA: [[INTTOPTR:%[0-9]+]]:_(p3) = G_INTTOPTR [[OR2]](s32) + ; CI-MESA: [[C7:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 + ; CI-MESA: [[GEP3:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C7]](s64) ; 
CI-MESA: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p4) :: (load 1, addrspace 4) ; CI-MESA: [[GEP4:%[0-9]+]]:_(p4) = G_GEP [[GEP3]], [[C]](s64) ; CI-MESA: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p4) :: (load 1, addrspace 4) @@ -9865,19 +11002,23 @@ body: | ; CI-MESA: [[COPY4:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI-MESA: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) ; CI-MESA: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C5]] - ; CI-MESA: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY4]](s32) - ; CI-MESA: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL2]](s32) - ; CI-MESA: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] + ; CI-MESA: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY4]](s32) + ; CI-MESA: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) + ; CI-MESA: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] ; CI-MESA: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; CI-MESA: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; CI-MESA: [[COPY6:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI-MESA: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) ; CI-MESA: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY7]], [[C5]] - ; CI-MESA: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY6]](s32) - ; CI-MESA: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) - ; CI-MESA: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; CI-MESA: [[MV1:%[0-9]+]]:_(p3) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) - ; CI-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x p3>) = G_BUILD_VECTOR [[MV]](p3), [[MV1]](p3) + ; CI-MESA: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY6]](s32) + ; CI-MESA: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) + ; CI-MESA: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] + ; CI-MESA: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI-MESA: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; CI-MESA: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C6]](s32) + ; CI-MESA: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; CI-MESA: [[INTTOPTR1:%[0-9]+]]:_(p3) = G_INTTOPTR 
[[OR5]](s32) + ; CI-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x p3>) = G_BUILD_VECTOR [[INTTOPTR]](p3), [[INTTOPTR1]](p3) ; CI-MESA: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<2 x p3>) ; GFX9-MESA-LABEL: name: test_load_constant_v2p3_align1 ; GFX9-MESA: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 @@ -9905,9 +11046,14 @@ body: | ; GFX9-MESA: [[AND3:%[0-9]+]]:_(s16) = G_AND [[TRUNC3]], [[C3]] ; GFX9-MESA: [[SHL1:%[0-9]+]]:_(s16) = G_SHL [[AND3]], [[C4]](s16) ; GFX9-MESA: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[SHL1]] - ; GFX9-MESA: [[MV:%[0-9]+]]:_(p3) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; GFX9-MESA: [[C5:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 - ; GFX9-MESA: [[GEP3:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C5]](s64) + ; GFX9-MESA: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9-MESA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9-MESA: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9-MESA: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C5]](s32) + ; GFX9-MESA: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; GFX9-MESA: [[INTTOPTR:%[0-9]+]]:_(p3) = G_INTTOPTR [[OR2]](s32) + ; GFX9-MESA: [[C6:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 + ; GFX9-MESA: [[GEP3:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C6]](s64) ; GFX9-MESA: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p4) :: (load 1, addrspace 4) ; GFX9-MESA: [[GEP4:%[0-9]+]]:_(p4) = G_GEP [[GEP3]], [[C]](s64) ; GFX9-MESA: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p4) :: (load 1, addrspace 4) @@ -9919,16 +11065,20 @@ body: | ; GFX9-MESA: [[AND4:%[0-9]+]]:_(s16) = G_AND [[TRUNC4]], [[C3]] ; GFX9-MESA: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) ; GFX9-MESA: [[AND5:%[0-9]+]]:_(s16) = G_AND [[TRUNC5]], [[C3]] - ; GFX9-MESA: [[SHL2:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) - ; GFX9-MESA: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL2]] + ; GFX9-MESA: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) + ; GFX9-MESA: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL3]] ; GFX9-MESA: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC 
[[LOAD6]](s32) ; GFX9-MESA: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; GFX9-MESA: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) ; GFX9-MESA: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C3]] - ; GFX9-MESA: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) - ; GFX9-MESA: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; GFX9-MESA: [[MV1:%[0-9]+]]:_(p3) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) - ; GFX9-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x p3>) = G_BUILD_VECTOR [[MV]](p3), [[MV1]](p3) + ; GFX9-MESA: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) + ; GFX9-MESA: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL4]] + ; GFX9-MESA: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; GFX9-MESA: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; GFX9-MESA: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C5]](s32) + ; GFX9-MESA: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; GFX9-MESA: [[INTTOPTR1:%[0-9]+]]:_(p3) = G_INTTOPTR [[OR5]](s32) + ; GFX9-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x p3>) = G_BUILD_VECTOR [[INTTOPTR]](p3), [[INTTOPTR1]](p3) ; GFX9-MESA: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<2 x p3>) %0:_(p4) = COPY $vgpr0_vgpr1 %1:_(<2 x p3>) = G_LOAD %0 :: (load 8, align 1, addrspace 4) @@ -10220,9 +11370,13 @@ body: | ; CI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) ; CI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[SHL1]](s32) ; CI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[TRUNC3]] - ; CI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; CI: [[C6:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 - ; CI: [[GEP3:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C6]](s64) + ; CI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C6]](s32) + ; CI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; CI: [[C7:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 + ; CI: [[GEP3:%[0-9]+]]:_(p4) = G_GEP [[COPY]], 
[[C7]](s64) ; CI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p4) :: (load 1, addrspace 1) ; CI: [[GEP4:%[0-9]+]]:_(p4) = G_GEP [[GEP3]], [[C]](s64) ; CI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p4) :: (load 1, addrspace 1) @@ -10235,19 +11389,22 @@ body: | ; CI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) ; CI: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C5]] - ; CI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY4]](s32) - ; CI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL2]](s32) - ; CI: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] + ; CI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY4]](s32) + ; CI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) + ; CI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] ; CI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; CI: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; CI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) ; CI: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY7]], [[C5]] - ; CI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY6]](s32) - ; CI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) - ; CI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; CI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) - ; CI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[MV]](s32), [[MV1]](s32) + ; CI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY6]](s32) + ; CI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) + ; CI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] + ; CI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; CI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C6]](s32) + ; CI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; CI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[OR2]](s32), [[OR5]](s32) ; CI: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<2 x s32>) ; VI-LABEL: name: 
test_extload_constant_v2s32_from_4_align1 ; VI: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 @@ -10275,9 +11432,13 @@ body: | ; VI: [[AND3:%[0-9]+]]:_(s16) = G_AND [[TRUNC3]], [[C3]] ; VI: [[SHL1:%[0-9]+]]:_(s16) = G_SHL [[AND3]], [[C4]](s16) ; VI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[SHL1]] - ; VI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; VI: [[C5:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 - ; VI: [[GEP3:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C5]](s64) + ; VI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; VI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; VI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C5]](s32) + ; VI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; VI: [[C6:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 + ; VI: [[GEP3:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C6]](s64) ; VI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p4) :: (load 1, addrspace 1) ; VI: [[GEP4:%[0-9]+]]:_(p4) = G_GEP [[GEP3]], [[C]](s64) ; VI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p4) :: (load 1, addrspace 1) @@ -10289,16 +11450,19 @@ body: | ; VI: [[AND4:%[0-9]+]]:_(s16) = G_AND [[TRUNC4]], [[C3]] ; VI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) ; VI: [[AND5:%[0-9]+]]:_(s16) = G_AND [[TRUNC5]], [[C3]] - ; VI: [[SHL2:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) - ; VI: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL2]] + ; VI: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) + ; VI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL3]] ; VI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; VI: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; VI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) ; VI: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C3]] - ; VI: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) - ; VI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; VI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) - ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = 
G_BUILD_VECTOR [[MV]](s32), [[MV1]](s32) + ; VI: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) + ; VI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL4]] + ; VI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; VI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; VI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C5]](s32) + ; VI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[OR2]](s32), [[OR5]](s32) ; VI: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<2 x s32>) ; GFX9-LABEL: name: test_extload_constant_v2s32_from_4_align1 ; GFX9: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 @@ -10326,9 +11490,13 @@ body: | ; GFX9: [[AND3:%[0-9]+]]:_(s16) = G_AND [[TRUNC3]], [[C3]] ; GFX9: [[SHL1:%[0-9]+]]:_(s16) = G_SHL [[AND3]], [[C4]](s16) ; GFX9: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[SHL1]] - ; GFX9: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; GFX9: [[C5:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 - ; GFX9: [[GEP3:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C5]](s64) + ; GFX9: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C5]](s32) + ; GFX9: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; GFX9: [[C6:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 + ; GFX9: [[GEP3:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C6]](s64) ; GFX9: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p4) :: (load 1, addrspace 1) ; GFX9: [[GEP4:%[0-9]+]]:_(p4) = G_GEP [[GEP3]], [[C]](s64) ; GFX9: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p4) :: (load 1, addrspace 1) @@ -10340,16 +11508,19 @@ body: | ; GFX9: [[AND4:%[0-9]+]]:_(s16) = G_AND [[TRUNC4]], [[C3]] ; GFX9: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) ; GFX9: [[AND5:%[0-9]+]]:_(s16) = G_AND [[TRUNC5]], [[C3]] - ; GFX9: [[SHL2:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) - ; GFX9: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], 
[[SHL2]] + ; GFX9: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) + ; GFX9: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL3]] ; GFX9: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; GFX9: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; GFX9: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) ; GFX9: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C3]] - ; GFX9: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) - ; GFX9: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; GFX9: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) - ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[MV]](s32), [[MV1]](s32) + ; GFX9: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) + ; GFX9: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL4]] + ; GFX9: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; GFX9: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; GFX9: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C5]](s32) + ; GFX9: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[OR2]](s32), [[OR5]](s32) ; GFX9: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<2 x s32>) ; CI-MESA-LABEL: name: test_extload_constant_v2s32_from_4_align1 ; CI-MESA: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 @@ -10381,9 +11552,13 @@ body: | ; CI-MESA: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) ; CI-MESA: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[SHL1]](s32) ; CI-MESA: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[TRUNC3]] - ; CI-MESA: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; CI-MESA: [[C6:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 - ; CI-MESA: [[GEP3:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C6]](s64) + ; CI-MESA: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI-MESA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI-MESA: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-MESA: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C6]](s32) + ; CI-MESA: [[OR2:%[0-9]+]]:_(s32) = G_OR 
[[ZEXT]], [[SHL2]] + ; CI-MESA: [[C7:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 + ; CI-MESA: [[GEP3:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C7]](s64) ; CI-MESA: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p4) :: (load 1, addrspace 1) ; CI-MESA: [[GEP4:%[0-9]+]]:_(p4) = G_GEP [[GEP3]], [[C]](s64) ; CI-MESA: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p4) :: (load 1, addrspace 1) @@ -10396,19 +11571,22 @@ body: | ; CI-MESA: [[COPY4:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI-MESA: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) ; CI-MESA: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C5]] - ; CI-MESA: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY4]](s32) - ; CI-MESA: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL2]](s32) - ; CI-MESA: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] + ; CI-MESA: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY4]](s32) + ; CI-MESA: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) + ; CI-MESA: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] ; CI-MESA: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; CI-MESA: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; CI-MESA: [[COPY6:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI-MESA: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) ; CI-MESA: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY7]], [[C5]] - ; CI-MESA: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY6]](s32) - ; CI-MESA: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) - ; CI-MESA: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; CI-MESA: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) - ; CI-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[MV]](s32), [[MV1]](s32) + ; CI-MESA: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY6]](s32) + ; CI-MESA: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) + ; CI-MESA: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] + ; CI-MESA: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI-MESA: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; CI-MESA: [[SHL5:%[0-9]+]]:_(s32) = 
G_SHL [[ZEXT3]], [[C6]](s32) + ; CI-MESA: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; CI-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[OR2]](s32), [[OR5]](s32) ; CI-MESA: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<2 x s32>) ; GFX9-MESA-LABEL: name: test_extload_constant_v2s32_from_4_align1 ; GFX9-MESA: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 @@ -10436,9 +11614,13 @@ body: | ; GFX9-MESA: [[AND3:%[0-9]+]]:_(s16) = G_AND [[TRUNC3]], [[C3]] ; GFX9-MESA: [[SHL1:%[0-9]+]]:_(s16) = G_SHL [[AND3]], [[C4]](s16) ; GFX9-MESA: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[SHL1]] - ; GFX9-MESA: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; GFX9-MESA: [[C5:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 - ; GFX9-MESA: [[GEP3:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C5]](s64) + ; GFX9-MESA: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9-MESA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9-MESA: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9-MESA: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C5]](s32) + ; GFX9-MESA: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; GFX9-MESA: [[C6:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 + ; GFX9-MESA: [[GEP3:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C6]](s64) ; GFX9-MESA: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p4) :: (load 1, addrspace 1) ; GFX9-MESA: [[GEP4:%[0-9]+]]:_(p4) = G_GEP [[GEP3]], [[C]](s64) ; GFX9-MESA: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p4) :: (load 1, addrspace 1) @@ -10450,16 +11632,19 @@ body: | ; GFX9-MESA: [[AND4:%[0-9]+]]:_(s16) = G_AND [[TRUNC4]], [[C3]] ; GFX9-MESA: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) ; GFX9-MESA: [[AND5:%[0-9]+]]:_(s16) = G_AND [[TRUNC5]], [[C3]] - ; GFX9-MESA: [[SHL2:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) - ; GFX9-MESA: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL2]] + ; GFX9-MESA: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) + ; GFX9-MESA: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL3]] ; GFX9-MESA: 
[[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; GFX9-MESA: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; GFX9-MESA: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) ; GFX9-MESA: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C3]] - ; GFX9-MESA: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) - ; GFX9-MESA: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; GFX9-MESA: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) - ; GFX9-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[MV]](s32), [[MV1]](s32) + ; GFX9-MESA: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) + ; GFX9-MESA: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL4]] + ; GFX9-MESA: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; GFX9-MESA: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; GFX9-MESA: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C5]](s32) + ; GFX9-MESA: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; GFX9-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[OR2]](s32), [[OR5]](s32) ; GFX9-MESA: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<2 x s32>) %0:_(p4) = COPY $vgpr0_vgpr1 %1:_(<2 x s32>) = G_LOAD %0 :: (load 4, align 1, addrspace 1) @@ -10475,97 +11660,137 @@ body: | ; CI-LABEL: name: test_extload_constant_v2s32_from_4_align2 ; CI: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 ; CI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p4) :: (load 2, addrspace 1) - ; CI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; CI: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; CI: [[GEP:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C]](s64) ; CI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p4) :: (load 2, addrspace 1) - ; CI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; CI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; CI: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 - ; CI: [[GEP1:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C1]](s64) + ; CI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; CI: [[COPY1:%[0-9]+]]:_(s32) = COPY 
[[LOAD]](s32) + ; CI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; CI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; CI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; CI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; CI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; CI: [[C3:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 + ; CI: [[GEP1:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C3]](s64) ; CI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p4) :: (load 2, addrspace 1) - ; CI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; CI: [[GEP2:%[0-9]+]]:_(p4) = G_GEP [[GEP1]], [[C]](s64) ; CI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p4) :: (load 2, addrspace 1) - ; CI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; CI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC2]](s16), [[TRUNC3]](s16) - ; CI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[MV]](s32), [[MV1]](s32) + ; CI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; CI: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C1]] + ; CI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; CI: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C1]] + ; CI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C2]](s32) + ; CI: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; CI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[OR]](s32), [[OR1]](s32) ; CI: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<2 x s32>) ; VI-LABEL: name: test_extload_constant_v2s32_from_4_align2 ; VI: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p4) :: (load 2, addrspace 1) - ; VI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; VI: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; VI: [[GEP:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C]](s64) ; VI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p4) :: (load 2, addrspace 1) - ; VI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; VI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC]](s16), 
[[TRUNC1]](s16) - ; VI: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 - ; VI: [[GEP1:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C1]](s64) + ; VI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; VI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; VI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; VI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; VI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; VI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; VI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; VI: [[C3:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 + ; VI: [[GEP1:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C3]](s64) ; VI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p4) :: (load 2, addrspace 1) - ; VI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; VI: [[GEP2:%[0-9]+]]:_(p4) = G_GEP [[GEP1]], [[C]](s64) ; VI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p4) :: (load 2, addrspace 1) - ; VI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; VI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC2]](s16), [[TRUNC3]](s16) - ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[MV]](s32), [[MV1]](s32) + ; VI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; VI: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C1]] + ; VI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; VI: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C1]] + ; VI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C2]](s32) + ; VI: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[OR]](s32), [[OR1]](s32) ; VI: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<2 x s32>) ; GFX9-LABEL: name: test_extload_constant_v2s32_from_4_align2 ; GFX9: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 ; GFX9: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p4) :: (load 2, addrspace 1) - ; GFX9: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; GFX9: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; GFX9: [[GEP:%[0-9]+]]:_(p4) = 
G_GEP [[COPY]], [[C]](s64) ; GFX9: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p4) :: (load 2, addrspace 1) - ; GFX9: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; GFX9: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; GFX9: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 - ; GFX9: [[GEP1:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C1]](s64) + ; GFX9: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; GFX9: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; GFX9: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; GFX9: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; GFX9: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; GFX9: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; GFX9: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; GFX9: [[C3:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 + ; GFX9: [[GEP1:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C3]](s64) ; GFX9: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p4) :: (load 2, addrspace 1) - ; GFX9: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; GFX9: [[GEP2:%[0-9]+]]:_(p4) = G_GEP [[GEP1]], [[C]](s64) ; GFX9: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p4) :: (load 2, addrspace 1) - ; GFX9: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; GFX9: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC2]](s16), [[TRUNC3]](s16) - ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[MV]](s32), [[MV1]](s32) + ; GFX9: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; GFX9: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C1]] + ; GFX9: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; GFX9: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C1]] + ; GFX9: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C2]](s32) + ; GFX9: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[OR]](s32), [[OR1]](s32) ; GFX9: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<2 x s32>) ; CI-MESA-LABEL: name: 
test_extload_constant_v2s32_from_4_align2 ; CI-MESA: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 ; CI-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p4) :: (load 2, addrspace 1) - ; CI-MESA: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; CI-MESA: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; CI-MESA: [[GEP:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C]](s64) ; CI-MESA: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p4) :: (load 2, addrspace 1) - ; CI-MESA: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; CI-MESA: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; CI-MESA: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 - ; CI-MESA: [[GEP1:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C1]](s64) + ; CI-MESA: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; CI-MESA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; CI-MESA: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; CI-MESA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; CI-MESA: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; CI-MESA: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-MESA: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; CI-MESA: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; CI-MESA: [[C3:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 + ; CI-MESA: [[GEP1:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C3]](s64) ; CI-MESA: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p4) :: (load 2, addrspace 1) - ; CI-MESA: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; CI-MESA: [[GEP2:%[0-9]+]]:_(p4) = G_GEP [[GEP1]], [[C]](s64) ; CI-MESA: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p4) :: (load 2, addrspace 1) - ; CI-MESA: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; CI-MESA: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC2]](s16), [[TRUNC3]](s16) - ; CI-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[MV]](s32), [[MV1]](s32) + ; CI-MESA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; CI-MESA: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C1]] + ; CI-MESA: 
[[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; CI-MESA: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C1]] + ; CI-MESA: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C2]](s32) + ; CI-MESA: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; CI-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[OR]](s32), [[OR1]](s32) ; CI-MESA: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<2 x s32>) ; GFX9-MESA-LABEL: name: test_extload_constant_v2s32_from_4_align2 ; GFX9-MESA: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 ; GFX9-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p4) :: (load 2, addrspace 1) - ; GFX9-MESA: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; GFX9-MESA: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; GFX9-MESA: [[GEP:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C]](s64) ; GFX9-MESA: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p4) :: (load 2, addrspace 1) - ; GFX9-MESA: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; GFX9-MESA: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; GFX9-MESA: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 - ; GFX9-MESA: [[GEP1:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C1]](s64) + ; GFX9-MESA: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; GFX9-MESA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; GFX9-MESA: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; GFX9-MESA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; GFX9-MESA: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; GFX9-MESA: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9-MESA: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; GFX9-MESA: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; GFX9-MESA: [[C3:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 + ; GFX9-MESA: [[GEP1:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C3]](s64) ; GFX9-MESA: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p4) :: (load 2, addrspace 1) - ; GFX9-MESA: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; GFX9-MESA: [[GEP2:%[0-9]+]]:_(p4) = G_GEP [[GEP1]], [[C]](s64) ; GFX9-MESA: 
[[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p4) :: (load 2, addrspace 1) - ; GFX9-MESA: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; GFX9-MESA: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC2]](s16), [[TRUNC3]](s16) - ; GFX9-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[MV]](s32), [[MV1]](s32) + ; GFX9-MESA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; GFX9-MESA: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C1]] + ; GFX9-MESA: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; GFX9-MESA: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C1]] + ; GFX9-MESA: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C2]](s32) + ; GFX9-MESA: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; GFX9-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[OR]](s32), [[OR1]](s32) ; GFX9-MESA: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<2 x s32>) %0:_(p4) = COPY $vgpr0_vgpr1 %1:_(<2 x s32>) = G_LOAD %0 :: (load 4, align 2, addrspace 1) @@ -10842,9 +12067,22 @@ body: | ; CI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[C12]](s32) ; CI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL5]](s32) ; CI: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] - ; CI: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16), [[OR4]](s16), [[OR5]](s16) - ; CI: [[C14:%[0-9]+]]:_(s64) = G_CONSTANT i64 12 - ; CI: [[GEP11:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C14]](s64) + ; CI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI: [[C14:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C14]](s32) + ; CI: [[OR6:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL6]] + ; CI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; CI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C14]](s32) + ; CI: [[OR7:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL7]] + ; CI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; CI: 
[[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR5]](s16) + ; CI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C14]](s32) + ; CI: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL8]] + ; CI: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR6]](s32), [[OR7]](s32), [[OR8]](s32) + ; CI: [[C15:%[0-9]+]]:_(s64) = G_CONSTANT i64 12 + ; CI: [[GEP11:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C15]](s64) ; CI: [[LOAD12:%[0-9]+]]:_(s32) = G_LOAD [[GEP11]](p4) :: (load 1, addrspace 1) ; CI: [[GEP12:%[0-9]+]]:_(p4) = G_GEP [[GEP11]], [[C]](s64) ; CI: [[LOAD13:%[0-9]+]]:_(s32) = G_LOAD [[GEP12]](p4) :: (load 1, addrspace 1) @@ -10873,50 +12111,62 @@ body: | ; CI: [[COPY12:%[0-9]+]]:_(s32) = COPY [[C12]](s32) ; CI: [[COPY13:%[0-9]+]]:_(s32) = COPY [[LOAD13]](s32) ; CI: [[AND13:%[0-9]+]]:_(s32) = G_AND [[COPY13]], [[C13]] - ; CI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY12]](s32) - ; CI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) - ; CI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] + ; CI: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY12]](s32) + ; CI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL9]](s32) + ; CI: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] ; CI: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; CI: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C11]] ; CI: [[COPY14:%[0-9]+]]:_(s32) = COPY [[C12]](s32) ; CI: [[COPY15:%[0-9]+]]:_(s32) = COPY [[LOAD15]](s32) ; CI: [[AND15:%[0-9]+]]:_(s32) = G_AND [[COPY15]], [[C13]] - ; CI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY14]](s32) - ; CI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) - ; CI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] + ; CI: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY14]](s32) + ; CI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL10]](s32) + ; CI: [[OR10:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] ; CI: [[TRUNC16:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD16]](s32) ; CI: [[AND16:%[0-9]+]]:_(s16) = G_AND [[TRUNC16]], [[C11]] ; CI: [[COPY16:%[0-9]+]]:_(s32) = COPY [[C12]](s32) ; 
CI: [[COPY17:%[0-9]+]]:_(s32) = COPY [[LOAD17]](s32) ; CI: [[AND17:%[0-9]+]]:_(s32) = G_AND [[COPY17]], [[C13]] - ; CI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[AND17]], [[COPY16]](s32) - ; CI: [[TRUNC17:%[0-9]+]]:_(s16) = G_TRUNC [[SHL8]](s32) - ; CI: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[TRUNC17]] + ; CI: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[AND17]], [[COPY16]](s32) + ; CI: [[TRUNC17:%[0-9]+]]:_(s16) = G_TRUNC [[SHL11]](s32) + ; CI: [[OR11:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[TRUNC17]] ; CI: [[TRUNC18:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD18]](s32) ; CI: [[AND18:%[0-9]+]]:_(s16) = G_AND [[TRUNC18]], [[C11]] ; CI: [[COPY18:%[0-9]+]]:_(s32) = COPY [[C12]](s32) ; CI: [[COPY19:%[0-9]+]]:_(s32) = COPY [[LOAD19]](s32) ; CI: [[AND19:%[0-9]+]]:_(s32) = G_AND [[COPY19]], [[C13]] - ; CI: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[AND19]], [[COPY18]](s32) - ; CI: [[TRUNC19:%[0-9]+]]:_(s16) = G_TRUNC [[SHL9]](s32) - ; CI: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[TRUNC19]] + ; CI: [[SHL12:%[0-9]+]]:_(s32) = G_SHL [[AND19]], [[COPY18]](s32) + ; CI: [[TRUNC19:%[0-9]+]]:_(s16) = G_TRUNC [[SHL12]](s32) + ; CI: [[OR12:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[TRUNC19]] ; CI: [[TRUNC20:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD20]](s32) ; CI: [[AND20:%[0-9]+]]:_(s16) = G_AND [[TRUNC20]], [[C11]] ; CI: [[COPY20:%[0-9]+]]:_(s32) = COPY [[C12]](s32) ; CI: [[COPY21:%[0-9]+]]:_(s32) = COPY [[LOAD21]](s32) ; CI: [[AND21:%[0-9]+]]:_(s32) = G_AND [[COPY21]], [[C13]] - ; CI: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[AND21]], [[COPY20]](s32) - ; CI: [[TRUNC21:%[0-9]+]]:_(s16) = G_TRUNC [[SHL10]](s32) - ; CI: [[OR10:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[TRUNC21]] + ; CI: [[SHL13:%[0-9]+]]:_(s32) = G_SHL [[AND21]], [[COPY20]](s32) + ; CI: [[TRUNC21:%[0-9]+]]:_(s16) = G_TRUNC [[SHL13]](s32) + ; CI: [[OR13:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[TRUNC21]] ; CI: [[TRUNC22:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD22]](s32) ; CI: [[AND22:%[0-9]+]]:_(s16) = G_AND [[TRUNC22]], [[C11]] ; CI: [[COPY22:%[0-9]+]]:_(s32) = COPY [[C12]](s32) ; CI: 
[[COPY23:%[0-9]+]]:_(s32) = COPY [[LOAD23]](s32) ; CI: [[AND23:%[0-9]+]]:_(s32) = G_AND [[COPY23]], [[C13]] - ; CI: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[AND23]], [[COPY22]](s32) - ; CI: [[TRUNC23:%[0-9]+]]:_(s16) = G_TRUNC [[SHL11]](s32) - ; CI: [[OR11:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[TRUNC23]] - ; CI: [[MV1:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR6]](s16), [[OR7]](s16), [[OR8]](s16), [[OR9]](s16), [[OR10]](s16), [[OR11]](s16) + ; CI: [[SHL14:%[0-9]+]]:_(s32) = G_SHL [[AND23]], [[COPY22]](s32) + ; CI: [[TRUNC23:%[0-9]+]]:_(s16) = G_TRUNC [[SHL14]](s32) + ; CI: [[OR14:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[TRUNC23]] + ; CI: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; CI: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR10]](s16) + ; CI: [[SHL15:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C14]](s32) + ; CI: [[OR15:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL15]] + ; CI: [[ZEXT8:%[0-9]+]]:_(s32) = G_ZEXT [[OR11]](s16) + ; CI: [[ZEXT9:%[0-9]+]]:_(s32) = G_ZEXT [[OR12]](s16) + ; CI: [[SHL16:%[0-9]+]]:_(s32) = G_SHL [[ZEXT9]], [[C14]](s32) + ; CI: [[OR16:%[0-9]+]]:_(s32) = G_OR [[ZEXT8]], [[SHL16]] + ; CI: [[ZEXT10:%[0-9]+]]:_(s32) = G_ZEXT [[OR13]](s16) + ; CI: [[ZEXT11:%[0-9]+]]:_(s32) = G_ZEXT [[OR14]](s16) + ; CI: [[SHL17:%[0-9]+]]:_(s32) = G_SHL [[ZEXT11]], [[C14]](s32) + ; CI: [[OR17:%[0-9]+]]:_(s32) = G_OR [[ZEXT10]], [[SHL17]] + ; CI: [[MV1:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR15]](s32), [[OR16]](s32), [[OR17]](s32) ; CI: [[COPY24:%[0-9]+]]:_(s96) = COPY [[MV]](s96) ; CI: [[COPY25:%[0-9]+]]:_(s96) = COPY [[MV1]](s96) ; CI: $vgpr0_vgpr1_vgpr2 = COPY [[COPY24]](s96) @@ -10995,9 +12245,22 @@ body: | ; VI: [[AND11:%[0-9]+]]:_(s16) = G_AND [[TRUNC11]], [[C11]] ; VI: [[SHL5:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C12]](s16) ; VI: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL5]] - ; VI: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16), [[OR4]](s16), [[OR5]](s16) - ; VI: [[C13:%[0-9]+]]:_(s64) = G_CONSTANT i64 12 - ; VI: 
[[GEP11:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C13]](s64) + ; VI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; VI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; VI: [[C13:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C13]](s32) + ; VI: [[OR6:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL6]] + ; VI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; VI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; VI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C13]](s32) + ; VI: [[OR7:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL7]] + ; VI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; VI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR5]](s16) + ; VI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C13]](s32) + ; VI: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL8]] + ; VI: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR6]](s32), [[OR7]](s32), [[OR8]](s32) + ; VI: [[C14:%[0-9]+]]:_(s64) = G_CONSTANT i64 12 + ; VI: [[GEP11:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C14]](s64) ; VI: [[LOAD12:%[0-9]+]]:_(s32) = G_LOAD [[GEP11]](p4) :: (load 1, addrspace 1) ; VI: [[GEP12:%[0-9]+]]:_(p4) = G_GEP [[GEP11]], [[C]](s64) ; VI: [[LOAD13:%[0-9]+]]:_(s32) = G_LOAD [[GEP12]](p4) :: (load 1, addrspace 1) @@ -11025,39 +12288,51 @@ body: | ; VI: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C11]] ; VI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD13]](s32) ; VI: [[AND13:%[0-9]+]]:_(s16) = G_AND [[TRUNC13]], [[C11]] - ; VI: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C12]](s16) - ; VI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL6]] + ; VI: [[SHL9:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C12]](s16) + ; VI: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL9]] ; VI: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; VI: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C11]] ; VI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD15]](s32) ; VI: [[AND15:%[0-9]+]]:_(s16) = G_AND [[TRUNC15]], [[C11]] - ; VI: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C12]](s16) - ; VI: 
[[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL7]] + ; VI: [[SHL10:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C12]](s16) + ; VI: [[OR10:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL10]] ; VI: [[TRUNC16:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD16]](s32) ; VI: [[AND16:%[0-9]+]]:_(s16) = G_AND [[TRUNC16]], [[C11]] ; VI: [[TRUNC17:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD17]](s32) ; VI: [[AND17:%[0-9]+]]:_(s16) = G_AND [[TRUNC17]], [[C11]] - ; VI: [[SHL8:%[0-9]+]]:_(s16) = G_SHL [[AND17]], [[C12]](s16) - ; VI: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[SHL8]] + ; VI: [[SHL11:%[0-9]+]]:_(s16) = G_SHL [[AND17]], [[C12]](s16) + ; VI: [[OR11:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[SHL11]] ; VI: [[TRUNC18:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD18]](s32) ; VI: [[AND18:%[0-9]+]]:_(s16) = G_AND [[TRUNC18]], [[C11]] ; VI: [[TRUNC19:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD19]](s32) ; VI: [[AND19:%[0-9]+]]:_(s16) = G_AND [[TRUNC19]], [[C11]] - ; VI: [[SHL9:%[0-9]+]]:_(s16) = G_SHL [[AND19]], [[C12]](s16) - ; VI: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[SHL9]] + ; VI: [[SHL12:%[0-9]+]]:_(s16) = G_SHL [[AND19]], [[C12]](s16) + ; VI: [[OR12:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[SHL12]] ; VI: [[TRUNC20:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD20]](s32) ; VI: [[AND20:%[0-9]+]]:_(s16) = G_AND [[TRUNC20]], [[C11]] ; VI: [[TRUNC21:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD21]](s32) ; VI: [[AND21:%[0-9]+]]:_(s16) = G_AND [[TRUNC21]], [[C11]] - ; VI: [[SHL10:%[0-9]+]]:_(s16) = G_SHL [[AND21]], [[C12]](s16) - ; VI: [[OR10:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[SHL10]] + ; VI: [[SHL13:%[0-9]+]]:_(s16) = G_SHL [[AND21]], [[C12]](s16) + ; VI: [[OR13:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[SHL13]] ; VI: [[TRUNC22:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD22]](s32) ; VI: [[AND22:%[0-9]+]]:_(s16) = G_AND [[TRUNC22]], [[C11]] ; VI: [[TRUNC23:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD23]](s32) ; VI: [[AND23:%[0-9]+]]:_(s16) = G_AND [[TRUNC23]], [[C11]] - ; VI: [[SHL11:%[0-9]+]]:_(s16) = G_SHL [[AND23]], [[C12]](s16) - ; VI: [[OR11:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[SHL11]] - ; 
VI: [[MV1:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR6]](s16), [[OR7]](s16), [[OR8]](s16), [[OR9]](s16), [[OR10]](s16), [[OR11]](s16) + ; VI: [[SHL14:%[0-9]+]]:_(s16) = G_SHL [[AND23]], [[C12]](s16) + ; VI: [[OR14:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[SHL14]] + ; VI: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; VI: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR10]](s16) + ; VI: [[SHL15:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C13]](s32) + ; VI: [[OR15:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL15]] + ; VI: [[ZEXT8:%[0-9]+]]:_(s32) = G_ZEXT [[OR11]](s16) + ; VI: [[ZEXT9:%[0-9]+]]:_(s32) = G_ZEXT [[OR12]](s16) + ; VI: [[SHL16:%[0-9]+]]:_(s32) = G_SHL [[ZEXT9]], [[C13]](s32) + ; VI: [[OR16:%[0-9]+]]:_(s32) = G_OR [[ZEXT8]], [[SHL16]] + ; VI: [[ZEXT10:%[0-9]+]]:_(s32) = G_ZEXT [[OR13]](s16) + ; VI: [[ZEXT11:%[0-9]+]]:_(s32) = G_ZEXT [[OR14]](s16) + ; VI: [[SHL17:%[0-9]+]]:_(s32) = G_SHL [[ZEXT11]], [[C13]](s32) + ; VI: [[OR17:%[0-9]+]]:_(s32) = G_OR [[ZEXT10]], [[SHL17]] + ; VI: [[MV1:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR15]](s32), [[OR16]](s32), [[OR17]](s32) ; VI: [[COPY1:%[0-9]+]]:_(s96) = COPY [[MV]](s96) ; VI: [[COPY2:%[0-9]+]]:_(s96) = COPY [[MV1]](s96) ; VI: $vgpr0_vgpr1_vgpr2 = COPY [[COPY1]](s96) @@ -11136,9 +12411,22 @@ body: | ; GFX9: [[AND11:%[0-9]+]]:_(s16) = G_AND [[TRUNC11]], [[C11]] ; GFX9: [[SHL5:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C12]](s16) ; GFX9: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL5]] - ; GFX9: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16), [[OR4]](s16), [[OR5]](s16) - ; GFX9: [[C13:%[0-9]+]]:_(s64) = G_CONSTANT i64 12 - ; GFX9: [[GEP11:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C13]](s64) + ; GFX9: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9: [[C13:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C13]](s32) + ; GFX9: [[OR6:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL6]] + ; GFX9: [[ZEXT2:%[0-9]+]]:_(s32) 
= G_ZEXT [[OR2]](s16) + ; GFX9: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; GFX9: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C13]](s32) + ; GFX9: [[OR7:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL7]] + ; GFX9: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; GFX9: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR5]](s16) + ; GFX9: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C13]](s32) + ; GFX9: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL8]] + ; GFX9: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR6]](s32), [[OR7]](s32), [[OR8]](s32) + ; GFX9: [[C14:%[0-9]+]]:_(s64) = G_CONSTANT i64 12 + ; GFX9: [[GEP11:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C14]](s64) ; GFX9: [[LOAD12:%[0-9]+]]:_(s32) = G_LOAD [[GEP11]](p4) :: (load 1, addrspace 1) ; GFX9: [[GEP12:%[0-9]+]]:_(p4) = G_GEP [[GEP11]], [[C]](s64) ; GFX9: [[LOAD13:%[0-9]+]]:_(s32) = G_LOAD [[GEP12]](p4) :: (load 1, addrspace 1) @@ -11166,39 +12454,51 @@ body: | ; GFX9: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C11]] ; GFX9: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD13]](s32) ; GFX9: [[AND13:%[0-9]+]]:_(s16) = G_AND [[TRUNC13]], [[C11]] - ; GFX9: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C12]](s16) - ; GFX9: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL6]] + ; GFX9: [[SHL9:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C12]](s16) + ; GFX9: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL9]] ; GFX9: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; GFX9: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C11]] ; GFX9: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD15]](s32) ; GFX9: [[AND15:%[0-9]+]]:_(s16) = G_AND [[TRUNC15]], [[C11]] - ; GFX9: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C12]](s16) - ; GFX9: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL7]] + ; GFX9: [[SHL10:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C12]](s16) + ; GFX9: [[OR10:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL10]] ; GFX9: [[TRUNC16:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD16]](s32) ; GFX9: [[AND16:%[0-9]+]]:_(s16) = G_AND [[TRUNC16]], [[C11]] ; GFX9: 
[[TRUNC17:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD17]](s32) ; GFX9: [[AND17:%[0-9]+]]:_(s16) = G_AND [[TRUNC17]], [[C11]] - ; GFX9: [[SHL8:%[0-9]+]]:_(s16) = G_SHL [[AND17]], [[C12]](s16) - ; GFX9: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[SHL8]] + ; GFX9: [[SHL11:%[0-9]+]]:_(s16) = G_SHL [[AND17]], [[C12]](s16) + ; GFX9: [[OR11:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[SHL11]] ; GFX9: [[TRUNC18:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD18]](s32) ; GFX9: [[AND18:%[0-9]+]]:_(s16) = G_AND [[TRUNC18]], [[C11]] ; GFX9: [[TRUNC19:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD19]](s32) ; GFX9: [[AND19:%[0-9]+]]:_(s16) = G_AND [[TRUNC19]], [[C11]] - ; GFX9: [[SHL9:%[0-9]+]]:_(s16) = G_SHL [[AND19]], [[C12]](s16) - ; GFX9: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[SHL9]] + ; GFX9: [[SHL12:%[0-9]+]]:_(s16) = G_SHL [[AND19]], [[C12]](s16) + ; GFX9: [[OR12:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[SHL12]] ; GFX9: [[TRUNC20:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD20]](s32) ; GFX9: [[AND20:%[0-9]+]]:_(s16) = G_AND [[TRUNC20]], [[C11]] ; GFX9: [[TRUNC21:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD21]](s32) ; GFX9: [[AND21:%[0-9]+]]:_(s16) = G_AND [[TRUNC21]], [[C11]] - ; GFX9: [[SHL10:%[0-9]+]]:_(s16) = G_SHL [[AND21]], [[C12]](s16) - ; GFX9: [[OR10:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[SHL10]] + ; GFX9: [[SHL13:%[0-9]+]]:_(s16) = G_SHL [[AND21]], [[C12]](s16) + ; GFX9: [[OR13:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[SHL13]] ; GFX9: [[TRUNC22:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD22]](s32) ; GFX9: [[AND22:%[0-9]+]]:_(s16) = G_AND [[TRUNC22]], [[C11]] ; GFX9: [[TRUNC23:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD23]](s32) ; GFX9: [[AND23:%[0-9]+]]:_(s16) = G_AND [[TRUNC23]], [[C11]] - ; GFX9: [[SHL11:%[0-9]+]]:_(s16) = G_SHL [[AND23]], [[C12]](s16) - ; GFX9: [[OR11:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[SHL11]] - ; GFX9: [[MV1:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR6]](s16), [[OR7]](s16), [[OR8]](s16), [[OR9]](s16), [[OR10]](s16), [[OR11]](s16) + ; GFX9: [[SHL14:%[0-9]+]]:_(s16) = G_SHL [[AND23]], [[C12]](s16) + ; GFX9: [[OR14:%[0-9]+]]:_(s16) = G_OR 
[[AND22]], [[SHL14]] + ; GFX9: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; GFX9: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR10]](s16) + ; GFX9: [[SHL15:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C13]](s32) + ; GFX9: [[OR15:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL15]] + ; GFX9: [[ZEXT8:%[0-9]+]]:_(s32) = G_ZEXT [[OR11]](s16) + ; GFX9: [[ZEXT9:%[0-9]+]]:_(s32) = G_ZEXT [[OR12]](s16) + ; GFX9: [[SHL16:%[0-9]+]]:_(s32) = G_SHL [[ZEXT9]], [[C13]](s32) + ; GFX9: [[OR16:%[0-9]+]]:_(s32) = G_OR [[ZEXT8]], [[SHL16]] + ; GFX9: [[ZEXT10:%[0-9]+]]:_(s32) = G_ZEXT [[OR13]](s16) + ; GFX9: [[ZEXT11:%[0-9]+]]:_(s32) = G_ZEXT [[OR14]](s16) + ; GFX9: [[SHL17:%[0-9]+]]:_(s32) = G_SHL [[ZEXT11]], [[C13]](s32) + ; GFX9: [[OR17:%[0-9]+]]:_(s32) = G_OR [[ZEXT10]], [[SHL17]] + ; GFX9: [[MV1:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR15]](s32), [[OR16]](s32), [[OR17]](s32) ; GFX9: [[COPY1:%[0-9]+]]:_(s96) = COPY [[MV]](s96) ; GFX9: [[COPY2:%[0-9]+]]:_(s96) = COPY [[MV1]](s96) ; GFX9: $vgpr0_vgpr1_vgpr2 = COPY [[COPY1]](s96) @@ -11289,9 +12589,22 @@ body: | ; CI-MESA: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[C12]](s32) ; CI-MESA: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL5]](s32) ; CI-MESA: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] - ; CI-MESA: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16), [[OR4]](s16), [[OR5]](s16) - ; CI-MESA: [[C14:%[0-9]+]]:_(s64) = G_CONSTANT i64 12 - ; CI-MESA: [[GEP11:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C14]](s64) + ; CI-MESA: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI-MESA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI-MESA: [[C14:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-MESA: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C14]](s32) + ; CI-MESA: [[OR6:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL6]] + ; CI-MESA: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; CI-MESA: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI-MESA: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C14]](s32) + ; CI-MESA: 
[[OR7:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL7]] + ; CI-MESA: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; CI-MESA: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR5]](s16) + ; CI-MESA: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C14]](s32) + ; CI-MESA: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL8]] + ; CI-MESA: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR6]](s32), [[OR7]](s32), [[OR8]](s32) + ; CI-MESA: [[C15:%[0-9]+]]:_(s64) = G_CONSTANT i64 12 + ; CI-MESA: [[GEP11:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C15]](s64) ; CI-MESA: [[LOAD12:%[0-9]+]]:_(s32) = G_LOAD [[GEP11]](p4) :: (load 1, addrspace 1) ; CI-MESA: [[GEP12:%[0-9]+]]:_(p4) = G_GEP [[GEP11]], [[C]](s64) ; CI-MESA: [[LOAD13:%[0-9]+]]:_(s32) = G_LOAD [[GEP12]](p4) :: (load 1, addrspace 1) @@ -11320,50 +12633,62 @@ body: | ; CI-MESA: [[COPY12:%[0-9]+]]:_(s32) = COPY [[C12]](s32) ; CI-MESA: [[COPY13:%[0-9]+]]:_(s32) = COPY [[LOAD13]](s32) ; CI-MESA: [[AND13:%[0-9]+]]:_(s32) = G_AND [[COPY13]], [[C13]] - ; CI-MESA: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY12]](s32) - ; CI-MESA: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) - ; CI-MESA: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] + ; CI-MESA: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY12]](s32) + ; CI-MESA: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL9]](s32) + ; CI-MESA: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] ; CI-MESA: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; CI-MESA: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C11]] ; CI-MESA: [[COPY14:%[0-9]+]]:_(s32) = COPY [[C12]](s32) ; CI-MESA: [[COPY15:%[0-9]+]]:_(s32) = COPY [[LOAD15]](s32) ; CI-MESA: [[AND15:%[0-9]+]]:_(s32) = G_AND [[COPY15]], [[C13]] - ; CI-MESA: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY14]](s32) - ; CI-MESA: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) - ; CI-MESA: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] + ; CI-MESA: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY14]](s32) + ; CI-MESA: [[TRUNC15:%[0-9]+]]:_(s16) = 
G_TRUNC [[SHL10]](s32) + ; CI-MESA: [[OR10:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] ; CI-MESA: [[TRUNC16:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD16]](s32) ; CI-MESA: [[AND16:%[0-9]+]]:_(s16) = G_AND [[TRUNC16]], [[C11]] ; CI-MESA: [[COPY16:%[0-9]+]]:_(s32) = COPY [[C12]](s32) ; CI-MESA: [[COPY17:%[0-9]+]]:_(s32) = COPY [[LOAD17]](s32) ; CI-MESA: [[AND17:%[0-9]+]]:_(s32) = G_AND [[COPY17]], [[C13]] - ; CI-MESA: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[AND17]], [[COPY16]](s32) - ; CI-MESA: [[TRUNC17:%[0-9]+]]:_(s16) = G_TRUNC [[SHL8]](s32) - ; CI-MESA: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[TRUNC17]] + ; CI-MESA: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[AND17]], [[COPY16]](s32) + ; CI-MESA: [[TRUNC17:%[0-9]+]]:_(s16) = G_TRUNC [[SHL11]](s32) + ; CI-MESA: [[OR11:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[TRUNC17]] ; CI-MESA: [[TRUNC18:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD18]](s32) ; CI-MESA: [[AND18:%[0-9]+]]:_(s16) = G_AND [[TRUNC18]], [[C11]] ; CI-MESA: [[COPY18:%[0-9]+]]:_(s32) = COPY [[C12]](s32) ; CI-MESA: [[COPY19:%[0-9]+]]:_(s32) = COPY [[LOAD19]](s32) ; CI-MESA: [[AND19:%[0-9]+]]:_(s32) = G_AND [[COPY19]], [[C13]] - ; CI-MESA: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[AND19]], [[COPY18]](s32) - ; CI-MESA: [[TRUNC19:%[0-9]+]]:_(s16) = G_TRUNC [[SHL9]](s32) - ; CI-MESA: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[TRUNC19]] + ; CI-MESA: [[SHL12:%[0-9]+]]:_(s32) = G_SHL [[AND19]], [[COPY18]](s32) + ; CI-MESA: [[TRUNC19:%[0-9]+]]:_(s16) = G_TRUNC [[SHL12]](s32) + ; CI-MESA: [[OR12:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[TRUNC19]] ; CI-MESA: [[TRUNC20:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD20]](s32) ; CI-MESA: [[AND20:%[0-9]+]]:_(s16) = G_AND [[TRUNC20]], [[C11]] ; CI-MESA: [[COPY20:%[0-9]+]]:_(s32) = COPY [[C12]](s32) ; CI-MESA: [[COPY21:%[0-9]+]]:_(s32) = COPY [[LOAD21]](s32) ; CI-MESA: [[AND21:%[0-9]+]]:_(s32) = G_AND [[COPY21]], [[C13]] - ; CI-MESA: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[AND21]], [[COPY20]](s32) - ; CI-MESA: [[TRUNC21:%[0-9]+]]:_(s16) = G_TRUNC [[SHL10]](s32) - ; CI-MESA: 
[[OR10:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[TRUNC21]] + ; CI-MESA: [[SHL13:%[0-9]+]]:_(s32) = G_SHL [[AND21]], [[COPY20]](s32) + ; CI-MESA: [[TRUNC21:%[0-9]+]]:_(s16) = G_TRUNC [[SHL13]](s32) + ; CI-MESA: [[OR13:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[TRUNC21]] ; CI-MESA: [[TRUNC22:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD22]](s32) ; CI-MESA: [[AND22:%[0-9]+]]:_(s16) = G_AND [[TRUNC22]], [[C11]] ; CI-MESA: [[COPY22:%[0-9]+]]:_(s32) = COPY [[C12]](s32) ; CI-MESA: [[COPY23:%[0-9]+]]:_(s32) = COPY [[LOAD23]](s32) ; CI-MESA: [[AND23:%[0-9]+]]:_(s32) = G_AND [[COPY23]], [[C13]] - ; CI-MESA: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[AND23]], [[COPY22]](s32) - ; CI-MESA: [[TRUNC23:%[0-9]+]]:_(s16) = G_TRUNC [[SHL11]](s32) - ; CI-MESA: [[OR11:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[TRUNC23]] - ; CI-MESA: [[MV1:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR6]](s16), [[OR7]](s16), [[OR8]](s16), [[OR9]](s16), [[OR10]](s16), [[OR11]](s16) + ; CI-MESA: [[SHL14:%[0-9]+]]:_(s32) = G_SHL [[AND23]], [[COPY22]](s32) + ; CI-MESA: [[TRUNC23:%[0-9]+]]:_(s16) = G_TRUNC [[SHL14]](s32) + ; CI-MESA: [[OR14:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[TRUNC23]] + ; CI-MESA: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; CI-MESA: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR10]](s16) + ; CI-MESA: [[SHL15:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C14]](s32) + ; CI-MESA: [[OR15:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL15]] + ; CI-MESA: [[ZEXT8:%[0-9]+]]:_(s32) = G_ZEXT [[OR11]](s16) + ; CI-MESA: [[ZEXT9:%[0-9]+]]:_(s32) = G_ZEXT [[OR12]](s16) + ; CI-MESA: [[SHL16:%[0-9]+]]:_(s32) = G_SHL [[ZEXT9]], [[C14]](s32) + ; CI-MESA: [[OR16:%[0-9]+]]:_(s32) = G_OR [[ZEXT8]], [[SHL16]] + ; CI-MESA: [[ZEXT10:%[0-9]+]]:_(s32) = G_ZEXT [[OR13]](s16) + ; CI-MESA: [[ZEXT11:%[0-9]+]]:_(s32) = G_ZEXT [[OR14]](s16) + ; CI-MESA: [[SHL17:%[0-9]+]]:_(s32) = G_SHL [[ZEXT11]], [[C14]](s32) + ; CI-MESA: [[OR17:%[0-9]+]]:_(s32) = G_OR [[ZEXT10]], [[SHL17]] + ; CI-MESA: [[MV1:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR15]](s32), [[OR16]](s32), [[OR17]](s32) ; CI-MESA: 
[[COPY24:%[0-9]+]]:_(s96) = COPY [[MV]](s96) ; CI-MESA: [[COPY25:%[0-9]+]]:_(s96) = COPY [[MV1]](s96) ; CI-MESA: $vgpr0_vgpr1_vgpr2 = COPY [[COPY24]](s96) @@ -11442,9 +12767,22 @@ body: | ; GFX9-MESA: [[AND11:%[0-9]+]]:_(s16) = G_AND [[TRUNC11]], [[C11]] ; GFX9-MESA: [[SHL5:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C12]](s16) ; GFX9-MESA: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL5]] - ; GFX9-MESA: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16), [[OR4]](s16), [[OR5]](s16) - ; GFX9-MESA: [[C13:%[0-9]+]]:_(s64) = G_CONSTANT i64 12 - ; GFX9-MESA: [[GEP11:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C13]](s64) + ; GFX9-MESA: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9-MESA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9-MESA: [[C13:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9-MESA: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C13]](s32) + ; GFX9-MESA: [[OR6:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL6]] + ; GFX9-MESA: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; GFX9-MESA: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; GFX9-MESA: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C13]](s32) + ; GFX9-MESA: [[OR7:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL7]] + ; GFX9-MESA: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; GFX9-MESA: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR5]](s16) + ; GFX9-MESA: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C13]](s32) + ; GFX9-MESA: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL8]] + ; GFX9-MESA: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR6]](s32), [[OR7]](s32), [[OR8]](s32) + ; GFX9-MESA: [[C14:%[0-9]+]]:_(s64) = G_CONSTANT i64 12 + ; GFX9-MESA: [[GEP11:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C14]](s64) ; GFX9-MESA: [[LOAD12:%[0-9]+]]:_(s32) = G_LOAD [[GEP11]](p4) :: (load 1, addrspace 1) ; GFX9-MESA: [[GEP12:%[0-9]+]]:_(p4) = G_GEP [[GEP11]], [[C]](s64) ; GFX9-MESA: [[LOAD13:%[0-9]+]]:_(s32) = G_LOAD [[GEP12]](p4) :: (load 1, addrspace 1) @@ -11472,39 +12810,51 @@ body: | ; GFX9-MESA: 
[[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C11]] ; GFX9-MESA: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD13]](s32) ; GFX9-MESA: [[AND13:%[0-9]+]]:_(s16) = G_AND [[TRUNC13]], [[C11]] - ; GFX9-MESA: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C12]](s16) - ; GFX9-MESA: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL6]] + ; GFX9-MESA: [[SHL9:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C12]](s16) + ; GFX9-MESA: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL9]] ; GFX9-MESA: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; GFX9-MESA: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C11]] ; GFX9-MESA: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD15]](s32) ; GFX9-MESA: [[AND15:%[0-9]+]]:_(s16) = G_AND [[TRUNC15]], [[C11]] - ; GFX9-MESA: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C12]](s16) - ; GFX9-MESA: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL7]] + ; GFX9-MESA: [[SHL10:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C12]](s16) + ; GFX9-MESA: [[OR10:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL10]] ; GFX9-MESA: [[TRUNC16:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD16]](s32) ; GFX9-MESA: [[AND16:%[0-9]+]]:_(s16) = G_AND [[TRUNC16]], [[C11]] ; GFX9-MESA: [[TRUNC17:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD17]](s32) ; GFX9-MESA: [[AND17:%[0-9]+]]:_(s16) = G_AND [[TRUNC17]], [[C11]] - ; GFX9-MESA: [[SHL8:%[0-9]+]]:_(s16) = G_SHL [[AND17]], [[C12]](s16) - ; GFX9-MESA: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[SHL8]] + ; GFX9-MESA: [[SHL11:%[0-9]+]]:_(s16) = G_SHL [[AND17]], [[C12]](s16) + ; GFX9-MESA: [[OR11:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[SHL11]] ; GFX9-MESA: [[TRUNC18:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD18]](s32) ; GFX9-MESA: [[AND18:%[0-9]+]]:_(s16) = G_AND [[TRUNC18]], [[C11]] ; GFX9-MESA: [[TRUNC19:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD19]](s32) ; GFX9-MESA: [[AND19:%[0-9]+]]:_(s16) = G_AND [[TRUNC19]], [[C11]] - ; GFX9-MESA: [[SHL9:%[0-9]+]]:_(s16) = G_SHL [[AND19]], [[C12]](s16) - ; GFX9-MESA: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[SHL9]] + ; GFX9-MESA: [[SHL12:%[0-9]+]]:_(s16) = G_SHL 
[[AND19]], [[C12]](s16) + ; GFX9-MESA: [[OR12:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[SHL12]] ; GFX9-MESA: [[TRUNC20:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD20]](s32) ; GFX9-MESA: [[AND20:%[0-9]+]]:_(s16) = G_AND [[TRUNC20]], [[C11]] ; GFX9-MESA: [[TRUNC21:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD21]](s32) ; GFX9-MESA: [[AND21:%[0-9]+]]:_(s16) = G_AND [[TRUNC21]], [[C11]] - ; GFX9-MESA: [[SHL10:%[0-9]+]]:_(s16) = G_SHL [[AND21]], [[C12]](s16) - ; GFX9-MESA: [[OR10:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[SHL10]] + ; GFX9-MESA: [[SHL13:%[0-9]+]]:_(s16) = G_SHL [[AND21]], [[C12]](s16) + ; GFX9-MESA: [[OR13:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[SHL13]] ; GFX9-MESA: [[TRUNC22:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD22]](s32) ; GFX9-MESA: [[AND22:%[0-9]+]]:_(s16) = G_AND [[TRUNC22]], [[C11]] ; GFX9-MESA: [[TRUNC23:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD23]](s32) ; GFX9-MESA: [[AND23:%[0-9]+]]:_(s16) = G_AND [[TRUNC23]], [[C11]] - ; GFX9-MESA: [[SHL11:%[0-9]+]]:_(s16) = G_SHL [[AND23]], [[C12]](s16) - ; GFX9-MESA: [[OR11:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[SHL11]] - ; GFX9-MESA: [[MV1:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR6]](s16), [[OR7]](s16), [[OR8]](s16), [[OR9]](s16), [[OR10]](s16), [[OR11]](s16) + ; GFX9-MESA: [[SHL14:%[0-9]+]]:_(s16) = G_SHL [[AND23]], [[C12]](s16) + ; GFX9-MESA: [[OR14:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[SHL14]] + ; GFX9-MESA: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; GFX9-MESA: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR10]](s16) + ; GFX9-MESA: [[SHL15:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C13]](s32) + ; GFX9-MESA: [[OR15:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL15]] + ; GFX9-MESA: [[ZEXT8:%[0-9]+]]:_(s32) = G_ZEXT [[OR11]](s16) + ; GFX9-MESA: [[ZEXT9:%[0-9]+]]:_(s32) = G_ZEXT [[OR12]](s16) + ; GFX9-MESA: [[SHL16:%[0-9]+]]:_(s32) = G_SHL [[ZEXT9]], [[C13]](s32) + ; GFX9-MESA: [[OR16:%[0-9]+]]:_(s32) = G_OR [[ZEXT8]], [[SHL16]] + ; GFX9-MESA: [[ZEXT10:%[0-9]+]]:_(s32) = G_ZEXT [[OR13]](s16) + ; GFX9-MESA: [[ZEXT11:%[0-9]+]]:_(s32) = G_ZEXT [[OR14]](s16) + ; GFX9-MESA: 
[[SHL17:%[0-9]+]]:_(s32) = G_SHL [[ZEXT11]], [[C13]](s32) + ; GFX9-MESA: [[OR17:%[0-9]+]]:_(s32) = G_OR [[ZEXT10]], [[SHL17]] + ; GFX9-MESA: [[MV1:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR15]](s32), [[OR16]](s32), [[OR17]](s32) ; GFX9-MESA: [[COPY1:%[0-9]+]]:_(s96) = COPY [[MV]](s96) ; GFX9-MESA: [[COPY2:%[0-9]+]]:_(s96) = COPY [[MV1]](s96) ; GFX9-MESA: $vgpr0_vgpr1_vgpr2 = COPY [[COPY1]](s96) @@ -11526,248 +12876,378 @@ body: | ; CI-LABEL: name: test_extload_constant_v2s96_from_24_align2 ; CI: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 ; CI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p4) :: (load 2, addrspace 1) - ; CI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; CI: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; CI: [[GEP:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C]](s64) ; CI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p4) :: (load 2, addrspace 1) - ; CI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; CI: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 ; CI: [[GEP1:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C1]](s64) ; CI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p4) :: (load 2, addrspace 1) - ; CI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; CI: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 6 ; CI: [[GEP2:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C2]](s64) ; CI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p4) :: (load 2, addrspace 1) - ; CI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) ; CI: [[C3:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 ; CI: [[GEP3:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C3]](s64) ; CI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p4) :: (load 2, addrspace 1) - ; CI: [[TRUNC4:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD4]](s32) ; CI: [[C4:%[0-9]+]]:_(s64) = G_CONSTANT i64 10 ; CI: [[GEP4:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C4]](s64) ; CI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p4) :: (load 2, addrspace 1) - ; CI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) - ; CI: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16), 
[[TRUNC4]](s16), [[TRUNC5]](s16) - ; CI: [[C5:%[0-9]+]]:_(s64) = G_CONSTANT i64 12 - ; CI: [[GEP5:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C5]](s64) + ; CI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; CI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; CI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C5]] + ; CI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; CI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C5]] + ; CI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C6]](s32) + ; CI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; CI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; CI: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C5]] + ; CI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; CI: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C5]] + ; CI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C6]](s32) + ; CI: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; CI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD4]](s32) + ; CI: [[AND4:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C5]] + ; CI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) + ; CI: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C5]] + ; CI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[C6]](s32) + ; CI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[AND4]], [[SHL2]] + ; CI: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32), [[OR2]](s32) + ; CI: [[C7:%[0-9]+]]:_(s64) = G_CONSTANT i64 12 + ; CI: [[GEP5:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C7]](s64) ; CI: [[LOAD6:%[0-9]+]]:_(s32) = G_LOAD [[GEP5]](p4) :: (load 2, addrspace 1) - ; CI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; CI: [[GEP6:%[0-9]+]]:_(p4) = G_GEP [[GEP5]], [[C]](s64) ; CI: [[LOAD7:%[0-9]+]]:_(s32) = G_LOAD [[GEP6]](p4) :: (load 2, addrspace 1) - ; CI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) ; CI: [[GEP7:%[0-9]+]]:_(p4) = G_GEP [[GEP5]], [[C1]](s64) ; CI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p4) :: (load 2, addrspace 1) - ; CI: [[TRUNC8:%[0-9]+]]:_(s16) = G_TRUNC 
[[LOAD8]](s32) ; CI: [[GEP8:%[0-9]+]]:_(p4) = G_GEP [[GEP5]], [[C2]](s64) ; CI: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p4) :: (load 2, addrspace 1) - ; CI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD9]](s32) ; CI: [[GEP9:%[0-9]+]]:_(p4) = G_GEP [[GEP5]], [[C3]](s64) ; CI: [[LOAD10:%[0-9]+]]:_(s32) = G_LOAD [[GEP9]](p4) :: (load 2, addrspace 1) - ; CI: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; CI: [[GEP10:%[0-9]+]]:_(p4) = G_GEP [[GEP5]], [[C4]](s64) ; CI: [[LOAD11:%[0-9]+]]:_(s32) = G_LOAD [[GEP10]](p4) :: (load 2, addrspace 1) - ; CI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD11]](s32) - ; CI: [[MV1:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[TRUNC6]](s16), [[TRUNC7]](s16), [[TRUNC8]](s16), [[TRUNC9]](s16), [[TRUNC10]](s16), [[TRUNC11]](s16) - ; CI: [[COPY1:%[0-9]+]]:_(s96) = COPY [[MV]](s96) - ; CI: [[COPY2:%[0-9]+]]:_(s96) = COPY [[MV1]](s96) - ; CI: $vgpr0_vgpr1_vgpr2 = COPY [[COPY1]](s96) - ; CI: $vgpr3_vgpr4_vgpr5 = COPY [[COPY2]](s96) + ; CI: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LOAD6]](s32) + ; CI: [[AND6:%[0-9]+]]:_(s32) = G_AND [[COPY7]], [[C5]] + ; CI: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) + ; CI: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY8]], [[C5]] + ; CI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C6]](s32) + ; CI: [[OR3:%[0-9]+]]:_(s32) = G_OR [[AND6]], [[SHL3]] + ; CI: [[COPY9:%[0-9]+]]:_(s32) = COPY [[LOAD8]](s32) + ; CI: [[AND8:%[0-9]+]]:_(s32) = G_AND [[COPY9]], [[C5]] + ; CI: [[COPY10:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32) + ; CI: [[AND9:%[0-9]+]]:_(s32) = G_AND [[COPY10]], [[C5]] + ; CI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[C6]](s32) + ; CI: [[OR4:%[0-9]+]]:_(s32) = G_OR [[AND8]], [[SHL4]] + ; CI: [[COPY11:%[0-9]+]]:_(s32) = COPY [[LOAD10]](s32) + ; CI: [[AND10:%[0-9]+]]:_(s32) = G_AND [[COPY11]], [[C5]] + ; CI: [[COPY12:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32) + ; CI: [[AND11:%[0-9]+]]:_(s32) = G_AND [[COPY12]], [[C5]] + ; CI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[C6]](s32) + ; CI: [[OR5:%[0-9]+]]:_(s32) = G_OR 
[[AND10]], [[SHL5]] + ; CI: [[MV1:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR3]](s32), [[OR4]](s32), [[OR5]](s32) + ; CI: [[COPY13:%[0-9]+]]:_(s96) = COPY [[MV]](s96) + ; CI: [[COPY14:%[0-9]+]]:_(s96) = COPY [[MV1]](s96) + ; CI: $vgpr0_vgpr1_vgpr2 = COPY [[COPY13]](s96) + ; CI: $vgpr3_vgpr4_vgpr5 = COPY [[COPY14]](s96) ; VI-LABEL: name: test_extload_constant_v2s96_from_24_align2 ; VI: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p4) :: (load 2, addrspace 1) - ; VI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; VI: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; VI: [[GEP:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C]](s64) ; VI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p4) :: (load 2, addrspace 1) - ; VI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; VI: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 ; VI: [[GEP1:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C1]](s64) ; VI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p4) :: (load 2, addrspace 1) - ; VI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; VI: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 6 ; VI: [[GEP2:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C2]](s64) ; VI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p4) :: (load 2, addrspace 1) - ; VI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) ; VI: [[C3:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 ; VI: [[GEP3:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C3]](s64) ; VI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p4) :: (load 2, addrspace 1) - ; VI: [[TRUNC4:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD4]](s32) ; VI: [[C4:%[0-9]+]]:_(s64) = G_CONSTANT i64 10 ; VI: [[GEP4:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C4]](s64) ; VI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p4) :: (load 2, addrspace 1) - ; VI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) - ; VI: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16), [[TRUNC4]](s16), [[TRUNC5]](s16) - ; VI: [[C5:%[0-9]+]]:_(s64) = G_CONSTANT i64 12 - ; VI: [[GEP5:%[0-9]+]]:_(p4) 
= G_GEP [[COPY]], [[C5]](s64) + ; VI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; VI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; VI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C5]] + ; VI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; VI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C5]] + ; VI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C6]](s32) + ; VI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; VI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; VI: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C5]] + ; VI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; VI: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C5]] + ; VI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C6]](s32) + ; VI: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; VI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD4]](s32) + ; VI: [[AND4:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C5]] + ; VI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) + ; VI: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C5]] + ; VI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[C6]](s32) + ; VI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[AND4]], [[SHL2]] + ; VI: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32), [[OR2]](s32) + ; VI: [[C7:%[0-9]+]]:_(s64) = G_CONSTANT i64 12 + ; VI: [[GEP5:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C7]](s64) ; VI: [[LOAD6:%[0-9]+]]:_(s32) = G_LOAD [[GEP5]](p4) :: (load 2, addrspace 1) - ; VI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; VI: [[GEP6:%[0-9]+]]:_(p4) = G_GEP [[GEP5]], [[C]](s64) ; VI: [[LOAD7:%[0-9]+]]:_(s32) = G_LOAD [[GEP6]](p4) :: (load 2, addrspace 1) - ; VI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) ; VI: [[GEP7:%[0-9]+]]:_(p4) = G_GEP [[GEP5]], [[C1]](s64) ; VI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p4) :: (load 2, addrspace 1) - ; VI: [[TRUNC8:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD8]](s32) ; VI: [[GEP8:%[0-9]+]]:_(p4) = G_GEP [[GEP5]], [[C2]](s64) ; VI: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p4) 
:: (load 2, addrspace 1) - ; VI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD9]](s32) ; VI: [[GEP9:%[0-9]+]]:_(p4) = G_GEP [[GEP5]], [[C3]](s64) ; VI: [[LOAD10:%[0-9]+]]:_(s32) = G_LOAD [[GEP9]](p4) :: (load 2, addrspace 1) - ; VI: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; VI: [[GEP10:%[0-9]+]]:_(p4) = G_GEP [[GEP5]], [[C4]](s64) ; VI: [[LOAD11:%[0-9]+]]:_(s32) = G_LOAD [[GEP10]](p4) :: (load 2, addrspace 1) - ; VI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD11]](s32) - ; VI: [[MV1:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[TRUNC6]](s16), [[TRUNC7]](s16), [[TRUNC8]](s16), [[TRUNC9]](s16), [[TRUNC10]](s16), [[TRUNC11]](s16) - ; VI: [[COPY1:%[0-9]+]]:_(s96) = COPY [[MV]](s96) - ; VI: [[COPY2:%[0-9]+]]:_(s96) = COPY [[MV1]](s96) - ; VI: $vgpr0_vgpr1_vgpr2 = COPY [[COPY1]](s96) - ; VI: $vgpr3_vgpr4_vgpr5 = COPY [[COPY2]](s96) + ; VI: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LOAD6]](s32) + ; VI: [[AND6:%[0-9]+]]:_(s32) = G_AND [[COPY7]], [[C5]] + ; VI: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) + ; VI: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY8]], [[C5]] + ; VI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C6]](s32) + ; VI: [[OR3:%[0-9]+]]:_(s32) = G_OR [[AND6]], [[SHL3]] + ; VI: [[COPY9:%[0-9]+]]:_(s32) = COPY [[LOAD8]](s32) + ; VI: [[AND8:%[0-9]+]]:_(s32) = G_AND [[COPY9]], [[C5]] + ; VI: [[COPY10:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32) + ; VI: [[AND9:%[0-9]+]]:_(s32) = G_AND [[COPY10]], [[C5]] + ; VI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[C6]](s32) + ; VI: [[OR4:%[0-9]+]]:_(s32) = G_OR [[AND8]], [[SHL4]] + ; VI: [[COPY11:%[0-9]+]]:_(s32) = COPY [[LOAD10]](s32) + ; VI: [[AND10:%[0-9]+]]:_(s32) = G_AND [[COPY11]], [[C5]] + ; VI: [[COPY12:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32) + ; VI: [[AND11:%[0-9]+]]:_(s32) = G_AND [[COPY12]], [[C5]] + ; VI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[C6]](s32) + ; VI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[AND10]], [[SHL5]] + ; VI: [[MV1:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR3]](s32), [[OR4]](s32), [[OR5]](s32) + ; VI: 
[[COPY13:%[0-9]+]]:_(s96) = COPY [[MV]](s96) + ; VI: [[COPY14:%[0-9]+]]:_(s96) = COPY [[MV1]](s96) + ; VI: $vgpr0_vgpr1_vgpr2 = COPY [[COPY13]](s96) + ; VI: $vgpr3_vgpr4_vgpr5 = COPY [[COPY14]](s96) ; GFX9-LABEL: name: test_extload_constant_v2s96_from_24_align2 ; GFX9: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 ; GFX9: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p4) :: (load 2, addrspace 1) - ; GFX9: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; GFX9: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; GFX9: [[GEP:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C]](s64) ; GFX9: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p4) :: (load 2, addrspace 1) - ; GFX9: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; GFX9: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 ; GFX9: [[GEP1:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C1]](s64) ; GFX9: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p4) :: (load 2, addrspace 1) - ; GFX9: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; GFX9: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 6 ; GFX9: [[GEP2:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C2]](s64) ; GFX9: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p4) :: (load 2, addrspace 1) - ; GFX9: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) ; GFX9: [[C3:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 ; GFX9: [[GEP3:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C3]](s64) ; GFX9: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p4) :: (load 2, addrspace 1) - ; GFX9: [[TRUNC4:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD4]](s32) ; GFX9: [[C4:%[0-9]+]]:_(s64) = G_CONSTANT i64 10 ; GFX9: [[GEP4:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C4]](s64) ; GFX9: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p4) :: (load 2, addrspace 1) - ; GFX9: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) - ; GFX9: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16), [[TRUNC4]](s16), [[TRUNC5]](s16) - ; GFX9: [[C5:%[0-9]+]]:_(s64) = G_CONSTANT i64 12 - ; GFX9: [[GEP5:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C5]](s64) + ; GFX9: [[C5:%[0-9]+]]:_(s32) 
= G_CONSTANT i32 65535 + ; GFX9: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; GFX9: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C5]] + ; GFX9: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; GFX9: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C5]] + ; GFX9: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C6]](s32) + ; GFX9: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; GFX9: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; GFX9: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C5]] + ; GFX9: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; GFX9: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C5]] + ; GFX9: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C6]](s32) + ; GFX9: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; GFX9: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD4]](s32) + ; GFX9: [[AND4:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C5]] + ; GFX9: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) + ; GFX9: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C5]] + ; GFX9: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[C6]](s32) + ; GFX9: [[OR2:%[0-9]+]]:_(s32) = G_OR [[AND4]], [[SHL2]] + ; GFX9: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32), [[OR2]](s32) + ; GFX9: [[C7:%[0-9]+]]:_(s64) = G_CONSTANT i64 12 + ; GFX9: [[GEP5:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C7]](s64) ; GFX9: [[LOAD6:%[0-9]+]]:_(s32) = G_LOAD [[GEP5]](p4) :: (load 2, addrspace 1) - ; GFX9: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; GFX9: [[GEP6:%[0-9]+]]:_(p4) = G_GEP [[GEP5]], [[C]](s64) ; GFX9: [[LOAD7:%[0-9]+]]:_(s32) = G_LOAD [[GEP6]](p4) :: (load 2, addrspace 1) - ; GFX9: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) ; GFX9: [[GEP7:%[0-9]+]]:_(p4) = G_GEP [[GEP5]], [[C1]](s64) ; GFX9: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p4) :: (load 2, addrspace 1) - ; GFX9: [[TRUNC8:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD8]](s32) ; GFX9: [[GEP8:%[0-9]+]]:_(p4) = G_GEP [[GEP5]], [[C2]](s64) ; GFX9: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD 
[[GEP8]](p4) :: (load 2, addrspace 1) - ; GFX9: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD9]](s32) ; GFX9: [[GEP9:%[0-9]+]]:_(p4) = G_GEP [[GEP5]], [[C3]](s64) ; GFX9: [[LOAD10:%[0-9]+]]:_(s32) = G_LOAD [[GEP9]](p4) :: (load 2, addrspace 1) - ; GFX9: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; GFX9: [[GEP10:%[0-9]+]]:_(p4) = G_GEP [[GEP5]], [[C4]](s64) ; GFX9: [[LOAD11:%[0-9]+]]:_(s32) = G_LOAD [[GEP10]](p4) :: (load 2, addrspace 1) - ; GFX9: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD11]](s32) - ; GFX9: [[MV1:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[TRUNC6]](s16), [[TRUNC7]](s16), [[TRUNC8]](s16), [[TRUNC9]](s16), [[TRUNC10]](s16), [[TRUNC11]](s16) - ; GFX9: [[COPY1:%[0-9]+]]:_(s96) = COPY [[MV]](s96) - ; GFX9: [[COPY2:%[0-9]+]]:_(s96) = COPY [[MV1]](s96) - ; GFX9: $vgpr0_vgpr1_vgpr2 = COPY [[COPY1]](s96) - ; GFX9: $vgpr3_vgpr4_vgpr5 = COPY [[COPY2]](s96) + ; GFX9: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LOAD6]](s32) + ; GFX9: [[AND6:%[0-9]+]]:_(s32) = G_AND [[COPY7]], [[C5]] + ; GFX9: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) + ; GFX9: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY8]], [[C5]] + ; GFX9: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C6]](s32) + ; GFX9: [[OR3:%[0-9]+]]:_(s32) = G_OR [[AND6]], [[SHL3]] + ; GFX9: [[COPY9:%[0-9]+]]:_(s32) = COPY [[LOAD8]](s32) + ; GFX9: [[AND8:%[0-9]+]]:_(s32) = G_AND [[COPY9]], [[C5]] + ; GFX9: [[COPY10:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32) + ; GFX9: [[AND9:%[0-9]+]]:_(s32) = G_AND [[COPY10]], [[C5]] + ; GFX9: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[C6]](s32) + ; GFX9: [[OR4:%[0-9]+]]:_(s32) = G_OR [[AND8]], [[SHL4]] + ; GFX9: [[COPY11:%[0-9]+]]:_(s32) = COPY [[LOAD10]](s32) + ; GFX9: [[AND10:%[0-9]+]]:_(s32) = G_AND [[COPY11]], [[C5]] + ; GFX9: [[COPY12:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32) + ; GFX9: [[AND11:%[0-9]+]]:_(s32) = G_AND [[COPY12]], [[C5]] + ; GFX9: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[C6]](s32) + ; GFX9: [[OR5:%[0-9]+]]:_(s32) = G_OR [[AND10]], [[SHL5]] + ; GFX9: [[MV1:%[0-9]+]]:_(s96) = 
G_MERGE_VALUES [[OR3]](s32), [[OR4]](s32), [[OR5]](s32) + ; GFX9: [[COPY13:%[0-9]+]]:_(s96) = COPY [[MV]](s96) + ; GFX9: [[COPY14:%[0-9]+]]:_(s96) = COPY [[MV1]](s96) + ; GFX9: $vgpr0_vgpr1_vgpr2 = COPY [[COPY13]](s96) + ; GFX9: $vgpr3_vgpr4_vgpr5 = COPY [[COPY14]](s96) ; CI-MESA-LABEL: name: test_extload_constant_v2s96_from_24_align2 ; CI-MESA: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 ; CI-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p4) :: (load 2, addrspace 1) - ; CI-MESA: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; CI-MESA: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; CI-MESA: [[GEP:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C]](s64) ; CI-MESA: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p4) :: (load 2, addrspace 1) - ; CI-MESA: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; CI-MESA: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 ; CI-MESA: [[GEP1:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C1]](s64) ; CI-MESA: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p4) :: (load 2, addrspace 1) - ; CI-MESA: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; CI-MESA: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 6 ; CI-MESA: [[GEP2:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C2]](s64) ; CI-MESA: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p4) :: (load 2, addrspace 1) - ; CI-MESA: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) ; CI-MESA: [[C3:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 ; CI-MESA: [[GEP3:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C3]](s64) ; CI-MESA: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p4) :: (load 2, addrspace 1) - ; CI-MESA: [[TRUNC4:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD4]](s32) ; CI-MESA: [[C4:%[0-9]+]]:_(s64) = G_CONSTANT i64 10 ; CI-MESA: [[GEP4:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C4]](s64) ; CI-MESA: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p4) :: (load 2, addrspace 1) - ; CI-MESA: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) - ; CI-MESA: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16), [[TRUNC4]](s16), [[TRUNC5]](s16) 
- ; CI-MESA: [[C5:%[0-9]+]]:_(s64) = G_CONSTANT i64 12 - ; CI-MESA: [[GEP5:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C5]](s64) + ; CI-MESA: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; CI-MESA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; CI-MESA: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C5]] + ; CI-MESA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; CI-MESA: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C5]] + ; CI-MESA: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-MESA: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C6]](s32) + ; CI-MESA: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; CI-MESA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; CI-MESA: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C5]] + ; CI-MESA: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; CI-MESA: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C5]] + ; CI-MESA: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C6]](s32) + ; CI-MESA: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; CI-MESA: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD4]](s32) + ; CI-MESA: [[AND4:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C5]] + ; CI-MESA: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) + ; CI-MESA: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C5]] + ; CI-MESA: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[C6]](s32) + ; CI-MESA: [[OR2:%[0-9]+]]:_(s32) = G_OR [[AND4]], [[SHL2]] + ; CI-MESA: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32), [[OR2]](s32) + ; CI-MESA: [[C7:%[0-9]+]]:_(s64) = G_CONSTANT i64 12 + ; CI-MESA: [[GEP5:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C7]](s64) ; CI-MESA: [[LOAD6:%[0-9]+]]:_(s32) = G_LOAD [[GEP5]](p4) :: (load 2, addrspace 1) - ; CI-MESA: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; CI-MESA: [[GEP6:%[0-9]+]]:_(p4) = G_GEP [[GEP5]], [[C]](s64) ; CI-MESA: [[LOAD7:%[0-9]+]]:_(s32) = G_LOAD [[GEP6]](p4) :: (load 2, addrspace 1) - ; CI-MESA: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) ; CI-MESA: [[GEP7:%[0-9]+]]:_(p4) = G_GEP [[GEP5]], [[C1]](s64) ; CI-MESA: 
[[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p4) :: (load 2, addrspace 1) - ; CI-MESA: [[TRUNC8:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD8]](s32) ; CI-MESA: [[GEP8:%[0-9]+]]:_(p4) = G_GEP [[GEP5]], [[C2]](s64) ; CI-MESA: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p4) :: (load 2, addrspace 1) - ; CI-MESA: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD9]](s32) ; CI-MESA: [[GEP9:%[0-9]+]]:_(p4) = G_GEP [[GEP5]], [[C3]](s64) ; CI-MESA: [[LOAD10:%[0-9]+]]:_(s32) = G_LOAD [[GEP9]](p4) :: (load 2, addrspace 1) - ; CI-MESA: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; CI-MESA: [[GEP10:%[0-9]+]]:_(p4) = G_GEP [[GEP5]], [[C4]](s64) ; CI-MESA: [[LOAD11:%[0-9]+]]:_(s32) = G_LOAD [[GEP10]](p4) :: (load 2, addrspace 1) - ; CI-MESA: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD11]](s32) - ; CI-MESA: [[MV1:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[TRUNC6]](s16), [[TRUNC7]](s16), [[TRUNC8]](s16), [[TRUNC9]](s16), [[TRUNC10]](s16), [[TRUNC11]](s16) - ; CI-MESA: [[COPY1:%[0-9]+]]:_(s96) = COPY [[MV]](s96) - ; CI-MESA: [[COPY2:%[0-9]+]]:_(s96) = COPY [[MV1]](s96) - ; CI-MESA: $vgpr0_vgpr1_vgpr2 = COPY [[COPY1]](s96) - ; CI-MESA: $vgpr3_vgpr4_vgpr5 = COPY [[COPY2]](s96) + ; CI-MESA: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LOAD6]](s32) + ; CI-MESA: [[AND6:%[0-9]+]]:_(s32) = G_AND [[COPY7]], [[C5]] + ; CI-MESA: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) + ; CI-MESA: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY8]], [[C5]] + ; CI-MESA: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C6]](s32) + ; CI-MESA: [[OR3:%[0-9]+]]:_(s32) = G_OR [[AND6]], [[SHL3]] + ; CI-MESA: [[COPY9:%[0-9]+]]:_(s32) = COPY [[LOAD8]](s32) + ; CI-MESA: [[AND8:%[0-9]+]]:_(s32) = G_AND [[COPY9]], [[C5]] + ; CI-MESA: [[COPY10:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32) + ; CI-MESA: [[AND9:%[0-9]+]]:_(s32) = G_AND [[COPY10]], [[C5]] + ; CI-MESA: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[C6]](s32) + ; CI-MESA: [[OR4:%[0-9]+]]:_(s32) = G_OR [[AND8]], [[SHL4]] + ; CI-MESA: [[COPY11:%[0-9]+]]:_(s32) = COPY [[LOAD10]](s32) + ; CI-MESA: 
[[AND10:%[0-9]+]]:_(s32) = G_AND [[COPY11]], [[C5]] + ; CI-MESA: [[COPY12:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32) + ; CI-MESA: [[AND11:%[0-9]+]]:_(s32) = G_AND [[COPY12]], [[C5]] + ; CI-MESA: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[C6]](s32) + ; CI-MESA: [[OR5:%[0-9]+]]:_(s32) = G_OR [[AND10]], [[SHL5]] + ; CI-MESA: [[MV1:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR3]](s32), [[OR4]](s32), [[OR5]](s32) + ; CI-MESA: [[COPY13:%[0-9]+]]:_(s96) = COPY [[MV]](s96) + ; CI-MESA: [[COPY14:%[0-9]+]]:_(s96) = COPY [[MV1]](s96) + ; CI-MESA: $vgpr0_vgpr1_vgpr2 = COPY [[COPY13]](s96) + ; CI-MESA: $vgpr3_vgpr4_vgpr5 = COPY [[COPY14]](s96) ; GFX9-MESA-LABEL: name: test_extload_constant_v2s96_from_24_align2 ; GFX9-MESA: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 ; GFX9-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p4) :: (load 2, addrspace 1) - ; GFX9-MESA: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; GFX9-MESA: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; GFX9-MESA: [[GEP:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C]](s64) ; GFX9-MESA: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p4) :: (load 2, addrspace 1) - ; GFX9-MESA: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; GFX9-MESA: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 ; GFX9-MESA: [[GEP1:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C1]](s64) ; GFX9-MESA: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p4) :: (load 2, addrspace 1) - ; GFX9-MESA: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; GFX9-MESA: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 6 ; GFX9-MESA: [[GEP2:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C2]](s64) ; GFX9-MESA: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p4) :: (load 2, addrspace 1) - ; GFX9-MESA: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) ; GFX9-MESA: [[C3:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 ; GFX9-MESA: [[GEP3:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C3]](s64) ; GFX9-MESA: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p4) :: (load 2, addrspace 1) - ; GFX9-MESA: [[TRUNC4:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD4]](s32) ; GFX9-MESA: 
[[C4:%[0-9]+]]:_(s64) = G_CONSTANT i64 10 ; GFX9-MESA: [[GEP4:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C4]](s64) ; GFX9-MESA: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p4) :: (load 2, addrspace 1) - ; GFX9-MESA: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) - ; GFX9-MESA: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16), [[TRUNC4]](s16), [[TRUNC5]](s16) - ; GFX9-MESA: [[C5:%[0-9]+]]:_(s64) = G_CONSTANT i64 12 - ; GFX9-MESA: [[GEP5:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C5]](s64) + ; GFX9-MESA: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; GFX9-MESA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; GFX9-MESA: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C5]] + ; GFX9-MESA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; GFX9-MESA: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C5]] + ; GFX9-MESA: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9-MESA: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C6]](s32) + ; GFX9-MESA: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; GFX9-MESA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; GFX9-MESA: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C5]] + ; GFX9-MESA: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; GFX9-MESA: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C5]] + ; GFX9-MESA: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C6]](s32) + ; GFX9-MESA: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; GFX9-MESA: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD4]](s32) + ; GFX9-MESA: [[AND4:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C5]] + ; GFX9-MESA: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) + ; GFX9-MESA: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C5]] + ; GFX9-MESA: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[C6]](s32) + ; GFX9-MESA: [[OR2:%[0-9]+]]:_(s32) = G_OR [[AND4]], [[SHL2]] + ; GFX9-MESA: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32), [[OR2]](s32) + ; GFX9-MESA: [[C7:%[0-9]+]]:_(s64) = G_CONSTANT i64 12 + ; GFX9-MESA: [[GEP5:%[0-9]+]]:_(p4) 
= G_GEP [[COPY]], [[C7]](s64) ; GFX9-MESA: [[LOAD6:%[0-9]+]]:_(s32) = G_LOAD [[GEP5]](p4) :: (load 2, addrspace 1) - ; GFX9-MESA: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; GFX9-MESA: [[GEP6:%[0-9]+]]:_(p4) = G_GEP [[GEP5]], [[C]](s64) ; GFX9-MESA: [[LOAD7:%[0-9]+]]:_(s32) = G_LOAD [[GEP6]](p4) :: (load 2, addrspace 1) - ; GFX9-MESA: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) ; GFX9-MESA: [[GEP7:%[0-9]+]]:_(p4) = G_GEP [[GEP5]], [[C1]](s64) ; GFX9-MESA: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p4) :: (load 2, addrspace 1) - ; GFX9-MESA: [[TRUNC8:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD8]](s32) ; GFX9-MESA: [[GEP8:%[0-9]+]]:_(p4) = G_GEP [[GEP5]], [[C2]](s64) ; GFX9-MESA: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p4) :: (load 2, addrspace 1) - ; GFX9-MESA: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD9]](s32) ; GFX9-MESA: [[GEP9:%[0-9]+]]:_(p4) = G_GEP [[GEP5]], [[C3]](s64) ; GFX9-MESA: [[LOAD10:%[0-9]+]]:_(s32) = G_LOAD [[GEP9]](p4) :: (load 2, addrspace 1) - ; GFX9-MESA: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; GFX9-MESA: [[GEP10:%[0-9]+]]:_(p4) = G_GEP [[GEP5]], [[C4]](s64) ; GFX9-MESA: [[LOAD11:%[0-9]+]]:_(s32) = G_LOAD [[GEP10]](p4) :: (load 2, addrspace 1) - ; GFX9-MESA: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD11]](s32) - ; GFX9-MESA: [[MV1:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[TRUNC6]](s16), [[TRUNC7]](s16), [[TRUNC8]](s16), [[TRUNC9]](s16), [[TRUNC10]](s16), [[TRUNC11]](s16) - ; GFX9-MESA: [[COPY1:%[0-9]+]]:_(s96) = COPY [[MV]](s96) - ; GFX9-MESA: [[COPY2:%[0-9]+]]:_(s96) = COPY [[MV1]](s96) - ; GFX9-MESA: $vgpr0_vgpr1_vgpr2 = COPY [[COPY1]](s96) - ; GFX9-MESA: $vgpr3_vgpr4_vgpr5 = COPY [[COPY2]](s96) + ; GFX9-MESA: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LOAD6]](s32) + ; GFX9-MESA: [[AND6:%[0-9]+]]:_(s32) = G_AND [[COPY7]], [[C5]] + ; GFX9-MESA: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) + ; GFX9-MESA: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY8]], [[C5]] + ; GFX9-MESA: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C6]](s32) + ; GFX9-MESA: 
[[OR3:%[0-9]+]]:_(s32) = G_OR [[AND6]], [[SHL3]]
+ ; GFX9-MESA: [[COPY9:%[0-9]+]]:_(s32) = COPY [[LOAD8]](s32)
+ ; GFX9-MESA: [[AND8:%[0-9]+]]:_(s32) = G_AND [[COPY9]], [[C5]]
+ ; GFX9-MESA: [[COPY10:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32)
+ ; GFX9-MESA: [[AND9:%[0-9]+]]:_(s32) = G_AND [[COPY10]], [[C5]]
+ ; GFX9-MESA: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[C6]](s32)
+ ; GFX9-MESA: [[OR4:%[0-9]+]]:_(s32) = G_OR [[AND8]], [[SHL4]]
+ ; GFX9-MESA: [[COPY11:%[0-9]+]]:_(s32) = COPY [[LOAD10]](s32)
+ ; GFX9-MESA: [[AND10:%[0-9]+]]:_(s32) = G_AND [[COPY11]], [[C5]]
+ ; GFX9-MESA: [[COPY12:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32)
+ ; GFX9-MESA: [[AND11:%[0-9]+]]:_(s32) = G_AND [[COPY12]], [[C5]]
+ ; GFX9-MESA: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[C6]](s32)
+ ; GFX9-MESA: [[OR5:%[0-9]+]]:_(s32) = G_OR [[AND10]], [[SHL5]]
+ ; GFX9-MESA: [[MV1:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR3]](s32), [[OR4]](s32), [[OR5]](s32)
+ ; GFX9-MESA: [[COPY13:%[0-9]+]]:_(s96) = COPY [[MV]](s96)
+ ; GFX9-MESA: [[COPY14:%[0-9]+]]:_(s96) = COPY [[MV1]](s96)
+ ; GFX9-MESA: $vgpr0_vgpr1_vgpr2 = COPY [[COPY13]](s96)
+ ; GFX9-MESA: $vgpr3_vgpr4_vgpr5 = COPY [[COPY14]](s96)
  %0:_(p4) = COPY $vgpr0_vgpr1
  %1:_(<2 x s96>) = G_LOAD %0 :: (load 24, align 2, addrspace 1)
  %2:_(s96) = G_EXTRACT %1, 0

Modified: llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-load-flat.mir
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-load-flat.mir?rev=373942&r1=373941&r2=373942&view=diff
==============================================================================
--- llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-load-flat.mir (original)
+++ llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-load-flat.mir Mon Oct 7 12:05:58 2019
@@ -383,53 +383,78 @@ body: |
  ; CI-LABEL: name: test_load_flat_s32_align2
  ; CI: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1
  ; CI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p0) :: (load 2)
- ; CI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC
[[LOAD]](s32) ; CI: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; CI: [[GEP:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C]](s64) ; CI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p0) :: (load 2) - ; CI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; CI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; CI: $vgpr0 = COPY [[MV]](s32) + ; CI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; CI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; CI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; CI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; CI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; CI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; CI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; CI: $vgpr0 = COPY [[OR]](s32) ; VI-LABEL: name: test_load_flat_s32_align2 ; VI: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1 ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p0) :: (load 2) - ; VI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; VI: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; VI: [[GEP:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C]](s64) ; VI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p0) :: (load 2) - ; VI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; VI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; VI: $vgpr0 = COPY [[MV]](s32) + ; VI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; VI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; VI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; VI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; VI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; VI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; VI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; VI: $vgpr0 = COPY [[OR]](s32) ; GFX9-LABEL: name: test_load_flat_s32_align2 ; GFX9: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1 ; GFX9: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD 
[[COPY]](p0) :: (load 2) - ; GFX9: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; GFX9: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; GFX9: [[GEP:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C]](s64) ; GFX9: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p0) :: (load 2) - ; GFX9: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; GFX9: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; GFX9: $vgpr0 = COPY [[MV]](s32) + ; GFX9: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; GFX9: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; GFX9: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; GFX9: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; GFX9: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; GFX9: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; GFX9: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; GFX9: $vgpr0 = COPY [[OR]](s32) ; CI-MESA-LABEL: name: test_load_flat_s32_align2 ; CI-MESA: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1 ; CI-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p0) :: (load 2) - ; CI-MESA: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; CI-MESA: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; CI-MESA: [[GEP:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C]](s64) ; CI-MESA: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p0) :: (load 2) - ; CI-MESA: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; CI-MESA: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; CI-MESA: $vgpr0 = COPY [[MV]](s32) + ; CI-MESA: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; CI-MESA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; CI-MESA: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; CI-MESA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; CI-MESA: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; CI-MESA: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-MESA: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; CI-MESA: [[OR:%[0-9]+]]:_(s32) = G_OR 
[[AND]], [[SHL]]
+ ; CI-MESA: $vgpr0 = COPY [[OR]](s32)
  ; GFX9-MESA-LABEL: name: test_load_flat_s32_align2
  ; GFX9-MESA: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1
  ; GFX9-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p0) :: (load 2)
- ; GFX9-MESA: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32)
  ; GFX9-MESA: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2
  ; GFX9-MESA: [[GEP:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C]](s64)
  ; GFX9-MESA: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p0) :: (load 2)
- ; GFX9-MESA: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32)
- ; GFX9-MESA: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16)
- ; GFX9-MESA: $vgpr0 = COPY [[MV]](s32)
+ ; GFX9-MESA: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535
+ ; GFX9-MESA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
+ ; GFX9-MESA: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]]
+ ; GFX9-MESA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32)
+ ; GFX9-MESA: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]]
+ ; GFX9-MESA: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; GFX9-MESA: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32)
+ ; GFX9-MESA: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]]
+ ; GFX9-MESA: $vgpr0 = COPY [[OR]](s32)
  %0:_(p0) = COPY $vgpr0_vgpr1
  %1:_(s32) = G_LOAD %0 :: (load 4, align 2, addrspace 0)
  $vgpr0 = COPY %1
@@ -471,8 +496,12 @@ body: |
  ; CI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32)
  ; CI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[SHL1]](s32)
  ; CI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[TRUNC3]]
- ; CI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16)
- ; CI: $vgpr0 = COPY [[MV]](s32)
+ ; CI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16)
+ ; CI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16)
+ ; CI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; CI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C6]](s32)
+ ; CI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]]
+ ; CI: $vgpr0 = COPY [[OR2]](s32)
  ; VI-LABEL: name: test_load_flat_s32_align1
  ; VI:
[[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1
  ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p0) :: (load 1)
@@ -499,8 +528,12 @@ body: |
  ; VI: [[AND3:%[0-9]+]]:_(s16) = G_AND [[TRUNC3]], [[C3]]
  ; VI: [[SHL1:%[0-9]+]]:_(s16) = G_SHL [[AND3]], [[C4]](s16)
  ; VI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[SHL1]]
- ; VI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16)
- ; VI: $vgpr0 = COPY [[MV]](s32)
+ ; VI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16)
+ ; VI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16)
+ ; VI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; VI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C5]](s32)
+ ; VI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]]
+ ; VI: $vgpr0 = COPY [[OR2]](s32)
  ; GFX9-LABEL: name: test_load_flat_s32_align1
  ; GFX9: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1
  ; GFX9: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p0) :: (load 1)
@@ -527,8 +560,12 @@ body: |
  ; GFX9: [[AND3:%[0-9]+]]:_(s16) = G_AND [[TRUNC3]], [[C3]]
  ; GFX9: [[SHL1:%[0-9]+]]:_(s16) = G_SHL [[AND3]], [[C4]](s16)
  ; GFX9: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[SHL1]]
- ; GFX9: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16)
- ; GFX9: $vgpr0 = COPY [[MV]](s32)
+ ; GFX9: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16)
+ ; GFX9: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16)
+ ; GFX9: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; GFX9: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C5]](s32)
+ ; GFX9: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]]
+ ; GFX9: $vgpr0 = COPY [[OR2]](s32)
  ; CI-MESA-LABEL: name: test_load_flat_s32_align1
  ; CI-MESA: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1
  ; CI-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p0) :: (load 1)
@@ -559,8 +596,12 @@ body: |
  ; CI-MESA: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32)
  ; CI-MESA: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[SHL1]](s32)
  ; CI-MESA: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[TRUNC3]]
- ; CI-MESA: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16),
[[OR1]](s16) - ; CI-MESA: $vgpr0 = COPY [[MV]](s32) + ; CI-MESA: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI-MESA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI-MESA: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-MESA: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C6]](s32) + ; CI-MESA: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; CI-MESA: $vgpr0 = COPY [[OR2]](s32) ; GFX9-MESA-LABEL: name: test_load_flat_s32_align1 ; GFX9-MESA: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1 ; GFX9-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p0) :: (load 1) @@ -587,8 +628,12 @@ body: | ; GFX9-MESA: [[AND3:%[0-9]+]]:_(s16) = G_AND [[TRUNC3]], [[C3]] ; GFX9-MESA: [[SHL1:%[0-9]+]]:_(s16) = G_SHL [[AND3]], [[C4]](s16) ; GFX9-MESA: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[SHL1]] - ; GFX9-MESA: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; GFX9-MESA: $vgpr0 = COPY [[MV]](s32) + ; GFX9-MESA: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9-MESA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9-MESA: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9-MESA: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C5]](s32) + ; GFX9-MESA: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; GFX9-MESA: $vgpr0 = COPY [[OR2]](s32) %0:_(p0) = COPY $vgpr0_vgpr1 %1:_(s32) = G_LOAD %0 :: (load 4, align 1, addrspace 0) $vgpr0 = COPY %1 @@ -712,92 +757,142 @@ body: | ; CI-LABEL: name: test_load_flat_s64_align2 ; CI: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1 ; CI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p0) :: (load 2) - ; CI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; CI: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; CI: [[GEP:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C]](s64) ; CI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p0) :: (load 2) - ; CI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; CI: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 ; CI: [[GEP1:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C1]](s64) ; CI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD 
[[GEP1]](p0) :: (load 2) - ; CI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; CI: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 6 ; CI: [[GEP2:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C2]](s64) ; CI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p0) :: (load 2) - ; CI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; CI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) + ; CI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; CI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; CI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C3]] + ; CI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; CI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C3]] + ; CI: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C4]](s32) + ; CI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; CI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; CI: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C3]] + ; CI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; CI: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C3]] + ; CI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) + ; CI: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; CI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32) ; CI: $vgpr0_vgpr1 = COPY [[MV]](s64) ; VI-LABEL: name: test_load_flat_s64_align2 ; VI: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1 ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p0) :: (load 2) - ; VI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; VI: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; VI: [[GEP:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C]](s64) ; VI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p0) :: (load 2) - ; VI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; VI: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 ; VI: [[GEP1:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C1]](s64) ; VI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p0) :: (load 2) - ; VI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; 
VI: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 6 ; VI: [[GEP2:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C2]](s64) ; VI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p0) :: (load 2) - ; VI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; VI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) + ; VI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; VI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; VI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C3]] + ; VI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; VI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C3]] + ; VI: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C4]](s32) + ; VI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; VI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; VI: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C3]] + ; VI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; VI: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C3]] + ; VI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) + ; VI: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; VI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32) ; VI: $vgpr0_vgpr1 = COPY [[MV]](s64) ; GFX9-LABEL: name: test_load_flat_s64_align2 ; GFX9: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1 ; GFX9: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p0) :: (load 2) - ; GFX9: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; GFX9: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; GFX9: [[GEP:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C]](s64) ; GFX9: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p0) :: (load 2) - ; GFX9: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; GFX9: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 ; GFX9: [[GEP1:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C1]](s64) ; GFX9: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p0) :: (load 2) - ; GFX9: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; GFX9: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 6 ; GFX9: 
[[GEP2:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C2]](s64) ; GFX9: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p0) :: (load 2) - ; GFX9: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; GFX9: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) + ; GFX9: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; GFX9: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; GFX9: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C3]] + ; GFX9: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; GFX9: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C3]] + ; GFX9: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C4]](s32) + ; GFX9: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; GFX9: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; GFX9: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C3]] + ; GFX9: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; GFX9: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C3]] + ; GFX9: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) + ; GFX9: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; GFX9: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32) ; GFX9: $vgpr0_vgpr1 = COPY [[MV]](s64) ; CI-MESA-LABEL: name: test_load_flat_s64_align2 ; CI-MESA: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1 ; CI-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p0) :: (load 2) - ; CI-MESA: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; CI-MESA: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; CI-MESA: [[GEP:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C]](s64) ; CI-MESA: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p0) :: (load 2) - ; CI-MESA: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; CI-MESA: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 ; CI-MESA: [[GEP1:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C1]](s64) ; CI-MESA: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p0) :: (load 2) - ; CI-MESA: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; CI-MESA: [[C2:%[0-9]+]]:_(s64) = 
G_CONSTANT i64 6 ; CI-MESA: [[GEP2:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C2]](s64) ; CI-MESA: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p0) :: (load 2) - ; CI-MESA: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; CI-MESA: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) + ; CI-MESA: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; CI-MESA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; CI-MESA: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C3]] + ; CI-MESA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; CI-MESA: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C3]] + ; CI-MESA: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-MESA: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C4]](s32) + ; CI-MESA: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; CI-MESA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; CI-MESA: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C3]] + ; CI-MESA: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; CI-MESA: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C3]] + ; CI-MESA: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) + ; CI-MESA: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; CI-MESA: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32) ; CI-MESA: $vgpr0_vgpr1 = COPY [[MV]](s64) ; GFX9-MESA-LABEL: name: test_load_flat_s64_align2 ; GFX9-MESA: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1 ; GFX9-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p0) :: (load 2) - ; GFX9-MESA: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; GFX9-MESA: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; GFX9-MESA: [[GEP:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C]](s64) ; GFX9-MESA: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p0) :: (load 2) - ; GFX9-MESA: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; GFX9-MESA: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 ; GFX9-MESA: [[GEP1:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C1]](s64) ; GFX9-MESA: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p0) :: (load 2) 
- ; GFX9-MESA: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; GFX9-MESA: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 6 ; GFX9-MESA: [[GEP2:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C2]](s64) ; GFX9-MESA: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p0) :: (load 2) - ; GFX9-MESA: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; GFX9-MESA: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) + ; GFX9-MESA: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; GFX9-MESA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; GFX9-MESA: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C3]] + ; GFX9-MESA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; GFX9-MESA: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C3]] + ; GFX9-MESA: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9-MESA: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C4]](s32) + ; GFX9-MESA: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; GFX9-MESA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; GFX9-MESA: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C3]] + ; GFX9-MESA: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; GFX9-MESA: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C3]] + ; GFX9-MESA: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) + ; GFX9-MESA: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; GFX9-MESA: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32) ; GFX9-MESA: $vgpr0_vgpr1 = COPY [[MV]](s64) %0:_(p0) = COPY $vgpr0_vgpr1 %1:_(s64) = G_LOAD %0 :: (load 8, align 2, addrspace 0) @@ -868,7 +963,16 @@ body: | ; CI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C8]](s32) ; CI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) ; CI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; CI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) + ; CI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 
16 + ; CI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C10]](s32) + ; CI: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; CI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; CI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C10]](s32) + ; CI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; CI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) ; CI: $vgpr0_vgpr1 = COPY [[MV]](s64) ; VI-LABEL: name: test_load_flat_s64_align1 ; VI: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1 @@ -920,7 +1024,16 @@ body: | ; VI: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C7]] ; VI: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C8]](s16) ; VI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; VI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) + ; VI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; VI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; VI: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C9]](s32) + ; VI: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; VI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; VI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; VI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C9]](s32) + ; VI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; VI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) ; VI: $vgpr0_vgpr1 = COPY [[MV]](s64) ; GFX9-LABEL: name: test_load_flat_s64_align1 ; GFX9: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1 @@ -972,7 +1085,16 @@ body: | ; GFX9: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C7]] ; GFX9: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C8]](s16) ; GFX9: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; GFX9: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) + ; GFX9: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT 
[[OR1]](s16) + ; GFX9: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C9]](s32) + ; GFX9: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; GFX9: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; GFX9: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; GFX9: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C9]](s32) + ; GFX9: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; GFX9: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) ; GFX9: $vgpr0_vgpr1 = COPY [[MV]](s64) ; CI-MESA-LABEL: name: test_load_flat_s64_align1 ; CI-MESA: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1 @@ -1032,7 +1154,16 @@ body: | ; CI-MESA: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C8]](s32) ; CI-MESA: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) ; CI-MESA: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; CI-MESA: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) + ; CI-MESA: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI-MESA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI-MESA: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-MESA: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C10]](s32) + ; CI-MESA: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; CI-MESA: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; CI-MESA: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI-MESA: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C10]](s32) + ; CI-MESA: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; CI-MESA: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) ; CI-MESA: $vgpr0_vgpr1 = COPY [[MV]](s64) ; GFX9-MESA-LABEL: name: test_load_flat_s64_align1 ; GFX9-MESA: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1 @@ -1084,7 +1215,16 @@ body: | ; GFX9-MESA: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C7]] ; GFX9-MESA: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C8]](s16) ; GFX9-MESA: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; GFX9-MESA: 
[[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) + ; GFX9-MESA: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9-MESA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9-MESA: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9-MESA: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C9]](s32) + ; GFX9-MESA: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; GFX9-MESA: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; GFX9-MESA: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; GFX9-MESA: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C9]](s32) + ; GFX9-MESA: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; GFX9-MESA: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) ; GFX9-MESA: $vgpr0_vgpr1 = COPY [[MV]](s64) %0:_(p0) = COPY $vgpr0_vgpr1 %1:_(s64) = G_LOAD %0 :: (load 8, align 1, addrspace 0) @@ -1193,132 +1333,202 @@ body: | ; CI-LABEL: name: test_load_flat_s96_align2 ; CI: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1 ; CI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p0) :: (load 2) - ; CI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; CI: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; CI: [[GEP:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C]](s64) ; CI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p0) :: (load 2) - ; CI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; CI: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 ; CI: [[GEP1:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C1]](s64) ; CI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p0) :: (load 2) - ; CI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; CI: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 6 ; CI: [[GEP2:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C2]](s64) ; CI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p0) :: (load 2) - ; CI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) ; CI: [[C3:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 ; CI: [[GEP3:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C3]](s64) ; CI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p0) :: (load 2) - ; CI: 
[[TRUNC4:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD4]](s32) ; CI: [[C4:%[0-9]+]]:_(s64) = G_CONSTANT i64 10 ; CI: [[GEP4:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C4]](s64) ; CI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p0) :: (load 2) - ; CI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) - ; CI: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16), [[TRUNC4]](s16), [[TRUNC5]](s16) + ; CI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; CI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; CI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C5]] + ; CI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; CI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C5]] + ; CI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C6]](s32) + ; CI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; CI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; CI: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C5]] + ; CI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; CI: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C5]] + ; CI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C6]](s32) + ; CI: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; CI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD4]](s32) + ; CI: [[AND4:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C5]] + ; CI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) + ; CI: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C5]] + ; CI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[C6]](s32) + ; CI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[AND4]], [[SHL2]] + ; CI: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32), [[OR2]](s32) ; CI: $vgpr0_vgpr1_vgpr2 = COPY [[MV]](s96) ; VI-LABEL: name: test_load_flat_s96_align2 ; VI: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1 ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p0) :: (load 2) - ; VI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; VI: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; VI: [[GEP:%[0-9]+]]:_(p0) = G_GEP [[COPY]], 
[[C]](s64) ; VI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p0) :: (load 2) - ; VI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; VI: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 ; VI: [[GEP1:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C1]](s64) ; VI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p0) :: (load 2) - ; VI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; VI: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 6 ; VI: [[GEP2:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C2]](s64) ; VI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p0) :: (load 2) - ; VI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) ; VI: [[C3:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 ; VI: [[GEP3:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C3]](s64) ; VI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p0) :: (load 2) - ; VI: [[TRUNC4:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD4]](s32) ; VI: [[C4:%[0-9]+]]:_(s64) = G_CONSTANT i64 10 ; VI: [[GEP4:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C4]](s64) ; VI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p0) :: (load 2) - ; VI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) - ; VI: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16), [[TRUNC4]](s16), [[TRUNC5]](s16) + ; VI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; VI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; VI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C5]] + ; VI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; VI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C5]] + ; VI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C6]](s32) + ; VI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; VI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; VI: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C5]] + ; VI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; VI: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C5]] + ; VI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C6]](s32) + ; VI: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; VI: 
[[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD4]](s32) + ; VI: [[AND4:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C5]] + ; VI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) + ; VI: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C5]] + ; VI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[C6]](s32) + ; VI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[AND4]], [[SHL2]] + ; VI: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32), [[OR2]](s32) ; VI: $vgpr0_vgpr1_vgpr2 = COPY [[MV]](s96) ; GFX9-LABEL: name: test_load_flat_s96_align2 ; GFX9: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1 ; GFX9: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p0) :: (load 2) - ; GFX9: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; GFX9: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; GFX9: [[GEP:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C]](s64) ; GFX9: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p0) :: (load 2) - ; GFX9: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; GFX9: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 ; GFX9: [[GEP1:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C1]](s64) ; GFX9: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p0) :: (load 2) - ; GFX9: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; GFX9: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 6 ; GFX9: [[GEP2:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C2]](s64) ; GFX9: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p0) :: (load 2) - ; GFX9: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) ; GFX9: [[C3:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 ; GFX9: [[GEP3:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C3]](s64) ; GFX9: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p0) :: (load 2) - ; GFX9: [[TRUNC4:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD4]](s32) ; GFX9: [[C4:%[0-9]+]]:_(s64) = G_CONSTANT i64 10 ; GFX9: [[GEP4:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C4]](s64) ; GFX9: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p0) :: (load 2) - ; GFX9: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) - ; GFX9: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16), 
[[TRUNC4]](s16), [[TRUNC5]](s16) + ; GFX9: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; GFX9: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; GFX9: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C5]] + ; GFX9: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; GFX9: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C5]] + ; GFX9: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C6]](s32) + ; GFX9: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; GFX9: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; GFX9: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C5]] + ; GFX9: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; GFX9: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C5]] + ; GFX9: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C6]](s32) + ; GFX9: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; GFX9: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD4]](s32) + ; GFX9: [[AND4:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C5]] + ; GFX9: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) + ; GFX9: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C5]] + ; GFX9: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[C6]](s32) + ; GFX9: [[OR2:%[0-9]+]]:_(s32) = G_OR [[AND4]], [[SHL2]] + ; GFX9: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32), [[OR2]](s32) ; GFX9: $vgpr0_vgpr1_vgpr2 = COPY [[MV]](s96) ; CI-MESA-LABEL: name: test_load_flat_s96_align2 ; CI-MESA: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1 ; CI-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p0) :: (load 2) - ; CI-MESA: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; CI-MESA: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; CI-MESA: [[GEP:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C]](s64) ; CI-MESA: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p0) :: (load 2) - ; CI-MESA: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; CI-MESA: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 ; CI-MESA: [[GEP1:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C1]](s64) ; CI-MESA: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p0) :: 
(load 2) - ; CI-MESA: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; CI-MESA: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 6 ; CI-MESA: [[GEP2:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C2]](s64) ; CI-MESA: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p0) :: (load 2) - ; CI-MESA: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) ; CI-MESA: [[C3:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 ; CI-MESA: [[GEP3:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C3]](s64) ; CI-MESA: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p0) :: (load 2) - ; CI-MESA: [[TRUNC4:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD4]](s32) ; CI-MESA: [[C4:%[0-9]+]]:_(s64) = G_CONSTANT i64 10 ; CI-MESA: [[GEP4:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C4]](s64) ; CI-MESA: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p0) :: (load 2) - ; CI-MESA: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) - ; CI-MESA: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16), [[TRUNC4]](s16), [[TRUNC5]](s16) + ; CI-MESA: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; CI-MESA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; CI-MESA: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C5]] + ; CI-MESA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; CI-MESA: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C5]] + ; CI-MESA: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-MESA: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C6]](s32) + ; CI-MESA: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; CI-MESA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; CI-MESA: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C5]] + ; CI-MESA: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; CI-MESA: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C5]] + ; CI-MESA: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C6]](s32) + ; CI-MESA: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; CI-MESA: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD4]](s32) + ; CI-MESA: [[AND4:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C5]] + ; CI-MESA: [[COPY6:%[0-9]+]]:_(s32) = 
COPY [[LOAD5]](s32) + ; CI-MESA: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C5]] + ; CI-MESA: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[C6]](s32) + ; CI-MESA: [[OR2:%[0-9]+]]:_(s32) = G_OR [[AND4]], [[SHL2]] + ; CI-MESA: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32), [[OR2]](s32) ; CI-MESA: $vgpr0_vgpr1_vgpr2 = COPY [[MV]](s96) ; GFX9-MESA-LABEL: name: test_load_flat_s96_align2 ; GFX9-MESA: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1 ; GFX9-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p0) :: (load 2) - ; GFX9-MESA: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; GFX9-MESA: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; GFX9-MESA: [[GEP:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C]](s64) ; GFX9-MESA: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p0) :: (load 2) - ; GFX9-MESA: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; GFX9-MESA: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 ; GFX9-MESA: [[GEP1:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C1]](s64) ; GFX9-MESA: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p0) :: (load 2) - ; GFX9-MESA: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; GFX9-MESA: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 6 ; GFX9-MESA: [[GEP2:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C2]](s64) ; GFX9-MESA: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p0) :: (load 2) - ; GFX9-MESA: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) ; GFX9-MESA: [[C3:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 ; GFX9-MESA: [[GEP3:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C3]](s64) ; GFX9-MESA: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p0) :: (load 2) - ; GFX9-MESA: [[TRUNC4:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD4]](s32) ; GFX9-MESA: [[C4:%[0-9]+]]:_(s64) = G_CONSTANT i64 10 ; GFX9-MESA: [[GEP4:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C4]](s64) ; GFX9-MESA: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p0) :: (load 2) - ; GFX9-MESA: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) - ; GFX9-MESA: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), 
[[TRUNC3]](s16), [[TRUNC4]](s16), [[TRUNC5]](s16) + ; GFX9-MESA: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; GFX9-MESA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; GFX9-MESA: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C5]] + ; GFX9-MESA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; GFX9-MESA: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C5]] + ; GFX9-MESA: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9-MESA: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C6]](s32) + ; GFX9-MESA: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; GFX9-MESA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; GFX9-MESA: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C5]] + ; GFX9-MESA: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; GFX9-MESA: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C5]] + ; GFX9-MESA: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C6]](s32) + ; GFX9-MESA: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; GFX9-MESA: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD4]](s32) + ; GFX9-MESA: [[AND4:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C5]] + ; GFX9-MESA: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) + ; GFX9-MESA: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C5]] + ; GFX9-MESA: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[C6]](s32) + ; GFX9-MESA: [[OR2:%[0-9]+]]:_(s32) = G_OR [[AND4]], [[SHL2]] + ; GFX9-MESA: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32), [[OR2]](s32) ; GFX9-MESA: $vgpr0_vgpr1_vgpr2 = COPY [[MV]](s96) %0:_(p0) = COPY $vgpr0_vgpr1 %1:_(s96) = G_LOAD %0 :: (load 12, align 2, addrspace 0) @@ -1417,7 +1627,20 @@ body: | ; CI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[C12]](s32) ; CI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL5]](s32) ; CI: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] - ; CI: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16), [[OR4]](s16), [[OR5]](s16) + ; CI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) 
+ ; CI: [[C14:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C14]](s32) + ; CI: [[OR6:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL6]] + ; CI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; CI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C14]](s32) + ; CI: [[OR7:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL7]] + ; CI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; CI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR5]](s16) + ; CI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C14]](s32) + ; CI: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL8]] + ; CI: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR6]](s32), [[OR7]](s32), [[OR8]](s32) ; CI: $vgpr0_vgpr1_vgpr2 = COPY [[MV]](s96) ; VI-LABEL: name: test_load_flat_s96_align1 ; VI: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1 @@ -1493,7 +1716,20 @@ body: | ; VI: [[AND11:%[0-9]+]]:_(s16) = G_AND [[TRUNC11]], [[C11]] ; VI: [[SHL5:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C12]](s16) ; VI: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL5]] - ; VI: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16), [[OR4]](s16), [[OR5]](s16) + ; VI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; VI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; VI: [[C13:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C13]](s32) + ; VI: [[OR6:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL6]] + ; VI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; VI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; VI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C13]](s32) + ; VI: [[OR7:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL7]] + ; VI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; VI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR5]](s16) + ; VI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C13]](s32) + ; VI: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL8]] + ; VI: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES 
[[OR6]](s32), [[OR7]](s32), [[OR8]](s32) ; VI: $vgpr0_vgpr1_vgpr2 = COPY [[MV]](s96) ; GFX9-LABEL: name: test_load_flat_s96_align1 ; GFX9: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1 @@ -1569,7 +1805,20 @@ body: | ; GFX9: [[AND11:%[0-9]+]]:_(s16) = G_AND [[TRUNC11]], [[C11]] ; GFX9: [[SHL5:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C12]](s16) ; GFX9: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL5]] - ; GFX9: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16), [[OR4]](s16), [[OR5]](s16) + ; GFX9: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9: [[C13:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C13]](s32) + ; GFX9: [[OR6:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL6]] + ; GFX9: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; GFX9: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; GFX9: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C13]](s32) + ; GFX9: [[OR7:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL7]] + ; GFX9: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; GFX9: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR5]](s16) + ; GFX9: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C13]](s32) + ; GFX9: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL8]] + ; GFX9: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR6]](s32), [[OR7]](s32), [[OR8]](s32) ; GFX9: $vgpr0_vgpr1_vgpr2 = COPY [[MV]](s96) ; CI-MESA-LABEL: name: test_load_flat_s96_align1 ; CI-MESA: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1 @@ -1657,7 +1906,20 @@ body: | ; CI-MESA: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[C12]](s32) ; CI-MESA: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL5]](s32) ; CI-MESA: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] - ; CI-MESA: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16), [[OR4]](s16), [[OR5]](s16) + ; CI-MESA: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI-MESA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT 
[[OR1]](s16) + ; CI-MESA: [[C14:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-MESA: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C14]](s32) + ; CI-MESA: [[OR6:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL6]] + ; CI-MESA: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; CI-MESA: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI-MESA: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C14]](s32) + ; CI-MESA: [[OR7:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL7]] + ; CI-MESA: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; CI-MESA: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR5]](s16) + ; CI-MESA: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C14]](s32) + ; CI-MESA: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL8]] + ; CI-MESA: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR6]](s32), [[OR7]](s32), [[OR8]](s32) ; CI-MESA: $vgpr0_vgpr1_vgpr2 = COPY [[MV]](s96) ; GFX9-MESA-LABEL: name: test_load_flat_s96_align1 ; GFX9-MESA: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1 @@ -1733,7 +1995,20 @@ body: | ; GFX9-MESA: [[AND11:%[0-9]+]]:_(s16) = G_AND [[TRUNC11]], [[C11]] ; GFX9-MESA: [[SHL5:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C12]](s16) ; GFX9-MESA: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL5]] - ; GFX9-MESA: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16), [[OR4]](s16), [[OR5]](s16) + ; GFX9-MESA: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9-MESA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9-MESA: [[C13:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9-MESA: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C13]](s32) + ; GFX9-MESA: [[OR6:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL6]] + ; GFX9-MESA: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; GFX9-MESA: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; GFX9-MESA: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C13]](s32) + ; GFX9-MESA: [[OR7:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL7]] + ; GFX9-MESA: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; GFX9-MESA: [[ZEXT5:%[0-9]+]]:_(s32) = 
G_ZEXT [[OR5]](s16) + ; GFX9-MESA: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C13]](s32) + ; GFX9-MESA: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL8]] + ; GFX9-MESA: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR6]](s32), [[OR7]](s32), [[OR8]](s32) ; GFX9-MESA: $vgpr0_vgpr1_vgpr2 = COPY [[MV]](s96) %0:_(p0) = COPY $vgpr0_vgpr1 %1:_(s96) = G_LOAD %0 :: (load 12, align 1, addrspace 0) @@ -2057,7 +2332,24 @@ body: | ; CI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[C16]](s32) ; CI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) ; CI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] - ; CI: [[MV:%[0-9]+]]:_(s128) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16), [[OR4]](s16), [[OR5]](s16), [[OR6]](s16), [[OR7]](s16) + ; CI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI: [[C18:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C18]](s32) + ; CI: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL8]] + ; CI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; CI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C18]](s32) + ; CI: [[OR9:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL9]] + ; CI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; CI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR5]](s16) + ; CI: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C18]](s32) + ; CI: [[OR10:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL10]] + ; CI: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; CI: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; CI: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C18]](s32) + ; CI: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; CI: [[MV:%[0-9]+]]:_(s128) = G_MERGE_VALUES [[OR8]](s32), [[OR9]](s32), [[OR10]](s32), [[OR11]](s32) ; CI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[MV]](s128) ; VI-LABEL: name: test_load_flat_s128_align1 ; VI: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1 @@ -2157,7 
+2449,24 @@ body: | ; VI: [[AND15:%[0-9]+]]:_(s16) = G_AND [[TRUNC15]], [[C15]] ; VI: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C16]](s16) ; VI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL7]] - ; VI: [[MV:%[0-9]+]]:_(s128) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16), [[OR4]](s16), [[OR5]](s16), [[OR6]](s16), [[OR7]](s16) + ; VI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; VI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; VI: [[C17:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C17]](s32) + ; VI: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL8]] + ; VI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; VI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; VI: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C17]](s32) + ; VI: [[OR9:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL9]] + ; VI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; VI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR5]](s16) + ; VI: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C17]](s32) + ; VI: [[OR10:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL10]] + ; VI: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; VI: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; VI: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C17]](s32) + ; VI: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; VI: [[MV:%[0-9]+]]:_(s128) = G_MERGE_VALUES [[OR8]](s32), [[OR9]](s32), [[OR10]](s32), [[OR11]](s32) ; VI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[MV]](s128) ; GFX9-LABEL: name: test_load_flat_s128_align1 ; GFX9: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1 @@ -2257,7 +2566,24 @@ body: | ; GFX9: [[AND15:%[0-9]+]]:_(s16) = G_AND [[TRUNC15]], [[C15]] ; GFX9: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C16]](s16) ; GFX9: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL7]] - ; GFX9: [[MV:%[0-9]+]]:_(s128) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16), [[OR4]](s16), [[OR5]](s16), [[OR6]](s16), [[OR7]](s16) + ; GFX9: 
[[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9: [[C17:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C17]](s32) + ; GFX9: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL8]] + ; GFX9: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; GFX9: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; GFX9: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C17]](s32) + ; GFX9: [[OR9:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL9]] + ; GFX9: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; GFX9: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR5]](s16) + ; GFX9: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C17]](s32) + ; GFX9: [[OR10:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL10]] + ; GFX9: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; GFX9: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; GFX9: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C17]](s32) + ; GFX9: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; GFX9: [[MV:%[0-9]+]]:_(s128) = G_MERGE_VALUES [[OR8]](s32), [[OR9]](s32), [[OR10]](s32), [[OR11]](s32) ; GFX9: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[MV]](s128) ; CI-MESA-LABEL: name: test_load_flat_s128_align1 ; CI-MESA: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1 @@ -2373,7 +2699,24 @@ body: | ; CI-MESA: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[C16]](s32) ; CI-MESA: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) ; CI-MESA: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] - ; CI-MESA: [[MV:%[0-9]+]]:_(s128) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16), [[OR4]](s16), [[OR5]](s16), [[OR6]](s16), [[OR7]](s16) + ; CI-MESA: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI-MESA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI-MESA: [[C18:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-MESA: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C18]](s32) + ; CI-MESA: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL8]] + ; CI-MESA: [[ZEXT2:%[0-9]+]]:_(s32) 
= G_ZEXT [[OR2]](s16) + ; CI-MESA: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI-MESA: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C18]](s32) + ; CI-MESA: [[OR9:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL9]] + ; CI-MESA: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; CI-MESA: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR5]](s16) + ; CI-MESA: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C18]](s32) + ; CI-MESA: [[OR10:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL10]] + ; CI-MESA: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; CI-MESA: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; CI-MESA: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C18]](s32) + ; CI-MESA: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; CI-MESA: [[MV:%[0-9]+]]:_(s128) = G_MERGE_VALUES [[OR8]](s32), [[OR9]](s32), [[OR10]](s32), [[OR11]](s32) ; CI-MESA: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[MV]](s128) ; GFX9-MESA-LABEL: name: test_load_flat_s128_align1 ; GFX9-MESA: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1 @@ -2473,7 +2816,24 @@ body: | ; GFX9-MESA: [[AND15:%[0-9]+]]:_(s16) = G_AND [[TRUNC15]], [[C15]] ; GFX9-MESA: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C16]](s16) ; GFX9-MESA: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL7]] - ; GFX9-MESA: [[MV:%[0-9]+]]:_(s128) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16), [[OR4]](s16), [[OR5]](s16), [[OR6]](s16), [[OR7]](s16) + ; GFX9-MESA: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9-MESA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9-MESA: [[C17:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9-MESA: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C17]](s32) + ; GFX9-MESA: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL8]] + ; GFX9-MESA: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; GFX9-MESA: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; GFX9-MESA: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C17]](s32) + ; GFX9-MESA: [[OR9:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL9]] + ; GFX9-MESA: 
[[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; GFX9-MESA: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR5]](s16) + ; GFX9-MESA: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C17]](s32) + ; GFX9-MESA: [[OR10:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL10]] + ; GFX9-MESA: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; GFX9-MESA: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; GFX9-MESA: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C17]](s32) + ; GFX9-MESA: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; GFX9-MESA: [[MV:%[0-9]+]]:_(s128) = G_MERGE_VALUES [[OR8]](s32), [[OR9]](s32), [[OR10]](s32), [[OR11]](s32) ; GFX9-MESA: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[MV]](s128) %0:_(p0) = COPY $vgpr0_vgpr1 %1:_(s128) = G_LOAD %0 :: (load 16, align 1, addrspace 0) @@ -2657,7 +3017,16 @@ body: | ; CI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C8]](s32) ; CI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) ; CI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; CI: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) + ; CI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C10]](s32) + ; CI: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; CI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; CI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C10]](s32) + ; CI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; CI: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) ; CI: $vgpr0_vgpr1 = COPY [[MV]](p1) ; VI-LABEL: name: test_load_flat_p1_align1 ; VI: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1 @@ -2709,7 +3078,16 @@ body: | ; VI: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C7]] ; VI: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C8]](s16) ; VI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; VI: 
[[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) + ; VI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; VI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; VI: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C9]](s32) + ; VI: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; VI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; VI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; VI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C9]](s32) + ; VI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; VI: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) ; VI: $vgpr0_vgpr1 = COPY [[MV]](p1) ; GFX9-LABEL: name: test_load_flat_p1_align1 ; GFX9: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1 @@ -2761,7 +3139,16 @@ body: | ; GFX9: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C7]] ; GFX9: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C8]](s16) ; GFX9: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; GFX9: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) + ; GFX9: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C9]](s32) + ; GFX9: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; GFX9: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; GFX9: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; GFX9: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C9]](s32) + ; GFX9: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; GFX9: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) ; GFX9: $vgpr0_vgpr1 = COPY [[MV]](p1) ; CI-MESA-LABEL: name: test_load_flat_p1_align1 ; CI-MESA: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1 @@ -2821,7 +3208,16 @@ body: | ; CI-MESA: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C8]](s32) ; CI-MESA: [[TRUNC7:%[0-9]+]]:_(s16) = 
G_TRUNC [[SHL3]](s32) ; CI-MESA: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; CI-MESA: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) + ; CI-MESA: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI-MESA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI-MESA: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-MESA: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C10]](s32) + ; CI-MESA: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; CI-MESA: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; CI-MESA: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI-MESA: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C10]](s32) + ; CI-MESA: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; CI-MESA: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) ; CI-MESA: $vgpr0_vgpr1 = COPY [[MV]](p1) ; GFX9-MESA-LABEL: name: test_load_flat_p1_align1 ; GFX9-MESA: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1 @@ -2873,7 +3269,16 @@ body: | ; GFX9-MESA: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C7]] ; GFX9-MESA: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C8]](s16) ; GFX9-MESA: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; GFX9-MESA: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) + ; GFX9-MESA: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9-MESA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9-MESA: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9-MESA: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C9]](s32) + ; GFX9-MESA: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; GFX9-MESA: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; GFX9-MESA: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; GFX9-MESA: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C9]](s32) + ; GFX9-MESA: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; GFX9-MESA: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) ; GFX9-MESA: $vgpr0_vgpr1 = COPY [[MV]](p1) 
%0:_(p0) = COPY $vgpr0_vgpr1 %1:_(p1) = G_LOAD %0 :: (load 8, align 1, addrspace 0) @@ -2982,92 +3387,142 @@ body: | ; CI-LABEL: name: test_load_flat_p4_align2 ; CI: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1 ; CI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p0) :: (load 2) - ; CI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; CI: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; CI: [[GEP:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C]](s64) ; CI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p0) :: (load 2) - ; CI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; CI: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 ; CI: [[GEP1:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C1]](s64) ; CI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p0) :: (load 2) - ; CI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; CI: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 6 ; CI: [[GEP2:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C2]](s64) ; CI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p0) :: (load 2) - ; CI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; CI: [[MV:%[0-9]+]]:_(p4) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) + ; CI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; CI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; CI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C3]] + ; CI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; CI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C3]] + ; CI: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C4]](s32) + ; CI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; CI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; CI: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C3]] + ; CI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; CI: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C3]] + ; CI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) + ; CI: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; CI: [[MV:%[0-9]+]]:_(p4) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32) ; CI: 
$vgpr0_vgpr1 = COPY [[MV]](p4) ; VI-LABEL: name: test_load_flat_p4_align2 ; VI: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1 ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p0) :: (load 2) - ; VI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; VI: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; VI: [[GEP:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C]](s64) ; VI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p0) :: (load 2) - ; VI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; VI: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 ; VI: [[GEP1:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C1]](s64) ; VI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p0) :: (load 2) - ; VI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; VI: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 6 ; VI: [[GEP2:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C2]](s64) ; VI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p0) :: (load 2) - ; VI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; VI: [[MV:%[0-9]+]]:_(p4) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) + ; VI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; VI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; VI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C3]] + ; VI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; VI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C3]] + ; VI: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C4]](s32) + ; VI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; VI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; VI: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C3]] + ; VI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; VI: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C3]] + ; VI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) + ; VI: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; VI: [[MV:%[0-9]+]]:_(p4) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32) ; VI: $vgpr0_vgpr1 = COPY [[MV]](p4) ; GFX9-LABEL: name: test_load_flat_p4_align2 ; GFX9: 
[[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1 ; GFX9: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p0) :: (load 2) - ; GFX9: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; GFX9: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; GFX9: [[GEP:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C]](s64) ; GFX9: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p0) :: (load 2) - ; GFX9: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; GFX9: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 ; GFX9: [[GEP1:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C1]](s64) ; GFX9: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p0) :: (load 2) - ; GFX9: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; GFX9: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 6 ; GFX9: [[GEP2:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C2]](s64) ; GFX9: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p0) :: (load 2) - ; GFX9: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; GFX9: [[MV:%[0-9]+]]:_(p4) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) + ; GFX9: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; GFX9: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; GFX9: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C3]] + ; GFX9: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; GFX9: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C3]] + ; GFX9: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C4]](s32) + ; GFX9: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; GFX9: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; GFX9: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C3]] + ; GFX9: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; GFX9: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C3]] + ; GFX9: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) + ; GFX9: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; GFX9: [[MV:%[0-9]+]]:_(p4) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32) ; GFX9: $vgpr0_vgpr1 = COPY [[MV]](p4) ; CI-MESA-LABEL: name: test_load_flat_p4_align2 ; CI-MESA: 
[[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1 ; CI-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p0) :: (load 2) - ; CI-MESA: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; CI-MESA: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; CI-MESA: [[GEP:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C]](s64) ; CI-MESA: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p0) :: (load 2) - ; CI-MESA: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; CI-MESA: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 ; CI-MESA: [[GEP1:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C1]](s64) ; CI-MESA: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p0) :: (load 2) - ; CI-MESA: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; CI-MESA: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 6 ; CI-MESA: [[GEP2:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C2]](s64) ; CI-MESA: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p0) :: (load 2) - ; CI-MESA: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; CI-MESA: [[MV:%[0-9]+]]:_(p4) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) + ; CI-MESA: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; CI-MESA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; CI-MESA: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C3]] + ; CI-MESA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; CI-MESA: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C3]] + ; CI-MESA: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-MESA: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C4]](s32) + ; CI-MESA: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; CI-MESA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; CI-MESA: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C3]] + ; CI-MESA: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; CI-MESA: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C3]] + ; CI-MESA: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) + ; CI-MESA: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; CI-MESA: [[MV:%[0-9]+]]:_(p4) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32) ; CI-MESA: $vgpr0_vgpr1 = 
COPY [[MV]](p4) ; GFX9-MESA-LABEL: name: test_load_flat_p4_align2 ; GFX9-MESA: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1 ; GFX9-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p0) :: (load 2) - ; GFX9-MESA: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; GFX9-MESA: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; GFX9-MESA: [[GEP:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C]](s64) ; GFX9-MESA: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p0) :: (load 2) - ; GFX9-MESA: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; GFX9-MESA: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 ; GFX9-MESA: [[GEP1:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C1]](s64) ; GFX9-MESA: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p0) :: (load 2) - ; GFX9-MESA: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; GFX9-MESA: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 6 ; GFX9-MESA: [[GEP2:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C2]](s64) ; GFX9-MESA: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p0) :: (load 2) - ; GFX9-MESA: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; GFX9-MESA: [[MV:%[0-9]+]]:_(p4) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) + ; GFX9-MESA: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; GFX9-MESA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; GFX9-MESA: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C3]] + ; GFX9-MESA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; GFX9-MESA: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C3]] + ; GFX9-MESA: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9-MESA: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C4]](s32) + ; GFX9-MESA: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; GFX9-MESA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; GFX9-MESA: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C3]] + ; GFX9-MESA: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; GFX9-MESA: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C3]] + ; GFX9-MESA: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) + ; GFX9-MESA: 
[[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; GFX9-MESA: [[MV:%[0-9]+]]:_(p4) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32) ; GFX9-MESA: $vgpr0_vgpr1 = COPY [[MV]](p4) %0:_(p0) = COPY $vgpr0_vgpr1 %1:_(p4) = G_LOAD %0 :: (load 8, align 2, addrspace 0) @@ -3138,7 +3593,16 @@ body: | ; CI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C8]](s32) ; CI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) ; CI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; CI: [[MV:%[0-9]+]]:_(p4) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) + ; CI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C10]](s32) + ; CI: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; CI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; CI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C10]](s32) + ; CI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; CI: [[MV:%[0-9]+]]:_(p4) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) ; CI: $vgpr0_vgpr1 = COPY [[MV]](p4) ; VI-LABEL: name: test_load_flat_p4_align1 ; VI: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1 @@ -3190,7 +3654,16 @@ body: | ; VI: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C7]] ; VI: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C8]](s16) ; VI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; VI: [[MV:%[0-9]+]]:_(p4) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) + ; VI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; VI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; VI: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C9]](s32) + ; VI: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; VI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; VI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; VI: [[SHL5:%[0-9]+]]:_(s32) = 
G_SHL [[ZEXT3]], [[C9]](s32) + ; VI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; VI: [[MV:%[0-9]+]]:_(p4) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) ; VI: $vgpr0_vgpr1 = COPY [[MV]](p4) ; GFX9-LABEL: name: test_load_flat_p4_align1 ; GFX9: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1 @@ -3242,7 +3715,16 @@ body: | ; GFX9: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C7]] ; GFX9: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C8]](s16) ; GFX9: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; GFX9: [[MV:%[0-9]+]]:_(p4) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) + ; GFX9: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C9]](s32) + ; GFX9: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; GFX9: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; GFX9: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; GFX9: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C9]](s32) + ; GFX9: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; GFX9: [[MV:%[0-9]+]]:_(p4) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) ; GFX9: $vgpr0_vgpr1 = COPY [[MV]](p4) ; CI-MESA-LABEL: name: test_load_flat_p4_align1 ; CI-MESA: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1 @@ -3302,7 +3784,16 @@ body: | ; CI-MESA: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C8]](s32) ; CI-MESA: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) ; CI-MESA: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; CI-MESA: [[MV:%[0-9]+]]:_(p4) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) + ; CI-MESA: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI-MESA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI-MESA: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-MESA: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C10]](s32) + ; CI-MESA: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; CI-MESA: 
[[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; CI-MESA: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI-MESA: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C10]](s32) + ; CI-MESA: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; CI-MESA: [[MV:%[0-9]+]]:_(p4) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) ; CI-MESA: $vgpr0_vgpr1 = COPY [[MV]](p4) ; GFX9-MESA-LABEL: name: test_load_flat_p4_align1 ; GFX9-MESA: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1 @@ -3354,7 +3845,16 @@ body: | ; GFX9-MESA: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C7]] ; GFX9-MESA: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C8]](s16) ; GFX9-MESA: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; GFX9-MESA: [[MV:%[0-9]+]]:_(p4) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) + ; GFX9-MESA: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9-MESA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9-MESA: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9-MESA: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C9]](s32) + ; GFX9-MESA: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; GFX9-MESA: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; GFX9-MESA: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; GFX9-MESA: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C9]](s32) + ; GFX9-MESA: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; GFX9-MESA: [[MV:%[0-9]+]]:_(p4) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) ; GFX9-MESA: $vgpr0_vgpr1 = COPY [[MV]](p4) %0:_(p0) = COPY $vgpr0_vgpr1 %1:_(p4) = G_LOAD %0 :: (load 8, align 1, addrspace 0) @@ -3401,53 +3901,83 @@ body: | ; CI-LABEL: name: test_load_flat_p5_align2 ; CI: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1 ; CI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p0) :: (load 2) - ; CI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; CI: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; CI: [[GEP:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C]](s64) ; CI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p0) :: (load 2) - ; CI: 
[[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; CI: [[MV:%[0-9]+]]:_(p5) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; CI: $vgpr0 = COPY [[MV]](p5) + ; CI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; CI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; CI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; CI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; CI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; CI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; CI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; CI: [[INTTOPTR:%[0-9]+]]:_(p5) = G_INTTOPTR [[OR]](s32) + ; CI: $vgpr0 = COPY [[INTTOPTR]](p5) ; VI-LABEL: name: test_load_flat_p5_align2 ; VI: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1 ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p0) :: (load 2) - ; VI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; VI: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; VI: [[GEP:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C]](s64) ; VI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p0) :: (load 2) - ; VI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; VI: [[MV:%[0-9]+]]:_(p5) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; VI: $vgpr0 = COPY [[MV]](p5) + ; VI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; VI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; VI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; VI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; VI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; VI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; VI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; VI: [[INTTOPTR:%[0-9]+]]:_(p5) = G_INTTOPTR [[OR]](s32) + ; VI: $vgpr0 = COPY [[INTTOPTR]](p5) ; GFX9-LABEL: name: test_load_flat_p5_align2 ; GFX9: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1 ; GFX9: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p0) :: (load 2) - ; GFX9: [[TRUNC:%[0-9]+]]:_(s16) = 
G_TRUNC [[LOAD]](s32) ; GFX9: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; GFX9: [[GEP:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C]](s64) ; GFX9: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p0) :: (load 2) - ; GFX9: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; GFX9: [[MV:%[0-9]+]]:_(p5) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; GFX9: $vgpr0 = COPY [[MV]](p5) + ; GFX9: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; GFX9: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; GFX9: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; GFX9: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; GFX9: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; GFX9: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; GFX9: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; GFX9: [[INTTOPTR:%[0-9]+]]:_(p5) = G_INTTOPTR [[OR]](s32) + ; GFX9: $vgpr0 = COPY [[INTTOPTR]](p5) ; CI-MESA-LABEL: name: test_load_flat_p5_align2 ; CI-MESA: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1 ; CI-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p0) :: (load 2) - ; CI-MESA: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; CI-MESA: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; CI-MESA: [[GEP:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C]](s64) ; CI-MESA: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p0) :: (load 2) - ; CI-MESA: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; CI-MESA: [[MV:%[0-9]+]]:_(p5) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; CI-MESA: $vgpr0 = COPY [[MV]](p5) + ; CI-MESA: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; CI-MESA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; CI-MESA: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; CI-MESA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; CI-MESA: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; CI-MESA: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-MESA: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; CI-MESA: [[OR:%[0-9]+]]:_(s32) = G_OR 
[[AND]], [[SHL]] + ; CI-MESA: [[INTTOPTR:%[0-9]+]]:_(p5) = G_INTTOPTR [[OR]](s32) + ; CI-MESA: $vgpr0 = COPY [[INTTOPTR]](p5) ; GFX9-MESA-LABEL: name: test_load_flat_p5_align2 ; GFX9-MESA: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1 ; GFX9-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p0) :: (load 2) - ; GFX9-MESA: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; GFX9-MESA: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; GFX9-MESA: [[GEP:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C]](s64) ; GFX9-MESA: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p0) :: (load 2) - ; GFX9-MESA: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; GFX9-MESA: [[MV:%[0-9]+]]:_(p5) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; GFX9-MESA: $vgpr0 = COPY [[MV]](p5) + ; GFX9-MESA: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; GFX9-MESA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; GFX9-MESA: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; GFX9-MESA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; GFX9-MESA: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; GFX9-MESA: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9-MESA: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; GFX9-MESA: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; GFX9-MESA: [[INTTOPTR:%[0-9]+]]:_(p5) = G_INTTOPTR [[OR]](s32) + ; GFX9-MESA: $vgpr0 = COPY [[INTTOPTR]](p5) %0:_(p0) = COPY $vgpr0_vgpr1 %1:_(p5) = G_LOAD %0 :: (load 4, align 2, addrspace 0) $vgpr0 = COPY %1 @@ -3489,8 +4019,13 @@ body: | ; CI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) ; CI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[SHL1]](s32) ; CI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[TRUNC3]] - ; CI: [[MV:%[0-9]+]]:_(p5) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; CI: $vgpr0 = COPY [[MV]](p5) + ; CI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C6]](s32) + ; CI: 
[[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; CI: [[INTTOPTR:%[0-9]+]]:_(p5) = G_INTTOPTR [[OR2]](s32) + ; CI: $vgpr0 = COPY [[INTTOPTR]](p5) ; VI-LABEL: name: test_load_flat_p5_align1 ; VI: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1 ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p0) :: (load 1) @@ -3517,8 +4052,13 @@ body: | ; VI: [[AND3:%[0-9]+]]:_(s16) = G_AND [[TRUNC3]], [[C3]] ; VI: [[SHL1:%[0-9]+]]:_(s16) = G_SHL [[AND3]], [[C4]](s16) ; VI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[SHL1]] - ; VI: [[MV:%[0-9]+]]:_(p5) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; VI: $vgpr0 = COPY [[MV]](p5) + ; VI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; VI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; VI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C5]](s32) + ; VI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; VI: [[INTTOPTR:%[0-9]+]]:_(p5) = G_INTTOPTR [[OR2]](s32) + ; VI: $vgpr0 = COPY [[INTTOPTR]](p5) ; GFX9-LABEL: name: test_load_flat_p5_align1 ; GFX9: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1 ; GFX9: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p0) :: (load 1) @@ -3545,8 +4085,13 @@ body: | ; GFX9: [[AND3:%[0-9]+]]:_(s16) = G_AND [[TRUNC3]], [[C3]] ; GFX9: [[SHL1:%[0-9]+]]:_(s16) = G_SHL [[AND3]], [[C4]](s16) ; GFX9: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[SHL1]] - ; GFX9: [[MV:%[0-9]+]]:_(p5) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; GFX9: $vgpr0 = COPY [[MV]](p5) + ; GFX9: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C5]](s32) + ; GFX9: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; GFX9: [[INTTOPTR:%[0-9]+]]:_(p5) = G_INTTOPTR [[OR2]](s32) + ; GFX9: $vgpr0 = COPY [[INTTOPTR]](p5) ; CI-MESA-LABEL: name: test_load_flat_p5_align1 ; CI-MESA: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1 ; CI-MESA: [[LOAD:%[0-9]+]]:_(s32) 
= G_LOAD [[COPY]](p0) :: (load 1) @@ -3577,8 +4122,13 @@ body: | ; CI-MESA: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) ; CI-MESA: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[SHL1]](s32) ; CI-MESA: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[TRUNC3]] - ; CI-MESA: [[MV:%[0-9]+]]:_(p5) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; CI-MESA: $vgpr0 = COPY [[MV]](p5) + ; CI-MESA: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI-MESA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI-MESA: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-MESA: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C6]](s32) + ; CI-MESA: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; CI-MESA: [[INTTOPTR:%[0-9]+]]:_(p5) = G_INTTOPTR [[OR2]](s32) + ; CI-MESA: $vgpr0 = COPY [[INTTOPTR]](p5) ; GFX9-MESA-LABEL: name: test_load_flat_p5_align1 ; GFX9-MESA: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1 ; GFX9-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p0) :: (load 1) @@ -3605,8 +4155,13 @@ body: | ; GFX9-MESA: [[AND3:%[0-9]+]]:_(s16) = G_AND [[TRUNC3]], [[C3]] ; GFX9-MESA: [[SHL1:%[0-9]+]]:_(s16) = G_SHL [[AND3]], [[C4]](s16) ; GFX9-MESA: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[SHL1]] - ; GFX9-MESA: [[MV:%[0-9]+]]:_(p5) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; GFX9-MESA: $vgpr0 = COPY [[MV]](p5) + ; GFX9-MESA: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9-MESA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9-MESA: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9-MESA: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C5]](s32) + ; GFX9-MESA: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; GFX9-MESA: [[INTTOPTR:%[0-9]+]]:_(p5) = G_INTTOPTR [[OR2]](s32) + ; GFX9-MESA: $vgpr0 = COPY [[INTTOPTR]](p5) %0:_(p0) = COPY $vgpr0_vgpr1 %1:_(p5) = G_LOAD %0 :: (load 4, align 1, addrspace 0) $vgpr0 = COPY %1 @@ -6480,166 +7035,256 @@ body: | ; CI-LABEL: name: test_load_flat_v2s64_align2 ; CI: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1 ; CI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD 
[[COPY]](p0) :: (load 2) - ; CI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; CI: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; CI: [[GEP:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C]](s64) ; CI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p0) :: (load 2) - ; CI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; CI: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 ; CI: [[GEP1:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C1]](s64) ; CI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p0) :: (load 2) - ; CI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; CI: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 6 ; CI: [[GEP2:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C2]](s64) ; CI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p0) :: (load 2) - ; CI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; CI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) - ; CI: [[C3:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 - ; CI: [[GEP3:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C3]](s64) + ; CI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; CI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; CI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C3]] + ; CI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; CI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C3]] + ; CI: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C4]](s32) + ; CI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; CI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; CI: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C3]] + ; CI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; CI: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C3]] + ; CI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) + ; CI: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; CI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32) + ; CI: [[C5:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 + ; CI: [[GEP3:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C5]](s64) ; CI: [[LOAD4:%[0-9]+]]:_(s32) = 
G_LOAD [[GEP3]](p0) :: (load 2) - ; CI: [[TRUNC4:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD4]](s32) ; CI: [[GEP4:%[0-9]+]]:_(p0) = G_GEP [[GEP3]], [[C]](s64) ; CI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p0) :: (load 2) - ; CI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) ; CI: [[GEP5:%[0-9]+]]:_(p0) = G_GEP [[GEP3]], [[C1]](s64) ; CI: [[LOAD6:%[0-9]+]]:_(s32) = G_LOAD [[GEP5]](p0) :: (load 2) - ; CI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; CI: [[GEP6:%[0-9]+]]:_(p0) = G_GEP [[GEP3]], [[C2]](s64) ; CI: [[LOAD7:%[0-9]+]]:_(s32) = G_LOAD [[GEP6]](p0) :: (load 2) - ; CI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) - ; CI: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[TRUNC4]](s16), [[TRUNC5]](s16), [[TRUNC6]](s16), [[TRUNC7]](s16) + ; CI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD4]](s32) + ; CI: [[AND4:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C3]] + ; CI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) + ; CI: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C3]] + ; CI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[C4]](s32) + ; CI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[AND4]], [[SHL2]] + ; CI: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LOAD6]](s32) + ; CI: [[AND6:%[0-9]+]]:_(s32) = G_AND [[COPY7]], [[C3]] + ; CI: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) + ; CI: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY8]], [[C3]] + ; CI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C4]](s32) + ; CI: [[OR3:%[0-9]+]]:_(s32) = G_OR [[AND6]], [[SHL3]] + ; CI: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR2]](s32), [[OR3]](s32) ; CI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s64>) = G_BUILD_VECTOR [[MV]](s64), [[MV1]](s64) ; CI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BUILD_VECTOR]](<2 x s64>) ; VI-LABEL: name: test_load_flat_v2s64_align2 ; VI: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1 ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p0) :: (load 2) - ; VI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; VI: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; VI: [[GEP:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C]](s64) ; VI: 
[[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p0) :: (load 2) - ; VI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; VI: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 ; VI: [[GEP1:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C1]](s64) ; VI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p0) :: (load 2) - ; VI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; VI: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 6 ; VI: [[GEP2:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C2]](s64) ; VI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p0) :: (load 2) - ; VI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; VI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) - ; VI: [[C3:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 - ; VI: [[GEP3:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C3]](s64) + ; VI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; VI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; VI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C3]] + ; VI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; VI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C3]] + ; VI: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C4]](s32) + ; VI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; VI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; VI: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C3]] + ; VI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; VI: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C3]] + ; VI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) + ; VI: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; VI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32) + ; VI: [[C5:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 + ; VI: [[GEP3:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C5]](s64) ; VI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p0) :: (load 2) - ; VI: [[TRUNC4:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD4]](s32) ; VI: [[GEP4:%[0-9]+]]:_(p0) = G_GEP [[GEP3]], [[C]](s64) ; VI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD 
[[GEP4]](p0) :: (load 2) - ; VI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) ; VI: [[GEP5:%[0-9]+]]:_(p0) = G_GEP [[GEP3]], [[C1]](s64) ; VI: [[LOAD6:%[0-9]+]]:_(s32) = G_LOAD [[GEP5]](p0) :: (load 2) - ; VI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; VI: [[GEP6:%[0-9]+]]:_(p0) = G_GEP [[GEP3]], [[C2]](s64) ; VI: [[LOAD7:%[0-9]+]]:_(s32) = G_LOAD [[GEP6]](p0) :: (load 2) - ; VI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) - ; VI: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[TRUNC4]](s16), [[TRUNC5]](s16), [[TRUNC6]](s16), [[TRUNC7]](s16) + ; VI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD4]](s32) + ; VI: [[AND4:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C3]] + ; VI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) + ; VI: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C3]] + ; VI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[C4]](s32) + ; VI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[AND4]], [[SHL2]] + ; VI: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LOAD6]](s32) + ; VI: [[AND6:%[0-9]+]]:_(s32) = G_AND [[COPY7]], [[C3]] + ; VI: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) + ; VI: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY8]], [[C3]] + ; VI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C4]](s32) + ; VI: [[OR3:%[0-9]+]]:_(s32) = G_OR [[AND6]], [[SHL3]] + ; VI: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR2]](s32), [[OR3]](s32) ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s64>) = G_BUILD_VECTOR [[MV]](s64), [[MV1]](s64) ; VI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BUILD_VECTOR]](<2 x s64>) ; GFX9-LABEL: name: test_load_flat_v2s64_align2 ; GFX9: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1 ; GFX9: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p0) :: (load 2) - ; GFX9: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; GFX9: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; GFX9: [[GEP:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C]](s64) ; GFX9: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p0) :: (load 2) - ; GFX9: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; GFX9: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 ; GFX9: 
[[GEP1:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C1]](s64) ; GFX9: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p0) :: (load 2) - ; GFX9: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; GFX9: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 6 ; GFX9: [[GEP2:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C2]](s64) ; GFX9: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p0) :: (load 2) - ; GFX9: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; GFX9: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) - ; GFX9: [[C3:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 - ; GFX9: [[GEP3:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C3]](s64) + ; GFX9: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; GFX9: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; GFX9: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C3]] + ; GFX9: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; GFX9: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C3]] + ; GFX9: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C4]](s32) + ; GFX9: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; GFX9: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; GFX9: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C3]] + ; GFX9: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; GFX9: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C3]] + ; GFX9: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) + ; GFX9: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; GFX9: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32) + ; GFX9: [[C5:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 + ; GFX9: [[GEP3:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C5]](s64) ; GFX9: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p0) :: (load 2) - ; GFX9: [[TRUNC4:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD4]](s32) ; GFX9: [[GEP4:%[0-9]+]]:_(p0) = G_GEP [[GEP3]], [[C]](s64) ; GFX9: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p0) :: (load 2) - ; GFX9: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) ; GFX9: [[GEP5:%[0-9]+]]:_(p0) 
= G_GEP [[GEP3]], [[C1]](s64) ; GFX9: [[LOAD6:%[0-9]+]]:_(s32) = G_LOAD [[GEP5]](p0) :: (load 2) - ; GFX9: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; GFX9: [[GEP6:%[0-9]+]]:_(p0) = G_GEP [[GEP3]], [[C2]](s64) ; GFX9: [[LOAD7:%[0-9]+]]:_(s32) = G_LOAD [[GEP6]](p0) :: (load 2) - ; GFX9: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) - ; GFX9: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[TRUNC4]](s16), [[TRUNC5]](s16), [[TRUNC6]](s16), [[TRUNC7]](s16) + ; GFX9: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD4]](s32) + ; GFX9: [[AND4:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C3]] + ; GFX9: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) + ; GFX9: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C3]] + ; GFX9: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[C4]](s32) + ; GFX9: [[OR2:%[0-9]+]]:_(s32) = G_OR [[AND4]], [[SHL2]] + ; GFX9: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LOAD6]](s32) + ; GFX9: [[AND6:%[0-9]+]]:_(s32) = G_AND [[COPY7]], [[C3]] + ; GFX9: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) + ; GFX9: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY8]], [[C3]] + ; GFX9: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C4]](s32) + ; GFX9: [[OR3:%[0-9]+]]:_(s32) = G_OR [[AND6]], [[SHL3]] + ; GFX9: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR2]](s32), [[OR3]](s32) ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s64>) = G_BUILD_VECTOR [[MV]](s64), [[MV1]](s64) ; GFX9: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BUILD_VECTOR]](<2 x s64>) ; CI-MESA-LABEL: name: test_load_flat_v2s64_align2 ; CI-MESA: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1 ; CI-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p0) :: (load 2) - ; CI-MESA: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; CI-MESA: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; CI-MESA: [[GEP:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C]](s64) ; CI-MESA: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p0) :: (load 2) - ; CI-MESA: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; CI-MESA: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 ; CI-MESA: [[GEP1:%[0-9]+]]:_(p0) = G_GEP [[COPY]], 
[[C1]](s64) ; CI-MESA: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p0) :: (load 2) - ; CI-MESA: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; CI-MESA: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 6 ; CI-MESA: [[GEP2:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C2]](s64) ; CI-MESA: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p0) :: (load 2) - ; CI-MESA: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; CI-MESA: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) - ; CI-MESA: [[C3:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 - ; CI-MESA: [[GEP3:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C3]](s64) + ; CI-MESA: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; CI-MESA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; CI-MESA: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C3]] + ; CI-MESA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; CI-MESA: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C3]] + ; CI-MESA: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-MESA: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C4]](s32) + ; CI-MESA: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; CI-MESA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; CI-MESA: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C3]] + ; CI-MESA: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; CI-MESA: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C3]] + ; CI-MESA: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) + ; CI-MESA: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; CI-MESA: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32) + ; CI-MESA: [[C5:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 + ; CI-MESA: [[GEP3:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C5]](s64) ; CI-MESA: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p0) :: (load 2) - ; CI-MESA: [[TRUNC4:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD4]](s32) ; CI-MESA: [[GEP4:%[0-9]+]]:_(p0) = G_GEP [[GEP3]], [[C]](s64) ; CI-MESA: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p0) :: (load 2) - ; CI-MESA: [[TRUNC5:%[0-9]+]]:_(s16) = 
G_TRUNC [[LOAD5]](s32) ; CI-MESA: [[GEP5:%[0-9]+]]:_(p0) = G_GEP [[GEP3]], [[C1]](s64) ; CI-MESA: [[LOAD6:%[0-9]+]]:_(s32) = G_LOAD [[GEP5]](p0) :: (load 2) - ; CI-MESA: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; CI-MESA: [[GEP6:%[0-9]+]]:_(p0) = G_GEP [[GEP3]], [[C2]](s64) ; CI-MESA: [[LOAD7:%[0-9]+]]:_(s32) = G_LOAD [[GEP6]](p0) :: (load 2) - ; CI-MESA: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) - ; CI-MESA: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[TRUNC4]](s16), [[TRUNC5]](s16), [[TRUNC6]](s16), [[TRUNC7]](s16) + ; CI-MESA: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD4]](s32) + ; CI-MESA: [[AND4:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C3]] + ; CI-MESA: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) + ; CI-MESA: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C3]] + ; CI-MESA: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[C4]](s32) + ; CI-MESA: [[OR2:%[0-9]+]]:_(s32) = G_OR [[AND4]], [[SHL2]] + ; CI-MESA: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LOAD6]](s32) + ; CI-MESA: [[AND6:%[0-9]+]]:_(s32) = G_AND [[COPY7]], [[C3]] + ; CI-MESA: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) + ; CI-MESA: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY8]], [[C3]] + ; CI-MESA: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C4]](s32) + ; CI-MESA: [[OR3:%[0-9]+]]:_(s32) = G_OR [[AND6]], [[SHL3]] + ; CI-MESA: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR2]](s32), [[OR3]](s32) ; CI-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s64>) = G_BUILD_VECTOR [[MV]](s64), [[MV1]](s64) ; CI-MESA: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BUILD_VECTOR]](<2 x s64>) ; GFX9-MESA-LABEL: name: test_load_flat_v2s64_align2 ; GFX9-MESA: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1 ; GFX9-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p0) :: (load 2) - ; GFX9-MESA: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; GFX9-MESA: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; GFX9-MESA: [[GEP:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C]](s64) ; GFX9-MESA: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p0) :: (load 2) - ; GFX9-MESA: 
[[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; GFX9-MESA: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 ; GFX9-MESA: [[GEP1:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C1]](s64) ; GFX9-MESA: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p0) :: (load 2) - ; GFX9-MESA: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; GFX9-MESA: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 6 ; GFX9-MESA: [[GEP2:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C2]](s64) ; GFX9-MESA: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p0) :: (load 2) - ; GFX9-MESA: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; GFX9-MESA: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) - ; GFX9-MESA: [[C3:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 - ; GFX9-MESA: [[GEP3:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C3]](s64) + ; GFX9-MESA: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; GFX9-MESA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; GFX9-MESA: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C3]] + ; GFX9-MESA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; GFX9-MESA: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C3]] + ; GFX9-MESA: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9-MESA: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C4]](s32) + ; GFX9-MESA: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; GFX9-MESA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; GFX9-MESA: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C3]] + ; GFX9-MESA: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; GFX9-MESA: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C3]] + ; GFX9-MESA: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) + ; GFX9-MESA: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; GFX9-MESA: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32) + ; GFX9-MESA: [[C5:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 + ; GFX9-MESA: [[GEP3:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C5]](s64) ; GFX9-MESA: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p0) :: (load 2) - ; GFX9-MESA: 
[[TRUNC4:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD4]](s32) ; GFX9-MESA: [[GEP4:%[0-9]+]]:_(p0) = G_GEP [[GEP3]], [[C]](s64) ; GFX9-MESA: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p0) :: (load 2) - ; GFX9-MESA: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) ; GFX9-MESA: [[GEP5:%[0-9]+]]:_(p0) = G_GEP [[GEP3]], [[C1]](s64) ; GFX9-MESA: [[LOAD6:%[0-9]+]]:_(s32) = G_LOAD [[GEP5]](p0) :: (load 2) - ; GFX9-MESA: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; GFX9-MESA: [[GEP6:%[0-9]+]]:_(p0) = G_GEP [[GEP3]], [[C2]](s64) ; GFX9-MESA: [[LOAD7:%[0-9]+]]:_(s32) = G_LOAD [[GEP6]](p0) :: (load 2) - ; GFX9-MESA: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) - ; GFX9-MESA: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[TRUNC4]](s16), [[TRUNC5]](s16), [[TRUNC6]](s16), [[TRUNC7]](s16) + ; GFX9-MESA: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD4]](s32) + ; GFX9-MESA: [[AND4:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C3]] + ; GFX9-MESA: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) + ; GFX9-MESA: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C3]] + ; GFX9-MESA: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[C4]](s32) + ; GFX9-MESA: [[OR2:%[0-9]+]]:_(s32) = G_OR [[AND4]], [[SHL2]] + ; GFX9-MESA: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LOAD6]](s32) + ; GFX9-MESA: [[AND6:%[0-9]+]]:_(s32) = G_AND [[COPY7]], [[C3]] + ; GFX9-MESA: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) + ; GFX9-MESA: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY8]], [[C3]] + ; GFX9-MESA: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C4]](s32) + ; GFX9-MESA: [[OR3:%[0-9]+]]:_(s32) = G_OR [[AND6]], [[SHL3]] + ; GFX9-MESA: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR2]](s32), [[OR3]](s32) ; GFX9-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s64>) = G_BUILD_VECTOR [[MV]](s64), [[MV1]](s64) ; GFX9-MESA: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BUILD_VECTOR]](<2 x s64>) %0:_(p0) = COPY $vgpr0_vgpr1 @@ -6711,9 +7356,18 @@ body: | ; CI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C8]](s32) ; CI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) ; CI: 
[[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; CI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) - ; CI: [[C10:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 - ; CI: [[GEP7:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C10]](s64) + ; CI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C10]](s32) + ; CI: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; CI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; CI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C10]](s32) + ; CI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; CI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) + ; CI: [[C11:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 + ; CI: [[GEP7:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C11]](s64) ; CI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p0) :: (load 1) ; CI: [[GEP8:%[0-9]+]]:_(p0) = G_GEP [[GEP7]], [[C]](s64) ; CI: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p0) :: (load 1) @@ -6734,34 +7388,42 @@ body: | ; CI: [[COPY8:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI: [[COPY9:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32) ; CI: [[AND9:%[0-9]+]]:_(s32) = G_AND [[COPY9]], [[C9]] - ; CI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY8]](s32) - ; CI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) - ; CI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] + ; CI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY8]](s32) + ; CI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) + ; CI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] ; CI: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; CI: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C7]] ; CI: [[COPY10:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI: [[COPY11:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32) ; CI: [[AND11:%[0-9]+]]:_(s32) = G_AND [[COPY11]], [[C9]] - ; CI: 
[[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY10]](s32) - ; CI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL5]](s32) - ; CI: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] + ; CI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY10]](s32) + ; CI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) + ; CI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] ; CI: [[TRUNC12:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD12]](s32) ; CI: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C7]] ; CI: [[COPY12:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI: [[COPY13:%[0-9]+]]:_(s32) = COPY [[LOAD13]](s32) ; CI: [[AND13:%[0-9]+]]:_(s32) = G_AND [[COPY13]], [[C9]] - ; CI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY12]](s32) - ; CI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) - ; CI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] + ; CI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY12]](s32) + ; CI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL8]](s32) + ; CI: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] ; CI: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; CI: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C7]] ; CI: [[COPY14:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI: [[COPY15:%[0-9]+]]:_(s32) = COPY [[LOAD15]](s32) ; CI: [[AND15:%[0-9]+]]:_(s32) = G_AND [[COPY15]], [[C9]] - ; CI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY14]](s32) - ; CI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) - ; CI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] - ; CI: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16), [[OR6]](s16), [[OR7]](s16) + ; CI: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY14]](s32) + ; CI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL9]](s32) + ; CI: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] + ; CI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; CI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; CI: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C10]](s32) + ; CI: [[OR10:%[0-9]+]]:_(s32) = G_OR 
[[ZEXT4]], [[SHL10]] + ; CI: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR8]](s16) + ; CI: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; CI: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C10]](s32) + ; CI: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; CI: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR10]](s32), [[OR11]](s32) ; CI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s64>) = G_BUILD_VECTOR [[MV]](s64), [[MV1]](s64) ; CI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BUILD_VECTOR]](<2 x s64>) ; VI-LABEL: name: test_load_flat_v2s64_align1 @@ -6814,9 +7476,18 @@ body: | ; VI: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C7]] ; VI: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C8]](s16) ; VI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; VI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) - ; VI: [[C9:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 - ; VI: [[GEP7:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C9]](s64) + ; VI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; VI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; VI: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C9]](s32) + ; VI: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; VI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; VI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; VI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C9]](s32) + ; VI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; VI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) + ; VI: [[C10:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 + ; VI: [[GEP7:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C10]](s64) ; VI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p0) :: (load 1) ; VI: [[GEP8:%[0-9]+]]:_(p0) = G_GEP [[GEP7]], [[C]](s64) ; VI: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p0) :: (load 1) @@ -6836,27 +7507,35 @@ body: | ; VI: [[AND8:%[0-9]+]]:_(s16) = G_AND [[TRUNC8]], [[C7]] ; VI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD9]](s32) ; VI: 
[[AND9:%[0-9]+]]:_(s16) = G_AND [[TRUNC9]], [[C7]] - ; VI: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) - ; VI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL4]] + ; VI: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) + ; VI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL6]] ; VI: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; VI: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C7]] ; VI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD11]](s32) ; VI: [[AND11:%[0-9]+]]:_(s16) = G_AND [[TRUNC11]], [[C7]] - ; VI: [[SHL5:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) - ; VI: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL5]] + ; VI: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) + ; VI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL7]] ; VI: [[TRUNC12:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD12]](s32) ; VI: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C7]] ; VI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD13]](s32) ; VI: [[AND13:%[0-9]+]]:_(s16) = G_AND [[TRUNC13]], [[C7]] - ; VI: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C8]](s16) - ; VI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL6]] + ; VI: [[SHL8:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C8]](s16) + ; VI: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL8]] ; VI: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; VI: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C7]] ; VI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD15]](s32) ; VI: [[AND15:%[0-9]+]]:_(s16) = G_AND [[TRUNC15]], [[C7]] - ; VI: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C8]](s16) - ; VI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL7]] - ; VI: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16), [[OR6]](s16), [[OR7]](s16) + ; VI: [[SHL9:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C8]](s16) + ; VI: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL9]] + ; VI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; VI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; VI: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C9]](s32) 
+ ; VI: [[OR10:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL10]] + ; VI: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR8]](s16) + ; VI: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; VI: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C9]](s32) + ; VI: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; VI: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR10]](s32), [[OR11]](s32) ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s64>) = G_BUILD_VECTOR [[MV]](s64), [[MV1]](s64) ; VI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BUILD_VECTOR]](<2 x s64>) ; GFX9-LABEL: name: test_load_flat_v2s64_align1 @@ -6909,9 +7588,18 @@ body: | ; GFX9: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C7]] ; GFX9: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C8]](s16) ; GFX9: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; GFX9: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) - ; GFX9: [[C9:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 - ; GFX9: [[GEP7:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C9]](s64) + ; GFX9: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C9]](s32) + ; GFX9: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; GFX9: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; GFX9: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; GFX9: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C9]](s32) + ; GFX9: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; GFX9: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) + ; GFX9: [[C10:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 + ; GFX9: [[GEP7:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C10]](s64) ; GFX9: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p0) :: (load 1) ; GFX9: [[GEP8:%[0-9]+]]:_(p0) = G_GEP [[GEP7]], [[C]](s64) ; GFX9: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p0) :: (load 1) @@ -6931,27 +7619,35 @@ body: | ; GFX9: [[AND8:%[0-9]+]]:_(s16) = G_AND [[TRUNC8]], 
[[C7]] ; GFX9: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD9]](s32) ; GFX9: [[AND9:%[0-9]+]]:_(s16) = G_AND [[TRUNC9]], [[C7]] - ; GFX9: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) - ; GFX9: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL4]] + ; GFX9: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) + ; GFX9: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL6]] ; GFX9: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; GFX9: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C7]] ; GFX9: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD11]](s32) ; GFX9: [[AND11:%[0-9]+]]:_(s16) = G_AND [[TRUNC11]], [[C7]] - ; GFX9: [[SHL5:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) - ; GFX9: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL5]] + ; GFX9: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) + ; GFX9: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL7]] ; GFX9: [[TRUNC12:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD12]](s32) ; GFX9: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C7]] ; GFX9: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD13]](s32) ; GFX9: [[AND13:%[0-9]+]]:_(s16) = G_AND [[TRUNC13]], [[C7]] - ; GFX9: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C8]](s16) - ; GFX9: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL6]] + ; GFX9: [[SHL8:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C8]](s16) + ; GFX9: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL8]] ; GFX9: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; GFX9: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C7]] ; GFX9: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD15]](s32) ; GFX9: [[AND15:%[0-9]+]]:_(s16) = G_AND [[TRUNC15]], [[C7]] - ; GFX9: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C8]](s16) - ; GFX9: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL7]] - ; GFX9: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16), [[OR6]](s16), [[OR7]](s16) + ; GFX9: [[SHL9:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C8]](s16) + ; GFX9: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL9]] + ; GFX9: [[ZEXT4:%[0-9]+]]:_(s32) = 
G_ZEXT [[OR6]](s16) + ; GFX9: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; GFX9: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C9]](s32) + ; GFX9: [[OR10:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL10]] + ; GFX9: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR8]](s16) + ; GFX9: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; GFX9: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C9]](s32) + ; GFX9: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; GFX9: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR10]](s32), [[OR11]](s32) ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s64>) = G_BUILD_VECTOR [[MV]](s64), [[MV1]](s64) ; GFX9: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BUILD_VECTOR]](<2 x s64>) ; CI-MESA-LABEL: name: test_load_flat_v2s64_align1 @@ -7012,9 +7708,18 @@ body: | ; CI-MESA: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C8]](s32) ; CI-MESA: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) ; CI-MESA: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; CI-MESA: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) - ; CI-MESA: [[C10:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 - ; CI-MESA: [[GEP7:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C10]](s64) + ; CI-MESA: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI-MESA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI-MESA: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-MESA: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C10]](s32) + ; CI-MESA: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; CI-MESA: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; CI-MESA: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI-MESA: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C10]](s32) + ; CI-MESA: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; CI-MESA: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) + ; CI-MESA: [[C11:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 + ; CI-MESA: [[GEP7:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C11]](s64) ; CI-MESA: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p0) :: 
(load 1) ; CI-MESA: [[GEP8:%[0-9]+]]:_(p0) = G_GEP [[GEP7]], [[C]](s64) ; CI-MESA: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p0) :: (load 1) @@ -7035,34 +7740,42 @@ body: | ; CI-MESA: [[COPY8:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-MESA: [[COPY9:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32) ; CI-MESA: [[AND9:%[0-9]+]]:_(s32) = G_AND [[COPY9]], [[C9]] - ; CI-MESA: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY8]](s32) - ; CI-MESA: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) - ; CI-MESA: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] + ; CI-MESA: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY8]](s32) + ; CI-MESA: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) + ; CI-MESA: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] ; CI-MESA: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; CI-MESA: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C7]] ; CI-MESA: [[COPY10:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-MESA: [[COPY11:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32) ; CI-MESA: [[AND11:%[0-9]+]]:_(s32) = G_AND [[COPY11]], [[C9]] - ; CI-MESA: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY10]](s32) - ; CI-MESA: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL5]](s32) - ; CI-MESA: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] + ; CI-MESA: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY10]](s32) + ; CI-MESA: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) + ; CI-MESA: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] ; CI-MESA: [[TRUNC12:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD12]](s32) ; CI-MESA: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C7]] ; CI-MESA: [[COPY12:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-MESA: [[COPY13:%[0-9]+]]:_(s32) = COPY [[LOAD13]](s32) ; CI-MESA: [[AND13:%[0-9]+]]:_(s32) = G_AND [[COPY13]], [[C9]] - ; CI-MESA: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY12]](s32) - ; CI-MESA: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) - ; CI-MESA: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] + ; CI-MESA: [[SHL8:%[0-9]+]]:_(s32) 
= G_SHL [[AND13]], [[COPY12]](s32) + ; CI-MESA: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL8]](s32) + ; CI-MESA: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] ; CI-MESA: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; CI-MESA: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C7]] ; CI-MESA: [[COPY14:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-MESA: [[COPY15:%[0-9]+]]:_(s32) = COPY [[LOAD15]](s32) ; CI-MESA: [[AND15:%[0-9]+]]:_(s32) = G_AND [[COPY15]], [[C9]] - ; CI-MESA: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY14]](s32) - ; CI-MESA: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) - ; CI-MESA: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] - ; CI-MESA: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16), [[OR6]](s16), [[OR7]](s16) + ; CI-MESA: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY14]](s32) + ; CI-MESA: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL9]](s32) + ; CI-MESA: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] + ; CI-MESA: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; CI-MESA: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; CI-MESA: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C10]](s32) + ; CI-MESA: [[OR10:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL10]] + ; CI-MESA: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR8]](s16) + ; CI-MESA: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; CI-MESA: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C10]](s32) + ; CI-MESA: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; CI-MESA: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR10]](s32), [[OR11]](s32) ; CI-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s64>) = G_BUILD_VECTOR [[MV]](s64), [[MV1]](s64) ; CI-MESA: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BUILD_VECTOR]](<2 x s64>) ; GFX9-MESA-LABEL: name: test_load_flat_v2s64_align1 @@ -7115,9 +7828,18 @@ body: | ; GFX9-MESA: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C7]] ; GFX9-MESA: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C8]](s16) ; GFX9-MESA: [[OR3:%[0-9]+]]:_(s16) = G_OR 
[[AND6]], [[SHL3]] - ; GFX9-MESA: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) - ; GFX9-MESA: [[C9:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 - ; GFX9-MESA: [[GEP7:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C9]](s64) + ; GFX9-MESA: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9-MESA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9-MESA: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9-MESA: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C9]](s32) + ; GFX9-MESA: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; GFX9-MESA: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; GFX9-MESA: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; GFX9-MESA: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C9]](s32) + ; GFX9-MESA: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; GFX9-MESA: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) + ; GFX9-MESA: [[C10:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 + ; GFX9-MESA: [[GEP7:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C10]](s64) ; GFX9-MESA: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p0) :: (load 1) ; GFX9-MESA: [[GEP8:%[0-9]+]]:_(p0) = G_GEP [[GEP7]], [[C]](s64) ; GFX9-MESA: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p0) :: (load 1) @@ -7137,27 +7859,35 @@ body: | ; GFX9-MESA: [[AND8:%[0-9]+]]:_(s16) = G_AND [[TRUNC8]], [[C7]] ; GFX9-MESA: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD9]](s32) ; GFX9-MESA: [[AND9:%[0-9]+]]:_(s16) = G_AND [[TRUNC9]], [[C7]] - ; GFX9-MESA: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) - ; GFX9-MESA: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL4]] + ; GFX9-MESA: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) + ; GFX9-MESA: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL6]] ; GFX9-MESA: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; GFX9-MESA: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C7]] ; GFX9-MESA: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD11]](s32) ; GFX9-MESA: [[AND11:%[0-9]+]]:_(s16) = G_AND [[TRUNC11]], [[C7]] - ; 
GFX9-MESA: [[SHL5:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) - ; GFX9-MESA: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL5]] + ; GFX9-MESA: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) + ; GFX9-MESA: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL7]] ; GFX9-MESA: [[TRUNC12:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD12]](s32) ; GFX9-MESA: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C7]] ; GFX9-MESA: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD13]](s32) ; GFX9-MESA: [[AND13:%[0-9]+]]:_(s16) = G_AND [[TRUNC13]], [[C7]] - ; GFX9-MESA: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C8]](s16) - ; GFX9-MESA: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL6]] + ; GFX9-MESA: [[SHL8:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C8]](s16) + ; GFX9-MESA: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL8]] ; GFX9-MESA: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; GFX9-MESA: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C7]] ; GFX9-MESA: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD15]](s32) ; GFX9-MESA: [[AND15:%[0-9]+]]:_(s16) = G_AND [[TRUNC15]], [[C7]] - ; GFX9-MESA: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C8]](s16) - ; GFX9-MESA: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL7]] - ; GFX9-MESA: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16), [[OR6]](s16), [[OR7]](s16) + ; GFX9-MESA: [[SHL9:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C8]](s16) + ; GFX9-MESA: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL9]] + ; GFX9-MESA: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; GFX9-MESA: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; GFX9-MESA: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C9]](s32) + ; GFX9-MESA: [[OR10:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL10]] + ; GFX9-MESA: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR8]](s16) + ; GFX9-MESA: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; GFX9-MESA: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C9]](s32) + ; GFX9-MESA: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; GFX9-MESA: 
[[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR10]](s32), [[OR11]](s32) ; GFX9-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s64>) = G_BUILD_VECTOR [[MV]](s64), [[MV1]](s64) ; GFX9-MESA: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BUILD_VECTOR]](<2 x s64>) %0:_(p0) = COPY $vgpr0_vgpr1 @@ -7385,9 +8115,18 @@ body: | ; CI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C8]](s32) ; CI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) ; CI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; CI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) - ; CI: [[C10:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 - ; CI: [[GEP7:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C10]](s64) + ; CI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C10]](s32) + ; CI: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; CI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; CI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C10]](s32) + ; CI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; CI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) + ; CI: [[C11:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 + ; CI: [[GEP7:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C11]](s64) ; CI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p0) :: (load 1) ; CI: [[GEP8:%[0-9]+]]:_(p0) = G_GEP [[GEP7]], [[C]](s64) ; CI: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p0) :: (load 1) @@ -7408,36 +8147,44 @@ body: | ; CI: [[COPY8:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI: [[COPY9:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32) ; CI: [[AND9:%[0-9]+]]:_(s32) = G_AND [[COPY9]], [[C9]] - ; CI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY8]](s32) - ; CI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) - ; CI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] + ; CI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND9]], 
[[COPY8]](s32) + ; CI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) + ; CI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] ; CI: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; CI: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C7]] ; CI: [[COPY10:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI: [[COPY11:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32) ; CI: [[AND11:%[0-9]+]]:_(s32) = G_AND [[COPY11]], [[C9]] - ; CI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY10]](s32) - ; CI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL5]](s32) - ; CI: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] + ; CI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY10]](s32) + ; CI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) + ; CI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] ; CI: [[TRUNC12:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD12]](s32) ; CI: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C7]] ; CI: [[COPY12:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI: [[COPY13:%[0-9]+]]:_(s32) = COPY [[LOAD13]](s32) ; CI: [[AND13:%[0-9]+]]:_(s32) = G_AND [[COPY13]], [[C9]] - ; CI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY12]](s32) - ; CI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) - ; CI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] + ; CI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY12]](s32) + ; CI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL8]](s32) + ; CI: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] ; CI: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; CI: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C7]] ; CI: [[COPY14:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI: [[COPY15:%[0-9]+]]:_(s32) = COPY [[LOAD15]](s32) ; CI: [[AND15:%[0-9]+]]:_(s32) = G_AND [[COPY15]], [[C9]] - ; CI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY14]](s32) - ; CI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) - ; CI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] - ; CI: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16), 
[[OR6]](s16), [[OR7]](s16) - ; CI: [[C11:%[0-9]+]]:_(s64) = G_CONSTANT i64 16 - ; CI: [[GEP15:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C11]](s64) + ; CI: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY14]](s32) + ; CI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL9]](s32) + ; CI: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] + ; CI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; CI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; CI: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C10]](s32) + ; CI: [[OR10:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL10]] + ; CI: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR8]](s16) + ; CI: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; CI: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C10]](s32) + ; CI: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; CI: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR10]](s32), [[OR11]](s32) + ; CI: [[C12:%[0-9]+]]:_(s64) = G_CONSTANT i64 16 + ; CI: [[GEP15:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C12]](s64) ; CI: [[LOAD16:%[0-9]+]]:_(s32) = G_LOAD [[GEP15]](p0) :: (load 1) ; CI: [[GEP16:%[0-9]+]]:_(p0) = G_GEP [[GEP15]], [[C]](s64) ; CI: [[LOAD17:%[0-9]+]]:_(s32) = G_LOAD [[GEP16]](p0) :: (load 1) @@ -7458,34 +8205,42 @@ body: | ; CI: [[COPY16:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI: [[COPY17:%[0-9]+]]:_(s32) = COPY [[LOAD17]](s32) ; CI: [[AND17:%[0-9]+]]:_(s32) = G_AND [[COPY17]], [[C9]] - ; CI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[AND17]], [[COPY16]](s32) - ; CI: [[TRUNC17:%[0-9]+]]:_(s16) = G_TRUNC [[SHL8]](s32) - ; CI: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[TRUNC17]] + ; CI: [[SHL12:%[0-9]+]]:_(s32) = G_SHL [[AND17]], [[COPY16]](s32) + ; CI: [[TRUNC17:%[0-9]+]]:_(s16) = G_TRUNC [[SHL12]](s32) + ; CI: [[OR12:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[TRUNC17]] ; CI: [[TRUNC18:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD18]](s32) ; CI: [[AND18:%[0-9]+]]:_(s16) = G_AND [[TRUNC18]], [[C7]] ; CI: [[COPY18:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI: [[COPY19:%[0-9]+]]:_(s32) = COPY [[LOAD19]](s32) ; CI: 
[[AND19:%[0-9]+]]:_(s32) = G_AND [[COPY19]], [[C9]] - ; CI: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[AND19]], [[COPY18]](s32) - ; CI: [[TRUNC19:%[0-9]+]]:_(s16) = G_TRUNC [[SHL9]](s32) - ; CI: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[TRUNC19]] + ; CI: [[SHL13:%[0-9]+]]:_(s32) = G_SHL [[AND19]], [[COPY18]](s32) + ; CI: [[TRUNC19:%[0-9]+]]:_(s16) = G_TRUNC [[SHL13]](s32) + ; CI: [[OR13:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[TRUNC19]] ; CI: [[TRUNC20:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD20]](s32) ; CI: [[AND20:%[0-9]+]]:_(s16) = G_AND [[TRUNC20]], [[C7]] ; CI: [[COPY20:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI: [[COPY21:%[0-9]+]]:_(s32) = COPY [[LOAD21]](s32) ; CI: [[AND21:%[0-9]+]]:_(s32) = G_AND [[COPY21]], [[C9]] - ; CI: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[AND21]], [[COPY20]](s32) - ; CI: [[TRUNC21:%[0-9]+]]:_(s16) = G_TRUNC [[SHL10]](s32) - ; CI: [[OR10:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[TRUNC21]] + ; CI: [[SHL14:%[0-9]+]]:_(s32) = G_SHL [[AND21]], [[COPY20]](s32) + ; CI: [[TRUNC21:%[0-9]+]]:_(s16) = G_TRUNC [[SHL14]](s32) + ; CI: [[OR14:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[TRUNC21]] ; CI: [[TRUNC22:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD22]](s32) ; CI: [[AND22:%[0-9]+]]:_(s16) = G_AND [[TRUNC22]], [[C7]] ; CI: [[COPY22:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI: [[COPY23:%[0-9]+]]:_(s32) = COPY [[LOAD23]](s32) ; CI: [[AND23:%[0-9]+]]:_(s32) = G_AND [[COPY23]], [[C9]] - ; CI: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[AND23]], [[COPY22]](s32) - ; CI: [[TRUNC23:%[0-9]+]]:_(s16) = G_TRUNC [[SHL11]](s32) - ; CI: [[OR11:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[TRUNC23]] - ; CI: [[MV2:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR8]](s16), [[OR9]](s16), [[OR10]](s16), [[OR11]](s16) + ; CI: [[SHL15:%[0-9]+]]:_(s32) = G_SHL [[AND23]], [[COPY22]](s32) + ; CI: [[TRUNC23:%[0-9]+]]:_(s16) = G_TRUNC [[SHL15]](s32) + ; CI: [[OR15:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[TRUNC23]] + ; CI: [[ZEXT8:%[0-9]+]]:_(s32) = G_ZEXT [[OR12]](s16) + ; CI: [[ZEXT9:%[0-9]+]]:_(s32) = G_ZEXT [[OR13]](s16) + ; CI: 
[[SHL16:%[0-9]+]]:_(s32) = G_SHL [[ZEXT9]], [[C10]](s32) + ; CI: [[OR16:%[0-9]+]]:_(s32) = G_OR [[ZEXT8]], [[SHL16]] + ; CI: [[ZEXT10:%[0-9]+]]:_(s32) = G_ZEXT [[OR14]](s16) + ; CI: [[ZEXT11:%[0-9]+]]:_(s32) = G_ZEXT [[OR15]](s16) + ; CI: [[SHL17:%[0-9]+]]:_(s32) = G_SHL [[ZEXT11]], [[C10]](s32) + ; CI: [[OR17:%[0-9]+]]:_(s32) = G_OR [[ZEXT10]], [[SHL17]] + ; CI: [[MV2:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR16]](s32), [[OR17]](s32) ; CI: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s64>) = G_BUILD_VECTOR [[MV]](s64), [[MV1]](s64), [[MV2]](s64) ; CI: [[DEF:%[0-9]+]]:_(<4 x s64>) = G_IMPLICIT_DEF ; CI: [[INSERT:%[0-9]+]]:_(<4 x s64>) = G_INSERT [[DEF]], [[BUILD_VECTOR]](<3 x s64>), 0 @@ -7540,9 +8295,18 @@ body: | ; VI: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C7]] ; VI: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C8]](s16) ; VI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; VI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) - ; VI: [[C9:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 - ; VI: [[GEP7:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C9]](s64) + ; VI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; VI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; VI: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C9]](s32) + ; VI: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; VI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; VI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; VI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C9]](s32) + ; VI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; VI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) + ; VI: [[C10:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 + ; VI: [[GEP7:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C10]](s64) ; VI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p0) :: (load 1) ; VI: [[GEP8:%[0-9]+]]:_(p0) = G_GEP [[GEP7]], [[C]](s64) ; VI: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p0) :: (load 1) @@ -7562,29 +8326,37 @@ 
body: | ; VI: [[AND8:%[0-9]+]]:_(s16) = G_AND [[TRUNC8]], [[C7]] ; VI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD9]](s32) ; VI: [[AND9:%[0-9]+]]:_(s16) = G_AND [[TRUNC9]], [[C7]] - ; VI: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) - ; VI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL4]] + ; VI: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) + ; VI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL6]] ; VI: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; VI: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C7]] ; VI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD11]](s32) ; VI: [[AND11:%[0-9]+]]:_(s16) = G_AND [[TRUNC11]], [[C7]] - ; VI: [[SHL5:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) - ; VI: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL5]] + ; VI: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) + ; VI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL7]] ; VI: [[TRUNC12:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD12]](s32) ; VI: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C7]] ; VI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD13]](s32) ; VI: [[AND13:%[0-9]+]]:_(s16) = G_AND [[TRUNC13]], [[C7]] - ; VI: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C8]](s16) - ; VI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL6]] + ; VI: [[SHL8:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C8]](s16) + ; VI: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL8]] ; VI: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; VI: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C7]] ; VI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD15]](s32) ; VI: [[AND15:%[0-9]+]]:_(s16) = G_AND [[TRUNC15]], [[C7]] - ; VI: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C8]](s16) - ; VI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL7]] - ; VI: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16), [[OR6]](s16), [[OR7]](s16) - ; VI: [[C10:%[0-9]+]]:_(s64) = G_CONSTANT i64 16 - ; VI: [[GEP15:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C10]](s64) + ; VI: [[SHL9:%[0-9]+]]:_(s16) = G_SHL [[AND15]], 
[[C8]](s16) + ; VI: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL9]] + ; VI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; VI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; VI: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C9]](s32) + ; VI: [[OR10:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL10]] + ; VI: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR8]](s16) + ; VI: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; VI: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C9]](s32) + ; VI: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; VI: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR10]](s32), [[OR11]](s32) + ; VI: [[C11:%[0-9]+]]:_(s64) = G_CONSTANT i64 16 + ; VI: [[GEP15:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C11]](s64) ; VI: [[LOAD16:%[0-9]+]]:_(s32) = G_LOAD [[GEP15]](p0) :: (load 1) ; VI: [[GEP16:%[0-9]+]]:_(p0) = G_GEP [[GEP15]], [[C]](s64) ; VI: [[LOAD17:%[0-9]+]]:_(s32) = G_LOAD [[GEP16]](p0) :: (load 1) @@ -7604,27 +8376,35 @@ body: | ; VI: [[AND16:%[0-9]+]]:_(s16) = G_AND [[TRUNC16]], [[C7]] ; VI: [[TRUNC17:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD17]](s32) ; VI: [[AND17:%[0-9]+]]:_(s16) = G_AND [[TRUNC17]], [[C7]] - ; VI: [[SHL8:%[0-9]+]]:_(s16) = G_SHL [[AND17]], [[C8]](s16) - ; VI: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[SHL8]] + ; VI: [[SHL12:%[0-9]+]]:_(s16) = G_SHL [[AND17]], [[C8]](s16) + ; VI: [[OR12:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[SHL12]] ; VI: [[TRUNC18:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD18]](s32) ; VI: [[AND18:%[0-9]+]]:_(s16) = G_AND [[TRUNC18]], [[C7]] ; VI: [[TRUNC19:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD19]](s32) ; VI: [[AND19:%[0-9]+]]:_(s16) = G_AND [[TRUNC19]], [[C7]] - ; VI: [[SHL9:%[0-9]+]]:_(s16) = G_SHL [[AND19]], [[C8]](s16) - ; VI: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[SHL9]] + ; VI: [[SHL13:%[0-9]+]]:_(s16) = G_SHL [[AND19]], [[C8]](s16) + ; VI: [[OR13:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[SHL13]] ; VI: [[TRUNC20:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD20]](s32) ; VI: [[AND20:%[0-9]+]]:_(s16) = G_AND [[TRUNC20]], [[C7]] ; VI: 
[[TRUNC21:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD21]](s32) ; VI: [[AND21:%[0-9]+]]:_(s16) = G_AND [[TRUNC21]], [[C7]] - ; VI: [[SHL10:%[0-9]+]]:_(s16) = G_SHL [[AND21]], [[C8]](s16) - ; VI: [[OR10:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[SHL10]] + ; VI: [[SHL14:%[0-9]+]]:_(s16) = G_SHL [[AND21]], [[C8]](s16) + ; VI: [[OR14:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[SHL14]] ; VI: [[TRUNC22:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD22]](s32) ; VI: [[AND22:%[0-9]+]]:_(s16) = G_AND [[TRUNC22]], [[C7]] ; VI: [[TRUNC23:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD23]](s32) ; VI: [[AND23:%[0-9]+]]:_(s16) = G_AND [[TRUNC23]], [[C7]] - ; VI: [[SHL11:%[0-9]+]]:_(s16) = G_SHL [[AND23]], [[C8]](s16) - ; VI: [[OR11:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[SHL11]] - ; VI: [[MV2:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR8]](s16), [[OR9]](s16), [[OR10]](s16), [[OR11]](s16) + ; VI: [[SHL15:%[0-9]+]]:_(s16) = G_SHL [[AND23]], [[C8]](s16) + ; VI: [[OR15:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[SHL15]] + ; VI: [[ZEXT8:%[0-9]+]]:_(s32) = G_ZEXT [[OR12]](s16) + ; VI: [[ZEXT9:%[0-9]+]]:_(s32) = G_ZEXT [[OR13]](s16) + ; VI: [[SHL16:%[0-9]+]]:_(s32) = G_SHL [[ZEXT9]], [[C9]](s32) + ; VI: [[OR16:%[0-9]+]]:_(s32) = G_OR [[ZEXT8]], [[SHL16]] + ; VI: [[ZEXT10:%[0-9]+]]:_(s32) = G_ZEXT [[OR14]](s16) + ; VI: [[ZEXT11:%[0-9]+]]:_(s32) = G_ZEXT [[OR15]](s16) + ; VI: [[SHL17:%[0-9]+]]:_(s32) = G_SHL [[ZEXT11]], [[C9]](s32) + ; VI: [[OR17:%[0-9]+]]:_(s32) = G_OR [[ZEXT10]], [[SHL17]] + ; VI: [[MV2:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR16]](s32), [[OR17]](s32) ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s64>) = G_BUILD_VECTOR [[MV]](s64), [[MV1]](s64), [[MV2]](s64) ; VI: [[DEF:%[0-9]+]]:_(<4 x s64>) = G_IMPLICIT_DEF ; VI: [[INSERT:%[0-9]+]]:_(<4 x s64>) = G_INSERT [[DEF]], [[BUILD_VECTOR]](<3 x s64>), 0 @@ -7679,9 +8459,18 @@ body: | ; GFX9: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C7]] ; GFX9: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C8]](s16) ; GFX9: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; GFX9: [[MV:%[0-9]+]]:_(s64) = 
G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) - ; GFX9: [[C9:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 - ; GFX9: [[GEP7:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C9]](s64) + ; GFX9: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C9]](s32) + ; GFX9: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; GFX9: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; GFX9: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; GFX9: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C9]](s32) + ; GFX9: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; GFX9: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) + ; GFX9: [[C10:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 + ; GFX9: [[GEP7:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C10]](s64) ; GFX9: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p0) :: (load 1) ; GFX9: [[GEP8:%[0-9]+]]:_(p0) = G_GEP [[GEP7]], [[C]](s64) ; GFX9: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p0) :: (load 1) @@ -7701,29 +8490,37 @@ body: | ; GFX9: [[AND8:%[0-9]+]]:_(s16) = G_AND [[TRUNC8]], [[C7]] ; GFX9: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD9]](s32) ; GFX9: [[AND9:%[0-9]+]]:_(s16) = G_AND [[TRUNC9]], [[C7]] - ; GFX9: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) - ; GFX9: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL4]] + ; GFX9: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) + ; GFX9: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL6]] ; GFX9: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; GFX9: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C7]] ; GFX9: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD11]](s32) ; GFX9: [[AND11:%[0-9]+]]:_(s16) = G_AND [[TRUNC11]], [[C7]] - ; GFX9: [[SHL5:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) - ; GFX9: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL5]] + ; GFX9: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) + ; GFX9: 
[[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL7]] ; GFX9: [[TRUNC12:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD12]](s32) ; GFX9: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C7]] ; GFX9: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD13]](s32) ; GFX9: [[AND13:%[0-9]+]]:_(s16) = G_AND [[TRUNC13]], [[C7]] - ; GFX9: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C8]](s16) - ; GFX9: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL6]] + ; GFX9: [[SHL8:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C8]](s16) + ; GFX9: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL8]] ; GFX9: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; GFX9: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C7]] ; GFX9: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD15]](s32) ; GFX9: [[AND15:%[0-9]+]]:_(s16) = G_AND [[TRUNC15]], [[C7]] - ; GFX9: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C8]](s16) - ; GFX9: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL7]] - ; GFX9: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16), [[OR6]](s16), [[OR7]](s16) - ; GFX9: [[C10:%[0-9]+]]:_(s64) = G_CONSTANT i64 16 - ; GFX9: [[GEP15:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C10]](s64) + ; GFX9: [[SHL9:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C8]](s16) + ; GFX9: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL9]] + ; GFX9: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; GFX9: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; GFX9: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C9]](s32) + ; GFX9: [[OR10:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL10]] + ; GFX9: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR8]](s16) + ; GFX9: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; GFX9: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C9]](s32) + ; GFX9: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; GFX9: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR10]](s32), [[OR11]](s32) + ; GFX9: [[C11:%[0-9]+]]:_(s64) = G_CONSTANT i64 16 + ; GFX9: [[GEP15:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C11]](s64) ; GFX9: [[LOAD16:%[0-9]+]]:_(s32) = G_LOAD [[GEP15]](p0) 
:: (load 1) ; GFX9: [[GEP16:%[0-9]+]]:_(p0) = G_GEP [[GEP15]], [[C]](s64) ; GFX9: [[LOAD17:%[0-9]+]]:_(s32) = G_LOAD [[GEP16]](p0) :: (load 1) @@ -7743,27 +8540,35 @@ body: | ; GFX9: [[AND16:%[0-9]+]]:_(s16) = G_AND [[TRUNC16]], [[C7]] ; GFX9: [[TRUNC17:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD17]](s32) ; GFX9: [[AND17:%[0-9]+]]:_(s16) = G_AND [[TRUNC17]], [[C7]] - ; GFX9: [[SHL8:%[0-9]+]]:_(s16) = G_SHL [[AND17]], [[C8]](s16) - ; GFX9: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[SHL8]] + ; GFX9: [[SHL12:%[0-9]+]]:_(s16) = G_SHL [[AND17]], [[C8]](s16) + ; GFX9: [[OR12:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[SHL12]] ; GFX9: [[TRUNC18:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD18]](s32) ; GFX9: [[AND18:%[0-9]+]]:_(s16) = G_AND [[TRUNC18]], [[C7]] ; GFX9: [[TRUNC19:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD19]](s32) ; GFX9: [[AND19:%[0-9]+]]:_(s16) = G_AND [[TRUNC19]], [[C7]] - ; GFX9: [[SHL9:%[0-9]+]]:_(s16) = G_SHL [[AND19]], [[C8]](s16) - ; GFX9: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[SHL9]] + ; GFX9: [[SHL13:%[0-9]+]]:_(s16) = G_SHL [[AND19]], [[C8]](s16) + ; GFX9: [[OR13:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[SHL13]] ; GFX9: [[TRUNC20:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD20]](s32) ; GFX9: [[AND20:%[0-9]+]]:_(s16) = G_AND [[TRUNC20]], [[C7]] ; GFX9: [[TRUNC21:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD21]](s32) ; GFX9: [[AND21:%[0-9]+]]:_(s16) = G_AND [[TRUNC21]], [[C7]] - ; GFX9: [[SHL10:%[0-9]+]]:_(s16) = G_SHL [[AND21]], [[C8]](s16) - ; GFX9: [[OR10:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[SHL10]] + ; GFX9: [[SHL14:%[0-9]+]]:_(s16) = G_SHL [[AND21]], [[C8]](s16) + ; GFX9: [[OR14:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[SHL14]] ; GFX9: [[TRUNC22:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD22]](s32) ; GFX9: [[AND22:%[0-9]+]]:_(s16) = G_AND [[TRUNC22]], [[C7]] ; GFX9: [[TRUNC23:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD23]](s32) ; GFX9: [[AND23:%[0-9]+]]:_(s16) = G_AND [[TRUNC23]], [[C7]] - ; GFX9: [[SHL11:%[0-9]+]]:_(s16) = G_SHL [[AND23]], [[C8]](s16) - ; GFX9: [[OR11:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[SHL11]] - ; GFX9: 
[[MV2:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR8]](s16), [[OR9]](s16), [[OR10]](s16), [[OR11]](s16) + ; GFX9: [[SHL15:%[0-9]+]]:_(s16) = G_SHL [[AND23]], [[C8]](s16) + ; GFX9: [[OR15:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[SHL15]] + ; GFX9: [[ZEXT8:%[0-9]+]]:_(s32) = G_ZEXT [[OR12]](s16) + ; GFX9: [[ZEXT9:%[0-9]+]]:_(s32) = G_ZEXT [[OR13]](s16) + ; GFX9: [[SHL16:%[0-9]+]]:_(s32) = G_SHL [[ZEXT9]], [[C9]](s32) + ; GFX9: [[OR16:%[0-9]+]]:_(s32) = G_OR [[ZEXT8]], [[SHL16]] + ; GFX9: [[ZEXT10:%[0-9]+]]:_(s32) = G_ZEXT [[OR14]](s16) + ; GFX9: [[ZEXT11:%[0-9]+]]:_(s32) = G_ZEXT [[OR15]](s16) + ; GFX9: [[SHL17:%[0-9]+]]:_(s32) = G_SHL [[ZEXT11]], [[C9]](s32) + ; GFX9: [[OR17:%[0-9]+]]:_(s32) = G_OR [[ZEXT10]], [[SHL17]] + ; GFX9: [[MV2:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR16]](s32), [[OR17]](s32) ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s64>) = G_BUILD_VECTOR [[MV]](s64), [[MV1]](s64), [[MV2]](s64) ; GFX9: [[DEF:%[0-9]+]]:_(<4 x s64>) = G_IMPLICIT_DEF ; GFX9: [[INSERT:%[0-9]+]]:_(<4 x s64>) = G_INSERT [[DEF]], [[BUILD_VECTOR]](<3 x s64>), 0 @@ -7826,9 +8631,18 @@ body: | ; CI-MESA: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C8]](s32) ; CI-MESA: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) ; CI-MESA: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; CI-MESA: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) - ; CI-MESA: [[C10:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 - ; CI-MESA: [[GEP7:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C10]](s64) + ; CI-MESA: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI-MESA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI-MESA: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-MESA: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C10]](s32) + ; CI-MESA: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; CI-MESA: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; CI-MESA: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI-MESA: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C10]](s32) + ; CI-MESA: 
[[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; CI-MESA: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) + ; CI-MESA: [[C11:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 + ; CI-MESA: [[GEP7:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C11]](s64) ; CI-MESA: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p0) :: (load 1) ; CI-MESA: [[GEP8:%[0-9]+]]:_(p0) = G_GEP [[GEP7]], [[C]](s64) ; CI-MESA: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p0) :: (load 1) @@ -7849,36 +8663,44 @@ body: | ; CI-MESA: [[COPY8:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-MESA: [[COPY9:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32) ; CI-MESA: [[AND9:%[0-9]+]]:_(s32) = G_AND [[COPY9]], [[C9]] - ; CI-MESA: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY8]](s32) - ; CI-MESA: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) - ; CI-MESA: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] + ; CI-MESA: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY8]](s32) + ; CI-MESA: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) + ; CI-MESA: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] ; CI-MESA: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; CI-MESA: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C7]] ; CI-MESA: [[COPY10:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-MESA: [[COPY11:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32) ; CI-MESA: [[AND11:%[0-9]+]]:_(s32) = G_AND [[COPY11]], [[C9]] - ; CI-MESA: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY10]](s32) - ; CI-MESA: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL5]](s32) - ; CI-MESA: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] + ; CI-MESA: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY10]](s32) + ; CI-MESA: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) + ; CI-MESA: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] ; CI-MESA: [[TRUNC12:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD12]](s32) ; CI-MESA: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C7]] ; CI-MESA: [[COPY12:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-MESA: [[COPY13:%[0-9]+]]:_(s32) = COPY 
[[LOAD13]](s32) ; CI-MESA: [[AND13:%[0-9]+]]:_(s32) = G_AND [[COPY13]], [[C9]] - ; CI-MESA: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY12]](s32) - ; CI-MESA: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) - ; CI-MESA: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] + ; CI-MESA: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY12]](s32) + ; CI-MESA: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL8]](s32) + ; CI-MESA: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] ; CI-MESA: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; CI-MESA: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C7]] ; CI-MESA: [[COPY14:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-MESA: [[COPY15:%[0-9]+]]:_(s32) = COPY [[LOAD15]](s32) ; CI-MESA: [[AND15:%[0-9]+]]:_(s32) = G_AND [[COPY15]], [[C9]] - ; CI-MESA: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY14]](s32) - ; CI-MESA: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) - ; CI-MESA: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] - ; CI-MESA: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16), [[OR6]](s16), [[OR7]](s16) - ; CI-MESA: [[C11:%[0-9]+]]:_(s64) = G_CONSTANT i64 16 - ; CI-MESA: [[GEP15:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C11]](s64) + ; CI-MESA: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY14]](s32) + ; CI-MESA: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL9]](s32) + ; CI-MESA: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] + ; CI-MESA: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; CI-MESA: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; CI-MESA: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C10]](s32) + ; CI-MESA: [[OR10:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL10]] + ; CI-MESA: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR8]](s16) + ; CI-MESA: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; CI-MESA: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C10]](s32) + ; CI-MESA: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; CI-MESA: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES 
[[OR10]](s32), [[OR11]](s32) + ; CI-MESA: [[C12:%[0-9]+]]:_(s64) = G_CONSTANT i64 16 + ; CI-MESA: [[GEP15:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C12]](s64) ; CI-MESA: [[LOAD16:%[0-9]+]]:_(s32) = G_LOAD [[GEP15]](p0) :: (load 1) ; CI-MESA: [[GEP16:%[0-9]+]]:_(p0) = G_GEP [[GEP15]], [[C]](s64) ; CI-MESA: [[LOAD17:%[0-9]+]]:_(s32) = G_LOAD [[GEP16]](p0) :: (load 1) @@ -7899,34 +8721,42 @@ body: | ; CI-MESA: [[COPY16:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-MESA: [[COPY17:%[0-9]+]]:_(s32) = COPY [[LOAD17]](s32) ; CI-MESA: [[AND17:%[0-9]+]]:_(s32) = G_AND [[COPY17]], [[C9]] - ; CI-MESA: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[AND17]], [[COPY16]](s32) - ; CI-MESA: [[TRUNC17:%[0-9]+]]:_(s16) = G_TRUNC [[SHL8]](s32) - ; CI-MESA: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[TRUNC17]] + ; CI-MESA: [[SHL12:%[0-9]+]]:_(s32) = G_SHL [[AND17]], [[COPY16]](s32) + ; CI-MESA: [[TRUNC17:%[0-9]+]]:_(s16) = G_TRUNC [[SHL12]](s32) + ; CI-MESA: [[OR12:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[TRUNC17]] ; CI-MESA: [[TRUNC18:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD18]](s32) ; CI-MESA: [[AND18:%[0-9]+]]:_(s16) = G_AND [[TRUNC18]], [[C7]] ; CI-MESA: [[COPY18:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-MESA: [[COPY19:%[0-9]+]]:_(s32) = COPY [[LOAD19]](s32) ; CI-MESA: [[AND19:%[0-9]+]]:_(s32) = G_AND [[COPY19]], [[C9]] - ; CI-MESA: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[AND19]], [[COPY18]](s32) - ; CI-MESA: [[TRUNC19:%[0-9]+]]:_(s16) = G_TRUNC [[SHL9]](s32) - ; CI-MESA: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[TRUNC19]] + ; CI-MESA: [[SHL13:%[0-9]+]]:_(s32) = G_SHL [[AND19]], [[COPY18]](s32) + ; CI-MESA: [[TRUNC19:%[0-9]+]]:_(s16) = G_TRUNC [[SHL13]](s32) + ; CI-MESA: [[OR13:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[TRUNC19]] ; CI-MESA: [[TRUNC20:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD20]](s32) ; CI-MESA: [[AND20:%[0-9]+]]:_(s16) = G_AND [[TRUNC20]], [[C7]] ; CI-MESA: [[COPY20:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-MESA: [[COPY21:%[0-9]+]]:_(s32) = COPY [[LOAD21]](s32) ; CI-MESA: [[AND21:%[0-9]+]]:_(s32) = G_AND [[COPY21]], 
[[C9]] - ; CI-MESA: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[AND21]], [[COPY20]](s32) - ; CI-MESA: [[TRUNC21:%[0-9]+]]:_(s16) = G_TRUNC [[SHL10]](s32) - ; CI-MESA: [[OR10:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[TRUNC21]] + ; CI-MESA: [[SHL14:%[0-9]+]]:_(s32) = G_SHL [[AND21]], [[COPY20]](s32) + ; CI-MESA: [[TRUNC21:%[0-9]+]]:_(s16) = G_TRUNC [[SHL14]](s32) + ; CI-MESA: [[OR14:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[TRUNC21]] ; CI-MESA: [[TRUNC22:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD22]](s32) ; CI-MESA: [[AND22:%[0-9]+]]:_(s16) = G_AND [[TRUNC22]], [[C7]] ; CI-MESA: [[COPY22:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-MESA: [[COPY23:%[0-9]+]]:_(s32) = COPY [[LOAD23]](s32) ; CI-MESA: [[AND23:%[0-9]+]]:_(s32) = G_AND [[COPY23]], [[C9]] - ; CI-MESA: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[AND23]], [[COPY22]](s32) - ; CI-MESA: [[TRUNC23:%[0-9]+]]:_(s16) = G_TRUNC [[SHL11]](s32) - ; CI-MESA: [[OR11:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[TRUNC23]] - ; CI-MESA: [[MV2:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR8]](s16), [[OR9]](s16), [[OR10]](s16), [[OR11]](s16) + ; CI-MESA: [[SHL15:%[0-9]+]]:_(s32) = G_SHL [[AND23]], [[COPY22]](s32) + ; CI-MESA: [[TRUNC23:%[0-9]+]]:_(s16) = G_TRUNC [[SHL15]](s32) + ; CI-MESA: [[OR15:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[TRUNC23]] + ; CI-MESA: [[ZEXT8:%[0-9]+]]:_(s32) = G_ZEXT [[OR12]](s16) + ; CI-MESA: [[ZEXT9:%[0-9]+]]:_(s32) = G_ZEXT [[OR13]](s16) + ; CI-MESA: [[SHL16:%[0-9]+]]:_(s32) = G_SHL [[ZEXT9]], [[C10]](s32) + ; CI-MESA: [[OR16:%[0-9]+]]:_(s32) = G_OR [[ZEXT8]], [[SHL16]] + ; CI-MESA: [[ZEXT10:%[0-9]+]]:_(s32) = G_ZEXT [[OR14]](s16) + ; CI-MESA: [[ZEXT11:%[0-9]+]]:_(s32) = G_ZEXT [[OR15]](s16) + ; CI-MESA: [[SHL17:%[0-9]+]]:_(s32) = G_SHL [[ZEXT11]], [[C10]](s32) + ; CI-MESA: [[OR17:%[0-9]+]]:_(s32) = G_OR [[ZEXT10]], [[SHL17]] + ; CI-MESA: [[MV2:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR16]](s32), [[OR17]](s32) ; CI-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s64>) = G_BUILD_VECTOR [[MV]](s64), [[MV1]](s64), [[MV2]](s64) ; CI-MESA: [[DEF:%[0-9]+]]:_(<4 x s64>) = 
G_IMPLICIT_DEF ; CI-MESA: [[INSERT:%[0-9]+]]:_(<4 x s64>) = G_INSERT [[DEF]], [[BUILD_VECTOR]](<3 x s64>), 0 @@ -7981,9 +8811,18 @@ body: | ; GFX9-MESA: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C7]] ; GFX9-MESA: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C8]](s16) ; GFX9-MESA: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; GFX9-MESA: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) - ; GFX9-MESA: [[C9:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 - ; GFX9-MESA: [[GEP7:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C9]](s64) + ; GFX9-MESA: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9-MESA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9-MESA: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9-MESA: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C9]](s32) + ; GFX9-MESA: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; GFX9-MESA: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; GFX9-MESA: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; GFX9-MESA: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C9]](s32) + ; GFX9-MESA: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; GFX9-MESA: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) + ; GFX9-MESA: [[C10:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 + ; GFX9-MESA: [[GEP7:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C10]](s64) ; GFX9-MESA: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p0) :: (load 1) ; GFX9-MESA: [[GEP8:%[0-9]+]]:_(p0) = G_GEP [[GEP7]], [[C]](s64) ; GFX9-MESA: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p0) :: (load 1) @@ -8003,29 +8842,37 @@ body: | ; GFX9-MESA: [[AND8:%[0-9]+]]:_(s16) = G_AND [[TRUNC8]], [[C7]] ; GFX9-MESA: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD9]](s32) ; GFX9-MESA: [[AND9:%[0-9]+]]:_(s16) = G_AND [[TRUNC9]], [[C7]] - ; GFX9-MESA: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) - ; GFX9-MESA: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL4]] + ; GFX9-MESA: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) + ; GFX9-MESA: 
[[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL6]] ; GFX9-MESA: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; GFX9-MESA: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C7]] ; GFX9-MESA: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD11]](s32) ; GFX9-MESA: [[AND11:%[0-9]+]]:_(s16) = G_AND [[TRUNC11]], [[C7]] - ; GFX9-MESA: [[SHL5:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) - ; GFX9-MESA: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL5]] + ; GFX9-MESA: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) + ; GFX9-MESA: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL7]] ; GFX9-MESA: [[TRUNC12:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD12]](s32) ; GFX9-MESA: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C7]] ; GFX9-MESA: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD13]](s32) ; GFX9-MESA: [[AND13:%[0-9]+]]:_(s16) = G_AND [[TRUNC13]], [[C7]] - ; GFX9-MESA: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C8]](s16) - ; GFX9-MESA: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL6]] + ; GFX9-MESA: [[SHL8:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C8]](s16) + ; GFX9-MESA: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL8]] ; GFX9-MESA: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; GFX9-MESA: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C7]] ; GFX9-MESA: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD15]](s32) ; GFX9-MESA: [[AND15:%[0-9]+]]:_(s16) = G_AND [[TRUNC15]], [[C7]] - ; GFX9-MESA: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C8]](s16) - ; GFX9-MESA: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL7]] - ; GFX9-MESA: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16), [[OR6]](s16), [[OR7]](s16) - ; GFX9-MESA: [[C10:%[0-9]+]]:_(s64) = G_CONSTANT i64 16 - ; GFX9-MESA: [[GEP15:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C10]](s64) + ; GFX9-MESA: [[SHL9:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C8]](s16) + ; GFX9-MESA: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL9]] + ; GFX9-MESA: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; GFX9-MESA: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT 
[[OR7]](s16) + ; GFX9-MESA: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C9]](s32) + ; GFX9-MESA: [[OR10:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL10]] + ; GFX9-MESA: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR8]](s16) + ; GFX9-MESA: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; GFX9-MESA: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C9]](s32) + ; GFX9-MESA: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; GFX9-MESA: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR10]](s32), [[OR11]](s32) + ; GFX9-MESA: [[C11:%[0-9]+]]:_(s64) = G_CONSTANT i64 16 + ; GFX9-MESA: [[GEP15:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C11]](s64) ; GFX9-MESA: [[LOAD16:%[0-9]+]]:_(s32) = G_LOAD [[GEP15]](p0) :: (load 1) ; GFX9-MESA: [[GEP16:%[0-9]+]]:_(p0) = G_GEP [[GEP15]], [[C]](s64) ; GFX9-MESA: [[LOAD17:%[0-9]+]]:_(s32) = G_LOAD [[GEP16]](p0) :: (load 1) @@ -8045,27 +8892,35 @@ body: | ; GFX9-MESA: [[AND16:%[0-9]+]]:_(s16) = G_AND [[TRUNC16]], [[C7]] ; GFX9-MESA: [[TRUNC17:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD17]](s32) ; GFX9-MESA: [[AND17:%[0-9]+]]:_(s16) = G_AND [[TRUNC17]], [[C7]] - ; GFX9-MESA: [[SHL8:%[0-9]+]]:_(s16) = G_SHL [[AND17]], [[C8]](s16) - ; GFX9-MESA: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[SHL8]] + ; GFX9-MESA: [[SHL12:%[0-9]+]]:_(s16) = G_SHL [[AND17]], [[C8]](s16) + ; GFX9-MESA: [[OR12:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[SHL12]] ; GFX9-MESA: [[TRUNC18:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD18]](s32) ; GFX9-MESA: [[AND18:%[0-9]+]]:_(s16) = G_AND [[TRUNC18]], [[C7]] ; GFX9-MESA: [[TRUNC19:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD19]](s32) ; GFX9-MESA: [[AND19:%[0-9]+]]:_(s16) = G_AND [[TRUNC19]], [[C7]] - ; GFX9-MESA: [[SHL9:%[0-9]+]]:_(s16) = G_SHL [[AND19]], [[C8]](s16) - ; GFX9-MESA: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[SHL9]] + ; GFX9-MESA: [[SHL13:%[0-9]+]]:_(s16) = G_SHL [[AND19]], [[C8]](s16) + ; GFX9-MESA: [[OR13:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[SHL13]] ; GFX9-MESA: [[TRUNC20:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD20]](s32) ; GFX9-MESA: [[AND20:%[0-9]+]]:_(s16) = G_AND 
[[TRUNC20]], [[C7]] ; GFX9-MESA: [[TRUNC21:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD21]](s32) ; GFX9-MESA: [[AND21:%[0-9]+]]:_(s16) = G_AND [[TRUNC21]], [[C7]] - ; GFX9-MESA: [[SHL10:%[0-9]+]]:_(s16) = G_SHL [[AND21]], [[C8]](s16) - ; GFX9-MESA: [[OR10:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[SHL10]] + ; GFX9-MESA: [[SHL14:%[0-9]+]]:_(s16) = G_SHL [[AND21]], [[C8]](s16) + ; GFX9-MESA: [[OR14:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[SHL14]] ; GFX9-MESA: [[TRUNC22:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD22]](s32) ; GFX9-MESA: [[AND22:%[0-9]+]]:_(s16) = G_AND [[TRUNC22]], [[C7]] ; GFX9-MESA: [[TRUNC23:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD23]](s32) ; GFX9-MESA: [[AND23:%[0-9]+]]:_(s16) = G_AND [[TRUNC23]], [[C7]] - ; GFX9-MESA: [[SHL11:%[0-9]+]]:_(s16) = G_SHL [[AND23]], [[C8]](s16) - ; GFX9-MESA: [[OR11:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[SHL11]] - ; GFX9-MESA: [[MV2:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR8]](s16), [[OR9]](s16), [[OR10]](s16), [[OR11]](s16) + ; GFX9-MESA: [[SHL15:%[0-9]+]]:_(s16) = G_SHL [[AND23]], [[C8]](s16) + ; GFX9-MESA: [[OR15:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[SHL15]] + ; GFX9-MESA: [[ZEXT8:%[0-9]+]]:_(s32) = G_ZEXT [[OR12]](s16) + ; GFX9-MESA: [[ZEXT9:%[0-9]+]]:_(s32) = G_ZEXT [[OR13]](s16) + ; GFX9-MESA: [[SHL16:%[0-9]+]]:_(s32) = G_SHL [[ZEXT9]], [[C9]](s32) + ; GFX9-MESA: [[OR16:%[0-9]+]]:_(s32) = G_OR [[ZEXT8]], [[SHL16]] + ; GFX9-MESA: [[ZEXT10:%[0-9]+]]:_(s32) = G_ZEXT [[OR14]](s16) + ; GFX9-MESA: [[ZEXT11:%[0-9]+]]:_(s32) = G_ZEXT [[OR15]](s16) + ; GFX9-MESA: [[SHL17:%[0-9]+]]:_(s32) = G_SHL [[ZEXT11]], [[C9]](s32) + ; GFX9-MESA: [[OR17:%[0-9]+]]:_(s32) = G_OR [[ZEXT10]], [[SHL17]] + ; GFX9-MESA: [[MV2:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR16]](s32), [[OR17]](s32) ; GFX9-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s64>) = G_BUILD_VECTOR [[MV]](s64), [[MV1]](s64), [[MV2]](s64) ; GFX9-MESA: [[DEF:%[0-9]+]]:_(<4 x s64>) = G_IMPLICIT_DEF ; GFX9-MESA: [[INSERT:%[0-9]+]]:_(<4 x s64>) = G_INSERT [[DEF]], [[BUILD_VECTOR]](<3 x s64>), 0 @@ -8243,9 +9098,18 @@ body: | ; CI: 
[[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C8]](s32) ; CI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) ; CI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; CI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) - ; CI: [[C10:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 - ; CI: [[GEP7:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C10]](s64) + ; CI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C10]](s32) + ; CI: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; CI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; CI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C10]](s32) + ; CI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; CI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) + ; CI: [[C11:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 + ; CI: [[GEP7:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C11]](s64) ; CI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p0) :: (load 1) ; CI: [[GEP8:%[0-9]+]]:_(p0) = G_GEP [[GEP7]], [[C]](s64) ; CI: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p0) :: (load 1) @@ -8266,37 +9130,45 @@ body: | ; CI: [[COPY8:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI: [[COPY9:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32) ; CI: [[AND9:%[0-9]+]]:_(s32) = G_AND [[COPY9]], [[C9]] - ; CI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY8]](s32) - ; CI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) - ; CI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] + ; CI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY8]](s32) + ; CI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) + ; CI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] ; CI: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; CI: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C7]] ; CI: [[COPY10:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI: 
[[COPY11:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32) ; CI: [[AND11:%[0-9]+]]:_(s32) = G_AND [[COPY11]], [[C9]] - ; CI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY10]](s32) - ; CI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL5]](s32) - ; CI: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] + ; CI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY10]](s32) + ; CI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) + ; CI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] ; CI: [[TRUNC12:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD12]](s32) ; CI: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C7]] ; CI: [[COPY12:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI: [[COPY13:%[0-9]+]]:_(s32) = COPY [[LOAD13]](s32) ; CI: [[AND13:%[0-9]+]]:_(s32) = G_AND [[COPY13]], [[C9]] - ; CI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY12]](s32) - ; CI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) - ; CI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] + ; CI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY12]](s32) + ; CI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL8]](s32) + ; CI: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] ; CI: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; CI: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C7]] ; CI: [[COPY14:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI: [[COPY15:%[0-9]+]]:_(s32) = COPY [[LOAD15]](s32) ; CI: [[AND15:%[0-9]+]]:_(s32) = G_AND [[COPY15]], [[C9]] - ; CI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY14]](s32) - ; CI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) - ; CI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] - ; CI: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16), [[OR6]](s16), [[OR7]](s16) + ; CI: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY14]](s32) + ; CI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL9]](s32) + ; CI: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] + ; CI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; CI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT 
[[OR7]](s16) + ; CI: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C10]](s32) + ; CI: [[OR10:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL10]] + ; CI: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR8]](s16) + ; CI: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; CI: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C10]](s32) + ; CI: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; CI: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR10]](s32), [[OR11]](s32) ; CI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s64>) = G_BUILD_VECTOR [[MV]](s64), [[MV1]](s64) - ; CI: [[C11:%[0-9]+]]:_(s64) = G_CONSTANT i64 16 - ; CI: [[GEP15:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C11]](s64) + ; CI: [[C12:%[0-9]+]]:_(s64) = G_CONSTANT i64 16 + ; CI: [[GEP15:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C12]](s64) ; CI: [[LOAD16:%[0-9]+]]:_(s32) = G_LOAD [[GEP15]](p0) :: (load 1) ; CI: [[GEP16:%[0-9]+]]:_(p0) = G_GEP [[GEP15]], [[C]](s64) ; CI: [[LOAD17:%[0-9]+]]:_(s32) = G_LOAD [[GEP16]](p0) :: (load 1) @@ -8317,35 +9189,43 @@ body: | ; CI: [[COPY16:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI: [[COPY17:%[0-9]+]]:_(s32) = COPY [[LOAD17]](s32) ; CI: [[AND17:%[0-9]+]]:_(s32) = G_AND [[COPY17]], [[C9]] - ; CI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[AND17]], [[COPY16]](s32) - ; CI: [[TRUNC17:%[0-9]+]]:_(s16) = G_TRUNC [[SHL8]](s32) - ; CI: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[TRUNC17]] + ; CI: [[SHL12:%[0-9]+]]:_(s32) = G_SHL [[AND17]], [[COPY16]](s32) + ; CI: [[TRUNC17:%[0-9]+]]:_(s16) = G_TRUNC [[SHL12]](s32) + ; CI: [[OR12:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[TRUNC17]] ; CI: [[TRUNC18:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD18]](s32) ; CI: [[AND18:%[0-9]+]]:_(s16) = G_AND [[TRUNC18]], [[C7]] ; CI: [[COPY18:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI: [[COPY19:%[0-9]+]]:_(s32) = COPY [[LOAD19]](s32) ; CI: [[AND19:%[0-9]+]]:_(s32) = G_AND [[COPY19]], [[C9]] - ; CI: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[AND19]], [[COPY18]](s32) - ; CI: [[TRUNC19:%[0-9]+]]:_(s16) = G_TRUNC [[SHL9]](s32) - ; CI: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND18]], 
[[TRUNC19]] + ; CI: [[SHL13:%[0-9]+]]:_(s32) = G_SHL [[AND19]], [[COPY18]](s32) + ; CI: [[TRUNC19:%[0-9]+]]:_(s16) = G_TRUNC [[SHL13]](s32) + ; CI: [[OR13:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[TRUNC19]] ; CI: [[TRUNC20:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD20]](s32) ; CI: [[AND20:%[0-9]+]]:_(s16) = G_AND [[TRUNC20]], [[C7]] ; CI: [[COPY20:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI: [[COPY21:%[0-9]+]]:_(s32) = COPY [[LOAD21]](s32) ; CI: [[AND21:%[0-9]+]]:_(s32) = G_AND [[COPY21]], [[C9]] - ; CI: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[AND21]], [[COPY20]](s32) - ; CI: [[TRUNC21:%[0-9]+]]:_(s16) = G_TRUNC [[SHL10]](s32) - ; CI: [[OR10:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[TRUNC21]] + ; CI: [[SHL14:%[0-9]+]]:_(s32) = G_SHL [[AND21]], [[COPY20]](s32) + ; CI: [[TRUNC21:%[0-9]+]]:_(s16) = G_TRUNC [[SHL14]](s32) + ; CI: [[OR14:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[TRUNC21]] ; CI: [[TRUNC22:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD22]](s32) ; CI: [[AND22:%[0-9]+]]:_(s16) = G_AND [[TRUNC22]], [[C7]] ; CI: [[COPY22:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI: [[COPY23:%[0-9]+]]:_(s32) = COPY [[LOAD23]](s32) ; CI: [[AND23:%[0-9]+]]:_(s32) = G_AND [[COPY23]], [[C9]] - ; CI: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[AND23]], [[COPY22]](s32) - ; CI: [[TRUNC23:%[0-9]+]]:_(s16) = G_TRUNC [[SHL11]](s32) - ; CI: [[OR11:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[TRUNC23]] - ; CI: [[MV2:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR8]](s16), [[OR9]](s16), [[OR10]](s16), [[OR11]](s16) - ; CI: [[GEP23:%[0-9]+]]:_(p0) = G_GEP [[GEP15]], [[C10]](s64) + ; CI: [[SHL15:%[0-9]+]]:_(s32) = G_SHL [[AND23]], [[COPY22]](s32) + ; CI: [[TRUNC23:%[0-9]+]]:_(s16) = G_TRUNC [[SHL15]](s32) + ; CI: [[OR15:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[TRUNC23]] + ; CI: [[ZEXT8:%[0-9]+]]:_(s32) = G_ZEXT [[OR12]](s16) + ; CI: [[ZEXT9:%[0-9]+]]:_(s32) = G_ZEXT [[OR13]](s16) + ; CI: [[SHL16:%[0-9]+]]:_(s32) = G_SHL [[ZEXT9]], [[C10]](s32) + ; CI: [[OR16:%[0-9]+]]:_(s32) = G_OR [[ZEXT8]], [[SHL16]] + ; CI: [[ZEXT10:%[0-9]+]]:_(s32) = G_ZEXT [[OR14]](s16) + ; CI: 
[[ZEXT11:%[0-9]+]]:_(s32) = G_ZEXT [[OR15]](s16) + ; CI: [[SHL17:%[0-9]+]]:_(s32) = G_SHL [[ZEXT11]], [[C10]](s32) + ; CI: [[OR17:%[0-9]+]]:_(s32) = G_OR [[ZEXT10]], [[SHL17]] + ; CI: [[MV2:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR16]](s32), [[OR17]](s32) + ; CI: [[GEP23:%[0-9]+]]:_(p0) = G_GEP [[GEP15]], [[C11]](s64) ; CI: [[LOAD24:%[0-9]+]]:_(s32) = G_LOAD [[GEP23]](p0) :: (load 1) ; CI: [[GEP24:%[0-9]+]]:_(p0) = G_GEP [[GEP23]], [[C]](s64) ; CI: [[LOAD25:%[0-9]+]]:_(s32) = G_LOAD [[GEP24]](p0) :: (load 1) @@ -8366,34 +9246,42 @@ body: | ; CI: [[COPY24:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI: [[COPY25:%[0-9]+]]:_(s32) = COPY [[LOAD25]](s32) ; CI: [[AND25:%[0-9]+]]:_(s32) = G_AND [[COPY25]], [[C9]] - ; CI: [[SHL12:%[0-9]+]]:_(s32) = G_SHL [[AND25]], [[COPY24]](s32) - ; CI: [[TRUNC25:%[0-9]+]]:_(s16) = G_TRUNC [[SHL12]](s32) - ; CI: [[OR12:%[0-9]+]]:_(s16) = G_OR [[AND24]], [[TRUNC25]] + ; CI: [[SHL18:%[0-9]+]]:_(s32) = G_SHL [[AND25]], [[COPY24]](s32) + ; CI: [[TRUNC25:%[0-9]+]]:_(s16) = G_TRUNC [[SHL18]](s32) + ; CI: [[OR18:%[0-9]+]]:_(s16) = G_OR [[AND24]], [[TRUNC25]] ; CI: [[TRUNC26:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD26]](s32) ; CI: [[AND26:%[0-9]+]]:_(s16) = G_AND [[TRUNC26]], [[C7]] ; CI: [[COPY26:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI: [[COPY27:%[0-9]+]]:_(s32) = COPY [[LOAD27]](s32) ; CI: [[AND27:%[0-9]+]]:_(s32) = G_AND [[COPY27]], [[C9]] - ; CI: [[SHL13:%[0-9]+]]:_(s32) = G_SHL [[AND27]], [[COPY26]](s32) - ; CI: [[TRUNC27:%[0-9]+]]:_(s16) = G_TRUNC [[SHL13]](s32) - ; CI: [[OR13:%[0-9]+]]:_(s16) = G_OR [[AND26]], [[TRUNC27]] + ; CI: [[SHL19:%[0-9]+]]:_(s32) = G_SHL [[AND27]], [[COPY26]](s32) + ; CI: [[TRUNC27:%[0-9]+]]:_(s16) = G_TRUNC [[SHL19]](s32) + ; CI: [[OR19:%[0-9]+]]:_(s16) = G_OR [[AND26]], [[TRUNC27]] ; CI: [[TRUNC28:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD28]](s32) ; CI: [[AND28:%[0-9]+]]:_(s16) = G_AND [[TRUNC28]], [[C7]] ; CI: [[COPY28:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI: [[COPY29:%[0-9]+]]:_(s32) = COPY [[LOAD29]](s32) ; CI: 
[[AND29:%[0-9]+]]:_(s32) = G_AND [[COPY29]], [[C9]] - ; CI: [[SHL14:%[0-9]+]]:_(s32) = G_SHL [[AND29]], [[COPY28]](s32) - ; CI: [[TRUNC29:%[0-9]+]]:_(s16) = G_TRUNC [[SHL14]](s32) - ; CI: [[OR14:%[0-9]+]]:_(s16) = G_OR [[AND28]], [[TRUNC29]] + ; CI: [[SHL20:%[0-9]+]]:_(s32) = G_SHL [[AND29]], [[COPY28]](s32) + ; CI: [[TRUNC29:%[0-9]+]]:_(s16) = G_TRUNC [[SHL20]](s32) + ; CI: [[OR20:%[0-9]+]]:_(s16) = G_OR [[AND28]], [[TRUNC29]] ; CI: [[TRUNC30:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD30]](s32) ; CI: [[AND30:%[0-9]+]]:_(s16) = G_AND [[TRUNC30]], [[C7]] ; CI: [[COPY30:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI: [[COPY31:%[0-9]+]]:_(s32) = COPY [[LOAD31]](s32) ; CI: [[AND31:%[0-9]+]]:_(s32) = G_AND [[COPY31]], [[C9]] - ; CI: [[SHL15:%[0-9]+]]:_(s32) = G_SHL [[AND31]], [[COPY30]](s32) - ; CI: [[TRUNC31:%[0-9]+]]:_(s16) = G_TRUNC [[SHL15]](s32) - ; CI: [[OR15:%[0-9]+]]:_(s16) = G_OR [[AND30]], [[TRUNC31]] - ; CI: [[MV3:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR12]](s16), [[OR13]](s16), [[OR14]](s16), [[OR15]](s16) + ; CI: [[SHL21:%[0-9]+]]:_(s32) = G_SHL [[AND31]], [[COPY30]](s32) + ; CI: [[TRUNC31:%[0-9]+]]:_(s16) = G_TRUNC [[SHL21]](s32) + ; CI: [[OR21:%[0-9]+]]:_(s16) = G_OR [[AND30]], [[TRUNC31]] + ; CI: [[ZEXT12:%[0-9]+]]:_(s32) = G_ZEXT [[OR18]](s16) + ; CI: [[ZEXT13:%[0-9]+]]:_(s32) = G_ZEXT [[OR19]](s16) + ; CI: [[SHL22:%[0-9]+]]:_(s32) = G_SHL [[ZEXT13]], [[C10]](s32) + ; CI: [[OR22:%[0-9]+]]:_(s32) = G_OR [[ZEXT12]], [[SHL22]] + ; CI: [[ZEXT14:%[0-9]+]]:_(s32) = G_ZEXT [[OR20]](s16) + ; CI: [[ZEXT15:%[0-9]+]]:_(s32) = G_ZEXT [[OR21]](s16) + ; CI: [[SHL23:%[0-9]+]]:_(s32) = G_SHL [[ZEXT15]], [[C10]](s32) + ; CI: [[OR23:%[0-9]+]]:_(s32) = G_OR [[ZEXT14]], [[SHL23]] + ; CI: [[MV3:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR22]](s32), [[OR23]](s32) ; CI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s64>) = G_BUILD_VECTOR [[MV2]](s64), [[MV3]](s64) ; CI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s64>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s64>), [[BUILD_VECTOR1]](<2 x s64>) ; CI: 
$vgpr0_vgpr1_vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7 = COPY [[CONCAT_VECTORS]](<4 x s64>) @@ -8447,9 +9335,18 @@ body: | ; VI: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C7]] ; VI: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C8]](s16) ; VI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; VI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) - ; VI: [[C9:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 - ; VI: [[GEP7:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C9]](s64) + ; VI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; VI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; VI: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C9]](s32) + ; VI: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; VI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; VI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; VI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C9]](s32) + ; VI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; VI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) + ; VI: [[C10:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 + ; VI: [[GEP7:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C10]](s64) ; VI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p0) :: (load 1) ; VI: [[GEP8:%[0-9]+]]:_(p0) = G_GEP [[GEP7]], [[C]](s64) ; VI: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p0) :: (load 1) @@ -8469,30 +9366,38 @@ body: | ; VI: [[AND8:%[0-9]+]]:_(s16) = G_AND [[TRUNC8]], [[C7]] ; VI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD9]](s32) ; VI: [[AND9:%[0-9]+]]:_(s16) = G_AND [[TRUNC9]], [[C7]] - ; VI: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) - ; VI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL4]] + ; VI: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) + ; VI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL6]] ; VI: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; VI: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C7]] ; VI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC 
[[LOAD11]](s32) ; VI: [[AND11:%[0-9]+]]:_(s16) = G_AND [[TRUNC11]], [[C7]] - ; VI: [[SHL5:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) - ; VI: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL5]] + ; VI: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) + ; VI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL7]] ; VI: [[TRUNC12:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD12]](s32) ; VI: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C7]] ; VI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD13]](s32) ; VI: [[AND13:%[0-9]+]]:_(s16) = G_AND [[TRUNC13]], [[C7]] - ; VI: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C8]](s16) - ; VI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL6]] + ; VI: [[SHL8:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C8]](s16) + ; VI: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL8]] ; VI: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; VI: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C7]] ; VI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD15]](s32) ; VI: [[AND15:%[0-9]+]]:_(s16) = G_AND [[TRUNC15]], [[C7]] - ; VI: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C8]](s16) - ; VI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL7]] - ; VI: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16), [[OR6]](s16), [[OR7]](s16) + ; VI: [[SHL9:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C8]](s16) + ; VI: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL9]] + ; VI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; VI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; VI: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C9]](s32) + ; VI: [[OR10:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL10]] + ; VI: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR8]](s16) + ; VI: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; VI: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C9]](s32) + ; VI: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; VI: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR10]](s32), [[OR11]](s32) ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s64>) = G_BUILD_VECTOR [[MV]](s64), 
[[MV1]](s64) - ; VI: [[C10:%[0-9]+]]:_(s64) = G_CONSTANT i64 16 - ; VI: [[GEP15:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C10]](s64) + ; VI: [[C11:%[0-9]+]]:_(s64) = G_CONSTANT i64 16 + ; VI: [[GEP15:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C11]](s64) ; VI: [[LOAD16:%[0-9]+]]:_(s32) = G_LOAD [[GEP15]](p0) :: (load 1) ; VI: [[GEP16:%[0-9]+]]:_(p0) = G_GEP [[GEP15]], [[C]](s64) ; VI: [[LOAD17:%[0-9]+]]:_(s32) = G_LOAD [[GEP16]](p0) :: (load 1) @@ -8512,28 +9417,36 @@ body: | ; VI: [[AND16:%[0-9]+]]:_(s16) = G_AND [[TRUNC16]], [[C7]] ; VI: [[TRUNC17:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD17]](s32) ; VI: [[AND17:%[0-9]+]]:_(s16) = G_AND [[TRUNC17]], [[C7]] - ; VI: [[SHL8:%[0-9]+]]:_(s16) = G_SHL [[AND17]], [[C8]](s16) - ; VI: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[SHL8]] + ; VI: [[SHL12:%[0-9]+]]:_(s16) = G_SHL [[AND17]], [[C8]](s16) + ; VI: [[OR12:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[SHL12]] ; VI: [[TRUNC18:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD18]](s32) ; VI: [[AND18:%[0-9]+]]:_(s16) = G_AND [[TRUNC18]], [[C7]] ; VI: [[TRUNC19:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD19]](s32) ; VI: [[AND19:%[0-9]+]]:_(s16) = G_AND [[TRUNC19]], [[C7]] - ; VI: [[SHL9:%[0-9]+]]:_(s16) = G_SHL [[AND19]], [[C8]](s16) - ; VI: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[SHL9]] + ; VI: [[SHL13:%[0-9]+]]:_(s16) = G_SHL [[AND19]], [[C8]](s16) + ; VI: [[OR13:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[SHL13]] ; VI: [[TRUNC20:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD20]](s32) ; VI: [[AND20:%[0-9]+]]:_(s16) = G_AND [[TRUNC20]], [[C7]] ; VI: [[TRUNC21:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD21]](s32) ; VI: [[AND21:%[0-9]+]]:_(s16) = G_AND [[TRUNC21]], [[C7]] - ; VI: [[SHL10:%[0-9]+]]:_(s16) = G_SHL [[AND21]], [[C8]](s16) - ; VI: [[OR10:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[SHL10]] + ; VI: [[SHL14:%[0-9]+]]:_(s16) = G_SHL [[AND21]], [[C8]](s16) + ; VI: [[OR14:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[SHL14]] ; VI: [[TRUNC22:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD22]](s32) ; VI: [[AND22:%[0-9]+]]:_(s16) = G_AND [[TRUNC22]], [[C7]] ; VI: 
[[TRUNC23:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD23]](s32) ; VI: [[AND23:%[0-9]+]]:_(s16) = G_AND [[TRUNC23]], [[C7]] - ; VI: [[SHL11:%[0-9]+]]:_(s16) = G_SHL [[AND23]], [[C8]](s16) - ; VI: [[OR11:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[SHL11]] - ; VI: [[MV2:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR8]](s16), [[OR9]](s16), [[OR10]](s16), [[OR11]](s16) - ; VI: [[GEP23:%[0-9]+]]:_(p0) = G_GEP [[GEP15]], [[C9]](s64) + ; VI: [[SHL15:%[0-9]+]]:_(s16) = G_SHL [[AND23]], [[C8]](s16) + ; VI: [[OR15:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[SHL15]] + ; VI: [[ZEXT8:%[0-9]+]]:_(s32) = G_ZEXT [[OR12]](s16) + ; VI: [[ZEXT9:%[0-9]+]]:_(s32) = G_ZEXT [[OR13]](s16) + ; VI: [[SHL16:%[0-9]+]]:_(s32) = G_SHL [[ZEXT9]], [[C9]](s32) + ; VI: [[OR16:%[0-9]+]]:_(s32) = G_OR [[ZEXT8]], [[SHL16]] + ; VI: [[ZEXT10:%[0-9]+]]:_(s32) = G_ZEXT [[OR14]](s16) + ; VI: [[ZEXT11:%[0-9]+]]:_(s32) = G_ZEXT [[OR15]](s16) + ; VI: [[SHL17:%[0-9]+]]:_(s32) = G_SHL [[ZEXT11]], [[C9]](s32) + ; VI: [[OR17:%[0-9]+]]:_(s32) = G_OR [[ZEXT10]], [[SHL17]] + ; VI: [[MV2:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR16]](s32), [[OR17]](s32) + ; VI: [[GEP23:%[0-9]+]]:_(p0) = G_GEP [[GEP15]], [[C10]](s64) ; VI: [[LOAD24:%[0-9]+]]:_(s32) = G_LOAD [[GEP23]](p0) :: (load 1) ; VI: [[GEP24:%[0-9]+]]:_(p0) = G_GEP [[GEP23]], [[C]](s64) ; VI: [[LOAD25:%[0-9]+]]:_(s32) = G_LOAD [[GEP24]](p0) :: (load 1) @@ -8553,27 +9466,35 @@ body: | ; VI: [[AND24:%[0-9]+]]:_(s16) = G_AND [[TRUNC24]], [[C7]] ; VI: [[TRUNC25:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD25]](s32) ; VI: [[AND25:%[0-9]+]]:_(s16) = G_AND [[TRUNC25]], [[C7]] - ; VI: [[SHL12:%[0-9]+]]:_(s16) = G_SHL [[AND25]], [[C8]](s16) - ; VI: [[OR12:%[0-9]+]]:_(s16) = G_OR [[AND24]], [[SHL12]] + ; VI: [[SHL18:%[0-9]+]]:_(s16) = G_SHL [[AND25]], [[C8]](s16) + ; VI: [[OR18:%[0-9]+]]:_(s16) = G_OR [[AND24]], [[SHL18]] ; VI: [[TRUNC26:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD26]](s32) ; VI: [[AND26:%[0-9]+]]:_(s16) = G_AND [[TRUNC26]], [[C7]] ; VI: [[TRUNC27:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD27]](s32) ; VI: 
[[AND27:%[0-9]+]]:_(s16) = G_AND [[TRUNC27]], [[C7]] - ; VI: [[SHL13:%[0-9]+]]:_(s16) = G_SHL [[AND27]], [[C8]](s16) - ; VI: [[OR13:%[0-9]+]]:_(s16) = G_OR [[AND26]], [[SHL13]] + ; VI: [[SHL19:%[0-9]+]]:_(s16) = G_SHL [[AND27]], [[C8]](s16) + ; VI: [[OR19:%[0-9]+]]:_(s16) = G_OR [[AND26]], [[SHL19]] ; VI: [[TRUNC28:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD28]](s32) ; VI: [[AND28:%[0-9]+]]:_(s16) = G_AND [[TRUNC28]], [[C7]] ; VI: [[TRUNC29:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD29]](s32) ; VI: [[AND29:%[0-9]+]]:_(s16) = G_AND [[TRUNC29]], [[C7]] - ; VI: [[SHL14:%[0-9]+]]:_(s16) = G_SHL [[AND29]], [[C8]](s16) - ; VI: [[OR14:%[0-9]+]]:_(s16) = G_OR [[AND28]], [[SHL14]] + ; VI: [[SHL20:%[0-9]+]]:_(s16) = G_SHL [[AND29]], [[C8]](s16) + ; VI: [[OR20:%[0-9]+]]:_(s16) = G_OR [[AND28]], [[SHL20]] ; VI: [[TRUNC30:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD30]](s32) ; VI: [[AND30:%[0-9]+]]:_(s16) = G_AND [[TRUNC30]], [[C7]] ; VI: [[TRUNC31:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD31]](s32) ; VI: [[AND31:%[0-9]+]]:_(s16) = G_AND [[TRUNC31]], [[C7]] - ; VI: [[SHL15:%[0-9]+]]:_(s16) = G_SHL [[AND31]], [[C8]](s16) - ; VI: [[OR15:%[0-9]+]]:_(s16) = G_OR [[AND30]], [[SHL15]] - ; VI: [[MV3:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR12]](s16), [[OR13]](s16), [[OR14]](s16), [[OR15]](s16) + ; VI: [[SHL21:%[0-9]+]]:_(s16) = G_SHL [[AND31]], [[C8]](s16) + ; VI: [[OR21:%[0-9]+]]:_(s16) = G_OR [[AND30]], [[SHL21]] + ; VI: [[ZEXT12:%[0-9]+]]:_(s32) = G_ZEXT [[OR18]](s16) + ; VI: [[ZEXT13:%[0-9]+]]:_(s32) = G_ZEXT [[OR19]](s16) + ; VI: [[SHL22:%[0-9]+]]:_(s32) = G_SHL [[ZEXT13]], [[C9]](s32) + ; VI: [[OR22:%[0-9]+]]:_(s32) = G_OR [[ZEXT12]], [[SHL22]] + ; VI: [[ZEXT14:%[0-9]+]]:_(s32) = G_ZEXT [[OR20]](s16) + ; VI: [[ZEXT15:%[0-9]+]]:_(s32) = G_ZEXT [[OR21]](s16) + ; VI: [[SHL23:%[0-9]+]]:_(s32) = G_SHL [[ZEXT15]], [[C9]](s32) + ; VI: [[OR23:%[0-9]+]]:_(s32) = G_OR [[ZEXT14]], [[SHL23]] + ; VI: [[MV3:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR22]](s32), [[OR23]](s32) ; VI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s64>) = G_BUILD_VECTOR 
[[MV2]](s64), [[MV3]](s64) ; VI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s64>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s64>), [[BUILD_VECTOR1]](<2 x s64>) ; VI: $vgpr0_vgpr1_vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7 = COPY [[CONCAT_VECTORS]](<4 x s64>) @@ -8627,9 +9548,18 @@ body: | ; GFX9: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C7]] ; GFX9: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C8]](s16) ; GFX9: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; GFX9: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) - ; GFX9: [[C9:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 - ; GFX9: [[GEP7:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C9]](s64) + ; GFX9: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C9]](s32) + ; GFX9: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; GFX9: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; GFX9: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; GFX9: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C9]](s32) + ; GFX9: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; GFX9: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) + ; GFX9: [[C10:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 + ; GFX9: [[GEP7:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C10]](s64) ; GFX9: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p0) :: (load 1) ; GFX9: [[GEP8:%[0-9]+]]:_(p0) = G_GEP [[GEP7]], [[C]](s64) ; GFX9: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p0) :: (load 1) @@ -8649,30 +9579,38 @@ body: | ; GFX9: [[AND8:%[0-9]+]]:_(s16) = G_AND [[TRUNC8]], [[C7]] ; GFX9: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD9]](s32) ; GFX9: [[AND9:%[0-9]+]]:_(s16) = G_AND [[TRUNC9]], [[C7]] - ; GFX9: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) - ; GFX9: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL4]] + ; GFX9: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) + ; GFX9: 
[[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL6]] ; GFX9: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; GFX9: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C7]] ; GFX9: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD11]](s32) ; GFX9: [[AND11:%[0-9]+]]:_(s16) = G_AND [[TRUNC11]], [[C7]] - ; GFX9: [[SHL5:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) - ; GFX9: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL5]] + ; GFX9: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) + ; GFX9: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL7]] ; GFX9: [[TRUNC12:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD12]](s32) ; GFX9: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C7]] ; GFX9: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD13]](s32) ; GFX9: [[AND13:%[0-9]+]]:_(s16) = G_AND [[TRUNC13]], [[C7]] - ; GFX9: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C8]](s16) - ; GFX9: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL6]] + ; GFX9: [[SHL8:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C8]](s16) + ; GFX9: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL8]] ; GFX9: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; GFX9: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C7]] ; GFX9: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD15]](s32) ; GFX9: [[AND15:%[0-9]+]]:_(s16) = G_AND [[TRUNC15]], [[C7]] - ; GFX9: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C8]](s16) - ; GFX9: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL7]] - ; GFX9: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16), [[OR6]](s16), [[OR7]](s16) + ; GFX9: [[SHL9:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C8]](s16) + ; GFX9: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL9]] + ; GFX9: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; GFX9: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; GFX9: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C9]](s32) + ; GFX9: [[OR10:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL10]] + ; GFX9: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR8]](s16) + ; GFX9: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; 
GFX9: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C9]](s32) + ; GFX9: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; GFX9: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR10]](s32), [[OR11]](s32) ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s64>) = G_BUILD_VECTOR [[MV]](s64), [[MV1]](s64) - ; GFX9: [[C10:%[0-9]+]]:_(s64) = G_CONSTANT i64 16 - ; GFX9: [[GEP15:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C10]](s64) + ; GFX9: [[C11:%[0-9]+]]:_(s64) = G_CONSTANT i64 16 + ; GFX9: [[GEP15:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C11]](s64) ; GFX9: [[LOAD16:%[0-9]+]]:_(s32) = G_LOAD [[GEP15]](p0) :: (load 1) ; GFX9: [[GEP16:%[0-9]+]]:_(p0) = G_GEP [[GEP15]], [[C]](s64) ; GFX9: [[LOAD17:%[0-9]+]]:_(s32) = G_LOAD [[GEP16]](p0) :: (load 1) @@ -8692,28 +9630,36 @@ body: | ; GFX9: [[AND16:%[0-9]+]]:_(s16) = G_AND [[TRUNC16]], [[C7]] ; GFX9: [[TRUNC17:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD17]](s32) ; GFX9: [[AND17:%[0-9]+]]:_(s16) = G_AND [[TRUNC17]], [[C7]] - ; GFX9: [[SHL8:%[0-9]+]]:_(s16) = G_SHL [[AND17]], [[C8]](s16) - ; GFX9: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[SHL8]] + ; GFX9: [[SHL12:%[0-9]+]]:_(s16) = G_SHL [[AND17]], [[C8]](s16) + ; GFX9: [[OR12:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[SHL12]] ; GFX9: [[TRUNC18:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD18]](s32) ; GFX9: [[AND18:%[0-9]+]]:_(s16) = G_AND [[TRUNC18]], [[C7]] ; GFX9: [[TRUNC19:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD19]](s32) ; GFX9: [[AND19:%[0-9]+]]:_(s16) = G_AND [[TRUNC19]], [[C7]] - ; GFX9: [[SHL9:%[0-9]+]]:_(s16) = G_SHL [[AND19]], [[C8]](s16) - ; GFX9: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[SHL9]] + ; GFX9: [[SHL13:%[0-9]+]]:_(s16) = G_SHL [[AND19]], [[C8]](s16) + ; GFX9: [[OR13:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[SHL13]] ; GFX9: [[TRUNC20:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD20]](s32) ; GFX9: [[AND20:%[0-9]+]]:_(s16) = G_AND [[TRUNC20]], [[C7]] ; GFX9: [[TRUNC21:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD21]](s32) ; GFX9: [[AND21:%[0-9]+]]:_(s16) = G_AND [[TRUNC21]], [[C7]] - ; GFX9: [[SHL10:%[0-9]+]]:_(s16) = G_SHL [[AND21]], 
[[C8]](s16) - ; GFX9: [[OR10:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[SHL10]] + ; GFX9: [[SHL14:%[0-9]+]]:_(s16) = G_SHL [[AND21]], [[C8]](s16) + ; GFX9: [[OR14:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[SHL14]] ; GFX9: [[TRUNC22:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD22]](s32) ; GFX9: [[AND22:%[0-9]+]]:_(s16) = G_AND [[TRUNC22]], [[C7]] ; GFX9: [[TRUNC23:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD23]](s32) ; GFX9: [[AND23:%[0-9]+]]:_(s16) = G_AND [[TRUNC23]], [[C7]] - ; GFX9: [[SHL11:%[0-9]+]]:_(s16) = G_SHL [[AND23]], [[C8]](s16) - ; GFX9: [[OR11:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[SHL11]] - ; GFX9: [[MV2:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR8]](s16), [[OR9]](s16), [[OR10]](s16), [[OR11]](s16) - ; GFX9: [[GEP23:%[0-9]+]]:_(p0) = G_GEP [[GEP15]], [[C9]](s64) + ; GFX9: [[SHL15:%[0-9]+]]:_(s16) = G_SHL [[AND23]], [[C8]](s16) + ; GFX9: [[OR15:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[SHL15]] + ; GFX9: [[ZEXT8:%[0-9]+]]:_(s32) = G_ZEXT [[OR12]](s16) + ; GFX9: [[ZEXT9:%[0-9]+]]:_(s32) = G_ZEXT [[OR13]](s16) + ; GFX9: [[SHL16:%[0-9]+]]:_(s32) = G_SHL [[ZEXT9]], [[C9]](s32) + ; GFX9: [[OR16:%[0-9]+]]:_(s32) = G_OR [[ZEXT8]], [[SHL16]] + ; GFX9: [[ZEXT10:%[0-9]+]]:_(s32) = G_ZEXT [[OR14]](s16) + ; GFX9: [[ZEXT11:%[0-9]+]]:_(s32) = G_ZEXT [[OR15]](s16) + ; GFX9: [[SHL17:%[0-9]+]]:_(s32) = G_SHL [[ZEXT11]], [[C9]](s32) + ; GFX9: [[OR17:%[0-9]+]]:_(s32) = G_OR [[ZEXT10]], [[SHL17]] + ; GFX9: [[MV2:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR16]](s32), [[OR17]](s32) + ; GFX9: [[GEP23:%[0-9]+]]:_(p0) = G_GEP [[GEP15]], [[C10]](s64) ; GFX9: [[LOAD24:%[0-9]+]]:_(s32) = G_LOAD [[GEP23]](p0) :: (load 1) ; GFX9: [[GEP24:%[0-9]+]]:_(p0) = G_GEP [[GEP23]], [[C]](s64) ; GFX9: [[LOAD25:%[0-9]+]]:_(s32) = G_LOAD [[GEP24]](p0) :: (load 1) @@ -8733,27 +9679,35 @@ body: | ; GFX9: [[AND24:%[0-9]+]]:_(s16) = G_AND [[TRUNC24]], [[C7]] ; GFX9: [[TRUNC25:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD25]](s32) ; GFX9: [[AND25:%[0-9]+]]:_(s16) = G_AND [[TRUNC25]], [[C7]] - ; GFX9: [[SHL12:%[0-9]+]]:_(s16) = G_SHL [[AND25]], [[C8]](s16) - ; 
GFX9: [[OR12:%[0-9]+]]:_(s16) = G_OR [[AND24]], [[SHL12]] + ; GFX9: [[SHL18:%[0-9]+]]:_(s16) = G_SHL [[AND25]], [[C8]](s16) + ; GFX9: [[OR18:%[0-9]+]]:_(s16) = G_OR [[AND24]], [[SHL18]] ; GFX9: [[TRUNC26:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD26]](s32) ; GFX9: [[AND26:%[0-9]+]]:_(s16) = G_AND [[TRUNC26]], [[C7]] ; GFX9: [[TRUNC27:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD27]](s32) ; GFX9: [[AND27:%[0-9]+]]:_(s16) = G_AND [[TRUNC27]], [[C7]] - ; GFX9: [[SHL13:%[0-9]+]]:_(s16) = G_SHL [[AND27]], [[C8]](s16) - ; GFX9: [[OR13:%[0-9]+]]:_(s16) = G_OR [[AND26]], [[SHL13]] + ; GFX9: [[SHL19:%[0-9]+]]:_(s16) = G_SHL [[AND27]], [[C8]](s16) + ; GFX9: [[OR19:%[0-9]+]]:_(s16) = G_OR [[AND26]], [[SHL19]] ; GFX9: [[TRUNC28:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD28]](s32) ; GFX9: [[AND28:%[0-9]+]]:_(s16) = G_AND [[TRUNC28]], [[C7]] ; GFX9: [[TRUNC29:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD29]](s32) ; GFX9: [[AND29:%[0-9]+]]:_(s16) = G_AND [[TRUNC29]], [[C7]] - ; GFX9: [[SHL14:%[0-9]+]]:_(s16) = G_SHL [[AND29]], [[C8]](s16) - ; GFX9: [[OR14:%[0-9]+]]:_(s16) = G_OR [[AND28]], [[SHL14]] + ; GFX9: [[SHL20:%[0-9]+]]:_(s16) = G_SHL [[AND29]], [[C8]](s16) + ; GFX9: [[OR20:%[0-9]+]]:_(s16) = G_OR [[AND28]], [[SHL20]] ; GFX9: [[TRUNC30:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD30]](s32) ; GFX9: [[AND30:%[0-9]+]]:_(s16) = G_AND [[TRUNC30]], [[C7]] ; GFX9: [[TRUNC31:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD31]](s32) ; GFX9: [[AND31:%[0-9]+]]:_(s16) = G_AND [[TRUNC31]], [[C7]] - ; GFX9: [[SHL15:%[0-9]+]]:_(s16) = G_SHL [[AND31]], [[C8]](s16) - ; GFX9: [[OR15:%[0-9]+]]:_(s16) = G_OR [[AND30]], [[SHL15]] - ; GFX9: [[MV3:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR12]](s16), [[OR13]](s16), [[OR14]](s16), [[OR15]](s16) + ; GFX9: [[SHL21:%[0-9]+]]:_(s16) = G_SHL [[AND31]], [[C8]](s16) + ; GFX9: [[OR21:%[0-9]+]]:_(s16) = G_OR [[AND30]], [[SHL21]] + ; GFX9: [[ZEXT12:%[0-9]+]]:_(s32) = G_ZEXT [[OR18]](s16) + ; GFX9: [[ZEXT13:%[0-9]+]]:_(s32) = G_ZEXT [[OR19]](s16) + ; GFX9: [[SHL22:%[0-9]+]]:_(s32) = G_SHL [[ZEXT13]], [[C9]](s32) + ; GFX9: 
[[OR22:%[0-9]+]]:_(s32) = G_OR [[ZEXT12]], [[SHL22]] + ; GFX9: [[ZEXT14:%[0-9]+]]:_(s32) = G_ZEXT [[OR20]](s16) + ; GFX9: [[ZEXT15:%[0-9]+]]:_(s32) = G_ZEXT [[OR21]](s16) + ; GFX9: [[SHL23:%[0-9]+]]:_(s32) = G_SHL [[ZEXT15]], [[C9]](s32) + ; GFX9: [[OR23:%[0-9]+]]:_(s32) = G_OR [[ZEXT14]], [[SHL23]] + ; GFX9: [[MV3:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR22]](s32), [[OR23]](s32) ; GFX9: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s64>) = G_BUILD_VECTOR [[MV2]](s64), [[MV3]](s64) ; GFX9: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s64>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s64>), [[BUILD_VECTOR1]](<2 x s64>) ; GFX9: $vgpr0_vgpr1_vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7 = COPY [[CONCAT_VECTORS]](<4 x s64>) @@ -8815,9 +9769,18 @@ body: | ; CI-MESA: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C8]](s32) ; CI-MESA: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) ; CI-MESA: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; CI-MESA: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) - ; CI-MESA: [[C10:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 - ; CI-MESA: [[GEP7:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C10]](s64) + ; CI-MESA: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI-MESA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI-MESA: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-MESA: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C10]](s32) + ; CI-MESA: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; CI-MESA: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; CI-MESA: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI-MESA: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C10]](s32) + ; CI-MESA: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; CI-MESA: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) + ; CI-MESA: [[C11:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 + ; CI-MESA: [[GEP7:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C11]](s64) ; CI-MESA: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p0) :: (load 1) ; CI-MESA: [[GEP8:%[0-9]+]]:_(p0) 
= G_GEP [[GEP7]], [[C]](s64) ; CI-MESA: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p0) :: (load 1) @@ -8838,37 +9801,45 @@ body: | ; CI-MESA: [[COPY8:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-MESA: [[COPY9:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32) ; CI-MESA: [[AND9:%[0-9]+]]:_(s32) = G_AND [[COPY9]], [[C9]] - ; CI-MESA: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY8]](s32) - ; CI-MESA: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) - ; CI-MESA: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] + ; CI-MESA: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY8]](s32) + ; CI-MESA: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) + ; CI-MESA: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] ; CI-MESA: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; CI-MESA: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C7]] ; CI-MESA: [[COPY10:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-MESA: [[COPY11:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32) ; CI-MESA: [[AND11:%[0-9]+]]:_(s32) = G_AND [[COPY11]], [[C9]] - ; CI-MESA: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY10]](s32) - ; CI-MESA: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL5]](s32) - ; CI-MESA: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] + ; CI-MESA: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY10]](s32) + ; CI-MESA: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) + ; CI-MESA: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] ; CI-MESA: [[TRUNC12:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD12]](s32) ; CI-MESA: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C7]] ; CI-MESA: [[COPY12:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-MESA: [[COPY13:%[0-9]+]]:_(s32) = COPY [[LOAD13]](s32) ; CI-MESA: [[AND13:%[0-9]+]]:_(s32) = G_AND [[COPY13]], [[C9]] - ; CI-MESA: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY12]](s32) - ; CI-MESA: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) - ; CI-MESA: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] + ; CI-MESA: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY12]](s32) + ; 
CI-MESA: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL8]](s32) + ; CI-MESA: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] ; CI-MESA: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; CI-MESA: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C7]] ; CI-MESA: [[COPY14:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-MESA: [[COPY15:%[0-9]+]]:_(s32) = COPY [[LOAD15]](s32) ; CI-MESA: [[AND15:%[0-9]+]]:_(s32) = G_AND [[COPY15]], [[C9]] - ; CI-MESA: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY14]](s32) - ; CI-MESA: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) - ; CI-MESA: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] - ; CI-MESA: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16), [[OR6]](s16), [[OR7]](s16) + ; CI-MESA: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY14]](s32) + ; CI-MESA: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL9]](s32) + ; CI-MESA: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] + ; CI-MESA: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; CI-MESA: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; CI-MESA: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C10]](s32) + ; CI-MESA: [[OR10:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL10]] + ; CI-MESA: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR8]](s16) + ; CI-MESA: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; CI-MESA: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C10]](s32) + ; CI-MESA: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; CI-MESA: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR10]](s32), [[OR11]](s32) ; CI-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s64>) = G_BUILD_VECTOR [[MV]](s64), [[MV1]](s64) - ; CI-MESA: [[C11:%[0-9]+]]:_(s64) = G_CONSTANT i64 16 - ; CI-MESA: [[GEP15:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C11]](s64) + ; CI-MESA: [[C12:%[0-9]+]]:_(s64) = G_CONSTANT i64 16 + ; CI-MESA: [[GEP15:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C12]](s64) ; CI-MESA: [[LOAD16:%[0-9]+]]:_(s32) = G_LOAD [[GEP15]](p0) :: (load 1) ; CI-MESA: [[GEP16:%[0-9]+]]:_(p0) = G_GEP 
[[GEP15]], [[C]](s64) ; CI-MESA: [[LOAD17:%[0-9]+]]:_(s32) = G_LOAD [[GEP16]](p0) :: (load 1) @@ -8889,35 +9860,43 @@ body: | ; CI-MESA: [[COPY16:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-MESA: [[COPY17:%[0-9]+]]:_(s32) = COPY [[LOAD17]](s32) ; CI-MESA: [[AND17:%[0-9]+]]:_(s32) = G_AND [[COPY17]], [[C9]] - ; CI-MESA: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[AND17]], [[COPY16]](s32) - ; CI-MESA: [[TRUNC17:%[0-9]+]]:_(s16) = G_TRUNC [[SHL8]](s32) - ; CI-MESA: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[TRUNC17]] + ; CI-MESA: [[SHL12:%[0-9]+]]:_(s32) = G_SHL [[AND17]], [[COPY16]](s32) + ; CI-MESA: [[TRUNC17:%[0-9]+]]:_(s16) = G_TRUNC [[SHL12]](s32) + ; CI-MESA: [[OR12:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[TRUNC17]] ; CI-MESA: [[TRUNC18:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD18]](s32) ; CI-MESA: [[AND18:%[0-9]+]]:_(s16) = G_AND [[TRUNC18]], [[C7]] ; CI-MESA: [[COPY18:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-MESA: [[COPY19:%[0-9]+]]:_(s32) = COPY [[LOAD19]](s32) ; CI-MESA: [[AND19:%[0-9]+]]:_(s32) = G_AND [[COPY19]], [[C9]] - ; CI-MESA: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[AND19]], [[COPY18]](s32) - ; CI-MESA: [[TRUNC19:%[0-9]+]]:_(s16) = G_TRUNC [[SHL9]](s32) - ; CI-MESA: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[TRUNC19]] + ; CI-MESA: [[SHL13:%[0-9]+]]:_(s32) = G_SHL [[AND19]], [[COPY18]](s32) + ; CI-MESA: [[TRUNC19:%[0-9]+]]:_(s16) = G_TRUNC [[SHL13]](s32) + ; CI-MESA: [[OR13:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[TRUNC19]] ; CI-MESA: [[TRUNC20:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD20]](s32) ; CI-MESA: [[AND20:%[0-9]+]]:_(s16) = G_AND [[TRUNC20]], [[C7]] ; CI-MESA: [[COPY20:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-MESA: [[COPY21:%[0-9]+]]:_(s32) = COPY [[LOAD21]](s32) ; CI-MESA: [[AND21:%[0-9]+]]:_(s32) = G_AND [[COPY21]], [[C9]] - ; CI-MESA: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[AND21]], [[COPY20]](s32) - ; CI-MESA: [[TRUNC21:%[0-9]+]]:_(s16) = G_TRUNC [[SHL10]](s32) - ; CI-MESA: [[OR10:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[TRUNC21]] + ; CI-MESA: [[SHL14:%[0-9]+]]:_(s32) = G_SHL [[AND21]], 
[[COPY20]](s32) + ; CI-MESA: [[TRUNC21:%[0-9]+]]:_(s16) = G_TRUNC [[SHL14]](s32) + ; CI-MESA: [[OR14:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[TRUNC21]] ; CI-MESA: [[TRUNC22:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD22]](s32) ; CI-MESA: [[AND22:%[0-9]+]]:_(s16) = G_AND [[TRUNC22]], [[C7]] ; CI-MESA: [[COPY22:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-MESA: [[COPY23:%[0-9]+]]:_(s32) = COPY [[LOAD23]](s32) ; CI-MESA: [[AND23:%[0-9]+]]:_(s32) = G_AND [[COPY23]], [[C9]] - ; CI-MESA: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[AND23]], [[COPY22]](s32) - ; CI-MESA: [[TRUNC23:%[0-9]+]]:_(s16) = G_TRUNC [[SHL11]](s32) - ; CI-MESA: [[OR11:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[TRUNC23]] - ; CI-MESA: [[MV2:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR8]](s16), [[OR9]](s16), [[OR10]](s16), [[OR11]](s16) - ; CI-MESA: [[GEP23:%[0-9]+]]:_(p0) = G_GEP [[GEP15]], [[C10]](s64) + ; CI-MESA: [[SHL15:%[0-9]+]]:_(s32) = G_SHL [[AND23]], [[COPY22]](s32) + ; CI-MESA: [[TRUNC23:%[0-9]+]]:_(s16) = G_TRUNC [[SHL15]](s32) + ; CI-MESA: [[OR15:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[TRUNC23]] + ; CI-MESA: [[ZEXT8:%[0-9]+]]:_(s32) = G_ZEXT [[OR12]](s16) + ; CI-MESA: [[ZEXT9:%[0-9]+]]:_(s32) = G_ZEXT [[OR13]](s16) + ; CI-MESA: [[SHL16:%[0-9]+]]:_(s32) = G_SHL [[ZEXT9]], [[C10]](s32) + ; CI-MESA: [[OR16:%[0-9]+]]:_(s32) = G_OR [[ZEXT8]], [[SHL16]] + ; CI-MESA: [[ZEXT10:%[0-9]+]]:_(s32) = G_ZEXT [[OR14]](s16) + ; CI-MESA: [[ZEXT11:%[0-9]+]]:_(s32) = G_ZEXT [[OR15]](s16) + ; CI-MESA: [[SHL17:%[0-9]+]]:_(s32) = G_SHL [[ZEXT11]], [[C10]](s32) + ; CI-MESA: [[OR17:%[0-9]+]]:_(s32) = G_OR [[ZEXT10]], [[SHL17]] + ; CI-MESA: [[MV2:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR16]](s32), [[OR17]](s32) + ; CI-MESA: [[GEP23:%[0-9]+]]:_(p0) = G_GEP [[GEP15]], [[C11]](s64) ; CI-MESA: [[LOAD24:%[0-9]+]]:_(s32) = G_LOAD [[GEP23]](p0) :: (load 1) ; CI-MESA: [[GEP24:%[0-9]+]]:_(p0) = G_GEP [[GEP23]], [[C]](s64) ; CI-MESA: [[LOAD25:%[0-9]+]]:_(s32) = G_LOAD [[GEP24]](p0) :: (load 1) @@ -8938,34 +9917,42 @@ body: | ; CI-MESA: [[COPY24:%[0-9]+]]:_(s32) = COPY 
[[C8]](s32) ; CI-MESA: [[COPY25:%[0-9]+]]:_(s32) = COPY [[LOAD25]](s32) ; CI-MESA: [[AND25:%[0-9]+]]:_(s32) = G_AND [[COPY25]], [[C9]] - ; CI-MESA: [[SHL12:%[0-9]+]]:_(s32) = G_SHL [[AND25]], [[COPY24]](s32) - ; CI-MESA: [[TRUNC25:%[0-9]+]]:_(s16) = G_TRUNC [[SHL12]](s32) - ; CI-MESA: [[OR12:%[0-9]+]]:_(s16) = G_OR [[AND24]], [[TRUNC25]] + ; CI-MESA: [[SHL18:%[0-9]+]]:_(s32) = G_SHL [[AND25]], [[COPY24]](s32) + ; CI-MESA: [[TRUNC25:%[0-9]+]]:_(s16) = G_TRUNC [[SHL18]](s32) + ; CI-MESA: [[OR18:%[0-9]+]]:_(s16) = G_OR [[AND24]], [[TRUNC25]] ; CI-MESA: [[TRUNC26:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD26]](s32) ; CI-MESA: [[AND26:%[0-9]+]]:_(s16) = G_AND [[TRUNC26]], [[C7]] ; CI-MESA: [[COPY26:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-MESA: [[COPY27:%[0-9]+]]:_(s32) = COPY [[LOAD27]](s32) ; CI-MESA: [[AND27:%[0-9]+]]:_(s32) = G_AND [[COPY27]], [[C9]] - ; CI-MESA: [[SHL13:%[0-9]+]]:_(s32) = G_SHL [[AND27]], [[COPY26]](s32) - ; CI-MESA: [[TRUNC27:%[0-9]+]]:_(s16) = G_TRUNC [[SHL13]](s32) - ; CI-MESA: [[OR13:%[0-9]+]]:_(s16) = G_OR [[AND26]], [[TRUNC27]] + ; CI-MESA: [[SHL19:%[0-9]+]]:_(s32) = G_SHL [[AND27]], [[COPY26]](s32) + ; CI-MESA: [[TRUNC27:%[0-9]+]]:_(s16) = G_TRUNC [[SHL19]](s32) + ; CI-MESA: [[OR19:%[0-9]+]]:_(s16) = G_OR [[AND26]], [[TRUNC27]] ; CI-MESA: [[TRUNC28:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD28]](s32) ; CI-MESA: [[AND28:%[0-9]+]]:_(s16) = G_AND [[TRUNC28]], [[C7]] ; CI-MESA: [[COPY28:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-MESA: [[COPY29:%[0-9]+]]:_(s32) = COPY [[LOAD29]](s32) ; CI-MESA: [[AND29:%[0-9]+]]:_(s32) = G_AND [[COPY29]], [[C9]] - ; CI-MESA: [[SHL14:%[0-9]+]]:_(s32) = G_SHL [[AND29]], [[COPY28]](s32) - ; CI-MESA: [[TRUNC29:%[0-9]+]]:_(s16) = G_TRUNC [[SHL14]](s32) - ; CI-MESA: [[OR14:%[0-9]+]]:_(s16) = G_OR [[AND28]], [[TRUNC29]] + ; CI-MESA: [[SHL20:%[0-9]+]]:_(s32) = G_SHL [[AND29]], [[COPY28]](s32) + ; CI-MESA: [[TRUNC29:%[0-9]+]]:_(s16) = G_TRUNC [[SHL20]](s32) + ; CI-MESA: [[OR20:%[0-9]+]]:_(s16) = G_OR [[AND28]], [[TRUNC29]] ; CI-MESA: 
[[TRUNC30:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD30]](s32) ; CI-MESA: [[AND30:%[0-9]+]]:_(s16) = G_AND [[TRUNC30]], [[C7]] ; CI-MESA: [[COPY30:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-MESA: [[COPY31:%[0-9]+]]:_(s32) = COPY [[LOAD31]](s32) ; CI-MESA: [[AND31:%[0-9]+]]:_(s32) = G_AND [[COPY31]], [[C9]] - ; CI-MESA: [[SHL15:%[0-9]+]]:_(s32) = G_SHL [[AND31]], [[COPY30]](s32) - ; CI-MESA: [[TRUNC31:%[0-9]+]]:_(s16) = G_TRUNC [[SHL15]](s32) - ; CI-MESA: [[OR15:%[0-9]+]]:_(s16) = G_OR [[AND30]], [[TRUNC31]] - ; CI-MESA: [[MV3:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR12]](s16), [[OR13]](s16), [[OR14]](s16), [[OR15]](s16) + ; CI-MESA: [[SHL21:%[0-9]+]]:_(s32) = G_SHL [[AND31]], [[COPY30]](s32) + ; CI-MESA: [[TRUNC31:%[0-9]+]]:_(s16) = G_TRUNC [[SHL21]](s32) + ; CI-MESA: [[OR21:%[0-9]+]]:_(s16) = G_OR [[AND30]], [[TRUNC31]] + ; CI-MESA: [[ZEXT12:%[0-9]+]]:_(s32) = G_ZEXT [[OR18]](s16) + ; CI-MESA: [[ZEXT13:%[0-9]+]]:_(s32) = G_ZEXT [[OR19]](s16) + ; CI-MESA: [[SHL22:%[0-9]+]]:_(s32) = G_SHL [[ZEXT13]], [[C10]](s32) + ; CI-MESA: [[OR22:%[0-9]+]]:_(s32) = G_OR [[ZEXT12]], [[SHL22]] + ; CI-MESA: [[ZEXT14:%[0-9]+]]:_(s32) = G_ZEXT [[OR20]](s16) + ; CI-MESA: [[ZEXT15:%[0-9]+]]:_(s32) = G_ZEXT [[OR21]](s16) + ; CI-MESA: [[SHL23:%[0-9]+]]:_(s32) = G_SHL [[ZEXT15]], [[C10]](s32) + ; CI-MESA: [[OR23:%[0-9]+]]:_(s32) = G_OR [[ZEXT14]], [[SHL23]] + ; CI-MESA: [[MV3:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR22]](s32), [[OR23]](s32) ; CI-MESA: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s64>) = G_BUILD_VECTOR [[MV2]](s64), [[MV3]](s64) ; CI-MESA: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s64>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s64>), [[BUILD_VECTOR1]](<2 x s64>) ; CI-MESA: $vgpr0_vgpr1_vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7 = COPY [[CONCAT_VECTORS]](<4 x s64>) @@ -9019,9 +10006,18 @@ body: | ; GFX9-MESA: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C7]] ; GFX9-MESA: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C8]](s16) ; GFX9-MESA: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; GFX9-MESA: 
[[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) - ; GFX9-MESA: [[C9:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 - ; GFX9-MESA: [[GEP7:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C9]](s64) + ; GFX9-MESA: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9-MESA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9-MESA: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9-MESA: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C9]](s32) + ; GFX9-MESA: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; GFX9-MESA: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; GFX9-MESA: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; GFX9-MESA: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C9]](s32) + ; GFX9-MESA: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; GFX9-MESA: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) + ; GFX9-MESA: [[C10:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 + ; GFX9-MESA: [[GEP7:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C10]](s64) ; GFX9-MESA: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p0) :: (load 1) ; GFX9-MESA: [[GEP8:%[0-9]+]]:_(p0) = G_GEP [[GEP7]], [[C]](s64) ; GFX9-MESA: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p0) :: (load 1) @@ -9041,30 +10037,38 @@ body: | ; GFX9-MESA: [[AND8:%[0-9]+]]:_(s16) = G_AND [[TRUNC8]], [[C7]] ; GFX9-MESA: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD9]](s32) ; GFX9-MESA: [[AND9:%[0-9]+]]:_(s16) = G_AND [[TRUNC9]], [[C7]] - ; GFX9-MESA: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) - ; GFX9-MESA: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL4]] + ; GFX9-MESA: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) + ; GFX9-MESA: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL6]] ; GFX9-MESA: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; GFX9-MESA: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C7]] ; GFX9-MESA: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD11]](s32) ; GFX9-MESA: [[AND11:%[0-9]+]]:_(s16) = G_AND [[TRUNC11]], [[C7]] - ; GFX9-MESA: [[SHL5:%[0-9]+]]:_(s16) 
= G_SHL [[AND11]], [[C8]](s16) - ; GFX9-MESA: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL5]] + ; GFX9-MESA: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) + ; GFX9-MESA: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL7]] ; GFX9-MESA: [[TRUNC12:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD12]](s32) ; GFX9-MESA: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C7]] ; GFX9-MESA: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD13]](s32) ; GFX9-MESA: [[AND13:%[0-9]+]]:_(s16) = G_AND [[TRUNC13]], [[C7]] - ; GFX9-MESA: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C8]](s16) - ; GFX9-MESA: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL6]] + ; GFX9-MESA: [[SHL8:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C8]](s16) + ; GFX9-MESA: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL8]] ; GFX9-MESA: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; GFX9-MESA: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C7]] ; GFX9-MESA: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD15]](s32) ; GFX9-MESA: [[AND15:%[0-9]+]]:_(s16) = G_AND [[TRUNC15]], [[C7]] - ; GFX9-MESA: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C8]](s16) - ; GFX9-MESA: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL7]] - ; GFX9-MESA: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16), [[OR6]](s16), [[OR7]](s16) + ; GFX9-MESA: [[SHL9:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C8]](s16) + ; GFX9-MESA: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL9]] + ; GFX9-MESA: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; GFX9-MESA: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; GFX9-MESA: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C9]](s32) + ; GFX9-MESA: [[OR10:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL10]] + ; GFX9-MESA: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR8]](s16) + ; GFX9-MESA: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; GFX9-MESA: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C9]](s32) + ; GFX9-MESA: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; GFX9-MESA: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES 
[[OR10]](s32), [[OR11]](s32) ; GFX9-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s64>) = G_BUILD_VECTOR [[MV]](s64), [[MV1]](s64) - ; GFX9-MESA: [[C10:%[0-9]+]]:_(s64) = G_CONSTANT i64 16 - ; GFX9-MESA: [[GEP15:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C10]](s64) + ; GFX9-MESA: [[C11:%[0-9]+]]:_(s64) = G_CONSTANT i64 16 + ; GFX9-MESA: [[GEP15:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C11]](s64) ; GFX9-MESA: [[LOAD16:%[0-9]+]]:_(s32) = G_LOAD [[GEP15]](p0) :: (load 1) ; GFX9-MESA: [[GEP16:%[0-9]+]]:_(p0) = G_GEP [[GEP15]], [[C]](s64) ; GFX9-MESA: [[LOAD17:%[0-9]+]]:_(s32) = G_LOAD [[GEP16]](p0) :: (load 1) @@ -9084,28 +10088,36 @@ body: | ; GFX9-MESA: [[AND16:%[0-9]+]]:_(s16) = G_AND [[TRUNC16]], [[C7]] ; GFX9-MESA: [[TRUNC17:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD17]](s32) ; GFX9-MESA: [[AND17:%[0-9]+]]:_(s16) = G_AND [[TRUNC17]], [[C7]] - ; GFX9-MESA: [[SHL8:%[0-9]+]]:_(s16) = G_SHL [[AND17]], [[C8]](s16) - ; GFX9-MESA: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[SHL8]] + ; GFX9-MESA: [[SHL12:%[0-9]+]]:_(s16) = G_SHL [[AND17]], [[C8]](s16) + ; GFX9-MESA: [[OR12:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[SHL12]] ; GFX9-MESA: [[TRUNC18:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD18]](s32) ; GFX9-MESA: [[AND18:%[0-9]+]]:_(s16) = G_AND [[TRUNC18]], [[C7]] ; GFX9-MESA: [[TRUNC19:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD19]](s32) ; GFX9-MESA: [[AND19:%[0-9]+]]:_(s16) = G_AND [[TRUNC19]], [[C7]] - ; GFX9-MESA: [[SHL9:%[0-9]+]]:_(s16) = G_SHL [[AND19]], [[C8]](s16) - ; GFX9-MESA: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[SHL9]] + ; GFX9-MESA: [[SHL13:%[0-9]+]]:_(s16) = G_SHL [[AND19]], [[C8]](s16) + ; GFX9-MESA: [[OR13:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[SHL13]] ; GFX9-MESA: [[TRUNC20:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD20]](s32) ; GFX9-MESA: [[AND20:%[0-9]+]]:_(s16) = G_AND [[TRUNC20]], [[C7]] ; GFX9-MESA: [[TRUNC21:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD21]](s32) ; GFX9-MESA: [[AND21:%[0-9]+]]:_(s16) = G_AND [[TRUNC21]], [[C7]] - ; GFX9-MESA: [[SHL10:%[0-9]+]]:_(s16) = G_SHL [[AND21]], [[C8]](s16) - ; GFX9-MESA: 
[[OR10:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[SHL10]] + ; GFX9-MESA: [[SHL14:%[0-9]+]]:_(s16) = G_SHL [[AND21]], [[C8]](s16) + ; GFX9-MESA: [[OR14:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[SHL14]] ; GFX9-MESA: [[TRUNC22:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD22]](s32) ; GFX9-MESA: [[AND22:%[0-9]+]]:_(s16) = G_AND [[TRUNC22]], [[C7]] ; GFX9-MESA: [[TRUNC23:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD23]](s32) ; GFX9-MESA: [[AND23:%[0-9]+]]:_(s16) = G_AND [[TRUNC23]], [[C7]] - ; GFX9-MESA: [[SHL11:%[0-9]+]]:_(s16) = G_SHL [[AND23]], [[C8]](s16) - ; GFX9-MESA: [[OR11:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[SHL11]] - ; GFX9-MESA: [[MV2:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR8]](s16), [[OR9]](s16), [[OR10]](s16), [[OR11]](s16) - ; GFX9-MESA: [[GEP23:%[0-9]+]]:_(p0) = G_GEP [[GEP15]], [[C9]](s64) + ; GFX9-MESA: [[SHL15:%[0-9]+]]:_(s16) = G_SHL [[AND23]], [[C8]](s16) + ; GFX9-MESA: [[OR15:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[SHL15]] + ; GFX9-MESA: [[ZEXT8:%[0-9]+]]:_(s32) = G_ZEXT [[OR12]](s16) + ; GFX9-MESA: [[ZEXT9:%[0-9]+]]:_(s32) = G_ZEXT [[OR13]](s16) + ; GFX9-MESA: [[SHL16:%[0-9]+]]:_(s32) = G_SHL [[ZEXT9]], [[C9]](s32) + ; GFX9-MESA: [[OR16:%[0-9]+]]:_(s32) = G_OR [[ZEXT8]], [[SHL16]] + ; GFX9-MESA: [[ZEXT10:%[0-9]+]]:_(s32) = G_ZEXT [[OR14]](s16) + ; GFX9-MESA: [[ZEXT11:%[0-9]+]]:_(s32) = G_ZEXT [[OR15]](s16) + ; GFX9-MESA: [[SHL17:%[0-9]+]]:_(s32) = G_SHL [[ZEXT11]], [[C9]](s32) + ; GFX9-MESA: [[OR17:%[0-9]+]]:_(s32) = G_OR [[ZEXT10]], [[SHL17]] + ; GFX9-MESA: [[MV2:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR16]](s32), [[OR17]](s32) + ; GFX9-MESA: [[GEP23:%[0-9]+]]:_(p0) = G_GEP [[GEP15]], [[C10]](s64) ; GFX9-MESA: [[LOAD24:%[0-9]+]]:_(s32) = G_LOAD [[GEP23]](p0) :: (load 1) ; GFX9-MESA: [[GEP24:%[0-9]+]]:_(p0) = G_GEP [[GEP23]], [[C]](s64) ; GFX9-MESA: [[LOAD25:%[0-9]+]]:_(s32) = G_LOAD [[GEP24]](p0) :: (load 1) @@ -9125,27 +10137,35 @@ body: | ; GFX9-MESA: [[AND24:%[0-9]+]]:_(s16) = G_AND [[TRUNC24]], [[C7]] ; GFX9-MESA: [[TRUNC25:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD25]](s32) ; GFX9-MESA: 
[[AND25:%[0-9]+]]:_(s16) = G_AND [[TRUNC25]], [[C7]] - ; GFX9-MESA: [[SHL12:%[0-9]+]]:_(s16) = G_SHL [[AND25]], [[C8]](s16) - ; GFX9-MESA: [[OR12:%[0-9]+]]:_(s16) = G_OR [[AND24]], [[SHL12]] + ; GFX9-MESA: [[SHL18:%[0-9]+]]:_(s16) = G_SHL [[AND25]], [[C8]](s16) + ; GFX9-MESA: [[OR18:%[0-9]+]]:_(s16) = G_OR [[AND24]], [[SHL18]] ; GFX9-MESA: [[TRUNC26:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD26]](s32) ; GFX9-MESA: [[AND26:%[0-9]+]]:_(s16) = G_AND [[TRUNC26]], [[C7]] ; GFX9-MESA: [[TRUNC27:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD27]](s32) ; GFX9-MESA: [[AND27:%[0-9]+]]:_(s16) = G_AND [[TRUNC27]], [[C7]] - ; GFX9-MESA: [[SHL13:%[0-9]+]]:_(s16) = G_SHL [[AND27]], [[C8]](s16) - ; GFX9-MESA: [[OR13:%[0-9]+]]:_(s16) = G_OR [[AND26]], [[SHL13]] + ; GFX9-MESA: [[SHL19:%[0-9]+]]:_(s16) = G_SHL [[AND27]], [[C8]](s16) + ; GFX9-MESA: [[OR19:%[0-9]+]]:_(s16) = G_OR [[AND26]], [[SHL19]] ; GFX9-MESA: [[TRUNC28:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD28]](s32) ; GFX9-MESA: [[AND28:%[0-9]+]]:_(s16) = G_AND [[TRUNC28]], [[C7]] ; GFX9-MESA: [[TRUNC29:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD29]](s32) ; GFX9-MESA: [[AND29:%[0-9]+]]:_(s16) = G_AND [[TRUNC29]], [[C7]] - ; GFX9-MESA: [[SHL14:%[0-9]+]]:_(s16) = G_SHL [[AND29]], [[C8]](s16) - ; GFX9-MESA: [[OR14:%[0-9]+]]:_(s16) = G_OR [[AND28]], [[SHL14]] + ; GFX9-MESA: [[SHL20:%[0-9]+]]:_(s16) = G_SHL [[AND29]], [[C8]](s16) + ; GFX9-MESA: [[OR20:%[0-9]+]]:_(s16) = G_OR [[AND28]], [[SHL20]] ; GFX9-MESA: [[TRUNC30:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD30]](s32) ; GFX9-MESA: [[AND30:%[0-9]+]]:_(s16) = G_AND [[TRUNC30]], [[C7]] ; GFX9-MESA: [[TRUNC31:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD31]](s32) ; GFX9-MESA: [[AND31:%[0-9]+]]:_(s16) = G_AND [[TRUNC31]], [[C7]] - ; GFX9-MESA: [[SHL15:%[0-9]+]]:_(s16) = G_SHL [[AND31]], [[C8]](s16) - ; GFX9-MESA: [[OR15:%[0-9]+]]:_(s16) = G_OR [[AND30]], [[SHL15]] - ; GFX9-MESA: [[MV3:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR12]](s16), [[OR13]](s16), [[OR14]](s16), [[OR15]](s16) + ; GFX9-MESA: [[SHL21:%[0-9]+]]:_(s16) = G_SHL [[AND31]], [[C8]](s16) + ; 
GFX9-MESA: [[OR21:%[0-9]+]]:_(s16) = G_OR [[AND30]], [[SHL21]] + ; GFX9-MESA: [[ZEXT12:%[0-9]+]]:_(s32) = G_ZEXT [[OR18]](s16) + ; GFX9-MESA: [[ZEXT13:%[0-9]+]]:_(s32) = G_ZEXT [[OR19]](s16) + ; GFX9-MESA: [[SHL22:%[0-9]+]]:_(s32) = G_SHL [[ZEXT13]], [[C9]](s32) + ; GFX9-MESA: [[OR22:%[0-9]+]]:_(s32) = G_OR [[ZEXT12]], [[SHL22]] + ; GFX9-MESA: [[ZEXT14:%[0-9]+]]:_(s32) = G_ZEXT [[OR20]](s16) + ; GFX9-MESA: [[ZEXT15:%[0-9]+]]:_(s32) = G_ZEXT [[OR21]](s16) + ; GFX9-MESA: [[SHL23:%[0-9]+]]:_(s32) = G_SHL [[ZEXT15]], [[C9]](s32) + ; GFX9-MESA: [[OR23:%[0-9]+]]:_(s32) = G_OR [[ZEXT14]], [[SHL23]] + ; GFX9-MESA: [[MV3:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR22]](s32), [[OR23]](s32) ; GFX9-MESA: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s64>) = G_BUILD_VECTOR [[MV2]](s64), [[MV3]](s64) ; GFX9-MESA: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s64>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s64>), [[BUILD_VECTOR1]](<2 x s64>) ; GFX9-MESA: $vgpr0_vgpr1_vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7 = COPY [[CONCAT_VECTORS]](<4 x s64>) @@ -9362,9 +10382,18 @@ body: | ; CI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C8]](s32) ; CI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) ; CI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; CI: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) - ; CI: [[C10:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 - ; CI: [[GEP7:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C10]](s64) + ; CI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C10]](s32) + ; CI: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; CI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; CI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C10]](s32) + ; CI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; CI: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR4]](s32), 
[[OR5]](s32) + ; CI: [[C11:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 + ; CI: [[GEP7:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C11]](s64) ; CI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p0) :: (load 1) ; CI: [[GEP8:%[0-9]+]]:_(p0) = G_GEP [[GEP7]], [[C]](s64) ; CI: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p0) :: (load 1) @@ -9385,34 +10414,42 @@ body: | ; CI: [[COPY8:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI: [[COPY9:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32) ; CI: [[AND9:%[0-9]+]]:_(s32) = G_AND [[COPY9]], [[C9]] - ; CI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY8]](s32) - ; CI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) - ; CI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] + ; CI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY8]](s32) + ; CI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) + ; CI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] ; CI: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; CI: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C7]] ; CI: [[COPY10:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI: [[COPY11:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32) ; CI: [[AND11:%[0-9]+]]:_(s32) = G_AND [[COPY11]], [[C9]] - ; CI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY10]](s32) - ; CI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL5]](s32) - ; CI: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] + ; CI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY10]](s32) + ; CI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) + ; CI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] ; CI: [[TRUNC12:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD12]](s32) ; CI: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C7]] ; CI: [[COPY12:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI: [[COPY13:%[0-9]+]]:_(s32) = COPY [[LOAD13]](s32) ; CI: [[AND13:%[0-9]+]]:_(s32) = G_AND [[COPY13]], [[C9]] - ; CI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY12]](s32) - ; CI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) - ; CI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] + ; 
CI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY12]](s32) + ; CI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL8]](s32) + ; CI: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] ; CI: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; CI: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C7]] ; CI: [[COPY14:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI: [[COPY15:%[0-9]+]]:_(s32) = COPY [[LOAD15]](s32) ; CI: [[AND15:%[0-9]+]]:_(s32) = G_AND [[COPY15]], [[C9]] - ; CI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY14]](s32) - ; CI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) - ; CI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] - ; CI: [[MV1:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16), [[OR6]](s16), [[OR7]](s16) + ; CI: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY14]](s32) + ; CI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL9]](s32) + ; CI: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] + ; CI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; CI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; CI: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C10]](s32) + ; CI: [[OR10:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL10]] + ; CI: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR8]](s16) + ; CI: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; CI: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C10]](s32) + ; CI: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; CI: [[MV1:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR10]](s32), [[OR11]](s32) ; CI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x p1>) = G_BUILD_VECTOR [[MV]](p1), [[MV1]](p1) ; CI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BUILD_VECTOR]](<2 x p1>) ; VI-LABEL: name: test_load_flat_v2p1_align1 @@ -9465,9 +10502,18 @@ body: | ; VI: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C7]] ; VI: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C8]](s16) ; VI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; VI: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) - ; VI: 
[[C9:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 - ; VI: [[GEP7:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C9]](s64) + ; VI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; VI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; VI: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C9]](s32) + ; VI: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; VI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; VI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; VI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C9]](s32) + ; VI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; VI: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) + ; VI: [[C10:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 + ; VI: [[GEP7:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C10]](s64) ; VI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p0) :: (load 1) ; VI: [[GEP8:%[0-9]+]]:_(p0) = G_GEP [[GEP7]], [[C]](s64) ; VI: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p0) :: (load 1) @@ -9487,27 +10533,35 @@ body: | ; VI: [[AND8:%[0-9]+]]:_(s16) = G_AND [[TRUNC8]], [[C7]] ; VI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD9]](s32) ; VI: [[AND9:%[0-9]+]]:_(s16) = G_AND [[TRUNC9]], [[C7]] - ; VI: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) - ; VI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL4]] + ; VI: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) + ; VI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL6]] ; VI: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; VI: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C7]] ; VI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD11]](s32) ; VI: [[AND11:%[0-9]+]]:_(s16) = G_AND [[TRUNC11]], [[C7]] - ; VI: [[SHL5:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) - ; VI: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL5]] + ; VI: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) + ; VI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL7]] ; VI: [[TRUNC12:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD12]](s32) ; VI: [[AND12:%[0-9]+]]:_(s16) = 
G_AND [[TRUNC12]], [[C7]] ; VI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD13]](s32) ; VI: [[AND13:%[0-9]+]]:_(s16) = G_AND [[TRUNC13]], [[C7]] - ; VI: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C8]](s16) - ; VI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL6]] + ; VI: [[SHL8:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C8]](s16) + ; VI: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL8]] ; VI: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; VI: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C7]] ; VI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD15]](s32) ; VI: [[AND15:%[0-9]+]]:_(s16) = G_AND [[TRUNC15]], [[C7]] - ; VI: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C8]](s16) - ; VI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL7]] - ; VI: [[MV1:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16), [[OR6]](s16), [[OR7]](s16) + ; VI: [[SHL9:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C8]](s16) + ; VI: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL9]] + ; VI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; VI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; VI: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C9]](s32) + ; VI: [[OR10:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL10]] + ; VI: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR8]](s16) + ; VI: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; VI: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C9]](s32) + ; VI: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; VI: [[MV1:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR10]](s32), [[OR11]](s32) ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x p1>) = G_BUILD_VECTOR [[MV]](p1), [[MV1]](p1) ; VI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BUILD_VECTOR]](<2 x p1>) ; GFX9-LABEL: name: test_load_flat_v2p1_align1 @@ -9560,9 +10614,18 @@ body: | ; GFX9: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C7]] ; GFX9: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C8]](s16) ; GFX9: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; GFX9: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), 
[[OR2]](s16), [[OR3]](s16) - ; GFX9: [[C9:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 - ; GFX9: [[GEP7:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C9]](s64) + ; GFX9: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C9]](s32) + ; GFX9: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; GFX9: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; GFX9: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; GFX9: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C9]](s32) + ; GFX9: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; GFX9: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) + ; GFX9: [[C10:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 + ; GFX9: [[GEP7:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C10]](s64) ; GFX9: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p0) :: (load 1) ; GFX9: [[GEP8:%[0-9]+]]:_(p0) = G_GEP [[GEP7]], [[C]](s64) ; GFX9: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p0) :: (load 1) @@ -9582,27 +10645,35 @@ body: | ; GFX9: [[AND8:%[0-9]+]]:_(s16) = G_AND [[TRUNC8]], [[C7]] ; GFX9: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD9]](s32) ; GFX9: [[AND9:%[0-9]+]]:_(s16) = G_AND [[TRUNC9]], [[C7]] - ; GFX9: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) - ; GFX9: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL4]] + ; GFX9: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) + ; GFX9: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL6]] ; GFX9: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; GFX9: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C7]] ; GFX9: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD11]](s32) ; GFX9: [[AND11:%[0-9]+]]:_(s16) = G_AND [[TRUNC11]], [[C7]] - ; GFX9: [[SHL5:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) - ; GFX9: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL5]] + ; GFX9: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) + ; GFX9: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], 
[[SHL7]] ; GFX9: [[TRUNC12:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD12]](s32) ; GFX9: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C7]] ; GFX9: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD13]](s32) ; GFX9: [[AND13:%[0-9]+]]:_(s16) = G_AND [[TRUNC13]], [[C7]] - ; GFX9: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C8]](s16) - ; GFX9: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL6]] + ; GFX9: [[SHL8:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C8]](s16) + ; GFX9: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL8]] ; GFX9: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; GFX9: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C7]] ; GFX9: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD15]](s32) ; GFX9: [[AND15:%[0-9]+]]:_(s16) = G_AND [[TRUNC15]], [[C7]] - ; GFX9: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C8]](s16) - ; GFX9: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL7]] - ; GFX9: [[MV1:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16), [[OR6]](s16), [[OR7]](s16) + ; GFX9: [[SHL9:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C8]](s16) + ; GFX9: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL9]] + ; GFX9: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; GFX9: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; GFX9: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C9]](s32) + ; GFX9: [[OR10:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL10]] + ; GFX9: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR8]](s16) + ; GFX9: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; GFX9: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C9]](s32) + ; GFX9: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; GFX9: [[MV1:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR10]](s32), [[OR11]](s32) ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x p1>) = G_BUILD_VECTOR [[MV]](p1), [[MV1]](p1) ; GFX9: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BUILD_VECTOR]](<2 x p1>) ; CI-MESA-LABEL: name: test_load_flat_v2p1_align1 @@ -9663,9 +10734,18 @@ body: | ; CI-MESA: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C8]](s32) ; CI-MESA: [[TRUNC7:%[0-9]+]]:_(s16) 
= G_TRUNC [[SHL3]](s32) ; CI-MESA: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; CI-MESA: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) - ; CI-MESA: [[C10:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 - ; CI-MESA: [[GEP7:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C10]](s64) + ; CI-MESA: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI-MESA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI-MESA: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-MESA: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C10]](s32) + ; CI-MESA: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; CI-MESA: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; CI-MESA: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI-MESA: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C10]](s32) + ; CI-MESA: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; CI-MESA: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) + ; CI-MESA: [[C11:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 + ; CI-MESA: [[GEP7:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C11]](s64) ; CI-MESA: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p0) :: (load 1) ; CI-MESA: [[GEP8:%[0-9]+]]:_(p0) = G_GEP [[GEP7]], [[C]](s64) ; CI-MESA: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p0) :: (load 1) @@ -9686,34 +10766,42 @@ body: | ; CI-MESA: [[COPY8:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-MESA: [[COPY9:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32) ; CI-MESA: [[AND9:%[0-9]+]]:_(s32) = G_AND [[COPY9]], [[C9]] - ; CI-MESA: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY8]](s32) - ; CI-MESA: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) - ; CI-MESA: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] + ; CI-MESA: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY8]](s32) + ; CI-MESA: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) + ; CI-MESA: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] ; CI-MESA: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; CI-MESA: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], 
[[C7]] ; CI-MESA: [[COPY10:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-MESA: [[COPY11:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32) ; CI-MESA: [[AND11:%[0-9]+]]:_(s32) = G_AND [[COPY11]], [[C9]] - ; CI-MESA: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY10]](s32) - ; CI-MESA: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL5]](s32) - ; CI-MESA: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] + ; CI-MESA: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY10]](s32) + ; CI-MESA: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) + ; CI-MESA: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] ; CI-MESA: [[TRUNC12:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD12]](s32) ; CI-MESA: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C7]] ; CI-MESA: [[COPY12:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-MESA: [[COPY13:%[0-9]+]]:_(s32) = COPY [[LOAD13]](s32) ; CI-MESA: [[AND13:%[0-9]+]]:_(s32) = G_AND [[COPY13]], [[C9]] - ; CI-MESA: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY12]](s32) - ; CI-MESA: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) - ; CI-MESA: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] + ; CI-MESA: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY12]](s32) + ; CI-MESA: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL8]](s32) + ; CI-MESA: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] ; CI-MESA: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; CI-MESA: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C7]] ; CI-MESA: [[COPY14:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-MESA: [[COPY15:%[0-9]+]]:_(s32) = COPY [[LOAD15]](s32) ; CI-MESA: [[AND15:%[0-9]+]]:_(s32) = G_AND [[COPY15]], [[C9]] - ; CI-MESA: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY14]](s32) - ; CI-MESA: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) - ; CI-MESA: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] - ; CI-MESA: [[MV1:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16), [[OR6]](s16), [[OR7]](s16) + ; CI-MESA: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY14]](s32) + ; 
CI-MESA: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL9]](s32) + ; CI-MESA: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] + ; CI-MESA: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; CI-MESA: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; CI-MESA: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C10]](s32) + ; CI-MESA: [[OR10:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL10]] + ; CI-MESA: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR8]](s16) + ; CI-MESA: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; CI-MESA: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C10]](s32) + ; CI-MESA: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; CI-MESA: [[MV1:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR10]](s32), [[OR11]](s32) ; CI-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x p1>) = G_BUILD_VECTOR [[MV]](p1), [[MV1]](p1) ; CI-MESA: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BUILD_VECTOR]](<2 x p1>) ; GFX9-MESA-LABEL: name: test_load_flat_v2p1_align1 @@ -9766,9 +10854,18 @@ body: | ; GFX9-MESA: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C7]] ; GFX9-MESA: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C8]](s16) ; GFX9-MESA: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; GFX9-MESA: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) - ; GFX9-MESA: [[C9:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 - ; GFX9-MESA: [[GEP7:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C9]](s64) + ; GFX9-MESA: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9-MESA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9-MESA: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9-MESA: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C9]](s32) + ; GFX9-MESA: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; GFX9-MESA: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; GFX9-MESA: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; GFX9-MESA: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C9]](s32) + ; GFX9-MESA: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; GFX9-MESA: [[MV:%[0-9]+]]:_(p1) = 
G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) + ; GFX9-MESA: [[C10:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 + ; GFX9-MESA: [[GEP7:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C10]](s64) ; GFX9-MESA: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p0) :: (load 1) ; GFX9-MESA: [[GEP8:%[0-9]+]]:_(p0) = G_GEP [[GEP7]], [[C]](s64) ; GFX9-MESA: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p0) :: (load 1) @@ -9788,27 +10885,35 @@ body: | ; GFX9-MESA: [[AND8:%[0-9]+]]:_(s16) = G_AND [[TRUNC8]], [[C7]] ; GFX9-MESA: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD9]](s32) ; GFX9-MESA: [[AND9:%[0-9]+]]:_(s16) = G_AND [[TRUNC9]], [[C7]] - ; GFX9-MESA: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) - ; GFX9-MESA: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL4]] + ; GFX9-MESA: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) + ; GFX9-MESA: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL6]] ; GFX9-MESA: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; GFX9-MESA: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C7]] ; GFX9-MESA: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD11]](s32) ; GFX9-MESA: [[AND11:%[0-9]+]]:_(s16) = G_AND [[TRUNC11]], [[C7]] - ; GFX9-MESA: [[SHL5:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) - ; GFX9-MESA: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL5]] + ; GFX9-MESA: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) + ; GFX9-MESA: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL7]] ; GFX9-MESA: [[TRUNC12:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD12]](s32) ; GFX9-MESA: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C7]] ; GFX9-MESA: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD13]](s32) ; GFX9-MESA: [[AND13:%[0-9]+]]:_(s16) = G_AND [[TRUNC13]], [[C7]] - ; GFX9-MESA: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C8]](s16) - ; GFX9-MESA: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL6]] + ; GFX9-MESA: [[SHL8:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C8]](s16) + ; GFX9-MESA: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL8]] ; GFX9-MESA: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC 
[[LOAD14]](s32) ; GFX9-MESA: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C7]] ; GFX9-MESA: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD15]](s32) ; GFX9-MESA: [[AND15:%[0-9]+]]:_(s16) = G_AND [[TRUNC15]], [[C7]] - ; GFX9-MESA: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C8]](s16) - ; GFX9-MESA: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL7]] - ; GFX9-MESA: [[MV1:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16), [[OR6]](s16), [[OR7]](s16) + ; GFX9-MESA: [[SHL9:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C8]](s16) + ; GFX9-MESA: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL9]] + ; GFX9-MESA: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; GFX9-MESA: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; GFX9-MESA: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C9]](s32) + ; GFX9-MESA: [[OR10:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL10]] + ; GFX9-MESA: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR8]](s16) + ; GFX9-MESA: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; GFX9-MESA: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C9]](s32) + ; GFX9-MESA: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; GFX9-MESA: [[MV1:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR10]](s32), [[OR11]](s32) ; GFX9-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x p1>) = G_BUILD_VECTOR [[MV]](p1), [[MV1]](p1) ; GFX9-MESA: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BUILD_VECTOR]](<2 x p1>) %0:_(p0) = COPY $vgpr0_vgpr1 @@ -9914,9 +11019,14 @@ body: | ; CI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) ; CI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[SHL1]](s32) ; CI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[TRUNC3]] - ; CI: [[MV:%[0-9]+]]:_(p3) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; CI: [[C6:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 - ; CI: [[GEP3:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C6]](s64) + ; CI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C6]](s32) 
+ ; CI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; CI: [[INTTOPTR:%[0-9]+]]:_(p3) = G_INTTOPTR [[OR2]](s32) + ; CI: [[C7:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 + ; CI: [[GEP3:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C7]](s64) ; CI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p0) :: (load 1) ; CI: [[GEP4:%[0-9]+]]:_(p0) = G_GEP [[GEP3]], [[C]](s64) ; CI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p0) :: (load 1) @@ -9929,19 +11039,23 @@ body: | ; CI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) ; CI: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C5]] - ; CI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY4]](s32) - ; CI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL2]](s32) - ; CI: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] + ; CI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY4]](s32) + ; CI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) + ; CI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] ; CI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; CI: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; CI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) ; CI: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY7]], [[C5]] - ; CI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY6]](s32) - ; CI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) - ; CI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; CI: [[MV1:%[0-9]+]]:_(p3) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) - ; CI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x p3>) = G_BUILD_VECTOR [[MV]](p3), [[MV1]](p3) + ; CI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY6]](s32) + ; CI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) + ; CI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] + ; CI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; CI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C6]](s32) + ; CI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] 
+ ; CI: [[INTTOPTR1:%[0-9]+]]:_(p3) = G_INTTOPTR [[OR5]](s32) + ; CI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x p3>) = G_BUILD_VECTOR [[INTTOPTR]](p3), [[INTTOPTR1]](p3) ; CI: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<2 x p3>) ; VI-LABEL: name: test_load_flat_v2p3_align1 ; VI: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1 @@ -9969,9 +11083,14 @@ body: | ; VI: [[AND3:%[0-9]+]]:_(s16) = G_AND [[TRUNC3]], [[C3]] ; VI: [[SHL1:%[0-9]+]]:_(s16) = G_SHL [[AND3]], [[C4]](s16) ; VI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[SHL1]] - ; VI: [[MV:%[0-9]+]]:_(p3) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; VI: [[C5:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 - ; VI: [[GEP3:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C5]](s64) + ; VI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; VI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; VI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C5]](s32) + ; VI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; VI: [[INTTOPTR:%[0-9]+]]:_(p3) = G_INTTOPTR [[OR2]](s32) + ; VI: [[C6:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 + ; VI: [[GEP3:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C6]](s64) ; VI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p0) :: (load 1) ; VI: [[GEP4:%[0-9]+]]:_(p0) = G_GEP [[GEP3]], [[C]](s64) ; VI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p0) :: (load 1) @@ -9983,16 +11102,20 @@ body: | ; VI: [[AND4:%[0-9]+]]:_(s16) = G_AND [[TRUNC4]], [[C3]] ; VI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) ; VI: [[AND5:%[0-9]+]]:_(s16) = G_AND [[TRUNC5]], [[C3]] - ; VI: [[SHL2:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) - ; VI: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL2]] + ; VI: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) + ; VI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL3]] ; VI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; VI: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; VI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) ; VI: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C3]] 
- ; VI: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) - ; VI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; VI: [[MV1:%[0-9]+]]:_(p3) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) - ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x p3>) = G_BUILD_VECTOR [[MV]](p3), [[MV1]](p3) + ; VI: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) + ; VI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL4]] + ; VI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; VI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; VI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C5]](s32) + ; VI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; VI: [[INTTOPTR1:%[0-9]+]]:_(p3) = G_INTTOPTR [[OR5]](s32) + ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x p3>) = G_BUILD_VECTOR [[INTTOPTR]](p3), [[INTTOPTR1]](p3) ; VI: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<2 x p3>) ; GFX9-LABEL: name: test_load_flat_v2p3_align1 ; GFX9: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1 @@ -10020,9 +11143,14 @@ body: | ; GFX9: [[AND3:%[0-9]+]]:_(s16) = G_AND [[TRUNC3]], [[C3]] ; GFX9: [[SHL1:%[0-9]+]]:_(s16) = G_SHL [[AND3]], [[C4]](s16) ; GFX9: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[SHL1]] - ; GFX9: [[MV:%[0-9]+]]:_(p3) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; GFX9: [[C5:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 - ; GFX9: [[GEP3:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C5]](s64) + ; GFX9: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C5]](s32) + ; GFX9: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; GFX9: [[INTTOPTR:%[0-9]+]]:_(p3) = G_INTTOPTR [[OR2]](s32) + ; GFX9: [[C6:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 + ; GFX9: [[GEP3:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C6]](s64) ; GFX9: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p0) :: (load 1) ; GFX9: [[GEP4:%[0-9]+]]:_(p0) = G_GEP [[GEP3]], [[C]](s64) ; GFX9: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p0) :: (load 1) 
@@ -10034,16 +11162,20 @@ body: | ; GFX9: [[AND4:%[0-9]+]]:_(s16) = G_AND [[TRUNC4]], [[C3]] ; GFX9: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) ; GFX9: [[AND5:%[0-9]+]]:_(s16) = G_AND [[TRUNC5]], [[C3]] - ; GFX9: [[SHL2:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) - ; GFX9: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL2]] + ; GFX9: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) + ; GFX9: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL3]] ; GFX9: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; GFX9: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; GFX9: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) ; GFX9: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C3]] - ; GFX9: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) - ; GFX9: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; GFX9: [[MV1:%[0-9]+]]:_(p3) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) - ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x p3>) = G_BUILD_VECTOR [[MV]](p3), [[MV1]](p3) + ; GFX9: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) + ; GFX9: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL4]] + ; GFX9: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; GFX9: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; GFX9: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C5]](s32) + ; GFX9: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; GFX9: [[INTTOPTR1:%[0-9]+]]:_(p3) = G_INTTOPTR [[OR5]](s32) + ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x p3>) = G_BUILD_VECTOR [[INTTOPTR]](p3), [[INTTOPTR1]](p3) ; GFX9: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<2 x p3>) ; CI-MESA-LABEL: name: test_load_flat_v2p3_align1 ; CI-MESA: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1 @@ -10075,9 +11207,14 @@ body: | ; CI-MESA: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) ; CI-MESA: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[SHL1]](s32) ; CI-MESA: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[TRUNC3]] - ; CI-MESA: [[MV:%[0-9]+]]:_(p3) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; CI-MESA: [[C6:%[0-9]+]]:_(s64) = 
G_CONSTANT i64 4 - ; CI-MESA: [[GEP3:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C6]](s64) + ; CI-MESA: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI-MESA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI-MESA: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-MESA: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C6]](s32) + ; CI-MESA: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; CI-MESA: [[INTTOPTR:%[0-9]+]]:_(p3) = G_INTTOPTR [[OR2]](s32) + ; CI-MESA: [[C7:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 + ; CI-MESA: [[GEP3:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C7]](s64) ; CI-MESA: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p0) :: (load 1) ; CI-MESA: [[GEP4:%[0-9]+]]:_(p0) = G_GEP [[GEP3]], [[C]](s64) ; CI-MESA: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p0) :: (load 1) @@ -10090,19 +11227,23 @@ body: | ; CI-MESA: [[COPY4:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI-MESA: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) ; CI-MESA: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C5]] - ; CI-MESA: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY4]](s32) - ; CI-MESA: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL2]](s32) - ; CI-MESA: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] + ; CI-MESA: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY4]](s32) + ; CI-MESA: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) + ; CI-MESA: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] ; CI-MESA: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; CI-MESA: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; CI-MESA: [[COPY6:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI-MESA: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) ; CI-MESA: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY7]], [[C5]] - ; CI-MESA: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY6]](s32) - ; CI-MESA: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) - ; CI-MESA: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; CI-MESA: [[MV1:%[0-9]+]]:_(p3) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) - ; CI-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x 
p3>) = G_BUILD_VECTOR [[MV]](p3), [[MV1]](p3) + ; CI-MESA: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY6]](s32) + ; CI-MESA: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) + ; CI-MESA: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] + ; CI-MESA: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI-MESA: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; CI-MESA: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C6]](s32) + ; CI-MESA: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; CI-MESA: [[INTTOPTR1:%[0-9]+]]:_(p3) = G_INTTOPTR [[OR5]](s32) + ; CI-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x p3>) = G_BUILD_VECTOR [[INTTOPTR]](p3), [[INTTOPTR1]](p3) ; CI-MESA: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<2 x p3>) ; GFX9-MESA-LABEL: name: test_load_flat_v2p3_align1 ; GFX9-MESA: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1 @@ -10130,9 +11271,14 @@ body: | ; GFX9-MESA: [[AND3:%[0-9]+]]:_(s16) = G_AND [[TRUNC3]], [[C3]] ; GFX9-MESA: [[SHL1:%[0-9]+]]:_(s16) = G_SHL [[AND3]], [[C4]](s16) ; GFX9-MESA: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[SHL1]] - ; GFX9-MESA: [[MV:%[0-9]+]]:_(p3) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; GFX9-MESA: [[C5:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 - ; GFX9-MESA: [[GEP3:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C5]](s64) + ; GFX9-MESA: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9-MESA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9-MESA: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9-MESA: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C5]](s32) + ; GFX9-MESA: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; GFX9-MESA: [[INTTOPTR:%[0-9]+]]:_(p3) = G_INTTOPTR [[OR2]](s32) + ; GFX9-MESA: [[C6:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 + ; GFX9-MESA: [[GEP3:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C6]](s64) ; GFX9-MESA: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p0) :: (load 1) ; GFX9-MESA: [[GEP4:%[0-9]+]]:_(p0) = G_GEP [[GEP3]], [[C]](s64) ; GFX9-MESA: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p0) :: (load 1) @@ -10144,16 +11290,20 @@ 
body: | ; GFX9-MESA: [[AND4:%[0-9]+]]:_(s16) = G_AND [[TRUNC4]], [[C3]] ; GFX9-MESA: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) ; GFX9-MESA: [[AND5:%[0-9]+]]:_(s16) = G_AND [[TRUNC5]], [[C3]] - ; GFX9-MESA: [[SHL2:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) - ; GFX9-MESA: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL2]] + ; GFX9-MESA: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) + ; GFX9-MESA: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL3]] ; GFX9-MESA: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; GFX9-MESA: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; GFX9-MESA: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) ; GFX9-MESA: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C3]] - ; GFX9-MESA: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) - ; GFX9-MESA: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; GFX9-MESA: [[MV1:%[0-9]+]]:_(p3) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) - ; GFX9-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x p3>) = G_BUILD_VECTOR [[MV]](p3), [[MV1]](p3) + ; GFX9-MESA: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) + ; GFX9-MESA: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL4]] + ; GFX9-MESA: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; GFX9-MESA: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; GFX9-MESA: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C5]](s32) + ; GFX9-MESA: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; GFX9-MESA: [[INTTOPTR1:%[0-9]+]]:_(p3) = G_INTTOPTR [[OR5]](s32) + ; GFX9-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x p3>) = G_BUILD_VECTOR [[INTTOPTR]](p3), [[INTTOPTR1]](p3) ; GFX9-MESA: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<2 x p3>) %0:_(p0) = COPY $vgpr0_vgpr1 %1:_(<2 x p3>) = G_LOAD %0 :: (load 8, align 1, addrspace 0) Modified: llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-load-global.mir URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-load-global.mir?rev=373942&r1=373941&r2=373942&view=diff 
============================================================================== --- llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-load-global.mir (original) +++ llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-load-global.mir Mon Oct 7 12:05:58 2019 @@ -446,13 +446,18 @@ body: | ; SI-LABEL: name: test_load_global_s32_align2 ; SI: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 ; SI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 2, addrspace 1) - ; SI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; SI: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; SI: [[GEP:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C]](s64) ; SI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p1) :: (load 2, addrspace 1) - ; SI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; SI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; SI: $vgpr0 = COPY [[MV]](s32) + ; SI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; SI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; SI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; SI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; SI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; SI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; SI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; SI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; SI: $vgpr0 = COPY [[OR]](s32) ; CI-HSA-LABEL: name: test_load_global_s32_align2 ; CI-HSA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 ; CI-HSA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 4, align 2, addrspace 1) @@ -460,23 +465,33 @@ body: | ; CI-MESA-LABEL: name: test_load_global_s32_align2 ; CI-MESA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 ; CI-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 2, addrspace 1) - ; CI-MESA: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; CI-MESA: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; CI-MESA: [[GEP:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C]](s64) ; CI-MESA: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p1) :: (load 2, 
addrspace 1) - ; CI-MESA: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; CI-MESA: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; CI-MESA: $vgpr0 = COPY [[MV]](s32) + ; CI-MESA: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; CI-MESA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; CI-MESA: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; CI-MESA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; CI-MESA: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; CI-MESA: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-MESA: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; CI-MESA: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; CI-MESA: $vgpr0 = COPY [[OR]](s32) ; VI-LABEL: name: test_load_global_s32_align2 ; VI: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 2, addrspace 1) - ; VI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; VI: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; VI: [[GEP:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C]](s64) ; VI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p1) :: (load 2, addrspace 1) - ; VI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; VI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; VI: $vgpr0 = COPY [[MV]](s32) + ; VI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; VI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; VI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; VI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; VI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; VI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; VI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; VI: $vgpr0 = COPY [[OR]](s32) ; GFX9-HSA-LABEL: name: test_load_global_s32_align2 ; GFX9-HSA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 ; GFX9-HSA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 4, align 2, addrspace 1) @@ -484,13 +499,18 @@ 
body: | ; GFX9-MESA-LABEL: name: test_load_global_s32_align2 ; GFX9-MESA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 ; GFX9-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 2, addrspace 1) - ; GFX9-MESA: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; GFX9-MESA: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; GFX9-MESA: [[GEP:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C]](s64) ; GFX9-MESA: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p1) :: (load 2, addrspace 1) - ; GFX9-MESA: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; GFX9-MESA: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; GFX9-MESA: $vgpr0 = COPY [[MV]](s32) + ; GFX9-MESA: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; GFX9-MESA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; GFX9-MESA: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; GFX9-MESA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; GFX9-MESA: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; GFX9-MESA: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9-MESA: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; GFX9-MESA: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; GFX9-MESA: $vgpr0 = COPY [[OR]](s32) %0:_(p1) = COPY $vgpr0_vgpr1 %1:_(s32) = G_LOAD %0 :: (load 4, align 2, addrspace 1) $vgpr0 = COPY %1 @@ -532,8 +552,12 @@ body: | ; SI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) ; SI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[SHL1]](s32) ; SI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[TRUNC3]] - ; SI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; SI: $vgpr0 = COPY [[MV]](s32) + ; SI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; SI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; SI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; SI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C6]](s32) + ; SI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; SI: $vgpr0 = COPY [[OR2]](s32) ; CI-HSA-LABEL: name: test_load_global_s32_align1 ; CI-HSA: 
[[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 ; CI-HSA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 4, align 1, addrspace 1) @@ -568,8 +592,12 @@ body: | ; CI-MESA: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) ; CI-MESA: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[SHL1]](s32) ; CI-MESA: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[TRUNC3]] - ; CI-MESA: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; CI-MESA: $vgpr0 = COPY [[MV]](s32) + ; CI-MESA: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI-MESA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI-MESA: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-MESA: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C6]](s32) + ; CI-MESA: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; CI-MESA: $vgpr0 = COPY [[OR2]](s32) ; VI-LABEL: name: test_load_global_s32_align1 ; VI: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 1, addrspace 1) @@ -596,8 +624,12 @@ body: | ; VI: [[AND3:%[0-9]+]]:_(s16) = G_AND [[TRUNC3]], [[C3]] ; VI: [[SHL1:%[0-9]+]]:_(s16) = G_SHL [[AND3]], [[C4]](s16) ; VI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[SHL1]] - ; VI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; VI: $vgpr0 = COPY [[MV]](s32) + ; VI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; VI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; VI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C5]](s32) + ; VI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; VI: $vgpr0 = COPY [[OR2]](s32) ; GFX9-HSA-LABEL: name: test_load_global_s32_align1 ; GFX9-HSA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 ; GFX9-HSA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 4, align 1, addrspace 1) @@ -628,8 +660,12 @@ body: | ; GFX9-MESA: [[AND3:%[0-9]+]]:_(s16) = G_AND [[TRUNC3]], [[C3]] ; GFX9-MESA: [[SHL1:%[0-9]+]]:_(s16) = G_SHL [[AND3]], [[C4]](s16) ; GFX9-MESA: [[OR1:%[0-9]+]]:_(s16) = 
G_OR [[AND2]], [[SHL1]] - ; GFX9-MESA: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; GFX9-MESA: $vgpr0 = COPY [[MV]](s32) + ; GFX9-MESA: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9-MESA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9-MESA: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9-MESA: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C5]](s32) + ; GFX9-MESA: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; GFX9-MESA: $vgpr0 = COPY [[OR2]](s32) %0:_(p1) = COPY $vgpr0_vgpr1 %1:_(s32) = G_LOAD %0 :: (load 4, align 1, addrspace 1) $vgpr0 = COPY %1 @@ -775,20 +811,30 @@ body: | ; SI-LABEL: name: test_load_global_s64_align2 ; SI: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 ; SI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 2, addrspace 1) - ; SI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; SI: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; SI: [[GEP:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C]](s64) ; SI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p1) :: (load 2, addrspace 1) - ; SI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; SI: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 ; SI: [[GEP1:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C1]](s64) ; SI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p1) :: (load 2, addrspace 1) - ; SI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; SI: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 6 ; SI: [[GEP2:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C2]](s64) ; SI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p1) :: (load 2, addrspace 1) - ; SI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; SI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) + ; SI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; SI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; SI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C3]] + ; SI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; SI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C3]] + ; SI: 
[[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; SI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C4]](s32) + ; SI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; SI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; SI: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C3]] + ; SI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; SI: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C3]] + ; SI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) + ; SI: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; SI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32) ; SI: $vgpr0_vgpr1 = COPY [[MV]](s64) ; CI-HSA-LABEL: name: test_load_global_s64_align2 ; CI-HSA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 @@ -797,38 +843,58 @@ body: | ; CI-MESA-LABEL: name: test_load_global_s64_align2 ; CI-MESA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 ; CI-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 2, addrspace 1) - ; CI-MESA: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; CI-MESA: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; CI-MESA: [[GEP:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C]](s64) ; CI-MESA: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p1) :: (load 2, addrspace 1) - ; CI-MESA: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; CI-MESA: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 ; CI-MESA: [[GEP1:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C1]](s64) ; CI-MESA: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p1) :: (load 2, addrspace 1) - ; CI-MESA: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; CI-MESA: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 6 ; CI-MESA: [[GEP2:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C2]](s64) ; CI-MESA: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p1) :: (load 2, addrspace 1) - ; CI-MESA: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; CI-MESA: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) + ; CI-MESA: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; CI-MESA: [[COPY1:%[0-9]+]]:_(s32) = 
COPY [[LOAD]](s32) + ; CI-MESA: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C3]] + ; CI-MESA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; CI-MESA: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C3]] + ; CI-MESA: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-MESA: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C4]](s32) + ; CI-MESA: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; CI-MESA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; CI-MESA: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C3]] + ; CI-MESA: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; CI-MESA: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C3]] + ; CI-MESA: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) + ; CI-MESA: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; CI-MESA: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32) ; CI-MESA: $vgpr0_vgpr1 = COPY [[MV]](s64) ; VI-LABEL: name: test_load_global_s64_align2 ; VI: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 2, addrspace 1) - ; VI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; VI: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; VI: [[GEP:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C]](s64) ; VI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p1) :: (load 2, addrspace 1) - ; VI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; VI: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 ; VI: [[GEP1:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C1]](s64) ; VI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p1) :: (load 2, addrspace 1) - ; VI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; VI: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 6 ; VI: [[GEP2:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C2]](s64) ; VI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p1) :: (load 2, addrspace 1) - ; VI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; VI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) + ; VI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 
+ ; VI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; VI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C3]] + ; VI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; VI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C3]] + ; VI: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C4]](s32) + ; VI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; VI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; VI: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C3]] + ; VI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; VI: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C3]] + ; VI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) + ; VI: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; VI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32) ; VI: $vgpr0_vgpr1 = COPY [[MV]](s64) ; GFX9-HSA-LABEL: name: test_load_global_s64_align2 ; GFX9-HSA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 @@ -837,20 +903,30 @@ body: | ; GFX9-MESA-LABEL: name: test_load_global_s64_align2 ; GFX9-MESA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 ; GFX9-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 2, addrspace 1) - ; GFX9-MESA: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; GFX9-MESA: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; GFX9-MESA: [[GEP:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C]](s64) ; GFX9-MESA: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p1) :: (load 2, addrspace 1) - ; GFX9-MESA: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; GFX9-MESA: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 ; GFX9-MESA: [[GEP1:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C1]](s64) ; GFX9-MESA: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p1) :: (load 2, addrspace 1) - ; GFX9-MESA: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; GFX9-MESA: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 6 ; GFX9-MESA: [[GEP2:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C2]](s64) ; GFX9-MESA: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p1) :: (load 2, addrspace 1) - ; GFX9-MESA: 
[[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; GFX9-MESA: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) + ; GFX9-MESA: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; GFX9-MESA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; GFX9-MESA: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C3]] + ; GFX9-MESA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; GFX9-MESA: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C3]] + ; GFX9-MESA: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9-MESA: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C4]](s32) + ; GFX9-MESA: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; GFX9-MESA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; GFX9-MESA: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C3]] + ; GFX9-MESA: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; GFX9-MESA: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C3]] + ; GFX9-MESA: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) + ; GFX9-MESA: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; GFX9-MESA: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32) ; GFX9-MESA: $vgpr0_vgpr1 = COPY [[MV]](s64) %0:_(p1) = COPY $vgpr0_vgpr1 %1:_(s64) = G_LOAD %0 :: (load 8, align 2, addrspace 1) @@ -921,7 +997,16 @@ body: | ; SI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C8]](s32) ; SI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) ; SI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; SI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) + ; SI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; SI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; SI: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; SI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C10]](s32) + ; SI: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; SI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; SI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; SI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL 
[[ZEXT3]], [[C10]](s32) + ; SI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; SI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) ; SI: $vgpr0_vgpr1 = COPY [[MV]](s64) ; CI-HSA-LABEL: name: test_load_global_s64_align1 ; CI-HSA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 @@ -985,7 +1070,16 @@ body: | ; CI-MESA: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C8]](s32) ; CI-MESA: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) ; CI-MESA: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; CI-MESA: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) + ; CI-MESA: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI-MESA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI-MESA: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-MESA: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C10]](s32) + ; CI-MESA: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; CI-MESA: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; CI-MESA: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI-MESA: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C10]](s32) + ; CI-MESA: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; CI-MESA: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) ; CI-MESA: $vgpr0_vgpr1 = COPY [[MV]](s64) ; VI-LABEL: name: test_load_global_s64_align1 ; VI: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 @@ -1037,7 +1131,16 @@ body: | ; VI: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C7]] ; VI: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C8]](s16) ; VI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; VI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) + ; VI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; VI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; VI: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C9]](s32) + ; VI: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; VI: 
[[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; VI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; VI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C9]](s32) + ; VI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; VI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) ; VI: $vgpr0_vgpr1 = COPY [[MV]](s64) ; GFX9-HSA-LABEL: name: test_load_global_s64_align1 ; GFX9-HSA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 @@ -1093,7 +1196,16 @@ body: | ; GFX9-MESA: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C7]] ; GFX9-MESA: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C8]](s16) ; GFX9-MESA: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; GFX9-MESA: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) + ; GFX9-MESA: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9-MESA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9-MESA: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9-MESA: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C9]](s32) + ; GFX9-MESA: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; GFX9-MESA: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; GFX9-MESA: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; GFX9-MESA: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C9]](s32) + ; GFX9-MESA: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; GFX9-MESA: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) ; GFX9-MESA: $vgpr0_vgpr1 = COPY [[MV]](s64) %0:_(p1) = COPY $vgpr0_vgpr1 %1:_(s64) = G_LOAD %0 :: (load 8, align 1, addrspace 1) @@ -1214,28 +1326,42 @@ body: | ; SI-LABEL: name: test_load_global_s96_align2 ; SI: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 ; SI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 2, addrspace 1) - ; SI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; SI: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; SI: [[GEP:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C]](s64) ; SI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p1) :: (load 2, addrspace 1) - 
; SI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; SI: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 ; SI: [[GEP1:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C1]](s64) ; SI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p1) :: (load 2, addrspace 1) - ; SI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; SI: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 6 ; SI: [[GEP2:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C2]](s64) ; SI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p1) :: (load 2, addrspace 1) - ; SI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) ; SI: [[C3:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 ; SI: [[GEP3:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C3]](s64) ; SI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p1) :: (load 2, addrspace 1) - ; SI: [[TRUNC4:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD4]](s32) ; SI: [[C4:%[0-9]+]]:_(s64) = G_CONSTANT i64 10 ; SI: [[GEP4:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C4]](s64) ; SI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p1) :: (load 2, addrspace 1) - ; SI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) - ; SI: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16), [[TRUNC4]](s16), [[TRUNC5]](s16) + ; SI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; SI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; SI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C5]] + ; SI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; SI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C5]] + ; SI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; SI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C6]](s32) + ; SI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; SI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; SI: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C5]] + ; SI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; SI: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C5]] + ; SI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C6]](s32) + ; SI: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; SI: [[COPY5:%[0-9]+]]:_(s32) = 
COPY [[LOAD4]](s32) + ; SI: [[AND4:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C5]] + ; SI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) + ; SI: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C5]] + ; SI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[C6]](s32) + ; SI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[AND4]], [[SHL2]] + ; SI: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32), [[OR2]](s32) ; SI: $vgpr0_vgpr1_vgpr2 = COPY [[MV]](s96) ; CI-HSA-LABEL: name: test_load_global_s96_align2 ; CI-HSA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 @@ -1244,54 +1370,82 @@ body: | ; CI-MESA-LABEL: name: test_load_global_s96_align2 ; CI-MESA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 ; CI-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 2, addrspace 1) - ; CI-MESA: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; CI-MESA: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; CI-MESA: [[GEP:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C]](s64) ; CI-MESA: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p1) :: (load 2, addrspace 1) - ; CI-MESA: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; CI-MESA: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 ; CI-MESA: [[GEP1:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C1]](s64) ; CI-MESA: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p1) :: (load 2, addrspace 1) - ; CI-MESA: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; CI-MESA: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 6 ; CI-MESA: [[GEP2:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C2]](s64) ; CI-MESA: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p1) :: (load 2, addrspace 1) - ; CI-MESA: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) ; CI-MESA: [[C3:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 ; CI-MESA: [[GEP3:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C3]](s64) ; CI-MESA: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p1) :: (load 2, addrspace 1) - ; CI-MESA: [[TRUNC4:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD4]](s32) ; CI-MESA: [[C4:%[0-9]+]]:_(s64) = G_CONSTANT i64 10 ; CI-MESA: [[GEP4:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C4]](s64) ; 
CI-MESA: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p1) :: (load 2, addrspace 1) - ; CI-MESA: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) - ; CI-MESA: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16), [[TRUNC4]](s16), [[TRUNC5]](s16) + ; CI-MESA: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; CI-MESA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; CI-MESA: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C5]] + ; CI-MESA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; CI-MESA: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C5]] + ; CI-MESA: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-MESA: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C6]](s32) + ; CI-MESA: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; CI-MESA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; CI-MESA: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C5]] + ; CI-MESA: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; CI-MESA: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C5]] + ; CI-MESA: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C6]](s32) + ; CI-MESA: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; CI-MESA: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD4]](s32) + ; CI-MESA: [[AND4:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C5]] + ; CI-MESA: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) + ; CI-MESA: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C5]] + ; CI-MESA: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[C6]](s32) + ; CI-MESA: [[OR2:%[0-9]+]]:_(s32) = G_OR [[AND4]], [[SHL2]] + ; CI-MESA: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32), [[OR2]](s32) ; CI-MESA: $vgpr0_vgpr1_vgpr2 = COPY [[MV]](s96) ; VI-LABEL: name: test_load_global_s96_align2 ; VI: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 2, addrspace 1) - ; VI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; VI: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; VI: [[GEP:%[0-9]+]]:_(p1) = G_GEP [[COPY]], 
[[C]](s64) ; VI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p1) :: (load 2, addrspace 1) - ; VI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; VI: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 ; VI: [[GEP1:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C1]](s64) ; VI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p1) :: (load 2, addrspace 1) - ; VI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; VI: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 6 ; VI: [[GEP2:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C2]](s64) ; VI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p1) :: (load 2, addrspace 1) - ; VI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) ; VI: [[C3:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 ; VI: [[GEP3:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C3]](s64) ; VI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p1) :: (load 2, addrspace 1) - ; VI: [[TRUNC4:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD4]](s32) ; VI: [[C4:%[0-9]+]]:_(s64) = G_CONSTANT i64 10 ; VI: [[GEP4:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C4]](s64) ; VI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p1) :: (load 2, addrspace 1) - ; VI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) - ; VI: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16), [[TRUNC4]](s16), [[TRUNC5]](s16) + ; VI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; VI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; VI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C5]] + ; VI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; VI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C5]] + ; VI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C6]](s32) + ; VI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; VI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; VI: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C5]] + ; VI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; VI: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C5]] + ; VI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C6]](s32) + 
; VI: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; VI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD4]](s32) + ; VI: [[AND4:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C5]] + ; VI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) + ; VI: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C5]] + ; VI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[C6]](s32) + ; VI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[AND4]], [[SHL2]] + ; VI: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32), [[OR2]](s32) ; VI: $vgpr0_vgpr1_vgpr2 = COPY [[MV]](s96) ; GFX9-HSA-LABEL: name: test_load_global_s96_align2 ; GFX9-HSA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 @@ -1300,28 +1454,42 @@ body: | ; GFX9-MESA-LABEL: name: test_load_global_s96_align2 ; GFX9-MESA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 ; GFX9-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 2, addrspace 1) - ; GFX9-MESA: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; GFX9-MESA: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; GFX9-MESA: [[GEP:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C]](s64) ; GFX9-MESA: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p1) :: (load 2, addrspace 1) - ; GFX9-MESA: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; GFX9-MESA: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 ; GFX9-MESA: [[GEP1:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C1]](s64) ; GFX9-MESA: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p1) :: (load 2, addrspace 1) - ; GFX9-MESA: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; GFX9-MESA: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 6 ; GFX9-MESA: [[GEP2:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C2]](s64) ; GFX9-MESA: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p1) :: (load 2, addrspace 1) - ; GFX9-MESA: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) ; GFX9-MESA: [[C3:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 ; GFX9-MESA: [[GEP3:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C3]](s64) ; GFX9-MESA: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p1) :: (load 2, addrspace 1) - ; GFX9-MESA: [[TRUNC4:%[0-9]+]]:_(s16) = G_TRUNC 
[[LOAD4]](s32) ; GFX9-MESA: [[C4:%[0-9]+]]:_(s64) = G_CONSTANT i64 10 ; GFX9-MESA: [[GEP4:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C4]](s64) ; GFX9-MESA: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p1) :: (load 2, addrspace 1) - ; GFX9-MESA: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) - ; GFX9-MESA: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16), [[TRUNC4]](s16), [[TRUNC5]](s16) + ; GFX9-MESA: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; GFX9-MESA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; GFX9-MESA: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C5]] + ; GFX9-MESA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; GFX9-MESA: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C5]] + ; GFX9-MESA: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9-MESA: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C6]](s32) + ; GFX9-MESA: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; GFX9-MESA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; GFX9-MESA: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C5]] + ; GFX9-MESA: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; GFX9-MESA: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C5]] + ; GFX9-MESA: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C6]](s32) + ; GFX9-MESA: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; GFX9-MESA: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD4]](s32) + ; GFX9-MESA: [[AND4:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C5]] + ; GFX9-MESA: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) + ; GFX9-MESA: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C5]] + ; GFX9-MESA: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[C6]](s32) + ; GFX9-MESA: [[OR2:%[0-9]+]]:_(s32) = G_OR [[AND4]], [[SHL2]] + ; GFX9-MESA: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32), [[OR2]](s32) ; GFX9-MESA: $vgpr0_vgpr1_vgpr2 = COPY [[MV]](s96) %0:_(p1) = COPY $vgpr0_vgpr1 %1:_(s96) = G_LOAD %0 :: (load 12, align 2, addrspace 1) @@ -1420,7 +1588,20 @@ body: | ; SI: 
[[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[C12]](s32) ; SI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL5]](s32) ; SI: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] - ; SI: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16), [[OR4]](s16), [[OR5]](s16) + ; SI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; SI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; SI: [[C14:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; SI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C14]](s32) + ; SI: [[OR6:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL6]] + ; SI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; SI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; SI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C14]](s32) + ; SI: [[OR7:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL7]] + ; SI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; SI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR5]](s16) + ; SI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C14]](s32) + ; SI: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL8]] + ; SI: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR6]](s32), [[OR7]](s32), [[OR8]](s32) ; SI: $vgpr0_vgpr1_vgpr2 = COPY [[MV]](s96) ; CI-HSA-LABEL: name: test_load_global_s96_align1 ; CI-HSA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 @@ -1512,7 +1693,20 @@ body: | ; CI-MESA: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[C12]](s32) ; CI-MESA: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL5]](s32) ; CI-MESA: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] - ; CI-MESA: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16), [[OR4]](s16), [[OR5]](s16) + ; CI-MESA: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI-MESA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI-MESA: [[C14:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-MESA: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C14]](s32) + ; CI-MESA: [[OR6:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL6]] + ; CI-MESA: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT 
[[OR2]](s16) + ; CI-MESA: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI-MESA: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C14]](s32) + ; CI-MESA: [[OR7:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL7]] + ; CI-MESA: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; CI-MESA: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR5]](s16) + ; CI-MESA: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C14]](s32) + ; CI-MESA: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL8]] + ; CI-MESA: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR6]](s32), [[OR7]](s32), [[OR8]](s32) ; CI-MESA: $vgpr0_vgpr1_vgpr2 = COPY [[MV]](s96) ; VI-LABEL: name: test_load_global_s96_align1 ; VI: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 @@ -1588,7 +1782,20 @@ body: | ; VI: [[AND11:%[0-9]+]]:_(s16) = G_AND [[TRUNC11]], [[C11]] ; VI: [[SHL5:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C12]](s16) ; VI: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL5]] - ; VI: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16), [[OR4]](s16), [[OR5]](s16) + ; VI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; VI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; VI: [[C13:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C13]](s32) + ; VI: [[OR6:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL6]] + ; VI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; VI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; VI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C13]](s32) + ; VI: [[OR7:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL7]] + ; VI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; VI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR5]](s16) + ; VI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C13]](s32) + ; VI: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL8]] + ; VI: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR6]](s32), [[OR7]](s32), [[OR8]](s32) ; VI: $vgpr0_vgpr1_vgpr2 = COPY [[MV]](s96) ; GFX9-HSA-LABEL: name: test_load_global_s96_align1 ; GFX9-HSA: [[COPY:%[0-9]+]]:_(p1) = 
COPY $vgpr0_vgpr1 @@ -1668,7 +1875,20 @@ body: | ; GFX9-MESA: [[AND11:%[0-9]+]]:_(s16) = G_AND [[TRUNC11]], [[C11]] ; GFX9-MESA: [[SHL5:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C12]](s16) ; GFX9-MESA: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL5]] - ; GFX9-MESA: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16), [[OR4]](s16), [[OR5]](s16) + ; GFX9-MESA: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9-MESA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9-MESA: [[C13:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9-MESA: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C13]](s32) + ; GFX9-MESA: [[OR6:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL6]] + ; GFX9-MESA: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; GFX9-MESA: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; GFX9-MESA: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C13]](s32) + ; GFX9-MESA: [[OR7:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL7]] + ; GFX9-MESA: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; GFX9-MESA: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR5]](s16) + ; GFX9-MESA: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C13]](s32) + ; GFX9-MESA: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL8]] + ; GFX9-MESA: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR6]](s32), [[OR7]](s32), [[OR8]](s32) ; GFX9-MESA: $vgpr0_vgpr1_vgpr2 = COPY [[MV]](s96) %0:_(p1) = COPY $vgpr0_vgpr1 %1:_(s96) = G_LOAD %0 :: (load 12, align 1, addrspace 1) @@ -1971,7 +2191,24 @@ body: | ; SI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[C16]](s32) ; SI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) ; SI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] - ; SI: [[MV:%[0-9]+]]:_(s128) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16), [[OR4]](s16), [[OR5]](s16), [[OR6]](s16), [[OR7]](s16) + ; SI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; SI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; SI: [[C18:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; SI: 
[[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C18]](s32) + ; SI: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL8]] + ; SI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; SI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; SI: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C18]](s32) + ; SI: [[OR9:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL9]] + ; SI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; SI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR5]](s16) + ; SI: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C18]](s32) + ; SI: [[OR10:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL10]] + ; SI: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; SI: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; SI: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C18]](s32) + ; SI: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; SI: [[MV:%[0-9]+]]:_(s128) = G_MERGE_VALUES [[OR8]](s32), [[OR9]](s32), [[OR10]](s32), [[OR11]](s32) ; SI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[MV]](s128) ; CI-HSA-LABEL: name: test_load_global_s128_align1 ; CI-HSA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 @@ -2091,7 +2328,24 @@ body: | ; CI-MESA: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[C16]](s32) ; CI-MESA: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) ; CI-MESA: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] - ; CI-MESA: [[MV:%[0-9]+]]:_(s128) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16), [[OR4]](s16), [[OR5]](s16), [[OR6]](s16), [[OR7]](s16) + ; CI-MESA: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI-MESA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI-MESA: [[C18:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-MESA: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C18]](s32) + ; CI-MESA: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL8]] + ; CI-MESA: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; CI-MESA: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI-MESA: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C18]](s32) + ; CI-MESA: [[OR9:%[0-9]+]]:_(s32) = G_OR 
[[ZEXT2]], [[SHL9]] + ; CI-MESA: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; CI-MESA: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR5]](s16) + ; CI-MESA: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C18]](s32) + ; CI-MESA: [[OR10:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL10]] + ; CI-MESA: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; CI-MESA: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; CI-MESA: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C18]](s32) + ; CI-MESA: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; CI-MESA: [[MV:%[0-9]+]]:_(s128) = G_MERGE_VALUES [[OR8]](s32), [[OR9]](s32), [[OR10]](s32), [[OR11]](s32) ; CI-MESA: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[MV]](s128) ; VI-LABEL: name: test_load_global_s128_align1 ; VI: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 @@ -2191,7 +2445,24 @@ body: | ; VI: [[AND15:%[0-9]+]]:_(s16) = G_AND [[TRUNC15]], [[C15]] ; VI: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C16]](s16) ; VI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL7]] - ; VI: [[MV:%[0-9]+]]:_(s128) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16), [[OR4]](s16), [[OR5]](s16), [[OR6]](s16), [[OR7]](s16) + ; VI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; VI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; VI: [[C17:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C17]](s32) + ; VI: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL8]] + ; VI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; VI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; VI: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C17]](s32) + ; VI: [[OR9:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL9]] + ; VI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; VI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR5]](s16) + ; VI: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C17]](s32) + ; VI: [[OR10:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL10]] + ; VI: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; VI: [[ZEXT7:%[0-9]+]]:_(s32) = 
G_ZEXT [[OR7]](s16) + ; VI: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C17]](s32) + ; VI: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; VI: [[MV:%[0-9]+]]:_(s128) = G_MERGE_VALUES [[OR8]](s32), [[OR9]](s32), [[OR10]](s32), [[OR11]](s32) ; VI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[MV]](s128) ; GFX9-HSA-LABEL: name: test_load_global_s128_align1 ; GFX9-HSA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 @@ -2295,7 +2566,24 @@ body: | ; GFX9-MESA: [[AND15:%[0-9]+]]:_(s16) = G_AND [[TRUNC15]], [[C15]] ; GFX9-MESA: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C16]](s16) ; GFX9-MESA: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL7]] - ; GFX9-MESA: [[MV:%[0-9]+]]:_(s128) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16), [[OR4]](s16), [[OR5]](s16), [[OR6]](s16), [[OR7]](s16) + ; GFX9-MESA: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9-MESA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9-MESA: [[C17:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9-MESA: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C17]](s32) + ; GFX9-MESA: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL8]] + ; GFX9-MESA: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; GFX9-MESA: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; GFX9-MESA: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C17]](s32) + ; GFX9-MESA: [[OR9:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL9]] + ; GFX9-MESA: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; GFX9-MESA: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR5]](s16) + ; GFX9-MESA: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C17]](s32) + ; GFX9-MESA: [[OR10:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL10]] + ; GFX9-MESA: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; GFX9-MESA: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; GFX9-MESA: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C17]](s32) + ; GFX9-MESA: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; GFX9-MESA: [[MV:%[0-9]+]]:_(s128) = G_MERGE_VALUES [[OR8]](s32), [[OR9]](s32), 
[[OR10]](s32), [[OR11]](s32) ; GFX9-MESA: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[MV]](s128) %0:_(p1) = COPY $vgpr0_vgpr1 %1:_(s128) = G_LOAD %0 :: (load 16, align 1, addrspace 1) @@ -2471,7 +2759,16 @@ body: | ; SI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C8]](s32) ; SI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) ; SI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; SI: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) + ; SI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; SI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; SI: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; SI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C10]](s32) + ; SI: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; SI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; SI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; SI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C10]](s32) + ; SI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; SI: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) ; SI: $vgpr0_vgpr1 = COPY [[MV]](p1) ; CI-HSA-LABEL: name: test_load_global_p1_align1 ; CI-HSA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 @@ -2535,7 +2832,16 @@ body: | ; CI-MESA: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C8]](s32) ; CI-MESA: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) ; CI-MESA: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; CI-MESA: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) + ; CI-MESA: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI-MESA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI-MESA: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-MESA: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C10]](s32) + ; CI-MESA: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; CI-MESA: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; CI-MESA: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI-MESA: [[SHL5:%[0-9]+]]:_(s32) = G_SHL 
[[ZEXT3]], [[C10]](s32) + ; CI-MESA: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; CI-MESA: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) ; CI-MESA: $vgpr0_vgpr1 = COPY [[MV]](p1) ; VI-LABEL: name: test_load_global_p1_align1 ; VI: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 @@ -2587,7 +2893,16 @@ body: | ; VI: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C7]] ; VI: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C8]](s16) ; VI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; VI: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) + ; VI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; VI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; VI: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C9]](s32) + ; VI: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; VI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; VI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; VI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C9]](s32) + ; VI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; VI: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) ; VI: $vgpr0_vgpr1 = COPY [[MV]](p1) ; GFX9-HSA-LABEL: name: test_load_global_p1_align1 ; GFX9-HSA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 @@ -2643,7 +2958,16 @@ body: | ; GFX9-MESA: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C7]] ; GFX9-MESA: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C8]](s16) ; GFX9-MESA: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; GFX9-MESA: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) + ; GFX9-MESA: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9-MESA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9-MESA: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9-MESA: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C9]](s32) + ; GFX9-MESA: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; GFX9-MESA: 
[[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; GFX9-MESA: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; GFX9-MESA: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C9]](s32) + ; GFX9-MESA: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; GFX9-MESA: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) ; GFX9-MESA: $vgpr0_vgpr1 = COPY [[MV]](p1) %0:_(p1) = COPY $vgpr0_vgpr1 %1:_(p1) = G_LOAD %0 :: (load 8, align 1, addrspace 1) @@ -2768,20 +3092,30 @@ body: | ; SI-LABEL: name: test_load_global_p4_align2 ; SI: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 ; SI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 2, addrspace 1) - ; SI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; SI: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; SI: [[GEP:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C]](s64) ; SI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p1) :: (load 2, addrspace 1) - ; SI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; SI: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 ; SI: [[GEP1:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C1]](s64) ; SI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p1) :: (load 2, addrspace 1) - ; SI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; SI: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 6 ; SI: [[GEP2:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C2]](s64) ; SI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p1) :: (load 2, addrspace 1) - ; SI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; SI: [[MV:%[0-9]+]]:_(p4) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) + ; SI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; SI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; SI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C3]] + ; SI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; SI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C3]] + ; SI: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; SI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C4]](s32) + ; SI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; SI: 
[[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; SI: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C3]] + ; SI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; SI: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C3]] + ; SI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) + ; SI: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; SI: [[MV:%[0-9]+]]:_(p4) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32) ; SI: $vgpr0_vgpr1 = COPY [[MV]](p4) ; CI-HSA-LABEL: name: test_load_global_p4_align2 ; CI-HSA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 @@ -2790,38 +3124,58 @@ body: | ; CI-MESA-LABEL: name: test_load_global_p4_align2 ; CI-MESA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 ; CI-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 2, addrspace 1) - ; CI-MESA: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; CI-MESA: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; CI-MESA: [[GEP:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C]](s64) ; CI-MESA: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p1) :: (load 2, addrspace 1) - ; CI-MESA: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; CI-MESA: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 ; CI-MESA: [[GEP1:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C1]](s64) ; CI-MESA: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p1) :: (load 2, addrspace 1) - ; CI-MESA: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; CI-MESA: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 6 ; CI-MESA: [[GEP2:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C2]](s64) ; CI-MESA: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p1) :: (load 2, addrspace 1) - ; CI-MESA: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; CI-MESA: [[MV:%[0-9]+]]:_(p4) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) + ; CI-MESA: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; CI-MESA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; CI-MESA: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C3]] + ; CI-MESA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; CI-MESA: 
[[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C3]] + ; CI-MESA: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-MESA: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C4]](s32) + ; CI-MESA: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; CI-MESA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; CI-MESA: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C3]] + ; CI-MESA: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; CI-MESA: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C3]] + ; CI-MESA: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) + ; CI-MESA: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; CI-MESA: [[MV:%[0-9]+]]:_(p4) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32) ; CI-MESA: $vgpr0_vgpr1 = COPY [[MV]](p4) ; VI-LABEL: name: test_load_global_p4_align2 ; VI: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 2, addrspace 1) - ; VI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; VI: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; VI: [[GEP:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C]](s64) ; VI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p1) :: (load 2, addrspace 1) - ; VI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; VI: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 ; VI: [[GEP1:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C1]](s64) ; VI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p1) :: (load 2, addrspace 1) - ; VI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; VI: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 6 ; VI: [[GEP2:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C2]](s64) ; VI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p1) :: (load 2, addrspace 1) - ; VI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; VI: [[MV:%[0-9]+]]:_(p4) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) + ; VI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; VI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; VI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C3]] + ; VI: [[COPY2:%[0-9]+]]:_(s32) = COPY 
[[LOAD1]](s32) + ; VI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C3]] + ; VI: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C4]](s32) + ; VI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; VI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; VI: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C3]] + ; VI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; VI: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C3]] + ; VI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) + ; VI: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; VI: [[MV:%[0-9]+]]:_(p4) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32) ; VI: $vgpr0_vgpr1 = COPY [[MV]](p4) ; GFX9-HSA-LABEL: name: test_load_global_p4_align2 ; GFX9-HSA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 @@ -2830,20 +3184,30 @@ body: | ; GFX9-MESA-LABEL: name: test_load_global_p4_align2 ; GFX9-MESA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 ; GFX9-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 2, addrspace 1) - ; GFX9-MESA: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; GFX9-MESA: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; GFX9-MESA: [[GEP:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C]](s64) ; GFX9-MESA: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p1) :: (load 2, addrspace 1) - ; GFX9-MESA: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; GFX9-MESA: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 ; GFX9-MESA: [[GEP1:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C1]](s64) ; GFX9-MESA: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p1) :: (load 2, addrspace 1) - ; GFX9-MESA: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; GFX9-MESA: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 6 ; GFX9-MESA: [[GEP2:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C2]](s64) ; GFX9-MESA: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p1) :: (load 2, addrspace 1) - ; GFX9-MESA: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; GFX9-MESA: [[MV:%[0-9]+]]:_(p4) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), 
[[TRUNC3]](s16) + ; GFX9-MESA: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; GFX9-MESA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; GFX9-MESA: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C3]] + ; GFX9-MESA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; GFX9-MESA: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C3]] + ; GFX9-MESA: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9-MESA: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C4]](s32) + ; GFX9-MESA: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; GFX9-MESA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; GFX9-MESA: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C3]] + ; GFX9-MESA: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; GFX9-MESA: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C3]] + ; GFX9-MESA: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) + ; GFX9-MESA: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; GFX9-MESA: [[MV:%[0-9]+]]:_(p4) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32) ; GFX9-MESA: $vgpr0_vgpr1 = COPY [[MV]](p4) %0:_(p1) = COPY $vgpr0_vgpr1 %1:_(p4) = G_LOAD %0 :: (load 8, align 2, addrspace 1) @@ -2914,7 +3278,16 @@ body: | ; SI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C8]](s32) ; SI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) ; SI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; SI: [[MV:%[0-9]+]]:_(p4) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) + ; SI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; SI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; SI: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; SI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C10]](s32) + ; SI: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; SI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; SI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; SI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C10]](s32) + ; SI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; SI: [[MV:%[0-9]+]]:_(p4) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) ; 
SI: $vgpr0_vgpr1 = COPY [[MV]](p4) ; CI-HSA-LABEL: name: test_load_global_p4_align1 ; CI-HSA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 @@ -2978,7 +3351,16 @@ body: | ; CI-MESA: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C8]](s32) ; CI-MESA: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) ; CI-MESA: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; CI-MESA: [[MV:%[0-9]+]]:_(p4) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) + ; CI-MESA: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI-MESA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI-MESA: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-MESA: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C10]](s32) + ; CI-MESA: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; CI-MESA: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; CI-MESA: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI-MESA: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C10]](s32) + ; CI-MESA: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; CI-MESA: [[MV:%[0-9]+]]:_(p4) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) ; CI-MESA: $vgpr0_vgpr1 = COPY [[MV]](p4) ; VI-LABEL: name: test_load_global_p4_align1 ; VI: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 @@ -3030,7 +3412,16 @@ body: | ; VI: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C7]] ; VI: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C8]](s16) ; VI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; VI: [[MV:%[0-9]+]]:_(p4) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) + ; VI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; VI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; VI: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C9]](s32) + ; VI: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; VI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; VI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; VI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C9]](s32) + ; VI: 
[[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; VI: [[MV:%[0-9]+]]:_(p4) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) ; VI: $vgpr0_vgpr1 = COPY [[MV]](p4) ; GFX9-HSA-LABEL: name: test_load_global_p4_align1 ; GFX9-HSA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 @@ -3086,7 +3477,16 @@ body: | ; GFX9-MESA: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C7]] ; GFX9-MESA: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C8]](s16) ; GFX9-MESA: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; GFX9-MESA: [[MV:%[0-9]+]]:_(p4) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) + ; GFX9-MESA: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9-MESA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9-MESA: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9-MESA: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C9]](s32) + ; GFX9-MESA: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; GFX9-MESA: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; GFX9-MESA: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; GFX9-MESA: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C9]](s32) + ; GFX9-MESA: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; GFX9-MESA: [[MV:%[0-9]+]]:_(p4) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) ; GFX9-MESA: $vgpr0_vgpr1 = COPY [[MV]](p4) %0:_(p1) = COPY $vgpr0_vgpr1 %1:_(p4) = G_LOAD %0 :: (load 8, align 1, addrspace 1) @@ -3137,13 +3537,19 @@ body: | ; SI-LABEL: name: test_load_global_p5_align2 ; SI: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 ; SI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 2, addrspace 1) - ; SI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; SI: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; SI: [[GEP:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C]](s64) ; SI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p1) :: (load 2, addrspace 1) - ; SI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; SI: [[MV:%[0-9]+]]:_(p5) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; SI: $vgpr0 = COPY [[MV]](p5) + ; SI: 
[[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; SI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; SI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; SI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; SI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; SI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; SI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; SI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; SI: [[INTTOPTR:%[0-9]+]]:_(p5) = G_INTTOPTR [[OR]](s32) + ; SI: $vgpr0 = COPY [[INTTOPTR]](p5) ; CI-HSA-LABEL: name: test_load_global_p5_align2 ; CI-HSA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 ; CI-HSA: [[LOAD:%[0-9]+]]:_(p5) = G_LOAD [[COPY]](p1) :: (load 4, align 2, addrspace 1) @@ -3151,23 +3557,35 @@ body: | ; CI-MESA-LABEL: name: test_load_global_p5_align2 ; CI-MESA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 ; CI-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 2, addrspace 1) - ; CI-MESA: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; CI-MESA: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; CI-MESA: [[GEP:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C]](s64) ; CI-MESA: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p1) :: (load 2, addrspace 1) - ; CI-MESA: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; CI-MESA: [[MV:%[0-9]+]]:_(p5) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; CI-MESA: $vgpr0 = COPY [[MV]](p5) + ; CI-MESA: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; CI-MESA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; CI-MESA: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; CI-MESA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; CI-MESA: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; CI-MESA: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-MESA: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; CI-MESA: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; CI-MESA: [[INTTOPTR:%[0-9]+]]:_(p5) = G_INTTOPTR [[OR]](s32) + ; CI-MESA: $vgpr0 = COPY [[INTTOPTR]](p5) ; VI-LABEL: name: 
test_load_global_p5_align2 ; VI: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 2, addrspace 1) - ; VI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; VI: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; VI: [[GEP:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C]](s64) ; VI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p1) :: (load 2, addrspace 1) - ; VI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; VI: [[MV:%[0-9]+]]:_(p5) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; VI: $vgpr0 = COPY [[MV]](p5) + ; VI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; VI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; VI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; VI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; VI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; VI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; VI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; VI: [[INTTOPTR:%[0-9]+]]:_(p5) = G_INTTOPTR [[OR]](s32) + ; VI: $vgpr0 = COPY [[INTTOPTR]](p5) ; GFX9-HSA-LABEL: name: test_load_global_p5_align2 ; GFX9-HSA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 ; GFX9-HSA: [[LOAD:%[0-9]+]]:_(p5) = G_LOAD [[COPY]](p1) :: (load 4, align 2, addrspace 1) @@ -3175,13 +3593,19 @@ body: | ; GFX9-MESA-LABEL: name: test_load_global_p5_align2 ; GFX9-MESA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 ; GFX9-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 2, addrspace 1) - ; GFX9-MESA: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; GFX9-MESA: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; GFX9-MESA: [[GEP:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C]](s64) ; GFX9-MESA: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p1) :: (load 2, addrspace 1) - ; GFX9-MESA: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; GFX9-MESA: [[MV:%[0-9]+]]:_(p5) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; GFX9-MESA: $vgpr0 = COPY [[MV]](p5) + ; GFX9-MESA: 
[[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; GFX9-MESA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; GFX9-MESA: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; GFX9-MESA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; GFX9-MESA: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; GFX9-MESA: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9-MESA: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; GFX9-MESA: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; GFX9-MESA: [[INTTOPTR:%[0-9]+]]:_(p5) = G_INTTOPTR [[OR]](s32) + ; GFX9-MESA: $vgpr0 = COPY [[INTTOPTR]](p5) %0:_(p1) = COPY $vgpr0_vgpr1 %1:_(p5) = G_LOAD %0 :: (load 4, align 2, addrspace 1) $vgpr0 = COPY %1 @@ -3223,8 +3647,13 @@ body: | ; SI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) ; SI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[SHL1]](s32) ; SI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[TRUNC3]] - ; SI: [[MV:%[0-9]+]]:_(p5) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; SI: $vgpr0 = COPY [[MV]](p5) + ; SI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; SI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; SI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; SI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C6]](s32) + ; SI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; SI: [[INTTOPTR:%[0-9]+]]:_(p5) = G_INTTOPTR [[OR2]](s32) + ; SI: $vgpr0 = COPY [[INTTOPTR]](p5) ; CI-HSA-LABEL: name: test_load_global_p5_align1 ; CI-HSA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 ; CI-HSA: [[LOAD:%[0-9]+]]:_(p5) = G_LOAD [[COPY]](p1) :: (load 4, align 1, addrspace 1) @@ -3259,8 +3688,13 @@ body: | ; CI-MESA: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) ; CI-MESA: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[SHL1]](s32) ; CI-MESA: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[TRUNC3]] - ; CI-MESA: [[MV:%[0-9]+]]:_(p5) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; CI-MESA: $vgpr0 = COPY [[MV]](p5) + ; CI-MESA: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI-MESA: 
[[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI-MESA: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-MESA: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C6]](s32) + ; CI-MESA: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; CI-MESA: [[INTTOPTR:%[0-9]+]]:_(p5) = G_INTTOPTR [[OR2]](s32) + ; CI-MESA: $vgpr0 = COPY [[INTTOPTR]](p5) ; VI-LABEL: name: test_load_global_p5_align1 ; VI: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 1, addrspace 1) @@ -3287,8 +3721,13 @@ body: | ; VI: [[AND3:%[0-9]+]]:_(s16) = G_AND [[TRUNC3]], [[C3]] ; VI: [[SHL1:%[0-9]+]]:_(s16) = G_SHL [[AND3]], [[C4]](s16) ; VI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[SHL1]] - ; VI: [[MV:%[0-9]+]]:_(p5) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; VI: $vgpr0 = COPY [[MV]](p5) + ; VI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; VI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; VI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C5]](s32) + ; VI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; VI: [[INTTOPTR:%[0-9]+]]:_(p5) = G_INTTOPTR [[OR2]](s32) + ; VI: $vgpr0 = COPY [[INTTOPTR]](p5) ; GFX9-HSA-LABEL: name: test_load_global_p5_align1 ; GFX9-HSA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 ; GFX9-HSA: [[LOAD:%[0-9]+]]:_(p5) = G_LOAD [[COPY]](p1) :: (load 4, align 1, addrspace 1) @@ -3319,8 +3758,13 @@ body: | ; GFX9-MESA: [[AND3:%[0-9]+]]:_(s16) = G_AND [[TRUNC3]], [[C3]] ; GFX9-MESA: [[SHL1:%[0-9]+]]:_(s16) = G_SHL [[AND3]], [[C4]](s16) ; GFX9-MESA: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[SHL1]] - ; GFX9-MESA: [[MV:%[0-9]+]]:_(p5) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; GFX9-MESA: $vgpr0 = COPY [[MV]](p5) + ; GFX9-MESA: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9-MESA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9-MESA: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9-MESA: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C5]](s32) + ; 
GFX9-MESA: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; GFX9-MESA: [[INTTOPTR:%[0-9]+]]:_(p5) = G_INTTOPTR [[OR2]](s32) + ; GFX9-MESA: $vgpr0 = COPY [[INTTOPTR]](p5) %0:_(p1) = COPY $vgpr0_vgpr1 %1:_(p5) = G_LOAD %0 :: (load 4, align 1, addrspace 1) $vgpr0 = COPY %1 @@ -6434,34 +6878,52 @@ body: | ; SI-LABEL: name: test_load_global_v2s64_align2 ; SI: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 ; SI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 2, addrspace 1) - ; SI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; SI: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; SI: [[GEP:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C]](s64) ; SI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p1) :: (load 2, addrspace 1) - ; SI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; SI: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 ; SI: [[GEP1:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C1]](s64) ; SI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p1) :: (load 2, addrspace 1) - ; SI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; SI: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 6 ; SI: [[GEP2:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C2]](s64) ; SI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p1) :: (load 2, addrspace 1) - ; SI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; SI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) - ; SI: [[C3:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 - ; SI: [[GEP3:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C3]](s64) + ; SI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; SI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; SI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C3]] + ; SI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; SI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C3]] + ; SI: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; SI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C4]](s32) + ; SI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; SI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; SI: 
[[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C3]] + ; SI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; SI: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C3]] + ; SI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) + ; SI: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; SI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32) + ; SI: [[C5:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 + ; SI: [[GEP3:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C5]](s64) ; SI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p1) :: (load 2, addrspace 1) - ; SI: [[TRUNC4:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD4]](s32) ; SI: [[GEP4:%[0-9]+]]:_(p1) = G_GEP [[GEP3]], [[C]](s64) ; SI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p1) :: (load 2, addrspace 1) - ; SI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) ; SI: [[GEP5:%[0-9]+]]:_(p1) = G_GEP [[GEP3]], [[C1]](s64) ; SI: [[LOAD6:%[0-9]+]]:_(s32) = G_LOAD [[GEP5]](p1) :: (load 2, addrspace 1) - ; SI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; SI: [[GEP6:%[0-9]+]]:_(p1) = G_GEP [[GEP3]], [[C2]](s64) ; SI: [[LOAD7:%[0-9]+]]:_(s32) = G_LOAD [[GEP6]](p1) :: (load 2, addrspace 1) - ; SI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) - ; SI: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[TRUNC4]](s16), [[TRUNC5]](s16), [[TRUNC6]](s16), [[TRUNC7]](s16) + ; SI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD4]](s32) + ; SI: [[AND4:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C3]] + ; SI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) + ; SI: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C3]] + ; SI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[C4]](s32) + ; SI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[AND4]], [[SHL2]] + ; SI: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LOAD6]](s32) + ; SI: [[AND6:%[0-9]+]]:_(s32) = G_AND [[COPY7]], [[C3]] + ; SI: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) + ; SI: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY8]], [[C3]] + ; SI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C4]](s32) + ; SI: [[OR3:%[0-9]+]]:_(s32) = G_OR [[AND6]], [[SHL3]] + ; 
SI: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR2]](s32), [[OR3]](s32) ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s64>) = G_BUILD_VECTOR [[MV]](s64), [[MV1]](s64) ; SI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BUILD_VECTOR]](<2 x s64>) ; CI-HSA-LABEL: name: test_load_global_v2s64_align2 @@ -6471,67 +6933,103 @@ body: | ; CI-MESA-LABEL: name: test_load_global_v2s64_align2 ; CI-MESA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 ; CI-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 2, addrspace 1) - ; CI-MESA: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; CI-MESA: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; CI-MESA: [[GEP:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C]](s64) ; CI-MESA: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p1) :: (load 2, addrspace 1) - ; CI-MESA: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; CI-MESA: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 ; CI-MESA: [[GEP1:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C1]](s64) ; CI-MESA: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p1) :: (load 2, addrspace 1) - ; CI-MESA: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; CI-MESA: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 6 ; CI-MESA: [[GEP2:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C2]](s64) ; CI-MESA: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p1) :: (load 2, addrspace 1) - ; CI-MESA: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; CI-MESA: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) - ; CI-MESA: [[C3:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 - ; CI-MESA: [[GEP3:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C3]](s64) + ; CI-MESA: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; CI-MESA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; CI-MESA: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C3]] + ; CI-MESA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; CI-MESA: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C3]] + ; CI-MESA: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-MESA: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], 
[[C4]](s32) + ; CI-MESA: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; CI-MESA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; CI-MESA: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C3]] + ; CI-MESA: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; CI-MESA: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C3]] + ; CI-MESA: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) + ; CI-MESA: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; CI-MESA: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32) + ; CI-MESA: [[C5:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 + ; CI-MESA: [[GEP3:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C5]](s64) ; CI-MESA: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p1) :: (load 2, addrspace 1) - ; CI-MESA: [[TRUNC4:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD4]](s32) ; CI-MESA: [[GEP4:%[0-9]+]]:_(p1) = G_GEP [[GEP3]], [[C]](s64) ; CI-MESA: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p1) :: (load 2, addrspace 1) - ; CI-MESA: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) ; CI-MESA: [[GEP5:%[0-9]+]]:_(p1) = G_GEP [[GEP3]], [[C1]](s64) ; CI-MESA: [[LOAD6:%[0-9]+]]:_(s32) = G_LOAD [[GEP5]](p1) :: (load 2, addrspace 1) - ; CI-MESA: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; CI-MESA: [[GEP6:%[0-9]+]]:_(p1) = G_GEP [[GEP3]], [[C2]](s64) ; CI-MESA: [[LOAD7:%[0-9]+]]:_(s32) = G_LOAD [[GEP6]](p1) :: (load 2, addrspace 1) - ; CI-MESA: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) - ; CI-MESA: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[TRUNC4]](s16), [[TRUNC5]](s16), [[TRUNC6]](s16), [[TRUNC7]](s16) + ; CI-MESA: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD4]](s32) + ; CI-MESA: [[AND4:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C3]] + ; CI-MESA: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) + ; CI-MESA: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C3]] + ; CI-MESA: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[C4]](s32) + ; CI-MESA: [[OR2:%[0-9]+]]:_(s32) = G_OR [[AND4]], [[SHL2]] + ; CI-MESA: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LOAD6]](s32) + ; CI-MESA: 
[[AND6:%[0-9]+]]:_(s32) = G_AND [[COPY7]], [[C3]] + ; CI-MESA: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) + ; CI-MESA: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY8]], [[C3]] + ; CI-MESA: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C4]](s32) + ; CI-MESA: [[OR3:%[0-9]+]]:_(s32) = G_OR [[AND6]], [[SHL3]] + ; CI-MESA: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR2]](s32), [[OR3]](s32) ; CI-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s64>) = G_BUILD_VECTOR [[MV]](s64), [[MV1]](s64) ; CI-MESA: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BUILD_VECTOR]](<2 x s64>) ; VI-LABEL: name: test_load_global_v2s64_align2 ; VI: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 2, addrspace 1) - ; VI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; VI: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; VI: [[GEP:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C]](s64) ; VI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p1) :: (load 2, addrspace 1) - ; VI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; VI: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 ; VI: [[GEP1:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C1]](s64) ; VI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p1) :: (load 2, addrspace 1) - ; VI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; VI: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 6 ; VI: [[GEP2:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C2]](s64) ; VI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p1) :: (load 2, addrspace 1) - ; VI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; VI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) - ; VI: [[C3:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 - ; VI: [[GEP3:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C3]](s64) + ; VI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; VI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; VI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C3]] + ; VI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; VI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], 
[[C3]] + ; VI: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C4]](s32) + ; VI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; VI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; VI: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C3]] + ; VI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; VI: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C3]] + ; VI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) + ; VI: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; VI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32) + ; VI: [[C5:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 + ; VI: [[GEP3:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C5]](s64) ; VI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p1) :: (load 2, addrspace 1) - ; VI: [[TRUNC4:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD4]](s32) ; VI: [[GEP4:%[0-9]+]]:_(p1) = G_GEP [[GEP3]], [[C]](s64) ; VI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p1) :: (load 2, addrspace 1) - ; VI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) ; VI: [[GEP5:%[0-9]+]]:_(p1) = G_GEP [[GEP3]], [[C1]](s64) ; VI: [[LOAD6:%[0-9]+]]:_(s32) = G_LOAD [[GEP5]](p1) :: (load 2, addrspace 1) - ; VI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; VI: [[GEP6:%[0-9]+]]:_(p1) = G_GEP [[GEP3]], [[C2]](s64) ; VI: [[LOAD7:%[0-9]+]]:_(s32) = G_LOAD [[GEP6]](p1) :: (load 2, addrspace 1) - ; VI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) - ; VI: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[TRUNC4]](s16), [[TRUNC5]](s16), [[TRUNC6]](s16), [[TRUNC7]](s16) + ; VI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD4]](s32) + ; VI: [[AND4:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C3]] + ; VI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) + ; VI: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C3]] + ; VI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[C4]](s32) + ; VI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[AND4]], [[SHL2]] + ; VI: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LOAD6]](s32) + ; VI: [[AND6:%[0-9]+]]:_(s32) = G_AND [[COPY7]], [[C3]] + 
; VI: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) + ; VI: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY8]], [[C3]] + ; VI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C4]](s32) + ; VI: [[OR3:%[0-9]+]]:_(s32) = G_OR [[AND6]], [[SHL3]] + ; VI: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR2]](s32), [[OR3]](s32) ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s64>) = G_BUILD_VECTOR [[MV]](s64), [[MV1]](s64) ; VI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BUILD_VECTOR]](<2 x s64>) ; GFX9-HSA-LABEL: name: test_load_global_v2s64_align2 @@ -6541,34 +7039,52 @@ body: | ; GFX9-MESA-LABEL: name: test_load_global_v2s64_align2 ; GFX9-MESA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 ; GFX9-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 2, addrspace 1) - ; GFX9-MESA: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; GFX9-MESA: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; GFX9-MESA: [[GEP:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C]](s64) ; GFX9-MESA: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p1) :: (load 2, addrspace 1) - ; GFX9-MESA: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; GFX9-MESA: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 ; GFX9-MESA: [[GEP1:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C1]](s64) ; GFX9-MESA: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p1) :: (load 2, addrspace 1) - ; GFX9-MESA: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; GFX9-MESA: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 6 ; GFX9-MESA: [[GEP2:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C2]](s64) ; GFX9-MESA: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p1) :: (load 2, addrspace 1) - ; GFX9-MESA: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; GFX9-MESA: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) - ; GFX9-MESA: [[C3:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 - ; GFX9-MESA: [[GEP3:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C3]](s64) + ; GFX9-MESA: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; GFX9-MESA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; GFX9-MESA: 
[[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C3]] + ; GFX9-MESA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; GFX9-MESA: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C3]] + ; GFX9-MESA: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9-MESA: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C4]](s32) + ; GFX9-MESA: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; GFX9-MESA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; GFX9-MESA: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C3]] + ; GFX9-MESA: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; GFX9-MESA: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C3]] + ; GFX9-MESA: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) + ; GFX9-MESA: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; GFX9-MESA: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32) + ; GFX9-MESA: [[C5:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 + ; GFX9-MESA: [[GEP3:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C5]](s64) ; GFX9-MESA: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p1) :: (load 2, addrspace 1) - ; GFX9-MESA: [[TRUNC4:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD4]](s32) ; GFX9-MESA: [[GEP4:%[0-9]+]]:_(p1) = G_GEP [[GEP3]], [[C]](s64) ; GFX9-MESA: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p1) :: (load 2, addrspace 1) - ; GFX9-MESA: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) ; GFX9-MESA: [[GEP5:%[0-9]+]]:_(p1) = G_GEP [[GEP3]], [[C1]](s64) ; GFX9-MESA: [[LOAD6:%[0-9]+]]:_(s32) = G_LOAD [[GEP5]](p1) :: (load 2, addrspace 1) - ; GFX9-MESA: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; GFX9-MESA: [[GEP6:%[0-9]+]]:_(p1) = G_GEP [[GEP3]], [[C2]](s64) ; GFX9-MESA: [[LOAD7:%[0-9]+]]:_(s32) = G_LOAD [[GEP6]](p1) :: (load 2, addrspace 1) - ; GFX9-MESA: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) - ; GFX9-MESA: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[TRUNC4]](s16), [[TRUNC5]](s16), [[TRUNC6]](s16), [[TRUNC7]](s16) + ; GFX9-MESA: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD4]](s32) + ; GFX9-MESA: [[AND4:%[0-9]+]]:_(s32) = G_AND [[COPY5]], 
[[C3]] + ; GFX9-MESA: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) + ; GFX9-MESA: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C3]] + ; GFX9-MESA: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[C4]](s32) + ; GFX9-MESA: [[OR2:%[0-9]+]]:_(s32) = G_OR [[AND4]], [[SHL2]] + ; GFX9-MESA: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LOAD6]](s32) + ; GFX9-MESA: [[AND6:%[0-9]+]]:_(s32) = G_AND [[COPY7]], [[C3]] + ; GFX9-MESA: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) + ; GFX9-MESA: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY8]], [[C3]] + ; GFX9-MESA: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C4]](s32) + ; GFX9-MESA: [[OR3:%[0-9]+]]:_(s32) = G_OR [[AND6]], [[SHL3]] + ; GFX9-MESA: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR2]](s32), [[OR3]](s32) ; GFX9-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s64>) = G_BUILD_VECTOR [[MV]](s64), [[MV1]](s64) ; GFX9-MESA: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BUILD_VECTOR]](<2 x s64>) %0:_(p1) = COPY $vgpr0_vgpr1 @@ -6640,9 +7156,18 @@ body: | ; SI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C8]](s32) ; SI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) ; SI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; SI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) - ; SI: [[C10:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 - ; SI: [[GEP7:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C10]](s64) + ; SI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; SI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; SI: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; SI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C10]](s32) + ; SI: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; SI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; SI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; SI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C10]](s32) + ; SI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; SI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) + ; SI: [[C11:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 + ; SI: 
[[GEP7:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C11]](s64) ; SI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p1) :: (load 1, addrspace 1) ; SI: [[GEP8:%[0-9]+]]:_(p1) = G_GEP [[GEP7]], [[C]](s64) ; SI: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p1) :: (load 1, addrspace 1) @@ -6663,34 +7188,42 @@ body: | ; SI: [[COPY8:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; SI: [[COPY9:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32) ; SI: [[AND9:%[0-9]+]]:_(s32) = G_AND [[COPY9]], [[C9]] - ; SI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY8]](s32) - ; SI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) - ; SI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] + ; SI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY8]](s32) + ; SI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) + ; SI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] ; SI: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; SI: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C7]] ; SI: [[COPY10:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; SI: [[COPY11:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32) ; SI: [[AND11:%[0-9]+]]:_(s32) = G_AND [[COPY11]], [[C9]] - ; SI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY10]](s32) - ; SI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL5]](s32) - ; SI: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] + ; SI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY10]](s32) + ; SI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) + ; SI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] ; SI: [[TRUNC12:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD12]](s32) ; SI: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C7]] ; SI: [[COPY12:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; SI: [[COPY13:%[0-9]+]]:_(s32) = COPY [[LOAD13]](s32) ; SI: [[AND13:%[0-9]+]]:_(s32) = G_AND [[COPY13]], [[C9]] - ; SI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY12]](s32) - ; SI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) - ; SI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] + ; SI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[AND13]], 
[[COPY12]](s32) + ; SI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL8]](s32) + ; SI: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] ; SI: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; SI: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C7]] ; SI: [[COPY14:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; SI: [[COPY15:%[0-9]+]]:_(s32) = COPY [[LOAD15]](s32) ; SI: [[AND15:%[0-9]+]]:_(s32) = G_AND [[COPY15]], [[C9]] - ; SI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY14]](s32) - ; SI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) - ; SI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] - ; SI: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16), [[OR6]](s16), [[OR7]](s16) + ; SI: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY14]](s32) + ; SI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL9]](s32) + ; SI: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] + ; SI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; SI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; SI: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C10]](s32) + ; SI: [[OR10:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL10]] + ; SI: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR8]](s16) + ; SI: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; SI: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C10]](s32) + ; SI: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; SI: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR10]](s32), [[OR11]](s32) ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s64>) = G_BUILD_VECTOR [[MV]](s64), [[MV1]](s64) ; SI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BUILD_VECTOR]](<2 x s64>) ; CI-HSA-LABEL: name: test_load_global_v2s64_align1 @@ -6755,9 +7288,18 @@ body: | ; CI-MESA: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C8]](s32) ; CI-MESA: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) ; CI-MESA: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; CI-MESA: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) - ; CI-MESA: 
[[C10:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 - ; CI-MESA: [[GEP7:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C10]](s64) + ; CI-MESA: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI-MESA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI-MESA: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-MESA: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C10]](s32) + ; CI-MESA: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; CI-MESA: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; CI-MESA: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI-MESA: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C10]](s32) + ; CI-MESA: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; CI-MESA: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) + ; CI-MESA: [[C11:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 + ; CI-MESA: [[GEP7:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C11]](s64) ; CI-MESA: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p1) :: (load 1, addrspace 1) ; CI-MESA: [[GEP8:%[0-9]+]]:_(p1) = G_GEP [[GEP7]], [[C]](s64) ; CI-MESA: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p1) :: (load 1, addrspace 1) @@ -6778,34 +7320,42 @@ body: | ; CI-MESA: [[COPY8:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-MESA: [[COPY9:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32) ; CI-MESA: [[AND9:%[0-9]+]]:_(s32) = G_AND [[COPY9]], [[C9]] - ; CI-MESA: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY8]](s32) - ; CI-MESA: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) - ; CI-MESA: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] + ; CI-MESA: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY8]](s32) + ; CI-MESA: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) + ; CI-MESA: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] ; CI-MESA: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; CI-MESA: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C7]] ; CI-MESA: [[COPY10:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-MESA: [[COPY11:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32) ; CI-MESA: [[AND11:%[0-9]+]]:_(s32) = G_AND [[COPY11]], 
[[C9]] - ; CI-MESA: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY10]](s32) - ; CI-MESA: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL5]](s32) - ; CI-MESA: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] + ; CI-MESA: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY10]](s32) + ; CI-MESA: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) + ; CI-MESA: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] ; CI-MESA: [[TRUNC12:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD12]](s32) ; CI-MESA: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C7]] ; CI-MESA: [[COPY12:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-MESA: [[COPY13:%[0-9]+]]:_(s32) = COPY [[LOAD13]](s32) ; CI-MESA: [[AND13:%[0-9]+]]:_(s32) = G_AND [[COPY13]], [[C9]] - ; CI-MESA: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY12]](s32) - ; CI-MESA: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) - ; CI-MESA: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] + ; CI-MESA: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY12]](s32) + ; CI-MESA: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL8]](s32) + ; CI-MESA: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] ; CI-MESA: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; CI-MESA: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C7]] ; CI-MESA: [[COPY14:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-MESA: [[COPY15:%[0-9]+]]:_(s32) = COPY [[LOAD15]](s32) ; CI-MESA: [[AND15:%[0-9]+]]:_(s32) = G_AND [[COPY15]], [[C9]] - ; CI-MESA: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY14]](s32) - ; CI-MESA: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) - ; CI-MESA: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] - ; CI-MESA: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16), [[OR6]](s16), [[OR7]](s16) + ; CI-MESA: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY14]](s32) + ; CI-MESA: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL9]](s32) + ; CI-MESA: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] + ; CI-MESA: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT 
[[OR6]](s16) + ; CI-MESA: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; CI-MESA: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C10]](s32) + ; CI-MESA: [[OR10:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL10]] + ; CI-MESA: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR8]](s16) + ; CI-MESA: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; CI-MESA: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C10]](s32) + ; CI-MESA: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; CI-MESA: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR10]](s32), [[OR11]](s32) ; CI-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s64>) = G_BUILD_VECTOR [[MV]](s64), [[MV1]](s64) ; CI-MESA: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BUILD_VECTOR]](<2 x s64>) ; VI-LABEL: name: test_load_global_v2s64_align1 @@ -6858,9 +7408,18 @@ body: | ; VI: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C7]] ; VI: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C8]](s16) ; VI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; VI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) - ; VI: [[C9:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 - ; VI: [[GEP7:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C9]](s64) + ; VI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; VI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; VI: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C9]](s32) + ; VI: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; VI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; VI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; VI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C9]](s32) + ; VI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; VI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) + ; VI: [[C10:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 + ; VI: [[GEP7:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C10]](s64) ; VI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p1) :: (load 1, addrspace 1) ; VI: [[GEP8:%[0-9]+]]:_(p1) = G_GEP [[GEP7]], 
[[C]](s64) ; VI: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p1) :: (load 1, addrspace 1) @@ -6880,27 +7439,35 @@ body: | ; VI: [[AND8:%[0-9]+]]:_(s16) = G_AND [[TRUNC8]], [[C7]] ; VI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD9]](s32) ; VI: [[AND9:%[0-9]+]]:_(s16) = G_AND [[TRUNC9]], [[C7]] - ; VI: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) - ; VI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL4]] + ; VI: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) + ; VI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL6]] ; VI: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; VI: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C7]] ; VI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD11]](s32) ; VI: [[AND11:%[0-9]+]]:_(s16) = G_AND [[TRUNC11]], [[C7]] - ; VI: [[SHL5:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) - ; VI: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL5]] + ; VI: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) + ; VI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL7]] ; VI: [[TRUNC12:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD12]](s32) ; VI: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C7]] ; VI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD13]](s32) ; VI: [[AND13:%[0-9]+]]:_(s16) = G_AND [[TRUNC13]], [[C7]] - ; VI: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C8]](s16) - ; VI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL6]] + ; VI: [[SHL8:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C8]](s16) + ; VI: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL8]] ; VI: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; VI: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C7]] ; VI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD15]](s32) ; VI: [[AND15:%[0-9]+]]:_(s16) = G_AND [[TRUNC15]], [[C7]] - ; VI: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C8]](s16) - ; VI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL7]] - ; VI: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16), [[OR6]](s16), [[OR7]](s16) + ; VI: [[SHL9:%[0-9]+]]:_(s16) = G_SHL [[AND15]], 
[[C8]](s16) + ; VI: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL9]] + ; VI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; VI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; VI: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C9]](s32) + ; VI: [[OR10:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL10]] + ; VI: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR8]](s16) + ; VI: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; VI: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C9]](s32) + ; VI: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; VI: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR10]](s32), [[OR11]](s32) ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s64>) = G_BUILD_VECTOR [[MV]](s64), [[MV1]](s64) ; VI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BUILD_VECTOR]](<2 x s64>) ; GFX9-HSA-LABEL: name: test_load_global_v2s64_align1 @@ -6957,9 +7524,18 @@ body: | ; GFX9-MESA: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C7]] ; GFX9-MESA: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C8]](s16) ; GFX9-MESA: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; GFX9-MESA: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) - ; GFX9-MESA: [[C9:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 - ; GFX9-MESA: [[GEP7:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C9]](s64) + ; GFX9-MESA: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9-MESA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9-MESA: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9-MESA: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C9]](s32) + ; GFX9-MESA: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; GFX9-MESA: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; GFX9-MESA: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; GFX9-MESA: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C9]](s32) + ; GFX9-MESA: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; GFX9-MESA: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) + ; GFX9-MESA: [[C10:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 + ; 
GFX9-MESA: [[GEP7:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C10]](s64) ; GFX9-MESA: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p1) :: (load 1, addrspace 1) ; GFX9-MESA: [[GEP8:%[0-9]+]]:_(p1) = G_GEP [[GEP7]], [[C]](s64) ; GFX9-MESA: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p1) :: (load 1, addrspace 1) @@ -6979,27 +7555,35 @@ body: | ; GFX9-MESA: [[AND8:%[0-9]+]]:_(s16) = G_AND [[TRUNC8]], [[C7]] ; GFX9-MESA: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD9]](s32) ; GFX9-MESA: [[AND9:%[0-9]+]]:_(s16) = G_AND [[TRUNC9]], [[C7]] - ; GFX9-MESA: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) - ; GFX9-MESA: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL4]] + ; GFX9-MESA: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) + ; GFX9-MESA: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL6]] ; GFX9-MESA: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; GFX9-MESA: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C7]] ; GFX9-MESA: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD11]](s32) ; GFX9-MESA: [[AND11:%[0-9]+]]:_(s16) = G_AND [[TRUNC11]], [[C7]] - ; GFX9-MESA: [[SHL5:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) - ; GFX9-MESA: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL5]] + ; GFX9-MESA: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) + ; GFX9-MESA: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL7]] ; GFX9-MESA: [[TRUNC12:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD12]](s32) ; GFX9-MESA: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C7]] ; GFX9-MESA: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD13]](s32) ; GFX9-MESA: [[AND13:%[0-9]+]]:_(s16) = G_AND [[TRUNC13]], [[C7]] - ; GFX9-MESA: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C8]](s16) - ; GFX9-MESA: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL6]] + ; GFX9-MESA: [[SHL8:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C8]](s16) + ; GFX9-MESA: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL8]] ; GFX9-MESA: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; GFX9-MESA: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C7]] ; 
GFX9-MESA: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD15]](s32) ; GFX9-MESA: [[AND15:%[0-9]+]]:_(s16) = G_AND [[TRUNC15]], [[C7]] - ; GFX9-MESA: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C8]](s16) - ; GFX9-MESA: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL7]] - ; GFX9-MESA: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16), [[OR6]](s16), [[OR7]](s16) + ; GFX9-MESA: [[SHL9:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C8]](s16) + ; GFX9-MESA: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL9]] + ; GFX9-MESA: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; GFX9-MESA: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; GFX9-MESA: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C9]](s32) + ; GFX9-MESA: [[OR10:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL10]] + ; GFX9-MESA: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR8]](s16) + ; GFX9-MESA: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; GFX9-MESA: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C9]](s32) + ; GFX9-MESA: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; GFX9-MESA: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR10]](s32), [[OR11]](s32) ; GFX9-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s64>) = G_BUILD_VECTOR [[MV]](s64), [[MV1]](s64) ; GFX9-MESA: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BUILD_VECTOR]](<2 x s64>) %0:_(p1) = COPY $vgpr0_vgpr1 @@ -7169,9 +7753,18 @@ body: | ; SI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C8]](s32) ; SI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) ; SI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; SI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) - ; SI: [[C10:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 - ; SI: [[GEP7:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C10]](s64) + ; SI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; SI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; SI: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; SI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C10]](s32) + ; SI: [[OR4:%[0-9]+]]:_(s32) = G_OR 
[[ZEXT]], [[SHL4]] + ; SI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; SI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; SI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C10]](s32) + ; SI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; SI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) + ; SI: [[C11:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 + ; SI: [[GEP7:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C11]](s64) ; SI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p1) :: (load 1, addrspace 1) ; SI: [[GEP8:%[0-9]+]]:_(p1) = G_GEP [[GEP7]], [[C]](s64) ; SI: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p1) :: (load 1, addrspace 1) @@ -7192,36 +7785,44 @@ body: | ; SI: [[COPY8:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; SI: [[COPY9:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32) ; SI: [[AND9:%[0-9]+]]:_(s32) = G_AND [[COPY9]], [[C9]] - ; SI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY8]](s32) - ; SI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) - ; SI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] + ; SI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY8]](s32) + ; SI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) + ; SI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] ; SI: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; SI: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C7]] ; SI: [[COPY10:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; SI: [[COPY11:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32) ; SI: [[AND11:%[0-9]+]]:_(s32) = G_AND [[COPY11]], [[C9]] - ; SI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY10]](s32) - ; SI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL5]](s32) - ; SI: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] + ; SI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY10]](s32) + ; SI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) + ; SI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] ; SI: [[TRUNC12:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD12]](s32) ; SI: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C7]] ; SI: 
[[COPY12:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; SI: [[COPY13:%[0-9]+]]:_(s32) = COPY [[LOAD13]](s32) ; SI: [[AND13:%[0-9]+]]:_(s32) = G_AND [[COPY13]], [[C9]] - ; SI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY12]](s32) - ; SI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) - ; SI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] + ; SI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY12]](s32) + ; SI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL8]](s32) + ; SI: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] ; SI: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; SI: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C7]] ; SI: [[COPY14:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; SI: [[COPY15:%[0-9]+]]:_(s32) = COPY [[LOAD15]](s32) ; SI: [[AND15:%[0-9]+]]:_(s32) = G_AND [[COPY15]], [[C9]] - ; SI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY14]](s32) - ; SI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) - ; SI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] - ; SI: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16), [[OR6]](s16), [[OR7]](s16) - ; SI: [[C11:%[0-9]+]]:_(s64) = G_CONSTANT i64 16 - ; SI: [[GEP15:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C11]](s64) + ; SI: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY14]](s32) + ; SI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL9]](s32) + ; SI: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] + ; SI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; SI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; SI: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C10]](s32) + ; SI: [[OR10:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL10]] + ; SI: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR8]](s16) + ; SI: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; SI: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C10]](s32) + ; SI: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; SI: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR10]](s32), [[OR11]](s32) + ; SI: [[C12:%[0-9]+]]:_(s64) = G_CONSTANT i64 
16 + ; SI: [[GEP15:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C12]](s64) ; SI: [[LOAD16:%[0-9]+]]:_(s32) = G_LOAD [[GEP15]](p1) :: (load 1, addrspace 1) ; SI: [[GEP16:%[0-9]+]]:_(p1) = G_GEP [[GEP15]], [[C]](s64) ; SI: [[LOAD17:%[0-9]+]]:_(s32) = G_LOAD [[GEP16]](p1) :: (load 1, addrspace 1) @@ -7242,34 +7843,42 @@ body: | ; SI: [[COPY16:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; SI: [[COPY17:%[0-9]+]]:_(s32) = COPY [[LOAD17]](s32) ; SI: [[AND17:%[0-9]+]]:_(s32) = G_AND [[COPY17]], [[C9]] - ; SI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[AND17]], [[COPY16]](s32) - ; SI: [[TRUNC17:%[0-9]+]]:_(s16) = G_TRUNC [[SHL8]](s32) - ; SI: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[TRUNC17]] + ; SI: [[SHL12:%[0-9]+]]:_(s32) = G_SHL [[AND17]], [[COPY16]](s32) + ; SI: [[TRUNC17:%[0-9]+]]:_(s16) = G_TRUNC [[SHL12]](s32) + ; SI: [[OR12:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[TRUNC17]] ; SI: [[TRUNC18:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD18]](s32) ; SI: [[AND18:%[0-9]+]]:_(s16) = G_AND [[TRUNC18]], [[C7]] ; SI: [[COPY18:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; SI: [[COPY19:%[0-9]+]]:_(s32) = COPY [[LOAD19]](s32) ; SI: [[AND19:%[0-9]+]]:_(s32) = G_AND [[COPY19]], [[C9]] - ; SI: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[AND19]], [[COPY18]](s32) - ; SI: [[TRUNC19:%[0-9]+]]:_(s16) = G_TRUNC [[SHL9]](s32) - ; SI: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[TRUNC19]] + ; SI: [[SHL13:%[0-9]+]]:_(s32) = G_SHL [[AND19]], [[COPY18]](s32) + ; SI: [[TRUNC19:%[0-9]+]]:_(s16) = G_TRUNC [[SHL13]](s32) + ; SI: [[OR13:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[TRUNC19]] ; SI: [[TRUNC20:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD20]](s32) ; SI: [[AND20:%[0-9]+]]:_(s16) = G_AND [[TRUNC20]], [[C7]] ; SI: [[COPY20:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; SI: [[COPY21:%[0-9]+]]:_(s32) = COPY [[LOAD21]](s32) ; SI: [[AND21:%[0-9]+]]:_(s32) = G_AND [[COPY21]], [[C9]] - ; SI: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[AND21]], [[COPY20]](s32) - ; SI: [[TRUNC21:%[0-9]+]]:_(s16) = G_TRUNC [[SHL10]](s32) - ; SI: [[OR10:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[TRUNC21]] + ; SI: 
[[SHL14:%[0-9]+]]:_(s32) = G_SHL [[AND21]], [[COPY20]](s32) + ; SI: [[TRUNC21:%[0-9]+]]:_(s16) = G_TRUNC [[SHL14]](s32) + ; SI: [[OR14:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[TRUNC21]] ; SI: [[TRUNC22:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD22]](s32) ; SI: [[AND22:%[0-9]+]]:_(s16) = G_AND [[TRUNC22]], [[C7]] ; SI: [[COPY22:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; SI: [[COPY23:%[0-9]+]]:_(s32) = COPY [[LOAD23]](s32) ; SI: [[AND23:%[0-9]+]]:_(s32) = G_AND [[COPY23]], [[C9]] - ; SI: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[AND23]], [[COPY22]](s32) - ; SI: [[TRUNC23:%[0-9]+]]:_(s16) = G_TRUNC [[SHL11]](s32) - ; SI: [[OR11:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[TRUNC23]] - ; SI: [[MV2:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR8]](s16), [[OR9]](s16), [[OR10]](s16), [[OR11]](s16) + ; SI: [[SHL15:%[0-9]+]]:_(s32) = G_SHL [[AND23]], [[COPY22]](s32) + ; SI: [[TRUNC23:%[0-9]+]]:_(s16) = G_TRUNC [[SHL15]](s32) + ; SI: [[OR15:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[TRUNC23]] + ; SI: [[ZEXT8:%[0-9]+]]:_(s32) = G_ZEXT [[OR12]](s16) + ; SI: [[ZEXT9:%[0-9]+]]:_(s32) = G_ZEXT [[OR13]](s16) + ; SI: [[SHL16:%[0-9]+]]:_(s32) = G_SHL [[ZEXT9]], [[C10]](s32) + ; SI: [[OR16:%[0-9]+]]:_(s32) = G_OR [[ZEXT8]], [[SHL16]] + ; SI: [[ZEXT10:%[0-9]+]]:_(s32) = G_ZEXT [[OR14]](s16) + ; SI: [[ZEXT11:%[0-9]+]]:_(s32) = G_ZEXT [[OR15]](s16) + ; SI: [[SHL17:%[0-9]+]]:_(s32) = G_SHL [[ZEXT11]], [[C10]](s32) + ; SI: [[OR17:%[0-9]+]]:_(s32) = G_OR [[ZEXT10]], [[SHL17]] + ; SI: [[MV2:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR16]](s32), [[OR17]](s32) ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s64>) = G_BUILD_VECTOR [[MV]](s64), [[MV1]](s64), [[MV2]](s64) ; SI: [[DEF:%[0-9]+]]:_(<4 x s64>) = G_IMPLICIT_DEF ; SI: [[INSERT:%[0-9]+]]:_(<4 x s64>) = G_INSERT [[DEF]], [[BUILD_VECTOR]](<3 x s64>), 0 @@ -7338,9 +7947,18 @@ body: | ; CI-MESA: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C8]](s32) ; CI-MESA: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) ; CI-MESA: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; CI-MESA: [[MV:%[0-9]+]]:_(s64) 
= G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) - ; CI-MESA: [[C10:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 - ; CI-MESA: [[GEP7:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C10]](s64) + ; CI-MESA: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI-MESA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI-MESA: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-MESA: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C10]](s32) + ; CI-MESA: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; CI-MESA: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; CI-MESA: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI-MESA: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C10]](s32) + ; CI-MESA: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; CI-MESA: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) + ; CI-MESA: [[C11:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 + ; CI-MESA: [[GEP7:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C11]](s64) ; CI-MESA: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p1) :: (load 1, addrspace 1) ; CI-MESA: [[GEP8:%[0-9]+]]:_(p1) = G_GEP [[GEP7]], [[C]](s64) ; CI-MESA: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p1) :: (load 1, addrspace 1) @@ -7361,36 +7979,44 @@ body: | ; CI-MESA: [[COPY8:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-MESA: [[COPY9:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32) ; CI-MESA: [[AND9:%[0-9]+]]:_(s32) = G_AND [[COPY9]], [[C9]] - ; CI-MESA: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY8]](s32) - ; CI-MESA: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) - ; CI-MESA: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] + ; CI-MESA: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY8]](s32) + ; CI-MESA: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) + ; CI-MESA: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] ; CI-MESA: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; CI-MESA: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C7]] ; CI-MESA: [[COPY10:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-MESA: 
[[COPY11:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32) ; CI-MESA: [[AND11:%[0-9]+]]:_(s32) = G_AND [[COPY11]], [[C9]] - ; CI-MESA: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY10]](s32) - ; CI-MESA: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL5]](s32) - ; CI-MESA: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] + ; CI-MESA: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY10]](s32) + ; CI-MESA: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) + ; CI-MESA: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] ; CI-MESA: [[TRUNC12:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD12]](s32) ; CI-MESA: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C7]] ; CI-MESA: [[COPY12:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-MESA: [[COPY13:%[0-9]+]]:_(s32) = COPY [[LOAD13]](s32) ; CI-MESA: [[AND13:%[0-9]+]]:_(s32) = G_AND [[COPY13]], [[C9]] - ; CI-MESA: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY12]](s32) - ; CI-MESA: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) - ; CI-MESA: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] + ; CI-MESA: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY12]](s32) + ; CI-MESA: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL8]](s32) + ; CI-MESA: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] ; CI-MESA: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; CI-MESA: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C7]] ; CI-MESA: [[COPY14:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-MESA: [[COPY15:%[0-9]+]]:_(s32) = COPY [[LOAD15]](s32) ; CI-MESA: [[AND15:%[0-9]+]]:_(s32) = G_AND [[COPY15]], [[C9]] - ; CI-MESA: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY14]](s32) - ; CI-MESA: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) - ; CI-MESA: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] - ; CI-MESA: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16), [[OR6]](s16), [[OR7]](s16) - ; CI-MESA: [[C11:%[0-9]+]]:_(s64) = G_CONSTANT i64 16 - ; CI-MESA: [[GEP15:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C11]](s64) + ; CI-MESA: 
[[SHL9:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY14]](s32) + ; CI-MESA: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL9]](s32) + ; CI-MESA: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] + ; CI-MESA: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; CI-MESA: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; CI-MESA: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C10]](s32) + ; CI-MESA: [[OR10:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL10]] + ; CI-MESA: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR8]](s16) + ; CI-MESA: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; CI-MESA: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C10]](s32) + ; CI-MESA: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; CI-MESA: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR10]](s32), [[OR11]](s32) + ; CI-MESA: [[C12:%[0-9]+]]:_(s64) = G_CONSTANT i64 16 + ; CI-MESA: [[GEP15:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C12]](s64) ; CI-MESA: [[LOAD16:%[0-9]+]]:_(s32) = G_LOAD [[GEP15]](p1) :: (load 1, addrspace 1) ; CI-MESA: [[GEP16:%[0-9]+]]:_(p1) = G_GEP [[GEP15]], [[C]](s64) ; CI-MESA: [[LOAD17:%[0-9]+]]:_(s32) = G_LOAD [[GEP16]](p1) :: (load 1, addrspace 1) @@ -7411,34 +8037,42 @@ body: | ; CI-MESA: [[COPY16:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-MESA: [[COPY17:%[0-9]+]]:_(s32) = COPY [[LOAD17]](s32) ; CI-MESA: [[AND17:%[0-9]+]]:_(s32) = G_AND [[COPY17]], [[C9]] - ; CI-MESA: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[AND17]], [[COPY16]](s32) - ; CI-MESA: [[TRUNC17:%[0-9]+]]:_(s16) = G_TRUNC [[SHL8]](s32) - ; CI-MESA: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[TRUNC17]] + ; CI-MESA: [[SHL12:%[0-9]+]]:_(s32) = G_SHL [[AND17]], [[COPY16]](s32) + ; CI-MESA: [[TRUNC17:%[0-9]+]]:_(s16) = G_TRUNC [[SHL12]](s32) + ; CI-MESA: [[OR12:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[TRUNC17]] ; CI-MESA: [[TRUNC18:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD18]](s32) ; CI-MESA: [[AND18:%[0-9]+]]:_(s16) = G_AND [[TRUNC18]], [[C7]] ; CI-MESA: [[COPY18:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-MESA: [[COPY19:%[0-9]+]]:_(s32) = COPY 
[[LOAD19]](s32) ; CI-MESA: [[AND19:%[0-9]+]]:_(s32) = G_AND [[COPY19]], [[C9]] - ; CI-MESA: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[AND19]], [[COPY18]](s32) - ; CI-MESA: [[TRUNC19:%[0-9]+]]:_(s16) = G_TRUNC [[SHL9]](s32) - ; CI-MESA: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[TRUNC19]] + ; CI-MESA: [[SHL13:%[0-9]+]]:_(s32) = G_SHL [[AND19]], [[COPY18]](s32) + ; CI-MESA: [[TRUNC19:%[0-9]+]]:_(s16) = G_TRUNC [[SHL13]](s32) + ; CI-MESA: [[OR13:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[TRUNC19]] ; CI-MESA: [[TRUNC20:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD20]](s32) ; CI-MESA: [[AND20:%[0-9]+]]:_(s16) = G_AND [[TRUNC20]], [[C7]] ; CI-MESA: [[COPY20:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-MESA: [[COPY21:%[0-9]+]]:_(s32) = COPY [[LOAD21]](s32) ; CI-MESA: [[AND21:%[0-9]+]]:_(s32) = G_AND [[COPY21]], [[C9]] - ; CI-MESA: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[AND21]], [[COPY20]](s32) - ; CI-MESA: [[TRUNC21:%[0-9]+]]:_(s16) = G_TRUNC [[SHL10]](s32) - ; CI-MESA: [[OR10:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[TRUNC21]] + ; CI-MESA: [[SHL14:%[0-9]+]]:_(s32) = G_SHL [[AND21]], [[COPY20]](s32) + ; CI-MESA: [[TRUNC21:%[0-9]+]]:_(s16) = G_TRUNC [[SHL14]](s32) + ; CI-MESA: [[OR14:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[TRUNC21]] ; CI-MESA: [[TRUNC22:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD22]](s32) ; CI-MESA: [[AND22:%[0-9]+]]:_(s16) = G_AND [[TRUNC22]], [[C7]] ; CI-MESA: [[COPY22:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-MESA: [[COPY23:%[0-9]+]]:_(s32) = COPY [[LOAD23]](s32) ; CI-MESA: [[AND23:%[0-9]+]]:_(s32) = G_AND [[COPY23]], [[C9]] - ; CI-MESA: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[AND23]], [[COPY22]](s32) - ; CI-MESA: [[TRUNC23:%[0-9]+]]:_(s16) = G_TRUNC [[SHL11]](s32) - ; CI-MESA: [[OR11:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[TRUNC23]] - ; CI-MESA: [[MV2:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR8]](s16), [[OR9]](s16), [[OR10]](s16), [[OR11]](s16) + ; CI-MESA: [[SHL15:%[0-9]+]]:_(s32) = G_SHL [[AND23]], [[COPY22]](s32) + ; CI-MESA: [[TRUNC23:%[0-9]+]]:_(s16) = G_TRUNC [[SHL15]](s32) + ; CI-MESA: [[OR15:%[0-9]+]]:_(s16) 
= G_OR [[AND22]], [[TRUNC23]] + ; CI-MESA: [[ZEXT8:%[0-9]+]]:_(s32) = G_ZEXT [[OR12]](s16) + ; CI-MESA: [[ZEXT9:%[0-9]+]]:_(s32) = G_ZEXT [[OR13]](s16) + ; CI-MESA: [[SHL16:%[0-9]+]]:_(s32) = G_SHL [[ZEXT9]], [[C10]](s32) + ; CI-MESA: [[OR16:%[0-9]+]]:_(s32) = G_OR [[ZEXT8]], [[SHL16]] + ; CI-MESA: [[ZEXT10:%[0-9]+]]:_(s32) = G_ZEXT [[OR14]](s16) + ; CI-MESA: [[ZEXT11:%[0-9]+]]:_(s32) = G_ZEXT [[OR15]](s16) + ; CI-MESA: [[SHL17:%[0-9]+]]:_(s32) = G_SHL [[ZEXT11]], [[C10]](s32) + ; CI-MESA: [[OR17:%[0-9]+]]:_(s32) = G_OR [[ZEXT10]], [[SHL17]] + ; CI-MESA: [[MV2:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR16]](s32), [[OR17]](s32) ; CI-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s64>) = G_BUILD_VECTOR [[MV]](s64), [[MV1]](s64), [[MV2]](s64) ; CI-MESA: [[DEF:%[0-9]+]]:_(<4 x s64>) = G_IMPLICIT_DEF ; CI-MESA: [[INSERT:%[0-9]+]]:_(<4 x s64>) = G_INSERT [[DEF]], [[BUILD_VECTOR]](<3 x s64>), 0 @@ -7493,9 +8127,18 @@ body: | ; VI: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C7]] ; VI: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C8]](s16) ; VI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; VI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) - ; VI: [[C9:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 - ; VI: [[GEP7:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C9]](s64) + ; VI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; VI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; VI: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C9]](s32) + ; VI: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; VI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; VI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; VI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C9]](s32) + ; VI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; VI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) + ; VI: [[C10:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 + ; VI: [[GEP7:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C10]](s64) ; 
VI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p1) :: (load 1, addrspace 1) ; VI: [[GEP8:%[0-9]+]]:_(p1) = G_GEP [[GEP7]], [[C]](s64) ; VI: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p1) :: (load 1, addrspace 1) @@ -7515,29 +8158,37 @@ body: | ; VI: [[AND8:%[0-9]+]]:_(s16) = G_AND [[TRUNC8]], [[C7]] ; VI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD9]](s32) ; VI: [[AND9:%[0-9]+]]:_(s16) = G_AND [[TRUNC9]], [[C7]] - ; VI: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) - ; VI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL4]] + ; VI: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) + ; VI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL6]] ; VI: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; VI: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C7]] ; VI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD11]](s32) ; VI: [[AND11:%[0-9]+]]:_(s16) = G_AND [[TRUNC11]], [[C7]] - ; VI: [[SHL5:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) - ; VI: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL5]] + ; VI: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) + ; VI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL7]] ; VI: [[TRUNC12:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD12]](s32) ; VI: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C7]] ; VI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD13]](s32) ; VI: [[AND13:%[0-9]+]]:_(s16) = G_AND [[TRUNC13]], [[C7]] - ; VI: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C8]](s16) - ; VI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL6]] + ; VI: [[SHL8:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C8]](s16) + ; VI: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL8]] ; VI: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; VI: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C7]] ; VI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD15]](s32) ; VI: [[AND15:%[0-9]+]]:_(s16) = G_AND [[TRUNC15]], [[C7]] - ; VI: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C8]](s16) - ; VI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL7]] - ; VI: [[MV1:%[0-9]+]]:_(s64) = 
G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16), [[OR6]](s16), [[OR7]](s16) - ; VI: [[C10:%[0-9]+]]:_(s64) = G_CONSTANT i64 16 - ; VI: [[GEP15:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C10]](s64) + ; VI: [[SHL9:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C8]](s16) + ; VI: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL9]] + ; VI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; VI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; VI: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C9]](s32) + ; VI: [[OR10:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL10]] + ; VI: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR8]](s16) + ; VI: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; VI: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C9]](s32) + ; VI: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; VI: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR10]](s32), [[OR11]](s32) + ; VI: [[C11:%[0-9]+]]:_(s64) = G_CONSTANT i64 16 + ; VI: [[GEP15:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C11]](s64) ; VI: [[LOAD16:%[0-9]+]]:_(s32) = G_LOAD [[GEP15]](p1) :: (load 1, addrspace 1) ; VI: [[GEP16:%[0-9]+]]:_(p1) = G_GEP [[GEP15]], [[C]](s64) ; VI: [[LOAD17:%[0-9]+]]:_(s32) = G_LOAD [[GEP16]](p1) :: (load 1, addrspace 1) @@ -7557,27 +8208,35 @@ body: | ; VI: [[AND16:%[0-9]+]]:_(s16) = G_AND [[TRUNC16]], [[C7]] ; VI: [[TRUNC17:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD17]](s32) ; VI: [[AND17:%[0-9]+]]:_(s16) = G_AND [[TRUNC17]], [[C7]] - ; VI: [[SHL8:%[0-9]+]]:_(s16) = G_SHL [[AND17]], [[C8]](s16) - ; VI: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[SHL8]] + ; VI: [[SHL12:%[0-9]+]]:_(s16) = G_SHL [[AND17]], [[C8]](s16) + ; VI: [[OR12:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[SHL12]] ; VI: [[TRUNC18:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD18]](s32) ; VI: [[AND18:%[0-9]+]]:_(s16) = G_AND [[TRUNC18]], [[C7]] ; VI: [[TRUNC19:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD19]](s32) ; VI: [[AND19:%[0-9]+]]:_(s16) = G_AND [[TRUNC19]], [[C7]] - ; VI: [[SHL9:%[0-9]+]]:_(s16) = G_SHL [[AND19]], [[C8]](s16) - ; VI: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[SHL9]] + ; 
VI: [[SHL13:%[0-9]+]]:_(s16) = G_SHL [[AND19]], [[C8]](s16) + ; VI: [[OR13:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[SHL13]] ; VI: [[TRUNC20:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD20]](s32) ; VI: [[AND20:%[0-9]+]]:_(s16) = G_AND [[TRUNC20]], [[C7]] ; VI: [[TRUNC21:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD21]](s32) ; VI: [[AND21:%[0-9]+]]:_(s16) = G_AND [[TRUNC21]], [[C7]] - ; VI: [[SHL10:%[0-9]+]]:_(s16) = G_SHL [[AND21]], [[C8]](s16) - ; VI: [[OR10:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[SHL10]] + ; VI: [[SHL14:%[0-9]+]]:_(s16) = G_SHL [[AND21]], [[C8]](s16) + ; VI: [[OR14:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[SHL14]] ; VI: [[TRUNC22:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD22]](s32) ; VI: [[AND22:%[0-9]+]]:_(s16) = G_AND [[TRUNC22]], [[C7]] ; VI: [[TRUNC23:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD23]](s32) ; VI: [[AND23:%[0-9]+]]:_(s16) = G_AND [[TRUNC23]], [[C7]] - ; VI: [[SHL11:%[0-9]+]]:_(s16) = G_SHL [[AND23]], [[C8]](s16) - ; VI: [[OR11:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[SHL11]] - ; VI: [[MV2:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR8]](s16), [[OR9]](s16), [[OR10]](s16), [[OR11]](s16) + ; VI: [[SHL15:%[0-9]+]]:_(s16) = G_SHL [[AND23]], [[C8]](s16) + ; VI: [[OR15:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[SHL15]] + ; VI: [[ZEXT8:%[0-9]+]]:_(s32) = G_ZEXT [[OR12]](s16) + ; VI: [[ZEXT9:%[0-9]+]]:_(s32) = G_ZEXT [[OR13]](s16) + ; VI: [[SHL16:%[0-9]+]]:_(s32) = G_SHL [[ZEXT9]], [[C9]](s32) + ; VI: [[OR16:%[0-9]+]]:_(s32) = G_OR [[ZEXT8]], [[SHL16]] + ; VI: [[ZEXT10:%[0-9]+]]:_(s32) = G_ZEXT [[OR14]](s16) + ; VI: [[ZEXT11:%[0-9]+]]:_(s32) = G_ZEXT [[OR15]](s16) + ; VI: [[SHL17:%[0-9]+]]:_(s32) = G_SHL [[ZEXT11]], [[C9]](s32) + ; VI: [[OR17:%[0-9]+]]:_(s32) = G_OR [[ZEXT10]], [[SHL17]] + ; VI: [[MV2:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR16]](s32), [[OR17]](s32) ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s64>) = G_BUILD_VECTOR [[MV]](s64), [[MV1]](s64), [[MV2]](s64) ; VI: [[DEF:%[0-9]+]]:_(<4 x s64>) = G_IMPLICIT_DEF ; VI: [[INSERT:%[0-9]+]]:_(<4 x s64>) = G_INSERT [[DEF]], [[BUILD_VECTOR]](<3 x s64>), 0 @@ -7638,9 
+8297,18 @@ body: | ; GFX9-MESA: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C7]] ; GFX9-MESA: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C8]](s16) ; GFX9-MESA: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; GFX9-MESA: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) - ; GFX9-MESA: [[C9:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 - ; GFX9-MESA: [[GEP7:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C9]](s64) + ; GFX9-MESA: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9-MESA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9-MESA: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9-MESA: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C9]](s32) + ; GFX9-MESA: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; GFX9-MESA: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; GFX9-MESA: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; GFX9-MESA: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C9]](s32) + ; GFX9-MESA: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; GFX9-MESA: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) + ; GFX9-MESA: [[C10:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 + ; GFX9-MESA: [[GEP7:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C10]](s64) ; GFX9-MESA: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p1) :: (load 1, addrspace 1) ; GFX9-MESA: [[GEP8:%[0-9]+]]:_(p1) = G_GEP [[GEP7]], [[C]](s64) ; GFX9-MESA: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p1) :: (load 1, addrspace 1) @@ -7660,29 +8328,37 @@ body: | ; GFX9-MESA: [[AND8:%[0-9]+]]:_(s16) = G_AND [[TRUNC8]], [[C7]] ; GFX9-MESA: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD9]](s32) ; GFX9-MESA: [[AND9:%[0-9]+]]:_(s16) = G_AND [[TRUNC9]], [[C7]] - ; GFX9-MESA: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) - ; GFX9-MESA: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL4]] + ; GFX9-MESA: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) + ; GFX9-MESA: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL6]] ; GFX9-MESA: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC 
[[LOAD10]](s32) ; GFX9-MESA: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C7]] ; GFX9-MESA: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD11]](s32) ; GFX9-MESA: [[AND11:%[0-9]+]]:_(s16) = G_AND [[TRUNC11]], [[C7]] - ; GFX9-MESA: [[SHL5:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) - ; GFX9-MESA: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL5]] + ; GFX9-MESA: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) + ; GFX9-MESA: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL7]] ; GFX9-MESA: [[TRUNC12:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD12]](s32) ; GFX9-MESA: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C7]] ; GFX9-MESA: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD13]](s32) ; GFX9-MESA: [[AND13:%[0-9]+]]:_(s16) = G_AND [[TRUNC13]], [[C7]] - ; GFX9-MESA: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C8]](s16) - ; GFX9-MESA: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL6]] + ; GFX9-MESA: [[SHL8:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C8]](s16) + ; GFX9-MESA: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL8]] ; GFX9-MESA: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; GFX9-MESA: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C7]] ; GFX9-MESA: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD15]](s32) ; GFX9-MESA: [[AND15:%[0-9]+]]:_(s16) = G_AND [[TRUNC15]], [[C7]] - ; GFX9-MESA: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C8]](s16) - ; GFX9-MESA: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL7]] - ; GFX9-MESA: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16), [[OR6]](s16), [[OR7]](s16) - ; GFX9-MESA: [[C10:%[0-9]+]]:_(s64) = G_CONSTANT i64 16 - ; GFX9-MESA: [[GEP15:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C10]](s64) + ; GFX9-MESA: [[SHL9:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C8]](s16) + ; GFX9-MESA: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL9]] + ; GFX9-MESA: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; GFX9-MESA: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; GFX9-MESA: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C9]](s32) + ; GFX9-MESA: 
[[OR10:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL10]] + ; GFX9-MESA: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR8]](s16) + ; GFX9-MESA: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; GFX9-MESA: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C9]](s32) + ; GFX9-MESA: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; GFX9-MESA: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR10]](s32), [[OR11]](s32) + ; GFX9-MESA: [[C11:%[0-9]+]]:_(s64) = G_CONSTANT i64 16 + ; GFX9-MESA: [[GEP15:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C11]](s64) ; GFX9-MESA: [[LOAD16:%[0-9]+]]:_(s32) = G_LOAD [[GEP15]](p1) :: (load 1, addrspace 1) ; GFX9-MESA: [[GEP16:%[0-9]+]]:_(p1) = G_GEP [[GEP15]], [[C]](s64) ; GFX9-MESA: [[LOAD17:%[0-9]+]]:_(s32) = G_LOAD [[GEP16]](p1) :: (load 1, addrspace 1) @@ -7702,27 +8378,35 @@ body: | ; GFX9-MESA: [[AND16:%[0-9]+]]:_(s16) = G_AND [[TRUNC16]], [[C7]] ; GFX9-MESA: [[TRUNC17:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD17]](s32) ; GFX9-MESA: [[AND17:%[0-9]+]]:_(s16) = G_AND [[TRUNC17]], [[C7]] - ; GFX9-MESA: [[SHL8:%[0-9]+]]:_(s16) = G_SHL [[AND17]], [[C8]](s16) - ; GFX9-MESA: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[SHL8]] + ; GFX9-MESA: [[SHL12:%[0-9]+]]:_(s16) = G_SHL [[AND17]], [[C8]](s16) + ; GFX9-MESA: [[OR12:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[SHL12]] ; GFX9-MESA: [[TRUNC18:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD18]](s32) ; GFX9-MESA: [[AND18:%[0-9]+]]:_(s16) = G_AND [[TRUNC18]], [[C7]] ; GFX9-MESA: [[TRUNC19:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD19]](s32) ; GFX9-MESA: [[AND19:%[0-9]+]]:_(s16) = G_AND [[TRUNC19]], [[C7]] - ; GFX9-MESA: [[SHL9:%[0-9]+]]:_(s16) = G_SHL [[AND19]], [[C8]](s16) - ; GFX9-MESA: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[SHL9]] + ; GFX9-MESA: [[SHL13:%[0-9]+]]:_(s16) = G_SHL [[AND19]], [[C8]](s16) + ; GFX9-MESA: [[OR13:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[SHL13]] ; GFX9-MESA: [[TRUNC20:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD20]](s32) ; GFX9-MESA: [[AND20:%[0-9]+]]:_(s16) = G_AND [[TRUNC20]], [[C7]] ; GFX9-MESA: [[TRUNC21:%[0-9]+]]:_(s16) = G_TRUNC 
[[LOAD21]](s32) ; GFX9-MESA: [[AND21:%[0-9]+]]:_(s16) = G_AND [[TRUNC21]], [[C7]] - ; GFX9-MESA: [[SHL10:%[0-9]+]]:_(s16) = G_SHL [[AND21]], [[C8]](s16) - ; GFX9-MESA: [[OR10:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[SHL10]] + ; GFX9-MESA: [[SHL14:%[0-9]+]]:_(s16) = G_SHL [[AND21]], [[C8]](s16) + ; GFX9-MESA: [[OR14:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[SHL14]] ; GFX9-MESA: [[TRUNC22:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD22]](s32) ; GFX9-MESA: [[AND22:%[0-9]+]]:_(s16) = G_AND [[TRUNC22]], [[C7]] ; GFX9-MESA: [[TRUNC23:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD23]](s32) ; GFX9-MESA: [[AND23:%[0-9]+]]:_(s16) = G_AND [[TRUNC23]], [[C7]] - ; GFX9-MESA: [[SHL11:%[0-9]+]]:_(s16) = G_SHL [[AND23]], [[C8]](s16) - ; GFX9-MESA: [[OR11:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[SHL11]] - ; GFX9-MESA: [[MV2:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR8]](s16), [[OR9]](s16), [[OR10]](s16), [[OR11]](s16) + ; GFX9-MESA: [[SHL15:%[0-9]+]]:_(s16) = G_SHL [[AND23]], [[C8]](s16) + ; GFX9-MESA: [[OR15:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[SHL15]] + ; GFX9-MESA: [[ZEXT8:%[0-9]+]]:_(s32) = G_ZEXT [[OR12]](s16) + ; GFX9-MESA: [[ZEXT9:%[0-9]+]]:_(s32) = G_ZEXT [[OR13]](s16) + ; GFX9-MESA: [[SHL16:%[0-9]+]]:_(s32) = G_SHL [[ZEXT9]], [[C9]](s32) + ; GFX9-MESA: [[OR16:%[0-9]+]]:_(s32) = G_OR [[ZEXT8]], [[SHL16]] + ; GFX9-MESA: [[ZEXT10:%[0-9]+]]:_(s32) = G_ZEXT [[OR14]](s16) + ; GFX9-MESA: [[ZEXT11:%[0-9]+]]:_(s32) = G_ZEXT [[OR15]](s16) + ; GFX9-MESA: [[SHL17:%[0-9]+]]:_(s32) = G_SHL [[ZEXT11]], [[C9]](s32) + ; GFX9-MESA: [[OR17:%[0-9]+]]:_(s32) = G_OR [[ZEXT10]], [[SHL17]] + ; GFX9-MESA: [[MV2:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR16]](s32), [[OR17]](s32) ; GFX9-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s64>) = G_BUILD_VECTOR [[MV]](s64), [[MV1]](s64), [[MV2]](s64) ; GFX9-MESA: [[DEF:%[0-9]+]]:_(<4 x s64>) = G_IMPLICIT_DEF ; GFX9-MESA: [[INSERT:%[0-9]+]]:_(<4 x s64>) = G_INSERT [[DEF]], [[BUILD_VECTOR]](<3 x s64>), 0 @@ -7868,9 +8552,18 @@ body: | ; SI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C8]](s32) ; SI: 
[[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) ; SI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; SI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) - ; SI: [[C10:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 - ; SI: [[GEP7:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C10]](s64) + ; SI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; SI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; SI: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; SI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C10]](s32) + ; SI: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; SI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; SI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; SI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C10]](s32) + ; SI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; SI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) + ; SI: [[C11:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 + ; SI: [[GEP7:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C11]](s64) ; SI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p1) :: (load 1, addrspace 1) ; SI: [[GEP8:%[0-9]+]]:_(p1) = G_GEP [[GEP7]], [[C]](s64) ; SI: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p1) :: (load 1, addrspace 1) @@ -7891,36 +8584,44 @@ body: | ; SI: [[COPY8:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; SI: [[COPY9:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32) ; SI: [[AND9:%[0-9]+]]:_(s32) = G_AND [[COPY9]], [[C9]] - ; SI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY8]](s32) - ; SI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) - ; SI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] + ; SI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY8]](s32) + ; SI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) + ; SI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] ; SI: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; SI: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C7]] ; SI: [[COPY10:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; SI: [[COPY11:%[0-9]+]]:_(s32) = COPY 
[[LOAD11]](s32) ; SI: [[AND11:%[0-9]+]]:_(s32) = G_AND [[COPY11]], [[C9]] - ; SI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY10]](s32) - ; SI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL5]](s32) - ; SI: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] + ; SI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY10]](s32) + ; SI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) + ; SI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] ; SI: [[TRUNC12:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD12]](s32) ; SI: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C7]] ; SI: [[COPY12:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; SI: [[COPY13:%[0-9]+]]:_(s32) = COPY [[LOAD13]](s32) ; SI: [[AND13:%[0-9]+]]:_(s32) = G_AND [[COPY13]], [[C9]] - ; SI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY12]](s32) - ; SI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) - ; SI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] + ; SI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY12]](s32) + ; SI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL8]](s32) + ; SI: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] ; SI: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; SI: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C7]] ; SI: [[COPY14:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; SI: [[COPY15:%[0-9]+]]:_(s32) = COPY [[LOAD15]](s32) ; SI: [[AND15:%[0-9]+]]:_(s32) = G_AND [[COPY15]], [[C9]] - ; SI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY14]](s32) - ; SI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) - ; SI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] - ; SI: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16), [[OR6]](s16), [[OR7]](s16) - ; SI: [[C11:%[0-9]+]]:_(s64) = G_CONSTANT i64 16 - ; SI: [[GEP15:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C11]](s64) + ; SI: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY14]](s32) + ; SI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL9]](s32) + ; SI: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] + ; SI: 
[[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; SI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; SI: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C10]](s32) + ; SI: [[OR10:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL10]] + ; SI: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR8]](s16) + ; SI: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; SI: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C10]](s32) + ; SI: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; SI: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR10]](s32), [[OR11]](s32) + ; SI: [[C12:%[0-9]+]]:_(s64) = G_CONSTANT i64 16 + ; SI: [[GEP15:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C12]](s64) ; SI: [[LOAD16:%[0-9]+]]:_(s32) = G_LOAD [[GEP15]](p1) :: (load 1, addrspace 1) ; SI: [[GEP16:%[0-9]+]]:_(p1) = G_GEP [[GEP15]], [[C]](s64) ; SI: [[LOAD17:%[0-9]+]]:_(s32) = G_LOAD [[GEP16]](p1) :: (load 1, addrspace 1) @@ -7941,36 +8642,44 @@ body: | ; SI: [[COPY16:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; SI: [[COPY17:%[0-9]+]]:_(s32) = COPY [[LOAD17]](s32) ; SI: [[AND17:%[0-9]+]]:_(s32) = G_AND [[COPY17]], [[C9]] - ; SI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[AND17]], [[COPY16]](s32) - ; SI: [[TRUNC17:%[0-9]+]]:_(s16) = G_TRUNC [[SHL8]](s32) - ; SI: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[TRUNC17]] + ; SI: [[SHL12:%[0-9]+]]:_(s32) = G_SHL [[AND17]], [[COPY16]](s32) + ; SI: [[TRUNC17:%[0-9]+]]:_(s16) = G_TRUNC [[SHL12]](s32) + ; SI: [[OR12:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[TRUNC17]] ; SI: [[TRUNC18:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD18]](s32) ; SI: [[AND18:%[0-9]+]]:_(s16) = G_AND [[TRUNC18]], [[C7]] ; SI: [[COPY18:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; SI: [[COPY19:%[0-9]+]]:_(s32) = COPY [[LOAD19]](s32) ; SI: [[AND19:%[0-9]+]]:_(s32) = G_AND [[COPY19]], [[C9]] - ; SI: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[AND19]], [[COPY18]](s32) - ; SI: [[TRUNC19:%[0-9]+]]:_(s16) = G_TRUNC [[SHL9]](s32) - ; SI: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[TRUNC19]] + ; SI: [[SHL13:%[0-9]+]]:_(s32) = G_SHL [[AND19]], [[COPY18]](s32) + ; SI: 
[[TRUNC19:%[0-9]+]]:_(s16) = G_TRUNC [[SHL13]](s32) + ; SI: [[OR13:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[TRUNC19]] ; SI: [[TRUNC20:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD20]](s32) ; SI: [[AND20:%[0-9]+]]:_(s16) = G_AND [[TRUNC20]], [[C7]] ; SI: [[COPY20:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; SI: [[COPY21:%[0-9]+]]:_(s32) = COPY [[LOAD21]](s32) ; SI: [[AND21:%[0-9]+]]:_(s32) = G_AND [[COPY21]], [[C9]] - ; SI: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[AND21]], [[COPY20]](s32) - ; SI: [[TRUNC21:%[0-9]+]]:_(s16) = G_TRUNC [[SHL10]](s32) - ; SI: [[OR10:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[TRUNC21]] + ; SI: [[SHL14:%[0-9]+]]:_(s32) = G_SHL [[AND21]], [[COPY20]](s32) + ; SI: [[TRUNC21:%[0-9]+]]:_(s16) = G_TRUNC [[SHL14]](s32) + ; SI: [[OR14:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[TRUNC21]] ; SI: [[TRUNC22:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD22]](s32) ; SI: [[AND22:%[0-9]+]]:_(s16) = G_AND [[TRUNC22]], [[C7]] ; SI: [[COPY22:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; SI: [[COPY23:%[0-9]+]]:_(s32) = COPY [[LOAD23]](s32) ; SI: [[AND23:%[0-9]+]]:_(s32) = G_AND [[COPY23]], [[C9]] - ; SI: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[AND23]], [[COPY22]](s32) - ; SI: [[TRUNC23:%[0-9]+]]:_(s16) = G_TRUNC [[SHL11]](s32) - ; SI: [[OR11:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[TRUNC23]] - ; SI: [[MV2:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR8]](s16), [[OR9]](s16), [[OR10]](s16), [[OR11]](s16) - ; SI: [[C12:%[0-9]+]]:_(s64) = G_CONSTANT i64 24 - ; SI: [[GEP23:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C12]](s64) + ; SI: [[SHL15:%[0-9]+]]:_(s32) = G_SHL [[AND23]], [[COPY22]](s32) + ; SI: [[TRUNC23:%[0-9]+]]:_(s16) = G_TRUNC [[SHL15]](s32) + ; SI: [[OR15:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[TRUNC23]] + ; SI: [[ZEXT8:%[0-9]+]]:_(s32) = G_ZEXT [[OR12]](s16) + ; SI: [[ZEXT9:%[0-9]+]]:_(s32) = G_ZEXT [[OR13]](s16) + ; SI: [[SHL16:%[0-9]+]]:_(s32) = G_SHL [[ZEXT9]], [[C10]](s32) + ; SI: [[OR16:%[0-9]+]]:_(s32) = G_OR [[ZEXT8]], [[SHL16]] + ; SI: [[ZEXT10:%[0-9]+]]:_(s32) = G_ZEXT [[OR14]](s16) + ; SI: [[ZEXT11:%[0-9]+]]:_(s32) = G_ZEXT 
[[OR15]](s16) + ; SI: [[SHL17:%[0-9]+]]:_(s32) = G_SHL [[ZEXT11]], [[C10]](s32) + ; SI: [[OR17:%[0-9]+]]:_(s32) = G_OR [[ZEXT10]], [[SHL17]] + ; SI: [[MV2:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR16]](s32), [[OR17]](s32) + ; SI: [[C13:%[0-9]+]]:_(s64) = G_CONSTANT i64 24 + ; SI: [[GEP23:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C13]](s64) ; SI: [[LOAD24:%[0-9]+]]:_(s32) = G_LOAD [[GEP23]](p1) :: (load 1, addrspace 1) ; SI: [[GEP24:%[0-9]+]]:_(p1) = G_GEP [[GEP23]], [[C]](s64) ; SI: [[LOAD25:%[0-9]+]]:_(s32) = G_LOAD [[GEP24]](p1) :: (load 1, addrspace 1) @@ -7991,34 +8700,42 @@ body: | ; SI: [[COPY24:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; SI: [[COPY25:%[0-9]+]]:_(s32) = COPY [[LOAD25]](s32) ; SI: [[AND25:%[0-9]+]]:_(s32) = G_AND [[COPY25]], [[C9]] - ; SI: [[SHL12:%[0-9]+]]:_(s32) = G_SHL [[AND25]], [[COPY24]](s32) - ; SI: [[TRUNC25:%[0-9]+]]:_(s16) = G_TRUNC [[SHL12]](s32) - ; SI: [[OR12:%[0-9]+]]:_(s16) = G_OR [[AND24]], [[TRUNC25]] + ; SI: [[SHL18:%[0-9]+]]:_(s32) = G_SHL [[AND25]], [[COPY24]](s32) + ; SI: [[TRUNC25:%[0-9]+]]:_(s16) = G_TRUNC [[SHL18]](s32) + ; SI: [[OR18:%[0-9]+]]:_(s16) = G_OR [[AND24]], [[TRUNC25]] ; SI: [[TRUNC26:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD26]](s32) ; SI: [[AND26:%[0-9]+]]:_(s16) = G_AND [[TRUNC26]], [[C7]] ; SI: [[COPY26:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; SI: [[COPY27:%[0-9]+]]:_(s32) = COPY [[LOAD27]](s32) ; SI: [[AND27:%[0-9]+]]:_(s32) = G_AND [[COPY27]], [[C9]] - ; SI: [[SHL13:%[0-9]+]]:_(s32) = G_SHL [[AND27]], [[COPY26]](s32) - ; SI: [[TRUNC27:%[0-9]+]]:_(s16) = G_TRUNC [[SHL13]](s32) - ; SI: [[OR13:%[0-9]+]]:_(s16) = G_OR [[AND26]], [[TRUNC27]] + ; SI: [[SHL19:%[0-9]+]]:_(s32) = G_SHL [[AND27]], [[COPY26]](s32) + ; SI: [[TRUNC27:%[0-9]+]]:_(s16) = G_TRUNC [[SHL19]](s32) + ; SI: [[OR19:%[0-9]+]]:_(s16) = G_OR [[AND26]], [[TRUNC27]] ; SI: [[TRUNC28:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD28]](s32) ; SI: [[AND28:%[0-9]+]]:_(s16) = G_AND [[TRUNC28]], [[C7]] ; SI: [[COPY28:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; SI: [[COPY29:%[0-9]+]]:_(s32) = COPY 
[[LOAD29]](s32) ; SI: [[AND29:%[0-9]+]]:_(s32) = G_AND [[COPY29]], [[C9]] - ; SI: [[SHL14:%[0-9]+]]:_(s32) = G_SHL [[AND29]], [[COPY28]](s32) - ; SI: [[TRUNC29:%[0-9]+]]:_(s16) = G_TRUNC [[SHL14]](s32) - ; SI: [[OR14:%[0-9]+]]:_(s16) = G_OR [[AND28]], [[TRUNC29]] + ; SI: [[SHL20:%[0-9]+]]:_(s32) = G_SHL [[AND29]], [[COPY28]](s32) + ; SI: [[TRUNC29:%[0-9]+]]:_(s16) = G_TRUNC [[SHL20]](s32) + ; SI: [[OR20:%[0-9]+]]:_(s16) = G_OR [[AND28]], [[TRUNC29]] ; SI: [[TRUNC30:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD30]](s32) ; SI: [[AND30:%[0-9]+]]:_(s16) = G_AND [[TRUNC30]], [[C7]] ; SI: [[COPY30:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; SI: [[COPY31:%[0-9]+]]:_(s32) = COPY [[LOAD31]](s32) ; SI: [[AND31:%[0-9]+]]:_(s32) = G_AND [[COPY31]], [[C9]] - ; SI: [[SHL15:%[0-9]+]]:_(s32) = G_SHL [[AND31]], [[COPY30]](s32) - ; SI: [[TRUNC31:%[0-9]+]]:_(s16) = G_TRUNC [[SHL15]](s32) - ; SI: [[OR15:%[0-9]+]]:_(s16) = G_OR [[AND30]], [[TRUNC31]] - ; SI: [[MV3:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR12]](s16), [[OR13]](s16), [[OR14]](s16), [[OR15]](s16) + ; SI: [[SHL21:%[0-9]+]]:_(s32) = G_SHL [[AND31]], [[COPY30]](s32) + ; SI: [[TRUNC31:%[0-9]+]]:_(s16) = G_TRUNC [[SHL21]](s32) + ; SI: [[OR21:%[0-9]+]]:_(s16) = G_OR [[AND30]], [[TRUNC31]] + ; SI: [[ZEXT12:%[0-9]+]]:_(s32) = G_ZEXT [[OR18]](s16) + ; SI: [[ZEXT13:%[0-9]+]]:_(s32) = G_ZEXT [[OR19]](s16) + ; SI: [[SHL22:%[0-9]+]]:_(s32) = G_SHL [[ZEXT13]], [[C10]](s32) + ; SI: [[OR22:%[0-9]+]]:_(s32) = G_OR [[ZEXT12]], [[SHL22]] + ; SI: [[ZEXT14:%[0-9]+]]:_(s32) = G_ZEXT [[OR20]](s16) + ; SI: [[ZEXT15:%[0-9]+]]:_(s32) = G_ZEXT [[OR21]](s16) + ; SI: [[SHL23:%[0-9]+]]:_(s32) = G_SHL [[ZEXT15]], [[C10]](s32) + ; SI: [[OR23:%[0-9]+]]:_(s32) = G_OR [[ZEXT14]], [[SHL23]] + ; SI: [[MV3:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR22]](s32), [[OR23]](s32) ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s64>) = G_BUILD_VECTOR [[MV]](s64), [[MV1]](s64), [[MV2]](s64), [[MV3]](s64) ; SI: $vgpr0_vgpr1_vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7 = COPY [[BUILD_VECTOR]](<4 x s64>) ; 
CI-HSA-LABEL: name: test_load_global_v4s64_align1 @@ -8083,9 +8800,18 @@ body: | ; CI-MESA: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C8]](s32) ; CI-MESA: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) ; CI-MESA: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; CI-MESA: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) - ; CI-MESA: [[C10:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 - ; CI-MESA: [[GEP7:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C10]](s64) + ; CI-MESA: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI-MESA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI-MESA: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-MESA: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C10]](s32) + ; CI-MESA: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; CI-MESA: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; CI-MESA: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI-MESA: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C10]](s32) + ; CI-MESA: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; CI-MESA: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) + ; CI-MESA: [[C11:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 + ; CI-MESA: [[GEP7:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C11]](s64) ; CI-MESA: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p1) :: (load 1, addrspace 1) ; CI-MESA: [[GEP8:%[0-9]+]]:_(p1) = G_GEP [[GEP7]], [[C]](s64) ; CI-MESA: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p1) :: (load 1, addrspace 1) @@ -8106,36 +8832,44 @@ body: | ; CI-MESA: [[COPY8:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-MESA: [[COPY9:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32) ; CI-MESA: [[AND9:%[0-9]+]]:_(s32) = G_AND [[COPY9]], [[C9]] - ; CI-MESA: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY8]](s32) - ; CI-MESA: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) - ; CI-MESA: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] + ; CI-MESA: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY8]](s32) + ; CI-MESA: [[TRUNC9:%[0-9]+]]:_(s16) = 
G_TRUNC [[SHL6]](s32) + ; CI-MESA: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] ; CI-MESA: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; CI-MESA: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C7]] ; CI-MESA: [[COPY10:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-MESA: [[COPY11:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32) ; CI-MESA: [[AND11:%[0-9]+]]:_(s32) = G_AND [[COPY11]], [[C9]] - ; CI-MESA: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY10]](s32) - ; CI-MESA: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL5]](s32) - ; CI-MESA: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] + ; CI-MESA: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY10]](s32) + ; CI-MESA: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) + ; CI-MESA: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] ; CI-MESA: [[TRUNC12:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD12]](s32) ; CI-MESA: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C7]] ; CI-MESA: [[COPY12:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-MESA: [[COPY13:%[0-9]+]]:_(s32) = COPY [[LOAD13]](s32) ; CI-MESA: [[AND13:%[0-9]+]]:_(s32) = G_AND [[COPY13]], [[C9]] - ; CI-MESA: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY12]](s32) - ; CI-MESA: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) - ; CI-MESA: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] + ; CI-MESA: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY12]](s32) + ; CI-MESA: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL8]](s32) + ; CI-MESA: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] ; CI-MESA: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; CI-MESA: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C7]] ; CI-MESA: [[COPY14:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-MESA: [[COPY15:%[0-9]+]]:_(s32) = COPY [[LOAD15]](s32) ; CI-MESA: [[AND15:%[0-9]+]]:_(s32) = G_AND [[COPY15]], [[C9]] - ; CI-MESA: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY14]](s32) - ; CI-MESA: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) - ; CI-MESA: [[OR7:%[0-9]+]]:_(s16) = G_OR 
[[AND14]], [[TRUNC15]] - ; CI-MESA: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16), [[OR6]](s16), [[OR7]](s16) - ; CI-MESA: [[C11:%[0-9]+]]:_(s64) = G_CONSTANT i64 16 - ; CI-MESA: [[GEP15:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C11]](s64) + ; CI-MESA: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY14]](s32) + ; CI-MESA: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL9]](s32) + ; CI-MESA: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] + ; CI-MESA: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; CI-MESA: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; CI-MESA: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C10]](s32) + ; CI-MESA: [[OR10:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL10]] + ; CI-MESA: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR8]](s16) + ; CI-MESA: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; CI-MESA: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C10]](s32) + ; CI-MESA: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; CI-MESA: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR10]](s32), [[OR11]](s32) + ; CI-MESA: [[C12:%[0-9]+]]:_(s64) = G_CONSTANT i64 16 + ; CI-MESA: [[GEP15:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C12]](s64) ; CI-MESA: [[LOAD16:%[0-9]+]]:_(s32) = G_LOAD [[GEP15]](p1) :: (load 1, addrspace 1) ; CI-MESA: [[GEP16:%[0-9]+]]:_(p1) = G_GEP [[GEP15]], [[C]](s64) ; CI-MESA: [[LOAD17:%[0-9]+]]:_(s32) = G_LOAD [[GEP16]](p1) :: (load 1, addrspace 1) @@ -8156,36 +8890,44 @@ body: | ; CI-MESA: [[COPY16:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-MESA: [[COPY17:%[0-9]+]]:_(s32) = COPY [[LOAD17]](s32) ; CI-MESA: [[AND17:%[0-9]+]]:_(s32) = G_AND [[COPY17]], [[C9]] - ; CI-MESA: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[AND17]], [[COPY16]](s32) - ; CI-MESA: [[TRUNC17:%[0-9]+]]:_(s16) = G_TRUNC [[SHL8]](s32) - ; CI-MESA: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[TRUNC17]] + ; CI-MESA: [[SHL12:%[0-9]+]]:_(s32) = G_SHL [[AND17]], [[COPY16]](s32) + ; CI-MESA: [[TRUNC17:%[0-9]+]]:_(s16) = G_TRUNC [[SHL12]](s32) + ; CI-MESA: [[OR12:%[0-9]+]]:_(s16) = 
G_OR [[AND16]], [[TRUNC17]] ; CI-MESA: [[TRUNC18:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD18]](s32) ; CI-MESA: [[AND18:%[0-9]+]]:_(s16) = G_AND [[TRUNC18]], [[C7]] ; CI-MESA: [[COPY18:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-MESA: [[COPY19:%[0-9]+]]:_(s32) = COPY [[LOAD19]](s32) ; CI-MESA: [[AND19:%[0-9]+]]:_(s32) = G_AND [[COPY19]], [[C9]] - ; CI-MESA: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[AND19]], [[COPY18]](s32) - ; CI-MESA: [[TRUNC19:%[0-9]+]]:_(s16) = G_TRUNC [[SHL9]](s32) - ; CI-MESA: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[TRUNC19]] + ; CI-MESA: [[SHL13:%[0-9]+]]:_(s32) = G_SHL [[AND19]], [[COPY18]](s32) + ; CI-MESA: [[TRUNC19:%[0-9]+]]:_(s16) = G_TRUNC [[SHL13]](s32) + ; CI-MESA: [[OR13:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[TRUNC19]] ; CI-MESA: [[TRUNC20:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD20]](s32) ; CI-MESA: [[AND20:%[0-9]+]]:_(s16) = G_AND [[TRUNC20]], [[C7]] ; CI-MESA: [[COPY20:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-MESA: [[COPY21:%[0-9]+]]:_(s32) = COPY [[LOAD21]](s32) ; CI-MESA: [[AND21:%[0-9]+]]:_(s32) = G_AND [[COPY21]], [[C9]] - ; CI-MESA: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[AND21]], [[COPY20]](s32) - ; CI-MESA: [[TRUNC21:%[0-9]+]]:_(s16) = G_TRUNC [[SHL10]](s32) - ; CI-MESA: [[OR10:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[TRUNC21]] + ; CI-MESA: [[SHL14:%[0-9]+]]:_(s32) = G_SHL [[AND21]], [[COPY20]](s32) + ; CI-MESA: [[TRUNC21:%[0-9]+]]:_(s16) = G_TRUNC [[SHL14]](s32) + ; CI-MESA: [[OR14:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[TRUNC21]] ; CI-MESA: [[TRUNC22:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD22]](s32) ; CI-MESA: [[AND22:%[0-9]+]]:_(s16) = G_AND [[TRUNC22]], [[C7]] ; CI-MESA: [[COPY22:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-MESA: [[COPY23:%[0-9]+]]:_(s32) = COPY [[LOAD23]](s32) ; CI-MESA: [[AND23:%[0-9]+]]:_(s32) = G_AND [[COPY23]], [[C9]] - ; CI-MESA: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[AND23]], [[COPY22]](s32) - ; CI-MESA: [[TRUNC23:%[0-9]+]]:_(s16) = G_TRUNC [[SHL11]](s32) - ; CI-MESA: [[OR11:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[TRUNC23]] - ; CI-MESA: 
[[MV2:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR8]](s16), [[OR9]](s16), [[OR10]](s16), [[OR11]](s16) - ; CI-MESA: [[C12:%[0-9]+]]:_(s64) = G_CONSTANT i64 24 - ; CI-MESA: [[GEP23:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C12]](s64) + ; CI-MESA: [[SHL15:%[0-9]+]]:_(s32) = G_SHL [[AND23]], [[COPY22]](s32) + ; CI-MESA: [[TRUNC23:%[0-9]+]]:_(s16) = G_TRUNC [[SHL15]](s32) + ; CI-MESA: [[OR15:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[TRUNC23]] + ; CI-MESA: [[ZEXT8:%[0-9]+]]:_(s32) = G_ZEXT [[OR12]](s16) + ; CI-MESA: [[ZEXT9:%[0-9]+]]:_(s32) = G_ZEXT [[OR13]](s16) + ; CI-MESA: [[SHL16:%[0-9]+]]:_(s32) = G_SHL [[ZEXT9]], [[C10]](s32) + ; CI-MESA: [[OR16:%[0-9]+]]:_(s32) = G_OR [[ZEXT8]], [[SHL16]] + ; CI-MESA: [[ZEXT10:%[0-9]+]]:_(s32) = G_ZEXT [[OR14]](s16) + ; CI-MESA: [[ZEXT11:%[0-9]+]]:_(s32) = G_ZEXT [[OR15]](s16) + ; CI-MESA: [[SHL17:%[0-9]+]]:_(s32) = G_SHL [[ZEXT11]], [[C10]](s32) + ; CI-MESA: [[OR17:%[0-9]+]]:_(s32) = G_OR [[ZEXT10]], [[SHL17]] + ; CI-MESA: [[MV2:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR16]](s32), [[OR17]](s32) + ; CI-MESA: [[C13:%[0-9]+]]:_(s64) = G_CONSTANT i64 24 + ; CI-MESA: [[GEP23:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C13]](s64) ; CI-MESA: [[LOAD24:%[0-9]+]]:_(s32) = G_LOAD [[GEP23]](p1) :: (load 1, addrspace 1) ; CI-MESA: [[GEP24:%[0-9]+]]:_(p1) = G_GEP [[GEP23]], [[C]](s64) ; CI-MESA: [[LOAD25:%[0-9]+]]:_(s32) = G_LOAD [[GEP24]](p1) :: (load 1, addrspace 1) @@ -8206,34 +8948,42 @@ body: | ; CI-MESA: [[COPY24:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-MESA: [[COPY25:%[0-9]+]]:_(s32) = COPY [[LOAD25]](s32) ; CI-MESA: [[AND25:%[0-9]+]]:_(s32) = G_AND [[COPY25]], [[C9]] - ; CI-MESA: [[SHL12:%[0-9]+]]:_(s32) = G_SHL [[AND25]], [[COPY24]](s32) - ; CI-MESA: [[TRUNC25:%[0-9]+]]:_(s16) = G_TRUNC [[SHL12]](s32) - ; CI-MESA: [[OR12:%[0-9]+]]:_(s16) = G_OR [[AND24]], [[TRUNC25]] + ; CI-MESA: [[SHL18:%[0-9]+]]:_(s32) = G_SHL [[AND25]], [[COPY24]](s32) + ; CI-MESA: [[TRUNC25:%[0-9]+]]:_(s16) = G_TRUNC [[SHL18]](s32) + ; CI-MESA: [[OR18:%[0-9]+]]:_(s16) = G_OR [[AND24]], 
[[TRUNC25]] ; CI-MESA: [[TRUNC26:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD26]](s32) ; CI-MESA: [[AND26:%[0-9]+]]:_(s16) = G_AND [[TRUNC26]], [[C7]] ; CI-MESA: [[COPY26:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-MESA: [[COPY27:%[0-9]+]]:_(s32) = COPY [[LOAD27]](s32) ; CI-MESA: [[AND27:%[0-9]+]]:_(s32) = G_AND [[COPY27]], [[C9]] - ; CI-MESA: [[SHL13:%[0-9]+]]:_(s32) = G_SHL [[AND27]], [[COPY26]](s32) - ; CI-MESA: [[TRUNC27:%[0-9]+]]:_(s16) = G_TRUNC [[SHL13]](s32) - ; CI-MESA: [[OR13:%[0-9]+]]:_(s16) = G_OR [[AND26]], [[TRUNC27]] + ; CI-MESA: [[SHL19:%[0-9]+]]:_(s32) = G_SHL [[AND27]], [[COPY26]](s32) + ; CI-MESA: [[TRUNC27:%[0-9]+]]:_(s16) = G_TRUNC [[SHL19]](s32) + ; CI-MESA: [[OR19:%[0-9]+]]:_(s16) = G_OR [[AND26]], [[TRUNC27]] ; CI-MESA: [[TRUNC28:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD28]](s32) ; CI-MESA: [[AND28:%[0-9]+]]:_(s16) = G_AND [[TRUNC28]], [[C7]] ; CI-MESA: [[COPY28:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-MESA: [[COPY29:%[0-9]+]]:_(s32) = COPY [[LOAD29]](s32) ; CI-MESA: [[AND29:%[0-9]+]]:_(s32) = G_AND [[COPY29]], [[C9]] - ; CI-MESA: [[SHL14:%[0-9]+]]:_(s32) = G_SHL [[AND29]], [[COPY28]](s32) - ; CI-MESA: [[TRUNC29:%[0-9]+]]:_(s16) = G_TRUNC [[SHL14]](s32) - ; CI-MESA: [[OR14:%[0-9]+]]:_(s16) = G_OR [[AND28]], [[TRUNC29]] + ; CI-MESA: [[SHL20:%[0-9]+]]:_(s32) = G_SHL [[AND29]], [[COPY28]](s32) + ; CI-MESA: [[TRUNC29:%[0-9]+]]:_(s16) = G_TRUNC [[SHL20]](s32) + ; CI-MESA: [[OR20:%[0-9]+]]:_(s16) = G_OR [[AND28]], [[TRUNC29]] ; CI-MESA: [[TRUNC30:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD30]](s32) ; CI-MESA: [[AND30:%[0-9]+]]:_(s16) = G_AND [[TRUNC30]], [[C7]] ; CI-MESA: [[COPY30:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-MESA: [[COPY31:%[0-9]+]]:_(s32) = COPY [[LOAD31]](s32) ; CI-MESA: [[AND31:%[0-9]+]]:_(s32) = G_AND [[COPY31]], [[C9]] - ; CI-MESA: [[SHL15:%[0-9]+]]:_(s32) = G_SHL [[AND31]], [[COPY30]](s32) - ; CI-MESA: [[TRUNC31:%[0-9]+]]:_(s16) = G_TRUNC [[SHL15]](s32) - ; CI-MESA: [[OR15:%[0-9]+]]:_(s16) = G_OR [[AND30]], [[TRUNC31]] - ; CI-MESA: [[MV3:%[0-9]+]]:_(s64) = 
G_MERGE_VALUES [[OR12]](s16), [[OR13]](s16), [[OR14]](s16), [[OR15]](s16) + ; CI-MESA: [[SHL21:%[0-9]+]]:_(s32) = G_SHL [[AND31]], [[COPY30]](s32) + ; CI-MESA: [[TRUNC31:%[0-9]+]]:_(s16) = G_TRUNC [[SHL21]](s32) + ; CI-MESA: [[OR21:%[0-9]+]]:_(s16) = G_OR [[AND30]], [[TRUNC31]] + ; CI-MESA: [[ZEXT12:%[0-9]+]]:_(s32) = G_ZEXT [[OR18]](s16) + ; CI-MESA: [[ZEXT13:%[0-9]+]]:_(s32) = G_ZEXT [[OR19]](s16) + ; CI-MESA: [[SHL22:%[0-9]+]]:_(s32) = G_SHL [[ZEXT13]], [[C10]](s32) + ; CI-MESA: [[OR22:%[0-9]+]]:_(s32) = G_OR [[ZEXT12]], [[SHL22]] + ; CI-MESA: [[ZEXT14:%[0-9]+]]:_(s32) = G_ZEXT [[OR20]](s16) + ; CI-MESA: [[ZEXT15:%[0-9]+]]:_(s32) = G_ZEXT [[OR21]](s16) + ; CI-MESA: [[SHL23:%[0-9]+]]:_(s32) = G_SHL [[ZEXT15]], [[C10]](s32) + ; CI-MESA: [[OR23:%[0-9]+]]:_(s32) = G_OR [[ZEXT14]], [[SHL23]] + ; CI-MESA: [[MV3:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR22]](s32), [[OR23]](s32) ; CI-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s64>) = G_BUILD_VECTOR [[MV]](s64), [[MV1]](s64), [[MV2]](s64), [[MV3]](s64) ; CI-MESA: $vgpr0_vgpr1_vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7 = COPY [[BUILD_VECTOR]](<4 x s64>) ; VI-LABEL: name: test_load_global_v4s64_align1 @@ -8286,9 +9036,18 @@ body: | ; VI: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C7]] ; VI: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C8]](s16) ; VI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; VI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) - ; VI: [[C9:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 - ; VI: [[GEP7:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C9]](s64) + ; VI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; VI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; VI: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C9]](s32) + ; VI: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; VI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; VI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; VI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], 
[[C9]](s32) + ; VI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; VI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) + ; VI: [[C10:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 + ; VI: [[GEP7:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C10]](s64) ; VI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p1) :: (load 1, addrspace 1) ; VI: [[GEP8:%[0-9]+]]:_(p1) = G_GEP [[GEP7]], [[C]](s64) ; VI: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p1) :: (load 1, addrspace 1) @@ -8308,29 +9067,37 @@ body: | ; VI: [[AND8:%[0-9]+]]:_(s16) = G_AND [[TRUNC8]], [[C7]] ; VI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD9]](s32) ; VI: [[AND9:%[0-9]+]]:_(s16) = G_AND [[TRUNC9]], [[C7]] - ; VI: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) - ; VI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL4]] + ; VI: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) + ; VI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL6]] ; VI: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; VI: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C7]] ; VI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD11]](s32) ; VI: [[AND11:%[0-9]+]]:_(s16) = G_AND [[TRUNC11]], [[C7]] - ; VI: [[SHL5:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) - ; VI: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL5]] + ; VI: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) + ; VI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL7]] ; VI: [[TRUNC12:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD12]](s32) ; VI: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C7]] ; VI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD13]](s32) ; VI: [[AND13:%[0-9]+]]:_(s16) = G_AND [[TRUNC13]], [[C7]] - ; VI: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C8]](s16) - ; VI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL6]] + ; VI: [[SHL8:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C8]](s16) + ; VI: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL8]] ; VI: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; VI: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C7]] ; VI: 
[[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD15]](s32) ; VI: [[AND15:%[0-9]+]]:_(s16) = G_AND [[TRUNC15]], [[C7]] - ; VI: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C8]](s16) - ; VI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL7]] - ; VI: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16), [[OR6]](s16), [[OR7]](s16) - ; VI: [[C10:%[0-9]+]]:_(s64) = G_CONSTANT i64 16 - ; VI: [[GEP15:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C10]](s64) + ; VI: [[SHL9:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C8]](s16) + ; VI: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL9]] + ; VI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; VI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; VI: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C9]](s32) + ; VI: [[OR10:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL10]] + ; VI: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR8]](s16) + ; VI: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; VI: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C9]](s32) + ; VI: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; VI: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR10]](s32), [[OR11]](s32) + ; VI: [[C11:%[0-9]+]]:_(s64) = G_CONSTANT i64 16 + ; VI: [[GEP15:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C11]](s64) ; VI: [[LOAD16:%[0-9]+]]:_(s32) = G_LOAD [[GEP15]](p1) :: (load 1, addrspace 1) ; VI: [[GEP16:%[0-9]+]]:_(p1) = G_GEP [[GEP15]], [[C]](s64) ; VI: [[LOAD17:%[0-9]+]]:_(s32) = G_LOAD [[GEP16]](p1) :: (load 1, addrspace 1) @@ -8350,29 +9117,37 @@ body: | ; VI: [[AND16:%[0-9]+]]:_(s16) = G_AND [[TRUNC16]], [[C7]] ; VI: [[TRUNC17:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD17]](s32) ; VI: [[AND17:%[0-9]+]]:_(s16) = G_AND [[TRUNC17]], [[C7]] - ; VI: [[SHL8:%[0-9]+]]:_(s16) = G_SHL [[AND17]], [[C8]](s16) - ; VI: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[SHL8]] + ; VI: [[SHL12:%[0-9]+]]:_(s16) = G_SHL [[AND17]], [[C8]](s16) + ; VI: [[OR12:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[SHL12]] ; VI: [[TRUNC18:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD18]](s32) ; VI: [[AND18:%[0-9]+]]:_(s16) = 
G_AND [[TRUNC18]], [[C7]] ; VI: [[TRUNC19:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD19]](s32) ; VI: [[AND19:%[0-9]+]]:_(s16) = G_AND [[TRUNC19]], [[C7]] - ; VI: [[SHL9:%[0-9]+]]:_(s16) = G_SHL [[AND19]], [[C8]](s16) - ; VI: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[SHL9]] + ; VI: [[SHL13:%[0-9]+]]:_(s16) = G_SHL [[AND19]], [[C8]](s16) + ; VI: [[OR13:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[SHL13]] ; VI: [[TRUNC20:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD20]](s32) ; VI: [[AND20:%[0-9]+]]:_(s16) = G_AND [[TRUNC20]], [[C7]] ; VI: [[TRUNC21:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD21]](s32) ; VI: [[AND21:%[0-9]+]]:_(s16) = G_AND [[TRUNC21]], [[C7]] - ; VI: [[SHL10:%[0-9]+]]:_(s16) = G_SHL [[AND21]], [[C8]](s16) - ; VI: [[OR10:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[SHL10]] + ; VI: [[SHL14:%[0-9]+]]:_(s16) = G_SHL [[AND21]], [[C8]](s16) + ; VI: [[OR14:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[SHL14]] ; VI: [[TRUNC22:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD22]](s32) ; VI: [[AND22:%[0-9]+]]:_(s16) = G_AND [[TRUNC22]], [[C7]] ; VI: [[TRUNC23:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD23]](s32) ; VI: [[AND23:%[0-9]+]]:_(s16) = G_AND [[TRUNC23]], [[C7]] - ; VI: [[SHL11:%[0-9]+]]:_(s16) = G_SHL [[AND23]], [[C8]](s16) - ; VI: [[OR11:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[SHL11]] - ; VI: [[MV2:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR8]](s16), [[OR9]](s16), [[OR10]](s16), [[OR11]](s16) - ; VI: [[C11:%[0-9]+]]:_(s64) = G_CONSTANT i64 24 - ; VI: [[GEP23:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C11]](s64) + ; VI: [[SHL15:%[0-9]+]]:_(s16) = G_SHL [[AND23]], [[C8]](s16) + ; VI: [[OR15:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[SHL15]] + ; VI: [[ZEXT8:%[0-9]+]]:_(s32) = G_ZEXT [[OR12]](s16) + ; VI: [[ZEXT9:%[0-9]+]]:_(s32) = G_ZEXT [[OR13]](s16) + ; VI: [[SHL16:%[0-9]+]]:_(s32) = G_SHL [[ZEXT9]], [[C9]](s32) + ; VI: [[OR16:%[0-9]+]]:_(s32) = G_OR [[ZEXT8]], [[SHL16]] + ; VI: [[ZEXT10:%[0-9]+]]:_(s32) = G_ZEXT [[OR14]](s16) + ; VI: [[ZEXT11:%[0-9]+]]:_(s32) = G_ZEXT [[OR15]](s16) + ; VI: [[SHL17:%[0-9]+]]:_(s32) = G_SHL [[ZEXT11]], [[C9]](s32) + ; VI: 
[[OR17:%[0-9]+]]:_(s32) = G_OR [[ZEXT10]], [[SHL17]] + ; VI: [[MV2:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR16]](s32), [[OR17]](s32) + ; VI: [[C12:%[0-9]+]]:_(s64) = G_CONSTANT i64 24 + ; VI: [[GEP23:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C12]](s64) ; VI: [[LOAD24:%[0-9]+]]:_(s32) = G_LOAD [[GEP23]](p1) :: (load 1, addrspace 1) ; VI: [[GEP24:%[0-9]+]]:_(p1) = G_GEP [[GEP23]], [[C]](s64) ; VI: [[LOAD25:%[0-9]+]]:_(s32) = G_LOAD [[GEP24]](p1) :: (load 1, addrspace 1) @@ -8392,27 +9167,35 @@ body: | ; VI: [[AND24:%[0-9]+]]:_(s16) = G_AND [[TRUNC24]], [[C7]] ; VI: [[TRUNC25:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD25]](s32) ; VI: [[AND25:%[0-9]+]]:_(s16) = G_AND [[TRUNC25]], [[C7]] - ; VI: [[SHL12:%[0-9]+]]:_(s16) = G_SHL [[AND25]], [[C8]](s16) - ; VI: [[OR12:%[0-9]+]]:_(s16) = G_OR [[AND24]], [[SHL12]] + ; VI: [[SHL18:%[0-9]+]]:_(s16) = G_SHL [[AND25]], [[C8]](s16) + ; VI: [[OR18:%[0-9]+]]:_(s16) = G_OR [[AND24]], [[SHL18]] ; VI: [[TRUNC26:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD26]](s32) ; VI: [[AND26:%[0-9]+]]:_(s16) = G_AND [[TRUNC26]], [[C7]] ; VI: [[TRUNC27:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD27]](s32) ; VI: [[AND27:%[0-9]+]]:_(s16) = G_AND [[TRUNC27]], [[C7]] - ; VI: [[SHL13:%[0-9]+]]:_(s16) = G_SHL [[AND27]], [[C8]](s16) - ; VI: [[OR13:%[0-9]+]]:_(s16) = G_OR [[AND26]], [[SHL13]] + ; VI: [[SHL19:%[0-9]+]]:_(s16) = G_SHL [[AND27]], [[C8]](s16) + ; VI: [[OR19:%[0-9]+]]:_(s16) = G_OR [[AND26]], [[SHL19]] ; VI: [[TRUNC28:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD28]](s32) ; VI: [[AND28:%[0-9]+]]:_(s16) = G_AND [[TRUNC28]], [[C7]] ; VI: [[TRUNC29:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD29]](s32) ; VI: [[AND29:%[0-9]+]]:_(s16) = G_AND [[TRUNC29]], [[C7]] - ; VI: [[SHL14:%[0-9]+]]:_(s16) = G_SHL [[AND29]], [[C8]](s16) - ; VI: [[OR14:%[0-9]+]]:_(s16) = G_OR [[AND28]], [[SHL14]] + ; VI: [[SHL20:%[0-9]+]]:_(s16) = G_SHL [[AND29]], [[C8]](s16) + ; VI: [[OR20:%[0-9]+]]:_(s16) = G_OR [[AND28]], [[SHL20]] ; VI: [[TRUNC30:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD30]](s32) ; VI: [[AND30:%[0-9]+]]:_(s16) = G_AND [[TRUNC30]], 
[[C7]] ; VI: [[TRUNC31:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD31]](s32) ; VI: [[AND31:%[0-9]+]]:_(s16) = G_AND [[TRUNC31]], [[C7]] - ; VI: [[SHL15:%[0-9]+]]:_(s16) = G_SHL [[AND31]], [[C8]](s16) - ; VI: [[OR15:%[0-9]+]]:_(s16) = G_OR [[AND30]], [[SHL15]] - ; VI: [[MV3:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR12]](s16), [[OR13]](s16), [[OR14]](s16), [[OR15]](s16) + ; VI: [[SHL21:%[0-9]+]]:_(s16) = G_SHL [[AND31]], [[C8]](s16) + ; VI: [[OR21:%[0-9]+]]:_(s16) = G_OR [[AND30]], [[SHL21]] + ; VI: [[ZEXT12:%[0-9]+]]:_(s32) = G_ZEXT [[OR18]](s16) + ; VI: [[ZEXT13:%[0-9]+]]:_(s32) = G_ZEXT [[OR19]](s16) + ; VI: [[SHL22:%[0-9]+]]:_(s32) = G_SHL [[ZEXT13]], [[C9]](s32) + ; VI: [[OR22:%[0-9]+]]:_(s32) = G_OR [[ZEXT12]], [[SHL22]] + ; VI: [[ZEXT14:%[0-9]+]]:_(s32) = G_ZEXT [[OR20]](s16) + ; VI: [[ZEXT15:%[0-9]+]]:_(s32) = G_ZEXT [[OR21]](s16) + ; VI: [[SHL23:%[0-9]+]]:_(s32) = G_SHL [[ZEXT15]], [[C9]](s32) + ; VI: [[OR23:%[0-9]+]]:_(s32) = G_OR [[ZEXT14]], [[SHL23]] + ; VI: [[MV3:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR22]](s32), [[OR23]](s32) ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s64>) = G_BUILD_VECTOR [[MV]](s64), [[MV1]](s64), [[MV2]](s64), [[MV3]](s64) ; VI: $vgpr0_vgpr1_vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7 = COPY [[BUILD_VECTOR]](<4 x s64>) ; GFX9-HSA-LABEL: name: test_load_global_v4s64_align1 @@ -8469,9 +9252,18 @@ body: | ; GFX9-MESA: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C7]] ; GFX9-MESA: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C8]](s16) ; GFX9-MESA: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; GFX9-MESA: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) - ; GFX9-MESA: [[C9:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 - ; GFX9-MESA: [[GEP7:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C9]](s64) + ; GFX9-MESA: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9-MESA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9-MESA: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9-MESA: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], 
[[C9]](s32) + ; GFX9-MESA: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; GFX9-MESA: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; GFX9-MESA: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; GFX9-MESA: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C9]](s32) + ; GFX9-MESA: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; GFX9-MESA: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) + ; GFX9-MESA: [[C10:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 + ; GFX9-MESA: [[GEP7:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C10]](s64) ; GFX9-MESA: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p1) :: (load 1, addrspace 1) ; GFX9-MESA: [[GEP8:%[0-9]+]]:_(p1) = G_GEP [[GEP7]], [[C]](s64) ; GFX9-MESA: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p1) :: (load 1, addrspace 1) @@ -8491,29 +9283,37 @@ body: | ; GFX9-MESA: [[AND8:%[0-9]+]]:_(s16) = G_AND [[TRUNC8]], [[C7]] ; GFX9-MESA: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD9]](s32) ; GFX9-MESA: [[AND9:%[0-9]+]]:_(s16) = G_AND [[TRUNC9]], [[C7]] - ; GFX9-MESA: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) - ; GFX9-MESA: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL4]] + ; GFX9-MESA: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) + ; GFX9-MESA: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL6]] ; GFX9-MESA: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; GFX9-MESA: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C7]] ; GFX9-MESA: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD11]](s32) ; GFX9-MESA: [[AND11:%[0-9]+]]:_(s16) = G_AND [[TRUNC11]], [[C7]] - ; GFX9-MESA: [[SHL5:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) - ; GFX9-MESA: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL5]] + ; GFX9-MESA: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) + ; GFX9-MESA: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL7]] ; GFX9-MESA: [[TRUNC12:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD12]](s32) ; GFX9-MESA: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C7]] ; GFX9-MESA: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD13]](s32) 
; GFX9-MESA: [[AND13:%[0-9]+]]:_(s16) = G_AND [[TRUNC13]], [[C7]] - ; GFX9-MESA: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C8]](s16) - ; GFX9-MESA: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL6]] + ; GFX9-MESA: [[SHL8:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C8]](s16) + ; GFX9-MESA: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL8]] ; GFX9-MESA: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; GFX9-MESA: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C7]] ; GFX9-MESA: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD15]](s32) ; GFX9-MESA: [[AND15:%[0-9]+]]:_(s16) = G_AND [[TRUNC15]], [[C7]] - ; GFX9-MESA: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C8]](s16) - ; GFX9-MESA: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL7]] - ; GFX9-MESA: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16), [[OR6]](s16), [[OR7]](s16) - ; GFX9-MESA: [[C10:%[0-9]+]]:_(s64) = G_CONSTANT i64 16 - ; GFX9-MESA: [[GEP15:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C10]](s64) + ; GFX9-MESA: [[SHL9:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C8]](s16) + ; GFX9-MESA: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL9]] + ; GFX9-MESA: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; GFX9-MESA: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; GFX9-MESA: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C9]](s32) + ; GFX9-MESA: [[OR10:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL10]] + ; GFX9-MESA: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR8]](s16) + ; GFX9-MESA: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; GFX9-MESA: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C9]](s32) + ; GFX9-MESA: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; GFX9-MESA: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR10]](s32), [[OR11]](s32) + ; GFX9-MESA: [[C11:%[0-9]+]]:_(s64) = G_CONSTANT i64 16 + ; GFX9-MESA: [[GEP15:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C11]](s64) ; GFX9-MESA: [[LOAD16:%[0-9]+]]:_(s32) = G_LOAD [[GEP15]](p1) :: (load 1, addrspace 1) ; GFX9-MESA: [[GEP16:%[0-9]+]]:_(p1) = G_GEP [[GEP15]], [[C]](s64) ; 
GFX9-MESA: [[LOAD17:%[0-9]+]]:_(s32) = G_LOAD [[GEP16]](p1) :: (load 1, addrspace 1) @@ -8533,29 +9333,37 @@ body: | ; GFX9-MESA: [[AND16:%[0-9]+]]:_(s16) = G_AND [[TRUNC16]], [[C7]] ; GFX9-MESA: [[TRUNC17:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD17]](s32) ; GFX9-MESA: [[AND17:%[0-9]+]]:_(s16) = G_AND [[TRUNC17]], [[C7]] - ; GFX9-MESA: [[SHL8:%[0-9]+]]:_(s16) = G_SHL [[AND17]], [[C8]](s16) - ; GFX9-MESA: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[SHL8]] + ; GFX9-MESA: [[SHL12:%[0-9]+]]:_(s16) = G_SHL [[AND17]], [[C8]](s16) + ; GFX9-MESA: [[OR12:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[SHL12]] ; GFX9-MESA: [[TRUNC18:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD18]](s32) ; GFX9-MESA: [[AND18:%[0-9]+]]:_(s16) = G_AND [[TRUNC18]], [[C7]] ; GFX9-MESA: [[TRUNC19:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD19]](s32) ; GFX9-MESA: [[AND19:%[0-9]+]]:_(s16) = G_AND [[TRUNC19]], [[C7]] - ; GFX9-MESA: [[SHL9:%[0-9]+]]:_(s16) = G_SHL [[AND19]], [[C8]](s16) - ; GFX9-MESA: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[SHL9]] + ; GFX9-MESA: [[SHL13:%[0-9]+]]:_(s16) = G_SHL [[AND19]], [[C8]](s16) + ; GFX9-MESA: [[OR13:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[SHL13]] ; GFX9-MESA: [[TRUNC20:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD20]](s32) ; GFX9-MESA: [[AND20:%[0-9]+]]:_(s16) = G_AND [[TRUNC20]], [[C7]] ; GFX9-MESA: [[TRUNC21:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD21]](s32) ; GFX9-MESA: [[AND21:%[0-9]+]]:_(s16) = G_AND [[TRUNC21]], [[C7]] - ; GFX9-MESA: [[SHL10:%[0-9]+]]:_(s16) = G_SHL [[AND21]], [[C8]](s16) - ; GFX9-MESA: [[OR10:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[SHL10]] + ; GFX9-MESA: [[SHL14:%[0-9]+]]:_(s16) = G_SHL [[AND21]], [[C8]](s16) + ; GFX9-MESA: [[OR14:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[SHL14]] ; GFX9-MESA: [[TRUNC22:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD22]](s32) ; GFX9-MESA: [[AND22:%[0-9]+]]:_(s16) = G_AND [[TRUNC22]], [[C7]] ; GFX9-MESA: [[TRUNC23:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD23]](s32) ; GFX9-MESA: [[AND23:%[0-9]+]]:_(s16) = G_AND [[TRUNC23]], [[C7]] - ; GFX9-MESA: [[SHL11:%[0-9]+]]:_(s16) = G_SHL [[AND23]], [[C8]](s16) 
- ; GFX9-MESA: [[OR11:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[SHL11]] - ; GFX9-MESA: [[MV2:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR8]](s16), [[OR9]](s16), [[OR10]](s16), [[OR11]](s16) - ; GFX9-MESA: [[C11:%[0-9]+]]:_(s64) = G_CONSTANT i64 24 - ; GFX9-MESA: [[GEP23:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C11]](s64) + ; GFX9-MESA: [[SHL15:%[0-9]+]]:_(s16) = G_SHL [[AND23]], [[C8]](s16) + ; GFX9-MESA: [[OR15:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[SHL15]] + ; GFX9-MESA: [[ZEXT8:%[0-9]+]]:_(s32) = G_ZEXT [[OR12]](s16) + ; GFX9-MESA: [[ZEXT9:%[0-9]+]]:_(s32) = G_ZEXT [[OR13]](s16) + ; GFX9-MESA: [[SHL16:%[0-9]+]]:_(s32) = G_SHL [[ZEXT9]], [[C9]](s32) + ; GFX9-MESA: [[OR16:%[0-9]+]]:_(s32) = G_OR [[ZEXT8]], [[SHL16]] + ; GFX9-MESA: [[ZEXT10:%[0-9]+]]:_(s32) = G_ZEXT [[OR14]](s16) + ; GFX9-MESA: [[ZEXT11:%[0-9]+]]:_(s32) = G_ZEXT [[OR15]](s16) + ; GFX9-MESA: [[SHL17:%[0-9]+]]:_(s32) = G_SHL [[ZEXT11]], [[C9]](s32) + ; GFX9-MESA: [[OR17:%[0-9]+]]:_(s32) = G_OR [[ZEXT10]], [[SHL17]] + ; GFX9-MESA: [[MV2:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR16]](s32), [[OR17]](s32) + ; GFX9-MESA: [[C12:%[0-9]+]]:_(s64) = G_CONSTANT i64 24 + ; GFX9-MESA: [[GEP23:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C12]](s64) ; GFX9-MESA: [[LOAD24:%[0-9]+]]:_(s32) = G_LOAD [[GEP23]](p1) :: (load 1, addrspace 1) ; GFX9-MESA: [[GEP24:%[0-9]+]]:_(p1) = G_GEP [[GEP23]], [[C]](s64) ; GFX9-MESA: [[LOAD25:%[0-9]+]]:_(s32) = G_LOAD [[GEP24]](p1) :: (load 1, addrspace 1) @@ -8575,27 +9383,35 @@ body: | ; GFX9-MESA: [[AND24:%[0-9]+]]:_(s16) = G_AND [[TRUNC24]], [[C7]] ; GFX9-MESA: [[TRUNC25:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD25]](s32) ; GFX9-MESA: [[AND25:%[0-9]+]]:_(s16) = G_AND [[TRUNC25]], [[C7]] - ; GFX9-MESA: [[SHL12:%[0-9]+]]:_(s16) = G_SHL [[AND25]], [[C8]](s16) - ; GFX9-MESA: [[OR12:%[0-9]+]]:_(s16) = G_OR [[AND24]], [[SHL12]] + ; GFX9-MESA: [[SHL18:%[0-9]+]]:_(s16) = G_SHL [[AND25]], [[C8]](s16) + ; GFX9-MESA: [[OR18:%[0-9]+]]:_(s16) = G_OR [[AND24]], [[SHL18]] ; GFX9-MESA: [[TRUNC26:%[0-9]+]]:_(s16) = G_TRUNC 
[[LOAD26]](s32) ; GFX9-MESA: [[AND26:%[0-9]+]]:_(s16) = G_AND [[TRUNC26]], [[C7]] ; GFX9-MESA: [[TRUNC27:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD27]](s32) ; GFX9-MESA: [[AND27:%[0-9]+]]:_(s16) = G_AND [[TRUNC27]], [[C7]] - ; GFX9-MESA: [[SHL13:%[0-9]+]]:_(s16) = G_SHL [[AND27]], [[C8]](s16) - ; GFX9-MESA: [[OR13:%[0-9]+]]:_(s16) = G_OR [[AND26]], [[SHL13]] + ; GFX9-MESA: [[SHL19:%[0-9]+]]:_(s16) = G_SHL [[AND27]], [[C8]](s16) + ; GFX9-MESA: [[OR19:%[0-9]+]]:_(s16) = G_OR [[AND26]], [[SHL19]] ; GFX9-MESA: [[TRUNC28:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD28]](s32) ; GFX9-MESA: [[AND28:%[0-9]+]]:_(s16) = G_AND [[TRUNC28]], [[C7]] ; GFX9-MESA: [[TRUNC29:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD29]](s32) ; GFX9-MESA: [[AND29:%[0-9]+]]:_(s16) = G_AND [[TRUNC29]], [[C7]] - ; GFX9-MESA: [[SHL14:%[0-9]+]]:_(s16) = G_SHL [[AND29]], [[C8]](s16) - ; GFX9-MESA: [[OR14:%[0-9]+]]:_(s16) = G_OR [[AND28]], [[SHL14]] + ; GFX9-MESA: [[SHL20:%[0-9]+]]:_(s16) = G_SHL [[AND29]], [[C8]](s16) + ; GFX9-MESA: [[OR20:%[0-9]+]]:_(s16) = G_OR [[AND28]], [[SHL20]] ; GFX9-MESA: [[TRUNC30:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD30]](s32) ; GFX9-MESA: [[AND30:%[0-9]+]]:_(s16) = G_AND [[TRUNC30]], [[C7]] ; GFX9-MESA: [[TRUNC31:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD31]](s32) ; GFX9-MESA: [[AND31:%[0-9]+]]:_(s16) = G_AND [[TRUNC31]], [[C7]] - ; GFX9-MESA: [[SHL15:%[0-9]+]]:_(s16) = G_SHL [[AND31]], [[C8]](s16) - ; GFX9-MESA: [[OR15:%[0-9]+]]:_(s16) = G_OR [[AND30]], [[SHL15]] - ; GFX9-MESA: [[MV3:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR12]](s16), [[OR13]](s16), [[OR14]](s16), [[OR15]](s16) + ; GFX9-MESA: [[SHL21:%[0-9]+]]:_(s16) = G_SHL [[AND31]], [[C8]](s16) + ; GFX9-MESA: [[OR21:%[0-9]+]]:_(s16) = G_OR [[AND30]], [[SHL21]] + ; GFX9-MESA: [[ZEXT12:%[0-9]+]]:_(s32) = G_ZEXT [[OR18]](s16) + ; GFX9-MESA: [[ZEXT13:%[0-9]+]]:_(s32) = G_ZEXT [[OR19]](s16) + ; GFX9-MESA: [[SHL22:%[0-9]+]]:_(s32) = G_SHL [[ZEXT13]], [[C9]](s32) + ; GFX9-MESA: [[OR22:%[0-9]+]]:_(s32) = G_OR [[ZEXT12]], [[SHL22]] + ; GFX9-MESA: [[ZEXT14:%[0-9]+]]:_(s32) = 
G_ZEXT [[OR20]](s16) + ; GFX9-MESA: [[ZEXT15:%[0-9]+]]:_(s32) = G_ZEXT [[OR21]](s16) + ; GFX9-MESA: [[SHL23:%[0-9]+]]:_(s32) = G_SHL [[ZEXT15]], [[C9]](s32) + ; GFX9-MESA: [[OR23:%[0-9]+]]:_(s32) = G_OR [[ZEXT14]], [[SHL23]] + ; GFX9-MESA: [[MV3:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR22]](s32), [[OR23]](s32) ; GFX9-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s64>) = G_BUILD_VECTOR [[MV]](s64), [[MV1]](s64), [[MV2]](s64), [[MV3]](s64) ; GFX9-MESA: $vgpr0_vgpr1_vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7 = COPY [[BUILD_VECTOR]](<4 x s64>) %0:_(p1) = COPY $vgpr0_vgpr1 @@ -8807,9 +9623,18 @@ body: | ; SI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C8]](s32) ; SI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) ; SI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; SI: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) - ; SI: [[C10:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 - ; SI: [[GEP7:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C10]](s64) + ; SI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; SI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; SI: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; SI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C10]](s32) + ; SI: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; SI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; SI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; SI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C10]](s32) + ; SI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; SI: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) + ; SI: [[C11:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 + ; SI: [[GEP7:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C11]](s64) ; SI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p1) :: (load 1, addrspace 1) ; SI: [[GEP8:%[0-9]+]]:_(p1) = G_GEP [[GEP7]], [[C]](s64) ; SI: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p1) :: (load 1, addrspace 1) @@ -8830,34 +9655,42 @@ body: | ; SI: [[COPY8:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; SI: [[COPY9:%[0-9]+]]:_(s32) = 
COPY [[LOAD9]](s32) ; SI: [[AND9:%[0-9]+]]:_(s32) = G_AND [[COPY9]], [[C9]] - ; SI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY8]](s32) - ; SI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) - ; SI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] + ; SI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY8]](s32) + ; SI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) + ; SI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] ; SI: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; SI: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C7]] ; SI: [[COPY10:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; SI: [[COPY11:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32) ; SI: [[AND11:%[0-9]+]]:_(s32) = G_AND [[COPY11]], [[C9]] - ; SI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY10]](s32) - ; SI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL5]](s32) - ; SI: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] + ; SI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY10]](s32) + ; SI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) + ; SI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] ; SI: [[TRUNC12:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD12]](s32) ; SI: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C7]] ; SI: [[COPY12:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; SI: [[COPY13:%[0-9]+]]:_(s32) = COPY [[LOAD13]](s32) ; SI: [[AND13:%[0-9]+]]:_(s32) = G_AND [[COPY13]], [[C9]] - ; SI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY12]](s32) - ; SI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) - ; SI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] + ; SI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY12]](s32) + ; SI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL8]](s32) + ; SI: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] ; SI: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; SI: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C7]] ; SI: [[COPY14:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; SI: [[COPY15:%[0-9]+]]:_(s32) = COPY [[LOAD15]](s32) ; SI: 
[[AND15:%[0-9]+]]:_(s32) = G_AND [[COPY15]], [[C9]] - ; SI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY14]](s32) - ; SI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) - ; SI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] - ; SI: [[MV1:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16), [[OR6]](s16), [[OR7]](s16) + ; SI: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY14]](s32) + ; SI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL9]](s32) + ; SI: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] + ; SI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; SI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; SI: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C10]](s32) + ; SI: [[OR10:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL10]] + ; SI: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR8]](s16) + ; SI: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; SI: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C10]](s32) + ; SI: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; SI: [[MV1:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR10]](s32), [[OR11]](s32) ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x p1>) = G_BUILD_VECTOR [[MV]](p1), [[MV1]](p1) ; SI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BUILD_VECTOR]](<2 x p1>) ; CI-HSA-LABEL: name: test_load_global_v2p1_align1 @@ -8922,9 +9755,18 @@ body: | ; CI-MESA: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C8]](s32) ; CI-MESA: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) ; CI-MESA: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; CI-MESA: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) - ; CI-MESA: [[C10:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 - ; CI-MESA: [[GEP7:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C10]](s64) + ; CI-MESA: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI-MESA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI-MESA: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-MESA: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C10]](s32) + ; CI-MESA: [[OR4:%[0-9]+]]:_(s32) 
= G_OR [[ZEXT]], [[SHL4]] + ; CI-MESA: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; CI-MESA: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI-MESA: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C10]](s32) + ; CI-MESA: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; CI-MESA: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) + ; CI-MESA: [[C11:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 + ; CI-MESA: [[GEP7:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C11]](s64) ; CI-MESA: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p1) :: (load 1, addrspace 1) ; CI-MESA: [[GEP8:%[0-9]+]]:_(p1) = G_GEP [[GEP7]], [[C]](s64) ; CI-MESA: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p1) :: (load 1, addrspace 1) @@ -8945,34 +9787,42 @@ body: | ; CI-MESA: [[COPY8:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-MESA: [[COPY9:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32) ; CI-MESA: [[AND9:%[0-9]+]]:_(s32) = G_AND [[COPY9]], [[C9]] - ; CI-MESA: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY8]](s32) - ; CI-MESA: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) - ; CI-MESA: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] + ; CI-MESA: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY8]](s32) + ; CI-MESA: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) + ; CI-MESA: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] ; CI-MESA: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; CI-MESA: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C7]] ; CI-MESA: [[COPY10:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-MESA: [[COPY11:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32) ; CI-MESA: [[AND11:%[0-9]+]]:_(s32) = G_AND [[COPY11]], [[C9]] - ; CI-MESA: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY10]](s32) - ; CI-MESA: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL5]](s32) - ; CI-MESA: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] + ; CI-MESA: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY10]](s32) + ; CI-MESA: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) + ; CI-MESA: [[OR7:%[0-9]+]]:_(s16) = G_OR 
[[AND10]], [[TRUNC11]] ; CI-MESA: [[TRUNC12:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD12]](s32) ; CI-MESA: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C7]] ; CI-MESA: [[COPY12:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-MESA: [[COPY13:%[0-9]+]]:_(s32) = COPY [[LOAD13]](s32) ; CI-MESA: [[AND13:%[0-9]+]]:_(s32) = G_AND [[COPY13]], [[C9]] - ; CI-MESA: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY12]](s32) - ; CI-MESA: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) - ; CI-MESA: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] + ; CI-MESA: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY12]](s32) + ; CI-MESA: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL8]](s32) + ; CI-MESA: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] ; CI-MESA: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; CI-MESA: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C7]] ; CI-MESA: [[COPY14:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-MESA: [[COPY15:%[0-9]+]]:_(s32) = COPY [[LOAD15]](s32) ; CI-MESA: [[AND15:%[0-9]+]]:_(s32) = G_AND [[COPY15]], [[C9]] - ; CI-MESA: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY14]](s32) - ; CI-MESA: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) - ; CI-MESA: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] - ; CI-MESA: [[MV1:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16), [[OR6]](s16), [[OR7]](s16) + ; CI-MESA: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY14]](s32) + ; CI-MESA: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL9]](s32) + ; CI-MESA: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] + ; CI-MESA: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; CI-MESA: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; CI-MESA: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C10]](s32) + ; CI-MESA: [[OR10:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL10]] + ; CI-MESA: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR8]](s16) + ; CI-MESA: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; CI-MESA: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C10]](s32) + 
; CI-MESA: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; CI-MESA: [[MV1:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR10]](s32), [[OR11]](s32) ; CI-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x p1>) = G_BUILD_VECTOR [[MV]](p1), [[MV1]](p1) ; CI-MESA: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BUILD_VECTOR]](<2 x p1>) ; VI-LABEL: name: test_load_global_v2p1_align1 @@ -9025,9 +9875,18 @@ body: | ; VI: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C7]] ; VI: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C8]](s16) ; VI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; VI: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) - ; VI: [[C9:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 - ; VI: [[GEP7:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C9]](s64) + ; VI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; VI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; VI: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C9]](s32) + ; VI: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; VI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; VI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; VI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C9]](s32) + ; VI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; VI: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) + ; VI: [[C10:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 + ; VI: [[GEP7:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C10]](s64) ; VI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p1) :: (load 1, addrspace 1) ; VI: [[GEP8:%[0-9]+]]:_(p1) = G_GEP [[GEP7]], [[C]](s64) ; VI: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p1) :: (load 1, addrspace 1) @@ -9047,27 +9906,35 @@ body: | ; VI: [[AND8:%[0-9]+]]:_(s16) = G_AND [[TRUNC8]], [[C7]] ; VI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD9]](s32) ; VI: [[AND9:%[0-9]+]]:_(s16) = G_AND [[TRUNC9]], [[C7]] - ; VI: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) - ; VI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL4]] + ; 
VI: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) + ; VI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL6]] ; VI: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; VI: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C7]] ; VI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD11]](s32) ; VI: [[AND11:%[0-9]+]]:_(s16) = G_AND [[TRUNC11]], [[C7]] - ; VI: [[SHL5:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) - ; VI: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL5]] + ; VI: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) + ; VI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL7]] ; VI: [[TRUNC12:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD12]](s32) ; VI: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C7]] ; VI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD13]](s32) ; VI: [[AND13:%[0-9]+]]:_(s16) = G_AND [[TRUNC13]], [[C7]] - ; VI: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C8]](s16) - ; VI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL6]] + ; VI: [[SHL8:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C8]](s16) + ; VI: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL8]] ; VI: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; VI: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C7]] ; VI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD15]](s32) ; VI: [[AND15:%[0-9]+]]:_(s16) = G_AND [[TRUNC15]], [[C7]] - ; VI: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C8]](s16) - ; VI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL7]] - ; VI: [[MV1:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16), [[OR6]](s16), [[OR7]](s16) + ; VI: [[SHL9:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C8]](s16) + ; VI: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL9]] + ; VI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; VI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; VI: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C9]](s32) + ; VI: [[OR10:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL10]] + ; VI: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR8]](s16) + ; VI: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; 
VI: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C9]](s32) + ; VI: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; VI: [[MV1:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR10]](s32), [[OR11]](s32) ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x p1>) = G_BUILD_VECTOR [[MV]](p1), [[MV1]](p1) ; VI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BUILD_VECTOR]](<2 x p1>) ; GFX9-HSA-LABEL: name: test_load_global_v2p1_align1 @@ -9124,9 +9991,18 @@ body: | ; GFX9-MESA: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C7]] ; GFX9-MESA: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C8]](s16) ; GFX9-MESA: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; GFX9-MESA: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) - ; GFX9-MESA: [[C9:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 - ; GFX9-MESA: [[GEP7:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C9]](s64) + ; GFX9-MESA: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9-MESA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9-MESA: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9-MESA: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C9]](s32) + ; GFX9-MESA: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; GFX9-MESA: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; GFX9-MESA: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; GFX9-MESA: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C9]](s32) + ; GFX9-MESA: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; GFX9-MESA: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) + ; GFX9-MESA: [[C10:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 + ; GFX9-MESA: [[GEP7:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C10]](s64) ; GFX9-MESA: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p1) :: (load 1, addrspace 1) ; GFX9-MESA: [[GEP8:%[0-9]+]]:_(p1) = G_GEP [[GEP7]], [[C]](s64) ; GFX9-MESA: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p1) :: (load 1, addrspace 1) @@ -9146,27 +10022,35 @@ body: | ; GFX9-MESA: [[AND8:%[0-9]+]]:_(s16) = G_AND [[TRUNC8]], [[C7]] ; GFX9-MESA: 
[[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD9]](s32) ; GFX9-MESA: [[AND9:%[0-9]+]]:_(s16) = G_AND [[TRUNC9]], [[C7]] - ; GFX9-MESA: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) - ; GFX9-MESA: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL4]] + ; GFX9-MESA: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) + ; GFX9-MESA: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL6]] ; GFX9-MESA: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; GFX9-MESA: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C7]] ; GFX9-MESA: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD11]](s32) ; GFX9-MESA: [[AND11:%[0-9]+]]:_(s16) = G_AND [[TRUNC11]], [[C7]] - ; GFX9-MESA: [[SHL5:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) - ; GFX9-MESA: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL5]] + ; GFX9-MESA: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) + ; GFX9-MESA: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL7]] ; GFX9-MESA: [[TRUNC12:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD12]](s32) ; GFX9-MESA: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C7]] ; GFX9-MESA: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD13]](s32) ; GFX9-MESA: [[AND13:%[0-9]+]]:_(s16) = G_AND [[TRUNC13]], [[C7]] - ; GFX9-MESA: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C8]](s16) - ; GFX9-MESA: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL6]] + ; GFX9-MESA: [[SHL8:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C8]](s16) + ; GFX9-MESA: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL8]] ; GFX9-MESA: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; GFX9-MESA: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C7]] ; GFX9-MESA: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD15]](s32) ; GFX9-MESA: [[AND15:%[0-9]+]]:_(s16) = G_AND [[TRUNC15]], [[C7]] - ; GFX9-MESA: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C8]](s16) - ; GFX9-MESA: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL7]] - ; GFX9-MESA: [[MV1:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16), [[OR6]](s16), [[OR7]](s16) + ; GFX9-MESA: [[SHL9:%[0-9]+]]:_(s16) = 
G_SHL [[AND15]], [[C8]](s16) + ; GFX9-MESA: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL9]] + ; GFX9-MESA: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; GFX9-MESA: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; GFX9-MESA: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C9]](s32) + ; GFX9-MESA: [[OR10:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL10]] + ; GFX9-MESA: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR8]](s16) + ; GFX9-MESA: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; GFX9-MESA: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C9]](s32) + ; GFX9-MESA: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; GFX9-MESA: [[MV1:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR10]](s32), [[OR11]](s32) ; GFX9-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x p1>) = G_BUILD_VECTOR [[MV]](p1), [[MV1]](p1) ; GFX9-MESA: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BUILD_VECTOR]](<2 x p1>) %0:_(p1) = COPY $vgpr0_vgpr1 @@ -9315,9 +10199,14 @@ body: | ; SI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) ; SI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[SHL1]](s32) ; SI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[TRUNC3]] - ; SI: [[MV:%[0-9]+]]:_(p3) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; SI: [[C6:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 - ; SI: [[GEP3:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C6]](s64) + ; SI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; SI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; SI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; SI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C6]](s32) + ; SI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; SI: [[INTTOPTR:%[0-9]+]]:_(p3) = G_INTTOPTR [[OR2]](s32) + ; SI: [[C7:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 + ; SI: [[GEP3:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C7]](s64) ; SI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p1) :: (load 1, addrspace 1) ; SI: [[GEP4:%[0-9]+]]:_(p1) = G_GEP [[GEP3]], [[C]](s64) ; SI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p1) :: (load 1, addrspace 1) @@ -9330,19 +10219,23 @@ body: | ; SI: 
[[COPY4:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; SI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) ; SI: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C5]] - ; SI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY4]](s32) - ; SI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL2]](s32) - ; SI: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] + ; SI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY4]](s32) + ; SI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) + ; SI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] ; SI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; SI: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; SI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; SI: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) ; SI: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY7]], [[C5]] - ; SI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY6]](s32) - ; SI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) - ; SI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; SI: [[MV1:%[0-9]+]]:_(p3) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) - ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x p3>) = G_BUILD_VECTOR [[MV]](p3), [[MV1]](p3) + ; SI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY6]](s32) + ; SI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) + ; SI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] + ; SI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; SI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; SI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C6]](s32) + ; SI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; SI: [[INTTOPTR1:%[0-9]+]]:_(p3) = G_INTTOPTR [[OR5]](s32) + ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x p3>) = G_BUILD_VECTOR [[INTTOPTR]](p3), [[INTTOPTR1]](p3) ; SI: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<2 x p3>) ; CI-HSA-LABEL: name: test_load_global_v2p3_align1 ; CI-HSA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 @@ -9378,9 +10271,14 @@ body: | ; CI-MESA: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) ; CI-MESA: [[TRUNC3:%[0-9]+]]:_(s16) = 
G_TRUNC [[SHL1]](s32) ; CI-MESA: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[TRUNC3]] - ; CI-MESA: [[MV:%[0-9]+]]:_(p3) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; CI-MESA: [[C6:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 - ; CI-MESA: [[GEP3:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C6]](s64) + ; CI-MESA: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI-MESA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI-MESA: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-MESA: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C6]](s32) + ; CI-MESA: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; CI-MESA: [[INTTOPTR:%[0-9]+]]:_(p3) = G_INTTOPTR [[OR2]](s32) + ; CI-MESA: [[C7:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 + ; CI-MESA: [[GEP3:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C7]](s64) ; CI-MESA: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p1) :: (load 1, addrspace 1) ; CI-MESA: [[GEP4:%[0-9]+]]:_(p1) = G_GEP [[GEP3]], [[C]](s64) ; CI-MESA: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p1) :: (load 1, addrspace 1) @@ -9393,19 +10291,23 @@ body: | ; CI-MESA: [[COPY4:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI-MESA: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) ; CI-MESA: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C5]] - ; CI-MESA: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY4]](s32) - ; CI-MESA: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL2]](s32) - ; CI-MESA: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] + ; CI-MESA: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY4]](s32) + ; CI-MESA: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) + ; CI-MESA: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] ; CI-MESA: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; CI-MESA: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; CI-MESA: [[COPY6:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI-MESA: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) ; CI-MESA: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY7]], [[C5]] - ; CI-MESA: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY6]](s32) - ; CI-MESA: 
[[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) - ; CI-MESA: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; CI-MESA: [[MV1:%[0-9]+]]:_(p3) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) - ; CI-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x p3>) = G_BUILD_VECTOR [[MV]](p3), [[MV1]](p3) + ; CI-MESA: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY6]](s32) + ; CI-MESA: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) + ; CI-MESA: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] + ; CI-MESA: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI-MESA: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; CI-MESA: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C6]](s32) + ; CI-MESA: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; CI-MESA: [[INTTOPTR1:%[0-9]+]]:_(p3) = G_INTTOPTR [[OR5]](s32) + ; CI-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x p3>) = G_BUILD_VECTOR [[INTTOPTR]](p3), [[INTTOPTR1]](p3) ; CI-MESA: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<2 x p3>) ; VI-LABEL: name: test_load_global_v2p3_align1 ; VI: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 @@ -9433,9 +10335,14 @@ body: | ; VI: [[AND3:%[0-9]+]]:_(s16) = G_AND [[TRUNC3]], [[C3]] ; VI: [[SHL1:%[0-9]+]]:_(s16) = G_SHL [[AND3]], [[C4]](s16) ; VI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[SHL1]] - ; VI: [[MV:%[0-9]+]]:_(p3) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; VI: [[C5:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 - ; VI: [[GEP3:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C5]](s64) + ; VI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; VI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; VI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C5]](s32) + ; VI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; VI: [[INTTOPTR:%[0-9]+]]:_(p3) = G_INTTOPTR [[OR2]](s32) + ; VI: [[C6:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 + ; VI: [[GEP3:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C6]](s64) ; VI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p1) :: (load 1, addrspace 1) ; VI: [[GEP4:%[0-9]+]]:_(p1) = 
G_GEP [[GEP3]], [[C]](s64) ; VI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p1) :: (load 1, addrspace 1) @@ -9447,16 +10354,20 @@ body: | ; VI: [[AND4:%[0-9]+]]:_(s16) = G_AND [[TRUNC4]], [[C3]] ; VI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) ; VI: [[AND5:%[0-9]+]]:_(s16) = G_AND [[TRUNC5]], [[C3]] - ; VI: [[SHL2:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) - ; VI: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL2]] + ; VI: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) + ; VI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL3]] ; VI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; VI: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; VI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) ; VI: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C3]] - ; VI: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) - ; VI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; VI: [[MV1:%[0-9]+]]:_(p3) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) - ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x p3>) = G_BUILD_VECTOR [[MV]](p3), [[MV1]](p3) + ; VI: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) + ; VI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL4]] + ; VI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; VI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; VI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C5]](s32) + ; VI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; VI: [[INTTOPTR1:%[0-9]+]]:_(p3) = G_INTTOPTR [[OR5]](s32) + ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x p3>) = G_BUILD_VECTOR [[INTTOPTR]](p3), [[INTTOPTR1]](p3) ; VI: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<2 x p3>) ; GFX9-HSA-LABEL: name: test_load_global_v2p3_align1 ; GFX9-HSA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 @@ -9488,9 +10399,14 @@ body: | ; GFX9-MESA: [[AND3:%[0-9]+]]:_(s16) = G_AND [[TRUNC3]], [[C3]] ; GFX9-MESA: [[SHL1:%[0-9]+]]:_(s16) = G_SHL [[AND3]], [[C4]](s16) ; GFX9-MESA: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[SHL1]] - ; GFX9-MESA: [[MV:%[0-9]+]]:_(p3) = 
G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; GFX9-MESA: [[C5:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 - ; GFX9-MESA: [[GEP3:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C5]](s64) + ; GFX9-MESA: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9-MESA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9-MESA: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9-MESA: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C5]](s32) + ; GFX9-MESA: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; GFX9-MESA: [[INTTOPTR:%[0-9]+]]:_(p3) = G_INTTOPTR [[OR2]](s32) + ; GFX9-MESA: [[C6:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 + ; GFX9-MESA: [[GEP3:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C6]](s64) ; GFX9-MESA: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p1) :: (load 1, addrspace 1) ; GFX9-MESA: [[GEP4:%[0-9]+]]:_(p1) = G_GEP [[GEP3]], [[C]](s64) ; GFX9-MESA: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p1) :: (load 1, addrspace 1) @@ -9502,16 +10418,20 @@ body: | ; GFX9-MESA: [[AND4:%[0-9]+]]:_(s16) = G_AND [[TRUNC4]], [[C3]] ; GFX9-MESA: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) ; GFX9-MESA: [[AND5:%[0-9]+]]:_(s16) = G_AND [[TRUNC5]], [[C3]] - ; GFX9-MESA: [[SHL2:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) - ; GFX9-MESA: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL2]] + ; GFX9-MESA: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) + ; GFX9-MESA: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL3]] ; GFX9-MESA: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; GFX9-MESA: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; GFX9-MESA: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) ; GFX9-MESA: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C3]] - ; GFX9-MESA: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) - ; GFX9-MESA: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; GFX9-MESA: [[MV1:%[0-9]+]]:_(p3) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) - ; GFX9-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x p3>) = G_BUILD_VECTOR [[MV]](p3), [[MV1]](p3) + ; GFX9-MESA: [[SHL4:%[0-9]+]]:_(s16) = 
G_SHL [[AND7]], [[C4]](s16) + ; GFX9-MESA: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL4]] + ; GFX9-MESA: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; GFX9-MESA: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; GFX9-MESA: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C5]](s32) + ; GFX9-MESA: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; GFX9-MESA: [[INTTOPTR1:%[0-9]+]]:_(p3) = G_INTTOPTR [[OR5]](s32) + ; GFX9-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x p3>) = G_BUILD_VECTOR [[INTTOPTR]](p3), [[INTTOPTR1]](p3) ; GFX9-MESA: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<2 x p3>) %0:_(p1) = COPY $vgpr0_vgpr1 %1:_(<2 x p3>) = G_LOAD %0 :: (load 8, align 1, addrspace 1) @@ -9839,9 +10759,13 @@ body: | ; SI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) ; SI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[SHL1]](s32) ; SI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[TRUNC3]] - ; SI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; SI: [[C6:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 - ; SI: [[GEP3:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C6]](s64) + ; SI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; SI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; SI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; SI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C6]](s32) + ; SI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; SI: [[C7:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 + ; SI: [[GEP3:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C7]](s64) ; SI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p1) :: (load 1, addrspace 1) ; SI: [[GEP4:%[0-9]+]]:_(p1) = G_GEP [[GEP3]], [[C]](s64) ; SI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p1) :: (load 1, addrspace 1) @@ -9854,19 +10778,22 @@ body: | ; SI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; SI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) ; SI: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C5]] - ; SI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY4]](s32) - ; SI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL2]](s32) - ; SI: 
[[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] + ; SI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY4]](s32) + ; SI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) + ; SI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] ; SI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; SI: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; SI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; SI: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) ; SI: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY7]], [[C5]] - ; SI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY6]](s32) - ; SI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) - ; SI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; SI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) - ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[MV]](s32), [[MV1]](s32) + ; SI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY6]](s32) + ; SI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) + ; SI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] + ; SI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; SI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; SI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C6]](s32) + ; SI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[OR2]](s32), [[OR5]](s32) ; SI: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<2 x s32>) ; CI-HSA-LABEL: name: test_extload_global_v2s32_from_4_align1 ; CI-HSA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 @@ -9906,9 +10833,13 @@ body: | ; CI-MESA: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) ; CI-MESA: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[SHL1]](s32) ; CI-MESA: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[TRUNC3]] - ; CI-MESA: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; CI-MESA: [[C6:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 - ; CI-MESA: [[GEP3:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C6]](s64) + ; CI-MESA: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) 
+ ; CI-MESA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI-MESA: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-MESA: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C6]](s32) + ; CI-MESA: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; CI-MESA: [[C7:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 + ; CI-MESA: [[GEP3:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C7]](s64) ; CI-MESA: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p1) :: (load 1, addrspace 1) ; CI-MESA: [[GEP4:%[0-9]+]]:_(p1) = G_GEP [[GEP3]], [[C]](s64) ; CI-MESA: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p1) :: (load 1, addrspace 1) @@ -9921,19 +10852,22 @@ body: | ; CI-MESA: [[COPY4:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI-MESA: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) ; CI-MESA: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C5]] - ; CI-MESA: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY4]](s32) - ; CI-MESA: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL2]](s32) - ; CI-MESA: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] + ; CI-MESA: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY4]](s32) + ; CI-MESA: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) + ; CI-MESA: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] ; CI-MESA: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; CI-MESA: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; CI-MESA: [[COPY6:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI-MESA: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) ; CI-MESA: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY7]], [[C5]] - ; CI-MESA: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY6]](s32) - ; CI-MESA: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) - ; CI-MESA: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; CI-MESA: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) - ; CI-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[MV]](s32), [[MV1]](s32) + ; CI-MESA: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY6]](s32) + ; CI-MESA: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) + 
; CI-MESA: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] + ; CI-MESA: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI-MESA: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; CI-MESA: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C6]](s32) + ; CI-MESA: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; CI-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[OR2]](s32), [[OR5]](s32) ; CI-MESA: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<2 x s32>) ; VI-LABEL: name: test_extload_global_v2s32_from_4_align1 ; VI: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 @@ -9961,9 +10895,13 @@ body: | ; VI: [[AND3:%[0-9]+]]:_(s16) = G_AND [[TRUNC3]], [[C3]] ; VI: [[SHL1:%[0-9]+]]:_(s16) = G_SHL [[AND3]], [[C4]](s16) ; VI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[SHL1]] - ; VI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; VI: [[C5:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 - ; VI: [[GEP3:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C5]](s64) + ; VI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; VI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; VI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C5]](s32) + ; VI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; VI: [[C6:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 + ; VI: [[GEP3:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C6]](s64) ; VI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p1) :: (load 1, addrspace 1) ; VI: [[GEP4:%[0-9]+]]:_(p1) = G_GEP [[GEP3]], [[C]](s64) ; VI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p1) :: (load 1, addrspace 1) @@ -9975,16 +10913,19 @@ body: | ; VI: [[AND4:%[0-9]+]]:_(s16) = G_AND [[TRUNC4]], [[C3]] ; VI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) ; VI: [[AND5:%[0-9]+]]:_(s16) = G_AND [[TRUNC5]], [[C3]] - ; VI: [[SHL2:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) - ; VI: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL2]] + ; VI: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) + ; VI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], 
[[SHL3]] ; VI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; VI: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; VI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) ; VI: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C3]] - ; VI: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) - ; VI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; VI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) - ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[MV]](s32), [[MV1]](s32) + ; VI: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) + ; VI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL4]] + ; VI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; VI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; VI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C5]](s32) + ; VI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[OR2]](s32), [[OR5]](s32) ; VI: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<2 x s32>) ; GFX9-HSA-LABEL: name: test_extload_global_v2s32_from_4_align1 ; GFX9-HSA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 @@ -10020,9 +10961,13 @@ body: | ; GFX9-MESA: [[AND3:%[0-9]+]]:_(s16) = G_AND [[TRUNC3]], [[C3]] ; GFX9-MESA: [[SHL1:%[0-9]+]]:_(s16) = G_SHL [[AND3]], [[C4]](s16) ; GFX9-MESA: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[SHL1]] - ; GFX9-MESA: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; GFX9-MESA: [[C5:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 - ; GFX9-MESA: [[GEP3:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C5]](s64) + ; GFX9-MESA: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9-MESA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9-MESA: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9-MESA: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C5]](s32) + ; GFX9-MESA: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; GFX9-MESA: [[C6:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 + ; GFX9-MESA: [[GEP3:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C6]](s64) 
; GFX9-MESA: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p1) :: (load 1, addrspace 1) ; GFX9-MESA: [[GEP4:%[0-9]+]]:_(p1) = G_GEP [[GEP3]], [[C]](s64) ; GFX9-MESA: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p1) :: (load 1, addrspace 1) @@ -10034,16 +10979,19 @@ body: | ; GFX9-MESA: [[AND4:%[0-9]+]]:_(s16) = G_AND [[TRUNC4]], [[C3]] ; GFX9-MESA: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) ; GFX9-MESA: [[AND5:%[0-9]+]]:_(s16) = G_AND [[TRUNC5]], [[C3]] - ; GFX9-MESA: [[SHL2:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) - ; GFX9-MESA: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL2]] + ; GFX9-MESA: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) + ; GFX9-MESA: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL3]] ; GFX9-MESA: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; GFX9-MESA: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; GFX9-MESA: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) ; GFX9-MESA: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C3]] - ; GFX9-MESA: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) - ; GFX9-MESA: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; GFX9-MESA: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) - ; GFX9-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[MV]](s32), [[MV1]](s32) + ; GFX9-MESA: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) + ; GFX9-MESA: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL4]] + ; GFX9-MESA: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; GFX9-MESA: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; GFX9-MESA: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C5]](s32) + ; GFX9-MESA: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; GFX9-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[OR2]](s32), [[OR5]](s32) ; GFX9-MESA: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<2 x s32>) %0:_(p1) = COPY $vgpr0_vgpr1 %1:_(<2 x s32>) = G_LOAD %0 :: (load 4, align 1, addrspace 1) @@ -10059,21 +11007,29 @@ body: | ; SI-LABEL: name: 
test_extload_global_v2s32_from_4_align2 ; SI: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 ; SI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 2, addrspace 1) - ; SI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; SI: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; SI: [[GEP:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C]](s64) ; SI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p1) :: (load 2, addrspace 1) - ; SI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; SI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; SI: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 - ; SI: [[GEP1:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C1]](s64) + ; SI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; SI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; SI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; SI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; SI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; SI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; SI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; SI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; SI: [[C3:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 + ; SI: [[GEP1:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C3]](s64) ; SI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p1) :: (load 2, addrspace 1) - ; SI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; SI: [[GEP2:%[0-9]+]]:_(p1) = G_GEP [[GEP1]], [[C]](s64) ; SI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p1) :: (load 2, addrspace 1) - ; SI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; SI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC2]](s16), [[TRUNC3]](s16) - ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[MV]](s32), [[MV1]](s32) + ; SI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; SI: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C1]] + ; SI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; SI: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C1]] + ; SI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C2]](s32) + ; 
SI: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[OR]](s32), [[OR1]](s32) ; SI: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<2 x s32>) ; CI-HSA-LABEL: name: test_extload_global_v2s32_from_4_align2 ; CI-HSA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 @@ -10086,40 +11042,56 @@ body: | ; CI-MESA-LABEL: name: test_extload_global_v2s32_from_4_align2 ; CI-MESA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 ; CI-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 2, addrspace 1) - ; CI-MESA: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; CI-MESA: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; CI-MESA: [[GEP:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C]](s64) ; CI-MESA: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p1) :: (load 2, addrspace 1) - ; CI-MESA: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; CI-MESA: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; CI-MESA: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 - ; CI-MESA: [[GEP1:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C1]](s64) + ; CI-MESA: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; CI-MESA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; CI-MESA: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; CI-MESA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; CI-MESA: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; CI-MESA: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-MESA: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; CI-MESA: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; CI-MESA: [[C3:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 + ; CI-MESA: [[GEP1:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C3]](s64) ; CI-MESA: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p1) :: (load 2, addrspace 1) - ; CI-MESA: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; CI-MESA: [[GEP2:%[0-9]+]]:_(p1) = G_GEP [[GEP1]], [[C]](s64) ; CI-MESA: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p1) :: (load 2, addrspace 1) - ; CI-MESA: 
[[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; CI-MESA: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC2]](s16), [[TRUNC3]](s16) - ; CI-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[MV]](s32), [[MV1]](s32) + ; CI-MESA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; CI-MESA: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C1]] + ; CI-MESA: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; CI-MESA: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C1]] + ; CI-MESA: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C2]](s32) + ; CI-MESA: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; CI-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[OR]](s32), [[OR1]](s32) ; CI-MESA: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<2 x s32>) ; VI-LABEL: name: test_extload_global_v2s32_from_4_align2 ; VI: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 2, addrspace 1) - ; VI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; VI: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; VI: [[GEP:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C]](s64) ; VI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p1) :: (load 2, addrspace 1) - ; VI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; VI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; VI: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 - ; VI: [[GEP1:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C1]](s64) + ; VI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; VI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; VI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; VI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; VI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; VI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; VI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; VI: [[C3:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 + ; VI: [[GEP1:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C3]](s64) ; VI: 
[[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p1) :: (load 2, addrspace 1) - ; VI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; VI: [[GEP2:%[0-9]+]]:_(p1) = G_GEP [[GEP1]], [[C]](s64) ; VI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p1) :: (load 2, addrspace 1) - ; VI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; VI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC2]](s16), [[TRUNC3]](s16) - ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[MV]](s32), [[MV1]](s32) + ; VI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; VI: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C1]] + ; VI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; VI: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C1]] + ; VI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C2]](s32) + ; VI: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[OR]](s32), [[OR1]](s32) ; VI: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<2 x s32>) ; GFX9-HSA-LABEL: name: test_extload_global_v2s32_from_4_align2 ; GFX9-HSA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 @@ -10132,21 +11104,29 @@ body: | ; GFX9-MESA-LABEL: name: test_extload_global_v2s32_from_4_align2 ; GFX9-MESA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 ; GFX9-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 2, addrspace 1) - ; GFX9-MESA: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; GFX9-MESA: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; GFX9-MESA: [[GEP:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C]](s64) ; GFX9-MESA: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p1) :: (load 2, addrspace 1) - ; GFX9-MESA: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; GFX9-MESA: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; GFX9-MESA: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 - ; GFX9-MESA: [[GEP1:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C1]](s64) + ; GFX9-MESA: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; GFX9-MESA: [[COPY1:%[0-9]+]]:_(s32) = COPY 
[[LOAD]](s32) + ; GFX9-MESA: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; GFX9-MESA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; GFX9-MESA: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; GFX9-MESA: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9-MESA: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; GFX9-MESA: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; GFX9-MESA: [[C3:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 + ; GFX9-MESA: [[GEP1:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C3]](s64) ; GFX9-MESA: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p1) :: (load 2, addrspace 1) - ; GFX9-MESA: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; GFX9-MESA: [[GEP2:%[0-9]+]]:_(p1) = G_GEP [[GEP1]], [[C]](s64) ; GFX9-MESA: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p1) :: (load 2, addrspace 1) - ; GFX9-MESA: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; GFX9-MESA: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC2]](s16), [[TRUNC3]](s16) - ; GFX9-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[MV]](s32), [[MV1]](s32) + ; GFX9-MESA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; GFX9-MESA: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C1]] + ; GFX9-MESA: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; GFX9-MESA: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C1]] + ; GFX9-MESA: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C2]](s32) + ; GFX9-MESA: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; GFX9-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[OR]](s32), [[OR1]](s32) ; GFX9-MESA: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<2 x s32>) %0:_(p1) = COPY $vgpr0_vgpr1 %1:_(<2 x s32>) = G_LOAD %0 :: (load 4, align 2, addrspace 1) @@ -10476,9 +11456,22 @@ body: | ; SI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[C12]](s32) ; SI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL5]](s32) ; SI: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] - ; SI: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), 
[[OR3]](s16), [[OR4]](s16), [[OR5]](s16) - ; SI: [[C14:%[0-9]+]]:_(s64) = G_CONSTANT i64 12 - ; SI: [[GEP11:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C14]](s64) + ; SI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; SI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; SI: [[C14:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; SI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C14]](s32) + ; SI: [[OR6:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL6]] + ; SI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; SI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; SI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C14]](s32) + ; SI: [[OR7:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL7]] + ; SI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; SI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR5]](s16) + ; SI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C14]](s32) + ; SI: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL8]] + ; SI: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR6]](s32), [[OR7]](s32), [[OR8]](s32) + ; SI: [[C15:%[0-9]+]]:_(s64) = G_CONSTANT i64 12 + ; SI: [[GEP11:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C15]](s64) ; SI: [[LOAD12:%[0-9]+]]:_(s32) = G_LOAD [[GEP11]](p1) :: (load 1, addrspace 1) ; SI: [[GEP12:%[0-9]+]]:_(p1) = G_GEP [[GEP11]], [[C]](s64) ; SI: [[LOAD13:%[0-9]+]]:_(s32) = G_LOAD [[GEP12]](p1) :: (load 1, addrspace 1) @@ -10507,50 +11500,62 @@ body: | ; SI: [[COPY12:%[0-9]+]]:_(s32) = COPY [[C12]](s32) ; SI: [[COPY13:%[0-9]+]]:_(s32) = COPY [[LOAD13]](s32) ; SI: [[AND13:%[0-9]+]]:_(s32) = G_AND [[COPY13]], [[C13]] - ; SI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY12]](s32) - ; SI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) - ; SI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] + ; SI: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY12]](s32) + ; SI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL9]](s32) + ; SI: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] ; SI: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; SI: [[AND14:%[0-9]+]]:_(s16) = G_AND 
[[TRUNC14]], [[C11]] ; SI: [[COPY14:%[0-9]+]]:_(s32) = COPY [[C12]](s32) ; SI: [[COPY15:%[0-9]+]]:_(s32) = COPY [[LOAD15]](s32) ; SI: [[AND15:%[0-9]+]]:_(s32) = G_AND [[COPY15]], [[C13]] - ; SI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY14]](s32) - ; SI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) - ; SI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] + ; SI: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY14]](s32) + ; SI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL10]](s32) + ; SI: [[OR10:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] ; SI: [[TRUNC16:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD16]](s32) ; SI: [[AND16:%[0-9]+]]:_(s16) = G_AND [[TRUNC16]], [[C11]] ; SI: [[COPY16:%[0-9]+]]:_(s32) = COPY [[C12]](s32) ; SI: [[COPY17:%[0-9]+]]:_(s32) = COPY [[LOAD17]](s32) ; SI: [[AND17:%[0-9]+]]:_(s32) = G_AND [[COPY17]], [[C13]] - ; SI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[AND17]], [[COPY16]](s32) - ; SI: [[TRUNC17:%[0-9]+]]:_(s16) = G_TRUNC [[SHL8]](s32) - ; SI: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[TRUNC17]] + ; SI: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[AND17]], [[COPY16]](s32) + ; SI: [[TRUNC17:%[0-9]+]]:_(s16) = G_TRUNC [[SHL11]](s32) + ; SI: [[OR11:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[TRUNC17]] ; SI: [[TRUNC18:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD18]](s32) ; SI: [[AND18:%[0-9]+]]:_(s16) = G_AND [[TRUNC18]], [[C11]] ; SI: [[COPY18:%[0-9]+]]:_(s32) = COPY [[C12]](s32) ; SI: [[COPY19:%[0-9]+]]:_(s32) = COPY [[LOAD19]](s32) ; SI: [[AND19:%[0-9]+]]:_(s32) = G_AND [[COPY19]], [[C13]] - ; SI: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[AND19]], [[COPY18]](s32) - ; SI: [[TRUNC19:%[0-9]+]]:_(s16) = G_TRUNC [[SHL9]](s32) - ; SI: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[TRUNC19]] + ; SI: [[SHL12:%[0-9]+]]:_(s32) = G_SHL [[AND19]], [[COPY18]](s32) + ; SI: [[TRUNC19:%[0-9]+]]:_(s16) = G_TRUNC [[SHL12]](s32) + ; SI: [[OR12:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[TRUNC19]] ; SI: [[TRUNC20:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD20]](s32) ; SI: [[AND20:%[0-9]+]]:_(s16) = G_AND [[TRUNC20]], 
[[C11]] ; SI: [[COPY20:%[0-9]+]]:_(s32) = COPY [[C12]](s32) ; SI: [[COPY21:%[0-9]+]]:_(s32) = COPY [[LOAD21]](s32) ; SI: [[AND21:%[0-9]+]]:_(s32) = G_AND [[COPY21]], [[C13]] - ; SI: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[AND21]], [[COPY20]](s32) - ; SI: [[TRUNC21:%[0-9]+]]:_(s16) = G_TRUNC [[SHL10]](s32) - ; SI: [[OR10:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[TRUNC21]] + ; SI: [[SHL13:%[0-9]+]]:_(s32) = G_SHL [[AND21]], [[COPY20]](s32) + ; SI: [[TRUNC21:%[0-9]+]]:_(s16) = G_TRUNC [[SHL13]](s32) + ; SI: [[OR13:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[TRUNC21]] ; SI: [[TRUNC22:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD22]](s32) ; SI: [[AND22:%[0-9]+]]:_(s16) = G_AND [[TRUNC22]], [[C11]] ; SI: [[COPY22:%[0-9]+]]:_(s32) = COPY [[C12]](s32) ; SI: [[COPY23:%[0-9]+]]:_(s32) = COPY [[LOAD23]](s32) ; SI: [[AND23:%[0-9]+]]:_(s32) = G_AND [[COPY23]], [[C13]] - ; SI: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[AND23]], [[COPY22]](s32) - ; SI: [[TRUNC23:%[0-9]+]]:_(s16) = G_TRUNC [[SHL11]](s32) - ; SI: [[OR11:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[TRUNC23]] - ; SI: [[MV1:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR6]](s16), [[OR7]](s16), [[OR8]](s16), [[OR9]](s16), [[OR10]](s16), [[OR11]](s16) + ; SI: [[SHL14:%[0-9]+]]:_(s32) = G_SHL [[AND23]], [[COPY22]](s32) + ; SI: [[TRUNC23:%[0-9]+]]:_(s16) = G_TRUNC [[SHL14]](s32) + ; SI: [[OR14:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[TRUNC23]] + ; SI: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; SI: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR10]](s16) + ; SI: [[SHL15:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C14]](s32) + ; SI: [[OR15:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL15]] + ; SI: [[ZEXT8:%[0-9]+]]:_(s32) = G_ZEXT [[OR11]](s16) + ; SI: [[ZEXT9:%[0-9]+]]:_(s32) = G_ZEXT [[OR12]](s16) + ; SI: [[SHL16:%[0-9]+]]:_(s32) = G_SHL [[ZEXT9]], [[C14]](s32) + ; SI: [[OR16:%[0-9]+]]:_(s32) = G_OR [[ZEXT8]], [[SHL16]] + ; SI: [[ZEXT10:%[0-9]+]]:_(s32) = G_ZEXT [[OR13]](s16) + ; SI: [[ZEXT11:%[0-9]+]]:_(s32) = G_ZEXT [[OR14]](s16) + ; SI: [[SHL17:%[0-9]+]]:_(s32) = G_SHL [[ZEXT11]], 
[[C14]](s32) + ; SI: [[OR17:%[0-9]+]]:_(s32) = G_OR [[ZEXT10]], [[SHL17]] + ; SI: [[MV1:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR15]](s32), [[OR16]](s32), [[OR17]](s32) ; SI: [[COPY24:%[0-9]+]]:_(s96) = COPY [[MV]](s96) ; SI: [[COPY25:%[0-9]+]]:_(s96) = COPY [[MV1]](s96) ; SI: $vgpr0_vgpr1_vgpr2 = COPY [[COPY24]](s96) @@ -10648,9 +11653,22 @@ body: | ; CI-MESA: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[C12]](s32) ; CI-MESA: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL5]](s32) ; CI-MESA: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] - ; CI-MESA: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16), [[OR4]](s16), [[OR5]](s16) - ; CI-MESA: [[C14:%[0-9]+]]:_(s64) = G_CONSTANT i64 12 - ; CI-MESA: [[GEP11:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C14]](s64) + ; CI-MESA: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI-MESA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI-MESA: [[C14:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-MESA: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C14]](s32) + ; CI-MESA: [[OR6:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL6]] + ; CI-MESA: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; CI-MESA: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI-MESA: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C14]](s32) + ; CI-MESA: [[OR7:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL7]] + ; CI-MESA: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; CI-MESA: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR5]](s16) + ; CI-MESA: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C14]](s32) + ; CI-MESA: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL8]] + ; CI-MESA: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR6]](s32), [[OR7]](s32), [[OR8]](s32) + ; CI-MESA: [[C15:%[0-9]+]]:_(s64) = G_CONSTANT i64 12 + ; CI-MESA: [[GEP11:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C15]](s64) ; CI-MESA: [[LOAD12:%[0-9]+]]:_(s32) = G_LOAD [[GEP11]](p1) :: (load 1, addrspace 1) ; CI-MESA: [[GEP12:%[0-9]+]]:_(p1) = G_GEP [[GEP11]], [[C]](s64) ; CI-MESA: 
[[LOAD13:%[0-9]+]]:_(s32) = G_LOAD [[GEP12]](p1) :: (load 1, addrspace 1) @@ -10679,50 +11697,62 @@ body: | ; CI-MESA: [[COPY12:%[0-9]+]]:_(s32) = COPY [[C12]](s32) ; CI-MESA: [[COPY13:%[0-9]+]]:_(s32) = COPY [[LOAD13]](s32) ; CI-MESA: [[AND13:%[0-9]+]]:_(s32) = G_AND [[COPY13]], [[C13]] - ; CI-MESA: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY12]](s32) - ; CI-MESA: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) - ; CI-MESA: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] + ; CI-MESA: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY12]](s32) + ; CI-MESA: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL9]](s32) + ; CI-MESA: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] ; CI-MESA: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; CI-MESA: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C11]] ; CI-MESA: [[COPY14:%[0-9]+]]:_(s32) = COPY [[C12]](s32) ; CI-MESA: [[COPY15:%[0-9]+]]:_(s32) = COPY [[LOAD15]](s32) ; CI-MESA: [[AND15:%[0-9]+]]:_(s32) = G_AND [[COPY15]], [[C13]] - ; CI-MESA: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY14]](s32) - ; CI-MESA: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) - ; CI-MESA: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] + ; CI-MESA: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY14]](s32) + ; CI-MESA: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL10]](s32) + ; CI-MESA: [[OR10:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] ; CI-MESA: [[TRUNC16:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD16]](s32) ; CI-MESA: [[AND16:%[0-9]+]]:_(s16) = G_AND [[TRUNC16]], [[C11]] ; CI-MESA: [[COPY16:%[0-9]+]]:_(s32) = COPY [[C12]](s32) ; CI-MESA: [[COPY17:%[0-9]+]]:_(s32) = COPY [[LOAD17]](s32) ; CI-MESA: [[AND17:%[0-9]+]]:_(s32) = G_AND [[COPY17]], [[C13]] - ; CI-MESA: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[AND17]], [[COPY16]](s32) - ; CI-MESA: [[TRUNC17:%[0-9]+]]:_(s16) = G_TRUNC [[SHL8]](s32) - ; CI-MESA: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[TRUNC17]] + ; CI-MESA: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[AND17]], [[COPY16]](s32) + ; 
CI-MESA: [[TRUNC17:%[0-9]+]]:_(s16) = G_TRUNC [[SHL11]](s32) + ; CI-MESA: [[OR11:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[TRUNC17]] ; CI-MESA: [[TRUNC18:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD18]](s32) ; CI-MESA: [[AND18:%[0-9]+]]:_(s16) = G_AND [[TRUNC18]], [[C11]] ; CI-MESA: [[COPY18:%[0-9]+]]:_(s32) = COPY [[C12]](s32) ; CI-MESA: [[COPY19:%[0-9]+]]:_(s32) = COPY [[LOAD19]](s32) ; CI-MESA: [[AND19:%[0-9]+]]:_(s32) = G_AND [[COPY19]], [[C13]] - ; CI-MESA: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[AND19]], [[COPY18]](s32) - ; CI-MESA: [[TRUNC19:%[0-9]+]]:_(s16) = G_TRUNC [[SHL9]](s32) - ; CI-MESA: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[TRUNC19]] + ; CI-MESA: [[SHL12:%[0-9]+]]:_(s32) = G_SHL [[AND19]], [[COPY18]](s32) + ; CI-MESA: [[TRUNC19:%[0-9]+]]:_(s16) = G_TRUNC [[SHL12]](s32) + ; CI-MESA: [[OR12:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[TRUNC19]] ; CI-MESA: [[TRUNC20:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD20]](s32) ; CI-MESA: [[AND20:%[0-9]+]]:_(s16) = G_AND [[TRUNC20]], [[C11]] ; CI-MESA: [[COPY20:%[0-9]+]]:_(s32) = COPY [[C12]](s32) ; CI-MESA: [[COPY21:%[0-9]+]]:_(s32) = COPY [[LOAD21]](s32) ; CI-MESA: [[AND21:%[0-9]+]]:_(s32) = G_AND [[COPY21]], [[C13]] - ; CI-MESA: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[AND21]], [[COPY20]](s32) - ; CI-MESA: [[TRUNC21:%[0-9]+]]:_(s16) = G_TRUNC [[SHL10]](s32) - ; CI-MESA: [[OR10:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[TRUNC21]] + ; CI-MESA: [[SHL13:%[0-9]+]]:_(s32) = G_SHL [[AND21]], [[COPY20]](s32) + ; CI-MESA: [[TRUNC21:%[0-9]+]]:_(s16) = G_TRUNC [[SHL13]](s32) + ; CI-MESA: [[OR13:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[TRUNC21]] ; CI-MESA: [[TRUNC22:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD22]](s32) ; CI-MESA: [[AND22:%[0-9]+]]:_(s16) = G_AND [[TRUNC22]], [[C11]] ; CI-MESA: [[COPY22:%[0-9]+]]:_(s32) = COPY [[C12]](s32) ; CI-MESA: [[COPY23:%[0-9]+]]:_(s32) = COPY [[LOAD23]](s32) ; CI-MESA: [[AND23:%[0-9]+]]:_(s32) = G_AND [[COPY23]], [[C13]] - ; CI-MESA: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[AND23]], [[COPY22]](s32) - ; CI-MESA: [[TRUNC23:%[0-9]+]]:_(s16) = G_TRUNC 
[[SHL11]](s32) - ; CI-MESA: [[OR11:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[TRUNC23]] - ; CI-MESA: [[MV1:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR6]](s16), [[OR7]](s16), [[OR8]](s16), [[OR9]](s16), [[OR10]](s16), [[OR11]](s16) + ; CI-MESA: [[SHL14:%[0-9]+]]:_(s32) = G_SHL [[AND23]], [[COPY22]](s32) + ; CI-MESA: [[TRUNC23:%[0-9]+]]:_(s16) = G_TRUNC [[SHL14]](s32) + ; CI-MESA: [[OR14:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[TRUNC23]] + ; CI-MESA: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; CI-MESA: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR10]](s16) + ; CI-MESA: [[SHL15:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C14]](s32) + ; CI-MESA: [[OR15:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL15]] + ; CI-MESA: [[ZEXT8:%[0-9]+]]:_(s32) = G_ZEXT [[OR11]](s16) + ; CI-MESA: [[ZEXT9:%[0-9]+]]:_(s32) = G_ZEXT [[OR12]](s16) + ; CI-MESA: [[SHL16:%[0-9]+]]:_(s32) = G_SHL [[ZEXT9]], [[C14]](s32) + ; CI-MESA: [[OR16:%[0-9]+]]:_(s32) = G_OR [[ZEXT8]], [[SHL16]] + ; CI-MESA: [[ZEXT10:%[0-9]+]]:_(s32) = G_ZEXT [[OR13]](s16) + ; CI-MESA: [[ZEXT11:%[0-9]+]]:_(s32) = G_ZEXT [[OR14]](s16) + ; CI-MESA: [[SHL17:%[0-9]+]]:_(s32) = G_SHL [[ZEXT11]], [[C14]](s32) + ; CI-MESA: [[OR17:%[0-9]+]]:_(s32) = G_OR [[ZEXT10]], [[SHL17]] + ; CI-MESA: [[MV1:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR15]](s32), [[OR16]](s32), [[OR17]](s32) ; CI-MESA: [[COPY24:%[0-9]+]]:_(s96) = COPY [[MV]](s96) ; CI-MESA: [[COPY25:%[0-9]+]]:_(s96) = COPY [[MV1]](s96) ; CI-MESA: $vgpr0_vgpr1_vgpr2 = COPY [[COPY24]](s96) @@ -10801,9 +11831,22 @@ body: | ; VI: [[AND11:%[0-9]+]]:_(s16) = G_AND [[TRUNC11]], [[C11]] ; VI: [[SHL5:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C12]](s16) ; VI: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL5]] - ; VI: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16), [[OR4]](s16), [[OR5]](s16) - ; VI: [[C13:%[0-9]+]]:_(s64) = G_CONSTANT i64 12 - ; VI: [[GEP11:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C13]](s64) + ; VI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; VI: 
[[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; VI: [[C13:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C13]](s32) + ; VI: [[OR6:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL6]] + ; VI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; VI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; VI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C13]](s32) + ; VI: [[OR7:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL7]] + ; VI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; VI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR5]](s16) + ; VI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C13]](s32) + ; VI: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL8]] + ; VI: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR6]](s32), [[OR7]](s32), [[OR8]](s32) + ; VI: [[C14:%[0-9]+]]:_(s64) = G_CONSTANT i64 12 + ; VI: [[GEP11:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C14]](s64) ; VI: [[LOAD12:%[0-9]+]]:_(s32) = G_LOAD [[GEP11]](p1) :: (load 1, addrspace 1) ; VI: [[GEP12:%[0-9]+]]:_(p1) = G_GEP [[GEP11]], [[C]](s64) ; VI: [[LOAD13:%[0-9]+]]:_(s32) = G_LOAD [[GEP12]](p1) :: (load 1, addrspace 1) @@ -10831,39 +11874,51 @@ body: | ; VI: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C11]] ; VI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD13]](s32) ; VI: [[AND13:%[0-9]+]]:_(s16) = G_AND [[TRUNC13]], [[C11]] - ; VI: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C12]](s16) - ; VI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL6]] + ; VI: [[SHL9:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C12]](s16) + ; VI: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL9]] ; VI: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; VI: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C11]] ; VI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD15]](s32) ; VI: [[AND15:%[0-9]+]]:_(s16) = G_AND [[TRUNC15]], [[C11]] - ; VI: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C12]](s16) - ; VI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL7]] + ; VI: [[SHL10:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C12]](s16) + ; VI: 
[[OR10:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL10]] ; VI: [[TRUNC16:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD16]](s32) ; VI: [[AND16:%[0-9]+]]:_(s16) = G_AND [[TRUNC16]], [[C11]] ; VI: [[TRUNC17:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD17]](s32) ; VI: [[AND17:%[0-9]+]]:_(s16) = G_AND [[TRUNC17]], [[C11]] - ; VI: [[SHL8:%[0-9]+]]:_(s16) = G_SHL [[AND17]], [[C12]](s16) - ; VI: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[SHL8]] + ; VI: [[SHL11:%[0-9]+]]:_(s16) = G_SHL [[AND17]], [[C12]](s16) + ; VI: [[OR11:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[SHL11]] ; VI: [[TRUNC18:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD18]](s32) ; VI: [[AND18:%[0-9]+]]:_(s16) = G_AND [[TRUNC18]], [[C11]] ; VI: [[TRUNC19:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD19]](s32) ; VI: [[AND19:%[0-9]+]]:_(s16) = G_AND [[TRUNC19]], [[C11]] - ; VI: [[SHL9:%[0-9]+]]:_(s16) = G_SHL [[AND19]], [[C12]](s16) - ; VI: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[SHL9]] + ; VI: [[SHL12:%[0-9]+]]:_(s16) = G_SHL [[AND19]], [[C12]](s16) + ; VI: [[OR12:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[SHL12]] ; VI: [[TRUNC20:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD20]](s32) ; VI: [[AND20:%[0-9]+]]:_(s16) = G_AND [[TRUNC20]], [[C11]] ; VI: [[TRUNC21:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD21]](s32) ; VI: [[AND21:%[0-9]+]]:_(s16) = G_AND [[TRUNC21]], [[C11]] - ; VI: [[SHL10:%[0-9]+]]:_(s16) = G_SHL [[AND21]], [[C12]](s16) - ; VI: [[OR10:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[SHL10]] + ; VI: [[SHL13:%[0-9]+]]:_(s16) = G_SHL [[AND21]], [[C12]](s16) + ; VI: [[OR13:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[SHL13]] ; VI: [[TRUNC22:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD22]](s32) ; VI: [[AND22:%[0-9]+]]:_(s16) = G_AND [[TRUNC22]], [[C11]] ; VI: [[TRUNC23:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD23]](s32) ; VI: [[AND23:%[0-9]+]]:_(s16) = G_AND [[TRUNC23]], [[C11]] - ; VI: [[SHL11:%[0-9]+]]:_(s16) = G_SHL [[AND23]], [[C12]](s16) - ; VI: [[OR11:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[SHL11]] - ; VI: [[MV1:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR6]](s16), [[OR7]](s16), [[OR8]](s16), [[OR9]](s16), [[OR10]](s16), 
[[OR11]](s16) + ; VI: [[SHL14:%[0-9]+]]:_(s16) = G_SHL [[AND23]], [[C12]](s16) + ; VI: [[OR14:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[SHL14]] + ; VI: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; VI: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR10]](s16) + ; VI: [[SHL15:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C13]](s32) + ; VI: [[OR15:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL15]] + ; VI: [[ZEXT8:%[0-9]+]]:_(s32) = G_ZEXT [[OR11]](s16) + ; VI: [[ZEXT9:%[0-9]+]]:_(s32) = G_ZEXT [[OR12]](s16) + ; VI: [[SHL16:%[0-9]+]]:_(s32) = G_SHL [[ZEXT9]], [[C13]](s32) + ; VI: [[OR16:%[0-9]+]]:_(s32) = G_OR [[ZEXT8]], [[SHL16]] + ; VI: [[ZEXT10:%[0-9]+]]:_(s32) = G_ZEXT [[OR13]](s16) + ; VI: [[ZEXT11:%[0-9]+]]:_(s32) = G_ZEXT [[OR14]](s16) + ; VI: [[SHL17:%[0-9]+]]:_(s32) = G_SHL [[ZEXT11]], [[C13]](s32) + ; VI: [[OR17:%[0-9]+]]:_(s32) = G_OR [[ZEXT10]], [[SHL17]] + ; VI: [[MV1:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR15]](s32), [[OR16]](s32), [[OR17]](s32) ; VI: [[COPY1:%[0-9]+]]:_(s96) = COPY [[MV]](s96) ; VI: [[COPY2:%[0-9]+]]:_(s96) = COPY [[MV1]](s96) ; VI: $vgpr0_vgpr1_vgpr2 = COPY [[COPY1]](s96) @@ -10949,9 +12004,22 @@ body: | ; GFX9-MESA: [[AND11:%[0-9]+]]:_(s16) = G_AND [[TRUNC11]], [[C11]] ; GFX9-MESA: [[SHL5:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C12]](s16) ; GFX9-MESA: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL5]] - ; GFX9-MESA: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16), [[OR4]](s16), [[OR5]](s16) - ; GFX9-MESA: [[C13:%[0-9]+]]:_(s64) = G_CONSTANT i64 12 - ; GFX9-MESA: [[GEP11:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C13]](s64) + ; GFX9-MESA: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9-MESA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9-MESA: [[C13:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9-MESA: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C13]](s32) + ; GFX9-MESA: [[OR6:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL6]] + ; GFX9-MESA: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; GFX9-MESA: 
[[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; GFX9-MESA: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C13]](s32) + ; GFX9-MESA: [[OR7:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL7]] + ; GFX9-MESA: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; GFX9-MESA: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR5]](s16) + ; GFX9-MESA: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C13]](s32) + ; GFX9-MESA: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL8]] + ; GFX9-MESA: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR6]](s32), [[OR7]](s32), [[OR8]](s32) + ; GFX9-MESA: [[C14:%[0-9]+]]:_(s64) = G_CONSTANT i64 12 + ; GFX9-MESA: [[GEP11:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C14]](s64) ; GFX9-MESA: [[LOAD12:%[0-9]+]]:_(s32) = G_LOAD [[GEP11]](p1) :: (load 1, addrspace 1) ; GFX9-MESA: [[GEP12:%[0-9]+]]:_(p1) = G_GEP [[GEP11]], [[C]](s64) ; GFX9-MESA: [[LOAD13:%[0-9]+]]:_(s32) = G_LOAD [[GEP12]](p1) :: (load 1, addrspace 1) @@ -10979,39 +12047,51 @@ body: | ; GFX9-MESA: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C11]] ; GFX9-MESA: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD13]](s32) ; GFX9-MESA: [[AND13:%[0-9]+]]:_(s16) = G_AND [[TRUNC13]], [[C11]] - ; GFX9-MESA: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C12]](s16) - ; GFX9-MESA: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL6]] + ; GFX9-MESA: [[SHL9:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C12]](s16) + ; GFX9-MESA: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL9]] ; GFX9-MESA: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; GFX9-MESA: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C11]] ; GFX9-MESA: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD15]](s32) ; GFX9-MESA: [[AND15:%[0-9]+]]:_(s16) = G_AND [[TRUNC15]], [[C11]] - ; GFX9-MESA: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C12]](s16) - ; GFX9-MESA: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL7]] + ; GFX9-MESA: [[SHL10:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C12]](s16) + ; GFX9-MESA: [[OR10:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL10]] ; GFX9-MESA: [[TRUNC16:%[0-9]+]]:_(s16) = 
G_TRUNC [[LOAD16]](s32) ; GFX9-MESA: [[AND16:%[0-9]+]]:_(s16) = G_AND [[TRUNC16]], [[C11]] ; GFX9-MESA: [[TRUNC17:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD17]](s32) ; GFX9-MESA: [[AND17:%[0-9]+]]:_(s16) = G_AND [[TRUNC17]], [[C11]] - ; GFX9-MESA: [[SHL8:%[0-9]+]]:_(s16) = G_SHL [[AND17]], [[C12]](s16) - ; GFX9-MESA: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[SHL8]] + ; GFX9-MESA: [[SHL11:%[0-9]+]]:_(s16) = G_SHL [[AND17]], [[C12]](s16) + ; GFX9-MESA: [[OR11:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[SHL11]] ; GFX9-MESA: [[TRUNC18:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD18]](s32) ; GFX9-MESA: [[AND18:%[0-9]+]]:_(s16) = G_AND [[TRUNC18]], [[C11]] ; GFX9-MESA: [[TRUNC19:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD19]](s32) ; GFX9-MESA: [[AND19:%[0-9]+]]:_(s16) = G_AND [[TRUNC19]], [[C11]] - ; GFX9-MESA: [[SHL9:%[0-9]+]]:_(s16) = G_SHL [[AND19]], [[C12]](s16) - ; GFX9-MESA: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[SHL9]] + ; GFX9-MESA: [[SHL12:%[0-9]+]]:_(s16) = G_SHL [[AND19]], [[C12]](s16) + ; GFX9-MESA: [[OR12:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[SHL12]] ; GFX9-MESA: [[TRUNC20:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD20]](s32) ; GFX9-MESA: [[AND20:%[0-9]+]]:_(s16) = G_AND [[TRUNC20]], [[C11]] ; GFX9-MESA: [[TRUNC21:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD21]](s32) ; GFX9-MESA: [[AND21:%[0-9]+]]:_(s16) = G_AND [[TRUNC21]], [[C11]] - ; GFX9-MESA: [[SHL10:%[0-9]+]]:_(s16) = G_SHL [[AND21]], [[C12]](s16) - ; GFX9-MESA: [[OR10:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[SHL10]] + ; GFX9-MESA: [[SHL13:%[0-9]+]]:_(s16) = G_SHL [[AND21]], [[C12]](s16) + ; GFX9-MESA: [[OR13:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[SHL13]] ; GFX9-MESA: [[TRUNC22:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD22]](s32) ; GFX9-MESA: [[AND22:%[0-9]+]]:_(s16) = G_AND [[TRUNC22]], [[C11]] ; GFX9-MESA: [[TRUNC23:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD23]](s32) ; GFX9-MESA: [[AND23:%[0-9]+]]:_(s16) = G_AND [[TRUNC23]], [[C11]] - ; GFX9-MESA: [[SHL11:%[0-9]+]]:_(s16) = G_SHL [[AND23]], [[C12]](s16) - ; GFX9-MESA: [[OR11:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[SHL11]] - ; 
GFX9-MESA: [[MV1:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR6]](s16), [[OR7]](s16), [[OR8]](s16), [[OR9]](s16), [[OR10]](s16), [[OR11]](s16) + ; GFX9-MESA: [[SHL14:%[0-9]+]]:_(s16) = G_SHL [[AND23]], [[C12]](s16) + ; GFX9-MESA: [[OR14:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[SHL14]] + ; GFX9-MESA: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; GFX9-MESA: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR10]](s16) + ; GFX9-MESA: [[SHL15:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C13]](s32) + ; GFX9-MESA: [[OR15:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL15]] + ; GFX9-MESA: [[ZEXT8:%[0-9]+]]:_(s32) = G_ZEXT [[OR11]](s16) + ; GFX9-MESA: [[ZEXT9:%[0-9]+]]:_(s32) = G_ZEXT [[OR12]](s16) + ; GFX9-MESA: [[SHL16:%[0-9]+]]:_(s32) = G_SHL [[ZEXT9]], [[C13]](s32) + ; GFX9-MESA: [[OR16:%[0-9]+]]:_(s32) = G_OR [[ZEXT8]], [[SHL16]] + ; GFX9-MESA: [[ZEXT10:%[0-9]+]]:_(s32) = G_ZEXT [[OR13]](s16) + ; GFX9-MESA: [[ZEXT11:%[0-9]+]]:_(s32) = G_ZEXT [[OR14]](s16) + ; GFX9-MESA: [[SHL17:%[0-9]+]]:_(s32) = G_SHL [[ZEXT11]], [[C13]](s32) + ; GFX9-MESA: [[OR17:%[0-9]+]]:_(s32) = G_OR [[ZEXT10]], [[SHL17]] + ; GFX9-MESA: [[MV1:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR15]](s32), [[OR16]](s32), [[OR17]](s32) ; GFX9-MESA: [[COPY1:%[0-9]+]]:_(s96) = COPY [[MV]](s96) ; GFX9-MESA: [[COPY2:%[0-9]+]]:_(s96) = COPY [[MV1]](s96) ; GFX9-MESA: $vgpr0_vgpr1_vgpr2 = COPY [[COPY1]](s96) @@ -11033,52 +12113,78 @@ body: | ; SI-LABEL: name: test_extload_global_v2s96_from_24_align2 ; SI: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 ; SI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 2, addrspace 1) - ; SI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; SI: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; SI: [[GEP:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C]](s64) ; SI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p1) :: (load 2, addrspace 1) - ; SI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; SI: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 ; SI: [[GEP1:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C1]](s64) ; SI: [[LOAD2:%[0-9]+]]:_(s32) = 
G_LOAD [[GEP1]](p1) :: (load 2, addrspace 1) - ; SI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; SI: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 6 ; SI: [[GEP2:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C2]](s64) ; SI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p1) :: (load 2, addrspace 1) - ; SI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) ; SI: [[C3:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 ; SI: [[GEP3:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C3]](s64) ; SI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p1) :: (load 2, addrspace 1) - ; SI: [[TRUNC4:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD4]](s32) ; SI: [[C4:%[0-9]+]]:_(s64) = G_CONSTANT i64 10 ; SI: [[GEP4:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C4]](s64) ; SI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p1) :: (load 2, addrspace 1) - ; SI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) - ; SI: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16), [[TRUNC4]](s16), [[TRUNC5]](s16) - ; SI: [[C5:%[0-9]+]]:_(s64) = G_CONSTANT i64 12 - ; SI: [[GEP5:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C5]](s64) + ; SI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; SI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; SI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C5]] + ; SI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; SI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C5]] + ; SI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; SI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C6]](s32) + ; SI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; SI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; SI: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C5]] + ; SI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; SI: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C5]] + ; SI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C6]](s32) + ; SI: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; SI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD4]](s32) + ; SI: [[AND4:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C5]] + ; SI: 
[[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) + ; SI: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C5]] + ; SI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[C6]](s32) + ; SI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[AND4]], [[SHL2]] + ; SI: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32), [[OR2]](s32) + ; SI: [[C7:%[0-9]+]]:_(s64) = G_CONSTANT i64 12 + ; SI: [[GEP5:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C7]](s64) ; SI: [[LOAD6:%[0-9]+]]:_(s32) = G_LOAD [[GEP5]](p1) :: (load 2, addrspace 1) - ; SI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; SI: [[GEP6:%[0-9]+]]:_(p1) = G_GEP [[GEP5]], [[C]](s64) ; SI: [[LOAD7:%[0-9]+]]:_(s32) = G_LOAD [[GEP6]](p1) :: (load 2, addrspace 1) - ; SI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) ; SI: [[GEP7:%[0-9]+]]:_(p1) = G_GEP [[GEP5]], [[C1]](s64) ; SI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p1) :: (load 2, addrspace 1) - ; SI: [[TRUNC8:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD8]](s32) ; SI: [[GEP8:%[0-9]+]]:_(p1) = G_GEP [[GEP5]], [[C2]](s64) ; SI: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p1) :: (load 2, addrspace 1) - ; SI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD9]](s32) ; SI: [[GEP9:%[0-9]+]]:_(p1) = G_GEP [[GEP5]], [[C3]](s64) ; SI: [[LOAD10:%[0-9]+]]:_(s32) = G_LOAD [[GEP9]](p1) :: (load 2, addrspace 1) - ; SI: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; SI: [[GEP10:%[0-9]+]]:_(p1) = G_GEP [[GEP5]], [[C4]](s64) ; SI: [[LOAD11:%[0-9]+]]:_(s32) = G_LOAD [[GEP10]](p1) :: (load 2, addrspace 1) - ; SI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD11]](s32) - ; SI: [[MV1:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[TRUNC6]](s16), [[TRUNC7]](s16), [[TRUNC8]](s16), [[TRUNC9]](s16), [[TRUNC10]](s16), [[TRUNC11]](s16) - ; SI: [[COPY1:%[0-9]+]]:_(s96) = COPY [[MV]](s96) - ; SI: [[COPY2:%[0-9]+]]:_(s96) = COPY [[MV1]](s96) - ; SI: $vgpr0_vgpr1_vgpr2 = COPY [[COPY1]](s96) - ; SI: $vgpr3_vgpr4_vgpr5 = COPY [[COPY2]](s96) + ; SI: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LOAD6]](s32) + ; SI: [[AND6:%[0-9]+]]:_(s32) = G_AND 
[[COPY7]], [[C5]] + ; SI: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) + ; SI: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY8]], [[C5]] + ; SI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C6]](s32) + ; SI: [[OR3:%[0-9]+]]:_(s32) = G_OR [[AND6]], [[SHL3]] + ; SI: [[COPY9:%[0-9]+]]:_(s32) = COPY [[LOAD8]](s32) + ; SI: [[AND8:%[0-9]+]]:_(s32) = G_AND [[COPY9]], [[C5]] + ; SI: [[COPY10:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32) + ; SI: [[AND9:%[0-9]+]]:_(s32) = G_AND [[COPY10]], [[C5]] + ; SI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[C6]](s32) + ; SI: [[OR4:%[0-9]+]]:_(s32) = G_OR [[AND8]], [[SHL4]] + ; SI: [[COPY11:%[0-9]+]]:_(s32) = COPY [[LOAD10]](s32) + ; SI: [[AND10:%[0-9]+]]:_(s32) = G_AND [[COPY11]], [[C5]] + ; SI: [[COPY12:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32) + ; SI: [[AND11:%[0-9]+]]:_(s32) = G_AND [[COPY12]], [[C5]] + ; SI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[C6]](s32) + ; SI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[AND10]], [[SHL5]] + ; SI: [[MV1:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR3]](s32), [[OR4]](s32), [[OR5]](s32) + ; SI: [[COPY13:%[0-9]+]]:_(s96) = COPY [[MV]](s96) + ; SI: [[COPY14:%[0-9]+]]:_(s96) = COPY [[MV1]](s96) + ; SI: $vgpr0_vgpr1_vgpr2 = COPY [[COPY13]](s96) + ; SI: $vgpr3_vgpr4_vgpr5 = COPY [[COPY14]](s96) ; CI-HSA-LABEL: name: test_extload_global_v2s96_from_24_align2 ; CI-HSA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 ; CI-HSA: [[LOAD:%[0-9]+]]:_(<2 x s96>) = G_LOAD [[COPY]](p1) :: (load 24, align 2, addrspace 1) @@ -11089,101 +12195,153 @@ body: | ; CI-MESA-LABEL: name: test_extload_global_v2s96_from_24_align2 ; CI-MESA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 ; CI-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 2, addrspace 1) - ; CI-MESA: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; CI-MESA: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; CI-MESA: [[GEP:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C]](s64) ; CI-MESA: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p1) :: (load 2, addrspace 1) - ; CI-MESA: [[TRUNC1:%[0-9]+]]:_(s16) = 
G_TRUNC [[LOAD1]](s32) ; CI-MESA: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 ; CI-MESA: [[GEP1:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C1]](s64) ; CI-MESA: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p1) :: (load 2, addrspace 1) - ; CI-MESA: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; CI-MESA: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 6 ; CI-MESA: [[GEP2:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C2]](s64) ; CI-MESA: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p1) :: (load 2, addrspace 1) - ; CI-MESA: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) ; CI-MESA: [[C3:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 ; CI-MESA: [[GEP3:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C3]](s64) ; CI-MESA: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p1) :: (load 2, addrspace 1) - ; CI-MESA: [[TRUNC4:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD4]](s32) ; CI-MESA: [[C4:%[0-9]+]]:_(s64) = G_CONSTANT i64 10 ; CI-MESA: [[GEP4:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C4]](s64) ; CI-MESA: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p1) :: (load 2, addrspace 1) - ; CI-MESA: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) - ; CI-MESA: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16), [[TRUNC4]](s16), [[TRUNC5]](s16) - ; CI-MESA: [[C5:%[0-9]+]]:_(s64) = G_CONSTANT i64 12 - ; CI-MESA: [[GEP5:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C5]](s64) + ; CI-MESA: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; CI-MESA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; CI-MESA: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C5]] + ; CI-MESA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; CI-MESA: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C5]] + ; CI-MESA: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-MESA: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C6]](s32) + ; CI-MESA: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; CI-MESA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; CI-MESA: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C5]] + ; CI-MESA: [[COPY4:%[0-9]+]]:_(s32) = COPY 
[[LOAD3]](s32) + ; CI-MESA: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C5]] + ; CI-MESA: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C6]](s32) + ; CI-MESA: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; CI-MESA: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD4]](s32) + ; CI-MESA: [[AND4:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C5]] + ; CI-MESA: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) + ; CI-MESA: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C5]] + ; CI-MESA: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[C6]](s32) + ; CI-MESA: [[OR2:%[0-9]+]]:_(s32) = G_OR [[AND4]], [[SHL2]] + ; CI-MESA: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32), [[OR2]](s32) + ; CI-MESA: [[C7:%[0-9]+]]:_(s64) = G_CONSTANT i64 12 + ; CI-MESA: [[GEP5:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C7]](s64) ; CI-MESA: [[LOAD6:%[0-9]+]]:_(s32) = G_LOAD [[GEP5]](p1) :: (load 2, addrspace 1) - ; CI-MESA: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; CI-MESA: [[GEP6:%[0-9]+]]:_(p1) = G_GEP [[GEP5]], [[C]](s64) ; CI-MESA: [[LOAD7:%[0-9]+]]:_(s32) = G_LOAD [[GEP6]](p1) :: (load 2, addrspace 1) - ; CI-MESA: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) ; CI-MESA: [[GEP7:%[0-9]+]]:_(p1) = G_GEP [[GEP5]], [[C1]](s64) ; CI-MESA: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p1) :: (load 2, addrspace 1) - ; CI-MESA: [[TRUNC8:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD8]](s32) ; CI-MESA: [[GEP8:%[0-9]+]]:_(p1) = G_GEP [[GEP5]], [[C2]](s64) ; CI-MESA: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p1) :: (load 2, addrspace 1) - ; CI-MESA: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD9]](s32) ; CI-MESA: [[GEP9:%[0-9]+]]:_(p1) = G_GEP [[GEP5]], [[C3]](s64) ; CI-MESA: [[LOAD10:%[0-9]+]]:_(s32) = G_LOAD [[GEP9]](p1) :: (load 2, addrspace 1) - ; CI-MESA: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; CI-MESA: [[GEP10:%[0-9]+]]:_(p1) = G_GEP [[GEP5]], [[C4]](s64) ; CI-MESA: [[LOAD11:%[0-9]+]]:_(s32) = G_LOAD [[GEP10]](p1) :: (load 2, addrspace 1) - ; CI-MESA: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC 
[[LOAD11]](s32) - ; CI-MESA: [[MV1:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[TRUNC6]](s16), [[TRUNC7]](s16), [[TRUNC8]](s16), [[TRUNC9]](s16), [[TRUNC10]](s16), [[TRUNC11]](s16) - ; CI-MESA: [[COPY1:%[0-9]+]]:_(s96) = COPY [[MV]](s96) - ; CI-MESA: [[COPY2:%[0-9]+]]:_(s96) = COPY [[MV1]](s96) - ; CI-MESA: $vgpr0_vgpr1_vgpr2 = COPY [[COPY1]](s96) - ; CI-MESA: $vgpr3_vgpr4_vgpr5 = COPY [[COPY2]](s96) + ; CI-MESA: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LOAD6]](s32) + ; CI-MESA: [[AND6:%[0-9]+]]:_(s32) = G_AND [[COPY7]], [[C5]] + ; CI-MESA: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) + ; CI-MESA: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY8]], [[C5]] + ; CI-MESA: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C6]](s32) + ; CI-MESA: [[OR3:%[0-9]+]]:_(s32) = G_OR [[AND6]], [[SHL3]] + ; CI-MESA: [[COPY9:%[0-9]+]]:_(s32) = COPY [[LOAD8]](s32) + ; CI-MESA: [[AND8:%[0-9]+]]:_(s32) = G_AND [[COPY9]], [[C5]] + ; CI-MESA: [[COPY10:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32) + ; CI-MESA: [[AND9:%[0-9]+]]:_(s32) = G_AND [[COPY10]], [[C5]] + ; CI-MESA: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[C6]](s32) + ; CI-MESA: [[OR4:%[0-9]+]]:_(s32) = G_OR [[AND8]], [[SHL4]] + ; CI-MESA: [[COPY11:%[0-9]+]]:_(s32) = COPY [[LOAD10]](s32) + ; CI-MESA: [[AND10:%[0-9]+]]:_(s32) = G_AND [[COPY11]], [[C5]] + ; CI-MESA: [[COPY12:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32) + ; CI-MESA: [[AND11:%[0-9]+]]:_(s32) = G_AND [[COPY12]], [[C5]] + ; CI-MESA: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[C6]](s32) + ; CI-MESA: [[OR5:%[0-9]+]]:_(s32) = G_OR [[AND10]], [[SHL5]] + ; CI-MESA: [[MV1:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR3]](s32), [[OR4]](s32), [[OR5]](s32) + ; CI-MESA: [[COPY13:%[0-9]+]]:_(s96) = COPY [[MV]](s96) + ; CI-MESA: [[COPY14:%[0-9]+]]:_(s96) = COPY [[MV1]](s96) + ; CI-MESA: $vgpr0_vgpr1_vgpr2 = COPY [[COPY13]](s96) + ; CI-MESA: $vgpr3_vgpr4_vgpr5 = COPY [[COPY14]](s96) ; VI-LABEL: name: test_extload_global_v2s96_from_24_align2 ; VI: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD 
[[COPY]](p1) :: (load 2, addrspace 1) - ; VI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; VI: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; VI: [[GEP:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C]](s64) ; VI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p1) :: (load 2, addrspace 1) - ; VI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; VI: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 ; VI: [[GEP1:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C1]](s64) ; VI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p1) :: (load 2, addrspace 1) - ; VI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; VI: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 6 ; VI: [[GEP2:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C2]](s64) ; VI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p1) :: (load 2, addrspace 1) - ; VI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) ; VI: [[C3:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 ; VI: [[GEP3:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C3]](s64) ; VI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p1) :: (load 2, addrspace 1) - ; VI: [[TRUNC4:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD4]](s32) ; VI: [[C4:%[0-9]+]]:_(s64) = G_CONSTANT i64 10 ; VI: [[GEP4:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C4]](s64) ; VI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p1) :: (load 2, addrspace 1) - ; VI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) - ; VI: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16), [[TRUNC4]](s16), [[TRUNC5]](s16) - ; VI: [[C5:%[0-9]+]]:_(s64) = G_CONSTANT i64 12 - ; VI: [[GEP5:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C5]](s64) + ; VI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; VI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; VI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C5]] + ; VI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; VI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C5]] + ; VI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C6]](s32) + ; VI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], 
[[SHL]] + ; VI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; VI: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C5]] + ; VI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; VI: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C5]] + ; VI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C6]](s32) + ; VI: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; VI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD4]](s32) + ; VI: [[AND4:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C5]] + ; VI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) + ; VI: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C5]] + ; VI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[C6]](s32) + ; VI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[AND4]], [[SHL2]] + ; VI: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32), [[OR2]](s32) + ; VI: [[C7:%[0-9]+]]:_(s64) = G_CONSTANT i64 12 + ; VI: [[GEP5:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C7]](s64) ; VI: [[LOAD6:%[0-9]+]]:_(s32) = G_LOAD [[GEP5]](p1) :: (load 2, addrspace 1) - ; VI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; VI: [[GEP6:%[0-9]+]]:_(p1) = G_GEP [[GEP5]], [[C]](s64) ; VI: [[LOAD7:%[0-9]+]]:_(s32) = G_LOAD [[GEP6]](p1) :: (load 2, addrspace 1) - ; VI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) ; VI: [[GEP7:%[0-9]+]]:_(p1) = G_GEP [[GEP5]], [[C1]](s64) ; VI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p1) :: (load 2, addrspace 1) - ; VI: [[TRUNC8:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD8]](s32) ; VI: [[GEP8:%[0-9]+]]:_(p1) = G_GEP [[GEP5]], [[C2]](s64) ; VI: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p1) :: (load 2, addrspace 1) - ; VI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD9]](s32) ; VI: [[GEP9:%[0-9]+]]:_(p1) = G_GEP [[GEP5]], [[C3]](s64) ; VI: [[LOAD10:%[0-9]+]]:_(s32) = G_LOAD [[GEP9]](p1) :: (load 2, addrspace 1) - ; VI: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; VI: [[GEP10:%[0-9]+]]:_(p1) = G_GEP [[GEP5]], [[C4]](s64) ; VI: [[LOAD11:%[0-9]+]]:_(s32) = G_LOAD [[GEP10]](p1) :: (load 2, addrspace 1) - ; VI: [[TRUNC11:%[0-9]+]]:_(s16) = 
G_TRUNC [[LOAD11]](s32) - ; VI: [[MV1:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[TRUNC6]](s16), [[TRUNC7]](s16), [[TRUNC8]](s16), [[TRUNC9]](s16), [[TRUNC10]](s16), [[TRUNC11]](s16) - ; VI: [[COPY1:%[0-9]+]]:_(s96) = COPY [[MV]](s96) - ; VI: [[COPY2:%[0-9]+]]:_(s96) = COPY [[MV1]](s96) - ; VI: $vgpr0_vgpr1_vgpr2 = COPY [[COPY1]](s96) - ; VI: $vgpr3_vgpr4_vgpr5 = COPY [[COPY2]](s96) + ; VI: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LOAD6]](s32) + ; VI: [[AND6:%[0-9]+]]:_(s32) = G_AND [[COPY7]], [[C5]] + ; VI: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) + ; VI: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY8]], [[C5]] + ; VI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C6]](s32) + ; VI: [[OR3:%[0-9]+]]:_(s32) = G_OR [[AND6]], [[SHL3]] + ; VI: [[COPY9:%[0-9]+]]:_(s32) = COPY [[LOAD8]](s32) + ; VI: [[AND8:%[0-9]+]]:_(s32) = G_AND [[COPY9]], [[C5]] + ; VI: [[COPY10:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32) + ; VI: [[AND9:%[0-9]+]]:_(s32) = G_AND [[COPY10]], [[C5]] + ; VI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[C6]](s32) + ; VI: [[OR4:%[0-9]+]]:_(s32) = G_OR [[AND8]], [[SHL4]] + ; VI: [[COPY11:%[0-9]+]]:_(s32) = COPY [[LOAD10]](s32) + ; VI: [[AND10:%[0-9]+]]:_(s32) = G_AND [[COPY11]], [[C5]] + ; VI: [[COPY12:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32) + ; VI: [[AND11:%[0-9]+]]:_(s32) = G_AND [[COPY12]], [[C5]] + ; VI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[C6]](s32) + ; VI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[AND10]], [[SHL5]] + ; VI: [[MV1:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR3]](s32), [[OR4]](s32), [[OR5]](s32) + ; VI: [[COPY13:%[0-9]+]]:_(s96) = COPY [[MV]](s96) + ; VI: [[COPY14:%[0-9]+]]:_(s96) = COPY [[MV1]](s96) + ; VI: $vgpr0_vgpr1_vgpr2 = COPY [[COPY13]](s96) + ; VI: $vgpr3_vgpr4_vgpr5 = COPY [[COPY14]](s96) ; GFX9-HSA-LABEL: name: test_extload_global_v2s96_from_24_align2 ; GFX9-HSA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 ; GFX9-HSA: [[LOAD:%[0-9]+]]:_(<2 x s96>) = G_LOAD [[COPY]](p1) :: (load 24, align 2, addrspace 1) @@ -11194,52 +12352,78 @@ body: | ; GFX9-MESA-LABEL: name: 
test_extload_global_v2s96_from_24_align2 ; GFX9-MESA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 ; GFX9-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 2, addrspace 1) - ; GFX9-MESA: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; GFX9-MESA: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; GFX9-MESA: [[GEP:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C]](s64) ; GFX9-MESA: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p1) :: (load 2, addrspace 1) - ; GFX9-MESA: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; GFX9-MESA: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 ; GFX9-MESA: [[GEP1:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C1]](s64) ; GFX9-MESA: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p1) :: (load 2, addrspace 1) - ; GFX9-MESA: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; GFX9-MESA: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 6 ; GFX9-MESA: [[GEP2:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C2]](s64) ; GFX9-MESA: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p1) :: (load 2, addrspace 1) - ; GFX9-MESA: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) ; GFX9-MESA: [[C3:%[0-9]+]]:_(s64) = G_CONSTANT i64 8 ; GFX9-MESA: [[GEP3:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C3]](s64) ; GFX9-MESA: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p1) :: (load 2, addrspace 1) - ; GFX9-MESA: [[TRUNC4:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD4]](s32) ; GFX9-MESA: [[C4:%[0-9]+]]:_(s64) = G_CONSTANT i64 10 ; GFX9-MESA: [[GEP4:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C4]](s64) ; GFX9-MESA: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p1) :: (load 2, addrspace 1) - ; GFX9-MESA: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) - ; GFX9-MESA: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16), [[TRUNC4]](s16), [[TRUNC5]](s16) - ; GFX9-MESA: [[C5:%[0-9]+]]:_(s64) = G_CONSTANT i64 12 - ; GFX9-MESA: [[GEP5:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C5]](s64) + ; GFX9-MESA: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; GFX9-MESA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + 
; GFX9-MESA: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C5]] + ; GFX9-MESA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; GFX9-MESA: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C5]] + ; GFX9-MESA: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9-MESA: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C6]](s32) + ; GFX9-MESA: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; GFX9-MESA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; GFX9-MESA: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C5]] + ; GFX9-MESA: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; GFX9-MESA: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C5]] + ; GFX9-MESA: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C6]](s32) + ; GFX9-MESA: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; GFX9-MESA: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD4]](s32) + ; GFX9-MESA: [[AND4:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C5]] + ; GFX9-MESA: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) + ; GFX9-MESA: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C5]] + ; GFX9-MESA: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[C6]](s32) + ; GFX9-MESA: [[OR2:%[0-9]+]]:_(s32) = G_OR [[AND4]], [[SHL2]] + ; GFX9-MESA: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32), [[OR2]](s32) + ; GFX9-MESA: [[C7:%[0-9]+]]:_(s64) = G_CONSTANT i64 12 + ; GFX9-MESA: [[GEP5:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C7]](s64) ; GFX9-MESA: [[LOAD6:%[0-9]+]]:_(s32) = G_LOAD [[GEP5]](p1) :: (load 2, addrspace 1) - ; GFX9-MESA: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; GFX9-MESA: [[GEP6:%[0-9]+]]:_(p1) = G_GEP [[GEP5]], [[C]](s64) ; GFX9-MESA: [[LOAD7:%[0-9]+]]:_(s32) = G_LOAD [[GEP6]](p1) :: (load 2, addrspace 1) - ; GFX9-MESA: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) ; GFX9-MESA: [[GEP7:%[0-9]+]]:_(p1) = G_GEP [[GEP5]], [[C1]](s64) ; GFX9-MESA: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p1) :: (load 2, addrspace 1) - ; GFX9-MESA: [[TRUNC8:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD8]](s32) ; GFX9-MESA: [[GEP8:%[0-9]+]]:_(p1) = G_GEP 
[[GEP5]], [[C2]](s64) ; GFX9-MESA: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p1) :: (load 2, addrspace 1) - ; GFX9-MESA: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD9]](s32) ; GFX9-MESA: [[GEP9:%[0-9]+]]:_(p1) = G_GEP [[GEP5]], [[C3]](s64) ; GFX9-MESA: [[LOAD10:%[0-9]+]]:_(s32) = G_LOAD [[GEP9]](p1) :: (load 2, addrspace 1) - ; GFX9-MESA: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; GFX9-MESA: [[GEP10:%[0-9]+]]:_(p1) = G_GEP [[GEP5]], [[C4]](s64) ; GFX9-MESA: [[LOAD11:%[0-9]+]]:_(s32) = G_LOAD [[GEP10]](p1) :: (load 2, addrspace 1) - ; GFX9-MESA: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD11]](s32) - ; GFX9-MESA: [[MV1:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[TRUNC6]](s16), [[TRUNC7]](s16), [[TRUNC8]](s16), [[TRUNC9]](s16), [[TRUNC10]](s16), [[TRUNC11]](s16) - ; GFX9-MESA: [[COPY1:%[0-9]+]]:_(s96) = COPY [[MV]](s96) - ; GFX9-MESA: [[COPY2:%[0-9]+]]:_(s96) = COPY [[MV1]](s96) - ; GFX9-MESA: $vgpr0_vgpr1_vgpr2 = COPY [[COPY1]](s96) - ; GFX9-MESA: $vgpr3_vgpr4_vgpr5 = COPY [[COPY2]](s96) + ; GFX9-MESA: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LOAD6]](s32) + ; GFX9-MESA: [[AND6:%[0-9]+]]:_(s32) = G_AND [[COPY7]], [[C5]] + ; GFX9-MESA: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) + ; GFX9-MESA: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY8]], [[C5]] + ; GFX9-MESA: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C6]](s32) + ; GFX9-MESA: [[OR3:%[0-9]+]]:_(s32) = G_OR [[AND6]], [[SHL3]] + ; GFX9-MESA: [[COPY9:%[0-9]+]]:_(s32) = COPY [[LOAD8]](s32) + ; GFX9-MESA: [[AND8:%[0-9]+]]:_(s32) = G_AND [[COPY9]], [[C5]] + ; GFX9-MESA: [[COPY10:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32) + ; GFX9-MESA: [[AND9:%[0-9]+]]:_(s32) = G_AND [[COPY10]], [[C5]] + ; GFX9-MESA: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[C6]](s32) + ; GFX9-MESA: [[OR4:%[0-9]+]]:_(s32) = G_OR [[AND8]], [[SHL4]] + ; GFX9-MESA: [[COPY11:%[0-9]+]]:_(s32) = COPY [[LOAD10]](s32) + ; GFX9-MESA: [[AND10:%[0-9]+]]:_(s32) = G_AND [[COPY11]], [[C5]] + ; GFX9-MESA: [[COPY12:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32) + ; GFX9-MESA: 
[[AND11:%[0-9]+]]:_(s32) = G_AND [[COPY12]], [[C5]]
+ ; GFX9-MESA: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[C6]](s32)
+ ; GFX9-MESA: [[OR5:%[0-9]+]]:_(s32) = G_OR [[AND10]], [[SHL5]]
+ ; GFX9-MESA: [[MV1:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR3]](s32), [[OR4]](s32), [[OR5]](s32)
+ ; GFX9-MESA: [[COPY13:%[0-9]+]]:_(s96) = COPY [[MV]](s96)
+ ; GFX9-MESA: [[COPY14:%[0-9]+]]:_(s96) = COPY [[MV1]](s96)
+ ; GFX9-MESA: $vgpr0_vgpr1_vgpr2 = COPY [[COPY13]](s96)
+ ; GFX9-MESA: $vgpr3_vgpr4_vgpr5 = COPY [[COPY14]](s96)
  %0:_(p1) = COPY $vgpr0_vgpr1
  %1:_(<2 x s96>) = G_LOAD %0 :: (load 24, align 2, addrspace 1)
  %2:_(s96) = G_EXTRACT %1, 0

From llvm-commits at lists.llvm.org Mon Oct 7 12:05:58 2019
From: llvm-commits at lists.llvm.org (Matt Arsenault via llvm-commits)
Date: Mon, 07 Oct 2019 19:05:58 -0000
Subject: [llvm] r373942 - AMDGPU/GlobalISel: Widen 16-bit G_MERGE_VALUEs sources
Message-ID: <20191007190559.2C81C8C747@lists.llvm.org>

Modified: llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-load-local.mir
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-load-local.mir?rev=373942&r1=373941&r2=373942&view=diff
==============================================================================
--- llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-load-local.mir (original)
+++ llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-load-local.mir Mon Oct 7 12:05:58 2019
@@ -385,53 +385,78 @@ body: |
  ; SI-LABEL: name: test_load_local_s32_align2
  ; SI: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0
  ; SI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 2, addrspace 3)
- ; SI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32)
  ; SI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2
  ; SI: [[GEP:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C]](s32)
  ; SI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p3) :: (load 2, addrspace 3)
- ; SI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32)
- ; SI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16)
- ; SI:
$vgpr0 = COPY [[MV]](s32) + ; SI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; SI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; SI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; SI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; SI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; SI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; SI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; SI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; SI: $vgpr0 = COPY [[OR]](s32) ; CI-LABEL: name: test_load_local_s32_align2 ; CI: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 ; CI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 2, addrspace 3) - ; CI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; CI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; CI: [[GEP:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C]](s32) ; CI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p3) :: (load 2, addrspace 3) - ; CI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; CI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; CI: $vgpr0 = COPY [[MV]](s32) + ; CI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; CI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; CI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; CI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; CI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; CI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; CI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; CI: $vgpr0 = COPY [[OR]](s32) ; CI-DS128-LABEL: name: test_load_local_s32_align2 ; CI-DS128: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 ; CI-DS128: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 2, addrspace 3) - ; CI-DS128: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; CI-DS128: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; CI-DS128: [[GEP:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C]](s32) ; CI-DS128: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p3) :: (load 2, addrspace 3) - ; 
CI-DS128: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; CI-DS128: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; CI-DS128: $vgpr0 = COPY [[MV]](s32) + ; CI-DS128: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; CI-DS128: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; CI-DS128: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; CI-DS128: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; CI-DS128: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; CI-DS128: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-DS128: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; CI-DS128: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; CI-DS128: $vgpr0 = COPY [[OR]](s32) ; VI-LABEL: name: test_load_local_s32_align2 ; VI: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 2, addrspace 3) - ; VI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; VI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; VI: [[GEP:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C]](s32) ; VI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p3) :: (load 2, addrspace 3) - ; VI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; VI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; VI: $vgpr0 = COPY [[MV]](s32) + ; VI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; VI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; VI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; VI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; VI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; VI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; VI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; VI: $vgpr0 = COPY [[OR]](s32) ; GFX9-LABEL: name: test_load_local_s32_align2 ; GFX9: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 ; GFX9: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 2, addrspace 3) - ; GFX9: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; 
GFX9: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; GFX9: [[GEP:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C]](s32) ; GFX9: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p3) :: (load 2, addrspace 3) - ; GFX9: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; GFX9: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; GFX9: $vgpr0 = COPY [[MV]](s32) + ; GFX9: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; GFX9: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; GFX9: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; GFX9: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; GFX9: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; GFX9: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; GFX9: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; GFX9: $vgpr0 = COPY [[OR]](s32) %0:_(p3) = COPY $vgpr0 %1:_(s32) = G_LOAD %0 :: (load 4, align 2, addrspace 3) $vgpr0 = COPY %1 @@ -473,8 +498,12 @@ body: | ; SI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) ; SI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[SHL1]](s32) ; SI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[TRUNC3]] - ; SI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; SI: $vgpr0 = COPY [[MV]](s32) + ; SI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; SI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; SI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; SI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C6]](s32) + ; SI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; SI: $vgpr0 = COPY [[OR2]](s32) ; CI-LABEL: name: test_load_local_s32_align1 ; CI: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 ; CI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 1, addrspace 3) @@ -505,8 +534,12 @@ body: | ; CI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) ; CI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[SHL1]](s32) ; CI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[TRUNC3]] - ; CI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES 
[[OR]](s16), [[OR1]](s16) - ; CI: $vgpr0 = COPY [[MV]](s32) + ; CI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C6]](s32) + ; CI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; CI: $vgpr0 = COPY [[OR2]](s32) ; CI-DS128-LABEL: name: test_load_local_s32_align1 ; CI-DS128: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 ; CI-DS128: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 1, addrspace 3) @@ -537,8 +570,12 @@ body: | ; CI-DS128: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) ; CI-DS128: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[SHL1]](s32) ; CI-DS128: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[TRUNC3]] - ; CI-DS128: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; CI-DS128: $vgpr0 = COPY [[MV]](s32) + ; CI-DS128: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI-DS128: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI-DS128: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-DS128: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C6]](s32) + ; CI-DS128: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; CI-DS128: $vgpr0 = COPY [[OR2]](s32) ; VI-LABEL: name: test_load_local_s32_align1 ; VI: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 1, addrspace 3) @@ -565,8 +602,12 @@ body: | ; VI: [[AND3:%[0-9]+]]:_(s16) = G_AND [[TRUNC3]], [[C3]] ; VI: [[SHL1:%[0-9]+]]:_(s16) = G_SHL [[AND3]], [[C4]](s16) ; VI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[SHL1]] - ; VI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; VI: $vgpr0 = COPY [[MV]](s32) + ; VI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; VI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; VI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C5]](s32) + ; VI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; 
VI: $vgpr0 = COPY [[OR2]](s32) ; GFX9-LABEL: name: test_load_local_s32_align1 ; GFX9: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 ; GFX9: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 1, addrspace 3) @@ -593,8 +634,12 @@ body: | ; GFX9: [[AND3:%[0-9]+]]:_(s16) = G_AND [[TRUNC3]], [[C3]] ; GFX9: [[SHL1:%[0-9]+]]:_(s16) = G_SHL [[AND3]], [[C4]](s16) ; GFX9: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[SHL1]] - ; GFX9: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; GFX9: $vgpr0 = COPY [[MV]](s32) + ; GFX9: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C5]](s32) + ; GFX9: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; GFX9: $vgpr0 = COPY [[OR2]](s32) %0:_(p3) = COPY $vgpr0 %1:_(s32) = G_LOAD %0 :: (load 4, align 1, addrspace 3) $vgpr0 = COPY %1 @@ -708,92 +753,142 @@ body: | ; SI-LABEL: name: test_load_local_s64_align2 ; SI: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 ; SI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 2, addrspace 3) - ; SI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; SI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; SI: [[GEP:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C]](s32) ; SI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p3) :: (load 2, addrspace 3) - ; SI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; SI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 ; SI: [[GEP1:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C1]](s32) ; SI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p3) :: (load 2, addrspace 3) - ; SI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; SI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 6 ; SI: [[GEP2:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C2]](s32) ; SI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p3) :: (load 2, addrspace 3) - ; SI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; SI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), 
[[TRUNC2]](s16), [[TRUNC3]](s16) + ; SI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; SI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; SI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C3]] + ; SI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; SI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C3]] + ; SI: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; SI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C4]](s32) + ; SI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; SI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; SI: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C3]] + ; SI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; SI: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C3]] + ; SI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) + ; SI: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; SI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32) ; SI: $vgpr0_vgpr1 = COPY [[MV]](s64) ; CI-LABEL: name: test_load_local_s64_align2 ; CI: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 ; CI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 2, addrspace 3) - ; CI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; CI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; CI: [[GEP:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C]](s32) ; CI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p3) :: (load 2, addrspace 3) - ; CI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; CI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 ; CI: [[GEP1:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C1]](s32) ; CI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p3) :: (load 2, addrspace 3) - ; CI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; CI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 6 ; CI: [[GEP2:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C2]](s32) ; CI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p3) :: (load 2, addrspace 3) - ; CI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; CI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) + ; CI: 
[[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; CI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; CI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C3]] + ; CI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; CI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C3]] + ; CI: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C4]](s32) + ; CI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; CI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; CI: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C3]] + ; CI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; CI: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C3]] + ; CI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) + ; CI: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; CI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32) ; CI: $vgpr0_vgpr1 = COPY [[MV]](s64) ; CI-DS128-LABEL: name: test_load_local_s64_align2 ; CI-DS128: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 ; CI-DS128: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 2, addrspace 3) - ; CI-DS128: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; CI-DS128: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; CI-DS128: [[GEP:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C]](s32) ; CI-DS128: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p3) :: (load 2, addrspace 3) - ; CI-DS128: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; CI-DS128: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 ; CI-DS128: [[GEP1:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C1]](s32) ; CI-DS128: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p3) :: (load 2, addrspace 3) - ; CI-DS128: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; CI-DS128: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 6 ; CI-DS128: [[GEP2:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C2]](s32) ; CI-DS128: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p3) :: (load 2, addrspace 3) - ; CI-DS128: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; CI-DS128: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES 
[[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) + ; CI-DS128: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; CI-DS128: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; CI-DS128: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C3]] + ; CI-DS128: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; CI-DS128: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C3]] + ; CI-DS128: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-DS128: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C4]](s32) + ; CI-DS128: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; CI-DS128: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; CI-DS128: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C3]] + ; CI-DS128: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; CI-DS128: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C3]] + ; CI-DS128: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) + ; CI-DS128: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; CI-DS128: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32) ; CI-DS128: $vgpr0_vgpr1 = COPY [[MV]](s64) ; VI-LABEL: name: test_load_local_s64_align2 ; VI: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 2, addrspace 3) - ; VI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; VI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; VI: [[GEP:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C]](s32) ; VI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p3) :: (load 2, addrspace 3) - ; VI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; VI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 ; VI: [[GEP1:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C1]](s32) ; VI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p3) :: (load 2, addrspace 3) - ; VI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; VI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 6 ; VI: [[GEP2:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C2]](s32) ; VI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p3) :: (load 2, addrspace 3) - ; VI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC 
[[LOAD3]](s32) - ; VI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) + ; VI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; VI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; VI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C3]] + ; VI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; VI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C3]] + ; VI: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C4]](s32) + ; VI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; VI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; VI: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C3]] + ; VI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; VI: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C3]] + ; VI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) + ; VI: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; VI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32) ; VI: $vgpr0_vgpr1 = COPY [[MV]](s64) ; GFX9-LABEL: name: test_load_local_s64_align2 ; GFX9: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 ; GFX9: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 2, addrspace 3) - ; GFX9: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; GFX9: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; GFX9: [[GEP:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C]](s32) ; GFX9: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p3) :: (load 2, addrspace 3) - ; GFX9: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; GFX9: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 ; GFX9: [[GEP1:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C1]](s32) ; GFX9: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p3) :: (load 2, addrspace 3) - ; GFX9: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; GFX9: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 6 ; GFX9: [[GEP2:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C2]](s32) ; GFX9: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p3) :: (load 2, addrspace 3) - ; GFX9: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC 
[[LOAD3]](s32) - ; GFX9: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) + ; GFX9: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; GFX9: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; GFX9: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C3]] + ; GFX9: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; GFX9: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C3]] + ; GFX9: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C4]](s32) + ; GFX9: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; GFX9: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; GFX9: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C3]] + ; GFX9: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; GFX9: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C3]] + ; GFX9: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) + ; GFX9: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; GFX9: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32) ; GFX9: $vgpr0_vgpr1 = COPY [[MV]](s64) %0:_(p3) = COPY $vgpr0 %1:_(s64) = G_LOAD %0 :: (load 8, align 2, addrspace 3) @@ -864,7 +959,16 @@ body: | ; SI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C8]](s32) ; SI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) ; SI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; SI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) + ; SI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; SI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; SI: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; SI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C10]](s32) + ; SI: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; SI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; SI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; SI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C10]](s32) + ; SI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; SI: [[MV:%[0-9]+]]:_(s64) = 
G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) ; SI: $vgpr0_vgpr1 = COPY [[MV]](s64) ; CI-LABEL: name: test_load_local_s64_align1 ; CI: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 @@ -924,7 +1028,16 @@ body: | ; CI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C8]](s32) ; CI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) ; CI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; CI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) + ; CI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C10]](s32) + ; CI: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; CI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; CI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C10]](s32) + ; CI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; CI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) ; CI: $vgpr0_vgpr1 = COPY [[MV]](s64) ; CI-DS128-LABEL: name: test_load_local_s64_align1 ; CI-DS128: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 @@ -984,7 +1097,16 @@ body: | ; CI-DS128: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C8]](s32) ; CI-DS128: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) ; CI-DS128: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; CI-DS128: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) + ; CI-DS128: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI-DS128: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI-DS128: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-DS128: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C10]](s32) + ; CI-DS128: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; CI-DS128: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; CI-DS128: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI-DS128: [[SHL5:%[0-9]+]]:_(s32) = 
G_SHL [[ZEXT3]], [[C10]](s32) + ; CI-DS128: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; CI-DS128: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) ; CI-DS128: $vgpr0_vgpr1 = COPY [[MV]](s64) ; VI-LABEL: name: test_load_local_s64_align1 ; VI: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 @@ -1036,7 +1158,16 @@ body: | ; VI: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C7]] ; VI: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C8]](s16) ; VI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; VI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) + ; VI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; VI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; VI: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C9]](s32) + ; VI: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; VI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; VI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; VI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C9]](s32) + ; VI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; VI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) ; VI: $vgpr0_vgpr1 = COPY [[MV]](s64) ; GFX9-LABEL: name: test_load_local_s64_align1 ; GFX9: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 @@ -1088,7 +1219,16 @@ body: | ; GFX9: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C7]] ; GFX9: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C8]](s16) ; GFX9: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; GFX9: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) + ; GFX9: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C9]](s32) + ; GFX9: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; GFX9: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; GFX9: 
[[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; GFX9: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C9]](s32) + ; GFX9: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; GFX9: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) ; GFX9: $vgpr0_vgpr1 = COPY [[MV]](s64) %0:_(p3) = COPY $vgpr0 %1:_(s64) = G_LOAD %0 :: (load 8, align 1, addrspace 3) @@ -1160,7 +1300,16 @@ body: | ; SI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY7]](s32) ; SI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) ; SI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; SI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) + ; SI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; SI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; SI: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; SI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C10]](s32) + ; SI: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; SI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; SI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; SI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C10]](s32) + ; SI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; SI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) ; SI: [[GEP7:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C8]](s32) ; SI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p3) :: (load 1, addrspace 3) ; SI: [[GEP8:%[0-9]+]]:_(p3) = G_GEP [[GEP7]], [[C]](s32) @@ -1174,21 +1323,24 @@ body: | ; SI: [[COPY9:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; SI: [[COPY10:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32) ; SI: [[AND9:%[0-9]+]]:_(s32) = G_AND [[COPY10]], [[C9]] - ; SI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY9]](s32) - ; SI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) - ; SI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] + ; SI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY9]](s32) + ; SI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) + ; SI: [[OR6:%[0-9]+]]:_(s16) = G_OR 
[[AND8]], [[TRUNC9]] ; SI: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; SI: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C7]] ; SI: [[COPY11:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; SI: [[COPY12:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32) ; SI: [[AND11:%[0-9]+]]:_(s32) = G_AND [[COPY12]], [[C9]] - ; SI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY11]](s32) - ; SI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL5]](s32) - ; SI: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] - ; SI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16) + ; SI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY11]](s32) + ; SI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) + ; SI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] + ; SI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; SI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; SI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C10]](s32) + ; SI: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL8]] ; SI: [[DEF:%[0-9]+]]:_(s96) = G_IMPLICIT_DEF ; SI: [[INSERT:%[0-9]+]]:_(s96) = G_INSERT [[DEF]], [[MV]](s64), 0 - ; SI: [[INSERT1:%[0-9]+]]:_(s96) = G_INSERT [[INSERT]], [[MV1]](s32), 64 + ; SI: [[INSERT1:%[0-9]+]]:_(s96) = G_INSERT [[INSERT]], [[OR8]](s32), 64 ; SI: $vgpr0_vgpr1_vgpr2 = COPY [[INSERT1]](s96) ; CI-LABEL: name: test_load_local_s96_align16 ; CI: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 @@ -1249,7 +1401,16 @@ body: | ; CI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY7]](s32) ; CI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) ; CI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; CI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) + ; CI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C10]](s32) + ; CI: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; CI: 
[[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; CI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C10]](s32) + ; CI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; CI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) ; CI: [[GEP7:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C8]](s32) ; CI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p3) :: (load 1, addrspace 3) ; CI: [[GEP8:%[0-9]+]]:_(p3) = G_GEP [[GEP7]], [[C]](s32) @@ -1263,21 +1424,24 @@ body: | ; CI: [[COPY9:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI: [[COPY10:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32) ; CI: [[AND9:%[0-9]+]]:_(s32) = G_AND [[COPY10]], [[C9]] - ; CI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY9]](s32) - ; CI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) - ; CI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] + ; CI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY9]](s32) + ; CI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) + ; CI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] ; CI: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; CI: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C7]] ; CI: [[COPY11:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI: [[COPY12:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32) ; CI: [[AND11:%[0-9]+]]:_(s32) = G_AND [[COPY12]], [[C9]] - ; CI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY11]](s32) - ; CI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL5]](s32) - ; CI: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] - ; CI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16) + ; CI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY11]](s32) + ; CI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) + ; CI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] + ; CI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; CI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; CI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C10]](s32) + ; CI: [[OR8:%[0-9]+]]:_(s32) = G_OR 
[[ZEXT4]], [[SHL8]] ; CI: [[DEF:%[0-9]+]]:_(s96) = G_IMPLICIT_DEF ; CI: [[INSERT:%[0-9]+]]:_(s96) = G_INSERT [[DEF]], [[MV]](s64), 0 - ; CI: [[INSERT1:%[0-9]+]]:_(s96) = G_INSERT [[INSERT]], [[MV1]](s32), 64 + ; CI: [[INSERT1:%[0-9]+]]:_(s96) = G_INSERT [[INSERT]], [[OR8]](s32), 64 ; CI: $vgpr0_vgpr1_vgpr2 = COPY [[INSERT1]](s96) ; CI-DS128-LABEL: name: test_load_local_s96_align16 ; CI-DS128: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 @@ -1365,7 +1529,20 @@ body: | ; CI-DS128: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY11]](s32) ; CI-DS128: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL5]](s32) ; CI-DS128: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] - ; CI-DS128: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16), [[OR4]](s16), [[OR5]](s16) + ; CI-DS128: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI-DS128: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI-DS128: [[C13:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-DS128: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C13]](s32) + ; CI-DS128: [[OR6:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL6]] + ; CI-DS128: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; CI-DS128: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI-DS128: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C13]](s32) + ; CI-DS128: [[OR7:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL7]] + ; CI-DS128: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; CI-DS128: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR5]](s16) + ; CI-DS128: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C13]](s32) + ; CI-DS128: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL8]] + ; CI-DS128: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR6]](s32), [[OR7]](s32), [[OR8]](s32) ; CI-DS128: $vgpr0_vgpr1_vgpr2 = COPY [[MV]](s96) ; VI-LABEL: name: test_load_local_s96_align16 ; VI: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 @@ -1417,9 +1594,18 @@ body: | ; VI: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C7]] ; VI: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], 
[[C8]](s16) ; VI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; VI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) - ; VI: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 - ; VI: [[GEP7:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C9]](s32) + ; VI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; VI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; VI: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C9]](s32) + ; VI: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; VI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; VI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; VI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C9]](s32) + ; VI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; VI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) + ; VI: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 + ; VI: [[GEP7:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C10]](s32) ; VI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p3) :: (load 1, addrspace 3) ; VI: [[GEP8:%[0-9]+]]:_(p3) = G_GEP [[GEP7]], [[C]](s32) ; VI: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p3) :: (load 1, addrspace 3) @@ -1431,18 +1617,21 @@ body: | ; VI: [[AND8:%[0-9]+]]:_(s16) = G_AND [[TRUNC8]], [[C7]] ; VI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD9]](s32) ; VI: [[AND9:%[0-9]+]]:_(s16) = G_AND [[TRUNC9]], [[C7]] - ; VI: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) - ; VI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL4]] + ; VI: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) + ; VI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL6]] ; VI: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; VI: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C7]] ; VI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD11]](s32) ; VI: [[AND11:%[0-9]+]]:_(s16) = G_AND [[TRUNC11]], [[C7]] - ; VI: [[SHL5:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) - ; VI: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL5]] - ; VI: 
[[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16) + ; VI: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) + ; VI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL7]] + ; VI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; VI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; VI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C9]](s32) + ; VI: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL8]] ; VI: [[DEF:%[0-9]+]]:_(s96) = G_IMPLICIT_DEF ; VI: [[INSERT:%[0-9]+]]:_(s96) = G_INSERT [[DEF]], [[MV]](s64), 0 - ; VI: [[INSERT1:%[0-9]+]]:_(s96) = G_INSERT [[INSERT]], [[MV1]](s32), 64 + ; VI: [[INSERT1:%[0-9]+]]:_(s96) = G_INSERT [[INSERT]], [[OR8]](s32), 64 ; VI: $vgpr0_vgpr1_vgpr2 = COPY [[INSERT1]](s96) ; GFX9-LABEL: name: test_load_local_s96_align16 ; GFX9: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 @@ -1494,9 +1683,18 @@ body: | ; GFX9: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C7]] ; GFX9: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C8]](s16) ; GFX9: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; GFX9: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) - ; GFX9: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 - ; GFX9: [[GEP7:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C9]](s32) + ; GFX9: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C9]](s32) + ; GFX9: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; GFX9: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; GFX9: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; GFX9: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C9]](s32) + ; GFX9: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; GFX9: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) + ; GFX9: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 + ; GFX9: [[GEP7:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C10]](s32) ; GFX9: 
[[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p3) :: (load 1, addrspace 3) ; GFX9: [[GEP8:%[0-9]+]]:_(p3) = G_GEP [[GEP7]], [[C]](s32) ; GFX9: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p3) :: (load 1, addrspace 3) @@ -1508,18 +1706,21 @@ body: | ; GFX9: [[AND8:%[0-9]+]]:_(s16) = G_AND [[TRUNC8]], [[C7]] ; GFX9: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD9]](s32) ; GFX9: [[AND9:%[0-9]+]]:_(s16) = G_AND [[TRUNC9]], [[C7]] - ; GFX9: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) - ; GFX9: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL4]] + ; GFX9: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) + ; GFX9: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL6]] ; GFX9: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; GFX9: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C7]] ; GFX9: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD11]](s32) ; GFX9: [[AND11:%[0-9]+]]:_(s16) = G_AND [[TRUNC11]], [[C7]] - ; GFX9: [[SHL5:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) - ; GFX9: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL5]] - ; GFX9: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16) + ; GFX9: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) + ; GFX9: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL7]] + ; GFX9: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; GFX9: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; GFX9: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C9]](s32) + ; GFX9: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL8]] ; GFX9: [[DEF:%[0-9]+]]:_(s96) = G_IMPLICIT_DEF ; GFX9: [[INSERT:%[0-9]+]]:_(s96) = G_INSERT [[DEF]], [[MV]](s64), 0 - ; GFX9: [[INSERT1:%[0-9]+]]:_(s96) = G_INSERT [[INSERT]], [[MV1]](s32), 64 + ; GFX9: [[INSERT1:%[0-9]+]]:_(s96) = G_INSERT [[INSERT]], [[OR8]](s32), 64 ; GFX9: $vgpr0_vgpr1_vgpr2 = COPY [[INSERT1]](s96) %0:_(p3) = COPY $vgpr0 %1:_(s96) = G_LOAD %0 :: (load 12, align 1, addrspace 3) @@ -1645,144 +1846,210 @@ body: | ; SI-LABEL: name: test_load_local_s96_align2 ; SI: [[COPY:%[0-9]+]]:_(p3) = COPY 
$vgpr0 ; SI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 2, addrspace 3) - ; SI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; SI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; SI: [[GEP:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C]](s32) ; SI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p3) :: (load 2, addrspace 3) - ; SI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; SI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 ; SI: [[GEP1:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C1]](s32) ; SI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p3) :: (load 2, addrspace 3) - ; SI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; SI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 6 ; SI: [[GEP2:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C2]](s32) ; SI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p3) :: (load 2, addrspace 3) - ; SI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; SI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) - ; SI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 - ; SI: [[GEP3:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C3]](s32) + ; SI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; SI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; SI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C3]] + ; SI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; SI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C3]] + ; SI: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; SI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C4]](s32) + ; SI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; SI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; SI: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C3]] + ; SI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; SI: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C3]] + ; SI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) + ; SI: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; SI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32) + ; SI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 + 
; SI: [[GEP3:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C5]](s32) ; SI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p3) :: (load 2, addrspace 3) - ; SI: [[TRUNC4:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD4]](s32) ; SI: [[GEP4:%[0-9]+]]:_(p3) = G_GEP [[GEP3]], [[C]](s32) ; SI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p3) :: (load 2, addrspace 3) - ; SI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) - ; SI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC4]](s16), [[TRUNC5]](s16) + ; SI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD4]](s32) + ; SI: [[AND4:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C3]] + ; SI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) + ; SI: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C3]] + ; SI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[C4]](s32) + ; SI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[AND4]], [[SHL2]] ; SI: [[DEF:%[0-9]+]]:_(s96) = G_IMPLICIT_DEF ; SI: [[INSERT:%[0-9]+]]:_(s96) = G_INSERT [[DEF]], [[MV]](s64), 0 - ; SI: [[INSERT1:%[0-9]+]]:_(s96) = G_INSERT [[INSERT]], [[MV1]](s32), 64 + ; SI: [[INSERT1:%[0-9]+]]:_(s96) = G_INSERT [[INSERT]], [[OR2]](s32), 64 ; SI: $vgpr0_vgpr1_vgpr2 = COPY [[INSERT1]](s96) ; CI-LABEL: name: test_load_local_s96_align2 ; CI: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 ; CI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 2, addrspace 3) - ; CI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; CI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; CI: [[GEP:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C]](s32) ; CI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p3) :: (load 2, addrspace 3) - ; CI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; CI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 ; CI: [[GEP1:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C1]](s32) ; CI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p3) :: (load 2, addrspace 3) - ; CI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; CI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 6 ; CI: [[GEP2:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C2]](s32) ; CI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p3) :: 
(load 2, addrspace 3) - ; CI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; CI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) - ; CI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 - ; CI: [[GEP3:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C3]](s32) + ; CI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; CI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; CI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C3]] + ; CI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; CI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C3]] + ; CI: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C4]](s32) + ; CI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; CI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; CI: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C3]] + ; CI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; CI: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C3]] + ; CI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) + ; CI: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; CI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32) + ; CI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 + ; CI: [[GEP3:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C5]](s32) ; CI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p3) :: (load 2, addrspace 3) - ; CI: [[TRUNC4:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD4]](s32) ; CI: [[GEP4:%[0-9]+]]:_(p3) = G_GEP [[GEP3]], [[C]](s32) ; CI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p3) :: (load 2, addrspace 3) - ; CI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) - ; CI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC4]](s16), [[TRUNC5]](s16) + ; CI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD4]](s32) + ; CI: [[AND4:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C3]] + ; CI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) + ; CI: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C3]] + ; CI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[C4]](s32) + ; CI: 
[[OR2:%[0-9]+]]:_(s32) = G_OR [[AND4]], [[SHL2]] ; CI: [[DEF:%[0-9]+]]:_(s96) = G_IMPLICIT_DEF ; CI: [[INSERT:%[0-9]+]]:_(s96) = G_INSERT [[DEF]], [[MV]](s64), 0 - ; CI: [[INSERT1:%[0-9]+]]:_(s96) = G_INSERT [[INSERT]], [[MV1]](s32), 64 + ; CI: [[INSERT1:%[0-9]+]]:_(s96) = G_INSERT [[INSERT]], [[OR2]](s32), 64 ; CI: $vgpr0_vgpr1_vgpr2 = COPY [[INSERT1]](s96) ; CI-DS128-LABEL: name: test_load_local_s96_align2 ; CI-DS128: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 ; CI-DS128: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 2, addrspace 3) - ; CI-DS128: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; CI-DS128: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; CI-DS128: [[GEP:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C]](s32) ; CI-DS128: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p3) :: (load 2, addrspace 3) - ; CI-DS128: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; CI-DS128: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 ; CI-DS128: [[GEP1:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C1]](s32) ; CI-DS128: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p3) :: (load 2, addrspace 3) - ; CI-DS128: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; CI-DS128: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 6 ; CI-DS128: [[GEP2:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C2]](s32) ; CI-DS128: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p3) :: (load 2, addrspace 3) - ; CI-DS128: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) ; CI-DS128: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 ; CI-DS128: [[GEP3:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C3]](s32) ; CI-DS128: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p3) :: (load 2, addrspace 3) - ; CI-DS128: [[TRUNC4:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD4]](s32) ; CI-DS128: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 10 ; CI-DS128: [[GEP4:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C4]](s32) ; CI-DS128: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p3) :: (load 2, addrspace 3) - ; CI-DS128: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) - ; CI-DS128: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES 
[[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16), [[TRUNC4]](s16), [[TRUNC5]](s16) + ; CI-DS128: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; CI-DS128: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; CI-DS128: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C5]] + ; CI-DS128: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; CI-DS128: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C5]] + ; CI-DS128: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-DS128: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C6]](s32) + ; CI-DS128: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; CI-DS128: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; CI-DS128: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C5]] + ; CI-DS128: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; CI-DS128: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C5]] + ; CI-DS128: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C6]](s32) + ; CI-DS128: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; CI-DS128: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD4]](s32) + ; CI-DS128: [[AND4:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C5]] + ; CI-DS128: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) + ; CI-DS128: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C5]] + ; CI-DS128: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[C6]](s32) + ; CI-DS128: [[OR2:%[0-9]+]]:_(s32) = G_OR [[AND4]], [[SHL2]] + ; CI-DS128: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32), [[OR2]](s32) ; CI-DS128: $vgpr0_vgpr1_vgpr2 = COPY [[MV]](s96) ; VI-LABEL: name: test_load_local_s96_align2 ; VI: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 2, addrspace 3) - ; VI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; VI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; VI: [[GEP:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C]](s32) ; VI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p3) :: (load 2, addrspace 3) - ; VI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; VI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT 
i32 4 ; VI: [[GEP1:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C1]](s32) ; VI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p3) :: (load 2, addrspace 3) - ; VI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; VI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 6 ; VI: [[GEP2:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C2]](s32) ; VI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p3) :: (load 2, addrspace 3) - ; VI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; VI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) - ; VI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 - ; VI: [[GEP3:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C3]](s32) + ; VI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; VI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; VI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C3]] + ; VI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; VI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C3]] + ; VI: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C4]](s32) + ; VI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; VI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; VI: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C3]] + ; VI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; VI: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C3]] + ; VI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) + ; VI: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; VI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32) + ; VI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 + ; VI: [[GEP3:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C5]](s32) ; VI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p3) :: (load 2, addrspace 3) - ; VI: [[TRUNC4:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD4]](s32) ; VI: [[GEP4:%[0-9]+]]:_(p3) = G_GEP [[GEP3]], [[C]](s32) ; VI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p3) :: (load 2, addrspace 3) - ; VI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) - ; VI: 
[[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC4]](s16), [[TRUNC5]](s16) + ; VI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD4]](s32) + ; VI: [[AND4:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C3]] + ; VI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) + ; VI: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C3]] + ; VI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[C4]](s32) + ; VI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[AND4]], [[SHL2]] ; VI: [[DEF:%[0-9]+]]:_(s96) = G_IMPLICIT_DEF ; VI: [[INSERT:%[0-9]+]]:_(s96) = G_INSERT [[DEF]], [[MV]](s64), 0 - ; VI: [[INSERT1:%[0-9]+]]:_(s96) = G_INSERT [[INSERT]], [[MV1]](s32), 64 + ; VI: [[INSERT1:%[0-9]+]]:_(s96) = G_INSERT [[INSERT]], [[OR2]](s32), 64 ; VI: $vgpr0_vgpr1_vgpr2 = COPY [[INSERT1]](s96) ; GFX9-LABEL: name: test_load_local_s96_align2 ; GFX9: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 ; GFX9: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 2, addrspace 3) - ; GFX9: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; GFX9: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; GFX9: [[GEP:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C]](s32) ; GFX9: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p3) :: (load 2, addrspace 3) - ; GFX9: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; GFX9: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 ; GFX9: [[GEP1:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C1]](s32) ; GFX9: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p3) :: (load 2, addrspace 3) - ; GFX9: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; GFX9: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 6 ; GFX9: [[GEP2:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C2]](s32) ; GFX9: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p3) :: (load 2, addrspace 3) - ; GFX9: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; GFX9: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) - ; GFX9: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 - ; GFX9: [[GEP3:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C3]](s32) + ; GFX9: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; 
GFX9: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; GFX9: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C3]] + ; GFX9: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; GFX9: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C3]] + ; GFX9: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C4]](s32) + ; GFX9: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; GFX9: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; GFX9: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C3]] + ; GFX9: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; GFX9: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C3]] + ; GFX9: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) + ; GFX9: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; GFX9: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32) + ; GFX9: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 + ; GFX9: [[GEP3:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C5]](s32) ; GFX9: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p3) :: (load 2, addrspace 3) - ; GFX9: [[TRUNC4:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD4]](s32) ; GFX9: [[GEP4:%[0-9]+]]:_(p3) = G_GEP [[GEP3]], [[C]](s32) ; GFX9: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p3) :: (load 2, addrspace 3) - ; GFX9: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) - ; GFX9: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC4]](s16), [[TRUNC5]](s16) + ; GFX9: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD4]](s32) + ; GFX9: [[AND4:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C3]] + ; GFX9: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) + ; GFX9: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C3]] + ; GFX9: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[C4]](s32) + ; GFX9: [[OR2:%[0-9]+]]:_(s32) = G_OR [[AND4]], [[SHL2]] ; GFX9: [[DEF:%[0-9]+]]:_(s96) = G_IMPLICIT_DEF ; GFX9: [[INSERT:%[0-9]+]]:_(s96) = G_INSERT [[DEF]], [[MV]](s64), 0 - ; GFX9: [[INSERT1:%[0-9]+]]:_(s96) = G_INSERT [[INSERT]], [[MV1]](s32), 64 + ; GFX9: [[INSERT1:%[0-9]+]]:_(s96) = G_INSERT [[INSERT]], [[OR2]](s32), 64 ; 
GFX9: $vgpr0_vgpr1_vgpr2 = COPY [[INSERT1]](s96) %0:_(p3) = COPY $vgpr0 %1:_(s96) = G_LOAD %0 :: (load 12, align 2, addrspace 3) @@ -1854,7 +2121,16 @@ body: | ; SI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY7]](s32) ; SI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) ; SI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; SI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) + ; SI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; SI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; SI: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; SI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C10]](s32) + ; SI: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; SI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; SI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; SI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C10]](s32) + ; SI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; SI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) ; SI: [[GEP7:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C8]](s32) ; SI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p3) :: (load 1, addrspace 3) ; SI: [[GEP8:%[0-9]+]]:_(p3) = G_GEP [[GEP7]], [[C]](s32) @@ -1868,21 +2144,24 @@ body: | ; SI: [[COPY9:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; SI: [[COPY10:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32) ; SI: [[AND9:%[0-9]+]]:_(s32) = G_AND [[COPY10]], [[C9]] - ; SI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY9]](s32) - ; SI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) - ; SI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] + ; SI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY9]](s32) + ; SI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) + ; SI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] ; SI: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; SI: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C7]] ; SI: [[COPY11:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; SI: [[COPY12:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32) 
; SI: [[AND11:%[0-9]+]]:_(s32) = G_AND [[COPY12]], [[C9]] - ; SI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY11]](s32) - ; SI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL5]](s32) - ; SI: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] - ; SI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16) + ; SI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY11]](s32) + ; SI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) + ; SI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] + ; SI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; SI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; SI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C10]](s32) + ; SI: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL8]] ; SI: [[DEF:%[0-9]+]]:_(s96) = G_IMPLICIT_DEF ; SI: [[INSERT:%[0-9]+]]:_(s96) = G_INSERT [[DEF]], [[MV]](s64), 0 - ; SI: [[INSERT1:%[0-9]+]]:_(s96) = G_INSERT [[INSERT]], [[MV1]](s32), 64 + ; SI: [[INSERT1:%[0-9]+]]:_(s96) = G_INSERT [[INSERT]], [[OR8]](s32), 64 ; SI: $vgpr0_vgpr1_vgpr2 = COPY [[INSERT1]](s96) ; CI-LABEL: name: test_load_local_s96_align1 ; CI: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 @@ -1943,7 +2222,16 @@ body: | ; CI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY7]](s32) ; CI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) ; CI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; CI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) + ; CI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C10]](s32) + ; CI: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; CI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; CI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C10]](s32) + ; CI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; CI: [[MV:%[0-9]+]]:_(s64) = 
G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) ; CI: [[GEP7:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C8]](s32) ; CI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p3) :: (load 1, addrspace 3) ; CI: [[GEP8:%[0-9]+]]:_(p3) = G_GEP [[GEP7]], [[C]](s32) @@ -1957,21 +2245,24 @@ body: | ; CI: [[COPY9:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI: [[COPY10:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32) ; CI: [[AND9:%[0-9]+]]:_(s32) = G_AND [[COPY10]], [[C9]] - ; CI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY9]](s32) - ; CI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) - ; CI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] + ; CI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY9]](s32) + ; CI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) + ; CI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] ; CI: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; CI: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C7]] ; CI: [[COPY11:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI: [[COPY12:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32) ; CI: [[AND11:%[0-9]+]]:_(s32) = G_AND [[COPY12]], [[C9]] - ; CI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY11]](s32) - ; CI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL5]](s32) - ; CI: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] - ; CI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16) + ; CI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY11]](s32) + ; CI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) + ; CI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] + ; CI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; CI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; CI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C10]](s32) + ; CI: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL8]] ; CI: [[DEF:%[0-9]+]]:_(s96) = G_IMPLICIT_DEF ; CI: [[INSERT:%[0-9]+]]:_(s96) = G_INSERT [[DEF]], [[MV]](s64), 0 - ; CI: [[INSERT1:%[0-9]+]]:_(s96) = G_INSERT [[INSERT]], [[MV1]](s32), 64 + ; CI: [[INSERT1:%[0-9]+]]:_(s96) = G_INSERT 
[[INSERT]], [[OR8]](s32), 64 ; CI: $vgpr0_vgpr1_vgpr2 = COPY [[INSERT1]](s96) ; CI-DS128-LABEL: name: test_load_local_s96_align1 ; CI-DS128: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 @@ -2059,7 +2350,20 @@ body: | ; CI-DS128: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY11]](s32) ; CI-DS128: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL5]](s32) ; CI-DS128: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] - ; CI-DS128: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16), [[OR4]](s16), [[OR5]](s16) + ; CI-DS128: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI-DS128: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI-DS128: [[C13:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-DS128: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C13]](s32) + ; CI-DS128: [[OR6:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL6]] + ; CI-DS128: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; CI-DS128: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI-DS128: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C13]](s32) + ; CI-DS128: [[OR7:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL7]] + ; CI-DS128: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; CI-DS128: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR5]](s16) + ; CI-DS128: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C13]](s32) + ; CI-DS128: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL8]] + ; CI-DS128: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR6]](s32), [[OR7]](s32), [[OR8]](s32) ; CI-DS128: $vgpr0_vgpr1_vgpr2 = COPY [[MV]](s96) ; VI-LABEL: name: test_load_local_s96_align1 ; VI: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 @@ -2111,9 +2415,18 @@ body: | ; VI: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C7]] ; VI: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C8]](s16) ; VI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; VI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) - ; VI: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 - ; VI: [[GEP7:%[0-9]+]]:_(p3) = G_GEP [[COPY]], 
[[C9]](s32) + ; VI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; VI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; VI: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C9]](s32) + ; VI: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; VI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; VI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; VI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C9]](s32) + ; VI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; VI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) + ; VI: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 + ; VI: [[GEP7:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C10]](s32) ; VI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p3) :: (load 1, addrspace 3) ; VI: [[GEP8:%[0-9]+]]:_(p3) = G_GEP [[GEP7]], [[C]](s32) ; VI: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p3) :: (load 1, addrspace 3) @@ -2125,18 +2438,21 @@ body: | ; VI: [[AND8:%[0-9]+]]:_(s16) = G_AND [[TRUNC8]], [[C7]] ; VI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD9]](s32) ; VI: [[AND9:%[0-9]+]]:_(s16) = G_AND [[TRUNC9]], [[C7]] - ; VI: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) - ; VI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL4]] + ; VI: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) + ; VI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL6]] ; VI: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; VI: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C7]] ; VI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD11]](s32) ; VI: [[AND11:%[0-9]+]]:_(s16) = G_AND [[TRUNC11]], [[C7]] - ; VI: [[SHL5:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) - ; VI: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL5]] - ; VI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16) + ; VI: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) + ; VI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL7]] + ; VI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; VI: 
[[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; VI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C9]](s32) + ; VI: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL8]] ; VI: [[DEF:%[0-9]+]]:_(s96) = G_IMPLICIT_DEF ; VI: [[INSERT:%[0-9]+]]:_(s96) = G_INSERT [[DEF]], [[MV]](s64), 0 - ; VI: [[INSERT1:%[0-9]+]]:_(s96) = G_INSERT [[INSERT]], [[MV1]](s32), 64 + ; VI: [[INSERT1:%[0-9]+]]:_(s96) = G_INSERT [[INSERT]], [[OR8]](s32), 64 ; VI: $vgpr0_vgpr1_vgpr2 = COPY [[INSERT1]](s96) ; GFX9-LABEL: name: test_load_local_s96_align1 ; GFX9: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 @@ -2188,9 +2504,18 @@ body: | ; GFX9: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C7]] ; GFX9: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C8]](s16) ; GFX9: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; GFX9: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) - ; GFX9: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 - ; GFX9: [[GEP7:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C9]](s32) + ; GFX9: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C9]](s32) + ; GFX9: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; GFX9: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; GFX9: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; GFX9: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C9]](s32) + ; GFX9: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; GFX9: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) + ; GFX9: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 + ; GFX9: [[GEP7:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C10]](s32) ; GFX9: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p3) :: (load 1, addrspace 3) ; GFX9: [[GEP8:%[0-9]+]]:_(p3) = G_GEP [[GEP7]], [[C]](s32) ; GFX9: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p3) :: (load 1, addrspace 3) @@ -2202,18 +2527,21 @@ body: | ; GFX9: 
[[AND8:%[0-9]+]]:_(s16) = G_AND [[TRUNC8]], [[C7]] ; GFX9: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD9]](s32) ; GFX9: [[AND9:%[0-9]+]]:_(s16) = G_AND [[TRUNC9]], [[C7]] - ; GFX9: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) - ; GFX9: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL4]] + ; GFX9: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) + ; GFX9: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL6]] ; GFX9: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; GFX9: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C7]] ; GFX9: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD11]](s32) ; GFX9: [[AND11:%[0-9]+]]:_(s16) = G_AND [[TRUNC11]], [[C7]] - ; GFX9: [[SHL5:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) - ; GFX9: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL5]] - ; GFX9: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16) + ; GFX9: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) + ; GFX9: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL7]] + ; GFX9: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; GFX9: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; GFX9: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C9]](s32) + ; GFX9: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL8]] ; GFX9: [[DEF:%[0-9]+]]:_(s96) = G_IMPLICIT_DEF ; GFX9: [[INSERT:%[0-9]+]]:_(s96) = G_INSERT [[DEF]], [[MV]](s64), 0 - ; GFX9: [[INSERT1:%[0-9]+]]:_(s96) = G_INSERT [[INSERT]], [[MV1]](s32), 64 + ; GFX9: [[INSERT1:%[0-9]+]]:_(s96) = G_INSERT [[INSERT]], [[OR8]](s32), 64 ; GFX9: $vgpr0_vgpr1_vgpr2 = COPY [[INSERT1]](s96) %0:_(p3) = COPY $vgpr0 %1:_(s96) = G_LOAD %0 :: (load 12, align 1, addrspace 3) @@ -2285,7 +2613,16 @@ body: | ; SI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY7]](s32) ; SI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) ; SI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; SI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) + ; SI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; SI: 
[[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; SI: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; SI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C10]](s32) + ; SI: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; SI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; SI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; SI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C10]](s32) + ; SI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; SI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) ; SI: [[GEP7:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C8]](s32) ; SI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p3) :: (load 1, addrspace 3) ; SI: [[GEP8:%[0-9]+]]:_(p3) = G_GEP [[GEP7]], [[C]](s32) @@ -2307,34 +2644,42 @@ body: | ; SI: [[COPY9:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; SI: [[COPY10:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32) ; SI: [[AND9:%[0-9]+]]:_(s32) = G_AND [[COPY10]], [[C9]] - ; SI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY9]](s32) - ; SI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) - ; SI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] + ; SI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY9]](s32) + ; SI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) + ; SI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] ; SI: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; SI: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C7]] ; SI: [[COPY11:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; SI: [[COPY12:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32) ; SI: [[AND11:%[0-9]+]]:_(s32) = G_AND [[COPY12]], [[C9]] - ; SI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY11]](s32) - ; SI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL5]](s32) - ; SI: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] + ; SI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY11]](s32) + ; SI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) + ; SI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] ; SI: [[TRUNC12:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD12]](s32) ; SI: 
[[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C7]] ; SI: [[COPY13:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; SI: [[COPY14:%[0-9]+]]:_(s32) = COPY [[LOAD13]](s32) ; SI: [[AND13:%[0-9]+]]:_(s32) = G_AND [[COPY14]], [[C9]] - ; SI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY13]](s32) - ; SI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) - ; SI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] + ; SI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY13]](s32) + ; SI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL8]](s32) + ; SI: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] ; SI: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; SI: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C7]] ; SI: [[COPY15:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; SI: [[COPY16:%[0-9]+]]:_(s32) = COPY [[LOAD15]](s32) ; SI: [[AND15:%[0-9]+]]:_(s32) = G_AND [[COPY16]], [[C9]] - ; SI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY15]](s32) - ; SI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) - ; SI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] - ; SI: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16), [[OR6]](s16), [[OR7]](s16) + ; SI: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY15]](s32) + ; SI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL9]](s32) + ; SI: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] + ; SI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; SI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; SI: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C10]](s32) + ; SI: [[OR10:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL10]] + ; SI: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR8]](s16) + ; SI: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; SI: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C10]](s32) + ; SI: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; SI: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR10]](s32), [[OR11]](s32) ; SI: [[MV2:%[0-9]+]]:_(s128) = G_MERGE_VALUES [[MV]](s64), [[MV1]](s64) ; SI: 
$vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[MV2]](s128) ; CI-LABEL: name: test_load_local_s128_align16 @@ -2396,7 +2741,16 @@ body: | ; CI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY7]](s32) ; CI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) ; CI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; CI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) + ; CI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C10]](s32) + ; CI: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; CI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; CI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C10]](s32) + ; CI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; CI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) ; CI: [[GEP7:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C8]](s32) ; CI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p3) :: (load 1, addrspace 3) ; CI: [[GEP8:%[0-9]+]]:_(p3) = G_GEP [[GEP7]], [[C]](s32) @@ -2418,34 +2772,42 @@ body: | ; CI: [[COPY9:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI: [[COPY10:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32) ; CI: [[AND9:%[0-9]+]]:_(s32) = G_AND [[COPY10]], [[C9]] - ; CI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY9]](s32) - ; CI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) - ; CI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] + ; CI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY9]](s32) + ; CI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) + ; CI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] ; CI: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; CI: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C7]] ; CI: [[COPY11:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI: [[COPY12:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32) ; CI: [[AND11:%[0-9]+]]:_(s32) = 
G_AND [[COPY12]], [[C9]] - ; CI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY11]](s32) - ; CI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL5]](s32) - ; CI: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] + ; CI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY11]](s32) + ; CI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) + ; CI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] ; CI: [[TRUNC12:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD12]](s32) ; CI: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C7]] ; CI: [[COPY13:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI: [[COPY14:%[0-9]+]]:_(s32) = COPY [[LOAD13]](s32) ; CI: [[AND13:%[0-9]+]]:_(s32) = G_AND [[COPY14]], [[C9]] - ; CI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY13]](s32) - ; CI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) - ; CI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] + ; CI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY13]](s32) + ; CI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL8]](s32) + ; CI: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] ; CI: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; CI: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C7]] ; CI: [[COPY15:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI: [[COPY16:%[0-9]+]]:_(s32) = COPY [[LOAD15]](s32) ; CI: [[AND15:%[0-9]+]]:_(s32) = G_AND [[COPY16]], [[C9]] - ; CI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY15]](s32) - ; CI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) - ; CI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] - ; CI: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16), [[OR6]](s16), [[OR7]](s16) + ; CI: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY15]](s32) + ; CI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL9]](s32) + ; CI: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] + ; CI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; CI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; CI: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C10]](s32) + ; CI: 
[[OR10:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL10]] + ; CI: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR8]](s16) + ; CI: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; CI: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C10]](s32) + ; CI: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; CI: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR10]](s32), [[OR11]](s32) ; CI: [[MV2:%[0-9]+]]:_(s128) = G_MERGE_VALUES [[MV]](s64), [[MV1]](s64) ; CI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[MV2]](s128) ; CI-DS128-LABEL: name: test_load_local_s128_align16 @@ -2562,7 +2924,24 @@ body: | ; CI-DS128: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY15]](s32) ; CI-DS128: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) ; CI-DS128: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] - ; CI-DS128: [[MV:%[0-9]+]]:_(s128) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16), [[OR4]](s16), [[OR5]](s16), [[OR6]](s16), [[OR7]](s16) + ; CI-DS128: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI-DS128: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI-DS128: [[C17:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-DS128: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C17]](s32) + ; CI-DS128: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL8]] + ; CI-DS128: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; CI-DS128: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI-DS128: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C17]](s32) + ; CI-DS128: [[OR9:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL9]] + ; CI-DS128: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; CI-DS128: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR5]](s16) + ; CI-DS128: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C17]](s32) + ; CI-DS128: [[OR10:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL10]] + ; CI-DS128: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; CI-DS128: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; CI-DS128: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C17]](s32) + ; CI-DS128: [[OR11:%[0-9]+]]:_(s32) = G_OR 
[[ZEXT6]], [[SHL11]] + ; CI-DS128: [[MV:%[0-9]+]]:_(s128) = G_MERGE_VALUES [[OR8]](s32), [[OR9]](s32), [[OR10]](s32), [[OR11]](s32) ; CI-DS128: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[MV]](s128) ; VI-LABEL: name: test_load_local_s128_align16 ; VI: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 @@ -2614,9 +2993,18 @@ body: | ; VI: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C7]] ; VI: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C8]](s16) ; VI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; VI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) - ; VI: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 - ; VI: [[GEP7:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C9]](s32) + ; VI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; VI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; VI: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C9]](s32) + ; VI: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; VI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; VI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; VI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C9]](s32) + ; VI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; VI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) + ; VI: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 + ; VI: [[GEP7:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C10]](s32) ; VI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p3) :: (load 1, addrspace 3) ; VI: [[GEP8:%[0-9]+]]:_(p3) = G_GEP [[GEP7]], [[C]](s32) ; VI: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p3) :: (load 1, addrspace 3) @@ -2636,27 +3024,35 @@ body: | ; VI: [[AND8:%[0-9]+]]:_(s16) = G_AND [[TRUNC8]], [[C7]] ; VI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD9]](s32) ; VI: [[AND9:%[0-9]+]]:_(s16) = G_AND [[TRUNC9]], [[C7]] - ; VI: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) - ; VI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL4]] + ; VI: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) + ; VI: 
[[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL6]] ; VI: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; VI: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C7]] ; VI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD11]](s32) ; VI: [[AND11:%[0-9]+]]:_(s16) = G_AND [[TRUNC11]], [[C7]] - ; VI: [[SHL5:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) - ; VI: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL5]] + ; VI: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) + ; VI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL7]] ; VI: [[TRUNC12:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD12]](s32) ; VI: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C7]] ; VI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD13]](s32) ; VI: [[AND13:%[0-9]+]]:_(s16) = G_AND [[TRUNC13]], [[C7]] - ; VI: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C8]](s16) - ; VI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL6]] + ; VI: [[SHL8:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C8]](s16) + ; VI: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL8]] ; VI: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; VI: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C7]] ; VI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD15]](s32) ; VI: [[AND15:%[0-9]+]]:_(s16) = G_AND [[TRUNC15]], [[C7]] - ; VI: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C8]](s16) - ; VI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL7]] - ; VI: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16), [[OR6]](s16), [[OR7]](s16) + ; VI: [[SHL9:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C8]](s16) + ; VI: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL9]] + ; VI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; VI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; VI: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C9]](s32) + ; VI: [[OR10:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL10]] + ; VI: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR8]](s16) + ; VI: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; VI: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C9]](s32) + ; 
VI: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; VI: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR10]](s32), [[OR11]](s32) ; VI: [[MV2:%[0-9]+]]:_(s128) = G_MERGE_VALUES [[MV]](s64), [[MV1]](s64) ; VI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[MV2]](s128) ; GFX9-LABEL: name: test_load_local_s128_align16 @@ -2709,9 +3105,18 @@ body: | ; GFX9: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C7]] ; GFX9: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C8]](s16) ; GFX9: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; GFX9: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) - ; GFX9: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 - ; GFX9: [[GEP7:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C9]](s32) + ; GFX9: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C9]](s32) + ; GFX9: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; GFX9: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; GFX9: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; GFX9: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C9]](s32) + ; GFX9: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; GFX9: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) + ; GFX9: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 + ; GFX9: [[GEP7:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C10]](s32) ; GFX9: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p3) :: (load 1, addrspace 3) ; GFX9: [[GEP8:%[0-9]+]]:_(p3) = G_GEP [[GEP7]], [[C]](s32) ; GFX9: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p3) :: (load 1, addrspace 3) @@ -2731,27 +3136,35 @@ body: | ; GFX9: [[AND8:%[0-9]+]]:_(s16) = G_AND [[TRUNC8]], [[C7]] ; GFX9: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD9]](s32) ; GFX9: [[AND9:%[0-9]+]]:_(s16) = G_AND [[TRUNC9]], [[C7]] - ; GFX9: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) - ; GFX9: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], 
[[SHL4]] + ; GFX9: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) + ; GFX9: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL6]] ; GFX9: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; GFX9: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C7]] ; GFX9: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD11]](s32) ; GFX9: [[AND11:%[0-9]+]]:_(s16) = G_AND [[TRUNC11]], [[C7]] - ; GFX9: [[SHL5:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) - ; GFX9: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL5]] + ; GFX9: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) + ; GFX9: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL7]] ; GFX9: [[TRUNC12:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD12]](s32) ; GFX9: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C7]] ; GFX9: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD13]](s32) ; GFX9: [[AND13:%[0-9]+]]:_(s16) = G_AND [[TRUNC13]], [[C7]] - ; GFX9: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C8]](s16) - ; GFX9: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL6]] + ; GFX9: [[SHL8:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C8]](s16) + ; GFX9: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL8]] ; GFX9: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; GFX9: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C7]] ; GFX9: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD15]](s32) ; GFX9: [[AND15:%[0-9]+]]:_(s16) = G_AND [[TRUNC15]], [[C7]] - ; GFX9: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C8]](s16) - ; GFX9: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL7]] - ; GFX9: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16), [[OR6]](s16), [[OR7]](s16) + ; GFX9: [[SHL9:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C8]](s16) + ; GFX9: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL9]] + ; GFX9: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; GFX9: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; GFX9: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C9]](s32) + ; GFX9: [[OR10:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL10]] + ; GFX9: [[ZEXT6:%[0-9]+]]:_(s32) = 
G_ZEXT [[OR8]](s16) + ; GFX9: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; GFX9: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C9]](s32) + ; GFX9: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; GFX9: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR10]](s32), [[OR11]](s32) ; GFX9: [[MV2:%[0-9]+]]:_(s128) = G_MERGE_VALUES [[MV]](s64), [[MV1]](s64) ; GFX9: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[MV2]](s128) %0:_(p3) = COPY $vgpr0 @@ -2862,167 +3275,257 @@ body: | ; SI-LABEL: name: test_load_local_s128_align2 ; SI: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 ; SI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 2, addrspace 3) - ; SI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; SI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; SI: [[GEP:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C]](s32) ; SI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p3) :: (load 2, addrspace 3) - ; SI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; SI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 ; SI: [[GEP1:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C1]](s32) ; SI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p3) :: (load 2, addrspace 3) - ; SI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; SI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 6 ; SI: [[GEP2:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C2]](s32) ; SI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p3) :: (load 2, addrspace 3) - ; SI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; SI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) - ; SI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 - ; SI: [[GEP3:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C3]](s32) + ; SI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; SI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; SI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C3]] + ; SI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; SI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C3]] + ; SI: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; SI: [[SHL:%[0-9]+]]:_(s32) 
= G_SHL [[AND1]], [[C4]](s32) + ; SI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; SI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; SI: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C3]] + ; SI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; SI: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C3]] + ; SI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) + ; SI: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; SI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32) + ; SI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 + ; SI: [[GEP3:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C5]](s32) ; SI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p3) :: (load 2, addrspace 3) - ; SI: [[TRUNC4:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD4]](s32) ; SI: [[GEP4:%[0-9]+]]:_(p3) = G_GEP [[GEP3]], [[C]](s32) ; SI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p3) :: (load 2, addrspace 3) - ; SI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) ; SI: [[GEP5:%[0-9]+]]:_(p3) = G_GEP [[GEP3]], [[C1]](s32) ; SI: [[LOAD6:%[0-9]+]]:_(s32) = G_LOAD [[GEP5]](p3) :: (load 2, addrspace 3) - ; SI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; SI: [[GEP6:%[0-9]+]]:_(p3) = G_GEP [[GEP3]], [[C2]](s32) ; SI: [[LOAD7:%[0-9]+]]:_(s32) = G_LOAD [[GEP6]](p3) :: (load 2, addrspace 3) - ; SI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) - ; SI: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[TRUNC4]](s16), [[TRUNC5]](s16), [[TRUNC6]](s16), [[TRUNC7]](s16) + ; SI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD4]](s32) + ; SI: [[AND4:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C3]] + ; SI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) + ; SI: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C3]] + ; SI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[C4]](s32) + ; SI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[AND4]], [[SHL2]] + ; SI: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LOAD6]](s32) + ; SI: [[AND6:%[0-9]+]]:_(s32) = G_AND [[COPY7]], [[C3]] + ; SI: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) + ; SI: [[AND7:%[0-9]+]]:_(s32) = 
G_AND [[COPY8]], [[C3]] + ; SI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C4]](s32) + ; SI: [[OR3:%[0-9]+]]:_(s32) = G_OR [[AND6]], [[SHL3]] + ; SI: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR2]](s32), [[OR3]](s32) ; SI: [[MV2:%[0-9]+]]:_(s128) = G_MERGE_VALUES [[MV]](s64), [[MV1]](s64) ; SI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[MV2]](s128) ; CI-LABEL: name: test_load_local_s128_align2 ; CI: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 ; CI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 2, addrspace 3) - ; CI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; CI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; CI: [[GEP:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C]](s32) ; CI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p3) :: (load 2, addrspace 3) - ; CI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; CI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 ; CI: [[GEP1:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C1]](s32) ; CI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p3) :: (load 2, addrspace 3) - ; CI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; CI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 6 ; CI: [[GEP2:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C2]](s32) ; CI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p3) :: (load 2, addrspace 3) - ; CI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; CI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) - ; CI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 - ; CI: [[GEP3:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C3]](s32) + ; CI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; CI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; CI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C3]] + ; CI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; CI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C3]] + ; CI: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C4]](s32) + ; CI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; CI: [[COPY3:%[0-9]+]]:_(s32) = COPY 
[[LOAD2]](s32) + ; CI: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C3]] + ; CI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; CI: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C3]] + ; CI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) + ; CI: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; CI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32) + ; CI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 + ; CI: [[GEP3:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C5]](s32) ; CI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p3) :: (load 2, addrspace 3) - ; CI: [[TRUNC4:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD4]](s32) ; CI: [[GEP4:%[0-9]+]]:_(p3) = G_GEP [[GEP3]], [[C]](s32) ; CI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p3) :: (load 2, addrspace 3) - ; CI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) ; CI: [[GEP5:%[0-9]+]]:_(p3) = G_GEP [[GEP3]], [[C1]](s32) ; CI: [[LOAD6:%[0-9]+]]:_(s32) = G_LOAD [[GEP5]](p3) :: (load 2, addrspace 3) - ; CI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; CI: [[GEP6:%[0-9]+]]:_(p3) = G_GEP [[GEP3]], [[C2]](s32) ; CI: [[LOAD7:%[0-9]+]]:_(s32) = G_LOAD [[GEP6]](p3) :: (load 2, addrspace 3) - ; CI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) - ; CI: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[TRUNC4]](s16), [[TRUNC5]](s16), [[TRUNC6]](s16), [[TRUNC7]](s16) + ; CI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD4]](s32) + ; CI: [[AND4:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C3]] + ; CI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) + ; CI: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C3]] + ; CI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[C4]](s32) + ; CI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[AND4]], [[SHL2]] + ; CI: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LOAD6]](s32) + ; CI: [[AND6:%[0-9]+]]:_(s32) = G_AND [[COPY7]], [[C3]] + ; CI: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) + ; CI: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY8]], [[C3]] + ; CI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C4]](s32) + ; CI: [[OR3:%[0-9]+]]:_(s32) = G_OR 
[[AND6]], [[SHL3]] + ; CI: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR2]](s32), [[OR3]](s32) ; CI: [[MV2:%[0-9]+]]:_(s128) = G_MERGE_VALUES [[MV]](s64), [[MV1]](s64) ; CI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[MV2]](s128) ; CI-DS128-LABEL: name: test_load_local_s128_align2 ; CI-DS128: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 ; CI-DS128: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 2, addrspace 3) - ; CI-DS128: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; CI-DS128: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; CI-DS128: [[GEP:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C]](s32) ; CI-DS128: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p3) :: (load 2, addrspace 3) - ; CI-DS128: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; CI-DS128: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 ; CI-DS128: [[GEP1:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C1]](s32) ; CI-DS128: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p3) :: (load 2, addrspace 3) - ; CI-DS128: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; CI-DS128: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 6 ; CI-DS128: [[GEP2:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C2]](s32) ; CI-DS128: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p3) :: (load 2, addrspace 3) - ; CI-DS128: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) ; CI-DS128: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 ; CI-DS128: [[GEP3:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C3]](s32) ; CI-DS128: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p3) :: (load 2, addrspace 3) - ; CI-DS128: [[TRUNC4:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD4]](s32) ; CI-DS128: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 10 ; CI-DS128: [[GEP4:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C4]](s32) ; CI-DS128: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p3) :: (load 2, addrspace 3) - ; CI-DS128: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) ; CI-DS128: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 ; CI-DS128: [[GEP5:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C5]](s32) ; CI-DS128: [[LOAD6:%[0-9]+]]:_(s32) = G_LOAD [[GEP5]](p3) :: (load 2, addrspace 
3) - ; CI-DS128: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; CI-DS128: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 14 ; CI-DS128: [[GEP6:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C6]](s32) ; CI-DS128: [[LOAD7:%[0-9]+]]:_(s32) = G_LOAD [[GEP6]](p3) :: (load 2, addrspace 3) - ; CI-DS128: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) - ; CI-DS128: [[MV:%[0-9]+]]:_(s128) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16), [[TRUNC4]](s16), [[TRUNC5]](s16), [[TRUNC6]](s16), [[TRUNC7]](s16) + ; CI-DS128: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; CI-DS128: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; CI-DS128: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C7]] + ; CI-DS128: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; CI-DS128: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C7]] + ; CI-DS128: [[C8:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-DS128: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C8]](s32) + ; CI-DS128: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; CI-DS128: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; CI-DS128: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C7]] + ; CI-DS128: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; CI-DS128: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C7]] + ; CI-DS128: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C8]](s32) + ; CI-DS128: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; CI-DS128: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD4]](s32) + ; CI-DS128: [[AND4:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C7]] + ; CI-DS128: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) + ; CI-DS128: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C7]] + ; CI-DS128: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[C8]](s32) + ; CI-DS128: [[OR2:%[0-9]+]]:_(s32) = G_OR [[AND4]], [[SHL2]] + ; CI-DS128: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LOAD6]](s32) + ; CI-DS128: [[AND6:%[0-9]+]]:_(s32) = G_AND [[COPY7]], [[C7]] + ; CI-DS128: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) + ; CI-DS128: [[AND7:%[0-9]+]]:_(s32) 
= G_AND [[COPY8]], [[C7]] + ; CI-DS128: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C8]](s32) + ; CI-DS128: [[OR3:%[0-9]+]]:_(s32) = G_OR [[AND6]], [[SHL3]] + ; CI-DS128: [[MV:%[0-9]+]]:_(s128) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32), [[OR2]](s32), [[OR3]](s32) ; CI-DS128: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[MV]](s128) ; VI-LABEL: name: test_load_local_s128_align2 ; VI: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 2, addrspace 3) - ; VI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; VI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; VI: [[GEP:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C]](s32) ; VI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p3) :: (load 2, addrspace 3) - ; VI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; VI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 ; VI: [[GEP1:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C1]](s32) ; VI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p3) :: (load 2, addrspace 3) - ; VI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; VI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 6 ; VI: [[GEP2:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C2]](s32) ; VI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p3) :: (load 2, addrspace 3) - ; VI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; VI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) - ; VI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 - ; VI: [[GEP3:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C3]](s32) + ; VI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; VI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; VI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C3]] + ; VI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; VI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C3]] + ; VI: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C4]](s32) + ; VI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; VI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; VI: 
[[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C3]] + ; VI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; VI: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C3]] + ; VI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) + ; VI: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; VI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32) + ; VI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 + ; VI: [[GEP3:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C5]](s32) ; VI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p3) :: (load 2, addrspace 3) - ; VI: [[TRUNC4:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD4]](s32) ; VI: [[GEP4:%[0-9]+]]:_(p3) = G_GEP [[GEP3]], [[C]](s32) ; VI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p3) :: (load 2, addrspace 3) - ; VI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) ; VI: [[GEP5:%[0-9]+]]:_(p3) = G_GEP [[GEP3]], [[C1]](s32) ; VI: [[LOAD6:%[0-9]+]]:_(s32) = G_LOAD [[GEP5]](p3) :: (load 2, addrspace 3) - ; VI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; VI: [[GEP6:%[0-9]+]]:_(p3) = G_GEP [[GEP3]], [[C2]](s32) ; VI: [[LOAD7:%[0-9]+]]:_(s32) = G_LOAD [[GEP6]](p3) :: (load 2, addrspace 3) - ; VI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) - ; VI: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[TRUNC4]](s16), [[TRUNC5]](s16), [[TRUNC6]](s16), [[TRUNC7]](s16) + ; VI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD4]](s32) + ; VI: [[AND4:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C3]] + ; VI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) + ; VI: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C3]] + ; VI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[C4]](s32) + ; VI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[AND4]], [[SHL2]] + ; VI: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LOAD6]](s32) + ; VI: [[AND6:%[0-9]+]]:_(s32) = G_AND [[COPY7]], [[C3]] + ; VI: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) + ; VI: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY8]], [[C3]] + ; VI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C4]](s32) + ; VI: [[OR3:%[0-9]+]]:_(s32) = G_OR [[AND6]], [[SHL3]] + ; 
VI: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR2]](s32), [[OR3]](s32) ; VI: [[MV2:%[0-9]+]]:_(s128) = G_MERGE_VALUES [[MV]](s64), [[MV1]](s64) ; VI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[MV2]](s128) ; GFX9-LABEL: name: test_load_local_s128_align2 ; GFX9: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 ; GFX9: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 2, addrspace 3) - ; GFX9: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; GFX9: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; GFX9: [[GEP:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C]](s32) ; GFX9: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p3) :: (load 2, addrspace 3) - ; GFX9: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; GFX9: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 ; GFX9: [[GEP1:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C1]](s32) ; GFX9: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p3) :: (load 2, addrspace 3) - ; GFX9: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; GFX9: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 6 ; GFX9: [[GEP2:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C2]](s32) ; GFX9: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p3) :: (load 2, addrspace 3) - ; GFX9: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; GFX9: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) - ; GFX9: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 - ; GFX9: [[GEP3:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C3]](s32) + ; GFX9: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; GFX9: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; GFX9: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C3]] + ; GFX9: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; GFX9: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C3]] + ; GFX9: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C4]](s32) + ; GFX9: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; GFX9: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; GFX9: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C3]] + ; GFX9: 
[[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; GFX9: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C3]] + ; GFX9: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) + ; GFX9: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; GFX9: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32) + ; GFX9: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 + ; GFX9: [[GEP3:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C5]](s32) ; GFX9: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p3) :: (load 2, addrspace 3) - ; GFX9: [[TRUNC4:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD4]](s32) ; GFX9: [[GEP4:%[0-9]+]]:_(p3) = G_GEP [[GEP3]], [[C]](s32) ; GFX9: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p3) :: (load 2, addrspace 3) - ; GFX9: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) ; GFX9: [[GEP5:%[0-9]+]]:_(p3) = G_GEP [[GEP3]], [[C1]](s32) ; GFX9: [[LOAD6:%[0-9]+]]:_(s32) = G_LOAD [[GEP5]](p3) :: (load 2, addrspace 3) - ; GFX9: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; GFX9: [[GEP6:%[0-9]+]]:_(p3) = G_GEP [[GEP3]], [[C2]](s32) ; GFX9: [[LOAD7:%[0-9]+]]:_(s32) = G_LOAD [[GEP6]](p3) :: (load 2, addrspace 3) - ; GFX9: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) - ; GFX9: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[TRUNC4]](s16), [[TRUNC5]](s16), [[TRUNC6]](s16), [[TRUNC7]](s16) + ; GFX9: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD4]](s32) + ; GFX9: [[AND4:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C3]] + ; GFX9: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) + ; GFX9: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C3]] + ; GFX9: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[C4]](s32) + ; GFX9: [[OR2:%[0-9]+]]:_(s32) = G_OR [[AND4]], [[SHL2]] + ; GFX9: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LOAD6]](s32) + ; GFX9: [[AND6:%[0-9]+]]:_(s32) = G_AND [[COPY7]], [[C3]] + ; GFX9: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) + ; GFX9: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY8]], [[C3]] + ; GFX9: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C4]](s32) + ; GFX9: [[OR3:%[0-9]+]]:_(s32) = G_OR [[AND6]], [[SHL3]] + ; 
GFX9: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR2]](s32), [[OR3]](s32) ; GFX9: [[MV2:%[0-9]+]]:_(s128) = G_MERGE_VALUES [[MV]](s64), [[MV1]](s64) ; GFX9: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[MV2]](s128) %0:_(p3) = COPY $vgpr0 @@ -3095,7 +3598,16 @@ body: | ; SI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY7]](s32) ; SI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) ; SI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; SI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) + ; SI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; SI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; SI: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; SI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C10]](s32) + ; SI: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; SI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; SI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; SI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C10]](s32) + ; SI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; SI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) ; SI: [[GEP7:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C8]](s32) ; SI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p3) :: (load 1, addrspace 3) ; SI: [[GEP8:%[0-9]+]]:_(p3) = G_GEP [[GEP7]], [[C]](s32) @@ -3117,34 +3629,42 @@ body: | ; SI: [[COPY9:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; SI: [[COPY10:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32) ; SI: [[AND9:%[0-9]+]]:_(s32) = G_AND [[COPY10]], [[C9]] - ; SI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY9]](s32) - ; SI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) - ; SI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] + ; SI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY9]](s32) + ; SI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) + ; SI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] ; SI: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; SI: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C7]] ; SI: 
[[COPY11:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; SI: [[COPY12:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32) ; SI: [[AND11:%[0-9]+]]:_(s32) = G_AND [[COPY12]], [[C9]] - ; SI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY11]](s32) - ; SI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL5]](s32) - ; SI: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] + ; SI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY11]](s32) + ; SI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) + ; SI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] ; SI: [[TRUNC12:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD12]](s32) ; SI: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C7]] ; SI: [[COPY13:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; SI: [[COPY14:%[0-9]+]]:_(s32) = COPY [[LOAD13]](s32) ; SI: [[AND13:%[0-9]+]]:_(s32) = G_AND [[COPY14]], [[C9]] - ; SI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY13]](s32) - ; SI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) - ; SI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] + ; SI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY13]](s32) + ; SI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL8]](s32) + ; SI: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] ; SI: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; SI: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C7]] ; SI: [[COPY15:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; SI: [[COPY16:%[0-9]+]]:_(s32) = COPY [[LOAD15]](s32) ; SI: [[AND15:%[0-9]+]]:_(s32) = G_AND [[COPY16]], [[C9]] - ; SI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY15]](s32) - ; SI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) - ; SI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] - ; SI: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16), [[OR6]](s16), [[OR7]](s16) + ; SI: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY15]](s32) + ; SI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL9]](s32) + ; SI: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] + ; SI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT 
[[OR6]](s16) + ; SI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; SI: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C10]](s32) + ; SI: [[OR10:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL10]] + ; SI: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR8]](s16) + ; SI: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; SI: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C10]](s32) + ; SI: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; SI: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR10]](s32), [[OR11]](s32) ; SI: [[MV2:%[0-9]+]]:_(s128) = G_MERGE_VALUES [[MV]](s64), [[MV1]](s64) ; SI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[MV2]](s128) ; CI-LABEL: name: test_load_local_s128_align1 @@ -3206,7 +3726,16 @@ body: | ; CI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY7]](s32) ; CI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) ; CI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; CI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) + ; CI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C10]](s32) + ; CI: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; CI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; CI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C10]](s32) + ; CI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; CI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) ; CI: [[GEP7:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C8]](s32) ; CI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p3) :: (load 1, addrspace 3) ; CI: [[GEP8:%[0-9]+]]:_(p3) = G_GEP [[GEP7]], [[C]](s32) @@ -3228,34 +3757,42 @@ body: | ; CI: [[COPY9:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI: [[COPY10:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32) ; CI: [[AND9:%[0-9]+]]:_(s32) = G_AND [[COPY10]], [[C9]] - ; CI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL 
[[AND9]], [[COPY9]](s32) - ; CI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) - ; CI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] + ; CI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY9]](s32) + ; CI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) + ; CI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] ; CI: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; CI: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C7]] ; CI: [[COPY11:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI: [[COPY12:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32) ; CI: [[AND11:%[0-9]+]]:_(s32) = G_AND [[COPY12]], [[C9]] - ; CI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY11]](s32) - ; CI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL5]](s32) - ; CI: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] + ; CI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY11]](s32) + ; CI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) + ; CI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] ; CI: [[TRUNC12:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD12]](s32) ; CI: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C7]] ; CI: [[COPY13:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI: [[COPY14:%[0-9]+]]:_(s32) = COPY [[LOAD13]](s32) ; CI: [[AND13:%[0-9]+]]:_(s32) = G_AND [[COPY14]], [[C9]] - ; CI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY13]](s32) - ; CI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) - ; CI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] + ; CI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY13]](s32) + ; CI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL8]](s32) + ; CI: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] ; CI: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; CI: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C7]] ; CI: [[COPY15:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI: [[COPY16:%[0-9]+]]:_(s32) = COPY [[LOAD15]](s32) ; CI: [[AND15:%[0-9]+]]:_(s32) = G_AND [[COPY16]], [[C9]] - ; CI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY15]](s32) - ; CI: 
[[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) - ; CI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] - ; CI: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16), [[OR6]](s16), [[OR7]](s16) + ; CI: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY15]](s32) + ; CI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL9]](s32) + ; CI: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] + ; CI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; CI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; CI: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C10]](s32) + ; CI: [[OR10:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL10]] + ; CI: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR8]](s16) + ; CI: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; CI: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C10]](s32) + ; CI: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; CI: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR10]](s32), [[OR11]](s32) ; CI: [[MV2:%[0-9]+]]:_(s128) = G_MERGE_VALUES [[MV]](s64), [[MV1]](s64) ; CI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[MV2]](s128) ; CI-DS128-LABEL: name: test_load_local_s128_align1 @@ -3372,7 +3909,24 @@ body: | ; CI-DS128: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY15]](s32) ; CI-DS128: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) ; CI-DS128: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] - ; CI-DS128: [[MV:%[0-9]+]]:_(s128) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16), [[OR4]](s16), [[OR5]](s16), [[OR6]](s16), [[OR7]](s16) + ; CI-DS128: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI-DS128: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI-DS128: [[C17:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-DS128: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C17]](s32) + ; CI-DS128: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL8]] + ; CI-DS128: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; CI-DS128: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI-DS128: [[SHL9:%[0-9]+]]:_(s32) = G_SHL 
[[ZEXT3]], [[C17]](s32) + ; CI-DS128: [[OR9:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL9]] + ; CI-DS128: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; CI-DS128: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR5]](s16) + ; CI-DS128: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C17]](s32) + ; CI-DS128: [[OR10:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL10]] + ; CI-DS128: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; CI-DS128: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; CI-DS128: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C17]](s32) + ; CI-DS128: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; CI-DS128: [[MV:%[0-9]+]]:_(s128) = G_MERGE_VALUES [[OR8]](s32), [[OR9]](s32), [[OR10]](s32), [[OR11]](s32) ; CI-DS128: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[MV]](s128) ; VI-LABEL: name: test_load_local_s128_align1 ; VI: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 @@ -3424,9 +3978,18 @@ body: | ; VI: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C7]] ; VI: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C8]](s16) ; VI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; VI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) - ; VI: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 - ; VI: [[GEP7:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C9]](s32) + ; VI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; VI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; VI: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C9]](s32) + ; VI: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; VI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; VI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; VI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C9]](s32) + ; VI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; VI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) + ; VI: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 + ; VI: [[GEP7:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C10]](s32) ; VI: 
[[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p3) :: (load 1, addrspace 3) ; VI: [[GEP8:%[0-9]+]]:_(p3) = G_GEP [[GEP7]], [[C]](s32) ; VI: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p3) :: (load 1, addrspace 3) @@ -3446,27 +4009,35 @@ body: | ; VI: [[AND8:%[0-9]+]]:_(s16) = G_AND [[TRUNC8]], [[C7]] ; VI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD9]](s32) ; VI: [[AND9:%[0-9]+]]:_(s16) = G_AND [[TRUNC9]], [[C7]] - ; VI: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) - ; VI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL4]] + ; VI: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) + ; VI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL6]] ; VI: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; VI: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C7]] ; VI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD11]](s32) ; VI: [[AND11:%[0-9]+]]:_(s16) = G_AND [[TRUNC11]], [[C7]] - ; VI: [[SHL5:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) - ; VI: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL5]] + ; VI: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) + ; VI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL7]] ; VI: [[TRUNC12:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD12]](s32) ; VI: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C7]] ; VI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD13]](s32) ; VI: [[AND13:%[0-9]+]]:_(s16) = G_AND [[TRUNC13]], [[C7]] - ; VI: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C8]](s16) - ; VI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL6]] + ; VI: [[SHL8:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C8]](s16) + ; VI: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL8]] ; VI: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; VI: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C7]] ; VI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD15]](s32) ; VI: [[AND15:%[0-9]+]]:_(s16) = G_AND [[TRUNC15]], [[C7]] - ; VI: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C8]](s16) - ; VI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL7]] - ; VI: [[MV1:%[0-9]+]]:_(s64) = 
G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16), [[OR6]](s16), [[OR7]](s16) + ; VI: [[SHL9:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C8]](s16) + ; VI: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL9]] + ; VI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; VI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; VI: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C9]](s32) + ; VI: [[OR10:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL10]] + ; VI: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR8]](s16) + ; VI: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; VI: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C9]](s32) + ; VI: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; VI: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR10]](s32), [[OR11]](s32) ; VI: [[MV2:%[0-9]+]]:_(s128) = G_MERGE_VALUES [[MV]](s64), [[MV1]](s64) ; VI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[MV2]](s128) ; GFX9-LABEL: name: test_load_local_s128_align1 @@ -3519,9 +4090,18 @@ body: | ; GFX9: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C7]] ; GFX9: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C8]](s16) ; GFX9: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; GFX9: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) - ; GFX9: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 - ; GFX9: [[GEP7:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C9]](s32) + ; GFX9: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C9]](s32) + ; GFX9: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; GFX9: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; GFX9: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; GFX9: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C9]](s32) + ; GFX9: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; GFX9: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) + ; GFX9: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 + ; GFX9: 
[[GEP7:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C10]](s32) ; GFX9: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p3) :: (load 1, addrspace 3) ; GFX9: [[GEP8:%[0-9]+]]:_(p3) = G_GEP [[GEP7]], [[C]](s32) ; GFX9: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p3) :: (load 1, addrspace 3) @@ -3541,27 +4121,35 @@ body: | ; GFX9: [[AND8:%[0-9]+]]:_(s16) = G_AND [[TRUNC8]], [[C7]] ; GFX9: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD9]](s32) ; GFX9: [[AND9:%[0-9]+]]:_(s16) = G_AND [[TRUNC9]], [[C7]] - ; GFX9: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) - ; GFX9: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL4]] + ; GFX9: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) + ; GFX9: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL6]] ; GFX9: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; GFX9: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C7]] ; GFX9: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD11]](s32) ; GFX9: [[AND11:%[0-9]+]]:_(s16) = G_AND [[TRUNC11]], [[C7]] - ; GFX9: [[SHL5:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) - ; GFX9: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL5]] + ; GFX9: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) + ; GFX9: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL7]] ; GFX9: [[TRUNC12:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD12]](s32) ; GFX9: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C7]] ; GFX9: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD13]](s32) ; GFX9: [[AND13:%[0-9]+]]:_(s16) = G_AND [[TRUNC13]], [[C7]] - ; GFX9: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C8]](s16) - ; GFX9: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL6]] + ; GFX9: [[SHL8:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C8]](s16) + ; GFX9: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL8]] ; GFX9: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; GFX9: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C7]] ; GFX9: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD15]](s32) ; GFX9: [[AND15:%[0-9]+]]:_(s16) = G_AND [[TRUNC15]], [[C7]] - ; GFX9: [[SHL7:%[0-9]+]]:_(s16) = 
G_SHL [[AND15]], [[C8]](s16) - ; GFX9: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL7]] - ; GFX9: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16), [[OR6]](s16), [[OR7]](s16) + ; GFX9: [[SHL9:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C8]](s16) + ; GFX9: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL9]] + ; GFX9: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; GFX9: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; GFX9: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C9]](s32) + ; GFX9: [[OR10:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL10]] + ; GFX9: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR8]](s16) + ; GFX9: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; GFX9: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C9]](s32) + ; GFX9: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; GFX9: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR10]](s32), [[OR11]](s32) ; GFX9: [[MV2:%[0-9]+]]:_(s128) = G_MERGE_VALUES [[MV]](s64), [[MV1]](s64) ; GFX9: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[MV2]](s128) %0:_(p3) = COPY $vgpr0 @@ -3640,92 +4228,142 @@ body: | ; SI-LABEL: name: test_load_local_p1_align2 ; SI: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 ; SI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 2, addrspace 3) - ; SI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; SI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; SI: [[GEP:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C]](s32) ; SI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p3) :: (load 2, addrspace 3) - ; SI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; SI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 ; SI: [[GEP1:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C1]](s32) ; SI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p3) :: (load 2, addrspace 3) - ; SI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; SI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 6 ; SI: [[GEP2:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C2]](s32) ; SI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p3) :: (load 2, addrspace 3) - ; SI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC 
[[LOAD3]](s32) - ; SI: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) + ; SI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; SI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; SI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C3]] + ; SI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; SI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C3]] + ; SI: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; SI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C4]](s32) + ; SI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; SI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; SI: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C3]] + ; SI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; SI: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C3]] + ; SI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) + ; SI: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; SI: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32) ; SI: $vgpr0_vgpr1 = COPY [[MV]](p1) ; CI-LABEL: name: test_load_local_p1_align2 ; CI: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 ; CI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 2, addrspace 3) - ; CI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; CI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; CI: [[GEP:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C]](s32) ; CI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p3) :: (load 2, addrspace 3) - ; CI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; CI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 ; CI: [[GEP1:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C1]](s32) ; CI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p3) :: (load 2, addrspace 3) - ; CI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; CI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 6 ; CI: [[GEP2:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C2]](s32) ; CI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p3) :: (load 2, addrspace 3) - ; CI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; CI: [[MV:%[0-9]+]]:_(p1) = 
G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) + ; CI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; CI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; CI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C3]] + ; CI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; CI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C3]] + ; CI: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C4]](s32) + ; CI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; CI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; CI: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C3]] + ; CI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; CI: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C3]] + ; CI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) + ; CI: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; CI: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32) ; CI: $vgpr0_vgpr1 = COPY [[MV]](p1) ; CI-DS128-LABEL: name: test_load_local_p1_align2 ; CI-DS128: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 ; CI-DS128: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 2, addrspace 3) - ; CI-DS128: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; CI-DS128: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; CI-DS128: [[GEP:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C]](s32) ; CI-DS128: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p3) :: (load 2, addrspace 3) - ; CI-DS128: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; CI-DS128: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 ; CI-DS128: [[GEP1:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C1]](s32) ; CI-DS128: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p3) :: (load 2, addrspace 3) - ; CI-DS128: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; CI-DS128: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 6 ; CI-DS128: [[GEP2:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C2]](s32) ; CI-DS128: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p3) :: (load 2, addrspace 3) - ; CI-DS128: [[TRUNC3:%[0-9]+]]:_(s16) = 
G_TRUNC [[LOAD3]](s32) - ; CI-DS128: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) + ; CI-DS128: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; CI-DS128: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; CI-DS128: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C3]] + ; CI-DS128: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; CI-DS128: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C3]] + ; CI-DS128: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-DS128: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C4]](s32) + ; CI-DS128: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; CI-DS128: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; CI-DS128: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C3]] + ; CI-DS128: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; CI-DS128: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C3]] + ; CI-DS128: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) + ; CI-DS128: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; CI-DS128: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32) ; CI-DS128: $vgpr0_vgpr1 = COPY [[MV]](p1) ; VI-LABEL: name: test_load_local_p1_align2 ; VI: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 2, addrspace 3) - ; VI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; VI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; VI: [[GEP:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C]](s32) ; VI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p3) :: (load 2, addrspace 3) - ; VI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; VI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 ; VI: [[GEP1:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C1]](s32) ; VI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p3) :: (load 2, addrspace 3) - ; VI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; VI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 6 ; VI: [[GEP2:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C2]](s32) ; VI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p3) :: 
(load 2, addrspace 3) - ; VI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; VI: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) + ; VI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; VI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; VI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C3]] + ; VI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; VI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C3]] + ; VI: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C4]](s32) + ; VI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; VI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; VI: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C3]] + ; VI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; VI: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C3]] + ; VI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) + ; VI: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; VI: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32) ; VI: $vgpr0_vgpr1 = COPY [[MV]](p1) ; GFX9-LABEL: name: test_load_local_p1_align2 ; GFX9: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 ; GFX9: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 2, addrspace 3) - ; GFX9: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; GFX9: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; GFX9: [[GEP:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C]](s32) ; GFX9: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p3) :: (load 2, addrspace 3) - ; GFX9: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; GFX9: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 ; GFX9: [[GEP1:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C1]](s32) ; GFX9: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p3) :: (load 2, addrspace 3) - ; GFX9: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; GFX9: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 6 ; GFX9: [[GEP2:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C2]](s32) ; GFX9: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p3) :: (load 2, 
addrspace 3) - ; GFX9: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; GFX9: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) + ; GFX9: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; GFX9: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; GFX9: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C3]] + ; GFX9: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; GFX9: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C3]] + ; GFX9: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C4]](s32) + ; GFX9: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; GFX9: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; GFX9: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C3]] + ; GFX9: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; GFX9: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C3]] + ; GFX9: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) + ; GFX9: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; GFX9: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32) ; GFX9: $vgpr0_vgpr1 = COPY [[MV]](p1) %0:_(p3) = COPY $vgpr0 %1:_(p1) = G_LOAD %0 :: (load 8, align 2, addrspace 3) @@ -3796,7 +4434,16 @@ body: | ; SI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C8]](s32) ; SI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) ; SI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; SI: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) + ; SI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; SI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; SI: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; SI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C10]](s32) + ; SI: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; SI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; SI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; SI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C10]](s32) + ; SI: [[OR5:%[0-9]+]]:_(s32) = G_OR 
[[ZEXT2]], [[SHL5]] + ; SI: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) ; SI: $vgpr0_vgpr1 = COPY [[MV]](p1) ; CI-LABEL: name: test_load_local_p1_align1 ; CI: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 @@ -3856,7 +4503,16 @@ body: | ; CI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C8]](s32) ; CI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) ; CI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; CI: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) + ; CI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C10]](s32) + ; CI: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; CI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; CI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C10]](s32) + ; CI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; CI: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) ; CI: $vgpr0_vgpr1 = COPY [[MV]](p1) ; CI-DS128-LABEL: name: test_load_local_p1_align1 ; CI-DS128: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 @@ -3916,7 +4572,16 @@ body: | ; CI-DS128: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C8]](s32) ; CI-DS128: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) ; CI-DS128: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; CI-DS128: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) + ; CI-DS128: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI-DS128: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI-DS128: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-DS128: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C10]](s32) + ; CI-DS128: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; CI-DS128: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; CI-DS128: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT 
[[OR3]](s16) + ; CI-DS128: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C10]](s32) + ; CI-DS128: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; CI-DS128: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) ; CI-DS128: $vgpr0_vgpr1 = COPY [[MV]](p1) ; VI-LABEL: name: test_load_local_p1_align1 ; VI: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 @@ -3968,7 +4633,16 @@ body: | ; VI: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C7]] ; VI: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C8]](s16) ; VI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; VI: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) + ; VI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; VI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; VI: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C9]](s32) + ; VI: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; VI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; VI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; VI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C9]](s32) + ; VI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; VI: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) ; VI: $vgpr0_vgpr1 = COPY [[MV]](p1) ; GFX9-LABEL: name: test_load_local_p1_align1 ; GFX9: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 @@ -4020,7 +4694,16 @@ body: | ; GFX9: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C7]] ; GFX9: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C8]](s16) ; GFX9: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; GFX9: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) + ; GFX9: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C9]](s32) + ; GFX9: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; GFX9: 
[[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; GFX9: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; GFX9: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C9]](s32) + ; GFX9: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; GFX9: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) ; GFX9: $vgpr0_vgpr1 = COPY [[MV]](p1) %0:_(p3) = COPY $vgpr0 %1:_(p1) = G_LOAD %0 :: (load 8, align 1, addrspace 3) @@ -4067,53 +4750,83 @@ body: | ; SI-LABEL: name: test_load_local_p3_align2 ; SI: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 ; SI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 2, addrspace 3) - ; SI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; SI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; SI: [[GEP:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C]](s32) ; SI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p3) :: (load 2, addrspace 3) - ; SI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; SI: [[MV:%[0-9]+]]:_(p3) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; SI: $vgpr0 = COPY [[MV]](p3) + ; SI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; SI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; SI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; SI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; SI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; SI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; SI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; SI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; SI: [[INTTOPTR:%[0-9]+]]:_(p3) = G_INTTOPTR [[OR]](s32) + ; SI: $vgpr0 = COPY [[INTTOPTR]](p3) ; CI-LABEL: name: test_load_local_p3_align2 ; CI: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 ; CI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 2, addrspace 3) - ; CI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; CI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; CI: [[GEP:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C]](s32) ; CI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p3) :: (load 2, addrspace 3) - ; CI: [[TRUNC1:%[0-9]+]]:_(s16) 
= G_TRUNC [[LOAD1]](s32) - ; CI: [[MV:%[0-9]+]]:_(p3) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; CI: $vgpr0 = COPY [[MV]](p3) + ; CI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; CI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; CI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; CI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; CI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; CI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; CI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; CI: [[INTTOPTR:%[0-9]+]]:_(p3) = G_INTTOPTR [[OR]](s32) + ; CI: $vgpr0 = COPY [[INTTOPTR]](p3) ; CI-DS128-LABEL: name: test_load_local_p3_align2 ; CI-DS128: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 ; CI-DS128: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 2, addrspace 3) - ; CI-DS128: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; CI-DS128: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; CI-DS128: [[GEP:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C]](s32) ; CI-DS128: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p3) :: (load 2, addrspace 3) - ; CI-DS128: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; CI-DS128: [[MV:%[0-9]+]]:_(p3) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; CI-DS128: $vgpr0 = COPY [[MV]](p3) + ; CI-DS128: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; CI-DS128: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; CI-DS128: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; CI-DS128: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; CI-DS128: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; CI-DS128: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-DS128: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; CI-DS128: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; CI-DS128: [[INTTOPTR:%[0-9]+]]:_(p3) = G_INTTOPTR [[OR]](s32) + ; CI-DS128: $vgpr0 = COPY [[INTTOPTR]](p3) ; VI-LABEL: name: test_load_local_p3_align2 ; VI: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 ; 
VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 2, addrspace 3) - ; VI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; VI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; VI: [[GEP:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C]](s32) ; VI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p3) :: (load 2, addrspace 3) - ; VI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; VI: [[MV:%[0-9]+]]:_(p3) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; VI: $vgpr0 = COPY [[MV]](p3) + ; VI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; VI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; VI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; VI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; VI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; VI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; VI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; VI: [[INTTOPTR:%[0-9]+]]:_(p3) = G_INTTOPTR [[OR]](s32) + ; VI: $vgpr0 = COPY [[INTTOPTR]](p3) ; GFX9-LABEL: name: test_load_local_p3_align2 ; GFX9: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 ; GFX9: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 2, addrspace 3) - ; GFX9: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; GFX9: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; GFX9: [[GEP:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C]](s32) ; GFX9: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p3) :: (load 2, addrspace 3) - ; GFX9: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; GFX9: [[MV:%[0-9]+]]:_(p3) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; GFX9: $vgpr0 = COPY [[MV]](p3) + ; GFX9: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; GFX9: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; GFX9: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; GFX9: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; GFX9: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; GFX9: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL:%[0-9]+]]:_(s32) = G_SHL 
[[AND1]], [[C2]](s32) + ; GFX9: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; GFX9: [[INTTOPTR:%[0-9]+]]:_(p3) = G_INTTOPTR [[OR]](s32) + ; GFX9: $vgpr0 = COPY [[INTTOPTR]](p3) %0:_(p3) = COPY $vgpr0 %1:_(p3) = G_LOAD %0 :: (load 4, align 2, addrspace 3) $vgpr0 = COPY %1 @@ -4155,8 +4868,13 @@ body: | ; SI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) ; SI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[SHL1]](s32) ; SI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[TRUNC3]] - ; SI: [[MV:%[0-9]+]]:_(p3) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; SI: $vgpr0 = COPY [[MV]](p3) + ; SI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; SI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; SI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; SI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C6]](s32) + ; SI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; SI: [[INTTOPTR:%[0-9]+]]:_(p3) = G_INTTOPTR [[OR2]](s32) + ; SI: $vgpr0 = COPY [[INTTOPTR]](p3) ; CI-LABEL: name: test_load_local_p3_align1 ; CI: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 ; CI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 1, addrspace 3) @@ -4187,8 +4905,13 @@ body: | ; CI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) ; CI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[SHL1]](s32) ; CI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[TRUNC3]] - ; CI: [[MV:%[0-9]+]]:_(p3) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; CI: $vgpr0 = COPY [[MV]](p3) + ; CI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C6]](s32) + ; CI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; CI: [[INTTOPTR:%[0-9]+]]:_(p3) = G_INTTOPTR [[OR2]](s32) + ; CI: $vgpr0 = COPY [[INTTOPTR]](p3) ; CI-DS128-LABEL: name: test_load_local_p3_align1 ; CI-DS128: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 ; CI-DS128: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 1, addrspace 3) @@ 
-4219,8 +4942,13 @@ body: | ; CI-DS128: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) ; CI-DS128: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[SHL1]](s32) ; CI-DS128: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[TRUNC3]] - ; CI-DS128: [[MV:%[0-9]+]]:_(p3) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; CI-DS128: $vgpr0 = COPY [[MV]](p3) + ; CI-DS128: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI-DS128: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI-DS128: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-DS128: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C6]](s32) + ; CI-DS128: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; CI-DS128: [[INTTOPTR:%[0-9]+]]:_(p3) = G_INTTOPTR [[OR2]](s32) + ; CI-DS128: $vgpr0 = COPY [[INTTOPTR]](p3) ; VI-LABEL: name: test_load_local_p3_align1 ; VI: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 1, addrspace 3) @@ -4247,8 +4975,13 @@ body: | ; VI: [[AND3:%[0-9]+]]:_(s16) = G_AND [[TRUNC3]], [[C3]] ; VI: [[SHL1:%[0-9]+]]:_(s16) = G_SHL [[AND3]], [[C4]](s16) ; VI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[SHL1]] - ; VI: [[MV:%[0-9]+]]:_(p3) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; VI: $vgpr0 = COPY [[MV]](p3) + ; VI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; VI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; VI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C5]](s32) + ; VI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; VI: [[INTTOPTR:%[0-9]+]]:_(p3) = G_INTTOPTR [[OR2]](s32) + ; VI: $vgpr0 = COPY [[INTTOPTR]](p3) ; GFX9-LABEL: name: test_load_local_p3_align1 ; GFX9: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 ; GFX9: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 1, addrspace 3) @@ -4275,8 +5008,13 @@ body: | ; GFX9: [[AND3:%[0-9]+]]:_(s16) = G_AND [[TRUNC3]], [[C3]] ; GFX9: [[SHL1:%[0-9]+]]:_(s16) = G_SHL [[AND3]], [[C4]](s16) ; GFX9: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[SHL1]] - ; GFX9: 
[[MV:%[0-9]+]]:_(p3) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; GFX9: $vgpr0 = COPY [[MV]](p3) + ; GFX9: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C5]](s32) + ; GFX9: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; GFX9: [[INTTOPTR:%[0-9]+]]:_(p3) = G_INTTOPTR [[OR2]](s32) + ; GFX9: $vgpr0 = COPY [[INTTOPTR]](p3) %0:_(p3) = COPY $vgpr0 %1:_(p3) = G_LOAD %0 :: (load 4, align 1, addrspace 3) $vgpr0 = COPY %1 @@ -4322,53 +5060,83 @@ body: | ; SI-LABEL: name: test_load_local_p5_align2 ; SI: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 ; SI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 2, addrspace 3) - ; SI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; SI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; SI: [[GEP:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C]](s32) ; SI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p3) :: (load 2, addrspace 3) - ; SI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; SI: [[MV:%[0-9]+]]:_(p5) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; SI: $vgpr0 = COPY [[MV]](p5) + ; SI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; SI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; SI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; SI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; SI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; SI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; SI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; SI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; SI: [[INTTOPTR:%[0-9]+]]:_(p5) = G_INTTOPTR [[OR]](s32) + ; SI: $vgpr0 = COPY [[INTTOPTR]](p5) ; CI-LABEL: name: test_load_local_p5_align2 ; CI: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 ; CI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 2, addrspace 3) - ; CI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; CI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; 
CI: [[GEP:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C]](s32) ; CI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p3) :: (load 2, addrspace 3) - ; CI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; CI: [[MV:%[0-9]+]]:_(p5) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; CI: $vgpr0 = COPY [[MV]](p5) + ; CI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; CI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; CI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; CI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; CI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; CI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; CI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; CI: [[INTTOPTR:%[0-9]+]]:_(p5) = G_INTTOPTR [[OR]](s32) + ; CI: $vgpr0 = COPY [[INTTOPTR]](p5) ; CI-DS128-LABEL: name: test_load_local_p5_align2 ; CI-DS128: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 ; CI-DS128: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 2, addrspace 3) - ; CI-DS128: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; CI-DS128: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; CI-DS128: [[GEP:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C]](s32) ; CI-DS128: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p3) :: (load 2, addrspace 3) - ; CI-DS128: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; CI-DS128: [[MV:%[0-9]+]]:_(p5) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; CI-DS128: $vgpr0 = COPY [[MV]](p5) + ; CI-DS128: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; CI-DS128: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; CI-DS128: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; CI-DS128: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; CI-DS128: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; CI-DS128: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-DS128: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; CI-DS128: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; CI-DS128: 
[[INTTOPTR:%[0-9]+]]:_(p5) = G_INTTOPTR [[OR]](s32) + ; CI-DS128: $vgpr0 = COPY [[INTTOPTR]](p5) ; VI-LABEL: name: test_load_local_p5_align2 ; VI: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 2, addrspace 3) - ; VI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; VI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; VI: [[GEP:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C]](s32) ; VI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p3) :: (load 2, addrspace 3) - ; VI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; VI: [[MV:%[0-9]+]]:_(p5) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; VI: $vgpr0 = COPY [[MV]](p5) + ; VI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; VI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; VI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; VI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; VI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; VI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; VI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; VI: [[INTTOPTR:%[0-9]+]]:_(p5) = G_INTTOPTR [[OR]](s32) + ; VI: $vgpr0 = COPY [[INTTOPTR]](p5) ; GFX9-LABEL: name: test_load_local_p5_align2 ; GFX9: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 ; GFX9: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 2, addrspace 3) - ; GFX9: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; GFX9: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; GFX9: [[GEP:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C]](s32) ; GFX9: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p3) :: (load 2, addrspace 3) - ; GFX9: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; GFX9: [[MV:%[0-9]+]]:_(p5) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; GFX9: $vgpr0 = COPY [[MV]](p5) + ; GFX9: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; GFX9: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; GFX9: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; GFX9: 
[[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; GFX9: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; GFX9: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; GFX9: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; GFX9: [[INTTOPTR:%[0-9]+]]:_(p5) = G_INTTOPTR [[OR]](s32) + ; GFX9: $vgpr0 = COPY [[INTTOPTR]](p5) %0:_(p3) = COPY $vgpr0 %1:_(p5) = G_LOAD %0 :: (load 4, align 2, addrspace 3) $vgpr0 = COPY %1 @@ -4410,8 +5178,13 @@ body: | ; SI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) ; SI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[SHL1]](s32) ; SI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[TRUNC3]] - ; SI: [[MV:%[0-9]+]]:_(p5) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; SI: $vgpr0 = COPY [[MV]](p5) + ; SI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; SI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; SI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; SI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C6]](s32) + ; SI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; SI: [[INTTOPTR:%[0-9]+]]:_(p5) = G_INTTOPTR [[OR2]](s32) + ; SI: $vgpr0 = COPY [[INTTOPTR]](p5) ; CI-LABEL: name: test_load_local_p5_align1 ; CI: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 ; CI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 1, addrspace 3) @@ -4442,8 +5215,13 @@ body: | ; CI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) ; CI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[SHL1]](s32) ; CI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[TRUNC3]] - ; CI: [[MV:%[0-9]+]]:_(p5) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; CI: $vgpr0 = COPY [[MV]](p5) + ; CI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C6]](s32) + ; CI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; CI: [[INTTOPTR:%[0-9]+]]:_(p5) = G_INTTOPTR [[OR2]](s32) + ; CI: $vgpr0 = COPY 
[[INTTOPTR]](p5) ; CI-DS128-LABEL: name: test_load_local_p5_align1 ; CI-DS128: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 ; CI-DS128: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 1, addrspace 3) @@ -4474,8 +5252,13 @@ body: | ; CI-DS128: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) ; CI-DS128: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[SHL1]](s32) ; CI-DS128: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[TRUNC3]] - ; CI-DS128: [[MV:%[0-9]+]]:_(p5) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; CI-DS128: $vgpr0 = COPY [[MV]](p5) + ; CI-DS128: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI-DS128: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI-DS128: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-DS128: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C6]](s32) + ; CI-DS128: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; CI-DS128: [[INTTOPTR:%[0-9]+]]:_(p5) = G_INTTOPTR [[OR2]](s32) + ; CI-DS128: $vgpr0 = COPY [[INTTOPTR]](p5) ; VI-LABEL: name: test_load_local_p5_align1 ; VI: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 1, addrspace 3) @@ -4502,8 +5285,13 @@ body: | ; VI: [[AND3:%[0-9]+]]:_(s16) = G_AND [[TRUNC3]], [[C3]] ; VI: [[SHL1:%[0-9]+]]:_(s16) = G_SHL [[AND3]], [[C4]](s16) ; VI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[SHL1]] - ; VI: [[MV:%[0-9]+]]:_(p5) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; VI: $vgpr0 = COPY [[MV]](p5) + ; VI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; VI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; VI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C5]](s32) + ; VI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; VI: [[INTTOPTR:%[0-9]+]]:_(p5) = G_INTTOPTR [[OR2]](s32) + ; VI: $vgpr0 = COPY [[INTTOPTR]](p5) ; GFX9-LABEL: name: test_load_local_p5_align1 ; GFX9: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 ; GFX9: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 1, addrspace 3) @@ -4530,8 +5318,13 
@@ body: | ; GFX9: [[AND3:%[0-9]+]]:_(s16) = G_AND [[TRUNC3]], [[C3]] ; GFX9: [[SHL1:%[0-9]+]]:_(s16) = G_SHL [[AND3]], [[C4]](s16) ; GFX9: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[SHL1]] - ; GFX9: [[MV:%[0-9]+]]:_(p5) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; GFX9: $vgpr0 = COPY [[MV]](p5) + ; GFX9: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C5]](s32) + ; GFX9: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; GFX9: [[INTTOPTR:%[0-9]+]]:_(p5) = G_INTTOPTR [[OR2]](s32) + ; GFX9: $vgpr0 = COPY [[INTTOPTR]](p5) %0:_(p3) = COPY $vgpr0 %1:_(p5) = G_LOAD %0 :: (load 4, align 1, addrspace 3) $vgpr0 = COPY %1 @@ -6446,9 +7239,13 @@ body: | ; SI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) ; SI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[SHL1]](s32) ; SI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[TRUNC3]] - ; SI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; SI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; SI: [[GEP3:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C6]](s32) + ; SI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; SI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; SI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; SI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C6]](s32) + ; SI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; SI: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; SI: [[GEP3:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C7]](s32) ; SI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p3) :: (load 1, addrspace 3) ; SI: [[GEP4:%[0-9]+]]:_(p3) = G_GEP [[GEP3]], [[C]](s32) ; SI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p3) :: (load 1, addrspace 3) @@ -6461,19 +7258,22 @@ body: | ; SI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; SI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) ; SI: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C5]] - ; SI: [[SHL2:%[0-9]+]]:_(s32) = 
G_SHL [[AND5]], [[COPY4]](s32) - ; SI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL2]](s32) - ; SI: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] + ; SI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY4]](s32) + ; SI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) + ; SI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] ; SI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; SI: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; SI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; SI: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) ; SI: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY7]], [[C5]] - ; SI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY6]](s32) - ; SI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) - ; SI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; SI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) - ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[MV]](s32), [[MV1]](s32) + ; SI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY6]](s32) + ; SI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) + ; SI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] + ; SI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; SI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; SI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C6]](s32) + ; SI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[OR2]](s32), [[OR5]](s32) ; SI: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<2 x s32>) ; CI-LABEL: name: test_load_local_v2s32_align1 ; CI: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 @@ -6505,9 +7305,13 @@ body: | ; CI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) ; CI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[SHL1]](s32) ; CI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[TRUNC3]] - ; CI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; CI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; CI: [[GEP3:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C6]](s32) + ; CI: 
[[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C6]](s32) + ; CI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; CI: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; CI: [[GEP3:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C7]](s32) ; CI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p3) :: (load 1, addrspace 3) ; CI: [[GEP4:%[0-9]+]]:_(p3) = G_GEP [[GEP3]], [[C]](s32) ; CI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p3) :: (load 1, addrspace 3) @@ -6520,19 +7324,22 @@ body: | ; CI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) ; CI: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C5]] - ; CI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY4]](s32) - ; CI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL2]](s32) - ; CI: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] + ; CI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY4]](s32) + ; CI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) + ; CI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] ; CI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; CI: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; CI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) ; CI: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY7]], [[C5]] - ; CI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY6]](s32) - ; CI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) - ; CI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; CI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) - ; CI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[MV]](s32), [[MV1]](s32) + ; CI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY6]](s32) + ; CI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) + ; CI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] + ; CI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT 
[[OR3]](s16) + ; CI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; CI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C6]](s32) + ; CI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; CI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[OR2]](s32), [[OR5]](s32) ; CI: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<2 x s32>) ; CI-DS128-LABEL: name: test_load_local_v2s32_align1 ; CI-DS128: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 @@ -6564,9 +7371,13 @@ body: | ; CI-DS128: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) ; CI-DS128: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[SHL1]](s32) ; CI-DS128: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[TRUNC3]] - ; CI-DS128: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; CI-DS128: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; CI-DS128: [[GEP3:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C6]](s32) + ; CI-DS128: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI-DS128: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI-DS128: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-DS128: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C6]](s32) + ; CI-DS128: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; CI-DS128: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; CI-DS128: [[GEP3:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C7]](s32) ; CI-DS128: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p3) :: (load 1, addrspace 3) ; CI-DS128: [[GEP4:%[0-9]+]]:_(p3) = G_GEP [[GEP3]], [[C]](s32) ; CI-DS128: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p3) :: (load 1, addrspace 3) @@ -6579,19 +7390,22 @@ body: | ; CI-DS128: [[COPY4:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI-DS128: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) ; CI-DS128: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C5]] - ; CI-DS128: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY4]](s32) - ; CI-DS128: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL2]](s32) - ; CI-DS128: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] + ; CI-DS128: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY4]](s32) 
+ ; CI-DS128: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) + ; CI-DS128: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] ; CI-DS128: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; CI-DS128: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; CI-DS128: [[COPY6:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI-DS128: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) ; CI-DS128: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY7]], [[C5]] - ; CI-DS128: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY6]](s32) - ; CI-DS128: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) - ; CI-DS128: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; CI-DS128: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) - ; CI-DS128: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[MV]](s32), [[MV1]](s32) + ; CI-DS128: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY6]](s32) + ; CI-DS128: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) + ; CI-DS128: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] + ; CI-DS128: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI-DS128: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; CI-DS128: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C6]](s32) + ; CI-DS128: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; CI-DS128: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[OR2]](s32), [[OR5]](s32) ; CI-DS128: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<2 x s32>) ; VI-LABEL: name: test_load_local_v2s32_align1 ; VI: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 @@ -6619,9 +7433,13 @@ body: | ; VI: [[AND3:%[0-9]+]]:_(s16) = G_AND [[TRUNC3]], [[C3]] ; VI: [[SHL1:%[0-9]+]]:_(s16) = G_SHL [[AND3]], [[C4]](s16) ; VI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[SHL1]] - ; VI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; VI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; VI: [[GEP3:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C5]](s32) + ; VI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; VI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT 
[[OR1]](s16) + ; VI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C5]](s32) + ; VI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; VI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; VI: [[GEP3:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C6]](s32) ; VI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p3) :: (load 1, addrspace 3) ; VI: [[GEP4:%[0-9]+]]:_(p3) = G_GEP [[GEP3]], [[C]](s32) ; VI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p3) :: (load 1, addrspace 3) @@ -6633,16 +7451,19 @@ body: | ; VI: [[AND4:%[0-9]+]]:_(s16) = G_AND [[TRUNC4]], [[C3]] ; VI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) ; VI: [[AND5:%[0-9]+]]:_(s16) = G_AND [[TRUNC5]], [[C3]] - ; VI: [[SHL2:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) - ; VI: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL2]] + ; VI: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) + ; VI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL3]] ; VI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; VI: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; VI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) ; VI: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C3]] - ; VI: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) - ; VI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; VI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) - ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[MV]](s32), [[MV1]](s32) + ; VI: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) + ; VI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL4]] + ; VI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; VI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; VI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C5]](s32) + ; VI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[OR2]](s32), [[OR5]](s32) ; VI: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<2 x s32>) ; GFX9-LABEL: name: 
test_load_local_v2s32_align1 ; GFX9: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 @@ -6670,9 +7491,13 @@ body: | ; GFX9: [[AND3:%[0-9]+]]:_(s16) = G_AND [[TRUNC3]], [[C3]] ; GFX9: [[SHL1:%[0-9]+]]:_(s16) = G_SHL [[AND3]], [[C4]](s16) ; GFX9: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[SHL1]] - ; GFX9: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; GFX9: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; GFX9: [[GEP3:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C5]](s32) + ; GFX9: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C5]](s32) + ; GFX9: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; GFX9: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; GFX9: [[GEP3:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C6]](s32) ; GFX9: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p3) :: (load 1, addrspace 3) ; GFX9: [[GEP4:%[0-9]+]]:_(p3) = G_GEP [[GEP3]], [[C]](s32) ; GFX9: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p3) :: (load 1, addrspace 3) @@ -6684,16 +7509,19 @@ body: | ; GFX9: [[AND4:%[0-9]+]]:_(s16) = G_AND [[TRUNC4]], [[C3]] ; GFX9: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) ; GFX9: [[AND5:%[0-9]+]]:_(s16) = G_AND [[TRUNC5]], [[C3]] - ; GFX9: [[SHL2:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) - ; GFX9: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL2]] + ; GFX9: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) + ; GFX9: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL3]] ; GFX9: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; GFX9: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; GFX9: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) ; GFX9: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C3]] - ; GFX9: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) - ; GFX9: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; GFX9: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) - ; GFX9: 
[[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[MV]](s32), [[MV1]](s32) + ; GFX9: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) + ; GFX9: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL4]] + ; GFX9: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; GFX9: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; GFX9: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C5]](s32) + ; GFX9: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[OR2]](s32), [[OR5]](s32) ; GFX9: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<2 x s32>) %0:_(p3) = COPY $vgpr0 %1:_(<2 x s32>) = G_LOAD %0 :: (load 8, align 1, addrspace 3) @@ -6737,9 +7565,13 @@ body: | ; SI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[COPY3]](s32) ; SI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[SHL1]](s32) ; SI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[TRUNC3]] - ; SI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; SI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; SI: [[GEP3:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C6]](s32) + ; SI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; SI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; SI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; SI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C6]](s32) + ; SI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; SI: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; SI: [[GEP3:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C7]](s32) ; SI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p3) :: (load 1, addrspace 3) ; SI: [[GEP4:%[0-9]+]]:_(p3) = G_GEP [[GEP3]], [[C]](s32) ; SI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p3) :: (load 1, addrspace 3) @@ -6752,18 +7584,21 @@ body: | ; SI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; SI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) ; SI: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C5]] - ; SI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY5]](s32) - ; SI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL2]](s32) - ; SI: 
[[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] + ; SI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY5]](s32) + ; SI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) + ; SI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] ; SI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; SI: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; SI: [[COPY7:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; SI: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) ; SI: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY8]], [[C5]] - ; SI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY7]](s32) - ; SI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) - ; SI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; SI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) + ; SI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY7]](s32) + ; SI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) + ; SI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] + ; SI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; SI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; SI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C6]](s32) + ; SI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] ; SI: [[GEP7:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C4]](s32) ; SI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p3) :: (load 1, addrspace 3) ; SI: [[GEP8:%[0-9]+]]:_(p3) = G_GEP [[GEP7]], [[C]](s32) @@ -6777,19 +7612,22 @@ body: | ; SI: [[COPY9:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; SI: [[COPY10:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32) ; SI: [[AND9:%[0-9]+]]:_(s32) = G_AND [[COPY10]], [[C5]] - ; SI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY9]](s32) - ; SI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) - ; SI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] + ; SI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY9]](s32) + ; SI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) + ; SI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] ; SI: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; SI: 
[[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C3]] ; SI: [[COPY11:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; SI: [[COPY12:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32) ; SI: [[AND11:%[0-9]+]]:_(s32) = G_AND [[COPY12]], [[C5]] - ; SI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY11]](s32) - ; SI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL5]](s32) - ; SI: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] - ; SI: [[MV2:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16) - ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[MV]](s32), [[MV1]](s32), [[MV2]](s32) + ; SI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY11]](s32) + ; SI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) + ; SI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] + ; SI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; SI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; SI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C6]](s32) + ; SI: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL8]] + ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[OR2]](s32), [[OR5]](s32), [[OR8]](s32) ; SI: $vgpr0_vgpr1_vgpr2 = COPY [[BUILD_VECTOR]](<3 x s32>) ; CI-LABEL: name: test_load_local_v3s32_align16 ; CI: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 @@ -6822,9 +7660,13 @@ body: | ; CI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[COPY3]](s32) ; CI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[SHL1]](s32) ; CI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[TRUNC3]] - ; CI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; CI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; CI: [[GEP3:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C6]](s32) + ; CI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C6]](s32) + ; CI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; CI: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; CI: 
[[GEP3:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C7]](s32) ; CI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p3) :: (load 1, addrspace 3) ; CI: [[GEP4:%[0-9]+]]:_(p3) = G_GEP [[GEP3]], [[C]](s32) ; CI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p3) :: (load 1, addrspace 3) @@ -6837,18 +7679,21 @@ body: | ; CI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) ; CI: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C5]] - ; CI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY5]](s32) - ; CI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL2]](s32) - ; CI: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] + ; CI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY5]](s32) + ; CI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) + ; CI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] ; CI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; CI: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; CI: [[COPY7:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) ; CI: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY8]], [[C5]] - ; CI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY7]](s32) - ; CI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) - ; CI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; CI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) + ; CI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY7]](s32) + ; CI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) + ; CI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] + ; CI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; CI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C6]](s32) + ; CI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] ; CI: [[GEP7:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C4]](s32) ; CI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p3) :: (load 1, addrspace 3) ; CI: [[GEP8:%[0-9]+]]:_(p3) = G_GEP [[GEP7]], [[C]](s32) @@ -6862,19 +7707,22 @@ body: | ; CI: 
[[COPY9:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI: [[COPY10:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32) ; CI: [[AND9:%[0-9]+]]:_(s32) = G_AND [[COPY10]], [[C5]] - ; CI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY9]](s32) - ; CI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) - ; CI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] + ; CI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY9]](s32) + ; CI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) + ; CI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] ; CI: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; CI: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C3]] ; CI: [[COPY11:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI: [[COPY12:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32) ; CI: [[AND11:%[0-9]+]]:_(s32) = G_AND [[COPY12]], [[C5]] - ; CI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY11]](s32) - ; CI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL5]](s32) - ; CI: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] - ; CI: [[MV2:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16) - ; CI: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[MV]](s32), [[MV1]](s32), [[MV2]](s32) + ; CI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY11]](s32) + ; CI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) + ; CI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] + ; CI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; CI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; CI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C6]](s32) + ; CI: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL8]] + ; CI: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[OR2]](s32), [[OR5]](s32), [[OR8]](s32) ; CI: $vgpr0_vgpr1_vgpr2 = COPY [[BUILD_VECTOR]](<3 x s32>) ; CI-DS128-LABEL: name: test_load_local_v3s32_align16 ; CI-DS128: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 @@ -6907,9 +7755,13 @@ body: | ; CI-DS128: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[COPY3]](s32) ; CI-DS128: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC 
[[SHL1]](s32) ; CI-DS128: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[TRUNC3]] - ; CI-DS128: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; CI-DS128: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; CI-DS128: [[GEP3:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C6]](s32) + ; CI-DS128: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI-DS128: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI-DS128: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-DS128: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C6]](s32) + ; CI-DS128: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; CI-DS128: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; CI-DS128: [[GEP3:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C7]](s32) ; CI-DS128: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p3) :: (load 1, addrspace 3) ; CI-DS128: [[GEP4:%[0-9]+]]:_(p3) = G_GEP [[GEP3]], [[C]](s32) ; CI-DS128: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p3) :: (load 1, addrspace 3) @@ -6922,18 +7774,21 @@ body: | ; CI-DS128: [[COPY5:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI-DS128: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) ; CI-DS128: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C5]] - ; CI-DS128: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY5]](s32) - ; CI-DS128: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL2]](s32) - ; CI-DS128: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] + ; CI-DS128: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY5]](s32) + ; CI-DS128: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) + ; CI-DS128: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] ; CI-DS128: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; CI-DS128: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; CI-DS128: [[COPY7:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI-DS128: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) ; CI-DS128: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY8]], [[C5]] - ; CI-DS128: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY7]](s32) - ; CI-DS128: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) - ; 
CI-DS128: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; CI-DS128: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) + ; CI-DS128: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY7]](s32) + ; CI-DS128: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) + ; CI-DS128: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] + ; CI-DS128: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI-DS128: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; CI-DS128: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C6]](s32) + ; CI-DS128: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] ; CI-DS128: [[GEP7:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C4]](s32) ; CI-DS128: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p3) :: (load 1, addrspace 3) ; CI-DS128: [[GEP8:%[0-9]+]]:_(p3) = G_GEP [[GEP7]], [[C]](s32) @@ -6947,19 +7802,22 @@ body: | ; CI-DS128: [[COPY9:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI-DS128: [[COPY10:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32) ; CI-DS128: [[AND9:%[0-9]+]]:_(s32) = G_AND [[COPY10]], [[C5]] - ; CI-DS128: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY9]](s32) - ; CI-DS128: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) - ; CI-DS128: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] + ; CI-DS128: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY9]](s32) + ; CI-DS128: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) + ; CI-DS128: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] ; CI-DS128: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; CI-DS128: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C3]] ; CI-DS128: [[COPY11:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI-DS128: [[COPY12:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32) ; CI-DS128: [[AND11:%[0-9]+]]:_(s32) = G_AND [[COPY12]], [[C5]] - ; CI-DS128: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY11]](s32) - ; CI-DS128: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL5]](s32) - ; CI-DS128: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] - ; CI-DS128: [[MV2:%[0-9]+]]:_(s32) = G_MERGE_VALUES 
[[OR4]](s16), [[OR5]](s16) - ; CI-DS128: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[MV]](s32), [[MV1]](s32), [[MV2]](s32) + ; CI-DS128: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY11]](s32) + ; CI-DS128: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) + ; CI-DS128: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] + ; CI-DS128: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; CI-DS128: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; CI-DS128: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C6]](s32) + ; CI-DS128: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL8]] + ; CI-DS128: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[OR2]](s32), [[OR5]](s32), [[OR8]](s32) ; CI-DS128: $vgpr0_vgpr1_vgpr2 = COPY [[BUILD_VECTOR]](<3 x s32>) ; VI-LABEL: name: test_load_local_v3s32_align16 ; VI: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 @@ -6987,9 +7845,13 @@ body: | ; VI: [[AND3:%[0-9]+]]:_(s16) = G_AND [[TRUNC3]], [[C3]] ; VI: [[SHL1:%[0-9]+]]:_(s16) = G_SHL [[AND3]], [[C4]](s16) ; VI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[SHL1]] - ; VI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; VI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; VI: [[GEP3:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C5]](s32) + ; VI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; VI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; VI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C5]](s32) + ; VI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; VI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; VI: [[GEP3:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C6]](s32) ; VI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p3) :: (load 1, addrspace 3) ; VI: [[GEP4:%[0-9]+]]:_(p3) = G_GEP [[GEP3]], [[C]](s32) ; VI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p3) :: (load 1, addrspace 3) @@ -7001,17 +7863,20 @@ body: | ; VI: [[AND4:%[0-9]+]]:_(s16) = G_AND [[TRUNC4]], [[C3]] ; VI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) ; 
VI: [[AND5:%[0-9]+]]:_(s16) = G_AND [[TRUNC5]], [[C3]] - ; VI: [[SHL2:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) - ; VI: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL2]] + ; VI: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) + ; VI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL3]] ; VI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; VI: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; VI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) ; VI: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C3]] - ; VI: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) - ; VI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; VI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) - ; VI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 - ; VI: [[GEP7:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C6]](s32) + ; VI: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) + ; VI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL4]] + ; VI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; VI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; VI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C5]](s32) + ; VI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; VI: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 + ; VI: [[GEP7:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C7]](s32) ; VI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p3) :: (load 1, addrspace 3) ; VI: [[GEP8:%[0-9]+]]:_(p3) = G_GEP [[GEP7]], [[C]](s32) ; VI: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p3) :: (load 1, addrspace 3) @@ -7023,16 +7888,19 @@ body: | ; VI: [[AND8:%[0-9]+]]:_(s16) = G_AND [[TRUNC8]], [[C3]] ; VI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD9]](s32) ; VI: [[AND9:%[0-9]+]]:_(s16) = G_AND [[TRUNC9]], [[C3]] - ; VI: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C4]](s16) - ; VI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL4]] + ; VI: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C4]](s16) + ; VI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL6]] ; VI: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) 
; VI: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C3]] ; VI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD11]](s32) ; VI: [[AND11:%[0-9]+]]:_(s16) = G_AND [[TRUNC11]], [[C3]] - ; VI: [[SHL5:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C4]](s16) - ; VI: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL5]] - ; VI: [[MV2:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16) - ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[MV]](s32), [[MV1]](s32), [[MV2]](s32) + ; VI: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C4]](s16) + ; VI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL7]] + ; VI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; VI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; VI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C5]](s32) + ; VI: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL8]] + ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[OR2]](s32), [[OR5]](s32), [[OR8]](s32) ; VI: $vgpr0_vgpr1_vgpr2 = COPY [[BUILD_VECTOR]](<3 x s32>) ; GFX9-LABEL: name: test_load_local_v3s32_align16 ; GFX9: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 @@ -7060,9 +7928,13 @@ body: | ; GFX9: [[AND3:%[0-9]+]]:_(s16) = G_AND [[TRUNC3]], [[C3]] ; GFX9: [[SHL1:%[0-9]+]]:_(s16) = G_SHL [[AND3]], [[C4]](s16) ; GFX9: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[SHL1]] - ; GFX9: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; GFX9: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; GFX9: [[GEP3:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C5]](s32) + ; GFX9: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C5]](s32) + ; GFX9: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; GFX9: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; GFX9: [[GEP3:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C6]](s32) ; GFX9: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p3) :: (load 1, addrspace 3) ; GFX9: [[GEP4:%[0-9]+]]:_(p3) = 
G_GEP [[GEP3]], [[C]](s32) ; GFX9: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p3) :: (load 1, addrspace 3) @@ -7074,17 +7946,20 @@ body: | ; GFX9: [[AND4:%[0-9]+]]:_(s16) = G_AND [[TRUNC4]], [[C3]] ; GFX9: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) ; GFX9: [[AND5:%[0-9]+]]:_(s16) = G_AND [[TRUNC5]], [[C3]] - ; GFX9: [[SHL2:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) - ; GFX9: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL2]] + ; GFX9: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) + ; GFX9: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL3]] ; GFX9: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; GFX9: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; GFX9: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) ; GFX9: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C3]] - ; GFX9: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) - ; GFX9: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; GFX9: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) - ; GFX9: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 - ; GFX9: [[GEP7:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C6]](s32) + ; GFX9: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) + ; GFX9: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL4]] + ; GFX9: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; GFX9: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; GFX9: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C5]](s32) + ; GFX9: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; GFX9: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 + ; GFX9: [[GEP7:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C7]](s32) ; GFX9: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p3) :: (load 1, addrspace 3) ; GFX9: [[GEP8:%[0-9]+]]:_(p3) = G_GEP [[GEP7]], [[C]](s32) ; GFX9: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p3) :: (load 1, addrspace 3) @@ -7096,16 +7971,19 @@ body: | ; GFX9: [[AND8:%[0-9]+]]:_(s16) = G_AND [[TRUNC8]], [[C3]] ; GFX9: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD9]](s32) ; GFX9: [[AND9:%[0-9]+]]:_(s16) = G_AND 
[[TRUNC9]], [[C3]] - ; GFX9: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C4]](s16) - ; GFX9: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL4]] + ; GFX9: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C4]](s16) + ; GFX9: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL6]] ; GFX9: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; GFX9: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C3]] ; GFX9: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD11]](s32) ; GFX9: [[AND11:%[0-9]+]]:_(s16) = G_AND [[TRUNC11]], [[C3]] - ; GFX9: [[SHL5:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C4]](s16) - ; GFX9: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL5]] - ; GFX9: [[MV2:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16) - ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[MV]](s32), [[MV1]](s32), [[MV2]](s32) + ; GFX9: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C4]](s16) + ; GFX9: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL7]] + ; GFX9: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; GFX9: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; GFX9: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C5]](s32) + ; GFX9: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL8]] + ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[OR2]](s32), [[OR5]](s32), [[OR8]](s32) ; GFX9: $vgpr0_vgpr1_vgpr2 = COPY [[BUILD_VECTOR]](<3 x s32>) %0:_(p3) = COPY $vgpr0 %1:_(<3 x s32>) = G_LOAD %0 :: (load 12, align 1, addrspace 3) @@ -7321,181 +8199,251 @@ body: | ; SI-LABEL: name: test_load_local_v4s32_align2 ; SI: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 ; SI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 2, addrspace 3) - ; SI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; SI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; SI: [[GEP:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C]](s32) ; SI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p3) :: (load 2, addrspace 3) - ; SI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; SI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC]](s16), 
[[TRUNC1]](s16) - ; SI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; SI: [[GEP1:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C1]](s32) + ; SI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; SI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; SI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; SI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; SI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; SI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; SI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; SI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; SI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; SI: [[GEP1:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C3]](s32) ; SI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p3) :: (load 2, addrspace 3) - ; SI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; SI: [[GEP2:%[0-9]+]]:_(p3) = G_GEP [[GEP1]], [[C]](s32) ; SI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p3) :: (load 2, addrspace 3) - ; SI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; SI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC2]](s16), [[TRUNC3]](s16) - ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[MV]](s32), [[MV1]](s32) - ; SI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 - ; SI: [[GEP3:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C2]](s32) + ; SI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; SI: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C1]] + ; SI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; SI: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C1]] + ; SI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C2]](s32) + ; SI: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[OR]](s32), [[OR1]](s32) + ; SI: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 + ; SI: [[GEP3:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C4]](s32) ; SI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p3) :: (load 2, addrspace 3) - ; SI: [[TRUNC4:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD4]](s32) ; SI: [[GEP4:%[0-9]+]]:_(p3) 
= G_GEP [[GEP3]], [[C]](s32) ; SI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p3) :: (load 2, addrspace 3) - ; SI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) - ; SI: [[MV2:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC4]](s16), [[TRUNC5]](s16) - ; SI: [[GEP5:%[0-9]+]]:_(p3) = G_GEP [[GEP3]], [[C1]](s32) + ; SI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD4]](s32) + ; SI: [[AND4:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C1]] + ; SI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) + ; SI: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C1]] + ; SI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[C2]](s32) + ; SI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[AND4]], [[SHL2]] + ; SI: [[GEP5:%[0-9]+]]:_(p3) = G_GEP [[GEP3]], [[C3]](s32) ; SI: [[LOAD6:%[0-9]+]]:_(s32) = G_LOAD [[GEP5]](p3) :: (load 2, addrspace 3) - ; SI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; SI: [[GEP6:%[0-9]+]]:_(p3) = G_GEP [[GEP5]], [[C]](s32) ; SI: [[LOAD7:%[0-9]+]]:_(s32) = G_LOAD [[GEP6]](p3) :: (load 2, addrspace 3) - ; SI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) - ; SI: [[MV3:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC6]](s16), [[TRUNC7]](s16) - ; SI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[MV2]](s32), [[MV3]](s32) + ; SI: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LOAD6]](s32) + ; SI: [[AND6:%[0-9]+]]:_(s32) = G_AND [[COPY7]], [[C1]] + ; SI: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) + ; SI: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY8]], [[C1]] + ; SI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C2]](s32) + ; SI: [[OR3:%[0-9]+]]:_(s32) = G_OR [[AND6]], [[SHL3]] + ; SI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[OR2]](s32), [[OR3]](s32) ; SI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s32>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s32>), [[BUILD_VECTOR1]](<2 x s32>) ; SI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[CONCAT_VECTORS]](<4 x s32>) ; CI-LABEL: name: test_load_local_v4s32_align2 ; CI: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 ; CI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 2, 
addrspace 3) - ; CI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; CI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; CI: [[GEP:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C]](s32) ; CI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p3) :: (load 2, addrspace 3) - ; CI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; CI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; CI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; CI: [[GEP1:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C1]](s32) + ; CI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; CI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; CI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; CI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; CI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; CI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; CI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; CI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; CI: [[GEP1:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C3]](s32) ; CI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p3) :: (load 2, addrspace 3) - ; CI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; CI: [[GEP2:%[0-9]+]]:_(p3) = G_GEP [[GEP1]], [[C]](s32) ; CI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p3) :: (load 2, addrspace 3) - ; CI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; CI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC2]](s16), [[TRUNC3]](s16) - ; CI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[MV]](s32), [[MV1]](s32) - ; CI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 - ; CI: [[GEP3:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C2]](s32) + ; CI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; CI: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C1]] + ; CI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; CI: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C1]] + ; CI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C2]](s32) + ; CI: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], 
[[SHL1]] + ; CI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[OR]](s32), [[OR1]](s32) + ; CI: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 + ; CI: [[GEP3:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C4]](s32) ; CI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p3) :: (load 2, addrspace 3) - ; CI: [[TRUNC4:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD4]](s32) ; CI: [[GEP4:%[0-9]+]]:_(p3) = G_GEP [[GEP3]], [[C]](s32) ; CI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p3) :: (load 2, addrspace 3) - ; CI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) - ; CI: [[MV2:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC4]](s16), [[TRUNC5]](s16) - ; CI: [[GEP5:%[0-9]+]]:_(p3) = G_GEP [[GEP3]], [[C1]](s32) + ; CI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD4]](s32) + ; CI: [[AND4:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C1]] + ; CI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) + ; CI: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C1]] + ; CI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[C2]](s32) + ; CI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[AND4]], [[SHL2]] + ; CI: [[GEP5:%[0-9]+]]:_(p3) = G_GEP [[GEP3]], [[C3]](s32) ; CI: [[LOAD6:%[0-9]+]]:_(s32) = G_LOAD [[GEP5]](p3) :: (load 2, addrspace 3) - ; CI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; CI: [[GEP6:%[0-9]+]]:_(p3) = G_GEP [[GEP5]], [[C]](s32) ; CI: [[LOAD7:%[0-9]+]]:_(s32) = G_LOAD [[GEP6]](p3) :: (load 2, addrspace 3) - ; CI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) - ; CI: [[MV3:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC6]](s16), [[TRUNC7]](s16) - ; CI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[MV2]](s32), [[MV3]](s32) + ; CI: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LOAD6]](s32) + ; CI: [[AND6:%[0-9]+]]:_(s32) = G_AND [[COPY7]], [[C1]] + ; CI: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) + ; CI: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY8]], [[C1]] + ; CI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C2]](s32) + ; CI: [[OR3:%[0-9]+]]:_(s32) = G_OR [[AND6]], [[SHL3]] + ; CI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s32>) = 
G_BUILD_VECTOR [[OR2]](s32), [[OR3]](s32) ; CI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s32>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s32>), [[BUILD_VECTOR1]](<2 x s32>) ; CI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[CONCAT_VECTORS]](<4 x s32>) ; CI-DS128-LABEL: name: test_load_local_v4s32_align2 ; CI-DS128: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 ; CI-DS128: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 2, addrspace 3) - ; CI-DS128: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; CI-DS128: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; CI-DS128: [[GEP:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C]](s32) ; CI-DS128: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p3) :: (load 2, addrspace 3) - ; CI-DS128: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; CI-DS128: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; CI-DS128: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; CI-DS128: [[GEP1:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C1]](s32) + ; CI-DS128: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; CI-DS128: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; CI-DS128: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; CI-DS128: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; CI-DS128: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; CI-DS128: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-DS128: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; CI-DS128: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; CI-DS128: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; CI-DS128: [[GEP1:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C3]](s32) ; CI-DS128: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p3) :: (load 2, addrspace 3) - ; CI-DS128: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; CI-DS128: [[GEP2:%[0-9]+]]:_(p3) = G_GEP [[GEP1]], [[C]](s32) ; CI-DS128: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p3) :: (load 2, addrspace 3) - ; CI-DS128: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; CI-DS128: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC2]](s16), 
[[TRUNC3]](s16) - ; CI-DS128: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[MV]](s32), [[MV1]](s32) - ; CI-DS128: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 - ; CI-DS128: [[GEP3:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C2]](s32) + ; CI-DS128: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; CI-DS128: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C1]] + ; CI-DS128: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; CI-DS128: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C1]] + ; CI-DS128: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C2]](s32) + ; CI-DS128: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; CI-DS128: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[OR]](s32), [[OR1]](s32) + ; CI-DS128: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 + ; CI-DS128: [[GEP3:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C4]](s32) ; CI-DS128: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p3) :: (load 2, addrspace 3) - ; CI-DS128: [[TRUNC4:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD4]](s32) ; CI-DS128: [[GEP4:%[0-9]+]]:_(p3) = G_GEP [[GEP3]], [[C]](s32) ; CI-DS128: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p3) :: (load 2, addrspace 3) - ; CI-DS128: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) - ; CI-DS128: [[MV2:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC4]](s16), [[TRUNC5]](s16) - ; CI-DS128: [[GEP5:%[0-9]+]]:_(p3) = G_GEP [[GEP3]], [[C1]](s32) + ; CI-DS128: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD4]](s32) + ; CI-DS128: [[AND4:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C1]] + ; CI-DS128: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) + ; CI-DS128: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C1]] + ; CI-DS128: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[C2]](s32) + ; CI-DS128: [[OR2:%[0-9]+]]:_(s32) = G_OR [[AND4]], [[SHL2]] + ; CI-DS128: [[GEP5:%[0-9]+]]:_(p3) = G_GEP [[GEP3]], [[C3]](s32) ; CI-DS128: [[LOAD6:%[0-9]+]]:_(s32) = G_LOAD [[GEP5]](p3) :: (load 2, addrspace 3) - ; CI-DS128: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; CI-DS128: [[GEP6:%[0-9]+]]:_(p3) = G_GEP [[GEP5]], 
[[C]](s32) ; CI-DS128: [[LOAD7:%[0-9]+]]:_(s32) = G_LOAD [[GEP6]](p3) :: (load 2, addrspace 3) - ; CI-DS128: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) - ; CI-DS128: [[MV3:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC6]](s16), [[TRUNC7]](s16) - ; CI-DS128: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[MV2]](s32), [[MV3]](s32) + ; CI-DS128: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LOAD6]](s32) + ; CI-DS128: [[AND6:%[0-9]+]]:_(s32) = G_AND [[COPY7]], [[C1]] + ; CI-DS128: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) + ; CI-DS128: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY8]], [[C1]] + ; CI-DS128: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C2]](s32) + ; CI-DS128: [[OR3:%[0-9]+]]:_(s32) = G_OR [[AND6]], [[SHL3]] + ; CI-DS128: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[OR2]](s32), [[OR3]](s32) ; CI-DS128: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s32>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s32>), [[BUILD_VECTOR1]](<2 x s32>) ; CI-DS128: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[CONCAT_VECTORS]](<4 x s32>) ; VI-LABEL: name: test_load_local_v4s32_align2 ; VI: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 2, addrspace 3) - ; VI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; VI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; VI: [[GEP:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C]](s32) ; VI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p3) :: (load 2, addrspace 3) - ; VI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; VI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; VI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; VI: [[GEP1:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C1]](s32) + ; VI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; VI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; VI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; VI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; VI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; VI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT 
i32 16 + ; VI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; VI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; VI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; VI: [[GEP1:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C3]](s32) ; VI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p3) :: (load 2, addrspace 3) - ; VI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; VI: [[GEP2:%[0-9]+]]:_(p3) = G_GEP [[GEP1]], [[C]](s32) ; VI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p3) :: (load 2, addrspace 3) - ; VI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; VI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC2]](s16), [[TRUNC3]](s16) - ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[MV]](s32), [[MV1]](s32) - ; VI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 - ; VI: [[GEP3:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C2]](s32) + ; VI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; VI: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C1]] + ; VI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; VI: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C1]] + ; VI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C2]](s32) + ; VI: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[OR]](s32), [[OR1]](s32) + ; VI: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 + ; VI: [[GEP3:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C4]](s32) ; VI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p3) :: (load 2, addrspace 3) - ; VI: [[TRUNC4:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD4]](s32) ; VI: [[GEP4:%[0-9]+]]:_(p3) = G_GEP [[GEP3]], [[C]](s32) ; VI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p3) :: (load 2, addrspace 3) - ; VI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) - ; VI: [[MV2:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC4]](s16), [[TRUNC5]](s16) - ; VI: [[GEP5:%[0-9]+]]:_(p3) = G_GEP [[GEP3]], [[C1]](s32) + ; VI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD4]](s32) + ; VI: [[AND4:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C1]] + ; VI: 
[[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) + ; VI: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C1]] + ; VI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[C2]](s32) + ; VI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[AND4]], [[SHL2]] + ; VI: [[GEP5:%[0-9]+]]:_(p3) = G_GEP [[GEP3]], [[C3]](s32) ; VI: [[LOAD6:%[0-9]+]]:_(s32) = G_LOAD [[GEP5]](p3) :: (load 2, addrspace 3) - ; VI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; VI: [[GEP6:%[0-9]+]]:_(p3) = G_GEP [[GEP5]], [[C]](s32) ; VI: [[LOAD7:%[0-9]+]]:_(s32) = G_LOAD [[GEP6]](p3) :: (load 2, addrspace 3) - ; VI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) - ; VI: [[MV3:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC6]](s16), [[TRUNC7]](s16) - ; VI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[MV2]](s32), [[MV3]](s32) + ; VI: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LOAD6]](s32) + ; VI: [[AND6:%[0-9]+]]:_(s32) = G_AND [[COPY7]], [[C1]] + ; VI: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) + ; VI: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY8]], [[C1]] + ; VI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C2]](s32) + ; VI: [[OR3:%[0-9]+]]:_(s32) = G_OR [[AND6]], [[SHL3]] + ; VI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[OR2]](s32), [[OR3]](s32) ; VI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s32>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s32>), [[BUILD_VECTOR1]](<2 x s32>) ; VI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[CONCAT_VECTORS]](<4 x s32>) ; GFX9-LABEL: name: test_load_local_v4s32_align2 ; GFX9: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 ; GFX9: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 2, addrspace 3) - ; GFX9: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; GFX9: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; GFX9: [[GEP:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C]](s32) ; GFX9: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p3) :: (load 2, addrspace 3) - ; GFX9: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; GFX9: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; GFX9: 
[[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; GFX9: [[GEP1:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C1]](s32) + ; GFX9: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; GFX9: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; GFX9: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; GFX9: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; GFX9: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; GFX9: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; GFX9: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; GFX9: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; GFX9: [[GEP1:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C3]](s32) ; GFX9: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p3) :: (load 2, addrspace 3) - ; GFX9: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; GFX9: [[GEP2:%[0-9]+]]:_(p3) = G_GEP [[GEP1]], [[C]](s32) ; GFX9: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p3) :: (load 2, addrspace 3) - ; GFX9: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; GFX9: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC2]](s16), [[TRUNC3]](s16) - ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[MV]](s32), [[MV1]](s32) - ; GFX9: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 - ; GFX9: [[GEP3:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C2]](s32) + ; GFX9: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; GFX9: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C1]] + ; GFX9: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; GFX9: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C1]] + ; GFX9: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C2]](s32) + ; GFX9: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[OR]](s32), [[OR1]](s32) + ; GFX9: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 + ; GFX9: [[GEP3:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C4]](s32) ; GFX9: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p3) :: (load 2, addrspace 3) - ; GFX9: [[TRUNC4:%[0-9]+]]:_(s16) = G_TRUNC 
[[LOAD4]](s32) ; GFX9: [[GEP4:%[0-9]+]]:_(p3) = G_GEP [[GEP3]], [[C]](s32) ; GFX9: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p3) :: (load 2, addrspace 3) - ; GFX9: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) - ; GFX9: [[MV2:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC4]](s16), [[TRUNC5]](s16) - ; GFX9: [[GEP5:%[0-9]+]]:_(p3) = G_GEP [[GEP3]], [[C1]](s32) + ; GFX9: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD4]](s32) + ; GFX9: [[AND4:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C1]] + ; GFX9: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) + ; GFX9: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C1]] + ; GFX9: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[C2]](s32) + ; GFX9: [[OR2:%[0-9]+]]:_(s32) = G_OR [[AND4]], [[SHL2]] + ; GFX9: [[GEP5:%[0-9]+]]:_(p3) = G_GEP [[GEP3]], [[C3]](s32) ; GFX9: [[LOAD6:%[0-9]+]]:_(s32) = G_LOAD [[GEP5]](p3) :: (load 2, addrspace 3) - ; GFX9: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; GFX9: [[GEP6:%[0-9]+]]:_(p3) = G_GEP [[GEP5]], [[C]](s32) ; GFX9: [[LOAD7:%[0-9]+]]:_(s32) = G_LOAD [[GEP6]](p3) :: (load 2, addrspace 3) - ; GFX9: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) - ; GFX9: [[MV3:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC6]](s16), [[TRUNC7]](s16) - ; GFX9: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[MV2]](s32), [[MV3]](s32) + ; GFX9: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LOAD6]](s32) + ; GFX9: [[AND6:%[0-9]+]]:_(s32) = G_AND [[COPY7]], [[C1]] + ; GFX9: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) + ; GFX9: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY8]], [[C1]] + ; GFX9: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C2]](s32) + ; GFX9: [[OR3:%[0-9]+]]:_(s32) = G_OR [[AND6]], [[SHL3]] + ; GFX9: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[OR2]](s32), [[OR3]](s32) ; GFX9: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s32>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s32>), [[BUILD_VECTOR1]](<2 x s32>) ; GFX9: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[CONCAT_VECTORS]](<4 x s32>) %0:_(p3) = COPY $vgpr0 @@ -7540,9 +8488,13 @@ body: | 
; SI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[COPY3]](s32) ; SI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[SHL1]](s32) ; SI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[TRUNC3]] - ; SI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; SI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; SI: [[GEP3:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C6]](s32) + ; SI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; SI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; SI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; SI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C6]](s32) + ; SI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; SI: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; SI: [[GEP3:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C7]](s32) ; SI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p3) :: (load 1, addrspace 3) ; SI: [[GEP4:%[0-9]+]]:_(p3) = G_GEP [[GEP3]], [[C]](s32) ; SI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p3) :: (load 1, addrspace 3) @@ -7555,19 +8507,22 @@ body: | ; SI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; SI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) ; SI: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C5]] - ; SI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY5]](s32) - ; SI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL2]](s32) - ; SI: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] + ; SI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY5]](s32) + ; SI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) + ; SI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] ; SI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; SI: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; SI: [[COPY7:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; SI: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) ; SI: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY8]], [[C5]] - ; SI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY7]](s32) - ; SI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) - ; SI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; SI: 
[[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) - ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[MV]](s32), [[MV1]](s32) + ; SI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY7]](s32) + ; SI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) + ; SI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] + ; SI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; SI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; SI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C6]](s32) + ; SI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[OR2]](s32), [[OR5]](s32) ; SI: [[GEP7:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C4]](s32) ; SI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p3) :: (load 1, addrspace 3) ; SI: [[GEP8:%[0-9]+]]:_(p3) = G_GEP [[GEP7]], [[C]](s32) @@ -7581,19 +8536,22 @@ body: | ; SI: [[COPY9:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; SI: [[COPY10:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32) ; SI: [[AND9:%[0-9]+]]:_(s32) = G_AND [[COPY10]], [[C5]] - ; SI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY9]](s32) - ; SI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) - ; SI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] + ; SI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY9]](s32) + ; SI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) + ; SI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] ; SI: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; SI: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C3]] ; SI: [[COPY11:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; SI: [[COPY12:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32) ; SI: [[AND11:%[0-9]+]]:_(s32) = G_AND [[COPY12]], [[C5]] - ; SI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY11]](s32) - ; SI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL5]](s32) - ; SI: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] - ; SI: [[MV2:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16) - ; SI: [[GEP11:%[0-9]+]]:_(p3) = G_GEP 
[[GEP7]], [[C6]](s32) + ; SI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY11]](s32) + ; SI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) + ; SI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] + ; SI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; SI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; SI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C6]](s32) + ; SI: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL8]] + ; SI: [[GEP11:%[0-9]+]]:_(p3) = G_GEP [[GEP7]], [[C7]](s32) ; SI: [[LOAD12:%[0-9]+]]:_(s32) = G_LOAD [[GEP11]](p3) :: (load 1, addrspace 3) ; SI: [[GEP12:%[0-9]+]]:_(p3) = G_GEP [[GEP11]], [[C]](s32) ; SI: [[LOAD13:%[0-9]+]]:_(s32) = G_LOAD [[GEP12]](p3) :: (load 1, addrspace 3) @@ -7606,19 +8564,22 @@ body: | ; SI: [[COPY13:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; SI: [[COPY14:%[0-9]+]]:_(s32) = COPY [[LOAD13]](s32) ; SI: [[AND13:%[0-9]+]]:_(s32) = G_AND [[COPY14]], [[C5]] - ; SI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY13]](s32) - ; SI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) - ; SI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] + ; SI: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY13]](s32) + ; SI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL9]](s32) + ; SI: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] ; SI: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; SI: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C3]] ; SI: [[COPY15:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; SI: [[COPY16:%[0-9]+]]:_(s32) = COPY [[LOAD15]](s32) ; SI: [[AND15:%[0-9]+]]:_(s32) = G_AND [[COPY16]], [[C5]] - ; SI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY15]](s32) - ; SI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) - ; SI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] - ; SI: [[MV3:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR6]](s16), [[OR7]](s16) - ; SI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[MV2]](s32), [[MV3]](s32) + ; SI: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY15]](s32) + ; 
SI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL10]](s32) + ; SI: [[OR10:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] + ; SI: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; SI: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR10]](s16) + ; SI: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C6]](s32) + ; SI: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; SI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[OR8]](s32), [[OR11]](s32) ; SI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s32>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s32>), [[BUILD_VECTOR1]](<2 x s32>) ; SI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[CONCAT_VECTORS]](<4 x s32>) ; CI-LABEL: name: test_load_local_v4s32_align1 @@ -7652,9 +8613,13 @@ body: | ; CI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[COPY3]](s32) ; CI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[SHL1]](s32) ; CI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[TRUNC3]] - ; CI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; CI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; CI: [[GEP3:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C6]](s32) + ; CI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C6]](s32) + ; CI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; CI: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; CI: [[GEP3:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C7]](s32) ; CI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p3) :: (load 1, addrspace 3) ; CI: [[GEP4:%[0-9]+]]:_(p3) = G_GEP [[GEP3]], [[C]](s32) ; CI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p3) :: (load 1, addrspace 3) @@ -7667,19 +8632,22 @@ body: | ; CI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) ; CI: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C5]] - ; CI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY5]](s32) - ; CI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL2]](s32) - ; CI: 
[[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] + ; CI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY5]](s32) + ; CI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) + ; CI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] ; CI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; CI: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; CI: [[COPY7:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) ; CI: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY8]], [[C5]] - ; CI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY7]](s32) - ; CI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) - ; CI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; CI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) - ; CI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[MV]](s32), [[MV1]](s32) + ; CI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY7]](s32) + ; CI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) + ; CI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] + ; CI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; CI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C6]](s32) + ; CI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; CI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[OR2]](s32), [[OR5]](s32) ; CI: [[GEP7:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C4]](s32) ; CI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p3) :: (load 1, addrspace 3) ; CI: [[GEP8:%[0-9]+]]:_(p3) = G_GEP [[GEP7]], [[C]](s32) @@ -7693,19 +8661,22 @@ body: | ; CI: [[COPY9:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI: [[COPY10:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32) ; CI: [[AND9:%[0-9]+]]:_(s32) = G_AND [[COPY10]], [[C5]] - ; CI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY9]](s32) - ; CI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) - ; CI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] + ; CI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY9]](s32) + ; CI: 
[[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) + ; CI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] ; CI: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; CI: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C3]] ; CI: [[COPY11:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI: [[COPY12:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32) ; CI: [[AND11:%[0-9]+]]:_(s32) = G_AND [[COPY12]], [[C5]] - ; CI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY11]](s32) - ; CI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL5]](s32) - ; CI: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] - ; CI: [[MV2:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16) - ; CI: [[GEP11:%[0-9]+]]:_(p3) = G_GEP [[GEP7]], [[C6]](s32) + ; CI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY11]](s32) + ; CI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) + ; CI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] + ; CI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; CI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; CI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C6]](s32) + ; CI: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL8]] + ; CI: [[GEP11:%[0-9]+]]:_(p3) = G_GEP [[GEP7]], [[C7]](s32) ; CI: [[LOAD12:%[0-9]+]]:_(s32) = G_LOAD [[GEP11]](p3) :: (load 1, addrspace 3) ; CI: [[GEP12:%[0-9]+]]:_(p3) = G_GEP [[GEP11]], [[C]](s32) ; CI: [[LOAD13:%[0-9]+]]:_(s32) = G_LOAD [[GEP12]](p3) :: (load 1, addrspace 3) @@ -7718,19 +8689,22 @@ body: | ; CI: [[COPY13:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI: [[COPY14:%[0-9]+]]:_(s32) = COPY [[LOAD13]](s32) ; CI: [[AND13:%[0-9]+]]:_(s32) = G_AND [[COPY14]], [[C5]] - ; CI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY13]](s32) - ; CI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) - ; CI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] + ; CI: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY13]](s32) + ; CI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL9]](s32) + ; CI: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] ; CI: 
[[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; CI: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C3]] ; CI: [[COPY15:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI: [[COPY16:%[0-9]+]]:_(s32) = COPY [[LOAD15]](s32) ; CI: [[AND15:%[0-9]+]]:_(s32) = G_AND [[COPY16]], [[C5]] - ; CI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY15]](s32) - ; CI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) - ; CI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] - ; CI: [[MV3:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR6]](s16), [[OR7]](s16) - ; CI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[MV2]](s32), [[MV3]](s32) + ; CI: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY15]](s32) + ; CI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL10]](s32) + ; CI: [[OR10:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] + ; CI: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; CI: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR10]](s16) + ; CI: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C6]](s32) + ; CI: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; CI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[OR8]](s32), [[OR11]](s32) ; CI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s32>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s32>), [[BUILD_VECTOR1]](<2 x s32>) ; CI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[CONCAT_VECTORS]](<4 x s32>) ; CI-DS128-LABEL: name: test_load_local_v4s32_align1 @@ -7764,9 +8738,13 @@ body: | ; CI-DS128: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[COPY3]](s32) ; CI-DS128: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[SHL1]](s32) ; CI-DS128: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[TRUNC3]] - ; CI-DS128: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; CI-DS128: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; CI-DS128: [[GEP3:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C6]](s32) + ; CI-DS128: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI-DS128: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI-DS128: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; 
CI-DS128: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C6]](s32) + ; CI-DS128: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; CI-DS128: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; CI-DS128: [[GEP3:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C7]](s32) ; CI-DS128: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p3) :: (load 1, addrspace 3) ; CI-DS128: [[GEP4:%[0-9]+]]:_(p3) = G_GEP [[GEP3]], [[C]](s32) ; CI-DS128: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p3) :: (load 1, addrspace 3) @@ -7779,18 +8757,21 @@ body: | ; CI-DS128: [[COPY5:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI-DS128: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) ; CI-DS128: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C5]] - ; CI-DS128: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY5]](s32) - ; CI-DS128: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL2]](s32) - ; CI-DS128: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] + ; CI-DS128: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY5]](s32) + ; CI-DS128: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) + ; CI-DS128: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] ; CI-DS128: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; CI-DS128: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; CI-DS128: [[COPY7:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI-DS128: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) ; CI-DS128: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY8]], [[C5]] - ; CI-DS128: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY7]](s32) - ; CI-DS128: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) - ; CI-DS128: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; CI-DS128: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) + ; CI-DS128: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY7]](s32) + ; CI-DS128: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) + ; CI-DS128: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] + ; CI-DS128: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI-DS128: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + 
; CI-DS128: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C6]](s32) + ; CI-DS128: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] ; CI-DS128: [[GEP7:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C4]](s32) ; CI-DS128: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p3) :: (load 1, addrspace 3) ; CI-DS128: [[GEP8:%[0-9]+]]:_(p3) = G_GEP [[GEP7]], [[C]](s32) @@ -7804,20 +8785,23 @@ body: | ; CI-DS128: [[COPY9:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI-DS128: [[COPY10:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32) ; CI-DS128: [[AND9:%[0-9]+]]:_(s32) = G_AND [[COPY10]], [[C5]] - ; CI-DS128: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY9]](s32) - ; CI-DS128: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) - ; CI-DS128: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] + ; CI-DS128: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY9]](s32) + ; CI-DS128: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) + ; CI-DS128: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] ; CI-DS128: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; CI-DS128: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C3]] ; CI-DS128: [[COPY11:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI-DS128: [[COPY12:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32) ; CI-DS128: [[AND11:%[0-9]+]]:_(s32) = G_AND [[COPY12]], [[C5]] - ; CI-DS128: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY11]](s32) - ; CI-DS128: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL5]](s32) - ; CI-DS128: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] - ; CI-DS128: [[MV2:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16) - ; CI-DS128: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 - ; CI-DS128: [[GEP11:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C7]](s32) + ; CI-DS128: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY11]](s32) + ; CI-DS128: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) + ; CI-DS128: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] + ; CI-DS128: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; CI-DS128: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT 
[[OR7]](s16) + ; CI-DS128: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C6]](s32) + ; CI-DS128: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL8]] + ; CI-DS128: [[C8:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 + ; CI-DS128: [[GEP11:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C8]](s32) ; CI-DS128: [[LOAD12:%[0-9]+]]:_(s32) = G_LOAD [[GEP11]](p3) :: (load 1, addrspace 3) ; CI-DS128: [[GEP12:%[0-9]+]]:_(p3) = G_GEP [[GEP11]], [[C]](s32) ; CI-DS128: [[LOAD13:%[0-9]+]]:_(s32) = G_LOAD [[GEP12]](p3) :: (load 1, addrspace 3) @@ -7830,19 +8814,22 @@ body: | ; CI-DS128: [[COPY13:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI-DS128: [[COPY14:%[0-9]+]]:_(s32) = COPY [[LOAD13]](s32) ; CI-DS128: [[AND13:%[0-9]+]]:_(s32) = G_AND [[COPY14]], [[C5]] - ; CI-DS128: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY13]](s32) - ; CI-DS128: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) - ; CI-DS128: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] + ; CI-DS128: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY13]](s32) + ; CI-DS128: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL9]](s32) + ; CI-DS128: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] ; CI-DS128: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; CI-DS128: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C3]] ; CI-DS128: [[COPY15:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI-DS128: [[COPY16:%[0-9]+]]:_(s32) = COPY [[LOAD15]](s32) ; CI-DS128: [[AND15:%[0-9]+]]:_(s32) = G_AND [[COPY16]], [[C5]] - ; CI-DS128: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY15]](s32) - ; CI-DS128: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) - ; CI-DS128: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] - ; CI-DS128: [[MV3:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR6]](s16), [[OR7]](s16) - ; CI-DS128: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[MV]](s32), [[MV1]](s32), [[MV2]](s32), [[MV3]](s32) + ; CI-DS128: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY15]](s32) + ; CI-DS128: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL10]](s32) + ; 
CI-DS128: [[OR10:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] + ; CI-DS128: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; CI-DS128: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR10]](s16) + ; CI-DS128: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C6]](s32) + ; CI-DS128: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; CI-DS128: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[OR2]](s32), [[OR5]](s32), [[OR8]](s32), [[OR11]](s32) ; CI-DS128: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BUILD_VECTOR]](<4 x s32>) ; VI-LABEL: name: test_load_local_v4s32_align1 ; VI: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 @@ -7870,9 +8857,13 @@ body: | ; VI: [[AND3:%[0-9]+]]:_(s16) = G_AND [[TRUNC3]], [[C3]] ; VI: [[SHL1:%[0-9]+]]:_(s16) = G_SHL [[AND3]], [[C4]](s16) ; VI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[SHL1]] - ; VI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; VI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; VI: [[GEP3:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C5]](s32) + ; VI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; VI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; VI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C5]](s32) + ; VI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; VI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; VI: [[GEP3:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C6]](s32) ; VI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p3) :: (load 1, addrspace 3) ; VI: [[GEP4:%[0-9]+]]:_(p3) = G_GEP [[GEP3]], [[C]](s32) ; VI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p3) :: (load 1, addrspace 3) @@ -7884,18 +8875,21 @@ body: | ; VI: [[AND4:%[0-9]+]]:_(s16) = G_AND [[TRUNC4]], [[C3]] ; VI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) ; VI: [[AND5:%[0-9]+]]:_(s16) = G_AND [[TRUNC5]], [[C3]] - ; VI: [[SHL2:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) - ; VI: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL2]] + ; VI: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) + ; VI: 
[[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL3]] ; VI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; VI: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; VI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) ; VI: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C3]] - ; VI: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) - ; VI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; VI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) - ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[MV]](s32), [[MV1]](s32) - ; VI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 - ; VI: [[GEP7:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C6]](s32) + ; VI: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) + ; VI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL4]] + ; VI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; VI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; VI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C5]](s32) + ; VI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[OR2]](s32), [[OR5]](s32) + ; VI: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 + ; VI: [[GEP7:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C7]](s32) ; VI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p3) :: (load 1, addrspace 3) ; VI: [[GEP8:%[0-9]+]]:_(p3) = G_GEP [[GEP7]], [[C]](s32) ; VI: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p3) :: (load 1, addrspace 3) @@ -7907,16 +8901,19 @@ body: | ; VI: [[AND8:%[0-9]+]]:_(s16) = G_AND [[TRUNC8]], [[C3]] ; VI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD9]](s32) ; VI: [[AND9:%[0-9]+]]:_(s16) = G_AND [[TRUNC9]], [[C3]] - ; VI: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C4]](s16) - ; VI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL4]] + ; VI: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C4]](s16) + ; VI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL6]] ; VI: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; VI: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C3]] ; VI: 
[[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD11]](s32) ; VI: [[AND11:%[0-9]+]]:_(s16) = G_AND [[TRUNC11]], [[C3]] - ; VI: [[SHL5:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C4]](s16) - ; VI: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL5]] - ; VI: [[MV2:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16) - ; VI: [[GEP11:%[0-9]+]]:_(p3) = G_GEP [[GEP7]], [[C5]](s32) + ; VI: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C4]](s16) + ; VI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL7]] + ; VI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; VI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; VI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C5]](s32) + ; VI: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL8]] + ; VI: [[GEP11:%[0-9]+]]:_(p3) = G_GEP [[GEP7]], [[C6]](s32) ; VI: [[LOAD12:%[0-9]+]]:_(s32) = G_LOAD [[GEP11]](p3) :: (load 1, addrspace 3) ; VI: [[GEP12:%[0-9]+]]:_(p3) = G_GEP [[GEP11]], [[C]](s32) ; VI: [[LOAD13:%[0-9]+]]:_(s32) = G_LOAD [[GEP12]](p3) :: (load 1, addrspace 3) @@ -7928,16 +8925,19 @@ body: | ; VI: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C3]] ; VI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD13]](s32) ; VI: [[AND13:%[0-9]+]]:_(s16) = G_AND [[TRUNC13]], [[C3]] - ; VI: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C4]](s16) - ; VI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL6]] + ; VI: [[SHL9:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C4]](s16) + ; VI: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL9]] ; VI: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; VI: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C3]] ; VI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD15]](s32) ; VI: [[AND15:%[0-9]+]]:_(s16) = G_AND [[TRUNC15]], [[C3]] - ; VI: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C4]](s16) - ; VI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL7]] - ; VI: [[MV3:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR6]](s16), [[OR7]](s16) - ; VI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[MV2]](s32), [[MV3]](s32) + ; VI: 
[[SHL10:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C4]](s16) + ; VI: [[OR10:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL10]] + ; VI: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; VI: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR10]](s16) + ; VI: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C5]](s32) + ; VI: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; VI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[OR8]](s32), [[OR11]](s32) ; VI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s32>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s32>), [[BUILD_VECTOR1]](<2 x s32>) ; VI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[CONCAT_VECTORS]](<4 x s32>) ; GFX9-LABEL: name: test_load_local_v4s32_align1 @@ -7966,9 +8966,13 @@ body: | ; GFX9: [[AND3:%[0-9]+]]:_(s16) = G_AND [[TRUNC3]], [[C3]] ; GFX9: [[SHL1:%[0-9]+]]:_(s16) = G_SHL [[AND3]], [[C4]](s16) ; GFX9: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[SHL1]] - ; GFX9: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; GFX9: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; GFX9: [[GEP3:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C5]](s32) + ; GFX9: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C5]](s32) + ; GFX9: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; GFX9: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; GFX9: [[GEP3:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C6]](s32) ; GFX9: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p3) :: (load 1, addrspace 3) ; GFX9: [[GEP4:%[0-9]+]]:_(p3) = G_GEP [[GEP3]], [[C]](s32) ; GFX9: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p3) :: (load 1, addrspace 3) @@ -7980,18 +8984,21 @@ body: | ; GFX9: [[AND4:%[0-9]+]]:_(s16) = G_AND [[TRUNC4]], [[C3]] ; GFX9: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) ; GFX9: [[AND5:%[0-9]+]]:_(s16) = G_AND [[TRUNC5]], [[C3]] - ; GFX9: [[SHL2:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) - ; GFX9: 
[[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL2]] + ; GFX9: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) + ; GFX9: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL3]] ; GFX9: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; GFX9: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; GFX9: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) ; GFX9: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C3]] - ; GFX9: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) - ; GFX9: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; GFX9: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) - ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[MV]](s32), [[MV1]](s32) - ; GFX9: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 - ; GFX9: [[GEP7:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C6]](s32) + ; GFX9: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) + ; GFX9: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL4]] + ; GFX9: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; GFX9: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; GFX9: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C5]](s32) + ; GFX9: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[OR2]](s32), [[OR5]](s32) + ; GFX9: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 + ; GFX9: [[GEP7:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C7]](s32) ; GFX9: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p3) :: (load 1, addrspace 3) ; GFX9: [[GEP8:%[0-9]+]]:_(p3) = G_GEP [[GEP7]], [[C]](s32) ; GFX9: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p3) :: (load 1, addrspace 3) @@ -8003,16 +9010,19 @@ body: | ; GFX9: [[AND8:%[0-9]+]]:_(s16) = G_AND [[TRUNC8]], [[C3]] ; GFX9: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD9]](s32) ; GFX9: [[AND9:%[0-9]+]]:_(s16) = G_AND [[TRUNC9]], [[C3]] - ; GFX9: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C4]](s16) - ; GFX9: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL4]] + ; GFX9: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C4]](s16) + 
; GFX9: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL6]] ; GFX9: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; GFX9: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C3]] ; GFX9: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD11]](s32) ; GFX9: [[AND11:%[0-9]+]]:_(s16) = G_AND [[TRUNC11]], [[C3]] - ; GFX9: [[SHL5:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C4]](s16) - ; GFX9: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL5]] - ; GFX9: [[MV2:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16) - ; GFX9: [[GEP11:%[0-9]+]]:_(p3) = G_GEP [[GEP7]], [[C5]](s32) + ; GFX9: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C4]](s16) + ; GFX9: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL7]] + ; GFX9: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; GFX9: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; GFX9: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C5]](s32) + ; GFX9: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL8]] + ; GFX9: [[GEP11:%[0-9]+]]:_(p3) = G_GEP [[GEP7]], [[C6]](s32) ; GFX9: [[LOAD12:%[0-9]+]]:_(s32) = G_LOAD [[GEP11]](p3) :: (load 1, addrspace 3) ; GFX9: [[GEP12:%[0-9]+]]:_(p3) = G_GEP [[GEP11]], [[C]](s32) ; GFX9: [[LOAD13:%[0-9]+]]:_(s32) = G_LOAD [[GEP12]](p3) :: (load 1, addrspace 3) @@ -8024,16 +9034,19 @@ body: | ; GFX9: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C3]] ; GFX9: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD13]](s32) ; GFX9: [[AND13:%[0-9]+]]:_(s16) = G_AND [[TRUNC13]], [[C3]] - ; GFX9: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C4]](s16) - ; GFX9: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL6]] + ; GFX9: [[SHL9:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C4]](s16) + ; GFX9: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL9]] ; GFX9: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; GFX9: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C3]] ; GFX9: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD15]](s32) ; GFX9: [[AND15:%[0-9]+]]:_(s16) = G_AND [[TRUNC15]], [[C3]] - ; GFX9: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C4]](s16) - ; 
GFX9: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL7]] - ; GFX9: [[MV3:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR6]](s16), [[OR7]](s16) - ; GFX9: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[MV2]](s32), [[MV3]](s32) + ; GFX9: [[SHL10:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C4]](s16) + ; GFX9: [[OR10:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL10]] + ; GFX9: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; GFX9: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR10]](s16) + ; GFX9: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C5]](s32) + ; GFX9: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; GFX9: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[OR8]](s32), [[OR11]](s32) ; GFX9: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s32>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s32>), [[BUILD_VECTOR1]](<2 x s32>) ; GFX9: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[CONCAT_VECTORS]](<4 x s32>) %0:_(p3) = COPY $vgpr0 @@ -8411,7 +9424,16 @@ body: | ; SI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY7]](s32) ; SI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) ; SI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; SI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) + ; SI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; SI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; SI: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; SI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C10]](s32) + ; SI: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; SI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; SI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; SI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C10]](s32) + ; SI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; SI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) ; SI: [[GEP7:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C8]](s32) ; SI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p3) :: (load 1, addrspace 3) ; SI: [[GEP8:%[0-9]+]]:_(p3) = G_GEP [[GEP7]], [[C]](s32) 
@@ -8433,34 +9455,42 @@ body: | ; SI: [[COPY9:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; SI: [[COPY10:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32) ; SI: [[AND9:%[0-9]+]]:_(s32) = G_AND [[COPY10]], [[C9]] - ; SI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY9]](s32) - ; SI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) - ; SI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] + ; SI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY9]](s32) + ; SI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) + ; SI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] ; SI: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; SI: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C7]] ; SI: [[COPY11:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; SI: [[COPY12:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32) ; SI: [[AND11:%[0-9]+]]:_(s32) = G_AND [[COPY12]], [[C9]] - ; SI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY11]](s32) - ; SI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL5]](s32) - ; SI: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] + ; SI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY11]](s32) + ; SI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) + ; SI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] ; SI: [[TRUNC12:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD12]](s32) ; SI: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C7]] ; SI: [[COPY13:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; SI: [[COPY14:%[0-9]+]]:_(s32) = COPY [[LOAD13]](s32) ; SI: [[AND13:%[0-9]+]]:_(s32) = G_AND [[COPY14]], [[C9]] - ; SI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY13]](s32) - ; SI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) - ; SI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] + ; SI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY13]](s32) + ; SI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL8]](s32) + ; SI: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] ; SI: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; SI: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C7]] ; SI: 
[[COPY15:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; SI: [[COPY16:%[0-9]+]]:_(s32) = COPY [[LOAD15]](s32) ; SI: [[AND15:%[0-9]+]]:_(s32) = G_AND [[COPY16]], [[C9]] - ; SI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY15]](s32) - ; SI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) - ; SI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] - ; SI: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16), [[OR6]](s16), [[OR7]](s16) + ; SI: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY15]](s32) + ; SI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL9]](s32) + ; SI: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] + ; SI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; SI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; SI: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C10]](s32) + ; SI: [[OR10:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL10]] + ; SI: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR8]](s16) + ; SI: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; SI: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C10]](s32) + ; SI: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; SI: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR10]](s32), [[OR11]](s32) ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s64>) = G_BUILD_VECTOR [[MV]](s64), [[MV1]](s64) ; SI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BUILD_VECTOR]](<2 x s64>) ; CI-LABEL: name: test_load_local_v2s64_align16 @@ -8522,7 +9552,16 @@ body: | ; CI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY7]](s32) ; CI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) ; CI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; CI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) + ; CI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C10]](s32) + ; CI: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; CI: 
[[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; CI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C10]](s32) + ; CI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; CI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) ; CI: [[GEP7:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C8]](s32) ; CI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p3) :: (load 1, addrspace 3) ; CI: [[GEP8:%[0-9]+]]:_(p3) = G_GEP [[GEP7]], [[C]](s32) @@ -8544,34 +9583,42 @@ body: | ; CI: [[COPY9:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI: [[COPY10:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32) ; CI: [[AND9:%[0-9]+]]:_(s32) = G_AND [[COPY10]], [[C9]] - ; CI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY9]](s32) - ; CI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) - ; CI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] + ; CI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY9]](s32) + ; CI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) + ; CI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] ; CI: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; CI: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C7]] ; CI: [[COPY11:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI: [[COPY12:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32) ; CI: [[AND11:%[0-9]+]]:_(s32) = G_AND [[COPY12]], [[C9]] - ; CI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY11]](s32) - ; CI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL5]](s32) - ; CI: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] + ; CI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY11]](s32) + ; CI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) + ; CI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] ; CI: [[TRUNC12:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD12]](s32) ; CI: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C7]] ; CI: [[COPY13:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI: [[COPY14:%[0-9]+]]:_(s32) = COPY [[LOAD13]](s32) ; CI: [[AND13:%[0-9]+]]:_(s32) = G_AND [[COPY14]], [[C9]] - ; CI: 
[[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY13]](s32) - ; CI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) - ; CI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] + ; CI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY13]](s32) + ; CI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL8]](s32) + ; CI: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] ; CI: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; CI: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C7]] ; CI: [[COPY15:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI: [[COPY16:%[0-9]+]]:_(s32) = COPY [[LOAD15]](s32) ; CI: [[AND15:%[0-9]+]]:_(s32) = G_AND [[COPY16]], [[C9]] - ; CI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY15]](s32) - ; CI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) - ; CI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] - ; CI: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16), [[OR6]](s16), [[OR7]](s16) + ; CI: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY15]](s32) + ; CI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL9]](s32) + ; CI: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] + ; CI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; CI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; CI: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C10]](s32) + ; CI: [[OR10:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL10]] + ; CI: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR8]](s16) + ; CI: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; CI: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C10]](s32) + ; CI: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; CI: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR10]](s32), [[OR11]](s32) ; CI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s64>) = G_BUILD_VECTOR [[MV]](s64), [[MV1]](s64) ; CI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BUILD_VECTOR]](<2 x s64>) ; CI-DS128-LABEL: name: test_load_local_v2s64_align16 @@ -8633,7 +9680,16 @@ body: | ; CI-DS128: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY7]](s32) ; CI-DS128: 
[[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) ; CI-DS128: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; CI-DS128: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) + ; CI-DS128: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI-DS128: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI-DS128: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-DS128: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C10]](s32) + ; CI-DS128: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; CI-DS128: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; CI-DS128: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI-DS128: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C10]](s32) + ; CI-DS128: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; CI-DS128: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) ; CI-DS128: [[GEP7:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C8]](s32) ; CI-DS128: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p3) :: (load 1, addrspace 3) ; CI-DS128: [[GEP8:%[0-9]+]]:_(p3) = G_GEP [[GEP7]], [[C]](s32) @@ -8655,34 +9711,42 @@ body: | ; CI-DS128: [[COPY9:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-DS128: [[COPY10:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32) ; CI-DS128: [[AND9:%[0-9]+]]:_(s32) = G_AND [[COPY10]], [[C9]] - ; CI-DS128: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY9]](s32) - ; CI-DS128: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) - ; CI-DS128: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] + ; CI-DS128: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY9]](s32) + ; CI-DS128: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) + ; CI-DS128: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] ; CI-DS128: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; CI-DS128: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C7]] ; CI-DS128: [[COPY11:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-DS128: [[COPY12:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32) ; CI-DS128: [[AND11:%[0-9]+]]:_(s32) = G_AND [[COPY12]], 
[[C9]] - ; CI-DS128: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY11]](s32) - ; CI-DS128: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL5]](s32) - ; CI-DS128: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] + ; CI-DS128: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY11]](s32) + ; CI-DS128: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) + ; CI-DS128: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] ; CI-DS128: [[TRUNC12:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD12]](s32) ; CI-DS128: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C7]] ; CI-DS128: [[COPY13:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-DS128: [[COPY14:%[0-9]+]]:_(s32) = COPY [[LOAD13]](s32) ; CI-DS128: [[AND13:%[0-9]+]]:_(s32) = G_AND [[COPY14]], [[C9]] - ; CI-DS128: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY13]](s32) - ; CI-DS128: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) - ; CI-DS128: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] + ; CI-DS128: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY13]](s32) + ; CI-DS128: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL8]](s32) + ; CI-DS128: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] ; CI-DS128: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; CI-DS128: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C7]] ; CI-DS128: [[COPY15:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI-DS128: [[COPY16:%[0-9]+]]:_(s32) = COPY [[LOAD15]](s32) ; CI-DS128: [[AND15:%[0-9]+]]:_(s32) = G_AND [[COPY16]], [[C9]] - ; CI-DS128: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY15]](s32) - ; CI-DS128: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) - ; CI-DS128: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] - ; CI-DS128: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16), [[OR6]](s16), [[OR7]](s16) + ; CI-DS128: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY15]](s32) + ; CI-DS128: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL9]](s32) + ; CI-DS128: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] + ; CI-DS128: 
[[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; CI-DS128: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; CI-DS128: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C10]](s32) + ; CI-DS128: [[OR10:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL10]] + ; CI-DS128: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR8]](s16) + ; CI-DS128: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; CI-DS128: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C10]](s32) + ; CI-DS128: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; CI-DS128: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR10]](s32), [[OR11]](s32) ; CI-DS128: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s64>) = G_BUILD_VECTOR [[MV]](s64), [[MV1]](s64) ; CI-DS128: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BUILD_VECTOR]](<2 x s64>) ; VI-LABEL: name: test_load_local_v2s64_align16 @@ -8735,9 +9799,18 @@ body: | ; VI: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C7]] ; VI: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C8]](s16) ; VI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; VI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) - ; VI: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 - ; VI: [[GEP7:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C9]](s32) + ; VI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; VI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; VI: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C9]](s32) + ; VI: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; VI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; VI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; VI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C9]](s32) + ; VI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; VI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) + ; VI: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 + ; VI: [[GEP7:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C10]](s32) ; VI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p3) :: (load 1, addrspace 3) ; VI: 
[[GEP8:%[0-9]+]]:_(p3) = G_GEP [[GEP7]], [[C]](s32) ; VI: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p3) :: (load 1, addrspace 3) @@ -8757,27 +9830,35 @@ body: | ; VI: [[AND8:%[0-9]+]]:_(s16) = G_AND [[TRUNC8]], [[C7]] ; VI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD9]](s32) ; VI: [[AND9:%[0-9]+]]:_(s16) = G_AND [[TRUNC9]], [[C7]] - ; VI: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) - ; VI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL4]] + ; VI: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) + ; VI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL6]] ; VI: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; VI: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C7]] ; VI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD11]](s32) ; VI: [[AND11:%[0-9]+]]:_(s16) = G_AND [[TRUNC11]], [[C7]] - ; VI: [[SHL5:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) - ; VI: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL5]] + ; VI: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) + ; VI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL7]] ; VI: [[TRUNC12:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD12]](s32) ; VI: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C7]] ; VI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD13]](s32) ; VI: [[AND13:%[0-9]+]]:_(s16) = G_AND [[TRUNC13]], [[C7]] - ; VI: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C8]](s16) - ; VI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL6]] + ; VI: [[SHL8:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C8]](s16) + ; VI: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL8]] ; VI: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; VI: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C7]] ; VI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD15]](s32) ; VI: [[AND15:%[0-9]+]]:_(s16) = G_AND [[TRUNC15]], [[C7]] - ; VI: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C8]](s16) - ; VI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL7]] - ; VI: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16), [[OR6]](s16), [[OR7]](s16) + ; VI: 
[[SHL9:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C8]](s16) + ; VI: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL9]] + ; VI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; VI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; VI: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C9]](s32) + ; VI: [[OR10:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL10]] + ; VI: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR8]](s16) + ; VI: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; VI: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C9]](s32) + ; VI: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; VI: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR10]](s32), [[OR11]](s32) ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s64>) = G_BUILD_VECTOR [[MV]](s64), [[MV1]](s64) ; VI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BUILD_VECTOR]](<2 x s64>) ; GFX9-LABEL: name: test_load_local_v2s64_align16 @@ -8830,9 +9911,18 @@ body: | ; GFX9: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C7]] ; GFX9: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C8]](s16) ; GFX9: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; GFX9: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) - ; GFX9: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 - ; GFX9: [[GEP7:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C9]](s32) + ; GFX9: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C9]](s32) + ; GFX9: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; GFX9: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; GFX9: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; GFX9: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C9]](s32) + ; GFX9: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; GFX9: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) + ; GFX9: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 + ; GFX9: [[GEP7:%[0-9]+]]:_(p3) = G_GEP [[COPY]], 
[[C10]](s32) ; GFX9: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p3) :: (load 1, addrspace 3) ; GFX9: [[GEP8:%[0-9]+]]:_(p3) = G_GEP [[GEP7]], [[C]](s32) ; GFX9: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p3) :: (load 1, addrspace 3) @@ -8852,27 +9942,35 @@ body: | ; GFX9: [[AND8:%[0-9]+]]:_(s16) = G_AND [[TRUNC8]], [[C7]] ; GFX9: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD9]](s32) ; GFX9: [[AND9:%[0-9]+]]:_(s16) = G_AND [[TRUNC9]], [[C7]] - ; GFX9: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) - ; GFX9: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL4]] + ; GFX9: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) + ; GFX9: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL6]] ; GFX9: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; GFX9: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C7]] ; GFX9: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD11]](s32) ; GFX9: [[AND11:%[0-9]+]]:_(s16) = G_AND [[TRUNC11]], [[C7]] - ; GFX9: [[SHL5:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) - ; GFX9: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL5]] + ; GFX9: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) + ; GFX9: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL7]] ; GFX9: [[TRUNC12:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD12]](s32) ; GFX9: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C7]] ; GFX9: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD13]](s32) ; GFX9: [[AND13:%[0-9]+]]:_(s16) = G_AND [[TRUNC13]], [[C7]] - ; GFX9: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C8]](s16) - ; GFX9: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL6]] + ; GFX9: [[SHL8:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C8]](s16) + ; GFX9: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL8]] ; GFX9: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; GFX9: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C7]] ; GFX9: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD15]](s32) ; GFX9: [[AND15:%[0-9]+]]:_(s16) = G_AND [[TRUNC15]], [[C7]] - ; GFX9: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C8]](s16) - ; GFX9: 
[[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL7]] - ; GFX9: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16), [[OR6]](s16), [[OR7]](s16) + ; GFX9: [[SHL9:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C8]](s16) + ; GFX9: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL9]] + ; GFX9: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; GFX9: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; GFX9: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C9]](s32) + ; GFX9: [[OR10:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL10]] + ; GFX9: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR8]](s16) + ; GFX9: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; GFX9: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C9]](s32) + ; GFX9: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; GFX9: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR10]](s32), [[OR11]](s32) ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s64>) = G_BUILD_VECTOR [[MV]](s64), [[MV1]](s64) ; GFX9: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BUILD_VECTOR]](<2 x s64>) %0:_(p3) = COPY $vgpr0 @@ -9396,9 +10494,13 @@ body: | ; SI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) ; SI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[SHL1]](s32) ; SI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[TRUNC3]] - ; SI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; SI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; SI: [[GEP3:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C6]](s32) + ; SI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; SI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; SI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; SI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C6]](s32) + ; SI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; SI: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; SI: [[GEP3:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C7]](s32) ; SI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p3) :: (load 1, addrspace 3) ; SI: [[GEP4:%[0-9]+]]:_(p3) = G_GEP [[GEP3]], [[C]](s32) ; SI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p3) :: (load 1, 
addrspace 3) @@ -9411,19 +10513,22 @@ body: | ; SI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; SI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) ; SI: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C5]] - ; SI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY4]](s32) - ; SI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL2]](s32) - ; SI: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] + ; SI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY4]](s32) + ; SI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) + ; SI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] ; SI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; SI: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; SI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; SI: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) ; SI: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY7]], [[C5]] - ; SI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY6]](s32) - ; SI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) - ; SI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; SI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) - ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[MV]](s32), [[MV1]](s32) + ; SI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY6]](s32) + ; SI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) + ; SI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] + ; SI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; SI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; SI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C6]](s32) + ; SI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[OR2]](s32), [[OR5]](s32) ; SI: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<2 x s32>) ; CI-LABEL: name: test_extload_local_v2s32_from_4_align1 ; CI: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 @@ -9455,9 +10560,13 @@ body: | ; CI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) ; CI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[SHL1]](s32) ; CI: 
[[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[TRUNC3]] - ; CI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; CI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; CI: [[GEP3:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C6]](s32) + ; CI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C6]](s32) + ; CI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; CI: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; CI: [[GEP3:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C7]](s32) ; CI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p3) :: (load 1, addrspace 3) ; CI: [[GEP4:%[0-9]+]]:_(p3) = G_GEP [[GEP3]], [[C]](s32) ; CI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p3) :: (load 1, addrspace 3) @@ -9470,19 +10579,22 @@ body: | ; CI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) ; CI: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C5]] - ; CI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY4]](s32) - ; CI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL2]](s32) - ; CI: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] + ; CI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY4]](s32) + ; CI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) + ; CI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] ; CI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; CI: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; CI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) ; CI: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY7]], [[C5]] - ; CI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY6]](s32) - ; CI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) - ; CI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; CI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) - ; CI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[MV]](s32), 
[[MV1]](s32) + ; CI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY6]](s32) + ; CI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) + ; CI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] + ; CI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; CI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C6]](s32) + ; CI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; CI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[OR2]](s32), [[OR5]](s32) ; CI: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<2 x s32>) ; CI-DS128-LABEL: name: test_extload_local_v2s32_from_4_align1 ; CI-DS128: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 @@ -9514,9 +10626,13 @@ body: | ; CI-DS128: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) ; CI-DS128: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[SHL1]](s32) ; CI-DS128: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[TRUNC3]] - ; CI-DS128: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; CI-DS128: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; CI-DS128: [[GEP3:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C6]](s32) + ; CI-DS128: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI-DS128: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI-DS128: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-DS128: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C6]](s32) + ; CI-DS128: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; CI-DS128: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; CI-DS128: [[GEP3:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C7]](s32) ; CI-DS128: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p3) :: (load 1, addrspace 3) ; CI-DS128: [[GEP4:%[0-9]+]]:_(p3) = G_GEP [[GEP3]], [[C]](s32) ; CI-DS128: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p3) :: (load 1, addrspace 3) @@ -9529,19 +10645,22 @@ body: | ; CI-DS128: [[COPY4:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI-DS128: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) ; CI-DS128: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C5]] - ; CI-DS128: 
[[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY4]](s32) - ; CI-DS128: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL2]](s32) - ; CI-DS128: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] + ; CI-DS128: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY4]](s32) + ; CI-DS128: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) + ; CI-DS128: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] ; CI-DS128: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; CI-DS128: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; CI-DS128: [[COPY6:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI-DS128: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) ; CI-DS128: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY7]], [[C5]] - ; CI-DS128: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY6]](s32) - ; CI-DS128: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) - ; CI-DS128: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; CI-DS128: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) - ; CI-DS128: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[MV]](s32), [[MV1]](s32) + ; CI-DS128: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY6]](s32) + ; CI-DS128: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) + ; CI-DS128: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] + ; CI-DS128: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI-DS128: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; CI-DS128: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C6]](s32) + ; CI-DS128: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; CI-DS128: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[OR2]](s32), [[OR5]](s32) ; CI-DS128: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<2 x s32>) ; VI-LABEL: name: test_extload_local_v2s32_from_4_align1 ; VI: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 @@ -9569,9 +10688,13 @@ body: | ; VI: [[AND3:%[0-9]+]]:_(s16) = G_AND [[TRUNC3]], [[C3]] ; VI: [[SHL1:%[0-9]+]]:_(s16) = G_SHL [[AND3]], [[C4]](s16) ; VI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[SHL1]] - ; VI: 
[[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; VI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; VI: [[GEP3:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C5]](s32) + ; VI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; VI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; VI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C5]](s32) + ; VI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; VI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; VI: [[GEP3:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C6]](s32) ; VI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p3) :: (load 1, addrspace 3) ; VI: [[GEP4:%[0-9]+]]:_(p3) = G_GEP [[GEP3]], [[C]](s32) ; VI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p3) :: (load 1, addrspace 3) @@ -9583,16 +10706,19 @@ body: | ; VI: [[AND4:%[0-9]+]]:_(s16) = G_AND [[TRUNC4]], [[C3]] ; VI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) ; VI: [[AND5:%[0-9]+]]:_(s16) = G_AND [[TRUNC5]], [[C3]] - ; VI: [[SHL2:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) - ; VI: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL2]] + ; VI: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) + ; VI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL3]] ; VI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; VI: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; VI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) ; VI: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C3]] - ; VI: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) - ; VI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; VI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) - ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[MV]](s32), [[MV1]](s32) + ; VI: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) + ; VI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL4]] + ; VI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; VI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; VI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL 
[[ZEXT3]], [[C5]](s32) + ; VI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[OR2]](s32), [[OR5]](s32) ; VI: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<2 x s32>) ; GFX9-LABEL: name: test_extload_local_v2s32_from_4_align1 ; GFX9: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 @@ -9620,9 +10746,13 @@ body: | ; GFX9: [[AND3:%[0-9]+]]:_(s16) = G_AND [[TRUNC3]], [[C3]] ; GFX9: [[SHL1:%[0-9]+]]:_(s16) = G_SHL [[AND3]], [[C4]](s16) ; GFX9: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[SHL1]] - ; GFX9: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; GFX9: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; GFX9: [[GEP3:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C5]](s32) + ; GFX9: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C5]](s32) + ; GFX9: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; GFX9: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; GFX9: [[GEP3:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C6]](s32) ; GFX9: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p3) :: (load 1, addrspace 3) ; GFX9: [[GEP4:%[0-9]+]]:_(p3) = G_GEP [[GEP3]], [[C]](s32) ; GFX9: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p3) :: (load 1, addrspace 3) @@ -9634,16 +10764,19 @@ body: | ; GFX9: [[AND4:%[0-9]+]]:_(s16) = G_AND [[TRUNC4]], [[C3]] ; GFX9: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) ; GFX9: [[AND5:%[0-9]+]]:_(s16) = G_AND [[TRUNC5]], [[C3]] - ; GFX9: [[SHL2:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) - ; GFX9: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL2]] + ; GFX9: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) + ; GFX9: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL3]] ; GFX9: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; GFX9: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; GFX9: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) ; GFX9: 
[[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C3]] - ; GFX9: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) - ; GFX9: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; GFX9: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) - ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[MV]](s32), [[MV1]](s32) + ; GFX9: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) + ; GFX9: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL4]] + ; GFX9: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; GFX9: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; GFX9: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C5]](s32) + ; GFX9: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[OR2]](s32), [[OR5]](s32) ; GFX9: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<2 x s32>) %0:_(p3) = COPY $vgpr0 %1:_(<2 x s32>) = G_LOAD %0 :: (load 4, align 1, addrspace 3) @@ -9659,97 +10792,137 @@ body: | ; SI-LABEL: name: test_extload_local_v2s32_from_4_align2 ; SI: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 ; SI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 2, addrspace 3) - ; SI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; SI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; SI: [[GEP:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C]](s32) ; SI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p3) :: (load 2, addrspace 3) - ; SI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; SI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; SI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; SI: [[GEP1:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C1]](s32) + ; SI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; SI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; SI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; SI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; SI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; SI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; SI: [[SHL:%[0-9]+]]:_(s32) = G_SHL 
[[AND1]], [[C2]](s32) + ; SI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; SI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; SI: [[GEP1:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C3]](s32) ; SI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p3) :: (load 2, addrspace 3) - ; SI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; SI: [[GEP2:%[0-9]+]]:_(p3) = G_GEP [[GEP1]], [[C]](s32) ; SI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p3) :: (load 2, addrspace 3) - ; SI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; SI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC2]](s16), [[TRUNC3]](s16) - ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[MV]](s32), [[MV1]](s32) + ; SI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; SI: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C1]] + ; SI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; SI: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C1]] + ; SI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C2]](s32) + ; SI: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[OR]](s32), [[OR1]](s32) ; SI: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<2 x s32>) ; CI-LABEL: name: test_extload_local_v2s32_from_4_align2 ; CI: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 ; CI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 2, addrspace 3) - ; CI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; CI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; CI: [[GEP:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C]](s32) ; CI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p3) :: (load 2, addrspace 3) - ; CI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; CI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; CI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; CI: [[GEP1:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C1]](s32) + ; CI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; CI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; CI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], 
[[C1]] + ; CI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; CI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; CI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; CI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; CI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; CI: [[GEP1:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C3]](s32) ; CI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p3) :: (load 2, addrspace 3) - ; CI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; CI: [[GEP2:%[0-9]+]]:_(p3) = G_GEP [[GEP1]], [[C]](s32) ; CI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p3) :: (load 2, addrspace 3) - ; CI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; CI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC2]](s16), [[TRUNC3]](s16) - ; CI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[MV]](s32), [[MV1]](s32) + ; CI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; CI: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C1]] + ; CI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; CI: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C1]] + ; CI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C2]](s32) + ; CI: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; CI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[OR]](s32), [[OR1]](s32) ; CI: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<2 x s32>) ; CI-DS128-LABEL: name: test_extload_local_v2s32_from_4_align2 ; CI-DS128: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 ; CI-DS128: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 2, addrspace 3) - ; CI-DS128: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; CI-DS128: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; CI-DS128: [[GEP:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C]](s32) ; CI-DS128: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p3) :: (load 2, addrspace 3) - ; CI-DS128: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; CI-DS128: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; 
CI-DS128: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; CI-DS128: [[GEP1:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C1]](s32) + ; CI-DS128: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; CI-DS128: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; CI-DS128: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; CI-DS128: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; CI-DS128: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; CI-DS128: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-DS128: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; CI-DS128: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; CI-DS128: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; CI-DS128: [[GEP1:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C3]](s32) ; CI-DS128: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p3) :: (load 2, addrspace 3) - ; CI-DS128: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; CI-DS128: [[GEP2:%[0-9]+]]:_(p3) = G_GEP [[GEP1]], [[C]](s32) ; CI-DS128: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p3) :: (load 2, addrspace 3) - ; CI-DS128: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; CI-DS128: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC2]](s16), [[TRUNC3]](s16) - ; CI-DS128: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[MV]](s32), [[MV1]](s32) + ; CI-DS128: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; CI-DS128: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C1]] + ; CI-DS128: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; CI-DS128: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C1]] + ; CI-DS128: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C2]](s32) + ; CI-DS128: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; CI-DS128: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[OR]](s32), [[OR1]](s32) ; CI-DS128: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<2 x s32>) ; VI-LABEL: name: test_extload_local_v2s32_from_4_align2 ; VI: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 2, addrspace 3) - ; VI: 
[[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; VI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; VI: [[GEP:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C]](s32) ; VI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p3) :: (load 2, addrspace 3) - ; VI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; VI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; VI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; VI: [[GEP1:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C1]](s32) + ; VI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; VI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; VI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; VI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; VI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; VI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; VI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; VI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; VI: [[GEP1:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C3]](s32) ; VI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p3) :: (load 2, addrspace 3) - ; VI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; VI: [[GEP2:%[0-9]+]]:_(p3) = G_GEP [[GEP1]], [[C]](s32) ; VI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p3) :: (load 2, addrspace 3) - ; VI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; VI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC2]](s16), [[TRUNC3]](s16) - ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[MV]](s32), [[MV1]](s32) + ; VI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; VI: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C1]] + ; VI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; VI: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C1]] + ; VI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C2]](s32) + ; VI: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[OR]](s32), [[OR1]](s32) ; VI: $vgpr0_vgpr1 = COPY 
[[BUILD_VECTOR]](<2 x s32>) ; GFX9-LABEL: name: test_extload_local_v2s32_from_4_align2 ; GFX9: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 ; GFX9: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 2, addrspace 3) - ; GFX9: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; GFX9: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; GFX9: [[GEP:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C]](s32) ; GFX9: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p3) :: (load 2, addrspace 3) - ; GFX9: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; GFX9: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; GFX9: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; GFX9: [[GEP1:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C1]](s32) + ; GFX9: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; GFX9: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; GFX9: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; GFX9: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; GFX9: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; GFX9: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; GFX9: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; GFX9: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; GFX9: [[GEP1:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C3]](s32) ; GFX9: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p3) :: (load 2, addrspace 3) - ; GFX9: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; GFX9: [[GEP2:%[0-9]+]]:_(p3) = G_GEP [[GEP1]], [[C]](s32) ; GFX9: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p3) :: (load 2, addrspace 3) - ; GFX9: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; GFX9: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC2]](s16), [[TRUNC3]](s16) - ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[MV]](s32), [[MV1]](s32) + ; GFX9: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; GFX9: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C1]] + ; GFX9: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; GFX9: 
[[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C1]] + ; GFX9: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C2]](s32) + ; GFX9: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[OR]](s32), [[OR1]](s32) ; GFX9: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<2 x s32>) %0:_(p3) = COPY $vgpr0 %1:_(<2 x s32>) = G_LOAD %0 :: (load 4, align 2, addrspace 3) @@ -10019,7 +11192,16 @@ body: | ; SI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY7]](s32) ; SI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) ; SI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; SI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) + ; SI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; SI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; SI: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; SI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C10]](s32) + ; SI: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; SI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; SI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; SI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C10]](s32) + ; SI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; SI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) ; SI: [[GEP7:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C8]](s32) ; SI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p3) :: (load 1, addrspace 3) ; SI: [[GEP8:%[0-9]+]]:_(p3) = G_GEP [[GEP7]], [[C]](s32) @@ -10033,23 +11215,26 @@ body: | ; SI: [[COPY9:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; SI: [[COPY10:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32) ; SI: [[AND9:%[0-9]+]]:_(s32) = G_AND [[COPY10]], [[C9]] - ; SI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY9]](s32) - ; SI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) - ; SI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] + ; SI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY9]](s32) + ; SI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) + ; 
SI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] ; SI: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; SI: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C7]] ; SI: [[COPY11:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; SI: [[COPY12:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32) ; SI: [[AND11:%[0-9]+]]:_(s32) = G_AND [[COPY12]], [[C9]] - ; SI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY11]](s32) - ; SI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL5]](s32) - ; SI: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] - ; SI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16) + ; SI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY11]](s32) + ; SI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) + ; SI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] + ; SI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; SI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; SI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C10]](s32) + ; SI: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL8]] ; SI: [[DEF:%[0-9]+]]:_(s96) = G_IMPLICIT_DEF ; SI: [[INSERT:%[0-9]+]]:_(s96) = G_INSERT [[DEF]], [[MV]](s64), 0 - ; SI: [[INSERT1:%[0-9]+]]:_(s96) = G_INSERT [[INSERT]], [[MV1]](s32), 64 - ; SI: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 - ; SI: [[GEP11:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C10]](s32) + ; SI: [[INSERT1:%[0-9]+]]:_(s96) = G_INSERT [[INSERT]], [[OR8]](s32), 64 + ; SI: [[C11:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 + ; SI: [[GEP11:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C11]](s32) ; SI: [[LOAD12:%[0-9]+]]:_(s32) = G_LOAD [[GEP11]](p3) :: (load 1, addrspace 3) ; SI: [[GEP12:%[0-9]+]]:_(p3) = G_GEP [[GEP11]], [[C]](s32) ; SI: [[LOAD13:%[0-9]+]]:_(s32) = G_LOAD [[GEP12]](p3) :: (load 1, addrspace 3) @@ -10070,34 +11255,42 @@ body: | ; SI: [[COPY13:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; SI: [[COPY14:%[0-9]+]]:_(s32) = COPY [[LOAD13]](s32) ; SI: [[AND13:%[0-9]+]]:_(s32) = G_AND [[COPY14]], [[C9]] - ; SI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY13]](s32) - ; 
SI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) - ; SI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] + ; SI: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY13]](s32) + ; SI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL9]](s32) + ; SI: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] ; SI: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; SI: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C7]] ; SI: [[COPY15:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; SI: [[COPY16:%[0-9]+]]:_(s32) = COPY [[LOAD15]](s32) ; SI: [[AND15:%[0-9]+]]:_(s32) = G_AND [[COPY16]], [[C9]] - ; SI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY15]](s32) - ; SI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) - ; SI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] + ; SI: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY15]](s32) + ; SI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL10]](s32) + ; SI: [[OR10:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] ; SI: [[TRUNC16:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD16]](s32) ; SI: [[AND16:%[0-9]+]]:_(s16) = G_AND [[TRUNC16]], [[C7]] ; SI: [[COPY17:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; SI: [[COPY18:%[0-9]+]]:_(s32) = COPY [[LOAD17]](s32) ; SI: [[AND17:%[0-9]+]]:_(s32) = G_AND [[COPY18]], [[C9]] - ; SI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[AND17]], [[COPY17]](s32) - ; SI: [[TRUNC17:%[0-9]+]]:_(s16) = G_TRUNC [[SHL8]](s32) - ; SI: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[TRUNC17]] + ; SI: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[AND17]], [[COPY17]](s32) + ; SI: [[TRUNC17:%[0-9]+]]:_(s16) = G_TRUNC [[SHL11]](s32) + ; SI: [[OR11:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[TRUNC17]] ; SI: [[TRUNC18:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD18]](s32) ; SI: [[AND18:%[0-9]+]]:_(s16) = G_AND [[TRUNC18]], [[C7]] ; SI: [[COPY19:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; SI: [[COPY20:%[0-9]+]]:_(s32) = COPY [[LOAD19]](s32) ; SI: [[AND19:%[0-9]+]]:_(s32) = G_AND [[COPY20]], [[C9]] - ; SI: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[AND19]], [[COPY19]](s32) - ; SI: 
[[TRUNC19:%[0-9]+]]:_(s16) = G_TRUNC [[SHL9]](s32) - ; SI: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[TRUNC19]] - ; SI: [[MV2:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR6]](s16), [[OR7]](s16), [[OR8]](s16), [[OR9]](s16) + ; SI: [[SHL12:%[0-9]+]]:_(s32) = G_SHL [[AND19]], [[COPY19]](s32) + ; SI: [[TRUNC19:%[0-9]+]]:_(s16) = G_TRUNC [[SHL12]](s32) + ; SI: [[OR12:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[TRUNC19]] + ; SI: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; SI: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR10]](s16) + ; SI: [[SHL13:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C10]](s32) + ; SI: [[OR13:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL13]] + ; SI: [[ZEXT8:%[0-9]+]]:_(s32) = G_ZEXT [[OR11]](s16) + ; SI: [[ZEXT9:%[0-9]+]]:_(s32) = G_ZEXT [[OR12]](s16) + ; SI: [[SHL14:%[0-9]+]]:_(s32) = G_SHL [[ZEXT9]], [[C10]](s32) + ; SI: [[OR14:%[0-9]+]]:_(s32) = G_OR [[ZEXT8]], [[SHL14]] + ; SI: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR13]](s32), [[OR14]](s32) ; SI: [[GEP19:%[0-9]+]]:_(p3) = G_GEP [[GEP11]], [[C8]](s32) ; SI: [[LOAD20:%[0-9]+]]:_(s32) = G_LOAD [[GEP19]](p3) :: (load 1, addrspace 3) ; SI: [[GEP20:%[0-9]+]]:_(p3) = G_GEP [[GEP19]], [[C]](s32) @@ -10111,21 +11304,24 @@ body: | ; SI: [[COPY21:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; SI: [[COPY22:%[0-9]+]]:_(s32) = COPY [[LOAD21]](s32) ; SI: [[AND21:%[0-9]+]]:_(s32) = G_AND [[COPY22]], [[C9]] - ; SI: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[AND21]], [[COPY21]](s32) - ; SI: [[TRUNC21:%[0-9]+]]:_(s16) = G_TRUNC [[SHL10]](s32) - ; SI: [[OR10:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[TRUNC21]] + ; SI: [[SHL15:%[0-9]+]]:_(s32) = G_SHL [[AND21]], [[COPY21]](s32) + ; SI: [[TRUNC21:%[0-9]+]]:_(s16) = G_TRUNC [[SHL15]](s32) + ; SI: [[OR15:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[TRUNC21]] ; SI: [[TRUNC22:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD22]](s32) ; SI: [[AND22:%[0-9]+]]:_(s16) = G_AND [[TRUNC22]], [[C7]] ; SI: [[COPY23:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; SI: [[COPY24:%[0-9]+]]:_(s32) = COPY [[LOAD23]](s32) ; SI: [[AND23:%[0-9]+]]:_(s32) = G_AND 
[[COPY24]], [[C9]] - ; SI: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[AND23]], [[COPY23]](s32) - ; SI: [[TRUNC23:%[0-9]+]]:_(s16) = G_TRUNC [[SHL11]](s32) - ; SI: [[OR11:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[TRUNC23]] - ; SI: [[MV3:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR10]](s16), [[OR11]](s16) + ; SI: [[SHL16:%[0-9]+]]:_(s32) = G_SHL [[AND23]], [[COPY23]](s32) + ; SI: [[TRUNC23:%[0-9]+]]:_(s16) = G_TRUNC [[SHL16]](s32) + ; SI: [[OR16:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[TRUNC23]] + ; SI: [[ZEXT10:%[0-9]+]]:_(s32) = G_ZEXT [[OR15]](s16) + ; SI: [[ZEXT11:%[0-9]+]]:_(s32) = G_ZEXT [[OR16]](s16) + ; SI: [[SHL17:%[0-9]+]]:_(s32) = G_SHL [[ZEXT11]], [[C10]](s32) + ; SI: [[OR17:%[0-9]+]]:_(s32) = G_OR [[ZEXT10]], [[SHL17]] ; SI: [[DEF1:%[0-9]+]]:_(s96) = G_IMPLICIT_DEF - ; SI: [[INSERT2:%[0-9]+]]:_(s96) = G_INSERT [[DEF1]], [[MV2]](s64), 0 - ; SI: [[INSERT3:%[0-9]+]]:_(s96) = G_INSERT [[INSERT2]], [[MV3]](s32), 64 + ; SI: [[INSERT2:%[0-9]+]]:_(s96) = G_INSERT [[DEF1]], [[MV1]](s64), 0 + ; SI: [[INSERT3:%[0-9]+]]:_(s96) = G_INSERT [[INSERT2]], [[OR17]](s32), 64 ; SI: [[COPY25:%[0-9]+]]:_(s96) = COPY [[INSERT1]](s96) ; SI: [[COPY26:%[0-9]+]]:_(s96) = COPY [[INSERT3]](s96) ; SI: $vgpr0_vgpr1_vgpr2 = COPY [[COPY25]](s96) @@ -10189,7 +11385,16 @@ body: | ; CI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY7]](s32) ; CI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) ; CI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; CI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) + ; CI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C10]](s32) + ; CI: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; CI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; CI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C10]](s32) + ; CI: 
[[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; CI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) ; CI: [[GEP7:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C8]](s32) ; CI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p3) :: (load 1, addrspace 3) ; CI: [[GEP8:%[0-9]+]]:_(p3) = G_GEP [[GEP7]], [[C]](s32) @@ -10203,23 +11408,26 @@ body: | ; CI: [[COPY9:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI: [[COPY10:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32) ; CI: [[AND9:%[0-9]+]]:_(s32) = G_AND [[COPY10]], [[C9]] - ; CI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY9]](s32) - ; CI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) - ; CI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] + ; CI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY9]](s32) + ; CI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) + ; CI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] ; CI: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; CI: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C7]] ; CI: [[COPY11:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI: [[COPY12:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32) ; CI: [[AND11:%[0-9]+]]:_(s32) = G_AND [[COPY12]], [[C9]] - ; CI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY11]](s32) - ; CI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL5]](s32) - ; CI: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] - ; CI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16) + ; CI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY11]](s32) + ; CI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) + ; CI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] + ; CI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; CI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; CI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C10]](s32) + ; CI: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL8]] ; CI: [[DEF:%[0-9]+]]:_(s96) = G_IMPLICIT_DEF ; CI: [[INSERT:%[0-9]+]]:_(s96) = G_INSERT [[DEF]], [[MV]](s64), 0 - ; CI: [[INSERT1:%[0-9]+]]:_(s96) = G_INSERT 
[[INSERT]], [[MV1]](s32), 64 - ; CI: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 - ; CI: [[GEP11:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C10]](s32) + ; CI: [[INSERT1:%[0-9]+]]:_(s96) = G_INSERT [[INSERT]], [[OR8]](s32), 64 + ; CI: [[C11:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 + ; CI: [[GEP11:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C11]](s32) ; CI: [[LOAD12:%[0-9]+]]:_(s32) = G_LOAD [[GEP11]](p3) :: (load 1, addrspace 3) ; CI: [[GEP12:%[0-9]+]]:_(p3) = G_GEP [[GEP11]], [[C]](s32) ; CI: [[LOAD13:%[0-9]+]]:_(s32) = G_LOAD [[GEP12]](p3) :: (load 1, addrspace 3) @@ -10240,34 +11448,42 @@ body: | ; CI: [[COPY13:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI: [[COPY14:%[0-9]+]]:_(s32) = COPY [[LOAD13]](s32) ; CI: [[AND13:%[0-9]+]]:_(s32) = G_AND [[COPY14]], [[C9]] - ; CI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY13]](s32) - ; CI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) - ; CI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] + ; CI: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY13]](s32) + ; CI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL9]](s32) + ; CI: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] ; CI: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; CI: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C7]] ; CI: [[COPY15:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI: [[COPY16:%[0-9]+]]:_(s32) = COPY [[LOAD15]](s32) ; CI: [[AND15:%[0-9]+]]:_(s32) = G_AND [[COPY16]], [[C9]] - ; CI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY15]](s32) - ; CI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) - ; CI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] + ; CI: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY15]](s32) + ; CI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL10]](s32) + ; CI: [[OR10:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] ; CI: [[TRUNC16:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD16]](s32) ; CI: [[AND16:%[0-9]+]]:_(s16) = G_AND [[TRUNC16]], [[C7]] ; CI: [[COPY17:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI: [[COPY18:%[0-9]+]]:_(s32) = COPY 
[[LOAD17]](s32) ; CI: [[AND17:%[0-9]+]]:_(s32) = G_AND [[COPY18]], [[C9]] - ; CI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[AND17]], [[COPY17]](s32) - ; CI: [[TRUNC17:%[0-9]+]]:_(s16) = G_TRUNC [[SHL8]](s32) - ; CI: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[TRUNC17]] + ; CI: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[AND17]], [[COPY17]](s32) + ; CI: [[TRUNC17:%[0-9]+]]:_(s16) = G_TRUNC [[SHL11]](s32) + ; CI: [[OR11:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[TRUNC17]] ; CI: [[TRUNC18:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD18]](s32) ; CI: [[AND18:%[0-9]+]]:_(s16) = G_AND [[TRUNC18]], [[C7]] ; CI: [[COPY19:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI: [[COPY20:%[0-9]+]]:_(s32) = COPY [[LOAD19]](s32) ; CI: [[AND19:%[0-9]+]]:_(s32) = G_AND [[COPY20]], [[C9]] - ; CI: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[AND19]], [[COPY19]](s32) - ; CI: [[TRUNC19:%[0-9]+]]:_(s16) = G_TRUNC [[SHL9]](s32) - ; CI: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[TRUNC19]] - ; CI: [[MV2:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR6]](s16), [[OR7]](s16), [[OR8]](s16), [[OR9]](s16) + ; CI: [[SHL12:%[0-9]+]]:_(s32) = G_SHL [[AND19]], [[COPY19]](s32) + ; CI: [[TRUNC19:%[0-9]+]]:_(s16) = G_TRUNC [[SHL12]](s32) + ; CI: [[OR12:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[TRUNC19]] + ; CI: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; CI: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR10]](s16) + ; CI: [[SHL13:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C10]](s32) + ; CI: [[OR13:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL13]] + ; CI: [[ZEXT8:%[0-9]+]]:_(s32) = G_ZEXT [[OR11]](s16) + ; CI: [[ZEXT9:%[0-9]+]]:_(s32) = G_ZEXT [[OR12]](s16) + ; CI: [[SHL14:%[0-9]+]]:_(s32) = G_SHL [[ZEXT9]], [[C10]](s32) + ; CI: [[OR14:%[0-9]+]]:_(s32) = G_OR [[ZEXT8]], [[SHL14]] + ; CI: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR13]](s32), [[OR14]](s32) ; CI: [[GEP19:%[0-9]+]]:_(p3) = G_GEP [[GEP11]], [[C8]](s32) ; CI: [[LOAD20:%[0-9]+]]:_(s32) = G_LOAD [[GEP19]](p3) :: (load 1, addrspace 3) ; CI: [[GEP20:%[0-9]+]]:_(p3) = G_GEP [[GEP19]], [[C]](s32) @@ -10281,21 +11497,24 @@ body: | ; 
CI: [[COPY21:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI: [[COPY22:%[0-9]+]]:_(s32) = COPY [[LOAD21]](s32) ; CI: [[AND21:%[0-9]+]]:_(s32) = G_AND [[COPY22]], [[C9]] - ; CI: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[AND21]], [[COPY21]](s32) - ; CI: [[TRUNC21:%[0-9]+]]:_(s16) = G_TRUNC [[SHL10]](s32) - ; CI: [[OR10:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[TRUNC21]] + ; CI: [[SHL15:%[0-9]+]]:_(s32) = G_SHL [[AND21]], [[COPY21]](s32) + ; CI: [[TRUNC21:%[0-9]+]]:_(s16) = G_TRUNC [[SHL15]](s32) + ; CI: [[OR15:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[TRUNC21]] ; CI: [[TRUNC22:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD22]](s32) ; CI: [[AND22:%[0-9]+]]:_(s16) = G_AND [[TRUNC22]], [[C7]] ; CI: [[COPY23:%[0-9]+]]:_(s32) = COPY [[C8]](s32) ; CI: [[COPY24:%[0-9]+]]:_(s32) = COPY [[LOAD23]](s32) ; CI: [[AND23:%[0-9]+]]:_(s32) = G_AND [[COPY24]], [[C9]] - ; CI: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[AND23]], [[COPY23]](s32) - ; CI: [[TRUNC23:%[0-9]+]]:_(s16) = G_TRUNC [[SHL11]](s32) - ; CI: [[OR11:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[TRUNC23]] - ; CI: [[MV3:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR10]](s16), [[OR11]](s16) + ; CI: [[SHL16:%[0-9]+]]:_(s32) = G_SHL [[AND23]], [[COPY23]](s32) + ; CI: [[TRUNC23:%[0-9]+]]:_(s16) = G_TRUNC [[SHL16]](s32) + ; CI: [[OR16:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[TRUNC23]] + ; CI: [[ZEXT10:%[0-9]+]]:_(s32) = G_ZEXT [[OR15]](s16) + ; CI: [[ZEXT11:%[0-9]+]]:_(s32) = G_ZEXT [[OR16]](s16) + ; CI: [[SHL17:%[0-9]+]]:_(s32) = G_SHL [[ZEXT11]], [[C10]](s32) + ; CI: [[OR17:%[0-9]+]]:_(s32) = G_OR [[ZEXT10]], [[SHL17]] ; CI: [[DEF1:%[0-9]+]]:_(s96) = G_IMPLICIT_DEF - ; CI: [[INSERT2:%[0-9]+]]:_(s96) = G_INSERT [[DEF1]], [[MV2]](s64), 0 - ; CI: [[INSERT3:%[0-9]+]]:_(s96) = G_INSERT [[INSERT2]], [[MV3]](s32), 64 + ; CI: [[INSERT2:%[0-9]+]]:_(s96) = G_INSERT [[DEF1]], [[MV1]](s64), 0 + ; CI: [[INSERT3:%[0-9]+]]:_(s96) = G_INSERT [[INSERT2]], [[OR17]](s32), 64 ; CI: [[COPY25:%[0-9]+]]:_(s96) = COPY [[INSERT1]](s96) ; CI: [[COPY26:%[0-9]+]]:_(s96) = COPY [[INSERT3]](s96) ; CI: 
$vgpr0_vgpr1_vgpr2 = COPY [[COPY25]](s96) @@ -10386,9 +11605,22 @@ body: | ; CI-DS128: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY11]](s32) ; CI-DS128: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL5]](s32) ; CI-DS128: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] - ; CI-DS128: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16), [[OR4]](s16), [[OR5]](s16) - ; CI-DS128: [[C13:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 - ; CI-DS128: [[GEP11:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C13]](s32) + ; CI-DS128: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI-DS128: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI-DS128: [[C13:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-DS128: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C13]](s32) + ; CI-DS128: [[OR6:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL6]] + ; CI-DS128: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; CI-DS128: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI-DS128: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C13]](s32) + ; CI-DS128: [[OR7:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL7]] + ; CI-DS128: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; CI-DS128: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR5]](s16) + ; CI-DS128: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C13]](s32) + ; CI-DS128: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL8]] + ; CI-DS128: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR6]](s32), [[OR7]](s32), [[OR8]](s32) + ; CI-DS128: [[C14:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 + ; CI-DS128: [[GEP11:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C14]](s32) ; CI-DS128: [[LOAD12:%[0-9]+]]:_(s32) = G_LOAD [[GEP11]](p3) :: (load 1, addrspace 3) ; CI-DS128: [[GEP12:%[0-9]+]]:_(p3) = G_GEP [[GEP11]], [[C]](s32) ; CI-DS128: [[LOAD13:%[0-9]+]]:_(s32) = G_LOAD [[GEP12]](p3) :: (load 1, addrspace 3) @@ -10417,50 +11649,62 @@ body: | ; CI-DS128: [[COPY13:%[0-9]+]]:_(s32) = COPY [[C7]](s32) ; CI-DS128: [[COPY14:%[0-9]+]]:_(s32) = COPY [[LOAD13]](s32) ; CI-DS128: 
[[AND13:%[0-9]+]]:_(s32) = G_AND [[COPY14]], [[C12]] - ; CI-DS128: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY13]](s32) - ; CI-DS128: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) - ; CI-DS128: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] + ; CI-DS128: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY13]](s32) + ; CI-DS128: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL9]](s32) + ; CI-DS128: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] ; CI-DS128: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; CI-DS128: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C11]] ; CI-DS128: [[COPY15:%[0-9]+]]:_(s32) = COPY [[C7]](s32) ; CI-DS128: [[COPY16:%[0-9]+]]:_(s32) = COPY [[LOAD15]](s32) ; CI-DS128: [[AND15:%[0-9]+]]:_(s32) = G_AND [[COPY16]], [[C12]] - ; CI-DS128: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY15]](s32) - ; CI-DS128: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) - ; CI-DS128: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] + ; CI-DS128: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY15]](s32) + ; CI-DS128: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL10]](s32) + ; CI-DS128: [[OR10:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] ; CI-DS128: [[TRUNC16:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD16]](s32) ; CI-DS128: [[AND16:%[0-9]+]]:_(s16) = G_AND [[TRUNC16]], [[C11]] ; CI-DS128: [[COPY17:%[0-9]+]]:_(s32) = COPY [[C7]](s32) ; CI-DS128: [[COPY18:%[0-9]+]]:_(s32) = COPY [[LOAD17]](s32) ; CI-DS128: [[AND17:%[0-9]+]]:_(s32) = G_AND [[COPY18]], [[C12]] - ; CI-DS128: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[AND17]], [[COPY17]](s32) - ; CI-DS128: [[TRUNC17:%[0-9]+]]:_(s16) = G_TRUNC [[SHL8]](s32) - ; CI-DS128: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[TRUNC17]] + ; CI-DS128: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[AND17]], [[COPY17]](s32) + ; CI-DS128: [[TRUNC17:%[0-9]+]]:_(s16) = G_TRUNC [[SHL11]](s32) + ; CI-DS128: [[OR11:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[TRUNC17]] ; CI-DS128: [[TRUNC18:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD18]](s32) ; CI-DS128: 
[[AND18:%[0-9]+]]:_(s16) = G_AND [[TRUNC18]], [[C11]] ; CI-DS128: [[COPY19:%[0-9]+]]:_(s32) = COPY [[C7]](s32) ; CI-DS128: [[COPY20:%[0-9]+]]:_(s32) = COPY [[LOAD19]](s32) ; CI-DS128: [[AND19:%[0-9]+]]:_(s32) = G_AND [[COPY20]], [[C12]] - ; CI-DS128: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[AND19]], [[COPY19]](s32) - ; CI-DS128: [[TRUNC19:%[0-9]+]]:_(s16) = G_TRUNC [[SHL9]](s32) - ; CI-DS128: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[TRUNC19]] + ; CI-DS128: [[SHL12:%[0-9]+]]:_(s32) = G_SHL [[AND19]], [[COPY19]](s32) + ; CI-DS128: [[TRUNC19:%[0-9]+]]:_(s16) = G_TRUNC [[SHL12]](s32) + ; CI-DS128: [[OR12:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[TRUNC19]] ; CI-DS128: [[TRUNC20:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD20]](s32) ; CI-DS128: [[AND20:%[0-9]+]]:_(s16) = G_AND [[TRUNC20]], [[C11]] ; CI-DS128: [[COPY21:%[0-9]+]]:_(s32) = COPY [[C7]](s32) ; CI-DS128: [[COPY22:%[0-9]+]]:_(s32) = COPY [[LOAD21]](s32) ; CI-DS128: [[AND21:%[0-9]+]]:_(s32) = G_AND [[COPY22]], [[C12]] - ; CI-DS128: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[AND21]], [[COPY21]](s32) - ; CI-DS128: [[TRUNC21:%[0-9]+]]:_(s16) = G_TRUNC [[SHL10]](s32) - ; CI-DS128: [[OR10:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[TRUNC21]] + ; CI-DS128: [[SHL13:%[0-9]+]]:_(s32) = G_SHL [[AND21]], [[COPY21]](s32) + ; CI-DS128: [[TRUNC21:%[0-9]+]]:_(s16) = G_TRUNC [[SHL13]](s32) + ; CI-DS128: [[OR13:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[TRUNC21]] ; CI-DS128: [[TRUNC22:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD22]](s32) ; CI-DS128: [[AND22:%[0-9]+]]:_(s16) = G_AND [[TRUNC22]], [[C11]] ; CI-DS128: [[COPY23:%[0-9]+]]:_(s32) = COPY [[C7]](s32) ; CI-DS128: [[COPY24:%[0-9]+]]:_(s32) = COPY [[LOAD23]](s32) ; CI-DS128: [[AND23:%[0-9]+]]:_(s32) = G_AND [[COPY24]], [[C12]] - ; CI-DS128: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[AND23]], [[COPY23]](s32) - ; CI-DS128: [[TRUNC23:%[0-9]+]]:_(s16) = G_TRUNC [[SHL11]](s32) - ; CI-DS128: [[OR11:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[TRUNC23]] - ; CI-DS128: [[MV1:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR6]](s16), [[OR7]](s16), [[OR8]](s16), 
[[OR9]](s16), [[OR10]](s16), [[OR11]](s16) + ; CI-DS128: [[SHL14:%[0-9]+]]:_(s32) = G_SHL [[AND23]], [[COPY23]](s32) + ; CI-DS128: [[TRUNC23:%[0-9]+]]:_(s16) = G_TRUNC [[SHL14]](s32) + ; CI-DS128: [[OR14:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[TRUNC23]] + ; CI-DS128: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; CI-DS128: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR10]](s16) + ; CI-DS128: [[SHL15:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C13]](s32) + ; CI-DS128: [[OR15:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL15]] + ; CI-DS128: [[ZEXT8:%[0-9]+]]:_(s32) = G_ZEXT [[OR11]](s16) + ; CI-DS128: [[ZEXT9:%[0-9]+]]:_(s32) = G_ZEXT [[OR12]](s16) + ; CI-DS128: [[SHL16:%[0-9]+]]:_(s32) = G_SHL [[ZEXT9]], [[C13]](s32) + ; CI-DS128: [[OR16:%[0-9]+]]:_(s32) = G_OR [[ZEXT8]], [[SHL16]] + ; CI-DS128: [[ZEXT10:%[0-9]+]]:_(s32) = G_ZEXT [[OR13]](s16) + ; CI-DS128: [[ZEXT11:%[0-9]+]]:_(s32) = G_ZEXT [[OR14]](s16) + ; CI-DS128: [[SHL17:%[0-9]+]]:_(s32) = G_SHL [[ZEXT11]], [[C13]](s32) + ; CI-DS128: [[OR17:%[0-9]+]]:_(s32) = G_OR [[ZEXT10]], [[SHL17]] + ; CI-DS128: [[MV1:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR15]](s32), [[OR16]](s32), [[OR17]](s32) ; CI-DS128: [[COPY25:%[0-9]+]]:_(s96) = COPY [[MV]](s96) ; CI-DS128: [[COPY26:%[0-9]+]]:_(s96) = COPY [[MV1]](s96) ; CI-DS128: $vgpr0_vgpr1_vgpr2 = COPY [[COPY25]](s96) @@ -10515,9 +11759,18 @@ body: | ; VI: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C7]] ; VI: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C8]](s16) ; VI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; VI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) - ; VI: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 - ; VI: [[GEP7:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C9]](s32) + ; VI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; VI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; VI: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C9]](s32) + ; VI: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + 
; VI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; VI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; VI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C9]](s32) + ; VI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; VI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) + ; VI: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 + ; VI: [[GEP7:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C10]](s32) ; VI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p3) :: (load 1, addrspace 3) ; VI: [[GEP8:%[0-9]+]]:_(p3) = G_GEP [[GEP7]], [[C]](s32) ; VI: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p3) :: (load 1, addrspace 3) @@ -10529,20 +11782,23 @@ body: | ; VI: [[AND8:%[0-9]+]]:_(s16) = G_AND [[TRUNC8]], [[C7]] ; VI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD9]](s32) ; VI: [[AND9:%[0-9]+]]:_(s16) = G_AND [[TRUNC9]], [[C7]] - ; VI: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) - ; VI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL4]] + ; VI: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) + ; VI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL6]] ; VI: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; VI: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C7]] ; VI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD11]](s32) ; VI: [[AND11:%[0-9]+]]:_(s16) = G_AND [[TRUNC11]], [[C7]] - ; VI: [[SHL5:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) - ; VI: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL5]] - ; VI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16) + ; VI: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) + ; VI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL7]] + ; VI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; VI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; VI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C9]](s32) + ; VI: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL8]] ; VI: [[DEF:%[0-9]+]]:_(s96) = G_IMPLICIT_DEF ; VI: [[INSERT:%[0-9]+]]:_(s96) = G_INSERT [[DEF]], [[MV]](s64), 0 - ; VI: 
[[INSERT1:%[0-9]+]]:_(s96) = G_INSERT [[INSERT]], [[MV1]](s32), 64 - ; VI: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 - ; VI: [[GEP11:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C10]](s32) + ; VI: [[INSERT1:%[0-9]+]]:_(s96) = G_INSERT [[INSERT]], [[OR8]](s32), 64 + ; VI: [[C11:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 + ; VI: [[GEP11:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C11]](s32) ; VI: [[LOAD12:%[0-9]+]]:_(s32) = G_LOAD [[GEP11]](p3) :: (load 1, addrspace 3) ; VI: [[GEP12:%[0-9]+]]:_(p3) = G_GEP [[GEP11]], [[C]](s32) ; VI: [[LOAD13:%[0-9]+]]:_(s32) = G_LOAD [[GEP12]](p3) :: (load 1, addrspace 3) @@ -10562,28 +11818,36 @@ body: | ; VI: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C7]] ; VI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD13]](s32) ; VI: [[AND13:%[0-9]+]]:_(s16) = G_AND [[TRUNC13]], [[C7]] - ; VI: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C8]](s16) - ; VI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL6]] + ; VI: [[SHL9:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C8]](s16) + ; VI: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL9]] ; VI: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; VI: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C7]] ; VI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD15]](s32) ; VI: [[AND15:%[0-9]+]]:_(s16) = G_AND [[TRUNC15]], [[C7]] - ; VI: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C8]](s16) - ; VI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL7]] + ; VI: [[SHL10:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C8]](s16) + ; VI: [[OR10:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL10]] ; VI: [[TRUNC16:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD16]](s32) ; VI: [[AND16:%[0-9]+]]:_(s16) = G_AND [[TRUNC16]], [[C7]] ; VI: [[TRUNC17:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD17]](s32) ; VI: [[AND17:%[0-9]+]]:_(s16) = G_AND [[TRUNC17]], [[C7]] - ; VI: [[SHL8:%[0-9]+]]:_(s16) = G_SHL [[AND17]], [[C8]](s16) - ; VI: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[SHL8]] + ; VI: [[SHL11:%[0-9]+]]:_(s16) = G_SHL [[AND17]], [[C8]](s16) + ; VI: [[OR11:%[0-9]+]]:_(s16) = G_OR [[AND16]], 
[[SHL11]] ; VI: [[TRUNC18:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD18]](s32) ; VI: [[AND18:%[0-9]+]]:_(s16) = G_AND [[TRUNC18]], [[C7]] ; VI: [[TRUNC19:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD19]](s32) ; VI: [[AND19:%[0-9]+]]:_(s16) = G_AND [[TRUNC19]], [[C7]] - ; VI: [[SHL9:%[0-9]+]]:_(s16) = G_SHL [[AND19]], [[C8]](s16) - ; VI: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[SHL9]] - ; VI: [[MV2:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR6]](s16), [[OR7]](s16), [[OR8]](s16), [[OR9]](s16) - ; VI: [[GEP19:%[0-9]+]]:_(p3) = G_GEP [[GEP11]], [[C9]](s32) + ; VI: [[SHL12:%[0-9]+]]:_(s16) = G_SHL [[AND19]], [[C8]](s16) + ; VI: [[OR12:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[SHL12]] + ; VI: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; VI: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR10]](s16) + ; VI: [[SHL13:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C9]](s32) + ; VI: [[OR13:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL13]] + ; VI: [[ZEXT8:%[0-9]+]]:_(s32) = G_ZEXT [[OR11]](s16) + ; VI: [[ZEXT9:%[0-9]+]]:_(s32) = G_ZEXT [[OR12]](s16) + ; VI: [[SHL14:%[0-9]+]]:_(s32) = G_SHL [[ZEXT9]], [[C9]](s32) + ; VI: [[OR14:%[0-9]+]]:_(s32) = G_OR [[ZEXT8]], [[SHL14]] + ; VI: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR13]](s32), [[OR14]](s32) + ; VI: [[GEP19:%[0-9]+]]:_(p3) = G_GEP [[GEP11]], [[C10]](s32) ; VI: [[LOAD20:%[0-9]+]]:_(s32) = G_LOAD [[GEP19]](p3) :: (load 1, addrspace 3) ; VI: [[GEP20:%[0-9]+]]:_(p3) = G_GEP [[GEP19]], [[C]](s32) ; VI: [[LOAD21:%[0-9]+]]:_(s32) = G_LOAD [[GEP20]](p3) :: (load 1, addrspace 3) @@ -10595,18 +11859,21 @@ body: | ; VI: [[AND20:%[0-9]+]]:_(s16) = G_AND [[TRUNC20]], [[C7]] ; VI: [[TRUNC21:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD21]](s32) ; VI: [[AND21:%[0-9]+]]:_(s16) = G_AND [[TRUNC21]], [[C7]] - ; VI: [[SHL10:%[0-9]+]]:_(s16) = G_SHL [[AND21]], [[C8]](s16) - ; VI: [[OR10:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[SHL10]] + ; VI: [[SHL15:%[0-9]+]]:_(s16) = G_SHL [[AND21]], [[C8]](s16) + ; VI: [[OR15:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[SHL15]] ; VI: [[TRUNC22:%[0-9]+]]:_(s16) = G_TRUNC 
[[LOAD22]](s32) ; VI: [[AND22:%[0-9]+]]:_(s16) = G_AND [[TRUNC22]], [[C7]] ; VI: [[TRUNC23:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD23]](s32) ; VI: [[AND23:%[0-9]+]]:_(s16) = G_AND [[TRUNC23]], [[C7]] - ; VI: [[SHL11:%[0-9]+]]:_(s16) = G_SHL [[AND23]], [[C8]](s16) - ; VI: [[OR11:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[SHL11]] - ; VI: [[MV3:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR10]](s16), [[OR11]](s16) + ; VI: [[SHL16:%[0-9]+]]:_(s16) = G_SHL [[AND23]], [[C8]](s16) + ; VI: [[OR16:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[SHL16]] + ; VI: [[ZEXT10:%[0-9]+]]:_(s32) = G_ZEXT [[OR15]](s16) + ; VI: [[ZEXT11:%[0-9]+]]:_(s32) = G_ZEXT [[OR16]](s16) + ; VI: [[SHL17:%[0-9]+]]:_(s32) = G_SHL [[ZEXT11]], [[C9]](s32) + ; VI: [[OR17:%[0-9]+]]:_(s32) = G_OR [[ZEXT10]], [[SHL17]] ; VI: [[DEF1:%[0-9]+]]:_(s96) = G_IMPLICIT_DEF - ; VI: [[INSERT2:%[0-9]+]]:_(s96) = G_INSERT [[DEF1]], [[MV2]](s64), 0 - ; VI: [[INSERT3:%[0-9]+]]:_(s96) = G_INSERT [[INSERT2]], [[MV3]](s32), 64 + ; VI: [[INSERT2:%[0-9]+]]:_(s96) = G_INSERT [[DEF1]], [[MV1]](s64), 0 + ; VI: [[INSERT3:%[0-9]+]]:_(s96) = G_INSERT [[INSERT2]], [[OR17]](s32), 64 ; VI: [[COPY1:%[0-9]+]]:_(s96) = COPY [[INSERT1]](s96) ; VI: [[COPY2:%[0-9]+]]:_(s96) = COPY [[INSERT3]](s96) ; VI: $vgpr0_vgpr1_vgpr2 = COPY [[COPY1]](s96) @@ -10661,9 +11928,18 @@ body: | ; GFX9: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C7]] ; GFX9: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C8]](s16) ; GFX9: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; GFX9: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) - ; GFX9: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 - ; GFX9: [[GEP7:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C9]](s32) + ; GFX9: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C9]](s32) + ; GFX9: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; GFX9: 
[[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; GFX9: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; GFX9: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C9]](s32) + ; GFX9: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; GFX9: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) + ; GFX9: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 + ; GFX9: [[GEP7:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C10]](s32) ; GFX9: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p3) :: (load 1, addrspace 3) ; GFX9: [[GEP8:%[0-9]+]]:_(p3) = G_GEP [[GEP7]], [[C]](s32) ; GFX9: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p3) :: (load 1, addrspace 3) @@ -10675,20 +11951,23 @@ body: | ; GFX9: [[AND8:%[0-9]+]]:_(s16) = G_AND [[TRUNC8]], [[C7]] ; GFX9: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD9]](s32) ; GFX9: [[AND9:%[0-9]+]]:_(s16) = G_AND [[TRUNC9]], [[C7]] - ; GFX9: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) - ; GFX9: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL4]] + ; GFX9: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C8]](s16) + ; GFX9: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL6]] ; GFX9: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; GFX9: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C7]] ; GFX9: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD11]](s32) ; GFX9: [[AND11:%[0-9]+]]:_(s16) = G_AND [[TRUNC11]], [[C7]] - ; GFX9: [[SHL5:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) - ; GFX9: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL5]] - ; GFX9: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16) + ; GFX9: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C8]](s16) + ; GFX9: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL7]] + ; GFX9: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; GFX9: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; GFX9: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C9]](s32) + ; GFX9: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL8]] ; GFX9: [[DEF:%[0-9]+]]:_(s96) = G_IMPLICIT_DEF ; GFX9: [[INSERT:%[0-9]+]]:_(s96) = 
G_INSERT [[DEF]], [[MV]](s64), 0 - ; GFX9: [[INSERT1:%[0-9]+]]:_(s96) = G_INSERT [[INSERT]], [[MV1]](s32), 64 - ; GFX9: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 - ; GFX9: [[GEP11:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C10]](s32) + ; GFX9: [[INSERT1:%[0-9]+]]:_(s96) = G_INSERT [[INSERT]], [[OR8]](s32), 64 + ; GFX9: [[C11:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 + ; GFX9: [[GEP11:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C11]](s32) ; GFX9: [[LOAD12:%[0-9]+]]:_(s32) = G_LOAD [[GEP11]](p3) :: (load 1, addrspace 3) ; GFX9: [[GEP12:%[0-9]+]]:_(p3) = G_GEP [[GEP11]], [[C]](s32) ; GFX9: [[LOAD13:%[0-9]+]]:_(s32) = G_LOAD [[GEP12]](p3) :: (load 1, addrspace 3) @@ -10708,28 +11987,36 @@ body: | ; GFX9: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C7]] ; GFX9: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD13]](s32) ; GFX9: [[AND13:%[0-9]+]]:_(s16) = G_AND [[TRUNC13]], [[C7]] - ; GFX9: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C8]](s16) - ; GFX9: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL6]] + ; GFX9: [[SHL9:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C8]](s16) + ; GFX9: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL9]] ; GFX9: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; GFX9: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C7]] ; GFX9: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD15]](s32) ; GFX9: [[AND15:%[0-9]+]]:_(s16) = G_AND [[TRUNC15]], [[C7]] - ; GFX9: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C8]](s16) - ; GFX9: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL7]] + ; GFX9: [[SHL10:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C8]](s16) + ; GFX9: [[OR10:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL10]] ; GFX9: [[TRUNC16:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD16]](s32) ; GFX9: [[AND16:%[0-9]+]]:_(s16) = G_AND [[TRUNC16]], [[C7]] ; GFX9: [[TRUNC17:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD17]](s32) ; GFX9: [[AND17:%[0-9]+]]:_(s16) = G_AND [[TRUNC17]], [[C7]] - ; GFX9: [[SHL8:%[0-9]+]]:_(s16) = G_SHL [[AND17]], [[C8]](s16) - ; GFX9: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[SHL8]] + ; GFX9: 
[[SHL11:%[0-9]+]]:_(s16) = G_SHL [[AND17]], [[C8]](s16) + ; GFX9: [[OR11:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[SHL11]] ; GFX9: [[TRUNC18:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD18]](s32) ; GFX9: [[AND18:%[0-9]+]]:_(s16) = G_AND [[TRUNC18]], [[C7]] ; GFX9: [[TRUNC19:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD19]](s32) ; GFX9: [[AND19:%[0-9]+]]:_(s16) = G_AND [[TRUNC19]], [[C7]] - ; GFX9: [[SHL9:%[0-9]+]]:_(s16) = G_SHL [[AND19]], [[C8]](s16) - ; GFX9: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[SHL9]] - ; GFX9: [[MV2:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR6]](s16), [[OR7]](s16), [[OR8]](s16), [[OR9]](s16) - ; GFX9: [[GEP19:%[0-9]+]]:_(p3) = G_GEP [[GEP11]], [[C9]](s32) + ; GFX9: [[SHL12:%[0-9]+]]:_(s16) = G_SHL [[AND19]], [[C8]](s16) + ; GFX9: [[OR12:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[SHL12]] + ; GFX9: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; GFX9: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR10]](s16) + ; GFX9: [[SHL13:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C9]](s32) + ; GFX9: [[OR13:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL13]] + ; GFX9: [[ZEXT8:%[0-9]+]]:_(s32) = G_ZEXT [[OR11]](s16) + ; GFX9: [[ZEXT9:%[0-9]+]]:_(s32) = G_ZEXT [[OR12]](s16) + ; GFX9: [[SHL14:%[0-9]+]]:_(s32) = G_SHL [[ZEXT9]], [[C9]](s32) + ; GFX9: [[OR14:%[0-9]+]]:_(s32) = G_OR [[ZEXT8]], [[SHL14]] + ; GFX9: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR13]](s32), [[OR14]](s32) + ; GFX9: [[GEP19:%[0-9]+]]:_(p3) = G_GEP [[GEP11]], [[C10]](s32) ; GFX9: [[LOAD20:%[0-9]+]]:_(s32) = G_LOAD [[GEP19]](p3) :: (load 1, addrspace 3) ; GFX9: [[GEP20:%[0-9]+]]:_(p3) = G_GEP [[GEP19]], [[C]](s32) ; GFX9: [[LOAD21:%[0-9]+]]:_(s32) = G_LOAD [[GEP20]](p3) :: (load 1, addrspace 3) @@ -10741,18 +12028,21 @@ body: | ; GFX9: [[AND20:%[0-9]+]]:_(s16) = G_AND [[TRUNC20]], [[C7]] ; GFX9: [[TRUNC21:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD21]](s32) ; GFX9: [[AND21:%[0-9]+]]:_(s16) = G_AND [[TRUNC21]], [[C7]] - ; GFX9: [[SHL10:%[0-9]+]]:_(s16) = G_SHL [[AND21]], [[C8]](s16) - ; GFX9: [[OR10:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[SHL10]] + ; GFX9: 
[[SHL15:%[0-9]+]]:_(s16) = G_SHL [[AND21]], [[C8]](s16) + ; GFX9: [[OR15:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[SHL15]] ; GFX9: [[TRUNC22:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD22]](s32) ; GFX9: [[AND22:%[0-9]+]]:_(s16) = G_AND [[TRUNC22]], [[C7]] ; GFX9: [[TRUNC23:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD23]](s32) ; GFX9: [[AND23:%[0-9]+]]:_(s16) = G_AND [[TRUNC23]], [[C7]] - ; GFX9: [[SHL11:%[0-9]+]]:_(s16) = G_SHL [[AND23]], [[C8]](s16) - ; GFX9: [[OR11:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[SHL11]] - ; GFX9: [[MV3:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR10]](s16), [[OR11]](s16) + ; GFX9: [[SHL16:%[0-9]+]]:_(s16) = G_SHL [[AND23]], [[C8]](s16) + ; GFX9: [[OR16:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[SHL16]] + ; GFX9: [[ZEXT10:%[0-9]+]]:_(s32) = G_ZEXT [[OR15]](s16) + ; GFX9: [[ZEXT11:%[0-9]+]]:_(s32) = G_ZEXT [[OR16]](s16) + ; GFX9: [[SHL17:%[0-9]+]]:_(s32) = G_SHL [[ZEXT11]], [[C9]](s32) + ; GFX9: [[OR17:%[0-9]+]]:_(s32) = G_OR [[ZEXT10]], [[SHL17]] ; GFX9: [[DEF1:%[0-9]+]]:_(s96) = G_IMPLICIT_DEF - ; GFX9: [[INSERT2:%[0-9]+]]:_(s96) = G_INSERT [[DEF1]], [[MV2]](s64), 0 - ; GFX9: [[INSERT3:%[0-9]+]]:_(s96) = G_INSERT [[INSERT2]], [[MV3]](s32), 64 + ; GFX9: [[INSERT2:%[0-9]+]]:_(s96) = G_INSERT [[DEF1]], [[MV1]](s64), 0 + ; GFX9: [[INSERT3:%[0-9]+]]:_(s96) = G_INSERT [[INSERT2]], [[OR17]](s32), 64 ; GFX9: [[COPY1:%[0-9]+]]:_(s96) = COPY [[INSERT1]](s96) ; GFX9: [[COPY2:%[0-9]+]]:_(s96) = COPY [[INSERT3]](s96) ; GFX9: $vgpr0_vgpr1_vgpr2 = COPY [[COPY1]](s96) @@ -10774,276 +12064,398 @@ body: | ; SI-LABEL: name: test_extload_local_v2s96_from_24_align2 ; SI: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 ; SI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 2, addrspace 3) - ; SI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; SI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; SI: [[GEP:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C]](s32) ; SI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p3) :: (load 2, addrspace 3) - ; SI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; SI: [[C1:%[0-9]+]]:_(s32) 
= G_CONSTANT i32 4 ; SI: [[GEP1:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C1]](s32) ; SI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p3) :: (load 2, addrspace 3) - ; SI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; SI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 6 ; SI: [[GEP2:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C2]](s32) ; SI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p3) :: (load 2, addrspace 3) - ; SI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; SI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) - ; SI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 - ; SI: [[GEP3:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C3]](s32) + ; SI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; SI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; SI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C3]] + ; SI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; SI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C3]] + ; SI: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; SI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C4]](s32) + ; SI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; SI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; SI: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C3]] + ; SI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; SI: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C3]] + ; SI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) + ; SI: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; SI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32) + ; SI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 + ; SI: [[GEP3:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C5]](s32) ; SI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p3) :: (load 2, addrspace 3) - ; SI: [[TRUNC4:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD4]](s32) ; SI: [[GEP4:%[0-9]+]]:_(p3) = G_GEP [[GEP3]], [[C]](s32) ; SI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p3) :: (load 2, addrspace 3) - ; SI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) - ; SI: 
[[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC4]](s16), [[TRUNC5]](s16) + ; SI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD4]](s32) + ; SI: [[AND4:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C3]] + ; SI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) + ; SI: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C3]] + ; SI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[C4]](s32) + ; SI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[AND4]], [[SHL2]] ; SI: [[DEF:%[0-9]+]]:_(s96) = G_IMPLICIT_DEF ; SI: [[INSERT:%[0-9]+]]:_(s96) = G_INSERT [[DEF]], [[MV]](s64), 0 - ; SI: [[INSERT1:%[0-9]+]]:_(s96) = G_INSERT [[INSERT]], [[MV1]](s32), 64 - ; SI: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 - ; SI: [[GEP5:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C4]](s32) + ; SI: [[INSERT1:%[0-9]+]]:_(s96) = G_INSERT [[INSERT]], [[OR2]](s32), 64 + ; SI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 + ; SI: [[GEP5:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C6]](s32) ; SI: [[LOAD6:%[0-9]+]]:_(s32) = G_LOAD [[GEP5]](p3) :: (load 2, addrspace 3) - ; SI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; SI: [[GEP6:%[0-9]+]]:_(p3) = G_GEP [[GEP5]], [[C]](s32) ; SI: [[LOAD7:%[0-9]+]]:_(s32) = G_LOAD [[GEP6]](p3) :: (load 2, addrspace 3) - ; SI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) ; SI: [[GEP7:%[0-9]+]]:_(p3) = G_GEP [[GEP5]], [[C1]](s32) ; SI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p3) :: (load 2, addrspace 3) - ; SI: [[TRUNC8:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD8]](s32) ; SI: [[GEP8:%[0-9]+]]:_(p3) = G_GEP [[GEP5]], [[C2]](s32) ; SI: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p3) :: (load 2, addrspace 3) - ; SI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD9]](s32) - ; SI: [[MV2:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[TRUNC6]](s16), [[TRUNC7]](s16), [[TRUNC8]](s16), [[TRUNC9]](s16) - ; SI: [[GEP9:%[0-9]+]]:_(p3) = G_GEP [[GEP5]], [[C3]](s32) + ; SI: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LOAD6]](s32) + ; SI: [[AND6:%[0-9]+]]:_(s32) = G_AND [[COPY7]], [[C3]] + ; SI: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) + ; SI: 
[[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY8]], [[C3]] + ; SI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C4]](s32) + ; SI: [[OR3:%[0-9]+]]:_(s32) = G_OR [[AND6]], [[SHL3]] + ; SI: [[COPY9:%[0-9]+]]:_(s32) = COPY [[LOAD8]](s32) + ; SI: [[AND8:%[0-9]+]]:_(s32) = G_AND [[COPY9]], [[C3]] + ; SI: [[COPY10:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32) + ; SI: [[AND9:%[0-9]+]]:_(s32) = G_AND [[COPY10]], [[C3]] + ; SI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[C4]](s32) + ; SI: [[OR4:%[0-9]+]]:_(s32) = G_OR [[AND8]], [[SHL4]] + ; SI: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR3]](s32), [[OR4]](s32) + ; SI: [[GEP9:%[0-9]+]]:_(p3) = G_GEP [[GEP5]], [[C5]](s32) ; SI: [[LOAD10:%[0-9]+]]:_(s32) = G_LOAD [[GEP9]](p3) :: (load 2, addrspace 3) - ; SI: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; SI: [[GEP10:%[0-9]+]]:_(p3) = G_GEP [[GEP9]], [[C]](s32) ; SI: [[LOAD11:%[0-9]+]]:_(s32) = G_LOAD [[GEP10]](p3) :: (load 2, addrspace 3) - ; SI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD11]](s32) - ; SI: [[MV3:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC10]](s16), [[TRUNC11]](s16) + ; SI: [[COPY11:%[0-9]+]]:_(s32) = COPY [[LOAD10]](s32) + ; SI: [[AND10:%[0-9]+]]:_(s32) = G_AND [[COPY11]], [[C3]] + ; SI: [[COPY12:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32) + ; SI: [[AND11:%[0-9]+]]:_(s32) = G_AND [[COPY12]], [[C3]] + ; SI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[C4]](s32) + ; SI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[AND10]], [[SHL5]] ; SI: [[DEF1:%[0-9]+]]:_(s96) = G_IMPLICIT_DEF - ; SI: [[INSERT2:%[0-9]+]]:_(s96) = G_INSERT [[DEF1]], [[MV2]](s64), 0 - ; SI: [[INSERT3:%[0-9]+]]:_(s96) = G_INSERT [[INSERT2]], [[MV3]](s32), 64 - ; SI: [[COPY1:%[0-9]+]]:_(s96) = COPY [[INSERT1]](s96) - ; SI: [[COPY2:%[0-9]+]]:_(s96) = COPY [[INSERT3]](s96) - ; SI: $vgpr0_vgpr1_vgpr2 = COPY [[COPY1]](s96) - ; SI: $vgpr3_vgpr4_vgpr5 = COPY [[COPY2]](s96) + ; SI: [[INSERT2:%[0-9]+]]:_(s96) = G_INSERT [[DEF1]], [[MV1]](s64), 0 + ; SI: [[INSERT3:%[0-9]+]]:_(s96) = G_INSERT [[INSERT2]], [[OR5]](s32), 64 + ; SI: 
[[COPY13:%[0-9]+]]:_(s96) = COPY [[INSERT1]](s96) + ; SI: [[COPY14:%[0-9]+]]:_(s96) = COPY [[INSERT3]](s96) + ; SI: $vgpr0_vgpr1_vgpr2 = COPY [[COPY13]](s96) + ; SI: $vgpr3_vgpr4_vgpr5 = COPY [[COPY14]](s96) ; CI-LABEL: name: test_extload_local_v2s96_from_24_align2 ; CI: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 ; CI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 2, addrspace 3) - ; CI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; CI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; CI: [[GEP:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C]](s32) ; CI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p3) :: (load 2, addrspace 3) - ; CI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; CI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 ; CI: [[GEP1:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C1]](s32) ; CI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p3) :: (load 2, addrspace 3) - ; CI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; CI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 6 ; CI: [[GEP2:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C2]](s32) ; CI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p3) :: (load 2, addrspace 3) - ; CI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; CI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) - ; CI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 - ; CI: [[GEP3:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C3]](s32) + ; CI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; CI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; CI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C3]] + ; CI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; CI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C3]] + ; CI: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C4]](s32) + ; CI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; CI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; CI: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C3]] + ; CI: [[COPY4:%[0-9]+]]:_(s32) = COPY 
[[LOAD3]](s32) + ; CI: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C3]] + ; CI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) + ; CI: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; CI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32) + ; CI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 + ; CI: [[GEP3:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C5]](s32) ; CI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p3) :: (load 2, addrspace 3) - ; CI: [[TRUNC4:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD4]](s32) ; CI: [[GEP4:%[0-9]+]]:_(p3) = G_GEP [[GEP3]], [[C]](s32) ; CI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p3) :: (load 2, addrspace 3) - ; CI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) - ; CI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC4]](s16), [[TRUNC5]](s16) + ; CI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD4]](s32) + ; CI: [[AND4:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C3]] + ; CI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) + ; CI: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C3]] + ; CI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[C4]](s32) + ; CI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[AND4]], [[SHL2]] ; CI: [[DEF:%[0-9]+]]:_(s96) = G_IMPLICIT_DEF ; CI: [[INSERT:%[0-9]+]]:_(s96) = G_INSERT [[DEF]], [[MV]](s64), 0 - ; CI: [[INSERT1:%[0-9]+]]:_(s96) = G_INSERT [[INSERT]], [[MV1]](s32), 64 - ; CI: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 - ; CI: [[GEP5:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C4]](s32) + ; CI: [[INSERT1:%[0-9]+]]:_(s96) = G_INSERT [[INSERT]], [[OR2]](s32), 64 + ; CI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 + ; CI: [[GEP5:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C6]](s32) ; CI: [[LOAD6:%[0-9]+]]:_(s32) = G_LOAD [[GEP5]](p3) :: (load 2, addrspace 3) - ; CI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; CI: [[GEP6:%[0-9]+]]:_(p3) = G_GEP [[GEP5]], [[C]](s32) ; CI: [[LOAD7:%[0-9]+]]:_(s32) = G_LOAD [[GEP6]](p3) :: (load 2, addrspace 3) - ; CI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) ; CI: [[GEP7:%[0-9]+]]:_(p3) = G_GEP [[GEP5]], 
[[C1]](s32) ; CI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p3) :: (load 2, addrspace 3) - ; CI: [[TRUNC8:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD8]](s32) ; CI: [[GEP8:%[0-9]+]]:_(p3) = G_GEP [[GEP5]], [[C2]](s32) ; CI: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p3) :: (load 2, addrspace 3) - ; CI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD9]](s32) - ; CI: [[MV2:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[TRUNC6]](s16), [[TRUNC7]](s16), [[TRUNC8]](s16), [[TRUNC9]](s16) - ; CI: [[GEP9:%[0-9]+]]:_(p3) = G_GEP [[GEP5]], [[C3]](s32) + ; CI: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LOAD6]](s32) + ; CI: [[AND6:%[0-9]+]]:_(s32) = G_AND [[COPY7]], [[C3]] + ; CI: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) + ; CI: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY8]], [[C3]] + ; CI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C4]](s32) + ; CI: [[OR3:%[0-9]+]]:_(s32) = G_OR [[AND6]], [[SHL3]] + ; CI: [[COPY9:%[0-9]+]]:_(s32) = COPY [[LOAD8]](s32) + ; CI: [[AND8:%[0-9]+]]:_(s32) = G_AND [[COPY9]], [[C3]] + ; CI: [[COPY10:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32) + ; CI: [[AND9:%[0-9]+]]:_(s32) = G_AND [[COPY10]], [[C3]] + ; CI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[C4]](s32) + ; CI: [[OR4:%[0-9]+]]:_(s32) = G_OR [[AND8]], [[SHL4]] + ; CI: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR3]](s32), [[OR4]](s32) + ; CI: [[GEP9:%[0-9]+]]:_(p3) = G_GEP [[GEP5]], [[C5]](s32) ; CI: [[LOAD10:%[0-9]+]]:_(s32) = G_LOAD [[GEP9]](p3) :: (load 2, addrspace 3) - ; CI: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; CI: [[GEP10:%[0-9]+]]:_(p3) = G_GEP [[GEP9]], [[C]](s32) ; CI: [[LOAD11:%[0-9]+]]:_(s32) = G_LOAD [[GEP10]](p3) :: (load 2, addrspace 3) - ; CI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD11]](s32) - ; CI: [[MV3:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC10]](s16), [[TRUNC11]](s16) + ; CI: [[COPY11:%[0-9]+]]:_(s32) = COPY [[LOAD10]](s32) + ; CI: [[AND10:%[0-9]+]]:_(s32) = G_AND [[COPY11]], [[C3]] + ; CI: [[COPY12:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32) + ; CI: [[AND11:%[0-9]+]]:_(s32) = G_AND [[COPY12]], 
[[C3]] + ; CI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[C4]](s32) + ; CI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[AND10]], [[SHL5]] ; CI: [[DEF1:%[0-9]+]]:_(s96) = G_IMPLICIT_DEF - ; CI: [[INSERT2:%[0-9]+]]:_(s96) = G_INSERT [[DEF1]], [[MV2]](s64), 0 - ; CI: [[INSERT3:%[0-9]+]]:_(s96) = G_INSERT [[INSERT2]], [[MV3]](s32), 64 - ; CI: [[COPY1:%[0-9]+]]:_(s96) = COPY [[INSERT1]](s96) - ; CI: [[COPY2:%[0-9]+]]:_(s96) = COPY [[INSERT3]](s96) - ; CI: $vgpr0_vgpr1_vgpr2 = COPY [[COPY1]](s96) - ; CI: $vgpr3_vgpr4_vgpr5 = COPY [[COPY2]](s96) + ; CI: [[INSERT2:%[0-9]+]]:_(s96) = G_INSERT [[DEF1]], [[MV1]](s64), 0 + ; CI: [[INSERT3:%[0-9]+]]:_(s96) = G_INSERT [[INSERT2]], [[OR5]](s32), 64 + ; CI: [[COPY13:%[0-9]+]]:_(s96) = COPY [[INSERT1]](s96) + ; CI: [[COPY14:%[0-9]+]]:_(s96) = COPY [[INSERT3]](s96) + ; CI: $vgpr0_vgpr1_vgpr2 = COPY [[COPY13]](s96) + ; CI: $vgpr3_vgpr4_vgpr5 = COPY [[COPY14]](s96) ; CI-DS128-LABEL: name: test_extload_local_v2s96_from_24_align2 ; CI-DS128: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 ; CI-DS128: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 2, addrspace 3) - ; CI-DS128: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; CI-DS128: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; CI-DS128: [[GEP:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C]](s32) ; CI-DS128: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p3) :: (load 2, addrspace 3) - ; CI-DS128: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; CI-DS128: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 ; CI-DS128: [[GEP1:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C1]](s32) ; CI-DS128: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p3) :: (load 2, addrspace 3) - ; CI-DS128: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; CI-DS128: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 6 ; CI-DS128: [[GEP2:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C2]](s32) ; CI-DS128: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p3) :: (load 2, addrspace 3) - ; CI-DS128: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) ; CI-DS128: [[C3:%[0-9]+]]:_(s32) = 
G_CONSTANT i32 8 ; CI-DS128: [[GEP3:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C3]](s32) ; CI-DS128: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p3) :: (load 2, addrspace 3) - ; CI-DS128: [[TRUNC4:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD4]](s32) ; CI-DS128: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 10 ; CI-DS128: [[GEP4:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C4]](s32) ; CI-DS128: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p3) :: (load 2, addrspace 3) - ; CI-DS128: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) - ; CI-DS128: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16), [[TRUNC4]](s16), [[TRUNC5]](s16) - ; CI-DS128: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 - ; CI-DS128: [[GEP5:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C5]](s32) + ; CI-DS128: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; CI-DS128: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; CI-DS128: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C5]] + ; CI-DS128: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; CI-DS128: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C5]] + ; CI-DS128: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI-DS128: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C6]](s32) + ; CI-DS128: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; CI-DS128: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; CI-DS128: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C5]] + ; CI-DS128: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; CI-DS128: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C5]] + ; CI-DS128: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C6]](s32) + ; CI-DS128: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; CI-DS128: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD4]](s32) + ; CI-DS128: [[AND4:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C5]] + ; CI-DS128: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) + ; CI-DS128: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C5]] + ; CI-DS128: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[C6]](s32) + ; CI-DS128: 
[[OR2:%[0-9]+]]:_(s32) = G_OR [[AND4]], [[SHL2]] + ; CI-DS128: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32), [[OR2]](s32) + ; CI-DS128: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 + ; CI-DS128: [[GEP5:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C7]](s32) ; CI-DS128: [[LOAD6:%[0-9]+]]:_(s32) = G_LOAD [[GEP5]](p3) :: (load 2, addrspace 3) - ; CI-DS128: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; CI-DS128: [[GEP6:%[0-9]+]]:_(p3) = G_GEP [[GEP5]], [[C]](s32) ; CI-DS128: [[LOAD7:%[0-9]+]]:_(s32) = G_LOAD [[GEP6]](p3) :: (load 2, addrspace 3) - ; CI-DS128: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) ; CI-DS128: [[GEP7:%[0-9]+]]:_(p3) = G_GEP [[GEP5]], [[C1]](s32) ; CI-DS128: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p3) :: (load 2, addrspace 3) - ; CI-DS128: [[TRUNC8:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD8]](s32) ; CI-DS128: [[GEP8:%[0-9]+]]:_(p3) = G_GEP [[GEP5]], [[C2]](s32) ; CI-DS128: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p3) :: (load 2, addrspace 3) - ; CI-DS128: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD9]](s32) ; CI-DS128: [[GEP9:%[0-9]+]]:_(p3) = G_GEP [[GEP5]], [[C3]](s32) ; CI-DS128: [[LOAD10:%[0-9]+]]:_(s32) = G_LOAD [[GEP9]](p3) :: (load 2, addrspace 3) - ; CI-DS128: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; CI-DS128: [[GEP10:%[0-9]+]]:_(p3) = G_GEP [[GEP5]], [[C4]](s32) ; CI-DS128: [[LOAD11:%[0-9]+]]:_(s32) = G_LOAD [[GEP10]](p3) :: (load 2, addrspace 3) - ; CI-DS128: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD11]](s32) - ; CI-DS128: [[MV1:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[TRUNC6]](s16), [[TRUNC7]](s16), [[TRUNC8]](s16), [[TRUNC9]](s16), [[TRUNC10]](s16), [[TRUNC11]](s16) - ; CI-DS128: [[COPY1:%[0-9]+]]:_(s96) = COPY [[MV]](s96) - ; CI-DS128: [[COPY2:%[0-9]+]]:_(s96) = COPY [[MV1]](s96) - ; CI-DS128: $vgpr0_vgpr1_vgpr2 = COPY [[COPY1]](s96) - ; CI-DS128: $vgpr3_vgpr4_vgpr5 = COPY [[COPY2]](s96) + ; CI-DS128: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LOAD6]](s32) + ; CI-DS128: [[AND6:%[0-9]+]]:_(s32) = G_AND [[COPY7]], [[C5]] 
+ ; CI-DS128: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) + ; CI-DS128: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY8]], [[C5]] + ; CI-DS128: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C6]](s32) + ; CI-DS128: [[OR3:%[0-9]+]]:_(s32) = G_OR [[AND6]], [[SHL3]] + ; CI-DS128: [[COPY9:%[0-9]+]]:_(s32) = COPY [[LOAD8]](s32) + ; CI-DS128: [[AND8:%[0-9]+]]:_(s32) = G_AND [[COPY9]], [[C5]] + ; CI-DS128: [[COPY10:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32) + ; CI-DS128: [[AND9:%[0-9]+]]:_(s32) = G_AND [[COPY10]], [[C5]] + ; CI-DS128: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[C6]](s32) + ; CI-DS128: [[OR4:%[0-9]+]]:_(s32) = G_OR [[AND8]], [[SHL4]] + ; CI-DS128: [[COPY11:%[0-9]+]]:_(s32) = COPY [[LOAD10]](s32) + ; CI-DS128: [[AND10:%[0-9]+]]:_(s32) = G_AND [[COPY11]], [[C5]] + ; CI-DS128: [[COPY12:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32) + ; CI-DS128: [[AND11:%[0-9]+]]:_(s32) = G_AND [[COPY12]], [[C5]] + ; CI-DS128: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[C6]](s32) + ; CI-DS128: [[OR5:%[0-9]+]]:_(s32) = G_OR [[AND10]], [[SHL5]] + ; CI-DS128: [[MV1:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR3]](s32), [[OR4]](s32), [[OR5]](s32) + ; CI-DS128: [[COPY13:%[0-9]+]]:_(s96) = COPY [[MV]](s96) + ; CI-DS128: [[COPY14:%[0-9]+]]:_(s96) = COPY [[MV1]](s96) + ; CI-DS128: $vgpr0_vgpr1_vgpr2 = COPY [[COPY13]](s96) + ; CI-DS128: $vgpr3_vgpr4_vgpr5 = COPY [[COPY14]](s96) ; VI-LABEL: name: test_extload_local_v2s96_from_24_align2 ; VI: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 2, addrspace 3) - ; VI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; VI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; VI: [[GEP:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C]](s32) ; VI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p3) :: (load 2, addrspace 3) - ; VI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; VI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 ; VI: [[GEP1:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C1]](s32) ; VI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p3) :: (load 
2, addrspace 3) - ; VI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; VI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 6 ; VI: [[GEP2:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C2]](s32) ; VI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p3) :: (load 2, addrspace 3) - ; VI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; VI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) - ; VI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 - ; VI: [[GEP3:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C3]](s32) + ; VI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; VI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; VI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C3]] + ; VI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; VI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C3]] + ; VI: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C4]](s32) + ; VI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; VI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; VI: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C3]] + ; VI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; VI: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C3]] + ; VI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) + ; VI: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; VI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32) + ; VI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 + ; VI: [[GEP3:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C5]](s32) ; VI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p3) :: (load 2, addrspace 3) - ; VI: [[TRUNC4:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD4]](s32) ; VI: [[GEP4:%[0-9]+]]:_(p3) = G_GEP [[GEP3]], [[C]](s32) ; VI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p3) :: (load 2, addrspace 3) - ; VI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) - ; VI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC4]](s16), [[TRUNC5]](s16) + ; VI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD4]](s32) + ; VI: 
[[AND4:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C3]] + ; VI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) + ; VI: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C3]] + ; VI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[C4]](s32) + ; VI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[AND4]], [[SHL2]] ; VI: [[DEF:%[0-9]+]]:_(s96) = G_IMPLICIT_DEF ; VI: [[INSERT:%[0-9]+]]:_(s96) = G_INSERT [[DEF]], [[MV]](s64), 0 - ; VI: [[INSERT1:%[0-9]+]]:_(s96) = G_INSERT [[INSERT]], [[MV1]](s32), 64 - ; VI: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 - ; VI: [[GEP5:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C4]](s32) + ; VI: [[INSERT1:%[0-9]+]]:_(s96) = G_INSERT [[INSERT]], [[OR2]](s32), 64 + ; VI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 + ; VI: [[GEP5:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C6]](s32) ; VI: [[LOAD6:%[0-9]+]]:_(s32) = G_LOAD [[GEP5]](p3) :: (load 2, addrspace 3) - ; VI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; VI: [[GEP6:%[0-9]+]]:_(p3) = G_GEP [[GEP5]], [[C]](s32) ; VI: [[LOAD7:%[0-9]+]]:_(s32) = G_LOAD [[GEP6]](p3) :: (load 2, addrspace 3) - ; VI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) ; VI: [[GEP7:%[0-9]+]]:_(p3) = G_GEP [[GEP5]], [[C1]](s32) ; VI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p3) :: (load 2, addrspace 3) - ; VI: [[TRUNC8:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD8]](s32) ; VI: [[GEP8:%[0-9]+]]:_(p3) = G_GEP [[GEP5]], [[C2]](s32) ; VI: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p3) :: (load 2, addrspace 3) - ; VI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD9]](s32) - ; VI: [[MV2:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[TRUNC6]](s16), [[TRUNC7]](s16), [[TRUNC8]](s16), [[TRUNC9]](s16) - ; VI: [[GEP9:%[0-9]+]]:_(p3) = G_GEP [[GEP5]], [[C3]](s32) + ; VI: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LOAD6]](s32) + ; VI: [[AND6:%[0-9]+]]:_(s32) = G_AND [[COPY7]], [[C3]] + ; VI: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) + ; VI: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY8]], [[C3]] + ; VI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C4]](s32) + ; VI: [[OR3:%[0-9]+]]:_(s32) = G_OR 
[[AND6]], [[SHL3]] + ; VI: [[COPY9:%[0-9]+]]:_(s32) = COPY [[LOAD8]](s32) + ; VI: [[AND8:%[0-9]+]]:_(s32) = G_AND [[COPY9]], [[C3]] + ; VI: [[COPY10:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32) + ; VI: [[AND9:%[0-9]+]]:_(s32) = G_AND [[COPY10]], [[C3]] + ; VI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[C4]](s32) + ; VI: [[OR4:%[0-9]+]]:_(s32) = G_OR [[AND8]], [[SHL4]] + ; VI: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR3]](s32), [[OR4]](s32) + ; VI: [[GEP9:%[0-9]+]]:_(p3) = G_GEP [[GEP5]], [[C5]](s32) ; VI: [[LOAD10:%[0-9]+]]:_(s32) = G_LOAD [[GEP9]](p3) :: (load 2, addrspace 3) - ; VI: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; VI: [[GEP10:%[0-9]+]]:_(p3) = G_GEP [[GEP9]], [[C]](s32) ; VI: [[LOAD11:%[0-9]+]]:_(s32) = G_LOAD [[GEP10]](p3) :: (load 2, addrspace 3) - ; VI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD11]](s32) - ; VI: [[MV3:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC10]](s16), [[TRUNC11]](s16) + ; VI: [[COPY11:%[0-9]+]]:_(s32) = COPY [[LOAD10]](s32) + ; VI: [[AND10:%[0-9]+]]:_(s32) = G_AND [[COPY11]], [[C3]] + ; VI: [[COPY12:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32) + ; VI: [[AND11:%[0-9]+]]:_(s32) = G_AND [[COPY12]], [[C3]] + ; VI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[C4]](s32) + ; VI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[AND10]], [[SHL5]] ; VI: [[DEF1:%[0-9]+]]:_(s96) = G_IMPLICIT_DEF - ; VI: [[INSERT2:%[0-9]+]]:_(s96) = G_INSERT [[DEF1]], [[MV2]](s64), 0 - ; VI: [[INSERT3:%[0-9]+]]:_(s96) = G_INSERT [[INSERT2]], [[MV3]](s32), 64 - ; VI: [[COPY1:%[0-9]+]]:_(s96) = COPY [[INSERT1]](s96) - ; VI: [[COPY2:%[0-9]+]]:_(s96) = COPY [[INSERT3]](s96) - ; VI: $vgpr0_vgpr1_vgpr2 = COPY [[COPY1]](s96) - ; VI: $vgpr3_vgpr4_vgpr5 = COPY [[COPY2]](s96) + ; VI: [[INSERT2:%[0-9]+]]:_(s96) = G_INSERT [[DEF1]], [[MV1]](s64), 0 + ; VI: [[INSERT3:%[0-9]+]]:_(s96) = G_INSERT [[INSERT2]], [[OR5]](s32), 64 + ; VI: [[COPY13:%[0-9]+]]:_(s96) = COPY [[INSERT1]](s96) + ; VI: [[COPY14:%[0-9]+]]:_(s96) = COPY [[INSERT3]](s96) + ; VI: $vgpr0_vgpr1_vgpr2 = COPY 
[[COPY13]](s96) + ; VI: $vgpr3_vgpr4_vgpr5 = COPY [[COPY14]](s96) ; GFX9-LABEL: name: test_extload_local_v2s96_from_24_align2 ; GFX9: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 ; GFX9: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 2, addrspace 3) - ; GFX9: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; GFX9: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; GFX9: [[GEP:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C]](s32) ; GFX9: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p3) :: (load 2, addrspace 3) - ; GFX9: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; GFX9: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 ; GFX9: [[GEP1:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C1]](s32) ; GFX9: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p3) :: (load 2, addrspace 3) - ; GFX9: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; GFX9: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 6 ; GFX9: [[GEP2:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C2]](s32) ; GFX9: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p3) :: (load 2, addrspace 3) - ; GFX9: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; GFX9: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) - ; GFX9: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 - ; GFX9: [[GEP3:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C3]](s32) + ; GFX9: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; GFX9: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; GFX9: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C3]] + ; GFX9: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; GFX9: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C3]] + ; GFX9: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C4]](s32) + ; GFX9: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; GFX9: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; GFX9: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C3]] + ; GFX9: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; GFX9: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C3]] + ; GFX9: 
[[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) + ; GFX9: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; GFX9: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32) + ; GFX9: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 + ; GFX9: [[GEP3:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C5]](s32) ; GFX9: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p3) :: (load 2, addrspace 3) - ; GFX9: [[TRUNC4:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD4]](s32) ; GFX9: [[GEP4:%[0-9]+]]:_(p3) = G_GEP [[GEP3]], [[C]](s32) ; GFX9: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p3) :: (load 2, addrspace 3) - ; GFX9: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) - ; GFX9: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC4]](s16), [[TRUNC5]](s16) + ; GFX9: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD4]](s32) + ; GFX9: [[AND4:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C3]] + ; GFX9: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) + ; GFX9: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C3]] + ; GFX9: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[C4]](s32) + ; GFX9: [[OR2:%[0-9]+]]:_(s32) = G_OR [[AND4]], [[SHL2]] ; GFX9: [[DEF:%[0-9]+]]:_(s96) = G_IMPLICIT_DEF ; GFX9: [[INSERT:%[0-9]+]]:_(s96) = G_INSERT [[DEF]], [[MV]](s64), 0 - ; GFX9: [[INSERT1:%[0-9]+]]:_(s96) = G_INSERT [[INSERT]], [[MV1]](s32), 64 - ; GFX9: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 - ; GFX9: [[GEP5:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C4]](s32) + ; GFX9: [[INSERT1:%[0-9]+]]:_(s96) = G_INSERT [[INSERT]], [[OR2]](s32), 64 + ; GFX9: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 + ; GFX9: [[GEP5:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C6]](s32) ; GFX9: [[LOAD6:%[0-9]+]]:_(s32) = G_LOAD [[GEP5]](p3) :: (load 2, addrspace 3) - ; GFX9: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; GFX9: [[GEP6:%[0-9]+]]:_(p3) = G_GEP [[GEP5]], [[C]](s32) ; GFX9: [[LOAD7:%[0-9]+]]:_(s32) = G_LOAD [[GEP6]](p3) :: (load 2, addrspace 3) - ; GFX9: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) ; GFX9: [[GEP7:%[0-9]+]]:_(p3) = G_GEP [[GEP5]], [[C1]](s32) ; GFX9: 
[[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p3) :: (load 2, addrspace 3) - ; GFX9: [[TRUNC8:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD8]](s32) ; GFX9: [[GEP8:%[0-9]+]]:_(p3) = G_GEP [[GEP5]], [[C2]](s32) ; GFX9: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p3) :: (load 2, addrspace 3) - ; GFX9: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD9]](s32) - ; GFX9: [[MV2:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[TRUNC6]](s16), [[TRUNC7]](s16), [[TRUNC8]](s16), [[TRUNC9]](s16) - ; GFX9: [[GEP9:%[0-9]+]]:_(p3) = G_GEP [[GEP5]], [[C3]](s32) + ; GFX9: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LOAD6]](s32) + ; GFX9: [[AND6:%[0-9]+]]:_(s32) = G_AND [[COPY7]], [[C3]] + ; GFX9: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) + ; GFX9: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY8]], [[C3]] + ; GFX9: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C4]](s32) + ; GFX9: [[OR3:%[0-9]+]]:_(s32) = G_OR [[AND6]], [[SHL3]] + ; GFX9: [[COPY9:%[0-9]+]]:_(s32) = COPY [[LOAD8]](s32) + ; GFX9: [[AND8:%[0-9]+]]:_(s32) = G_AND [[COPY9]], [[C3]] + ; GFX9: [[COPY10:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32) + ; GFX9: [[AND9:%[0-9]+]]:_(s32) = G_AND [[COPY10]], [[C3]] + ; GFX9: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[C4]](s32) + ; GFX9: [[OR4:%[0-9]+]]:_(s32) = G_OR [[AND8]], [[SHL4]] + ; GFX9: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR3]](s32), [[OR4]](s32) + ; GFX9: [[GEP9:%[0-9]+]]:_(p3) = G_GEP [[GEP5]], [[C5]](s32) ; GFX9: [[LOAD10:%[0-9]+]]:_(s32) = G_LOAD [[GEP9]](p3) :: (load 2, addrspace 3) - ; GFX9: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; GFX9: [[GEP10:%[0-9]+]]:_(p3) = G_GEP [[GEP9]], [[C]](s32) ; GFX9: [[LOAD11:%[0-9]+]]:_(s32) = G_LOAD [[GEP10]](p3) :: (load 2, addrspace 3) - ; GFX9: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD11]](s32) - ; GFX9: [[MV3:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC10]](s16), [[TRUNC11]](s16) + ; GFX9: [[COPY11:%[0-9]+]]:_(s32) = COPY [[LOAD10]](s32) + ; GFX9: [[AND10:%[0-9]+]]:_(s32) = G_AND [[COPY11]], [[C3]] + ; GFX9: [[COPY12:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32) + ; GFX9: 
[[AND11:%[0-9]+]]:_(s32) = G_AND [[COPY12]], [[C3]]
+ ; GFX9: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[C4]](s32)
+ ; GFX9: [[OR5:%[0-9]+]]:_(s32) = G_OR [[AND10]], [[SHL5]]
  ; GFX9: [[DEF1:%[0-9]+]]:_(s96) = G_IMPLICIT_DEF
- ; GFX9: [[INSERT2:%[0-9]+]]:_(s96) = G_INSERT [[DEF1]], [[MV2]](s64), 0
- ; GFX9: [[INSERT3:%[0-9]+]]:_(s96) = G_INSERT [[INSERT2]], [[MV3]](s32), 64
- ; GFX9: [[COPY1:%[0-9]+]]:_(s96) = COPY [[INSERT1]](s96)
- ; GFX9: [[COPY2:%[0-9]+]]:_(s96) = COPY [[INSERT3]](s96)
- ; GFX9: $vgpr0_vgpr1_vgpr2 = COPY [[COPY1]](s96)
- ; GFX9: $vgpr3_vgpr4_vgpr5 = COPY [[COPY2]](s96)
+ ; GFX9: [[INSERT2:%[0-9]+]]:_(s96) = G_INSERT [[DEF1]], [[MV1]](s64), 0
+ ; GFX9: [[INSERT3:%[0-9]+]]:_(s96) = G_INSERT [[INSERT2]], [[OR5]](s32), 64
+ ; GFX9: [[COPY13:%[0-9]+]]:_(s96) = COPY [[INSERT1]](s96)
+ ; GFX9: [[COPY14:%[0-9]+]]:_(s96) = COPY [[INSERT3]](s96)
+ ; GFX9: $vgpr0_vgpr1_vgpr2 = COPY [[COPY13]](s96)
+ ; GFX9: $vgpr3_vgpr4_vgpr5 = COPY [[COPY14]](s96)
  %0:_(p3) = COPY $vgpr0
  %1:_(<2 x s96>) = G_LOAD %0 :: (load 24, align 2, addrspace 3)
  %2:_(s96) = G_EXTRACT %1, 0

Modified: llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-load-private.mir
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-load-private.mir?rev=373942&r1=373941&r2=373942&view=diff
==============================================================================
--- llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-load-private.mir (original)
+++ llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-load-private.mir Mon Oct 7 12:05:58 2019
@@ -328,43 +328,63 @@ body: |
  ; SI-LABEL: name: test_load_private_s32_align2
  ; SI: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0
  ; SI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 2, addrspace 5)
- ; SI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32)
  ; SI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2
  ; SI: [[GEP:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C]](s32)
  ; SI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p5) :: (load 2,
addrspace 5) - ; SI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; SI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; SI: $vgpr0 = COPY [[MV]](s32) + ; SI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; SI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; SI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; SI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; SI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; SI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; SI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; SI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; SI: $vgpr0 = COPY [[OR]](s32) ; CI-LABEL: name: test_load_private_s32_align2 ; CI: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 ; CI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 2, addrspace 5) - ; CI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; CI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; CI: [[GEP:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C]](s32) ; CI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p5) :: (load 2, addrspace 5) - ; CI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; CI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; CI: $vgpr0 = COPY [[MV]](s32) + ; CI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; CI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; CI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; CI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; CI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; CI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; CI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; CI: $vgpr0 = COPY [[OR]](s32) ; VI-LABEL: name: test_load_private_s32_align2 ; VI: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 2, addrspace 5) - ; VI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; VI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; VI: 
[[GEP:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C]](s32) ; VI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p5) :: (load 2, addrspace 5) - ; VI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; VI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; VI: $vgpr0 = COPY [[MV]](s32) + ; VI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; VI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; VI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; VI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; VI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; VI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; VI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; VI: $vgpr0 = COPY [[OR]](s32) ; GFX9-LABEL: name: test_load_private_s32_align2 ; GFX9: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 ; GFX9: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 2, addrspace 5) - ; GFX9: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; GFX9: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; GFX9: [[GEP:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C]](s32) ; GFX9: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p5) :: (load 2, addrspace 5) - ; GFX9: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; GFX9: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; GFX9: $vgpr0 = COPY [[MV]](s32) + ; GFX9: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; GFX9: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; GFX9: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; GFX9: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; GFX9: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; GFX9: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; GFX9: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; GFX9: $vgpr0 = COPY [[OR]](s32) %0:_(p5) = COPY $vgpr0 %1:_(s32) = G_LOAD %0 :: (load 4, align 2, addrspace 5) $vgpr0 = COPY %1 @@ -406,8 +426,12 @@ body: | ; SI: 
[[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) ; SI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[SHL1]](s32) ; SI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[TRUNC3]] - ; SI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; SI: $vgpr0 = COPY [[MV]](s32) + ; SI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; SI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; SI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; SI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C6]](s32) + ; SI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; SI: $vgpr0 = COPY [[OR2]](s32) ; CI-LABEL: name: test_load_private_s32_align1 ; CI: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 ; CI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 1, addrspace 5) @@ -438,8 +462,12 @@ body: | ; CI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) ; CI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[SHL1]](s32) ; CI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[TRUNC3]] - ; CI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; CI: $vgpr0 = COPY [[MV]](s32) + ; CI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C6]](s32) + ; CI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; CI: $vgpr0 = COPY [[OR2]](s32) ; VI-LABEL: name: test_load_private_s32_align1 ; VI: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 1, addrspace 5) @@ -466,8 +494,12 @@ body: | ; VI: [[AND3:%[0-9]+]]:_(s16) = G_AND [[TRUNC3]], [[C3]] ; VI: [[SHL1:%[0-9]+]]:_(s16) = G_SHL [[AND3]], [[C4]](s16) ; VI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[SHL1]] - ; VI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; VI: $vgpr0 = COPY [[MV]](s32) + ; VI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; VI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; VI: [[C5:%[0-9]+]]:_(s32) = 
G_CONSTANT i32 16 + ; VI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C5]](s32) + ; VI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; VI: $vgpr0 = COPY [[OR2]](s32) ; GFX9-LABEL: name: test_load_private_s32_align1 ; GFX9: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 ; GFX9: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 1, addrspace 5) @@ -494,8 +526,12 @@ body: | ; GFX9: [[AND3:%[0-9]+]]:_(s16) = G_AND [[TRUNC3]], [[C3]] ; GFX9: [[SHL1:%[0-9]+]]:_(s16) = G_SHL [[AND3]], [[C4]](s16) ; GFX9: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[SHL1]] - ; GFX9: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; GFX9: $vgpr0 = COPY [[MV]](s32) + ; GFX9: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C5]](s32) + ; GFX9: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; GFX9: $vgpr0 = COPY [[OR2]](s32) %0:_(p5) = COPY $vgpr0 %1:_(s32) = G_LOAD %0 :: (load 4, align 1, addrspace 5) $vgpr0 = COPY %1 @@ -664,79 +700,111 @@ body: | ; SI-LABEL: name: test_load_private_s64_align2 ; SI: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 ; SI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 2, addrspace 5) - ; SI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; SI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; SI: [[GEP:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C]](s32) ; SI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p5) :: (load 2, addrspace 5) - ; SI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; SI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; SI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; SI: [[GEP1:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C1]](s32) + ; SI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; SI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; SI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; SI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; SI: 
[[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; SI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; SI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; SI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; SI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; SI: [[GEP1:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C3]](s32) ; SI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p5) :: (load 2, addrspace 5) - ; SI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; SI: [[GEP2:%[0-9]+]]:_(p5) = G_GEP [[GEP1]], [[C]](s32) ; SI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p5) :: (load 2, addrspace 5) - ; SI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; SI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC2]](s16), [[TRUNC3]](s16) - ; SI: [[MV2:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[MV]](s32), [[MV1]](s32) - ; SI: $vgpr0_vgpr1 = COPY [[MV2]](s64) + ; SI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; SI: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C1]] + ; SI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; SI: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C1]] + ; SI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C2]](s32) + ; SI: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; SI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32) + ; SI: $vgpr0_vgpr1 = COPY [[MV]](s64) ; CI-LABEL: name: test_load_private_s64_align2 ; CI: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 ; CI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 2, addrspace 5) - ; CI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; CI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; CI: [[GEP:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C]](s32) ; CI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p5) :: (load 2, addrspace 5) - ; CI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; CI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; CI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; CI: [[GEP1:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C1]](s32) + ; CI: [[C1:%[0-9]+]]:_(s32) = 
G_CONSTANT i32 65535 + ; CI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; CI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; CI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; CI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; CI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; CI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; CI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; CI: [[GEP1:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C3]](s32) ; CI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p5) :: (load 2, addrspace 5) - ; CI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; CI: [[GEP2:%[0-9]+]]:_(p5) = G_GEP [[GEP1]], [[C]](s32) ; CI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p5) :: (load 2, addrspace 5) - ; CI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; CI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC2]](s16), [[TRUNC3]](s16) - ; CI: [[MV2:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[MV]](s32), [[MV1]](s32) - ; CI: $vgpr0_vgpr1 = COPY [[MV2]](s64) + ; CI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; CI: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C1]] + ; CI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; CI: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C1]] + ; CI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C2]](s32) + ; CI: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; CI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32) + ; CI: $vgpr0_vgpr1 = COPY [[MV]](s64) ; VI-LABEL: name: test_load_private_s64_align2 ; VI: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 2, addrspace 5) - ; VI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; VI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; VI: [[GEP:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C]](s32) ; VI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p5) :: (load 2, addrspace 5) - ; VI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; VI: [[MV:%[0-9]+]]:_(s32) 
= G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; VI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; VI: [[GEP1:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C1]](s32) + ; VI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; VI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; VI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; VI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; VI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; VI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; VI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; VI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; VI: [[GEP1:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C3]](s32) ; VI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p5) :: (load 2, addrspace 5) - ; VI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; VI: [[GEP2:%[0-9]+]]:_(p5) = G_GEP [[GEP1]], [[C]](s32) ; VI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p5) :: (load 2, addrspace 5) - ; VI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; VI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC2]](s16), [[TRUNC3]](s16) - ; VI: [[MV2:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[MV]](s32), [[MV1]](s32) - ; VI: $vgpr0_vgpr1 = COPY [[MV2]](s64) + ; VI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; VI: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C1]] + ; VI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; VI: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C1]] + ; VI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C2]](s32) + ; VI: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; VI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32) + ; VI: $vgpr0_vgpr1 = COPY [[MV]](s64) ; GFX9-LABEL: name: test_load_private_s64_align2 ; GFX9: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 ; GFX9: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 2, addrspace 5) - ; GFX9: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; GFX9: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; GFX9: 
[[GEP:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C]](s32) ; GFX9: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p5) :: (load 2, addrspace 5) - ; GFX9: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; GFX9: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; GFX9: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; GFX9: [[GEP1:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C1]](s32) + ; GFX9: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; GFX9: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; GFX9: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; GFX9: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; GFX9: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; GFX9: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; GFX9: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; GFX9: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; GFX9: [[GEP1:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C3]](s32) ; GFX9: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p5) :: (load 2, addrspace 5) - ; GFX9: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; GFX9: [[GEP2:%[0-9]+]]:_(p5) = G_GEP [[GEP1]], [[C]](s32) ; GFX9: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p5) :: (load 2, addrspace 5) - ; GFX9: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; GFX9: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC2]](s16), [[TRUNC3]](s16) - ; GFX9: [[MV2:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[MV]](s32), [[MV1]](s32) - ; GFX9: $vgpr0_vgpr1 = COPY [[MV2]](s64) + ; GFX9: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; GFX9: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C1]] + ; GFX9: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; GFX9: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C1]] + ; GFX9: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C2]](s32) + ; GFX9: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; GFX9: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32) + ; GFX9: $vgpr0_vgpr1 = COPY [[MV]](s64) %0:_(p5) = COPY $vgpr0 
%1:_(s64) = G_LOAD %0 :: (load 8, align 2, addrspace 5) $vgpr0_vgpr1 = COPY %1 @@ -778,9 +846,13 @@ body: | ; SI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) ; SI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[SHL1]](s32) ; SI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[TRUNC3]] - ; SI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; SI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; SI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C6]](s32) + ; SI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; SI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; SI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; SI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C6]](s32) + ; SI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; SI: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; SI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C7]](s32) ; SI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p5) :: (load 1, addrspace 5) ; SI: [[GEP4:%[0-9]+]]:_(p5) = G_GEP [[GEP3]], [[C]](s32) ; SI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p5) :: (load 1, addrspace 5) @@ -793,20 +865,23 @@ body: | ; SI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; SI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) ; SI: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C5]] - ; SI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY4]](s32) - ; SI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL2]](s32) - ; SI: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] + ; SI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY4]](s32) + ; SI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) + ; SI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] ; SI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; SI: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; SI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; SI: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) ; SI: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY7]], [[C5]] - ; SI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY6]](s32) - ; SI: [[TRUNC7:%[0-9]+]]:_(s16) = 
G_TRUNC [[SHL3]](s32) - ; SI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; SI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) - ; SI: [[MV2:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[MV]](s32), [[MV1]](s32) - ; SI: $vgpr0_vgpr1 = COPY [[MV2]](s64) + ; SI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY6]](s32) + ; SI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) + ; SI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] + ; SI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; SI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; SI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C6]](s32) + ; SI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; SI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR2]](s32), [[OR5]](s32) + ; SI: $vgpr0_vgpr1 = COPY [[MV]](s64) ; CI-LABEL: name: test_load_private_s64_align1 ; CI: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 ; CI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 1, addrspace 5) @@ -837,9 +912,13 @@ body: | ; CI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) ; CI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[SHL1]](s32) ; CI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[TRUNC3]] - ; CI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; CI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; CI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C6]](s32) + ; CI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C6]](s32) + ; CI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; CI: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; CI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C7]](s32) ; CI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p5) :: (load 1, addrspace 5) ; CI: [[GEP4:%[0-9]+]]:_(p5) = G_GEP [[GEP3]], [[C]](s32) ; CI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p5) :: (load 1, addrspace 5) @@ -852,20 +931,23 @@ body: | ; CI: 
[[COPY4:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) ; CI: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C5]] - ; CI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY4]](s32) - ; CI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL2]](s32) - ; CI: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] + ; CI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY4]](s32) + ; CI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) + ; CI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] ; CI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; CI: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; CI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) ; CI: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY7]], [[C5]] - ; CI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY6]](s32) - ; CI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) - ; CI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; CI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) - ; CI: [[MV2:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[MV]](s32), [[MV1]](s32) - ; CI: $vgpr0_vgpr1 = COPY [[MV2]](s64) + ; CI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY6]](s32) + ; CI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) + ; CI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] + ; CI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; CI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C6]](s32) + ; CI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; CI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR2]](s32), [[OR5]](s32) + ; CI: $vgpr0_vgpr1 = COPY [[MV]](s64) ; VI-LABEL: name: test_load_private_s64_align1 ; VI: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 1, addrspace 5) @@ -892,9 +974,13 @@ body: | ; VI: [[AND3:%[0-9]+]]:_(s16) = G_AND [[TRUNC3]], [[C3]] ; VI: [[SHL1:%[0-9]+]]:_(s16) = G_SHL [[AND3]], 
[[C4]](s16) ; VI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[SHL1]] - ; VI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; VI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; VI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C5]](s32) + ; VI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; VI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; VI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C5]](s32) + ; VI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; VI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; VI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C6]](s32) ; VI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p5) :: (load 1, addrspace 5) ; VI: [[GEP4:%[0-9]+]]:_(p5) = G_GEP [[GEP3]], [[C]](s32) ; VI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p5) :: (load 1, addrspace 5) @@ -906,17 +992,20 @@ body: | ; VI: [[AND4:%[0-9]+]]:_(s16) = G_AND [[TRUNC4]], [[C3]] ; VI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) ; VI: [[AND5:%[0-9]+]]:_(s16) = G_AND [[TRUNC5]], [[C3]] - ; VI: [[SHL2:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) - ; VI: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL2]] + ; VI: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) + ; VI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL3]] ; VI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; VI: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; VI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) ; VI: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C3]] - ; VI: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) - ; VI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; VI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) - ; VI: [[MV2:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[MV]](s32), [[MV1]](s32) - ; VI: $vgpr0_vgpr1 = COPY [[MV2]](s64) + ; VI: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) + ; VI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL4]] + ; VI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT 
[[OR3]](s16) + ; VI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; VI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C5]](s32) + ; VI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; VI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR2]](s32), [[OR5]](s32) + ; VI: $vgpr0_vgpr1 = COPY [[MV]](s64) ; GFX9-LABEL: name: test_load_private_s64_align1 ; GFX9: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 ; GFX9: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 1, addrspace 5) @@ -943,9 +1032,13 @@ body: | ; GFX9: [[AND3:%[0-9]+]]:_(s16) = G_AND [[TRUNC3]], [[C3]] ; GFX9: [[SHL1:%[0-9]+]]:_(s16) = G_SHL [[AND3]], [[C4]](s16) ; GFX9: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[SHL1]] - ; GFX9: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; GFX9: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; GFX9: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C5]](s32) + ; GFX9: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C5]](s32) + ; GFX9: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; GFX9: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; GFX9: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C6]](s32) ; GFX9: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p5) :: (load 1, addrspace 5) ; GFX9: [[GEP4:%[0-9]+]]:_(p5) = G_GEP [[GEP3]], [[C]](s32) ; GFX9: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p5) :: (load 1, addrspace 5) @@ -957,17 +1050,20 @@ body: | ; GFX9: [[AND4:%[0-9]+]]:_(s16) = G_AND [[TRUNC4]], [[C3]] ; GFX9: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) ; GFX9: [[AND5:%[0-9]+]]:_(s16) = G_AND [[TRUNC5]], [[C3]] - ; GFX9: [[SHL2:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) - ; GFX9: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL2]] + ; GFX9: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) + ; GFX9: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL3]] ; GFX9: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC 
[[LOAD6]](s32) ; GFX9: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; GFX9: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) ; GFX9: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C3]] - ; GFX9: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) - ; GFX9: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; GFX9: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) - ; GFX9: [[MV2:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[MV]](s32), [[MV1]](s32) - ; GFX9: $vgpr0_vgpr1 = COPY [[MV2]](s64) + ; GFX9: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) + ; GFX9: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL4]] + ; GFX9: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; GFX9: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; GFX9: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C5]](s32) + ; GFX9: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; GFX9: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR2]](s32), [[OR5]](s32) + ; GFX9: $vgpr0_vgpr1 = COPY [[MV]](s64) %0:_(p5) = COPY $vgpr0 %1:_(s64) = G_LOAD %0 :: (load 8, align 1, addrspace 5) $vgpr0_vgpr1 = COPY %1 @@ -1010,9 +1106,13 @@ body: | ; SI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[COPY3]](s32) ; SI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[SHL1]](s32) ; SI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[TRUNC3]] - ; SI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; SI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; SI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C6]](s32) + ; SI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; SI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; SI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; SI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C6]](s32) + ; SI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; SI: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; SI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C7]](s32) ; SI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p5) :: (load 1, addrspace 56) ; SI: [[GEP4:%[0-9]+]]:_(p5) = G_GEP [[GEP3]], 
[[C]](s32) ; SI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p5) :: (load 1, addrspace 56) @@ -1025,18 +1125,21 @@ body: | ; SI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; SI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) ; SI: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C5]] - ; SI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY5]](s32) - ; SI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL2]](s32) - ; SI: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] + ; SI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY5]](s32) + ; SI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) + ; SI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] ; SI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; SI: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; SI: [[COPY7:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; SI: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) ; SI: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY8]], [[C5]] - ; SI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY7]](s32) - ; SI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) - ; SI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; SI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) + ; SI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY7]](s32) + ; SI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) + ; SI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] + ; SI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; SI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; SI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C6]](s32) + ; SI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] ; SI: [[GEP7:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C4]](s32) ; SI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p5) :: (load 1, addrspace 56) ; SI: [[GEP8:%[0-9]+]]:_(p5) = G_GEP [[GEP7]], [[C]](s32) @@ -1050,20 +1153,23 @@ body: | ; SI: [[COPY9:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; SI: [[COPY10:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32) ; SI: [[AND9:%[0-9]+]]:_(s32) = G_AND [[COPY10]], [[C5]] - ; SI: 
[[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY9]](s32) - ; SI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) - ; SI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] + ; SI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY9]](s32) + ; SI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) + ; SI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] ; SI: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; SI: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C3]] ; SI: [[COPY11:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; SI: [[COPY12:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32) ; SI: [[AND11:%[0-9]+]]:_(s32) = G_AND [[COPY12]], [[C5]] - ; SI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY11]](s32) - ; SI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL5]](s32) - ; SI: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] - ; SI: [[MV2:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16) - ; SI: [[MV3:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[MV]](s32), [[MV1]](s32), [[MV2]](s32) - ; SI: $vgpr0_vgpr1_vgpr2 = COPY [[MV3]](s96) + ; SI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY11]](s32) + ; SI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) + ; SI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] + ; SI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; SI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; SI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C6]](s32) + ; SI: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL8]] + ; SI: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR2]](s32), [[OR5]](s32), [[OR8]](s32) + ; SI: $vgpr0_vgpr1_vgpr2 = COPY [[MV]](s96) ; CI-LABEL: name: test_load_private_s96_align16 ; CI: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 ; CI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 1, addrspace 56) @@ -1095,9 +1201,13 @@ body: | ; CI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[COPY3]](s32) ; CI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[SHL1]](s32) ; CI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[TRUNC3]] - ; CI: [[MV:%[0-9]+]]:_(s32) = 
G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; CI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; CI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C6]](s32) + ; CI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C6]](s32) + ; CI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; CI: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; CI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C7]](s32) ; CI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p5) :: (load 1, addrspace 56) ; CI: [[GEP4:%[0-9]+]]:_(p5) = G_GEP [[GEP3]], [[C]](s32) ; CI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p5) :: (load 1, addrspace 56) @@ -1110,18 +1220,21 @@ body: | ; CI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) ; CI: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C5]] - ; CI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY5]](s32) - ; CI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL2]](s32) - ; CI: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] + ; CI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY5]](s32) + ; CI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) + ; CI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] ; CI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; CI: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; CI: [[COPY7:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) ; CI: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY8]], [[C5]] - ; CI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY7]](s32) - ; CI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) - ; CI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; CI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) + ; CI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY7]](s32) + ; CI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) + ; CI: [[OR4:%[0-9]+]]:_(s16) = 
G_OR [[AND6]], [[TRUNC7]] + ; CI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; CI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C6]](s32) + ; CI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] ; CI: [[GEP7:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C4]](s32) ; CI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p5) :: (load 1, addrspace 56) ; CI: [[GEP8:%[0-9]+]]:_(p5) = G_GEP [[GEP7]], [[C]](s32) @@ -1135,20 +1248,23 @@ body: | ; CI: [[COPY9:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI: [[COPY10:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32) ; CI: [[AND9:%[0-9]+]]:_(s32) = G_AND [[COPY10]], [[C5]] - ; CI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY9]](s32) - ; CI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) - ; CI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] + ; CI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY9]](s32) + ; CI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) + ; CI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] ; CI: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; CI: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C3]] ; CI: [[COPY11:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI: [[COPY12:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32) ; CI: [[AND11:%[0-9]+]]:_(s32) = G_AND [[COPY12]], [[C5]] - ; CI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY11]](s32) - ; CI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL5]](s32) - ; CI: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] - ; CI: [[MV2:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16) - ; CI: [[MV3:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[MV]](s32), [[MV1]](s32), [[MV2]](s32) - ; CI: $vgpr0_vgpr1_vgpr2 = COPY [[MV3]](s96) + ; CI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY11]](s32) + ; CI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) + ; CI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] + ; CI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; CI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; CI: 
[[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C6]](s32) + ; CI: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL8]] + ; CI: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR2]](s32), [[OR5]](s32), [[OR8]](s32) + ; CI: $vgpr0_vgpr1_vgpr2 = COPY [[MV]](s96) ; VI-LABEL: name: test_load_private_s96_align16 ; VI: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 1, addrspace 56) @@ -1175,9 +1291,13 @@ body: | ; VI: [[AND3:%[0-9]+]]:_(s16) = G_AND [[TRUNC3]], [[C3]] ; VI: [[SHL1:%[0-9]+]]:_(s16) = G_SHL [[AND3]], [[C4]](s16) ; VI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[SHL1]] - ; VI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; VI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; VI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C5]](s32) + ; VI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; VI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; VI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C5]](s32) + ; VI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; VI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; VI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C6]](s32) ; VI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p5) :: (load 1, addrspace 56) ; VI: [[GEP4:%[0-9]+]]:_(p5) = G_GEP [[GEP3]], [[C]](s32) ; VI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p5) :: (load 1, addrspace 56) @@ -1189,17 +1309,20 @@ body: | ; VI: [[AND4:%[0-9]+]]:_(s16) = G_AND [[TRUNC4]], [[C3]] ; VI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) ; VI: [[AND5:%[0-9]+]]:_(s16) = G_AND [[TRUNC5]], [[C3]] - ; VI: [[SHL2:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) - ; VI: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL2]] + ; VI: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) + ; VI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL3]] ; VI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; VI: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; VI: [[TRUNC7:%[0-9]+]]:_(s16) = 
G_TRUNC [[LOAD7]](s32) ; VI: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C3]] - ; VI: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) - ; VI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; VI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) - ; VI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 - ; VI: [[GEP7:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C6]](s32) + ; VI: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) + ; VI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL4]] + ; VI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; VI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; VI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C5]](s32) + ; VI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; VI: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 + ; VI: [[GEP7:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C7]](s32) ; VI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p5) :: (load 1, addrspace 56) ; VI: [[GEP8:%[0-9]+]]:_(p5) = G_GEP [[GEP7]], [[C]](s32) ; VI: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p5) :: (load 1, addrspace 56) @@ -1211,17 +1334,20 @@ body: | ; VI: [[AND8:%[0-9]+]]:_(s16) = G_AND [[TRUNC8]], [[C3]] ; VI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD9]](s32) ; VI: [[AND9:%[0-9]+]]:_(s16) = G_AND [[TRUNC9]], [[C3]] - ; VI: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C4]](s16) - ; VI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL4]] + ; VI: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C4]](s16) + ; VI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL6]] ; VI: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; VI: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C3]] ; VI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD11]](s32) ; VI: [[AND11:%[0-9]+]]:_(s16) = G_AND [[TRUNC11]], [[C3]] - ; VI: [[SHL5:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C4]](s16) - ; VI: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL5]] - ; VI: [[MV2:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16) - ; VI: [[MV3:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[MV]](s32), 
[[MV1]](s32), [[MV2]](s32) - ; VI: $vgpr0_vgpr1_vgpr2 = COPY [[MV3]](s96) + ; VI: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C4]](s16) + ; VI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL7]] + ; VI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; VI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; VI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C5]](s32) + ; VI: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL8]] + ; VI: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR2]](s32), [[OR5]](s32), [[OR8]](s32) + ; VI: $vgpr0_vgpr1_vgpr2 = COPY [[MV]](s96) ; GFX9-LABEL: name: test_load_private_s96_align16 ; GFX9: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 ; GFX9: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 1, addrspace 56) @@ -1248,9 +1374,13 @@ body: | ; GFX9: [[AND3:%[0-9]+]]:_(s16) = G_AND [[TRUNC3]], [[C3]] ; GFX9: [[SHL1:%[0-9]+]]:_(s16) = G_SHL [[AND3]], [[C4]](s16) ; GFX9: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[SHL1]] - ; GFX9: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; GFX9: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; GFX9: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C5]](s32) + ; GFX9: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C5]](s32) + ; GFX9: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; GFX9: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; GFX9: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C6]](s32) ; GFX9: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p5) :: (load 1, addrspace 56) ; GFX9: [[GEP4:%[0-9]+]]:_(p5) = G_GEP [[GEP3]], [[C]](s32) ; GFX9: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p5) :: (load 1, addrspace 56) @@ -1262,17 +1392,20 @@ body: | ; GFX9: [[AND4:%[0-9]+]]:_(s16) = G_AND [[TRUNC4]], [[C3]] ; GFX9: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) ; GFX9: [[AND5:%[0-9]+]]:_(s16) = G_AND [[TRUNC5]], [[C3]] - ; GFX9: [[SHL2:%[0-9]+]]:_(s16) 
= G_SHL [[AND5]], [[C4]](s16) - ; GFX9: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL2]] + ; GFX9: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) + ; GFX9: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL3]] ; GFX9: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; GFX9: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; GFX9: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) ; GFX9: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C3]] - ; GFX9: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) - ; GFX9: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; GFX9: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) - ; GFX9: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 - ; GFX9: [[GEP7:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C6]](s32) + ; GFX9: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) + ; GFX9: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL4]] + ; GFX9: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; GFX9: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; GFX9: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C5]](s32) + ; GFX9: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; GFX9: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 + ; GFX9: [[GEP7:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C7]](s32) ; GFX9: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p5) :: (load 1, addrspace 56) ; GFX9: [[GEP8:%[0-9]+]]:_(p5) = G_GEP [[GEP7]], [[C]](s32) ; GFX9: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p5) :: (load 1, addrspace 56) @@ -1284,17 +1417,20 @@ body: | ; GFX9: [[AND8:%[0-9]+]]:_(s16) = G_AND [[TRUNC8]], [[C3]] ; GFX9: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD9]](s32) ; GFX9: [[AND9:%[0-9]+]]:_(s16) = G_AND [[TRUNC9]], [[C3]] - ; GFX9: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C4]](s16) - ; GFX9: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL4]] + ; GFX9: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C4]](s16) + ; GFX9: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL6]] ; GFX9: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; GFX9: 
[[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C3]] ; GFX9: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD11]](s32) ; GFX9: [[AND11:%[0-9]+]]:_(s16) = G_AND [[TRUNC11]], [[C3]] - ; GFX9: [[SHL5:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C4]](s16) - ; GFX9: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL5]] - ; GFX9: [[MV2:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16) - ; GFX9: [[MV3:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[MV]](s32), [[MV1]](s32), [[MV2]](s32) - ; GFX9: $vgpr0_vgpr1_vgpr2 = COPY [[MV3]](s96) + ; GFX9: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C4]](s16) + ; GFX9: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL7]] + ; GFX9: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; GFX9: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; GFX9: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C5]](s32) + ; GFX9: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL8]] + ; GFX9: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR2]](s32), [[OR5]](s32), [[OR8]](s32) + ; GFX9: $vgpr0_vgpr1_vgpr2 = COPY [[MV]](s96) %0:_(p5) = COPY $vgpr0 %1:_(s96) = G_LOAD %0 :: (load 12, align 1, addrspace 56) $vgpr0_vgpr1_vgpr2 = COPY %1 @@ -1419,111 +1555,155 @@ body: | ; SI-LABEL: name: test_load_private_s96_align2 ; SI: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 ; SI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 2, addrspace 5) - ; SI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; SI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; SI: [[GEP:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C]](s32) ; SI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p5) :: (load 2, addrspace 5) - ; SI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; SI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; SI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; SI: [[GEP1:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C1]](s32) + ; SI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; SI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; SI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; SI: 
[[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; SI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; SI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; SI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; SI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; SI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; SI: [[GEP1:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C3]](s32) ; SI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p5) :: (load 2, addrspace 5) - ; SI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; SI: [[GEP2:%[0-9]+]]:_(p5) = G_GEP [[GEP1]], [[C]](s32) ; SI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p5) :: (load 2, addrspace 5) - ; SI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; SI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC2]](s16), [[TRUNC3]](s16) - ; SI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 - ; SI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C2]](s32) + ; SI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; SI: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C1]] + ; SI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; SI: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C1]] + ; SI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C2]](s32) + ; SI: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; SI: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 + ; SI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C4]](s32) ; SI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p5) :: (load 2, addrspace 5) - ; SI: [[TRUNC4:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD4]](s32) ; SI: [[GEP4:%[0-9]+]]:_(p5) = G_GEP [[GEP3]], [[C]](s32) ; SI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p5) :: (load 2, addrspace 5) - ; SI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) - ; SI: [[MV2:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC4]](s16), [[TRUNC5]](s16) - ; SI: [[MV3:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[MV]](s32), [[MV1]](s32), [[MV2]](s32) - ; SI: $vgpr0_vgpr1_vgpr2 = COPY [[MV3]](s96) + ; SI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD4]](s32) + ; SI: [[AND4:%[0-9]+]]:_(s32) = 
G_AND [[COPY5]], [[C1]] + ; SI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) + ; SI: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C1]] + ; SI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[C2]](s32) + ; SI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[AND4]], [[SHL2]] + ; SI: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32), [[OR2]](s32) + ; SI: $vgpr0_vgpr1_vgpr2 = COPY [[MV]](s96) ; CI-LABEL: name: test_load_private_s96_align2 ; CI: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 ; CI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 2, addrspace 5) - ; CI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; CI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; CI: [[GEP:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C]](s32) ; CI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p5) :: (load 2, addrspace 5) - ; CI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; CI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; CI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; CI: [[GEP1:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C1]](s32) + ; CI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; CI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; CI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; CI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; CI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; CI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; CI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; CI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; CI: [[GEP1:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C3]](s32) ; CI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p5) :: (load 2, addrspace 5) - ; CI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; CI: [[GEP2:%[0-9]+]]:_(p5) = G_GEP [[GEP1]], [[C]](s32) ; CI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p5) :: (load 2, addrspace 5) - ; CI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; CI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC2]](s16), 
[[TRUNC3]](s16) - ; CI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 - ; CI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C2]](s32) + ; CI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; CI: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C1]] + ; CI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; CI: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C1]] + ; CI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C2]](s32) + ; CI: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; CI: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 + ; CI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C4]](s32) ; CI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p5) :: (load 2, addrspace 5) - ; CI: [[TRUNC4:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD4]](s32) ; CI: [[GEP4:%[0-9]+]]:_(p5) = G_GEP [[GEP3]], [[C]](s32) ; CI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p5) :: (load 2, addrspace 5) - ; CI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) - ; CI: [[MV2:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC4]](s16), [[TRUNC5]](s16) - ; CI: [[MV3:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[MV]](s32), [[MV1]](s32), [[MV2]](s32) - ; CI: $vgpr0_vgpr1_vgpr2 = COPY [[MV3]](s96) + ; CI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD4]](s32) + ; CI: [[AND4:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C1]] + ; CI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) + ; CI: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C1]] + ; CI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[C2]](s32) + ; CI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[AND4]], [[SHL2]] + ; CI: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32), [[OR2]](s32) + ; CI: $vgpr0_vgpr1_vgpr2 = COPY [[MV]](s96) ; VI-LABEL: name: test_load_private_s96_align2 ; VI: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 2, addrspace 5) - ; VI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; VI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; VI: [[GEP:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C]](s32) ; VI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p5) :: (load 2, 
addrspace 5) - ; VI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; VI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; VI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; VI: [[GEP1:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C1]](s32) + ; VI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; VI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; VI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; VI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; VI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; VI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; VI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; VI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; VI: [[GEP1:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C3]](s32) ; VI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p5) :: (load 2, addrspace 5) - ; VI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; VI: [[GEP2:%[0-9]+]]:_(p5) = G_GEP [[GEP1]], [[C]](s32) ; VI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p5) :: (load 2, addrspace 5) - ; VI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; VI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC2]](s16), [[TRUNC3]](s16) - ; VI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 - ; VI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C2]](s32) + ; VI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; VI: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C1]] + ; VI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; VI: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C1]] + ; VI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C2]](s32) + ; VI: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; VI: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 + ; VI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C4]](s32) ; VI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p5) :: (load 2, addrspace 5) - ; VI: [[TRUNC4:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD4]](s32) ; VI: [[GEP4:%[0-9]+]]:_(p5) = G_GEP [[GEP3]], [[C]](s32) ; VI: 
[[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p5) :: (load 2, addrspace 5) - ; VI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) - ; VI: [[MV2:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC4]](s16), [[TRUNC5]](s16) - ; VI: [[MV3:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[MV]](s32), [[MV1]](s32), [[MV2]](s32) - ; VI: $vgpr0_vgpr1_vgpr2 = COPY [[MV3]](s96) + ; VI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD4]](s32) + ; VI: [[AND4:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C1]] + ; VI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) + ; VI: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C1]] + ; VI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[C2]](s32) + ; VI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[AND4]], [[SHL2]] + ; VI: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32), [[OR2]](s32) + ; VI: $vgpr0_vgpr1_vgpr2 = COPY [[MV]](s96) ; GFX9-LABEL: name: test_load_private_s96_align2 ; GFX9: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 ; GFX9: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 2, addrspace 5) - ; GFX9: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; GFX9: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; GFX9: [[GEP:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C]](s32) ; GFX9: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p5) :: (load 2, addrspace 5) - ; GFX9: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; GFX9: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; GFX9: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; GFX9: [[GEP1:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C1]](s32) + ; GFX9: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; GFX9: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; GFX9: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; GFX9: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; GFX9: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; GFX9: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; GFX9: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; GFX9: [[C3:%[0-9]+]]:_(s32) = 
G_CONSTANT i32 4 + ; GFX9: [[GEP1:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C3]](s32) ; GFX9: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p5) :: (load 2, addrspace 5) - ; GFX9: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; GFX9: [[GEP2:%[0-9]+]]:_(p5) = G_GEP [[GEP1]], [[C]](s32) ; GFX9: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p5) :: (load 2, addrspace 5) - ; GFX9: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; GFX9: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC2]](s16), [[TRUNC3]](s16) - ; GFX9: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 - ; GFX9: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C2]](s32) + ; GFX9: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; GFX9: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C1]] + ; GFX9: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; GFX9: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C1]] + ; GFX9: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C2]](s32) + ; GFX9: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; GFX9: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 + ; GFX9: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C4]](s32) ; GFX9: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p5) :: (load 2, addrspace 5) - ; GFX9: [[TRUNC4:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD4]](s32) ; GFX9: [[GEP4:%[0-9]+]]:_(p5) = G_GEP [[GEP3]], [[C]](s32) ; GFX9: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p5) :: (load 2, addrspace 5) - ; GFX9: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) - ; GFX9: [[MV2:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC4]](s16), [[TRUNC5]](s16) - ; GFX9: [[MV3:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[MV]](s32), [[MV1]](s32), [[MV2]](s32) - ; GFX9: $vgpr0_vgpr1_vgpr2 = COPY [[MV3]](s96) + ; GFX9: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD4]](s32) + ; GFX9: [[AND4:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C1]] + ; GFX9: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) + ; GFX9: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C1]] + ; GFX9: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[C2]](s32) + ; GFX9: [[OR2:%[0-9]+]]:_(s32) = G_OR 
[[AND4]], [[SHL2]] + ; GFX9: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32), [[OR2]](s32) + ; GFX9: $vgpr0_vgpr1_vgpr2 = COPY [[MV]](s96) %0:_(p5) = COPY $vgpr0 %1:_(s96) = G_LOAD %0 :: (load 12, align 2, addrspace 5) $vgpr0_vgpr1_vgpr2 = COPY %1 @@ -1566,9 +1746,13 @@ body: | ; SI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[COPY3]](s32) ; SI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[SHL1]](s32) ; SI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[TRUNC3]] - ; SI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; SI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; SI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C6]](s32) + ; SI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; SI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; SI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; SI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C6]](s32) + ; SI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; SI: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; SI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C7]](s32) ; SI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p5) :: (load 1, addrspace 5) ; SI: [[GEP4:%[0-9]+]]:_(p5) = G_GEP [[GEP3]], [[C]](s32) ; SI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p5) :: (load 1, addrspace 5) @@ -1581,18 +1765,21 @@ body: | ; SI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; SI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) ; SI: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C5]] - ; SI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY5]](s32) - ; SI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL2]](s32) - ; SI: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] + ; SI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY5]](s32) + ; SI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) + ; SI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] ; SI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; SI: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; SI: [[COPY7:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; SI: 
[[COPY8:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) ; SI: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY8]], [[C5]] - ; SI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY7]](s32) - ; SI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) - ; SI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; SI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) + ; SI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY7]](s32) + ; SI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) + ; SI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] + ; SI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; SI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; SI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C6]](s32) + ; SI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] ; SI: [[GEP7:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C4]](s32) ; SI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p5) :: (load 1, addrspace 5) ; SI: [[GEP8:%[0-9]+]]:_(p5) = G_GEP [[GEP7]], [[C]](s32) @@ -1606,20 +1793,23 @@ body: | ; SI: [[COPY9:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; SI: [[COPY10:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32) ; SI: [[AND9:%[0-9]+]]:_(s32) = G_AND [[COPY10]], [[C5]] - ; SI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY9]](s32) - ; SI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) - ; SI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] + ; SI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY9]](s32) + ; SI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) + ; SI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] ; SI: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; SI: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C3]] ; SI: [[COPY11:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; SI: [[COPY12:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32) ; SI: [[AND11:%[0-9]+]]:_(s32) = G_AND [[COPY12]], [[C5]] - ; SI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY11]](s32) - ; SI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL5]](s32) - ; SI: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] - ; 
SI: [[MV2:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16) - ; SI: [[MV3:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[MV]](s32), [[MV1]](s32), [[MV2]](s32) - ; SI: $vgpr0_vgpr1_vgpr2 = COPY [[MV3]](s96) + ; SI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY11]](s32) + ; SI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) + ; SI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] + ; SI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; SI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; SI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C6]](s32) + ; SI: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL8]] + ; SI: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR2]](s32), [[OR5]](s32), [[OR8]](s32) + ; SI: $vgpr0_vgpr1_vgpr2 = COPY [[MV]](s96) ; CI-LABEL: name: test_load_private_s96_align1 ; CI: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 ; CI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 1, addrspace 5) @@ -1651,9 +1841,13 @@ body: | ; CI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[COPY3]](s32) ; CI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[SHL1]](s32) ; CI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[TRUNC3]] - ; CI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; CI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; CI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C6]](s32) + ; CI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C6]](s32) + ; CI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; CI: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; CI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C7]](s32) ; CI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p5) :: (load 1, addrspace 5) ; CI: [[GEP4:%[0-9]+]]:_(p5) = G_GEP [[GEP3]], [[C]](s32) ; CI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p5) :: (load 1, addrspace 5) @@ -1666,18 +1860,21 @@ body: | ; CI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[C4]](s32) 
; CI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) ; CI: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C5]] - ; CI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY5]](s32) - ; CI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL2]](s32) - ; CI: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] + ; CI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY5]](s32) + ; CI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) + ; CI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] ; CI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; CI: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; CI: [[COPY7:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) ; CI: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY8]], [[C5]] - ; CI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY7]](s32) - ; CI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) - ; CI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; CI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) + ; CI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY7]](s32) + ; CI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) + ; CI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] + ; CI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; CI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C6]](s32) + ; CI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] ; CI: [[GEP7:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C4]](s32) ; CI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p5) :: (load 1, addrspace 5) ; CI: [[GEP8:%[0-9]+]]:_(p5) = G_GEP [[GEP7]], [[C]](s32) @@ -1691,20 +1888,23 @@ body: | ; CI: [[COPY9:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI: [[COPY10:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32) ; CI: [[AND9:%[0-9]+]]:_(s32) = G_AND [[COPY10]], [[C5]] - ; CI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY9]](s32) - ; CI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) - ; CI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] + ; CI: 
[[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY9]](s32) + ; CI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) + ; CI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] ; CI: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; CI: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C3]] ; CI: [[COPY11:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI: [[COPY12:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32) ; CI: [[AND11:%[0-9]+]]:_(s32) = G_AND [[COPY12]], [[C5]] - ; CI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY11]](s32) - ; CI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL5]](s32) - ; CI: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] - ; CI: [[MV2:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16) - ; CI: [[MV3:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[MV]](s32), [[MV1]](s32), [[MV2]](s32) - ; CI: $vgpr0_vgpr1_vgpr2 = COPY [[MV3]](s96) + ; CI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY11]](s32) + ; CI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) + ; CI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] + ; CI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; CI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; CI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C6]](s32) + ; CI: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL8]] + ; CI: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR2]](s32), [[OR5]](s32), [[OR8]](s32) + ; CI: $vgpr0_vgpr1_vgpr2 = COPY [[MV]](s96) ; VI-LABEL: name: test_load_private_s96_align1 ; VI: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 1, addrspace 5) @@ -1731,9 +1931,13 @@ body: | ; VI: [[AND3:%[0-9]+]]:_(s16) = G_AND [[TRUNC3]], [[C3]] ; VI: [[SHL1:%[0-9]+]]:_(s16) = G_SHL [[AND3]], [[C4]](s16) ; VI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[SHL1]] - ; VI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; VI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; VI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C5]](s32) + ; VI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT 
[[OR]](s16) + ; VI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; VI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C5]](s32) + ; VI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; VI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; VI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C6]](s32) ; VI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p5) :: (load 1, addrspace 5) ; VI: [[GEP4:%[0-9]+]]:_(p5) = G_GEP [[GEP3]], [[C]](s32) ; VI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p5) :: (load 1, addrspace 5) @@ -1745,17 +1949,20 @@ body: | ; VI: [[AND4:%[0-9]+]]:_(s16) = G_AND [[TRUNC4]], [[C3]] ; VI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) ; VI: [[AND5:%[0-9]+]]:_(s16) = G_AND [[TRUNC5]], [[C3]] - ; VI: [[SHL2:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) - ; VI: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL2]] + ; VI: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) + ; VI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL3]] ; VI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; VI: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; VI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) ; VI: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C3]] - ; VI: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) - ; VI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; VI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) - ; VI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 - ; VI: [[GEP7:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C6]](s32) + ; VI: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) + ; VI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL4]] + ; VI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; VI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; VI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C5]](s32) + ; VI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; VI: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 + ; VI: [[GEP7:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C7]](s32) ; VI: 
[[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p5) :: (load 1, addrspace 5) ; VI: [[GEP8:%[0-9]+]]:_(p5) = G_GEP [[GEP7]], [[C]](s32) ; VI: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p5) :: (load 1, addrspace 5) @@ -1767,17 +1974,20 @@ body: | ; VI: [[AND8:%[0-9]+]]:_(s16) = G_AND [[TRUNC8]], [[C3]] ; VI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD9]](s32) ; VI: [[AND9:%[0-9]+]]:_(s16) = G_AND [[TRUNC9]], [[C3]] - ; VI: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C4]](s16) - ; VI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL4]] + ; VI: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C4]](s16) + ; VI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL6]] ; VI: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; VI: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C3]] ; VI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD11]](s32) ; VI: [[AND11:%[0-9]+]]:_(s16) = G_AND [[TRUNC11]], [[C3]] - ; VI: [[SHL5:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C4]](s16) - ; VI: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL5]] - ; VI: [[MV2:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16) - ; VI: [[MV3:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[MV]](s32), [[MV1]](s32), [[MV2]](s32) - ; VI: $vgpr0_vgpr1_vgpr2 = COPY [[MV3]](s96) + ; VI: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C4]](s16) + ; VI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL7]] + ; VI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; VI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; VI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C5]](s32) + ; VI: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL8]] + ; VI: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR2]](s32), [[OR5]](s32), [[OR8]](s32) + ; VI: $vgpr0_vgpr1_vgpr2 = COPY [[MV]](s96) ; GFX9-LABEL: name: test_load_private_s96_align1 ; GFX9: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 ; GFX9: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 1, addrspace 5) @@ -1804,9 +2014,13 @@ body: | ; GFX9: [[AND3:%[0-9]+]]:_(s16) = G_AND [[TRUNC3]], [[C3]] ; GFX9: [[SHL1:%[0-9]+]]:_(s16) = 
G_SHL [[AND3]], [[C4]](s16) ; GFX9: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[SHL1]] - ; GFX9: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; GFX9: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; GFX9: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C5]](s32) + ; GFX9: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C5]](s32) + ; GFX9: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; GFX9: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; GFX9: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C6]](s32) ; GFX9: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p5) :: (load 1, addrspace 5) ; GFX9: [[GEP4:%[0-9]+]]:_(p5) = G_GEP [[GEP3]], [[C]](s32) ; GFX9: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p5) :: (load 1, addrspace 5) @@ -1818,17 +2032,20 @@ body: | ; GFX9: [[AND4:%[0-9]+]]:_(s16) = G_AND [[TRUNC4]], [[C3]] ; GFX9: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) ; GFX9: [[AND5:%[0-9]+]]:_(s16) = G_AND [[TRUNC5]], [[C3]] - ; GFX9: [[SHL2:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) - ; GFX9: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL2]] + ; GFX9: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) + ; GFX9: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL3]] ; GFX9: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; GFX9: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; GFX9: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) ; GFX9: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C3]] - ; GFX9: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) - ; GFX9: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; GFX9: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) - ; GFX9: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 - ; GFX9: [[GEP7:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C6]](s32) + ; GFX9: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) + ; GFX9: [[OR4:%[0-9]+]]:_(s16) = 
G_OR [[AND6]], [[SHL4]] + ; GFX9: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; GFX9: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; GFX9: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C5]](s32) + ; GFX9: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; GFX9: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 + ; GFX9: [[GEP7:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C7]](s32) ; GFX9: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p5) :: (load 1, addrspace 5) ; GFX9: [[GEP8:%[0-9]+]]:_(p5) = G_GEP [[GEP7]], [[C]](s32) ; GFX9: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p5) :: (load 1, addrspace 5) @@ -1840,17 +2057,20 @@ body: | ; GFX9: [[AND8:%[0-9]+]]:_(s16) = G_AND [[TRUNC8]], [[C3]] ; GFX9: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD9]](s32) ; GFX9: [[AND9:%[0-9]+]]:_(s16) = G_AND [[TRUNC9]], [[C3]] - ; GFX9: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C4]](s16) - ; GFX9: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL4]] + ; GFX9: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C4]](s16) + ; GFX9: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL6]] ; GFX9: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; GFX9: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C3]] ; GFX9: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD11]](s32) ; GFX9: [[AND11:%[0-9]+]]:_(s16) = G_AND [[TRUNC11]], [[C3]] - ; GFX9: [[SHL5:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C4]](s16) - ; GFX9: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL5]] - ; GFX9: [[MV2:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16) - ; GFX9: [[MV3:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[MV]](s32), [[MV1]](s32), [[MV2]](s32) - ; GFX9: $vgpr0_vgpr1_vgpr2 = COPY [[MV3]](s96) + ; GFX9: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C4]](s16) + ; GFX9: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL7]] + ; GFX9: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; GFX9: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; GFX9: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C5]](s32) + ; GFX9: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], 
[[SHL8]] + ; GFX9: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR2]](s32), [[OR5]](s32), [[OR8]](s32) + ; GFX9: $vgpr0_vgpr1_vgpr2 = COPY [[MV]](s96) %0:_(p5) = COPY $vgpr0 %1:_(s96) = G_LOAD %0 :: (load 12, align 1, addrspace 5) $vgpr0_vgpr1_vgpr2 = COPY %1 @@ -1893,9 +2113,13 @@ body: | ; SI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[COPY3]](s32) ; SI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[SHL1]](s32) ; SI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[TRUNC3]] - ; SI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; SI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; SI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C6]](s32) + ; SI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; SI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; SI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; SI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C6]](s32) + ; SI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; SI: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; SI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C7]](s32) ; SI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p5) :: (load 1, addrspace 56) ; SI: [[GEP4:%[0-9]+]]:_(p5) = G_GEP [[GEP3]], [[C]](s32) ; SI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p5) :: (load 1, addrspace 56) @@ -1908,18 +2132,21 @@ body: | ; SI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; SI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) ; SI: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C5]] - ; SI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY5]](s32) - ; SI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL2]](s32) - ; SI: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] + ; SI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY5]](s32) + ; SI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) + ; SI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] ; SI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; SI: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; SI: [[COPY7:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; SI: [[COPY8:%[0-9]+]]:_(s32) 
= COPY [[LOAD7]](s32) ; SI: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY8]], [[C5]] - ; SI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY7]](s32) - ; SI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) - ; SI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; SI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) + ; SI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY7]](s32) + ; SI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) + ; SI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] + ; SI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; SI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; SI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C6]](s32) + ; SI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] ; SI: [[GEP7:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C4]](s32) ; SI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p5) :: (load 1, addrspace 56) ; SI: [[GEP8:%[0-9]+]]:_(p5) = G_GEP [[GEP7]], [[C]](s32) @@ -1933,20 +2160,23 @@ body: | ; SI: [[COPY9:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; SI: [[COPY10:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32) ; SI: [[AND9:%[0-9]+]]:_(s32) = G_AND [[COPY10]], [[C5]] - ; SI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY9]](s32) - ; SI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) - ; SI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] + ; SI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY9]](s32) + ; SI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) + ; SI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] ; SI: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; SI: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C3]] ; SI: [[COPY11:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; SI: [[COPY12:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32) ; SI: [[AND11:%[0-9]+]]:_(s32) = G_AND [[COPY12]], [[C5]] - ; SI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY11]](s32) - ; SI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL5]](s32) - ; SI: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] - ; SI: 
[[MV2:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16) - ; SI: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 - ; SI: [[GEP11:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C7]](s32) + ; SI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY11]](s32) + ; SI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) + ; SI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] + ; SI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; SI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; SI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C6]](s32) + ; SI: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL8]] + ; SI: [[C8:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 + ; SI: [[GEP11:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C8]](s32) ; SI: [[LOAD12:%[0-9]+]]:_(s32) = G_LOAD [[GEP11]](p5) :: (load 1, addrspace 56) ; SI: [[GEP12:%[0-9]+]]:_(p5) = G_GEP [[GEP11]], [[C]](s32) ; SI: [[LOAD13:%[0-9]+]]:_(s32) = G_LOAD [[GEP12]](p5) :: (load 1, addrspace 56) @@ -1959,20 +2189,23 @@ body: | ; SI: [[COPY13:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; SI: [[COPY14:%[0-9]+]]:_(s32) = COPY [[LOAD13]](s32) ; SI: [[AND13:%[0-9]+]]:_(s32) = G_AND [[COPY14]], [[C5]] - ; SI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY13]](s32) - ; SI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) - ; SI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] + ; SI: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY13]](s32) + ; SI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL9]](s32) + ; SI: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] ; SI: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; SI: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C3]] ; SI: [[COPY15:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; SI: [[COPY16:%[0-9]+]]:_(s32) = COPY [[LOAD15]](s32) ; SI: [[AND15:%[0-9]+]]:_(s32) = G_AND [[COPY16]], [[C5]] - ; SI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY15]](s32) - ; SI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) - ; SI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] - ; SI: 
[[MV3:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR6]](s16), [[OR7]](s16) - ; SI: [[MV4:%[0-9]+]]:_(s128) = G_MERGE_VALUES [[MV]](s32), [[MV1]](s32), [[MV2]](s32), [[MV3]](s32) - ; SI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[MV4]](s128) + ; SI: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY15]](s32) + ; SI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL10]](s32) + ; SI: [[OR10:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] + ; SI: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; SI: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR10]](s16) + ; SI: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C6]](s32) + ; SI: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; SI: [[MV:%[0-9]+]]:_(s128) = G_MERGE_VALUES [[OR2]](s32), [[OR5]](s32), [[OR8]](s32), [[OR11]](s32) + ; SI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[MV]](s128) ; CI-LABEL: name: test_load_private_s128_align16 ; CI: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 ; CI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 1, addrspace 56) @@ -2004,9 +2237,13 @@ body: | ; CI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[COPY3]](s32) ; CI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[SHL1]](s32) ; CI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[TRUNC3]] - ; CI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; CI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; CI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C6]](s32) + ; CI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C6]](s32) + ; CI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; CI: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; CI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C7]](s32) ; CI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p5) :: (load 1, addrspace 56) ; CI: [[GEP4:%[0-9]+]]:_(p5) = G_GEP [[GEP3]], [[C]](s32) ; CI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p5) :: (load 1, addrspace 56) @@ -2019,18 +2256,21 @@ body: 
| ; CI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) ; CI: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C5]] - ; CI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY5]](s32) - ; CI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL2]](s32) - ; CI: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] + ; CI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY5]](s32) + ; CI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) + ; CI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] ; CI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; CI: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; CI: [[COPY7:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) ; CI: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY8]], [[C5]] - ; CI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY7]](s32) - ; CI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) - ; CI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; CI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) + ; CI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY7]](s32) + ; CI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) + ; CI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] + ; CI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; CI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C6]](s32) + ; CI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] ; CI: [[GEP7:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C4]](s32) ; CI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p5) :: (load 1, addrspace 56) ; CI: [[GEP8:%[0-9]+]]:_(p5) = G_GEP [[GEP7]], [[C]](s32) @@ -2044,20 +2284,23 @@ body: | ; CI: [[COPY9:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI: [[COPY10:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32) ; CI: [[AND9:%[0-9]+]]:_(s32) = G_AND [[COPY10]], [[C5]] - ; CI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY9]](s32) - ; CI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) - ; CI: 
[[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] + ; CI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY9]](s32) + ; CI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) + ; CI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] ; CI: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; CI: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C3]] ; CI: [[COPY11:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI: [[COPY12:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32) ; CI: [[AND11:%[0-9]+]]:_(s32) = G_AND [[COPY12]], [[C5]] - ; CI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY11]](s32) - ; CI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL5]](s32) - ; CI: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] - ; CI: [[MV2:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16) - ; CI: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 - ; CI: [[GEP11:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C7]](s32) + ; CI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY11]](s32) + ; CI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) + ; CI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] + ; CI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; CI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; CI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C6]](s32) + ; CI: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL8]] + ; CI: [[C8:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 + ; CI: [[GEP11:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C8]](s32) ; CI: [[LOAD12:%[0-9]+]]:_(s32) = G_LOAD [[GEP11]](p5) :: (load 1, addrspace 56) ; CI: [[GEP12:%[0-9]+]]:_(p5) = G_GEP [[GEP11]], [[C]](s32) ; CI: [[LOAD13:%[0-9]+]]:_(s32) = G_LOAD [[GEP12]](p5) :: (load 1, addrspace 56) @@ -2070,20 +2313,23 @@ body: | ; CI: [[COPY13:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI: [[COPY14:%[0-9]+]]:_(s32) = COPY [[LOAD13]](s32) ; CI: [[AND13:%[0-9]+]]:_(s32) = G_AND [[COPY14]], [[C5]] - ; CI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY13]](s32) - ; CI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) - ; CI: [[OR6:%[0-9]+]]:_(s16) = 
G_OR [[AND12]], [[TRUNC13]] + ; CI: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY13]](s32) + ; CI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL9]](s32) + ; CI: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] ; CI: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; CI: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C3]] ; CI: [[COPY15:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI: [[COPY16:%[0-9]+]]:_(s32) = COPY [[LOAD15]](s32) ; CI: [[AND15:%[0-9]+]]:_(s32) = G_AND [[COPY16]], [[C5]] - ; CI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY15]](s32) - ; CI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) - ; CI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] - ; CI: [[MV3:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR6]](s16), [[OR7]](s16) - ; CI: [[MV4:%[0-9]+]]:_(s128) = G_MERGE_VALUES [[MV]](s32), [[MV1]](s32), [[MV2]](s32), [[MV3]](s32) - ; CI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[MV4]](s128) + ; CI: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY15]](s32) + ; CI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL10]](s32) + ; CI: [[OR10:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] + ; CI: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; CI: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR10]](s16) + ; CI: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C6]](s32) + ; CI: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; CI: [[MV:%[0-9]+]]:_(s128) = G_MERGE_VALUES [[OR2]](s32), [[OR5]](s32), [[OR8]](s32), [[OR11]](s32) + ; CI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[MV]](s128) ; VI-LABEL: name: test_load_private_s128_align16 ; VI: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 1, addrspace 56) @@ -2110,9 +2356,13 @@ body: | ; VI: [[AND3:%[0-9]+]]:_(s16) = G_AND [[TRUNC3]], [[C3]] ; VI: [[SHL1:%[0-9]+]]:_(s16) = G_SHL [[AND3]], [[C4]](s16) ; VI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[SHL1]] - ; VI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; VI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; 
VI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C5]](s32) + ; VI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; VI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; VI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C5]](s32) + ; VI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; VI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; VI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C6]](s32) ; VI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p5) :: (load 1, addrspace 56) ; VI: [[GEP4:%[0-9]+]]:_(p5) = G_GEP [[GEP3]], [[C]](s32) ; VI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p5) :: (load 1, addrspace 56) @@ -2124,17 +2374,20 @@ body: | ; VI: [[AND4:%[0-9]+]]:_(s16) = G_AND [[TRUNC4]], [[C3]] ; VI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) ; VI: [[AND5:%[0-9]+]]:_(s16) = G_AND [[TRUNC5]], [[C3]] - ; VI: [[SHL2:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) - ; VI: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL2]] + ; VI: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) + ; VI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL3]] ; VI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; VI: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; VI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) ; VI: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C3]] - ; VI: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) - ; VI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; VI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) - ; VI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 - ; VI: [[GEP7:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C6]](s32) + ; VI: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) + ; VI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL4]] + ; VI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; VI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; VI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C5]](s32) + ; VI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; VI: 
[[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 + ; VI: [[GEP7:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C7]](s32) ; VI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p5) :: (load 1, addrspace 56) ; VI: [[GEP8:%[0-9]+]]:_(p5) = G_GEP [[GEP7]], [[C]](s32) ; VI: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p5) :: (load 1, addrspace 56) @@ -2146,17 +2399,20 @@ body: | ; VI: [[AND8:%[0-9]+]]:_(s16) = G_AND [[TRUNC8]], [[C3]] ; VI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD9]](s32) ; VI: [[AND9:%[0-9]+]]:_(s16) = G_AND [[TRUNC9]], [[C3]] - ; VI: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C4]](s16) - ; VI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL4]] + ; VI: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C4]](s16) + ; VI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL6]] ; VI: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; VI: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C3]] ; VI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD11]](s32) ; VI: [[AND11:%[0-9]+]]:_(s16) = G_AND [[TRUNC11]], [[C3]] - ; VI: [[SHL5:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C4]](s16) - ; VI: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL5]] - ; VI: [[MV2:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16) - ; VI: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 - ; VI: [[GEP11:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C7]](s32) + ; VI: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C4]](s16) + ; VI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL7]] + ; VI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; VI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; VI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C5]](s32) + ; VI: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL8]] + ; VI: [[C8:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 + ; VI: [[GEP11:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C8]](s32) ; VI: [[LOAD12:%[0-9]+]]:_(s32) = G_LOAD [[GEP11]](p5) :: (load 1, addrspace 56) ; VI: [[GEP12:%[0-9]+]]:_(p5) = G_GEP [[GEP11]], [[C]](s32) ; VI: [[LOAD13:%[0-9]+]]:_(s32) = G_LOAD [[GEP12]](p5) :: (load 1, addrspace 56) @@ 
-2168,17 +2424,20 @@ body: | ; VI: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C3]] ; VI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD13]](s32) ; VI: [[AND13:%[0-9]+]]:_(s16) = G_AND [[TRUNC13]], [[C3]] - ; VI: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C4]](s16) - ; VI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL6]] + ; VI: [[SHL9:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C4]](s16) + ; VI: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL9]] ; VI: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; VI: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C3]] ; VI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD15]](s32) ; VI: [[AND15:%[0-9]+]]:_(s16) = G_AND [[TRUNC15]], [[C3]] - ; VI: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C4]](s16) - ; VI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL7]] - ; VI: [[MV3:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR6]](s16), [[OR7]](s16) - ; VI: [[MV4:%[0-9]+]]:_(s128) = G_MERGE_VALUES [[MV]](s32), [[MV1]](s32), [[MV2]](s32), [[MV3]](s32) - ; VI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[MV4]](s128) + ; VI: [[SHL10:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C4]](s16) + ; VI: [[OR10:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL10]] + ; VI: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; VI: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR10]](s16) + ; VI: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C5]](s32) + ; VI: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; VI: [[MV:%[0-9]+]]:_(s128) = G_MERGE_VALUES [[OR2]](s32), [[OR5]](s32), [[OR8]](s32), [[OR11]](s32) + ; VI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[MV]](s128) ; GFX9-LABEL: name: test_load_private_s128_align16 ; GFX9: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 ; GFX9: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 1, addrspace 56) @@ -2205,9 +2464,13 @@ body: | ; GFX9: [[AND3:%[0-9]+]]:_(s16) = G_AND [[TRUNC3]], [[C3]] ; GFX9: [[SHL1:%[0-9]+]]:_(s16) = G_SHL [[AND3]], [[C4]](s16) ; GFX9: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[SHL1]] - ; GFX9: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), 
[[OR1]](s16) - ; GFX9: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; GFX9: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C5]](s32) + ; GFX9: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C5]](s32) + ; GFX9: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; GFX9: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; GFX9: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C6]](s32) ; GFX9: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p5) :: (load 1, addrspace 56) ; GFX9: [[GEP4:%[0-9]+]]:_(p5) = G_GEP [[GEP3]], [[C]](s32) ; GFX9: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p5) :: (load 1, addrspace 56) @@ -2219,17 +2482,20 @@ body: | ; GFX9: [[AND4:%[0-9]+]]:_(s16) = G_AND [[TRUNC4]], [[C3]] ; GFX9: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) ; GFX9: [[AND5:%[0-9]+]]:_(s16) = G_AND [[TRUNC5]], [[C3]] - ; GFX9: [[SHL2:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) - ; GFX9: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL2]] + ; GFX9: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) + ; GFX9: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL3]] ; GFX9: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; GFX9: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; GFX9: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) ; GFX9: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C3]] - ; GFX9: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) - ; GFX9: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; GFX9: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) - ; GFX9: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 - ; GFX9: [[GEP7:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C6]](s32) + ; GFX9: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) + ; GFX9: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL4]] + ; GFX9: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; GFX9: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; 
GFX9: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C5]](s32) + ; GFX9: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; GFX9: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 + ; GFX9: [[GEP7:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C7]](s32) ; GFX9: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p5) :: (load 1, addrspace 56) ; GFX9: [[GEP8:%[0-9]+]]:_(p5) = G_GEP [[GEP7]], [[C]](s32) ; GFX9: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p5) :: (load 1, addrspace 56) @@ -2241,17 +2507,20 @@ body: | ; GFX9: [[AND8:%[0-9]+]]:_(s16) = G_AND [[TRUNC8]], [[C3]] ; GFX9: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD9]](s32) ; GFX9: [[AND9:%[0-9]+]]:_(s16) = G_AND [[TRUNC9]], [[C3]] - ; GFX9: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C4]](s16) - ; GFX9: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL4]] + ; GFX9: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C4]](s16) + ; GFX9: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL6]] ; GFX9: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; GFX9: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C3]] ; GFX9: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD11]](s32) ; GFX9: [[AND11:%[0-9]+]]:_(s16) = G_AND [[TRUNC11]], [[C3]] - ; GFX9: [[SHL5:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C4]](s16) - ; GFX9: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL5]] - ; GFX9: [[MV2:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16) - ; GFX9: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 - ; GFX9: [[GEP11:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C7]](s32) + ; GFX9: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C4]](s16) + ; GFX9: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL7]] + ; GFX9: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; GFX9: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; GFX9: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C5]](s32) + ; GFX9: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL8]] + ; GFX9: [[C8:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 + ; GFX9: [[GEP11:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C8]](s32) ; GFX9: [[LOAD12:%[0-9]+]]:_(s32) = G_LOAD 
[[GEP11]](p5) :: (load 1, addrspace 56) ; GFX9: [[GEP12:%[0-9]+]]:_(p5) = G_GEP [[GEP11]], [[C]](s32) ; GFX9: [[LOAD13:%[0-9]+]]:_(s32) = G_LOAD [[GEP12]](p5) :: (load 1, addrspace 56) @@ -2263,17 +2532,20 @@ body: | ; GFX9: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C3]] ; GFX9: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD13]](s32) ; GFX9: [[AND13:%[0-9]+]]:_(s16) = G_AND [[TRUNC13]], [[C3]] - ; GFX9: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C4]](s16) - ; GFX9: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL6]] + ; GFX9: [[SHL9:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C4]](s16) + ; GFX9: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL9]] ; GFX9: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; GFX9: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C3]] ; GFX9: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD15]](s32) ; GFX9: [[AND15:%[0-9]+]]:_(s16) = G_AND [[TRUNC15]], [[C3]] - ; GFX9: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C4]](s16) - ; GFX9: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL7]] - ; GFX9: [[MV3:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR6]](s16), [[OR7]](s16) - ; GFX9: [[MV4:%[0-9]+]]:_(s128) = G_MERGE_VALUES [[MV]](s32), [[MV1]](s32), [[MV2]](s32), [[MV3]](s32) - ; GFX9: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[MV4]](s128) + ; GFX9: [[SHL10:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C4]](s16) + ; GFX9: [[OR10:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL10]] + ; GFX9: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; GFX9: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR10]](s16) + ; GFX9: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C5]](s32) + ; GFX9: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; GFX9: [[MV:%[0-9]+]]:_(s128) = G_MERGE_VALUES [[OR2]](s32), [[OR5]](s32), [[OR8]](s32), [[OR11]](s32) + ; GFX9: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[MV]](s128) %0:_(p5) = COPY $vgpr0 %1:_(s128) = G_LOAD %0 :: (load 16, align 1, addrspace 56) $vgpr0_vgpr1_vgpr2_vgpr3 = COPY %1 @@ -2422,143 +2694,199 @@ body: | ; SI-LABEL: name: test_load_private_s128_align2 ; SI: 
[[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 ; SI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 2, addrspace 5) - ; SI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; SI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; SI: [[GEP:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C]](s32) ; SI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p5) :: (load 2, addrspace 5) - ; SI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; SI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; SI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; SI: [[GEP1:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C1]](s32) + ; SI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; SI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; SI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; SI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; SI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; SI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; SI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; SI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; SI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; SI: [[GEP1:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C3]](s32) ; SI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p5) :: (load 2, addrspace 5) - ; SI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; SI: [[GEP2:%[0-9]+]]:_(p5) = G_GEP [[GEP1]], [[C]](s32) ; SI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p5) :: (load 2, addrspace 5) - ; SI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; SI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC2]](s16), [[TRUNC3]](s16) - ; SI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 - ; SI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C2]](s32) + ; SI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; SI: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C1]] + ; SI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; SI: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C1]] + ; SI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C2]](s32) + ; SI: [[OR1:%[0-9]+]]:_(s32) = G_OR 
[[AND2]], [[SHL1]] + ; SI: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 + ; SI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C4]](s32) ; SI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p5) :: (load 2, addrspace 5) - ; SI: [[TRUNC4:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD4]](s32) ; SI: [[GEP4:%[0-9]+]]:_(p5) = G_GEP [[GEP3]], [[C]](s32) ; SI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p5) :: (load 2, addrspace 5) - ; SI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) - ; SI: [[MV2:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC4]](s16), [[TRUNC5]](s16) - ; SI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 - ; SI: [[GEP5:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C3]](s32) + ; SI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD4]](s32) + ; SI: [[AND4:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C1]] + ; SI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) + ; SI: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C1]] + ; SI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[C2]](s32) + ; SI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[AND4]], [[SHL2]] + ; SI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 + ; SI: [[GEP5:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C5]](s32) ; SI: [[LOAD6:%[0-9]+]]:_(s32) = G_LOAD [[GEP5]](p5) :: (load 2, addrspace 5) - ; SI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; SI: [[GEP6:%[0-9]+]]:_(p5) = G_GEP [[GEP5]], [[C]](s32) ; SI: [[LOAD7:%[0-9]+]]:_(s32) = G_LOAD [[GEP6]](p5) :: (load 2, addrspace 5) - ; SI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) - ; SI: [[MV3:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC6]](s16), [[TRUNC7]](s16) - ; SI: [[MV4:%[0-9]+]]:_(s128) = G_MERGE_VALUES [[MV]](s32), [[MV1]](s32), [[MV2]](s32), [[MV3]](s32) - ; SI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[MV4]](s128) + ; SI: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LOAD6]](s32) + ; SI: [[AND6:%[0-9]+]]:_(s32) = G_AND [[COPY7]], [[C1]] + ; SI: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) + ; SI: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY8]], [[C1]] + ; SI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C2]](s32) + ; SI: [[OR3:%[0-9]+]]:_(s32) = 
G_OR [[AND6]], [[SHL3]] + ; SI: [[MV:%[0-9]+]]:_(s128) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32), [[OR2]](s32), [[OR3]](s32) + ; SI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[MV]](s128) ; CI-LABEL: name: test_load_private_s128_align2 ; CI: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 ; CI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 2, addrspace 5) - ; CI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; CI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; CI: [[GEP:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C]](s32) ; CI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p5) :: (load 2, addrspace 5) - ; CI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; CI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; CI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; CI: [[GEP1:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C1]](s32) + ; CI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; CI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; CI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; CI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; CI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; CI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; CI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; CI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; CI: [[GEP1:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C3]](s32) ; CI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p5) :: (load 2, addrspace 5) - ; CI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; CI: [[GEP2:%[0-9]+]]:_(p5) = G_GEP [[GEP1]], [[C]](s32) ; CI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p5) :: (load 2, addrspace 5) - ; CI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; CI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC2]](s16), [[TRUNC3]](s16) - ; CI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 - ; CI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C2]](s32) + ; CI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; CI: [[AND2:%[0-9]+]]:_(s32) = 
G_AND [[COPY3]], [[C1]] + ; CI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; CI: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C1]] + ; CI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C2]](s32) + ; CI: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; CI: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 + ; CI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C4]](s32) ; CI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p5) :: (load 2, addrspace 5) - ; CI: [[TRUNC4:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD4]](s32) ; CI: [[GEP4:%[0-9]+]]:_(p5) = G_GEP [[GEP3]], [[C]](s32) ; CI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p5) :: (load 2, addrspace 5) - ; CI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) - ; CI: [[MV2:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC4]](s16), [[TRUNC5]](s16) - ; CI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 - ; CI: [[GEP5:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C3]](s32) + ; CI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD4]](s32) + ; CI: [[AND4:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C1]] + ; CI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) + ; CI: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C1]] + ; CI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[C2]](s32) + ; CI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[AND4]], [[SHL2]] + ; CI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 + ; CI: [[GEP5:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C5]](s32) ; CI: [[LOAD6:%[0-9]+]]:_(s32) = G_LOAD [[GEP5]](p5) :: (load 2, addrspace 5) - ; CI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; CI: [[GEP6:%[0-9]+]]:_(p5) = G_GEP [[GEP5]], [[C]](s32) ; CI: [[LOAD7:%[0-9]+]]:_(s32) = G_LOAD [[GEP6]](p5) :: (load 2, addrspace 5) - ; CI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) - ; CI: [[MV3:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC6]](s16), [[TRUNC7]](s16) - ; CI: [[MV4:%[0-9]+]]:_(s128) = G_MERGE_VALUES [[MV]](s32), [[MV1]](s32), [[MV2]](s32), [[MV3]](s32) - ; CI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[MV4]](s128) + ; CI: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LOAD6]](s32) + ; CI: 
[[AND6:%[0-9]+]]:_(s32) = G_AND [[COPY7]], [[C1]] + ; CI: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) + ; CI: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY8]], [[C1]] + ; CI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C2]](s32) + ; CI: [[OR3:%[0-9]+]]:_(s32) = G_OR [[AND6]], [[SHL3]] + ; CI: [[MV:%[0-9]+]]:_(s128) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32), [[OR2]](s32), [[OR3]](s32) + ; CI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[MV]](s128) ; VI-LABEL: name: test_load_private_s128_align2 ; VI: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 2, addrspace 5) - ; VI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; VI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; VI: [[GEP:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C]](s32) ; VI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p5) :: (load 2, addrspace 5) - ; VI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; VI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; VI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; VI: [[GEP1:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C1]](s32) + ; VI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; VI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; VI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; VI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; VI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; VI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; VI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; VI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; VI: [[GEP1:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C3]](s32) ; VI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p5) :: (load 2, addrspace 5) - ; VI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; VI: [[GEP2:%[0-9]+]]:_(p5) = G_GEP [[GEP1]], [[C]](s32) ; VI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p5) :: (load 2, addrspace 5) - ; VI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; VI: 
[[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC2]](s16), [[TRUNC3]](s16) - ; VI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 - ; VI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C2]](s32) + ; VI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; VI: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C1]] + ; VI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; VI: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C1]] + ; VI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C2]](s32) + ; VI: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; VI: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 + ; VI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C4]](s32) ; VI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p5) :: (load 2, addrspace 5) - ; VI: [[TRUNC4:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD4]](s32) ; VI: [[GEP4:%[0-9]+]]:_(p5) = G_GEP [[GEP3]], [[C]](s32) ; VI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p5) :: (load 2, addrspace 5) - ; VI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) - ; VI: [[MV2:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC4]](s16), [[TRUNC5]](s16) - ; VI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 - ; VI: [[GEP5:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C3]](s32) + ; VI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD4]](s32) + ; VI: [[AND4:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C1]] + ; VI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) + ; VI: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C1]] + ; VI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[C2]](s32) + ; VI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[AND4]], [[SHL2]] + ; VI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 + ; VI: [[GEP5:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C5]](s32) ; VI: [[LOAD6:%[0-9]+]]:_(s32) = G_LOAD [[GEP5]](p5) :: (load 2, addrspace 5) - ; VI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; VI: [[GEP6:%[0-9]+]]:_(p5) = G_GEP [[GEP5]], [[C]](s32) ; VI: [[LOAD7:%[0-9]+]]:_(s32) = G_LOAD [[GEP6]](p5) :: (load 2, addrspace 5) - ; VI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) - ; VI: [[MV3:%[0-9]+]]:_(s32) = G_MERGE_VALUES 
[[TRUNC6]](s16), [[TRUNC7]](s16) - ; VI: [[MV4:%[0-9]+]]:_(s128) = G_MERGE_VALUES [[MV]](s32), [[MV1]](s32), [[MV2]](s32), [[MV3]](s32) - ; VI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[MV4]](s128) + ; VI: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LOAD6]](s32) + ; VI: [[AND6:%[0-9]+]]:_(s32) = G_AND [[COPY7]], [[C1]] + ; VI: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) + ; VI: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY8]], [[C1]] + ; VI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C2]](s32) + ; VI: [[OR3:%[0-9]+]]:_(s32) = G_OR [[AND6]], [[SHL3]] + ; VI: [[MV:%[0-9]+]]:_(s128) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32), [[OR2]](s32), [[OR3]](s32) + ; VI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[MV]](s128) ; GFX9-LABEL: name: test_load_private_s128_align2 ; GFX9: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 ; GFX9: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 2, addrspace 5) - ; GFX9: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; GFX9: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; GFX9: [[GEP:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C]](s32) ; GFX9: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p5) :: (load 2, addrspace 5) - ; GFX9: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; GFX9: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; GFX9: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; GFX9: [[GEP1:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C1]](s32) + ; GFX9: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; GFX9: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; GFX9: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; GFX9: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; GFX9: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; GFX9: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; GFX9: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; GFX9: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; GFX9: [[GEP1:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C3]](s32) ; GFX9: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p5) :: 
(load 2, addrspace 5) - ; GFX9: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; GFX9: [[GEP2:%[0-9]+]]:_(p5) = G_GEP [[GEP1]], [[C]](s32) ; GFX9: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p5) :: (load 2, addrspace 5) - ; GFX9: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; GFX9: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC2]](s16), [[TRUNC3]](s16) - ; GFX9: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 - ; GFX9: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C2]](s32) + ; GFX9: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; GFX9: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C1]] + ; GFX9: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; GFX9: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C1]] + ; GFX9: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C2]](s32) + ; GFX9: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; GFX9: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 + ; GFX9: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C4]](s32) ; GFX9: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p5) :: (load 2, addrspace 5) - ; GFX9: [[TRUNC4:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD4]](s32) ; GFX9: [[GEP4:%[0-9]+]]:_(p5) = G_GEP [[GEP3]], [[C]](s32) ; GFX9: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p5) :: (load 2, addrspace 5) - ; GFX9: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) - ; GFX9: [[MV2:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC4]](s16), [[TRUNC5]](s16) - ; GFX9: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 - ; GFX9: [[GEP5:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C3]](s32) + ; GFX9: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD4]](s32) + ; GFX9: [[AND4:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C1]] + ; GFX9: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) + ; GFX9: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C1]] + ; GFX9: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[C2]](s32) + ; GFX9: [[OR2:%[0-9]+]]:_(s32) = G_OR [[AND4]], [[SHL2]] + ; GFX9: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 + ; GFX9: [[GEP5:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C5]](s32) ; GFX9: [[LOAD6:%[0-9]+]]:_(s32) = 
G_LOAD [[GEP5]](p5) :: (load 2, addrspace 5) - ; GFX9: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; GFX9: [[GEP6:%[0-9]+]]:_(p5) = G_GEP [[GEP5]], [[C]](s32) ; GFX9: [[LOAD7:%[0-9]+]]:_(s32) = G_LOAD [[GEP6]](p5) :: (load 2, addrspace 5) - ; GFX9: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) - ; GFX9: [[MV3:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC6]](s16), [[TRUNC7]](s16) - ; GFX9: [[MV4:%[0-9]+]]:_(s128) = G_MERGE_VALUES [[MV]](s32), [[MV1]](s32), [[MV2]](s32), [[MV3]](s32) - ; GFX9: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[MV4]](s128) + ; GFX9: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LOAD6]](s32) + ; GFX9: [[AND6:%[0-9]+]]:_(s32) = G_AND [[COPY7]], [[C1]] + ; GFX9: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) + ; GFX9: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY8]], [[C1]] + ; GFX9: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C2]](s32) + ; GFX9: [[OR3:%[0-9]+]]:_(s32) = G_OR [[AND6]], [[SHL3]] + ; GFX9: [[MV:%[0-9]+]]:_(s128) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32), [[OR2]](s32), [[OR3]](s32) + ; GFX9: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[MV]](s128) %0:_(p5) = COPY $vgpr0 %1:_(s128) = G_LOAD %0 :: (load 16, align 2, addrspace 5) $vgpr0_vgpr1_vgpr2_vgpr3 = COPY %1 @@ -2601,9 +2929,13 @@ body: | ; SI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[COPY3]](s32) ; SI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[SHL1]](s32) ; SI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[TRUNC3]] - ; SI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; SI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; SI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C6]](s32) + ; SI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; SI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; SI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; SI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C6]](s32) + ; SI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; SI: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; SI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C7]](s32) ; SI: [[LOAD4:%[0-9]+]]:_(s32) = 
G_LOAD [[GEP3]](p5) :: (load 1, addrspace 5) ; SI: [[GEP4:%[0-9]+]]:_(p5) = G_GEP [[GEP3]], [[C]](s32) ; SI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p5) :: (load 1, addrspace 5) @@ -2616,18 +2948,21 @@ body: | ; SI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; SI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) ; SI: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C5]] - ; SI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY5]](s32) - ; SI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL2]](s32) - ; SI: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] + ; SI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY5]](s32) + ; SI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) + ; SI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] ; SI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; SI: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; SI: [[COPY7:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; SI: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) ; SI: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY8]], [[C5]] - ; SI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY7]](s32) - ; SI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) - ; SI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; SI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) + ; SI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY7]](s32) + ; SI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) + ; SI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] + ; SI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; SI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; SI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C6]](s32) + ; SI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] ; SI: [[GEP7:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C4]](s32) ; SI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p5) :: (load 1, addrspace 5) ; SI: [[GEP8:%[0-9]+]]:_(p5) = G_GEP [[GEP7]], [[C]](s32) @@ -2641,20 +2976,23 @@ body: | ; SI: [[COPY9:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; SI: [[COPY10:%[0-9]+]]:_(s32) = COPY 
[[LOAD9]](s32) ; SI: [[AND9:%[0-9]+]]:_(s32) = G_AND [[COPY10]], [[C5]] - ; SI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY9]](s32) - ; SI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) - ; SI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] + ; SI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY9]](s32) + ; SI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) + ; SI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] ; SI: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; SI: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C3]] ; SI: [[COPY11:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; SI: [[COPY12:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32) ; SI: [[AND11:%[0-9]+]]:_(s32) = G_AND [[COPY12]], [[C5]] - ; SI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY11]](s32) - ; SI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL5]](s32) - ; SI: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] - ; SI: [[MV2:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16) - ; SI: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 - ; SI: [[GEP11:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C7]](s32) + ; SI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY11]](s32) + ; SI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) + ; SI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] + ; SI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; SI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; SI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C6]](s32) + ; SI: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL8]] + ; SI: [[C8:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 + ; SI: [[GEP11:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C8]](s32) ; SI: [[LOAD12:%[0-9]+]]:_(s32) = G_LOAD [[GEP11]](p5) :: (load 1, addrspace 5) ; SI: [[GEP12:%[0-9]+]]:_(p5) = G_GEP [[GEP11]], [[C]](s32) ; SI: [[LOAD13:%[0-9]+]]:_(s32) = G_LOAD [[GEP12]](p5) :: (load 1, addrspace 5) @@ -2667,20 +3005,23 @@ body: | ; SI: [[COPY13:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; SI: [[COPY14:%[0-9]+]]:_(s32) = COPY [[LOAD13]](s32) ; SI: 
[[AND13:%[0-9]+]]:_(s32) = G_AND [[COPY14]], [[C5]] - ; SI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY13]](s32) - ; SI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) - ; SI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] + ; SI: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY13]](s32) + ; SI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL9]](s32) + ; SI: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] ; SI: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; SI: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C3]] ; SI: [[COPY15:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; SI: [[COPY16:%[0-9]+]]:_(s32) = COPY [[LOAD15]](s32) ; SI: [[AND15:%[0-9]+]]:_(s32) = G_AND [[COPY16]], [[C5]] - ; SI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY15]](s32) - ; SI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) - ; SI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] - ; SI: [[MV3:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR6]](s16), [[OR7]](s16) - ; SI: [[MV4:%[0-9]+]]:_(s128) = G_MERGE_VALUES [[MV]](s32), [[MV1]](s32), [[MV2]](s32), [[MV3]](s32) - ; SI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[MV4]](s128) + ; SI: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY15]](s32) + ; SI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL10]](s32) + ; SI: [[OR10:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] + ; SI: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; SI: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR10]](s16) + ; SI: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C6]](s32) + ; SI: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; SI: [[MV:%[0-9]+]]:_(s128) = G_MERGE_VALUES [[OR2]](s32), [[OR5]](s32), [[OR8]](s32), [[OR11]](s32) + ; SI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[MV]](s128) ; CI-LABEL: name: test_load_private_s128_align1 ; CI: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 ; CI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 1, addrspace 5) @@ -2712,9 +3053,13 @@ body: | ; CI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[COPY3]](s32) ; CI: 
[[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[SHL1]](s32) ; CI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[TRUNC3]] - ; CI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; CI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; CI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C6]](s32) + ; CI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C6]](s32) + ; CI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; CI: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; CI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C7]](s32) ; CI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p5) :: (load 1, addrspace 5) ; CI: [[GEP4:%[0-9]+]]:_(p5) = G_GEP [[GEP3]], [[C]](s32) ; CI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p5) :: (load 1, addrspace 5) @@ -2727,18 +3072,21 @@ body: | ; CI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) ; CI: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C5]] - ; CI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY5]](s32) - ; CI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL2]](s32) - ; CI: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] + ; CI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY5]](s32) + ; CI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) + ; CI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] ; CI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; CI: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; CI: [[COPY7:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) ; CI: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY8]], [[C5]] - ; CI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY7]](s32) - ; CI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) - ; CI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; CI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) + ; CI: 
[[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY7]](s32) + ; CI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) + ; CI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] + ; CI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; CI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C6]](s32) + ; CI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] ; CI: [[GEP7:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C4]](s32) ; CI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p5) :: (load 1, addrspace 5) ; CI: [[GEP8:%[0-9]+]]:_(p5) = G_GEP [[GEP7]], [[C]](s32) @@ -2752,20 +3100,23 @@ body: | ; CI: [[COPY9:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI: [[COPY10:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32) ; CI: [[AND9:%[0-9]+]]:_(s32) = G_AND [[COPY10]], [[C5]] - ; CI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY9]](s32) - ; CI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) - ; CI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] + ; CI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY9]](s32) + ; CI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) + ; CI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] ; CI: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; CI: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C3]] ; CI: [[COPY11:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI: [[COPY12:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32) ; CI: [[AND11:%[0-9]+]]:_(s32) = G_AND [[COPY12]], [[C5]] - ; CI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY11]](s32) - ; CI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL5]](s32) - ; CI: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] - ; CI: [[MV2:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16) - ; CI: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 - ; CI: [[GEP11:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C7]](s32) + ; CI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY11]](s32) + ; CI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) + ; CI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] 
+ ; CI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; CI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; CI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C6]](s32) + ; CI: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL8]] + ; CI: [[C8:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 + ; CI: [[GEP11:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C8]](s32) ; CI: [[LOAD12:%[0-9]+]]:_(s32) = G_LOAD [[GEP11]](p5) :: (load 1, addrspace 5) ; CI: [[GEP12:%[0-9]+]]:_(p5) = G_GEP [[GEP11]], [[C]](s32) ; CI: [[LOAD13:%[0-9]+]]:_(s32) = G_LOAD [[GEP12]](p5) :: (load 1, addrspace 5) @@ -2778,20 +3129,23 @@ body: | ; CI: [[COPY13:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI: [[COPY14:%[0-9]+]]:_(s32) = COPY [[LOAD13]](s32) ; CI: [[AND13:%[0-9]+]]:_(s32) = G_AND [[COPY14]], [[C5]] - ; CI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY13]](s32) - ; CI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) - ; CI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] + ; CI: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY13]](s32) + ; CI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL9]](s32) + ; CI: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] ; CI: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; CI: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C3]] ; CI: [[COPY15:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI: [[COPY16:%[0-9]+]]:_(s32) = COPY [[LOAD15]](s32) ; CI: [[AND15:%[0-9]+]]:_(s32) = G_AND [[COPY16]], [[C5]] - ; CI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY15]](s32) - ; CI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) - ; CI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] - ; CI: [[MV3:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR6]](s16), [[OR7]](s16) - ; CI: [[MV4:%[0-9]+]]:_(s128) = G_MERGE_VALUES [[MV]](s32), [[MV1]](s32), [[MV2]](s32), [[MV3]](s32) - ; CI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[MV4]](s128) + ; CI: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY15]](s32) + ; CI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL10]](s32) + ; CI: [[OR10:%[0-9]+]]:_(s16) = 
G_OR [[AND14]], [[TRUNC15]] + ; CI: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; CI: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR10]](s16) + ; CI: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C6]](s32) + ; CI: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; CI: [[MV:%[0-9]+]]:_(s128) = G_MERGE_VALUES [[OR2]](s32), [[OR5]](s32), [[OR8]](s32), [[OR11]](s32) + ; CI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[MV]](s128) ; VI-LABEL: name: test_load_private_s128_align1 ; VI: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 1, addrspace 5) @@ -2818,9 +3172,13 @@ body: | ; VI: [[AND3:%[0-9]+]]:_(s16) = G_AND [[TRUNC3]], [[C3]] ; VI: [[SHL1:%[0-9]+]]:_(s16) = G_SHL [[AND3]], [[C4]](s16) ; VI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[SHL1]] - ; VI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; VI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; VI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C5]](s32) + ; VI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; VI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; VI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C5]](s32) + ; VI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; VI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; VI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C6]](s32) ; VI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p5) :: (load 1, addrspace 5) ; VI: [[GEP4:%[0-9]+]]:_(p5) = G_GEP [[GEP3]], [[C]](s32) ; VI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p5) :: (load 1, addrspace 5) @@ -2832,17 +3190,20 @@ body: | ; VI: [[AND4:%[0-9]+]]:_(s16) = G_AND [[TRUNC4]], [[C3]] ; VI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) ; VI: [[AND5:%[0-9]+]]:_(s16) = G_AND [[TRUNC5]], [[C3]] - ; VI: [[SHL2:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) - ; VI: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL2]] + ; VI: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) + ; VI: [[OR3:%[0-9]+]]:_(s16) = G_OR 
[[AND4]], [[SHL3]] ; VI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; VI: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; VI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) ; VI: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C3]] - ; VI: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) - ; VI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; VI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) - ; VI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 - ; VI: [[GEP7:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C6]](s32) + ; VI: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) + ; VI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL4]] + ; VI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; VI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; VI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C5]](s32) + ; VI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; VI: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 + ; VI: [[GEP7:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C7]](s32) ; VI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p5) :: (load 1, addrspace 5) ; VI: [[GEP8:%[0-9]+]]:_(p5) = G_GEP [[GEP7]], [[C]](s32) ; VI: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p5) :: (load 1, addrspace 5) @@ -2854,17 +3215,20 @@ body: | ; VI: [[AND8:%[0-9]+]]:_(s16) = G_AND [[TRUNC8]], [[C3]] ; VI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD9]](s32) ; VI: [[AND9:%[0-9]+]]:_(s16) = G_AND [[TRUNC9]], [[C3]] - ; VI: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C4]](s16) - ; VI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL4]] + ; VI: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C4]](s16) + ; VI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL6]] ; VI: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; VI: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C3]] ; VI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD11]](s32) ; VI: [[AND11:%[0-9]+]]:_(s16) = G_AND [[TRUNC11]], [[C3]] - ; VI: [[SHL5:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C4]](s16) - ; VI: [[OR5:%[0-9]+]]:_(s16) = 
G_OR [[AND10]], [[SHL5]] - ; VI: [[MV2:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16) - ; VI: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 - ; VI: [[GEP11:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C7]](s32) + ; VI: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C4]](s16) + ; VI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL7]] + ; VI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; VI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; VI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C5]](s32) + ; VI: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL8]] + ; VI: [[C8:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 + ; VI: [[GEP11:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C8]](s32) ; VI: [[LOAD12:%[0-9]+]]:_(s32) = G_LOAD [[GEP11]](p5) :: (load 1, addrspace 5) ; VI: [[GEP12:%[0-9]+]]:_(p5) = G_GEP [[GEP11]], [[C]](s32) ; VI: [[LOAD13:%[0-9]+]]:_(s32) = G_LOAD [[GEP12]](p5) :: (load 1, addrspace 5) @@ -2876,17 +3240,20 @@ body: | ; VI: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C3]] ; VI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD13]](s32) ; VI: [[AND13:%[0-9]+]]:_(s16) = G_AND [[TRUNC13]], [[C3]] - ; VI: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C4]](s16) - ; VI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL6]] + ; VI: [[SHL9:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C4]](s16) + ; VI: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL9]] ; VI: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; VI: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C3]] ; VI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD15]](s32) ; VI: [[AND15:%[0-9]+]]:_(s16) = G_AND [[TRUNC15]], [[C3]] - ; VI: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C4]](s16) - ; VI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL7]] - ; VI: [[MV3:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR6]](s16), [[OR7]](s16) - ; VI: [[MV4:%[0-9]+]]:_(s128) = G_MERGE_VALUES [[MV]](s32), [[MV1]](s32), [[MV2]](s32), [[MV3]](s32) - ; VI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[MV4]](s128) + ; VI: [[SHL10:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C4]](s16) 
+ ; VI: [[OR10:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL10]] + ; VI: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; VI: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR10]](s16) + ; VI: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C5]](s32) + ; VI: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; VI: [[MV:%[0-9]+]]:_(s128) = G_MERGE_VALUES [[OR2]](s32), [[OR5]](s32), [[OR8]](s32), [[OR11]](s32) + ; VI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[MV]](s128) ; GFX9-LABEL: name: test_load_private_s128_align1 ; GFX9: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 ; GFX9: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 1, addrspace 5) @@ -2913,9 +3280,13 @@ body: | ; GFX9: [[AND3:%[0-9]+]]:_(s16) = G_AND [[TRUNC3]], [[C3]] ; GFX9: [[SHL1:%[0-9]+]]:_(s16) = G_SHL [[AND3]], [[C4]](s16) ; GFX9: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[SHL1]] - ; GFX9: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; GFX9: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; GFX9: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C5]](s32) + ; GFX9: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C5]](s32) + ; GFX9: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; GFX9: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; GFX9: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C6]](s32) ; GFX9: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p5) :: (load 1, addrspace 5) ; GFX9: [[GEP4:%[0-9]+]]:_(p5) = G_GEP [[GEP3]], [[C]](s32) ; GFX9: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p5) :: (load 1, addrspace 5) @@ -2927,17 +3298,20 @@ body: | ; GFX9: [[AND4:%[0-9]+]]:_(s16) = G_AND [[TRUNC4]], [[C3]] ; GFX9: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) ; GFX9: [[AND5:%[0-9]+]]:_(s16) = G_AND [[TRUNC5]], [[C3]] - ; GFX9: [[SHL2:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) - ; GFX9: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL2]] + ; GFX9: 
[[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) + ; GFX9: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL3]] ; GFX9: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; GFX9: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; GFX9: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) ; GFX9: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C3]] - ; GFX9: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) - ; GFX9: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; GFX9: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) - ; GFX9: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 - ; GFX9: [[GEP7:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C6]](s32) + ; GFX9: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) + ; GFX9: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL4]] + ; GFX9: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; GFX9: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; GFX9: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C5]](s32) + ; GFX9: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; GFX9: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 + ; GFX9: [[GEP7:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C7]](s32) ; GFX9: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p5) :: (load 1, addrspace 5) ; GFX9: [[GEP8:%[0-9]+]]:_(p5) = G_GEP [[GEP7]], [[C]](s32) ; GFX9: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p5) :: (load 1, addrspace 5) @@ -2949,17 +3323,20 @@ body: | ; GFX9: [[AND8:%[0-9]+]]:_(s16) = G_AND [[TRUNC8]], [[C3]] ; GFX9: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD9]](s32) ; GFX9: [[AND9:%[0-9]+]]:_(s16) = G_AND [[TRUNC9]], [[C3]] - ; GFX9: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C4]](s16) - ; GFX9: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL4]] + ; GFX9: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C4]](s16) + ; GFX9: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL6]] ; GFX9: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; GFX9: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C3]] ; GFX9: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD11]](s32) ; 
GFX9: [[AND11:%[0-9]+]]:_(s16) = G_AND [[TRUNC11]], [[C3]] - ; GFX9: [[SHL5:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C4]](s16) - ; GFX9: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL5]] - ; GFX9: [[MV2:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16) - ; GFX9: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 - ; GFX9: [[GEP11:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C7]](s32) + ; GFX9: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C4]](s16) + ; GFX9: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL7]] + ; GFX9: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; GFX9: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; GFX9: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C5]](s32) + ; GFX9: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL8]] + ; GFX9: [[C8:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 + ; GFX9: [[GEP11:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C8]](s32) ; GFX9: [[LOAD12:%[0-9]+]]:_(s32) = G_LOAD [[GEP11]](p5) :: (load 1, addrspace 5) ; GFX9: [[GEP12:%[0-9]+]]:_(p5) = G_GEP [[GEP11]], [[C]](s32) ; GFX9: [[LOAD13:%[0-9]+]]:_(s32) = G_LOAD [[GEP12]](p5) :: (load 1, addrspace 5) @@ -2971,17 +3348,20 @@ body: | ; GFX9: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C3]] ; GFX9: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD13]](s32) ; GFX9: [[AND13:%[0-9]+]]:_(s16) = G_AND [[TRUNC13]], [[C3]] - ; GFX9: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C4]](s16) - ; GFX9: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL6]] + ; GFX9: [[SHL9:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C4]](s16) + ; GFX9: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL9]] ; GFX9: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; GFX9: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C3]] ; GFX9: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD15]](s32) ; GFX9: [[AND15:%[0-9]+]]:_(s16) = G_AND [[TRUNC15]], [[C3]] - ; GFX9: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C4]](s16) - ; GFX9: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL7]] - ; GFX9: [[MV3:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR6]](s16), [[OR7]](s16) - ; 
GFX9: [[MV4:%[0-9]+]]:_(s128) = G_MERGE_VALUES [[MV]](s32), [[MV1]](s32), [[MV2]](s32), [[MV3]](s32) - ; GFX9: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[MV4]](s128) + ; GFX9: [[SHL10:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C4]](s16) + ; GFX9: [[OR10:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL10]] + ; GFX9: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; GFX9: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR10]](s16) + ; GFX9: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C5]](s32) + ; GFX9: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; GFX9: [[MV:%[0-9]+]]:_(s128) = G_MERGE_VALUES [[OR2]](s32), [[OR5]](s32), [[OR8]](s32), [[OR11]](s32) + ; GFX9: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[MV]](s128) %0:_(p5) = COPY $vgpr0 %1:_(s128) = G_LOAD %0 :: (load 16, align 1, addrspace 5) $vgpr0_vgpr1_vgpr2_vgpr3 = COPY %1 @@ -3082,79 +3462,111 @@ body: | ; SI-LABEL: name: test_load_private_p1_align2 ; SI: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 ; SI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 2, addrspace 5) - ; SI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; SI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; SI: [[GEP:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C]](s32) ; SI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p5) :: (load 2, addrspace 5) - ; SI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; SI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; SI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; SI: [[GEP1:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C1]](s32) + ; SI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; SI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; SI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; SI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; SI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; SI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; SI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; SI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; SI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; SI: 
[[GEP1:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C3]](s32) ; SI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p5) :: (load 2, addrspace 5) - ; SI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; SI: [[GEP2:%[0-9]+]]:_(p5) = G_GEP [[GEP1]], [[C]](s32) ; SI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p5) :: (load 2, addrspace 5) - ; SI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; SI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC2]](s16), [[TRUNC3]](s16) - ; SI: [[MV2:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[MV]](s32), [[MV1]](s32) - ; SI: $vgpr0_vgpr1 = COPY [[MV2]](p1) + ; SI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; SI: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C1]] + ; SI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; SI: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C1]] + ; SI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C2]](s32) + ; SI: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; SI: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32) + ; SI: $vgpr0_vgpr1 = COPY [[MV]](p1) ; CI-LABEL: name: test_load_private_p1_align2 ; CI: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 ; CI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 2, addrspace 5) - ; CI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; CI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; CI: [[GEP:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C]](s32) ; CI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p5) :: (load 2, addrspace 5) - ; CI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; CI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; CI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; CI: [[GEP1:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C1]](s32) + ; CI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; CI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; CI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; CI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; CI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; CI: [[C2:%[0-9]+]]:_(s32) = 
G_CONSTANT i32 16 + ; CI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; CI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; CI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; CI: [[GEP1:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C3]](s32) ; CI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p5) :: (load 2, addrspace 5) - ; CI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; CI: [[GEP2:%[0-9]+]]:_(p5) = G_GEP [[GEP1]], [[C]](s32) ; CI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p5) :: (load 2, addrspace 5) - ; CI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; CI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC2]](s16), [[TRUNC3]](s16) - ; CI: [[MV2:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[MV]](s32), [[MV1]](s32) - ; CI: $vgpr0_vgpr1 = COPY [[MV2]](p1) + ; CI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; CI: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C1]] + ; CI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; CI: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C1]] + ; CI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C2]](s32) + ; CI: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; CI: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32) + ; CI: $vgpr0_vgpr1 = COPY [[MV]](p1) ; VI-LABEL: name: test_load_private_p1_align2 ; VI: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 2, addrspace 5) - ; VI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; VI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; VI: [[GEP:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C]](s32) ; VI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p5) :: (load 2, addrspace 5) - ; VI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; VI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; VI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; VI: [[GEP1:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C1]](s32) + ; VI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; VI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; VI: 
[[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; VI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; VI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; VI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; VI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; VI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; VI: [[GEP1:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C3]](s32) ; VI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p5) :: (load 2, addrspace 5) - ; VI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; VI: [[GEP2:%[0-9]+]]:_(p5) = G_GEP [[GEP1]], [[C]](s32) ; VI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p5) :: (load 2, addrspace 5) - ; VI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; VI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC2]](s16), [[TRUNC3]](s16) - ; VI: [[MV2:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[MV]](s32), [[MV1]](s32) - ; VI: $vgpr0_vgpr1 = COPY [[MV2]](p1) + ; VI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; VI: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C1]] + ; VI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; VI: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C1]] + ; VI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C2]](s32) + ; VI: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; VI: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32) + ; VI: $vgpr0_vgpr1 = COPY [[MV]](p1) ; GFX9-LABEL: name: test_load_private_p1_align2 ; GFX9: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 ; GFX9: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 2, addrspace 5) - ; GFX9: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; GFX9: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; GFX9: [[GEP:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C]](s32) ; GFX9: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p5) :: (load 2, addrspace 5) - ; GFX9: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; GFX9: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; GFX9: 
[[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; GFX9: [[GEP1:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C1]](s32) + ; GFX9: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; GFX9: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; GFX9: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; GFX9: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; GFX9: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; GFX9: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; GFX9: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; GFX9: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; GFX9: [[GEP1:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C3]](s32) ; GFX9: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p5) :: (load 2, addrspace 5) - ; GFX9: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; GFX9: [[GEP2:%[0-9]+]]:_(p5) = G_GEP [[GEP1]], [[C]](s32) ; GFX9: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p5) :: (load 2, addrspace 5) - ; GFX9: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; GFX9: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC2]](s16), [[TRUNC3]](s16) - ; GFX9: [[MV2:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[MV]](s32), [[MV1]](s32) - ; GFX9: $vgpr0_vgpr1 = COPY [[MV2]](p1) + ; GFX9: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; GFX9: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C1]] + ; GFX9: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; GFX9: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C1]] + ; GFX9: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C2]](s32) + ; GFX9: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; GFX9: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32) + ; GFX9: $vgpr0_vgpr1 = COPY [[MV]](p1) %0:_(p5) = COPY $vgpr0 %1:_(p1) = G_LOAD %0 :: (load 8, align 2, addrspace 5) $vgpr0_vgpr1 = COPY %1 @@ -3196,9 +3608,13 @@ body: | ; SI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) ; SI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[SHL1]](s32) ; SI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[TRUNC3]] - ; 
SI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; SI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; SI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C6]](s32) + ; SI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; SI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; SI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; SI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C6]](s32) + ; SI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; SI: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; SI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C7]](s32) ; SI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p5) :: (load 1, addrspace 5) ; SI: [[GEP4:%[0-9]+]]:_(p5) = G_GEP [[GEP3]], [[C]](s32) ; SI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p5) :: (load 1, addrspace 5) @@ -3211,20 +3627,23 @@ body: | ; SI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; SI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) ; SI: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C5]] - ; SI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY4]](s32) - ; SI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL2]](s32) - ; SI: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] + ; SI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY4]](s32) + ; SI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) + ; SI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] ; SI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; SI: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; SI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; SI: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) ; SI: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY7]], [[C5]] - ; SI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY6]](s32) - ; SI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) - ; SI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; SI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) - ; SI: [[MV2:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[MV]](s32), [[MV1]](s32) - ; SI: $vgpr0_vgpr1 = COPY [[MV2]](p1) + ; SI: 
[[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY6]](s32) + ; SI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) + ; SI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] + ; SI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; SI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; SI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C6]](s32) + ; SI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; SI: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR2]](s32), [[OR5]](s32) + ; SI: $vgpr0_vgpr1 = COPY [[MV]](p1) ; CI-LABEL: name: test_load_private_p1_align1 ; CI: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 ; CI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 1, addrspace 5) @@ -3255,9 +3674,13 @@ body: | ; CI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) ; CI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[SHL1]](s32) ; CI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[TRUNC3]] - ; CI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; CI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; CI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C6]](s32) + ; CI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C6]](s32) + ; CI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; CI: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; CI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C7]](s32) ; CI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p5) :: (load 1, addrspace 5) ; CI: [[GEP4:%[0-9]+]]:_(p5) = G_GEP [[GEP3]], [[C]](s32) ; CI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p5) :: (load 1, addrspace 5) @@ -3270,20 +3693,23 @@ body: | ; CI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) ; CI: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C5]] - ; CI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY4]](s32) - ; CI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL2]](s32) - ; CI: 
[[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] + ; CI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY4]](s32) + ; CI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) + ; CI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] ; CI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; CI: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; CI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) ; CI: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY7]], [[C5]] - ; CI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY6]](s32) - ; CI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) - ; CI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; CI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) - ; CI: [[MV2:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[MV]](s32), [[MV1]](s32) - ; CI: $vgpr0_vgpr1 = COPY [[MV2]](p1) + ; CI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY6]](s32) + ; CI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) + ; CI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] + ; CI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; CI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C6]](s32) + ; CI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; CI: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR2]](s32), [[OR5]](s32) + ; CI: $vgpr0_vgpr1 = COPY [[MV]](p1) ; VI-LABEL: name: test_load_private_p1_align1 ; VI: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 1, addrspace 5) @@ -3310,9 +3736,13 @@ body: | ; VI: [[AND3:%[0-9]+]]:_(s16) = G_AND [[TRUNC3]], [[C3]] ; VI: [[SHL1:%[0-9]+]]:_(s16) = G_SHL [[AND3]], [[C4]](s16) ; VI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[SHL1]] - ; VI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; VI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; VI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C5]](s32) + ; VI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT 
[[OR]](s16) + ; VI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; VI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C5]](s32) + ; VI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; VI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; VI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C6]](s32) ; VI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p5) :: (load 1, addrspace 5) ; VI: [[GEP4:%[0-9]+]]:_(p5) = G_GEP [[GEP3]], [[C]](s32) ; VI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p5) :: (load 1, addrspace 5) @@ -3324,17 +3754,20 @@ body: | ; VI: [[AND4:%[0-9]+]]:_(s16) = G_AND [[TRUNC4]], [[C3]] ; VI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) ; VI: [[AND5:%[0-9]+]]:_(s16) = G_AND [[TRUNC5]], [[C3]] - ; VI: [[SHL2:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) - ; VI: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL2]] + ; VI: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) + ; VI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL3]] ; VI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; VI: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; VI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) ; VI: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C3]] - ; VI: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) - ; VI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; VI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) - ; VI: [[MV2:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[MV]](s32), [[MV1]](s32) - ; VI: $vgpr0_vgpr1 = COPY [[MV2]](p1) + ; VI: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) + ; VI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL4]] + ; VI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; VI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; VI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C5]](s32) + ; VI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; VI: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR2]](s32), [[OR5]](s32) + ; VI: $vgpr0_vgpr1 = COPY [[MV]](p1) 
; GFX9-LABEL: name: test_load_private_p1_align1 ; GFX9: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 ; GFX9: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 1, addrspace 5) @@ -3361,9 +3794,13 @@ body: | ; GFX9: [[AND3:%[0-9]+]]:_(s16) = G_AND [[TRUNC3]], [[C3]] ; GFX9: [[SHL1:%[0-9]+]]:_(s16) = G_SHL [[AND3]], [[C4]](s16) ; GFX9: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[SHL1]] - ; GFX9: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; GFX9: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; GFX9: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C5]](s32) + ; GFX9: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C5]](s32) + ; GFX9: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; GFX9: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; GFX9: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C6]](s32) ; GFX9: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p5) :: (load 1, addrspace 5) ; GFX9: [[GEP4:%[0-9]+]]:_(p5) = G_GEP [[GEP3]], [[C]](s32) ; GFX9: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p5) :: (load 1, addrspace 5) @@ -3375,17 +3812,20 @@ body: | ; GFX9: [[AND4:%[0-9]+]]:_(s16) = G_AND [[TRUNC4]], [[C3]] ; GFX9: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) ; GFX9: [[AND5:%[0-9]+]]:_(s16) = G_AND [[TRUNC5]], [[C3]] - ; GFX9: [[SHL2:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) - ; GFX9: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL2]] + ; GFX9: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) + ; GFX9: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL3]] ; GFX9: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; GFX9: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; GFX9: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) ; GFX9: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C3]] - ; GFX9: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) - ; GFX9: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], 
[[SHL3]] - ; GFX9: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) - ; GFX9: [[MV2:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[MV]](s32), [[MV1]](s32) - ; GFX9: $vgpr0_vgpr1 = COPY [[MV2]](p1) + ; GFX9: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) + ; GFX9: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL4]] + ; GFX9: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; GFX9: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; GFX9: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C5]](s32) + ; GFX9: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; GFX9: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR2]](s32), [[OR5]](s32) + ; GFX9: $vgpr0_vgpr1 = COPY [[MV]](p1) %0:_(p5) = COPY $vgpr0 %1:_(p1) = G_LOAD %0 :: (load 8, align 1, addrspace 5) $vgpr0_vgpr1 = COPY %1 @@ -3427,43 +3867,67 @@ body: | ; SI-LABEL: name: test_load_private_p3_align2 ; SI: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 ; SI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 2, addrspace 5) - ; SI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; SI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; SI: [[GEP:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C]](s32) ; SI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p5) :: (load 2, addrspace 5) - ; SI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; SI: [[MV:%[0-9]+]]:_(p3) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; SI: $vgpr0 = COPY [[MV]](p3) + ; SI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; SI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; SI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; SI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; SI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; SI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; SI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; SI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; SI: [[INTTOPTR:%[0-9]+]]:_(p3) = G_INTTOPTR [[OR]](s32) + ; SI: $vgpr0 = COPY [[INTTOPTR]](p3) ; CI-LABEL: name: test_load_private_p3_align2 ; CI: 
[[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 ; CI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 2, addrspace 5) - ; CI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; CI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; CI: [[GEP:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C]](s32) ; CI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p5) :: (load 2, addrspace 5) - ; CI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; CI: [[MV:%[0-9]+]]:_(p3) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; CI: $vgpr0 = COPY [[MV]](p3) + ; CI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; CI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; CI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; CI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; CI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; CI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; CI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; CI: [[INTTOPTR:%[0-9]+]]:_(p3) = G_INTTOPTR [[OR]](s32) + ; CI: $vgpr0 = COPY [[INTTOPTR]](p3) ; VI-LABEL: name: test_load_private_p3_align2 ; VI: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 2, addrspace 5) - ; VI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; VI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; VI: [[GEP:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C]](s32) ; VI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p5) :: (load 2, addrspace 5) - ; VI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; VI: [[MV:%[0-9]+]]:_(p3) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; VI: $vgpr0 = COPY [[MV]](p3) + ; VI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; VI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; VI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; VI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; VI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; VI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL:%[0-9]+]]:_(s32) = 
G_SHL [[AND1]], [[C2]](s32) + ; VI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; VI: [[INTTOPTR:%[0-9]+]]:_(p3) = G_INTTOPTR [[OR]](s32) + ; VI: $vgpr0 = COPY [[INTTOPTR]](p3) ; GFX9-LABEL: name: test_load_private_p3_align2 ; GFX9: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 ; GFX9: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 2, addrspace 5) - ; GFX9: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; GFX9: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; GFX9: [[GEP:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C]](s32) ; GFX9: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p5) :: (load 2, addrspace 5) - ; GFX9: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; GFX9: [[MV:%[0-9]+]]:_(p3) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; GFX9: $vgpr0 = COPY [[MV]](p3) + ; GFX9: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; GFX9: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; GFX9: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; GFX9: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; GFX9: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; GFX9: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; GFX9: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; GFX9: [[INTTOPTR:%[0-9]+]]:_(p3) = G_INTTOPTR [[OR]](s32) + ; GFX9: $vgpr0 = COPY [[INTTOPTR]](p3) %0:_(p5) = COPY $vgpr0 %1:_(p3) = G_LOAD %0 :: (load 4, align 2, addrspace 5) $vgpr0 = COPY %1 @@ -3505,8 +3969,13 @@ body: | ; SI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) ; SI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[SHL1]](s32) ; SI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[TRUNC3]] - ; SI: [[MV:%[0-9]+]]:_(p3) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; SI: $vgpr0 = COPY [[MV]](p3) + ; SI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; SI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; SI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; SI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C6]](s32) + ; SI: [[OR2:%[0-9]+]]:_(s32) = G_OR 
[[ZEXT]], [[SHL2]] + ; SI: [[INTTOPTR:%[0-9]+]]:_(p3) = G_INTTOPTR [[OR2]](s32) + ; SI: $vgpr0 = COPY [[INTTOPTR]](p3) ; CI-LABEL: name: test_load_private_p3_align1 ; CI: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 ; CI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 1, addrspace 5) @@ -3537,8 +4006,13 @@ body: | ; CI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) ; CI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[SHL1]](s32) ; CI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[TRUNC3]] - ; CI: [[MV:%[0-9]+]]:_(p3) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; CI: $vgpr0 = COPY [[MV]](p3) + ; CI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C6]](s32) + ; CI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; CI: [[INTTOPTR:%[0-9]+]]:_(p3) = G_INTTOPTR [[OR2]](s32) + ; CI: $vgpr0 = COPY [[INTTOPTR]](p3) ; VI-LABEL: name: test_load_private_p3_align1 ; VI: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 1, addrspace 5) @@ -3565,8 +4039,13 @@ body: | ; VI: [[AND3:%[0-9]+]]:_(s16) = G_AND [[TRUNC3]], [[C3]] ; VI: [[SHL1:%[0-9]+]]:_(s16) = G_SHL [[AND3]], [[C4]](s16) ; VI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[SHL1]] - ; VI: [[MV:%[0-9]+]]:_(p3) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; VI: $vgpr0 = COPY [[MV]](p3) + ; VI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; VI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; VI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C5]](s32) + ; VI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; VI: [[INTTOPTR:%[0-9]+]]:_(p3) = G_INTTOPTR [[OR2]](s32) + ; VI: $vgpr0 = COPY [[INTTOPTR]](p3) ; GFX9-LABEL: name: test_load_private_p3_align1 ; GFX9: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 ; GFX9: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 1, addrspace 5) @@ 
-3593,8 +4072,13 @@ body: | ; GFX9: [[AND3:%[0-9]+]]:_(s16) = G_AND [[TRUNC3]], [[C3]] ; GFX9: [[SHL1:%[0-9]+]]:_(s16) = G_SHL [[AND3]], [[C4]](s16) ; GFX9: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[SHL1]] - ; GFX9: [[MV:%[0-9]+]]:_(p3) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; GFX9: $vgpr0 = COPY [[MV]](p3) + ; GFX9: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C5]](s32) + ; GFX9: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; GFX9: [[INTTOPTR:%[0-9]+]]:_(p3) = G_INTTOPTR [[OR2]](s32) + ; GFX9: $vgpr0 = COPY [[INTTOPTR]](p3) %0:_(p5) = COPY $vgpr0 %1:_(p3) = G_LOAD %0 :: (load 4, align 1, addrspace 5) $vgpr0 = COPY %1 @@ -3636,43 +4120,67 @@ body: | ; SI-LABEL: name: test_load_private_p5_align2 ; SI: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 ; SI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 2, addrspace 5) - ; SI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; SI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; SI: [[GEP:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C]](s32) ; SI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p5) :: (load 2, addrspace 5) - ; SI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; SI: [[MV:%[0-9]+]]:_(p5) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; SI: $vgpr0 = COPY [[MV]](p5) + ; SI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; SI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; SI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; SI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; SI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; SI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; SI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; SI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; SI: [[INTTOPTR:%[0-9]+]]:_(p5) = G_INTTOPTR [[OR]](s32) + ; SI: $vgpr0 = COPY [[INTTOPTR]](p5) ; CI-LABEL: name: test_load_private_p5_align2 ; CI: 
[[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 ; CI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 2, addrspace 5) - ; CI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; CI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; CI: [[GEP:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C]](s32) ; CI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p5) :: (load 2, addrspace 5) - ; CI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; CI: [[MV:%[0-9]+]]:_(p5) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; CI: $vgpr0 = COPY [[MV]](p5) + ; CI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; CI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; CI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; CI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; CI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; CI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; CI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; CI: [[INTTOPTR:%[0-9]+]]:_(p5) = G_INTTOPTR [[OR]](s32) + ; CI: $vgpr0 = COPY [[INTTOPTR]](p5) ; VI-LABEL: name: test_load_private_p5_align2 ; VI: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 2, addrspace 5) - ; VI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; VI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; VI: [[GEP:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C]](s32) ; VI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p5) :: (load 2, addrspace 5) - ; VI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; VI: [[MV:%[0-9]+]]:_(p5) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; VI: $vgpr0 = COPY [[MV]](p5) + ; VI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; VI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; VI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; VI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; VI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; VI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL:%[0-9]+]]:_(s32) = 
G_SHL [[AND1]], [[C2]](s32) + ; VI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; VI: [[INTTOPTR:%[0-9]+]]:_(p5) = G_INTTOPTR [[OR]](s32) + ; VI: $vgpr0 = COPY [[INTTOPTR]](p5) ; GFX9-LABEL: name: test_load_private_p5_align2 ; GFX9: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 ; GFX9: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 2, addrspace 5) - ; GFX9: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; GFX9: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; GFX9: [[GEP:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C]](s32) ; GFX9: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p5) :: (load 2, addrspace 5) - ; GFX9: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; GFX9: [[MV:%[0-9]+]]:_(p5) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; GFX9: $vgpr0 = COPY [[MV]](p5) + ; GFX9: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; GFX9: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; GFX9: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; GFX9: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; GFX9: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; GFX9: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; GFX9: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; GFX9: [[INTTOPTR:%[0-9]+]]:_(p5) = G_INTTOPTR [[OR]](s32) + ; GFX9: $vgpr0 = COPY [[INTTOPTR]](p5) %0:_(p5) = COPY $vgpr0 %1:_(p5) = G_LOAD %0 :: (load 4, align 2, addrspace 5) $vgpr0 = COPY %1 @@ -3714,8 +4222,13 @@ body: | ; SI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) ; SI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[SHL1]](s32) ; SI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[TRUNC3]] - ; SI: [[MV:%[0-9]+]]:_(p5) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; SI: $vgpr0 = COPY [[MV]](p5) + ; SI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; SI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; SI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; SI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C6]](s32) + ; SI: [[OR2:%[0-9]+]]:_(s32) = G_OR 
[[ZEXT]], [[SHL2]] + ; SI: [[INTTOPTR:%[0-9]+]]:_(p5) = G_INTTOPTR [[OR2]](s32) + ; SI: $vgpr0 = COPY [[INTTOPTR]](p5) ; CI-LABEL: name: test_load_private_p5_align1 ; CI: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 ; CI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 1, addrspace 5) @@ -3746,8 +4259,13 @@ body: | ; CI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) ; CI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[SHL1]](s32) ; CI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[TRUNC3]] - ; CI: [[MV:%[0-9]+]]:_(p5) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; CI: $vgpr0 = COPY [[MV]](p5) + ; CI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C6]](s32) + ; CI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; CI: [[INTTOPTR:%[0-9]+]]:_(p5) = G_INTTOPTR [[OR2]](s32) + ; CI: $vgpr0 = COPY [[INTTOPTR]](p5) ; VI-LABEL: name: test_load_private_p5_align1 ; VI: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 1, addrspace 5) @@ -3774,8 +4292,13 @@ body: | ; VI: [[AND3:%[0-9]+]]:_(s16) = G_AND [[TRUNC3]], [[C3]] ; VI: [[SHL1:%[0-9]+]]:_(s16) = G_SHL [[AND3]], [[C4]](s16) ; VI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[SHL1]] - ; VI: [[MV:%[0-9]+]]:_(p5) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; VI: $vgpr0 = COPY [[MV]](p5) + ; VI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; VI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; VI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C5]](s32) + ; VI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; VI: [[INTTOPTR:%[0-9]+]]:_(p5) = G_INTTOPTR [[OR2]](s32) + ; VI: $vgpr0 = COPY [[INTTOPTR]](p5) ; GFX9-LABEL: name: test_load_private_p5_align1 ; GFX9: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 ; GFX9: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 1, addrspace 5) @@ 
-3802,8 +4325,13 @@ body: | ; GFX9: [[AND3:%[0-9]+]]:_(s16) = G_AND [[TRUNC3]], [[C3]] ; GFX9: [[SHL1:%[0-9]+]]:_(s16) = G_SHL [[AND3]], [[C4]](s16) ; GFX9: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[SHL1]] - ; GFX9: [[MV:%[0-9]+]]:_(p5) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; GFX9: $vgpr0 = COPY [[MV]](p5) + ; GFX9: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C5]](s32) + ; GFX9: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; GFX9: [[INTTOPTR:%[0-9]+]]:_(p5) = G_INTTOPTR [[OR2]](s32) + ; GFX9: $vgpr0 = COPY [[INTTOPTR]](p5) %0:_(p5) = COPY $vgpr0 %1:_(p5) = G_LOAD %0 :: (load 4, align 1, addrspace 5) $vgpr0 = COPY %1 @@ -5489,9 +6017,13 @@ body: | ; SI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) ; SI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[SHL1]](s32) ; SI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[TRUNC3]] - ; SI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; SI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; SI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C6]](s32) + ; SI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; SI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; SI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; SI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C6]](s32) + ; SI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; SI: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; SI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C7]](s32) ; SI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p5) :: (load 1, addrspace 5) ; SI: [[GEP4:%[0-9]+]]:_(p5) = G_GEP [[GEP3]], [[C]](s32) ; SI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p5) :: (load 1, addrspace 5) @@ -5504,19 +6036,22 @@ body: | ; SI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; SI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) ; SI: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C5]] - ; SI: 
[[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY4]](s32) - ; SI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL2]](s32) - ; SI: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] + ; SI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY4]](s32) + ; SI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) + ; SI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] ; SI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; SI: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; SI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; SI: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) ; SI: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY7]], [[C5]] - ; SI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY6]](s32) - ; SI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) - ; SI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; SI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) - ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[MV]](s32), [[MV1]](s32) + ; SI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY6]](s32) + ; SI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) + ; SI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] + ; SI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; SI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; SI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C6]](s32) + ; SI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[OR2]](s32), [[OR5]](s32) ; SI: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<2 x s32>) ; CI-LABEL: name: test_load_private_v2s32_align1 ; CI: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 @@ -5548,9 +6083,13 @@ body: | ; CI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) ; CI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[SHL1]](s32) ; CI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[TRUNC3]] - ; CI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; CI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; CI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], 
[[C6]](s32) + ; CI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C6]](s32) + ; CI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; CI: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; CI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C7]](s32) ; CI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p5) :: (load 1, addrspace 5) ; CI: [[GEP4:%[0-9]+]]:_(p5) = G_GEP [[GEP3]], [[C]](s32) ; CI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p5) :: (load 1, addrspace 5) @@ -5563,19 +6102,22 @@ body: | ; CI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) ; CI: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C5]] - ; CI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY4]](s32) - ; CI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL2]](s32) - ; CI: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] + ; CI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY4]](s32) + ; CI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) + ; CI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] ; CI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; CI: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; CI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) ; CI: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY7]], [[C5]] - ; CI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY6]](s32) - ; CI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) - ; CI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; CI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) - ; CI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[MV]](s32), [[MV1]](s32) + ; CI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY6]](s32) + ; CI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) + ; CI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] + ; CI: 
[[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; CI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C6]](s32) + ; CI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; CI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[OR2]](s32), [[OR5]](s32) ; CI: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<2 x s32>) ; VI-LABEL: name: test_load_private_v2s32_align1 ; VI: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 @@ -5603,9 +6145,13 @@ body: | ; VI: [[AND3:%[0-9]+]]:_(s16) = G_AND [[TRUNC3]], [[C3]] ; VI: [[SHL1:%[0-9]+]]:_(s16) = G_SHL [[AND3]], [[C4]](s16) ; VI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[SHL1]] - ; VI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; VI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; VI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C5]](s32) + ; VI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; VI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; VI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C5]](s32) + ; VI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; VI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; VI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C6]](s32) ; VI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p5) :: (load 1, addrspace 5) ; VI: [[GEP4:%[0-9]+]]:_(p5) = G_GEP [[GEP3]], [[C]](s32) ; VI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p5) :: (load 1, addrspace 5) @@ -5617,16 +6163,19 @@ body: | ; VI: [[AND4:%[0-9]+]]:_(s16) = G_AND [[TRUNC4]], [[C3]] ; VI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) ; VI: [[AND5:%[0-9]+]]:_(s16) = G_AND [[TRUNC5]], [[C3]] - ; VI: [[SHL2:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) - ; VI: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL2]] + ; VI: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) + ; VI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL3]] ; VI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; VI: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], 
[[C3]] ; VI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) ; VI: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C3]] - ; VI: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) - ; VI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; VI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) - ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[MV]](s32), [[MV1]](s32) + ; VI: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) + ; VI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL4]] + ; VI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; VI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; VI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C5]](s32) + ; VI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[OR2]](s32), [[OR5]](s32) ; VI: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<2 x s32>) ; GFX9-LABEL: name: test_load_private_v2s32_align1 ; GFX9: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 @@ -5654,9 +6203,13 @@ body: | ; GFX9: [[AND3:%[0-9]+]]:_(s16) = G_AND [[TRUNC3]], [[C3]] ; GFX9: [[SHL1:%[0-9]+]]:_(s16) = G_SHL [[AND3]], [[C4]](s16) ; GFX9: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[SHL1]] - ; GFX9: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; GFX9: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; GFX9: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C5]](s32) + ; GFX9: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C5]](s32) + ; GFX9: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; GFX9: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; GFX9: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C6]](s32) ; GFX9: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p5) :: (load 1, addrspace 5) ; GFX9: [[GEP4:%[0-9]+]]:_(p5) = G_GEP [[GEP3]], [[C]](s32) ; GFX9: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p5) :: (load 
1, addrspace 5) @@ -5668,16 +6221,19 @@ body: | ; GFX9: [[AND4:%[0-9]+]]:_(s16) = G_AND [[TRUNC4]], [[C3]] ; GFX9: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) ; GFX9: [[AND5:%[0-9]+]]:_(s16) = G_AND [[TRUNC5]], [[C3]] - ; GFX9: [[SHL2:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) - ; GFX9: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL2]] + ; GFX9: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) + ; GFX9: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL3]] ; GFX9: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; GFX9: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; GFX9: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) ; GFX9: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C3]] - ; GFX9: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) - ; GFX9: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; GFX9: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) - ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[MV]](s32), [[MV1]](s32) + ; GFX9: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) + ; GFX9: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL4]] + ; GFX9: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; GFX9: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; GFX9: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C5]](s32) + ; GFX9: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[OR2]](s32), [[OR5]](s32) ; GFX9: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<2 x s32>) %0:_(p5) = COPY $vgpr0 %1:_(<2 x s32>) = G_LOAD %0 :: (load 8, align 1, addrspace 5) @@ -5721,9 +6277,13 @@ body: | ; SI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[COPY3]](s32) ; SI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[SHL1]](s32) ; SI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[TRUNC3]] - ; SI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; SI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; SI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C6]](s32) + ; SI: 
[[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; SI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; SI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; SI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C6]](s32) + ; SI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; SI: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; SI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C7]](s32) ; SI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p5) :: (load 1, addrspace 56) ; SI: [[GEP4:%[0-9]+]]:_(p5) = G_GEP [[GEP3]], [[C]](s32) ; SI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p5) :: (load 1, addrspace 56) @@ -5736,18 +6296,21 @@ body: | ; SI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; SI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) ; SI: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C5]] - ; SI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY5]](s32) - ; SI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL2]](s32) - ; SI: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] + ; SI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY5]](s32) + ; SI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) + ; SI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] ; SI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; SI: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; SI: [[COPY7:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; SI: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) ; SI: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY8]], [[C5]] - ; SI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY7]](s32) - ; SI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) - ; SI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; SI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) + ; SI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY7]](s32) + ; SI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) + ; SI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] + ; SI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; SI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; SI: 
[[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C6]](s32) + ; SI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] ; SI: [[GEP7:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C4]](s32) ; SI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p5) :: (load 1, addrspace 56) ; SI: [[GEP8:%[0-9]+]]:_(p5) = G_GEP [[GEP7]], [[C]](s32) @@ -5761,19 +6324,22 @@ body: | ; SI: [[COPY9:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; SI: [[COPY10:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32) ; SI: [[AND9:%[0-9]+]]:_(s32) = G_AND [[COPY10]], [[C5]] - ; SI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY9]](s32) - ; SI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) - ; SI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] + ; SI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY9]](s32) + ; SI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) + ; SI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] ; SI: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; SI: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C3]] ; SI: [[COPY11:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; SI: [[COPY12:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32) ; SI: [[AND11:%[0-9]+]]:_(s32) = G_AND [[COPY12]], [[C5]] - ; SI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY11]](s32) - ; SI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL5]](s32) - ; SI: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] - ; SI: [[MV2:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16) - ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[MV]](s32), [[MV1]](s32), [[MV2]](s32) + ; SI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY11]](s32) + ; SI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) + ; SI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] + ; SI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; SI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; SI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C6]](s32) + ; SI: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL8]] + ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR 
[[OR2]](s32), [[OR5]](s32), [[OR8]](s32) ; SI: $vgpr0_vgpr1_vgpr2 = COPY [[BUILD_VECTOR]](<3 x s32>) ; CI-LABEL: name: test_load_private_v3s32_align16 ; CI: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 @@ -5806,9 +6372,13 @@ body: | ; CI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[COPY3]](s32) ; CI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[SHL1]](s32) ; CI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[TRUNC3]] - ; CI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; CI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; CI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C6]](s32) + ; CI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C6]](s32) + ; CI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; CI: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; CI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C7]](s32) ; CI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p5) :: (load 1, addrspace 56) ; CI: [[GEP4:%[0-9]+]]:_(p5) = G_GEP [[GEP3]], [[C]](s32) ; CI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p5) :: (load 1, addrspace 56) @@ -5821,18 +6391,21 @@ body: | ; CI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) ; CI: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C5]] - ; CI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY5]](s32) - ; CI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL2]](s32) - ; CI: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] + ; CI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY5]](s32) + ; CI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) + ; CI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] ; CI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; CI: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; CI: [[COPY7:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) ; CI: [[AND7:%[0-9]+]]:_(s32) = G_AND 
[[COPY8]], [[C5]] - ; CI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY7]](s32) - ; CI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) - ; CI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; CI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) + ; CI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY7]](s32) + ; CI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) + ; CI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] + ; CI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; CI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C6]](s32) + ; CI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] ; CI: [[GEP7:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C4]](s32) ; CI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p5) :: (load 1, addrspace 56) ; CI: [[GEP8:%[0-9]+]]:_(p5) = G_GEP [[GEP7]], [[C]](s32) @@ -5846,19 +6419,22 @@ body: | ; CI: [[COPY9:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI: [[COPY10:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32) ; CI: [[AND9:%[0-9]+]]:_(s32) = G_AND [[COPY10]], [[C5]] - ; CI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY9]](s32) - ; CI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) - ; CI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] + ; CI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY9]](s32) + ; CI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) + ; CI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] ; CI: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; CI: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C3]] ; CI: [[COPY11:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI: [[COPY12:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32) ; CI: [[AND11:%[0-9]+]]:_(s32) = G_AND [[COPY12]], [[C5]] - ; CI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY11]](s32) - ; CI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL5]](s32) - ; CI: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] - ; CI: [[MV2:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16) - ; CI: 
[[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[MV]](s32), [[MV1]](s32), [[MV2]](s32) + ; CI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY11]](s32) + ; CI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) + ; CI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] + ; CI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; CI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; CI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C6]](s32) + ; CI: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL8]] + ; CI: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[OR2]](s32), [[OR5]](s32), [[OR8]](s32) ; CI: $vgpr0_vgpr1_vgpr2 = COPY [[BUILD_VECTOR]](<3 x s32>) ; VI-LABEL: name: test_load_private_v3s32_align16 ; VI: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 @@ -5886,9 +6462,13 @@ body: | ; VI: [[AND3:%[0-9]+]]:_(s16) = G_AND [[TRUNC3]], [[C3]] ; VI: [[SHL1:%[0-9]+]]:_(s16) = G_SHL [[AND3]], [[C4]](s16) ; VI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[SHL1]] - ; VI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; VI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; VI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C5]](s32) + ; VI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; VI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; VI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C5]](s32) + ; VI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; VI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; VI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C6]](s32) ; VI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p5) :: (load 1, addrspace 56) ; VI: [[GEP4:%[0-9]+]]:_(p5) = G_GEP [[GEP3]], [[C]](s32) ; VI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p5) :: (load 1, addrspace 56) @@ -5900,17 +6480,20 @@ body: | ; VI: [[AND4:%[0-9]+]]:_(s16) = G_AND [[TRUNC4]], [[C3]] ; VI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) ; VI: [[AND5:%[0-9]+]]:_(s16) = G_AND [[TRUNC5]], [[C3]] - ; VI: [[SHL2:%[0-9]+]]:_(s16) = 
G_SHL [[AND5]], [[C4]](s16) - ; VI: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL2]] + ; VI: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) + ; VI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL3]] ; VI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; VI: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; VI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) ; VI: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C3]] - ; VI: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) - ; VI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; VI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) - ; VI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 - ; VI: [[GEP7:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C6]](s32) + ; VI: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) + ; VI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL4]] + ; VI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; VI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; VI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C5]](s32) + ; VI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; VI: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 + ; VI: [[GEP7:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C7]](s32) ; VI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p5) :: (load 1, addrspace 56) ; VI: [[GEP8:%[0-9]+]]:_(p5) = G_GEP [[GEP7]], [[C]](s32) ; VI: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p5) :: (load 1, addrspace 56) @@ -5922,16 +6505,19 @@ body: | ; VI: [[AND8:%[0-9]+]]:_(s16) = G_AND [[TRUNC8]], [[C3]] ; VI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD9]](s32) ; VI: [[AND9:%[0-9]+]]:_(s16) = G_AND [[TRUNC9]], [[C3]] - ; VI: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C4]](s16) - ; VI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL4]] + ; VI: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C4]](s16) + ; VI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL6]] ; VI: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; VI: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C3]] ; VI: 
[[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD11]](s32) ; VI: [[AND11:%[0-9]+]]:_(s16) = G_AND [[TRUNC11]], [[C3]] - ; VI: [[SHL5:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C4]](s16) - ; VI: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL5]] - ; VI: [[MV2:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16) - ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[MV]](s32), [[MV1]](s32), [[MV2]](s32) + ; VI: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C4]](s16) + ; VI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL7]] + ; VI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; VI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; VI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C5]](s32) + ; VI: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL8]] + ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[OR2]](s32), [[OR5]](s32), [[OR8]](s32) ; VI: $vgpr0_vgpr1_vgpr2 = COPY [[BUILD_VECTOR]](<3 x s32>) ; GFX9-LABEL: name: test_load_private_v3s32_align16 ; GFX9: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 @@ -5959,9 +6545,13 @@ body: | ; GFX9: [[AND3:%[0-9]+]]:_(s16) = G_AND [[TRUNC3]], [[C3]] ; GFX9: [[SHL1:%[0-9]+]]:_(s16) = G_SHL [[AND3]], [[C4]](s16) ; GFX9: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[SHL1]] - ; GFX9: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; GFX9: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; GFX9: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C5]](s32) + ; GFX9: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C5]](s32) + ; GFX9: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; GFX9: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; GFX9: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C6]](s32) ; GFX9: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p5) :: (load 1, addrspace 56) ; GFX9: [[GEP4:%[0-9]+]]:_(p5) = G_GEP [[GEP3]], [[C]](s32) ; GFX9: [[LOAD5:%[0-9]+]]:_(s32) = 
G_LOAD [[GEP4]](p5) :: (load 1, addrspace 56) @@ -5973,17 +6563,20 @@ body: | ; GFX9: [[AND4:%[0-9]+]]:_(s16) = G_AND [[TRUNC4]], [[C3]] ; GFX9: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) ; GFX9: [[AND5:%[0-9]+]]:_(s16) = G_AND [[TRUNC5]], [[C3]] - ; GFX9: [[SHL2:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) - ; GFX9: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL2]] + ; GFX9: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) + ; GFX9: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL3]] ; GFX9: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; GFX9: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; GFX9: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) ; GFX9: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C3]] - ; GFX9: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) - ; GFX9: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; GFX9: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) - ; GFX9: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 - ; GFX9: [[GEP7:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C6]](s32) + ; GFX9: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) + ; GFX9: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL4]] + ; GFX9: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; GFX9: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; GFX9: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C5]](s32) + ; GFX9: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; GFX9: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 + ; GFX9: [[GEP7:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C7]](s32) ; GFX9: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p5) :: (load 1, addrspace 56) ; GFX9: [[GEP8:%[0-9]+]]:_(p5) = G_GEP [[GEP7]], [[C]](s32) ; GFX9: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p5) :: (load 1, addrspace 56) @@ -5995,16 +6588,19 @@ body: | ; GFX9: [[AND8:%[0-9]+]]:_(s16) = G_AND [[TRUNC8]], [[C3]] ; GFX9: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD9]](s32) ; GFX9: [[AND9:%[0-9]+]]:_(s16) = G_AND [[TRUNC9]], [[C3]] - ; GFX9: [[SHL4:%[0-9]+]]:_(s16) = G_SHL 
[[AND9]], [[C4]](s16) - ; GFX9: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL4]] + ; GFX9: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C4]](s16) + ; GFX9: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL6]] ; GFX9: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; GFX9: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C3]] ; GFX9: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD11]](s32) ; GFX9: [[AND11:%[0-9]+]]:_(s16) = G_AND [[TRUNC11]], [[C3]] - ; GFX9: [[SHL5:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C4]](s16) - ; GFX9: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL5]] - ; GFX9: [[MV2:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16) - ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[MV]](s32), [[MV1]](s32), [[MV2]](s32) + ; GFX9: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C4]](s16) + ; GFX9: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL7]] + ; GFX9: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; GFX9: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; GFX9: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C5]](s32) + ; GFX9: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL8]] + ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[OR2]](s32), [[OR5]](s32), [[OR8]](s32) ; GFX9: $vgpr0_vgpr1_vgpr2 = COPY [[BUILD_VECTOR]](<3 x s32>) %0:_(p5) = COPY $vgpr0 %1:_(<3 x s32>) = G_LOAD %0 :: (load 12, align 1, addrspace 56) @@ -6103,9 +6699,13 @@ body: | ; SI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[COPY3]](s32) ; SI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[SHL1]](s32) ; SI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[TRUNC3]] - ; SI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; SI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; SI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C6]](s32) + ; SI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; SI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; SI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; SI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C6]](s32) + ; SI: 
[[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; SI: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; SI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C7]](s32) ; SI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p5) :: (load 1, addrspace 56) ; SI: [[GEP4:%[0-9]+]]:_(p5) = G_GEP [[GEP3]], [[C]](s32) ; SI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p5) :: (load 1, addrspace 56) @@ -6118,18 +6718,21 @@ body: | ; SI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; SI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) ; SI: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C5]] - ; SI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY5]](s32) - ; SI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL2]](s32) - ; SI: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] + ; SI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY5]](s32) + ; SI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) + ; SI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] ; SI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; SI: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; SI: [[COPY7:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; SI: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) ; SI: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY8]], [[C5]] - ; SI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY7]](s32) - ; SI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) - ; SI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; SI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) + ; SI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY7]](s32) + ; SI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) + ; SI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] + ; SI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; SI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; SI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C6]](s32) + ; SI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] ; SI: [[GEP7:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C4]](s32) ; SI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p5) :: (load 1, 
addrspace 56) ; SI: [[GEP8:%[0-9]+]]:_(p5) = G_GEP [[GEP7]], [[C]](s32) @@ -6143,20 +6746,23 @@ body: | ; SI: [[COPY9:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; SI: [[COPY10:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32) ; SI: [[AND9:%[0-9]+]]:_(s32) = G_AND [[COPY10]], [[C5]] - ; SI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY9]](s32) - ; SI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) - ; SI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] + ; SI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY9]](s32) + ; SI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) + ; SI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] ; SI: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; SI: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C3]] ; SI: [[COPY11:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; SI: [[COPY12:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32) ; SI: [[AND11:%[0-9]+]]:_(s32) = G_AND [[COPY12]], [[C5]] - ; SI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY11]](s32) - ; SI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL5]](s32) - ; SI: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] - ; SI: [[MV2:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16) - ; SI: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 - ; SI: [[GEP11:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C7]](s32) + ; SI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY11]](s32) + ; SI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) + ; SI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] + ; SI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; SI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; SI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C6]](s32) + ; SI: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL8]] + ; SI: [[C8:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 + ; SI: [[GEP11:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C8]](s32) ; SI: [[LOAD12:%[0-9]+]]:_(s32) = G_LOAD [[GEP11]](p5) :: (load 1, addrspace 56) ; SI: [[GEP12:%[0-9]+]]:_(p5) = G_GEP [[GEP11]], [[C]](s32) ; SI: [[LOAD13:%[0-9]+]]:_(s32) = G_LOAD 
[[GEP12]](p5) :: (load 1, addrspace 56) @@ -6169,19 +6775,22 @@ body: | ; SI: [[COPY13:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; SI: [[COPY14:%[0-9]+]]:_(s32) = COPY [[LOAD13]](s32) ; SI: [[AND13:%[0-9]+]]:_(s32) = G_AND [[COPY14]], [[C5]] - ; SI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY13]](s32) - ; SI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) - ; SI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] + ; SI: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY13]](s32) + ; SI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL9]](s32) + ; SI: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] ; SI: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; SI: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C3]] ; SI: [[COPY15:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; SI: [[COPY16:%[0-9]+]]:_(s32) = COPY [[LOAD15]](s32) ; SI: [[AND15:%[0-9]+]]:_(s32) = G_AND [[COPY16]], [[C5]] - ; SI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY15]](s32) - ; SI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) - ; SI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] - ; SI: [[MV3:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR6]](s16), [[OR7]](s16) - ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[MV]](s32), [[MV1]](s32), [[MV2]](s32), [[MV3]](s32) + ; SI: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY15]](s32) + ; SI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL10]](s32) + ; SI: [[OR10:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] + ; SI: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; SI: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR10]](s16) + ; SI: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C6]](s32) + ; SI: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[OR2]](s32), [[OR5]](s32), [[OR8]](s32), [[OR11]](s32) ; SI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BUILD_VECTOR]](<4 x s32>) ; CI-LABEL: name: test_load_private_v4s32_align16 ; CI: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 @@ -6214,9 +6823,13 @@ 
body: | ; CI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[COPY3]](s32) ; CI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[SHL1]](s32) ; CI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[TRUNC3]] - ; CI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; CI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; CI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C6]](s32) + ; CI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C6]](s32) + ; CI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; CI: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; CI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C7]](s32) ; CI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p5) :: (load 1, addrspace 56) ; CI: [[GEP4:%[0-9]+]]:_(p5) = G_GEP [[GEP3]], [[C]](s32) ; CI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p5) :: (load 1, addrspace 56) @@ -6229,18 +6842,21 @@ body: | ; CI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) ; CI: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C5]] - ; CI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY5]](s32) - ; CI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL2]](s32) - ; CI: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] + ; CI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY5]](s32) + ; CI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) + ; CI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] ; CI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; CI: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; CI: [[COPY7:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) ; CI: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY8]], [[C5]] - ; CI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY7]](s32) - ; CI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) - ; CI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; CI: 
[[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) + ; CI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY7]](s32) + ; CI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) + ; CI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] + ; CI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; CI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C6]](s32) + ; CI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] ; CI: [[GEP7:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C4]](s32) ; CI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p5) :: (load 1, addrspace 56) ; CI: [[GEP8:%[0-9]+]]:_(p5) = G_GEP [[GEP7]], [[C]](s32) @@ -6254,20 +6870,23 @@ body: | ; CI: [[COPY9:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI: [[COPY10:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32) ; CI: [[AND9:%[0-9]+]]:_(s32) = G_AND [[COPY10]], [[C5]] - ; CI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY9]](s32) - ; CI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) - ; CI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] + ; CI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY9]](s32) + ; CI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) + ; CI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] ; CI: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; CI: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C3]] ; CI: [[COPY11:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI: [[COPY12:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32) ; CI: [[AND11:%[0-9]+]]:_(s32) = G_AND [[COPY12]], [[C5]] - ; CI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY11]](s32) - ; CI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL5]](s32) - ; CI: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] - ; CI: [[MV2:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16) - ; CI: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 - ; CI: [[GEP11:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C7]](s32) + ; CI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY11]](s32) + ; CI: [[TRUNC11:%[0-9]+]]:_(s16) = 
G_TRUNC [[SHL7]](s32) + ; CI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] + ; CI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; CI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; CI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C6]](s32) + ; CI: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL8]] + ; CI: [[C8:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 + ; CI: [[GEP11:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C8]](s32) ; CI: [[LOAD12:%[0-9]+]]:_(s32) = G_LOAD [[GEP11]](p5) :: (load 1, addrspace 56) ; CI: [[GEP12:%[0-9]+]]:_(p5) = G_GEP [[GEP11]], [[C]](s32) ; CI: [[LOAD13:%[0-9]+]]:_(s32) = G_LOAD [[GEP12]](p5) :: (load 1, addrspace 56) @@ -6280,19 +6899,22 @@ body: | ; CI: [[COPY13:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI: [[COPY14:%[0-9]+]]:_(s32) = COPY [[LOAD13]](s32) ; CI: [[AND13:%[0-9]+]]:_(s32) = G_AND [[COPY14]], [[C5]] - ; CI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY13]](s32) - ; CI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) - ; CI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] + ; CI: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY13]](s32) + ; CI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL9]](s32) + ; CI: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] ; CI: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; CI: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C3]] ; CI: [[COPY15:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI: [[COPY16:%[0-9]+]]:_(s32) = COPY [[LOAD15]](s32) ; CI: [[AND15:%[0-9]+]]:_(s32) = G_AND [[COPY16]], [[C5]] - ; CI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY15]](s32) - ; CI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) - ; CI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] - ; CI: [[MV3:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR6]](s16), [[OR7]](s16) - ; CI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[MV]](s32), [[MV1]](s32), [[MV2]](s32), [[MV3]](s32) + ; CI: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY15]](s32) + ; CI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC 
[[SHL10]](s32) + ; CI: [[OR10:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] + ; CI: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; CI: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR10]](s16) + ; CI: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C6]](s32) + ; CI: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; CI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[OR2]](s32), [[OR5]](s32), [[OR8]](s32), [[OR11]](s32) ; CI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BUILD_VECTOR]](<4 x s32>) ; VI-LABEL: name: test_load_private_v4s32_align16 ; VI: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 @@ -6320,9 +6942,13 @@ body: | ; VI: [[AND3:%[0-9]+]]:_(s16) = G_AND [[TRUNC3]], [[C3]] ; VI: [[SHL1:%[0-9]+]]:_(s16) = G_SHL [[AND3]], [[C4]](s16) ; VI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[SHL1]] - ; VI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; VI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; VI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C5]](s32) + ; VI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; VI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; VI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C5]](s32) + ; VI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; VI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; VI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C6]](s32) ; VI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p5) :: (load 1, addrspace 56) ; VI: [[GEP4:%[0-9]+]]:_(p5) = G_GEP [[GEP3]], [[C]](s32) ; VI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p5) :: (load 1, addrspace 56) @@ -6334,17 +6960,20 @@ body: | ; VI: [[AND4:%[0-9]+]]:_(s16) = G_AND [[TRUNC4]], [[C3]] ; VI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) ; VI: [[AND5:%[0-9]+]]:_(s16) = G_AND [[TRUNC5]], [[C3]] - ; VI: [[SHL2:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) - ; VI: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL2]] + ; VI: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) + ; VI: [[OR3:%[0-9]+]]:_(s16) = 
G_OR [[AND4]], [[SHL3]] ; VI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; VI: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; VI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) ; VI: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C3]] - ; VI: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) - ; VI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; VI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) - ; VI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 - ; VI: [[GEP7:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C6]](s32) + ; VI: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) + ; VI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL4]] + ; VI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; VI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; VI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C5]](s32) + ; VI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; VI: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 + ; VI: [[GEP7:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C7]](s32) ; VI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p5) :: (load 1, addrspace 56) ; VI: [[GEP8:%[0-9]+]]:_(p5) = G_GEP [[GEP7]], [[C]](s32) ; VI: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p5) :: (load 1, addrspace 56) @@ -6356,17 +6985,20 @@ body: | ; VI: [[AND8:%[0-9]+]]:_(s16) = G_AND [[TRUNC8]], [[C3]] ; VI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD9]](s32) ; VI: [[AND9:%[0-9]+]]:_(s16) = G_AND [[TRUNC9]], [[C3]] - ; VI: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C4]](s16) - ; VI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL4]] + ; VI: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C4]](s16) + ; VI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL6]] ; VI: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; VI: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C3]] ; VI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD11]](s32) ; VI: [[AND11:%[0-9]+]]:_(s16) = G_AND [[TRUNC11]], [[C3]] - ; VI: [[SHL5:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C4]](s16) - ; VI: 
[[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL5]] - ; VI: [[MV2:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16) - ; VI: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 - ; VI: [[GEP11:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C7]](s32) + ; VI: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C4]](s16) + ; VI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL7]] + ; VI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; VI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; VI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C5]](s32) + ; VI: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL8]] + ; VI: [[C8:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 + ; VI: [[GEP11:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C8]](s32) ; VI: [[LOAD12:%[0-9]+]]:_(s32) = G_LOAD [[GEP11]](p5) :: (load 1, addrspace 56) ; VI: [[GEP12:%[0-9]+]]:_(p5) = G_GEP [[GEP11]], [[C]](s32) ; VI: [[LOAD13:%[0-9]+]]:_(s32) = G_LOAD [[GEP12]](p5) :: (load 1, addrspace 56) @@ -6378,16 +7010,19 @@ body: | ; VI: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C3]] ; VI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD13]](s32) ; VI: [[AND13:%[0-9]+]]:_(s16) = G_AND [[TRUNC13]], [[C3]] - ; VI: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C4]](s16) - ; VI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL6]] + ; VI: [[SHL9:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C4]](s16) + ; VI: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL9]] ; VI: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; VI: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C3]] ; VI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD15]](s32) ; VI: [[AND15:%[0-9]+]]:_(s16) = G_AND [[TRUNC15]], [[C3]] - ; VI: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C4]](s16) - ; VI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL7]] - ; VI: [[MV3:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR6]](s16), [[OR7]](s16) - ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[MV]](s32), [[MV1]](s32), [[MV2]](s32), [[MV3]](s32) + ; VI: [[SHL10:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C4]](s16) + ; VI: 
[[OR10:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL10]] + ; VI: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; VI: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR10]](s16) + ; VI: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C5]](s32) + ; VI: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[OR2]](s32), [[OR5]](s32), [[OR8]](s32), [[OR11]](s32) ; VI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BUILD_VECTOR]](<4 x s32>) ; GFX9-LABEL: name: test_load_private_v4s32_align16 ; GFX9: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 @@ -6415,9 +7050,13 @@ body: | ; GFX9: [[AND3:%[0-9]+]]:_(s16) = G_AND [[TRUNC3]], [[C3]] ; GFX9: [[SHL1:%[0-9]+]]:_(s16) = G_SHL [[AND3]], [[C4]](s16) ; GFX9: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[SHL1]] - ; GFX9: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; GFX9: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; GFX9: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C5]](s32) + ; GFX9: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C5]](s32) + ; GFX9: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; GFX9: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; GFX9: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C6]](s32) ; GFX9: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p5) :: (load 1, addrspace 56) ; GFX9: [[GEP4:%[0-9]+]]:_(p5) = G_GEP [[GEP3]], [[C]](s32) ; GFX9: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p5) :: (load 1, addrspace 56) @@ -6429,17 +7068,20 @@ body: | ; GFX9: [[AND4:%[0-9]+]]:_(s16) = G_AND [[TRUNC4]], [[C3]] ; GFX9: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) ; GFX9: [[AND5:%[0-9]+]]:_(s16) = G_AND [[TRUNC5]], [[C3]] - ; GFX9: [[SHL2:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) - ; GFX9: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL2]] + ; GFX9: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) + ; GFX9: 
[[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL3]] ; GFX9: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; GFX9: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; GFX9: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) ; GFX9: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C3]] - ; GFX9: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) - ; GFX9: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; GFX9: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) - ; GFX9: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 - ; GFX9: [[GEP7:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C6]](s32) + ; GFX9: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) + ; GFX9: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL4]] + ; GFX9: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; GFX9: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; GFX9: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C5]](s32) + ; GFX9: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; GFX9: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 + ; GFX9: [[GEP7:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C7]](s32) ; GFX9: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p5) :: (load 1, addrspace 56) ; GFX9: [[GEP8:%[0-9]+]]:_(p5) = G_GEP [[GEP7]], [[C]](s32) ; GFX9: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p5) :: (load 1, addrspace 56) @@ -6451,17 +7093,20 @@ body: | ; GFX9: [[AND8:%[0-9]+]]:_(s16) = G_AND [[TRUNC8]], [[C3]] ; GFX9: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD9]](s32) ; GFX9: [[AND9:%[0-9]+]]:_(s16) = G_AND [[TRUNC9]], [[C3]] - ; GFX9: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C4]](s16) - ; GFX9: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL4]] + ; GFX9: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C4]](s16) + ; GFX9: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL6]] ; GFX9: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; GFX9: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C3]] ; GFX9: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD11]](s32) ; GFX9: [[AND11:%[0-9]+]]:_(s16) = G_AND [[TRUNC11]], [[C3]] - ; 
GFX9: [[SHL5:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C4]](s16) - ; GFX9: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL5]] - ; GFX9: [[MV2:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16) - ; GFX9: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 - ; GFX9: [[GEP11:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C7]](s32) + ; GFX9: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C4]](s16) + ; GFX9: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL7]] + ; GFX9: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; GFX9: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; GFX9: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C5]](s32) + ; GFX9: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL8]] + ; GFX9: [[C8:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 + ; GFX9: [[GEP11:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C8]](s32) ; GFX9: [[LOAD12:%[0-9]+]]:_(s32) = G_LOAD [[GEP11]](p5) :: (load 1, addrspace 56) ; GFX9: [[GEP12:%[0-9]+]]:_(p5) = G_GEP [[GEP11]], [[C]](s32) ; GFX9: [[LOAD13:%[0-9]+]]:_(s32) = G_LOAD [[GEP12]](p5) :: (load 1, addrspace 56) @@ -6473,16 +7118,19 @@ body: | ; GFX9: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C3]] ; GFX9: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD13]](s32) ; GFX9: [[AND13:%[0-9]+]]:_(s16) = G_AND [[TRUNC13]], [[C3]] - ; GFX9: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C4]](s16) - ; GFX9: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL6]] + ; GFX9: [[SHL9:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C4]](s16) + ; GFX9: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL9]] ; GFX9: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; GFX9: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C3]] ; GFX9: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD15]](s32) ; GFX9: [[AND15:%[0-9]+]]:_(s16) = G_AND [[TRUNC15]], [[C3]] - ; GFX9: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C4]](s16) - ; GFX9: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL7]] - ; GFX9: [[MV3:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR6]](s16), [[OR7]](s16) - ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR 
[[MV]](s32), [[MV1]](s32), [[MV2]](s32), [[MV3]](s32) + ; GFX9: [[SHL10:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C4]](s16) + ; GFX9: [[OR10:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL10]] + ; GFX9: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; GFX9: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR10]](s16) + ; GFX9: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C5]](s32) + ; GFX9: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[OR2]](s32), [[OR5]](s32), [[OR8]](s32), [[OR11]](s32) ; GFX9: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BUILD_VECTOR]](<4 x s32>) %0:_(p5) = COPY $vgpr0 %1:_(<4 x s32>) = G_LOAD %0 :: (load 16, align 1, addrspace 56) @@ -6632,142 +7280,198 @@ body: | ; SI-LABEL: name: test_load_private_v4s32_align2 ; SI: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 ; SI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 2, addrspace 5) - ; SI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; SI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; SI: [[GEP:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C]](s32) ; SI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p5) :: (load 2, addrspace 5) - ; SI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; SI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; SI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; SI: [[GEP1:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C1]](s32) + ; SI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; SI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; SI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; SI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; SI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; SI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; SI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; SI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; SI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; SI: [[GEP1:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C3]](s32) ; SI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p5) :: 
(load 2, addrspace 5) - ; SI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; SI: [[GEP2:%[0-9]+]]:_(p5) = G_GEP [[GEP1]], [[C]](s32) ; SI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p5) :: (load 2, addrspace 5) - ; SI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; SI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC2]](s16), [[TRUNC3]](s16) - ; SI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 - ; SI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C2]](s32) + ; SI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; SI: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C1]] + ; SI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; SI: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C1]] + ; SI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C2]](s32) + ; SI: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; SI: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 + ; SI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C4]](s32) ; SI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p5) :: (load 2, addrspace 5) - ; SI: [[TRUNC4:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD4]](s32) ; SI: [[GEP4:%[0-9]+]]:_(p5) = G_GEP [[GEP3]], [[C]](s32) ; SI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p5) :: (load 2, addrspace 5) - ; SI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) - ; SI: [[MV2:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC4]](s16), [[TRUNC5]](s16) - ; SI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 - ; SI: [[GEP5:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C3]](s32) + ; SI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD4]](s32) + ; SI: [[AND4:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C1]] + ; SI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) + ; SI: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C1]] + ; SI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[C2]](s32) + ; SI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[AND4]], [[SHL2]] + ; SI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 + ; SI: [[GEP5:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C5]](s32) ; SI: [[LOAD6:%[0-9]+]]:_(s32) = G_LOAD [[GEP5]](p5) :: (load 2, addrspace 5) - ; SI: 
[[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; SI: [[GEP6:%[0-9]+]]:_(p5) = G_GEP [[GEP5]], [[C]](s32) ; SI: [[LOAD7:%[0-9]+]]:_(s32) = G_LOAD [[GEP6]](p5) :: (load 2, addrspace 5) - ; SI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) - ; SI: [[MV3:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC6]](s16), [[TRUNC7]](s16) - ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[MV]](s32), [[MV1]](s32), [[MV2]](s32), [[MV3]](s32) + ; SI: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LOAD6]](s32) + ; SI: [[AND6:%[0-9]+]]:_(s32) = G_AND [[COPY7]], [[C1]] + ; SI: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) + ; SI: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY8]], [[C1]] + ; SI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C2]](s32) + ; SI: [[OR3:%[0-9]+]]:_(s32) = G_OR [[AND6]], [[SHL3]] + ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[OR]](s32), [[OR1]](s32), [[OR2]](s32), [[OR3]](s32) ; SI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BUILD_VECTOR]](<4 x s32>) ; CI-LABEL: name: test_load_private_v4s32_align2 ; CI: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 ; CI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 2, addrspace 5) - ; CI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; CI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; CI: [[GEP:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C]](s32) ; CI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p5) :: (load 2, addrspace 5) - ; CI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; CI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; CI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; CI: [[GEP1:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C1]](s32) + ; CI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; CI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; CI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; CI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; CI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; CI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL:%[0-9]+]]:_(s32) = G_SHL 
[[AND1]], [[C2]](s32) + ; CI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; CI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; CI: [[GEP1:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C3]](s32) ; CI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p5) :: (load 2, addrspace 5) - ; CI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; CI: [[GEP2:%[0-9]+]]:_(p5) = G_GEP [[GEP1]], [[C]](s32) ; CI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p5) :: (load 2, addrspace 5) - ; CI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; CI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC2]](s16), [[TRUNC3]](s16) - ; CI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 - ; CI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C2]](s32) + ; CI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; CI: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C1]] + ; CI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; CI: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C1]] + ; CI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C2]](s32) + ; CI: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; CI: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 + ; CI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C4]](s32) ; CI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p5) :: (load 2, addrspace 5) - ; CI: [[TRUNC4:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD4]](s32) ; CI: [[GEP4:%[0-9]+]]:_(p5) = G_GEP [[GEP3]], [[C]](s32) ; CI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p5) :: (load 2, addrspace 5) - ; CI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) - ; CI: [[MV2:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC4]](s16), [[TRUNC5]](s16) - ; CI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 - ; CI: [[GEP5:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C3]](s32) + ; CI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD4]](s32) + ; CI: [[AND4:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C1]] + ; CI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) + ; CI: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C1]] + ; CI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[C2]](s32) + ; CI: 
[[OR2:%[0-9]+]]:_(s32) = G_OR [[AND4]], [[SHL2]] + ; CI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 + ; CI: [[GEP5:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C5]](s32) ; CI: [[LOAD6:%[0-9]+]]:_(s32) = G_LOAD [[GEP5]](p5) :: (load 2, addrspace 5) - ; CI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; CI: [[GEP6:%[0-9]+]]:_(p5) = G_GEP [[GEP5]], [[C]](s32) ; CI: [[LOAD7:%[0-9]+]]:_(s32) = G_LOAD [[GEP6]](p5) :: (load 2, addrspace 5) - ; CI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) - ; CI: [[MV3:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC6]](s16), [[TRUNC7]](s16) - ; CI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[MV]](s32), [[MV1]](s32), [[MV2]](s32), [[MV3]](s32) + ; CI: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LOAD6]](s32) + ; CI: [[AND6:%[0-9]+]]:_(s32) = G_AND [[COPY7]], [[C1]] + ; CI: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) + ; CI: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY8]], [[C1]] + ; CI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C2]](s32) + ; CI: [[OR3:%[0-9]+]]:_(s32) = G_OR [[AND6]], [[SHL3]] + ; CI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[OR]](s32), [[OR1]](s32), [[OR2]](s32), [[OR3]](s32) ; CI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BUILD_VECTOR]](<4 x s32>) ; VI-LABEL: name: test_load_private_v4s32_align2 ; VI: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 2, addrspace 5) - ; VI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; VI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; VI: [[GEP:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C]](s32) ; VI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p5) :: (load 2, addrspace 5) - ; VI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; VI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; VI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; VI: [[GEP1:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C1]](s32) + ; VI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; VI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; VI: 
[[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; VI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; VI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; VI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; VI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; VI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; VI: [[GEP1:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C3]](s32) ; VI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p5) :: (load 2, addrspace 5) - ; VI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; VI: [[GEP2:%[0-9]+]]:_(p5) = G_GEP [[GEP1]], [[C]](s32) ; VI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p5) :: (load 2, addrspace 5) - ; VI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; VI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC2]](s16), [[TRUNC3]](s16) - ; VI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 - ; VI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C2]](s32) + ; VI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; VI: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C1]] + ; VI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; VI: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C1]] + ; VI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C2]](s32) + ; VI: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; VI: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 + ; VI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C4]](s32) ; VI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p5) :: (load 2, addrspace 5) - ; VI: [[TRUNC4:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD4]](s32) ; VI: [[GEP4:%[0-9]+]]:_(p5) = G_GEP [[GEP3]], [[C]](s32) ; VI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p5) :: (load 2, addrspace 5) - ; VI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) - ; VI: [[MV2:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC4]](s16), [[TRUNC5]](s16) - ; VI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 - ; VI: [[GEP5:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C3]](s32) + ; VI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD4]](s32) + ; 
VI: [[AND4:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C1]] + ; VI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) + ; VI: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C1]] + ; VI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[C2]](s32) + ; VI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[AND4]], [[SHL2]] + ; VI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 + ; VI: [[GEP5:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C5]](s32) ; VI: [[LOAD6:%[0-9]+]]:_(s32) = G_LOAD [[GEP5]](p5) :: (load 2, addrspace 5) - ; VI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; VI: [[GEP6:%[0-9]+]]:_(p5) = G_GEP [[GEP5]], [[C]](s32) ; VI: [[LOAD7:%[0-9]+]]:_(s32) = G_LOAD [[GEP6]](p5) :: (load 2, addrspace 5) - ; VI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) - ; VI: [[MV3:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC6]](s16), [[TRUNC7]](s16) - ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[MV]](s32), [[MV1]](s32), [[MV2]](s32), [[MV3]](s32) + ; VI: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LOAD6]](s32) + ; VI: [[AND6:%[0-9]+]]:_(s32) = G_AND [[COPY7]], [[C1]] + ; VI: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) + ; VI: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY8]], [[C1]] + ; VI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C2]](s32) + ; VI: [[OR3:%[0-9]+]]:_(s32) = G_OR [[AND6]], [[SHL3]] + ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[OR]](s32), [[OR1]](s32), [[OR2]](s32), [[OR3]](s32) ; VI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BUILD_VECTOR]](<4 x s32>) ; GFX9-LABEL: name: test_load_private_v4s32_align2 ; GFX9: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 ; GFX9: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 2, addrspace 5) - ; GFX9: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; GFX9: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; GFX9: [[GEP:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C]](s32) ; GFX9: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p5) :: (load 2, addrspace 5) - ; GFX9: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; GFX9: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES 
[[TRUNC]](s16), [[TRUNC1]](s16) - ; GFX9: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; GFX9: [[GEP1:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C1]](s32) + ; GFX9: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; GFX9: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; GFX9: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; GFX9: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; GFX9: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; GFX9: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; GFX9: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; GFX9: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; GFX9: [[GEP1:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C3]](s32) ; GFX9: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p5) :: (load 2, addrspace 5) - ; GFX9: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; GFX9: [[GEP2:%[0-9]+]]:_(p5) = G_GEP [[GEP1]], [[C]](s32) ; GFX9: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p5) :: (load 2, addrspace 5) - ; GFX9: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; GFX9: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC2]](s16), [[TRUNC3]](s16) - ; GFX9: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 - ; GFX9: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C2]](s32) + ; GFX9: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; GFX9: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C1]] + ; GFX9: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; GFX9: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C1]] + ; GFX9: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C2]](s32) + ; GFX9: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; GFX9: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 + ; GFX9: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C4]](s32) ; GFX9: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p5) :: (load 2, addrspace 5) - ; GFX9: [[TRUNC4:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD4]](s32) ; GFX9: [[GEP4:%[0-9]+]]:_(p5) = G_GEP [[GEP3]], [[C]](s32) ; GFX9: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p5) :: (load 2, 
addrspace 5) - ; GFX9: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) - ; GFX9: [[MV2:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC4]](s16), [[TRUNC5]](s16) - ; GFX9: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 - ; GFX9: [[GEP5:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C3]](s32) + ; GFX9: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD4]](s32) + ; GFX9: [[AND4:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C1]] + ; GFX9: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) + ; GFX9: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C1]] + ; GFX9: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[C2]](s32) + ; GFX9: [[OR2:%[0-9]+]]:_(s32) = G_OR [[AND4]], [[SHL2]] + ; GFX9: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 + ; GFX9: [[GEP5:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C5]](s32) ; GFX9: [[LOAD6:%[0-9]+]]:_(s32) = G_LOAD [[GEP5]](p5) :: (load 2, addrspace 5) - ; GFX9: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; GFX9: [[GEP6:%[0-9]+]]:_(p5) = G_GEP [[GEP5]], [[C]](s32) ; GFX9: [[LOAD7:%[0-9]+]]:_(s32) = G_LOAD [[GEP6]](p5) :: (load 2, addrspace 5) - ; GFX9: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) - ; GFX9: [[MV3:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC6]](s16), [[TRUNC7]](s16) - ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[MV]](s32), [[MV1]](s32), [[MV2]](s32), [[MV3]](s32) + ; GFX9: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LOAD6]](s32) + ; GFX9: [[AND6:%[0-9]+]]:_(s32) = G_AND [[COPY7]], [[C1]] + ; GFX9: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) + ; GFX9: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY8]], [[C1]] + ; GFX9: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C2]](s32) + ; GFX9: [[OR3:%[0-9]+]]:_(s32) = G_OR [[AND6]], [[SHL3]] + ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[OR]](s32), [[OR1]](s32), [[OR2]](s32), [[OR3]](s32) ; GFX9: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BUILD_VECTOR]](<4 x s32>) %0:_(p5) = COPY $vgpr0 %1:_(<4 x s32>) = G_LOAD %0 :: (load 16, align 2, addrspace 5) @@ -6811,9 +7515,13 @@ body: | ; SI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], 
[[COPY3]](s32) ; SI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[SHL1]](s32) ; SI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[TRUNC3]] - ; SI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; SI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; SI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C6]](s32) + ; SI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; SI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; SI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; SI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C6]](s32) + ; SI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; SI: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; SI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C7]](s32) ; SI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p5) :: (load 1, addrspace 5) ; SI: [[GEP4:%[0-9]+]]:_(p5) = G_GEP [[GEP3]], [[C]](s32) ; SI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p5) :: (load 1, addrspace 5) @@ -6826,18 +7534,21 @@ body: | ; SI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; SI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) ; SI: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C5]] - ; SI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY5]](s32) - ; SI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL2]](s32) - ; SI: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] + ; SI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY5]](s32) + ; SI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) + ; SI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] ; SI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; SI: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; SI: [[COPY7:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; SI: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) ; SI: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY8]], [[C5]] - ; SI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY7]](s32) - ; SI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) - ; SI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; SI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) + 
; SI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY7]](s32) + ; SI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) + ; SI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] + ; SI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; SI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; SI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C6]](s32) + ; SI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] ; SI: [[GEP7:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C4]](s32) ; SI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p5) :: (load 1, addrspace 5) ; SI: [[GEP8:%[0-9]+]]:_(p5) = G_GEP [[GEP7]], [[C]](s32) @@ -6851,20 +7562,23 @@ body: | ; SI: [[COPY9:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; SI: [[COPY10:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32) ; SI: [[AND9:%[0-9]+]]:_(s32) = G_AND [[COPY10]], [[C5]] - ; SI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY9]](s32) - ; SI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) - ; SI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] + ; SI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY9]](s32) + ; SI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) + ; SI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] ; SI: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; SI: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C3]] ; SI: [[COPY11:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; SI: [[COPY12:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32) ; SI: [[AND11:%[0-9]+]]:_(s32) = G_AND [[COPY12]], [[C5]] - ; SI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY11]](s32) - ; SI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL5]](s32) - ; SI: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] - ; SI: [[MV2:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16) - ; SI: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 - ; SI: [[GEP11:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C7]](s32) + ; SI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY11]](s32) + ; SI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) + ; SI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], 
[[TRUNC11]] + ; SI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; SI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; SI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C6]](s32) + ; SI: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL8]] + ; SI: [[C8:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 + ; SI: [[GEP11:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C8]](s32) ; SI: [[LOAD12:%[0-9]+]]:_(s32) = G_LOAD [[GEP11]](p5) :: (load 1, addrspace 5) ; SI: [[GEP12:%[0-9]+]]:_(p5) = G_GEP [[GEP11]], [[C]](s32) ; SI: [[LOAD13:%[0-9]+]]:_(s32) = G_LOAD [[GEP12]](p5) :: (load 1, addrspace 5) @@ -6877,19 +7591,22 @@ body: | ; SI: [[COPY13:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; SI: [[COPY14:%[0-9]+]]:_(s32) = COPY [[LOAD13]](s32) ; SI: [[AND13:%[0-9]+]]:_(s32) = G_AND [[COPY14]], [[C5]] - ; SI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY13]](s32) - ; SI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) - ; SI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] + ; SI: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY13]](s32) + ; SI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL9]](s32) + ; SI: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] ; SI: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; SI: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C3]] ; SI: [[COPY15:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; SI: [[COPY16:%[0-9]+]]:_(s32) = COPY [[LOAD15]](s32) ; SI: [[AND15:%[0-9]+]]:_(s32) = G_AND [[COPY16]], [[C5]] - ; SI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY15]](s32) - ; SI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) - ; SI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] - ; SI: [[MV3:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR6]](s16), [[OR7]](s16) - ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[MV]](s32), [[MV1]](s32), [[MV2]](s32), [[MV3]](s32) + ; SI: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY15]](s32) + ; SI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL10]](s32) + ; SI: [[OR10:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] + ; 
SI: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; SI: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR10]](s16) + ; SI: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C6]](s32) + ; SI: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[OR2]](s32), [[OR5]](s32), [[OR8]](s32), [[OR11]](s32) ; SI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BUILD_VECTOR]](<4 x s32>) ; CI-LABEL: name: test_load_private_v4s32_align1 ; CI: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 @@ -6922,9 +7639,13 @@ body: | ; CI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[COPY3]](s32) ; CI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[SHL1]](s32) ; CI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[TRUNC3]] - ; CI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; CI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; CI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C6]](s32) + ; CI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C6]](s32) + ; CI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; CI: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; CI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C7]](s32) ; CI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p5) :: (load 1, addrspace 5) ; CI: [[GEP4:%[0-9]+]]:_(p5) = G_GEP [[GEP3]], [[C]](s32) ; CI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p5) :: (load 1, addrspace 5) @@ -6937,18 +7658,21 @@ body: | ; CI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) ; CI: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C5]] - ; CI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY5]](s32) - ; CI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL2]](s32) - ; CI: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] + ; CI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY5]](s32) + ; CI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) + ; 
CI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] ; CI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; CI: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; CI: [[COPY7:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) ; CI: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY8]], [[C5]] - ; CI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY7]](s32) - ; CI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) - ; CI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; CI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) + ; CI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY7]](s32) + ; CI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) + ; CI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] + ; CI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; CI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C6]](s32) + ; CI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] ; CI: [[GEP7:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C4]](s32) ; CI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p5) :: (load 1, addrspace 5) ; CI: [[GEP8:%[0-9]+]]:_(p5) = G_GEP [[GEP7]], [[C]](s32) @@ -6962,20 +7686,23 @@ body: | ; CI: [[COPY9:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI: [[COPY10:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32) ; CI: [[AND9:%[0-9]+]]:_(s32) = G_AND [[COPY10]], [[C5]] - ; CI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY9]](s32) - ; CI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) - ; CI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] + ; CI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY9]](s32) + ; CI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) + ; CI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] ; CI: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; CI: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C3]] ; CI: [[COPY11:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI: [[COPY12:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32) ; CI: 
[[AND11:%[0-9]+]]:_(s32) = G_AND [[COPY12]], [[C5]] - ; CI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY11]](s32) - ; CI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL5]](s32) - ; CI: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] - ; CI: [[MV2:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16) - ; CI: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 - ; CI: [[GEP11:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C7]](s32) + ; CI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY11]](s32) + ; CI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) + ; CI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] + ; CI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; CI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; CI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C6]](s32) + ; CI: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL8]] + ; CI: [[C8:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 + ; CI: [[GEP11:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C8]](s32) ; CI: [[LOAD12:%[0-9]+]]:_(s32) = G_LOAD [[GEP11]](p5) :: (load 1, addrspace 5) ; CI: [[GEP12:%[0-9]+]]:_(p5) = G_GEP [[GEP11]], [[C]](s32) ; CI: [[LOAD13:%[0-9]+]]:_(s32) = G_LOAD [[GEP12]](p5) :: (load 1, addrspace 5) @@ -6988,19 +7715,22 @@ body: | ; CI: [[COPY13:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI: [[COPY14:%[0-9]+]]:_(s32) = COPY [[LOAD13]](s32) ; CI: [[AND13:%[0-9]+]]:_(s32) = G_AND [[COPY14]], [[C5]] - ; CI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY13]](s32) - ; CI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) - ; CI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] + ; CI: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY13]](s32) + ; CI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL9]](s32) + ; CI: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] ; CI: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; CI: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C3]] ; CI: [[COPY15:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI: [[COPY16:%[0-9]+]]:_(s32) = COPY [[LOAD15]](s32) ; CI: 
[[AND15:%[0-9]+]]:_(s32) = G_AND [[COPY16]], [[C5]] - ; CI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY15]](s32) - ; CI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) - ; CI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] - ; CI: [[MV3:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR6]](s16), [[OR7]](s16) - ; CI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[MV]](s32), [[MV1]](s32), [[MV2]](s32), [[MV3]](s32) + ; CI: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY15]](s32) + ; CI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL10]](s32) + ; CI: [[OR10:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] + ; CI: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; CI: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR10]](s16) + ; CI: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C6]](s32) + ; CI: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; CI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[OR2]](s32), [[OR5]](s32), [[OR8]](s32), [[OR11]](s32) ; CI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BUILD_VECTOR]](<4 x s32>) ; VI-LABEL: name: test_load_private_v4s32_align1 ; VI: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 @@ -7028,9 +7758,13 @@ body: | ; VI: [[AND3:%[0-9]+]]:_(s16) = G_AND [[TRUNC3]], [[C3]] ; VI: [[SHL1:%[0-9]+]]:_(s16) = G_SHL [[AND3]], [[C4]](s16) ; VI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[SHL1]] - ; VI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; VI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; VI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C5]](s32) + ; VI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; VI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; VI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C5]](s32) + ; VI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; VI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; VI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C6]](s32) ; VI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p5) :: (load 1, addrspace 5) ; VI: 
[[GEP4:%[0-9]+]]:_(p5) = G_GEP [[GEP3]], [[C]](s32) ; VI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p5) :: (load 1, addrspace 5) @@ -7042,17 +7776,20 @@ body: | ; VI: [[AND4:%[0-9]+]]:_(s16) = G_AND [[TRUNC4]], [[C3]] ; VI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) ; VI: [[AND5:%[0-9]+]]:_(s16) = G_AND [[TRUNC5]], [[C3]] - ; VI: [[SHL2:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) - ; VI: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL2]] + ; VI: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) + ; VI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL3]] ; VI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; VI: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; VI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) ; VI: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C3]] - ; VI: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) - ; VI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; VI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) - ; VI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 - ; VI: [[GEP7:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C6]](s32) + ; VI: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) + ; VI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL4]] + ; VI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; VI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; VI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C5]](s32) + ; VI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; VI: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 + ; VI: [[GEP7:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C7]](s32) ; VI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p5) :: (load 1, addrspace 5) ; VI: [[GEP8:%[0-9]+]]:_(p5) = G_GEP [[GEP7]], [[C]](s32) ; VI: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p5) :: (load 1, addrspace 5) @@ -7064,17 +7801,20 @@ body: | ; VI: [[AND8:%[0-9]+]]:_(s16) = G_AND [[TRUNC8]], [[C3]] ; VI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD9]](s32) ; VI: [[AND9:%[0-9]+]]:_(s16) = G_AND [[TRUNC9]], [[C3]] - ; VI: 
[[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C4]](s16) - ; VI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL4]] + ; VI: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C4]](s16) + ; VI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL6]] ; VI: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; VI: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C3]] ; VI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD11]](s32) ; VI: [[AND11:%[0-9]+]]:_(s16) = G_AND [[TRUNC11]], [[C3]] - ; VI: [[SHL5:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C4]](s16) - ; VI: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL5]] - ; VI: [[MV2:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16) - ; VI: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 - ; VI: [[GEP11:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C7]](s32) + ; VI: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C4]](s16) + ; VI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL7]] + ; VI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; VI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; VI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C5]](s32) + ; VI: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL8]] + ; VI: [[C8:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 + ; VI: [[GEP11:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C8]](s32) ; VI: [[LOAD12:%[0-9]+]]:_(s32) = G_LOAD [[GEP11]](p5) :: (load 1, addrspace 5) ; VI: [[GEP12:%[0-9]+]]:_(p5) = G_GEP [[GEP11]], [[C]](s32) ; VI: [[LOAD13:%[0-9]+]]:_(s32) = G_LOAD [[GEP12]](p5) :: (load 1, addrspace 5) @@ -7086,16 +7826,19 @@ body: | ; VI: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C3]] ; VI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD13]](s32) ; VI: [[AND13:%[0-9]+]]:_(s16) = G_AND [[TRUNC13]], [[C3]] - ; VI: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C4]](s16) - ; VI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL6]] + ; VI: [[SHL9:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C4]](s16) + ; VI: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL9]] ; VI: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; VI: [[AND14:%[0-9]+]]:_(s16) 
= G_AND [[TRUNC14]], [[C3]] ; VI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD15]](s32) ; VI: [[AND15:%[0-9]+]]:_(s16) = G_AND [[TRUNC15]], [[C3]] - ; VI: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C4]](s16) - ; VI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL7]] - ; VI: [[MV3:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR6]](s16), [[OR7]](s16) - ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[MV]](s32), [[MV1]](s32), [[MV2]](s32), [[MV3]](s32) + ; VI: [[SHL10:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C4]](s16) + ; VI: [[OR10:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL10]] + ; VI: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; VI: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR10]](s16) + ; VI: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C5]](s32) + ; VI: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[OR2]](s32), [[OR5]](s32), [[OR8]](s32), [[OR11]](s32) ; VI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BUILD_VECTOR]](<4 x s32>) ; GFX9-LABEL: name: test_load_private_v4s32_align1 ; GFX9: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 @@ -7123,9 +7866,13 @@ body: | ; GFX9: [[AND3:%[0-9]+]]:_(s16) = G_AND [[TRUNC3]], [[C3]] ; GFX9: [[SHL1:%[0-9]+]]:_(s16) = G_SHL [[AND3]], [[C4]](s16) ; GFX9: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[SHL1]] - ; GFX9: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; GFX9: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; GFX9: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C5]](s32) + ; GFX9: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C5]](s32) + ; GFX9: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; GFX9: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; GFX9: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C6]](s32) ; GFX9: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p5) :: (load 1, addrspace 5) ; GFX9: 
[[GEP4:%[0-9]+]]:_(p5) = G_GEP [[GEP3]], [[C]](s32) ; GFX9: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p5) :: (load 1, addrspace 5) @@ -7137,17 +7884,20 @@ body: | ; GFX9: [[AND4:%[0-9]+]]:_(s16) = G_AND [[TRUNC4]], [[C3]] ; GFX9: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) ; GFX9: [[AND5:%[0-9]+]]:_(s16) = G_AND [[TRUNC5]], [[C3]] - ; GFX9: [[SHL2:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) - ; GFX9: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL2]] + ; GFX9: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) + ; GFX9: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL3]] ; GFX9: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; GFX9: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; GFX9: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) ; GFX9: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C3]] - ; GFX9: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) - ; GFX9: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; GFX9: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) - ; GFX9: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 - ; GFX9: [[GEP7:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C6]](s32) + ; GFX9: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) + ; GFX9: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL4]] + ; GFX9: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; GFX9: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; GFX9: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C5]](s32) + ; GFX9: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; GFX9: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 + ; GFX9: [[GEP7:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C7]](s32) ; GFX9: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p5) :: (load 1, addrspace 5) ; GFX9: [[GEP8:%[0-9]+]]:_(p5) = G_GEP [[GEP7]], [[C]](s32) ; GFX9: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p5) :: (load 1, addrspace 5) @@ -7159,17 +7909,20 @@ body: | ; GFX9: [[AND8:%[0-9]+]]:_(s16) = G_AND [[TRUNC8]], [[C3]] ; GFX9: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD9]](s32) ; GFX9: 
[[AND9:%[0-9]+]]:_(s16) = G_AND [[TRUNC9]], [[C3]] - ; GFX9: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C4]](s16) - ; GFX9: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL4]] + ; GFX9: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C4]](s16) + ; GFX9: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL6]] ; GFX9: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; GFX9: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C3]] ; GFX9: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD11]](s32) ; GFX9: [[AND11:%[0-9]+]]:_(s16) = G_AND [[TRUNC11]], [[C3]] - ; GFX9: [[SHL5:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C4]](s16) - ; GFX9: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL5]] - ; GFX9: [[MV2:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16) - ; GFX9: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 - ; GFX9: [[GEP11:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C7]](s32) + ; GFX9: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C4]](s16) + ; GFX9: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL7]] + ; GFX9: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; GFX9: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; GFX9: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C5]](s32) + ; GFX9: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL8]] + ; GFX9: [[C8:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 + ; GFX9: [[GEP11:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C8]](s32) ; GFX9: [[LOAD12:%[0-9]+]]:_(s32) = G_LOAD [[GEP11]](p5) :: (load 1, addrspace 5) ; GFX9: [[GEP12:%[0-9]+]]:_(p5) = G_GEP [[GEP11]], [[C]](s32) ; GFX9: [[LOAD13:%[0-9]+]]:_(s32) = G_LOAD [[GEP12]](p5) :: (load 1, addrspace 5) @@ -7181,16 +7934,19 @@ body: | ; GFX9: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C3]] ; GFX9: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD13]](s32) ; GFX9: [[AND13:%[0-9]+]]:_(s16) = G_AND [[TRUNC13]], [[C3]] - ; GFX9: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C4]](s16) - ; GFX9: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL6]] + ; GFX9: [[SHL9:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C4]](s16) + ; GFX9: 
[[OR9:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL9]] ; GFX9: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; GFX9: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C3]] ; GFX9: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD15]](s32) ; GFX9: [[AND15:%[0-9]+]]:_(s16) = G_AND [[TRUNC15]], [[C3]] - ; GFX9: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C4]](s16) - ; GFX9: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL7]] - ; GFX9: [[MV3:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR6]](s16), [[OR7]](s16) - ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[MV]](s32), [[MV1]](s32), [[MV2]](s32), [[MV3]](s32) + ; GFX9: [[SHL10:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C4]](s16) + ; GFX9: [[OR10:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL10]] + ; GFX9: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; GFX9: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR10]](s16) + ; GFX9: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C5]](s32) + ; GFX9: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[OR2]](s32), [[OR5]](s32), [[OR8]](s32), [[OR11]](s32) ; GFX9: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BUILD_VECTOR]](<4 x s32>) %0:_(p5) = COPY $vgpr0 %1:_(<4 x s32>) = G_LOAD %0 :: (load 16, align 1, addrspace 5) @@ -7615,9 +8371,13 @@ body: | ; SI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[COPY3]](s32) ; SI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[SHL1]](s32) ; SI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[TRUNC3]] - ; SI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; SI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; SI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C6]](s32) + ; SI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; SI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; SI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; SI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C6]](s32) + ; SI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; SI: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; SI: 
[[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C7]](s32) ; SI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p5) :: (load 1, addrspace 5) ; SI: [[GEP4:%[0-9]+]]:_(p5) = G_GEP [[GEP3]], [[C]](s32) ; SI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p5) :: (load 1, addrspace 5) @@ -7630,19 +8390,22 @@ body: | ; SI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; SI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) ; SI: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C5]] - ; SI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY5]](s32) - ; SI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL2]](s32) - ; SI: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] + ; SI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY5]](s32) + ; SI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) + ; SI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] ; SI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; SI: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; SI: [[COPY7:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; SI: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) ; SI: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY8]], [[C5]] - ; SI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY7]](s32) - ; SI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) - ; SI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; SI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) - ; SI: [[MV2:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[MV]](s32), [[MV1]](s32) + ; SI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY7]](s32) + ; SI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) + ; SI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] + ; SI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; SI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; SI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C6]](s32) + ; SI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; SI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR2]](s32), [[OR5]](s32) ; SI: [[GEP7:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C4]](s32) ; SI: 
[[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p5) :: (load 1, addrspace 5) ; SI: [[GEP8:%[0-9]+]]:_(p5) = G_GEP [[GEP7]], [[C]](s32) @@ -7656,19 +8419,22 @@ body: | ; SI: [[COPY9:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; SI: [[COPY10:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32) ; SI: [[AND9:%[0-9]+]]:_(s32) = G_AND [[COPY10]], [[C5]] - ; SI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY9]](s32) - ; SI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) - ; SI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] + ; SI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY9]](s32) + ; SI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) + ; SI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] ; SI: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; SI: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C3]] ; SI: [[COPY11:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; SI: [[COPY12:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32) ; SI: [[AND11:%[0-9]+]]:_(s32) = G_AND [[COPY12]], [[C5]] - ; SI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY11]](s32) - ; SI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL5]](s32) - ; SI: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] - ; SI: [[MV3:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16) - ; SI: [[GEP11:%[0-9]+]]:_(p5) = G_GEP [[GEP7]], [[C6]](s32) + ; SI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY11]](s32) + ; SI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) + ; SI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] + ; SI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; SI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; SI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C6]](s32) + ; SI: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL8]] + ; SI: [[GEP11:%[0-9]+]]:_(p5) = G_GEP [[GEP7]], [[C7]](s32) ; SI: [[LOAD12:%[0-9]+]]:_(s32) = G_LOAD [[GEP11]](p5) :: (load 1, addrspace 5) ; SI: [[GEP12:%[0-9]+]]:_(p5) = G_GEP [[GEP11]], [[C]](s32) ; SI: [[LOAD13:%[0-9]+]]:_(s32) = G_LOAD [[GEP12]](p5) :: (load 1, addrspace 5) @@ 
-7681,20 +8447,23 @@ body: | ; SI: [[COPY13:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; SI: [[COPY14:%[0-9]+]]:_(s32) = COPY [[LOAD13]](s32) ; SI: [[AND13:%[0-9]+]]:_(s32) = G_AND [[COPY14]], [[C5]] - ; SI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY13]](s32) - ; SI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) - ; SI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] + ; SI: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY13]](s32) + ; SI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL9]](s32) + ; SI: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] ; SI: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; SI: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C3]] ; SI: [[COPY15:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; SI: [[COPY16:%[0-9]+]]:_(s32) = COPY [[LOAD15]](s32) ; SI: [[AND15:%[0-9]+]]:_(s32) = G_AND [[COPY16]], [[C5]] - ; SI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY15]](s32) - ; SI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) - ; SI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] - ; SI: [[MV4:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR6]](s16), [[OR7]](s16) - ; SI: [[MV5:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[MV3]](s32), [[MV4]](s32) - ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s64>) = G_BUILD_VECTOR [[MV2]](s64), [[MV5]](s64) + ; SI: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY15]](s32) + ; SI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL10]](s32) + ; SI: [[OR10:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] + ; SI: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; SI: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR10]](s16) + ; SI: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C6]](s32) + ; SI: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; SI: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR8]](s32), [[OR11]](s32) + ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s64>) = G_BUILD_VECTOR [[MV]](s64), [[MV1]](s64) ; SI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BUILD_VECTOR]](<2 x s64>) ; CI-LABEL: name: test_load_private_v2s64_align16 ; CI: 
[[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 @@ -7727,9 +8496,13 @@ body: | ; CI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[COPY3]](s32) ; CI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[SHL1]](s32) ; CI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[TRUNC3]] - ; CI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; CI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; CI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C6]](s32) + ; CI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C6]](s32) + ; CI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; CI: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; CI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C7]](s32) ; CI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p5) :: (load 1, addrspace 5) ; CI: [[GEP4:%[0-9]+]]:_(p5) = G_GEP [[GEP3]], [[C]](s32) ; CI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p5) :: (load 1, addrspace 5) @@ -7742,19 +8515,22 @@ body: | ; CI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) ; CI: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C5]] - ; CI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY5]](s32) - ; CI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL2]](s32) - ; CI: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] + ; CI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY5]](s32) + ; CI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) + ; CI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] ; CI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; CI: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; CI: [[COPY7:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) ; CI: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY8]], [[C5]] - ; CI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY7]](s32) - ; CI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) - ; CI: 
[[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; CI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) - ; CI: [[MV2:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[MV]](s32), [[MV1]](s32) + ; CI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY7]](s32) + ; CI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) + ; CI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] + ; CI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; CI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C6]](s32) + ; CI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; CI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR2]](s32), [[OR5]](s32) ; CI: [[GEP7:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C4]](s32) ; CI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p5) :: (load 1, addrspace 5) ; CI: [[GEP8:%[0-9]+]]:_(p5) = G_GEP [[GEP7]], [[C]](s32) @@ -7768,19 +8544,22 @@ body: | ; CI: [[COPY9:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI: [[COPY10:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32) ; CI: [[AND9:%[0-9]+]]:_(s32) = G_AND [[COPY10]], [[C5]] - ; CI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY9]](s32) - ; CI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) - ; CI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] + ; CI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY9]](s32) + ; CI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) + ; CI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] ; CI: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; CI: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C3]] ; CI: [[COPY11:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI: [[COPY12:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32) ; CI: [[AND11:%[0-9]+]]:_(s32) = G_AND [[COPY12]], [[C5]] - ; CI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY11]](s32) - ; CI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL5]](s32) - ; CI: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] - ; CI: [[MV3:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16) - ; CI: 
[[GEP11:%[0-9]+]]:_(p5) = G_GEP [[GEP7]], [[C6]](s32) + ; CI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY11]](s32) + ; CI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) + ; CI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] + ; CI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; CI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; CI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C6]](s32) + ; CI: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL8]] + ; CI: [[GEP11:%[0-9]+]]:_(p5) = G_GEP [[GEP7]], [[C7]](s32) ; CI: [[LOAD12:%[0-9]+]]:_(s32) = G_LOAD [[GEP11]](p5) :: (load 1, addrspace 5) ; CI: [[GEP12:%[0-9]+]]:_(p5) = G_GEP [[GEP11]], [[C]](s32) ; CI: [[LOAD13:%[0-9]+]]:_(s32) = G_LOAD [[GEP12]](p5) :: (load 1, addrspace 5) @@ -7793,20 +8572,23 @@ body: | ; CI: [[COPY13:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI: [[COPY14:%[0-9]+]]:_(s32) = COPY [[LOAD13]](s32) ; CI: [[AND13:%[0-9]+]]:_(s32) = G_AND [[COPY14]], [[C5]] - ; CI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY13]](s32) - ; CI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) - ; CI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] + ; CI: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY13]](s32) + ; CI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL9]](s32) + ; CI: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] ; CI: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; CI: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C3]] ; CI: [[COPY15:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI: [[COPY16:%[0-9]+]]:_(s32) = COPY [[LOAD15]](s32) ; CI: [[AND15:%[0-9]+]]:_(s32) = G_AND [[COPY16]], [[C5]] - ; CI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY15]](s32) - ; CI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) - ; CI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] - ; CI: [[MV4:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR6]](s16), [[OR7]](s16) - ; CI: [[MV5:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[MV3]](s32), [[MV4]](s32) - ; CI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s64>) = 
G_BUILD_VECTOR [[MV2]](s64), [[MV5]](s64) + ; CI: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY15]](s32) + ; CI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL10]](s32) + ; CI: [[OR10:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] + ; CI: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; CI: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR10]](s16) + ; CI: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C6]](s32) + ; CI: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; CI: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR8]](s32), [[OR11]](s32) + ; CI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s64>) = G_BUILD_VECTOR [[MV]](s64), [[MV1]](s64) ; CI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BUILD_VECTOR]](<2 x s64>) ; VI-LABEL: name: test_load_private_v2s64_align16 ; VI: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 @@ -7834,9 +8616,13 @@ body: | ; VI: [[AND3:%[0-9]+]]:_(s16) = G_AND [[TRUNC3]], [[C3]] ; VI: [[SHL1:%[0-9]+]]:_(s16) = G_SHL [[AND3]], [[C4]](s16) ; VI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[SHL1]] - ; VI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; VI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; VI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C5]](s32) + ; VI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; VI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; VI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C5]](s32) + ; VI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; VI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; VI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C6]](s32) ; VI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p5) :: (load 1, addrspace 5) ; VI: [[GEP4:%[0-9]+]]:_(p5) = G_GEP [[GEP3]], [[C]](s32) ; VI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p5) :: (load 1, addrspace 5) @@ -7848,18 +8634,21 @@ body: | ; VI: [[AND4:%[0-9]+]]:_(s16) = G_AND [[TRUNC4]], [[C3]] ; VI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) ; VI: [[AND5:%[0-9]+]]:_(s16) = G_AND [[TRUNC5]], [[C3]] - ; VI: 
[[SHL2:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) - ; VI: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL2]] + ; VI: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) + ; VI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL3]] ; VI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; VI: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; VI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) ; VI: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C3]] - ; VI: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) - ; VI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; VI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) - ; VI: [[MV2:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[MV]](s32), [[MV1]](s32) - ; VI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 - ; VI: [[GEP7:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C6]](s32) + ; VI: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) + ; VI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL4]] + ; VI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; VI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; VI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C5]](s32) + ; VI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; VI: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR2]](s32), [[OR5]](s32) + ; VI: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 + ; VI: [[GEP7:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C7]](s32) ; VI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p5) :: (load 1, addrspace 5) ; VI: [[GEP8:%[0-9]+]]:_(p5) = G_GEP [[GEP7]], [[C]](s32) ; VI: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p5) :: (load 1, addrspace 5) @@ -7871,16 +8660,19 @@ body: | ; VI: [[AND8:%[0-9]+]]:_(s16) = G_AND [[TRUNC8]], [[C3]] ; VI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD9]](s32) ; VI: [[AND9:%[0-9]+]]:_(s16) = G_AND [[TRUNC9]], [[C3]] - ; VI: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C4]](s16) - ; VI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL4]] + ; VI: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C4]](s16) + ; VI: [[OR6:%[0-9]+]]:_(s16) 
= G_OR [[AND8]], [[SHL6]] ; VI: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; VI: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C3]] ; VI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD11]](s32) ; VI: [[AND11:%[0-9]+]]:_(s16) = G_AND [[TRUNC11]], [[C3]] - ; VI: [[SHL5:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C4]](s16) - ; VI: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL5]] - ; VI: [[MV3:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16) - ; VI: [[GEP11:%[0-9]+]]:_(p5) = G_GEP [[GEP7]], [[C5]](s32) + ; VI: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C4]](s16) + ; VI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL7]] + ; VI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; VI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; VI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C5]](s32) + ; VI: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL8]] + ; VI: [[GEP11:%[0-9]+]]:_(p5) = G_GEP [[GEP7]], [[C6]](s32) ; VI: [[LOAD12:%[0-9]+]]:_(s32) = G_LOAD [[GEP11]](p5) :: (load 1, addrspace 5) ; VI: [[GEP12:%[0-9]+]]:_(p5) = G_GEP [[GEP11]], [[C]](s32) ; VI: [[LOAD13:%[0-9]+]]:_(s32) = G_LOAD [[GEP12]](p5) :: (load 1, addrspace 5) @@ -7892,17 +8684,20 @@ body: | ; VI: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C3]] ; VI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD13]](s32) ; VI: [[AND13:%[0-9]+]]:_(s16) = G_AND [[TRUNC13]], [[C3]] - ; VI: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C4]](s16) - ; VI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL6]] + ; VI: [[SHL9:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C4]](s16) + ; VI: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL9]] ; VI: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; VI: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C3]] ; VI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD15]](s32) ; VI: [[AND15:%[0-9]+]]:_(s16) = G_AND [[TRUNC15]], [[C3]] - ; VI: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C4]](s16) - ; VI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL7]] - ; VI: [[MV4:%[0-9]+]]:_(s32) = 
G_MERGE_VALUES [[OR6]](s16), [[OR7]](s16) - ; VI: [[MV5:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[MV3]](s32), [[MV4]](s32) - ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s64>) = G_BUILD_VECTOR [[MV2]](s64), [[MV5]](s64) + ; VI: [[SHL10:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C4]](s16) + ; VI: [[OR10:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL10]] + ; VI: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; VI: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR10]](s16) + ; VI: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C5]](s32) + ; VI: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; VI: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR8]](s32), [[OR11]](s32) + ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s64>) = G_BUILD_VECTOR [[MV]](s64), [[MV1]](s64) ; VI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BUILD_VECTOR]](<2 x s64>) ; GFX9-LABEL: name: test_load_private_v2s64_align16 ; GFX9: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 @@ -7930,9 +8725,13 @@ body: | ; GFX9: [[AND3:%[0-9]+]]:_(s16) = G_AND [[TRUNC3]], [[C3]] ; GFX9: [[SHL1:%[0-9]+]]:_(s16) = G_SHL [[AND3]], [[C4]](s16) ; GFX9: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[SHL1]] - ; GFX9: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; GFX9: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; GFX9: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C5]](s32) + ; GFX9: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C5]](s32) + ; GFX9: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; GFX9: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; GFX9: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C6]](s32) ; GFX9: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p5) :: (load 1, addrspace 5) ; GFX9: [[GEP4:%[0-9]+]]:_(p5) = G_GEP [[GEP3]], [[C]](s32) ; GFX9: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p5) :: (load 1, addrspace 5) @@ -7944,18 +8743,21 @@ body: | ; GFX9: [[AND4:%[0-9]+]]:_(s16) = G_AND [[TRUNC4]], 
[[C3]] ; GFX9: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) ; GFX9: [[AND5:%[0-9]+]]:_(s16) = G_AND [[TRUNC5]], [[C3]] - ; GFX9: [[SHL2:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) - ; GFX9: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL2]] + ; GFX9: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) + ; GFX9: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL3]] ; GFX9: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; GFX9: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; GFX9: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) ; GFX9: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C3]] - ; GFX9: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) - ; GFX9: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; GFX9: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) - ; GFX9: [[MV2:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[MV]](s32), [[MV1]](s32) - ; GFX9: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 - ; GFX9: [[GEP7:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C6]](s32) + ; GFX9: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) + ; GFX9: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL4]] + ; GFX9: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; GFX9: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; GFX9: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C5]](s32) + ; GFX9: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; GFX9: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR2]](s32), [[OR5]](s32) + ; GFX9: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 + ; GFX9: [[GEP7:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C7]](s32) ; GFX9: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p5) :: (load 1, addrspace 5) ; GFX9: [[GEP8:%[0-9]+]]:_(p5) = G_GEP [[GEP7]], [[C]](s32) ; GFX9: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p5) :: (load 1, addrspace 5) @@ -7967,16 +8769,19 @@ body: | ; GFX9: [[AND8:%[0-9]+]]:_(s16) = G_AND [[TRUNC8]], [[C3]] ; GFX9: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD9]](s32) ; GFX9: [[AND9:%[0-9]+]]:_(s16) = G_AND [[TRUNC9]], [[C3]] - ; GFX9: 
[[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C4]](s16) - ; GFX9: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL4]] + ; GFX9: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C4]](s16) + ; GFX9: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL6]] ; GFX9: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; GFX9: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C3]] ; GFX9: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD11]](s32) ; GFX9: [[AND11:%[0-9]+]]:_(s16) = G_AND [[TRUNC11]], [[C3]] - ; GFX9: [[SHL5:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C4]](s16) - ; GFX9: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL5]] - ; GFX9: [[MV3:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16) - ; GFX9: [[GEP11:%[0-9]+]]:_(p5) = G_GEP [[GEP7]], [[C5]](s32) + ; GFX9: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C4]](s16) + ; GFX9: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL7]] + ; GFX9: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; GFX9: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; GFX9: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C5]](s32) + ; GFX9: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL8]] + ; GFX9: [[GEP11:%[0-9]+]]:_(p5) = G_GEP [[GEP7]], [[C6]](s32) ; GFX9: [[LOAD12:%[0-9]+]]:_(s32) = G_LOAD [[GEP11]](p5) :: (load 1, addrspace 5) ; GFX9: [[GEP12:%[0-9]+]]:_(p5) = G_GEP [[GEP11]], [[C]](s32) ; GFX9: [[LOAD13:%[0-9]+]]:_(s32) = G_LOAD [[GEP12]](p5) :: (load 1, addrspace 5) @@ -7988,17 +8793,20 @@ body: | ; GFX9: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C3]] ; GFX9: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD13]](s32) ; GFX9: [[AND13:%[0-9]+]]:_(s16) = G_AND [[TRUNC13]], [[C3]] - ; GFX9: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C4]](s16) - ; GFX9: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL6]] + ; GFX9: [[SHL9:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C4]](s16) + ; GFX9: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL9]] ; GFX9: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; GFX9: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C3]] ; GFX9: 
[[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD15]](s32) ; GFX9: [[AND15:%[0-9]+]]:_(s16) = G_AND [[TRUNC15]], [[C3]] - ; GFX9: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C4]](s16) - ; GFX9: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL7]] - ; GFX9: [[MV4:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR6]](s16), [[OR7]](s16) - ; GFX9: [[MV5:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[MV3]](s32), [[MV4]](s32) - ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s64>) = G_BUILD_VECTOR [[MV2]](s64), [[MV5]](s64) + ; GFX9: [[SHL10:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C4]](s16) + ; GFX9: [[OR10:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL10]] + ; GFX9: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; GFX9: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR10]](s16) + ; GFX9: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C5]](s32) + ; GFX9: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; GFX9: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR8]](s32), [[OR11]](s32) + ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s64>) = G_BUILD_VECTOR [[MV]](s64), [[MV1]](s64) ; GFX9: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BUILD_VECTOR]](<2 x s64>) %0:_(p5) = COPY $vgpr0 %1:_(<2 x s64>) = G_LOAD %0 :: (load 16, align 1, addrspace 5) @@ -8715,9 +9523,13 @@ body: | ; SI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) ; SI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[SHL1]](s32) ; SI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[TRUNC3]] - ; SI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; SI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; SI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C6]](s32) + ; SI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; SI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; SI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; SI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C6]](s32) + ; SI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; SI: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; SI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C7]](s32) ; SI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD 
[[GEP3]](p5) :: (load 1, addrspace 5) ; SI: [[GEP4:%[0-9]+]]:_(p5) = G_GEP [[GEP3]], [[C]](s32) ; SI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p5) :: (load 1, addrspace 5) @@ -8730,19 +9542,22 @@ body: | ; SI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; SI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) ; SI: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C5]] - ; SI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY4]](s32) - ; SI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL2]](s32) - ; SI: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] + ; SI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY4]](s32) + ; SI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) + ; SI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] ; SI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; SI: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; SI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; SI: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) ; SI: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY7]], [[C5]] - ; SI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY6]](s32) - ; SI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) - ; SI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; SI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) - ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[MV]](s32), [[MV1]](s32) + ; SI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY6]](s32) + ; SI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) + ; SI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] + ; SI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; SI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; SI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C6]](s32) + ; SI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[OR2]](s32), [[OR5]](s32) ; SI: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<2 x s32>) ; CI-LABEL: name: test_extload_private_v2s32_from_4_align1 ; CI: [[COPY:%[0-9]+]]:_(p5) = COPY 
$vgpr0 @@ -8774,9 +9589,13 @@ body: | ; CI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) ; CI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[SHL1]](s32) ; CI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[TRUNC3]] - ; CI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; CI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; CI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C6]](s32) + ; CI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C6]](s32) + ; CI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; CI: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; CI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C7]](s32) ; CI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p5) :: (load 1, addrspace 5) ; CI: [[GEP4:%[0-9]+]]:_(p5) = G_GEP [[GEP3]], [[C]](s32) ; CI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p5) :: (load 1, addrspace 5) @@ -8789,19 +9608,22 @@ body: | ; CI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) ; CI: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C5]] - ; CI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY4]](s32) - ; CI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL2]](s32) - ; CI: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] + ; CI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY4]](s32) + ; CI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) + ; CI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] ; CI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; CI: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; CI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) ; CI: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY7]], [[C5]] - ; CI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY6]](s32) - ; CI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) - ; CI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], 
[[TRUNC7]] - ; CI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) - ; CI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[MV]](s32), [[MV1]](s32) + ; CI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY6]](s32) + ; CI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) + ; CI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] + ; CI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; CI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C6]](s32) + ; CI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; CI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[OR2]](s32), [[OR5]](s32) ; CI: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<2 x s32>) ; VI-LABEL: name: test_extload_private_v2s32_from_4_align1 ; VI: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 @@ -8829,9 +9651,13 @@ body: | ; VI: [[AND3:%[0-9]+]]:_(s16) = G_AND [[TRUNC3]], [[C3]] ; VI: [[SHL1:%[0-9]+]]:_(s16) = G_SHL [[AND3]], [[C4]](s16) ; VI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[SHL1]] - ; VI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; VI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; VI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C5]](s32) + ; VI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; VI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; VI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C5]](s32) + ; VI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; VI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; VI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C6]](s32) ; VI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p5) :: (load 1, addrspace 5) ; VI: [[GEP4:%[0-9]+]]:_(p5) = G_GEP [[GEP3]], [[C]](s32) ; VI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p5) :: (load 1, addrspace 5) @@ -8843,16 +9669,19 @@ body: | ; VI: [[AND4:%[0-9]+]]:_(s16) = G_AND [[TRUNC4]], [[C3]] ; VI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) ; VI: [[AND5:%[0-9]+]]:_(s16) = 
G_AND [[TRUNC5]], [[C3]] - ; VI: [[SHL2:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) - ; VI: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL2]] + ; VI: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) + ; VI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL3]] ; VI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; VI: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; VI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) ; VI: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C3]] - ; VI: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) - ; VI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; VI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) - ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[MV]](s32), [[MV1]](s32) + ; VI: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) + ; VI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL4]] + ; VI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; VI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; VI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C5]](s32) + ; VI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[OR2]](s32), [[OR5]](s32) ; VI: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<2 x s32>) ; GFX9-LABEL: name: test_extload_private_v2s32_from_4_align1 ; GFX9: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 @@ -8880,9 +9709,13 @@ body: | ; GFX9: [[AND3:%[0-9]+]]:_(s16) = G_AND [[TRUNC3]], [[C3]] ; GFX9: [[SHL1:%[0-9]+]]:_(s16) = G_SHL [[AND3]], [[C4]](s16) ; GFX9: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[SHL1]] - ; GFX9: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; GFX9: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; GFX9: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C5]](s32) + ; GFX9: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], 
[[C5]](s32) + ; GFX9: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; GFX9: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; GFX9: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C6]](s32) ; GFX9: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p5) :: (load 1, addrspace 5) ; GFX9: [[GEP4:%[0-9]+]]:_(p5) = G_GEP [[GEP3]], [[C]](s32) ; GFX9: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p5) :: (load 1, addrspace 5) @@ -8894,16 +9727,19 @@ body: | ; GFX9: [[AND4:%[0-9]+]]:_(s16) = G_AND [[TRUNC4]], [[C3]] ; GFX9: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) ; GFX9: [[AND5:%[0-9]+]]:_(s16) = G_AND [[TRUNC5]], [[C3]] - ; GFX9: [[SHL2:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) - ; GFX9: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL2]] + ; GFX9: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) + ; GFX9: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL3]] ; GFX9: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; GFX9: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; GFX9: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) ; GFX9: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C3]] - ; GFX9: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) - ; GFX9: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; GFX9: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) - ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[MV]](s32), [[MV1]](s32) + ; GFX9: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) + ; GFX9: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL4]] + ; GFX9: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; GFX9: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; GFX9: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C5]](s32) + ; GFX9: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[OR2]](s32), [[OR5]](s32) ; GFX9: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<2 x s32>) %0:_(p5) = COPY $vgpr0 %1:_(<2 x s32>) = G_LOAD %0 :: (load 4, align 1, addrspace 5) @@ -8919,78 
+9755,110 @@ body: | ; SI-LABEL: name: test_extload_private_v2s32_from_4_align2 ; SI: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 ; SI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 2, addrspace 5) - ; SI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; SI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; SI: [[GEP:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C]](s32) ; SI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p5) :: (load 2, addrspace 5) - ; SI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; SI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; SI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; SI: [[GEP1:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C1]](s32) + ; SI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; SI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; SI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; SI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; SI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; SI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; SI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; SI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; SI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; SI: [[GEP1:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C3]](s32) ; SI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p5) :: (load 2, addrspace 5) - ; SI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; SI: [[GEP2:%[0-9]+]]:_(p5) = G_GEP [[GEP1]], [[C]](s32) ; SI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p5) :: (load 2, addrspace 5) - ; SI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; SI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC2]](s16), [[TRUNC3]](s16) - ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[MV]](s32), [[MV1]](s32) + ; SI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; SI: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C1]] + ; SI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; SI: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C1]] + ; SI: [[SHL1:%[0-9]+]]:_(s32) = 
G_SHL [[AND3]], [[C2]](s32) + ; SI: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[OR]](s32), [[OR1]](s32) ; SI: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<2 x s32>) ; CI-LABEL: name: test_extload_private_v2s32_from_4_align2 ; CI: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 ; CI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 2, addrspace 5) - ; CI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; CI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; CI: [[GEP:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C]](s32) ; CI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p5) :: (load 2, addrspace 5) - ; CI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; CI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; CI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; CI: [[GEP1:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C1]](s32) + ; CI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; CI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; CI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; CI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; CI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; CI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; CI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; CI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; CI: [[GEP1:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C3]](s32) ; CI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p5) :: (load 2, addrspace 5) - ; CI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; CI: [[GEP2:%[0-9]+]]:_(p5) = G_GEP [[GEP1]], [[C]](s32) ; CI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p5) :: (load 2, addrspace 5) - ; CI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; CI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC2]](s16), [[TRUNC3]](s16) - ; CI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[MV]](s32), [[MV1]](s32) + ; CI: [[COPY3:%[0-9]+]]:_(s32) = COPY 
[[LOAD2]](s32) + ; CI: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C1]] + ; CI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; CI: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C1]] + ; CI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C2]](s32) + ; CI: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; CI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[OR]](s32), [[OR1]](s32) ; CI: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<2 x s32>) ; VI-LABEL: name: test_extload_private_v2s32_from_4_align2 ; VI: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 2, addrspace 5) - ; VI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; VI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; VI: [[GEP:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C]](s32) ; VI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p5) :: (load 2, addrspace 5) - ; VI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; VI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; VI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; VI: [[GEP1:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C1]](s32) + ; VI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; VI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; VI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; VI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; VI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; VI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; VI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; VI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; VI: [[GEP1:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C3]](s32) ; VI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p5) :: (load 2, addrspace 5) - ; VI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; VI: [[GEP2:%[0-9]+]]:_(p5) = G_GEP [[GEP1]], [[C]](s32) ; VI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p5) :: (load 2, addrspace 5) - ; VI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; 
VI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC2]](s16), [[TRUNC3]](s16) - ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[MV]](s32), [[MV1]](s32) + ; VI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; VI: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C1]] + ; VI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; VI: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C1]] + ; VI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C2]](s32) + ; VI: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[OR]](s32), [[OR1]](s32) ; VI: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<2 x s32>) ; GFX9-LABEL: name: test_extload_private_v2s32_from_4_align2 ; GFX9: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 ; GFX9: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 2, addrspace 5) - ; GFX9: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; GFX9: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; GFX9: [[GEP:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C]](s32) ; GFX9: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p5) :: (load 2, addrspace 5) - ; GFX9: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; GFX9: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; GFX9: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; GFX9: [[GEP1:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C1]](s32) + ; GFX9: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; GFX9: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; GFX9: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; GFX9: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; GFX9: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; GFX9: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; GFX9: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; GFX9: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; GFX9: [[GEP1:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C3]](s32) ; GFX9: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p5) :: (load 2, addrspace 5) - ; 
GFX9: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; GFX9: [[GEP2:%[0-9]+]]:_(p5) = G_GEP [[GEP1]], [[C]](s32) ; GFX9: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p5) :: (load 2, addrspace 5) - ; GFX9: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; GFX9: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC2]](s16), [[TRUNC3]](s16) - ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[MV]](s32), [[MV1]](s32) + ; GFX9: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; GFX9: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C1]] + ; GFX9: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; GFX9: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C1]] + ; GFX9: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C2]](s32) + ; GFX9: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[OR]](s32), [[OR1]](s32) ; GFX9: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<2 x s32>) %0:_(p5) = COPY $vgpr0 %1:_(<2 x s32>) = G_LOAD %0 :: (load 4, align 2, addrspace 5) @@ -9203,9 +10071,13 @@ body: | ; SI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[COPY3]](s32) ; SI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[SHL1]](s32) ; SI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[TRUNC3]] - ; SI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; SI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; SI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C6]](s32) + ; SI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; SI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; SI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; SI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C6]](s32) + ; SI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; SI: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; SI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C7]](s32) ; SI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p5) :: (load 1, addrspace 5) ; SI: [[GEP4:%[0-9]+]]:_(p5) = G_GEP [[GEP3]], [[C]](s32) ; SI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p5) :: (load 
1, addrspace 5) @@ -9218,18 +10090,21 @@ body: | ; SI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; SI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) ; SI: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C5]] - ; SI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY5]](s32) - ; SI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL2]](s32) - ; SI: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] + ; SI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY5]](s32) + ; SI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) + ; SI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] ; SI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; SI: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; SI: [[COPY7:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; SI: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) ; SI: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY8]], [[C5]] - ; SI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY7]](s32) - ; SI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) - ; SI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; SI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) + ; SI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY7]](s32) + ; SI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) + ; SI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] + ; SI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; SI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; SI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C6]](s32) + ; SI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] ; SI: [[GEP7:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C4]](s32) ; SI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p5) :: (load 1, addrspace 5) ; SI: [[GEP8:%[0-9]+]]:_(p5) = G_GEP [[GEP7]], [[C]](s32) @@ -9243,21 +10118,24 @@ body: | ; SI: [[COPY9:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; SI: [[COPY10:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32) ; SI: [[AND9:%[0-9]+]]:_(s32) = G_AND [[COPY10]], [[C5]] - ; SI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY9]](s32) - ; SI: 
[[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) - ; SI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] + ; SI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY9]](s32) + ; SI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) + ; SI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] ; SI: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; SI: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C3]] ; SI: [[COPY11:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; SI: [[COPY12:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32) ; SI: [[AND11:%[0-9]+]]:_(s32) = G_AND [[COPY12]], [[C5]] - ; SI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY11]](s32) - ; SI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL5]](s32) - ; SI: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] - ; SI: [[MV2:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16) - ; SI: [[MV3:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[MV]](s32), [[MV1]](s32), [[MV2]](s32) - ; SI: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 - ; SI: [[GEP11:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C7]](s32) + ; SI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY11]](s32) + ; SI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) + ; SI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] + ; SI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; SI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; SI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C6]](s32) + ; SI: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL8]] + ; SI: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR2]](s32), [[OR5]](s32), [[OR8]](s32) + ; SI: [[C8:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 + ; SI: [[GEP11:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C8]](s32) ; SI: [[LOAD12:%[0-9]+]]:_(s32) = G_LOAD [[GEP11]](p5) :: (load 1, addrspace 5) ; SI: [[GEP12:%[0-9]+]]:_(p5) = G_GEP [[GEP11]], [[C]](s32) ; SI: [[LOAD13:%[0-9]+]]:_(s32) = G_LOAD [[GEP12]](p5) :: (load 1, addrspace 5) @@ -9270,19 +10148,22 @@ body: | ; SI: [[COPY13:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; SI: [[COPY14:%[0-9]+]]:_(s32) = COPY 
[[LOAD13]](s32) ; SI: [[AND13:%[0-9]+]]:_(s32) = G_AND [[COPY14]], [[C5]] - ; SI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY13]](s32) - ; SI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) - ; SI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] + ; SI: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY13]](s32) + ; SI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL9]](s32) + ; SI: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] ; SI: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; SI: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C3]] ; SI: [[COPY15:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; SI: [[COPY16:%[0-9]+]]:_(s32) = COPY [[LOAD15]](s32) ; SI: [[AND15:%[0-9]+]]:_(s32) = G_AND [[COPY16]], [[C5]] - ; SI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY15]](s32) - ; SI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) - ; SI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] - ; SI: [[MV4:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR6]](s16), [[OR7]](s16) - ; SI: [[GEP15:%[0-9]+]]:_(p5) = G_GEP [[GEP11]], [[C6]](s32) + ; SI: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY15]](s32) + ; SI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL10]](s32) + ; SI: [[OR10:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] + ; SI: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; SI: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR10]](s16) + ; SI: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C6]](s32) + ; SI: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; SI: [[GEP15:%[0-9]+]]:_(p5) = G_GEP [[GEP11]], [[C7]](s32) ; SI: [[LOAD16:%[0-9]+]]:_(s32) = G_LOAD [[GEP15]](p5) :: (load 1, addrspace 5) ; SI: [[GEP16:%[0-9]+]]:_(p5) = G_GEP [[GEP15]], [[C]](s32) ; SI: [[LOAD17:%[0-9]+]]:_(s32) = G_LOAD [[GEP16]](p5) :: (load 1, addrspace 5) @@ -9295,18 +10176,21 @@ body: | ; SI: [[COPY17:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; SI: [[COPY18:%[0-9]+]]:_(s32) = COPY [[LOAD17]](s32) ; SI: [[AND17:%[0-9]+]]:_(s32) = G_AND [[COPY18]], [[C5]] - ; SI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL 
[[AND17]], [[COPY17]](s32) - ; SI: [[TRUNC17:%[0-9]+]]:_(s16) = G_TRUNC [[SHL8]](s32) - ; SI: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[TRUNC17]] + ; SI: [[SHL12:%[0-9]+]]:_(s32) = G_SHL [[AND17]], [[COPY17]](s32) + ; SI: [[TRUNC17:%[0-9]+]]:_(s16) = G_TRUNC [[SHL12]](s32) + ; SI: [[OR12:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[TRUNC17]] ; SI: [[TRUNC18:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD18]](s32) ; SI: [[AND18:%[0-9]+]]:_(s16) = G_AND [[TRUNC18]], [[C3]] ; SI: [[COPY19:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; SI: [[COPY20:%[0-9]+]]:_(s32) = COPY [[LOAD19]](s32) ; SI: [[AND19:%[0-9]+]]:_(s32) = G_AND [[COPY20]], [[C5]] - ; SI: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[AND19]], [[COPY19]](s32) - ; SI: [[TRUNC19:%[0-9]+]]:_(s16) = G_TRUNC [[SHL9]](s32) - ; SI: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[TRUNC19]] - ; SI: [[MV5:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR8]](s16), [[OR9]](s16) + ; SI: [[SHL13:%[0-9]+]]:_(s32) = G_SHL [[AND19]], [[COPY19]](s32) + ; SI: [[TRUNC19:%[0-9]+]]:_(s16) = G_TRUNC [[SHL13]](s32) + ; SI: [[OR13:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[TRUNC19]] + ; SI: [[ZEXT8:%[0-9]+]]:_(s32) = G_ZEXT [[OR12]](s16) + ; SI: [[ZEXT9:%[0-9]+]]:_(s32) = G_ZEXT [[OR13]](s16) + ; SI: [[SHL14:%[0-9]+]]:_(s32) = G_SHL [[ZEXT9]], [[C6]](s32) + ; SI: [[OR14:%[0-9]+]]:_(s32) = G_OR [[ZEXT8]], [[SHL14]] ; SI: [[GEP19:%[0-9]+]]:_(p5) = G_GEP [[GEP11]], [[C4]](s32) ; SI: [[LOAD20:%[0-9]+]]:_(s32) = G_LOAD [[GEP19]](p5) :: (load 1, addrspace 5) ; SI: [[GEP20:%[0-9]+]]:_(p5) = G_GEP [[GEP19]], [[C]](s32) @@ -9320,21 +10204,24 @@ body: | ; SI: [[COPY21:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; SI: [[COPY22:%[0-9]+]]:_(s32) = COPY [[LOAD21]](s32) ; SI: [[AND21:%[0-9]+]]:_(s32) = G_AND [[COPY22]], [[C5]] - ; SI: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[AND21]], [[COPY21]](s32) - ; SI: [[TRUNC21:%[0-9]+]]:_(s16) = G_TRUNC [[SHL10]](s32) - ; SI: [[OR10:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[TRUNC21]] + ; SI: [[SHL15:%[0-9]+]]:_(s32) = G_SHL [[AND21]], [[COPY21]](s32) + ; SI: [[TRUNC21:%[0-9]+]]:_(s16) = 
G_TRUNC [[SHL15]](s32) + ; SI: [[OR15:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[TRUNC21]] ; SI: [[TRUNC22:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD22]](s32) ; SI: [[AND22:%[0-9]+]]:_(s16) = G_AND [[TRUNC22]], [[C3]] ; SI: [[COPY23:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; SI: [[COPY24:%[0-9]+]]:_(s32) = COPY [[LOAD23]](s32) ; SI: [[AND23:%[0-9]+]]:_(s32) = G_AND [[COPY24]], [[C5]] - ; SI: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[AND23]], [[COPY23]](s32) - ; SI: [[TRUNC23:%[0-9]+]]:_(s16) = G_TRUNC [[SHL11]](s32) - ; SI: [[OR11:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[TRUNC23]] - ; SI: [[MV6:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR10]](s16), [[OR11]](s16) - ; SI: [[MV7:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[MV4]](s32), [[MV5]](s32), [[MV6]](s32) - ; SI: [[COPY25:%[0-9]+]]:_(s96) = COPY [[MV3]](s96) - ; SI: [[COPY26:%[0-9]+]]:_(s96) = COPY [[MV7]](s96) + ; SI: [[SHL16:%[0-9]+]]:_(s32) = G_SHL [[AND23]], [[COPY23]](s32) + ; SI: [[TRUNC23:%[0-9]+]]:_(s16) = G_TRUNC [[SHL16]](s32) + ; SI: [[OR16:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[TRUNC23]] + ; SI: [[ZEXT10:%[0-9]+]]:_(s32) = G_ZEXT [[OR15]](s16) + ; SI: [[ZEXT11:%[0-9]+]]:_(s32) = G_ZEXT [[OR16]](s16) + ; SI: [[SHL17:%[0-9]+]]:_(s32) = G_SHL [[ZEXT11]], [[C6]](s32) + ; SI: [[OR17:%[0-9]+]]:_(s32) = G_OR [[ZEXT10]], [[SHL17]] + ; SI: [[MV1:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR11]](s32), [[OR14]](s32), [[OR17]](s32) + ; SI: [[COPY25:%[0-9]+]]:_(s96) = COPY [[MV]](s96) + ; SI: [[COPY26:%[0-9]+]]:_(s96) = COPY [[MV1]](s96) ; SI: $vgpr0_vgpr1_vgpr2 = COPY [[COPY25]](s96) ; SI: $vgpr3_vgpr4_vgpr5 = COPY [[COPY26]](s96) ; CI-LABEL: name: test_extload_private_v2s96_from_24_align1 @@ -9368,9 +10255,13 @@ body: | ; CI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[COPY3]](s32) ; CI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[SHL1]](s32) ; CI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[TRUNC3]] - ; CI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; CI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; CI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], 
[[C6]](s32) + ; CI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C6]](s32) + ; CI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; CI: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; CI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C7]](s32) ; CI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p5) :: (load 1, addrspace 5) ; CI: [[GEP4:%[0-9]+]]:_(p5) = G_GEP [[GEP3]], [[C]](s32) ; CI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p5) :: (load 1, addrspace 5) @@ -9383,18 +10274,21 @@ body: | ; CI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) ; CI: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C5]] - ; CI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY5]](s32) - ; CI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL2]](s32) - ; CI: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] + ; CI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY5]](s32) + ; CI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) + ; CI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] ; CI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; CI: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; CI: [[COPY7:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) ; CI: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY8]], [[C5]] - ; CI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY7]](s32) - ; CI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) - ; CI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; CI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) + ; CI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[COPY7]](s32) + ; CI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) + ; CI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] + ; CI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; CI: 
[[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C6]](s32) + ; CI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] ; CI: [[GEP7:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C4]](s32) ; CI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p5) :: (load 1, addrspace 5) ; CI: [[GEP8:%[0-9]+]]:_(p5) = G_GEP [[GEP7]], [[C]](s32) @@ -9408,21 +10302,24 @@ body: | ; CI: [[COPY9:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI: [[COPY10:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32) ; CI: [[AND9:%[0-9]+]]:_(s32) = G_AND [[COPY10]], [[C5]] - ; CI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY9]](s32) - ; CI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL4]](s32) - ; CI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] + ; CI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[COPY9]](s32) + ; CI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) + ; CI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[TRUNC9]] ; CI: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; CI: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C3]] ; CI: [[COPY11:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI: [[COPY12:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32) ; CI: [[AND11:%[0-9]+]]:_(s32) = G_AND [[COPY12]], [[C5]] - ; CI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY11]](s32) - ; CI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL5]](s32) - ; CI: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] - ; CI: [[MV2:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16) - ; CI: [[MV3:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[MV]](s32), [[MV1]](s32), [[MV2]](s32) - ; CI: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 - ; CI: [[GEP11:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C7]](s32) + ; CI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY11]](s32) + ; CI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) + ; CI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] + ; CI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; CI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; CI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C6]](s32) + ; CI: 
[[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL8]] + ; CI: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR2]](s32), [[OR5]](s32), [[OR8]](s32) + ; CI: [[C8:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 + ; CI: [[GEP11:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C8]](s32) ; CI: [[LOAD12:%[0-9]+]]:_(s32) = G_LOAD [[GEP11]](p5) :: (load 1, addrspace 5) ; CI: [[GEP12:%[0-9]+]]:_(p5) = G_GEP [[GEP11]], [[C]](s32) ; CI: [[LOAD13:%[0-9]+]]:_(s32) = G_LOAD [[GEP12]](p5) :: (load 1, addrspace 5) @@ -9435,19 +10332,22 @@ body: | ; CI: [[COPY13:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI: [[COPY14:%[0-9]+]]:_(s32) = COPY [[LOAD13]](s32) ; CI: [[AND13:%[0-9]+]]:_(s32) = G_AND [[COPY14]], [[C5]] - ; CI: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY13]](s32) - ; CI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL6]](s32) - ; CI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] + ; CI: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[AND13]], [[COPY13]](s32) + ; CI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[SHL9]](s32) + ; CI: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[TRUNC13]] ; CI: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; CI: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C3]] ; CI: [[COPY15:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI: [[COPY16:%[0-9]+]]:_(s32) = COPY [[LOAD15]](s32) ; CI: [[AND15:%[0-9]+]]:_(s32) = G_AND [[COPY16]], [[C5]] - ; CI: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY15]](s32) - ; CI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL7]](s32) - ; CI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] - ; CI: [[MV4:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR6]](s16), [[OR7]](s16) - ; CI: [[GEP15:%[0-9]+]]:_(p5) = G_GEP [[GEP11]], [[C6]](s32) + ; CI: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[AND15]], [[COPY15]](s32) + ; CI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[SHL10]](s32) + ; CI: [[OR10:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[TRUNC15]] + ; CI: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; CI: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR10]](s16) + ; CI: [[SHL11:%[0-9]+]]:_(s32) = G_SHL 
[[ZEXT7]], [[C6]](s32) + ; CI: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; CI: [[GEP15:%[0-9]+]]:_(p5) = G_GEP [[GEP11]], [[C7]](s32) ; CI: [[LOAD16:%[0-9]+]]:_(s32) = G_LOAD [[GEP15]](p5) :: (load 1, addrspace 5) ; CI: [[GEP16:%[0-9]+]]:_(p5) = G_GEP [[GEP15]], [[C]](s32) ; CI: [[LOAD17:%[0-9]+]]:_(s32) = G_LOAD [[GEP16]](p5) :: (load 1, addrspace 5) @@ -9460,18 +10360,21 @@ body: | ; CI: [[COPY17:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI: [[COPY18:%[0-9]+]]:_(s32) = COPY [[LOAD17]](s32) ; CI: [[AND17:%[0-9]+]]:_(s32) = G_AND [[COPY18]], [[C5]] - ; CI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[AND17]], [[COPY17]](s32) - ; CI: [[TRUNC17:%[0-9]+]]:_(s16) = G_TRUNC [[SHL8]](s32) - ; CI: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[TRUNC17]] + ; CI: [[SHL12:%[0-9]+]]:_(s32) = G_SHL [[AND17]], [[COPY17]](s32) + ; CI: [[TRUNC17:%[0-9]+]]:_(s16) = G_TRUNC [[SHL12]](s32) + ; CI: [[OR12:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[TRUNC17]] ; CI: [[TRUNC18:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD18]](s32) ; CI: [[AND18:%[0-9]+]]:_(s16) = G_AND [[TRUNC18]], [[C3]] ; CI: [[COPY19:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI: [[COPY20:%[0-9]+]]:_(s32) = COPY [[LOAD19]](s32) ; CI: [[AND19:%[0-9]+]]:_(s32) = G_AND [[COPY20]], [[C5]] - ; CI: [[SHL9:%[0-9]+]]:_(s32) = G_SHL [[AND19]], [[COPY19]](s32) - ; CI: [[TRUNC19:%[0-9]+]]:_(s16) = G_TRUNC [[SHL9]](s32) - ; CI: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[TRUNC19]] - ; CI: [[MV5:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR8]](s16), [[OR9]](s16) + ; CI: [[SHL13:%[0-9]+]]:_(s32) = G_SHL [[AND19]], [[COPY19]](s32) + ; CI: [[TRUNC19:%[0-9]+]]:_(s16) = G_TRUNC [[SHL13]](s32) + ; CI: [[OR13:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[TRUNC19]] + ; CI: [[ZEXT8:%[0-9]+]]:_(s32) = G_ZEXT [[OR12]](s16) + ; CI: [[ZEXT9:%[0-9]+]]:_(s32) = G_ZEXT [[OR13]](s16) + ; CI: [[SHL14:%[0-9]+]]:_(s32) = G_SHL [[ZEXT9]], [[C6]](s32) + ; CI: [[OR14:%[0-9]+]]:_(s32) = G_OR [[ZEXT8]], [[SHL14]] ; CI: [[GEP19:%[0-9]+]]:_(p5) = G_GEP [[GEP11]], [[C4]](s32) ; CI: 
[[LOAD20:%[0-9]+]]:_(s32) = G_LOAD [[GEP19]](p5) :: (load 1, addrspace 5) ; CI: [[GEP20:%[0-9]+]]:_(p5) = G_GEP [[GEP19]], [[C]](s32) @@ -9485,21 +10388,24 @@ body: | ; CI: [[COPY21:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI: [[COPY22:%[0-9]+]]:_(s32) = COPY [[LOAD21]](s32) ; CI: [[AND21:%[0-9]+]]:_(s32) = G_AND [[COPY22]], [[C5]] - ; CI: [[SHL10:%[0-9]+]]:_(s32) = G_SHL [[AND21]], [[COPY21]](s32) - ; CI: [[TRUNC21:%[0-9]+]]:_(s16) = G_TRUNC [[SHL10]](s32) - ; CI: [[OR10:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[TRUNC21]] + ; CI: [[SHL15:%[0-9]+]]:_(s32) = G_SHL [[AND21]], [[COPY21]](s32) + ; CI: [[TRUNC21:%[0-9]+]]:_(s16) = G_TRUNC [[SHL15]](s32) + ; CI: [[OR15:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[TRUNC21]] ; CI: [[TRUNC22:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD22]](s32) ; CI: [[AND22:%[0-9]+]]:_(s16) = G_AND [[TRUNC22]], [[C3]] ; CI: [[COPY23:%[0-9]+]]:_(s32) = COPY [[C4]](s32) ; CI: [[COPY24:%[0-9]+]]:_(s32) = COPY [[LOAD23]](s32) ; CI: [[AND23:%[0-9]+]]:_(s32) = G_AND [[COPY24]], [[C5]] - ; CI: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[AND23]], [[COPY23]](s32) - ; CI: [[TRUNC23:%[0-9]+]]:_(s16) = G_TRUNC [[SHL11]](s32) - ; CI: [[OR11:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[TRUNC23]] - ; CI: [[MV6:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR10]](s16), [[OR11]](s16) - ; CI: [[MV7:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[MV4]](s32), [[MV5]](s32), [[MV6]](s32) - ; CI: [[COPY25:%[0-9]+]]:_(s96) = COPY [[MV3]](s96) - ; CI: [[COPY26:%[0-9]+]]:_(s96) = COPY [[MV7]](s96) + ; CI: [[SHL16:%[0-9]+]]:_(s32) = G_SHL [[AND23]], [[COPY23]](s32) + ; CI: [[TRUNC23:%[0-9]+]]:_(s16) = G_TRUNC [[SHL16]](s32) + ; CI: [[OR16:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[TRUNC23]] + ; CI: [[ZEXT10:%[0-9]+]]:_(s32) = G_ZEXT [[OR15]](s16) + ; CI: [[ZEXT11:%[0-9]+]]:_(s32) = G_ZEXT [[OR16]](s16) + ; CI: [[SHL17:%[0-9]+]]:_(s32) = G_SHL [[ZEXT11]], [[C6]](s32) + ; CI: [[OR17:%[0-9]+]]:_(s32) = G_OR [[ZEXT10]], [[SHL17]] + ; CI: [[MV1:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR11]](s32), [[OR14]](s32), [[OR17]](s32) + ; CI: 
[[COPY25:%[0-9]+]]:_(s96) = COPY [[MV]](s96) + ; CI: [[COPY26:%[0-9]+]]:_(s96) = COPY [[MV1]](s96) ; CI: $vgpr0_vgpr1_vgpr2 = COPY [[COPY25]](s96) ; CI: $vgpr3_vgpr4_vgpr5 = COPY [[COPY26]](s96) ; VI-LABEL: name: test_extload_private_v2s96_from_24_align1 @@ -9528,9 +10434,13 @@ body: | ; VI: [[AND3:%[0-9]+]]:_(s16) = G_AND [[TRUNC3]], [[C3]] ; VI: [[SHL1:%[0-9]+]]:_(s16) = G_SHL [[AND3]], [[C4]](s16) ; VI: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[SHL1]] - ; VI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; VI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; VI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C5]](s32) + ; VI: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; VI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; VI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C5]](s32) + ; VI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; VI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; VI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C6]](s32) ; VI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p5) :: (load 1, addrspace 5) ; VI: [[GEP4:%[0-9]+]]:_(p5) = G_GEP [[GEP3]], [[C]](s32) ; VI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p5) :: (load 1, addrspace 5) @@ -9542,17 +10452,20 @@ body: | ; VI: [[AND4:%[0-9]+]]:_(s16) = G_AND [[TRUNC4]], [[C3]] ; VI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) ; VI: [[AND5:%[0-9]+]]:_(s16) = G_AND [[TRUNC5]], [[C3]] - ; VI: [[SHL2:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) - ; VI: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL2]] + ; VI: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) + ; VI: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL3]] ; VI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; VI: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; VI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) ; VI: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C3]] - ; VI: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) - ; VI: 
[[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; VI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) - ; VI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 - ; VI: [[GEP7:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C6]](s32) + ; VI: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) + ; VI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL4]] + ; VI: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; VI: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; VI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C5]](s32) + ; VI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; VI: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 + ; VI: [[GEP7:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C7]](s32) ; VI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p5) :: (load 1, addrspace 5) ; VI: [[GEP8:%[0-9]+]]:_(p5) = G_GEP [[GEP7]], [[C]](s32) ; VI: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p5) :: (load 1, addrspace 5) @@ -9564,18 +10477,21 @@ body: | ; VI: [[AND8:%[0-9]+]]:_(s16) = G_AND [[TRUNC8]], [[C3]] ; VI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD9]](s32) ; VI: [[AND9:%[0-9]+]]:_(s16) = G_AND [[TRUNC9]], [[C3]] - ; VI: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C4]](s16) - ; VI: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL4]] + ; VI: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C4]](s16) + ; VI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL6]] ; VI: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; VI: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C3]] ; VI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD11]](s32) ; VI: [[AND11:%[0-9]+]]:_(s16) = G_AND [[TRUNC11]], [[C3]] - ; VI: [[SHL5:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C4]](s16) - ; VI: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL5]] - ; VI: [[MV2:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16) - ; VI: [[MV3:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[MV]](s32), [[MV1]](s32), [[MV2]](s32) - ; VI: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 - ; VI: [[GEP11:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C7]](s32) + ; VI: 
[[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C4]](s16) + ; VI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL7]] + ; VI: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; VI: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; VI: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C5]](s32) + ; VI: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL8]] + ; VI: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR2]](s32), [[OR5]](s32), [[OR8]](s32) + ; VI: [[C8:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 + ; VI: [[GEP11:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C8]](s32) ; VI: [[LOAD12:%[0-9]+]]:_(s32) = G_LOAD [[GEP11]](p5) :: (load 1, addrspace 5) ; VI: [[GEP12:%[0-9]+]]:_(p5) = G_GEP [[GEP11]], [[C]](s32) ; VI: [[LOAD13:%[0-9]+]]:_(s32) = G_LOAD [[GEP12]](p5) :: (load 1, addrspace 5) @@ -9587,16 +10503,19 @@ body: | ; VI: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C3]] ; VI: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD13]](s32) ; VI: [[AND13:%[0-9]+]]:_(s16) = G_AND [[TRUNC13]], [[C3]] - ; VI: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C4]](s16) - ; VI: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL6]] + ; VI: [[SHL9:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C4]](s16) + ; VI: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL9]] ; VI: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; VI: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C3]] ; VI: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD15]](s32) ; VI: [[AND15:%[0-9]+]]:_(s16) = G_AND [[TRUNC15]], [[C3]] - ; VI: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C4]](s16) - ; VI: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL7]] - ; VI: [[MV4:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR6]](s16), [[OR7]](s16) - ; VI: [[GEP15:%[0-9]+]]:_(p5) = G_GEP [[GEP11]], [[C5]](s32) + ; VI: [[SHL10:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C4]](s16) + ; VI: [[OR10:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL10]] + ; VI: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; VI: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR10]](s16) + ; VI: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], 
[[C5]](s32) + ; VI: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; VI: [[GEP15:%[0-9]+]]:_(p5) = G_GEP [[GEP11]], [[C6]](s32) ; VI: [[LOAD16:%[0-9]+]]:_(s32) = G_LOAD [[GEP15]](p5) :: (load 1, addrspace 5) ; VI: [[GEP16:%[0-9]+]]:_(p5) = G_GEP [[GEP15]], [[C]](s32) ; VI: [[LOAD17:%[0-9]+]]:_(s32) = G_LOAD [[GEP16]](p5) :: (load 1, addrspace 5) @@ -9608,16 +10527,19 @@ body: | ; VI: [[AND16:%[0-9]+]]:_(s16) = G_AND [[TRUNC16]], [[C3]] ; VI: [[TRUNC17:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD17]](s32) ; VI: [[AND17:%[0-9]+]]:_(s16) = G_AND [[TRUNC17]], [[C3]] - ; VI: [[SHL8:%[0-9]+]]:_(s16) = G_SHL [[AND17]], [[C4]](s16) - ; VI: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[SHL8]] + ; VI: [[SHL12:%[0-9]+]]:_(s16) = G_SHL [[AND17]], [[C4]](s16) + ; VI: [[OR12:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[SHL12]] ; VI: [[TRUNC18:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD18]](s32) ; VI: [[AND18:%[0-9]+]]:_(s16) = G_AND [[TRUNC18]], [[C3]] ; VI: [[TRUNC19:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD19]](s32) ; VI: [[AND19:%[0-9]+]]:_(s16) = G_AND [[TRUNC19]], [[C3]] - ; VI: [[SHL9:%[0-9]+]]:_(s16) = G_SHL [[AND19]], [[C4]](s16) - ; VI: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[SHL9]] - ; VI: [[MV5:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR8]](s16), [[OR9]](s16) - ; VI: [[GEP19:%[0-9]+]]:_(p5) = G_GEP [[GEP11]], [[C6]](s32) + ; VI: [[SHL13:%[0-9]+]]:_(s16) = G_SHL [[AND19]], [[C4]](s16) + ; VI: [[OR13:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[SHL13]] + ; VI: [[ZEXT8:%[0-9]+]]:_(s32) = G_ZEXT [[OR12]](s16) + ; VI: [[ZEXT9:%[0-9]+]]:_(s32) = G_ZEXT [[OR13]](s16) + ; VI: [[SHL14:%[0-9]+]]:_(s32) = G_SHL [[ZEXT9]], [[C5]](s32) + ; VI: [[OR14:%[0-9]+]]:_(s32) = G_OR [[ZEXT8]], [[SHL14]] + ; VI: [[GEP19:%[0-9]+]]:_(p5) = G_GEP [[GEP11]], [[C7]](s32) ; VI: [[LOAD20:%[0-9]+]]:_(s32) = G_LOAD [[GEP19]](p5) :: (load 1, addrspace 5) ; VI: [[GEP20:%[0-9]+]]:_(p5) = G_GEP [[GEP19]], [[C]](s32) ; VI: [[LOAD21:%[0-9]+]]:_(s32) = G_LOAD [[GEP20]](p5) :: (load 1, addrspace 5) @@ -9629,18 +10551,21 @@ body: | ; VI: 
[[AND20:%[0-9]+]]:_(s16) = G_AND [[TRUNC20]], [[C3]] ; VI: [[TRUNC21:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD21]](s32) ; VI: [[AND21:%[0-9]+]]:_(s16) = G_AND [[TRUNC21]], [[C3]] - ; VI: [[SHL10:%[0-9]+]]:_(s16) = G_SHL [[AND21]], [[C4]](s16) - ; VI: [[OR10:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[SHL10]] + ; VI: [[SHL15:%[0-9]+]]:_(s16) = G_SHL [[AND21]], [[C4]](s16) + ; VI: [[OR15:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[SHL15]] ; VI: [[TRUNC22:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD22]](s32) ; VI: [[AND22:%[0-9]+]]:_(s16) = G_AND [[TRUNC22]], [[C3]] ; VI: [[TRUNC23:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD23]](s32) ; VI: [[AND23:%[0-9]+]]:_(s16) = G_AND [[TRUNC23]], [[C3]] - ; VI: [[SHL11:%[0-9]+]]:_(s16) = G_SHL [[AND23]], [[C4]](s16) - ; VI: [[OR11:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[SHL11]] - ; VI: [[MV6:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR10]](s16), [[OR11]](s16) - ; VI: [[MV7:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[MV4]](s32), [[MV5]](s32), [[MV6]](s32) - ; VI: [[COPY1:%[0-9]+]]:_(s96) = COPY [[MV3]](s96) - ; VI: [[COPY2:%[0-9]+]]:_(s96) = COPY [[MV7]](s96) + ; VI: [[SHL16:%[0-9]+]]:_(s16) = G_SHL [[AND23]], [[C4]](s16) + ; VI: [[OR16:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[SHL16]] + ; VI: [[ZEXT10:%[0-9]+]]:_(s32) = G_ZEXT [[OR15]](s16) + ; VI: [[ZEXT11:%[0-9]+]]:_(s32) = G_ZEXT [[OR16]](s16) + ; VI: [[SHL17:%[0-9]+]]:_(s32) = G_SHL [[ZEXT11]], [[C5]](s32) + ; VI: [[OR17:%[0-9]+]]:_(s32) = G_OR [[ZEXT10]], [[SHL17]] + ; VI: [[MV1:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR11]](s32), [[OR14]](s32), [[OR17]](s32) + ; VI: [[COPY1:%[0-9]+]]:_(s96) = COPY [[MV]](s96) + ; VI: [[COPY2:%[0-9]+]]:_(s96) = COPY [[MV1]](s96) ; VI: $vgpr0_vgpr1_vgpr2 = COPY [[COPY1]](s96) ; VI: $vgpr3_vgpr4_vgpr5 = COPY [[COPY2]](s96) ; GFX9-LABEL: name: test_extload_private_v2s96_from_24_align1 @@ -9669,9 +10594,13 @@ body: | ; GFX9: [[AND3:%[0-9]+]]:_(s16) = G_AND [[TRUNC3]], [[C3]] ; GFX9: [[SHL1:%[0-9]+]]:_(s16) = G_SHL [[AND3]], [[C4]](s16) ; GFX9: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[SHL1]] - ; GFX9: 
[[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; GFX9: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; GFX9: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C5]](s32) + ; GFX9: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; GFX9: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; GFX9: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C5]](s32) + ; GFX9: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; GFX9: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; GFX9: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C6]](s32) ; GFX9: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p5) :: (load 1, addrspace 5) ; GFX9: [[GEP4:%[0-9]+]]:_(p5) = G_GEP [[GEP3]], [[C]](s32) ; GFX9: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p5) :: (load 1, addrspace 5) @@ -9683,17 +10612,20 @@ body: | ; GFX9: [[AND4:%[0-9]+]]:_(s16) = G_AND [[TRUNC4]], [[C3]] ; GFX9: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) ; GFX9: [[AND5:%[0-9]+]]:_(s16) = G_AND [[TRUNC5]], [[C3]] - ; GFX9: [[SHL2:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) - ; GFX9: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL2]] + ; GFX9: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C4]](s16) + ; GFX9: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL3]] ; GFX9: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; GFX9: [[AND6:%[0-9]+]]:_(s16) = G_AND [[TRUNC6]], [[C3]] ; GFX9: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) ; GFX9: [[AND7:%[0-9]+]]:_(s16) = G_AND [[TRUNC7]], [[C3]] - ; GFX9: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) - ; GFX9: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL3]] - ; GFX9: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR2]](s16), [[OR3]](s16) - ; GFX9: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 - ; GFX9: [[GEP7:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C6]](s32) + ; GFX9: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND7]], [[C4]](s16) + ; GFX9: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[SHL4]] + ; GFX9: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; GFX9: 
[[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; GFX9: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C5]](s32) + ; GFX9: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; GFX9: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 + ; GFX9: [[GEP7:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C7]](s32) ; GFX9: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p5) :: (load 1, addrspace 5) ; GFX9: [[GEP8:%[0-9]+]]:_(p5) = G_GEP [[GEP7]], [[C]](s32) ; GFX9: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p5) :: (load 1, addrspace 5) @@ -9705,18 +10637,21 @@ body: | ; GFX9: [[AND8:%[0-9]+]]:_(s16) = G_AND [[TRUNC8]], [[C3]] ; GFX9: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD9]](s32) ; GFX9: [[AND9:%[0-9]+]]:_(s16) = G_AND [[TRUNC9]], [[C3]] - ; GFX9: [[SHL4:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C4]](s16) - ; GFX9: [[OR4:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL4]] + ; GFX9: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND9]], [[C4]](s16) + ; GFX9: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND8]], [[SHL6]] ; GFX9: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; GFX9: [[AND10:%[0-9]+]]:_(s16) = G_AND [[TRUNC10]], [[C3]] ; GFX9: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD11]](s32) ; GFX9: [[AND11:%[0-9]+]]:_(s16) = G_AND [[TRUNC11]], [[C3]] - ; GFX9: [[SHL5:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C4]](s16) - ; GFX9: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL5]] - ; GFX9: [[MV2:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR4]](s16), [[OR5]](s16) - ; GFX9: [[MV3:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[MV]](s32), [[MV1]](s32), [[MV2]](s32) - ; GFX9: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 - ; GFX9: [[GEP11:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C7]](s32) + ; GFX9: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND11]], [[C4]](s16) + ; GFX9: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[SHL7]] + ; GFX9: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR6]](s16) + ; GFX9: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR7]](s16) + ; GFX9: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C5]](s32) + ; GFX9: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL8]] + ; GFX9: 
[[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR2]](s32), [[OR5]](s32), [[OR8]](s32) + ; GFX9: [[C8:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 + ; GFX9: [[GEP11:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C8]](s32) ; GFX9: [[LOAD12:%[0-9]+]]:_(s32) = G_LOAD [[GEP11]](p5) :: (load 1, addrspace 5) ; GFX9: [[GEP12:%[0-9]+]]:_(p5) = G_GEP [[GEP11]], [[C]](s32) ; GFX9: [[LOAD13:%[0-9]+]]:_(s32) = G_LOAD [[GEP12]](p5) :: (load 1, addrspace 5) @@ -9728,16 +10663,19 @@ body: | ; GFX9: [[AND12:%[0-9]+]]:_(s16) = G_AND [[TRUNC12]], [[C3]] ; GFX9: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD13]](s32) ; GFX9: [[AND13:%[0-9]+]]:_(s16) = G_AND [[TRUNC13]], [[C3]] - ; GFX9: [[SHL6:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C4]](s16) - ; GFX9: [[OR6:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL6]] + ; GFX9: [[SHL9:%[0-9]+]]:_(s16) = G_SHL [[AND13]], [[C4]](s16) + ; GFX9: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND12]], [[SHL9]] ; GFX9: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD14]](s32) ; GFX9: [[AND14:%[0-9]+]]:_(s16) = G_AND [[TRUNC14]], [[C3]] ; GFX9: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD15]](s32) ; GFX9: [[AND15:%[0-9]+]]:_(s16) = G_AND [[TRUNC15]], [[C3]] - ; GFX9: [[SHL7:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C4]](s16) - ; GFX9: [[OR7:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL7]] - ; GFX9: [[MV4:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR6]](s16), [[OR7]](s16) - ; GFX9: [[GEP15:%[0-9]+]]:_(p5) = G_GEP [[GEP11]], [[C5]](s32) + ; GFX9: [[SHL10:%[0-9]+]]:_(s16) = G_SHL [[AND15]], [[C4]](s16) + ; GFX9: [[OR10:%[0-9]+]]:_(s16) = G_OR [[AND14]], [[SHL10]] + ; GFX9: [[ZEXT6:%[0-9]+]]:_(s32) = G_ZEXT [[OR9]](s16) + ; GFX9: [[ZEXT7:%[0-9]+]]:_(s32) = G_ZEXT [[OR10]](s16) + ; GFX9: [[SHL11:%[0-9]+]]:_(s32) = G_SHL [[ZEXT7]], [[C5]](s32) + ; GFX9: [[OR11:%[0-9]+]]:_(s32) = G_OR [[ZEXT6]], [[SHL11]] + ; GFX9: [[GEP15:%[0-9]+]]:_(p5) = G_GEP [[GEP11]], [[C6]](s32) ; GFX9: [[LOAD16:%[0-9]+]]:_(s32) = G_LOAD [[GEP15]](p5) :: (load 1, addrspace 5) ; GFX9: [[GEP16:%[0-9]+]]:_(p5) = G_GEP [[GEP15]], [[C]](s32) ; GFX9: 
[[LOAD17:%[0-9]+]]:_(s32) = G_LOAD [[GEP16]](p5) :: (load 1, addrspace 5) @@ -9749,16 +10687,19 @@ body: | ; GFX9: [[AND16:%[0-9]+]]:_(s16) = G_AND [[TRUNC16]], [[C3]] ; GFX9: [[TRUNC17:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD17]](s32) ; GFX9: [[AND17:%[0-9]+]]:_(s16) = G_AND [[TRUNC17]], [[C3]] - ; GFX9: [[SHL8:%[0-9]+]]:_(s16) = G_SHL [[AND17]], [[C4]](s16) - ; GFX9: [[OR8:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[SHL8]] + ; GFX9: [[SHL12:%[0-9]+]]:_(s16) = G_SHL [[AND17]], [[C4]](s16) + ; GFX9: [[OR12:%[0-9]+]]:_(s16) = G_OR [[AND16]], [[SHL12]] ; GFX9: [[TRUNC18:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD18]](s32) ; GFX9: [[AND18:%[0-9]+]]:_(s16) = G_AND [[TRUNC18]], [[C3]] ; GFX9: [[TRUNC19:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD19]](s32) ; GFX9: [[AND19:%[0-9]+]]:_(s16) = G_AND [[TRUNC19]], [[C3]] - ; GFX9: [[SHL9:%[0-9]+]]:_(s16) = G_SHL [[AND19]], [[C4]](s16) - ; GFX9: [[OR9:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[SHL9]] - ; GFX9: [[MV5:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR8]](s16), [[OR9]](s16) - ; GFX9: [[GEP19:%[0-9]+]]:_(p5) = G_GEP [[GEP11]], [[C6]](s32) + ; GFX9: [[SHL13:%[0-9]+]]:_(s16) = G_SHL [[AND19]], [[C4]](s16) + ; GFX9: [[OR13:%[0-9]+]]:_(s16) = G_OR [[AND18]], [[SHL13]] + ; GFX9: [[ZEXT8:%[0-9]+]]:_(s32) = G_ZEXT [[OR12]](s16) + ; GFX9: [[ZEXT9:%[0-9]+]]:_(s32) = G_ZEXT [[OR13]](s16) + ; GFX9: [[SHL14:%[0-9]+]]:_(s32) = G_SHL [[ZEXT9]], [[C5]](s32) + ; GFX9: [[OR14:%[0-9]+]]:_(s32) = G_OR [[ZEXT8]], [[SHL14]] + ; GFX9: [[GEP19:%[0-9]+]]:_(p5) = G_GEP [[GEP11]], [[C7]](s32) ; GFX9: [[LOAD20:%[0-9]+]]:_(s32) = G_LOAD [[GEP19]](p5) :: (load 1, addrspace 5) ; GFX9: [[GEP20:%[0-9]+]]:_(p5) = G_GEP [[GEP19]], [[C]](s32) ; GFX9: [[LOAD21:%[0-9]+]]:_(s32) = G_LOAD [[GEP20]](p5) :: (load 1, addrspace 5) @@ -9770,18 +10711,21 @@ body: | ; GFX9: [[AND20:%[0-9]+]]:_(s16) = G_AND [[TRUNC20]], [[C3]] ; GFX9: [[TRUNC21:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD21]](s32) ; GFX9: [[AND21:%[0-9]+]]:_(s16) = G_AND [[TRUNC21]], [[C3]] - ; GFX9: [[SHL10:%[0-9]+]]:_(s16) = G_SHL [[AND21]], [[C4]](s16) 
- ; GFX9: [[OR10:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[SHL10]] + ; GFX9: [[SHL15:%[0-9]+]]:_(s16) = G_SHL [[AND21]], [[C4]](s16) + ; GFX9: [[OR15:%[0-9]+]]:_(s16) = G_OR [[AND20]], [[SHL15]] ; GFX9: [[TRUNC22:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD22]](s32) ; GFX9: [[AND22:%[0-9]+]]:_(s16) = G_AND [[TRUNC22]], [[C3]] ; GFX9: [[TRUNC23:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD23]](s32) ; GFX9: [[AND23:%[0-9]+]]:_(s16) = G_AND [[TRUNC23]], [[C3]] - ; GFX9: [[SHL11:%[0-9]+]]:_(s16) = G_SHL [[AND23]], [[C4]](s16) - ; GFX9: [[OR11:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[SHL11]] - ; GFX9: [[MV6:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR10]](s16), [[OR11]](s16) - ; GFX9: [[MV7:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[MV4]](s32), [[MV5]](s32), [[MV6]](s32) - ; GFX9: [[COPY1:%[0-9]+]]:_(s96) = COPY [[MV3]](s96) - ; GFX9: [[COPY2:%[0-9]+]]:_(s96) = COPY [[MV7]](s96) + ; GFX9: [[SHL16:%[0-9]+]]:_(s16) = G_SHL [[AND23]], [[C4]](s16) + ; GFX9: [[OR16:%[0-9]+]]:_(s16) = G_OR [[AND22]], [[SHL16]] + ; GFX9: [[ZEXT10:%[0-9]+]]:_(s32) = G_ZEXT [[OR15]](s16) + ; GFX9: [[ZEXT11:%[0-9]+]]:_(s32) = G_ZEXT [[OR16]](s16) + ; GFX9: [[SHL17:%[0-9]+]]:_(s32) = G_SHL [[ZEXT11]], [[C5]](s32) + ; GFX9: [[OR17:%[0-9]+]]:_(s32) = G_OR [[ZEXT10]], [[SHL17]] + ; GFX9: [[MV1:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR11]](s32), [[OR14]](s32), [[OR17]](s32) + ; GFX9: [[COPY1:%[0-9]+]]:_(s96) = COPY [[MV]](s96) + ; GFX9: [[COPY2:%[0-9]+]]:_(s96) = COPY [[MV1]](s96) ; GFX9: $vgpr0_vgpr1_vgpr2 = COPY [[COPY1]](s96) ; GFX9: $vgpr3_vgpr4_vgpr5 = COPY [[COPY2]](s96) %0:_(p5) = COPY $vgpr0 @@ -9801,215 +10745,295 @@ body: | ; SI-LABEL: name: test_extload_private_v2s96_from_24_align2 ; SI: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 ; SI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 2, addrspace 5) - ; SI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; SI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; SI: [[GEP:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C]](s32) ; SI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p5) :: (load 2, addrspace 5) - ; 
SI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; SI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; SI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; SI: [[GEP1:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C1]](s32) + ; SI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; SI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; SI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; SI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; SI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; SI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; SI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; SI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; SI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; SI: [[GEP1:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C3]](s32) ; SI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p5) :: (load 2, addrspace 5) - ; SI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; SI: [[GEP2:%[0-9]+]]:_(p5) = G_GEP [[GEP1]], [[C]](s32) ; SI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p5) :: (load 2, addrspace 5) - ; SI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; SI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC2]](s16), [[TRUNC3]](s16) - ; SI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 - ; SI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C2]](s32) + ; SI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; SI: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C1]] + ; SI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; SI: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C1]] + ; SI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C2]](s32) + ; SI: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; SI: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 + ; SI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C4]](s32) ; SI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p5) :: (load 2, addrspace 5) - ; SI: [[TRUNC4:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD4]](s32) ; SI: [[GEP4:%[0-9]+]]:_(p5) = G_GEP [[GEP3]], [[C]](s32) ; SI: [[LOAD5:%[0-9]+]]:_(s32) 
= G_LOAD [[GEP4]](p5) :: (load 2, addrspace 5) - ; SI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) - ; SI: [[MV2:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC4]](s16), [[TRUNC5]](s16) - ; SI: [[MV3:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[MV]](s32), [[MV1]](s32), [[MV2]](s32) - ; SI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 - ; SI: [[GEP5:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C3]](s32) + ; SI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD4]](s32) + ; SI: [[AND4:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C1]] + ; SI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) + ; SI: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C1]] + ; SI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[C2]](s32) + ; SI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[AND4]], [[SHL2]] + ; SI: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32), [[OR2]](s32) + ; SI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 + ; SI: [[GEP5:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C5]](s32) ; SI: [[LOAD6:%[0-9]+]]:_(s32) = G_LOAD [[GEP5]](p5) :: (load 2, addrspace 5) - ; SI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; SI: [[GEP6:%[0-9]+]]:_(p5) = G_GEP [[GEP5]], [[C]](s32) ; SI: [[LOAD7:%[0-9]+]]:_(s32) = G_LOAD [[GEP6]](p5) :: (load 2, addrspace 5) - ; SI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) - ; SI: [[MV4:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC6]](s16), [[TRUNC7]](s16) - ; SI: [[GEP7:%[0-9]+]]:_(p5) = G_GEP [[GEP5]], [[C1]](s32) + ; SI: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LOAD6]](s32) + ; SI: [[AND6:%[0-9]+]]:_(s32) = G_AND [[COPY7]], [[C1]] + ; SI: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) + ; SI: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY8]], [[C1]] + ; SI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C2]](s32) + ; SI: [[OR3:%[0-9]+]]:_(s32) = G_OR [[AND6]], [[SHL3]] + ; SI: [[GEP7:%[0-9]+]]:_(p5) = G_GEP [[GEP5]], [[C3]](s32) ; SI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p5) :: (load 2, addrspace 5) - ; SI: [[TRUNC8:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD8]](s32) ; SI: [[GEP8:%[0-9]+]]:_(p5) = G_GEP [[GEP7]], 
[[C]](s32) ; SI: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p5) :: (load 2, addrspace 5) - ; SI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD9]](s32) - ; SI: [[MV5:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC8]](s16), [[TRUNC9]](s16) - ; SI: [[GEP9:%[0-9]+]]:_(p5) = G_GEP [[GEP5]], [[C2]](s32) + ; SI: [[COPY9:%[0-9]+]]:_(s32) = COPY [[LOAD8]](s32) + ; SI: [[AND8:%[0-9]+]]:_(s32) = G_AND [[COPY9]], [[C1]] + ; SI: [[COPY10:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32) + ; SI: [[AND9:%[0-9]+]]:_(s32) = G_AND [[COPY10]], [[C1]] + ; SI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[C2]](s32) + ; SI: [[OR4:%[0-9]+]]:_(s32) = G_OR [[AND8]], [[SHL4]] + ; SI: [[GEP9:%[0-9]+]]:_(p5) = G_GEP [[GEP5]], [[C4]](s32) ; SI: [[LOAD10:%[0-9]+]]:_(s32) = G_LOAD [[GEP9]](p5) :: (load 2, addrspace 5) - ; SI: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; SI: [[GEP10:%[0-9]+]]:_(p5) = G_GEP [[GEP9]], [[C]](s32) ; SI: [[LOAD11:%[0-9]+]]:_(s32) = G_LOAD [[GEP10]](p5) :: (load 2, addrspace 5) - ; SI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD11]](s32) - ; SI: [[MV6:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC10]](s16), [[TRUNC11]](s16) - ; SI: [[MV7:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[MV4]](s32), [[MV5]](s32), [[MV6]](s32) - ; SI: [[COPY1:%[0-9]+]]:_(s96) = COPY [[MV3]](s96) - ; SI: [[COPY2:%[0-9]+]]:_(s96) = COPY [[MV7]](s96) - ; SI: $vgpr0_vgpr1_vgpr2 = COPY [[COPY1]](s96) - ; SI: $vgpr3_vgpr4_vgpr5 = COPY [[COPY2]](s96) + ; SI: [[COPY11:%[0-9]+]]:_(s32) = COPY [[LOAD10]](s32) + ; SI: [[AND10:%[0-9]+]]:_(s32) = G_AND [[COPY11]], [[C1]] + ; SI: [[COPY12:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32) + ; SI: [[AND11:%[0-9]+]]:_(s32) = G_AND [[COPY12]], [[C1]] + ; SI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[C2]](s32) + ; SI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[AND10]], [[SHL5]] + ; SI: [[MV1:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR3]](s32), [[OR4]](s32), [[OR5]](s32) + ; SI: [[COPY13:%[0-9]+]]:_(s96) = COPY [[MV]](s96) + ; SI: [[COPY14:%[0-9]+]]:_(s96) = COPY [[MV1]](s96) + ; SI: $vgpr0_vgpr1_vgpr2 = COPY 
[[COPY13]](s96) + ; SI: $vgpr3_vgpr4_vgpr5 = COPY [[COPY14]](s96) ; CI-LABEL: name: test_extload_private_v2s96_from_24_align2 ; CI: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 ; CI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 2, addrspace 5) - ; CI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; CI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; CI: [[GEP:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C]](s32) ; CI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p5) :: (load 2, addrspace 5) - ; CI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; CI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; CI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; CI: [[GEP1:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C1]](s32) + ; CI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; CI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; CI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; CI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; CI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; CI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; CI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; CI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; CI: [[GEP1:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C3]](s32) ; CI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p5) :: (load 2, addrspace 5) - ; CI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; CI: [[GEP2:%[0-9]+]]:_(p5) = G_GEP [[GEP1]], [[C]](s32) ; CI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p5) :: (load 2, addrspace 5) - ; CI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; CI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC2]](s16), [[TRUNC3]](s16) - ; CI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 - ; CI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C2]](s32) + ; CI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; CI: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C1]] + ; CI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; CI: 
[[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C1]] + ; CI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C2]](s32) + ; CI: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; CI: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 + ; CI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C4]](s32) ; CI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p5) :: (load 2, addrspace 5) - ; CI: [[TRUNC4:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD4]](s32) ; CI: [[GEP4:%[0-9]+]]:_(p5) = G_GEP [[GEP3]], [[C]](s32) ; CI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p5) :: (load 2, addrspace 5) - ; CI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) - ; CI: [[MV2:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC4]](s16), [[TRUNC5]](s16) - ; CI: [[MV3:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[MV]](s32), [[MV1]](s32), [[MV2]](s32) - ; CI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 - ; CI: [[GEP5:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C3]](s32) + ; CI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD4]](s32) + ; CI: [[AND4:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C1]] + ; CI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) + ; CI: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C1]] + ; CI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[C2]](s32) + ; CI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[AND4]], [[SHL2]] + ; CI: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32), [[OR2]](s32) + ; CI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 + ; CI: [[GEP5:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C5]](s32) ; CI: [[LOAD6:%[0-9]+]]:_(s32) = G_LOAD [[GEP5]](p5) :: (load 2, addrspace 5) - ; CI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; CI: [[GEP6:%[0-9]+]]:_(p5) = G_GEP [[GEP5]], [[C]](s32) ; CI: [[LOAD7:%[0-9]+]]:_(s32) = G_LOAD [[GEP6]](p5) :: (load 2, addrspace 5) - ; CI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) - ; CI: [[MV4:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC6]](s16), [[TRUNC7]](s16) - ; CI: [[GEP7:%[0-9]+]]:_(p5) = G_GEP [[GEP5]], [[C1]](s32) + ; CI: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LOAD6]](s32) + ; CI: [[AND6:%[0-9]+]]:_(s32) = 
G_AND [[COPY7]], [[C1]] + ; CI: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) + ; CI: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY8]], [[C1]] + ; CI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C2]](s32) + ; CI: [[OR3:%[0-9]+]]:_(s32) = G_OR [[AND6]], [[SHL3]] + ; CI: [[GEP7:%[0-9]+]]:_(p5) = G_GEP [[GEP5]], [[C3]](s32) ; CI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p5) :: (load 2, addrspace 5) - ; CI: [[TRUNC8:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD8]](s32) ; CI: [[GEP8:%[0-9]+]]:_(p5) = G_GEP [[GEP7]], [[C]](s32) ; CI: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p5) :: (load 2, addrspace 5) - ; CI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD9]](s32) - ; CI: [[MV5:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC8]](s16), [[TRUNC9]](s16) - ; CI: [[GEP9:%[0-9]+]]:_(p5) = G_GEP [[GEP5]], [[C2]](s32) + ; CI: [[COPY9:%[0-9]+]]:_(s32) = COPY [[LOAD8]](s32) + ; CI: [[AND8:%[0-9]+]]:_(s32) = G_AND [[COPY9]], [[C1]] + ; CI: [[COPY10:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32) + ; CI: [[AND9:%[0-9]+]]:_(s32) = G_AND [[COPY10]], [[C1]] + ; CI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[C2]](s32) + ; CI: [[OR4:%[0-9]+]]:_(s32) = G_OR [[AND8]], [[SHL4]] + ; CI: [[GEP9:%[0-9]+]]:_(p5) = G_GEP [[GEP5]], [[C4]](s32) ; CI: [[LOAD10:%[0-9]+]]:_(s32) = G_LOAD [[GEP9]](p5) :: (load 2, addrspace 5) - ; CI: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; CI: [[GEP10:%[0-9]+]]:_(p5) = G_GEP [[GEP9]], [[C]](s32) ; CI: [[LOAD11:%[0-9]+]]:_(s32) = G_LOAD [[GEP10]](p5) :: (load 2, addrspace 5) - ; CI: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD11]](s32) - ; CI: [[MV6:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC10]](s16), [[TRUNC11]](s16) - ; CI: [[MV7:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[MV4]](s32), [[MV5]](s32), [[MV6]](s32) - ; CI: [[COPY1:%[0-9]+]]:_(s96) = COPY [[MV3]](s96) - ; CI: [[COPY2:%[0-9]+]]:_(s96) = COPY [[MV7]](s96) - ; CI: $vgpr0_vgpr1_vgpr2 = COPY [[COPY1]](s96) - ; CI: $vgpr3_vgpr4_vgpr5 = COPY [[COPY2]](s96) + ; CI: [[COPY11:%[0-9]+]]:_(s32) = COPY [[LOAD10]](s32) + ; CI: 
[[AND10:%[0-9]+]]:_(s32) = G_AND [[COPY11]], [[C1]] + ; CI: [[COPY12:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32) + ; CI: [[AND11:%[0-9]+]]:_(s32) = G_AND [[COPY12]], [[C1]] + ; CI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[C2]](s32) + ; CI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[AND10]], [[SHL5]] + ; CI: [[MV1:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR3]](s32), [[OR4]](s32), [[OR5]](s32) + ; CI: [[COPY13:%[0-9]+]]:_(s96) = COPY [[MV]](s96) + ; CI: [[COPY14:%[0-9]+]]:_(s96) = COPY [[MV1]](s96) + ; CI: $vgpr0_vgpr1_vgpr2 = COPY [[COPY13]](s96) + ; CI: $vgpr3_vgpr4_vgpr5 = COPY [[COPY14]](s96) ; VI-LABEL: name: test_extload_private_v2s96_from_24_align2 ; VI: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 2, addrspace 5) - ; VI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; VI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; VI: [[GEP:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C]](s32) ; VI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p5) :: (load 2, addrspace 5) - ; VI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; VI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; VI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; VI: [[GEP1:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C1]](s32) + ; VI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; VI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; VI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; VI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; VI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; VI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; VI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; VI: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; VI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; VI: [[GEP1:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C3]](s32) ; VI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p5) :: (load 2, addrspace 5) - ; VI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; VI: [[GEP2:%[0-9]+]]:_(p5) = G_GEP [[GEP1]], [[C]](s32) ; 
VI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p5) :: (load 2, addrspace 5) - ; VI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; VI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC2]](s16), [[TRUNC3]](s16) - ; VI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 - ; VI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C2]](s32) + ; VI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; VI: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C1]] + ; VI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; VI: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C1]] + ; VI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C2]](s32) + ; VI: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; VI: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 + ; VI: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C4]](s32) ; VI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p5) :: (load 2, addrspace 5) - ; VI: [[TRUNC4:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD4]](s32) ; VI: [[GEP4:%[0-9]+]]:_(p5) = G_GEP [[GEP3]], [[C]](s32) ; VI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p5) :: (load 2, addrspace 5) - ; VI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) - ; VI: [[MV2:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC4]](s16), [[TRUNC5]](s16) - ; VI: [[MV3:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[MV]](s32), [[MV1]](s32), [[MV2]](s32) - ; VI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 - ; VI: [[GEP5:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C3]](s32) + ; VI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD4]](s32) + ; VI: [[AND4:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C1]] + ; VI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) + ; VI: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C1]] + ; VI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[C2]](s32) + ; VI: [[OR2:%[0-9]+]]:_(s32) = G_OR [[AND4]], [[SHL2]] + ; VI: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32), [[OR2]](s32) + ; VI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 + ; VI: [[GEP5:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C5]](s32) ; VI: [[LOAD6:%[0-9]+]]:_(s32) = G_LOAD [[GEP5]](p5) :: (load 
2, addrspace 5) - ; VI: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; VI: [[GEP6:%[0-9]+]]:_(p5) = G_GEP [[GEP5]], [[C]](s32) ; VI: [[LOAD7:%[0-9]+]]:_(s32) = G_LOAD [[GEP6]](p5) :: (load 2, addrspace 5) - ; VI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) - ; VI: [[MV4:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC6]](s16), [[TRUNC7]](s16) - ; VI: [[GEP7:%[0-9]+]]:_(p5) = G_GEP [[GEP5]], [[C1]](s32) + ; VI: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LOAD6]](s32) + ; VI: [[AND6:%[0-9]+]]:_(s32) = G_AND [[COPY7]], [[C1]] + ; VI: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) + ; VI: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY8]], [[C1]] + ; VI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C2]](s32) + ; VI: [[OR3:%[0-9]+]]:_(s32) = G_OR [[AND6]], [[SHL3]] + ; VI: [[GEP7:%[0-9]+]]:_(p5) = G_GEP [[GEP5]], [[C3]](s32) ; VI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p5) :: (load 2, addrspace 5) - ; VI: [[TRUNC8:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD8]](s32) ; VI: [[GEP8:%[0-9]+]]:_(p5) = G_GEP [[GEP7]], [[C]](s32) ; VI: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p5) :: (load 2, addrspace 5) - ; VI: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD9]](s32) - ; VI: [[MV5:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC8]](s16), [[TRUNC9]](s16) - ; VI: [[GEP9:%[0-9]+]]:_(p5) = G_GEP [[GEP5]], [[C2]](s32) + ; VI: [[COPY9:%[0-9]+]]:_(s32) = COPY [[LOAD8]](s32) + ; VI: [[AND8:%[0-9]+]]:_(s32) = G_AND [[COPY9]], [[C1]] + ; VI: [[COPY10:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32) + ; VI: [[AND9:%[0-9]+]]:_(s32) = G_AND [[COPY10]], [[C1]] + ; VI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[C2]](s32) + ; VI: [[OR4:%[0-9]+]]:_(s32) = G_OR [[AND8]], [[SHL4]] + ; VI: [[GEP9:%[0-9]+]]:_(p5) = G_GEP [[GEP5]], [[C4]](s32) ; VI: [[LOAD10:%[0-9]+]]:_(s32) = G_LOAD [[GEP9]](p5) :: (load 2, addrspace 5) - ; VI: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; VI: [[GEP10:%[0-9]+]]:_(p5) = G_GEP [[GEP9]], [[C]](s32) ; VI: [[LOAD11:%[0-9]+]]:_(s32) = G_LOAD [[GEP10]](p5) :: (load 2, addrspace 5) - ; VI: 
[[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD11]](s32) - ; VI: [[MV6:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC10]](s16), [[TRUNC11]](s16) - ; VI: [[MV7:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[MV4]](s32), [[MV5]](s32), [[MV6]](s32) - ; VI: [[COPY1:%[0-9]+]]:_(s96) = COPY [[MV3]](s96) - ; VI: [[COPY2:%[0-9]+]]:_(s96) = COPY [[MV7]](s96) - ; VI: $vgpr0_vgpr1_vgpr2 = COPY [[COPY1]](s96) - ; VI: $vgpr3_vgpr4_vgpr5 = COPY [[COPY2]](s96) + ; VI: [[COPY11:%[0-9]+]]:_(s32) = COPY [[LOAD10]](s32) + ; VI: [[AND10:%[0-9]+]]:_(s32) = G_AND [[COPY11]], [[C1]] + ; VI: [[COPY12:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32) + ; VI: [[AND11:%[0-9]+]]:_(s32) = G_AND [[COPY12]], [[C1]] + ; VI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[C2]](s32) + ; VI: [[OR5:%[0-9]+]]:_(s32) = G_OR [[AND10]], [[SHL5]] + ; VI: [[MV1:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR3]](s32), [[OR4]](s32), [[OR5]](s32) + ; VI: [[COPY13:%[0-9]+]]:_(s96) = COPY [[MV]](s96) + ; VI: [[COPY14:%[0-9]+]]:_(s96) = COPY [[MV1]](s96) + ; VI: $vgpr0_vgpr1_vgpr2 = COPY [[COPY13]](s96) + ; VI: $vgpr3_vgpr4_vgpr5 = COPY [[COPY14]](s96) ; GFX9-LABEL: name: test_extload_private_v2s96_from_24_align2 ; GFX9: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 ; GFX9: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 2, addrspace 5) - ; GFX9: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; GFX9: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; GFX9: [[GEP:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C]](s32) ; GFX9: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p5) :: (load 2, addrspace 5) - ; GFX9: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) - ; GFX9: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16) - ; GFX9: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 - ; GFX9: [[GEP1:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C1]](s32) + ; GFX9: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; GFX9: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) + ; GFX9: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; GFX9: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) + ; 
GFX9: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] + ; GFX9: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX9: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32) + ; GFX9: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; GFX9: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; GFX9: [[GEP1:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C3]](s32) ; GFX9: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p5) :: (load 2, addrspace 5) - ; GFX9: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) ; GFX9: [[GEP2:%[0-9]+]]:_(p5) = G_GEP [[GEP1]], [[C]](s32) ; GFX9: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p5) :: (load 2, addrspace 5) - ; GFX9: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; GFX9: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC2]](s16), [[TRUNC3]](s16) - ; GFX9: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 - ; GFX9: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C2]](s32) + ; GFX9: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) + ; GFX9: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C1]] + ; GFX9: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32) + ; GFX9: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C1]] + ; GFX9: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C2]](s32) + ; GFX9: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; GFX9: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 + ; GFX9: [[GEP3:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C4]](s32) ; GFX9: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[GEP3]](p5) :: (load 2, addrspace 5) - ; GFX9: [[TRUNC4:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD4]](s32) ; GFX9: [[GEP4:%[0-9]+]]:_(p5) = G_GEP [[GEP3]], [[C]](s32) ; GFX9: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[GEP4]](p5) :: (load 2, addrspace 5) - ; GFX9: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD5]](s32) - ; GFX9: [[MV2:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC4]](s16), [[TRUNC5]](s16) - ; GFX9: [[MV3:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[MV]](s32), [[MV1]](s32), [[MV2]](s32) - ; GFX9: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 - ; GFX9: [[GEP5:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C3]](s32) + ; GFX9: 
[[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD4]](s32) + ; GFX9: [[AND4:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C1]] + ; GFX9: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32) + ; GFX9: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C1]] + ; GFX9: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[C2]](s32) + ; GFX9: [[OR2:%[0-9]+]]:_(s32) = G_OR [[AND4]], [[SHL2]] + ; GFX9: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32), [[OR2]](s32) + ; GFX9: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 12 + ; GFX9: [[GEP5:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C5]](s32) ; GFX9: [[LOAD6:%[0-9]+]]:_(s32) = G_LOAD [[GEP5]](p5) :: (load 2, addrspace 5) - ; GFX9: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD6]](s32) ; GFX9: [[GEP6:%[0-9]+]]:_(p5) = G_GEP [[GEP5]], [[C]](s32) ; GFX9: [[LOAD7:%[0-9]+]]:_(s32) = G_LOAD [[GEP6]](p5) :: (load 2, addrspace 5) - ; GFX9: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) - ; GFX9: [[MV4:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC6]](s16), [[TRUNC7]](s16) - ; GFX9: [[GEP7:%[0-9]+]]:_(p5) = G_GEP [[GEP5]], [[C1]](s32) + ; GFX9: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LOAD6]](s32) + ; GFX9: [[AND6:%[0-9]+]]:_(s32) = G_AND [[COPY7]], [[C1]] + ; GFX9: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32) + ; GFX9: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY8]], [[C1]] + ; GFX9: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C2]](s32) + ; GFX9: [[OR3:%[0-9]+]]:_(s32) = G_OR [[AND6]], [[SHL3]] + ; GFX9: [[GEP7:%[0-9]+]]:_(p5) = G_GEP [[GEP5]], [[C3]](s32) ; GFX9: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[GEP7]](p5) :: (load 2, addrspace 5) - ; GFX9: [[TRUNC8:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD8]](s32) ; GFX9: [[GEP8:%[0-9]+]]:_(p5) = G_GEP [[GEP7]], [[C]](s32) ; GFX9: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[GEP8]](p5) :: (load 2, addrspace 5) - ; GFX9: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD9]](s32) - ; GFX9: [[MV5:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC8]](s16), [[TRUNC9]](s16) - ; GFX9: [[GEP9:%[0-9]+]]:_(p5) = G_GEP [[GEP5]], [[C2]](s32) + ; GFX9: [[COPY9:%[0-9]+]]:_(s32) = COPY 
[[LOAD8]](s32) + ; GFX9: [[AND8:%[0-9]+]]:_(s32) = G_AND [[COPY9]], [[C1]] + ; GFX9: [[COPY10:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32) + ; GFX9: [[AND9:%[0-9]+]]:_(s32) = G_AND [[COPY10]], [[C1]] + ; GFX9: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[AND9]], [[C2]](s32) + ; GFX9: [[OR4:%[0-9]+]]:_(s32) = G_OR [[AND8]], [[SHL4]] + ; GFX9: [[GEP9:%[0-9]+]]:_(p5) = G_GEP [[GEP5]], [[C4]](s32) ; GFX9: [[LOAD10:%[0-9]+]]:_(s32) = G_LOAD [[GEP9]](p5) :: (load 2, addrspace 5) - ; GFX9: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD10]](s32) ; GFX9: [[GEP10:%[0-9]+]]:_(p5) = G_GEP [[GEP9]], [[C]](s32) ; GFX9: [[LOAD11:%[0-9]+]]:_(s32) = G_LOAD [[GEP10]](p5) :: (load 2, addrspace 5) - ; GFX9: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD11]](s32) - ; GFX9: [[MV6:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC10]](s16), [[TRUNC11]](s16) - ; GFX9: [[MV7:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[MV4]](s32), [[MV5]](s32), [[MV6]](s32) - ; GFX9: [[COPY1:%[0-9]+]]:_(s96) = COPY [[MV3]](s96) - ; GFX9: [[COPY2:%[0-9]+]]:_(s96) = COPY [[MV7]](s96) - ; GFX9: $vgpr0_vgpr1_vgpr2 = COPY [[COPY1]](s96) - ; GFX9: $vgpr3_vgpr4_vgpr5 = COPY [[COPY2]](s96) + ; GFX9: [[COPY11:%[0-9]+]]:_(s32) = COPY [[LOAD10]](s32) + ; GFX9: [[AND10:%[0-9]+]]:_(s32) = G_AND [[COPY11]], [[C1]] + ; GFX9: [[COPY12:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32) + ; GFX9: [[AND11:%[0-9]+]]:_(s32) = G_AND [[COPY12]], [[C1]] + ; GFX9: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[C2]](s32) + ; GFX9: [[OR5:%[0-9]+]]:_(s32) = G_OR [[AND10]], [[SHL5]] + ; GFX9: [[MV1:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR3]](s32), [[OR4]](s32), [[OR5]](s32) + ; GFX9: [[COPY13:%[0-9]+]]:_(s96) = COPY [[MV]](s96) + ; GFX9: [[COPY14:%[0-9]+]]:_(s96) = COPY [[MV1]](s96) + ; GFX9: $vgpr0_vgpr1_vgpr2 = COPY [[COPY13]](s96) + ; GFX9: $vgpr3_vgpr4_vgpr5 = COPY [[COPY14]](s96) %0:_(p5) = COPY $vgpr0 %1:_(<2 x s96>) = G_LOAD %0 :: (load 24, align 2, addrspace 5) %2:_(s96) = G_EXTRACT %1, 0 Modified: llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-merge-values.mir URL: 
http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-merge-values.mir?rev=373942&r1=373941&r2=373942&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-merge-values.mir (original) +++ llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-merge-values.mir Mon Oct 7 12:05:58 2019 @@ -55,7 +55,16 @@ body: | ; CHECK: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C1]](s32) ; CHECK: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) ; CHECK: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; CHECK: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) + ; CHECK: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CHECK: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CHECK: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CHECK: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C3]](s32) + ; CHECK: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; CHECK: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; CHECK: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CHECK: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C3]](s32) + ; CHECK: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; CHECK: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) ; CHECK: $vgpr0_vgpr1 = COPY [[MV]](p1) %0:_(s32) = COPY $vgpr0 %1:_(s32) = COPY $vgpr1 @@ -131,8 +140,12 @@ body: | ; CHECK: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C4]](s32) ; CHECK: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[SHL1]](s32) ; CHECK: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[TRUNC3]] - ; CHECK: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; CHECK: [[COPY3:%[0-9]+]]:_(s32) = COPY [[MV]](s32) + ; CHECK: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CHECK: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CHECK: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CHECK: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C6]](s32) + ; 
CHECK: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; CHECK: [[COPY3:%[0-9]+]]:_(s32) = COPY [[OR2]](s32) ; CHECK: $vgpr0 = COPY [[COPY3]](s32) %0:_(s8) = G_CONSTANT i8 0 %1:_(s8) = G_CONSTANT i8 1 @@ -169,8 +182,12 @@ body: | ; CHECK: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C5]](s32) ; CHECK: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[SHL1]](s32) ; CHECK: [[OR1:%[0-9]+]]:_(s16) = G_OR [[AND2]], [[TRUNC3]] - ; CHECK: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16) - ; CHECK: $vgpr0 = COPY [[MV]](s32) + ; CHECK: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CHECK: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CHECK: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CHECK: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C7]](s32) + ; CHECK: [[OR2:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL2]] + ; CHECK: $vgpr0 = COPY [[OR2]](s32) %0:_(s8) = G_CONSTANT i8 0 %1:_(s8) = G_CONSTANT i8 1 %2:_(s8) = G_CONSTANT i8 2 @@ -205,11 +222,21 @@ body: | ; CHECK: [[COPY1:%[0-9]+]]:_(s32) = COPY $vgpr1 ; CHECK: [[COPY2:%[0-9]+]]:_(s32) = COPY $vgpr2 ; CHECK: [[COPY3:%[0-9]+]]:_(s32) = COPY $vgpr3 - ; CHECK: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[COPY]](s32) - ; CHECK: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[COPY1]](s32) - ; CHECK: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[COPY2]](s32) - ; CHECK: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[COPY3]](s32) - ; CHECK: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) + ; CHECK: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; CHECK: [[COPY4:%[0-9]+]]:_(s32) = COPY [[COPY]](s32) + ; CHECK: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C]] + ; CHECK: [[COPY5:%[0-9]+]]:_(s32) = COPY [[COPY1]](s32) + ; CHECK: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C]] + ; CHECK: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CHECK: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C1]](s32) + ; CHECK: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; CHECK: [[COPY6:%[0-9]+]]:_(s32) = COPY [[COPY2]](s32) + 
; CHECK: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C]] + ; CHECK: [[COPY7:%[0-9]+]]:_(s32) = COPY [[COPY3]](s32) + ; CHECK: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY7]], [[C]] + ; CHECK: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[AND3]], [[C1]](s32) + ; CHECK: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; CHECK: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32) ; CHECK: $vgpr1_vgpr2 = COPY [[MV]](s64) %0:_(s32) = COPY $vgpr0 %1:_(s32) = COPY $vgpr1 @@ -278,8 +305,12 @@ body: | ; CHECK: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C9]](s32) ; CHECK: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL5]](s32) ; CHECK: [[OR5:%[0-9]+]]:_(s16) = G_OR [[OR4]], [[TRUNC7]] - ; CHECK: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR2]](s16), [[OR5]](s16) - ; CHECK: [[TRUNC8:%[0-9]+]]:_(s24) = G_TRUNC [[MV]](s32) + ; CHECK: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; CHECK: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR5]](s16) + ; CHECK: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CHECK: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C10]](s32) + ; CHECK: [[OR6:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL6]] + ; CHECK: [[TRUNC8:%[0-9]+]]:_(s24) = G_TRUNC [[OR6]](s32) ; CHECK: S_NOP 0, implicit [[TRUNC8]](s24) %0:_(s4) = G_CONSTANT i4 0 %1:_(s4) = G_CONSTANT i4 1 @@ -346,8 +377,12 @@ body: | ; CHECK: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C10]](s32) ; CHECK: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL5]](s32) ; CHECK: [[OR5:%[0-9]+]]:_(s16) = G_OR [[OR4]], [[TRUNC7]] - ; CHECK: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[OR2]](s16), [[OR5]](s16) - ; CHECK: [[TRUNC8:%[0-9]+]]:_(s28) = G_TRUNC [[MV]](s32) + ; CHECK: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; CHECK: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR5]](s16) + ; CHECK: [[C11:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CHECK: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C11]](s32) + ; CHECK: [[OR6:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL6]] + ; CHECK: [[TRUNC8:%[0-9]+]]:_(s28) = G_TRUNC [[OR6]](s32) ; CHECK: S_NOP 0, implicit 
[[TRUNC8]](s28) %0:_(s4) = G_CONSTANT i4 0 %1:_(s4) = G_CONSTANT i4 1 @@ -442,7 +477,20 @@ body: | ; CHECK: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[AND11]], [[COPY10]](s32) ; CHECK: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[SHL5]](s32) ; CHECK: [[OR5:%[0-9]+]]:_(s16) = G_OR [[AND10]], [[TRUNC11]] - ; CHECK: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16), [[OR4]](s16), [[OR5]](s16) + ; CHECK: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CHECK: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CHECK: [[C14:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CHECK: [[SHL6:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C14]](s32) + ; CHECK: [[OR6:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL6]] + ; CHECK: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; CHECK: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CHECK: [[SHL7:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C14]](s32) + ; CHECK: [[OR7:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL7]] + ; CHECK: [[ZEXT4:%[0-9]+]]:_(s32) = G_ZEXT [[OR4]](s16) + ; CHECK: [[ZEXT5:%[0-9]+]]:_(s32) = G_ZEXT [[OR5]](s16) + ; CHECK: [[SHL8:%[0-9]+]]:_(s32) = G_SHL [[ZEXT5]], [[C14]](s32) + ; CHECK: [[OR8:%[0-9]+]]:_(s32) = G_OR [[ZEXT4]], [[SHL8]] + ; CHECK: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR6]](s32), [[OR7]](s32), [[OR8]](s32) ; CHECK: $vgpr0_vgpr1_vgpr2 = COPY [[MV]](s96) %0:_(s8) = G_CONSTANT i8 0 %1:_(s8) = G_CONSTANT i8 1 @@ -466,13 +514,20 @@ name: test_merge_s96_s16_s16_s16_s16_s16 body: | bb.0: ; CHECK-LABEL: name: test_merge_s96_s16_s16_s16_s16_s16_s16 - ; CHECK: [[C:%[0-9]+]]:_(s16) = G_CONSTANT i16 0 - ; CHECK: [[C1:%[0-9]+]]:_(s16) = G_CONSTANT i16 1 - ; CHECK: [[C2:%[0-9]+]]:_(s16) = G_CONSTANT i16 2 - ; CHECK: [[C3:%[0-9]+]]:_(s16) = G_CONSTANT i16 3 - ; CHECK: [[C4:%[0-9]+]]:_(s16) = G_CONSTANT i16 4 - ; CHECK: [[C5:%[0-9]+]]:_(s16) = G_CONSTANT i16 5 - ; CHECK: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[C]](s16), [[C1]](s16), [[C2]](s16), [[C3]](s16), [[C4]](s16), [[C5]](s16) + ; CHECK: 
[[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 0 + ; CHECK: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 1 + ; CHECK: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CHECK: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[C1]], [[C2]](s32) + ; CHECK: [[OR:%[0-9]+]]:_(s32) = G_OR [[C]], [[SHL]] + ; CHECK: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 + ; CHECK: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 3 + ; CHECK: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[C4]], [[C2]](s32) + ; CHECK: [[OR1:%[0-9]+]]:_(s32) = G_OR [[C3]], [[SHL1]] + ; CHECK: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 + ; CHECK: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 5 + ; CHECK: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[C6]], [[C2]](s32) + ; CHECK: [[OR2:%[0-9]+]]:_(s32) = G_OR [[C5]], [[SHL2]] + ; CHECK: [[MV:%[0-9]+]]:_(s96) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32), [[OR2]](s32) ; CHECK: $vgpr0_vgpr1_vgpr2 = COPY [[MV]](s96) %0:_(s16) = G_CONSTANT i16 0 %1:_(s16) = G_CONSTANT i16 1 @@ -531,7 +586,16 @@ body: | ; CHECK: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[AND7]], [[C8]](s32) ; CHECK: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) ; CHECK: [[OR3:%[0-9]+]]:_(s16) = G_OR [[AND6]], [[TRUNC7]] - ; CHECK: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s16), [[OR1]](s16), [[OR2]](s16), [[OR3]](s16) + ; CHECK: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16) + ; CHECK: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[OR1]](s16) + ; CHECK: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CHECK: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C10]](s32) + ; CHECK: [[OR4:%[0-9]+]]:_(s32) = G_OR [[ZEXT]], [[SHL4]] + ; CHECK: [[ZEXT2:%[0-9]+]]:_(s32) = G_ZEXT [[OR2]](s16) + ; CHECK: [[ZEXT3:%[0-9]+]]:_(s32) = G_ZEXT [[OR3]](s16) + ; CHECK: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[ZEXT3]], [[C10]](s32) + ; CHECK: [[OR5:%[0-9]+]]:_(s32) = G_OR [[ZEXT2]], [[SHL5]] + ; CHECK: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR4]](s32), [[OR5]](s32) ; CHECK: [[TRUNC8:%[0-9]+]]:_(s56) = G_TRUNC [[MV]](s64) ; CHECK: S_NOP 0, implicit [[TRUNC8]](s56) %0:_(s8) = G_CONSTANT i8 0 @@ -706,12 +770,80 @@ name: 
test_merge_p3_s16_s16 body: | bb.0: ; CHECK-LABEL: name: test_merge_p3_s16_s16 - ; CHECK: [[C:%[0-9]+]]:_(s16) = G_CONSTANT i16 0 - ; CHECK: [[C1:%[0-9]+]]:_(s16) = G_CONSTANT i16 1 - ; CHECK: [[MV:%[0-9]+]]:_(p3) = G_MERGE_VALUES [[C]](s16), [[C1]](s16) - ; CHECK: $vgpr0 = COPY [[MV]](p3) + ; CHECK: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 0 + ; CHECK: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 1 + ; CHECK: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CHECK: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[C1]], [[C2]](s32) + ; CHECK: [[OR:%[0-9]+]]:_(s32) = G_OR [[C]], [[SHL]] + ; CHECK: [[INTTOPTR:%[0-9]+]]:_(p3) = G_INTTOPTR [[OR]](s32) + ; CHECK: $vgpr0 = COPY [[INTTOPTR]](p3) %0:_(s16) = G_CONSTANT i16 0 %1:_(s16) = G_CONSTANT i16 1 %2:_(p3) = G_MERGE_VALUES %0, %1 $vgpr0 = COPY %2 ... + +--- +name: test_merge_s32_s16_s16 +body: | + bb.0: + liveins: $vgpr0, $vgpr1 + + ; CHECK-LABEL: name: test_merge_s32_s16_s16 + ; CHECK: [[COPY:%[0-9]+]]:_(s32) = COPY $vgpr0 + ; CHECK: [[COPY1:%[0-9]+]]:_(s32) = COPY $vgpr1 + ; CHECK: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; CHECK: [[COPY2:%[0-9]+]]:_(s32) = COPY [[COPY]](s32) + ; CHECK: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C]] + ; CHECK: [[COPY3:%[0-9]+]]:_(s32) = COPY [[COPY1]](s32) + ; CHECK: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C]] + ; CHECK: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CHECK: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C1]](s32) + ; CHECK: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; CHECK: $vgpr0 = COPY [[OR]](s32) + %0:_(s32) = COPY $vgpr0 + %1:_(s32) = COPY $vgpr1 + %2:_(s16) = G_TRUNC %0 + %3:_(s16) = G_TRUNC %1 + %4:_(s32) = G_MERGE_VALUES %2, %3 + $vgpr0 = COPY %4 +... 
+ +--- +name: test_merge_s48_s16_s16_s16 +body: | + bb.0: + liveins: $vgpr0, $vgpr1, $vgpr2, $vgpr3 + + ; CHECK-LABEL: name: test_merge_s48_s16_s16_s16 + ; CHECK: [[COPY:%[0-9]+]]:_(s32) = COPY $vgpr0 + ; CHECK: [[COPY1:%[0-9]+]]:_(s32) = COPY $vgpr1 + ; CHECK: [[COPY2:%[0-9]+]]:_(s32) = COPY $vgpr2 + ; CHECK: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; CHECK: [[COPY3:%[0-9]+]]:_(s32) = COPY [[COPY]](s32) + ; CHECK: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C]] + ; CHECK: [[COPY4:%[0-9]+]]:_(s32) = COPY [[COPY1]](s32) + ; CHECK: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C]] + ; CHECK: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CHECK: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C1]](s32) + ; CHECK: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND]], [[SHL]] + ; CHECK: [[COPY5:%[0-9]+]]:_(s32) = COPY [[COPY2]](s32) + ; CHECK: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C]] + ; CHECK: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 0 + ; CHECK: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[C2]], [[C1]](s32) + ; CHECK: [[OR1:%[0-9]+]]:_(s32) = G_OR [[AND2]], [[SHL1]] + ; CHECK: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[OR]](s32), [[OR1]](s32) + ; CHECK: [[COPY6:%[0-9]+]]:_(s64) = COPY [[MV]](s64) + ; CHECK: $vgpr0_vgpr1 = COPY [[COPY6]](s64) + %0:_(s32) = COPY $vgpr0 + %1:_(s32) = COPY $vgpr1 + %2:_(s32) = COPY $vgpr2 + + %3:_(s16) = G_TRUNC %0 + %4:_(s16) = G_TRUNC %1 + %5:_(s16) = G_TRUNC %2 + + %6:_(s48) = G_MERGE_VALUES %3, %4, %5 + %7:_(s64) = G_ANYEXT %6 + $vgpr0_vgpr1 = COPY %7 +... From llvm-commits at lists.llvm.org Mon Oct 7 12:03:51 2019 From: llvm-commits at lists.llvm.org (Matt Arsenault via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 19:03:51 +0000 (UTC) Subject: [PATCH] D68308: AMDGPU/GlobalISel: Widen 16-bit G_MERGE_VALUEs sources In-Reply-To: References: Message-ID: arsenm marked an inline comment as done. arsenm added a comment. 
r373942 CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68308/new/ https://reviews.llvm.org/D68308 From llvm-commits at lists.llvm.org Mon Oct 7 12:07:19 2019 From: llvm-commits at lists.llvm.org (Matt Arsenault via llvm-commits) Date: Mon, 07 Oct 2019 19:07:19 -0000 Subject: [llvm] r373943 - AMDGPU/GlobalISel: Use S_MOV_B64 for inline constants Message-ID: <20191007190719.C788387CA7@lists.llvm.org> Author: arsenm Date: Mon Oct 7 12:07:19 2019 New Revision: 373943 URL: http://llvm.org/viewvc/llvm-project?rev=373943&view=rev Log: AMDGPU/GlobalISel: Use S_MOV_B64 for inline constants This hides some defects in SIFoldOperands when the immediates are split. Modified: llvm/trunk/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/inst-select-constant.mir llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-smrd.mir Modified: llvm/trunk/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp?rev=373943&r1=373942&r2=373943&view=diff ============================================================================== --- llvm/trunk/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp (original) +++ llvm/trunk/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp Mon Oct 7 12:07:19 2019 @@ -1472,31 +1472,38 @@ bool AMDGPUInstructionSelector::selectG_ return constrainSelectedInstRegOperands(I, TII, TRI, RBI); } - DebugLoc DL = I.getDebugLoc(); - const TargetRegisterClass *RC = IsSgpr ? 
&AMDGPU::SReg_32_XM0RegClass : - &AMDGPU::VGPR_32RegClass; - Register LoReg = MRI->createVirtualRegister(RC); - Register HiReg = MRI->createVirtualRegister(RC); - const APInt &Imm = APInt(Size, I.getOperand(1).getImm()); - - BuildMI(*BB, &I, DL, TII.get(Opcode), LoReg) - .addImm(Imm.trunc(32).getZExtValue()); - - BuildMI(*BB, &I, DL, TII.get(Opcode), HiReg) - .addImm(Imm.ashr(32).getZExtValue()); - - const MachineInstr *RS = - BuildMI(*BB, &I, DL, TII.get(AMDGPU::REG_SEQUENCE), DstReg) - .addReg(LoReg) - .addImm(AMDGPU::sub0) - .addReg(HiReg) - .addImm(AMDGPU::sub1); + const DebugLoc &DL = I.getDebugLoc(); + + APInt Imm(Size, I.getOperand(1).getImm()); + + MachineInstr *ResInst; + if (IsSgpr && TII.isInlineConstant(Imm)) { + ResInst = BuildMI(*BB, &I, DL, TII.get(AMDGPU::S_MOV_B64), DstReg) + .addImm(I.getOperand(1).getImm()); + } else { + const TargetRegisterClass *RC = IsSgpr ? + &AMDGPU::SReg_32_XM0RegClass : &AMDGPU::VGPR_32RegClass; + Register LoReg = MRI->createVirtualRegister(RC); + Register HiReg = MRI->createVirtualRegister(RC); + + BuildMI(*BB, &I, DL, TII.get(Opcode), LoReg) + .addImm(Imm.trunc(32).getZExtValue()); + + BuildMI(*BB, &I, DL, TII.get(Opcode), HiReg) + .addImm(Imm.ashr(32).getZExtValue()); + + ResInst = BuildMI(*BB, &I, DL, TII.get(AMDGPU::REG_SEQUENCE), DstReg) + .addReg(LoReg) + .addImm(AMDGPU::sub0) + .addReg(HiReg) + .addImm(AMDGPU::sub1); + } // We can't call constrainSelectedInstRegOperands here, because it doesn't // work for target independent opcodes I.eraseFromParent(); const TargetRegisterClass *DstRC = - TRI.getConstrainedRegClassForOperand(RS->getOperand(0), *MRI); + TRI.getConstrainedRegClassForOperand(ResInst->getOperand(0), *MRI); if (!DstRC) return true; return RBI.constrainGenericRegister(DstReg, *DstRC, *MRI); Modified: llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/inst-select-constant.mir URL: 
http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/inst-select-constant.mir?rev=373943&r1=373942&r2=373943&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/inst-select-constant.mir (original) +++ llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/inst-select-constant.mir Mon Oct 7 12:07:19 2019 @@ -5,6 +5,7 @@ name: constant legalized: true regBankSelected: true +tracksRegLiveness: true body: | @@ -25,28 +26,30 @@ body: | ; GCN: %{{[0-9]+}}:sreg_32 = S_MOV_B32 1065353216 %4:sgpr(s32) = G_FCONSTANT float 1.0 + ; GCN: %5:sreg_64_xexec = S_MOV_B64 4607182418800017408 + %5:sgpr(s64) = G_FCONSTANT double 1.0 + ; GCN: [[LO1:%[0-9]+]]:sreg_32_xm0 = S_MOV_B32 0 - ; GCN: [[HI1:%[0-9]+]]:sreg_32_xm0 = S_MOV_B32 1072693248 + ; GCN: [[HI1:%[0-9]+]]:sreg_32_xm0 = S_MOV_B32 1076101120 ; GCN: %{{[0-9]+}}:sreg_64_xexec = REG_SEQUENCE [[LO1]], %subreg.sub0, [[HI1]], %subreg.sub1 - %5:sgpr(s64) = G_FCONSTANT double 1.0 + %6:sgpr(s64) = G_FCONSTANT double 10.0 ; GCN: %{{[0-9]+}}:vgpr_32 = V_MOV_B32_e32 1 - %6:vgpr(s32) = G_CONSTANT i32 1 + %7:vgpr(s32) = G_CONSTANT i32 1 ; GCN: [[LO2:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 0 ; GCN: [[HI2:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 1 ; GCN: %{{[0-9]+}}:vreg_64 = REG_SEQUENCE [[LO2]], %subreg.sub0, [[HI2]], %subreg.sub1 - %7:vgpr(s64) = G_CONSTANT i64 4294967296 + %8:vgpr(s64) = G_CONSTANT i64 4294967296 ; GCN: %{{[0-9]+}}:vgpr_32 = V_MOV_B32_e32 1065353216 - %8:vgpr(s32) = G_FCONSTANT float 1.0 + %9:vgpr(s32) = G_FCONSTANT float 1.0 ; GCN: [[LO3:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 0 ; GCN: [[HI3:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 1072693248 ; GCN: %{{[0-9]+}}:vreg_64 = REG_SEQUENCE [[LO3]], %subreg.sub0, [[HI3]], %subreg.sub1 - %9:vgpr(s64) = G_FCONSTANT double 1.0 + %10:vgpr(s64) = G_FCONSTANT double 1.0 - S_ENDPGM 0, implicit %2, implicit %4, implicit %6, implicit %8, implicit %3, implicit %5, implicit %7, implicit %9 + S_ENDPGM 0, implicit %2, 
implicit %4, implicit %5, implicit %6, implicit %8, implicit %3, implicit %5, implicit %7, implicit %9, implicit %10 ... - Modified: llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-smrd.mir URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-smrd.mir?rev=373943&r1=373942&r2=373943&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-smrd.mir (original) +++ llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-smrd.mir Mon Oct 7 12:07:19 2019 @@ -190,9 +190,7 @@ body: | # Test a load of an offset from a constant base address # GCN-LABEL: name: constant_address_positive{{$}} -# GCN: %4:sreg_32_xm0 = S_MOV_B32 44 -# GCN: %5:sreg_32_xm0 = S_MOV_B32 0 -# GCN: %0:sreg_64 = REG_SEQUENCE %4, %subreg.sub0, %5, %subreg.sub1 +# GCN: %0:sreg_64 = S_MOV_B64 44 # VI: %3:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM %0, 64, 0, 0 :: (dereferenceable invariant load 4, addrspace 4) # SICI: %3:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM %0, 16, 0, 0 :: (dereferenceable invariant load 4, addrspace 4) From llvm-commits at lists.llvm.org Mon Oct 7 12:05:06 2019 From: llvm-commits at lists.llvm.org (Matt Arsenault via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 19:05:06 +0000 (UTC) Subject: [PATCH] D68437: AMDGPU/GlobalISel: Use S_MOV_B64 for inline constants In-Reply-To: References: Message-ID: <9b73600525a963374ba551199fb9c384@localhost.localdomain> arsenm closed this revision. arsenm added a comment. 
r373943 CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68437/new/ https://reviews.llvm.org/D68437 From llvm-commits at lists.llvm.org Mon Oct 7 12:05:44 2019 From: llvm-commits at lists.llvm.org (Aditya Kumar via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 19:05:44 +0000 (UTC) Subject: [PATCH] D68093: [clang-scan-deps][static analyzer] Support for clang --analyze in scan-deps In-Reply-To: References: Message-ID: <113d4048aaf71de064e25ad2768ade14@localhost.localdomain> hiraditya added inline comments. ================ Comment at: clang/include/clang/Driver/CC1Options.td:849 HelpText<"include a detailed record of preprocessing actions">; +def setup_static_analyzer : Flag<["-"], "setup-static-analyzer">, + HelpText<"Set up preprocessor for static analyzer (done automatically when static analyzer is run).">; ---------------- The name doesn't quite reflect what it does. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68093/new/ https://reviews.llvm.org/D68093 From llvm-commits at lists.llvm.org Mon Oct 7 12:05:49 2019 From: llvm-commits at lists.llvm.org (Alexei Starovoitov via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 19:05:49 +0000 (UTC) Subject: [PATCH] D67980: [BPF] do compile-once run-everywhere relocation for bitfields In-Reply-To: References: Message-ID: ast added a comment. 
thanks for adding the tests Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67980/new/ https://reviews.llvm.org/D67980 From llvm-commits at lists.llvm.org Mon Oct 7 12:05:44 2019 From: llvm-commits at lists.llvm.org (Florian Hahn via llvm-commits) Date: Mon, 07 Oct 2019 20:05:44 +0100 Subject: [llvm] r373935 - Second attempt to add iterator_range::empty() In-Reply-To: <20191007181424.DB26A883ED@lists.llvm.org> References: <20191007181424.DB26A883ED@lists.llvm.org> Message-ID: <66A26545-3417-4899-9E92-A0AF52888FD5@apple.com> > On Oct 7, 2019, at 19:14, Jordan Rose via llvm-commits wrote: > > Author: jrose > Date: Mon Oct 7 11:14:24 2019 > New Revision: 373935 > > URL: http://llvm.org/viewvc/llvm-project?rev=373935&view=rev > Log: > Second attempt to add iterator_range::empty() That’s awesome, thanks! From llvm-commits at lists.llvm.org Mon Oct 7 12:10:43 2019 From: llvm-commits at lists.llvm.org (Matt Arsenault via llvm-commits) Date: Mon, 07 Oct 2019 19:10:43 -0000 Subject: [llvm] r373944 - AMDGPU/GlobalISel: Select VALU G_AMDGPU_FFBH_U32 Message-ID: <20191007191043.78547870FC@lists.llvm.org> Author: arsenm Date: Mon Oct 7 12:10:43 2019 New Revision: 373944 URL: http://llvm.org/viewvc/llvm-project?rev=373944&view=rev Log: AMDGPU/GlobalISel: Select VALU G_AMDGPU_FFBH_U32 Modified: llvm/trunk/lib/Target/AMDGPU/VOP1Instructions.td llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/inst-select-amdgpu-ffbh-u32.mir Modified: llvm/trunk/lib/Target/AMDGPU/VOP1Instructions.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/VOP1Instructions.td?rev=373944&r1=373943&r2=373944&view=diff ============================================================================== --- llvm/trunk/lib/Target/AMDGPU/VOP1Instructions.td (original) +++ llvm/trunk/lib/Target/AMDGPU/VOP1Instructions.td Mon Oct 7 12:10:43 2019 @@ -235,7 +235,7 @@ defm V_COS_F32 : VOP1Inst <"v_cos_f32", defm V_NOT_B32 : VOP1Inst <"v_not_b32", VOP_I32_I32>; defm 
V_BFREV_B32 : VOP1Inst <"v_bfrev_b32", VOP_I32_I32, bitreverse>; -defm V_FFBH_U32 : VOP1Inst <"v_ffbh_u32", VOP_I32_I32>; +defm V_FFBH_U32 : VOP1Inst <"v_ffbh_u32", VOP_I32_I32, AMDGPUffbh_u32>; defm V_FFBL_B32 : VOP1Inst <"v_ffbl_b32", VOP_I32_I32>; defm V_FFBH_I32 : VOP1Inst <"v_ffbh_i32", VOP_I32_I32, AMDGPUffbh_i32>; Modified: llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/inst-select-amdgpu-ffbh-u32.mir URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/inst-select-amdgpu-ffbh-u32.mir?rev=373944&r1=373943&r2=373944&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/inst-select-amdgpu-ffbh-u32.mir (original) +++ llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/inst-select-amdgpu-ffbh-u32.mir Mon Oct 7 12:10:43 2019 @@ -1,5 +1,5 @@ # NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py -# RUN: llc -march=amdgcn -run-pass=instruction-select -verify-machineinstrs -global-isel-abort=0 %s -o - | FileCheck %s +# RUN: llc -march=amdgcn -run-pass=instruction-select -verify-machineinstrs %s -o - | FileCheck %s --- @@ -36,9 +36,9 @@ body: | ; CHECK-LABEL: name: ffbh_u32_s32_v_v ; CHECK: liveins: $vgpr0 - ; CHECK: [[COPY:%[0-9]+]]:vgpr(s32) = COPY $vgpr0 - ; CHECK: [[AMDGPU_FFBH_U32_:%[0-9]+]]:vgpr(s32) = G_AMDGPU_FFBH_U32 [[COPY]](s32) - ; CHECK: S_ENDPGM 0, implicit [[AMDGPU_FFBH_U32_]](s32) + ; CHECK: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0 + ; CHECK: [[V_FFBH_U32_e64_:%[0-9]+]]:vgpr_32 = V_FFBH_U32_e64 [[COPY]], implicit $exec + ; CHECK: S_ENDPGM 0, implicit [[V_FFBH_U32_e64_]] %0:vgpr(s32) = COPY $vgpr0 %1:vgpr(s32) = G_AMDGPU_FFBH_U32 %0 S_ENDPGM 0, implicit %1 @@ -58,9 +58,9 @@ body: | ; CHECK-LABEL: name: ffbh_u32_v_s ; CHECK: liveins: $sgpr0 - ; CHECK: [[COPY:%[0-9]+]]:sgpr(s32) = COPY $sgpr0 - ; CHECK: [[AMDGPU_FFBH_U32_:%[0-9]+]]:vgpr(s32) = G_AMDGPU_FFBH_U32 [[COPY]](s32) - ; CHECK: S_ENDPGM 0, implicit [[AMDGPU_FFBH_U32_]](s32) + ; 
CHECK: [[COPY:%[0-9]+]]:sreg_32_xm0 = COPY $sgpr0 + ; CHECK: [[V_FFBH_U32_e64_:%[0-9]+]]:vgpr_32 = V_FFBH_U32_e64 [[COPY]], implicit $exec + ; CHECK: S_ENDPGM 0, implicit [[V_FFBH_U32_e64_]] %0:sgpr(s32) = COPY $sgpr0 %1:vgpr(s32) = G_AMDGPU_FFBH_U32 %0 S_ENDPGM 0, implicit %1 From llvm-commits at lists.llvm.org Mon Oct 7 12:10:44 2019 From: llvm-commits at lists.llvm.org (Matt Arsenault via llvm-commits) Date: Mon, 07 Oct 2019 19:10:44 -0000 Subject: [llvm] r373945 - AMDGPU/GlobalISel: Fix selection of 16-bit shifts Message-ID: <20191007191044.E9C938D51D@lists.llvm.org> Author: arsenm Date: Mon Oct 7 12:10:44 2019 New Revision: 373945 URL: http://llvm.org/viewvc/llvm-project?rev=373945&view=rev Log: AMDGPU/GlobalISel: Fix selection of 16-bit shifts Modified: llvm/trunk/lib/Target/AMDGPU/VOP2Instructions.td llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/inst-select-ashr.s16.mir llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/inst-select-lshr.s16.mir llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/inst-select-shl.s16.mir Modified: llvm/trunk/lib/Target/AMDGPU/VOP2Instructions.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/VOP2Instructions.td?rev=373945&r1=373944&r2=373945&view=diff ============================================================================== --- llvm/trunk/lib/Target/AMDGPU/VOP2Instructions.td (original) +++ llvm/trunk/lib/Target/AMDGPU/VOP2Instructions.td Mon Oct 7 12:10:44 2019 @@ -752,19 +752,22 @@ multiclass Bits_OpsRev_i16_Pats ; def : GCNPat< (i32 (zext (op i16:$src0, i16:$src1))), - !if(!eq(PreservesHI16,1), (ClearHI16 (inst $src1, $src0)), (inst $src1, $src0)) + !if(!eq(PreservesHI16,1), (ClearHI16 (inst VSrc_b32:$src1, VSrc_b32:$src0)), + (inst VSrc_b32:$src1, VSrc_b32:$src0)) >; def : GCNPat< (i64 (zext (op i16:$src0, i16:$src1))), (REG_SEQUENCE VReg_64, - !if(!eq(PreservesHI16,1), (ClearHI16 (inst $src1, $src0)), (inst $src1, $src0)), + !if(!eq(PreservesHI16,1), (ClearHI16 (inst VSrc_b32:$src1, VSrc_b32:$src0)), + (inst 
VSrc_b32:$src1, VSrc_b32:$src0)), sub0, (V_MOV_B32_e32 (i32 0)), sub1) >; Modified: llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/inst-select-ashr.s16.mir URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/inst-select-ashr.s16.mir?rev=373945&r1=373944&r2=373945&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/inst-select-ashr.s16.mir (original) +++ llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/inst-select-ashr.s16.mir Mon Oct 7 12:10:44 2019 @@ -10,51 +10,258 @@ # RUN: FileCheck -check-prefixes=ERR-GFX910,ERR %s < %t # ERR-NOT: remark -# ERR-GFX8: remark: :0:0: cannot select: %3:sgpr(s16) = G_ASHR %2:sgpr, %1:sgpr(s32) (in function: ashr_s16_ss) -# ERR-GFX8-NEXT: remark: :0:0: cannot select: %3:vgpr(s16) = G_ASHR %2:sgpr, %1:vgpr(s32) (in function: ashr_s16_sv) -# ERR-GFX8-NEXT: remark: :0:0: cannot select: %3:vgpr(s16) = G_ASHR %2:vgpr, %1:sgpr(s32) (in function: ashr_s16_vs) -# ERR-GFX8-NEXT: remark: :0:0: cannot select: %3:vgpr(s16) = G_ASHR %2:vgpr, %1:vgpr(s32) (in function: ashr_s16_vv) - -# ERR-GFX910: remark: :0:0: cannot select: %3:sgpr(s16) = G_ASHR %2:sgpr, %1:sgpr(s32) (in function: ashr_s16_ss) -# ERR-GFX910-NEXT: remark: :0:0: cannot select: %3:vgpr(s16) = G_ASHR %2:sgpr, %1:vgpr(s32) (in function: ashr_s16_sv) -# ERR-GFX910-NEXT: remark: :0:0: cannot select: %3:vgpr(s16) = G_ASHR %2:vgpr, %1:sgpr(s32) (in function: ashr_s16_vs) -# ERR-GFX910-NEXT: remark: :0:0: cannot select: %3:vgpr(s16) = G_ASHR %2:vgpr, %1:vgpr(s32) (in function: ashr_s16_vv) - +# ERR: remark: :0:0: cannot select: %4:sgpr(s16) = G_ASHR %2:sgpr, %3:sgpr(s16) (in function: ashr_s16_s16_ss) +# ERR-NEXT: remark: :0:0: cannot select: %3:vgpr(s16) = G_ASHR %2:vgpr, %1:vgpr(s32) (in function: ashr_s16_s32_vv) +# ERR-NEXT: remark: :0:0: cannot select: %5:vgpr(s64) = G_ZEXT %4:vgpr(s16) (in function: ashr_s16_vv_zext_to_s64) +# ERR-NEXT: remark: :0:0: cannot select: %3:sgpr(s16) = 
G_ASHR %2:sgpr, %1:sgpr(s32) (in function: ashr_s16_s32_ss) +# ERR-NEXT: remark: :0:0: cannot select: %3:vgpr(s16) = G_ASHR %2:sgpr, %1:vgpr(s32) (in function: ashr_s16_s32_sv) +# ERR-NEXT: remark: :0:0: cannot select: %3:vgpr(s16) = G_ASHR %2:vgpr, %1:sgpr(s32) (in function: ashr_s16_s32_vs) # ERR-NOT: remark --- -name: ashr_s16_ss +name: ashr_s16_s16_ss +legalized: true +regBankSelected: true + +body: | + bb.0: + liveins: $sgpr0, $sgpr1 + + ; GFX8-LABEL: name: ashr_s16_s16_ss + ; GFX8: [[COPY:%[0-9]+]]:sgpr(s32) = COPY $sgpr0 + ; GFX8: [[COPY1:%[0-9]+]]:sgpr(s32) = COPY $sgpr1 + ; GFX8: [[TRUNC:%[0-9]+]]:sgpr(s16) = G_TRUNC [[COPY]](s32) + ; GFX8: [[TRUNC1:%[0-9]+]]:sgpr(s16) = G_TRUNC [[COPY1]](s32) + ; GFX8: [[ASHR:%[0-9]+]]:sgpr(s16) = G_ASHR [[TRUNC]], [[TRUNC1]](s16) + ; GFX8: S_ENDPGM 0, implicit [[ASHR]](s16) + ; GFX9-LABEL: name: ashr_s16_s16_ss + ; GFX9: [[COPY:%[0-9]+]]:sgpr(s32) = COPY $sgpr0 + ; GFX9: [[COPY1:%[0-9]+]]:sgpr(s32) = COPY $sgpr1 + ; GFX9: [[TRUNC:%[0-9]+]]:sgpr(s16) = G_TRUNC [[COPY]](s32) + ; GFX9: [[TRUNC1:%[0-9]+]]:sgpr(s16) = G_TRUNC [[COPY1]](s32) + ; GFX9: [[ASHR:%[0-9]+]]:sgpr(s16) = G_ASHR [[TRUNC]], [[TRUNC1]](s16) + ; GFX9: S_ENDPGM 0, implicit [[ASHR]](s16) + ; GFX10-LABEL: name: ashr_s16_s16_ss + ; GFX10: [[COPY:%[0-9]+]]:sgpr(s32) = COPY $sgpr0 + ; GFX10: [[COPY1:%[0-9]+]]:sgpr(s32) = COPY $sgpr1 + ; GFX10: [[TRUNC:%[0-9]+]]:sgpr(s16) = G_TRUNC [[COPY]](s32) + ; GFX10: [[TRUNC1:%[0-9]+]]:sgpr(s16) = G_TRUNC [[COPY1]](s32) + ; GFX10: [[ASHR:%[0-9]+]]:sgpr(s16) = G_ASHR [[TRUNC]], [[TRUNC1]](s16) + ; GFX10: S_ENDPGM 0, implicit [[ASHR]](s16) + %0:sgpr(s32) = COPY $sgpr0 + %1:sgpr(s32) = COPY $sgpr1 + %2:sgpr(s16) = G_TRUNC %0 + %3:sgpr(s16) = G_TRUNC %1 + %4:sgpr(s16) = G_ASHR %2, %3 + S_ENDPGM 0, implicit %4 +... 
+ +--- +name: ashr_s16_s16_vs +legalized: true +regBankSelected: true + +body: | + bb.0: + liveins: $sgpr0, $vgpr0 + ; GFX8-LABEL: name: ashr_s16_s16_vs + ; GFX8: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0 + ; GFX8: [[COPY1:%[0-9]+]]:sreg_32_xm0 = COPY $sgpr0 + ; GFX8: [[V_ASHRREV_I16_e64_:%[0-9]+]]:vgpr_32 = V_ASHRREV_I16_e64 [[COPY1]], [[COPY]], implicit $exec + ; GFX8: S_ENDPGM 0, implicit [[V_ASHRREV_I16_e64_]] + ; GFX9-LABEL: name: ashr_s16_s16_vs + ; GFX9: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0 + ; GFX9: [[COPY1:%[0-9]+]]:sreg_32_xm0 = COPY $sgpr0 + ; GFX9: [[V_ASHRREV_I16_e64_:%[0-9]+]]:vgpr_32 = V_ASHRREV_I16_e64 [[COPY1]], [[COPY]], implicit $exec + ; GFX9: S_ENDPGM 0, implicit [[V_ASHRREV_I16_e64_]] + ; GFX10-LABEL: name: ashr_s16_s16_vs + ; GFX10: $vcc_hi = IMPLICIT_DEF + ; GFX10: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0 + ; GFX10: [[COPY1:%[0-9]+]]:sreg_32_xm0 = COPY $sgpr0 + ; GFX10: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 65535, implicit $exec + ; GFX10: [[V_ASHRREV_I16_e64_:%[0-9]+]]:vgpr_32 = V_ASHRREV_I16_e64 [[COPY1]], [[COPY]], implicit $exec + ; GFX10: [[V_AND_B32_e64_:%[0-9]+]]:vgpr_32 = V_AND_B32_e64 [[V_ASHRREV_I16_e64_]], [[V_MOV_B32_e32_]], implicit $exec + ; GFX10: S_ENDPGM 0, implicit [[V_AND_B32_e64_]] + %0:vgpr(s32) = COPY $vgpr0 + %1:sgpr(s32) = COPY $sgpr0 + %2:vgpr(s16) = G_TRUNC %0 + %3:sgpr(s16) = G_TRUNC %1 + %4:vgpr(s16) = G_ASHR %2, %3 + S_ENDPGM 0, implicit %4 +... 
+ +--- +name: ashr_s16_s32_vv +legalized: true +regBankSelected: true + +body: | + bb.0: + liveins: $vgpr0, $vgpr1 + + ; GFX8-LABEL: name: ashr_s16_s32_vv + ; GFX8: [[COPY:%[0-9]+]]:vgpr(s32) = COPY $vgpr0 + ; GFX8: [[COPY1:%[0-9]+]]:vgpr(s32) = COPY $vgpr1 + ; GFX8: [[TRUNC:%[0-9]+]]:vgpr(s16) = G_TRUNC [[COPY]](s32) + ; GFX8: [[ASHR:%[0-9]+]]:vgpr(s16) = G_ASHR [[TRUNC]], [[COPY1]](s32) + ; GFX8: S_ENDPGM 0, implicit [[ASHR]](s16) + ; GFX9-LABEL: name: ashr_s16_s32_vv + ; GFX9: [[COPY:%[0-9]+]]:vgpr(s32) = COPY $vgpr0 + ; GFX9: [[COPY1:%[0-9]+]]:vgpr(s32) = COPY $vgpr1 + ; GFX9: [[TRUNC:%[0-9]+]]:vgpr(s16) = G_TRUNC [[COPY]](s32) + ; GFX9: [[ASHR:%[0-9]+]]:vgpr(s16) = G_ASHR [[TRUNC]], [[COPY1]](s32) + ; GFX9: S_ENDPGM 0, implicit [[ASHR]](s16) + ; GFX10-LABEL: name: ashr_s16_s32_vv + ; GFX10: [[COPY:%[0-9]+]]:vgpr(s32) = COPY $vgpr0 + ; GFX10: [[COPY1:%[0-9]+]]:vgpr(s32) = COPY $vgpr1 + ; GFX10: [[TRUNC:%[0-9]+]]:vgpr(s16) = G_TRUNC [[COPY]](s32) + ; GFX10: [[ASHR:%[0-9]+]]:vgpr(s16) = G_ASHR [[TRUNC]], [[COPY1]](s32) + ; GFX10: S_ENDPGM 0, implicit [[ASHR]](s16) + %0:vgpr(s32) = COPY $vgpr0 + %1:vgpr(s32) = COPY $vgpr1 + %2:vgpr(s16) = G_TRUNC %0 + %3:vgpr(s16) = G_ASHR %2, %1 + S_ENDPGM 0, implicit %3 +... 
+ +--- +name: ashr_s16_s16_vv +legalized: true +regBankSelected: true + +body: | + bb.0: + liveins: $vgpr0, $vgpr1 + + ; GFX8-LABEL: name: ashr_s16_s16_vv + ; GFX8: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0 + ; GFX8: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1 + ; GFX8: [[V_ASHRREV_I16_e64_:%[0-9]+]]:vgpr_32 = V_ASHRREV_I16_e64 [[COPY1]], [[COPY]], implicit $exec + ; GFX8: S_ENDPGM 0, implicit [[V_ASHRREV_I16_e64_]] + ; GFX9-LABEL: name: ashr_s16_s16_vv + ; GFX9: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0 + ; GFX9: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1 + ; GFX9: [[V_ASHRREV_I16_e64_:%[0-9]+]]:vgpr_32 = V_ASHRREV_I16_e64 [[COPY1]], [[COPY]], implicit $exec + ; GFX9: S_ENDPGM 0, implicit [[V_ASHRREV_I16_e64_]] + ; GFX10-LABEL: name: ashr_s16_s16_vv + ; GFX10: $vcc_hi = IMPLICIT_DEF + ; GFX10: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0 + ; GFX10: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1 + ; GFX10: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 65535, implicit $exec + ; GFX10: [[V_ASHRREV_I16_e64_:%[0-9]+]]:vgpr_32 = V_ASHRREV_I16_e64 [[COPY1]], [[COPY]], implicit $exec + ; GFX10: [[V_AND_B32_e64_:%[0-9]+]]:vgpr_32 = V_AND_B32_e64 [[V_ASHRREV_I16_e64_]], [[V_MOV_B32_e32_]], implicit $exec + ; GFX10: S_ENDPGM 0, implicit [[V_AND_B32_e64_]] + %0:vgpr(s32) = COPY $vgpr0 + %1:vgpr(s32) = COPY $vgpr1 + %2:vgpr(s16) = G_TRUNC %0 + %3:vgpr(s16) = G_TRUNC %1 + %4:vgpr(s16) = G_ASHR %2, %3 + S_ENDPGM 0, implicit %4 +... 
+ +--- +name: ashr_s16_s16_vv_zext_to_s32 +legalized: true +regBankSelected: true + +body: | + bb.0: + liveins: $vgpr0, $vgpr1 + + ; GFX8-LABEL: name: ashr_s16_s16_vv_zext_to_s32 + ; GFX8: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0 + ; GFX8: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1 + ; GFX8: [[V_ASHRREV_I16_e64_:%[0-9]+]]:vgpr_32 = V_ASHRREV_I16_e64 [[COPY1]], [[COPY]], implicit $exec + ; GFX8: [[V_BFE_U32_:%[0-9]+]]:vgpr_32 = V_BFE_U32 [[V_ASHRREV_I16_e64_]], 0, 16, implicit $exec + ; GFX8: S_ENDPGM 0, implicit [[V_BFE_U32_]] + ; GFX9-LABEL: name: ashr_s16_s16_vv_zext_to_s32 + ; GFX9: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0 + ; GFX9: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1 + ; GFX9: [[V_ASHRREV_I16_e64_:%[0-9]+]]:vgpr_32 = V_ASHRREV_I16_e64 [[COPY1]], [[COPY]], implicit $exec + ; GFX9: [[V_BFE_U32_:%[0-9]+]]:vgpr_32 = V_BFE_U32 [[V_ASHRREV_I16_e64_]], 0, 16, implicit $exec + ; GFX9: S_ENDPGM 0, implicit [[V_BFE_U32_]] + ; GFX10-LABEL: name: ashr_s16_s16_vv_zext_to_s32 + ; GFX10: $vcc_hi = IMPLICIT_DEF + ; GFX10: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0 + ; GFX10: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1 + ; GFX10: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 65535, implicit $exec + ; GFX10: [[V_ASHRREV_I16_e64_:%[0-9]+]]:vgpr_32 = V_ASHRREV_I16_e64 [[COPY1]], [[COPY]], implicit $exec + ; GFX10: [[V_AND_B32_e64_:%[0-9]+]]:vgpr_32 = V_AND_B32_e64 [[V_ASHRREV_I16_e64_]], [[V_MOV_B32_e32_]], implicit $exec + ; GFX10: [[V_BFE_U32_:%[0-9]+]]:vgpr_32 = V_BFE_U32 [[V_AND_B32_e64_]], 0, 16, implicit $exec + ; GFX10: S_ENDPGM 0, implicit [[V_BFE_U32_]] + %0:vgpr(s32) = COPY $vgpr0 + %1:vgpr(s32) = COPY $vgpr1 + %2:vgpr(s16) = G_TRUNC %0 + %3:vgpr(s16) = G_TRUNC %1 + %4:vgpr(s16) = G_ASHR %2, %3 + %5:vgpr(s32) = G_ZEXT %4 + S_ENDPGM 0, implicit %5 +... 
+ +--- +name: ashr_s16_vv_zext_to_s64 +legalized: true +regBankSelected: true + +body: | + bb.0: + liveins: $vgpr0, $vgpr1 + + ; GFX8-LABEL: name: ashr_s16_vv_zext_to_s64 + ; GFX8: [[COPY:%[0-9]+]]:vgpr(s32) = COPY $vgpr0 + ; GFX8: [[COPY1:%[0-9]+]]:vgpr(s32) = COPY $vgpr1 + ; GFX8: [[TRUNC:%[0-9]+]]:vgpr(s16) = G_TRUNC [[COPY]](s32) + ; GFX8: [[TRUNC1:%[0-9]+]]:vgpr(s16) = G_TRUNC [[COPY1]](s32) + ; GFX8: [[ASHR:%[0-9]+]]:vgpr(s16) = G_ASHR [[TRUNC]], [[TRUNC1]](s16) + ; GFX8: [[ZEXT:%[0-9]+]]:vgpr(s64) = G_ZEXT [[ASHR]](s16) + ; GFX8: S_ENDPGM 0, implicit [[ZEXT]](s64) + ; GFX9-LABEL: name: ashr_s16_vv_zext_to_s64 + ; GFX9: [[COPY:%[0-9]+]]:vgpr(s32) = COPY $vgpr0 + ; GFX9: [[COPY1:%[0-9]+]]:vgpr(s32) = COPY $vgpr1 + ; GFX9: [[TRUNC:%[0-9]+]]:vgpr(s16) = G_TRUNC [[COPY]](s32) + ; GFX9: [[TRUNC1:%[0-9]+]]:vgpr(s16) = G_TRUNC [[COPY1]](s32) + ; GFX9: [[ASHR:%[0-9]+]]:vgpr(s16) = G_ASHR [[TRUNC]], [[TRUNC1]](s16) + ; GFX9: [[ZEXT:%[0-9]+]]:vgpr(s64) = G_ZEXT [[ASHR]](s16) + ; GFX9: S_ENDPGM 0, implicit [[ZEXT]](s64) + ; GFX10-LABEL: name: ashr_s16_vv_zext_to_s64 + ; GFX10: [[COPY:%[0-9]+]]:vgpr(s32) = COPY $vgpr0 + ; GFX10: [[COPY1:%[0-9]+]]:vgpr(s32) = COPY $vgpr1 + ; GFX10: [[TRUNC:%[0-9]+]]:vgpr(s16) = G_TRUNC [[COPY]](s32) + ; GFX10: [[TRUNC1:%[0-9]+]]:vgpr(s16) = G_TRUNC [[COPY1]](s32) + ; GFX10: [[ASHR:%[0-9]+]]:vgpr(s16) = G_ASHR [[TRUNC]], [[TRUNC1]](s16) + ; GFX10: [[ZEXT:%[0-9]+]]:vgpr(s64) = G_ZEXT [[ASHR]](s16) + ; GFX10: S_ENDPGM 0, implicit [[ZEXT]](s64) + %0:vgpr(s32) = COPY $vgpr0 + %1:vgpr(s32) = COPY $vgpr1 + %2:vgpr(s16) = G_TRUNC %0 + %3:vgpr(s16) = G_TRUNC %1 + %4:vgpr(s16) = G_ASHR %2, %3 + %5:vgpr(s64) = G_ZEXT %4 + S_ENDPGM 0, implicit %5 +... 
+ +--- +name: ashr_s16_s32_ss legalized: true regBankSelected: true body: | bb.0: liveins: $sgpr0, $sgpr1 - ; GFX6-LABEL: name: ashr_s16_ss - ; GFX6: [[COPY:%[0-9]+]]:sgpr(s32) = COPY $sgpr0 - ; GFX6: [[COPY1:%[0-9]+]]:sgpr(s32) = COPY $sgpr1 - ; GFX6: [[TRUNC:%[0-9]+]]:sgpr(s16) = G_TRUNC [[COPY]](s32) - ; GFX6: [[ASHR:%[0-9]+]]:sgpr(s16) = G_ASHR [[TRUNC]], [[COPY1]](s32) - ; GFX6: S_ENDPGM 0, implicit [[ASHR]](s16) - ; GFX7-LABEL: name: ashr_s16_ss - ; GFX7: [[COPY:%[0-9]+]]:sgpr(s32) = COPY $sgpr0 - ; GFX7: [[COPY1:%[0-9]+]]:sgpr(s32) = COPY $sgpr1 - ; GFX7: [[TRUNC:%[0-9]+]]:sgpr(s16) = G_TRUNC [[COPY]](s32) - ; GFX7: [[ASHR:%[0-9]+]]:sgpr(s16) = G_ASHR [[TRUNC]], [[COPY1]](s32) - ; GFX7: S_ENDPGM 0, implicit [[ASHR]](s16) - ; GFX8-LABEL: name: ashr_s16_ss + + ; GFX8-LABEL: name: ashr_s16_s32_ss ; GFX8: [[COPY:%[0-9]+]]:sgpr(s32) = COPY $sgpr0 ; GFX8: [[COPY1:%[0-9]+]]:sgpr(s32) = COPY $sgpr1 ; GFX8: [[TRUNC:%[0-9]+]]:sgpr(s16) = G_TRUNC [[COPY]](s32) ; GFX8: [[ASHR:%[0-9]+]]:sgpr(s16) = G_ASHR [[TRUNC]], [[COPY1]](s32) ; GFX8: S_ENDPGM 0, implicit [[ASHR]](s16) - ; GFX9-LABEL: name: ashr_s16_ss + ; GFX9-LABEL: name: ashr_s16_s32_ss ; GFX9: [[COPY:%[0-9]+]]:sgpr(s32) = COPY $sgpr0 ; GFX9: [[COPY1:%[0-9]+]]:sgpr(s32) = COPY $sgpr1 ; GFX9: [[TRUNC:%[0-9]+]]:sgpr(s16) = G_TRUNC [[COPY]](s32) ; GFX9: [[ASHR:%[0-9]+]]:sgpr(s16) = G_ASHR [[TRUNC]], [[COPY1]](s32) ; GFX9: S_ENDPGM 0, implicit [[ASHR]](s16) - ; GFX10-LABEL: name: ashr_s16_ss + ; GFX10-LABEL: name: ashr_s16_s32_ss ; GFX10: [[COPY:%[0-9]+]]:sgpr(s32) = COPY $sgpr0 ; GFX10: [[COPY1:%[0-9]+]]:sgpr(s32) = COPY $sgpr1 ; GFX10: [[TRUNC:%[0-9]+]]:sgpr(s16) = G_TRUNC [[COPY]](s32) @@ -68,38 +275,26 @@ body: | ... 
--- -name: ashr_s16_sv +name: ashr_s16_s32_sv legalized: true regBankSelected: true body: | bb.0: liveins: $sgpr0, $vgpr0 - ; GFX6-LABEL: name: ashr_s16_sv - ; GFX6: [[COPY:%[0-9]+]]:sgpr(s32) = COPY $sgpr0 - ; GFX6: [[COPY1:%[0-9]+]]:vgpr(s32) = COPY $vgpr0 - ; GFX6: [[TRUNC:%[0-9]+]]:sgpr(s16) = G_TRUNC [[COPY]](s32) - ; GFX6: [[ASHR:%[0-9]+]]:vgpr(s16) = G_ASHR [[TRUNC]], [[COPY1]](s32) - ; GFX6: S_ENDPGM 0, implicit [[ASHR]](s16) - ; GFX7-LABEL: name: ashr_s16_sv - ; GFX7: [[COPY:%[0-9]+]]:sgpr(s32) = COPY $sgpr0 - ; GFX7: [[COPY1:%[0-9]+]]:vgpr(s32) = COPY $vgpr0 - ; GFX7: [[TRUNC:%[0-9]+]]:sgpr(s16) = G_TRUNC [[COPY]](s32) - ; GFX7: [[ASHR:%[0-9]+]]:vgpr(s16) = G_ASHR [[TRUNC]], [[COPY1]](s32) - ; GFX7: S_ENDPGM 0, implicit [[ASHR]](s16) - ; GFX8-LABEL: name: ashr_s16_sv + ; GFX8-LABEL: name: ashr_s16_s32_sv ; GFX8: [[COPY:%[0-9]+]]:sgpr(s32) = COPY $sgpr0 ; GFX8: [[COPY1:%[0-9]+]]:vgpr(s32) = COPY $vgpr0 ; GFX8: [[TRUNC:%[0-9]+]]:sgpr(s16) = G_TRUNC [[COPY]](s32) ; GFX8: [[ASHR:%[0-9]+]]:vgpr(s16) = G_ASHR [[TRUNC]], [[COPY1]](s32) ; GFX8: S_ENDPGM 0, implicit [[ASHR]](s16) - ; GFX9-LABEL: name: ashr_s16_sv + ; GFX9-LABEL: name: ashr_s16_s32_sv ; GFX9: [[COPY:%[0-9]+]]:sgpr(s32) = COPY $sgpr0 ; GFX9: [[COPY1:%[0-9]+]]:vgpr(s32) = COPY $vgpr0 ; GFX9: [[TRUNC:%[0-9]+]]:sgpr(s16) = G_TRUNC [[COPY]](s32) ; GFX9: [[ASHR:%[0-9]+]]:vgpr(s16) = G_ASHR [[TRUNC]], [[COPY1]](s32) ; GFX9: S_ENDPGM 0, implicit [[ASHR]](s16) - ; GFX10-LABEL: name: ashr_s16_sv + ; GFX10-LABEL: name: ashr_s16_s32_sv ; GFX10: [[COPY:%[0-9]+]]:sgpr(s32) = COPY $sgpr0 ; GFX10: [[COPY1:%[0-9]+]]:vgpr(s32) = COPY $vgpr0 ; GFX10: [[TRUNC:%[0-9]+]]:sgpr(s16) = G_TRUNC [[COPY]](s32) @@ -113,90 +308,67 @@ body: | ... 
--- -name: ashr_s16_vs +name: ashr_s16_s16_sv legalized: true regBankSelected: true body: | bb.0: liveins: $sgpr0, $vgpr0 - ; GFX6-LABEL: name: ashr_s16_vs - ; GFX6: [[COPY:%[0-9]+]]:vgpr(s32) = COPY $vgpr0 - ; GFX6: [[COPY1:%[0-9]+]]:sgpr(s32) = COPY $sgpr0 - ; GFX6: [[TRUNC:%[0-9]+]]:vgpr(s16) = G_TRUNC [[COPY]](s32) - ; GFX6: [[ASHR:%[0-9]+]]:vgpr(s16) = G_ASHR [[TRUNC]], [[COPY1]](s32) - ; GFX6: S_ENDPGM 0, implicit [[ASHR]](s16) - ; GFX7-LABEL: name: ashr_s16_vs - ; GFX7: [[COPY:%[0-9]+]]:vgpr(s32) = COPY $vgpr0 - ; GFX7: [[COPY1:%[0-9]+]]:sgpr(s32) = COPY $sgpr0 - ; GFX7: [[TRUNC:%[0-9]+]]:vgpr(s16) = G_TRUNC [[COPY]](s32) - ; GFX7: [[ASHR:%[0-9]+]]:vgpr(s16) = G_ASHR [[TRUNC]], [[COPY1]](s32) - ; GFX7: S_ENDPGM 0, implicit [[ASHR]](s16) - ; GFX8-LABEL: name: ashr_s16_vs - ; GFX8: [[COPY:%[0-9]+]]:vgpr(s32) = COPY $vgpr0 - ; GFX8: [[COPY1:%[0-9]+]]:sgpr(s32) = COPY $sgpr0 - ; GFX8: [[TRUNC:%[0-9]+]]:vgpr(s16) = G_TRUNC [[COPY]](s32) - ; GFX8: [[ASHR:%[0-9]+]]:vgpr(s16) = G_ASHR [[TRUNC]], [[COPY1]](s32) - ; GFX8: S_ENDPGM 0, implicit [[ASHR]](s16) - ; GFX9-LABEL: name: ashr_s16_vs - ; GFX9: [[COPY:%[0-9]+]]:vgpr(s32) = COPY $vgpr0 - ; GFX9: [[COPY1:%[0-9]+]]:sgpr(s32) = COPY $sgpr0 - ; GFX9: [[TRUNC:%[0-9]+]]:vgpr(s16) = G_TRUNC [[COPY]](s32) - ; GFX9: [[ASHR:%[0-9]+]]:vgpr(s16) = G_ASHR [[TRUNC]], [[COPY1]](s32) - ; GFX9: S_ENDPGM 0, implicit [[ASHR]](s16) - ; GFX10-LABEL: name: ashr_s16_vs - ; GFX10: [[COPY:%[0-9]+]]:vgpr(s32) = COPY $vgpr0 - ; GFX10: [[COPY1:%[0-9]+]]:sgpr(s32) = COPY $sgpr0 - ; GFX10: [[TRUNC:%[0-9]+]]:vgpr(s16) = G_TRUNC [[COPY]](s32) - ; GFX10: [[ASHR:%[0-9]+]]:vgpr(s16) = G_ASHR [[TRUNC]], [[COPY1]](s32) - ; GFX10: S_ENDPGM 0, implicit [[ASHR]](s16) - %0:vgpr(s32) = COPY $vgpr0 - %1:sgpr(s32) = COPY $sgpr0 - %2:vgpr(s16) = G_TRUNC %0 - %3:vgpr(s16) = G_ASHR %2, %1 - S_ENDPGM 0, implicit %3 + ; GFX8-LABEL: name: ashr_s16_s16_sv + ; GFX8: [[COPY:%[0-9]+]]:sreg_32_xm0 = COPY $sgpr0 + ; GFX8: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr0 + ; 
GFX8: [[V_ASHRREV_I16_e64_:%[0-9]+]]:vgpr_32 = V_ASHRREV_I16_e64 [[COPY1]], [[COPY]], implicit $exec + ; GFX8: S_ENDPGM 0, implicit [[V_ASHRREV_I16_e64_]] + ; GFX9-LABEL: name: ashr_s16_s16_sv + ; GFX9: [[COPY:%[0-9]+]]:sreg_32_xm0 = COPY $sgpr0 + ; GFX9: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr0 + ; GFX9: [[V_ASHRREV_I16_e64_:%[0-9]+]]:vgpr_32 = V_ASHRREV_I16_e64 [[COPY1]], [[COPY]], implicit $exec + ; GFX9: S_ENDPGM 0, implicit [[V_ASHRREV_I16_e64_]] + ; GFX10-LABEL: name: ashr_s16_s16_sv + ; GFX10: $vcc_hi = IMPLICIT_DEF + ; GFX10: [[COPY:%[0-9]+]]:sreg_32_xm0 = COPY $sgpr0 + ; GFX10: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr0 + ; GFX10: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 65535, implicit $exec + ; GFX10: [[V_ASHRREV_I16_e64_:%[0-9]+]]:vgpr_32 = V_ASHRREV_I16_e64 [[COPY1]], [[COPY]], implicit $exec + ; GFX10: [[V_AND_B32_e64_:%[0-9]+]]:vgpr_32 = V_AND_B32_e64 [[V_ASHRREV_I16_e64_]], [[V_MOV_B32_e32_]], implicit $exec + ; GFX10: S_ENDPGM 0, implicit [[V_AND_B32_e64_]] + %0:sgpr(s32) = COPY $sgpr0 + %1:vgpr(s32) = COPY $vgpr0 + %2:sgpr(s16) = G_TRUNC %0 + %3:vgpr(s16) = G_TRUNC %1 + %4:vgpr(s16) = G_ASHR %2, %3 + S_ENDPGM 0, implicit %4 ... 
--- -name: ashr_s16_vv +name: ashr_s16_s32_vs legalized: true regBankSelected: true body: | bb.0: - liveins: $vgpr0, $vgpr1 - ; GFX6-LABEL: name: ashr_s16_vv - ; GFX6: [[COPY:%[0-9]+]]:vgpr(s32) = COPY $vgpr0 - ; GFX6: [[COPY1:%[0-9]+]]:vgpr(s32) = COPY $vgpr1 - ; GFX6: [[TRUNC:%[0-9]+]]:vgpr(s16) = G_TRUNC [[COPY]](s32) - ; GFX6: [[ASHR:%[0-9]+]]:vgpr(s16) = G_ASHR [[TRUNC]], [[COPY1]](s32) - ; GFX6: S_ENDPGM 0, implicit [[ASHR]](s16) - ; GFX7-LABEL: name: ashr_s16_vv - ; GFX7: [[COPY:%[0-9]+]]:vgpr(s32) = COPY $vgpr0 - ; GFX7: [[COPY1:%[0-9]+]]:vgpr(s32) = COPY $vgpr1 - ; GFX7: [[TRUNC:%[0-9]+]]:vgpr(s16) = G_TRUNC [[COPY]](s32) - ; GFX7: [[ASHR:%[0-9]+]]:vgpr(s16) = G_ASHR [[TRUNC]], [[COPY1]](s32) - ; GFX7: S_ENDPGM 0, implicit [[ASHR]](s16) - ; GFX8-LABEL: name: ashr_s16_vv + liveins: $sgpr0, $vgpr0 + ; GFX8-LABEL: name: ashr_s16_s32_vs ; GFX8: [[COPY:%[0-9]+]]:vgpr(s32) = COPY $vgpr0 - ; GFX8: [[COPY1:%[0-9]+]]:vgpr(s32) = COPY $vgpr1 + ; GFX8: [[COPY1:%[0-9]+]]:sgpr(s32) = COPY $sgpr0 ; GFX8: [[TRUNC:%[0-9]+]]:vgpr(s16) = G_TRUNC [[COPY]](s32) ; GFX8: [[ASHR:%[0-9]+]]:vgpr(s16) = G_ASHR [[TRUNC]], [[COPY1]](s32) ; GFX8: S_ENDPGM 0, implicit [[ASHR]](s16) - ; GFX9-LABEL: name: ashr_s16_vv + ; GFX9-LABEL: name: ashr_s16_s32_vs ; GFX9: [[COPY:%[0-9]+]]:vgpr(s32) = COPY $vgpr0 - ; GFX9: [[COPY1:%[0-9]+]]:vgpr(s32) = COPY $vgpr1 + ; GFX9: [[COPY1:%[0-9]+]]:sgpr(s32) = COPY $sgpr0 ; GFX9: [[TRUNC:%[0-9]+]]:vgpr(s16) = G_TRUNC [[COPY]](s32) ; GFX9: [[ASHR:%[0-9]+]]:vgpr(s16) = G_ASHR [[TRUNC]], [[COPY1]](s32) ; GFX9: S_ENDPGM 0, implicit [[ASHR]](s16) - ; GFX10-LABEL: name: ashr_s16_vv + ; GFX10-LABEL: name: ashr_s16_s32_vs ; GFX10: [[COPY:%[0-9]+]]:vgpr(s32) = COPY $vgpr0 - ; GFX10: [[COPY1:%[0-9]+]]:vgpr(s32) = COPY $vgpr1 + ; GFX10: [[COPY1:%[0-9]+]]:sgpr(s32) = COPY $sgpr0 ; GFX10: [[TRUNC:%[0-9]+]]:vgpr(s16) = G_TRUNC [[COPY]](s32) ; GFX10: [[ASHR:%[0-9]+]]:vgpr(s16) = G_ASHR [[TRUNC]], [[COPY1]](s32) ; GFX10: S_ENDPGM 0, implicit [[ASHR]](s16) %0:vgpr(s32) = 
COPY $vgpr0
-    %1:vgpr(s32) = COPY $vgpr1
+    %1:sgpr(s32) = COPY $sgpr0
     %2:vgpr(s16) = G_TRUNC %0
     %3:vgpr(s16) = G_ASHR %2, %1
     S_ENDPGM 0, implicit %3

Modified: llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/inst-select-lshr.s16.mir
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/inst-select-lshr.s16.mir?rev=373945&r1=373944&r2=373945&view=diff
==============================================================================
--- llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/inst-select-lshr.s16.mir (original)
+++ llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/inst-select-lshr.s16.mir Mon Oct 7 12:10:44 2019
@@ -10,51 +10,258 @@
 # RUN: FileCheck -check-prefixes=ERR-GFX910,ERR %s < %t
 # ERR-NOT: remark
-# ERR-GFX8: remark: :0:0: cannot select: %3:sgpr(s16) = G_LSHR %2:sgpr, %1:sgpr(s32) (in function: lshr_s16_ss)
-# ERR-GFX8-NEXT: remark: :0:0: cannot select: %3:vgpr(s16) = G_LSHR %2:sgpr, %1:vgpr(s32) (in function: lshr_s16_sv)
-# ERR-GFX8-NEXT: remark: :0:0: cannot select: %3:vgpr(s16) = G_LSHR %2:vgpr, %1:sgpr(s32) (in function: lshr_s16_vs)
-# ERR-GFX8-NEXT: remark: :0:0: cannot select: %3:vgpr(s16) = G_LSHR %2:vgpr, %1:vgpr(s32) (in function: lshr_s16_vv)
-
-# ERR-GFX910: remark: :0:0: cannot select: %3:sgpr(s16) = G_LSHR %2:sgpr, %1:sgpr(s32) (in function: lshr_s16_ss)
-# ERR-GFX910-NEXT: remark: :0:0: cannot select: %3:vgpr(s16) = G_LSHR %2:sgpr, %1:vgpr(s32) (in function: lshr_s16_sv)
-# ERR-GFX910-NEXT: remark: :0:0: cannot select: %3:vgpr(s16) = G_LSHR %2:vgpr, %1:sgpr(s32) (in function: lshr_s16_vs)
-# ERR-GFX910-NEXT: remark: :0:0: cannot select: %3:vgpr(s16) = G_LSHR %2:vgpr, %1:vgpr(s32) (in function: lshr_s16_vv)
-
+# ERR: remark: :0:0: cannot select: %4:sgpr(s16) = G_LSHR %2:sgpr, %3:sgpr(s16) (in function: lshr_s16_s16_ss)
+# ERR-NEXT: remark: :0:0: cannot select: %3:vgpr(s16) = G_LSHR %2:vgpr, %1:vgpr(s32) (in function: lshr_s16_s32_vv)
+# ERR-NEXT: remark: :0:0: cannot select: %5:vgpr(s64) = G_ZEXT %4:vgpr(s16) (in function:
lshr_s16_vv_zext_to_s64) +# ERR-NEXT: remark: :0:0: cannot select: %3:sgpr(s16) = G_LSHR %2:sgpr, %1:sgpr(s32) (in function: lshr_s16_s32_ss) +# ERR-NEXT: remark: :0:0: cannot select: %3:vgpr(s16) = G_LSHR %2:sgpr, %1:vgpr(s32) (in function: lshr_s16_s32_sv) +# ERR-NEXT: remark: :0:0: cannot select: %3:vgpr(s16) = G_LSHR %2:vgpr, %1:sgpr(s32) (in function: lshr_s16_s32_vs) # ERR-NOT: remark --- -name: lshr_s16_ss +name: lshr_s16_s16_ss +legalized: true +regBankSelected: true + +body: | + bb.0: + liveins: $sgpr0, $sgpr1 + + ; GFX8-LABEL: name: lshr_s16_s16_ss + ; GFX8: [[COPY:%[0-9]+]]:sgpr(s32) = COPY $sgpr0 + ; GFX8: [[COPY1:%[0-9]+]]:sgpr(s32) = COPY $sgpr1 + ; GFX8: [[TRUNC:%[0-9]+]]:sgpr(s16) = G_TRUNC [[COPY]](s32) + ; GFX8: [[TRUNC1:%[0-9]+]]:sgpr(s16) = G_TRUNC [[COPY1]](s32) + ; GFX8: [[LSHR:%[0-9]+]]:sgpr(s16) = G_LSHR [[TRUNC]], [[TRUNC1]](s16) + ; GFX8: S_ENDPGM 0, implicit [[LSHR]](s16) + ; GFX9-LABEL: name: lshr_s16_s16_ss + ; GFX9: [[COPY:%[0-9]+]]:sgpr(s32) = COPY $sgpr0 + ; GFX9: [[COPY1:%[0-9]+]]:sgpr(s32) = COPY $sgpr1 + ; GFX9: [[TRUNC:%[0-9]+]]:sgpr(s16) = G_TRUNC [[COPY]](s32) + ; GFX9: [[TRUNC1:%[0-9]+]]:sgpr(s16) = G_TRUNC [[COPY1]](s32) + ; GFX9: [[LSHR:%[0-9]+]]:sgpr(s16) = G_LSHR [[TRUNC]], [[TRUNC1]](s16) + ; GFX9: S_ENDPGM 0, implicit [[LSHR]](s16) + ; GFX10-LABEL: name: lshr_s16_s16_ss + ; GFX10: [[COPY:%[0-9]+]]:sgpr(s32) = COPY $sgpr0 + ; GFX10: [[COPY1:%[0-9]+]]:sgpr(s32) = COPY $sgpr1 + ; GFX10: [[TRUNC:%[0-9]+]]:sgpr(s16) = G_TRUNC [[COPY]](s32) + ; GFX10: [[TRUNC1:%[0-9]+]]:sgpr(s16) = G_TRUNC [[COPY1]](s32) + ; GFX10: [[LSHR:%[0-9]+]]:sgpr(s16) = G_LSHR [[TRUNC]], [[TRUNC1]](s16) + ; GFX10: S_ENDPGM 0, implicit [[LSHR]](s16) + %0:sgpr(s32) = COPY $sgpr0 + %1:sgpr(s32) = COPY $sgpr1 + %2:sgpr(s16) = G_TRUNC %0 + %3:sgpr(s16) = G_TRUNC %1 + %4:sgpr(s16) = G_LSHR %2, %3 + S_ENDPGM 0, implicit %4 +... 
+ +--- +name: lshr_s16_s16_vs +legalized: true +regBankSelected: true + +body: | + bb.0: + liveins: $sgpr0, $vgpr0 + ; GFX8-LABEL: name: lshr_s16_s16_vs + ; GFX8: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0 + ; GFX8: [[COPY1:%[0-9]+]]:sreg_32_xm0 = COPY $sgpr0 + ; GFX8: [[V_LSHRREV_B16_e64_:%[0-9]+]]:vgpr_32 = V_LSHRREV_B16_e64 [[COPY1]], [[COPY]], implicit $exec + ; GFX8: S_ENDPGM 0, implicit [[V_LSHRREV_B16_e64_]] + ; GFX9-LABEL: name: lshr_s16_s16_vs + ; GFX9: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0 + ; GFX9: [[COPY1:%[0-9]+]]:sreg_32_xm0 = COPY $sgpr0 + ; GFX9: [[V_LSHRREV_B16_e64_:%[0-9]+]]:vgpr_32 = V_LSHRREV_B16_e64 [[COPY1]], [[COPY]], implicit $exec + ; GFX9: S_ENDPGM 0, implicit [[V_LSHRREV_B16_e64_]] + ; GFX10-LABEL: name: lshr_s16_s16_vs + ; GFX10: $vcc_hi = IMPLICIT_DEF + ; GFX10: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0 + ; GFX10: [[COPY1:%[0-9]+]]:sreg_32_xm0 = COPY $sgpr0 + ; GFX10: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 65535, implicit $exec + ; GFX10: [[V_LSHRREV_B16_e64_:%[0-9]+]]:vgpr_32 = V_LSHRREV_B16_e64 [[COPY1]], [[COPY]], implicit $exec + ; GFX10: [[V_AND_B32_e64_:%[0-9]+]]:vgpr_32 = V_AND_B32_e64 [[V_LSHRREV_B16_e64_]], [[V_MOV_B32_e32_]], implicit $exec + ; GFX10: S_ENDPGM 0, implicit [[V_AND_B32_e64_]] + %0:vgpr(s32) = COPY $vgpr0 + %1:sgpr(s32) = COPY $sgpr0 + %2:vgpr(s16) = G_TRUNC %0 + %3:sgpr(s16) = G_TRUNC %1 + %4:vgpr(s16) = G_LSHR %2, %3 + S_ENDPGM 0, implicit %4 +... 
+ +--- +name: lshr_s16_s32_vv +legalized: true +regBankSelected: true + +body: | + bb.0: + liveins: $vgpr0, $vgpr1 + + ; GFX8-LABEL: name: lshr_s16_s32_vv + ; GFX8: [[COPY:%[0-9]+]]:vgpr(s32) = COPY $vgpr0 + ; GFX8: [[COPY1:%[0-9]+]]:vgpr(s32) = COPY $vgpr1 + ; GFX8: [[TRUNC:%[0-9]+]]:vgpr(s16) = G_TRUNC [[COPY]](s32) + ; GFX8: [[LSHR:%[0-9]+]]:vgpr(s16) = G_LSHR [[TRUNC]], [[COPY1]](s32) + ; GFX8: S_ENDPGM 0, implicit [[LSHR]](s16) + ; GFX9-LABEL: name: lshr_s16_s32_vv + ; GFX9: [[COPY:%[0-9]+]]:vgpr(s32) = COPY $vgpr0 + ; GFX9: [[COPY1:%[0-9]+]]:vgpr(s32) = COPY $vgpr1 + ; GFX9: [[TRUNC:%[0-9]+]]:vgpr(s16) = G_TRUNC [[COPY]](s32) + ; GFX9: [[LSHR:%[0-9]+]]:vgpr(s16) = G_LSHR [[TRUNC]], [[COPY1]](s32) + ; GFX9: S_ENDPGM 0, implicit [[LSHR]](s16) + ; GFX10-LABEL: name: lshr_s16_s32_vv + ; GFX10: [[COPY:%[0-9]+]]:vgpr(s32) = COPY $vgpr0 + ; GFX10: [[COPY1:%[0-9]+]]:vgpr(s32) = COPY $vgpr1 + ; GFX10: [[TRUNC:%[0-9]+]]:vgpr(s16) = G_TRUNC [[COPY]](s32) + ; GFX10: [[LSHR:%[0-9]+]]:vgpr(s16) = G_LSHR [[TRUNC]], [[COPY1]](s32) + ; GFX10: S_ENDPGM 0, implicit [[LSHR]](s16) + %0:vgpr(s32) = COPY $vgpr0 + %1:vgpr(s32) = COPY $vgpr1 + %2:vgpr(s16) = G_TRUNC %0 + %3:vgpr(s16) = G_LSHR %2, %1 + S_ENDPGM 0, implicit %3 +... 
+ +--- +name: lshr_s16_s16_vv +legalized: true +regBankSelected: true + +body: | + bb.0: + liveins: $vgpr0, $vgpr1 + + ; GFX8-LABEL: name: lshr_s16_s16_vv + ; GFX8: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0 + ; GFX8: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1 + ; GFX8: [[V_LSHRREV_B16_e64_:%[0-9]+]]:vgpr_32 = V_LSHRREV_B16_e64 [[COPY1]], [[COPY]], implicit $exec + ; GFX8: S_ENDPGM 0, implicit [[V_LSHRREV_B16_e64_]] + ; GFX9-LABEL: name: lshr_s16_s16_vv + ; GFX9: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0 + ; GFX9: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1 + ; GFX9: [[V_LSHRREV_B16_e64_:%[0-9]+]]:vgpr_32 = V_LSHRREV_B16_e64 [[COPY1]], [[COPY]], implicit $exec + ; GFX9: S_ENDPGM 0, implicit [[V_LSHRREV_B16_e64_]] + ; GFX10-LABEL: name: lshr_s16_s16_vv + ; GFX10: $vcc_hi = IMPLICIT_DEF + ; GFX10: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0 + ; GFX10: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1 + ; GFX10: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 65535, implicit $exec + ; GFX10: [[V_LSHRREV_B16_e64_:%[0-9]+]]:vgpr_32 = V_LSHRREV_B16_e64 [[COPY1]], [[COPY]], implicit $exec + ; GFX10: [[V_AND_B32_e64_:%[0-9]+]]:vgpr_32 = V_AND_B32_e64 [[V_LSHRREV_B16_e64_]], [[V_MOV_B32_e32_]], implicit $exec + ; GFX10: S_ENDPGM 0, implicit [[V_AND_B32_e64_]] + %0:vgpr(s32) = COPY $vgpr0 + %1:vgpr(s32) = COPY $vgpr1 + %2:vgpr(s16) = G_TRUNC %0 + %3:vgpr(s16) = G_TRUNC %1 + %4:vgpr(s16) = G_LSHR %2, %3 + S_ENDPGM 0, implicit %4 +... 
+ +--- +name: lshr_s16_s16_vv_zext_to_s32 +legalized: true +regBankSelected: true + +body: | + bb.0: + liveins: $vgpr0, $vgpr1 + + ; GFX8-LABEL: name: lshr_s16_s16_vv_zext_to_s32 + ; GFX8: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0 + ; GFX8: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1 + ; GFX8: [[V_LSHRREV_B16_e64_:%[0-9]+]]:vgpr_32 = V_LSHRREV_B16_e64 [[COPY1]], [[COPY]], implicit $exec + ; GFX8: [[V_BFE_U32_:%[0-9]+]]:vgpr_32 = V_BFE_U32 [[V_LSHRREV_B16_e64_]], 0, 16, implicit $exec + ; GFX8: S_ENDPGM 0, implicit [[V_BFE_U32_]] + ; GFX9-LABEL: name: lshr_s16_s16_vv_zext_to_s32 + ; GFX9: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0 + ; GFX9: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1 + ; GFX9: [[V_LSHRREV_B16_e64_:%[0-9]+]]:vgpr_32 = V_LSHRREV_B16_e64 [[COPY1]], [[COPY]], implicit $exec + ; GFX9: [[V_BFE_U32_:%[0-9]+]]:vgpr_32 = V_BFE_U32 [[V_LSHRREV_B16_e64_]], 0, 16, implicit $exec + ; GFX9: S_ENDPGM 0, implicit [[V_BFE_U32_]] + ; GFX10-LABEL: name: lshr_s16_s16_vv_zext_to_s32 + ; GFX10: $vcc_hi = IMPLICIT_DEF + ; GFX10: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0 + ; GFX10: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1 + ; GFX10: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 65535, implicit $exec + ; GFX10: [[V_LSHRREV_B16_e64_:%[0-9]+]]:vgpr_32 = V_LSHRREV_B16_e64 [[COPY1]], [[COPY]], implicit $exec + ; GFX10: [[V_AND_B32_e64_:%[0-9]+]]:vgpr_32 = V_AND_B32_e64 [[V_LSHRREV_B16_e64_]], [[V_MOV_B32_e32_]], implicit $exec + ; GFX10: [[V_BFE_U32_:%[0-9]+]]:vgpr_32 = V_BFE_U32 [[V_AND_B32_e64_]], 0, 16, implicit $exec + ; GFX10: S_ENDPGM 0, implicit [[V_BFE_U32_]] + %0:vgpr(s32) = COPY $vgpr0 + %1:vgpr(s32) = COPY $vgpr1 + %2:vgpr(s16) = G_TRUNC %0 + %3:vgpr(s16) = G_TRUNC %1 + %4:vgpr(s16) = G_LSHR %2, %3 + %5:vgpr(s32) = G_ZEXT %4 + S_ENDPGM 0, implicit %5 +... 
+ +--- +name: lshr_s16_vv_zext_to_s64 +legalized: true +regBankSelected: true + +body: | + bb.0: + liveins: $vgpr0, $vgpr1 + + ; GFX8-LABEL: name: lshr_s16_vv_zext_to_s64 + ; GFX8: [[COPY:%[0-9]+]]:vgpr(s32) = COPY $vgpr0 + ; GFX8: [[COPY1:%[0-9]+]]:vgpr(s32) = COPY $vgpr1 + ; GFX8: [[TRUNC:%[0-9]+]]:vgpr(s16) = G_TRUNC [[COPY]](s32) + ; GFX8: [[TRUNC1:%[0-9]+]]:vgpr(s16) = G_TRUNC [[COPY1]](s32) + ; GFX8: [[LSHR:%[0-9]+]]:vgpr(s16) = G_LSHR [[TRUNC]], [[TRUNC1]](s16) + ; GFX8: [[ZEXT:%[0-9]+]]:vgpr(s64) = G_ZEXT [[LSHR]](s16) + ; GFX8: S_ENDPGM 0, implicit [[ZEXT]](s64) + ; GFX9-LABEL: name: lshr_s16_vv_zext_to_s64 + ; GFX9: [[COPY:%[0-9]+]]:vgpr(s32) = COPY $vgpr0 + ; GFX9: [[COPY1:%[0-9]+]]:vgpr(s32) = COPY $vgpr1 + ; GFX9: [[TRUNC:%[0-9]+]]:vgpr(s16) = G_TRUNC [[COPY]](s32) + ; GFX9: [[TRUNC1:%[0-9]+]]:vgpr(s16) = G_TRUNC [[COPY1]](s32) + ; GFX9: [[LSHR:%[0-9]+]]:vgpr(s16) = G_LSHR [[TRUNC]], [[TRUNC1]](s16) + ; GFX9: [[ZEXT:%[0-9]+]]:vgpr(s64) = G_ZEXT [[LSHR]](s16) + ; GFX9: S_ENDPGM 0, implicit [[ZEXT]](s64) + ; GFX10-LABEL: name: lshr_s16_vv_zext_to_s64 + ; GFX10: [[COPY:%[0-9]+]]:vgpr(s32) = COPY $vgpr0 + ; GFX10: [[COPY1:%[0-9]+]]:vgpr(s32) = COPY $vgpr1 + ; GFX10: [[TRUNC:%[0-9]+]]:vgpr(s16) = G_TRUNC [[COPY]](s32) + ; GFX10: [[TRUNC1:%[0-9]+]]:vgpr(s16) = G_TRUNC [[COPY1]](s32) + ; GFX10: [[LSHR:%[0-9]+]]:vgpr(s16) = G_LSHR [[TRUNC]], [[TRUNC1]](s16) + ; GFX10: [[ZEXT:%[0-9]+]]:vgpr(s64) = G_ZEXT [[LSHR]](s16) + ; GFX10: S_ENDPGM 0, implicit [[ZEXT]](s64) + %0:vgpr(s32) = COPY $vgpr0 + %1:vgpr(s32) = COPY $vgpr1 + %2:vgpr(s16) = G_TRUNC %0 + %3:vgpr(s16) = G_TRUNC %1 + %4:vgpr(s16) = G_LSHR %2, %3 + %5:vgpr(s64) = G_ZEXT %4 + S_ENDPGM 0, implicit %5 +... 
+ +--- +name: lshr_s16_s32_ss legalized: true regBankSelected: true body: | bb.0: liveins: $sgpr0, $sgpr1 - ; GFX6-LABEL: name: lshr_s16_ss - ; GFX6: [[COPY:%[0-9]+]]:sgpr(s32) = COPY $sgpr0 - ; GFX6: [[COPY1:%[0-9]+]]:sgpr(s32) = COPY $sgpr1 - ; GFX6: [[TRUNC:%[0-9]+]]:sgpr(s16) = G_TRUNC [[COPY]](s32) - ; GFX6: [[LSHR:%[0-9]+]]:sgpr(s16) = G_LSHR [[TRUNC]], [[COPY1]](s32) - ; GFX6: S_ENDPGM 0, implicit [[LSHR]](s16) - ; GFX7-LABEL: name: lshr_s16_ss - ; GFX7: [[COPY:%[0-9]+]]:sgpr(s32) = COPY $sgpr0 - ; GFX7: [[COPY1:%[0-9]+]]:sgpr(s32) = COPY $sgpr1 - ; GFX7: [[TRUNC:%[0-9]+]]:sgpr(s16) = G_TRUNC [[COPY]](s32) - ; GFX7: [[LSHR:%[0-9]+]]:sgpr(s16) = G_LSHR [[TRUNC]], [[COPY1]](s32) - ; GFX7: S_ENDPGM 0, implicit [[LSHR]](s16) - ; GFX8-LABEL: name: lshr_s16_ss + + ; GFX8-LABEL: name: lshr_s16_s32_ss ; GFX8: [[COPY:%[0-9]+]]:sgpr(s32) = COPY $sgpr0 ; GFX8: [[COPY1:%[0-9]+]]:sgpr(s32) = COPY $sgpr1 ; GFX8: [[TRUNC:%[0-9]+]]:sgpr(s16) = G_TRUNC [[COPY]](s32) ; GFX8: [[LSHR:%[0-9]+]]:sgpr(s16) = G_LSHR [[TRUNC]], [[COPY1]](s32) ; GFX8: S_ENDPGM 0, implicit [[LSHR]](s16) - ; GFX9-LABEL: name: lshr_s16_ss + ; GFX9-LABEL: name: lshr_s16_s32_ss ; GFX9: [[COPY:%[0-9]+]]:sgpr(s32) = COPY $sgpr0 ; GFX9: [[COPY1:%[0-9]+]]:sgpr(s32) = COPY $sgpr1 ; GFX9: [[TRUNC:%[0-9]+]]:sgpr(s16) = G_TRUNC [[COPY]](s32) ; GFX9: [[LSHR:%[0-9]+]]:sgpr(s16) = G_LSHR [[TRUNC]], [[COPY1]](s32) ; GFX9: S_ENDPGM 0, implicit [[LSHR]](s16) - ; GFX10-LABEL: name: lshr_s16_ss + ; GFX10-LABEL: name: lshr_s16_s32_ss ; GFX10: [[COPY:%[0-9]+]]:sgpr(s32) = COPY $sgpr0 ; GFX10: [[COPY1:%[0-9]+]]:sgpr(s32) = COPY $sgpr1 ; GFX10: [[TRUNC:%[0-9]+]]:sgpr(s16) = G_TRUNC [[COPY]](s32) @@ -68,38 +275,26 @@ body: | ... 
--- -name: lshr_s16_sv +name: lshr_s16_s32_sv legalized: true regBankSelected: true body: | bb.0: liveins: $sgpr0, $vgpr0 - ; GFX6-LABEL: name: lshr_s16_sv - ; GFX6: [[COPY:%[0-9]+]]:sgpr(s32) = COPY $sgpr0 - ; GFX6: [[COPY1:%[0-9]+]]:vgpr(s32) = COPY $vgpr0 - ; GFX6: [[TRUNC:%[0-9]+]]:sgpr(s16) = G_TRUNC [[COPY]](s32) - ; GFX6: [[LSHR:%[0-9]+]]:vgpr(s16) = G_LSHR [[TRUNC]], [[COPY1]](s32) - ; GFX6: S_ENDPGM 0, implicit [[LSHR]](s16) - ; GFX7-LABEL: name: lshr_s16_sv - ; GFX7: [[COPY:%[0-9]+]]:sgpr(s32) = COPY $sgpr0 - ; GFX7: [[COPY1:%[0-9]+]]:vgpr(s32) = COPY $vgpr0 - ; GFX7: [[TRUNC:%[0-9]+]]:sgpr(s16) = G_TRUNC [[COPY]](s32) - ; GFX7: [[LSHR:%[0-9]+]]:vgpr(s16) = G_LSHR [[TRUNC]], [[COPY1]](s32) - ; GFX7: S_ENDPGM 0, implicit [[LSHR]](s16) - ; GFX8-LABEL: name: lshr_s16_sv + ; GFX8-LABEL: name: lshr_s16_s32_sv ; GFX8: [[COPY:%[0-9]+]]:sgpr(s32) = COPY $sgpr0 ; GFX8: [[COPY1:%[0-9]+]]:vgpr(s32) = COPY $vgpr0 ; GFX8: [[TRUNC:%[0-9]+]]:sgpr(s16) = G_TRUNC [[COPY]](s32) ; GFX8: [[LSHR:%[0-9]+]]:vgpr(s16) = G_LSHR [[TRUNC]], [[COPY1]](s32) ; GFX8: S_ENDPGM 0, implicit [[LSHR]](s16) - ; GFX9-LABEL: name: lshr_s16_sv + ; GFX9-LABEL: name: lshr_s16_s32_sv ; GFX9: [[COPY:%[0-9]+]]:sgpr(s32) = COPY $sgpr0 ; GFX9: [[COPY1:%[0-9]+]]:vgpr(s32) = COPY $vgpr0 ; GFX9: [[TRUNC:%[0-9]+]]:sgpr(s16) = G_TRUNC [[COPY]](s32) ; GFX9: [[LSHR:%[0-9]+]]:vgpr(s16) = G_LSHR [[TRUNC]], [[COPY1]](s32) ; GFX9: S_ENDPGM 0, implicit [[LSHR]](s16) - ; GFX10-LABEL: name: lshr_s16_sv + ; GFX10-LABEL: name: lshr_s16_s32_sv ; GFX10: [[COPY:%[0-9]+]]:sgpr(s32) = COPY $sgpr0 ; GFX10: [[COPY1:%[0-9]+]]:vgpr(s32) = COPY $vgpr0 ; GFX10: [[TRUNC:%[0-9]+]]:sgpr(s16) = G_TRUNC [[COPY]](s32) @@ -113,90 +308,67 @@ body: | ... 
--- -name: lshr_s16_vs +name: lshr_s16_s16_sv legalized: true regBankSelected: true body: | bb.0: liveins: $sgpr0, $vgpr0 - ; GFX6-LABEL: name: lshr_s16_vs - ; GFX6: [[COPY:%[0-9]+]]:vgpr(s32) = COPY $vgpr0 - ; GFX6: [[COPY1:%[0-9]+]]:sgpr(s32) = COPY $sgpr0 - ; GFX6: [[TRUNC:%[0-9]+]]:vgpr(s16) = G_TRUNC [[COPY]](s32) - ; GFX6: [[LSHR:%[0-9]+]]:vgpr(s16) = G_LSHR [[TRUNC]], [[COPY1]](s32) - ; GFX6: S_ENDPGM 0, implicit [[LSHR]](s16) - ; GFX7-LABEL: name: lshr_s16_vs - ; GFX7: [[COPY:%[0-9]+]]:vgpr(s32) = COPY $vgpr0 - ; GFX7: [[COPY1:%[0-9]+]]:sgpr(s32) = COPY $sgpr0 - ; GFX7: [[TRUNC:%[0-9]+]]:vgpr(s16) = G_TRUNC [[COPY]](s32) - ; GFX7: [[LSHR:%[0-9]+]]:vgpr(s16) = G_LSHR [[TRUNC]], [[COPY1]](s32) - ; GFX7: S_ENDPGM 0, implicit [[LSHR]](s16) - ; GFX8-LABEL: name: lshr_s16_vs - ; GFX8: [[COPY:%[0-9]+]]:vgpr(s32) = COPY $vgpr0 - ; GFX8: [[COPY1:%[0-9]+]]:sgpr(s32) = COPY $sgpr0 - ; GFX8: [[TRUNC:%[0-9]+]]:vgpr(s16) = G_TRUNC [[COPY]](s32) - ; GFX8: [[LSHR:%[0-9]+]]:vgpr(s16) = G_LSHR [[TRUNC]], [[COPY1]](s32) - ; GFX8: S_ENDPGM 0, implicit [[LSHR]](s16) - ; GFX9-LABEL: name: lshr_s16_vs - ; GFX9: [[COPY:%[0-9]+]]:vgpr(s32) = COPY $vgpr0 - ; GFX9: [[COPY1:%[0-9]+]]:sgpr(s32) = COPY $sgpr0 - ; GFX9: [[TRUNC:%[0-9]+]]:vgpr(s16) = G_TRUNC [[COPY]](s32) - ; GFX9: [[LSHR:%[0-9]+]]:vgpr(s16) = G_LSHR [[TRUNC]], [[COPY1]](s32) - ; GFX9: S_ENDPGM 0, implicit [[LSHR]](s16) - ; GFX10-LABEL: name: lshr_s16_vs - ; GFX10: [[COPY:%[0-9]+]]:vgpr(s32) = COPY $vgpr0 - ; GFX10: [[COPY1:%[0-9]+]]:sgpr(s32) = COPY $sgpr0 - ; GFX10: [[TRUNC:%[0-9]+]]:vgpr(s16) = G_TRUNC [[COPY]](s32) - ; GFX10: [[LSHR:%[0-9]+]]:vgpr(s16) = G_LSHR [[TRUNC]], [[COPY1]](s32) - ; GFX10: S_ENDPGM 0, implicit [[LSHR]](s16) - %0:vgpr(s32) = COPY $vgpr0 - %1:sgpr(s32) = COPY $sgpr0 - %2:vgpr(s16) = G_TRUNC %0 - %3:vgpr(s16) = G_LSHR %2, %1 - S_ENDPGM 0, implicit %3 + ; GFX8-LABEL: name: lshr_s16_s16_sv + ; GFX8: [[COPY:%[0-9]+]]:sreg_32_xm0 = COPY $sgpr0 + ; GFX8: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr0 + ; 
GFX8: [[V_LSHRREV_B16_e64_:%[0-9]+]]:vgpr_32 = V_LSHRREV_B16_e64 [[COPY1]], [[COPY]], implicit $exec + ; GFX8: S_ENDPGM 0, implicit [[V_LSHRREV_B16_e64_]] + ; GFX9-LABEL: name: lshr_s16_s16_sv + ; GFX9: [[COPY:%[0-9]+]]:sreg_32_xm0 = COPY $sgpr0 + ; GFX9: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr0 + ; GFX9: [[V_LSHRREV_B16_e64_:%[0-9]+]]:vgpr_32 = V_LSHRREV_B16_e64 [[COPY1]], [[COPY]], implicit $exec + ; GFX9: S_ENDPGM 0, implicit [[V_LSHRREV_B16_e64_]] + ; GFX10-LABEL: name: lshr_s16_s16_sv + ; GFX10: $vcc_hi = IMPLICIT_DEF + ; GFX10: [[COPY:%[0-9]+]]:sreg_32_xm0 = COPY $sgpr0 + ; GFX10: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr0 + ; GFX10: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 65535, implicit $exec + ; GFX10: [[V_LSHRREV_B16_e64_:%[0-9]+]]:vgpr_32 = V_LSHRREV_B16_e64 [[COPY1]], [[COPY]], implicit $exec + ; GFX10: [[V_AND_B32_e64_:%[0-9]+]]:vgpr_32 = V_AND_B32_e64 [[V_LSHRREV_B16_e64_]], [[V_MOV_B32_e32_]], implicit $exec + ; GFX10: S_ENDPGM 0, implicit [[V_AND_B32_e64_]] + %0:sgpr(s32) = COPY $sgpr0 + %1:vgpr(s32) = COPY $vgpr0 + %2:sgpr(s16) = G_TRUNC %0 + %3:vgpr(s16) = G_TRUNC %1 + %4:vgpr(s16) = G_LSHR %2, %3 + S_ENDPGM 0, implicit %4 ... 
--- -name: lshr_s16_vv +name: lshr_s16_s32_vs legalized: true regBankSelected: true body: | bb.0: - liveins: $vgpr0, $vgpr1 - ; GFX6-LABEL: name: lshr_s16_vv - ; GFX6: [[COPY:%[0-9]+]]:vgpr(s32) = COPY $vgpr0 - ; GFX6: [[COPY1:%[0-9]+]]:vgpr(s32) = COPY $vgpr1 - ; GFX6: [[TRUNC:%[0-9]+]]:vgpr(s16) = G_TRUNC [[COPY]](s32) - ; GFX6: [[LSHR:%[0-9]+]]:vgpr(s16) = G_LSHR [[TRUNC]], [[COPY1]](s32) - ; GFX6: S_ENDPGM 0, implicit [[LSHR]](s16) - ; GFX7-LABEL: name: lshr_s16_vv - ; GFX7: [[COPY:%[0-9]+]]:vgpr(s32) = COPY $vgpr0 - ; GFX7: [[COPY1:%[0-9]+]]:vgpr(s32) = COPY $vgpr1 - ; GFX7: [[TRUNC:%[0-9]+]]:vgpr(s16) = G_TRUNC [[COPY]](s32) - ; GFX7: [[LSHR:%[0-9]+]]:vgpr(s16) = G_LSHR [[TRUNC]], [[COPY1]](s32) - ; GFX7: S_ENDPGM 0, implicit [[LSHR]](s16) - ; GFX8-LABEL: name: lshr_s16_vv + liveins: $sgpr0, $vgpr0 + ; GFX8-LABEL: name: lshr_s16_s32_vs ; GFX8: [[COPY:%[0-9]+]]:vgpr(s32) = COPY $vgpr0 - ; GFX8: [[COPY1:%[0-9]+]]:vgpr(s32) = COPY $vgpr1 + ; GFX8: [[COPY1:%[0-9]+]]:sgpr(s32) = COPY $sgpr0 ; GFX8: [[TRUNC:%[0-9]+]]:vgpr(s16) = G_TRUNC [[COPY]](s32) ; GFX8: [[LSHR:%[0-9]+]]:vgpr(s16) = G_LSHR [[TRUNC]], [[COPY1]](s32) ; GFX8: S_ENDPGM 0, implicit [[LSHR]](s16) - ; GFX9-LABEL: name: lshr_s16_vv + ; GFX9-LABEL: name: lshr_s16_s32_vs ; GFX9: [[COPY:%[0-9]+]]:vgpr(s32) = COPY $vgpr0 - ; GFX9: [[COPY1:%[0-9]+]]:vgpr(s32) = COPY $vgpr1 + ; GFX9: [[COPY1:%[0-9]+]]:sgpr(s32) = COPY $sgpr0 ; GFX9: [[TRUNC:%[0-9]+]]:vgpr(s16) = G_TRUNC [[COPY]](s32) ; GFX9: [[LSHR:%[0-9]+]]:vgpr(s16) = G_LSHR [[TRUNC]], [[COPY1]](s32) ; GFX9: S_ENDPGM 0, implicit [[LSHR]](s16) - ; GFX10-LABEL: name: lshr_s16_vv + ; GFX10-LABEL: name: lshr_s16_s32_vs ; GFX10: [[COPY:%[0-9]+]]:vgpr(s32) = COPY $vgpr0 - ; GFX10: [[COPY1:%[0-9]+]]:vgpr(s32) = COPY $vgpr1 + ; GFX10: [[COPY1:%[0-9]+]]:sgpr(s32) = COPY $sgpr0 ; GFX10: [[TRUNC:%[0-9]+]]:vgpr(s16) = G_TRUNC [[COPY]](s32) ; GFX10: [[LSHR:%[0-9]+]]:vgpr(s16) = G_LSHR [[TRUNC]], [[COPY1]](s32) ; GFX10: S_ENDPGM 0, implicit [[LSHR]](s16) %0:vgpr(s32) = 
COPY $vgpr0 - %1:vgpr(s32) = COPY $vgpr1 + %1:sgpr(s32) = COPY $sgpr0 %2:vgpr(s16) = G_TRUNC %0 %3:vgpr(s16) = G_LSHR %2, %1 S_ENDPGM 0, implicit %3 Modified: llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/inst-select-shl.s16.mir URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/inst-select-shl.s16.mir?rev=373945&r1=373944&r2=373945&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/inst-select-shl.s16.mir (original) +++ llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/inst-select-shl.s16.mir Mon Oct 7 12:10:44 2019 @@ -10,51 +10,258 @@ # RUN: FileCheck -check-prefixes=ERR-GFX910,ERR %s < %t # ERR-NOT: remark -# ERR-GFX8: remark: :0:0: cannot select: %3:sgpr(s16) = G_SHL %2:sgpr, %1:sgpr(s32) (in function: shl_s16_ss) -# ERR-GFX8-NEXT: remark: :0:0: cannot select: %3:vgpr(s16) = G_SHL %2:sgpr, %1:vgpr(s32) (in function: shl_s16_sv) -# ERR-GFX8-NEXT: remark: :0:0: cannot select: %3:vgpr(s16) = G_SHL %2:vgpr, %1:sgpr(s32) (in function: shl_s16_vs) -# ERR-GFX8-NEXT: remark: :0:0: cannot select: %3:vgpr(s16) = G_SHL %2:vgpr, %1:vgpr(s32) (in function: shl_s16_vv) - -# ERR-GFX910: remark: :0:0: cannot select: %3:sgpr(s16) = G_SHL %2:sgpr, %1:sgpr(s32) (in function: shl_s16_ss) -# ERR-GFX910-NEXT: remark: :0:0: cannot select: %3:vgpr(s16) = G_SHL %2:sgpr, %1:vgpr(s32) (in function: shl_s16_sv) -# ERR-GFX910-NEXT: remark: :0:0: cannot select: %3:vgpr(s16) = G_SHL %2:vgpr, %1:sgpr(s32) (in function: shl_s16_vs) -# ERR-GFX910-NEXT: remark: :0:0: cannot select: %3:vgpr(s16) = G_SHL %2:vgpr, %1:vgpr(s32) (in function: shl_s16_vv) - +# ERR: remark: :0:0: cannot select: %4:sgpr(s16) = G_SHL %2:sgpr, %3:sgpr(s16) (in function: shl_s16_s16_ss) +# ERR-NEXT: remark: :0:0: cannot select: %3:vgpr(s16) = G_SHL %2:vgpr, %1:vgpr(s32) (in function: shl_s16_s32_vv) +# ERR-NEXT: remark: :0:0: cannot select: %5:vgpr(s64) = G_ZEXT %4:vgpr(s16) (in function: shl_s16_vv_zext_to_s64) +# 
ERR-NEXT: remark: :0:0: cannot select: %3:sgpr(s16) = G_SHL %2:sgpr, %1:sgpr(s32) (in function: shl_s16_s32_ss) +# ERR-NEXT: remark: :0:0: cannot select: %3:vgpr(s16) = G_SHL %2:sgpr, %1:vgpr(s32) (in function: shl_s16_s32_sv) +# ERR-NEXT: remark: :0:0: cannot select: %3:vgpr(s16) = G_SHL %2:vgpr, %1:sgpr(s32) (in function: shl_s16_s32_vs) # ERR-NOT: remark --- -name: shl_s16_ss +name: shl_s16_s16_ss +legalized: true +regBankSelected: true + +body: | + bb.0: + liveins: $sgpr0, $sgpr1 + + ; GFX8-LABEL: name: shl_s16_s16_ss + ; GFX8: [[COPY:%[0-9]+]]:sgpr(s32) = COPY $sgpr0 + ; GFX8: [[COPY1:%[0-9]+]]:sgpr(s32) = COPY $sgpr1 + ; GFX8: [[TRUNC:%[0-9]+]]:sgpr(s16) = G_TRUNC [[COPY]](s32) + ; GFX8: [[TRUNC1:%[0-9]+]]:sgpr(s16) = G_TRUNC [[COPY1]](s32) + ; GFX8: [[SHL:%[0-9]+]]:sgpr(s16) = G_SHL [[TRUNC]], [[TRUNC1]](s16) + ; GFX8: S_ENDPGM 0, implicit [[SHL]](s16) + ; GFX9-LABEL: name: shl_s16_s16_ss + ; GFX9: [[COPY:%[0-9]+]]:sgpr(s32) = COPY $sgpr0 + ; GFX9: [[COPY1:%[0-9]+]]:sgpr(s32) = COPY $sgpr1 + ; GFX9: [[TRUNC:%[0-9]+]]:sgpr(s16) = G_TRUNC [[COPY]](s32) + ; GFX9: [[TRUNC1:%[0-9]+]]:sgpr(s16) = G_TRUNC [[COPY1]](s32) + ; GFX9: [[SHL:%[0-9]+]]:sgpr(s16) = G_SHL [[TRUNC]], [[TRUNC1]](s16) + ; GFX9: S_ENDPGM 0, implicit [[SHL]](s16) + ; GFX10-LABEL: name: shl_s16_s16_ss + ; GFX10: [[COPY:%[0-9]+]]:sgpr(s32) = COPY $sgpr0 + ; GFX10: [[COPY1:%[0-9]+]]:sgpr(s32) = COPY $sgpr1 + ; GFX10: [[TRUNC:%[0-9]+]]:sgpr(s16) = G_TRUNC [[COPY]](s32) + ; GFX10: [[TRUNC1:%[0-9]+]]:sgpr(s16) = G_TRUNC [[COPY1]](s32) + ; GFX10: [[SHL:%[0-9]+]]:sgpr(s16) = G_SHL [[TRUNC]], [[TRUNC1]](s16) + ; GFX10: S_ENDPGM 0, implicit [[SHL]](s16) + %0:sgpr(s32) = COPY $sgpr0 + %1:sgpr(s32) = COPY $sgpr1 + %2:sgpr(s16) = G_TRUNC %0 + %3:sgpr(s16) = G_TRUNC %1 + %4:sgpr(s16) = G_SHL %2, %3 + S_ENDPGM 0, implicit %4 +... 
+ +--- +name: shl_s16_s16_vs +legalized: true +regBankSelected: true + +body: | + bb.0: + liveins: $sgpr0, $vgpr0 + ; GFX8-LABEL: name: shl_s16_s16_vs + ; GFX8: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0 + ; GFX8: [[COPY1:%[0-9]+]]:sreg_32_xm0 = COPY $sgpr0 + ; GFX8: [[V_LSHLREV_B16_e64_:%[0-9]+]]:vgpr_32 = V_LSHLREV_B16_e64 [[COPY1]], [[COPY]], implicit $exec + ; GFX8: S_ENDPGM 0, implicit [[V_LSHLREV_B16_e64_]] + ; GFX9-LABEL: name: shl_s16_s16_vs + ; GFX9: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0 + ; GFX9: [[COPY1:%[0-9]+]]:sreg_32_xm0 = COPY $sgpr0 + ; GFX9: [[V_LSHLREV_B16_e64_:%[0-9]+]]:vgpr_32 = V_LSHLREV_B16_e64 [[COPY1]], [[COPY]], implicit $exec + ; GFX9: S_ENDPGM 0, implicit [[V_LSHLREV_B16_e64_]] + ; GFX10-LABEL: name: shl_s16_s16_vs + ; GFX10: $vcc_hi = IMPLICIT_DEF + ; GFX10: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0 + ; GFX10: [[COPY1:%[0-9]+]]:sreg_32_xm0 = COPY $sgpr0 + ; GFX10: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 65535, implicit $exec + ; GFX10: [[V_LSHLREV_B16_e64_:%[0-9]+]]:vgpr_32 = V_LSHLREV_B16_e64 [[COPY1]], [[COPY]], implicit $exec + ; GFX10: [[V_AND_B32_e64_:%[0-9]+]]:vgpr_32 = V_AND_B32_e64 [[V_LSHLREV_B16_e64_]], [[V_MOV_B32_e32_]], implicit $exec + ; GFX10: S_ENDPGM 0, implicit [[V_AND_B32_e64_]] + %0:vgpr(s32) = COPY $vgpr0 + %1:sgpr(s32) = COPY $sgpr0 + %2:vgpr(s16) = G_TRUNC %0 + %3:sgpr(s16) = G_TRUNC %1 + %4:vgpr(s16) = G_SHL %2, %3 + S_ENDPGM 0, implicit %4 +... 
+ +--- +name: shl_s16_s32_vv +legalized: true +regBankSelected: true + +body: | + bb.0: + liveins: $vgpr0, $vgpr1 + + ; GFX8-LABEL: name: shl_s16_s32_vv + ; GFX8: [[COPY:%[0-9]+]]:vgpr(s32) = COPY $vgpr0 + ; GFX8: [[COPY1:%[0-9]+]]:vgpr(s32) = COPY $vgpr1 + ; GFX8: [[TRUNC:%[0-9]+]]:vgpr(s16) = G_TRUNC [[COPY]](s32) + ; GFX8: [[SHL:%[0-9]+]]:vgpr(s16) = G_SHL [[TRUNC]], [[COPY1]](s32) + ; GFX8: S_ENDPGM 0, implicit [[SHL]](s16) + ; GFX9-LABEL: name: shl_s16_s32_vv + ; GFX9: [[COPY:%[0-9]+]]:vgpr(s32) = COPY $vgpr0 + ; GFX9: [[COPY1:%[0-9]+]]:vgpr(s32) = COPY $vgpr1 + ; GFX9: [[TRUNC:%[0-9]+]]:vgpr(s16) = G_TRUNC [[COPY]](s32) + ; GFX9: [[SHL:%[0-9]+]]:vgpr(s16) = G_SHL [[TRUNC]], [[COPY1]](s32) + ; GFX9: S_ENDPGM 0, implicit [[SHL]](s16) + ; GFX10-LABEL: name: shl_s16_s32_vv + ; GFX10: [[COPY:%[0-9]+]]:vgpr(s32) = COPY $vgpr0 + ; GFX10: [[COPY1:%[0-9]+]]:vgpr(s32) = COPY $vgpr1 + ; GFX10: [[TRUNC:%[0-9]+]]:vgpr(s16) = G_TRUNC [[COPY]](s32) + ; GFX10: [[SHL:%[0-9]+]]:vgpr(s16) = G_SHL [[TRUNC]], [[COPY1]](s32) + ; GFX10: S_ENDPGM 0, implicit [[SHL]](s16) + %0:vgpr(s32) = COPY $vgpr0 + %1:vgpr(s32) = COPY $vgpr1 + %2:vgpr(s16) = G_TRUNC %0 + %3:vgpr(s16) = G_SHL %2, %1 + S_ENDPGM 0, implicit %3 +... 
+ +--- +name: shl_s16_s16_vv +legalized: true +regBankSelected: true + +body: | + bb.0: + liveins: $vgpr0, $vgpr1 + + ; GFX8-LABEL: name: shl_s16_s16_vv + ; GFX8: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0 + ; GFX8: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1 + ; GFX8: [[V_LSHLREV_B16_e64_:%[0-9]+]]:vgpr_32 = V_LSHLREV_B16_e64 [[COPY1]], [[COPY]], implicit $exec + ; GFX8: S_ENDPGM 0, implicit [[V_LSHLREV_B16_e64_]] + ; GFX9-LABEL: name: shl_s16_s16_vv + ; GFX9: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0 + ; GFX9: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1 + ; GFX9: [[V_LSHLREV_B16_e64_:%[0-9]+]]:vgpr_32 = V_LSHLREV_B16_e64 [[COPY1]], [[COPY]], implicit $exec + ; GFX9: S_ENDPGM 0, implicit [[V_LSHLREV_B16_e64_]] + ; GFX10-LABEL: name: shl_s16_s16_vv + ; GFX10: $vcc_hi = IMPLICIT_DEF + ; GFX10: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0 + ; GFX10: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1 + ; GFX10: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 65535, implicit $exec + ; GFX10: [[V_LSHLREV_B16_e64_:%[0-9]+]]:vgpr_32 = V_LSHLREV_B16_e64 [[COPY1]], [[COPY]], implicit $exec + ; GFX10: [[V_AND_B32_e64_:%[0-9]+]]:vgpr_32 = V_AND_B32_e64 [[V_LSHLREV_B16_e64_]], [[V_MOV_B32_e32_]], implicit $exec + ; GFX10: S_ENDPGM 0, implicit [[V_AND_B32_e64_]] + %0:vgpr(s32) = COPY $vgpr0 + %1:vgpr(s32) = COPY $vgpr1 + %2:vgpr(s16) = G_TRUNC %0 + %3:vgpr(s16) = G_TRUNC %1 + %4:vgpr(s16) = G_SHL %2, %3 + S_ENDPGM 0, implicit %4 +... 
+ +--- +name: shl_s16_s16_vv_zext_to_s32 +legalized: true +regBankSelected: true + +body: | + bb.0: + liveins: $vgpr0, $vgpr1 + + ; GFX8-LABEL: name: shl_s16_s16_vv_zext_to_s32 + ; GFX8: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0 + ; GFX8: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1 + ; GFX8: [[V_LSHLREV_B16_e64_:%[0-9]+]]:vgpr_32 = V_LSHLREV_B16_e64 [[COPY1]], [[COPY]], implicit $exec + ; GFX8: [[V_BFE_U32_:%[0-9]+]]:vgpr_32 = V_BFE_U32 [[V_LSHLREV_B16_e64_]], 0, 16, implicit $exec + ; GFX8: S_ENDPGM 0, implicit [[V_BFE_U32_]] + ; GFX9-LABEL: name: shl_s16_s16_vv_zext_to_s32 + ; GFX9: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0 + ; GFX9: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1 + ; GFX9: [[V_LSHLREV_B16_e64_:%[0-9]+]]:vgpr_32 = V_LSHLREV_B16_e64 [[COPY1]], [[COPY]], implicit $exec + ; GFX9: [[V_BFE_U32_:%[0-9]+]]:vgpr_32 = V_BFE_U32 [[V_LSHLREV_B16_e64_]], 0, 16, implicit $exec + ; GFX9: S_ENDPGM 0, implicit [[V_BFE_U32_]] + ; GFX10-LABEL: name: shl_s16_s16_vv_zext_to_s32 + ; GFX10: $vcc_hi = IMPLICIT_DEF + ; GFX10: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0 + ; GFX10: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1 + ; GFX10: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 65535, implicit $exec + ; GFX10: [[V_LSHLREV_B16_e64_:%[0-9]+]]:vgpr_32 = V_LSHLREV_B16_e64 [[COPY1]], [[COPY]], implicit $exec + ; GFX10: [[V_AND_B32_e64_:%[0-9]+]]:vgpr_32 = V_AND_B32_e64 [[V_LSHLREV_B16_e64_]], [[V_MOV_B32_e32_]], implicit $exec + ; GFX10: [[V_BFE_U32_:%[0-9]+]]:vgpr_32 = V_BFE_U32 [[V_AND_B32_e64_]], 0, 16, implicit $exec + ; GFX10: S_ENDPGM 0, implicit [[V_BFE_U32_]] + %0:vgpr(s32) = COPY $vgpr0 + %1:vgpr(s32) = COPY $vgpr1 + %2:vgpr(s16) = G_TRUNC %0 + %3:vgpr(s16) = G_TRUNC %1 + %4:vgpr(s16) = G_SHL %2, %3 + %5:vgpr(s32) = G_ZEXT %4 + S_ENDPGM 0, implicit %5 +... 
+ +--- +name: shl_s16_vv_zext_to_s64 +legalized: true +regBankSelected: true + +body: | + bb.0: + liveins: $vgpr0, $vgpr1 + + ; GFX8-LABEL: name: shl_s16_vv_zext_to_s64 + ; GFX8: [[COPY:%[0-9]+]]:vgpr(s32) = COPY $vgpr0 + ; GFX8: [[COPY1:%[0-9]+]]:vgpr(s32) = COPY $vgpr1 + ; GFX8: [[TRUNC:%[0-9]+]]:vgpr(s16) = G_TRUNC [[COPY]](s32) + ; GFX8: [[TRUNC1:%[0-9]+]]:vgpr(s16) = G_TRUNC [[COPY1]](s32) + ; GFX8: [[SHL:%[0-9]+]]:vgpr(s16) = G_SHL [[TRUNC]], [[TRUNC1]](s16) + ; GFX8: [[ZEXT:%[0-9]+]]:vgpr(s64) = G_ZEXT [[SHL]](s16) + ; GFX8: S_ENDPGM 0, implicit [[ZEXT]](s64) + ; GFX9-LABEL: name: shl_s16_vv_zext_to_s64 + ; GFX9: [[COPY:%[0-9]+]]:vgpr(s32) = COPY $vgpr0 + ; GFX9: [[COPY1:%[0-9]+]]:vgpr(s32) = COPY $vgpr1 + ; GFX9: [[TRUNC:%[0-9]+]]:vgpr(s16) = G_TRUNC [[COPY]](s32) + ; GFX9: [[TRUNC1:%[0-9]+]]:vgpr(s16) = G_TRUNC [[COPY1]](s32) + ; GFX9: [[SHL:%[0-9]+]]:vgpr(s16) = G_SHL [[TRUNC]], [[TRUNC1]](s16) + ; GFX9: [[ZEXT:%[0-9]+]]:vgpr(s64) = G_ZEXT [[SHL]](s16) + ; GFX9: S_ENDPGM 0, implicit [[ZEXT]](s64) + ; GFX10-LABEL: name: shl_s16_vv_zext_to_s64 + ; GFX10: [[COPY:%[0-9]+]]:vgpr(s32) = COPY $vgpr0 + ; GFX10: [[COPY1:%[0-9]+]]:vgpr(s32) = COPY $vgpr1 + ; GFX10: [[TRUNC:%[0-9]+]]:vgpr(s16) = G_TRUNC [[COPY]](s32) + ; GFX10: [[TRUNC1:%[0-9]+]]:vgpr(s16) = G_TRUNC [[COPY1]](s32) + ; GFX10: [[SHL:%[0-9]+]]:vgpr(s16) = G_SHL [[TRUNC]], [[TRUNC1]](s16) + ; GFX10: [[ZEXT:%[0-9]+]]:vgpr(s64) = G_ZEXT [[SHL]](s16) + ; GFX10: S_ENDPGM 0, implicit [[ZEXT]](s64) + %0:vgpr(s32) = COPY $vgpr0 + %1:vgpr(s32) = COPY $vgpr1 + %2:vgpr(s16) = G_TRUNC %0 + %3:vgpr(s16) = G_TRUNC %1 + %4:vgpr(s16) = G_SHL %2, %3 + %5:vgpr(s64) = G_ZEXT %4 + S_ENDPGM 0, implicit %5 +... 
+ +--- +name: shl_s16_s32_ss legalized: true regBankSelected: true body: | bb.0: liveins: $sgpr0, $sgpr1 - ; GFX6-LABEL: name: shl_s16_ss - ; GFX6: [[COPY:%[0-9]+]]:sgpr(s32) = COPY $sgpr0 - ; GFX6: [[COPY1:%[0-9]+]]:sgpr(s32) = COPY $sgpr1 - ; GFX6: [[TRUNC:%[0-9]+]]:sgpr(s16) = G_TRUNC [[COPY]](s32) - ; GFX6: [[SHL:%[0-9]+]]:sgpr(s16) = G_SHL [[TRUNC]], [[COPY1]](s32) - ; GFX6: S_ENDPGM 0, implicit [[SHL]](s16) - ; GFX7-LABEL: name: shl_s16_ss - ; GFX7: [[COPY:%[0-9]+]]:sgpr(s32) = COPY $sgpr0 - ; GFX7: [[COPY1:%[0-9]+]]:sgpr(s32) = COPY $sgpr1 - ; GFX7: [[TRUNC:%[0-9]+]]:sgpr(s16) = G_TRUNC [[COPY]](s32) - ; GFX7: [[SHL:%[0-9]+]]:sgpr(s16) = G_SHL [[TRUNC]], [[COPY1]](s32) - ; GFX7: S_ENDPGM 0, implicit [[SHL]](s16) - ; GFX8-LABEL: name: shl_s16_ss + + ; GFX8-LABEL: name: shl_s16_s32_ss ; GFX8: [[COPY:%[0-9]+]]:sgpr(s32) = COPY $sgpr0 ; GFX8: [[COPY1:%[0-9]+]]:sgpr(s32) = COPY $sgpr1 ; GFX8: [[TRUNC:%[0-9]+]]:sgpr(s16) = G_TRUNC [[COPY]](s32) ; GFX8: [[SHL:%[0-9]+]]:sgpr(s16) = G_SHL [[TRUNC]], [[COPY1]](s32) ; GFX8: S_ENDPGM 0, implicit [[SHL]](s16) - ; GFX9-LABEL: name: shl_s16_ss + ; GFX9-LABEL: name: shl_s16_s32_ss ; GFX9: [[COPY:%[0-9]+]]:sgpr(s32) = COPY $sgpr0 ; GFX9: [[COPY1:%[0-9]+]]:sgpr(s32) = COPY $sgpr1 ; GFX9: [[TRUNC:%[0-9]+]]:sgpr(s16) = G_TRUNC [[COPY]](s32) ; GFX9: [[SHL:%[0-9]+]]:sgpr(s16) = G_SHL [[TRUNC]], [[COPY1]](s32) ; GFX9: S_ENDPGM 0, implicit [[SHL]](s16) - ; GFX10-LABEL: name: shl_s16_ss + ; GFX10-LABEL: name: shl_s16_s32_ss ; GFX10: [[COPY:%[0-9]+]]:sgpr(s32) = COPY $sgpr0 ; GFX10: [[COPY1:%[0-9]+]]:sgpr(s32) = COPY $sgpr1 ; GFX10: [[TRUNC:%[0-9]+]]:sgpr(s16) = G_TRUNC [[COPY]](s32) @@ -68,38 +275,26 @@ body: | ... 
--- -name: shl_s16_sv +name: shl_s16_s32_sv legalized: true regBankSelected: true body: | bb.0: liveins: $sgpr0, $vgpr0 - ; GFX6-LABEL: name: shl_s16_sv - ; GFX6: [[COPY:%[0-9]+]]:sgpr(s32) = COPY $sgpr0 - ; GFX6: [[COPY1:%[0-9]+]]:vgpr(s32) = COPY $vgpr0 - ; GFX6: [[TRUNC:%[0-9]+]]:sgpr(s16) = G_TRUNC [[COPY]](s32) - ; GFX6: [[SHL:%[0-9]+]]:vgpr(s16) = G_SHL [[TRUNC]], [[COPY1]](s32) - ; GFX6: S_ENDPGM 0, implicit [[SHL]](s16) - ; GFX7-LABEL: name: shl_s16_sv - ; GFX7: [[COPY:%[0-9]+]]:sgpr(s32) = COPY $sgpr0 - ; GFX7: [[COPY1:%[0-9]+]]:vgpr(s32) = COPY $vgpr0 - ; GFX7: [[TRUNC:%[0-9]+]]:sgpr(s16) = G_TRUNC [[COPY]](s32) - ; GFX7: [[SHL:%[0-9]+]]:vgpr(s16) = G_SHL [[TRUNC]], [[COPY1]](s32) - ; GFX7: S_ENDPGM 0, implicit [[SHL]](s16) - ; GFX8-LABEL: name: shl_s16_sv + ; GFX8-LABEL: name: shl_s16_s32_sv ; GFX8: [[COPY:%[0-9]+]]:sgpr(s32) = COPY $sgpr0 ; GFX8: [[COPY1:%[0-9]+]]:vgpr(s32) = COPY $vgpr0 ; GFX8: [[TRUNC:%[0-9]+]]:sgpr(s16) = G_TRUNC [[COPY]](s32) ; GFX8: [[SHL:%[0-9]+]]:vgpr(s16) = G_SHL [[TRUNC]], [[COPY1]](s32) ; GFX8: S_ENDPGM 0, implicit [[SHL]](s16) - ; GFX9-LABEL: name: shl_s16_sv + ; GFX9-LABEL: name: shl_s16_s32_sv ; GFX9: [[COPY:%[0-9]+]]:sgpr(s32) = COPY $sgpr0 ; GFX9: [[COPY1:%[0-9]+]]:vgpr(s32) = COPY $vgpr0 ; GFX9: [[TRUNC:%[0-9]+]]:sgpr(s16) = G_TRUNC [[COPY]](s32) ; GFX9: [[SHL:%[0-9]+]]:vgpr(s16) = G_SHL [[TRUNC]], [[COPY1]](s32) ; GFX9: S_ENDPGM 0, implicit [[SHL]](s16) - ; GFX10-LABEL: name: shl_s16_sv + ; GFX10-LABEL: name: shl_s16_s32_sv ; GFX10: [[COPY:%[0-9]+]]:sgpr(s32) = COPY $sgpr0 ; GFX10: [[COPY1:%[0-9]+]]:vgpr(s32) = COPY $vgpr0 ; GFX10: [[TRUNC:%[0-9]+]]:sgpr(s16) = G_TRUNC [[COPY]](s32) @@ -113,90 +308,67 @@ body: | ... 
--- -name: shl_s16_vs +name: shl_s16_s16_sv legalized: true regBankSelected: true body: | bb.0: liveins: $sgpr0, $vgpr0 - ; GFX6-LABEL: name: shl_s16_vs - ; GFX6: [[COPY:%[0-9]+]]:vgpr(s32) = COPY $vgpr0 - ; GFX6: [[COPY1:%[0-9]+]]:sgpr(s32) = COPY $sgpr0 - ; GFX6: [[TRUNC:%[0-9]+]]:vgpr(s16) = G_TRUNC [[COPY]](s32) - ; GFX6: [[SHL:%[0-9]+]]:vgpr(s16) = G_SHL [[TRUNC]], [[COPY1]](s32) - ; GFX6: S_ENDPGM 0, implicit [[SHL]](s16) - ; GFX7-LABEL: name: shl_s16_vs - ; GFX7: [[COPY:%[0-9]+]]:vgpr(s32) = COPY $vgpr0 - ; GFX7: [[COPY1:%[0-9]+]]:sgpr(s32) = COPY $sgpr0 - ; GFX7: [[TRUNC:%[0-9]+]]:vgpr(s16) = G_TRUNC [[COPY]](s32) - ; GFX7: [[SHL:%[0-9]+]]:vgpr(s16) = G_SHL [[TRUNC]], [[COPY1]](s32) - ; GFX7: S_ENDPGM 0, implicit [[SHL]](s16) - ; GFX8-LABEL: name: shl_s16_vs - ; GFX8: [[COPY:%[0-9]+]]:vgpr(s32) = COPY $vgpr0 - ; GFX8: [[COPY1:%[0-9]+]]:sgpr(s32) = COPY $sgpr0 - ; GFX8: [[TRUNC:%[0-9]+]]:vgpr(s16) = G_TRUNC [[COPY]](s32) - ; GFX8: [[SHL:%[0-9]+]]:vgpr(s16) = G_SHL [[TRUNC]], [[COPY1]](s32) - ; GFX8: S_ENDPGM 0, implicit [[SHL]](s16) - ; GFX9-LABEL: name: shl_s16_vs - ; GFX9: [[COPY:%[0-9]+]]:vgpr(s32) = COPY $vgpr0 - ; GFX9: [[COPY1:%[0-9]+]]:sgpr(s32) = COPY $sgpr0 - ; GFX9: [[TRUNC:%[0-9]+]]:vgpr(s16) = G_TRUNC [[COPY]](s32) - ; GFX9: [[SHL:%[0-9]+]]:vgpr(s16) = G_SHL [[TRUNC]], [[COPY1]](s32) - ; GFX9: S_ENDPGM 0, implicit [[SHL]](s16) - ; GFX10-LABEL: name: shl_s16_vs - ; GFX10: [[COPY:%[0-9]+]]:vgpr(s32) = COPY $vgpr0 - ; GFX10: [[COPY1:%[0-9]+]]:sgpr(s32) = COPY $sgpr0 - ; GFX10: [[TRUNC:%[0-9]+]]:vgpr(s16) = G_TRUNC [[COPY]](s32) - ; GFX10: [[SHL:%[0-9]+]]:vgpr(s16) = G_SHL [[TRUNC]], [[COPY1]](s32) - ; GFX10: S_ENDPGM 0, implicit [[SHL]](s16) - %0:vgpr(s32) = COPY $vgpr0 - %1:sgpr(s32) = COPY $sgpr0 - %2:vgpr(s16) = G_TRUNC %0 - %3:vgpr(s16) = G_SHL %2, %1 - S_ENDPGM 0, implicit %3 + ; GFX8-LABEL: name: shl_s16_s16_sv + ; GFX8: [[COPY:%[0-9]+]]:sreg_32_xm0 = COPY $sgpr0 + ; GFX8: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr0 + ; GFX8: 
[[V_LSHLREV_B16_e64_:%[0-9]+]]:vgpr_32 = V_LSHLREV_B16_e64 [[COPY1]], [[COPY]], implicit $exec + ; GFX8: S_ENDPGM 0, implicit [[V_LSHLREV_B16_e64_]] + ; GFX9-LABEL: name: shl_s16_s16_sv + ; GFX9: [[COPY:%[0-9]+]]:sreg_32_xm0 = COPY $sgpr0 + ; GFX9: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr0 + ; GFX9: [[V_LSHLREV_B16_e64_:%[0-9]+]]:vgpr_32 = V_LSHLREV_B16_e64 [[COPY1]], [[COPY]], implicit $exec + ; GFX9: S_ENDPGM 0, implicit [[V_LSHLREV_B16_e64_]] + ; GFX10-LABEL: name: shl_s16_s16_sv + ; GFX10: $vcc_hi = IMPLICIT_DEF + ; GFX10: [[COPY:%[0-9]+]]:sreg_32_xm0 = COPY $sgpr0 + ; GFX10: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr0 + ; GFX10: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 65535, implicit $exec + ; GFX10: [[V_LSHLREV_B16_e64_:%[0-9]+]]:vgpr_32 = V_LSHLREV_B16_e64 [[COPY1]], [[COPY]], implicit $exec + ; GFX10: [[V_AND_B32_e64_:%[0-9]+]]:vgpr_32 = V_AND_B32_e64 [[V_LSHLREV_B16_e64_]], [[V_MOV_B32_e32_]], implicit $exec + ; GFX10: S_ENDPGM 0, implicit [[V_AND_B32_e64_]] + %0:sgpr(s32) = COPY $sgpr0 + %1:vgpr(s32) = COPY $vgpr0 + %2:sgpr(s16) = G_TRUNC %0 + %3:vgpr(s16) = G_TRUNC %1 + %4:vgpr(s16) = G_SHL %2, %3 + S_ENDPGM 0, implicit %4 ... 
--- -name: shl_s16_vv +name: shl_s16_s32_vs legalized: true regBankSelected: true body: | bb.0: - liveins: $vgpr0, $vgpr1 - ; GFX6-LABEL: name: shl_s16_vv - ; GFX6: [[COPY:%[0-9]+]]:vgpr(s32) = COPY $vgpr0 - ; GFX6: [[COPY1:%[0-9]+]]:vgpr(s32) = COPY $vgpr1 - ; GFX6: [[TRUNC:%[0-9]+]]:vgpr(s16) = G_TRUNC [[COPY]](s32) - ; GFX6: [[SHL:%[0-9]+]]:vgpr(s16) = G_SHL [[TRUNC]], [[COPY1]](s32) - ; GFX6: S_ENDPGM 0, implicit [[SHL]](s16) - ; GFX7-LABEL: name: shl_s16_vv - ; GFX7: [[COPY:%[0-9]+]]:vgpr(s32) = COPY $vgpr0 - ; GFX7: [[COPY1:%[0-9]+]]:vgpr(s32) = COPY $vgpr1 - ; GFX7: [[TRUNC:%[0-9]+]]:vgpr(s16) = G_TRUNC [[COPY]](s32) - ; GFX7: [[SHL:%[0-9]+]]:vgpr(s16) = G_SHL [[TRUNC]], [[COPY1]](s32) - ; GFX7: S_ENDPGM 0, implicit [[SHL]](s16) - ; GFX8-LABEL: name: shl_s16_vv + liveins: $sgpr0, $vgpr0 + ; GFX8-LABEL: name: shl_s16_s32_vs ; GFX8: [[COPY:%[0-9]+]]:vgpr(s32) = COPY $vgpr0 - ; GFX8: [[COPY1:%[0-9]+]]:vgpr(s32) = COPY $vgpr1 + ; GFX8: [[COPY1:%[0-9]+]]:sgpr(s32) = COPY $sgpr0 ; GFX8: [[TRUNC:%[0-9]+]]:vgpr(s16) = G_TRUNC [[COPY]](s32) ; GFX8: [[SHL:%[0-9]+]]:vgpr(s16) = G_SHL [[TRUNC]], [[COPY1]](s32) ; GFX8: S_ENDPGM 0, implicit [[SHL]](s16) - ; GFX9-LABEL: name: shl_s16_vv + ; GFX9-LABEL: name: shl_s16_s32_vs ; GFX9: [[COPY:%[0-9]+]]:vgpr(s32) = COPY $vgpr0 - ; GFX9: [[COPY1:%[0-9]+]]:vgpr(s32) = COPY $vgpr1 + ; GFX9: [[COPY1:%[0-9]+]]:sgpr(s32) = COPY $sgpr0 ; GFX9: [[TRUNC:%[0-9]+]]:vgpr(s16) = G_TRUNC [[COPY]](s32) ; GFX9: [[SHL:%[0-9]+]]:vgpr(s16) = G_SHL [[TRUNC]], [[COPY1]](s32) ; GFX9: S_ENDPGM 0, implicit [[SHL]](s16) - ; GFX10-LABEL: name: shl_s16_vv + ; GFX10-LABEL: name: shl_s16_s32_vs ; GFX10: [[COPY:%[0-9]+]]:vgpr(s32) = COPY $vgpr0 - ; GFX10: [[COPY1:%[0-9]+]]:vgpr(s32) = COPY $vgpr1 + ; GFX10: [[COPY1:%[0-9]+]]:sgpr(s32) = COPY $sgpr0 ; GFX10: [[TRUNC:%[0-9]+]]:vgpr(s16) = G_TRUNC [[COPY]](s32) ; GFX10: [[SHL:%[0-9]+]]:vgpr(s16) = G_SHL [[TRUNC]], [[COPY1]](s32) ; GFX10: S_ENDPGM 0, implicit [[SHL]](s16) %0:vgpr(s32) = COPY $vgpr0 - 
%1:vgpr(s32) = COPY $vgpr1 + %1:sgpr(s32) = COPY $sgpr0 %2:vgpr(s16) = G_TRUNC %0 %3:vgpr(s16) = G_SHL %2, %1 S_ENDPGM 0, implicit %3 From llvm-commits at lists.llvm.org Mon Oct 7 12:13:27 2019 From: llvm-commits at lists.llvm.org (Matt Arsenault via llvm-commits) Date: Mon, 07 Oct 2019 19:13:27 -0000 Subject: [llvm] r373946 - GlobalISel: Partially implement lower for G_INSERT Message-ID: <20191007191327.82B228253C@lists.llvm.org> Author: arsenm Date: Mon Oct 7 12:13:27 2019 New Revision: 373946 URL: http://llvm.org/viewvc/llvm-project?rev=373946&view=rev Log: GlobalISel: Partially implement lower for G_INSERT Modified: llvm/trunk/include/llvm/CodeGen/GlobalISel/LegalizerHelper.h llvm/trunk/lib/CodeGen/GlobalISel/LegalizerHelper.cpp llvm/trunk/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-insert.mir Modified: llvm/trunk/include/llvm/CodeGen/GlobalISel/LegalizerHelper.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/CodeGen/GlobalISel/LegalizerHelper.h?rev=373946&r1=373945&r2=373946&view=diff ============================================================================== --- llvm/trunk/include/llvm/CodeGen/GlobalISel/LegalizerHelper.h (original) +++ llvm/trunk/include/llvm/CodeGen/GlobalISel/LegalizerHelper.h Mon Oct 7 12:13:27 2019 @@ -231,6 +231,7 @@ public: LegalizeResult lowerShuffleVector(MachineInstr &MI); LegalizeResult lowerDynStackAlloc(MachineInstr &MI); LegalizeResult lowerExtract(MachineInstr &MI); + LegalizeResult lowerInsert(MachineInstr &MI); private: MachineRegisterInfo &MRI; Modified: llvm/trunk/lib/CodeGen/GlobalISel/LegalizerHelper.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/GlobalISel/LegalizerHelper.cpp?rev=373946&r1=373945&r2=373946&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/GlobalISel/LegalizerHelper.cpp (original) +++ llvm/trunk/lib/CodeGen/GlobalISel/LegalizerHelper.cpp Mon Oct 7 
12:13:27 2019 @@ -2249,6 +2249,8 @@ LegalizerHelper::lower(MachineInstr &MI, return lowerDynStackAlloc(MI); case G_EXTRACT: return lowerExtract(MI); + case G_INSERT: + return lowerInsert(MI); } } @@ -4131,6 +4133,45 @@ LegalizerHelper::lowerExtract(MachineIns MI.eraseFromParent(); return Legalized; } + + return UnableToLegalize; +} + +LegalizerHelper::LegalizeResult LegalizerHelper::lowerInsert(MachineInstr &MI) { + Register Dst = MI.getOperand(0).getReg(); + Register Src = MI.getOperand(1).getReg(); + Register InsertSrc = MI.getOperand(2).getReg(); + uint64_t Offset = MI.getOperand(3).getImm(); + + LLT DstTy = MRI.getType(Src); + LLT InsertTy = MRI.getType(InsertSrc); + + if (InsertTy.isScalar() && + (DstTy.isScalar() || + (DstTy.isVector() && DstTy.getElementType() == InsertTy))) { + LLT IntDstTy = DstTy; + if (!DstTy.isScalar()) { + IntDstTy = LLT::scalar(DstTy.getSizeInBits()); + Src = MIRBuilder.buildBitcast(IntDstTy, Src).getReg(0); + } + + Register ExtInsSrc = MIRBuilder.buildZExt(IntDstTy, InsertSrc).getReg(0); + if (Offset != 0) { + auto ShiftAmt = MIRBuilder.buildConstant(IntDstTy, Offset); + ExtInsSrc = MIRBuilder.buildShl(IntDstTy, ExtInsSrc, ShiftAmt).getReg(0); + } + + APInt MaskVal = ~APInt::getBitsSet(DstTy.getSizeInBits(), Offset, + InsertTy.getSizeInBits()); + + auto Mask = MIRBuilder.buildConstant(IntDstTy, MaskVal); + auto MaskedSrc = MIRBuilder.buildAnd(IntDstTy, Src, Mask); + auto Or = MIRBuilder.buildOr(IntDstTy, MaskedSrc, ExtInsSrc); + + MIRBuilder.buildBitcast(Dst, Or); + MI.eraseFromParent(); + return Legalized; + } return UnableToLegalize; } Modified: llvm/trunk/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp?rev=373946&r1=373945&r2=373946&view=diff ============================================================================== --- llvm/trunk/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp (original) +++ llvm/trunk/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp 
Mon Oct 7 12:13:27 2019 @@ -912,13 +912,9 @@ AMDGPULegalizerInfo::AMDGPULegalizerInfo unsigned LitTyIdx = Op == G_EXTRACT ? 0 : 1; // FIXME: Doesn't handle extract of illegal sizes. - auto &Builder = getActionDefinitionsBuilder(Op); - - // FIXME: Cleanup when G_INSERT lowering implemented. - if (Op == G_EXTRACT) - Builder.lowerIf(all(typeIs(LitTyIdx, S16), sizeIs(BigTyIdx, 32))); - - Builder + getActionDefinitionsBuilder(Op) + .lowerIf(all(typeIs(LitTyIdx, S16), sizeIs(BigTyIdx, 32))) + // FIXME: Multiples of 16 should not be legal. .legalIf([=](const LegalityQuery &Query) { const LLT BigTy = Query.Types[BigTyIdx]; const LLT LitTy = Query.Types[LitTyIdx]; Modified: llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-insert.mir URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-insert.mir?rev=373946&r1=373945&r2=373946&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-insert.mir (original) +++ llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-insert.mir Mon Oct 7 12:13:27 2019 @@ -762,15 +762,48 @@ body: | ; CHECK-LABEL: name: test_insert_v2s16_s16_offset0 ; CHECK: [[COPY:%[0-9]+]]:_(<2 x s16>) = COPY $vgpr0 ; CHECK: [[COPY1:%[0-9]+]]:_(s32) = COPY $vgpr1 - ; CHECK: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[COPY1]](s32) - ; CHECK: [[INSERT:%[0-9]+]]:_(<2 x s16>) = G_INSERT [[COPY]], [[TRUNC]](s16), 0 - ; CHECK: $vgpr0 = COPY [[INSERT]](<2 x s16>) + ; CHECK: [[BITCAST:%[0-9]+]]:_(s32) = G_BITCAST [[COPY]](<2 x s16>) + ; CHECK: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; CHECK: [[COPY2:%[0-9]+]]:_(s32) = COPY [[COPY1]](s32) + ; CHECK: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C]] + ; CHECK: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 -65536 + ; CHECK: [[AND1:%[0-9]+]]:_(s32) = G_AND [[BITCAST]], [[C1]] + ; CHECK: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND1]], [[AND]] + ; CHECK: [[BITCAST1:%[0-9]+]]:_(<2 x s16>) = G_BITCAST [[OR]](s32) + ; CHECK: 
$vgpr0 = COPY [[BITCAST1]](<2 x s16>) %0:_(<2 x s16>) = COPY $vgpr0 %1:_(s32) = COPY $vgpr1 %2:_(s16) = G_TRUNC %1 %3:_(<2 x s16>) = G_INSERT %0, %2, 0 $vgpr0 = COPY %3 ... + +--- +name: test_insert_v2s16_s16_offset1 +body: | + bb.0: + liveins: $vgpr0, $vgpr1 + + ; CHECK-LABEL: name: test_insert_v2s16_s16_offset1 + ; CHECK: [[COPY:%[0-9]+]]:_(<2 x s16>) = COPY $vgpr0 + ; CHECK: [[COPY1:%[0-9]+]]:_(s32) = COPY $vgpr1 + ; CHECK: [[BITCAST:%[0-9]+]]:_(s32) = G_BITCAST [[COPY]](<2 x s16>) + ; CHECK: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; CHECK: [[COPY2:%[0-9]+]]:_(s32) = COPY [[COPY1]](s32) + ; CHECK: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C]] + ; CHECK: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 1 + ; CHECK: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND]], [[C1]](s32) + ; CHECK: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 -65535 + ; CHECK: [[AND1:%[0-9]+]]:_(s32) = G_AND [[BITCAST]], [[C2]] + ; CHECK: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND1]], [[SHL]] + ; CHECK: [[BITCAST1:%[0-9]+]]:_(<2 x s16>) = G_BITCAST [[OR]](s32) + ; CHECK: $vgpr0 = COPY [[BITCAST1]](<2 x s16>) + %0:_(<2 x s16>) = COPY $vgpr0 + %1:_(s32) = COPY $vgpr1 + %2:_(s16) = G_TRUNC %1 + %3:_(<2 x s16>) = G_INSERT %0, %2, 1 + $vgpr0 = COPY %3 +... 
--- name: test_insert_v2s16_s16_offset16 body: | @@ -780,9 +813,17 @@ body: | ; CHECK-LABEL: name: test_insert_v2s16_s16_offset16 ; CHECK: [[COPY:%[0-9]+]]:_(<2 x s16>) = COPY $vgpr0 ; CHECK: [[COPY1:%[0-9]+]]:_(s32) = COPY $vgpr1 - ; CHECK: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[COPY1]](s32) - ; CHECK: [[INSERT:%[0-9]+]]:_(<2 x s16>) = G_INSERT [[COPY]], [[TRUNC]](s16), 16 - ; CHECK: $vgpr0 = COPY [[INSERT]](<2 x s16>) + ; CHECK: [[BITCAST:%[0-9]+]]:_(s32) = G_BITCAST [[COPY]](<2 x s16>) + ; CHECK: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; CHECK: [[COPY2:%[0-9]+]]:_(s32) = COPY [[COPY1]](s32) + ; CHECK: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C]] + ; CHECK: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CHECK: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND]], [[C1]](s32) + ; CHECK: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 -1 + ; CHECK: [[AND1:%[0-9]+]]:_(s32) = G_AND [[BITCAST]], [[C2]] + ; CHECK: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND1]], [[SHL]] + ; CHECK: [[BITCAST1:%[0-9]+]]:_(<2 x s16>) = G_BITCAST [[OR]](s32) + ; CHECK: $vgpr0 = COPY [[BITCAST1]](<2 x s16>) %0:_(<2 x s16>) = COPY $vgpr0 %1:_(s32) = COPY $vgpr1 %2:_(s16) = G_TRUNC %1 @@ -1247,3 +1288,104 @@ body: | %3:_(s64) = G_INSERT %0, %2, 48 $vgpr0_vgpr1 = COPY %3 ... 
+--- +name: test_insert_s32_s16_offset0 +body: | + bb.0: + liveins: $vgpr0, $vgpr1 + + ; CHECK-LABEL: name: test_insert_s32_s16_offset0 + ; CHECK: [[COPY:%[0-9]+]]:_(s32) = COPY $vgpr0 + ; CHECK: [[COPY1:%[0-9]+]]:_(s32) = COPY $vgpr1 + ; CHECK: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; CHECK: [[COPY2:%[0-9]+]]:_(s32) = COPY [[COPY1]](s32) + ; CHECK: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C]] + ; CHECK: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 -65536 + ; CHECK: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] + ; CHECK: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND1]], [[AND]] + ; CHECK: [[BITCAST:%[0-9]+]]:_(s32) = G_BITCAST [[OR]](s32) + ; CHECK: $vgpr0 = COPY [[BITCAST]](s32) + %0:_(s32) = COPY $vgpr0 + %1:_(s32) = COPY $vgpr1 + %2:_(s16) = G_TRUNC %1 + %3:_(s32) = G_INSERT %1, %2, 0 + $vgpr0 = COPY %3 +... + +--- +name: test_insert_s32_s16_offset1 +body: | + bb.0: + liveins: $vgpr0, $vgpr1 + + ; CHECK-LABEL: name: test_insert_s32_s16_offset1 + ; CHECK: [[COPY:%[0-9]+]]:_(s32) = COPY $vgpr0 + ; CHECK: [[COPY1:%[0-9]+]]:_(s32) = COPY $vgpr1 + ; CHECK: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; CHECK: [[COPY2:%[0-9]+]]:_(s32) = COPY [[COPY1]](s32) + ; CHECK: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C]] + ; CHECK: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 1 + ; CHECK: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND]], [[C1]](s32) + ; CHECK: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 -65535 + ; CHECK: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C2]] + ; CHECK: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND1]], [[SHL]] + ; CHECK: [[BITCAST:%[0-9]+]]:_(s32) = G_BITCAST [[OR]](s32) + ; CHECK: $vgpr0 = COPY [[BITCAST]](s32) + %0:_(s32) = COPY $vgpr0 + %1:_(s32) = COPY $vgpr1 + %2:_(s16) = G_TRUNC %1 + %3:_(s32) = G_INSERT %1, %2, 1 + $vgpr0 = COPY %3 +... 
+ +--- +name: test_insert_s32_s16_offset8 +body: | + bb.0: + liveins: $vgpr0, $vgpr1 + + ; CHECK-LABEL: name: test_insert_s32_s16_offset8 + ; CHECK: [[COPY:%[0-9]+]]:_(s32) = COPY $vgpr0 + ; CHECK: [[COPY1:%[0-9]+]]:_(s32) = COPY $vgpr1 + ; CHECK: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; CHECK: [[COPY2:%[0-9]+]]:_(s32) = COPY [[COPY1]](s32) + ; CHECK: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C]] + ; CHECK: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 8 + ; CHECK: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND]], [[C1]](s32) + ; CHECK: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 -65281 + ; CHECK: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C2]] + ; CHECK: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND1]], [[SHL]] + ; CHECK: [[BITCAST:%[0-9]+]]:_(s32) = G_BITCAST [[OR]](s32) + ; CHECK: $vgpr0 = COPY [[BITCAST]](s32) + %0:_(s32) = COPY $vgpr0 + %1:_(s32) = COPY $vgpr1 + %2:_(s16) = G_TRUNC %1 + %3:_(s32) = G_INSERT %1, %2, 8 + $vgpr0 = COPY %3 +... + +--- +name: test_insert_s32_s16_offset16 +body: | + bb.0: + liveins: $vgpr0, $vgpr1 + + ; CHECK-LABEL: name: test_insert_s32_s16_offset16 + ; CHECK: [[COPY:%[0-9]+]]:_(s32) = COPY $vgpr0 + ; CHECK: [[COPY1:%[0-9]+]]:_(s32) = COPY $vgpr1 + ; CHECK: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; CHECK: [[COPY2:%[0-9]+]]:_(s32) = COPY [[COPY1]](s32) + ; CHECK: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C]] + ; CHECK: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; CHECK: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND]], [[C1]](s32) + ; CHECK: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 -1 + ; CHECK: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C2]] + ; CHECK: [[OR:%[0-9]+]]:_(s32) = G_OR [[AND1]], [[SHL]] + ; CHECK: [[BITCAST:%[0-9]+]]:_(s32) = G_BITCAST [[OR]](s32) + ; CHECK: $vgpr0 = COPY [[BITCAST]](s32) + %0:_(s32) = COPY $vgpr0 + %1:_(s32) = COPY $vgpr1 + %2:_(s16) = G_TRUNC %1 + %3:_(s32) = G_INSERT %1, %2, 16 + $vgpr0 = COPY %3 +... 
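The CHECK lines in the test diff above expand each scalar G_INSERT into a G_AND that zero-extends the inserted s16, a G_SHL by the bit offset, a G_AND on the destination, and a G_OR. A minimal sketch of that mask/shift/or pattern in plain C++ follows. `insertBits` is a hypothetical helper for illustration, not an LLVM API and not the legalizer's code. The keep-mask computed here is the conventional one; it is not derived from the G_CONSTANT operands in the CHECK lines and, for nonzero offsets, need not equal them.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of LLVM): insert the low `width` bits of
// `value` into `dst` starting at bit `offset`, using the same
// AND / SHL / AND / OR shape as the lowered G_INSERT sequence.
inline uint32_t insertBits(uint32_t dst, uint32_t value, unsigned width,
                           unsigned offset) {
  assert(width > 0 && width < 32 && offset + width <= 32);
  const uint32_t lowMask = (uint32_t{1} << width) - 1; // 65535 for width 16
  const uint32_t field = (value & lowMask) << offset;  // zero-extend, then shift
  const uint32_t keepMask = ~(lowMask << offset);      // clear the target field
  return (dst & keepMask) | field;
}
```

With width 16 and offset 0 the keep-mask is 0xFFFF0000, which is the i32 constant -65536 that appears in the offset-0 test.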
From llvm-commits at lists.llvm.org Mon Oct 7 12:11:54 2019 From: llvm-commits at lists.llvm.org (Matt Arsenault via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 19:11:54 +0000 (UTC) Subject: [PATCH] D68538: GlobalISel: Partially implement lower for G_INSERT In-Reply-To: References: Message-ID: <6d04d936385428af3fa7eec8922a3186@localhost.localdomain> arsenm closed this revision. arsenm added a comment. r373946 CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68538/new/ https://reviews.llvm.org/D68538 From llvm-commits at lists.llvm.org Mon Oct 7 12:14:00 2019 From: llvm-commits at lists.llvm.org (Evandro Menezes via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 19:14:00 +0000 (UTC) Subject: [PATCH] D68231: [SLC] Allow llvm.pow(x,2.0) -> x*x etc even if no pow() lib func In-Reply-To: References: Message-ID: <805c570aa07616ab96b2c393590faa53@localhost.localdomain> evandro added a comment. Indeed, it seemed to be a coarse check in case the following transformations end up calling `pow()`. However, the only transformation that calls `pow()` is when shrinking to `powf()`, which itself checks for the availability of this routine. So this patch seems to address this issue the wrong way. It seems that removing this check would be better. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68231/new/ https://reviews.llvm.org/D68231 From llvm-commits at lists.llvm.org Mon Oct 7 12:16:26 2019 From: llvm-commits at lists.llvm.org (Matt Arsenault via llvm-commits) Date: Mon, 07 Oct 2019 19:16:26 -0000 Subject: [llvm] r373947 - AMDGPU/GlobalISel: Handle more G_INSERT cases Message-ID: <20191007191626.3F8E08D563@lists.llvm.org> Author: arsenm Date: Mon Oct 7 12:16:26 2019 New Revision: 373947 URL: http://llvm.org/viewvc/llvm-project?rev=373947&view=rev Log: AMDGPU/GlobalISel: Handle more G_INSERT cases Start manually writing a table to get the subreg index.
TableGen should probably generate this, but I'm not sure what it looks like in the arbitrary case where subregisters are allowed to not fully cover the super-registers. Modified: llvm/trunk/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp llvm/trunk/lib/Target/AMDGPU/AMDGPURegisterInfo.cpp llvm/trunk/lib/Target/AMDGPU/AMDGPURegisterInfo.h llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/inst-select-insert.mir Modified: llvm/trunk/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp?rev=373947&r1=373946&r2=373947&view=diff ============================================================================== --- llvm/trunk/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp (original) +++ llvm/trunk/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp Mon Oct 7 12:16:26 2019 @@ -555,48 +555,6 @@ bool AMDGPUInstructionSelector::selectG_ return false; } -// FIXME: TableGen should generate something to make this manageable for all -// register classes. At a minimum we could use the opposite of -// composeSubRegIndices and go up from the base 32-bit subreg. 
-static unsigned getSubRegForSizeAndOffset(const SIRegisterInfo &TRI, - unsigned Size, unsigned Offset) { - switch (Size) { - case 32: - return TRI.getSubRegFromChannel(Offset / 32); - case 64: { - switch (Offset) { - case 0: - return AMDGPU::sub0_sub1; - case 32: - return AMDGPU::sub1_sub2; - case 64: - return AMDGPU::sub2_sub3; - case 96: - return AMDGPU::sub4_sub5; - case 128: - return AMDGPU::sub5_sub6; - case 160: - return AMDGPU::sub7_sub8; - // FIXME: Missing cases up to 1024 bits - default: - return AMDGPU::NoSubRegister; - } - } - case 96: { - switch (Offset) { - case 0: - return AMDGPU::sub0_sub1_sub2; - case 32: - return AMDGPU::sub1_sub2_sub3; - case 64: - return AMDGPU::sub2_sub3_sub4; - } - } - default: - return AMDGPU::NoSubRegister; - } -} - bool AMDGPUInstructionSelector::selectG_INSERT(MachineInstr &I) const { MachineBasicBlock *BB = I.getParent(); @@ -612,7 +570,7 @@ bool AMDGPUInstructionSelector::selectG_ if (Offset % 32 != 0) return false; - unsigned SubReg = getSubRegForSizeAndOffset(TRI, InsSize, Offset); + unsigned SubReg = TRI.getSubRegFromChannel(Offset / 32, InsSize / 32); if (SubReg == AMDGPU::NoSubRegister) return false; Modified: llvm/trunk/lib/Target/AMDGPU/AMDGPURegisterInfo.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/AMDGPURegisterInfo.cpp?rev=373947&r1=373946&r2=373947&view=diff ============================================================================== --- llvm/trunk/lib/Target/AMDGPU/AMDGPURegisterInfo.cpp (original) +++ llvm/trunk/lib/Target/AMDGPU/AMDGPURegisterInfo.cpp Mon Oct 7 12:16:26 2019 @@ -26,19 +26,59 @@ AMDGPURegisterInfo::AMDGPURegisterInfo() // they are not supported at this time. 
//===----------------------------------------------------------------------===// -unsigned AMDGPURegisterInfo::getSubRegFromChannel(unsigned Channel) { - static const unsigned SubRegs[] = { - AMDGPU::sub0, AMDGPU::sub1, AMDGPU::sub2, AMDGPU::sub3, AMDGPU::sub4, - AMDGPU::sub5, AMDGPU::sub6, AMDGPU::sub7, AMDGPU::sub8, AMDGPU::sub9, - AMDGPU::sub10, AMDGPU::sub11, AMDGPU::sub12, AMDGPU::sub13, AMDGPU::sub14, - AMDGPU::sub15, AMDGPU::sub16, AMDGPU::sub17, AMDGPU::sub18, AMDGPU::sub19, - AMDGPU::sub20, AMDGPU::sub21, AMDGPU::sub22, AMDGPU::sub23, AMDGPU::sub24, - AMDGPU::sub25, AMDGPU::sub26, AMDGPU::sub27, AMDGPU::sub28, AMDGPU::sub29, - AMDGPU::sub30, AMDGPU::sub31 - }; +// Table of NumRegs sized pieces at every 32-bit offset. +static const uint16_t SubRegFromChannelTable[][32] = { + { AMDGPU::sub0, AMDGPU::sub1, AMDGPU::sub2, AMDGPU::sub3, + AMDGPU::sub4, AMDGPU::sub5, AMDGPU::sub6, AMDGPU::sub7, + AMDGPU::sub8, AMDGPU::sub9, AMDGPU::sub10, AMDGPU::sub11, + AMDGPU::sub12, AMDGPU::sub13, AMDGPU::sub14, AMDGPU::sub15, + AMDGPU::sub16, AMDGPU::sub17, AMDGPU::sub18, AMDGPU::sub19, + AMDGPU::sub20, AMDGPU::sub21, AMDGPU::sub22, AMDGPU::sub23, + AMDGPU::sub24, AMDGPU::sub25, AMDGPU::sub26, AMDGPU::sub27, + AMDGPU::sub28, AMDGPU::sub29, AMDGPU::sub30, AMDGPU::sub31 + }, + { + AMDGPU::sub0_sub1, AMDGPU::sub1_sub2, AMDGPU::sub2_sub3, AMDGPU::sub3_sub4, + AMDGPU::sub4_sub5, AMDGPU::sub5_sub6, AMDGPU::sub6_sub7, AMDGPU::sub7_sub8, + AMDGPU::sub8_sub9, AMDGPU::sub9_sub10, AMDGPU::sub10_sub11, AMDGPU::sub11_sub12, + AMDGPU::sub12_sub13, AMDGPU::sub13_sub14, AMDGPU::sub14_sub15, AMDGPU::sub15_sub16, + AMDGPU::sub16_sub17, AMDGPU::sub17_sub18, AMDGPU::sub18_sub19, AMDGPU::sub19_sub20, + AMDGPU::sub20_sub21, AMDGPU::sub21_sub22, AMDGPU::sub22_sub23, AMDGPU::sub23_sub24, + AMDGPU::sub24_sub25, AMDGPU::sub25_sub26, AMDGPU::sub26_sub27, AMDGPU::sub27_sub28, + AMDGPU::sub28_sub29, AMDGPU::sub29_sub30, AMDGPU::sub30_sub31, AMDGPU::NoSubRegister + }, + { + AMDGPU::sub0_sub1_sub2, 
AMDGPU::sub1_sub2_sub3, AMDGPU::sub2_sub3_sub4, AMDGPU::sub3_sub4_sub5, + AMDGPU::sub4_sub5_sub6, AMDGPU::sub5_sub6_sub7, AMDGPU::sub6_sub7_sub8, AMDGPU::sub7_sub8_sub9, + AMDGPU::sub8_sub9_sub10, AMDGPU::sub9_sub10_sub11, AMDGPU::sub10_sub11_sub12, AMDGPU::sub11_sub12_sub13, + AMDGPU::sub12_sub13_sub14, AMDGPU::sub13_sub14_sub15, AMDGPU::sub14_sub15_sub16, AMDGPU::sub15_sub16_sub17, + AMDGPU::sub16_sub17_sub18, AMDGPU::sub17_sub18_sub19, AMDGPU::sub18_sub19_sub20, AMDGPU::sub19_sub20_sub21, + AMDGPU::sub20_sub21_sub22, AMDGPU::sub21_sub22_sub23, AMDGPU::sub22_sub23_sub24, AMDGPU::sub23_sub24_sub25, + AMDGPU::sub24_sub25_sub26, AMDGPU::sub25_sub26_sub27, AMDGPU::sub26_sub27_sub28, AMDGPU::sub27_sub28_sub29, + AMDGPU::sub28_sub29_sub30, AMDGPU::sub29_sub30_sub31, AMDGPU::NoSubRegister, AMDGPU::NoSubRegister + }, + { + AMDGPU::sub0_sub1_sub2_sub3, AMDGPU::sub1_sub2_sub3_sub4, AMDGPU::sub2_sub3_sub4_sub5, AMDGPU::sub3_sub4_sub5_sub6, + AMDGPU::sub4_sub5_sub6_sub7, AMDGPU::sub5_sub6_sub7_sub8, AMDGPU::sub6_sub7_sub8_sub9, AMDGPU::sub7_sub8_sub9_sub10, + AMDGPU::sub8_sub9_sub10_sub11, AMDGPU::sub9_sub10_sub11_sub12, AMDGPU::sub10_sub11_sub12_sub13, AMDGPU::sub11_sub12_sub13_sub14, + AMDGPU::sub12_sub13_sub14_sub15, AMDGPU::sub13_sub14_sub15_sub16, AMDGPU::sub14_sub15_sub16_sub17, AMDGPU::sub15_sub16_sub17_sub18, + AMDGPU::sub16_sub17_sub18_sub19, AMDGPU::sub17_sub18_sub19_sub20, AMDGPU::sub18_sub19_sub20_sub21, AMDGPU::sub19_sub20_sub21_sub22, + AMDGPU::sub20_sub21_sub22_sub23, AMDGPU::sub21_sub22_sub23_sub24, AMDGPU::sub22_sub23_sub24_sub25, AMDGPU::sub23_sub24_sub25_sub26, + AMDGPU::sub24_sub25_sub26_sub27, AMDGPU::sub25_sub26_sub27_sub28, AMDGPU::sub26_sub27_sub28_sub29, AMDGPU::sub27_sub28_sub29_sub30, + AMDGPU::sub28_sub29_sub30_sub31, AMDGPU::NoSubRegister, AMDGPU::NoSubRegister, AMDGPU::NoSubRegister + } +}; - assert(Channel < array_lengthof(SubRegs)); - return SubRegs[Channel]; +// FIXME: TableGen should generate something to make this manageable for all +// 
register classes. At a minimum we could use the opposite of +// composeSubRegIndices and go up from the base 32-bit subreg. +unsigned AMDGPURegisterInfo::getSubRegFromChannel(unsigned Channel, unsigned NumRegs) { + const unsigned NumRegIndex = NumRegs - 1; + + assert(NumRegIndex < array_lengthof(SubRegFromChannelTable) && + "Not implemented"); + assert(Channel < array_lengthof(SubRegFromChannelTable[0])); + return SubRegFromChannelTable[NumRegIndex][Channel]; } void AMDGPURegisterInfo::reserveRegisterTuples(BitVector &Reserved, unsigned Reg) const { Modified: llvm/trunk/lib/Target/AMDGPU/AMDGPURegisterInfo.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/AMDGPURegisterInfo.h?rev=373947&r1=373946&r2=373947&view=diff ============================================================================== --- llvm/trunk/lib/Target/AMDGPU/AMDGPURegisterInfo.h (original) +++ llvm/trunk/lib/Target/AMDGPU/AMDGPURegisterInfo.h Mon Oct 7 12:16:26 2019 @@ -28,7 +28,7 @@ struct AMDGPURegisterInfo : public AMDGP /// \returns the sub reg enum value for the given \p Channel /// (e.g. 
getSubRegFromChannel(0) -> AMDGPU::sub0) - static unsigned getSubRegFromChannel(unsigned Channel); + static unsigned getSubRegFromChannel(unsigned Channel, unsigned NumRegs = 1); void reserveRegisterTuples(BitVector &, unsigned Reg) const; }; Modified: llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/inst-select-insert.mir URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/inst-select-insert.mir?rev=373947&r1=373946&r2=373947&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/inst-select-insert.mir (original) +++ llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/inst-select-insert.mir Mon Oct 7 12:16:26 2019 @@ -303,41 +303,46 @@ body: | --- -name: insert_s_s256_s_s64_96 +name: insert_s_v256_v_s64_96 +legalized: true +regBankSelected: true + +body: | + bb.0: + liveins: $vgpr0_vgpr1_vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7, $vgpr8_vgpr9 + ; CHECK-LABEL: name: insert_s_v256_v_s64_96 + ; CHECK: [[COPY:%[0-9]+]]:vreg_256 = COPY $vgpr0_vgpr1_vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7 + ; CHECK: [[COPY1:%[0-9]+]]:vreg_64 = COPY $vgpr8_vgpr9 + ; CHECK: [[INSERT_SUBREG:%[0-9]+]]:vreg_256 = INSERT_SUBREG [[COPY]], [[COPY1]], %subreg.sub3_sub4 + ; CHECK: S_ENDPGM 0, implicit [[INSERT_SUBREG]] + %0:vgpr(s256) = COPY $vgpr0_vgpr1_vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7 + %1:vgpr(s64) = COPY $vgpr8_vgpr9 + %2:vgpr(s256) = G_INSERT %0, %1, 96 + S_ENDPGM 0, implicit %2 +... 
+ +--- + +name: insert_s_s256_s_s64_128 legalized: true regBankSelected: true body: | bb.0: liveins: $sgpr0_sgpr1_sgpr2_sgpr3_sgpr4_sgpr5_sgpr6_sgpr7, $sgpr8_sgpr9 - ; CHECK-LABEL: name: insert_s_s256_s_s64_96 + ; CHECK-LABEL: name: insert_s_s256_s_s64_128 ; CHECK: [[COPY:%[0-9]+]]:sreg_256 = COPY $sgpr0_sgpr1_sgpr2_sgpr3_sgpr4_sgpr5_sgpr6_sgpr7 - ; CHECK: [[COPY1:%[0-9]+]]:sreg_64_xexec = COPY $sgpr8_sgpr9 + ; CHECK: [[COPY1:%[0-9]+]]:sreg_64_xexec = COPY $sgpr4_sgpr5 ; CHECK: [[INSERT_SUBREG:%[0-9]+]]:sreg_256 = INSERT_SUBREG [[COPY]], [[COPY1]], %subreg.sub4_sub5 ; CHECK: S_ENDPGM 0, implicit [[INSERT_SUBREG]] %0:sgpr(s256) = COPY $sgpr0_sgpr1_sgpr2_sgpr3_sgpr4_sgpr5_sgpr6_sgpr7 - %1:sgpr(s64) = COPY $sgpr8_sgpr9 - %2:sgpr(s256) = G_INSERT %0, %1, 96 + %1:sgpr(s64) = COPY $sgpr4_sgpr5 + %2:sgpr(s256) = G_INSERT %0, %1, 128 S_ENDPGM 0, implicit %2 ... # --- -# name: insert_s_s256_s_s64_128 -# legalized: true -# regBankSelected: true - -# body: | -# bb.0: -# liveins: $sgpr0_sgpr1_sgpr2_sgpr3_sgpr4_sgpr5_sgpr6_sgpr7, $sgpr8_sgpr9 -# %0:sgpr(s256) = COPY $sgpr0_sgpr1_sgpr2_sgpr3_sgpr4_sgpr5_sgpr6_sgpr7 -# %1:sgpr(s64) = COPY $sgpr4_sgpr5 -# %2:sgpr(s256) = G_INSERT %0, %1, 128 -# S_ENDPGM 0, implicit %2 -# ... - -# --- - # name: insert_s_s256_s_s64_160 # legalized: true # regBankSelected: true @@ -450,3 +455,108 @@ body: | %2:sgpr(s160) = G_INSERT %0, %1, 64 S_ENDPGM 0, implicit %2 ... 
+ +--- + +name: insert_s_s256_s_s128_0 +legalized: true +regBankSelected: true + +body: | + bb.0: + liveins: $sgpr0_sgpr1_sgpr2_sgpr3_sgpr4_sgpr5_sgpr6_sgpr7, $sgpr8_sgpr9_sgpr10_sgpr11 + + ; CHECK-LABEL: name: insert_s_s256_s_s128_0 + ; CHECK: [[COPY:%[0-9]+]]:sreg_256 = COPY $sgpr0_sgpr1_sgpr2_sgpr3_sgpr4_sgpr5_sgpr6_sgpr7 + ; CHECK: [[COPY1:%[0-9]+]]:sreg_128 = COPY $sgpr8_sgpr9_sgpr10_sgpr11 + ; CHECK: [[INSERT_SUBREG:%[0-9]+]]:sreg_256 = INSERT_SUBREG [[COPY]], [[COPY1]], %subreg.sub0_sub1_sub2_sub3 + ; CHECK: S_ENDPGM 0, implicit [[INSERT_SUBREG]] + %0:sgpr(s256) = COPY $sgpr0_sgpr1_sgpr2_sgpr3_sgpr4_sgpr5_sgpr6_sgpr7 + %1:sgpr(s128) = COPY $sgpr8_sgpr9_sgpr10_sgpr11 + %2:sgpr(s256) = G_INSERT %0, %1, 0 + S_ENDPGM 0, implicit %2 +... + +--- + +name: insert_v_s256_v_s128_32 +legalized: true +regBankSelected: true + +body: | + bb.0: + liveins: $vgpr0_vgpr1_vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7, $vgpr8_vgpr9_vgpr10_vgpr11 + + ; CHECK-LABEL: name: insert_v_s256_v_s128_32 + ; CHECK: [[COPY:%[0-9]+]]:vreg_256 = COPY $vgpr0_vgpr1_vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7 + ; CHECK: [[COPY1:%[0-9]+]]:vreg_128 = COPY $vgpr8_vgpr9_vgpr10_vgpr11 + ; CHECK: [[INSERT_SUBREG:%[0-9]+]]:vreg_256 = INSERT_SUBREG [[COPY]], [[COPY1]], %subreg.sub1_sub2_sub3_sub4 + ; CHECK: S_ENDPGM 0, implicit [[INSERT_SUBREG]] + %0:vgpr(s256) = COPY $vgpr0_vgpr1_vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7 + %1:vgpr(s128) = COPY $vgpr8_vgpr9_vgpr10_vgpr11 + %2:vgpr(s256) = G_INSERT %0, %1, 32 + S_ENDPGM 0, implicit %2 +... 
+ +--- + +name: insert_v_s256_v_s128_64 +legalized: true +regBankSelected: true + +body: | + bb.0: + liveins: $vgpr0_vgpr1_vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7, $vgpr8_vgpr9_vgpr10_vgpr11 + + ; CHECK-LABEL: name: insert_v_s256_v_s128_64 + ; CHECK: [[COPY:%[0-9]+]]:vreg_256 = COPY $vgpr0_vgpr1_vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7 + ; CHECK: [[COPY1:%[0-9]+]]:vreg_128 = COPY $vgpr8_vgpr9_vgpr10_vgpr11 + ; CHECK: [[INSERT_SUBREG:%[0-9]+]]:vreg_256 = INSERT_SUBREG [[COPY]], [[COPY1]], %subreg.sub2_sub3_sub4_sub5 + ; CHECK: S_ENDPGM 0, implicit [[INSERT_SUBREG]] + %0:vgpr(s256) = COPY $vgpr0_vgpr1_vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7 + %1:vgpr(s128) = COPY $vgpr8_vgpr9_vgpr10_vgpr11 + %2:vgpr(s256) = G_INSERT %0, %1, 64 + S_ENDPGM 0, implicit %2 +... + +--- + +name: insert_v_s256_v_s128_96 +legalized: true +regBankSelected: true + +body: | + bb.0: + liveins: $vgpr0_vgpr1_vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7, $vgpr8_vgpr9_vgpr10_vgpr11 + + ; CHECK-LABEL: name: insert_v_s256_v_s128_96 + ; CHECK: [[COPY:%[0-9]+]]:vreg_256 = COPY $vgpr0_vgpr1_vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7 + ; CHECK: [[COPY1:%[0-9]+]]:vreg_128 = COPY $vgpr8_vgpr9_vgpr10_vgpr11 + ; CHECK: [[INSERT_SUBREG:%[0-9]+]]:vreg_256 = INSERT_SUBREG [[COPY]], [[COPY1]], %subreg.sub3_sub4_sub5_sub6 + ; CHECK: S_ENDPGM 0, implicit [[INSERT_SUBREG]] + %0:vgpr(s256) = COPY $vgpr0_vgpr1_vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7 + %1:vgpr(s128) = COPY $vgpr8_vgpr9_vgpr10_vgpr11 + %2:vgpr(s256) = G_INSERT %0, %1, 96 + S_ENDPGM 0, implicit %2 +... 
+ +--- + +name: insert_v_s256_v_s128_128 +legalized: true +regBankSelected: true + +body: | + bb.0: + liveins: $vgpr0_vgpr1_vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7, $vgpr8_vgpr9_vgpr10_vgpr11 + + ; CHECK-LABEL: name: insert_v_s256_v_s128_128 + ; CHECK: [[COPY:%[0-9]+]]:vreg_256 = COPY $vgpr0_vgpr1_vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7 + ; CHECK: [[COPY1:%[0-9]+]]:vreg_128 = COPY $vgpr8_vgpr9_vgpr10_vgpr11 + ; CHECK: [[INSERT_SUBREG:%[0-9]+]]:vreg_256 = INSERT_SUBREG [[COPY]], [[COPY1]], %subreg.sub4_sub5_sub6_sub7 + ; CHECK: S_ENDPGM 0, implicit [[INSERT_SUBREG]] + %0:vgpr(s256) = COPY $vgpr0_vgpr1_vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7 + %1:vgpr(s128) = COPY $vgpr8_vgpr9_vgpr10_vgpr11 + %2:vgpr(s256) = G_INSERT %0, %1, 128 + S_ENDPGM 0, implicit %2 +... From llvm-commits at lists.llvm.org Mon Oct 7 12:14:13 2019 From: llvm-commits at lists.llvm.org (Matt Arsenault via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 19:14:13 +0000 (UTC) Subject: [PATCH] D68540: AMDGPU/GlobalISel: Handle more G_INSERT cases In-Reply-To: References: Message-ID: arsenm closed this revision. arsenm added a comment. 
r373947 CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68540/new/ https://reviews.llvm.org/D68540 From llvm-commits at lists.llvm.org Mon Oct 7 12:17:03 2019 From: llvm-commits at lists.llvm.org (Nico Weber via llvm-commits) Date: Mon, 07 Oct 2019 19:17:03 -0000 Subject: [llvm] r373948 - gn build: try to make system-libs.windows.test pass Message-ID: <20191007191703.34C258D51D@lists.llvm.org> Author: nico Date: Mon Oct 7 12:17:02 2019 New Revision: 373948 URL: http://llvm.org/viewvc/llvm-project?rev=373948&view=rev Log: gn build: try to make system-libs.windows.test pass Modified: llvm/trunk/utils/gn/secondary/llvm/tools/llvm-config/BUILD.gn Modified: llvm/trunk/utils/gn/secondary/llvm/tools/llvm-config/BUILD.gn URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/gn/secondary/llvm/tools/llvm-config/BUILD.gn?rev=373948&r1=373947&r2=373948&view=diff ============================================================================== --- llvm/trunk/utils/gn/secondary/llvm/tools/llvm-config/BUILD.gn (original) +++ llvm/trunk/utils/gn/secondary/llvm/tools/llvm-config/BUILD.gn Mon Oct 7 12:17:02 2019 @@ -40,7 +40,7 @@ write_cmake_config("BuildVariables.inc") # lib/Support/Windows/Path.inc. # advapi32 required for CryptAcquireContextW in # lib/Support/Windows/Path.inc - system_libs = "psapi.lib shell32.lib ole32.lib uuid.lib advapi32" + system_libs = "psapi.lib shell32.lib ole32.lib uuid.lib advapi32.lib" } else { system_libs += "-lm" if (host_os == "linux") { From llvm-commits at lists.llvm.org Mon Oct 7 12:16:23 2019 From: llvm-commits at lists.llvm.org (Aditya Kumar via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 19:16:23 +0000 (UTC) Subject: [PATCH] D68570: Unify the two CRC implementations In-Reply-To: References: Message-ID: <60b32703a1567f85639292b7c921b2c6@localhost.localdomain> hiraditya added inline comments. 
================ Comment at: llvm/lib/Support/CRC.cpp:26 -uint32_t llvm::crc32(uint32_t CRC, StringRef S) { - static llvm::once_flag InitFlag; - static CRC32Table Tbl; - llvm::call_once(InitFlag, initCRC32Table, &Tbl); +static const uint32_t CRCTable[256] = { + 0x00000000, 0x77073096, 0xee0e612c, 0x990951ba, ---------------- rupprecht wrote: > Can you leave a comment how this table was generated/how it could be regenerated if needed in the future? And/or a unit test to assert the values are correct? +1 CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68570/new/ https://reviews.llvm.org/D68570 From llvm-commits at lists.llvm.org Mon Oct 7 12:21:13 2019 From: llvm-commits at lists.llvm.org (Eli Friedman via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 19:21:13 +0000 (UTC) Subject: [PATCH] D67749: [AArch64] Stackframe accesses to SVE objects. In-Reply-To: References: Message-ID: efriedma added inline comments. ================ Comment at: lib/Target/AArch64/AArch64InstrInfo.cpp:3366 +} + int llvm::isAArch64FrameOffsetLegal(const MachineInstr &MI, ---------------- cameron.mcinally wrote: > I'm not an LLVM coding standards expert, but does this need an llvm_unreachable()? I think it does... The reason we add llvm_unreachable() after switches in some cases is related to warnings. Some compilers warn about a missing return after a switch that covers every named enum value, but not every possible enum value. That case doesn't apply here: the switch has a "default". CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67749/new/ https://reviews.llvm.org/D67749 From llvm-commits at lists.llvm.org Mon Oct 7 12:21:47 2019 From: llvm-commits at lists.llvm.org (Eli Friedman via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 19:21:47 +0000 (UTC) Subject: [PATCH] D32239: [SCEV] Make SCEV or modeling more aggressive. In-Reply-To: References: Message-ID: <968c024bb67f974f582053807b9f013e@localhost.localdomain> efriedma reopened this revision. 
efriedma added a comment. This revision is now accepted and ready to land. This was reverted. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D32239/new/ https://reviews.llvm.org/D32239 From llvm-commits at lists.llvm.org Mon Oct 7 12:24:38 2019 From: llvm-commits at lists.llvm.org (Evandro Menezes via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 19:24:38 +0000 (UTC) Subject: [PATCH] D68257: [Support] Add mathematical constants In-Reply-To: References: Message-ID: <8ea84e766445f3013ada97bb691c0a66@localhost.localdomain> evandro updated this revision to Diff 223635. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68257/new/ https://reviews.llvm.org/D68257 Files: llvm/include/llvm/Support/MathExtras.h llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D68257.223635.patch Type: text/x-patch Size: 5334 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 12:27:01 2019 From: llvm-commits at lists.llvm.org (Craig Topper via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 19:27:01 +0000 (UTC) Subject: [PATCH] D68484: [PATCH 01/38] [noalias] LangRef: noalias intrinsics and noalias_sidechannel documentation. In-Reply-To: References: Message-ID: craig.topper added inline comments. ================ Comment at: llvm/docs/LangRef.rst:16269 +The ``llvm.noalias`` intrinsic introduces alias assumptions in the normal +computation path of a pointer and it will be opaque for most optimizations. The +``PropagateAndConvertNoAlias`` pass converts ``llvm.noalias`` intrinsics into ---------------- Extra space after "path" ================ Comment at: llvm/docs/LangRef.rst:16272 +``llvm.side.noalias`` intrinsics. 
At the same time, it splits the pointer path +in a computation path (without ``llvm.noalias`` intrinsics) and a +``noalias_sidechannel`` path (with ``llvm.side.noalias`` intrinsics). This ---------------- in -> into? ================ Comment at: llvm/docs/LangRef.rst:16289 +The ``llvm.noalias.copy.guard`` intrinsic is used to annotate that the returned +pointer points to a blob of memory that contains restrict pointers. This allows +to track the *based on* dependency when copying such blocks of memory. ---------------- "allows to track" reads funny. Maybe "allows tracking of"? ================ Comment at: llvm/docs/LangRef.rst:16295 + +Following arguments are typically used in the various intrinscis: + ---------------- intrinsics* CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68484/new/ https://reviews.llvm.org/D68484 From llvm-commits at lists.llvm.org Mon Oct 7 12:32:13 2019 From: llvm-commits at lists.llvm.org (Craig Topper via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 19:32:13 +0000 (UTC) Subject: [PATCH] D68485: [PATCH 02/38] [noalias] D9375: An llvm.noalias intrinsic In-Reply-To: References: Message-ID: craig.topper added inline comments. ================ Comment at: llvm/docs/LangRef.rst:17134 +.. _int_noalias: + ---------------- Is this intended to be a different LangRef text than the documentation in D68484? CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68485/new/ https://reviews.llvm.org/D68485 From llvm-commits at lists.llvm.org Mon Oct 7 12:36:37 2019 From: llvm-commits at lists.llvm.org (Sander de Smalen via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 19:36:37 +0000 (UTC) Subject: [PATCH] D67749: [AArch64] Stackframe accesses to SVE objects. In-Reply-To: References: Message-ID: sdesmalen marked 2 inline comments as done. sdesmalen added inline comments. 
================ Comment at: lib/Target/AArch64/AArch64InstrInfo.cpp:3453 + SOffset = StackOffset(Offset, MVT::i8) + + StackOffset(SOffset.getScalableBytes(), MVT::nxv1i8); return AArch64FrameOffsetCanUpdate | ---------------- cameron.mcinally wrote: > Would you shed some light on what this change is doing? > > `IsMulVL` indicates there are scalable objects on the stack, right? What is the reason for the behavior change of the legacy code when `!IsMulVL`. I.e. the addition of `StackOffset(SOffset.getScalableBytes(), MVT::nxv1i8)` in the else block. `isAArch64FrameOffsetLegal` tries to determine if the immediate field of the instruction can fit the given StackOffset, and will attempt to fold as much of the offset in the immediate. Although a StackOffset can contain both a scalable and a non-scalable part, it will depend on the instruction whether the immediate is scalable or non-scalable. For example, in ```LDR , [{, #, MUL VL}]``` the immediate is `mul vl`, so scalable, which means the instruction can only handle the "scalable" part of the StackOffset. The rest of the offset will need to be handled elsewhere. The variable `int64_t Offset` uses either the scalable or non-scalable part of the StackOffset, which happens on line 3411. After that, this function does its magic to determine what part of the offset can be folded into the immediate. On line 3448, the remaining part of the offset that could *not* be folded into the immediate will need to be reflected in the in/out parameter `SOffset`, which is a StackOffset. If `IsMulVL` is true, then variable `Offset` is scalable and will at this point contain the part of the scalable offset that could not be folded into the immediate. SOffset.getBytes() just passes through the fixed-size part of the offset that is not handled by the instruction. Conversely, if `IsMulVL` is false, then the variable `Offset` is non-scalable and will contain the part of the fixed-size offset that could not be folded into the immediate. 
It then has to pass through the SOffset.getScalableBytes() part, which is not handled by the instruction, so it can be handled elsewhere. For example: ```isAArch64FrameOffsetLegal(AArch64::LDR_ZXI, {16, MVT::i8} + {16, MVT::nxv1i8})``` would fold `{16, MVT::nxv1i8}` into the immediate, and the resulting SOffset would be `{16, MVT::i8}`. It would return `AArch64FrameOffsetCanUpdate`. ```isAArch64FrameOffsetLegal(AArch64::LDR_ZXI, {16, MVT::i8} + {4096, MVT::nxv1i8})``` would fold `{4080, MVT::nxv1i8}` into the immediate (note that its immediate goes up to `#255 MUL VL`), and the resulting SOffset would be `{16, MVT::i8} + {16, MVT::nxv1i8}`. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67749/new/ https://reviews.llvm.org/D67749 From llvm-commits at lists.llvm.org Mon Oct 7 12:38:06 2019 From: llvm-commits at lists.llvm.org (Aditya Kumar via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 19:38:06 +0000 (UTC) Subject: [PATCH] D68579: [HardwareLoops] Optimisation remarks In-Reply-To: References: Message-ID: hiraditya added a comment. Nice! CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68579/new/ https://reviews.llvm.org/D68579 From llvm-commits at lists.llvm.org Mon Oct 7 12:41:28 2019 From: llvm-commits at lists.llvm.org (Cameron McInally via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 19:41:28 +0000 (UTC) Subject: [PATCH] D67749: [AArch64] Stackframe accesses to SVE objects. In-Reply-To: References: Message-ID: <78dc57723aebcb6b1d35121ab2e11eed@localhost.localdomain> cameron.mcinally added inline comments. ================ Comment at: lib/Target/AArch64/AArch64InstrInfo.cpp:3366 +} + int llvm::isAArch64FrameOffsetLegal(const MachineInstr &MI, ---------------- efriedma wrote: > cameron.mcinally wrote: > > I'm not an LLVM coding standards expert, but does this need an llvm_unreachable()? I think it does... > The reason we add llvm_unreachable() after switches in some cases is related to warnings.
Some compilers warn about a missing return after a switch that covers every named enum value, but not every possible enum value. That case doesn't apply here: the switch has a "default". Ah, ok. I thought that it was protection in case the switch is changed in the future. The X86 backend has a few switches with default cases that also have unreachable after them. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67749/new/ https://reviews.llvm.org/D67749 From llvm-commits at lists.llvm.org Mon Oct 7 12:49:26 2019 From: llvm-commits at lists.llvm.org (Eli Friedman via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 19:49:26 +0000 (UTC) Subject: [PATCH] D53876: Preserve loop metadata when splitting exit blocks In-Reply-To: References: Message-ID: <6b02c89922f8103a2a3ae9bd58ccdbdc@localhost.localdomain> efriedma added a comment. Is this patch still relevant? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D53876/new/ https://reviews.llvm.org/D53876 From llvm-commits at lists.llvm.org Mon Oct 7 12:52:35 2019 From: llvm-commits at lists.llvm.org (Jeroen Dobbelaere via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 19:52:35 +0000 (UTC) Subject: [PATCH] D68485: [PATCH 02/38] [noalias] D9375: An llvm.noalias intrinsic In-Reply-To: References: Message-ID: jeroen.dobbelaere marked an inline comment as done. jeroen.dobbelaere added inline comments. ================ Comment at: llvm/docs/LangRef.rst:17134 +.. _int_noalias: + ---------------- craig.topper wrote: > Is this intended to be a different LangRef text than the documentation in D68484? No, the 'rebases' are meant to be just a 'rebase' of the original patch from Hal Finkel. The next patch adapts it to the full restrict support. For this part here, the next patch removes this documentation again.
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68485/new/ https://reviews.llvm.org/D68485 From llvm-commits at lists.llvm.org Mon Oct 7 12:55:03 2019 From: llvm-commits at lists.llvm.org (Matt Arsenault via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 19:55:03 +0000 (UTC) Subject: [PATCH] D68563: [AMDGPU] Disable a test that was relying on misched behavior In-Reply-To: References: Message-ID: arsenm added a comment. I'm curious what the scheduler is able to do here? Everything is volatile and non-reorderable Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68563/new/ https://reviews.llvm.org/D68563 From llvm-commits at lists.llvm.org Mon Oct 7 12:56:34 2019 From: llvm-commits at lists.llvm.org (Jonas Devlieghere via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 19:56:34 +0000 (UTC) Subject: [PATCH] D65677: [VirtualFileSystem] Make the RedirectingFileSystem hold on to its own working directory. In-Reply-To: References: Message-ID: <9f8bb2a4d176735ce727b5f29343309f@localhost.localdomain> JDevlieghere updated this revision to Diff 223639. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D65677/new/ https://reviews.llvm.org/D65677 Files: llvm/include/llvm/Support/VirtualFileSystem.h llvm/lib/Support/VirtualFileSystem.cpp llvm/unittests/Support/VirtualFileSystemTest.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D65677.223639.patch Type: text/x-patch Size: 13139 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 13:01:12 2019 From: llvm-commits at lists.llvm.org (Nirav Dave via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 20:01:12 +0000 (UTC) Subject: [PATCH] D30471: [SDAG] Relax conditions under stores of loaded values can be merged In-Reply-To: References: Message-ID: <4499cfe4a9e6b8efb05f3291315dbd58@localhost.localdomain> niravd added a comment. This is almost certainly just a matter of deleting the assert. 
I would land this myself, but I've yet to go through the official "you can commit" with the new company's legal department. Is anyone willing to try removing the assert and recommitting? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D30471/new/ https://reviews.llvm.org/D30471 From llvm-commits at lists.llvm.org Mon Oct 7 13:02:16 2019 From: llvm-commits at lists.llvm.org (Cameron McInally via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 20:02:16 +0000 (UTC) Subject: [PATCH] D68588: [Bitcode] Update naming of UNOP_NEG to UNOP_FNEG Message-ID: cameron.mcinally created this revision. cameron.mcinally added a reviewer: lebedev.ri. Herald added a reviewer: deadalnix. Herald added subscribers: llvm-commits, hiraditya. Herald added a project: LLVM. As requested in D53877 post-commit review, update naming of UNOP_NEG to UNOP_FNEG. Also add UnaryOperator to the LLVMIsA##name macro. There do not appear to be existing tests for the LLVMIsA* functions, so I'm not sure how to test it.
Repository: rL LLVM https://reviews.llvm.org/D68588 Files: llvm/include/llvm-c/Core.h llvm/include/llvm/Bitcode/LLVMBitCodes.h llvm/lib/Bitcode/Reader/BitcodeReader.cpp llvm/lib/Bitcode/Writer/BitcodeWriter.cpp Index: llvm/lib/Bitcode/Writer/BitcodeWriter.cpp =================================================================== --- llvm/lib/Bitcode/Writer/BitcodeWriter.cpp +++ llvm/lib/Bitcode/Writer/BitcodeWriter.cpp @@ -520,7 +520,7 @@ static unsigned getEncodedUnaryOpcode(unsigned Opcode) { switch (Opcode) { default: llvm_unreachable("Unknown binary instruction!"); - case Instruction::FNeg: return bitc::UNOP_NEG; + case Instruction::FNeg: return bitc::UNOP_FNEG; } } Index: llvm/lib/Bitcode/Reader/BitcodeReader.cpp =================================================================== --- llvm/lib/Bitcode/Reader/BitcodeReader.cpp +++ llvm/lib/Bitcode/Reader/BitcodeReader.cpp @@ -1063,7 +1063,7 @@ switch (Val) { default: return -1; - case bitc::UNOP_NEG: + case bitc::UNOP_FNEG: return IsFP ? Instruction::FNeg : -1; } } Index: llvm/include/llvm/Bitcode/LLVMBitCodes.h =================================================================== --- llvm/include/llvm/Bitcode/LLVMBitCodes.h +++ llvm/include/llvm/Bitcode/LLVMBitCodes.h @@ -391,7 +391,7 @@ /// have no fixed relation to the LLVM IR enum values. Changing these will /// break compatibility with old files. enum UnaryOpcodes { - UNOP_NEG = 0 + UNOP_FNEG = 0 }; /// BinaryOpcodes - These are values used in the bitcode files to encode which Index: llvm/include/llvm-c/Core.h =================================================================== --- llvm/include/llvm-c/Core.h +++ llvm/include/llvm-c/Core.h @@ -1543,6 +1543,7 @@ macro(GlobalVariable) \ macro(UndefValue) \ macro(Instruction) \ + macro(UnaryOperator) \ macro(BinaryOperator) \ macro(CallInst) \ macro(IntrinsicInst) \ -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68588.223637.patch Type: text/x-patch Size: 1856 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 13:05:13 2019 From: llvm-commits at lists.llvm.org (Matt Arsenault via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 20:05:13 +0000 (UTC) Subject: [PATCH] D65097: AMDGPU: Add offsets to MMO when lowering buffer intrinsics In-Reply-To: References: Message-ID: <3795671b8f3b3d26a98df4f1717d4821@localhost.localdomain> arsenm accepted this revision. arsenm added a comment. This revision is now accepted and ready to land. LGTM Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D65097/new/ https://reviews.llvm.org/D65097 From llvm-commits at lists.llvm.org Mon Oct 7 13:09:59 2019 From: llvm-commits at lists.llvm.org (Jordan Rose via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 20:09:59 +0000 (UTC) Subject: [PATCH] D68586: Save a word in every StringSet entry In-Reply-To: References: Message-ID: jordan_rose updated this revision to Diff 223640. jordan_rose added a comment. Herald added subscribers: dang, steven_wu, hiraditya, mehdi_amini. Fixed bad uses of StringSet, changed a `friend` from StringMapEntry to StringMapEntryStorage. The fact that I only had to do this in one place (and that one place is definitely doing something tricky) makes me still feel confident enough to make this change. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68586/new/ https://reviews.llvm.org/D68586 Files: llvm/include/llvm/ADT/StringMap.h llvm/include/llvm/ADT/StringSet.h llvm/include/llvm/IR/Metadata.h llvm/include/llvm/LTO/legacy/LTOCodeGenerator.h llvm/lib/LTO/LTOCodeGenerator.cpp -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68586.223640.patch Type: text/x-patch Size: 5869 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 13:11:47 2019 From: llvm-commits at lists.llvm.org (Eli Friedman via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 20:11:47 +0000 (UTC) Subject: [PATCH] D29121: [Docs] Add LangRef documention for freeze instruction In-Reply-To: References: Message-ID: <6a2c99df7b823dacb4b9788a2799a9f6@localhost.localdomain> efriedma added a comment. > we want freeze to be fully agnostic, it should not care *at all* what the type is, right? Well, it has to be some value which actually has bits that can be frozen. So integers, floats, pointers, vectors, arrays, and structs. Maybe worth listing out explicitly. For pointers, we should probably mention the aliasing rules explicitly. It should be similar to null: the result can't be dereferenced. (See http://llvm.org/docs/LangRef.html#pointer-aliasing-rules .) CHANGES SINCE LAST ACTION https://reviews.llvm.org/D29121/new/ https://reviews.llvm.org/D29121 From llvm-commits at lists.llvm.org Mon Oct 7 13:26:07 2019 From: llvm-commits at lists.llvm.org (Roman Lebedev via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 20:26:07 +0000 (UTC) Subject: [PATCH] D68588: [Bitcode] Update naming of UNOP_NEG to UNOP_FNEG In-Reply-To: References: Message-ID: <928b5ae9bc49948d32cad8e82535c88d@localhost.localdomain> lebedev.ri accepted this revision. lebedev.ri added a comment. This revision is now accepted and ready to land. Thank you, LG! ================ Comment at: llvm/include/llvm-c/Core.h:1546 macro(Instruction) \ + macro(UnaryOperator) \ macro(BinaryOperator) \ ---------------- Commit this separately? "llvm-c: there's Unary operator" I don't know what else might be missing for C API though. 
Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68588/new/ https://reviews.llvm.org/D68588 From llvm-commits at lists.llvm.org Mon Oct 7 13:27:21 2019 From: llvm-commits at lists.llvm.org (Roman Lebedev via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 20:27:21 +0000 (UTC) Subject: [PATCH] D68588: [Bitcode] Update naming of UNOP_NEG to UNOP_FNEG In-Reply-To: References: Message-ID: <4162a665ee4050a24e962c2cd1e0482f@localhost.localdomain> lebedev.ri added a reviewer: whitequark. lebedev.ri added a subscriber: whitequark. lebedev.ri added a comment. @whitequark might know about C API Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68588/new/ https://reviews.llvm.org/D68588 From llvm-commits at lists.llvm.org Mon Oct 7 13:27:58 2019 From: llvm-commits at lists.llvm.org (Alina Sbirlea via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 20:27:58 +0000 (UTC) Subject: [PATCH] D68573: [LoopRotate] Unconditionally get ScalarEvolution. In-Reply-To: References: Message-ID: <30aae712d173b3705d97cb827ecc256d@localhost.localdomain> asbirlea accepted this revision. asbirlea added a comment. This revision is now accepted and ready to land. lgtm. ================ Comment at: llvm/lib/Transforms/Scalar/LoopRotation.cpp:97 auto *AC = &getAnalysis<AssumptionCacheTracker>().getAssumptionCache(F); auto *DTWP = getAnalysisIfAvailable<DominatorTreeWrapperPass>(); auto *DT = DTWP ? &DTWP->getDomTree() : nullptr; ---------------- AFAICT this also holds true for the DominatorTreeWrapperPass.
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68573/new/ https://reviews.llvm.org/D68573 From llvm-commits at lists.llvm.org Mon Oct 7 13:28:14 2019 From: llvm-commits at lists.llvm.org (Thomas Lively via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 20:28:14 +0000 (UTC) Subject: [PATCH] D66035: [WebAssembly] WIP: Add support for reference types In-Reply-To: References: Message-ID: <74a39d28bb6074ba43c7b13866d31b5d@localhost.localdomain> tlively added a comment. Another approach to reference types we should look at is clang's upcoming sizeless types support (https://reviews.llvm.org/D62962, RFC: http://lists.llvm.org/pipermail/cfe-dev/2019-June/062523.html). This would allow reference types to be constructed at the source level but most operations such as `sizeof`, loading, and storing would be disallowed because they would be treated as incomplete types. I don't think that approach is inconsistent with this one, though. This patch deals mostly with the backend while sizeless types are a frontend concept. I'm not sure what the codegen and lowering would look like, though. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D66035/new/ https://reviews.llvm.org/D66035 From llvm-commits at lists.llvm.org Mon Oct 7 13:30:32 2019 From: llvm-commits at lists.llvm.org (Florian Hahn via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 20:30:32 +0000 (UTC) Subject: [PATCH] D68573: [LoopRotate] Unconditionally get ScalarEvolution. In-Reply-To: References: Message-ID: fhahn marked an inline comment as done. fhahn added inline comments. ================ Comment at: llvm/lib/Transforms/Scalar/LoopRotation.cpp:97 auto *AC = &getAnalysis<AssumptionCacheTracker>().getAssumptionCache(F); auto *DTWP = getAnalysisIfAvailable<DominatorTreeWrapperPass>(); auto *DT = DTWP ? &DTWP->getDomTree() : nullptr; ---------------- asbirlea wrote: > AFAICT this also holds true for the DominatorTreeWrapperPass. Yep. I'll commit this with both fixed.
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68573/new/ https://reviews.llvm.org/D68573 From llvm-commits at lists.llvm.org Mon Oct 7 13:33:20 2019 From: llvm-commits at lists.llvm.org (Jonas Devlieghere via llvm-commits) Date: Mon, 07 Oct 2019 20:33:20 -0000 Subject: [llvm] r373956 - [AccelTable] Remove stale comment (NFC) Message-ID: <20191007203320.C30BD8732A@lists.llvm.org> Author: jdevlieghere Date: Mon Oct 7 13:33:20 2019 New Revision: 373956 URL: http://llvm.org/viewvc/llvm-project?rev=373956&view=rev Log: [AccelTable] Remove stale comment (NFC) rdar://55857228 Modified: llvm/trunk/include/llvm/CodeGen/AccelTable.h Modified: llvm/trunk/include/llvm/CodeGen/AccelTable.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/CodeGen/AccelTable.h?rev=373956&r1=373955&r2=373956&view=diff ============================================================================== --- llvm/trunk/include/llvm/CodeGen/AccelTable.h (original) +++ llvm/trunk/include/llvm/CodeGen/AccelTable.h Mon Oct 7 13:33:20 2019 @@ -101,8 +101,6 @@ /// /// An Apple Accelerator Table can be serialized by calling emitAppleAccelTable /// function. -/// -/// TODO: Add DWARF v5 emission code. 
namespace llvm { From llvm-commits at lists.llvm.org Mon Oct 7 13:41:25 2019 From: llvm-commits at lists.llvm.org (Cameron McInally via llvm-commits) Date: Mon, 07 Oct 2019 20:41:25 -0000 Subject: [llvm] r373958 - [Bitcode] Update naming of UNOP_NEG to UNOP_FNEG Message-ID: <20191007204125.DB6CB86DE0@lists.llvm.org> Author: mcinally Date: Mon Oct 7 13:41:25 2019 New Revision: 373958 URL: http://llvm.org/viewvc/llvm-project?rev=373958&view=rev Log: [Bitcode] Update naming of UNOP_NEG to UNOP_FNEG Differential Revision: https://reviews.llvm.org/D68588 Modified: llvm/trunk/include/llvm/Bitcode/LLVMBitCodes.h llvm/trunk/lib/Bitcode/Reader/BitcodeReader.cpp llvm/trunk/lib/Bitcode/Writer/BitcodeWriter.cpp Modified: llvm/trunk/include/llvm/Bitcode/LLVMBitCodes.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Bitcode/LLVMBitCodes.h?rev=373958&r1=373957&r2=373958&view=diff ============================================================================== --- llvm/trunk/include/llvm/Bitcode/LLVMBitCodes.h (original) +++ llvm/trunk/include/llvm/Bitcode/LLVMBitCodes.h Mon Oct 7 13:41:25 2019 @@ -391,7 +391,7 @@ enum CastOpcodes { /// have no fixed relation to the LLVM IR enum values. Changing these will /// break compatibility with old files. enum UnaryOpcodes { - UNOP_NEG = 0 + UNOP_FNEG = 0 }; /// BinaryOpcodes - These are values used in the bitcode files to encode which Modified: llvm/trunk/lib/Bitcode/Reader/BitcodeReader.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Bitcode/Reader/BitcodeReader.cpp?rev=373958&r1=373957&r2=373958&view=diff ============================================================================== --- llvm/trunk/lib/Bitcode/Reader/BitcodeReader.cpp (original) +++ llvm/trunk/lib/Bitcode/Reader/BitcodeReader.cpp Mon Oct 7 13:41:25 2019 @@ -1063,7 +1063,7 @@ static int getDecodedUnaryOpcode(unsigne switch (Val) { default: return -1; - case bitc::UNOP_NEG: + case bitc::UNOP_FNEG: return IsFP ? 
Instruction::FNeg : -1; } } Modified: llvm/trunk/lib/Bitcode/Writer/BitcodeWriter.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Bitcode/Writer/BitcodeWriter.cpp?rev=373958&r1=373957&r2=373958&view=diff ============================================================================== --- llvm/trunk/lib/Bitcode/Writer/BitcodeWriter.cpp (original) +++ llvm/trunk/lib/Bitcode/Writer/BitcodeWriter.cpp Mon Oct 7 13:41:25 2019 @@ -520,7 +520,7 @@ static unsigned getEncodedCastOpcode(uns static unsigned getEncodedUnaryOpcode(unsigned Opcode) { switch (Opcode) { default: llvm_unreachable("Unknown binary instruction!"); - case Instruction::FNeg: return bitc::UNOP_NEG; + case Instruction::FNeg: return bitc::UNOP_FNEG; } } From llvm-commits at lists.llvm.org Mon Oct 7 13:43:07 2019 From: llvm-commits at lists.llvm.org (Jonas Devlieghere via llvm-commits) Date: Mon, 07 Oct 2019 20:43:07 -0000 Subject: [zorg] r373959 - [LLDB] Add LLVM 9 to the Matrix bot Message-ID: <20191007204307.2CA5886E18@lists.llvm.org> Author: jdevlieghere Date: Mon Oct 7 13:43:07 2019 New Revision: 373959 URL: http://llvm.org/viewvc/llvm-project?rev=373959&view=rev Log: [LLDB] Add LLVM 9 to the Matrix bot Modified: zorg/trunk/zorg/jenkins/jobs/jobs/lldb-cmake-matrix Modified: zorg/trunk/zorg/jenkins/jobs/jobs/lldb-cmake-matrix URL: http://llvm.org/viewvc/llvm-project/zorg/trunk/zorg/jenkins/jobs/jobs/lldb-cmake-matrix?rev=373959&r1=373958&r2=373959&view=diff ============================================================================== --- zorg/trunk/zorg/jenkins/jobs/jobs/lldb-cmake-matrix (original) +++ zorg/trunk/zorg/jenkins/jobs/jobs/lldb-cmake-matrix Mon Oct 7 13:43:07 2019 @@ -174,16 +174,16 @@ pipeline { junit 'test/results.xml' } } - stage('Build Clang 6.0.1') { + stage('Build Clang 7.0.1') { steps { - dir('clang_601') { - checkout([$class: 'GitSCM', branches: [[name: "llvmorg-6.0.1"]], userRemoteConfigs: [[url: 'http://labmaster3.local/git/llvm-project.git']]]) + dir('clang_701') { + 
checkout([$class: 'GitSCM', branches: [[name: "llvmorg-7.0.1"]], userRemoteConfigs: [[url: 'http://labmaster3.local/git/llvm-project.git']]]) } timeout(90) { sh ''' export PATH=$PATH:/usr/bin:/usr/local/bin - export SRC_DIR='clang_601' - export BUILD_DIR='clang_601_build' + export SRC_DIR='clang_701' + export BUILD_DIR='clang_701_build' python llvm-zorg/zorg/jenkins/monorepo_build.py cmake build \ --assertions \ @@ -194,12 +194,12 @@ pipeline { } } } - stage('Test Clang 6.0.1') { + stage('Test Clang 7.0.1') { steps { timeout(60) { sh ''' export PATH=$PATH:/usr/bin:/usr/local/bin - export LLDB_TEST_COMPILER="$WORKSPACE/clang_601_build/bin/clang" + export LLDB_TEST_COMPILER="$WORKSPACE/clang_701_build/bin/clang" python llvm-zorg/zorg/jenkins/monorepo_build.py lldb-cmake-matrix configure \ --assertions \ --projects="clang;libcxx;libcxxabi;lldb" \ @@ -218,16 +218,16 @@ pipeline { junit 'test/results.xml' } } - stage('Build Clang 7.0.1') { + stage('Build Clang 9.0.0') { steps { - dir('clang_701') { - checkout([$class: 'GitSCM', branches: [[name: "llvmorg-7.0.1"]], userRemoteConfigs: [[url: 'http://labmaster3.local/git/llvm-project.git']]]) + dir('clang_900') { + checkout([$class: 'GitSCM', branches: [[name: "llvmorg-9.0.0"]], userRemoteConfigs: [[url: 'http://labmaster3.local/git/llvm-project.git']]]) } timeout(90) { sh ''' export PATH=$PATH:/usr/bin:/usr/local/bin - export SRC_DIR='clang_701' - export BUILD_DIR='clang_701_build' + export SRC_DIR='clang_900' + export BUILD_DIR='clang_900_build' python llvm-zorg/zorg/jenkins/monorepo_build.py cmake build \ --assertions \ @@ -238,12 +238,12 @@ pipeline { } } } - stage('Test Clang 7.0.1') { + stage('Test Clang 9.0.0') { steps { timeout(60) { sh ''' export PATH=$PATH:/usr/bin:/usr/local/bin - export LLDB_TEST_COMPILER="$WORKSPACE/clang_701_build/bin/clang" + export LLDB_TEST_COMPILER="$WORKSPACE/clang_900_build/bin/clang" python llvm-zorg/zorg/jenkins/monorepo_build.py lldb-cmake-matrix configure \ --assertions \ 
--projects="clang;libcxx;libcxxabi;lldb" \ From llvm-commits at lists.llvm.org Mon Oct 7 13:41:15 2019 From: llvm-commits at lists.llvm.org (Nikolai Tillmann via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 20:41:15 +0000 (UTC) Subject: [PATCH] D68530: [AArch64] Make combining of callee-save and local stack adjustment optional In-Reply-To: References: Message-ID: <68cf10ef8235b0314d3c32c549d97a1d@localhost.localdomain> Nikolai added a comment. > Do we need the option? Or should we just be doing this whenever we are Optsize? We don't really need it. I mainly introduced this option for "maximal backwards compatibility", in case anyone has taken a dependency on the code generation pattern outside of the LLVM test suite. If you prefer, I can update the diff, removing the option. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68530/new/ https://reviews.llvm.org/D68530 From llvm-commits at lists.llvm.org Mon Oct 7 13:43:42 2019 From: llvm-commits at lists.llvm.org (Reid Kleckner via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 20:43:42 +0000 (UTC) Subject: [PATCH] D67855: [X86] Add new calling convention that guarantees tail call optimization In-Reply-To: References: Message-ID: <4aac7a634ccd88d90e855327462305de@localhost.localdomain> rnk accepted this revision. rnk added a comment. This revision is now accepted and ready to land. (back from vacation) Thanks, I think this looks good. Would you like somebody to commit this? CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67855/new/ https://reviews.llvm.org/D67855 From llvm-commits at lists.llvm.org Mon Oct 7 13:48:02 2019 From: llvm-commits at lists.llvm.org (Dwight Guth via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 20:48:02 +0000 (UTC) Subject: [PATCH] D67855: [X86] Add new calling convention that guarantees tail call optimization In-Reply-To: References: Message-ID: dwightguth added a comment. It's ready on my end. I don't have commit access, so... 
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67855/new/ https://reviews.llvm.org/D67855 From llvm-commits at lists.llvm.org Mon Oct 7 13:49:53 2019 From: llvm-commits at lists.llvm.org (Julian Lettner via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 20:49:53 +0000 (UTC) Subject: [PATCH] D68589: [lit] Leverage argparse features to remove some code Message-ID: yln created this revision. yln added reviewers: rnk, ddunbar, serge-sans-paille, probinson, jdenny, cishida, nate_chandler, jordan_rose. Herald added subscribers: llvm-commits, delcypher. Herald added a project: LLVM. Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D68589 Files: llvm/utils/lit/lit/cl_arguments.py llvm/utils/lit/tests/max-failures.py llvm/utils/lit/tests/selecting.py -------------- next part -------------- A non-text attachment was scrubbed... Name: D68589.223642.patch Type: text/x-patch Size: 5939 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 13:52:53 2019 From: llvm-commits at lists.llvm.org (Roman Lebedev via llvm-commits) Date: Mon, 07 Oct 2019 20:52:53 -0000 Subject: [llvm] r373960 - [InstCombine] dropRedundantMaskingOfLeftShiftInput(): propagate undef shift amounts Message-ID: <20191007205253.344848361F@lists.llvm.org> Author: lebedevri Date: Mon Oct 7 13:52:52 2019 New Revision: 373960 URL: http://llvm.org/viewvc/llvm-project?rev=373960&view=rev Log: [InstCombine] dropRedundantMaskingOfLeftShiftInput(): propagate undef shift amounts Summary: When we do `ConstantExpr::getZExt()`, that "extends" `undef` to `0`, which means that for patterns a/b we'd assume that we must not produce any bits for that channel, while in reality we simply didn't care about that channel - i.e. we don't need to mask it. 
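(Editorial sketch, not part of the commit.) The effect described in the summary above can be illustrated with plain integers. This is a minimal model, assuming the mask is computed as ~(-1 << SumOfShAmts) in a type twice as wide as the value type and then truncated, as the code comments in this patch describe. The names `MaskResult` and `lowBitsMask` are hypothetical and do not exist in LLVM; an out-of-range shift amount is modelled as "undefined" to mirror how an over-wide `shl` behaves in LLVM IR.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical result type: `defined` is false when the modelled lane
// would stay undef because the shift amount is out of range.
struct MaskResult {
  bool defined;
  uint32_t mask;
};

// Hypothetical helper modelling trunc(~(-1 << zext(SumOfShAmts))) for a
// value type of `width` bits, computed in a type of 2 * width bits.
MaskResult lowBitsMask(unsigned width, unsigned sumOfShAmts) {
  assert(width >= 1 && width <= 32 && "sketch only models up to 32 bits");
  const unsigned extWidth = 2 * width;
  // Shifting by the full extended width (or more) is treated as undefined.
  if (sumOfShAmts >= extWidth)
    return MaskResult{false, 0};
  const uint64_t extAllOnes =
      extWidth == 64 ? ~uint64_t{0} : ((uint64_t{1} << extWidth) - 1);
  const uint64_t invertedMask = (extAllOnes << sumOfShAmts) & extAllOnes;
  const uint64_t extMask = ~invertedMask & extAllOnes;
  const uint64_t truncBits =
      width == 32 ? uint64_t{0xFFFFFFFF} : ((uint64_t{1} << width) - 1);
  return MaskResult{true, static_cast<uint32_t>(extMask & truncBits)};
}
```

Under these assumptions, a 32-bit lane whose undef shift amount was zero-extended to 0 gets the mask 0, which clears every bit of that lane. If the undef amount is first replaced by the extended bit width (64 here), the lane is reported as undefined instead, which matches the intent stated in the patch comments that the value should remain undef.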
Reviewers: spatel Reviewed By: spatel Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68239 Modified: llvm/trunk/lib/Transforms/InstCombine/InstCombineShifts.cpp llvm/trunk/test/Transforms/InstCombine/partally-redundant-left-shift-input-masking-variant-a.ll llvm/trunk/test/Transforms/InstCombine/partally-redundant-left-shift-input-masking-variant-b.ll llvm/trunk/test/Transforms/InstCombine/partally-redundant-left-shift-input-masking-variant-c.ll llvm/trunk/test/Transforms/InstCombine/partally-redundant-left-shift-input-masking-variant-d.ll llvm/trunk/test/Transforms/InstCombine/partally-redundant-left-shift-input-masking-variant-e.ll Modified: llvm/trunk/lib/Transforms/InstCombine/InstCombineShifts.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/InstCombine/InstCombineShifts.cpp?rev=373960&r1=373959&r2=373960&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/InstCombine/InstCombineShifts.cpp (original) +++ llvm/trunk/lib/Transforms/InstCombine/InstCombineShifts.cpp Mon Oct 7 13:52:52 2019 @@ -117,6 +117,24 @@ reassociateShiftAmtsOfTwoSameDirectionSh return Ret; } +// Try to replace `undef` constants in C with Replacement. +static Constant *replaceUndefsWith(Constant *C, Constant *Replacement) { + if (C && match(C, m_Undef())) + return Replacement; + + if (auto *CV = dyn_cast(C)) { + llvm::SmallVector NewOps(CV->getNumOperands()); + for (unsigned i = 0, NumElts = NewOps.size(); i != NumElts; ++i) { + Constant *EltC = CV->getOperand(i); + NewOps[i] = EltC && match(EltC, m_Undef()) ? Replacement : EltC; + } + return ConstantVector::get(NewOps); + } + + // Don't know how to deal with this constant. + return C; +} + // If we have some pattern that leaves only some low bits set, and then performs // left-shift of those bits, if none of the bits that are left after the final // shift are modified by the mask, we can omit the mask. 
@@ -177,6 +195,14 @@ dropRedundantMaskingOfLeftShiftInput(Bin // The mask must be computed in a type twice as wide to ensure // that no bits are lost if the sum-of-shifts is wider than the base type. Type *ExtendedTy = Ty->getExtendedType(); + // An extend of an undef value becomes zero because the high bits are + // never completely unknown. Replace the the `undef` shift amounts with + // final shift bitwidth to ensure that the value remains undef when + // creating the subsequent shift op. + SumOfShAmts = replaceUndefsWith( + SumOfShAmts, + ConstantInt::get(SumOfShAmts->getType()->getScalarType(), + ExtendedTy->getScalarType()->getScalarSizeInBits())); auto *ExtendedSumOfShAmts = ConstantExpr::getZExt(SumOfShAmts, ExtendedTy); // And compute the mask as usual: ~(-1 << (SumOfShAmts)) @@ -212,6 +238,13 @@ dropRedundantMaskingOfLeftShiftInput(Bin // The mask must be computed in a type twice as wide to ensure // that no bits are lost if the sum-of-shifts is wider than the base type. Type *ExtendedTy = Ty->getExtendedType(); + // An extend of an undef value becomes zero because the high bits are + // never completely unknown. Replace the the `undef` shift amounts with + // negated shift bitwidth to ensure that the value remains undef when + // creating the subsequent shift op. 
+ ShAmtsDiff = replaceUndefsWith( + ShAmtsDiff, + ConstantInt::get(ShAmtsDiff->getType()->getScalarType(), -BitWidth)); auto *ExtendedNumHighBitsToClear = ConstantExpr::getZExt( ConstantExpr::getAdd( ConstantExpr::getNeg(ShAmtsDiff), Modified: llvm/trunk/test/Transforms/InstCombine/partally-redundant-left-shift-input-masking-variant-a.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/InstCombine/partally-redundant-left-shift-input-masking-variant-a.ll?rev=373960&r1=373959&r2=373960&view=diff ============================================================================== --- llvm/trunk/test/Transforms/InstCombine/partally-redundant-left-shift-input-masking-variant-a.ll (original) +++ llvm/trunk/test/Transforms/InstCombine/partally-redundant-left-shift-input-masking-variant-a.ll Mon Oct 7 13:52:52 2019 @@ -82,7 +82,7 @@ define <8 x i32> @t1_vec_splat_undef(<8 ; CHECK-NEXT: call void @use8xi32(<8 x i32> [[T2]]) ; CHECK-NEXT: call void @use8xi32(<8 x i32> [[T4]]) ; CHECK-NEXT: [[TMP1:%.*]] = shl <8 x i32> [[X:%.*]], [[T4]] -; CHECK-NEXT: [[T5:%.*]] = and <8 x i32> [[TMP1]], +; CHECK-NEXT: [[T5:%.*]] = and <8 x i32> [[TMP1]], ; CHECK-NEXT: ret <8 x i32> [[T5]] ; %t0 = add <8 x i32> %nbits, Modified: llvm/trunk/test/Transforms/InstCombine/partally-redundant-left-shift-input-masking-variant-b.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/InstCombine/partally-redundant-left-shift-input-masking-variant-b.ll?rev=373960&r1=373959&r2=373960&view=diff ============================================================================== --- llvm/trunk/test/Transforms/InstCombine/partally-redundant-left-shift-input-masking-variant-b.ll (original) +++ llvm/trunk/test/Transforms/InstCombine/partally-redundant-left-shift-input-masking-variant-b.ll Mon Oct 7 13:52:52 2019 @@ -82,7 +82,7 @@ define <8 x i32> @t1_vec_splat_undef(<8 ; CHECK-NEXT: call void @use8xi32(<8 x i32> [[T2]]) ; CHECK-NEXT: call void @use8xi32(<8 x i32> [[T4]]) ; CHECK-NEXT: 
[[TMP1:%.*]] = shl <8 x i32> [[X:%.*]], [[T4]] -; CHECK-NEXT: [[T5:%.*]] = and <8 x i32> [[TMP1]], +; CHECK-NEXT: [[T5:%.*]] = and <8 x i32> [[TMP1]], ; CHECK-NEXT: ret <8 x i32> [[T5]] ; %t0 = add <8 x i32> %nbits, Modified: llvm/trunk/test/Transforms/InstCombine/partally-redundant-left-shift-input-masking-variant-c.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/InstCombine/partally-redundant-left-shift-input-masking-variant-c.ll?rev=373960&r1=373959&r2=373960&view=diff ============================================================================== --- llvm/trunk/test/Transforms/InstCombine/partally-redundant-left-shift-input-masking-variant-c.ll (original) +++ llvm/trunk/test/Transforms/InstCombine/partally-redundant-left-shift-input-masking-variant-c.ll Mon Oct 7 13:52:52 2019 @@ -62,7 +62,7 @@ define <8 x i32> @t1_vec_splat_undef(<8 ; CHECK-NEXT: call void @use8xi32(<8 x i32> [[T0]]) ; CHECK-NEXT: call void @use8xi32(<8 x i32> [[T2]]) ; CHECK-NEXT: [[TMP1:%.*]] = shl <8 x i32> [[X:%.*]], [[T2]] -; CHECK-NEXT: [[T3:%.*]] = and <8 x i32> [[TMP1]], +; CHECK-NEXT: [[T3:%.*]] = and <8 x i32> [[TMP1]], ; CHECK-NEXT: ret <8 x i32> [[T3]] ; %t0 = lshr <8 x i32> , %nbits Modified: llvm/trunk/test/Transforms/InstCombine/partally-redundant-left-shift-input-masking-variant-d.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/InstCombine/partally-redundant-left-shift-input-masking-variant-d.ll?rev=373960&r1=373959&r2=373960&view=diff ============================================================================== --- llvm/trunk/test/Transforms/InstCombine/partally-redundant-left-shift-input-masking-variant-d.ll (original) +++ llvm/trunk/test/Transforms/InstCombine/partally-redundant-left-shift-input-masking-variant-d.ll Mon Oct 7 13:52:52 2019 @@ -72,7 +72,7 @@ define <8 x i32> @t2_vec_splat_undef(<8 ; CHECK-NEXT: call void @use8xi32(<8 x i32> [[T1]]) ; CHECK-NEXT: call void @use8xi32(<8 x i32> [[T3]]) ; CHECK-NEXT: [[TMP1:%.*]] = 
shl <8 x i32> [[X:%.*]], [[T3]] -; CHECK-NEXT: [[T4:%.*]] = and <8 x i32> [[TMP1]], +; CHECK-NEXT: [[T4:%.*]] = and <8 x i32> [[TMP1]], ; CHECK-NEXT: ret <8 x i32> [[T4]] ; %t0 = shl <8 x i32> , %nbits Modified: llvm/trunk/test/Transforms/InstCombine/partally-redundant-left-shift-input-masking-variant-e.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/InstCombine/partally-redundant-left-shift-input-masking-variant-e.ll?rev=373960&r1=373959&r2=373960&view=diff ============================================================================== --- llvm/trunk/test/Transforms/InstCombine/partally-redundant-left-shift-input-masking-variant-e.ll (original) +++ llvm/trunk/test/Transforms/InstCombine/partally-redundant-left-shift-input-masking-variant-e.ll Mon Oct 7 13:52:52 2019 @@ -62,7 +62,7 @@ define <8 x i32> @t1_vec_splat_undef(<8 ; CHECK-NEXT: call void @use8xi32(<8 x i32> [[T0]]) ; CHECK-NEXT: call void @use8xi32(<8 x i32> [[T2]]) ; CHECK-NEXT: [[TMP1:%.*]] = shl <8 x i32> [[X]], [[T2]] -; CHECK-NEXT: [[T3:%.*]] = and <8 x i32> [[TMP1]], +; CHECK-NEXT: [[T3:%.*]] = and <8 x i32> [[TMP1]], ; CHECK-NEXT: ret <8 x i32> [[T3]] ; %t0 = shl <8 x i32> %x, %nbits From llvm-commits at lists.llvm.org Mon Oct 7 13:53:00 2019 From: llvm-commits at lists.llvm.org (Roman Lebedev via llvm-commits) Date: Mon, 07 Oct 2019 20:53:00 -0000 Subject: [llvm] r373961 - [InstCombine][NFC] dropRedundantMaskingOfLeftShiftInput(): change how we deal with mask Message-ID: <20191007205300.C2EBB8D8AA@lists.llvm.org> Author: lebedevri Date: Mon Oct 7 13:53:00 2019 New Revision: 373961 URL: http://llvm.org/viewvc/llvm-project?rev=373961&view=rev Log: [InstCombine][NFC] dropRedundantMaskingOfLeftShiftInput(): change how we deal with mask Summary: Currently, we pre-check whether we need to produce a mask or not. This involves some rather magical constants. I'd like to extend this fold to also handle the situation when there's also a `trunc` before outer shift. 
That will require another set of magical constants. It's ugly. Instead, we can just compute the mask, and check whether mask is a pass-through (all-ones) or not. This way we don't need to have any magical numbers. This change is NFC other than the fact that we now compute the mask and then check if we need (and can!) apply it. Reviewers: spatel Reviewed By: spatel Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68470 Modified: llvm/trunk/lib/Transforms/InstCombine/InstCombineShifts.cpp Modified: llvm/trunk/lib/Transforms/InstCombine/InstCombineShifts.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/InstCombine/InstCombineShifts.cpp?rev=373961&r1=373960&r2=373961&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/InstCombine/InstCombineShifts.cpp (original) +++ llvm/trunk/lib/Transforms/InstCombine/InstCombineShifts.cpp Mon Oct 7 13:53:00 2019 @@ -181,39 +181,29 @@ dropRedundantMaskingOfLeftShiftInput(Bin MaskShAmt, ShiftShAmt, /*IsNSW=*/false, /*IsNUW=*/false, Q)); if (!SumOfShAmts) return nullptr; // Did not simplify. + // In this pattern SumOfShAmts correlates with the number of low bits + // that shall remain in the root value (OuterShift). + Type *Ty = X->getType(); - unsigned BitWidth = Ty->getScalarSizeInBits(); - // In this pattern SumOfShAmts correlates with the number of low bits that - // shall remain in the root value (OuterShift). If SumOfShAmts is less than - // bitwidth, we'll need to also produce a mask to keep SumOfShAmts low bits. - // So, does *any* channel need a mask? - if (!match(SumOfShAmts, m_SpecificInt_ICMP(ICmpInst::Predicate::ICMP_UGE, - APInt(BitWidth, BitWidth)))) { - // But for a mask we need to get rid of old masking instruction. - if (!Masked->hasOneUse()) - return nullptr; // Else we can't perform the fold. 
- // The mask must be computed in a type twice as wide to ensure - // that no bits are lost if the sum-of-shifts is wider than the base type. - Type *ExtendedTy = Ty->getExtendedType(); - // An extend of an undef value becomes zero because the high bits are - // never completely unknown. Replace the the `undef` shift amounts with - // final shift bitwidth to ensure that the value remains undef when - // creating the subsequent shift op. - SumOfShAmts = replaceUndefsWith( - SumOfShAmts, - ConstantInt::get(SumOfShAmts->getType()->getScalarType(), - ExtendedTy->getScalarType()->getScalarSizeInBits())); - auto *ExtendedSumOfShAmts = - ConstantExpr::getZExt(SumOfShAmts, ExtendedTy); - // And compute the mask as usual: ~(-1 << (SumOfShAmts)) - auto *ExtendedAllOnes = ConstantExpr::getAllOnesValue(ExtendedTy); - auto *ExtendedInvertedMask = - ConstantExpr::getShl(ExtendedAllOnes, ExtendedSumOfShAmts); - auto *ExtendedMask = ConstantExpr::getNot(ExtendedInvertedMask); - NewMask = ConstantExpr::getTrunc(ExtendedMask, Ty); - } else - NewMask = nullptr; // No mask needed. - // All good, we can do this fold. + + // The mask must be computed in a type twice as wide to ensure + // that no bits are lost if the sum-of-shifts is wider than the base type. + Type *ExtendedTy = Ty->getExtendedType(); + // An extend of an undef value becomes zero because the high bits are never + // completely unknown. Replace the the `undef` shift amounts with final + // shift bitwidth to ensure that the value remains undef when creating the + // subsequent shift op. 
+ SumOfShAmts = replaceUndefsWith( + SumOfShAmts, + ConstantInt::get(SumOfShAmts->getType()->getScalarType(), + ExtendedTy->getScalarType()->getScalarSizeInBits())); + auto *ExtendedSumOfShAmts = ConstantExpr::getZExt(SumOfShAmts, ExtendedTy); + // And compute the mask as usual: ~(-1 << (SumOfShAmts)) + auto *ExtendedAllOnes = ConstantExpr::getAllOnesValue(ExtendedTy); + auto *ExtendedInvertedMask = + ConstantExpr::getShl(ExtendedAllOnes, ExtendedSumOfShAmts); + auto *ExtendedMask = ConstantExpr::getNot(ExtendedInvertedMask); + NewMask = ConstantExpr::getTrunc(ExtendedMask, Ty); } else if (match(Masked, m_c_And(m_CombineOr(MaskC, MaskD), m_Value(X))) || match(Masked, m_Shr(m_Shl(m_Value(X), m_Value(MaskShAmt)), m_Deferred(MaskShAmt)))) { @@ -223,49 +213,51 @@ dropRedundantMaskingOfLeftShiftInput(Bin if (!ShAmtsDiff) return nullptr; // Did not simplify. // In this pattern ShAmtsDiff correlates with the number of high bits that - // shall be unset in the root value (OuterShift). If ShAmtsDiff is negative, - // we'll need to also produce a mask to unset ShAmtsDiff high bits. - // So, does *any* channel need a mask? (is ShiftShAmt u>= MaskShAmt ?) - if (!match(ShAmtsDiff, m_NonNegative())) { - // This sub-fold (with mask) is invalid for 'ashr' "masking" instruction. - if (match(Masked, m_AShr(m_Value(), m_Value()))) - return nullptr; - // For a mask we need to get rid of old masking instruction. - if (!Masked->hasOneUse()) - return nullptr; // Else we can't perform the fold. - Type *Ty = X->getType(); - unsigned BitWidth = Ty->getScalarSizeInBits(); - // The mask must be computed in a type twice as wide to ensure - // that no bits are lost if the sum-of-shifts is wider than the base type. - Type *ExtendedTy = Ty->getExtendedType(); - // An extend of an undef value becomes zero because the high bits are - // never completely unknown. 
Replace the the `undef` shift amounts with - // negated shift bitwidth to ensure that the value remains undef when - // creating the subsequent shift op. - ShAmtsDiff = replaceUndefsWith( - ShAmtsDiff, - ConstantInt::get(ShAmtsDiff->getType()->getScalarType(), -BitWidth)); - auto *ExtendedNumHighBitsToClear = ConstantExpr::getZExt( - ConstantExpr::getAdd( - ConstantExpr::getNeg(ShAmtsDiff), - ConstantInt::get(Ty, BitWidth, /*isSigned=*/false)), - ExtendedTy); - // And compute the mask as usual: (-1 l>> (ShAmtsDiff)) - auto *ExtendedAllOnes = ConstantExpr::getAllOnesValue(ExtendedTy); - auto *ExtendedMask = - ConstantExpr::getLShr(ExtendedAllOnes, ExtendedNumHighBitsToClear); - NewMask = ConstantExpr::getTrunc(ExtendedMask, Ty); - } else - NewMask = nullptr; // No mask needed. - // All good, we can do this fold. + // shall be unset in the root value (OuterShift). + + Type *Ty = X->getType(); + unsigned BitWidth = Ty->getScalarSizeInBits(); + + // The mask must be computed in a type twice as wide to ensure + // that no bits are lost if the sum-of-shifts is wider than the base type. + Type *ExtendedTy = Ty->getExtendedType(); + // An extend of an undef value becomes zero because the high bits are never + // completely unknown. Replace the the `undef` shift amounts with negated + // shift bitwidth to ensure that the value remains undef when creating the + // subsequent shift op. 
+ ShAmtsDiff = replaceUndefsWith( + ShAmtsDiff, + ConstantInt::get(ShAmtsDiff->getType()->getScalarType(), -BitWidth)); + auto *ExtendedNumHighBitsToClear = ConstantExpr::getZExt( + ConstantExpr::getSub(ConstantInt::get(ShAmtsDiff->getType(), BitWidth, + /*isSigned=*/false), + ShAmtsDiff), + ExtendedTy); + // And compute the mask as usual: (-1 l>> (NumHighBitsToClear)) + auto *ExtendedAllOnes = ConstantExpr::getAllOnesValue(ExtendedTy); + auto *ExtendedMask = + ConstantExpr::getLShr(ExtendedAllOnes, ExtendedNumHighBitsToClear); + NewMask = ConstantExpr::getTrunc(ExtendedMask, Ty); } else return nullptr; // Don't know anything about this pattern. - // No 'NUW'/'NSW'! - // We no longer know that we won't shift-out non-0 bits. + // Does this mask has any unset bits? If not then we can just not apply it. + bool NeedMask = !match(NewMask, m_AllOnes()); + + // If we need to apply a mask, there are several more restrictions we have. + if (NeedMask) { + // The old masking instruction must go away. + if (!Masked->hasOneUse()) + return nullptr; + // The original "masking" instruction must not have been`ashr`. + if (match(Masked, m_AShr(m_Value(), m_Value()))) + return nullptr; + } + + // No 'NUW'/'NSW'! We no longer know that we won't shift-out non-0 bits. auto *NewShift = BinaryOperator::Create(OuterShift->getOpcode(), X, ShiftShAmt); - if (!NewMask) + if (!NeedMask) return NewShift; Builder.Insert(NewShift); From llvm-commits at lists.llvm.org Mon Oct 7 13:53:09 2019 From: llvm-commits at lists.llvm.org (Roman Lebedev via llvm-commits) Date: Mon, 07 Oct 2019 20:53:09 -0000 Subject: [llvm] r373962 - [InstCombine] Move isSignBitCheck(), handle rest of the predicates Message-ID: <20191007205309.0A8AE8D8B6@lists.llvm.org> Author: lebedevri Date: Mon Oct 7 13:53:08 2019 New Revision: 373962 URL: http://llvm.org/viewvc/llvm-project?rev=373962&view=rev Log: [InstCombine] Move isSignBitCheck(), handle rest of the predicates True, no test coverage is being added here. 
But the non-canonical predicates that are already handled here have no test coverage either, as far as I can tell. I tried to add tests for them, but all the patterns already get handled elsewhere. Modified: llvm/trunk/lib/Transforms/InstCombine/InstCombineCompares.cpp llvm/trunk/lib/Transforms/InstCombine/InstCombineInternal.h Modified: llvm/trunk/lib/Transforms/InstCombine/InstCombineCompares.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/InstCombine/InstCombineCompares.cpp?rev=373962&r1=373961&r2=373962&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/InstCombine/InstCombineCompares.cpp (original) +++ llvm/trunk/lib/Transforms/InstCombine/InstCombineCompares.cpp Mon Oct 7 13:53:08 2019 @@ -69,34 +69,6 @@ static bool hasBranchUse(ICmpInst &I) { return false; } -/// Given an exploded icmp instruction, return true if the comparison only -/// checks the sign bit. If it only checks the sign bit, set TrueIfSigned if the -/// result of the comparison is true when the input value is signed. -static bool isSignBitCheck(ICmpInst::Predicate Pred, const APInt &RHS, - bool &TrueIfSigned) { - switch (Pred) { - case ICmpInst::ICMP_SLT: // True if LHS s< 0 - TrueIfSigned = true; - return RHS.isNullValue(); - case ICmpInst::ICMP_SLE: // True if LHS s<= RHS and RHS == -1 - TrueIfSigned = true; - return RHS.isAllOnesValue(); - case ICmpInst::ICMP_SGT: // True if LHS s> -1 - TrueIfSigned = false; - return RHS.isAllOnesValue(); - case ICmpInst::ICMP_UGT: - // True if LHS u> RHS and RHS == high-bit-mask - 1 - TrueIfSigned = true; - return RHS.isMaxSignedValue(); - case ICmpInst::ICMP_UGE: - // True if LHS u>= RHS and RHS == high-bit-mask (2^7, 2^15, 2^31, etc) - TrueIfSigned = true; - return RHS.isSignMask(); - default: - return false; - } -} - /// Returns true if the exploded icmp can be expressed as a signed comparison /// to zero and updates the predicate accordingly.
/// The signedness of the comparison is preserved. Modified: llvm/trunk/lib/Transforms/InstCombine/InstCombineInternal.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/InstCombine/InstCombineInternal.h?rev=373962&r1=373961&r2=373962&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/InstCombine/InstCombineInternal.h (original) +++ llvm/trunk/lib/Transforms/InstCombine/InstCombineInternal.h Mon Oct 7 13:53:08 2019 @@ -113,6 +113,45 @@ static inline bool isCanonicalPredicate( } } +/// Given an exploded icmp instruction, return true if the comparison only +/// checks the sign bit. If it only checks the sign bit, set TrueIfSigned if the +/// result of the comparison is true when the input value is signed. +inline bool isSignBitCheck(ICmpInst::Predicate Pred, const APInt &RHS, + bool &TrueIfSigned) { + switch (Pred) { + case ICmpInst::ICMP_SLT: // True if LHS s< 0 + TrueIfSigned = true; + return RHS.isNullValue(); + case ICmpInst::ICMP_SLE: // True if LHS s<= -1 + TrueIfSigned = true; + return RHS.isAllOnesValue(); + case ICmpInst::ICMP_SGT: // True if LHS s> -1 + TrueIfSigned = false; + return RHS.isAllOnesValue(); + case ICmpInst::ICMP_SGE: // True if LHS s>= 0 + TrueIfSigned = false; + return RHS.isNullValue(); + case ICmpInst::ICMP_UGT: + // True if LHS u> RHS and RHS == sign-bit-mask - 1 + TrueIfSigned = true; + return RHS.isMaxSignedValue(); + case ICmpInst::ICMP_UGE: + // True if LHS u>= RHS and RHS == sign-bit-mask (2^7, 2^15, 2^31, etc) + TrueIfSigned = true; + return RHS.isMinSignedValue(); + case ICmpInst::ICMP_ULT: + // True if LHS u< RHS and RHS == sign-bit-mask (2^7, 2^15, 2^31, etc) + TrueIfSigned = false; + return RHS.isMinSignedValue(); + case ICmpInst::ICMP_ULE: + // True if LHS u<= RHS and RHS == sign-bit-mask - 1 + TrueIfSigned = false; + return RHS.isMaxSignedValue(); + default: + return false; + } +} + llvm::Optional> 
getFlippedStrictnessPredicateAndConstant(CmpInst::Predicate Pred, Constant *C); From llvm-commits at lists.llvm.org Mon Oct 7 13:53:16 2019 From: llvm-commits at lists.llvm.org (Roman Lebedev via llvm-commits) Date: Mon, 07 Oct 2019 20:53:16 -0000 Subject: [llvm] r373963 - [InstCombine][NFC] Tests for "conditional sign-extend of high-bit-extract" pattern (PR42389) Message-ID: <20191007205316.CE0AD809C7@lists.llvm.org> Author: lebedevri Date: Mon Oct 7 13:53:16 2019 New Revision: 373963 URL: http://llvm.org/viewvc/llvm-project?rev=373963&view=rev Log: [InstCombine][NFC] Tests for "conditional sign-extend of high-bit-extract" pattern (PR42389) https://bugs.llvm.org/show_bug.cgi?id=42389 Added: llvm/trunk/test/Transforms/InstCombine/conditional-variable-length-signext-after-high-bit-extract.ll Added: llvm/trunk/test/Transforms/InstCombine/conditional-variable-length-signext-after-high-bit-extract.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/InstCombine/conditional-variable-length-signext-after-high-bit-extract.ll?rev=373963&view=auto ============================================================================== --- llvm/trunk/test/Transforms/InstCombine/conditional-variable-length-signext-after-high-bit-extract.ll (added) +++ llvm/trunk/test/Transforms/InstCombine/conditional-variable-length-signext-after-high-bit-extract.ll Mon Oct 7 13:53:16 2019 @@ -0,0 +1,1040 @@ +; NOTE: Assertions have been autogenerated by utils/update_test_checks.py +; RUN: opt %s -instcombine -S | FileCheck %s + +; If we extract (via lshr) some high bits, and then perform their sign-extension +; conditionally depending on whether the extracted value is negative or not +; (i.e. interpreting the highest extracted bit, which was the original signbit +; of the value from which we extracted as a signbit), then we should just +; perform extraction via `ashr`. + +; Base patterns. 
+ +declare void @use1(i1) +declare void @use16(i16) +declare void @use32(i32) +declare void @use64(i64) + +define i32 @t0_notrunc_add(i32 %data, i32 %nbits) { +; CHECK-LABEL: @t0_notrunc_add( +; CHECK-NEXT: [[LOW_BITS_TO_SKIP:%.*]] = sub i32 32, [[NBITS:%.*]] +; CHECK-NEXT: [[HIGH_BITS_EXTRACTED:%.*]] = lshr i32 [[DATA:%.*]], [[LOW_BITS_TO_SKIP]] +; CHECK-NEXT: [[SHOULD_SIGNEXT:%.*]] = icmp slt i32 [[DATA]], 0 +; CHECK-NEXT: [[ALL_BITS_EXCEPT_LOW_NBITS:%.*]] = shl i32 -1, [[NBITS]] +; CHECK-NEXT: [[MAGIC:%.*]] = select i1 [[SHOULD_SIGNEXT]], i32 [[ALL_BITS_EXCEPT_LOW_NBITS]], i32 0 +; CHECK-NEXT: call void @use32(i32 [[LOW_BITS_TO_SKIP]]) +; CHECK-NEXT: call void @use32(i32 [[HIGH_BITS_EXTRACTED]]) +; CHECK-NEXT: call void @use1(i1 [[SHOULD_SIGNEXT]]) +; CHECK-NEXT: call void @use32(i32 [[ALL_BITS_EXCEPT_LOW_NBITS]]) +; CHECK-NEXT: call void @use32(i32 [[MAGIC]]) +; CHECK-NEXT: [[SIGNEXTENDED:%.*]] = add i32 [[HIGH_BITS_EXTRACTED]], [[MAGIC]] +; CHECK-NEXT: ret i32 [[SIGNEXTENDED]] +; + %low_bits_to_skip = sub i32 32, %nbits + %high_bits_extracted = lshr i32 %data, %low_bits_to_skip + %should_signext = icmp slt i32 %data, 0 + %all_bits_except_low_nbits = shl i32 -1, %nbits + %magic = select i1 %should_signext, i32 %all_bits_except_low_nbits, i32 0 + + call void @use32(i32 %low_bits_to_skip) + call void @use32(i32 %high_bits_extracted) + call void @use1(i1 %should_signext) + call void @use32(i32 %all_bits_except_low_nbits) + call void @use32(i32 %magic) + + %signextended = add i32 %high_bits_extracted, %magic + ret i32 %signextended +} + +define i32 @t1_notrunc_sub(i32 %data, i32 %nbits) { +; CHECK-LABEL: @t1_notrunc_sub( +; CHECK-NEXT: [[LOW_BITS_TO_SKIP:%.*]] = sub i32 32, [[NBITS:%.*]] +; CHECK-NEXT: [[HIGH_BITS_EXTRACTED:%.*]] = lshr i32 [[DATA:%.*]], [[LOW_BITS_TO_SKIP]] +; CHECK-NEXT: [[SHOULD_SIGNEXT:%.*]] = icmp slt i32 [[DATA]], 0 +; CHECK-NEXT: [[HIGHER_BIT_AFTER_SIGNBIT:%.*]] = shl i32 1, [[NBITS]] +; CHECK-NEXT: [[MAGIC:%.*]] = select i1 
[[SHOULD_SIGNEXT]], i32 [[HIGHER_BIT_AFTER_SIGNBIT]], i32 0 +; CHECK-NEXT: call void @use32(i32 [[LOW_BITS_TO_SKIP]]) +; CHECK-NEXT: call void @use32(i32 [[HIGH_BITS_EXTRACTED]]) +; CHECK-NEXT: call void @use1(i1 [[SHOULD_SIGNEXT]]) +; CHECK-NEXT: call void @use32(i32 [[HIGHER_BIT_AFTER_SIGNBIT]]) +; CHECK-NEXT: call void @use32(i32 [[MAGIC]]) +; CHECK-NEXT: [[SIGNEXTENDED:%.*]] = sub i32 [[HIGH_BITS_EXTRACTED]], [[MAGIC]] +; CHECK-NEXT: ret i32 [[SIGNEXTENDED]] +; + %low_bits_to_skip = sub i32 32, %nbits + %high_bits_extracted = lshr i32 %data, %low_bits_to_skip + %should_signext = icmp slt i32 %data, 0 + %higher_bit_after_signbit = shl i32 1, %nbits + %magic = select i1 %should_signext, i32 %higher_bit_after_signbit, i32 0 + + call void @use32(i32 %low_bits_to_skip) + call void @use32(i32 %high_bits_extracted) + call void @use1(i1 %should_signext) + call void @use32(i32 %higher_bit_after_signbit) + call void @use32(i32 %magic) + + %signextended = sub i32 %high_bits_extracted, %magic + ret i32 %signextended +} + +define i32 @t2_trunc_add(i64 %data, i32 %nbits) { +; CHECK-LABEL: @t2_trunc_add( +; CHECK-NEXT: [[LOW_BITS_TO_SKIP:%.*]] = sub i32 64, [[NBITS:%.*]] +; CHECK-NEXT: [[LOW_BITS_TO_SKIP_WIDE:%.*]] = zext i32 [[LOW_BITS_TO_SKIP]] to i64 +; CHECK-NEXT: [[HIGH_BITS_EXTRACTED_WIDE:%.*]] = lshr i64 [[DATA:%.*]], [[LOW_BITS_TO_SKIP_WIDE]] +; CHECK-NEXT: [[HIGH_BITS_EXTRACTED:%.*]] = trunc i64 [[HIGH_BITS_EXTRACTED_WIDE]] to i32 +; CHECK-NEXT: [[SHOULD_SIGNEXT:%.*]] = icmp slt i64 [[DATA]], 0 +; CHECK-NEXT: [[ALL_BITS_EXCEPT_LOW_NBITS:%.*]] = shl i32 -1, [[NBITS]] +; CHECK-NEXT: [[MAGIC:%.*]] = select i1 [[SHOULD_SIGNEXT]], i32 [[ALL_BITS_EXCEPT_LOW_NBITS]], i32 0 +; CHECK-NEXT: call void @use32(i32 [[LOW_BITS_TO_SKIP]]) +; CHECK-NEXT: call void @use64(i64 [[LOW_BITS_TO_SKIP_WIDE]]) +; CHECK-NEXT: call void @use64(i64 [[HIGH_BITS_EXTRACTED_WIDE]]) +; CHECK-NEXT: call void @use32(i32 [[HIGH_BITS_EXTRACTED]]) +; CHECK-NEXT: call void @use1(i1 [[SHOULD_SIGNEXT]]) +; 
CHECK-NEXT: call void @use32(i32 [[ALL_BITS_EXCEPT_LOW_NBITS]]) +; CHECK-NEXT: [[SIGNEXTENDED:%.*]] = add i32 [[MAGIC]], [[HIGH_BITS_EXTRACTED]] +; CHECK-NEXT: ret i32 [[SIGNEXTENDED]] +; + %low_bits_to_skip = sub i32 64, %nbits + %low_bits_to_skip_wide = zext i32 %low_bits_to_skip to i64 + %high_bits_extracted_wide = lshr i64 %data, %low_bits_to_skip_wide + %high_bits_extracted = trunc i64 %high_bits_extracted_wide to i32 + %should_signext = icmp slt i64 %data, 0 + %all_bits_except_low_nbits = shl i32 -1, %nbits + %magic = select i1 %should_signext, i32 %all_bits_except_low_nbits, i32 0 ; one-use + + call void @use32(i32 %low_bits_to_skip) + call void @use64(i64 %low_bits_to_skip_wide) + call void @use64(i64 %high_bits_extracted_wide) + call void @use32(i32 %high_bits_extracted) + call void @use1(i1 %should_signext) + call void @use32(i32 %all_bits_except_low_nbits) + + %signextended = add i32 %magic, %high_bits_extracted + ret i32 %signextended +} + +define i32 @t3_trunc_sub(i64 %data, i32 %nbits) { +; CHECK-LABEL: @t3_trunc_sub( +; CHECK-NEXT: [[LOW_BITS_TO_SKIP:%.*]] = sub i32 64, [[NBITS:%.*]] +; CHECK-NEXT: [[LOW_BITS_TO_SKIP_WIDE:%.*]] = zext i32 [[LOW_BITS_TO_SKIP]] to i64 +; CHECK-NEXT: [[HIGH_BITS_EXTRACTED_WIDE:%.*]] = lshr i64 [[DATA:%.*]], [[LOW_BITS_TO_SKIP_WIDE]] +; CHECK-NEXT: [[HIGH_BITS_EXTRACTED:%.*]] = trunc i64 [[HIGH_BITS_EXTRACTED_WIDE]] to i32 +; CHECK-NEXT: [[SHOULD_SIGNEXT:%.*]] = icmp slt i64 [[DATA]], 0 +; CHECK-NEXT: [[HIGHER_BIT_AFTER_SIGNBIT:%.*]] = shl i32 1, [[NBITS]] +; CHECK-NEXT: [[MAGIC:%.*]] = select i1 [[SHOULD_SIGNEXT]], i32 [[HIGHER_BIT_AFTER_SIGNBIT]], i32 0 +; CHECK-NEXT: call void @use32(i32 [[LOW_BITS_TO_SKIP]]) +; CHECK-NEXT: call void @use64(i64 [[LOW_BITS_TO_SKIP_WIDE]]) +; CHECK-NEXT: call void @use64(i64 [[HIGH_BITS_EXTRACTED_WIDE]]) +; CHECK-NEXT: call void @use32(i32 [[HIGH_BITS_EXTRACTED]]) +; CHECK-NEXT: call void @use1(i1 [[SHOULD_SIGNEXT]]) +; CHECK-NEXT: call void @use32(i32 [[HIGHER_BIT_AFTER_SIGNBIT]]) +; 
CHECK-NEXT: [[SIGNEXTENDED:%.*]] = sub i32 [[HIGH_BITS_EXTRACTED]], [[MAGIC]] +; CHECK-NEXT: ret i32 [[SIGNEXTENDED]] +; + %low_bits_to_skip = sub i32 64, %nbits + %low_bits_to_skip_wide = zext i32 %low_bits_to_skip to i64 + %high_bits_extracted_wide = lshr i64 %data, %low_bits_to_skip_wide + %high_bits_extracted = trunc i64 %high_bits_extracted_wide to i32 + %should_signext = icmp slt i64 %data, 0 + %higher_bit_after_signbit = shl i32 1, %nbits + %magic = select i1 %should_signext, i32 %higher_bit_after_signbit, i32 0 ; one-use + + call void @use32(i32 %low_bits_to_skip) + call void @use64(i64 %low_bits_to_skip_wide) + call void @use64(i64 %high_bits_extracted_wide) + call void @use32(i32 %high_bits_extracted) + call void @use1(i1 %should_signext) + call void @use32(i32 %higher_bit_after_signbit) + + %signextended = sub i32 %high_bits_extracted, %magic + ret i32 %signextended +} + +; Commutativity + +define i32 @t4_commutativity0(i32 %data, i32 %nbits) { +; CHECK-LABEL: @t4_commutativity0( +; CHECK-NEXT: [[LOW_BITS_TO_SKIP:%.*]] = sub i32 32, [[NBITS:%.*]] +; CHECK-NEXT: [[HIGH_BITS_EXTRACTED:%.*]] = lshr i32 [[DATA:%.*]], [[LOW_BITS_TO_SKIP]] +; CHECK-NEXT: [[SHOULD_SIGNEXT:%.*]] = icmp slt i32 [[DATA]], 0 +; CHECK-NEXT: [[ALL_BITS_EXCEPT_LOW_NBITS:%.*]] = shl i32 -1, [[NBITS]] +; CHECK-NEXT: [[MAGIC:%.*]] = select i1 [[SHOULD_SIGNEXT]], i32 [[ALL_BITS_EXCEPT_LOW_NBITS]], i32 0 +; CHECK-NEXT: call void @use32(i32 [[LOW_BITS_TO_SKIP]]) +; CHECK-NEXT: call void @use32(i32 [[HIGH_BITS_EXTRACTED]]) +; CHECK-NEXT: call void @use1(i1 [[SHOULD_SIGNEXT]]) +; CHECK-NEXT: call void @use32(i32 [[ALL_BITS_EXCEPT_LOW_NBITS]]) +; CHECK-NEXT: call void @use32(i32 [[MAGIC]]) +; CHECK-NEXT: [[SIGNEXTENDED:%.*]] = add i32 [[HIGH_BITS_EXTRACTED]], [[MAGIC]] +; CHECK-NEXT: ret i32 [[SIGNEXTENDED]] +; + %low_bits_to_skip = sub i32 32, %nbits + %high_bits_extracted = lshr i32 %data, %low_bits_to_skip + %should_signext = icmp slt i32 %data, 0 + %all_bits_except_low_nbits = shl i32 -1, 
%nbits + %magic = select i1 %should_signext, i32 %all_bits_except_low_nbits, i32 0 + + call void @use32(i32 %low_bits_to_skip) + call void @use32(i32 %high_bits_extracted) + call void @use1(i1 %should_signext) + call void @use32(i32 %all_bits_except_low_nbits) + call void @use32(i32 %magic) + + %signextended = add i32 %high_bits_extracted, %magic + ret i32 %signextended +} +define i32 @t5_commutativity1(i32 %data, i32 %nbits) { +; CHECK-LABEL: @t5_commutativity1( +; CHECK-NEXT: [[LOW_BITS_TO_SKIP:%.*]] = sub i32 32, [[NBITS:%.*]] +; CHECK-NEXT: [[HIGH_BITS_EXTRACTED:%.*]] = lshr i32 [[DATA:%.*]], [[LOW_BITS_TO_SKIP]] +; CHECK-NEXT: [[SHOULD_SIGNEXT:%.*]] = icmp sgt i32 [[DATA]], -1 +; CHECK-NEXT: [[ALL_BITS_EXCEPT_LOW_NBITS:%.*]] = shl i32 -1, [[NBITS]] +; CHECK-NEXT: [[MAGIC:%.*]] = select i1 [[SHOULD_SIGNEXT]], i32 0, i32 [[ALL_BITS_EXCEPT_LOW_NBITS]] +; CHECK-NEXT: call void @use32(i32 [[LOW_BITS_TO_SKIP]]) +; CHECK-NEXT: call void @use32(i32 [[HIGH_BITS_EXTRACTED]]) +; CHECK-NEXT: call void @use1(i1 [[SHOULD_SIGNEXT]]) +; CHECK-NEXT: call void @use32(i32 [[ALL_BITS_EXCEPT_LOW_NBITS]]) +; CHECK-NEXT: call void @use32(i32 [[MAGIC]]) +; CHECK-NEXT: [[SIGNEXTENDED:%.*]] = add i32 [[HIGH_BITS_EXTRACTED]], [[MAGIC]] +; CHECK-NEXT: ret i32 [[SIGNEXTENDED]] +; + %low_bits_to_skip = sub i32 32, %nbits + %high_bits_extracted = lshr i32 %data, %low_bits_to_skip + %should_signext = icmp sgt i32 %data, -1 ; swapped + %all_bits_except_low_nbits = shl i32 -1, %nbits + %magic = select i1 %should_signext, i32 0, i32 %all_bits_except_low_nbits ; swapped + + call void @use32(i32 %low_bits_to_skip) + call void @use32(i32 %high_bits_extracted) + call void @use1(i1 %should_signext) + call void @use32(i32 %all_bits_except_low_nbits) + call void @use32(i32 %magic) + + %signextended = add i32 %high_bits_extracted, %magic + ret i32 %signextended +} +define i32 @t6_commutativity2(i32 %data, i32 %nbits) { +; CHECK-LABEL: @t6_commutativity2( +; CHECK-NEXT: [[LOW_BITS_TO_SKIP:%.*]] = sub 
i32 32, [[NBITS:%.*]] +; CHECK-NEXT: [[HIGH_BITS_EXTRACTED:%.*]] = lshr i32 [[DATA:%.*]], [[LOW_BITS_TO_SKIP]] +; CHECK-NEXT: [[SHOULD_SIGNEXT:%.*]] = icmp slt i32 [[DATA]], 0 +; CHECK-NEXT: [[ALL_BITS_EXCEPT_LOW_NBITS:%.*]] = shl i32 -1, [[NBITS]] +; CHECK-NEXT: [[MAGIC:%.*]] = select i1 [[SHOULD_SIGNEXT]], i32 [[ALL_BITS_EXCEPT_LOW_NBITS]], i32 0 +; CHECK-NEXT: call void @use32(i32 [[LOW_BITS_TO_SKIP]]) +; CHECK-NEXT: call void @use32(i32 [[HIGH_BITS_EXTRACTED]]) +; CHECK-NEXT: call void @use1(i1 [[SHOULD_SIGNEXT]]) +; CHECK-NEXT: call void @use32(i32 [[ALL_BITS_EXCEPT_LOW_NBITS]]) +; CHECK-NEXT: call void @use32(i32 [[MAGIC]]) +; CHECK-NEXT: [[SIGNEXTENDED:%.*]] = add i32 [[MAGIC]], [[HIGH_BITS_EXTRACTED]] +; CHECK-NEXT: ret i32 [[SIGNEXTENDED]] +; + %low_bits_to_skip = sub i32 32, %nbits + %high_bits_extracted = lshr i32 %data, %low_bits_to_skip + %should_signext = icmp slt i32 %data, 0 + %all_bits_except_low_nbits = shl i32 -1, %nbits + %magic = select i1 %should_signext, i32 %all_bits_except_low_nbits, i32 0 + + call void @use32(i32 %low_bits_to_skip) + call void @use32(i32 %high_bits_extracted) + call void @use1(i1 %should_signext) + call void @use32(i32 %all_bits_except_low_nbits) + call void @use32(i32 %magic) + + %signextended = add i32 %magic, %high_bits_extracted ; swapped + ret i32 %signextended +} + +; Extra uses + +define i32 @t7_trunc_extrause0(i64 %data, i32 %nbits) { +; CHECK-LABEL: @t7_trunc_extrause0( +; CHECK-NEXT: [[LOW_BITS_TO_SKIP:%.*]] = sub i32 64, [[NBITS:%.*]] +; CHECK-NEXT: [[LOW_BITS_TO_SKIP_WIDE:%.*]] = zext i32 [[LOW_BITS_TO_SKIP]] to i64 +; CHECK-NEXT: [[HIGH_BITS_EXTRACTED_WIDE:%.*]] = lshr i64 [[DATA:%.*]], [[LOW_BITS_TO_SKIP_WIDE]] +; CHECK-NEXT: [[HIGH_BITS_EXTRACTED:%.*]] = trunc i64 [[HIGH_BITS_EXTRACTED_WIDE]] to i32 +; CHECK-NEXT: [[SHOULD_SIGNEXT:%.*]] = icmp slt i64 [[DATA]], 0 +; CHECK-NEXT: [[ALL_BITS_EXCEPT_LOW_NBITS:%.*]] = shl i32 -1, [[NBITS]] +; CHECK-NEXT: [[MAGIC:%.*]] = select i1 [[SHOULD_SIGNEXT]], i32 
[[ALL_BITS_EXCEPT_LOW_NBITS]], i32 0 +; CHECK-NEXT: call void @use32(i32 [[LOW_BITS_TO_SKIP]]) +; CHECK-NEXT: call void @use64(i64 [[LOW_BITS_TO_SKIP_WIDE]]) +; CHECK-NEXT: call void @use64(i64 [[HIGH_BITS_EXTRACTED_WIDE]]) +; CHECK-NEXT: call void @use32(i32 [[HIGH_BITS_EXTRACTED]]) +; CHECK-NEXT: call void @use1(i1 [[SHOULD_SIGNEXT]]) +; CHECK-NEXT: call void @use32(i32 [[ALL_BITS_EXCEPT_LOW_NBITS]]) +; CHECK-NEXT: [[SIGNEXTENDED:%.*]] = add i32 [[MAGIC]], [[HIGH_BITS_EXTRACTED]] +; CHECK-NEXT: ret i32 [[SIGNEXTENDED]] +; + %low_bits_to_skip = sub i32 64, %nbits + %low_bits_to_skip_wide = zext i32 %low_bits_to_skip to i64 + %high_bits_extracted_wide = lshr i64 %data, %low_bits_to_skip_wide + %high_bits_extracted = trunc i64 %high_bits_extracted_wide to i32 ; has extra use + %should_signext = icmp slt i64 %data, 0 + %all_bits_except_low_nbits = shl i32 -1, %nbits + %magic = select i1 %should_signext, i32 %all_bits_except_low_nbits, i32 0 ; one-use + + call void @use32(i32 %low_bits_to_skip) + call void @use64(i64 %low_bits_to_skip_wide) + call void @use64(i64 %high_bits_extracted_wide) + call void @use32(i32 %high_bits_extracted) + call void @use1(i1 %should_signext) + call void @use32(i32 %all_bits_except_low_nbits) + + %signextended = add i32 %magic, %high_bits_extracted + ret i32 %signextended +} +define i32 @t8_trunc_extrause1(i64 %data, i32 %nbits) { +; CHECK-LABEL: @t8_trunc_extrause1( +; CHECK-NEXT: [[LOW_BITS_TO_SKIP:%.*]] = sub i32 64, [[NBITS:%.*]] +; CHECK-NEXT: [[LOW_BITS_TO_SKIP_WIDE:%.*]] = zext i32 [[LOW_BITS_TO_SKIP]] to i64 +; CHECK-NEXT: [[HIGH_BITS_EXTRACTED_WIDE:%.*]] = lshr i64 [[DATA:%.*]], [[LOW_BITS_TO_SKIP_WIDE]] +; CHECK-NEXT: [[HIGH_BITS_EXTRACTED:%.*]] = trunc i64 [[HIGH_BITS_EXTRACTED_WIDE]] to i32 +; CHECK-NEXT: [[SHOULD_SIGNEXT:%.*]] = icmp slt i64 [[DATA]], 0 +; CHECK-NEXT: [[ALL_BITS_EXCEPT_LOW_NBITS:%.*]] = shl i32 -1, [[NBITS]] +; CHECK-NEXT: [[MAGIC:%.*]] = select i1 [[SHOULD_SIGNEXT]], i32 [[ALL_BITS_EXCEPT_LOW_NBITS]], i32 0 
+; CHECK-NEXT: call void @use32(i32 [[LOW_BITS_TO_SKIP]]) +; CHECK-NEXT: call void @use64(i64 [[LOW_BITS_TO_SKIP_WIDE]]) +; CHECK-NEXT: call void @use64(i64 [[HIGH_BITS_EXTRACTED_WIDE]]) +; CHECK-NEXT: call void @use1(i1 [[SHOULD_SIGNEXT]]) +; CHECK-NEXT: call void @use32(i32 [[ALL_BITS_EXCEPT_LOW_NBITS]]) +; CHECK-NEXT: call void @use32(i32 [[MAGIC]]) +; CHECK-NEXT: [[SIGNEXTENDED:%.*]] = add i32 [[MAGIC]], [[HIGH_BITS_EXTRACTED]] +; CHECK-NEXT: ret i32 [[SIGNEXTENDED]] +; + %low_bits_to_skip = sub i32 64, %nbits + %low_bits_to_skip_wide = zext i32 %low_bits_to_skip to i64 + %high_bits_extracted_wide = lshr i64 %data, %low_bits_to_skip_wide + %high_bits_extracted = trunc i64 %high_bits_extracted_wide to i32 ; one-use + %should_signext = icmp slt i64 %data, 0 + %all_bits_except_low_nbits = shl i32 -1, %nbits + %magic = select i1 %should_signext, i32 %all_bits_except_low_nbits, i32 0 ; has extra use + + call void @use32(i32 %low_bits_to_skip) + call void @use64(i64 %low_bits_to_skip_wide) + call void @use64(i64 %high_bits_extracted_wide) + call void @use1(i1 %should_signext) + call void @use32(i32 %all_bits_except_low_nbits) + call void @use32(i32 %magic) + + %signextended = add i32 %magic, %high_bits_extracted + ret i32 %signextended +} +define i32 @n9_trunc_extrause2(i64 %data, i32 %nbits) { +; CHECK-LABEL: @n9_trunc_extrause2( +; CHECK-NEXT: [[LOW_BITS_TO_SKIP:%.*]] = sub i32 64, [[NBITS:%.*]] +; CHECK-NEXT: [[LOW_BITS_TO_SKIP_WIDE:%.*]] = zext i32 [[LOW_BITS_TO_SKIP]] to i64 +; CHECK-NEXT: [[HIGH_BITS_EXTRACTED_WIDE:%.*]] = lshr i64 [[DATA:%.*]], [[LOW_BITS_TO_SKIP_WIDE]] +; CHECK-NEXT: [[HIGH_BITS_EXTRACTED:%.*]] = trunc i64 [[HIGH_BITS_EXTRACTED_WIDE]] to i32 +; CHECK-NEXT: [[SHOULD_SIGNEXT:%.*]] = icmp slt i64 [[DATA]], 0 +; CHECK-NEXT: [[ALL_BITS_EXCEPT_LOW_NBITS:%.*]] = shl i32 -1, [[NBITS]] +; CHECK-NEXT: [[MAGIC:%.*]] = select i1 [[SHOULD_SIGNEXT]], i32 [[ALL_BITS_EXCEPT_LOW_NBITS]], i32 0 +; CHECK-NEXT: call void @use32(i32 [[LOW_BITS_TO_SKIP]]) +; 
CHECK-NEXT: call void @use64(i64 [[LOW_BITS_TO_SKIP_WIDE]]) +; CHECK-NEXT: call void @use64(i64 [[HIGH_BITS_EXTRACTED_WIDE]]) +; CHECK-NEXT: call void @use32(i32 [[HIGH_BITS_EXTRACTED]]) +; CHECK-NEXT: call void @use1(i1 [[SHOULD_SIGNEXT]]) +; CHECK-NEXT: call void @use32(i32 [[ALL_BITS_EXCEPT_LOW_NBITS]]) +; CHECK-NEXT: call void @use32(i32 [[MAGIC]]) +; CHECK-NEXT: [[SIGNEXTENDED:%.*]] = add i32 [[MAGIC]], [[HIGH_BITS_EXTRACTED]] +; CHECK-NEXT: ret i32 [[SIGNEXTENDED]] +; + %low_bits_to_skip = sub i32 64, %nbits + %low_bits_to_skip_wide = zext i32 %low_bits_to_skip to i64 + %high_bits_extracted_wide = lshr i64 %data, %low_bits_to_skip_wide + %high_bits_extracted = trunc i64 %high_bits_extracted_wide to i32 ; has extra use + %should_signext = icmp slt i64 %data, 0 + %all_bits_except_low_nbits = shl i32 -1, %nbits + %magic = select i1 %should_signext, i32 %all_bits_except_low_nbits, i32 0 ; has extra use + + call void @use32(i32 %low_bits_to_skip) + call void @use64(i64 %low_bits_to_skip_wide) + call void @use64(i64 %high_bits_extracted_wide) + call void @use32(i32 %high_bits_extracted) + call void @use1(i1 %should_signext) + call void @use32(i32 %all_bits_except_low_nbits) + call void @use32(i32 %magic) + + %signextended = add i32 %magic, %high_bits_extracted + ret i32 %signextended +} + +define i32 @t10_preserve_exact(i32 %data, i32 %nbits) { +; CHECK-LABEL: @t10_preserve_exact( +; CHECK-NEXT: [[LOW_BITS_TO_SKIP:%.*]] = sub i32 32, [[NBITS:%.*]] +; CHECK-NEXT: [[HIGH_BITS_EXTRACTED:%.*]] = lshr exact i32 [[DATA:%.*]], [[LOW_BITS_TO_SKIP]] +; CHECK-NEXT: [[SHOULD_SIGNEXT:%.*]] = icmp slt i32 [[DATA]], 0 +; CHECK-NEXT: [[ALL_BITS_EXCEPT_LOW_NBITS:%.*]] = shl i32 -1, [[NBITS]] +; CHECK-NEXT: [[MAGIC:%.*]] = select i1 [[SHOULD_SIGNEXT]], i32 [[ALL_BITS_EXCEPT_LOW_NBITS]], i32 0 +; CHECK-NEXT: call void @use32(i32 [[LOW_BITS_TO_SKIP]]) +; CHECK-NEXT: call void @use32(i32 [[HIGH_BITS_EXTRACTED]]) +; CHECK-NEXT: call void @use1(i1 [[SHOULD_SIGNEXT]]) +; CHECK-NEXT: call 
void @use32(i32 [[ALL_BITS_EXCEPT_LOW_NBITS]]) +; CHECK-NEXT: call void @use32(i32 [[MAGIC]]) +; CHECK-NEXT: [[SIGNEXTENDED:%.*]] = add i32 [[HIGH_BITS_EXTRACTED]], [[MAGIC]] +; CHECK-NEXT: ret i32 [[SIGNEXTENDED]] +; + %low_bits_to_skip = sub i32 32, %nbits + %high_bits_extracted = lshr exact i32 %data, %low_bits_to_skip + %should_signext = icmp slt i32 %data, 0 + %all_bits_except_low_nbits = shl i32 -1, %nbits + %magic = select i1 %should_signext, i32 %all_bits_except_low_nbits, i32 0 + + call void @use32(i32 %low_bits_to_skip) + call void @use32(i32 %high_bits_extracted) + call void @use1(i1 %should_signext) + call void @use32(i32 %all_bits_except_low_nbits) + call void @use32(i32 %magic) + + %signextended = add i32 %high_bits_extracted, %magic + ret i32 %signextended +} + +define i32 @t11_different_zext_of_shamt(i32 %data, i8 %nbits) { +; CHECK-LABEL: @t11_different_zext_of_shamt( +; CHECK-NEXT: [[NBITS_16BIT:%.*]] = zext i8 [[NBITS:%.*]] to i16 +; CHECK-NEXT: [[LOW_BITS_TO_SKIP:%.*]] = sub nsw i16 32, [[NBITS_16BIT]] +; CHECK-NEXT: [[LOW_BITS_TO_SKIP_32:%.*]] = zext i16 [[LOW_BITS_TO_SKIP]] to i32 +; CHECK-NEXT: [[HIGH_BITS_EXTRACTED:%.*]] = lshr i32 [[DATA:%.*]], [[LOW_BITS_TO_SKIP_32]] +; CHECK-NEXT: [[SHOULD_SIGNEXT:%.*]] = icmp slt i32 [[DATA]], 0 +; CHECK-NEXT: [[NBITS_32BIT:%.*]] = zext i8 [[NBITS]] to i32 +; CHECK-NEXT: [[ALL_BITS_EXCEPT_LOW_NBITS:%.*]] = shl i32 -1, [[NBITS_32BIT]] +; CHECK-NEXT: [[MAGIC:%.*]] = select i1 [[SHOULD_SIGNEXT]], i32 [[ALL_BITS_EXCEPT_LOW_NBITS]], i32 0 +; CHECK-NEXT: call void @use16(i16 [[NBITS_16BIT]]) +; CHECK-NEXT: call void @use16(i16 [[LOW_BITS_TO_SKIP]]) +; CHECK-NEXT: call void @use32(i32 [[LOW_BITS_TO_SKIP_32]]) +; CHECK-NEXT: call void @use32(i32 [[HIGH_BITS_EXTRACTED]]) +; CHECK-NEXT: call void @use1(i1 [[SHOULD_SIGNEXT]]) +; CHECK-NEXT: call void @use32(i32 [[NBITS_32BIT]]) +; CHECK-NEXT: call void @use32(i32 [[ALL_BITS_EXCEPT_LOW_NBITS]]) +; CHECK-NEXT: call void @use32(i32 [[MAGIC]]) +; CHECK-NEXT: 
[[SIGNEXTENDED:%.*]] = add i32 [[HIGH_BITS_EXTRACTED]], [[MAGIC]] +; CHECK-NEXT: ret i32 [[SIGNEXTENDED]] +; + %nbits_16bit = zext i8 %nbits to i16 + %low_bits_to_skip = sub i16 32, %nbits_16bit + %low_bits_to_skip_32 = zext i16 %low_bits_to_skip to i32 + %high_bits_extracted = lshr i32 %data, %low_bits_to_skip_32 + %should_signext = icmp slt i32 %data, 0 + %nbits_32bit = zext i8 %nbits to i32 + %all_bits_except_low_nbits = shl i32 -1, %nbits_32bit + %magic = select i1 %should_signext, i32 %all_bits_except_low_nbits, i32 0 + + call void @use16(i16 %nbits_16bit) + call void @use16(i16 %low_bits_to_skip) + call void @use32(i32 %low_bits_to_skip_32) + call void @use32(i32 %high_bits_extracted) + call void @use1(i1 %should_signext) + call void @use32(i32 %nbits_32bit) + call void @use32(i32 %all_bits_except_low_nbits) + call void @use32(i32 %magic) + + %signextended = add i32 %high_bits_extracted, %magic + ret i32 %signextended +} + +define i32 @t12_add_sext_of_magic(i32 %data, i8 %nbits) { +; CHECK-LABEL: @t12_add_sext_of_magic( +; CHECK-NEXT: [[NBITS_32BIT:%.*]] = zext i8 [[NBITS:%.*]] to i32 +; CHECK-NEXT: [[LOW_BITS_TO_SKIP:%.*]] = sub nsw i32 32, [[NBITS_32BIT]] +; CHECK-NEXT: [[HIGH_BITS_EXTRACTED:%.*]] = lshr i32 [[DATA:%.*]], [[LOW_BITS_TO_SKIP]] +; CHECK-NEXT: [[SHOULD_SIGNEXT:%.*]] = icmp slt i32 [[DATA]], 0 +; CHECK-NEXT: [[NBITS_16BIT:%.*]] = zext i8 [[NBITS]] to i16 +; CHECK-NEXT: [[ALL_BITS_EXCEPT_LOW_NBITS:%.*]] = shl i16 -1, [[NBITS_16BIT]] +; CHECK-NEXT: [[MAGIC:%.*]] = select i1 [[SHOULD_SIGNEXT]], i16 [[ALL_BITS_EXCEPT_LOW_NBITS]], i16 0 +; CHECK-NEXT: [[MAGIC_WIDE:%.*]] = sext i16 [[MAGIC]] to i32 +; CHECK-NEXT: call void @use32(i32 [[NBITS_32BIT]]) +; CHECK-NEXT: call void @use32(i32 [[LOW_BITS_TO_SKIP]]) +; CHECK-NEXT: call void @use32(i32 [[HIGH_BITS_EXTRACTED]]) +; CHECK-NEXT: call void @use1(i1 [[SHOULD_SIGNEXT]]) +; CHECK-NEXT: call void @use16(i16 [[NBITS_16BIT]]) +; CHECK-NEXT: call void @use16(i16 [[ALL_BITS_EXCEPT_LOW_NBITS]]) +; 
CHECK-NEXT: call void @use16(i16 [[MAGIC]]) +; CHECK-NEXT: call void @use32(i32 [[MAGIC_WIDE]]) +; CHECK-NEXT: [[SIGNEXTENDED:%.*]] = add i32 [[HIGH_BITS_EXTRACTED]], [[MAGIC_WIDE]] +; CHECK-NEXT: ret i32 [[SIGNEXTENDED]] +; + %nbits_32bit = zext i8 %nbits to i32 + %low_bits_to_skip = sub i32 32, %nbits_32bit + %high_bits_extracted = lshr i32 %data, %low_bits_to_skip + %should_signext = icmp slt i32 %data, 0 + %nbits_16bit = zext i8 %nbits to i16 + %all_bits_except_low_nbits = shl i16 -1, %nbits_16bit + %magic = select i1 %should_signext, i16 %all_bits_except_low_nbits, i16 0 + %magic_wide = sext i16 %magic to i32 + + call void @use32(i32 %nbits_32bit) + call void @use32(i32 %low_bits_to_skip) + call void @use32(i32 %high_bits_extracted) + call void @use1(i1 %should_signext) + call void @use16(i16 %nbits_16bit) + call void @use16(i16 %all_bits_except_low_nbits) + call void @use16(i16 %magic) + call void @use32(i32 %magic_wide) + + %signextended = add i32 %high_bits_extracted, %magic_wide + ret i32 %signextended +} + +define i32 @t13_sub_zext_of_magic(i32 %data, i8 %nbits) { +; CHECK-LABEL: @t13_sub_zext_of_magic( +; CHECK-NEXT: [[NBITS_32BIT:%.*]] = zext i8 [[NBITS:%.*]] to i32 +; CHECK-NEXT: [[LOW_BITS_TO_SKIP:%.*]] = sub nsw i32 32, [[NBITS_32BIT]] +; CHECK-NEXT: [[HIGH_BITS_EXTRACTED:%.*]] = lshr i32 [[DATA:%.*]], [[LOW_BITS_TO_SKIP]] +; CHECK-NEXT: [[SHOULD_SIGNEXT:%.*]] = icmp slt i32 [[DATA]], 0 +; CHECK-NEXT: [[NBITS_16BIT:%.*]] = zext i8 [[NBITS]] to i16 +; CHECK-NEXT: [[ALL_BITS_EXCEPT_LOW_NBITS:%.*]] = shl i16 1, [[NBITS_16BIT]] +; CHECK-NEXT: [[MAGIC:%.*]] = select i1 [[SHOULD_SIGNEXT]], i16 [[ALL_BITS_EXCEPT_LOW_NBITS]], i16 0 +; CHECK-NEXT: [[MAGIC_WIDE:%.*]] = zext i16 [[MAGIC]] to i32 +; CHECK-NEXT: call void @use32(i32 [[NBITS_32BIT]]) +; CHECK-NEXT: call void @use32(i32 [[LOW_BITS_TO_SKIP]]) +; CHECK-NEXT: call void @use32(i32 [[HIGH_BITS_EXTRACTED]]) +; CHECK-NEXT: call void @use1(i1 [[SHOULD_SIGNEXT]]) +; CHECK-NEXT: call void @use16(i16 
[[NBITS_16BIT]]) +; CHECK-NEXT: call void @use16(i16 [[ALL_BITS_EXCEPT_LOW_NBITS]]) +; CHECK-NEXT: call void @use16(i16 [[MAGIC]]) +; CHECK-NEXT: call void @use32(i32 [[MAGIC_WIDE]]) +; CHECK-NEXT: [[SIGNEXTENDED:%.*]] = sub i32 [[HIGH_BITS_EXTRACTED]], [[MAGIC_WIDE]] +; CHECK-NEXT: ret i32 [[SIGNEXTENDED]] +; + %nbits_32bit = zext i8 %nbits to i32 + %low_bits_to_skip = sub i32 32, %nbits_32bit + %high_bits_extracted = lshr i32 %data, %low_bits_to_skip + %should_signext = icmp slt i32 %data, 0 + %nbits_16bit = zext i8 %nbits to i16 + %all_bits_except_low_nbits = shl i16 1, %nbits_16bit + %magic = select i1 %should_signext, i16 %all_bits_except_low_nbits, i16 0 + %magic_wide = zext i16 %magic to i32 + + call void @use32(i32 %nbits_32bit) + call void @use32(i32 %low_bits_to_skip) + call void @use32(i32 %high_bits_extracted) + call void @use1(i1 %should_signext) + call void @use16(i16 %nbits_16bit) + call void @use16(i16 %all_bits_except_low_nbits) + call void @use16(i16 %magic) + call void @use32(i32 %magic_wide) + + %signextended = sub i32 %high_bits_extracted, %magic_wide + ret i32 %signextended +} + +define i32 @t14_add_sext_of_shl(i32 %data, i8 %nbits) { +; CHECK-LABEL: @t14_add_sext_of_shl( +; CHECK-NEXT: [[NBITS_32BIT:%.*]] = zext i8 [[NBITS:%.*]] to i32 +; CHECK-NEXT: [[LOW_BITS_TO_SKIP:%.*]] = sub nsw i32 32, [[NBITS_32BIT]] +; CHECK-NEXT: [[HIGH_BITS_EXTRACTED:%.*]] = lshr i32 [[DATA:%.*]], [[LOW_BITS_TO_SKIP]] +; CHECK-NEXT: [[SHOULD_SIGNEXT:%.*]] = icmp slt i32 [[DATA]], 0 +; CHECK-NEXT: [[NBITS_16BIT:%.*]] = zext i8 [[NBITS]] to i16 +; CHECK-NEXT: [[ALL_BITS_EXCEPT_LOW_NBITS:%.*]] = shl i16 -1, [[NBITS_16BIT]] +; CHECK-NEXT: [[ALL_BITS_EXCEPT_LOW_NBITS_WIDE:%.*]] = sext i16 [[ALL_BITS_EXCEPT_LOW_NBITS]] to i32 +; CHECK-NEXT: [[MAGIC:%.*]] = select i1 [[SHOULD_SIGNEXT]], i32 [[ALL_BITS_EXCEPT_LOW_NBITS_WIDE]], i32 0 +; CHECK-NEXT: call void @use32(i32 [[NBITS_32BIT]]) +; CHECK-NEXT: call void @use32(i32 [[LOW_BITS_TO_SKIP]]) +; CHECK-NEXT: call void 
@use32(i32 [[HIGH_BITS_EXTRACTED]]) +; CHECK-NEXT: call void @use1(i1 [[SHOULD_SIGNEXT]]) +; CHECK-NEXT: call void @use16(i16 [[NBITS_16BIT]]) +; CHECK-NEXT: call void @use16(i16 [[ALL_BITS_EXCEPT_LOW_NBITS]]) +; CHECK-NEXT: call void @use32(i32 [[ALL_BITS_EXCEPT_LOW_NBITS_WIDE]]) +; CHECK-NEXT: call void @use32(i32 [[MAGIC]]) +; CHECK-NEXT: [[SIGNEXTENDED:%.*]] = add i32 [[HIGH_BITS_EXTRACTED]], [[MAGIC]] +; CHECK-NEXT: ret i32 [[SIGNEXTENDED]] +; + %nbits_32bit = zext i8 %nbits to i32 + %low_bits_to_skip = sub i32 32, %nbits_32bit + %high_bits_extracted = lshr i32 %data, %low_bits_to_skip + %should_signext = icmp slt i32 %data, 0 + %nbits_16bit = zext i8 %nbits to i16 + %all_bits_except_low_nbits = shl i16 -1, %nbits_16bit + %all_bits_except_low_nbits_wide = sext i16 %all_bits_except_low_nbits to i32 + %magic = select i1 %should_signext, i32 %all_bits_except_low_nbits_wide, i32 0 + + call void @use32(i32 %nbits_32bit) + call void @use32(i32 %low_bits_to_skip) + call void @use32(i32 %high_bits_extracted) + call void @use1(i1 %should_signext) + call void @use16(i16 %nbits_16bit) + call void @use16(i16 %all_bits_except_low_nbits) + call void @use32(i32 %all_bits_except_low_nbits_wide) + call void @use32(i32 %magic) + + %signextended = add i32 %high_bits_extracted, %magic + ret i32 %signextended +} + +define i32 @t15_sub_zext_of_shl(i32 %data, i8 %nbits) { +; CHECK-LABEL: @t15_sub_zext_of_shl( +; CHECK-NEXT: [[NBITS_32BIT:%.*]] = zext i8 [[NBITS:%.*]] to i32 +; CHECK-NEXT: [[LOW_BITS_TO_SKIP:%.*]] = sub nsw i32 32, [[NBITS_32BIT]] +; CHECK-NEXT: [[HIGH_BITS_EXTRACTED:%.*]] = lshr i32 [[DATA:%.*]], [[LOW_BITS_TO_SKIP]] +; CHECK-NEXT: [[SHOULD_SIGNEXT:%.*]] = icmp slt i32 [[DATA]], 0 +; CHECK-NEXT: [[NBITS_16BIT:%.*]] = zext i8 [[NBITS]] to i16 +; CHECK-NEXT: [[ALL_BITS_EXCEPT_LOW_NBITS:%.*]] = shl i16 1, [[NBITS_16BIT]] +; CHECK-NEXT: [[ALL_BITS_EXCEPT_LOW_NBITS_WIDE:%.*]] = zext i16 [[ALL_BITS_EXCEPT_LOW_NBITS]] to i32 +; CHECK-NEXT: [[MAGIC:%.*]] = select i1 
[[SHOULD_SIGNEXT]], i32 [[ALL_BITS_EXCEPT_LOW_NBITS_WIDE]], i32 0 +; CHECK-NEXT: call void @use32(i32 [[NBITS_32BIT]]) +; CHECK-NEXT: call void @use32(i32 [[LOW_BITS_TO_SKIP]]) +; CHECK-NEXT: call void @use32(i32 [[HIGH_BITS_EXTRACTED]]) +; CHECK-NEXT: call void @use1(i1 [[SHOULD_SIGNEXT]]) +; CHECK-NEXT: call void @use16(i16 [[NBITS_16BIT]]) +; CHECK-NEXT: call void @use16(i16 [[ALL_BITS_EXCEPT_LOW_NBITS]]) +; CHECK-NEXT: call void @use32(i32 [[ALL_BITS_EXCEPT_LOW_NBITS_WIDE]]) +; CHECK-NEXT: call void @use32(i32 [[MAGIC]]) +; CHECK-NEXT: [[SIGNEXTENDED:%.*]] = sub i32 [[HIGH_BITS_EXTRACTED]], [[MAGIC]] +; CHECK-NEXT: ret i32 [[SIGNEXTENDED]] +; + %nbits_32bit = zext i8 %nbits to i32 + %low_bits_to_skip = sub i32 32, %nbits_32bit + %high_bits_extracted = lshr i32 %data, %low_bits_to_skip + %should_signext = icmp slt i32 %data, 0 + %nbits_16bit = zext i8 %nbits to i16 + %all_bits_except_low_nbits = shl i16 1, %nbits_16bit + %all_bits_except_low_nbits_wide = zext i16 %all_bits_except_low_nbits to i32 + %magic = select i1 %should_signext, i32 %all_bits_except_low_nbits_wide, i32 0 + + call void @use32(i32 %nbits_32bit) + call void @use32(i32 %low_bits_to_skip) + call void @use32(i32 %high_bits_extracted) + call void @use1(i1 %should_signext) + call void @use16(i16 %nbits_16bit) + call void @use16(i16 %all_bits_except_low_nbits) + call void @use32(i32 %all_bits_except_low_nbits_wide) + call void @use32(i32 %magic) + + %signextended = sub i32 %high_bits_extracted, %magic + ret i32 %signextended +} + +; Negative tests. 
+ +define i32 @n16(i32 %data, i32 %nbits) { +; CHECK-LABEL: @n16( +; CHECK-NEXT: [[LOW_BITS_TO_SKIP:%.*]] = sub i32 31, [[NBITS:%.*]] +; CHECK-NEXT: [[HIGH_BITS_EXTRACTED:%.*]] = lshr i32 [[DATA:%.*]], [[LOW_BITS_TO_SKIP]] +; CHECK-NEXT: [[SHOULD_SIGNEXT:%.*]] = icmp slt i32 [[DATA]], 0 +; CHECK-NEXT: [[ALL_BITS_EXCEPT_LOW_NBITS:%.*]] = shl i32 -1, [[NBITS]] +; CHECK-NEXT: [[MAGIC:%.*]] = select i1 [[SHOULD_SIGNEXT]], i32 [[ALL_BITS_EXCEPT_LOW_NBITS]], i32 0 +; CHECK-NEXT: call void @use32(i32 [[LOW_BITS_TO_SKIP]]) +; CHECK-NEXT: call void @use32(i32 [[HIGH_BITS_EXTRACTED]]) +; CHECK-NEXT: call void @use1(i1 [[SHOULD_SIGNEXT]]) +; CHECK-NEXT: call void @use32(i32 [[ALL_BITS_EXCEPT_LOW_NBITS]]) +; CHECK-NEXT: call void @use32(i32 [[MAGIC]]) +; CHECK-NEXT: [[SIGNEXTENDED:%.*]] = add i32 [[HIGH_BITS_EXTRACTED]], [[MAGIC]] +; CHECK-NEXT: ret i32 [[SIGNEXTENDED]] +; + %low_bits_to_skip = sub i32 31, %nbits ; not 32 + %high_bits_extracted = lshr i32 %data, %low_bits_to_skip + %should_signext = icmp slt i32 %data, 0 + %all_bits_except_low_nbits = shl i32 -1, %nbits + %magic = select i1 %should_signext, i32 %all_bits_except_low_nbits, i32 0 + + call void @use32(i32 %low_bits_to_skip) + call void @use32(i32 %high_bits_extracted) + call void @use1(i1 %should_signext) + call void @use32(i32 %all_bits_except_low_nbits) + call void @use32(i32 %magic) + + %signextended = add i32 %high_bits_extracted, %magic + ret i32 %signextended +} + +define i32 @n17_add(i32 %data, i32 %nbits) { +; CHECK-LABEL: @n17_add( +; CHECK-NEXT: [[LOW_BITS_TO_SKIP:%.*]] = sub i32 32, [[NBITS:%.*]] +; CHECK-NEXT: [[HIGH_BITS_EXTRACTED:%.*]] = lshr i32 [[DATA:%.*]], [[LOW_BITS_TO_SKIP]] +; CHECK-NEXT: [[SHOULD_SIGNEXT:%.*]] = icmp slt i32 [[DATA]], 0 +; CHECK-NEXT: [[ALL_BITS_EXCEPT_LOW_NBITS:%.*]] = shl i32 1, [[NBITS]] +; CHECK-NEXT: [[MAGIC:%.*]] = select i1 [[SHOULD_SIGNEXT]], i32 [[ALL_BITS_EXCEPT_LOW_NBITS]], i32 0 +; CHECK-NEXT: call void @use32(i32 [[LOW_BITS_TO_SKIP]]) +; CHECK-NEXT: call void 
@use32(i32 [[HIGH_BITS_EXTRACTED]]) +; CHECK-NEXT: call void @use1(i1 [[SHOULD_SIGNEXT]]) +; CHECK-NEXT: call void @use32(i32 [[ALL_BITS_EXCEPT_LOW_NBITS]]) +; CHECK-NEXT: call void @use32(i32 [[MAGIC]]) +; CHECK-NEXT: [[SIGNEXTENDED:%.*]] = add i32 [[HIGH_BITS_EXTRACTED]], [[MAGIC]] +; CHECK-NEXT: ret i32 [[SIGNEXTENDED]] +; + %low_bits_to_skip = sub i32 32, %nbits + %high_bits_extracted = lshr i32 %data, %low_bits_to_skip + %should_signext = icmp slt i32 %data, 0 + %all_bits_except_low_nbits = shl i32 1, %nbits ; not -1 + %magic = select i1 %should_signext, i32 %all_bits_except_low_nbits, i32 0 + + call void @use32(i32 %low_bits_to_skip) + call void @use32(i32 %high_bits_extracted) + call void @use1(i1 %should_signext) + call void @use32(i32 %all_bits_except_low_nbits) + call void @use32(i32 %magic) + + %signextended = add i32 %high_bits_extracted, %magic + ret i32 %signextended +} + +define i32 @n18(i32 %data, i32 %nbits) { +; CHECK-LABEL: @n18( +; CHECK-NEXT: [[LOW_BITS_TO_SKIP:%.*]] = sub i32 32, [[NBITS:%.*]] +; CHECK-NEXT: [[HIGH_BITS_EXTRACTED:%.*]] = lshr i32 [[DATA:%.*]], [[LOW_BITS_TO_SKIP]] +; CHECK-NEXT: [[SHOULD_SIGNEXT:%.*]] = icmp slt i32 [[DATA]], 0 +; CHECK-NEXT: [[ALL_BITS_EXCEPT_LOW_NBITS:%.*]] = shl i32 -1, [[NBITS]] +; CHECK-NEXT: [[MAGIC:%.*]] = select i1 [[SHOULD_SIGNEXT]], i32 0, i32 [[ALL_BITS_EXCEPT_LOW_NBITS]] +; CHECK-NEXT: call void @use32(i32 [[LOW_BITS_TO_SKIP]]) +; CHECK-NEXT: call void @use32(i32 [[HIGH_BITS_EXTRACTED]]) +; CHECK-NEXT: call void @use1(i1 [[SHOULD_SIGNEXT]]) +; CHECK-NEXT: call void @use32(i32 [[ALL_BITS_EXCEPT_LOW_NBITS]]) +; CHECK-NEXT: call void @use32(i32 [[MAGIC]]) +; CHECK-NEXT: [[SIGNEXTENDED:%.*]] = add i32 [[HIGH_BITS_EXTRACTED]], [[MAGIC]] +; CHECK-NEXT: ret i32 [[SIGNEXTENDED]] +; + %low_bits_to_skip = sub i32 32, %nbits + %high_bits_extracted = lshr i32 %data, %low_bits_to_skip + %should_signext = icmp slt i32 %data, 0 + %all_bits_except_low_nbits = shl i32 -1, %nbits + %magic = select i1 
%should_signext, i32 0, i32 %all_bits_except_low_nbits ; wrong order + + call void @use32(i32 %low_bits_to_skip) + call void @use32(i32 %high_bits_extracted) + call void @use1(i1 %should_signext) + call void @use32(i32 %all_bits_except_low_nbits) + call void @use32(i32 %magic) + + %signextended = add i32 %high_bits_extracted, %magic + ret i32 %signextended +} + +define i32 @n19(i32 %data1, i32 %data2, i32 %nbits) { +; CHECK-LABEL: @n19( +; CHECK-NEXT: [[LOW_BITS_TO_SKIP:%.*]] = sub i32 32, [[NBITS:%.*]] +; CHECK-NEXT: [[HIGH_BITS_EXTRACTED:%.*]] = lshr i32 [[DATA1:%.*]], [[LOW_BITS_TO_SKIP]] +; CHECK-NEXT: [[SHOULD_SIGNEXT:%.*]] = icmp slt i32 [[DATA2:%.*]], 0 +; CHECK-NEXT: [[ALL_BITS_EXCEPT_LOW_NBITS:%.*]] = shl i32 -1, [[NBITS]] +; CHECK-NEXT: [[MAGIC:%.*]] = select i1 [[SHOULD_SIGNEXT]], i32 [[ALL_BITS_EXCEPT_LOW_NBITS]], i32 0 +; CHECK-NEXT: call void @use32(i32 [[LOW_BITS_TO_SKIP]]) +; CHECK-NEXT: call void @use32(i32 [[HIGH_BITS_EXTRACTED]]) +; CHECK-NEXT: call void @use1(i1 [[SHOULD_SIGNEXT]]) +; CHECK-NEXT: call void @use32(i32 [[ALL_BITS_EXCEPT_LOW_NBITS]]) +; CHECK-NEXT: call void @use32(i32 [[MAGIC]]) +; CHECK-NEXT: [[SIGNEXTENDED:%.*]] = add i32 [[HIGH_BITS_EXTRACTED]], [[MAGIC]] +; CHECK-NEXT: ret i32 [[SIGNEXTENDED]] +; + %low_bits_to_skip = sub i32 32, %nbits + %high_bits_extracted = lshr i32 %data1, %low_bits_to_skip ; not %data2 + %should_signext = icmp slt i32 %data2, 0 ; not %data1 + %all_bits_except_low_nbits = shl i32 -1, %nbits + %magic = select i1 %should_signext, i32 %all_bits_except_low_nbits, i32 0 + + call void @use32(i32 %low_bits_to_skip) + call void @use32(i32 %high_bits_extracted) + call void @use1(i1 %should_signext) + call void @use32(i32 %all_bits_except_low_nbits) + call void @use32(i32 %magic) + + %signextended = add i32 %high_bits_extracted, %magic + ret i32 %signextended +} + +define i32 @n20(i32 %data, i32 %nbits1, i32 %nbits2) { +; CHECK-LABEL: @n20( +; CHECK-NEXT: [[LOW_BITS_TO_SKIP:%.*]] = sub i32 32, [[NBITS1:%.*]] +; 
CHECK-NEXT: [[HIGH_BITS_EXTRACTED:%.*]] = lshr i32 [[DATA:%.*]], [[LOW_BITS_TO_SKIP]] +; CHECK-NEXT: [[SHOULD_SIGNEXT:%.*]] = icmp slt i32 [[DATA]], 0 +; CHECK-NEXT: [[ALL_BITS_EXCEPT_LOW_NBITS:%.*]] = shl i32 -1, [[NBITS2:%.*]] +; CHECK-NEXT: [[MAGIC:%.*]] = select i1 [[SHOULD_SIGNEXT]], i32 [[ALL_BITS_EXCEPT_LOW_NBITS]], i32 0 +; CHECK-NEXT: call void @use32(i32 [[LOW_BITS_TO_SKIP]]) +; CHECK-NEXT: call void @use32(i32 [[HIGH_BITS_EXTRACTED]]) +; CHECK-NEXT: call void @use1(i1 [[SHOULD_SIGNEXT]]) +; CHECK-NEXT: call void @use32(i32 [[ALL_BITS_EXCEPT_LOW_NBITS]]) +; CHECK-NEXT: call void @use32(i32 [[MAGIC]]) +; CHECK-NEXT: [[SIGNEXTENDED:%.*]] = add i32 [[HIGH_BITS_EXTRACTED]], [[MAGIC]] +; CHECK-NEXT: ret i32 [[SIGNEXTENDED]] +; + %low_bits_to_skip = sub i32 32, %nbits1 ; not %nbits2 + %high_bits_extracted = lshr i32 %data, %low_bits_to_skip + %should_signext = icmp slt i32 %data, 0 + %all_bits_except_low_nbits = shl i32 -1, %nbits2 ; not %nbits1 + %magic = select i1 %should_signext, i32 %all_bits_except_low_nbits, i32 0 + + call void @use32(i32 %low_bits_to_skip) + call void @use32(i32 %high_bits_extracted) + call void @use1(i1 %should_signext) + call void @use32(i32 %all_bits_except_low_nbits) + call void @use32(i32 %magic) + + %signextended = add i32 %high_bits_extracted, %magic + ret i32 %signextended +} + +define i32 @n21(i32 %data, i32 %nbits) { +; CHECK-LABEL: @n21( +; CHECK-NEXT: [[LOW_BITS_TO_SKIP:%.*]] = sub i32 32, [[NBITS:%.*]] +; CHECK-NEXT: [[HIGH_BITS_EXTRACTED:%.*]] = lshr i32 [[DATA:%.*]], [[LOW_BITS_TO_SKIP]] +; CHECK-NEXT: [[SHOULD_SIGNEXT:%.*]] = icmp sgt i32 [[DATA]], 0 +; CHECK-NEXT: [[ALL_BITS_EXCEPT_LOW_NBITS:%.*]] = shl i32 -1, [[NBITS]] +; CHECK-NEXT: [[MAGIC:%.*]] = select i1 [[SHOULD_SIGNEXT]], i32 [[ALL_BITS_EXCEPT_LOW_NBITS]], i32 0 +; CHECK-NEXT: call void @use32(i32 [[LOW_BITS_TO_SKIP]]) +; CHECK-NEXT: call void @use32(i32 [[HIGH_BITS_EXTRACTED]]) +; CHECK-NEXT: call void @use1(i1 [[SHOULD_SIGNEXT]]) +; CHECK-NEXT: call void 
@use32(i32 [[ALL_BITS_EXCEPT_LOW_NBITS]]) +; CHECK-NEXT: call void @use32(i32 [[MAGIC]]) +; CHECK-NEXT: [[SIGNEXTENDED:%.*]] = add i32 [[HIGH_BITS_EXTRACTED]], [[MAGIC]] +; CHECK-NEXT: ret i32 [[SIGNEXTENDED]] +; + %low_bits_to_skip = sub i32 32, %nbits + %high_bits_extracted = lshr i32 %data, %low_bits_to_skip + %should_signext = icmp sgt i32 %data, 0 ; this isn't a sign bit test + %all_bits_except_low_nbits = shl i32 -1, %nbits + %magic = select i1 %should_signext, i32 %all_bits_except_low_nbits, i32 0 + + call void @use32(i32 %low_bits_to_skip) + call void @use32(i32 %high_bits_extracted) + call void @use1(i1 %should_signext) + call void @use32(i32 %all_bits_except_low_nbits) + call void @use32(i32 %magic) + + %signextended = add i32 %high_bits_extracted, %magic + ret i32 %signextended +} + +define i32 @n22(i64 %data, i32 %nbits) { +; CHECK-LABEL: @n22( +; CHECK-NEXT: [[LOW_BITS_TO_SKIP:%.*]] = sub i32 63, [[NBITS:%.*]] +; CHECK-NEXT: [[LOW_BITS_TO_SKIP_WIDE:%.*]] = zext i32 [[LOW_BITS_TO_SKIP]] to i64 +; CHECK-NEXT: [[HIGH_BITS_EXTRACTED_WIDE:%.*]] = lshr i64 [[DATA:%.*]], [[LOW_BITS_TO_SKIP_WIDE]] +; CHECK-NEXT: [[HIGH_BITS_EXTRACTED:%.*]] = trunc i64 [[HIGH_BITS_EXTRACTED_WIDE]] to i32 +; CHECK-NEXT: [[SHOULD_SIGNEXT:%.*]] = icmp slt i64 [[DATA]], 0 +; CHECK-NEXT: [[ALL_BITS_EXCEPT_LOW_NBITS:%.*]] = shl i32 -1, [[NBITS]] +; CHECK-NEXT: [[MAGIC:%.*]] = select i1 [[SHOULD_SIGNEXT]], i32 [[ALL_BITS_EXCEPT_LOW_NBITS]], i32 0 +; CHECK-NEXT: call void @use32(i32 [[LOW_BITS_TO_SKIP]]) +; CHECK-NEXT: call void @use64(i64 [[LOW_BITS_TO_SKIP_WIDE]]) +; CHECK-NEXT: call void @use64(i64 [[HIGH_BITS_EXTRACTED_WIDE]]) +; CHECK-NEXT: call void @use32(i32 [[HIGH_BITS_EXTRACTED]]) +; CHECK-NEXT: call void @use1(i1 [[SHOULD_SIGNEXT]]) +; CHECK-NEXT: call void @use32(i32 [[ALL_BITS_EXCEPT_LOW_NBITS]]) +; CHECK-NEXT: call void @use32(i32 [[MAGIC]]) +; CHECK-NEXT: [[SIGNEXTENDED:%.*]] = add i32 [[MAGIC]], [[HIGH_BITS_EXTRACTED]] +; CHECK-NEXT: ret i32 [[SIGNEXTENDED]] +; + 
%low_bits_to_skip = sub i32 63, %nbits ; not 64 + %low_bits_to_skip_wide = zext i32 %low_bits_to_skip to i64 + %high_bits_extracted_wide = lshr i64 %data, %low_bits_to_skip_wide + %high_bits_extracted = trunc i64 %high_bits_extracted_wide to i32 + %should_signext = icmp slt i64 %data, 0 + %all_bits_except_low_nbits = shl i32 -1, %nbits + %magic = select i1 %should_signext, i32 %all_bits_except_low_nbits, i32 0 + + call void @use32(i32 %low_bits_to_skip) + call void @use64(i64 %low_bits_to_skip_wide) + call void @use64(i64 %high_bits_extracted_wide) + call void @use32(i32 %high_bits_extracted) + call void @use1(i1 %should_signext) + call void @use32(i32 %all_bits_except_low_nbits) + call void @use32(i32 %magic) + + %signextended = add i32 %magic, %high_bits_extracted + ret i32 %signextended +} + +define i32 @n23(i32 %data, i32 %nbits) { +; CHECK-LABEL: @n23( +; CHECK-NEXT: [[LOW_BITS_TO_SKIP:%.*]] = sub i32 32, [[NBITS:%.*]] +; CHECK-NEXT: [[HIGH_BITS_EXTRACTED:%.*]] = ashr i32 [[DATA:%.*]], [[LOW_BITS_TO_SKIP]] +; CHECK-NEXT: [[SHOULD_SIGNEXT:%.*]] = icmp slt i32 [[DATA]], 0 +; CHECK-NEXT: [[ALL_BITS_EXCEPT_LOW_NBITS:%.*]] = shl i32 -1, [[NBITS]] +; CHECK-NEXT: [[MAGIC:%.*]] = select i1 [[SHOULD_SIGNEXT]], i32 [[ALL_BITS_EXCEPT_LOW_NBITS]], i32 0 +; CHECK-NEXT: call void @use32(i32 [[LOW_BITS_TO_SKIP]]) +; CHECK-NEXT: call void @use32(i32 [[HIGH_BITS_EXTRACTED]]) +; CHECK-NEXT: call void @use1(i1 [[SHOULD_SIGNEXT]]) +; CHECK-NEXT: call void @use32(i32 [[ALL_BITS_EXCEPT_LOW_NBITS]]) +; CHECK-NEXT: call void @use32(i32 [[MAGIC]]) +; CHECK-NEXT: [[SIGNEXTENDED:%.*]] = add i32 [[HIGH_BITS_EXTRACTED]], [[MAGIC]] +; CHECK-NEXT: ret i32 [[SIGNEXTENDED]] +; + %low_bits_to_skip = sub i32 32, %nbits + %high_bits_extracted = ashr i32 %data, %low_bits_to_skip ; not `lshr` + %should_signext = icmp slt i32 %data, 0 + %all_bits_except_low_nbits = shl i32 -1, %nbits + %magic = select i1 %should_signext, i32 %all_bits_except_low_nbits, i32 0 + + call void @use32(i32 
%low_bits_to_skip) + call void @use32(i32 %high_bits_extracted) + call void @use1(i1 %should_signext) + call void @use32(i32 %all_bits_except_low_nbits) + call void @use32(i32 %magic) + + %signextended = add i32 %high_bits_extracted, %magic + ret i32 %signextended +} + +define i32 @n24(i32 %data, i32 %nbits) { +; CHECK-LABEL: @n24( +; CHECK-NEXT: [[LOW_BITS_TO_SKIP:%.*]] = sub i32 32, [[NBITS:%.*]] +; CHECK-NEXT: [[HIGH_BITS_EXTRACTED:%.*]] = lshr i32 [[DATA:%.*]], [[LOW_BITS_TO_SKIP]] +; CHECK-NEXT: [[SHOULD_SIGNEXT:%.*]] = icmp slt i32 [[DATA]], 0 +; CHECK-NEXT: [[HIGHER_BIT_AFTER_SIGNBIT:%.*]] = shl i32 1, [[NBITS]] +; CHECK-NEXT: [[MAGIC:%.*]] = select i1 [[SHOULD_SIGNEXT]], i32 [[HIGHER_BIT_AFTER_SIGNBIT]], i32 0 +; CHECK-NEXT: call void @use32(i32 [[LOW_BITS_TO_SKIP]]) +; CHECK-NEXT: call void @use32(i32 [[HIGH_BITS_EXTRACTED]]) +; CHECK-NEXT: call void @use1(i1 [[SHOULD_SIGNEXT]]) +; CHECK-NEXT: call void @use32(i32 [[HIGHER_BIT_AFTER_SIGNBIT]]) +; CHECK-NEXT: call void @use32(i32 [[MAGIC]]) +; CHECK-NEXT: [[SIGNEXTENDED:%.*]] = sub i32 [[MAGIC]], [[HIGH_BITS_EXTRACTED]] +; CHECK-NEXT: ret i32 [[SIGNEXTENDED]] +; + %low_bits_to_skip = sub i32 32, %nbits + %high_bits_extracted = lshr i32 %data, %low_bits_to_skip + %should_signext = icmp slt i32 %data, 0 + %higher_bit_after_signbit = shl i32 1, %nbits + %magic = select i1 %should_signext, i32 %higher_bit_after_signbit, i32 0 + + call void @use32(i32 %low_bits_to_skip) + call void @use32(i32 %high_bits_extracted) + call void @use1(i1 %should_signext) + call void @use32(i32 %higher_bit_after_signbit) + call void @use32(i32 %magic) + + %signextended = sub i32 %magic, %high_bits_extracted ; wrong order; `sub` is not commutative + ret i32 %signextended +} + +define i32 @n25_sub(i32 %data, i32 %nbits) { +; CHECK-LABEL: @n25_sub( +; CHECK-NEXT: [[LOW_BITS_TO_SKIP:%.*]] = sub i32 32, [[NBITS:%.*]] +; CHECK-NEXT: [[HIGH_BITS_EXTRACTED:%.*]] = lshr i32 [[DATA:%.*]], [[LOW_BITS_TO_SKIP]] +; CHECK-NEXT: 
[[SHOULD_SIGNEXT:%.*]] = icmp slt i32 [[DATA]], 0 +; CHECK-NEXT: [[HIGHER_BIT_AFTER_SIGNBIT:%.*]] = shl i32 -1, [[NBITS]] +; CHECK-NEXT: [[MAGIC:%.*]] = select i1 [[SHOULD_SIGNEXT]], i32 [[HIGHER_BIT_AFTER_SIGNBIT]], i32 0 +; CHECK-NEXT: call void @use32(i32 [[LOW_BITS_TO_SKIP]]) +; CHECK-NEXT: call void @use32(i32 [[HIGH_BITS_EXTRACTED]]) +; CHECK-NEXT: call void @use1(i1 [[SHOULD_SIGNEXT]]) +; CHECK-NEXT: call void @use32(i32 [[HIGHER_BIT_AFTER_SIGNBIT]]) +; CHECK-NEXT: call void @use32(i32 [[MAGIC]]) +; CHECK-NEXT: [[SIGNEXTENDED:%.*]] = sub i32 [[HIGH_BITS_EXTRACTED]], [[MAGIC]] +; CHECK-NEXT: ret i32 [[SIGNEXTENDED]] +; + %low_bits_to_skip = sub i32 32, %nbits + %high_bits_extracted = lshr i32 %data, %low_bits_to_skip + %should_signext = icmp slt i32 %data, 0 + %higher_bit_after_signbit = shl i32 -1, %nbits ; not 1 + %magic = select i1 %should_signext, i32 %higher_bit_after_signbit, i32 0 + + call void @use32(i32 %low_bits_to_skip) + call void @use32(i32 %high_bits_extracted) + call void @use1(i1 %should_signext) + call void @use32(i32 %higher_bit_after_signbit) + call void @use32(i32 %magic) + + %signextended = sub i32 %high_bits_extracted, %magic + ret i32 %signextended +} + +define i32 @n26(i32 %data, i32 %nbits) { +; CHECK-LABEL: @n26( +; CHECK-NEXT: [[LOW_BITS_TO_SKIP:%.*]] = sub i32 32, [[NBITS:%.*]] +; CHECK-NEXT: [[HIGH_BITS_EXTRACTED:%.*]] = lshr i32 [[DATA:%.*]], [[LOW_BITS_TO_SKIP]] +; CHECK-NEXT: [[SHOULD_SIGNEXT:%.*]] = icmp slt i32 [[DATA]], 0 +; CHECK-NEXT: [[ALL_BITS_EXCEPT_LOW_NBITS:%.*]] = shl i32 -1, [[NBITS]] +; CHECK-NEXT: [[MAGIC:%.*]] = select i1 [[SHOULD_SIGNEXT]], i32 [[ALL_BITS_EXCEPT_LOW_NBITS]], i32 -1 +; CHECK-NEXT: call void @use32(i32 [[LOW_BITS_TO_SKIP]]) +; CHECK-NEXT: call void @use32(i32 [[HIGH_BITS_EXTRACTED]]) +; CHECK-NEXT: call void @use1(i1 [[SHOULD_SIGNEXT]]) +; CHECK-NEXT: call void @use32(i32 [[ALL_BITS_EXCEPT_LOW_NBITS]]) +; CHECK-NEXT: call void @use32(i32 [[MAGIC]]) +; CHECK-NEXT: [[SIGNEXTENDED:%.*]] = add i32 
[[HIGH_BITS_EXTRACTED]], [[MAGIC]] +; CHECK-NEXT: ret i32 [[SIGNEXTENDED]] +; + %low_bits_to_skip = sub i32 32, %nbits + %high_bits_extracted = lshr i32 %data, %low_bits_to_skip + %should_signext = icmp slt i32 %data, 0 + %all_bits_except_low_nbits = shl i32 -1, %nbits + %magic = select i1 %should_signext, i32 %all_bits_except_low_nbits, i32 -1 ; not 0 + + call void @use32(i32 %low_bits_to_skip) + call void @use32(i32 %high_bits_extracted) + call void @use1(i1 %should_signext) + call void @use32(i32 %all_bits_except_low_nbits) + call void @use32(i32 %magic) + + %signextended = add i32 %high_bits_extracted, %magic + ret i32 %signextended +} + +define i32 @n27_add_zext_of_magic(i32 %data, i8 %nbits) { +; CHECK-LABEL: @n27_add_zext_of_magic( +; CHECK-NEXT: [[NBITS_32BIT:%.*]] = zext i8 [[NBITS:%.*]] to i32 +; CHECK-NEXT: [[LOW_BITS_TO_SKIP:%.*]] = sub nsw i32 32, [[NBITS_32BIT]] +; CHECK-NEXT: [[HIGH_BITS_EXTRACTED:%.*]] = lshr i32 [[DATA:%.*]], [[LOW_BITS_TO_SKIP]] +; CHECK-NEXT: [[SHOULD_SIGNEXT:%.*]] = icmp slt i32 [[DATA]], 0 +; CHECK-NEXT: [[NBITS_16BIT:%.*]] = zext i8 [[NBITS]] to i16 +; CHECK-NEXT: [[ALL_BITS_EXCEPT_LOW_NBITS:%.*]] = shl i16 -1, [[NBITS_16BIT]] +; CHECK-NEXT: [[MAGIC:%.*]] = select i1 [[SHOULD_SIGNEXT]], i16 [[ALL_BITS_EXCEPT_LOW_NBITS]], i16 0 +; CHECK-NEXT: [[MAGIC_WIDE:%.*]] = zext i16 [[MAGIC]] to i32 +; CHECK-NEXT: call void @use32(i32 [[NBITS_32BIT]]) +; CHECK-NEXT: call void @use32(i32 [[LOW_BITS_TO_SKIP]]) +; CHECK-NEXT: call void @use32(i32 [[HIGH_BITS_EXTRACTED]]) +; CHECK-NEXT: call void @use1(i1 [[SHOULD_SIGNEXT]]) +; CHECK-NEXT: call void @use16(i16 [[NBITS_16BIT]]) +; CHECK-NEXT: call void @use16(i16 [[ALL_BITS_EXCEPT_LOW_NBITS]]) +; CHECK-NEXT: call void @use16(i16 [[MAGIC]]) +; CHECK-NEXT: call void @use32(i32 [[MAGIC_WIDE]]) +; CHECK-NEXT: [[SIGNEXTENDED:%.*]] = add i32 [[HIGH_BITS_EXTRACTED]], [[MAGIC_WIDE]] +; CHECK-NEXT: ret i32 [[SIGNEXTENDED]] +; + %nbits_32bit = zext i8 %nbits to i32 + %low_bits_to_skip = sub i32 32, 
%nbits_32bit + %high_bits_extracted = lshr i32 %data, %low_bits_to_skip + %should_signext = icmp slt i32 %data, 0 + %nbits_16bit = zext i8 %nbits to i16 + %all_bits_except_low_nbits = shl i16 -1, %nbits_16bit + %magic = select i1 %should_signext, i16 %all_bits_except_low_nbits, i16 0 + %magic_wide = zext i16 %magic to i32 ; not sext + + call void @use32(i32 %nbits_32bit) + call void @use32(i32 %low_bits_to_skip) + call void @use32(i32 %high_bits_extracted) + call void @use1(i1 %should_signext) + call void @use16(i16 %nbits_16bit) + call void @use16(i16 %all_bits_except_low_nbits) + call void @use16(i16 %magic) + call void @use32(i32 %magic_wide) + + %signextended = add i32 %high_bits_extracted, %magic_wide + ret i32 %signextended +} + +define i32 @n28_sub_sext_of_magic(i32 %data, i8 %nbits) { +; CHECK-LABEL: @n28_sub_sext_of_magic( +; CHECK-NEXT: [[NBITS_32BIT:%.*]] = zext i8 [[NBITS:%.*]] to i32 +; CHECK-NEXT: [[LOW_BITS_TO_SKIP:%.*]] = sub nsw i32 32, [[NBITS_32BIT]] +; CHECK-NEXT: [[HIGH_BITS_EXTRACTED:%.*]] = lshr i32 [[DATA:%.*]], [[LOW_BITS_TO_SKIP]] +; CHECK-NEXT: [[SHOULD_SIGNEXT:%.*]] = icmp slt i32 [[DATA]], 0 +; CHECK-NEXT: [[NBITS_16BIT:%.*]] = zext i8 [[NBITS]] to i16 +; CHECK-NEXT: [[ALL_BITS_EXCEPT_LOW_NBITS:%.*]] = shl i16 1, [[NBITS_16BIT]] +; CHECK-NEXT: [[MAGIC:%.*]] = select i1 [[SHOULD_SIGNEXT]], i16 [[ALL_BITS_EXCEPT_LOW_NBITS]], i16 0 +; CHECK-NEXT: [[MAGIC_WIDE:%.*]] = sext i16 [[MAGIC]] to i32 +; CHECK-NEXT: call void @use32(i32 [[NBITS_32BIT]]) +; CHECK-NEXT: call void @use32(i32 [[LOW_BITS_TO_SKIP]]) +; CHECK-NEXT: call void @use32(i32 [[HIGH_BITS_EXTRACTED]]) +; CHECK-NEXT: call void @use1(i1 [[SHOULD_SIGNEXT]]) +; CHECK-NEXT: call void @use16(i16 [[NBITS_16BIT]]) +; CHECK-NEXT: call void @use16(i16 [[ALL_BITS_EXCEPT_LOW_NBITS]]) +; CHECK-NEXT: call void @use16(i16 [[MAGIC]]) +; CHECK-NEXT: call void @use32(i32 [[MAGIC_WIDE]]) +; CHECK-NEXT: [[SIGNEXTENDED:%.*]] = sub i32 [[HIGH_BITS_EXTRACTED]], [[MAGIC_WIDE]] +; CHECK-NEXT: ret i32 
[[SIGNEXTENDED]] +; + %nbits_32bit = zext i8 %nbits to i32 + %low_bits_to_skip = sub i32 32, %nbits_32bit + %high_bits_extracted = lshr i32 %data, %low_bits_to_skip + %should_signext = icmp slt i32 %data, 0 + %nbits_16bit = zext i8 %nbits to i16 + %all_bits_except_low_nbits = shl i16 1, %nbits_16bit + %magic = select i1 %should_signext, i16 %all_bits_except_low_nbits, i16 0 + %magic_wide = sext i16 %magic to i32 ; not zext + + call void @use32(i32 %nbits_32bit) + call void @use32(i32 %low_bits_to_skip) + call void @use32(i32 %high_bits_extracted) + call void @use1(i1 %should_signext) + call void @use16(i16 %nbits_16bit) + call void @use16(i16 %all_bits_except_low_nbits) + call void @use16(i16 %magic) + call void @use32(i32 %magic_wide) + + %signextended = sub i32 %high_bits_extracted, %magic_wide + ret i32 %signextended +} From llvm-commits at lists.llvm.org Mon Oct 7 13:53:27 2019 From: llvm-commits at lists.llvm.org (Roman Lebedev via llvm-commits) Date: Mon, 07 Oct 2019 20:53:27 -0000 Subject: [llvm] r373964 - [InstCombine] Fold conditional sign-extend of high-bit-extract into high-bit-extract-with-signext (PR42389) Message-ID: <20191007205327.E4A7B8D3AF@lists.llvm.org> Author: lebedevri Date: Mon Oct 7 13:53:27 2019 New Revision: 373964 URL: http://llvm.org/viewvc/llvm-project?rev=373964&view=rev Log: [InstCombine] Fold conditional sign-extend of high-bit-extract into high-bit-extract-with-signext (PR42389) This can come up in Bit Stream abstractions. The pattern looks big/scary, but it can't be simplified any further. It only is so simple because a number of my preparatory folds had happened already (shift amount reassociation / shift amount reassociation in bit test, sign bit test detection). Highlights: * There are two main flavors: https://rise4fun.com/Alive/zWi The difference is add vs. sub, and left-shift of -1 vs. 
1 * Since we only change the shift opcode, we can preserve the exact-ness: https://rise4fun.com/Alive/4u4 * There can be truncation after high-bit-extraction: https://rise4fun.com/Alive/slHc1 (the main pattern I'm after!) Which means that we need to ignore zext of shift amounts and of NBits. * The sign-extending magic can be extended itself (in add pattern via sext, in sub pattern via zext, not the other way around!) https://rise4fun.com/Alive/NhG (or those sext/zext can be sunk into `select`!) Which again means we should pay attention when matching NBits. * We can have both truncation of extraction and widening of magic: https://rise4fun.com/Alive/XTw In other words, I don't believe we need to have any checks on bitwidths of any of these constructs. This is worsened in general by the fact that we may have `sext` instead of `zext` for shift amounts, and we don't yet canonicalize to `zext`, although we should. I have not done anything about that here. Also, we really should have something to weed out `sub`s like these, by folding them into the `add` variant. 
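For readers who prefer plain arithmetic to IR, the two main flavors listed above can be sketched in C++ roughly as follows. This is only an illustration, not code from the patch: the function names are hypothetical, the sketch assumes 0 < nbits < 32 (so neither shift amount goes out of range), and the reference function assumes that `>>` on a negative signed value is an arithmetic shift (guaranteed since C++20, implementation-defined but near-universal before that).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper names for illustration only; none of these exist in LLVM.

// `add` flavor: logical shift right, then conditionally add (-1 << nbits).
inline uint32_t condSignextAdd(uint32_t data, unsigned nbits) {
  assert(nbits > 0 && nbits < 32 && "sketch assumes in-range shift amounts");
  uint32_t highBitsExtracted = data >> (32 - nbits);  // lshr
  bool shouldSignext = (data >> 31) != 0;             // sign bit test
  uint32_t magic = shouldSignext ? (~uint32_t{0} << nbits) : 0;
  return highBitsExtracted + magic;                   // wraps modulo 2^32
}

// `sub` flavor: logical shift right, then conditionally subtract (1 << nbits).
inline uint32_t condSignextSub(uint32_t data, unsigned nbits) {
  assert(nbits > 0 && nbits < 32 && "sketch assumes in-range shift amounts");
  uint32_t highBitsExtracted = data >> (32 - nbits);  // lshr
  bool shouldSignext = (data >> 31) != 0;             // sign bit test
  uint32_t magic = shouldSignext ? (uint32_t{1} << nbits) : 0;
  return highBitsExtracted - magic;                   // wraps modulo 2^32
}

// What both flavors are expected to compute: a single arithmetic shift right.
// Assumption: signed `>>` is arithmetic (true in C++20 and on mainstream
// compilers before it).
inline uint32_t ashrReference(uint32_t data, unsigned nbits) {
  assert(nbits > 0 && nbits < 32 && "sketch assumes in-range shift amounts");
  return static_cast<uint32_t>(static_cast<int32_t>(data) >> (32 - nbits));
}
```

Both flavors subtract 2^nbits from the extracted field exactly when the sign bit of the input is set, which is what an arithmetic shift does in one step; that is why only the shift opcode needs to change.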
https://bugs.llvm.org/show_bug.cgi?id=42389 Modified: llvm/trunk/lib/Transforms/InstCombine/InstCombineAddSub.cpp llvm/trunk/test/Transforms/InstCombine/conditional-variable-length-signext-after-high-bit-extract.ll Modified: llvm/trunk/lib/Transforms/InstCombine/InstCombineAddSub.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/InstCombine/InstCombineAddSub.cpp?rev=373964&r1=373963&r2=373964&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/InstCombine/InstCombineAddSub.cpp (original) +++ llvm/trunk/lib/Transforms/InstCombine/InstCombineAddSub.cpp Mon Oct 7 13:53:27 2019 @@ -1097,6 +1097,106 @@ static Instruction *foldToUnsignedSatura return nullptr; } +static Instruction * +canonicalizeCondSignextOfHighBitExtractToSignextHighBitExtract( + BinaryOperator &I, InstCombiner::BuilderTy &Builder) { + assert((I.getOpcode() == Instruction::Add || + I.getOpcode() == Instruction::Sub) && + "Expecting add/sub instruction"); + + // We have a subtraction/addition between a (potentially truncated) *logical* + // right-shift of X and a "select". + Value *X, *Select; + Instruction *LowBitsToSkip, *Extract; + if (!match(&I, m_c_BinOp(m_TruncOrSelf(m_CombineAnd( + m_LShr(m_Value(X), m_Instruction(LowBitsToSkip)), + m_Instruction(Extract))), + m_Value(Select)))) + return nullptr; + + // `add` is commutative; but for `sub`, "select" *must* be on RHS. + if (I.getOpcode() == Instruction::Sub && I.getOperand(1) != Select) + return nullptr; + + Type *XTy = X->getType(); + bool HadTrunc = I.getType() != XTy; + + // If there was a truncation of extracted value, then we'll need to produce + // one extra instruction, so we need to ensure one instruction will go away. 
+ if (HadTrunc && !match(&I, m_c_BinOp(m_OneUse(m_Value()), m_Value()))) + return nullptr; + + // Extraction should extract high NBits bits, with shift amount calculated as: + // low bits to skip = shift bitwidth - high bits to extract + // The shift amount itself may be extended, and we need to look past zero-ext + // when matching NBits, that will matter for matching later. + Constant *C; + Value *NBits; + if (!match( + LowBitsToSkip, + m_ZExtOrSelf(m_Sub(m_Constant(C), m_ZExtOrSelf(m_Value(NBits))))) || + !match(C, m_SpecificInt_ICMP(ICmpInst::Predicate::ICMP_EQ, + APInt(C->getType()->getScalarSizeInBits(), + X->getType()->getScalarSizeInBits())))) + return nullptr; + + // Sign-extending value can be sign-extended itself if we `add` it, + // or zero-extended if we `sub`tract it. + auto SkipExtInMagic = [&I](Value *&V) { + if (I.getOpcode() == Instruction::Add) + match(V, m_SExtOrSelf(m_Value(V))); + else + match(V, m_ZExtOrSelf(m_Value(V))); + }; + + // Now, finally validate the sign-extending magic. + // `select` itself may be appropriately extended, look past that. + SkipExtInMagic(Select); + + ICmpInst::Predicate Pred; + const APInt *Thr; + Value *SignExtendingValue, *Zero; + bool ShouldSignext; + // It must be a select between two values we will later establish to be a + // sign-extending value and a zero constant. The condition guarding the + // sign-extension must be based on a sign bit of the same X we had in `lshr`. + if (!match(Select, m_Select(m_ICmp(Pred, m_Specific(X), m_APInt(Thr)), + m_Value(SignExtendingValue), m_Value(Zero))) || + !isSignBitCheck(Pred, *Thr, ShouldSignext)) + return nullptr; + + // icmp-select pair is commutative. + if (!ShouldSignext) + std::swap(SignExtendingValue, Zero); + + // If we should not perform sign-extension then we must add/subtract zero. + if (!match(Zero, m_Zero())) + return nullptr; + // Otherwise, it should be some constant, left-shifted by the same NBits we + // had in `lshr`. 
Said left-shift can also be appropriately extended. + // Again, we must look past zero-ext when looking for NBits. + SkipExtInMagic(SignExtendingValue); + Constant *SignExtendingValueBaseConstant; + if (!match(SignExtendingValue, + m_Shl(m_Constant(SignExtendingValueBaseConstant), + m_ZExtOrSelf(m_Specific(NBits))))) + return nullptr; + // If we `add`, then the constant should be all-ones, else it should be one. + if (I.getOpcode() == Instruction::Add + ? !match(SignExtendingValueBaseConstant, m_AllOnes()) + : !match(SignExtendingValueBaseConstant, m_One())) + return nullptr; + + auto *NewAShr = BinaryOperator::CreateAShr(X, LowBitsToSkip, + Extract->getName() + ".sext"); + NewAShr->copyIRFlags(Extract); // Preserve `exact`-ness. + if (!HadTrunc) + return NewAShr; + + Builder.Insert(NewAShr); + return TruncInst::CreateTruncOrBitCast(NewAShr, I.getType()); +} + Instruction *InstCombiner::visitAdd(BinaryOperator &I) { if (Value *V = SimplifyAddInst(I.getOperand(0), I.getOperand(1), I.hasNoSignedWrap(), I.hasNoUnsignedWrap(), @@ -1302,6 +1402,11 @@ Instruction *InstCombiner::visitAdd(Bina if (Instruction *V = canonicalizeLowbitMask(I, Builder)) return V; + if (Instruction *V = + canonicalizeCondSignextOfHighBitExtractToSignextHighBitExtract( + I, Builder)) + return V; + if (Instruction *SatAdd = foldToUnsignedSaturatedAdd(I)) return SatAdd; @@ -1900,6 +2005,11 @@ Instruction *InstCombiner::visitSub(Bina return SelectInst::Create(Cmp, Neg, A); } + if (Instruction *V = + canonicalizeCondSignextOfHighBitExtractToSignextHighBitExtract( + I, Builder)) + return V; + if (Instruction *Ext = narrowMathIfNoOverflow(I)) return Ext; Modified: llvm/trunk/test/Transforms/InstCombine/conditional-variable-length-signext-after-high-bit-extract.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/InstCombine/conditional-variable-length-signext-after-high-bit-extract.ll?rev=373964&r1=373963&r2=373964&view=diff 
============================================================================== --- llvm/trunk/test/Transforms/InstCombine/conditional-variable-length-signext-after-high-bit-extract.ll (original) +++ llvm/trunk/test/Transforms/InstCombine/conditional-variable-length-signext-after-high-bit-extract.ll Mon Oct 7 13:53:27 2019 @@ -26,7 +26,7 @@ define i32 @t0_notrunc_add(i32 %data, i3 ; CHECK-NEXT: call void @use1(i1 [[SHOULD_SIGNEXT]]) ; CHECK-NEXT: call void @use32(i32 [[ALL_BITS_EXCEPT_LOW_NBITS]]) ; CHECK-NEXT: call void @use32(i32 [[MAGIC]]) -; CHECK-NEXT: [[SIGNEXTENDED:%.*]] = add i32 [[HIGH_BITS_EXTRACTED]], [[MAGIC]] +; CHECK-NEXT: [[SIGNEXTENDED:%.*]] = ashr i32 [[DATA]], [[LOW_BITS_TO_SKIP]] ; CHECK-NEXT: ret i32 [[SIGNEXTENDED]] ; %low_bits_to_skip = sub i32 32, %nbits @@ -57,7 +57,7 @@ define i32 @t1_notrunc_sub(i32 %data, i3 ; CHECK-NEXT: call void @use1(i1 [[SHOULD_SIGNEXT]]) ; CHECK-NEXT: call void @use32(i32 [[HIGHER_BIT_AFTER_SIGNBIT]]) ; CHECK-NEXT: call void @use32(i32 [[MAGIC]]) -; CHECK-NEXT: [[SIGNEXTENDED:%.*]] = sub i32 [[HIGH_BITS_EXTRACTED]], [[MAGIC]] +; CHECK-NEXT: [[SIGNEXTENDED:%.*]] = ashr i32 [[DATA]], [[LOW_BITS_TO_SKIP]] ; CHECK-NEXT: ret i32 [[SIGNEXTENDED]] ; %low_bits_to_skip = sub i32 32, %nbits @@ -84,14 +84,14 @@ define i32 @t2_trunc_add(i64 %data, i32 ; CHECK-NEXT: [[HIGH_BITS_EXTRACTED:%.*]] = trunc i64 [[HIGH_BITS_EXTRACTED_WIDE]] to i32 ; CHECK-NEXT: [[SHOULD_SIGNEXT:%.*]] = icmp slt i64 [[DATA]], 0 ; CHECK-NEXT: [[ALL_BITS_EXCEPT_LOW_NBITS:%.*]] = shl i32 -1, [[NBITS]] -; CHECK-NEXT: [[MAGIC:%.*]] = select i1 [[SHOULD_SIGNEXT]], i32 [[ALL_BITS_EXCEPT_LOW_NBITS]], i32 0 ; CHECK-NEXT: call void @use32(i32 [[LOW_BITS_TO_SKIP]]) ; CHECK-NEXT: call void @use64(i64 [[LOW_BITS_TO_SKIP_WIDE]]) ; CHECK-NEXT: call void @use64(i64 [[HIGH_BITS_EXTRACTED_WIDE]]) ; CHECK-NEXT: call void @use32(i32 [[HIGH_BITS_EXTRACTED]]) ; CHECK-NEXT: call void @use1(i1 [[SHOULD_SIGNEXT]]) ; CHECK-NEXT: call void @use32(i32 [[ALL_BITS_EXCEPT_LOW_NBITS]]) 
-; CHECK-NEXT: [[SIGNEXTENDED:%.*]] = add i32 [[MAGIC]], [[HIGH_BITS_EXTRACTED]] +; CHECK-NEXT: [[TMP1:%.*]] = ashr i64 [[DATA]], [[LOW_BITS_TO_SKIP_WIDE]] +; CHECK-NEXT: [[SIGNEXTENDED:%.*]] = trunc i64 [[TMP1]] to i32 ; CHECK-NEXT: ret i32 [[SIGNEXTENDED]] ; %low_bits_to_skip = sub i32 64, %nbits @@ -121,14 +121,14 @@ define i32 @t3_trunc_sub(i64 %data, i32 ; CHECK-NEXT: [[HIGH_BITS_EXTRACTED:%.*]] = trunc i64 [[HIGH_BITS_EXTRACTED_WIDE]] to i32 ; CHECK-NEXT: [[SHOULD_SIGNEXT:%.*]] = icmp slt i64 [[DATA]], 0 ; CHECK-NEXT: [[HIGHER_BIT_AFTER_SIGNBIT:%.*]] = shl i32 1, [[NBITS]] -; CHECK-NEXT: [[MAGIC:%.*]] = select i1 [[SHOULD_SIGNEXT]], i32 [[HIGHER_BIT_AFTER_SIGNBIT]], i32 0 ; CHECK-NEXT: call void @use32(i32 [[LOW_BITS_TO_SKIP]]) ; CHECK-NEXT: call void @use64(i64 [[LOW_BITS_TO_SKIP_WIDE]]) ; CHECK-NEXT: call void @use64(i64 [[HIGH_BITS_EXTRACTED_WIDE]]) ; CHECK-NEXT: call void @use32(i32 [[HIGH_BITS_EXTRACTED]]) ; CHECK-NEXT: call void @use1(i1 [[SHOULD_SIGNEXT]]) ; CHECK-NEXT: call void @use32(i32 [[HIGHER_BIT_AFTER_SIGNBIT]]) -; CHECK-NEXT: [[SIGNEXTENDED:%.*]] = sub i32 [[HIGH_BITS_EXTRACTED]], [[MAGIC]] +; CHECK-NEXT: [[TMP1:%.*]] = ashr i64 [[DATA]], [[LOW_BITS_TO_SKIP_WIDE]] +; CHECK-NEXT: [[SIGNEXTENDED:%.*]] = trunc i64 [[TMP1]] to i32 ; CHECK-NEXT: ret i32 [[SIGNEXTENDED]] ; %low_bits_to_skip = sub i32 64, %nbits @@ -164,7 +164,7 @@ define i32 @t4_commutativity0(i32 %data, ; CHECK-NEXT: call void @use1(i1 [[SHOULD_SIGNEXT]]) ; CHECK-NEXT: call void @use32(i32 [[ALL_BITS_EXCEPT_LOW_NBITS]]) ; CHECK-NEXT: call void @use32(i32 [[MAGIC]]) -; CHECK-NEXT: [[SIGNEXTENDED:%.*]] = add i32 [[HIGH_BITS_EXTRACTED]], [[MAGIC]] +; CHECK-NEXT: [[SIGNEXTENDED:%.*]] = ashr i32 [[DATA]], [[LOW_BITS_TO_SKIP]] ; CHECK-NEXT: ret i32 [[SIGNEXTENDED]] ; %low_bits_to_skip = sub i32 32, %nbits @@ -194,7 +194,7 @@ define i32 @t5_commutativity1(i32 %data, ; CHECK-NEXT: call void @use1(i1 [[SHOULD_SIGNEXT]]) ; CHECK-NEXT: call void @use32(i32 [[ALL_BITS_EXCEPT_LOW_NBITS]]) ; 
CHECK-NEXT: call void @use32(i32 [[MAGIC]]) -; CHECK-NEXT: [[SIGNEXTENDED:%.*]] = add i32 [[HIGH_BITS_EXTRACTED]], [[MAGIC]] +; CHECK-NEXT: [[SIGNEXTENDED:%.*]] = ashr i32 [[DATA]], [[LOW_BITS_TO_SKIP]] ; CHECK-NEXT: ret i32 [[SIGNEXTENDED]] ; %low_bits_to_skip = sub i32 32, %nbits @@ -224,7 +224,7 @@ define i32 @t6_commutativity2(i32 %data, ; CHECK-NEXT: call void @use1(i1 [[SHOULD_SIGNEXT]]) ; CHECK-NEXT: call void @use32(i32 [[ALL_BITS_EXCEPT_LOW_NBITS]]) ; CHECK-NEXT: call void @use32(i32 [[MAGIC]]) -; CHECK-NEXT: [[SIGNEXTENDED:%.*]] = add i32 [[MAGIC]], [[HIGH_BITS_EXTRACTED]] +; CHECK-NEXT: [[SIGNEXTENDED:%.*]] = ashr i32 [[DATA]], [[LOW_BITS_TO_SKIP]] ; CHECK-NEXT: ret i32 [[SIGNEXTENDED]] ; %low_bits_to_skip = sub i32 32, %nbits @@ -253,14 +253,14 @@ define i32 @t7_trunc_extrause0(i64 %data ; CHECK-NEXT: [[HIGH_BITS_EXTRACTED:%.*]] = trunc i64 [[HIGH_BITS_EXTRACTED_WIDE]] to i32 ; CHECK-NEXT: [[SHOULD_SIGNEXT:%.*]] = icmp slt i64 [[DATA]], 0 ; CHECK-NEXT: [[ALL_BITS_EXCEPT_LOW_NBITS:%.*]] = shl i32 -1, [[NBITS]] -; CHECK-NEXT: [[MAGIC:%.*]] = select i1 [[SHOULD_SIGNEXT]], i32 [[ALL_BITS_EXCEPT_LOW_NBITS]], i32 0 ; CHECK-NEXT: call void @use32(i32 [[LOW_BITS_TO_SKIP]]) ; CHECK-NEXT: call void @use64(i64 [[LOW_BITS_TO_SKIP_WIDE]]) ; CHECK-NEXT: call void @use64(i64 [[HIGH_BITS_EXTRACTED_WIDE]]) ; CHECK-NEXT: call void @use32(i32 [[HIGH_BITS_EXTRACTED]]) ; CHECK-NEXT: call void @use1(i1 [[SHOULD_SIGNEXT]]) ; CHECK-NEXT: call void @use32(i32 [[ALL_BITS_EXCEPT_LOW_NBITS]]) -; CHECK-NEXT: [[SIGNEXTENDED:%.*]] = add i32 [[MAGIC]], [[HIGH_BITS_EXTRACTED]] +; CHECK-NEXT: [[TMP1:%.*]] = ashr i64 [[DATA]], [[LOW_BITS_TO_SKIP_WIDE]] +; CHECK-NEXT: [[SIGNEXTENDED:%.*]] = trunc i64 [[TMP1]] to i32 ; CHECK-NEXT: ret i32 [[SIGNEXTENDED]] ; %low_bits_to_skip = sub i32 64, %nbits @@ -286,7 +286,6 @@ define i32 @t8_trunc_extrause1(i64 %data ; CHECK-NEXT: [[LOW_BITS_TO_SKIP:%.*]] = sub i32 64, [[NBITS:%.*]] ; CHECK-NEXT: [[LOW_BITS_TO_SKIP_WIDE:%.*]] = zext i32 
[[LOW_BITS_TO_SKIP]] to i64 ; CHECK-NEXT: [[HIGH_BITS_EXTRACTED_WIDE:%.*]] = lshr i64 [[DATA:%.*]], [[LOW_BITS_TO_SKIP_WIDE]] -; CHECK-NEXT: [[HIGH_BITS_EXTRACTED:%.*]] = trunc i64 [[HIGH_BITS_EXTRACTED_WIDE]] to i32 ; CHECK-NEXT: [[SHOULD_SIGNEXT:%.*]] = icmp slt i64 [[DATA]], 0 ; CHECK-NEXT: [[ALL_BITS_EXCEPT_LOW_NBITS:%.*]] = shl i32 -1, [[NBITS]] ; CHECK-NEXT: [[MAGIC:%.*]] = select i1 [[SHOULD_SIGNEXT]], i32 [[ALL_BITS_EXCEPT_LOW_NBITS]], i32 0 @@ -296,7 +295,8 @@ define i32 @t8_trunc_extrause1(i64 %data ; CHECK-NEXT: call void @use1(i1 [[SHOULD_SIGNEXT]]) ; CHECK-NEXT: call void @use32(i32 [[ALL_BITS_EXCEPT_LOW_NBITS]]) ; CHECK-NEXT: call void @use32(i32 [[MAGIC]]) -; CHECK-NEXT: [[SIGNEXTENDED:%.*]] = add i32 [[MAGIC]], [[HIGH_BITS_EXTRACTED]] +; CHECK-NEXT: [[TMP1:%.*]] = ashr i64 [[DATA]], [[LOW_BITS_TO_SKIP_WIDE]] +; CHECK-NEXT: [[SIGNEXTENDED:%.*]] = trunc i64 [[TMP1]] to i32 ; CHECK-NEXT: ret i32 [[SIGNEXTENDED]] ; %low_bits_to_skip = sub i32 64, %nbits @@ -368,7 +368,7 @@ define i32 @t10_preserve_exact(i32 %data ; CHECK-NEXT: call void @use1(i1 [[SHOULD_SIGNEXT]]) ; CHECK-NEXT: call void @use32(i32 [[ALL_BITS_EXCEPT_LOW_NBITS]]) ; CHECK-NEXT: call void @use32(i32 [[MAGIC]]) -; CHECK-NEXT: [[SIGNEXTENDED:%.*]] = add i32 [[HIGH_BITS_EXTRACTED]], [[MAGIC]] +; CHECK-NEXT: [[SIGNEXTENDED:%.*]] = ashr exact i32 [[DATA]], [[LOW_BITS_TO_SKIP]] ; CHECK-NEXT: ret i32 [[SIGNEXTENDED]] ; %low_bits_to_skip = sub i32 32, %nbits @@ -405,7 +405,7 @@ define i32 @t11_different_zext_of_shamt( ; CHECK-NEXT: call void @use32(i32 [[NBITS_32BIT]]) ; CHECK-NEXT: call void @use32(i32 [[ALL_BITS_EXCEPT_LOW_NBITS]]) ; CHECK-NEXT: call void @use32(i32 [[MAGIC]]) -; CHECK-NEXT: [[SIGNEXTENDED:%.*]] = add i32 [[HIGH_BITS_EXTRACTED]], [[MAGIC]] +; CHECK-NEXT: [[SIGNEXTENDED:%.*]] = ashr i32 [[DATA]], [[LOW_BITS_TO_SKIP_32]] ; CHECK-NEXT: ret i32 [[SIGNEXTENDED]] ; %nbits_16bit = zext i8 %nbits to i16 @@ -448,7 +448,7 @@ define i32 @t12_add_sext_of_magic(i32 %d ; CHECK-NEXT: call 
void @use16(i16 [[ALL_BITS_EXCEPT_LOW_NBITS]]) ; CHECK-NEXT: call void @use16(i16 [[MAGIC]]) ; CHECK-NEXT: call void @use32(i32 [[MAGIC_WIDE]]) -; CHECK-NEXT: [[SIGNEXTENDED:%.*]] = add i32 [[HIGH_BITS_EXTRACTED]], [[MAGIC_WIDE]] +; CHECK-NEXT: [[SIGNEXTENDED:%.*]] = ashr i32 [[DATA]], [[LOW_BITS_TO_SKIP]] ; CHECK-NEXT: ret i32 [[SIGNEXTENDED]] ; %nbits_32bit = zext i8 %nbits to i32 @@ -491,7 +491,7 @@ define i32 @t13_sub_zext_of_magic(i32 %d ; CHECK-NEXT: call void @use16(i16 [[ALL_BITS_EXCEPT_LOW_NBITS]]) ; CHECK-NEXT: call void @use16(i16 [[MAGIC]]) ; CHECK-NEXT: call void @use32(i32 [[MAGIC_WIDE]]) -; CHECK-NEXT: [[SIGNEXTENDED:%.*]] = sub i32 [[HIGH_BITS_EXTRACTED]], [[MAGIC_WIDE]] +; CHECK-NEXT: [[SIGNEXTENDED:%.*]] = ashr i32 [[DATA]], [[LOW_BITS_TO_SKIP]] ; CHECK-NEXT: ret i32 [[SIGNEXTENDED]] ; %nbits_32bit = zext i8 %nbits to i32 @@ -534,7 +534,7 @@ define i32 @t14_add_sext_of_shl(i32 %dat ; CHECK-NEXT: call void @use16(i16 [[ALL_BITS_EXCEPT_LOW_NBITS]]) ; CHECK-NEXT: call void @use32(i32 [[ALL_BITS_EXCEPT_LOW_NBITS_WIDE]]) ; CHECK-NEXT: call void @use32(i32 [[MAGIC]]) -; CHECK-NEXT: [[SIGNEXTENDED:%.*]] = add i32 [[HIGH_BITS_EXTRACTED]], [[MAGIC]] +; CHECK-NEXT: [[SIGNEXTENDED:%.*]] = ashr i32 [[DATA]], [[LOW_BITS_TO_SKIP]] ; CHECK-NEXT: ret i32 [[SIGNEXTENDED]] ; %nbits_32bit = zext i8 %nbits to i32 @@ -577,7 +577,7 @@ define i32 @t15_sub_zext_of_shl(i32 %dat ; CHECK-NEXT: call void @use16(i16 [[ALL_BITS_EXCEPT_LOW_NBITS]]) ; CHECK-NEXT: call void @use32(i32 [[ALL_BITS_EXCEPT_LOW_NBITS_WIDE]]) ; CHECK-NEXT: call void @use32(i32 [[MAGIC]]) -; CHECK-NEXT: [[SIGNEXTENDED:%.*]] = sub i32 [[HIGH_BITS_EXTRACTED]], [[MAGIC]] +; CHECK-NEXT: [[SIGNEXTENDED:%.*]] = ashr i32 [[DATA]], [[LOW_BITS_TO_SKIP]] ; CHECK-NEXT: ret i32 [[SIGNEXTENDED]] ; %nbits_32bit = zext i8 %nbits to i32 From llvm-commits at lists.llvm.org Mon Oct 7 13:52:16 2019 From: llvm-commits at lists.llvm.org (Ana Pazos via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 20:52:16 +0000 (UTC) 
Subject: [PATCH] D68290: [RISCV] WIP better estimate size of outlined block with C extension enabled In-Reply-To: References: Message-ID: <469a318f9f367c5e5df6fb21567db7ca@localhost.localdomain> apazos updated this revision to Diff 223643. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68290/new/ https://reviews.llvm.org/D68290 Files: llvm/lib/Target/RISCV/RISCVInstrInfo.cpp llvm/utils/TableGen/RISCVCompressInstEmitter.cpp From llvm-commits at lists.llvm.org Mon Oct 7 13:53:25 2019 From: llvm-commits at lists.llvm.org (Michael Kruse via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 20:53:25 +0000 (UTC) Subject: [PATCH] D53876: Preserve loop metadata when splitting exit blocks In-Reply-To: References: Message-ID: <43c2d3fd5628e0b271f006110f248395@localhost.localdomain> Meinersbur added a comment. I think yes, together with all the other places where loop metadata is not preserved, such as D66892. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D53876/new/ https://reviews.llvm.org/D53876 From llvm-commits at lists.llvm.org Mon Oct 7 14:07:57 2019 From: llvm-commits at lists.llvm.org (Johannes Doerfert via llvm-commits) Date: Mon, 07 Oct 2019 21:07:57 -0000 Subject: [llvm] r373965 - [Attributor] Deduce memory behavior of functions and arguments Message-ID: <20191007210757.A58648D8E7@lists.llvm.org> Author: jdoerfert Date: Mon Oct 7 14:07:57 2019 New Revision: 373965 URL: http://llvm.org/viewvc/llvm-project?rev=373965&view=rev Log: [Attributor] Deduce memory behavior of functions and arguments Deduce the memory behavior, aka "read-none", "read-only", or "write-only", for functions and arguments.
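As a rough illustration of the deduction, the toy C++ model below uses the same two-bit encoding that the patch introduces in `AAMemoryBehavior` (`NO_READS`, `NO_WRITES`, `NO_ACCESSES`). Everything else is an assumption made for the sketch: `ToyInst`, `deduce` and `describe` are hypothetical names, and the real Attributor also handles call sites, liveness, captures and fixpoint iteration, none of which is modeled here.

```cpp
#include <string>
#include <vector>

// A set bit means the property holds; NO_ACCESSES is the best state.
enum : unsigned {
  NO_READS = 1u << 0,
  NO_WRITES = 1u << 1,
  NO_ACCESSES = NO_READS | NO_WRITES,
};

// Hypothetical stand-in for an instruction's memory effects.
struct ToyInst {
  bool mayRead;
  bool mayWrite;
};

// Start optimistic and drop a bit whenever an instruction contradicts it.
unsigned deduce(const std::vector<ToyInst> &body) {
  unsigned assumed = NO_ACCESSES;
  for (const ToyInst &inst : body) {
    if (inst.mayRead)
      assumed &= ~NO_READS;
    if (inst.mayWrite)
      assumed &= ~NO_WRITES;
  }
  return assumed;
}

// Map the remaining bits to the attribute name, strongest first.
std::string describe(unsigned state) {
  if ((state & NO_ACCESSES) == NO_ACCESSES)
    return "readnone";
  if (state & NO_WRITES)
    return "readonly";
  if (state & NO_READS)
    return "writeonly";
  return "may-read/write";
}
```

For example, a body consisting of a single load-like `ToyInst{true, false}` loses only `NO_READS`, which leaves `NO_WRITES` set and is reported as "readonly".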
Reviewers: sstefan1, uenoku Subscribers: hiraditya, bollu, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67384 Modified: llvm/trunk/include/llvm/Transforms/IPO/Attributor.h llvm/trunk/lib/Transforms/IPO/Attributor.cpp llvm/trunk/test/Transforms/FunctionAttrs/align.ll llvm/trunk/test/Transforms/FunctionAttrs/arg_nocapture.ll llvm/trunk/test/Transforms/FunctionAttrs/arg_returned.ll llvm/trunk/test/Transforms/FunctionAttrs/dereferenceable.ll llvm/trunk/test/Transforms/FunctionAttrs/internal-noalias.ll llvm/trunk/test/Transforms/FunctionAttrs/liveness.ll llvm/trunk/test/Transforms/FunctionAttrs/noalias_returned.ll llvm/trunk/test/Transforms/FunctionAttrs/nocapture.ll llvm/trunk/test/Transforms/FunctionAttrs/nofree-attributor.ll llvm/trunk/test/Transforms/FunctionAttrs/nonnull.ll llvm/trunk/test/Transforms/FunctionAttrs/norecurse.ll llvm/trunk/test/Transforms/FunctionAttrs/nosync.ll llvm/trunk/test/Transforms/FunctionAttrs/read_write_returned_arguments_scc.ll llvm/trunk/test/Transforms/FunctionAttrs/readattrs.ll llvm/trunk/test/Transforms/FunctionAttrs/willreturn.ll Modified: llvm/trunk/include/llvm/Transforms/IPO/Attributor.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Transforms/IPO/Attributor.h?rev=373965&r1=373964&r2=373965&view=diff ============================================================================== --- llvm/trunk/include/llvm/Transforms/IPO/Attributor.h (original) +++ llvm/trunk/include/llvm/Transforms/IPO/Attributor.h Mon Oct 7 14:07:57 2019 @@ -408,7 +408,11 @@ struct IRPosition { /// Return true if any kind in \p AKs existing in the IR at a position that /// will affect this one. See also getAttrs(...). - bool hasAttr(ArrayRef AKs) const; + /// \param IgnoreSubsumingPositions Flag to determine if subsuming positions, + /// e.g., the function position if this is an + /// argument position, should be ignored. 
+ bool hasAttr(ArrayRef AKs, + bool IgnoreSubsumingPositions = false) const; /// Return the attributes of any kind in \p AKs existing in the IR at a /// position that will affect this one. While each position can only have a @@ -434,6 +438,28 @@ struct IRPosition { return Attribute(); } + /// Remove the attribute of kind \p AKs existing in the IR at this position. + void removeAttrs(ArrayRef AKs) { + if (getPositionKind() == IRP_INVALID || getPositionKind() == IRP_FLOAT) + return; + + AttributeList AttrList; + CallSite CS = CallSite(&getAnchorValue()); + if (CS) + AttrList = CS.getAttributes(); + else + AttrList = getAssociatedFunction()->getAttributes(); + + LLVMContext &Ctx = getAnchorValue().getContext(); + for (Attribute::AttrKind AK : AKs) + AttrList = AttrList.removeAttribute(Ctx, getAttrIdx(), AK); + + if (CS) + CS.setAttributes(AttrList); + else + getAssociatedFunction()->setAttributes(AttrList); + } + bool isAnyCallSitePosition() const { switch (getPositionKind()) { case IRPosition::IRP_CALL_SITE: @@ -1822,6 +1848,54 @@ struct AAHeapToStack : public StateWrapp /// Unique ID (due to the unique address) static const char ID; +}; + +/// An abstract interface for all memory related attributes. +struct AAMemoryBehavior + : public IRAttribute> { + AAMemoryBehavior(const IRPosition &IRP) : IRAttribute(IRP) {} + + /// State encoding bits. A set bit in the state means the property holds. + /// BEST_STATE is the best possible state, 0 the worst possible state. + enum { + NO_READS = 1 << 0, + NO_WRITES = 1 << 1, + NO_ACCESSES = NO_READS | NO_WRITES, + + BEST_STATE = NO_ACCESSES, + }; + + /// Return true if we know that the underlying value is not read or accessed + /// in its respective scope. + bool isKnownReadNone() const { return isKnown(NO_ACCESSES); } + + /// Return true if we assume that the underlying value is not read or accessed + /// in its respective scope. 
+ bool isAssumedReadNone() const { return isAssumed(NO_ACCESSES); } + + /// Return true if we know that the underlying value is not accessed + /// (=written) in its respective scope. + bool isKnownReadOnly() const { return isKnown(NO_WRITES); } + + /// Return true if we assume that the underlying value is not accessed + /// (=written) in its respective scope. + bool isAssumedReadOnly() const { return isAssumed(NO_WRITES); } + + /// Return true if we know that the underlying value is not read in its + /// respective scope. + bool isKnownWriteOnly() const { return isKnown(NO_READS); } + + /// Return true if we assume that the underlying value is not read in its + /// respective scope. + bool isAssumedWriteOnly() const { return isAssumed(NO_READS); } + + /// Create an abstract attribute view for the position \p IRP. + static AAMemoryBehavior &createForPosition(const IRPosition &IRP, + Attributor &A); + + /// Unique ID (due to the unique address) + static const char ID; }; } // end namespace llvm Modified: llvm/trunk/lib/Transforms/IPO/Attributor.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/IPO/Attributor.cpp?rev=373965&r1=373964&r2=373965&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/IPO/Attributor.cpp (original) +++ llvm/trunk/lib/Transforms/IPO/Attributor.cpp Mon Oct 7 14:07:57 2019 @@ -418,11 +418,18 @@ SubsumingPositionIterator::SubsumingPosi } } -bool IRPosition::hasAttr(ArrayRef AKs) const { - for (const IRPosition &EquivIRP : SubsumingPositionIterator(*this)) +bool IRPosition::hasAttr(ArrayRef AKs, + bool IgnoreSubsumingPositions) const { + for (const IRPosition &EquivIRP : SubsumingPositionIterator(*this)) { for (Attribute::AttrKind AK : AKs) if (EquivIRP.getAttr(AK).getKindAsEnum() == AK) return true; + // The first position returned by the SubsumingPositionIterator is + // always the position itself. 
If we ignore subsuming positions we + // are done after the first iteration. + if (IgnoreSubsumingPositions) + break; + } return false; } @@ -3437,6 +3444,448 @@ struct AAHeapToStackFunction final : pub }; } // namespace +/// -------------------- Memory Behavior Attributes ---------------------------- +/// Includes read-none, read-only, and write-only. +/// ---------------------------------------------------------------------------- +struct AAMemoryBehaviorImpl : public AAMemoryBehavior { + AAMemoryBehaviorImpl(const IRPosition &IRP) : AAMemoryBehavior(IRP) {} + + /// See AbstractAttribute::initialize(...). + void initialize(Attributor &A) override { + intersectAssumedBits(BEST_STATE); + getKnownStateFromValue(getIRPosition(), getState()); + IRAttribute::initialize(A); + } + + /// Return the memory behavior information encoded in the IR for \p IRP. + static void getKnownStateFromValue(const IRPosition &IRP, + IntegerState &State) { + SmallVector Attrs; + IRP.getAttrs(AttrKinds, Attrs); + for (const Attribute &Attr : Attrs) { + switch (Attr.getKindAsEnum()) { + case Attribute::ReadNone: + State.addKnownBits(NO_ACCESSES); + break; + case Attribute::ReadOnly: + State.addKnownBits(NO_WRITES); + break; + case Attribute::WriteOnly: + State.addKnownBits(NO_READS); + break; + default: + llvm_unreachable("Unexpcted attribute!"); + } + } + + if (auto *I = dyn_cast(&IRP.getAnchorValue())) { + if (!I->mayReadFromMemory()) + State.addKnownBits(NO_READS); + if (!I->mayWriteToMemory()) + State.addKnownBits(NO_WRITES); + } + } + + /// See AbstractAttribute::getDeducedAttributes(...). 
+ void getDeducedAttributes(LLVMContext &Ctx, + SmallVectorImpl &Attrs) const override { + assert(Attrs.size() == 0); + if (isAssumedReadNone()) + Attrs.push_back(Attribute::get(Ctx, Attribute::ReadNone)); + else if (isAssumedReadOnly()) + Attrs.push_back(Attribute::get(Ctx, Attribute::ReadOnly)); + else if (isAssumedWriteOnly()) + Attrs.push_back(Attribute::get(Ctx, Attribute::WriteOnly)); + assert(Attrs.size() <= 1); + } + + /// See AbstractAttribute::manifest(...). + ChangeStatus manifest(Attributor &A) override { + IRPosition &IRP = getIRPosition(); + + // Check if we would improve the existing attributes first. + SmallVector DeducedAttrs; + getDeducedAttributes(IRP.getAnchorValue().getContext(), DeducedAttrs); + if (llvm::all_of(DeducedAttrs, [&](const Attribute &Attr) { + return IRP.hasAttr(Attr.getKindAsEnum(), + /* IgnoreSubsumingPositions */ true); + })) + return ChangeStatus::UNCHANGED; + + // Clear existing attributes. + IRP.removeAttrs(AttrKinds); + + // Use the generic manifest method. + return IRAttribute::manifest(A); + } + + /// See AbstractState::getAsStr(). + const std::string getAsStr() const override { + if (isAssumedReadNone()) + return "readnone"; + if (isAssumedReadOnly()) + return "readonly"; + if (isAssumedWriteOnly()) + return "writeonly"; + return "may-read/write"; + } + + /// The set of IR attributes AAMemoryBehavior deals with. + static const Attribute::AttrKind AttrKinds[3]; +}; + +const Attribute::AttrKind AAMemoryBehaviorImpl::AttrKinds[] = { + Attribute::ReadNone, Attribute::ReadOnly, Attribute::WriteOnly}; + +/// Memory behavior attribute for a floating value. +struct AAMemoryBehaviorFloating : AAMemoryBehaviorImpl { + AAMemoryBehaviorFloating(const IRPosition &IRP) : AAMemoryBehaviorImpl(IRP) {} + + /// See AbstractAttribute::initialize(...). + void initialize(Attributor &A) override { + AAMemoryBehaviorImpl::initialize(A); + // Initialize the use vector with all direct uses of the associated value. 
+ for (const Use &U : getAssociatedValue().uses()) + Uses.insert(&U); + } + + /// See AbstractAttribute::updateImpl(...). + ChangeStatus updateImpl(Attributor &A) override; + + /// See AbstractAttribute::trackStatistics() + void trackStatistics() const override { + if (isAssumedReadNone()) + STATS_DECLTRACK_FLOATING_ATTR(readnone) + else if (isAssumedReadOnly()) + STATS_DECLTRACK_FLOATING_ATTR(readonly) + else if (isAssumedWriteOnly()) + STATS_DECLTRACK_FLOATING_ATTR(writeonly) + } + +private: + /// Return true if users of \p UserI might access the underlying + /// variable/location described by \p U and should therefore be analyzed. + bool followUsersOfUseIn(Attributor &A, const Use *U, + const Instruction *UserI); + + /// Update the state according to the effect of use \p U in \p UserI. + void analyzeUseIn(Attributor &A, const Use *U, const Instruction *UserI); + +protected: + /// Container for (transitive) uses of the associated argument. + SetVector Uses; +}; + +/// Memory behavior attribute for function argument. +struct AAMemoryBehaviorArgument : AAMemoryBehaviorFloating { + AAMemoryBehaviorArgument(const IRPosition &IRP) + : AAMemoryBehaviorFloating(IRP) {} + + /// See AbstractAttribute::initialize(...). + void initialize(Attributor &A) override { + AAMemoryBehaviorFloating::initialize(A); + + // TODO: From readattrs.ll: "inalloca parameters are always + // considered written" + if (hasAttr({Attribute::InAlloca})) + removeAssumedBits(NO_WRITES); + + // Initialize the use vector with all direct uses of the associated value. 
+ Argument *Arg = getAssociatedArgument(); + if (!Arg || !Arg->getParent()->hasExactDefinition()) + indicatePessimisticFixpoint(); + } + + /// See AbstractAttribute::trackStatistics() + void trackStatistics() const override { + if (isAssumedReadNone()) + STATS_DECLTRACK_ARG_ATTR(readnone) + else if (isAssumedReadOnly()) + STATS_DECLTRACK_ARG_ATTR(readonly) + else if (isAssumedWriteOnly()) + STATS_DECLTRACK_ARG_ATTR(writeonly) + } +}; + +struct AAMemoryBehaviorCallSiteArgument final : AAMemoryBehaviorArgument { + AAMemoryBehaviorCallSiteArgument(const IRPosition &IRP) + : AAMemoryBehaviorArgument(IRP) {} + + /// See AbstractAttribute::updateImpl(...). + ChangeStatus updateImpl(Attributor &A) override { + // TODO: Once we have call site specific value information we can provide + // call site specific liveness information and then it makes + // sense to specialize attributes for call site arguments instead of + // redirecting requests to the callee argument. + Argument *Arg = getAssociatedArgument(); + const IRPosition &ArgPos = IRPosition::argument(*Arg); + auto &ArgAA = A.getAAFor(*this, ArgPos); + return clampStateAndIndicateChange( + getState(), + static_cast(ArgAA.getState())); + } + + /// See AbstractAttribute::trackStatistics() + void trackStatistics() const override { + if (isAssumedReadNone()) + STATS_DECLTRACK_CSARG_ATTR(readnone) + else if (isAssumedReadOnly()) + STATS_DECLTRACK_CSARG_ATTR(readonly) + else if (isAssumedWriteOnly()) + STATS_DECLTRACK_CSARG_ATTR(writeonly) + } +}; + +/// Memory behavior attribute for a call site return position. +struct AAMemoryBehaviorCallSiteReturned final : AAMemoryBehaviorFloating { + AAMemoryBehaviorCallSiteReturned(const IRPosition &IRP) + : AAMemoryBehaviorFloating(IRP) {} + + /// See AbstractAttribute::manifest(...). + ChangeStatus manifest(Attributor &A) override { + // We do not annotate returned values. 
+ return ChangeStatus::UNCHANGED; + } + + /// See AbstractAttribute::trackStatistics() + void trackStatistics() const override {} +}; + +/// An AA to represent the memory behavior function attributes. +struct AAMemoryBehaviorFunction final : public AAMemoryBehaviorImpl { + AAMemoryBehaviorFunction(const IRPosition &IRP) : AAMemoryBehaviorImpl(IRP) {} + + /// See AbstractAttribute::updateImpl(Attributor &A). + virtual ChangeStatus updateImpl(Attributor &A) override; + + /// See AbstractAttribute::manifest(...). + ChangeStatus manifest(Attributor &A) override { + Function &F = cast(getAnchorValue()); + if (isAssumedReadNone()) { + F.removeFnAttr(Attribute::ArgMemOnly); + F.removeFnAttr(Attribute::InaccessibleMemOnly); + F.removeFnAttr(Attribute::InaccessibleMemOrArgMemOnly); + } + return AAMemoryBehaviorImpl::manifest(A); + } + + /// See AbstractAttribute::trackStatistics() + void trackStatistics() const override { + if (isAssumedReadNone()) + STATS_DECLTRACK_FN_ATTR(readnone) + else if (isAssumedReadOnly()) + STATS_DECLTRACK_FN_ATTR(readonly) + else if (isAssumedWriteOnly()) + STATS_DECLTRACK_FN_ATTR(writeonly) + } +}; + +/// AAMemoryBehavior attribute for call sites. +struct AAMemoryBehaviorCallSite final : AAMemoryBehaviorImpl { + AAMemoryBehaviorCallSite(const IRPosition &IRP) : AAMemoryBehaviorImpl(IRP) {} + + /// See AbstractAttribute::initialize(...). + void initialize(Attributor &A) override { + AAMemoryBehaviorImpl::initialize(A); + Function *F = getAssociatedFunction(); + if (!F || !F->hasExactDefinition()) + indicatePessimisticFixpoint(); + } + + /// See AbstractAttribute::updateImpl(...). + ChangeStatus updateImpl(Attributor &A) override { + // TODO: Once we have call site specific value information we can provide + // call site specific liveness information and then it makes + // sense to specialize attributes for call site arguments instead of + // redirecting requests to the callee argument. 
+ Function *F = getAssociatedFunction(); + const IRPosition &FnPos = IRPosition::function(*F); + auto &FnAA = A.getAAFor(*this, FnPos); + return clampStateAndIndicateChange( + getState(), static_cast(FnAA.getState())); + } + + /// See AbstractAttribute::trackStatistics() + void trackStatistics() const override { + if (isAssumedReadNone()) + STATS_DECLTRACK_CS_ATTR(readnone) + else if (isAssumedReadOnly()) + STATS_DECLTRACK_CS_ATTR(readonly) + else if (isAssumedWriteOnly()) + STATS_DECLTRACK_CS_ATTR(writeonly) + } +}; + +ChangeStatus AAMemoryBehaviorFunction::updateImpl(Attributor &A) { + + // The current assumed state used to determine a change. + auto AssumedState = getAssumed(); + + auto CheckRWInst = [&](Instruction &I) { + // If the instruction has an own memory behavior state, use it to restrict + // the local state. No further analysis is required as the other memory + // state is as optimistic as it gets. + if (ImmutableCallSite ICS = ImmutableCallSite(&I)) { + const auto &MemBehaviorAA = A.getAAFor( + *this, IRPosition::callsite_function(ICS)); + intersectAssumedBits(MemBehaviorAA.getAssumed()); + return !isAtFixpoint(); + } + + // Remove access kind modifiers if necessary. + if (I.mayReadFromMemory()) + removeAssumedBits(NO_READS); + if (I.mayWriteToMemory()) + removeAssumedBits(NO_WRITES); + return !isAtFixpoint(); + }; + + if (!A.checkForAllReadWriteInstructions(CheckRWInst, *this)) + return indicatePessimisticFixpoint(); + + return (AssumedState != getAssumed()) ? ChangeStatus::CHANGED + : ChangeStatus::UNCHANGED; +} + +ChangeStatus AAMemoryBehaviorFloating::updateImpl(Attributor &A) { + + const IRPosition &IRP = getIRPosition(); + const IRPosition &FnPos = IRPosition::function_scope(IRP); + AAMemoryBehavior::StateType &S = getState(); + + // First, check the function scope. We take the known information and we avoid + // work if the assumed information implies the current assumed information for + // this attribute. 
+ const auto &FnMemAA = A.getAAFor<AAMemoryBehavior>(*this, FnPos); + S.addKnownBits(FnMemAA.getKnown()); + if ((S.getAssumed() & FnMemAA.getAssumed()) == S.getAssumed()) + return ChangeStatus::UNCHANGED; + + // Make sure the value is not captured (except through "return"), if + // it is, any information derived would be irrelevant anyway as we cannot + // check the potential aliases introduced by the capture. + const auto &ArgNoCaptureAA = A.getAAFor<AANoCapture>(*this, IRP); + if (!ArgNoCaptureAA.isAssumedNoCaptureMaybeReturned()) + return indicatePessimisticFixpoint(); + + // The current assumed state used to determine a change. + auto AssumedState = S.getAssumed(); + + // Liveness information to exclude dead users. + // TODO: Take the FnPos once we have call site specific liveness information. + const auto &LivenessAA = A.getAAFor<AAIsDead>( + *this, IRPosition::function(*IRP.getAssociatedFunction())); + + // Visit and expand uses until all are analyzed or a fixpoint is reached. + for (unsigned i = 0; i < Uses.size() && !isAtFixpoint(); i++) { + const Use *U = Uses[i]; + Instruction *UserI = cast<Instruction>(U->getUser()); + LLVM_DEBUG(dbgs() << "[AAMemoryBehavior] Use: " << **U << " in " << *UserI + << " [Dead: " << (LivenessAA.isAssumedDead(UserI)) + << "]\n"); + if (LivenessAA.isAssumedDead(UserI)) + continue; + + // Check if the users of UserI should also be visited. + if (followUsersOfUseIn(A, U, UserI)) + for (const Use &UserIUse : UserI->uses()) + Uses.insert(&UserIUse); + + // If UserI might touch memory we analyze the use in detail. + if (UserI->mayReadOrWriteMemory()) + analyzeUseIn(A, U, UserI); + } + + return (AssumedState != getAssumed()) ? ChangeStatus::CHANGED + : ChangeStatus::UNCHANGED; +} + +bool AAMemoryBehaviorFloating::followUsersOfUseIn(Attributor &A, const Use *U, + const Instruction *UserI) { + // The loaded value is unrelated to the pointer argument, no need to + // follow the users of the load.
+ if (isa<LoadInst>(UserI)) + return false; + + // By default we follow all uses assuming UserI might leak information on U, + // we have special handling for call sites operands though. + ImmutableCallSite ICS(UserI); + if (!ICS || !ICS.isArgOperand(U)) + return true; + + // If the use is a call argument known not to be captured, the users of + // the call do not need to be visited because they have to be unrelated to + // the input. Note that this check is not trivial even though we disallow + // general capturing of the underlying argument. The reason is that the + // call might the argument "through return", which we allow and for which we + // need to check call users. + unsigned ArgNo = ICS.getArgumentNo(U); + const auto &ArgNoCaptureAA = + A.getAAFor<AANoCapture>(*this, IRPosition::callsite_argument(ICS, ArgNo)); + return !ArgNoCaptureAA.isAssumedNoCapture(); +} + +void AAMemoryBehaviorFloating::analyzeUseIn(Attributor &A, const Use *U, + const Instruction *UserI) { + assert(UserI->mayReadOrWriteMemory()); + + switch (UserI->getOpcode()) { + default: + // TODO: Handle all atomics and other side-effect operations we know of. + break; + case Instruction::Load: + // Loads cause the NO_READS property to disappear. + removeAssumedBits(NO_READS); + return; + + case Instruction::Store: + // Stores cause the NO_WRITES property to disappear if the use is the + // pointer operand. Note that we do assume that capturing was taken care of + // somewhere else. + if (cast<StoreInst>(UserI)->getPointerOperand() == U->get()) + removeAssumedBits(NO_WRITES); + return; + + case Instruction::Call: + case Instruction::CallBr: + case Instruction::Invoke: { + // For call sites we look at the argument memory behavior attribute (this + // could be recursive!) in order to restrict our own state. + ImmutableCallSite ICS(UserI); + + // Give up on operand bundles.
+ if (ICS.isBundleOperand(U)) { + indicatePessimisticFixpoint(); + return; + } + + // Calling a function does read the function pointer, maybe write it if the + // function is self-modifying. + if (ICS.isCallee(U)) { + removeAssumedBits(NO_READS); + break; + } + + // Adjust the possible access behavior based on the information on the + // argument. + unsigned ArgNo = ICS.getArgumentNo(U); + const IRPosition &ArgPos = IRPosition::callsite_argument(ICS, ArgNo); + const auto &MemBehaviorAA = A.getAAFor<AAMemoryBehavior>(*this, ArgPos); + // "assumed" has at most the same bits as the MemBehaviorAA assumed + // and at least "known". + intersectAssumedBits(MemBehaviorAA.getAssumed()); + return; + } + }; + + // Generally, look at the "may-properties" and adjust the assumed state if we + // did not trigger special handling before. + if (UserI->mayReadFromMemory()) + removeAssumedBits(NO_READS); + if (UserI->mayWriteToMemory()) + removeAssumedBits(NO_WRITES); +} + /// ---------------------------------------------------------------------------- /// Attributor /// ---------------------------------------------------------------------------- @@ -3607,7 +4056,8 @@ bool Attributor::checkForAllInstructions auto &OpcodeInstMap = InfoCache.getOpcodeInstMapForFunction(*AssociatedFunction); - if (!checkForAllInstructionsImpl(OpcodeInstMap, Pred, &LivenessAA, AnyDead, Opcodes)) + if (!checkForAllInstructionsImpl(OpcodeInstMap, Pred, &LivenessAA, AnyDead, + Opcodes)) return false; // If we actually used liveness information so we have to record a dependence. @@ -3965,6 +4415,9 @@ void Attributor::identifyDefaultAbstract // Every function might be "no-recurse". getOrCreateAAFor<AANoRecurse>(FPos); + // Every function might be "readnone/readonly/writeonly/...". + getOrCreateAAFor<AAMemoryBehavior>(FPos); + // Every function might be applicable for Heap-To-Stack conversion.
if (EnableHeapToStack) getOrCreateAAFor<AAHeapToStack>(FPos); @@ -4019,6 +4472,10 @@ void Attributor::identifyDefaultAbstract // Every argument with pointer type might be marked nocapture. getOrCreateAAFor<AANoCapture>(ArgPos); + + // Every argument with pointer type might be marked + // "readnone/readonly/writeonly/..." + getOrCreateAAFor<AAMemoryBehavior>(ArgPos); } } @@ -4232,6 +4689,7 @@ const char AAAlign::ID = 0; const char AANoCapture::ID = 0; const char AAValueSimplify::ID = 0; const char AAHeapToStack::ID = 0; +const char AAMemoryBehavior::ID = 0; // Macro magic to create the static generator function for attributes that // follow the naming scheme. @@ -4310,6 +4768,23 @@ const char AAHeapToStack::ID = 0; return *AA; \ } +#define CREATE_NON_RET_ABSTRACT_ATTRIBUTE_FOR_POSITION(CLASS) \ + CLASS &CLASS::createForPosition(const IRPosition &IRP, Attributor &A) { \ + CLASS *AA = nullptr; \ + switch (IRP.getPositionKind()) { \ + SWITCH_PK_INV(CLASS, IRP_INVALID, "invalid") \ + SWITCH_PK_INV(CLASS, IRP_RETURNED, "returned") \ + SWITCH_PK_CREATE(CLASS, IRP, IRP_FUNCTION, Function) \ + SWITCH_PK_CREATE(CLASS, IRP, IRP_CALL_SITE, CallSite) \ + SWITCH_PK_CREATE(CLASS, IRP, IRP_FLOAT, Floating) \ + SWITCH_PK_CREATE(CLASS, IRP, IRP_ARGUMENT, Argument) \ + SWITCH_PK_CREATE(CLASS, IRP, IRP_CALL_SITE_RETURNED, CallSiteReturned) \ + SWITCH_PK_CREATE(CLASS, IRP, IRP_CALL_SITE_ARGUMENT, CallSiteArgument) \ + } \ + AA->initialize(A); \ + return *AA; \ + } + CREATE_FUNCTION_ABSTRACT_ATTRIBUTE_FOR_POSITION(AANoUnwind) CREATE_FUNCTION_ABSTRACT_ATTRIBUTE_FOR_POSITION(AANoSync) CREATE_FUNCTION_ABSTRACT_ATTRIBUTE_FOR_POSITION(AANoFree) @@ -4329,6 +4804,8 @@ CREATE_ALL_ABSTRACT_ATTRIBUTE_FOR_POSITI CREATE_FUNCTION_ONLY_ABSTRACT_ATTRIBUTE_FOR_POSITION(AAHeapToStack) +CREATE_NON_RET_ABSTRACT_ATTRIBUTE_FOR_POSITION(AAMemoryBehavior) + #undef CREATE_FUNCTION_ABSTRACT_ATTRIBUTE_FOR_POSITION #undef CREATE_VALUE_ABSTRACT_ATTRIBUTE_FOR_POSITION #undef CREATE_ALL_ABSTRACT_ATTRIBUTE_FOR_POSITION Modified:
llvm/trunk/test/Transforms/FunctionAttrs/align.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/FunctionAttrs/align.ll?rev=373965&r1=373964&r2=373965&view=diff ============================================================================== --- llvm/trunk/test/Transforms/FunctionAttrs/align.ll (original) +++ llvm/trunk/test/Transforms/FunctionAttrs/align.ll Mon Oct 7 14:07:57 2019 @@ -7,26 +7,26 @@ target datalayout = "e-m:e-i64:64-f80:12 ; TEST 1 -; ATTRIBUTOR: define align 8 i32* @test1(i32* returned align 8 "no-capture-maybe-returned" %0) +; ATTRIBUTOR: define align 8 i32* @test1(i32* readnone returned align 8 "no-capture-maybe-returned" %0) define i32* @test1(i32* align 8 %0) #0 { ret i32* %0 } ; TEST 2 -; ATTRIBUTOR: define i32* @test2(i32* returned "no-capture-maybe-returned" %0) +; ATTRIBUTOR: define i32* @test2(i32* readnone returned "no-capture-maybe-returned" %0) define i32* @test2(i32* %0) #0 { ret i32* %0 } ; TEST 3 -; ATTRIBUTOR: define align 4 i32* @test3(i32* align 8 "no-capture-maybe-returned" %0, i32* align 4 "no-capture-maybe-returned" %1, i1 %2) +; ATTRIBUTOR: define align 4 i32* @test3(i32* readnone align 8 "no-capture-maybe-returned" %0, i32* readnone align 4 "no-capture-maybe-returned" %1, i1 %2) define i32* @test3(i32* align 8 %0, i32* align 4 %1, i1 %2) #0 { %ret = select i1 %2, i32* %0, i32* %1 ret i32* %ret } ; TEST 4 -; ATTRIBUTOR: define align 32 i32* @test4(i32* align 32 "no-capture-maybe-returned" %0, i32* align 32 "no-capture-maybe-returned" %1, i1 %2) +; ATTRIBUTOR: define align 32 i32* @test4(i32* readnone align 32 "no-capture-maybe-returned" %0, i32* readnone align 32 "no-capture-maybe-returned" %1, i1 %2) define i32* @test4(i32* align 32 %0, i32* align 32 %1, i1 %2) #0 { %ret = select i1 %2, i32* %0, i32* %1 ret i32* %ret @@ -139,7 +139,7 @@ define internal i8* @f3(i8* readnone %0) ; TEST 7 ; Better than IR information -; ATTRIBUTOR: define align 32 i32* @test7(i32* returned align 32 "no-capture-maybe-returned" 
%p) +; ATTRIBUTOR: define align 32 i32* @test7(i32* readnone returned align 32 "no-capture-maybe-returned" %p) define align 4 i32* @test7(i32* align 32 %p) #0 { tail call i8* @f1(i8* align 8 dereferenceable(1) @a1) ret i32* %p @@ -162,7 +162,7 @@ define void @test8_helper() { } define internal void @test8(i32* %a, i32* %b, i32* %c) { -; ATTRIBUTOR: define internal void @test8(i32* nocapture align 4 %a, i32* nocapture align 4 %b, i32* nocapture %c) +; ATTRIBUTOR: define internal void @test8(i32* nocapture readnone align 4 %a, i32* nocapture readnone align 4 %b, i32* nocapture readnone %c) ret void } Modified: llvm/trunk/test/Transforms/FunctionAttrs/arg_nocapture.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/FunctionAttrs/arg_nocapture.ll?rev=373965&r1=373964&r2=373965&view=diff ============================================================================== --- llvm/trunk/test/Transforms/FunctionAttrs/arg_nocapture.ll (original) +++ llvm/trunk/test/Transforms/FunctionAttrs/arg_nocapture.ll Mon Oct 7 14:07:57 2019 @@ -116,8 +116,7 @@ entry: ; ; CHECK: define dereferenceable_or_null(8) i64* @scc_B(double* readnone returned dereferenceable_or_null(8) "no-capture-maybe-returned" %a) ; -; FIXME: readnone missing for %s -; CHECK: define dereferenceable_or_null(2) i8* @scc_C(i16* returned dereferenceable_or_null(2) "no-capture-maybe-returned" %a) +; CHECK: define dereferenceable_or_null(2) i8* @scc_C(i16* readnone returned dereferenceable_or_null(2) "no-capture-maybe-returned" %a) ; ; float *scc_A(int *a) { ; return (float*)(a ? (int*)scc_A((int*)scc_B((double*)scc_C((short*)a))) : a); @@ -245,7 +244,7 @@ declare i32 @printf(i8* nocapture, ...) 
; } ; ; There should *not* be a no-capture attribute on %a -; CHECK: define i64* @not_captured_but_returned_0(i64* returned "no-capture-maybe-returned" %a) +; CHECK: define i64* @not_captured_but_returned_0(i64* returned writeonly "no-capture-maybe-returned" %a) define i64* @not_captured_but_returned_0(i64* %a) #0 { entry: store i64 0, i64* %a, align 8 @@ -260,7 +259,7 @@ entry: ; } ; ; There should *not* be a no-capture attribute on %a -; CHECK: define nonnull i64* @not_captured_but_returned_1(i64* "no-capture-maybe-returned" %a) +; CHECK: define nonnull i64* @not_captured_but_returned_1(i64* writeonly "no-capture-maybe-returned" %a) define i64* @not_captured_but_returned_1(i64* %a) #0 { entry: %add.ptr = getelementptr inbounds i64, i64* %a, i64 1 @@ -275,8 +274,7 @@ entry: ; not_captured_but_returned_1(a); ; } ; -; FIXME: no-capture missing for %a -; CHECK: define void @test_not_captured_but_returned_calls(i64* nocapture %a) +; CHECK: define void @test_not_captured_but_returned_calls(i64* nocapture writeonly %a) define void @test_not_captured_but_returned_calls(i64* %a) #0 { entry: %call = call i64* @not_captured_but_returned_0(i64* %a) @@ -291,7 +289,7 @@ entry: ; } ; ; There should *not* be a no-capture attribute on %a -; CHECK: define i64* @negative_test_not_captured_but_returned_call_0a(i64* returned "no-capture-maybe-returned" %a) +; CHECK: define i64* @negative_test_not_captured_but_returned_call_0a(i64* returned writeonly "no-capture-maybe-returned" %a) define i64* @negative_test_not_captured_but_returned_call_0a(i64* %a) #0 { entry: %call = call i64* @not_captured_but_returned_0(i64* %a) @@ -305,7 +303,7 @@ entry: ; } ; ; There should *not* be a no-capture attribute on %a -; CHECK: define void @negative_test_not_captured_but_returned_call_0b(i64* %a) +; CHECK: define void @negative_test_not_captured_but_returned_call_0b(i64* writeonly %a) define void @negative_test_not_captured_but_returned_call_0b(i64* %a) #0 { entry: %call = call i64* 
@not_captured_but_returned_0(i64* %a) @@ -321,7 +319,7 @@ entry: ; } ; ; There should *not* be a no-capture attribute on %a -; CHECK: define nonnull i64* @negative_test_not_captured_but_returned_call_1a(i64* "no-capture-maybe-returned" %a) +; CHECK: define nonnull i64* @negative_test_not_captured_but_returned_call_1a(i64* writeonly "no-capture-maybe-returned" %a) define i64* @negative_test_not_captured_but_returned_call_1a(i64* %a) #0 { entry: %call = call i64* @not_captured_but_returned_1(i64* %a) @@ -335,7 +333,7 @@ entry: ; } ; ; There should *not* be a no-capture attribute on %a -; CHECK: define void @negative_test_not_captured_but_returned_call_1b(i64* %a) +; CHECK: define void @negative_test_not_captured_but_returned_call_1b(i64* writeonly %a) define void @negative_test_not_captured_but_returned_call_1b(i64* %a) #0 { entry: %call = call i64* @not_captured_but_returned_1(i64* %a) @@ -391,7 +389,7 @@ r: ; TEST not captured by readonly external function ; -; CHECK: define void @not_captured_by_readonly_call(i32* nocapture %b) +; CHECK: define void @not_captured_by_readonly_call(i32* nocapture readonly %b) declare i32* @readonly_unknown(i32*, i32*) readonly define void @not_captured_by_readonly_call(i32* %b) #0 { Modified: llvm/trunk/test/Transforms/FunctionAttrs/arg_returned.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/FunctionAttrs/arg_returned.ll?rev=373965&r1=373964&r2=373965&view=diff ============================================================================== --- llvm/trunk/test/Transforms/FunctionAttrs/arg_returned.ll (original) +++ llvm/trunk/test/Transforms/FunctionAttrs/arg_returned.ll Mon Oct 7 14:07:57 2019 @@ -159,23 +159,16 @@ return: ; TEST SCC test returning a pointer value argument ; -; BOTH: Function Attrs: nofree noinline norecurse nosync nounwind readnone uwtable -; BOTH-NEXT: define double* @ptr_sink_r0(double* readnone returned "no-capture-maybe-returned" %r) -; BOTH: Function Attrs: nofree noinline nosync 
nounwind readnone uwtable -; BOTH-NEXT: define double* @ptr_scc_r1(double* %a, double* readnone returned %r, double* nocapture readnone %b) -; BOTH: Function Attrs: nofree noinline nosync nounwind readnone uwtable -; BOTH-NEXT: define double* @ptr_scc_r2(double* readnone %a, double* readnone %b, double* readnone returned %r) -; ; FNATTR: define double* @ptr_sink_r0(double* readnone returned %r) ; FNATTR: define double* @ptr_scc_r1(double* %a, double* readnone %r, double* nocapture readnone %b) ; FNATTR: define double* @ptr_scc_r2(double* readnone %a, double* readnone %b, double* readnone %r) ; -; ATTRIBUTOR: Function Attrs: nofree noinline nosync nounwind uwtable -; ATTRIBUTOR-NEXT: define double* @ptr_sink_r0(double* returned "no-capture-maybe-returned" %r) -; ATTRIBUTOR: Function Attrs: nofree noinline nosync nounwind uwtable -; ATTRIBUTOR-NEXT: define double* @ptr_scc_r1(double* %a, double* returned %r, double* nocapture %b) -; ATTRIBUTOR: Function Attrs: nofree noinline nosync nounwind uwtable -; ATTRIBUTOR-NEXT: define double* @ptr_scc_r2(double* %a, double* %b, double* returned %r) +; ATTRIBUTOR: Function Attrs: nofree noinline nosync nounwind readnone uwtable +; ATTRIBUTOR-NEXT: define double* @ptr_sink_r0(double* readnone returned "no-capture-maybe-returned" %r) +; ATTRIBUTOR: Function Attrs: nofree noinline nosync nounwind readnone uwtable +; ATTRIBUTOR-NEXT: define double* @ptr_scc_r1(double* readnone %a, double* readnone returned %r, double* nocapture readnone %b) +; ATTRIBUTOR: Function Attrs: nofree noinline nosync nounwind readnone uwtable +; ATTRIBUTOR-NEXT: define double* @ptr_scc_r2(double* readnone %a, double* readnone %b, double* readnone returned %r) ; ; double* ptr_scc_r1(double* a, double* b, double* r); ; double* ptr_scc_r2(double* a, double* b, double* r); @@ -293,7 +286,7 @@ entry: ; ; FNATTR: define i32* @rt2_helper(i32* %a) ; FNATTR: define i32* @rt2(i32* readnone %a, i32* readnone %b) -; BOTH: define i32* @rt2_helper(i32* returned %a) +; 
BOTH: define i32* @rt2_helper(i32* readnone returned %a) ; BOTH: define i32* @rt2(i32* readnone %a, i32* readnone "no-capture-maybe-returned" %b) define i32* @rt2_helper(i32* %a) #0 { entry: @@ -319,7 +312,7 @@ if.end: ; ; FNATTR: define i32* @rt3_helper(i32* %a, i32* %b) ; FNATTR: define i32* @rt3(i32* readnone %a, i32* readnone %b) -; BOTH: define i32* @rt3_helper(i32* %a, i32* returned "no-capture-maybe-returned" %b) +; BOTH: define i32* @rt3_helper(i32* readnone %a, i32* readnone returned "no-capture-maybe-returned" %b) ; BOTH: define i32* @rt3(i32* readnone %a, i32* readnone returned "no-capture-maybe-returned" %b) define i32* @rt3_helper(i32* %a, i32* %b) #0 { entry: @@ -355,7 +348,7 @@ if.end: ; BOTH: Function Attrs: noinline nounwind uwtable ; BOTH-NEXT: define i32* @calls_unknown_fn(i32* readnone returned "no-capture-maybe-returned" %r) ; FNATTR: define i32* @calls_unknown_fn(i32* readnone returned %r) -; ATTRIBUTOR: define i32* @calls_unknown_fn(i32* returned "no-capture-maybe-returned" %r) +; ATTRIBUTOR: define i32* @calls_unknown_fn(i32* readnone returned "no-capture-maybe-returned" %r) declare void @unknown_fn(i32* (i32*)*) #0 define i32* @calls_unknown_fn(i32* %r) #0 { @@ -443,7 +436,7 @@ entry: ; BOTH-NEXT: define double @select_and_phi(double returned %b) ; ; FNATTR: define double @select_and_phi(double %b) -; ATTRIBUTOR: Function Attrs: nofree noinline nosync nounwind uwtable +; ATTRIBUTOR: Function Attrs: nofree noinline nosync nounwind readnone uwtable ; ATTRIBUTOR-NEXT: define double @select_and_phi(double returned %b) define double @select_and_phi(double %b) #0 { entry: @@ -475,7 +468,7 @@ if.end: ; ; FNATTR: define double @recursion_select_and_phi(i32 %a, double %b) ; -; ATTRIBUTOR: Function Attrs: nofree noinline nosync nounwind uwtable +; ATTRIBUTOR: Function Attrs: nofree noinline nosync nounwind readnone uwtable ; ATTRIBUTOR-NEXT: define double @recursion_select_and_phi(i32 %a, double returned %b) define double 
@recursion_select_and_phi(i32 %a, double %b) #0 { entry: @@ -506,8 +499,8 @@ if.end: ; ; FNATTR: define double* @bitcast(i32* readnone %b) ; -; ATTRIBUTOR: Function Attrs: nofree noinline nosync nounwind uwtable -; ATTRIBUTOR-NEXT: define double* @bitcast(i32* returned "no-capture-maybe-returned" %b) +; ATTRIBUTOR: Function Attrs: nofree noinline nosync nounwind readnone uwtable +; ATTRIBUTOR-NEXT: define double* @bitcast(i32* readnone returned "no-capture-maybe-returned" %b) define double* @bitcast(i32* %b) #0 { entry: %bc0 = bitcast i32* %b to double* @@ -529,8 +522,8 @@ entry: ; ; FNATTR: define double* @bitcasts_select_and_phi(i32* readnone %b) ; -; ATTRIBUTOR: Function Attrs: nofree noinline nosync nounwind uwtable -; ATTRIBUTOR-NEXT: define double* @bitcasts_select_and_phi(i32* returned %b) +; ATTRIBUTOR: Function Attrs: nofree noinline nosync nounwind readnone uwtable +; ATTRIBUTOR-NEXT: define double* @bitcasts_select_and_phi(i32* readnone returned %b) define double* @bitcasts_select_and_phi(i32* %b) #0 { entry: %bc0 = bitcast i32* %b to double* @@ -567,8 +560,8 @@ if.end: ; ; FNATTR: define double* @ret_arg_arg_undef(i32* readnone %b) ; -; ATTRIBUTOR: Function Attrs: nofree noinline nosync nounwind uwtable -; ATTRIBUTOR-NEXT: define double* @ret_arg_arg_undef(i32* returned %b) +; ATTRIBUTOR: Function Attrs: nofree noinline nosync nounwind readnone uwtable +; ATTRIBUTOR-NEXT: define double* @ret_arg_arg_undef(i32* readnone returned %b) define double* @ret_arg_arg_undef(i32* %b) #0 { entry: %bc0 = bitcast i32* %b to double* @@ -605,8 +598,8 @@ ret_undef: ; ; FNATTR: define double* @ret_undef_arg_arg(i32* readnone %b) ; -; ATTRIBUTOR: Function Attrs: nofree noinline nosync nounwind uwtable -; ATTRIBUTOR-NEXT: define double* @ret_undef_arg_arg(i32* returned %b) +; ATTRIBUTOR: Function Attrs: nofree noinline nosync nounwind readnone uwtable +; ATTRIBUTOR-NEXT: define double* @ret_undef_arg_arg(i32* readnone returned %b) define double* @ret_undef_arg_arg(i32* 
%b) #0 { entry: %bc0 = bitcast i32* %b to double* @@ -642,7 +635,7 @@ ret_arg1: ; BOTH-NEXT: define double* @ret_undef_arg_undef(i32* readnone returned %b) ; ; FNATTR: define double* @ret_undef_arg_undef(i32* readnone %b) -; ATTRIBUTOR: define double* @ret_undef_arg_undef(i32* returned %b) +; ATTRIBUTOR: define double* @ret_undef_arg_undef(i32* readnone returned %b) define double* @ret_undef_arg_undef(i32* %b) #0 { entry: %bc0 = bitcast i32* %b to double* @@ -846,7 +839,8 @@ attributes #0 = { noinline nounwind uwta ; BOTH-DAG: attributes #{{[0-9]*}} = { nofree noinline noreturn nosync nounwind readonly uwtable } ; BOTH-DAG: attributes #{{[0-9]*}} = { noinline nounwind uwtable } ; BOTH-DAG: attributes #{{[0-9]*}} = { noreturn } -; BOTH-DAG: attributes #{{[0-9]*}} = { nofree nosync willreturn } -; BOTH-DAG: attributes #{{[0-9]*}} = { nofree nosync } -; BOTH-DAG: attributes #{{[0-9]*}} = { nofree noreturn nosync } +; BOTH-DAG: attributes #{{[0-9]*}} = { nofree noinline norecurse nosync nounwind readnone uwtable } +; BOTH-DAG: attributes #{{[0-9]*}} = { nofree nosync readnone willreturn } +; BOTH-DAG: attributes #{{[0-9]*}} = { nofree nosync readnone } +; BOTH-DAG: attributes #{{[0-9]*}} = { nofree noreturn nosync readonly } ; BOTH-NOT: attributes # Modified: llvm/trunk/test/Transforms/FunctionAttrs/dereferenceable.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/FunctionAttrs/dereferenceable.ll?rev=373965&r1=373964&r2=373965&view=diff ============================================================================== --- llvm/trunk/test/Transforms/FunctionAttrs/dereferenceable.ll (original) +++ llvm/trunk/test/Transforms/FunctionAttrs/dereferenceable.ll Mon Oct 7 14:07:57 2019 @@ -7,7 +7,7 @@ declare void @deref_phi_user(i32* %a); ; take mininimum of return values ; define i32* @test1(i32* dereferenceable(4) %0, double* dereferenceable(8) %1, i1 zeroext %2) local_unnamed_addr { -; ATTRIBUTOR: define nonnull dereferenceable(4) i32* @test1(i32* nonnull 
dereferenceable(4) "no-capture-maybe-returned" %0, double* nonnull dereferenceable(8) "no-capture-maybe-returned" %1, i1 zeroext %2) +; ATTRIBUTOR: define nonnull dereferenceable(4) i32* @test1(i32* nonnull readnone dereferenceable(4) "no-capture-maybe-returned" %0, double* nonnull readnone dereferenceable(8) "no-capture-maybe-returned" %1, i1 zeroext %2) %4 = bitcast double* %1 to i32* %5 = select i1 %2, i32* %0, i32* %4 ret i32* %5 @@ -15,7 +15,7 @@ define i32* @test1(i32* dereferenceable( ; TEST 2 define i32* @test2(i32* dereferenceable_or_null(4) %0, double* dereferenceable(8) %1, i1 zeroext %2) local_unnamed_addr { -; ATTRIBUTOR: define dereferenceable_or_null(4) i32* @test2(i32* dereferenceable_or_null(4) "no-capture-maybe-returned" %0, double* nonnull dereferenceable(8) "no-capture-maybe-returned" %1, i1 zeroext %2) +; ATTRIBUTOR: define dereferenceable_or_null(4) i32* @test2(i32* readnone dereferenceable_or_null(4) "no-capture-maybe-returned" %0, double* nonnull readnone dereferenceable(8) "no-capture-maybe-returned" %1, i1 zeroext %2) %4 = bitcast double* %1 to i32* %5 = select i1 %2, i32* %0, i32* %4 ret i32* %5 @@ -24,20 +24,20 @@ define i32* @test2(i32* dereferenceable_ ; TEST 3 ; GEP inbounds define i32* @test3_1(i32* dereferenceable(8) %0) local_unnamed_addr { -; ATTRIBUTOR: define nonnull dereferenceable(4) i32* @test3_1(i32* nonnull dereferenceable(8) "no-capture-maybe-returned" %0) +; ATTRIBUTOR: define nonnull dereferenceable(4) i32* @test3_1(i32* nonnull readnone dereferenceable(8) "no-capture-maybe-returned" %0) %ret = getelementptr inbounds i32, i32* %0, i64 1 ret i32* %ret } define i32* @test3_2(i32* dereferenceable_or_null(32) %0) local_unnamed_addr { ; FIXME: Argument should be mark dereferenceable because of GEP `inbounds`. 
-; ATTRIBUTOR: define nonnull dereferenceable(16) i32* @test3_2(i32* dereferenceable_or_null(32) "no-capture-maybe-returned" %0) +; ATTRIBUTOR: define nonnull dereferenceable(16) i32* @test3_2(i32* readnone dereferenceable_or_null(32) "no-capture-maybe-returned" %0) %ret = getelementptr inbounds i32, i32* %0, i64 4 ret i32* %ret } define i32* @test3_3(i32* dereferenceable(8) %0, i32* dereferenceable(16) %1, i1 %2) local_unnamed_addr { -; ATTRIBUTOR: define nonnull dereferenceable(4) i32* @test3_3(i32* nonnull dereferenceable(8) "no-capture-maybe-returned" %0, i32* nonnull dereferenceable(16) "no-capture-maybe-returned" %1, i1 %2) local_unnamed_addr +; ATTRIBUTOR: define nonnull dereferenceable(4) i32* @test3_3(i32* nonnull readnone dereferenceable(8) "no-capture-maybe-returned" %0, i32* nonnull readnone dereferenceable(16) "no-capture-maybe-returned" %1, i1 %2) local_unnamed_addr %ret1 = getelementptr inbounds i32, i32* %0, i64 1 %ret2 = getelementptr inbounds i32, i32* %1, i64 2 %ret = select i1 %2, i32* %ret1, i32* %ret2 @@ -48,7 +48,7 @@ define i32* @test3_3(i32* dereferenceabl ; Better than known in IR. 
define dereferenceable(4) i32* @test4(i32* dereferenceable(8) %0) local_unnamed_addr { -; ATTRIBUTOR: define nonnull dereferenceable(8) i32* @test4(i32* nonnull returned dereferenceable(8) "no-capture-maybe-returned" %0) +; ATTRIBUTOR: define nonnull dereferenceable(8) i32* @test4(i32* nonnull readnone returned dereferenceable(8) "no-capture-maybe-returned" %0) ret i32* %0 } Modified: llvm/trunk/test/Transforms/FunctionAttrs/internal-noalias.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/FunctionAttrs/internal-noalias.ll?rev=373965&r1=373964&r2=373965&view=diff ============================================================================== --- llvm/trunk/test/Transforms/FunctionAttrs/internal-noalias.ll (original) +++ llvm/trunk/test/Transforms/FunctionAttrs/internal-noalias.ll Mon Oct 7 14:07:57 2019 @@ -8,9 +8,7 @@ entry: ret i32 %add } -; FIXME: Should be something like this. -; define internal i32 @noalias_args(i32* nocapture readonly %A, i32* noalias nocapture readonly %B) -; CHECK: define internal i32 @noalias_args(i32* nocapture %A, i32* noalias nocapture %B) +; CHECK: define internal i32 @noalias_args(i32* nocapture readonly %A, i32* noalias nocapture readonly %B) define internal i32 @noalias_args(i32* %A, i32* %B) #0 { entry: @@ -25,7 +23,7 @@ entry: ; FIXME: Should be something like this. 
; define internal i32 @noalias_args_argmem(i32* noalias nocapture readonly %A, i32* noalias nocapture readonly %B) -; CHECK: define internal i32 @noalias_args_argmem(i32* nocapture %A, i32* nocapture %B) +; CHECK: define internal i32 @noalias_args_argmem(i32* nocapture readonly %A, i32* nocapture readonly %B) ; define internal i32 @noalias_args_argmem(i32* %A, i32* %B) #1 { entry: Modified: llvm/trunk/test/Transforms/FunctionAttrs/liveness.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/FunctionAttrs/liveness.ll?rev=373965&r1=373964&r2=373965&view=diff ============================================================================== --- llvm/trunk/test/Transforms/FunctionAttrs/liveness.ll (original) +++ llvm/trunk/test/Transforms/FunctionAttrs/liveness.ll Mon Oct 7 14:07:57 2019 @@ -39,8 +39,8 @@ define i32 @volatile_load(i32*) norecurs ret i32 %2 } -; CHECK: Function Attrs: nofree norecurse nosync nounwind uwtable willreturn -; CHECK-NEXT: define internal i32 @internal_load(i32* nocapture nonnull %0) +; CHECK: Function Attrs: nofree norecurse nosync nounwind readonly uwtable willreturn +; CHECK-NEXT: define internal i32 @internal_load(i32* nocapture nonnull readonly %0) define internal i32 @internal_load(i32*) norecurse nounwind uwtable { %2 = load i32, i32* %0, align 4 ret i32 %2 @@ -48,11 +48,11 @@ define internal i32 @internal_load(i32*) ; TEST 1: Only first block is live. 
; CHECK: Function Attrs: nofree noreturn nosync nounwind -; CHECK-NEXT: define i32 @first_block_no_return(i32 %a, i32* nocapture nonnull %ptr1, i32* nocapture %ptr2) +; CHECK-NEXT: define i32 @first_block_no_return(i32 %a, i32* nocapture nonnull readonly %ptr1, i32* nocapture readnone %ptr2) define i32 @first_block_no_return(i32 %a, i32* nonnull %ptr1, i32* %ptr2) #0 { entry: call i32 @internal_load(i32* %ptr1) - ; CHECK: call i32 @internal_load(i32* nocapture nonnull %ptr1) + ; CHECK: call i32 @internal_load(i32* nocapture nonnull readonly %ptr1) call void @no_return_call() ; CHECK: call void @no_return_call() ; CHECK-NEXT: unreachable @@ -84,7 +84,7 @@ cond.end: ; dead block and check if it is deduced. ; CHECK: Function Attrs: nosync -; CHECK-NEXT: define i32 @dead_block_present(i32 %a, i32* nocapture %ptr1) +; CHECK-NEXT: define i32 @dead_block_present(i32 %a, i32* nocapture readnone %ptr1) define i32 @dead_block_present(i32 %a, i32* %ptr1) #0 { entry: %cmp = icmp eq i32 %a, 0 @@ -239,7 +239,7 @@ cleanup: ; TEST 6: Undefined behvior, taken from LangRef. ; FIXME: Should be able to detect undefined behavior. -; CHECK: define void @ub(i32* nocapture %0) +; CHECK: define void @ub(i32* nocapture writeonly %0) define void @ub(i32* %0) { %poison = sub nuw i32 0, 1 ; Results in a poison value. %still_poison = and i32 %poison, 0 ; 0, but also poison. @@ -660,7 +660,7 @@ define internal void @dead_e2() { ret vo ; CHECK: define internal void @non_dead_d13() ; CHECK: define internal void @non_dead_d14() ; Verify we actually deduce information for these functions. 
-; CHECK: Function Attrs: nofree nosync nounwind willreturn +; CHECK: Function Attrs: nofree nosync nounwind readnone willreturn ; CHECK-NEXT: define internal void @non_dead_d15() ; CHECK-NOT: define internal void @dead_e Modified: llvm/trunk/test/Transforms/FunctionAttrs/noalias_returned.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/FunctionAttrs/noalias_returned.ll?rev=373965&r1=373964&r2=373965&view=diff ============================================================================== --- llvm/trunk/test/Transforms/FunctionAttrs/noalias_returned.ll (original) +++ llvm/trunk/test/Transforms/FunctionAttrs/noalias_returned.ll Mon Oct 7 14:07:57 2019 @@ -153,7 +153,7 @@ define i8* @test8(i32* %0) nounwind uwta ; TEST 9 ; Simple Argument Test define internal void @test9(i8* %a, i8* %b) { -; CHECK: define internal void @test9(i8* noalias nocapture %a, i8* nocapture %b) +; CHECK: define internal void @test9(i8* noalias nocapture readnone %a, i8* nocapture readnone %b) ret void } define void @test9_helper(i8* %a, i8* %b) { Modified: llvm/trunk/test/Transforms/FunctionAttrs/nocapture.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/FunctionAttrs/nocapture.ll?rev=373965&r1=373964&r2=373965&view=diff ============================================================================== --- llvm/trunk/test/Transforms/FunctionAttrs/nocapture.ll (original) +++ llvm/trunk/test/Transforms/FunctionAttrs/nocapture.ll Mon Oct 7 14:07:57 2019 @@ -6,7 +6,7 @@ @g = global i32* null ; [#uses=1] ; FNATTR: define i32* @c1(i32* readnone returned %q) -; ATTRIBUTOR: define i32* @c1(i32* returned "no-capture-maybe-returned" %q) +; ATTRIBUTOR: define i32* @c1(i32* readnone returned "no-capture-maybe-returned" %q) define i32* @c1(i32* %q) { ret i32* %q } @@ -24,7 +24,8 @@ define void @c3(i32* %q) { ret void } -; EITHER: define i1 @c4(i32* %q, i32 %bitno) +; FNATTR: define i1 @c4(i32* %q, i32 %bitno) +; ATTRIBUTOR: define i1 @c4(i32* readnone %q, i32 %bitno) 
define i1 @c4(i32* %q, i32 %bitno) { %tmp = ptrtoint i32* %q to i32 %tmp2 = lshr i32 %tmp, %bitno @@ -126,8 +127,7 @@ define void @nc3(void ()* %p) { } declare void @external(i8*) readonly nounwind -; FNATTR: define void @nc4(i8* nocapture readonly %p) -; ATTRIBUTOR: define void @nc4(i8* nocapture %p) +; EITHER: define void @nc4(i8* nocapture readonly %p) define void @nc4(i8* %p) { call void @external(i8* %p) ret void @@ -141,7 +141,7 @@ define void @nc5(void (i8*)* %f, i8* %p) } ; FNATTR: define void @test1_1(i8* nocapture readnone %x1_1, i8* %y1_1, i1 %c) -; ATTRIBUTOR: define void @test1_1(i8* nocapture %x1_1, i8* nocapture %y1_1, i1 %c) +; ATTRIBUTOR: define void @test1_1(i8* nocapture readnone %x1_1, i8* nocapture readnone %y1_1, i1 %c) ; It would be acceptable to add readnone to %y1_1 and %y1_2. define void @test1_1(i8* %x1_1, i8* %y1_1, i1 %c) { call i8* @test1_2(i8* %x1_1, i8* %y1_1, i1 %c) @@ -150,7 +150,7 @@ define void @test1_1(i8* %x1_1, i8* %y1_ } ; FNATTR: define i8* @test1_2(i8* nocapture readnone %x1_2, i8* returned %y1_2, i1 %c) -; ATTRIBUTOR: define i8* @test1_2(i8* nocapture %x1_2, i8* returned "no-capture-maybe-returned" %y1_2, i1 %c) +; ATTRIBUTOR: define i8* @test1_2(i8* nocapture readnone %x1_2, i8* readnone returned "no-capture-maybe-returned" %y1_2, i1 %c) define i8* @test1_2(i8* %x1_2, i8* %y1_2, i1 %c) { br i1 %c, label %t, label %f t: @@ -161,16 +161,14 @@ f: ret i8* %y1_2 } -; FNATTR: define void @test2(i8* nocapture readnone %x2) -; ATTRIBUTOR: define void @test2(i8* nocapture %x2) +; EITHER: define void @test2(i8* nocapture readnone %x2) define void @test2(i8* %x2) { call void @test2(i8* %x2) store i32* null, i32** @g ret void } -; FNATTR: define void @test3(i8* nocapture readnone %x3, i8* nocapture readnone %y3, i8* nocapture readnone %z3) -; ATTRIBUTOR: define void @test3(i8* nocapture %x3, i8* nocapture %y3, i8* nocapture %z3) +; EITHER: define void @test3(i8* nocapture readnone %x3, i8* nocapture readnone %y3, i8* nocapture 
readnone %z3) define void @test3(i8* %x3, i8* %y3, i8* %z3) { call void @test3(i8* %z3, i8* %y3, i8* %x3) store i32* null, i32** @g @@ -178,7 +176,7 @@ define void @test3(i8* %x3, i8* %y3, i8* } ; FNATTR: define void @test4_1(i8* %x4_1, i1 %c) -; ATTRIBUTOR: define void @test4_1(i8* nocapture %x4_1, i1 %c) +; ATTRIBUTOR: define void @test4_1(i8* nocapture readnone %x4_1, i1 %c) define void @test4_1(i8* %x4_1, i1 %c) { call i8* @test4_2(i8* %x4_1, i8* %x4_1, i8* %x4_1, i1 %c) store i32* null, i32** @g @@ -186,7 +184,7 @@ define void @test4_1(i8* %x4_1, i1 %c) { } ; FNATTR: define i8* @test4_2(i8* nocapture readnone %x4_2, i8* readnone returned %y4_2, i8* nocapture readnone %z4_2, i1 %c) -; ATTRIBUTOR: define i8* @test4_2(i8* nocapture %x4_2, i8* returned "no-capture-maybe-returned" %y4_2, i8* nocapture %z4_2, i1 %c) +; ATTRIBUTOR: define i8* @test4_2(i8* nocapture readnone %x4_2, i8* readnone returned "no-capture-maybe-returned" %y4_2, i8* nocapture readnone %z4_2, i1 %c) define i8* @test4_2(i8* %x4_2, i8* %y4_2, i8* %z4_2, i1 %c) { br i1 %c, label %t, label %f t: @@ -257,7 +255,8 @@ define void @captureLaunder(i8* %p) { ret void } -; EITHER: @nocaptureStrip(i8* nocapture %p) +; FNATTR: @nocaptureStrip(i8* nocapture %p) +; ATTRIBUTOR: @nocaptureStrip(i8* nocapture writeonly %p) define void @nocaptureStrip(i8* %p) { entry: %b = call i8* @llvm.strip.invariant.group.p0i8(i8* %p) @@ -273,22 +272,19 @@ define void @captureStrip(i8* %p) { ret void } -; FNATTR: define i1 @captureICmp(i32* readnone %x) -; ATTRIBUTOR: define i1 @captureICmp(i32* %x) +; EITHER: define i1 @captureICmp(i32* readnone %x) define i1 @captureICmp(i32* %x) { %1 = icmp eq i32* %x, null ret i1 %1 } -; FNATTR: define i1 @captureICmpRev(i32* readnone %x) -; ATTRIBUTOR: define i1 @captureICmpRev(i32* %x) +; EITHER: define i1 @captureICmpRev(i32* readnone %x) define i1 @captureICmpRev(i32* %x) { %1 = icmp eq i32* null, %x ret i1 %1 } -; FNATTR: define i1 @nocaptureInboundsGEPICmp(i32* nocapture readnone 
%x) -; ATTRIBUTOR: define i1 @nocaptureInboundsGEPICmp(i32* nocapture %x) +; EITHER: define i1 @nocaptureInboundsGEPICmp(i32* nocapture readnone %x) define i1 @nocaptureInboundsGEPICmp(i32* %x) { %1 = getelementptr inbounds i32, i32* %x, i32 5 %2 = bitcast i32* %1 to i8* @@ -296,8 +292,7 @@ define i1 @nocaptureInboundsGEPICmp(i32* ret i1 %3 } -; FNATTR: define i1 @nocaptureInboundsGEPICmpRev(i32* nocapture readnone %x) -; ATTRIBUTOR: define i1 @nocaptureInboundsGEPICmpRev(i32* nocapture %x) +; EITHER: define i1 @nocaptureInboundsGEPICmpRev(i32* nocapture readnone %x) define i1 @nocaptureInboundsGEPICmpRev(i32* %x) { %1 = getelementptr inbounds i32, i32* %x, i32 5 %2 = bitcast i32* %1 to i8* @@ -305,16 +300,14 @@ define i1 @nocaptureInboundsGEPICmpRev(i ret i1 %3 } -; FNATTR: define i1 @nocaptureDereferenceableOrNullICmp(i32* nocapture readnone dereferenceable_or_null(4) %x) -; ATTRIBUTOR: define i1 @nocaptureDereferenceableOrNullICmp(i32* nocapture dereferenceable_or_null(4) %x) +; EITHER: define i1 @nocaptureDereferenceableOrNullICmp(i32* nocapture readnone dereferenceable_or_null(4) %x) define i1 @nocaptureDereferenceableOrNullICmp(i32* dereferenceable_or_null(4) %x) { %1 = bitcast i32* %x to i8* %2 = icmp eq i8* %1, null ret i1 %2 } -; FNATTR: define i1 @captureDereferenceableOrNullICmp(i32* readnone dereferenceable_or_null(4) %x) -; ATTRIBUTOR: define i1 @captureDereferenceableOrNullICmp(i32* dereferenceable_or_null(4) %x) +; EITHER: define i1 @captureDereferenceableOrNullICmp(i32* readnone dereferenceable_or_null(4) %x) define i1 @captureDereferenceableOrNullICmp(i32* dereferenceable_or_null(4) %x) "null-pointer-is-valid"="true" { %1 = bitcast i32* %x to i8* %2 = icmp eq i8* %1, null Modified: llvm/trunk/test/Transforms/FunctionAttrs/nofree-attributor.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/FunctionAttrs/nofree-attributor.ll?rev=373965&r1=373964&r2=373965&view=diff 
============================================================================== --- llvm/trunk/test/Transforms/FunctionAttrs/nofree-attributor.ll (original) +++ llvm/trunk/test/Transforms/FunctionAttrs/nofree-attributor.ll Mon Oct 7 14:07:57 2019 @@ -15,7 +15,7 @@ declare void @_ZdaPv(i8*) local_unnamed_ ; TEST 1 (positive case) ; FNATTR: Function Attrs: noinline norecurse nounwind readnone uwtable ; FNATTR-NEXT: define void @only_return() -; ATTRIBUTOR: Function Attrs: nofree noinline nosync nounwind uwtable +; ATTRIBUTOR: Function Attrs: nofree noinline nosync nounwind readnone uwtable ; ATTRIBUTOR-NEXT: define void @only_return() define void @only_return() #0 { ret void @@ -92,7 +92,7 @@ end: ; FNATTR: Function Attrs: noinline nounwind readnone uwtable ; FNATTR-NEXT: define void @mutual_recursion1() -; ATTRIBUTOR: Function Attrs: nofree noinline noreturn nosync nounwind uwtable +; ATTRIBUTOR: Function Attrs: nofree noinline noreturn nosync nounwind readnone uwtable ; ATTRIBUTOR-NEXT: define void @mutual_recursion1() define void @mutual_recursion1() #0 { call void @mutual_recursion2() @@ -101,7 +101,7 @@ define void @mutual_recursion1() #0 { ; FNATTR: Function Attrs: noinline nounwind readnone uwtable ; FNATTR-NEXT: define void @mutual_recursion2() -; ATTRIBUTOR: Function Attrs: nofree noinline noreturn nosync nounwind uwtable +; ATTRIBUTOR: Function Attrs: nofree noinline noreturn nosync nounwind readnone uwtable ; ATTRIBUTOR-NEXT: define void @mutual_recursion2() define void @mutual_recursion2() #0 { call void @mutual_recursion1() @@ -158,7 +158,7 @@ declare void @nofree_function() nofree r ; FNATTR: Function Attrs: noinline nounwind readnone uwtable ; FNATTR-NEXT: define void @call_nofree_function() -; ATTRIBUTOR: Function Attrs: nofree noinline nosync nounwind uwtable +; ATTRIBUTOR: Function Attrs: nofree noinline nosync nounwind readnone uwtable ; ATTRIBUTOR-NEXT: define void @call_nofree_function() define void @call_nofree_function() #0 { tail call void 
@nofree_function() @@ -211,7 +211,7 @@ declare float @llvm.floor.f32(float) ; FNATTRS: Function Attrs: noinline nounwind uwtable ; FNATTRS-NEXT: define void @call_floor(float %a) ; FIXME: missing nofree -; ATTRIBUTOR: Function Attrs: noinline nosync nounwind uwtable +; ATTRIBUTOR: Function Attrs: noinline nosync nounwind readnone uwtable ; ATTRIBUTOR-NEXT: define void @call_floor(float %a) define void @call_floor(float %a) #0 { @@ -224,7 +224,7 @@ define void @call_floor(float %a) #0 { ; FNATTRS: Function Attrs: noinline nounwind uwtable ; FNATTRS-NEXT: define void @f1() -; ATTRIBUTOR: Function Attrs: nofree noinline nosync nounwind uwtable +; ATTRIBUTOR: Function Attrs: nofree noinline nosync nounwind readnone uwtable ; ATTRIBUTOR-NEXT: define void @f1() define void @f1() #0 { tail call void @nofree_function() @@ -233,7 +233,7 @@ define void @f1() #0 { ; FNATTRS: Function Attrs: noinline nounwind uwtable ; FNATTRS-NEXT: define void @f2() -; ATTRIBUTOR: Function Attrs: nofree noinline nosync nounwind uwtable +; ATTRIBUTOR: Function Attrs: nofree noinline nosync nounwind readnone uwtable ; ATTRIBUTOR-NEXT: define void @f2() define void @f2() #0 { tail call void @f1() Modified: llvm/trunk/test/Transforms/FunctionAttrs/nonnull.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/FunctionAttrs/nonnull.ll?rev=373965&r1=373964&r2=373965&view=diff ============================================================================== --- llvm/trunk/test/Transforms/FunctionAttrs/nonnull.ll (original) +++ llvm/trunk/test/Transforms/FunctionAttrs/nonnull.ll Mon Oct 7 14:07:57 2019 @@ -159,7 +159,7 @@ define void @test13_helper() { ret void } define internal void @test13(i8* %a, i8* %b, i8* %c) { -; ATTRIBUTOR: define internal void @test13(i8* nocapture nonnull %a, i8* nocapture %b, i8* nocapture %c) +; ATTRIBUTOR: define internal void @test13(i8* nocapture nonnull readnone %a, i8* nocapture readnone %b, i8* nocapture readnone %c) ret void } @@ -178,8 +178,8 @@ 
declare nonnull i8* @nonnull() define internal i32* @f1(i32* %arg) { -; FIXME: missing nonnull It should be nonnull @f1(i32* nonnull %arg) -; ATTRIBUTOR: define internal nonnull i32* @f1(i32* %arg) +; FIXME: missing nonnull It should be nonnull @f1(i32* nonnull readonly %arg) +; ATTRIBUTOR: define internal nonnull i32* @f1(i32* readonly %arg) bb: %tmp = icmp eq i32* %arg, null @@ -212,18 +212,18 @@ define internal i32* @f2(i32* %arg) { ; ATTRIBUTOR: define internal nonnull i32* @f2(i32* %arg) bb: -; FIXME: missing nonnull. It should be @f1(i32* nonnull %arg) -; ATTRIBUTOR: %tmp = tail call nonnull i32* @f1(i32* %arg) +; FIXME: missing nonnull. It should be @f1(i32* nonnull readonly %arg) +; ATTRIBUTOR: %tmp = tail call nonnull i32* @f1(i32* readonly %arg) %tmp = tail call i32* @f1(i32* %arg) ret i32* %tmp } define dso_local noalias i32* @f3(i32* %arg) { -; FIXME: missing nonnull. It should be nonnull @f3(i32* nonnull %arg) -; ATTRIBUTOR: define dso_local noalias i32* @f3(i32* %arg) +; FIXME: missing nonnull. It should be nonnull @f3(i32* nonnull readonly %arg) +; ATTRIBUTOR: define dso_local noalias i32* @f3(i32* readonly %arg) bb: -; FIXME: missing nonnull. It should be @f1(i32* nonnull %arg) -; ATTRIBUTOR: %tmp = call i32* @f1(i32* %arg) +; FIXME: missing nonnull. It should be @f1(i32* nonnull readonly %arg) +; ATTRIBUTOR: %tmp = call i32* @f1(i32* readonly %arg) %tmp = call i32* @f1(i32* %arg) ret i32* null } @@ -402,7 +402,7 @@ declare i32 @esfp(...) 
define i1 @parent8(i8* %a, i8* %bogus1, i8* %b) personality i8* bitcast (i32 (...)* @esfp to i8*){ ; FNATTR-LABEL: @parent8(i8* nonnull %a, i8* nocapture readnone %bogus1, i8* nonnull %b) ; FIXME : missing "nonnull", it should be @parent8(i8* nonnull %a, i8* %bogus1, i8* nonnull %b) -; ATTRIBUTOR-LABEL: @parent8(i8* %a, i8* nocapture %bogus1, i8* %b) +; ATTRIBUTOR-LABEL: @parent8(i8* %a, i8* nocapture readnone %bogus1, i8* %b) ; BOTH-NEXT: entry: ; FNATTR-NEXT: invoke void @use2nonnull(i8* %a, i8* %b) ; ATTRIBUTOR-NEXT: invoke void @use2nonnull(i8* nonnull %a, i8* nonnull %b) @@ -458,7 +458,7 @@ define i32* @g1() { ret i32* %c } -; ATTRIBUTOR: define internal void @called_by_weak(i32* nocapture nonnull %a) +; ATTRIBUTOR: define internal void @called_by_weak(i32* nocapture nonnull readnone %a) define internal void @called_by_weak(i32* %a) { ret void } Modified: llvm/trunk/test/Transforms/FunctionAttrs/norecurse.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/FunctionAttrs/norecurse.ll?rev=373965&r1=373964&r2=373965&view=diff ============================================================================== --- llvm/trunk/test/Transforms/FunctionAttrs/norecurse.ll (original) +++ llvm/trunk/test/Transforms/FunctionAttrs/norecurse.ll Mon Oct 7 14:07:57 2019 @@ -4,14 +4,14 @@ ; CHECK: Function Attrs ; CHECK-SAME: norecurse nounwind readnone -; ATTRIBUTOR: Function Attrs: nofree norecurse nosync nounwind willreturn +; ATTRIBUTOR: Function Attrs: nofree norecurse nosync nounwind readnone willreturn ; BOTH-NEXT: define i32 @leaf() define i32 @leaf() { ret i32 1 } ; BOTH: Function Attrs -; CHECK-SAME: readnone +; BOTH-SAME: readnone ; BOTH-NOT: norecurse ; BOTH-NEXT: define i32 @self_rec() define i32 @self_rec() { @@ -20,7 +20,7 @@ define i32 @self_rec() { } ; BOTH: Function Attrs -; CHECK-SAME: readnone +; BOTH-SAME: readnone ; BOTH-NOT: norecurse ; BOTH-NEXT: define i32 @indirect_rec() define i32 @indirect_rec() { @@ -28,7 +28,7 @@ define i32 
@indirect_rec() { ret i32 %a } ; BOTH: Function Attrs -; CHECK-SAME: readnone +; BOTH-SAME: readnone ; BOTH-NOT: norecurse ; BOTH-NEXT: define i32 @indirect_rec2() define i32 @indirect_rec2() { @@ -37,7 +37,7 @@ define i32 @indirect_rec2() { } ; BOTH: Function Attrs -; CHECK-SAME: readnone +; BOTH-SAME: readnone ; BOTH-NOT: norecurse ; BOTH-NEXT: define i32 @extern() define i32 @extern() { @@ -53,7 +53,7 @@ declare i32 @k() readnone ; CHECK-SAME: nounwind ; BOTH-NOT: norecurse ; CHECK-NEXT: define void @intrinsic(i8* nocapture %dest, i8* nocapture readonly %src, i32 %len) -; ATTRIBUTOR-NEXT: define void @intrinsic(i8* nocapture %dest, i8* nocapture %src, i32 %len) +; ATTRIBUTOR-NEXT: define void @intrinsic(i8* nocapture writeonly %dest, i8* nocapture readonly %src, i32 %len) define void @intrinsic(i8* %dest, i8* %src, i32 %len) { call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 %len, i1 false) ret void @@ -66,7 +66,7 @@ declare void @llvm.memcpy.p0i8.p0i8.i32( ; BOTH: Function Attrs ; CHECK-SAME: norecurse readnone ; FIXME: missing "norecurse" -; ATTRIBUTOR-SAME: nosync +; ATTRIBUTOR-SAME: nosync readnone ; CHECK-NEXT: define internal i32 @called_by_norecurse() define internal i32 @called_by_norecurse() { %a = call i32 @k() @@ -138,7 +138,7 @@ define i32 @eval_func(i32 (i32)* , i32) declare void @unknown() ; Call an unknown function in a dead block. 
-; ATTRIBUTOR: Function Attrs: nofree norecurse nosync nounwind willreturn +; ATTRIBUTOR: Function Attrs: nofree norecurse nosync nounwind readnone willreturn ; ATTRIBUTOR: define i32 @call_unknown_in_dead_block() define i32 @call_unknown_in_dead_block() local_unnamed_addr { ret i32 0 Modified: llvm/trunk/test/Transforms/FunctionAttrs/nosync.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/FunctionAttrs/nosync.ll?rev=373965&r1=373964&r2=373965&view=diff ============================================================================== --- llvm/trunk/test/Transforms/FunctionAttrs/nosync.ll (original) +++ llvm/trunk/test/Transforms/FunctionAttrs/nosync.ll Mon Oct 7 14:07:57 2019 @@ -28,7 +28,7 @@ target datalayout = "e-m:e-i64:64-f80:12 ; FNATTR: Function Attrs: norecurse nounwind optsize readnone ssp uwtable ; FNATTR-NEXT: define nonnull i32* @foo(%struct.ST* readnone %s) ; ATTRIBUTOR: Function Attrs: nofree nosync nounwind optsize readnone ssp uwtable -; ATTRIBUTOR-NEXT: define nonnull i32* @foo(%struct.ST* "no-capture-maybe-returned" %s) +; ATTRIBUTOR-NEXT: define nonnull i32* @foo(%struct.ST* readnone "no-capture-maybe-returned" %s) define i32* @foo(%struct.ST* %s) nounwind uwtable readnone optsize ssp { entry: %arrayidx = getelementptr inbounds %struct.ST, %struct.ST* %s, i64 1, i32 2, i32 1, i64 5, i64 13 @@ -61,7 +61,7 @@ define i32 @load_monotonic(i32* nocaptur ; FNATTR: Function Attrs: nofree norecurse nounwind uwtable ; FNATTR-NEXT: define void @store_monotonic(i32* nocapture %0) ; ATTRIBUTOR: Function Attrs: nofree norecurse nosync nounwind uwtable -; ATTRIBUTOR-NEXT: define void @store_monotonic(i32* nocapture %0) +; ATTRIBUTOR-NEXT: define void @store_monotonic(i32* nocapture writeonly %0) define void @store_monotonic(i32* nocapture %0) norecurse nounwind uwtable { store atomic i32 10, i32* %0 monotonic, align 4 ret void @@ -94,7 +94,7 @@ define i32 @load_acquire(i32* nocapture ; FNATTR-NEXT: define void @load_release(i32* nocapture %0) 
; ATTRIBUTOR: Function Attrs: nofree norecurse nounwind uwtable ; ATTRIBUTOR-NOT: nosync -; ATTRIBUTOR-NEXT: define void @load_release(i32* nocapture %0) +; ATTRIBUTOR-NEXT: define void @load_release(i32* nocapture writeonly %0) define void @load_release(i32* nocapture %0) norecurse nounwind uwtable { store atomic volatile i32 10, i32* %0 release, align 4 ret void @@ -106,7 +106,7 @@ define void @load_release(i32* nocapture ; FNATTR-NEXT: define void @load_volatile_release(i32* nocapture %0) ; ATTRIBUTOR: Function Attrs: nofree norecurse nounwind uwtable ; ATTRIBUTOR-NOT: nosync -; ATTRIBUTOR-NEXT: define void @load_volatile_release(i32* nocapture %0) +; ATTRIBUTOR-NEXT: define void @load_volatile_release(i32* nocapture writeonly %0) define void @load_volatile_release(i32* nocapture %0) norecurse nounwind uwtable { store atomic volatile i32 10, i32* %0 release, align 4 ret void @@ -185,8 +185,8 @@ define void @call_might_sync() nounwind ; FNATTR: Function Attrs: nofree noinline nounwind uwtable ; FNATTR-NEXT: define i32 @scc1(i32* %0) -; ATTRIBUTOR: Function Attrs: nofree noinline noreturn nosync nounwind uwtable -; ATTRIBUTOR-NEXT: define i32 @scc1(i32* nocapture %0) +; ATTRIBUTOR: Function Attrs: nofree noinline noreturn nosync nounwind readnone uwtable +; ATTRIBUTOR-NEXT: define i32 @scc1(i32* nocapture readnone %0) define i32 @scc1(i32* %0) noinline nounwind uwtable { tail call void @scc2(i32* %0); %val = tail call i32 @volatile_load(i32* %0); @@ -195,8 +195,8 @@ define i32 @scc1(i32* %0) noinline nounw ; FNATTR: Function Attrs: nofree noinline nounwind uwtable ; FNATTR-NEXT: define void @scc2(i32* %0) -; ATTRIBUTOR: Function Attrs: nofree noinline noreturn nosync nounwind uwtable -; ATTRIBUTOR-NEXT: define void @scc2(i32* nocapture %0) +; ATTRIBUTOR: Function Attrs: nofree noinline noreturn nosync nounwind readnone uwtable +; ATTRIBUTOR-NEXT: define void @scc2(i32* nocapture readnone %0) define void @scc2(i32* %0) noinline nounwind uwtable { tail call i32 
@scc1(i32* %0); ret void; @@ -224,7 +224,7 @@ define void @scc2(i32* %0) noinline noun ; FNATTR: Function Attrs: nofree norecurse nounwind ; FNATTR-NEXT: define void @foo1(i32* nocapture %0, %"struct.std::atomic"* nocapture %1) ; ATTRIBUTOR-NOT: nosync -; ATTRIBUTOR: define void @foo1(i32* nocapture %0, %"struct.std::atomic"* nocapture %1) +; ATTRIBUTOR: define void @foo1(i32* nocapture writeonly %0, %"struct.std::atomic"* nocapture writeonly %1) define void @foo1(i32* %0, %"struct.std::atomic"* %1) { store i32 100, i32* %0, align 4 fence release @@ -236,7 +236,7 @@ define void @foo1(i32* %0, %"struct.std: ; FNATTR: Function Attrs: nofree norecurse nounwind ; FNATTR-NEXT: define void @bar(i32* nocapture readnone %0, %"struct.std::atomic"* nocapture readonly %1) ; ATTRIBUTOR-NOT: nosync -; ATTRIBUTOR: define void @bar(i32* nocapture %0, %"struct.std::atomic"* nocapture %1) +; ATTRIBUTOR: define void @bar(i32* nocapture readnone %0, %"struct.std::atomic"* nocapture readonly %1) define void @bar(i32* %0, %"struct.std::atomic"* %1) { %3 = getelementptr inbounds %"struct.std::atomic", %"struct.std::atomic"* %1, i64 0, i32 0, i32 0 br label %4 @@ -256,7 +256,7 @@ define void @bar(i32* %0, %"struct.std:: ; FNATTR: Function Attrs: nofree norecurse nounwind ; FNATTR-NEXT: define void @foo1_singlethread(i32* nocapture %0, %"struct.std::atomic"* nocapture %1) ; ATTRIBUTOR: Function Attrs: nofree nosync -; ATTRIBUTOR: define void @foo1_singlethread(i32* nocapture %0, %"struct.std::atomic"* nocapture %1) +; ATTRIBUTOR: define void @foo1_singlethread(i32* nocapture writeonly %0, %"struct.std::atomic"* nocapture writeonly %1) define void @foo1_singlethread(i32* %0, %"struct.std::atomic"* %1) { store i32 100, i32* %0, align 4 fence syncscope("singlethread") release @@ -268,7 +268,7 @@ define void @foo1_singlethread(i32* %0, ; FNATTR: Function Attrs: nofree norecurse nounwind ; FNATTR-NEXT: define void @bar_singlethread(i32* nocapture readnone %0, %"struct.std::atomic"* nocapture 
readonly %1) ; ATTRIBUTOR: Function Attrs: nofree nosync -; ATTRIBUTOR: define void @bar_singlethread(i32* nocapture %0, %"struct.std::atomic"* nocapture %1) +; ATTRIBUTOR: define void @bar_singlethread(i32* nocapture readnone %0, %"struct.std::atomic"* nocapture readonly %1) define void @bar_singlethread(i32* %0, %"struct.std::atomic"* %1) { %3 = getelementptr inbounds %"struct.std::atomic", %"struct.std::atomic"* %1, i64 0, i32 0, i32 0 br label %4 @@ -293,7 +293,7 @@ declare void @llvm.memset(i8* %dest, i8 ; ; ATTRIBUTOR: Function Attrs: nounwind ; ATTRIBUTOR-NOT: nosync -; ATTRIBUTOR-NEXT: define i32 @memcpy_volatile(i8* nocapture %ptr1, i8* nocapture %ptr2) +; ATTRIBUTOR-NEXT: define i32 @memcpy_volatile(i8* nocapture writeonly %ptr1, i8* nocapture readonly %ptr2) define i32 @memcpy_volatile(i8* %ptr1, i8* %ptr2) { call void @llvm.memcpy(i8* %ptr1, i8* %ptr2, i32 8, i1 1) ret i32 4 @@ -304,7 +304,7 @@ define i32 @memcpy_volatile(i8* %ptr1, i ; It is odd to add nocapture but a result of the llvm.memset nocapture. 
; ; ATTRIBUTOR: Function Attrs: nosync -; ATTRIBUTOR-NEXT: define i32 @memset_non_volatile(i8* nocapture %ptr1, i8 %val) +; ATTRIBUTOR-NEXT: define i32 @memset_non_volatile(i8* nocapture writeonly %ptr1, i8 %val) define i32 @memset_non_volatile(i8* %ptr1, i8 %val) { call void @llvm.memset(i8* %ptr1, i8 %val, i32 8, i1 0) ret i32 4 Modified: llvm/trunk/test/Transforms/FunctionAttrs/read_write_returned_arguments_scc.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/FunctionAttrs/read_write_returned_arguments_scc.ll?rev=373965&r1=373964&r2=373965&view=diff ============================================================================== --- llvm/trunk/test/Transforms/FunctionAttrs/read_write_returned_arguments_scc.ll (original) +++ llvm/trunk/test/Transforms/FunctionAttrs/read_write_returned_arguments_scc.ll Mon Oct 7 14:07:57 2019 @@ -102,7 +102,7 @@ return: } ; CHECK: Function Attrs: nofree norecurse nosync nounwind -; CHECK-NEXT: define i32* @external_sink_ret2_nrw(i32* readnone %n0, i32* nocapture readonly %r0, i32* returned "no-capture-maybe-returned" %w0) +; CHECK-NEXT: define i32* @external_sink_ret2_nrw(i32* readnone %n0, i32* nocapture readonly %r0, i32* returned writeonly "no-capture-maybe-returned" %w0) define i32* @external_sink_ret2_nrw(i32* %n0, i32* %r0, i32* %w0) { entry: %tobool = icmp ne i32* %n0, null Modified: llvm/trunk/test/Transforms/FunctionAttrs/readattrs.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/FunctionAttrs/readattrs.ll?rev=373965&r1=373964&r2=373965&view=diff ============================================================================== --- llvm/trunk/test/Transforms/FunctionAttrs/readattrs.ll (original) +++ llvm/trunk/test/Transforms/FunctionAttrs/readattrs.ll Mon Oct 7 14:07:57 2019 @@ -1,19 +1,24 @@ -; RUN: opt < %s -functionattrs -S | FileCheck %s -; RUN: opt < %s -aa-pipeline=basic-aa -passes='cgscc(function-attrs)' -S | FileCheck %s +; RUN: opt < %s -functionattrs -S | FileCheck %s 
--check-prefixes=CHECK,FNATTR +; RUN: opt < %s -aa-pipeline=basic-aa -passes='cgscc(function-attrs)' -S | FileCheck %s --check-prefixes=CHECK,FNATTR +; RUN: opt < %s -attributor -attributor-disable=false -S | FileCheck %s --check-prefixes=CHECK,ATTRIBUTOR +; RUN: opt < %s -aa-pipeline=basic-aa -passes='attributor' -attributor-disable=false -S | FileCheck %s --check-prefixes=CHECK,ATTRIBUTOR + @x = global i32 0 declare void @test1_1(i8* %x1_1, i8* readonly %y1_1, ...) ; NOTE: readonly for %y1_2 would be OK here but not for the similar situation in test13. ; -; CHECK: define void @test1_2(i8* %x1_2, i8* readonly %y1_2, i8* %z1_2) +; FNATTR: define void @test1_2(i8* %x1_2, i8* readonly %y1_2, i8* %z1_2) +; ATTRIBUTOR: define void @test1_2(i8* %x1_2, i8* %y1_2, i8* %z1_2) define void @test1_2(i8* %x1_2, i8* %y1_2, i8* %z1_2) { call void (i8*, i8*, ...) @test1_1(i8* %x1_2, i8* %y1_2, i8* %z1_2) store i32 0, i32* @x ret void } -; CHECK: define i8* @test2(i8* readnone returned %p) +; FNATTR: define i8* @test2(i8* readnone returned %p) +; ATTRIBUTOR: define i8* @test2(i8* readnone returned %p) define i8* @test2(i8* %p) { store i32 0, i32* @x ret i8* %p @@ -33,7 +38,8 @@ define void @test4_2(i8* %p) { ret void } -; CHECK: define void @test5(i8** nocapture %p, i8* %q) +; FNATTR: define void @test5(i8** nocapture %p, i8* %q) +; ATTRIBUTOR: define void @test5(i8** nocapture writeonly %p, i8* %q) ; Missed optz'n: we could make %q readnone, but don't break test6! define void @test5(i8** %p, i8* %q) { store i8* %q, i8** %p @@ -41,7 +47,8 @@ define void @test5(i8** %p, i8* %q) { } declare void @test6_1() -; CHECK: define void @test6_2(i8** nocapture %p, i8* %q) +; FNATTR: define void @test6_2(i8** nocapture %p, i8* %q) +; ATTRIBUTOR: define void @test6_2(i8** nocapture writeonly %p, i8* %q) ; This is not a missed optz'n. 
define void @test6_2(i8** %p, i8* %q) { store i8* %q, i8** %p @@ -49,19 +56,22 @@ define void @test6_2(i8** %p, i8* %q) { ret void } -; CHECK: define void @test7_1(i32* inalloca nocapture %a) +; FNATTR: define void @test7_1(i32* inalloca nocapture %a) +; ATTRIBUTOR: define void @test7_1(i32* inalloca nocapture writeonly %a) ; inalloca parameters are always considered written define void @test7_1(i32* inalloca %a) { ret void } -; CHECK: define i32* @test8_1(i32* readnone returned %p) +; FNATTR: define i32* @test8_1(i32* readnone returned %p) +; ATTRIBUTOR: define i32* @test8_1(i32* readnone returned %p) define i32* @test8_1(i32* %p) { entry: ret i32* %p } -; CHECK: define void @test8_2(i32* %p) +; FNATTR: define void @test8_2(i32* %p) +; ATTRIBUTOR: define void @test8_2(i32* nocapture writeonly %p) define void @test8_2(i32* %p) { entry: %call = call i32* @test8_1(i32* %p) @@ -115,18 +125,21 @@ define i32 @volatile_load(i32* %p) { ret i32 %load } -declare void @escape_readonly_ptr(i8** %addr, i8* readnone %ptr) -declare void @escape_readnone_ptr(i8** %addr, i8* readonly %ptr) +declare void @escape_readnone_ptr(i8** %addr, i8* readnone %ptr) +declare void @escape_readonly_ptr(i8** %addr, i8* readonly %ptr) ; The argument pointer %escaped_then_written cannot be marked readnone/only even ; though the only direct use, in @escape_readnone_ptr/@escape_readonly_ptr, ; is marked as readnone/only. However, the functions can write the pointer into ; %addr, causing the store to write to %escaped_then_written. ; -; FIXME: This test currently exposes a bug! +; FIXME: This test currently exposes a bug in functionattrs! 
+; +; FNATTR: define void @unsound_readnone(i8* nocapture readnone %ignored, i8* readnone %escaped_then_written) +; FNATTR: define void @unsound_readonly(i8* nocapture readnone %ignored, i8* readonly %escaped_then_written) ; -; BUG: define void @unsound_readnone(i8* %ignored, i8* readnone %escaped_then_written) -; BUG: define void @unsound_readonly(i8* %ignored, i8* readonly %escaped_then_written) +; ATTRIBUTOR: define void @unsound_readnone(i8* nocapture readnone %ignored, i8* %escaped_then_written) +; ATTRIBUTOR: define void @unsound_readonly(i8* nocapture readnone %ignored, i8* %escaped_then_written) define void @unsound_readnone(i8* %ignored, i8* %escaped_then_written) { %addr = alloca i8* call void @escape_readnone_ptr(i8** %addr, i8* %escaped_then_written) Modified: llvm/trunk/test/Transforms/FunctionAttrs/willreturn.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/FunctionAttrs/willreturn.ll?rev=373965&r1=373964&r2=373965&view=diff ============================================================================== --- llvm/trunk/test/Transforms/FunctionAttrs/willreturn.ll (original) +++ llvm/trunk/test/Transforms/FunctionAttrs/willreturn.ll Mon Oct 7 14:07:57 2019 @@ -11,7 +11,7 @@ target datalayout = "e-m:e-i64:64-f80:12 ; TEST 1 (positive case) ; FNATTR: Function Attrs: noinline norecurse nounwind readnone uwtable ; FNATTR-NEXT: define void @only_return() -; ATTRIBUTOR: Function Attrs: nofree noinline norecurse nosync nounwind uwtable willreturn +; ATTRIBUTOR: Function Attrs: nofree noinline norecurse nosync nounwind readnone uwtable willreturn ; ATTRIBUTOR-NEXT: define void @only_return() define void @only_return() #0 { ret void @@ -28,7 +28,7 @@ define void @only_return() #0 { ; FNATTR: Function Attrs: noinline nounwind readnone uwtable ; FNATTR-NEXT: define i32 @fib(i32 %0) ; FIXME: missing willreturn -; ATTRIBUTOR: Function Attrs: nofree noinline nosync nounwind uwtable +; ATTRIBUTOR: Function Attrs: nofree noinline nosync nounwind 
readnone uwtable ; ATTRIBUTOR-NEXT: define i32 @fib(i32 %0) local_unnamed_addr define i32 @fib(i32 %0) local_unnamed_addr #0 { %2 = icmp slt i32 %0, 2 @@ -59,7 +59,7 @@ define i32 @fib(i32 %0) local_unnamed_ad ; FNATTR: Function Attrs: noinline norecurse nounwind readnone uwtable ; FNATTR-NOT: willreturn ; FNATTR-NEXT: define i32 @fact_maybe_not_halt(i32 %0) local_unnamed_addr -; ATTRIBUTOR: Function Attrs: nofree noinline norecurse nosync nounwind uwtable +; ATTRIBUTOR: Function Attrs: nofree noinline norecurse nosync nounwind readnone uwtable ; ATTRIBUTOR-NOT: willreturn ; ATTRIBUTOR-NEXT: define i32 @fact_maybe_not_halt(i32 %0) local_unnamed_addr define i32 @fact_maybe_not_halt(i32 %0) local_unnamed_addr #0 { @@ -95,7 +95,7 @@ define i32 @fact_maybe_not_halt(i32 %0) ; FIXME: missing willreturn ; FNATTR: Function Attrs: noinline norecurse nounwind readnone uwtable ; FNATTR-NEXT: define i32 @fact_loop(i32 %0) -; ATTRIBUTOR: Function Attrs: nofree noinline norecurse nosync nounwind uwtable +; ATTRIBUTOR: Function Attrs: nofree noinline norecurse nosync nounwind readnone uwtable ; ATTRIBUTOR-NEXT: define i32 @fact_loop(i32 %0) local_unnamed_addr define i32 @fact_loop(i32 %0) local_unnamed_addr #0 { %2 = icmp slt i32 %0, 1 @@ -126,7 +126,7 @@ define i32 @fact_loop(i32 %0) local_unna ; FNATTR: Function Attrs: noinline nounwind readnone uwtable ; FNATTR-NOT: willreturn ; FNATTR-NEXT: define void @mutual_recursion1(i1 %c) -; ATTRIBUTOR: Function Attrs: nofree noinline nosync nounwind uwtable +; ATTRIBUTOR: Function Attrs: nofree noinline nosync nounwind readnone uwtable ; ATTRIBUTOR-NOT: willreturn ; ATTRIBUTOR-NEXT: define void @mutual_recursion1(i1 %c) define void @mutual_recursion1(i1 %c) #0 { @@ -142,7 +142,7 @@ end: ; FNATTR: Function Attrs: noinline nounwind readnone uwtable ; FNATTR-NOT: willreturn ; FNATTR-NEXT: define void @mutual_recursion2(i1 %c) -; ATTRIBUTOR: Function Attrs: nofree noinline nosync nounwind uwtable +; ATTRIBUTOR: Function Attrs: nofree 
noinline nosync nounwind readnone uwtable ; ATTRIBUTOR-NOT: willreturn ; ATTRIBUTOR-NEXT: define void @mutual_recursion2(i1 %c) define void @mutual_recursion2(i1 %c) #0 { @@ -216,10 +216,10 @@ define void @conditional_exit(i32 %0, i3 ; ATTRIBUTOR-NEXT: declare float @llvm.floor.f32(float) declare float @llvm.floor.f32(float) -; FNATTRS: Function Attrs: noinline nounwind uwtable +; FNATTRS: Function Attrs: noinline nounwind readnone uwtable ; FNATTRS-NEXT: define void @call_floor(float %a) ; FIXME: missing willreturn -; ATTRIBUTOR: Function Attrs: noinline nosync nounwind uwtable +; ATTRIBUTOR: Function Attrs: noinline nosync nounwind readnone uwtable ; ATTRIBUTOR-NEXT: define void @call_floor(float %a) define void @call_floor(float %a) #0 { tail call float @llvm.floor.f32(float %a) @@ -337,7 +337,7 @@ declare i32 @__gxx_personality_v0(...) ; FIXME: missing willreturn ; FNATTR: Function Attrs: noinline norecurse nounwind readonly uwtable ; FNATTR-NEXT: define i32 @loop_constant_trip_count(i32* nocapture readonly %0) -; ATTRIBUTOR: Function Attrs: nofree noinline norecurse nosync nounwind uwtable +; ATTRIBUTOR: Function Attrs: nofree noinline norecurse nosync nounwind readonly uwtable ; ATTRIBUTOR-NEXT: define i32 @loop_constant_trip_count(i32* nocapture readonly %0) define i32 @loop_constant_trip_count(i32* nocapture readonly %0) #0 { br label %3 @@ -370,7 +370,7 @@ define i32 @loop_constant_trip_count(i32 ; FNATTR: Function Attrs: noinline norecurse nounwind readonly uwtable ; FNATTR-NOT: willreturn ; FNATTR-NEXT: define i32 @loop_trip_count_unbound(i32 %0, i32 %1, i32* nocapture readonly %2, i32 %3) local_unnamed_addr -; ATTRIBUTOR: Function Attrs: nofree noinline norecurse nosync nounwind uwtable +; ATTRIBUTOR: Function Attrs: nofree noinline norecurse nosync nounwind readonly uwtable ; ATTRIBUTOR-NOT: willreturn ; ATTRIBUTOR-NEXT: define i32 @loop_trip_count_unbound(i32 %0, i32 %1, i32* nocapture readonly %2, i32 %3) local_unnamed_addr define i32 
@loop_trip_count_unbound(i32 %0, i32 %1, i32* nocapture readonly %2, i32 %3) local_unnamed_addr #0 { @@ -408,7 +408,7 @@ define i32 @loop_trip_count_unbound(i32 ; FIXME: missing willreturn ; FNATTR: Function Attrs: noinline norecurse nounwind readonly uwtable ; FNATTR-NEXT: define i32 @loop_trip_dec(i32 %0, i32* nocapture readonly %1) -; ATTRIBUTOR: Function Attrs: nofree noinline norecurse nosync nounwind uwtable +; ATTRIBUTOR: Function Attrs: nofree noinline norecurse nosync nounwind readonly uwtable ; ATTRIBUTOR-NEXT: define i32 @loop_trip_dec(i32 %0, i32* nocapture readonly %1) local_unnamed_addr define i32 @loop_trip_dec(i32 %0, i32* nocapture readonly %1) local_unnamed_addr #0 { @@ -439,7 +439,7 @@ define i32 @loop_trip_dec(i32 %0, i32* n ; FNATTR: Function Attrs: noinline norecurse nounwind readnone uwtable ; FNATTR-NEXT: define i32 @multiple_return(i32 %a) -; ATTRIBUTOR: Function Attrs: nofree noinline norecurse nosync nounwind uwtable willreturn +; ATTRIBUTOR: Function Attrs: nofree noinline norecurse nosync nounwind readnone uwtable willreturn ; ATTRIBUTOR-NEXT: define i32 @multiple_return(i32 %a) define i32 @multiple_return(i32 %a) #0 { %b = icmp eq i32 %a, 0 @@ -471,7 +471,7 @@ unreachable_label: ; FIXME: missing willreturn ; FNATTR: Function Attrs: noinline nounwind uwtable ; FNATTR-NEXT: define i32 @unreachable_exit_positive2(i32 %0) -; ATTRIBUTOR: Function Attrs: nofree noinline norecurse nosync nounwind uwtable +; ATTRIBUTOR: Function Attrs: nofree noinline norecurse nosync nounwind readnone uwtable ; ATTRIBUTOR-NEXT: define i32 @unreachable_exit_positive2(i32 %0) define i32 @unreachable_exit_positive2(i32) local_unnamed_addr #0 { %2 = icmp slt i32 %0, 1 @@ -515,7 +515,7 @@ unreachable_label: ; FNATTR: Function Attrs: noinline nounwind uwtable ; FNATTR-NOT: willreturn ; FNATTR-NEXT: define void @unreachable_exit_negative2() -; ATTRIBUTOR: Function Attrs: nofree noinline norecurse noreturn nosync nounwind uwtable +; ATTRIBUTOR: Function Attrs: nofree 
noinline norecurse noreturn nosync nounwind readnone uwtable
 ; ATTRIBUTOR-NOT: willreturn
 ; ATTRIBUTOR-NEXT: define void @unreachable_exit_negative2()
 define void @unreachable_exit_negative2() #0 {


From llvm-commits at lists.llvm.org Mon Oct 7 14:05:55 2019
From: llvm-commits at lists.llvm.org (Thomas Preud'homme via Phabricator via llvm-commits)
Date: Mon, 07 Oct 2019 21:05:55 +0000 (UTC)
Subject: [PATCH] D67882: [LNT] Python 3 support: remove useless var-setting getter
In-Reply-To:
References:
Message-ID: <24e7d2cf8ab3f1e220838993b2ca07b5@localhost.localdomain>

thopre added a comment.

In D67882#1697713, @hubert.reinterpretcast wrote:

> In D67882#1697616, @thopre wrote:
>
> > I think we should make LNT Python 3 only as soon as it can work in that mode, so the less compatibility code we add the better. Do you agree with that approach?
>
>
> I agree with the approach of minimizing compatibility code. I am not sure about proactively dropping Python 2 support. As you said, the amount of commits going into LNT is low, so keeping Python 2 compatibility once the Python 3 mode works won't cost much development effort.


It's not so much the development effort as the risk of introducing Python 2- or Python 3-specific code without noticing. Since LNT is an application, and so has no reverse dependencies to worry about, and Python 2 reaches EOL in 2 months, I feel a migration would let us quickly remove the added compatibility code without much downside. Either way it doesn't affect the amount of effort needed to migrate, so I'm fine with whatever is decided.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D67882/new/

https://reviews.llvm.org/D67882



From llvm-commits at lists.llvm.org Mon Oct 7 14:06:19 2019
From: llvm-commits at lists.llvm.org (Aditya Kumar via Phabricator via llvm-commits)
Date: Mon, 07 Oct 2019 21:06:19 +0000 (UTC)
Subject: [PATCH] D68579: [HardwareLoops] Optimisation remarks
In-Reply-To:
References:
Message-ID:

hiraditya added inline comments.

================
Comment at: llvm/lib/CodeGen/HardwareLoops.cpp:83
+  dbgs() << "HWLoops: " << DebugMsg;
+  if (I != nullptr)
+    dbgs() << " " << *I;
----------------
`if (I)`

================
Comment at: llvm/lib/CodeGen/HardwareLoops.cpp:84
+  if (I != nullptr)
+    dbgs() << " " << *I;
+  else
----------------
nit, ' '

================
Comment at: llvm/lib/CodeGen/HardwareLoops.cpp:254

-    if (TTI->isHardwareLoopProfitable(L, *SE, *AC, LibInfo, HWLoopInfo) ||
-        ForceHardwareLoops) {
-
-      // Allow overriding of the counter width and loop decrement value.
-      if (CounterBitWidth.getNumOccurrences())
-        HWLoopInfo.CountType =
-          IntegerType::get(M->getContext(), CounterBitWidth);
+    if (!TTI->isHardwareLoopProfitable(L, *SE, *AC, LibInfo, HWLoopInfo) &&
+        !ForceHardwareLoops) {
----------------
Checking `ForceHardwareLoops` should be cheaper, so we can check that first.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68579/new/

https://reviews.llvm.org/D68579



From llvm-commits at lists.llvm.org Mon Oct 7 14:10:33 2019
From: llvm-commits at lists.llvm.org (Florian Hahn via Phabricator via llvm-commits)
Date: Mon, 07 Oct 2019 21:10:33 +0000 (UTC)
Subject: [PATCH] D68194: [LCSSA] Forget values we create LCSSA phis for
In-Reply-To:
References:
Message-ID: <8fb306956f6c6d64e775b8f873bbbb82@localhost.localdomain>

fhahn updated this revision to Diff 223651.
fhahn added a comment.
Herald added a subscriber: zzheng.

Add test.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68194/new/

https://reviews.llvm.org/D68194

Files:
  llvm/include/llvm/Transforms/Utils/LoopUtils.h
  llvm/lib/Transforms/Utils/LCSSA.cpp
  llvm/test/Transforms/LoopUnroll/unroll-preserve-scev-lcssa.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68194.223651.patch
Type: text/x-patch
Size: 7531 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Mon Oct 7 14:10:34 2019
From: llvm-commits at lists.llvm.org (Thomas Preud'homme via Phabricator via llvm-commits)
Date: Mon, 07 Oct 2019 21:10:34 +0000 (UTC)
Subject: [PATCH] D68472: [test] Depend on C.UTF-8 dependency for mri-utf8.test
In-Reply-To:
References:
Message-ID: <1e4e823fa35eea95bdd8232576f40f94@localhost.localdomain>

thopre added a comment.

In D68472#1697691, @hubert.reinterpretcast wrote:

> Thanks for adding the BOM. With the BOM, would it make sense to leave `mri-utf8.test` as the name of the file?


I think the test file name should reflect what is being tested, since that is the test identifier (i.e. when a test fails, lit prints the relative file path), so the fact that the file is encoded in UTF-8 is irrelevant. Here the test is about llvm-ar handling non-ASCII filenames, as the first comment explains. Naming the test after how the .txt file is encoded would make a bit more sense, but then, as I mentioned, AFAIK the filename is encoded in UTF-16 on Windows anyway. In summary, I think the renaming is warranted.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68472/new/

https://reviews.llvm.org/D68472



From llvm-commits at lists.llvm.org Mon Oct 7 14:11:16 2019
From: llvm-commits at lists.llvm.org (Florian Hahn via Phabricator via llvm-commits)
Date: Mon, 07 Oct 2019 21:11:16 +0000 (UTC)
Subject: [PATCH] D68592: [SCEV] Add stricter verification option.
Message-ID:

fhahn created this revision.
fhahn added reviewers: efriedma, sanjoy.google, reames, atrick.
Herald added subscribers: javed.absar, hiraditya.
Herald added a project: LLVM.

Currently -verify-scev only fails if there is a constant difference between
two BE counts. This misses a lot of cases.

This patch adds a -verify-scev-strict option, which fails for any non-zero
difference when used together with -verify-scev.

With the stricter checking, some unit tests fail because of mis-matches,
especially around IndVarSimplify.

If there is no reason I am missing for just checking constant deltas, I am planning on looking into the various failures.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D68592

Files:
  llvm/lib/Analysis/ScalarEvolution.cpp


Index: llvm/lib/Analysis/ScalarEvolution.cpp
===================================================================
--- llvm/lib/Analysis/ScalarEvolution.cpp
+++ llvm/lib/Analysis/ScalarEvolution.cpp
@@ -158,6 +158,9 @@
 static cl::opt<bool> VerifySCEV(
     "verify-scev", cl::Hidden,
     cl::desc("Verify ScalarEvolution's backedge taken counts (slow)"));
+static cl::opt<bool> VerifySCEVStrict(
+    "verify-scev-strict", cl::Hidden,
+    cl::desc("Enable stricter verification with -verify-scev is passed"));
 static cl::opt<bool>
     VerifySCEVMap("verify-scev-maps", cl::Hidden,
                   cl::desc("Verify no dangling value in ScalarEvolution's "
@@ -11926,14 +11929,14 @@
           SE.getTypeSizeInBits(NewBECount->getType()))
         CurBECount = SE2.getZeroExtendExpr(CurBECount, NewBECount->getType());
 
-      auto *ConstantDelta =
-          dyn_cast<SCEVConstant>(SE2.getMinusSCEV(CurBECount, NewBECount));
+      const SCEV *Delta = SE2.getMinusSCEV(CurBECount, NewBECount);
 
-      if (ConstantDelta && ConstantDelta->getAPInt() != 0) {
-        dbgs() << "Trip Count Changed!\n";
+      // Unless VerifySCEVStrict is set, we only compare constant deltas.
+      if ((VerifySCEVStrict || isa<SCEVConstant>(Delta)) && !Delta->isZero()) {
+        dbgs() << "Trip Count for " << *L << " Changed!\n";
         dbgs() << "Old: " << *CurBECount << "\n";
         dbgs() << "New: " << *NewBECount << "\n";
-        dbgs() << "Delta: " << *ConstantDelta << "\n";
+        dbgs() << "Delta: " << *Delta << "\n";
         std::abort();
       }
     }


-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68592.223653.patch Type: text/x-patch Size: 1541 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 14:11:21 2019 From: llvm-commits at lists.llvm.org (Xiangling Liao via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 21:11:21 +0000 (UTC) Subject: [PATCH] D68341: [AIX] TOC pseudo expansion for 64bit large + 64bit small + 32bit large modes In-Reply-To: References: Message-ID: <9dc024446af032146b6c50abc37556c8@localhost.localdomain> Xiangling_L updated this revision to Diff 223652. Xiangling_L marked 3 inline comments as done. Xiangling_L added a comment. Move variable 'SRE' into assertion & rename function name to 'getMCSymbolForTOCPseudoMO' Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68341/new/ https://reviews.llvm.org/D68341 Files: llvm/include/llvm/MC/MCExpr.h llvm/lib/MC/MCExpr.cpp llvm/lib/Target/PowerPC/MCTargetDesc/PPCInstPrinter.cpp llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp llvm/lib/Target/PowerPC/PPCInstrInfo.cpp llvm/lib/Target/PowerPC/PPCInstrInfo.td llvm/test/CodeGen/PowerPC/lower-globaladdr32-aix-asm.ll llvm/test/CodeGen/PowerPC/lower-globaladdr64-aix-asm.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D68341.223652.patch Type: text/x-patch Size: 24423 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 14:14:23 2019 From: llvm-commits at lists.llvm.org (Alexander Shaposhnikov via llvm-commits) Date: Mon, 07 Oct 2019 21:14:23 -0000 Subject: [llvm] r373966 - [llvm-lipo] Relax the check of the specified input file architecture Message-ID: <20191007211423.04E898D690@lists.llvm.org> Author: alexshap Date: Mon Oct 7 14:14:22 2019 New Revision: 373966 URL: http://llvm.org/viewvc/llvm-project?rev=373966&view=rev Log: [llvm-lipo] Relax the check of the specified input file architecture cctools lipo only compares the cputypes when it verifies that the specified (via -arch) input file and the architecture match. 
This diff adjusts the behavior of llvm-lipo accordingly. Differential revision: https://reviews.llvm.org/D68319 Test plan: make check-all Modified: llvm/trunk/tools/llvm-lipo/llvm-lipo.cpp Modified: llvm/trunk/tools/llvm-lipo/llvm-lipo.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-lipo/llvm-lipo.cpp?rev=373966&r1=373965&r2=373966&view=diff ============================================================================== --- llvm/trunk/tools/llvm-lipo/llvm-lipo.cpp (original) +++ llvm/trunk/tools/llvm-lipo/llvm-lipo.cpp Mon Oct 7 14:14:22 2019 @@ -23,6 +23,7 @@ #include "llvm/Support/FileOutputBuffer.h" #include "llvm/Support/InitLLVM.h" #include "llvm/Support/WithColor.h" +#include "llvm/TextAPI/MachO/Architecture.h" using namespace llvm; using namespace llvm::object; @@ -438,14 +439,19 @@ readInputBinaries(ArrayRef In if (!B->isArchive() && !B->isMachO() && !B->isMachOUniversalBinary()) reportError("File " + IF.FileName + " has unsupported binary format"); if (IF.ArchType && (B->isMachO() || B->isArchive())) { - const auto ArchType = - B->isMachO() ? Slice(cast(B)).getArchString() - : Slice(cast(B)).getArchString(); - if (Triple(*IF.ArchType).getArch() != Triple(ArchType).getArch()) + const auto S = B->isMachO() ? Slice(cast(B)) + : Slice(cast(B)); + const auto SpecifiedCPUType = + MachO::getCPUTypeFromArchitecture( + MachO::mapToArchitecture(Triple(*IF.ArchType))) + .first; + // For compatibility with cctools' lipo the comparison is relaxed just to + // checking cputypes. 
+ if (S.getCPUType() != SpecifiedCPUType) reportError("specified architecture: " + *IF.ArchType + " for file: " + B->getFileName() + - " does not match the file's architecture (" + ArchType + - ")"); + " does not match the file's architecture (" + + S.getArchString() + ")"); } InputBinaries.push_back(std::move(*BinaryOrErr)); } From llvm-commits at lists.llvm.org Mon Oct 7 14:14:45 2019 From: llvm-commits at lists.llvm.org (Heejin Ahn via llvm-commits) Date: Mon, 07 Oct 2019 21:14:45 -0000 Subject: [llvm] r373967 - [WebAssembly] Add memory intrinsics handling to mayThrow() Message-ID: <20191007211445.906168D67C@lists.llvm.org> Author: aheejin Date: Mon Oct 7 14:14:45 2019 New Revision: 373967 URL: http://llvm.org/viewvc/llvm-project?rev=373967&view=rev Log: [WebAssembly] Add memory intrinsics handling to mayThrow() Summary: Previously, `WebAssembly::mayThrow()` assumed all inputs are global addresses. But when intrinsics, such as `memcpy`, `memmove`, or `memset` are lowered to external symbols in instruction selection and later emitted as library calls. And these functions don't throw. This patch adds handling to those memory intrinsics to `mayThrow` function. But while most of libcalls don't throw, we can't guarantee all of them don't throw, so currently we conservatively return true for all other external symbols. I think a better way to solve this problem is to embed 'nounwind' info in `TargetLowering::CallLoweringInfo`, so that we can access the info from the backend. This will also enable transferring 'nounwind' properties of LLVM IR instructions. Currently we don't transfer that info and we can only access properties of callee functions, if the callees are within the module. Other targets don't need this info in the backend because they do all the processing before isel, but it will help us because that info will reduce code size increase in fixing unwind destination mismatches in CFGStackify. 
But for now we return false for these memory intrinsics and true for all other libcalls conservatively. Reviewers: dschuff Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68553 Modified: llvm/trunk/lib/Target/WebAssembly/WebAssemblyUtilities.cpp llvm/trunk/test/CodeGen/WebAssembly/cfg-stackify-eh.ll Modified: llvm/trunk/lib/Target/WebAssembly/WebAssemblyUtilities.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/WebAssembly/WebAssemblyUtilities.cpp?rev=373967&r1=373966&r2=373967&view=diff ============================================================================== --- llvm/trunk/lib/Target/WebAssembly/WebAssemblyUtilities.cpp (original) +++ llvm/trunk/lib/Target/WebAssembly/WebAssemblyUtilities.cpp Mon Oct 7 14:14:45 2019 @@ -50,7 +50,21 @@ bool WebAssembly::mayThrow(const Machine return false; const MachineOperand &MO = MI.getOperand(getCalleeOpNo(MI.getOpcode())); - assert(MO.isGlobal()); + assert(MO.isGlobal() || MO.isSymbol()); + + if (MO.isSymbol()) { + // Some intrinsics are lowered to calls to external symbols, which are then + // lowered to calls to library functions. Most of libcalls don't throw, but + // we only list some of them here now. + // TODO Consider adding 'nounwind' info in TargetLowering::CallLoweringInfo + // instead for more accurate info. 
+ const char *Name = MO.getSymbolName(); + if (strcmp(Name, "memcpy") == 0 || strcmp(Name, "memmove") == 0 || + strcmp(Name, "memset") == 0) + return false; + return true; + } + const auto *F = dyn_cast(MO.getGlobal()); if (!F) return true; Modified: llvm/trunk/test/CodeGen/WebAssembly/cfg-stackify-eh.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/WebAssembly/cfg-stackify-eh.ll?rev=373967&r1=373966&r2=373967&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/WebAssembly/cfg-stackify-eh.ll (original) +++ llvm/trunk/test/CodeGen/WebAssembly/cfg-stackify-eh.ll Mon Oct 7 14:14:45 2019 @@ -664,11 +664,51 @@ if.end: ret void } +%class.Object = type { i8 } + +; Intrinsics like memcpy, memmove, and memset don't throw and are lowered into +; calls to external symbols (not global addresses) in instruction selection, +; which will be eventually lowered to library function calls. +; Because this test runs with -wasm-disable-ehpad-sort, these library calls in +; invoke.cont BB fall within try~end_try, but they shouldn't cause crashes or +; unwinding destination mismatches in CFGStackify. 
+ +; NOSORT-LABEL: test10 +; NOSORT: try +; NOSORT: call foo +; NOSORT: i32.call {{.*}} memcpy +; NOSORT: i32.call {{.*}} memmove +; NOSORT: i32.call {{.*}} memset +; NOSORT: return +; NOSORT: catch +; NOSORT: rethrow +; NOSORT: end_try +define void @test10(i8* %a, i8* %b) personality i8* bitcast (i32 (...)* @__gxx_wasm_personality_v0 to i8*) { +entry: + %o = alloca %class.Object, align 1 + invoke void @foo() + to label %invoke.cont unwind label %ehcleanup + +invoke.cont: ; preds = %entry + call void @llvm.memcpy.p0i8.p0i8.i32(i8* %a, i8* %b, i32 100, i1 false) + call void @llvm.memmove.p0i8.p0i8.i32(i8* %a, i8* %b, i32 100, i1 false) + call void @llvm.memset.p0i8.i32(i8* %a, i8 0, i32 100, i1 false) + %call = call %class.Object* @_ZN6ObjectD2Ev(%class.Object* %o) #1 + ret void + +ehcleanup: ; preds = %entry + %0 = cleanuppad within none [] + %call2 = call %class.Object* @_ZN6ObjectD2Ev(%class.Object* %o) #1 [ "funclet"(token %0) ] + cleanupret from %0 unwind to caller +} + declare void @foo() declare void @bar() declare i32 @baz() ; Function Attrs: nounwind declare void @nothrow(i32) #0 +; Function Attrs: nounwind +declare %class.Object* @_ZN6ObjectD2Ev(%class.Object* returned) #0 declare i32 @__gxx_wasm_personality_v0(...) 
declare i8* @llvm.wasm.get.exception(token) declare i32 @llvm.wasm.get.ehselector(token) @@ -678,5 +718,11 @@ declare i8* @__cxa_begin_catch(i8*) declare void @__cxa_end_catch() declare void @__clang_call_terminate(i8*) declare void @_ZSt9terminatev() +; Function Attrs: nounwind +declare void @llvm.memcpy.p0i8.p0i8.i32(i8* noalias nocapture writeonly, i8* noalias nocapture readonly, i32, i1 immarg) #0 +; Function Attrs: nounwind +declare void @llvm.memmove.p0i8.p0i8.i32(i8* nocapture, i8* nocapture readonly, i32, i1 immarg) #0 +; Function Attrs: nounwind +declare void @llvm.memset.p0i8.i32(i8* nocapture writeonly, i8, i32, i1 immarg) #0 attributes #0 = { nounwind } From llvm-commits at lists.llvm.org Mon Oct 7 14:14:15 2019 From: llvm-commits at lists.llvm.org (Guozhi Wei via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 21:14:15 +0000 (UTC) Subject: [PATCH] D68593: [NewPM] Add an SROA pass after loop unroll Message-ID: Carrot created this revision. Carrot added a reviewer: chandlerc. Herald added subscribers: llvm-commits, dexonsmith, steven_wu, hiraditya, mehdi_amini. Herald added a project: LLVM. In tensorflow library we found llvm generates redundant memory accesses to local array. It can also be demonstrated by following test case #include constexpr int size=4; void f(int *a,int * b) { float tmp[size]; for(int i =0;i From llvm-commits at lists.llvm.org Mon Oct 7 14:16:30 2019 From: llvm-commits at lists.llvm.org (Sanjoy Das (Work Account) via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 21:16:30 +0000 (UTC) Subject: [PATCH] D68592: [SCEV] Add stricter verification option. In-Reply-To: References: Message-ID: <0c497854840b9bb69e49052c75142e7d@localhost.localdomain> sanjoy.google accepted this revision. sanjoy.google added a comment. This revision is now accepted and ready to land. lgtm Last time I looked at this I concluded it would be difficult to make enable something like this consistently, but no harm in trying. 


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68592/new/

https://reviews.llvm.org/D68592



From llvm-commits at lists.llvm.org Mon Oct 7 14:18:04 2019
From: llvm-commits at lists.llvm.org (Jan Korous via Phabricator via llvm-commits)
Date: Mon, 07 Oct 2019 21:18:04 +0000 (UTC)
Subject: [PATCH] D68093: [clang-scan-deps][static analyzer] Support for clang --analyze in scan-deps
In-Reply-To:
References:
Message-ID: <7dc6e7cfd79451b90b7464c573237720@localhost.localdomain>

jkorous marked an inline comment as done.
jkorous added inline comments.

================
Comment at: clang/include/clang/Driver/CC1Options.td:849
   HelpText<"include a detailed record of preprocessing actions">;
+def setup_static_analyzer : Flag<["-"], "setup-static-analyzer">,
+  HelpText<"Set up preprocessor for static analyzer (done automatically when static analyzer is run).">;
----------------
hiraditya wrote:
> The name doesn't quite reflect what it does. `setup-pp-for-analyzer`?
I'm open to suggestions.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68093/new/

https://reviews.llvm.org/D68093



From llvm-commits at lists.llvm.org Mon Oct 7 14:21:44 2019
From: llvm-commits at lists.llvm.org (Sanjoy Das (Work Account) via Phabricator via llvm-commits)
Date: Mon, 07 Oct 2019 21:21:44 +0000 (UTC)
Subject: [PATCH] D68194: [LCSSA] Forget values we create LCSSA phis for
In-Reply-To:
References:
Message-ID: <6659d6374f790957f15a91606105aa05@localhost.localdomain>

sanjoy.google added a comment.

If the SCEV expression is mathematically correct I think this is a problem with SCEV expander. If it is expected to preserve LCSSA it should add the necessary PHIs.

I know this contradicts existing code in SCEV that tries not to break LCSSA, I'm wondering if we should just remove those in favor of teaching SCEV expander about preserving LCSSA.

================
Comment at: llvm/test/Transforms/LoopUnroll/unroll-preserve-scev-lcssa.ll:76
+bb3:                                              ; preds = %bb9, %bb
+  br i1 undef, label %bb9, label %bb5
+
----------------
Minor thing: I'd avoid adding branches on `undef` (unless you need them to reproduce the test) because there is no guarantee on how the optimizers will optimize these. (It is also unclear whether this is UB.)


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68194/new/

https://reviews.llvm.org/D68194



From llvm-commits at lists.llvm.org Mon Oct 7 14:21:46 2019
From: llvm-commits at lists.llvm.org (Heejin Ahn via Phabricator via llvm-commits)
Date: Mon, 07 Oct 2019 21:21:46 +0000 (UTC)
Subject: [PATCH] D68594: [llvm-lipo] Add TextAPI to LINK_COMPONENTS
Message-ID:

aheejin created this revision.
aheejin added a reviewer: alexshap.
Herald added subscribers: llvm-commits, mgorny.
Herald added a project: LLVM.

D68319 uses `MachO::getCPUTypeFromArchitecture`, and without this change
builds with `-DBUILD_SHARED_LIBS=ON` fail.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D68594

Files:
  llvm/tools/llvm-lipo/CMakeLists.txt


Index: llvm/tools/llvm-lipo/CMakeLists.txt
===================================================================
--- llvm/tools/llvm-lipo/CMakeLists.txt
+++ llvm/tools/llvm-lipo/CMakeLists.txt
@@ -3,6 +3,7 @@
   Object
   Option
   Support
+  TextAPI
   )
 
 set(LLVM_TARGET_DEFINITIONS LipoOpts.td)


-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68594.223657.patch
Type: text/x-patch
Size: 296 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Mon Oct 7 14:28:50 2019
From: llvm-commits at lists.llvm.org (Keith Randall via Phabricator via llvm-commits)
Date: Mon, 07 Oct 2019 21:28:50 +0000 (UTC)
Subject: [PATCH] D68596: break commands into multiple lines
Message-ID:

randall77 created this revision.
Herald added subscribers: llvm-commits, Sanitizers, jfb, delcypher.
Herald added projects: LLVM, Sanitizers.
Repository: rCRT Compiler Runtime https://reviews.llvm.org/D68596 Files: lib/tsan/go/build.bat Index: lib/tsan/go/build.bat =================================================================== --- lib/tsan/go/build.bat +++ lib/tsan/go/build.bat @@ -1,4 +1,56 @@ -type tsan_go.cpp ..\rtl\tsan_interface_atomic.cpp ..\rtl\tsan_clock.cpp ..\rtl\tsan_flags.cpp ..\rtl\tsan_md5.cpp ..\rtl\tsan_mutex.cpp ..\rtl\tsan_report.cpp ..\rtl\tsan_rtl.cpp ..\rtl\tsan_rtl_mutex.cpp ..\rtl\tsan_rtl_report.cpp ..\rtl\tsan_rtl_thread.cpp ..\rtl\tsan_rtl_proc.cpp ..\rtl\tsan_stat.cpp ..\rtl\tsan_suppressions.cpp ..\rtl\tsan_sync.cpp ..\rtl\tsan_stack_trace.cpp ..\..\sanitizer_common\sanitizer_allocator.cpp ..\..\sanitizer_common\sanitizer_common.cpp ..\..\sanitizer_common\sanitizer_flags.cpp ..\..\sanitizer_common\sanitizer_stacktrace.cpp ..\..\sanitizer_common\sanitizer_libc.cpp ..\..\sanitizer_common\sanitizer_printf.cpp ..\..\sanitizer_common\sanitizer_suppressions.cpp ..\..\sanitizer_common\sanitizer_thread_registry.cpp ..\rtl\tsan_platform_windows.cpp ..\..\sanitizer_common\sanitizer_win.cpp ..\..\sanitizer_common\sanitizer_deadlock_detector1.cpp ..\..\sanitizer_common\sanitizer_stackdepot.cpp ..\..\sanitizer_common\sanitizer_persistent_allocator.cpp ..\..\sanitizer_common\sanitizer_flag_parser.cpp ..\..\sanitizer_common\sanitizer_symbolizer.cpp ..\..\sanitizer_common\sanitizer_termination.cpp > gotsan.cpp - -gcc -c -o race_windows_amd64.syso gotsan.cpp -I..\rtl -I..\.. 
-I..\..\sanitizer_common -I..\..\..\include -m64 -Wall -fno-exceptions -fno-rtti -DSANITIZER_GO=1 -Wno-error=attributes -Wno-attributes -Wno-format -Wno-maybe-uninitialized -DSANITIZER_DEBUG=0 -O3 -fomit-frame-pointer -std=c++11 +type ^ + tsan_go.cpp ^ + ..\rtl\tsan_interface_atomic.cpp ^ + ..\rtl\tsan_clock.cpp ^ + ..\rtl\tsan_flags.cpp ^ + ..\rtl\tsan_md5.cpp ^ + ..\rtl\tsan_mutex.cpp ^ + ..\rtl\tsan_report.cpp ^ + ..\rtl\tsan_rtl.cpp ^ + ..\rtl\tsan_rtl_mutex.cpp ^ + ..\rtl\tsan_rtl_report.cpp ^ + ..\rtl\tsan_rtl_thread.cpp ^ + ..\rtl\tsan_rtl_proc.cpp ^ + ..\rtl\tsan_stat.cpp ^ + ..\rtl\tsan_suppressions.cpp ^ + ..\rtl\tsan_sync.cpp ^ + ..\rtl\tsan_stack_trace.cpp ^ + ..\..\sanitizer_common\sanitizer_allocator.cpp ^ + ..\..\sanitizer_common\sanitizer_common.cpp ^ + ..\..\sanitizer_common\sanitizer_flags.cpp ^ + ..\..\sanitizer_common\sanitizer_stacktrace.cpp ^ + ..\..\sanitizer_common\sanitizer_libc.cpp ^ + ..\..\sanitizer_common\sanitizer_printf.cpp ^ + ..\..\sanitizer_common\sanitizer_suppressions.cpp ^ + ..\..\sanitizer_common\sanitizer_thread_registry.cpp ^ + ..\rtl\tsan_platform_windows.cpp ^ + ..\..\sanitizer_common\sanitizer_win.cpp ^ + ..\..\sanitizer_common\sanitizer_deadlock_detector1.cpp ^ + ..\..\sanitizer_common\sanitizer_stackdepot.cpp ^ + ..\..\sanitizer_common\sanitizer_persistent_allocator.cpp ^ + ..\..\sanitizer_common\sanitizer_flag_parser.cpp ^ + ..\..\sanitizer_common\sanitizer_symbolizer.cpp ^ + ..\..\sanitizer_common\sanitizer_termination.cpp ^ + > gotsan.cpp +gcc ^ + -c ^ + -o race_windows_amd64.syso ^ + gotsan.cpp ^ + -I..\rtl ^ + -I..\.. ^ + -I..\..\sanitizer_common ^ + -I..\..\..\include ^ + -m64 ^ + -Wall ^ + -fno-exceptions ^ + -fno-rtti ^ + -DSANITIZER_GO=1 ^ + -Wno-error=attributes ^ + -Wno-attributes ^ + -Wno-format ^ + -Wno-maybe-uninitialized ^ + -DSANITIZER_DEBUG=0 ^ + -O3 ^ + -fomit-frame-pointer ^ + -std=c++11 -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68596.223659.patch Type: text/x-patch Size: 3319 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 14:33:39 2019 From: llvm-commits at lists.llvm.org (Cameron McInally via llvm-commits) Date: Mon, 07 Oct 2019 21:33:39 -0000 Subject: [llvm] r373969 - [llvm-c] Add UnaryOperator to LLVM_FOR_EACH_VALUE_SUBCLASS macro Message-ID: <20191007213339.5EE5A8CF3F@lists.llvm.org> Author: mcinally Date: Mon Oct 7 14:33:39 2019 New Revision: 373969 URL: http://llvm.org/viewvc/llvm-project?rev=373969&view=rev Log: [llvm-c] Add UnaryOperator to LLVM_FOR_EACH_VALUE_SUBCLASS macro Note that we are not sure where the tests for these functions lives. This was discussed in the Phab Diff. Differential Revision: https://reviews.llvm.org/D68588 Modified: llvm/trunk/include/llvm-c/Core.h Modified: llvm/trunk/include/llvm-c/Core.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm-c/Core.h?rev=373969&r1=373968&r2=373969&view=diff ============================================================================== --- llvm/trunk/include/llvm-c/Core.h (original) +++ llvm/trunk/include/llvm-c/Core.h Mon Oct 7 14:33:39 2019 @@ -1543,6 +1543,7 @@ LLVMTypeRef LLVMX86MMXType(void); macro(GlobalVariable) \ macro(UndefValue) \ macro(Instruction) \ + macro(UnaryOperator) \ macro(BinaryOperator) \ macro(CallInst) \ macro(IntrinsicInst) \ From llvm-commits at lists.llvm.org Mon Oct 7 14:33:46 2019 From: llvm-commits at lists.llvm.org (Duncan Ogilvie via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 21:33:46 +0000 (UTC) Subject: [PATCH] D31635: [clang-format] Added ReferenceAlignmentStyle option In-Reply-To: References: Message-ID: <4b00df2749608173ce54a550c6df2e8d@localhost.localdomain> mrexodia updated this revision to Diff 223660. mrexodia edited the summary of this revision. mrexodia added a reviewer: MyDeveloperDay. 

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D31635/new/

https://reviews.llvm.org/D31635

Files:
  clang/include/clang/Format/Format.h
  clang/lib/Format/Format.cpp
  clang/lib/Format/TokenAnnotator.cpp
  clang/lib/Format/TokenAnnotator.h
  clang/unittests/Format/FormatTest.cpp

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D31635.223660.patch
Type: text/x-patch
Size: 11753 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Mon Oct 7 14:37:33 2019
From: llvm-commits at lists.llvm.org (Sid Manning via Phabricator via llvm-commits)
Date: Mon, 07 Oct 2019 21:37:33 +0000 (UTC)
Subject: [PATCH] D66542: R_HEX_B15_PCREL_X/R_HEX_B9_PCREL_X can be in shared objects
In-Reply-To:
References:
Message-ID: <68a5102b44894cc6a694ba5fd2f77880@localhost.localdomain>

sidneym updated this revision to Diff 223645.
sidneym added a comment.

Add a new test similar to riscv-plt.s. Update hexagon-shared.s to test -Bsymbolic and .hidden.


Repository:
  rLLD LLVM Linker

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D66542/new/

https://reviews.llvm.org/D66542

Files:
  lld/ELF/Arch/Hexagon.cpp
  lld/test/ELF/hexagon-plt.s
  lld/test/ELF/hexagon-shared.s

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D66542.223645.patch
Type: text/x-patch
Size: 7333 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Mon Oct 7 14:41:54 2019
From: llvm-commits at lists.llvm.org (Johannes Doerfert via Phabricator via llvm-commits)
Date: Mon, 07 Oct 2019 21:41:54 +0000 (UTC)
Subject: [PATCH] D68484: [PATCH 01/38] [noalias] LangRef: noalias intrinsics and noalias_sidechannel documentation.
In-Reply-To:
References:
Message-ID:

jdoerfert added a comment.

Thank you very much for working on this and putting all this into motion!

I started to look at this patch in isolation but with the rough idea of the approach in mind.
I did add various comments, from small wording changes to proposals on how we conceptually describe things. Given that many comments would be repeated for each intrinsic, I stopped after the first and want to see what people think. ================ Comment at: llvm/docs/LangRef.rst:16232 +The intrinsics can also be used to specify alias assumptions that are +not restrict based. + ---------------- Nit: > of `load` and `store` instructions. The "of" sounds weird to me. --- > The documentation below explains how it works, with ``restrict`` in mind. I personally dislike sentences like this and would just remove it. > The intrinsics can also be used to specify alias assumptions that are not restrict based. Arguably that is always true. The section describes the semantics of the alias stuff and how that can be used to model `restrict`. It is implied that other things can be modeled as well. ================ Comment at: llvm/docs/LangRef.rst:16249 + +Note: ``XXX`` is the encoding of the return type and the types of the arguments. + ---------------- I think you mix the "templated" definition (`XXX`) with instantiations (`i8**`, `%struct.FOO*`, ...). I would prefer we pick either. Precedent says you replace `XXX` with the types of that instantiation. ================ Comment at: llvm/docs/LangRef.rst:16275 +allows optimization passes to adapt the normal pointer path, without impacting +the knowledge about the *depends on* relationship. + ---------------- > introduces alias assumptions plural vs singular > in the normal computation path this isn't a "known" term for me (see below) > of a pointer and it will be opaque for most optimizations this is the hope but it is questionable if it is true and why it is here I would replace the first sentence with: "The ``llvm.noalias`` intrinsic attaches alias assumptions to its first argument." --- The whole pass thing and splitting comes too early (IMHO).
I don't know yet what these intrinsics mean but I learn that they are transformed. That said, `llvm.side.noalias` is not described here. ================ Comment at: llvm/docs/LangRef.rst:16281 +the loop is unrolled. Otherwise the restrict scope could spill across +iterations. + ---------------- Nit: remove "is used to", just "identifies" Nit: remove "exact" (what does it mean given that we actually move stuff around under the normal "as if" rules) It's not "done inside" and loops are only one example of this. What you want to say is more general: "Whenever a `llvm.noalias.decl` intrinsic is duplicated through code transformations, care must be taken to duplicate and uniquify the scopes and intrinsics. These steps are described in the following." To be honest, I'm not sure if it makes sense to say something like that here already. ================ Comment at: llvm/docs/LangRef.rst:16286 +or returned from a function. After inlining, the correct dependencies can still +be propagated. + ---------------- Nit: stray "this" in 16284 The inlining sentence does not really clear up anything here, partially because we don't know what is happening. ================ Comment at: llvm/docs/LangRef.rst:16290 +pointer points to a blob of memory that contains restrict pointers. This allows +to track the *based on* dependency when copying such blocks of memory. + ---------------- remove "a blob of" ================ Comment at: llvm/docs/LangRef.rst:16301 + Different ``%p.addr`` represent different restrict pointers, pointing to + disjunct objects. +- ``p.objId``: a number that can be used to differentiate different *object P* ---------------- > Either a real object, a constant where the value is relative to 0 or ``null``. There is a word missing and 0 is `null`. ================ Comment at: llvm/docs/LangRef.rst:16303 +- ``p.objId``: a number that can be used to differentiate different *object P* + when ``%p.addr`` is optimized away.
+- ``!p.scope``: metadata argument that refers to a list of alias.scope metadata ---------------- This seems odd, why introduce two things that do the same thing. ================ Comment at: llvm/docs/LangRef.rst:16306 + entries that contains exactly one element. It represents the variable + declaration that contains one or more restrict pointers. +- ``%p.decl``: points to the ``@llvm.noalias.decl`` intrinsic associated with ---------------- "entries with a single element each." > It represents the variable declaration that contains one or more restrict pointers. I do not understand this sentence. ================ Comment at: llvm/docs/LangRef.rst:16310 +- ``%p.alloca``: points to the alloca associated with the declaration of a + restrict variable +- ``%side.p``: the noalias_sidechannel associated with ``%p``. ---------------- For both items above: No need for "points to". > a restrict variable. Maybe more specific: "the restrict pointer `%p`." ================ Comment at: llvm/docs/LangRef.rst:16314 +- ``%p.block``: the address of a block in memory or on the stack is associated + with a variable that contains at least one restrict pointer. +- ``!p.indices``: metadata argument that refers to a list of metadata ---------------- Maybe: "the address of an object with at least one restrict pointer constituent. ================ Comment at: llvm/docs/LangRef.rst:16317 + references. Each reference points to a metadata array of indices. At the + specified location, a restrict pointer is located. A '-1' indicates any index. + ---------------- I did not understand the above wording. 
================ Comment at: llvm/docs/LangRef.rst:16334 +The restrictness of a restrict pointer ``%p``, as given by the +``llvm.side.noalias`` intrinsic, can be expressed by following arguments: + ---------------- Nit: "by the following" ================ Comment at: llvm/docs/LangRef.rst:16338 +- ``p.objId``: an extra number +- ``!p.scope``: a declaration scope, related to the variable declaration. + ---------------- "an extra number" is not helpful " related to " -> "describing"/"identifying" ================ Comment at: llvm/docs/LangRef.rst:16353 + identical to any other tuple, unless it can be proven that their ``%p.addr`` + are disjunct. + ---------------- "related to" -> "referencing" or "scoped in" ================ Comment at: llvm/docs/LangRef.rst:16483 + + declare i8* @llvm.noalias.decl.XXX(* %p.alloca, i32 , metadata !p.scope) + ---------------- I mentioned that before, the lady of the lake says XXX is specialized here: https://llvm.org/docs/LangRef.html#llvm-memcpy-intrinsic (do we add attributes here? if so, we need attributes here, e.g., `nocapture`). ================ Comment at: llvm/docs/LangRef.rst:16492 +certain ``alloca`` is associated to an object that contains one or more +restrict pointers. + ---------------- I dislike the loop body and alloca sentence because they do not convey information. To be honest, I don't know what the second sentence is saying. What I would prefer is to say something like: "The intrinsic identifies the scope of the restrict pointer through a virtual side-effect that ensures the control dependences are preserved. This virtual side-effect will also keep allocations alive and explicit." (I guessed what you want to say wrt. alloca) (allocas and mallocs should not be treated differently anyway, we have a heap-2-stack transformation now and for the purpose of this discussion there should not be a difference anyway) ================ Comment at: llvm/docs/LangRef.rst:16496 +not really represent a value.
It is merely used to track a dependency on the +declaration. + ---------------- The above reads funny, maybe: "The returned value is a handle to track dependences on the declaration. There is no explicit relationship to the value of the arguments." Also, why do we want an `i8*` then? We have `tokens` and we have `i32`, I'd prefer either over an `i8*` which is more confusing in this context full of `i8*` that are actually pointers (IMHO). ================ Comment at: llvm/docs/LangRef.rst:16505 + +The second argument ``p.objId`` is an integer representing an object id. + ---------------- 1) I would love to remove this duplication, is that possible? 2) Why do we need to talk about `alloca` and "optimized away"? Can't we say: "The first argument `%p.alloca` points to an object in memory with one or more restrict pointer constituents or `null`." ================ Comment at: llvm/docs/LangRef.rst:16509 +metadata references. The format is identical to that required for ``noalias`` +metadata. This list must have exactly one element. + ---------------- Is it a list with one element or a list with entries that have one element each? What I read earlier sounded different from what I read here. ================ Comment at: llvm/docs/LangRef.rst:16518 +the loop is unrolled. Otherwise the restrict scope could spill across +iterations. + ---------------- Copy and paste from somewhere above. I'd avoid duplication if possible in favor of references. ================ Comment at: llvm/docs/LangRef.rst:16524 +the relationship between those intrinsics and the actual variable declaration +visible. + ---------------- The above sentence is broken somewhere (I think). Maybe make it two.
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68484/new/ https://reviews.llvm.org/D68484 From llvm-commits at lists.llvm.org Mon Oct 7 14:45:04 2019 From: llvm-commits at lists.llvm.org (Aditya Kumar via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 21:45:04 +0000 (UTC) Subject: [PATCH] D68257: [Support] Add mathematical constants In-Reply-To: References: Message-ID: <0eda2bb6235af0f8c05b5830d5d0e87e@localhost.localdomain> hiraditya accepted this revision. hiraditya added a comment. This revision is now accepted and ready to land. LGTM as the comments have been addressed. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68257/new/ https://reviews.llvm.org/D68257 From llvm-commits at lists.llvm.org Mon Oct 7 14:48:08 2019 From: llvm-commits at lists.llvm.org (Johannes Doerfert via llvm-commits) Date: Mon, 07 Oct 2019 21:48:08 -0000 Subject: [llvm] r373972 - [Attributor][FIX] Remove assertion wrong for on invalid IRPositions Message-ID: <20191007214808.EBB928D494@lists.llvm.org> Author: jdoerfert Date: Mon Oct 7 14:48:08 2019 New Revision: 373972 URL: http://llvm.org/viewvc/llvm-project?rev=373972&view=rev Log: [Attributor][FIX] Remove assertion wrong for on invalid IRPositions Modified: llvm/trunk/include/llvm/Transforms/IPO/Attributor.h Modified: llvm/trunk/include/llvm/Transforms/IPO/Attributor.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Transforms/IPO/Attributor.h?rev=373972&r1=373971&r2=373972&view=diff ============================================================================== --- llvm/trunk/include/llvm/Transforms/IPO/Attributor.h (original) +++ llvm/trunk/include/llvm/Transforms/IPO/Attributor.h Mon Oct 7 14:48:08 2019 @@ -398,8 +398,6 @@ struct IRPosition { assert(KindOrArgNo < 0 && "Expected (call site) arguments to never reach this point!"); - assert(!isa(getAnchorValue()) && - "Expected arguments to have an associated argument position!"); return Kind(KindOrArgNo); } From llvm-commits at lists.llvm.org Mon Oct 7 
14:48:54 2019 From: llvm-commits at lists.llvm.org (Bill Wendling via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 21:48:54 +0000 (UTC) Subject: [PATCH] D68598: [IA] Recognize hexadecimal escape sequences Message-ID: void created this revision. void added reviewers: nickdesaulniers, jcai19. void added a project: LLVM. Implement support for hexadecimal escape sequences to match how GNU 'as' handles them. I.e., read all hexadecimal characters and truncate to the lower 16 bits. Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D68598 Files: llvm/lib/MC/MCParser/AsmParser.cpp llvm/test/MC/AsmParser/directive_ascii.s Index: llvm/test/MC/AsmParser/directive_ascii.s =================================================================== --- llvm/test/MC/AsmParser/directive_ascii.s +++ llvm/test/MC/AsmParser/directive_ascii.s @@ -39,3 +39,8 @@ # CHECK: .byte 0 TEST6: .string "B", "C" + +# CHECK: TEST7: +# CHECK: .ascii "dk" +TEST7: + .ascii "\x64\Xa6B" Index: llvm/lib/MC/MCParser/AsmParser.cpp =================================================================== --- llvm/lib/MC/MCParser/AsmParser.cpp +++ llvm/lib/MC/MCParser/AsmParser.cpp @@ -2914,11 +2914,27 @@ } // Recognize escaped characters. Note that this escape semantics currently - // loosely follows Darwin 'as'. Notably, it doesn't support hex escapes. + // loosely follows Darwin 'as'. ++i; if (i == e) return TokError("unexpected backslash at end of string"); + // Recognize hex sequences similarly to GNU 'as'. + if (Str[i] == 'x' || Str[i] == 'X') { + size_t length = Str.size(); + if (i + 1 >= length || !isHexDigit(Str[i + 1])) + return TokError("invalid hexadecimal escape sequence"); + + // Consume hex characters. GNU 'as' reads all hexadecimal characters and + // then truncates to the lower 16 bits. Seems reasonable. 
+ unsigned Value = 0; + while (i + 1 < length && isHexDigit(Str[i + 1])) + Value = Value * 16 + hexDigitValue(Str[++i]); + + Data += (unsigned char)(Value & 0xFF); + continue; + } + // Recognize octal sequences. if ((unsigned)(Str[i] - '0') <= 7) { // Consume up to three octal characters. -------------- next part -------------- A non-text attachment was scrubbed... Name: D68598.223666.patch Type: text/x-patch Size: 1600 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 14:49:05 2019 From: llvm-commits at lists.llvm.org (Duncan Ogilvie via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 21:49:05 +0000 (UTC) Subject: [PATCH] D31635: [clang-format] Added ReferenceAlignmentStyle option In-Reply-To: References: Message-ID: mrexodia added a comment. @MyDeveloperDay I rebased and added unit tests + and ran clang-format. @klimek I just need this to be able to replace AStyle for my open source project (x64dbg). I just rebased because @STL_MSFT asked for it and I had some free time. I don't think it is actually possible to be consistent with separate options for pointers and references. See for example https://docs.microsoft.com/en-us/cpp/cpp/references-to-pointers?view=vs-2019 which has: `int Add2( BTree*& Root, char *szToAdd )` Properly supporting this formatting likely isn't going to be pretty, but anyone is free to try with this rebased patch. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D31635/new/ https://reviews.llvm.org/D31635 From llvm-commits at lists.llvm.org Mon Oct 7 14:49:19 2019 From: llvm-commits at lists.llvm.org (Bill Wendling via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 21:49:19 +0000 (UTC) Subject: [PATCH] D68483: [IA] Recognize hexadecimal escape sequences In-Reply-To: References: Message-ID: void added a comment. In D68483#1697207 , @thakis wrote: > I reverted this in r373898 since MC/AsmParser/directive_ascii.s failed on bots. 
I didn't look into it, but maybe it's because there's no bounds checking on the `i + 1` index. Doh! Asserts weren't turned on in my build tree. Fixed in D68598 . Very sorry for the breakage. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68483/new/ https://reviews.llvm.org/D68483 From llvm-commits at lists.llvm.org Mon Oct 7 14:52:38 2019 From: llvm-commits at lists.llvm.org (Alex Brachet via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 21:52:38 +0000 (UTC) Subject: [PATCH] D67867: [libc] Add few docs and implementation of strcpy and strcat. In-Reply-To: References: Message-ID: <295d68ae6a99c04a9684d968b3ba1845@localhost.localdomain> abrachet added a comment. In D67867#1692484 , @MaskRay wrote: > My earlier question is about why we need the namespace `__llvm_libc` at all. From `libc/src/string/strcat/strcat_test.cpp` I conclude it is for unit testing in an environment that already has a libc (gtest). This should probably be documented. > > Can we do things the other way round? No namespace, no `__llvm_libc_` prefix. Add the `-ffreestanding` compiler flag and just define `strstr, open, etc` in the global namespace. In unit tests, invoke `llvm-objcopy --redefine-syms=` to rename `strstr` to `__llvm_libc_strstr`, and call `__llvm_libc_strstr` in the tests. For functions that affect global states, gtest will not be suitable. It is good to think how the tests will be organized in the current early stage. In D67867#1696908 , @MaskRay wrote: > The commit was done in a hurry. For the initial commit of a brand new project that sets up the project hierarchy, this seems to have received fewer than enough thumbs up. Many points raised in the review process were just shrugged off. I don't know if it matters anymore because this was committed but I agree with @MaskRay. His suggestion of using `llvm-objcopy` to rename the symbols for tests makes much more sense to me. 
I haven't seen a libc that does testing in an ergonomic way and this suggestion seems the best to me, frankly. There is a lot going on here, it's hard to follow it all in one patch, and I think some comments got lost because of this. I feel like a lot of big design decisions were made here, did I miss something on the libc-dev mailing list? Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67867/new/ https://reviews.llvm.org/D67867 From llvm-commits at lists.llvm.org Mon Oct 7 14:54:26 2019 From: llvm-commits at lists.llvm.org (Eli Friedman via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 21:54:26 +0000 (UTC) Subject: [PATCH] D68257: [Support] Add mathematical constants In-Reply-To: References: Message-ID: efriedma requested changes to this revision. efriedma added inline comments. This revision now requires changes to proceed. ================ Comment at: llvm/include/llvm/Support/MathExtras.h:66 + inv_sqrtpi = 0.5641895835477563, // https://oeis.org/A087197 + sqrt2 = 1.414213562373095, // https://oeis.org/A002193 + inv_sqrt2 = 0.7071067811865475, ---------------- The correct value of sqrt(2) in double-precision is 1.4142135623730951. And now I don't trust any of the other values... ================ Comment at: llvm/include/llvm/Support/MathExtras.h:71 + phi = 1.618033988749895; // https://oeis.org/A001622 +constexpr float ef = 2.718282, // https://oeis.org/A001113 + egammaf = 0.5772157, // https://oeis.org/A001620 ---------------- Please mark the constants with "f", e.g. `2.718282f`, so there isn't an extra rounding step. ================ Comment at: llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp:1944 + // FIXME: The `long double` type is not fully supported by the classes + // `APFloat` and `Constant`. + Eul = ConstantFP::get(Log->getType(), numbers::e); ---------------- I'm not sure this describes the issue correctly. You can specify a long double as a string, or raw bits. 
It can't interoperate with the native long double because that might have the wrong width. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68257/new/ https://reviews.llvm.org/D68257 From llvm-commits at lists.llvm.org Mon Oct 7 14:57:24 2019 From: llvm-commits at lists.llvm.org (Dave Green via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 21:57:24 +0000 (UTC) Subject: [PATCH] D68400: [NFC][TTI] Add Alignment for isLegalMasked[Load/Store] In-Reply-To: References: Message-ID: <5030eedd4d8b6f878590fe8f9ae27b44@localhost.localdomain> dmgreen added inline comments. ================ Comment at: lib/Target/X86/X86TargetTransformInfo.h:189 + bool isLegalMaskedLoad(Type *DataType, unsigned Alignment); bool isLegalMaskedStore(Type *DataType); bool isLegalNTLoad(Type *DataType, Align Alignment); ---------------- ", unsigned Alignment" :) Also the isLegalNTLoad below uses the Align. I think they should be fairly simple to use (but haven't used them myself yet, just fixed merge conflicts). CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68400/new/ https://reviews.llvm.org/D68400 From llvm-commits at lists.llvm.org Mon Oct 7 14:58:15 2019 From: llvm-commits at lists.llvm.org (Dave Green via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 21:58:15 +0000 (UTC) Subject: [PATCH] D68530: [AArch64] Make combining of callee-save and local stack adjustment optional In-Reply-To: References: Message-ID: <37e3aac0f77d1733094389760f5ea25a@localhost.localdomain> dmgreen added a comment. My understanding is that D18619 was an optimisation, and that optimisation increases codesize in order to decrease micro-ops and improve scheduling? If so then I don't think that anyone should be relying on it, exactly. Under minsize it should be fine to just not do the optimisation. It sounds like a fairly limited gain to me, increasing instruction count to gain on micro ops and scheduling, so the codesize benefit should be the bigger win.
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68530/new/ https://reviews.llvm.org/D68530 From llvm-commits at lists.llvm.org Mon Oct 7 14:59:55 2019 From: llvm-commits at lists.llvm.org (Keith Randall via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 21:59:55 +0000 (UTC) Subject: [PATCH] D68599: fix Go windows build Message-ID: randall77 created this revision. Herald added subscribers: llvm-commits, Sanitizers, delcypher. Herald added projects: LLVM, Sanitizers. Repository: rCRT Compiler Runtime https://reviews.llvm.org/D68599 Files: lib/sanitizer_common/sanitizer_win_defs.h lib/tsan/go/build.bat Index: lib/tsan/go/build.bat =================================================================== --- lib/tsan/go/build.bat +++ lib/tsan/go/build.bat @@ -31,6 +31,9 @@ ..\..\sanitizer_common\sanitizer_flag_parser.cpp ^ ..\..\sanitizer_common\sanitizer_symbolizer.cpp ^ ..\..\sanitizer_common\sanitizer_termination.cpp ^ + ..\..\sanitizer_common\sanitizer_file.cpp ^ + ..\..\sanitizer_common\sanitizer_symbolizer_report.cpp ^ + ..\rtl\tsan_external.cpp ^ > gotsan.cpp gcc ^ @@ -46,6 +49,7 @@ -fno-exceptions ^ -fno-rtti ^ -DSANITIZER_GO=1 ^ + -DGetProcessMemoryInfo=K32GetProcessMemoryInfo ^ -Wno-error=attributes ^ -Wno-attributes ^ -Wno-format ^ Index: lib/sanitizer_common/sanitizer_win_defs.h =================================================================== --- lib/sanitizer_common/sanitizer_win_defs.h +++ lib/sanitizer_common/sanitizer_win_defs.h @@ -43,6 +43,8 @@ #define STRINGIFY_(A) #A #define STRINGIFY(A) STRINGIFY_(A) +#if !SANITIZER_GO + // ----------------- A workaround for the absence of weak symbols -------------- // We don't have a direct equivalent of weak symbols when using MSVC, but we can // use the /alternatename directive to tell the linker to default a specific @@ -158,5 +160,15 @@ // return a >= b; // } // + +#else // SANITIZER_GO + +// Go neither needs nor wants weak references. 
+// The shenanigans above don't work for gcc. +# define WIN_WEAK_EXPORT_DEF(ReturnType, Name, ...) \ + extern "C" ReturnType Name(__VA_ARGS__) + +#endif // SANITIZER_GO + #endif // SANITIZER_WINDOWS #endif // SANITIZER_WIN_DEFS_H -------------- next part -------------- A non-text attachment was scrubbed... Name: D68599.223667.patch Type: text/x-patch Size: 1628 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 15:00:25 2019 From: llvm-commits at lists.llvm.org (Evandro Menezes via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 22:00:25 +0000 (UTC) Subject: [PATCH] D68257: [Support] Add mathematical constants In-Reply-To: References: Message-ID: <7d25ca080880002d1fcb5ee6a968e72b@localhost.localdomain> evandro marked 3 inline comments as done. evandro added inline comments. ================ Comment at: llvm/include/llvm/Support/MathExtras.h:66 + inv_sqrtpi = 0.5641895835477563, // https://oeis.org/A087197 + sqrt2 = 1.414213562373095, // https://oeis.org/A002193 + inv_sqrt2 = 0.7071067811865475, ---------------- efriedma wrote: > The correct value of sqrt(2) in double-precision is 1.4142135623730951. > > And now I don't trust any of the other values... `double` has a precision of 15 or 16 significant digits. I don't understand why you are suggesting 17 significant digits when you asked to trim the precision down. Besides, the reference I provided states that this value is 1.41421356237309505. Whether it's rounded to 1.4142135623730950 or 1.4142135623730951 is a bit moot, IMO. ================ Comment at: llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp:1944 + // FIXME: The `long double` type is not fully supported by the classes + // `APFloat` and `Constant`. + Eul = ConstantFP::get(Log->getType(), numbers::e); ---------------- efriedma wrote: > I'm not sure this describes the issue correctly. You can specify a long double as a string, or raw bits. It can't interoperate with the native long double because that might have the wrong width.
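On the `sqrt2` digits question discussed earlier in this message, a small sketch that compares the two literals directly. It assumes an IEEE-754 binary64 `double`, correctly rounded literal parsing, and a correctly rounded `std::sqrt`; the constant and function names are made up for illustration and are not part of `MathExtras.h`. Under those assumptions the 16-digit literal from the patch lands on the representable value just below the one `std::sqrt(2.0)` returns, while the 17-digit literal from the review matches it.

```cpp
#include <cmath>

// Literal as written in the patch under review (16 significant digits).
constexpr double Sqrt2FromPatch = 1.414213562373095;
// Literal suggested in the review comment (17 significant digits).
constexpr double Sqrt2FromReview = 1.4142135623730951;

// True if both spellings denote the same double.
inline bool literalsAgree() { return Sqrt2FromPatch == Sqrt2FromReview; }

// True if the patch literal is the double immediately below the review one.
inline bool patchLiteralIsOneStepBelow() {
  return std::nextafter(Sqrt2FromPatch, 2.0) == Sqrt2FromReview;
}

// True if the review literal equals the library square root of 2.
inline bool reviewLiteralMatchesSqrt() {
  return std::sqrt(2.0) == Sqrt2FromReview;
}
```

Seventeen significant digits are what is needed in general to round-trip a binary64 value through decimal, which is why trimming to 15 or 16 digits can select a neighbouring double.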
What wording would you suggest, please? CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68257/new/ https://reviews.llvm.org/D68257 From llvm-commits at lists.llvm.org Mon Oct 7 15:01:07 2019 From: llvm-commits at lists.llvm.org (Keith Randall via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 22:01:07 +0000 (UTC) Subject: [PATCH] D68599: fix Go windows build In-Reply-To: References: Message-ID: <04dfeb81a5a303ae1cab076c55cf31e1@localhost.localdomain> randall77 added reviewers: dvyukov, vitalybuka. randall77 added a comment. This change replaces D68277. Repository: rCRT Compiler Runtime CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68599/new/ https://reviews.llvm.org/D68599 From llvm-commits at lists.llvm.org Mon Oct 7 15:04:37 2019 From: llvm-commits at lists.llvm.org (Eli Friedman via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 22:04:37 +0000 (UTC) Subject: [PATCH] D68194: [LCSSA] Forget values we create LCSSA phis for In-Reply-To: References: Message-ID: efriedma added a comment. > I know this contradicts existing code in SCEV that tries not to break LCSSA, I'm wondering if we should just remove those in favor of teaching SCEV expander about preserving LCSSA. Currently, SCEV deliberately doesn't look through LCSSA PHI nodes (there's code in ScalarEvolution::createNodeForPHI etc.). You're saying we should change that? It might be a good idea. I recently ran into a performance issue involving a nested loop where SCEV for the outer loop was getting confused by an LCSSA PHI for the inner loop. I'd be worried about loop passes getting confused, though; for example, if they assume all AddRecs are part of the current loop nest. That said, this patch probably isn't the right place to have that discussion.
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68194/new/ https://reviews.llvm.org/D68194 From llvm-commits at lists.llvm.org Mon Oct 7 15:05:26 2019 From: llvm-commits at lists.llvm.org (Keith Randall via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 22:05:26 +0000 (UTC) Subject: [PATCH] D68599: fix Go windows build In-Reply-To: References: Message-ID: <962c91908bbb6dcbbfe6545a9e6d1d8a@localhost.localdomain> randall77 updated this revision to Diff 223668. randall77 added a comment. Herald added a subscriber: jfb. Missing two -D options Repository: rCRT Compiler Runtime CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68599/new/ https://reviews.llvm.org/D68599 Files: lib/sanitizer_common/sanitizer_win_defs.h lib/tsan/go/build.bat Index: lib/tsan/go/build.bat =================================================================== --- lib/tsan/go/build.bat +++ lib/tsan/go/build.bat @@ -1,4 +1,61 @@ -type tsan_go.cpp ..\rtl\tsan_interface_atomic.cpp ..\rtl\tsan_clock.cpp ..\rtl\tsan_flags.cpp ..\rtl\tsan_md5.cpp ..\rtl\tsan_mutex.cpp ..\rtl\tsan_report.cpp ..\rtl\tsan_rtl.cpp ..\rtl\tsan_rtl_mutex.cpp ..\rtl\tsan_rtl_report.cpp ..\rtl\tsan_rtl_thread.cpp ..\rtl\tsan_rtl_proc.cpp ..\rtl\tsan_stat.cpp ..\rtl\tsan_suppressions.cpp ..\rtl\tsan_sync.cpp ..\rtl\tsan_stack_trace.cpp ..\..\sanitizer_common\sanitizer_allocator.cpp ..\..\sanitizer_common\sanitizer_common.cpp ..\..\sanitizer_common\sanitizer_flags.cpp ..\..\sanitizer_common\sanitizer_stacktrace.cpp ..\..\sanitizer_common\sanitizer_libc.cpp ..\..\sanitizer_common\sanitizer_printf.cpp ..\..\sanitizer_common\sanitizer_suppressions.cpp ..\..\sanitizer_common\sanitizer_thread_registry.cpp ..\rtl\tsan_platform_windows.cpp ..\..\sanitizer_common\sanitizer_win.cpp ..\..\sanitizer_common\sanitizer_deadlock_detector1.cpp ..\..\sanitizer_common\sanitizer_stackdepot.cpp ..\..\sanitizer_common\sanitizer_persistent_allocator.cpp ..\..\sanitizer_common\sanitizer_flag_parser.cpp 
..\..\sanitizer_common\sanitizer_symbolizer.cpp ..\..\sanitizer_common\sanitizer_termination.cpp > gotsan.cpp - -gcc -c -o race_windows_amd64.syso gotsan.cpp -I..\rtl -I..\.. -I..\..\sanitizer_common -I..\..\..\include -m64 -Wall -fno-exceptions -fno-rtti -DSANITIZER_GO=1 -Wno-error=attributes -Wno-attributes -Wno-format -Wno-maybe-uninitialized -DSANITIZER_DEBUG=0 -O3 -fomit-frame-pointer -std=c++11 +type ^ + tsan_go.cpp ^ + ..\rtl\tsan_interface_atomic.cpp ^ + ..\rtl\tsan_clock.cpp ^ + ..\rtl\tsan_flags.cpp ^ + ..\rtl\tsan_md5.cpp ^ + ..\rtl\tsan_mutex.cpp ^ + ..\rtl\tsan_report.cpp ^ + ..\rtl\tsan_rtl.cpp ^ + ..\rtl\tsan_rtl_mutex.cpp ^ + ..\rtl\tsan_rtl_report.cpp ^ + ..\rtl\tsan_rtl_thread.cpp ^ + ..\rtl\tsan_rtl_proc.cpp ^ + ..\rtl\tsan_stat.cpp ^ + ..\rtl\tsan_suppressions.cpp ^ + ..\rtl\tsan_sync.cpp ^ + ..\rtl\tsan_stack_trace.cpp ^ + ..\..\sanitizer_common\sanitizer_allocator.cpp ^ + ..\..\sanitizer_common\sanitizer_common.cpp ^ + ..\..\sanitizer_common\sanitizer_flags.cpp ^ + ..\..\sanitizer_common\sanitizer_stacktrace.cpp ^ + ..\..\sanitizer_common\sanitizer_libc.cpp ^ + ..\..\sanitizer_common\sanitizer_printf.cpp ^ + ..\..\sanitizer_common\sanitizer_suppressions.cpp ^ + ..\..\sanitizer_common\sanitizer_thread_registry.cpp ^ + ..\rtl\tsan_platform_windows.cpp ^ + ..\..\sanitizer_common\sanitizer_win.cpp ^ + ..\..\sanitizer_common\sanitizer_deadlock_detector1.cpp ^ + ..\..\sanitizer_common\sanitizer_stackdepot.cpp ^ + ..\..\sanitizer_common\sanitizer_persistent_allocator.cpp ^ + ..\..\sanitizer_common\sanitizer_flag_parser.cpp ^ + ..\..\sanitizer_common\sanitizer_symbolizer.cpp ^ + ..\..\sanitizer_common\sanitizer_termination.cpp ^ + ..\..\sanitizer_common\sanitizer_file.cpp ^ + ..\..\sanitizer_common\sanitizer_symbolizer_report.cpp ^ + ..\rtl\tsan_external.cpp ^ + > gotsan.cpp +gcc ^ + -c ^ + -o race_windows_amd64.syso ^ + gotsan.cpp ^ + -I..\rtl ^ + -I..\.. 
^ + -I..\..\sanitizer_common ^ + -I..\..\..\include ^ + -m64 ^ + -Wall ^ + -fno-exceptions ^ + -fno-rtti ^ + -DSANITIZER_GO=1 ^ + -DWINVER=0x0600 -D_WIN32_WINNT=0x0600 ^ + -DGetProcessMemoryInfo=K32GetProcessMemoryInfo ^ + -Wno-error=attributes ^ + -Wno-attributes ^ + -Wno-format ^ + -Wno-maybe-uninitialized ^ + -DSANITIZER_DEBUG=0 ^ + -O3 ^ + -fomit-frame-pointer ^ + -std=c++11 Index: lib/sanitizer_common/sanitizer_win_defs.h =================================================================== --- lib/sanitizer_common/sanitizer_win_defs.h +++ lib/sanitizer_common/sanitizer_win_defs.h @@ -43,6 +43,8 @@ #define STRINGIFY_(A) #A #define STRINGIFY(A) STRINGIFY_(A) +#if !SANITIZER_GO + // ----------------- A workaround for the absence of weak symbols -------------- // We don't have a direct equivalent of weak symbols when using MSVC, but we can // use the /alternatename directive to tell the linker to default a specific @@ -158,5 +160,15 @@ // return a >= b; // } // + +#else // SANITIZER_GO + +// Go neither needs nor wants weak references. +// The shenanigans above don't work for gcc. +# define WIN_WEAK_EXPORT_DEF(ReturnType, Name, ...) \ + extern "C" ReturnType Name(__VA_ARGS__) + +#endif // SANITIZER_GO + #endif // SANITIZER_WINDOWS #endif // SANITIZER_WIN_DEFS_H -------------- next part -------------- A non-text attachment was scrubbed... Name: D68599.223668.patch Type: text/x-patch Size: 4496 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 15:06:32 2019 From: llvm-commits at lists.llvm.org (Heejin Ahn via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 22:06:32 +0000 (UTC) Subject: [PATCH] D68594: [llvm-lipo] Add TextAPI to LINK_COMPONENTS In-Reply-To: References: Message-ID: <098355d5e93699087b3e8261436eeeda@localhost.localdomain> aheejin added a comment. Merging this, because this crashes builds. 
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68594/new/ https://reviews.llvm.org/D68594 From llvm-commits at lists.llvm.org Mon Oct 7 15:06:41 2019 From: llvm-commits at lists.llvm.org (James Y Knight via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 22:06:41 +0000 (UTC) Subject: [PATCH] D67867: [libc] Add few docs and implementation of strcpy and strcat. In-Reply-To: References: Message-ID: <11d2d7462acfe3f931959a6eb6b8c31f@localhost.localdomain> jyknight added a comment. In D67867#1698273, @abrachet wrote: > In D67867#1696908, @MaskRay wrote: > > > The commit was done in a hurry. For the initial commit of a brand new project that sets up the project hierarchy, this seems to have received fewer than enough thumbs up. Many points raised in the review process were just shrugged off. > > > I don't know if it matters anymore because this was committed but I agree with @MaskRay. His suggestion of using `llvm-objcopy` to rename the symbols for tests makes much more sense to me. I haven't seen a libc that does testing in an ergonomic way and this suggestion seems the best to me, frankly. > > There is a lot going on here, it's hard to follow it all in one patch, and I think some comments got lost because of this. I feel like a lot of big design decisions were made here, did I miss something on the libc-dev mailing list? FWIW -- llvm project policy is that both pre and post-commit reviews must be addressed. That this is now committed does not change anything w.r.t. needing to respond to outstanding comments.
Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67867/new/ https://reviews.llvm.org/D67867 From llvm-commits at lists.llvm.org Mon Oct 7 15:07:10 2019 From: llvm-commits at lists.llvm.org (Matt Arsenault via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 22:07:10 +0000 (UTC) Subject: [PATCH] D68600: AMDGPU/GlobalISel: Fix crash on wide constant load with VGPR pointer Message-ID: arsenm created this revision. arsenm added reviewers: tstellar, nhaehnle, kerbowa. Herald added subscribers: Petar.Avramovic, t-tye, tpr, dstuttard, rovka, yaxunl, wdng, jvesely, kzhuravl. This was ignoring the register bank of the input pointer, and isUniformMMO seems overly aggressive. This will now conservatively assume a VGPR in cases where the incoming bank hasn't been determined yet (i.e. is from a loop phi). https://reviews.llvm.org/D68600 Files: lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp test/CodeGen/AMDGPU/GlobalISel/legalize-extract.mir test/CodeGen/AMDGPU/GlobalISel/regbankselect-load.mir -------------- next part -------------- A non-text attachment was scrubbed... Name: D68600.223669.patch Type: text/x-patch Size: 4604 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 15:07:15 2019 From: llvm-commits at lists.llvm.org (Keith Randall via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 22:07:15 +0000 (UTC) Subject: [PATCH] D68599: fix Go windows build In-Reply-To: References: Message-ID: <2b26259e9772e1f6d751ba8174c1a37b@localhost.localdomain> randall77 updated this revision to Diff 223670. randall77 added a comment. 
rebasing attempt 1 Repository: rCRT Compiler Runtime CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68599/new/ https://reviews.llvm.org/D68599 Files: lib/sanitizer_common/sanitizer_win_defs.h lib/tsan/go/build.bat Index: lib/tsan/go/build.bat =================================================================== --- lib/tsan/go/build.bat +++ lib/tsan/go/build.bat @@ -31,6 +31,9 @@ ..\..\sanitizer_common\sanitizer_flag_parser.cpp ^ ..\..\sanitizer_common\sanitizer_symbolizer.cpp ^ ..\..\sanitizer_common\sanitizer_termination.cpp ^ + ..\..\sanitizer_common\sanitizer_file.cpp ^ + ..\..\sanitizer_common\sanitizer_symbolizer_report.cpp ^ + ..\rtl\tsan_external.cpp ^ > gotsan.cpp gcc ^ @@ -46,6 +49,8 @@ -fno-exceptions ^ -fno-rtti ^ -DSANITIZER_GO=1 ^ + -DWINVER=0x0600 -D_WIN32_WINNT=0x0600 ^ + -DGetProcessMemoryInfo=K32GetProcessMemoryInfo ^ -Wno-error=attributes ^ -Wno-attributes ^ -Wno-format ^ Index: lib/sanitizer_common/sanitizer_win_defs.h =================================================================== --- lib/sanitizer_common/sanitizer_win_defs.h +++ lib/sanitizer_common/sanitizer_win_defs.h @@ -43,6 +43,8 @@ #define STRINGIFY_(A) #A #define STRINGIFY(A) STRINGIFY_(A) +#if !SANITIZER_GO + // ----------------- A workaround for the absence of weak symbols -------------- // We don't have a direct equivalent of weak symbols when using MSVC, but we can // use the /alternatename directive to tell the linker to default a specific @@ -158,5 +160,15 @@ // return a >= b; // } // + +#else // SANITIZER_GO + +// Go neither needs nor wants weak references. +// The shenanigans above don't work for gcc. +# define WIN_WEAK_EXPORT_DEF(ReturnType, Name, ...) \ + extern "C" ReturnType Name(__VA_ARGS__) + +#endif // SANITIZER_GO + #endif // SANITIZER_WINDOWS #endif // SANITIZER_WIN_DEFS_H -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68599.223670.patch Type: text/x-patch Size: 1671 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 15:08:05 2019 From: llvm-commits at lists.llvm.org (Evandro Menezes via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 22:08:05 +0000 (UTC) Subject: [PATCH] D68257: [Support] Add mathematical constants In-Reply-To: References: Message-ID: evandro updated this revision to Diff 223671. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68257/new/ https://reviews.llvm.org/D68257 Files: llvm/include/llvm/Support/MathExtras.h llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D68257.223671.patch Type: text/x-patch Size: 5349 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 15:11:30 2019 From: llvm-commits at lists.llvm.org (Heejin Ahn via llvm-commits) Date: Mon, 07 Oct 2019 22:11:30 -0000 Subject: [llvm] r373974 - [llvm-lipo] Add TextAPI to LINK_COMPONENTS Message-ID: <20191007221130.B01DE874F9@lists.llvm.org> Author: aheejin Date: Mon Oct 7 15:11:30 2019 New Revision: 373974 URL: http://llvm.org/viewvc/llvm-project?rev=373974&view=rev Log: [llvm-lipo] Add TextAPI to LINK_COMPONENTS Summary: D68319 uses `MachO::getCPUTypeFromArchitecture` and without this builds with `-DBUILD_SHARED_LIBS=ON` fail. 
Reviewers: alexshap Subscribers: mgorny, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68594 Modified: llvm/trunk/tools/llvm-lipo/CMakeLists.txt Modified: llvm/trunk/tools/llvm-lipo/CMakeLists.txt URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-lipo/CMakeLists.txt?rev=373974&r1=373973&r2=373974&view=diff ============================================================================== --- llvm/trunk/tools/llvm-lipo/CMakeLists.txt (original) +++ llvm/trunk/tools/llvm-lipo/CMakeLists.txt Mon Oct 7 15:11:30 2019 @@ -3,6 +3,7 @@ set(LLVM_LINK_COMPONENTS Object Option Support + TextAPI ) set(LLVM_TARGET_DEFINITIONS LipoOpts.td) From llvm-commits at lists.llvm.org Mon Oct 7 15:11:12 2019 From: llvm-commits at lists.llvm.org (Vitaly Buka via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 22:11:12 +0000 (UTC) Subject: [PATCH] D68599: fix Go windows build In-Reply-To: References: Message-ID: vitalybuka accepted this revision. vitalybuka added inline comments. This revision is now accepted and ready to land. ================ Comment at: lib/tsan/go/build.bat:52 -DSANITIZER_GO=1 ^ + -DWINVER=0x0600 -D_WIN32_WINNT=0x0600 ^ + -DGetProcessMemoryInfo=K32GetProcessMemoryInfo ^ ---------------- For consistency could you split these 1-per-line? Repository: rCRT Compiler Runtime CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68599/new/ https://reviews.llvm.org/D68599 From llvm-commits at lists.llvm.org Mon Oct 7 15:13:04 2019 From: llvm-commits at lists.llvm.org (Wei Mi via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 22:13:04 +0000 (UTC) Subject: [PATCH] D68601: [SampleFDO] Add indexing for function profiles so they can be loaded on demand in ExtBinary format Message-ID: wmi created this revision. wmi added reviewers: davidxl, mtrofin. Herald added subscribers: arphaman, hiraditya. Herald added a project: LLVM. 
Currently for Text, Binary and ExtBinary format profiles, when we compile a module with samplefdo, even if there is no function showing up in the profile, we have to load all the function profiles from the profile input. That is a waste of compile time. CompactBinary format profile already supports loading function profiles on demand. In this patch, we add the indexing in ExtBinary format too. It will work whether or not the sections in an ExtBinary format profile are compressed. An experiment shows it reduces the time to compile a server benchmark by 30%. When profile remapping and loading function profiles on demand are both used, extra work needs to be done so that the loading on demand process will take the name remapping into consideration. It will be addressed in a follow-up patch. Repository: rL LLVM https://reviews.llvm.org/D68601 Files: llvm/include/llvm/ProfileData/SampleProf.h llvm/include/llvm/ProfileData/SampleProfReader.h llvm/include/llvm/ProfileData/SampleProfWriter.h llvm/lib/ProfileData/SampleProfReader.cpp llvm/lib/ProfileData/SampleProfWriter.cpp llvm/test/Transforms/SampleProfile/Inputs/inline.extbinary.afdo llvm/test/Transforms/SampleProfile/Inputs/profsampleacc.extbinary.afdo llvm/unittests/ProfileData/SampleProfTest.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D68601.223663.patch Type: text/x-patch Size: 11589 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 15:13:30 2019 From: llvm-commits at lists.llvm.org (Keith Randall via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 22:13:30 +0000 (UTC) Subject: [PATCH] D68602: Split two defines into two lines Message-ID: randall77 created this revision. Herald added subscribers: llvm-commits, Sanitizers, delcypher. Herald added projects: LLVM, Sanitizers.
Repository: rCRT Compiler Runtime https://reviews.llvm.org/D68602 Files: lib/tsan/go/build.bat Index: lib/tsan/go/build.bat =================================================================== --- lib/tsan/go/build.bat +++ lib/tsan/go/build.bat @@ -49,7 +49,8 @@ -fno-exceptions ^ -fno-rtti ^ -DSANITIZER_GO=1 ^ - -DWINVER=0x0600 -D_WIN32_WINNT=0x0600 ^ + -DWINVER=0x0600 ^ + -D_WIN32_WINNT=0x0600 ^ -DGetProcessMemoryInfo=K32GetProcessMemoryInfo ^ -Wno-error=attributes ^ -Wno-attributes ^ -------------- next part -------------- A non-text attachment was scrubbed... Name: D68602.223672.patch Type: text/x-patch Size: 416 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 15:17:13 2019 From: llvm-commits at lists.llvm.org (Nuno Lopes via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 22:17:13 +0000 (UTC) Subject: [PATCH] D29121: [Docs] Add LangRef documention for freeze instruction In-Reply-To: References: Message-ID: <19a40bbd6228dbc20b273dd8587263b1@localhost.localdomain> nlopes updated this revision to Diff 223673. nlopes added a comment. Clarify semantics for pointers. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D29121/new/ https://reviews.llvm.org/D29121 Files: llvm/docs/LangRef.rst -------------- next part -------------- A non-text attachment was scrubbed... Name: D29121.223673.patch Type: text/x-patch Size: 6414 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 15:19:40 2019 From: llvm-commits at lists.llvm.org (Heejin Ahn via llvm-commits) Date: Mon, 07 Oct 2019 22:19:40 -0000 Subject: [llvm] r373975 - [WebAssembly] Fix unwind mismatch stat computation Message-ID: <20191007221940.9A7BE8DB8B@lists.llvm.org> Author: aheejin Date: Mon Oct 7 15:19:40 2019 New Revision: 373975 URL: http://llvm.org/viewvc/llvm-project?rev=373975&view=rev Log: [WebAssembly] Fix unwind mismatch stat computation Summary: There was a bug when computing the number of unwind destination mismatches in CFGStackify. 
When there are many mismatched calls that share the same (original) destination BB, they have to be counted separately. This also fixes a typo and runs `fixUnwindMismatches` only when the wasm exception handling is enabled. This is to prevent unnecessary computations and does not change behavior. Reviewers: dschuff Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68552 Modified: llvm/trunk/lib/Target/WebAssembly/WebAssemblyCFGStackify.cpp llvm/trunk/test/CodeGen/WebAssembly/cfg-stackify-eh.ll Modified: llvm/trunk/lib/Target/WebAssembly/WebAssemblyCFGStackify.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/WebAssembly/WebAssemblyCFGStackify.cpp?rev=373975&r1=373974&r2=373975&view=diff ============================================================================== --- llvm/trunk/lib/Target/WebAssembly/WebAssemblyCFGStackify.cpp (original) +++ llvm/trunk/lib/Target/WebAssembly/WebAssemblyCFGStackify.cpp Mon Oct 7 15:19:40 2019 @@ -848,7 +848,7 @@ bool WebAssemblyCFGStackify::fixUnwindMi SmallVector EHPadStack; // Range of intructions to be wrapped in a new nested try/catch using TryRange = std::pair; - // In original CFG, + // In original CFG, DenseMap> UnwindDestToTryRanges; // In new CFG, DenseMap> BrDestToTryRanges; @@ -985,7 +985,7 @@ bool WebAssemblyCFGStackify::fixUnwindMi // ... // cont: for (auto &P : UnwindDestToTryRanges) { - NumUnwindMismatches++; + NumUnwindMismatches += P.second.size(); // This means the destination is the appendix BB, which was separately // handled above. @@ -1300,7 +1300,9 @@ void WebAssemblyCFGStackify::placeMarker } } // Fix mismatches in unwind destinations induced by linearizing the code. 
- fixUnwindMismatches(MF); + if (MCAI->getExceptionHandlingType() == ExceptionHandling::Wasm && + MF.getFunction().hasPersonalityFn()) + fixUnwindMismatches(MF); } void WebAssemblyCFGStackify::rewriteDepthImmediates(MachineFunction &MF) { Modified: llvm/trunk/test/CodeGen/WebAssembly/cfg-stackify-eh.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/WebAssembly/cfg-stackify-eh.ll?rev=373975&r1=373974&r2=373975&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/WebAssembly/cfg-stackify-eh.ll (original) +++ llvm/trunk/test/CodeGen/WebAssembly/cfg-stackify-eh.ll Mon Oct 7 15:19:40 2019 @@ -1,6 +1,7 @@ ; RUN: llc < %s -disable-wasm-fallthrough-return-opt -wasm-disable-explicit-locals -wasm-keep-registers -disable-block-placement -verify-machineinstrs -fast-isel=false -machine-sink-split-probability-threshold=0 -cgp-freq-ratio-to-skip-merge=1000 -exception-model=wasm -mattr=+exception-handling | FileCheck %s ; RUN: llc < %s -O0 -disable-wasm-fallthrough-return-opt -wasm-disable-explicit-locals -wasm-keep-registers -verify-machineinstrs -exception-model=wasm -mattr=+exception-handling | FileCheck %s --check-prefix=NOOPT ; RUN: llc < %s -disable-wasm-fallthrough-return-opt -wasm-disable-explicit-locals -wasm-keep-registers -disable-block-placement -verify-machineinstrs -fast-isel=false -machine-sink-split-probability-threshold=0 -cgp-freq-ratio-to-skip-merge=1000 -exception-model=wasm -mattr=+exception-handling -wasm-disable-ehpad-sort | FileCheck %s --check-prefix=NOSORT +; RUN: llc < %s -disable-wasm-fallthrough-return-opt -wasm-disable-explicit-locals -wasm-keep-registers -disable-block-placement -verify-machineinstrs -fast-isel=false -machine-sink-split-probability-threshold=0 -cgp-freq-ratio-to-skip-merge=1000 -exception-model=wasm -mattr=+exception-handling -wasm-disable-ehpad-sort -stats 2>&1 | FileCheck %s --check-prefix=NOSORT-STAT target datalayout = 
"e-m:e-p:32:32-i64:64-n32:64-S128" target triple = "wasm32-unknown-unknown" @@ -702,6 +703,9 @@ ehcleanup: cleanupret from %0 unwind to caller } +; Check if the unwind destination mismatch stats are correct +; NOSORT-STAT: 11 wasm-cfg-stackify - Number of EH pad unwind mismatches found + declare void @foo() declare void @bar() declare i32 @baz() From llvm-commits at lists.llvm.org Mon Oct 7 15:17:48 2019 From: llvm-commits at lists.llvm.org (Vitaly Buka via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 22:17:48 +0000 (UTC) Subject: [PATCH] D68603: [sanitizer] Print SIGTRAP for corresponding signal Message-ID: vitalybuka created this revision. vitalybuka added a reviewer: eugenis. Herald added a reviewer: jfb. Herald added projects: Sanitizers, LLVM. Herald added subscribers: llvm-commits, Sanitizers. Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D68603 Files: compiler-rt/lib/sanitizer_common/sanitizer_posix.cpp compiler-rt/test/sanitizer_common/TestCases/Linux/signal_trap.cpp Index: compiler-rt/test/sanitizer_common/TestCases/Linux/signal_trap.cpp =================================================================== --- /dev/null +++ compiler-rt/test/sanitizer_common/TestCases/Linux/signal_trap.cpp @@ -0,0 +1,8 @@ +// RUN: %clangxx -O1 %s -o %t && %env_tool_opts=handle_sigtrap=2 not %run %t 2>&1 | FileCheck %s + +int main() { + __builtin_debugtrap(); +} + +// CHECK: Sanitizer:DEADLYSIGNAL +// CHECK: Sanitizer: SIGTRAP on unknown address Index: compiler-rt/lib/sanitizer_common/sanitizer_posix.cpp =================================================================== --- compiler-rt/lib/sanitizer_common/sanitizer_posix.cpp +++ compiler-rt/lib/sanitizer_common/sanitizer_posix.cpp @@ -312,6 +312,8 @@ return "SEGV"; case SIGBUS: return "BUS"; + case SIGTRAP: + return "SIGTRAP"; } return "UNKNOWN SIGNAL"; } -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68603.223674.patch Type: text/x-patch Size: 872 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 15:18:13 2019 From: llvm-commits at lists.llvm.org (Vitaly Buka via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 22:18:13 +0000 (UTC) Subject: [PATCH] D68604: [tsan] Don't delay SIGTRAP handler Message-ID: vitalybuka created this revision. vitalybuka added a reviewer: eugenis. Herald added a reviewer: jfb. Herald added projects: Sanitizers, LLVM. Herald added subscribers: llvm-commits, Sanitizers. Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D68604 Files: compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp compiler-rt/test/sanitizer_common/TestCases/Linux/signal_trap_handler.cpp Index: compiler-rt/test/sanitizer_common/TestCases/Linux/signal_trap_handler.cpp =================================================================== --- /dev/null +++ compiler-rt/test/sanitizer_common/TestCases/Linux/signal_trap_handler.cpp @@ -0,0 +1,31 @@ +// RUN: %clangxx -O1 %s -o %t && %env_tool_opts=handle_sigtrap=2 not %run %t 2>&1 | FileCheck %s + +#include +#include +#include + +int handled; + +void handler(int signo, siginfo_t *info, void *uctx) { + handled = 1; +} + +int main() { + struct sigaction a = {}, old = {}; + a.sa_sigaction = handler; + a.sa_flags = SA_SIGINFO; + sigaction(SIGTRAP, &a, &old); + + a = {}; + sigaction(SIGTRAP, 0, &a); + assert(a.sa_sigaction == handler); + assert(a.sa_flags & SA_SIGINFO); + + __builtin_debugtrap(); + assert(handled); + fprintf(stderr, "HANDLED %d\n", handled); + + sigaction(SIGTRAP, &old, 0); +} + +// CHECK: HANDLED 1 Index: compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp =================================================================== --- compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp +++ compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp @@ -114,6 +114,7 @@ const int EPOLL_CTL_ADD = 1; #endif const int SIGILL = 4; +const int SIGTRAP = 5; const int SIGABRT = 6; const 
int SIGFPE = 8; const int SIGSEGV = 11; @@ -1962,10 +1963,10 @@ } // namespace __tsan static bool is_sync_signal(ThreadSignalContext *sctx, int sig) { - return sig == SIGSEGV || sig == SIGBUS || sig == SIGILL || - sig == SIGABRT || sig == SIGFPE || sig == SIGPIPE || sig == SIGSYS || - // If we are sending signal to ourselves, we must process it now. - (sctx && sig == sctx->int_signal_send); + return sig == SIGSEGV || sig == SIGBUS || sig == SIGILL || sig == SIGTRAP || + sig == SIGABRT || sig == SIGFPE || sig == SIGPIPE || sig == SIGSYS || + // If we are sending signal to ourselves, we must process it now. + (sctx && sig == sctx->int_signal_send); } void ALWAYS_INLINE rtl_generic_sighandler(bool sigact, int sig, -------------- next part -------------- A non-text attachment was scrubbed... Name: D68604.223675.patch Type: text/x-patch Size: 2077 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 15:20:19 2019 From: llvm-commits at lists.llvm.org (Reid Kleckner via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 22:20:19 +0000 (UTC) Subject: [PATCH] D68589: [lit] Leverage argparse features to remove some code In-Reply-To: References: Message-ID: <18d7ded85225bfeda6dc93397536c40d@localhost.localdomain> rnk accepted this revision. rnk added a comment. This revision is now accepted and ready to land. lgtm Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68589/new/ https://reviews.llvm.org/D68589 From llvm-commits at lists.llvm.org Mon Oct 7 15:20:37 2019 From: llvm-commits at lists.llvm.org (Evgenii Stepanov via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 22:20:37 +0000 (UTC) Subject: [PATCH] D68469: [AArch64] Ensure no tagged memory is left in the unallocated portion of the stack In-Reply-To: References: Message-ID: <00b2c9c8526906bd5a84b1cfc76c33f8@localhost.localdomain> eugenis added a comment. LGTM modulo the postDominates comment. 
================ Comment at: llvm/lib/Target/AArch64/AArch64StackTagging.cpp:501 + + return ABB == BBB || PDT->dominates(ABB, BBB); +} ---------------- I'm worried about the case when A and B are in the same basic block, but in the opposite order - i.e. one lifetime ends when control enters the basic block, and a new one starts before it exits. I've never seen it happen in practice, but it seems to be valid IR. I think you need to iterate over instructions here, same as DominatorTree::dominates does. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68469/new/ https://reviews.llvm.org/D68469 From llvm-commits at lists.llvm.org Mon Oct 7 15:20:43 2019 From: llvm-commits at lists.llvm.org (Vitaly Buka via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 22:20:43 +0000 (UTC) Subject: [PATCH] D68604: [tsan] Don't delay SIGTRAP handler In-Reply-To: References: Message-ID: <4df3c775cadfa33b149b4d4c8aee22f0@localhost.localdomain> vitalybuka updated this revision to Diff 223676. vitalybuka added a comment. Herald added a subscriber: dexonsmith.
remove unneeded final trap Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68604/new/ https://reviews.llvm.org/D68604 Files: compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp compiler-rt/test/sanitizer_common/TestCases/Linux/signal_trap_handler.cpp Index: compiler-rt/test/sanitizer_common/TestCases/Linux/signal_trap_handler.cpp =================================================================== --- /dev/null +++ compiler-rt/test/sanitizer_common/TestCases/Linux/signal_trap_handler.cpp @@ -0,0 +1,29 @@ +// RUN: %clangxx -O1 %s -o %t && %env_tool_opts=handle_sigtrap=2 %run %t 2>&1 | FileCheck %s + +#include +#include +#include + +int handled; + +void handler(int signo, siginfo_t *info, void *uctx) { + handled = 1; +} + +int main() { + struct sigaction a = {}, old = {}; + a.sa_sigaction = handler; + a.sa_flags = SA_SIGINFO; + sigaction(SIGTRAP, &a, &old); + + a = {}; + sigaction(SIGTRAP, 0, &a); + assert(a.sa_sigaction == handler); + assert(a.sa_flags & SA_SIGINFO); + + __builtin_debugtrap(); + assert(handled); + fprintf(stderr, "HANDLED %d\n", handled); +} + +// CHECK: HANDLED 1 Index: compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp =================================================================== --- compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp +++ compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp @@ -114,6 +114,7 @@ const int EPOLL_CTL_ADD = 1; #endif const int SIGILL = 4; +const int SIGTRAP = 5; const int SIGABRT = 6; const int SIGFPE = 8; const int SIGSEGV = 11; @@ -1962,10 +1963,10 @@ } // namespace __tsan static bool is_sync_signal(ThreadSignalContext *sctx, int sig) { - return sig == SIGSEGV || sig == SIGBUS || sig == SIGILL || - sig == SIGABRT || sig == SIGFPE || sig == SIGPIPE || sig == SIGSYS || - // If we are sending signal to ourselves, we must process it now. 
- (sctx && sig == sctx->int_signal_send); + return sig == SIGSEGV || sig == SIGBUS || sig == SIGILL || sig == SIGTRAP || + sig == SIGABRT || sig == SIGFPE || sig == SIGPIPE || sig == SIGSYS || + // If we are sending signal to ourselves, we must process it now. + (sctx && sig == sctx->int_signal_send); } void ALWAYS_INLINE rtl_generic_sighandler(bool sigact, int sig, -------------- next part -------------- A non-text attachment was scrubbed... Name: D68604.223676.patch Type: text/x-patch Size: 2039 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 15:21:05 2019 From: llvm-commits at lists.llvm.org (Evandro Menezes via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 22:21:05 +0000 (UTC) Subject: [PATCH] D68231: [SLC] Allow llvm.pow(x,2.0) -> x*x etc even if no pow() lib func In-Reply-To: References: Message-ID: evandro added inline comments. ================ Comment at: test/Transforms/InstCombine/pow-1.ll:120 ; MSVC-NEXT: [[POW:%.*]] = call <2 x double> @llvm.pow.v2f64(<2 x double> , <2 x double> [[X:%.*]]) -; MSVC-NEXT: ret <2 x double> [[POW]] ; ---------------- I don't see a good reason why this check was removed. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68231/new/ https://reviews.llvm.org/D68231 From llvm-commits at lists.llvm.org Mon Oct 7 15:22:58 2019 From: llvm-commits at lists.llvm.org (Vitaly Buka via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 22:22:58 +0000 (UTC) Subject: [PATCH] D68604: [tsan] Don't delay SIGTRAP handler In-Reply-To: References: Message-ID: vitalybuka updated this revision to Diff 223679. vitalybuka added a comment. 
fix handle_sigtrap value Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68604/new/ https://reviews.llvm.org/D68604 Files: compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp compiler-rt/test/sanitizer_common/TestCases/Linux/signal_trap_handler.cpp Index: compiler-rt/test/sanitizer_common/TestCases/Linux/signal_trap_handler.cpp =================================================================== --- /dev/null +++ compiler-rt/test/sanitizer_common/TestCases/Linux/signal_trap_handler.cpp @@ -0,0 +1,29 @@ +// RUN: %clangxx -O1 %s -o %t && %env_tool_opts=handle_sigtrap=1 %run %t 2>&1 | FileCheck %s + +#include +#include +#include + +int handled; + +void handler(int signo, siginfo_t *info, void *uctx) { + handled = 1; +} + +int main() { + struct sigaction a = {}, old = {}; + a.sa_sigaction = handler; + a.sa_flags = SA_SIGINFO; + sigaction(SIGTRAP, &a, &old); + + a = {}; + sigaction(SIGTRAP, 0, &a); + assert(a.sa_sigaction == handler); + assert(a.sa_flags & SA_SIGINFO); + + __builtin_debugtrap(); + assert(handled); + fprintf(stderr, "HANDLED %d\n", handled); +} + +// CHECK: HANDLED 1 Index: compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp =================================================================== --- compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp +++ compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp @@ -114,6 +114,7 @@ const int EPOLL_CTL_ADD = 1; #endif const int SIGILL = 4; +const int SIGTRAP = 5; const int SIGABRT = 6; const int SIGFPE = 8; const int SIGSEGV = 11; @@ -1962,10 +1963,10 @@ } // namespace __tsan static bool is_sync_signal(ThreadSignalContext *sctx, int sig) { - return sig == SIGSEGV || sig == SIGBUS || sig == SIGILL || - sig == SIGABRT || sig == SIGFPE || sig == SIGPIPE || sig == SIGSYS || - // If we are sending signal to ourselves, we must process it now. 
- (sctx && sig == sctx->int_signal_send); + return sig == SIGSEGV || sig == SIGBUS || sig == SIGILL || sig == SIGTRAP || + sig == SIGABRT || sig == SIGFPE || sig == SIGPIPE || sig == SIGSYS || + // If we are sending signal to ourselves, we must process it now. + (sctx && sig == sctx->int_signal_send); } void ALWAYS_INLINE rtl_generic_sighandler(bool sigact, int sig, -------------- next part -------------- A non-text attachment was scrubbed... Name: D68604.223679.patch Type: text/x-patch Size: 2039 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 15:23:52 2019 From: llvm-commits at lists.llvm.org (Evgenii Stepanov via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 22:23:52 +0000 (UTC) Subject: [PATCH] D68604: [tsan] Don't delay SIGTRAP handler In-Reply-To: References: Message-ID: <8f1537b6ed25294e4d48bfaef467b40a@localhost.localdomain> eugenis accepted this revision. eugenis added a comment. This revision is now accepted and ready to land. LGTM Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68604/new/ https://reviews.llvm.org/D68604 From llvm-commits at lists.llvm.org Mon Oct 7 15:28:58 2019 From: llvm-commits at lists.llvm.org (Reid Kleckner via llvm-commits) Date: Mon, 07 Oct 2019 22:28:58 -0000 Subject: [llvm] r373976 - [X86] Add new calling convention that guarantees tail call optimization Message-ID: <20191007222858.C50228DCA6@lists.llvm.org> Author: rnk Date: Mon Oct 7 15:28:58 2019 New Revision: 373976 URL: http://llvm.org/viewvc/llvm-project?rev=373976&view=rev Log: [X86] Add new calling convention that guarantees tail call optimization When the target option GuaranteedTailCallOpt is specified, calls with the fastcc calling convention will be transformed into tail calls if they are in tail position. 
This diff adds a new calling convention, tailcc, currently supported only on X86, which behaves the same way as fastcc, except that the GuaranteedTailCallOpt flag does not need to be enabled in order to enable tail call optimization. Patch by Dwight Guth! Reviewed By: lebedev.ri, paquette, rnk Differential Revision: https://reviews.llvm.org/D67855 Added: llvm/trunk/test/CodeGen/X86/musttail-tailcc.ll llvm/trunk/test/CodeGen/X86/tailcall-tailcc.ll llvm/trunk/test/CodeGen/X86/tailcc-calleesave.ll llvm/trunk/test/CodeGen/X86/tailcc-disable-tail-calls.ll llvm/trunk/test/CodeGen/X86/tailcc-fastcc.ll llvm/trunk/test/CodeGen/X86/tailcc-fastisel.ll llvm/trunk/test/CodeGen/X86/tailcc-largecode.ll llvm/trunk/test/CodeGen/X86/tailcc-stackalign.ll llvm/trunk/test/CodeGen/X86/tailcc-structret.ll llvm/trunk/test/CodeGen/X86/tailccbyval.ll llvm/trunk/test/CodeGen/X86/tailccbyval64.ll llvm/trunk/test/CodeGen/X86/tailccfp.ll llvm/trunk/test/CodeGen/X86/tailccfp2.ll llvm/trunk/test/CodeGen/X86/tailccpic1.ll llvm/trunk/test/CodeGen/X86/tailccpic2.ll llvm/trunk/test/CodeGen/X86/tailccstack64.ll Modified: llvm/trunk/docs/BitCodeFormat.rst llvm/trunk/docs/CodeGenerator.rst llvm/trunk/docs/LangRef.rst llvm/trunk/include/llvm/IR/CallingConv.h llvm/trunk/lib/AsmParser/LLLexer.cpp llvm/trunk/lib/AsmParser/LLParser.cpp llvm/trunk/lib/AsmParser/LLToken.h llvm/trunk/lib/CodeGen/Analysis.cpp llvm/trunk/lib/IR/AsmWriter.cpp llvm/trunk/lib/Target/X86/X86CallingConv.td llvm/trunk/lib/Target/X86/X86FastISel.cpp llvm/trunk/lib/Target/X86/X86FrameLowering.cpp llvm/trunk/lib/Target/X86/X86ISelLowering.cpp llvm/trunk/lib/Target/X86/X86Subtarget.h llvm/trunk/utils/vim/syntax/llvm.vim Modified: llvm/trunk/docs/BitCodeFormat.rst URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/docs/BitCodeFormat.rst?rev=373976&r1=373975&r2=373976&view=diff ============================================================================== --- llvm/trunk/docs/BitCodeFormat.rst (original) +++ llvm/trunk/docs/BitCodeFormat.rst
Mon Oct 7 15:28:58 2019 @@ -794,6 +794,7 @@ function. The operand fields are: * ``preserve_allcc``: code 15 * ``swiftcc`` : code 16 * ``cxx_fast_tlscc``: code 17 + * ``tailcc`` : code 18 * ``x86_stdcallcc``: code 64 * ``x86_fastcallcc``: code 65 * ``arm_apcscc``: code 66 Modified: llvm/trunk/docs/CodeGenerator.rst URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/docs/CodeGenerator.rst?rev=373976&r1=373975&r2=373976&view=diff ============================================================================== --- llvm/trunk/docs/CodeGenerator.rst (original) +++ llvm/trunk/docs/CodeGenerator.rst Mon Oct 7 15:28:58 2019 @@ -2068,12 +2068,12 @@ supported on x86/x86-64, PowerPC, and We and PowerPC if: * Caller and callee have the calling convention ``fastcc``, ``cc 10`` (GHC - calling convention) or ``cc 11`` (HiPE calling convention). + calling convention), ``cc 11`` (HiPE calling convention), or ``tailcc``. * The call is a tail call - in tail position (ret immediately follows call and ret uses value of call or is void). -* Option ``-tailcallopt`` is enabled. +* Option ``-tailcallopt`` is enabled or the calling convention is ``tailcc``. * Platform-specific constraints are met. Modified: llvm/trunk/docs/LangRef.rst URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/docs/LangRef.rst?rev=373976&r1=373975&r2=373976&view=diff ============================================================================== --- llvm/trunk/docs/LangRef.rst (original) +++ llvm/trunk/docs/LangRef.rst Mon Oct 7 15:28:58 2019 @@ -299,7 +299,7 @@ added in the future: allows the target to use whatever tricks it wants to produce fast code for the target, without having to conform to an externally specified ABI (Application Binary Interface). `Tail calls can only - be optimized when this, the GHC or the HiPE convention is + be optimized when this, the tailcc, the GHC or the HiPE convention is used. 
`_ This calling convention does not support varargs and requires the prototype of all callees to exactly match the prototype of the function definition. @@ -436,6 +436,14 @@ added in the future: - On X86-64 RCX and R8 are available for additional integer returns, and XMM2 and XMM3 are available for additional FP/vector returns. - On iOS platforms, we use AAPCS-VFP calling convention. +"``tailcc``" - Tail callable calling convention + This calling convention ensures that calls in tail position will always be + tail call optimized. This calling convention is equivalent to fastcc, + except for an additional guarantee that tail calls will be produced + whenever possible. `Tail calls can only be optimized when this, the fastcc, + the GHC or the HiPE convention is used. `_ This + calling convention does not support varargs and requires the prototype of + all callees to exactly match the prototype of the function definition. "``cc ``" - Numbered convention Any calling convention may be specified by number, allowing target-specific calling conventions to be used. Target specific @@ -10232,11 +10240,12 @@ This instruction requires several argume Tail call optimization for calls marked ``tail`` is guaranteed to occur if the following conditions are met: - - Caller and callee both have the calling convention ``fastcc``. + - Caller and callee both have the calling convention ``fastcc`` or ``tailcc``. - The call is in tail position (ret immediately follows call and ret uses value of call or is void). - - Option ``-tailcallopt`` is enabled, or - ``llvm::GuaranteedTailCallOpt`` is ``true``. + - Option ``-tailcallopt`` is enabled, + ``llvm::GuaranteedTailCallOpt`` is ``true``, or the calling convention + is ``tailcc`` - `Platform-specific constraints are met. 
`_ Modified: llvm/trunk/include/llvm/IR/CallingConv.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/IR/CallingConv.h?rev=373976&r1=373975&r2=373976&view=diff ============================================================================== --- llvm/trunk/include/llvm/IR/CallingConv.h (original) +++ llvm/trunk/include/llvm/IR/CallingConv.h Mon Oct 7 15:28:58 2019 @@ -75,6 +75,11 @@ namespace CallingConv { // CXX_FAST_TLS - Calling convention for access functions. CXX_FAST_TLS = 17, + /// Tail - This calling convention attempts to make calls as fast as + /// possible while guaranteeing that tail call optimization can always + /// be performed. + Tail = 18, + // Target - This is the start of the target-specific calling conventions, // e.g. fastcall and thiscall on X86. FirstTargetCC = 64, Modified: llvm/trunk/lib/AsmParser/LLLexer.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/AsmParser/LLLexer.cpp?rev=373976&r1=373975&r2=373976&view=diff ============================================================================== --- llvm/trunk/lib/AsmParser/LLLexer.cpp (original) +++ llvm/trunk/lib/AsmParser/LLLexer.cpp Mon Oct 7 15:28:58 2019 @@ -622,6 +622,7 @@ lltok::Kind LLLexer::LexIdentifier() { KEYWORD(amdgpu_ps); KEYWORD(amdgpu_cs); KEYWORD(amdgpu_kernel); + KEYWORD(tailcc); KEYWORD(cc); KEYWORD(c); Modified: llvm/trunk/lib/AsmParser/LLParser.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/AsmParser/LLParser.cpp?rev=373976&r1=373975&r2=373976&view=diff ============================================================================== --- llvm/trunk/lib/AsmParser/LLParser.cpp (original) +++ llvm/trunk/lib/AsmParser/LLParser.cpp Mon Oct 7 15:28:58 2019 @@ -1955,6 +1955,7 @@ void LLParser::ParseOptionalDLLStorageCl /// ::= 'amdgpu_ps' /// ::= 'amdgpu_cs' /// ::= 'amdgpu_kernel' +/// ::= 'tailcc' /// ::= 'cc' UINT /// bool LLParser::ParseOptionalCallingConv(unsigned &CC) { @@ -2000,6 +2001,7 @@ bool LLParser::ParseOptionalCallingConv( case
lltok::kw_amdgpu_ps: CC = CallingConv::AMDGPU_PS; break; case lltok::kw_amdgpu_cs: CC = CallingConv::AMDGPU_CS; break; case lltok::kw_amdgpu_kernel: CC = CallingConv::AMDGPU_KERNEL; break; + case lltok::kw_tailcc: CC = CallingConv::Tail; break; case lltok::kw_cc: { Lex.Lex(); return ParseUInt32(CC); Modified: llvm/trunk/lib/AsmParser/LLToken.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/AsmParser/LLToken.h?rev=373976&r1=373975&r2=373976&view=diff ============================================================================== --- llvm/trunk/lib/AsmParser/LLToken.h (original) +++ llvm/trunk/lib/AsmParser/LLToken.h Mon Oct 7 15:28:58 2019 @@ -168,6 +168,7 @@ enum Kind { kw_amdgpu_ps, kw_amdgpu_cs, kw_amdgpu_kernel, + kw_tailcc, // Attributes: kw_attributes, Modified: llvm/trunk/lib/CodeGen/Analysis.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/Analysis.cpp?rev=373976&r1=373975&r2=373976&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/Analysis.cpp (original) +++ llvm/trunk/lib/CodeGen/Analysis.cpp Mon Oct 7 15:28:58 2019 @@ -523,7 +523,8 @@ bool llvm::isInTailCallPosition(Immutabl // longjmp on x86), it can end up causing miscompilation that has not // been fully understood. 
if (!Ret && - (!TM.Options.GuaranteedTailCallOpt || !isa(Term))) + ((!TM.Options.GuaranteedTailCallOpt && + CS.getCallingConv() != CallingConv::Tail) || !isa(Term))) return false; // If I will have a chain, make sure no other instruction that will have a Modified: llvm/trunk/lib/IR/AsmWriter.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/IR/AsmWriter.cpp?rev=373976&r1=373975&r2=373976&view=diff ============================================================================== --- llvm/trunk/lib/IR/AsmWriter.cpp (original) +++ llvm/trunk/lib/IR/AsmWriter.cpp Mon Oct 7 15:28:58 2019 @@ -352,6 +352,7 @@ static void PrintCallingConv(unsigned cc case CallingConv::PreserveAll: Out << "preserve_allcc"; break; case CallingConv::CXX_FAST_TLS: Out << "cxx_fast_tlscc"; break; case CallingConv::GHC: Out << "ghccc"; break; + case CallingConv::Tail: Out << "tailcc"; break; case CallingConv::X86_StdCall: Out << "x86_stdcallcc"; break; case CallingConv::X86_FastCall: Out << "x86_fastcallcc"; break; case CallingConv::X86_ThisCall: Out << "x86_thiscallcc"; break; Modified: llvm/trunk/lib/Target/X86/X86CallingConv.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86CallingConv.td?rev=373976&r1=373975&r2=373976&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86CallingConv.td (original) +++ llvm/trunk/lib/Target/X86/X86CallingConv.td Mon Oct 7 15:28:58 2019 @@ -433,6 +433,7 @@ defm X86_SysV64_RegCall : def RetCC_X86_32 : CallingConv<[ // If FastCC, use RetCC_X86_32_Fast. CCIfCC<"CallingConv::Fast", CCDelegateTo>, + CCIfCC<"CallingConv::Tail", CCDelegateTo>, // If HiPE, use RetCC_X86_32_HiPE. 
CCIfCC<"CallingConv::HiPE", CCDelegateTo>, CCIfCC<"CallingConv::X86_VectorCall", CCDelegateTo>, @@ -1000,6 +1001,7 @@ def CC_X86_32 : CallingConv<[ CCIfCC<"CallingConv::X86_VectorCall", CCDelegateTo>, CCIfCC<"CallingConv::X86_ThisCall", CCDelegateTo>, CCIfCC<"CallingConv::Fast", CCDelegateTo>, + CCIfCC<"CallingConv::Tail", CCDelegateTo>, CCIfCC<"CallingConv::GHC", CCDelegateTo>, CCIfCC<"CallingConv::HiPE", CCDelegateTo>, CCIfCC<"CallingConv::X86_RegCall", CCDelegateTo>, Modified: llvm/trunk/lib/Target/X86/X86FastISel.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86FastISel.cpp?rev=373976&r1=373975&r2=373976&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86FastISel.cpp (original) +++ llvm/trunk/lib/Target/X86/X86FastISel.cpp Mon Oct 7 15:28:58 2019 @@ -1160,6 +1160,7 @@ bool X86FastISel::X86SelectRet(const Ins CallingConv::ID CC = F.getCallingConv(); if (CC != CallingConv::C && CC != CallingConv::Fast && + CC != CallingConv::Tail && CC != CallingConv::X86_FastCall && CC != CallingConv::X86_StdCall && CC != CallingConv::X86_ThisCall && @@ -1173,7 +1174,8 @@ bool X86FastISel::X86SelectRet(const Ins // fastcc with -tailcallopt is intended to provide a guaranteed // tail call optimization. Fastisel doesn't know how to do that. - if (CC == CallingConv::Fast && TM.Options.GuaranteedTailCallOpt) + if ((CC == CallingConv::Fast && TM.Options.GuaranteedTailCallOpt) || + CC == CallingConv::Tail) return false; // Let SDISel handle vararg functions. 
@@ -3157,7 +3159,7 @@ static unsigned computeBytesPoppedByCall if (Subtarget->getTargetTriple().isOSMSVCRT()) return 0; if (CC == CallingConv::Fast || CC == CallingConv::GHC || - CC == CallingConv::HiPE) + CC == CallingConv::HiPE || CC == CallingConv::Tail) return 0; if (CS) @@ -3208,6 +3210,7 @@ bool X86FastISel::fastLowerCall(CallLowe default: return false; case CallingConv::C: case CallingConv::Fast: + case CallingConv::Tail: case CallingConv::WebKit_JS: case CallingConv::Swift: case CallingConv::X86_FastCall: @@ -3224,7 +3227,8 @@ bool X86FastISel::fastLowerCall(CallLowe // fastcc with -tailcallopt is intended to provide a guaranteed // tail call optimization. Fastisel doesn't know how to do that. - if (CC == CallingConv::Fast && TM.Options.GuaranteedTailCallOpt) + if ((CC == CallingConv::Fast && TM.Options.GuaranteedTailCallOpt) || + CC == CallingConv::Tail) return false; // Don't know how to handle Win64 varargs yet. Nothing special needed for Modified: llvm/trunk/lib/Target/X86/X86FrameLowering.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86FrameLowering.cpp?rev=373976&r1=373975&r2=373976&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86FrameLowering.cpp (original) +++ llvm/trunk/lib/Target/X86/X86FrameLowering.cpp Mon Oct 7 15:28:58 2019 @@ -2269,7 +2269,8 @@ GetScratchRegister(bool Is64Bit, bool Is bool IsNested = HasNestArgument(&MF); if (CallingConvention == CallingConv::X86_FastCall || - CallingConvention == CallingConv::Fast) { + CallingConvention == CallingConv::Fast || + CallingConvention == CallingConv::Tail) { if (IsNested) report_fatal_error("Segmented stacks does not support fastcall with " "nested function."); Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=373976&r1=373975&r2=373976&view=diff 
============================================================================== --- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp (original) +++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Mon Oct 7 15:28:58 2019 @@ -2963,7 +2963,7 @@ static SDValue CreateCopyOfByValArgument static bool canGuaranteeTCO(CallingConv::ID CC) { return (CC == CallingConv::Fast || CC == CallingConv::GHC || CC == CallingConv::X86_RegCall || CC == CallingConv::HiPE || - CC == CallingConv::HHVM); + CC == CallingConv::HHVM || CC == CallingConv::Tail); } /// Return true if we might ever do TCO for calls with this calling convention. @@ -2989,7 +2989,7 @@ static bool mayTailCallThisCC(CallingCon /// Return true if the function is being made into a tailcall target by /// changing its ABI. static bool shouldGuaranteeTCO(CallingConv::ID CC, bool GuaranteedTailCallOpt) { - return GuaranteedTailCallOpt && canGuaranteeTCO(CC); + return (GuaranteedTailCallOpt && canGuaranteeTCO(CC)) || CC == CallingConv::Tail; } bool X86TargetLowering::mayBeEmittedAsTailCall(const CallInst *CI) const { @@ -3615,6 +3615,8 @@ X86TargetLowering::LowerCall(TargetLower bool IsWin64 = Subtarget.isCallingConvWin64(CallConv); StructReturnType SR = callIsStructReturn(Outs, Subtarget.isTargetMCU()); bool IsSibcall = false; + bool IsGuaranteeTCO = MF.getTarget().Options.GuaranteedTailCallOpt || + CallConv == CallingConv::Tail; X86MachineFunctionInfo *X86Info = MF.getInfo(); auto Attr = MF.getFunction().getFnAttribute("disable-tail-calls"); const auto *CI = dyn_cast_or_null(CLI.CS.getInstruction()); @@ -3635,8 +3637,7 @@ X86TargetLowering::LowerCall(TargetLower if (Attr.getValueAsString() == "true") isTailCall = false; - if (Subtarget.isPICStyleGOT() && - !MF.getTarget().Options.GuaranteedTailCallOpt) { + if (Subtarget.isPICStyleGOT() && !IsGuaranteeTCO) { // If we are using a GOT, disable tail calls to external symbols with // default visibility. 
Tail calling such a symbol requires using a GOT // relocation, which forces early binding of the symbol. This breaks code @@ -3663,7 +3664,7 @@ X86TargetLowering::LowerCall(TargetLower // Sibcalls are automatically detected tailcalls which do not require // ABI changes. - if (!MF.getTarget().Options.GuaranteedTailCallOpt && isTailCall) + if (!IsGuaranteeTCO && isTailCall) IsSibcall = true; if (isTailCall) @@ -3695,8 +3696,7 @@ X86TargetLowering::LowerCall(TargetLower // This is a sibcall. The memory operands are available in caller's // own caller's stack. NumBytes = 0; - else if (MF.getTarget().Options.GuaranteedTailCallOpt && - canGuaranteeTCO(CallConv)) + else if (IsGuaranteeTCO && canGuaranteeTCO(CallConv)) NumBytes = GetAlignedArgumentStackSize(NumBytes, DAG); int FPDiff = 0; @@ -4321,6 +4321,8 @@ bool X86TargetLowering::IsEligibleForTai bool CCMatch = CallerCC == CalleeCC; bool IsCalleeWin64 = Subtarget.isCallingConvWin64(CalleeCC); bool IsCallerWin64 = Subtarget.isCallingConvWin64(CallerCC); + bool IsGuaranteeTCO = DAG.getTarget().Options.GuaranteedTailCallOpt || + CalleeCC == CallingConv::Tail; // Win64 functions have extra shadow space for argument homing. Don't do the // sibcall if the caller and callee have mismatched expectations for this @@ -4328,7 +4330,7 @@ bool X86TargetLowering::IsEligibleForTai if (IsCalleeWin64 != IsCallerWin64) return false; - if (DAG.getTarget().Options.GuaranteedTailCallOpt) { + if (IsGuaranteeTCO) { if (canGuaranteeTCO(CalleeCC) && CCMatch) return true; return false; @@ -24421,6 +24423,7 @@ SDValue X86TargetLowering::LowerINIT_TRA case CallingConv::X86_FastCall: case CallingConv::X86_ThisCall: case CallingConv::Fast: + case CallingConv::Tail: // Pass 'nest' parameter in EAX. 
// Must be kept in sync with X86CallingConv.td NestReg = X86::EAX; Modified: llvm/trunk/lib/Target/X86/X86Subtarget.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86Subtarget.h?rev=373976&r1=373975&r2=373976&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86Subtarget.h (original) +++ llvm/trunk/lib/Target/X86/X86Subtarget.h Mon Oct 7 15:28:58 2019 @@ -815,6 +815,7 @@ public: // On Win64, all these conventions just use the default convention. case CallingConv::C: case CallingConv::Fast: + case CallingConv::Tail: case CallingConv::Swift: case CallingConv::X86_FastCall: case CallingConv::X86_StdCall: Added: llvm/trunk/test/CodeGen/X86/musttail-tailcc.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/musttail-tailcc.ll?rev=373976&view=auto ============================================================================== --- llvm/trunk/test/CodeGen/X86/musttail-tailcc.ll (added) +++ llvm/trunk/test/CodeGen/X86/musttail-tailcc.ll Mon Oct 7 15:28:58 2019 @@ -0,0 +1,114 @@ +; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py +; RUN: llc < %s -mtriple=x86_64-unknown-unknown | FileCheck %s -check-prefix=X64 +; RUN: llc < %s -mtriple=i686-unknown-unknown | FileCheck %s -check-prefix=X32 + +; tailcc will turn all of these musttail calls into tail calls. 
+ +declare tailcc i32 @tailcallee(i32 %a1, i32 %a2) + +define tailcc i32 @tailcaller(i32 %in1, i32 %in2) nounwind { +; X64-LABEL: tailcaller: +; X64: # %bb.0: # %entry +; X64-NEXT: pushq %rax +; X64-NEXT: popq %rax +; X64-NEXT: jmp tailcallee # TAILCALL +; +; X32-LABEL: tailcaller: +; X32: # %bb.0: # %entry +; X32-NEXT: jmp tailcallee # TAILCALL +entry: + %tmp11 = musttail call tailcc i32 @tailcallee(i32 %in1, i32 %in2) + ret i32 %tmp11 +} + +declare tailcc i8* @alias_callee() + +define tailcc noalias i8* @noalias_caller() nounwind { +; X64-LABEL: noalias_caller: +; X64: # %bb.0: +; X64-NEXT: pushq %rax +; X64-NEXT: popq %rax +; X64-NEXT: jmp alias_callee # TAILCALL +; +; X32-LABEL: noalias_caller: +; X32: # %bb.0: +; X32-NEXT: jmp alias_callee # TAILCALL + %p = musttail call tailcc i8* @alias_callee() + ret i8* %p +} + +declare tailcc noalias i8* @noalias_callee() + +define tailcc i8* @alias_caller() nounwind { +; X64-LABEL: alias_caller: +; X64: # %bb.0: +; X64-NEXT: pushq %rax +; X64-NEXT: popq %rax +; X64-NEXT: jmp noalias_callee # TAILCALL +; +; X32-LABEL: alias_caller: +; X32: # %bb.0: +; X32-NEXT: jmp noalias_callee # TAILCALL + %p = musttail call tailcc noalias i8* @noalias_callee() + ret i8* %p +} + +define tailcc void @void_test(i32, i32, i32, i32) { +; X64-LABEL: void_test: +; X64: # %bb.0: # %entry +; X64-NEXT: pushq %rax +; X64-NEXT: .cfi_def_cfa_offset 16 +; X64-NEXT: popq %rax +; X64-NEXT: .cfi_def_cfa_offset 8 +; X64-NEXT: jmp void_test # TAILCALL +; +; X32-LABEL: void_test: +; X32: # %bb.0: # %entry +; X32-NEXT: pushl %esi +; X32-NEXT: .cfi_def_cfa_offset 8 +; X32-NEXT: subl $8, %esp +; X32-NEXT: .cfi_def_cfa_offset 16 +; X32-NEXT: .cfi_offset %esi, -8 +; X32-NEXT: movl {{[0-9]+}}(%esp), %eax +; X32-NEXT: movl {{[0-9]+}}(%esp), %esi +; X32-NEXT: movl %esi, {{[0-9]+}}(%esp) +; X32-NEXT: movl %eax, {{[0-9]+}}(%esp) +; X32-NEXT: addl $8, %esp +; X32-NEXT: .cfi_def_cfa_offset 8 +; X32-NEXT: popl %esi +; X32-NEXT: .cfi_def_cfa_offset 4 +; X32-NEXT: jmp 
void_test # TAILCALL + entry: + musttail call tailcc void @void_test( i32 %0, i32 %1, i32 %2, i32 %3) + ret void +} + +define tailcc i1 @i1test(i32, i32, i32, i32) { +; X64-LABEL: i1test: +; X64: # %bb.0: # %entry +; X64-NEXT: pushq %rax +; X64-NEXT: .cfi_def_cfa_offset 16 +; X64-NEXT: popq %rax +; X64-NEXT: .cfi_def_cfa_offset 8 +; X64-NEXT: jmp i1test # TAILCALL +; +; X32-LABEL: i1test: +; X32: # %bb.0: # %entry +; X32-NEXT: pushl %esi +; X32-NEXT: .cfi_def_cfa_offset 8 +; X32-NEXT: subl $8, %esp +; X32-NEXT: .cfi_def_cfa_offset 16 +; X32-NEXT: .cfi_offset %esi, -8 +; X32-NEXT: movl {{[0-9]+}}(%esp), %eax +; X32-NEXT: movl {{[0-9]+}}(%esp), %esi +; X32-NEXT: movl %esi, {{[0-9]+}}(%esp) +; X32-NEXT: movl %eax, {{[0-9]+}}(%esp) +; X32-NEXT: addl $8, %esp +; X32-NEXT: .cfi_def_cfa_offset 8 +; X32-NEXT: popl %esi +; X32-NEXT: .cfi_def_cfa_offset 4 +; X32-NEXT: jmp i1test # TAILCALL + entry: + %4 = musttail call tailcc i1 @i1test( i32 %0, i32 %1, i32 %2, i32 %3) + ret i1 %4 +} Added: llvm/trunk/test/CodeGen/X86/tailcall-tailcc.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/tailcall-tailcc.ll?rev=373976&view=auto ============================================================================== --- llvm/trunk/test/CodeGen/X86/tailcall-tailcc.ll (added) +++ llvm/trunk/test/CodeGen/X86/tailcall-tailcc.ll Mon Oct 7 15:28:58 2019 @@ -0,0 +1,155 @@ +; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py +; RUN: llc < %s -mtriple=x86_64-unknown-unknown | FileCheck %s -check-prefix=X64 +; RUN: llc < %s -mtriple=i686-unknown-unknown | FileCheck %s -check-prefix=X32 + +; With -tailcallopt, CodeGen guarantees a tail call optimization +; for all of these. 
+ +declare tailcc i32 @tailcallee(i32 %a1, i32 %a2, i32 %a3, i32 %a4) + +define tailcc i32 @tailcaller(i32 %in1, i32 %in2) nounwind { +; X64-LABEL: tailcaller: +; X64: # %bb.0: # %entry +; X64-NEXT: pushq %rax +; X64-NEXT: movl %edi, %edx +; X64-NEXT: movl %esi, %ecx +; X64-NEXT: popq %rax +; X64-NEXT: jmp tailcallee # TAILCALL +; +; X32-LABEL: tailcaller: +; X32: # %bb.0: # %entry +; X32-NEXT: subl $16, %esp +; X32-NEXT: movl %ecx, {{[0-9]+}}(%esp) +; X32-NEXT: movl {{[0-9]+}}(%esp), %eax +; X32-NEXT: movl %edx, {{[0-9]+}}(%esp) +; X32-NEXT: movl %eax, {{[0-9]+}}(%esp) +; X32-NEXT: addl $8, %esp +; X32-NEXT: jmp tailcallee # TAILCALL +entry: + %tmp11 = tail call tailcc i32 @tailcallee(i32 %in1, i32 %in2, i32 %in1, i32 %in2) + ret i32 %tmp11 +} + +declare tailcc i8* @alias_callee() + +define tailcc noalias i8* @noalias_caller() nounwind { +; X64-LABEL: noalias_caller: +; X64: # %bb.0: +; X64-NEXT: pushq %rax +; X64-NEXT: popq %rax +; X64-NEXT: jmp alias_callee # TAILCALL +; +; X32-LABEL: noalias_caller: +; X32: # %bb.0: +; X32-NEXT: jmp alias_callee # TAILCALL + %p = tail call tailcc i8* @alias_callee() + ret i8* %p +} + +declare tailcc noalias i8* @noalias_callee() + +define tailcc i8* @alias_caller() nounwind { +; X64-LABEL: alias_caller: +; X64: # %bb.0: +; X64-NEXT: pushq %rax +; X64-NEXT: popq %rax +; X64-NEXT: jmp noalias_callee # TAILCALL +; +; X32-LABEL: alias_caller: +; X32: # %bb.0: +; X32-NEXT: jmp noalias_callee # TAILCALL + %p = tail call tailcc noalias i8* @noalias_callee() + ret i8* %p +} + +declare tailcc i32 @i32_callee() + +define tailcc i32 @ret_undef() nounwind { +; X64-LABEL: ret_undef: +; X64: # %bb.0: +; X64-NEXT: pushq %rax +; X64-NEXT: popq %rax +; X64-NEXT: jmp i32_callee # TAILCALL +; +; X32-LABEL: ret_undef: +; X32: # %bb.0: +; X32-NEXT: jmp i32_callee # TAILCALL + %p = tail call tailcc i32 @i32_callee() + ret i32 undef +} + +declare tailcc void @does_not_return() + +define tailcc i32 @noret() nounwind { +; X64-LABEL: noret: +; X64: # 
%bb.0: +; X64-NEXT: pushq %rax +; X64-NEXT: popq %rax +; X64-NEXT: jmp does_not_return # TAILCALL +; +; X32-LABEL: noret: +; X32: # %bb.0: +; X32-NEXT: jmp does_not_return # TAILCALL + tail call tailcc void @does_not_return() + unreachable +} + +define tailcc void @void_test(i32, i32, i32, i32) { +; X64-LABEL: void_test: +; X64: # %bb.0: # %entry +; X64-NEXT: pushq %rax +; X64-NEXT: .cfi_def_cfa_offset 16 +; X64-NEXT: popq %rax +; X64-NEXT: .cfi_def_cfa_offset 8 +; X64-NEXT: jmp void_test # TAILCALL +; +; X32-LABEL: void_test: +; X32: # %bb.0: # %entry +; X32-NEXT: pushl %esi +; X32-NEXT: .cfi_def_cfa_offset 8 +; X32-NEXT: subl $8, %esp +; X32-NEXT: .cfi_def_cfa_offset 16 +; X32-NEXT: .cfi_offset %esi, -8 +; X32-NEXT: movl {{[0-9]+}}(%esp), %eax +; X32-NEXT: movl {{[0-9]+}}(%esp), %esi +; X32-NEXT: movl %esi, {{[0-9]+}}(%esp) +; X32-NEXT: movl %eax, {{[0-9]+}}(%esp) +; X32-NEXT: addl $8, %esp +; X32-NEXT: .cfi_def_cfa_offset 8 +; X32-NEXT: popl %esi +; X32-NEXT: .cfi_def_cfa_offset 4 +; X32-NEXT: jmp void_test # TAILCALL + entry: + tail call tailcc void @void_test( i32 %0, i32 %1, i32 %2, i32 %3) + ret void +} + +define tailcc i1 @i1test(i32, i32, i32, i32) { +; X64-LABEL: i1test: +; X64: # %bb.0: # %entry +; X64-NEXT: pushq %rax +; X64-NEXT: .cfi_def_cfa_offset 16 +; X64-NEXT: popq %rax +; X64-NEXT: .cfi_def_cfa_offset 8 +; X64-NEXT: jmp i1test # TAILCALL +; +; X32-LABEL: i1test: +; X32: # %bb.0: # %entry +; X32-NEXT: pushl %esi +; X32-NEXT: .cfi_def_cfa_offset 8 +; X32-NEXT: subl $8, %esp +; X32-NEXT: .cfi_def_cfa_offset 16 +; X32-NEXT: .cfi_offset %esi, -8 +; X32-NEXT: movl {{[0-9]+}}(%esp), %eax +; X32-NEXT: movl {{[0-9]+}}(%esp), %esi +; X32-NEXT: movl %esi, {{[0-9]+}}(%esp) +; X32-NEXT: movl %eax, {{[0-9]+}}(%esp) +; X32-NEXT: addl $8, %esp +; X32-NEXT: .cfi_def_cfa_offset 8 +; X32-NEXT: popl %esi +; X32-NEXT: .cfi_def_cfa_offset 4 +; X32-NEXT: jmp i1test # TAILCALL + entry: + %4 = tail call tailcc i1 @i1test( i32 %0, i32 %1, i32 %2, i32 %3) + ret i1 %4 +} 
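Illustrative note, not part of r373976: the X86ISelLowering.cpp hunks earlier in this commit change when the backend commits to a guaranteed tail call. The following is a minimal standalone C++ sketch of that decision, under stated assumptions. The enum CC is a simplified stand-in for llvm::CallingConv::ID, and guaranteedTailCallAllowed is a hypothetical helper that models only the guaranteed-TCO branch of IsEligibleForTailCallOptimization. The sibcall fallback and all platform-specific constraints are left out.

```cpp
#include <cassert>

// Simplified stand-in for llvm::CallingConv::ID (assumption: only the
// conventions named in the X86ISelLowering.cpp hunk are modeled).
enum class CC { C, Fast, GHC, X86_RegCall, HiPE, HHVM, Tail };

// Same condition as canGuaranteeTCO() after this commit: tailcc joins the
// set of conventions for which guaranteed TCO is possible.
bool canGuaranteeTCO(CC cc) {
  return cc == CC::Fast || cc == CC::GHC || cc == CC::X86_RegCall ||
         cc == CC::HiPE || cc == CC::HHVM || cc == CC::Tail;
}

// Same condition as shouldGuaranteeTCO() after this commit: tailcc opts in
// even when the GuaranteedTailCallOpt target option is off.
bool shouldGuaranteeTCO(CC cc, bool guaranteedTailCallOpt) {
  return (guaranteedTailCallOpt && canGuaranteeTCO(cc)) || cc == CC::Tail;
}

// Hypothetical helper: models only the guaranteed-TCO branch of
// IsEligibleForTailCallOptimization(). Returns true when that branch is
// taken and accepts the call.
bool guaranteedTailCallAllowed(CC callerCC, CC calleeCC,
                               bool guaranteedTailCallOpt) {
  bool isGuaranteeTCO = guaranteedTailCallOpt || calleeCC == CC::Tail;
  if (!isGuaranteeTCO)
    return false; // the real code falls through to sibcall checks here
  bool ccMatch = callerCC == calleeCC;
  return canGuaranteeTCO(calleeCC) && ccMatch;
}
```

Under this model a tailcc-to-tailcc call qualifies without -tailcallopt, while a call between fastcc and tailcc does not qualify in either direction even with -tailcallopt, which is the behaviour the tailcc-fastcc.ll test in this commit checks for.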
Added: llvm/trunk/test/CodeGen/X86/tailcc-calleesave.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/tailcc-calleesave.ll?rev=373976&view=auto ============================================================================== --- llvm/trunk/test/CodeGen/X86/tailcc-calleesave.ll (added) +++ llvm/trunk/test/CodeGen/X86/tailcc-calleesave.ll Mon Oct 7 15:28:58 2019 @@ -0,0 +1,19 @@ +; RUN: llc -mcpu=core < %s | FileCheck %s + +target triple = "i686-apple-darwin" + +declare tailcc void @foo(i32, i32, i32, i32, i32, i32) +declare i32* @bar(i32*) + +define tailcc void @hoge(i32 %b) nounwind { +; Do not overwrite pushed callee-save registers +; CHECK: pushl +; CHECK: subl $[[SIZE:[0-9]+]], %esp +; CHECK-NOT: [[SIZE]](%esp) + %a = alloca i32 + store i32 0, i32* %a + %d = tail call i32* @bar(i32* %a) nounwind + store i32 %b, i32* %d + tail call tailcc void @foo(i32 1, i32 2, i32 3, i32 4, i32 5, i32 6) nounwind + ret void +} Added: llvm/trunk/test/CodeGen/X86/tailcc-disable-tail-calls.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/tailcc-disable-tail-calls.ll?rev=373976&view=auto ============================================================================== --- llvm/trunk/test/CodeGen/X86/tailcc-disable-tail-calls.ll (added) +++ llvm/trunk/test/CodeGen/X86/tailcc-disable-tail-calls.ll Mon Oct 7 15:28:58 2019 @@ -0,0 +1,40 @@ +; RUN: llc < %s -mtriple=x86_64-- | FileCheck %s --check-prefix=NO-OPTION +; RUN: llc < %s -mtriple=x86_64-- -disable-tail-calls | FileCheck %s --check-prefix=DISABLE-TRUE +; RUN: llc < %s -mtriple=x86_64-- -disable-tail-calls=false | FileCheck %s --check-prefix=DISABLE-FALSE + +; Check that command line option "-disable-tail-calls" overrides function +; attribute "disable-tail-calls". 
+ +; NO-OPTION-LABEL: {{\_?}}func_attr +; NO-OPTION: callq {{\_?}}callee + +; DISABLE-FALSE-LABEL: {{\_?}}func_attr +; DISABLE-FALSE: jmp {{\_?}}callee + +; DISABLE-TRUE-LABEL: {{\_?}}func_attr +; DISABLE-TRUE: callq {{\_?}}callee + +define tailcc i32 @func_attr(i32 %a) #0 { +entry: + %call = tail call tailcc i32 @callee(i32 %a) + ret i32 %call +} + +; NO-OPTION-LABEL: {{\_?}}func_noattr +; NO-OPTION: jmp {{\_?}}callee + +; DISABLE-FALSE-LABEL: {{\_?}}func_noattr +; DISABLE-FALSE: jmp {{\_?}}callee + +; DISABLE-TRUE-LABEL: {{\_?}}func_noattr +; DISABLE-TRUE: callq {{\_?}}callee + +define tailcc i32 @func_noattr(i32 %a) { +entry: + %call = tail call tailcc i32 @callee(i32 %a) + ret i32 %call +} + +declare tailcc i32 @callee(i32) + +attributes #0 = { "disable-tail-calls"="true" } Added: llvm/trunk/test/CodeGen/X86/tailcc-fastcc.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/tailcc-fastcc.ll?rev=373976&view=auto ============================================================================== --- llvm/trunk/test/CodeGen/X86/tailcc-fastcc.ll (added) +++ llvm/trunk/test/CodeGen/X86/tailcc-fastcc.ll Mon Oct 7 15:28:58 2019 @@ -0,0 +1,49 @@ +; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py +; RUN: llc -tailcallopt < %s -mtriple=x86_64-unknown-unknown | FileCheck %s -check-prefix=X64 +; RUN: llc -tailcallopt < %s -mtriple=i686-unknown-unknown | FileCheck %s -check-prefix=X32 + +; llc -tailcallopt should not enable tail calls from fastcc to tailcc or vice versa + +declare tailcc i32 @tailcallee1(i32 %a1, i32 %a2, i32 %a3, i32 %a4) + +define fastcc i32 @tailcaller1(i32 %in1, i32 %in2) nounwind { +; X64-LABEL: tailcaller1: +; X64: # %bb.0: # %entry +; X64-NEXT: pushq %rax +; X64-NEXT: movl %edi, %edx +; X64-NEXT: movl %esi, %ecx +; X64-NEXT: callq tailcallee1 +; X64-NEXT: retq $8 +; +; X32-LABEL: tailcaller1: +; X32: # %bb.0: # %entry +; X32-NEXT: pushl %edx +; X32-NEXT: pushl %ecx +; X32-NEXT: calll tailcallee1 +; X32-NEXT: 
retl +entry: + %tmp11 = tail call tailcc i32 @tailcallee1(i32 %in1, i32 %in2, i32 %in1, i32 %in2) + ret i32 %tmp11 +} + +declare fastcc i32 @tailcallee2(i32 %a1, i32 %a2, i32 %a3, i32 %a4) + +define tailcc i32 @tailcaller2(i32 %in1, i32 %in2) nounwind { +; X64-LABEL: tailcaller2: +; X64: # %bb.0: # %entry +; X64-NEXT: pushq %rax +; X64-NEXT: movl %edi, %edx +; X64-NEXT: movl %esi, %ecx +; X64-NEXT: callq tailcallee2 +; X64-NEXT: retq $8 +; +; X32-LABEL: tailcaller2: +; X32: # %bb.0: # %entry +; X32-NEXT: pushl %edx +; X32-NEXT: pushl %ecx +; X32-NEXT: calll tailcallee2 +; X32-NEXT: retl +entry: + %tmp11 = tail call fastcc i32 @tailcallee2(i32 %in1, i32 %in2, i32 %in1, i32 %in2) + ret i32 %tmp11 +} Added: llvm/trunk/test/CodeGen/X86/tailcc-fastisel.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/tailcc-fastisel.ll?rev=373976&view=auto ============================================================================== --- llvm/trunk/test/CodeGen/X86/tailcc-fastisel.ll (added) +++ llvm/trunk/test/CodeGen/X86/tailcc-fastisel.ll Mon Oct 7 15:28:58 2019 @@ -0,0 +1,18 @@ +; RUN: llc < %s -mtriple=x86_64-apple-darwin -fast-isel -fast-isel-abort=1 | FileCheck %s + +%0 = type { i64, i32, i8* } + +define tailcc i8* @"visit_array_aux<`Reference>"(%0 %arg, i32 %arg1) nounwind { +fail: ; preds = %entry + %tmp20 = tail call tailcc i8* @"visit_array_aux<`Reference>"(%0 %arg, i32 undef) ; [#uses=1] +; CHECK: jmp "_visit_array_aux<`Reference>" ## TAILCALL + ret i8* %tmp20 +} + +define i32 @foo() nounwind { +entry: + %0 = tail call i32 (...) @bar() nounwind ; [#uses=1] + ret i32 %0 +} + +declare i32 @bar(...) 
nounwind Added: llvm/trunk/test/CodeGen/X86/tailcc-largecode.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/tailcc-largecode.ll?rev=373976&view=auto ============================================================================== --- llvm/trunk/test/CodeGen/X86/tailcc-largecode.ll (added) +++ llvm/trunk/test/CodeGen/X86/tailcc-largecode.ll Mon Oct 7 15:28:58 2019 @@ -0,0 +1,71 @@ +; RUN: llc < %s -mtriple=x86_64-linux-gnu -code-model=large -enable-misched=false | FileCheck %s + +declare tailcc i32 @callee(i32 %arg) +define tailcc i32 @directcall(i32 %arg) { +entry: +; This is the large code model, so &callee may not fit into the jmp +; instruction. Instead, stick it into a register. +; CHECK: movabsq $callee, [[REGISTER:%r[a-z0-9]+]] +; CHECK: jmpq *[[REGISTER]] # TAILCALL + %res = tail call tailcc i32 @callee(i32 %arg) + ret i32 %res +} + +; Check that the register used for an indirect tail call doesn't +; clobber any of the arguments. +define tailcc i32 @indirect_manyargs(i32(i32,i32,i32,i32,i32,i32,i32)* %target) { +; Adjust the stack to enter the function. (The amount of the +; adjustment may change in the future, in which case the location of +; the stack argument and the return adjustment will change too.) +; CHECK: pushq +; Put the call target into R11, which won't be clobbered while restoring +; callee-saved registers and won't be used for passing arguments. +; CHECK: movq %rdi, %rax +; Pass the stack argument. +; CHECK: movl $7, 16(%rsp) +; Pass the register arguments, in the right registers. +; CHECK: movl $1, %edi +; CHECK: movl $2, %esi +; CHECK: movl $3, %edx +; CHECK: movl $4, %ecx +; CHECK: movl $5, %r8d +; CHECK: movl $6, %r9d +; Adjust the stack to "return". +; CHECK: popq +; And tail-call to the target. 
+; CHECK: jmpq *%rax # TAILCALL + %res = tail call tailcc i32 %target(i32 1, i32 2, i32 3, i32 4, i32 5, + i32 6, i32 7) + ret i32 %res +} + +; Check that the register used for a direct tail call doesn't clobber +; any of the arguments. +declare tailcc i32 @manyargs_callee(i32,i32,i32,i32,i32,i32,i32) +define tailcc i32 @direct_manyargs() { +; Adjust the stack to enter the function. (The amount of the +; adjustment may change in the future, in which case the location of +; the stack argument and the return adjustment will change too.) +; CHECK: pushq +; Pass the stack argument. +; CHECK: movl $7, 16(%rsp) +; This is the large code model, so &manyargs_callee may not fit into +; the jmp instruction. Put it into a register which won't be clobbered +; while restoring callee-saved registers and won't be used for passing +; arguments. +; CHECK: movabsq $manyargs_callee, %rax +; Pass the register arguments, in the right registers. +; CHECK: movl $1, %edi +; CHECK: movl $2, %esi +; CHECK: movl $3, %edx +; CHECK: movl $4, %ecx +; CHECK: movl $5, %r8d +; CHECK: movl $6, %r9d +; Adjust the stack to "return". +; CHECK: popq +; And tail-call to the target. 
+; CHECK: jmpq *%rax # TAILCALL + %res = tail call tailcc i32 @manyargs_callee(i32 1, i32 2, i32 3, i32 4, + i32 5, i32 6, i32 7) + ret i32 %res +} Added: llvm/trunk/test/CodeGen/X86/tailcc-stackalign.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/tailcc-stackalign.ll?rev=373976&view=auto ============================================================================== --- llvm/trunk/test/CodeGen/X86/tailcc-stackalign.ll (added) +++ llvm/trunk/test/CodeGen/X86/tailcc-stackalign.ll Mon Oct 7 15:28:58 2019 @@ -0,0 +1,23 @@ +; RUN: llc < %s -mtriple=i686-unknown-linux -no-x86-call-frame-opt | FileCheck %s +; Linux has 8 byte alignment so the params cause stack size 20, +; ensure that a normal tailcc call has matching stack size + + +define tailcc i32 @tailcallee(i32 %a1, i32 %a2, i32 %a3, i32 %a4) { + ret i32 %a3 +} + +define tailcc i32 @tailcaller(i32 %in1, i32 %in2, i32 %in3, i32 %in4) { + %tmp11 = tail call tailcc i32 @tailcallee(i32 %in1, i32 %in2, + i32 %in1, i32 %in2) + ret i32 %tmp11 +} + +define i32 @main(i32 %argc, i8** %argv) { + %tmp1 = call tailcc i32 @tailcaller( i32 1, i32 2, i32 3, i32 4 ) + ; expect match subl [stacksize] here + ret i32 0 +} + +; CHECK: calll tailcaller +; CHECK-NEXT: subl $12 Added: llvm/trunk/test/CodeGen/X86/tailcc-structret.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/tailcc-structret.ll?rev=373976&view=auto ============================================================================== --- llvm/trunk/test/CodeGen/X86/tailcc-structret.ll (added) +++ llvm/trunk/test/CodeGen/X86/tailcc-structret.ll Mon Oct 7 15:28:58 2019 @@ -0,0 +1,7 @@ +; RUN: llc < %s -mtriple=i686-unknown-linux | FileCheck %s +define tailcc { { i8*, i8* }*, i8*} @init({ { i8*, i8* }*, i8*}, i32) { +entry: + %2 = tail call tailcc { { i8*, i8* }*, i8* } @init({ { i8*, i8*}*, i8*} %0, i32 %1) + ret { { i8*, i8* }*, i8*} %2 +; CHECK: jmp init +} Added: llvm/trunk/test/CodeGen/X86/tailccbyval.ll URL: 
http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/tailccbyval.ll?rev=373976&view=auto ============================================================================== --- llvm/trunk/test/CodeGen/X86/tailccbyval.ll (added) +++ llvm/trunk/test/CodeGen/X86/tailccbyval.ll Mon Oct 7 15:28:58 2019 @@ -0,0 +1,21 @@ +; RUN: llc < %s -mtriple=i686-unknown-linux | FileCheck %s +%struct.s = type {i32, i32, i32, i32, i32, i32, i32, i32, + i32, i32, i32, i32, i32, i32, i32, i32, + i32, i32, i32, i32, i32, i32, i32, i32 } + +define tailcc i32 @tailcallee(%struct.s* byval %a) nounwind { +entry: + %tmp2 = getelementptr %struct.s, %struct.s* %a, i32 0, i32 0 + %tmp3 = load i32, i32* %tmp2 + ret i32 %tmp3 +; CHECK: tailcallee +; CHECK: movl 4(%esp), %eax +} + +define tailcc i32 @tailcaller(%struct.s* byval %a) nounwind { +entry: + %tmp4 = tail call tailcc i32 @tailcallee(%struct.s* byval %a ) + ret i32 %tmp4 +; CHECK: tailcaller +; CHECK: jmp tailcallee +} Added: llvm/trunk/test/CodeGen/X86/tailccbyval64.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/tailccbyval64.ll?rev=373976&view=auto ============================================================================== --- llvm/trunk/test/CodeGen/X86/tailccbyval64.ll (added) +++ llvm/trunk/test/CodeGen/X86/tailccbyval64.ll Mon Oct 7 15:28:58 2019 @@ -0,0 +1,42 @@ +; RUN: llc < %s -mcpu=generic -mtriple=x86_64-linux | FileCheck %s + +; FIXME: Win64 does not support byval. + +; Expect the entry point. +; CHECK-LABEL: tailcaller: + +; Expect 2 rep;movs because of tail call byval lowering. +; CHECK: rep; +; CHECK: rep; + +; A sequence of copyto/copyfrom virtual registers is used to deal with byval +; lowering appearing after moving arguments to registers. The following two +; checks verify that the register allocator changes those sequences to direct +; moves to argument register where it can (for registers that are not used in +; byval lowering - not rsi, not rdi, not rcx). 
+; Expect argument 4 to be moved directly to register edx. +; CHECK: movl $7, %edx + +; Expect argument 6 to be moved directly to register r8. +; CHECK: movl $17, %r8d + +; Expect not call but jmp to @tailcallee. +; CHECK: jmp tailcallee + +; Expect the trailer. +; CHECK: .size tailcaller + +%struct.s = type { i64, i64, i64, i64, i64, i64, i64, i64, + i64, i64, i64, i64, i64, i64, i64, i64, + i64, i64, i64, i64, i64, i64, i64, i64 } + +declare tailcc i64 @tailcallee(%struct.s* byval %a, i64 %val, i64 %val2, i64 %val3, i64 %val4, i64 %val5) + + +define tailcc i64 @tailcaller(i64 %b, %struct.s* byval %a) { +entry: + %tmp2 = getelementptr %struct.s, %struct.s* %a, i32 0, i32 1 + %tmp3 = load i64, i64* %tmp2, align 8 + %tmp4 = tail call tailcc i64 @tailcallee(%struct.s* byval %a , i64 %tmp3, i64 %b, i64 7, i64 13, i64 17) + ret i64 %tmp4 +} Added: llvm/trunk/test/CodeGen/X86/tailccfp.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/tailccfp.ll?rev=373976&view=auto ============================================================================== --- llvm/trunk/test/CodeGen/X86/tailccfp.ll (added) +++ llvm/trunk/test/CodeGen/X86/tailccfp.ll Mon Oct 7 15:28:58 2019 @@ -0,0 +1,6 @@ +; RUN: llc < %s -mtriple=i686-- | FileCheck %s +define tailcc i32 @bar(i32 %X, i32(double, i32) *%FP) { + %Y = tail call tailcc i32 %FP(double 0.0, i32 %X) + ret i32 %Y +; CHECK: jmpl +} Added: llvm/trunk/test/CodeGen/X86/tailccfp2.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/tailccfp2.ll?rev=373976&view=auto ============================================================================== --- llvm/trunk/test/CodeGen/X86/tailccfp2.ll (added) +++ llvm/trunk/test/CodeGen/X86/tailccfp2.ll Mon Oct 7 15:28:58 2019 @@ -0,0 +1,27 @@ +; RUN: llc < %s -mtriple=i686-- | FileCheck %s + +declare i32 @putchar(i32) + +define tailcc i32 @checktail(i32 %x, i32* %f, i32 %g) nounwind { +; CHECK-LABEL: checktail: + %tmp1 = icmp sgt i32 %x, 0 + br i1 %tmp1, label 
%if-then, label %if-else + +if-then: + %fun_ptr = bitcast i32* %f to i32(i32, i32*, i32)* + %arg1 = add i32 %x, -1 + call i32 @putchar(i32 90) +; CHECK: jmpl *%e{{.*}} + %res = tail call tailcc i32 %fun_ptr( i32 %arg1, i32 * %f, i32 %g) + ret i32 %res + +if-else: + ret i32 %x +} + + +define i32 @main() nounwind { + %f = bitcast i32 (i32, i32*, i32)* @checktail to i32* + %res = tail call tailcc i32 @checktail( i32 10, i32* %f,i32 10) + ret i32 %res +} Added: llvm/trunk/test/CodeGen/X86/tailccpic1.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/tailccpic1.ll?rev=373976&view=auto ============================================================================== --- llvm/trunk/test/CodeGen/X86/tailccpic1.ll (added) +++ llvm/trunk/test/CodeGen/X86/tailccpic1.ll Mon Oct 7 15:28:58 2019 @@ -0,0 +1,16 @@ +; RUN: llc < %s -mtriple=i686-pc-linux-gnu -relocation-model=pic | FileCheck %s + +; This test uses guaranteed TCO so these will be tail calls, despite the early +; binding issues. 
+ +define protected tailcc i32 @tailcallee(i32 %a1, i32 %a2, i32 %a3, i32 %a4) { +entry: + ret i32 %a3 +} + +define tailcc i32 @tailcaller(i32 %in1, i32 %in2) { +entry: + %tmp11 = tail call tailcc i32 @tailcallee( i32 %in1, i32 %in2, i32 %in1, i32 %in2 ) ; [#uses=1] + ret i32 %tmp11 +; CHECK: jmp tailcallee +} Added: llvm/trunk/test/CodeGen/X86/tailccpic2.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/tailccpic2.ll?rev=373976&view=auto ============================================================================== --- llvm/trunk/test/CodeGen/X86/tailccpic2.ll (added) +++ llvm/trunk/test/CodeGen/X86/tailccpic2.ll Mon Oct 7 15:28:58 2019 @@ -0,0 +1,15 @@ +; RUN: llc < %s -mtriple=i686-pc-linux-gnu -relocation-model=pic | FileCheck %s + +define tailcc i32 @tailcallee(i32 %a1, i32 %a2, i32 %a3, i32 %a4) { +entry: + ret i32 %a3 +} + +define tailcc i32 @tailcaller(i32 %in1, i32 %in2) { +entry: + %tmp11 = tail call tailcc i32 @tailcallee( i32 %in1, i32 %in2, i32 %in1, i32 %in2 ) ; [#uses=1] + ret i32 %tmp11 +; CHECK: movl tailcallee at GOT +; CHECK: jmpl +} + Added: llvm/trunk/test/CodeGen/X86/tailccstack64.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/tailccstack64.ll?rev=373976&view=auto ============================================================================== --- llvm/trunk/test/CodeGen/X86/tailccstack64.ll (added) +++ llvm/trunk/test/CodeGen/X86/tailccstack64.ll Mon Oct 7 15:28:58 2019 @@ -0,0 +1,28 @@ +; RUN: llc < %s -mcpu=generic -mtriple=x86_64-linux -post-RA-scheduler=true | FileCheck %s +; RUN: llc < %s -mcpu=generic -mtriple=x86_64-win32 -post-RA-scheduler=true | FileCheck %s + +; FIXME: Redundant unused stack allocation could be eliminated. +; CHECK: subq ${{24|72|80}}, %rsp + +; Check that lowered arguments on the stack do not overwrite each other. +; Add %in1 %p1 to a different temporary register (%eax). +; CHECK: movl [[A1:32|144]](%rsp), [[R1:%e..]] +; Move param %in1 to temp register (%r10d). 
+; CHECK: movl [[A2:40|152]](%rsp), [[R2:%[a-z0-9]+]] +; Add %in1 %p1 to a different temporary register (%eax). +; CHECK: addl {{%edi|%ecx}}, [[R1]] +; Move param %in2 to stack. +; CHECK-DAG: movl [[R2]], [[A1]](%rsp) +; Move result of addition to stack. +; CHECK-DAG: movl [[R1]], [[A2]](%rsp) +; Eventually, do a TAILCALL +; CHECK: TAILCALL + +declare tailcc i32 @tailcallee(i32 %p1, i32 %p2, i32 %p3, i32 %p4, i32 %p5, i32 %p6, i32 %a, i32 %b) nounwind + +define tailcc i32 @tailcaller(i32 %p1, i32 %p2, i32 %p3, i32 %p4, i32 %p5, i32 %p6, i32 %in1, i32 %in2) nounwind { +entry: + %tmp = add i32 %in1, %p1 + %retval = tail call tailcc i32 @tailcallee(i32 %p1, i32 %p2, i32 %p3, i32 %p4, i32 %p5, i32 %p6, i32 %in2,i32 %tmp) + ret i32 %retval +} Modified: llvm/trunk/utils/vim/syntax/llvm.vim URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/vim/syntax/llvm.vim?rev=373976&r1=373975&r2=373976&view=diff ============================================================================== --- llvm/trunk/utils/vim/syntax/llvm.vim (original) +++ llvm/trunk/utils/vim/syntax/llvm.vim Mon Oct 7 15:28:58 2019 @@ -82,6 +82,7 @@ syn keyword llvmKeyword \ externally_initialized \ extern_weak \ fastcc + \ tailcc \ filter \ from \ gc From llvm-commits at lists.llvm.org Mon Oct 7 15:26:38 2019 From: llvm-commits at lists.llvm.org (Vitaly Buka via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 22:26:38 +0000 (UTC) Subject: [PATCH] D68602: Split two defines into two lines In-Reply-To: References: Message-ID: <6394c0b11b67debe4b074e5e1d171ce9@localhost.localdomain> vitalybuka added inline comments. ================ Comment at: lib/tsan/go/build.bat:53 + -DWINVER=0x0600 ^ + -D_WIN32_WINNT=0x0600 ^ -DGetProcessMemoryInfo=K32GetProcessMemoryInfo ^ ---------------- You can reupload patches into an existing review with the "arc diff" command: arc diff [base_rev] e.g. 
arc diff HEAD^ or just: arc diff Repository: rCRT Compiler Runtime CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68602/new/ https://reviews.llvm.org/D68602 From llvm-commits at lists.llvm.org Mon Oct 7 15:28:14 2019 From: llvm-commits at lists.llvm.org (Roman Lebedev via llvm-commits) Date: Tue, 8 Oct 2019 01:28:14 +0300 Subject: [llvm] r373976 - [X86] Add new calling convention that guarantees tail call optimization In-Reply-To: <20191007222858.C50228DCA6@lists.llvm.org> References: <20191007222858.C50228DCA6@lists.llvm.org> Message-ID: On Tue, Oct 8, 2019 at 1:26 AM Reid Kleckner via llvm-commits wrote: > > Author: rnk > Date: Mon Oct 7 15:28:58 2019 > New Revision: 373976 > > URL: http://llvm.org/viewvc/llvm-project?rev=373976&view=rev > Log: > [X86] Add new calling convention that guarantees tail call optimization > > When the target option GuaranteedTailCallOpt is specified, calls with > the fastcc calling convention will be transformed into tail calls if > they are in tail position. This diff adds a new calling convention, > tailcc, currently supported only on X86, which behaves the same way as > fastcc, except that the GuaranteedTailCallOpt flag does not need to > be enabled in order to enable tail call optimization. > > Patch by Dwight Guth ! > > Reviewed By: lebedev.ri, paquette, rnk Pretty sure I didn't review this. 
> Differential Revision: https://reviews.llvm.org/D67855 > > Added: > llvm/trunk/test/CodeGen/X86/musttail-tailcc.ll > llvm/trunk/test/CodeGen/X86/tailcall-tailcc.ll > llvm/trunk/test/CodeGen/X86/tailcc-calleesave.ll > llvm/trunk/test/CodeGen/X86/tailcc-disable-tail-calls.ll > llvm/trunk/test/CodeGen/X86/tailcc-fastcc.ll > llvm/trunk/test/CodeGen/X86/tailcc-fastisel.ll > llvm/trunk/test/CodeGen/X86/tailcc-largecode.ll > llvm/trunk/test/CodeGen/X86/tailcc-stackalign.ll > llvm/trunk/test/CodeGen/X86/tailcc-structret.ll > llvm/trunk/test/CodeGen/X86/tailccbyval.ll > llvm/trunk/test/CodeGen/X86/tailccbyval64.ll > llvm/trunk/test/CodeGen/X86/tailccfp.ll > llvm/trunk/test/CodeGen/X86/tailccfp2.ll > llvm/trunk/test/CodeGen/X86/tailccpic1.ll > llvm/trunk/test/CodeGen/X86/tailccpic2.ll > llvm/trunk/test/CodeGen/X86/tailccstack64.ll > Modified: > llvm/trunk/docs/BitCodeFormat.rst > llvm/trunk/docs/CodeGenerator.rst > llvm/trunk/docs/LangRef.rst > llvm/trunk/include/llvm/IR/CallingConv.h > llvm/trunk/lib/AsmParser/LLLexer.cpp > llvm/trunk/lib/AsmParser/LLParser.cpp > llvm/trunk/lib/AsmParser/LLToken.h > llvm/trunk/lib/CodeGen/Analysis.cpp > llvm/trunk/lib/IR/AsmWriter.cpp > llvm/trunk/lib/Target/X86/X86CallingConv.td > llvm/trunk/lib/Target/X86/X86FastISel.cpp > llvm/trunk/lib/Target/X86/X86FrameLowering.cpp > llvm/trunk/lib/Target/X86/X86ISelLowering.cpp > llvm/trunk/lib/Target/X86/X86Subtarget.h > llvm/trunk/utils/vim/syntax/llvm.vim > > Modified: llvm/trunk/docs/BitCodeFormat.rst > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/docs/BitCodeFormat.rst?rev=373976&r1=373975&r2=373976&view=diff > ============================================================================== > --- llvm/trunk/docs/BitCodeFormat.rst (original) > +++ llvm/trunk/docs/BitCodeFormat.rst Mon Oct 7 15:28:58 2019 > @@ -794,6 +794,7 @@ function. 
The operand fields are: > * ``preserve_allcc``: code 15 > * ``swiftcc`` : code 16 > * ``cxx_fast_tlscc``: code 17 > + * ``tailcc`` : code 18 > * ``x86_stdcallcc``: code 64 > * ``x86_fastcallcc``: code 65 > * ``arm_apcscc``: code 66 > > Modified: llvm/trunk/docs/CodeGenerator.rst > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/docs/CodeGenerator.rst?rev=373976&r1=373975&r2=373976&view=diff > ============================================================================== > --- llvm/trunk/docs/CodeGenerator.rst (original) > +++ llvm/trunk/docs/CodeGenerator.rst Mon Oct 7 15:28:58 2019 > @@ -2068,12 +2068,12 @@ supported on x86/x86-64, PowerPC, and We > and PowerPC if: > > * Caller and callee have the calling convention ``fastcc``, ``cc 10`` (GHC > - calling convention) or ``cc 11`` (HiPE calling convention). > + calling convention), ``cc 11`` (HiPE calling convention), or ``tailcc``. > > * The call is a tail call - in tail position (ret immediately follows call and > ret uses value of call or is void). > > -* Option ``-tailcallopt`` is enabled. > +* Option ``-tailcallopt`` is enabled or the calling convention is ``tailcc``. > > * Platform-specific constraints are met. > > > Modified: llvm/trunk/docs/LangRef.rst > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/docs/LangRef.rst?rev=373976&r1=373975&r2=373976&view=diff > ============================================================================== > --- llvm/trunk/docs/LangRef.rst (original) > +++ llvm/trunk/docs/LangRef.rst Mon Oct 7 15:28:58 2019 > @@ -299,7 +299,7 @@ added in the future: > allows the target to use whatever tricks it wants to produce fast > code for the target, without having to conform to an externally > specified ABI (Application Binary Interface). `Tail calls can only > - be optimized when this, the GHC or the HiPE convention is > + be optimized when this, the tailcc, the GHC or the HiPE convention is > used. 
`_ This calling convention does not > support varargs and requires the prototype of all callees to exactly > match the prototype of the function definition. > @@ -436,6 +436,14 @@ added in the future: > - On X86-64 RCX and R8 are available for additional integer returns, and > XMM2 and XMM3 are available for additional FP/vector returns. > - On iOS platforms, we use AAPCS-VFP calling convention. > +"``tailcc``" - Tail callable calling convention > + This calling convention ensures that calls in tail position will always be > + tail call optimized. This calling convention is equivalent to fastcc, > + except for an additional guarantee that tail calls will be produced > + whenever possible. `Tail calls can only be optimized when this, the fastcc, > + the GHC or the HiPE convention is used. `_ This > + calling convention does not support varargs and requires the prototype of > + all callees to exactly match the prototype of the function definition. > "``cc ``" - Numbered convention > Any calling convention may be specified by number, allowing > target-specific calling conventions to be used. Target specific > @@ -10232,11 +10240,12 @@ This instruction requires several argume > Tail call optimization for calls marked ``tail`` is guaranteed to occur if > the following conditions are met: > > - - Caller and callee both have the calling convention ``fastcc``. > + - Caller and callee both have the calling convention ``fastcc`` or ``tailcc``. > - The call is in tail position (ret immediately follows call and ret > uses value of call or is void). > - - Option ``-tailcallopt`` is enabled, or > - ``llvm::GuaranteedTailCallOpt`` is ``true``. > + - Option ``-tailcallopt`` is enabled, > + ``llvm::GuaranteedTailCallOpt`` is ``true``, or the calling convention > + is ``tailcc`` > - `Platform-specific constraints are > met. 
`_ > > > Modified: llvm/trunk/include/llvm/IR/CallingConv.h > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/IR/CallingConv.h?rev=373976&r1=373975&r2=373976&view=diff > ============================================================================== > --- llvm/trunk/include/llvm/IR/CallingConv.h (original) > +++ llvm/trunk/include/llvm/IR/CallingConv.h Mon Oct 7 15:28:58 2019 > @@ -75,6 +75,11 @@ namespace CallingConv { > // CXX_FAST_TLS - Calling convention for access functions. > CXX_FAST_TLS = 17, > > + /// Tail - This calling convention attemps to make calls as fast as > + /// possible while guaranteeing that tail call optimization can always > + /// be performed. > + Tail = 18, > + > // Target - This is the start of the target-specific calling conventions, > // e.g. fastcall and thiscall on X86. > FirstTargetCC = 64, > > Modified: llvm/trunk/lib/AsmParser/LLLexer.cpp > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/AsmParser/LLLexer.cpp?rev=373976&r1=373975&r2=373976&view=diff > ============================================================================== > --- llvm/trunk/lib/AsmParser/LLLexer.cpp (original) > +++ llvm/trunk/lib/AsmParser/LLLexer.cpp Mon Oct 7 15:28:58 2019 > @@ -622,6 +622,7 @@ lltok::Kind LLLexer::LexIdentifier() { > KEYWORD(amdgpu_ps); > KEYWORD(amdgpu_cs); > KEYWORD(amdgpu_kernel); > + KEYWORD(tailcc); > > KEYWORD(cc); > KEYWORD(c); > > Modified: llvm/trunk/lib/AsmParser/LLParser.cpp > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/AsmParser/LLParser.cpp?rev=373976&r1=373975&r2=373976&view=diff > ============================================================================== > --- llvm/trunk/lib/AsmParser/LLParser.cpp (original) > +++ llvm/trunk/lib/AsmParser/LLParser.cpp Mon Oct 7 15:28:58 2019 > @@ -1955,6 +1955,7 @@ void LLParser::ParseOptionalDLLStorageCl > /// ::= 'amdgpu_ps' > /// ::= 'amdgpu_cs' > /// ::= 'amdgpu_kernel' > +/// ::= 'tailcc' > /// ::= 'cc' UINT > /// > bool 
LLParser::ParseOptionalCallingConv(unsigned &CC) { > @@ -2000,6 +2001,7 @@ bool LLParser::ParseOptionalCallingConv( > case lltok::kw_amdgpu_ps: CC = CallingConv::AMDGPU_PS; break; > case lltok::kw_amdgpu_cs: CC = CallingConv::AMDGPU_CS; break; > case lltok::kw_amdgpu_kernel: CC = CallingConv::AMDGPU_KERNEL; break; > + case lltok::kw_tailcc: CC = CallingConv::Tail; break; > case lltok::kw_cc: { > Lex.Lex(); > return ParseUInt32(CC); > > Modified: llvm/trunk/lib/AsmParser/LLToken.h > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/AsmParser/LLToken.h?rev=373976&r1=373975&r2=373976&view=diff > ============================================================================== > --- llvm/trunk/lib/AsmParser/LLToken.h (original) > +++ llvm/trunk/lib/AsmParser/LLToken.h Mon Oct 7 15:28:58 2019 > @@ -168,6 +168,7 @@ enum Kind { > kw_amdgpu_ps, > kw_amdgpu_cs, > kw_amdgpu_kernel, > + kw_tailcc, > > // Attributes: > kw_attributes, > > Modified: llvm/trunk/lib/CodeGen/Analysis.cpp > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/Analysis.cpp?rev=373976&r1=373975&r2=373976&view=diff > ============================================================================== > --- llvm/trunk/lib/CodeGen/Analysis.cpp (original) > +++ llvm/trunk/lib/CodeGen/Analysis.cpp Mon Oct 7 15:28:58 2019 > @@ -523,7 +523,8 @@ bool llvm::isInTailCallPosition(Immutabl > // longjmp on x86), it can end up causing miscompilation that has not > // been fully understood. 
> if (!Ret && > - (!TM.Options.GuaranteedTailCallOpt || !isa(Term))) > + ((!TM.Options.GuaranteedTailCallOpt && > + CS.getCallingConv() != CallingConv::Tail) || !isa(Term))) > return false; > > // If I will have a chain, make sure no other instruction that will have a > > Modified: llvm/trunk/lib/IR/AsmWriter.cpp > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/IR/AsmWriter.cpp?rev=373976&r1=373975&r2=373976&view=diff > ============================================================================== > --- llvm/trunk/lib/IR/AsmWriter.cpp (original) > +++ llvm/trunk/lib/IR/AsmWriter.cpp Mon Oct 7 15:28:58 2019 > @@ -352,6 +352,7 @@ static void PrintCallingConv(unsigned cc > case CallingConv::PreserveAll: Out << "preserve_allcc"; break; > case CallingConv::CXX_FAST_TLS: Out << "cxx_fast_tlscc"; break; > case CallingConv::GHC: Out << "ghccc"; break; > + case CallingConv::Tail: Out << "tailcc"; break; > case CallingConv::X86_StdCall: Out << "x86_stdcallcc"; break; > case CallingConv::X86_FastCall: Out << "x86_fastcallcc"; break; > case CallingConv::X86_ThisCall: Out << "x86_thiscallcc"; break; > > Modified: llvm/trunk/lib/Target/X86/X86CallingConv.td > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86CallingConv.td?rev=373976&r1=373975&r2=373976&view=diff > ============================================================================== > --- llvm/trunk/lib/Target/X86/X86CallingConv.td (original) > +++ llvm/trunk/lib/Target/X86/X86CallingConv.td Mon Oct 7 15:28:58 2019 > @@ -433,6 +433,7 @@ defm X86_SysV64_RegCall : > def RetCC_X86_32 : CallingConv<[ > // If FastCC, use RetCC_X86_32_Fast. > CCIfCC<"CallingConv::Fast", CCDelegateTo>, > + CCIfCC<"CallingConv::Tail", CCDelegateTo>, > // If HiPE, use RetCC_X86_32_HiPE. 
> CCIfCC<"CallingConv::HiPE", CCDelegateTo>, > CCIfCC<"CallingConv::X86_VectorCall", CCDelegateTo>, > @@ -1000,6 +1001,7 @@ def CC_X86_32 : CallingConv<[ > CCIfCC<"CallingConv::X86_VectorCall", CCDelegateTo>, > CCIfCC<"CallingConv::X86_ThisCall", CCDelegateTo>, > CCIfCC<"CallingConv::Fast", CCDelegateTo>, > + CCIfCC<"CallingConv::Tail", CCDelegateTo>, > CCIfCC<"CallingConv::GHC", CCDelegateTo>, > CCIfCC<"CallingConv::HiPE", CCDelegateTo>, > CCIfCC<"CallingConv::X86_RegCall", CCDelegateTo>, > > Modified: llvm/trunk/lib/Target/X86/X86FastISel.cpp > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86FastISel.cpp?rev=373976&r1=373975&r2=373976&view=diff > ============================================================================== > --- llvm/trunk/lib/Target/X86/X86FastISel.cpp (original) > +++ llvm/trunk/lib/Target/X86/X86FastISel.cpp Mon Oct 7 15:28:58 2019 > @@ -1160,6 +1160,7 @@ bool X86FastISel::X86SelectRet(const Ins > CallingConv::ID CC = F.getCallingConv(); > if (CC != CallingConv::C && > CC != CallingConv::Fast && > + CC != CallingConv::Tail && > CC != CallingConv::X86_FastCall && > CC != CallingConv::X86_StdCall && > CC != CallingConv::X86_ThisCall && > @@ -1173,7 +1174,8 @@ bool X86FastISel::X86SelectRet(const Ins > > // fastcc with -tailcallopt is intended to provide a guaranteed > // tail call optimization. Fastisel doesn't know how to do that. > - if (CC == CallingConv::Fast && TM.Options.GuaranteedTailCallOpt) > + if ((CC == CallingConv::Fast && TM.Options.GuaranteedTailCallOpt) || > + CC == CallingConv::Tail) > return false; > > // Let SDISel handle vararg functions. 
> @@ -3157,7 +3159,7 @@ static unsigned computeBytesPoppedByCall > if (Subtarget->getTargetTriple().isOSMSVCRT()) > return 0; > if (CC == CallingConv::Fast || CC == CallingConv::GHC || > - CC == CallingConv::HiPE) > + CC == CallingConv::HiPE || CC == CallingConv::Tail) > return 0; > > if (CS) > @@ -3208,6 +3210,7 @@ bool X86FastISel::fastLowerCall(CallLowe > default: return false; > case CallingConv::C: > case CallingConv::Fast: > + case CallingConv::Tail: > case CallingConv::WebKit_JS: > case CallingConv::Swift: > case CallingConv::X86_FastCall: > @@ -3224,7 +3227,8 @@ bool X86FastISel::fastLowerCall(CallLowe > > // fastcc with -tailcallopt is intended to provide a guaranteed > // tail call optimization. Fastisel doesn't know how to do that. > - if (CC == CallingConv::Fast && TM.Options.GuaranteedTailCallOpt) > + if ((CC == CallingConv::Fast && TM.Options.GuaranteedTailCallOpt) || > + CC == CallingConv::Tail) > return false; > > // Don't know how to handle Win64 varargs yet. Nothing special needed for > > Modified: llvm/trunk/lib/Target/X86/X86FrameLowering.cpp > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86FrameLowering.cpp?rev=373976&r1=373975&r2=373976&view=diff > ============================================================================== > --- llvm/trunk/lib/Target/X86/X86FrameLowering.cpp (original) > +++ llvm/trunk/lib/Target/X86/X86FrameLowering.cpp Mon Oct 7 15:28:58 2019 > @@ -2269,7 +2269,8 @@ GetScratchRegister(bool Is64Bit, bool Is > bool IsNested = HasNestArgument(&MF); > > if (CallingConvention == CallingConv::X86_FastCall || > - CallingConvention == CallingConv::Fast) { > + CallingConvention == CallingConv::Fast || > + CallingConvention == CallingConv::Tail) { > if (IsNested) > report_fatal_error("Segmented stacks does not support fastcall with " > "nested function."); > > Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp > URL: 
http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=373976&r1=373975&r2=373976&view=diff > ============================================================================== > --- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp (original) > +++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Mon Oct 7 15:28:58 2019 > @@ -2963,7 +2963,7 @@ static SDValue CreateCopyOfByValArgument > static bool canGuaranteeTCO(CallingConv::ID CC) { > return (CC == CallingConv::Fast || CC == CallingConv::GHC || > CC == CallingConv::X86_RegCall || CC == CallingConv::HiPE || > - CC == CallingConv::HHVM); > + CC == CallingConv::HHVM || CC == CallingConv::Tail); > } > > /// Return true if we might ever do TCO for calls with this calling convention. > @@ -2989,7 +2989,7 @@ static bool mayTailCallThisCC(CallingCon > /// Return true if the function is being made into a tailcall target by > /// changing its ABI. > static bool shouldGuaranteeTCO(CallingConv::ID CC, bool GuaranteedTailCallOpt) { > - return GuaranteedTailCallOpt && canGuaranteeTCO(CC); > + return (GuaranteedTailCallOpt && canGuaranteeTCO(CC)) || CC == CallingConv::Tail; > } > > bool X86TargetLowering::mayBeEmittedAsTailCall(const CallInst *CI) const { > @@ -3615,6 +3615,8 @@ X86TargetLowering::LowerCall(TargetLower > bool IsWin64 = Subtarget.isCallingConvWin64(CallConv); > StructReturnType SR = callIsStructReturn(Outs, Subtarget.isTargetMCU()); > bool IsSibcall = false; > + bool IsGuaranteeTCO = MF.getTarget().Options.GuaranteedTailCallOpt || > + CallConv == CallingConv::Tail; > X86MachineFunctionInfo *X86Info = MF.getInfo(); > auto Attr = MF.getFunction().getFnAttribute("disable-tail-calls"); > const auto *CI = dyn_cast_or_null(CLI.CS.getInstruction()); > @@ -3635,8 +3637,7 @@ X86TargetLowering::LowerCall(TargetLower > if (Attr.getValueAsString() == "true") > isTailCall = false; > > - if (Subtarget.isPICStyleGOT() && > - !MF.getTarget().Options.GuaranteedTailCallOpt) { > + if (Subtarget.isPICStyleGOT() 
&& !IsGuaranteeTCO) { > // If we are using a GOT, disable tail calls to external symbols with > // default visibility. Tail calling such a symbol requires using a GOT > // relocation, which forces early binding of the symbol. This breaks code > @@ -3663,7 +3664,7 @@ X86TargetLowering::LowerCall(TargetLower > > // Sibcalls are automatically detected tailcalls which do not require > // ABI changes. > - if (!MF.getTarget().Options.GuaranteedTailCallOpt && isTailCall) > + if (!IsGuaranteeTCO && isTailCall) > IsSibcall = true; > > if (isTailCall) > @@ -3695,8 +3696,7 @@ X86TargetLowering::LowerCall(TargetLower > // This is a sibcall. The memory operands are available in caller's > // own caller's stack. > NumBytes = 0; > - else if (MF.getTarget().Options.GuaranteedTailCallOpt && > - canGuaranteeTCO(CallConv)) > + else if (IsGuaranteeTCO && canGuaranteeTCO(CallConv)) > NumBytes = GetAlignedArgumentStackSize(NumBytes, DAG); > > int FPDiff = 0; > @@ -4321,6 +4321,8 @@ bool X86TargetLowering::IsEligibleForTai > bool CCMatch = CallerCC == CalleeCC; > bool IsCalleeWin64 = Subtarget.isCallingConvWin64(CalleeCC); > bool IsCallerWin64 = Subtarget.isCallingConvWin64(CallerCC); > + bool IsGuaranteeTCO = DAG.getTarget().Options.GuaranteedTailCallOpt || > + CalleeCC == CallingConv::Tail; > > // Win64 functions have extra shadow space for argument homing. Don't do the > // sibcall if the caller and callee have mismatched expectations for this > @@ -4328,7 +4330,7 @@ bool X86TargetLowering::IsEligibleForTai > if (IsCalleeWin64 != IsCallerWin64) > return false; > > - if (DAG.getTarget().Options.GuaranteedTailCallOpt) { > + if (IsGuaranteeTCO) { > if (canGuaranteeTCO(CalleeCC) && CCMatch) > return true; > return false; > @@ -24421,6 +24423,7 @@ SDValue X86TargetLowering::LowerINIT_TRA > case CallingConv::X86_FastCall: > case CallingConv::X86_ThisCall: > case CallingConv::Fast: > + case CallingConv::Tail: > // Pass 'nest' parameter in EAX. 
> // Must be kept in sync with X86CallingConv.td > NestReg = X86::EAX; > > Modified: llvm/trunk/lib/Target/X86/X86Subtarget.h > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86Subtarget.h?rev=373976&r1=373975&r2=373976&view=diff > ============================================================================== > --- llvm/trunk/lib/Target/X86/X86Subtarget.h (original) > +++ llvm/trunk/lib/Target/X86/X86Subtarget.h Mon Oct 7 15:28:58 2019 > @@ -815,6 +815,7 @@ public: > // On Win64, all these conventions just use the default convention. > case CallingConv::C: > case CallingConv::Fast: > + case CallingConv::Tail: > case CallingConv::Swift: > case CallingConv::X86_FastCall: > case CallingConv::X86_StdCall: > > Added: llvm/trunk/test/CodeGen/X86/musttail-tailcc.ll > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/musttail-tailcc.ll?rev=373976&view=auto > ============================================================================== > --- llvm/trunk/test/CodeGen/X86/musttail-tailcc.ll (added) > +++ llvm/trunk/test/CodeGen/X86/musttail-tailcc.ll Mon Oct 7 15:28:58 2019 > @@ -0,0 +1,114 @@ > +; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py > +; RUN: llc < %s -mtriple=x86_64-unknown-unknown | FileCheck %s -check-prefix=X64 > +; RUN: llc < %s -mtriple=i686-unknown-unknown | FileCheck %s -check-prefix=X32 > + > +; tailcc will turn all of these musttail calls into tail calls. 
> + > +declare tailcc i32 @tailcallee(i32 %a1, i32 %a2) > + > +define tailcc i32 @tailcaller(i32 %in1, i32 %in2) nounwind { > +; X64-LABEL: tailcaller: > +; X64: # %bb.0: # %entry > +; X64-NEXT: pushq %rax > +; X64-NEXT: popq %rax > +; X64-NEXT: jmp tailcallee # TAILCALL > +; > +; X32-LABEL: tailcaller: > +; X32: # %bb.0: # %entry > +; X32-NEXT: jmp tailcallee # TAILCALL > +entry: > + %tmp11 = musttail call tailcc i32 @tailcallee(i32 %in1, i32 %in2) > + ret i32 %tmp11 > +} > + > +declare tailcc i8* @alias_callee() > + > +define tailcc noalias i8* @noalias_caller() nounwind { > +; X64-LABEL: noalias_caller: > +; X64: # %bb.0: > +; X64-NEXT: pushq %rax > +; X64-NEXT: popq %rax > +; X64-NEXT: jmp alias_callee # TAILCALL > +; > +; X32-LABEL: noalias_caller: > +; X32: # %bb.0: > +; X32-NEXT: jmp alias_callee # TAILCALL > + %p = musttail call tailcc i8* @alias_callee() > + ret i8* %p > +} > + > +declare tailcc noalias i8* @noalias_callee() > + > +define tailcc i8* @alias_caller() nounwind { > +; X64-LABEL: alias_caller: > +; X64: # %bb.0: > +; X64-NEXT: pushq %rax > +; X64-NEXT: popq %rax > +; X64-NEXT: jmp noalias_callee # TAILCALL > +; > +; X32-LABEL: alias_caller: > +; X32: # %bb.0: > +; X32-NEXT: jmp noalias_callee # TAILCALL > + %p = musttail call tailcc noalias i8* @noalias_callee() > + ret i8* %p > +} > + > +define tailcc void @void_test(i32, i32, i32, i32) { > +; X64-LABEL: void_test: > +; X64: # %bb.0: # %entry > +; X64-NEXT: pushq %rax > +; X64-NEXT: .cfi_def_cfa_offset 16 > +; X64-NEXT: popq %rax > +; X64-NEXT: .cfi_def_cfa_offset 8 > +; X64-NEXT: jmp void_test # TAILCALL > +; > +; X32-LABEL: void_test: > +; X32: # %bb.0: # %entry > +; X32-NEXT: pushl %esi > +; X32-NEXT: .cfi_def_cfa_offset 8 > +; X32-NEXT: subl $8, %esp > +; X32-NEXT: .cfi_def_cfa_offset 16 > +; X32-NEXT: .cfi_offset %esi, -8 > +; X32-NEXT: movl {{[0-9]+}}(%esp), %eax > +; X32-NEXT: movl {{[0-9]+}}(%esp), %esi > +; X32-NEXT: movl %esi, {{[0-9]+}}(%esp) > +; X32-NEXT: movl %eax, 
{{[0-9]+}}(%esp) > +; X32-NEXT: addl $8, %esp > +; X32-NEXT: .cfi_def_cfa_offset 8 > +; X32-NEXT: popl %esi > +; X32-NEXT: .cfi_def_cfa_offset 4 > +; X32-NEXT: jmp void_test # TAILCALL > + entry: > + musttail call tailcc void @void_test( i32 %0, i32 %1, i32 %2, i32 %3) > + ret void > +} > + > +define tailcc i1 @i1test(i32, i32, i32, i32) { > +; X64-LABEL: i1test: > +; X64: # %bb.0: # %entry > +; X64-NEXT: pushq %rax > +; X64-NEXT: .cfi_def_cfa_offset 16 > +; X64-NEXT: popq %rax > +; X64-NEXT: .cfi_def_cfa_offset 8 > +; X64-NEXT: jmp i1test # TAILCALL > +; > +; X32-LABEL: i1test: > +; X32: # %bb.0: # %entry > +; X32-NEXT: pushl %esi > +; X32-NEXT: .cfi_def_cfa_offset 8 > +; X32-NEXT: subl $8, %esp > +; X32-NEXT: .cfi_def_cfa_offset 16 > +; X32-NEXT: .cfi_offset %esi, -8 > +; X32-NEXT: movl {{[0-9]+}}(%esp), %eax > +; X32-NEXT: movl {{[0-9]+}}(%esp), %esi > +; X32-NEXT: movl %esi, {{[0-9]+}}(%esp) > +; X32-NEXT: movl %eax, {{[0-9]+}}(%esp) > +; X32-NEXT: addl $8, %esp > +; X32-NEXT: .cfi_def_cfa_offset 8 > +; X32-NEXT: popl %esi > +; X32-NEXT: .cfi_def_cfa_offset 4 > +; X32-NEXT: jmp i1test # TAILCALL > + entry: > + %4 = musttail call tailcc i1 @i1test( i32 %0, i32 %1, i32 %2, i32 %3) > + ret i1 %4 > +} > > Added: llvm/trunk/test/CodeGen/X86/tailcall-tailcc.ll > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/tailcall-tailcc.ll?rev=373976&view=auto > ============================================================================== > --- llvm/trunk/test/CodeGen/X86/tailcall-tailcc.ll (added) > +++ llvm/trunk/test/CodeGen/X86/tailcall-tailcc.ll Mon Oct 7 15:28:58 2019 > @@ -0,0 +1,155 @@ > +; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py > +; RUN: llc < %s -mtriple=x86_64-unknown-unknown | FileCheck %s -check-prefix=X64 > +; RUN: llc < %s -mtriple=i686-unknown-unknown | FileCheck %s -check-prefix=X32 > + > +; With -tailcallopt, CodeGen guarantees a tail call optimization > +; for all of these. 
> + > +declare tailcc i32 @tailcallee(i32 %a1, i32 %a2, i32 %a3, i32 %a4) > + > +define tailcc i32 @tailcaller(i32 %in1, i32 %in2) nounwind { > +; X64-LABEL: tailcaller: > +; X64: # %bb.0: # %entry > +; X64-NEXT: pushq %rax > +; X64-NEXT: movl %edi, %edx > +; X64-NEXT: movl %esi, %ecx > +; X64-NEXT: popq %rax > +; X64-NEXT: jmp tailcallee # TAILCALL > +; > +; X32-LABEL: tailcaller: > +; X32: # %bb.0: # %entry > +; X32-NEXT: subl $16, %esp > +; X32-NEXT: movl %ecx, {{[0-9]+}}(%esp) > +; X32-NEXT: movl {{[0-9]+}}(%esp), %eax > +; X32-NEXT: movl %edx, {{[0-9]+}}(%esp) > +; X32-NEXT: movl %eax, {{[0-9]+}}(%esp) > +; X32-NEXT: addl $8, %esp > +; X32-NEXT: jmp tailcallee # TAILCALL > +entry: > + %tmp11 = tail call tailcc i32 @tailcallee(i32 %in1, i32 %in2, i32 %in1, i32 %in2) > + ret i32 %tmp11 > +} > + > +declare tailcc i8* @alias_callee() > + > +define tailcc noalias i8* @noalias_caller() nounwind { > +; X64-LABEL: noalias_caller: > +; X64: # %bb.0: > +; X64-NEXT: pushq %rax > +; X64-NEXT: popq %rax > +; X64-NEXT: jmp alias_callee # TAILCALL > +; > +; X32-LABEL: noalias_caller: > +; X32: # %bb.0: > +; X32-NEXT: jmp alias_callee # TAILCALL > + %p = tail call tailcc i8* @alias_callee() > + ret i8* %p > +} > + > +declare tailcc noalias i8* @noalias_callee() > + > +define tailcc i8* @alias_caller() nounwind { > +; X64-LABEL: alias_caller: > +; X64: # %bb.0: > +; X64-NEXT: pushq %rax > +; X64-NEXT: popq %rax > +; X64-NEXT: jmp noalias_callee # TAILCALL > +; > +; X32-LABEL: alias_caller: > +; X32: # %bb.0: > +; X32-NEXT: jmp noalias_callee # TAILCALL > + %p = tail call tailcc noalias i8* @noalias_callee() > + ret i8* %p > +} > + > +declare tailcc i32 @i32_callee() > + > +define tailcc i32 @ret_undef() nounwind { > +; X64-LABEL: ret_undef: > +; X64: # %bb.0: > +; X64-NEXT: pushq %rax > +; X64-NEXT: popq %rax > +; X64-NEXT: jmp i32_callee # TAILCALL > +; > +; X32-LABEL: ret_undef: > +; X32: # %bb.0: > +; X32-NEXT: jmp i32_callee # TAILCALL > + %p = tail call tailcc i32 
@i32_callee() > + ret i32 undef > +} > + > +declare tailcc void @does_not_return() > + > +define tailcc i32 @noret() nounwind { > +; X64-LABEL: noret: > +; X64: # %bb.0: > +; X64-NEXT: pushq %rax > +; X64-NEXT: popq %rax > +; X64-NEXT: jmp does_not_return # TAILCALL > +; > +; X32-LABEL: noret: > +; X32: # %bb.0: > +; X32-NEXT: jmp does_not_return # TAILCALL > + tail call tailcc void @does_not_return() > + unreachable > +} > + > +define tailcc void @void_test(i32, i32, i32, i32) { > +; X64-LABEL: void_test: > +; X64: # %bb.0: # %entry > +; X64-NEXT: pushq %rax > +; X64-NEXT: .cfi_def_cfa_offset 16 > +; X64-NEXT: popq %rax > +; X64-NEXT: .cfi_def_cfa_offset 8 > +; X64-NEXT: jmp void_test # TAILCALL > +; > +; X32-LABEL: void_test: > +; X32: # %bb.0: # %entry > +; X32-NEXT: pushl %esi > +; X32-NEXT: .cfi_def_cfa_offset 8 > +; X32-NEXT: subl $8, %esp > +; X32-NEXT: .cfi_def_cfa_offset 16 > +; X32-NEXT: .cfi_offset %esi, -8 > +; X32-NEXT: movl {{[0-9]+}}(%esp), %eax > +; X32-NEXT: movl {{[0-9]+}}(%esp), %esi > +; X32-NEXT: movl %esi, {{[0-9]+}}(%esp) > +; X32-NEXT: movl %eax, {{[0-9]+}}(%esp) > +; X32-NEXT: addl $8, %esp > +; X32-NEXT: .cfi_def_cfa_offset 8 > +; X32-NEXT: popl %esi > +; X32-NEXT: .cfi_def_cfa_offset 4 > +; X32-NEXT: jmp void_test # TAILCALL > + entry: > + tail call tailcc void @void_test( i32 %0, i32 %1, i32 %2, i32 %3) > + ret void > +} > + > +define tailcc i1 @i1test(i32, i32, i32, i32) { > +; X64-LABEL: i1test: > +; X64: # %bb.0: # %entry > +; X64-NEXT: pushq %rax > +; X64-NEXT: .cfi_def_cfa_offset 16 > +; X64-NEXT: popq %rax > +; X64-NEXT: .cfi_def_cfa_offset 8 > +; X64-NEXT: jmp i1test # TAILCALL > +; > +; X32-LABEL: i1test: > +; X32: # %bb.0: # %entry > +; X32-NEXT: pushl %esi > +; X32-NEXT: .cfi_def_cfa_offset 8 > +; X32-NEXT: subl $8, %esp > +; X32-NEXT: .cfi_def_cfa_offset 16 > +; X32-NEXT: .cfi_offset %esi, -8 > +; X32-NEXT: movl {{[0-9]+}}(%esp), %eax > +; X32-NEXT: movl {{[0-9]+}}(%esp), %esi > +; X32-NEXT: movl %esi, {{[0-9]+}}(%esp) > +; 
X32-NEXT: movl %eax, {{[0-9]+}}(%esp) > +; X32-NEXT: addl $8, %esp > +; X32-NEXT: .cfi_def_cfa_offset 8 > +; X32-NEXT: popl %esi > +; X32-NEXT: .cfi_def_cfa_offset 4 > +; X32-NEXT: jmp i1test # TAILCALL > + entry: > + %4 = tail call tailcc i1 @i1test( i32 %0, i32 %1, i32 %2, i32 %3) > + ret i1 %4 > +} > > Added: llvm/trunk/test/CodeGen/X86/tailcc-calleesave.ll > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/tailcc-calleesave.ll?rev=373976&view=auto > ============================================================================== > --- llvm/trunk/test/CodeGen/X86/tailcc-calleesave.ll (added) > +++ llvm/trunk/test/CodeGen/X86/tailcc-calleesave.ll Mon Oct 7 15:28:58 2019 > @@ -0,0 +1,19 @@ > +; RUN: llc -mcpu=core < %s | FileCheck %s > + > +target triple = "i686-apple-darwin" > + > +declare tailcc void @foo(i32, i32, i32, i32, i32, i32) > +declare i32* @bar(i32*) > + > +define tailcc void @hoge(i32 %b) nounwind { > +; Do not overwrite pushed callee-save registers > +; CHECK: pushl > +; CHECK: subl $[[SIZE:[0-9]+]], %esp > +; CHECK-NOT: [[SIZE]](%esp) > + %a = alloca i32 > + store i32 0, i32* %a > + %d = tail call i32* @bar(i32* %a) nounwind > + store i32 %b, i32* %d > + tail call tailcc void @foo(i32 1, i32 2, i32 3, i32 4, i32 5, i32 6) nounwind > + ret void > +} > > Added: llvm/trunk/test/CodeGen/X86/tailcc-disable-tail-calls.ll > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/tailcc-disable-tail-calls.ll?rev=373976&view=auto > ============================================================================== > --- llvm/trunk/test/CodeGen/X86/tailcc-disable-tail-calls.ll (added) > +++ llvm/trunk/test/CodeGen/X86/tailcc-disable-tail-calls.ll Mon Oct 7 15:28:58 2019 > @@ -0,0 +1,40 @@ > +; RUN: llc < %s -mtriple=x86_64-- | FileCheck %s --check-prefix=NO-OPTION > +; RUN: llc < %s -mtriple=x86_64-- -disable-tail-calls | FileCheck %s --check-prefix=DISABLE-TRUE > +; RUN: llc < %s -mtriple=x86_64-- -disable-tail-calls=false | 
FileCheck %s --check-prefix=DISABLE-FALSE > + > +; Check that command line option "-disable-tail-calls" overrides function > +; attribute "disable-tail-calls". > + > +; NO-OPTION-LABEL: {{\_?}}func_attr > +; NO-OPTION: callq {{\_?}}callee > + > +; DISABLE-FALSE-LABEL: {{\_?}}func_attr > +; DISABLE-FALSE: jmp {{\_?}}callee > + > +; DISABLE-TRUE-LABEL: {{\_?}}func_attr > +; DISABLE-TRUE: callq {{\_?}}callee > + > +define tailcc i32 @func_attr(i32 %a) #0 { > +entry: > + %call = tail call tailcc i32 @callee(i32 %a) > + ret i32 %call > +} > + > +; NO-OPTION-LABEL: {{\_?}}func_noattr > +; NO-OPTION: jmp {{\_?}}callee > + > +; DISABLE-FALSE-LABEL: {{\_?}}func_noattr > +; DISABLE-FALSE: jmp {{\_?}}callee > + > +; DISABLE-TRUE-LABEL: {{\_?}}func_noattr > +; DISABLE-TRUE: callq {{\_?}}callee > + > +define tailcc i32 @func_noattr(i32 %a) { > +entry: > + %call = tail call tailcc i32 @callee(i32 %a) > + ret i32 %call > +} > + > +declare tailcc i32 @callee(i32) > + > +attributes #0 = { "disable-tail-calls"="true" } > > Added: llvm/trunk/test/CodeGen/X86/tailcc-fastcc.ll > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/tailcc-fastcc.ll?rev=373976&view=auto > ============================================================================== > --- llvm/trunk/test/CodeGen/X86/tailcc-fastcc.ll (added) > +++ llvm/trunk/test/CodeGen/X86/tailcc-fastcc.ll Mon Oct 7 15:28:58 2019 > @@ -0,0 +1,49 @@ > +; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py > +; RUN: llc -tailcallopt < %s -mtriple=x86_64-unknown-unknown | FileCheck %s -check-prefix=X64 > +; RUN: llc -tailcallopt < %s -mtriple=i686-unknown-unknown | FileCheck %s -check-prefix=X32 > + > +; llc -tailcallopt should not enable tail calls from fastcc to tailcc or vice versa > + > +declare tailcc i32 @tailcallee1(i32 %a1, i32 %a2, i32 %a3, i32 %a4) > + > +define fastcc i32 @tailcaller1(i32 %in1, i32 %in2) nounwind { > +; X64-LABEL: tailcaller1: > +; X64: # %bb.0: # %entry > +; X64-NEXT: 
pushq %rax > +; X64-NEXT: movl %edi, %edx > +; X64-NEXT: movl %esi, %ecx > +; X64-NEXT: callq tailcallee1 > +; X64-NEXT: retq $8 > +; > +; X32-LABEL: tailcaller1: > +; X32: # %bb.0: # %entry > +; X32-NEXT: pushl %edx > +; X32-NEXT: pushl %ecx > +; X32-NEXT: calll tailcallee1 > +; X32-NEXT: retl > +entry: > + %tmp11 = tail call tailcc i32 @tailcallee1(i32 %in1, i32 %in2, i32 %in1, i32 %in2) > + ret i32 %tmp11 > +} > + > +declare fastcc i32 @tailcallee2(i32 %a1, i32 %a2, i32 %a3, i32 %a4) > + > +define tailcc i32 @tailcaller2(i32 %in1, i32 %in2) nounwind { > +; X64-LABEL: tailcaller2: > +; X64: # %bb.0: # %entry > +; X64-NEXT: pushq %rax > +; X64-NEXT: movl %edi, %edx > +; X64-NEXT: movl %esi, %ecx > +; X64-NEXT: callq tailcallee2 > +; X64-NEXT: retq $8 > +; > +; X32-LABEL: tailcaller2: > +; X32: # %bb.0: # %entry > +; X32-NEXT: pushl %edx > +; X32-NEXT: pushl %ecx > +; X32-NEXT: calll tailcallee2 > +; X32-NEXT: retl > +entry: > + %tmp11 = tail call fastcc i32 @tailcallee2(i32 %in1, i32 %in2, i32 %in1, i32 %in2) > + ret i32 %tmp11 > +} > > Added: llvm/trunk/test/CodeGen/X86/tailcc-fastisel.ll > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/tailcc-fastisel.ll?rev=373976&view=auto > ============================================================================== > --- llvm/trunk/test/CodeGen/X86/tailcc-fastisel.ll (added) > +++ llvm/trunk/test/CodeGen/X86/tailcc-fastisel.ll Mon Oct 7 15:28:58 2019 > @@ -0,0 +1,18 @@ > +; RUN: llc < %s -mtriple=x86_64-apple-darwin -fast-isel -fast-isel-abort=1 | FileCheck %s > + > +%0 = type { i64, i32, i8* } > + > +define tailcc i8* @"visit_array_aux<`Reference>"(%0 %arg, i32 %arg1) nounwind { > +fail: ; preds = %entry > + %tmp20 = tail call tailcc i8* @"visit_array_aux<`Reference>"(%0 %arg, i32 undef) ; [#uses=1] > +; CHECK: jmp "_visit_array_aux<`Reference>" ## TAILCALL > + ret i8* %tmp20 > +} > + > +define i32 @foo() nounwind { > +entry: > + %0 = tail call i32 (...) 
@bar() nounwind ; [#uses=1] > + ret i32 %0 > +} > + > +declare i32 @bar(...) nounwind > > Added: llvm/trunk/test/CodeGen/X86/tailcc-largecode.ll > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/tailcc-largecode.ll?rev=373976&view=auto > ============================================================================== > --- llvm/trunk/test/CodeGen/X86/tailcc-largecode.ll (added) > +++ llvm/trunk/test/CodeGen/X86/tailcc-largecode.ll Mon Oct 7 15:28:58 2019 > @@ -0,0 +1,71 @@ > +; RUN: llc < %s -mtriple=x86_64-linux-gnu -code-model=large -enable-misched=false | FileCheck %s > + > +declare tailcc i32 @callee(i32 %arg) > +define tailcc i32 @directcall(i32 %arg) { > +entry: > +; This is the large code model, so &callee may not fit into the jmp > +; instruction. Instead, stick it into a register. > +; CHECK: movabsq $callee, [[REGISTER:%r[a-z0-9]+]] > +; CHECK: jmpq *[[REGISTER]] # TAILCALL > + %res = tail call tailcc i32 @callee(i32 %arg) > + ret i32 %res > +} > + > +; Check that the register used for an indirect tail call doesn't > +; clobber any of the arguments. > +define tailcc i32 @indirect_manyargs(i32(i32,i32,i32,i32,i32,i32,i32)* %target) { > +; Adjust the stack to enter the function. (The amount of the > +; adjustment may change in the future, in which case the location of > +; the stack argument and the return adjustment will change too.) > +; CHECK: pushq > +; Put the call target into R11, which won't be clobbered while restoring > +; callee-saved registers and won't be used for passing arguments. > +; CHECK: movq %rdi, %rax > +; Pass the stack argument. > +; CHECK: movl $7, 16(%rsp) > +; Pass the register arguments, in the right registers. > +; CHECK: movl $1, %edi > +; CHECK: movl $2, %esi > +; CHECK: movl $3, %edx > +; CHECK: movl $4, %ecx > +; CHECK: movl $5, %r8d > +; CHECK: movl $6, %r9d > +; Adjust the stack to "return". > +; CHECK: popq > +; And tail-call to the target. 
> +; CHECK: jmpq *%rax # TAILCALL > + %res = tail call tailcc i32 %target(i32 1, i32 2, i32 3, i32 4, i32 5, > + i32 6, i32 7) > + ret i32 %res > +} > + > +; Check that the register used for a direct tail call doesn't clobber > +; any of the arguments. > +declare tailcc i32 @manyargs_callee(i32,i32,i32,i32,i32,i32,i32) > +define tailcc i32 @direct_manyargs() { > +; Adjust the stack to enter the function. (The amount of the > +; adjustment may change in the future, in which case the location of > +; the stack argument and the return adjustment will change too.) > +; CHECK: pushq > +; Pass the stack argument. > +; CHECK: movl $7, 16(%rsp) > +; This is the large code model, so &manyargs_callee may not fit into > +; the jmp instruction. Put it into a register which won't be clobbered > +; while restoring callee-saved registers and won't be used for passing > +; arguments. > +; CHECK: movabsq $manyargs_callee, %rax > +; Pass the register arguments, in the right registers. > +; CHECK: movl $1, %edi > +; CHECK: movl $2, %esi > +; CHECK: movl $3, %edx > +; CHECK: movl $4, %ecx > +; CHECK: movl $5, %r8d > +; CHECK: movl $6, %r9d > +; Adjust the stack to "return". > +; CHECK: popq > +; And tail-call to the target. 
> +; CHECK: jmpq *%rax # TAILCALL > + %res = tail call tailcc i32 @manyargs_callee(i32 1, i32 2, i32 3, i32 4, > + i32 5, i32 6, i32 7) > + ret i32 %res > +} > > Added: llvm/trunk/test/CodeGen/X86/tailcc-stackalign.ll > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/tailcc-stackalign.ll?rev=373976&view=auto > ============================================================================== > --- llvm/trunk/test/CodeGen/X86/tailcc-stackalign.ll (added) > +++ llvm/trunk/test/CodeGen/X86/tailcc-stackalign.ll Mon Oct 7 15:28:58 2019 > @@ -0,0 +1,23 @@ > +; RUN: llc < %s -mtriple=i686-unknown-linux -no-x86-call-frame-opt | FileCheck %s > +; Linux has 8 byte alignment so the params cause stack size 20, > +; ensure that a normal tailcc call has matching stack size > + > + > +define tailcc i32 @tailcallee(i32 %a1, i32 %a2, i32 %a3, i32 %a4) { > + ret i32 %a3 > +} > + > +define tailcc i32 @tailcaller(i32 %in1, i32 %in2, i32 %in3, i32 %in4) { > + %tmp11 = tail call tailcc i32 @tailcallee(i32 %in1, i32 %in2, > + i32 %in1, i32 %in2) > + ret i32 %tmp11 > +} > + > +define i32 @main(i32 %argc, i8** %argv) { > + %tmp1 = call tailcc i32 @tailcaller( i32 1, i32 2, i32 3, i32 4 ) > + ; expect match subl [stacksize] here > + ret i32 0 > +} > + > +; CHECK: calll tailcaller > +; CHECK-NEXT: subl $12 > > Added: llvm/trunk/test/CodeGen/X86/tailcc-structret.ll > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/tailcc-structret.ll?rev=373976&view=auto > ============================================================================== > --- llvm/trunk/test/CodeGen/X86/tailcc-structret.ll (added) > +++ llvm/trunk/test/CodeGen/X86/tailcc-structret.ll Mon Oct 7 15:28:58 2019 > @@ -0,0 +1,7 @@ > +; RUN: llc < %s -mtriple=i686-unknown-linux | FileCheck %s > +define tailcc { { i8*, i8* }*, i8*} @init({ { i8*, i8* }*, i8*}, i32) { > +entry: > + %2 = tail call tailcc { { i8*, i8* }*, i8* } @init({ { i8*, i8*}*, i8*} %0, i32 %1) > + ret { { i8*, i8* }*, i8*} %2 > 
+; CHECK: jmp init > +} > > Added: llvm/trunk/test/CodeGen/X86/tailccbyval.ll > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/tailccbyval.ll?rev=373976&view=auto > ============================================================================== > --- llvm/trunk/test/CodeGen/X86/tailccbyval.ll (added) > +++ llvm/trunk/test/CodeGen/X86/tailccbyval.ll Mon Oct 7 15:28:58 2019 > @@ -0,0 +1,21 @@ > +; RUN: llc < %s -mtriple=i686-unknown-linux | FileCheck %s > +%struct.s = type {i32, i32, i32, i32, i32, i32, i32, i32, > + i32, i32, i32, i32, i32, i32, i32, i32, > + i32, i32, i32, i32, i32, i32, i32, i32 } > + > +define tailcc i32 @tailcallee(%struct.s* byval %a) nounwind { > +entry: > + %tmp2 = getelementptr %struct.s, %struct.s* %a, i32 0, i32 0 > + %tmp3 = load i32, i32* %tmp2 > + ret i32 %tmp3 > +; CHECK: tailcallee > +; CHECK: movl 4(%esp), %eax > +} > + > +define tailcc i32 @tailcaller(%struct.s* byval %a) nounwind { > +entry: > + %tmp4 = tail call tailcc i32 @tailcallee(%struct.s* byval %a ) > + ret i32 %tmp4 > +; CHECK: tailcaller > +; CHECK: jmp tailcallee > +} > > Added: llvm/trunk/test/CodeGen/X86/tailccbyval64.ll > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/tailccbyval64.ll?rev=373976&view=auto > ============================================================================== > --- llvm/trunk/test/CodeGen/X86/tailccbyval64.ll (added) > +++ llvm/trunk/test/CodeGen/X86/tailccbyval64.ll Mon Oct 7 15:28:58 2019 > @@ -0,0 +1,42 @@ > +; RUN: llc < %s -mcpu=generic -mtriple=x86_64-linux | FileCheck %s > + > +; FIXME: Win64 does not support byval. > + > +; Expect the entry point. > +; CHECK-LABEL: tailcaller: > + > +; Expect 2 rep;movs because of tail call byval lowering. > +; CHECK: rep; > +; CHECK: rep; > + > +; A sequence of copyto/copyfrom virtual registers is used to deal with byval > +; lowering appearing after moving arguments to registers. 
The following two > +; checks verify that the register allocator changes those sequences to direct > +; moves to argument register where it can (for registers that are not used in > +; byval lowering - not rsi, not rdi, not rcx). > +; Expect argument 4 to be moved directly to register edx. > +; CHECK: movl $7, %edx > + > +; Expect argument 6 to be moved directly to register r8. > +; CHECK: movl $17, %r8d > + > +; Expect not call but jmp to @tailcallee. > +; CHECK: jmp tailcallee > + > +; Expect the trailer. > +; CHECK: .size tailcaller > + > +%struct.s = type { i64, i64, i64, i64, i64, i64, i64, i64, > + i64, i64, i64, i64, i64, i64, i64, i64, > + i64, i64, i64, i64, i64, i64, i64, i64 } > + > +declare tailcc i64 @tailcallee(%struct.s* byval %a, i64 %val, i64 %val2, i64 %val3, i64 %val4, i64 %val5) > + > + > +define tailcc i64 @tailcaller(i64 %b, %struct.s* byval %a) { > +entry: > + %tmp2 = getelementptr %struct.s, %struct.s* %a, i32 0, i32 1 > + %tmp3 = load i64, i64* %tmp2, align 8 > + %tmp4 = tail call tailcc i64 @tailcallee(%struct.s* byval %a , i64 %tmp3, i64 %b, i64 7, i64 13, i64 17) > + ret i64 %tmp4 > +} > > Added: llvm/trunk/test/CodeGen/X86/tailccfp.ll > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/tailccfp.ll?rev=373976&view=auto > ============================================================================== > --- llvm/trunk/test/CodeGen/X86/tailccfp.ll (added) > +++ llvm/trunk/test/CodeGen/X86/tailccfp.ll Mon Oct 7 15:28:58 2019 > @@ -0,0 +1,6 @@ > +; RUN: llc < %s -mtriple=i686-- | FileCheck %s > +define tailcc i32 @bar(i32 %X, i32(double, i32) *%FP) { > + %Y = tail call tailcc i32 %FP(double 0.0, i32 %X) > + ret i32 %Y > +; CHECK: jmpl > +} > > Added: llvm/trunk/test/CodeGen/X86/tailccfp2.ll > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/tailccfp2.ll?rev=373976&view=auto > ============================================================================== > --- llvm/trunk/test/CodeGen/X86/tailccfp2.ll 
(added) > +++ llvm/trunk/test/CodeGen/X86/tailccfp2.ll Mon Oct 7 15:28:58 2019 > @@ -0,0 +1,27 @@ > +; RUN: llc < %s -mtriple=i686-- | FileCheck %s > + > +declare i32 @putchar(i32) > + > +define tailcc i32 @checktail(i32 %x, i32* %f, i32 %g) nounwind { > +; CHECK-LABEL: checktail: > + %tmp1 = icmp sgt i32 %x, 0 > + br i1 %tmp1, label %if-then, label %if-else > + > +if-then: > + %fun_ptr = bitcast i32* %f to i32(i32, i32*, i32)* > + %arg1 = add i32 %x, -1 > + call i32 @putchar(i32 90) > +; CHECK: jmpl *%e{{.*}} > + %res = tail call tailcc i32 %fun_ptr( i32 %arg1, i32 * %f, i32 %g) > + ret i32 %res > + > +if-else: > + ret i32 %x > +} > + > + > +define i32 @main() nounwind { > + %f = bitcast i32 (i32, i32*, i32)* @checktail to i32* > + %res = tail call tailcc i32 @checktail( i32 10, i32* %f,i32 10) > + ret i32 %res > +} > > Added: llvm/trunk/test/CodeGen/X86/tailccpic1.ll > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/tailccpic1.ll?rev=373976&view=auto > ============================================================================== > --- llvm/trunk/test/CodeGen/X86/tailccpic1.ll (added) > +++ llvm/trunk/test/CodeGen/X86/tailccpic1.ll Mon Oct 7 15:28:58 2019 > @@ -0,0 +1,16 @@ > +; RUN: llc < %s -mtriple=i686-pc-linux-gnu -relocation-model=pic | FileCheck %s > + > +; This test uses guaranteed TCO so these will be tail calls, despite the early > +; binding issues. 
> + > +define protected tailcc i32 @tailcallee(i32 %a1, i32 %a2, i32 %a3, i32 %a4) { > +entry: > + ret i32 %a3 > +} > + > +define tailcc i32 @tailcaller(i32 %in1, i32 %in2) { > +entry: > + %tmp11 = tail call tailcc i32 @tailcallee( i32 %in1, i32 %in2, i32 %in1, i32 %in2 ) ; [#uses=1] > + ret i32 %tmp11 > +; CHECK: jmp tailcallee > +} > > Added: llvm/trunk/test/CodeGen/X86/tailccpic2.ll > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/tailccpic2.ll?rev=373976&view=auto > ============================================================================== > --- llvm/trunk/test/CodeGen/X86/tailccpic2.ll (added) > +++ llvm/trunk/test/CodeGen/X86/tailccpic2.ll Mon Oct 7 15:28:58 2019 > @@ -0,0 +1,15 @@ > +; RUN: llc < %s -mtriple=i686-pc-linux-gnu -relocation-model=pic | FileCheck %s > + > +define tailcc i32 @tailcallee(i32 %a1, i32 %a2, i32 %a3, i32 %a4) { > +entry: > + ret i32 %a3 > +} > + > +define tailcc i32 @tailcaller(i32 %in1, i32 %in2) { > +entry: > + %tmp11 = tail call tailcc i32 @tailcallee( i32 %in1, i32 %in2, i32 %in1, i32 %in2 ) ; [#uses=1] > + ret i32 %tmp11 > +; CHECK: movl tailcallee@GOT > +; CHECK: jmpl > +} > + > > Added: llvm/trunk/test/CodeGen/X86/tailccstack64.ll > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/tailccstack64.ll?rev=373976&view=auto > ============================================================================== > --- llvm/trunk/test/CodeGen/X86/tailccstack64.ll (added) > +++ llvm/trunk/test/CodeGen/X86/tailccstack64.ll Mon Oct 7 15:28:58 2019 > @@ -0,0 +1,28 @@ > +; RUN: llc < %s -mcpu=generic -mtriple=x86_64-linux -post-RA-scheduler=true | FileCheck %s > +; RUN: llc < %s -mcpu=generic -mtriple=x86_64-win32 -post-RA-scheduler=true | FileCheck %s > + > +; FIXME: Redundant unused stack allocation could be eliminated. > +; CHECK: subq ${{24|72|80}}, %rsp > + > +; Check that lowered arguments on the stack do not overwrite each other. > +; Add %in1 %p1 to a different temporary register (%eax).
> +; CHECK: movl [[A1:32|144]](%rsp), [[R1:%e..]] > +; Move param %in1 to temp register (%r10d). > +; CHECK: movl [[A2:40|152]](%rsp), [[R2:%[a-z0-9]+]] > +; Add %in1 %p1 to a different temporary register (%eax). > +; CHECK: addl {{%edi|%ecx}}, [[R1]] > +; Move param %in2 to stack. > +; CHECK-DAG: movl [[R2]], [[A1]](%rsp) > +; Move result of addition to stack. > +; CHECK-DAG: movl [[R1]], [[A2]](%rsp) > +; Eventually, do a TAILCALL > +; CHECK: TAILCALL > + > +declare tailcc i32 @tailcallee(i32 %p1, i32 %p2, i32 %p3, i32 %p4, i32 %p5, i32 %p6, i32 %a, i32 %b) nounwind > + > +define tailcc i32 @tailcaller(i32 %p1, i32 %p2, i32 %p3, i32 %p4, i32 %p5, i32 %p6, i32 %in1, i32 %in2) nounwind { > +entry: > + %tmp = add i32 %in1, %p1 > + %retval = tail call tailcc i32 @tailcallee(i32 %p1, i32 %p2, i32 %p3, i32 %p4, i32 %p5, i32 %p6, i32 %in2,i32 %tmp) > + ret i32 %retval > +} > > Modified: llvm/trunk/utils/vim/syntax/llvm.vim > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/vim/syntax/llvm.vim?rev=373976&r1=373975&r2=373976&view=diff > ============================================================================== > --- llvm/trunk/utils/vim/syntax/llvm.vim (original) > +++ llvm/trunk/utils/vim/syntax/llvm.vim Mon Oct 7 15:28:58 2019 > @@ -82,6 +82,7 @@ syn keyword llvmKeyword > \ externally_initialized > \ extern_weak > \ fastcc > + \ tailcc > \ filter > \ from > \ gc > > > _______________________________________________ > llvm-commits mailing list > llvm-commits at lists.llvm.org > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits From llvm-commits at lists.llvm.org Mon Oct 7 15:28:59 2019 From: llvm-commits at lists.llvm.org (Eli Friedman via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 22:28:59 +0000 (UTC) Subject: [PATCH] D68257: [Support] Add mathematical constants In-Reply-To: References: Message-ID: <7bbc33f3efe69773cd84b949c36aa145@localhost.localdomain> efriedma added inline comments. 
================ Comment at: llvm/include/llvm/Support/MathExtras.h:66 + inv_sqrtpi = 0.5641895835477563, // https://oeis.org/A087197 + sqrt2 = 1.414213562373095, // https://oeis.org/A002193 + inv_sqrt2 = 0.7071067811865475, ---------------- evandro wrote: > efriedma wrote: > > The correct value of sqrt(2) in double-precision is 1.4142135623730951. > > > > And now I don't trust any of the other values... > `double` has a precision of 15 or 16 significant digits. I don't understand why are you suggesting 17 significant digits when you asked to trim the precision down. > > Besides, the reference I provided states that this value is 1.41421356237309505. Whether it's rounded to 1.4142135623730950 or 1.4142135623730951 is a bit moot, IMO. I asked for "the smallest number of digits required to produce the correct double-precision result". This is what you get if, for example, you ask Python 2.7 or later to convert the value to a string with `repr()` (`printf "import math\nprint(repr(math.sqrt(2)))" | python`). `1.414213562373095` produces a value that's different by one ulp. Yes, a one ulp difference is unlikely to matter for most uses, but if we're going to take the time to define these, we should define them correctly. ================ Comment at: llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp:1944 + // FIXME: The `long double` type is not fully supported by the classes + // `APFloat` and `Constant`. + Eul = ConstantFP::get(Log->getType(), numbers::e); ---------------- evandro wrote: > efriedma wrote: > > I'm not sure this describes the issue correctly. You can specify a long double as a string, or raw bits. It can't interoperate with the native long double because that might have the wrong width. > What wording would you suggest, please? Maybe just `FIXME: add more precise value of "e" for various long double types`. 
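The precision argument above can be checked directly. What follows is only a minimal sketch in Python, not part of the patch: it compares the 16-digit literal from the diff with the 17-digit value given in the review, and measures the gap between them in units in the last place. The helper name `bits` is hypothetical and introduced here for illustration, and the sketch assumes a platform whose `float` is an IEEE-754 binary64 with a correctly rounded `sqrt`.

```python
import math
import struct


def bits(x: float) -> int:
    """Return the IEEE-754 binary64 bit pattern of x as a signed integer.

    For positive doubles, adjacent representable values have adjacent
    bit patterns, so a difference of 1 means "one ulp apart".
    """
    return struct.unpack("<q", struct.pack("<d", x))[0]


# Reference value: the double that math.sqrt returns for 2.
exact = math.sqrt(2)

# The 16-digit literal proposed in the patch for sqrt2.
short = float("1.414213562373095")

# The 17-digit literal given in the review comment.
full = float("1.4142135623730951")

# Distance between the reference and the 16-digit literal, in ulps.
ulp_gap = bits(exact) - bits(short)

print("repr(sqrt(2))        :", repr(exact))
print("16 digits round-trip :", short == exact)
print("17 digits round-trip :", full == exact)
print("gap in ulps          :", ulp_gap)
```

The comparison is on bit patterns rather than on printed digits, because two decimal strings that look nearly identical can still parse to different doubles; that distinction is the whole point of asking for the shortest string that round-trips.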
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68257/new/ https://reviews.llvm.org/D68257 From llvm-commits at lists.llvm.org Mon Oct 7 15:31:58 2019 From: llvm-commits at lists.llvm.org (Reid Kleckner via llvm-commits) Date: Mon, 7 Oct 2019 15:31:58 -0700 Subject: [llvm] r373976 - [X86] Add new calling convention that guarantees tail call optimization In-Reply-To: References: <20191007222858.C50228DCA6@lists.llvm.org> Message-ID: On Mon, Oct 7, 2019 at 3:28 PM Roman Lebedev wrote: > > Reviewed By: lebedev.ri, paquette, rnk > Pretty sure i didn't review this > I vaguely recalled that you made a comment on it, and I wasn't wrong: https://reviews.llvm.org/D67855#1677359 I suppose it's a stretch to call it a review, but I'd rather give too much credit than too little. From llvm-commits at lists.llvm.org Mon Oct 7 15:33:23 2019 From: llvm-commits at lists.llvm.org (Vitaly Buka via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 22:33:23 +0000 (UTC) Subject: [PATCH] D68603: [sanitizer] Print SIGTRAP for corresponding signal In-Reply-To: References: Message-ID: <82c91d8bb607d0972cc6c81059727325@localhost.localdomain> vitalybuka updated this revision to Diff 223681. vitalybuka added a comment. Herald added a subscriber: dexonsmith.
remove SIG prefix from the string Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68603/new/ https://reviews.llvm.org/D68603 Files: compiler-rt/lib/sanitizer_common/sanitizer_posix.cpp compiler-rt/test/sanitizer_common/TestCases/Linux/signal_trap.cpp Index: compiler-rt/test/sanitizer_common/TestCases/Linux/signal_trap.cpp =================================================================== --- /dev/null +++ compiler-rt/test/sanitizer_common/TestCases/Linux/signal_trap.cpp @@ -0,0 +1,8 @@ +// RUN: %clangxx -O1 %s -o %t && %env_tool_opts=handle_sigtrap=2 not %run %t 2>&1 | FileCheck %s + +int main() { + __builtin_debugtrap(); +} + +// CHECK: Sanitizer:DEADLYSIGNAL +// CHECK: Sanitizer: TRAP on unknown address Index: compiler-rt/lib/sanitizer_common/sanitizer_posix.cpp =================================================================== --- compiler-rt/lib/sanitizer_common/sanitizer_posix.cpp +++ compiler-rt/lib/sanitizer_common/sanitizer_posix.cpp @@ -312,6 +312,8 @@ return "SEGV"; case SIGBUS: return "BUS"; + case SIGTRAP: + return "TRAP"; } return "UNKNOWN SIGNAL"; } From llvm-commits at lists.llvm.org Mon Oct 7 15:37:23 2019 From: llvm-commits at lists.llvm.org (Vitaly Buka via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 22:37:23 +0000 (UTC) Subject: [PATCH] D68599: fix Go windows build In-Reply-To: References: Message-ID: vitalybuka added inline comments. ================ Comment at: lib/tsan/go/build.bat:52 -DSANITIZER_GO=1 ^ + -DWINVER=0x0600 -D_WIN32_WINNT=0x0600 ^ + -DGetProcessMemoryInfo=K32GetProcessMemoryInfo ^ ---------------- vitalybuka wrote: > For consistency could you split these 1-per-line? Ah, don't bother. You likely have no committer access, so someone else will have to land it. I'll reformat and land.
Repository:
  rCRT Compiler Runtime

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68599/new/

https://reviews.llvm.org/D68599

From llvm-commits at lists.llvm.org Mon Oct 7 15:38:19 2019
From: llvm-commits at lists.llvm.org (Jeroen Dobbelaere via Phabricator via llvm-commits)
Date: Mon, 07 Oct 2019 22:38:19 +0000 (UTC)
Subject: [PATCH] D68484: [PATCH 01/38] [noalias] LangRef: noalias intrinsics and noalias_sidechannel documentation.
In-Reply-To:
References:
Message-ID:

jeroen.dobbelaere marked 4 inline comments as done.
jeroen.dobbelaere added a comment.

Thanks for all the feedback! I added some explanations.

================
Comment at: llvm/docs/LangRef.rst:16249
+
+Note: ``XXX`` is the encoding of the return type and the types of the arguments.
+
----------------
jdoerfert wrote:
> I think you mix the "templated" definition (`XXX`) with instantiations (`i8**`, `%struct.FOO*`, ...). I would prefer we pick either. Precedent says you replace `XXX` with the types of that instantiation.
>
>

That's true, but imho the full intrinsic name becomes very long, cluttering the display. For clarity I replaced the type encodings with XXX. This makes it easier to focus on the intrinsics and the actual arguments. I agree that this is not perfect.

================
Comment at: llvm/docs/LangRef.rst:16303
+- ``p.objId``: a number that can be used to differentiate different *object P*
+  when ``%p.addr`` is optimized away.
+- ``!p.scope``: metadata argument that refers to a list of alias.scope metadata
----------------
jdoerfert wrote:
> This seems odd, why introduce two things that do the same thing.

The original idea was to treat '%p.addr' sometimes as a pointer to an object and sometimes as an offset. Later it needed to be separated: SROA first splits allocas into multiple smaller allocas. Each separate restrict pointer now points to its own alloca (%p.addr), and there is no place to put the offset.
You can differentiate by splitting the p.scope, but that would imply duplicating scopes all over the place. The p.objId serves as a convenient and less costly solution to differentiate the pointers in this case.

================
Comment at: llvm/docs/LangRef.rst:16306
+  entries that contains exactly one element. It represents the variable
+  declaration that contains one or more restrict pointers.
+- ``%p.decl``: points to the ``@llvm.noalias.decl`` intrinsic associated with
----------------
jdoerfert wrote:
> "entries with a single element each."
> > It represents the variable declaration that contains one or more restrict pointers.
> I do not understand this sentence.

Hmm. Not sure how to explain it further. What I want to say is (shown with an example):

  int *restrict A; // one !p.scope, one restrict pointer
  int *restrict B[10]; // another (single) !p.scope, ten restrict pointers
  struct FOO { int* restrict mA; int * mB; int* restrict mC; } C; // yet another !p.scope, 2 restrict pointers

================
Comment at: llvm/docs/LangRef.rst:16496
+not really represent a value. It is merely used to track a dependency on the
+declaration.
+
----------------
jdoerfert wrote:
> The above reads funny, maybe:
> "The returned value is a handle to track dependences on the declaration. There is no explicit relationship to the value of the arguments."
> Also, why do we want an `i8*` then? We have `tokens` and we have `i32`, I'd prefer either over an `i8*` which is more confusing in this context full of `i8*` that are actually pointers (IMHO).

I think a token has too many restrictions (no PHI, no select). i32 might do. I didn't think too much about it and just settled on i8*.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68484/new/

https://reviews.llvm.org/D68484

From llvm-commits at lists.llvm.org Mon Oct 7 15:39:00 2019
From: llvm-commits at lists.llvm.org (Evgenii Stepanov via Phabricator via llvm-commits)
Date: Mon, 07 Oct 2019 22:39:00 +0000 (UTC)
Subject: [PATCH] D68603: [sanitizer] Print SIGTRAP for corresponding signal
In-Reply-To:
References:
Message-ID: <959f4fb393d991a563f2801ca7e4b22d@localhost.localdomain>

eugenis accepted this revision.
eugenis added a comment.
This revision is now accepted and ready to land.

LGTM

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68603/new/

https://reviews.llvm.org/D68603

From llvm-commits at lists.llvm.org Mon Oct 7 15:43:17 2019
From: llvm-commits at lists.llvm.org (Vitaly Buka via llvm-commits)
Date: Mon, 07 Oct 2019 22:43:17 -0000
Subject: [compiler-rt] r373978 - [tsan] Don't delay SIGTRAP handler
Message-ID: <20191007224317.D414D85F0F@lists.llvm.org>

Author: vitalybuka
Date: Mon Oct 7 15:43:17 2019
New Revision: 373978

URL: http://llvm.org/viewvc/llvm-project?rev=373978&view=rev
Log:
[tsan] Don't delay SIGTRAP handler

Reviewers: eugenis, jfb

Subscribers: #sanitizers, llvm-commits

Tags: #sanitizers, #llvm

Differential Revision: https://reviews.llvm.org/D68604

Added:
    compiler-rt/trunk/test/sanitizer_common/TestCases/Linux/signal_trap_handler.cpp
Modified:
    compiler-rt/trunk/lib/tsan/rtl/tsan_interceptors_posix.cpp

Modified: compiler-rt/trunk/lib/tsan/rtl/tsan_interceptors_posix.cpp
URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/tsan/rtl/tsan_interceptors_posix.cpp?rev=373978&r1=373977&r2=373978&view=diff
==============================================================================
--- compiler-rt/trunk/lib/tsan/rtl/tsan_interceptors_posix.cpp (original)
+++ compiler-rt/trunk/lib/tsan/rtl/tsan_interceptors_posix.cpp Mon Oct 7 15:43:17 2019
@@ -114,6 +114,7 @@ const int PTHREAD_MUTEX_RECURSIVE_NP = 2
 const int EPOLL_CTL_ADD = 1;
 #endif
 const int SIGILL = 4;
+const int SIGTRAP = 5;
 const int SIGABRT = 6;
 const int SIGFPE = 8;
 const int SIGSEGV = 11;
@@ -1962,10 +1963,10 @@ void ProcessPendingSignals(ThreadState *
 } // namespace __tsan
 
 static bool is_sync_signal(ThreadSignalContext *sctx, int sig) {
-  return sig == SIGSEGV || sig == SIGBUS || sig == SIGILL ||
-      sig == SIGABRT || sig == SIGFPE || sig == SIGPIPE || sig == SIGSYS ||
-      // If we are sending signal to ourselves, we must process it now.
-      (sctx && sig == sctx->int_signal_send);
+  return sig == SIGSEGV || sig == SIGBUS || sig == SIGILL || sig == SIGTRAP ||
+         sig == SIGABRT || sig == SIGFPE || sig == SIGPIPE || sig == SIGSYS ||
+         // If we are sending signal to ourselves, we must process it now.
+         (sctx && sig == sctx->int_signal_send);
 }
 
 void ALWAYS_INLINE rtl_generic_sighandler(bool sigact, int sig,

Added: compiler-rt/trunk/test/sanitizer_common/TestCases/Linux/signal_trap_handler.cpp
URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/test/sanitizer_common/TestCases/Linux/signal_trap_handler.cpp?rev=373978&view=auto
==============================================================================
--- compiler-rt/trunk/test/sanitizer_common/TestCases/Linux/signal_trap_handler.cpp (added)
+++ compiler-rt/trunk/test/sanitizer_common/TestCases/Linux/signal_trap_handler.cpp Mon Oct 7 15:43:17 2019
@@ -0,0 +1,29 @@
+// RUN: %clangxx -O1 %s -o %t && %env_tool_opts=handle_sigtrap=1 %run %t 2>&1 | FileCheck %s
+
+#include <assert.h>
+#include <signal.h>
+#include <stdio.h>
+
+int handled;
+
+void handler(int signo, siginfo_t *info, void *uctx) {
+  handled = 1;
+}
+
+int main() {
+  struct sigaction a = {}, old = {};
+  a.sa_sigaction = handler;
+  a.sa_flags = SA_SIGINFO;
+  sigaction(SIGTRAP, &a, &old);
+
+  a = {};
+  sigaction(SIGTRAP, 0, &a);
+  assert(a.sa_sigaction == handler);
+  assert(a.sa_flags & SA_SIGINFO);
+
+  __builtin_debugtrap();
+  assert(handled);
+  fprintf(stderr, "HANDLED %d\n", handled);
+}
+
+// CHECK: HANDLED 1

From llvm-commits at lists.llvm.org Mon Oct
7 15:43:19 2019
From: llvm-commits at lists.llvm.org (Vitaly Buka via llvm-commits)
Date: Mon, 07 Oct 2019 22:43:19 -0000
Subject: [compiler-rt] r373979 - [sanitizer] Print SIGTRAP for corresponding signal
Message-ID: <20191007224319.D5C738CCC2@lists.llvm.org>

Author: vitalybuka
Date: Mon Oct 7 15:43:19 2019
New Revision: 373979

URL: http://llvm.org/viewvc/llvm-project?rev=373979&view=rev
Log:
[sanitizer] Print SIGTRAP for corresponding signal

Reviewers: eugenis, jfb

Subscribers: #sanitizers, llvm-commits

Tags: #sanitizers, #llvm

Differential Revision: https://reviews.llvm.org/D68603

Added:
    compiler-rt/trunk/test/sanitizer_common/TestCases/Linux/signal_trap.cpp
Modified:
    compiler-rt/trunk/lib/sanitizer_common/sanitizer_posix.cpp

Modified: compiler-rt/trunk/lib/sanitizer_common/sanitizer_posix.cpp
URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/sanitizer_common/sanitizer_posix.cpp?rev=373979&r1=373978&r2=373979&view=diff
==============================================================================
--- compiler-rt/trunk/lib/sanitizer_common/sanitizer_posix.cpp (original)
+++ compiler-rt/trunk/lib/sanitizer_common/sanitizer_posix.cpp Mon Oct 7 15:43:19 2019
@@ -312,6 +312,8 @@ const char *SignalContext::Describe() co
     return "SEGV";
   case SIGBUS:
     return "BUS";
+  case SIGTRAP:
+    return "TRAP";
   }
   return "UNKNOWN SIGNAL";
 }

Added: compiler-rt/trunk/test/sanitizer_common/TestCases/Linux/signal_trap.cpp
URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/test/sanitizer_common/TestCases/Linux/signal_trap.cpp?rev=373979&view=auto
==============================================================================
--- compiler-rt/trunk/test/sanitizer_common/TestCases/Linux/signal_trap.cpp (added)
+++ compiler-rt/trunk/test/sanitizer_common/TestCases/Linux/signal_trap.cpp Mon Oct 7 15:43:19 2019
@@ -0,0 +1,8 @@
+// RUN: %clangxx -O1 %s -o %t && %env_tool_opts=handle_sigtrap=2 not %run %t 2>&1 | FileCheck %s
+
+int main() {
+  __builtin_debugtrap();
+}
+
+// CHECK: Sanitizer:DEADLYSIGNAL
+// CHECK: Sanitizer: TRAP on unknown address

From llvm-commits at lists.llvm.org Mon Oct 7 15:49:35 2019
From: llvm-commits at lists.llvm.org (Reid Kleckner via Phabricator via llvm-commits)
Date: Mon, 07 Oct 2019 22:49:35 +0000 (UTC)
Subject: [PATCH] D67855: [X86] Add new calling convention that guarantees tail call optimization
In-Reply-To:
References:
Message-ID: <77685b27e6391b067637a1fb914b78df@localhost.localdomain>

rnk added a comment.

In D67855#1698112, @dwightguth wrote:

> It's ready on my end. I don't have commit access, so...

I committed this as rL373976, thanks! I always ask permission to commit other folks' patches just to be sure. We're not on github yet so we don't have a nice story for separating the patch author from the "approver" or "merger". Soon, though.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D67855/new/

https://reviews.llvm.org/D67855

From llvm-commits at lists.llvm.org Mon Oct 7 15:55:42 2019
From: llvm-commits at lists.llvm.org (Joerg Sonnenberger via llvm-commits)
Date: Mon, 07 Oct 2019 22:55:42 -0000
Subject: [llvm] r373980 - Fix the spelling of my name.
Message-ID: <20191007225543.0233F820E4@lists.llvm.org>

Author: joerg
Date: Mon Oct 7 15:55:42 2019
New Revision: 373980

URL: http://llvm.org/viewvc/llvm-project?rev=373980&view=rev
Log:
Fix the spelling of my name.

Modified:
    llvm/trunk/docs/Proposals/GitHubMove.rst

Modified: llvm/trunk/docs/Proposals/GitHubMove.rst
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/docs/Proposals/GitHubMove.rst?rev=373980&r1=373979&r2=373980&view=diff
==============================================================================
--- llvm/trunk/docs/Proposals/GitHubMove.rst (original)
+++ llvm/trunk/docs/Proposals/GitHubMove.rst Mon Oct 7 15:55:42 2019
@@ -1081,6 +1081,6 @@ References
 .. [LattnerRevNum] Chris Lattner, http://lists.llvm.org/pipermail/llvm-dev/2011-July/041739.html
 .. [TrickRevNum] Andrew Trick, http://lists.llvm.org/pipermail/llvm-dev/2011-July/041721.html
-.. [JSonnRevNum] Joerg Sonnenberg, http://lists.llvm.org/pipermail/llvm-dev/2011-July/041688.html
+.. [JSonnRevNum] Joerg Sonnenberger, http://lists.llvm.org/pipermail/llvm-dev/2011-July/041688.html
 .. [MatthewsRevNum] Chris Matthews, http://lists.llvm.org/pipermail/cfe-dev/2016-July/049886.html
 .. [statuschecks] GitHub status-checks, https://help.github.com/articles/about-required-status-checks/

From llvm-commits at lists.llvm.org Mon Oct 7 15:57:42 2019
From: llvm-commits at lists.llvm.org (David Blaikie via Phabricator via llvm-commits)
Date: Mon, 07 Oct 2019 22:57:42 +0000 (UTC)
Subject: [PATCH] D68556: Document `LLVM_USE_SPLIT_DWARF` option
In-Reply-To:
References:
Message-ID:

dblaikie added a comment.

FWIW, the current support is incomplete - if you just turn on -gsplit-dwarf, gdb may misbehave owing to a lack of index. (you'd need to add -Wl,-gdb-index to CMAKE_*_LINKER_FLAGS)

Arguably we could/should fix LLVM itself to pass that flag when using -gsplit-dwarf while tuning for gdb and/or improve the LLVM CMake build to add the flag.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68556/new/

https://reviews.llvm.org/D68556

From llvm-commits at lists.llvm.org Mon Oct 7 16:02:03 2019
From: llvm-commits at lists.llvm.org (Craig Topper via llvm-commits)
Date: Mon, 07 Oct 2019 23:02:03 -0000
Subject: [llvm] r373981 - [X86] Add test cases for zero extending a gather index from less than i32 to i64.
Message-ID: <20191007230203.9A1218DE4B@lists.llvm.org>

Author: ctopper
Date: Mon Oct 7 16:02:03 2019
New Revision: 373981

URL: http://llvm.org/viewvc/llvm-project?rev=373981&view=rev
Log:
[X86] Add test cases for zero extending a gather index from less than i32 to i64.

We should be able to use a smaller zero extend.

Modified: llvm/trunk/test/CodeGen/X86/masked_gather_scatter.ll Modified: llvm/trunk/test/CodeGen/X86/masked_gather_scatter.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/masked_gather_scatter.ll?rev=373981&r1=373980&r2=373981&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/masked_gather_scatter.ll (original) +++ llvm/trunk/test/CodeGen/X86/masked_gather_scatter.ll Mon Oct 7 16:02:03 2019 @@ -2689,6 +2689,108 @@ define <8 x float> @sext_v8i8_index(floa } declare <8 x float> @llvm.masked.gather.v8f32.v8p0f32(<8 x float*>, i32, <8 x i1>, <8 x float>) +; Make sure we also allow index to be zero extended from a smaller than i32 element size. +define <16 x float> @zext_i8_index(float* %base, <16 x i8> %ind) { +; KNL_64-LABEL: zext_i8_index: +; KNL_64: # %bb.0: +; KNL_64-NEXT: vpmovzxbw {{.*#+}} ymm0 = xmm0[0],zero,xmm0[1],zero,xmm0[2],zero,xmm0[3],zero,xmm0[4],zero,xmm0[5],zero,xmm0[6],zero,xmm0[7],zero,xmm0[8],zero,xmm0[9],zero,xmm0[10],zero,xmm0[11],zero,xmm0[12],zero,xmm0[13],zero,xmm0[14],zero,xmm0[15],zero +; KNL_64-NEXT: vpmovzxwq {{.*#+}} zmm1 = xmm0[0],zero,zero,zero,xmm0[1],zero,zero,zero,xmm0[2],zero,zero,zero,xmm0[3],zero,zero,zero,xmm0[4],zero,zero,zero,xmm0[5],zero,zero,zero,xmm0[6],zero,zero,zero,xmm0[7],zero,zero,zero +; KNL_64-NEXT: vextracti128 $1, %ymm0, %xmm0 +; KNL_64-NEXT: vpmovzxwq {{.*#+}} zmm0 = xmm0[0],zero,zero,zero,xmm0[1],zero,zero,zero,xmm0[2],zero,zero,zero,xmm0[3],zero,zero,zero,xmm0[4],zero,zero,zero,xmm0[5],zero,zero,zero,xmm0[6],zero,zero,zero,xmm0[7],zero,zero,zero +; KNL_64-NEXT: kxnorw %k0, %k0, %k1 +; KNL_64-NEXT: kxnorw %k0, %k0, %k2 +; KNL_64-NEXT: vgatherqps (%rdi,%zmm0,4), %ymm2 {%k2} +; KNL_64-NEXT: vgatherqps (%rdi,%zmm1,4), %ymm0 {%k1} +; KNL_64-NEXT: vinsertf64x4 $1, %ymm2, %zmm0, %zmm0 +; KNL_64-NEXT: retq +; +; KNL_32-LABEL: zext_i8_index: +; KNL_32: # %bb.0: +; KNL_32-NEXT: movl {{[0-9]+}}(%esp), %eax +; KNL_32-NEXT: vpmovzxbw 
{{.*#+}} ymm0 = xmm0[0],zero,xmm0[1],zero,xmm0[2],zero,xmm0[3],zero,xmm0[4],zero,xmm0[5],zero,xmm0[6],zero,xmm0[7],zero,xmm0[8],zero,xmm0[9],zero,xmm0[10],zero,xmm0[11],zero,xmm0[12],zero,xmm0[13],zero,xmm0[14],zero,xmm0[15],zero +; KNL_32-NEXT: vpmovzxwq {{.*#+}} zmm1 = xmm0[0],zero,zero,zero,xmm0[1],zero,zero,zero,xmm0[2],zero,zero,zero,xmm0[3],zero,zero,zero,xmm0[4],zero,zero,zero,xmm0[5],zero,zero,zero,xmm0[6],zero,zero,zero,xmm0[7],zero,zero,zero +; KNL_32-NEXT: vextracti128 $1, %ymm0, %xmm0 +; KNL_32-NEXT: vpmovzxwq {{.*#+}} zmm0 = xmm0[0],zero,zero,zero,xmm0[1],zero,zero,zero,xmm0[2],zero,zero,zero,xmm0[3],zero,zero,zero,xmm0[4],zero,zero,zero,xmm0[5],zero,zero,zero,xmm0[6],zero,zero,zero,xmm0[7],zero,zero,zero +; KNL_32-NEXT: kxnorw %k0, %k0, %k1 +; KNL_32-NEXT: kxnorw %k0, %k0, %k2 +; KNL_32-NEXT: vgatherqps (%eax,%zmm0,4), %ymm2 {%k2} +; KNL_32-NEXT: vgatherqps (%eax,%zmm1,4), %ymm0 {%k1} +; KNL_32-NEXT: vinsertf64x4 $1, %ymm2, %zmm0, %zmm0 +; KNL_32-NEXT: retl +; +; SKX-LABEL: zext_i8_index: +; SKX: # %bb.0: +; SKX-NEXT: vpmovzxbw {{.*#+}} ymm0 = xmm0[0],zero,xmm0[1],zero,xmm0[2],zero,xmm0[3],zero,xmm0[4],zero,xmm0[5],zero,xmm0[6],zero,xmm0[7],zero,xmm0[8],zero,xmm0[9],zero,xmm0[10],zero,xmm0[11],zero,xmm0[12],zero,xmm0[13],zero,xmm0[14],zero,xmm0[15],zero +; SKX-NEXT: vpmovzxwq {{.*#+}} zmm1 = xmm0[0],zero,zero,zero,xmm0[1],zero,zero,zero,xmm0[2],zero,zero,zero,xmm0[3],zero,zero,zero,xmm0[4],zero,zero,zero,xmm0[5],zero,zero,zero,xmm0[6],zero,zero,zero,xmm0[7],zero,zero,zero +; SKX-NEXT: vextracti128 $1, %ymm0, %xmm0 +; SKX-NEXT: vpmovzxwq {{.*#+}} zmm0 = xmm0[0],zero,zero,zero,xmm0[1],zero,zero,zero,xmm0[2],zero,zero,zero,xmm0[3],zero,zero,zero,xmm0[4],zero,zero,zero,xmm0[5],zero,zero,zero,xmm0[6],zero,zero,zero,xmm0[7],zero,zero,zero +; SKX-NEXT: kxnorw %k0, %k0, %k1 +; SKX-NEXT: kxnorw %k0, %k0, %k2 +; SKX-NEXT: vgatherqps (%rdi,%zmm0,4), %ymm2 {%k2} +; SKX-NEXT: vgatherqps (%rdi,%zmm1,4), %ymm0 {%k1} +; SKX-NEXT: vinsertf64x4 $1, %ymm2, %zmm0, %zmm0 
+; SKX-NEXT: retq +; +; SKX_32-LABEL: zext_i8_index: +; SKX_32: # %bb.0: +; SKX_32-NEXT: movl {{[0-9]+}}(%esp), %eax +; SKX_32-NEXT: vpmovzxbw {{.*#+}} ymm0 = xmm0[0],zero,xmm0[1],zero,xmm0[2],zero,xmm0[3],zero,xmm0[4],zero,xmm0[5],zero,xmm0[6],zero,xmm0[7],zero,xmm0[8],zero,xmm0[9],zero,xmm0[10],zero,xmm0[11],zero,xmm0[12],zero,xmm0[13],zero,xmm0[14],zero,xmm0[15],zero +; SKX_32-NEXT: vpmovzxwq {{.*#+}} zmm1 = xmm0[0],zero,zero,zero,xmm0[1],zero,zero,zero,xmm0[2],zero,zero,zero,xmm0[3],zero,zero,zero,xmm0[4],zero,zero,zero,xmm0[5],zero,zero,zero,xmm0[6],zero,zero,zero,xmm0[7],zero,zero,zero +; SKX_32-NEXT: vextracti128 $1, %ymm0, %xmm0 +; SKX_32-NEXT: vpmovzxwq {{.*#+}} zmm0 = xmm0[0],zero,zero,zero,xmm0[1],zero,zero,zero,xmm0[2],zero,zero,zero,xmm0[3],zero,zero,zero,xmm0[4],zero,zero,zero,xmm0[5],zero,zero,zero,xmm0[6],zero,zero,zero,xmm0[7],zero,zero,zero +; SKX_32-NEXT: kxnorw %k0, %k0, %k1 +; SKX_32-NEXT: kxnorw %k0, %k0, %k2 +; SKX_32-NEXT: vgatherqps (%eax,%zmm0,4), %ymm2 {%k2} +; SKX_32-NEXT: vgatherqps (%eax,%zmm1,4), %ymm0 {%k1} +; SKX_32-NEXT: vinsertf64x4 $1, %ymm2, %zmm0, %zmm0 +; SKX_32-NEXT: retl + + %zext_ind = zext <16 x i8> %ind to <16 x i64> + %gep.random = getelementptr float, float *%base, <16 x i64> %zext_ind + + %res = call <16 x float> @llvm.masked.gather.v16f32.v16p0f32(<16 x float*> %gep.random, i32 4, <16 x i1> , <16 x float> undef) + ret <16 x float>%res +} + +; Make sure we also allow index to be zero extended from a smaller than i32 element size. 
+define <8 x float> @zext_v8i8_index(float* %base, <8 x i8> %ind) { +; KNL_64-LABEL: zext_v8i8_index: +; KNL_64: # %bb.0: +; KNL_64-NEXT: vpmovzxbq {{.*#+}} zmm1 = xmm0[0],zero,zero,zero,zero,zero,zero,zero,xmm0[1],zero,zero,zero,zero,zero,zero,zero,xmm0[2],zero,zero,zero,zero,zero,zero,zero,xmm0[3],zero,zero,zero,zero,zero,zero,zero,xmm0[4],zero,zero,zero,zero,zero,zero,zero,xmm0[5],zero,zero,zero,zero,zero,zero,zero,xmm0[6],zero,zero,zero,zero,zero,zero,zero,xmm0[7],zero,zero,zero,zero,zero,zero,zero +; KNL_64-NEXT: kxnorw %k0, %k0, %k1 +; KNL_64-NEXT: vgatherqps (%rdi,%zmm1,4), %ymm0 {%k1} +; KNL_64-NEXT: retq +; +; KNL_32-LABEL: zext_v8i8_index: +; KNL_32: # %bb.0: +; KNL_32-NEXT: movl {{[0-9]+}}(%esp), %eax +; KNL_32-NEXT: vpmovzxbq {{.*#+}} zmm1 = xmm0[0],zero,zero,zero,zero,zero,zero,zero,xmm0[1],zero,zero,zero,zero,zero,zero,zero,xmm0[2],zero,zero,zero,zero,zero,zero,zero,xmm0[3],zero,zero,zero,zero,zero,zero,zero,xmm0[4],zero,zero,zero,zero,zero,zero,zero,xmm0[5],zero,zero,zero,zero,zero,zero,zero,xmm0[6],zero,zero,zero,zero,zero,zero,zero,xmm0[7],zero,zero,zero,zero,zero,zero,zero +; KNL_32-NEXT: kxnorw %k0, %k0, %k1 +; KNL_32-NEXT: vgatherqps (%eax,%zmm1,4), %ymm0 {%k1} +; KNL_32-NEXT: retl +; +; SKX-LABEL: zext_v8i8_index: +; SKX: # %bb.0: +; SKX-NEXT: vpmovzxbq {{.*#+}} zmm1 = xmm0[0],zero,zero,zero,zero,zero,zero,zero,xmm0[1],zero,zero,zero,zero,zero,zero,zero,xmm0[2],zero,zero,zero,zero,zero,zero,zero,xmm0[3],zero,zero,zero,zero,zero,zero,zero,xmm0[4],zero,zero,zero,zero,zero,zero,zero,xmm0[5],zero,zero,zero,zero,zero,zero,zero,xmm0[6],zero,zero,zero,zero,zero,zero,zero,xmm0[7],zero,zero,zero,zero,zero,zero,zero +; SKX-NEXT: kxnorw %k0, %k0, %k1 +; SKX-NEXT: vgatherqps (%rdi,%zmm1,4), %ymm0 {%k1} +; SKX-NEXT: retq +; +; SKX_32-LABEL: zext_v8i8_index: +; SKX_32: # %bb.0: +; SKX_32-NEXT: movl {{[0-9]+}}(%esp), %eax +; SKX_32-NEXT: vpmovzxbq {{.*#+}} zmm1 = 
xmm0[0],zero,zero,zero,zero,zero,zero,zero,xmm0[1],zero,zero,zero,zero,zero,zero,zero,xmm0[2],zero,zero,zero,zero,zero,zero,zero,xmm0[3],zero,zero,zero,zero,zero,zero,zero,xmm0[4],zero,zero,zero,zero,zero,zero,zero,xmm0[5],zero,zero,zero,zero,zero,zero,zero,xmm0[6],zero,zero,zero,zero,zero,zero,zero,xmm0[7],zero,zero,zero,zero,zero,zero,zero +; SKX_32-NEXT: kxnorw %k0, %k0, %k1 +; SKX_32-NEXT: vgatherqps (%eax,%zmm1,4), %ymm0 {%k1} +; SKX_32-NEXT: retl + + %zext_ind = zext <8 x i8> %ind to <8 x i64> + %gep.random = getelementptr float, float *%base, <8 x i64> %zext_ind + + %res = call <8 x float> @llvm.masked.gather.v8f32.v8p0f32(<8 x float*> %gep.random, i32 4, <8 x i1> , <8 x float> undef) + ret <8 x float>%res +} + ; Index requires promotion define void @test_scatter_2i32_index(<2 x double> %a1, double* %base, <2 x i32> %ind, <2 x i1> %mask) { ; KNL_64-LABEL: test_scatter_2i32_index: From llvm-commits at lists.llvm.org Mon Oct 7 16:03:12 2019 From: llvm-commits at lists.llvm.org (Craig Topper via llvm-commits) Date: Mon, 07 Oct 2019 23:03:12 -0000 Subject: [llvm] r373982 - [X86] Shrink zero extends of gather indices from type less than i32 to types larger than i32. Message-ID: <20191007230312.4E9138DD7B@lists.llvm.org> Author: ctopper Date: Mon Oct 7 16:03:12 2019 New Revision: 373982 URL: http://llvm.org/viewvc/llvm-project?rev=373982&view=rev Log: [X86] Shrink zero extends of gather indices from type less than i32 to types larger than i32. Gather instructions can use i32 or i64 elements for indices. If the index is zero extended from a type smaller than i32 to i64, we can shrink the extend to just extend to i32. 
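
To make the reasoning in the log message concrete, here is a small standalone C++ sketch. It is not part of the commit, and the function names are made up for illustration. It assumes, as the `ComputeNumSignBits` check in the patch suggests, that a gather with 32-bit indices sign extends each index when forming the address. Under that assumption, shrinking a zero extend from i8 should always be safe, while a zero extend from a full i32 is only safe when the top bit is clear.

```cpp
#include <cassert>
#include <cstdint>

// Index as the original IR computes it: zero extend i8 -> i64.
std::int64_t wideIndex(std::uint8_t Narrow) {
  return static_cast<std::int64_t>(static_cast<std::uint64_t>(Narrow));
}

// Index after shrinking: zero extend i8 -> i32 only. The final conversion
// models the (assumed) sign extension a 32-bit gather index receives when
// the address is formed.
std::int64_t shrunkIndex(std::uint8_t Narrow) {
  std::int32_t Idx32 =
      static_cast<std::int32_t>(static_cast<std::uint32_t>(Narrow));
  return static_cast<std::int64_t>(Idx32);
}

// Returns true when both forms agree for every possible i8 value.
bool shrinkIsSafeForAllI8() {
  for (unsigned V = 0; V <= 0xFF; ++V) {
    std::uint8_t Narrow = static_cast<std::uint8_t>(V);
    if (wideIndex(Narrow) != shrunkIndex(Narrow))
      return false;
  }
  return true;
}

// Hypothetical helper for the i32 -> i64 case: the shrink is only valid when
// reinterpreting the 32 source bits as a signed index gives the same value as
// the zero extend. The uint32_t -> int32_t conversion wraps on mainstream
// compilers (and is defined to do so since C++20).
bool shrinkIsSafeForI32(std::uint32_t Src) {
  std::int64_t Wide = static_cast<std::int64_t>(static_cast<std::uint64_t>(Src));
  std::int64_t Shrunk = static_cast<std::int64_t>(static_cast<std::int32_t>(Src));
  return Wide == Shrunk;
}
```

With these definitions, `shrinkIsSafeForAllI8()` should hold for all 256 inputs, whereas `shrinkIsSafeForI32(0x80000000u)` should not, which is the kind of case a sign-bit check guards against.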

Modified:
    llvm/trunk/lib/Target/X86/X86ISelLowering.cpp
    llvm/trunk/test/CodeGen/X86/masked_gather_scatter.ll

Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=373982&r1=373981&r2=373982&view=diff
==============================================================================
--- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp (original)
+++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Mon Oct 7 16:03:12 2019
@@ -42572,16 +42572,17 @@ static SDValue combineGatherScatter(SDNo
   SDValue Base = GorS->getBasePtr();
   SDValue Scale = GorS->getScale();
 
-  // Shrink constant indices if they are larger than 32-bits.
-  // Only do this before legalize types since v2i64 could become v2i32.
-  // FIXME: We could check that the type is legal if we're after legalize types,
-  // but then we would need to construct test cases where that happens.
-  // FIXME: We could support more than just constant vectors, but we need to
-  // careful with costing. A truncate that can be optimized out would be fine.
-  // Otherwise we might only want to create a truncate if it avoids a split.
   if (DCI.isBeforeLegalize()) {
+    unsigned IndexWidth = Index.getScalarValueSizeInBits();
+
+    // Shrink constant indices if they are larger than 32-bits.
+    // Only do this before legalize types since v2i64 could become v2i32.
+    // FIXME: We could check that the type is legal if we're after legalize
+    // types, but then we would need to construct test cases where that happens.
+    // FIXME: We could support more than just constant vectors, but we need to
+    // careful with costing. A truncate that can be optimized out would be fine.
+    // Otherwise we might only want to create a truncate if it avoids a split.
     if (auto *BV = dyn_cast<BuildVectorSDNode>(Index)) {
-      unsigned IndexWidth = Index.getScalarValueSizeInBits();
       if (BV->isConstant() && IndexWidth > 32 &&
           DAG.ComputeNumSignBits(Index) > (IndexWidth - 32)) {
         unsigned NumElts = Index.getValueType().getVectorNumElements();
@@ -42604,16 +42605,18 @@ static SDValue combineGatherScatter(SDNo
                                     Scatter->getIndexType());
       }
     }
-  }
 
-  if (DCI.isBeforeLegalizeOps()) {
-    // Remove any sign extends from 32 or smaller to larger than 32.
-    // Only do this before LegalizeOps in case we need the sign extend for
-    // legalization.
-    if (Index.getOpcode() == ISD::SIGN_EXTEND &&
-        Index.getScalarValueSizeInBits() > 32 &&
-        Index.getOperand(0).getScalarValueSizeInBits() <= 32) {
-      Index = Index.getOperand(0);
+    // Shrink any sign/zero extends from 32 or smaller to larger than 32 if
+    // there are sufficient sign bits. Only do this before legalize types to
+    // avoid creating illegal types in truncate.
+    if ((Index.getOpcode() == ISD::SIGN_EXTEND ||
+         Index.getOpcode() == ISD::ZERO_EXTEND) &&
+        IndexWidth > 32 &&
+        Index.getOperand(0).getScalarValueSizeInBits() <= 32 &&
+        DAG.ComputeNumSignBits(Index) > (IndexWidth - 32)) {
+      unsigned NumElts = Index.getValueType().getVectorNumElements();
+      EVT NewVT = EVT::getVectorVT(*DAG.getContext(), MVT::i32, NumElts);
+      Index = DAG.getNode(ISD::TRUNCATE, DL, NewVT, Index);
       if (auto *Gather = dyn_cast<MaskedGatherSDNode>(GorS)) {
         SDValue Ops[] = { Chain, Gather->getPassThru(),
                           Mask, Base, Index, Scale } ;
@@ -42630,41 +42633,20 @@ static SDValue combineGatherScatter(SDNo
                                   Ops, Scatter->getMemOperand(),
                                   Scatter->getIndexType());
     }
+  }
+
+  if (DCI.isBeforeLegalizeOps()) {
+    unsigned IndexWidth = Index.getScalarValueSizeInBits();
 
     // Make sure the index is either i32 or i64
-    unsigned ScalarSize = Index.getScalarValueSizeInBits();
-    if (ScalarSize != 32 && ScalarSize != 64) {
-      MVT EltVT = ScalarSize > 32 ? MVT::i64 : MVT::i32;
+    if (IndexWidth != 32 && IndexWidth != 64) {
+      MVT EltVT = IndexWidth > 32 ? MVT::i64 : MVT::i32;
       EVT IndexVT = EVT::getVectorVT(*DAG.getContext(), EltVT,
                                    Index.getValueType().getVectorNumElements());
       Index = DAG.getSExtOrTrunc(Index, DL, IndexVT);
       if (auto *Gather = dyn_cast<MaskedGatherSDNode>(GorS)) {
         SDValue Ops[] = { Chain, Gather->getPassThru(),
                           Mask, Base, Index, Scale } ;
-        return DAG.getMaskedGather(Gather->getVTList(),
-                                   Gather->getMemoryVT(), DL, Ops,
-                                   Gather->getMemOperand(),
-                                   Gather->getIndexType());
-      }
-      auto *Scatter = cast<MaskedScatterSDNode>(GorS);
-      SDValue Ops[] = { Chain, Scatter->getValue(),
-                        Mask, Base, Index, Scale };
-      return DAG.getMaskedScatter(Scatter->getVTList(),
-                                  Scatter->getMemoryVT(), DL,
-                                  Ops, Scatter->getMemOperand(),
-                                  Scatter->getIndexType());
-    }
-
-    // Try to remove zero extends from 32->64 if we know the sign bit of
-    // the input is zero.
-    if (Index.getOpcode() == ISD::ZERO_EXTEND &&
-        Index.getScalarValueSizeInBits() == 64 &&
-        Index.getOperand(0).getScalarValueSizeInBits() == 32 &&
-        DAG.SignBitIsZero(Index.getOperand(0))) {
-      Index = Index.getOperand(0);
-      if (auto *Gather = dyn_cast<MaskedGatherSDNode>(GorS)) {
-        SDValue Ops[] = { Chain, Gather->getPassThru(),
-                          Mask, Base, Index, Scale } ;
         return DAG.getMaskedGather(Gather->getVTList(),
                                    Gather->getMemoryVT(), DL, Ops,
                                    Gather->getMemOperand(),

Modified: llvm/trunk/test/CodeGen/X86/masked_gather_scatter.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/masked_gather_scatter.ll?rev=373982&r1=373981&r2=373982&view=diff
==============================================================================
--- llvm/trunk/test/CodeGen/X86/masked_gather_scatter.ll (original)
+++ llvm/trunk/test/CodeGen/X86/masked_gather_scatter.ll Mon Oct 7 16:03:12 2019
@@ -2693,56 +2693,32 @@ declare <8 x float> @llvm.masked.gather.
define <16 x float> @zext_i8_index(float* %base, <16 x i8> %ind) { ; KNL_64-LABEL: zext_i8_index: ; KNL_64: # %bb.0: -; KNL_64-NEXT: vpmovzxbw {{.*#+}} ymm0 = xmm0[0],zero,xmm0[1],zero,xmm0[2],zero,xmm0[3],zero,xmm0[4],zero,xmm0[5],zero,xmm0[6],zero,xmm0[7],zero,xmm0[8],zero,xmm0[9],zero,xmm0[10],zero,xmm0[11],zero,xmm0[12],zero,xmm0[13],zero,xmm0[14],zero,xmm0[15],zero -; KNL_64-NEXT: vpmovzxwq {{.*#+}} zmm1 = xmm0[0],zero,zero,zero,xmm0[1],zero,zero,zero,xmm0[2],zero,zero,zero,xmm0[3],zero,zero,zero,xmm0[4],zero,zero,zero,xmm0[5],zero,zero,zero,xmm0[6],zero,zero,zero,xmm0[7],zero,zero,zero -; KNL_64-NEXT: vextracti128 $1, %ymm0, %xmm0 -; KNL_64-NEXT: vpmovzxwq {{.*#+}} zmm0 = xmm0[0],zero,zero,zero,xmm0[1],zero,zero,zero,xmm0[2],zero,zero,zero,xmm0[3],zero,zero,zero,xmm0[4],zero,zero,zero,xmm0[5],zero,zero,zero,xmm0[6],zero,zero,zero,xmm0[7],zero,zero,zero +; KNL_64-NEXT: vpmovzxbd {{.*#+}} zmm1 = xmm0[0],zero,zero,zero,xmm0[1],zero,zero,zero,xmm0[2],zero,zero,zero,xmm0[3],zero,zero,zero,xmm0[4],zero,zero,zero,xmm0[5],zero,zero,zero,xmm0[6],zero,zero,zero,xmm0[7],zero,zero,zero,xmm0[8],zero,zero,zero,xmm0[9],zero,zero,zero,xmm0[10],zero,zero,zero,xmm0[11],zero,zero,zero,xmm0[12],zero,zero,zero,xmm0[13],zero,zero,zero,xmm0[14],zero,zero,zero,xmm0[15],zero,zero,zero ; KNL_64-NEXT: kxnorw %k0, %k0, %k1 -; KNL_64-NEXT: kxnorw %k0, %k0, %k2 -; KNL_64-NEXT: vgatherqps (%rdi,%zmm0,4), %ymm2 {%k2} -; KNL_64-NEXT: vgatherqps (%rdi,%zmm1,4), %ymm0 {%k1} -; KNL_64-NEXT: vinsertf64x4 $1, %ymm2, %zmm0, %zmm0 +; KNL_64-NEXT: vgatherdps (%rdi,%zmm1,4), %zmm0 {%k1} ; KNL_64-NEXT: retq ; ; KNL_32-LABEL: zext_i8_index: ; KNL_32: # %bb.0: ; KNL_32-NEXT: movl {{[0-9]+}}(%esp), %eax -; KNL_32-NEXT: vpmovzxbw {{.*#+}} ymm0 = xmm0[0],zero,xmm0[1],zero,xmm0[2],zero,xmm0[3],zero,xmm0[4],zero,xmm0[5],zero,xmm0[6],zero,xmm0[7],zero,xmm0[8],zero,xmm0[9],zero,xmm0[10],zero,xmm0[11],zero,xmm0[12],zero,xmm0[13],zero,xmm0[14],zero,xmm0[15],zero -; KNL_32-NEXT: vpmovzxwq {{.*#+}} zmm1 = 
xmm0[0],zero,zero,zero,xmm0[1],zero,zero,zero,xmm0[2],zero,zero,zero,xmm0[3],zero,zero,zero,xmm0[4],zero,zero,zero,xmm0[5],zero,zero,zero,xmm0[6],zero,zero,zero,xmm0[7],zero,zero,zero -; KNL_32-NEXT: vextracti128 $1, %ymm0, %xmm0 -; KNL_32-NEXT: vpmovzxwq {{.*#+}} zmm0 = xmm0[0],zero,zero,zero,xmm0[1],zero,zero,zero,xmm0[2],zero,zero,zero,xmm0[3],zero,zero,zero,xmm0[4],zero,zero,zero,xmm0[5],zero,zero,zero,xmm0[6],zero,zero,zero,xmm0[7],zero,zero,zero +; KNL_32-NEXT: vpmovzxbd {{.*#+}} zmm1 = xmm0[0],zero,zero,zero,xmm0[1],zero,zero,zero,xmm0[2],zero,zero,zero,xmm0[3],zero,zero,zero,xmm0[4],zero,zero,zero,xmm0[5],zero,zero,zero,xmm0[6],zero,zero,zero,xmm0[7],zero,zero,zero,xmm0[8],zero,zero,zero,xmm0[9],zero,zero,zero,xmm0[10],zero,zero,zero,xmm0[11],zero,zero,zero,xmm0[12],zero,zero,zero,xmm0[13],zero,zero,zero,xmm0[14],zero,zero,zero,xmm0[15],zero,zero,zero ; KNL_32-NEXT: kxnorw %k0, %k0, %k1 -; KNL_32-NEXT: kxnorw %k0, %k0, %k2 -; KNL_32-NEXT: vgatherqps (%eax,%zmm0,4), %ymm2 {%k2} -; KNL_32-NEXT: vgatherqps (%eax,%zmm1,4), %ymm0 {%k1} -; KNL_32-NEXT: vinsertf64x4 $1, %ymm2, %zmm0, %zmm0 +; KNL_32-NEXT: vgatherdps (%eax,%zmm1,4), %zmm0 {%k1} ; KNL_32-NEXT: retl ; ; SKX-LABEL: zext_i8_index: ; SKX: # %bb.0: -; SKX-NEXT: vpmovzxbw {{.*#+}} ymm0 = xmm0[0],zero,xmm0[1],zero,xmm0[2],zero,xmm0[3],zero,xmm0[4],zero,xmm0[5],zero,xmm0[6],zero,xmm0[7],zero,xmm0[8],zero,xmm0[9],zero,xmm0[10],zero,xmm0[11],zero,xmm0[12],zero,xmm0[13],zero,xmm0[14],zero,xmm0[15],zero -; SKX-NEXT: vpmovzxwq {{.*#+}} zmm1 = xmm0[0],zero,zero,zero,xmm0[1],zero,zero,zero,xmm0[2],zero,zero,zero,xmm0[3],zero,zero,zero,xmm0[4],zero,zero,zero,xmm0[5],zero,zero,zero,xmm0[6],zero,zero,zero,xmm0[7],zero,zero,zero -; SKX-NEXT: vextracti128 $1, %ymm0, %xmm0 -; SKX-NEXT: vpmovzxwq {{.*#+}} zmm0 = xmm0[0],zero,zero,zero,xmm0[1],zero,zero,zero,xmm0[2],zero,zero,zero,xmm0[3],zero,zero,zero,xmm0[4],zero,zero,zero,xmm0[5],zero,zero,zero,xmm0[6],zero,zero,zero,xmm0[7],zero,zero,zero +; SKX-NEXT: vpmovzxbd 
{{.*#+}} zmm1 = xmm0[0],zero,zero,zero,xmm0[1],zero,zero,zero,xmm0[2],zero,zero,zero,xmm0[3],zero,zero,zero,xmm0[4],zero,zero,zero,xmm0[5],zero,zero,zero,xmm0[6],zero,zero,zero,xmm0[7],zero,zero,zero,xmm0[8],zero,zero,zero,xmm0[9],zero,zero,zero,xmm0[10],zero,zero,zero,xmm0[11],zero,zero,zero,xmm0[12],zero,zero,zero,xmm0[13],zero,zero,zero,xmm0[14],zero,zero,zero,xmm0[15],zero,zero,zero ; SKX-NEXT: kxnorw %k0, %k0, %k1 -; SKX-NEXT: kxnorw %k0, %k0, %k2 -; SKX-NEXT: vgatherqps (%rdi,%zmm0,4), %ymm2 {%k2} -; SKX-NEXT: vgatherqps (%rdi,%zmm1,4), %ymm0 {%k1} -; SKX-NEXT: vinsertf64x4 $1, %ymm2, %zmm0, %zmm0 +; SKX-NEXT: vgatherdps (%rdi,%zmm1,4), %zmm0 {%k1} ; SKX-NEXT: retq ; ; SKX_32-LABEL: zext_i8_index: ; SKX_32: # %bb.0: ; SKX_32-NEXT: movl {{[0-9]+}}(%esp), %eax -; SKX_32-NEXT: vpmovzxbw {{.*#+}} ymm0 = xmm0[0],zero,xmm0[1],zero,xmm0[2],zero,xmm0[3],zero,xmm0[4],zero,xmm0[5],zero,xmm0[6],zero,xmm0[7],zero,xmm0[8],zero,xmm0[9],zero,xmm0[10],zero,xmm0[11],zero,xmm0[12],zero,xmm0[13],zero,xmm0[14],zero,xmm0[15],zero -; SKX_32-NEXT: vpmovzxwq {{.*#+}} zmm1 = xmm0[0],zero,zero,zero,xmm0[1],zero,zero,zero,xmm0[2],zero,zero,zero,xmm0[3],zero,zero,zero,xmm0[4],zero,zero,zero,xmm0[5],zero,zero,zero,xmm0[6],zero,zero,zero,xmm0[7],zero,zero,zero -; SKX_32-NEXT: vextracti128 $1, %ymm0, %xmm0 -; SKX_32-NEXT: vpmovzxwq {{.*#+}} zmm0 = xmm0[0],zero,zero,zero,xmm0[1],zero,zero,zero,xmm0[2],zero,zero,zero,xmm0[3],zero,zero,zero,xmm0[4],zero,zero,zero,xmm0[5],zero,zero,zero,xmm0[6],zero,zero,zero,xmm0[7],zero,zero,zero +; SKX_32-NEXT: vpmovzxbd {{.*#+}} zmm1 = xmm0[0],zero,zero,zero,xmm0[1],zero,zero,zero,xmm0[2],zero,zero,zero,xmm0[3],zero,zero,zero,xmm0[4],zero,zero,zero,xmm0[5],zero,zero,zero,xmm0[6],zero,zero,zero,xmm0[7],zero,zero,zero,xmm0[8],zero,zero,zero,xmm0[9],zero,zero,zero,xmm0[10],zero,zero,zero,xmm0[11],zero,zero,zero,xmm0[12],zero,zero,zero,xmm0[13],zero,zero,zero,xmm0[14],zero,zero,zero,xmm0[15],zero,zero,zero ; SKX_32-NEXT: kxnorw %k0, %k0, %k1 -; SKX_32-NEXT: 
kxnorw %k0, %k0, %k2 -; SKX_32-NEXT: vgatherqps (%eax,%zmm0,4), %ymm2 {%k2} -; SKX_32-NEXT: vgatherqps (%eax,%zmm1,4), %ymm0 {%k1} -; SKX_32-NEXT: vinsertf64x4 $1, %ymm2, %zmm0, %zmm0 +; SKX_32-NEXT: vgatherdps (%eax,%zmm1,4), %zmm0 {%k1} ; SKX_32-NEXT: retl %zext_ind = zext <16 x i8> %ind to <16 x i64> @@ -2756,32 +2732,36 @@ define <16 x float> @zext_i8_index(float define <8 x float> @zext_v8i8_index(float* %base, <8 x i8> %ind) { ; KNL_64-LABEL: zext_v8i8_index: ; KNL_64: # %bb.0: -; KNL_64-NEXT: vpmovzxbq {{.*#+}} zmm1 = xmm0[0],zero,zero,zero,zero,zero,zero,zero,xmm0[1],zero,zero,zero,zero,zero,zero,zero,xmm0[2],zero,zero,zero,zero,zero,zero,zero,xmm0[3],zero,zero,zero,zero,zero,zero,zero,xmm0[4],zero,zero,zero,zero,zero,zero,zero,xmm0[5],zero,zero,zero,zero,zero,zero,zero,xmm0[6],zero,zero,zero,zero,zero,zero,zero,xmm0[7],zero,zero,zero,zero,zero,zero,zero -; KNL_64-NEXT: kxnorw %k0, %k0, %k1 -; KNL_64-NEXT: vgatherqps (%rdi,%zmm1,4), %ymm0 {%k1} +; KNL_64-NEXT: vpmovzxbd {{.*#+}} ymm1 = xmm0[0],zero,zero,zero,xmm0[1],zero,zero,zero,xmm0[2],zero,zero,zero,xmm0[3],zero,zero,zero,xmm0[4],zero,zero,zero,xmm0[5],zero,zero,zero,xmm0[6],zero,zero,zero,xmm0[7],zero,zero,zero +; KNL_64-NEXT: movw $255, %ax +; KNL_64-NEXT: kmovw %eax, %k1 +; KNL_64-NEXT: vgatherdps (%rdi,%zmm1,4), %zmm0 {%k1} +; KNL_64-NEXT: # kill: def $ymm0 killed $ymm0 killed $zmm0 ; KNL_64-NEXT: retq ; ; KNL_32-LABEL: zext_v8i8_index: ; KNL_32: # %bb.0: ; KNL_32-NEXT: movl {{[0-9]+}}(%esp), %eax -; KNL_32-NEXT: vpmovzxbq {{.*#+}} zmm1 = xmm0[0],zero,zero,zero,zero,zero,zero,zero,xmm0[1],zero,zero,zero,zero,zero,zero,zero,xmm0[2],zero,zero,zero,zero,zero,zero,zero,xmm0[3],zero,zero,zero,zero,zero,zero,zero,xmm0[4],zero,zero,zero,zero,zero,zero,zero,xmm0[5],zero,zero,zero,zero,zero,zero,zero,xmm0[6],zero,zero,zero,zero,zero,zero,zero,xmm0[7],zero,zero,zero,zero,zero,zero,zero -; KNL_32-NEXT: kxnorw %k0, %k0, %k1 -; KNL_32-NEXT: vgatherqps (%eax,%zmm1,4), %ymm0 {%k1} +; KNL_32-NEXT: vpmovzxbd 
{{.*#+}} ymm1 = xmm0[0],zero,zero,zero,xmm0[1],zero,zero,zero,xmm0[2],zero,zero,zero,xmm0[3],zero,zero,zero,xmm0[4],zero,zero,zero,xmm0[5],zero,zero,zero,xmm0[6],zero,zero,zero,xmm0[7],zero,zero,zero +; KNL_32-NEXT: movw $255, %cx +; KNL_32-NEXT: kmovw %ecx, %k1 +; KNL_32-NEXT: vgatherdps (%eax,%zmm1,4), %zmm0 {%k1} +; KNL_32-NEXT: # kill: def $ymm0 killed $ymm0 killed $zmm0 ; KNL_32-NEXT: retl ; ; SKX-LABEL: zext_v8i8_index: ; SKX: # %bb.0: -; SKX-NEXT: vpmovzxbq {{.*#+}} zmm1 = xmm0[0],zero,zero,zero,zero,zero,zero,zero,xmm0[1],zero,zero,zero,zero,zero,zero,zero,xmm0[2],zero,zero,zero,zero,zero,zero,zero,xmm0[3],zero,zero,zero,zero,zero,zero,zero,xmm0[4],zero,zero,zero,zero,zero,zero,zero,xmm0[5],zero,zero,zero,zero,zero,zero,zero,xmm0[6],zero,zero,zero,zero,zero,zero,zero,xmm0[7],zero,zero,zero,zero,zero,zero,zero +; SKX-NEXT: vpmovzxbd {{.*#+}} ymm1 = xmm0[0],zero,zero,zero,xmm0[1],zero,zero,zero,xmm0[2],zero,zero,zero,xmm0[3],zero,zero,zero,xmm0[4],zero,zero,zero,xmm0[5],zero,zero,zero,xmm0[6],zero,zero,zero,xmm0[7],zero,zero,zero ; SKX-NEXT: kxnorw %k0, %k0, %k1 -; SKX-NEXT: vgatherqps (%rdi,%zmm1,4), %ymm0 {%k1} +; SKX-NEXT: vgatherdps (%rdi,%ymm1,4), %ymm0 {%k1} ; SKX-NEXT: retq ; ; SKX_32-LABEL: zext_v8i8_index: ; SKX_32: # %bb.0: ; SKX_32-NEXT: movl {{[0-9]+}}(%esp), %eax -; SKX_32-NEXT: vpmovzxbq {{.*#+}} zmm1 = xmm0[0],zero,zero,zero,zero,zero,zero,zero,xmm0[1],zero,zero,zero,zero,zero,zero,zero,xmm0[2],zero,zero,zero,zero,zero,zero,zero,xmm0[3],zero,zero,zero,zero,zero,zero,zero,xmm0[4],zero,zero,zero,zero,zero,zero,zero,xmm0[5],zero,zero,zero,zero,zero,zero,zero,xmm0[6],zero,zero,zero,zero,zero,zero,zero,xmm0[7],zero,zero,zero,zero,zero,zero,zero +; SKX_32-NEXT: vpmovzxbd {{.*#+}} ymm1 = xmm0[0],zero,zero,zero,xmm0[1],zero,zero,zero,xmm0[2],zero,zero,zero,xmm0[3],zero,zero,zero,xmm0[4],zero,zero,zero,xmm0[5],zero,zero,zero,xmm0[6],zero,zero,zero,xmm0[7],zero,zero,zero ; SKX_32-NEXT: kxnorw %k0, %k0, %k1 -; SKX_32-NEXT: vgatherqps (%eax,%zmm1,4), %ymm0 
{%k1} +; SKX_32-NEXT: vgatherdps (%eax,%ymm1,4), %ymm0 {%k1} ; SKX_32-NEXT: retl %zext_ind = zext <8 x i8> %ind to <8 x i64> From llvm-commits at lists.llvm.org Mon Oct 7 16:04:16 2019 From: llvm-commits at lists.llvm.org (Vitaly Buka via llvm-commits) Date: Mon, 07 Oct 2019 23:04:16 -0000 Subject: [compiler-rt] r373983 - [tsan, go] break commands into multiple lines Message-ID: <20191007230416.33ED78D4E1@lists.llvm.org> Author: vitalybuka Date: Mon Oct 7 16:04:16 2019 New Revision: 373983 URL: http://llvm.org/viewvc/llvm-project?rev=373983&view=rev Log: [tsan, go] break commands into multiple lines Summary: Patch by Keith Randall. Reviewers: dvyukov, vitalybuka Subscribers: delcypher, jfb, #sanitizers, llvm-commits Tags: #llvm, #sanitizers Differential Revision: https://reviews.llvm.org/D68596 Modified: compiler-rt/trunk/lib/tsan/go/build.bat Modified: compiler-rt/trunk/lib/tsan/go/build.bat URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/tsan/go/build.bat?rev=373983&r1=373982&r2=373983&view=diff ============================================================================== --- compiler-rt/trunk/lib/tsan/go/build.bat (original) +++ compiler-rt/trunk/lib/tsan/go/build.bat Mon Oct 7 16:04:16 2019 @@ -1,4 +1,56 @@ -type tsan_go.cpp ..\rtl\tsan_interface_atomic.cpp ..\rtl\tsan_clock.cpp ..\rtl\tsan_flags.cpp ..\rtl\tsan_md5.cpp ..\rtl\tsan_mutex.cpp ..\rtl\tsan_report.cpp ..\rtl\tsan_rtl.cpp ..\rtl\tsan_rtl_mutex.cpp ..\rtl\tsan_rtl_report.cpp ..\rtl\tsan_rtl_thread.cpp ..\rtl\tsan_rtl_proc.cpp ..\rtl\tsan_stat.cpp ..\rtl\tsan_suppressions.cpp ..\rtl\tsan_sync.cpp ..\rtl\tsan_stack_trace.cpp ..\..\sanitizer_common\sanitizer_allocator.cpp ..\..\sanitizer_common\sanitizer_common.cpp ..\..\sanitizer_common\sanitizer_flags.cpp ..\..\sanitizer_common\sanitizer_stacktrace.cpp ..\..\sanitizer_common\sanitizer_libc.cpp ..\..\sanitizer_common\sanitizer_printf.cpp ..\..\sanitizer_common\sanitizer_suppressions.cpp 
..\..\sanitizer_common\sanitizer_thread_registry.cpp ..\rtl\tsan_platform_windows.cpp ..\..\sanitizer_common\sanitizer_win.cpp ..\..\sanitizer_common\sanitizer_deadlock_detector1.cpp ..\..\sanitizer_common\sanitizer_stackdepot.cpp ..\..\sanitizer_common\sanitizer_persistent_allocator.cpp ..\..\sanitizer_common\sanitizer_flag_parser.cpp ..\..\sanitizer_common\sanitizer_symbolizer.cpp ..\..\sanitizer_common\sanitizer_termination.cpp > gotsan.cpp - -gcc -c -o race_windows_amd64.syso gotsan.cpp -I..\rtl -I..\.. -I..\..\sanitizer_common -I..\..\..\include -m64 -Wall -fno-exceptions -fno-rtti -DSANITIZER_GO=1 -Wno-error=attributes -Wno-attributes -Wno-format -Wno-maybe-uninitialized -DSANITIZER_DEBUG=0 -O3 -fomit-frame-pointer -std=c++11 +type ^ + tsan_go.cpp ^ + ..\rtl\tsan_interface_atomic.cpp ^ + ..\rtl\tsan_clock.cpp ^ + ..\rtl\tsan_flags.cpp ^ + ..\rtl\tsan_md5.cpp ^ + ..\rtl\tsan_mutex.cpp ^ + ..\rtl\tsan_report.cpp ^ + ..\rtl\tsan_rtl.cpp ^ + ..\rtl\tsan_rtl_mutex.cpp ^ + ..\rtl\tsan_rtl_report.cpp ^ + ..\rtl\tsan_rtl_thread.cpp ^ + ..\rtl\tsan_rtl_proc.cpp ^ + ..\rtl\tsan_stat.cpp ^ + ..\rtl\tsan_suppressions.cpp ^ + ..\rtl\tsan_sync.cpp ^ + ..\rtl\tsan_stack_trace.cpp ^ + ..\..\sanitizer_common\sanitizer_allocator.cpp ^ + ..\..\sanitizer_common\sanitizer_common.cpp ^ + ..\..\sanitizer_common\sanitizer_flags.cpp ^ + ..\..\sanitizer_common\sanitizer_stacktrace.cpp ^ + ..\..\sanitizer_common\sanitizer_libc.cpp ^ + ..\..\sanitizer_common\sanitizer_printf.cpp ^ + ..\..\sanitizer_common\sanitizer_suppressions.cpp ^ + ..\..\sanitizer_common\sanitizer_thread_registry.cpp ^ + ..\rtl\tsan_platform_windows.cpp ^ + ..\..\sanitizer_common\sanitizer_win.cpp ^ + ..\..\sanitizer_common\sanitizer_deadlock_detector1.cpp ^ + ..\..\sanitizer_common\sanitizer_stackdepot.cpp ^ + ..\..\sanitizer_common\sanitizer_persistent_allocator.cpp ^ + ..\..\sanitizer_common\sanitizer_flag_parser.cpp ^ + ..\..\sanitizer_common\sanitizer_symbolizer.cpp ^ + 
..\..\sanitizer_common\sanitizer_termination.cpp ^
+ > gotsan.cpp
+gcc ^
+ -c ^
+ -o race_windows_amd64.syso ^
+ gotsan.cpp ^
+ -I..\rtl ^
+ -I..\.. ^
+ -I..\..\sanitizer_common ^
+ -I..\..\..\include ^
+ -m64 ^
+ -Wall ^
+ -fno-exceptions ^
+ -fno-rtti ^
+ -DSANITIZER_GO=1 ^
+ -Wno-error=attributes ^
+ -Wno-attributes ^
+ -Wno-format ^
+ -Wno-maybe-uninitialized ^
+ -DSANITIZER_DEBUG=0 ^
+ -O3 ^
+ -fomit-frame-pointer ^
+ -std=c++11

From llvm-commits at lists.llvm.org Mon Oct 7 16:11:07 2019
From: llvm-commits at lists.llvm.org (Vitaly Buka via llvm-commits)
Date: Mon, 07 Oct 2019 23:11:07 -0000
Subject: [compiler-rt] r373984 - [tsan, go] fix Go windows build
Message-ID: <20191007231107.BB56D87F03@lists.llvm.org>

Author: vitalybuka
Date: Mon Oct 7 16:11:07 2019
New Revision: 373984

URL: http://llvm.org/viewvc/llvm-project?rev=373984&view=rev
Log:
[tsan, go] fix Go windows build

Summary:
Don't use weak exports when building tsan into a shared library for Go. gcc can't handle the pragmas used to make the weak references.

Include files that have been added since the last update to build.bat. (We should really find a better way to list all the files needed.)

Add windows version defines (WINVER and _WIN32_WINNT) to get AcquireSRWLockExclusive and ReleaseSRWLockExclusive defined.

Define GetProcessMemoryInfo to use the kernel32 version. This is kind of a hack, the windows header files should do this translation for us. I think we're not in the right family partition (we're using Desktop, but that translation only happens for App and System partitions???), but hacking the family partition seems equally gross and I have no idea what the consequences of that might be.

Patch by Keith Randall.
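For readers unfamiliar with the effect of a define such as -DGetProcessMemoryInfo=K32GetProcessMemoryInfo described in the summary above, the following is a minimal, portable sketch of the mechanism only. The names `K32QuerySketch`, `QuerySketch` and `queryForProcess` are hypothetical stand-ins invented for this illustration, not the Windows API, and the `#define` is written in the source (rather than passed on the command line) purely so the snippet is self-contained.

```cpp
#include <cassert>

// Hypothetical stand-in for the prefixed function that actually exists.
static int K32QuerySketch(int pid) { return pid + 1000; }

// Assumed to be equivalent to passing -DQuerySketch=K32QuerySketch to the
// compiler: the preprocessor rewrites every later use of QuerySketch.
#define QuerySketch K32QuerySketch

// Caller code is written against the unprefixed name, yet ends up
// calling the prefixed function after preprocessing.
int queryForProcess(int pid) { return QuerySketch(pid); }
```

The caller never mentions the prefixed name; the redirection happens entirely in the preprocessor, which is why the summary calls it a hack compared with letting the headers do the translation.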
Reviewers: dvyukov, vitalybuka Reviewed By: vitalybuka Subscribers: jfb, delcypher, #sanitizers, llvm-commits Tags: #llvm, #sanitizers Differential Revision: https://reviews.llvm.org/D68599 Modified: compiler-rt/trunk/lib/sanitizer_common/sanitizer_win_defs.h compiler-rt/trunk/lib/tsan/go/build.bat Modified: compiler-rt/trunk/lib/sanitizer_common/sanitizer_win_defs.h URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/sanitizer_common/sanitizer_win_defs.h?rev=373984&r1=373983&r2=373984&view=diff ============================================================================== --- compiler-rt/trunk/lib/sanitizer_common/sanitizer_win_defs.h (original) +++ compiler-rt/trunk/lib/sanitizer_common/sanitizer_win_defs.h Mon Oct 7 16:11:07 2019 @@ -43,6 +43,8 @@ #define STRINGIFY_(A) #A #define STRINGIFY(A) STRINGIFY_(A) +#if !SANITIZER_GO + // ----------------- A workaround for the absence of weak symbols -------------- // We don't have a direct equivalent of weak symbols when using MSVC, but we can // use the /alternatename directive to tell the linker to default a specific @@ -158,5 +160,15 @@ // return a >= b; // } // + +#else // SANITIZER_GO + +// Go neither needs nor wants weak references. +// The shenanigans above don't work for gcc. +# define WIN_WEAK_EXPORT_DEF(ReturnType, Name, ...) 
\ + extern "C" ReturnType Name(__VA_ARGS__) + +#endif // SANITIZER_GO + #endif // SANITIZER_WINDOWS #endif // SANITIZER_WIN_DEFS_H Modified: compiler-rt/trunk/lib/tsan/go/build.bat URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/tsan/go/build.bat?rev=373984&r1=373983&r2=373984&view=diff ============================================================================== --- compiler-rt/trunk/lib/tsan/go/build.bat (original) +++ compiler-rt/trunk/lib/tsan/go/build.bat Mon Oct 7 16:11:07 2019 @@ -31,6 +31,9 @@ type ^ ..\..\sanitizer_common\sanitizer_flag_parser.cpp ^ ..\..\sanitizer_common\sanitizer_symbolizer.cpp ^ ..\..\sanitizer_common\sanitizer_termination.cpp ^ + ..\..\sanitizer_common\sanitizer_file.cpp ^ + ..\..\sanitizer_common\sanitizer_symbolizer_report.cpp ^ + ..\rtl\tsan_external.cpp ^ > gotsan.cpp gcc ^ @@ -46,6 +49,9 @@ gcc ^ -fno-exceptions ^ -fno-rtti ^ -DSANITIZER_GO=1 ^ + -DWINVER=0x0600 ^ + -D_WIN32_WINNT=0x0600 ^ + -DGetProcessMemoryInfo=K32GetProcessMemoryInfo ^ -Wno-error=attributes ^ -Wno-attributes ^ -Wno-format ^ From llvm-commits at lists.llvm.org Mon Oct 7 16:10:17 2019 From: llvm-commits at lists.llvm.org (Vitaly Buka via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 23:10:17 +0000 (UTC) Subject: [PATCH] D68602: Split two defines into two lines In-Reply-To: References: Message-ID: <3924977893b184681e1b5368e2c65336@localhost.localdomain> vitalybuka requested changes to this revision. vitalybuka added a comment. This revision now requires changes to proceed. already committed Repository: rCRT Compiler Runtime CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68602/new/ https://reviews.llvm.org/D68602 From llvm-commits at lists.llvm.org Mon Oct 7 16:11:32 2019 From: llvm-commits at lists.llvm.org (Johannes Doerfert via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 23:11:32 +0000 (UTC) Subject: [PATCH] D68484: [PATCH 01/38] [noalias] LangRef: noalias intrinsics and noalias_sidechannel documentation. 
In-Reply-To:
References:
Message-ID: <17c40b9251a0d039aeaf2dce418996b3@localhost.localdomain>

jdoerfert added inline comments.

================
Comment at: llvm/docs/LangRef.rst:16303
+- ``p.objId``: a number that can be used to differentiate different *object P*
+ when ``%p.addr`` is optimized away.
+- ``!p.scope``: metadata argument that refers to a list of alias.scope metadata
----------------
jeroen.dobbelaere wrote:
> jdoerfert wrote:
> > This seems odd, why introduce two things that do the same thing.
> The original idea was to treat '%p.addr' sometimes as a pointer to an object and sometimes as an offset. Later it needed to be separated: SROA first splits alloca's into multiple smaller alloca's. Each separate restrict pointer now points to its own alloca (%p.addr), and there is no place to put the offset. You can differentiate by splitting the p.scope, but that would imply duplicating scopes all over the place. The p.objId serves as a convenient and less costly solution to differentiate the pointers in this case.

So `objId` is an offset into `p.addr`? If so, let's document it that way. How does this work if there are multiple restrict pointers in the object, e.g. `struct { restrict *a; restrict *b }`?

Maybe it would help if you point me towards the place where I can see this intrinsic in action. At least then I might be able to provide better feedback on the wording.

================
Comment at: llvm/docs/LangRef.rst:16306
+ entries that contains exactly one element. It represents the variable
+ declaration that contains one or more restrict pointers.
+- ``%p.decl``: points to the ``@llvm.noalias.decl`` intrinsic associated with
----------------
jeroen.dobbelaere wrote:
> jdoerfert wrote:
> > "entries with a single element each."
> >
> > > It represents the variable declaration that contains one or more restrict pointers.
> > I do not understand this sentence.
> hmm. Not sure how to explain it further.
What I want to say is (shown with an example):
> int *restrict A; // one !p.scope, one restrict pointer
> int *restrict B[10]; // another (single) !p.scope, ten restrict pointers
> struct FOO { int* restrict mA; int * mB; int* restrict mC; } C; // yet another !p.scope, 2 restrict pointers
>

In that example, what do the `p.scopes` look like? Or, asked differently, is the `p.scope` a consequence of the declaration, hence does it uniquely identify a declaration?

================
Comment at: llvm/docs/LangRef.rst:16496
+not really represent a value. It is merely used to track a dependency on the
+declaration.
+
----------------
jeroen.dobbelaere wrote:
> jdoerfert wrote:
> > The above reads funny, maybe:
> > "The returned value is a handle to track dependences on the declaration. There is no explicit relationship to the value of the arguments."
> > Also, why do we want an `i8*` then? We have `tokens` and we have `i32`, I'd prefer either over an `i8*` which is more confusing in this context full of `i8*` that are actually pointers (IMHO).
> I think a token has too many restrictions (no PHI, no select). i32 might do. I didn't think too much about it and just settled on i8*.

If the token is too restrictive I'd still prefer an i32 (or similar) to avoid confusion with all the i8 pointers that fly around. The wording will then make it clear that these are tokens.
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68484/new/ https://reviews.llvm.org/D68484 From llvm-commits at lists.llvm.org Mon Oct 7 16:14:58 2019 From: llvm-commits at lists.llvm.org (Johannes Doerfert via llvm-commits) Date: Mon, 07 Oct 2019 23:14:58 -0000 Subject: [llvm] r373985 - [Attributor] Use abstract call sites for call site callback Message-ID: <20191007231458.EF24B8DE8D@lists.llvm.org> Author: jdoerfert Date: Mon Oct 7 16:14:58 2019 New Revision: 373985 URL: http://llvm.org/viewvc/llvm-project?rev=373985&view=rev Log: [Attributor] Use abstract call sites for call site callback Summary: When we iterate over uses of functions and expect them to be call sites, we now use abstract call sites to allow callback calls. Reviewers: sstefan1, uenoku Subscribers: hiraditya, bollu, hfinkel, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67871 Added: llvm/trunk/test/Transforms/FunctionAttrs/callbacks.ll Modified: llvm/trunk/include/llvm/IR/CallSite.h llvm/trunk/include/llvm/Transforms/IPO/Attributor.h llvm/trunk/lib/Transforms/IPO/Attributor.cpp Modified: llvm/trunk/include/llvm/IR/CallSite.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/IR/CallSite.h?rev=373985&r1=373984&r2=373985&view=diff ============================================================================== --- llvm/trunk/include/llvm/IR/CallSite.h (original) +++ llvm/trunk/include/llvm/IR/CallSite.h Mon Oct 7 16:14:58 2019 @@ -854,6 +854,15 @@ public: return CI.ParameterEncoding[0]; } + /// Return the use of the callee value in the underlying instruction. Only + /// valid for callback calls! + const Use &getCalleeUseForCallback() const { + int CalleeArgIdx = getCallArgOperandNoForCallee(); + assert(CalleeArgIdx >= 0 && + unsigned(CalleeArgIdx) < getInstruction()->getNumOperands()); + return getInstruction()->getOperandUse(CalleeArgIdx); + } + /// Return the pointer to function that is being called. 
Value *getCalledValue() const { if (isDirectCall()) Modified: llvm/trunk/include/llvm/Transforms/IPO/Attributor.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Transforms/IPO/Attributor.h?rev=373985&r1=373984&r2=373985&view=diff ============================================================================== --- llvm/trunk/include/llvm/Transforms/IPO/Attributor.h (original) +++ llvm/trunk/include/llvm/Transforms/IPO/Attributor.h Mon Oct 7 16:14:58 2019 @@ -216,6 +216,16 @@ struct IRPosition { ArgNo); } + /// Create a position describing the argument of \p ACS at position \p ArgNo. + static const IRPosition callsite_argument(AbstractCallSite ACS, + unsigned ArgNo) { + int CSArgNo = ACS.getCallArgOperandNo(ArgNo); + if (CSArgNo >= 0) + return IRPosition::callsite_argument( + cast(*ACS.getInstruction()), CSArgNo); + return IRPosition(); + } + /// Create a position with function scope matching the "context" of \p IRP. /// If \p IRP is a call site (see isAnyCallSitePosition()) then the result /// will be a call site position, otherwise the function position of the @@ -825,7 +835,7 @@ struct Attributor { /// This method will evaluate \p Pred on call sites and return /// true if \p Pred holds in every call sites. However, this is only possible /// all call sites are known, hence the function has internal linkage. 
- bool checkForAllCallSites(const function_ref &Pred, + bool checkForAllCallSites(const function_ref &Pred, const AbstractAttribute &QueryingAA, bool RequireAllCallSites); Modified: llvm/trunk/lib/Transforms/IPO/Attributor.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/IPO/Attributor.cpp?rev=373985&r1=373984&r2=373985&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/IPO/Attributor.cpp (original) +++ llvm/trunk/lib/Transforms/IPO/Attributor.cpp Mon Oct 7 16:14:58 2019 @@ -596,11 +596,16 @@ static void clampCallSiteArgumentStates( // The argument number which is also the call site argument number. unsigned ArgNo = QueryingAA.getIRPosition().getArgNo(); - auto CallSiteCheck = [&](CallSite CS) { - const IRPosition &CSArgPos = IRPosition::callsite_argument(CS, ArgNo); - const AAType &AA = A.getAAFor(QueryingAA, CSArgPos); - LLVM_DEBUG(dbgs() << "[Attributor] CS: " << *CS.getInstruction() - << " AA: " << AA.getAsStr() << " @" << CSArgPos << "\n"); + auto CallSiteCheck = [&](AbstractCallSite ACS) { + const IRPosition &ACSArgPos = IRPosition::callsite_argument(ACS, ArgNo); + // Check if a coresponding argument was found or if it is on not associated + // (which can happen for callback calls). 
+ if (ACSArgPos.getPositionKind() == IRPosition::IRP_INVALID) + return false; + + const AAType &AA = A.getAAFor(QueryingAA, ACSArgPos); + LLVM_DEBUG(dbgs() << "[Attributor] ACS: " << *ACS.getInstruction() + << " AA: " << AA.getAsStr() << " @" << ACSArgPos << "\n"); const StateType &AAS = static_cast(AA.getState()); if (T.hasValue()) *T &= AAS; @@ -3100,9 +3105,12 @@ struct AAValueSimplifyArgument final : A ChangeStatus updateImpl(Attributor &A) override { bool HasValueBefore = SimplifiedAssociatedValue.hasValue(); - auto PredForCallSite = [&](CallSite CS) { - return checkAndUpdate(A, *this, *CS.getArgOperand(getArgNo()), - SimplifiedAssociatedValue); + auto PredForCallSite = [&](AbstractCallSite ACS) { + // Check if we have an associated argument or not (which can happen for + // callback calls). + if (Value *ArgOp = ACS.getCallArgOperand(getArgNo())) + return checkAndUpdate(A, *this, *ArgOp, SimplifiedAssociatedValue); + return false; }; if (!A.checkForAllCallSites(PredForCallSite, *this, true)) @@ -3914,9 +3922,9 @@ bool Attributor::isAssumedDead(const Abs return true; } -bool Attributor::checkForAllCallSites(const function_ref &Pred, - const AbstractAttribute &QueryingAA, - bool RequireAllCallSites) { +bool Attributor::checkForAllCallSites( + const function_ref &Pred, + const AbstractAttribute &QueryingAA, bool RequireAllCallSites) { // We can try to determine information from // the call sites. However, this is only possible all call sites are known, // hence the function has internal linkage. @@ -3934,15 +3942,21 @@ bool Attributor::checkForAllCallSites(co } for (const Use &U : AssociatedFunction->uses()) { - Instruction *I = dyn_cast(U.getUser()); - // TODO: Deal with abstract call sites here. 
- if (!I) + AbstractCallSite ACS(&U); + if (!ACS) { + LLVM_DEBUG(dbgs() << "[Attributor] Function " + << AssociatedFunction->getName() + << " has non call site use " << *U.get() << " in " + << *U.getUser() << "\n"); return false; + } + Instruction *I = ACS.getInstruction(); Function *Caller = I->getFunction(); - const auto &LivenessAA = getAAFor( - QueryingAA, IRPosition::function(*Caller), /* TrackDependence */ false); + const auto &LivenessAA = + getAAFor(QueryingAA, IRPosition::function(*Caller), + /* TrackDependence */ false); // Skip dead calls. if (LivenessAA.isAssumedDead(I)) { @@ -3952,22 +3966,22 @@ bool Attributor::checkForAllCallSites(co continue; } - CallSite CS(U.getUser()); - if (!CS || !CS.isCallee(&U)) { + const Use *EffectiveUse = + ACS.isCallbackCall() ? &ACS.getCalleeUseForCallback() : &U; + if (!ACS.isCallee(EffectiveUse)) { if (!RequireAllCallSites) continue; - - LLVM_DEBUG(dbgs() << "[Attributor] User " << *U.getUser() + LLVM_DEBUG(dbgs() << "[Attributor] User " << EffectiveUse->getUser() << " is an invalid use of " << AssociatedFunction->getName() << "\n"); return false; } - if (Pred(CS)) + if (Pred(ACS)) continue; LLVM_DEBUG(dbgs() << "[Attributor] Call site callback failed for " - << *CS.getInstruction() << "\n"); + << *ACS.getInstruction() << "\n"); return false; } @@ -4319,7 +4333,7 @@ ChangeStatus Attributor::run(Module &M) const auto *LivenessAA = lookupAAFor(IRPosition::function(*F)); if (LivenessAA && - !checkForAllCallSites([](CallSite CS) { return false; }, + !checkForAllCallSites([](AbstractCallSite ACS) { return false; }, *LivenessAA, true)) continue; Added: llvm/trunk/test/Transforms/FunctionAttrs/callbacks.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/FunctionAttrs/callbacks.ll?rev=373985&view=auto ============================================================================== --- llvm/trunk/test/Transforms/FunctionAttrs/callbacks.ll (added) +++ llvm/trunk/test/Transforms/FunctionAttrs/callbacks.ll Mon 
Oct 7 16:14:58 2019 @@ -0,0 +1,63 @@ +; NOTE: Assertions have been autogenerated by utils/update_test_checks.py +; RUN: opt -S -passes=attributor -aa-pipeline='basic-aa' -attributor-disable=false -attributor-max-iterations-verify -attributor-max-iterations=1 < %s | FileCheck %s +; ModuleID = 'callback_simple.c' +target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128" + +; Test 0 +; +; Make sure we propagate information from the caller to the callback callee but +; only for arguments that are mapped through the callback metadata. Here, the +; first two arguments of the call and the callback callee do not correspond to +; each other but argument 3-5 of the transitive call site in the caller match +; arguments 2-4 of the callback callee. Here we should see information and value +; transfer in both directions. +; FIXME: The callee -> call site direction is not working yet. + +define void @t0_caller(i32* %a) { +; CHECK: @t0_caller(i32* [[A:%.*]]) +; CHECK-NEXT: entry: +; CHECK-NEXT: [[B:%.*]] = alloca i32, align 32 +; CHECK-NEXT: [[C:%.*]] = alloca i32*, align 64 +; CHECK-NEXT: [[PTR:%.*]] = alloca i32, align 128 +; CHECK-NEXT: [[TMP0:%.*]] = bitcast i32* [[B]] to i8* +; CHECK-NEXT: store i32 42, i32* [[B]], align 32 +; CHECK-NEXT: store i32* [[B]], i32** [[C]], align 64 +; CHECK-NEXT: call void (i32*, i32*, void (i32*, i32*, ...)*, ...) @t0_callback_broker(i32* null, i32* nonnull align 128 dereferenceable(4) [[PTR]], void (i32*, i32*, ...)* nonnull bitcast (void (i32*, i32*, i32*, i64, i32**)* @t0_callback_callee to void (i32*, i32*, ...)*), i32* [[A:%.*]], i64 99, i32** nonnull align 64 dereferenceable(8) [[C]]) +; CHECK-NEXT: ret void +; +entry: + %b = alloca i32, align 32 + %c = alloca i32*, align 64 + %ptr = alloca i32, align 128 + %0 = bitcast i32* %b to i8* + store i32 42, i32* %b, align 4 + store i32* %b, i32** %c, align 8 + call void (i32*, i32*, void (i32*, i32*, ...)*, ...) 
@t0_callback_broker(i32* null, i32* %ptr, void (i32*, i32*, ...)* bitcast (void (i32*, i32*, i32*, i64, i32**)* @t0_callback_callee to void (i32*, i32*, ...)*), i32* %a, i64 99, i32** %c) + ret void +} + +; Note that the first two arguments are provided by the callback_broker according to the callback in !1 below! +; The others are annotated with alignment information, amongst others, or even replaced by the constants passed to the call. +define internal void @t0_callback_callee(i32* %is_not_null, i32* %ptr, i32* %a, i64 %b, i32** %c) { +; CHECK: @t0_callback_callee(i32* nocapture writeonly [[IS_NOT_NULL:%.*]], i32* nocapture readonly [[PTR:%.*]], i32* [[A:%.*]], i64 [[B:%.*]], i32** nocapture nonnull readonly align 64 dereferenceable(8) [[C:%.*]]) +; CHECK-NEXT: entry: +; CHECK-NEXT: [[PTR_VAL:%.*]] = load i32, i32* [[PTR:%.*]], align 8 +; CHECK-NEXT: store i32 [[PTR_VAL]], i32* [[IS_NOT_NULL:%.*]] +; CHECK-NEXT: [[TMP0:%.*]] = load i32*, i32** [[C:%.*]], align 64 +; CHECK-NEXT: tail call void @t0_check(i32* align 256 [[A:%.*]], i64 99, i32* [[TMP0]]) +; CHECK-NEXT: ret void +; +entry: + %ptr_val = load i32, i32* %ptr, align 8 + store i32 %ptr_val, i32* %is_not_null + %0 = load i32*, i32** %c, align 8 + tail call void @t0_check(i32* %a, i64 %b, i32* %0) + ret void +} + +declare void @t0_check(i32* align 256, i64, i32*) + +declare !callback !0 void @t0_callback_broker(i32*, i32*, void (i32*, i32*, ...)*, ...) + +!0 = !{!1} +!1 = !{i64 2, i64 -1, i64 -1, i1 true} From llvm-commits at lists.llvm.org Mon Oct 7 16:17:08 2019 From: llvm-commits at lists.llvm.org (Keith Randall via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 23:17:08 +0000 (UTC) Subject: [PATCH] D68599: [tsan, go] fix Go windows build In-Reply-To: References: Message-ID: <40a99124ec754756e41df4375d3344cd@localhost.localdomain> randall77 added a comment. Vitaly, thanks for all your help! 
Repository: rCRT Compiler Runtime CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68599/new/ https://reviews.llvm.org/D68599 From llvm-commits at lists.llvm.org Mon Oct 7 16:21:52 2019 From: llvm-commits at lists.llvm.org (Johannes Doerfert via llvm-commits) Date: Mon, 07 Oct 2019 23:21:52 -0000 Subject: [llvm] r373986 - [Attributor] Use local linkage instead of internal Message-ID: <20191007232152.B6CDD8DAFE@lists.llvm.org> Author: jdoerfert Date: Mon Oct 7 16:21:52 2019 New Revision: 373986 URL: http://llvm.org/viewvc/llvm-project?rev=373986&view=rev Log: [Attributor] Use local linkage instead of internal Local linkage is internal or private, and private is a specialization of internal, so either is fine for all our "local linkage" queries. Modified: llvm/trunk/include/llvm/Transforms/IPO/Attributor.h llvm/trunk/lib/Transforms/IPO/Attributor.cpp llvm/trunk/test/Transforms/FunctionAttrs/internal-noalias.ll Modified: llvm/trunk/include/llvm/Transforms/IPO/Attributor.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Transforms/IPO/Attributor.h?rev=373986&r1=373985&r2=373986&view=diff ============================================================================== --- llvm/trunk/include/llvm/Transforms/IPO/Attributor.h (original) +++ llvm/trunk/include/llvm/Transforms/IPO/Attributor.h Mon Oct 7 16:21:52 2019 @@ -810,8 +810,8 @@ struct Attributor { /// This will trigger the identification and initialization of attributes for /// \p F. 
void markLiveInternalFunction(const Function &F) { - assert(F.hasInternalLinkage() && - "Only internal linkage is assumed dead initially."); + assert(F.hasLocalLinkage() && + "Only local linkage is assumed dead initially."); identifyDefaultAbstractAttributes(const_cast(F)); } Modified: llvm/trunk/lib/Transforms/IPO/Attributor.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/IPO/Attributor.cpp?rev=373986&r1=373985&r2=373986&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/IPO/Attributor.cpp (original) +++ llvm/trunk/lib/Transforms/IPO/Attributor.cpp Mon Oct 7 16:21:52 2019 @@ -2081,7 +2081,7 @@ struct AAIsDeadImpl : public AAIsDead { for (const Instruction &I : BB) if (ImmutableCallSite ICS = ImmutableCallSite(&I)) if (const Function *F = ICS.getCalledFunction()) - if (F->hasInternalLinkage()) + if (F->hasLocalLinkage()) A.markLiveInternalFunction(*F); } @@ -3933,7 +3933,7 @@ bool Attributor::checkForAllCallSites( if (!AssociatedFunction) return false; - if (RequireAllCallSites && !AssociatedFunction->hasInternalLinkage()) { + if (RequireAllCallSites && !AssociatedFunction->hasLocalLinkage()) { LLVM_DEBUG( dbgs() << "[Attributor] Function " << AssociatedFunction->getName() @@ -4319,7 +4319,7 @@ ChangeStatus Attributor::run(Module &M) // below fixpoint loop will identify and eliminate them. SmallVector InternalFns; for (Function &F : M) - if (F.hasInternalLinkage()) + if (F.hasLocalLinkage()) InternalFns.push_back(&F); bool FoundDeadFn = true; @@ -4634,7 +4634,7 @@ static bool runAttributorOnModule(Module // We look at internal functions only on-demand but if any use is not a // direct call, we have to do it eagerly. 
- if (F.hasInternalLinkage()) { + if (F.hasLocalLinkage()) { if (llvm::all_of(F.uses(), [](const Use &U) { return ImmutableCallSite(U.getUser()) && ImmutableCallSite(U.getUser()).isCallee(&U); Modified: llvm/trunk/test/Transforms/FunctionAttrs/internal-noalias.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/FunctionAttrs/internal-noalias.ll?rev=373986&r1=373985&r2=373986&view=diff ============================================================================== --- llvm/trunk/test/Transforms/FunctionAttrs/internal-noalias.ll (original) +++ llvm/trunk/test/Transforms/FunctionAttrs/internal-noalias.ll Mon Oct 7 16:21:52 2019 @@ -8,9 +8,9 @@ entry: ret i32 %add } -; CHECK: define internal i32 @noalias_args(i32* nocapture readonly %A, i32* noalias nocapture readonly %B) +; CHECK: define private i32 @noalias_args(i32* nocapture readonly %A, i32* noalias nocapture readonly %B) -define internal i32 @noalias_args(i32* %A, i32* %B) #0 { +define private i32 @noalias_args(i32* %A, i32* %B) #0 { entry: %0 = load i32, i32* %A, align 4 %1 = load i32, i32* %B, align 4 From llvm-commits at lists.llvm.org Mon Oct 7 16:22:08 2019 From: llvm-commits at lists.llvm.org (Hubert Tong via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 23:22:08 +0000 (UTC) Subject: [PATCH] D66969: Output XCOFF object text section header and symbol entry for program code In-Reply-To: References: Message-ID: <9735d79c1a5c24468d9027cb9d774fd3@localhost.localdomain> hubert.reinterpretcast added inline comments. ================ Comment at: llvm/lib/MC/XCOFFObjectWriter.cpp:89 uint32_t Size; + uint32_t PaddingSize; uint32_t FileOffsetToData; ---------------- Remove this field (see comments on later lines). ================ Comment at: llvm/lib/MC/XCOFFObjectWriter.cpp:103 Size = 0; + PaddingSize = 0; FileOffsetToData = 0; ---------------- Remove this field (see comments on later lines). 
================ Comment at: llvm/lib/MC/XCOFFObjectWriter.cpp:511 + const MCSectionXCOFF *MCSec = Csect.MCCsect; + Csect.PaddingSize = alignTo(Address, MCSec->getAlignment()) - Address; + Address += Csect.PaddingSize; ---------------- The inter-csect padding is not really a property of the csect requiring alignment. We do not need to store this value here. The amount of padding to write can be determined by tracking the virtual address of the raw section data being written during the serialization into the object file. The next virtual address following the padding should be `Csect.Address`. ================ Comment at: llvm/lib/MC/XCOFFObjectWriter.cpp:525 + } + Text.PaddingSize = alignTo(Address, DefaultSectionAlign) - Address; + Address += Text.PaddingSize; ---------------- The `Size` field accounts for the padding. We do not need to store this value here. The amount of padding to write can be determined by tracking the virtual address of the raw section data being written during the serialization into the object file. The next virtual address following the padding should be `Text.Address + Text.Size`. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D66969/new/ https://reviews.llvm.org/D66969 From llvm-commits at lists.llvm.org Mon Oct 7 16:28:54 2019 From: llvm-commits at lists.llvm.org (Johannes Doerfert via llvm-commits) Date: Mon, 07 Oct 2019 23:28:54 -0000 Subject: [llvm] r373987 - [Attributor][FIX] Remove initialize calls and add undefs Message-ID: <20191007232854.A5EC08DAD6@lists.llvm.org> Author: jdoerfert Date: Mon Oct 7 16:28:54 2019 New Revision: 373987 URL: http://llvm.org/viewvc/llvm-project?rev=373987&view=rev Log: [Attributor][FIX] Remove initialize calls and add undefs The initialization logic has become part of the Attributor but the patches that introduced these calls here were in development when the transition happened. We also now clean up (undefine) the macros used to create attributes. 
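For readers unfamiliar with the macro cleanup mentioned in the log above, here is a minimal sketch of the general define, use, then #undef pattern. It is not the Attributor's code: DemoAttribute, CREATE_DEMO_ATTRIBUTE and the generated create* functions are hypothetical names chosen only for illustration.

```cpp
#include <string>

// Hypothetical stand-in for an abstract attribute class; the name and its
// single field are assumptions for illustration only.
struct DemoAttribute {
  std::string Kind;
};

// Helper macro that generates one factory function per attribute kind.
#define CREATE_DEMO_ATTRIBUTE(NAME)                                            \
  inline DemoAttribute create##NAME() { return DemoAttribute{#NAME}; }

CREATE_DEMO_ATTRIBUTE(NoAlias)
CREATE_DEMO_ATTRIBUTE(MemoryBehavior)

// Undefine the helper after its last use so it cannot leak into code that
// follows in the same translation unit.
#undef CREATE_DEMO_ATTRIBUTE
```

After the #undef, the generated functions remain usable, but any later attempt to invoke CREATE_DEMO_ATTRIBUTE fails to compile, which is presumably the kind of hygiene the added #undef lines are after.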
Modified: llvm/trunk/lib/Transforms/IPO/Attributor.cpp Modified: llvm/trunk/lib/Transforms/IPO/Attributor.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/IPO/Attributor.cpp?rev=373987&r1=373986&r2=373987&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/IPO/Attributor.cpp (original) +++ llvm/trunk/lib/Transforms/IPO/Attributor.cpp Mon Oct 7 16:28:54 2019 @@ -4778,7 +4778,6 @@ const char AAMemoryBehavior::ID = 0; SWITCH_PK_INV(CLASS, IRP_CALL_SITE, "call site") \ SWITCH_PK_CREATE(CLASS, IRP, IRP_FUNCTION, Function) \ } \ - AA->initialize(A); \ return *AA; \ } @@ -4795,7 +4794,6 @@ const char AAMemoryBehavior::ID = 0; SWITCH_PK_CREATE(CLASS, IRP, IRP_CALL_SITE_RETURNED, CallSiteReturned) \ SWITCH_PK_CREATE(CLASS, IRP, IRP_CALL_SITE_ARGUMENT, CallSiteArgument) \ } \ - AA->initialize(A); \ return *AA; \ } @@ -4820,7 +4818,9 @@ CREATE_FUNCTION_ONLY_ABSTRACT_ATTRIBUTE_ CREATE_NON_RET_ABSTRACT_ATTRIBUTE_FOR_POSITION(AAMemoryBehavior) +#undef CREATE_FUNCTION_ONLY_ABSTRACT_ATTRIBUTE_FOR_POSITION #undef CREATE_FUNCTION_ABSTRACT_ATTRIBUTE_FOR_POSITION +#undef CREATE_NON_RET_ABSTRACT_ATTRIBUTE_FOR_POSITION #undef CREATE_VALUE_ABSTRACT_ATTRIBUTE_FOR_POSITION #undef CREATE_ALL_ABSTRACT_ATTRIBUTE_FOR_POSITION #undef SWITCH_PK_CREATE From llvm-commits at lists.llvm.org Mon Oct 7 16:30:04 2019 From: llvm-commits at lists.llvm.org (Johannes Doerfert via llvm-commits) Date: Mon, 07 Oct 2019 23:30:04 -0000 Subject: [llvm] r373988 - [Attributor][NFC] Add debug output Message-ID: <20191007233004.997278DB67@lists.llvm.org> Author: jdoerfert Date: Mon Oct 7 16:30:04 2019 New Revision: 373988 URL: http://llvm.org/viewvc/llvm-project?rev=373988&view=rev Log: [Attributor][NFC] Add debug output Modified: llvm/trunk/lib/Transforms/IPO/Attributor.cpp Modified: llvm/trunk/lib/Transforms/IPO/Attributor.cpp URL: 
http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/IPO/Attributor.cpp?rev=373988&r1=373987&r2=373988&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/IPO/Attributor.cpp (original) +++ llvm/trunk/lib/Transforms/IPO/Attributor.cpp Mon Oct 7 16:30:04 2019 @@ -3930,8 +3930,11 @@ bool Attributor::checkForAllCallSites( // hence the function has internal linkage. const IRPosition &IRP = QueryingAA.getIRPosition(); const Function *AssociatedFunction = IRP.getAssociatedFunction(); - if (!AssociatedFunction) + if (!AssociatedFunction) { + LLVM_DEBUG(dbgs() << "[Attributor] No function associated with " << IRP + << "\n"); return false; + } if (RequireAllCallSites && !AssociatedFunction->hasLocalLinkage()) { LLVM_DEBUG( From llvm-commits at lists.llvm.org Mon Oct 7 16:33:08 2019 From: llvm-commits at lists.llvm.org (Matt Arsenault via llvm-commits) Date: Mon, 07 Oct 2019 23:33:08 -0000 Subject: [llvm] r373989 - AMDGPU/GlobalISel: Clamp G_SITOFP/G_UITOFP sources Message-ID: <20191007233308.917198DF03@lists.llvm.org> Author: arsenm Date: Mon Oct 7 16:33:08 2019 New Revision: 373989 URL: http://llvm.org/viewvc/llvm-project?rev=373989&view=rev Log: AMDGPU/GlobalISel: Clamp G_SITOFP/G_UITOFP sources Modified: llvm/trunk/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-sitofp.mir llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-uitofp.mir Modified: llvm/trunk/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp?rev=373989&r1=373988&r2=373989&view=diff ============================================================================== --- llvm/trunk/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp (original) +++ llvm/trunk/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp Mon Oct 7 16:33:08 2019 @@ -424,11 +424,14 @@ AMDGPULegalizerInfo::AMDGPULegalizerInfo .scalarize(0); // TODO: Split s1->s64 
during regbankselect for VALU. - getActionDefinitionsBuilder({G_SITOFP, G_UITOFP}) + auto &IToFP = getActionDefinitionsBuilder({G_SITOFP, G_UITOFP}) .legalFor({{S32, S32}, {S64, S32}, {S16, S32}, {S32, S1}, {S16, S1}, {S64, S1}}) .lowerFor({{S32, S64}}) - .customFor({{S64, S64}}) - .scalarize(0); + .customFor({{S64, S64}}); + if (ST.has16BitInsts()) + IToFP.legalFor({{S16, S16}}); + IToFP.clampScalar(1, S32, S64) + .scalarize(0); auto &FPToI = getActionDefinitionsBuilder({G_FPTOSI, G_FPTOUI}) .legalFor({{S32, S32}, {S32, S64}, {S32, S16}}); Modified: llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-sitofp.mir URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-sitofp.mir?rev=373989&r1=373988&r2=373989&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-sitofp.mir (original) +++ llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-sitofp.mir Mon Oct 7 16:33:08 2019 @@ -1,5 +1,6 @@ # NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py -# RUN: llc -mtriple=amdgcn-mesa-mesa3d -mcpu=fiji -run-pass=legalizer -global-isel %s -o - | FileCheck %s +# RUN: llc -mtriple=amdgcn-mesa-mesa3d -mcpu=tahiti -run-pass=legalizer %s -o - | FileCheck -check-prefix=GFX6 %s +# RUN: llc -mtriple=amdgcn-mesa-mesa3d -mcpu=fiji -run-pass=legalizer %s -o - | FileCheck -check-prefix=GFX8 %s --- name: test_sitofp_s32_to_s32 @@ -7,10 +8,14 @@ body: | bb.0: liveins: $vgpr0 - ; CHECK-LABEL: name: test_sitofp_s32_to_s32 - ; CHECK: [[COPY:%[0-9]+]]:_(s32) = COPY $vgpr0 - ; CHECK: [[SITOFP:%[0-9]+]]:_(s32) = G_SITOFP [[COPY]](s32) - ; CHECK: $vgpr0 = COPY [[SITOFP]](s32) + ; GFX6-LABEL: name: test_sitofp_s32_to_s32 + ; GFX6: [[COPY:%[0-9]+]]:_(s32) = COPY $vgpr0 + ; GFX6: [[SITOFP:%[0-9]+]]:_(s32) = G_SITOFP [[COPY]](s32) + ; GFX6: $vgpr0 = COPY [[SITOFP]](s32) + ; GFX8-LABEL: name: test_sitofp_s32_to_s32 + ; GFX8: [[COPY:%[0-9]+]]:_(s32) = COPY 
$vgpr0 + ; GFX8: [[SITOFP:%[0-9]+]]:_(s32) = G_SITOFP [[COPY]](s32) + ; GFX8: $vgpr0 = COPY [[SITOFP]](s32) %0:_(s32) = COPY $vgpr0 %1:_(s32) = G_SITOFP %0 $vgpr0 = COPY %1 @@ -22,10 +27,14 @@ body: | bb.0: liveins: $vgpr0 - ; CHECK-LABEL: name: test_sitofp_s32_to_s64 - ; CHECK: [[COPY:%[0-9]+]]:_(s32) = COPY $vgpr0 - ; CHECK: [[SITOFP:%[0-9]+]]:_(s64) = G_SITOFP [[COPY]](s32) - ; CHECK: $vgpr0_vgpr1 = COPY [[SITOFP]](s64) + ; GFX6-LABEL: name: test_sitofp_s32_to_s64 + ; GFX6: [[COPY:%[0-9]+]]:_(s32) = COPY $vgpr0 + ; GFX6: [[SITOFP:%[0-9]+]]:_(s64) = G_SITOFP [[COPY]](s32) + ; GFX6: $vgpr0_vgpr1 = COPY [[SITOFP]](s64) + ; GFX8-LABEL: name: test_sitofp_s32_to_s64 + ; GFX8: [[COPY:%[0-9]+]]:_(s32) = COPY $vgpr0 + ; GFX8: [[SITOFP:%[0-9]+]]:_(s64) = G_SITOFP [[COPY]](s32) + ; GFX8: $vgpr0_vgpr1 = COPY [[SITOFP]](s64) %0:_(s32) = COPY $vgpr0 %1:_(s64) = G_SITOFP %0 $vgpr0_vgpr1 = COPY %1 @@ -37,13 +46,20 @@ body: | bb.0: liveins: $vgpr0_vgpr1 - ; CHECK-LABEL: name: test_sitofp_v2s32_to_v2s32 - ; CHECK: [[COPY:%[0-9]+]]:_(<2 x s32>) = COPY $vgpr0_vgpr1 - ; CHECK: [[UV:%[0-9]+]]:_(s32), [[UV1:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[COPY]](<2 x s32>) - ; CHECK: [[SITOFP:%[0-9]+]]:_(s32) = G_SITOFP [[UV]](s32) - ; CHECK: [[SITOFP1:%[0-9]+]]:_(s32) = G_SITOFP [[UV1]](s32) - ; CHECK: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[SITOFP]](s32), [[SITOFP1]](s32) - ; CHECK: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<2 x s32>) + ; GFX6-LABEL: name: test_sitofp_v2s32_to_v2s32 + ; GFX6: [[COPY:%[0-9]+]]:_(<2 x s32>) = COPY $vgpr0_vgpr1 + ; GFX6: [[UV:%[0-9]+]]:_(s32), [[UV1:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[COPY]](<2 x s32>) + ; GFX6: [[SITOFP:%[0-9]+]]:_(s32) = G_SITOFP [[UV]](s32) + ; GFX6: [[SITOFP1:%[0-9]+]]:_(s32) = G_SITOFP [[UV1]](s32) + ; GFX6: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[SITOFP]](s32), [[SITOFP1]](s32) + ; GFX6: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<2 x s32>) + ; GFX8-LABEL: name: test_sitofp_v2s32_to_v2s32 + ; GFX8: 
[[COPY:%[0-9]+]]:_(<2 x s32>) = COPY $vgpr0_vgpr1 + ; GFX8: [[UV:%[0-9]+]]:_(s32), [[UV1:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[COPY]](<2 x s32>) + ; GFX8: [[SITOFP:%[0-9]+]]:_(s32) = G_SITOFP [[UV]](s32) + ; GFX8: [[SITOFP1:%[0-9]+]]:_(s32) = G_SITOFP [[UV1]](s32) + ; GFX8: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[SITOFP]](s32), [[SITOFP1]](s32) + ; GFX8: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<2 x s32>) %0:_(<2 x s32>) = COPY $vgpr0_vgpr1 %1:_(<2 x s32>) = G_SITOFP %0 $vgpr0_vgpr1 = COPY %1 @@ -55,13 +71,20 @@ body: | bb.0: liveins: $vgpr0_vgpr1 - ; CHECK-LABEL: name: test_sitofp_v2s32_to_v2s64 - ; CHECK: [[COPY:%[0-9]+]]:_(<2 x s32>) = COPY $vgpr0_vgpr1 - ; CHECK: [[UV:%[0-9]+]]:_(s32), [[UV1:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[COPY]](<2 x s32>) - ; CHECK: [[SITOFP:%[0-9]+]]:_(s64) = G_SITOFP [[UV]](s32) - ; CHECK: [[SITOFP1:%[0-9]+]]:_(s64) = G_SITOFP [[UV1]](s32) - ; CHECK: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s64>) = G_BUILD_VECTOR [[SITOFP]](s64), [[SITOFP1]](s64) - ; CHECK: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BUILD_VECTOR]](<2 x s64>) + ; GFX6-LABEL: name: test_sitofp_v2s32_to_v2s64 + ; GFX6: [[COPY:%[0-9]+]]:_(<2 x s32>) = COPY $vgpr0_vgpr1 + ; GFX6: [[UV:%[0-9]+]]:_(s32), [[UV1:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[COPY]](<2 x s32>) + ; GFX6: [[SITOFP:%[0-9]+]]:_(s64) = G_SITOFP [[UV]](s32) + ; GFX6: [[SITOFP1:%[0-9]+]]:_(s64) = G_SITOFP [[UV1]](s32) + ; GFX6: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s64>) = G_BUILD_VECTOR [[SITOFP]](s64), [[SITOFP1]](s64) + ; GFX6: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BUILD_VECTOR]](<2 x s64>) + ; GFX8-LABEL: name: test_sitofp_v2s32_to_v2s64 + ; GFX8: [[COPY:%[0-9]+]]:_(<2 x s32>) = COPY $vgpr0_vgpr1 + ; GFX8: [[UV:%[0-9]+]]:_(s32), [[UV1:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[COPY]](<2 x s32>) + ; GFX8: [[SITOFP:%[0-9]+]]:_(s64) = G_SITOFP [[UV]](s32) + ; GFX8: [[SITOFP1:%[0-9]+]]:_(s64) = G_SITOFP [[UV1]](s32) + ; GFX8: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s64>) = G_BUILD_VECTOR [[SITOFP]](s64), [[SITOFP1]](s64) + ; GFX8: 
$vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BUILD_VECTOR]](<2 x s64>) %0:_(<2 x s32>) = COPY $vgpr0_vgpr1 %1:_(<2 x s64>) = G_SITOFP %0 $vgpr0_vgpr1_vgpr2_vgpr3 = COPY %1 @@ -73,50 +96,94 @@ body: | bb.0: liveins: $vgpr0_vgpr1 - ; CHECK-LABEL: name: test_sitofp_s64_to_s32 - ; CHECK: [[COPY:%[0-9]+]]:_(s64) = COPY $vgpr0_vgpr1 - ; CHECK: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 63 - ; CHECK: [[TRUNC:%[0-9]+]]:_(s32) = G_TRUNC [[C]](s64) - ; CHECK: [[ASHR:%[0-9]+]]:_(s64) = G_ASHR [[COPY]], [[TRUNC]](s32) - ; CHECK: [[UV:%[0-9]+]]:_(s32), [[UV1:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[COPY]](s64) - ; CHECK: [[UV2:%[0-9]+]]:_(s32), [[UV3:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[ASHR]](s64) - ; CHECK: [[UADDO:%[0-9]+]]:_(s32), [[UADDO1:%[0-9]+]]:_(s1) = G_UADDO [[UV]], [[UV2]] - ; CHECK: [[UADDE:%[0-9]+]]:_(s32), [[UADDE1:%[0-9]+]]:_(s1) = G_UADDE [[UV1]], [[UV3]], [[UADDO1]] - ; CHECK: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[UADDO]](s32), [[UADDE]](s32) - ; CHECK: [[XOR:%[0-9]+]]:_(s64) = G_XOR [[MV]], [[ASHR]] - ; CHECK: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 0 - ; CHECK: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 0 - ; CHECK: [[CTLZ_ZERO_UNDEF:%[0-9]+]]:_(s32) = G_CTLZ_ZERO_UNDEF [[XOR]](s64) - ; CHECK: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 190 - ; CHECK: [[SUB:%[0-9]+]]:_(s32) = G_SUB [[C3]], [[CTLZ_ZERO_UNDEF]] - ; CHECK: [[ICMP:%[0-9]+]]:_(s1) = G_ICMP intpred(ne), [[XOR]](s64), [[C2]] - ; CHECK: [[SELECT:%[0-9]+]]:_(s32) = G_SELECT [[ICMP]](s1), [[SUB]], [[C1]] - ; CHECK: [[C4:%[0-9]+]]:_(s64) = G_CONSTANT i64 9223372036854775807 - ; CHECK: [[SHL:%[0-9]+]]:_(s64) = G_SHL [[XOR]], [[CTLZ_ZERO_UNDEF]](s32) - ; CHECK: [[AND:%[0-9]+]]:_(s64) = G_AND [[SHL]], [[C4]] - ; CHECK: [[C5:%[0-9]+]]:_(s64) = G_CONSTANT i64 1099511627775 - ; CHECK: [[AND1:%[0-9]+]]:_(s64) = G_AND [[AND]], [[C5]] - ; CHECK: [[C6:%[0-9]+]]:_(s64) = G_CONSTANT i64 40 - ; CHECK: [[TRUNC1:%[0-9]+]]:_(s32) = G_TRUNC [[C6]](s64) - ; CHECK: [[LSHR:%[0-9]+]]:_(s64) = G_LSHR [[AND]], [[TRUNC1]](s32) - ; CHECK: 
[[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 23 - ; CHECK: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[SELECT]], [[C7]](s32) - ; CHECK: [[TRUNC2:%[0-9]+]]:_(s32) = G_TRUNC [[LSHR]](s64) - ; CHECK: [[OR:%[0-9]+]]:_(s32) = G_OR [[SHL1]], [[TRUNC2]] - ; CHECK: [[C8:%[0-9]+]]:_(s64) = G_CONSTANT i64 549755813888 - ; CHECK: [[ICMP1:%[0-9]+]]:_(s1) = G_ICMP intpred(ugt), [[AND1]](s64), [[C8]] - ; CHECK: [[ICMP2:%[0-9]+]]:_(s1) = G_ICMP intpred(eq), [[AND1]](s64), [[C8]] - ; CHECK: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 1 - ; CHECK: [[AND2:%[0-9]+]]:_(s32) = G_AND [[OR]], [[C9]] - ; CHECK: [[SELECT1:%[0-9]+]]:_(s32) = G_SELECT [[ICMP2]](s1), [[AND2]], [[C1]] - ; CHECK: [[SELECT2:%[0-9]+]]:_(s32) = G_SELECT [[ICMP1]](s1), [[C9]], [[SELECT1]] - ; CHECK: [[ADD:%[0-9]+]]:_(s32) = G_ADD [[OR]], [[SELECT2]] - ; CHECK: [[UITOFP:%[0-9]+]]:_(s32) = G_UITOFP [[XOR]](s64) - ; CHECK: [[FNEG:%[0-9]+]]:_(s32) = G_FNEG [[UITOFP]] - ; CHECK: [[ICMP3:%[0-9]+]]:_(s1) = G_ICMP intpred(ne), [[ASHR]](s64), [[C2]] - ; CHECK: [[SELECT3:%[0-9]+]]:_(s32) = G_SELECT [[ICMP3]](s1), [[FNEG]], [[UITOFP]] - ; CHECK: [[SITOFP:%[0-9]+]]:_(s32) = G_SITOFP [[COPY]](s64) - ; CHECK: $vgpr0 = COPY [[SITOFP]](s32) + ; GFX6-LABEL: name: test_sitofp_s64_to_s32 + ; GFX6: [[COPY:%[0-9]+]]:_(s64) = COPY $vgpr0_vgpr1 + ; GFX6: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 63 + ; GFX6: [[TRUNC:%[0-9]+]]:_(s32) = G_TRUNC [[C]](s64) + ; GFX6: [[ASHR:%[0-9]+]]:_(s64) = G_ASHR [[COPY]], [[TRUNC]](s32) + ; GFX6: [[UV:%[0-9]+]]:_(s32), [[UV1:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[COPY]](s64) + ; GFX6: [[UV2:%[0-9]+]]:_(s32), [[UV3:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[ASHR]](s64) + ; GFX6: [[UADDO:%[0-9]+]]:_(s32), [[UADDO1:%[0-9]+]]:_(s1) = G_UADDO [[UV]], [[UV2]] + ; GFX6: [[UADDE:%[0-9]+]]:_(s32), [[UADDE1:%[0-9]+]]:_(s1) = G_UADDE [[UV1]], [[UV3]], [[UADDO1]] + ; GFX6: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[UADDO]](s32), [[UADDE]](s32) + ; GFX6: [[XOR:%[0-9]+]]:_(s64) = G_XOR [[MV]], [[ASHR]] + ; GFX6: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 
0 + ; GFX6: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 0 + ; GFX6: [[CTLZ_ZERO_UNDEF:%[0-9]+]]:_(s32) = G_CTLZ_ZERO_UNDEF [[XOR]](s64) + ; GFX6: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 190 + ; GFX6: [[SUB:%[0-9]+]]:_(s32) = G_SUB [[C3]], [[CTLZ_ZERO_UNDEF]] + ; GFX6: [[ICMP:%[0-9]+]]:_(s1) = G_ICMP intpred(ne), [[XOR]](s64), [[C2]] + ; GFX6: [[SELECT:%[0-9]+]]:_(s32) = G_SELECT [[ICMP]](s1), [[SUB]], [[C1]] + ; GFX6: [[C4:%[0-9]+]]:_(s64) = G_CONSTANT i64 9223372036854775807 + ; GFX6: [[SHL:%[0-9]+]]:_(s64) = G_SHL [[XOR]], [[CTLZ_ZERO_UNDEF]](s32) + ; GFX6: [[AND:%[0-9]+]]:_(s64) = G_AND [[SHL]], [[C4]] + ; GFX6: [[C5:%[0-9]+]]:_(s64) = G_CONSTANT i64 1099511627775 + ; GFX6: [[AND1:%[0-9]+]]:_(s64) = G_AND [[AND]], [[C5]] + ; GFX6: [[C6:%[0-9]+]]:_(s64) = G_CONSTANT i64 40 + ; GFX6: [[TRUNC1:%[0-9]+]]:_(s32) = G_TRUNC [[C6]](s64) + ; GFX6: [[LSHR:%[0-9]+]]:_(s64) = G_LSHR [[AND]], [[TRUNC1]](s32) + ; GFX6: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 23 + ; GFX6: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[SELECT]], [[C7]](s32) + ; GFX6: [[TRUNC2:%[0-9]+]]:_(s32) = G_TRUNC [[LSHR]](s64) + ; GFX6: [[OR:%[0-9]+]]:_(s32) = G_OR [[SHL1]], [[TRUNC2]] + ; GFX6: [[C8:%[0-9]+]]:_(s64) = G_CONSTANT i64 549755813888 + ; GFX6: [[ICMP1:%[0-9]+]]:_(s1) = G_ICMP intpred(ugt), [[AND1]](s64), [[C8]] + ; GFX6: [[ICMP2:%[0-9]+]]:_(s1) = G_ICMP intpred(eq), [[AND1]](s64), [[C8]] + ; GFX6: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 1 + ; GFX6: [[AND2:%[0-9]+]]:_(s32) = G_AND [[OR]], [[C9]] + ; GFX6: [[SELECT1:%[0-9]+]]:_(s32) = G_SELECT [[ICMP2]](s1), [[AND2]], [[C1]] + ; GFX6: [[SELECT2:%[0-9]+]]:_(s32) = G_SELECT [[ICMP1]](s1), [[C9]], [[SELECT1]] + ; GFX6: [[ADD:%[0-9]+]]:_(s32) = G_ADD [[OR]], [[SELECT2]] + ; GFX6: [[UITOFP:%[0-9]+]]:_(s32) = G_UITOFP [[XOR]](s64) + ; GFX6: [[FNEG:%[0-9]+]]:_(s32) = G_FNEG [[UITOFP]] + ; GFX6: [[ICMP3:%[0-9]+]]:_(s1) = G_ICMP intpred(ne), [[ASHR]](s64), [[C2]] + ; GFX6: [[SELECT3:%[0-9]+]]:_(s32) = G_SELECT [[ICMP3]](s1), [[FNEG]], [[UITOFP]] + ; GFX6: 
[[SITOFP:%[0-9]+]]:_(s32) = G_SITOFP [[COPY]](s64) + ; GFX6: $vgpr0 = COPY [[SITOFP]](s32) + ; GFX8-LABEL: name: test_sitofp_s64_to_s32 + ; GFX8: [[COPY:%[0-9]+]]:_(s64) = COPY $vgpr0_vgpr1 + ; GFX8: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 63 + ; GFX8: [[TRUNC:%[0-9]+]]:_(s32) = G_TRUNC [[C]](s64) + ; GFX8: [[ASHR:%[0-9]+]]:_(s64) = G_ASHR [[COPY]], [[TRUNC]](s32) + ; GFX8: [[UV:%[0-9]+]]:_(s32), [[UV1:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[COPY]](s64) + ; GFX8: [[UV2:%[0-9]+]]:_(s32), [[UV3:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[ASHR]](s64) + ; GFX8: [[UADDO:%[0-9]+]]:_(s32), [[UADDO1:%[0-9]+]]:_(s1) = G_UADDO [[UV]], [[UV2]] + ; GFX8: [[UADDE:%[0-9]+]]:_(s32), [[UADDE1:%[0-9]+]]:_(s1) = G_UADDE [[UV1]], [[UV3]], [[UADDO1]] + ; GFX8: [[MV:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[UADDO]](s32), [[UADDE]](s32) + ; GFX8: [[XOR:%[0-9]+]]:_(s64) = G_XOR [[MV]], [[ASHR]] + ; GFX8: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 0 + ; GFX8: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 0 + ; GFX8: [[CTLZ_ZERO_UNDEF:%[0-9]+]]:_(s32) = G_CTLZ_ZERO_UNDEF [[XOR]](s64) + ; GFX8: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 190 + ; GFX8: [[SUB:%[0-9]+]]:_(s32) = G_SUB [[C3]], [[CTLZ_ZERO_UNDEF]] + ; GFX8: [[ICMP:%[0-9]+]]:_(s1) = G_ICMP intpred(ne), [[XOR]](s64), [[C2]] + ; GFX8: [[SELECT:%[0-9]+]]:_(s32) = G_SELECT [[ICMP]](s1), [[SUB]], [[C1]] + ; GFX8: [[C4:%[0-9]+]]:_(s64) = G_CONSTANT i64 9223372036854775807 + ; GFX8: [[SHL:%[0-9]+]]:_(s64) = G_SHL [[XOR]], [[CTLZ_ZERO_UNDEF]](s32) + ; GFX8: [[AND:%[0-9]+]]:_(s64) = G_AND [[SHL]], [[C4]] + ; GFX8: [[C5:%[0-9]+]]:_(s64) = G_CONSTANT i64 1099511627775 + ; GFX8: [[AND1:%[0-9]+]]:_(s64) = G_AND [[AND]], [[C5]] + ; GFX8: [[C6:%[0-9]+]]:_(s64) = G_CONSTANT i64 40 + ; GFX8: [[TRUNC1:%[0-9]+]]:_(s32) = G_TRUNC [[C6]](s64) + ; GFX8: [[LSHR:%[0-9]+]]:_(s64) = G_LSHR [[AND]], [[TRUNC1]](s32) + ; GFX8: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 23 + ; GFX8: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[SELECT]], [[C7]](s32) + ; GFX8: [[TRUNC2:%[0-9]+]]:_(s32) = G_TRUNC 
[[LSHR]](s64) + ; GFX8: [[OR:%[0-9]+]]:_(s32) = G_OR [[SHL1]], [[TRUNC2]] + ; GFX8: [[C8:%[0-9]+]]:_(s64) = G_CONSTANT i64 549755813888 + ; GFX8: [[ICMP1:%[0-9]+]]:_(s1) = G_ICMP intpred(ugt), [[AND1]](s64), [[C8]] + ; GFX8: [[ICMP2:%[0-9]+]]:_(s1) = G_ICMP intpred(eq), [[AND1]](s64), [[C8]] + ; GFX8: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 1 + ; GFX8: [[AND2:%[0-9]+]]:_(s32) = G_AND [[OR]], [[C9]] + ; GFX8: [[SELECT1:%[0-9]+]]:_(s32) = G_SELECT [[ICMP2]](s1), [[AND2]], [[C1]] + ; GFX8: [[SELECT2:%[0-9]+]]:_(s32) = G_SELECT [[ICMP1]](s1), [[C9]], [[SELECT1]] + ; GFX8: [[ADD:%[0-9]+]]:_(s32) = G_ADD [[OR]], [[SELECT2]] + ; GFX8: [[UITOFP:%[0-9]+]]:_(s32) = G_UITOFP [[XOR]](s64) + ; GFX8: [[FNEG:%[0-9]+]]:_(s32) = G_FNEG [[UITOFP]] + ; GFX8: [[ICMP3:%[0-9]+]]:_(s1) = G_ICMP intpred(ne), [[ASHR]](s64), [[C2]] + ; GFX8: [[SELECT3:%[0-9]+]]:_(s32) = G_SELECT [[ICMP3]](s1), [[FNEG]], [[UITOFP]] + ; GFX8: [[SITOFP:%[0-9]+]]:_(s32) = G_SITOFP [[COPY]](s64) + ; GFX8: $vgpr0 = COPY [[SITOFP]](s32) %0:_(s64) = COPY $vgpr0_vgpr1 %1:_(s32) = G_SITOFP %0 $vgpr0 = COPY %1 @@ -128,33 +195,196 @@ body: | bb.0: liveins: $vgpr0_vgpr1 - ; CHECK-LABEL: name: test_sitofp_s64_to_s64 - ; CHECK: [[COPY:%[0-9]+]]:_(s64) = COPY $vgpr0_vgpr1 - ; CHECK: [[UV:%[0-9]+]]:_(s32), [[UV1:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[COPY]](s64) - ; CHECK: [[SITOFP:%[0-9]+]]:_(s64) = G_SITOFP [[UV1]](s32) - ; CHECK: [[UITOFP:%[0-9]+]]:_(s64) = G_UITOFP [[UV]](s32) - ; CHECK: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 32 - ; CHECK: [[INT:%[0-9]+]]:_(s64) = G_INTRINSIC intrinsic(@llvm.amdgcn.ldexp), [[SITOFP]](s64), [[C]](s32) - ; CHECK: [[FADD:%[0-9]+]]:_(s64) = G_FADD [[INT]], [[UITOFP]] - ; CHECK: $vgpr0_vgpr1 = COPY [[FADD]](s64) + ; GFX6-LABEL: name: test_sitofp_s64_to_s64 + ; GFX6: [[COPY:%[0-9]+]]:_(s64) = COPY $vgpr0_vgpr1 + ; GFX6: [[UV:%[0-9]+]]:_(s32), [[UV1:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[COPY]](s64) + ; GFX6: [[SITOFP:%[0-9]+]]:_(s64) = G_SITOFP [[UV1]](s32) + ; GFX6: [[UITOFP:%[0-9]+]]:_(s64) = 
G_UITOFP [[UV]](s32) + ; GFX6: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 32 + ; GFX6: [[INT:%[0-9]+]]:_(s64) = G_INTRINSIC intrinsic(@llvm.amdgcn.ldexp), [[SITOFP]](s64), [[C]](s32) + ; GFX6: [[FADD:%[0-9]+]]:_(s64) = G_FADD [[INT]], [[UITOFP]] + ; GFX6: $vgpr0_vgpr1 = COPY [[FADD]](s64) + ; GFX8-LABEL: name: test_sitofp_s64_to_s64 + ; GFX8: [[COPY:%[0-9]+]]:_(s64) = COPY $vgpr0_vgpr1 + ; GFX8: [[UV:%[0-9]+]]:_(s32), [[UV1:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[COPY]](s64) + ; GFX8: [[SITOFP:%[0-9]+]]:_(s64) = G_SITOFP [[UV1]](s32) + ; GFX8: [[UITOFP:%[0-9]+]]:_(s64) = G_UITOFP [[UV]](s32) + ; GFX8: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 32 + ; GFX8: [[INT:%[0-9]+]]:_(s64) = G_INTRINSIC intrinsic(@llvm.amdgcn.ldexp), [[SITOFP]](s64), [[C]](s32) + ; GFX8: [[FADD:%[0-9]+]]:_(s64) = G_FADD [[INT]], [[UITOFP]] + ; GFX8: $vgpr0_vgpr1 = COPY [[FADD]](s64) %0:_(s64) = COPY $vgpr0_vgpr1 %1:_(s64) = G_SITOFP %0 $vgpr0_vgpr1 = COPY %1 ... --- +name: test_sitofp_s16_to_s16 +body: | + bb.0: + liveins: $vgpr0 + + ; GFX6-LABEL: name: test_sitofp_s16_to_s16 + ; GFX6: [[COPY:%[0-9]+]]:_(s32) = COPY $vgpr0 + ; GFX6: [[COPY1:%[0-9]+]]:_(s32) = COPY [[COPY]](s32) + ; GFX6: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX6: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[COPY1]], [[C]](s32) + ; GFX6: [[ASHR:%[0-9]+]]:_(s32) = G_ASHR [[SHL]], [[C]](s32) + ; GFX6: [[SITOFP:%[0-9]+]]:_(s16) = G_SITOFP [[ASHR]](s32) + ; GFX6: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[SITOFP]](s16) + ; GFX6: $vgpr0 = COPY [[ANYEXT]](s32) + ; GFX8-LABEL: name: test_sitofp_s16_to_s16 + ; GFX8: [[COPY:%[0-9]+]]:_(s32) = COPY $vgpr0 + ; GFX8: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[COPY]](s32) + ; GFX8: [[SITOFP:%[0-9]+]]:_(s16) = G_SITOFP [[TRUNC]](s16) + ; GFX8: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[SITOFP]](s16) + ; GFX8: $vgpr0 = COPY [[ANYEXT]](s32) + %0:_(s32) = COPY $vgpr0 + %1:_(s16) = G_TRUNC %0 + %2:_(s16) = G_SITOFP %1 + %3:_(s32) = G_ANYEXT %2 + $vgpr0 = COPY %3 +... 
+ +--- name: test_sitofp_s16_to_s32 body: | bb.0: liveins: $vgpr0 - ; CHECK-LABEL: name: test_sitofp_s16_to_s32 - ; CHECK: [[COPY:%[0-9]+]]:_(s32) = COPY $vgpr0 - ; CHECK: [[SITOFP:%[0-9]+]]:_(s16) = G_SITOFP [[COPY]](s32) - ; CHECK: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[SITOFP]](s16) - ; CHECK: $vgpr0 = COPY [[ANYEXT]](s32) + ; GFX6-LABEL: name: test_sitofp_s16_to_s32 + ; GFX6: [[COPY:%[0-9]+]]:_(s32) = COPY $vgpr0 + ; GFX6: [[COPY1:%[0-9]+]]:_(s32) = COPY [[COPY]](s32) + ; GFX6: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX6: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[COPY1]], [[C]](s32) + ; GFX6: [[ASHR:%[0-9]+]]:_(s32) = G_ASHR [[SHL]], [[C]](s32) + ; GFX6: [[SITOFP:%[0-9]+]]:_(s32) = G_SITOFP [[ASHR]](s32) + ; GFX6: $vgpr0 = COPY [[SITOFP]](s32) + ; GFX8-LABEL: name: test_sitofp_s16_to_s32 + ; GFX8: [[COPY:%[0-9]+]]:_(s32) = COPY $vgpr0 + ; GFX8: [[COPY1:%[0-9]+]]:_(s32) = COPY [[COPY]](s32) + ; GFX8: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX8: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[COPY1]], [[C]](s32) + ; GFX8: [[ASHR:%[0-9]+]]:_(s32) = G_ASHR [[SHL]], [[C]](s32) + ; GFX8: [[SITOFP:%[0-9]+]]:_(s32) = G_SITOFP [[ASHR]](s32) + ; GFX8: $vgpr0 = COPY [[SITOFP]](s32) %0:_(s32) = COPY $vgpr0 - %1:_(s16) = G_SITOFP %0 - %2:_(s32) = G_ANYEXT %1 + %1:_(s16) = G_TRUNC %0 + %2:_(s32) = G_SITOFP %1 $vgpr0 = COPY %2 ... 
+ +--- +name: test_sitofp_s16_to_s64 +body: | + bb.0: + liveins: $vgpr0 + + ; GFX6-LABEL: name: test_sitofp_s16_to_s64 + ; GFX6: [[COPY:%[0-9]+]]:_(s32) = COPY $vgpr0 + ; GFX6: [[COPY1:%[0-9]+]]:_(s32) = COPY [[COPY]](s32) + ; GFX6: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX6: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[COPY1]], [[C]](s32) + ; GFX6: [[ASHR:%[0-9]+]]:_(s32) = G_ASHR [[SHL]], [[C]](s32) + ; GFX6: [[SITOFP:%[0-9]+]]:_(s64) = G_SITOFP [[ASHR]](s32) + ; GFX6: $vgpr0_vgpr1 = COPY [[SITOFP]](s64) + ; GFX8-LABEL: name: test_sitofp_s16_to_s64 + ; GFX8: [[COPY:%[0-9]+]]:_(s32) = COPY $vgpr0 + ; GFX8: [[COPY1:%[0-9]+]]:_(s32) = COPY [[COPY]](s32) + ; GFX8: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 16 + ; GFX8: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[COPY1]], [[C]](s32) + ; GFX8: [[ASHR:%[0-9]+]]:_(s32) = G_ASHR [[SHL]], [[C]](s32) + ; GFX8: [[SITOFP:%[0-9]+]]:_(s64) = G_SITOFP [[ASHR]](s32) + ; GFX8: $vgpr0_vgpr1 = COPY [[SITOFP]](s64) + %0:_(s32) = COPY $vgpr0 + %1:_(s16) = G_TRUNC %0 + %2:_(s64) = G_SITOFP %1 + $vgpr0_vgpr1 = COPY %2 +... 
+ +--- +name: test_sitofp_s8_to_s16 +body: | + bb.0: + liveins: $vgpr0 + + ; GFX6-LABEL: name: test_sitofp_s8_to_s16 + ; GFX6: [[COPY:%[0-9]+]]:_(s32) = COPY $vgpr0 + ; GFX6: [[COPY1:%[0-9]+]]:_(s32) = COPY [[COPY]](s32) + ; GFX6: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 24 + ; GFX6: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[COPY1]], [[C]](s32) + ; GFX6: [[ASHR:%[0-9]+]]:_(s32) = G_ASHR [[SHL]], [[C]](s32) + ; GFX6: [[SITOFP:%[0-9]+]]:_(s16) = G_SITOFP [[ASHR]](s32) + ; GFX6: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[SITOFP]](s16) + ; GFX6: $vgpr0 = COPY [[ANYEXT]](s32) + ; GFX8-LABEL: name: test_sitofp_s8_to_s16 + ; GFX8: [[COPY:%[0-9]+]]:_(s32) = COPY $vgpr0 + ; GFX8: [[COPY1:%[0-9]+]]:_(s32) = COPY [[COPY]](s32) + ; GFX8: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 24 + ; GFX8: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[COPY1]], [[C]](s32) + ; GFX8: [[ASHR:%[0-9]+]]:_(s32) = G_ASHR [[SHL]], [[C]](s32) + ; GFX8: [[SITOFP:%[0-9]+]]:_(s16) = G_SITOFP [[ASHR]](s32) + ; GFX8: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[SITOFP]](s16) + ; GFX8: $vgpr0 = COPY [[ANYEXT]](s32) + %0:_(s32) = COPY $vgpr0 + %1:_(s8) = G_TRUNC %0 + %2:_(s16) = G_SITOFP %1 + %3:_(s32) = G_ANYEXT %2 + $vgpr0 = COPY %3 +... 
+ +--- +name: test_sitofp_s8_to_s32 +body: | + bb.0: + liveins: $vgpr0 + + ; GFX6-LABEL: name: test_sitofp_s8_to_s32 + ; GFX6: [[COPY:%[0-9]+]]:_(s32) = COPY $vgpr0 + ; GFX6: [[COPY1:%[0-9]+]]:_(s32) = COPY [[COPY]](s32) + ; GFX6: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 24 + ; GFX6: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[COPY1]], [[C]](s32) + ; GFX6: [[ASHR:%[0-9]+]]:_(s32) = G_ASHR [[SHL]], [[C]](s32) + ; GFX6: [[SITOFP:%[0-9]+]]:_(s32) = G_SITOFP [[ASHR]](s32) + ; GFX6: $vgpr0 = COPY [[SITOFP]](s32) + ; GFX8-LABEL: name: test_sitofp_s8_to_s32 + ; GFX8: [[COPY:%[0-9]+]]:_(s32) = COPY $vgpr0 + ; GFX8: [[COPY1:%[0-9]+]]:_(s32) = COPY [[COPY]](s32) + ; GFX8: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 24 + ; GFX8: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[COPY1]], [[C]](s32) + ; GFX8: [[ASHR:%[0-9]+]]:_(s32) = G_ASHR [[SHL]], [[C]](s32) + ; GFX8: [[SITOFP:%[0-9]+]]:_(s32) = G_SITOFP [[ASHR]](s32) + ; GFX8: $vgpr0 = COPY [[SITOFP]](s32) + %0:_(s32) = COPY $vgpr0 + %1:_(s8) = G_TRUNC %0 + %2:_(s32) = G_SITOFP %1 + $vgpr0 = COPY %2 +... 
+ +--- +name: test_sitofp_s8_to_s64 +body: | + bb.0: + liveins: $vgpr0 + + ; GFX6-LABEL: name: test_sitofp_s8_to_s64 + ; GFX6: [[COPY:%[0-9]+]]:_(s32) = COPY $vgpr0 + ; GFX6: [[COPY1:%[0-9]+]]:_(s32) = COPY [[COPY]](s32) + ; GFX6: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 24 + ; GFX6: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[COPY1]], [[C]](s32) + ; GFX6: [[ASHR:%[0-9]+]]:_(s32) = G_ASHR [[SHL]], [[C]](s32) + ; GFX6: [[SITOFP:%[0-9]+]]:_(s64) = G_SITOFP [[ASHR]](s32) + ; GFX6: $vgpr0_vgpr1 = COPY [[SITOFP]](s64) + ; GFX8-LABEL: name: test_sitofp_s8_to_s64 + ; GFX8: [[COPY:%[0-9]+]]:_(s32) = COPY $vgpr0 + ; GFX8: [[COPY1:%[0-9]+]]:_(s32) = COPY [[COPY]](s32) + ; GFX8: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 24 + ; GFX8: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[COPY1]], [[C]](s32) + ; GFX8: [[ASHR:%[0-9]+]]:_(s32) = G_ASHR [[SHL]], [[C]](s32) + ; GFX8: [[SITOFP:%[0-9]+]]:_(s64) = G_SITOFP [[ASHR]](s32) + ; GFX8: $vgpr0_vgpr1 = COPY [[SITOFP]](s64) + %0:_(s32) = COPY $vgpr0 + %1:_(s8) = G_TRUNC %0 + %2:_(s64) = G_SITOFP %1 + $vgpr0_vgpr1 = COPY %2 +... 
Modified: llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-uitofp.mir URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-uitofp.mir?rev=373989&r1=373988&r2=373989&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-uitofp.mir (original) +++ llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-uitofp.mir Mon Oct 7 16:33:08 2019 @@ -1,5 +1,6 @@ # NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py -# RUN: llc -mtriple=amdgcn-mesa-mesa3d -mcpu=fiji -run-pass=legalizer -global-isel %s -o - | FileCheck %s +# RUN: llc -mtriple=amdgcn-mesa-mesa3d -mcpu=tahiti -run-pass=legalizer %s -o - | FileCheck -check-prefix=GFX6 %s +# RUN: llc -mtriple=amdgcn-mesa-mesa3d -mcpu=fiji -run-pass=legalizer %s -o - | FileCheck -check-prefix=GFX8 %s --- name: test_uitofp_s32_to_s32 @@ -7,10 +8,14 @@ body: | bb.0: liveins: $vgpr0 - ; CHECK-LABEL: name: test_uitofp_s32_to_s32 - ; CHECK: [[COPY:%[0-9]+]]:_(s32) = COPY $vgpr0 - ; CHECK: [[UITOFP:%[0-9]+]]:_(s32) = G_UITOFP [[COPY]](s32) - ; CHECK: $vgpr0 = COPY [[UITOFP]](s32) + ; GFX6-LABEL: name: test_uitofp_s32_to_s32 + ; GFX6: [[COPY:%[0-9]+]]:_(s32) = COPY $vgpr0 + ; GFX6: [[UITOFP:%[0-9]+]]:_(s32) = G_UITOFP [[COPY]](s32) + ; GFX6: $vgpr0 = COPY [[UITOFP]](s32) + ; GFX8-LABEL: name: test_uitofp_s32_to_s32 + ; GFX8: [[COPY:%[0-9]+]]:_(s32) = COPY $vgpr0 + ; GFX8: [[UITOFP:%[0-9]+]]:_(s32) = G_UITOFP [[COPY]](s32) + ; GFX8: $vgpr0 = COPY [[UITOFP]](s32) %0:_(s32) = COPY $vgpr0 %1:_(s32) = G_UITOFP %0 $vgpr0 = COPY %1 @@ -22,10 +27,14 @@ body: | bb.0: liveins: $vgpr0 - ; CHECK-LABEL: name: test_uitofp_s32_to_s64 - ; CHECK: [[COPY:%[0-9]+]]:_(s32) = COPY $vgpr0 - ; CHECK: [[UITOFP:%[0-9]+]]:_(s64) = G_UITOFP [[COPY]](s32) - ; CHECK: $vgpr0_vgpr1 = COPY [[UITOFP]](s64) + ; GFX6-LABEL: name: test_uitofp_s32_to_s64 + ; GFX6: [[COPY:%[0-9]+]]:_(s32) = COPY $vgpr0 + ; GFX6: 
[[UITOFP:%[0-9]+]]:_(s64) = G_UITOFP [[COPY]](s32) + ; GFX6: $vgpr0_vgpr1 = COPY [[UITOFP]](s64) + ; GFX8-LABEL: name: test_uitofp_s32_to_s64 + ; GFX8: [[COPY:%[0-9]+]]:_(s32) = COPY $vgpr0 + ; GFX8: [[UITOFP:%[0-9]+]]:_(s64) = G_UITOFP [[COPY]](s32) + ; GFX8: $vgpr0_vgpr1 = COPY [[UITOFP]](s64) %0:_(s32) = COPY $vgpr0 %1:_(s64) = G_UITOFP %0 $vgpr0_vgpr1 = COPY %1 @@ -37,13 +46,20 @@ body: | bb.0: liveins: $vgpr0_vgpr1 - ; CHECK-LABEL: name: test_uitofp_v2s32_to_v2s32 - ; CHECK: [[COPY:%[0-9]+]]:_(<2 x s32>) = COPY $vgpr0_vgpr1 - ; CHECK: [[UV:%[0-9]+]]:_(s32), [[UV1:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[COPY]](<2 x s32>) - ; CHECK: [[UITOFP:%[0-9]+]]:_(s32) = G_UITOFP [[UV]](s32) - ; CHECK: [[UITOFP1:%[0-9]+]]:_(s32) = G_UITOFP [[UV1]](s32) - ; CHECK: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[UITOFP]](s32), [[UITOFP1]](s32) - ; CHECK: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<2 x s32>) + ; GFX6-LABEL: name: test_uitofp_v2s32_to_v2s32 + ; GFX6: [[COPY:%[0-9]+]]:_(<2 x s32>) = COPY $vgpr0_vgpr1 + ; GFX6: [[UV:%[0-9]+]]:_(s32), [[UV1:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[COPY]](<2 x s32>) + ; GFX6: [[UITOFP:%[0-9]+]]:_(s32) = G_UITOFP [[UV]](s32) + ; GFX6: [[UITOFP1:%[0-9]+]]:_(s32) = G_UITOFP [[UV1]](s32) + ; GFX6: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[UITOFP]](s32), [[UITOFP1]](s32) + ; GFX6: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<2 x s32>) + ; GFX8-LABEL: name: test_uitofp_v2s32_to_v2s32 + ; GFX8: [[COPY:%[0-9]+]]:_(<2 x s32>) = COPY $vgpr0_vgpr1 + ; GFX8: [[UV:%[0-9]+]]:_(s32), [[UV1:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[COPY]](<2 x s32>) + ; GFX8: [[UITOFP:%[0-9]+]]:_(s32) = G_UITOFP [[UV]](s32) + ; GFX8: [[UITOFP1:%[0-9]+]]:_(s32) = G_UITOFP [[UV1]](s32) + ; GFX8: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[UITOFP]](s32), [[UITOFP1]](s32) + ; GFX8: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<2 x s32>) %0:_(<2 x s32>) = COPY $vgpr0_vgpr1 %1:_(<2 x s32>) = G_UITOFP %0 $vgpr0_vgpr1 = COPY %1 @@ -55,37 +71,68 @@ body: | bb.0: 
liveins: $vgpr0_vgpr1 - ; CHECK-LABEL: name: test_uitofp_s64_to_s32 - ; CHECK: [[COPY:%[0-9]+]]:_(s64) = COPY $vgpr0_vgpr1 - ; CHECK: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 0 - ; CHECK: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 0 - ; CHECK: [[CTLZ_ZERO_UNDEF:%[0-9]+]]:_(s32) = G_CTLZ_ZERO_UNDEF [[COPY]](s64) - ; CHECK: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 190 - ; CHECK: [[SUB:%[0-9]+]]:_(s32) = G_SUB [[C2]], [[CTLZ_ZERO_UNDEF]] - ; CHECK: [[ICMP:%[0-9]+]]:_(s1) = G_ICMP intpred(ne), [[COPY]](s64), [[C1]] - ; CHECK: [[SELECT:%[0-9]+]]:_(s32) = G_SELECT [[ICMP]](s1), [[SUB]], [[C]] - ; CHECK: [[C3:%[0-9]+]]:_(s64) = G_CONSTANT i64 9223372036854775807 - ; CHECK: [[SHL:%[0-9]+]]:_(s64) = G_SHL [[COPY]], [[CTLZ_ZERO_UNDEF]](s32) - ; CHECK: [[AND:%[0-9]+]]:_(s64) = G_AND [[SHL]], [[C3]] - ; CHECK: [[C4:%[0-9]+]]:_(s64) = G_CONSTANT i64 1099511627775 - ; CHECK: [[AND1:%[0-9]+]]:_(s64) = G_AND [[AND]], [[C4]] - ; CHECK: [[C5:%[0-9]+]]:_(s64) = G_CONSTANT i64 40 - ; CHECK: [[TRUNC:%[0-9]+]]:_(s32) = G_TRUNC [[C5]](s64) - ; CHECK: [[LSHR:%[0-9]+]]:_(s64) = G_LSHR [[AND]], [[TRUNC]](s32) - ; CHECK: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 23 - ; CHECK: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[SELECT]], [[C6]](s32) - ; CHECK: [[TRUNC1:%[0-9]+]]:_(s32) = G_TRUNC [[LSHR]](s64) - ; CHECK: [[OR:%[0-9]+]]:_(s32) = G_OR [[SHL1]], [[TRUNC1]] - ; CHECK: [[C7:%[0-9]+]]:_(s64) = G_CONSTANT i64 549755813888 - ; CHECK: [[ICMP1:%[0-9]+]]:_(s1) = G_ICMP intpred(ugt), [[AND1]](s64), [[C7]] - ; CHECK: [[ICMP2:%[0-9]+]]:_(s1) = G_ICMP intpred(eq), [[AND1]](s64), [[C7]] - ; CHECK: [[C8:%[0-9]+]]:_(s32) = G_CONSTANT i32 1 - ; CHECK: [[AND2:%[0-9]+]]:_(s32) = G_AND [[OR]], [[C8]] - ; CHECK: [[SELECT1:%[0-9]+]]:_(s32) = G_SELECT [[ICMP2]](s1), [[AND2]], [[C]] - ; CHECK: [[SELECT2:%[0-9]+]]:_(s32) = G_SELECT [[ICMP1]](s1), [[C8]], [[SELECT1]] - ; CHECK: [[ADD:%[0-9]+]]:_(s32) = G_ADD [[OR]], [[SELECT2]] - ; CHECK: [[UITOFP:%[0-9]+]]:_(s32) = G_UITOFP [[COPY]](s64) - ; CHECK: $vgpr0 = COPY [[UITOFP]](s32) + ; 
GFX6-LABEL: name: test_uitofp_s64_to_s32 + ; GFX6: [[COPY:%[0-9]+]]:_(s64) = COPY $vgpr0_vgpr1 + ; GFX6: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 0 + ; GFX6: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 0 + ; GFX6: [[CTLZ_ZERO_UNDEF:%[0-9]+]]:_(s32) = G_CTLZ_ZERO_UNDEF [[COPY]](s64) + ; GFX6: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 190 + ; GFX6: [[SUB:%[0-9]+]]:_(s32) = G_SUB [[C2]], [[CTLZ_ZERO_UNDEF]] + ; GFX6: [[ICMP:%[0-9]+]]:_(s1) = G_ICMP intpred(ne), [[COPY]](s64), [[C1]] + ; GFX6: [[SELECT:%[0-9]+]]:_(s32) = G_SELECT [[ICMP]](s1), [[SUB]], [[C]] + ; GFX6: [[C3:%[0-9]+]]:_(s64) = G_CONSTANT i64 9223372036854775807 + ; GFX6: [[SHL:%[0-9]+]]:_(s64) = G_SHL [[COPY]], [[CTLZ_ZERO_UNDEF]](s32) + ; GFX6: [[AND:%[0-9]+]]:_(s64) = G_AND [[SHL]], [[C3]] + ; GFX6: [[C4:%[0-9]+]]:_(s64) = G_CONSTANT i64 1099511627775 + ; GFX6: [[AND1:%[0-9]+]]:_(s64) = G_AND [[AND]], [[C4]] + ; GFX6: [[C5:%[0-9]+]]:_(s64) = G_CONSTANT i64 40 + ; GFX6: [[TRUNC:%[0-9]+]]:_(s32) = G_TRUNC [[C5]](s64) + ; GFX6: [[LSHR:%[0-9]+]]:_(s64) = G_LSHR [[AND]], [[TRUNC]](s32) + ; GFX6: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 23 + ; GFX6: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[SELECT]], [[C6]](s32) + ; GFX6: [[TRUNC1:%[0-9]+]]:_(s32) = G_TRUNC [[LSHR]](s64) + ; GFX6: [[OR:%[0-9]+]]:_(s32) = G_OR [[SHL1]], [[TRUNC1]] + ; GFX6: [[C7:%[0-9]+]]:_(s64) = G_CONSTANT i64 549755813888 + ; GFX6: [[ICMP1:%[0-9]+]]:_(s1) = G_ICMP intpred(ugt), [[AND1]](s64), [[C7]] + ; GFX6: [[ICMP2:%[0-9]+]]:_(s1) = G_ICMP intpred(eq), [[AND1]](s64), [[C7]] + ; GFX6: [[C8:%[0-9]+]]:_(s32) = G_CONSTANT i32 1 + ; GFX6: [[AND2:%[0-9]+]]:_(s32) = G_AND [[OR]], [[C8]] + ; GFX6: [[SELECT1:%[0-9]+]]:_(s32) = G_SELECT [[ICMP2]](s1), [[AND2]], [[C]] + ; GFX6: [[SELECT2:%[0-9]+]]:_(s32) = G_SELECT [[ICMP1]](s1), [[C8]], [[SELECT1]] + ; GFX6: [[ADD:%[0-9]+]]:_(s32) = G_ADD [[OR]], [[SELECT2]] + ; GFX6: [[UITOFP:%[0-9]+]]:_(s32) = G_UITOFP [[COPY]](s64) + ; GFX6: $vgpr0 = COPY [[UITOFP]](s32) + ; GFX8-LABEL: name: test_uitofp_s64_to_s32 + ; GFX8: 
[[COPY:%[0-9]+]]:_(s64) = COPY $vgpr0_vgpr1 + ; GFX8: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 0 + ; GFX8: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 0 + ; GFX8: [[CTLZ_ZERO_UNDEF:%[0-9]+]]:_(s32) = G_CTLZ_ZERO_UNDEF [[COPY]](s64) + ; GFX8: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 190 + ; GFX8: [[SUB:%[0-9]+]]:_(s32) = G_SUB [[C2]], [[CTLZ_ZERO_UNDEF]] + ; GFX8: [[ICMP:%[0-9]+]]:_(s1) = G_ICMP intpred(ne), [[COPY]](s64), [[C1]] + ; GFX8: [[SELECT:%[0-9]+]]:_(s32) = G_SELECT [[ICMP]](s1), [[SUB]], [[C]] + ; GFX8: [[C3:%[0-9]+]]:_(s64) = G_CONSTANT i64 9223372036854775807 + ; GFX8: [[SHL:%[0-9]+]]:_(s64) = G_SHL [[COPY]], [[CTLZ_ZERO_UNDEF]](s32) + ; GFX8: [[AND:%[0-9]+]]:_(s64) = G_AND [[SHL]], [[C3]] + ; GFX8: [[C4:%[0-9]+]]:_(s64) = G_CONSTANT i64 1099511627775 + ; GFX8: [[AND1:%[0-9]+]]:_(s64) = G_AND [[AND]], [[C4]] + ; GFX8: [[C5:%[0-9]+]]:_(s64) = G_CONSTANT i64 40 + ; GFX8: [[TRUNC:%[0-9]+]]:_(s32) = G_TRUNC [[C5]](s64) + ; GFX8: [[LSHR:%[0-9]+]]:_(s64) = G_LSHR [[AND]], [[TRUNC]](s32) + ; GFX8: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 23 + ; GFX8: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[SELECT]], [[C6]](s32) + ; GFX8: [[TRUNC1:%[0-9]+]]:_(s32) = G_TRUNC [[LSHR]](s64) + ; GFX8: [[OR:%[0-9]+]]:_(s32) = G_OR [[SHL1]], [[TRUNC1]] + ; GFX8: [[C7:%[0-9]+]]:_(s64) = G_CONSTANT i64 549755813888 + ; GFX8: [[ICMP1:%[0-9]+]]:_(s1) = G_ICMP intpred(ugt), [[AND1]](s64), [[C7]] + ; GFX8: [[ICMP2:%[0-9]+]]:_(s1) = G_ICMP intpred(eq), [[AND1]](s64), [[C7]] + ; GFX8: [[C8:%[0-9]+]]:_(s32) = G_CONSTANT i32 1 + ; GFX8: [[AND2:%[0-9]+]]:_(s32) = G_AND [[OR]], [[C8]] + ; GFX8: [[SELECT1:%[0-9]+]]:_(s32) = G_SELECT [[ICMP2]](s1), [[AND2]], [[C]] + ; GFX8: [[SELECT2:%[0-9]+]]:_(s32) = G_SELECT [[ICMP1]](s1), [[C8]], [[SELECT1]] + ; GFX8: [[ADD:%[0-9]+]]:_(s32) = G_ADD [[OR]], [[SELECT2]] + ; GFX8: [[UITOFP:%[0-9]+]]:_(s32) = G_UITOFP [[COPY]](s64) + ; GFX8: $vgpr0 = COPY [[UITOFP]](s32) %0:_(s64) = COPY $vgpr0_vgpr1 %1:_(s32) = G_UITOFP %0 $vgpr0 = COPY %1 @@ -97,33 +144,185 @@ body: | bb.0: 
liveins: $vgpr0_vgpr1 - ; CHECK-LABEL: name: test_uitofp_s64_to_s64 - ; CHECK: [[COPY:%[0-9]+]]:_(s64) = COPY $vgpr0_vgpr1 - ; CHECK: [[UV:%[0-9]+]]:_(s32), [[UV1:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[COPY]](s64) - ; CHECK: [[UITOFP:%[0-9]+]]:_(s64) = G_UITOFP [[UV1]](s32) - ; CHECK: [[UITOFP1:%[0-9]+]]:_(s64) = G_UITOFP [[UV]](s32) - ; CHECK: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 32 - ; CHECK: [[INT:%[0-9]+]]:_(s64) = G_INTRINSIC intrinsic(@llvm.amdgcn.ldexp), [[UITOFP]](s64), [[C]](s32) - ; CHECK: [[FADD:%[0-9]+]]:_(s64) = G_FADD [[INT]], [[UITOFP1]] - ; CHECK: $vgpr0_vgpr1 = COPY [[FADD]](s64) + ; GFX6-LABEL: name: test_uitofp_s64_to_s64 + ; GFX6: [[COPY:%[0-9]+]]:_(s64) = COPY $vgpr0_vgpr1 + ; GFX6: [[UV:%[0-9]+]]:_(s32), [[UV1:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[COPY]](s64) + ; GFX6: [[UITOFP:%[0-9]+]]:_(s64) = G_UITOFP [[UV1]](s32) + ; GFX6: [[UITOFP1:%[0-9]+]]:_(s64) = G_UITOFP [[UV]](s32) + ; GFX6: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 32 + ; GFX6: [[INT:%[0-9]+]]:_(s64) = G_INTRINSIC intrinsic(@llvm.amdgcn.ldexp), [[UITOFP]](s64), [[C]](s32) + ; GFX6: [[FADD:%[0-9]+]]:_(s64) = G_FADD [[INT]], [[UITOFP1]] + ; GFX6: $vgpr0_vgpr1 = COPY [[FADD]](s64) + ; GFX8-LABEL: name: test_uitofp_s64_to_s64 + ; GFX8: [[COPY:%[0-9]+]]:_(s64) = COPY $vgpr0_vgpr1 + ; GFX8: [[UV:%[0-9]+]]:_(s32), [[UV1:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[COPY]](s64) + ; GFX8: [[UITOFP:%[0-9]+]]:_(s64) = G_UITOFP [[UV1]](s32) + ; GFX8: [[UITOFP1:%[0-9]+]]:_(s64) = G_UITOFP [[UV]](s32) + ; GFX8: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 32 + ; GFX8: [[INT:%[0-9]+]]:_(s64) = G_INTRINSIC intrinsic(@llvm.amdgcn.ldexp), [[UITOFP]](s64), [[C]](s32) + ; GFX8: [[FADD:%[0-9]+]]:_(s64) = G_FADD [[INT]], [[UITOFP1]] + ; GFX8: $vgpr0_vgpr1 = COPY [[FADD]](s64) %0:_(s64) = COPY $vgpr0_vgpr1 %1:_(s64) = G_UITOFP %0 $vgpr0_vgpr1 = COPY %1 ... 
--- +name: test_uitofp_s16_to_s16 +body: | + bb.0: + liveins: $vgpr0 + + ; GFX6-LABEL: name: test_uitofp_s16_to_s16 + ; GFX6: [[COPY:%[0-9]+]]:_(s32) = COPY $vgpr0 + ; GFX6: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; GFX6: [[COPY1:%[0-9]+]]:_(s32) = COPY [[COPY]](s32) + ; GFX6: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C]] + ; GFX6: [[UITOFP:%[0-9]+]]:_(s16) = G_UITOFP [[AND]](s32) + ; GFX6: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[UITOFP]](s16) + ; GFX6: $vgpr0 = COPY [[ANYEXT]](s32) + ; GFX8-LABEL: name: test_uitofp_s16_to_s16 + ; GFX8: [[COPY:%[0-9]+]]:_(s32) = COPY $vgpr0 + ; GFX8: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[COPY]](s32) + ; GFX8: [[UITOFP:%[0-9]+]]:_(s16) = G_UITOFP [[TRUNC]](s16) + ; GFX8: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[UITOFP]](s16) + ; GFX8: $vgpr0 = COPY [[ANYEXT]](s32) + %0:_(s32) = COPY $vgpr0 + %1:_(s16) = G_TRUNC %0 + %2:_(s16) = G_UITOFP %1 + %3:_(s32) = G_ANYEXT %2 + $vgpr0 = COPY %3 +... + +--- name: test_uitofp_s16_to_s32 body: | bb.0: liveins: $vgpr0 - ; CHECK-LABEL: name: test_uitofp_s16_to_s32 - ; CHECK: [[COPY:%[0-9]+]]:_(s32) = COPY $vgpr0 - ; CHECK: [[UITOFP:%[0-9]+]]:_(s16) = G_UITOFP [[COPY]](s32) - ; CHECK: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[UITOFP]](s16) - ; CHECK: $vgpr0 = COPY [[ANYEXT]](s32) + ; GFX6-LABEL: name: test_uitofp_s16_to_s32 + ; GFX6: [[COPY:%[0-9]+]]:_(s32) = COPY $vgpr0 + ; GFX6: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; GFX6: [[COPY1:%[0-9]+]]:_(s32) = COPY [[COPY]](s32) + ; GFX6: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C]] + ; GFX6: [[UITOFP:%[0-9]+]]:_(s32) = G_UITOFP [[AND]](s32) + ; GFX6: $vgpr0 = COPY [[UITOFP]](s32) + ; GFX8-LABEL: name: test_uitofp_s16_to_s32 + ; GFX8: [[COPY:%[0-9]+]]:_(s32) = COPY $vgpr0 + ; GFX8: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; GFX8: [[COPY1:%[0-9]+]]:_(s32) = COPY [[COPY]](s32) + ; GFX8: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C]] + ; GFX8: [[UITOFP:%[0-9]+]]:_(s32) = G_UITOFP [[AND]](s32) + ; GFX8: $vgpr0 = COPY [[UITOFP]](s32) %0:_(s32) 
= COPY $vgpr0 - %1:_(s16) = G_UITOFP %0 - %2:_(s32) = G_ANYEXT %1 + %1:_(s16) = G_TRUNC %0 + %2:_(s32) = G_UITOFP %1 $vgpr0 = COPY %2 ... + +--- +name: test_uitofp_s16_to_s64 +body: | + bb.0: + liveins: $vgpr0 + + ; GFX6-LABEL: name: test_uitofp_s16_to_s64 + ; GFX6: [[COPY:%[0-9]+]]:_(s32) = COPY $vgpr0 + ; GFX6: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; GFX6: [[COPY1:%[0-9]+]]:_(s32) = COPY [[COPY]](s32) + ; GFX6: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C]] + ; GFX6: [[UITOFP:%[0-9]+]]:_(s64) = G_UITOFP [[AND]](s32) + ; GFX6: $vgpr0_vgpr1 = COPY [[UITOFP]](s64) + ; GFX8-LABEL: name: test_uitofp_s16_to_s64 + ; GFX8: [[COPY:%[0-9]+]]:_(s32) = COPY $vgpr0 + ; GFX8: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 + ; GFX8: [[COPY1:%[0-9]+]]:_(s32) = COPY [[COPY]](s32) + ; GFX8: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C]] + ; GFX8: [[UITOFP:%[0-9]+]]:_(s64) = G_UITOFP [[AND]](s32) + ; GFX8: $vgpr0_vgpr1 = COPY [[UITOFP]](s64) + %0:_(s32) = COPY $vgpr0 + %1:_(s16) = G_TRUNC %0 + %2:_(s64) = G_UITOFP %1 + $vgpr0_vgpr1 = COPY %2 +... 
+ +--- +name: test_uitofp_s8_to_s16 +body: | + bb.0: + liveins: $vgpr0 + + ; GFX6-LABEL: name: test_uitofp_s8_to_s16 + ; GFX6: [[COPY:%[0-9]+]]:_(s32) = COPY $vgpr0 + ; GFX6: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 255 + ; GFX6: [[COPY1:%[0-9]+]]:_(s32) = COPY [[COPY]](s32) + ; GFX6: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C]] + ; GFX6: [[UITOFP:%[0-9]+]]:_(s16) = G_UITOFP [[AND]](s32) + ; GFX6: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[UITOFP]](s16) + ; GFX6: $vgpr0 = COPY [[ANYEXT]](s32) + ; GFX8-LABEL: name: test_uitofp_s8_to_s16 + ; GFX8: [[COPY:%[0-9]+]]:_(s32) = COPY $vgpr0 + ; GFX8: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 255 + ; GFX8: [[COPY1:%[0-9]+]]:_(s32) = COPY [[COPY]](s32) + ; GFX8: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C]] + ; GFX8: [[UITOFP:%[0-9]+]]:_(s16) = G_UITOFP [[AND]](s32) + ; GFX8: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[UITOFP]](s16) + ; GFX8: $vgpr0 = COPY [[ANYEXT]](s32) + %0:_(s32) = COPY $vgpr0 + %1:_(s8) = G_TRUNC %0 + %2:_(s16) = G_UITOFP %1 + %3:_(s32) = G_ANYEXT %2 + $vgpr0 = COPY %3 +... + +--- +name: test_uitofp_s8_to_s32 +body: | + bb.0: + liveins: $vgpr0 + + ; GFX6-LABEL: name: test_uitofp_s8_to_s32 + ; GFX6: [[COPY:%[0-9]+]]:_(s32) = COPY $vgpr0 + ; GFX6: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 255 + ; GFX6: [[COPY1:%[0-9]+]]:_(s32) = COPY [[COPY]](s32) + ; GFX6: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C]] + ; GFX6: [[UITOFP:%[0-9]+]]:_(s32) = G_UITOFP [[AND]](s32) + ; GFX6: $vgpr0 = COPY [[UITOFP]](s32) + ; GFX8-LABEL: name: test_uitofp_s8_to_s32 + ; GFX8: [[COPY:%[0-9]+]]:_(s32) = COPY $vgpr0 + ; GFX8: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 255 + ; GFX8: [[COPY1:%[0-9]+]]:_(s32) = COPY [[COPY]](s32) + ; GFX8: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C]] + ; GFX8: [[UITOFP:%[0-9]+]]:_(s32) = G_UITOFP [[AND]](s32) + ; GFX8: $vgpr0 = COPY [[UITOFP]](s32) + %0:_(s32) = COPY $vgpr0 + %1:_(s8) = G_TRUNC %0 + %2:_(s32) = G_UITOFP %1 + $vgpr0 = COPY %2 +... 
+ +--- +name: test_uitofp_s8_to_s64 +body: | + bb.0: + liveins: $vgpr0 + + ; GFX6-LABEL: name: test_uitofp_s8_to_s64 + ; GFX6: [[COPY:%[0-9]+]]:_(s32) = COPY $vgpr0 + ; GFX6: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 255 + ; GFX6: [[COPY1:%[0-9]+]]:_(s32) = COPY [[COPY]](s32) + ; GFX6: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C]] + ; GFX6: [[UITOFP:%[0-9]+]]:_(s64) = G_UITOFP [[AND]](s32) + ; GFX6: $vgpr0_vgpr1 = COPY [[UITOFP]](s64) + ; GFX8-LABEL: name: test_uitofp_s8_to_s64 + ; GFX8: [[COPY:%[0-9]+]]:_(s32) = COPY $vgpr0 + ; GFX8: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 255 + ; GFX8: [[COPY1:%[0-9]+]]:_(s32) = COPY [[COPY]](s32) + ; GFX8: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C]] + ; GFX8: [[UITOFP:%[0-9]+]]:_(s64) = G_UITOFP [[AND]](s32) + ; GFX8: $vgpr0_vgpr1 = COPY [[UITOFP]](s64) + %0:_(s32) = COPY $vgpr0 + %1:_(s8) = G_TRUNC %0 + %2:_(s64) = G_UITOFP %1 + $vgpr0_vgpr1 = COPY %2 +... From llvm-commits at lists.llvm.org Mon Oct 7 16:48:05 2019 From: llvm-commits at lists.llvm.org (Evgenii Stepanov via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 23:48:05 +0000 (UTC) Subject: [PATCH] D68431: [msan] Add interceptors: crypt, crypt_r. In-Reply-To: References: Message-ID: <7c8c12d3e66912ccf31e7f88e2c64a86@localhost.localdomain> eugenis updated this revision to Diff 223696. eugenis added a comment. addressed comments Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68431/new/ https://reviews.llvm.org/D68431 Files: compiler-rt/lib/sanitizer_common/sanitizer_common_interceptors.inc compiler-rt/lib/sanitizer_common/sanitizer_platform_interceptors.h compiler-rt/lib/sanitizer_common/sanitizer_platform_limits_posix.cpp compiler-rt/lib/sanitizer_common/sanitizer_platform_limits_posix.h compiler-rt/test/sanitizer_common/TestCases/Linux/crypt_r.cpp compiler-rt/test/sanitizer_common/TestCases/Posix/crypt.cpp -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68431.223696.patch Type: text/x-patch Size: 5664 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 16:53:27 2019 From: llvm-commits at lists.llvm.org (Francis Visoiu Mistrih via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 23:53:27 +0000 (UTC) Subject: [PATCH] D68611: [IRGen] Emit lifetime markers for temporary struct allocas Message-ID: thegameg created this revision. thegameg added reviewers: rjmccall, t.p.northover, efriedma, rnk. When passing arguments using temporary allocas, we need to add the appropriate lifetime markers so that the stack coloring passes can re-use the stack space. This patch keeps track of all the lifetime.start calls emitted before the codegened call, and adds the corresponding lifetime.end calls after the call. https://reviews.llvm.org/D68611 Files: clang/lib/CodeGen/CGCall.cpp clang/test/CodeGen/aarch64-byval-temp.c -------------- next part -------------- A non-text attachment was scrubbed... Name: D68611.223697.patch Type: text/x-patch Size: 7490 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 16:54:20 2019 From: llvm-commits at lists.llvm.org (Craig Topper via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 23:54:20 +0000 (UTC) Subject: [PATCH] D53876: Preserve loop metadata when splitting exit blocks In-Reply-To: References: Message-ID: <44ea6caa4e60f3661bf91a3348219f88@localhost.localdomain> craig.topper added a comment. I'm trying to make sense of the history here. It looks like I committed it in November last year. Then it got reverted. So I reopened it but maybe didn't move it to changes required and instead it stayed in accepted. Then @clin1 updated it, but it never got re-reviewed. Then it was rebased in January. And I guess sat in accepted state until today when Phabricator lost its mind and closed it to the original commit? And probably erased the January version of the diff and went back to November 18? 
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D53876/new/ https://reviews.llvm.org/D53876 From llvm-commits at lists.llvm.org Mon Oct 7 17:00:30 2019 From: llvm-commits at lists.llvm.org (Evgeniy Stepanov via llvm-commits) Date: Tue, 08 Oct 2019 00:00:30 -0000 Subject: [compiler-rt] r373993 - [msan] Add interceptors: crypt, crypt_r. Message-ID: <20191008000030.E4A6B81D62@lists.llvm.org> Author: eugenis Date: Mon Oct 7 17:00:30 2019 New Revision: 373993 URL: http://llvm.org/viewvc/llvm-project?rev=373993&view=rev Log: [msan] Add interceptors: crypt, crypt_r. Reviewers: vitalybuka Subscribers: srhines, #sanitizers, llvm-commits Tags: #sanitizers, #llvm Differential Revision: https://reviews.llvm.org/D68431 Added: compiler-rt/trunk/test/sanitizer_common/TestCases/Linux/crypt_r.cpp compiler-rt/trunk/test/sanitizer_common/TestCases/Posix/crypt.cpp Modified: compiler-rt/trunk/lib/sanitizer_common/sanitizer_common_interceptors.inc compiler-rt/trunk/lib/sanitizer_common/sanitizer_platform_interceptors.h compiler-rt/trunk/lib/sanitizer_common/sanitizer_platform_limits_posix.cpp compiler-rt/trunk/lib/sanitizer_common/sanitizer_platform_limits_posix.h Modified: compiler-rt/trunk/lib/sanitizer_common/sanitizer_common_interceptors.inc URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/sanitizer_common/sanitizer_common_interceptors.inc?rev=373993&r1=373992&r2=373993&view=diff ============================================================================== --- compiler-rt/trunk/lib/sanitizer_common/sanitizer_common_interceptors.inc (original) +++ compiler-rt/trunk/lib/sanitizer_common/sanitizer_common_interceptors.inc Mon Oct 7 17:00:30 2019 @@ -9573,6 +9573,41 @@ INTERCEPTOR(SSIZE_T, getrandom, void *bu #define INIT_GETRANDOM #endif +#if SANITIZER_INTERCEPT_CRYPT +INTERCEPTOR(char *, crypt, char *key, char *salt) { + void *ctx; + COMMON_INTERCEPTOR_ENTER(ctx, crypt, key, salt); + COMMON_INTERCEPTOR_READ_RANGE(ctx, key, 
internal_strlen(key) + 1); + COMMON_INTERCEPTOR_READ_RANGE(ctx, salt, internal_strlen(salt) + 1); + char *res = REAL(crypt)(key, salt); + if (res != nullptr) + COMMON_INTERCEPTOR_INITIALIZE_RANGE(res, internal_strlen(res) + 1); + return res; +} +#define INIT_CRYPT COMMON_INTERCEPT_FUNCTION(crypt); +#else +#define INIT_CRYPT +#endif + +#if SANITIZER_INTERCEPT_CRYPT_R +INTERCEPTOR(char *, crypt_r, char *key, char *salt, void *data) { + void *ctx; + COMMON_INTERCEPTOR_ENTER(ctx, crypt_r, key, salt, data); + COMMON_INTERCEPTOR_READ_RANGE(ctx, key, internal_strlen(key) + 1); + COMMON_INTERCEPTOR_READ_RANGE(ctx, salt, internal_strlen(salt) + 1); + char *res = REAL(crypt_r)(key, salt, data); + if (res != nullptr) { + COMMON_INTERCEPTOR_WRITE_RANGE(ctx, data, + __sanitizer::struct_crypt_data_sz); + COMMON_INTERCEPTOR_INITIALIZE_RANGE(res, internal_strlen(res) + 1); + } + return res; +} +#define INIT_CRYPT_R COMMON_INTERCEPT_FUNCTION(crypt_r); +#else +#define INIT_CRYPT_R +#endif + static void InitializeCommonInterceptors() { #if SI_POSIX static u64 metadata_mem[sizeof(MetadataHashMap) / sizeof(u64) + 1]; @@ -9871,6 +9906,8 @@ static void InitializeCommonInterceptors INIT_GETUSERSHELL; INIT_SL_INIT; INIT_GETRANDOM; + INIT_CRYPT; + INIT_CRYPT_R; INIT___PRINTF_CHK; } Modified: compiler-rt/trunk/lib/sanitizer_common/sanitizer_platform_interceptors.h URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/sanitizer_common/sanitizer_platform_interceptors.h?rev=373993&r1=373992&r2=373993&view=diff ============================================================================== --- compiler-rt/trunk/lib/sanitizer_common/sanitizer_platform_interceptors.h (original) +++ compiler-rt/trunk/lib/sanitizer_common/sanitizer_platform_interceptors.h Mon Oct 7 17:00:30 2019 @@ -566,6 +566,8 @@ #define SANITIZER_INTERCEPT_FDEVNAME SI_FREEBSD #define SANITIZER_INTERCEPT_GETUSERSHELL (SI_POSIX && !SI_ANDROID) #define SANITIZER_INTERCEPT_SL_INIT (SI_FREEBSD || SI_NETBSD) +#define 
SANITIZER_INTERCEPT_CRYPT (SI_POSIX && !SI_ANDROID) +#define SANITIZER_INTERCEPT_CRYPT_R (SI_LINUX && !SI_ANDROID) #define SANITIZER_INTERCEPT_GETRANDOM (SI_LINUX && __GLIBC_PREREQ(2, 25)) #define SANITIZER_INTERCEPT___CXA_ATEXIT SI_NETBSD Modified: compiler-rt/trunk/lib/sanitizer_common/sanitizer_platform_limits_posix.cpp URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/sanitizer_common/sanitizer_platform_limits_posix.cpp?rev=373993&r1=373992&r2=373993&view=diff ============================================================================== --- compiler-rt/trunk/lib/sanitizer_common/sanitizer_platform_limits_posix.cpp (original) +++ compiler-rt/trunk/lib/sanitizer_common/sanitizer_platform_limits_posix.cpp Mon Oct 7 17:00:30 2019 @@ -140,6 +140,7 @@ typedef struct user_fpregs elf_fpregset_ #include #include #include +#include #endif // SANITIZER_LINUX && !SANITIZER_ANDROID #if SANITIZER_ANDROID @@ -240,6 +241,7 @@ namespace __sanitizer { unsigned struct_ustat_sz = SIZEOF_STRUCT_USTAT; unsigned struct_rlimit64_sz = sizeof(struct rlimit64); unsigned struct_statvfs64_sz = sizeof(struct statvfs64); + unsigned struct_crypt_data_sz = sizeof(struct crypt_data); #endif // SANITIZER_LINUX && !SANITIZER_ANDROID #if SANITIZER_LINUX && !SANITIZER_ANDROID Modified: compiler-rt/trunk/lib/sanitizer_common/sanitizer_platform_limits_posix.h URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/sanitizer_common/sanitizer_platform_limits_posix.h?rev=373993&r1=373992&r2=373993&view=diff ============================================================================== --- compiler-rt/trunk/lib/sanitizer_common/sanitizer_platform_limits_posix.h (original) +++ compiler-rt/trunk/lib/sanitizer_common/sanitizer_platform_limits_posix.h Mon Oct 7 17:00:30 2019 @@ -304,6 +304,7 @@ extern unsigned struct_msqid_ds_sz; extern unsigned struct_mq_attr_sz; extern unsigned struct_timex_sz; extern unsigned struct_statvfs_sz; +extern unsigned struct_crypt_data_sz; #endif // 
SANITIZER_LINUX && !SANITIZER_ANDROID struct __sanitizer_iovec { Added: compiler-rt/trunk/test/sanitizer_common/TestCases/Linux/crypt_r.cpp URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/test/sanitizer_common/TestCases/Linux/crypt_r.cpp?rev=373993&view=auto ============================================================================== --- compiler-rt/trunk/test/sanitizer_common/TestCases/Linux/crypt_r.cpp (added) +++ compiler-rt/trunk/test/sanitizer_common/TestCases/Linux/crypt_r.cpp Mon Oct 7 17:00:30 2019 @@ -0,0 +1,37 @@ +// RUN: %clangxx -O0 -g %s -lcrypt -o %t && %run %t + +#include +#include +#include +#include + +#include + +int +main (int argc, char** argv) +{ + { + crypt_data cd; + cd.initialized = 0; + char *p = crypt_r("abcdef", "xz", &cd); + volatile size_t z = strlen(p); + } + { + crypt_data cd; + cd.initialized = 0; + char *p = crypt_r("abcdef", "$1$", &cd); + volatile size_t z = strlen(p); + } + { + crypt_data cd; + cd.initialized = 0; + char *p = crypt_r("abcdef", "$5$", &cd); + volatile size_t z = strlen(p); + } + { + crypt_data cd; + cd.initialized = 0; + char *p = crypt_r("abcdef", "$6$", &cd); + volatile size_t z = strlen(p); + } +} Added: compiler-rt/trunk/test/sanitizer_common/TestCases/Posix/crypt.cpp URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/test/sanitizer_common/TestCases/Posix/crypt.cpp?rev=373993&view=auto ============================================================================== --- compiler-rt/trunk/test/sanitizer_common/TestCases/Posix/crypt.cpp (added) +++ compiler-rt/trunk/test/sanitizer_common/TestCases/Posix/crypt.cpp Mon Oct 7 17:00:30 2019 @@ -0,0 +1,26 @@ +// RUN: %clangxx -O0 -g %s -o %t -lcrypt && %run %t + +#include +#include +#include + +int +main (int argc, char** argv) +{ + { + char *p = crypt("abcdef", "xz"); + volatile size_t z = strlen(p); + } + { + char *p = crypt("abcdef", "$1$"); + volatile size_t z = strlen(p); + } + { + char *p = crypt("abcdef", "$5$"); + volatile size_t z = 
strlen(p); + } + { + char *p = crypt("abcdef", "$6$"); + volatile size_t z = strlen(p); + } +} From llvm-commits at lists.llvm.org Mon Oct 7 17:26:03 2019 From: llvm-commits at lists.llvm.org (Adrian Prantl via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 00:26:03 +0000 (UTC) Subject: [PATCH] D68466: [DebugInfo] Allow pairs in AddressPool [NFC] In-Reply-To: References: Message-ID: <719eee704dac99cac4c612ca405d626a@localhost.localdomain> aprantl added inline comments. ================ Comment at: llvm/lib/CodeGen/AsmPrinter/AddressPool.cpp:22 + auto IterBool = Pool.insert(std::make_pair( + std::make_pair(Sym, 0), AddressPoolEntry(Pool.size(), TLS))); + return IterBool.first->second.Number; ---------------- std::make_pair(a, b) -> {a, b} ================ Comment at: llvm/lib/CodeGen/AsmPrinter/AddressPool.cpp:85 + MCConstantExpr::create(std::abs(Offset), Asm.OutContext); + const auto SymbolExpr = Entries[I.second.Number]; + Entries[I.second.Number] = ---------------- This looks up the entry twice. Does this work? ``` auto &Entry = Entries[I.second.Number]; SymbolExpr OldSymbolExpr = Entry; Entry = MCBinaryExpr::create(Op, OldSymbolExpr, OffsetExpr, Asm.OutContext); ``` ================ Comment at: llvm/lib/CodeGen/AsmPrinter/AddressPool.h:21 +/// Pair of a symbol and an offset in number of bytes. +typedef std::pair SymbolWithOffset; + ---------------- I usually recommend defining a struct instead, so the members can have descriptive names for better readability. Since you still want the DenseMap to work, a trick I used elsewhere is to inherit from std::pair and provide accessors. Maybe that's excessive here. 
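The "inherit from std::pair and provide accessors" trick mentioned in the last inline comment could look roughly like the sketch below. It is only an illustration: `SymbolRef` is a hypothetical stand-in for the symbol pointer type used in the patch, and the accessor names `symbol()` and `offset()` are assumptions rather than anything taken from D68466.

```cpp
#include <cstdint>
#include <string>
#include <utility>

// Hypothetical stand-in for the symbol type; the patch itself keys on an
// MC symbol pointer, which is not modelled here.
using SymbolRef = const std::string *;

// Still a std::pair, so containers that already know how to hash or compare
// pairs keep working, but call sites can use descriptive names instead of
// .first and .second.
struct SymbolWithOffset : std::pair<SymbolRef, int64_t> {
  using std::pair<SymbolRef, int64_t>::pair; // inherit pair's constructors

  SymbolRef symbol() const { return first; }
  int64_t offset() const { return second; }
};
```

Because the type derives from the pair, an object can be passed wherever a `const std::pair<SymbolRef, int64_t> &` is expected, which is what keeps existing map key traits usable.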
Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68466/new/ https://reviews.llvm.org/D68466 From llvm-commits at lists.llvm.org Mon Oct 7 17:48:54 2019 From: llvm-commits at lists.llvm.org (Artem Dergachev via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 00:48:54 +0000 (UTC) Subject: [PATCH] D68093: [clang-scan-deps][static analyzer] Support for clang --analyze in scan-deps In-Reply-To: References: Message-ID: NoQ added inline comments. ================ Comment at: clang/include/clang/Driver/CC1Options.td:849 HelpText<"include a detailed record of preprocessing actions">; +def setup_static_analyzer : Flag<["-"], "setup-static-analyzer">, + HelpText<"Set up preprocessor for static analyzer (done automatically when static analyzer is run).">; ---------------- jkorous wrote: > hiraditya wrote: > > The name doesn't quite reflect what it does. > `setup-pp-for-analyzer`? I'm open to suggestions. I actually suggest modifying the help text to something like "Behave as if the Static Analyzer is going to be invoked, even if it's not actually going to be invoked (for now this boils down to defining the __clang_analyzer__ macro)" and keep the flag name roughly the same. This is the actual purpose of the option, right? We don't need to specify what precisely takes place when the option is invoked because this may change in the future, but the contract will remain. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68093/new/ https://reviews.llvm.org/D68093 From llvm-commits at lists.llvm.org Mon Oct 7 17:52:06 2019 From: llvm-commits at lists.llvm.org (David Greene via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 00:52:06 +0000 (UTC) Subject: [PATCH] D68081: Allow update_test_checks.py to not scrub names In-Reply-To: References: Message-ID: <11e5a64ff08e614426923d6d33c8fb0a@localhost.localdomain> greened closed this revision. greened added a comment. For some reason the commit did not auto-update this. 
Closing with commit r373912/a14ffc7eb741de4fd7484350d11947dea40991fd . Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68081/new/ https://reviews.llvm.org/D68081 From llvm-commits at lists.llvm.org Mon Oct 7 17:53:00 2019 From: llvm-commits at lists.llvm.org (David Blaikie via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 00:53:00 +0000 (UTC) Subject: [PATCH] D68465: [DebugInfo] Trim call-clobbered location list entries when tuning for GDB In-Reply-To: References: Message-ID: <794e742e7ae214f89ad7356c5addf567@localhost.localdomain> dblaikie added a comment. In D68465#1697407 , @dstenb wrote: > In D68465#1695482 , @dblaikie wrote: > > > Thanks for bringing this up! > > > > A few thoughts from me: > > > > 1. Yeah, I tend to agree with the DWARF Committee folks & the fact that LLDB can do the right thing without this change sort of points to this being a "fix it in GDB" situation. Have you tried asking the GDB folks about it/submitting patches there rather than here? > > > No, we have not done that yet. I think it'd be worthwhile having at least a statement from GDB that they feel this should be the responsibility of the producer. Though even if that's the answer they provide - I think some amount of pushback (especially given the existence proof of LLDB's behavior, by the sounds of it/if I'm understanding you correctly) might be worthwhile. >> 2. Do you have a small example of GCC producing this kind of output? > > > > extern int value(void); > extern void call(int); > > int main() { > int local = value(); > call(local); > return 0; > } > > > compiled using -O1 -g with GCC 8.3.0 gives: > > (gdb) info addr local > Symbol "local" is multi-location: > Range 0x5555555550ee-0x5555555550f4: a variable in $rax > . > (gdb) disas main > [...] > 0x00005555555550f0 <+11>: callq 0x5555555550ff > 0x00005555555550f5 <+16>: mov $0x0,%eax > [...] > > > As seen, the location list entry ends one byte before the return address. 
Cool - thanks! >> 3. The ability to use an offset from a debug_addr entry is actually something that's quite desirable - but not quite in the way you're suggesting. OK, I've looked through the code some more & understand a little better. So your changes to the address pool don't actually cause the address pool to contain entries with offsets - it stores only the base address in the actual debug_addr pool output, but then uses the offset from there in the place that refers to the address pool. So I think that would mean we would end up with duplicate entries in debug_addr, which would be a waste of space/relocations/etc. So only the address should go in the pool - the pool shouldn't be aware of the offset. (this would mean the semantics of the in-memory data structures would match more closely to the output) So the DebugLocStream::Entry's Begin/End could be SymbolWithOffset - without the AddressPool having any knowledge of these offsets, just the symbol itself. > Actually the goal would be to not use another debug_addr entry, but the ability to refer to an addr entry + offset in a DIE. Of course this would require an extension to DWARF (non-standard, or eventually standard) which would also mean updating the DWARF consumer... which probably defeats the point of your work, which I imagine is intended to avoid changing the consumer. Though perhaps GDB would be more inclined to accept a patch for an addr+offset form compared to support for the register details. > > > Okay! How would that look like, and what would that be used for? > >> 4. If GDB can do the right thing when printing in a backtrace, then it seems like it should do the same/similar thing when in a frame > > The same problem exists for backtraces. 
For example, in PR39752 the outer frames' parameters are printed using the inner-most register value: > > (gdb) bt > #0 fn3 (p3=) at test.c:11 > #1 0x000000000040050f in fn2 (p2=999) at test.c:15 > #2 0x000000000040052f in fn1 (p1=999) at test.c:21 > #3 0x000000000040054e in main () at test.c:26 > Ah, sorry, I misunderstood your description. In D68465#1697824 , @probinson wrote: > Debugger tuning should not be used directly this way. There should be a DwarfDebug flag, and a CL option, and the default set in the DwarfDebug ctor based on tuning. This allows the defaulting to work how you want, but can be overridden easily for experimentation and testing. There are lots of examples of doing this in the ctor already. Also, if it turns out some other debugger also needs this, it's trivial to fix up the ctor to handle it with no code changes needed elsewhere. > > @dblaikie I'm also not clear what you're suggestion about .debug_addr entry plus offset. DW_LLE_offset_pair does this, derived from the base address, which ought to be available for any given function, assuming DWARF v5. Can you explain more clearly what's missing? Right - for loclists there's no need for new forms, etc. It was specifically related to the other review related to this that modifies the in-memory representation of the debug_addr (llvm::AddressPool) - which I assume meant a difference in output in the address pool, but seems it doesn't add the offset inside the pool (but may end up with redundant entries in the pool which should be fixed in any case). My point was generally that the debug_addr section shouldn't be including addresses with offsets, it should be the places that refer to debug_addr that use the offsets. The specific place I'd like to use offsets would be from FORM_addr in debug_info. But, yes, in this case the support for debug addr references from loclists, the forms are already sufficiently descriptive for this.
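
To make the split described above concrete (the pool holds only the address, and the place that refers to the pool carries the offset), here is a minimal sketch. It is not LLVM's AddressPool: `ToyAddressPool`, `PoolRef` and `makeRef` are hypothetical names used only for illustration, and symbols are modelled as plain strings.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical stand-in for an address pool: it knows symbols only,
// never offsets, so one symbol always maps to exactly one pool entry.
class ToyAddressPool {
  std::map<std::string, unsigned> Index;

public:
  // Return the pool index for Sym, adding a new entry on first use.
  unsigned getIndex(const std::string &Sym) {
    auto It = Index.find(Sym);
    if (It != Index.end())
      return It->second;
    unsigned Next = static_cast<unsigned>(Index.size());
    Index.emplace(Sym, Next);
    return Next;
  }

  std::size_t size() const { return Index.size(); }
};

// A symbol plus a byte offset, as a location list entry's Begin/End
// might be represented in memory.
struct SymbolWithOffset {
  std::string Sym;
  std::int64_t Offset;
};

// What the referring site would emit: a pool index, with the offset
// kept next to the reference rather than inside the pool.
struct PoolRef {
  unsigned Index;
  std::int64_t Offset;
};

PoolRef makeRef(ToyAddressPool &Pool, const SymbolWithOffset &S) {
  return {Pool.getIndex(S.Sym), S.Offset};
}
```

With this shape, two references to the same symbol at different offsets share a single pool entry, which is the property the comment asks for (no redundant entries or relocations).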
Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68465/new/ https://reviews.llvm.org/D68465 From llvm-commits at lists.llvm.org Mon Oct 7 17:53:32 2019 From: llvm-commits at lists.llvm.org (Vedant Kumar via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 00:53:32 +0000 (UTC) Subject: [PATCH] D68616: [CodeExtractor] Factor out and reuse shrinkwrap analysis Message-ID: vsk created this revision. vsk added reviewers: davidxl, skatkov, void. Herald added subscribers: dexonsmith, hiraditya, tpr, mehdi_amini. Herald added a project: LLVM. Factor out CodeExtractor's analysis of allocas (for shrinkwrapping purposes), and allow the analysis to be reused. This resolves a quadratic compile-time bug observed when compiling AMDGPUDisassembler.cpp.o. Pre-patch (Release + LTO clang): ---User Time--- --System Time-- --User+System-- ---Wall Time--- --- Name --- 176.5278 ( 57.8%) 0.4915 ( 18.5%) 177.0192 ( 57.4%) 177.4112 ( 57.3%) Hot Cold Splitting Post-patch (ReleaseAsserts clang): ---User Time--- --System Time-- --User+System-- ---Wall Time--- --- Name --- 1.4051 ( 3.3%) 0.0079 ( 0.3%) 1.4129 ( 3.2%) 1.4129 ( 3.2%) Hot Cold Splitting Testing: check-llvm, and comparing the AMDGPUDisassembler.cpp.o binary pre- vs. post-patch. An alternate approach is to hide CodeExtractorAnalysisCache from clients of CodeExtractor, and to recompute the analysis from scratch inside of CodeExtractor::extractCodeRegion(). This eliminates some redundant work in the shrinkwrapping legality check. However, some clients continue to exhibit O(n^2) compile time behavior as computing the analysis is O(n). 
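
As a rough illustration of why reusing the analysis matters, the toy model below counts how many blocks get scanned when the O(n) analysis is computed once and shared, versus recomputed for every extracted region. This is only a sketch under stated assumptions: `ToyFunction`, `ToyAnalysisCache` and `extractAll` are hypothetical and do not reflect the real CodeExtractorAnalysisCache interface, and the model ignores that extraction changes the function.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical model of a function: one flag per basic block saying
// whether that block contains an alloca.
struct ToyFunction {
  std::vector<bool> BlockHasAlloca;
};

// Hypothetical analysis result: computed with one linear scan over the
// function. ScanCounter records how many blocks were visited in total.
class ToyAnalysisCache {
  std::vector<std::size_t> AllocaBlocks;

public:
  ToyAnalysisCache(const ToyFunction &F, std::size_t &ScanCounter) {
    for (std::size_t I = 0; I < F.BlockHasAlloca.size(); ++I) {
      ++ScanCounter;
      if (F.BlockHasAlloca[I])
        AllocaBlocks.push_back(I);
    }
  }

  const std::vector<std::size_t> &allocaBlocks() const { return AllocaBlocks; }
};

// "Extract" NumRegions regions and return the number of blocks scanned.
// With ReuseCache the analysis runs once; without it, once per region.
std::size_t extractAll(const ToyFunction &F, std::size_t NumRegions,
                       bool ReuseCache) {
  std::size_t Scanned = 0;
  if (ReuseCache) {
    ToyAnalysisCache Cache(F, Scanned);
    for (std::size_t R = 0; R < NumRegions; ++R)
      (void)Cache.allocaBlocks().size(); // query the shared result
  } else {
    for (std::size_t R = 0; R < NumRegions; ++R) {
      ToyAnalysisCache Cache(F, Scanned); // recomputed from scratch
      (void)Cache.allocaBlocks().size();
    }
  }
  return Scanned;
}
```

When the number of regions grows with the number of blocks, the non-reusing path scans n blocks for each of roughly n regions, which is the quadratic pattern the description refers to.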
rdar://55912966 https://reviews.llvm.org/D68616 Files: llvm/include/llvm/Transforms/IPO/HotColdSplitting.h llvm/include/llvm/Transforms/Utils/CodeExtractor.h llvm/lib/Transforms/IPO/BlockExtractor.cpp llvm/lib/Transforms/IPO/HotColdSplitting.cpp llvm/lib/Transforms/IPO/LoopExtractor.cpp llvm/lib/Transforms/IPO/PartialInlining.cpp llvm/lib/Transforms/Utils/CodeExtractor.cpp llvm/unittests/Transforms/Utils/CodeExtractorTest.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D68616.223708.patch Type: text/x-patch Size: 21696 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 17:56:00 2019 From: llvm-commits at lists.llvm.org (Andrew Trick via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 00:56:00 +0000 (UTC) Subject: [PATCH] D63945: Mark several PointerIntPair methods as lvalue-only In-Reply-To: References: Message-ID: atrick accepted this revision. atrick added a comment. This revision is now accepted and ready to land. Thanks! Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D63945/new/ https://reviews.llvm.org/D63945 From llvm-commits at lists.llvm.org Mon Oct 7 17:58:03 2019 From: llvm-commits at lists.llvm.org (Michael Kruse via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 00:58:03 +0000 (UTC) Subject: [PATCH] D53876: Preserve loop metadata when splitting exit blocks In-Reply-To: References: Message-ID: <2d4ab89ae57f0838b833172e68e8d84e@localhost.localdomain> Meinersbur added a comment. The patch was closed by Phabricator since it discovered a "new" commit from git for this patch while assuming that the committed version is the most recent one. Eli re-opened it suspecting that we still might want to fix this. 
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D53876/new/ https://reviews.llvm.org/D53876 From llvm-commits at lists.llvm.org Mon Oct 7 18:06:47 2019 From: llvm-commits at lists.llvm.org (Tom Stellard via llvm-commits) Date: Tue, 08 Oct 2019 01:06:47 -0000 Subject: [www] r374001 - Update GitHub migration status Message-ID: <20191008010647.641008E659@lists.llvm.org> Author: tstellar Date: Mon Oct 7 18:06:47 2019 New Revision: 374001 URL: http://llvm.org/viewvc/llvm-project?rev=374001&view=rev Log: Update GitHub migration status Modified: www/trunk/GitHubMigrationStatus.html Modified: www/trunk/GitHubMigrationStatus.html URL: http://llvm.org/viewvc/llvm-project/www/trunk/GitHubMigrationStatus.html?rev=374001&r1=374000&r2=374001&view=diff ============================================================================== --- www/trunk/GitHubMigrationStatus.html (original) +++ www/trunk/GitHubMigrationStatus.html Mon Oct 7 18:06:47 2019 @@ -23,6 +23,9 @@ Migration Instructions For:
 Developers: Use the git-llvm script for committing changes.
+How to request commit access to the GitHub repository
 Blocking Tasks
@@ -47,18 +50,23 @@ Migration Instructions For:
 Migrate Buildbots
-Todo
+In Progress
 PR40262
 Enable GitHub Commit Access
-Todo
+In Progress
 PR42428
 Check For Merges in git-llvm Script
-Todo
-PR42430
+In Progress
+D67772
+Send Commit Notifications to *-commits Lists
+In Progress
+PR40261
@@ -79,11 +87,6 @@ Migration Instructions For:
 Done
 Example
-Send Commit Notifications to *-commits Lists
-Todo
-PR40261
    From llvm-commits at lists.llvm.org Mon Oct 7 18:08:16 2019 From: llvm-commits at lists.llvm.org (Jonas Devlieghere via llvm-commits) Date: Tue, 08 Oct 2019 01:08:16 -0000 Subject: [zorg] r374002 - [LLDB] Enable mails on the matrix bot Message-ID: <20191008010816.099E68E65F@lists.llvm.org> Author: jdevlieghere Date: Mon Oct 7 18:08:15 2019 New Revision: 374002 URL: http://llvm.org/viewvc/llvm-project?rev=374002&view=rev Log: [LLDB] Enable mails on the matrix bot Modified: zorg/trunk/zorg/jenkins/jobs/jobs/lldb-cmake-matrix Modified: zorg/trunk/zorg/jenkins/jobs/jobs/lldb-cmake-matrix URL: http://llvm.org/viewvc/llvm-project/zorg/trunk/zorg/jenkins/jobs/jobs/lldb-cmake-matrix?rev=374002&r1=374001&r2=374002&view=diff ============================================================================== --- zorg/trunk/zorg/jenkins/jobs/jobs/lldb-cmake-matrix (original) +++ zorg/trunk/zorg/jenkins/jobs/jobs/lldb-cmake-matrix Mon Oct 7 18:08:15 2019 @@ -80,7 +80,6 @@ pipeline { set -e ''' } - junit 'test/results.xml' } } stage('Test DWARF4') { @@ -104,7 +103,6 @@ pipeline { set -e ''' } - junit 'test/results.xml' } } stage('Test DWARF5') { @@ -171,7 +169,6 @@ pipeline { set -e ''' } - junit 'test/results.xml' } } stage('Build Clang 7.0.1') { @@ -215,7 +212,6 @@ pipeline { set -e ''' } - junit 'test/results.xml' } } stage('Build Clang 9.0.0') { @@ -259,8 +255,22 @@ pipeline { set -e ''' } - junit 'test/results.xml' } } } + post { + changed { + emailext subject: '$DEFAULT_SUBJECT', + presendScript: '$DEFAULT_PRESEND_SCRIPT', + postsendScript: '$DEFAULT_POSTSEND_SCRIPT', + recipientProviders: [ + [$class: 'CulpritsRecipientProvider'], + [$class: 'DevelopersRecipientProvider'], + [$class: 'RequesterRecipientProvider'], + ], + replyTo: '$DEFAULT_REPLYTO', + to: '$DEFAULT_RECIPIENTS', + body:'$DEFAULT_CONTENT' + } + } } From llvm-commits at lists.llvm.org Mon Oct 7 18:06:27 2019 From: llvm-commits at lists.llvm.org (Vedant Kumar via Phabricator via llvm-commits) Date: Tue, 08 Oct 
2019 01:06:27 +0000 (UTC) Subject: [PATCH] D68616: [CodeExtractor] Factor out and reuse shrinkwrap analysis In-Reply-To: References: Message-ID: <5701db97a391b39d84e44fd3916c7e85@localhost.localdomain> vsk updated this revision to Diff 223711. vsk added a comment. - Add some documentation. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68616/new/ https://reviews.llvm.org/D68616 Files: llvm/include/llvm/Transforms/IPO/HotColdSplitting.h llvm/include/llvm/Transforms/Utils/CodeExtractor.h llvm/lib/Transforms/IPO/BlockExtractor.cpp llvm/lib/Transforms/IPO/HotColdSplitting.cpp llvm/lib/Transforms/IPO/LoopExtractor.cpp llvm/lib/Transforms/IPO/PartialInlining.cpp llvm/lib/Transforms/Utils/CodeExtractor.cpp llvm/unittests/Transforms/Utils/CodeExtractorTest.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D68616.223711.patch Type: text/x-patch Size: 22338 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 18:09:28 2019 From: llvm-commits at lists.llvm.org (Jonas Devlieghere via llvm-commits) Date: Tue, 08 Oct 2019 01:09:28 -0000 Subject: [zorg] r374003 - Revert "[LLDB] Enable mails on the matrix bot" Message-ID: <20191008010928.F11D78D564@lists.llvm.org> Author: jdevlieghere Date: Mon Oct 7 18:09:28 2019 New Revision: 374003 URL: http://llvm.org/viewvc/llvm-project?rev=374003&view=rev Log: Revert "[LLDB] Enable mails on the matrix bot" This reverts commit 3b05a74c5b56868049a700bc2b7fe56388349f33. 
Modified: zorg/trunk/zorg/jenkins/jobs/jobs/lldb-cmake-matrix Modified: zorg/trunk/zorg/jenkins/jobs/jobs/lldb-cmake-matrix URL: http://llvm.org/viewvc/llvm-project/zorg/trunk/zorg/jenkins/jobs/jobs/lldb-cmake-matrix?rev=374003&r1=374002&r2=374003&view=diff ============================================================================== --- zorg/trunk/zorg/jenkins/jobs/jobs/lldb-cmake-matrix (original) +++ zorg/trunk/zorg/jenkins/jobs/jobs/lldb-cmake-matrix Mon Oct 7 18:09:28 2019 @@ -80,6 +80,7 @@ pipeline { set -e ''' } + junit 'test/results.xml' } } stage('Test DWARF4') { @@ -103,6 +104,7 @@ pipeline { set -e ''' } + junit 'test/results.xml' } } stage('Test DWARF5') { @@ -169,6 +171,7 @@ pipeline { set -e ''' } + junit 'test/results.xml' } } stage('Build Clang 7.0.1') { @@ -212,6 +215,7 @@ pipeline { set -e ''' } + junit 'test/results.xml' } } stage('Build Clang 9.0.0') { @@ -255,22 +259,8 @@ pipeline { set -e ''' } + junit 'test/results.xml' } } } - post { - changed { - emailext subject: '$DEFAULT_SUBJECT', - presendScript: '$DEFAULT_PRESEND_SCRIPT', - postsendScript: '$DEFAULT_POSTSEND_SCRIPT', - recipientProviders: [ - [$class: 'CulpritsRecipientProvider'], - [$class: 'DevelopersRecipientProvider'], - [$class: 'RequesterRecipientProvider'], - ], - replyTo: '$DEFAULT_REPLYTO', - to: '$DEFAULT_RECIPIENTS', - body:'$DEFAULT_CONTENT' - } - } } From llvm-commits at lists.llvm.org Mon Oct 7 18:09:29 2019 From: llvm-commits at lists.llvm.org (Jonas Devlieghere via llvm-commits) Date: Tue, 08 Oct 2019 01:09:29 -0000 Subject: [zorg] r374004 - [LLDB] Enable mails on the standalone bot Message-ID: <20191008010929.7EFEC8E658@lists.llvm.org> Author: jdevlieghere Date: Mon Oct 7 18:09:29 2019 New Revision: 374004 URL: http://llvm.org/viewvc/llvm-project?rev=374004&view=rev Log: [LLDB] Enable mails on the standalone bot Modified: zorg/trunk/zorg/jenkins/jobs/jobs/lldb-cmake-standalone Modified: zorg/trunk/zorg/jenkins/jobs/jobs/lldb-cmake-standalone URL: 
http://llvm.org/viewvc/llvm-project/zorg/trunk/zorg/jenkins/jobs/jobs/lldb-cmake-standalone?rev=374004&r1=374003&r2=374004&view=diff ============================================================================== --- zorg/trunk/zorg/jenkins/jobs/jobs/lldb-cmake-standalone (original) +++ zorg/trunk/zorg/jenkins/jobs/jobs/lldb-cmake-standalone Mon Oct 7 18:09:29 2019 @@ -145,4 +145,19 @@ pipeline { } } } + post { + changed { + emailext subject: '$DEFAULT_SUBJECT', + presendScript: '$DEFAULT_PRESEND_SCRIPT', + postsendScript: '$DEFAULT_POSTSEND_SCRIPT', + recipientProviders: [ + [$class: 'CulpritsRecipientProvider'], + [$class: 'DevelopersRecipientProvider'], + [$class: 'RequesterRecipientProvider'], + ], + replyTo: '$DEFAULT_REPLYTO', + to: '$DEFAULT_RECIPIENTS', + body:'$DEFAULT_CONTENT' + } + } } From llvm-commits at lists.llvm.org Mon Oct 7 18:07:08 2019 From: llvm-commits at lists.llvm.org (Heejin Ahn via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 01:07:08 +0000 (UTC) Subject: [PATCH] D68619: [WebAssembly] Fix a bug in 'try' placement Message-ID: aheejin created this revision. aheejin added a reviewer: dschuff. Herald added subscribers: llvm-commits, sunfish, hiraditya, jgravelle-google, sbc100. Herald added a project: LLVM. When searching for local expression tree created by stackified registers, for 'block' placement, we start the search from the previous instruction of a BB's terminator. But in 'try''s case, we should start from the previous instruction of a call that can throw, or a EH_LABEL that precedes the call, because the return values of the call's previous instructions can be stackified and consumed by the throwing call. 
For example, i32.call @foo call @bar ; may throw br $label0 In this case, if we start the search from the previous instruction of the terminator (`br` here), we end up stopping at `call @bar` and place a 'try' between `i32.call @foo` and `call @bar`, because `call @bar` does not have a return value so it is not a local expression tree of `br`. But in this case, unlike when placing 'block's, we should start the search from `call @bar`, because the return value of `i32.call @foo` is stackified and used by `call @bar`. Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D68619 Files: llvm/lib/Target/WebAssembly/WebAssemblyCFGStackify.cpp llvm/test/CodeGen/WebAssembly/cfg-stackify-eh.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D68619.223715.patch Type: text/x-patch Size: 4579 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 18:19:20 2019 From: llvm-commits at lists.llvm.org (David Blaikie via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 01:19:20 +0000 (UTC) Subject: [PATCH] D68620: DebugInfo: Use base address selection entries for debug_loc Message-ID: dblaikie created this revision. dblaikie added reviewers: labath, probinson, aprantl. Herald added a project: LLVM. Herald added a subscriber: llvm-commits. Unify the range and loc emission (for both DWARFv4 and DWARFv5 style lists) and take advantage of that unification to use strategic base addresses for loclists. Needs more testing, but llvm-dwarfdump doesn't currently support LLE_base_addressx, for instance. But Pavel's looking at some changes there, so I'm holding off in case his work addresses it, or at least I can work on it afterwards so as not to conflict if I tried to do so now. Anyone know whether they have consumers (LLDB, the Sony debugger) that would need to be updated for either the v4 changes (use of base address specifiers in classic debug_loc lists) or v5 (base_addressx, etc, etc)? 
GDB can't cope with the DWARFv5 stuff, but seems fine with the v4 version. Repository: rL LLVM https://reviews.llvm.org/D68620 Files: include/llvm/BinaryFormat/Dwarf.def include/llvm/BinaryFormat/Dwarf.h lib/BinaryFormat/Dwarf.cpp lib/CodeGen/AsmPrinter/DwarfDebug.cpp lib/DebugInfo/DWARF/DWARFDebugLoc.cpp test/DebugInfo/X86/sret.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D68620.223716.patch Type: text/x-patch Size: 17353 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 18:22:31 2019 From: llvm-commits at lists.llvm.org (Andrew Trick via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 01:22:31 +0000 (UTC) Subject: [PATCH] D65764: Add TinyPtrVector support for general pointer-like things. In-Reply-To: References: Message-ID: <3a8c1c74f18f81fd9d88303335e2d110@localhost.localdomain> atrick closed this revision. atrick added a comment. I forgot to link the commit message back to this review. Closing with commit 861b371e1386b6ac1069c9aa3050f7074fb64516 CHANGES SINCE LAST ACTION https://reviews.llvm.org/D65764/new/ https://reviews.llvm.org/D65764 From llvm-commits at lists.llvm.org Mon Oct 7 18:28:08 2019 From: llvm-commits at lists.llvm.org (Matt Morehouse via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 01:28:08 +0000 (UTC) Subject: [PATCH] D68621: [sanitizer_common] Remove OnPrint from Go build. Message-ID: morehouse created this revision. morehouse added reviewers: vitalybuka, dvyukov. Herald added a project: LLVM. Go now uses __sanitizer_on_print instead. https://reviews.llvm.org/D68621 Files: compiler-rt/lib/sanitizer_common/sanitizer_printf.cpp Index: compiler-rt/lib/sanitizer_common/sanitizer_printf.cpp =================================================================== --- compiler-rt/lib/sanitizer_common/sanitizer_printf.cpp +++ compiler-rt/lib/sanitizer_common/sanitizer_printf.cpp @@ -229,8 +229,6 @@ // Can be overriden in frontend. 
#if SANITIZER_GO && defined(TSAN_EXTERNAL_HOOKS) // Implementation must be defined in frontend. -// TODO(morehouse): Remove OnPrint after migrating Go to __sanitizer_on_print. -extern "C" void OnPrint(const char *str); extern "C" void __sanitizer_on_print(const char *str); #else SANITIZER_INTERFACE_WEAK_DEF(void, __sanitizer_on_print, const char *str) { @@ -239,10 +237,6 @@ #endif static void CallPrintfAndReportCallback(const char *str) { -#if SANITIZER_GO && defined(TSAN_EXTERNAL_HOOKS) - // TODO(morehouse): Remove OnPrint after migrating Go to __sanitizer_on_print. - OnPrint(str); -#endif __sanitizer_on_print(str); if (PrintfAndReportCallback) PrintfAndReportCallback(str); -------------- next part -------------- A non-text attachment was scrubbed... Name: D68621.223717.patch Type: text/x-patch Size: 1005 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 18:31:02 2019 From: llvm-commits at lists.llvm.org (Andrew Trick via llvm-commits) Date: Tue, 08 Oct 2019 01:31:02 -0000 Subject: [llvm] r374009 - [LitConfig] Silenced notes/warnings on quiet. Message-ID: <20191008013102.B8C0A8E6EA@lists.llvm.org> Author: atrick Date: Mon Oct 7 18:31:02 2019 New Revision: 374009 URL: http://llvm.org/viewvc/llvm-project?rev=374009&view=rev Log: [LitConfig] Silenced notes/warnings on quiet. Lit has a "quiet" option, -q, which is documented to "suppress no error output". Previously, LitConfig displayed notes and warnings when the quiet option was specified. The result was that it was not possible to get only pertinent file/line information to be used by an editor to jump to the location where checks were failing without passing a number of unhelpful locations first. Here, the implementations of LitConfig.note and LitConfig.warning are modified to account for the quiet flag and avoid displaying if the flag has indeed been set. 
Patch by Nate Chandler Reviewed by yln Differential Revision: https://reviews.llvm.org/D68044 Modified: llvm/trunk/utils/lit/lit/LitConfig.py Modified: llvm/trunk/utils/lit/lit/LitConfig.py URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/lit/LitConfig.py?rev=374009&r1=374008&r2=374009&view=diff ============================================================================== --- llvm/trunk/utils/lit/lit/LitConfig.py (original) +++ llvm/trunk/utils/lit/lit/LitConfig.py Mon Oct 7 18:31:02 2019 @@ -174,10 +174,12 @@ class LitConfig(object): kind, message)) def note(self, message): - self._write_message('note', message) + if not self.quiet: + self._write_message('note', message) def warning(self, message): - self._write_message('warning', message) + if not self.quiet: + self._write_message('warning', message) self.numWarnings += 1 def error(self, message): From llvm-commits at lists.llvm.org Mon Oct 7 18:32:00 2019 From: llvm-commits at lists.llvm.org (David Blaikie via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 01:32:00 +0000 (UTC) Subject: [PATCH] D68270: DWARFDebugLoc: Add a function to get the address range of an entry In-Reply-To: References: Message-ID: dblaikie added inline comments. ================ Comment at: lib/DebugInfo/DWARF/DWARFDebugLoc.cpp:291-295 + EntryIterator Absolute = + getAbsoluteLocations( + SectionedAddress{BaseAddr, SectionedAddress::UndefSection}, + LookupPooledAddress) + .begin(); ---------------- labath wrote: > dblaikie wrote: > > labath wrote: > > > This parallel iteration is not completely nice, but I think it's worth being able to reuse the absolute range computation code. I'm open to ideas for improvement though. > > Ah, I see - this is what you meant about "In particular it makes it possible to reuse this stuff in the dumping code, which would have been pretty hard with callbacks.". > > > > I'm wondering if that might be worth revisiting somewhat. 
A full iterator abstraction for one user here (well, two once you include lldb - but I assume it's likely going to build its own data structure from the iteration anyway, right? (it's not going to keep the iterator around, do anything interesting like partial iterations, re-iterate/etc - such that a callback would suffice)) > > > > I could imagine two callback APIs for this - one that gets entries and locations and one that only gets locations by filtering on the entry version. > > > > eg: > > > > // for non-verbose output: > > LL.forEachEntry([&](const Entry &E, Expected L) { > > if (Verbose && actually dumping debug_loc) > > print(E) // print any LLE_*, raw parameters, etc > > if (L) > > print(*L) // print the resulting address range, section name (if verbose), > > else > > print(error stuff) > > }); > > > > One question would be "when/where do we print the DWARF expression" - if there's an error computing the address range, we can still print the expression, so maybe that happens unconditionally at the end of the callback, using the expression in the Entry? (then, arguably, the expression doesn't need to be in the DWARFLocation - and I'd say make the DWARFLocation a sectioned range, exactly the same type as for ranges so that part of the dumping code, etc, can be maximally reused) > Actually, what lldb currently does is that it does not build any data structures at all (except storing the pointer to the right place in the debug_loc section. Then, whenever it wants to do something to the loclist, it parses it afresh. I don't know why it does this exactly, but I assume it has something to do with most locations never being used, or being only a couple of times, and the actual parsing being fairly fast. What this means is that lldb is not really a single "user", but there are like four or five places where it iterates through the list, depending on what does it actually want to do with it. 
It also does partial iteration where it stops as soon as it find the entry it was interested in. > Now, all of that is possible with a callback (though I am generally trying to avoid them), but it does resurface the issue of what should be the value of the second argument for DW_LLE_base_address entries (the thing which I originally used a error type for). > Maybe this should be actually one callback API, taking two callback functions, with one of them being invoked for base_address entries, and one for others? However, if we stick to the current approaches in both LLE and RLE of making the address pool resolution function a parameter (which I'd like to keep, as it makes my job in lldb easier), then this would actually be three callbacks, which starts to get unwieldy. Though one of those callbacks could be removed with the "DWARFUnit implementing a AddrOffsetResolver interface" idea, which I really like. :) Ah, thanks for the details on LLDB's location parsing logic. That's interesting indeed! I can appreciate an iterator-based API if that's the sort of usage we've got, though I expect it doesn't have any interest in the low-level encoding & just wants the fully processed address ranges/locations - it doesn't want base_address or end_of_list entries? & I think the dual-iteration is a fairly awkward API design, trying to iterate them in lock-step, etc. I'd rather avoid that if reasonably possible. 
Either having an iterator API that gives only the fully processed data/semantic view & a completely different API if you want to access the low level primitives (LLE, etc) (this is how ranges works - there's an API that gives a collection of ranges & abstracts over v4/v5/rnglists/etc - though that's partly motivated by a strong multi-client need for that functionality for symbolizing, etc - but I think it's a good abstraction/model anyway (& one of the reasons the inline range list printing doesn't include encoding information, the API it uses is too high level to even have access to it)) > Now, all of that is possible with a callback (though I am generally trying to avoid them), but it does resurface the issue of what should be the value of the second argument for DW_LLE_base_address entries (the thing which I originally used a error type for). Sorry, my intent in the above API was for the second argument to be Optional's "None" state when... oh, I see, I did use Expected there, rather than Optional, because there are legit error cases. I know it's sort of awkward, but I might be inclined to use Optional> there. I realize two layers of wrapping is a bit weird, but I think it'd be nicer than having an error state for what, I think, isn't erroneous. > Maybe this should be actually one callback API, taking two callback functions, with one of them being invoked for base_address entries, and one for others? However, if we stick to the current approaches in both LLE and RLE of making the address pool resolution function a parameter (which I'd like to keep, as it makes my job in lldb easier), then this would actually be three callbacks, which starts to get unwieldy. Don't mind three callbacks too much. > Though one of those callbacks could be removed with the "DWARFUnit implementing a AddrOffsetResolver interface" idea, which I really like. 
:) Sorry, I haven't really looked at where the address resolver callback is registered and alternative designs being discussed - but yeah, going off just the one-sentence, it seems reasonable to have the DWARFUnit own an address resolver/be the thing you consult when you want to resolve an address (just through a normal function call in DWARFUnit, perhaps - which might, internally, use a callback registered when it was constructed). ================ Comment at: test/CodeGen/X86/debug-loclists.ll:16 ; CHECK-NEXT: 0x00000000: -; CHECK-NEXT: [0x0000000000000000, 0x0000000000000004): DW_OP_breg5 RDI+0 -; CHECK-NEXT: [0x0000000000000004, 0x0000000000000012): DW_OP_breg3 RBX+0 - -; There is no way to use llvm-dwarfdump atm (2018, october) to verify the DW_LLE_* codes emited, -; because dumper is not yet implements that. Use asm code to do this check instead. -; -; RUN: llc -mtriple=x86_64-pc-linux -filetype=asm < %s -o - | FileCheck %s --check-prefix=ASM -; ASM: .section .debug_loclists,"", at progbits -; ASM-NEXT: .long .Ldebug_loclist_table_end0-.Ldebug_loclist_table_start0 # Length -; ASM-NEXT: .Ldebug_loclist_table_start0: -; ASM-NEXT: .short 5 # Version -; ASM-NEXT: .byte 8 # Address size -; ASM-NEXT: .byte 0 # Segment selector size -; ASM-NEXT: .long 0 # Offset entry count -; ASM-NEXT: .Lloclists_table_base0: -; ASM-NEXT: .Ldebug_loc0: -; ASM-NEXT: .byte 4 # DW_LLE_offset_pair -; ASM-NEXT: .uleb128 .Lfunc_begin0-.Lfunc_begin0 # starting offset -; ASM-NEXT: .uleb128 .Ltmp0-.Lfunc_begin0 # ending offset -; ASM-NEXT: .byte 2 # Loc expr size -; ASM-NEXT: .byte 117 # DW_OP_breg5 -; ASM-NEXT: .byte 0 # 0 -; ASM-NEXT: .byte 4 # DW_LLE_offset_pair -; ASM-NEXT: .uleb128 .Ltmp0-.Lfunc_begin0 # starting offset -; ASM-NEXT: .uleb128 .Ltmp1-.Lfunc_begin0 # ending offset -; ASM-NEXT: .byte 2 # Loc expr size -; ASM-NEXT: .byte 115 # DW_OP_breg3 -; ASM-NEXT: .byte 0 # 0 -; ASM-NEXT: .byte 0 # DW_LLE_end_of_list -; ASM-NEXT: .Ldebug_loclist_table_end0: +; CHECK-NEXT: 
[DW_LLE_offset_pair ]: 0x0000000000000000, 0x0000000000000004 => [0x0000000000000000, 0x0000000000000004) DW_OP_breg5 RDI+0 +; CHECK-NEXT: [DW_LLE_offset_pair ]: 0x0000000000000004, 0x0000000000000012 => [0x0000000000000004, 0x0000000000000012) DW_OP_breg3 RBX+0 ---------------- labath wrote: > dblaikie wrote: > > labath wrote: > > > This tries to follow the RLE format as closely as possible, but I think something like > > > ``` > > > [DW_LLE_offset_pair, 0x0000000000000000, 0x0000000000000004] => [0x0000000000000000, 0x0000000000000004): DW_OP_breg5 RDI+0 > > > ``` > > > would make more sense (both here and for RLE). > > Yep, that'd make more sense to me - are you planning to unify the codepaths for this? I think that'd be for the best. > > > > If I were picking a printing from scratch, I might go with: > > > > DW_LLE_offset_pair(0x0000, 0x0004) => [0x0000, 0x0004): DW_OP_breg5 RDI+0 > > > > Making it look a bit more like a function call and function arguments. Though the () might be confusing with the range notation. > > > > I'm also undecided on the " => " separator. Whether a ':' might be better/fine, etc. > > > > Totally open to ideas, but mostly I'd really love these to use loclist and ranges to use the same code as much as possible, so we can get consistency and any readability benefits, etc in both. > I like the function call format. I hoping to get some code reuse, though it's still not fully clear to me how to achieve that.. I've posted my unification of range/loc/v4/v5 emission here: https://reviews.llvm.org/D68620 - & I'd imagine something similar in the parsing side. 
================ Comment at: test/DebugInfo/X86/dwarfdump-debug-loclists.test:7 # CHECK: DW_AT_location [DW_FORM_sec_offset] (0x0000000c -# CHECK-NEXT: [0x0000000000000010, 0x0000000000000020): DW_OP_breg5 RDI+0 -# CHECK-NEXT: [0x0000000000000530, 0x0000000000000540): DW_OP_breg6 RBP-8, DW_OP_deref -# CHECK-NEXT: [0x0000000000000700, 0x0000000000000710): DW_OP_breg5 RDI+0 +# CHECK-NEXT: [DW_LLE_offset_pair ]: 0x0000000000000000, 0x0000000000000010 => [0x0000000000000010, 0x0000000000000020) DW_OP_breg5 RDI+0 +# CHECK-NEXT: [DW_LLE_base_address ]: 0x0000000000000500 ---------------- labath wrote: > dblaikie wrote: > > I don't think the inline dumping should print the encoding - I'd borrow a lot from/try to unify with the ranges printing, which doesn't. I think verbose ranges print the same as non-verbose except they also add the section name/number. > Sure, I can do that, though I think that means there won't be a single place where one can see both the raw encodings and their interpretation -- section-based dumping will not show the interpretation (would you want me to show still show them I they happen to be interpretable without the base address or the address pool?), and the debug_info dumping will not show the encoding. Is that bad? -- I don't know... Fair - that comes back to the issue I mentioned in a previous comment about potentially limiting dumping of non-debug_info sections based on the presence of a CU that references it (& only dumping it that way, rather than trying to parse it without a CU). DWARF isn't really designed to be parsed without the CU anyway. (could leave it in as best-effort to parse things without a referencing CU for debugging, etc). Mostly I'm interested in unification perhaps more/primarily, than feature improvements - then we can make feature improvements to both ranges and locs without having to duplicate things. 
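
The single-callback API floated earlier in this comment (each entry is reported together with an optional, possibly-failed resolved range) could look roughly like the sketch below. It is written against the standard library only and is not the DWARFDebugLoc interface: `Entry`, `Range`, `RangeOrError` and `forEachEntry` are hypothetical names, and only three entry kinds are modelled. An empty optional means "this entry has no range" (base_address, end_of_list), which is not an error; a string in the variant stands in for a genuine error.

```cpp
#include <cstdint>
#include <functional>
#include <optional>
#include <string>
#include <variant>
#include <vector>

// Hypothetical, simplified location list entry kinds.
enum class EntryKind { BaseAddress, OffsetPair, EndOfList };

// Raw entry as encoded: a kind plus up to two operands.
struct Entry {
  EntryKind Kind;
  std::uint64_t Value0;
  std::uint64_t Value1;
};

// Resolved half-open address range [Low, High).
struct Range {
  std::uint64_t Low;
  std::uint64_t High;
};

// Stand-in for an Expected-style result: a Range or an error message.
using RangeOrError = std::variant<Range, std::string>;

using EntryCallback =
    std::function<void(const Entry &, const std::optional<RangeOrError> &)>;

// Walk the list, tracking the current base address, and report every
// raw entry along with its resolved range where one exists.
void forEachEntry(const std::vector<Entry> &List,
                  std::optional<std::uint64_t> Base,
                  const EntryCallback &Callback) {
  for (const Entry &E : List) {
    switch (E.Kind) {
    case EntryKind::BaseAddress:
      Base = E.Value0;
      Callback(E, std::nullopt); // no range, but not an error either
      break;
    case EntryKind::OffsetPair:
      if (Base)
        Callback(E, RangeOrError(Range{*Base + E.Value0, *Base + E.Value1}));
      else
        Callback(E, RangeOrError(std::string("offset_pair without a base")));
      break;
    case EntryKind::EndOfList:
      Callback(E, std::nullopt);
      return;
    }
  }
}
```

A dumper that wants the raw encoding prints from the `Entry`; a client that only cares about resolved ranges ignores entries whose second argument is empty.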
Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68270/new/ https://reviews.llvm.org/D68270 From llvm-commits at lists.llvm.org Mon Oct 7 18:50:58 2019 From: llvm-commits at lists.llvm.org (ChenZheng via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 01:50:58 +0000 (UTC) Subject: [PATCH] D64869: [SCEV] get more accurate range for AddExpr with NW flag In-Reply-To: References: Message-ID: shchenz planned changes to this revision. shchenz added a comment. Yes, I plan to take a look at https://reviews.llvm.org/D64868 first. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D64869/new/ https://reviews.llvm.org/D64869 From llvm-commits at lists.llvm.org Mon Oct 7 18:52:08 2019 From: llvm-commits at lists.llvm.org (ChenZheng via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 01:52:08 +0000 (UTC) Subject: [PATCH] D65262: [SCEV] simplify more icmps with pred sle/ule to pred slt/ult In-Reply-To: References: Message-ID: <68d68b06bf787b95dbd9cbaa7e7653fc@localhost.localdomain> shchenz planned changes to this revision. shchenz added a comment. https://reviews.llvm.org/D64868 was reverted; I need to fix that first. 
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D65262/new/ https://reviews.llvm.org/D65262 From llvm-commits at lists.llvm.org Mon Oct 7 19:00:53 2019 From: llvm-commits at lists.llvm.org (Vitaly Buka via llvm-commits) Date: Tue, 08 Oct 2019 02:00:53 -0000 Subject: [compiler-rt] r374010 - [sanitizer] Fix signal_trap_handler.cpp on android Message-ID: <20191008020053.EDA3C8E5BD@lists.llvm.org> Author: vitalybuka Date: Mon Oct 7 19:00:53 2019 New Revision: 374010 URL: http://llvm.org/viewvc/llvm-project?rev=374010&view=rev Log: [sanitizer] Fix signal_trap_handler.cpp on android Modified: compiler-rt/trunk/test/sanitizer_common/TestCases/Linux/signal_trap_handler.cpp Modified: compiler-rt/trunk/test/sanitizer_common/TestCases/Linux/signal_trap_handler.cpp URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/test/sanitizer_common/TestCases/Linux/signal_trap_handler.cpp?rev=374010&r1=374009&r2=374010&view=diff ============================================================================== --- compiler-rt/trunk/test/sanitizer_common/TestCases/Linux/signal_trap_handler.cpp (original) +++ compiler-rt/trunk/test/sanitizer_common/TestCases/Linux/signal_trap_handler.cpp Mon Oct 7 19:00:53 2019 @@ -3,11 +3,15 @@ #include #include #include +#include -int handled; +int in_handler; void handler(int signo, siginfo_t *info, void *uctx) { - handled = 1; + fprintf(stderr, "in_handler: %d\n", in_handler); + fflush(stderr); + // CHECK: in_handler: 1 + _Exit(0); } int main() { @@ -21,9 +25,10 @@ int main() { assert(a.sa_sigaction == handler); assert(a.sa_flags & SA_SIGINFO); + in_handler = 1; __builtin_debugtrap(); - assert(handled); - fprintf(stderr, "HANDLED %d\n", handled); -} + in_handler = 0; -// CHECK: HANDLED 1 + fprintf(stderr, "UNREACHABLE\n"); + return 1; +} From llvm-commits at lists.llvm.org Mon Oct 7 19:04:33 2019 From: llvm-commits at lists.llvm.org (Johannes Doerfert via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 02:04:33 +0000 (UTC) Subject: [PATCH] 
D68624: [Attributor] Handle `null` differently in capture and alias logic Message-ID: jdoerfert created this revision. jdoerfert added reviewers: sstefan1, uenoku. Herald added subscribers: bollu, hiraditya. Herald added a project: LLVM. `null` in the default address space (=AS 0) cannot be captured nor can it alias anything. We make this clear now as it can be important for callbacks and other cases later on. In addition, this patch improves the debug output for noalias deduction. Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D68624 Files: llvm/lib/Transforms/IPO/Attributor.cpp llvm/test/Transforms/FunctionAttrs/callbacks.ll llvm/test/Transforms/FunctionAttrs/nocapture.ll Index: llvm/test/Transforms/FunctionAttrs/nocapture.ll =================================================================== --- llvm/test/Transforms/FunctionAttrs/nocapture.ll +++ llvm/test/Transforms/FunctionAttrs/nocapture.ll @@ -314,5 +314,14 @@ ret i1 %2 } +declare void @unknown(i8*) +define void @test_callsite() { +entry: +; We know that 'null' in AS 0 does not alias anything and cannot be captured +; CHECK: call void @unknown(i8* noalias nocapture null) + call void @unknown(i8* null) + ret void +} + declare i8* @llvm.launder.invariant.group.p0i8(i8*) declare i8* @llvm.strip.invariant.group.p0i8(i8*) Index: llvm/test/Transforms/FunctionAttrs/callbacks.ll =================================================================== --- llvm/test/Transforms/FunctionAttrs/callbacks.ll +++ llvm/test/Transforms/FunctionAttrs/callbacks.ll @@ -22,7 +22,7 @@ ; CHECK-NEXT: [[TMP0:%.*]] = bitcast i32* [[B]] to i8* ; CHECK-NEXT: store i32 42, i32* [[B]], align 32 ; CHECK-NEXT: store i32* [[B]], i32** [[C]], align 64 -; CHECK-NEXT: call void (i32*, i32*, void (i32*, i32*, ...)*, ...) 
@t0_callback_broker(i32* null, i32* nonnull align 128 dereferenceable(4) [[PTR]], void (i32*, i32*, ...)* nonnull bitcast (void (i32*, i32*, i32*, i64, i32**)* @t0_callback_callee to void (i32*, i32*, ...)*), i32* [[A:%.*]], i64 99, i32** nonnull align 64 dereferenceable(8) [[C]]) +; CHECK-NEXT: call void (i32*, i32*, void (i32*, i32*, ...)*, ...) @t0_callback_broker(i32* noalias null, i32* nonnull align 128 dereferenceable(4) [[PTR]], void (i32*, i32*, ...)* nonnull bitcast (void (i32*, i32*, i32*, i64, i32**)* @t0_callback_callee to void (i32*, i32*, ...)*), i32* [[A:%.*]], i64 99, i32** nonnull align 64 dereferenceable(8) [[C]]) ; CHECK-NEXT: ret void ; entry: Index: llvm/lib/Transforms/IPO/Attributor.cpp =================================================================== --- llvm/lib/Transforms/IPO/Attributor.cpp +++ llvm/lib/Transforms/IPO/Attributor.cpp @@ -1726,7 +1726,11 @@ /// See AbstractAttribute::initialize(...). void initialize(Attributor &A) override { AANoAliasImpl::initialize(A); - if (isa(getAnchorValue())) + Value &Val = getAssociatedValue(); + if (isa(Val)) + indicateOptimisticFixpoint(); + if (isa(Val) && + Val.getType()->getPointerAddressSpace() == 0) indicateOptimisticFixpoint(); } @@ -1790,8 +1794,12 @@ // check only uses possibly executed before this callsite. auto &NoCaptureAA = A.getAAFor(*this, IRP); - if (!NoCaptureAA.isAssumedNoCaptureMaybeReturned()) + if (!NoCaptureAA.isAssumedNoCaptureMaybeReturned()) { + LLVM_DEBUG( + dbgs() << "[Attributor][AANoAliasCSArg] " << V + << " cannot be noalias as it is potentially captured\n"); return indicatePessimisticFixpoint(); + } // (iii) Check there is no other pointer argument which could alias with the // value. 
@@ -1805,13 +1813,15 @@ if (const Function *F = getAnchorScope()) { if (AAResults *AAR = A.getInfoCache().getAAResultsForFunction(*F)) { + bool IsAliasing = AAR->isNoAlias(&getAssociatedValue(), ArgOp); LLVM_DEBUG(dbgs() << "[Attributor][NoAliasCSArg] Check alias between " "callsite arguments " << AAR->isNoAlias(&getAssociatedValue(), ArgOp) << " " - << getAssociatedValue() << " " << *ArgOp << "\n"); + << getAssociatedValue() << " " << *ArgOp << " => " + << (IsAliasing ? "" : "no-") << "alias \n"); - if (AAR->isNoAlias(&getAssociatedValue(), ArgOp)) + if (IsAliasing) continue; } } @@ -2681,6 +2691,13 @@ void initialize(Attributor &A) override { AANoCapture::initialize(A); + // You cannot "capture" null in the default address space. + if (isa(getAssociatedValue()) && + getAssociatedValue().getType()->getPointerAddressSpace() == 0) { + indicateOptimisticFixpoint(); + return; + } + const IRPosition &IRP = getIRPosition(); const Function *F = getArgNo() >= 0 ? IRP.getAssociatedFunction() : IRP.getAnchorScope(); -------------- next part -------------- A non-text attachment was scrubbed... Name: D68624.223721.patch Type: text/x-patch Size: 4330 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 19:05:46 2019 From: llvm-commits at lists.llvm.org (Zhang Kang via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 02:05:46 +0000 (UTC) Subject: [PATCH] D68625: [CodeGen] [ExpandReduction] Fix the bug for ExpandReduction() when vector size isn't power of 2 Message-ID: ZhangKang created this revision. ZhangKang added reviewers: hfinkel, PowerPC, aemerson, efriedma. Herald added subscribers: wuzish, hiraditya, kristof.beyls. Herald added a project: LLVM. 
For the test case below, we hit an assertion failure on every target except AArch64 and ARM: declare i8 @llvm.experimental.vector.reduce.and.i8.v3i8(<3 x i8> %a) define i8 @test_v3i8(<3 x i8> %a) nounwind { %b = call i8 @llvm.experimental.vector.reduce.and.i8.v3i8(<3 x i8> %a) ret i8 %b } This patch fixes that failure, which occurs when the number of elements is not a power of 2 for the llvm.experimental.vector.reduce.* intrinsics. https://reviews.llvm.org/D68625 Files: llvm/lib/CodeGen/ExpandReductions.cpp llvm/test/CodeGen/Generic/expand-experimental-reductions.ll Index: llvm/test/CodeGen/Generic/expand-experimental-reductions.ll =================================================================== --- llvm/test/CodeGen/Generic/expand-experimental-reductions.ll +++ llvm/test/CodeGen/Generic/expand-experimental-reductions.ll @@ -18,6 +18,7 @@ declare double @llvm.experimental.vector.reduce.fmax.v2f64(<2 x double>) declare double @llvm.experimental.vector.reduce.fmin.v2f64(<2 x double>) +declare i8 @llvm.experimental.vector.reduce.and.i8.v3i8(<3 x i8>) define i64 @add_i64(<2 x i64> %vec) { ; CHECK-LABEL: @add_i64( @@ -303,3 +304,15 @@ %r = call double @llvm.experimental.vector.reduce.fmin.v2f64(<2 x double> %vec) ret double %r } + +; Test when the vector size is not power of two. 
+define i8 @test_v3i8(<3 x i8> %a) nounwind { +; CHECK-LABEL: @test_v3i8( +; CHECK-NEXT: entry: +; CHECK-NEXT: %b = call i8 @llvm.experimental.vector.reduce.and.v3i8(<3 x i8> %a) +; CHECK-NEXT: ret i8 %b +; +entry: + %b = call i8 @llvm.experimental.vector.reduce.and.i8.v3i8(<3 x i8> %a) + ret i8 %b +} Index: llvm/lib/CodeGen/ExpandReductions.cpp =================================================================== --- llvm/lib/CodeGen/ExpandReductions.cpp +++ llvm/lib/CodeGen/ExpandReductions.cpp @@ -105,6 +105,9 @@ if (!FMF.allowReassoc()) Rdx = getOrderedReduction(Builder, Acc, Vec, getOpcode(ID), MRK); else { + if (!isPowerOf2_32(Vec->getType()->getVectorNumElements())) + continue; + Rdx = getShuffleReduction(Builder, Vec, getOpcode(ID), MRK); Rdx = Builder.CreateBinOp((Instruction::BinaryOps)getOpcode(ID), Acc, Rdx, "bin.rdx"); @@ -122,6 +125,9 @@ case Intrinsic::experimental_vector_reduce_fmax: case Intrinsic::experimental_vector_reduce_fmin: { Value *Vec = II->getArgOperand(0); + if (!isPowerOf2_32(Vec->getType()->getVectorNumElements())) + continue; + Rdx = getShuffleReduction(Builder, Vec, getOpcode(ID), MRK); } break; default: -------------- next part -------------- A non-text attachment was scrubbed... Name: D68625.223720.patch Type: text/x-patch Size: 2040 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 19:12:37 2019 From: llvm-commits at lists.llvm.org (ChenZheng via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 02:12:37 +0000 (UTC) Subject: [PATCH] D68552: [WebAssembly] Fix unwind mismatch stat computation In-Reply-To: References: Message-ID: shchenz added inline comments. ================ Comment at: llvm/test/CodeGen/WebAssembly/cfg-stackify-eh.ll:706 +; Check if the unwind destination mismatch stats are correct +; NOSORT-STAT: 11 wasm-cfg-stackify - Number of EH pad unwind mismatches found ---------------- This causes `make check-llvm` fail for release llc. Need to add `; REQUIRES: asserts` at the beginning? 
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68552/new/ https://reviews.llvm.org/D68552 From llvm-commits at lists.llvm.org Mon Oct 7 19:15:06 2019 From: llvm-commits at lists.llvm.org (Hubert Tong via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 02:15:06 +0000 (UTC) Subject: [PATCH] D67008: implement parsing relocation information for 32-bit xcoff objectfile In-Reply-To: References: Message-ID: hubert.reinterpretcast added inline comments. ================ Comment at: llvm/include/llvm/BinaryFormat/XCOFF.h:192 + ///< displacement that is the difference between the address of + ///< the refrenced symbol and the address of the refrenced branch + ///< instruction. References a non modifiable instruction. ---------------- The typo, "refrenced", is still here. ================ Comment at: llvm/include/llvm/BinaryFormat/XCOFF.h:193 + ///< the refrenced symbol and the address of the refrenced branch + ///< instruction. References a non modifiable instruction. + R_RBA = 0x18, ///< Branch absolute relocation. Similar to the R_BA but ---------------- Still missing the hyphen for "non-modifiable". ================ Comment at: llvm/include/llvm/BinaryFormat/XCOFF.h:194 + ///< instruction. References a non modifiable instruction. + R_RBA = 0x18, ///< Branch absolute relocation. Similar to the R_BA but + ///< references a modifiable instruction. ---------------- Either remove the "the" for this line or add "relocation" after "R_BA". ================ Comment at: llvm/include/llvm/Object/XCOFFObjectFile.h:168 + +bool isRelocationSigned(XCOFFRelocation32 &Reloc); + ---------------- DiggerLin wrote: > hubert.reinterpretcast wrote: > > Do these need to be declared in the header? Are they called only in one `.cpp` file? If so, they can be made `static` in the `.cpp` file. Otherwise, it seems odd that these aren't `const` member functions of `XCOFFRelocation32`. 
> the llvm-readobj is using those function and obj2yaml will use them too. It is still odd to me that these aren't `const` non-static member functions of `XCOFFRelocation32`. ================ Comment at: llvm/lib/Object/XCOFFObjectFile.cpp:556 } +// In an XCOFF32 file, if more than 65,534 relocation entries are required, +// the field value will be 65535, and an STYP_OVRFLO section header will ---------------- DiggerLin wrote: > hubert.reinterpretcast wrote: > > We can reduce the amount of background for the comment to what is necessary to understand the code here: > > In an XCOFF32 file, when the field value is 65535, then an STYP_OVRFLO section header contains the actual count of relocation entries in the s_paddr field. STYP_OVRFLO headers contain the section index of their corresponding sections as their raw "NumberOfRelocations" field value. > added. I am not seeing the change. ================ Comment at: llvm/lib/Object/XCOFFObjectFile.cpp:593 + Sec.FileOffsetToRelocationInfo); + auto RelocEntNumOrErr = getLogicalNumberOfRelocationEntries(Sec); + if (Error E = RelocEntNumOrErr.takeError()) ---------------- Suggestion: `NumRelocEntriesOrErr` ================ Comment at: llvm/lib/Object/XCOFFObjectFile.cpp:597 + + uint32_t RelocEntNum = RelocEntNumOrErr.get(); + ---------------- Suggestion: `NumRelocEntries` Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67008/new/ https://reviews.llvm.org/D67008 From llvm-commits at lists.llvm.org Mon Oct 7 19:17:56 2019 From: llvm-commits at lists.llvm.org (Alexander Shaposhnikov via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 02:17:56 +0000 (UTC) Subject: [PATCH] D68594: [llvm-lipo] Add TextAPI to LINK_COMPONENTS In-Reply-To: References: Message-ID: alexshap accepted this revision. alexshap added a comment. This revision is now accepted and ready to land. 
ok, thanks Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68594/new/ https://reviews.llvm.org/D68594 From llvm-commits at lists.llvm.org Mon Oct 7 19:23:04 2019 From: llvm-commits at lists.llvm.org (David Greene via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 02:23:04 +0000 (UTC) Subject: [PATCH] D68153: Make IR labels more precise In-Reply-To: References: Message-ID: <748570bc89b503c909cc5b9fbe8f0853@localhost.localdomain> greened added a comment. In D68153#1689791 , @RKSimon wrote: > Wouldn't this mean that every regeneration would see this change? Yes. The label pattern is just wrong as-is because it will match calls in some cases. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68153/new/ https://reviews.llvm.org/D68153 From llvm-commits at lists.llvm.org Mon Oct 7 19:25:57 2019 From: llvm-commits at lists.llvm.org (Hubert Tong via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 02:25:57 +0000 (UTC) Subject: [PATCH] D66969: Output XCOFF object text section header and symbol entry for program code In-Reply-To: References: Message-ID: <38a64d0b24d76e0cd5d137fd23d50f4d@localhost.localdomain> hubert.reinterpretcast added inline comments. ================ Comment at: llvm/lib/MC/XCOFFObjectWriter.cpp:357 + // Now output the auxiliary entry. + W.write(CSectionRef.SymbolTableIndex); + // Parameter typecheck hash. Not supported. ---------------- sfertile wrote: > hubert.reinterpretcast wrote: > > Since the field is named `SectionLen` in `llvm::object::XCOFFCsectAuxEnt32`, a comment is warranted regarding its use also for referencing the containing csect by symbol table index. Please also add a comment in `include/llvm/Object/XCOFFObjectFile.h`. > I'm not disagreeing with this, but it should be done in a separate patch. @DiggerLin, please post such a patch (perhaps going so far as to rename the field to `SectionOrLength`) so we do not lose track of this. 
Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D66969/new/ https://reviews.llvm.org/D66969 From llvm-commits at lists.llvm.org Mon Oct 7 19:41:13 2019 From: llvm-commits at lists.llvm.org (Jan Korous via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 02:41:13 +0000 (UTC) Subject: [PATCH] D68093: [clang-scan-deps][static analyzer] Support for clang --analyze in scan-deps In-Reply-To: References: Message-ID: <3ef8c72ad78c0aa38b364228ac7606a7@localhost.localdomain> jkorous marked an inline comment as done. jkorous added inline comments. ================ Comment at: clang/include/clang/Driver/CC1Options.td:849 HelpText<"include a detailed record of preprocessing actions">; +def setup_static_analyzer : Flag<["-"], "setup-static-analyzer">, + HelpText<"Set up preprocessor for static analyzer (done automatically when static analyzer is run).">; ---------------- NoQ wrote: > jkorous wrote: > > hiraditya wrote: > > > The name doesn't quite reflect what it does. > > `setup-pp-for-analyzer`? I'm open to suggestions. > I actually suggest modifying the help text to something like "Behave as if the Static Analyzer is going to be invoked, even if it's not actually going to be invoked (for now this boils down to defining the __clang_analyzer__ macro)" and keep the flag name roughly the same. This is the actual purpose of the option, right? We don't need to specify what precisely takes place when the option is invoked because this may change in the future, but the contract will remain. Sorry, there were quite a few changes in the patch and it deviated from the original idea. After the most recent one the description wouldn't be correct anymore - please take a look at `clang/lib/Driver/ToolChains/Clang.cpp` below. The `-setup-static-analyzer` flag can be passed on its own to frontend (without `-analyze`) but if you pass `--analyze` to driver then both `-analyze` and `-setup-static-analyzer` are passed to frontend. 
Passing just `-analyze` without `-setup-static-analyzer` to frontend would result in skipping the `__clang_analyzer__` macro definition. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68093/new/ https://reviews.llvm.org/D68093 From llvm-commits at lists.llvm.org Mon Oct 7 19:46:17 2019 From: llvm-commits at lists.llvm.org (Johannes Doerfert via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 02:46:17 +0000 (UTC) Subject: [PATCH] D68626: [Attributor] Use undef for calls with unused arguments. Message-ID: jdoerfert created this revision. jdoerfert added reviewers: uenoku, sstefan1. Herald added subscribers: bollu, hiraditya. Herald added a project: LLVM. If an argument is unused, we can pass in undef instead of the original value to remove the use of the original value completely. Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D68626 Files: llvm/lib/Transforms/IPO/Attributor.cpp llvm/test/Transforms/FunctionAttrs/align.ll llvm/test/Transforms/FunctionAttrs/callbacks.ll llvm/test/Transforms/FunctionAttrs/internal-noalias.ll llvm/test/Transforms/FunctionAttrs/noalias_returned.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D68626.223725.patch Type: text/x-patch Size: 7892 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 19:50:27 2019 From: llvm-commits at lists.llvm.org (Heejin Ahn via llvm-commits) Date: Tue, 08 Oct 2019 02:50:27 -0000 Subject: [llvm] r374015 - [WebAssembly] Add REQUIRES: asserts to cfg-stackify-eh.ll Message-ID: <20191008025027.EAB538215F@lists.llvm.org> Author: aheejin Date: Mon Oct 7 19:50:27 2019 New Revision: 374015 URL: http://llvm.org/viewvc/llvm-project?rev=374015&view=rev Log: [WebAssembly] Add REQUIRES: asserts to cfg-stackify-eh.ll This was missing in D68552. 
Modified: llvm/trunk/test/CodeGen/WebAssembly/cfg-stackify-eh.ll Modified: llvm/trunk/test/CodeGen/WebAssembly/cfg-stackify-eh.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/WebAssembly/cfg-stackify-eh.ll?rev=374015&r1=374014&r2=374015&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/WebAssembly/cfg-stackify-eh.ll (original) +++ llvm/trunk/test/CodeGen/WebAssembly/cfg-stackify-eh.ll Mon Oct 7 19:50:27 2019 @@ -1,3 +1,4 @@ +; REQUIRES: asserts ; RUN: llc < %s -disable-wasm-fallthrough-return-opt -wasm-disable-explicit-locals -wasm-keep-registers -disable-block-placement -verify-machineinstrs -fast-isel=false -machine-sink-split-probability-threshold=0 -cgp-freq-ratio-to-skip-merge=1000 -exception-model=wasm -mattr=+exception-handling | FileCheck %s ; RUN: llc < %s -O0 -disable-wasm-fallthrough-return-opt -wasm-disable-explicit-locals -wasm-keep-registers -verify-machineinstrs -exception-model=wasm -mattr=+exception-handling | FileCheck %s --check-prefix=NOOPT ; RUN: llc < %s -disable-wasm-fallthrough-return-opt -wasm-disable-explicit-locals -wasm-keep-registers -disable-block-placement -verify-machineinstrs -fast-isel=false -machine-sink-split-probability-threshold=0 -cgp-freq-ratio-to-skip-merge=1000 -exception-model=wasm -mattr=+exception-handling -wasm-disable-ehpad-sort | FileCheck %s --check-prefix=NOSORT From llvm-commits at lists.llvm.org Mon Oct 7 19:53:24 2019 From: llvm-commits at lists.llvm.org (Heejin Ahn via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 02:53:24 +0000 (UTC) Subject: [PATCH] D68552: [WebAssembly] Fix unwind mismatch stat computation In-Reply-To: References: Message-ID: aheejin added a comment. Thank you! Done in r374015. 
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68552/new/ https://reviews.llvm.org/D68552 From llvm-commits at lists.llvm.org Mon Oct 7 20:00:31 2019 From: llvm-commits at lists.llvm.org (Chen Zheng via llvm-commits) Date: Tue, 08 Oct 2019 03:00:31 -0000 Subject: [llvm] r374016 - [ConstantRange] [NFC] replace addWithNoSignedWrap with addWithNoWrap. Message-ID: <20191008030031.9CD958DE8D@lists.llvm.org> Author: shchenz Date: Mon Oct 7 20:00:31 2019 New Revision: 374016 URL: http://llvm.org/viewvc/llvm-project?rev=374016&view=rev Log: [ConstantRange] [NFC] replace addWithNoSignedWrap with addWithNoWrap. Modified: llvm/trunk/include/llvm/IR/ConstantRange.h llvm/trunk/lib/IR/ConstantRange.cpp llvm/trunk/lib/Transforms/Scalar/IndVarSimplify.cpp llvm/trunk/unittests/IR/ConstantRangeTest.cpp Modified: llvm/trunk/include/llvm/IR/ConstantRange.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/IR/ConstantRange.h?rev=374016&r1=374015&r2=374016&view=diff ============================================================================== --- llvm/trunk/include/llvm/IR/ConstantRange.h (original) +++ llvm/trunk/include/llvm/IR/ConstantRange.h Mon Oct 7 20:00:31 2019 @@ -338,10 +338,6 @@ public: ConstantRange addWithNoWrap(const ConstantRange &Other, unsigned NoWrapKind, PreferredRangeType RangeType = Smallest) const; - /// Return a new range representing the possible values resulting from a - /// known NSW addition of a value in this range and \p Other constant. - ConstantRange addWithNoSignedWrap(const APInt &Other) const; - /// Return a new range representing the possible values resulting /// from a subtraction of a value in this range and a value in \p Other. 
ConstantRange sub(const ConstantRange &Other) const; Modified: llvm/trunk/lib/IR/ConstantRange.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/IR/ConstantRange.cpp?rev=374016&r1=374015&r2=374016&view=diff ============================================================================== --- llvm/trunk/lib/IR/ConstantRange.cpp (original) +++ llvm/trunk/lib/IR/ConstantRange.cpp Mon Oct 7 20:00:31 2019 @@ -866,16 +866,6 @@ ConstantRange ConstantRange::addWithNoWr return Result; } -ConstantRange ConstantRange::addWithNoSignedWrap(const APInt &Other) const { - // Calculate the subset of this range such that "X + Other" is - // guaranteed not to wrap (overflow) for all X in this subset. - auto NSWRange = ConstantRange::makeExactNoWrapRegion( - BinaryOperator::Add, Other, OverflowingBinaryOperator::NoSignedWrap); - auto NSWConstrainedRange = intersectWith(NSWRange); - - return NSWConstrainedRange.add(ConstantRange(Other)); -} - ConstantRange ConstantRange::sub(const ConstantRange &Other) const { if (isEmptySet() || Other.isEmptySet()) Modified: llvm/trunk/lib/Transforms/Scalar/IndVarSimplify.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/IndVarSimplify.cpp?rev=374016&r1=374015&r2=374016&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Scalar/IndVarSimplify.cpp (original) +++ llvm/trunk/lib/Transforms/Scalar/IndVarSimplify.cpp Mon Oct 7 20:00:31 2019 @@ -1839,8 +1839,8 @@ void WidenIV::calculatePostIncRange(Inst auto CmpRHSRange = SE->getSignedRange(SE->getSCEV(CmpRHS)); auto CmpConstrainedLHSRange = ConstantRange::makeAllowedICmpRegion(P, CmpRHSRange); - auto NarrowDefRange = - CmpConstrainedLHSRange.addWithNoSignedWrap(*NarrowDefRHS); + auto NarrowDefRange = CmpConstrainedLHSRange.addWithNoWrap( + *NarrowDefRHS, OverflowingBinaryOperator::NoSignedWrap); updatePostIncRangeInfo(NarrowDef, NarrowUser, NarrowDefRange); }; Modified: 
llvm/trunk/unittests/IR/ConstantRangeTest.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/unittests/IR/ConstantRangeTest.cpp?rev=374016&r1=374015&r2=374016&view=diff ============================================================================== --- llvm/trunk/unittests/IR/ConstantRangeTest.cpp (original) +++ llvm/trunk/unittests/IR/ConstantRangeTest.cpp Mon Oct 7 20:00:31 2019 @@ -643,32 +643,6 @@ TEST_F(ConstantRangeTest, Add) { ConstantRange(APInt(16, 0xe))); } -TEST_F(ConstantRangeTest, AddWithNoSignedWrap) { - EXPECT_EQ(Empty.addWithNoSignedWrap(APInt(16, 1)), Empty); - EXPECT_EQ(Full.addWithNoSignedWrap(APInt(16, 1)), - ConstantRange(APInt(16, INT16_MIN+1), APInt(16, INT16_MIN))); - EXPECT_EQ(ConstantRange(APInt(8, -50), APInt(8, 50)).addWithNoSignedWrap(APInt(8, 10)), - ConstantRange(APInt(8, -40), APInt(8, 60))); - EXPECT_EQ(ConstantRange(APInt(8, -50), APInt(8, 120)).addWithNoSignedWrap(APInt(8, 10)), - ConstantRange(APInt(8, -40), APInt(8, INT8_MIN))); - EXPECT_EQ(ConstantRange(APInt(8, 120), APInt(8, -10)).addWithNoSignedWrap(APInt(8, 5)), - ConstantRange(APInt(8, 125), APInt(8, -5))); - EXPECT_EQ(ConstantRange(APInt(8, 120), APInt(8, -120)).addWithNoSignedWrap(APInt(8, 10)), - ConstantRange(APInt(8, INT8_MIN+10), APInt(8, -110))); - - EXPECT_EQ(Empty.addWithNoSignedWrap(APInt(16, -1)), Empty); - EXPECT_EQ(Full.addWithNoSignedWrap(APInt(16, -1)), - ConstantRange(APInt(16, INT16_MIN), APInt(16, INT16_MAX))); - EXPECT_EQ(ConstantRange(APInt(8, -50), APInt(8, 50)).addWithNoSignedWrap(APInt(8, -10)), - ConstantRange(APInt(8, -60), APInt(8, 40))); - EXPECT_EQ(ConstantRange(APInt(8, -120), APInt(8, 50)).addWithNoSignedWrap(APInt(8, -10)), - ConstantRange(APInt(8, INT8_MIN), APInt(8, 40))); - EXPECT_EQ(ConstantRange(APInt(8, 120), APInt(8, -120)).addWithNoSignedWrap(APInt(8, -5)), - ConstantRange(APInt(8, 115), APInt(8, -125))); - EXPECT_EQ(ConstantRange(APInt(8, 120), APInt(8, -120)).addWithNoSignedWrap(APInt(8, -10)), - ConstantRange(APInt(8, 110), 
APInt(8, INT8_MIN-10))); -} - template static void TestAddWithNoSignedWrapExhaustive(Fn1 RangeFn, Fn2 IntFn) { unsigned Bits = 4; From llvm-commits at lists.llvm.org Mon Oct 7 19:59:27 2019 From: llvm-commits at lists.llvm.org (ChenZheng via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 02:59:27 +0000 (UTC) Subject: [PATCH] D68215: [ConstantRange] replacing addWithNoSignedWrap with addWithNoWrap - NFC In-Reply-To: References: Message-ID: <1e0e1c8247ef6551354ccf2d03d1c794@localhost.localdomain> shchenz closed this revision. shchenz added a comment. Committed in rL374016 CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68215/new/ https://reviews.llvm.org/D68215 From llvm-commits at lists.llvm.org Mon Oct 7 20:01:46 2019 From: llvm-commits at lists.llvm.org (ChenZheng via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 03:01:46 +0000 (UTC) Subject: [PATCH] D68189: [InstCombine] recognize popcount implemented in hacker's delight. In-Reply-To: References: Message-ID: shchenz added a comment. Yes, D45173 recognizes some standard forms of popcount, but it cannot recognize the one in the deepsjeng benchmark. There are many forms of popcount; this one is the form that `TargetLowering::expandCTPOP` chooses to expand to. So I guess we should add some specific combine code to recognize it? 
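For readers unfamiliar with the idiom, here is a minimal sketch of the bit-parallel population count usually attributed to Hacker's Delight. It is shown only to illustrate the kind of source pattern under discussion; whether it matches the deepsjeng code or the `expandCTPOP` expansion term for term is an assumption, and the function name `popcount32` is made up for this example.

```cpp
#include <cstdint>

// Illustrative 32-bit population count (name is hypothetical).
// Each step sums bit counts in progressively wider fields.
inline uint32_t popcount32(uint32_t X) {
  // Count bits in each 2-bit field.
  X = X - ((X >> 1) & 0x55555555u);
  // Sum adjacent 2-bit counts into 4-bit fields.
  X = (X & 0x33333333u) + ((X >> 2) & 0x33333333u);
  // Sum adjacent 4-bit counts into 8-bit fields.
  X = (X + (X >> 4)) & 0x0F0F0F0Fu;
  // Add the four byte counts; the total lands in the top byte.
  return (X * 0x01010101u) >> 24;
}
```

A pattern matcher would have to see through this whole chain of shifts, masks, adds and the final multiply to replace it with a single ctpop intrinsic call, which is why each textual variant tends to need its own recognition logic.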
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68189/new/ https://reviews.llvm.org/D68189 From llvm-commits at lists.llvm.org Mon Oct 7 20:28:33 2019 From: llvm-commits at lists.llvm.org (Zi Xuan Wu via llvm-commits) Date: Tue, 08 Oct 2019 03:28:33 -0000 Subject: [llvm] r374017 - [LoopVectorize][PowerPC] Estimate int and float register pressure separately in loop-vectorize Message-ID: <20191008032833.CB6C685E11@lists.llvm.org> Author: wuzish Date: Mon Oct 7 20:28:33 2019 New Revision: 374017 URL: http://llvm.org/viewvc/llvm-project?rev=374017&view=rev Log: [LoopVectorize][PowerPC] Estimate int and float register pressure separately in loop-vectorize In loop-vectorize, the interleave count and vectorization factor depend on the number of target registers. Currently, register pressure is not estimated separately for each register class (in particular, for scalar types, float values should not be counted in the same class as int values), so the estimate is inaccurate. Specifically, this causes too much interleaving/unrolling, which results in too many register spills in the loop body and hurts performance. So we need to classify registers at the IR level. Importantly, these are abstract register classes, not the target register classes that the backend defines in its td files. They are used to establish the mapping between the types of IR values and the number of simultaneous live ranges to which we'd like to limit some set of those types. For example, on the POWER target the register counts are special when VSX is enabled: the number of int scalar registers is 32 (GPR) and the number of float scalar registers is 64 (VSR), while int and float vector registers both number 64 (VSR). So there should be 2 kinds of register class when VSX is enabled, and 3 kinds of register class when VSX is NOT enabled. On the POWER target, this gives a big (+~30%) performance improvement in one specific benchmark (503.bwaves_r) of SPEC2017 and no other obvious regressions. 
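The mapping described in the log can be pictured with a small stand-alone model. This is a sketch under stated assumptions, not the PowerPC TTI code from the patch: the enum `AbstractRegClass` and both function names are invented for illustration, the VSX counts (32 GPR, 64 VSR) come from the log above, and the non-VSX counts of 32 FPRs and 32 vector registers are an assumption about the POWER register file.

```cpp
// Hypothetical abstract register classes, loosely following the log text.
enum class AbstractRegClass { GPR, FPR, VR, VSR };

// Maps a value kind to an abstract class. With VSX there are two classes
// (GPR, VSR); without VSX there are three (GPR, FPR, VR).
AbstractRegClass classForValue(bool IsVector, bool IsFloat, bool HasVSX) {
  if (HasVSX) {
    // Scalar ints stay in GPRs; scalar floats and all vectors share VSRs.
    return (!IsVector && !IsFloat) ? AbstractRegClass::GPR
                                   : AbstractRegClass::VSR;
  }
  if (IsVector)
    return AbstractRegClass::VR;
  return IsFloat ? AbstractRegClass::FPR : AbstractRegClass::GPR;
}

// Number of registers per abstract class.
unsigned numRegisters(AbstractRegClass RC) {
  switch (RC) {
  case AbstractRegClass::GPR:
    return 32; // stated in the commit log
  case AbstractRegClass::VSR:
    return 64; // stated in the commit log
  case AbstractRegClass::FPR:
    return 32; // assumption, not stated in the log
  case AbstractRegClass::VR:
    return 32; // assumption, not stated in the log
  }
  return 0;
}
```

Keeping a separate budget per class means a loop with many live float values is no longer limited by the smaller integer register count, and the reverse also holds, which is the inaccuracy the log describes.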
Differential revision: https://reviews.llvm.org/D67148 Added: llvm/trunk/test/Transforms/LoopVectorize/PowerPC/reg-usage.ll Modified: llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h llvm/trunk/include/llvm/Analysis/TargetTransformInfoImpl.h llvm/trunk/include/llvm/CodeGen/BasicTTIImpl.h llvm/trunk/lib/Analysis/TargetTransformInfo.cpp llvm/trunk/lib/Target/AArch64/AArch64TargetTransformInfo.h llvm/trunk/lib/Target/ARM/ARMTargetTransformInfo.h llvm/trunk/lib/Target/PowerPC/PPCTargetTransformInfo.cpp llvm/trunk/lib/Target/PowerPC/PPCTargetTransformInfo.h llvm/trunk/lib/Target/SystemZ/SystemZTargetTransformInfo.cpp llvm/trunk/lib/Target/SystemZ/SystemZTargetTransformInfo.h llvm/trunk/lib/Target/WebAssembly/WebAssemblyTargetTransformInfo.cpp llvm/trunk/lib/Target/WebAssembly/WebAssemblyTargetTransformInfo.h llvm/trunk/lib/Target/X86/X86TargetTransformInfo.cpp llvm/trunk/lib/Target/X86/X86TargetTransformInfo.h llvm/trunk/lib/Target/XCore/XCoreTargetTransformInfo.h llvm/trunk/lib/Transforms/Scalar/LoopStrengthReduce.cpp llvm/trunk/lib/Transforms/Vectorize/LoopVectorize.cpp llvm/trunk/lib/Transforms/Vectorize/SLPVectorizer.cpp llvm/trunk/test/Transforms/LoopVectorize/X86/reg-usage-debug.ll llvm/trunk/test/Transforms/LoopVectorize/X86/reg-usage.ll Modified: llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h?rev=374017&r1=374016&r2=374017&view=diff ============================================================================== --- llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h (original) +++ llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h Mon Oct 7 20:28:33 2019 @@ -788,10 +788,23 @@ public: /// Additional properties of an operand's values. enum OperandValueProperties { OP_None = 0, OP_PowerOf2 = 1 }; - /// \return The number of scalar or vector registers that the target has. - /// If 'Vectors' is true, it returns the number of vector registers. 
If it is - /// set to false, it returns the number of scalar registers. - unsigned getNumberOfRegisters(bool Vector) const; + /// \return the number of registers in the target-provided register class. + unsigned getNumberOfRegisters(unsigned ClassID) const; + + /// \return the target-provided register class ID for the provided type, + /// accounting for type promotion and other type-legalization techniques that the target might apply. + /// However, it specifically does not account for the scalarization or splitting of vector types. + /// Should a vector type require scalarization or splitting into multiple underlying vector registers, + /// that type should be mapped to a register class containing no registers. + /// Specifically, this is designed to provide a simple, high-level view of the register allocation + /// later performed by the backend. These register classes don't necessarily map onto the + /// register classes used by the backend. + /// FIXME: It's not currently possible to determine how many registers + /// are used by the provided type. + unsigned getRegisterClassForType(bool Vector, Type *Ty = nullptr) const; + + /// \return the target-provided register class name + const char* getRegisterClassName(unsigned ClassID) const; /// \return The width of the largest scalar or vector register type. 
unsigned getRegisterBitWidth(bool Vector) const; @@ -1243,7 +1256,9 @@ public: Type *Ty) = 0; virtual int getIntImmCost(Intrinsic::ID IID, unsigned Idx, const APInt &Imm, Type *Ty) = 0; - virtual unsigned getNumberOfRegisters(bool Vector) = 0; + virtual unsigned getNumberOfRegisters(unsigned ClassID) const = 0; + virtual unsigned getRegisterClassForType(bool Vector, Type *Ty = nullptr) const = 0; + virtual const char* getRegisterClassName(unsigned ClassID) const = 0; virtual unsigned getRegisterBitWidth(bool Vector) const = 0; virtual unsigned getMinVectorRegisterBitWidth() = 0; virtual bool shouldMaximizeVectorBandwidth(bool OptSize) const = 0; @@ -1586,8 +1601,14 @@ public: Type *Ty) override { return Impl.getIntImmCost(IID, Idx, Imm, Ty); } - unsigned getNumberOfRegisters(bool Vector) override { - return Impl.getNumberOfRegisters(Vector); + unsigned getNumberOfRegisters(unsigned ClassID) const override { + return Impl.getNumberOfRegisters(ClassID); + } + unsigned getRegisterClassForType(bool Vector, Type *Ty = nullptr) const override { + return Impl.getRegisterClassForType(Vector, Ty); + } + const char* getRegisterClassName(unsigned ClassID) const override { + return Impl.getRegisterClassName(ClassID); } unsigned getRegisterBitWidth(bool Vector) const override { return Impl.getRegisterBitWidth(Vector); Modified: llvm/trunk/include/llvm/Analysis/TargetTransformInfoImpl.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Analysis/TargetTransformInfoImpl.h?rev=374017&r1=374016&r2=374017&view=diff ============================================================================== --- llvm/trunk/include/llvm/Analysis/TargetTransformInfoImpl.h (original) +++ llvm/trunk/include/llvm/Analysis/TargetTransformInfoImpl.h Mon Oct 7 20:28:33 2019 @@ -354,7 +354,20 @@ public: return TTI::TCC_Free; } - unsigned getNumberOfRegisters(bool Vector) { return 8; } + unsigned getNumberOfRegisters(unsigned ClassID) const { return 8; } + + unsigned 
getRegisterClassForType(bool Vector, Type *Ty = nullptr) const { + return Vector ? 1 : 0; + }; + + const char* getRegisterClassName(unsigned ClassID) const { + switch (ClassID) { + default: + return "Generic::Unknown Register Class"; + case 0: return "Generic::ScalarRC"; + case 1: return "Generic::VectorRC"; + } + } unsigned getRegisterBitWidth(bool Vector) const { return 32; } Modified: llvm/trunk/include/llvm/CodeGen/BasicTTIImpl.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/CodeGen/BasicTTIImpl.h?rev=374017&r1=374016&r2=374017&view=diff ============================================================================== --- llvm/trunk/include/llvm/CodeGen/BasicTTIImpl.h (original) +++ llvm/trunk/include/llvm/CodeGen/BasicTTIImpl.h Mon Oct 7 20:28:33 2019 @@ -519,8 +519,6 @@ public: /// \name Vector TTI Implementations /// @{ - unsigned getNumberOfRegisters(bool Vector) { return Vector ? 0 : 1; } - unsigned getRegisterBitWidth(bool Vector) const { return 32; } /// Estimate the overhead of scalarizing an instruction. 
Insert and Extract Modified: llvm/trunk/lib/Analysis/TargetTransformInfo.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/TargetTransformInfo.cpp?rev=374017&r1=374016&r2=374017&view=diff ============================================================================== --- llvm/trunk/lib/Analysis/TargetTransformInfo.cpp (original) +++ llvm/trunk/lib/Analysis/TargetTransformInfo.cpp Mon Oct 7 20:28:33 2019 @@ -466,8 +466,16 @@ int TargetTransformInfo::getIntImmCost(I return Cost; } -unsigned TargetTransformInfo::getNumberOfRegisters(bool Vector) const { - return TTIImpl->getNumberOfRegisters(Vector); +unsigned TargetTransformInfo::getNumberOfRegisters(unsigned ClassID) const { + return TTIImpl->getNumberOfRegisters(ClassID); +} + +unsigned TargetTransformInfo::getRegisterClassForType(bool Vector, Type *Ty) const { + return TTIImpl->getRegisterClassForType(Vector, Ty); +} + +const char* TargetTransformInfo::getRegisterClassName(unsigned ClassID) const { + return TTIImpl->getRegisterClassName(ClassID); } unsigned TargetTransformInfo::getRegisterBitWidth(bool Vector) const { Modified: llvm/trunk/lib/Target/AArch64/AArch64TargetTransformInfo.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AArch64/AArch64TargetTransformInfo.h?rev=374017&r1=374016&r2=374017&view=diff ============================================================================== --- llvm/trunk/lib/Target/AArch64/AArch64TargetTransformInfo.h (original) +++ llvm/trunk/lib/Target/AArch64/AArch64TargetTransformInfo.h Mon Oct 7 20:28:33 2019 @@ -85,7 +85,8 @@ public: bool enableInterleavedAccessVectorization() { return true; } - unsigned getNumberOfRegisters(bool Vector) { + unsigned getNumberOfRegisters(unsigned ClassID) const { + bool Vector = (ClassID == 1); if (Vector) { if (ST->hasNEON()) return 32; Modified: llvm/trunk/lib/Target/ARM/ARMTargetTransformInfo.h URL: 
http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMTargetTransformInfo.h?rev=374017&r1=374016&r2=374017&view=diff ============================================================================== --- llvm/trunk/lib/Target/ARM/ARMTargetTransformInfo.h (original) +++ llvm/trunk/lib/Target/ARM/ARMTargetTransformInfo.h Mon Oct 7 20:28:33 2019 @@ -122,7 +122,8 @@ public: /// \name Vector TTI Implementations /// @{ - unsigned getNumberOfRegisters(bool Vector) { + unsigned getNumberOfRegisters(unsigned ClassID) const { + bool Vector = (ClassID == 1); if (Vector) { if (ST->hasNEON()) return 16; Modified: llvm/trunk/lib/Target/PowerPC/PPCTargetTransformInfo.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/PowerPC/PPCTargetTransformInfo.cpp?rev=374017&r1=374016&r2=374017&view=diff ============================================================================== --- llvm/trunk/lib/Target/PowerPC/PPCTargetTransformInfo.cpp (original) +++ llvm/trunk/lib/Target/PowerPC/PPCTargetTransformInfo.cpp Mon Oct 7 20:28:33 2019 @@ -594,10 +594,37 @@ bool PPCTTIImpl::enableInterleavedAccess return true; } -unsigned PPCTTIImpl::getNumberOfRegisters(bool Vector) { - if (Vector && !ST->hasAltivec() && !ST->hasQPX()) - return 0; - return ST->hasVSX() ? 64 : 32; +unsigned PPCTTIImpl::getNumberOfRegisters(unsigned ClassID) const { + assert(ClassID == GPRRC || ClassID == FPRRC || + ClassID == VRRC || ClassID == VSXRC); + if (ST->hasVSX()) { + assert(ClassID == GPRRC || ClassID == VSXRC); + return ClassID == GPRRC ? 32 : 64; + } + assert(ClassID == GPRRC || ClassID == FPRRC || ClassID == VRRC); + return 32; +} + +unsigned PPCTTIImpl::getRegisterClassForType(bool Vector, Type *Ty) const { + if (Vector) + return ST->hasVSX() ? VSXRC : VRRC; + else if (Ty && Ty->getScalarType()->isFloatTy()) + return ST->hasVSX() ? 
VSXRC : FPRRC; + else + return GPRRC; +} + +const char* PPCTTIImpl::getRegisterClassName(unsigned ClassID) const { + + switch (ClassID) { + default: + llvm_unreachable("unknown register class"); + return "PPC::unknown register class"; + case GPRRC: return "PPC::GPRRC"; + case FPRRC: return "PPC::FPRRC"; + case VRRC: return "PPC::VRRC"; + case VSXRC: return "PPC::VSXRC"; + } } unsigned PPCTTIImpl::getRegisterBitWidth(bool Vector) const { Modified: llvm/trunk/lib/Target/PowerPC/PPCTargetTransformInfo.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/PowerPC/PPCTargetTransformInfo.h?rev=374017&r1=374016&r2=374017&view=diff ============================================================================== --- llvm/trunk/lib/Target/PowerPC/PPCTargetTransformInfo.h (original) +++ llvm/trunk/lib/Target/PowerPC/PPCTargetTransformInfo.h Mon Oct 7 20:28:33 2019 @@ -72,7 +72,13 @@ public: TTI::MemCmpExpansionOptions enableMemCmpExpansion(bool OptSize, bool IsZeroCmp) const; bool enableInterleavedAccessVectorization(); - unsigned getNumberOfRegisters(bool Vector); + + enum PPCRegisterClass { + GPRRC, FPRRC, VRRC, VSXRC + }; + unsigned getNumberOfRegisters(unsigned ClassID) const; + unsigned getRegisterClassForType(bool Vector, Type *Ty = nullptr) const; + const char* getRegisterClassName(unsigned ClassID) const; unsigned getRegisterBitWidth(bool Vector) const; unsigned getCacheLineSize(); unsigned getPrefetchDistance(); Modified: llvm/trunk/lib/Target/SystemZ/SystemZTargetTransformInfo.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/SystemZ/SystemZTargetTransformInfo.cpp?rev=374017&r1=374016&r2=374017&view=diff ============================================================================== --- llvm/trunk/lib/Target/SystemZ/SystemZTargetTransformInfo.cpp (original) +++ llvm/trunk/lib/Target/SystemZ/SystemZTargetTransformInfo.cpp Mon Oct 7 20:28:33 2019 @@ -304,7 +304,8 @@ bool SystemZTTIImpl::isLSRCostLess(Targe C2.ScaleCost, C2.SetupCost); } 
-unsigned SystemZTTIImpl::getNumberOfRegisters(bool Vector) { +unsigned SystemZTTIImpl::getNumberOfRegisters(unsigned ClassID) const { + bool Vector = (ClassID == 1); if (!Vector) // Discount the stack pointer. Also leave out %r0, since it can't // be used in an address. Modified: llvm/trunk/lib/Target/SystemZ/SystemZTargetTransformInfo.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/SystemZ/SystemZTargetTransformInfo.h?rev=374017&r1=374016&r2=374017&view=diff ============================================================================== --- llvm/trunk/lib/Target/SystemZ/SystemZTargetTransformInfo.h (original) +++ llvm/trunk/lib/Target/SystemZ/SystemZTargetTransformInfo.h Mon Oct 7 20:28:33 2019 @@ -56,7 +56,7 @@ public: /// \name Vector TTI Implementations /// @{ - unsigned getNumberOfRegisters(bool Vector); + unsigned getNumberOfRegisters(unsigned ClassID) const; unsigned getRegisterBitWidth(bool Vector) const; unsigned getCacheLineSize() { return 256; } Modified: llvm/trunk/lib/Target/WebAssembly/WebAssemblyTargetTransformInfo.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/WebAssembly/WebAssemblyTargetTransformInfo.cpp?rev=374017&r1=374016&r2=374017&view=diff ============================================================================== --- llvm/trunk/lib/Target/WebAssembly/WebAssemblyTargetTransformInfo.cpp (original) +++ llvm/trunk/lib/Target/WebAssembly/WebAssemblyTargetTransformInfo.cpp Mon Oct 7 20:28:33 2019 @@ -25,10 +25,11 @@ WebAssemblyTTIImpl::getPopcntSupport(uns return TargetTransformInfo::PSK_FastHardware; } -unsigned WebAssemblyTTIImpl::getNumberOfRegisters(bool Vector) { - unsigned Result = BaseT::getNumberOfRegisters(Vector); +unsigned WebAssemblyTTIImpl::getNumberOfRegisters(unsigned ClassID) const { + unsigned Result = BaseT::getNumberOfRegisters(ClassID); // For SIMD, use at least 16 registers, as a rough guess. 
+ bool Vector = (ClassID == 1); if (Vector) Result = std::max(Result, 16u); Modified: llvm/trunk/lib/Target/WebAssembly/WebAssemblyTargetTransformInfo.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/WebAssembly/WebAssemblyTargetTransformInfo.h?rev=374017&r1=374016&r2=374017&view=diff ============================================================================== --- llvm/trunk/lib/Target/WebAssembly/WebAssemblyTargetTransformInfo.h (original) +++ llvm/trunk/lib/Target/WebAssembly/WebAssemblyTargetTransformInfo.h Mon Oct 7 20:28:33 2019 @@ -53,7 +53,7 @@ public: /// \name Vector TTI Implementations /// @{ - unsigned getNumberOfRegisters(bool Vector); + unsigned getNumberOfRegisters(unsigned ClassID) const; unsigned getRegisterBitWidth(bool Vector) const; unsigned getArithmeticInstrCost( unsigned Opcode, Type *Ty, Modified: llvm/trunk/lib/Target/X86/X86TargetTransformInfo.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86TargetTransformInfo.cpp?rev=374017&r1=374016&r2=374017&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86TargetTransformInfo.cpp (original) +++ llvm/trunk/lib/Target/X86/X86TargetTransformInfo.cpp Mon Oct 7 20:28:33 2019 @@ -116,7 +116,8 @@ llvm::Optional X86TTIImpl::get llvm_unreachable("Unknown TargetTransformInfo::CacheLevel"); } -unsigned X86TTIImpl::getNumberOfRegisters(bool Vector) { +unsigned X86TTIImpl::getNumberOfRegisters(unsigned ClassID) const { + bool Vector = (ClassID == 1); if (Vector && !ST->hasSSE1()) return 0; Modified: llvm/trunk/lib/Target/X86/X86TargetTransformInfo.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86TargetTransformInfo.h?rev=374017&r1=374016&r2=374017&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86TargetTransformInfo.h (original) +++ llvm/trunk/lib/Target/X86/X86TargetTransformInfo.h Mon Oct 7 20:28:33 2019 @@ 
-116,7 +116,7 @@ public: /// \name Vector TTI Implementations /// @{ - unsigned getNumberOfRegisters(bool Vector); + unsigned getNumberOfRegisters(unsigned ClassID) const; unsigned getRegisterBitWidth(bool Vector) const; unsigned getLoadStoreVecRegBitWidth(unsigned AS) const; unsigned getMaxInterleaveFactor(unsigned VF); Modified: llvm/trunk/lib/Target/XCore/XCoreTargetTransformInfo.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/XCore/XCoreTargetTransformInfo.h?rev=374017&r1=374016&r2=374017&view=diff ============================================================================== --- llvm/trunk/lib/Target/XCore/XCoreTargetTransformInfo.h (original) +++ llvm/trunk/lib/Target/XCore/XCoreTargetTransformInfo.h Mon Oct 7 20:28:33 2019 @@ -40,7 +40,8 @@ public: : BaseT(TM, F.getParent()->getDataLayout()), ST(TM->getSubtargetImpl()), TLI(ST->getTargetLowering()) {} - unsigned getNumberOfRegisters(bool Vector) { + unsigned getNumberOfRegisters(unsigned ClassID) const { + bool Vector = (ClassID == 1); if (Vector) { return 0; } Modified: llvm/trunk/lib/Transforms/Scalar/LoopStrengthReduce.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/LoopStrengthReduce.cpp?rev=374017&r1=374016&r2=374017&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Scalar/LoopStrengthReduce.cpp (original) +++ llvm/trunk/lib/Transforms/Scalar/LoopStrengthReduce.cpp Mon Oct 7 20:28:33 2019 @@ -1386,7 +1386,9 @@ void Cost::RateFormula(const Formula &F, // Treat every new register that exceeds TTI.getNumberOfRegisters() - 1 as // additional instruction (at least fill). - unsigned TTIRegNum = TTI->getNumberOfRegisters(false) - 1; + // TODO: Need distinguish register class? 
+ unsigned TTIRegNum = TTI->getNumberOfRegisters( + TTI->getRegisterClassForType(false, F.getType())) - 1; if (C.NumRegs > TTIRegNum) { // Cost already exceeded TTIRegNum, then only newly added register can add // new instructions. Modified: llvm/trunk/lib/Transforms/Vectorize/LoopVectorize.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Vectorize/LoopVectorize.cpp?rev=374017&r1=374016&r2=374017&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Vectorize/LoopVectorize.cpp (original) +++ llvm/trunk/lib/Transforms/Vectorize/LoopVectorize.cpp Mon Oct 7 20:28:33 2019 @@ -983,10 +983,11 @@ public: /// of a loop. struct RegisterUsage { /// Holds the number of loop invariant values that are used in the loop. - unsigned LoopInvariantRegs; - + /// The key is ClassID of target-provided register class. + SmallMapVector LoopInvariantRegs; /// Holds the maximum number of concurrent live intervals in the loop. - unsigned MaxLocalUsers; + /// The key is ClassID of target-provided register class. + SmallMapVector MaxLocalUsers; }; /// \return Returns information about the register usages of the loop for the @@ -4962,9 +4963,14 @@ LoopVectorizationCostModel::computeFeasi // Select the largest VF which doesn't require more registers than existing // ones. 
- unsigned TargetNumRegisters = TTI.getNumberOfRegisters(true); for (int i = RUs.size() - 1; i >= 0; --i) { - if (RUs[i].MaxLocalUsers <= TargetNumRegisters) { + bool Selected = true; + for (auto& pair : RUs[i].MaxLocalUsers) { + unsigned TargetNumRegisters = TTI.getNumberOfRegisters(pair.first); + if (pair.second > TargetNumRegisters) + Selected = false; + } + if (Selected) { MaxVF = VFs[i]; break; } @@ -5115,22 +5121,12 @@ unsigned LoopVectorizationCostModel::sel if (TC > 1 && TC < TinyTripCountInterleaveThreshold) return 1; - unsigned TargetNumRegisters = TTI.getNumberOfRegisters(VF > 1); - LLVM_DEBUG(dbgs() << "LV: The target has " << TargetNumRegisters - << " registers\n"); - - if (VF == 1) { - if (ForceTargetNumScalarRegs.getNumOccurrences() > 0) - TargetNumRegisters = ForceTargetNumScalarRegs; - } else { - if (ForceTargetNumVectorRegs.getNumOccurrences() > 0) - TargetNumRegisters = ForceTargetNumVectorRegs; - } - RegisterUsage R = calculateRegisterUsage({VF})[0]; // We divide by these constants so assume that we have at least one // instruction that uses at least one register. - R.MaxLocalUsers = std::max(R.MaxLocalUsers, 1U); + for (auto& pair : R.MaxLocalUsers) { + pair.second = std::max(pair.second, 1U); + } // We calculate the interleave count using the following formula. // Subtract the number of loop invariants from the number of available @@ -5143,13 +5139,35 @@ unsigned LoopVectorizationCostModel::sel // We also want power of two interleave counts to ensure that the induction // variable of the vector loop wraps to zero, when tail is folded by masking; // this currently happens when OptForSize, in which case IC is set to 1 above. - unsigned IC = PowerOf2Floor((TargetNumRegisters - R.LoopInvariantRegs) / - R.MaxLocalUsers); + unsigned IC = UINT_MAX; - // Don't count the induction variable as interleaved. 
- if (EnableIndVarRegisterHeur) - IC = PowerOf2Floor((TargetNumRegisters - R.LoopInvariantRegs - 1) / - std::max(1U, (R.MaxLocalUsers - 1))); + for (auto& pair : R.MaxLocalUsers) { + unsigned TargetNumRegisters = TTI.getNumberOfRegisters(pair.first); + LLVM_DEBUG(dbgs() << "LV: The target has " << TargetNumRegisters + << " registers of " + << TTI.getRegisterClassName(pair.first) << " register class\n"); + if (VF == 1) { + if (ForceTargetNumScalarRegs.getNumOccurrences() > 0) + TargetNumRegisters = ForceTargetNumScalarRegs; + } else { + if (ForceTargetNumVectorRegs.getNumOccurrences() > 0) + TargetNumRegisters = ForceTargetNumVectorRegs; + } + unsigned MaxLocalUsers = pair.second; + unsigned LoopInvariantRegs = 0; + if (R.LoopInvariantRegs.find(pair.first) != R.LoopInvariantRegs.end()) + LoopInvariantRegs = R.LoopInvariantRegs[pair.first]; + + unsigned TmpIC = PowerOf2Floor((TargetNumRegisters - LoopInvariantRegs) / MaxLocalUsers); + // Don't count the induction variable as interleaved. + if (EnableIndVarRegisterHeur) { + TmpIC = + PowerOf2Floor((TargetNumRegisters - LoopInvariantRegs - 1) / + std::max(1U, (MaxLocalUsers - 1))); + } + + IC = std::min(IC, TmpIC); + } // Clamp the interleave ranges to reasonable counts. unsigned MaxInterleaveCount = TTI.getMaxInterleaveFactor(VF); @@ -5331,7 +5349,7 @@ LoopVectorizationCostModel::calculateReg const DataLayout &DL = TheFunction->getParent()->getDataLayout(); SmallVector RUs(VFs.size()); - SmallVector MaxUsages(VFs.size(), 0); + SmallVector, 8> MaxUsages(VFs.size()); LLVM_DEBUG(dbgs() << "LV(REG): Calculating max register usage:\n"); @@ -5361,21 +5379,45 @@ LoopVectorizationCostModel::calculateReg // For each VF find the maximum usage of registers. for (unsigned j = 0, e = VFs.size(); j < e; ++j) { + // Count the number of live intervals. 
+ SmallMapVector RegUsage; + if (VFs[j] == 1) { - MaxUsages[j] = std::max(MaxUsages[j], OpenIntervals.size()); - continue; + for (auto Inst : OpenIntervals) { + unsigned ClassID = TTI.getRegisterClassForType(false, Inst->getType()); + if (RegUsage.find(ClassID) == RegUsage.end()) + RegUsage[ClassID] = 1; + else + RegUsage[ClassID] += 1; + } + } else { + collectUniformsAndScalars(VFs[j]); + for (auto Inst : OpenIntervals) { + // Skip ignored values for VF > 1. + if (VecValuesToIgnore.find(Inst) != VecValuesToIgnore.end()) + continue; + if (isScalarAfterVectorization(Inst, VFs[j])) { + unsigned ClassID = TTI.getRegisterClassForType(false, Inst->getType()); + if (RegUsage.find(ClassID) == RegUsage.end()) + RegUsage[ClassID] = 1; + else + RegUsage[ClassID] += 1; + } else { + unsigned ClassID = TTI.getRegisterClassForType(true, Inst->getType()); + if (RegUsage.find(ClassID) == RegUsage.end()) + RegUsage[ClassID] = GetRegUsage(Inst->getType(), VFs[j]); + else + RegUsage[ClassID] += GetRegUsage(Inst->getType(), VFs[j]); + } + } } - collectUniformsAndScalars(VFs[j]); - // Count the number of live intervals. - unsigned RegUsage = 0; - for (auto Inst : OpenIntervals) { - // Skip ignored values for VF > 1. 
- if (VecValuesToIgnore.find(Inst) != VecValuesToIgnore.end() || - isScalarAfterVectorization(Inst, VFs[j])) - continue; - RegUsage += GetRegUsage(Inst->getType(), VFs[j]); + + for (auto& pair : RegUsage) { + if (MaxUsages[j].find(pair.first) != MaxUsages[j].end()) + MaxUsages[j][pair.first] = std::max(MaxUsages[j][pair.first], pair.second); + else + MaxUsages[j][pair.first] = pair.second; } - MaxUsages[j] = std::max(MaxUsages[j], RegUsage); } LLVM_DEBUG(dbgs() << "LV(REG): At #" << i << " Interval # " @@ -5386,18 +5428,32 @@ LoopVectorizationCostModel::calculateReg } for (unsigned i = 0, e = VFs.size(); i < e; ++i) { - unsigned Invariant = 0; - if (VFs[i] == 1) - Invariant = LoopInvariants.size(); - else { - for (auto Inst : LoopInvariants) - Invariant += GetRegUsage(Inst->getType(), VFs[i]); + SmallMapVector Invariant; + + for (auto Inst : LoopInvariants) { + unsigned Usage = VFs[i] == 1 ? 1 : GetRegUsage(Inst->getType(), VFs[i]); + unsigned ClassID = TTI.getRegisterClassForType(VFs[i] > 1, Inst->getType()); + if (Invariant.find(ClassID) == Invariant.end()) + Invariant[ClassID] = Usage; + else + Invariant[ClassID] += Usage; } LLVM_DEBUG(dbgs() << "LV(REG): VF = " << VFs[i] << '\n'); - LLVM_DEBUG(dbgs() << "LV(REG): Found max usage: " << MaxUsages[i] << '\n'); - LLVM_DEBUG(dbgs() << "LV(REG): Found invariant usage: " << Invariant - << '\n'); + LLVM_DEBUG(dbgs() << "LV(REG): Found max usage: " + << MaxUsages[i].size() << " item\n"); + for (const auto& pair : MaxUsages[i]) { + LLVM_DEBUG(dbgs() << "LV(REG): RegisterClass: " + << TTI.getRegisterClassName(pair.first) + << ", " << pair.second << " registers \n"); + } + LLVM_DEBUG(dbgs() << "LV(REG): Found invariant usage: " + << Invariant.size() << " item\n"); + for (const auto& pair : Invariant) { + LLVM_DEBUG(dbgs() << "LV(REG): RegisterClass: " + << TTI.getRegisterClassName(pair.first) + << ", " << pair.second << " registers \n"); + } RU.LoopInvariantRegs = Invariant; RU.MaxLocalUsers = MaxUsages[i]; @@ -7762,7 
+7818,8 @@ bool LoopVectorizePass::runImpl( // The second condition is necessary because, even if the target has no // vector registers, loop vectorization may still enable scalar // interleaving. - if (!TTI->getNumberOfRegisters(true) && TTI->getMaxInterleaveFactor(1) < 2) + if (!TTI->getNumberOfRegisters(TTI->getRegisterClassForType(true)) && + TTI->getMaxInterleaveFactor(1) < 2) return false; bool Changed = false; Modified: llvm/trunk/lib/Transforms/Vectorize/SLPVectorizer.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Vectorize/SLPVectorizer.cpp?rev=374017&r1=374016&r2=374017&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Vectorize/SLPVectorizer.cpp (original) +++ llvm/trunk/lib/Transforms/Vectorize/SLPVectorizer.cpp Mon Oct 7 20:28:33 2019 @@ -5237,7 +5237,7 @@ bool SLPVectorizerPass::runImpl(Function // If the target claims to have no vector registers don't attempt // vectorization. - if (!TTI->getNumberOfRegisters(true)) + if (!TTI->getNumberOfRegisters(TTI->getRegisterClassForType(true))) return false; // Don't vectorize when the attribute NoImplicitFloat is used. 
Added: llvm/trunk/test/Transforms/LoopVectorize/PowerPC/reg-usage.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/LoopVectorize/PowerPC/reg-usage.ll?rev=374017&view=auto ============================================================================== --- llvm/trunk/test/Transforms/LoopVectorize/PowerPC/reg-usage.ll (added) +++ llvm/trunk/test/Transforms/LoopVectorize/PowerPC/reg-usage.ll Mon Oct 7 20:28:33 2019 @@ -0,0 +1,178 @@ +; RUN: opt < %s -debug-only=loop-vectorize -loop-vectorize -vectorizer-maximize-bandwidth -O2 -mtriple=powerpc64-unknown-linux -S -mcpu=pwr8 2>&1 | FileCheck %s --check-prefixes=CHECK,CHECK-PWR8 +; RUN: opt < %s -debug-only=loop-vectorize -loop-vectorize -vectorizer-maximize-bandwidth -O2 -mtriple=powerpc64le-unknown-linux -S -mcpu=pwr9 2>&1 | FileCheck %s --check-prefixes=CHECK,CHECK-PWR9 + + at a = global [1024 x i8] zeroinitializer, align 16 + at b = global [1024 x i8] zeroinitializer, align 16 + +define i32 @foo() { +; +; CHECK-LABEL: foo + +; CHECK: LV(REG): VF = 8 +; CHECK-NEXT: LV(REG): Found max usage: 2 item +; CHECK-NEXT: LV(REG): RegisterClass: PPC::GPRRC, 2 registers +; CHECK-NEXT: LV(REG): RegisterClass: PPC::VSXRC, 7 registers +; CHECK-NEXT: LV(REG): Found invariant usage: 0 item +; CHECK: LV(REG): VF = 16 +; CHECK-NEXT: LV(REG): Found max usage: 2 item +; CHECK-NEXT: LV(REG): RegisterClass: PPC::GPRRC, 2 registers +; CHECK-NEXT: LV(REG): RegisterClass: PPC::VSXRC, 13 registers +; CHECK-NEXT: LV(REG): Found invariant usage: 0 item + +; CHECK-PWR8: LV(REG): VF = 16 +; CHECK-PWR8-NEXT: LV(REG): Found max usage: 2 item +; CHECK-PWR8-NEXT: LV(REG): RegisterClass: PPC::GPRRC, 2 registers +; CHECK-PWR8-NEXT: LV(REG): RegisterClass: PPC::VSXRC, 13 registers +; CHECK-PWR8-NEXT: LV(REG): Found invariant usage: 0 item +; CHECK-PWR8: Setting best plan to VF=16, UF=4 + +; CHECK-PWR9: LV(REG): VF = 8 +; CHECK-PWR9-NEXT: LV(REG): Found max usage: 2 item +; CHECK-PWR9-NEXT: LV(REG): RegisterClass: PPC::GPRRC, 2 
registers +; CHECK-PWR9-NEXT: LV(REG): RegisterClass: PPC::VSXRC, 7 registers +; CHECK-PWR9-NEXT: LV(REG): Found invariant usage: 0 item +; CHECK-PWR9: Setting best plan to VF=8, UF=8 + + +entry: + br label %for.body + +for.cond.cleanup: + %add.lcssa = phi i32 [ %add, %for.body ] + ret i32 %add.lcssa + +for.body: + %indvars.iv = phi i64 [ 0, %entry ], [ %indvars.iv.next, %for.body ] + %s.015 = phi i32 [ 0, %entry ], [ %add, %for.body ] + %arrayidx = getelementptr inbounds [1024 x i8], [1024 x i8]* @a, i64 0, i64 %indvars.iv + %0 = load i8, i8* %arrayidx, align 1 + %conv = zext i8 %0 to i32 + %arrayidx2 = getelementptr inbounds [1024 x i8], [1024 x i8]* @b, i64 0, i64 %indvars.iv + %1 = load i8, i8* %arrayidx2, align 1 + %conv3 = zext i8 %1 to i32 + %sub = sub nsw i32 %conv, %conv3 + %ispos = icmp sgt i32 %sub, -1 + %neg = sub nsw i32 0, %sub + %2 = select i1 %ispos, i32 %sub, i32 %neg + %add = add nsw i32 %2, %s.015 + %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1 + %exitcond = icmp eq i64 %indvars.iv.next, 1024 + br i1 %exitcond, label %for.cond.cleanup, label %for.body +} + +define i32 @goo() { +; For indvars.iv used in a computating chain only feeding into getelementptr or cmp, +; it will not have vector version and the vector register usage will not exceed the +; available vector register number. 
+; CHECK-LABEL: goo +; CHECK: LV(REG): VF = 8 +; CHECK-NEXT: LV(REG): Found max usage: 2 item +; CHECK-NEXT: LV(REG): RegisterClass: PPC::GPRRC, 2 registers +; CHECK-NEXT: LV(REG): RegisterClass: PPC::VSXRC, 7 registers +; CHECK-NEXT: LV(REG): Found invariant usage: 0 item +; CHECK: LV(REG): VF = 16 +; CHECK-NEXT: LV(REG): Found max usage: 2 item +; CHECK-NEXT: LV(REG): RegisterClass: PPC::GPRRC, 2 registers +; CHECK-NEXT: LV(REG): RegisterClass: PPC::VSXRC, 13 registers +; CHECK-NEXT: LV(REG): Found invariant usage: 0 item +; CHECK: LV(REG): VF = 16 +; CHECK-NEXT: LV(REG): Found max usage: 2 item +; CHECK-NEXT: LV(REG): RegisterClass: PPC::GPRRC, 2 registers +; CHECK-NEXT: LV(REG): RegisterClass: PPC::VSXRC, 13 registers +; CHECK-NEXT: LV(REG): Found invariant usage: 0 item + +; CHECK: Setting best plan to VF=16, UF=4 + +entry: + br label %for.body + +for.cond.cleanup: ; preds = %for.body + %add.lcssa = phi i32 [ %add, %for.body ] + ret i32 %add.lcssa + +for.body: ; preds = %for.body, %entry + %indvars.iv = phi i64 [ 0, %entry ], [ %indvars.iv.next, %for.body ] + %s.015 = phi i32 [ 0, %entry ], [ %add, %for.body ] + %tmp1 = add nsw i64 %indvars.iv, 3 + %arrayidx = getelementptr inbounds [1024 x i8], [1024 x i8]* @a, i64 0, i64 %tmp1 + %tmp = load i8, i8* %arrayidx, align 1 + %conv = zext i8 %tmp to i32 + %tmp2 = add nsw i64 %indvars.iv, 2 + %arrayidx2 = getelementptr inbounds [1024 x i8], [1024 x i8]* @b, i64 0, i64 %tmp2 + %tmp3 = load i8, i8* %arrayidx2, align 1 + %conv3 = zext i8 %tmp3 to i32 + %sub = sub nsw i32 %conv, %conv3 + %ispos = icmp sgt i32 %sub, -1 + %neg = sub nsw i32 0, %sub + %tmp4 = select i1 %ispos, i32 %sub, i32 %neg + %add = add nsw i32 %tmp4, %s.015 + %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1 + %exitcond = icmp eq i64 %indvars.iv.next, 1024 + br i1 %exitcond, label %for.cond.cleanup, label %for.body +} + +define i64 @bar(i64* nocapture %a) { +; CHECK-LABEL: bar +; CHECK: LV(REG): VF = 2 +; CHECK-NEXT: LV(REG): Found max usage: 2 item 
+; CHECK-NEXT: LV(REG): RegisterClass: PPC::VSXRC, 3 registers +; CHECK-NEXT: LV(REG): RegisterClass: PPC::GPRRC, 1 registers +; CHECK-NEXT: LV(REG): Found invariant usage: 0 item + +; CHECK: Setting best plan to VF=2, UF=12 + +entry: + br label %for.body + +for.cond.cleanup: + %add2.lcssa = phi i64 [ %add2, %for.body ] + ret i64 %add2.lcssa + +for.body: + %i.012 = phi i64 [ 0, %entry ], [ %inc, %for.body ] + %s.011 = phi i64 [ 0, %entry ], [ %add2, %for.body ] + %arrayidx = getelementptr inbounds i64, i64* %a, i64 %i.012 + %0 = load i64, i64* %arrayidx, align 8 + %add = add nsw i64 %0, %i.012 + store i64 %add, i64* %arrayidx, align 8 + %add2 = add nsw i64 %add, %s.011 + %inc = add nuw nsw i64 %i.012, 1 + %exitcond = icmp eq i64 %inc, 1024 + br i1 %exitcond, label %for.cond.cleanup, label %for.body +} + + at d = external global [0 x i64], align 8 + at e = external global [0 x i32], align 4 + at c = external global [0 x i32], align 4 + +define void @hoo(i32 %n) { +; CHECK-LABEL: hoo +; CHECK: LV(REG): VF = 4 +; CHECK-NEXT: LV(REG): Found max usage: 2 item +; CHECK-NEXT: LV(REG): RegisterClass: PPC::GPRRC, 2 registers +; CHECK-NEXT: LV(REG): RegisterClass: PPC::VSXRC, 2 registers +; CHECK-NEXT: LV(REG): Found invariant usage: 0 item +; CHECK: LV(REG): VF = 1 +; CHECK-NEXT: LV(REG): Found max usage: 1 item +; CHECK-NEXT: LV(REG): RegisterClass: PPC::GPRRC, 2 registers +; CHECK-NEXT: LV(REG): Found invariant usage: 0 item +; CHECK: Setting best plan to VF=1, UF=12 + +entry: + br label %for.body + +for.body: ; preds = %for.body, %entry + %indvars.iv = phi i64 [ 0, %entry ], [ %indvars.iv.next, %for.body ] + %arrayidx = getelementptr inbounds [0 x i64], [0 x i64]* @d, i64 0, i64 %indvars.iv + %tmp = load i64, i64* %arrayidx, align 8 + %arrayidx1 = getelementptr inbounds [0 x i32], [0 x i32]* @e, i64 0, i64 %tmp + %tmp1 = load i32, i32* %arrayidx1, align 4 + %arrayidx3 = getelementptr inbounds [0 x i32], [0 x i32]* @c, i64 0, i64 %indvars.iv + store i32 %tmp1, i32* 
%arrayidx3, align 4 + %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1 + %exitcond = icmp eq i64 %indvars.iv.next, 10000 + br i1 %exitcond, label %for.end, label %for.body + +for.end: ; preds = %for.body + ret void +} Modified: llvm/trunk/test/Transforms/LoopVectorize/X86/reg-usage-debug.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/LoopVectorize/X86/reg-usage-debug.ll?rev=374017&r1=374016&r2=374017&view=diff ============================================================================== --- llvm/trunk/test/Transforms/LoopVectorize/X86/reg-usage-debug.ll (original) +++ llvm/trunk/test/Transforms/LoopVectorize/X86/reg-usage-debug.ll Mon Oct 7 20:28:33 2019 @@ -22,7 +22,11 @@ target datalayout = "e-m:e-i64:64-f80:12 target triple = "x86_64-unknown-linux-gnu" ; CHECK: LV: Checking a loop in "test_g" -; CHECK: LV(REG): Found max usage: 2 +; CHECK: LV(REG): Found max usage: 2 item +; CHECK-NEXT: LV(REG): RegisterClass: Generic::ScalarRC, 2 registers +; CHECK-NEXT: LV(REG): RegisterClass: Generic::VectorRC, 2 registers +; CHECK-NEXT: LV(REG): Found invariant usage: 1 item +; CHECK-NEXT: LV(REG): RegisterClass: Generic::VectorRC, 2 registers define i32 @test_g(i32* nocapture readonly %a, i32 %n) local_unnamed_addr !dbg !6 { entry: @@ -60,7 +64,11 @@ for.end: } ; CHECK: LV: Checking a loop in "test" -; CHECK: LV(REG): Found max usage: 2 +; CHECK: LV(REG): Found max usage: 2 item +; CHECK-NEXT: LV(REG): RegisterClass: Generic::ScalarRC, 2 registers +; CHECK-NEXT: LV(REG): RegisterClass: Generic::VectorRC, 2 registers +; CHECK-NEXT: LV(REG): Found invariant usage: 1 item +; CHECK-NEXT: LV(REG): RegisterClass: Generic::VectorRC, 2 registers define i32 @test(i32* nocapture readonly %a, i32 %n) local_unnamed_addr { entry: Modified: llvm/trunk/test/Transforms/LoopVectorize/X86/reg-usage.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/LoopVectorize/X86/reg-usage.ll?rev=374017&r1=374016&r2=374017&view=diff 
============================================================================== --- llvm/trunk/test/Transforms/LoopVectorize/X86/reg-usage.ll (original) +++ llvm/trunk/test/Transforms/LoopVectorize/X86/reg-usage.ll Mon Oct 7 20:28:33 2019 @@ -11,9 +11,15 @@ define i32 @foo() { ; ; CHECK-LABEL: foo ; CHECK: LV(REG): VF = 8 -; CHECK-NEXT: LV(REG): Found max usage: 7 +; CHECK-NEXT: LV(REG): Found max usage: 2 item +; CHECK-NEXT: LV(REG): RegisterClass: Generic::ScalarRC, 2 registers +; CHECK-NEXT: LV(REG): RegisterClass: Generic::VectorRC, 7 registers +; CHECK-NEXT: LV(REG): Found invariant usage: 0 item ; CHECK: LV(REG): VF = 16 -; CHECK-NEXT: LV(REG): Found max usage: 13 +; CHECK-NEXT: LV(REG): Found max usage: 2 item +; CHECK-NEXT: LV(REG): RegisterClass: Generic::ScalarRC, 2 registers +; CHECK-NEXT: LV(REG): RegisterClass: Generic::VectorRC, 13 registers +; CHECK-NEXT: LV(REG): Found invariant usage: 0 item entry: br label %for.body @@ -47,9 +53,15 @@ define i32 @goo() { ; available vector register number. 
; CHECK-LABEL: goo ; CHECK: LV(REG): VF = 8 -; CHECK-NEXT: LV(REG): Found max usage: 7 +; CHECK-NEXT: LV(REG): Found max usage: 2 item +; CHECK-NEXT: LV(REG): RegisterClass: Generic::ScalarRC, 2 registers +; CHECK-NEXT: LV(REG): RegisterClass: Generic::VectorRC, 7 registers +; CHECK-NEXT: LV(REG): Found invariant usage: 0 item ; CHECK: LV(REG): VF = 16 -; CHECK-NEXT: LV(REG): Found max usage: 13 +; CHECK-NEXT: LV(REG): Found max usage: 2 item +; CHECK-NEXT: LV(REG): RegisterClass: Generic::ScalarRC, 2 registers +; CHECK-NEXT: LV(REG): RegisterClass: Generic::VectorRC, 13 registers +; CHECK-NEXT: LV(REG): Found invariant usage: 0 item entry: br label %for.body @@ -81,8 +93,11 @@ for.body: define i64 @bar(i64* nocapture %a) { ; CHECK-LABEL: bar ; CHECK: LV(REG): VF = 2 -; CHECK: LV(REG): Found max usage: 3 -; +; CHECK-NEXT: LV(REG): Found max usage: 2 item +; CHECK-NEXT: LV(REG): RegisterClass: Generic::VectorRC, 3 registers +; CHECK-NEXT: LV(REG): RegisterClass: Generic::ScalarRC, 1 registers +; CHECK-NEXT: LV(REG): Found invariant usage: 0 item + entry: br label %for.body @@ -113,8 +128,11 @@ define void @hoo(i32 %n) { ; so the max usage of AVX512 vector register will be 2. ; AVX512F-LABEL: bar ; AVX512F: LV(REG): VF = 16 -; AVX512F: LV(REG): Found max usage: 2 -; +; AVX512F-CHECK: LV(REG): Found max usage: 2 item +; AVX512F-CHECK: LV(REG): RegisterClass: Generic::ScalarRC, 2 registers +; AVX512F-CHECK: LV(REG): RegisterClass: Generic::VectorRC, 2 registers +; AVX512F-CHECK: LV(REG): Found invariant usage: 0 item + entry: br label %for.body From llvm-commits at lists.llvm.org Mon Oct 7 20:27:58 2019 From: llvm-commits at lists.llvm.org (Seiya Nuta via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 03:27:58 +0000 (UTC) Subject: [PATCH] D66281: [llvm-objcopy][MachO] Implement --strip-all In-Reply-To: References: Message-ID: seiya updated this revision to Diff 223726. seiya added a comment. Rebased. No changes intended. 
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D66281/new/ https://reviews.llvm.org/D66281 Files: llvm/docs/CommandGuide/llvm-objcopy.rst llvm/test/tools/llvm-objcopy/MachO/Inputs/strip-all-with-dwarf.yaml llvm/test/tools/llvm-objcopy/MachO/Inputs/strip-all.yaml llvm/test/tools/llvm-objcopy/MachO/strip-all.test llvm/tools/llvm-objcopy/MachO/MachOObjcopy.cpp llvm/tools/llvm-objcopy/MachO/Object.cpp llvm/tools/llvm-objcopy/MachO/Object.h -------------- next part -------------- A non-text attachment was scrubbed... Name: D66281.223726.patch Type: text/x-patch Size: 31999 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 20:30:41 2019 From: llvm-commits at lists.llvm.org (Seiya Nuta via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 03:30:41 +0000 (UTC) Subject: [PATCH] D66282: [llvm-objcopy][MachO] Implement --remove-section In-Reply-To: References: Message-ID: seiya updated this revision to Diff 223727. seiya added a comment. Rebased. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D66282/new/ https://reviews.llvm.org/D66282 Files: llvm/docs/CommandGuide/llvm-objcopy.rst llvm/test/tools/llvm-objcopy/MachO/remove-section.test llvm/tools/llvm-objcopy/MachO/MachOConfig.cpp llvm/tools/llvm-objcopy/MachO/MachOObjcopy.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D66282.223727.patch Type: text/x-patch Size: 8321 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 20:31:25 2019 From: llvm-commits at lists.llvm.org (Seiya Nuta via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 03:31:25 +0000 (UTC) Subject: [PATCH] D65541: [llvm-objcopy][MachO] Implement --only-section In-Reply-To: References: Message-ID: seiya added a comment. Friendly ping: I'd like someone who is familiar with Mach-O to review this patch. 
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D65541/new/ https://reviews.llvm.org/D65541 From llvm-commits at lists.llvm.org Mon Oct 7 20:57:36 2019 From: llvm-commits at lists.llvm.org (Stefan Schmidt via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 03:57:36 +0000 (UTC) Subject: [PATCH] D68352: [lld] Don't create hints-section if Hint/Name Table is empty In-Reply-To: References: Message-ID: <522d410a34507d7a57fde74d99d4fd7c@localhost.localdomain> thrimbor updated this revision to Diff 223728. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68352/new/ https://reviews.llvm.org/D68352 Files: lld/COFF/Writer.cpp lld/test/COFF/Inputs/ordinal-only-implib.def lld/test/COFF/imports-ordinal-only.s Index: lld/test/COFF/imports-ordinal-only.s =================================================================== --- /dev/null +++ lld/test/COFF/imports-ordinal-only.s @@ -0,0 +1,11 @@ +# REQUIRES: x86 +# +# RUN: llvm-dlltool -k -m i386 --input-def %p/Inputs/ordinal-only-implib.def --output-lib %t-implib.a +# RUN: llvm-mc -triple=i386-pc-win32 %s -filetype=obj -o %t.obj +# RUN: lld-link -out:%t.exe -entry:main -subsystem:console -safeseh:no -debug %t.obj %t-implib.a + +.text +.global _main +_main: +call _ByOrdinalFunction +ret Index: lld/test/COFF/Inputs/ordinal-only-implib.def =================================================================== --- /dev/null +++ lld/test/COFF/Inputs/ordinal-only-implib.def @@ -0,0 +1,3 @@ +LIBRARY test.dll +EXPORTS +ByOrdinalFunction @ 1 NONAME Index: lld/COFF/Writer.cpp =================================================================== --- lld/COFF/Writer.cpp +++ lld/COFF/Writer.cpp @@ -743,7 +743,8 @@ add(".idata$2", idata.dirs); add(".idata$4", idata.lookups); add(".idata$5", idata.addresses); - add(".idata$6", idata.hints); + if (!idata.hints.empty()) + add(".idata$6", idata.hints); add(".idata$7", idata.dllNames); } -------------- next part -------------- A non-text attachment was 
scrubbed... Name: D68352.223728.patch Type: text/x-patch Size: 1193 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 20:59:52 2019 From: llvm-commits at lists.llvm.org (Stefan Schmidt via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 03:59:52 +0000 (UTC) Subject: [PATCH] D68352: [lld] Don't create hints-section if Hint/Name Table is empty In-Reply-To: References: Message-ID: thrimbor added a comment. I updated the patch and included a test case. I ran the test case before and after the change to make sure it behaves and tests correctly. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68352/new/ https://reviews.llvm.org/D68352 From llvm-commits at lists.llvm.org Mon Oct 7 21:08:42 2019 From: llvm-commits at lists.llvm.org (Rui Ueyama via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 04:08:42 +0000 (UTC) Subject: [PATCH] D68352: [lld] Don't create hints-section if Hint/Name Table is empty In-Reply-To: References: Message-ID: <768ac5fea52e0e641b7fea181b0aac34@localhost.localdomain> ruiu added a comment. So, you are creating an executable that has an import table that has only ordinals. Is my understanding correct? Indeed, that condition is rare and I've thought of that case before when I implemented this part. I think your fix is correct. Do all tests still pass? Please run `ninja check-all` (or equivalent). ================ Comment at: lld/test/COFF/imports-ordinal-only.s:5 +# RUN: llvm-mc -triple=i386-pc-win32 %s -filetype=obj -o %t.obj +# RUN: lld-link -out:%t.exe -entry:main -subsystem:console -safeseh:no -debug %t.obj %t-implib.a + ---------------- I'd dump the import table to verify that a correct import table is actually created in the resulting executable. 
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68352/new/ https://reviews.llvm.org/D68352 From llvm-commits at lists.llvm.org Mon Oct 7 21:09:01 2019 From: llvm-commits at lists.llvm.org (Yonghong Song via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 04:09:01 +0000 (UTC) Subject: [PATCH] D67980: [BPF] do compile-once run-everywhere relocation for bitfields In-Reply-To: References: Message-ID: <8389b5551865558da20c30ccd36feced@localhost.localdomain> yonghong-song updated this revision to Diff 223729. yonghong-song edited the summary of this revision. yonghong-song added a comment. Separated the old FIELD_LSHIFT_U64 relocation into FIELD_LSHIFT_U64_BUF and FIELD_LSHIFT_U64_VAL. This allows better code to be generated for native loads. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67980/new/ https://reviews.llvm.org/D67980 Files: clang/include/clang/Basic/BuiltinsBPF.def clang/include/clang/Basic/DiagnosticSemaKinds.td clang/include/clang/Basic/TargetBuiltins.h clang/include/clang/Sema/Sema.h clang/include/clang/module.modulemap clang/lib/Basic/Targets/BPF.cpp clang/lib/Basic/Targets/BPF.h clang/lib/CodeGen/CGBuiltin.cpp clang/lib/CodeGen/CGExpr.cpp clang/lib/CodeGen/CodeGenFunction.h clang/lib/Sema/SemaChecking.cpp clang/test/CodeGen/builtins-bpf-preserve-field-info-1.c clang/test/CodeGen/builtins-bpf-preserve-field-info-2.c clang/test/Sema/builtins-bpf.c llvm/include/llvm/IR/IntrinsicsBPF.td llvm/lib/Target/BPF/BPF.h llvm/lib/Target/BPF/BPFAbstractMemberAccess.cpp llvm/lib/Target/BPF/BPFCORE.h llvm/lib/Target/BPF/BPFTargetMachine.cpp llvm/lib/Target/BPF/BTF.h llvm/lib/Target/BPF/BTFDebug.cpp llvm/lib/Target/BPF/BTFDebug.h llvm/test/CodeGen/BPF/CORE/intrinsic-array.ll llvm/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-byte-size-1.ll llvm/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-byte-size-2.ll llvm/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-byte-size-3.ll llvm/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-existence-1.ll 
llvm/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-existence-2.ll llvm/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-existence-3.ll llvm/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-lshift-buf-1.ll llvm/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-lshift-buf-2.ll llvm/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-lshift-buf-3.ll llvm/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-lshift-val-1.ll llvm/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-lshift-val-2.ll llvm/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-rshift-1.ll llvm/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-rshift-2.ll llvm/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-rshift-3.ll llvm/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-signedness-1.ll llvm/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-signedness-2.ll llvm/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-signedness-3.ll llvm/test/CodeGen/BPF/CORE/intrinsic-struct.ll llvm/test/CodeGen/BPF/CORE/intrinsic-union.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-access-str.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-basic.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-cast-array-1.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-cast-array-2.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-cast-struct-1.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-cast-struct-2.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-cast-struct-3.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-cast-union-1.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-cast-union-2.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-end-load.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-end-ret.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-fieldinfo-1.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-fieldinfo-2.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-global-1.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-global-2.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-global-3.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-ignore.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-middle-chain.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-multi-array-1.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-multi-array-2.ll 
llvm/test/CodeGen/BPF/CORE/offset-reloc-multilevel.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-pointer-1.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-pointer-2.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-struct-anonymous.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-struct-array.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-typedef-array.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-typedef-struct.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-typedef-union.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-typedef.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-union.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D67980.223729.patch Type: text/x-patch Size: 251958 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 21:31:58 2019 From: llvm-commits at lists.llvm.org (Muhammad Omair Javaid via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 04:31:58 +0000 (UTC) Subject: [PATCH] D66935: [AArch64][DebugInfo] Do not recompute CalleeSavedStackSize In-Reply-To: References: Message-ID: <00983f1de45b70b1e324b222820ec05d@localhost.localdomain> omjavaid reopened this revision. omjavaid added a comment. This revision is now accepted and ready to land. Hi This change was reverted and hence causes LLDB AArch64 test failures again. 
Revert action documented here : https://reviews.llvm.org/D67710 Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D66935/new/ https://reviews.llvm.org/D66935 From llvm-commits at lists.llvm.org Mon Oct 7 21:39:53 2019 From: llvm-commits at lists.llvm.org (Bill Wendling via llvm-commits) Date: Tue, 08 Oct 2019 04:39:53 -0000 Subject: [llvm] r374018 - [IA] Recognize hexadecimal escape sequences Message-ID: <20191008043953.178BA8E610@lists.llvm.org> Author: void Date: Mon Oct 7 21:39:52 2019 New Revision: 374018 URL: http://llvm.org/viewvc/llvm-project?rev=374018&view=rev Log: [IA] Recognize hexadecimal escape sequences Summary: Implement support for hexadecimal escape sequences to match how GNU 'as' handles them. I.e., read all hexadecimal characters and truncate to the lower 16 bits. Reviewers: nickdesaulniers, jcai19 Subscribers: llvm-commits, hiraditya Tags: #llvm Differential Revision: https://reviews.llvm.org/D68598 Modified: llvm/trunk/lib/MC/MCParser/AsmParser.cpp llvm/trunk/test/MC/AsmParser/directive_ascii.s Modified: llvm/trunk/lib/MC/MCParser/AsmParser.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/MC/MCParser/AsmParser.cpp?rev=374018&r1=374017&r2=374018&view=diff ============================================================================== --- llvm/trunk/lib/MC/MCParser/AsmParser.cpp (original) +++ llvm/trunk/lib/MC/MCParser/AsmParser.cpp Mon Oct 7 21:39:52 2019 @@ -2914,11 +2914,27 @@ bool AsmParser::parseEscapedString(std:: } // Recognize escaped characters. Note that this escape semantics currently - // loosely follows Darwin 'as'. Notably, it doesn't support hex escapes. + // loosely follows Darwin 'as'. ++i; if (i == e) return TokError("unexpected backslash at end of string"); + // Recognize hex sequences similarly to GNU 'as'. 
+ if (Str[i] == 'x' || Str[i] == 'X') { + size_t length = Str.size(); + if (i + 1 >= length || !isHexDigit(Str[i + 1])) + return TokError("invalid hexadecimal escape sequence"); + + // Consume hex characters. GNU 'as' reads all hexadecimal characters and + // then truncates to the lower 16 bits. Seems reasonable. + unsigned Value = 0; + while (i + 1 < length && isHexDigit(Str[i + 1])) + Value = Value * 16 + hexDigitValue(Str[++i]); + + Data += (unsigned char)(Value & 0xFF); + continue; + } + // Recognize octal sequences. if ((unsigned)(Str[i] - '0') <= 7) { // Consume up to three octal characters. Modified: llvm/trunk/test/MC/AsmParser/directive_ascii.s URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/MC/AsmParser/directive_ascii.s?rev=374018&r1=374017&r2=374018&view=diff ============================================================================== --- llvm/trunk/test/MC/AsmParser/directive_ascii.s (original) +++ llvm/trunk/test/MC/AsmParser/directive_ascii.s Mon Oct 7 21:39:52 2019 @@ -39,3 +39,8 @@ TEST5: # CHECK: .byte 0 TEST6: .string "B", "C" + +# CHECK: TEST7: +# CHECK: .ascii "dk" +TEST7: + .ascii "\x64\Xa6B" From llvm-commits at lists.llvm.org Mon Oct 7 22:08:06 2019 From: llvm-commits at lists.llvm.org (Haojian Wu via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 05:08:06 +0000 (UTC) Subject: [PATCH] D68458: [clangd] Collect missing macro references. In-Reply-To: References: Message-ID: <0caa97f921d132d5b6e427f0edf6ae0e@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rL373889: [clangd] Collect missing macro references. (authored by hokein, committed by ). Herald added a project: LLVM. Herald added a subscriber: llvm-commits. 
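As a rough, standalone illustration of the hexadecimal escape handling that r374018 adds to AsmParser::parseEscapedString: the sketch below consumes every hex digit after \x or \X and keeps only the low byte, mirroring the `Value & 0xFF` in the patch (so the code as written keeps the lower 8 bits, although the log text says 16). The names decodeHexEscapes and hexValue are made up for this example and are not LLVM APIs; std::isxdigit stands in for isHexDigit, and every other escape is simply copied through without its backslash, which the real parser does not do.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper (not an LLVM API): numeric value of one hex digit.
static unsigned hexValue(char C) {
  if (C >= '0' && C <= '9')
    return static_cast<unsigned>(C - '0');
  if (C >= 'a' && C <= 'f')
    return static_cast<unsigned>(C - 'a') + 10;
  return static_cast<unsigned>(C - 'A') + 10; // caller guarantees a hex digit
}

// Hypothetical sketch: decodes "\x"/"\X" escapes in Str into Data.
// Returns false for a trailing backslash or for "\x" with no hex digit.
bool decodeHexEscapes(const std::string &Str, std::string &Data) {
  Data.clear();
  for (std::size_t i = 0, e = Str.size(); i != e; ++i) {
    if (Str[i] != '\\') {
      Data += Str[i];
      continue;
    }
    ++i;
    if (i == e)
      return false; // unexpected backslash at end of string
    if (Str[i] == 'x' || Str[i] == 'X') {
      if (i + 1 >= e || !std::isxdigit(static_cast<unsigned char>(Str[i + 1])))
        return false; // invalid hexadecimal escape sequence
      // Read all hex digits, then keep only the low 8 bits of the value.
      unsigned Value = 0;
      while (i + 1 < e && std::isxdigit(static_cast<unsigned char>(Str[i + 1])))
        Value = Value * 16 + hexValue(Str[++i]);
      Data += static_cast<char>(Value & 0xFF);
      continue;
    }
    // Simplification for this sketch only: other escapes are copied as-is.
    Data += Str[i];
  }
  return true;
}
```

With the input from the new TEST7 case, `\x64\Xa6B`, this yields `dk`: 0x64 is 'd', and 0xa6B masked to 0x6B is 'k', which matches the `.ascii "dk"` CHECK line in directive_ascii.s.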
Changed prior to commit: https://reviews.llvm.org/D68458?vs=223191&id=223730#toc Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68458/new/ https://reviews.llvm.org/D68458 Files: clang-tools-extra/trunk/clangd/CollectMacros.h clang-tools-extra/trunk/clangd/unittests/SemanticHighlightingTests.cpp Index: clang-tools-extra/trunk/clangd/CollectMacros.h =================================================================== --- clang-tools-extra/trunk/clangd/CollectMacros.h +++ clang-tools-extra/trunk/clangd/CollectMacros.h @@ -25,7 +25,8 @@ std::vector Ranges; }; -/// Collects macro definitions and expansions in the main file. It is used to: +/// Collects macro references (e.g. definitions, expansions) in the main file. +/// It is used to: /// - collect macros in the preamble section of the main file (in Preamble.cpp) /// - collect macros after the preamble of the main file (in ParsedAST.cpp) class CollectMainFileMacros : public PPCallbacks { @@ -49,6 +50,27 @@ add(MacroName, MD.getMacroInfo()); } + void MacroUndefined(const clang::Token &MacroName, + const clang::MacroDefinition &MD, + const clang::MacroDirective *Undef) override { + add(MacroName, MD.getMacroInfo()); + } + + void Ifdef(SourceLocation Loc, const Token &MacroName, + const MacroDefinition &MD) override { + add(MacroName, MD.getMacroInfo()); + } + + void Ifndef(SourceLocation Loc, const Token &MacroName, + const MacroDefinition &MD) override { + add(MacroName, MD.getMacroInfo()); + } + + void Defined(const Token &MacroName, const MacroDefinition &MD, + SourceRange Range) override { + add(MacroName, MD.getMacroInfo()); + } + private: void add(const Token &MacroNameTok, const MacroInfo *MI) { if (!InMainFile) @@ -57,7 +79,7 @@ if (Loc.isMacroID()) return; - if (auto Range = getTokenRange(SM, LangOpts, MacroNameTok.getLocation())) { + if (auto Range = getTokenRange(SM, LangOpts, Loc)) { Out.Names.insert(MacroNameTok.getIdentifierInfo()->getName()); Out.Ranges.push_back(*Range); } Index: 
clang-tools-extra/trunk/clangd/unittests/SemanticHighlightingTests.cpp =================================================================== --- clang-tools-extra/trunk/clangd/unittests/SemanticHighlightingTests.cpp +++ clang-tools-extra/trunk/clangd/unittests/SemanticHighlightingTests.cpp @@ -476,6 +476,20 @@ $Macro[[assert]]($Variable[[x]] != $Function[[f]]()); } )cpp", + // highlighting all macro references + R"cpp( + #ifndef $Macro[[name]] + #define $Macro[[name]] + #endif + + #define $Macro[[test]] + #undef $Macro[[test]] + #ifdef $Macro[[test]] + #endif + + #if defined($Macro[[test]]) + #endif + )cpp", R"cpp( struct $Class[[S]] { $Primitive[[float]] $Field[[Value]]; -------------- next part -------------- A non-text attachment was scrubbed... Name: D68458.223730.patch Type: text/x-patch Size: 2636 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 22:08:33 2019 From: llvm-commits at lists.llvm.org (George Rimar via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 05:08:33 +0000 (UTC) Subject: [PATCH] D68383: [llvm-readelf/llvm-objdump] - Improve/refactor the implementation of SHT_LLVM_ADDRSIG section dumping. In-Reply-To: References: Message-ID: <6abfbc4dc468051feb894b9c97d9328e@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rL373890: [llvm-readelf/llvm-objdump] - Improve/refactor the implementation of… (authored by grimar, committed by ). Herald added a project: LLVM. Changed prior to commit: https://reviews.llvm.org/D68383?vs=223204&id=223731#toc Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68383/new/ https://reviews.llvm.org/D68383 Files: llvm/trunk/test/tools/llvm-readobj/all.test llvm/trunk/test/tools/llvm-readobj/elf-addrsig.test llvm/trunk/tools/llvm-readobj/ELFDumper.cpp llvm/trunk/tools/llvm-readobj/llvm-readobj.cpp -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68383.223731.patch Type: text/x-patch Size: 8236 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 22:09:15 2019 From: llvm-commits at lists.llvm.org (Rui Ueyama via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 05:09:15 +0000 (UTC) Subject: [PATCH] D10548: Teach LTOModule to emit linker flags for dllexported symbols, plus interface cleanup. In-Reply-To: References: Message-ID: <0aef701537d86954c8c9a852a2787309@localhost.localdomain> ruiu added inline comments. ================ Comment at: llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp:1038-1041 + if (TT.isKnownWindowsMSVCEnvironment()) + OS << " /EXPORT:"; + else + OS << " -export:"; ---------------- This code is not new, but I wonder if we need to distinguish MinGW and MSVC here because they both recognize `-EXPORT`. I believe in many projects the options `/foo` and `-foo` are used interchangeably. ================ Comment at: llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp:1057-1060 + if (TT.isKnownWindowsMSVCEnvironment()) + OS << ",DATA"; + else + OS << ",data"; ---------------- Ditto -- I believe both should work with `DATA`. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D10548/new/ https://reviews.llvm.org/D10548 From llvm-commits at lists.llvm.org Mon Oct 7 22:09:32 2019 From: llvm-commits at lists.llvm.org (Sam McCall via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 05:09:32 +0000 (UTC) Subject: [PATCH] D68467: [clangd] If an undocumented definition exists, don't accept documentation from other forward decls. In-Reply-To: References: Message-ID: This revision was automatically updated to reflect the committed changes. Closed by commit rL373892: [clangd] If an undocumented definition exists, don't accept documentation from… (authored by sammccall, committed by ). Herald added a project: LLVM. Herald added a subscriber: llvm-commits. 
Changed prior to commit: https://reviews.llvm.org/D68467?vs=223224&id=223732#toc Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68467/new/ https://reviews.llvm.org/D68467 Files: clang-tools-extra/trunk/clangd/index/Merge.cpp clang-tools-extra/trunk/clangd/unittests/IndexTests.cpp Index: clang-tools-extra/trunk/clangd/unittests/IndexTests.cpp =================================================================== --- clang-tools-extra/trunk/clangd/unittests/IndexTests.cpp +++ clang-tools-extra/trunk/clangd/unittests/IndexTests.cpp @@ -413,6 +413,16 @@ FileURI("unittest:///test2.cc")))))); } +TEST(MergeIndexTest, NonDocumentation) { + Symbol L, R; + L.ID = R.ID = SymbolID("x"); + L.Definition.FileURI = "file:/x.h"; + R.Documentation = "Forward declarations because x.h is too big to include"; + + Symbol M = mergeSymbol(L, R); + EXPECT_EQ(M.Documentation, ""); +} + MATCHER_P2(IncludeHeaderWithRef, IncludeHeader, References, "") { return (arg.IncludeHeader == IncludeHeader) && (arg.References == References); } Index: clang-tools-extra/trunk/clangd/index/Merge.cpp =================================================================== --- clang-tools-extra/trunk/clangd/index/Merge.cpp +++ clang-tools-extra/trunk/clangd/index/Merge.cpp @@ -186,7 +186,10 @@ S.Signature = O.Signature; if (S.CompletionSnippetSuffix == "") S.CompletionSnippetSuffix = O.CompletionSnippetSuffix; - if (S.Documentation == "") + // Don't accept documentation from bare forward declarations, if there is a + // definition and it didn't provide one. S is often an undocumented class, + // and O is a non-canonical forward decl preceded by an irrelevant comment. + if (S.Documentation == "" && !S.Definition) S.Documentation = O.Documentation; if (S.ReturnType == "") S.ReturnType = O.ReturnType; -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68467.223732.patch Type: text/x-patch Size: 1581 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 22:11:18 2019 From: llvm-commits at lists.llvm.org (Haojian Wu via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 05:11:18 +0000 (UTC) Subject: [PATCH] D68564: [clangd] Catch an unchecked "Expected" in HeaderSourceSwitch. In-Reply-To: References: Message-ID: <6f15d52224ce17541e6d91c928bc1dd8@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rL373897: [clangd] Catch an unchecked "Expected<T>" in HeaderSourceSwitch. (authored by hokein, committed by ). Herald added a project: LLVM. Herald added a subscriber: llvm-commits. Changed prior to commit: https://reviews.llvm.org/D68564?vs=223464&id=223733#toc Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68564/new/ https://reviews.llvm.org/D68564 Files: clang-tools-extra/trunk/clangd/ClangdLSPServer.cpp clang-tools-extra/trunk/clangd/ClangdServer.cpp clang-tools-extra/trunk/clangd/HeaderSourceSwitch.cpp clang-tools-extra/trunk/clangd/unittests/HeaderSourceSwitchTests.cpp Index: clang-tools-extra/trunk/clangd/ClangdServer.cpp =================================================================== --- clang-tools-extra/trunk/clangd/ClangdServer.cpp +++ clang-tools-extra/trunk/clangd/ClangdServer.cpp @@ -460,7 +460,7 @@ if (auto CorrespondingFile = getCorrespondingHeaderOrSource(Path, FSProvider.getFileSystem())) return CB(std::move(CorrespondingFile)); - auto Action = [Path, CB = std::move(CB), + auto Action = [Path = Path.str(), CB = std::move(CB), this](llvm::Expected InpAST) mutable { if (!InpAST) return CB(InpAST.takeError()); Index: clang-tools-extra/trunk/clangd/ClangdLSPServer.cpp =================================================================== --- clang-tools-extra/trunk/clangd/ClangdLSPServer.cpp +++ clang-tools-extra/trunk/clangd/ClangdLSPServer.cpp @@ -1045,7 +1045,7 @@ if (!Path) return 
Reply(Path.takeError()); if (*Path) - Reply(URIForFile::canonicalize(**Path, Params.uri.file())); + return Reply(URIForFile::canonicalize(**Path, Params.uri.file())); return Reply(llvm::None); }); } Index: clang-tools-extra/trunk/clangd/HeaderSourceSwitch.cpp =================================================================== --- clang-tools-extra/trunk/clangd/HeaderSourceSwitch.cpp +++ clang-tools-extra/trunk/clangd/HeaderSourceSwitch.cpp @@ -86,7 +86,9 @@ if (auto TargetPath = URI::resolve(TargetURI, OriginalFile)) { if (*TargetPath != OriginalFile) // exclude the original file. ++Candidates[*TargetPath]; - }; + } else { + elog("Failed to resolve URI {0}: {1}", TargetURI, TargetPath.takeError()); + } }; // If we switch from a header, we are looking for the implementation // file, so we use the definition loc; otherwise we look for the header file, Index: clang-tools-extra/trunk/clangd/unittests/HeaderSourceSwitchTests.cpp =================================================================== --- clang-tools-extra/trunk/clangd/unittests/HeaderSourceSwitchTests.cpp +++ clang-tools-extra/trunk/clangd/unittests/HeaderSourceSwitchTests.cpp @@ -125,6 +125,7 @@ Testing.HeaderCode = R"cpp( void B_Sym1(); void B_Sym2(); + void B_Sym3_NoDef(); )cpp"; Testing.Filename = "b.cpp"; Testing.Code = R"cpp( @@ -163,6 +164,12 @@ void B_Sym1(); )cpp", testPath("a.cpp")}, + + {R"cpp( + // We don't have definition in the index, so stay in the header. + void B_Sym3_NoDef(); + )cpp", + None}, }; for (const auto &Case : TestCases) { TestTU TU = TestTU::withCode(Case.HeaderCode); -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68564.223733.patch Type: text/x-patch Size: 2686 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 22:11:58 2019 From: llvm-commits at lists.llvm.org (Phabricator via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 05:11:58 +0000 (UTC) Subject: [PATCH] D68572: gn build: use better triple on windows In-Reply-To: References: Message-ID: This revision was automatically updated to reflect the committed changes. Closed by commit rL373899: gn build: use better triple on windows (authored by nico, committed by ). Changed prior to commit: https://reviews.llvm.org/D68572?vs=223539&id=223734#toc Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68572/new/ https://reviews.llvm.org/D68572 Files: llvm/trunk/utils/gn/secondary/llvm/triples.gni Index: llvm/trunk/utils/gn/secondary/llvm/triples.gni =================================================================== --- llvm/trunk/utils/gn/secondary/llvm/triples.gni +++ llvm/trunk/utils/gn/secondary/llvm/triples.gni @@ -10,7 +10,7 @@ } else if (current_os == "mac") { llvm_current_triple = "x86_64-apple-darwin" } else if (current_os == "win") { - llvm_current_triple = "x86_64-pc-windows" + llvm_current_triple = "x86_64-pc-windows-msvc" } } else if (current_cpu == "arm64") { if (current_os == "android") { -------------- next part -------------- A non-text attachment was scrubbed... Name: D68572.223734.patch Type: text/x-patch Size: 541 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 22:14:04 2019 From: llvm-commits at lists.llvm.org (Simon Atanasyan via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 05:14:04 +0000 (UTC) Subject: [PATCH] D68548: [Mips] Fix evaluating J-format branch targets In-Reply-To: References: Message-ID: <9c346c821da526286aee0dd77a3e9959@localhost.localdomain> This revision was automatically updated to reflect the committed changes. 
Closed by commit rL373906: [Mips] Fix evaluating J-format branch targets (authored by atanasyan, committed by ).

Changed prior to commit:
  https://reviews.llvm.org/D68548?vs=223401&id=223735#toc

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68548/new/

https://reviews.llvm.org/D68548

Files:
  llvm/trunk/lib/Target/Mips/MCTargetDesc/MipsMCTargetDesc.cpp
  llvm/trunk/test/MC/Mips/micromips-jump-pc-region.s
  llvm/trunk/test/MC/Mips/mips-jump-pc-region.s

Index: llvm/trunk/test/MC/Mips/micromips-jump-pc-region.s
===================================================================
--- llvm/trunk/test/MC/Mips/micromips-jump-pc-region.s
+++ llvm/trunk/test/MC/Mips/micromips-jump-pc-region.s
@@ -0,0 +1,17 @@
+# RUN: llvm-mc -triple=mips -mcpu=mips32 -mattr=+micromips -filetype=obj < %s \
+# RUN:   | llvm-objdump -d - | FileCheck %s
+
+.set noreorder
+
+# Force us into the second 256 MB region with a non-zero instruction index
+.org 256*1024*1024 + 12
+# CHECK-LABEL: 1000000c foo:
+# CHECK-NEXT: 1000000c: d4 00 00 06 j 12
+# CHECK-NEXT: 10000010: f4 00 00 08 jal 16
+# CHECK-NEXT: 10000014: f0 00 00 05 jalx 20
+# CHECK-NEXT: 10000018: 74 00 00 0c jals 24
+foo:
+  j 12
+  jal 16
+  jalx 20
+  jals 24
Index: llvm/trunk/test/MC/Mips/mips-jump-pc-region.s
===================================================================
--- llvm/trunk/test/MC/Mips/mips-jump-pc-region.s
+++ llvm/trunk/test/MC/Mips/mips-jump-pc-region.s
@@ -0,0 +1,17 @@
+# RUN: llvm-mc -triple=mips -mcpu=mips32 -filetype=obj < %s \
+# RUN:   | llvm-objdump -d - | FileCheck %s
+# RUN: llvm-mc -triple=mips64 -mcpu=mips64 -filetype=obj < %s \
+# RUN:   | llvm-objdump -d - | FileCheck %s
+
+.set noreorder
+
+# Force us into the second 256 MB region with a non-zero instruction index
+.org 256*1024*1024 + 12
+# CHECK-LABEL: 1000000c foo:
+# CHECK-NEXT: 1000000c: 08 00 00 03 j 12
+# CHECK-NEXT: 10000010: 0c 00 00 04 jal 16
+# CHECK-NEXT: 10000014: 74 00 00 05 jalx 20
+foo:
+  j 12
+  jal 16
+  jalx 20
Index:
llvm/trunk/lib/Target/Mips/MCTargetDesc/MipsMCTargetDesc.cpp
===================================================================
--- llvm/trunk/lib/Target/Mips/MCTargetDesc/MipsMCTargetDesc.cpp
+++ llvm/trunk/lib/Target/Mips/MCTargetDesc/MipsMCTargetDesc.cpp
@@ -143,12 +143,15 @@
       return false;
     switch (Info->get(Inst.getOpcode()).OpInfo[NumOps - 1].OperandType) {
     case MCOI::OPERAND_UNKNOWN:
-    case MCOI::OPERAND_IMMEDIATE:
-      // jal, bal ...
-      Target = Inst.getOperand(NumOps - 1).getImm();
+    case MCOI::OPERAND_IMMEDIATE: {
+      // j, jal, jalx, jals
+      // Absolute branch within the current 256 MB-aligned region
+      uint64_t Region = Addr & ~uint64_t(0xfffffff);
+      Target = Region + Inst.getOperand(NumOps - 1).getImm();
       return true;
+    }
     case MCOI::OPERAND_PCREL:
-      // b, j, beq ...
+      // b, beq ...
       Target = Addr + Inst.getOperand(NumOps - 1).getImm();
       return true;
     default:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68548.223735.patch
Type: text/x-patch
Size: 2715 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Mon Oct 7 22:14:07 2019
From: llvm-commits at lists.llvm.org (Simon Atanasyan via Phabricator via llvm-commits)
Date: Tue, 08 Oct 2019 05:14:07 +0000 (UTC)
Subject: [PATCH] D68542: [Mips] Always save RA when disabling frame pointer elimination
In-Reply-To:
References:
Message-ID: <92010953dc5f44e9fb97759374f0b576@localhost.localdomain>

This revision was automatically updated to reflect the committed changes.
Closed by commit rL373907: [Mips] Always save RA when disabling frame pointer elimination (authored by atanasyan, committed by ).

Changed prior to commit:
  https://reviews.llvm.org/D68542?vs=223399&id=223736#toc

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68542/new/

https://reviews.llvm.org/D68542

Files:
  llvm/trunk/lib/Target/Mips/MipsSEFrameLowering.cpp
  llvm/trunk/test/CodeGen/Mips/cconv/vector.ll
  llvm/trunk/test/CodeGen/Mips/dynamic-stack-realignment.ll
  llvm/trunk/test/CodeGen/Mips/frame-address.ll
  llvm/trunk/test/CodeGen/Mips/no-frame-pointer-elim.ll
  llvm/trunk/test/CodeGen/Mips/tnaked.ll
  llvm/trunk/test/CodeGen/Mips/v2i16tof32.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68542.223736.patch
Type: text/x-patch
Size: 36132 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Mon Oct 7 22:14:47 2019
From: llvm-commits at lists.llvm.org (Wei Mi via Phabricator via llvm-commits)
Date: Tue, 08 Oct 2019 05:14:47 +0000 (UTC)
Subject: [PATCH] D68253: [SampleFDO] Add compression support for any section in ExtBinary profile format
In-Reply-To:
References:
Message-ID:

This revision was automatically updated to reflect the committed changes.
Closed by commit rGb523790ae1b3: [SampleFDO] Add compression support for any section in ExtBinary profile format (authored by wmi).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68253/new/

https://reviews.llvm.org/D68253

Files:
  llvm/include/llvm/ProfileData/SampleProf.h
  llvm/include/llvm/ProfileData/SampleProfReader.h
  llvm/include/llvm/ProfileData/SampleProfWriter.h
  llvm/lib/ProfileData/SampleProf.cpp
  llvm/lib/ProfileData/SampleProfReader.cpp
  llvm/lib/ProfileData/SampleProfWriter.cpp
  llvm/test/Transforms/SampleProfile/compressed-profile-symbol-list.ll
  llvm/test/Transforms/SampleProfile/profile-format-compress.ll
  llvm/test/Transforms/SampleProfile/uncompressed-profile-symbol-list.ll
  llvm/test/tools/llvm-profdata/profile-symbol-list-compress.test
  llvm/test/tools/llvm-profdata/roundtrip-compress.test
  llvm/tools/llvm-profdata/llvm-profdata.cpp

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68253.223739.patch
Type: text/x-patch
Size: 34217 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Mon Oct 7 22:14:59 2019
From: llvm-commits at lists.llvm.org (Yitzhak Mandelbaum via Phabricator via llvm-commits)
Date: Tue, 08 Oct 2019 05:14:59 +0000 (UTC)
Subject: [PATCH] D68574: [libTooling] Add `toString` method to the Stencil class
In-Reply-To:
References:
Message-ID:

This revision was automatically updated to reflect the committed changes.
Closed by commit rL373916: [libTooling] Add `toString` method to the Stencil class (authored by ymandel, committed by ).
Herald added a project: LLVM.
Herald added a subscriber: llvm-commits.

Changed prior to commit:
  https://reviews.llvm.org/D68574?vs=223619&id=223741#toc

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68574/new/

https://reviews.llvm.org/D68574

Files:
  cfe/trunk/include/clang/Tooling/Refactoring/Stencil.h
  cfe/trunk/lib/Tooling/Refactoring/Stencil.cpp
  cfe/trunk/unittests/Tooling/StencilTest.cpp

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68574.223741.patch
Type: text/x-patch
Size: 5657 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Mon Oct 7 22:15:07 2019
From: llvm-commits at lists.llvm.org (Wenlei He via Phabricator via llvm-commits)
Date: Tue, 08 Oct 2019 05:15:07 +0000 (UTC)
Subject: [PATCH] D68440: [llvm-profdata] Minor format fix
In-Reply-To:
References:
Message-ID: <52aeb25dab1628d2f59e97bf16490cfd@localhost.localdomain>

This revision was automatically updated to reflect the committed changes.
Closed by commit rGb3342e180e9c: [llvm-profdata] Minor format fix (authored by wenlei).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68440/new/

https://reviews.llvm.org/D68440

Files:
  llvm/lib/ProfileData/SampleProf.cpp

Index: llvm/lib/ProfileData/SampleProf.cpp
===================================================================
--- llvm/lib/ProfileData/SampleProf.cpp
+++ llvm/lib/ProfileData/SampleProf.cpp
@@ -155,6 +155,7 @@
         FS.second.print(OS, Indent + 4);
       }
     }
+    OS.indent(Indent);
     OS << "}\n";
   } else {
     OS << "No inlined callsites in this function\n";
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68440.223742.patch
Type: text/x-patch
Size: 377 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Mon Oct 7 22:15:24 2019
From: llvm-commits at lists.llvm.org (Phabricator via Phabricator via llvm-commits)
Date: Tue, 08 Oct 2019 05:15:24 +0000 (UTC)
Subject: [PATCH] D68551: [clang-format] [NFC] Ensure clang-format is itself clang-formatted.
In-Reply-To:
References:
Message-ID: <6ac18314e4684b0466846626b31da612@localhost.localdomain>

This revision was automatically updated to reflect the committed changes.
Closed by commit rL373921: [clang-format] [NFC] Ensure clang-format is itself clang-formatted. (authored by paulhoad, committed by ).
Herald added a project: LLVM.
Herald added a subscriber: llvm-commits.

Changed prior to commit:
  https://reviews.llvm.org/D68551?vs=223408&id=223744#toc

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68551/new/

https://reviews.llvm.org/D68551

Files:
  cfe/trunk/tools/clang-format/ClangFormat.cpp

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68551.223744.patch
Type: text/x-patch
Size: 5963 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Mon Oct 7 22:15:35 2019
From: llvm-commits at lists.llvm.org (Florian Hahn via Phabricator via llvm-commits)
Date: Tue, 08 Oct 2019 05:15:35 +0000 (UTC)
Subject: [PATCH] D68571: [Remarks] Pass StringBlockValue as StringRef.
In-Reply-To:
References:
Message-ID:

This revision was automatically updated to reflect the committed changes.
Closed by commit rL373923: [Remarks] Pass StringBlockValue as StringRef. (authored by fhahn, committed by ).

Changed prior to commit:
  https://reviews.llvm.org/D68571?vs=223523&id=223747#toc

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68571/new/

https://reviews.llvm.org/D68571

Files:
  llvm/trunk/lib/Remarks/YAMLRemarkSerializer.cpp

Index: llvm/trunk/lib/Remarks/YAMLRemarkSerializer.cpp
===================================================================
--- llvm/trunk/lib/Remarks/YAMLRemarkSerializer.cpp
+++ llvm/trunk/lib/Remarks/YAMLRemarkSerializer.cpp
@@ -103,7 +103,7 @@
 /// newlines in strings.
 struct StringBlockVal {
   StringRef Value;
-  StringBlockVal(const std::string &Value) : Value(Value) {}
+  StringBlockVal(StringRef R) : Value(R) {}
 };
 
 template <> struct BlockScalarTraits {
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68571.223747.patch
Type: text/x-patch
Size: 487 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Mon Oct 7 22:15:40 2019
From: llvm-commits at lists.llvm.org (Phabricator via Phabricator via llvm-commits)
Date: Tue, 08 Oct 2019 05:15:40 +0000 (UTC)
Subject: [PATCH] D68481: [clang-format] [PR27004] omits leading space for noexcept when formatting operator delete()
In-Reply-To:
References:
Message-ID:

This revision was automatically updated to reflect the committed changes.
Closed by commit rL373922: [clang-format] [PR27004] omits leading space for noexcept when formatting… (authored by paulhoad, committed by ).
Herald added a project: LLVM.
Herald added a subscriber: llvm-commits.

Changed prior to commit:
  https://reviews.llvm.org/D68481?vs=223369&id=223745#toc

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68481/new/

https://reviews.llvm.org/D68481

Files:
  cfe/trunk/lib/Format/TokenAnnotator.cpp
  cfe/trunk/unittests/Format/FormatTest.cpp

Index: cfe/trunk/lib/Format/TokenAnnotator.cpp
===================================================================
--- cfe/trunk/lib/Format/TokenAnnotator.cpp
+++ cfe/trunk/lib/Format/TokenAnnotator.cpp
@@ -1611,6 +1611,13 @@
     if (Tok.Next->is(tok::question))
       return false;
 
+    // Functions which end with decorations like volatile, noexcept are unlikely
+    // to be casts.
+    if (Tok.Next->isOneOf(tok::kw_noexcept, tok::kw_volatile, tok::kw_const,
+                          tok::kw_throw, tok::l_square, tok::arrow,
+                          Keywords.kw_override, Keywords.kw_final))
+      return false;
+
     // As Java has no function types, a "(" after the ")" likely means that this
     // is a cast.
     if (Style.Language == FormatStyle::LK_Java && Tok.Next->is(tok::l_paren))
Index: cfe/trunk/unittests/Format/FormatTest.cpp
===================================================================
--- cfe/trunk/unittests/Format/FormatTest.cpp
+++ cfe/trunk/unittests/Format/FormatTest.cpp
@@ -14678,6 +14678,33 @@
   */
 }
 
+TEST_F(FormatTest, NotCastRPaen) {
+
+  verifyFormat("void operator++(int) noexcept;");
+  verifyFormat("void operator++(int &) noexcept;");
+  verifyFormat("void operator delete(void *, std::size_t, const std::nothrow_t "
+               "&) noexcept;");
+  verifyFormat(
+      "void operator delete(std::size_t, const std::nothrow_t &) noexcept;");
+  verifyFormat("void operator delete(const std::nothrow_t &) noexcept;");
+  verifyFormat("void operator delete(std::nothrow_t &) noexcept;");
+  verifyFormat("void operator delete(nothrow_t &) noexcept;");
+  verifyFormat("void operator delete(foo &) noexcept;");
+  verifyFormat("void operator delete(foo) noexcept;");
+  verifyFormat("void operator delete(int) noexcept;");
+  verifyFormat("void operator delete(int &) noexcept;");
+  verifyFormat("void operator delete(int &) volatile noexcept;");
+  verifyFormat("void operator delete(int &) const");
+  verifyFormat("void operator delete(int &) = default");
+  verifyFormat("void operator delete(int &) = delete");
+  verifyFormat("void operator delete(int &) [[noreturn]]");
+  verifyFormat("void operator delete(int &) throw();");
+  verifyFormat("void operator delete(int &) throw(int);");
+  verifyFormat("auto operator delete(int &) -> int;");
+  verifyFormat("auto operator delete(int &) override");
+  verifyFormat("auto operator delete(int &) final");
+}
+
 } // end namespace
 } // end namespace format
 } // end namespace clang
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68481.223745.patch
Type: text/x-patch
Size: 2502 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Mon Oct 7 22:15:50 2019
From: llvm-commits at lists.llvm.org (Kadir Cetinkaya via Phabricator via llvm-commits)
Date: Tue, 08 Oct 2019 05:15:50 +0000 (UTC)
Subject: [PATCH] D68273: [clangd] Fix raciness in code completion tests
In-Reply-To:
References:
Message-ID: <5c88ca61d66e8bd66c87dc4a8a5bed19@localhost.localdomain>

This revision was automatically updated to reflect the committed changes.
Closed by commit rL373924: [clangd] Fix raciness in code completion tests (authored by kadircet, committed by ).
Herald added a project: LLVM.
Herald added a subscriber: llvm-commits.

Changed prior to commit:
  https://reviews.llvm.org/D68273?vs=222890&id=223748#toc

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68273/new/

https://reviews.llvm.org/D68273

Files:
  clang-tools-extra/trunk/clangd/unittests/CodeCompleteTests.cpp

Index: clang-tools-extra/trunk/clangd/unittests/CodeCompleteTests.cpp
===================================================================
--- clang-tools-extra/trunk/clangd/unittests/CodeCompleteTests.cpp
+++ clang-tools-extra/trunk/clangd/unittests/CodeCompleteTests.cpp
@@ -18,6 +18,7 @@
 #include "TestFS.h"
 #include "TestIndex.h"
 #include "TestTU.h"
+#include "Threading.h"
 #include "index/Index.h"
 #include "index/MemIndex.h"
 #include "clang/Sema/CodeCompleteConsumer.h"
@@ -27,6 +28,8 @@
 #include "llvm/Testing/Support/Error.h"
 #include "gmock/gmock.h"
 #include "gtest/gtest.h"
+#include
+#include
 
 namespace clang {
 namespace clangd {
@@ -1112,8 +1115,9 @@
   bool
   fuzzyFind(const FuzzyFindRequest &Req,
             llvm::function_ref Callback) const override {
-    std::lock_guard Lock(Mut);
+    std::unique_lock Lock(Mut);
     Requests.push_back(Req);
+    ReceivedRequestCV.notify_one();
     return true;
   }
 
@@ -1131,8 +1135,10 @@
   // isn't used in production code.
   size_t estimateMemoryUsage() const override { return 0; }
 
-  const std::vector consumeRequests() const {
-    std::lock_guard Lock(Mut);
+  const std::vector consumeRequests(size_t Num) const {
+    std::unique_lock Lock(Mut);
+    EXPECT_TRUE(wait(Lock, ReceivedRequestCV, timeoutSeconds(10),
+                     [this, Num] { return Requests.size() == Num; }));
     auto Reqs = std::move(Requests);
     Requests = {};
     return Reqs;
@@ -1140,16 +1146,21 @@
 private:
   // We need a mutex to handle async fuzzy find requests.
+  mutable std::condition_variable ReceivedRequestCV;
   mutable std::mutex Mut;
   mutable std::vector Requests;
 };
 
-std::vector captureIndexRequests(llvm::StringRef Code) {
+// Clients have to consume exactly Num requests.
+std::vector captureIndexRequests(llvm::StringRef Code,
+                                 size_t Num = 1) {
   clangd::CodeCompleteOptions Opts;
   IndexRequestCollector Requests;
   Opts.Index = &Requests;
   completions(Code, {}, Opts);
-  return Requests.consumeRequests();
+  const auto Reqs = Requests.consumeRequests(Num);
+  EXPECT_EQ(Reqs.size(), Num);
+  return Reqs;
 }
 
 TEST(CompletionTest, UnqualifiedIdQuery) {
@@ -2098,18 +2109,15 @@
   auto CompleteAtPoint = [&](StringRef P) {
     cantFail(runCodeComplete(Server, File, Test.point(P), Opts));
-    // Sleep for a while to make sure asynchronous call (if applicable) is also
-    // triggered before callback is invoked.
-    std::this_thread::sleep_for(std::chrono::milliseconds(100));
   };
 
   CompleteAtPoint("1");
-  auto Reqs1 = Requests.consumeRequests();
+  auto Reqs1 = Requests.consumeRequests(1);
   ASSERT_EQ(Reqs1.size(), 1u);
   EXPECT_THAT(Reqs1[0].Scopes, UnorderedElementsAre("ns1::"));
 
   CompleteAtPoint("2");
-  auto Reqs2 = Requests.consumeRequests();
+  auto Reqs2 = Requests.consumeRequests(1);
   // Speculation succeeded. Used speculative index result.
   ASSERT_EQ(Reqs2.size(), 1u);
   EXPECT_EQ(Reqs2[0], Reqs1[0]);
@@ -2117,7 +2125,7 @@
   CompleteAtPoint("3");
   // Speculation failed. Sent speculative index request and the new index
   // request after sema.
-  auto Reqs3 = Requests.consumeRequests();
+  auto Reqs3 = Requests.consumeRequests(2);
   ASSERT_EQ(Reqs3.size(), 2u);
 }
 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68273.223748.patch
Type: text/x-patch
Size: 3469 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Mon Oct 7 22:16:07 2019
From: llvm-commits at lists.llvm.org (Phabricator via Phabricator via llvm-commits)
Date: Tue, 08 Oct 2019 05:16:07 +0000 (UTC)
Subject: [PATCH] D68422: [DWARFASTParserClang] Factor out structure-like type parsing, NFC
In-Reply-To:
References:
Message-ID: <007744fb68510e63cac66470ea274901@localhost.localdomain>

This revision was automatically updated to reflect the committed changes.
Closed by commit rL373927: [DWARFASTParserClang] Factor out structure-like type parsing, NFC (authored by vedantk, committed by ).
Herald added a project: LLVM.
Herald added a subscriber: llvm-commits.

Changed prior to commit:
  https://reviews.llvm.org/D68422?vs=223357&id=223749#toc

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68422/new/

https://reviews.llvm.org/D68422

Files:
  lldb/trunk/source/Plugins/SymbolFile/DWARF/DWARFASTParser.h
  lldb/trunk/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
  lldb/trunk/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.h
  lldb/trunk/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68422.223749.patch
Type: text/x-patch
Size: 43647 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Mon Oct 7 22:16:19 2019
From: llvm-commits at lists.llvm.org (Erich Keane via Phabricator via llvm-commits)
Date: Tue, 08 Oct 2019 05:16:19 +0000 (UTC)
Subject: [PATCH] D68584: Fix Calling Convention through aliases
In-Reply-To:
References:
Message-ID:

This revision was automatically updated to reflect the committed changes.
Closed by commit rL373929: Fix Calling Convention through aliases (authored by erichkeane, committed by ).

Changed prior to commit:
  https://reviews.llvm.org/D68584?vs=223617&id=223750#toc

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68584/new/

https://reviews.llvm.org/D68584

Files:
  cfe/trunk/lib/CodeGen/CGDeclCXX.cpp
  cfe/trunk/test/CodeGenCXX/call-conv-thru-alias.cpp
  llvm/trunk/include/llvm/IR/Value.h
  llvm/trunk/lib/IR/Value.cpp

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68584.223750.patch
Type: text/x-patch
Size: 4139 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Mon Oct 7 22:16:27 2019
From: llvm-commits at lists.llvm.org (Kostya Kortchinsky via Phabricator via llvm-commits)
Date: Tue, 08 Oct 2019 05:16:27 +0000 (UTC)
Subject: [PATCH] D68471: [scudo][standalone] Correct releaseToOS behavior
In-Reply-To:
References:
Message-ID: <9a3da303826afcacb6ae113714c1a54a@localhost.localdomain>

This revision was automatically updated to reflect the committed changes.
cryptoad marked an inline comment as done.
Closed by commit rL373930: [scudo][standalone] Correct releaseToOS behavior (authored by cryptoad, committed by ).

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68471/new/

https://reviews.llvm.org/D68471

Files:
  compiler-rt/trunk/lib/scudo/standalone/primary32.h
  compiler-rt/trunk/lib/scudo/standalone/primary64.h
  compiler-rt/trunk/lib/scudo/standalone/tests/primary_test.cpp

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68471.223751.patch
Type: text/x-patch
Size: 9756 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Mon Oct 7 22:16:51 2019
From: llvm-commits at lists.llvm.org (Michał Górny via Phabricator via llvm-commits)
Date: Tue, 08 Oct 2019 05:16:51 +0000 (UTC)
Subject: [PATCH] D68412: [clang] [cmake] Support LLVM_DISTRIBUTION_COMPONENTS in stand-alone build
In-Reply-To:
References:
Message-ID:

This revision was automatically updated to reflect the committed changes.
Closed by commit rL373936: [clang] [cmake] Support LLVM_DISTRIBUTION_COMPONENTS in stand-alone build (authored by mgorny, committed by ).
Herald added a project: LLVM.
Herald added a subscriber: llvm-commits.

Changed prior to commit:
  https://reviews.llvm.org/D68412?vs=223064&id=223753#toc

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68412/new/

https://reviews.llvm.org/D68412

Files:
  cfe/trunk/CMakeLists.txt

Index: cfe/trunk/CMakeLists.txt
===================================================================
--- cfe/trunk/CMakeLists.txt
+++ cfe/trunk/CMakeLists.txt
@@ -114,6 +114,7 @@
   include(TableGen)
   include(HandleLLVMOptions)
   include(VersionFromVCS)
+  include(LLVMDistributionSupport)
 
   set(PACKAGE_VERSION "${LLVM_PACKAGE_VERSION}")
 
@@ -858,6 +859,10 @@
 endif()
 add_subdirectory(utils/hmaptool)
 
+if(CLANG_BUILT_STANDALONE)
+  llvm_distribution_add_targets()
+endif()
+
 configure_file(
   ${CLANG_SOURCE_DIR}/include/clang/Config/config.h.cmake
   ${CLANG_BINARY_DIR}/include/clang/Config/config.h)
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68412.223753.patch
Type: text/x-patch
Size: 616 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Mon Oct 7 22:18:36 2019
From: llvm-commits at lists.llvm.org (Amy Huang via Phabricator via llvm-commits)
Date: Tue, 08 Oct 2019 05:18:36 +0000 (UTC)
Subject: [PATCH] D68114: Fix for expanding __pragmas in macro arguments
In-Reply-To:
References:
Message-ID: <2d38683f50105f7a0bf55263bbc85cca@localhost.localdomain>

This revision was automatically updated to reflect the committed changes.
Closed by commit rL373950: Fix for expanding __pragmas in macro arguments (authored by akhuang, committed by ).
Herald added a project: LLVM.
Herald added a subscriber: llvm-commits.

Changed prior to commit:
  https://reviews.llvm.org/D68114?vs=223255&id=223754#toc

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68114/new/

https://reviews.llvm.org/D68114

Files:
  cfe/trunk/lib/Lex/Pragma.cpp
  cfe/trunk/test/Preprocessor/pragma_microsoft.c

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68114.223754.patch
Type: text/x-patch
Size: 5206 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Mon Oct 7 22:19:51 2019
From: llvm-commits at lists.llvm.org (Phabricator via Phabricator via llvm-commits)
Date: Tue, 08 Oct 2019 05:19:51 +0000 (UTC)
Subject: [PATCH] D68588: [Bitcode] Update naming of UNOP_NEG to UNOP_FNEG
In-Reply-To:
References:
Message-ID:

This revision was automatically updated to reflect the committed changes.
Closed by commit rL373958: [Bitcode] Update naming of UNOP_NEG to UNOP_FNEG (authored by mcinally, committed by ).

Changed prior to commit:
  https://reviews.llvm.org/D68588?vs=223637&id=223755#toc

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68588/new/

https://reviews.llvm.org/D68588

Files:
  llvm/trunk/include/llvm/Bitcode/LLVMBitCodes.h
  llvm/trunk/lib/Bitcode/Reader/BitcodeReader.cpp
  llvm/trunk/lib/Bitcode/Writer/BitcodeWriter.cpp

Index: llvm/trunk/lib/Bitcode/Reader/BitcodeReader.cpp
===================================================================
--- llvm/trunk/lib/Bitcode/Reader/BitcodeReader.cpp
+++ llvm/trunk/lib/Bitcode/Reader/BitcodeReader.cpp
@@ -1063,7 +1063,7 @@
   switch (Val) {
   default:
     return -1;
-  case bitc::UNOP_NEG:
+  case bitc::UNOP_FNEG:
     return IsFP ? Instruction::FNeg : -1;
   }
 }
Index: llvm/trunk/lib/Bitcode/Writer/BitcodeWriter.cpp
===================================================================
--- llvm/trunk/lib/Bitcode/Writer/BitcodeWriter.cpp
+++ llvm/trunk/lib/Bitcode/Writer/BitcodeWriter.cpp
@@ -520,7 +520,7 @@
 static unsigned getEncodedUnaryOpcode(unsigned Opcode) {
   switch (Opcode) {
   default: llvm_unreachable("Unknown binary instruction!");
-  case Instruction::FNeg: return bitc::UNOP_NEG;
+  case Instruction::FNeg: return bitc::UNOP_FNEG;
   }
 }
 
Index: llvm/trunk/include/llvm/Bitcode/LLVMBitCodes.h
===================================================================
--- llvm/trunk/include/llvm/Bitcode/LLVMBitCodes.h
+++ llvm/trunk/include/llvm/Bitcode/LLVMBitCodes.h
@@ -391,7 +391,7 @@
 /// have no fixed relation to the LLVM IR enum values. Changing these will
 /// break compatibility with old files.
 enum UnaryOpcodes {
-  UNOP_NEG = 0
+  UNOP_FNEG = 0
 };
 
 /// BinaryOpcodes - These are values used in the bitcode files to encode which
-------------- next part --------------
A non-text attachment was scrubbed...
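The D68588 diff above renames the enumerator but keeps its on-disk value at 0, which is what the header comment about compatibility with old files requires. The following is a minimal sketch of that writer/reader pairing under stated assumptions: the `ir` and `bitc` namespaces here are hypothetical stand-ins, the value chosen for `ir::FNeg` is arbitrary, and the encoder returns -1 for unknown opcodes where the real writer calls `llvm_unreachable`.

```cpp
#include <cassert>

// Hypothetical in-memory opcode numbering; the values are arbitrary.
namespace ir {
enum Opcode { FNeg = 12, Add = 13 };
}

// Hypothetical on-disk numbering. Renaming the enumerator is safe as long
// as the numeric value written to files stays the same.
namespace bitc {
enum UnaryOpcodes { UNOP_FNEG = 0 };
}

// Writer side: map an in-memory opcode to its on-disk code, or -1 if the
// opcode is not a known unary operation (a deviation from the real writer).
int encodeUnaryOpcode(int Opcode) {
  switch (Opcode) {
  default:
    return -1;
  case ir::FNeg:
    return bitc::UNOP_FNEG;
  }
}

// Reader side: map an on-disk code back to an in-memory opcode. FNeg is
// only valid for floating-point operands, so other types are rejected.
int decodeUnaryOpcode(unsigned Val, bool IsFP) {
  switch (Val) {
  default:
    return -1;
  case bitc::UNOP_FNEG:
    return IsFP ? ir::FNeg : -1;
  }
}
```

Encoding `ir::FNeg` and decoding the result with `IsFP` set round-trips to `ir::FNeg`. Decoding with `IsFP` unset, or decoding an unknown code, yields -1.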
Name: D68588.223755.patch
Type: text/x-patch
Size: 1395 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Mon Oct 7 22:20:01 2019
From: llvm-commits at lists.llvm.org (Phabricator via Phabricator via llvm-commits)
Date: Tue, 08 Oct 2019 05:20:01 +0000 (UTC)
Subject: [PATCH] D68470: [InstCombine][NFC] dropRedundantMaskingOfLeftShiftInput(): change how we deal with mask
In-Reply-To:
References:
Message-ID:

This revision was automatically updated to reflect the committed changes.
Closed by commit rL373961: [InstCombine][NFC] dropRedundantMaskingOfLeftShiftInput(): change how we deal… (authored by lebedevri, committed by ).

Changed prior to commit:
  https://reviews.llvm.org/D68470?vs=223332&id=223757#toc

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68470/new/

https://reviews.llvm.org/D68470

Files:
  llvm/trunk/lib/Transforms/InstCombine/InstCombineShifts.cpp

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68470.223757.patch
Type: text/x-patch
Size: 8168 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Mon Oct 7 22:20:02 2019
From: llvm-commits at lists.llvm.org (Phabricator via Phabricator via llvm-commits)
Date: Tue, 08 Oct 2019 05:20:02 +0000 (UTC)
Subject: [PATCH] D68239: [InstCombine] dropRedundantMaskingOfLeftShiftInput(): propagate undef shift amounts
In-Reply-To:
References:
Message-ID: <2bd726ef40bcf5f09f314efae23d6476@localhost.localdomain>

This revision was automatically updated to reflect the committed changes.
Closed by commit rL373960: [InstCombine] dropRedundantMaskingOfLeftShiftInput(): propagate undef shift… (authored by lebedevri, committed by ).

Changed prior to commit:
  https://reviews.llvm.org/D68239?vs=223331&id=223758#toc

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68239/new/

https://reviews.llvm.org/D68239

Files:
  llvm/trunk/lib/Transforms/InstCombine/InstCombineShifts.cpp
  llvm/trunk/test/Transforms/InstCombine/partally-redundant-left-shift-input-masking-variant-a.ll
  llvm/trunk/test/Transforms/InstCombine/partally-redundant-left-shift-input-masking-variant-b.ll
  llvm/trunk/test/Transforms/InstCombine/partally-redundant-left-shift-input-masking-variant-c.ll
  llvm/trunk/test/Transforms/InstCombine/partally-redundant-left-shift-input-masking-variant-d.ll
  llvm/trunk/test/Transforms/InstCombine/partally-redundant-left-shift-input-masking-variant-e.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68239.223758.patch
Type: text/x-patch
Size: 8044 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Mon Oct 7 22:20:34 2019
From: llvm-commits at lists.llvm.org (Johannes Doerfert via Phabricator via llvm-commits)
Date: Tue, 08 Oct 2019 05:20:34 +0000 (UTC)
Subject: [PATCH] D67384: [Attributor] Deduce memory behavior
In-Reply-To:
References:
Message-ID:

This revision was automatically updated to reflect the committed changes.
Closed by commit rL373965: [Attributor] Deduce memory behavior of functions and arguments (authored by jdoerfert, committed by ).

Changed prior to commit:
  https://reviews.llvm.org/D67384?vs=219470&id=223759#toc

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D67384/new/

https://reviews.llvm.org/D67384

Files:
  llvm/trunk/include/llvm/Transforms/IPO/Attributor.h
  llvm/trunk/lib/Transforms/IPO/Attributor.cpp
  llvm/trunk/test/Transforms/FunctionAttrs/align.ll
  llvm/trunk/test/Transforms/FunctionAttrs/arg_nocapture.ll
  llvm/trunk/test/Transforms/FunctionAttrs/arg_returned.ll
  llvm/trunk/test/Transforms/FunctionAttrs/dereferenceable.ll
  llvm/trunk/test/Transforms/FunctionAttrs/internal-noalias.ll
  llvm/trunk/test/Transforms/FunctionAttrs/liveness.ll
  llvm/trunk/test/Transforms/FunctionAttrs/noalias_returned.ll
  llvm/trunk/test/Transforms/FunctionAttrs/nocapture.ll
  llvm/trunk/test/Transforms/FunctionAttrs/nofree-attributor.ll
  llvm/trunk/test/Transforms/FunctionAttrs/nonnull.ll
  llvm/trunk/test/Transforms/FunctionAttrs/norecurse.ll
  llvm/trunk/test/Transforms/FunctionAttrs/nosync.ll
  llvm/trunk/test/Transforms/FunctionAttrs/read_write_returned_arguments_scc.ll
  llvm/trunk/test/Transforms/FunctionAttrs/readattrs.ll
  llvm/trunk/test/Transforms/FunctionAttrs/willreturn.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D67384.223759.patch
Type: text/x-patch
Size: 81288 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Mon Oct 7 22:20:46 2019
From: llvm-commits at lists.llvm.org (Heejin Ahn via Phabricator via llvm-commits)
Date: Tue, 08 Oct 2019 05:20:46 +0000 (UTC)
Subject: [PATCH] D68553: [WebAssembly] Add memory intrinsics handling to mayThrow()
In-Reply-To:
References:
Message-ID:

This revision was automatically updated to reflect the committed changes.
Closed by commit rL373967: [WebAssembly] Add memory intrinsics handling to mayThrow() (authored by aheejin, committed by ).

Changed prior to commit:
  https://reviews.llvm.org/D68553?vs=223417&id=223760#toc

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68553/new/

https://reviews.llvm.org/D68553

Files:
  llvm/trunk/lib/Target/WebAssembly/WebAssemblyUtilities.cpp
  llvm/trunk/test/CodeGen/WebAssembly/cfg-stackify-eh.ll

Index: llvm/trunk/lib/Target/WebAssembly/WebAssemblyUtilities.cpp
===================================================================
--- llvm/trunk/lib/Target/WebAssembly/WebAssemblyUtilities.cpp
+++ llvm/trunk/lib/Target/WebAssembly/WebAssemblyUtilities.cpp
@@ -50,7 +50,21 @@
     return false;
 
   const MachineOperand &MO = MI.getOperand(getCalleeOpNo(MI.getOpcode()));
-  assert(MO.isGlobal());
+  assert(MO.isGlobal() || MO.isSymbol());
+
+  if (MO.isSymbol()) {
+    // Some intrinsics are lowered to calls to external symbols, which are then
+    // lowered to calls to library functions. Most of libcalls don't throw, but
+    // we only list some of them here now.
+    // TODO Consider adding 'nounwind' info in TargetLowering::CallLoweringInfo
+    // instead for more accurate info.
+    const char *Name = MO.getSymbolName();
+    if (strcmp(Name, "memcpy") == 0 || strcmp(Name, "memmove") == 0 ||
+        strcmp(Name, "memset") == 0)
+      return false;
+    return true;
+  }
+
   const auto *F = dyn_cast(MO.getGlobal());
   if (!F)
     return true;
Index: llvm/trunk/test/CodeGen/WebAssembly/cfg-stackify-eh.ll
===================================================================
--- llvm/trunk/test/CodeGen/WebAssembly/cfg-stackify-eh.ll
+++ llvm/trunk/test/CodeGen/WebAssembly/cfg-stackify-eh.ll
@@ -664,11 +664,51 @@
   ret void
 }
 
+%class.Object = type { i8 }
+
+; Intrinsics like memcpy, memmove, and memset don't throw and are lowered into
+; calls to external symbols (not global addresses) in instruction selection,
+; which will be eventually lowered to library function calls.
+; Because this test runs with -wasm-disable-ehpad-sort, these library calls in +; invoke.cont BB fall within try~end_try, but they shouldn't cause crashes or +; unwinding destination mismatches in CFGStackify. + +; NOSORT-LABEL: test10 +; NOSORT: try +; NOSORT: call foo +; NOSORT: i32.call {{.*}} memcpy +; NOSORT: i32.call {{.*}} memmove +; NOSORT: i32.call {{.*}} memset +; NOSORT: return +; NOSORT: catch +; NOSORT: rethrow +; NOSORT: end_try +define void @test10(i8* %a, i8* %b) personality i8* bitcast (i32 (...)* @__gxx_wasm_personality_v0 to i8*) { +entry: + %o = alloca %class.Object, align 1 + invoke void @foo() + to label %invoke.cont unwind label %ehcleanup + +invoke.cont: ; preds = %entry + call void @llvm.memcpy.p0i8.p0i8.i32(i8* %a, i8* %b, i32 100, i1 false) + call void @llvm.memmove.p0i8.p0i8.i32(i8* %a, i8* %b, i32 100, i1 false) + call void @llvm.memset.p0i8.i32(i8* %a, i8 0, i32 100, i1 false) + %call = call %class.Object* @_ZN6ObjectD2Ev(%class.Object* %o) #1 + ret void + +ehcleanup: ; preds = %entry + %0 = cleanuppad within none [] + %call2 = call %class.Object* @_ZN6ObjectD2Ev(%class.Object* %o) #1 [ "funclet"(token %0) ] + cleanupret from %0 unwind to caller +} + declare void @foo() declare void @bar() declare i32 @baz() ; Function Attrs: nounwind declare void @nothrow(i32) #0 +; Function Attrs: nounwind +declare %class.Object* @_ZN6ObjectD2Ev(%class.Object* returned) #0 declare i32 @__gxx_wasm_personality_v0(...) 
declare i8* @llvm.wasm.get.exception(token) declare i32 @llvm.wasm.get.ehselector(token) @@ -678,5 +718,11 @@ declare void @__cxa_end_catch() declare void @__clang_call_terminate(i8*) declare void @_ZSt9terminatev() +; Function Attrs: nounwind +declare void @llvm.memcpy.p0i8.p0i8.i32(i8* noalias nocapture writeonly, i8* noalias nocapture readonly, i32, i1 immarg) #0 +; Function Attrs: nounwind +declare void @llvm.memmove.p0i8.p0i8.i32(i8* nocapture, i8* nocapture readonly, i32, i1 immarg) #0 +; Function Attrs: nounwind +declare void @llvm.memset.p0i8.i32(i8* nocapture writeonly, i8, i32, i1 immarg) #0 attributes #0 = { nounwind } -------------- next part -------------- A non-text attachment was scrubbed... Name: D68553.223760.patch Type: text/x-patch Size: 3837 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 22:21:39 2019 From: llvm-commits at lists.llvm.org (Heejin Ahn via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 05:21:39 +0000 (UTC) Subject: [PATCH] D68594: [llvm-lipo] Add TextAPI to LINK_COMPONENTS In-Reply-To: References: Message-ID: <73b5c568a27124c747aa8c61f4bd9241@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rL373974: [llvm-lipo] Add TextAPI to LINK_COMPONENTS (authored by aheejin, committed by ). Changed prior to commit: https://reviews.llvm.org/D68594?vs=223657&id=223761#toc Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68594/new/ https://reviews.llvm.org/D68594 Files: llvm/trunk/tools/llvm-lipo/CMakeLists.txt Index: llvm/trunk/tools/llvm-lipo/CMakeLists.txt =================================================================== --- llvm/trunk/tools/llvm-lipo/CMakeLists.txt +++ llvm/trunk/tools/llvm-lipo/CMakeLists.txt @@ -3,6 +3,7 @@ Object Option Support + TextAPI ) set(LLVM_TARGET_DEFINITIONS LipoOpts.td) -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68594.223761.patch Type: text/x-patch Size: 314 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 22:21:47 2019 From: llvm-commits at lists.llvm.org (Heejin Ahn via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 05:21:47 +0000 (UTC) Subject: [PATCH] D68552: [WebAssembly] Fix unwind mismatch stat computation In-Reply-To: References: Message-ID: <76bc1f94c47b5acfde9d899c200339ad@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rL373975: [WebAssembly] Fix unwind mismatch stat computation (authored by aheejin, committed by ). Changed prior to commit: https://reviews.llvm.org/D68552?vs=223418&id=223762#toc Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68552/new/ https://reviews.llvm.org/D68552 Files: llvm/trunk/lib/Target/WebAssembly/WebAssemblyCFGStackify.cpp llvm/trunk/test/CodeGen/WebAssembly/cfg-stackify-eh.ll Index: llvm/trunk/test/CodeGen/WebAssembly/cfg-stackify-eh.ll =================================================================== --- llvm/trunk/test/CodeGen/WebAssembly/cfg-stackify-eh.ll +++ llvm/trunk/test/CodeGen/WebAssembly/cfg-stackify-eh.ll @@ -1,6 +1,7 @@ ; RUN: llc < %s -disable-wasm-fallthrough-return-opt -wasm-disable-explicit-locals -wasm-keep-registers -disable-block-placement -verify-machineinstrs -fast-isel=false -machine-sink-split-probability-threshold=0 -cgp-freq-ratio-to-skip-merge=1000 -exception-model=wasm -mattr=+exception-handling | FileCheck %s ; RUN: llc < %s -O0 -disable-wasm-fallthrough-return-opt -wasm-disable-explicit-locals -wasm-keep-registers -verify-machineinstrs -exception-model=wasm -mattr=+exception-handling | FileCheck %s --check-prefix=NOOPT ; RUN: llc < %s -disable-wasm-fallthrough-return-opt -wasm-disable-explicit-locals -wasm-keep-registers -disable-block-placement -verify-machineinstrs -fast-isel=false -machine-sink-split-probability-threshold=0 -cgp-freq-ratio-to-skip-merge=1000 
-exception-model=wasm -mattr=+exception-handling -wasm-disable-ehpad-sort | FileCheck %s --check-prefix=NOSORT +; RUN: llc < %s -disable-wasm-fallthrough-return-opt -wasm-disable-explicit-locals -wasm-keep-registers -disable-block-placement -verify-machineinstrs -fast-isel=false -machine-sink-split-probability-threshold=0 -cgp-freq-ratio-to-skip-merge=1000 -exception-model=wasm -mattr=+exception-handling -wasm-disable-ehpad-sort -stats 2>&1 | FileCheck %s --check-prefix=NOSORT-STAT target datalayout = "e-m:e-p:32:32-i64:64-n32:64-S128" target triple = "wasm32-unknown-unknown" @@ -702,6 +703,9 @@ cleanupret from %0 unwind to caller } +; Check if the unwind destination mismatch stats are correct +; NOSORT-STAT: 11 wasm-cfg-stackify - Number of EH pad unwind mismatches found + declare void @foo() declare void @bar() declare i32 @baz() Index: llvm/trunk/lib/Target/WebAssembly/WebAssemblyCFGStackify.cpp =================================================================== --- llvm/trunk/lib/Target/WebAssembly/WebAssemblyCFGStackify.cpp +++ llvm/trunk/lib/Target/WebAssembly/WebAssemblyCFGStackify.cpp @@ -848,7 +848,7 @@ SmallVector EHPadStack; // Range of intructions to be wrapped in a new nested try/catch using TryRange = std::pair; - // In original CFG, + // In original CFG, DenseMap> UnwindDestToTryRanges; // In new CFG, DenseMap> BrDestToTryRanges; @@ -985,7 +985,7 @@ // ... // cont: for (auto &P : UnwindDestToTryRanges) { - NumUnwindMismatches++; + NumUnwindMismatches += P.second.size(); // This means the destination is the appendix BB, which was separately // handled above. @@ -1300,7 +1300,9 @@ } } // Fix mismatches in unwind destinations induced by linearizing the code. 
- fixUnwindMismatches(MF); + if (MCAI->getExceptionHandlingType() == ExceptionHandling::Wasm && + MF.getFunction().hasPersonalityFn()) + fixUnwindMismatches(MF); } void WebAssemblyCFGStackify::rewriteDepthImmediates(MachineFunction &MF) { -------------- next part -------------- A non-text attachment was scrubbed... Name: D68552.223762.patch Type: text/x-patch Size: 3368 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 22:22:01 2019 From: llvm-commits at lists.llvm.org (Reid Kleckner via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 05:22:01 +0000 (UTC) Subject: [PATCH] D67855: [X86] Add new calling convention that guarantees tail call optimization In-Reply-To: References: Message-ID: <4441605157b61e0f80d80c0c013e7575@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rL373976: [X86] Add new calling convention that guarantees tail call optimization (authored by rnk, committed by ). Changed prior to commit: https://reviews.llvm.org/D67855?vs=221788&id=223763#toc Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67855/new/ https://reviews.llvm.org/D67855 Files: llvm/trunk/docs/BitCodeFormat.rst llvm/trunk/docs/CodeGenerator.rst llvm/trunk/docs/LangRef.rst llvm/trunk/include/llvm/IR/CallingConv.h llvm/trunk/lib/AsmParser/LLLexer.cpp llvm/trunk/lib/AsmParser/LLParser.cpp llvm/trunk/lib/AsmParser/LLToken.h llvm/trunk/lib/CodeGen/Analysis.cpp llvm/trunk/lib/IR/AsmWriter.cpp llvm/trunk/lib/Target/X86/X86CallingConv.td llvm/trunk/lib/Target/X86/X86FastISel.cpp llvm/trunk/lib/Target/X86/X86FrameLowering.cpp llvm/trunk/lib/Target/X86/X86ISelLowering.cpp llvm/trunk/lib/Target/X86/X86Subtarget.h llvm/trunk/test/CodeGen/X86/musttail-tailcc.ll llvm/trunk/test/CodeGen/X86/tailcall-tailcc.ll llvm/trunk/test/CodeGen/X86/tailcc-calleesave.ll llvm/trunk/test/CodeGen/X86/tailcc-disable-tail-calls.ll llvm/trunk/test/CodeGen/X86/tailcc-fastcc.ll 
llvm/trunk/test/CodeGen/X86/tailcc-fastisel.ll llvm/trunk/test/CodeGen/X86/tailcc-largecode.ll llvm/trunk/test/CodeGen/X86/tailcc-stackalign.ll llvm/trunk/test/CodeGen/X86/tailcc-structret.ll llvm/trunk/test/CodeGen/X86/tailccbyval.ll llvm/trunk/test/CodeGen/X86/tailccbyval64.ll llvm/trunk/test/CodeGen/X86/tailccfp.ll llvm/trunk/test/CodeGen/X86/tailccfp2.ll llvm/trunk/test/CodeGen/X86/tailccpic1.ll llvm/trunk/test/CodeGen/X86/tailccpic2.ll llvm/trunk/test/CodeGen/X86/tailccstack64.ll llvm/trunk/utils/vim/syntax/llvm.vim -------------- next part -------------- A non-text attachment was scrubbed... Name: D67855.223763.patch Type: text/x-patch Size: 40221 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 22:22:09 2019 From: llvm-commits at lists.llvm.org (Jan Korous via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 05:22:09 +0000 (UTC) Subject: [PATCH] D67742: Add VFS support for sanitizers' blacklist In-Reply-To: References: Message-ID: This revision was automatically updated to reflect the committed changes. Closed by commit rL373977: Add VFS support for sanitizers' blacklist (authored by jkorous, committed by ). Herald added a project: LLVM. Herald added a subscriber: llvm-commits. 
Changed prior to commit: https://reviews.llvm.org/D67742?vs=223655&id=223765#toc Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67742/new/ https://reviews.llvm.org/D67742 Files: cfe/trunk/lib/AST/ASTContext.cpp cfe/trunk/test/CodeGen/Inputs/sanitizer-blacklist-vfsoverlay.yaml cfe/trunk/test/CodeGen/ubsan-blacklist.c Index: cfe/trunk/test/CodeGen/ubsan-blacklist.c =================================================================== --- cfe/trunk/test/CodeGen/ubsan-blacklist.c +++ cfe/trunk/test/CodeGen/ubsan-blacklist.c @@ -5,6 +5,17 @@ // RUN: %clang_cc1 -fsanitize=unsigned-integer-overflow -fsanitize-blacklist=%t-func.blacklist -emit-llvm %s -o - | FileCheck %s --check-prefix=FUNC // RUN: %clang_cc1 -fsanitize=unsigned-integer-overflow -fsanitize-blacklist=%t-file.blacklist -emit-llvm %s -o - | FileCheck %s --check-prefix=FILE +// RUN: rm -f %t-vfsoverlay.yaml +// RUN: rm -f %t-nonexistent.blacklist +// RUN: sed -e "s|@DIR@|%T|g" %S/Inputs/sanitizer-blacklist-vfsoverlay.yaml | sed -e "s|@REAL_FILE@|%t-func.blacklist|g" | sed -e "s|@NONEXISTENT_FILE@|%t-nonexistent.blacklist|g" > %t-vfsoverlay.yaml +// RUN: %clang_cc1 -fsanitize=unsigned-integer-overflow -ivfsoverlay %t-vfsoverlay.yaml -fsanitize-blacklist=%T/only-virtual-file.blacklist -emit-llvm %s -o - | FileCheck %s --check-prefix=FUNC + +// RUN: not %clang_cc1 -fsanitize=unsigned-integer-overflow -ivfsoverlay %t-vfsoverlay.yaml -fsanitize-blacklist=%T/invalid-virtual-file.blacklist -emit-llvm %s -o - 2>&1 | FileCheck %s --check-prefix=INVALID-MAPPED-FILE +// INVALID-MAPPED-FILE: invalid-virtual-file.blacklist': No such file or directory + +// RUN: not %clang_cc1 -fsanitize=unsigned-integer-overflow -ivfsoverlay %t-vfsoverlay.yaml -fsanitize-blacklist=%t-nonexistent.blacklist -emit-llvm %s -o - 2>&1 | FileCheck %s --check-prefix=INVALID +// INVALID: nonexistent.blacklist': No such file or directory + unsigned i; // DEFAULT: @hash Index: 
cfe/trunk/test/CodeGen/Inputs/sanitizer-blacklist-vfsoverlay.yaml =================================================================== --- cfe/trunk/test/CodeGen/Inputs/sanitizer-blacklist-vfsoverlay.yaml +++ cfe/trunk/test/CodeGen/Inputs/sanitizer-blacklist-vfsoverlay.yaml @@ -0,0 +1,15 @@ +{ + 'version': 0, + 'roots': [ + { 'name': '@DIR@', 'type': 'directory', + 'contents': [ + { 'name': 'only-virtual-file.blacklist', 'type': 'file', + 'external-contents': '@REAL_FILE@' + }, + { 'name': 'invalid-virtual-file.blacklist', 'type': 'file', + 'external-contents': '@NONEXISTENT_FILE@' + } + ] + } + ] +} Index: cfe/trunk/lib/AST/ASTContext.cpp =================================================================== --- cfe/trunk/lib/AST/ASTContext.cpp +++ cfe/trunk/lib/AST/ASTContext.cpp @@ -72,6 +72,7 @@ #include "llvm/ADT/PointerUnion.h" #include "llvm/ADT/STLExtras.h" #include "llvm/ADT/SmallPtrSet.h" +#include "llvm/ADT/SmallString.h" #include "llvm/ADT/SmallVector.h" #include "llvm/ADT/StringExtras.h" #include "llvm/ADT/StringRef.h" @@ -81,6 +82,7 @@ #include "llvm/Support/Compiler.h" #include "llvm/Support/ErrorHandling.h" #include "llvm/Support/MathExtras.h" +#include "llvm/Support/VirtualFileSystem.h" #include "llvm/Support/raw_ostream.h" #include #include @@ -826,6 +828,18 @@ llvm_unreachable("getAddressSpaceMapMangling() doesn't cover anything."); } +static std::vector +getRealPaths(llvm::vfs::FileSystem &VFS, llvm::ArrayRef Paths) { + std::vector Result; + llvm::SmallString<128> Buffer; + for (const auto &File : Paths) { + if (std::error_code EC = VFS.getRealPath(File, Buffer)) + llvm::report_fatal_error("can't open file '" + File + "': " + EC.message()); + Result.push_back(Buffer.str()); + } + return Result; +} + ASTContext::ASTContext(LangOptions &LOpts, SourceManager &SM, IdentifierTable &idents, SelectorTable &sels, Builtin::Context &builtins) @@ -833,7 +847,10 @@ TemplateSpecializationTypes(this_()), DependentTemplateSpecializationTypes(this_()), 
SubstTemplateTemplateParmPacks(this_()), SourceMgr(SM), LangOpts(LOpts), - SanitizerBL(new SanitizerBlacklist(LangOpts.SanitizerBlacklistFiles, SM)), + SanitizerBL(new SanitizerBlacklist( + getRealPaths(SM.getFileManager().getVirtualFileSystem(), + LangOpts.SanitizerBlacklistFiles), + SM)), XRayFilter(new XRayFunctionFilter(LangOpts.XRayAlwaysInstrumentFiles, LangOpts.XRayNeverInstrumentFiles, LangOpts.XRayAttrListFiles, SM)), -------------- next part -------------- A non-text attachment was scrubbed... Name: D67742.223765.patch Type: text/x-patch Size: 4400 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 22:22:20 2019 From: llvm-commits at lists.llvm.org (Vitaly Buka via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 05:22:20 +0000 (UTC) Subject: [PATCH] D68604: [tsan] Don't delay SIGTRAP handler In-Reply-To: References: Message-ID: This revision was automatically updated to reflect the committed changes. Closed by commit rL373978: [tsan] Don't delay SIGTRAP handler (authored by vitalybuka, committed by ). Herald added a subscriber: delcypher. 
Changed prior to commit: https://reviews.llvm.org/D68604?vs=223679&id=223766#toc Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68604/new/ https://reviews.llvm.org/D68604 Files: compiler-rt/trunk/lib/tsan/rtl/tsan_interceptors_posix.cpp compiler-rt/trunk/test/sanitizer_common/TestCases/Linux/signal_trap_handler.cpp Index: compiler-rt/trunk/test/sanitizer_common/TestCases/Linux/signal_trap_handler.cpp =================================================================== --- compiler-rt/trunk/test/sanitizer_common/TestCases/Linux/signal_trap_handler.cpp +++ compiler-rt/trunk/test/sanitizer_common/TestCases/Linux/signal_trap_handler.cpp @@ -0,0 +1,29 @@ +// RUN: %clangxx -O1 %s -o %t && %env_tool_opts=handle_sigtrap=1 %run %t 2>&1 | FileCheck %s + +#include +#include +#include + +int handled; + +void handler(int signo, siginfo_t *info, void *uctx) { + handled = 1; +} + +int main() { + struct sigaction a = {}, old = {}; + a.sa_sigaction = handler; + a.sa_flags = SA_SIGINFO; + sigaction(SIGTRAP, &a, &old); + + a = {}; + sigaction(SIGTRAP, 0, &a); + assert(a.sa_sigaction == handler); + assert(a.sa_flags & SA_SIGINFO); + + __builtin_debugtrap(); + assert(handled); + fprintf(stderr, "HANDLED %d\n", handled); +} + +// CHECK: HANDLED 1 Index: compiler-rt/trunk/lib/tsan/rtl/tsan_interceptors_posix.cpp =================================================================== --- compiler-rt/trunk/lib/tsan/rtl/tsan_interceptors_posix.cpp +++ compiler-rt/trunk/lib/tsan/rtl/tsan_interceptors_posix.cpp @@ -114,6 +114,7 @@ const int EPOLL_CTL_ADD = 1; #endif const int SIGILL = 4; +const int SIGTRAP = 5; const int SIGABRT = 6; const int SIGFPE = 8; const int SIGSEGV = 11; @@ -1962,10 +1963,10 @@ } // namespace __tsan static bool is_sync_signal(ThreadSignalContext *sctx, int sig) { - return sig == SIGSEGV || sig == SIGBUS || sig == SIGILL || - sig == SIGABRT || sig == SIGFPE || sig == SIGPIPE || sig == SIGSYS || - // If we are sending signal to ourselves, we must 
process it now. - (sctx && sig == sctx->int_signal_send); + return sig == SIGSEGV || sig == SIGBUS || sig == SIGILL || sig == SIGTRAP || + sig == SIGABRT || sig == SIGFPE || sig == SIGPIPE || sig == SIGSYS || + // If we are sending signal to ourselves, we must process it now. + (sctx && sig == sctx->int_signal_send); } void ALWAYS_INLINE rtl_generic_sighandler(bool sigact, int sig, -------------- next part -------------- A non-text attachment was scrubbed... Name: D68604.223766.patch Type: text/x-patch Size: 2139 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 22:22:22 2019 From: llvm-commits at lists.llvm.org (Vitaly Buka via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 05:22:22 +0000 (UTC) Subject: [PATCH] D68603: [sanitizer] Print SIGTRAP for corresponding signal In-Reply-To: References: Message-ID: <3ff71335d678c4fc1ffa2ad13b6597e6@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rL373979: [sanitizer] Print SIGTRAP for corresponding signal (authored by vitalybuka, committed by ). Herald added a subscriber: delcypher. 
Changed prior to commit:
  https://reviews.llvm.org/D68603?vs=223681&id=223767#toc

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68603/new/

https://reviews.llvm.org/D68603

Files:
  compiler-rt/trunk/lib/sanitizer_common/sanitizer_posix.cpp
  compiler-rt/trunk/test/sanitizer_common/TestCases/Linux/signal_trap.cpp

Index: compiler-rt/trunk/lib/sanitizer_common/sanitizer_posix.cpp
===================================================================
--- compiler-rt/trunk/lib/sanitizer_common/sanitizer_posix.cpp
+++ compiler-rt/trunk/lib/sanitizer_common/sanitizer_posix.cpp
@@ -312,6 +312,8 @@
       return "SEGV";
     case SIGBUS:
       return "BUS";
+    case SIGTRAP:
+      return "TRAP";
   }
   return "UNKNOWN SIGNAL";
 }
Index: compiler-rt/trunk/test/sanitizer_common/TestCases/Linux/signal_trap.cpp
===================================================================
--- compiler-rt/trunk/test/sanitizer_common/TestCases/Linux/signal_trap.cpp
+++ compiler-rt/trunk/test/sanitizer_common/TestCases/Linux/signal_trap.cpp
@@ -0,0 +1,8 @@
+// RUN: %clangxx -O1 %s -o %t && %env_tool_opts=handle_sigtrap=2 not %run %t 2>&1 | FileCheck %s
+
+int main() {
+  __builtin_debugtrap();
+}
+
+// CHECK: Sanitizer:DEADLYSIGNAL
+// CHECK: Sanitizer: TRAP on unknown address

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68603.223767.patch
Type: text/x-patch
Size: 958 bytes
Desc: not available
URL: 

From llvm-commits at lists.llvm.org Mon Oct 7 22:22:55 2019
From: llvm-commits at lists.llvm.org (Vitaly Buka via Phabricator via llvm-commits)
Date: Tue, 08 Oct 2019 05:22:55 +0000 (UTC)
Subject: [PATCH] D68596: [tsan, go] break commands into multiple lines
In-Reply-To: 
References: 
Message-ID: <35b4c86e22a17ceef398a74678a19be6@localhost.localdomain>

This revision was automatically updated to reflect the committed changes.
Closed by commit rL373983: [tsan, go] break commands into multiple lines (authored by vitalybuka, committed by ).

Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68596/new/ https://reviews.llvm.org/D68596 Files: compiler-rt/trunk/lib/tsan/go/build.bat Index: compiler-rt/trunk/lib/tsan/go/build.bat =================================================================== --- compiler-rt/trunk/lib/tsan/go/build.bat +++ compiler-rt/trunk/lib/tsan/go/build.bat @@ -1,4 +1,56 @@ -type tsan_go.cpp ..\rtl\tsan_interface_atomic.cpp ..\rtl\tsan_clock.cpp ..\rtl\tsan_flags.cpp ..\rtl\tsan_md5.cpp ..\rtl\tsan_mutex.cpp ..\rtl\tsan_report.cpp ..\rtl\tsan_rtl.cpp ..\rtl\tsan_rtl_mutex.cpp ..\rtl\tsan_rtl_report.cpp ..\rtl\tsan_rtl_thread.cpp ..\rtl\tsan_rtl_proc.cpp ..\rtl\tsan_stat.cpp ..\rtl\tsan_suppressions.cpp ..\rtl\tsan_sync.cpp ..\rtl\tsan_stack_trace.cpp ..\..\sanitizer_common\sanitizer_allocator.cpp ..\..\sanitizer_common\sanitizer_common.cpp ..\..\sanitizer_common\sanitizer_flags.cpp ..\..\sanitizer_common\sanitizer_stacktrace.cpp ..\..\sanitizer_common\sanitizer_libc.cpp ..\..\sanitizer_common\sanitizer_printf.cpp ..\..\sanitizer_common\sanitizer_suppressions.cpp ..\..\sanitizer_common\sanitizer_thread_registry.cpp ..\rtl\tsan_platform_windows.cpp ..\..\sanitizer_common\sanitizer_win.cpp ..\..\sanitizer_common\sanitizer_deadlock_detector1.cpp ..\..\sanitizer_common\sanitizer_stackdepot.cpp ..\..\sanitizer_common\sanitizer_persistent_allocator.cpp ..\..\sanitizer_common\sanitizer_flag_parser.cpp ..\..\sanitizer_common\sanitizer_symbolizer.cpp ..\..\sanitizer_common\sanitizer_termination.cpp > gotsan.cpp - -gcc -c -o race_windows_amd64.syso gotsan.cpp -I..\rtl -I..\.. 
-I..\..\sanitizer_common -I..\..\..\include -m64 -Wall -fno-exceptions -fno-rtti -DSANITIZER_GO=1 -Wno-error=attributes -Wno-attributes -Wno-format -Wno-maybe-uninitialized -DSANITIZER_DEBUG=0 -O3 -fomit-frame-pointer -std=c++11 +type ^ + tsan_go.cpp ^ + ..\rtl\tsan_interface_atomic.cpp ^ + ..\rtl\tsan_clock.cpp ^ + ..\rtl\tsan_flags.cpp ^ + ..\rtl\tsan_md5.cpp ^ + ..\rtl\tsan_mutex.cpp ^ + ..\rtl\tsan_report.cpp ^ + ..\rtl\tsan_rtl.cpp ^ + ..\rtl\tsan_rtl_mutex.cpp ^ + ..\rtl\tsan_rtl_report.cpp ^ + ..\rtl\tsan_rtl_thread.cpp ^ + ..\rtl\tsan_rtl_proc.cpp ^ + ..\rtl\tsan_stat.cpp ^ + ..\rtl\tsan_suppressions.cpp ^ + ..\rtl\tsan_sync.cpp ^ + ..\rtl\tsan_stack_trace.cpp ^ + ..\..\sanitizer_common\sanitizer_allocator.cpp ^ + ..\..\sanitizer_common\sanitizer_common.cpp ^ + ..\..\sanitizer_common\sanitizer_flags.cpp ^ + ..\..\sanitizer_common\sanitizer_stacktrace.cpp ^ + ..\..\sanitizer_common\sanitizer_libc.cpp ^ + ..\..\sanitizer_common\sanitizer_printf.cpp ^ + ..\..\sanitizer_common\sanitizer_suppressions.cpp ^ + ..\..\sanitizer_common\sanitizer_thread_registry.cpp ^ + ..\rtl\tsan_platform_windows.cpp ^ + ..\..\sanitizer_common\sanitizer_win.cpp ^ + ..\..\sanitizer_common\sanitizer_deadlock_detector1.cpp ^ + ..\..\sanitizer_common\sanitizer_stackdepot.cpp ^ + ..\..\sanitizer_common\sanitizer_persistent_allocator.cpp ^ + ..\..\sanitizer_common\sanitizer_flag_parser.cpp ^ + ..\..\sanitizer_common\sanitizer_symbolizer.cpp ^ + ..\..\sanitizer_common\sanitizer_termination.cpp ^ + > gotsan.cpp +gcc ^ + -c ^ + -o race_windows_amd64.syso ^ + gotsan.cpp ^ + -I..\rtl ^ + -I..\.. ^ + -I..\..\sanitizer_common ^ + -I..\..\..\include ^ + -m64 ^ + -Wall ^ + -fno-exceptions ^ + -fno-rtti ^ + -DSANITIZER_GO=1 ^ + -Wno-error=attributes ^ + -Wno-attributes ^ + -Wno-format ^ + -Wno-maybe-uninitialized ^ + -DSANITIZER_DEBUG=0 ^ + -O3 ^ + -fomit-frame-pointer ^ + -std=c++11 -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68596.223768.patch Type: text/x-patch Size: 3373 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 22:23:09 2019 From: llvm-commits at lists.llvm.org (Vitaly Buka via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 05:23:09 +0000 (UTC) Subject: [PATCH] D68599: [tsan, go] fix Go windows build In-Reply-To: References: Message-ID: This revision was automatically updated to reflect the committed changes. Closed by commit rG2fdec42a167c: [tsan, go] fix Go windows build (authored by vitalybuka). Changed prior to commit: https://reviews.llvm.org/D68599?vs=223670&id=223769#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68599/new/ https://reviews.llvm.org/D68599 Files: compiler-rt/lib/sanitizer_common/sanitizer_win_defs.h compiler-rt/lib/tsan/go/build.bat Index: compiler-rt/lib/tsan/go/build.bat =================================================================== --- compiler-rt/lib/tsan/go/build.bat +++ compiler-rt/lib/tsan/go/build.bat @@ -31,6 +31,9 @@ ..\..\sanitizer_common\sanitizer_flag_parser.cpp ^ ..\..\sanitizer_common\sanitizer_symbolizer.cpp ^ ..\..\sanitizer_common\sanitizer_termination.cpp ^ + ..\..\sanitizer_common\sanitizer_file.cpp ^ + ..\..\sanitizer_common\sanitizer_symbolizer_report.cpp ^ + ..\rtl\tsan_external.cpp ^ > gotsan.cpp gcc ^ @@ -46,6 +49,9 @@ -fno-exceptions ^ -fno-rtti ^ -DSANITIZER_GO=1 ^ + -DWINVER=0x0600 ^ + -D_WIN32_WINNT=0x0600 ^ + -DGetProcessMemoryInfo=K32GetProcessMemoryInfo ^ -Wno-error=attributes ^ -Wno-attributes ^ -Wno-format ^ Index: compiler-rt/lib/sanitizer_common/sanitizer_win_defs.h =================================================================== --- compiler-rt/lib/sanitizer_common/sanitizer_win_defs.h +++ compiler-rt/lib/sanitizer_common/sanitizer_win_defs.h @@ -43,6 +43,8 @@ #define STRINGIFY_(A) #A #define STRINGIFY(A) STRINGIFY_(A) +#if !SANITIZER_GO + // ----------------- A workaround for the absence of weak symbols -------------- // We 
don't have a direct equivalent of weak symbols when using MSVC, but we can // use the /alternatename directive to tell the linker to default a specific @@ -158,5 +160,15 @@ // return a >= b; // } // + +#else // SANITIZER_GO + +// Go neither needs nor wants weak references. +// The shenanigans above don't work for gcc. +# define WIN_WEAK_EXPORT_DEF(ReturnType, Name, ...) \ + extern "C" ReturnType Name(__VA_ARGS__) + +#endif // SANITIZER_GO + #endif // SANITIZER_WINDOWS #endif // SANITIZER_WIN_DEFS_H -------------- next part -------------- A non-text attachment was scrubbed... Name: D68599.223769.patch Type: text/x-patch Size: 1748 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 22:23:16 2019 From: llvm-commits at lists.llvm.org (Johannes Doerfert via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 05:23:16 +0000 (UTC) Subject: [PATCH] D67871: [Attributor] Use abstract call sites for call site callback In-Reply-To: References: Message-ID: This revision was automatically updated to reflect the committed changes. Closed by commit rG661db04b98c9: [Attributor] Use abstract call sites for call site callback (authored by jdoerfert). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67871/new/ https://reviews.llvm.org/D67871 Files: llvm/include/llvm/IR/CallSite.h llvm/include/llvm/Transforms/IPO/Attributor.h llvm/lib/Transforms/IPO/Attributor.cpp llvm/test/Transforms/FunctionAttrs/callbacks.ll -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D67871.223771.patch Type: text/x-patch Size: 10797 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 22:24:07 2019 From: llvm-commits at lists.llvm.org (Vitaly Buka via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 05:24:07 +0000 (UTC) Subject: [PATCH] D68608: [clang] Accept -ftrivial-auto-var-init in clang-cl In-Reply-To: References: Message-ID: <1c4d3135b85603371d277fdc3bf15447@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rL373992: [clang] Accept -ftrivial-auto-var-init in clang-cl (authored by vitalybuka, committed by ). Herald added a project: LLVM. Herald added a subscriber: llvm-commits. Changed prior to commit: https://reviews.llvm.org/D68608?vs=223694&id=223773#toc Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68608/new/ https://reviews.llvm.org/D68608 Files: cfe/trunk/include/clang/Driver/Options.td cfe/trunk/test/Driver/cl-options.c Index: cfe/trunk/include/clang/Driver/Options.td =================================================================== --- cfe/trunk/include/clang/Driver/Options.td +++ cfe/trunk/include/clang/Driver/Options.td @@ -1715,10 +1715,10 @@ "alloca, which are of greater size than ssp-buffer-size (default: 8 bytes). 
" "All variable sized calls to alloca are considered vulnerable">; def ftrivial_auto_var_init : Joined<["-"], "ftrivial-auto-var-init=">, Group, - Flags<[CC1Option]>, HelpText<"Initialize trivial automatic stack variables: uninitialized (default)" + Flags<[CC1Option, CoreOption]>, HelpText<"Initialize trivial automatic stack variables: uninitialized (default)" " | pattern">, Values<"uninitialized,pattern">; def enable_trivial_var_init_zero : Joined<["-"], "enable-trivial-auto-var-init-zero-knowing-it-will-be-removed-from-clang">, - Flags<[CC1Option]>, + Flags<[CC1Option, CoreOption]>, HelpText<"Trivial automatic variable initialization to zero is only here for benchmarks, it'll eventually be removed, and I'm OK with that because I'm only using it to benchmark">; def fstandalone_debug : Flag<["-"], "fstandalone-debug">, Group, Flags<[CoreOption]>, HelpText<"Emit full debug info for all types used by the program">; Index: cfe/trunk/test/Driver/cl-options.c =================================================================== --- cfe/trunk/test/Driver/cl-options.c +++ cfe/trunk/test/Driver/cl-options.c @@ -653,6 +653,8 @@ // RUN: -fcs-profile-generate \ // RUN: -fcs-profile-generate=dir \ // RUN: -ftime-trace \ +// RUN: -ftrivial-auto-var-init=zero \ +// RUN: -enable-trivial-auto-var-init-zero-knowing-it-will-be-removed-from-clang \ // RUN: --version \ // RUN: -Werror /Zs -- %s 2>&1 -------------- next part -------------- A non-text attachment was scrubbed... Name: D68608.223773.patch Type: text/x-patch Size: 1803 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 22:24:19 2019 From: llvm-commits at lists.llvm.org (Evgenii Stepanov via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 05:24:19 +0000 (UTC) Subject: [PATCH] D68431: [msan] Add interceptors: crypt, crypt_r. In-Reply-To: References: Message-ID: This revision was automatically updated to reflect the committed changes. 
Closed by commit rL373993: [msan] Add interceptors: crypt, crypt_r. (authored by eugenis, committed by ).
Herald added a subscriber: delcypher.

Changed prior to commit:
  https://reviews.llvm.org/D68431?vs=223696&id=223774#toc

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68431/new/

https://reviews.llvm.org/D68431

Files:
  compiler-rt/trunk/lib/sanitizer_common/sanitizer_common_interceptors.inc
  compiler-rt/trunk/lib/sanitizer_common/sanitizer_platform_interceptors.h
  compiler-rt/trunk/lib/sanitizer_common/sanitizer_platform_limits_posix.cpp
  compiler-rt/trunk/lib/sanitizer_common/sanitizer_platform_limits_posix.h
  compiler-rt/trunk/test/sanitizer_common/TestCases/Linux/crypt_r.cpp
  compiler-rt/trunk/test/sanitizer_common/TestCases/Posix/crypt.cpp

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68431.223774.patch
Type: text/x-patch
Size: 5874 bytes
Desc: not available
URL: 

From llvm-commits at lists.llvm.org Mon Oct 7 22:24:39 2019
From: llvm-commits at lists.llvm.org (Jonas Devlieghere via Phabricator via llvm-commits)
Date: Tue, 08 Oct 2019 05:24:39 +0000 (UTC)
Subject: [PATCH] D68612: [CMake] Track test dependencies with add_lldb_test_dependency
In-Reply-To: 
References: 
Message-ID: <4403c03188f7c5c19cdacda9107c5064@localhost.localdomain>

This revision was automatically updated to reflect the committed changes.
Closed by commit rL373996: [CMake] Track test dependencies with add_lldb_test_dependency (authored by JDevlieghere, committed by ).
Herald added a project: LLVM.
Herald added a subscriber: llvm-commits.

Changed prior to commit: https://reviews.llvm.org/D68612?vs=223700&id=223778#toc Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68612/new/ https://reviews.llvm.org/D68612 Files: lldb/trunk/CMakeLists.txt lldb/trunk/cmake/modules/AddLLDB.cmake lldb/trunk/lit/CMakeLists.txt lldb/trunk/test/CMakeLists.txt lldb/trunk/unittests/CMakeLists.txt lldb/trunk/utils/lldb-dotest/CMakeLists.txt -------------- next part -------------- A non-text attachment was scrubbed... Name: D68612.223778.patch Type: text/x-patch Size: 6933 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 22:24:54 2019 From: llvm-commits at lists.llvm.org (Lawrence D'Anna via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 05:24:54 +0000 (UTC) Subject: [PATCH] D68545: DWIMy filterspecs for dotest.py In-Reply-To: References: Message-ID: <9a6abf69eb7f447f0cfbae8fd26f23b3@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rL373997: DWIMy filterspecs for dotest.py (authored by lawrence_danna, committed by ). Herald added a project: LLVM. Herald added a subscriber: llvm-commits. Changed prior to commit: https://reviews.llvm.org/D68545?vs=223686&id=223779#toc Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68545/new/ https://reviews.llvm.org/D68545 Files: lldb/trunk/packages/Python/lldbsuite/test/dotest.py lldb/trunk/packages/Python/lldbsuite/test/dotest_args.py Index: lldb/trunk/packages/Python/lldbsuite/test/dotest.py =================================================================== --- lldb/trunk/packages/Python/lldbsuite/test/dotest.py +++ lldb/trunk/packages/Python/lldbsuite/test/dotest.py @@ -667,34 +667,42 @@ # Thoroughly check the filterspec against the base module and admit # the (base, filterspec) combination only when it makes sense. - filterspec = None - for filterspec in configuration.filters: - # Optimistically set the flag to True. 
- filtered = True - module = __import__(base) - parts = filterspec.split('.') - obj = module + + def check(obj, parts): for part in parts: try: parent, obj = obj, getattr(obj, part) except AttributeError: # The filterspec has failed. - filtered = False - break + return False + return True - # If filtered, we have a good filterspec. Add it. - if filtered: - # print("adding filter spec %s to module %s" % (filterspec, module)) - configuration.suite.addTests( - unittest2.defaultTestLoader.loadTestsFromName( - filterspec, module)) - continue + module = __import__(base) + + def iter_filters(): + for filterspec in configuration.filters: + parts = filterspec.split('.') + if check(module, parts): + yield filterspec + elif parts[0] == base and len(parts) > 1 and check(module, parts[1:]): + yield '.'.join(parts[1:]) + else: + for key,value in module.__dict__.items(): + if check(value, parts): + yield key + '.' + filterspec + + filtered = False + for filterspec in iter_filters(): + filtered = True + print("adding filter spec %s to module %s" % (filterspec, repr(module))) + tests = unittest2.defaultTestLoader.loadTestsFromName(filterspec, module) + configuration.suite.addTests(tests) # Forgo this module if the (base, filterspec) combo is invalid if configuration.filters and not filtered: return - if not filterspec or not filtered: + if not filtered: # Add the entire file's worth of tests since we're not filtered. # Also the fail-over case when the filterspec branch # (base, filterspec) combo doesn't make sense. 
Index: lldb/trunk/packages/Python/lldbsuite/test/dotest_args.py =================================================================== --- lldb/trunk/packages/Python/lldbsuite/test/dotest_args.py +++ lldb/trunk/packages/Python/lldbsuite/test/dotest_args.py @@ -61,7 +61,9 @@ '-f', metavar='filterspec', action='append', - help='Specify a filter, which consists of the test class name, a dot, followed by the test method, to only admit such test into the test suite') # FIXME: Example? + help=('Specify a filter, which looks like "TestModule.TestClass.test_name". '+ + 'You may also use shortened filters, such as '+ + '"TestModule.TestClass", "TestClass.test_name", or just "test_name".')) group.add_argument( '-p', metavar='pattern', -------------- next part -------------- A non-text attachment was scrubbed... Name: D68545.223779.patch Type: text/x-patch Size: 3335 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 22:26:01 2019 From: llvm-commits at lists.llvm.org (Lawrence D'Anna via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 05:26:01 +0000 (UTC) Subject: [PATCH] D68618: test fix: TestLoadUsingPaths should use realpath In-Reply-To: References: Message-ID: <7c7a9f20abf06832c7b4d82f455de2d8@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rL374007: test fix: TestLoadUsingPaths should use realpath (authored by lawrence_danna, committed by ). Herald added a project: LLVM. Herald added a subscriber: llvm-commits. 
Changed prior to commit: https://reviews.llvm.org/D68618?vs=223714&id=223780#toc Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68618/new/ https://reviews.llvm.org/D68618 Files: lldb/trunk/packages/Python/lldbsuite/test/functionalities/load_using_paths/TestLoadUsingPaths.py Index: lldb/trunk/packages/Python/lldbsuite/test/functionalities/load_using_paths/TestLoadUsingPaths.py =================================================================== --- lldb/trunk/packages/Python/lldbsuite/test/functionalities/load_using_paths/TestLoadUsingPaths.py +++ lldb/trunk/packages/Python/lldbsuite/test/functionalities/load_using_paths/TestLoadUsingPaths.py @@ -33,7 +33,7 @@ ext = 'dylib' self.lib_name = 'libloadunload.' + ext - self.wd = self.getBuildDir() + self.wd = os.path.realpath(self.getBuildDir()) self.hidden_dir = os.path.join(self.wd, 'hidden') self.hidden_lib = os.path.join(self.hidden_dir, self.lib_name) -------------- next part -------------- A non-text attachment was scrubbed... Name: D68618.223780.patch Type: text/x-patch Size: 696 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 22:26:24 2019 From: llvm-commits at lists.llvm.org (Andrew Trick via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 05:26:24 +0000 (UTC) Subject: [PATCH] D68044: [LitConfig] Silenced notes/warnings on quiet. In-Reply-To: References: Message-ID: <343a474d2d00e99f6dbdbeed6173cd36@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rL374009: [LitConfig] Silenced notes/warnings on quiet. (authored by atrick, committed by ). 
Changed prior to commit:
  https://reviews.llvm.org/D68044?vs=221815&id=223781#toc

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68044/new/

https://reviews.llvm.org/D68044

Files:
  llvm/trunk/utils/lit/lit/LitConfig.py

Index: llvm/trunk/utils/lit/lit/LitConfig.py
===================================================================
--- llvm/trunk/utils/lit/lit/LitConfig.py
+++ llvm/trunk/utils/lit/lit/LitConfig.py
@@ -174,10 +174,12 @@
                                                kind, message))
 
     def note(self, message):
-        self._write_message('note', message)
+        if not self.quiet:
+            self._write_message('note', message)
 
     def warning(self, message):
-        self._write_message('warning', message)
+        if not self.quiet:
+            self._write_message('warning', message)
         self.numWarnings += 1
 
     def error(self, message):
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68044.223781.patch
Type: text/x-patch
Size: 668 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Mon Oct 7 22:27:01 2019
From: llvm-commits at lists.llvm.org (James Clarke via Phabricator via llvm-commits)
Date: Tue, 08 Oct 2019 05:27:01 +0000 (UTC)
Subject: [PATCH] D68368: [ItaniumMangle] Fix mangling of GNU __null in an expression to match GCC
In-Reply-To:
References:
Message-ID:

This revision was automatically updated to reflect the committed changes.
Closed by commit rL374013: [ItaniumMangle] Fix mangling of GNU __null in an expression to match GCC (authored by jrtc27, committed by ).
Herald added a project: LLVM.
Herald added a subscriber: llvm-commits.

Changed prior to commit: https://reviews.llvm.org/D68368?vs=222949&id=223783#toc Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68368/new/ https://reviews.llvm.org/D68368 Files: cfe/trunk/lib/AST/ItaniumMangle.cpp cfe/trunk/test/CodeGenCXX/mangle-exprs.cpp Index: cfe/trunk/lib/AST/ItaniumMangle.cpp =================================================================== --- cfe/trunk/lib/AST/ItaniumMangle.cpp +++ cfe/trunk/lib/AST/ItaniumMangle.cpp @@ -4273,8 +4273,11 @@ } case Expr::GNUNullExprClass: - // FIXME: should this really be mangled the same as nullptr? - // fallthrough + // Mangle as if an integer literal 0. + Out << 'L'; + mangleType(E->getType()); + Out << "0E"; + break; case Expr::CXXNullPtrLiteralExprClass: { Out << "LDnE"; Index: cfe/trunk/test/CodeGenCXX/mangle-exprs.cpp =================================================================== --- cfe/trunk/test/CodeGenCXX/mangle-exprs.cpp +++ cfe/trunk/test/CodeGenCXX/mangle-exprs.cpp @@ -373,3 +373,19 @@ template void f(decltype(T{.a.b[3][1 ... 4] = 9}) x) {} void use_f(A a) { f(a); } } + +namespace null { + template + void cpp_nullptr(typename enable_if

    ::type* = 0) { + } + + template + void gnu_null(typename enable_if

    ::type* = 0) { + } + + // CHECK-LABEL: define {{.*}} @_ZN4null11cpp_nullptrILDn0EEEvPN9enable_ifIXeqT_LDnEEvE4typeE + template void cpp_nullptr(void *); + + // CHECK-LABEL: define {{.*}} @_ZN4null8gnu_nullILPv0EEEvPN9enable_ifIXeqT_Ll0EEvE4typeE + template void gnu_null(void *); +} -------------- next part -------------- A non-text attachment was scrubbed... Name: D68368.223783.patch Type: text/x-patch Size: 1369 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 22:27:29 2019 From: llvm-commits at lists.llvm.org (Bill Wendling via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 05:27:29 +0000 (UTC) Subject: [PATCH] D68598: [IA] Recognize hexadecimal escape sequences In-Reply-To: References: Message-ID: This revision was not accepted when it landed; it landed in state "Needs Review". This revision was automatically updated to reflect the committed changes. Closed by commit rL374018: [IA] Recognize hexadecimal escape sequences (authored by void, committed by ). Changed prior to commit: https://reviews.llvm.org/D68598?vs=223666&id=223787#toc Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68598/new/ https://reviews.llvm.org/D68598 Files: llvm/trunk/lib/MC/MCParser/AsmParser.cpp llvm/trunk/test/MC/AsmParser/directive_ascii.s Index: llvm/trunk/lib/MC/MCParser/AsmParser.cpp =================================================================== --- llvm/trunk/lib/MC/MCParser/AsmParser.cpp +++ llvm/trunk/lib/MC/MCParser/AsmParser.cpp @@ -2914,11 +2914,27 @@ } // Recognize escaped characters. Note that this escape semantics currently - // loosely follows Darwin 'as'. Notably, it doesn't support hex escapes. + // loosely follows Darwin 'as'. ++i; if (i == e) return TokError("unexpected backslash at end of string"); + // Recognize hex sequences similarly to GNU 'as'. 
+ if (Str[i] == 'x' || Str[i] == 'X') { + size_t length = Str.size(); + if (i + 1 >= length || !isHexDigit(Str[i + 1])) + return TokError("invalid hexadecimal escape sequence"); + + // Consume hex characters. GNU 'as' reads all hexadecimal characters and + // then truncates to the lower 16 bits. Seems reasonable. + unsigned Value = 0; + while (i + 1 < length && isHexDigit(Str[i + 1])) + Value = Value * 16 + hexDigitValue(Str[++i]); + + Data += (unsigned char)(Value & 0xFF); + continue; + } + // Recognize octal sequences. if ((unsigned)(Str[i] - '0') <= 7) { // Consume up to three octal characters. Index: llvm/trunk/test/MC/AsmParser/directive_ascii.s =================================================================== --- llvm/trunk/test/MC/AsmParser/directive_ascii.s +++ llvm/trunk/test/MC/AsmParser/directive_ascii.s @@ -39,3 +39,8 @@ # CHECK: .byte 0 TEST6: .string "B", "C" + +# CHECK: TEST7: +# CHECK: .ascii "dk" +TEST7: + .ascii "\x64\Xa6B" -------------- next part -------------- A non-text attachment was scrubbed... Name: D68598.223787.patch Type: text/x-patch Size: 1636 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 00:49:10 2019 From: llvm-commits at lists.llvm.org (Fangrui Song via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 07:49:10 +0000 (UTC) Subject: [PATCH] D68323: [ELF] Wrap things in `namespace lld { namespace elf {`, NFC In-Reply-To: References: Message-ID: <6da80ea7b4f5a6891166cfdddc22a72d@localhost.localdomain> MaskRay updated this revision to Diff 223454. MaskRay added a comment. 
Rebase on D68561

Repository:
  rLLD LLVM Linker

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68323/new/

https://reviews.llvm.org/D68323

Files:
  ELF/Arch/AArch64.cpp
  ELF/Arch/AMDGPU.cpp
  ELF/Arch/ARM.cpp
  ELF/Arch/AVR.cpp
  ELF/Arch/Hexagon.cpp
  ELF/Arch/MSP430.cpp
  ELF/Arch/Mips.cpp
  ELF/Arch/MipsArchTree.cpp
  ELF/Arch/PPC.cpp
  ELF/Arch/PPC64.cpp
  ELF/Arch/RISCV.cpp
  ELF/Arch/SPARCV9.cpp
  ELF/Arch/X86.cpp
  ELF/Arch/X86_64.cpp
  ELF/CallGraphSort.cpp
  ELF/DWARF.cpp
  ELF/Driver.cpp
  ELF/DriverUtils.cpp
  ELF/EhFrame.cpp
  ELF/ICF.cpp
  ELF/InputFiles.cpp
  ELF/InputFiles.h
  ELF/InputSection.cpp
  ELF/LTO.cpp
  ELF/LinkerScript.cpp
  ELF/MapFile.cpp
  ELF/MarkLive.cpp
  ELF/OutputSections.cpp
  ELF/Relocations.cpp
  ELF/ScriptLexer.cpp
  ELF/ScriptParser.cpp
  ELF/SymbolTable.cpp
  ELF/Symbols.cpp
  ELF/Symbols.h
  ELF/SyntheticSections.cpp
  ELF/Target.cpp
  ELF/Writer.cpp
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68323.223454.patch
Type: text/x-patch
Size: 55020 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Mon Oct 7 00:52:51 2019
From: llvm-commits at lists.llvm.org (Rui Ueyama via Phabricator via llvm-commits)
Date: Mon, 07 Oct 2019 07:52:51 +0000 (UTC)
Subject: [PATCH] D68323: [ELF] Wrap things in `namespace lld { namespace elf {`, NFC
In-Reply-To:
References:
Message-ID:

ruiu accepted this revision.
ruiu added a comment.

LGTM

Repository:
  rLLD LLVM Linker

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68323/new/

https://reviews.llvm.org/D68323

From llvm-commits at lists.llvm.org Mon Oct 7 01:03:54 2019
From: llvm-commits at lists.llvm.org (Roman Lebedev via Phabricator via llvm-commits)
Date: Mon, 07 Oct 2019 08:03:54 +0000 (UTC)
Subject: [PATCH] D67122: [UBSan][clang][compiler-rt] Applying non-zero offset to nullptr is undefined behaviour
In-Reply-To:
References:
Message-ID:

lebedev.ri added a comment.

Ping.
Friendly reminder that the unsanitized UB is still being miscompiled.

Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67122/new/ https://reviews.llvm.org/D67122 From llvm-commits at lists.llvm.org Mon Oct 7 01:26:05 2019 From: llvm-commits at lists.llvm.org (Fangrui Song via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 08:26:05 +0000 (UTC) Subject: [PATCH] D68323: [ELF] Wrap things in `namespace lld { namespace elf {`, NFC In-Reply-To: References: Message-ID: <558006506891446c404ae0d5e24f2e74@localhost.localdomain> MaskRay updated this revision to Diff 223457. MaskRay edited the summary of this revision. MaskRay added a comment. Update description. We don't change write* read* now Repository: rLLD LLVM Linker CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68323/new/ https://reviews.llvm.org/D68323 Files: ELF/Arch/AArch64.cpp ELF/Arch/AMDGPU.cpp ELF/Arch/ARM.cpp ELF/Arch/AVR.cpp ELF/Arch/Hexagon.cpp ELF/Arch/MSP430.cpp ELF/Arch/Mips.cpp ELF/Arch/MipsArchTree.cpp ELF/Arch/PPC.cpp ELF/Arch/PPC64.cpp ELF/Arch/RISCV.cpp ELF/Arch/SPARCV9.cpp ELF/Arch/X86.cpp ELF/Arch/X86_64.cpp ELF/CallGraphSort.cpp ELF/DWARF.cpp ELF/Driver.cpp ELF/DriverUtils.cpp ELF/EhFrame.cpp ELF/ICF.cpp ELF/InputFiles.cpp ELF/InputFiles.h ELF/InputSection.cpp ELF/LTO.cpp ELF/LinkerScript.cpp ELF/MapFile.cpp ELF/MarkLive.cpp ELF/OutputSections.cpp ELF/Relocations.cpp ELF/ScriptLexer.cpp ELF/ScriptParser.cpp ELF/SymbolTable.cpp ELF/Symbols.cpp ELF/Symbols.h ELF/SyntheticSections.cpp ELF/Target.cpp ELF/Writer.cpp -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68323.223457.patch Type: text/x-patch Size: 55020 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 01:29:19 2019 From: llvm-commits at lists.llvm.org (Fangrui Song via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 08:29:19 +0000 (UTC) Subject: [PATCH] D68323: [ELF] Wrap things in `namespace lld { namespace elf {`, NFC In-Reply-To: References: Message-ID: <15f75f50dc1c0cb4b8b6943d198a81f8@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rL373885: [ELF] Wrap things in `namespace lld { namespace elf {`, NFC (authored by MaskRay, committed by ). Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68323/new/ https://reviews.llvm.org/D68323 Files: lld/trunk/ELF/Arch/AArch64.cpp lld/trunk/ELF/Arch/AMDGPU.cpp lld/trunk/ELF/Arch/ARM.cpp lld/trunk/ELF/Arch/AVR.cpp lld/trunk/ELF/Arch/Hexagon.cpp lld/trunk/ELF/Arch/MSP430.cpp lld/trunk/ELF/Arch/Mips.cpp lld/trunk/ELF/Arch/MipsArchTree.cpp lld/trunk/ELF/Arch/PPC.cpp lld/trunk/ELF/Arch/PPC64.cpp lld/trunk/ELF/Arch/RISCV.cpp lld/trunk/ELF/Arch/SPARCV9.cpp lld/trunk/ELF/Arch/X86.cpp lld/trunk/ELF/Arch/X86_64.cpp lld/trunk/ELF/CallGraphSort.cpp lld/trunk/ELF/DWARF.cpp lld/trunk/ELF/Driver.cpp lld/trunk/ELF/DriverUtils.cpp lld/trunk/ELF/EhFrame.cpp lld/trunk/ELF/ICF.cpp lld/trunk/ELF/InputFiles.cpp lld/trunk/ELF/InputFiles.h lld/trunk/ELF/InputSection.cpp lld/trunk/ELF/LTO.cpp lld/trunk/ELF/LinkerScript.cpp lld/trunk/ELF/MapFile.cpp lld/trunk/ELF/MarkLive.cpp lld/trunk/ELF/OutputSections.cpp lld/trunk/ELF/Relocations.cpp lld/trunk/ELF/ScriptLexer.cpp lld/trunk/ELF/ScriptParser.cpp lld/trunk/ELF/SymbolTable.cpp lld/trunk/ELF/Symbols.cpp lld/trunk/ELF/Symbols.h lld/trunk/ELF/SyntheticSections.cpp lld/trunk/ELF/Target.cpp lld/trunk/ELF/Writer.cpp -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68323.223459.patch Type: text/x-patch Size: 56158 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 03:36:44 2019 From: llvm-commits at lists.llvm.org (=?utf-8?q?Lu=C3=ADs_Marques_via_Phabricator?= via llvm-commits) Date: Mon, 07 Oct 2019 10:36:44 +0000 (UTC) Subject: [PATCH] D66725: [DAGCombiner][TargetLowering] Target hook for FCOPYSIGN arg cast folding In-Reply-To: References: Message-ID: <06361c8b1fb8c2934c98622dd3ab5d44@localhost.localdomain> luismarques added a subscriber: efriedma. luismarques added a comment. In D66725#1663174 , @lenary wrote: > I would like to see a review by someone who works on cross-target parts of DAGCombiner, before this is landed. @efriedma would you be willing to review this? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D66725/new/ https://reviews.llvm.org/D66725 From llvm-commits at lists.llvm.org Mon Oct 7 04:45:32 2019 From: llvm-commits at lists.llvm.org (Simon Moll via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 11:45:32 +0000 (UTC) Subject: [PATCH] D57504: RFC: Prototype & Roadmap for vector predication in LLVM In-Reply-To: References: Message-ID: <587fffb80ac672c0343f7a0fc45f3a7d@localhost.localdomain> simoll added a comment. Herald added subscribers: lenary, hiraditya. Picking this up again. I begin with changing the VP intrinsics as outlined before with one deviation from the earlier plan: - There will be no `llvm.vp.constrained.*` just `llvm.vp.*` and all FP intrinsics will have an exception mode and rounding mode parameter. 
Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D57504/new/ https://reviews.llvm.org/D57504 From llvm-commits at lists.llvm.org Mon Oct 7 08:22:33 2019 From: llvm-commits at lists.llvm.org (Sam Elliott via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 15:22:33 +0000 (UTC) Subject: [PATCH] D67046: [RISCV] Add InstrInfo areMemAccessesTriviallyDisjoint hook In-Reply-To: References: Message-ID: <8a6487b8fd2975c1d55187f5fddb3240@localhost.localdomain> lenary accepted this revision. lenary added a comment. This revision is now accepted and ready to land. I have decided I'm not too worried by the brittle test. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67046/new/ https://reviews.llvm.org/D67046 From llvm-commits at lists.llvm.org Mon Oct 7 13:57:09 2019 From: llvm-commits at lists.llvm.org (Ana Pazos via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 20:57:09 +0000 (UTC) Subject: [PATCH] D66210: [RFC/WIP][RISCV] Enable the machine outliner for RISC-V In-Reply-To: References: Message-ID: <314ecebd1c3f609cc480a4ca25e0ee8c@localhost.localdomain> apazos added a comment. Lewis, this patch LGTM. You can go ahead and merge it. To improve code size, I have updated the additional patch in https://reviews.llvm.org/D68290. No change is required in machine outliner code. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D66210/new/ https://reviews.llvm.org/D66210 From llvm-commits at lists.llvm.org Mon Oct 7 14:30:04 2019 From: llvm-commits at lists.llvm.org (Aditya Kumar via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 21:30:04 +0000 (UTC) Subject: [PATCH] D68559: [RISCV] Support fast calling convention In-Reply-To: References: Message-ID: hiraditya added inline comments. 
================
Comment at: lib/Target/RISCV/RISCVISelLowering.cpp:2046
+
+  if (CallConv == CallingConv::Fast) {
+    ArgCCInfo.AnalyzeCallOperands(Outs, CC_RISCV_FastCC);
----------------
nit: no need to put braces around one liners.

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68559/new/

https://reviews.llvm.org/D68559

From llvm-commits at lists.llvm.org Mon Oct 7 15:11:31 2019
From: llvm-commits at lists.llvm.org (Roman Lebedev via Phabricator via llvm-commits)
Date: Mon, 07 Oct 2019 22:11:31 +0000 (UTC)
Subject: [PATCH] D68360: PR41162 Implement LKK remainder and divisibility algorithms [urem]
In-Reply-To:
References:
Message-ID: <689fd32f202f809261cd200a43d7f8a3@localhost.localdomain>

lebedev.ri added a comment.

It looks like we now use the Hacker's Delight lowering for urem?
Where is that lowering being performed?

================
Comment at: llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp:4934
+  const APInt &D = DivisorConstant->getAPIntValue();
+  APInt C = APInt::getMaxValue(F).udiv(D.zext(F)).uadd_sat(APInt(F, 1));
+  SDValue AproximateReciprocal = DAG.getConstant(C, DL, FVT.getScalarType());
----------------
This should be a simple `add`.

================
Comment at: llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp:4941-4943
+  if (!D.isStrictlyPositive() || D.isMaxValue() || D.isOneValue() ||
+      D.isPowerOf2()) {
+    // Divisor must be in the range of (1,2^N)
----------------
There is no such restriction.

================
Comment at: llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp:4944-4945
+    // Divisor must be in the range of (1,2^N)
+    // We can lower remainder of division by powers of two much better
+    // elsewhere.
+    return false;
----------------
I believe what you want to do, is to check whether *all* divisors are powers of two, and avoid *this* fold then.
If at least one of them is not a power of two this should still be good.
That being said many of the test changes look like regressions.

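A rough sketch of the check suggested in the last inline comment above, assuming the divisor is a vector of constants: the fold would be skipped only when every element is a power of two, not as soon as one of them is. The names `is_power_of_two` and `should_skip_fold` are hypothetical and do not come from the patch; Python is used purely for illustration.

```python
def is_power_of_two(d):
    """True for 1, 2, 4, ...; False for 0 and every other value."""
    return d > 0 and (d & (d - 1)) == 0


def should_skip_fold(divisors):
    """Skip the multiply-based fold only if *all* divisors are powers of two.

    A single non-power-of-two element is enough to keep the fold.
    """
    return all(is_power_of_two(d) for d in divisors)


if __name__ == "__main__":
    print(should_skip_fold([2, 4, 8]))   # every lane is a power of two
    print(should_skip_fold([2, 3, 8]))   # one lane is not
```

Whether keeping the fold for mixed vectors is actually profitable is the open question raised in the review; the sketch only shows the shape of the predicate.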
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68360/new/ https://reviews.llvm.org/D68360 From llvm-commits at lists.llvm.org Mon Oct 7 15:18:40 2019 From: llvm-commits at lists.llvm.org (Tim Gymnich via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 22:18:40 +0000 (UTC) Subject: [PATCH] D68360: PR41162 Implement LKK remainder and divisibility algorithms [urem] In-Reply-To: References: Message-ID: <77c0b3e9c3525f893f2f85e46ee4c9ae@localhost.localdomain> TG908 marked 2 inline comments as done. TG908 added inline comments. ================ Comment at: llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp:4941-4943 + if (!D.isStrictlyPositive() || D.isMaxValue() || D.isOneValue() || + D.isPowerOf2()) { + // Divisor must be in the range of (1,2^N) ---------------- lebedev.ri wrote: > There is no such restriction. What do you mean? In my code? In LKK? ================ Comment at: llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp:4944-4945 + // Divisor must be in the range of (1,2^N) + // We can lower remainder of division by powers of two much better + // elsewhere. + return false; ---------------- lebedev.ri wrote: > I believe what you want to do, is to check whether *all* divisors are powers of two, and avoid *this* fold then. > If at least one of them is not a power of two this should still be good. > That being said many of the test changes look like regressions. Yeah. I think the regressions came from changing the way I was checking if MUL is available. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68360/new/ https://reviews.llvm.org/D68360 From llvm-commits at lists.llvm.org Mon Oct 7 15:20:47 2019 From: llvm-commits at lists.llvm.org (Roman Lebedev via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 22:20:47 +0000 (UTC) Subject: [PATCH] D68360: PR41162 Implement LKK remainder and divisibility algorithms [urem] In-Reply-To: References: Message-ID: <16c1ea97a2fb3054e39932dc4e62efe6@localhost.localdomain> lebedev.ri added inline comments. 
================ Comment at: llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp:4941-4943 + if (!D.isStrictlyPositive() || D.isMaxValue() || D.isOneValue() || + D.isPowerOf2()) { + // Divisor must be in the range of (1,2^N) ---------------- TG908 wrote: > lebedev.ri wrote: > > There is no such restriction. > What do you mean? > In my code? > In LKK? I haven't read the paper, so i'm only looking at this code, and they don't incur any restrictions on the divisor: https://rise4fun.com/Alive/HiT CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68360/new/ https://reviews.llvm.org/D68360 From llvm-commits at lists.llvm.org Mon Oct 7 15:26:14 2019 From: llvm-commits at lists.llvm.org (=?utf-8?q?Lu=C3=ADs_Marques_via_Phabricator?= via llvm-commits) Date: Mon, 07 Oct 2019 22:26:14 +0000 (UTC) Subject: [PATCH] D67046: [RISCV] Add InstrInfo areMemAccessesTriviallyDisjoint hook In-Reply-To: References: Message-ID: luismarques updated this revision to Diff 223678. luismarques edited the summary of this revision. luismarques added a comment. Rebased the test on master. Changed the test to instead check the instruction dependencies directly. Updated summary. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67046/new/ https://reviews.llvm.org/D67046 Files: llvm/lib/Target/RISCV/RISCVInstrInfo.cpp llvm/lib/Target/RISCV/RISCVInstrInfo.h llvm/lib/Target/RISCV/RISCVSubtarget.cpp llvm/test/CodeGen/RISCV/disjoint.ll -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D67046.223678.patch Type: text/x-patch Size: 5839 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 15:27:39 2019 From: llvm-commits at lists.llvm.org (=?utf-8?q?Lu=C3=ADs_Marques_via_Phabricator?= via llvm-commits) Date: Mon, 07 Oct 2019 22:27:39 +0000 (UTC) Subject: [PATCH] D67046: [RISCV] Add InstrInfo areMemAccessesTriviallyDisjoint hook In-Reply-To: References: Message-ID: <9932232bce3f352daf568d595fe8f35a@localhost.localdomain> luismarques requested review of this revision. luismarques marked an inline comment as done. luismarques added a comment. Changed the test. Please review again. Thank you! CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67046/new/ https://reviews.llvm.org/D67046 From llvm-commits at lists.llvm.org Mon Oct 7 15:30:09 2019 From: llvm-commits at lists.llvm.org (=?utf-8?q?Lu=C3=ADs_Marques_via_Phabricator?= via llvm-commits) Date: Mon, 07 Oct 2019 22:30:09 +0000 (UTC) Subject: [PATCH] D67046: [RISCV] Add InstrInfo areMemAccessesTriviallyDisjoint hook In-Reply-To: References: Message-ID: <6237f1802f7ae0d63ba54788384a694c@localhost.localdomain> luismarques updated this revision to Diff 223680. luismarques added a comment. NFC. Appease clang-format, even though it was a copy-pasted declaration. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67046/new/ https://reviews.llvm.org/D67046 Files: llvm/lib/Target/RISCV/RISCVInstrInfo.cpp llvm/lib/Target/RISCV/RISCVInstrInfo.h llvm/lib/Target/RISCV/RISCVSubtarget.cpp llvm/test/CodeGen/RISCV/disjoint.ll -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D67046.223680.patch Type: text/x-patch Size: 5841 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 15:56:38 2019 From: llvm-commits at lists.llvm.org (Renato Golin via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 22:56:38 +0000 (UTC) Subject: [PATCH] D57504: RFC: Prototype & Roadmap for vector predication in LLVM In-Reply-To: References: Message-ID: <661773a388c238aacef3148f2a9ca2fa@localhost.localdomain> rengolin added reviewers: huntergr, sdesmalen. rengolin added a comment. This work was mentioned on the SVE discussion about predication, adding arm folks, just in case. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D57504/new/ https://reviews.llvm.org/D57504 From llvm-commits at lists.llvm.org Mon Oct 7 16:07:21 2019 From: llvm-commits at lists.llvm.org (Stanislav Mekhanoshin via Phabricator via llvm-commits) Date: Mon, 07 Oct 2019 23:07:21 +0000 (UTC) Subject: [PATCH] D68607: [AMDGPU] Disable unused gfx10 dpp instructions Message-ID: rampitec created this revision. rampitec added reviewers: vpykhtin, kzhuravl. Herald added subscribers: hiraditya, t-tye, tpr, dstuttard, yaxunl, nhaehnle, wdng, jvesely, arsenm. Herald added a project: LLVM. Inhibit generation of unused real dpp instructions on gfx10 just like it is done on other subtargets. This does not change anything because these are illegal anyway and not accepted, but it does reduce the number of instruction definitions generated. 
https://reviews.llvm.org/D68607 Files: llvm/lib/Target/AMDGPU/VOP1Instructions.td llvm/lib/Target/AMDGPU/VOP2Instructions.td Index: llvm/lib/Target/AMDGPU/VOP2Instructions.td =================================================================== --- llvm/lib/Target/AMDGPU/VOP2Instructions.td +++ llvm/lib/Target/AMDGPU/VOP2Instructions.td @@ -939,11 +939,13 @@ } } multiclass VOP2_Real_dpp_gfx10 op> { + foreach _ = BoolToList(NAME#"_e32").Pfl.HasExtDPP>.ret in def _dpp_gfx10 : VOP2_DPP16(NAME#"_e32")> { let DecoderNamespace = "SDWA10"; } } multiclass VOP2_Real_dpp8_gfx10 op> { + foreach _ = BoolToList(NAME#"_e32").Pfl.HasExtDPP>.ret in def _dpp8_gfx10 : VOP2_DPP8(NAME#"_e32")> { let DecoderNamespace = "DPP8"; } @@ -981,6 +983,7 @@ } multiclass VOP2_Real_dpp_gfx10_with_name op, string opName, string asmName> { + foreach _ = BoolToList(opName#"_e32").Pfl.HasExtDPP>.ret in def _dpp_gfx10 : VOP2_DPP16(opName#"_e32")> { VOP2_Pseudo ps = !cast(opName#"_e32"); let AsmString = asmName # ps.Pfl.AsmDPP16; @@ -988,6 +991,7 @@ } multiclass VOP2_Real_dpp8_gfx10_with_name op, string opName, string asmName> { + foreach _ = BoolToList(opName#"_e32").Pfl.HasExtDPP>.ret in def _dpp8_gfx10 : VOP2_DPP8(opName#"_e32")> { VOP2_Pseudo ps = !cast(opName#"_e32"); let AsmString = asmName # ps.Pfl.AsmDPP8; @@ -1018,12 +1022,14 @@ let AsmString = asmName # !subst(", vcc", "", Ps.AsmOperands); let DecoderNamespace = "SDWA10"; } + foreach _ = BoolToList(opName#"_e32").Pfl.HasExtDPP>.ret in def _dpp_gfx10 : VOP2_DPP16(opName#"_e32"), asmName> { string AsmDPP = !cast(opName#"_e32").Pfl.AsmDPP16; let AsmString = asmName # !subst(", vcc", "", AsmDPP); let DecoderNamespace = "SDWA10"; } + foreach _ = BoolToList(opName#"_e32").Pfl.HasExtDPP>.ret in def _dpp8_gfx10 : VOP2_DPP8(opName#"_e32"), asmName> { string AsmDPP8 = !cast(opName#"_e32").Pfl.AsmDPP8; Index: llvm/lib/Target/AMDGPU/VOP1Instructions.td =================================================================== --- 
llvm/lib/Target/AMDGPU/VOP1Instructions.td +++ llvm/lib/Target/AMDGPU/VOP1Instructions.td @@ -506,11 +506,13 @@ } } multiclass VOP1_Real_dpp_gfx10 op> { + foreach _ = BoolToList(NAME#"_e32").Pfl.HasExtDPP>.ret in def _dpp_gfx10 : VOP1_DPP16(NAME#"_e32")> { let DecoderNamespace = "SDWA10"; } } multiclass VOP1_Real_dpp8_gfx10 op> { + foreach _ = BoolToList(NAME#"_e32").Pfl.HasExtDPP>.ret in def _dpp8_gfx10 : VOP1_DPP8(NAME#"_e32")> { let DecoderNamespace = "DPP8"; } -------------- next part -------------- A non-text attachment was scrubbed... Name: D68607.223687.patch Type: text/x-patch Size: 3141 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 18:04:05 2019 From: llvm-commits at lists.llvm.org (Tim Gymnich via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 01:04:05 +0000 (UTC) Subject: [PATCH] D68360: PR41162 Implement LKK remainder and divisibility algorithms [urem] In-Reply-To: References: Message-ID: TG908 marked 2 inline comments as done and an inline comment as not done. TG908 added a comment. ================ Comment at: llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp:3940 // makes sense since the simplification results in fatter code. if (DAG.isKnownNeverZero(N1) && !TLI.isIntDivCheap(VT, Attr)) { SDValue OptimizedDiv = ---------------- >>! In D68360#1698334, @lebedev.ri wrote: > Where is that lowering being performed? Right here ================ Comment at: llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp:4941-4943 + if (!D.isStrictlyPositive() || D.isMaxValue() || D.isOneValue() || + D.isPowerOf2()) { + // Divisor must be in the range of (1,2^N) ---------------- lebedev.ri wrote: > TG908 wrote: > > lebedev.ri wrote: > > > There is no such restriction. > > What do you mean? > > In my code? > > In LKK? > I haven't read the paper, so i'm only looking at this code, > and they don't incur any restrictions on the divisor: https://rise4fun.com/Alive/HiT Oops. You are right. 
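For readers following the divisor discussion above, here is a minimal sketch of the remainder-by-direct-computation idea as I understand it from the published LKK (Lemire, Kaser, Kurz) description, for 32-bit unsigned operands and a 64-bit magic constant. It is not the code from D68360; the function names and the MASK64 constant are made up for illustration, and Python integers are masked to emulate 64-bit wraparound. In this sketch a divisor of 1, powers of two, and the maximum value all produce the expected result.

```python
# Sketch of the LKK direct remainder for N = 32 bits (illustrative only).
# All names here are hypothetical and are not taken from the patch.

MASK64 = (1 << 64) - 1  # emulate unsigned 64-bit wraparound


def lkk_magic(d):
    """Return c = ceil(2^64 / d) truncated to 64 bits (d == 1 wraps to 0)."""
    assert 1 <= d < (1 << 32)
    return (MASK64 // d + 1) & MASK64


def lkk_urem(n, d):
    """Compute n % d for 32-bit unsigned n and d from the magic constant."""
    c = lkk_magic(d)
    lowbits = (c * n) & MASK64   # low 64 bits of c * n
    return (lowbits * d) >> 64   # high 64 bits of lowbits * d


def lkk_is_divisible(n, d):
    """Return True when d divides n, using the same magic constant."""
    c = lkk_magic(d)
    return ((c * n) & MASK64) <= ((c - 1) & MASK64)
```

In a compiler the constant c would be computed once at compile time for a known divisor, so only the two multiplications remain at run time.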
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68360/new/ https://reviews.llvm.org/D68360 From llvm-commits at lists.llvm.org Mon Oct 7 19:17:13 2019 From: llvm-commits at lists.llvm.org (Zixuan Wu (Zeson) via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 02:17:13 +0000 (UTC) Subject: [PATCH] D67148: [LoopVectorize][PowerPC] Estimate int and float register pressure separately in loop-vectorize In-Reply-To: References: Message-ID: <422a70f21f644c99b8f91d713fafd883@localhost.localdomain> wuzish added a comment. I'd like to upstream it first, and interface modification would be done in follow-up patches. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67148/new/ https://reviews.llvm.org/D67148 From llvm-commits at lists.llvm.org Mon Oct 7 19:39:43 2019 From: llvm-commits at lists.llvm.org (Zixuan Wu (Zeson) via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 02:39:43 +0000 (UTC) Subject: [PATCH] D67148: [LoopVectorize][PowerPC] Estimate int and float register pressure separately in loop-vectorize In-Reply-To: References: Message-ID: wuzish updated this revision to Diff 223724. wuzish added a comment. Add more comments and rebase to up-to-date. 
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67148/new/ https://reviews.llvm.org/D67148 Files: llvm/include/llvm/Analysis/TargetTransformInfo.h llvm/include/llvm/Analysis/TargetTransformInfoImpl.h llvm/include/llvm/CodeGen/BasicTTIImpl.h llvm/lib/Analysis/TargetTransformInfo.cpp llvm/lib/Target/AArch64/AArch64TargetTransformInfo.h llvm/lib/Target/ARM/ARMTargetTransformInfo.h llvm/lib/Target/PowerPC/PPCTargetTransformInfo.cpp llvm/lib/Target/PowerPC/PPCTargetTransformInfo.h llvm/lib/Target/SystemZ/SystemZTargetTransformInfo.cpp llvm/lib/Target/SystemZ/SystemZTargetTransformInfo.h llvm/lib/Target/WebAssembly/WebAssemblyTargetTransformInfo.cpp llvm/lib/Target/WebAssembly/WebAssemblyTargetTransformInfo.h llvm/lib/Target/X86/X86TargetTransformInfo.cpp llvm/lib/Target/X86/X86TargetTransformInfo.h llvm/lib/Target/XCore/XCoreTargetTransformInfo.h llvm/lib/Transforms/Scalar/LoopStrengthReduce.cpp llvm/lib/Transforms/Vectorize/LoopVectorize.cpp llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp llvm/test/Transforms/LoopVectorize/PowerPC/reg-usage.ll llvm/test/Transforms/LoopVectorize/X86/reg-usage-debug.ll llvm/test/Transforms/LoopVectorize/X86/reg-usage.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D67148.223724.patch Type: text/x-patch Size: 35069 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 22:27:24 2019 From: llvm-commits at lists.llvm.org (Zixuan Wu (Zeson) via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 05:27:24 +0000 (UTC) Subject: [PATCH] D67148: [LoopVectorize][PowerPC] Estimate int and float register pressure separately in loop-vectorize In-Reply-To: References: Message-ID: <1bbf90c66868cdf653f7e0967f40d378@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rG9f41deccc0e6: [LoopVectorize][PowerPC] Estimate int and float register pressure separately in… (authored by wuzish). 
Changed prior to commit: https://reviews.llvm.org/D67148?vs=223724&id=223785#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67148/new/ https://reviews.llvm.org/D67148 Files: llvm/include/llvm/Analysis/TargetTransformInfo.h llvm/include/llvm/Analysis/TargetTransformInfoImpl.h llvm/include/llvm/CodeGen/BasicTTIImpl.h llvm/lib/Analysis/TargetTransformInfo.cpp llvm/lib/Target/AArch64/AArch64TargetTransformInfo.h llvm/lib/Target/ARM/ARMTargetTransformInfo.h llvm/lib/Target/PowerPC/PPCTargetTransformInfo.cpp llvm/lib/Target/PowerPC/PPCTargetTransformInfo.h llvm/lib/Target/SystemZ/SystemZTargetTransformInfo.cpp llvm/lib/Target/SystemZ/SystemZTargetTransformInfo.h llvm/lib/Target/WebAssembly/WebAssemblyTargetTransformInfo.cpp llvm/lib/Target/WebAssembly/WebAssemblyTargetTransformInfo.h llvm/lib/Target/X86/X86TargetTransformInfo.cpp llvm/lib/Target/X86/X86TargetTransformInfo.h llvm/lib/Target/XCore/XCoreTargetTransformInfo.h llvm/lib/Transforms/Scalar/LoopStrengthReduce.cpp llvm/lib/Transforms/Vectorize/LoopVectorize.cpp llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp llvm/test/Transforms/LoopVectorize/PowerPC/reg-usage.ll llvm/test/Transforms/LoopVectorize/X86/reg-usage-debug.ll llvm/test/Transforms/LoopVectorize/X86/reg-usage.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D67148.223785.patch Type: text/x-patch Size: 32616 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 22:33:50 2019 From: llvm-commits at lists.llvm.org (serge via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 05:33:50 +0000 (UTC) Subject: [PATCH] D68589: [lit] Leverage argparse features to remove some code In-Reply-To: References: Message-ID: <511e442f09afd0b32680309794ca5499@localhost.localdomain> serge-sans-paille added inline comments. 
================ Comment at: llvm/utils/lit/lit/cl_arguments.py:204 + n = int(arg) + except: + raise _arg_error('positive integer', arg) ---------------- It's generally better to catch the conversion error explicitly (here ``ValueError``) ================ Comment at: llvm/utils/lit/lit/cl_arguments.py:211 +def _arg_error(desc, arg): + msg = "require %s, but found '%s'" % (desc, arg) + return argparse.ArgumentTypeError(msg) ---------------- require*s* Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68589/new/ https://reviews.llvm.org/D68589 From llvm-commits at lists.llvm.org Mon Oct 7 22:46:36 2019 From: llvm-commits at lists.llvm.org (Merge Guard [bot] via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 05:46:36 +0000 (UTC) Subject: [PATCH] D68560: Test for the build server -- DO NOT MERGE! In-Reply-To: References: Message-ID: <5e73decd8eb8ce00c287007b3e2f6c25@localhost.localdomain> merge_guards_bot added a comment. Build results are available at http://results.llvm-merge-guard.org/Phabricator-14 See http://jenkins.llvm-merge-guard.org/job/Phabricator/14/ for more details. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68560/new/ https://reviews.llvm.org/D68560 From llvm-commits at lists.llvm.org Mon Oct 7 23:02:37 2019 From: llvm-commits at lists.llvm.org (Matt Arsenault via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 06:02:37 +0000 (UTC) Subject: [PATCH] D68628: GlobalISel: Implement lower for G_SADDO/G_SSUBO Message-ID: arsenm created this revision. arsenm added reviewers: aemerson, aditya_nandakumar, paquette, dsanders. Herald added subscribers: Petar.Avramovic, volkan, rovka, nhaehnle, wdng, jvesely. Port directly from SelectionDAG, minus the path using ISD::SADDSAT/ISD::SSUBSAT.
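As a rough illustration of the generic (non-saturating) expansion referred to above, the sketch below models signed add/sub with an overflow flag. It reflects my understanding of the SelectionDAG expansion (the wrapped result is less than the LHS exactly when the RHS is negative for add, or positive for sub; anything else is overflow) and is not copied from D68628. The helper names and the default 32-bit width are assumptions for illustration.

```python
# Sketch of a signed add/sub-with-overflow expansion (illustrative only).
# Helper names are hypothetical; fixed-width arithmetic is emulated by masking.

def to_signed(x, bits):
    """Wrap x to `bits` bits and reinterpret it as a two's-complement value."""
    x &= (1 << bits) - 1
    return x - (1 << bits) if x >> (bits - 1) else x


def saddo(lhs, rhs, bits=32):
    """Return (wrapped sum, overflow flag) for signed lhs + rhs."""
    result = to_signed(lhs + rhs, bits)
    # Without overflow, result < lhs holds exactly when rhs is negative.
    overflow = (result < lhs) != (rhs < 0)
    return result, overflow


def ssubo(lhs, rhs, bits=32):
    """Return (wrapped difference, overflow flag) for signed lhs - rhs."""
    result = to_signed(lhs - rhs, bits)
    # Without overflow, result < lhs holds exactly when rhs is positive.
    overflow = (result < lhs) != (rhs > 0)
    return result, overflow
```

Both inputs are assumed to already be in the signed range for the chosen width.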
https://reviews.llvm.org/D68628 Files: include/llvm/CodeGen/GlobalISel/LegalizerHelper.h lib/CodeGen/GlobalISel/LegalizerHelper.cpp lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp test/CodeGen/AMDGPU/GlobalISel/legalize-saddo.mir test/CodeGen/AMDGPU/GlobalISel/legalize-ssubo.mir test/CodeGen/AMDGPU/GlobalISel/regbankselect-saddo.mir test/CodeGen/AMDGPU/GlobalISel/regbankselect-ssubo.mir -------------- next part -------------- A non-text attachment was scrubbed... Name: D68628.223789.patch Type: text/x-patch Size: 22182 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Mon Oct 7 23:02:37 2019 From: llvm-commits at lists.llvm.org (Craig Topper via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 06:02:37 +0000 (UTC) Subject: [PATCH] D68255: [X86] Remove AVX/AVX512 check from validateOperandSize, just always accept 512 In-Reply-To: References: Message-ID: craig.topper added a comment. New patch to factor in the target attribute D68627 CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68255/new/ https://reviews.llvm.org/D68255 From llvm-commits at lists.llvm.org Mon Oct 7 23:02:37 2019 From: llvm-commits at lists.llvm.org (Rui Ueyama via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 06:02:37 +0000 (UTC) Subject: [PATCH] D68570: Unify the two CRC implementations In-Reply-To: References: Message-ID: <31cd50d2b4b23fc01161c9aa55c99371@localhost.localdomain> ruiu added inline comments. ================ Comment at: llvm/include/llvm/Support/JamCRC.h:45 + CRC ^= 0xFFFFFFFFU; // Undo CRC-32 Init. + CRC = llvm::crc32(CRC, Data); + CRC ^= 0xFFFFFFFFU; // Undo CRC-32 XorOut. ---------------- This is the only place where you pass non-zero value as the first argument, and the way how that value is handled is a little irregular. So how about moving this class to CRC.h and define `llvm::crc32` as `llvm::crc32(ArrayRef)`? 
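To make the "undo init / undo xorout" wrapping in the quoted JamCRC.h snippet concrete, here is a small sketch using Python's zlib.crc32 as a stand-in for a zlib-style CRC-32 routine. It assumes the unified llvm::crc32 follows the same convention as zlib (initial XOR and final XOR with 0xFFFFFFFF); the function name jamcrc_update is made up for illustration.

```python
import zlib

# Sketch: JamCRC expressed on top of a zlib-style CRC-32 (illustrative only).
# JamCRC is CRC-32 without the final bit inversion, so both inversions that
# the CRC-32 routine applies are undone around the call.


def jamcrc_update(state, data):
    """Advance a JamCRC state (initially 0xFFFFFFFF) over `data`."""
    state ^= 0xFFFFFFFF              # undo CRC-32 init
    state = zlib.crc32(data, state)  # standard CRC-32, continuing from state
    state ^= 0xFFFFFFFF              # undo CRC-32 xorout
    return state & 0xFFFFFFFF
```

Starting from 0xFFFFFFFF, feeding the data in one piece or in several pieces gives the same final state, which is why the non-zero first argument is needed only for this wrapper.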
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68570/new/ https://reviews.llvm.org/D68570 From llvm-commits at lists.llvm.org Mon Oct 7 23:21:40 2019 From: llvm-commits at lists.llvm.org (Merge Guard [bot] via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 06:21:40 +0000 (UTC) Subject: [PATCH] D68560: Test for the build server -- DO NOT MERGE! In-Reply-To: References: Message-ID: <4788885db7c080dcbfdb38c42d91bb82@localhost.localdomain> merge_guards_bot added a comment. Build results are available at http://results.llvm-merge-guard.org/Phabricator-17 See http://jenkins.llvm-merge-guard.org/job/Phabricator/17/ for more details. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68560/new/ https://reviews.llvm.org/D68560 From llvm-commits at lists.llvm.org Mon Oct 7 23:21:41 2019 From: llvm-commits at lists.llvm.org (Wenlei He via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 06:21:41 +0000 (UTC) Subject: [PATCH] D68601: [SampleFDO] Add indexing for function profiles so they can be loaded on demand in ExtBinary format In-Reply-To: References: Message-ID: <8d0f2c6a88445cfef57246a2d9a88f5a@localhost.localdomain> wenlei added a comment. Thanks for making indexing available for the extended binary format. We've recently added indexing to the binary format as well in an internal patch, and also observed a similar (~30%) build time reduction for some large services. We also need to differentiate dead/cold symbols from new symbols, so the symbol list in the extended binary format is useful to us as well - will try it out. (I noticed that a small change in the profile generation tool (the equivalent of https://github.com/google/autofdo) is needed to populate that list though; it'd be nice if these were all part of LLVM.) Left some comments inline.
In addition, I think `llvm-profdata/Inputs/sample-profile.proftext` need to be updated as well to include the new section for `roundtrip.test` ================ Comment at: llvm/include/llvm/ProfileData/SampleProfWriter.h:206 virtual void initSectionLayout() override { SectionLayout = {{SecProfSummary, 0, 0, 0}, {SecNameTable, 0, 0, 0}, ---------------- With the addition of offset table at the end of sections but in the middle of section header, I found the name SectionLayout a bit confusing. This array actually represents the layout/order of section header (e.g. the reader order), but not the section (payload) layout (the writer order). I guess renaming it SectionHdrLayout or something alike may help.. Besides, some comments explaining why SecFuncOffsetTable need to be in the middle could help too, it looks like a hidden trick until I saw the comments in SampleProfileReaderExtBinaryBase::getFileSize(). ================ Comment at: llvm/lib/ProfileData/SampleProfReader.cpp:547-548 + continue; + Data = Start + iter->second; + if (std::error_code EC = readFuncProfile()) + return EC; ---------------- nit: similar to line 539, we could assert the random access here never attempts to touch anything beyond Size? Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68601/new/ https://reviews.llvm.org/D68601 From llvm-commits at lists.llvm.org Mon Oct 7 23:34:43 2019 From: llvm-commits at lists.llvm.org (Rui Ueyama via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 06:34:43 +0000 (UTC) Subject: [PATCH] D68062: Propeller lld framework for basicblock sections In-Reply-To: References: Message-ID: ruiu added inline comments. ================ Comment at: lld/ELF/Propeller.cpp:54 +Propeller::Propeller(lld::elf::SymbolTable *ST) + : Symtab(ST), Views(), CFGMap(), Propf(nullptr) {} + ---------------- If an object can be initialized by the default ctor, you can omit them from the initializer list, so please remove `View()` and `CFGMap`. 
================ Comment at: lld/ELF/Propeller.cpp:65-66 + ++LineNo; + if (line.empty()) continue; + if (line[0] != '@') break; + ++outputFileTagSeen; ---------------- Is this how clang-format formatted? If not, could you please run clang-format-diff to format this patch? ================ Comment at: lld/ELF/Propeller.cpp:81 + +SymbolEntry *Propfile::findSymbol(StringRef symName) { + std::pair symNameSplit = symName.split(".llvm."); ---------------- Could you write a function comment for this function? ================ Comment at: lld/ELF/Propeller.cpp:82 +SymbolEntry *Propfile::findSymbol(StringRef symName) { + std::pair symNameSplit = symName.split(".llvm."); + StringRef funcName; ---------------- Could you write a comment as to what this magic string `.llvm.` is? ================ Comment at: lld/ELF/Propeller.h:137 +// +// Symbols +// 1 0 N.init/_init ---------------- ruiu wrote: > Indent this part Could you indent these lines? ================ Comment at: lld/ELF/Propeller.h:280-284 + // ELFViewDeleter, which has its implementation in .cpp, saves us from having + // to have full ELFView definition visibile here. + struct ELFViewDeleter { + void operator()(ELFView *v); + }; ---------------- shenhan wrote: > ruiu wrote: > > shenhan wrote: > > > ruiu wrote: > > > > I'm puzzled by this -- do you really need this? > > > Yes, unique_ptr requires Type's dtor to be visible at this place. However, Propeller.h is the top level hdr file within propeller framework, thus it could not include any other propeller hdr file (only the other way around), so we don't have ELFView (now ObjectView after renaming)'s full type defintion here. > > > > > > To overcome this, we provide a customer deleter functor - ELFViewDeleter, and define it in the ELFView's cpp file. > > But I think you can fix that error by moving the definition of this class's ctor to a .cpp file (i.e. don't inline). 
> Yes, alternatively, we can move Propeller's ctor *AND* dtor to Propeller.cpp, that also implies putting an empty "dtor" definition in the .cpp. Because compiler-synthesized dtors are inlined. What shall we do, keep the code as is, or move ctor and put an empty dtor in the .cpp? I am ok w/ either. I think I prefer outlining the ctor and the dtor because it looks more natural. ================ Comment at: lld/ELF/PropellerELFCfg.h:1 +//===-------------------- PropellerELFCfg.h -------------------------------===// +// ---------------- Remove ELF from this filename, as this file is already in ELF directory. ================ Comment at: lld/ELF/PropellerELFCfg.h:29 +//===----------------------------------------------------------------------===// +#ifndef LLD_ELF_PROPELLER_ELF_CFG_H +#define LLD_ELF_PROPELLER_ELF_CFG_H ---------------- Just like other files, please leave a blank line after a file comment. ================ Comment at: lld/ELF/PropellerELFCfg.h:62 + enum EdgeType : char { + INTRA_FUNC = 0, + INTRA_RSC, ---------------- I think `=0` is the default. ================ Comment at: lld/ELF/PropellerELFCfg.h:69 + INTER_FUNC_RETURN, + } Type{INTRA_FUNC}; + ---------------- Use `= INTRA_FUNC` instead of an initializer list for consistency (it is important to keep the code consistent with other files.) ================ Comment at: lld/ELF/PropellerELFCfg.h:174-176 + uint32_t BB{0}; + uint32_t BBWoutAddr{0}; + uint32_t InvalidCFGs{0}; ---------------- Please use `=` instead of `{}` ================ Comment at: lld/include/lld/Common/PropellerCommon.h:11 + +#include + ---------------- This line should be moved above `using`, just like other files. ================ Comment at: lld/include/lld/Common/PropellerCommon.h:1 +#ifndef LLD_ELF_PROPELLER_COMMON_H +#define LLD_ELF_PROPELLER_COMMON_H ---------------- shenhan wrote: > ruiu wrote: > > shenhan wrote: > > > ruiu wrote: > > > > This header is included only once by another header, so please merge them together. 
> > > Yup, let me explain. We have the creaet_llvm_prof tool which is part of google/autofdo repository (https://github.com/shenhanc78/autofdo/blob/plo-dev/llvm_propeller_profile_writer.h#L14). create_llvm_prof tool and propeller need to have exactly same symbolentry definition to cooperate. To make a copy of the class definition in google/autofdo repository is not a good idea. So we create this common inclusion hdr file. > > > > > > > > This is an lld's private header directory. If you have a header that is shared by multiple LLVM subprojects, consider moving it to LLVM. > I found PropellerCommon.h exported into the llvm installation directory here: > llvm-install/include/lld/Common > along with llvm-install/include/llvm and llvm-install/include/clang. > > And I think it's probably inappropriate to place PropellerCommon.h anywhere under llvm-install/include/clang or llvm-install/include/llvm. So you are using this file only with in lld? If so, move this to lld/ELF, because we don't have a non-ELF implementation yet. Also, please consider rename SymbolEntry, even if this file is auto-generated by some other script. That name is extremely confusing in the linker's context. "Symbol" is one of the central data structures in the linker, and when you say "symbol", that means the symbol that we read from object files. Repository: rLLD LLVM Linker CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68062/new/ https://reviews.llvm.org/D68062 From llvm-commits at lists.llvm.org Tue Oct 8 00:01:55 2019 From: llvm-commits at lists.llvm.org (Jian Cai via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 07:01:55 +0000 (UTC) Subject: [PATCH] D68598: [IA] Recognize hexadecimal escape sequences In-Reply-To: References: Message-ID: <28b04d29ea5f06b6f6f78d0221075d3f@localhost.localdomain> jcai19 added inline comments. ================ Comment at: llvm/test/MC/AsmParser/directive_ascii.s:46 +TEST7: + .ascii "\x64\Xa6B" ---------------- Thanks for the update. 
Could we have one more test case with last eight bits within the range of 7f and ff, and maybe with lower-case letter, e.g. \x8a? Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68598/new/ https://reviews.llvm.org/D68598 From llvm-commits at lists.llvm.org Tue Oct 8 00:08:49 2019 From: llvm-commits at lists.llvm.org (Clement Courbet via llvm-commits) Date: Tue, 08 Oct 2019 07:08:49 -0000 Subject: [llvm] r374020 - [llvm-exegesis] Add stabilization test with config Message-ID: <20191008070849.349608DF33@lists.llvm.org> Author: courbet Date: Tue Oct 8 00:08:48 2019 New Revision: 374020 URL: http://llvm.org/viewvc/llvm-project?rev=374020&view=rev Log: [llvm-exegesis] Add stabilization test with config In preparation for D68629. Added: llvm/trunk/test/tools/llvm-exegesis/X86/analysis-cluster-stabilization-config.test Added: llvm/trunk/test/tools/llvm-exegesis/X86/analysis-cluster-stabilization-config.test URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/llvm-exegesis/X86/analysis-cluster-stabilization-config.test?rev=374020&view=auto ============================================================================== --- llvm/trunk/test/tools/llvm-exegesis/X86/analysis-cluster-stabilization-config.test (added) +++ llvm/trunk/test/tools/llvm-exegesis/X86/analysis-cluster-stabilization-config.test Tue Oct 8 00:08:48 2019 @@ -0,0 +1,43 @@ +# RUN: llvm-exegesis -mode=analysis -benchmarks-file=%s -analysis-inconsistencies-output-file=- -analysis-clustering-epsilon=0.5 -analysis-inconsistency-epsilon=0.5 -analysis-display-unstable-clusters -analysis-numpoints=1 | FileCheck -check-prefixes=CHECK-UNSTABLE %s + +# We have two measurements with different measurements for SQRTSSr, but they +# have different configs, so they should not be placed in the same cluster by +# stabilization. 
+ +# CHECK-UNSTABLE: SQRTSSr +# CHECK-UNSTABLE: SQRTSSr + +--- +mode: latency +key: + instructions: + - 'SQRTSSr XMM11 XMM11' + config: 'config1' + register_initial_values: + - 'XMM11=0x0' +cpu_name: bdver2 +llvm_triple: x86_64-unknown-linux-gnu +num_repetitions: 10000 +measurements: + - { key: latency, value: 90.1111, per_snippet_value: 90.1111 } +error: '' +info: Repeating a single explicitly serial instruction +assembled_snippet: 4883EC10C7042400000000C744240400000000C744240800000000C744240C00000000C57A6F1C244883C410F3450F51DBF3450F51DBF3450F51DBF3450F51DBF3450F51DBF3450F51DBF3450F51DBF3450F51DBF3450F51DBF3450F51DBF3450F51DBF3450F51DBF3450F51DBF3450F51DBF3450F51DBF3450F51DBC3 +... +--- +mode: latency +key: + instructions: + - 'SQRTSSr XMM11 XMM11' + config: 'config2' + register_initial_values: + - 'XMM11=0x0' +cpu_name: bdver2 +llvm_triple: x86_64-unknown-linux-gnu +num_repetitions: 10000 +measurements: + - { key: latency, value: 100, per_snippet_value: 100 } +error: '' +info: Repeating a single explicitly serial instruction +assembled_snippet: 4883EC10C7042400000000C744240400000000C744240800000000C744240C00000000C57A6F1C244883C410F3450F51DBF3450F51DBF3450F51DBF3450F51DBF3450F51DBF3450F51DBF3450F51DBF3450F51DBF3450F51DBF3450F51DBF3450F51DBF3450F51DBF3450F51DBF3450F51DBF3450F51DBF3450F51DBC3 +... From llvm-commits at lists.llvm.org Tue Oct 8 00:11:02 2019 From: llvm-commits at lists.llvm.org (Clement Courbet via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 07:11:02 +0000 (UTC) Subject: [PATCH] D68629: [llvm-exegesis] Finish plumbing the `Config` field. Message-ID: courbet created this revision. courbet added a reviewer: gchatelet. Herald added a subscriber: tschuett. Herald added a project: LLVM. Right now there are no snippet generators that emit the `Config` Field, but I plan to add it to investigate LEA operands for PR32326. What was broken was: - `Config` Was not propagated up until the BenchmarkResult::Key. 
- Clustering should really consider different configs as measuring different things, so we should stabilize on (Opcode, Config) instead of just Opcode. Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D68629 Files: llvm/test/tools/llvm-exegesis/X86/analysis-cluster-stabilization-config.test llvm/tools/llvm-exegesis/lib/BenchmarkCode.h llvm/tools/llvm-exegesis/lib/BenchmarkResult.h llvm/tools/llvm-exegesis/lib/BenchmarkRunner.cpp llvm/tools/llvm-exegesis/lib/Clustering.cpp llvm/tools/llvm-exegesis/lib/CodeTemplate.h llvm/tools/llvm-exegesis/lib/SnippetFile.cpp llvm/tools/llvm-exegesis/lib/SnippetGenerator.cpp llvm/unittests/tools/llvm-exegesis/X86/SnippetFileTest.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D68629.223790.patch Type: text/x-patch Size: 14537 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 00:11:02 2019 From: llvm-commits at lists.llvm.org (MyDeveloperDay via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 07:11:02 +0000 (UTC) Subject: [PATCH] D31635: [clang-format] Added ReferenceAlignmentStyle option In-Reply-To: References: Message-ID: <679f1058188fc0d0f70642ed9b554eeb@localhost.localdomain> MyDeveloperDay added a comment. I hadn't realized that AStyle had a separate reference alignment style, anyone coming from Astyle potentially would miss that behaviour as it would represent itself as subtle changes. For this example you state, with your patch int Add2( BTree*& Root, char *szToAdd ) How is this formatted in comparison to Astyle? CHANGES SINCE LAST ACTION https://reviews.llvm.org/D31635/new/ https://reviews.llvm.org/D31635 From llvm-commits at lists.llvm.org Tue Oct 8 00:11:45 2019 From: llvm-commits at lists.llvm.org (Merge Guard [bot] via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 07:11:45 +0000 (UTC) Subject: [PATCH] D68560: Test for the build server -- DO NOT MERGE! 
In-Reply-To: References: Message-ID: <1b0b1a44202f752cd13c0b3ddc4cc4f8@localhost.localdomain> merge_guards_bot added a comment. Build results are available at http://results.llvm-merge-guard.org/Phabricator-21 See http://jenkins.llvm-merge-guard.org/job/Phabricator/21/ for more details. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68560/new/ https://reviews.llvm.org/D68560 From llvm-commits at lists.llvm.org Tue Oct 8 00:20:40 2019 From: llvm-commits at lists.llvm.org (Yonghong Song via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 07:20:40 +0000 (UTC) Subject: [PATCH] D67980: [BPF] do compile-once run-everywhere relocation for bitfields In-Reply-To: References: Message-ID: yonghong-song updated this revision to Diff 223792. yonghong-song edited the summary of this revision. yonghong-song added a comment. Use only one relocation for left shift to optimize for direct load. In big endian mode, the bpf_probe_read() path could have 4 more arithmetic instructions than in little endian mode. The performance impact should be negligible compared to bpf_probe_read() itself.
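For context on the left-shift/right-shift relocations mentioned above, here is a minimal sketch of how a bitfield can be recovered from a loaded 64-bit word with one left shift followed by one right shift (arithmetic when the field is signed). This is only an illustration of the shift technique under stated assumptions: a 64-bit load, little-endian bit numbering, and hypothetical helper names; the actual relocation values are produced by the compiler and are not computed this way in the patch.

```python
# Sketch of bitfield extraction by shifting (illustrative only).
# Helper names and the little-endian layout assumption are hypothetical.

MASK64 = (1 << 64) - 1


def field_shifts_le(bit_offset, bit_size):
    """Return (lshift, rshift) for a field inside a 64-bit little-endian word."""
    return 64 - (bit_offset + bit_size), 64 - bit_size


def extract_bitfield(word, lshift, rshift, signed):
    """Shift the field to the top of the word, then back down to bit 0."""
    value = (word << lshift) & MASK64   # drop the bits above the field
    if signed and value >> 63:
        value -= 1 << 64                # reinterpret as a signed 64-bit value
    return value >> rshift              # arithmetic shift for negative values
```

The left shift discards everything above the field and the right shift discards everything below it, so no mask constant depending on the field layout is needed.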
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67980/new/ https://reviews.llvm.org/D67980 Files: clang/include/clang/Basic/BuiltinsBPF.def clang/include/clang/Basic/DiagnosticSemaKinds.td clang/include/clang/Basic/TargetBuiltins.h clang/include/clang/Sema/Sema.h clang/include/clang/module.modulemap clang/lib/Basic/Targets/BPF.cpp clang/lib/Basic/Targets/BPF.h clang/lib/CodeGen/CGBuiltin.cpp clang/lib/CodeGen/CGExpr.cpp clang/lib/CodeGen/CodeGenFunction.h clang/lib/Sema/SemaChecking.cpp clang/test/CodeGen/builtins-bpf-preserve-field-info-1.c clang/test/CodeGen/builtins-bpf-preserve-field-info-2.c clang/test/Sema/builtins-bpf.c llvm/include/llvm/IR/IntrinsicsBPF.td llvm/lib/Target/BPF/BPF.h llvm/lib/Target/BPF/BPFAbstractMemberAccess.cpp llvm/lib/Target/BPF/BPFCORE.h llvm/lib/Target/BPF/BPFTargetMachine.cpp llvm/lib/Target/BPF/BTF.h llvm/lib/Target/BPF/BTFDebug.cpp llvm/lib/Target/BPF/BTFDebug.h llvm/test/CodeGen/BPF/CORE/intrinsic-array.ll llvm/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-byte-size-1.ll llvm/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-byte-size-2.ll llvm/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-byte-size-3.ll llvm/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-existence-1.ll llvm/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-existence-2.ll llvm/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-existence-3.ll llvm/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-lshift-1.ll llvm/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-lshift-2.ll llvm/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-rshift-1.ll llvm/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-rshift-2.ll llvm/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-rshift-3.ll llvm/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-signedness-1.ll llvm/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-signedness-2.ll llvm/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-signedness-3.ll llvm/test/CodeGen/BPF/CORE/intrinsic-struct.ll llvm/test/CodeGen/BPF/CORE/intrinsic-union.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-access-str.ll 
llvm/test/CodeGen/BPF/CORE/offset-reloc-basic.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-cast-array-1.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-cast-array-2.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-cast-struct-1.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-cast-struct-2.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-cast-struct-3.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-cast-union-1.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-cast-union-2.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-end-load.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-end-ret.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-fieldinfo-1.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-fieldinfo-2.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-global-1.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-global-2.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-global-3.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-ignore.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-middle-chain.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-multi-array-1.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-multi-array-2.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-multilevel.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-pointer-1.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-pointer-2.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-struct-anonymous.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-struct-array.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-typedef-array.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-typedef-struct.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-typedef-union.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-typedef.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-union.ll -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D67980.223792.patch Type: text/x-patch Size: 226105 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 00:29:28 2019 From: llvm-commits at lists.llvm.org (MyDeveloperDay via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 07:29:28 +0000 (UTC) Subject: [PATCH] D31635: [clang-format] Added ReferenceAlignmentStyle option In-Reply-To: References: Message-ID: <6e2709f8c2544178292b2205c486b287@localhost.localdomain> MyDeveloperDay added a comment. Thank you for the patch, just a couple of minor things (the doc is important). As Astyle has this option I think this is a gap in clang-format (this is also not the first request I've seen for this https://bugs.llvm.org/show_bug.cgi?id=42165) Given your project is waiting to use it (https://github.com/x64dbg/x64dbg) and the number of forks/stars I think we should consider this a significant enough project using this style to warrant it being worth landing. ================ Comment at: clang/include/clang/Format/Format.h:1681 + /// \brief The ``&`` and ``&&`` alignment style. + enum ReferenceAlignmentStyle { + /// Align reference like ``PointerAlignment``. 
---------------- You are missing a documentation change (there is a python script which update the ClangFormatStyleOption.rst from this header) ================ Comment at: clang/unittests/Format/FormatTest.cpp:857 + verifyFormat("int&& f3(int& b, int&& c, int* a);", Style); + verifyFormat("int* a = f1();\nint& b = f2();\nint&& c = f3();", Style); + Style.PointerAlignment = FormatStyle::PAS_Right; ---------------- Nit: for completeness (I might be tempted to add the other styles `PAS_Middle` and `PAS_Right` for pointer alignment where `ReferenceAlignment` is RAS_Pointer (given there is code which checks) ```getTokenPointerAlignment(Left) != FormatStyle::PAS_Right``` ================ Comment at: clang/unittests/Format/FormatTest.cpp:876 + verifyFormat("int * a = f1();\nint &b = f2();\nint &&c = f3();", Style); +} + ---------------- add your example as a test just comment it out and add FIXME if it doesn't work, then it will tell people this is what we are aiming for. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D31635/new/ https://reviews.llvm.org/D31635 From llvm-commits at lists.llvm.org Tue Oct 8 00:29:29 2019 From: llvm-commits at lists.llvm.org (Rui Ueyama via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 07:29:29 +0000 (UTC) Subject: [PATCH] D68441: Ignore --export-dynamic if --relocatable is given In-Reply-To: References: Message-ID: <3138ec5ef6d49addca3c126e6f9b55de@localhost.localdomain> ruiu updated this revision to Diff 223793. ruiu added a comment. Herald added subscribers: dexonsmith, steven_wu, hiraditya. 
- report error instead of ignoring -export-dynamic Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68441/new/ https://reviews.llvm.org/D68441 Files: lld/ELF/Driver.cpp lld/test/ELF/driver.test lld/test/ELF/lto/relocation-model.ll Index: lld/test/ELF/lto/relocation-model.ll =================================================================== --- lld/test/ELF/lto/relocation-model.ll +++ lld/test/ELF/lto/relocation-model.ll @@ -14,9 +14,6 @@ ; RUN: ld.lld %t.o -o %t-out -save-temps --export-dynamic --noinhibit-exec ; RUN: llvm-readobj -r %t-out.lto.o | FileCheck %s --check-prefix=STATIC -; RUN: ld.lld %t.o -o %t-out -save-temps -r --export-dynamic -; RUN: llvm-readobj -r %t-out.lto.o | FileCheck %s --check-prefix=STATIC - ;; PIC source. @@ -29,9 +26,6 @@ ; RUN: ld.lld %t.pic.o -o %t-out -save-temps --export-dynamic --noinhibit-exec ; RUN: llvm-readobj -r %t-out.lto.o | FileCheck %s --check-prefix=STATIC -; RUN: ld.lld %t.pic.o -o %t-out -save-temps -r --export-dynamic -; RUN: llvm-readobj -r %t-out.lto.o | FileCheck %s --check-prefix=PIC - ;; Explicit flag. 
Index: lld/test/ELF/driver.test =================================================================== --- lld/test/ELF/driver.test +++ lld/test/ELF/driver.test @@ -50,6 +50,7 @@ # RUN: not ld.lld -r -pie %t -o %tfail 2>&1 | FileCheck -check-prefix=ERR6 %s # ERR6: -r and -pie may not be used together +## Attempt to use -shared and -pie together ## Attempt to use -shared and -pie together # RUN: not ld.lld -shared -pie %t -o %tfail 2>&1 | FileCheck -check-prefix=ERR7 %s # ERR7: -shared and -pie may not be used together @@ -72,6 +73,10 @@ # RUN: not ld.lld %t -z max-page-size 2>&1 | FileCheck -check-prefix=ERR11 %s # ERR11: unknown -z value: max-page-size +## Attempt to use -r and --export-dynamic together +# RUN: not ld.lld -r -export-dynamic %t -o %tfail 2>&1 | FileCheck -check-prefix=ERR12 %s +# ERR12: -r and --export-dynamic may not be used together + .globl _start _start: nop Index: lld/ELF/Driver.cpp =================================================================== --- lld/ELF/Driver.cpp +++ lld/ELF/Driver.cpp @@ -334,6 +334,8 @@ error("-r and --icf may not be used together"); if (config->pie) error("-r and -pie may not be used together"); + if (config->exportDynamic) + error("-r and --export-dynamic may not be used together"); } if (config->executeOnly) { -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68441.223793.patch Type: text/x-patch Size: 2189 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 00:39:50 2019 From: llvm-commits at lists.llvm.org (Kadir Cetinkaya via llvm-commits) Date: Tue, 08 Oct 2019 07:39:50 -0000 Subject: [llvm] r374021 - [LoopVectorize] Fix non-debug builds after rL374017 Message-ID: <20191008073950.6C9A18E8E3@lists.llvm.org> Author: kadircet Date: Tue Oct 8 00:39:50 2019 New Revision: 374021 URL: http://llvm.org/viewvc/llvm-project?rev=374021&view=rev Log: [LoopVectorize] Fix non-debug builds after rL374017 Modified: llvm/trunk/lib/Transforms/Vectorize/LoopVectorize.cpp Modified: llvm/trunk/lib/Transforms/Vectorize/LoopVectorize.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Vectorize/LoopVectorize.cpp?rev=374021&r1=374020&r2=374021&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Vectorize/LoopVectorize.cpp (original) +++ llvm/trunk/lib/Transforms/Vectorize/LoopVectorize.cpp Tue Oct 8 00:39:50 2019 @@ -5442,17 +5442,19 @@ LoopVectorizationCostModel::calculateReg LLVM_DEBUG(dbgs() << "LV(REG): VF = " << VFs[i] << '\n'); LLVM_DEBUG(dbgs() << "LV(REG): Found max usage: " << MaxUsages[i].size() << " item\n"); - for (const auto& pair : MaxUsages[i]) { + for (const auto& Pair : MaxUsages[i]) { + (void)Pair; LLVM_DEBUG(dbgs() << "LV(REG): RegisterClass: " - << TTI.getRegisterClassName(pair.first) - << ", " << pair.second << " registers \n"); + << TTI.getRegisterClassName(Pair.first) + << ", " << Pair.second << " registers \n"); } LLVM_DEBUG(dbgs() << "LV(REG): Found invariant usage: " << Invariant.size() << " item\n"); - for (const auto& pair : Invariant) { + for (const auto& Pair : Invariant) { + (void)Pair; LLVM_DEBUG(dbgs() << "LV(REG): RegisterClass: " - << TTI.getRegisterClassName(pair.first) - << ", " << pair.second << " registers \n"); + << TTI.getRegisterClassName(Pair.first) + << ", " << Pair.second << " 
registers \n"); } RU.LoopInvariantRegs = Invariant; From llvm-commits at lists.llvm.org Tue Oct 8 00:38:32 2019 From: llvm-commits at lists.llvm.org (Fangrui Song via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 07:38:32 +0000 (UTC) Subject: [PATCH] D68062: Propeller lld framework for basicblock sections In-Reply-To: References: Message-ID: <16e588e28d6bf8f96bc2f5c7478db551@localhost.localdomain> MaskRay added inline comments. ================ Comment at: lld/ELF/PropellerELFCfg.cpp:1 +//===-------------------- PropellerELFCfg.cpp -----------------------------===// +// ---------------- There are still open questions about the linker rewriting approach: https://lists.llvm.org/pipermail/llvm-dev/2019-October/135616.html Let's see what conclusions people will reach. ================ Comment at: lld/ELF/PropellerELFCfg.cpp:19 +#include "llvm/Object/ObjectFile.h" +// Needed by ELFSectionRef & ELFSymbolRef. +#include "llvm/Object/ELFObjectFile.h" ---------------- Delete the comment. I suspect a comment on its own line may interact badly with clang-format. ================ Comment at: lld/ELF/PropellerELFCfg.cpp:44 +bool ControlFlowGraph::writeAsDotGraph(const char *cfgOutName) { + FILE *fp = fopen(cfgOutName, "w"); + if (!fp) { ---------------- Use StringRef. Avoid const char * ``` std::error_code ec; raw_fd_ostream os(..., ec, sys::fs::OF_None); ``` ================ Comment at: lld/ELF/PropellerELFCfg.cpp:223 + StringRef symName = *s; + /* + lld::elf::Symbol *PSym = ---------------- Delete unused code. ================ Comment at: lld/ELF/PropellerELFCfg.h:95 + + const static uint64_t InvalidAddress = -1l; + ---------------- delete the suffix l. ================ Comment at: lld/ELF/PropellerELFCfg.h:101 + return BName.size(); + else + return 0; ---------------- delete else ================ Comment at: lld/ELF/PropellerELFCfg.h:185 + // See implementaion comments in .cpp. 
+ void buildCFG(ControlFlowGraph &cfg, const SymbolRef &cfgSym, + std::map> &nodeMap); ---------------- If the ordered property is not required, map -> unordered_map ================ Comment at: lld/ELF/PropellerELFCfg.h:228 +std::ostream &operator<<(std::ostream &out, const ControlFlowGraph &cfg); +} +} // namespace lld ---------------- not clang-format'ed, i.e. i don't see ``` } // namespace propeller ``` Repository: rLLD LLVM Linker CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68062/new/ https://reviews.llvm.org/D68062 From llvm-commits at lists.llvm.org Tue Oct 8 00:47:36 2019 From: llvm-commits at lists.llvm.org (Clement Courbet via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 07:47:36 +0000 (UTC) Subject: [PATCH] D68629: [llvm-exegesis] Finish plumbing the `Config` field. In-Reply-To: References: Message-ID: courbet updated this revision to Diff 223795. courbet added a comment. Rebase Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68629/new/ https://reviews.llvm.org/D68629 Files: llvm/test/tools/llvm-exegesis/X86/analysis-cluster-stabilization-config.test llvm/tools/llvm-exegesis/lib/BenchmarkCode.h llvm/tools/llvm-exegesis/lib/BenchmarkResult.h llvm/tools/llvm-exegesis/lib/BenchmarkRunner.cpp llvm/tools/llvm-exegesis/lib/Clustering.cpp llvm/tools/llvm-exegesis/lib/CodeTemplate.h llvm/tools/llvm-exegesis/lib/SnippetFile.cpp llvm/tools/llvm-exegesis/lib/SnippetGenerator.cpp llvm/unittests/tools/llvm-exegesis/X86/SnippetFileTest.cpp -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68629.223795.patch Type: text/x-patch Size: 12952 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 00:47:36 2019 From: llvm-commits at lists.llvm.org (Chang Lin via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 07:47:36 +0000 (UTC) Subject: [PATCH] D53876: Preserve loop metadata when splitting exit blocks In-Reply-To: References: Message-ID: <805daab9a0e6fed9ec09850d6a2057ea@localhost.localdomain> clin1 updated this revision to Diff 223794. clin1 added a comment. Restoring latest patch. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D53876/new/ https://reviews.llvm.org/D53876 Files: lib/Transforms/Utils/LoopUtils.cpp test/Transforms/LoopSimplify/preserve-llvm-loop-metadata2.ll test/Transforms/LoopSimplify/preserve-llvm-loop-metadata3.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D53876.223794.patch Type: text/x-patch Size: 8034 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 00:49:11 2019 From: llvm-commits at lists.llvm.org (George Rimar via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 07:49:11 +0000 (UTC) Subject: [PATCH] D68441: Ignore --export-dynamic if --relocatable is given In-Reply-To: References: Message-ID: grimar accepted this revision. grimar added a comment. This revision is now accepted and ready to land. LGTM with 2 nits. ================ Comment at: lld/test/ELF/driver.test:54 +## Attempt to use -shared and -pie together ## Attempt to use -shared and -pie together # RUN: not ld.lld -shared -pie %t -o %tfail 2>&1 | FileCheck -check-prefix=ERR7 %s ---------------- Remove this duplication. 
================ Comment at: lld/test/ELF/driver.test:77 +## Attempt to use -r and --export-dynamic together +# RUN: not ld.lld -r -export-dynamic %t -o %tfail 2>&1 | FileCheck -check-prefix=ERR12 %s +# ERR12: -r and --export-dynamic may not be used together ---------------- -o %tfail -> -o /dev/null Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68441/new/ https://reviews.llvm.org/D68441 From llvm-commits at lists.llvm.org Tue Oct 8 01:03:40 2019 From: llvm-commits at lists.llvm.org (Rui Ueyama via llvm-commits) Date: Tue, 08 Oct 2019 08:03:40 -0000 Subject: [lld] r374022 - Report error if -export-dynamic is used with -r Message-ID: <20191008080340.BAB3E8865A@lists.llvm.org> Author: ruiu Date: Tue Oct 8 01:03:40 2019 New Revision: 374022 URL: http://llvm.org/viewvc/llvm-project?rev=374022&view=rev Log: Report error if -export-dynamic is used with -r The combination of the two flags doesn't make sense. And other linkers seem to just ignore --export-dynamic if --relocatable is given, but we probably should report it as an error to let users know that is an invalid combination. 
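The check described in this log can be illustrated with a small standalone sketch. It is not lld's code: the `Options` struct and the `checkOptions` signature below are hypothetical stand-ins (lld itself reads a global `config` and calls `error()`), and only the two diagnostic strings are taken from the patch.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified option set; lld's real Configuration has many more fields.
struct Options {
  bool relocatable = false;   // -r
  bool pie = false;           // -pie
  bool exportDynamic = false; // --export-dynamic
};

// Returns one diagnostic per invalid combination instead of silently
// ignoring a flag. (Assumption: errors are collected rather than reported
// through a global error() function as in lld.)
std::vector<std::string> checkOptions(const Options &opts) {
  std::vector<std::string> errors;
  if (opts.relocatable) {
    if (opts.pie)
      errors.push_back("-r and -pie may not be used together");
    if (opts.exportDynamic)
      errors.push_back("-r and --export-dynamic may not be used together");
  }
  return errors;
}
```

Reporting the conflict, rather than ignoring `--export-dynamic` as other linkers appear to do, makes the invalid combination visible to the user at link time.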
Fixes https://bugs.llvm.org/show_bug.cgi?id=43552 Differential Revision: https://reviews.llvm.org/D68441 Modified: lld/trunk/ELF/Driver.cpp lld/trunk/test/ELF/driver.test lld/trunk/test/ELF/lto/relocation-model.ll Modified: lld/trunk/ELF/Driver.cpp URL: http://llvm.org/viewvc/llvm-project/lld/trunk/ELF/Driver.cpp?rev=374022&r1=374021&r2=374022&view=diff ============================================================================== --- lld/trunk/ELF/Driver.cpp (original) +++ lld/trunk/ELF/Driver.cpp Tue Oct 8 01:03:40 2019 @@ -334,6 +334,8 @@ static void checkOptions() { error("-r and --icf may not be used together"); if (config->pie) error("-r and -pie may not be used together"); + if (config->exportDynamic) + error("-r and --export-dynamic may not be used together"); } if (config->executeOnly) { Modified: lld/trunk/test/ELF/driver.test URL: http://llvm.org/viewvc/llvm-project/lld/trunk/test/ELF/driver.test?rev=374022&r1=374021&r2=374022&view=diff ============================================================================== --- lld/trunk/test/ELF/driver.test (original) +++ lld/trunk/test/ELF/driver.test Tue Oct 8 01:03:40 2019 @@ -72,6 +72,10 @@ # RUN: not ld.lld %t -z max-page-size 2>&1 | FileCheck -check-prefix=ERR11 %s # ERR11: unknown -z value: max-page-size +## Attempt to use -r and --export-dynamic together +# RUN: not ld.lld -r -export-dynamic %t -o %tfail 2>&1 | FileCheck -check-prefix=ERR12 %s +# ERR12: -r and --export-dynamic may not be used together + .globl _start _start: nop Modified: lld/trunk/test/ELF/lto/relocation-model.ll URL: http://llvm.org/viewvc/llvm-project/lld/trunk/test/ELF/lto/relocation-model.ll?rev=374022&r1=374021&r2=374022&view=diff ============================================================================== --- lld/trunk/test/ELF/lto/relocation-model.ll (original) +++ lld/trunk/test/ELF/lto/relocation-model.ll Tue Oct 8 01:03:40 2019 @@ -14,9 +14,6 @@ ; RUN: ld.lld %t.o -o %t-out -save-temps --export-dynamic --noinhibit-exec ; RUN: 
llvm-readobj -r %t-out.lto.o | FileCheck %s --check-prefix=STATIC -; RUN: ld.lld %t.o -o %t-out -save-temps -r --export-dynamic -; RUN: llvm-readobj -r %t-out.lto.o | FileCheck %s --check-prefix=STATIC - ;; PIC source. @@ -29,9 +26,6 @@ ; RUN: ld.lld %t.pic.o -o %t-out -save-temps --export-dynamic --noinhibit-exec ; RUN: llvm-readobj -r %t-out.lto.o | FileCheck %s --check-prefix=STATIC -; RUN: ld.lld %t.pic.o -o %t-out -save-temps -r --export-dynamic -; RUN: llvm-readobj -r %t-out.lto.o | FileCheck %s --check-prefix=PIC - ;; Explicit flag. From llvm-commits at lists.llvm.org Tue Oct 8 01:03:44 2019 From: llvm-commits at lists.llvm.org (Rui Ueyama via llvm-commits) Date: Tue, 08 Oct 2019 08:03:44 -0000 Subject: [lld] r374023 - Use /dev/null for tests that we do not need outputs Message-ID: <20191008080344.6EDAE8B948@lists.llvm.org> Author: ruiu Date: Tue Oct 8 01:03:44 2019 New Revision: 374023 URL: http://llvm.org/viewvc/llvm-project?rev=374023&view=rev Log: Use /dev/null for tests that we do not need outputs Modified: lld/trunk/test/ELF/driver.test Modified: lld/trunk/test/ELF/driver.test URL: http://llvm.org/viewvc/llvm-project/lld/trunk/test/ELF/driver.test?rev=374023&r1=374022&r2=374023&view=diff ============================================================================== --- lld/trunk/test/ELF/driver.test (original) +++ lld/trunk/test/ELF/driver.test Tue Oct 8 01:03:44 2019 @@ -27,31 +27,31 @@ ## Attempt to link DSO with -r # RUN: ld.lld -shared %t -o %t.so -# RUN: not ld.lld -r %t.so %t -o %tfail 2>&1 | FileCheck -check-prefix=ERR %s +# RUN: not ld.lld -r %t.so %t -o /dev/null 2>&1 | FileCheck -check-prefix=ERR %s # ERR: attempted static link of dynamic object ## Attempt to use -r and -shared together -# RUN: not ld.lld -r -shared %t -o %tfail 2>&1 | FileCheck -check-prefix=ERR2 %s +# RUN: not ld.lld -r -shared %t -o /dev/null 2>&1 | FileCheck -check-prefix=ERR2 %s # ERR2: -r and -shared may not be used together ## Attempt to use -r and --gc-sections 
together -# RUN: not ld.lld -r --gc-sections %t -o %tfail 2>&1 | FileCheck -check-prefix=ERR3 %s +# RUN: not ld.lld -r --gc-sections %t -o /dev/null 2>&1 | FileCheck -check-prefix=ERR3 %s # ERR3: -r and --gc-sections may not be used together ## Attempt to use -r and --gdb-index together -# RUN: not ld.lld -r --gdb-index %t -o %tfail 2>&1 | FileCheck -check-prefix=ERR4 %s +# RUN: not ld.lld -r --gdb-index %t -o /dev/null 2>&1 | FileCheck -check-prefix=ERR4 %s # ERR4: -r and --gdb-index may not be used together ## Attempt to use -r and --icf together -# RUN: not ld.lld -r --icf=all %t -o %tfail 2>&1 | FileCheck -check-prefix=ERR5 %s +# RUN: not ld.lld -r --icf=all %t -o /dev/null 2>&1 | FileCheck -check-prefix=ERR5 %s # ERR5: -r and --icf may not be used together ## Attempt to use -r and -pie together -# RUN: not ld.lld -r -pie %t -o %tfail 2>&1 | FileCheck -check-prefix=ERR6 %s +# RUN: not ld.lld -r -pie %t -o /dev/null 2>&1 | FileCheck -check-prefix=ERR6 %s # ERR6: -r and -pie may not be used together ## Attempt to use -shared and -pie together -# RUN: not ld.lld -shared -pie %t -o %tfail 2>&1 | FileCheck -check-prefix=ERR7 %s +# RUN: not ld.lld -shared -pie %t -o /dev/null 2>&1 | FileCheck -check-prefix=ERR7 %s # ERR7: -shared and -pie may not be used together ## "--output=foo" is equivalent to "-o foo". 
@@ -73,7 +73,7 @@ # ERR11: unknown -z value: max-page-size ## Attempt to use -r and --export-dynamic together -# RUN: not ld.lld -r -export-dynamic %t -o %tfail 2>&1 | FileCheck -check-prefix=ERR12 %s +# RUN: not ld.lld -r -export-dynamic %t -o /dev/null 2>&1 | FileCheck -check-prefix=ERR12 %s # ERR12: -r and --export-dynamic may not be used together .globl _start From llvm-commits at lists.llvm.org Tue Oct 8 01:05:52 2019 From: llvm-commits at lists.llvm.org (Rui Ueyama via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 08:05:52 +0000 (UTC) Subject: [PATCH] D68441: Ignore --export-dynamic if --relocatable is given In-Reply-To: References: Message-ID: <7632fa69cf5198894afd24e896752e1a@localhost.localdomain> ruiu marked 2 inline comments as done. ruiu added inline comments. ================ Comment at: lld/test/ELF/driver.test:77 +## Attempt to use -r and --export-dynamic together +# RUN: not ld.lld -r -export-dynamic %t -o %tfail 2>&1 | FileCheck -check-prefix=ERR12 %s +# ERR12: -r and --export-dynamic may not be used together ---------------- grimar wrote: > -o %tfail -> -o /dev/null I'll do that in a follow-up patch to replace all occurrences of %tfail in this file in one shot. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68441/new/ https://reviews.llvm.org/D68441 From llvm-commits at lists.llvm.org Tue Oct 8 01:05:57 2019 From: llvm-commits at lists.llvm.org (Rui Ueyama via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 08:05:57 +0000 (UTC) Subject: [PATCH] D68441: Ignore --export-dynamic if --relocatable is given In-Reply-To: References: Message-ID: <23e072fd285718020a0fc13d53b83679@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rL374022: Report error if -export-dynamic is used with -r (authored by ruiu, committed by ). 
Changed prior to commit: https://reviews.llvm.org/D68441?vs=223793&id=223796#toc Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68441/new/ https://reviews.llvm.org/D68441 Files: lld/trunk/ELF/Driver.cpp lld/trunk/test/ELF/driver.test lld/trunk/test/ELF/lto/relocation-model.ll Index: lld/trunk/test/ELF/lto/relocation-model.ll =================================================================== --- lld/trunk/test/ELF/lto/relocation-model.ll +++ lld/trunk/test/ELF/lto/relocation-model.ll @@ -14,9 +14,6 @@ ; RUN: ld.lld %t.o -o %t-out -save-temps --export-dynamic --noinhibit-exec ; RUN: llvm-readobj -r %t-out.lto.o | FileCheck %s --check-prefix=STATIC -; RUN: ld.lld %t.o -o %t-out -save-temps -r --export-dynamic -; RUN: llvm-readobj -r %t-out.lto.o | FileCheck %s --check-prefix=STATIC - ;; PIC source. @@ -29,9 +26,6 @@ ; RUN: ld.lld %t.pic.o -o %t-out -save-temps --export-dynamic --noinhibit-exec ; RUN: llvm-readobj -r %t-out.lto.o | FileCheck %s --check-prefix=STATIC -; RUN: ld.lld %t.pic.o -o %t-out -save-temps -r --export-dynamic -; RUN: llvm-readobj -r %t-out.lto.o | FileCheck %s --check-prefix=PIC - ;; Explicit flag. 
Index: lld/trunk/test/ELF/driver.test =================================================================== --- lld/trunk/test/ELF/driver.test +++ lld/trunk/test/ELF/driver.test @@ -72,6 +72,10 @@ # RUN: not ld.lld %t -z max-page-size 2>&1 | FileCheck -check-prefix=ERR11 %s # ERR11: unknown -z value: max-page-size +## Attempt to use -r and --export-dynamic together +# RUN: not ld.lld -r -export-dynamic %t -o %tfail 2>&1 | FileCheck -check-prefix=ERR12 %s +# ERR12: -r and --export-dynamic may not be used together + .globl _start _start: nop Index: lld/trunk/ELF/Driver.cpp =================================================================== --- lld/trunk/ELF/Driver.cpp +++ lld/trunk/ELF/Driver.cpp @@ -334,6 +334,8 @@ error("-r and --icf may not be used together"); if (config->pie) error("-r and -pie may not be used together"); + if (config->exportDynamic) + error("-r and --export-dynamic may not be used together"); } if (config->executeOnly) { -------------- next part -------------- A non-text attachment was scrubbed... Name: D68441.223796.patch Type: text/x-patch Size: 1873 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 01:14:56 2019 From: llvm-commits at lists.llvm.org (Guillaume Chatelet via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 08:14:56 +0000 (UTC) Subject: [PATCH] D68629: [llvm-exegesis] Finish plumbing the `Config` field. In-Reply-To: References: Message-ID: <96016897713a846cb8a3951d664644b6@localhost.localdomain> gchatelet added inline comments. ================ Comment at: llvm/tools/llvm-exegesis/lib/Clustering.cpp:241 + // Given an instruction Opcode and Config, in which clusters do benchmarks of + // this instruction lie? Normally, they all should be in the same cluster. + struct OpcodeAndConfig { ---------------- Not related to this patch but why a question mark here? 
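For readers following the `OpcodeAndConfig` discussion, here is a minimal self-contained sketch of the idea in the quoted comment: grouping benchmark points by an (Opcode, Config) key and collecting the clusters each key lands in. The `Point` struct and the `clustersPerKey` function are hypothetical simplifications, not the code from llvm-exegesis; only the `std::tie`-based comparison mirrors the patch.

```cpp
#include <map>
#include <set>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical stand-in for a benchmark point.
struct Point {
  unsigned Opcode;
  std::string Config;
  int ClusterId;
};

// Key ordered lexicographically by (Opcode, Config). Config is held by
// pointer to avoid copying, so the referenced Point must outlive the key.
struct OpcodeAndConfig {
  explicit OpcodeAndConfig(const Point &P)
      : Opcode(P.Opcode), Config(&P.Config) {}
  bool operator<(const OpcodeAndConfig &O) const {
    return std::tie(Opcode, *Config) < std::tie(O.Opcode, *O.Config);
  }
  bool operator==(const OpcodeAndConfig &O) const {
    return std::tie(Opcode, *Config) == std::tie(O.Opcode, *O.Config);
  }
  unsigned Opcode;
  const std::string *Config;
};

// For every (Opcode, Config) pair, collect the set of clusters its points
// fall in. A pair whose set has more than one element is spread across
// clusters.
std::map<OpcodeAndConfig, std::set<int>>
clustersPerKey(const std::vector<Point> &Points) {
  std::map<OpcodeAndConfig, std::set<int>> Result;
  for (const Point &P : Points)
    Result[OpcodeAndConfig(P)].insert(P.ClusterId);
  return Result;
}
```

Comparing through `std::tie` keeps `operator<` and `operator==` consistent with each other, since both compare the same tuple of fields.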
================ Comment at: llvm/tools/llvm-exegesis/lib/Clustering.cpp:247 + const std::string *Config; + bool operator<(const OpcodeAndConfig &O) const { + return std::tie(Opcode, *Config) < std::tie(O.Opcode, *O.Config); ---------------- How about factoring Tie: ``` inline auto Tie() const { return std::tie(Opcode, *Config); } bool operator<(const A &O) const { return Tie() < O.Tie(); } bool operator==(const A &O) const { return Tie() == O.Tie(); } ``` ================ Comment at: llvm/tools/llvm-exegesis/lib/Clustering.cpp:296 + [this, &Key](size_t P) { + return !(OpcodeAndConfig(Points_[P]) == Key); }); ---------------- maybe add the != operator to the struct, with the `Tie()` function it's a no brainer. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68629/new/ https://reviews.llvm.org/D68629 From llvm-commits at lists.llvm.org Tue Oct 8 01:21:21 2019 From: llvm-commits at lists.llvm.org (Kai Nacke via llvm-commits) Date: Tue, 08 Oct 2019 08:21:21 -0000 Subject: [llvm] r374024 - [Tools] Mark output of tools as text if it is text Message-ID: <20191008082121.66CCD8ECAE@lists.llvm.org> Author: redstar Date: Tue Oct 8 01:21:20 2019 New Revision: 374024 URL: http://llvm.org/viewvc/llvm-project?rev=374024&view=rev Log: [Tools] Mark output of tools as text if it is text Several LLVM tools write text files/streams without using OF_Text. This can cause problems on platforms which distinguish between text and binary output. This PR adds the OF_Text flag for the following tools: - llvm-dis - llvm-dwarfdump - llvm-mca - llvm-mc (assembler files only) - opt (assembler files only) - RemarkStreamer (used e.g. 
by opt) Reviewers: rnk, vivekvpandya, Bigcheese, andreadb Differential Revision: https://reviews.llvm.org/D67696 Modified: llvm/trunk/lib/IR/RemarkStreamer.cpp llvm/trunk/tools/llvm-dis/llvm-dis.cpp llvm/trunk/tools/llvm-dwarfdump/llvm-dwarfdump.cpp llvm/trunk/tools/llvm-mc/llvm-mc.cpp llvm/trunk/tools/llvm-mca/llvm-mca.cpp llvm/trunk/tools/opt/opt.cpp Modified: llvm/trunk/lib/IR/RemarkStreamer.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/IR/RemarkStreamer.cpp?rev=374024&r1=374023&r2=374024&view=diff ============================================================================== --- llvm/trunk/lib/IR/RemarkStreamer.cpp (original) +++ llvm/trunk/lib/IR/RemarkStreamer.cpp Tue Oct 8 01:21:20 2019 @@ -122,18 +122,20 @@ llvm::setupOptimizationRemarks(LLVMConte if (RemarksFilename.empty()) return nullptr; + Expected Format = remarks::parseFormat(RemarksFormat); + if (Error E = Format.takeError()) + return make_error(std::move(E)); + std::error_code EC; + auto Flags = *Format == remarks::Format::YAML ? sys::fs::OF_Text + : sys::fs::OF_None; auto RemarksFile = - std::make_unique(RemarksFilename, EC, sys::fs::OF_None); + std::make_unique(RemarksFilename, EC, Flags); // We don't use llvm::FileError here because some diagnostics want the file // name separately. 
if (EC) return make_error(errorCodeToError(EC)); - Expected Format = remarks::parseFormat(RemarksFormat); - if (Error E = Format.takeError()) - return make_error(std::move(E)); - Expected> RemarkSerializer = remarks::createRemarkSerializer( *Format, remarks::SerializerMode::Separate, RemarksFile->os()); Modified: llvm/trunk/tools/llvm-dis/llvm-dis.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-dis/llvm-dis.cpp?rev=374024&r1=374023&r2=374024&view=diff ============================================================================== --- llvm/trunk/tools/llvm-dis/llvm-dis.cpp (original) +++ llvm/trunk/tools/llvm-dis/llvm-dis.cpp Tue Oct 8 01:21:20 2019 @@ -186,7 +186,7 @@ int main(int argc, char **argv) { std::error_code EC; std::unique_ptr Out( - new ToolOutputFile(OutputFilename, EC, sys::fs::OF_None)); + new ToolOutputFile(OutputFilename, EC, sys::fs::OF_Text)); if (EC) { errs() << EC.message() << '\n'; return 1; Modified: llvm/trunk/tools/llvm-dwarfdump/llvm-dwarfdump.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-dwarfdump/llvm-dwarfdump.cpp?rev=374024&r1=374023&r2=374024&view=diff ============================================================================== --- llvm/trunk/tools/llvm-dwarfdump/llvm-dwarfdump.cpp (original) +++ llvm/trunk/tools/llvm-dwarfdump/llvm-dwarfdump.cpp Tue Oct 8 01:21:20 2019 @@ -584,7 +584,7 @@ int main(int argc, char **argv) { } std::error_code EC; - ToolOutputFile OutputFile(OutputFilename, EC, sys::fs::OF_None); + ToolOutputFile OutputFile(OutputFilename, EC, sys::fs::OF_Text); error("Unable to open output file" + OutputFilename, EC); // Don't remove output file if we exit with an error. 
OutputFile.keep(); Modified: llvm/trunk/tools/llvm-mc/llvm-mc.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-mc/llvm-mc.cpp?rev=374024&r1=374023&r2=374024&view=diff ============================================================================== --- llvm/trunk/tools/llvm-mc/llvm-mc.cpp (original) +++ llvm/trunk/tools/llvm-mc/llvm-mc.cpp Tue Oct 8 01:21:20 2019 @@ -209,9 +209,10 @@ static const Target *GetTarget(const cha return TheTarget; } -static std::unique_ptr GetOutputStream(StringRef Path) { +static std::unique_ptr GetOutputStream(StringRef Path, + sys::fs::OpenFlags Flags) { std::error_code EC; - auto Out = std::make_unique(Path, EC, sys::fs::OF_None); + auto Out = std::make_unique(Path, EC, Flags); if (EC) { WithColor::error() << EC.message() << '\n'; return nullptr; @@ -413,7 +414,9 @@ int main(int argc, char **argv) { FeaturesStr = Features.getString(); } - std::unique_ptr Out = GetOutputStream(OutputFilename); + sys::fs::OpenFlags Flags = (FileType == OFT_AssemblyFile) ? 
sys::fs::OF_Text + : sys::fs::OF_None; + std::unique_ptr Out = GetOutputStream(OutputFilename, Flags); if (!Out) return 1; @@ -423,7 +426,7 @@ int main(int argc, char **argv) { WithColor::error() << "dwo output only supported with object files\n"; return 1; } - DwoOut = GetOutputStream(SplitDwarfFile); + DwoOut = GetOutputStream(SplitDwarfFile, sys::fs::OF_None); if (!DwoOut) return 1; } Modified: llvm/trunk/tools/llvm-mca/llvm-mca.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-mca/llvm-mca.cpp?rev=374024&r1=374023&r2=374024&view=diff ============================================================================== --- llvm/trunk/tools/llvm-mca/llvm-mca.cpp (original) +++ llvm/trunk/tools/llvm-mca/llvm-mca.cpp Tue Oct 8 01:21:20 2019 @@ -238,7 +238,7 @@ ErrorOr> OutputFilename = "-"; std::error_code EC; auto Out = - std::make_unique(OutputFilename, EC, sys::fs::OF_None); + std::make_unique(OutputFilename, EC, sys::fs::OF_Text); if (!EC) return std::move(Out); return EC; Modified: llvm/trunk/tools/opt/opt.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/opt/opt.cpp?rev=374024&r1=374023&r2=374024&view=diff ============================================================================== --- llvm/trunk/tools/opt/opt.cpp (original) +++ llvm/trunk/tools/opt/opt.cpp Tue Oct 8 01:21:20 2019 @@ -611,7 +611,9 @@ int main(int argc, char **argv) { OutputFilename = "-"; std::error_code EC; - Out.reset(new ToolOutputFile(OutputFilename, EC, sys::fs::OF_None)); + sys::fs::OpenFlags Flags = OutputAssembly ? 
sys::fs::OF_Text + : sys::fs::OF_None; + Out.reset(new ToolOutputFile(OutputFilename, EC, Flags)); if (EC) { errs() << EC.message() << '\n'; return 1; From llvm-commits at lists.llvm.org Tue Oct 8 01:25:43 2019 From: llvm-commits at lists.llvm.org (Kristof Beyls via llvm-commits) Date: Tue, 08 Oct 2019 08:25:43 -0000 Subject: [llvm] r374025 - [ARM] Generate vcmp instead of vcmpe Message-ID: <20191008082543.439868EB91@lists.llvm.org> Author: kbeyls Date: Tue Oct 8 01:25:42 2019 New Revision: 374025 URL: http://llvm.org/viewvc/llvm-project?rev=374025&view=rev Log: [ARM] Generate vcmp instead of vcmpe Based on the discussion in http://lists.llvm.org/pipermail/llvm-dev/2019-October/135574.html, the conclusion was reached that the ARM backend should produce vcmp instead of vcmpe instructions by default, i.e. it should not produce an Invalid Operation exception when either argument of a floating point compare is a quiet NaN. In the future, after constrained floating point intrinsics for floating point compare have been introduced, vcmpe instructions probably should be produced for those intrinsics - depending on the exact semantics they'll be defined to have. This patch logically consists of the following parts: - Revert http://llvm.org/viewvc/llvm-project?rev=294945&view=rev and http://llvm.org/viewvc/llvm-project?rev=294968&view=rev, which implemented fine-tuning for when to produce vcmpe (i.e. not do it for equality comparisons). The complexity introduced by those patches isn't needed anymore if we just always produce vcmp instead. Maybe these patches need to be reintroduced once support is needed to map potential LLVM-IR constrained floating point compare intrinsics to the ARM instruction set. - Simply select vcmp instead of vcmpe; see the simple changes in lib/Target/ARM/ARMInstrVFP.td - Adapt lots of tests that tested for vcmpe (instead of vcmp). 
For all of these test, the intent of what is tested for isn't related to whether the vcmp should produce an Invalid Operation exception or not. Fixes PR43374. Differential Revision: https://reviews.llvm.org/D68463 Removed: llvm/trunk/test/CodeGen/ARM/vcmp-crash.ll Modified: llvm/trunk/lib/Target/ARM/ARMFastISel.cpp llvm/trunk/lib/Target/ARM/ARMISelLowering.cpp llvm/trunk/lib/Target/ARM/ARMISelLowering.h llvm/trunk/lib/Target/ARM/ARMInstrInfo.td llvm/trunk/lib/Target/ARM/ARMInstrVFP.td llvm/trunk/test/CodeGen/ARM/2009-07-18-RewriterBug.ll llvm/trunk/test/CodeGen/ARM/arm-shrink-wrapping.ll llvm/trunk/test/CodeGen/ARM/compare-call.ll llvm/trunk/test/CodeGen/ARM/fcmp-xo.ll llvm/trunk/test/CodeGen/ARM/float-helpers.s llvm/trunk/test/CodeGen/ARM/fp16-instructions.ll llvm/trunk/test/CodeGen/ARM/fp16-promote.ll llvm/trunk/test/CodeGen/ARM/fpcmp.ll llvm/trunk/test/CodeGen/ARM/ifcvt11.ll llvm/trunk/test/CodeGen/ARM/swifterror.ll llvm/trunk/test/CodeGen/ARM/vfp.ll llvm/trunk/test/CodeGen/ARM/vsel-fp16.ll llvm/trunk/test/CodeGen/ARM/vsel.ll llvm/trunk/test/CodeGen/Thumb2/float-cmp.ll llvm/trunk/test/CodeGen/Thumb2/mve-vcmpf.ll llvm/trunk/test/CodeGen/Thumb2/mve-vcmpfr.ll llvm/trunk/test/CodeGen/Thumb2/mve-vcmpfz.ll Modified: llvm/trunk/lib/Target/ARM/ARMFastISel.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMFastISel.cpp?rev=374025&r1=374024&r2=374025&view=diff ============================================================================== --- llvm/trunk/lib/Target/ARM/ARMFastISel.cpp (original) +++ llvm/trunk/lib/Target/ARM/ARMFastISel.cpp Tue Oct 8 01:25:42 2019 @@ -191,7 +191,7 @@ class ARMFastISel final : public FastISe bool isTypeLegal(Type *Ty, MVT &VT); bool isLoadTypeLegal(Type *Ty, MVT &VT); bool ARMEmitCmp(const Value *Src1Value, const Value *Src2Value, - bool isZExt, bool isEquality); + bool isZExt); bool ARMEmitLoad(MVT VT, Register &ResultReg, Address &Addr, unsigned Alignment = 0, bool isZExt = true, bool allocReg = true); @@ -1259,8 
+1259,7 @@ bool ARMFastISel::SelectBranch(const Ins if (ARMPred == ARMCC::AL) return false; // Emit the compare. - if (!ARMEmitCmp(CI->getOperand(0), CI->getOperand(1), CI->isUnsigned(), - CI->isEquality())) + if (!ARMEmitCmp(CI->getOperand(0), CI->getOperand(1), CI->isUnsigned())) return false; unsigned BrOpc = isThumb2 ? ARM::t2Bcc : ARM::Bcc; @@ -1349,7 +1348,7 @@ bool ARMFastISel::SelectIndirectBr(const } bool ARMFastISel::ARMEmitCmp(const Value *Src1Value, const Value *Src2Value, - bool isZExt, bool isEquality) { + bool isZExt) { Type *Ty = Src1Value->getType(); EVT SrcEVT = TLI.getValueType(DL, Ty, true); if (!SrcEVT.isSimple()) return false; @@ -1397,19 +1396,11 @@ bool ARMFastISel::ARMEmitCmp(const Value // TODO: Verify compares. case MVT::f32: isICmp = false; - // Equality comparisons shouldn't raise Invalid on uordered inputs. - if (isEquality) - CmpOpc = UseImm ? ARM::VCMPZS : ARM::VCMPS; - else - CmpOpc = UseImm ? ARM::VCMPEZS : ARM::VCMPES; + CmpOpc = UseImm ? ARM::VCMPZS : ARM::VCMPS; break; case MVT::f64: isICmp = false; - // Equality comparisons shouldn't raise Invalid on uordered inputs. - if (isEquality) - CmpOpc = UseImm ? ARM::VCMPZD : ARM::VCMPD; - else - CmpOpc = UseImm ? ARM::VCMPEZD : ARM::VCMPED; + CmpOpc = UseImm ? ARM::VCMPZD : ARM::VCMPD; break; case MVT::i1: case MVT::i8: @@ -1485,8 +1476,7 @@ bool ARMFastISel::SelectCmp(const Instru if (ARMPred == ARMCC::AL) return false; // Emit the compare. - if (!ARMEmitCmp(CI->getOperand(0), CI->getOperand(1), CI->isUnsigned(), - CI->isEquality())) + if (!ARMEmitCmp(CI->getOperand(0), CI->getOperand(1), CI->isUnsigned())) return false; // Now set a register based on the comparison. 
Explicitly set the predicates Modified: llvm/trunk/lib/Target/ARM/ARMISelLowering.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMISelLowering.cpp?rev=374025&r1=374024&r2=374025&view=diff ============================================================================== --- llvm/trunk/lib/Target/ARM/ARMISelLowering.cpp (original) +++ llvm/trunk/lib/Target/ARM/ARMISelLowering.cpp Tue Oct 8 01:25:42 2019 @@ -1793,34 +1793,22 @@ static ARMCC::CondCodes IntCCToARMCC(ISD /// FPCCToARMCC - Convert a DAG fp condition code to an ARM CC. static void FPCCToARMCC(ISD::CondCode CC, ARMCC::CondCodes &CondCode, - ARMCC::CondCodes &CondCode2, bool &InvalidOnQNaN) { + ARMCC::CondCodes &CondCode2) { CondCode2 = ARMCC::AL; - InvalidOnQNaN = true; switch (CC) { default: llvm_unreachable("Unknown FP condition!"); case ISD::SETEQ: - case ISD::SETOEQ: - CondCode = ARMCC::EQ; - InvalidOnQNaN = false; - break; + case ISD::SETOEQ: CondCode = ARMCC::EQ; break; case ISD::SETGT: case ISD::SETOGT: CondCode = ARMCC::GT; break; case ISD::SETGE: case ISD::SETOGE: CondCode = ARMCC::GE; break; case ISD::SETOLT: CondCode = ARMCC::MI; break; case ISD::SETOLE: CondCode = ARMCC::LS; break; - case ISD::SETONE: - CondCode = ARMCC::MI; - CondCode2 = ARMCC::GT; - InvalidOnQNaN = false; - break; + case ISD::SETONE: CondCode = ARMCC::MI; CondCode2 = ARMCC::GT; break; case ISD::SETO: CondCode = ARMCC::VC; break; case ISD::SETUO: CondCode = ARMCC::VS; break; - case ISD::SETUEQ: - CondCode = ARMCC::EQ; - CondCode2 = ARMCC::VS; - InvalidOnQNaN = false; - break; + case ISD::SETUEQ: CondCode = ARMCC::EQ; CondCode2 = ARMCC::VS; break; case ISD::SETUGT: CondCode = ARMCC::HI; break; case ISD::SETUGE: CondCode = ARMCC::PL; break; case ISD::SETLT: @@ -1828,10 +1816,7 @@ static void FPCCToARMCC(ISD::CondCode CC case ISD::SETLE: case ISD::SETULE: CondCode = ARMCC::LE; break; case ISD::SETNE: - case ISD::SETUNE: - CondCode = ARMCC::NE; - InvalidOnQNaN = false; - break; + case ISD::SETUNE: CondCode = 
ARMCC::NE; break; } } @@ -4259,15 +4244,13 @@ SDValue ARMTargetLowering::getARMCmp(SDV /// Returns a appropriate VFP CMP (fcmp{s|d}+fmstat) for the given operands. SDValue ARMTargetLowering::getVFPCmp(SDValue LHS, SDValue RHS, - SelectionDAG &DAG, const SDLoc &dl, - bool InvalidOnQNaN) const { + SelectionDAG &DAG, const SDLoc &dl) const { assert(Subtarget->hasFP64() || RHS.getValueType() != MVT::f64); SDValue Cmp; - SDValue C = DAG.getConstant(InvalidOnQNaN, dl, MVT::i32); if (!isFloatingPointZero(RHS)) - Cmp = DAG.getNode(ARMISD::CMPFP, dl, MVT::Glue, LHS, RHS, C); + Cmp = DAG.getNode(ARMISD::CMPFP, dl, MVT::Glue, LHS, RHS); else - Cmp = DAG.getNode(ARMISD::CMPFPw0, dl, MVT::Glue, LHS, C); + Cmp = DAG.getNode(ARMISD::CMPFPw0, dl, MVT::Glue, LHS); return DAG.getNode(ARMISD::FMSTAT, dl, MVT::Glue, Cmp); } @@ -4284,12 +4267,10 @@ ARMTargetLowering::duplicateCmp(SDValue Cmp = Cmp.getOperand(0); Opc = Cmp.getOpcode(); if (Opc == ARMISD::CMPFP) - Cmp = DAG.getNode(Opc, DL, MVT::Glue, Cmp.getOperand(0), - Cmp.getOperand(1), Cmp.getOperand(2)); + Cmp = DAG.getNode(Opc, DL, MVT::Glue, Cmp.getOperand(0),Cmp.getOperand(1)); else { assert(Opc == ARMISD::CMPFPw0 && "unexpected operand of FMSTAT"); - Cmp = DAG.getNode(Opc, DL, MVT::Glue, Cmp.getOperand(0), - Cmp.getOperand(1)); + Cmp = DAG.getNode(Opc, DL, MVT::Glue, Cmp.getOperand(0)); } return DAG.getNode(ARMISD::FMSTAT, DL, MVT::Glue, Cmp); } @@ -4929,8 +4910,7 @@ SDValue ARMTargetLowering::LowerSELECT_C } ARMCC::CondCodes CondCode, CondCode2; - bool InvalidOnQNaN; - FPCCToARMCC(CC, CondCode, CondCode2, InvalidOnQNaN); + FPCCToARMCC(CC, CondCode, CondCode2); // Normalize the fp compare. 
If RHS is zero we prefer to keep it there so we // match CMPFPw0 instead of CMPFP, though we don't do this for f16 because we @@ -4955,13 +4935,13 @@ SDValue ARMTargetLowering::LowerSELECT_C } SDValue ARMcc = DAG.getConstant(CondCode, dl, MVT::i32); - SDValue Cmp = getVFPCmp(LHS, RHS, DAG, dl, InvalidOnQNaN); + SDValue Cmp = getVFPCmp(LHS, RHS, DAG, dl); SDValue CCR = DAG.getRegister(ARM::CPSR, MVT::i32); SDValue Result = getCMOV(dl, VT, FalseVal, TrueVal, ARMcc, CCR, Cmp, DAG); if (CondCode2 != ARMCC::AL) { SDValue ARMcc2 = DAG.getConstant(CondCode2, dl, MVT::i32); // FIXME: Needs another CMP because flag can have but one use. - SDValue Cmp2 = getVFPCmp(LHS, RHS, DAG, dl, InvalidOnQNaN); + SDValue Cmp2 = getVFPCmp(LHS, RHS, DAG, dl); Result = getCMOV(dl, VT, Result, TrueVal, ARMcc2, CCR, Cmp2, DAG); } return Result; @@ -5188,11 +5168,10 @@ SDValue ARMTargetLowering::LowerBR_CC(SD } ARMCC::CondCodes CondCode, CondCode2; - bool InvalidOnQNaN; - FPCCToARMCC(CC, CondCode, CondCode2, InvalidOnQNaN); + FPCCToARMCC(CC, CondCode, CondCode2); SDValue ARMcc = DAG.getConstant(CondCode, dl, MVT::i32); - SDValue Cmp = getVFPCmp(LHS, RHS, DAG, dl, InvalidOnQNaN); + SDValue Cmp = getVFPCmp(LHS, RHS, DAG, dl); SDValue CCR = DAG.getRegister(ARM::CPSR, MVT::i32); SDVTList VTList = DAG.getVTList(MVT::Other, MVT::Glue); SDValue Ops[] = { Chain, Dest, ARMcc, CCR, Cmp }; Modified: llvm/trunk/lib/Target/ARM/ARMISelLowering.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMISelLowering.h?rev=374025&r1=374024&r2=374025&view=diff ============================================================================== --- llvm/trunk/lib/Target/ARM/ARMISelLowering.h (original) +++ llvm/trunk/lib/Target/ARM/ARMISelLowering.h Tue Oct 8 01:25:42 2019 @@ -818,7 +818,7 @@ class VectorType; SDValue getARMCmp(SDValue LHS, SDValue RHS, ISD::CondCode CC, SDValue &ARMcc, SelectionDAG &DAG, const SDLoc &dl) const; SDValue getVFPCmp(SDValue LHS, SDValue RHS, SelectionDAG &DAG, - const SDLoc 
&dl, bool InvalidOnQNaN) const; + const SDLoc &dl) const; SDValue duplicateCmp(SDValue Cmp, SelectionDAG &DAG) const; SDValue OptimizeVFPBrcond(SDValue Op, SelectionDAG &DAG) const; Modified: llvm/trunk/lib/Target/ARM/ARMInstrInfo.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMInstrInfo.td?rev=374025&r1=374024&r2=374025&view=diff ============================================================================== --- llvm/trunk/lib/Target/ARM/ARMInstrInfo.td (original) +++ llvm/trunk/lib/Target/ARM/ARMInstrInfo.td Tue Oct 8 01:25:42 2019 @@ -51,8 +51,6 @@ def SDT_ARMAnd : SDTypeProfile<1, 2, SDTCisVT<2, i32>]>; def SDT_ARMCmp : SDTypeProfile<0, 2, [SDTCisSameAs<0, 1>]>; -def SDT_ARMFCmp : SDTypeProfile<0, 3, [SDTCisSameAs<0, 1>, - SDTCisVT<2, i32>]>; def SDT_ARMPICAdd : SDTypeProfile<1, 2, [SDTCisSameAs<0, 1>, SDTCisPtrTy<1>, SDTCisVT<2, i32>]>; Modified: llvm/trunk/lib/Target/ARM/ARMInstrVFP.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMInstrVFP.td?rev=374025&r1=374024&r2=374025&view=diff ============================================================================== --- llvm/trunk/lib/Target/ARM/ARMInstrVFP.td (original) +++ llvm/trunk/lib/Target/ARM/ARMInstrVFP.td Tue Oct 8 01:25:42 2019 @@ -10,7 +10,7 @@ // //===----------------------------------------------------------------------===// -def SDT_CMPFP0 : SDTypeProfile<0, 2, [SDTCisFP<0>, SDTCisVT<1, i32>]>; +def SDT_CMPFP0 : SDTypeProfile<0, 1, [SDTCisFP<0>]>; def SDT_VMOVDRR : SDTypeProfile<1, 2, [SDTCisVT<0, f64>, SDTCisVT<1, i32>, SDTCisSameAs<1, 2>]>; def SDT_VMOVRRD : SDTypeProfile<2, 1, [SDTCisVT<0, i32>, SDTCisSameAs<0, 1>, @@ -19,7 +19,7 @@ def SDT_VMOVRRD : SDTypeProfile<2, 1, [S def SDT_VMOVSR : SDTypeProfile<1, 1, [SDTCisVT<0, f32>, SDTCisVT<1, i32>]>; def arm_fmstat : SDNode<"ARMISD::FMSTAT", SDTNone, [SDNPInGlue, SDNPOutGlue]>; -def arm_cmpfp : SDNode<"ARMISD::CMPFP", SDT_ARMFCmp, [SDNPOutGlue]>; +def arm_cmpfp : SDNode<"ARMISD::CMPFP", SDT_ARMCmp, 
[SDNPOutGlue]>; def arm_cmpfp0 : SDNode<"ARMISD::CMPFPw0", SDT_CMPFP0, [SDNPOutGlue]>; def arm_fmdrr : SDNode<"ARMISD::VMOVDRR", SDT_VMOVDRR>; def arm_fmrrd : SDNode<"ARMISD::VMOVRRD", SDT_VMOVRRD>; @@ -548,12 +548,12 @@ let Defs = [FPSCR_NZCV] in { def VCMPED : ADuI<0b11101, 0b11, 0b0100, 0b11, 0, (outs), (ins DPR:$Dd, DPR:$Dm), IIC_fpCMP64, "vcmpe", ".f64\t$Dd, $Dm", - [(arm_cmpfp DPR:$Dd, (f64 DPR:$Dm), (i32 1))]>; + [/* For disassembly only; pattern left blank */]>; def VCMPES : ASuI<0b11101, 0b11, 0b0100, 0b11, 0, (outs), (ins SPR:$Sd, SPR:$Sm), IIC_fpCMP32, "vcmpe", ".f32\t$Sd, $Sm", - [(arm_cmpfp SPR:$Sd, SPR:$Sm, (i32 1))]> { + [/* For disassembly only; pattern left blank */]> { // Some single precision VFP instructions may be executed on both NEON and // VFP pipelines on A8. let D = VFPNeonA8Domain; @@ -562,17 +562,17 @@ def VCMPES : ASuI<0b11101, 0b11, 0b0100, def VCMPEH : AHuI<0b11101, 0b11, 0b0100, 0b11, 0, (outs), (ins HPR:$Sd, HPR:$Sm), IIC_fpCMP16, "vcmpe", ".f16\t$Sd, $Sm", - [(arm_cmpfp HPR:$Sd, HPR:$Sm, (i32 1))]>; + [/* For disassembly only; pattern left blank */]>; def VCMPD : ADuI<0b11101, 0b11, 0b0100, 0b01, 0, (outs), (ins DPR:$Dd, DPR:$Dm), IIC_fpCMP64, "vcmp", ".f64\t$Dd, $Dm", - [(arm_cmpfp DPR:$Dd, (f64 DPR:$Dm), (i32 0))]>; + [(arm_cmpfp DPR:$Dd, (f64 DPR:$Dm))]>; def VCMPS : ASuI<0b11101, 0b11, 0b0100, 0b01, 0, (outs), (ins SPR:$Sd, SPR:$Sm), IIC_fpCMP32, "vcmp", ".f32\t$Sd, $Sm", - [(arm_cmpfp SPR:$Sd, SPR:$Sm, (i32 0))]> { + [(arm_cmpfp SPR:$Sd, SPR:$Sm)]> { // Some single precision VFP instructions may be executed on both NEON and // VFP pipelines on A8. 
let D = VFPNeonA8Domain; @@ -581,7 +581,7 @@ def VCMPS : ASuI<0b11101, 0b11, 0b0100, def VCMPH : AHuI<0b11101, 0b11, 0b0100, 0b01, 0, (outs), (ins HPR:$Sd, HPR:$Sm), IIC_fpCMP16, "vcmp", ".f16\t$Sd, $Sm", - [(arm_cmpfp HPR:$Sd, HPR:$Sm, (i32 0))]>; + [(arm_cmpfp HPR:$Sd, HPR:$Sm)]>; } // Defs = [FPSCR_NZCV] //===----------------------------------------------------------------------===// @@ -611,7 +611,7 @@ let Defs = [FPSCR_NZCV] in { def VCMPEZD : ADuI<0b11101, 0b11, 0b0101, 0b11, 0, (outs), (ins DPR:$Dd), IIC_fpCMP64, "vcmpe", ".f64\t$Dd, #0", - [(arm_cmpfp0 (f64 DPR:$Dd), (i32 1))]> { + [/* For disassembly only; pattern left blank */]> { let Inst{3-0} = 0b0000; let Inst{5} = 0; } @@ -619,7 +619,7 @@ def VCMPEZD : ADuI<0b11101, 0b11, 0b0101 def VCMPEZS : ASuI<0b11101, 0b11, 0b0101, 0b11, 0, (outs), (ins SPR:$Sd), IIC_fpCMP32, "vcmpe", ".f32\t$Sd, #0", - [(arm_cmpfp0 SPR:$Sd, (i32 1))]> { + [/* For disassembly only; pattern left blank */]> { let Inst{3-0} = 0b0000; let Inst{5} = 0; @@ -631,7 +631,7 @@ def VCMPEZS : ASuI<0b11101, 0b11, 0b0101 def VCMPEZH : AHuI<0b11101, 0b11, 0b0101, 0b11, 0, (outs), (ins HPR:$Sd), IIC_fpCMP16, "vcmpe", ".f16\t$Sd, #0", - [(arm_cmpfp0 HPR:$Sd, (i32 1))]> { + [/* For disassembly only; pattern left blank */]> { let Inst{3-0} = 0b0000; let Inst{5} = 0; } @@ -639,7 +639,7 @@ def VCMPEZH : AHuI<0b11101, 0b11, 0b0101 def VCMPZD : ADuI<0b11101, 0b11, 0b0101, 0b01, 0, (outs), (ins DPR:$Dd), IIC_fpCMP64, "vcmp", ".f64\t$Dd, #0", - [(arm_cmpfp0 (f64 DPR:$Dd), (i32 0))]> { + [(arm_cmpfp0 (f64 DPR:$Dd))]> { let Inst{3-0} = 0b0000; let Inst{5} = 0; } @@ -647,7 +647,7 @@ def VCMPZD : ADuI<0b11101, 0b11, 0b0101 def VCMPZS : ASuI<0b11101, 0b11, 0b0101, 0b01, 0, (outs), (ins SPR:$Sd), IIC_fpCMP32, "vcmp", ".f32\t$Sd, #0", - [(arm_cmpfp0 SPR:$Sd, (i32 0))]> { + [(arm_cmpfp0 SPR:$Sd)]> { let Inst{3-0} = 0b0000; let Inst{5} = 0; @@ -659,7 +659,7 @@ def VCMPZS : ASuI<0b11101, 0b11, 0b0101 def VCMPZH : AHuI<0b11101, 0b11, 0b0101, 0b01, 0, (outs), (ins 
HPR:$Sd), IIC_fpCMP16, "vcmp", ".f16\t$Sd, #0", - [(arm_cmpfp0 HPR:$Sd, (i32 0))]> { + [(arm_cmpfp0 HPR:$Sd)]> { let Inst{3-0} = 0b0000; let Inst{5} = 0; } Modified: llvm/trunk/test/CodeGen/ARM/2009-07-18-RewriterBug.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/ARM/2009-07-18-RewriterBug.ll?rev=374025&r1=374024&r2=374025&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/ARM/2009-07-18-RewriterBug.ll (original) +++ llvm/trunk/test/CodeGen/ARM/2009-07-18-RewriterBug.ll Tue Oct 8 01:25:42 2019 @@ -1317,19 +1317,19 @@ bb15: } ; CHECK-LABEL: _build_delaunay: -; CHECK: vcmpe -; CHECK: vcmpe -; CHECK: vcmpe -; CHECK: vcmpe -; CHECK: vcmpe -; CHECK: vcmpe -; CHECK: vcmpe -; CHECK: vcmpe -; CHECK: vcmpe -; CHECK: vcmpe -; CHECK: vcmpe -; CHECK: vcmpe -; CHECK: vcmpe +; CHECK: vcmp +; CHECK: vcmp +; CHECK: vcmp +; CHECK: vcmp +; CHECK: vcmp +; CHECK: vcmp +; CHECK: vcmp +; CHECK: vcmp +; CHECK: vcmp +; CHECK: vcmp +; CHECK: vcmp +; CHECK: vcmp +; CHECK: vcmp declare i32 @puts(i8* nocapture) nounwind Modified: llvm/trunk/test/CodeGen/ARM/arm-shrink-wrapping.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/ARM/arm-shrink-wrapping.ll?rev=374025&r1=374024&r2=374025&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/ARM/arm-shrink-wrapping.ll (original) +++ llvm/trunk/test/CodeGen/ARM/arm-shrink-wrapping.ll Tue Oct 8 01:25:42 2019 @@ -1781,7 +1781,7 @@ define float @debug_info(float %gamma, f ; ARM-NEXT: vmov.f32 s0, #1.000000e+00 ; ARM-NEXT: vmov.f64 d16, #1.000000e+00 ; ARM-NEXT: vadd.f64 d16, d9, d16 -; ARM-NEXT: vcmpe.f32 s16, s0 +; ARM-NEXT: vcmp.f32 s16, s0 ; ARM-NEXT: vmrs APSR_nzcv, fpscr ; ARM-NEXT: vmov d17, r0, r1 ; ARM-NEXT: vmov.f64 d18, d9 @@ -1828,7 +1828,7 @@ define float @debug_info(float %gamma, f ; THUMB-NEXT: vmov.f32 s0, #1.000000e+00 ; THUMB-NEXT: vmov.f64 d16, #1.000000e+00 ; 
THUMB-NEXT: vmov.f64 d18, d9 -; THUMB-NEXT: vcmpe.f32 s16, s0 +; THUMB-NEXT: vcmp.f32 s16, s0 ; THUMB-NEXT: vadd.f64 d16, d9, d16 ; THUMB-NEXT: vmrs APSR_nzcv, fpscr ; THUMB-NEXT: it gt Modified: llvm/trunk/test/CodeGen/ARM/compare-call.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/ARM/compare-call.ll?rev=374025&r1=374024&r2=374025&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/ARM/compare-call.ll (original) +++ llvm/trunk/test/CodeGen/ARM/compare-call.ll Tue Oct 8 01:25:42 2019 @@ -18,5 +18,5 @@ UnifiedReturnBlock: ; preds declare i32 @bar(...) -; CHECK: vcmpe.f32 +; CHECK: vcmp.f32 Modified: llvm/trunk/test/CodeGen/ARM/fcmp-xo.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/ARM/fcmp-xo.ll?rev=374025&r1=374024&r2=374025&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/ARM/fcmp-xo.ll (original) +++ llvm/trunk/test/CodeGen/ARM/fcmp-xo.ll Tue Oct 8 01:25:42 2019 @@ -5,7 +5,7 @@ define arm_aapcs_vfpcc float @foo0(float %a0) local_unnamed_addr { ; CHECK-LABEL: foo0: ; CHECK: @ %bb.0: -; CHECK-NEXT: vcmpe.f32 s0, #0 +; CHECK-NEXT: vcmp.f32 s0, #0 ; CHECK-NEXT: vmov.f32 s2, #5.000000e-01 ; CHECK-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-NEXT: vmov.f32 s4, #-5.000000e-01 @@ -24,7 +24,7 @@ define arm_aapcs_vfpcc float @float1(flo ; CHECK-NEXT: vmov.f32 s2, #1.000000e+00 ; CHECK-NEXT: vmov.f32 s4, #5.000000e-01 ; CHECK-NEXT: vmov.f32 s6, #-5.000000e-01 -; CHECK-NEXT: vcmpe.f32 s2, s0 +; CHECK-NEXT: vcmp.f32 s2, s0 ; CHECK-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-NEXT: vselgt.f32 s0, s6, s4 ; CHECK-NEXT: bx lr @@ -46,7 +46,7 @@ define arm_aapcs_vfpcc float @float128(f ; VMOVSR-NEXT: vmov.f32 s4, #5.000000e-01 ; VMOVSR-NEXT: vmov s2, r0 ; VMOVSR-NEXT: vmov.f32 s6, #-5.000000e-01 -; VMOVSR-NEXT: vcmpe.f32 s2, s0 +; VMOVSR-NEXT: vcmp.f32 s2, s0 ; VMOVSR-NEXT: vmrs APSR_nzcv, fpscr ; VMOVSR-NEXT: vselgt.f32 s0, 
s6, s4 ; VMOVSR-NEXT: bx lr @@ -57,7 +57,7 @@ define arm_aapcs_vfpcc float @float128(f ; NEON-NEXT: vmov.f32 s2, #5.000000e-01 ; NEON-NEXT: vmov d3, r0, r0 ; NEON-NEXT: vmov.f32 s4, #-5.000000e-01 -; NEON-NEXT: vcmpe.f32 s6, s0 +; NEON-NEXT: vcmp.f32 s6, s0 ; NEON-NEXT: vmrs APSR_nzcv, fpscr ; NEON-NEXT: vselgt.f32 s0, s4, s2 ; NEON-NEXT: bx lr @@ -70,7 +70,7 @@ define arm_aapcs_vfpcc double @double1(d ; CHECK-LABEL: double1: ; CHECK: @ %bb.0: ; CHECK-NEXT: vmov.f64 d18, #1.000000e+00 -; CHECK-NEXT: vcmpe.f64 d18, d0 +; CHECK-NEXT: vcmp.f64 d18, d0 ; CHECK-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-NEXT: vmov.f64 d16, #5.000000e-01 ; CHECK-NEXT: vmov.f64 d17, #-5.000000e-01 @@ -89,7 +89,7 @@ define arm_aapcs_vfpcc double @double128 ; CHECK-NEXT: movt r0, #16480 ; CHECK-NEXT: vmov.f64 d16, #5.000000e-01 ; CHECK-NEXT: vmov d18, r1, r0 -; CHECK-NEXT: vcmpe.f64 d18, d0 +; CHECK-NEXT: vcmp.f64 d18, d0 ; CHECK-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-NEXT: vmov.f64 d17, #-5.000000e-01 ; CHECK-NEXT: vselgt.f64 d0, d17, d16 Modified: llvm/trunk/test/CodeGen/ARM/float-helpers.s URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/ARM/float-helpers.s?rev=374025&r1=374024&r2=374025&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/ARM/float-helpers.s (original) +++ llvm/trunk/test/CodeGen/ARM/float-helpers.s Tue Oct 8 01:25:42 2019 @@ -174,13 +174,13 @@ define i32 @fcmplt(float %a, float %b) # ; CHECK-SOFTFP: vmov s2, r0 ; CHECK-SOFTFP-NEXT: mov r0, #0 ; CHECK-SOFTFP-NEXT: vmov s0, r1 -; CHECK-SOFTFP-NEXT: vcmpe.f32 s2, s0 +; CHECK-SOFTFP-NEXT: vcmp.f32 s2, s0 ; CHECK-SOFTFP-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-SOFTFP-NEXT: movmi r0, #1 ; CHECK-SOFTFP-NEXT: mov pc, lr ; ; CHECK-HARDFP-SP-LABEL: fcmplt: -; CHECK-HARDFP-SP: vcmpe.f32 s0, s1 +; CHECK-HARDFP-SP: vcmp.f32 s0, s1 ; CHECK-HARDFP-SP-NEXT: mov r0, #0 ; CHECK-HARDFP-SP-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-HARDFP-SP-NEXT: movmi r0, #1 @@ -205,13 +205,13 @@ 
define i32 @fcmple(float %a, float %b) # ; CHECK-SOFTFP: vmov s2, r0 ; CHECK-SOFTFP-NEXT: mov r0, #0 ; CHECK-SOFTFP-NEXT: vmov s0, r1 -; CHECK-SOFTFP-NEXT: vcmpe.f32 s2, s0 +; CHECK-SOFTFP-NEXT: vcmp.f32 s2, s0 ; CHECK-SOFTFP-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-SOFTFP-NEXT: movls r0, #1 ; CHECK-SOFTFP-NEXT: mov pc, lr ; ; CHECK-HARDFP-SP-LABEL: fcmple: -; CHECK-HARDFP-SP: vcmpe.f32 s0, s1 +; CHECK-HARDFP-SP: vcmp.f32 s0, s1 ; CHECK-HARDFP-SP-NEXT: mov r0, #0 ; CHECK-HARDFP-SP-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-HARDFP-SP-NEXT: movls r0, #1 @@ -236,13 +236,13 @@ define i32 @fcmpge(float %a, float %b) # ; CHECK-SOFTFP: vmov s2, r0 ; CHECK-SOFTFP-NEXT: mov r0, #0 ; CHECK-SOFTFP-NEXT: vmov s0, r1 -; CHECK-SOFTFP-NEXT: vcmpe.f32 s2, s0 +; CHECK-SOFTFP-NEXT: vcmp.f32 s2, s0 ; CHECK-SOFTFP-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-SOFTFP-NEXT: movge r0, #1 ; CHECK-SOFTFP-NEXT: mov pc, lr ; ; CHECK-HARDFP-SP-LABEL: fcmpge: -; CHECK-HARDFP-SP: vcmpe.f32 s0, s1 +; CHECK-HARDFP-SP: vcmp.f32 s0, s1 ; CHECK-HARDFP-SP-NEXT: mov r0, #0 ; CHECK-HARDFP-SP-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-HARDFP-SP-NEXT: movge r0, #1 @@ -267,13 +267,13 @@ define i32 @fcmpgt(float %a, float %b) # ; CHECK-SOFTFP: vmov s2, r0 ; CHECK-SOFTFP-NEXT: mov r0, #0 ; CHECK-SOFTFP-NEXT: vmov s0, r1 -; CHECK-SOFTFP-NEXT: vcmpe.f32 s2, s0 +; CHECK-SOFTFP-NEXT: vcmp.f32 s2, s0 ; CHECK-SOFTFP-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-SOFTFP-NEXT: movgt r0, #1 ; CHECK-SOFTFP-NEXT: mov pc, lr ; ; CHECK-HARDFP-SP-LABEL: fcmpgt: -; CHECK-HARDFP-SP: vcmpe.f32 s0, s1 +; CHECK-HARDFP-SP: vcmp.f32 s0, s1 ; CHECK-HARDFP-SP-NEXT: mov r0, #0 ; CHECK-HARDFP-SP-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-HARDFP-SP-NEXT: movgt r0, #1 @@ -298,13 +298,13 @@ define i32 @fcmpun(float %a, float %b) # ; CHECK-SOFTFP: vmov s2, r0 ; CHECK-SOFTFP-NEXT: mov r0, #0 ; CHECK-SOFTFP-NEXT: vmov s0, r1 -; CHECK-SOFTFP-NEXT: vcmpe.f32 s2, s0 +; CHECK-SOFTFP-NEXT: vcmp.f32 s2, s0 ; CHECK-SOFTFP-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-SOFTFP-NEXT: movvs r0, #1 ; 
CHECK-SOFTFP-NEXT: mov pc, lr ; ; CHECK-HARDFP-SP-LABEL: fcmpun: -; CHECK-HARDFP-SP: vcmpe.f32 s0, s1 +; CHECK-HARDFP-SP: vcmp.f32 s0, s1 ; CHECK-HARDFP-SP-NEXT: mov r0, #0 ; CHECK-HARDFP-SP-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-HARDFP-SP-NEXT: movvs r0, #1 @@ -503,13 +503,13 @@ define i32 @dcmplt(double %a, double %b) ; CHECK-SOFTFP: vmov d16, r2, r3 ; CHECK-SOFTFP-NEXT: vmov d17, r0, r1 ; CHECK-SOFTFP-NEXT: mov r0, #0 -; CHECK-SOFTFP-NEXT: vcmpe.f64 d17, d16 +; CHECK-SOFTFP-NEXT: vcmp.f64 d17, d16 ; CHECK-SOFTFP-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-SOFTFP-NEXT: movmi r0, #1 ; CHECK-SOFTFP-NEXT: mov pc, lr ; ; CHECK-HARDFP-DP-LABEL: dcmplt: -; CHECK-HARDFP-DP: vcmpe.f64 d0, d1 +; CHECK-HARDFP-DP: vcmp.f64 d0, d1 ; CHECK-HARDFP-DP-NEXT: mov r0, #0 ; CHECK-HARDFP-DP-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-HARDFP-DP-NEXT: movmi r0, #1 @@ -545,13 +545,13 @@ define i32 @dcmple(double %a, double %b) ; CHECK-SOFTFP: vmov d16, r2, r3 ; CHECK-SOFTFP-NEXT: vmov d17, r0, r1 ; CHECK-SOFTFP-NEXT: mov r0, #0 -; CHECK-SOFTFP-NEXT: vcmpe.f64 d17, d16 +; CHECK-SOFTFP-NEXT: vcmp.f64 d17, d16 ; CHECK-SOFTFP-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-SOFTFP-NEXT: movls r0, #1 ; CHECK-SOFTFP-NEXT: mov pc, lr ; ; CHECK-HARDFP-DP-LABEL: dcmple: -; CHECK-HARDFP-DP: vcmpe.f64 d0, d1 +; CHECK-HARDFP-DP: vcmp.f64 d0, d1 ; CHECK-HARDFP-DP-NEXT: mov r0, #0 ; CHECK-HARDFP-DP-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-HARDFP-DP-NEXT: movls r0, #1 @@ -587,13 +587,13 @@ define i32 @dcmpge(double %a, double %b) ; CHECK-SOFTFP: vmov d16, r2, r3 ; CHECK-SOFTFP-NEXT: vmov d17, r0, r1 ; CHECK-SOFTFP-NEXT: mov r0, #0 -; CHECK-SOFTFP-NEXT: vcmpe.f64 d17, d16 +; CHECK-SOFTFP-NEXT: vcmp.f64 d17, d16 ; CHECK-SOFTFP-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-SOFTFP-NEXT: movge r0, #1 ; CHECK-SOFTFP-NEXT: mov pc, lr ; ; CHECK-HARDFP-DP-LABEL: dcmpge: -; CHECK-HARDFP-DP: vcmpe.f64 d0, d1 +; CHECK-HARDFP-DP: vcmp.f64 d0, d1 ; CHECK-HARDFP-DP-NEXT: mov r0, #0 ; CHECK-HARDFP-DP-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-HARDFP-DP-NEXT: movge r0, 
#1 @@ -629,13 +629,13 @@ define i32 @dcmpgt(double %a, double %b) ; CHECK-SOFTFP: vmov d16, r2, r3 ; CHECK-SOFTFP-NEXT: vmov d17, r0, r1 ; CHECK-SOFTFP-NEXT: mov r0, #0 -; CHECK-SOFTFP-NEXT: vcmpe.f64 d17, d16 +; CHECK-SOFTFP-NEXT: vcmp.f64 d17, d16 ; CHECK-SOFTFP-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-SOFTFP-NEXT: movgt r0, #1 ; CHECK-SOFTFP-NEXT: mov pc, lr ; ; CHECK-HARDFP-DP-LABEL: dcmpgt: -; CHECK-HARDFP-DP: vcmpe.f64 d0, d1 +; CHECK-HARDFP-DP: vcmp.f64 d0, d1 ; CHECK-HARDFP-DP-NEXT: mov r0, #0 ; CHECK-HARDFP-DP-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-HARDFP-DP-NEXT: movgt r0, #1 @@ -671,13 +671,13 @@ define i32 @dcmpun(double %a, double %b) ; CHECK-SOFTFP: vmov d16, r2, r3 ; CHECK-SOFTFP-NEXT: vmov d17, r0, r1 ; CHECK-SOFTFP-NEXT: mov r0, #0 -; CHECK-SOFTFP-NEXT: vcmpe.f64 d17, d16 +; CHECK-SOFTFP-NEXT: vcmp.f64 d17, d16 ; CHECK-SOFTFP-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-SOFTFP-NEXT: movvs r0, #1 ; CHECK-SOFTFP-NEXT: mov pc, lr ; ; CHECK-HARDFP-DP-LABEL: dcmpun: -; CHECK-HARDFP-DP: vcmpe.f64 d0, d1 +; CHECK-HARDFP-DP: vcmp.f64 d0, d1 ; CHECK-HARDFP-DP-NEXT: mov r0, #0 ; CHECK-HARDFP-DP-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-HARDFP-DP-NEXT: movvs r0, #1 Modified: llvm/trunk/test/CodeGen/ARM/fp16-instructions.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/ARM/fp16-instructions.ll?rev=374025&r1=374024&r2=374025&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/ARM/fp16-instructions.ll (original) +++ llvm/trunk/test/CodeGen/ARM/fp16-instructions.ll Tue Oct 8 01:25:42 2019 @@ -164,9 +164,9 @@ entry: ; CHECK-LABEL: VCMPE1: ; CHECK-SOFT: bl __aeabi_fcmplt -; CHECK-SOFTFP-FP16: vcmpe.f32 s0, #0 -; CHECK-SOFTFP-FULLFP16: vcmpe.f16 s0, #0 -; CHECK-HARDFP-FULLFP16: vcmpe.f16 s0, #0 +; CHECK-SOFTFP-FP16: vcmp.f32 s0, #0 +; CHECK-SOFTFP-FULLFP16: vcmp.f16 s0, #0 +; CHECK-HARDFP-FULLFP16: vcmp.f16 s0, #0 } define i32 @VCMPE2(float %F.coerce, float %G.coerce) { @@ -184,9 +184,9 @@ entry: ; 
CHECK-LABEL: VCMPE2: ; CHECK-SOFT: bl __aeabi_fcmplt -; CHECK-SOFTFP-FP16: vcmpe.f32 s{{.}}, s{{.}} -; CHECK-SOFTFP-FULLFP16: vcmpe.f16 s{{.}}, s{{.}} -; CHECK-HARDFP-FULLFP16: vcmpe.f16 s{{.}}, s{{.}} +; CHECK-SOFTFP-FP16: vcmp.f32 s{{.}}, s{{.}} +; CHECK-SOFTFP-FULLFP16: vcmp.f16 s{{.}}, s{{.}} +; CHECK-HARDFP-FULLFP16: vcmp.f16 s{{.}}, s{{.}} } ; Test lowering of BR_CC @@ -212,10 +212,10 @@ for.end: ; CHECK-SOFT: cmp r0, #{{0|1}} ; CHECK-SOFTFP-FP16: vcvtb.f32.f16 [[S2:s[0-9]]], [[S2]] -; CHECK-SOFTFP-FP16: vcmpe.f32 [[S2]], s0 +; CHECK-SOFTFP-FP16: vcmp.f32 [[S2]], s0 ; CHECK-SOFTFP-FP16: vmrs APSR_nzcv, fpscr -; CHECK-SOFTFP-FULLFP16: vcmpe.f16 s{{.}}, s{{.}} +; CHECK-SOFTFP-FULLFP16: vcmp.f16 s{{.}}, s{{.}} ; CHECK-SOFTFP-FULLFP16: vmrs APSR_nzcv, fpscr } @@ -727,15 +727,15 @@ define half @select_cc_ge1(half* %a0) { ; CHECK-LABEL: select_cc_ge1: -; CHECK-HARDFP-FULLFP16: vcmpe.f16 s6, s0 +; CHECK-HARDFP-FULLFP16: vcmp.f16 s6, s0 ; CHECK-HARDFP-FULLFP16-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-HARDFP-FULLFP16-NEXT: vselge.f16 s0, s{{.}}, s{{.}} -; CHECK-SOFTFP-FP16-A32: vcmpe.f32 s6, s0 +; CHECK-SOFTFP-FP16-A32: vcmp.f32 s6, s0 ; CHECK-SOFTFP-FP16-A32-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-SOFTFP-FP16-A32-NEXT: vmovge.f32 s{{.}}, s{{.}} -; CHECK-SOFTFP-FP16-T32: vcmpe.f32 s6, s0 +; CHECK-SOFTFP-FP16-T32: vcmp.f32 s6, s0 ; CHECK-SOFTFP-FP16-T32-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-SOFTFP-FP16-T32-NEXT: it ge ; CHECK-SOFTFP-FP16-T32-NEXT: vmovge.f32 s{{.}}, s{{.}} @@ -749,15 +749,15 @@ define half @select_cc_ge2(half* %a0) { ; CHECK-LABEL: select_cc_ge2: -; CHECK-HARDFP-FULLFP16: vcmpe.f16 s0, s6 +; CHECK-HARDFP-FULLFP16: vcmp.f16 s0, s6 ; CHECK-HARDFP-FULLFP16-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-HARDFP-FULLFP16-NEXT: vselge.f16 s0, s{{.}}, s{{.}} -; CHECK-SOFTFP-FP16-A32: vcmpe.f32 s6, s0 +; CHECK-SOFTFP-FP16-A32: vcmp.f32 s6, s0 ; CHECK-SOFTFP-FP16-A32-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-SOFTFP-FP16-A32-NEXT: vmovls.f32 s{{.}}, s{{.}} -; CHECK-SOFTFP-FP16-T32: 
vcmpe.f32 s6, s0 +; CHECK-SOFTFP-FP16-T32: vcmp.f32 s6, s0 ; CHECK-SOFTFP-FP16-T32-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-SOFTFP-FP16-T32-NEXT: it ls ; CHECK-SOFTFP-FP16-T32-NEXT: vmovls.f32 s{{.}}, s{{.}} @@ -771,15 +771,15 @@ define half @select_cc_ge3(half* %a0) { ; CHECK-LABEL: select_cc_ge3: -; CHECK-HARDFP-FULLFP16: vcmpe.f16 s0, s6 +; CHECK-HARDFP-FULLFP16: vcmp.f16 s0, s6 ; CHECK-HARDFP-FULLFP16-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-HARDFP-FULLFP16-NEXT: vselge.f16 s0, s{{.}}, s{{.}} -; CHECK-SOFTFP-FP16-A32: vcmpe.f32 s6, s0 +; CHECK-SOFTFP-FP16-A32: vcmp.f32 s6, s0 ; CHECK-SOFTFP-FP16-A32-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-SOFTFP-FP16-A32-NEXT: vmovhi.f32 s{{.}}, s{{.}} -; CHECK-SOFTFP-FP16-T32: vcmpe.f32 s6, s0 +; CHECK-SOFTFP-FP16-T32: vcmp.f32 s6, s0 ; CHECK-SOFTFP-FP16-T32-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-SOFTFP-FP16-T32-NEXT: it hi ; CHECK-SOFTFP-FP16-T32-NEXT: vmovhi.f32 s{{.}}, s{{.}} @@ -793,15 +793,15 @@ define half @select_cc_ge4(half* %a0) { ; CHECK-LABEL: select_cc_ge4: -; CHECK-HARDFP-FULLFP16: vcmpe.f16 s6, s0 +; CHECK-HARDFP-FULLFP16: vcmp.f16 s6, s0 ; CHECK-HARDFP-FULLFP16-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-HARDFP-FULLFP16-NEXT: vselge.f16 s0, s{{.}}, s{{.}} -; CHECK-SOFTFP-FP16-A32: vcmpe.f32 s6, s0 +; CHECK-SOFTFP-FP16-A32: vcmp.f32 s6, s0 ; CHECK-SOFTFP-FP16-A32-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-SOFTFP-FP16-A32-NEXT: vmovlt.f32 s{{.}}, s{{.}} -; CHECK-SOFTFP-FP16-T32: vcmpe.f32 s6, s0 +; CHECK-SOFTFP-FP16-T32: vcmp.f32 s6, s0 ; CHECK-SOFTFP-FP16-T32-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-SOFTFP-FP16-T32-NEXT: it lt ; CHECK-SOFTFP-FP16-T32-NEXT: vmovlt.f32 s{{.}}, s{{.}} @@ -816,15 +816,15 @@ define half @select_cc_gt1(half* %a0) { ; CHECK-LABEL: select_cc_gt1: -; CHECK-HARDFP-FULLFP16: vcmpe.f16 s6, s0 +; CHECK-HARDFP-FULLFP16: vcmp.f16 s6, s0 ; CHECK-HARDFP-FULLFP16-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-HARDFP-FULLFP16-NEXT: vselgt.f16 s0, s{{.}}, s{{.}} -; CHECK-SOFTFP-FP16-A32: vcmpe.f32 s6, s0 +; CHECK-SOFTFP-FP16-A32: vcmp.f32 s6, s0 
; CHECK-SOFTFP-FP16-A32-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-SOFTFP-FP16-A32-NEXT: vmovgt.f32 s{{.}}, s{{.}} -; CHECK-SOFTFP-FP16-T32: vcmpe.f32 s6, s0 +; CHECK-SOFTFP-FP16-T32: vcmp.f32 s6, s0 ; CHECK-SOFTFP-FP16-T32-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-SOFTFP-FP16-T32-NEXT: it gt ; CHECK-SOFTFP-FP16-T32-NEXT: vmovgt.f32 s{{.}}, s{{.}} @@ -838,15 +838,15 @@ define half @select_cc_gt2(half* %a0) { ; CHECK-LABEL: select_cc_gt2: -; CHECK-HARDFP-FULLFP16: vcmpe.f16 s0, s6 +; CHECK-HARDFP-FULLFP16: vcmp.f16 s0, s6 ; CHECK-HARDFP-FULLFP16-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-HARDFP-FULLFP16-NEXT: vselgt.f16 s0, s{{.}}, s{{.}} -; CHECK-SOFTFP-FP16-A32: vcmpe.f32 s6, s0 +; CHECK-SOFTFP-FP16-A32: vcmp.f32 s6, s0 ; CHECK-SOFTFP-FP16-A32-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-SOFTFP-FP16-A32-NEXT: vmovpl.f32 s{{.}}, s{{.}} -; CHECK-SOFTFP-FP16-T32: vcmpe.f32 s6, s0 +; CHECK-SOFTFP-FP16-T32: vcmp.f32 s6, s0 ; CHECK-SOFTFP-FP16-T32-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-SOFTFP-FP16-T32-NEXT: it pl ; CHECK-SOFTFP-FP16-T32-NEXT: vmovpl.f32 s{{.}}, s{{.}} @@ -860,15 +860,15 @@ define half @select_cc_gt3(half* %a0) { ; CHECK-LABEL: select_cc_gt3: -; CHECK-HARDFP-FULLFP16: vcmpe.f16 s6, s0 +; CHECK-HARDFP-FULLFP16: vcmp.f16 s6, s0 ; CHECK-HARDFP-FULLFP16-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-HARDFP-FULLFP16-NEXT: vselgt.f16 s0, s{{.}}, s{{.}} -; CHECK-SOFTFP-FP16-A32: vcmpe.f32 s6, s0 +; CHECK-SOFTFP-FP16-A32: vcmp.f32 s6, s0 ; CHECK-SOFTFP-FP16-A32-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-SOFTFP-FP16-A32-NEXT: vmovle.f32 s{{.}}, s{{.}} -; CHECK-SOFTFP-FP16-T32: vcmpe.f32 s6, s0 +; CHECK-SOFTFP-FP16-T32: vcmp.f32 s6, s0 ; CHECK-SOFTFP-FP16-T32-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-SOFTFP-FP16-T32-NEXT: it le ; CHECK-SOFTFP-FP16-T32-NEXT: vmovle.f32 s{{.}}, s{{.}} @@ -882,15 +882,15 @@ define half @select_cc_gt4(half* %a0) { ; CHECK-LABEL: select_cc_gt4: -; CHECK-HARDFP-FULLFP16: vcmpe.f16 s0, s6 +; CHECK-HARDFP-FULLFP16: vcmp.f16 s0, s6 ; CHECK-HARDFP-FULLFP16-NEXT: vmrs APSR_nzcv, fpscr ; 
CHECK-HARDFP-FULLFP16-NEXT: vselgt.f16 s0, s{{.}}, s{{.}} -; CHECK-SOFTFP-FP16-A32: vcmpe.f32 s6, s0 +; CHECK-SOFTFP-FP16-A32: vcmp.f32 s6, s0 ; CHECK-SOFTFP-FP16-A32-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-SOFTFP-FP16-A32-NEXT: vmovmi.f32 s{{.}}, s{{.}} -; CHECK-SOFTFP-FP16-T32: vcmpe.f32 s6, s0 +; CHECK-SOFTFP-FP16-T32: vcmp.f32 s6, s0 ; CHECK-SOFTFP-FP16-T32-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-SOFTFP-FP16-T32-NEXT: it mi ; CHECK-SOFTFP-FP16-T32-NEXT: vmovmi.f32 s{{.}}, s{{.}} Modified: llvm/trunk/test/CodeGen/ARM/fp16-promote.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/ARM/fp16-promote.ll?rev=374025&r1=374024&r2=374025&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/ARM/fp16-promote.ll (original) +++ llvm/trunk/test/CodeGen/ARM/fp16-promote.ll Tue Oct 8 01:25:42 2019 @@ -202,7 +202,7 @@ define i1 @test_fcmp_ueq(half* %p, half* ; CHECK-FP16: vcvtb.f32.f16 ; CHECK-LIBCALL: bl __aeabi_h2f ; CHECK-LIBCALL: bl __aeabi_h2f -; CHECK-VFP: vcmpe.f32 +; CHECK-VFP: vcmp.f32 ; CHECK-NOVFP: bl __aeabi_fcmplt ; CHECK-FP16: vmrs APSR_nzcv, fpscr ; CHECK-VFP: strmi Modified: llvm/trunk/test/CodeGen/ARM/fpcmp.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/ARM/fpcmp.ll?rev=374025&r1=374024&r2=374025&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/ARM/fpcmp.ll (original) +++ llvm/trunk/test/CodeGen/ARM/fpcmp.ll Tue Oct 8 01:25:42 2019 @@ -2,7 +2,7 @@ define i32 @f1(float %a) { ;CHECK-LABEL: f1: -;CHECK: vcmpe.f32 +;CHECK: vcmp.f32 ;CHECK: movmi entry: %tmp = fcmp olt float %a, 1.000000e+00 ; [#uses=1] @@ -22,7 +22,7 @@ entry: define i32 @f3(float %a) { ;CHECK-LABEL: f3: -;CHECK: vcmpe.f32 +;CHECK: vcmp.f32 ;CHECK: movgt entry: %tmp = fcmp ogt float %a, 1.000000e+00 ; [#uses=1] @@ -32,7 +32,7 @@ entry: define i32 @f4(float %a) { ;CHECK-LABEL: f4: -;CHECK: vcmpe.f32 +;CHECK: vcmp.f32 ;CHECK: movge entry: 
%tmp = fcmp oge float %a, 1.000000e+00 ; [#uses=1] @@ -42,7 +42,7 @@ entry: define i32 @f5(float %a) { ;CHECK-LABEL: f5: -;CHECK: vcmpe.f32 +;CHECK: vcmp.f32 ;CHECK: movls entry: %tmp = fcmp ole float %a, 1.000000e+00 ; [#uses=1] @@ -62,7 +62,7 @@ entry: define i32 @g1(double %a) { ;CHECK-LABEL: g1: -;CHECK: vcmpe.f64 +;CHECK: vcmp.f64 ;CHECK: movmi entry: %tmp = fcmp olt double %a, 1.000000e+00 ; [#uses=1] Modified: llvm/trunk/test/CodeGen/ARM/ifcvt11.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/ARM/ifcvt11.ll?rev=374025&r1=374024&r2=374025&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/ARM/ifcvt11.ll (original) +++ llvm/trunk/test/CodeGen/ARM/ifcvt11.ll Tue Oct 8 01:25:42 2019 @@ -17,7 +17,7 @@ bb.nph: br label %bb bb: ; preds = %bb4, %bb.nph -; CHECK: vcmpe.f64 +; CHECK: vcmp.f64 ; CHECK: vmrs APSR_nzcv, fpscr %r.19 = phi i32 [ 0, %bb.nph ], [ %r.0, %bb4 ] %n.08 = phi i32 [ 0, %bb.nph ], [ %10, %bb4 ] @@ -30,9 +30,9 @@ bb: bb1: ; preds = %bb ; CHECK-NOT: it -; CHECK-NOT: vcmpemi +; CHECK-NOT: vcmpmi ; CHECK-NOT: vmrsmi -; CHECK: vcmpe.f64 +; CHECK: vcmp.f64 ; CHECK: vmrs APSR_nzcv, fpscr %scevgep12 = getelementptr %struct.xyz_t, %struct.xyz_t* %p, i32 %n.08, i32 2 %6 = load double, double* %scevgep12, align 4 Modified: llvm/trunk/test/CodeGen/ARM/swifterror.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/ARM/swifterror.ll?rev=374025&r1=374024&r2=374025&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/ARM/swifterror.ll (original) +++ llvm/trunk/test/CodeGen/ARM/swifterror.ll Tue Oct 8 01:25:42 2019 @@ -194,7 +194,7 @@ define float @foo_loop(%swift_error** sw ; CHECK-O0: strb [[ID2]], [{{.*}}[[ID]], #8] ; spill r0 ; CHECK-O0: str r0, [sp{{.*}}] -; CHECK-O0: vcmpe +; CHECK-O0: vcmp ; CHECK-O0: ble ; reload from stack ; CHECK-O0: ldr r8 Removed: llvm/trunk/test/CodeGen/ARM/vcmp-crash.ll 
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/ARM/vcmp-crash.ll?rev=374024&view=auto
==============================================================================
--- llvm/trunk/test/CodeGen/ARM/vcmp-crash.ll (original)
+++ llvm/trunk/test/CodeGen/ARM/vcmp-crash.ll (removed)
@@ -1,11 +0,0 @@
-; RUN: llc -mcpu=cortex-m4 < %s | FileCheck %s
-
-target datalayout = "e-m:e-p:32:32-i64:64-v128:64:128-a:0:32-n32-S64"
-target triple = "thumbv7em-none--eabi"
-
-; CHECK: vcmp.f32
-define double @f(double %a, double %b, double %c, float %d) {
- %1 = fcmp oeq float %d, 0.0
- %2 = select i1 %1, double %a, double %c
- ret double %2
-}

Modified: llvm/trunk/test/CodeGen/ARM/vfp.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/ARM/vfp.ll?rev=374025&r1=374024&r2=374025&view=diff
==============================================================================
--- llvm/trunk/test/CodeGen/ARM/vfp.ll (original)
+++ llvm/trunk/test/CodeGen/ARM/vfp.ll Tue Oct 8 01:25:42 2019
@@ -142,7 +142,7 @@ define void @test_cmpfp0(float* %glob, i
 ;CHECK-LABEL: test_cmpfp0:
 entry:
 %tmp = load float, float* %glob ; [#uses=1]
-;CHECK: vcmpe.f32
+;CHECK: vcmp.f32
 %tmp.upgrd.3 = fcmp ogt float %tmp, 0.000000e+00 ; [#uses=1]
 br i1 %tmp.upgrd.3, label %cond_true, label %cond_false

Modified: llvm/trunk/test/CodeGen/ARM/vsel-fp16.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/ARM/vsel-fp16.ll?rev=374025&r1=374024&r2=374025&view=diff
==============================================================================
--- llvm/trunk/test/CodeGen/ARM/vsel-fp16.ll (original)
+++ llvm/trunk/test/CodeGen/ARM/vsel-fp16.ll Tue Oct 8 01:25:42 2019
@@ -106,7 +106,7 @@ define void @test_vsel32ogt(half* %lhs_p
 ; CHECK-NEXT: vldr.16 s4, [r0]
 ; CHECK-NEXT: vldr.16 s6, [r1]
 ; CHECK-NEXT: movw r0, :lower16:varhalf
-; CHECK-NEXT: vcmpe.f16 s4, s6
+; CHECK-NEXT: vcmp.f16 s4, s6
 ; CHECK-NEXT: movt r0, :upper16:varhalf
 ; CHECK-NEXT: vmrs APSR_nzcv, fpscr
 ; CHECK-NEXT:
vselgt.f16 s0, s0, s2 @@ -130,7 +130,7 @@ define void @test_vsel32oge(half* %lhs_p ; CHECK-NEXT: vldr.16 s4, [r0] ; CHECK-NEXT: vldr.16 s6, [r1] ; CHECK-NEXT: movw r0, :lower16:varhalf -; CHECK-NEXT: vcmpe.f16 s4, s6 +; CHECK-NEXT: vcmp.f16 s4, s6 ; CHECK-NEXT: movt r0, :upper16:varhalf ; CHECK-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-NEXT: vselge.f16 s0, s0, s2 @@ -178,7 +178,7 @@ define void @test_vsel32ugt(half* %lhs_p ; CHECK-NEXT: vldr.16 s4, [r0] ; CHECK-NEXT: vldr.16 s6, [r1] ; CHECK-NEXT: movw r0, :lower16:varhalf -; CHECK-NEXT: vcmpe.f16 s6, s4 +; CHECK-NEXT: vcmp.f16 s6, s4 ; CHECK-NEXT: movt r0, :upper16:varhalf ; CHECK-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-NEXT: vselge.f16 s0, s2, s0 @@ -202,7 +202,7 @@ define void @test_vsel32uge(half* %lhs_p ; CHECK-NEXT: vldr.16 s4, [r0] ; CHECK-NEXT: vldr.16 s6, [r1] ; CHECK-NEXT: movw r0, :lower16:varhalf -; CHECK-NEXT: vcmpe.f16 s6, s4 +; CHECK-NEXT: vcmp.f16 s6, s4 ; CHECK-NEXT: movt r0, :upper16:varhalf ; CHECK-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-NEXT: vselgt.f16 s0, s2, s0 @@ -226,7 +226,7 @@ define void @test_vsel32olt(half* %lhs_p ; CHECK-NEXT: vldr.16 s4, [r0] ; CHECK-NEXT: vldr.16 s6, [r1] ; CHECK-NEXT: movw r0, :lower16:varhalf -; CHECK-NEXT: vcmpe.f16 s6, s4 +; CHECK-NEXT: vcmp.f16 s6, s4 ; CHECK-NEXT: movt r0, :upper16:varhalf ; CHECK-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-NEXT: vselgt.f16 s0, s0, s2 @@ -250,7 +250,7 @@ define void @test_vsel32ult(half* %lhs_p ; CHECK-NEXT: vldr.16 s4, [r0] ; CHECK-NEXT: vldr.16 s6, [r1] ; CHECK-NEXT: movw r0, :lower16:varhalf -; CHECK-NEXT: vcmpe.f16 s4, s6 +; CHECK-NEXT: vcmp.f16 s4, s6 ; CHECK-NEXT: movt r0, :upper16:varhalf ; CHECK-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-NEXT: vselge.f16 s0, s2, s0 @@ -274,7 +274,7 @@ define void @test_vsel32ole(half* %lhs_p ; CHECK-NEXT: vldr.16 s4, [r0] ; CHECK-NEXT: vldr.16 s6, [r1] ; CHECK-NEXT: movw r0, :lower16:varhalf -; CHECK-NEXT: vcmpe.f16 s6, s4 +; CHECK-NEXT: vcmp.f16 s6, s4 ; CHECK-NEXT: movt r0, :upper16:varhalf ; CHECK-NEXT: vmrs 
APSR_nzcv, fpscr ; CHECK-NEXT: vselge.f16 s0, s0, s2 @@ -298,7 +298,7 @@ define void @test_vsel32ule(half* %lhs_p ; CHECK-NEXT: vldr.16 s4, [r0] ; CHECK-NEXT: vldr.16 s6, [r1] ; CHECK-NEXT: movw r0, :lower16:varhalf -; CHECK-NEXT: vcmpe.f16 s4, s6 +; CHECK-NEXT: vcmp.f16 s4, s6 ; CHECK-NEXT: movt r0, :upper16:varhalf ; CHECK-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-NEXT: vselgt.f16 s0, s2, s0 @@ -322,7 +322,7 @@ define void @test_vsel32ord(half* %lhs_p ; CHECK-NEXT: vldr.16 s4, [r0] ; CHECK-NEXT: vldr.16 s6, [r1] ; CHECK-NEXT: movw r0, :lower16:varhalf -; CHECK-NEXT: vcmpe.f16 s4, s6 +; CHECK-NEXT: vcmp.f16 s4, s6 ; CHECK-NEXT: movt r0, :upper16:varhalf ; CHECK-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-NEXT: vselvs.f16 s0, s2, s0 @@ -370,7 +370,7 @@ define void @test_vsel32uno(half* %lhs_p ; CHECK-NEXT: vldr.16 s4, [r0] ; CHECK-NEXT: vldr.16 s6, [r1] ; CHECK-NEXT: movw r0, :lower16:varhalf -; CHECK-NEXT: vcmpe.f16 s4, s6 +; CHECK-NEXT: vcmp.f16 s4, s6 ; CHECK-NEXT: movt r0, :upper16:varhalf ; CHECK-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-NEXT: vselvs.f16 s0, s0, s2 @@ -395,7 +395,7 @@ define void @test_vsel32ogt_nnan(half* % ; CHECK-NEXT: vldr.16 s4, [r0] ; CHECK-NEXT: vldr.16 s6, [r1] ; CHECK-NEXT: movw r0, :lower16:varhalf -; CHECK-NEXT: vcmpe.f16 s4, s6 +; CHECK-NEXT: vcmp.f16 s4, s6 ; CHECK-NEXT: movt r0, :upper16:varhalf ; CHECK-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-NEXT: vselgt.f16 s0, s0, s2 @@ -419,7 +419,7 @@ define void @test_vsel32oge_nnan(half* % ; CHECK-NEXT: vldr.16 s4, [r0] ; CHECK-NEXT: vldr.16 s6, [r1] ; CHECK-NEXT: movw r0, :lower16:varhalf -; CHECK-NEXT: vcmpe.f16 s4, s6 +; CHECK-NEXT: vcmp.f16 s4, s6 ; CHECK-NEXT: movt r0, :upper16:varhalf ; CHECK-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-NEXT: vselge.f16 s0, s0, s2 @@ -467,7 +467,7 @@ define void @test_vsel32ugt_nnan(half* % ; CHECK-NEXT: vldr.16 s4, [r0] ; CHECK-NEXT: vldr.16 s6, [r1] ; CHECK-NEXT: movw r0, :lower16:varhalf -; CHECK-NEXT: vcmpe.f16 s4, s6 +; CHECK-NEXT: vcmp.f16 s4, s6 ; CHECK-NEXT: movt r0, 
:upper16:varhalf ; CHECK-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-NEXT: vselgt.f16 s0, s0, s2 @@ -491,7 +491,7 @@ define void @test_vsel32uge_nnan(half* % ; CHECK-NEXT: vldr.16 s4, [r0] ; CHECK-NEXT: vldr.16 s6, [r1] ; CHECK-NEXT: movw r0, :lower16:varhalf -; CHECK-NEXT: vcmpe.f16 s4, s6 +; CHECK-NEXT: vcmp.f16 s4, s6 ; CHECK-NEXT: movt r0, :upper16:varhalf ; CHECK-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-NEXT: vselge.f16 s0, s0, s2 @@ -515,7 +515,7 @@ define void @test_vsel32olt_nnan(half* % ; CHECK-NEXT: vldr.16 s4, [r0] ; CHECK-NEXT: vldr.16 s6, [r1] ; CHECK-NEXT: movw r0, :lower16:varhalf -; CHECK-NEXT: vcmpe.f16 s6, s4 +; CHECK-NEXT: vcmp.f16 s6, s4 ; CHECK-NEXT: movt r0, :upper16:varhalf ; CHECK-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-NEXT: vselgt.f16 s0, s0, s2 @@ -539,7 +539,7 @@ define void @test_vsel32ult_nnan(half* % ; CHECK-NEXT: vldr.16 s4, [r0] ; CHECK-NEXT: vldr.16 s6, [r1] ; CHECK-NEXT: movw r0, :lower16:varhalf -; CHECK-NEXT: vcmpe.f16 s6, s4 +; CHECK-NEXT: vcmp.f16 s6, s4 ; CHECK-NEXT: movt r0, :upper16:varhalf ; CHECK-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-NEXT: vselgt.f16 s0, s0, s2 @@ -563,7 +563,7 @@ define void @test_vsel32ole_nnan(half* % ; CHECK-NEXT: vldr.16 s4, [r0] ; CHECK-NEXT: vldr.16 s6, [r1] ; CHECK-NEXT: movw r0, :lower16:varhalf -; CHECK-NEXT: vcmpe.f16 s6, s4 +; CHECK-NEXT: vcmp.f16 s6, s4 ; CHECK-NEXT: movt r0, :upper16:varhalf ; CHECK-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-NEXT: vselge.f16 s0, s0, s2 @@ -587,7 +587,7 @@ define void @test_vsel32ule_nnan(half* % ; CHECK-NEXT: vldr.16 s4, [r0] ; CHECK-NEXT: vldr.16 s6, [r1] ; CHECK-NEXT: movw r0, :lower16:varhalf -; CHECK-NEXT: vcmpe.f16 s6, s4 +; CHECK-NEXT: vcmp.f16 s6, s4 ; CHECK-NEXT: movt r0, :upper16:varhalf ; CHECK-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-NEXT: vselge.f16 s0, s0, s2 @@ -611,7 +611,7 @@ define void @test_vsel32ord_nnan(half* % ; CHECK-NEXT: vldr.16 s4, [r0] ; CHECK-NEXT: vldr.16 s6, [r1] ; CHECK-NEXT: movw r0, :lower16:varhalf -; CHECK-NEXT: vcmpe.f16 s4, s6 +; CHECK-NEXT: vcmp.f16 
s4, s6 ; CHECK-NEXT: movt r0, :upper16:varhalf ; CHECK-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-NEXT: vselvs.f16 s0, s2, s0 @@ -659,7 +659,7 @@ define void @test_vsel32uno_nnan(half* % ; CHECK-NEXT: vldr.16 s4, [r0] ; CHECK-NEXT: vldr.16 s6, [r1] ; CHECK-NEXT: movw r0, :lower16:varhalf -; CHECK-NEXT: vcmpe.f16 s4, s6 +; CHECK-NEXT: vcmp.f16 s4, s6 ; CHECK-NEXT: movt r0, :upper16:varhalf ; CHECK-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-NEXT: vselvs.f16 s0, s0, s2 Modified: llvm/trunk/test/CodeGen/ARM/vsel.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/ARM/vsel.ll?rev=374025&r1=374024&r2=374025&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/ARM/vsel.ll (original) +++ llvm/trunk/test/CodeGen/ARM/vsel.ll Tue Oct 8 01:25:42 2019 @@ -96,7 +96,7 @@ define void @test_vsel32ogt(float %lhs32 %tst1 = fcmp ogt float %lhs32, %rhs32 %val1 = select i1 %tst1, float %a, float %b store float %val1, float* @varfloat -; CHECK: vcmpe.f32 s0, s1 +; CHECK: vcmp.f32 s0, s1 ; CHECK: vselgt.f32 s0, s2, s3 ret void } @@ -105,7 +105,7 @@ define void @test_vsel64ogt(float %lhs32 %tst1 = fcmp ogt float %lhs32, %rhs32 %val1 = select i1 %tst1, double %a, double %b store double %val1, double* @vardouble -; CHECK: vcmpe.f32 s0, s1 +; CHECK: vcmp.f32 s0, s1 ; CHECK: vselgt.f64 d16, d1, d2 ret void } @@ -114,7 +114,7 @@ define void @test_vsel32oge(float %lhs32 %tst1 = fcmp oge float %lhs32, %rhs32 %val1 = select i1 %tst1, float %a, float %b store float %val1, float* @varfloat -; CHECK: vcmpe.f32 s0, s1 +; CHECK: vcmp.f32 s0, s1 ; CHECK: vselge.f32 s0, s2, s3 ret void } @@ -123,7 +123,7 @@ define void @test_vsel64oge(float %lhs32 %tst1 = fcmp oge float %lhs32, %rhs32 %val1 = select i1 %tst1, double %a, double %b store double %val1, double* @vardouble -; CHECK: vcmpe.f32 s0, s1 +; CHECK: vcmp.f32 s0, s1 ; CHECK: vselge.f64 d16, d1, d2 ret void } @@ -150,7 +150,7 @@ define void @test_vsel32ugt(float %lhs32 %tst1 = fcmp ugt 
float %lhs32, %rhs32 %val1 = select i1 %tst1, float %a, float %b store float %val1, float* @varfloat -; CHECK: vcmpe.f32 s1, s0 +; CHECK: vcmp.f32 s1, s0 ; CHECK: vselge.f32 s0, s3, s2 ret void } @@ -159,7 +159,7 @@ define void @test_vsel64ugt(float %lhs32 %tst1 = fcmp ugt float %lhs32, %rhs32 %val1 = select i1 %tst1, double %a, double %b store double %val1, double* @vardouble -; CHECK: vcmpe.f32 s1, s0 +; CHECK: vcmp.f32 s1, s0 ; CHECK: vselge.f64 d16, d2, d1 ret void } @@ -168,7 +168,7 @@ define void @test_vsel32uge(float %lhs32 %tst1 = fcmp uge float %lhs32, %rhs32 %val1 = select i1 %tst1, float %a, float %b store float %val1, float* @varfloat -; CHECK: vcmpe.f32 s1, s0 +; CHECK: vcmp.f32 s1, s0 ; CHECK: vselgt.f32 s0, s3, s2 ret void } @@ -177,7 +177,7 @@ define void @test_vsel64uge(float %lhs32 %tst1 = fcmp uge float %lhs32, %rhs32 %val1 = select i1 %tst1, double %a, double %b store double %val1, double* @vardouble -; CHECK: vcmpe.f32 s1, s0 +; CHECK: vcmp.f32 s1, s0 ; CHECK: vselgt.f64 d16, d2, d1 ret void } @@ -186,7 +186,7 @@ define void @test_vsel32olt(float %lhs32 %tst1 = fcmp olt float %lhs32, %rhs32 %val1 = select i1 %tst1, float %a, float %b store float %val1, float* @varfloat -; CHECK: vcmpe.f32 s1, s0 +; CHECK: vcmp.f32 s1, s0 ; CHECK: vselgt.f32 s0, s2, s3 ret void } @@ -195,7 +195,7 @@ define void @test_vsel64olt(float %lhs32 %tst1 = fcmp olt float %lhs32, %rhs32 %val1 = select i1 %tst1, double %a, double %b store double %val1, double* @vardouble -; CHECK: vcmpe.f32 s1, s0 +; CHECK: vcmp.f32 s1, s0 ; CHECK: vselgt.f64 d16, d1, d2 ret void } @@ -204,7 +204,7 @@ define void @test_vsel32ult(float %lhs32 %tst1 = fcmp ult float %lhs32, %rhs32 %val1 = select i1 %tst1, float %a, float %b store float %val1, float* @varfloat -; CHECK: vcmpe.f32 s0, s1 +; CHECK: vcmp.f32 s0, s1 ; CHECK: vselge.f32 s0, s3, s2 ret void } @@ -213,7 +213,7 @@ define void @test_vsel64ult(float %lhs32 %tst1 = fcmp ult float %lhs32, %rhs32 %val1 = select i1 %tst1, double %a, double 
%b store double %val1, double* @vardouble -; CHECK: vcmpe.f32 s0, s1 +; CHECK: vcmp.f32 s0, s1 ; CHECK: vselge.f64 d16, d2, d1 ret void } @@ -222,7 +222,7 @@ define void @test_vsel32ole(float %lhs32 %tst1 = fcmp ole float %lhs32, %rhs32 %val1 = select i1 %tst1, float %a, float %b store float %val1, float* @varfloat -; CHECK: vcmpe.f32 s1, s0 +; CHECK: vcmp.f32 s1, s0 ; CHECK: vselge.f32 s0, s2, s3 ret void } @@ -231,7 +231,7 @@ define void @test_vsel64ole(float %lhs32 %tst1 = fcmp ole float %lhs32, %rhs32 %val1 = select i1 %tst1, double %a, double %b store double %val1, double* @vardouble -; CHECK: vcmpe.f32 s1, s0 +; CHECK: vcmp.f32 s1, s0 ; CHECK: vselge.f64 d16, d1, d2 ret void } @@ -240,7 +240,7 @@ define void @test_vsel32ule(float %lhs32 %tst1 = fcmp ule float %lhs32, %rhs32 %val1 = select i1 %tst1, float %a, float %b store float %val1, float* @varfloat -; CHECK: vcmpe.f32 s0, s1 +; CHECK: vcmp.f32 s0, s1 ; CHECK: vselgt.f32 s0, s3, s2 ret void } @@ -249,7 +249,7 @@ define void @test_vsel64ule(float %lhs32 %tst1 = fcmp ule float %lhs32, %rhs32 %val1 = select i1 %tst1, double %a, double %b store double %val1, double* @vardouble -; CHECK: vcmpe.f32 s0, s1 +; CHECK: vcmp.f32 s0, s1 ; CHECK: vselgt.f64 d16, d2, d1 ret void } @@ -258,7 +258,7 @@ define void @test_vsel32ord(float %lhs32 %tst1 = fcmp ord float %lhs32, %rhs32 %val1 = select i1 %tst1, float %a, float %b store float %val1, float* @varfloat -; CHECK: vcmpe.f32 s0, s1 +; CHECK: vcmp.f32 s0, s1 ; CHECK: vselvs.f32 s0, s3, s2 ret void } @@ -267,7 +267,7 @@ define void @test_vsel64ord(float %lhs32 %tst1 = fcmp ord float %lhs32, %rhs32 %val1 = select i1 %tst1, double %a, double %b store double %val1, double* @vardouble -; CHECK: vcmpe.f32 s0, s1 +; CHECK: vcmp.f32 s0, s1 ; CHECK: vselvs.f64 d16, d2, d1 ret void } @@ -294,7 +294,7 @@ define void @test_vsel32uno(float %lhs32 %tst1 = fcmp uno float %lhs32, %rhs32 %val1 = select i1 %tst1, float %a, float %b store float %val1, float* @varfloat -; CHECK: vcmpe.f32 
s0, s1 +; CHECK: vcmp.f32 s0, s1 ; CHECK: vselvs.f32 s0, s2, s3 ret void } @@ -303,7 +303,7 @@ define void @test_vsel64uno(float %lhs32 %tst1 = fcmp uno float %lhs32, %rhs32 %val1 = select i1 %tst1, double %a, double %b store double %val1, double* @vardouble -; CHECK: vcmpe.f32 s0, s1 +; CHECK: vcmp.f32 s0, s1 ; CHECK: vselvs.f64 d16, d1, d2 ret void } @@ -313,7 +313,7 @@ define void @test_vsel32ogt_nnan(float % %tst1 = fcmp nnan ogt float %lhs32, %rhs32 %val1 = select i1 %tst1, float %a, float %b store float %val1, float* @varfloat -; CHECK: vcmpe.f32 s0, s1 +; CHECK: vcmp.f32 s0, s1 ; CHECK: vselgt.f32 s0, s2, s3 ret void } @@ -322,7 +322,7 @@ define void @test_vsel64ogt_nnan(float % %tst1 = fcmp nnan ogt float %lhs32, %rhs32 %val1 = select i1 %tst1, double %a, double %b store double %val1, double* @vardouble -; CHECK: vcmpe.f32 s0, s1 +; CHECK: vcmp.f32 s0, s1 ; CHECK: vselgt.f64 d16, d1, d2 ret void } @@ -331,7 +331,7 @@ define void @test_vsel32oge_nnan(float % %tst1 = fcmp nnan oge float %lhs32, %rhs32 %val1 = select i1 %tst1, float %a, float %b store float %val1, float* @varfloat -; CHECK: vcmpe.f32 s0, s1 +; CHECK: vcmp.f32 s0, s1 ; CHECK: vselge.f32 s0, s2, s3 ret void } @@ -340,7 +340,7 @@ define void @test_vsel64oge_nnan(float % %tst1 = fcmp nnan oge float %lhs32, %rhs32 %val1 = select i1 %tst1, double %a, double %b store double %val1, double* @vardouble -; CHECK: vcmpe.f32 s0, s1 +; CHECK: vcmp.f32 s0, s1 ; CHECK: vselge.f64 d16, d1, d2 ret void } @@ -367,7 +367,7 @@ define void @test_vsel32ugt_nnan(float % %tst1 = fcmp nnan ugt float %lhs32, %rhs32 %val1 = select i1 %tst1, float %a, float %b store float %val1, float* @varfloat -; CHECK: vcmpe.f32 s0, s1 +; CHECK: vcmp.f32 s0, s1 ; CHECK: vselgt.f32 s0, s2, s3 ret void } @@ -376,7 +376,7 @@ define void @test_vsel64ugt_nnan(float % %tst1 = fcmp nnan ugt float %lhs32, %rhs32 %val1 = select i1 %tst1, double %a, double %b store double %val1, double* @vardouble -; CHECK: vcmpe.f32 s0, s1 +; CHECK: vcmp.f32 
s0, s1 ; CHECK: vselgt.f64 d16, d1, d2 ret void } @@ -385,7 +385,7 @@ define void @test_vsel32uge_nnan(float % %tst1 = fcmp nnan uge float %lhs32, %rhs32 %val1 = select i1 %tst1, float %a, float %b store float %val1, float* @varfloat -; CHECK: vcmpe.f32 s0, s1 +; CHECK: vcmp.f32 s0, s1 ; CHECK: vselge.f32 s0, s2, s3 ret void } @@ -394,7 +394,7 @@ define void @test_vsel64uge_nnan(float % %tst1 = fcmp nnan uge float %lhs32, %rhs32 %val1 = select i1 %tst1, double %a, double %b store double %val1, double* @vardouble -; CHECK: vcmpe.f32 s0, s1 +; CHECK: vcmp.f32 s0, s1 ; CHECK: vselge.f64 d16, d1, d2 ret void } @@ -403,7 +403,7 @@ define void @test_vsel32olt_nnan(float % %tst1 = fcmp nnan olt float %lhs32, %rhs32 %val1 = select i1 %tst1, float %a, float %b store float %val1, float* @varfloat -; CHECK: vcmpe.f32 s1, s0 +; CHECK: vcmp.f32 s1, s0 ; CHECK: vselgt.f32 s0, s2, s3 ret void } @@ -412,7 +412,7 @@ define void @test_vsel64olt_nnan(float % %tst1 = fcmp nnan olt float %lhs32, %rhs32 %val1 = select i1 %tst1, double %a, double %b store double %val1, double* @vardouble -; CHECK: vcmpe.f32 s1, s0 +; CHECK: vcmp.f32 s1, s0 ; CHECK: vselgt.f64 d16, d1, d2 ret void } @@ -421,7 +421,7 @@ define void @test_vsel32ult_nnan(float % %tst1 = fcmp nnan ult float %lhs32, %rhs32 %val1 = select i1 %tst1, float %a, float %b store float %val1, float* @varfloat -; CHECK: vcmpe.f32 s1, s0 +; CHECK: vcmp.f32 s1, s0 ; CHECK: vselgt.f32 s0, s2, s3 ret void } @@ -430,7 +430,7 @@ define void @test_vsel64ult_nnan(float % %tst1 = fcmp nnan ult float %lhs32, %rhs32 %val1 = select i1 %tst1, double %a, double %b store double %val1, double* @vardouble -; CHECK: vcmpe.f32 s1, s0 +; CHECK: vcmp.f32 s1, s0 ; CHECK: vselgt.f64 d16, d1, d2 ret void } @@ -439,7 +439,7 @@ define void @test_vsel32ole_nnan(float % %tst1 = fcmp nnan ole float %lhs32, %rhs32 %val1 = select i1 %tst1, float %a, float %b store float %val1, float* @varfloat -; CHECK: vcmpe.f32 s1, s0 +; CHECK: vcmp.f32 s1, s0 ; CHECK: vselge.f32 
s0, s2, s3 ret void } @@ -448,7 +448,7 @@ define void @test_vsel64ole_nnan(float % %tst1 = fcmp nnan ole float %lhs32, %rhs32 %val1 = select i1 %tst1, double %a, double %b store double %val1, double* @vardouble -; CHECK: vcmpe.f32 s1, s0 +; CHECK: vcmp.f32 s1, s0 ; CHECK: vselge.f64 d16, d1, d2 ret void } @@ -457,7 +457,7 @@ define void @test_vsel32ule_nnan(float % %tst1 = fcmp nnan ule float %lhs32, %rhs32 %val1 = select i1 %tst1, float %a, float %b store float %val1, float* @varfloat -; CHECK: vcmpe.f32 s1, s0 +; CHECK: vcmp.f32 s1, s0 ; CHECK: vselge.f32 s0, s2, s3 ret void } @@ -466,7 +466,7 @@ define void @test_vsel64ule_nnan(float % %tst1 = fcmp nnan ule float %lhs32, %rhs32 %val1 = select i1 %tst1, double %a, double %b store double %val1, double* @vardouble -; CHECK: vcmpe.f32 s1, s0 +; CHECK: vcmp.f32 s1, s0 ; CHECK: vselge.f64 d16, d1, d2 ret void } @@ -475,7 +475,7 @@ define void @test_vsel32ord_nnan(float % %tst1 = fcmp nnan ord float %lhs32, %rhs32 %val1 = select i1 %tst1, float %a, float %b store float %val1, float* @varfloat -; CHECK: vcmpe.f32 s0, s1 +; CHECK: vcmp.f32 s0, s1 ; CHECK: vselvs.f32 s0, s3, s2 ret void } @@ -484,7 +484,7 @@ define void @test_vsel64ord_nnan(float % %tst1 = fcmp nnan ord float %lhs32, %rhs32 %val1 = select i1 %tst1, double %a, double %b store double %val1, double* @vardouble -; CHECK: vcmpe.f32 s0, s1 +; CHECK: vcmp.f32 s0, s1 ; CHECK: vselvs.f64 d16, d2, d1 ret void } @@ -511,7 +511,7 @@ define void @test_vsel32uno_nnan(float % %tst1 = fcmp nnan uno float %lhs32, %rhs32 %val1 = select i1 %tst1, float %a, float %b store float %val1, float* @varfloat -; CHECK: vcmpe.f32 s0, s1 +; CHECK: vcmp.f32 s0, s1 ; CHECK: vselvs.f32 s0, s2, s3 ret void } @@ -520,7 +520,7 @@ define void @test_vsel64uno_nnan(float % %tst1 = fcmp nnan uno float %lhs32, %rhs32 %val1 = select i1 %tst1, double %a, double %b store double %val1, double* @vardouble -; CHECK: vcmpe.f32 s0, s1 +; CHECK: vcmp.f32 s0, s1 ; CHECK: vselvs.f64 d16, d1, d2 ret void } 
Modified: llvm/trunk/test/CodeGen/Thumb2/float-cmp.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/Thumb2/float-cmp.ll?rev=374025&r1=374024&r2=374025&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/Thumb2/float-cmp.ll (original) +++ llvm/trunk/test/CodeGen/Thumb2/float-cmp.ll Tue Oct 8 01:25:42 2019 @@ -23,7 +23,7 @@ define i1 @cmp_f_oeq(float %a, float %b) define i1 @cmp_f_ogt(float %a, float %b) { ; CHECK-LABEL: cmp_f_ogt: ; NONE: bl __aeabi_fcmpgt -; HARD: vcmpe.f32 +; HARD: vcmp.f32 ; HARD: movgt r0, #1 %1 = fcmp ogt float %a, %b ret i1 %1 @@ -31,7 +31,7 @@ define i1 @cmp_f_ogt(float %a, float %b) define i1 @cmp_f_oge(float %a, float %b) { ; CHECK-LABEL: cmp_f_oge: ; NONE: bl __aeabi_fcmpge -; HARD: vcmpe.f32 +; HARD: vcmp.f32 ; HARD: movge r0, #1 %1 = fcmp oge float %a, %b ret i1 %1 @@ -39,7 +39,7 @@ define i1 @cmp_f_oge(float %a, float %b) define i1 @cmp_f_olt(float %a, float %b) { ; CHECK-LABEL: cmp_f_olt: ; NONE: bl __aeabi_fcmplt -; HARD: vcmpe.f32 +; HARD: vcmp.f32 ; HARD: movmi r0, #1 %1 = fcmp olt float %a, %b ret i1 %1 @@ -47,7 +47,7 @@ define i1 @cmp_f_olt(float %a, float %b) define i1 @cmp_f_ole(float %a, float %b) { ; CHECK-LABEL: cmp_f_ole: ; NONE: bl __aeabi_fcmple -; HARD: vcmpe.f32 +; HARD: vcmp.f32 ; HARD: movls r0, #1 %1 = fcmp ole float %a, %b ret i1 %1 @@ -65,7 +65,7 @@ define i1 @cmp_f_one(float %a, float %b) define i1 @cmp_f_ord(float %a, float %b) { ; CHECK-LABEL: cmp_f_ord: ; NONE: bl __aeabi_fcmpun -; HARD: vcmpe.f32 +; HARD: vcmp.f32 ; HARD: movvc r0, #1 %1 = fcmp ord float %a, %b ret i1 %1 @@ -85,7 +85,7 @@ define i1 @cmp_f_ugt(float %a, float %b) ; NONE: bl __aeabi_fcmple ; NONE-NEXT: clz r0, r0 ; NONE-NEXT: lsrs r0, r0, #5 -; HARD: vcmpe.f32 +; HARD: vcmp.f32 ; HARD: movhi r0, #1 %1 = fcmp ugt float %a, %b ret i1 %1 @@ -95,7 +95,7 @@ define i1 @cmp_f_uge(float %a, float %b) ; NONE: bl __aeabi_fcmplt ; NONE-NEXT: clz r0, r0 ; NONE-NEXT: lsrs r0, 
r0, #5 -; HARD: vcmpe.f32 +; HARD: vcmp.f32 ; HARD: movpl r0, #1 %1 = fcmp uge float %a, %b ret i1 %1 @@ -105,7 +105,7 @@ define i1 @cmp_f_ult(float %a, float %b) ; NONE: bl __aeabi_fcmpge ; NONE-NEXT: clz r0, r0 ; NONE-NEXT: lsrs r0, r0, #5 -; HARD: vcmpe.f32 +; HARD: vcmp.f32 ; HARD: movlt r0, #1 %1 = fcmp ult float %a, %b ret i1 %1 @@ -115,7 +115,7 @@ define i1 @cmp_f_ule(float %a, float %b) ; NONE: bl __aeabi_fcmpgt ; NONE-NEXT: clz r0, r0 ; NONE-NEXT: lsrs r0, r0, #5 -; HARD: vcmpe.f32 +; HARD: vcmp.f32 ; HARD: movle r0, #1 %1 = fcmp ule float %a, %b ret i1 %1 @@ -131,7 +131,7 @@ define i1 @cmp_f_une(float %a, float %b) define i1 @cmp_f_uno(float %a, float %b) { ; CHECK-LABEL: cmp_f_uno: ; NONE: bl __aeabi_fcmpun -; HARD: vcmpe.f32 +; HARD: vcmp.f32 ; HARD: movvs r0, #1 %1 = fcmp uno float %a, %b ret i1 %1 @@ -164,7 +164,7 @@ define i1 @cmp_d_ogt(double %a, double % ; CHECK-LABEL: cmp_d_ogt: ; NONE: bl __aeabi_dcmpgt ; SP: bl __aeabi_dcmpgt -; DP: vcmpe.f64 +; DP: vcmp.f64 ; DP: movgt r0, #1 %1 = fcmp ogt double %a, %b ret i1 %1 @@ -173,7 +173,7 @@ define i1 @cmp_d_oge(double %a, double % ; CHECK-LABEL: cmp_d_oge: ; NONE: bl __aeabi_dcmpge ; SP: bl __aeabi_dcmpge -; DP: vcmpe.f64 +; DP: vcmp.f64 ; DP: movge r0, #1 %1 = fcmp oge double %a, %b ret i1 %1 @@ -182,7 +182,7 @@ define i1 @cmp_d_olt(double %a, double % ; CHECK-LABEL: cmp_d_olt: ; NONE: bl __aeabi_dcmplt ; SP: bl __aeabi_dcmplt -; DP: vcmpe.f64 +; DP: vcmp.f64 ; DP: movmi r0, #1 %1 = fcmp olt double %a, %b ret i1 %1 @@ -191,7 +191,7 @@ define i1 @cmp_d_ole(double %a, double % ; CHECK-LABEL: cmp_d_ole: ; NONE: bl __aeabi_dcmple ; SP: bl __aeabi_dcmple -; DP: vcmpe.f64 +; DP: vcmp.f64 ; DP: movls r0, #1 %1 = fcmp ole double %a, %b ret i1 %1 @@ -212,7 +212,7 @@ define i1 @cmp_d_ord(double %a, double % ; CHECK-LABEL: cmp_d_ord: ; NONE: bl __aeabi_dcmpun ; SP: bl __aeabi_dcmpun -; DP: vcmpe.f64 +; DP: vcmp.f64 ; DP: movvc r0, #1 %1 = fcmp ord double %a, %b ret i1 %1 @@ -221,7 +221,7 @@ define i1 
@cmp_d_ugt(double %a, double % ; CHECK-LABEL: cmp_d_ugt: ; NONE: bl __aeabi_dcmple ; SP: bl __aeabi_dcmple -; DP: vcmpe.f64 +; DP: vcmp.f64 ; DP: movhi r0, #1 %1 = fcmp ugt double %a, %b ret i1 %1 @@ -231,7 +231,7 @@ define i1 @cmp_d_ult(double %a, double % ; CHECK-LABEL: cmp_d_ult: ; NONE: bl __aeabi_dcmpge ; SP: bl __aeabi_dcmpge -; DP: vcmpe.f64 +; DP: vcmp.f64 ; DP: movlt r0, #1 %1 = fcmp ult double %a, %b ret i1 %1 @@ -242,7 +242,7 @@ define i1 @cmp_d_uno(double %a, double % ; CHECK-LABEL: cmp_d_uno: ; NONE: bl __aeabi_dcmpun ; SP: bl __aeabi_dcmpun -; DP: vcmpe.f64 +; DP: vcmp.f64 ; DP: movvs r0, #1 %1 = fcmp uno double %a, %b ret i1 %1 @@ -271,7 +271,7 @@ define i1 @cmp_d_uge(double %a, double % ; CHECK-LABEL: cmp_d_uge: ; NONE: bl __aeabi_dcmplt ; SP: bl __aeabi_dcmplt -; DP: vcmpe.f64 +; DP: vcmp.f64 ; DP: movpl r0, #1 %1 = fcmp uge double %a, %b ret i1 %1 @@ -281,7 +281,7 @@ define i1 @cmp_d_ule(double %a, double % ; CHECK-LABEL: cmp_d_ule: ; NONE: bl __aeabi_dcmpgt ; SP: bl __aeabi_dcmpgt -; DP: vcmpe.f64 +; DP: vcmp.f64 ; DP: movle r0, #1 %1 = fcmp ule double %a, %b ret i1 %1 Modified: llvm/trunk/test/CodeGen/Thumb2/mve-vcmpf.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/Thumb2/mve-vcmpf.ll?rev=374025&r1=374024&r2=374025&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/Thumb2/mve-vcmpf.ll (original) +++ llvm/trunk/test/CodeGen/Thumb2/mve-vcmpf.ll Tue Oct 8 01:25:42 2019 @@ -121,24 +121,24 @@ entry: define arm_aapcs_vfpcc <4 x float> @vcmp_ogt_v4f32(<4 x float> %src, <4 x float> %src2, <4 x float> %a, <4 x float> %b) { ; CHECK-MVE-LABEL: vcmp_ogt_v4f32: ; CHECK-MVE: @ %bb.0: @ %entry -; CHECK-MVE-NEXT: vcmpe.f32 s0, s4 +; CHECK-MVE-NEXT: vcmp.f32 s0, s4 ; CHECK-MVE-NEXT: movs r1, #0 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: it gt ; CHECK-MVE-NEXT: movgt r1, #1 ; CHECK-MVE-NEXT: cmp r1, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s1, s5 +; CHECK-MVE-NEXT: 
vcmp.f32 s1, s5 ; CHECK-MVE-NEXT: cset r1, ne ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r2, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s2, s6 +; CHECK-MVE-NEXT: vcmp.f32 s2, s6 ; CHECK-MVE-NEXT: it gt ; CHECK-MVE-NEXT: movgt r2, #1 ; CHECK-MVE-NEXT: cmp r2, #0 ; CHECK-MVE-NEXT: cset r2, ne ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r3, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s3, s7 +; CHECK-MVE-NEXT: vcmp.f32 s3, s7 ; CHECK-MVE-NEXT: it gt ; CHECK-MVE-NEXT: movgt r3, #1 ; CHECK-MVE-NEXT: cmp r3, #0 @@ -173,24 +173,24 @@ entry: define arm_aapcs_vfpcc <4 x float> @vcmp_oge_v4f32(<4 x float> %src, <4 x float> %src2, <4 x float> %a, <4 x float> %b) { ; CHECK-MVE-LABEL: vcmp_oge_v4f32: ; CHECK-MVE: @ %bb.0: @ %entry -; CHECK-MVE-NEXT: vcmpe.f32 s0, s4 +; CHECK-MVE-NEXT: vcmp.f32 s0, s4 ; CHECK-MVE-NEXT: movs r1, #0 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: it ge ; CHECK-MVE-NEXT: movge r1, #1 ; CHECK-MVE-NEXT: cmp r1, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s1, s5 +; CHECK-MVE-NEXT: vcmp.f32 s1, s5 ; CHECK-MVE-NEXT: cset r1, ne ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r2, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s2, s6 +; CHECK-MVE-NEXT: vcmp.f32 s2, s6 ; CHECK-MVE-NEXT: it ge ; CHECK-MVE-NEXT: movge r2, #1 ; CHECK-MVE-NEXT: cmp r2, #0 ; CHECK-MVE-NEXT: cset r2, ne ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r3, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s3, s7 +; CHECK-MVE-NEXT: vcmp.f32 s3, s7 ; CHECK-MVE-NEXT: it ge ; CHECK-MVE-NEXT: movge r3, #1 ; CHECK-MVE-NEXT: cmp r3, #0 @@ -225,24 +225,24 @@ entry: define arm_aapcs_vfpcc <4 x float> @vcmp_olt_v4f32(<4 x float> %src, <4 x float> %src2, <4 x float> %a, <4 x float> %b) { ; CHECK-MVE-LABEL: vcmp_olt_v4f32: ; CHECK-MVE: @ %bb.0: @ %entry -; CHECK-MVE-NEXT: vcmpe.f32 s0, s4 +; CHECK-MVE-NEXT: vcmp.f32 s0, s4 ; CHECK-MVE-NEXT: movs r1, #0 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: it mi ; CHECK-MVE-NEXT: movmi r1, #1 ; CHECK-MVE-NEXT: cmp r1, #0 -; 
CHECK-MVE-NEXT: vcmpe.f32 s1, s5 +; CHECK-MVE-NEXT: vcmp.f32 s1, s5 ; CHECK-MVE-NEXT: cset r1, ne ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r2, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s2, s6 +; CHECK-MVE-NEXT: vcmp.f32 s2, s6 ; CHECK-MVE-NEXT: it mi ; CHECK-MVE-NEXT: movmi r2, #1 ; CHECK-MVE-NEXT: cmp r2, #0 ; CHECK-MVE-NEXT: cset r2, ne ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r3, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s3, s7 +; CHECK-MVE-NEXT: vcmp.f32 s3, s7 ; CHECK-MVE-NEXT: it mi ; CHECK-MVE-NEXT: movmi r3, #1 ; CHECK-MVE-NEXT: cmp r3, #0 @@ -277,24 +277,24 @@ entry: define arm_aapcs_vfpcc <4 x float> @vcmp_ole_v4f32(<4 x float> %src, <4 x float> %src2, <4 x float> %a, <4 x float> %b) { ; CHECK-MVE-LABEL: vcmp_ole_v4f32: ; CHECK-MVE: @ %bb.0: @ %entry -; CHECK-MVE-NEXT: vcmpe.f32 s0, s4 +; CHECK-MVE-NEXT: vcmp.f32 s0, s4 ; CHECK-MVE-NEXT: movs r1, #0 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: it ls ; CHECK-MVE-NEXT: movls r1, #1 ; CHECK-MVE-NEXT: cmp r1, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s1, s5 +; CHECK-MVE-NEXT: vcmp.f32 s1, s5 ; CHECK-MVE-NEXT: cset r1, ne ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r2, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s2, s6 +; CHECK-MVE-NEXT: vcmp.f32 s2, s6 ; CHECK-MVE-NEXT: it ls ; CHECK-MVE-NEXT: movls r2, #1 ; CHECK-MVE-NEXT: cmp r2, #0 ; CHECK-MVE-NEXT: cset r2, ne ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r3, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s3, s7 +; CHECK-MVE-NEXT: vcmp.f32 s3, s7 ; CHECK-MVE-NEXT: it ls ; CHECK-MVE-NEXT: movls r3, #1 ; CHECK-MVE-NEXT: cmp r3, #0 @@ -444,24 +444,24 @@ entry: define arm_aapcs_vfpcc <4 x float> @vcmp_ugt_v4f32(<4 x float> %src, <4 x float> %src2, <4 x float> %a, <4 x float> %b) { ; CHECK-MVE-LABEL: vcmp_ugt_v4f32: ; CHECK-MVE: @ %bb.0: @ %entry -; CHECK-MVE-NEXT: vcmpe.f32 s0, s4 +; CHECK-MVE-NEXT: vcmp.f32 s0, s4 ; CHECK-MVE-NEXT: movs r1, #0 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: it hi ; 
CHECK-MVE-NEXT: movhi r1, #1 ; CHECK-MVE-NEXT: cmp r1, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s1, s5 +; CHECK-MVE-NEXT: vcmp.f32 s1, s5 ; CHECK-MVE-NEXT: cset r1, ne ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r2, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s2, s6 +; CHECK-MVE-NEXT: vcmp.f32 s2, s6 ; CHECK-MVE-NEXT: it hi ; CHECK-MVE-NEXT: movhi r2, #1 ; CHECK-MVE-NEXT: cmp r2, #0 ; CHECK-MVE-NEXT: cset r2, ne ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r3, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s3, s7 +; CHECK-MVE-NEXT: vcmp.f32 s3, s7 ; CHECK-MVE-NEXT: it hi ; CHECK-MVE-NEXT: movhi r3, #1 ; CHECK-MVE-NEXT: cmp r3, #0 @@ -497,24 +497,24 @@ entry: define arm_aapcs_vfpcc <4 x float> @vcmp_uge_v4f32(<4 x float> %src, <4 x float> %src2, <4 x float> %a, <4 x float> %b) { ; CHECK-MVE-LABEL: vcmp_uge_v4f32: ; CHECK-MVE: @ %bb.0: @ %entry -; CHECK-MVE-NEXT: vcmpe.f32 s0, s4 +; CHECK-MVE-NEXT: vcmp.f32 s0, s4 ; CHECK-MVE-NEXT: movs r1, #0 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: it pl ; CHECK-MVE-NEXT: movpl r1, #1 ; CHECK-MVE-NEXT: cmp r1, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s1, s5 +; CHECK-MVE-NEXT: vcmp.f32 s1, s5 ; CHECK-MVE-NEXT: cset r1, ne ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r2, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s2, s6 +; CHECK-MVE-NEXT: vcmp.f32 s2, s6 ; CHECK-MVE-NEXT: it pl ; CHECK-MVE-NEXT: movpl r2, #1 ; CHECK-MVE-NEXT: cmp r2, #0 ; CHECK-MVE-NEXT: cset r2, ne ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r3, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s3, s7 +; CHECK-MVE-NEXT: vcmp.f32 s3, s7 ; CHECK-MVE-NEXT: it pl ; CHECK-MVE-NEXT: movpl r3, #1 ; CHECK-MVE-NEXT: cmp r3, #0 @@ -550,24 +550,24 @@ entry: define arm_aapcs_vfpcc <4 x float> @vcmp_ult_v4f32(<4 x float> %src, <4 x float> %src2, <4 x float> %a, <4 x float> %b) { ; CHECK-MVE-LABEL: vcmp_ult_v4f32: ; CHECK-MVE: @ %bb.0: @ %entry -; CHECK-MVE-NEXT: vcmpe.f32 s0, s4 +; CHECK-MVE-NEXT: vcmp.f32 s0, s4 ; CHECK-MVE-NEXT: movs r1, #0 ; 
CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: it lt ; CHECK-MVE-NEXT: movlt r1, #1 ; CHECK-MVE-NEXT: cmp r1, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s1, s5 +; CHECK-MVE-NEXT: vcmp.f32 s1, s5 ; CHECK-MVE-NEXT: cset r1, ne ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r2, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s2, s6 +; CHECK-MVE-NEXT: vcmp.f32 s2, s6 ; CHECK-MVE-NEXT: it lt ; CHECK-MVE-NEXT: movlt r2, #1 ; CHECK-MVE-NEXT: cmp r2, #0 ; CHECK-MVE-NEXT: cset r2, ne ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r3, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s3, s7 +; CHECK-MVE-NEXT: vcmp.f32 s3, s7 ; CHECK-MVE-NEXT: it lt ; CHECK-MVE-NEXT: movlt r3, #1 ; CHECK-MVE-NEXT: cmp r3, #0 @@ -603,24 +603,24 @@ entry: define arm_aapcs_vfpcc <4 x float> @vcmp_ule_v4f32(<4 x float> %src, <4 x float> %src2, <4 x float> %a, <4 x float> %b) { ; CHECK-MVE-LABEL: vcmp_ule_v4f32: ; CHECK-MVE: @ %bb.0: @ %entry -; CHECK-MVE-NEXT: vcmpe.f32 s0, s4 +; CHECK-MVE-NEXT: vcmp.f32 s0, s4 ; CHECK-MVE-NEXT: movs r1, #0 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: it le ; CHECK-MVE-NEXT: movle r1, #1 ; CHECK-MVE-NEXT: cmp r1, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s1, s5 +; CHECK-MVE-NEXT: vcmp.f32 s1, s5 ; CHECK-MVE-NEXT: cset r1, ne ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r2, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s2, s6 +; CHECK-MVE-NEXT: vcmp.f32 s2, s6 ; CHECK-MVE-NEXT: it le ; CHECK-MVE-NEXT: movle r2, #1 ; CHECK-MVE-NEXT: cmp r2, #0 ; CHECK-MVE-NEXT: cset r2, ne ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r3, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s3, s7 +; CHECK-MVE-NEXT: vcmp.f32 s3, s7 ; CHECK-MVE-NEXT: it le ; CHECK-MVE-NEXT: movle r3, #1 ; CHECK-MVE-NEXT: cmp r3, #0 @@ -656,24 +656,24 @@ entry: define arm_aapcs_vfpcc <4 x float> @vcmp_ord_v4f32(<4 x float> %src, <4 x float> %src2, <4 x float> %a, <4 x float> %b) { ; CHECK-MVE-LABEL: vcmp_ord_v4f32: ; CHECK-MVE: @ %bb.0: @ %entry -; CHECK-MVE-NEXT: vcmpe.f32 s0, s4 +; 
CHECK-MVE-NEXT: vcmp.f32 s0, s4 ; CHECK-MVE-NEXT: movs r1, #0 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: it vc ; CHECK-MVE-NEXT: movvc r1, #1 ; CHECK-MVE-NEXT: cmp r1, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s1, s5 +; CHECK-MVE-NEXT: vcmp.f32 s1, s5 ; CHECK-MVE-NEXT: cset r1, ne ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r2, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s2, s6 +; CHECK-MVE-NEXT: vcmp.f32 s2, s6 ; CHECK-MVE-NEXT: it vc ; CHECK-MVE-NEXT: movvc r2, #1 ; CHECK-MVE-NEXT: cmp r2, #0 ; CHECK-MVE-NEXT: cset r2, ne ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r3, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s3, s7 +; CHECK-MVE-NEXT: vcmp.f32 s3, s7 ; CHECK-MVE-NEXT: it vc ; CHECK-MVE-NEXT: movvc r3, #1 ; CHECK-MVE-NEXT: cmp r3, #0 @@ -710,24 +710,24 @@ entry: define arm_aapcs_vfpcc <4 x float> @vcmp_uno_v4f32(<4 x float> %src, <4 x float> %src2, <4 x float> %a, <4 x float> %b) { ; CHECK-MVE-LABEL: vcmp_uno_v4f32: ; CHECK-MVE: @ %bb.0: @ %entry -; CHECK-MVE-NEXT: vcmpe.f32 s0, s4 +; CHECK-MVE-NEXT: vcmp.f32 s0, s4 ; CHECK-MVE-NEXT: movs r1, #0 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: it vs ; CHECK-MVE-NEXT: movvs r1, #1 ; CHECK-MVE-NEXT: cmp r1, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s1, s5 +; CHECK-MVE-NEXT: vcmp.f32 s1, s5 ; CHECK-MVE-NEXT: cset r1, ne ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r2, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s2, s6 +; CHECK-MVE-NEXT: vcmp.f32 s2, s6 ; CHECK-MVE-NEXT: it vs ; CHECK-MVE-NEXT: movvs r2, #1 ; CHECK-MVE-NEXT: cmp r2, #0 ; CHECK-MVE-NEXT: cset r2, ne ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r3, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s3, s7 +; CHECK-MVE-NEXT: vcmp.f32 s3, s7 ; CHECK-MVE-NEXT: it vs ; CHECK-MVE-NEXT: movvs r3, #1 ; CHECK-MVE-NEXT: cmp r3, #0 @@ -1035,13 +1035,13 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vpush {d8, d9, d10, d11} ; CHECK-MVE-NEXT: vmovx.f16 s16, s4 ; CHECK-MVE-NEXT: vmovx.f16 s18, s0 -; CHECK-MVE-NEXT: 
vcmpe.f16 s18, s16 +; CHECK-MVE-NEXT: vcmp.f16 s18, s16 ; CHECK-MVE-NEXT: movs r1, #0 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: it gt ; CHECK-MVE-NEXT: movgt r1, #1 ; CHECK-MVE-NEXT: cmp r1, #0 -; CHECK-MVE-NEXT: vcmpe.f16 s0, s4 +; CHECK-MVE-NEXT: vcmp.f16 s0, s4 ; CHECK-MVE-NEXT: cset r1, ne ; CHECK-MVE-NEXT: vmovx.f16 s16, s8 ; CHECK-MVE-NEXT: vmovx.f16 s18, s12 @@ -1054,7 +1054,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: movgt r2, #1 ; CHECK-MVE-NEXT: cmp r2, #0 ; CHECK-MVE-NEXT: cset r2, ne -; CHECK-MVE-NEXT: vcmpe.f16 s1, s5 +; CHECK-MVE-NEXT: vcmp.f16 s1, s5 ; CHECK-MVE-NEXT: lsls r2, r2, #31 ; CHECK-MVE-NEXT: vmovx.f16 s22, s1 ; CHECK-MVE-NEXT: vseleq.f16 s16, s12, s8 @@ -1074,7 +1074,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vseleq.f16 s20, s13, s9 ; CHECK-MVE-NEXT: vmov r1, s20 ; CHECK-MVE-NEXT: vmovx.f16 s20, s5 -; CHECK-MVE-NEXT: vcmpe.f16 s22, s20 +; CHECK-MVE-NEXT: vcmp.f16 s22, s20 ; CHECK-MVE-NEXT: vmov.16 q4[2], r1 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r1, #0 @@ -1086,7 +1086,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vmovx.f16 s22, s13 ; CHECK-MVE-NEXT: lsls r1, r1, #31 ; CHECK-MVE-NEXT: vseleq.f16 s20, s22, s20 -; CHECK-MVE-NEXT: vcmpe.f16 s2, s6 +; CHECK-MVE-NEXT: vcmp.f16 s2, s6 ; CHECK-MVE-NEXT: vmov r1, s20 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: vmov.16 q4[3], r1 @@ -1101,7 +1101,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vseleq.f16 s20, s14, s10 ; CHECK-MVE-NEXT: vmov r1, s20 ; CHECK-MVE-NEXT: vmovx.f16 s20, s6 -; CHECK-MVE-NEXT: vcmpe.f16 s22, s20 +; CHECK-MVE-NEXT: vcmp.f16 s22, s20 ; CHECK-MVE-NEXT: vmov.16 q4[4], r1 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r1, #0 @@ -1113,7 +1113,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vmovx.f16 s22, s14 ; CHECK-MVE-NEXT: lsls r1, r1, #31 ; CHECK-MVE-NEXT: vseleq.f16 s20, s22, s20 -; CHECK-MVE-NEXT: vcmpe.f16 s3, s7 
+; CHECK-MVE-NEXT: vcmp.f16 s3, s7 ; CHECK-MVE-NEXT: vmov r1, s20 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: vmov.16 q4[5], r1 @@ -1122,7 +1122,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: movgt r1, #1 ; CHECK-MVE-NEXT: cmp r1, #0 ; CHECK-MVE-NEXT: cset r1, ne -; CHECK-MVE-NEXT: vcmpe.f16 s0, s4 +; CHECK-MVE-NEXT: vcmp.f16 s0, s4 ; CHECK-MVE-NEXT: lsls r1, r1, #31 ; CHECK-MVE-NEXT: vmovx.f16 s0, s11 ; CHECK-MVE-NEXT: vseleq.f16 s20, s15, s11 @@ -1159,13 +1159,13 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vpush {d8, d9, d10, d11} ; CHECK-MVE-NEXT: vmovx.f16 s16, s4 ; CHECK-MVE-NEXT: vmovx.f16 s18, s0 -; CHECK-MVE-NEXT: vcmpe.f16 s18, s16 +; CHECK-MVE-NEXT: vcmp.f16 s18, s16 ; CHECK-MVE-NEXT: movs r1, #0 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: it ge ; CHECK-MVE-NEXT: movge r1, #1 ; CHECK-MVE-NEXT: cmp r1, #0 -; CHECK-MVE-NEXT: vcmpe.f16 s0, s4 +; CHECK-MVE-NEXT: vcmp.f16 s0, s4 ; CHECK-MVE-NEXT: cset r1, ne ; CHECK-MVE-NEXT: vmovx.f16 s16, s8 ; CHECK-MVE-NEXT: vmovx.f16 s18, s12 @@ -1178,7 +1178,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: movge r2, #1 ; CHECK-MVE-NEXT: cmp r2, #0 ; CHECK-MVE-NEXT: cset r2, ne -; CHECK-MVE-NEXT: vcmpe.f16 s1, s5 +; CHECK-MVE-NEXT: vcmp.f16 s1, s5 ; CHECK-MVE-NEXT: lsls r2, r2, #31 ; CHECK-MVE-NEXT: vmovx.f16 s22, s1 ; CHECK-MVE-NEXT: vseleq.f16 s16, s12, s8 @@ -1198,7 +1198,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vseleq.f16 s20, s13, s9 ; CHECK-MVE-NEXT: vmov r1, s20 ; CHECK-MVE-NEXT: vmovx.f16 s20, s5 -; CHECK-MVE-NEXT: vcmpe.f16 s22, s20 +; CHECK-MVE-NEXT: vcmp.f16 s22, s20 ; CHECK-MVE-NEXT: vmov.16 q4[2], r1 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r1, #0 @@ -1210,7 +1210,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vmovx.f16 s22, s13 ; CHECK-MVE-NEXT: lsls r1, r1, #31 ; CHECK-MVE-NEXT: vseleq.f16 s20, s22, s20 -; CHECK-MVE-NEXT: vcmpe.f16 s2, s6 +; CHECK-MVE-NEXT: vcmp.f16 s2, 
s6 ; CHECK-MVE-NEXT: vmov r1, s20 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: vmov.16 q4[3], r1 @@ -1225,7 +1225,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vseleq.f16 s20, s14, s10 ; CHECK-MVE-NEXT: vmov r1, s20 ; CHECK-MVE-NEXT: vmovx.f16 s20, s6 -; CHECK-MVE-NEXT: vcmpe.f16 s22, s20 +; CHECK-MVE-NEXT: vcmp.f16 s22, s20 ; CHECK-MVE-NEXT: vmov.16 q4[4], r1 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r1, #0 @@ -1237,7 +1237,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vmovx.f16 s22, s14 ; CHECK-MVE-NEXT: lsls r1, r1, #31 ; CHECK-MVE-NEXT: vseleq.f16 s20, s22, s20 -; CHECK-MVE-NEXT: vcmpe.f16 s3, s7 +; CHECK-MVE-NEXT: vcmp.f16 s3, s7 ; CHECK-MVE-NEXT: vmov r1, s20 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: vmov.16 q4[5], r1 @@ -1246,7 +1246,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: movge r1, #1 ; CHECK-MVE-NEXT: cmp r1, #0 ; CHECK-MVE-NEXT: cset r1, ne -; CHECK-MVE-NEXT: vcmpe.f16 s0, s4 +; CHECK-MVE-NEXT: vcmp.f16 s0, s4 ; CHECK-MVE-NEXT: lsls r1, r1, #31 ; CHECK-MVE-NEXT: vmovx.f16 s0, s11 ; CHECK-MVE-NEXT: vseleq.f16 s20, s15, s11 @@ -1283,13 +1283,13 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vpush {d8, d9, d10, d11} ; CHECK-MVE-NEXT: vmovx.f16 s16, s4 ; CHECK-MVE-NEXT: vmovx.f16 s18, s0 -; CHECK-MVE-NEXT: vcmpe.f16 s18, s16 +; CHECK-MVE-NEXT: vcmp.f16 s18, s16 ; CHECK-MVE-NEXT: movs r1, #0 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: it mi ; CHECK-MVE-NEXT: movmi r1, #1 ; CHECK-MVE-NEXT: cmp r1, #0 -; CHECK-MVE-NEXT: vcmpe.f16 s0, s4 +; CHECK-MVE-NEXT: vcmp.f16 s0, s4 ; CHECK-MVE-NEXT: cset r1, ne ; CHECK-MVE-NEXT: vmovx.f16 s16, s8 ; CHECK-MVE-NEXT: vmovx.f16 s18, s12 @@ -1302,7 +1302,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: movmi r2, #1 ; CHECK-MVE-NEXT: cmp r2, #0 ; CHECK-MVE-NEXT: cset r2, ne -; CHECK-MVE-NEXT: vcmpe.f16 s1, s5 +; CHECK-MVE-NEXT: vcmp.f16 s1, s5 ; CHECK-MVE-NEXT: lsls r2, r2, 
#31 ; CHECK-MVE-NEXT: vmovx.f16 s22, s1 ; CHECK-MVE-NEXT: vseleq.f16 s16, s12, s8 @@ -1322,7 +1322,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vseleq.f16 s20, s13, s9 ; CHECK-MVE-NEXT: vmov r1, s20 ; CHECK-MVE-NEXT: vmovx.f16 s20, s5 -; CHECK-MVE-NEXT: vcmpe.f16 s22, s20 +; CHECK-MVE-NEXT: vcmp.f16 s22, s20 ; CHECK-MVE-NEXT: vmov.16 q4[2], r1 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r1, #0 @@ -1334,7 +1334,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vmovx.f16 s22, s13 ; CHECK-MVE-NEXT: lsls r1, r1, #31 ; CHECK-MVE-NEXT: vseleq.f16 s20, s22, s20 -; CHECK-MVE-NEXT: vcmpe.f16 s2, s6 +; CHECK-MVE-NEXT: vcmp.f16 s2, s6 ; CHECK-MVE-NEXT: vmov r1, s20 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: vmov.16 q4[3], r1 @@ -1349,7 +1349,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vseleq.f16 s20, s14, s10 ; CHECK-MVE-NEXT: vmov r1, s20 ; CHECK-MVE-NEXT: vmovx.f16 s20, s6 -; CHECK-MVE-NEXT: vcmpe.f16 s22, s20 +; CHECK-MVE-NEXT: vcmp.f16 s22, s20 ; CHECK-MVE-NEXT: vmov.16 q4[4], r1 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r1, #0 @@ -1361,7 +1361,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vmovx.f16 s22, s14 ; CHECK-MVE-NEXT: lsls r1, r1, #31 ; CHECK-MVE-NEXT: vseleq.f16 s20, s22, s20 -; CHECK-MVE-NEXT: vcmpe.f16 s3, s7 +; CHECK-MVE-NEXT: vcmp.f16 s3, s7 ; CHECK-MVE-NEXT: vmov r1, s20 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: vmov.16 q4[5], r1 @@ -1370,7 +1370,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: movmi r1, #1 ; CHECK-MVE-NEXT: cmp r1, #0 ; CHECK-MVE-NEXT: cset r1, ne -; CHECK-MVE-NEXT: vcmpe.f16 s0, s4 +; CHECK-MVE-NEXT: vcmp.f16 s0, s4 ; CHECK-MVE-NEXT: lsls r1, r1, #31 ; CHECK-MVE-NEXT: vmovx.f16 s0, s11 ; CHECK-MVE-NEXT: vseleq.f16 s20, s15, s11 @@ -1407,13 +1407,13 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vpush {d8, d9, d10, d11} ; CHECK-MVE-NEXT: vmovx.f16 s16, s4 ; 
CHECK-MVE-NEXT: vmovx.f16 s18, s0 -; CHECK-MVE-NEXT: vcmpe.f16 s18, s16 +; CHECK-MVE-NEXT: vcmp.f16 s18, s16 ; CHECK-MVE-NEXT: movs r1, #0 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: it ls ; CHECK-MVE-NEXT: movls r1, #1 ; CHECK-MVE-NEXT: cmp r1, #0 -; CHECK-MVE-NEXT: vcmpe.f16 s0, s4 +; CHECK-MVE-NEXT: vcmp.f16 s0, s4 ; CHECK-MVE-NEXT: cset r1, ne ; CHECK-MVE-NEXT: vmovx.f16 s16, s8 ; CHECK-MVE-NEXT: vmovx.f16 s18, s12 @@ -1426,7 +1426,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: movls r2, #1 ; CHECK-MVE-NEXT: cmp r2, #0 ; CHECK-MVE-NEXT: cset r2, ne -; CHECK-MVE-NEXT: vcmpe.f16 s1, s5 +; CHECK-MVE-NEXT: vcmp.f16 s1, s5 ; CHECK-MVE-NEXT: lsls r2, r2, #31 ; CHECK-MVE-NEXT: vmovx.f16 s22, s1 ; CHECK-MVE-NEXT: vseleq.f16 s16, s12, s8 @@ -1446,7 +1446,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vseleq.f16 s20, s13, s9 ; CHECK-MVE-NEXT: vmov r1, s20 ; CHECK-MVE-NEXT: vmovx.f16 s20, s5 -; CHECK-MVE-NEXT: vcmpe.f16 s22, s20 +; CHECK-MVE-NEXT: vcmp.f16 s22, s20 ; CHECK-MVE-NEXT: vmov.16 q4[2], r1 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r1, #0 @@ -1458,7 +1458,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vmovx.f16 s22, s13 ; CHECK-MVE-NEXT: lsls r1, r1, #31 ; CHECK-MVE-NEXT: vseleq.f16 s20, s22, s20 -; CHECK-MVE-NEXT: vcmpe.f16 s2, s6 +; CHECK-MVE-NEXT: vcmp.f16 s2, s6 ; CHECK-MVE-NEXT: vmov r1, s20 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: vmov.16 q4[3], r1 @@ -1473,7 +1473,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vseleq.f16 s20, s14, s10 ; CHECK-MVE-NEXT: vmov r1, s20 ; CHECK-MVE-NEXT: vmovx.f16 s20, s6 -; CHECK-MVE-NEXT: vcmpe.f16 s22, s20 +; CHECK-MVE-NEXT: vcmp.f16 s22, s20 ; CHECK-MVE-NEXT: vmov.16 q4[4], r1 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r1, #0 @@ -1485,7 +1485,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vmovx.f16 s22, s14 ; CHECK-MVE-NEXT: lsls r1, r1, #31 ; CHECK-MVE-NEXT: 
vseleq.f16 s20, s22, s20 -; CHECK-MVE-NEXT: vcmpe.f16 s3, s7 +; CHECK-MVE-NEXT: vcmp.f16 s3, s7 ; CHECK-MVE-NEXT: vmov r1, s20 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: vmov.16 q4[5], r1 @@ -1494,7 +1494,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: movls r1, #1 ; CHECK-MVE-NEXT: cmp r1, #0 ; CHECK-MVE-NEXT: cset r1, ne -; CHECK-MVE-NEXT: vcmpe.f16 s0, s4 +; CHECK-MVE-NEXT: vcmp.f16 s0, s4 ; CHECK-MVE-NEXT: lsls r1, r1, #31 ; CHECK-MVE-NEXT: vmovx.f16 s0, s11 ; CHECK-MVE-NEXT: vseleq.f16 s20, s15, s11 @@ -1796,13 +1796,13 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vpush {d8, d9, d10, d11} ; CHECK-MVE-NEXT: vmovx.f16 s16, s4 ; CHECK-MVE-NEXT: vmovx.f16 s18, s0 -; CHECK-MVE-NEXT: vcmpe.f16 s18, s16 +; CHECK-MVE-NEXT: vcmp.f16 s18, s16 ; CHECK-MVE-NEXT: movs r1, #0 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: it hi ; CHECK-MVE-NEXT: movhi r1, #1 ; CHECK-MVE-NEXT: cmp r1, #0 -; CHECK-MVE-NEXT: vcmpe.f16 s0, s4 +; CHECK-MVE-NEXT: vcmp.f16 s0, s4 ; CHECK-MVE-NEXT: cset r1, ne ; CHECK-MVE-NEXT: vmovx.f16 s16, s8 ; CHECK-MVE-NEXT: vmovx.f16 s18, s12 @@ -1815,7 +1815,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: movhi r2, #1 ; CHECK-MVE-NEXT: cmp r2, #0 ; CHECK-MVE-NEXT: cset r2, ne -; CHECK-MVE-NEXT: vcmpe.f16 s1, s5 +; CHECK-MVE-NEXT: vcmp.f16 s1, s5 ; CHECK-MVE-NEXT: lsls r2, r2, #31 ; CHECK-MVE-NEXT: vmovx.f16 s22, s1 ; CHECK-MVE-NEXT: vseleq.f16 s16, s12, s8 @@ -1835,7 +1835,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vseleq.f16 s20, s13, s9 ; CHECK-MVE-NEXT: vmov r1, s20 ; CHECK-MVE-NEXT: vmovx.f16 s20, s5 -; CHECK-MVE-NEXT: vcmpe.f16 s22, s20 +; CHECK-MVE-NEXT: vcmp.f16 s22, s20 ; CHECK-MVE-NEXT: vmov.16 q4[2], r1 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r1, #0 @@ -1847,7 +1847,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vmovx.f16 s22, s13 ; CHECK-MVE-NEXT: lsls r1, r1, #31 ; CHECK-MVE-NEXT: vseleq.f16 s20, s22, s20 -; 
CHECK-MVE-NEXT: vcmpe.f16 s2, s6 +; CHECK-MVE-NEXT: vcmp.f16 s2, s6 ; CHECK-MVE-NEXT: vmov r1, s20 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: vmov.16 q4[3], r1 @@ -1862,7 +1862,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vseleq.f16 s20, s14, s10 ; CHECK-MVE-NEXT: vmov r1, s20 ; CHECK-MVE-NEXT: vmovx.f16 s20, s6 -; CHECK-MVE-NEXT: vcmpe.f16 s22, s20 +; CHECK-MVE-NEXT: vcmp.f16 s22, s20 ; CHECK-MVE-NEXT: vmov.16 q4[4], r1 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r1, #0 @@ -1874,7 +1874,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vmovx.f16 s22, s14 ; CHECK-MVE-NEXT: lsls r1, r1, #31 ; CHECK-MVE-NEXT: vseleq.f16 s20, s22, s20 -; CHECK-MVE-NEXT: vcmpe.f16 s3, s7 +; CHECK-MVE-NEXT: vcmp.f16 s3, s7 ; CHECK-MVE-NEXT: vmov r1, s20 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: vmov.16 q4[5], r1 @@ -1883,7 +1883,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: movhi r1, #1 ; CHECK-MVE-NEXT: cmp r1, #0 ; CHECK-MVE-NEXT: cset r1, ne -; CHECK-MVE-NEXT: vcmpe.f16 s0, s4 +; CHECK-MVE-NEXT: vcmp.f16 s0, s4 ; CHECK-MVE-NEXT: lsls r1, r1, #31 ; CHECK-MVE-NEXT: vmovx.f16 s0, s11 ; CHECK-MVE-NEXT: vseleq.f16 s20, s15, s11 @@ -1921,13 +1921,13 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vpush {d8, d9, d10, d11} ; CHECK-MVE-NEXT: vmovx.f16 s16, s4 ; CHECK-MVE-NEXT: vmovx.f16 s18, s0 -; CHECK-MVE-NEXT: vcmpe.f16 s18, s16 +; CHECK-MVE-NEXT: vcmp.f16 s18, s16 ; CHECK-MVE-NEXT: movs r1, #0 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: it pl ; CHECK-MVE-NEXT: movpl r1, #1 ; CHECK-MVE-NEXT: cmp r1, #0 -; CHECK-MVE-NEXT: vcmpe.f16 s0, s4 +; CHECK-MVE-NEXT: vcmp.f16 s0, s4 ; CHECK-MVE-NEXT: cset r1, ne ; CHECK-MVE-NEXT: vmovx.f16 s16, s8 ; CHECK-MVE-NEXT: vmovx.f16 s18, s12 @@ -1940,7 +1940,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: movpl r2, #1 ; CHECK-MVE-NEXT: cmp r2, #0 ; CHECK-MVE-NEXT: cset r2, ne -; CHECK-MVE-NEXT: vcmpe.f16 s1, s5 +; 
CHECK-MVE-NEXT: vcmp.f16 s1, s5 ; CHECK-MVE-NEXT: lsls r2, r2, #31 ; CHECK-MVE-NEXT: vmovx.f16 s22, s1 ; CHECK-MVE-NEXT: vseleq.f16 s16, s12, s8 @@ -1960,7 +1960,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vseleq.f16 s20, s13, s9 ; CHECK-MVE-NEXT: vmov r1, s20 ; CHECK-MVE-NEXT: vmovx.f16 s20, s5 -; CHECK-MVE-NEXT: vcmpe.f16 s22, s20 +; CHECK-MVE-NEXT: vcmp.f16 s22, s20 ; CHECK-MVE-NEXT: vmov.16 q4[2], r1 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r1, #0 @@ -1972,7 +1972,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vmovx.f16 s22, s13 ; CHECK-MVE-NEXT: lsls r1, r1, #31 ; CHECK-MVE-NEXT: vseleq.f16 s20, s22, s20 -; CHECK-MVE-NEXT: vcmpe.f16 s2, s6 +; CHECK-MVE-NEXT: vcmp.f16 s2, s6 ; CHECK-MVE-NEXT: vmov r1, s20 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: vmov.16 q4[3], r1 @@ -1987,7 +1987,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vseleq.f16 s20, s14, s10 ; CHECK-MVE-NEXT: vmov r1, s20 ; CHECK-MVE-NEXT: vmovx.f16 s20, s6 -; CHECK-MVE-NEXT: vcmpe.f16 s22, s20 +; CHECK-MVE-NEXT: vcmp.f16 s22, s20 ; CHECK-MVE-NEXT: vmov.16 q4[4], r1 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r1, #0 @@ -1999,7 +1999,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vmovx.f16 s22, s14 ; CHECK-MVE-NEXT: lsls r1, r1, #31 ; CHECK-MVE-NEXT: vseleq.f16 s20, s22, s20 -; CHECK-MVE-NEXT: vcmpe.f16 s3, s7 +; CHECK-MVE-NEXT: vcmp.f16 s3, s7 ; CHECK-MVE-NEXT: vmov r1, s20 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: vmov.16 q4[5], r1 @@ -2008,7 +2008,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: movpl r1, #1 ; CHECK-MVE-NEXT: cmp r1, #0 ; CHECK-MVE-NEXT: cset r1, ne -; CHECK-MVE-NEXT: vcmpe.f16 s0, s4 +; CHECK-MVE-NEXT: vcmp.f16 s0, s4 ; CHECK-MVE-NEXT: lsls r1, r1, #31 ; CHECK-MVE-NEXT: vmovx.f16 s0, s11 ; CHECK-MVE-NEXT: vseleq.f16 s20, s15, s11 @@ -2046,13 +2046,13 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vpush {d8, 
d9, d10, d11} ; CHECK-MVE-NEXT: vmovx.f16 s16, s4 ; CHECK-MVE-NEXT: vmovx.f16 s18, s0 -; CHECK-MVE-NEXT: vcmpe.f16 s18, s16 +; CHECK-MVE-NEXT: vcmp.f16 s18, s16 ; CHECK-MVE-NEXT: movs r1, #0 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: it lt ; CHECK-MVE-NEXT: movlt r1, #1 ; CHECK-MVE-NEXT: cmp r1, #0 -; CHECK-MVE-NEXT: vcmpe.f16 s0, s4 +; CHECK-MVE-NEXT: vcmp.f16 s0, s4 ; CHECK-MVE-NEXT: cset r1, ne ; CHECK-MVE-NEXT: vmovx.f16 s16, s8 ; CHECK-MVE-NEXT: vmovx.f16 s18, s12 @@ -2065,7 +2065,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: movlt r2, #1 ; CHECK-MVE-NEXT: cmp r2, #0 ; CHECK-MVE-NEXT: cset r2, ne -; CHECK-MVE-NEXT: vcmpe.f16 s1, s5 +; CHECK-MVE-NEXT: vcmp.f16 s1, s5 ; CHECK-MVE-NEXT: lsls r2, r2, #31 ; CHECK-MVE-NEXT: vmovx.f16 s22, s1 ; CHECK-MVE-NEXT: vseleq.f16 s16, s12, s8 @@ -2085,7 +2085,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vseleq.f16 s20, s13, s9 ; CHECK-MVE-NEXT: vmov r1, s20 ; CHECK-MVE-NEXT: vmovx.f16 s20, s5 -; CHECK-MVE-NEXT: vcmpe.f16 s22, s20 +; CHECK-MVE-NEXT: vcmp.f16 s22, s20 ; CHECK-MVE-NEXT: vmov.16 q4[2], r1 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r1, #0 @@ -2097,7 +2097,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vmovx.f16 s22, s13 ; CHECK-MVE-NEXT: lsls r1, r1, #31 ; CHECK-MVE-NEXT: vseleq.f16 s20, s22, s20 -; CHECK-MVE-NEXT: vcmpe.f16 s2, s6 +; CHECK-MVE-NEXT: vcmp.f16 s2, s6 ; CHECK-MVE-NEXT: vmov r1, s20 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: vmov.16 q4[3], r1 @@ -2112,7 +2112,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vseleq.f16 s20, s14, s10 ; CHECK-MVE-NEXT: vmov r1, s20 ; CHECK-MVE-NEXT: vmovx.f16 s20, s6 -; CHECK-MVE-NEXT: vcmpe.f16 s22, s20 +; CHECK-MVE-NEXT: vcmp.f16 s22, s20 ; CHECK-MVE-NEXT: vmov.16 q4[4], r1 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r1, #0 @@ -2124,7 +2124,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vmovx.f16 s22, s14 ; 
CHECK-MVE-NEXT: lsls r1, r1, #31 ; CHECK-MVE-NEXT: vseleq.f16 s20, s22, s20 -; CHECK-MVE-NEXT: vcmpe.f16 s3, s7 +; CHECK-MVE-NEXT: vcmp.f16 s3, s7 ; CHECK-MVE-NEXT: vmov r1, s20 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: vmov.16 q4[5], r1 @@ -2133,7 +2133,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: movlt r1, #1 ; CHECK-MVE-NEXT: cmp r1, #0 ; CHECK-MVE-NEXT: cset r1, ne -; CHECK-MVE-NEXT: vcmpe.f16 s0, s4 +; CHECK-MVE-NEXT: vcmp.f16 s0, s4 ; CHECK-MVE-NEXT: lsls r1, r1, #31 ; CHECK-MVE-NEXT: vmovx.f16 s0, s11 ; CHECK-MVE-NEXT: vseleq.f16 s20, s15, s11 @@ -2171,13 +2171,13 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vpush {d8, d9, d10, d11} ; CHECK-MVE-NEXT: vmovx.f16 s16, s4 ; CHECK-MVE-NEXT: vmovx.f16 s18, s0 -; CHECK-MVE-NEXT: vcmpe.f16 s18, s16 +; CHECK-MVE-NEXT: vcmp.f16 s18, s16 ; CHECK-MVE-NEXT: movs r1, #0 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: it le ; CHECK-MVE-NEXT: movle r1, #1 ; CHECK-MVE-NEXT: cmp r1, #0 -; CHECK-MVE-NEXT: vcmpe.f16 s0, s4 +; CHECK-MVE-NEXT: vcmp.f16 s0, s4 ; CHECK-MVE-NEXT: cset r1, ne ; CHECK-MVE-NEXT: vmovx.f16 s16, s8 ; CHECK-MVE-NEXT: vmovx.f16 s18, s12 @@ -2190,7 +2190,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: movle r2, #1 ; CHECK-MVE-NEXT: cmp r2, #0 ; CHECK-MVE-NEXT: cset r2, ne -; CHECK-MVE-NEXT: vcmpe.f16 s1, s5 +; CHECK-MVE-NEXT: vcmp.f16 s1, s5 ; CHECK-MVE-NEXT: lsls r2, r2, #31 ; CHECK-MVE-NEXT: vmovx.f16 s22, s1 ; CHECK-MVE-NEXT: vseleq.f16 s16, s12, s8 @@ -2210,7 +2210,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vseleq.f16 s20, s13, s9 ; CHECK-MVE-NEXT: vmov r1, s20 ; CHECK-MVE-NEXT: vmovx.f16 s20, s5 -; CHECK-MVE-NEXT: vcmpe.f16 s22, s20 +; CHECK-MVE-NEXT: vcmp.f16 s22, s20 ; CHECK-MVE-NEXT: vmov.16 q4[2], r1 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r1, #0 @@ -2222,7 +2222,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vmovx.f16 s22, s13 ; CHECK-MVE-NEXT: lsls r1, r1, 
#31 ; CHECK-MVE-NEXT: vseleq.f16 s20, s22, s20 -; CHECK-MVE-NEXT: vcmpe.f16 s2, s6 +; CHECK-MVE-NEXT: vcmp.f16 s2, s6 ; CHECK-MVE-NEXT: vmov r1, s20 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: vmov.16 q4[3], r1 @@ -2237,7 +2237,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vseleq.f16 s20, s14, s10 ; CHECK-MVE-NEXT: vmov r1, s20 ; CHECK-MVE-NEXT: vmovx.f16 s20, s6 -; CHECK-MVE-NEXT: vcmpe.f16 s22, s20 +; CHECK-MVE-NEXT: vcmp.f16 s22, s20 ; CHECK-MVE-NEXT: vmov.16 q4[4], r1 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r1, #0 @@ -2249,7 +2249,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vmovx.f16 s22, s14 ; CHECK-MVE-NEXT: lsls r1, r1, #31 ; CHECK-MVE-NEXT: vseleq.f16 s20, s22, s20 -; CHECK-MVE-NEXT: vcmpe.f16 s3, s7 +; CHECK-MVE-NEXT: vcmp.f16 s3, s7 ; CHECK-MVE-NEXT: vmov r1, s20 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: vmov.16 q4[5], r1 @@ -2258,7 +2258,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: movle r1, #1 ; CHECK-MVE-NEXT: cmp r1, #0 ; CHECK-MVE-NEXT: cset r1, ne -; CHECK-MVE-NEXT: vcmpe.f16 s0, s4 +; CHECK-MVE-NEXT: vcmp.f16 s0, s4 ; CHECK-MVE-NEXT: lsls r1, r1, #31 ; CHECK-MVE-NEXT: vmovx.f16 s0, s11 ; CHECK-MVE-NEXT: vseleq.f16 s20, s15, s11 @@ -2296,13 +2296,13 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vpush {d8, d9, d10, d11} ; CHECK-MVE-NEXT: vmovx.f16 s16, s4 ; CHECK-MVE-NEXT: vmovx.f16 s18, s0 -; CHECK-MVE-NEXT: vcmpe.f16 s18, s16 +; CHECK-MVE-NEXT: vcmp.f16 s18, s16 ; CHECK-MVE-NEXT: movs r1, #0 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: it vc ; CHECK-MVE-NEXT: movvc r1, #1 ; CHECK-MVE-NEXT: cmp r1, #0 -; CHECK-MVE-NEXT: vcmpe.f16 s0, s4 +; CHECK-MVE-NEXT: vcmp.f16 s0, s4 ; CHECK-MVE-NEXT: cset r1, ne ; CHECK-MVE-NEXT: vmovx.f16 s16, s8 ; CHECK-MVE-NEXT: vmovx.f16 s18, s12 @@ -2315,7 +2315,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: movvc r2, #1 ; CHECK-MVE-NEXT: cmp r2, #0 ; CHECK-MVE-NEXT: 
cset r2, ne -; CHECK-MVE-NEXT: vcmpe.f16 s1, s5 +; CHECK-MVE-NEXT: vcmp.f16 s1, s5 ; CHECK-MVE-NEXT: lsls r2, r2, #31 ; CHECK-MVE-NEXT: vmovx.f16 s22, s1 ; CHECK-MVE-NEXT: vseleq.f16 s16, s12, s8 @@ -2335,7 +2335,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vseleq.f16 s20, s13, s9 ; CHECK-MVE-NEXT: vmov r1, s20 ; CHECK-MVE-NEXT: vmovx.f16 s20, s5 -; CHECK-MVE-NEXT: vcmpe.f16 s22, s20 +; CHECK-MVE-NEXT: vcmp.f16 s22, s20 ; CHECK-MVE-NEXT: vmov.16 q4[2], r1 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r1, #0 @@ -2347,7 +2347,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vmovx.f16 s22, s13 ; CHECK-MVE-NEXT: lsls r1, r1, #31 ; CHECK-MVE-NEXT: vseleq.f16 s20, s22, s20 -; CHECK-MVE-NEXT: vcmpe.f16 s2, s6 +; CHECK-MVE-NEXT: vcmp.f16 s2, s6 ; CHECK-MVE-NEXT: vmov r1, s20 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: vmov.16 q4[3], r1 @@ -2362,7 +2362,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vseleq.f16 s20, s14, s10 ; CHECK-MVE-NEXT: vmov r1, s20 ; CHECK-MVE-NEXT: vmovx.f16 s20, s6 -; CHECK-MVE-NEXT: vcmpe.f16 s22, s20 +; CHECK-MVE-NEXT: vcmp.f16 s22, s20 ; CHECK-MVE-NEXT: vmov.16 q4[4], r1 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r1, #0 @@ -2374,7 +2374,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vmovx.f16 s22, s14 ; CHECK-MVE-NEXT: lsls r1, r1, #31 ; CHECK-MVE-NEXT: vseleq.f16 s20, s22, s20 -; CHECK-MVE-NEXT: vcmpe.f16 s3, s7 +; CHECK-MVE-NEXT: vcmp.f16 s3, s7 ; CHECK-MVE-NEXT: vmov r1, s20 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: vmov.16 q4[5], r1 @@ -2383,7 +2383,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: movvc r1, #1 ; CHECK-MVE-NEXT: cmp r1, #0 ; CHECK-MVE-NEXT: cset r1, ne -; CHECK-MVE-NEXT: vcmpe.f16 s0, s4 +; CHECK-MVE-NEXT: vcmp.f16 s0, s4 ; CHECK-MVE-NEXT: lsls r1, r1, #31 ; CHECK-MVE-NEXT: vmovx.f16 s0, s11 ; CHECK-MVE-NEXT: vseleq.f16 s20, s15, s11 @@ -2422,13 +2422,13 @@ define 
arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vpush {d8, d9, d10, d11} ; CHECK-MVE-NEXT: vmovx.f16 s16, s4 ; CHECK-MVE-NEXT: vmovx.f16 s18, s0 -; CHECK-MVE-NEXT: vcmpe.f16 s18, s16 +; CHECK-MVE-NEXT: vcmp.f16 s18, s16 ; CHECK-MVE-NEXT: movs r1, #0 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: it vs ; CHECK-MVE-NEXT: movvs r1, #1 ; CHECK-MVE-NEXT: cmp r1, #0 -; CHECK-MVE-NEXT: vcmpe.f16 s0, s4 +; CHECK-MVE-NEXT: vcmp.f16 s0, s4 ; CHECK-MVE-NEXT: cset r1, ne ; CHECK-MVE-NEXT: vmovx.f16 s16, s8 ; CHECK-MVE-NEXT: vmovx.f16 s18, s12 @@ -2441,7 +2441,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: movvs r2, #1 ; CHECK-MVE-NEXT: cmp r2, #0 ; CHECK-MVE-NEXT: cset r2, ne -; CHECK-MVE-NEXT: vcmpe.f16 s1, s5 +; CHECK-MVE-NEXT: vcmp.f16 s1, s5 ; CHECK-MVE-NEXT: lsls r2, r2, #31 ; CHECK-MVE-NEXT: vmovx.f16 s22, s1 ; CHECK-MVE-NEXT: vseleq.f16 s16, s12, s8 @@ -2461,7 +2461,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vseleq.f16 s20, s13, s9 ; CHECK-MVE-NEXT: vmov r1, s20 ; CHECK-MVE-NEXT: vmovx.f16 s20, s5 -; CHECK-MVE-NEXT: vcmpe.f16 s22, s20 +; CHECK-MVE-NEXT: vcmp.f16 s22, s20 ; CHECK-MVE-NEXT: vmov.16 q4[2], r1 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r1, #0 @@ -2473,7 +2473,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vmovx.f16 s22, s13 ; CHECK-MVE-NEXT: lsls r1, r1, #31 ; CHECK-MVE-NEXT: vseleq.f16 s20, s22, s20 -; CHECK-MVE-NEXT: vcmpe.f16 s2, s6 +; CHECK-MVE-NEXT: vcmp.f16 s2, s6 ; CHECK-MVE-NEXT: vmov r1, s20 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: vmov.16 q4[3], r1 @@ -2488,7 +2488,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vseleq.f16 s20, s14, s10 ; CHECK-MVE-NEXT: vmov r1, s20 ; CHECK-MVE-NEXT: vmovx.f16 s20, s6 -; CHECK-MVE-NEXT: vcmpe.f16 s22, s20 +; CHECK-MVE-NEXT: vcmp.f16 s22, s20 ; CHECK-MVE-NEXT: vmov.16 q4[4], r1 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r1, #0 @@ -2500,7 +2500,7 @@ define arm_aapcs_vfpcc 
<8 x half> @vcmp_ ; CHECK-MVE-NEXT: vmovx.f16 s22, s14 ; CHECK-MVE-NEXT: lsls r1, r1, #31 ; CHECK-MVE-NEXT: vseleq.f16 s20, s22, s20 -; CHECK-MVE-NEXT: vcmpe.f16 s3, s7 +; CHECK-MVE-NEXT: vcmp.f16 s3, s7 ; CHECK-MVE-NEXT: vmov r1, s20 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: vmov.16 q4[5], r1 @@ -2509,7 +2509,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: movvs r1, #1 ; CHECK-MVE-NEXT: cmp r1, #0 ; CHECK-MVE-NEXT: cset r1, ne -; CHECK-MVE-NEXT: vcmpe.f16 s0, s4 +; CHECK-MVE-NEXT: vcmp.f16 s0, s4 ; CHECK-MVE-NEXT: lsls r1, r1, #31 ; CHECK-MVE-NEXT: vmovx.f16 s0, s11 ; CHECK-MVE-NEXT: vseleq.f16 s20, s15, s11 Modified: llvm/trunk/test/CodeGen/Thumb2/mve-vcmpfr.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/Thumb2/mve-vcmpfr.ll?rev=374025&r1=374024&r2=374025&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/Thumb2/mve-vcmpfr.ll (original) +++ llvm/trunk/test/CodeGen/Thumb2/mve-vcmpfr.ll Tue Oct 8 01:25:42 2019 @@ -128,24 +128,24 @@ entry: define arm_aapcs_vfpcc <4 x float> @vcmp_ogt_v4f32(<4 x float> %src, float %src2, <4 x float> %a, <4 x float> %b) { ; CHECK-MVE-LABEL: vcmp_ogt_v4f32: ; CHECK-MVE: @ %bb.0: @ %entry -; CHECK-MVE-NEXT: vcmpe.f32 s0, s4 +; CHECK-MVE-NEXT: vcmp.f32 s0, s4 ; CHECK-MVE-NEXT: movs r1, #0 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: it gt ; CHECK-MVE-NEXT: movgt r1, #1 ; CHECK-MVE-NEXT: cmp r1, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s1, s4 +; CHECK-MVE-NEXT: vcmp.f32 s1, s4 ; CHECK-MVE-NEXT: cset r1, ne ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r2, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s2, s4 +; CHECK-MVE-NEXT: vcmp.f32 s2, s4 ; CHECK-MVE-NEXT: it gt ; CHECK-MVE-NEXT: movgt r2, #1 ; CHECK-MVE-NEXT: cmp r2, #0 ; CHECK-MVE-NEXT: cset r2, ne ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r3, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s3, s4 +; CHECK-MVE-NEXT: vcmp.f32 s3, s4 ; CHECK-MVE-NEXT: it 
gt ; CHECK-MVE-NEXT: movgt r3, #1 ; CHECK-MVE-NEXT: cmp r3, #0 @@ -183,24 +183,24 @@ entry: define arm_aapcs_vfpcc <4 x float> @vcmp_oge_v4f32(<4 x float> %src, float %src2, <4 x float> %a, <4 x float> %b) { ; CHECK-MVE-LABEL: vcmp_oge_v4f32: ; CHECK-MVE: @ %bb.0: @ %entry -; CHECK-MVE-NEXT: vcmpe.f32 s0, s4 +; CHECK-MVE-NEXT: vcmp.f32 s0, s4 ; CHECK-MVE-NEXT: movs r1, #0 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: it ge ; CHECK-MVE-NEXT: movge r1, #1 ; CHECK-MVE-NEXT: cmp r1, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s1, s4 +; CHECK-MVE-NEXT: vcmp.f32 s1, s4 ; CHECK-MVE-NEXT: cset r1, ne ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r2, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s2, s4 +; CHECK-MVE-NEXT: vcmp.f32 s2, s4 ; CHECK-MVE-NEXT: it ge ; CHECK-MVE-NEXT: movge r2, #1 ; CHECK-MVE-NEXT: cmp r2, #0 ; CHECK-MVE-NEXT: cset r2, ne ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r3, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s3, s4 +; CHECK-MVE-NEXT: vcmp.f32 s3, s4 ; CHECK-MVE-NEXT: it ge ; CHECK-MVE-NEXT: movge r3, #1 ; CHECK-MVE-NEXT: cmp r3, #0 @@ -238,24 +238,24 @@ entry: define arm_aapcs_vfpcc <4 x float> @vcmp_olt_v4f32(<4 x float> %src, float %src2, <4 x float> %a, <4 x float> %b) { ; CHECK-MVE-LABEL: vcmp_olt_v4f32: ; CHECK-MVE: @ %bb.0: @ %entry -; CHECK-MVE-NEXT: vcmpe.f32 s0, s4 +; CHECK-MVE-NEXT: vcmp.f32 s0, s4 ; CHECK-MVE-NEXT: movs r1, #0 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: it mi ; CHECK-MVE-NEXT: movmi r1, #1 ; CHECK-MVE-NEXT: cmp r1, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s1, s4 +; CHECK-MVE-NEXT: vcmp.f32 s1, s4 ; CHECK-MVE-NEXT: cset r1, ne ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r2, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s2, s4 +; CHECK-MVE-NEXT: vcmp.f32 s2, s4 ; CHECK-MVE-NEXT: it mi ; CHECK-MVE-NEXT: movmi r2, #1 ; CHECK-MVE-NEXT: cmp r2, #0 ; CHECK-MVE-NEXT: cset r2, ne ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r3, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s3, s4 +; 
CHECK-MVE-NEXT: vcmp.f32 s3, s4 ; CHECK-MVE-NEXT: it mi ; CHECK-MVE-NEXT: movmi r3, #1 ; CHECK-MVE-NEXT: cmp r3, #0 @@ -294,24 +294,24 @@ entry: define arm_aapcs_vfpcc <4 x float> @vcmp_ole_v4f32(<4 x float> %src, float %src2, <4 x float> %a, <4 x float> %b) { ; CHECK-MVE-LABEL: vcmp_ole_v4f32: ; CHECK-MVE: @ %bb.0: @ %entry -; CHECK-MVE-NEXT: vcmpe.f32 s0, s4 +; CHECK-MVE-NEXT: vcmp.f32 s0, s4 ; CHECK-MVE-NEXT: movs r1, #0 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: it ls ; CHECK-MVE-NEXT: movls r1, #1 ; CHECK-MVE-NEXT: cmp r1, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s1, s4 +; CHECK-MVE-NEXT: vcmp.f32 s1, s4 ; CHECK-MVE-NEXT: cset r1, ne ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r2, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s2, s4 +; CHECK-MVE-NEXT: vcmp.f32 s2, s4 ; CHECK-MVE-NEXT: it ls ; CHECK-MVE-NEXT: movls r2, #1 ; CHECK-MVE-NEXT: cmp r2, #0 ; CHECK-MVE-NEXT: cset r2, ne ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r3, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s3, s4 +; CHECK-MVE-NEXT: vcmp.f32 s3, s4 ; CHECK-MVE-NEXT: it ls ; CHECK-MVE-NEXT: movls r3, #1 ; CHECK-MVE-NEXT: cmp r3, #0 @@ -472,24 +472,24 @@ entry: define arm_aapcs_vfpcc <4 x float> @vcmp_ugt_v4f32(<4 x float> %src, float %src2, <4 x float> %a, <4 x float> %b) { ; CHECK-MVE-LABEL: vcmp_ugt_v4f32: ; CHECK-MVE: @ %bb.0: @ %entry -; CHECK-MVE-NEXT: vcmpe.f32 s0, s4 +; CHECK-MVE-NEXT: vcmp.f32 s0, s4 ; CHECK-MVE-NEXT: movs r1, #0 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: it hi ; CHECK-MVE-NEXT: movhi r1, #1 ; CHECK-MVE-NEXT: cmp r1, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s1, s4 +; CHECK-MVE-NEXT: vcmp.f32 s1, s4 ; CHECK-MVE-NEXT: cset r1, ne ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r2, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s2, s4 +; CHECK-MVE-NEXT: vcmp.f32 s2, s4 ; CHECK-MVE-NEXT: it hi ; CHECK-MVE-NEXT: movhi r2, #1 ; CHECK-MVE-NEXT: cmp r2, #0 ; CHECK-MVE-NEXT: cset r2, ne ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r3, #0 
-; CHECK-MVE-NEXT: vcmpe.f32 s3, s4 +; CHECK-MVE-NEXT: vcmp.f32 s3, s4 ; CHECK-MVE-NEXT: it hi ; CHECK-MVE-NEXT: movhi r3, #1 ; CHECK-MVE-NEXT: cmp r3, #0 @@ -529,24 +529,24 @@ entry: define arm_aapcs_vfpcc <4 x float> @vcmp_uge_v4f32(<4 x float> %src, float %src2, <4 x float> %a, <4 x float> %b) { ; CHECK-MVE-LABEL: vcmp_uge_v4f32: ; CHECK-MVE: @ %bb.0: @ %entry -; CHECK-MVE-NEXT: vcmpe.f32 s0, s4 +; CHECK-MVE-NEXT: vcmp.f32 s0, s4 ; CHECK-MVE-NEXT: movs r1, #0 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: it pl ; CHECK-MVE-NEXT: movpl r1, #1 ; CHECK-MVE-NEXT: cmp r1, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s1, s4 +; CHECK-MVE-NEXT: vcmp.f32 s1, s4 ; CHECK-MVE-NEXT: cset r1, ne ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r2, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s2, s4 +; CHECK-MVE-NEXT: vcmp.f32 s2, s4 ; CHECK-MVE-NEXT: it pl ; CHECK-MVE-NEXT: movpl r2, #1 ; CHECK-MVE-NEXT: cmp r2, #0 ; CHECK-MVE-NEXT: cset r2, ne ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r3, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s3, s4 +; CHECK-MVE-NEXT: vcmp.f32 s3, s4 ; CHECK-MVE-NEXT: it pl ; CHECK-MVE-NEXT: movpl r3, #1 ; CHECK-MVE-NEXT: cmp r3, #0 @@ -586,24 +586,24 @@ entry: define arm_aapcs_vfpcc <4 x float> @vcmp_ult_v4f32(<4 x float> %src, float %src2, <4 x float> %a, <4 x float> %b) { ; CHECK-MVE-LABEL: vcmp_ult_v4f32: ; CHECK-MVE: @ %bb.0: @ %entry -; CHECK-MVE-NEXT: vcmpe.f32 s0, s4 +; CHECK-MVE-NEXT: vcmp.f32 s0, s4 ; CHECK-MVE-NEXT: movs r1, #0 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: it lt ; CHECK-MVE-NEXT: movlt r1, #1 ; CHECK-MVE-NEXT: cmp r1, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s1, s4 +; CHECK-MVE-NEXT: vcmp.f32 s1, s4 ; CHECK-MVE-NEXT: cset r1, ne ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r2, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s2, s4 +; CHECK-MVE-NEXT: vcmp.f32 s2, s4 ; CHECK-MVE-NEXT: it lt ; CHECK-MVE-NEXT: movlt r2, #1 ; CHECK-MVE-NEXT: cmp r2, #0 ; CHECK-MVE-NEXT: cset r2, ne ; CHECK-MVE-NEXT: vmrs 
APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r3, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s3, s4 +; CHECK-MVE-NEXT: vcmp.f32 s3, s4 ; CHECK-MVE-NEXT: it lt ; CHECK-MVE-NEXT: movlt r3, #1 ; CHECK-MVE-NEXT: cmp r3, #0 @@ -642,24 +642,24 @@ entry: define arm_aapcs_vfpcc <4 x float> @vcmp_ule_v4f32(<4 x float> %src, float %src2, <4 x float> %a, <4 x float> %b) { ; CHECK-MVE-LABEL: vcmp_ule_v4f32: ; CHECK-MVE: @ %bb.0: @ %entry -; CHECK-MVE-NEXT: vcmpe.f32 s0, s4 +; CHECK-MVE-NEXT: vcmp.f32 s0, s4 ; CHECK-MVE-NEXT: movs r1, #0 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: it le ; CHECK-MVE-NEXT: movle r1, #1 ; CHECK-MVE-NEXT: cmp r1, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s1, s4 +; CHECK-MVE-NEXT: vcmp.f32 s1, s4 ; CHECK-MVE-NEXT: cset r1, ne ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r2, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s2, s4 +; CHECK-MVE-NEXT: vcmp.f32 s2, s4 ; CHECK-MVE-NEXT: it le ; CHECK-MVE-NEXT: movle r2, #1 ; CHECK-MVE-NEXT: cmp r2, #0 ; CHECK-MVE-NEXT: cset r2, ne ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r3, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s3, s4 +; CHECK-MVE-NEXT: vcmp.f32 s3, s4 ; CHECK-MVE-NEXT: it le ; CHECK-MVE-NEXT: movle r3, #1 ; CHECK-MVE-NEXT: cmp r3, #0 @@ -698,24 +698,24 @@ entry: define arm_aapcs_vfpcc <4 x float> @vcmp_ord_v4f32(<4 x float> %src, float %src2, <4 x float> %a, <4 x float> %b) { ; CHECK-MVE-LABEL: vcmp_ord_v4f32: ; CHECK-MVE: @ %bb.0: @ %entry -; CHECK-MVE-NEXT: vcmpe.f32 s0, s4 +; CHECK-MVE-NEXT: vcmp.f32 s0, s4 ; CHECK-MVE-NEXT: movs r1, #0 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: it vc ; CHECK-MVE-NEXT: movvc r1, #1 ; CHECK-MVE-NEXT: cmp r1, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s1, s4 +; CHECK-MVE-NEXT: vcmp.f32 s1, s4 ; CHECK-MVE-NEXT: cset r1, ne ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r2, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s2, s4 +; CHECK-MVE-NEXT: vcmp.f32 s2, s4 ; CHECK-MVE-NEXT: it vc ; CHECK-MVE-NEXT: movvc r2, #1 ; CHECK-MVE-NEXT: cmp r2, #0 ; 
CHECK-MVE-NEXT: cset r2, ne ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r3, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s3, s4 +; CHECK-MVE-NEXT: vcmp.f32 s3, s4 ; CHECK-MVE-NEXT: it vc ; CHECK-MVE-NEXT: movvc r3, #1 ; CHECK-MVE-NEXT: cmp r3, #0 @@ -756,24 +756,24 @@ entry: define arm_aapcs_vfpcc <4 x float> @vcmp_uno_v4f32(<4 x float> %src, float %src2, <4 x float> %a, <4 x float> %b) { ; CHECK-MVE-LABEL: vcmp_uno_v4f32: ; CHECK-MVE: @ %bb.0: @ %entry -; CHECK-MVE-NEXT: vcmpe.f32 s0, s4 +; CHECK-MVE-NEXT: vcmp.f32 s0, s4 ; CHECK-MVE-NEXT: movs r1, #0 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: it vs ; CHECK-MVE-NEXT: movvs r1, #1 ; CHECK-MVE-NEXT: cmp r1, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s1, s4 +; CHECK-MVE-NEXT: vcmp.f32 s1, s4 ; CHECK-MVE-NEXT: cset r1, ne ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r2, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s2, s4 +; CHECK-MVE-NEXT: vcmp.f32 s2, s4 ; CHECK-MVE-NEXT: it vs ; CHECK-MVE-NEXT: movvs r2, #1 ; CHECK-MVE-NEXT: cmp r2, #0 ; CHECK-MVE-NEXT: cset r2, ne ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r3, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s3, s4 +; CHECK-MVE-NEXT: vcmp.f32 s3, s4 ; CHECK-MVE-NEXT: it vs ; CHECK-MVE-NEXT: movvs r3, #1 ; CHECK-MVE-NEXT: cmp r3, #0 @@ -1092,13 +1092,13 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vmovx.f16 s12, s0 ; CHECK-MVE-NEXT: movs r0, #0 ; CHECK-MVE-NEXT: vmovx.f16 s14, s8 -; CHECK-MVE-NEXT: vcmpe.f16 s12, s16 +; CHECK-MVE-NEXT: vcmp.f16 s12, s16 ; CHECK-MVE-NEXT: vmovx.f16 s12, s4 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: it gt ; CHECK-MVE-NEXT: movgt r0, #1 ; CHECK-MVE-NEXT: cmp r0, #0 -; CHECK-MVE-NEXT: vcmpe.f16 s0, s16 +; CHECK-MVE-NEXT: vcmp.f16 s0, s16 ; CHECK-MVE-NEXT: cset r0, ne ; CHECK-MVE-NEXT: movs r2, #0 ; CHECK-MVE-NEXT: lsls r0, r0, #31 @@ -1111,7 +1111,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: cset r2, ne ; CHECK-MVE-NEXT: vmov r0, s12 ; CHECK-MVE-NEXT: lsls r2, r2, 
#31 -; CHECK-MVE-NEXT: vcmpe.f16 s1, s16 +; CHECK-MVE-NEXT: vcmp.f16 s1, s16 ; CHECK-MVE-NEXT: vseleq.f16 s12, s8, s4 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: vmov r2, s12 @@ -1128,7 +1128,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vseleq.f16 s18, s9, s5 ; CHECK-MVE-NEXT: vmov r0, s18 ; CHECK-MVE-NEXT: vmovx.f16 s18, s1 -; CHECK-MVE-NEXT: vcmpe.f16 s18, s16 +; CHECK-MVE-NEXT: vcmp.f16 s18, s16 ; CHECK-MVE-NEXT: vmov.16 q3[2], r0 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r0, #0 @@ -1138,7 +1138,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: cset r0, ne ; CHECK-MVE-NEXT: vmovx.f16 s18, s5 ; CHECK-MVE-NEXT: lsls r0, r0, #31 -; CHECK-MVE-NEXT: vcmpe.f16 s2, s16 +; CHECK-MVE-NEXT: vcmp.f16 s2, s16 ; CHECK-MVE-NEXT: vseleq.f16 s18, s20, s18 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: vmov r0, s18 @@ -1153,7 +1153,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vseleq.f16 s18, s10, s6 ; CHECK-MVE-NEXT: vmov r0, s18 ; CHECK-MVE-NEXT: vmovx.f16 s18, s2 -; CHECK-MVE-NEXT: vcmpe.f16 s18, s16 +; CHECK-MVE-NEXT: vcmp.f16 s18, s16 ; CHECK-MVE-NEXT: vmov.16 q3[4], r0 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r0, #0 @@ -1163,11 +1163,11 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: cset r0, ne ; CHECK-MVE-NEXT: vmovx.f16 s18, s6 ; CHECK-MVE-NEXT: lsls r0, r0, #31 -; CHECK-MVE-NEXT: vcmpe.f16 s3, s16 +; CHECK-MVE-NEXT: vcmp.f16 s3, s16 ; CHECK-MVE-NEXT: vseleq.f16 s18, s20, s18 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: vmov r0, s18 -; CHECK-MVE-NEXT: vcmpe.f16 s0, s16 +; CHECK-MVE-NEXT: vcmp.f16 s0, s16 ; CHECK-MVE-NEXT: vmov.16 q3[5], r0 ; CHECK-MVE-NEXT: mov.w r0, #0 ; CHECK-MVE-NEXT: it gt @@ -1218,13 +1218,13 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vmovx.f16 s12, s0 ; CHECK-MVE-NEXT: movs r0, #0 ; CHECK-MVE-NEXT: vmovx.f16 s14, s8 -; CHECK-MVE-NEXT: vcmpe.f16 s12, s16 +; CHECK-MVE-NEXT: vcmp.f16 
s12, s16 ; CHECK-MVE-NEXT: vmovx.f16 s12, s4 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: it ge ; CHECK-MVE-NEXT: movge r0, #1 ; CHECK-MVE-NEXT: cmp r0, #0 -; CHECK-MVE-NEXT: vcmpe.f16 s0, s16 +; CHECK-MVE-NEXT: vcmp.f16 s0, s16 ; CHECK-MVE-NEXT: cset r0, ne ; CHECK-MVE-NEXT: movs r2, #0 ; CHECK-MVE-NEXT: lsls r0, r0, #31 @@ -1237,7 +1237,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: cset r2, ne ; CHECK-MVE-NEXT: vmov r0, s12 ; CHECK-MVE-NEXT: lsls r2, r2, #31 -; CHECK-MVE-NEXT: vcmpe.f16 s1, s16 +; CHECK-MVE-NEXT: vcmp.f16 s1, s16 ; CHECK-MVE-NEXT: vseleq.f16 s12, s8, s4 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: vmov r2, s12 @@ -1254,7 +1254,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vseleq.f16 s18, s9, s5 ; CHECK-MVE-NEXT: vmov r0, s18 ; CHECK-MVE-NEXT: vmovx.f16 s18, s1 -; CHECK-MVE-NEXT: vcmpe.f16 s18, s16 +; CHECK-MVE-NEXT: vcmp.f16 s18, s16 ; CHECK-MVE-NEXT: vmov.16 q3[2], r0 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r0, #0 @@ -1264,7 +1264,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: cset r0, ne ; CHECK-MVE-NEXT: vmovx.f16 s18, s5 ; CHECK-MVE-NEXT: lsls r0, r0, #31 -; CHECK-MVE-NEXT: vcmpe.f16 s2, s16 +; CHECK-MVE-NEXT: vcmp.f16 s2, s16 ; CHECK-MVE-NEXT: vseleq.f16 s18, s20, s18 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: vmov r0, s18 @@ -1279,7 +1279,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vseleq.f16 s18, s10, s6 ; CHECK-MVE-NEXT: vmov r0, s18 ; CHECK-MVE-NEXT: vmovx.f16 s18, s2 -; CHECK-MVE-NEXT: vcmpe.f16 s18, s16 +; CHECK-MVE-NEXT: vcmp.f16 s18, s16 ; CHECK-MVE-NEXT: vmov.16 q3[4], r0 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r0, #0 @@ -1289,11 +1289,11 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: cset r0, ne ; CHECK-MVE-NEXT: vmovx.f16 s18, s6 ; CHECK-MVE-NEXT: lsls r0, r0, #31 -; CHECK-MVE-NEXT: vcmpe.f16 s3, s16 +; CHECK-MVE-NEXT: vcmp.f16 s3, s16 ; CHECK-MVE-NEXT: 
vseleq.f16 s18, s20, s18 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: vmov r0, s18 -; CHECK-MVE-NEXT: vcmpe.f16 s0, s16 +; CHECK-MVE-NEXT: vcmp.f16 s0, s16 ; CHECK-MVE-NEXT: vmov.16 q3[5], r0 ; CHECK-MVE-NEXT: mov.w r0, #0 ; CHECK-MVE-NEXT: it ge @@ -1344,13 +1344,13 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vmovx.f16 s12, s0 ; CHECK-MVE-NEXT: movs r0, #0 ; CHECK-MVE-NEXT: vmovx.f16 s14, s8 -; CHECK-MVE-NEXT: vcmpe.f16 s12, s16 +; CHECK-MVE-NEXT: vcmp.f16 s12, s16 ; CHECK-MVE-NEXT: vmovx.f16 s12, s4 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: it mi ; CHECK-MVE-NEXT: movmi r0, #1 ; CHECK-MVE-NEXT: cmp r0, #0 -; CHECK-MVE-NEXT: vcmpe.f16 s0, s16 +; CHECK-MVE-NEXT: vcmp.f16 s0, s16 ; CHECK-MVE-NEXT: cset r0, ne ; CHECK-MVE-NEXT: movs r2, #0 ; CHECK-MVE-NEXT: lsls r0, r0, #31 @@ -1363,7 +1363,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: cset r2, ne ; CHECK-MVE-NEXT: vmov r0, s12 ; CHECK-MVE-NEXT: lsls r2, r2, #31 -; CHECK-MVE-NEXT: vcmpe.f16 s1, s16 +; CHECK-MVE-NEXT: vcmp.f16 s1, s16 ; CHECK-MVE-NEXT: vseleq.f16 s12, s8, s4 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: vmov r2, s12 @@ -1380,7 +1380,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vseleq.f16 s18, s9, s5 ; CHECK-MVE-NEXT: vmov r0, s18 ; CHECK-MVE-NEXT: vmovx.f16 s18, s1 -; CHECK-MVE-NEXT: vcmpe.f16 s18, s16 +; CHECK-MVE-NEXT: vcmp.f16 s18, s16 ; CHECK-MVE-NEXT: vmov.16 q3[2], r0 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r0, #0 @@ -1390,7 +1390,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: cset r0, ne ; CHECK-MVE-NEXT: vmovx.f16 s18, s5 ; CHECK-MVE-NEXT: lsls r0, r0, #31 -; CHECK-MVE-NEXT: vcmpe.f16 s2, s16 +; CHECK-MVE-NEXT: vcmp.f16 s2, s16 ; CHECK-MVE-NEXT: vseleq.f16 s18, s20, s18 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: vmov r0, s18 @@ -1405,7 +1405,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vseleq.f16 s18, s10, s6 ; 
CHECK-MVE-NEXT: vmov r0, s18 ; CHECK-MVE-NEXT: vmovx.f16 s18, s2 -; CHECK-MVE-NEXT: vcmpe.f16 s18, s16 +; CHECK-MVE-NEXT: vcmp.f16 s18, s16 ; CHECK-MVE-NEXT: vmov.16 q3[4], r0 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r0, #0 @@ -1415,11 +1415,11 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: cset r0, ne ; CHECK-MVE-NEXT: vmovx.f16 s18, s6 ; CHECK-MVE-NEXT: lsls r0, r0, #31 -; CHECK-MVE-NEXT: vcmpe.f16 s3, s16 +; CHECK-MVE-NEXT: vcmp.f16 s3, s16 ; CHECK-MVE-NEXT: vseleq.f16 s18, s20, s18 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: vmov r0, s18 -; CHECK-MVE-NEXT: vcmpe.f16 s0, s16 +; CHECK-MVE-NEXT: vcmp.f16 s0, s16 ; CHECK-MVE-NEXT: vmov.16 q3[5], r0 ; CHECK-MVE-NEXT: mov.w r0, #0 ; CHECK-MVE-NEXT: it mi @@ -1471,13 +1471,13 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vmovx.f16 s12, s0 ; CHECK-MVE-NEXT: movs r0, #0 ; CHECK-MVE-NEXT: vmovx.f16 s14, s8 -; CHECK-MVE-NEXT: vcmpe.f16 s12, s16 +; CHECK-MVE-NEXT: vcmp.f16 s12, s16 ; CHECK-MVE-NEXT: vmovx.f16 s12, s4 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: it ls ; CHECK-MVE-NEXT: movls r0, #1 ; CHECK-MVE-NEXT: cmp r0, #0 -; CHECK-MVE-NEXT: vcmpe.f16 s0, s16 +; CHECK-MVE-NEXT: vcmp.f16 s0, s16 ; CHECK-MVE-NEXT: cset r0, ne ; CHECK-MVE-NEXT: movs r2, #0 ; CHECK-MVE-NEXT: lsls r0, r0, #31 @@ -1490,7 +1490,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: cset r2, ne ; CHECK-MVE-NEXT: vmov r0, s12 ; CHECK-MVE-NEXT: lsls r2, r2, #31 -; CHECK-MVE-NEXT: vcmpe.f16 s1, s16 +; CHECK-MVE-NEXT: vcmp.f16 s1, s16 ; CHECK-MVE-NEXT: vseleq.f16 s12, s8, s4 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: vmov r2, s12 @@ -1507,7 +1507,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vseleq.f16 s18, s9, s5 ; CHECK-MVE-NEXT: vmov r0, s18 ; CHECK-MVE-NEXT: vmovx.f16 s18, s1 -; CHECK-MVE-NEXT: vcmpe.f16 s18, s16 +; CHECK-MVE-NEXT: vcmp.f16 s18, s16 ; CHECK-MVE-NEXT: vmov.16 q3[2], r0 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr 
; CHECK-MVE-NEXT: mov.w r0, #0 @@ -1517,7 +1517,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: cset r0, ne ; CHECK-MVE-NEXT: vmovx.f16 s18, s5 ; CHECK-MVE-NEXT: lsls r0, r0, #31 -; CHECK-MVE-NEXT: vcmpe.f16 s2, s16 +; CHECK-MVE-NEXT: vcmp.f16 s2, s16 ; CHECK-MVE-NEXT: vseleq.f16 s18, s20, s18 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: vmov r0, s18 @@ -1532,7 +1532,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vseleq.f16 s18, s10, s6 ; CHECK-MVE-NEXT: vmov r0, s18 ; CHECK-MVE-NEXT: vmovx.f16 s18, s2 -; CHECK-MVE-NEXT: vcmpe.f16 s18, s16 +; CHECK-MVE-NEXT: vcmp.f16 s18, s16 ; CHECK-MVE-NEXT: vmov.16 q3[4], r0 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r0, #0 @@ -1542,11 +1542,11 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: cset r0, ne ; CHECK-MVE-NEXT: vmovx.f16 s18, s6 ; CHECK-MVE-NEXT: lsls r0, r0, #31 -; CHECK-MVE-NEXT: vcmpe.f16 s3, s16 +; CHECK-MVE-NEXT: vcmp.f16 s3, s16 ; CHECK-MVE-NEXT: vseleq.f16 s18, s20, s18 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: vmov r0, s18 -; CHECK-MVE-NEXT: vcmpe.f16 s0, s16 +; CHECK-MVE-NEXT: vcmp.f16 s0, s16 ; CHECK-MVE-NEXT: vmov.16 q3[5], r0 ; CHECK-MVE-NEXT: mov.w r0, #0 ; CHECK-MVE-NEXT: it ls @@ -1868,13 +1868,13 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vmovx.f16 s12, s0 ; CHECK-MVE-NEXT: movs r0, #0 ; CHECK-MVE-NEXT: vmovx.f16 s14, s8 -; CHECK-MVE-NEXT: vcmpe.f16 s12, s16 +; CHECK-MVE-NEXT: vcmp.f16 s12, s16 ; CHECK-MVE-NEXT: vmovx.f16 s12, s4 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: it hi ; CHECK-MVE-NEXT: movhi r0, #1 ; CHECK-MVE-NEXT: cmp r0, #0 -; CHECK-MVE-NEXT: vcmpe.f16 s0, s16 +; CHECK-MVE-NEXT: vcmp.f16 s0, s16 ; CHECK-MVE-NEXT: cset r0, ne ; CHECK-MVE-NEXT: movs r2, #0 ; CHECK-MVE-NEXT: lsls r0, r0, #31 @@ -1887,7 +1887,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: cset r2, ne ; CHECK-MVE-NEXT: vmov r0, s12 ; CHECK-MVE-NEXT: lsls r2, r2, #31 -; 
CHECK-MVE-NEXT: vcmpe.f16 s1, s16 +; CHECK-MVE-NEXT: vcmp.f16 s1, s16 ; CHECK-MVE-NEXT: vseleq.f16 s12, s8, s4 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: vmov r2, s12 @@ -1904,7 +1904,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vseleq.f16 s18, s9, s5 ; CHECK-MVE-NEXT: vmov r0, s18 ; CHECK-MVE-NEXT: vmovx.f16 s18, s1 -; CHECK-MVE-NEXT: vcmpe.f16 s18, s16 +; CHECK-MVE-NEXT: vcmp.f16 s18, s16 ; CHECK-MVE-NEXT: vmov.16 q3[2], r0 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r0, #0 @@ -1914,7 +1914,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: cset r0, ne ; CHECK-MVE-NEXT: vmovx.f16 s18, s5 ; CHECK-MVE-NEXT: lsls r0, r0, #31 -; CHECK-MVE-NEXT: vcmpe.f16 s2, s16 +; CHECK-MVE-NEXT: vcmp.f16 s2, s16 ; CHECK-MVE-NEXT: vseleq.f16 s18, s20, s18 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: vmov r0, s18 @@ -1929,7 +1929,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vseleq.f16 s18, s10, s6 ; CHECK-MVE-NEXT: vmov r0, s18 ; CHECK-MVE-NEXT: vmovx.f16 s18, s2 -; CHECK-MVE-NEXT: vcmpe.f16 s18, s16 +; CHECK-MVE-NEXT: vcmp.f16 s18, s16 ; CHECK-MVE-NEXT: vmov.16 q3[4], r0 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r0, #0 @@ -1939,11 +1939,11 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: cset r0, ne ; CHECK-MVE-NEXT: vmovx.f16 s18, s6 ; CHECK-MVE-NEXT: lsls r0, r0, #31 -; CHECK-MVE-NEXT: vcmpe.f16 s3, s16 +; CHECK-MVE-NEXT: vcmp.f16 s3, s16 ; CHECK-MVE-NEXT: vseleq.f16 s18, s20, s18 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: vmov r0, s18 -; CHECK-MVE-NEXT: vcmpe.f16 s0, s16 +; CHECK-MVE-NEXT: vcmp.f16 s0, s16 ; CHECK-MVE-NEXT: vmov.16 q3[5], r0 ; CHECK-MVE-NEXT: mov.w r0, #0 ; CHECK-MVE-NEXT: it hi @@ -1996,13 +1996,13 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vmovx.f16 s12, s0 ; CHECK-MVE-NEXT: movs r0, #0 ; CHECK-MVE-NEXT: vmovx.f16 s14, s8 -; CHECK-MVE-NEXT: vcmpe.f16 s12, s16 +; CHECK-MVE-NEXT: vcmp.f16 s12, s16 
; CHECK-MVE-NEXT: vmovx.f16 s12, s4 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: it pl ; CHECK-MVE-NEXT: movpl r0, #1 ; CHECK-MVE-NEXT: cmp r0, #0 -; CHECK-MVE-NEXT: vcmpe.f16 s0, s16 +; CHECK-MVE-NEXT: vcmp.f16 s0, s16 ; CHECK-MVE-NEXT: cset r0, ne ; CHECK-MVE-NEXT: movs r2, #0 ; CHECK-MVE-NEXT: lsls r0, r0, #31 @@ -2015,7 +2015,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: cset r2, ne ; CHECK-MVE-NEXT: vmov r0, s12 ; CHECK-MVE-NEXT: lsls r2, r2, #31 -; CHECK-MVE-NEXT: vcmpe.f16 s1, s16 +; CHECK-MVE-NEXT: vcmp.f16 s1, s16 ; CHECK-MVE-NEXT: vseleq.f16 s12, s8, s4 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: vmov r2, s12 @@ -2032,7 +2032,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vseleq.f16 s18, s9, s5 ; CHECK-MVE-NEXT: vmov r0, s18 ; CHECK-MVE-NEXT: vmovx.f16 s18, s1 -; CHECK-MVE-NEXT: vcmpe.f16 s18, s16 +; CHECK-MVE-NEXT: vcmp.f16 s18, s16 ; CHECK-MVE-NEXT: vmov.16 q3[2], r0 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r0, #0 @@ -2042,7 +2042,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: cset r0, ne ; CHECK-MVE-NEXT: vmovx.f16 s18, s5 ; CHECK-MVE-NEXT: lsls r0, r0, #31 -; CHECK-MVE-NEXT: vcmpe.f16 s2, s16 +; CHECK-MVE-NEXT: vcmp.f16 s2, s16 ; CHECK-MVE-NEXT: vseleq.f16 s18, s20, s18 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: vmov r0, s18 @@ -2057,7 +2057,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vseleq.f16 s18, s10, s6 ; CHECK-MVE-NEXT: vmov r0, s18 ; CHECK-MVE-NEXT: vmovx.f16 s18, s2 -; CHECK-MVE-NEXT: vcmpe.f16 s18, s16 +; CHECK-MVE-NEXT: vcmp.f16 s18, s16 ; CHECK-MVE-NEXT: vmov.16 q3[4], r0 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r0, #0 @@ -2067,11 +2067,11 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: cset r0, ne ; CHECK-MVE-NEXT: vmovx.f16 s18, s6 ; CHECK-MVE-NEXT: lsls r0, r0, #31 -; CHECK-MVE-NEXT: vcmpe.f16 s3, s16 +; CHECK-MVE-NEXT: vcmp.f16 s3, s16 ; CHECK-MVE-NEXT: vseleq.f16 
s18, s20, s18 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: vmov r0, s18 -; CHECK-MVE-NEXT: vcmpe.f16 s0, s16 +; CHECK-MVE-NEXT: vcmp.f16 s0, s16 ; CHECK-MVE-NEXT: vmov.16 q3[5], r0 ; CHECK-MVE-NEXT: mov.w r0, #0 ; CHECK-MVE-NEXT: it pl @@ -2124,13 +2124,13 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vmovx.f16 s12, s0 ; CHECK-MVE-NEXT: movs r0, #0 ; CHECK-MVE-NEXT: vmovx.f16 s14, s8 -; CHECK-MVE-NEXT: vcmpe.f16 s12, s16 +; CHECK-MVE-NEXT: vcmp.f16 s12, s16 ; CHECK-MVE-NEXT: vmovx.f16 s12, s4 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: it lt ; CHECK-MVE-NEXT: movlt r0, #1 ; CHECK-MVE-NEXT: cmp r0, #0 -; CHECK-MVE-NEXT: vcmpe.f16 s0, s16 +; CHECK-MVE-NEXT: vcmp.f16 s0, s16 ; CHECK-MVE-NEXT: cset r0, ne ; CHECK-MVE-NEXT: movs r2, #0 ; CHECK-MVE-NEXT: lsls r0, r0, #31 @@ -2143,7 +2143,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: cset r2, ne ; CHECK-MVE-NEXT: vmov r0, s12 ; CHECK-MVE-NEXT: lsls r2, r2, #31 -; CHECK-MVE-NEXT: vcmpe.f16 s1, s16 +; CHECK-MVE-NEXT: vcmp.f16 s1, s16 ; CHECK-MVE-NEXT: vseleq.f16 s12, s8, s4 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: vmov r2, s12 @@ -2160,7 +2160,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vseleq.f16 s18, s9, s5 ; CHECK-MVE-NEXT: vmov r0, s18 ; CHECK-MVE-NEXT: vmovx.f16 s18, s1 -; CHECK-MVE-NEXT: vcmpe.f16 s18, s16 +; CHECK-MVE-NEXT: vcmp.f16 s18, s16 ; CHECK-MVE-NEXT: vmov.16 q3[2], r0 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r0, #0 @@ -2170,7 +2170,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: cset r0, ne ; CHECK-MVE-NEXT: vmovx.f16 s18, s5 ; CHECK-MVE-NEXT: lsls r0, r0, #31 -; CHECK-MVE-NEXT: vcmpe.f16 s2, s16 +; CHECK-MVE-NEXT: vcmp.f16 s2, s16 ; CHECK-MVE-NEXT: vseleq.f16 s18, s20, s18 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: vmov r0, s18 @@ -2185,7 +2185,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vseleq.f16 s18, s10, s6 ; CHECK-MVE-NEXT: vmov r0, 
s18 ; CHECK-MVE-NEXT: vmovx.f16 s18, s2 -; CHECK-MVE-NEXT: vcmpe.f16 s18, s16 +; CHECK-MVE-NEXT: vcmp.f16 s18, s16 ; CHECK-MVE-NEXT: vmov.16 q3[4], r0 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r0, #0 @@ -2195,11 +2195,11 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: cset r0, ne ; CHECK-MVE-NEXT: vmovx.f16 s18, s6 ; CHECK-MVE-NEXT: lsls r0, r0, #31 -; CHECK-MVE-NEXT: vcmpe.f16 s3, s16 +; CHECK-MVE-NEXT: vcmp.f16 s3, s16 ; CHECK-MVE-NEXT: vseleq.f16 s18, s20, s18 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: vmov r0, s18 -; CHECK-MVE-NEXT: vcmpe.f16 s0, s16 +; CHECK-MVE-NEXT: vcmp.f16 s0, s16 ; CHECK-MVE-NEXT: vmov.16 q3[5], r0 ; CHECK-MVE-NEXT: mov.w r0, #0 ; CHECK-MVE-NEXT: it lt @@ -2251,13 +2251,13 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vmovx.f16 s12, s0 ; CHECK-MVE-NEXT: movs r0, #0 ; CHECK-MVE-NEXT: vmovx.f16 s14, s8 -; CHECK-MVE-NEXT: vcmpe.f16 s12, s16 +; CHECK-MVE-NEXT: vcmp.f16 s12, s16 ; CHECK-MVE-NEXT: vmovx.f16 s12, s4 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: it le ; CHECK-MVE-NEXT: movle r0, #1 ; CHECK-MVE-NEXT: cmp r0, #0 -; CHECK-MVE-NEXT: vcmpe.f16 s0, s16 +; CHECK-MVE-NEXT: vcmp.f16 s0, s16 ; CHECK-MVE-NEXT: cset r0, ne ; CHECK-MVE-NEXT: movs r2, #0 ; CHECK-MVE-NEXT: lsls r0, r0, #31 @@ -2270,7 +2270,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: cset r2, ne ; CHECK-MVE-NEXT: vmov r0, s12 ; CHECK-MVE-NEXT: lsls r2, r2, #31 -; CHECK-MVE-NEXT: vcmpe.f16 s1, s16 +; CHECK-MVE-NEXT: vcmp.f16 s1, s16 ; CHECK-MVE-NEXT: vseleq.f16 s12, s8, s4 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: vmov r2, s12 @@ -2287,7 +2287,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vseleq.f16 s18, s9, s5 ; CHECK-MVE-NEXT: vmov r0, s18 ; CHECK-MVE-NEXT: vmovx.f16 s18, s1 -; CHECK-MVE-NEXT: vcmpe.f16 s18, s16 +; CHECK-MVE-NEXT: vcmp.f16 s18, s16 ; CHECK-MVE-NEXT: vmov.16 q3[2], r0 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w 
r0, #0 @@ -2297,7 +2297,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: cset r0, ne ; CHECK-MVE-NEXT: vmovx.f16 s18, s5 ; CHECK-MVE-NEXT: lsls r0, r0, #31 -; CHECK-MVE-NEXT: vcmpe.f16 s2, s16 +; CHECK-MVE-NEXT: vcmp.f16 s2, s16 ; CHECK-MVE-NEXT: vseleq.f16 s18, s20, s18 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: vmov r0, s18 @@ -2312,7 +2312,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vseleq.f16 s18, s10, s6 ; CHECK-MVE-NEXT: vmov r0, s18 ; CHECK-MVE-NEXT: vmovx.f16 s18, s2 -; CHECK-MVE-NEXT: vcmpe.f16 s18, s16 +; CHECK-MVE-NEXT: vcmp.f16 s18, s16 ; CHECK-MVE-NEXT: vmov.16 q3[4], r0 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r0, #0 @@ -2322,11 +2322,11 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: cset r0, ne ; CHECK-MVE-NEXT: vmovx.f16 s18, s6 ; CHECK-MVE-NEXT: lsls r0, r0, #31 -; CHECK-MVE-NEXT: vcmpe.f16 s3, s16 +; CHECK-MVE-NEXT: vcmp.f16 s3, s16 ; CHECK-MVE-NEXT: vseleq.f16 s18, s20, s18 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: vmov r0, s18 -; CHECK-MVE-NEXT: vcmpe.f16 s0, s16 +; CHECK-MVE-NEXT: vcmp.f16 s0, s16 ; CHECK-MVE-NEXT: vmov.16 q3[5], r0 ; CHECK-MVE-NEXT: mov.w r0, #0 ; CHECK-MVE-NEXT: it le @@ -2378,13 +2378,13 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vmovx.f16 s12, s0 ; CHECK-MVE-NEXT: movs r0, #0 ; CHECK-MVE-NEXT: vmovx.f16 s14, s8 -; CHECK-MVE-NEXT: vcmpe.f16 s12, s16 +; CHECK-MVE-NEXT: vcmp.f16 s12, s16 ; CHECK-MVE-NEXT: vmovx.f16 s12, s4 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: it vc ; CHECK-MVE-NEXT: movvc r0, #1 ; CHECK-MVE-NEXT: cmp r0, #0 -; CHECK-MVE-NEXT: vcmpe.f16 s0, s16 +; CHECK-MVE-NEXT: vcmp.f16 s0, s16 ; CHECK-MVE-NEXT: cset r0, ne ; CHECK-MVE-NEXT: movs r2, #0 ; CHECK-MVE-NEXT: lsls r0, r0, #31 @@ -2397,7 +2397,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: cset r2, ne ; CHECK-MVE-NEXT: vmov r0, s12 ; CHECK-MVE-NEXT: lsls r2, r2, #31 -; CHECK-MVE-NEXT: vcmpe.f16 s1, s16 +; 
CHECK-MVE-NEXT: vcmp.f16 s1, s16 ; CHECK-MVE-NEXT: vseleq.f16 s12, s8, s4 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: vmov r2, s12 @@ -2414,7 +2414,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vseleq.f16 s18, s9, s5 ; CHECK-MVE-NEXT: vmov r0, s18 ; CHECK-MVE-NEXT: vmovx.f16 s18, s1 -; CHECK-MVE-NEXT: vcmpe.f16 s18, s16 +; CHECK-MVE-NEXT: vcmp.f16 s18, s16 ; CHECK-MVE-NEXT: vmov.16 q3[2], r0 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r0, #0 @@ -2424,7 +2424,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: cset r0, ne ; CHECK-MVE-NEXT: vmovx.f16 s18, s5 ; CHECK-MVE-NEXT: lsls r0, r0, #31 -; CHECK-MVE-NEXT: vcmpe.f16 s2, s16 +; CHECK-MVE-NEXT: vcmp.f16 s2, s16 ; CHECK-MVE-NEXT: vseleq.f16 s18, s20, s18 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: vmov r0, s18 @@ -2439,7 +2439,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vseleq.f16 s18, s10, s6 ; CHECK-MVE-NEXT: vmov r0, s18 ; CHECK-MVE-NEXT: vmovx.f16 s18, s2 -; CHECK-MVE-NEXT: vcmpe.f16 s18, s16 +; CHECK-MVE-NEXT: vcmp.f16 s18, s16 ; CHECK-MVE-NEXT: vmov.16 q3[4], r0 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r0, #0 @@ -2449,11 +2449,11 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: cset r0, ne ; CHECK-MVE-NEXT: vmovx.f16 s18, s6 ; CHECK-MVE-NEXT: lsls r0, r0, #31 -; CHECK-MVE-NEXT: vcmpe.f16 s3, s16 +; CHECK-MVE-NEXT: vcmp.f16 s3, s16 ; CHECK-MVE-NEXT: vseleq.f16 s18, s20, s18 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: vmov r0, s18 -; CHECK-MVE-NEXT: vcmpe.f16 s0, s16 +; CHECK-MVE-NEXT: vcmp.f16 s0, s16 ; CHECK-MVE-NEXT: vmov.16 q3[5], r0 ; CHECK-MVE-NEXT: mov.w r0, #0 ; CHECK-MVE-NEXT: it vc @@ -2507,13 +2507,13 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vmovx.f16 s12, s0 ; CHECK-MVE-NEXT: movs r0, #0 ; CHECK-MVE-NEXT: vmovx.f16 s14, s8 -; CHECK-MVE-NEXT: vcmpe.f16 s12, s16 +; CHECK-MVE-NEXT: vcmp.f16 s12, s16 ; CHECK-MVE-NEXT: vmovx.f16 s12, s4 ; 
CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: it vs ; CHECK-MVE-NEXT: movvs r0, #1 ; CHECK-MVE-NEXT: cmp r0, #0 -; CHECK-MVE-NEXT: vcmpe.f16 s0, s16 +; CHECK-MVE-NEXT: vcmp.f16 s0, s16 ; CHECK-MVE-NEXT: cset r0, ne ; CHECK-MVE-NEXT: movs r2, #0 ; CHECK-MVE-NEXT: lsls r0, r0, #31 @@ -2526,7 +2526,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: cset r2, ne ; CHECK-MVE-NEXT: vmov r0, s12 ; CHECK-MVE-NEXT: lsls r2, r2, #31 -; CHECK-MVE-NEXT: vcmpe.f16 s1, s16 +; CHECK-MVE-NEXT: vcmp.f16 s1, s16 ; CHECK-MVE-NEXT: vseleq.f16 s12, s8, s4 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: vmov r2, s12 @@ -2543,7 +2543,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vseleq.f16 s18, s9, s5 ; CHECK-MVE-NEXT: vmov r0, s18 ; CHECK-MVE-NEXT: vmovx.f16 s18, s1 -; CHECK-MVE-NEXT: vcmpe.f16 s18, s16 +; CHECK-MVE-NEXT: vcmp.f16 s18, s16 ; CHECK-MVE-NEXT: vmov.16 q3[2], r0 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r0, #0 @@ -2553,7 +2553,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: cset r0, ne ; CHECK-MVE-NEXT: vmovx.f16 s18, s5 ; CHECK-MVE-NEXT: lsls r0, r0, #31 -; CHECK-MVE-NEXT: vcmpe.f16 s2, s16 +; CHECK-MVE-NEXT: vcmp.f16 s2, s16 ; CHECK-MVE-NEXT: vseleq.f16 s18, s20, s18 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: vmov r0, s18 @@ -2568,7 +2568,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vseleq.f16 s18, s10, s6 ; CHECK-MVE-NEXT: vmov r0, s18 ; CHECK-MVE-NEXT: vmovx.f16 s18, s2 -; CHECK-MVE-NEXT: vcmpe.f16 s18, s16 +; CHECK-MVE-NEXT: vcmp.f16 s18, s16 ; CHECK-MVE-NEXT: vmov.16 q3[4], r0 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r0, #0 @@ -2578,11 +2578,11 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: cset r0, ne ; CHECK-MVE-NEXT: vmovx.f16 s18, s6 ; CHECK-MVE-NEXT: lsls r0, r0, #31 -; CHECK-MVE-NEXT: vcmpe.f16 s3, s16 +; CHECK-MVE-NEXT: vcmp.f16 s3, s16 ; CHECK-MVE-NEXT: vseleq.f16 s18, s20, s18 ; CHECK-MVE-NEXT: vmrs 
APSR_nzcv, fpscr ; CHECK-MVE-NEXT: vmov r0, s18 -; CHECK-MVE-NEXT: vcmpe.f16 s0, s16 +; CHECK-MVE-NEXT: vcmp.f16 s0, s16 ; CHECK-MVE-NEXT: vmov.16 q3[5], r0 ; CHECK-MVE-NEXT: mov.w r0, #0 ; CHECK-MVE-NEXT: it vs Modified: llvm/trunk/test/CodeGen/Thumb2/mve-vcmpfz.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/Thumb2/mve-vcmpfz.ll?rev=374025&r1=374024&r2=374025&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/Thumb2/mve-vcmpfz.ll (original) +++ llvm/trunk/test/CodeGen/Thumb2/mve-vcmpfz.ll Tue Oct 8 01:25:42 2019 @@ -122,24 +122,24 @@ entry: define arm_aapcs_vfpcc <4 x float> @vcmp_ogt_v4f32(<4 x float> %src, <4 x float> %a, <4 x float> %b) { ; CHECK-MVE-LABEL: vcmp_ogt_v4f32: ; CHECK-MVE: @ %bb.0: @ %entry -; CHECK-MVE-NEXT: vcmpe.f32 s0, #0 +; CHECK-MVE-NEXT: vcmp.f32 s0, #0 ; CHECK-MVE-NEXT: movs r1, #0 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: it gt ; CHECK-MVE-NEXT: movgt r1, #1 ; CHECK-MVE-NEXT: cmp r1, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s1, #0 +; CHECK-MVE-NEXT: vcmp.f32 s1, #0 ; CHECK-MVE-NEXT: cset r1, ne ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r2, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s2, #0 +; CHECK-MVE-NEXT: vcmp.f32 s2, #0 ; CHECK-MVE-NEXT: it gt ; CHECK-MVE-NEXT: movgt r2, #1 ; CHECK-MVE-NEXT: cmp r2, #0 ; CHECK-MVE-NEXT: cset r2, ne ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r3, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s3, #0 +; CHECK-MVE-NEXT: vcmp.f32 s3, #0 ; CHECK-MVE-NEXT: it gt ; CHECK-MVE-NEXT: movgt r3, #1 ; CHECK-MVE-NEXT: cmp r3, #0 @@ -174,24 +174,24 @@ entry: define arm_aapcs_vfpcc <4 x float> @vcmp_oge_v4f32(<4 x float> %src, <4 x float> %a, <4 x float> %b) { ; CHECK-MVE-LABEL: vcmp_oge_v4f32: ; CHECK-MVE: @ %bb.0: @ %entry -; CHECK-MVE-NEXT: vcmpe.f32 s0, #0 +; CHECK-MVE-NEXT: vcmp.f32 s0, #0 ; CHECK-MVE-NEXT: movs r1, #0 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: it ge ; CHECK-MVE-NEXT: movge 
r1, #1 ; CHECK-MVE-NEXT: cmp r1, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s1, #0 +; CHECK-MVE-NEXT: vcmp.f32 s1, #0 ; CHECK-MVE-NEXT: cset r1, ne ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r2, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s2, #0 +; CHECK-MVE-NEXT: vcmp.f32 s2, #0 ; CHECK-MVE-NEXT: it ge ; CHECK-MVE-NEXT: movge r2, #1 ; CHECK-MVE-NEXT: cmp r2, #0 ; CHECK-MVE-NEXT: cset r2, ne ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r3, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s3, #0 +; CHECK-MVE-NEXT: vcmp.f32 s3, #0 ; CHECK-MVE-NEXT: it ge ; CHECK-MVE-NEXT: movge r3, #1 ; CHECK-MVE-NEXT: cmp r3, #0 @@ -226,24 +226,24 @@ entry: define arm_aapcs_vfpcc <4 x float> @vcmp_olt_v4f32(<4 x float> %src, <4 x float> %a, <4 x float> %b) { ; CHECK-MVE-LABEL: vcmp_olt_v4f32: ; CHECK-MVE: @ %bb.0: @ %entry -; CHECK-MVE-NEXT: vcmpe.f32 s0, #0 +; CHECK-MVE-NEXT: vcmp.f32 s0, #0 ; CHECK-MVE-NEXT: movs r1, #0 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: it mi ; CHECK-MVE-NEXT: movmi r1, #1 ; CHECK-MVE-NEXT: cmp r1, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s1, #0 +; CHECK-MVE-NEXT: vcmp.f32 s1, #0 ; CHECK-MVE-NEXT: cset r1, ne ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r2, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s2, #0 +; CHECK-MVE-NEXT: vcmp.f32 s2, #0 ; CHECK-MVE-NEXT: it mi ; CHECK-MVE-NEXT: movmi r2, #1 ; CHECK-MVE-NEXT: cmp r2, #0 ; CHECK-MVE-NEXT: cset r2, ne ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r3, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s3, #0 +; CHECK-MVE-NEXT: vcmp.f32 s3, #0 ; CHECK-MVE-NEXT: it mi ; CHECK-MVE-NEXT: movmi r3, #1 ; CHECK-MVE-NEXT: cmp r3, #0 @@ -278,24 +278,24 @@ entry: define arm_aapcs_vfpcc <4 x float> @vcmp_ole_v4f32(<4 x float> %src, <4 x float> %a, <4 x float> %b) { ; CHECK-MVE-LABEL: vcmp_ole_v4f32: ; CHECK-MVE: @ %bb.0: @ %entry -; CHECK-MVE-NEXT: vcmpe.f32 s0, #0 +; CHECK-MVE-NEXT: vcmp.f32 s0, #0 ; CHECK-MVE-NEXT: movs r1, #0 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: it ls ; 
CHECK-MVE-NEXT: movls r1, #1 ; CHECK-MVE-NEXT: cmp r1, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s1, #0 +; CHECK-MVE-NEXT: vcmp.f32 s1, #0 ; CHECK-MVE-NEXT: cset r1, ne ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r2, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s2, #0 +; CHECK-MVE-NEXT: vcmp.f32 s2, #0 ; CHECK-MVE-NEXT: it ls ; CHECK-MVE-NEXT: movls r2, #1 ; CHECK-MVE-NEXT: cmp r2, #0 ; CHECK-MVE-NEXT: cset r2, ne ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r3, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s3, #0 +; CHECK-MVE-NEXT: vcmp.f32 s3, #0 ; CHECK-MVE-NEXT: it ls ; CHECK-MVE-NEXT: movls r3, #1 ; CHECK-MVE-NEXT: cmp r3, #0 @@ -446,24 +446,24 @@ entry: define arm_aapcs_vfpcc <4 x float> @vcmp_ugt_v4f32(<4 x float> %src, <4 x float> %a, <4 x float> %b) { ; CHECK-MVE-LABEL: vcmp_ugt_v4f32: ; CHECK-MVE: @ %bb.0: @ %entry -; CHECK-MVE-NEXT: vcmpe.f32 s0, #0 +; CHECK-MVE-NEXT: vcmp.f32 s0, #0 ; CHECK-MVE-NEXT: movs r1, #0 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: it hi ; CHECK-MVE-NEXT: movhi r1, #1 ; CHECK-MVE-NEXT: cmp r1, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s1, #0 +; CHECK-MVE-NEXT: vcmp.f32 s1, #0 ; CHECK-MVE-NEXT: cset r1, ne ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r2, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s2, #0 +; CHECK-MVE-NEXT: vcmp.f32 s2, #0 ; CHECK-MVE-NEXT: it hi ; CHECK-MVE-NEXT: movhi r2, #1 ; CHECK-MVE-NEXT: cmp r2, #0 ; CHECK-MVE-NEXT: cset r2, ne ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r3, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s3, #0 +; CHECK-MVE-NEXT: vcmp.f32 s3, #0 ; CHECK-MVE-NEXT: it hi ; CHECK-MVE-NEXT: movhi r3, #1 ; CHECK-MVE-NEXT: cmp r3, #0 @@ -499,24 +499,24 @@ entry: define arm_aapcs_vfpcc <4 x float> @vcmp_uge_v4f32(<4 x float> %src, <4 x float> %a, <4 x float> %b) { ; CHECK-MVE-LABEL: vcmp_uge_v4f32: ; CHECK-MVE: @ %bb.0: @ %entry -; CHECK-MVE-NEXT: vcmpe.f32 s0, #0 +; CHECK-MVE-NEXT: vcmp.f32 s0, #0 ; CHECK-MVE-NEXT: movs r1, #0 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; 
CHECK-MVE-NEXT: it pl ; CHECK-MVE-NEXT: movpl r1, #1 ; CHECK-MVE-NEXT: cmp r1, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s1, #0 +; CHECK-MVE-NEXT: vcmp.f32 s1, #0 ; CHECK-MVE-NEXT: cset r1, ne ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r2, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s2, #0 +; CHECK-MVE-NEXT: vcmp.f32 s2, #0 ; CHECK-MVE-NEXT: it pl ; CHECK-MVE-NEXT: movpl r2, #1 ; CHECK-MVE-NEXT: cmp r2, #0 ; CHECK-MVE-NEXT: cset r2, ne ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r3, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s3, #0 +; CHECK-MVE-NEXT: vcmp.f32 s3, #0 ; CHECK-MVE-NEXT: it pl ; CHECK-MVE-NEXT: movpl r3, #1 ; CHECK-MVE-NEXT: cmp r3, #0 @@ -552,24 +552,24 @@ entry: define arm_aapcs_vfpcc <4 x float> @vcmp_ult_v4f32(<4 x float> %src, <4 x float> %a, <4 x float> %b) { ; CHECK-MVE-LABEL: vcmp_ult_v4f32: ; CHECK-MVE: @ %bb.0: @ %entry -; CHECK-MVE-NEXT: vcmpe.f32 s0, #0 +; CHECK-MVE-NEXT: vcmp.f32 s0, #0 ; CHECK-MVE-NEXT: movs r1, #0 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: it lt ; CHECK-MVE-NEXT: movlt r1, #1 ; CHECK-MVE-NEXT: cmp r1, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s1, #0 +; CHECK-MVE-NEXT: vcmp.f32 s1, #0 ; CHECK-MVE-NEXT: cset r1, ne ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r2, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s2, #0 +; CHECK-MVE-NEXT: vcmp.f32 s2, #0 ; CHECK-MVE-NEXT: it lt ; CHECK-MVE-NEXT: movlt r2, #1 ; CHECK-MVE-NEXT: cmp r2, #0 ; CHECK-MVE-NEXT: cset r2, ne ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r3, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s3, #0 +; CHECK-MVE-NEXT: vcmp.f32 s3, #0 ; CHECK-MVE-NEXT: it lt ; CHECK-MVE-NEXT: movlt r3, #1 ; CHECK-MVE-NEXT: cmp r3, #0 @@ -605,24 +605,24 @@ entry: define arm_aapcs_vfpcc <4 x float> @vcmp_ule_v4f32(<4 x float> %src, <4 x float> %a, <4 x float> %b) { ; CHECK-MVE-LABEL: vcmp_ule_v4f32: ; CHECK-MVE: @ %bb.0: @ %entry -; CHECK-MVE-NEXT: vcmpe.f32 s0, #0 +; CHECK-MVE-NEXT: vcmp.f32 s0, #0 ; CHECK-MVE-NEXT: movs r1, #0 ; CHECK-MVE-NEXT: vmrs 
APSR_nzcv, fpscr ; CHECK-MVE-NEXT: it le ; CHECK-MVE-NEXT: movle r1, #1 ; CHECK-MVE-NEXT: cmp r1, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s1, #0 +; CHECK-MVE-NEXT: vcmp.f32 s1, #0 ; CHECK-MVE-NEXT: cset r1, ne ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r2, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s2, #0 +; CHECK-MVE-NEXT: vcmp.f32 s2, #0 ; CHECK-MVE-NEXT: it le ; CHECK-MVE-NEXT: movle r2, #1 ; CHECK-MVE-NEXT: cmp r2, #0 ; CHECK-MVE-NEXT: cset r2, ne ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r3, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s3, #0 +; CHECK-MVE-NEXT: vcmp.f32 s3, #0 ; CHECK-MVE-NEXT: it le ; CHECK-MVE-NEXT: movle r3, #1 ; CHECK-MVE-NEXT: cmp r3, #0 @@ -658,24 +658,24 @@ entry: define arm_aapcs_vfpcc <4 x float> @vcmp_ord_v4f32(<4 x float> %src, <4 x float> %a, <4 x float> %b) { ; CHECK-MVE-LABEL: vcmp_ord_v4f32: ; CHECK-MVE: @ %bb.0: @ %entry -; CHECK-MVE-NEXT: vcmpe.f32 s0, s0 +; CHECK-MVE-NEXT: vcmp.f32 s0, s0 ; CHECK-MVE-NEXT: movs r1, #0 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: it vc ; CHECK-MVE-NEXT: movvc r1, #1 ; CHECK-MVE-NEXT: cmp r1, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s1, s1 +; CHECK-MVE-NEXT: vcmp.f32 s1, s1 ; CHECK-MVE-NEXT: cset r1, ne ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r2, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s2, s2 +; CHECK-MVE-NEXT: vcmp.f32 s2, s2 ; CHECK-MVE-NEXT: it vc ; CHECK-MVE-NEXT: movvc r2, #1 ; CHECK-MVE-NEXT: cmp r2, #0 ; CHECK-MVE-NEXT: cset r2, ne ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r3, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s3, s3 +; CHECK-MVE-NEXT: vcmp.f32 s3, s3 ; CHECK-MVE-NEXT: it vc ; CHECK-MVE-NEXT: movvc r3, #1 ; CHECK-MVE-NEXT: cmp r3, #0 @@ -713,24 +713,24 @@ entry: define arm_aapcs_vfpcc <4 x float> @vcmp_uno_v4f32(<4 x float> %src, <4 x float> %a, <4 x float> %b) { ; CHECK-MVE-LABEL: vcmp_uno_v4f32: ; CHECK-MVE: @ %bb.0: @ %entry -; CHECK-MVE-NEXT: vcmpe.f32 s0, s0 +; CHECK-MVE-NEXT: vcmp.f32 s0, s0 ; CHECK-MVE-NEXT: movs r1, #0 ; 
CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: it vs ; CHECK-MVE-NEXT: movvs r1, #1 ; CHECK-MVE-NEXT: cmp r1, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s1, s1 +; CHECK-MVE-NEXT: vcmp.f32 s1, s1 ; CHECK-MVE-NEXT: cset r1, ne ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r2, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s2, s2 +; CHECK-MVE-NEXT: vcmp.f32 s2, s2 ; CHECK-MVE-NEXT: it vs ; CHECK-MVE-NEXT: movvs r2, #1 ; CHECK-MVE-NEXT: cmp r2, #0 ; CHECK-MVE-NEXT: cset r2, ne ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r3, #0 -; CHECK-MVE-NEXT: vcmpe.f32 s3, s3 +; CHECK-MVE-NEXT: vcmp.f32 s3, s3 ; CHECK-MVE-NEXT: it vs ; CHECK-MVE-NEXT: movvs r3, #1 ; CHECK-MVE-NEXT: cmp r3, #0 @@ -1032,13 +1032,13 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vpush {d8, d9} ; CHECK-MVE-NEXT: vmovx.f16 s12, s0 ; CHECK-MVE-NEXT: movs r1, #0 -; CHECK-MVE-NEXT: vcmpe.f16 s12, #0 +; CHECK-MVE-NEXT: vcmp.f16 s12, #0 ; CHECK-MVE-NEXT: vmovx.f16 s12, s4 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: it gt ; CHECK-MVE-NEXT: movgt r1, #1 ; CHECK-MVE-NEXT: cmp r1, #0 -; CHECK-MVE-NEXT: vcmpe.f16 s0, #0 +; CHECK-MVE-NEXT: vcmp.f16 s0, #0 ; CHECK-MVE-NEXT: cset r1, ne ; CHECK-MVE-NEXT: vmovx.f16 s14, s8 ; CHECK-MVE-NEXT: lsls r1, r1, #31 @@ -1051,7 +1051,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: cset r2, ne ; CHECK-MVE-NEXT: vmov r1, s12 ; CHECK-MVE-NEXT: lsls r2, r2, #31 -; CHECK-MVE-NEXT: vcmpe.f16 s1, #0 +; CHECK-MVE-NEXT: vcmp.f16 s1, #0 ; CHECK-MVE-NEXT: vseleq.f16 s12, s8, s4 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: vmov r2, s12 @@ -1069,7 +1069,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vseleq.f16 s16, s9, s5 ; CHECK-MVE-NEXT: vmov r1, s16 ; CHECK-MVE-NEXT: vmovx.f16 s16, s1 -; CHECK-MVE-NEXT: vcmpe.f16 s16, #0 +; CHECK-MVE-NEXT: vcmp.f16 s16, #0 ; CHECK-MVE-NEXT: vmov.16 q3[2], r1 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r1, #0 @@ -1079,7 +1079,7 @@ define 
arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: cset r1, ne ; CHECK-MVE-NEXT: vmovx.f16 s16, s5 ; CHECK-MVE-NEXT: lsls r1, r1, #31 -; CHECK-MVE-NEXT: vcmpe.f16 s2, #0 +; CHECK-MVE-NEXT: vcmp.f16 s2, #0 ; CHECK-MVE-NEXT: vseleq.f16 s16, s18, s16 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: vmov r1, s16 @@ -1094,7 +1094,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vseleq.f16 s16, s10, s6 ; CHECK-MVE-NEXT: vmov r1, s16 ; CHECK-MVE-NEXT: vmovx.f16 s16, s2 -; CHECK-MVE-NEXT: vcmpe.f16 s16, #0 +; CHECK-MVE-NEXT: vcmp.f16 s16, #0 ; CHECK-MVE-NEXT: vmov.16 q3[4], r1 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r1, #0 @@ -1104,11 +1104,11 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: cset r1, ne ; CHECK-MVE-NEXT: vmovx.f16 s16, s6 ; CHECK-MVE-NEXT: lsls r1, r1, #31 -; CHECK-MVE-NEXT: vcmpe.f16 s3, #0 +; CHECK-MVE-NEXT: vcmp.f16 s3, #0 ; CHECK-MVE-NEXT: vseleq.f16 s16, s18, s16 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: vmov r1, s16 -; CHECK-MVE-NEXT: vcmpe.f16 s0, #0 +; CHECK-MVE-NEXT: vcmp.f16 s0, #0 ; CHECK-MVE-NEXT: vmov.16 q3[5], r1 ; CHECK-MVE-NEXT: mov.w r1, #0 ; CHECK-MVE-NEXT: it gt @@ -1152,13 +1152,13 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vpush {d8, d9} ; CHECK-MVE-NEXT: vmovx.f16 s12, s0 ; CHECK-MVE-NEXT: movs r1, #0 -; CHECK-MVE-NEXT: vcmpe.f16 s12, #0 +; CHECK-MVE-NEXT: vcmp.f16 s12, #0 ; CHECK-MVE-NEXT: vmovx.f16 s12, s4 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: it ge ; CHECK-MVE-NEXT: movge r1, #1 ; CHECK-MVE-NEXT: cmp r1, #0 -; CHECK-MVE-NEXT: vcmpe.f16 s0, #0 +; CHECK-MVE-NEXT: vcmp.f16 s0, #0 ; CHECK-MVE-NEXT: cset r1, ne ; CHECK-MVE-NEXT: vmovx.f16 s14, s8 ; CHECK-MVE-NEXT: lsls r1, r1, #31 @@ -1171,7 +1171,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: cset r2, ne ; CHECK-MVE-NEXT: vmov r1, s12 ; CHECK-MVE-NEXT: lsls r2, r2, #31 -; CHECK-MVE-NEXT: vcmpe.f16 s1, #0 +; CHECK-MVE-NEXT: vcmp.f16 s1, #0 ; 
CHECK-MVE-NEXT: vseleq.f16 s12, s8, s4 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: vmov r2, s12 @@ -1189,7 +1189,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vseleq.f16 s16, s9, s5 ; CHECK-MVE-NEXT: vmov r1, s16 ; CHECK-MVE-NEXT: vmovx.f16 s16, s1 -; CHECK-MVE-NEXT: vcmpe.f16 s16, #0 +; CHECK-MVE-NEXT: vcmp.f16 s16, #0 ; CHECK-MVE-NEXT: vmov.16 q3[2], r1 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r1, #0 @@ -1199,7 +1199,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: cset r1, ne ; CHECK-MVE-NEXT: vmovx.f16 s16, s5 ; CHECK-MVE-NEXT: lsls r1, r1, #31 -; CHECK-MVE-NEXT: vcmpe.f16 s2, #0 +; CHECK-MVE-NEXT: vcmp.f16 s2, #0 ; CHECK-MVE-NEXT: vseleq.f16 s16, s18, s16 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: vmov r1, s16 @@ -1214,7 +1214,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vseleq.f16 s16, s10, s6 ; CHECK-MVE-NEXT: vmov r1, s16 ; CHECK-MVE-NEXT: vmovx.f16 s16, s2 -; CHECK-MVE-NEXT: vcmpe.f16 s16, #0 +; CHECK-MVE-NEXT: vcmp.f16 s16, #0 ; CHECK-MVE-NEXT: vmov.16 q3[4], r1 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r1, #0 @@ -1224,11 +1224,11 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: cset r1, ne ; CHECK-MVE-NEXT: vmovx.f16 s16, s6 ; CHECK-MVE-NEXT: lsls r1, r1, #31 -; CHECK-MVE-NEXT: vcmpe.f16 s3, #0 +; CHECK-MVE-NEXT: vcmp.f16 s3, #0 ; CHECK-MVE-NEXT: vseleq.f16 s16, s18, s16 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: vmov r1, s16 -; CHECK-MVE-NEXT: vcmpe.f16 s0, #0 +; CHECK-MVE-NEXT: vcmp.f16 s0, #0 ; CHECK-MVE-NEXT: vmov.16 q3[5], r1 ; CHECK-MVE-NEXT: mov.w r1, #0 ; CHECK-MVE-NEXT: it ge @@ -1272,13 +1272,13 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vpush {d8, d9} ; CHECK-MVE-NEXT: vmovx.f16 s12, s0 ; CHECK-MVE-NEXT: movs r1, #0 -; CHECK-MVE-NEXT: vcmpe.f16 s12, #0 +; CHECK-MVE-NEXT: vcmp.f16 s12, #0 ; CHECK-MVE-NEXT: vmovx.f16 s12, s4 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; 
CHECK-MVE-NEXT: it mi ; CHECK-MVE-NEXT: movmi r1, #1 ; CHECK-MVE-NEXT: cmp r1, #0 -; CHECK-MVE-NEXT: vcmpe.f16 s0, #0 +; CHECK-MVE-NEXT: vcmp.f16 s0, #0 ; CHECK-MVE-NEXT: cset r1, ne ; CHECK-MVE-NEXT: vmovx.f16 s14, s8 ; CHECK-MVE-NEXT: lsls r1, r1, #31 @@ -1291,7 +1291,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: cset r2, ne ; CHECK-MVE-NEXT: vmov r1, s12 ; CHECK-MVE-NEXT: lsls r2, r2, #31 -; CHECK-MVE-NEXT: vcmpe.f16 s1, #0 +; CHECK-MVE-NEXT: vcmp.f16 s1, #0 ; CHECK-MVE-NEXT: vseleq.f16 s12, s8, s4 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: vmov r2, s12 @@ -1309,7 +1309,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vseleq.f16 s16, s9, s5 ; CHECK-MVE-NEXT: vmov r1, s16 ; CHECK-MVE-NEXT: vmovx.f16 s16, s1 -; CHECK-MVE-NEXT: vcmpe.f16 s16, #0 +; CHECK-MVE-NEXT: vcmp.f16 s16, #0 ; CHECK-MVE-NEXT: vmov.16 q3[2], r1 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r1, #0 @@ -1319,7 +1319,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: cset r1, ne ; CHECK-MVE-NEXT: vmovx.f16 s16, s5 ; CHECK-MVE-NEXT: lsls r1, r1, #31 -; CHECK-MVE-NEXT: vcmpe.f16 s2, #0 +; CHECK-MVE-NEXT: vcmp.f16 s2, #0 ; CHECK-MVE-NEXT: vseleq.f16 s16, s18, s16 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: vmov r1, s16 @@ -1334,7 +1334,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vseleq.f16 s16, s10, s6 ; CHECK-MVE-NEXT: vmov r1, s16 ; CHECK-MVE-NEXT: vmovx.f16 s16, s2 -; CHECK-MVE-NEXT: vcmpe.f16 s16, #0 +; CHECK-MVE-NEXT: vcmp.f16 s16, #0 ; CHECK-MVE-NEXT: vmov.16 q3[4], r1 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r1, #0 @@ -1344,11 +1344,11 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: cset r1, ne ; CHECK-MVE-NEXT: vmovx.f16 s16, s6 ; CHECK-MVE-NEXT: lsls r1, r1, #31 -; CHECK-MVE-NEXT: vcmpe.f16 s3, #0 +; CHECK-MVE-NEXT: vcmp.f16 s3, #0 ; CHECK-MVE-NEXT: vseleq.f16 s16, s18, s16 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: vmov r1, s16 
-; CHECK-MVE-NEXT: vcmpe.f16 s0, #0 +; CHECK-MVE-NEXT: vcmp.f16 s0, #0 ; CHECK-MVE-NEXT: vmov.16 q3[5], r1 ; CHECK-MVE-NEXT: mov.w r1, #0 ; CHECK-MVE-NEXT: it mi @@ -1392,13 +1392,13 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vpush {d8, d9} ; CHECK-MVE-NEXT: vmovx.f16 s12, s0 ; CHECK-MVE-NEXT: movs r1, #0 -; CHECK-MVE-NEXT: vcmpe.f16 s12, #0 +; CHECK-MVE-NEXT: vcmp.f16 s12, #0 ; CHECK-MVE-NEXT: vmovx.f16 s12, s4 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: it ls ; CHECK-MVE-NEXT: movls r1, #1 ; CHECK-MVE-NEXT: cmp r1, #0 -; CHECK-MVE-NEXT: vcmpe.f16 s0, #0 +; CHECK-MVE-NEXT: vcmp.f16 s0, #0 ; CHECK-MVE-NEXT: cset r1, ne ; CHECK-MVE-NEXT: vmovx.f16 s14, s8 ; CHECK-MVE-NEXT: lsls r1, r1, #31 @@ -1411,7 +1411,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: cset r2, ne ; CHECK-MVE-NEXT: vmov r1, s12 ; CHECK-MVE-NEXT: lsls r2, r2, #31 -; CHECK-MVE-NEXT: vcmpe.f16 s1, #0 +; CHECK-MVE-NEXT: vcmp.f16 s1, #0 ; CHECK-MVE-NEXT: vseleq.f16 s12, s8, s4 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: vmov r2, s12 @@ -1429,7 +1429,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vseleq.f16 s16, s9, s5 ; CHECK-MVE-NEXT: vmov r1, s16 ; CHECK-MVE-NEXT: vmovx.f16 s16, s1 -; CHECK-MVE-NEXT: vcmpe.f16 s16, #0 +; CHECK-MVE-NEXT: vcmp.f16 s16, #0 ; CHECK-MVE-NEXT: vmov.16 q3[2], r1 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r1, #0 @@ -1439,7 +1439,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: cset r1, ne ; CHECK-MVE-NEXT: vmovx.f16 s16, s5 ; CHECK-MVE-NEXT: lsls r1, r1, #31 -; CHECK-MVE-NEXT: vcmpe.f16 s2, #0 +; CHECK-MVE-NEXT: vcmp.f16 s2, #0 ; CHECK-MVE-NEXT: vseleq.f16 s16, s18, s16 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: vmov r1, s16 @@ -1454,7 +1454,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vseleq.f16 s16, s10, s6 ; CHECK-MVE-NEXT: vmov r1, s16 ; CHECK-MVE-NEXT: vmovx.f16 s16, s2 -; CHECK-MVE-NEXT: vcmpe.f16 s16, #0 +; 
CHECK-MVE-NEXT: vcmp.f16 s16, #0 ; CHECK-MVE-NEXT: vmov.16 q3[4], r1 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r1, #0 @@ -1464,11 +1464,11 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: cset r1, ne ; CHECK-MVE-NEXT: vmovx.f16 s16, s6 ; CHECK-MVE-NEXT: lsls r1, r1, #31 -; CHECK-MVE-NEXT: vcmpe.f16 s3, #0 +; CHECK-MVE-NEXT: vcmp.f16 s3, #0 ; CHECK-MVE-NEXT: vseleq.f16 s16, s18, s16 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: vmov r1, s16 -; CHECK-MVE-NEXT: vcmpe.f16 s0, #0 +; CHECK-MVE-NEXT: vcmp.f16 s0, #0 ; CHECK-MVE-NEXT: vmov.16 q3[5], r1 ; CHECK-MVE-NEXT: mov.w r1, #0 ; CHECK-MVE-NEXT: it ls @@ -1770,13 +1770,13 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vpush {d8, d9} ; CHECK-MVE-NEXT: vmovx.f16 s12, s0 ; CHECK-MVE-NEXT: movs r1, #0 -; CHECK-MVE-NEXT: vcmpe.f16 s12, #0 +; CHECK-MVE-NEXT: vcmp.f16 s12, #0 ; CHECK-MVE-NEXT: vmovx.f16 s12, s4 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: it hi ; CHECK-MVE-NEXT: movhi r1, #1 ; CHECK-MVE-NEXT: cmp r1, #0 -; CHECK-MVE-NEXT: vcmpe.f16 s0, #0 +; CHECK-MVE-NEXT: vcmp.f16 s0, #0 ; CHECK-MVE-NEXT: cset r1, ne ; CHECK-MVE-NEXT: vmovx.f16 s14, s8 ; CHECK-MVE-NEXT: lsls r1, r1, #31 @@ -1789,7 +1789,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: cset r2, ne ; CHECK-MVE-NEXT: vmov r1, s12 ; CHECK-MVE-NEXT: lsls r2, r2, #31 -; CHECK-MVE-NEXT: vcmpe.f16 s1, #0 +; CHECK-MVE-NEXT: vcmp.f16 s1, #0 ; CHECK-MVE-NEXT: vseleq.f16 s12, s8, s4 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: vmov r2, s12 @@ -1807,7 +1807,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vseleq.f16 s16, s9, s5 ; CHECK-MVE-NEXT: vmov r1, s16 ; CHECK-MVE-NEXT: vmovx.f16 s16, s1 -; CHECK-MVE-NEXT: vcmpe.f16 s16, #0 +; CHECK-MVE-NEXT: vcmp.f16 s16, #0 ; CHECK-MVE-NEXT: vmov.16 q3[2], r1 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r1, #0 @@ -1817,7 +1817,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: 
cset r1, ne ; CHECK-MVE-NEXT: vmovx.f16 s16, s5 ; CHECK-MVE-NEXT: lsls r1, r1, #31 -; CHECK-MVE-NEXT: vcmpe.f16 s2, #0 +; CHECK-MVE-NEXT: vcmp.f16 s2, #0 ; CHECK-MVE-NEXT: vseleq.f16 s16, s18, s16 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: vmov r1, s16 @@ -1832,7 +1832,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vseleq.f16 s16, s10, s6 ; CHECK-MVE-NEXT: vmov r1, s16 ; CHECK-MVE-NEXT: vmovx.f16 s16, s2 -; CHECK-MVE-NEXT: vcmpe.f16 s16, #0 +; CHECK-MVE-NEXT: vcmp.f16 s16, #0 ; CHECK-MVE-NEXT: vmov.16 q3[4], r1 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r1, #0 @@ -1842,11 +1842,11 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: cset r1, ne ; CHECK-MVE-NEXT: vmovx.f16 s16, s6 ; CHECK-MVE-NEXT: lsls r1, r1, #31 -; CHECK-MVE-NEXT: vcmpe.f16 s3, #0 +; CHECK-MVE-NEXT: vcmp.f16 s3, #0 ; CHECK-MVE-NEXT: vseleq.f16 s16, s18, s16 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: vmov r1, s16 -; CHECK-MVE-NEXT: vcmpe.f16 s0, #0 +; CHECK-MVE-NEXT: vcmp.f16 s0, #0 ; CHECK-MVE-NEXT: vmov.16 q3[5], r1 ; CHECK-MVE-NEXT: mov.w r1, #0 ; CHECK-MVE-NEXT: it hi @@ -1891,13 +1891,13 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vpush {d8, d9} ; CHECK-MVE-NEXT: vmovx.f16 s12, s0 ; CHECK-MVE-NEXT: movs r1, #0 -; CHECK-MVE-NEXT: vcmpe.f16 s12, #0 +; CHECK-MVE-NEXT: vcmp.f16 s12, #0 ; CHECK-MVE-NEXT: vmovx.f16 s12, s4 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: it pl ; CHECK-MVE-NEXT: movpl r1, #1 ; CHECK-MVE-NEXT: cmp r1, #0 -; CHECK-MVE-NEXT: vcmpe.f16 s0, #0 +; CHECK-MVE-NEXT: vcmp.f16 s0, #0 ; CHECK-MVE-NEXT: cset r1, ne ; CHECK-MVE-NEXT: vmovx.f16 s14, s8 ; CHECK-MVE-NEXT: lsls r1, r1, #31 @@ -1910,7 +1910,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: cset r2, ne ; CHECK-MVE-NEXT: vmov r1, s12 ; CHECK-MVE-NEXT: lsls r2, r2, #31 -; CHECK-MVE-NEXT: vcmpe.f16 s1, #0 +; CHECK-MVE-NEXT: vcmp.f16 s1, #0 ; CHECK-MVE-NEXT: vseleq.f16 s12, s8, s4 ; CHECK-MVE-NEXT: vmrs 
APSR_nzcv, fpscr ; CHECK-MVE-NEXT: vmov r2, s12 @@ -1928,7 +1928,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vseleq.f16 s16, s9, s5 ; CHECK-MVE-NEXT: vmov r1, s16 ; CHECK-MVE-NEXT: vmovx.f16 s16, s1 -; CHECK-MVE-NEXT: vcmpe.f16 s16, #0 +; CHECK-MVE-NEXT: vcmp.f16 s16, #0 ; CHECK-MVE-NEXT: vmov.16 q3[2], r1 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r1, #0 @@ -1938,7 +1938,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: cset r1, ne ; CHECK-MVE-NEXT: vmovx.f16 s16, s5 ; CHECK-MVE-NEXT: lsls r1, r1, #31 -; CHECK-MVE-NEXT: vcmpe.f16 s2, #0 +; CHECK-MVE-NEXT: vcmp.f16 s2, #0 ; CHECK-MVE-NEXT: vseleq.f16 s16, s18, s16 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: vmov r1, s16 @@ -1953,7 +1953,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vseleq.f16 s16, s10, s6 ; CHECK-MVE-NEXT: vmov r1, s16 ; CHECK-MVE-NEXT: vmovx.f16 s16, s2 -; CHECK-MVE-NEXT: vcmpe.f16 s16, #0 +; CHECK-MVE-NEXT: vcmp.f16 s16, #0 ; CHECK-MVE-NEXT: vmov.16 q3[4], r1 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r1, #0 @@ -1963,11 +1963,11 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: cset r1, ne ; CHECK-MVE-NEXT: vmovx.f16 s16, s6 ; CHECK-MVE-NEXT: lsls r1, r1, #31 -; CHECK-MVE-NEXT: vcmpe.f16 s3, #0 +; CHECK-MVE-NEXT: vcmp.f16 s3, #0 ; CHECK-MVE-NEXT: vseleq.f16 s16, s18, s16 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: vmov r1, s16 -; CHECK-MVE-NEXT: vcmpe.f16 s0, #0 +; CHECK-MVE-NEXT: vcmp.f16 s0, #0 ; CHECK-MVE-NEXT: vmov.16 q3[5], r1 ; CHECK-MVE-NEXT: mov.w r1, #0 ; CHECK-MVE-NEXT: it pl @@ -2012,13 +2012,13 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vpush {d8, d9} ; CHECK-MVE-NEXT: vmovx.f16 s12, s0 ; CHECK-MVE-NEXT: movs r1, #0 -; CHECK-MVE-NEXT: vcmpe.f16 s12, #0 +; CHECK-MVE-NEXT: vcmp.f16 s12, #0 ; CHECK-MVE-NEXT: vmovx.f16 s12, s4 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: it lt ; CHECK-MVE-NEXT: movlt r1, #1 ; CHECK-MVE-NEXT: 
cmp r1, #0 -; CHECK-MVE-NEXT: vcmpe.f16 s0, #0 +; CHECK-MVE-NEXT: vcmp.f16 s0, #0 ; CHECK-MVE-NEXT: cset r1, ne ; CHECK-MVE-NEXT: vmovx.f16 s14, s8 ; CHECK-MVE-NEXT: lsls r1, r1, #31 @@ -2031,7 +2031,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: cset r2, ne ; CHECK-MVE-NEXT: vmov r1, s12 ; CHECK-MVE-NEXT: lsls r2, r2, #31 -; CHECK-MVE-NEXT: vcmpe.f16 s1, #0 +; CHECK-MVE-NEXT: vcmp.f16 s1, #0 ; CHECK-MVE-NEXT: vseleq.f16 s12, s8, s4 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: vmov r2, s12 @@ -2049,7 +2049,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vseleq.f16 s16, s9, s5 ; CHECK-MVE-NEXT: vmov r1, s16 ; CHECK-MVE-NEXT: vmovx.f16 s16, s1 -; CHECK-MVE-NEXT: vcmpe.f16 s16, #0 +; CHECK-MVE-NEXT: vcmp.f16 s16, #0 ; CHECK-MVE-NEXT: vmov.16 q3[2], r1 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r1, #0 @@ -2059,7 +2059,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: cset r1, ne ; CHECK-MVE-NEXT: vmovx.f16 s16, s5 ; CHECK-MVE-NEXT: lsls r1, r1, #31 -; CHECK-MVE-NEXT: vcmpe.f16 s2, #0 +; CHECK-MVE-NEXT: vcmp.f16 s2, #0 ; CHECK-MVE-NEXT: vseleq.f16 s16, s18, s16 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: vmov r1, s16 @@ -2074,7 +2074,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vseleq.f16 s16, s10, s6 ; CHECK-MVE-NEXT: vmov r1, s16 ; CHECK-MVE-NEXT: vmovx.f16 s16, s2 -; CHECK-MVE-NEXT: vcmpe.f16 s16, #0 +; CHECK-MVE-NEXT: vcmp.f16 s16, #0 ; CHECK-MVE-NEXT: vmov.16 q3[4], r1 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r1, #0 @@ -2084,11 +2084,11 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: cset r1, ne ; CHECK-MVE-NEXT: vmovx.f16 s16, s6 ; CHECK-MVE-NEXT: lsls r1, r1, #31 -; CHECK-MVE-NEXT: vcmpe.f16 s3, #0 +; CHECK-MVE-NEXT: vcmp.f16 s3, #0 ; CHECK-MVE-NEXT: vseleq.f16 s16, s18, s16 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: vmov r1, s16 -; CHECK-MVE-NEXT: vcmpe.f16 s0, #0 +; CHECK-MVE-NEXT: vcmp.f16 s0, #0 
; CHECK-MVE-NEXT: vmov.16 q3[5], r1 ; CHECK-MVE-NEXT: mov.w r1, #0 ; CHECK-MVE-NEXT: it lt @@ -2133,13 +2133,13 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vpush {d8, d9} ; CHECK-MVE-NEXT: vmovx.f16 s12, s0 ; CHECK-MVE-NEXT: movs r1, #0 -; CHECK-MVE-NEXT: vcmpe.f16 s12, #0 +; CHECK-MVE-NEXT: vcmp.f16 s12, #0 ; CHECK-MVE-NEXT: vmovx.f16 s12, s4 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: it le ; CHECK-MVE-NEXT: movle r1, #1 ; CHECK-MVE-NEXT: cmp r1, #0 -; CHECK-MVE-NEXT: vcmpe.f16 s0, #0 +; CHECK-MVE-NEXT: vcmp.f16 s0, #0 ; CHECK-MVE-NEXT: cset r1, ne ; CHECK-MVE-NEXT: vmovx.f16 s14, s8 ; CHECK-MVE-NEXT: lsls r1, r1, #31 @@ -2152,7 +2152,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: cset r2, ne ; CHECK-MVE-NEXT: vmov r1, s12 ; CHECK-MVE-NEXT: lsls r2, r2, #31 -; CHECK-MVE-NEXT: vcmpe.f16 s1, #0 +; CHECK-MVE-NEXT: vcmp.f16 s1, #0 ; CHECK-MVE-NEXT: vseleq.f16 s12, s8, s4 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: vmov r2, s12 @@ -2170,7 +2170,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vseleq.f16 s16, s9, s5 ; CHECK-MVE-NEXT: vmov r1, s16 ; CHECK-MVE-NEXT: vmovx.f16 s16, s1 -; CHECK-MVE-NEXT: vcmpe.f16 s16, #0 +; CHECK-MVE-NEXT: vcmp.f16 s16, #0 ; CHECK-MVE-NEXT: vmov.16 q3[2], r1 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r1, #0 @@ -2180,7 +2180,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: cset r1, ne ; CHECK-MVE-NEXT: vmovx.f16 s16, s5 ; CHECK-MVE-NEXT: lsls r1, r1, #31 -; CHECK-MVE-NEXT: vcmpe.f16 s2, #0 +; CHECK-MVE-NEXT: vcmp.f16 s2, #0 ; CHECK-MVE-NEXT: vseleq.f16 s16, s18, s16 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: vmov r1, s16 @@ -2195,7 +2195,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vseleq.f16 s16, s10, s6 ; CHECK-MVE-NEXT: vmov r1, s16 ; CHECK-MVE-NEXT: vmovx.f16 s16, s2 -; CHECK-MVE-NEXT: vcmpe.f16 s16, #0 +; CHECK-MVE-NEXT: vcmp.f16 s16, #0 ; CHECK-MVE-NEXT: vmov.16 q3[4], r1 ; 
CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r1, #0 @@ -2205,11 +2205,11 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: cset r1, ne ; CHECK-MVE-NEXT: vmovx.f16 s16, s6 ; CHECK-MVE-NEXT: lsls r1, r1, #31 -; CHECK-MVE-NEXT: vcmpe.f16 s3, #0 +; CHECK-MVE-NEXT: vcmp.f16 s3, #0 ; CHECK-MVE-NEXT: vseleq.f16 s16, s18, s16 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: vmov r1, s16 -; CHECK-MVE-NEXT: vcmpe.f16 s0, #0 +; CHECK-MVE-NEXT: vcmp.f16 s0, #0 ; CHECK-MVE-NEXT: vmov.16 q3[5], r1 ; CHECK-MVE-NEXT: mov.w r1, #0 ; CHECK-MVE-NEXT: it le @@ -2254,13 +2254,13 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vpush {d8, d9} ; CHECK-MVE-NEXT: vmovx.f16 s12, s0 ; CHECK-MVE-NEXT: movs r1, #0 -; CHECK-MVE-NEXT: vcmpe.f16 s12, s12 +; CHECK-MVE-NEXT: vcmp.f16 s12, s12 ; CHECK-MVE-NEXT: vmovx.f16 s12, s4 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: it vc ; CHECK-MVE-NEXT: movvc r1, #1 ; CHECK-MVE-NEXT: cmp r1, #0 -; CHECK-MVE-NEXT: vcmpe.f16 s0, s0 +; CHECK-MVE-NEXT: vcmp.f16 s0, s0 ; CHECK-MVE-NEXT: cset r1, ne ; CHECK-MVE-NEXT: vmovx.f16 s14, s8 ; CHECK-MVE-NEXT: lsls r1, r1, #31 @@ -2273,7 +2273,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: cset r2, ne ; CHECK-MVE-NEXT: vmov r1, s12 ; CHECK-MVE-NEXT: lsls r2, r2, #31 -; CHECK-MVE-NEXT: vcmpe.f16 s1, s1 +; CHECK-MVE-NEXT: vcmp.f16 s1, s1 ; CHECK-MVE-NEXT: vseleq.f16 s12, s8, s4 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: vmov r2, s12 @@ -2291,7 +2291,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vseleq.f16 s16, s9, s5 ; CHECK-MVE-NEXT: vmov r1, s16 ; CHECK-MVE-NEXT: vmovx.f16 s16, s1 -; CHECK-MVE-NEXT: vcmpe.f16 s16, s16 +; CHECK-MVE-NEXT: vcmp.f16 s16, s16 ; CHECK-MVE-NEXT: vmov.16 q3[2], r1 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r1, #0 @@ -2301,7 +2301,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: cset r1, ne ; CHECK-MVE-NEXT: vmovx.f16 s16, s5 ; CHECK-MVE-NEXT: lsls 
r1, r1, #31 -; CHECK-MVE-NEXT: vcmpe.f16 s2, s2 +; CHECK-MVE-NEXT: vcmp.f16 s2, s2 ; CHECK-MVE-NEXT: vseleq.f16 s16, s18, s16 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: vmov r1, s16 @@ -2316,7 +2316,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vseleq.f16 s16, s10, s6 ; CHECK-MVE-NEXT: vmov r1, s16 ; CHECK-MVE-NEXT: vmovx.f16 s16, s2 -; CHECK-MVE-NEXT: vcmpe.f16 s16, s16 +; CHECK-MVE-NEXT: vcmp.f16 s16, s16 ; CHECK-MVE-NEXT: vmov.16 q3[4], r1 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r1, #0 @@ -2326,11 +2326,11 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: cset r1, ne ; CHECK-MVE-NEXT: vmovx.f16 s16, s6 ; CHECK-MVE-NEXT: lsls r1, r1, #31 -; CHECK-MVE-NEXT: vcmpe.f16 s3, s3 +; CHECK-MVE-NEXT: vcmp.f16 s3, s3 ; CHECK-MVE-NEXT: vseleq.f16 s16, s18, s16 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: vmov r1, s16 -; CHECK-MVE-NEXT: vcmpe.f16 s0, s0 +; CHECK-MVE-NEXT: vcmp.f16 s0, s0 ; CHECK-MVE-NEXT: vmov.16 q3[5], r1 ; CHECK-MVE-NEXT: mov.w r1, #0 ; CHECK-MVE-NEXT: it vc @@ -2377,13 +2377,13 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vpush {d8, d9} ; CHECK-MVE-NEXT: vmovx.f16 s12, s0 ; CHECK-MVE-NEXT: movs r1, #0 -; CHECK-MVE-NEXT: vcmpe.f16 s12, s12 +; CHECK-MVE-NEXT: vcmp.f16 s12, s12 ; CHECK-MVE-NEXT: vmovx.f16 s12, s4 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: it vs ; CHECK-MVE-NEXT: movvs r1, #1 ; CHECK-MVE-NEXT: cmp r1, #0 -; CHECK-MVE-NEXT: vcmpe.f16 s0, s0 +; CHECK-MVE-NEXT: vcmp.f16 s0, s0 ; CHECK-MVE-NEXT: cset r1, ne ; CHECK-MVE-NEXT: vmovx.f16 s14, s8 ; CHECK-MVE-NEXT: lsls r1, r1, #31 @@ -2396,7 +2396,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: cset r2, ne ; CHECK-MVE-NEXT: vmov r1, s12 ; CHECK-MVE-NEXT: lsls r2, r2, #31 -; CHECK-MVE-NEXT: vcmpe.f16 s1, s1 +; CHECK-MVE-NEXT: vcmp.f16 s1, s1 ; CHECK-MVE-NEXT: vseleq.f16 s12, s8, s4 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: vmov r2, s12 @@ -2414,7 +2414,7 @@ 
define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vseleq.f16 s16, s9, s5 ; CHECK-MVE-NEXT: vmov r1, s16 ; CHECK-MVE-NEXT: vmovx.f16 s16, s1 -; CHECK-MVE-NEXT: vcmpe.f16 s16, s16 +; CHECK-MVE-NEXT: vcmp.f16 s16, s16 ; CHECK-MVE-NEXT: vmov.16 q3[2], r1 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r1, #0 @@ -2424,7 +2424,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: cset r1, ne ; CHECK-MVE-NEXT: vmovx.f16 s16, s5 ; CHECK-MVE-NEXT: lsls r1, r1, #31 -; CHECK-MVE-NEXT: vcmpe.f16 s2, s2 +; CHECK-MVE-NEXT: vcmp.f16 s2, s2 ; CHECK-MVE-NEXT: vseleq.f16 s16, s18, s16 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: vmov r1, s16 @@ -2439,7 +2439,7 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: vseleq.f16 s16, s10, s6 ; CHECK-MVE-NEXT: vmov r1, s16 ; CHECK-MVE-NEXT: vmovx.f16 s16, s2 -; CHECK-MVE-NEXT: vcmpe.f16 s16, s16 +; CHECK-MVE-NEXT: vcmp.f16 s16, s16 ; CHECK-MVE-NEXT: vmov.16 q3[4], r1 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: mov.w r1, #0 @@ -2449,11 +2449,11 @@ define arm_aapcs_vfpcc <8 x half> @vcmp_ ; CHECK-MVE-NEXT: cset r1, ne ; CHECK-MVE-NEXT: vmovx.f16 s16, s6 ; CHECK-MVE-NEXT: lsls r1, r1, #31 -; CHECK-MVE-NEXT: vcmpe.f16 s3, s3 +; CHECK-MVE-NEXT: vcmp.f16 s3, s3 ; CHECK-MVE-NEXT: vseleq.f16 s16, s18, s16 ; CHECK-MVE-NEXT: vmrs APSR_nzcv, fpscr ; CHECK-MVE-NEXT: vmov r1, s16 -; CHECK-MVE-NEXT: vcmpe.f16 s0, s0 +; CHECK-MVE-NEXT: vcmp.f16 s0, s0 ; CHECK-MVE-NEXT: vmov.16 q3[5], r1 ; CHECK-MVE-NEXT: mov.w r1, #0 ; CHECK-MVE-NEXT: it vs From llvm-commits at lists.llvm.org Tue Oct 8 01:24:21 2019 From: llvm-commits at lists.llvm.org (Kai Nacke via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 08:24:21 +0000 (UTC) Subject: [PATCH] D67696: [tools] Mark output of tools as text if it is really text In-Reply-To: References: Message-ID: This revision was automatically updated to reflect the committed changes. 
Closed by commit rGc9ddda840526: [Tools] Mark output of tools as text if it is text (authored by Kai). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67696/new/ https://reviews.llvm.org/D67696 Files: llvm/lib/IR/RemarkStreamer.cpp llvm/tools/llvm-dis/llvm-dis.cpp llvm/tools/llvm-dwarfdump/llvm-dwarfdump.cpp llvm/tools/llvm-mc/llvm-mc.cpp llvm/tools/llvm-mca/llvm-mca.cpp llvm/tools/opt/opt.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D67696.223799.patch Type: text/x-patch Size: 4701 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 01:26:20 2019 From: llvm-commits at lists.llvm.org (George Rimar via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 08:26:20 +0000 (UTC) Subject: [PATCH] D68462: [llvm-readobj/llvm-readelf] - Add checks for GNU-style to "all.test" test case. In-Reply-To: References: Message-ID: <2dea0aec4d1b75e3554cd710ea8dc4bc@localhost.localdomain> grimar marked an inline comment as done. grimar added inline comments. ================ Comment at: test/tools/llvm-readobj/all.test:86-90 + - Name: .eh_frame_hdr + Type: SHT_PROGBITS +## An arbitrary linker-generated valid content. + Content: 011b033b140000000100000000f0ffff30000000 + - Name: .eh_frame ---------------- grimar wrote: > jhenderson wrote: > > grimar wrote: > > > jhenderson wrote: > > > > grimar wrote: > > > > > jhenderson wrote: > > > > > > Same comments as earlier. Can these be empty? > > > > > No. We need to have something valid here, otherwise any > > > > > error triggered will fail the dumping. > > > > > (e.g. https://github.com/llvm-mirror/llvm/blob/master/tools/llvm-readobj/DwarfCFIEHPrinter.h#L127). > > > > You don't actually need the .eh_frame_hdr at all, looking at the code, I think, just the .eh_frame section. That said, this appears to be different from GNU readelf. 
> > > > > > > > For reference, GNU readelf prints "There are no unwind sections in this file" if there are no .eh_frame_hdr sections even if there is a .eh_frame section (not looked to see what happens if there is a PT_GNU_EH_FRAME program header). > > > I am a bit confused. > > > > > > Imagine we have the code and invocations below: > > > > > > "1.s": > > > ``` > > > .section foo,"ax", at progbits > > > .cfi_startproc > > > nop > > > .cfi_endproc > > > ``` > > > > > > ``` > > > as 1.s -o 1.o > > > ld.bfd 1.o -o with_hdr --eh-frame-hdr > > > ld.bfd 1.o -o wo_hdr > > > ``` > > > > > > For both of them I do not see neither `.eh_frame_hdr` nor `.eh_frame` section dumped with `-a`. > > > I see ".eh_frame" dumped when I add `-wf` though (but still no `.eh_frame_hdr`). > > > e.g.: > > > > > > > > > ``` > > > umb at ubuntu:~/tests/81$ readelf -v > > > GNU readelf (GNU Binutils for Ubuntu) 2.31.1 > > > Copyright (C) 2018 Free Software Foundation, Inc. > > > > > > umb at ubuntu:~/tests/81$ readelf -a with_hdr > > > ELF Header: > > > Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00 > > > Class: ELF64 > > > Data: 2's complement, little endian > > > Version: 1 (current) > > > OS/ABI: UNIX - System V > > > ABI Version: 0 > > > Type: EXEC (Executable file) > > > Machine: Advanced Micro Devices X86-64 > > > Version: 0x1 > > > Entry point address: 0x401000 > > > Start of program headers: 64 (bytes into file) > > > Start of section headers: 8584 (bytes into file) > > > Flags: 0x0 > > > Size of this header: 64 (bytes) > > > Size of program headers: 56 (bytes) > > > Number of program headers: 3 > > > Size of section headers: 64 (bytes) > > > Number of section headers: 7 > > > Section header string table index: 6 > > > > > > Section Headers: > > > [Nr] Name Type Address Offset > > > Size EntSize Flags Link Info Align > > > [ 0] NULL 0000000000000000 00000000 > > > 0000000000000000 0000000000000000 0 0 0 > > > [ 1] foo PROGBITS 0000000000401000 00001000 > > > 0000000000000001 
0000000000000000 AX 0 0 1 > > > [ 2] .eh_frame_hdr PROGBITS 0000000000402000 00002000 > > > 0000000000000014 0000000000000000 A 0 0 4 > > > [ 3] .eh_frame PROGBITS 0000000000402018 00002018 > > > 000000000000002c 0000000000000000 A 0 0 8 > > > [ 4] .symtab SYMTAB 0000000000000000 00002048 > > > 00000000000000d8 0000000000000018 5 5 8 > > > [ 5] .strtab STRTAB 0000000000000000 00002120 > > > 000000000000002c 0000000000000000 0 0 1 > > > [ 6] .shstrtab STRTAB 0000000000000000 0000214c > > > 0000000000000037 0000000000000000 0 0 1 > > > Key to Flags: > > > W (write), A (alloc), X (execute), M (merge), S (strings), I (info), > > > L (link order), O (extra OS processing required), G (group), T (TLS), > > > C (compressed), x (unknown), o (OS specific), E (exclude), > > > l (large), p (processor specific) > > > > > > There are no section groups in this file. > > > > > > Program Headers: > > > Type Offset VirtAddr PhysAddr > > > FileSiz MemSiz Flags Align > > > LOAD 0x0000000000001000 0x0000000000401000 0x0000000000401000 > > > 0x0000000000000001 0x0000000000000001 R E 0x1000 > > > LOAD 0x0000000000002000 0x0000000000402000 0x0000000000402000 > > > 0x0000000000000044 0x0000000000000044 R 0x1000 > > > GNU_EH_FRAME 0x0000000000002000 0x0000000000402000 0x0000000000402000 > > > 0x0000000000000014 0x0000000000000014 R 0x4 > > > > > > Section to Segment mapping: > > > Segment Sections... > > > 00 foo > > > 01 .eh_frame_hdr .eh_frame > > > 02 .eh_frame_hdr > > > > > > There is no dynamic section in this file. > > > > > > There are no relocations in this file. > > > > > > The decoding of unwind sections for machine type Advanced Micro Devices X86-64 is not currently supported. 
> > > > > > Symbol table '.symtab' contains 9 entries: > > > Num: Value Size Type Bind Vis Ndx Name > > > 0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND > > > 1: 0000000000401000 0 SECTION LOCAL DEFAULT 1 > > > 2: 0000000000402000 0 SECTION LOCAL DEFAULT 2 > > > 3: 0000000000402018 0 SECTION LOCAL DEFAULT 3 > > > 4: 0000000000402000 0 NOTYPE LOCAL DEFAULT 2 __GNU_EH_FRAME_HDR > > > 5: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND _start > > > 6: 0000000000404000 0 NOTYPE GLOBAL DEFAULT 3 __bss_start > > > 7: 0000000000404000 0 NOTYPE GLOBAL DEFAULT 3 _edata > > > 8: 0000000000404000 0 NOTYPE GLOBAL DEFAULT 3 _end > > > > > > No version information found in this file. > > > > > > umb at ubuntu:~/tests/81$ readelf -a with_hdr -wf > > > ELF Header: > > > Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00 > > > Class: ELF64 > > > Data: 2's complement, little endian > > > Version: 1 (current) > > > OS/ABI: UNIX - System V > > > ABI Version: 0 > > > Type: EXEC (Executable file) > > > Machine: Advanced Micro Devices X86-64 > > > Version: 0x1 > > > Entry point address: 0x401000 > > > Start of program headers: 64 (bytes into file) > > > Start of section headers: 8584 (bytes into file) > > > Flags: 0x0 > > > Size of this header: 64 (bytes) > > > Size of program headers: 56 (bytes) > > > Number of program headers: 3 > > > Size of section headers: 64 (bytes) > > > Number of section headers: 7 > > > Section header string table index: 6 > > > > > > Section Headers: > > > [Nr] Name Type Address Offset > > > Size EntSize Flags Link Info Align > > > [ 0] NULL 0000000000000000 00000000 > > > 0000000000000000 0000000000000000 0 0 0 > > > [ 1] foo PROGBITS 0000000000401000 00001000 > > > 0000000000000001 0000000000000000 AX 0 0 1 > > > [ 2] .eh_frame_hdr PROGBITS 0000000000402000 00002000 > > > 0000000000000014 0000000000000000 A 0 0 4 > > > [ 3] .eh_frame PROGBITS 0000000000402018 00002018 > > > 000000000000002c 0000000000000000 A 0 0 8 > > > [ 4] .symtab SYMTAB 0000000000000000 
00002048 > > > 00000000000000d8 0000000000000018 5 5 8 > > > [ 5] .strtab STRTAB 0000000000000000 00002120 > > > 000000000000002c 0000000000000000 0 0 1 > > > [ 6] .shstrtab STRTAB 0000000000000000 0000214c > > > 0000000000000037 0000000000000000 0 0 1 > > > Key to Flags: > > > W (write), A (alloc), X (execute), M (merge), S (strings), I (info), > > > L (link order), O (extra OS processing required), G (group), T (TLS), > > > C (compressed), x (unknown), o (OS specific), E (exclude), > > > l (large), p (processor specific) > > > > > > There are no section groups in this file. > > > > > > Program Headers: > > > Type Offset VirtAddr PhysAddr > > > FileSiz MemSiz Flags Align > > > LOAD 0x0000000000001000 0x0000000000401000 0x0000000000401000 > > > 0x0000000000000001 0x0000000000000001 R E 0x1000 > > > LOAD 0x0000000000002000 0x0000000000402000 0x0000000000402000 > > > 0x0000000000000044 0x0000000000000044 R 0x1000 > > > GNU_EH_FRAME 0x0000000000002000 0x0000000000402000 0x0000000000402000 > > > 0x0000000000000014 0x0000000000000014 R 0x4 > > > > > > Section to Segment mapping: > > > Segment Sections... > > > 00 foo > > > 01 .eh_frame_hdr .eh_frame > > > 02 .eh_frame_hdr > > > > > > There is no dynamic section in this file. > > > > > > There are no relocations in this file. > > > > > > The decoding of unwind sections for machine type Advanced Micro Devices X86-64 is not currently supported. 
> > > > > > Symbol table '.symtab' contains 9 entries: > > > Num: Value Size Type Bind Vis Ndx Name > > > 0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND > > > 1: 0000000000401000 0 SECTION LOCAL DEFAULT 1 > > > 2: 0000000000402000 0 SECTION LOCAL DEFAULT 2 > > > 3: 0000000000402018 0 SECTION LOCAL DEFAULT 3 > > > 4: 0000000000402000 0 NOTYPE LOCAL DEFAULT 2 __GNU_EH_FRAME_HDR > > > 5: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND _start > > > 6: 0000000000404000 0 NOTYPE GLOBAL DEFAULT 3 __bss_start > > > 7: 0000000000404000 0 NOTYPE GLOBAL DEFAULT 3 _edata > > > 8: 0000000000404000 0 NOTYPE GLOBAL DEFAULT 3 _end > > > > > > No version information found in this file. > > > Contents of the .eh_frame section: > > > > > > > > > 00000000 0000000000000014 00000000 CIE > > > Version: 1 > > > Augmentation: "zR" > > > Code alignment factor: 1 > > > Data alignment factor: -8 > > > Return address column: 16 > > > Augmentation data: 1b > > > DW_CFA_def_cfa: r7 (rsp) ofs 8 > > > DW_CFA_offset: r16 (rip) at cfa-8 > > > DW_CFA_nop > > > DW_CFA_nop > > > > > > 00000018 0000000000000010 0000001c FDE cie=00000000 pc=0000000000401000..0000000000401001 > > > DW_CFA_nop > > > DW_CFA_nop > > > DW_CFA_nop > > > > > > > > > ``` > > > > > > I also see ".eh_frame" dumped when there is no ".eh_frame_hdr": > > > > > > > > > ``` > > > umb at ubuntu:~/tests/81$ readelf -a wo_hdr -wf > > > ELF Header: > > > Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00 > > > Class: ELF64 > > > Data: 2's complement, little endian > > > Version: 1 (current) > > > OS/ABI: UNIX - System V > > > ABI Version: 0 > > > Type: EXEC (Executable file) > > > Machine: Advanced Micro Devices X86-64 > > > Version: 0x1 > > > Entry point address: 0x401000 > > > Start of program headers: 64 (bytes into file) > > > Start of section headers: 8480 (bytes into file) > > > Flags: 0x0 > > > Size of this header: 64 (bytes) > > > Size of program headers: 56 (bytes) > > > Number of program headers: 2 > > > Size of section headers: 
64 (bytes) > > > Number of section headers: 6 > > > Section header string table index: 5 > > > > > > Section Headers: > > > [Nr] Name Type Address Offset > > > Size EntSize Flags Link Info Align > > > [ 0] NULL 0000000000000000 00000000 > > > 0000000000000000 0000000000000000 0 0 0 > > > [ 1] foo PROGBITS 0000000000401000 00001000 > > > 0000000000000001 0000000000000000 AX 0 0 1 > > > [ 2] .eh_frame PROGBITS 0000000000402000 00002000 > > > 000000000000002c 0000000000000000 A 0 0 8 > > > [ 3] .symtab SYMTAB 0000000000000000 00002030 > > > 00000000000000a8 0000000000000018 4 3 8 > > > [ 4] .strtab STRTAB 0000000000000000 000020d8 > > > 0000000000000019 0000000000000000 0 0 1 > > > [ 5] .shstrtab STRTAB 0000000000000000 000020f1 > > > 0000000000000029 0000000000000000 0 0 1 > > > Key to Flags: > > > W (write), A (alloc), X (execute), M (merge), S (strings), I (info), > > > L (link order), O (extra OS processing required), G (group), T (TLS), > > > C (compressed), x (unknown), o (OS specific), E (exclude), > > > l (large), p (processor specific) > > > > > > There are no section groups in this file. > > > > > > Program Headers: > > > Type Offset VirtAddr PhysAddr > > > FileSiz MemSiz Flags Align > > > LOAD 0x0000000000001000 0x0000000000401000 0x0000000000401000 > > > 0x0000000000000001 0x0000000000000001 R E 0x1000 > > > LOAD 0x0000000000002000 0x0000000000402000 0x0000000000402000 > > > 0x000000000000002c 0x000000000000002c R 0x1000 > > > > > > Section to Segment mapping: > > > Segment Sections... > > > 00 foo > > > 01 .eh_frame > > > > > > There is no dynamic section in this file. > > > > > > There are no relocations in this file. > > > > > > The decoding of unwind sections for machine type Advanced Micro Devices X86-64 is not currently supported. 
> > > > > > Symbol table '.symtab' contains 7 entries: > > > Num: Value Size Type Bind Vis Ndx Name > > > 0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND > > > 1: 0000000000401000 0 SECTION LOCAL DEFAULT 1 > > > 2: 0000000000402000 0 SECTION LOCAL DEFAULT 2 > > > 3: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND _start > > > 4: 0000000000404000 0 NOTYPE GLOBAL DEFAULT 2 __bss_start > > > 5: 0000000000404000 0 NOTYPE GLOBAL DEFAULT 2 _edata > > > 6: 0000000000404000 0 NOTYPE GLOBAL DEFAULT 2 _end > > > > > > No version information found in this file. > > > Contents of the .eh_frame section: > > > > > > > > > 00000000 0000000000000014 00000000 CIE > > > Version: 1 > > > Augmentation: "zR" > > > Code alignment factor: 1 > > > Data alignment factor: -8 > > > Return address column: 16 > > > Augmentation data: 1b > > > DW_CFA_def_cfa: r7 (rsp) ofs 8 > > > DW_CFA_offset: r16 (rip) at cfa-8 > > > DW_CFA_nop > > > DW_CFA_nop > > > > > > 00000018 0000000000000010 0000001c FDE cie=00000000 pc=0000000000401000..0000000000401001 > > > DW_CFA_nop > > > DW_CFA_nop > > > DW_CFA_nop > > > > > > } > > > ``` > > > > > > Since we have such differences in the behavior, should we just test the current behavior atm? > > > I.e. before this diff I tested "EH_FRAME Header [", now I also added a check for ".eh_frame section at offset...". > > > Both of them are dumped at the top level currently. Seems reasonable to test the fact we do that (with just `-all`) > > > and the order, probably? > > > (I am ok to change it in any way actually, but just wanted to clarify this before doing anything with it.) > > I'm confused too. Your output above even appears to be conflicted. Note that it mentions "The decoding of unwind sections for machine type Advanced Micro Devices X86-64 is not currently supported." > > > > Anyway, probably best to file a bug to record the issue and then do as you're doing here (i.e. test the current behaviour). 
> > Note that it mentions "The decoding of unwind sections for machine type Advanced Micro Devices X86-64 is not currently supported." > > Yes, that looks strange. I'll build the latest binutils from sources tomorrow and check what it does, then probably file a bug or prepare a patch. Thanks for the review! I've built GNU readelf (GNU Binutils) 2.33.50.20191008 today and it still shows "The decoding of unwind sections for machine type Advanced Micro Devices X86-64 is not currently supported." for me when I request "-a" or "-u" (--unwind Display the unwind info (if present)). It seems to be a known one-year-old bug: https://bugzilla.redhat.com/show_bug.cgi?id=1626614 And one of the interesting comments is: "A workaround is readelf -wF or -wf however, it is unclear to me why -u isn't wired up to do the same thing as -wf. Is that something worth considering?" I am going to commit this patch and do nothing else for it atm (i.e. I will not report any bugs for llvm-readelf until the situation with GNU is fixed, since our current behavior looks valid and I see no issues). How does that sound to you? CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68462/new/ https://reviews.llvm.org/D68462 From llvm-commits at lists.llvm.org Tue Oct 8 01:26:28 2019 From: llvm-commits at lists.llvm.org (Kristof Beyls via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 08:26:28 +0000 (UTC) Subject: [PATCH] D68463: [ARM] Generate vcmp instead of vcmpe In-Reply-To: References: Message-ID: <89013908fce70040027903d2094fd462@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rG78bfe3ab9475: [ARM] Generate vcmp instead of vcmpe (authored by kristof.beyls).
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68463/new/ https://reviews.llvm.org/D68463 Files: llvm/lib/Target/ARM/ARMFastISel.cpp llvm/lib/Target/ARM/ARMISelLowering.cpp llvm/lib/Target/ARM/ARMISelLowering.h llvm/lib/Target/ARM/ARMInstrInfo.td llvm/lib/Target/ARM/ARMInstrVFP.td llvm/test/CodeGen/ARM/2009-07-18-RewriterBug.ll llvm/test/CodeGen/ARM/arm-shrink-wrapping.ll llvm/test/CodeGen/ARM/compare-call.ll llvm/test/CodeGen/ARM/fcmp-xo.ll llvm/test/CodeGen/ARM/float-helpers.s llvm/test/CodeGen/ARM/fp16-instructions.ll llvm/test/CodeGen/ARM/fp16-promote.ll llvm/test/CodeGen/ARM/fpcmp.ll llvm/test/CodeGen/ARM/ifcvt11.ll llvm/test/CodeGen/ARM/swifterror.ll llvm/test/CodeGen/ARM/vcmp-crash.ll llvm/test/CodeGen/ARM/vfp.ll llvm/test/CodeGen/ARM/vsel-fp16.ll llvm/test/CodeGen/ARM/vsel.ll llvm/test/CodeGen/Thumb2/float-cmp.ll llvm/test/CodeGen/Thumb2/mve-vcmpf.ll llvm/test/CodeGen/Thumb2/mve-vcmpfr.ll llvm/test/CodeGen/Thumb2/mve-vcmpfz.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D68463.223800.patch Type: text/x-patch Size: 166325 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 01:33:23 2019 From: llvm-commits at lists.llvm.org (Hans Wennborg via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 08:33:23 +0000 (UTC) Subject: [PATCH] D68570: Unify the two CRC implementations In-Reply-To: References: Message-ID: hans marked 5 inline comments as done. hans added a comment. In D68570#1697872 , @rupprecht wrote: > In D68570#1697633 , @hans wrote: > > > In D68570#1697588 , @joerg wrote: > > > > > Why go back to the large tables for crc32? Just because JamCRC had that bug doesn't mean it should persist. > > > > > > Because just using the table is much simpler and we already have it: no need for any run-time initialization and fancy code like call_once. Why do you consider it a bug? 
Generating a constant table like this at run-time -- again and again for each invocation of the program -- seems less than ideal to me. > > > Do you have any benchmarks? Of what? That generating the table is slower than just using it directly? I'd say no benchmark is needed to conclude that :-) > A table is simpler in some regards, but also less readable in another sense (what are these random hex values?). Having benchmark results helps settle that debate. It's the standard table for the CRC-32 polynomial. I'll add a comment with some references. In D68570#1697933 , @thakis wrote: > Also, in practice most clients will build against zlib and not see the tables. +1 to the current approach :) Windows builds will typically not use zlib though, so this code does get used :-) In D68570#1697940 , @mgorny wrote: > I'd personally prefer either the non-table approach or having the tables generated at build time. Given this is only going to be used rarely, I don't think we should clutter the code with big tables. It's not used rarely, it's used all the time on Windows. Generating the table at build time would be nice in theory, but writing a tablegen and hooking it up in the build system seems like overkill for this. ================ Comment at: llvm/include/llvm/Support/JamCRC.h:45 + CRC ^= 0xFFFFFFFFU; // Undo CRC-32 Init. + CRC = llvm::crc32(CRC, Data); + CRC ^= 0xFFFFFFFFU; // Undo CRC-32 XorOut. ---------------- ruiu wrote: > This is the only place where you pass non-zero value as the first argument, and the way how that value is handled is a little irregular. So how about moving this class to CRC.h and define `llvm::crc32` as `llvm::crc32(ArrayRef)`? Actually, lldb/source/Plugins/ObjectFile/ELF/ObjectFileELF.cpp also passes a non-zero value (from ObjectFileELF::CalculateELFNotesSegmentsCRC32). But that's not the common case. Let's make an overload for the common case, and moving JamCRC into CRC.h makes sense too. 
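For readers following the table discussion above, here is a minimal sketch of how the 256-entry table for the reflected CRC-32 polynomial can be derived, and of how the JamCRC result relates to plain CRC-32 by undoing the Init and XorOut steps. This is an illustration only, not the code from the patch: the names makeCRC32Table, crc32Sketch and jamCRCSketch are hypothetical, and the table is built once at first use here purely to keep the example short.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: derive the lookup table for the reflected CRC-32
// polynomial 0xEDB88320. Entry I is the CRC state after feeding byte I.
static std::array<uint32_t, 256> makeCRC32Table() {
  std::array<uint32_t, 256> Table{};
  for (uint32_t I = 0; I < 256; ++I) {
    uint32_t V = I;
    for (int Bit = 0; Bit < 8; ++Bit)
      V = (V & 1) ? (V >> 1) ^ 0xEDB88320U : (V >> 1);
    Table[I] = V;
  }
  return Table;
}

// Plain CRC-32 (Init = 0xFFFFFFFF, XorOut = 0xFFFFFFFF). Pass 0 to start,
// or a previous result to continue over more data.
static uint32_t crc32Sketch(uint32_t CRC, const unsigned char *Data,
                            size_t Size) {
  static const std::array<uint32_t, 256> Table = makeCRC32Table();
  CRC ^= 0xFFFFFFFFU; // Apply Init (or undo the previous XorOut).
  for (size_t I = 0; I < Size; ++I)
    CRC = Table[(CRC ^ Data[I]) & 0xFF] ^ (CRC >> 8);
  return CRC ^ 0xFFFFFFFFU; // Apply XorOut.
}

// JamCRC keeps the raw state: no implicit Init and no XorOut, so both
// steps of crc32Sketch are cancelled around the call.
static uint32_t jamCRCSketch(uint32_t CRC, const unsigned char *Data,
                             size_t Size) {
  CRC ^= 0xFFFFFFFFU; // Undo CRC-32 Init.
  CRC = crc32Sketch(CRC, Data, Size);
  return CRC ^ 0xFFFFFFFFU; // Undo CRC-32 XorOut.
}
```

The first entries produced by makeCRC32Table (0x00000000, 0x77073096, 0xee0e612c, 0x990951ba) match the start of the constant table quoted in the inline comment on CRC.cpp, which is one way to check such a table if it ever needs to be regenerated.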
================ Comment at: llvm/lib/Support/CRC.cpp:26 -uint32_t llvm::crc32(uint32_t CRC, StringRef S) { - static llvm::once_flag InitFlag; - static CRC32Table Tbl; - llvm::call_once(InitFlag, initCRC32Table, &Tbl); +static const uint32_t CRCTable[256] = { + 0x00000000, 0x77073096, 0xee0e612c, 0x990951ba, ---------------- hiraditya wrote: > rupprecht wrote: > > Can you leave a comment how this table was generated/how it could be regenerated if needed in the future? And/or a unit test to assert the values are correct? > +1 I'm adding a comment with references for how the algorithm and the table works. There is already a unit test in llvm/unittests/Support/CRCTest.cpp CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68570/new/ https://reviews.llvm.org/D68570 From llvm-commits at lists.llvm.org Tue Oct 8 01:35:53 2019 From: llvm-commits at lists.llvm.org (Sjoerd Meijer via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 08:35:53 +0000 (UTC) Subject: [PATCH] D67990: [aarch64] fix generation of fp16 fmls In-Reply-To: References: Message-ID: SjoerdMeijer accepted this revision. SjoerdMeijer added a comment. This revision is now accepted and ready to land. Cheers, lgtm ================ Comment at: llvm/test/CodeGen/AArch64/fp16-fmla.ll:163 +; CHECK: fneg {{v[0-9]+}}.8h, {{v[0-9]+}}.8h +; CHECK: fmla {{v[0-9]+}}.8h, {{v[0-9]+}}.8h, {{v[0-9]+}}.8h entry: ---------------- sebpop wrote: > SjoerdMeijer wrote: > > Why are we not generating a fmls? > > > > And a nit, but perhaps actually just using registers v0, v1, and v2 here makes things clearer? > That is part of the problem that Tim pointed out: when the multiply is the first operand of `fsub`, i.e., > ``` > %sub = fsub fast <8 x half> %mul, %a > ``` > that should not generate a fused multiply sub. > With this patch, for `b * c - a` we negate the value of a and generate a fused multiply add `-a + b * c`. > > Thanks, I just got myself confused here. 
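The point about operand order in the fsub can be restated in scalar form. The sketch below is only an illustration under the assumption that fmls computes "accumulator minus product"; the function names are hypothetical and std::fma stands in for the fused instructions.

```cpp
#include <cmath>

// b * c - a: the product is the first operand of the subtraction, so the
// fused form negates the addend and uses a multiply-add (fneg + fmla).
static float mulMinusAcc(float A, float B, float C) {
  return std::fma(B, C, -A);
}

// a - b * c: the product is subtracted from the accumulator, which is the
// shape a fused multiply-subtract (fmls) is assumed to cover.
static float accMinusMul(float A, float B, float C) {
  return std::fma(-B, C, A);
}
```

With a = 1, b = 2, c = 3 the two forms give 5 and -5 respectively, so treating the first as an fmls would flip the sign of the result.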
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67990/new/ https://reviews.llvm.org/D67990 From llvm-commits at lists.llvm.org Tue Oct 8 01:35:53 2019 From: llvm-commits at lists.llvm.org (Hans Wennborg via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 08:35:53 +0000 (UTC) Subject: [PATCH] D68570: Unify the two CRC implementations In-Reply-To: References: Message-ID: <9c5e14f0ed43992f0473d0efd7396268@localhost.localdomain> hans updated this revision to Diff 223798. hans marked an inline comment as done. hans edited the summary of this revision. hans added a comment. Adding comments, moving JamCRC into CRC.h, and adding a one-parameter crc32 overload. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68570/new/ https://reviews.llvm.org/D68570 Files: clang/lib/AST/MicrosoftMangle.cpp lld/COFF/PDB.cpp lldb/source/Plugins/ObjectFile/ELF/ObjectFileELF.cpp llvm/include/llvm/Support/CRC.h llvm/include/llvm/Support/JamCRC.h llvm/lib/DebugInfo/PDB/Native/Hash.cpp llvm/lib/DebugInfo/PDB/Native/PDBFileBuilder.cpp llvm/lib/DebugInfo/PDB/Native/TpiHashing.cpp llvm/lib/DebugInfo/Symbolize/Symbolize.cpp llvm/lib/MC/WinCOFFObjectWriter.cpp llvm/lib/Support/CMakeLists.txt llvm/lib/Support/CRC.cpp llvm/lib/Support/JamCRC.cpp llvm/lib/Transforms/Instrumentation/PGOInstrumentation.cpp llvm/tools/llvm-objcopy/COFF/COFFObjcopy.cpp llvm/tools/llvm-objcopy/CopyConfig.cpp llvm/unittests/Support/CRCTest.cpp llvm/utils/gn/secondary/llvm/lib/Support/BUILD.gn -------------- next part -------------- A non-text attachment was scrubbed... Name: D68570.223798.patch Type: text/x-patch Size: 26997 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 01:46:38 2019 From: llvm-commits at lists.llvm.org (Florian Hahn via llvm-commits) Date: Tue, 08 Oct 2019 08:46:38 -0000 Subject: [llvm] r374026 - [LoopRotate] Unconditionally get ScalarEvolution. 
Message-ID: <20191008084638.EDAF18E8B5@lists.llvm.org> Author: fhahn Date: Tue Oct 8 01:46:38 2019 New Revision: 374026 URL: http://llvm.org/viewvc/llvm-project?rev=374026&view=rev Log: [LoopRotate] Unconditionally get ScalarEvolution. Summary: LoopRotate is a loop pass and SE should always be available. Reviewers: anemet, asbirlea Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D68573 Modified: llvm/trunk/lib/Transforms/Scalar/LoopRotation.cpp Modified: llvm/trunk/lib/Transforms/Scalar/LoopRotation.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/LoopRotation.cpp?rev=374026&r1=374025&r2=374026&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Scalar/LoopRotation.cpp (original) +++ llvm/trunk/lib/Transforms/Scalar/LoopRotation.cpp Tue Oct 8 01:46:38 2019 @@ -96,15 +96,14 @@ public: auto *AC = &getAnalysis<AssumptionCacheTracker>().getAssumptionCache(F); auto *DTWP = getAnalysisIfAvailable<DominatorTreeWrapperPass>(); auto *DT = DTWP ? &DTWP->getDomTree() : nullptr; - auto *SEWP = getAnalysisIfAvailable<ScalarEvolutionWrapperPass>(); - auto *SE = SEWP ? &SEWP->getSE() : nullptr; + auto &SE = getAnalysis<ScalarEvolutionWrapperPass>().getSE(); const SimplifyQuery SQ = getBestSimplifyQuery(*this, F); Optional<MemorySSAUpdater> MSSAU; if (EnableMSSALoopDependency) { MemorySSA *MSSA = &getAnalysis<MemorySSAWrapperPass>().getMSSA(); MSSAU = MemorySSAUpdater(MSSA); } - return LoopRotation(L, LI, TTI, AC, DT, SE, + return LoopRotation(L, LI, TTI, AC, DT, &SE, MSSAU.hasValue() ? MSSAU.getPointer() : nullptr, SQ, false, MaxHeaderSize, false); } From llvm-commits at lists.llvm.org Tue Oct 8 01:45:04 2019 From: llvm-commits at lists.llvm.org (James Henderson via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 08:45:04 +0000 (UTC) Subject: [PATCH] D68462: [llvm-readobj/llvm-readelf] - Add checks for GNU-style to "all.test" test case. In-Reply-To: References: Message-ID: jhenderson added inline comments.
================ Comment at: test/tools/llvm-readobj/all.test:86-90 + - Name: .eh_frame_hdr + Type: SHT_PROGBITS +## An arbitrary linker-generated valid content. + Content: 011b033b140000000100000000f0ffff30000000 + - Name: .eh_frame ---------------- grimar wrote: > grimar wrote: > > jhenderson wrote: > > > grimar wrote: > > > > jhenderson wrote: > > > > > grimar wrote: > > > > > > jhenderson wrote: > > > > > > > Same comments as earlier. Can these be empty? > > > > > > No. We need to have something valid here, otherwise any > > > > > > error triggered will fail the dumping. > > > > > > (e.g. https://github.com/llvm-mirror/llvm/blob/master/tools/llvm-readobj/DwarfCFIEHPrinter.h#L127). > > > > > You don't actually need the .eh_frame_hdr at all, looking at the code, I think, just the .eh_frame section. That said, this appears to be different from GNU readelf. > > > > > > > > > > For reference, GNU readelf prints "There are no unwind sections in this file" if there are no .eh_frame_hdr sections even if there is a .eh_frame section (not looked to see what happens if there is a PT_GNU_EH_FRAME program header). > > > > I am a bit confused. > > > > > > > > Imagine we have the code and invocations below: > > > > > > > > "1.s": > > > > ``` > > > > .section foo,"ax", at progbits > > > > .cfi_startproc > > > > nop > > > > .cfi_endproc > > > > ``` > > > > > > > > ``` > > > > as 1.s -o 1.o > > > > ld.bfd 1.o -o with_hdr --eh-frame-hdr > > > > ld.bfd 1.o -o wo_hdr > > > > ``` > > > > > > > > For both of them I do not see neither `.eh_frame_hdr` nor `.eh_frame` section dumped with `-a`. > > > > I see ".eh_frame" dumped when I add `-wf` though (but still no `.eh_frame_hdr`). > > > > e.g.: > > > > > > > > > > > > ``` > > > > umb at ubuntu:~/tests/81$ readelf -v > > > > GNU readelf (GNU Binutils for Ubuntu) 2.31.1 > > > > Copyright (C) 2018 Free Software Foundation, Inc. 
> > > > > > > > umb at ubuntu:~/tests/81$ readelf -a with_hdr > > > > ELF Header: > > > > Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00 > > > > Class: ELF64 > > > > Data: 2's complement, little endian > > > > Version: 1 (current) > > > > OS/ABI: UNIX - System V > > > > ABI Version: 0 > > > > Type: EXEC (Executable file) > > > > Machine: Advanced Micro Devices X86-64 > > > > Version: 0x1 > > > > Entry point address: 0x401000 > > > > Start of program headers: 64 (bytes into file) > > > > Start of section headers: 8584 (bytes into file) > > > > Flags: 0x0 > > > > Size of this header: 64 (bytes) > > > > Size of program headers: 56 (bytes) > > > > Number of program headers: 3 > > > > Size of section headers: 64 (bytes) > > > > Number of section headers: 7 > > > > Section header string table index: 6 > > > > > > > > Section Headers: > > > > [Nr] Name Type Address Offset > > > > Size EntSize Flags Link Info Align > > > > [ 0] NULL 0000000000000000 00000000 > > > > 0000000000000000 0000000000000000 0 0 0 > > > > [ 1] foo PROGBITS 0000000000401000 00001000 > > > > 0000000000000001 0000000000000000 AX 0 0 1 > > > > [ 2] .eh_frame_hdr PROGBITS 0000000000402000 00002000 > > > > 0000000000000014 0000000000000000 A 0 0 4 > > > > [ 3] .eh_frame PROGBITS 0000000000402018 00002018 > > > > 000000000000002c 0000000000000000 A 0 0 8 > > > > [ 4] .symtab SYMTAB 0000000000000000 00002048 > > > > 00000000000000d8 0000000000000018 5 5 8 > > > > [ 5] .strtab STRTAB 0000000000000000 00002120 > > > > 000000000000002c 0000000000000000 0 0 1 > > > > [ 6] .shstrtab STRTAB 0000000000000000 0000214c > > > > 0000000000000037 0000000000000000 0 0 1 > > > > Key to Flags: > > > > W (write), A (alloc), X (execute), M (merge), S (strings), I (info), > > > > L (link order), O (extra OS processing required), G (group), T (TLS), > > > > C (compressed), x (unknown), o (OS specific), E (exclude), > > > > l (large), p (processor specific) > > > > > > > > There are no section groups in this file. 
> > > > > > > > Program Headers: > > > > Type Offset VirtAddr PhysAddr > > > > FileSiz MemSiz Flags Align > > > > LOAD 0x0000000000001000 0x0000000000401000 0x0000000000401000 > > > > 0x0000000000000001 0x0000000000000001 R E 0x1000 > > > > LOAD 0x0000000000002000 0x0000000000402000 0x0000000000402000 > > > > 0x0000000000000044 0x0000000000000044 R 0x1000 > > > > GNU_EH_FRAME 0x0000000000002000 0x0000000000402000 0x0000000000402000 > > > > 0x0000000000000014 0x0000000000000014 R 0x4 > > > > > > > > Section to Segment mapping: > > > > Segment Sections... > > > > 00 foo > > > > 01 .eh_frame_hdr .eh_frame > > > > 02 .eh_frame_hdr > > > > > > > > There is no dynamic section in this file. > > > > > > > > There are no relocations in this file. > > > > > > > > The decoding of unwind sections for machine type Advanced Micro Devices X86-64 is not currently supported. > > > > > > > > Symbol table '.symtab' contains 9 entries: > > > > Num: Value Size Type Bind Vis Ndx Name > > > > 0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND > > > > 1: 0000000000401000 0 SECTION LOCAL DEFAULT 1 > > > > 2: 0000000000402000 0 SECTION LOCAL DEFAULT 2 > > > > 3: 0000000000402018 0 SECTION LOCAL DEFAULT 3 > > > > 4: 0000000000402000 0 NOTYPE LOCAL DEFAULT 2 __GNU_EH_FRAME_HDR > > > > 5: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND _start > > > > 6: 0000000000404000 0 NOTYPE GLOBAL DEFAULT 3 __bss_start > > > > 7: 0000000000404000 0 NOTYPE GLOBAL DEFAULT 3 _edata > > > > 8: 0000000000404000 0 NOTYPE GLOBAL DEFAULT 3 _end > > > > > > > > No version information found in this file. 
> > > > > > > > umb at ubuntu:~/tests/81$ readelf -a with_hdr -wf > > > > ELF Header: > > > > Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00 > > > > Class: ELF64 > > > > Data: 2's complement, little endian > > > > Version: 1 (current) > > > > OS/ABI: UNIX - System V > > > > ABI Version: 0 > > > > Type: EXEC (Executable file) > > > > Machine: Advanced Micro Devices X86-64 > > > > Version: 0x1 > > > > Entry point address: 0x401000 > > > > Start of program headers: 64 (bytes into file) > > > > Start of section headers: 8584 (bytes into file) > > > > Flags: 0x0 > > > > Size of this header: 64 (bytes) > > > > Size of program headers: 56 (bytes) > > > > Number of program headers: 3 > > > > Size of section headers: 64 (bytes) > > > > Number of section headers: 7 > > > > Section header string table index: 6 > > > > > > > > Section Headers: > > > > [Nr] Name Type Address Offset > > > > Size EntSize Flags Link Info Align > > > > [ 0] NULL 0000000000000000 00000000 > > > > 0000000000000000 0000000000000000 0 0 0 > > > > [ 1] foo PROGBITS 0000000000401000 00001000 > > > > 0000000000000001 0000000000000000 AX 0 0 1 > > > > [ 2] .eh_frame_hdr PROGBITS 0000000000402000 00002000 > > > > 0000000000000014 0000000000000000 A 0 0 4 > > > > [ 3] .eh_frame PROGBITS 0000000000402018 00002018 > > > > 000000000000002c 0000000000000000 A 0 0 8 > > > > [ 4] .symtab SYMTAB 0000000000000000 00002048 > > > > 00000000000000d8 0000000000000018 5 5 8 > > > > [ 5] .strtab STRTAB 0000000000000000 00002120 > > > > 000000000000002c 0000000000000000 0 0 1 > > > > [ 6] .shstrtab STRTAB 0000000000000000 0000214c > > > > 0000000000000037 0000000000000000 0 0 1 > > > > Key to Flags: > > > > W (write), A (alloc), X (execute), M (merge), S (strings), I (info), > > > > L (link order), O (extra OS processing required), G (group), T (TLS), > > > > C (compressed), x (unknown), o (OS specific), E (exclude), > > > > l (large), p (processor specific) > > > > > > > > There are no section groups in this file. 
> > > > > > > > Program Headers: > > > > Type Offset VirtAddr PhysAddr > > > > FileSiz MemSiz Flags Align > > > > LOAD 0x0000000000001000 0x0000000000401000 0x0000000000401000 > > > > 0x0000000000000001 0x0000000000000001 R E 0x1000 > > > > LOAD 0x0000000000002000 0x0000000000402000 0x0000000000402000 > > > > 0x0000000000000044 0x0000000000000044 R 0x1000 > > > > GNU_EH_FRAME 0x0000000000002000 0x0000000000402000 0x0000000000402000 > > > > 0x0000000000000014 0x0000000000000014 R 0x4 > > > > > > > > Section to Segment mapping: > > > > Segment Sections... > > > > 00 foo > > > > 01 .eh_frame_hdr .eh_frame > > > > 02 .eh_frame_hdr > > > > > > > > There is no dynamic section in this file. > > > > > > > > There are no relocations in this file. > > > > > > > > The decoding of unwind sections for machine type Advanced Micro Devices X86-64 is not currently supported. > > > > > > > > Symbol table '.symtab' contains 9 entries: > > > > Num: Value Size Type Bind Vis Ndx Name > > > > 0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND > > > > 1: 0000000000401000 0 SECTION LOCAL DEFAULT 1 > > > > 2: 0000000000402000 0 SECTION LOCAL DEFAULT 2 > > > > 3: 0000000000402018 0 SECTION LOCAL DEFAULT 3 > > > > 4: 0000000000402000 0 NOTYPE LOCAL DEFAULT 2 __GNU_EH_FRAME_HDR > > > > 5: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND _start > > > > 6: 0000000000404000 0 NOTYPE GLOBAL DEFAULT 3 __bss_start > > > > 7: 0000000000404000 0 NOTYPE GLOBAL DEFAULT 3 _edata > > > > 8: 0000000000404000 0 NOTYPE GLOBAL DEFAULT 3 _end > > > > > > > > No version information found in this file. 
> > > > Contents of the .eh_frame section: > > > > > > > > > > > > 00000000 0000000000000014 00000000 CIE > > > > Version: 1 > > > > Augmentation: "zR" > > > > Code alignment factor: 1 > > > > Data alignment factor: -8 > > > > Return address column: 16 > > > > Augmentation data: 1b > > > > DW_CFA_def_cfa: r7 (rsp) ofs 8 > > > > DW_CFA_offset: r16 (rip) at cfa-8 > > > > DW_CFA_nop > > > > DW_CFA_nop > > > > > > > > 00000018 0000000000000010 0000001c FDE cie=00000000 pc=0000000000401000..0000000000401001 > > > > DW_CFA_nop > > > > DW_CFA_nop > > > > DW_CFA_nop > > > > > > > > > > > > ``` > > > > > > > > I also see ".eh_frame" dumped when there is no ".eh_frame_hdr": > > > > > > > > > > > > ``` > > > > umb at ubuntu:~/tests/81$ readelf -a wo_hdr -wf > > > > ELF Header: > > > > Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00 > > > > Class: ELF64 > > > > Data: 2's complement, little endian > > > > Version: 1 (current) > > > > OS/ABI: UNIX - System V > > > > ABI Version: 0 > > > > Type: EXEC (Executable file) > > > > Machine: Advanced Micro Devices X86-64 > > > > Version: 0x1 > > > > Entry point address: 0x401000 > > > > Start of program headers: 64 (bytes into file) > > > > Start of section headers: 8480 (bytes into file) > > > > Flags: 0x0 > > > > Size of this header: 64 (bytes) > > > > Size of program headers: 56 (bytes) > > > > Number of program headers: 2 > > > > Size of section headers: 64 (bytes) > > > > Number of section headers: 6 > > > > Section header string table index: 5 > > > > > > > > Section Headers: > > > > [Nr] Name Type Address Offset > > > > Size EntSize Flags Link Info Align > > > > [ 0] NULL 0000000000000000 00000000 > > > > 0000000000000000 0000000000000000 0 0 0 > > > > [ 1] foo PROGBITS 0000000000401000 00001000 > > > > 0000000000000001 0000000000000000 AX 0 0 1 > > > > [ 2] .eh_frame PROGBITS 0000000000402000 00002000 > > > > 000000000000002c 0000000000000000 A 0 0 8 > > > > [ 3] .symtab SYMTAB 0000000000000000 00002030 > > > > 
00000000000000a8 0000000000000018 4 3 8 > > > > [ 4] .strtab STRTAB 0000000000000000 000020d8 > > > > 0000000000000019 0000000000000000 0 0 1 > > > > [ 5] .shstrtab STRTAB 0000000000000000 000020f1 > > > > 0000000000000029 0000000000000000 0 0 1 > > > > Key to Flags: > > > > W (write), A (alloc), X (execute), M (merge), S (strings), I (info), > > > > L (link order), O (extra OS processing required), G (group), T (TLS), > > > > C (compressed), x (unknown), o (OS specific), E (exclude), > > > > l (large), p (processor specific) > > > > > > > > There are no section groups in this file. > > > > > > > > Program Headers: > > > > Type Offset VirtAddr PhysAddr > > > > FileSiz MemSiz Flags Align > > > > LOAD 0x0000000000001000 0x0000000000401000 0x0000000000401000 > > > > 0x0000000000000001 0x0000000000000001 R E 0x1000 > > > > LOAD 0x0000000000002000 0x0000000000402000 0x0000000000402000 > > > > 0x000000000000002c 0x000000000000002c R 0x1000 > > > > > > > > Section to Segment mapping: > > > > Segment Sections... > > > > 00 foo > > > > 01 .eh_frame > > > > > > > > There is no dynamic section in this file. > > > > > > > > There are no relocations in this file. > > > > > > > > The decoding of unwind sections for machine type Advanced Micro Devices X86-64 is not currently supported. > > > > > > > > Symbol table '.symtab' contains 7 entries: > > > > Num: Value Size Type Bind Vis Ndx Name > > > > 0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND > > > > 1: 0000000000401000 0 SECTION LOCAL DEFAULT 1 > > > > 2: 0000000000402000 0 SECTION LOCAL DEFAULT 2 > > > > 3: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND _start > > > > 4: 0000000000404000 0 NOTYPE GLOBAL DEFAULT 2 __bss_start > > > > 5: 0000000000404000 0 NOTYPE GLOBAL DEFAULT 2 _edata > > > > 6: 0000000000404000 0 NOTYPE GLOBAL DEFAULT 2 _end > > > > > > > > No version information found in this file. 
> > > > Contents of the .eh_frame section: > > > > > > > > > > > > 00000000 0000000000000014 00000000 CIE > > > > Version: 1 > > > > Augmentation: "zR" > > > > Code alignment factor: 1 > > > > Data alignment factor: -8 > > > > Return address column: 16 > > > > Augmentation data: 1b > > > > DW_CFA_def_cfa: r7 (rsp) ofs 8 > > > > DW_CFA_offset: r16 (rip) at cfa-8 > > > > DW_CFA_nop > > > > DW_CFA_nop > > > > > > > > 00000018 0000000000000010 0000001c FDE cie=00000000 pc=0000000000401000..0000000000401001 > > > > DW_CFA_nop > > > > DW_CFA_nop > > > > DW_CFA_nop > > > > > > > > } > > > > ``` > > > > > > > > Since we have such differences in the behavior, should we just test the current behavior atm? > > > > I.e. before this diff I tested "EH_FRAME Header [", now I also added a check for ".eh_frame section at offset...". > > > > Both of them are dumped at the top level currently. Seems reasonable to test the fact we do that (with just `-all`) > > > > and the order, probably? > > > > (I am ok to change it in any way actually, but just wanted to clarify this before doing anything with it.) > > > I'm confused too. Your output above even appears to be conflicted. Note that it mentions "The decoding of unwind sections for machine type Advanced Micro Devices X86-64 is not currently supported." > > > > > > Anyway, probably best to file a bug to record the issue and then do as you're doing here (i.e. test the current behaviour). > > > Note that it mentions "The decoding of unwind sections for machine type Advanced Micro Devices X86-64 is not currently supported." > > > > Yes, that looks strange. I'll build the latest binutils from sources tomorrow and check what it do, then probably file a bug or prepare a patch. Thanks for review! > I've build GNU readelf (GNU Binutils) 2.33.50.20191008 today and it still shows > "The decoding of unwind sections for machine type Advanced Micro Devices X86-64 is not currently supported." 
> for me when I request "-a" or "-u" (--unwind Display the unwind info (if present)) > > It seems to be a known, one-year-old bug: > https://bugzilla.redhat.com/show_bug.cgi?id=1626614 > > And one of the interesting comments is: > "A workaround is readelf -wF or -wf however, it is unclear to me why -u isn't wired up to do the same thing as -wf. Is that something worth considering?" > > I am going to commit this patch and do nothing else for it atm (i.e. I will not report any bugs for llvm-readelf until the situation with GNU is fixed, since our current behavior looks valid and I see no issues). How does that sound to you? Sounds good! CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68462/new/ https://reviews.llvm.org/D68462 From llvm-commits at lists.llvm.org Tue Oct 8 01:45:19 2019 From: llvm-commits at lists.llvm.org (Sam Parker via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 08:45:19 +0000 (UTC) Subject: [PATCH] D68400: [NFC][TTI] Add Alignment for isLegalMasked[Load/Store] In-Reply-To: References: Message-ID: <601122c93be1582c18464bcf766946bf@localhost.localdomain> samparker added a comment. Yes, it turns out Align is simple to use, the constructor just takes the unsigned value... but it will not accept zero and so causes assertion failures.
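The behaviour described in the comment above can be sketched in standalone C++. This is only an illustration under stated assumptions: `SketchAlign` and `toOptionalAlign` are hypothetical names invented here, not the real `llvm::Align` API. The sketch assumes just what the comment says, namely an alignment type whose constructor rejects zero, so a caller holding a raw value that may be zero has to map zero to an "unknown" state before constructing it.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical stand-in for an alignment type that refuses zero.
// It is NOT llvm::Align; it only mimics the behaviour discussed above.
class SketchAlign {
  uint64_t Value;

public:
  explicit SketchAlign(uint64_t V) : Value(V) {
    // A zero alignment trips this assertion, which is the failure mode
    // mentioned in the review comment.
    assert(V != 0 && "alignment must be non-zero");
    assert((V & (V - 1)) == 0 && "alignment must be a power of two");
  }

  uint64_t value() const { return Value; }
};

// Hypothetical helper: treat a raw alignment of 0 as "unknown" and return
// an empty optional instead of constructing an invalid SketchAlign.
inline std::optional<SketchAlign> toOptionalAlign(uint64_t Raw) {
  if (Raw == 0)
    return std::nullopt;
  return SketchAlign(Raw);
}
```

A caller that receives a possibly-zero value would go through `toOptionalAlign(Raw)` and check `has_value()` before use, rather than constructing `SketchAlign(Raw)` directly.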
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68400/new/ https://reviews.llvm.org/D68400 From llvm-commits at lists.llvm.org Tue Oct 8 01:49:15 2019 From: llvm-commits at lists.llvm.org (Zi Xuan Wu via llvm-commits) Date: Tue, 08 Oct 2019 08:49:15 -0000 Subject: [llvm] r374027 - [NFC] Add REQUIRES for r374017 in testcase Message-ID: <20191008084915.E4AEA8ED17@lists.llvm.org> Author: wuzish Date: Tue Oct 8 01:49:15 2019 New Revision: 374027 URL: http://llvm.org/viewvc/llvm-project?rev=374027&view=rev Log: [NFC] Add REQUIRES for r374017 in testcase Modified: llvm/trunk/test/Transforms/LoopVectorize/PowerPC/reg-usage.ll Modified: llvm/trunk/test/Transforms/LoopVectorize/PowerPC/reg-usage.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/LoopVectorize/PowerPC/reg-usage.ll?rev=374027&r1=374026&r2=374027&view=diff ============================================================================== --- llvm/trunk/test/Transforms/LoopVectorize/PowerPC/reg-usage.ll (original) +++ llvm/trunk/test/Transforms/LoopVectorize/PowerPC/reg-usage.ll Tue Oct 8 01:49:15 2019 @@ -1,5 +1,6 @@ ; RUN: opt < %s -debug-only=loop-vectorize -loop-vectorize -vectorizer-maximize-bandwidth -O2 -mtriple=powerpc64-unknown-linux -S -mcpu=pwr8 2>&1 | FileCheck %s --check-prefixes=CHECK,CHECK-PWR8 ; RUN: opt < %s -debug-only=loop-vectorize -loop-vectorize -vectorizer-maximize-bandwidth -O2 -mtriple=powerpc64le-unknown-linux -S -mcpu=pwr9 2>&1 | FileCheck %s --check-prefixes=CHECK,CHECK-PWR9 +; REQUIRES: asserts @a = global [1024 x i8] zeroinitializer, align 16 @b = global [1024 x i8] zeroinitializer, align 16 From llvm-commits at lists.llvm.org Tue Oct 8 01:51:24 2019 From: llvm-commits at lists.llvm.org (ChenZheng via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 08:51:24 +0000 (UTC) Subject: [PATCH] D68189: [InstCombine] recognize popcount implemented in hacker's delight. In-Reply-To: References: Message-ID: shchenz updated this revision to Diff 223801. 
shchenz added a comment. address comments. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68189/new/ https://reviews.llvm.org/D68189 Files: llvm/include/llvm/IR/PatternMatch.h llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp llvm/test/Transforms/AggressiveInstCombine/popcount.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D68189.223801.patch Type: text/x-patch Size: 13226 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 01:51:27 2019 From: llvm-commits at lists.llvm.org (Florian Hahn via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 08:51:27 +0000 (UTC) Subject: [PATCH] D68573: [LoopRotate] Unconditionally get ScalarEvolution. In-Reply-To: References: Message-ID: This revision was automatically updated to reflect the committed changes. Closed by commit rGa70c52614363: [LoopRotate] Unconditionally get ScalarEvolution. (authored by fhahn). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68573/new/ https://reviews.llvm.org/D68573 Files: llvm/lib/Transforms/Scalar/LoopRotation.cpp Index: llvm/lib/Transforms/Scalar/LoopRotation.cpp =================================================================== --- llvm/lib/Transforms/Scalar/LoopRotation.cpp +++ llvm/lib/Transforms/Scalar/LoopRotation.cpp @@ -96,15 +96,14 @@ auto *AC = &getAnalysis().getAssumptionCache(F); auto *DTWP = getAnalysisIfAvailable(); auto *DT = DTWP ? &DTWP->getDomTree() : nullptr; - auto *SEWP = getAnalysisIfAvailable(); - auto *SE = SEWP ? &SEWP->getSE() : nullptr; + auto &SE = getAnalysis().getSE(); const SimplifyQuery SQ = getBestSimplifyQuery(*this, F); Optional MSSAU; if (EnableMSSALoopDependency) { MemorySSA *MSSA = &getAnalysis().getMSSA(); MSSAU = MemorySSAUpdater(MSSA); } - return LoopRotation(L, LI, TTI, AC, DT, SE, + return LoopRotation(L, LI, TTI, AC, DT, &SE, MSSAU.hasValue() ? 
MSSAU.getPointer() : nullptr, SQ, false, MaxHeaderSize, false); } -------------- next part -------------- A non-text attachment was scrubbed... Name: D68573.223802.patch Type: text/x-patch Size: 1119 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 01:54:24 2019 From: llvm-commits at lists.llvm.org (Jonas Paulsson via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 08:54:24 +0000 (UTC) Subject: [PATCH] D63973: [MachineVerifier] Improve checks of target instructions operands. In-Reply-To: References: Message-ID: <237ec272f0083c6e5783c6e4547495c0@localhost.localdomain> jonpa added a comment. ping! CHANGES SINCE LAST ACTION https://reviews.llvm.org/D63973/new/ https://reviews.llvm.org/D63973 From llvm-commits at lists.llvm.org Tue Oct 8 01:54:24 2019 From: llvm-commits at lists.llvm.org (Clement Courbet via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 08:54:24 +0000 (UTC) Subject: [PATCH] D68629: [llvm-exegesis] Finish plumbing the `Config` field. In-Reply-To: References: Message-ID: <5e85d75bcbfc94fb57d0163142199972@localhost.localdomain> courbet updated this revision to Diff 223803. courbet marked 3 inline comments as done. courbet added a comment. Address comments. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68629/new/ https://reviews.llvm.org/D68629 Files: llvm/test/tools/llvm-exegesis/X86/analysis-cluster-stabilization-config.test llvm/tools/llvm-exegesis/lib/BenchmarkCode.h llvm/tools/llvm-exegesis/lib/BenchmarkResult.h llvm/tools/llvm-exegesis/lib/BenchmarkRunner.cpp llvm/tools/llvm-exegesis/lib/Clustering.cpp llvm/tools/llvm-exegesis/lib/CodeTemplate.h llvm/tools/llvm-exegesis/lib/SnippetFile.cpp llvm/tools/llvm-exegesis/lib/SnippetGenerator.cpp llvm/unittests/tools/llvm-exegesis/X86/SnippetFileTest.cpp -------------- next part -------------- A non-text attachment was scrubbed...
Name: D68629.223803.patch Type: text/x-patch Size: 12914 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 01:54:25 2019 From: llvm-commits at lists.llvm.org (James Henderson via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 08:54:25 +0000 (UTC) Subject: [PATCH] D68210: Object/minidump: Add support for the MemoryInfoList stream In-Reply-To: References: Message-ID: <368f3b6de330b83b1f506fde99dd605d@localhost.localdomain> jhenderson added inline comments. ================ Comment at: lib/Object/Minidump.cpp:58 +MinidumpFile::getMemoryInfoList() const { + auto OptionalStream = getRawStream(StreamType::MemoryInfoList); + if (!OptionalStream) ---------------- labath wrote: > jhenderson wrote: > > I probably should have picked up on this in previous reviews, but this is too much `auto` for my liking, as it's not obvious from the call site what `getRawStream` returns. > Done. I've also changed the other calls to getRawStream. Thanks! ================ Comment at: unittests/Object/MinidumpTest.cpp:620 + }; + EXPECT_THAT_EXPECTED(cantFail(create(HeaderTooBig))->getMemoryInfoList(), + Failed()); ---------------- Here and in the similar places, I'm not convinced that `cantFail` is appropriate (if the creation code is broken, this will assert and therefore possibly hide the actual testing failures that show where it went wrong more precisely). It should probably be a two-phase thing:
```
Expected<std::unique_ptr<MinidumpFile>> Minidump = create(HeaderTooBig);
ASSERT_THAT_EXPECTED(Minidump, Succeeded());
EXPECT_THAT_EXPECTED((*Minidump)->getMemoryInfoList(), Failed());
```
================ Comment at: unittests/Object/MinidumpTest.cpp:624 + // Header fits into the stream, but it is too small to contain the required + // entries).
+ std::vector HeaderTooSmall{ ---------------- Nit: delete the ')' Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68210/new/ https://reviews.llvm.org/D68210 From llvm-commits at lists.llvm.org Tue Oct 8 01:54:26 2019 From: llvm-commits at lists.llvm.org (Clement Courbet via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 08:54:26 +0000 (UTC) Subject: [PATCH] D68629: [llvm-exegesis] Finish plumbing the `Config` field. In-Reply-To: References: Message-ID: <6f42a1ff22eb49fbc89ff35d8ed18081@localhost.localdomain> courbet added inline comments. ================ Comment at: llvm/tools/llvm-exegesis/lib/Clustering.cpp:241 + // Given an instruction Opcode and Config, in which clusters do benchmarks of + // this instruction lie? Normally, they all should be in the same cluster. + struct OpcodeAndConfig { ---------------- gchatelet wrote: > Not related to this patch but why a question mark here? Not sure, but that does not bother me too much :) Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68629/new/ https://reviews.llvm.org/D68629 From llvm-commits at lists.llvm.org Tue Oct 8 01:59:12 2019 From: llvm-commits at lists.llvm.org (George Rimar via llvm-commits) Date: Tue, 08 Oct 2019 08:59:12 -0000 Subject: [llvm] r374028 - [llvm-readobj/llvm-readelf] - Add checks for GNU-style to "all.test" test case. Message-ID: <20191008085912.585408DF2C@lists.llvm.org> Author: grimar Date: Tue Oct 8 01:59:12 2019 New Revision: 374028 URL: http://llvm.org/viewvc/llvm-project?rev=374028&view=rev Log: [llvm-readobj/llvm-readelf] - Add checks for GNU-style to "all.test" test case. We do not check the GNU-style output when -all is given. This patch does that.
Differential revision: https://reviews.llvm.org/D68462 Modified: llvm/trunk/test/tools/llvm-readobj/all.test Modified: llvm/trunk/test/tools/llvm-readobj/all.test URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/llvm-readobj/all.test?rev=374028&r1=374027&r2=374028&view=diff ============================================================================== --- llvm/trunk/test/tools/llvm-readobj/all.test (original) +++ llvm/trunk/test/tools/llvm-readobj/all.test Tue Oct 8 01:59:12 2019 @@ -1,22 +1,40 @@ # RUN: yaml2obj %s -o %t.o -# RUN: llvm-readobj -a %t.o | FileCheck %s --check-prefix ALL -# RUN: llvm-readobj --all %t.o | FileCheck %s --check-prefix ALL +# RUN: llvm-readobj -a %t.o | FileCheck %s --check-prefix LLVM-ALL +# RUN: llvm-readobj --all %t.o | FileCheck %s --check-prefix LLVM-ALL -# ALL: Format: ELF32-i386 -# ALL: Arch: i386 -# ALL: AddressSize: 32bit -# ALL: LoadName: -# ALL: ElfHeader { -# ALL: Sections [ -# ALL: Relocations [ -# ALL: Symbols [ -# ALL: ProgramHeaders [ -# ALL: Version symbols { -# ALL: SHT_GNU_verdef { -# ALL: SHT_GNU_verneed { -# ALL: Addrsig [ -# ALL: Notes [ -# ALL: StackSizes [ +# LLVM-ALL: Format: ELF32-i386 +# LLVM-ALL: Arch: i386 +# LLVM-ALL: AddressSize: 32bit +# LLVM-ALL: LoadName: +# LLVM-ALL: ElfHeader { +# LLVM-ALL: Sections [ +# LLVM-ALL: Relocations [ +# LLVM-ALL: Symbols [ +# LLVM-ALL: ProgramHeaders [ +# LLVM-ALL: Version symbols { +# LLVM-ALL: SHT_GNU_verdef { +# LLVM-ALL: SHT_GNU_verneed { +# LLVM-ALL: Addrsig [ +# LLVM-ALL: Notes [ +# LLVM-ALL: StackSizes [ + +# RUN: llvm-readelf -a %t.o | FileCheck %s --check-prefix GNU-ALL +# RUN: llvm-readelf --all %t.o | FileCheck %s --check-prefix GNU-ALL + +# GNU-ALL: ELF Header: +# GNU-ALL: There are {{.*}} section headers, starting at offset {{.*}}: +# GNU-ALL: Relocation section '.rela.data' at offset {{.*}} contains {{.*}} entries: +# GNU-ALL: Symbol table '.symtab' contains {{.*}} entries: +# GNU-ALL: EH_FRAME Header [ +# GNU-ALL: .eh_frame section at offset 
{{.*}} address 0x0: +# GNU-ALL: Dynamic section at offset {{.*}} contains {{.*}} entries: +# GNU-ALL: Program Headers: +# GNU-ALL: Version symbols section '.gnu.version' contains {{.*}} entries: +# GNU-ALL: Version definition section '.gnu.version_d' contains {{.*}} entries: +# GNU-ALL: Version needs section '.gnu.version_r' contains {{.*}} entries: +# GNU-ALL: There are no section groups in this file. +# GNU-ALL: Histogram for bucket list length (total of 1 buckets) +# GNU-ALL: Displaying notes found at file offset {{.*}} with length {{.*}}: --- !ELF FileHeader: @@ -24,3 +42,69 @@ FileHeader: Data: ELFDATA2LSB Type: ET_REL Machine: EM_386 +Sections: + - Name: .data + Type: SHT_PROGBITS + - Name: .rela.data + Type: SHT_REL + Relocations: + - Name: .gnu.version + Type: SHT_GNU_versym + Entries: [ 0 ] + - Name: .gnu.version_d + Type: SHT_GNU_verdef + Info: 0x0 + Entries: [] + - Name: .gnu.version_r + Type: SHT_GNU_verneed + Info: 0x0 + Dependencies: + - Version: 1 + File: verneed1.so.0 + Entries: [] + - Name: .dynamic + Type: SHT_DYNAMIC + Address: 0x1000 + AddressAlign: 0x1000 + Entries: + - Tag: DT_HASH + Value: 0x1100 + - Tag: DT_NULL + Value: 0 + - Name: .hash + Type: SHT_HASH + Link: 0 + Bucket: [ 1 ] + Chain: [ 0, 0 ] + Address: 0x1100 + AddressAlign: 0x100 + - Name: .eh_frame_hdr + Type: SHT_PROGBITS +## An arbitrary linker-generated valid content. + Content: 011b033b140000000100000000f0ffff30000000 + - Name: .eh_frame + Type: SHT_PROGBITS + AddressAlign: 8 +## An arbitrary linker-generated valid content. + Content: 1400000000000000017a5200017810011b0c070890010000100000001c000000c8efffff0100000000000000 + - Name: .note.gnu.build-id + Type: SHT_NOTE + Flags: [ SHF_ALLOC ] + Address: 0x1500 +## An arbitrary linker-generated valid content. 
+ Content: 040000001000000003000000474E55004FCB712AA6387724A9F465A32CD8C14B +ProgramHeaders: + - Type: PT_LOAD + VAddr: 0x1000 + Sections: + - Section: .dynamic + - Section: .hash + - Type: PT_DYNAMIC + Sections: + - Section: .dynamic + - Type: PT_GNU_EH_FRAME + Sections: + - Section: .eh_frame_hdr + - Type: PT_NOTE + Sections: + - Section: .note.gnu.build-id From llvm-commits at lists.llvm.org Tue Oct 8 02:00:47 2019 From: llvm-commits at lists.llvm.org (Jonas Paulsson via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 09:00:47 +0000 (UTC) Subject: [PATCH] D68267: [MBB LiveIn lists, MachineVerifier, SystemZ] New method isLiveOut() and mverifier improvement. In-Reply-To: References: Message-ID: <240d88db2d932eb3c91f368f43d6f7d8@localhost.localdomain> jonpa added a comment. ping! CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68267/new/ https://reviews.llvm.org/D68267 From llvm-commits at lists.llvm.org Tue Oct 8 02:00:47 2019 From: llvm-commits at lists.llvm.org (ChenZheng via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 09:00:47 +0000 (UTC) Subject: [PATCH] D68189: [InstCombine] recognize popcount implemented in hacker's delight. In-Reply-To: References: Message-ID: <6f6f01dcb1147362bb91990c5fb6a38f@localhost.localdomain> shchenz marked 3 inline comments as done. shchenz added inline comments. ================ Comment at: llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp:270 + bool IsType32 = I.getType()->isIntegerTy(32); + uint16_t FinalShiftConst = IsType32 ? 24 : 56; + uint64_t MulConst = IsType32 ? 0x01010101 : 0x0101010101010101; ---------------- xbolva00 wrote: > Type size - 8? When the type size is 8, we don't need the final `lshr`, so we cannot do it in instcombine for opcode `lshr`, but we can still do it based on the final `mul`. For now I have left it as a follow-up issue until a real-world case is found. I hope this is OK.
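For readers unfamiliar with the idiom under review, here is a minimal sketch of the bit-twiddling population count from Hacker's Delight, written as standalone C++ rather than as the LLVM IR pattern the patch matches. The function names are hypothetical and chosen only for this illustration. It shows where the constants quoted above come from: the multiply by 0x01010101 (or its 64-bit counterpart) sums the per-byte counts into the top byte, and the final shift is the type width minus 8, that is 24 or 56. In this sketch the 8-bit variant has a single byte, so it needs neither the multiply nor the final shift.

```cpp
#include <cassert>
#include <cstdint>

// 32-bit population count, Hacker's Delight style.
inline uint32_t popcount32(uint32_t X) {
  X = X - ((X >> 1) & 0x55555555u);                 // 2-bit partial sums
  X = (X & 0x33333333u) + ((X >> 2) & 0x33333333u); // 4-bit partial sums
  X = (X + (X >> 4)) & 0x0F0F0F0Fu;                 // 8-bit partial sums
  return (X * 0x01010101u) >> 24;                   // sum bytes into top byte
}

// 64-bit variant: same steps with wider masks; final shift is 64 - 8 = 56.
inline uint64_t popcount64(uint64_t X) {
  X = X - ((X >> 1) & 0x5555555555555555ull);
  X = (X & 0x3333333333333333ull) + ((X >> 2) & 0x3333333333333333ull);
  X = (X + (X >> 4)) & 0x0F0F0F0F0F0F0F0Full;
  return (X * 0x0101010101010101ull) >> 56;
}

// 8-bit variant: only one byte, so no multiply and no final shift.
inline uint8_t popcount8(uint8_t In) {
  unsigned X = In;
  X = X - ((X >> 1) & 0x55u);
  X = (X & 0x33u) + ((X >> 2) & 0x33u);
  X = (X + (X >> 4)) & 0x0Fu;
  return static_cast<uint8_t>(X);
}

// Small self-check covering the three widths.
inline void popcountSelfCheck() {
  assert(popcount32(0xFFFFFFFFu) == 32);
  assert(popcount64(0xFFFFFFFFFFFFFFFFull) == 64);
  assert(popcount8(0xFF) == 8);
}
```

The 8-bit case is the reason "type size - 8" degenerates to a shift of zero, which matches the remark that no final `lshr` exists to anchor the match on.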
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68189/new/ https://reviews.llvm.org/D68189 From llvm-commits at lists.llvm.org Tue Oct 8 02:00:47 2019 From: llvm-commits at lists.llvm.org (Jonas Paulsson via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 09:00:47 +0000 (UTC) Subject: [PATCH] D68395: [LoopDataPrefetch] Update an existing prefetch to same address to 'write' for a store In-Reply-To: References: Message-ID: <0bda75e9538da9d58b28e2a56d5bbd96@localhost.localdomain> jonpa added a comment. ping CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68395/new/ https://reviews.llvm.org/D68395 From llvm-commits at lists.llvm.org Tue Oct 8 02:00:47 2019 From: llvm-commits at lists.llvm.org (Jonas Paulsson via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 09:00:47 +0000 (UTC) Subject: [PATCH] D68280: [LoopDataPrefetch] Move prefetch to dominating position of two accesses In-Reply-To: References: Message-ID: jonpa added a comment. ping CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68280/new/ https://reviews.llvm.org/D68280 From llvm-commits at lists.llvm.org Tue Oct 8 02:00:48 2019 From: llvm-commits at lists.llvm.org (Dave Green via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 09:00:48 +0000 (UTC) Subject: [PATCH] D68400: [NFC][TTI] Add Alignment for isLegalMasked[Load/Store] In-Reply-To: References: Message-ID: <8d306f02b619e18521bb46ab01396385@localhost.localdomain> dmgreen added a comment. I think that's what "MaybeAlign" is for, if I read it correctly. It is essentially an Optional with 0 alignments being the "false" optional state. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68400/new/ https://reviews.llvm.org/D68400 From llvm-commits at lists.llvm.org Tue Oct 8 02:00:52 2019 From: llvm-commits at lists.llvm.org (George Rimar via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 09:00:52 +0000 (UTC) Subject: [PATCH] D68462: [llvm-readobj/llvm-readelf] - Add checks for GNU-style to "all.test" test case. 
In-Reply-To: References: Message-ID: This revision was automatically updated to reflect the committed changes. Closed by commit rGeec98969603e: [llvm-readobj/llvm-readelf] - Add checks for GNU-style to "all.test" test case. (authored by grimar). Herald added a project: LLVM. Changed prior to commit: https://reviews.llvm.org/D68462?vs=223578&id=223804#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68462/new/ https://reviews.llvm.org/D68462 Files: llvm/test/tools/llvm-readobj/all.test -------------- next part -------------- A non-text attachment was scrubbed... Name: D68462.223804.patch Type: text/x-patch Size: 4145 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 02:03:55 2019 From: llvm-commits at lists.llvm.org (Clement Courbet via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 09:03:55 +0000 (UTC) Subject: [PATCH] D68629: [llvm-exegesis] Finish plumbing the `Config` field. In-Reply-To: References: Message-ID: <4b1c331c782fa9f541c02060b014bc7c@localhost.localdomain> courbet updated this revision to Diff 223805. courbet added a comment. remove spurious edit Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68629/new/ https://reviews.llvm.org/D68629 Files: llvm/test/tools/llvm-exegesis/X86/analysis-cluster-stabilization-config.test llvm/tools/llvm-exegesis/lib/BenchmarkCode.h llvm/tools/llvm-exegesis/lib/BenchmarkResult.h llvm/tools/llvm-exegesis/lib/BenchmarkRunner.cpp llvm/tools/llvm-exegesis/lib/Clustering.cpp llvm/tools/llvm-exegesis/lib/CodeTemplate.h llvm/tools/llvm-exegesis/lib/SnippetFile.cpp llvm/tools/llvm-exegesis/lib/SnippetGenerator.cpp llvm/unittests/tools/llvm-exegesis/X86/SnippetFileTest.cpp -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68629.223805.patch Type: text/x-patch Size: 12913 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 02:06:49 2019 From: llvm-commits at lists.llvm.org (Clement Courbet via llvm-commits) Date: Tue, 08 Oct 2019 09:06:49 -0000 Subject: [llvm] r374031 - [llvm-exegesis] Finish plumbing the `Config` field. Message-ID: <20191008090649.3D6EA806CF@lists.llvm.org> Author: courbet Date: Tue Oct 8 02:06:48 2019 New Revision: 374031 URL: http://llvm.org/viewvc/llvm-project?rev=374031&view=rev Log: [llvm-exegesis] Finish plumbing the `Config` field. Summary: Right now there are no snippet generators that emit the `Config` Field, but I plan to add it to investigate LEA operands for PR32326. What was broken was: - `Config` Was not propagated up until the BenchmarkResult::Key. - Clustering should really consider different configs as measuring different things, so we should stabilize on (Opcode, Config) instead of just Opcode. Reviewers: gchatelet Subscribers: tschuett, llvm-commits, lebedev.ri Tags: #llvm Differential Revision: https://reviews.llvm.org/D68629 Modified: llvm/trunk/test/tools/llvm-exegesis/X86/analysis-cluster-stabilization-config.test llvm/trunk/tools/llvm-exegesis/lib/BenchmarkCode.h llvm/trunk/tools/llvm-exegesis/lib/BenchmarkResult.h llvm/trunk/tools/llvm-exegesis/lib/BenchmarkRunner.cpp llvm/trunk/tools/llvm-exegesis/lib/Clustering.cpp llvm/trunk/tools/llvm-exegesis/lib/CodeTemplate.h llvm/trunk/tools/llvm-exegesis/lib/SnippetFile.cpp llvm/trunk/tools/llvm-exegesis/lib/SnippetGenerator.cpp llvm/trunk/unittests/tools/llvm-exegesis/X86/SnippetFileTest.cpp Modified: llvm/trunk/test/tools/llvm-exegesis/X86/analysis-cluster-stabilization-config.test URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/llvm-exegesis/X86/analysis-cluster-stabilization-config.test?rev=374031&r1=374030&r2=374031&view=diff ============================================================================== --- 
llvm/trunk/test/tools/llvm-exegesis/X86/analysis-cluster-stabilization-config.test (original) +++ llvm/trunk/test/tools/llvm-exegesis/X86/analysis-cluster-stabilization-config.test Tue Oct 8 02:06:48 2019 @@ -4,8 +4,7 @@ # have different configs, so they should not be placed in the same cluster by # stabilization. -# CHECK-UNSTABLE: SQRTSSr -# CHECK-UNSTABLE: SQRTSSr +# CHECK-UNSTABLE-NOT: SQRTSSr --- mode: latency Modified: llvm/trunk/tools/llvm-exegesis/lib/BenchmarkCode.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-exegesis/lib/BenchmarkCode.h?rev=374031&r1=374030&r2=374031&view=diff ============================================================================== --- llvm/trunk/tools/llvm-exegesis/lib/BenchmarkCode.h (original) +++ llvm/trunk/tools/llvm-exegesis/lib/BenchmarkCode.h Tue Oct 8 02:06:48 2019 @@ -9,7 +9,7 @@ #ifndef LLVM_TOOLS_LLVM_EXEGESIS_BENCHMARKCODE_H #define LLVM_TOOLS_LLVM_EXEGESIS_BENCHMARKCODE_H -#include "RegisterValue.h" +#include "BenchmarkResult.h" #include "llvm/MC/MCInst.h" #include #include @@ -19,12 +19,7 @@ namespace exegesis { // A collection of instructions that are to be assembled, executed and measured. struct BenchmarkCode { - // The sequence of instructions that are to be repeated. - std::vector Instructions; - - // Before the code is executed some instructions are added to setup the - // registers initial values. - std::vector RegisterInitialValues; + InstructionBenchmarkKey Key; // We also need to provide the registers that are live on entry for the // assembler to generate proper prologue/epilogue. 
Modified: llvm/trunk/tools/llvm-exegesis/lib/BenchmarkResult.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-exegesis/lib/BenchmarkResult.h?rev=374031&r1=374030&r2=374031&view=diff ============================================================================== --- llvm/trunk/tools/llvm-exegesis/lib/BenchmarkResult.h (original) +++ llvm/trunk/tools/llvm-exegesis/lib/BenchmarkResult.h Tue Oct 8 02:06:48 2019 @@ -15,8 +15,8 @@ #ifndef LLVM_TOOLS_LLVM_EXEGESIS_BENCHMARKRESULT_H #define LLVM_TOOLS_LLVM_EXEGESIS_BENCHMARKRESULT_H -#include "BenchmarkCode.h" #include "LlvmState.h" +#include "RegisterValue.h" #include "llvm/ADT/StringMap.h" #include "llvm/ADT/StringRef.h" #include "llvm/MC/MCInst.h" Modified: llvm/trunk/tools/llvm-exegesis/lib/BenchmarkRunner.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-exegesis/lib/BenchmarkRunner.cpp?rev=374031&r1=374030&r2=374031&view=diff ============================================================================== --- llvm/trunk/tools/llvm-exegesis/lib/BenchmarkRunner.cpp (original) +++ llvm/trunk/tools/llvm-exegesis/lib/BenchmarkRunner.cpp Tue Oct 8 02:06:48 2019 @@ -31,7 +31,6 @@ BenchmarkRunner::BenchmarkRunner(const L BenchmarkRunner::~BenchmarkRunner() = default; - namespace { class FunctionExecutorImpl : public BenchmarkRunner::FunctionExecutor { public: @@ -92,10 +91,9 @@ InstructionBenchmark BenchmarkRunner::ru InstrBenchmark.NumRepetitions = NumRepetitions; InstrBenchmark.Info = BC.Info; - const std::vector &Instructions = BC.Instructions; + const std::vector &Instructions = BC.Key.Instructions; - InstrBenchmark.Key.Instructions = Instructions; - InstrBenchmark.Key.RegisterInitialValues = BC.RegisterInitialValues; + InstrBenchmark.Key = BC.Key; // Assemble at least kMinInstructionsForSnippet instructions by repeating the // snippet for debug/analysis. 
This is so that the user clearly understands @@ -104,10 +102,10 @@ InstructionBenchmark BenchmarkRunner::ru { llvm::SmallString<0> Buffer; llvm::raw_svector_ostream OS(Buffer); - assembleToStream( - State.getExegesisTarget(), State.createTargetMachine(), BC.LiveIns, - BC.RegisterInitialValues, - Repetitor.Repeat(BC.Instructions, kMinInstructionsForSnippet), OS); + assembleToStream(State.getExegesisTarget(), State.createTargetMachine(), + BC.LiveIns, BC.Key.RegisterInitialValues, + Repetitor.Repeat(Instructions, kMinInstructionsForSnippet), + OS); const ExecutableFunction EF(State.createTargetMachine(), getObjectFromBuffer(OS.str())); const auto FnBytes = EF.getFunctionBytes(); @@ -117,7 +115,7 @@ InstructionBenchmark BenchmarkRunner::ru // Assemble NumRepetitions instructions repetitions of the snippet for // measurements. const auto Filler = - Repetitor.Repeat(BC.Instructions, InstrBenchmark.NumRepetitions); + Repetitor.Repeat(Instructions, InstrBenchmark.NumRepetitions); llvm::object::OwningBinary ObjectFile; if (DumpObjectToDisk) { @@ -133,7 +131,7 @@ InstructionBenchmark BenchmarkRunner::ru llvm::SmallString<0> Buffer; llvm::raw_svector_ostream OS(Buffer); assembleToStream(State.getExegesisTarget(), State.createTargetMachine(), - BC.LiveIns, BC.RegisterInitialValues, Filler, OS); + BC.LiveIns, BC.Key.RegisterInitialValues, Filler, OS); ObjectFile = getObjectFromBuffer(OS.str()); } @@ -150,7 +148,7 @@ InstructionBenchmark BenchmarkRunner::ru // Scale the measurements by instruction. BM.PerInstructionValue /= InstrBenchmark.NumRepetitions; // Scale the measurements by snippet. 
- BM.PerSnippetValue *= static_cast(BC.Instructions.size()) / + BM.PerSnippetValue *= static_cast(Instructions.size()) / InstrBenchmark.NumRepetitions; } @@ -167,7 +165,7 @@ BenchmarkRunner::writeObjectFile(const B return std::move(E); llvm::raw_fd_ostream OFS(ResultFD, true /*ShouldClose*/); assembleToStream(State.getExegesisTarget(), State.createTargetMachine(), - BC.LiveIns, BC.RegisterInitialValues, FillFunction, OFS); + BC.LiveIns, BC.Key.RegisterInitialValues, FillFunction, OFS); return ResultPath.str(); } Modified: llvm/trunk/tools/llvm-exegesis/lib/Clustering.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-exegesis/lib/Clustering.cpp?rev=374031&r1=374030&r2=374031&view=diff ============================================================================== --- llvm/trunk/tools/llvm-exegesis/lib/Clustering.cpp (original) +++ llvm/trunk/tools/llvm-exegesis/lib/Clustering.cpp Tue Oct 8 02:06:48 2019 @@ -237,39 +237,40 @@ void InstructionBenchmarkClustering::clu // We shall find every opcode with benchmarks not in just one cluster, and move // *all* the benchmarks of said Opcode into one new unstable cluster per Opcode. void InstructionBenchmarkClustering::stabilize(unsigned NumOpcodes) { - // Given an instruction Opcode, in which clusters do benchmarks of this - // instruction lie? Normally, they all should be in the same cluster. - std::vector> OpcodeToClusterIDs; - OpcodeToClusterIDs.resize(NumOpcodes); - // The list of opcodes that have more than one cluster. - llvm::SetVector UnstableOpcodes; - // Populate OpcodeToClusterIDs and UnstableOpcodes data structures. + // Given an instruction Opcode and Config, in which clusters do benchmarks of + // this instruction lie? Normally, they all should be in the same cluster. 
+ struct OpcodeAndConfig { + explicit OpcodeAndConfig(const InstructionBenchmark &IB) + : Opcode(IB.keyInstruction().getOpcode()), Config(&IB.Key.Config) {} + unsigned Opcode; + const std::string *Config; + + auto Tie() const -> auto { return std::tie(Opcode, *Config); } + + bool operator<(const OpcodeAndConfig &O) const { return Tie() < O.Tie(); } + bool operator!=(const OpcodeAndConfig &O) const { return Tie() != O.Tie(); } + }; + std::map> + OpcodeConfigToClusterIDs; + // Populate OpcodeConfigToClusterIDs and UnstableOpcodes data structures. assert(ClusterIdForPoint_.size() == Points_.size() && "size mismatch"); for (const auto &Point : zip(Points_, ClusterIdForPoint_)) { const ClusterId &ClusterIdOfPoint = std::get<1>(Point); if (!ClusterIdOfPoint.isValid()) continue; // Only process fully valid clusters. - const unsigned Opcode = std::get<0>(Point).keyInstruction().getOpcode(); - assert(Opcode < NumOpcodes && "NumOpcodes is incorrect (too small)"); + const OpcodeAndConfig Key(std::get<0>(Point)); llvm::SmallSet &ClusterIDsOfOpcode = - OpcodeToClusterIDs[Opcode]; + OpcodeConfigToClusterIDs[Key]; ClusterIDsOfOpcode.insert(ClusterIdOfPoint); - // Is there more than one ClusterID for this opcode?. - if (ClusterIDsOfOpcode.size() < 2) - continue; // If not, then at this moment this Opcode is stable. - // Else let's record this unstable opcode for future use. - UnstableOpcodes.insert(Opcode); } - assert(OpcodeToClusterIDs.size() == NumOpcodes && "sanity check"); - // We know with how many [new] clusters we will end up with. 
- const auto NewTotalClusterCount = Clusters_.size() + UnstableOpcodes.size(); - Clusters_.reserve(NewTotalClusterCount); - for (const size_t UnstableOpcode : UnstableOpcodes.getArrayRef()) { + for (const auto &OpcodeConfigToClusterID : OpcodeConfigToClusterIDs) { const llvm::SmallSet &ClusterIDs = - OpcodeToClusterIDs[UnstableOpcode]; - assert(ClusterIDs.size() > 1 && - "Should only have Opcodes with more than one cluster."); + OpcodeConfigToClusterID.second; + const OpcodeAndConfig &Key = OpcodeConfigToClusterID.first; + // We only care about unstable instructions. + if (ClusterIDs.size() < 2) + continue; // Create a new unstable cluster, one per Opcode. Clusters_.emplace_back(ClusterId::makeValidUnstable(Clusters_.size())); @@ -290,8 +291,8 @@ void InstructionBenchmarkClustering::sta // and the rest of the points is for the UnstableOpcode. const auto it = std::stable_partition( OldCluster.PointIndices.begin(), OldCluster.PointIndices.end(), - [this, UnstableOpcode](size_t P) { - return Points_[P].keyInstruction().getOpcode() != UnstableOpcode; + [this, &Key](size_t P) { + return OpcodeAndConfig(Points_[P]) != Key; }); assert(std::distance(it, OldCluster.PointIndices.end()) > 0 && "Should have found at least one bad point"); @@ -314,7 +315,6 @@ void InstructionBenchmarkClustering::sta "New unstable cluster should end up with no less points than there " "was clusters"); } - assert(Clusters_.size() == NewTotalClusterCount && "sanity check"); } llvm::Expected Modified: llvm/trunk/tools/llvm-exegesis/lib/CodeTemplate.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-exegesis/lib/CodeTemplate.h?rev=374031&r1=374030&r2=374031&view=diff ============================================================================== --- llvm/trunk/tools/llvm-exegesis/lib/CodeTemplate.h (original) +++ llvm/trunk/tools/llvm-exegesis/lib/CodeTemplate.h Tue Oct 8 02:06:48 2019 @@ -115,6 +115,8 @@ struct CodeTemplate { CodeTemplate &operator=(const CodeTemplate &) = delete; 
ExecutionMode Execution = ExecutionMode::UNKNOWN; + // See InstructionBenchmarkKey.::Config. + std::string Config; // Some information about how this template has been created. std::string Info; // The list of the instructions for this template. Modified: llvm/trunk/tools/llvm-exegesis/lib/SnippetFile.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-exegesis/lib/SnippetFile.cpp?rev=374031&r1=374030&r2=374031&view=diff ============================================================================== --- llvm/trunk/tools/llvm-exegesis/lib/SnippetFile.cpp (original) +++ llvm/trunk/tools/llvm-exegesis/lib/SnippetFile.cpp Tue Oct 8 02:06:48 2019 @@ -36,7 +36,7 @@ public: // instructions. void EmitInstruction(const MCInst &Instruction, const MCSubtargetInfo &STI) override { - Result->Instructions.push_back(Instruction); + Result->Key.Instructions.push_back(Instruction); } // Implementation of the AsmCommentConsumer. @@ -65,7 +65,7 @@ public: const StringRef HexValue = Parts[1].trim(); RegVal.Value = APInt( /* each hex digit is 4 bits */ HexValue.size() * 4, HexValue, 16); - Result->RegisterInitialValues.push_back(std::move(RegVal)); + Result->Key.RegisterInitialValues.push_back(std::move(RegVal)); return; } if (CommentText.consume_front("LIVEIN")) { Modified: llvm/trunk/tools/llvm-exegesis/lib/SnippetGenerator.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-exegesis/lib/SnippetGenerator.cpp?rev=374031&r1=374030&r2=374031&view=diff ============================================================================== --- llvm/trunk/tools/llvm-exegesis/lib/SnippetGenerator.cpp (original) +++ llvm/trunk/tools/llvm-exegesis/lib/SnippetGenerator.cpp Tue Oct 8 02:06:48 2019 @@ -73,12 +73,13 @@ SnippetGenerator::generateConfigurations BC.Info = CT.Info; for (InstructionTemplate &IT : CT.Instructions) { randomizeUnsetVariables(State.getExegesisTarget(), ForbiddenRegs, IT); - BC.Instructions.push_back(IT.build()); + 
BC.Key.Instructions.push_back(IT.build()); } if (CT.ScratchSpacePointerInReg) BC.LiveIns.push_back(CT.ScratchSpacePointerInReg); - BC.RegisterInitialValues = + BC.Key.RegisterInitialValues = computeRegisterInitialValues(CT.Instructions); + BC.Key.Config = CT.Config; Output.push_back(std::move(BC)); } } Modified: llvm/trunk/unittests/tools/llvm-exegesis/X86/SnippetFileTest.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/unittests/tools/llvm-exegesis/X86/SnippetFileTest.cpp?rev=374031&r1=374030&r2=374031&view=diff ============================================================================== --- llvm/trunk/unittests/tools/llvm-exegesis/X86/SnippetFileTest.cpp (original) +++ llvm/trunk/unittests/tools/llvm-exegesis/X86/SnippetFileTest.cpp Tue Oct 8 02:06:48 2019 @@ -78,8 +78,8 @@ TEST_F(X86SnippetFileTest, Works) { EXPECT_FALSE((bool)Snippets.takeError()); ASSERT_THAT(*Snippets, SizeIs(1)); const auto &Snippet = (*Snippets)[0]; - ASSERT_THAT(Snippet.Instructions, ElementsAre(HasOpcode(X86::INC64r))); - ASSERT_THAT(Snippet.RegisterInitialValues, + ASSERT_THAT(Snippet.Key.Instructions, ElementsAre(HasOpcode(X86::INC64r))); + ASSERT_THAT(Snippet.Key.RegisterInitialValues, ElementsAre(RegisterInitialValueIs(X86::RAX, 15), RegisterInitialValueIs(X86::SIL, 0))); ASSERT_THAT(Snippet.LiveIns, ElementsAre(X86::RDI, X86::DL)); From llvm-commits at lists.llvm.org Tue Oct 8 02:09:58 2019 From: llvm-commits at lists.llvm.org (Sam Parker via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 09:09:58 +0000 (UTC) Subject: [PATCH] D68400: [NFC][TTI] Add Alignment for isLegalMasked[Load/Store] In-Reply-To: References: Message-ID: <4e81cb76e86e76c3b6c704edc28d94ac@localhost.localdomain> samparker added a comment. Looks like it! I'll try it. 
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68400/new/ https://reviews.llvm.org/D68400 From llvm-commits at lists.llvm.org Tue Oct 8 02:09:59 2019 From: llvm-commits at lists.llvm.org (Sjoerd Meijer via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 09:09:59 +0000 (UTC) Subject: [PATCH] D68579: [HardwareLoops] Optimisation remarks In-Reply-To: References: Message-ID: <96689e264666dffa808753ecd3f4b10a@localhost.localdomain> SjoerdMeijer updated this revision to Diff 223806. SjoerdMeijer added a comment. Thanks for taking a look! Comments addressed. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68579/new/ https://reviews.llvm.org/D68579 Files: llvm/lib/CodeGen/HardwareLoops.cpp llvm/test/CodeGen/ARM/O3-pipeline.ll llvm/test/Transforms/HardwareLoops/ARM/opt-remarks.ll llvm/test/Transforms/HardwareLoops/unconditional-latch.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D68579.223806.patch Type: text/x-patch Size: 9333 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 02:10:00 2019 From: llvm-commits at lists.llvm.org (Pavel Labath via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 09:10:00 +0000 (UTC) Subject: [PATCH] D68289: [lldb-server/android] Show more processes by relaxing some checks In-Reply-To: References: Message-ID: <6d8e9880d26cbe0f001d6c4c0c92fa17@localhost.localdomain> labath added a comment. Hi Walter, unfortunately my quick fix ended up causing problems on macOS. I believe I have a solution to that, but it's going to be slightly more involved, so I'd like to get the patch reviewed first. I've reverted your patch for the time being to get the Linux build green.
Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68289/new/ https://reviews.llvm.org/D68289 From llvm-commits at lists.llvm.org Tue Oct 8 02:09:59 2019 From: llvm-commits at lists.llvm.org (ChenZheng via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 09:09:59 +0000 (UTC) Subject: [PATCH] D66329: [PowerPC] [Peephole] fold frame offset by using index form to save add. In-Reply-To: References: Message-ID: <8376263cb3fb0b9556b5d97686f2c94c@localhost.localdomain> shchenz planned changes to this revision. shchenz added a comment. > I took this patch and applied it to master today and I noticed that one of the LIT tests fails. Sorry, I haven't run the LIT tests recently; I really should have. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D66329/new/ https://reviews.llvm.org/D66329 From llvm-commits at lists.llvm.org Tue Oct 8 02:10:11 2019 From: llvm-commits at lists.llvm.org (Clement Courbet via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 09:10:11 +0000 (UTC) Subject: [PATCH] D68629: [llvm-exegesis] Finish plumbing the `Config` field. In-Reply-To: References: Message-ID: This revision was automatically updated to reflect the committed changes. Closed by commit rG4919534ae4d4: [llvm-exegesis] Finish plumbing the `Config` field. (authored by courbet). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68629/new/ https://reviews.llvm.org/D68629 Files: llvm/test/tools/llvm-exegesis/X86/analysis-cluster-stabilization-config.test llvm/tools/llvm-exegesis/lib/BenchmarkCode.h llvm/tools/llvm-exegesis/lib/BenchmarkResult.h llvm/tools/llvm-exegesis/lib/BenchmarkRunner.cpp llvm/tools/llvm-exegesis/lib/Clustering.cpp llvm/tools/llvm-exegesis/lib/CodeTemplate.h llvm/tools/llvm-exegesis/lib/SnippetFile.cpp llvm/tools/llvm-exegesis/lib/SnippetGenerator.cpp llvm/unittests/tools/llvm-exegesis/X86/SnippetFileTest.cpp -------------- next part -------------- A non-text attachment was scrubbed...
Name: D68629.223807.patch Type: text/x-patch Size: 12913 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 02:13:23 2019 From: llvm-commits at lists.llvm.org (Stefan Schmidt via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 09:13:23 +0000 (UTC) Subject: [PATCH] D68352: [lld] Don't create hints-section if Hint/Name Table is empty In-Reply-To: References: Message-ID: <6977e61c867eca60b2c79562c5534196@localhost.localdomain> thrimbor updated this revision to Diff 223808. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68352/new/ https://reviews.llvm.org/D68352 Files: lld/COFF/Writer.cpp lld/test/COFF/Inputs/ordinal-only-implib.def lld/test/COFF/imports-ordinal-only.s Index: lld/test/COFF/imports-ordinal-only.s =================================================================== --- /dev/null +++ lld/test/COFF/imports-ordinal-only.s @@ -0,0 +1,20 @@ +# REQUIRES: x86 +# +# RUN: llvm-dlltool -k -m i386 --input-def %p/Inputs/ordinal-only-implib.def --output-lib %t-implib.a +# RUN: llvm-mc -triple=i386-pc-win32 %s -filetype=obj -o %t.obj +# RUN: lld-link -out:%t.exe -entry:main -subsystem:console -safeseh:no -debug %t.obj %t-implib.a +# RUN: llvm-objdump -private-headers %t.exe | FileCheck %s + +.text +.global _main +_main: +call _ByOrdinalFunction +ret + +# CHECK: The Import Tables: +# CHECK-NEXT: lookup 000020b4 time 00000000 fwd 00000000 name 000020c4 addr 000020bc +# CHECK-EMPTY: +# CHECK-NEXT: DLL Name: test.dll +# CHECK-NEXT: Hint/Ord Name +# CHECK-NEXT: 1 +# CHECK-EMPTY: Index: lld/test/COFF/Inputs/ordinal-only-implib.def =================================================================== --- /dev/null +++ lld/test/COFF/Inputs/ordinal-only-implib.def @@ -0,0 +1,3 @@ +LIBRARY test.dll +EXPORTS +ByOrdinalFunction @ 1 NONAME Index: lld/COFF/Writer.cpp =================================================================== --- lld/COFF/Writer.cpp +++ lld/COFF/Writer.cpp @@ -743,7 +743,8 @@ add(".idata$2", idata.dirs); add(".idata$4", 
idata.lookups); add(".idata$5", idata.addresses); - add(".idata$6", idata.hints); + if (!idata.hints.empty()) + add(".idata$6", idata.hints); add(".idata$7", idata.dllNames); } -------------- next part -------------- A non-text attachment was scrubbed... Name: D68352.223808.patch Type: text/x-patch Size: 1504 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 02:13:32 2019 From: llvm-commits at lists.llvm.org (Sylvestre Ledru via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 09:13:32 +0000 (UTC) Subject: [PATCH] D66733: [analyzer] Add a checker option to detect nested dead stores In-Reply-To: References: Message-ID: <3cf8bf79c237e27babbad1bfc208382b@localhost.localdomain> sylvestre.ledru added a comment. @steakhal Thanks for this patch. I tried it on Firefox and it found a bunch of stuff we improved. Some examples: https://hg.mozilla.org/integration/autoland/rev/db24db8f5f549ff446d1c3c69799187bcc2409e2 https://hg.mozilla.org/integration/autoland/rev/5de53dab979a401d9ba1405974f691927e53c9bd (and more to come) I think it should be added to the release notes! Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D66733/new/ https://reviews.llvm.org/D66733 From llvm-commits at lists.llvm.org Tue Oct 8 02:28:11 2019 From: llvm-commits at lists.llvm.org (Sam Parker via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 09:28:11 +0000 (UTC) Subject: [PATCH] D68400: [NFC][TTI] Add Alignment for isLegalMasked[Load/Store] In-Reply-To: References: Message-ID: samparker updated this revision to Diff 223809. samparker added a comment. Herald added a subscriber: hiraditya. Herald added a project: LLVM. - Now using MaybeAlign - Now using an alignment helper function in the vectorizer. 
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68400/new/ https://reviews.llvm.org/D68400 Files: llvm/include/llvm/Analysis/TargetTransformInfo.h llvm/include/llvm/Analysis/TargetTransformInfoImpl.h llvm/lib/Analysis/TargetTransformInfo.cpp llvm/lib/CodeGen/ScalarizeMaskedMemIntrin.cpp llvm/lib/Target/ARM/ARMTargetTransformInfo.cpp llvm/lib/Target/ARM/ARMTargetTransformInfo.h llvm/lib/Target/X86/X86TargetTransformInfo.cpp llvm/lib/Target/X86/X86TargetTransformInfo.h llvm/lib/Transforms/Vectorize/LoopVectorize.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D68400.223809.patch Type: text/x-patch Size: 11291 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 02:28:11 2019 From: llvm-commits at lists.llvm.org (ChenZheng via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 09:28:11 +0000 (UTC) Subject: [PATCH] D68579: [HardwareLoops] Optimisation remarks In-Reply-To: References: Message-ID: shchenz added inline comments. ================ Comment at: llvm/lib/CodeGen/HardwareLoops.cpp:247 for (Loop::iterator I = L->begin(), E = L->end(); I != E; ++I) if (TryConvertLoop(*I)) return true; // Stop search. ---------------- Maybe we can add a report here, saying that the parent hardware loop does not support containing a nested hardware loop? ================ Comment at: llvm/lib/CodeGen/HardwareLoops.cpp:251 HardwareLoopInfo HWLoopInfo(L); if (!HWLoopInfo.canAnalyze(*LI)) return false; ---------------- And also here, for an invalid loop, for example one that contains irreducible control flow?
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68579/new/ https://reviews.llvm.org/D68579 From llvm-commits at lists.llvm.org Tue Oct 8 02:31:44 2019 From: llvm-commits at lists.llvm.org (Stefan Schmidt via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 09:31:44 +0000 (UTC) Subject: [PATCH] D68352: [lld] Don't create hints-section if Hint/Name Table is empty In-Reply-To: References: Message-ID: <788e0cbf5b23e0256581b26f252c4133@localhost.localdomain> thrimbor added a comment. > So, you are creating an executable that has an import table that has only ordinals. Is my understanding correct? Indeed, that condition is rare and I've thought of that case before when I implemented this part. Yes, that's what we were having issues with. > I think your fix is correct. Do all tests still pass? Please run `ninja check-all` (or equivalent). I ran `make check-lld` (and now `make check-all` as well, just to be sure), and all tests passed. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68352/new/ https://reviews.llvm.org/D68352 From llvm-commits at lists.llvm.org Tue Oct 8 02:31:45 2019 From: llvm-commits at lists.llvm.org (Stefan Schmidt via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 09:31:45 +0000 (UTC) Subject: [PATCH] D68352: [lld] Don't create hints-section if Hint/Name Table is empty In-Reply-To: References: Message-ID: <7d29d9a5f3a22411eb75d9e90ba4a12b@localhost.localdomain> thrimbor added inline comments. ================ Comment at: lld/test/COFF/imports-ordinal-only.s:5 +# RUN: llvm-mc -triple=i386-pc-win32 %s -filetype=obj -o %t.obj +# RUN: lld-link -out:%t.exe -entry:main -subsystem:console -safeseh:no -debug %t.obj %t-implib.a + ---------------- ruiu wrote: > I'd dump the import table to verify that a correct import table is actually created in the resulting executable. I updated the test. Is this what you had in mind?
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68352/new/ https://reviews.llvm.org/D68352 From llvm-commits at lists.llvm.org Tue Oct 8 02:37:11 2019 From: llvm-commits at lists.llvm.org (=?utf-8?q?Micha=C5=82_G=C3=B3rny_via_Phabricator?= via llvm-commits) Date: Tue, 08 Oct 2019 09:37:11 +0000 (UTC) Subject: [PATCH] D68570: Unify the two CRC implementations In-Reply-To: References: Message-ID: mgorny added a comment. Doesn't winapi provide some method of computing crc32? CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68570/new/ https://reviews.llvm.org/D68570 From llvm-commits at lists.llvm.org Tue Oct 8 02:43:05 2019 From: llvm-commits at lists.llvm.org (Nikola Prica via llvm-commits) Date: Tue, 08 Oct 2019 09:43:05 -0000 Subject: [llvm] r374033 - [ISEL][ARM][AARCH64] Tracking simple parameter forwarding registers Message-ID: <20191008094305.A0A8A81FD3@lists.llvm.org> Author: nikolaprica Date: Tue Oct 8 02:43:05 2019 New Revision: 374033 URL: http://llvm.org/viewvc/llvm-project?rev=374033&view=rev Log: [ISEL][ARM][AARCH64] Tracking simple parameter forwarding registers Support for tracking registers that forward function parameters into the following function frame. For now we only support cases when parameter is forwarded through single register. 
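One way to picture the "single register" restriction in the log above is the following simplified, standalone sketch. It is not the LLVM implementation: `ArgRegPair`, `CallSiteInfo` and `recordArg` here are hypothetical plain-C++ stand-ins, registers are bare integers, and the option check that guards the tracking in the real lowering code is left out. The idea is only that a register is tracked while it carries exactly one argument, and is dropped from the list as soon as a second argument is packed into it.

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <vector>

// Hypothetical stand-in: which argument number is forwarded in which register.
struct ArgRegPair {
  unsigned Reg;
  unsigned ArgNo;
};
using CallSiteInfo = std::vector<ArgRegPair>;

// Records that argument ArgNo is passed in Reg. If Reg already carries
// another argument, the register no longer forwards a single parameter,
// so every entry for it is removed (erase-remove idiom).
void recordArg(CallSiteInfo &Info, std::set<unsigned> &RegsUsed, unsigned Reg,
               unsigned ArgNo) {
  const bool FirstUse = RegsUsed.insert(Reg).second;
  if (!FirstUse) {
    Info.erase(std::remove_if(Info.begin(), Info.end(),
                              [Reg](const ArgRegPair &P) {
                                return P.Reg == Reg;
                              }),
               Info.end());
    return;
  }
  Info.push_back({Reg, ArgNo});
}
```

For example, recording arguments 0 and 1 in registers 0 and 1 and then packing argument 2 into register 1 as well leaves only the entry for register 0.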
Reviewers: aprantl, vsk, t.p.northover Reviewed By: vsk Differential Revision: https://reviews.llvm.org/D66953 Added: llvm/trunk/test/DebugInfo/AArch64/call-site-info-output.ll llvm/trunk/test/DebugInfo/ARM/call-site-info-output.ll Modified: llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.cpp llvm/trunk/lib/Target/ARM/ARMExpandPseudoInsts.cpp llvm/trunk/lib/Target/ARM/ARMISelLowering.cpp Modified: llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.cpp?rev=374033&r1=374032&r2=374033&view=diff ============================================================================== --- llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.cpp (original) +++ llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.cpp Tue Oct 8 02:43:05 2019 @@ -3692,6 +3692,7 @@ AArch64TargetLowering::LowerCall(CallLow bool IsVarArg = CLI.IsVarArg; MachineFunction &MF = DAG.getMachineFunction(); + MachineFunction::CallSiteInfo CSInfo; bool IsThisReturn = false; AArch64FunctionInfo *FuncInfo = MF.getInfo(); @@ -3889,9 +3890,20 @@ AArch64TargetLowering::LowerCall(CallLow }) ->second; Bits = DAG.getNode(ISD::OR, DL, Bits.getValueType(), Bits, Arg); + // Call site info is used for function's parameter entry value + // tracking. For now we track only simple cases when parameter + // is transferred through whole register. + CSInfo.erase(std::remove_if(CSInfo.begin(), CSInfo.end(), + [&VA](MachineFunction::ArgRegPair ArgReg) { + return ArgReg.Reg == VA.getLocReg(); + }), + CSInfo.end()); } else { RegsToPass.emplace_back(VA.getLocReg(), Arg); RegsUsed.insert(VA.getLocReg()); + const TargetOptions &Options = DAG.getTarget().Options; + if (Options.EnableDebugEntryValues) + CSInfo.emplace_back(VA.getLocReg(), i); } } else { assert(VA.isMemLoc()); @@ -4072,12 +4084,15 @@ AArch64TargetLowering::LowerCall(CallLow // actual call instruction. 
if (IsTailCall) { MF.getFrameInfo().setHasTailCall(); - return DAG.getNode(AArch64ISD::TC_RETURN, DL, NodeTys, Ops); + SDValue Ret = DAG.getNode(AArch64ISD::TC_RETURN, DL, NodeTys, Ops); + DAG.addCallSiteInfo(Ret.getNode(), std::move(CSInfo)); + return Ret; } // Returns a chain and a flag for retval copy to use. Chain = DAG.getNode(AArch64ISD::CALL, DL, NodeTys, Ops); InFlag = Chain.getValue(1); + DAG.addCallSiteInfo(Chain.getNode(), std::move(CSInfo)); uint64_t CalleePopBytes = DoesCalleeRestoreStack(CallConv, TailCallOpt) ? alignTo(NumBytes, 16) : 0; Modified: llvm/trunk/lib/Target/ARM/ARMExpandPseudoInsts.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMExpandPseudoInsts.cpp?rev=374033&r1=374032&r2=374033&view=diff ============================================================================== --- llvm/trunk/lib/Target/ARM/ARMExpandPseudoInsts.cpp (original) +++ llvm/trunk/lib/Target/ARM/ARMExpandPseudoInsts.cpp Tue Oct 8 02:43:05 2019 @@ -1205,8 +1205,11 @@ bool ARMExpandPseudo::ExpandMI(MachineBa for (unsigned i = 1, e = MBBI->getNumOperands(); i != e; ++i) NewMI->addOperand(MBBI->getOperand(i)); - // Delete the pseudo instruction TCRETURN. + + // Update call site info and delete the pseudo instruction TCRETURN. 
+ MBB.getParent()->updateCallSiteInfo(&MI, &*NewMI); MBB.erase(MBBI); + MBBI = NewMI; return true; } @@ -1436,6 +1439,7 @@ bool ARMExpandPseudo::ExpandMI(MachineBa MIB.cloneMemRefs(MI); TransferImpOps(MI, MIB, MIB); + MI.getMF()->updateCallSiteInfo(&MI, &*MIB); MI.eraseFromParent(); return true; } Modified: llvm/trunk/lib/Target/ARM/ARMISelLowering.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMISelLowering.cpp?rev=374033&r1=374032&r2=374033&view=diff ============================================================================== --- llvm/trunk/lib/Target/ARM/ARMISelLowering.cpp (original) +++ llvm/trunk/lib/Target/ARM/ARMISelLowering.cpp Tue Oct 8 02:43:05 2019 @@ -2040,6 +2040,7 @@ ARMTargetLowering::LowerCall(TargetLower bool isVarArg = CLI.IsVarArg; MachineFunction &MF = DAG.getMachineFunction(); + MachineFunction::CallSiteInfo CSInfo; bool isStructRet = (Outs.empty()) ? false : Outs[0].Flags.isSRet(); bool isThisReturn = false; auto Attr = MF.getFunction().getFnAttribute("disable-tail-calls"); @@ -2164,6 +2165,9 @@ ARMTargetLowering::LowerCall(TargetLower "unexpected use of 'returned'"); isThisReturn = true; } + const TargetOptions &Options = DAG.getTarget().Options; + if (Options.EnableDebugEntryValues) + CSInfo.emplace_back(VA.getLocReg(), i); RegsToPass.push_back(std::make_pair(VA.getLocReg(), Arg)); } else if (isByVal) { assert(VA.isMemLoc()); @@ -2399,12 +2403,15 @@ ARMTargetLowering::LowerCall(TargetLower SDVTList NodeTys = DAG.getVTList(MVT::Other, MVT::Glue); if (isTailCall) { MF.getFrameInfo().setHasTailCall(); - return DAG.getNode(ARMISD::TC_RETURN, dl, NodeTys, Ops); + SDValue Ret = DAG.getNode(ARMISD::TC_RETURN, dl, NodeTys, Ops); + DAG.addCallSiteInfo(Ret.getNode(), std::move(CSInfo)); + return Ret; } // Returns a chain and a flag for retval copy to use. 
Chain = DAG.getNode(CallOpc, dl, NodeTys, Ops); InFlag = Chain.getValue(1); + DAG.addCallSiteInfo(Chain.getNode(), std::move(CSInfo)); Chain = DAG.getCALLSEQ_END(Chain, DAG.getIntPtrConstant(NumBytes, dl, true), DAG.getIntPtrConstant(0, dl, true), InFlag, dl); Added: llvm/trunk/test/DebugInfo/AArch64/call-site-info-output.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/DebugInfo/AArch64/call-site-info-output.ll?rev=374033&view=auto ============================================================================== --- llvm/trunk/test/DebugInfo/AArch64/call-site-info-output.ll (added) +++ llvm/trunk/test/DebugInfo/AArch64/call-site-info-output.ll Tue Oct 8 02:43:05 2019 @@ -0,0 +1,41 @@ +; RUN: llc -mtriple aarch64-linux-gnu -debug-entry-values %s -o - -stop-before=finalize-isel | FileCheck %s +; Verify that Selection DAG knows how to recognize simple function parameter forwarding registers. +; Produced from: +; extern int fn1(int,int,int); +; int fn2(int a, int b, int c) { +; int local = fn1(a+b, c, 10); +; if (local > 10) +; return local + 10; +; return local; +; } +; clang -g -O2 -target aarch64-linux-gnu -S -emit-llvm %s +; CHECK: callSites: +; CHECK-NEXT: - { bb: {{.*}}, offset: {{.*}}, fwdArgRegs: +; CHECK-NEXT: - { arg: 0, reg: '$w0' } +; CHECK-NEXT: - { arg: 1, reg: '$w1' } +; CHECK-NEXT: - { arg: 2, reg: '$w2' } } + +; ModuleID = 'call-site-info-output.c' +source_filename = "call-site-info-output.c" +target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128" +target triple = "aarch64-unknown-linux-gnu" + +; Function Attrs: nounwind +define dso_local i32 @fn2(i32 %a, i32 %b, i32 %c) local_unnamed_addr{ +entry: + %add = add nsw i32 %b, %a + %call = tail call i32 @fn1(i32 %add, i32 %c, i32 10) + %cmp = icmp sgt i32 %call, 10 + %add1 = add nsw i32 %call, 10 + %retval.0 = select i1 %cmp, i32 %add1, i32 %call + ret i32 %retval.0 +} + +declare dso_local i32 @fn1(i32, i32, i32) local_unnamed_addr + +; Function Attrs: nounwind readnone 
speculatable willreturn +declare void @llvm.dbg.value(metadata, metadata, metadata) + +!llvm.ident = !{!0} + +!0 = !{!"clang version 10.0.0"} Added: llvm/trunk/test/DebugInfo/ARM/call-site-info-output.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/DebugInfo/ARM/call-site-info-output.ll?rev=374033&view=auto ============================================================================== --- llvm/trunk/test/DebugInfo/ARM/call-site-info-output.ll (added) +++ llvm/trunk/test/DebugInfo/ARM/call-site-info-output.ll Tue Oct 8 02:43:05 2019 @@ -0,0 +1,41 @@ +; RUN: llc -mtriple arm-linux-gnu -debug-entry-values %s -o - -stop-before=finalize-isel | FileCheck %s +; Verify that Selection DAG knows how to recognize simple function parameter forwarding registers. +; Produced from: +; extern int fn1(int,int,int); +; int fn2(int a, int b, int c) { +; int local = fn1(a+b, c, 10); +; if (local > 10) +; return local + 10; +; return local; +; } +; clang -g -O2 -target arm-linux-gnu -S -emit-llvm %s +; CHECK: callSites: +; CHECK-NEXT: - { bb: {{.*}}, offset: {{.*}}, fwdArgRegs: +; CHECK-NEXT: - { arg: 0, reg: '$r0' } +; CHECK-NEXT: - { arg: 1, reg: '$r1' } +; CHECK-NEXT: - { arg: 2, reg: '$r2' } } + +; ModuleID = 'call-site-info-output.c' +source_filename = "call-site-info-output.c" +target datalayout = "e-m:e-p:32:32-Fi8-i64:64-v128:64:128-a:0:32-n32-S64" +target triple = "armv4t-unknown-linux-gnu" + +; Function Attrs: nounwind +define dso_local arm_aapcscc i32 @fn2(i32 %a, i32 %b, i32 %c) { +entry: + %add = add nsw i32 %b, %a + %call = tail call arm_aapcscc i32 @fn1(i32 %add, i32 %c, i32 10) + %cmp = icmp sgt i32 %call, 10 + %add1 = select i1 %cmp, i32 %c, i32 0 + %retval.0 = add nsw i32 %add1, %call + ret i32 %retval.0 +} + +declare dso_local arm_aapcscc i32 @fn1(i32, i32, i32) local_unnamed_addr + +; Function Attrs: nounwind readnone speculatable willreturn +declare void @llvm.dbg.value(metadata, metadata, metadata) + +!llvm.ident = !{!0} + +!0 = !{!"clang version 
10.0.0"} From llvm-commits at lists.llvm.org Tue Oct 8 02:50:36 2019 From: llvm-commits at lists.llvm.org (Pavel Labath via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 09:50:36 +0000 (UTC) Subject: [PATCH] D68210: Object/minidump: Add support for the MemoryInfoList stream In-Reply-To: References: Message-ID: <8b96174adbff348a405c81bc0ef782ff@localhost.localdomain> labath updated this revision to Diff 223811. labath added a comment. Address review comments Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68210/new/ https://reviews.llvm.org/D68210 Files: include/llvm/BinaryFormat/Minidump.h include/llvm/BinaryFormat/MinidumpConstants.def include/llvm/Object/Minidump.h lib/Object/Minidump.cpp unittests/Object/MinidumpTest.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D68210.223811.patch Type: text/x-patch Size: 22119 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 02:55:08 2019 From: llvm-commits at lists.llvm.org (Pavel Labath via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 09:55:08 +0000 (UTC) Subject: [PATCH] D68210: Object/minidump: Add support for the MemoryInfoList stream In-Reply-To: References: Message-ID: <39e6c43fd0528443e19910a587ee3047@localhost.localdomain> labath added a comment. (ignore latest diff. I just realized it's totally bogus) Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68210/new/ https://reviews.llvm.org/D68210 From llvm-commits at lists.llvm.org Tue Oct 8 03:09:08 2019 From: llvm-commits at lists.llvm.org (Sam Elliott via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 10:09:08 +0000 (UTC) Subject: [PATCH] D66887: [test-suite][WIP] Add GCC C Torture Suite as External Test Suite In-Reply-To: References: Message-ID: <624848da8134edc3065afe2492bc415f@localhost.localdomain> lenary marked 19 inline comments as done. lenary added inline comments. 
================ Comment at: SingleSource/Regression/C/gcc-c-torture/CMakeLists.txt:43-50 + gcc_torture_dg_options_cflags(${File}) + + # Add any flags that were requested + list(APPEND CFLAGS ${_TORTURE_CFLAGS}) + list(APPEND LDFLAGS ${_TORTURE_LDFLAGS}) + + # Remove any flags that will make clang error ---------------- kristof.beyls wrote: > I don't understand cmake much, I'm afraid. > My guess is that the CLANG_UNKNOWN_CFLAGS could only come from options encoded in specific test files, i.e. those that are being discovered through cmake function gcc_torture_dg_options_cflags. > If so, wouldn't it be better to only remove those options from the list of options that gcc_torture_dg_options_cflags finds? > IIUC, gcc_torture_dg_options_cflags currently modifies the global CFLAGS variable. Maybe a better design would be that it returns the list of options it finds without modifying the global CFLAGS variable. And do the filtering of flags that clang doesn't know inside gcc_torture_dg_options_cflags? > Let me repeat that I don't know much about the cmake language, so there may be idiomatic reasons why such a design would not be best. I actually had to look up how scope works in CMake when I had a previous bug in this patch. I believe this is doing something reasonable. In CMake, each subdirectory and each function are a new scope. They inherit the scope of the enclosing subdirectory/function, but new `set` calls (and things like `append`) apply only within the current scope. So this function is not modifying a global `CFLAGS`, it's taking the per-subdirectory CFLAGS, adding some extra flags, and then setting the CFLAGS variable in the current function's scope. I have decided to simplify the logic in `gcc_torture_dg_options_cflags` and set a variable in the parent scope with the dg-options cflags only, rather than updating `CFLAGS`. I also do the removal of the erroring/unknown cflags in the above function now. 
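The scoping behaviour described in the reply above can be sketched with a small standalone script. This is only an illustration, not code from the patch: the function names `append_local_flag` and `export_dg_flags` and the variable `_DG_CFLAGS` are hypothetical, and the script is assumed to be run with `cmake -P scope-demo.cmake`.

```cmake
# Illustrative sketch only; names are hypothetical, not taken from D66887.
cmake_minimum_required(VERSION 3.10)

set(CFLAGS "-O2")

# A function body runs in a new scope that starts with a copy of the
# caller's variables, so this APPEND changes only the function-local CFLAGS.
function(append_local_flag)
  list(APPEND CFLAGS "-Wall")
  message(STATUS "inside function: ${CFLAGS}")
endfunction()

# PARENT_SCOPE writes the variable into the caller's scope instead of the
# function's own scope, which is one way to hand a result back to the caller.
function(export_dg_flags)
  set(_DG_CFLAGS "-fwrapv" PARENT_SCOPE)
endfunction()

append_local_flag()
message(STATUS "after append_local_flag: ${CFLAGS}")   # caller still sees only -O2

export_dg_flags()
message(STATUS "after export_dg_flags: ${_DG_CFLAGS}") # caller now sees -fwrapv
```

Returning the extra flags through a separately named parent-scope variable, rather than rewriting `CFLAGS` in place, keeps the discovery step and the filtering step independent of each other.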
Repository: rT test-suite CHANGES SINCE LAST ACTION https://reviews.llvm.org/D66887/new/ https://reviews.llvm.org/D66887 From llvm-commits at lists.llvm.org Tue Oct 8 03:09:09 2019 From: llvm-commits at lists.llvm.org (Sam Elliott via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 10:09:09 +0000 (UTC) Subject: [PATCH] D66887: [test-suite][WIP] Add GCC C Torture Suite as External Test Suite In-Reply-To: References: Message-ID: <9b5811d9d8d759edb5f7696367e1d85e@localhost.localdomain> lenary updated this revision to Diff 223812. lenary marked an inline comment as done. lenary added a comment. - Address review feedback Repository: rT test-suite CHANGES SINCE LAST ACTION https://reviews.llvm.org/D66887/new/ https://reviews.llvm.org/D66887 Files: LICENSE.TXT SingleSource/Regression/C/CMakeLists.txt SingleSource/Regression/C/gcc-c-torture/CMakeLists.txt SingleSource/Regression/C/gcc-c-torture/README SingleSource/Regression/C/gcc-c-torture/execute/CMakeLists.txt SingleSource/Regression/C/gcc-c-torture/execute/COPYING SingleSource/Regression/C/gcc-c-torture/execute/COPYING3 SingleSource/Regression/C/gcc-c-torture/execute/LICENCE.TXT SingleSource/Regression/C/gcc-c-torture/execute/ieee/CMakeLists.txt SingleSource/Regression/C/gcc-c-torture/lit.local.cfg -------------- next part -------------- A non-text attachment was scrubbed... Name: D66887.223812.patch Type: text/x-patch Size: 68574 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 03:13:26 2019 From: llvm-commits at lists.llvm.org (Sam Elliott via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 10:13:26 +0000 (UTC) Subject: [PATCH] D66887: [test-suite][WIP] Add GCC C Torture Suite as External Test Suite In-Reply-To: References: Message-ID: lenary added a comment. @kristof.beyls @khcheang thank you both for the reviews! Sorry it has taken me a while to get back to you with an updated patch, I was unexpectedly away from my desk for two weeks. 
Repository: rT test-suite CHANGES SINCE LAST ACTION https://reviews.llvm.org/D66887/new/ https://reviews.llvm.org/D66887 From llvm-commits at lists.llvm.org Tue Oct 8 03:18:29 2019 From: llvm-commits at lists.llvm.org (Sam Elliott via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 10:18:29 +0000 (UTC) Subject: [PATCH] D66887: [test-suite][WIP] Add GCC C Torture Suite as External Test Suite In-Reply-To: References: Message-ID: <5ac5703c8fefbd7c1719904c0ceed21e@localhost.localdomain> lenary updated this revision to Diff 223813. lenary added a comment. - Update LDFlags for newlib nano io - Expand TestToSkip in ieee foreach Repository: rT test-suite CHANGES SINCE LAST ACTION https://reviews.llvm.org/D66887/new/ https://reviews.llvm.org/D66887 Files: LICENSE.TXT SingleSource/Regression/C/CMakeLists.txt SingleSource/Regression/C/gcc-c-torture/CMakeLists.txt SingleSource/Regression/C/gcc-c-torture/README SingleSource/Regression/C/gcc-c-torture/execute/CMakeLists.txt SingleSource/Regression/C/gcc-c-torture/execute/COPYING SingleSource/Regression/C/gcc-c-torture/execute/COPYING3 SingleSource/Regression/C/gcc-c-torture/execute/LICENCE.TXT SingleSource/Regression/C/gcc-c-torture/execute/ieee/CMakeLists.txt SingleSource/Regression/C/gcc-c-torture/lit.local.cfg -------------- next part -------------- A non-text attachment was scrubbed... Name: D66887.223813.patch Type: text/x-patch Size: 68572 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 03:31:24 2019 From: llvm-commits at lists.llvm.org (Hans Wennborg via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 10:31:24 +0000 (UTC) Subject: [PATCH] D68570: Unify the two CRC implementations In-Reply-To: References: Message-ID: <57050c784011dd87d1d5f4ad15dd533b@localhost.localdomain> hans added a comment. In D68570#1699278 , @mgorny wrote: > Doesn't winapi provide some method of computing crc32? Not that I'm aware. 
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68570/new/ https://reviews.llvm.org/D68570 From llvm-commits at lists.llvm.org Tue Oct 8 03:46:01 2019 From: llvm-commits at lists.llvm.org (Andrea Di Biagio via llvm-commits) Date: Tue, 08 Oct 2019 10:46:01 -0000 Subject: [llvm] r374034 - [MCA][LSUnit] Track loads and stores until retirement. Message-ID: <20191008104601.D17108EBB8@lists.llvm.org> Author: adibiagio Date: Tue Oct 8 03:46:01 2019 New Revision: 374034 URL: http://llvm.org/viewvc/llvm-project?rev=374034&view=rev Log: [MCA][LSUnit] Track loads and stores until retirement. Before this patch, loads and stores were only tracked by their corresponding queues in the LSUnit from dispatch until the execute stage. In practice we should be more conservative and assume that memory opcodes leave their queues at the retirement stage. Basically, loads should leave the load queue only when they have completed and delivered their data. We conservatively assume that a load is completed when it is retired. Stores should be tracked by the store queue from dispatch until retirement. In practice, stores can only leave the store queue if their data can be written to the data cache. This is mostly a mechanical change. With this patch, the retire stage notifies the LSUnit when a memory instruction is retired. That triggers the release of LDQ/STQ entries. The only visible change is in memory tests for the bdver2 model. That is because bdver2 is the only model that defines the load/store queue size. This patch partially addresses PR39830.
Differential Revision: https://reviews.llvm.org/D68266 Modified: llvm/trunk/include/llvm/MCA/HardwareUnits/LSUnit.h llvm/trunk/include/llvm/MCA/Stages/RetireStage.h llvm/trunk/lib/MCA/Context.cpp llvm/trunk/lib/MCA/HardwareUnits/LSUnit.cpp llvm/trunk/lib/MCA/Stages/RetireStage.cpp llvm/trunk/test/tools/llvm-mca/X86/BdVer2/load-store-throughput.s llvm/trunk/test/tools/llvm-mca/X86/BdVer2/load-throughput.s llvm/trunk/test/tools/llvm-mca/X86/BdVer2/store-throughput.s Modified: llvm/trunk/include/llvm/MCA/HardwareUnits/LSUnit.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/MCA/HardwareUnits/LSUnit.h?rev=374034&r1=374033&r2=374034&view=diff ============================================================================== --- llvm/trunk/include/llvm/MCA/HardwareUnits/LSUnit.h (original) +++ llvm/trunk/include/llvm/MCA/HardwareUnits/LSUnit.h Tue Oct 8 03:46:01 2019 @@ -291,9 +291,14 @@ public: return NextGroupID++; } - // Instruction executed event handlers. virtual void onInstructionExecuted(const InstRef &IR); + // Loads are tracked by the LDQ (load queue) from dispatch until completion. + // Stores are tracked by the STQ (store queue) from dispatch until commitment. + // By default we conservatively assume that the LDQ receives a load at + // dispatch. Loads leave the LDQ at retirement stage. + virtual void onInstructionRetired(const InstRef &IR); + virtual void onInstructionIssued(const InstRef &IR) { unsigned GroupID = IR.getInstruction()->getLSUTokenID(); Groups[GroupID]->onInstructionIssued(IR); @@ -438,9 +443,6 @@ public: /// 6. A store has to wait until an older store barrier is fully executed. unsigned dispatch(const InstRef &IR) override; - // FIXME: For simplicity, we optimistically assume a similar behavior for - // store instructions. In practice, store operations don't tend to leave the - // store queue until they reach the 'Retired' stage (See PR39830). 
void onInstructionExecuted(const InstRef &IR) override; }; Modified: llvm/trunk/include/llvm/MCA/Stages/RetireStage.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/MCA/Stages/RetireStage.h?rev=374034&r1=374033&r2=374034&view=diff ============================================================================== --- llvm/trunk/include/llvm/MCA/Stages/RetireStage.h (original) +++ llvm/trunk/include/llvm/MCA/Stages/RetireStage.h Tue Oct 8 03:46:01 2019 @@ -16,6 +16,7 @@ #ifndef LLVM_MCA_RETIRE_STAGE_H #define LLVM_MCA_RETIRE_STAGE_H +#include "llvm/MCA/HardwareUnits/LSUnit.h" #include "llvm/MCA/HardwareUnits/RegisterFile.h" #include "llvm/MCA/HardwareUnits/RetireControlUnit.h" #include "llvm/MCA/Stages/Stage.h" @@ -27,13 +28,14 @@ class RetireStage final : public Stage { // Owner will go away when we move listeners/eventing to the stages. RetireControlUnit &RCU; RegisterFile &PRF; + LSUnitBase &LSU; RetireStage(const RetireStage &Other) = delete; RetireStage &operator=(const RetireStage &Other) = delete; public: - RetireStage(RetireControlUnit &R, RegisterFile &F) - : Stage(), RCU(R), PRF(F) {} + RetireStage(RetireControlUnit &R, RegisterFile &F, LSUnitBase &LS) + : Stage(), RCU(R), PRF(F), LSU(LS) {} bool hasWorkToComplete() const override { return !RCU.isEmpty(); } Error cycleStart() override; Modified: llvm/trunk/lib/MCA/Context.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/MCA/Context.cpp?rev=374034&r1=374033&r2=374034&view=diff ============================================================================== --- llvm/trunk/lib/MCA/Context.cpp (original) +++ llvm/trunk/lib/MCA/Context.cpp Tue Oct 8 03:46:01 2019 @@ -44,7 +44,7 @@ Context::createDefaultPipeline(const Pip *RCU, *PRF); auto Execute = std::make_unique(*HWS, Opts.EnableBottleneckAnalysis); - auto Retire = std::make_unique(*RCU, *PRF); + auto Retire = std::make_unique(*RCU, *PRF, *LSU); // Pass the ownership of all the hardware units to this Context. 
addHardwareUnit(std::move(RCU)); Modified: llvm/trunk/lib/MCA/HardwareUnits/LSUnit.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/MCA/HardwareUnits/LSUnit.cpp?rev=374034&r1=374033&r2=374034&view=diff ============================================================================== --- llvm/trunk/lib/MCA/HardwareUnits/LSUnit.cpp (original) +++ llvm/trunk/lib/MCA/HardwareUnits/LSUnit.cpp Tue Oct 8 03:46:01 2019 @@ -160,17 +160,19 @@ LSUnit::Status LSUnit::isAvailable(const } void LSUnitBase::onInstructionExecuted(const InstRef &IR) { - const InstrDesc &Desc = IR.getInstruction()->getDesc(); - bool IsALoad = Desc.MayLoad; - bool IsAStore = Desc.MayStore; - assert((IsALoad || IsAStore) && "Expected a memory operation!"); - unsigned GroupID = IR.getInstruction()->getLSUTokenID(); auto It = Groups.find(GroupID); + assert(It != Groups.end() && "Instruction not dispatched to the LS unit"); It->second->onInstructionExecuted(); - if (It->second->isExecuted()) { + if (It->second->isExecuted()) Groups.erase(It); - } +} + +void LSUnitBase::onInstructionRetired(const InstRef &IR) { + const InstrDesc &Desc = IR.getInstruction()->getDesc(); + bool IsALoad = Desc.MayLoad; + bool IsAStore = Desc.MayStore; + assert((IsALoad || IsAStore) && "Expected a memory operation!"); if (IsALoad) { releaseLQSlot(); Modified: llvm/trunk/lib/MCA/Stages/RetireStage.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/MCA/Stages/RetireStage.cpp?rev=374034&r1=374033&r2=374034&view=diff ============================================================================== --- llvm/trunk/lib/MCA/Stages/RetireStage.cpp (original) +++ llvm/trunk/lib/MCA/Stages/RetireStage.cpp Tue Oct 8 03:46:01 2019 @@ -52,6 +52,10 @@ void RetireStage::notifyInstructionRetir llvm::SmallVector FreedRegs(PRF.getNumRegisterFiles()); const Instruction &Inst = *IR.getInstruction(); + // Release the load/store queue entries. 
+ if (Inst.isMemOp()) + LSU.onInstructionRetired(IR); + for (const WriteState &WS : Inst.getDefs()) PRF.removeRegisterWrite(WS, FreedRegs); notifyEvent(HWInstructionRetiredEvent(IR, FreedRegs)); Modified: llvm/trunk/test/tools/llvm-mca/X86/BdVer2/load-store-throughput.s URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/llvm-mca/X86/BdVer2/load-store-throughput.s?rev=374034&r1=374033&r2=374034&view=diff ============================================================================== --- llvm/trunk/test/tools/llvm-mca/X86/BdVer2/load-store-throughput.s (original) +++ llvm/trunk/test/tools/llvm-mca/X86/BdVer2/load-store-throughput.s Tue Oct 8 03:46:01 2019 @@ -507,12 +507,12 @@ movaps %xmm3, (%rbx) # CHECK: Iterations: 100 # CHECK-NEXT: Instructions: 400 -# CHECK-NEXT: Total Cycles: 593 +# CHECK-NEXT: Total Cycles: 554 # CHECK-NEXT: Total uOps: 400 # CHECK: Dispatch Width: 4 -# CHECK-NEXT: uOps Per Cycle: 0.67 -# CHECK-NEXT: IPC: 0.67 +# CHECK-NEXT: uOps Per Cycle: 0.72 +# CHECK-NEXT: IPC: 0.72 # CHECK-NEXT: Block RThroughput: 4.0 # CHECK: Instruction Info: @@ -532,24 +532,24 @@ movaps %xmm3, (%rbx) # CHECK: Dynamic Dispatch Stall Cycles: # CHECK-NEXT: RAT - Register unavailable: 0 # CHECK-NEXT: RCU - Retire tokens unavailable: 0 -# CHECK-NEXT: SCHEDQ - Scheduler full: 187 (31.5%) +# CHECK-NEXT: SCHEDQ - Scheduler full: 55 (9.9%) # CHECK-NEXT: LQ - Load queue full: 0 -# CHECK-NEXT: SQ - Store queue full: 342 (57.7%) +# CHECK-NEXT: SQ - Store queue full: 437 (78.9%) # CHECK-NEXT: GROUP - Static restrictions on the dispatch group: 0 # CHECK: Dispatch Logic - number of cycles where we saw N micro opcodes dispatched: # CHECK-NEXT: [# dispatched], [# cycles] -# CHECK-NEXT: 0, 403 (68.0%) -# CHECK-NEXT: 1, 90 (15.2%) -# CHECK-NEXT: 2, 2 (0.3%) -# CHECK-NEXT: 3, 86 (14.5%) -# CHECK-NEXT: 4, 12 (2.0%) +# CHECK-NEXT: 0, 365 (65.9%) +# CHECK-NEXT: 1, 88 (15.9%) +# CHECK-NEXT: 2, 3 (0.5%) +# CHECK-NEXT: 3, 86 (15.5%) +# CHECK-NEXT: 4, 12 (2.2%) # CHECK: Schedulers - 
number of cycles where we saw N micro opcodes issued: # CHECK-NEXT: [# issued], [# cycles] -# CHECK-NEXT: 0, 292 (49.2%) -# CHECK-NEXT: 1, 202 (34.1%) -# CHECK-NEXT: 2, 99 (16.7%) +# CHECK-NEXT: 0, 253 (45.7%) +# CHECK-NEXT: 1, 202 (36.5%) +# CHECK-NEXT: 2, 99 (17.9%) # CHECK: Scheduler's queue usage: # CHECK-NEXT: [1] Resource name. @@ -595,8 +595,8 @@ movaps %xmm3, (%rbx) # CHECK: Resource pressure by instruction: # CHECK-NEXT: [0.0] [0.1] [1] [2] [3] [4] [5] [6] [7.0] [7.1] [8.0] [8.1] [9] [10] [11] [12] [13] [14] [15] [16.0] [16.1] [17] [18] Instructions: # CHECK-NEXT: - 1.00 - - - - - - - - - - - 1.00 - - - 3.00 - - - - 1.00 movd %mm0, (%rax) -# CHECK-NEXT: 0.36 2.64 - - - - - - - - - 3.00 - - - 1.00 - - - - 3.00 - - movd (%rcx), %mm1 -# CHECK-NEXT: 2.64 0.36 - - - - - - - - 3.00 - - - 1.00 - - - - 3.00 - - - movd (%rdx), %mm2 +# CHECK-NEXT: 1.53 1.47 - - - - - - - - - 3.00 - - - 1.00 - - - - 3.00 - - movd (%rcx), %mm1 +# CHECK-NEXT: 1.47 1.53 - - - - - - - - 3.00 - - - 1.00 - - - - 3.00 - - - movd (%rdx), %mm2 # CHECK-NEXT: 1.00 - - - - - - - - - - - - 1.00 - - 3.00 - - - - - 1.00 movd %mm3, (%rbx) # CHECK: Timeline view: Modified: llvm/trunk/test/tools/llvm-mca/X86/BdVer2/load-throughput.s URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/llvm-mca/X86/BdVer2/load-throughput.s?rev=374034&r1=374033&r2=374034&view=diff ============================================================================== --- llvm/trunk/test/tools/llvm-mca/X86/BdVer2/load-throughput.s (original) +++ llvm/trunk/test/tools/llvm-mca/X86/BdVer2/load-throughput.s Tue Oct 8 03:46:01 2019 @@ -80,7 +80,7 @@ vmovaps (%rbx), %ymm3 # CHECK-NEXT: RAT - Register unavailable: 0 # CHECK-NEXT: RCU - Retire tokens unavailable: 0 # CHECK-NEXT: SCHEDQ - Scheduler full: 0 -# CHECK-NEXT: LQ - Load queue full: 353 (86.9%) +# CHECK-NEXT: LQ - Load queue full: 354 (87.2%) # CHECK-NEXT: SQ - Store queue full: 0 # CHECK-NEXT: GROUP - Static restrictions on the dispatch group: 0 @@ -102,9 +102,9 @@ 
vmovaps (%rbx), %ymm3 # CHECK-NEXT: [4] Total number of buffer entries. # CHECK: [1] [2] [3] [4] -# CHECK-NEXT: PdEX 32 36 40 +# CHECK-NEXT: PdEX 31 34 40 # CHECK-NEXT: PdFPU 0 0 64 -# CHECK-NEXT: PdLoad 37 40 40 +# CHECK-NEXT: PdLoad 36 40 40 # CHECK-NEXT: PdStore 0 0 24 # CHECK: Resources: @@ -193,7 +193,7 @@ vmovaps (%rbx), %ymm3 # CHECK-NEXT: RAT - Register unavailable: 0 # CHECK-NEXT: RCU - Retire tokens unavailable: 0 # CHECK-NEXT: SCHEDQ - Scheduler full: 0 -# CHECK-NEXT: LQ - Load queue full: 353 (86.9%) +# CHECK-NEXT: LQ - Load queue full: 354 (87.2%) # CHECK-NEXT: SQ - Store queue full: 0 # CHECK-NEXT: GROUP - Static restrictions on the dispatch group: 0 @@ -215,9 +215,9 @@ vmovaps (%rbx), %ymm3 # CHECK-NEXT: [4] Total number of buffer entries. # CHECK: [1] [2] [3] [4] -# CHECK-NEXT: PdEX 32 36 40 +# CHECK-NEXT: PdEX 31 34 40 # CHECK-NEXT: PdFPU 0 0 64 -# CHECK-NEXT: PdLoad 37 40 40 +# CHECK-NEXT: PdLoad 36 40 40 # CHECK-NEXT: PdStore 0 0 24 # CHECK: Resources: @@ -306,7 +306,7 @@ vmovaps (%rbx), %ymm3 # CHECK-NEXT: RAT - Register unavailable: 0 # CHECK-NEXT: RCU - Retire tokens unavailable: 0 # CHECK-NEXT: SCHEDQ - Scheduler full: 0 -# CHECK-NEXT: LQ - Load queue full: 353 (86.9%) +# CHECK-NEXT: LQ - Load queue full: 354 (87.2%) # CHECK-NEXT: SQ - Store queue full: 0 # CHECK-NEXT: GROUP - Static restrictions on the dispatch group: 0 @@ -328,9 +328,9 @@ vmovaps (%rbx), %ymm3 # CHECK-NEXT: [4] Total number of buffer entries. 
# CHECK: [1] [2] [3] [4] -# CHECK-NEXT: PdEX 32 36 40 +# CHECK-NEXT: PdEX 31 34 40 # CHECK-NEXT: PdFPU 0 0 64 -# CHECK-NEXT: PdLoad 37 40 40 +# CHECK-NEXT: PdLoad 36 40 40 # CHECK-NEXT: PdStore 0 0 24 # CHECK: Resources: @@ -419,7 +419,7 @@ vmovaps (%rbx), %ymm3 # CHECK-NEXT: RAT - Register unavailable: 0 # CHECK-NEXT: RCU - Retire tokens unavailable: 0 # CHECK-NEXT: SCHEDQ - Scheduler full: 0 -# CHECK-NEXT: LQ - Load queue full: 353 (86.9%) +# CHECK-NEXT: LQ - Load queue full: 354 (87.2%) # CHECK-NEXT: SQ - Store queue full: 0 # CHECK-NEXT: GROUP - Static restrictions on the dispatch group: 0 @@ -441,9 +441,9 @@ vmovaps (%rbx), %ymm3 # CHECK-NEXT: [4] Total number of buffer entries. # CHECK: [1] [2] [3] [4] -# CHECK-NEXT: PdEX 32 36 40 +# CHECK-NEXT: PdEX 31 34 40 # CHECK-NEXT: PdFPU 0 0 64 -# CHECK-NEXT: PdLoad 37 40 40 +# CHECK-NEXT: PdLoad 36 40 40 # CHECK-NEXT: PdStore 0 0 24 # CHECK: Resources: @@ -532,7 +532,7 @@ vmovaps (%rbx), %ymm3 # CHECK-NEXT: RAT - Register unavailable: 0 # CHECK-NEXT: RCU - Retire tokens unavailable: 0 # CHECK-NEXT: SCHEDQ - Scheduler full: 0 -# CHECK-NEXT: LQ - Load queue full: 532 (87.9%) +# CHECK-NEXT: LQ - Load queue full: 533 (88.1%) # CHECK-NEXT: SQ - Store queue full: 0 # CHECK-NEXT: GROUP - Static restrictions on the dispatch group: 0 @@ -554,8 +554,8 @@ vmovaps (%rbx), %ymm3 # CHECK-NEXT: [4] Total number of buffer entries. 
# CHECK: [1] [2] [3] [4] -# CHECK-NEXT: PdEX 34 38 40 -# CHECK-NEXT: PdFPU 34 38 64 +# CHECK-NEXT: PdEX 33 36 40 +# CHECK-NEXT: PdFPU 33 36 64 # CHECK-NEXT: PdLoad 37 40 40 # CHECK-NEXT: PdStore 0 0 24 @@ -646,7 +646,7 @@ vmovaps (%rbx), %ymm3 # CHECK-NEXT: RAT - Register unavailable: 0 # CHECK-NEXT: RCU - Retire tokens unavailable: 0 # CHECK-NEXT: SCHEDQ - Scheduler full: 0 -# CHECK-NEXT: LQ - Load queue full: 532 (87.9%) +# CHECK-NEXT: LQ - Load queue full: 533 (88.1%) # CHECK-NEXT: SQ - Store queue full: 0 # CHECK-NEXT: GROUP - Static restrictions on the dispatch group: 0 @@ -668,8 +668,8 @@ vmovaps (%rbx), %ymm3 # CHECK-NEXT: [4] Total number of buffer entries. # CHECK: [1] [2] [3] [4] -# CHECK-NEXT: PdEX 34 38 40 -# CHECK-NEXT: PdFPU 34 38 64 +# CHECK-NEXT: PdEX 33 36 40 +# CHECK-NEXT: PdFPU 33 36 64 # CHECK-NEXT: PdLoad 37 40 40 # CHECK-NEXT: PdStore 0 0 24 @@ -760,7 +760,7 @@ vmovaps (%rbx), %ymm3 # CHECK-NEXT: RAT - Register unavailable: 0 # CHECK-NEXT: RCU - Retire tokens unavailable: 0 # CHECK-NEXT: SCHEDQ - Scheduler full: 0 -# CHECK-NEXT: LQ - Load queue full: 344 (56.9%) +# CHECK-NEXT: LQ - Load queue full: 345 (57.0%) # CHECK-NEXT: SQ - Store queue full: 0 # CHECK-NEXT: GROUP - Static restrictions on the dispatch group: 0 @@ -781,9 +781,9 @@ vmovaps (%rbx), %ymm3 # CHECK-NEXT: [4] Total number of buffer entries. 
# CHECK: [1] [2] [3] [4] -# CHECK-NEXT: PdEX 33 38 40 -# CHECK-NEXT: PdFPU 33 38 64 -# CHECK-NEXT: PdLoad 37 40 40 +# CHECK-NEXT: PdEX 33 36 40 +# CHECK-NEXT: PdFPU 33 36 64 +# CHECK-NEXT: PdLoad 36 40 40 # CHECK-NEXT: PdStore 0 0 24 # CHECK: Resources: Modified: llvm/trunk/test/tools/llvm-mca/X86/BdVer2/store-throughput.s URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/llvm-mca/X86/BdVer2/store-throughput.s?rev=374034&r1=374033&r2=374034&view=diff ============================================================================== --- llvm/trunk/test/tools/llvm-mca/X86/BdVer2/store-throughput.s (original) +++ llvm/trunk/test/tools/llvm-mca/X86/BdVer2/store-throughput.s Tue Oct 8 03:46:01 2019 @@ -81,14 +81,13 @@ vmovaps %ymm3, (%rbx) # CHECK-NEXT: RCU - Retire tokens unavailable: 0 # CHECK-NEXT: SCHEDQ - Scheduler full: 0 # CHECK-NEXT: LQ - Load queue full: 0 -# CHECK-NEXT: SQ - Store queue full: 370 (91.8%) +# CHECK-NEXT: SQ - Store queue full: 371 (92.1%) # CHECK-NEXT: GROUP - Static restrictions on the dispatch group: 0 # CHECK: Dispatch Logic - number of cycles where we saw N micro opcodes dispatched: # CHECK-NEXT: [# dispatched], [# cycles] -# CHECK-NEXT: 0, 25 (6.2%) -# CHECK-NEXT: 1, 370 (91.8%) -# CHECK-NEXT: 2, 1 (0.2%) +# CHECK-NEXT: 0, 24 (6.0%) +# CHECK-NEXT: 1, 372 (92.3%) # CHECK-NEXT: 4, 7 (1.7%) # CHECK: Schedulers - number of cycles where we saw N micro opcodes issued: @@ -103,10 +102,10 @@ vmovaps %ymm3, (%rbx) # CHECK-NEXT: [4] Total number of buffer entries. 
# CHECK: [1] [2] [3] [4] -# CHECK-NEXT: PdEX 22 23 40 +# CHECK-NEXT: PdEX 21 22 40 # CHECK-NEXT: PdFPU 0 0 64 # CHECK-NEXT: PdLoad 0 0 40 -# CHECK-NEXT: PdStore 23 24 24 +# CHECK-NEXT: PdStore 22 23 24 # CHECK: Resources: # CHECK-NEXT: [0.0] - PdAGLU01 @@ -195,14 +194,13 @@ vmovaps %ymm3, (%rbx) # CHECK-NEXT: RCU - Retire tokens unavailable: 0 # CHECK-NEXT: SCHEDQ - Scheduler full: 0 # CHECK-NEXT: LQ - Load queue full: 0 -# CHECK-NEXT: SQ - Store queue full: 370 (91.8%) +# CHECK-NEXT: SQ - Store queue full: 371 (92.1%) # CHECK-NEXT: GROUP - Static restrictions on the dispatch group: 0 # CHECK: Dispatch Logic - number of cycles where we saw N micro opcodes dispatched: # CHECK-NEXT: [# dispatched], [# cycles] -# CHECK-NEXT: 0, 25 (6.2%) -# CHECK-NEXT: 1, 370 (91.8%) -# CHECK-NEXT: 2, 1 (0.2%) +# CHECK-NEXT: 0, 24 (6.0%) +# CHECK-NEXT: 1, 372 (92.3%) # CHECK-NEXT: 4, 7 (1.7%) # CHECK: Schedulers - number of cycles where we saw N micro opcodes issued: @@ -217,10 +215,10 @@ vmovaps %ymm3, (%rbx) # CHECK-NEXT: [4] Total number of buffer entries. 
# CHECK: [1] [2] [3] [4] -# CHECK-NEXT: PdEX 22 23 40 +# CHECK-NEXT: PdEX 21 22 40 # CHECK-NEXT: PdFPU 0 0 64 # CHECK-NEXT: PdLoad 0 0 40 -# CHECK-NEXT: PdStore 23 24 24 +# CHECK-NEXT: PdStore 22 23 24 # CHECK: Resources: # CHECK-NEXT: [0.0] - PdAGLU01 @@ -309,14 +307,13 @@ vmovaps %ymm3, (%rbx) # CHECK-NEXT: RCU - Retire tokens unavailable: 0 # CHECK-NEXT: SCHEDQ - Scheduler full: 0 # CHECK-NEXT: LQ - Load queue full: 0 -# CHECK-NEXT: SQ - Store queue full: 370 (91.8%) +# CHECK-NEXT: SQ - Store queue full: 371 (92.1%) # CHECK-NEXT: GROUP - Static restrictions on the dispatch group: 0 # CHECK: Dispatch Logic - number of cycles where we saw N micro opcodes dispatched: # CHECK-NEXT: [# dispatched], [# cycles] -# CHECK-NEXT: 0, 25 (6.2%) -# CHECK-NEXT: 1, 370 (91.8%) -# CHECK-NEXT: 2, 1 (0.2%) +# CHECK-NEXT: 0, 24 (6.0%) +# CHECK-NEXT: 1, 372 (92.3%) # CHECK-NEXT: 4, 7 (1.7%) # CHECK: Schedulers - number of cycles where we saw N micro opcodes issued: @@ -331,10 +328,10 @@ vmovaps %ymm3, (%rbx) # CHECK-NEXT: [4] Total number of buffer entries. 
# CHECK: [1] [2] [3] [4] -# CHECK-NEXT: PdEX 22 23 40 +# CHECK-NEXT: PdEX 21 22 40 # CHECK-NEXT: PdFPU 0 0 64 # CHECK-NEXT: PdLoad 0 0 40 -# CHECK-NEXT: PdStore 23 24 24 +# CHECK-NEXT: PdStore 22 23 24 # CHECK: Resources: # CHECK-NEXT: [0.0] - PdAGLU01 @@ -423,14 +420,13 @@ vmovaps %ymm3, (%rbx) # CHECK-NEXT: RCU - Retire tokens unavailable: 0 # CHECK-NEXT: SCHEDQ - Scheduler full: 0 # CHECK-NEXT: LQ - Load queue full: 0 -# CHECK-NEXT: SQ - Store queue full: 370 (91.8%) +# CHECK-NEXT: SQ - Store queue full: 371 (92.1%) # CHECK-NEXT: GROUP - Static restrictions on the dispatch group: 0 # CHECK: Dispatch Logic - number of cycles where we saw N micro opcodes dispatched: # CHECK-NEXT: [# dispatched], [# cycles] -# CHECK-NEXT: 0, 25 (6.2%) -# CHECK-NEXT: 1, 370 (91.8%) -# CHECK-NEXT: 2, 1 (0.2%) +# CHECK-NEXT: 0, 24 (6.0%) +# CHECK-NEXT: 1, 372 (92.3%) # CHECK-NEXT: 4, 7 (1.7%) # CHECK: Schedulers - number of cycles where we saw N micro opcodes issued: @@ -445,10 +441,10 @@ vmovaps %ymm3, (%rbx) # CHECK-NEXT: [4] Total number of buffer entries. # CHECK: [1] [2] [3] [4] -# CHECK-NEXT: PdEX 22 23 40 +# CHECK-NEXT: PdEX 21 22 40 # CHECK-NEXT: PdFPU 0 0 64 # CHECK-NEXT: PdLoad 0 0 40 -# CHECK-NEXT: PdStore 23 24 24 +# CHECK-NEXT: PdStore 22 23 24 # CHECK: Resources: # CHECK-NEXT: [0.0] - PdAGLU01 @@ -537,7 +533,7 @@ vmovaps %ymm3, (%rbx) # CHECK-NEXT: RCU - Retire tokens unavailable: 0 # CHECK-NEXT: SCHEDQ - Scheduler full: 0 # CHECK-NEXT: LQ - Load queue full: 0 -# CHECK-NEXT: SQ - Store queue full: 747 (93.0%) +# CHECK-NEXT: SQ - Store queue full: 748 (93.2%) # CHECK-NEXT: GROUP - Static restrictions on the dispatch group: 0 # CHECK: Dispatch Logic - number of cycles where we saw N micro opcodes dispatched: @@ -559,10 +555,10 @@ vmovaps %ymm3, (%rbx) # CHECK-NEXT: [4] Total number of buffer entries. 
# CHECK: [1] [2] [3] [4] -# CHECK-NEXT: PdEX 22 23 40 -# CHECK-NEXT: PdFPU 22 23 64 +# CHECK-NEXT: PdEX 21 23 40 +# CHECK-NEXT: PdFPU 21 23 64 # CHECK-NEXT: PdLoad 0 0 40 -# CHECK-NEXT: PdStore 23 24 24 +# CHECK-NEXT: PdStore 22 24 24 # CHECK: Resources: # CHECK-NEXT: [0.0] - PdAGLU01 @@ -650,16 +646,17 @@ vmovaps %ymm3, (%rbx) # CHECK: Dynamic Dispatch Stall Cycles: # CHECK-NEXT: RAT - Register unavailable: 0 # CHECK-NEXT: RCU - Retire tokens unavailable: 0 -# CHECK-NEXT: SCHEDQ - Scheduler full: 185 (30.7%) +# CHECK-NEXT: SCHEDQ - Scheduler full: 0 # CHECK-NEXT: LQ - Load queue full: 0 -# CHECK-NEXT: SQ - Store queue full: 372 (61.8%) +# CHECK-NEXT: SQ - Store queue full: 559 (92.9%) # CHECK-NEXT: GROUP - Static restrictions on the dispatch group: 0 # CHECK: Dispatch Logic - number of cycles where we saw N micro opcodes dispatched: # CHECK-NEXT: [# dispatched], [# cycles] -# CHECK-NEXT: 0, 223 (37.0%) -# CHECK-NEXT: 1, 372 (61.8%) -# CHECK-NEXT: 4, 7 (1.2%) +# CHECK-NEXT: 0, 222 (36.9%) +# CHECK-NEXT: 1, 373 (62.0%) +# CHECK-NEXT: 3, 1 (0.2%) +# CHECK-NEXT: 4, 6 (1.0%) # CHECK: Schedulers - number of cycles where we saw N micro opcodes issued: # CHECK-NEXT: [# issued], [# cycles] @@ -673,10 +670,10 @@ vmovaps %ymm3, (%rbx) # CHECK-NEXT: [4] Total number of buffer entries. 
# CHECK: [1] [2] [3] [4] -# CHECK-NEXT: PdEX 22 24 40 -# CHECK-NEXT: PdFPU 22 24 64 +# CHECK-NEXT: PdEX 21 23 40 +# CHECK-NEXT: PdFPU 21 23 64 # CHECK-NEXT: PdLoad 0 0 40 -# CHECK-NEXT: PdStore 23 24 24 +# CHECK-NEXT: PdStore 22 24 24 # CHECK: Resources: # CHECK-NEXT: [0.0] - PdAGLU01 @@ -763,9 +760,9 @@ vmovaps %ymm3, (%rbx) # CHECK: Dynamic Dispatch Stall Cycles: # CHECK-NEXT: RAT - Register unavailable: 0 # CHECK-NEXT: RCU - Retire tokens unavailable: 0 -# CHECK-NEXT: SCHEDQ - Scheduler full: 5963 (83.2%) +# CHECK-NEXT: SCHEDQ - Scheduler full: 5777 (80.6%) # CHECK-NEXT: LQ - Load queue full: 0 -# CHECK-NEXT: SQ - Store queue full: 374 (5.2%) +# CHECK-NEXT: SQ - Store queue full: 561 (7.8%) # CHECK-NEXT: GROUP - Static restrictions on the dispatch group: 0 # CHECK: Dispatch Logic - number of cycles where we saw N micro opcodes dispatched: From llvm-commits at lists.llvm.org Tue Oct 8 03:46:22 2019 From: llvm-commits at lists.llvm.org (Andrea Di Biagio via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 10:46:22 +0000 (UTC) Subject: [PATCH] D68266: [MCA][LSUnit] Track loads and stores until retirement. In-Reply-To: References: Message-ID: <1c32c4efbf1bbafdfe5424c19f07ee33@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rG8d6651f7b11e: [MCA][LSUnit] Track loads and stores until retirement. (authored by andreadb). Herald added a subscriber: hiraditya. Herald added a project: LLVM. 
Changed prior to commit: https://reviews.llvm.org/D68266?vs=222592&id=223815#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68266/new/ https://reviews.llvm.org/D68266 Files: llvm/include/llvm/MCA/HardwareUnits/LSUnit.h llvm/include/llvm/MCA/Stages/RetireStage.h llvm/lib/MCA/Context.cpp llvm/lib/MCA/HardwareUnits/LSUnit.cpp llvm/lib/MCA/Stages/RetireStage.cpp llvm/test/tools/llvm-mca/X86/BdVer2/load-store-throughput.s llvm/test/tools/llvm-mca/X86/BdVer2/load-throughput.s llvm/test/tools/llvm-mca/X86/BdVer2/store-throughput.s -------------- next part -------------- A non-text attachment was scrubbed... Name: D68266.223815.patch Type: text/x-patch Size: 25918 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 03:49:16 2019 From: llvm-commits at lists.llvm.org (Pavel Labath via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 10:49:16 +0000 (UTC) Subject: [PATCH] D68210: Object/minidump: Add support for the MemoryInfoList stream In-Reply-To: References: Message-ID: <348398d9635bf09478e8ffd239395318@localhost.localdomain> labath updated this revision to Diff 223816. labath added a comment. use two phase checks Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68210/new/ https://reviews.llvm.org/D68210 Files: include/llvm/BinaryFormat/Minidump.h include/llvm/BinaryFormat/MinidumpConstants.def include/llvm/Object/Minidump.h lib/Object/Minidump.cpp unittests/Object/MinidumpTest.cpp -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68210.223816.patch Type: text/x-patch Size: 22037 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 03:55:48 2019 From: llvm-commits at lists.llvm.org (Pavel Labath via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 10:55:48 +0000 (UTC) Subject: [PATCH] D68210: Object/minidump: Add support for the MemoryInfoList stream In-Reply-To: References: Message-ID: <9be20608daa89a097398976c2078e4c4@localhost.localdomain> labath marked 2 inline comments as done. labath added inline comments. ================ Comment at: unittests/Object/MinidumpTest.cpp:620 + }; + EXPECT_THAT_EXPECTED(cantFail(create(HeaderTooBig))->getMemoryInfoList(), + Failed()); ---------------- jhenderson wrote: > Here and in the similar places, I'm not convinced that `cantFail` is appropriate (if the creation code is broken, this will assert and therefore possibly hide the actual testing failures that show where it went wrong more precisely). It should probably be a two-phase thing: > > ``` > Expected> Minidump = create(HeaderTooBig); > ASSERT_THAT_EXPECTED(Minidump, Succeeded()); > EXPECT_THAT_EXPECTED(Minidump->getMemoryInfoList(), Failed()); > ``` Done. The previous solution of just passing the creation error in the helper function was totally bogus, of course. Ideally, it would be possible to express this as a one-liner using gtest matchers (`HasValue(Property(&getMemoryInfoList, Failed))`), but unfortunately they are quite incompatible with the move-only Expected semantics.
Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68210/new/ https://reviews.llvm.org/D68210 From llvm-commits at lists.llvm.org Tue Oct 8 03:55:49 2019 From: llvm-commits at lists.llvm.org (Bjorn Pettersson via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 10:55:49 +0000 (UTC) Subject: [PATCH] D68006: DSE miscompile when store is clobbered across loop iterations In-Reply-To: References: Message-ID: <460bddb379698aa012f1e9e756f01605@localhost.localdomain> bjope added inline comments. ================ Comment at: lib/Transforms/Scalar/DeadStoreElimination.cpp:625 // Start checking the store-block. + WorkList.emplace_back(SecondBB, /*VisitedThroughBackedge=*/false); ---------------- Isn't store-block confusing? ================ Comment at: lib/Transforms/Scalar/DeadStoreElimination.cpp:630 // Check all blocks going backward until we reach the load-block. while (!WorkList.empty()) { ---------------- Load-block? ================ Comment at: lib/Transforms/Scalar/DeadStoreElimination.cpp:636 // Ignore instructions before LI if this is the FirstBB. BasicBlock::iterator BI = (B == FirstBB ? FirstBBI : B->begin()); ---------------- Is LI an old name for FirstI or FirstBBI? There are several more references to LI and SI in the code below, which makes it a little hard to follow what is going on here, since the code and comments have already rotted in the baseline. I would not mind if that were cleaned up in a separate commit. ================ Comment at: lib/Transforms/Scalar/DeadStoreElimination.cpp:641 if (isFirstBlock) { // Ignore instructions after SI if this is the first visit of SecondBB. assert(B == SecondBB && "first block is not the store block"); ---------------- Is SI an old name for SecondI? Or SecondBBI?
================ Comment at: lib/Transforms/Scalar/DeadStoreElimination.cpp:666 for (auto PredI = pred_begin(B), PE = pred_end(B); PredI != PE; ++PredI) { - if (!Visited.insert(*PredI).second) + auto *Pred = *PredI; + if (!Visited.insert(Pred).second) ---------------- nit: I don't think that we need this local variable. Just use *PredI instead. If you keep it, then be explicit about the type instead of using auto. (As currently written you have similarly named variables that more or less can denote the same thing, but without really explaining the difference by having any type information. Just confusing IMHO.) ================ Comment at: lib/Transforms/Scalar/DeadStoreElimination.cpp:670 + bool IsBackedge = Backedges.count(std::make_pair(B, Pred)); + WorkList.emplace_back(Pred, VisitedThroughBackedge | IsBackedge); } ---------------- If we have a cfg such as A: (firstBB) B: preds A, D C: preds B (secondBB) D: preds C then I think we only visit block B once with this solution. And with VisitedThroughBackedge being false. It is only the SecondBB that can be visited twice afaict. Isn't that a problem? CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68006/new/ https://reviews.llvm.org/D68006 From llvm-commits at lists.llvm.org Tue Oct 8 03:58:22 2019 From: llvm-commits at lists.llvm.org (Sjoerd Meijer via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 10:58:22 +0000 (UTC) Subject: [PATCH] D68579: [HardwareLoops] Optimisation remarks In-Reply-To: References: Message-ID: <1505c74718c208ea9f30a3fce0f81dfc@localhost.localdomain> SjoerdMeijer updated this revision to Diff 223818. SjoerdMeijer added a comment. 
Added two more remarks CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68579/new/ https://reviews.llvm.org/D68579 Files: llvm/lib/CodeGen/HardwareLoops.cpp llvm/test/CodeGen/ARM/O3-pipeline.ll llvm/test/Transforms/HardwareLoops/ARM/structure.ll llvm/test/Transforms/HardwareLoops/unconditional-latch.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D68579.223818.patch Type: text/x-patch Size: 10713 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 04:05:20 2019 From: llvm-commits at lists.llvm.org (James Henderson via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 11:05:20 +0000 (UTC) Subject: [PATCH] D68210: Object/minidump: Add support for the MemoryInfoList stream In-Reply-To: References: Message-ID: <964769186a994751b38209ed8c01ef36@localhost.localdomain> jhenderson accepted this revision. jhenderson added a comment. This revision is now accepted and ready to land. Thanks, LGTM (modulo the Minidump-specific details). Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68210/new/ https://reviews.llvm.org/D68210 From llvm-commits at lists.llvm.org Tue Oct 8 04:07:25 2019 From: llvm-commits at lists.llvm.org (Momchil Velikov via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 11:07:25 +0000 (UTC) Subject: [PATCH] D68469: [AArch64] Ensure no tagged memory is left in the unallocated portion of the stack In-Reply-To: References: Message-ID: chill updated this revision to Diff 223821. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68469/new/ https://reviews.llvm.org/D68469 Files: llvm/lib/Target/AArch64/AArch64StackTagging.cpp llvm/test/CodeGen/AArch64/stack-tagging-ex-1.ll llvm/test/CodeGen/AArch64/stack-tagging-ex-2.ll llvm/test/CodeGen/AArch64/stack-tagging-untag-placement.ll -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68469.223821.patch Type: text/x-patch Size: 18182 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 04:14:33 2019 From: llvm-commits at lists.llvm.org (David Zarzycki via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 11:14:33 +0000 (UTC) Subject: [PATCH] D68632: [X86] Make memcmp() use PTEST if possible and also enable AVX1 Message-ID: davezarzycki created this revision. davezarzycki added a reviewer: craig.topper. davezarzycki added a project: LLVM. Hi Craig – I'm not an expert in this area. Feedback would be appreciated. In particular, the way PTEST was wired up. Thanks! Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D68632 Files: lib/Target/X86/X86ISelLowering.cpp test/CodeGen/X86/memcmp-minsize.ll test/CodeGen/X86/memcmp-optsize.ll test/CodeGen/X86/memcmp.ll test/CodeGen/X86/setcc-wide-types.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D68632.223820.patch Type: text/x-patch Size: 37151 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 04:14:34 2019 From: llvm-commits at lists.llvm.org (Chris Ye via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 11:14:34 +0000 (UTC) Subject: [PATCH] D68633: fix debug info affects output when opt inline Message-ID: yechunliang created this revision. yechunliang added reviewers: bjope, jmorse, vsk, probinson, jdoerfert, mtrofin. Herald added subscribers: llvm-commits, hiraditya, aprantl. Herald added a project: LLVM. Debug info affects the output of "opt -inline": InlineFunction cannot handle an llvm.dbg.value that sits between alloca instructions, so scanning the block of allocas gives the wrong result when such an instruction is present. Skip debug instructions while scanning the allocas.
Fix the issue: https://bugs.llvm.org/show_bug.cgi?id=43291 https://reviews.llvm.org/D68633 Files: llvm/lib/Transforms/Utils/InlineFunction.cpp llvm/test/Transforms/Inline/inline-with-debuginfo.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D68633.223819.patch Type: text/x-patch Size: 5878 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 04:33:11 2019 From: llvm-commits at lists.llvm.org (Wouter Vermaelen via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 11:33:11 +0000 (UTC) Subject: [PATCH] D68491: [PATCH 08/38] [noalias] [IR] IRBuilder support for noalias intrinsics. In-Reply-To: References: Message-ID: vermaelen.wouter added inline comments. ================ Comment at: llvm/include/llvm/IR/IntrinsicInst.h:891-896 + return (lhs->getOperand(Intrinsic::SideNoAliasScopeArg) == + rhs->getOperand(Intrinsic::SideNoAliasScopeArg)) && + (lhs->getOperand(Intrinsic::SideNoAliasIdentifyPObjIdArg) == + rhs->getOperand(Intrinsic::SideNoAliasIdentifyPObjIdArg)) && + (lhs->getOperand(Intrinsic::SideNoAliasIdentifyPArg) == + rhs->getOperand(Intrinsic::SideNoAliasIdentifyPArg)); ---------------- jeroen.dobbelaere wrote: > lebedev.ri wrote: > > `std::tie(<...>) == std::tie(<...>)` ? > You mean: std::forward_as_tuple ? > > Doesn't that loose the short-circuit behavior of operator&&? Detail: IMHO it doesn't really improve readability: (a == b) && (c == d) versus std::tie(a, c) == std::tie(b, d); CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68491/new/ https://reviews.llvm.org/D68491 From llvm-commits at lists.llvm.org Tue Oct 8 04:43:11 2019 From: llvm-commits at lists.llvm.org (=?utf-8?q?D=C3=A1vid_Bolvansk=C3=BD_via_Phabricator?= via llvm-commits) Date: Tue, 08 Oct 2019 11:43:11 +0000 (UTC) Subject: [PATCH] D68189: [InstCombine] recognize popcount implemented in hacker's delight. In-Reply-To: References: Message-ID: <9e0a71a70d924f1dbb4bbe147c14b3d0@localhost.localdomain> xbolva00 added a comment. 
Thanks, new patch looks great. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68189/new/ https://reviews.llvm.org/D68189 From llvm-commits at lists.llvm.org Tue Oct 8 04:54:43 2019 From: llvm-commits at lists.llvm.org (Florian Hahn via llvm-commits) Date: Tue, 08 Oct 2019 11:54:43 -0000 Subject: [llvm] r374036 - [LoopRotate] Unconditionally get DomTree. Message-ID: <20191008115443.218BE8EF88@lists.llvm.org> Author: fhahn Date: Tue Oct 8 04:54:42 2019 New Revision: 374036 URL: http://llvm.org/viewvc/llvm-project?rev=374036&view=rev Log: [LoopRotate] Unconditionally get DomTree. LoopRotate is a loop pass and the DomTree should always be available. Similar to a70c5261436322a53187d67b8bdc0445d0463a9a Modified: llvm/trunk/lib/Transforms/Scalar/LoopRotation.cpp Modified: llvm/trunk/lib/Transforms/Scalar/LoopRotation.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/LoopRotation.cpp?rev=374036&r1=374035&r2=374036&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Scalar/LoopRotation.cpp (original) +++ llvm/trunk/lib/Transforms/Scalar/LoopRotation.cpp Tue Oct 8 04:54:42 2019 @@ -94,8 +94,7 @@ public: auto *LI = &getAnalysis().getLoopInfo(); const auto *TTI = &getAnalysis().getTTI(F); auto *AC = &getAnalysis().getAssumptionCache(F); - auto *DTWP = getAnalysisIfAvailable(); - auto *DT = DTWP ? &DTWP->getDomTree() : nullptr; + auto &DT = getAnalysis().getDomTree(); auto &SE = getAnalysis().getSE(); const SimplifyQuery SQ = getBestSimplifyQuery(*this, F); Optional MSSAU; @@ -103,7 +102,7 @@ public: MemorySSA *MSSA = &getAnalysis().getMSSA(); MSSAU = MemorySSAUpdater(MSSA); } - return LoopRotation(L, LI, TTI, AC, DT, &SE, + return LoopRotation(L, LI, TTI, AC, &DT, &SE, MSSAU.hasValue() ? 
MSSAU.getPointer() : nullptr, SQ, false, MaxHeaderSize, false); } From llvm-commits at lists.llvm.org Tue Oct 8 05:01:26 2019 From: llvm-commits at lists.llvm.org (Jeremy Morse via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 12:01:26 +0000 (UTC) Subject: [PATCH] D68633: fix debug info affects output when opt inline In-Reply-To: References: Message-ID: <2aa28c3ba28758c92a92347e134591a9@localhost.localdomain> jmorse added a comment. Herald added a subscriber: ormris. Looks good, one concern about whether too many debug instructions may get moved written inline ================ Comment at: llvm/lib/Transforms/Utils/InlineFunction.cpp:1842-1843 + // Debuginfo (@llvm.dbg.value) will make different result, skip while allocas scanning + while (isa(I)) ++I; + ---------------- Is there a possibility of an unrelated debug instruction being skipped here, and becoming part of the slice moved by lines 1847-1857? Moving dbg.values of arguments to the start of the caller may create a debug use-before-def situation, there could be other problem scenarios too. Using a debug-instruction filtering iterator (like here [0]) might just do-the-right-thing, I don't know whether feeding one to splice would behave correctly though. [0] https://github.com/llvm/llvm-project/blob/fdaa74217420729140f1786ea037ac445a724c8e/llvm/lib/Transforms/Utils/SimplifyCFG.cpp#L2592 ================ Comment at: llvm/test/Transforms/Inline/inline-with-debuginfo.ll:4-5 +; +; The purpose of this test is to check that debug info doesn't influence +; inlining decisions. 
+ ---------------- The bug report number would be nice too ================ Comment at: llvm/test/Transforms/Inline/inline-with-debuginfo.ll:57-59 +attributes #0 = { nounwind readnone speculatable } +attributes #2 = { argmemonly nounwind } +attributes #3 = { nounwind readnone speculatable } ---------------- We generally delete function attributes if they're not necessary CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68633/new/ https://reviews.llvm.org/D68633 From llvm-commits at lists.llvm.org Tue Oct 8 05:19:38 2019 From: llvm-commits at lists.llvm.org (Alexander via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 12:19:38 +0000 (UTC) Subject: [PATCH] D68635: [AMDGPU] Come back patch for the 'Assign register class for cross block values according to the divergence.' Message-ID: alex-t created this revision. alex-t added a reviewer: rampitec. Herald added subscribers: hiraditya, t-tye, tpr, dstuttard, yaxunl, nhaehnle, wdng, jvesely, kzhuravl, arsenm, qcolombet, MatzeB. Herald added a project: LLVM. After https://reviews.llvm.org/D59990 submit several issues were discovered. Changes in common code were preserved but AMDGPU specific part was reverted to keep the backend working correctly. Discovered issues were addressed in the following commits: - https://reviews.llvm.org/D67662 - https://reviews.llvm.org/D67101 - https://reviews.llvm.org/D63953 - https://reviews.llvm.org/D63731 This change brings back AMDGPU specific changes. 
https://reviews.llvm.org/D68635 Files: llvm/lib/Target/AMDGPU/SIFixSGPRCopies.cpp llvm/lib/Target/AMDGPU/SIISelLowering.cpp llvm/lib/Target/AMDGPU/SIISelLowering.h llvm/lib/Target/AMDGPU/SIInstrInfo.cpp llvm/test/CodeGen/AMDGPU/atomicrmw-nand.ll llvm/test/CodeGen/AMDGPU/branch-relaxation.ll llvm/test/CodeGen/AMDGPU/branch-uniformity.ll llvm/test/CodeGen/AMDGPU/commute-shifts.ll llvm/test/CodeGen/AMDGPU/control-flow-fastregalloc.ll llvm/test/CodeGen/AMDGPU/copy-illegal-type.ll llvm/test/CodeGen/AMDGPU/cse-phi-incoming-val.ll llvm/test/CodeGen/AMDGPU/divergent-branch-uniform-condition.ll llvm/test/CodeGen/AMDGPU/extract_subvector_vec4_vec3.ll llvm/test/CodeGen/AMDGPU/fabs.ll llvm/test/CodeGen/AMDGPU/fdiv32-to-rcp-folding.ll llvm/test/CodeGen/AMDGPU/fmin_legacy.ll llvm/test/CodeGen/AMDGPU/fmul-2-combine-multi-use.ll llvm/test/CodeGen/AMDGPU/fneg-fabs.ll llvm/test/CodeGen/AMDGPU/fneg.ll llvm/test/CodeGen/AMDGPU/fsub.ll llvm/test/CodeGen/AMDGPU/i1-copy-from-loop.ll llvm/test/CodeGen/AMDGPU/i1-copy-phi-uniform-branch.ll llvm/test/CodeGen/AMDGPU/implicit-def-muse.ll llvm/test/CodeGen/AMDGPU/insert_vector_elt.ll llvm/test/CodeGen/AMDGPU/llvm.amdgcn.div.scale.ll llvm/test/CodeGen/AMDGPU/llvm.amdgcn.fmed3.ll llvm/test/CodeGen/AMDGPU/llvm.amdgcn.mov.dpp.ll llvm/test/CodeGen/AMDGPU/llvm.amdgcn.mqsad.pk.u16.u8.ll llvm/test/CodeGen/AMDGPU/llvm.amdgcn.qsad.pk.u16.u8.ll llvm/test/CodeGen/AMDGPU/loop_break.ll llvm/test/CodeGen/AMDGPU/madak.ll llvm/test/CodeGen/AMDGPU/multilevel-break.ll llvm/test/CodeGen/AMDGPU/select-opt.ll llvm/test/CodeGen/AMDGPU/sgpr-control-flow.ll llvm/test/CodeGen/AMDGPU/sgpr-copy.ll llvm/test/CodeGen/AMDGPU/si-annotate-cf.ll llvm/test/CodeGen/AMDGPU/si-fix-sgpr-copies.mir llvm/test/CodeGen/AMDGPU/smrd.ll llvm/test/CodeGen/AMDGPU/subreg-coalescer-undef-use.ll llvm/test/CodeGen/AMDGPU/uniform-loop-inside-nonuniform.ll llvm/test/CodeGen/AMDGPU/use-sgpr-multiple-times.ll llvm/test/CodeGen/AMDGPU/valu-i1.ll 
llvm/test/CodeGen/AMDGPU/vgpr-spill-emergency-stack-slot-compute.ll llvm/test/CodeGen/AMDGPU/wave32.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D68635.223827.patch Type: text/x-patch Size: 85399 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 05:28:45 2019 From: llvm-commits at lists.llvm.org (Chris Ye via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 12:28:45 +0000 (UTC) Subject: [PATCH] D68004: [InstCombine] Fix call guard difference with dbg In-Reply-To: References: Message-ID: <2fc086404b81b99684ac015c451e1a0d@localhost.localdomain> yechunliang updated this revision to Diff 223828. yechunliang added a comment. Added one more RUN command line with "-debugify-each" in existed call-guard.ll, needn't duplicate another same test case file. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68004/new/ https://reviews.llvm.org/D68004 Files: llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp llvm/test/Transforms/InstCombine/call-guard.ll Index: llvm/test/Transforms/InstCombine/call-guard.ll =================================================================== --- llvm/test/Transforms/InstCombine/call-guard.ll +++ llvm/test/Transforms/InstCombine/call-guard.ll @@ -1,4 +1,5 @@ ; RUN: opt < %s -instcombine -S | FileCheck %s +; RUN: opt < %s -instcombine -S -debugify-each | FileCheck %s declare void @llvm.experimental.guard(i1, ...) Index: llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp =================================================================== --- llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp +++ llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp @@ -4018,12 +4018,12 @@ // Is this guard followed by another guard? We scan forward over a small // fixed window of instructions to handle common cases with conditions // computed between guards. 
- Instruction *NextInst = II->getNextNode(); + Instruction *NextInst = II->getNextNonDebugInstruction(); for (unsigned i = 0; i < GuardWideningWindow; i++) { // Note: Using context-free form to avoid compile time blow up if (!isSafeToSpeculativelyExecute(NextInst)) break; - NextInst = NextInst->getNextNode(); + NextInst = NextInst->getNextNonDebugInstruction(); } Value *NextCond = nullptr; if (match(NextInst, @@ -4035,10 +4035,10 @@ return eraseInstFromFunction(*NextInst); // Otherwise canonicalize guard(a); guard(b) -> guard(a & b). - Instruction* MoveI = II->getNextNode(); + Instruction* MoveI = II->getNextNonDebugInstruction(); while (MoveI != NextInst) { auto *Temp = MoveI; - MoveI = MoveI->getNextNode(); + MoveI = MoveI->getNextNonDebugInstruction(); Temp->moveBefore(II); } II->setArgOperand(0, Builder.CreateAnd(CurrCond, NextCond)); -------------- next part -------------- A non-text attachment was scrubbed... Name: D68004.223828.patch Type: text/x-patch Size: 1848 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 05:28:46 2019 From: llvm-commits at lists.llvm.org (Chris Ye via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 12:28:46 +0000 (UTC) Subject: [PATCH] D68004: [InstCombine] Fix call guard difference with dbg In-Reply-To: References: Message-ID: <1cf83d1fe4603d15955b7eca979990dc@localhost.localdomain> yechunliang added a comment. just added updated test, could some one help to review again? thanks. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68004/new/ https://reviews.llvm.org/D68004 From llvm-commits at lists.llvm.org Tue Oct 8 00:20:30 2019 From: llvm-commits at lists.llvm.org (Shiva Chen via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 07:20:30 +0000 (UTC) Subject: [PATCH] D68559: [RISCV] Support fast calling convention In-Reply-To: References: Message-ID: <2ffc8605b0435f8b52510bfcf6deffa6@localhost.localdomain> shiva0217 updated this revision to Diff 223791. shiva0217 added a comment. 
Update patch to address the comment Hi @hiraditya, thanks for the comment. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68559/new/ https://reviews.llvm.org/D68559 Files: lib/Target/RISCV/CMakeLists.txt lib/Target/RISCV/RISCVCallingConv.td lib/Target/RISCV/RISCVISelLowering.cpp test/CodeGen/RISCV/fastcc-float.ll test/CodeGen/RISCV/fastcc-int.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D68559.223791.patch Type: text/x-patch Size: 8683 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 02:46:16 2019 From: llvm-commits at lists.llvm.org (Nikola Prica via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 09:46:16 +0000 (UTC) Subject: [PATCH] D66953: [ISEL][ARM][AARCH64] Tracking simple parameter forwarding registers In-Reply-To: References: Message-ID: This revision was automatically updated to reflect the committed changes. Closed by commit rG02682498b86a: [ISEL][ARM][AARCH64] Tracking simple parameter forwarding registers (authored by NikolaPrica). Herald added a subscriber: hiraditya. Herald added a project: LLVM. Changed prior to commit: https://reviews.llvm.org/D66953?vs=220670&id=223810#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D66953/new/ https://reviews.llvm.org/D66953 Files: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp llvm/lib/Target/ARM/ARMExpandPseudoInsts.cpp llvm/lib/Target/ARM/ARMISelLowering.cpp llvm/test/DebugInfo/AArch64/call-site-info-output.ll llvm/test/DebugInfo/ARM/call-site-info-output.ll -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D66953.223810.patch Type: text/x-patch Size: 7953 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 03:49:18 2019 From: llvm-commits at lists.llvm.org (=?utf-8?q?Lu=C3=ADs_Marques_via_Phabricator?= via llvm-commits) Date: Tue, 08 Oct 2019 10:49:18 +0000 (UTC) Subject: [PATCH] D67046: [RISCV] Add InstrInfo areMemAccessesTriviallyDisjoint hook In-Reply-To: References: Message-ID: <7a117ccb67f80605c86949eeef0c7529@localhost.localdomain> luismarques updated this revision to Diff 223817. luismarques added a comment. Fix `RUN` lines and add `REQUIRES: asserts`. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67046/new/ https://reviews.llvm.org/D67046 Files: llvm/lib/Target/RISCV/RISCVInstrInfo.cpp llvm/lib/Target/RISCV/RISCVInstrInfo.h llvm/lib/Target/RISCV/RISCVSubtarget.cpp llvm/test/CodeGen/RISCV/disjoint.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D67046.223817.patch Type: text/x-patch Size: 5916 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 04:33:11 2019 From: llvm-commits at lists.llvm.org (Tim Gymnich via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 11:33:11 +0000 (UTC) Subject: [PATCH] D68360: PR41162 Implement LKK remainder and divisibility algorithms [urem] In-Reply-To: References: Message-ID: <513cfdd833626dd11c139030193d922e@localhost.localdomain> TG908 updated this revision to Diff 223824. 
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68360/new/ https://reviews.llvm.org/D68360 Files: llvm/include/llvm/CodeGen/TargetLowering.h llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp llvm/test/CodeGen/AArch64/urem-lkk.ll llvm/test/CodeGen/AArch64/urem-vector-lkk.ll llvm/test/CodeGen/ARM/urem-opt-size.ll llvm/test/CodeGen/PowerPC/urem-vector-lkk.ll llvm/test/CodeGen/X86/load-scalar-as-vector.ll llvm/test/CodeGen/X86/urem-i8-constant.ll llvm/test/CodeGen/X86/urem-lkk.ll llvm/test/CodeGen/X86/urem-vector-lkk.ll llvm/test/CodeGen/X86/vector-idiv-udiv-128.ll llvm/test/CodeGen/X86/vector-idiv-udiv-256.ll llvm/test/CodeGen/X86/vector-idiv-udiv-512.ll llvm/test/CodeGen/X86/vector-intrinsics.ll llvm/test/CodeGen/X86/vector-rem.ll llvm/test/CodeGen/X86/vector-truncate-combine.ll llvm/test/CodeGen/X86/vector-variable-idx.ll llvm/test/CodeGen/X86/vector-variable-idx2.ll llvm/test/CodeGen/X86/vector-width-store-merge.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D68360.223824.patch Type: text/x-patch Size: 117502 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 04:33:15 2019 From: llvm-commits at lists.llvm.org (Tim Gymnich via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 11:33:15 +0000 (UTC) Subject: [PATCH] D68360: PR41162 Implement LKK remainder and divisibility algorithms [urem] In-Reply-To: References: Message-ID: <22c329ed59168d30d42cb5e925631688@localhost.localdomain> TG908 added a comment. ARM performance could still be good in loops once the constant has been loaded. Need to test that.
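For context on "the constant" mentioned above, here is a minimal scalar sketch of the Lemire-Kaser-Kurz remainder and divisibility checks for 32-bit unsigned operands. It is not the code from D68360; the names `computeM`, `fastMod`, and `isDivisible` are made up for this note, and `__uint128_t` is a GCC/Clang extension that is assumed to be available.

```cpp
#include <cassert>
#include <cstdint>

// Precompute the magic constant M = floor((2^64 - 1) / d) + 1 for a
// non-zero 32-bit divisor d. For d == 1 this wraps to 0, which still
// produces correct results in the two functions below.
inline uint64_t computeM(uint32_t d) {
  return UINT64_C(0xFFFFFFFFFFFFFFFF) / d + 1;
}

// n % d, computed with two multiplications and no division.
inline uint32_t fastMod(uint32_t n, uint64_t M, uint32_t d) {
  uint64_t lowbits = M * n; // wraps modulo 2^64 on purpose
  return static_cast<uint32_t>(
      (static_cast<__uint128_t>(lowbits) * d) >> 64);
}

// True when d divides n, where M == computeM(d).
inline bool isDivisible(uint32_t n, uint64_t M) {
  return n * M <= M - 1;
}
```

In a loop with a fixed divisor, `computeM` would be hoisted out once, so only the multiplications remain per iteration. Whether that is a win on a given target is exactly what would need measuring.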
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68360/new/ https://reviews.llvm.org/D68360 From llvm-commits at lists.llvm.org Tue Oct 8 04:52:13 2019 From: llvm-commits at lists.llvm.org (Dávid Bolvanský via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 11:52:13 +0000 (UTC) Subject: [PATCH] D68360: PR41162 Implement LKK remainder and divisibility algorithms [urem] In-Reply-To: References: Message-ID: <1ee01e0c663d1eb0f46c751a0a7dd46b@localhost.localdomain> xbolva00 added a comment. Here you can find benchmarks if you want to measure this new code a bit. https://lemire.me/blog/2019/02/08/faster-remainders-when-the-divisor-is-a-constant-beating-compilers-and-libdivide/ CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68360/new/ https://reviews.llvm.org/D68360 From llvm-commits at lists.llvm.org Tue Oct 8 05:37:40 2019 From: llvm-commits at lists.llvm.org (Pavel Labath via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 12:37:40 +0000 (UTC) Subject: [PATCH] D68620: DebugInfo: Use base address selection entries for debug_loc In-Reply-To: References: Message-ID: <5ca0281da1d838c7de27672fbd0d6a46@localhost.localdomain> labath added a comment. LLDB seems to have support for base address selection in v4 debug_loc. It does not have support for the v5 LLE_base_address(x) stuff, but the whole of v5 location list support is kind of wonky, which is also why I am looking at getting it to use the llvm version of the parser. As for llvm-dwarfdump, feel free to add new encodings there. My plan is to add support for all LLE encodings, but since I also need to figure out a way to refactor all of that stuff, it may take a while before I get to that. Having one or two new encodings appear in the meantime should only be a minor nuisance.
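As a rough illustration of what "base address selection in v4 debug_loc" means for a consumer, the sketch below resolves the address pairs of a DWARF v4 location list. It is an assumption-laden simplification, not LLDB or llvm-dwarfdump code: `RawEntry`, `Range`, and `resolveRanges` are hypothetical names, addresses are taken to be 64 bits wide, and the location expression bytes that follow each normal entry are left out entirely.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// One already-parsed address pair from a v4 location list
// (location expression omitted for brevity).
struct RawEntry {
  uint64_t First;
  uint64_t Second;
};

// A resolved, absolute address range.
struct Range {
  uint64_t Low;
  uint64_t High;
};

std::vector<Range> resolveRanges(const std::vector<RawEntry> &Entries,
                                 uint64_t CUBase) {
  const uint64_t MaxAddr = UINT64_MAX; // largest representable address
  uint64_t Base = CUBase;              // the compile unit base applies first
  std::vector<Range> Out;
  for (const RawEntry &E : Entries) {
    if (E.First == 0 && E.Second == 0)
      break; // end-of-list entry
    if (E.First == MaxAddr) {
      Base = E.Second; // base address selection entry
      continue;
    }
    // Normal entry: both values are offsets from the current base.
    Out.push_back({Base + E.First, Base + E.Second});
  }
  return Out;
}
```

Entries before the first selection entry are relative to the compile unit base; after it, they are relative to the newly selected base until another selection entry or the end of the list.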
Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68620/new/ https://reviews.llvm.org/D68620 From llvm-commits at lists.llvm.org Tue Oct 8 05:43:46 2019 From: llvm-commits at lists.llvm.org (Sanjay Patel via llvm-commits) Date: Tue, 08 Oct 2019 12:43:46 -0000 Subject: [llvm] r374039 - [SLP] add test with prefer-vector-width function attribute; NFC Message-ID: <20191008124346.DF48F8841D@lists.llvm.org> Author: spatel Date: Tue Oct 8 05:43:46 2019 New Revision: 374039 URL: http://llvm.org/viewvc/llvm-project?rev=374039&view=rev Log: [SLP] add test with prefer-vector-width function attribute; NFC Modified: llvm/trunk/test/Transforms/SLPVectorizer/X86/pr19657.ll Modified: llvm/trunk/test/Transforms/SLPVectorizer/X86/pr19657.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/SLPVectorizer/X86/pr19657.ll?rev=374039&r1=374038&r2=374039&view=diff ============================================================================== --- llvm/trunk/test/Transforms/SLPVectorizer/X86/pr19657.ll (original) +++ llvm/trunk/test/Transforms/SLPVectorizer/X86/pr19657.ll Tue Oct 8 05:43:46 2019 @@ -1,40 +1,40 @@ ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py -; RUN: opt < %s -basicaa -slp-vectorizer -S -mcpu=corei7-avx | FileCheck %s -; RUN: opt < %s -basicaa -slp-vectorizer -slp-max-reg-size=128 -S -mcpu=corei7-avx | FileCheck %s --check-prefix=V128 +; RUN: opt < %s -basicaa -slp-vectorizer -S -mcpu=corei7-avx | FileCheck %s --check-prefixes=ANY,AVX +; RUN: opt < %s -basicaa -slp-vectorizer -slp-max-reg-size=128 -S -mcpu=corei7-avx | FileCheck %s --check-prefixes=ANY,MAX128 target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128" target triple = "x86_64-unknown-linux-gnu" -define void @foo(double* %x) { -; CHECK-LABEL: @foo( -; CHECK-NEXT: [[TMP1:%.*]] = getelementptr inbounds double, double* [[X:%.*]], i64 1 -; CHECK-NEXT: [[TMP2:%.*]] = getelementptr inbounds double, double* [[X]], i64 2 -; CHECK-NEXT: [[TMP3:%.*]] = 
getelementptr inbounds double, double* [[X]], i64 3 -; CHECK-NEXT: [[TMP4:%.*]] = bitcast double* [[X]] to <4 x double>* -; CHECK-NEXT: [[TMP5:%.*]] = load <4 x double>, <4 x double>* [[TMP4]], align 8 -; CHECK-NEXT: [[TMP6:%.*]] = fadd <4 x double> [[TMP5]], [[TMP5]] -; CHECK-NEXT: [[TMP7:%.*]] = fadd <4 x double> [[TMP6]], [[TMP5]] -; CHECK-NEXT: [[TMP8:%.*]] = bitcast double* [[X]] to <4 x double>* -; CHECK-NEXT: store <4 x double> [[TMP7]], <4 x double>* [[TMP8]], align 8 -; CHECK-NEXT: ret void +define void @store_chains(double* %x) { +; AVX-LABEL: @store_chains( +; AVX-NEXT: [[TMP1:%.*]] = getelementptr inbounds double, double* [[X:%.*]], i64 1 +; AVX-NEXT: [[TMP2:%.*]] = getelementptr inbounds double, double* [[X]], i64 2 +; AVX-NEXT: [[TMP3:%.*]] = getelementptr inbounds double, double* [[X]], i64 3 +; AVX-NEXT: [[TMP4:%.*]] = bitcast double* [[X]] to <4 x double>* +; AVX-NEXT: [[TMP5:%.*]] = load <4 x double>, <4 x double>* [[TMP4]], align 8 +; AVX-NEXT: [[TMP6:%.*]] = fadd <4 x double> [[TMP5]], [[TMP5]] +; AVX-NEXT: [[TMP7:%.*]] = fadd <4 x double> [[TMP6]], [[TMP5]] +; AVX-NEXT: [[TMP8:%.*]] = bitcast double* [[X]] to <4 x double>* +; AVX-NEXT: store <4 x double> [[TMP7]], <4 x double>* [[TMP8]], align 8 +; AVX-NEXT: ret void ; -; V128-LABEL: @foo( -; V128-NEXT: [[TMP1:%.*]] = getelementptr inbounds double, double* [[X:%.*]], i64 1 -; V128-NEXT: [[TMP2:%.*]] = bitcast double* [[X]] to <2 x double>* -; V128-NEXT: [[TMP3:%.*]] = load <2 x double>, <2 x double>* [[TMP2]], align 8 -; V128-NEXT: [[TMP4:%.*]] = fadd <2 x double> [[TMP3]], [[TMP3]] -; V128-NEXT: [[TMP5:%.*]] = fadd <2 x double> [[TMP4]], [[TMP3]] -; V128-NEXT: [[TMP6:%.*]] = bitcast double* [[X]] to <2 x double>* -; V128-NEXT: store <2 x double> [[TMP5]], <2 x double>* [[TMP6]], align 8 -; V128-NEXT: [[TMP7:%.*]] = getelementptr inbounds double, double* [[X]], i64 2 -; V128-NEXT: [[TMP8:%.*]] = getelementptr inbounds double, double* [[X]], i64 3 -; V128-NEXT: [[TMP9:%.*]] = bitcast double* 
[[TMP7]] to <2 x double>* -; V128-NEXT: [[TMP10:%.*]] = load <2 x double>, <2 x double>* [[TMP9]], align 8 -; V128-NEXT: [[TMP11:%.*]] = fadd <2 x double> [[TMP10]], [[TMP10]] -; V128-NEXT: [[TMP12:%.*]] = fadd <2 x double> [[TMP11]], [[TMP10]] -; V128-NEXT: [[TMP13:%.*]] = bitcast double* [[TMP7]] to <2 x double>* -; V128-NEXT: store <2 x double> [[TMP12]], <2 x double>* [[TMP13]], align 8 -; V128-NEXT: ret void +; MAX128-LABEL: @store_chains( +; MAX128-NEXT: [[TMP1:%.*]] = getelementptr inbounds double, double* [[X:%.*]], i64 1 +; MAX128-NEXT: [[TMP2:%.*]] = bitcast double* [[X]] to <2 x double>* +; MAX128-NEXT: [[TMP3:%.*]] = load <2 x double>, <2 x double>* [[TMP2]], align 8 +; MAX128-NEXT: [[TMP4:%.*]] = fadd <2 x double> [[TMP3]], [[TMP3]] +; MAX128-NEXT: [[TMP5:%.*]] = fadd <2 x double> [[TMP4]], [[TMP3]] +; MAX128-NEXT: [[TMP6:%.*]] = bitcast double* [[X]] to <2 x double>* +; MAX128-NEXT: store <2 x double> [[TMP5]], <2 x double>* [[TMP6]], align 8 +; MAX128-NEXT: [[TMP7:%.*]] = getelementptr inbounds double, double* [[X]], i64 2 +; MAX128-NEXT: [[TMP8:%.*]] = getelementptr inbounds double, double* [[X]], i64 3 +; MAX128-NEXT: [[TMP9:%.*]] = bitcast double* [[TMP7]] to <2 x double>* +; MAX128-NEXT: [[TMP10:%.*]] = load <2 x double>, <2 x double>* [[TMP9]], align 8 +; MAX128-NEXT: [[TMP11:%.*]] = fadd <2 x double> [[TMP10]], [[TMP10]] +; MAX128-NEXT: [[TMP12:%.*]] = fadd <2 x double> [[TMP11]], [[TMP10]] +; MAX128-NEXT: [[TMP13:%.*]] = bitcast double* [[TMP7]] to <2 x double>* +; MAX128-NEXT: store <2 x double> [[TMP12]], <2 x double>* [[TMP13]], align 8 +; MAX128-NEXT: ret void ; %1 = load double, double* %x, align 8 %2 = fadd double %1, %1 @@ -58,3 +58,45 @@ define void @foo(double* %x) { ret void } +define void @store_chains_prefer_width_attr(double* %x) #0 { +; ANY-LABEL: @store_chains_prefer_width_attr( +; ANY-NEXT: [[TMP1:%.*]] = getelementptr inbounds double, double* [[X:%.*]], i64 1 +; ANY-NEXT: [[TMP2:%.*]] = bitcast double* [[X]] to <2 x double>* 
+; ANY-NEXT: [[TMP3:%.*]] = load <2 x double>, <2 x double>* [[TMP2]], align 8 +; ANY-NEXT: [[TMP4:%.*]] = fadd <2 x double> [[TMP3]], [[TMP3]] +; ANY-NEXT: [[TMP5:%.*]] = fadd <2 x double> [[TMP4]], [[TMP3]] +; ANY-NEXT: [[TMP6:%.*]] = bitcast double* [[X]] to <2 x double>* +; ANY-NEXT: store <2 x double> [[TMP5]], <2 x double>* [[TMP6]], align 8 +; ANY-NEXT: [[TMP7:%.*]] = getelementptr inbounds double, double* [[X]], i64 2 +; ANY-NEXT: [[TMP8:%.*]] = getelementptr inbounds double, double* [[X]], i64 3 +; ANY-NEXT: [[TMP9:%.*]] = bitcast double* [[TMP7]] to <2 x double>* +; ANY-NEXT: [[TMP10:%.*]] = load <2 x double>, <2 x double>* [[TMP9]], align 8 +; ANY-NEXT: [[TMP11:%.*]] = fadd <2 x double> [[TMP10]], [[TMP10]] +; ANY-NEXT: [[TMP12:%.*]] = fadd <2 x double> [[TMP11]], [[TMP10]] +; ANY-NEXT: [[TMP13:%.*]] = bitcast double* [[TMP7]] to <2 x double>* +; ANY-NEXT: store <2 x double> [[TMP12]], <2 x double>* [[TMP13]], align 8 +; ANY-NEXT: ret void +; + %1 = load double, double* %x, align 8 + %2 = fadd double %1, %1 + %3 = fadd double %2, %1 + store double %3, double* %x, align 8 + %4 = getelementptr inbounds double, double* %x, i64 1 + %5 = load double, double* %4, align 8 + %6 = fadd double %5, %5 + %7 = fadd double %6, %5 + store double %7, double* %4, align 8 + %8 = getelementptr inbounds double, double* %x, i64 2 + %9 = load double, double* %8, align 8 + %10 = fadd double %9, %9 + %11 = fadd double %10, %9 + store double %11, double* %8, align 8 + %12 = getelementptr inbounds double, double* %x, i64 3 + %13 = load double, double* %12, align 8 + %14 = fadd double %13, %13 + %15 = fadd double %14, %13 + store double %15, double* %12, align 8 + ret void +} + +attributes #0 = { "prefer-vector-width"="128" } From llvm-commits at lists.llvm.org Tue Oct 8 05:46:20 2019 From: llvm-commits at lists.llvm.org (Nicolai Haehnle via llvm-commits) Date: Tue, 08 Oct 2019 12:46:20 -0000 Subject: [llvm] r374040 - MachineSSAUpdater: insert IMPLICIT_DEF at top of basic block 
Message-ID: <20191008124620.6293E80898@lists.llvm.org> Author: nha Date: Tue Oct 8 05:46:20 2019 New Revision: 374040 URL: http://llvm.org/viewvc/llvm-project?rev=374040&view=rev Log: MachineSSAUpdater: insert IMPLICIT_DEF at top of basic block Summary: When getValueInMiddleOfBlock happens to be called for a basic block that has no incoming value at all, an IMPLICIT_DEF is inserted in that block via GetValueAtEndOfBlockInternal. This IMPLICIT_DEF must be at the top of its basic block or it will likely not reach the use that the caller intends to insert. Issue: https://github.com/GPUOpen-Drivers/llpc/issues/204 Reviewers: arsenm, rampitec Subscribers: jvesely, wdng, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68183 Added: llvm/trunk/test/CodeGen/AMDGPU/si-i1-copies.mir Modified: llvm/trunk/lib/CodeGen/MachineSSAUpdater.cpp Modified: llvm/trunk/lib/CodeGen/MachineSSAUpdater.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/MachineSSAUpdater.cpp?rev=374040&r1=374039&r2=374040&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/MachineSSAUpdater.cpp (original) +++ llvm/trunk/lib/CodeGen/MachineSSAUpdater.cpp Tue Oct 8 05:46:20 2019 @@ -292,7 +292,7 @@ public: MachineSSAUpdater *Updater) { // Insert an implicit_def to represent an undef value. 
MachineInstr *NewDef = InsertNewDef(TargetOpcode::IMPLICIT_DEF, - BB, BB->getFirstTerminator(), + BB, BB->getFirstNonPHI(), Updater->VRC, Updater->MRI, Updater->TII); return NewDef->getOperand(0).getReg(); Added: llvm/trunk/test/CodeGen/AMDGPU/si-i1-copies.mir URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/si-i1-copies.mir?rev=374040&view=auto ============================================================================== --- llvm/trunk/test/CodeGen/AMDGPU/si-i1-copies.mir (added) +++ llvm/trunk/test/CodeGen/AMDGPU/si-i1-copies.mir Tue Oct 8 05:46:20 2019 @@ -0,0 +1,28 @@ +# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py +# RUN: llc -march=amdgcn -run-pass=si-i1-copies -verify-machineinstrs %s -o - | FileCheck -check-prefixes=GCN %s + +# Test that the new IMPLICIT_DEF is inserted in the correct location. +--- +name: test_undef +tracksRegLiveness: true +body: | + ; GCN-LABEL: name: test_undef + ; GCN: bb.0: + ; GCN: successors: %bb.1(0x80000000) + ; GCN: S_BRANCH %bb.1 + ; GCN: bb.1: + ; GCN: [[DEF:%[0-9]+]]:sreg_64 = IMPLICIT_DEF + ; GCN: [[COPY:%[0-9]+]]:sreg_64_xexec = COPY [[DEF]] + ; GCN: [[V_CNDMASK_B32_e64_:%[0-9]+]]:vgpr_32 = V_CNDMASK_B32_e64 0, 0, 0, 1, [[COPY]], implicit $exec + bb.0: + successors: %bb.1 + + %0:vreg_1 = IMPLICIT_DEF + S_BRANCH %bb.1 + + bb.1: + %1:vreg_1 = PHI %0, %bb.0 + %2:sreg_64_xexec = COPY %1 + %3:vgpr_32 = V_CNDMASK_B32_e64 0, 0, 0, 1, %2, implicit $exec + +... 
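
The reasoning in the r374040 log (an IMPLICIT_DEF placed before the first terminator may come after the use it is meant to feed, while one placed at the first non-PHI position comes before it) can be illustrated with a small standalone sketch. This is a toy model under stated assumptions, not LLVM code: `Inst`, `Block`, `firstNonPhi`, `firstTerminator` and `defPrecedesUse` are hypothetical names that only mimic the ordering behaviour of `getFirstNonPHI()` and `getFirstTerminator()`.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy model of a basic block (hypothetical, not the LLVM classes):
// PHIs come first, then ordinary instructions, then terminators.
enum class Kind { Phi, Normal, Terminator };

struct Inst {
  Kind K;
  std::string Name;
};

using Block = std::vector<Inst>;

// Index of the first instruction that is not a PHI (block size if none).
inline std::size_t firstNonPhi(const Block &B) {
  std::size_t I = 0;
  while (I < B.size() && B[I].K == Kind::Phi)
    ++I;
  return I;
}

// Index of the first terminator (block size if none).
inline std::size_t firstTerminator(const Block &B) {
  std::size_t I = 0;
  while (I < B.size() && B[I].K != Kind::Terminator)
    ++I;
  return I;
}

// Insert a definition at Pos in a copy of the block and report whether it
// ends up before the instruction named Use, i.e. whether it can reach it.
inline bool defPrecedesUse(Block B, std::size_t Pos, const std::string &Use) {
  B.insert(B.begin() + static_cast<std::ptrdiff_t>(Pos),
           Inst{Kind::Normal, "IMPLICIT_DEF"});
  for (const Inst &I : B) {
    if (I.Name == "IMPLICIT_DEF")
      return true;
    if (I.Name == Use)
      return false;
  }
  return false;
}
```

For a block laid out as PHI, COPY, S_BRANCH, inserting at `firstNonPhi` puts the definition ahead of the COPY, whereas inserting at `firstTerminator` puts it after the COPY, which is the situation the commit fixes.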
From llvm-commits at lists.llvm.org Tue Oct 8 05:46:32 2019 From: llvm-commits at lists.llvm.org (Nicolai Haehnle via llvm-commits) Date: Tue, 08 Oct 2019 12:46:32 -0000 Subject: [llvm] r374041 - AMDGPU: Propagate undef flag during pre-RA exec mask optimizations Message-ID: <20191008124632.57B018F04A@lists.llvm.org> Author: nha Date: Tue Oct 8 05:46:32 2019 New Revision: 374041 URL: http://llvm.org/viewvc/llvm-project?rev=374041&view=rev Log: AMDGPU: Propagate undef flag during pre-RA exec mask optimizations Summary: Issue: https://github.com/GPUOpen-Drivers/llpc/issues/204 Reviewers: arsenm, rampitec Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68184 Modified: llvm/trunk/lib/Target/AMDGPU/SIOptimizeExecMaskingPreRA.cpp llvm/trunk/test/CodeGen/AMDGPU/optimize-exec-masking-pre-ra.mir Modified: llvm/trunk/lib/Target/AMDGPU/SIOptimizeExecMaskingPreRA.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/SIOptimizeExecMaskingPreRA.cpp?rev=374041&r1=374040&r2=374041&view=diff ============================================================================== --- llvm/trunk/lib/Target/AMDGPU/SIOptimizeExecMaskingPreRA.cpp (original) +++ llvm/trunk/lib/Target/AMDGPU/SIOptimizeExecMaskingPreRA.cpp Tue Oct 8 05:46:32 2019 @@ -250,15 +250,16 @@ static unsigned optimizeVcndVcmpPair(Mac Op1->getImm() != 0 || Op2->getImm() != 1) return AMDGPU::NoRegister; - LLVM_DEBUG(dbgs() << "Folding sequence:\n\t" << *Sel << '\t' - << *Cmp << '\t' << *And); + LLVM_DEBUG(dbgs() << "Folding sequence:\n\t" << *Sel << '\t' << *Cmp << '\t' + << *And); Register CCReg = CC->getReg(); LIS->RemoveMachineInstrFromMaps(*And); - MachineInstr *Andn2 = BuildMI(MBB, *And, And->getDebugLoc(), - TII->get(Andn2Opc), And->getOperand(0).getReg()) - .addReg(ExecReg) - .addReg(CCReg, 0, CC->getSubReg()); + MachineInstr *Andn2 = + BuildMI(MBB, *And, And->getDebugLoc(), TII->get(Andn2Opc), + 
And->getOperand(0).getReg()) + .addReg(ExecReg) + .addReg(CCReg, getUndefRegState(CC->isUndef()), CC->getSubReg()); And->eraseFromParent(); LIS->InsertMachineInstrInMaps(*Andn2); Modified: llvm/trunk/test/CodeGen/AMDGPU/optimize-exec-masking-pre-ra.mir URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/optimize-exec-masking-pre-ra.mir?rev=374041&r1=374040&r2=374041&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/AMDGPU/optimize-exec-masking-pre-ra.mir (original) +++ llvm/trunk/test/CodeGen/AMDGPU/optimize-exec-masking-pre-ra.mir Tue Oct 8 05:46:32 2019 @@ -1,5 +1,5 @@ # NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py -# RUN: llc -mtriple=amdgcn-mesa-mesa3d -run-pass=si-optimize-exec-masking-pre-ra %s -o - | FileCheck -check-prefix=GCN %s +# RUN: llc -mtriple=amdgcn-mesa-mesa3d -run-pass=si-optimize-exec-masking-pre-ra -verify-machineinstrs %s -o - | FileCheck -check-prefix=GCN %s # Check for regression from assuming an instruction was a copy after # dropping the opcode check. @@ -95,3 +95,26 @@ body: | $exec = S_OR_B64 $exec, %7, implicit-def $scc ... + +# When folding a v_cndmask and a v_cmp in a pattern leading to +# s_cbranch_vccz, ensure that an undef operand is handled correctly. +--- +name: cndmask_cmp_cbranch_fold_undef +tracksRegLiveness: true +body: | + ; GCN-LABEL: name: cndmask_cmp_cbranch_fold_undef + ; GCN: bb.0: + ; GCN: successors: %bb.1(0x80000000) + ; GCN: $vcc = S_ANDN2_B64 $exec, undef %1:sreg_64_xexec, implicit-def $scc + ; GCN: S_CBRANCH_VCCZ %bb.1, implicit $vcc + ; GCN: bb.1: + bb.0: + + %1:vgpr_32 = V_CNDMASK_B32_e64 0, 0, 0, 1, undef %0:sreg_64_xexec, implicit $exec + V_CMP_NE_U32_e32 1, %1, implicit-def $vcc, implicit $exec + $vcc = S_AND_B64 $exec, $vcc, implicit-def dead $scc + S_CBRANCH_VCCZ %bb.1, implicit $vcc + + bb.1: + +... 
From llvm-commits at lists.llvm.org Tue Oct 8 05:46:48 2019
From: llvm-commits at lists.llvm.org (Bjorn Pettersson via Phabricator via llvm-commits)
Date: Tue, 08 Oct 2019 12:46:48 +0000 (UTC)
Subject: [PATCH] D68633: fix debug info affects output when opt inline
In-Reply-To:
References:
Message-ID:

bjope added a comment.

I do not understand how this helps. The code is written in a way that it skips any instruction, but moves contiguous blocks of allocas in one splice (not sure exactly why, is that really faster?). Maybe the difference is that the check for AI->useEmpty() is only done for the first alloca in a sequence of alloca instructions?

Or can't we just remove the loop at line 1847 (only moving one alloca at a time)?

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68633/new/

https://reviews.llvm.org/D68633

From llvm-commits at lists.llvm.org Tue Oct 8 05:46:49 2019
From: llvm-commits at lists.llvm.org (Nicolai Hähnle via Phabricator via llvm-commits)
Date: Tue, 08 Oct 2019 12:46:49 +0000 (UTC)
Subject: [PATCH] D68309: GlobalISel: Implement widenScalar for G_INSERT_VECTOR_ELT
In-Reply-To:
References:
Message-ID:

nhaehnle added a comment.

For the vector itself and the inserted element, shouldn't this be using G_ANYEXT instead? Looking at the corresponding test, the G_SHL/G_ASHR on $vgpr0 should be unnecessary based on the original code.

For the element index, shouldn't this be using G_ZEXT or G_ANYEXT? I don't recall whether `insertelement` is supposed to interpret this argument as a signed or unsigned integer, and couldn't quickly find a reference to it. Not that it usually matters, but I find G_SEXT here surprising as well.

CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68309/new/ https://reviews.llvm.org/D68309 From llvm-commits at lists.llvm.org Tue Oct 8 05:47:11 2019 From: llvm-commits at lists.llvm.org (=?utf-8?q?Nicolai_H=C3=A4hnle_via_Phabricator?= via llvm-commits) Date: Tue, 08 Oct 2019 12:47:11 +0000 (UTC) Subject: [PATCH] D68183: MachineSSAUpdater: insert IMPLICIT_DEF at top of basic block In-Reply-To: References: Message-ID: <5bec3be82ae5c1418b2e745e053b16f4@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rG7febdb7f27df: MachineSSAUpdater: insert IMPLICIT_DEF at top of basic block (authored by nhaehnle). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68183/new/ https://reviews.llvm.org/D68183 Files: llvm/lib/CodeGen/MachineSSAUpdater.cpp llvm/test/CodeGen/AMDGPU/si-i1-copies.mir Index: llvm/test/CodeGen/AMDGPU/si-i1-copies.mir =================================================================== --- /dev/null +++ llvm/test/CodeGen/AMDGPU/si-i1-copies.mir @@ -0,0 +1,28 @@ +# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py +# RUN: llc -march=amdgcn -run-pass=si-i1-copies -verify-machineinstrs %s -o - | FileCheck -check-prefixes=GCN %s + +# Test that the new IMPLICIT_DEF is inserted in the correct location. +--- +name: test_undef +tracksRegLiveness: true +body: | + ; GCN-LABEL: name: test_undef + ; GCN: bb.0: + ; GCN: successors: %bb.1(0x80000000) + ; GCN: S_BRANCH %bb.1 + ; GCN: bb.1: + ; GCN: [[DEF:%[0-9]+]]:sreg_64 = IMPLICIT_DEF + ; GCN: [[COPY:%[0-9]+]]:sreg_64_xexec = COPY [[DEF]] + ; GCN: [[V_CNDMASK_B32_e64_:%[0-9]+]]:vgpr_32 = V_CNDMASK_B32_e64 0, 0, 0, 1, [[COPY]], implicit $exec + bb.0: + successors: %bb.1 + + %0:vreg_1 = IMPLICIT_DEF + S_BRANCH %bb.1 + + bb.1: + %1:vreg_1 = PHI %0, %bb.0 + %2:sreg_64_xexec = COPY %1 + %3:vgpr_32 = V_CNDMASK_B32_e64 0, 0, 0, 1, %2, implicit $exec + +... 
Index: llvm/lib/CodeGen/MachineSSAUpdater.cpp
===================================================================
--- llvm/lib/CodeGen/MachineSSAUpdater.cpp
+++ llvm/lib/CodeGen/MachineSSAUpdater.cpp
@@ -292,7 +292,7 @@
                               MachineSSAUpdater *Updater) {
     // Insert an implicit_def to represent an undef value.
     MachineInstr *NewDef = InsertNewDef(TargetOpcode::IMPLICIT_DEF,
-                                        BB, BB->getFirstTerminator(),
+                                        BB, BB->getFirstNonPHI(),
                                         Updater->VRC, Updater->MRI,
                                         Updater->TII);
     return NewDef->getOperand(0).getReg();

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68183.223832.patch
Type: text/x-patch
Size: 1840 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Tue Oct 8 05:47:17 2019
From: llvm-commits at lists.llvm.org (Nicolai Hähnle via Phabricator via llvm-commits)
Date: Tue, 08 Oct 2019 12:47:17 +0000 (UTC)
Subject: [PATCH] D68184: AMDGPU: Propagate undef flag during pre-RA exec mask optimizations
In-Reply-To:
References:
Message-ID: <7d26d486f2accb929683aa84f0cc9c9e@localhost.localdomain>

This revision was automatically updated to reflect the committed changes.
Closed by commit rGdf6e67697bfb: AMDGPU: Propagate undef flag during pre-RA exec mask optimizations (authored by nhaehnle).

Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68184/new/ https://reviews.llvm.org/D68184 Files: llvm/lib/Target/AMDGPU/SIOptimizeExecMaskingPreRA.cpp llvm/test/CodeGen/AMDGPU/optimize-exec-masking-pre-ra.mir Index: llvm/test/CodeGen/AMDGPU/optimize-exec-masking-pre-ra.mir =================================================================== --- llvm/test/CodeGen/AMDGPU/optimize-exec-masking-pre-ra.mir +++ llvm/test/CodeGen/AMDGPU/optimize-exec-masking-pre-ra.mir @@ -1,5 +1,5 @@ # NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py -# RUN: llc -mtriple=amdgcn-mesa-mesa3d -run-pass=si-optimize-exec-masking-pre-ra %s -o - | FileCheck -check-prefix=GCN %s +# RUN: llc -mtriple=amdgcn-mesa-mesa3d -run-pass=si-optimize-exec-masking-pre-ra -verify-machineinstrs %s -o - | FileCheck -check-prefix=GCN %s # Check for regression from assuming an instruction was a copy after # dropping the opcode check. @@ -95,3 +95,26 @@ $exec = S_OR_B64 $exec, %7, implicit-def $scc ... + +# When folding a v_cndmask and a v_cmp in a pattern leading to +# s_cbranch_vccz, ensure that an undef operand is handled correctly. +--- +name: cndmask_cmp_cbranch_fold_undef +tracksRegLiveness: true +body: | + ; GCN-LABEL: name: cndmask_cmp_cbranch_fold_undef + ; GCN: bb.0: + ; GCN: successors: %bb.1(0x80000000) + ; GCN: $vcc = S_ANDN2_B64 $exec, undef %1:sreg_64_xexec, implicit-def $scc + ; GCN: S_CBRANCH_VCCZ %bb.1, implicit $vcc + ; GCN: bb.1: + bb.0: + + %1:vgpr_32 = V_CNDMASK_B32_e64 0, 0, 0, 1, undef %0:sreg_64_xexec, implicit $exec + V_CMP_NE_U32_e32 1, %1, implicit-def $vcc, implicit $exec + $vcc = S_AND_B64 $exec, $vcc, implicit-def dead $scc + S_CBRANCH_VCCZ %bb.1, implicit $vcc + + bb.1: + +... 
Index: llvm/lib/Target/AMDGPU/SIOptimizeExecMaskingPreRA.cpp =================================================================== --- llvm/lib/Target/AMDGPU/SIOptimizeExecMaskingPreRA.cpp +++ llvm/lib/Target/AMDGPU/SIOptimizeExecMaskingPreRA.cpp @@ -250,15 +250,16 @@ Op1->getImm() != 0 || Op2->getImm() != 1) return AMDGPU::NoRegister; - LLVM_DEBUG(dbgs() << "Folding sequence:\n\t" << *Sel << '\t' - << *Cmp << '\t' << *And); + LLVM_DEBUG(dbgs() << "Folding sequence:\n\t" << *Sel << '\t' << *Cmp << '\t' + << *And); Register CCReg = CC->getReg(); LIS->RemoveMachineInstrFromMaps(*And); - MachineInstr *Andn2 = BuildMI(MBB, *And, And->getDebugLoc(), - TII->get(Andn2Opc), And->getOperand(0).getReg()) - .addReg(ExecReg) - .addReg(CCReg, 0, CC->getSubReg()); + MachineInstr *Andn2 = + BuildMI(MBB, *And, And->getDebugLoc(), TII->get(Andn2Opc), + And->getOperand(0).getReg()) + .addReg(ExecReg) + .addReg(CCReg, getUndefRegState(CC->isUndef()), CC->getSubReg()); And->eraseFromParent(); LIS->InsertMachineInstrInMaps(*Andn2); -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68184.223833.patch
Type: text/x-patch
Size: 2775 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Tue Oct 8 05:53:55 2019
From: llvm-commits at lists.llvm.org (Graham Hunter via llvm-commits)
Date: Tue, 08 Oct 2019 12:53:55 -0000
Subject: [llvm] r374042 - [SVE][IR] Scalable Vector size queries and IR instruction support
Message-ID: <20191008125355.63E458CC1F@lists.llvm.org>

Author: huntergr
Date: Tue Oct 8 05:53:54 2019
New Revision: 374042

URL: http://llvm.org/viewvc/llvm-project?rev=374042&view=rev
Log:
[SVE][IR] Scalable Vector size queries and IR instruction support

* Adds a TypeSize struct to represent the known minimum size of a type
  along with a flag to indicate that the runtime size is an integer multiple
  of that size
* Converts existing size query functions from Type.h and DataLayout.h to
  return a TypeSize result
* Adds convenience methods (including a transparent conversion operator
  to uint64_t) so that most existing code 'just works' as if the return
  values were still scalars.
* Uses the new size queries along with ElementCount to ensure that all
  supported instructions used with scalable vectors can be constructed in
  IR.

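
The size model described in the log above (a known minimum size plus a flag saying the runtime size is an integer multiple of it) can be sketched in a few lines of standalone C++. This is a simplified illustration only, not the TypeSize class added by the patch: `SketchTypeSize`, `vectorSizeInBits` and `storeSizeInBytes` are hypothetical names, and the real class has more operators and a conversion to uint64_t.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the idea in the commit log: a known minimum
// size plus a flag saying the runtime size is an integer multiple of it.
struct SketchTypeSize {
  uint64_t MinSize; // Known minimum size.
  bool Scalable;    // True if the runtime size is a multiple of MinSize.

  // Only meaningful for fixed sizes; asserts when the size is scalable.
  uint64_t fixedSize() const {
    assert(!Scalable && "fixed size requested for a scalable size");
    return MinSize;
  }

  // Scaling by a constant keeps the scalable flag unchanged.
  SketchTypeSize operator*(uint64_t RHS) const {
    return {MinSize * RHS, Scalable};
  }

  // A scalable size never compares equal to a fixed size, even when the
  // minimum sizes match.
  bool operator==(const SketchTypeSize &RHS) const {
    return MinSize == RHS.MinSize && Scalable == RHS.Scalable;
  }
};

// Size in bits of a vector with MinElts elements of EltBits bits each;
// for scalable vectors MinElts is only the minimum element count.
inline SketchTypeSize vectorSizeInBits(uint64_t MinElts, bool Scalable,
                                       uint64_t EltBits) {
  return {MinElts * EltBits, Scalable};
}

// Round a size in bits up to whole bytes, preserving the scalable flag.
inline SketchTypeSize storeSizeInBytes(SketchTypeSize Bits) {
  return {(Bits.MinSize + 7) / 8, Bits.Scalable};
}
```

Under these assumptions a fixed 4 x 32-bit vector and a scalable vector with the same minimum element count both report a minimum of 128 bits, but they do not compare equal, and only the fixed one may be asked for an exact size.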
Reviewers: hfinkel, lattner, rkruppe, greened, rovka, rengolin, sdesmalen Reviewed By: rovka, sdesmalen Differential Revision: https://reviews.llvm.org/D53137 Added: llvm/trunk/include/llvm/Support/TypeSize.h llvm/trunk/test/Other/scalable-vectors-core-ir.ll Removed: llvm/trunk/include/llvm/Support/ScalableSize.h Modified: llvm/trunk/include/llvm/ADT/DenseMapInfo.h llvm/trunk/include/llvm/IR/DataLayout.h llvm/trunk/include/llvm/IR/DerivedTypes.h llvm/trunk/include/llvm/IR/InstrTypes.h llvm/trunk/include/llvm/IR/Type.h llvm/trunk/include/llvm/Support/MachineValueType.h llvm/trunk/lib/Analysis/InlineCost.cpp llvm/trunk/lib/CodeGen/Analysis.cpp llvm/trunk/lib/IR/DataLayout.cpp llvm/trunk/lib/IR/Instructions.cpp llvm/trunk/lib/IR/Type.cpp llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.cpp llvm/trunk/lib/Transforms/Scalar/SROA.cpp llvm/trunk/unittests/CodeGen/ScalableVectorMVTsTest.cpp llvm/trunk/unittests/IR/VectorTypesTest.cpp Modified: llvm/trunk/include/llvm/ADT/DenseMapInfo.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/ADT/DenseMapInfo.h?rev=374042&r1=374041&r2=374042&view=diff ============================================================================== --- llvm/trunk/include/llvm/ADT/DenseMapInfo.h (original) +++ llvm/trunk/include/llvm/ADT/DenseMapInfo.h Tue Oct 8 05:53:54 2019 @@ -17,7 +17,7 @@ #include "llvm/ADT/Hashing.h" #include "llvm/ADT/StringRef.h" #include "llvm/Support/PointerLikeTypeTraits.h" -#include "llvm/Support/ScalableSize.h" +#include "llvm/Support/TypeSize.h" #include #include #include Modified: llvm/trunk/include/llvm/IR/DataLayout.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/IR/DataLayout.h?rev=374042&r1=374041&r2=374042&view=diff ============================================================================== --- llvm/trunk/include/llvm/IR/DataLayout.h (original) +++ llvm/trunk/include/llvm/IR/DataLayout.h Tue Oct 8 05:53:54 2019 @@ -30,6 +30,7 @@ #include "llvm/Support/ErrorHandling.h" #include 
"llvm/Support/MathExtras.h" #include "llvm/Support/Alignment.h" +#include "llvm/Support/TypeSize.h" #include #include #include @@ -437,23 +438,33 @@ public: /// Returns the number of bits necessary to hold the specified type. /// + /// If Ty is a scalable vector type, the scalable property will be set and + /// the runtime size will be a positive integer multiple of the base size. + /// /// For example, returns 36 for i36 and 80 for x86_fp80. The type passed must /// have a size (Type::isSized() must return true). - uint64_t getTypeSizeInBits(Type *Ty) const; + TypeSize getTypeSizeInBits(Type *Ty) const; /// Returns the maximum number of bytes that may be overwritten by /// storing the specified type. /// + /// If Ty is a scalable vector type, the scalable property will be set and + /// the runtime size will be a positive integer multiple of the base size. + /// /// For example, returns 5 for i36 and 10 for x86_fp80. - uint64_t getTypeStoreSize(Type *Ty) const { - return (getTypeSizeInBits(Ty) + 7) / 8; + TypeSize getTypeStoreSize(Type *Ty) const { + auto BaseSize = getTypeSizeInBits(Ty); + return { (BaseSize.getKnownMinSize() + 7) / 8, BaseSize.isScalable() }; } /// Returns the maximum number of bits that may be overwritten by /// storing the specified type; always a multiple of 8. /// + /// If Ty is a scalable vector type, the scalable property will be set and + /// the runtime size will be a positive integer multiple of the base size. + /// /// For example, returns 40 for i36 and 80 for x86_fp80. - uint64_t getTypeStoreSizeInBits(Type *Ty) const { + TypeSize getTypeStoreSizeInBits(Type *Ty) const { return 8 * getTypeStoreSize(Ty); } @@ -468,9 +479,12 @@ public: /// Returns the offset in bytes between successive objects of the /// specified type, including alignment padding. /// + /// If Ty is a scalable vector type, the scalable property will be set and + /// the runtime size will be a positive integer multiple of the base size. 
+ /// /// This is the amount that alloca reserves for this type. For example, /// returns 12 or 16 for x86_fp80, depending on alignment. - uint64_t getTypeAllocSize(Type *Ty) const { + TypeSize getTypeAllocSize(Type *Ty) const { // Round up to the next alignment boundary. return alignTo(getTypeStoreSize(Ty), getABITypeAlignment(Ty)); } @@ -478,9 +492,12 @@ public: /// Returns the offset in bits between successive objects of the /// specified type, including alignment padding; always a multiple of 8. /// + /// If Ty is a scalable vector type, the scalable property will be set and + /// the runtime size will be a positive integer multiple of the base size. + /// /// This is the amount that alloca reserves for this type. For example, /// returns 96 or 128 for x86_fp80, depending on alignment. - uint64_t getTypeAllocSizeInBits(Type *Ty) const { + TypeSize getTypeAllocSizeInBits(Type *Ty) const { return 8 * getTypeAllocSize(Ty); } @@ -598,13 +615,13 @@ private: // The implementation of this method is provided inline as it is particularly // well suited to constant folding when called on a specific Type subclass. -inline uint64_t DataLayout::getTypeSizeInBits(Type *Ty) const { +inline TypeSize DataLayout::getTypeSizeInBits(Type *Ty) const { assert(Ty->isSized() && "Cannot getTypeInfo() on a type that is unsized!"); switch (Ty->getTypeID()) { case Type::LabelTyID: - return getPointerSizeInBits(0); + return TypeSize::Fixed(getPointerSizeInBits(0)); case Type::PointerTyID: - return getPointerSizeInBits(Ty->getPointerAddressSpace()); + return TypeSize::Fixed(getPointerSizeInBits(Ty->getPointerAddressSpace())); case Type::ArrayTyID: { ArrayType *ATy = cast(Ty); return ATy->getNumElements() * @@ -612,26 +629,30 @@ inline uint64_t DataLayout::getTypeSizeI } case Type::StructTyID: // Get the layout annotation... which is lazily created on demand. 
- return getStructLayout(cast(Ty))->getSizeInBits(); + return TypeSize::Fixed( + getStructLayout(cast(Ty))->getSizeInBits()); case Type::IntegerTyID: - return Ty->getIntegerBitWidth(); + return TypeSize::Fixed(Ty->getIntegerBitWidth()); case Type::HalfTyID: - return 16; + return TypeSize::Fixed(16); case Type::FloatTyID: - return 32; + return TypeSize::Fixed(32); case Type::DoubleTyID: case Type::X86_MMXTyID: - return 64; + return TypeSize::Fixed(64); case Type::PPC_FP128TyID: case Type::FP128TyID: - return 128; + return TypeSize::Fixed(128); // In memory objects this is always aligned to a higher boundary, but // only 80 bits contain information. case Type::X86_FP80TyID: - return 80; + return TypeSize::Fixed(80); case Type::VectorTyID: { VectorType *VTy = cast(Ty); - return VTy->getNumElements() * getTypeSizeInBits(VTy->getElementType()); + auto EltCnt = VTy->getElementCount(); + uint64_t MinBits = EltCnt.Min * + getTypeSizeInBits(VTy->getElementType()).getFixedSize(); + return TypeSize(MinBits, EltCnt.Scalable); } default: llvm_unreachable("DataLayout::getTypeSizeInBits(): Unsupported type"); Modified: llvm/trunk/include/llvm/IR/DerivedTypes.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/IR/DerivedTypes.h?rev=374042&r1=374041&r2=374042&view=diff ============================================================================== --- llvm/trunk/include/llvm/IR/DerivedTypes.h (original) +++ llvm/trunk/include/llvm/IR/DerivedTypes.h Tue Oct 8 05:53:54 2019 @@ -23,7 +23,7 @@ #include "llvm/IR/Type.h" #include "llvm/Support/Casting.h" #include "llvm/Support/Compiler.h" -#include "llvm/Support/ScalableSize.h" +#include "llvm/Support/TypeSize.h" #include #include Modified: llvm/trunk/include/llvm/IR/InstrTypes.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/IR/InstrTypes.h?rev=374042&r1=374041&r2=374042&view=diff ============================================================================== --- llvm/trunk/include/llvm/IR/InstrTypes.h 
(original) +++ llvm/trunk/include/llvm/IR/InstrTypes.h Tue Oct 8 05:53:54 2019 @@ -975,7 +975,7 @@ public: static Type* makeCmpResultType(Type* opnd_type) { if (VectorType* vt = dyn_cast(opnd_type)) { return VectorType::get(Type::getInt1Ty(opnd_type->getContext()), - vt->getNumElements()); + vt->getElementCount()); } return Type::getInt1Ty(opnd_type->getContext()); } Modified: llvm/trunk/include/llvm/IR/Type.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/IR/Type.h?rev=374042&r1=374041&r2=374042&view=diff ============================================================================== --- llvm/trunk/include/llvm/IR/Type.h (original) +++ llvm/trunk/include/llvm/IR/Type.h Tue Oct 8 05:53:54 2019 @@ -21,6 +21,7 @@ #include "llvm/Support/Casting.h" #include "llvm/Support/Compiler.h" #include "llvm/Support/ErrorHandling.h" +#include "llvm/Support/TypeSize.h" #include #include #include @@ -281,12 +282,15 @@ public: /// This will return zero if the type does not have a size or is not a /// primitive type. /// + /// If this is a scalable vector type, the scalable property will be set and + /// the runtime size will be a positive integer multiple of the base size. + /// /// Note that this may not reflect the size of memory allocated for an /// instance of the type or the number of bytes that are written when an /// instance of the type is stored to memory. The DataLayout class provides /// additional query functions to provide this information. /// - unsigned getPrimitiveSizeInBits() const LLVM_READONLY; + TypeSize getPrimitiveSizeInBits() const LLVM_READONLY; /// If this is a vector type, return the getPrimitiveSizeInBits value for the /// element type. 
Otherwise return the getPrimitiveSizeInBits value for this Modified: llvm/trunk/include/llvm/Support/MachineValueType.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Support/MachineValueType.h?rev=374042&r1=374041&r2=374042&view=diff ============================================================================== --- llvm/trunk/include/llvm/Support/MachineValueType.h (original) +++ llvm/trunk/include/llvm/Support/MachineValueType.h Tue Oct 8 05:53:54 2019 @@ -17,7 +17,7 @@ #include "llvm/ADT/iterator_range.h" #include "llvm/Support/ErrorHandling.h" #include "llvm/Support/MathExtras.h" -#include "llvm/Support/ScalableSize.h" +#include "llvm/Support/TypeSize.h" #include namespace llvm { Removed: llvm/trunk/include/llvm/Support/ScalableSize.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Support/ScalableSize.h?rev=374041&view=auto ============================================================================== --- llvm/trunk/include/llvm/Support/ScalableSize.h (original) +++ llvm/trunk/include/llvm/Support/ScalableSize.h (removed) @@ -1,46 +0,0 @@ -//===- ScalableSize.h - Scalable vector size info ---------------*- C++ -*-===// -// -// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. -// See https://llvm.org/LICENSE.txt for license information. -// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception -// -//===----------------------------------------------------------------------===// -// -// This file provides a struct that can be used to query the size of IR types -// which may be scalable vectors. It provides convenience operators so that -// it can be used in much the same way as a single scalar value. -// -//===----------------------------------------------------------------------===// - -#ifndef LLVM_SUPPORT_SCALABLESIZE_H -#define LLVM_SUPPORT_SCALABLESIZE_H - -namespace llvm { - -class ElementCount { -public: - unsigned Min; // Minimum number of vector elements. 
- bool Scalable; // If true, NumElements is a multiple of 'Min' determined - // at runtime rather than compile time. - - ElementCount(unsigned Min, bool Scalable) - : Min(Min), Scalable(Scalable) {} - - ElementCount operator*(unsigned RHS) { - return { Min * RHS, Scalable }; - } - ElementCount operator/(unsigned RHS) { - return { Min / RHS, Scalable }; - } - - bool operator==(const ElementCount& RHS) const { - return Min == RHS.Min && Scalable == RHS.Scalable; - } - bool operator!=(const ElementCount& RHS) const { - return !(*this == RHS); - } -}; - -} // end namespace llvm - -#endif // LLVM_SUPPORT_SCALABLESIZE_H Added: llvm/trunk/include/llvm/Support/TypeSize.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Support/TypeSize.h?rev=374042&view=auto ============================================================================== --- llvm/trunk/include/llvm/Support/TypeSize.h (added) +++ llvm/trunk/include/llvm/Support/TypeSize.h Tue Oct 8 05:53:54 2019 @@ -0,0 +1,200 @@ +//===- TypeSize.h - Wrapper around type sizes -------------------*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// +// This file provides a struct that can be used to query the size of IR types +// which may be scalable vectors. It provides convenience operators so that +// it can be used in much the same way as a single scalar value. +// +//===----------------------------------------------------------------------===// + +#ifndef LLVM_SUPPORT_TYPESIZE_H +#define LLVM_SUPPORT_TYPESIZE_H + +#include + +namespace llvm { + +class ElementCount { +public: + unsigned Min; // Minimum number of vector elements. + bool Scalable; // If true, NumElements is a multiple of 'Min' determined + // at runtime rather than compile time. 
+ + ElementCount(unsigned Min, bool Scalable) + : Min(Min), Scalable(Scalable) {} + + ElementCount operator*(unsigned RHS) { + return { Min * RHS, Scalable }; + } + ElementCount operator/(unsigned RHS) { + return { Min / RHS, Scalable }; + } + + bool operator==(const ElementCount& RHS) const { + return Min == RHS.Min && Scalable == RHS.Scalable; + } + bool operator!=(const ElementCount& RHS) const { + return !(*this == RHS); + } +}; + +// This class is used to represent the size of types. If the type is of fixed +// size, it will represent the exact size. If the type is a scalable vector, +// it will represent the known minimum size. +class TypeSize { + uint64_t MinSize; // The known minimum size. + bool IsScalable; // If true, then the runtime size is an integer multiple + // of MinSize. + +public: + constexpr TypeSize(uint64_t MinSize, bool Scalable) + : MinSize(MinSize), IsScalable(Scalable) {} + + static constexpr TypeSize Fixed(uint64_t Size) { + return TypeSize(Size, /*IsScalable=*/false); + } + + static constexpr TypeSize Scalable(uint64_t MinSize) { + return TypeSize(MinSize, /*IsScalable=*/true); + } + + // Scalable vector types with the same minimum size as a fixed size type are + // not guaranteed to be the same size at runtime, so they are never + // considered to be equal. + friend bool operator==(const TypeSize &LHS, const TypeSize &RHS) { + return std::tie(LHS.MinSize, LHS.IsScalable) == + std::tie(RHS.MinSize, RHS.IsScalable); + } + + friend bool operator!=(const TypeSize &LHS, const TypeSize &RHS) { + return !(LHS == RHS); + } + + // For many cases, size ordering between scalable and fixed size types cannot + // be determined at compile time, so such comparisons aren't allowed. + // + // e.g. could be bigger than <4 x i32> with a runtime + // vscale >= 5, equal sized with a vscale of 4, and smaller with + // a vscale <= 3. + // + // If the scalable flags match, just perform the requested comparison + // between the minimum sizes. 
+ friend bool operator<(const TypeSize &LHS, const TypeSize &RHS) { + assert(LHS.IsScalable == RHS.IsScalable && + "Ordering comparison of scalable and fixed types"); + + return LHS.MinSize < RHS.MinSize; + } + + friend bool operator>(const TypeSize &LHS, const TypeSize &RHS) { + return RHS < LHS; + } + + friend bool operator<=(const TypeSize &LHS, const TypeSize &RHS) { + return !(RHS < LHS); + } + + friend bool operator>=(const TypeSize &LHS, const TypeSize& RHS) { + return !(LHS < RHS); + } + + // Convenience operators to obtain relative sizes independently of + // the scalable flag. + TypeSize operator*(unsigned RHS) const { + return { MinSize * RHS, IsScalable }; + } + + friend TypeSize operator*(const unsigned LHS, const TypeSize &RHS) { + return { LHS * RHS.MinSize, RHS.IsScalable }; + } + + TypeSize operator/(unsigned RHS) const { + return { MinSize / RHS, IsScalable }; + } + + // Return the minimum size with the assumption that the size is exact. + // Use in places where a scalable size doesn't make sense (e.g. non-vector + // types, or vectors in backends which don't support scalable vectors) + uint64_t getFixedSize() const { + assert(!IsScalable && "Request for a fixed size on a scalable object"); + return MinSize; + } + + // Return the known minimum size. Use in places where the scalable property + // doesn't matter (e.g. determining alignment) or in conjunction with the + // isScalable method below. + uint64_t getKnownMinSize() const { + return MinSize; + } + + // Return whether or not the size is scalable. + bool isScalable() const { + return IsScalable; + } + + // Casts to a uint64_t if this is a fixed-width size. + // + // NOTE: This interface is obsolete and will be removed in a future version + // of LLVM in favour of calling getFixedSize() directly + operator uint64_t() const { + return getFixedSize(); + } + + // Additional convenience operators needed to avoid ambiguous parses + // TODO: Make uint64_t the default operator? 
+ TypeSize operator*(uint64_t RHS) const { + return { MinSize * RHS, IsScalable }; + } + + TypeSize operator*(int RHS) const { + return { MinSize * RHS, IsScalable }; + } + + TypeSize operator*(int64_t RHS) const { + return { MinSize * RHS, IsScalable }; + } + + friend TypeSize operator*(const uint64_t LHS, const TypeSize &RHS) { + return { LHS * RHS.MinSize, RHS.IsScalable }; + } + + friend TypeSize operator*(const int LHS, const TypeSize &RHS) { + return { LHS * RHS.MinSize, RHS.IsScalable }; + } + + friend TypeSize operator*(const int64_t LHS, const TypeSize &RHS) { + return { LHS * RHS.MinSize, RHS.IsScalable }; + } + + TypeSize operator/(uint64_t RHS) const { + return { MinSize / RHS, IsScalable }; + } + + TypeSize operator/(int RHS) const { + return { MinSize / RHS, IsScalable }; + } + + TypeSize operator/(int64_t RHS) const { + return { MinSize / RHS, IsScalable }; + } +}; + +/// Returns a TypeSize with a known minimum size that is the next integer +/// (mod 2**64) that is greater than or equal to \p Value and is a multiple +/// of \p Align. \p Align must be non-zero. 
+/// +/// Similar to the alignTo functions in MathExtras.h +inline TypeSize alignTo(TypeSize Size, uint64_t Align) { + assert(Align != 0u && "Align must be non-zero"); + return {(Size.getKnownMinSize() + Align - 1) / Align * Align, + Size.isScalable()}; +} + +} // end namespace llvm + +#endif // LLVM_SUPPORT_TypeSize_H Modified: llvm/trunk/lib/Analysis/InlineCost.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/InlineCost.cpp?rev=374042&r1=374041&r2=374042&view=diff ============================================================================== --- llvm/trunk/lib/Analysis/InlineCost.cpp (original) +++ llvm/trunk/lib/Analysis/InlineCost.cpp Tue Oct 8 05:53:54 2019 @@ -436,7 +436,8 @@ bool CallAnalyzer::visitAlloca(AllocaIns if (auto *AllocSize = dyn_cast_or_null<ConstantInt>(Size)) { Type *Ty = I.getAllocatedType(); AllocatedSize = SaturatingMultiplyAdd( - AllocSize->getLimitedValue(), DL.getTypeAllocSize(Ty), AllocatedSize); + AllocSize->getLimitedValue(), DL.getTypeAllocSize(Ty).getFixedSize(), + AllocatedSize); return Base::visitAlloca(I); } } @@ -444,7 +445,8 @@ bool CallAnalyzer::visitAlloca(AllocaIns // Accumulate the allocated size. if (I.isStaticAlloca()) { Type *Ty = I.getAllocatedType(); - AllocatedSize = SaturatingAdd(DL.getTypeAllocSize(Ty), AllocatedSize); + AllocatedSize = SaturatingAdd(DL.getTypeAllocSize(Ty).getFixedSize(), + AllocatedSize); } // We will happily inline static alloca instructions.
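[Editorial note, not part of the commit mail.] The TypeSize class added in the diff above can be hard to follow in flattened form, so here is a minimal standalone sketch of its comparison and alignment semantics. It is an illustration under stated assumptions, not the LLVM header: it does not include llvm/Support/TypeSize.h, the names SketchTypeSize and alignToSketch are hypothetical, and only the members needed for the illustration are reproduced.

```cpp
// Standalone sketch of the TypeSize semantics described in the diff above.
// SketchTypeSize and alignToSketch are hypothetical names used only for
// illustration; they are not part of LLVM.
#include <cassert>
#include <cstdint>
#include <tuple>

class SketchTypeSize {
  uint64_t MinSize; // Known minimum size.
  bool IsScalable;  // If true, the runtime size is a multiple of MinSize.

public:
  constexpr SketchTypeSize(uint64_t MinSize, bool Scalable)
      : MinSize(MinSize), IsScalable(Scalable) {}

  static constexpr SketchTypeSize Fixed(uint64_t Size) {
    return SketchTypeSize(Size, /*Scalable=*/false);
  }
  static constexpr SketchTypeSize Scalable(uint64_t MinSize) {
    return SketchTypeSize(MinSize, /*Scalable=*/true);
  }

  // Equality compares both fields, so a fixed size and a scalable size
  // with the same minimum are never considered equal.
  friend bool operator==(const SketchTypeSize &LHS,
                         const SketchTypeSize &RHS) {
    return std::tie(LHS.MinSize, LHS.IsScalable) ==
           std::tie(RHS.MinSize, RHS.IsScalable);
  }
  friend bool operator!=(const SketchTypeSize &LHS,
                         const SketchTypeSize &RHS) {
    return !(LHS == RHS);
  }

  // Ordering is only defined when both operands share the scalable flag;
  // mixing fixed and scalable sizes trips the assertion.
  friend bool operator<(const SketchTypeSize &LHS,
                        const SketchTypeSize &RHS) {
    assert(LHS.IsScalable == RHS.IsScalable &&
           "Ordering comparison of scalable and fixed types");
    return LHS.MinSize < RHS.MinSize;
  }

  // Exact size; only valid for fixed sizes.
  uint64_t getFixedSize() const {
    assert(!IsScalable && "Request for a fixed size on a scalable object");
    return MinSize;
  }
  // Minimum size; valid for both fixed and scalable sizes.
  uint64_t getKnownMinSize() const { return MinSize; }
  bool isScalable() const { return IsScalable; }
};

// Rounds the known minimum size up to a multiple of Align and keeps the
// scalable flag, mirroring the alignTo overload in the diff.
inline SketchTypeSize alignToSketch(SketchTypeSize Size, uint64_t Align) {
  assert(Align != 0u && "Align must be non-zero");
  return {(Size.getKnownMinSize() + Align - 1) / Align * Align,
          Size.isScalable()};
}
```

With this sketch, Fixed(128) and Scalable(128) compare unequal even though getKnownMinSize() returns 128 for both, which matches the CrossComparisons unit test in the same commit. Callers that cannot handle scalable vectors go through getFixedSize(), as the InlineCost.cpp hunk above does.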
Modified: llvm/trunk/lib/CodeGen/Analysis.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/Analysis.cpp?rev=374042&r1=374041&r2=374042&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/Analysis.cpp (original) +++ llvm/trunk/lib/CodeGen/Analysis.cpp Tue Oct 8 05:53:54 2019 @@ -309,7 +309,8 @@ static const Value *getNoopInput(const V NoopInput = Op; } else if (isa<TruncInst>(I) && TLI.allowTruncateForTailCall(Op->getType(), I->getType())) { - DataBits = std::min(DataBits, I->getType()->getPrimitiveSizeInBits()); + DataBits = std::min((uint64_t)DataBits, + I->getType()->getPrimitiveSizeInBits().getFixedSize()); NoopInput = Op; } else if (auto CS = ImmutableCallSite(I)) { const Value *ReturnedOp = CS.getReturnedArgOperand(); Modified: llvm/trunk/lib/IR/DataLayout.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/IR/DataLayout.cpp?rev=374042&r1=374041&r2=374042&view=diff ============================================================================== --- llvm/trunk/lib/IR/DataLayout.cpp (original) +++ llvm/trunk/lib/IR/DataLayout.cpp Tue Oct 8 05:53:54 2019 @@ -29,6 +29,7 @@ #include "llvm/Support/Casting.h" #include "llvm/Support/ErrorHandling.h" #include "llvm/Support/MathExtras.h" +#include "llvm/Support/TypeSize.h" #include #include #include @@ -745,7 +746,10 @@ Align DataLayout::getAlignment(Type *Ty, llvm_unreachable("Bad type for getAlignment!!!"); } - return getAlignmentInfo(AlignType, getTypeSizeInBits(Ty), abi_or_pref, Ty); + // If we're dealing with a scalable vector, we just need the known minimum + // size for determining alignment. If not, we'll get the exact size.
+ return getAlignmentInfo(AlignType, getTypeSizeInBits(Ty).getKnownMinSize(), + abi_or_pref, Ty); } unsigned DataLayout::getABITypeAlignment(Type *Ty) const { Modified: llvm/trunk/lib/IR/Instructions.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/IR/Instructions.cpp?rev=374042&r1=374041&r2=374042&view=diff ============================================================================== --- llvm/trunk/lib/IR/Instructions.cpp (original) +++ llvm/trunk/lib/IR/Instructions.cpp Tue Oct 8 05:53:54 2019 @@ -38,6 +38,7 @@ #include "llvm/Support/Casting.h" #include "llvm/Support/ErrorHandling.h" #include "llvm/Support/MathExtras.h" +#include "llvm/Support/TypeSize.h" #include #include #include @@ -1792,7 +1793,7 @@ ShuffleVectorInst::ShuffleVectorInst(Val const Twine &Name, Instruction *InsertBefore) : Instruction(VectorType::get(cast<VectorType>(V1->getType())->getElementType(), - cast<VectorType>(Mask->getType())->getNumElements()), + cast<VectorType>(Mask->getType())->getElementCount()), ShuffleVector, OperandTraits<ShuffleVectorInst>::op_begin(this), OperandTraits<ShuffleVectorInst>::operands(this), @@ -1809,7 +1810,7 @@ ShuffleVectorInst::ShuffleVectorInst(Val const Twine &Name, BasicBlock *InsertAtEnd) : Instruction(VectorType::get(cast<VectorType>(V1->getType())->getElementType(), - cast<VectorType>(Mask->getType())->getNumElements()), + cast<VectorType>(Mask->getType())->getElementCount()), ShuffleVector, OperandTraits<ShuffleVectorInst>::op_begin(this), OperandTraits<ShuffleVectorInst>::operands(this), @@ -2982,8 +2983,8 @@ bool CastInst::isCastable(Type *SrcTy, T } // Get the bit sizes, we'll need these - unsigned SrcBits = SrcTy->getPrimitiveSizeInBits(); // 0 for ptr - unsigned DestBits = DestTy->getPrimitiveSizeInBits(); // 0 for ptr + auto SrcBits = SrcTy->getPrimitiveSizeInBits(); // 0 for ptr + auto DestBits = DestTy->getPrimitiveSizeInBits(); // 0 for ptr // Run through the possibilities ...
if (DestTy->isIntegerTy()) { // Casting to integral @@ -3030,7 +3031,7 @@ bool CastInst::isBitCastable(Type *SrcTy if (VectorType *SrcVecTy = dyn_cast<VectorType>(SrcTy)) { if (VectorType *DestVecTy = dyn_cast<VectorType>(DestTy)) { - if (SrcVecTy->getNumElements() == DestVecTy->getNumElements()) { + if (SrcVecTy->getElementCount() == DestVecTy->getElementCount()) { // An element by element cast. Valid if casting the elements is valid. SrcTy = SrcVecTy->getElementType(); DestTy = DestVecTy->getElementType(); @@ -3044,12 +3045,12 @@ bool CastInst::isBitCastable(Type *SrcTy } } - unsigned SrcBits = SrcTy->getPrimitiveSizeInBits(); // 0 for ptr - unsigned DestBits = DestTy->getPrimitiveSizeInBits(); // 0 for ptr + auto SrcBits = SrcTy->getPrimitiveSizeInBits(); // 0 for ptr + auto DestBits = DestTy->getPrimitiveSizeInBits(); // 0 for ptr // Could still have vectors of pointers if the number of elements doesn't // match - if (SrcBits == 0 || DestBits == 0) + if (SrcBits.getKnownMinSize() == 0 || DestBits.getKnownMinSize() == 0) return false; if (SrcBits != DestBits) Modified: llvm/trunk/lib/IR/Type.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/IR/Type.cpp?rev=374042&r1=374041&r2=374042&view=diff ============================================================================== --- llvm/trunk/lib/IR/Type.cpp (original) +++ llvm/trunk/lib/IR/Type.cpp Tue Oct 8 05:53:54 2019 @@ -26,6 +26,7 @@ #include "llvm/Support/Casting.h" #include "llvm/Support/MathExtras.h" #include "llvm/Support/raw_ostream.h" +#include "llvm/Support/TypeSize.h" #include #include @@ -111,18 +112,22 @@ bool Type::isEmptyTy() const { return false; } -unsigned Type::getPrimitiveSizeInBits() const { +TypeSize Type::getPrimitiveSizeInBits() const { switch (getTypeID()) { - case Type::HalfTyID: return 16; - case Type::FloatTyID: return 32; - case Type::DoubleTyID: return 64; - case Type::X86_FP80TyID: return 80; - case Type::FP128TyID: return 128; - case Type::PPC_FP128TyID: return 128; - case Type::X86_MMXTyID: return
64; - case Type::IntegerTyID: return cast<IntegerType>(this)->getBitWidth(); - case Type::VectorTyID: return cast<VectorType>(this)->getBitWidth(); - default: return 0; + case Type::HalfTyID: return TypeSize::Fixed(16); + case Type::FloatTyID: return TypeSize::Fixed(32); + case Type::DoubleTyID: return TypeSize::Fixed(64); + case Type::X86_FP80TyID: return TypeSize::Fixed(80); + case Type::FP128TyID: return TypeSize::Fixed(128); + case Type::PPC_FP128TyID: return TypeSize::Fixed(128); + case Type::X86_MMXTyID: return TypeSize::Fixed(64); + case Type::IntegerTyID: + return TypeSize::Fixed(cast<IntegerType>(this)->getBitWidth()); + case Type::VectorTyID: { + const VectorType *VTy = cast<VectorType>(this); + return TypeSize(VTy->getBitWidth(), VTy->isScalable()); + } + default: return TypeSize::Fixed(0); } } Modified: llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.cpp?rev=374042&r1=374041&r2=374042&view=diff ============================================================================== --- llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.cpp (original) +++ llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.cpp Tue Oct 8 05:53:54 2019 @@ -8526,7 +8526,7 @@ bool AArch64TargetLowering::isExtFreeImp // Get the shift amount based on the scaling factor: // log2(sizeof(IdxTy)) - log2(8). uint64_t ShiftAmt = - countTrailingZeros(DL.getTypeStoreSizeInBits(IdxTy)) - 3; + countTrailingZeros(DL.getTypeStoreSizeInBits(IdxTy).getFixedSize()) - 3; // Is the constant foldable in the shift of the addressing mode? // I.e., shift amount is between 1 and 4 inclusive.
if (ShiftAmt == 0 || ShiftAmt > 4) Modified: llvm/trunk/lib/Transforms/Scalar/SROA.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/SROA.cpp?rev=374042&r1=374041&r2=374042&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Scalar/SROA.cpp (original) +++ llvm/trunk/lib/Transforms/Scalar/SROA.cpp Tue Oct 8 05:53:54 2019 @@ -959,14 +959,16 @@ private: std::tie(UsedI, I) = Uses.pop_back_val(); if (LoadInst *LI = dyn_cast<LoadInst>(I)) { - Size = std::max(Size, DL.getTypeStoreSize(LI->getType())); + Size = std::max(Size, + DL.getTypeStoreSize(LI->getType()).getFixedSize()); continue; } if (StoreInst *SI = dyn_cast<StoreInst>(I)) { Value *Op = SI->getOperand(0); if (Op == UsedI) return SI; - Size = std::max(Size, DL.getTypeStoreSize(Op->getType())); + Size = std::max(Size, + DL.getTypeStoreSize(Op->getType()).getFixedSize()); continue; } Added: llvm/trunk/test/Other/scalable-vectors-core-ir.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Other/scalable-vectors-core-ir.ll?rev=374042&view=auto ============================================================================== --- llvm/trunk/test/Other/scalable-vectors-core-ir.ll (added) +++ llvm/trunk/test/Other/scalable-vectors-core-ir.ll Tue Oct 8 05:53:54 2019 @@ -0,0 +1,393 @@ +; RUN: opt -S -verify < %s | FileCheck %s +target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128" +target triple = "aarch64--linux-gnu" + +;; Check supported instructions are accepted without dropping 'vscale'.
+;; Same order as the LangRef + +;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; +;; Unary Operations +;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; + + +define @fneg( %val) { +; CHECK-LABEL: @fneg +; CHECK: %r = fneg %val +; CHECK-NEXT: ret %r + %r = fneg %val + ret %r +} + +;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; +;; Binary Operations +;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; + +define @add( %a, %b) { +; CHECK-LABEL: @add +; CHECK: %r = add %a, %b +; CHECK-NEXT: ret %r + %r = add %a, %b + ret %r +} + +define @fadd( %a, %b) { +; CHECK-LABEL: @fadd +; CHECK: %r = fadd %a, %b +; CHECK-NEXT: ret %r + %r = fadd %a, %b + ret %r +} + +define @sub( %a, %b) { +; CHECK-LABEL: @sub +; CHECK: %r = sub %a, %b +; CHECK-NEXT: ret %r + %r = sub %a, %b + ret %r +} + +define @fsub( %a, %b) { +; CHECK-LABEL: @fsub +; CHECK: %r = fsub %a, %b +; CHECK-NEXT: ret %r + %r = fsub %a, %b + ret %r +} + +define @mul( %a, %b) { +; CHECK-LABEL: @mul +; CHECK: %r = mul %a, %b +; CHECK-NEXT: ret %r + %r = mul %a, %b + ret %r +} + +define @fmul( %a, %b) { +; CHECK-LABEL: @fmul +; CHECK: %r = fmul %a, %b +; CHECK-NEXT: ret %r + %r = fmul %a, %b + ret %r +} + +define @udiv( %a, %b) { +; CHECK-LABEL: @udiv +; CHECK: %r = udiv %a, %b +; CHECK-NEXT: ret %r + %r = udiv %a, %b + ret %r +} + +define @sdiv( %a, %b) { +; CHECK-LABEL: @sdiv +; CHECK: %r = sdiv %a, %b +; CHECK-NEXT: ret %r + %r = sdiv %a, %b + ret %r +} + +define @fdiv( %a, %b) { +; CHECK-LABEL: @fdiv +; CHECK: %r = fdiv %a, %b +; CHECK-NEXT: ret %r + %r = fdiv %a, %b + ret %r +} + +define @urem( %a, %b) { +; CHECK-LABEL: @urem +; CHECK: %r = urem %a, %b +; CHECK-NEXT: ret %r + %r = urem %a, %b + ret %r +} + +define @srem( %a, %b) { +; CHECK-LABEL: @srem +; CHECK: %r = srem %a, %b +; CHECK-NEXT: ret %r + %r = srem %a, %b + ret %r +} + +define @frem( %a, %b) { +; CHECK-LABEL: @frem +; CHECK: %r 
= frem %a, %b +; CHECK-NEXT: ret %r + %r = frem %a, %b + ret %r +} + +;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; +;; Bitwise Binary Operations +;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; + +define @shl( %a, %b) { +; CHECK-LABEL: @shl +; CHECK: %r = shl %a, %b +; CHECK-NEXT: ret %r + %r = shl %a, %b + ret %r +} + +define @lshr( %a, %b) { +; CHECK-LABEL: @lshr +; CHECK: %r = lshr %a, %b +; CHECK-NEXT: ret %r + %r = lshr %a, %b + ret %r +} + +define @ashr( %a, %b) { +; CHECK-LABEL: @ashr +; CHECK: %r = ashr %a, %b +; CHECK-NEXT: ret %r + %r = ashr %a, %b + ret %r +} + +define @and( %a, %b) { +; CHECK-LABEL: @and +; CHECK: %r = and %a, %b +; CHECK-NEXT: ret %r + %r = and %a, %b + ret %r +} + +define @or( %a, %b) { +; CHECK-LABEL: @or +; CHECK: %r = or %a, %b +; CHECK-NEXT: ret %r + %r = or %a, %b + ret %r +} + +define @xor( %a, %b) { +; CHECK-LABEL: @xor +; CHECK: %r = xor %a, %b +; CHECK-NEXT: ret %r + %r = xor %a, %b + ret %r +} + +;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; +;; Vector Operations +;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; + +define i64 @extractelement( %val) { +; CHECK-LABEL: @extractelement +; CHECK: %r = extractelement %val, i32 0 +; CHECK-NEXT: ret i64 %r + %r = extractelement %val, i32 0 + ret i64 %r +} + +define @insertelement( %vec, i8 %ins) { +; CHECK-LABEL: @insertelement +; CHECK: %r = insertelement %vec, i8 %ins, i32 0 +; CHECK-NEXT: ret %r + %r = insertelement %vec, i8 %ins, i32 0 + ret %r +} + +define @shufflevector(half %val) { +; CHECK-LABEL: @shufflevector +; CHECK: %insvec = insertelement undef, half %val, i32 0 +; CHECK-NEXT: %r = shufflevector %insvec, undef, zeroinitializer +; CHECK-NEXT: ret %r + %insvec = insertelement undef, half %val, i32 0 + %r = shufflevector %insvec, undef, zeroinitializer + ret %r +} + 
+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; +;; Memory Access and Addressing Operations +;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; + +define void @alloca() { +; CHECK-LABEL: @alloca +; CHECK: %vec = alloca +; CHECK-NEXT: ret void + %vec = alloca + ret void +} + +define @load(* %ptr) { +; CHECK-LABEL: @load +; CHECK: %r = load , * %ptr +; CHECK-NEXT: ret %r + %r = load , * %ptr + ret %r +} + +define void @store( %data, * %ptr) { +; CHECK-LABEL: @store +; CHECK: store %data, * %ptr +; CHECK-NEXT: ret void + store %data, * %ptr + ret void +} + +define * @getelementptr(* %base) { +; CHECK-LABEL: @getelementptr +; CHECK: %r = getelementptr , * %base, i64 0 +; CHECK-NEXT: ret * %r + %r = getelementptr , * %base, i64 0 + ret * %r +} + +;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; +;; Conversion Operations +;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; + +define @truncto( %val) { +; CHECK-LABEL: @truncto +; CHECK: %r = trunc %val to +; CHECK-NEXT: ret %r + %r = trunc %val to + ret %r +} + +define @zextto( %val) { +; CHECK-LABEL: @zextto +; CHECK: %r = zext %val to +; CHECK-NEXT: ret %r + %r = zext %val to + ret %r +} + +define @sextto( %val) { +; CHECK-LABEL: @sextto +; CHECK: %r = sext %val to +; CHECK-NEXT: ret %r + %r = sext %val to + ret %r +} + +define @fptruncto( %val) { +; CHECK-LABEL: @fptruncto +; CHECK: %r = fptrunc %val to +; CHECK-NEXT: ret %r + %r = fptrunc %val to + ret %r +} + +define @fpextto( %val) { +; CHECK-LABEL: @fpextto +; CHECK: %r = fpext %val to +; CHECK-NEXT: ret %r + %r = fpext %val to + ret %r +} + +define @fptouito( %val) { +; CHECK-LABEL: @fptoui +; CHECK: %r = fptoui %val to +; CHECK-NEXT: ret %r + %r = fptoui %val to + ret %r +} + +define @fptosito( %val) { +; CHECK-LABEL: @fptosi +; CHECK: %r = fptosi %val to +; CHECK-NEXT: ret %r + %r = fptosi %val to + ret %r +} + +define @uitofpto( %val) { +; 
CHECK-LABEL: @uitofp +; CHECK: %r = uitofp %val to +; CHECK-NEXT: ret %r + %r = uitofp %val to + ret %r +} + +define @sitofpto( %val) { +; CHECK-LABEL: @sitofp +; CHECK: %r = sitofp %val to +; CHECK-NEXT: ret %r + %r = sitofp %val to + ret %r +} + +define @ptrtointto( %val) { +; CHECK-LABEL: @ptrtointto +; CHECK: %r = ptrtoint %val to +; CHECK-NEXT: ret %r + %r = ptrtoint %val to + ret %r +} + +define @inttoptrto( %val) { +; CHECK-LABEL: @inttoptrto +; CHECK: %r = inttoptr %val to +; CHECK-NEXT: ret %r + %r = inttoptr %val to + ret %r +} + +define @bitcastto( %a) { +; CHECK-LABEL: @bitcast +; CHECK: %r = bitcast %a to +; CHECK-NEXT: ret %r + %r = bitcast %a to + ret %r +} + +;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; +;; Other Operations +;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; + +define @icmp( %a, %b) { +; CHECK-LABEL: @icmp +; CHECK: %r = icmp eq %a, %b +; CHECK-NEXT: ret %r + %r = icmp eq %a, %b + ret %r +} + +define @fcmp( %a, %b) { +; CHECK-LABEL: @fcmp +; CHECK: %r = fcmp une %a, %b +; CHECK-NEXT: ret %r + %r = fcmp une %a, %b + ret %r +} + +define @phi( %a, i32 %val) { +; CHECK-LABEL: @phi +; CHECK: %r = phi [ %a, %entry ], [ %added, %iszero ] +; CHECK-NEXT: ret %r +entry: + %cmp = icmp eq i32 %val, 0 + br i1 %cmp, label %iszero, label %end + +iszero: + %ins = insertelement undef, i8 1, i32 0 + %splatone = shufflevector %ins, undef, zeroinitializer + %added = add %a, %splatone + br label %end + +end: + %r = phi [ %a, %entry ], [ %added, %iszero ] + ret %r +} + +define @select( %a, %b, %sval) { +; CHECK-LABEL: @select +; CHECK: %r = select %sval, %a, %b +; CHECK-NEXT: ret %r + %r = select %sval, %a, %b + ret %r +} + +declare @callee() +define @call( %val) { +; CHECK-LABEL: @call +; CHECK: %r = call @callee( %val) +; CHECK-NEXT: ret %r + %r = call @callee( %val) + ret %r +} \ No newline at end of file Modified: llvm/trunk/unittests/CodeGen/ScalableVectorMVTsTest.cpp URL: 
http://llvm.org/viewvc/llvm-project/llvm/trunk/unittests/CodeGen/ScalableVectorMVTsTest.cpp?rev=374042&r1=374041&r2=374042&view=diff ============================================================================== --- llvm/trunk/unittests/CodeGen/ScalableVectorMVTsTest.cpp (original) +++ llvm/trunk/unittests/CodeGen/ScalableVectorMVTsTest.cpp Tue Oct 8 05:53:54 2019 @@ -10,7 +10,7 @@ #include "llvm/IR/DerivedTypes.h" #include "llvm/IR/LLVMContext.h" #include "llvm/Support/MachineValueType.h" -#include "llvm/Support/ScalableSize.h" +#include "llvm/Support/TypeSize.h" #include "gtest/gtest.h" using namespace llvm; Modified: llvm/trunk/unittests/IR/VectorTypesTest.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/unittests/IR/VectorTypesTest.cpp?rev=374042&r1=374041&r2=374042&view=diff ============================================================================== --- llvm/trunk/unittests/IR/VectorTypesTest.cpp (original) +++ llvm/trunk/unittests/IR/VectorTypesTest.cpp Tue Oct 8 05:53:54 2019 @@ -6,9 +6,10 @@ // //===----------------------------------------------------------------------===// +#include "llvm/IR/DataLayout.h" #include "llvm/IR/DerivedTypes.h" #include "llvm/IR/LLVMContext.h" -#include "llvm/Support/ScalableSize.h" +#include "llvm/Support/TypeSize.h" #include "gtest/gtest.h" using namespace llvm; @@ -161,4 +162,117 @@ TEST(VectorTypesTest, Scalable) { ASSERT_TRUE(EltCnt.Scalable); } +TEST(VectorTypesTest, FixedLenComparisons) { + LLVMContext Ctx; + DataLayout DL(""); + + Type *Int32Ty = Type::getInt32Ty(Ctx); + Type *Int64Ty = Type::getInt64Ty(Ctx); + + VectorType *V2Int32Ty = VectorType::get(Int32Ty, 2); + VectorType *V4Int32Ty = VectorType::get(Int32Ty, 4); + + VectorType *V2Int64Ty = VectorType::get(Int64Ty, 2); + + TypeSize V2I32Len = V2Int32Ty->getPrimitiveSizeInBits(); + EXPECT_EQ(V2I32Len.getKnownMinSize(), 64U); + EXPECT_FALSE(V2I32Len.isScalable()); + + EXPECT_LT(V2Int32Ty->getPrimitiveSizeInBits(), + V4Int32Ty->getPrimitiveSizeInBits()); + 
EXPECT_GT(V2Int64Ty->getPrimitiveSizeInBits(), + V2Int32Ty->getPrimitiveSizeInBits()); + EXPECT_EQ(V4Int32Ty->getPrimitiveSizeInBits(), + V2Int64Ty->getPrimitiveSizeInBits()); + EXPECT_NE(V2Int32Ty->getPrimitiveSizeInBits(), + V2Int64Ty->getPrimitiveSizeInBits()); + + // Check that a fixed-only comparison works for fixed size vectors. + EXPECT_EQ(V2Int64Ty->getPrimitiveSizeInBits().getFixedSize(), + V4Int32Ty->getPrimitiveSizeInBits().getFixedSize()); + + // Check the DataLayout interfaces. + EXPECT_EQ(DL.getTypeSizeInBits(V2Int64Ty), + DL.getTypeSizeInBits(V4Int32Ty)); + EXPECT_EQ(DL.getTypeSizeInBits(V2Int32Ty), 64U); + EXPECT_EQ(DL.getTypeSizeInBits(V2Int64Ty), 128U); + EXPECT_EQ(DL.getTypeStoreSize(V2Int64Ty), + DL.getTypeStoreSize(V4Int32Ty)); + EXPECT_NE(DL.getTypeStoreSizeInBits(V2Int32Ty), + DL.getTypeStoreSizeInBits(V2Int64Ty)); + EXPECT_EQ(DL.getTypeStoreSizeInBits(V2Int32Ty), 64U); + EXPECT_EQ(DL.getTypeStoreSize(V2Int64Ty), 16U); + EXPECT_EQ(DL.getTypeAllocSize(V4Int32Ty), + DL.getTypeAllocSize(V2Int64Ty)); + EXPECT_NE(DL.getTypeAllocSizeInBits(V2Int32Ty), + DL.getTypeAllocSizeInBits(V2Int64Ty)); + EXPECT_EQ(DL.getTypeAllocSizeInBits(V4Int32Ty), 128U); + EXPECT_EQ(DL.getTypeAllocSize(V2Int32Ty), 8U); + ASSERT_TRUE(DL.typeSizeEqualsStoreSize(V4Int32Ty)); +} + +TEST(VectorTypesTest, ScalableComparisons) { + LLVMContext Ctx; + DataLayout DL(""); + + Type *Int32Ty = Type::getInt32Ty(Ctx); + Type *Int64Ty = Type::getInt64Ty(Ctx); + + VectorType *ScV2Int32Ty = VectorType::get(Int32Ty, {2, true}); + VectorType *ScV4Int32Ty = VectorType::get(Int32Ty, {4, true}); + + VectorType *ScV2Int64Ty = VectorType::get(Int64Ty, {2, true}); + + TypeSize ScV2I32Len = ScV2Int32Ty->getPrimitiveSizeInBits(); + EXPECT_EQ(ScV2I32Len.getKnownMinSize(), 64U); + EXPECT_TRUE(ScV2I32Len.isScalable()); + + EXPECT_LT(ScV2Int32Ty->getPrimitiveSizeInBits(), + ScV4Int32Ty->getPrimitiveSizeInBits()); + EXPECT_GT(ScV2Int64Ty->getPrimitiveSizeInBits(), + 
ScV2Int32Ty->getPrimitiveSizeInBits()); + EXPECT_EQ(ScV4Int32Ty->getPrimitiveSizeInBits(), + ScV2Int64Ty->getPrimitiveSizeInBits()); + EXPECT_NE(ScV2Int32Ty->getPrimitiveSizeInBits(), + ScV2Int64Ty->getPrimitiveSizeInBits()); + + // Check the DataLayout interfaces. + EXPECT_EQ(DL.getTypeSizeInBits(ScV2Int64Ty), + DL.getTypeSizeInBits(ScV4Int32Ty)); + EXPECT_EQ(DL.getTypeSizeInBits(ScV2Int32Ty).getKnownMinSize(), 64U); + EXPECT_EQ(DL.getTypeStoreSize(ScV2Int64Ty), + DL.getTypeStoreSize(ScV4Int32Ty)); + EXPECT_NE(DL.getTypeStoreSizeInBits(ScV2Int32Ty), + DL.getTypeStoreSizeInBits(ScV2Int64Ty)); + EXPECT_EQ(DL.getTypeStoreSizeInBits(ScV2Int32Ty).getKnownMinSize(), 64U); + EXPECT_EQ(DL.getTypeStoreSize(ScV2Int64Ty).getKnownMinSize(), 16U); + EXPECT_EQ(DL.getTypeAllocSize(ScV4Int32Ty), + DL.getTypeAllocSize(ScV2Int64Ty)); + EXPECT_NE(DL.getTypeAllocSizeInBits(ScV2Int32Ty), + DL.getTypeAllocSizeInBits(ScV2Int64Ty)); + EXPECT_EQ(DL.getTypeAllocSizeInBits(ScV4Int32Ty).getKnownMinSize(), 128U); + EXPECT_EQ(DL.getTypeAllocSize(ScV2Int32Ty).getKnownMinSize(), 8U); + ASSERT_TRUE(DL.typeSizeEqualsStoreSize(ScV4Int32Ty)); +} + +TEST(VectorTypesTest, CrossComparisons) { + LLVMContext Ctx; + + Type *Int32Ty = Type::getInt32Ty(Ctx); + + VectorType *V4Int32Ty = VectorType::get(Int32Ty, {4, false}); + VectorType *ScV4Int32Ty = VectorType::get(Int32Ty, {4, true}); + + // Even though the minimum size is the same, a scalable vector could be + // larger so we don't consider them to be the same size. + EXPECT_NE(V4Int32Ty->getPrimitiveSizeInBits(), + ScV4Int32Ty->getPrimitiveSizeInBits()); + // If we are only checking the minimum, then they are the same size. + EXPECT_EQ(V4Int32Ty->getPrimitiveSizeInBits().getKnownMinSize(), + ScV4Int32Ty->getPrimitiveSizeInBits().getKnownMinSize()); + + // We can't use ordering comparisons (<,<=,>,>=) between scalable and + // non-scalable vector sizes. 
+} + } // end anonymous namespace From llvm-commits at lists.llvm.org Tue Oct 8 06:08:51 2019 From: llvm-commits at lists.llvm.org (Amaury Sechet via llvm-commits) Date: Tue, 08 Oct 2019 13:08:51 -0000 Subject: [llvm] r374043 - Add test for rotating truncated vectors. NFC Message-ID: <20191008130851.9C8CB81A40@lists.llvm.org> Author: deadalnix Date: Tue Oct 8 06:08:51 2019 New Revision: 374043 URL: http://llvm.org/viewvc/llvm-project?rev=374043&view=rev Log: Add test for rotating truncated vectors. NFC Modified: llvm/trunk/test/CodeGen/X86/rot16.ll llvm/trunk/test/CodeGen/X86/vector-rotate-128.ll Modified: llvm/trunk/test/CodeGen/X86/rot16.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/rot16.ll?rev=374043&r1=374042&r2=374043&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/rot16.ll (original) +++ llvm/trunk/test/CodeGen/X86/rot16.ll Tue Oct 8 06:08:51 2019 @@ -186,22 +186,21 @@ define i32 @rot16_demandedbits(i32 %x, i ; X32-NEXT: shrl $11, %ecx ; X32-NEXT: shll $5, %eax ; X32-NEXT: orl %ecx, %eax -; X32-NEXT: andl $65536, %eax # imm = 0x10000 +; X32-NEXT: movzwl %ax, %eax ; X32-NEXT: retl ; ; X64-LABEL: rot16_demandedbits: ; X64: # %bb.0: ; X64-NEXT: movl %edi, %eax -; X64-NEXT: movl %edi, %ecx -; X64-NEXT: shrl $11, %ecx -; X64-NEXT: shll $5, %eax -; X64-NEXT: orl %ecx, %eax -; X64-NEXT: andl $65536, %eax # imm = 0x10000 +; X64-NEXT: shrl $11, %eax +; X64-NEXT: shll $5, %edi +; X64-NEXT: orl %eax, %edi +; X64-NEXT: movzwl %di, %eax ; X64-NEXT: retq %t0 = lshr i32 %x, 11 %t1 = shl i32 %x, 5 %t2 = or i32 %t0, %t1 - %t3 = and i32 %t2, 65536 + %t3 = and i32 %t2, 65535 ret i32 %t3 } Modified: llvm/trunk/test/CodeGen/X86/vector-rotate-128.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/vector-rotate-128.ll?rev=374043&r1=374042&r2=374043&view=diff ============================================================================== --- 
llvm/trunk/test/CodeGen/X86/vector-rotate-128.ll (original) +++ llvm/trunk/test/CodeGen/X86/vector-rotate-128.ll Tue Oct 8 06:08:51 2019 @@ -2087,3 +2087,146 @@ define <16 x i8> @splatconstant_rotate_m %or = or <16 x i8> %lmask, %rmask ret <16 x i8> %or } + +define <4 x i32> @rot16_demandedbits(<4 x i32> %x, <4 x i32> %y) nounwind { +; X32-LABEL: rot16_demandedbits: +; X32: # %bb.0: +; X32-NEXT: movl {{[0-9]+}}(%esp), %eax +; X32-NEXT: movl %eax, %ecx +; X32-NEXT: shrl $11, %ecx +; X32-NEXT: shll $5, %eax +; X32-NEXT: orl %ecx, %eax +; X32-NEXT: andl $65536, %eax # imm = 0x10000 +; X32-NEXT: retl +; +; X64-LABEL: rot16_demandedbits: +; X64: # %bb.0: +; X64-NEXT: movl %edi, %eax +; X64-NEXT: movl %edi, %ecx +; X64-NEXT: shrl $11, %ecx +; X64-NEXT: shll $5, %eax +; X64-NEXT: orl %ecx, %eax +; X64-NEXT: andl $65536, %eax # imm = 0x10000 +; X64-NEXT: retq +; SSE2-LABEL: rot16_demandedbits: +; SSE2: # %bb.0: +; SSE2-NEXT: movdqa %xmm0, %xmm1 +; SSE2-NEXT: psrld $11, %xmm1 +; SSE2-NEXT: pslld $11, %xmm0 +; SSE2-NEXT: por %xmm1, %xmm0 +; SSE2-NEXT: pand {{.*}}(%rip), %xmm0 +; SSE2-NEXT: retq +; +; SSE41-LABEL: rot16_demandedbits: +; SSE41: # %bb.0: +; SSE41-NEXT: movdqa %xmm0, %xmm1 +; SSE41-NEXT: psrld $11, %xmm1 +; SSE41-NEXT: pslld $11, %xmm0 +; SSE41-NEXT: por %xmm1, %xmm0 +; SSE41-NEXT: pxor %xmm1, %xmm1 +; SSE41-NEXT: pblendw {{.*#+}} xmm0 = xmm0[0],xmm1[1],xmm0[2],xmm1[3],xmm0[4],xmm1[5],xmm0[6],xmm1[7] +; SSE41-NEXT: retq +; +; AVX-LABEL: rot16_demandedbits: +; AVX: # %bb.0: +; AVX-NEXT: vpsrld $11, %xmm0, %xmm1 +; AVX-NEXT: vpslld $11, %xmm0, %xmm0 +; AVX-NEXT: vpor %xmm0, %xmm1, %xmm0 +; AVX-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; AVX-NEXT: vpblendw {{.*#+}} xmm0 = xmm0[0],xmm1[1],xmm0[2],xmm1[3],xmm0[4],xmm1[5],xmm0[6],xmm1[7] +; AVX-NEXT: retq +; +; AVX512-LABEL: rot16_demandedbits: +; AVX512: # %bb.0: +; AVX512-NEXT: vpsrld $11, %xmm0, %xmm1 +; AVX512-NEXT: vpslld $11, %xmm0, %xmm0 +; AVX512-NEXT: vpor %xmm0, %xmm1, %xmm0 +; AVX512-NEXT: vpxor %xmm1, %xmm1, %xmm1 
+; AVX512-NEXT: vpblendw {{.*#+}} xmm0 = xmm0[0],xmm1[1],xmm0[2],xmm1[3],xmm0[4],xmm1[5],xmm0[6],xmm1[7] +; AVX512-NEXT: retq +; +; XOP-LABEL: rot16_demandedbits: +; XOP: # %bb.0: +; XOP-NEXT: vpsrld $11, %xmm0, %xmm1 +; XOP-NEXT: vpslld $11, %xmm0, %xmm0 +; XOP-NEXT: vpor %xmm0, %xmm1, %xmm0 +; XOP-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; XOP-NEXT: vpblendw {{.*#+}} xmm0 = xmm0[0],xmm1[1],xmm0[2],xmm1[3],xmm0[4],xmm1[5],xmm0[6],xmm1[7] +; XOP-NEXT: retq +; +; X32-SSE-LABEL: rot16_demandedbits: +; X32-SSE: # %bb.0: +; X32-SSE-NEXT: movdqa %xmm0, %xmm1 +; X32-SSE-NEXT: psrld $11, %xmm1 +; X32-SSE-NEXT: pslld $11, %xmm0 +; X32-SSE-NEXT: por %xmm1, %xmm0 +; X32-SSE-NEXT: pand {{\.LCPI.*}}, %xmm0 +; X32-SSE-NEXT: retl + %t0 = lshr <4 x i32> %x, + %t1 = shl <4 x i32> %x, + %t2 = or <4 x i32> %t0, %t1 + %t3 = and <4 x i32> %t2, + ret <4 x i32> %t3 +} + +define <4 x i16> @rot16_trunc(<4 x i32> %x, <4 x i32> %y) nounwind { +; SSE2-LABEL: rot16_trunc: +; SSE2: # %bb.0: +; SSE2-NEXT: movdqa %xmm0, %xmm1 +; SSE2-NEXT: psrld $11, %xmm1 +; SSE2-NEXT: pslld $5, %xmm0 +; SSE2-NEXT: por %xmm1, %xmm0 +; SSE2-NEXT: pshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7] +; SSE2-NEXT: pshufhw {{.*#+}} xmm0 = xmm0[0,1,2,3,4,6,6,7] +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm0[0,2,2,3] +; SSE2-NEXT: retq +; +; SSE41-LABEL: rot16_trunc: +; SSE41: # %bb.0: +; SSE41-NEXT: movdqa %xmm0, %xmm1 +; SSE41-NEXT: psrld $11, %xmm1 +; SSE41-NEXT: pslld $5, %xmm0 +; SSE41-NEXT: por %xmm1, %xmm0 +; SSE41-NEXT: pshufb {{.*#+}} xmm0 = xmm0[0,1,4,5,8,9,12,13,8,9,12,13,12,13,14,15] +; SSE41-NEXT: retq +; +; AVX-LABEL: rot16_trunc: +; AVX: # %bb.0: +; AVX-NEXT: vpsrld $11, %xmm0, %xmm1 +; AVX-NEXT: vpslld $5, %xmm0, %xmm0 +; AVX-NEXT: vpor %xmm0, %xmm1, %xmm0 +; AVX-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,1,4,5,8,9,12,13,8,9,12,13,12,13,14,15] +; AVX-NEXT: retq +; +; AVX512-LABEL: rot16_trunc: +; AVX512: # %bb.0: +; AVX512-NEXT: vpsrld $11, %xmm0, %xmm1 +; AVX512-NEXT: vpslld $5, %xmm0, %xmm0 +; AVX512-NEXT: vpor %xmm0, %xmm1, 
%xmm0 +; AVX512-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,1,4,5,8,9,12,13,8,9,12,13,12,13,14,15] +; AVX512-NEXT: retq +; +; XOP-LABEL: rot16_trunc: +; XOP: # %bb.0: +; XOP-NEXT: vpsrld $11, %xmm0, %xmm1 +; XOP-NEXT: vpslld $5, %xmm0, %xmm0 +; XOP-NEXT: vpor %xmm0, %xmm1, %xmm0 +; XOP-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,1,4,5,8,9,12,13,8,9,12,13,12,13,14,15] +; XOP-NEXT: retq +; +; X32-SSE-LABEL: rot16_trunc: +; X32-SSE: # %bb.0: +; X32-SSE-NEXT: movdqa %xmm0, %xmm1 +; X32-SSE-NEXT: psrld $11, %xmm1 +; X32-SSE-NEXT: pslld $5, %xmm0 +; X32-SSE-NEXT: por %xmm1, %xmm0 +; X32-SSE-NEXT: pshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7] +; X32-SSE-NEXT: pshufhw {{.*#+}} xmm0 = xmm0[0,1,2,3,4,6,6,7] +; X32-SSE-NEXT: pshufd {{.*#+}} xmm0 = xmm0[0,2,2,3] +; X32-SSE-NEXT: retl + %t0 = lshr <4 x i32> %x, + %t1 = shl <4 x i32> %x, + %t2 = or <4 x i32> %t0, %t1 + %t3 = trunc <4 x i32> %t2 to <4 x i16> + ret <4 x i16> %t3 +} From llvm-commits at lists.llvm.org Tue Oct 8 06:14:55 2019 From: llvm-commits at lists.llvm.org (=?utf-8?q?Nicolai_H=C3=A4hnle_via_Phabricator?= via llvm-commits) Date: Tue, 08 Oct 2019 13:14:55 +0000 (UTC) Subject: [PATCH] D66709: AMDGPU: Introduce a flag to disable mul24 intrinsic formation In-Reply-To: References: Message-ID: <7e6c23dfab13258a2428efb200599510@localhost.localdomain> nhaehnle added a comment. Huh, interesting. I guess nobody had the time to really dig into why. Anyway, thanks. I think it'd be good to put this kind of information into our commit messages. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D66709/new/ https://reviews.llvm.org/D66709 From llvm-commits at lists.llvm.org Tue Oct 8 06:14:56 2019 From: llvm-commits at lists.llvm.org (=?utf-8?q?Nicolai_H=C3=A4hnle_via_Phabricator?= via llvm-commits) Date: Tue, 08 Oct 2019 13:14:56 +0000 (UTC) Subject: [PATCH] D68424: [tblgen] Add getOperatorAsDef() to Record In-Reply-To: References: Message-ID: <948fc9ccb9cbd0982093be0710b7502e@localhost.localdomain> nhaehnle accepted this revision. 
nhaehnle added a comment.
This revision is now accepted and ready to land.

Thanks, this makes a lot of sense. LGTM.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68424/new/

https://reviews.llvm.org/D68424



From llvm-commits at lists.llvm.org Tue Oct 8 06:23:58 2019
From: llvm-commits at lists.llvm.org (Nicolai Hähnle via Phabricator via llvm-commits)
Date: Tue, 08 Oct 2019 13:23:58 +0000 (UTC)
Subject: [PATCH] D64911: [AMDGPU] Extend the SI Load/Store optimizer
In-Reply-To:
References:
Message-ID:

nhaehnle added inline comments.


================
Comment at: lib/Target/AMDGPU/SILoadStoreOptimizer.cpp:322-325
+  if (AMDGPU::getNamedOperandIdx(Opc, AMDGPU::OpName::vaddr) == -1)
+    return UNKNOWN;
+  if (!TII.get(Opc).mayLoad() || TII.isGather4(Opc))
+    return UNKNOWN;
----------------
This should probably check mayStore instead of mayLoad: we want to exclude both stores and atomics.

You could also move the check for TFE and LWE to here.


================
Comment at: lib/Target/AMDGPU/SILoadStoreOptimizer.cpp:1051
+  for (unsigned I = 1, E = (*CI.I).getNumOperands(); I != E; ++I) {
+    (I == DMaskIdx) ? MIB.addImm(MergedDMask) : MIB.add((*CI.I).getOperand(I));
+  }
----------------
Please use an if-statement.


Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D64911/new/

https://reviews.llvm.org/D64911



From llvm-commits at lists.llvm.org Tue Oct 8 06:24:02 2019
From: llvm-commits at lists.llvm.org (Phabricator via Phabricator via llvm-commits)
Date: Tue, 08 Oct 2019 13:24:02 +0000 (UTC)
Subject: [PATCH] D67990: [aarch64] fix generation of fp16 fmls
In-Reply-To:
References:
Message-ID:

This revision was automatically updated to reflect the committed changes.
Closed by commit rGd0d52edae92f: fix fmls fp16 (authored by Sebastian Pop <spop at amazon.com>).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D67990/new/

https://reviews.llvm.org/D67990

Files:
  llvm/include/llvm/CodeGen/MachineCombinerPattern.h
  llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
  llvm/test/CodeGen/AArch64/fp16-fmla.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D67990.223837.patch
Type: text/x-patch
Size: 6279 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Tue Oct 8 06:33:09 2019
From: llvm-commits at lists.llvm.org (Amaury SECHET via Phabricator via llvm-commits)
Date: Tue, 08 Oct 2019 13:33:09 +0000 (UTC)
Subject: [PATCH] D68232: [DAGCombine] Match a greater range of rotate when not all bits are demanded.
In-Reply-To:
References:
Message-ID:

deadalnix marked an inline comment as done.
deadalnix added inline comments.


================
Comment at: lib/CodeGen/SelectionDAG/DAGCombiner.cpp:6376
+  return MatchRotate(LHS, RHS, DL,
+                     APInt::getMaxValue(LHS.getValueType().getSizeInBits()));
+}
----------------
RKSimon wrote:
> Shouldn't this be: APInt::getMaxValue(LHS.getScalarValueSizeInBits()) ?
>
> TBH I'd prefer getAllOnesValue as well as it avoids the signed/unsigned ambiguity of getMaxValue.
That wouldn't generate the proper pattern for vectors and would leave things such as only the last element is demanded.


Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68232/new/

https://reviews.llvm.org/D68232



From llvm-commits at lists.llvm.org Tue Oct 8 06:38:42 2019
From: llvm-commits at lists.llvm.org (Kevin P. Neal via llvm-commits)
Date: Tue, 08 Oct 2019 13:38:42 -0000
Subject: [llvm] r374045 - Restore documentation that 'svn update' unexpectedly yanked out from under me.
Message-ID: <20191008133842.3743F894D1@lists.llvm.org>

Author: kpn
Date: Tue Oct 8 06:38:42 2019
New Revision: 374045

URL: http://llvm.org/viewvc/llvm-project?rev=374045&view=rev
Log:
Restore documentation that 'svn update' unexpectedly yanked out from under me.
Added: llvm/trunk/docs/ProgrammingDocumentation.rst llvm/trunk/docs/SubsystemDocumentation.rst Added: llvm/trunk/docs/ProgrammingDocumentation.rst URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/docs/ProgrammingDocumentation.rst?rev=374045&view=auto ============================================================================== --- llvm/trunk/docs/ProgrammingDocumentation.rst (added) +++ llvm/trunk/docs/ProgrammingDocumentation.rst Tue Oct 8 06:38:42 2019 @@ -0,0 +1,54 @@ +Programming Documentation +========================= + +For developers of applications which use LLVM as a library. + +.. toctree:: + :hidden: + + Atomics + CommandLine + ExtendingLLVM + HowToSetUpLLVMStyleRTTI + ProgrammersManual + Extensions + LibFuzzer + FuzzingLLVM + ScudoHardenedAllocator + OptBisect + GwpAsan + +:doc:`Atomics` + Information about LLVM's concurrency model. + +:doc:`ProgrammersManual` + Introduction to the general layout of the LLVM sourcebase, important classes + and APIs, and some tips & tricks. + +:doc:`Extensions` + LLVM-specific extensions to tools and formats LLVM seeks compatibility with. + +:doc:`CommandLine` + Provides information on using the command line parsing library. + +:doc:`HowToSetUpLLVMStyleRTTI` + How to make ``isa<>``, ``dyn_cast<>``, etc. available for clients of your + class hierarchy. + +:doc:`ExtendingLLVM` + Look here to see how to add instructions and intrinsics to LLVM. + +:doc:`LibFuzzer` + A library for writing in-process guided fuzzers. + +:doc:`FuzzingLLVM` + Information on writing and using Fuzzers to find bugs in LLVM. + +:doc:`ScudoHardenedAllocator` + A library that implements a security-hardened `malloc()`. + +:doc:`OptBisect` + A command line option for debugging optimization-induced failures. + +:doc:`GwpAsan` + A sampled heap memory error detection toolkit designed for production use. 
\ No newline at end of file Added: llvm/trunk/docs/SubsystemDocumentation.rst URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/docs/SubsystemDocumentation.rst?rev=374045&view=auto ============================================================================== --- llvm/trunk/docs/SubsystemDocumentation.rst (added) +++ llvm/trunk/docs/SubsystemDocumentation.rst Tue Oct 8 06:38:42 2019 @@ -0,0 +1,206 @@ +.. _index-subsystem-docs: + +Subsystem Documentation +======================= + +For API clients and LLVM developers. + +.. toctree:: + :hidden: + + AliasAnalysis + MemorySSA + BitCodeFormat + BlockFrequencyTerminology + BranchWeightMetadata + Bugpoint + CodeGenerator + ExceptionHandling + AddingConstrainedIntrinsics + LinkTimeOptimization + SegmentedStacks + TableGenFundamentals + TableGen/index + DebuggingJITedCode + GoldPlugin + MarkedUpDisassembly + SystemLibrary + SupportLibrary + SourceLevelDebugging + Vectorizers + WritingAnLLVMBackend + GarbageCollection + WritingAnLLVMPass + HowToUseAttributes + NVPTXUsage + AMDGPUUsage + StackMaps + InAlloca + BigEndianNEON + CoverageMappingFormat + Statepoints + MergeFunctions + TypeMetadata + TransformMetadata + FaultMaps + Coroutines + GlobalISel + XRay + XRayExample + XRayFDRFormat + PDB/index + CFIVerify + SpeculativeLoadHardening + StackSafetyAnalysis + LoopTerminology + DependenceGraphs/index + +:doc:`WritingAnLLVMPass` + Information on how to write LLVM transformations and analyses. + +:doc:`WritingAnLLVMBackend` + Information on how to write LLVM backends for machine targets. + +:doc:`CodeGenerator` + The design and implementation of the LLVM code generator. Useful if you are + working on retargetting LLVM to a new architecture, designing a new codegen + pass, or enhancing existing components. + +:doc:`TableGen ` + Describes the TableGen tool, which is used heavily by the LLVM code + generator. 
+ +:doc:`AliasAnalysis` + Information on how to write a new alias analysis implementation or how to + use existing analyses. + +:doc:`MemorySSA` + Information about the MemorySSA utility in LLVM, as well as how to use it. + +:doc:`GarbageCollection` + The interfaces source-language compilers should use for compiling GC'd + programs. + +:doc:`Source Level Debugging with LLVM ` + This document describes the design and philosophy behind the LLVM + source-level debugger. + +:doc:`Vectorizers` + This document describes the current status of vectorization in LLVM. + +:doc:`ExceptionHandling` + This document describes the design and implementation of exception handling + in LLVM. + +:doc:`AddingConstrainedIntrinsics` + Gives the steps necessary when adding a new constrained math intrinsic + to LLVM. + +:doc:`Bugpoint` + Automatic bug finder and test-case reducer description and usage + information. + +:doc:`BitCodeFormat` + This describes the file format and encoding used for LLVM "bc" files. + +:doc:`Support Library ` + This document describes the LLVM Support Library (``lib/Support``) and + how to keep LLVM source code portable + +:doc:`LinkTimeOptimization` + This document describes the interface between LLVM intermodular optimizer + and the linker and its design + +:doc:`GoldPlugin` + How to build your programs with link-time optimization on Linux. + +:doc:`DebuggingJITedCode` + How to debug JITed code with GDB. + +:doc:`MCJITDesignAndImplementation` + Describes the inner workings of MCJIT execution engine. + +:doc:`ORCv2` + Describes the design and implementation of the ORC APIs, including some + usage examples, and a guide for users transitioning from ORCv1 to ORCv2. + +:doc:`BranchWeightMetadata` + Provides information about Branch Prediction Information. + +:doc:`BlockFrequencyTerminology` + Provides information about terminology used in the ``BlockFrequencyInfo`` + analysis pass. 
+ +:doc:`SegmentedStacks` + This document describes segmented stacks and how they are used in LLVM. + +:doc:`MarkedUpDisassembly` + This document describes the optional rich disassembly output syntax. + +:doc:`HowToUseAttributes` + Answers some questions about the new Attributes infrastructure. + +:doc:`NVPTXUsage` + This document describes using the NVPTX backend to compile GPU kernels. + +:doc:`AMDGPUUsage` + This document describes using the AMDGPU backend to compile GPU kernels. + +:doc:`StackMaps` + LLVM support for mapping instruction addresses to the location of + values and allowing code to be patched. + +:doc:`BigEndianNEON` + LLVM's support for generating NEON instructions on big endian ARM targets is + somewhat nonintuitive. This document explains the implementation and rationale. + +:doc:`CoverageMappingFormat` + This describes the format and encoding used for LLVM’s code coverage mapping. + +:doc:`Statepoints` + This describes a set of experimental extensions for garbage + collection support. + +:doc:`MergeFunctions` + Describes functions merging optimization. + +:doc:`InAlloca` + Description of the ``inalloca`` argument attribute. + +:doc:`FaultMaps` + LLVM support for folding control flow into faulting machine instructions. + +:doc:`CompileCudaWithLLVM` + LLVM support for CUDA. + +:doc:`Coroutines` + LLVM support for coroutines. + +:doc:`GlobalISel` + This describes the prototype instruction selection replacement, GlobalISel. + +:doc:`XRay` + High-level documentation of how to use XRay in LLVM. + +:doc:`XRayExample` + An example of how to debug an application with XRay. + +:doc:`The Microsoft PDB File Format ` + A detailed description of the Microsoft PDB (Program Database) file format. + +:doc:`CFIVerify` + A description of the verification tool for Control Flow Integrity. + +:doc:`SpeculativeLoadHardening` + A description of the Speculative Load Hardening mitigation for Spectre v1. 
+ +:doc:`StackSafetyAnalysis` + This document describes the design of the stack safety analysis of local + variables. + +:doc:`LoopTerminology` + A document describing Loops and associated terms as used in LLVM. + +:doc:`Dependence Graphs ` + A description of the design of the various dependence graphs such as + the DDG (Data Dependence Graph). From llvm-commits at lists.llvm.org Tue Oct 8 06:42:26 2019 From: llvm-commits at lists.llvm.org (Owen Reynolds via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 13:42:26 +0000 (UTC) Subject: [PATCH] D68033: [llvm-ar] Make paths case insensitive when on windows In-Reply-To: References: Message-ID: <8793bb491dc4b7adbaae25d56c8bf523@localhost.localdomain> gbreynoo updated this revision to Diff 223839. gbreynoo added a comment. Removed unneeded `const`, swapped use of `ifndef` and added comment to `comparePaths` function. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68033/new/ https://reviews.llvm.org/D68033 Files: llvm/test/tools/llvm-ar/non-windows-name-case.test llvm/test/tools/llvm-ar/windows-name-case.test llvm/tools/llvm-ar/llvm-ar.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D68033.223839.patch Type: text/x-patch Size: 5657 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 06:51:39 2019 From: llvm-commits at lists.llvm.org (Dave Green via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 13:51:39 +0000 (UTC) Subject: [PATCH] D68400: [NFC][TTI] Add Alignment for isLegalMasked[Load/Store] In-Reply-To: References: Message-ID: dmgreen accepted this revision. dmgreen added a comment. This revision is now accepted and ready to land. Thanks! LGTM ================ Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:4543 return isa(I) ? 
-             !(isLegalMaskedLoad(Ty, Ptr) || isLegalMaskedGather(Ty))
-           : !(isLegalMaskedStore(Ty, Ptr) || isLegalMaskedScatter(Ty));
+             !(isLegalMaskedLoad(Ty, Ptr, Alignment) || isLegalMaskedGather(Ty))
+           : !(isLegalMaskedStore(Ty, Ptr, Alignment) || isLegalMaskedScatter(Ty));
----------------
Does this need to be a MaybeAlign(Alignment)? Because the constructor is explicit, as far as I understand (hence, needs to look like the others).


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68400/new/

https://reviews.llvm.org/D68400



From llvm-commits at lists.llvm.org Tue Oct 8 06:51:41 2019
From: llvm-commits at lists.llvm.org (Nicolai Hähnle via Phabricator via llvm-commits)
Date: Tue, 08 Oct 2019 13:51:41 +0000 (UTC)
Subject: [PATCH] D68453: TableGen: Allow 'a+b' in TableGen language
In-Reply-To:
References:
Message-ID: <6183ae08d82821e919df47b53879e275@localhost.localdomain>

nhaehnle added a comment.

This does seem like a useful addition (heh) to the grammar.

There is one thing that can go wrong in the future: parentheses are already reserved for DAGs. Brackets and braces also already have their own purpose. We could designate `((` without space for parenthesis, with the risk that `( (` and `((` can mean different things, with the first one making sense in DAGs. Though DAG operators are usually defs, so the risk of weirdness may be acceptable.

I also agree that a shunting-based algorithm is the correct direction to go in long-term, and I'm concerned about adding code that adds momentum in the wrong direction. Right now, it's still easy to turn back and do things right, but if we add the change as-is the temptation will be great for the next person to come along and add even more code that goes off into the woods. Can we please get this right **now**?
We can treat `#` and `+` as left-associative operators of the same precedence right now (and not add anything else). That should keep things somewhat simpler, and at the very least we should be able to avoid using recursion for parsing the RHS.


================
Comment at: llvm/lib/TableGen/TGParser.cpp:2184-2186
+    Init *RHSResult = ParseValue(CurRec, ItemType, ParseNameMode);
+    Result = BinOpInit::get(BinOpInit::ADD, LHS, RHSResult,
+                            IntRecTy::get())->Fold(CurRec);
----------------
lebedev.ri wrote:
> Do you want to make any sanity checks about types of lhs, rhs, final type?
Yes. `resolveTypes` should be used on the LHS and RHS types and the result used as the type. That makes it consistent with `!add` handling. You don't particularly need to check that it's an IntRecTy as far as I'm concerned.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68453/new/

https://reviews.llvm.org/D68453



From llvm-commits at lists.llvm.org Tue Oct 8 06:51:42 2019
From: llvm-commits at lists.llvm.org (David Tellenbach via Phabricator via llvm-commits)
Date: Tue, 08 Oct 2019 13:51:42 +0000 (UTC)
Subject: [PATCH] D68639: [MachineScheduler] Add a flag to enable scheduling of cfi instructions
Message-ID:

tellenbach created this revision.
Herald added subscribers: llvm-commits, javed.absar, hiraditya, aprantl, MatzeB.
Herald added a project: LLVM.

Add a flag `--schedule-cfiinstrs` to enable the scheduling of cfi
instructions during instruction scheduling. If this flag is set to `false`
(the current default), cfi instructions act as scheduling boundaries.
This can lead to different scheduling regions, and therefore to different
generated assembly, depending on the presence of cfi instructions. Since
some targets insert cfi instructions only when debug information is
generated, scheduling cfi instructions improves consistency between debug
and non-debug mode.
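
As a rough illustration of the boundary behaviour described above, here is a
simplified model. It is not the MachineScheduler code: `Instr`, `Kind` and
`splitRegions` are hypothetical names, and treating a schedulable cfi
instruction as transparent to region formation is an assumption made only for
this sketch.

```cpp
#include <string>
#include <vector>

// Hypothetical instruction model; not LLVM's MachineInstr.
enum class Kind { Normal, CFI };

struct Instr {
  std::string Name;
  Kind K;
};

// Splits a block into scheduling regions. With ScheduleCFI == false a cfi
// instruction ends the current region (it acts as a boundary). With
// ScheduleCFI == true it is skipped here, which is an assumption of this
// sketch rather than a statement about the actual patch.
std::vector<std::vector<std::string>>
splitRegions(const std::vector<Instr> &Block, bool ScheduleCFI) {
  std::vector<std::vector<std::string>> Regions;
  std::vector<std::string> Current;
  for (const Instr &I : Block) {
    if (I.K == Kind::CFI) {
      if (ScheduleCFI)
        continue; // no boundary: the region keeps growing
      if (!Current.empty())
        Regions.push_back(Current);
      Current.clear();
      continue;
    }
    Current.push_back(I.Name);
  }
  if (!Current.empty())
    Regions.push_back(Current);
  return Regions;
}
```

In this model a block `add, cfi, mul` forms two regions when cfi instructions
are boundaries and a single region otherwise, which is the same single region
a build without the cfi instruction would see.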

See also: http://lists.llvm.org/pipermail/llvm-dev/2019-September/135433.html

This resolves PR37240.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D68639

Files:
  llvm/include/llvm/CodeGen/MachineScheduler.h
  llvm/include/llvm/CodeGen/ScheduleDAGInstrs.h
  llvm/lib/CodeGen/MachineScheduler.cpp
  llvm/lib/CodeGen/ScheduleDAGInstrs.cpp
  llvm/test/CodeGen/AArch64/cfiinstrs-no-uwtable-scheduling.ll
  llvm/test/CodeGen/AArch64/cfiinstrs-scheduling.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68639.223842.patch
Type: text/x-patch
Size: 11747 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Tue Oct 8 07:01:07 2019
From: llvm-commits at lists.llvm.org (Jan Beich via Phabricator via llvm-commits)
Date: Tue, 08 Oct 2019 14:01:07 +0000 (UTC)
Subject: [PATCH] D68045: [builtins] Unbreak build on FreeBSD armv7 after D60351
In-Reply-To:
References:
Message-ID:

jbeich added a comment.

Can someone land this change? I only tested it on FreeBSD because the contributing guide doesn't cover how to test elsewhere, i.e. on platforms one doesn't have access to.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68045/new/

https://reviews.llvm.org/D68045



From llvm-commits at lists.llvm.org Tue Oct 8 07:10:26 2019
From: llvm-commits at lists.llvm.org (Kevin P. Neal via llvm-commits)
Date: Tue, 08 Oct 2019 14:10:26 -0000
Subject: [llvm] r374049 - Nope, I'm wrong. It looks like someone else removed these on purpose and
Message-ID: <20191008141026.3904C8BD5E@lists.llvm.org>

Author: kpn
Date: Tue Oct 8 07:10:26 2019
New Revision: 374049

URL: http://llvm.org/viewvc/llvm-project?rev=374049&view=rev
Log:
Nope, I'm wrong. It looks like someone else removed these on purpose and
it just happened to break the bot right when I did my push. So I'm undoing
this morning's incorrect push.

I've also kicked off an email to hopefully get the bot fixed the correct way.
Removed: llvm/trunk/docs/ProgrammingDocumentation.rst llvm/trunk/docs/SubsystemDocumentation.rst Removed: llvm/trunk/docs/ProgrammingDocumentation.rst URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/docs/ProgrammingDocumentation.rst?rev=374048&view=auto ============================================================================== --- llvm/trunk/docs/ProgrammingDocumentation.rst (original) +++ llvm/trunk/docs/ProgrammingDocumentation.rst (removed) @@ -1,54 +0,0 @@ -Programming Documentation -========================= - -For developers of applications which use LLVM as a library. - -.. toctree:: - :hidden: - - Atomics - CommandLine - ExtendingLLVM - HowToSetUpLLVMStyleRTTI - ProgrammersManual - Extensions - LibFuzzer - FuzzingLLVM - ScudoHardenedAllocator - OptBisect - GwpAsan - -:doc:`Atomics` - Information about LLVM's concurrency model. - -:doc:`ProgrammersManual` - Introduction to the general layout of the LLVM sourcebase, important classes - and APIs, and some tips & tricks. - -:doc:`Extensions` - LLVM-specific extensions to tools and formats LLVM seeks compatibility with. - -:doc:`CommandLine` - Provides information on using the command line parsing library. - -:doc:`HowToSetUpLLVMStyleRTTI` - How to make ``isa<>``, ``dyn_cast<>``, etc. available for clients of your - class hierarchy. - -:doc:`ExtendingLLVM` - Look here to see how to add instructions and intrinsics to LLVM. - -:doc:`LibFuzzer` - A library for writing in-process guided fuzzers. - -:doc:`FuzzingLLVM` - Information on writing and using Fuzzers to find bugs in LLVM. - -:doc:`ScudoHardenedAllocator` - A library that implements a security-hardened `malloc()`. - -:doc:`OptBisect` - A command line option for debugging optimization-induced failures. - -:doc:`GwpAsan` - A sampled heap memory error detection toolkit designed for production use. 
\ No newline at end of file Removed: llvm/trunk/docs/SubsystemDocumentation.rst URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/docs/SubsystemDocumentation.rst?rev=374048&view=auto ============================================================================== --- llvm/trunk/docs/SubsystemDocumentation.rst (original) +++ llvm/trunk/docs/SubsystemDocumentation.rst (removed) @@ -1,206 +0,0 @@ -.. _index-subsystem-docs: - -Subsystem Documentation -======================= - -For API clients and LLVM developers. - -.. toctree:: - :hidden: - - AliasAnalysis - MemorySSA - BitCodeFormat - BlockFrequencyTerminology - BranchWeightMetadata - Bugpoint - CodeGenerator - ExceptionHandling - AddingConstrainedIntrinsics - LinkTimeOptimization - SegmentedStacks - TableGenFundamentals - TableGen/index - DebuggingJITedCode - GoldPlugin - MarkedUpDisassembly - SystemLibrary - SupportLibrary - SourceLevelDebugging - Vectorizers - WritingAnLLVMBackend - GarbageCollection - WritingAnLLVMPass - HowToUseAttributes - NVPTXUsage - AMDGPUUsage - StackMaps - InAlloca - BigEndianNEON - CoverageMappingFormat - Statepoints - MergeFunctions - TypeMetadata - TransformMetadata - FaultMaps - Coroutines - GlobalISel - XRay - XRayExample - XRayFDRFormat - PDB/index - CFIVerify - SpeculativeLoadHardening - StackSafetyAnalysis - LoopTerminology - DependenceGraphs/index - -:doc:`WritingAnLLVMPass` - Information on how to write LLVM transformations and analyses. - -:doc:`WritingAnLLVMBackend` - Information on how to write LLVM backends for machine targets. - -:doc:`CodeGenerator` - The design and implementation of the LLVM code generator. Useful if you are - working on retargetting LLVM to a new architecture, designing a new codegen - pass, or enhancing existing components. - -:doc:`TableGen ` - Describes the TableGen tool, which is used heavily by the LLVM code - generator. - -:doc:`AliasAnalysis` - Information on how to write a new alias analysis implementation or how to - use existing analyses. 
- -:doc:`MemorySSA` - Information about the MemorySSA utility in LLVM, as well as how to use it. - -:doc:`GarbageCollection` - The interfaces source-language compilers should use for compiling GC'd - programs. - -:doc:`Source Level Debugging with LLVM ` - This document describes the design and philosophy behind the LLVM - source-level debugger. - -:doc:`Vectorizers` - This document describes the current status of vectorization in LLVM. - -:doc:`ExceptionHandling` - This document describes the design and implementation of exception handling - in LLVM. - -:doc:`AddingConstrainedIntrinsics` - Gives the steps necessary when adding a new constrained math intrinsic - to LLVM. - -:doc:`Bugpoint` - Automatic bug finder and test-case reducer description and usage - information. - -:doc:`BitCodeFormat` - This describes the file format and encoding used for LLVM "bc" files. - -:doc:`Support Library ` - This document describes the LLVM Support Library (``lib/Support``) and - how to keep LLVM source code portable - -:doc:`LinkTimeOptimization` - This document describes the interface between LLVM intermodular optimizer - and the linker and its design - -:doc:`GoldPlugin` - How to build your programs with link-time optimization on Linux. - -:doc:`DebuggingJITedCode` - How to debug JITed code with GDB. - -:doc:`MCJITDesignAndImplementation` - Describes the inner workings of MCJIT execution engine. - -:doc:`ORCv2` - Describes the design and implementation of the ORC APIs, including some - usage examples, and a guide for users transitioning from ORCv1 to ORCv2. - -:doc:`BranchWeightMetadata` - Provides information about Branch Prediction Information. - -:doc:`BlockFrequencyTerminology` - Provides information about terminology used in the ``BlockFrequencyInfo`` - analysis pass. - -:doc:`SegmentedStacks` - This document describes segmented stacks and how they are used in LLVM. - -:doc:`MarkedUpDisassembly` - This document describes the optional rich disassembly output syntax. 
- -:doc:`HowToUseAttributes` - Answers some questions about the new Attributes infrastructure. - -:doc:`NVPTXUsage` - This document describes using the NVPTX backend to compile GPU kernels. - -:doc:`AMDGPUUsage` - This document describes using the AMDGPU backend to compile GPU kernels. - -:doc:`StackMaps` - LLVM support for mapping instruction addresses to the location of - values and allowing code to be patched. - -:doc:`BigEndianNEON` - LLVM's support for generating NEON instructions on big endian ARM targets is - somewhat nonintuitive. This document explains the implementation and rationale. - -:doc:`CoverageMappingFormat` - This describes the format and encoding used for LLVM’s code coverage mapping. - -:doc:`Statepoints` - This describes a set of experimental extensions for garbage - collection support. - -:doc:`MergeFunctions` - Describes functions merging optimization. - -:doc:`InAlloca` - Description of the ``inalloca`` argument attribute. - -:doc:`FaultMaps` - LLVM support for folding control flow into faulting machine instructions. - -:doc:`CompileCudaWithLLVM` - LLVM support for CUDA. - -:doc:`Coroutines` - LLVM support for coroutines. - -:doc:`GlobalISel` - This describes the prototype instruction selection replacement, GlobalISel. - -:doc:`XRay` - High-level documentation of how to use XRay in LLVM. - -:doc:`XRayExample` - An example of how to debug an application with XRay. - -:doc:`The Microsoft PDB File Format ` - A detailed description of the Microsoft PDB (Program Database) file format. - -:doc:`CFIVerify` - A description of the verification tool for Control Flow Integrity. - -:doc:`SpeculativeLoadHardening` - A description of the Speculative Load Hardening mitigation for Spectre v1. - -:doc:`StackSafetyAnalysis` - This document describes the design of the stack safety analysis of local - variables. - -:doc:`LoopTerminology` - A document describing Loops and associated terms as used in LLVM. 
-
-:doc:`Dependence Graphs `
-   A description of the design of the various dependence graphs such as
-   the DDG (Data Dependence Graph).


From llvm-commits at lists.llvm.org Tue Oct 8 07:10:18 2019
From: llvm-commits at lists.llvm.org (Florian Hahn via Phabricator via llvm-commits)
Date: Tue, 08 Oct 2019 14:10:18 +0000 (UTC)
Subject: [PATCH] D68639: [MachineScheduler] Add a flag to enable scheduling of cfi instructions
In-Reply-To:
References:
Message-ID: <8b6d3e84c04505f819c9733973a1228e@localhost.localdomain>

fhahn added a comment.

I've not looked too closely yet, but isn't that basically the same as what we do for debug values? If that's the case, I think we should consider unifying the code to handle debug & CFI.

Specifically, would it be possible to just extend the current handling of debug instructions to also handle CFI instructions? That should only need some renaming and a few code changes, I think.


================
Comment at: llvm/lib/CodeGen/MachineScheduler.cpp:449
+  /// boundaries
+  if (CFIInstructionScheduling && MI->isCFIInstruction()) {
+    return false;
----------------
Can that be folded into the return, like the other conditions? Also, please update the comment with details about CFIInstructions.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68639/new/

https://reviews.llvm.org/D68639



From llvm-commits at lists.llvm.org Tue Oct 8 07:10:19 2019
From: llvm-commits at lists.llvm.org (Amaury SECHET via Phabricator via llvm-commits)
Date: Tue, 08 Oct 2019 14:10:19 +0000 (UTC)
Subject: [PATCH] D68232: [DAGCombine] Match a greater range of rotate when not all bits are demanded.
In-Reply-To:
References:
Message-ID:

deadalnix updated this revision to Diff 223844.
deadalnix added a comment.

Add test for vector - but nothing changes for them (see rL374043) and use isSubsetOf and getAllOnesValue.


Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68232/new/

https://reviews.llvm.org/D68232

Files:
  lib/CodeGen/SelectionDAG/DAGCombiner.cpp
  test/CodeGen/X86/rot16.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68232.223844.patch
Type: text/x-patch
Size: 5398 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Tue Oct 8 07:15:32 2019
From: llvm-commits at lists.llvm.org (Pavel Labath via llvm-commits)
Date: Tue, 08 Oct 2019 14:15:32 -0000
Subject: [llvm] r374051 - Object/minidump: Add support for the MemoryInfoList stream
Message-ID: <20191008141533.0050883D28@lists.llvm.org>

Author: labath
Date: Tue Oct 8 07:15:32 2019
New Revision: 374051

URL: http://llvm.org/viewvc/llvm-project?rev=374051&view=rev
Log:
Object/minidump: Add support for the MemoryInfoList stream

Summary:
This patch adds the definitions of the constants and structures necessary
to interpret the MemoryInfoList minidump stream, as well as the
object::MinidumpFile interface to access the stream.

While the code is fairly simple, there is one important deviation from
the other minidump streams, which is worth calling out explicitly.
Unlike other "List" streams, the size of the records inside
MemoryInfoList stream is not known statically. Instead it is described
in the stream header. This makes it impossible to return
ArrayRef from the accessor method, as it is done with other
streams. Instead, I create an iterator class, which can be parameterized
by the runtime size of the structure, and return
iterator_range instead.
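
The idea in the summary above, an iterator whose step size is only known at
run time, can be sketched roughly as follows. This is a standalone
illustration and not the code from the patch: `Record`, `StridedIterator`,
`StridedRange`, `makeRange` and `totalSize` are hypothetical names, and the
record layout is made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical record; stands in for the fixed, known prefix of each entry.
struct Record {
  uint64_t Base;
  uint64_t Size;
};

// Walks a byte buffer in steps of Stride bytes. Stride comes from a header
// read at run time, so it may be larger than sizeof(Record).
class StridedIterator {
public:
  StridedIterator(const uint8_t *Pos, size_t Stride)
      : Pos(Pos), Stride(Stride) {}

  // Copy out the known prefix; memcpy avoids misaligned reads.
  Record operator*() const {
    Record R;
    std::memcpy(&R, Pos, sizeof(Record));
    return R;
  }
  StridedIterator &operator++() {
    Pos += Stride;
    return *this;
  }
  bool operator==(const StridedIterator &O) const { return Pos == O.Pos; }
  bool operator!=(const StridedIterator &O) const { return Pos != O.Pos; }

private:
  const uint8_t *Pos;
  size_t Stride;
};

// Minimal begin/end pair so the iterator works in a range-based for loop.
struct StridedRange {
  StridedIterator Begin, End;
  StridedIterator begin() const { return Begin; }
  StridedIterator end() const { return End; }
};

// Builds a range over Count entries of Stride bytes each.
inline StridedRange makeRange(const std::vector<uint8_t> &Data, size_t Stride,
                              size_t Count) {
  assert(Stride >= sizeof(Record) && Data.size() >= Count * Stride);
  const uint8_t *First = Data.data();
  return {StridedIterator(First, Stride),
          StridedIterator(First + Count * Stride, Stride)};
}

// Example consumer: sums the Size field of every entry.
inline uint64_t totalSize(const std::vector<uint8_t> &Data, size_t Stride,
                          size_t Count) {
  uint64_t Total = 0;
  for (Record R : makeRange(Data, Stride, Count))
    Total += R.Size;
  return Total;
}
```

A plain array view cannot express this, because its element size is fixed at
compile time; carrying the stride inside the iterator lets a reader skip
trailing fields it does not know about.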
Reviewers: amccarth, jhenderson, clayborg Subscribers: JosephTremoulet, zturner, markmentovai, lldb-commits, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68210 Modified: llvm/trunk/include/llvm/BinaryFormat/Minidump.h llvm/trunk/include/llvm/BinaryFormat/MinidumpConstants.def llvm/trunk/include/llvm/Object/Minidump.h llvm/trunk/lib/Object/Minidump.cpp llvm/trunk/unittests/Object/MinidumpTest.cpp Modified: llvm/trunk/include/llvm/BinaryFormat/Minidump.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/BinaryFormat/Minidump.h?rev=374051&r1=374050&r2=374051&view=diff ============================================================================== --- llvm/trunk/include/llvm/BinaryFormat/Minidump.h (original) +++ llvm/trunk/include/llvm/BinaryFormat/Minidump.h Tue Oct 8 07:15:32 2019 @@ -18,6 +18,7 @@ #ifndef LLVM_BINARYFORMAT_MINIDUMP_H #define LLVM_BINARYFORMAT_MINIDUMP_H +#include "llvm/ADT/BitmaskEnum.h" #include "llvm/ADT/DenseMapInfo.h" #include "llvm/Support/Endian.h" @@ -67,6 +68,42 @@ struct MemoryDescriptor { }; static_assert(sizeof(MemoryDescriptor) == 16, ""); +struct MemoryInfoListHeader { + support::ulittle32_t SizeOfHeader; + support::ulittle32_t SizeOfEntry; + support::ulittle64_t NumberOfEntries; +}; +static_assert(sizeof(MemoryInfoListHeader) == 16, ""); + +enum class MemoryProtection : uint32_t { +#define HANDLE_MDMP_PROTECT(CODE, NAME, NATIVENAME) NAME = CODE, +#include "llvm/BinaryFormat/MinidumpConstants.def" + LLVM_MARK_AS_BITMASK_ENUM(/*LargestValue=*/0xffffffffu), +}; + +enum class MemoryState : uint32_t { +#define HANDLE_MDMP_MEMSTATE(CODE, NAME, NATIVENAME) NAME = CODE, +#include "llvm/BinaryFormat/MinidumpConstants.def" +}; + +enum class MemoryType : uint32_t { +#define HANDLE_MDMP_MEMTYPE(CODE, NAME, NATIVENAME) NAME = CODE, +#include "llvm/BinaryFormat/MinidumpConstants.def" +}; + +struct MemoryInfo { + support::ulittle64_t BaseAddress; + support::ulittle64_t AllocationBase; + support::little_t 
AllocationProtect; + support::ulittle32_t Reserved0; + support::ulittle64_t RegionSize; + support::little_t State; + support::little_t Protect; + support::little_t Type; + support::ulittle32_t Reserved1; +}; +static_assert(sizeof(MemoryInfo) == 48, ""); + /// Specifies the location and type of a single stream in the minidump file. The /// minidump stream directory is an array of entries of this type, with its size /// given by Header.NumberOfStreams. Modified: llvm/trunk/include/llvm/BinaryFormat/MinidumpConstants.def URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/BinaryFormat/MinidumpConstants.def?rev=374051&r1=374050&r2=374051&view=diff ============================================================================== --- llvm/trunk/include/llvm/BinaryFormat/MinidumpConstants.def (original) +++ llvm/trunk/include/llvm/BinaryFormat/MinidumpConstants.def Tue Oct 8 07:15:32 2019 @@ -6,8 +6,9 @@ // //===----------------------------------------------------------------------===// -#if !(defined HANDLE_MDMP_STREAM_TYPE || defined HANDLE_MDMP_ARCH || \ - defined HANDLE_MDMP_PLATFORM) +#if !(defined(HANDLE_MDMP_STREAM_TYPE) || defined(HANDLE_MDMP_ARCH) || \ + defined(HANDLE_MDMP_PLATFORM) || defined(HANDLE_MDMP_PROTECT) || \ + defined(HANDLE_MDMP_MEMSTATE) || defined(HANDLE_MDMP_MEMTYPE)) #error "Missing HANDLE_MDMP definition" #endif @@ -23,6 +24,18 @@ #define HANDLE_MDMP_PLATFORM(CODE, NAME) #endif +#ifndef HANDLE_MDMP_PROTECT +#define HANDLE_MDMP_PROTECT(CODE, NAME, NATIVENAME) +#endif + +#ifndef HANDLE_MDMP_MEMSTATE +#define HANDLE_MDMP_MEMSTATE(CODE, NAME, NATIVENAME) +#endif + +#ifndef HANDLE_MDMP_MEMTYPE +#define HANDLE_MDMP_MEMTYPE(CODE, NAME, NATIVENAME) +#endif + HANDLE_MDMP_STREAM_TYPE(0x0003, ThreadList) HANDLE_MDMP_STREAM_TYPE(0x0004, ModuleList) HANDLE_MDMP_STREAM_TYPE(0x0005, MemoryList) @@ -102,6 +115,30 @@ HANDLE_MDMP_PLATFORM(0x8203, Android) // HANDLE_MDMP_PLATFORM(0x8204, PS3) // PS3 HANDLE_MDMP_PLATFORM(0x8205, NaCl) // Native Client 
(NaCl) +HANDLE_MDMP_PROTECT(0x01, NoAccess, PAGE_NO_ACCESS) +HANDLE_MDMP_PROTECT(0x02, ReadOnly, PAGE_READ_ONLY) +HANDLE_MDMP_PROTECT(0x04, ReadWrite, PAGE_READ_WRITE) +HANDLE_MDMP_PROTECT(0x08, WriteCopy, PAGE_WRITE_COPY) +HANDLE_MDMP_PROTECT(0x10, Execute, PAGE_EXECUTE) +HANDLE_MDMP_PROTECT(0x20, ExecuteRead, PAGE_EXECUTE_READ) +HANDLE_MDMP_PROTECT(0x40, ExecuteReadWrite, PAGE_EXECUTE_READ_WRITE) +HANDLE_MDMP_PROTECT(0x80, ExeciteWriteCopy, PAGE_EXECUTE_WRITE_COPY) +HANDLE_MDMP_PROTECT(0x100, Guard, PAGE_GUARD) +HANDLE_MDMP_PROTECT(0x200, NoCache, PAGE_NOCACHE) +HANDLE_MDMP_PROTECT(0x400, WriteCombine, PAGE_WRITECOMBINE) +HANDLE_MDMP_PROTECT(0x40000000, TargetsInvalid, PAGE_TARGETS_INVALID) + +HANDLE_MDMP_MEMSTATE(0x01000, Commit, MEM_COMMIT) +HANDLE_MDMP_MEMSTATE(0x02000, Reserve, MEM_RESERVE) +HANDLE_MDMP_MEMSTATE(0x10000, Free, MEM_FREE) + +HANDLE_MDMP_MEMTYPE(0x0020000, Private, MEM_PRIVATE) +HANDLE_MDMP_MEMTYPE(0x0040000, Mapped, MEM_MAPPED) +HANDLE_MDMP_MEMTYPE(0x1000000, Image, MEM_IMAGE) + #undef HANDLE_MDMP_STREAM_TYPE #undef HANDLE_MDMP_ARCH #undef HANDLE_MDMP_PLATFORM +#undef HANDLE_MDMP_PROTECT +#undef HANDLE_MDMP_MEMSTATE +#undef HANDLE_MDMP_MEMTYPE Modified: llvm/trunk/include/llvm/Object/Minidump.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Object/Minidump.h?rev=374051&r1=374050&r2=374051&view=diff ============================================================================== --- llvm/trunk/include/llvm/Object/Minidump.h (original) +++ llvm/trunk/include/llvm/Object/Minidump.h Tue Oct 8 07:15:32 2019 @@ -11,6 +11,7 @@ #include "llvm/ADT/DenseMap.h" #include "llvm/ADT/StringExtras.h" +#include "llvm/ADT/iterator.h" #include "llvm/BinaryFormat/Minidump.h" #include "llvm/Object/Binary.h" #include "llvm/Support/Error.h" @@ -80,16 +81,56 @@ public: return getListStream(minidump::StreamType::ThreadList); } - /// Returns the list of memory ranges embedded in the MemoryList stream. 
An - /// error is returned if the file does not contain this stream, or if the - /// stream is not large enough to contain the number of memory descriptors - /// declared in the stream header. The consistency of the MemoryDescriptor - /// entries themselves is not checked in any way. + /// Returns the list of descriptors embedded in the MemoryList stream. The + /// descriptors provide the content of interesting regions of memory at the + /// time the minidump was taken. An error is returned if the file does not + /// contain this stream, or if the stream is not large enough to contain the + /// number of memory descriptors declared in the stream header. The + /// consistency of the MemoryDescriptor entries themselves is not checked in + /// any way. Expected> getMemoryList() const { return getListStream( minidump::StreamType::MemoryList); } + class MemoryInfoIterator + : public iterator_facade_base { + public: + MemoryInfoIterator(ArrayRef Storage, size_t Stride) + : Storage(Storage), Stride(Stride) { + assert(Storage.size() % Stride == 0); + } + + bool operator==(const MemoryInfoIterator &R) const { + return Storage.size() == R.Storage.size(); + } + + const minidump::MemoryInfo &operator*() const { + assert(Storage.size() >= sizeof(minidump::MemoryInfo)); + return *reinterpret_cast(Storage.data()); + } + + MemoryInfoIterator &operator++() { + Storage = Storage.drop_front(Stride); + return *this; + } + + private: + ArrayRef Storage; + size_t Stride; + }; + + /// Returns the list of descriptors embedded in the MemoryInfoList stream. The + /// descriptors provide properties (e.g. permissions) of interesting regions + /// of memory at the time the minidump was taken. An error is returned if the + /// file does not contain this stream, or if the stream is not large enough to + /// contain the number of memory descriptors declared in the stream header. + /// The consistency of the MemoryInfoList entries themselves is not checked + /// in any way. 
+ Expected> getMemoryInfoList() const; + private: static Error createError(StringRef Str) { return make_error(Str, object_error::parse_failed); @@ -137,10 +178,10 @@ private: }; template -Expected MinidumpFile::getStream(minidump::StreamType Stream) const { - if (auto OptionalStream = getRawStream(Stream)) { - if (OptionalStream->size() >= sizeof(T)) - return *reinterpret_cast(OptionalStream->data()); +Expected MinidumpFile::getStream(minidump::StreamType Type) const { + if (Optional> Stream = getRawStream(Type)) { + if (Stream->size() >= sizeof(T)) + return *reinterpret_cast(Stream->data()); return createEOFError(); } return createError("No such stream"); @@ -153,10 +194,11 @@ Expected> MinidumpFile::getD // Check for overflow. if (Count > std::numeric_limits::max() / sizeof(T)) return createEOFError(); - auto ExpectedArray = getDataSlice(Data, Offset, sizeof(T) * Count); - if (!ExpectedArray) - return ExpectedArray.takeError(); - return ArrayRef(reinterpret_cast(ExpectedArray->data()), Count); + Expected> Slice = + getDataSlice(Data, Offset, sizeof(T) * Count); + if (!Slice) + return Slice.takeError(); + return ArrayRef(reinterpret_cast(Slice->data()), Count); } } // end namespace object Modified: llvm/trunk/lib/Object/Minidump.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Object/Minidump.cpp?rev=374051&r1=374050&r2=374051&view=diff ============================================================================== --- llvm/trunk/lib/Object/Minidump.cpp (original) +++ llvm/trunk/lib/Object/Minidump.cpp Tue Oct 8 07:15:32 2019 @@ -53,13 +53,30 @@ Expected MinidumpFile::getS return Result; } +Expected> +MinidumpFile::getMemoryInfoList() const { + Optional> Stream = getRawStream(StreamType::MemoryInfoList); + if (!Stream) + return createError("No such stream"); + auto ExpectedHeader = + getDataSliceAs(*Stream, 0, 1); + if (!ExpectedHeader) + return ExpectedHeader.takeError(); + const minidump::MemoryInfoListHeader &H = ExpectedHeader.get()[0]; + Expected> 
Data = + getDataSlice(*Stream, H.SizeOfHeader, H.SizeOfEntry * H.NumberOfEntries); + if (!Data) + return Data.takeError(); + return make_range(MemoryInfoIterator(*Data, H.SizeOfEntry), + MemoryInfoIterator({}, H.SizeOfEntry)); +} + template -Expected> MinidumpFile::getListStream(StreamType Stream) const { - auto OptionalStream = getRawStream(Stream); - if (!OptionalStream) +Expected> MinidumpFile::getListStream(StreamType Type) const { + Optional> Stream = getRawStream(Type); + if (!Stream) return createError("No such stream"); - auto ExpectedSize = - getDataSliceAs(*OptionalStream, 0, 1); + auto ExpectedSize = getDataSliceAs(*Stream, 0, 1); if (!ExpectedSize) return ExpectedSize.takeError(); @@ -69,10 +86,10 @@ Expected> MinidumpFile::getL // Some producers insert additional padding bytes to align the list to an // 8-byte boundary. Check for that by comparing the list size with the overall // stream size. - if (ListOffset + sizeof(T) * ListSize < OptionalStream->size()) + if (ListOffset + sizeof(T) * ListSize < Stream->size()) ListOffset = 8; - return getDataSliceAs(*OptionalStream, ListOffset, ListSize); + return getDataSliceAs(*Stream, ListOffset, ListSize); } template Expected> MinidumpFile::getListStream(StreamType) const; @@ -109,13 +126,14 @@ MinidumpFile::create(MemoryBufferRef Sou return ExpectedStreams.takeError(); DenseMap StreamMap; - for (const auto &Stream : llvm::enumerate(*ExpectedStreams)) { - StreamType Type = Stream.value().Type; - const LocationDescriptor &Loc = Stream.value().Location; - - auto ExpectedStream = getDataSlice(Data, Loc.RVA, Loc.DataSize); - if (!ExpectedStream) - return ExpectedStream.takeError(); + for (const auto &StreamDescriptor : llvm::enumerate(*ExpectedStreams)) { + StreamType Type = StreamDescriptor.value().Type; + const LocationDescriptor &Loc = StreamDescriptor.value().Location; + + Expected> Stream = + getDataSlice(Data, Loc.RVA, Loc.DataSize); + if (!Stream) + return Stream.takeError(); if (Type == StreamType::Unused 
&& Loc.DataSize == 0) { // Ignore dummy streams. This is technically ill-formed, but a number of @@ -128,7 +146,7 @@ MinidumpFile::create(MemoryBufferRef Sou return createError("Cannot handle one of the minidump streams"); // Update the directory map, checking for duplicate stream types. - if (!StreamMap.try_emplace(Type, Stream.index()).second) + if (!StreamMap.try_emplace(Type, StreamDescriptor.index()).second) return createError("Duplicate stream type"); } Modified: llvm/trunk/unittests/Object/MinidumpTest.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/unittests/Object/MinidumpTest.cpp?rev=374051&r1=374050&r2=374051&view=diff ============================================================================== --- llvm/trunk/unittests/Object/MinidumpTest.cpp (original) +++ llvm/trunk/unittests/Object/MinidumpTest.cpp Tue Oct 8 07:15:32 2019 @@ -511,3 +511,202 @@ TEST(MinidumpFile, getMemoryList) { EXPECT_EQ(0x00090807u, MD.Memory.RVA); } } + +TEST(MinidumpFile, getMemoryInfoList) { + std::vector OneEntry{ + // Header + 'M', 'D', 'M', 'P', 0x93, 0xa7, 0, 0, // Signature, Version + 1, 0, 0, 0, // NumberOfStreams, + 32, 0, 0, 0, // StreamDirectoryRVA + 0, 1, 2, 3, 4, 5, 6, 7, // Checksum, TimeDateStamp + 0, 0, 0, 0, 0, 0, 0, 0, // Flags + // Stream Directory + 16, 0, 0, 0, 64, 0, 0, 0, // Type, DataSize, + 44, 0, 0, 0, // RVA + // MemoryInfoListHeader + 16, 0, 0, 0, 48, 0, 0, 0, // SizeOfHeader, SizeOfEntry + 1, 0, 0, 0, 0, 0, 0, 0, // NumberOfEntries + // MemoryInfo + 0, 1, 2, 3, 4, 5, 6, 7, // BaseAddress + 8, 9, 0, 1, 2, 3, 4, 5, // AllocationBase + 16, 0, 0, 0, 6, 7, 8, 9, // AllocationProtect, Reserved0 + 0, 1, 2, 3, 4, 5, 6, 7, // RegionSize + 0, 16, 0, 0, 32, 0, 0, 0, // State, Protect + 0, 0, 2, 0, 8, 9, 0, 1, // Type, Reserved1 + }; + + // Same as before, but the list header is larger. 
+ std::vector BiggerHeader{ + // Header + 'M', 'D', 'M', 'P', 0x93, 0xa7, 0, 0, // Signature, Version + 1, 0, 0, 0, // NumberOfStreams, + 32, 0, 0, 0, // StreamDirectoryRVA + 0, 1, 2, 3, 4, 5, 6, 7, // Checksum, TimeDateStamp + 0, 0, 0, 0, 0, 0, 0, 0, // Flags + // Stream Directory + 16, 0, 0, 0, 68, 0, 0, 0, // Type, DataSize, + 44, 0, 0, 0, // RVA + // MemoryInfoListHeader + 20, 0, 0, 0, 48, 0, 0, 0, // SizeOfHeader, SizeOfEntry + 1, 0, 0, 0, 0, 0, 0, 0, // NumberOfEntries + 0, 0, 0, 0, // ??? + // MemoryInfo + 0, 1, 2, 3, 4, 5, 6, 7, // BaseAddress + 8, 9, 0, 1, 2, 3, 4, 5, // AllocationBase + 16, 0, 0, 0, 6, 7, 8, 9, // AllocationProtect, Reserved0 + 0, 1, 2, 3, 4, 5, 6, 7, // RegionSize + 0, 16, 0, 0, 32, 0, 0, 0, // State, Protect + 0, 0, 2, 0, 8, 9, 0, 1, // Type, Reserved1 + }; + + // Same as before, but the entry is larger. + std::vector BiggerEntry{ + // Header + 'M', 'D', 'M', 'P', 0x93, 0xa7, 0, 0, // Signature, Version + 1, 0, 0, 0, // NumberOfStreams, + 32, 0, 0, 0, // StreamDirectoryRVA + 0, 1, 2, 3, 4, 5, 6, 7, // Checksum, TimeDateStamp + 0, 0, 0, 0, 0, 0, 0, 0, // Flags + // Stream Directory + 16, 0, 0, 0, 68, 0, 0, 0, // Type, DataSize, + 44, 0, 0, 0, // RVA + // MemoryInfoListHeader + 16, 0, 0, 0, 52, 0, 0, 0, // SizeOfHeader, SizeOfEntry + 1, 0, 0, 0, 0, 0, 0, 0, // NumberOfEntries + // MemoryInfo + 0, 1, 2, 3, 4, 5, 6, 7, // BaseAddress + 8, 9, 0, 1, 2, 3, 4, 5, // AllocationBase + 16, 0, 0, 0, 6, 7, 8, 9, // AllocationProtect, Reserved0 + 0, 1, 2, 3, 4, 5, 6, 7, // RegionSize + 0, 16, 0, 0, 32, 0, 0, 0, // State, Protect + 0, 0, 2, 0, 8, 9, 0, 1, // Type, Reserved1 + 0, 0, 0, 0, // ??? 
+ }; + + for (ArrayRef Data : {OneEntry, BiggerHeader, BiggerEntry}) { + auto ExpectedFile = create(Data); + ASSERT_THAT_EXPECTED(ExpectedFile, Succeeded()); + const MinidumpFile &File = **ExpectedFile; + auto ExpectedInfo = File.getMemoryInfoList(); + ASSERT_THAT_EXPECTED(ExpectedInfo, Succeeded()); + ASSERT_EQ(1u, std::distance(ExpectedInfo->begin(), ExpectedInfo->end())); + const MemoryInfo &Info = *ExpectedInfo.get().begin(); + EXPECT_EQ(0x0706050403020100u, Info.BaseAddress); + EXPECT_EQ(0x0504030201000908u, Info.AllocationBase); + EXPECT_EQ(MemoryProtection::Execute, Info.AllocationProtect); + EXPECT_EQ(0x09080706u, Info.Reserved0); + EXPECT_EQ(0x0706050403020100u, Info.RegionSize); + EXPECT_EQ(MemoryState::Commit, Info.State); + EXPECT_EQ(MemoryProtection::ExecuteRead, Info.Protect); + EXPECT_EQ(MemoryType::Private, Info.Type); + EXPECT_EQ(0x01000908u, Info.Reserved1); + } + + // Header does not fit into the stream. + std::vector HeaderTooBig{ + // Header + 'M', 'D', 'M', 'P', 0x93, 0xa7, 0, 0, // Signature, Version + 1, 0, 0, 0, // NumberOfStreams, + 32, 0, 0, 0, // StreamDirectoryRVA + 0, 1, 2, 3, 4, 5, 6, 7, // Checksum, TimeDateStamp + 0, 0, 0, 0, 0, 0, 0, 0, // Flags + // Stream Directory + 16, 0, 0, 0, 15, 0, 0, 0, // Type, DataSize, + 44, 0, 0, 0, // RVA + // MemoryInfoListHeader + 16, 0, 0, 0, 48, 0, 0, 0, // SizeOfHeader, SizeOfEntry + 1, 0, 0, 0, 0, 0, 0, // ??? + }; + Expected> File = create(HeaderTooBig); + ASSERT_THAT_EXPECTED(File, Succeeded()); + EXPECT_THAT_EXPECTED(File.get()->getMemoryInfoList(), Failed()); + + // Header fits into the stream, but it is too small to contain the required + // entries. 
+ std::vector HeaderTooSmall{ + // Header + 'M', 'D', 'M', 'P', 0x93, 0xa7, 0, 0, // Signature, Version + 1, 0, 0, 0, // NumberOfStreams, + 32, 0, 0, 0, // StreamDirectoryRVA + 0, 1, 2, 3, 4, 5, 6, 7, // Checksum, TimeDateStamp + 0, 0, 0, 0, 0, 0, 0, 0, // Flags + // Stream Directory + 16, 0, 0, 0, 15, 0, 0, 0, // Type, DataSize, + 44, 0, 0, 0, // RVA + // MemoryInfoListHeader + 15, 0, 0, 0, 48, 0, 0, 0, // SizeOfHeader, SizeOfEntry + 1, 0, 0, 0, 0, 0, 0, // ??? + }; + File = create(HeaderTooSmall); + ASSERT_THAT_EXPECTED(File, Succeeded()); + EXPECT_THAT_EXPECTED(File.get()->getMemoryInfoList(), Failed()); + + std::vector EntryTooBig{ + // Header + 'M', 'D', 'M', 'P', 0x93, 0xa7, 0, 0, // Signature, Version + 1, 0, 0, 0, // NumberOfStreams, + 32, 0, 0, 0, // StreamDirectoryRVA + 0, 1, 2, 3, 4, 5, 6, 7, // Checksum, TimeDateStamp + 0, 0, 0, 0, 0, 0, 0, 0, // Flags + // Stream Directory + 16, 0, 0, 0, 64, 0, 0, 0, // Type, DataSize, + 44, 0, 0, 0, // RVA + // MemoryInfoListHeader + 16, 0, 0, 0, 49, 0, 0, 0, // SizeOfHeader, SizeOfEntry + 1, 0, 0, 0, 0, 0, 0, 0, // NumberOfEntries + // MemoryInfo + 0, 1, 2, 3, 4, 5, 6, 7, // BaseAddress + 8, 9, 0, 1, 2, 3, 4, 5, // AllocationBase + 16, 0, 0, 0, 6, 7, 8, 9, // AllocationProtect, Reserved0 + 0, 1, 2, 3, 4, 5, 6, 7, // RegionSize + 0, 16, 0, 0, 32, 0, 0, 0, // State, Protect + 0, 0, 2, 0, 8, 9, 0, 1, // Type, Reserved1 + }; + File = create(EntryTooBig); + ASSERT_THAT_EXPECTED(File, Succeeded()); + EXPECT_THAT_EXPECTED(File.get()->getMemoryInfoList(), Failed()); + + std::vector ThreeEntries{ + // Header + 'M', 'D', 'M', 'P', 0x93, 0xa7, 0, 0, // Signature, Version + 1, 0, 0, 0, // NumberOfStreams, + 32, 0, 0, 0, // StreamDirectoryRVA + 0, 1, 2, 3, 4, 5, 6, 7, // Checksum, TimeDateStamp + 0, 0, 0, 0, 0, 0, 0, 0, // Flags + // Stream Directory + 16, 0, 0, 0, 160, 0, 0, 0, // Type, DataSize, + 44, 0, 0, 0, // RVA + // MemoryInfoListHeader + 16, 0, 0, 0, 48, 0, 0, 0, // SizeOfHeader, SizeOfEntry + 3, 0, 0, 0, 0, 0, 0, 0, // 
NumberOfEntries + // MemoryInfo + 0, 1, 2, 3, 0, 0, 0, 0, // BaseAddress + 0, 0, 0, 0, 0, 0, 0, 0, // AllocationBase + 0, 0, 0, 0, 0, 0, 0, 0, // AllocationProtect, Reserved0 + 0, 0, 0, 0, 0, 0, 0, 0, // RegionSize + 0, 0, 0, 0, 0, 0, 0, 0, // State, Protect + 0, 0, 0, 0, 0, 0, 0, 0, // Type, Reserved1 + 0, 0, 4, 5, 6, 7, 0, 0, // BaseAddress + 0, 0, 0, 0, 0, 0, 0, 0, // AllocationBase + 0, 0, 0, 0, 0, 0, 0, 0, // AllocationProtect, Reserved0 + 0, 0, 0, 0, 0, 0, 0, 0, // RegionSize + 0, 0, 0, 0, 0, 0, 0, 0, // State, Protect + 0, 0, 0, 0, 0, 0, 0, 0, // Type, Reserved1 + 0, 0, 0, 8, 9, 0, 1, 0, // BaseAddress + 0, 0, 0, 0, 0, 0, 0, 0, // AllocationBase + 0, 0, 0, 0, 0, 0, 0, 0, // AllocationProtect, Reserved0 + 0, 0, 0, 0, 0, 0, 0, 0, // RegionSize + 0, 0, 0, 0, 0, 0, 0, 0, // State, Protect + 0, 0, 0, 0, 0, 0, 0, 0, // Type, Reserved1 + }; + File = create(ThreeEntries); + ASSERT_THAT_EXPECTED(File, Succeeded()); + auto ExpectedInfo = File.get()->getMemoryInfoList(); + ASSERT_THAT_EXPECTED(ExpectedInfo, Succeeded()); + EXPECT_THAT(to_vector<3>(map_range(*ExpectedInfo, + [](const MemoryInfo &Info) -> uint64_t { + return Info.BaseAddress; + })), + testing::ElementsAre(0x0000000003020100u, 0x0000070605040000u, + 0x0001000908000000u)); +} From llvm-commits at lists.llvm.org Tue Oct 8 07:19:52 2019 From: llvm-commits at lists.llvm.org (Owen Reynolds via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 14:19:52 +0000 (UTC) Subject: [PATCH] D68033: [llvm-ar] Make paths case insensitive when on windows In-Reply-To: References: Message-ID: gbreynoo marked an inline comment as done. gbreynoo added inline comments. ================ Comment at: llvm/tools/llvm-ar/llvm-ar.cpp:496 +static bool comparePaths(StringRef Path1, const StringRef Path2) { +#ifndef _WIN32 ---------------- ruiu wrote: > ruiu wrote: > > I'd add a function comment here to explain what this function does for Windows. 
On Windows, we want a case-insensitive comparison as defined by the Unicode standard (which I believe is what CompareStringOrdinal implements), so that filenames in archives are compared in a case-insensitive manner. > nit: remove `const` from Path2. Use of `CompareStringOrdinal` is not compliant with the Unicode standard, since Windows file path matching does not conform to the Unicode standard. I think this is fine though: using `CompareStringOrdinal` to mirror Windows behaviour makes more sense than a string comparison that conforms to the Unicode standard. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68033/new/ https://reviews.llvm.org/D68033 From llvm-commits at lists.llvm.org Tue Oct 8 07:19:53 2019 From: llvm-commits at lists.llvm.org (Clement Courbet via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 14:19:53 +0000 (UTC) Subject: [PATCH] D68642: [llvm-exegesis] Add options to SnippetGenerator. Message-ID: courbet created this revision. courbet added a reviewer: gchatelet. Herald added a subscriber: tschuett. Herald added a project: LLVM. This adds a `-max-configs-per-opcode` option to limit the number of configs per opcode. Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D68642 Files: llvm/docs/CommandGuide/llvm-exegesis.rst llvm/test/tools/llvm-exegesis/X86/max-configs.test llvm/tools/llvm-exegesis/lib/Latency.h llvm/tools/llvm-exegesis/lib/SnippetGenerator.cpp llvm/tools/llvm-exegesis/lib/SnippetGenerator.h llvm/tools/llvm-exegesis/lib/Target.cpp llvm/tools/llvm-exegesis/lib/Target.h llvm/tools/llvm-exegesis/lib/Uops.h llvm/tools/llvm-exegesis/lib/X86/Target.cpp llvm/tools/llvm-exegesis/llvm-exegesis.cpp llvm/unittests/tools/llvm-exegesis/X86/SnippetGeneratorTest.cpp -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68642.223847.patch Type: text/x-patch Size: 12034 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 07:19:54 2019 From: llvm-commits at lists.llvm.org (Dave Green via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 14:19:54 +0000 (UTC) Subject: [PATCH] D68643: [Codegen] Alter the default promotion for saturating adds and subs Message-ID: dmgreen created this revision. dmgreen added reviewers: RKSimon, efriedma, craig.topper, leonardchan. Herald added subscribers: hiraditya, kristof.beyls. Herald added a project: LLVM. The default promotion for the add_sat/sub_sat nodes currently does: // 1. ANY_EXTEND iN to iM // 2. SHL by M-N // 3. [US][ADD|SUB]SAT // 4. L/ASHR by M-N If the promoted add_sat or sub_sat node is not legal, this can produce code that effectively does a lot of shifting (and requires large constants to be materialised) just to use the overflow flag. It is simpler to just do the saturation manually, using the higher bitwidth addition and a min/max against the saturating bounds. That is what this patch attempts to do. A few points of interest: - This still uses the existing promotion when the promoted add/sub_sat is legal. In many situations (but not all) it would probably be better to just perform the new promotion. - It always creates a MIN and MAX node, even if they are not legal. The alternative would be to create a cmp/select pair, which these will legalise into anyway. - We only have AArch64 and X86 tests. I have added ARM (for which just the differences are shown here). I'm happy to add other architectures if people are interested. 
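As a rough scalar illustration of the "do the saturation manually" idea described above, the sketch below performs an 8-bit saturating add in a wider integer and then clamps against the narrow type's bounds. It is not the SelectionDAG code from the patch: the function names `saddSat8` and `uaddSat8` are hypothetical, and it assumes the operands are sign-extended for the signed case and zero-extended for the unsigned case, which the description does not spell out.

```cpp
#include <algorithm>
#include <cstdint>

// Signed saturating add of two i8 values, computed in a wider (i32) type.
// The wide addition cannot overflow, so a min/max pair against the i8 bounds
// is all that is needed; no shifts and no overflow flag are involved.
int8_t saddSat8(int8_t A, int8_t B) {
  int32_t Wide = int32_t(A) + int32_t(B);   // sign-extend, then add
  Wide = std::min<int32_t>(Wide, INT8_MAX); // clamp to the upper bound
  Wide = std::max<int32_t>(Wide, INT8_MIN); // clamp to the lower bound
  return static_cast<int8_t>(Wide);
}

// Unsigned saturating add: the sum can only exceed the upper bound, so a
// single min against UINT8_MAX suffices.
uint8_t uaddSat8(uint8_t A, uint8_t B) {
  uint32_t Wide = uint32_t(A) + uint32_t(B); // zero-extend, then add
  return static_cast<uint8_t>(std::min<uint32_t>(Wide, UINT8_MAX));
}
```

The min and max here correspond to the MIN and MAX nodes mentioned in the list above; on a target without native min/max they would presumably lower to a compare and select, as the description notes.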
https://reviews.llvm.org/D68643 Files: llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp llvm/test/CodeGen/AArch64/sadd_sat.ll llvm/test/CodeGen/AArch64/sadd_sat_vec.ll llvm/test/CodeGen/AArch64/ssub_sat.ll llvm/test/CodeGen/AArch64/ssub_sat_vec.ll llvm/test/CodeGen/AArch64/uadd_sat.ll llvm/test/CodeGen/AArch64/uadd_sat_vec.ll llvm/test/CodeGen/AArch64/usub_sat.ll llvm/test/CodeGen/AArch64/usub_sat_vec.ll llvm/test/CodeGen/ARM/sadd_sat.ll llvm/test/CodeGen/ARM/ssub_sat.ll llvm/test/CodeGen/ARM/uadd_sat.ll llvm/test/CodeGen/ARM/usub_sat.ll llvm/test/CodeGen/X86/sadd_sat.ll llvm/test/CodeGen/X86/ssub_sat.ll llvm/test/CodeGen/X86/uadd_sat.ll llvm/test/CodeGen/X86/usub_sat.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D68643.223836.patch Type: text/x-patch Size: 65749 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 07:19:59 2019 From: llvm-commits at lists.llvm.org (Pavel Labath via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 14:19:59 +0000 (UTC) Subject: [PATCH] D68210: Object/minidump: Add support for the MemoryInfoList stream In-Reply-To: References: Message-ID: This revision was automatically updated to reflect the committed changes. labath marked an inline comment as done. Closed by commit rG6e0b1ce48e3c: Object/minidump: Add support for the MemoryInfoList stream (authored by labath). Herald added a subscriber: hiraditya. Changed prior to commit: https://reviews.llvm.org/D68210?vs=223816&id=223850#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68210/new/ https://reviews.llvm.org/D68210 Files: llvm/include/llvm/BinaryFormat/Minidump.h llvm/include/llvm/BinaryFormat/MinidumpConstants.def llvm/include/llvm/Object/Minidump.h llvm/lib/Object/Minidump.cpp llvm/unittests/Object/MinidumpTest.cpp -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68210.223850.patch Type: text/x-patch Size: 22112 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 07:23:49 2019 From: llvm-commits at lists.llvm.org (Sid Manning via llvm-commits) Date: Tue, 08 Oct 2019 14:23:49 -0000 Subject: [lld] r374052 - [lld][Hexagon] Support PLT relocation R_HEX_B15_PCREL_X/R_HEX_B9_PCREL_X Message-ID: <20191008142349.748B38D8C8@lists.llvm.org> Author: sidneym Date: Tue Oct 8 07:23:49 2019 New Revision: 374052 URL: http://llvm.org/viewvc/llvm-project?rev=374052&view=rev Log: [lld][Hexagon] Support PLT relocation R_HEX_B15_PCREL_X/R_HEX_B9_PCREL_X These are sometimes generated by tail call optimizations. Differential Revision: https://reviews.llvm.org/D66542 Added: lld/trunk/test/ELF/hexagon-plt.s Modified: lld/trunk/ELF/Arch/Hexagon.cpp lld/trunk/test/ELF/hexagon-shared.s Modified: lld/trunk/ELF/Arch/Hexagon.cpp URL: http://llvm.org/viewvc/llvm-project/lld/trunk/ELF/Arch/Hexagon.cpp?rev=374052&r1=374051&r2=374052&view=diff ============================================================================== --- lld/trunk/ELF/Arch/Hexagon.cpp (original) +++ lld/trunk/ELF/Arch/Hexagon.cpp Tue Oct 8 07:23:49 2019 @@ -103,13 +103,13 @@ RelExpr Hexagon::getRelExpr(RelType type case R_HEX_LO16: return R_ABS; case R_HEX_B9_PCREL: - case R_HEX_B9_PCREL_X: case R_HEX_B13_PCREL: case R_HEX_B15_PCREL: - case R_HEX_B15_PCREL_X: case R_HEX_6_PCREL_X: case R_HEX_32_PCREL: return R_PC; + case R_HEX_B9_PCREL_X: + case R_HEX_B15_PCREL_X: case R_HEX_B22_PCREL: case R_HEX_PLT_B22_PCREL: case R_HEX_B22_PCREL_X: Added: lld/trunk/test/ELF/hexagon-plt.s URL: http://llvm.org/viewvc/llvm-project/lld/trunk/test/ELF/hexagon-plt.s?rev=374052&view=auto ============================================================================== --- lld/trunk/test/ELF/hexagon-plt.s (added) +++ lld/trunk/test/ELF/hexagon-plt.s Tue Oct 8 07:23:49 2019 @@ -0,0 +1,102 @@ +# REQUIRES: hexagon +# RUN: echo '.globl bar, weak; .type bar, at function; .type weak, at 
function; bar: weak:' > %t1.s + +# RUN: llvm-mc -filetype=obj -triple=hexagon-unknown-elf %t1.s -o %t1.o +# RUN: ld.lld -shared %t1.o -soname=t1.so -o %t1.so +# RUN: llvm-mc -mno-fixup -filetype=obj -triple=hexagon-unknown-elf %s -o %t.o +# RUN: ld.lld %t.o %t1.so -z separate-code -o %t +# RUN: llvm-readelf -S -s %t | FileCheck --check-prefixes=SEC,NM %s +# RUN: llvm-readobj -r %t | FileCheck --check-prefix=RELOC %s +# RUN: llvm-readelf -x .got.plt %t | FileCheck --check-prefix=GOTPLT %s +# RUN: llvm-objdump -d --no-show-raw-insn %t | FileCheck --check-prefixes=DIS %s + +# SEC: .plt PROGBITS {{0*}}00020040 + +## A canonical PLT has a non-zero st_value. bar and weak are called but their +## addresses are not taken, so a canonical PLT is not necessary. +# NM: {{0*}}00000000 0 FUNC GLOBAL DEFAULT UND bar +# NM: {{0*}}00000000 0 FUNC WEAK DEFAULT UND weak + +## The .got.plt slots relocated by .rela.plt point to .plt +## This is required by glibc. +# RELOC: .rela.plt { +# RELOC-NEXT: 0x40078 R_HEX_JMP_SLOT bar 0x0 +# RELOC-NEXT: 0x4007C R_HEX_JMP_SLOT weak 0x0 +# RELOC-NEXT: } +# GOTPLT: section '.got.plt' +# GOTPLT-NEXT: 0x00040068 00000000 00000000 00000000 00000000 +# GOTPLT-NEXT: 0x00040078 00000000 00000000 + +# DIS: _start: +## Direct call +## Call foo directly +# DIS-NEXT: { call 0x2003c } +## Call bar via plt +# DIS-NEXT: { call 0x20060 } +## Call weak via plt +# DIS-NEXT: { call 0x20070 } +# DIS-NEXT: { immext(#0) + +## Call foo directly +# DIS-NEXT: if (p0) jump:nt 0x2003c } +# DIS-NEXT: { immext(#64) +## Call bar via plt +# DIS-NEXT: if (p0) jump:nt 0x20060 } +# DIS-NEXT: { immext(#64) +## Call weak via plt +# DIS-NEXT: if (p0) jump:nt 0x20070 } +# DIS-NEXT: { immext(#0) + +## Call foo directly +# DIS-NEXT: r0 = #0 ; jump 0x2003c } +# DIS-NEXT: { immext(#0) +## Call bar via plt +# DIS-NEXT: r0 = #0 ; jump 0x20060 } +# DIS-NEXT: { immext(#0) +## Call weak via plt +# DIS-NEXT: r0 = #0 ; jump 0x20070 } + +# DIS: foo: +# DIS-NEXT: 2003c: + + +# DIS: Disassembly 
of section .plt: + +# DIS: 00020040 .plt: +# DIS-NEXT: 20040: { immext(#131072) +# DIS-NEXT: 20044: r28 = add(pc,##131112) } +# DIS-NEXT: 20048: { r14 -= add(r28,#16) +# DIS-NEXT: 2004c: r15 = memw(r28+#8) +# DIS-NEXT: 20050: r28 = memw(r28+#4) } +# DIS-NEXT: 20054: { r14 = asr(r14,#2) +# DIS-NEXT: 20058: jumpr r28 } +# DIS-NEXT: 2005c: { trap0(#219) } +## bar's plt slot +# DIS-NEXT: 20060: { immext(#131072) +# DIS-NEXT: 20064: r14 = add(pc,##131096) } +# DIS-NEXT: 20068: { r28 = memw(r14+#0) } +# DIS-NEXT: 2006c: { jumpr r28 } +## weak's plt slot +# DIS-NEXT: 20070: { immext(#131072) +# DIS-NEXT: 20074: r14 = add(pc,##131084) } +# DIS-NEXT: 20078: { r28 = memw(r14+#0) } +# DIS-NEXT: 2007c: { jumpr r28 } + + +.global _start, foo, bar +.weak weak + +_start: + call foo + call bar + call weak + if (p0) jump foo + if (p0) jump bar + if (p0) jump weak + { r0 = #0; jump foo } + { r0 = #0; jump bar } + { r0 = #0; jump weak } + +## foo is local and non-preemptale, no PLT is generated. +foo: + jumpr r31 Modified: lld/trunk/test/ELF/hexagon-shared.s URL: http://llvm.org/viewvc/llvm-project/lld/trunk/test/ELF/hexagon-shared.s?rev=374052&r1=374051&r2=374052&view=diff ============================================================================== --- lld/trunk/test/ELF/hexagon-shared.s (original) +++ lld/trunk/test/ELF/hexagon-shared.s Tue Oct 8 07:23:49 2019 @@ -1,15 +1,23 @@ # REQUIRES: hexagon -# RUN: llvm-mc -filetype=obj -triple=hexagon-unknown-elf %s -o %t +# RUN: llvm-mc -mno-fixup -filetype=obj -triple=hexagon-unknown-elf %s -o %t.o # RUN: llvm-mc -filetype=obj -triple=hexagon-unknown-elf %S/Inputs/hexagon-shared.s -o %t2.o # RUN: ld.lld -shared %t2.o -soname so -o %t3.so -# RUN: ld.lld -shared %t %t3.so -o %t4.so +# RUN: ld.lld -shared %t.o %t3.so -o %t4.so +# RUN: ld.lld -Bsymbolic -shared %t.o %t3.so -o %t5.so # RUN: llvm-objdump -d -j .plt %t4.so | FileCheck --check-prefix=PLT %s # RUN: llvm-objdump -d -j .text %t4.so | FileCheck --check-prefix=TEXT %s # RUN: 
llvm-objdump -D -j .got %t4.so | FileCheck --check-prefix=GOT %s # RUN: llvm-readelf -r %t4.so | FileCheck --check-prefix=RELO %s +# RUN: llvm-readelf -r %t5.so | FileCheck --check-prefix=SYMBOLIC %s -.global foo -foo: +.global _start, foo, hidden_symbol +.hidden hidden_symbol +_start: +# When -Bsymbolic is specified calls to locally resolvables should +# not generate a plt. +call ##foo +# Calls to hidden_symbols should not trigger a plt. +call ##hidden_symbol # _HEX_32_PCREL .word _DYNAMIC - . @@ -17,6 +25,10 @@ call ##bar # R_HEX_PLT_B22_PCREL call bar at PLT +# R_HEX_B15_PCREL_X +if (p0) jump bar +# R_HEX_B9_PCREL_X +{ r0 = #0; jump bar } # R_HEX_GOT_11_X and R_HEX_GOT_32_6_X r2=add(pc,##_GLOBAL_OFFSET_TABLE_ at PCREL) @@ -26,6 +38,13 @@ jumpr r0 # R_HEX_GOT_16_X r0 = add(r1,##bar at GOT) +# foo is local so no plt will be generated +foo: + jumpr lr + +hidden_symbol: + jumpr lr + # R_HEX_32 .data .global var @@ -40,26 +59,37 @@ pvar: .word var .size pvar, 4 -# PLT: { immext(#131200 -# PLT: r28 = add(pc,##131252) } -# PLT: { r14 -= add(r28,#16) -# PLT: r15 = memw(r28+#8) -# PLT: r28 = memw(r28+#4) } -# PLT: { r14 = asr(r14,#2) -# PLT: jumpr r28 } -# PLT: { trap0(#219) } -# PLT: immext(#131200) -# PLT: r14 = add(pc,##131236) } -# PLT: r28 = memw(r14+#0) } -# PLT: jumpr r28 } - -# TEXT: 10218: 68 00 01 00 00010068 -# TEXT: { call 0x10270 } -# TEXT: r0 = add(r1,##-65548) } + +# PLT: { immext(#131264 +# PLT-NEXT: r28 = add(pc,##131268) } +# PLT-NEXT: { r14 -= add(r28,#16) +# PLT-NEXT: r15 = memw(r28+#8) +# PLT-NEXT: r28 = memw(r28+#4) } +# PLT-NEXT: { r14 = asr(r14,#2) +# PLT-NEXT: jumpr r28 } +# PLT-NEXT: { trap0(#219) } +# PLT-NEXT: immext(#131200) +# PLT-NEXT: r14 = add(pc,##131252) } +# PLT-NEXT: r28 = memw(r14+#0) } +# PLT-NEXT: jumpr r28 } + +# TEXT: 8c 00 01 00 0001008c +# TEXT: { call 0x102d0 } +# TEXT: if (p0) jump:nt 0x102d0 +# TEXT: r0 = #0 ; jump 0x102d0 +# TEXT: r0 = add(r1,##-65548) # GOT: .got: -# GOT: 202f8: 00 00 00 00 00000000 +# GOT: 00 00 00 00 
00000000 -# RELO: 000202f8 00000121 R_HEX_GLOB_DAT -# RELO: 00030300 00000406 R_HEX_32 -# RELO: 00030314 00000122 R_HEX_JMP_SLOT +# RELO: R_HEX_GLOB_DAT +# RELO: R_HEX_32 +# RELO: Relocation section '.rela.plt' at offset 0x22c contains 2 entries: +# RELO: R_HEX_JMP_SLOT {{.*}} foo +# RELO-NEXT: R_HEX_JMP_SLOT {{.*}} bar +# RELO-NOT: R_HEX_JMP_SLOT {{.*}} hidden + +# Make sure that no PLT is generated for a local call. +# SYMBOLIC: Relocation section '.rela.plt' at offset 0x22c contains 1 entries: +# SYMBOLIC: R_HEX_JMP_SLOT {{.*}} bar +# SYMBOLIC-NOT: R_HEX_JMP_SLOT {{.*}} foo From llvm-commits at lists.llvm.org Tue Oct 8 07:30:24 2019 From: llvm-commits at lists.llvm.org (Clement Courbet via llvm-commits) Date: Tue, 08 Oct 2019 14:30:24 -0000 Subject: [llvm] r374054 - [llvm-exegesis] Add options to SnippetGenerator. Message-ID: <20191008143024.7BE8381E26@lists.llvm.org> Author: courbet Date: Tue Oct 8 07:30:24 2019 New Revision: 374054 URL: http://llvm.org/viewvc/llvm-project?rev=374054&view=rev Log: [llvm-exegesis] Add options to SnippetGenerator. Summary: This adds a `-max-configs-per-opcode` option to limit the number of configs per opcode. 
Reviewers: gchatelet Subscribers: tschuett, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68642 Added: llvm/trunk/test/tools/llvm-exegesis/X86/max-configs.test Modified: llvm/trunk/docs/CommandGuide/llvm-exegesis.rst llvm/trunk/tools/llvm-exegesis/lib/Latency.h llvm/trunk/tools/llvm-exegesis/lib/SnippetGenerator.cpp llvm/trunk/tools/llvm-exegesis/lib/SnippetGenerator.h llvm/trunk/tools/llvm-exegesis/lib/Target.cpp llvm/trunk/tools/llvm-exegesis/lib/Target.h llvm/trunk/tools/llvm-exegesis/lib/Uops.h llvm/trunk/tools/llvm-exegesis/lib/X86/Target.cpp llvm/trunk/tools/llvm-exegesis/llvm-exegesis.cpp llvm/trunk/unittests/tools/llvm-exegesis/X86/SnippetGeneratorTest.cpp Modified: llvm/trunk/docs/CommandGuide/llvm-exegesis.rst URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/docs/CommandGuide/llvm-exegesis.rst?rev=374054&r1=374053&r2=374054&view=diff ============================================================================== --- llvm/trunk/docs/CommandGuide/llvm-exegesis.rst (original) +++ llvm/trunk/docs/CommandGuide/llvm-exegesis.rst Tue Oct 8 07:30:24 2019 @@ -195,11 +195,23 @@ OPTIONS to specify at least one of the `-analysis-clusters-output-file=` and `-analysis-inconsistencies-output-file=`. -.. option:: -num-repetitions= +.. option:: -num-repetitions= Specify the number of repetitions of the asm snippet. Higher values lead to more accurate measurements but lengthen the benchmark. +.. option:: -max-configs-per-opcode= + + Specify the maximum configurations that can be generated for each opcode. + By default this is `1`, meaning that we assume that a single measurement is + enough to characterize an opcode. This might not be true of all instructions: + for example, the performance characteristics of the LEA instruction on X86 + depends on the value of assigned registers and immediates. 
Setting a value of + `-max-configs-per-opcode` larger than `1` allows `llvm-exegesis` to explore + more configurations to discover if some register or immediate assignments + lead to different performance characteristics. + + .. option:: -benchmarks-file= File to read (`analysis` mode) or write (`latency`/`uops`/`inverse_throughput` Added: llvm/trunk/test/tools/llvm-exegesis/X86/max-configs.test URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/llvm-exegesis/X86/max-configs.test?rev=374054&view=auto ============================================================================== --- llvm/trunk/test/tools/llvm-exegesis/X86/max-configs.test (added) +++ llvm/trunk/test/tools/llvm-exegesis/X86/max-configs.test Tue Oct 8 07:30:24 2019 @@ -0,0 +1,24 @@ +# RUN: llvm-exegesis -mode=latency -opcode-name=SBB8rr -max-configs-per-opcode=1 | FileCheck -check-prefixes=CHECK,CHECK1 %s +# RUN: llvm-exegesis -mode=latency -opcode-name=SBB8rr -max-configs-per-opcode=2 | FileCheck -check-prefixes=CHECK,CHECK2 %s + +CHECK: --- +CHECK-NEXT: mode: latency +CHECK-NEXT: key: +CHECK-NEXT: instructions: +CHECK-NEXT: SBB8rr +CHECK-NEXT: config: '' +CHECK-NEXT: register_initial_values: +CHECK-DAG: - '[[REG1:[A-Z0-9]+]]=0x0' +CHECK-LAST: ... + +CHECK1-NOT: SBB8rr + +CHECK2: --- +CHECK2-NEXT: mode: latency +CHECK2-NEXT: key: +CHECK2-NEXT: instructions: +CHECK2-NEXT: SBB8rr +CHECK2-NEXT: config: '' +CHECK2-NEXT: register_initial_values: +CHECK2-DAG: - '[[REG1:[A-Z0-9]+]]=0x0' +CHECK2-LAST: ... 
Modified: llvm/trunk/tools/llvm-exegesis/lib/Latency.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-exegesis/lib/Latency.h?rev=374054&r1=374053&r2=374054&view=diff ============================================================================== --- llvm/trunk/tools/llvm-exegesis/lib/Latency.h (original) +++ llvm/trunk/tools/llvm-exegesis/lib/Latency.h Tue Oct 8 07:30:24 2019 @@ -24,7 +24,7 @@ namespace exegesis { class LatencySnippetGenerator : public SnippetGenerator { public: - LatencySnippetGenerator(const LLVMState &State) : SnippetGenerator(State) {} + using SnippetGenerator::SnippetGenerator; ~LatencySnippetGenerator() override; llvm::Expected> Modified: llvm/trunk/tools/llvm-exegesis/lib/SnippetGenerator.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-exegesis/lib/SnippetGenerator.cpp?rev=374054&r1=374053&r2=374054&view=diff ============================================================================== --- llvm/trunk/tools/llvm-exegesis/lib/SnippetGenerator.cpp (original) +++ llvm/trunk/tools/llvm-exegesis/lib/SnippetGenerator.cpp Tue Oct 8 07:30:24 2019 @@ -33,7 +33,8 @@ std::vector getSingleton(C SnippetGeneratorFailure::SnippetGeneratorFailure(const llvm::Twine &S) : llvm::StringError(S, llvm::inconvertibleErrorCode()) {} -SnippetGenerator::SnippetGenerator(const LLVMState &State) : State(State) {} +SnippetGenerator::SnippetGenerator(const LLVMState &State, const Options &Opts) + : State(State), Opts(Opts) {} SnippetGenerator::~SnippetGenerator() = default; @@ -81,6 +82,9 @@ SnippetGenerator::generateConfigurations computeRegisterInitialValues(CT.Instructions); BC.Key.Config = CT.Config; Output.push_back(std::move(BC)); + if (Output.size() >= Opts.MaxConfigsPerOpcode) + return Output; // Early exit if we exceeded the number of allowed + // configs. 
} } return Output; Modified: llvm/trunk/tools/llvm-exegesis/lib/SnippetGenerator.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-exegesis/lib/SnippetGenerator.h?rev=374054&r1=374053&r2=374054&view=diff ============================================================================== --- llvm/trunk/tools/llvm-exegesis/lib/SnippetGenerator.h (original) +++ llvm/trunk/tools/llvm-exegesis/lib/SnippetGenerator.h Tue Oct 8 07:30:24 2019 @@ -51,7 +51,11 @@ public: // Common code for all benchmark modes. class SnippetGenerator { public: - explicit SnippetGenerator(const LLVMState &State); + struct Options { + unsigned MaxConfigsPerOpcode = 1; + }; + + explicit SnippetGenerator(const LLVMState &State, const Options &Opts); virtual ~SnippetGenerator(); @@ -66,6 +70,7 @@ public: protected: const LLVMState &State; + const Options Opts; private: // API to be implemented by subclasses. Modified: llvm/trunk/tools/llvm-exegesis/lib/Target.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-exegesis/lib/Target.cpp?rev=374054&r1=374053&r2=374054&view=diff ============================================================================== --- llvm/trunk/tools/llvm-exegesis/lib/Target.cpp (original) +++ llvm/trunk/tools/llvm-exegesis/lib/Target.cpp Tue Oct 8 07:30:24 2019 @@ -36,17 +36,17 @@ void ExegesisTarget::registerTarget(Exeg FirstTarget = Target; } -std::unique_ptr -ExegesisTarget::createSnippetGenerator(InstructionBenchmark::ModeE Mode, - const LLVMState &State) const { +std::unique_ptr ExegesisTarget::createSnippetGenerator( + InstructionBenchmark::ModeE Mode, const LLVMState &State, + const SnippetGenerator::Options &Opts) const { switch (Mode) { case InstructionBenchmark::Unknown: return nullptr; case InstructionBenchmark::Latency: - return createLatencySnippetGenerator(State); + return createLatencySnippetGenerator(State, Opts); case InstructionBenchmark::Uops: case InstructionBenchmark::InverseThroughput: - return createUopsSnippetGenerator(State); 
+ return createUopsSnippetGenerator(State, Opts); } return nullptr; } @@ -66,14 +66,14 @@ ExegesisTarget::createBenchmarkRunner(In return nullptr; } -std::unique_ptr -ExegesisTarget::createLatencySnippetGenerator(const LLVMState &State) const { - return std::make_unique(State); +std::unique_ptr ExegesisTarget::createLatencySnippetGenerator( + const LLVMState &State, const SnippetGenerator::Options &Opts) const { + return std::make_unique(State, Opts); } -std::unique_ptr -ExegesisTarget::createUopsSnippetGenerator(const LLVMState &State) const { - return std::make_unique(State); +std::unique_ptr ExegesisTarget::createUopsSnippetGenerator( + const LLVMState &State, const SnippetGenerator::Options &Opts) const { + return std::make_unique(State, Opts); } std::unique_ptr ExegesisTarget::createLatencyBenchmarkRunner( Modified: llvm/trunk/tools/llvm-exegesis/lib/Target.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-exegesis/lib/Target.h?rev=374054&r1=374053&r2=374054&view=diff ============================================================================== --- llvm/trunk/tools/llvm-exegesis/lib/Target.h (original) +++ llvm/trunk/tools/llvm-exegesis/lib/Target.h Tue Oct 8 07:30:24 2019 @@ -125,7 +125,8 @@ public: // Creates a snippet generator for the given mode. std::unique_ptr createSnippetGenerator(InstructionBenchmark::ModeE Mode, - const LLVMState &State) const; + const LLVMState &State, + const SnippetGenerator::Options &Opts) const; // Creates a benchmark runner for the given mode. std::unique_ptr createBenchmarkRunner(InstructionBenchmark::ModeE Mode, @@ -151,9 +152,9 @@ private: // Targets can implement their own snippet generators/benchmarks runners by // implementing these. 
std::unique_ptr virtual createLatencySnippetGenerator( - const LLVMState &State) const; + const LLVMState &State, const SnippetGenerator::Options &Opts) const; std::unique_ptr virtual createUopsSnippetGenerator( - const LLVMState &State) const; + const LLVMState &State, const SnippetGenerator::Options &Opts) const; std::unique_ptr virtual createLatencyBenchmarkRunner( const LLVMState &State, InstructionBenchmark::ModeE Mode) const; std::unique_ptr virtual createUopsBenchmarkRunner( Modified: llvm/trunk/tools/llvm-exegesis/lib/Uops.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-exegesis/lib/Uops.h?rev=374054&r1=374053&r2=374054&view=diff ============================================================================== --- llvm/trunk/tools/llvm-exegesis/lib/Uops.h (original) +++ llvm/trunk/tools/llvm-exegesis/lib/Uops.h Tue Oct 8 07:30:24 2019 @@ -22,7 +22,7 @@ namespace exegesis { class UopsSnippetGenerator : public SnippetGenerator { public: - UopsSnippetGenerator(const LLVMState &State) : SnippetGenerator(State) {} + using SnippetGenerator::SnippetGenerator; ~UopsSnippetGenerator() override; llvm::Expected> Modified: llvm/trunk/tools/llvm-exegesis/lib/X86/Target.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-exegesis/lib/X86/Target.cpp?rev=374054&r1=374053&r2=374054&view=diff ============================================================================== --- llvm/trunk/tools/llvm-exegesis/lib/X86/Target.cpp (original) +++ llvm/trunk/tools/llvm-exegesis/lib/X86/Target.cpp Tue Oct 8 07:30:24 2019 @@ -462,14 +462,16 @@ private: sizeof(kUnavailableRegisters[0])); } - std::unique_ptr - createLatencySnippetGenerator(const LLVMState &State) const override { - return std::make_unique(State); + std::unique_ptr createLatencySnippetGenerator( + const LLVMState &State, + const SnippetGenerator::Options &Opts) const override { + return std::make_unique(State, Opts); } - std::unique_ptr - createUopsSnippetGenerator(const LLVMState &State) const 
override { - return std::make_unique(State); + std::unique_ptr createUopsSnippetGenerator( + const LLVMState &State, + const SnippetGenerator::Options &Opts) const override { + return std::make_unique(State, Opts); } bool matchesArch(llvm::Triple::ArchType Arch) const override { Modified: llvm/trunk/tools/llvm-exegesis/llvm-exegesis.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-exegesis/llvm-exegesis.cpp?rev=374054&r1=374053&r2=374054&view=diff ============================================================================== --- llvm/trunk/tools/llvm-exegesis/llvm-exegesis.cpp (original) +++ llvm/trunk/tools/llvm-exegesis/llvm-exegesis.cpp Tue Oct 8 07:30:24 2019 @@ -95,6 +95,12 @@ static cl::opt cl::desc("number of time to repeat the asm snippet"), cl::cat(BenchmarkOptions), cl::init(10000)); +static cl::opt MaxConfigsPerOpcode( + "max-configs-per-opcode", + cl::desc( + "allow to snippet generator to generate at most that many configs"), + cl::cat(BenchmarkOptions), cl::init(1)); + static cl::opt IgnoreInvalidSchedClass( "ignore-invalid-sched-class", cl::desc("ignore instructions that do not define a sched class"), @@ -214,8 +220,11 @@ generateSnippets(const LLVMState &State, if (InstrDesc.isCall() || InstrDesc.isReturn()) return make_error("Unsupported opcode: isCall/isReturn"); + SnippetGenerator::Options Options; + Options.MaxConfigsPerOpcode = MaxConfigsPerOpcode; const std::unique_ptr Generator = - State.getExegesisTarget().createSnippetGenerator(BenchmarkMode, State); + State.getExegesisTarget().createSnippetGenerator(BenchmarkMode, State, + Options); if (!Generator) llvm::report_fatal_error("cannot create snippet generator"); return Generator->generateConfigurations(Instr, ForbiddenRegs); Modified: llvm/trunk/unittests/tools/llvm-exegesis/X86/SnippetGeneratorTest.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/unittests/tools/llvm-exegesis/X86/SnippetGeneratorTest.cpp?rev=374054&r1=374053&r2=374054&view=diff 
============================================================================== --- llvm/trunk/unittests/tools/llvm-exegesis/X86/SnippetGeneratorTest.cpp (original) +++ llvm/trunk/unittests/tools/llvm-exegesis/X86/SnippetGeneratorTest.cpp Tue Oct 8 07:30:24 2019 @@ -45,7 +45,7 @@ protected: template class SnippetGeneratorTest : public X86SnippetGeneratorTest { protected: - SnippetGeneratorTest() : Generator(State) {} + SnippetGeneratorTest() : Generator(State, SnippetGenerator::Options()) {} std::vector checkAndGetCodeTemplates(unsigned Opcode) { randomGenerator().seed(0); // Initialize seed. @@ -335,7 +335,8 @@ TEST_F(UopsSnippetGeneratorTest, MemoryU class FakeSnippetGenerator : public SnippetGenerator { public: - FakeSnippetGenerator(const LLVMState &State) : SnippetGenerator(State) {} + FakeSnippetGenerator(const LLVMState &State, const Options &Opts) + : SnippetGenerator(State, Opts) {} Instruction createInstruction(unsigned Opcode) { return State.getIC().getInstr(Opcode); From llvm-commits at lists.llvm.org Tue Oct 8 07:29:12 2019 From: llvm-commits at lists.llvm.org (David Tellenbach via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 14:29:12 +0000 (UTC) Subject: [PATCH] D68639: [MachineScheduler] Add a flag to enable scheduling of cfi instructions In-Reply-To: References: Message-ID: <91b082edb5e76b5491768dc7c8c171e7@localhost.localdomain> tellenbach added a comment. In D68639#1699548, @fhahn wrote: > I've not looked too closely yet, but isn't that basically the same as what we do for debug values? If that's the case, I think we should consider unifying the code to handle debug & CFI. Specifically, would it be possible to just extend the current handling of debug instructions to also handle CFI instructions? That should need only some renaming and a few code changes, I think. You are absolutely correct, this is essentially the same as for debug values with the addition that it can be disabled. 
I will try to unify both approaches, maybe both loops (currently one for the vector of debug values and one for the vector of cfi instructions) can be fused into one. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68639/new/ https://reviews.llvm.org/D68639 From llvm-commits at lists.llvm.org Tue Oct 8 07:29:13 2019 From: llvm-commits at lists.llvm.org (Nikola Prica via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 14:29:13 +0000 (UTC) Subject: [PATCH] D67556: [ARM][AArch64][DebugInfo] Improve call site instruction interpretation In-Reply-To: References: Message-ID: NikolaPrica updated this revision to Diff 223848. NikolaPrica added a comment. -Include Destination operand in `isAddImmediate()` -Exclude description of add immediate instructions where source and destination registers are the same. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67556/new/ https://reviews.llvm.org/D67556 Files: include/llvm/CodeGen/TargetInstrInfo.h lib/CodeGen/AsmPrinter/DwarfDebug.cpp lib/CodeGen/TargetInstrInfo.cpp lib/Target/AArch64/AArch64InstrInfo.cpp lib/Target/AArch64/AArch64InstrInfo.h lib/Target/ARM/ARMBaseInstrInfo.cpp lib/Target/ARM/ARMBaseInstrInfo.h test/DebugInfo/MIR/AArch64/dbgcall-site-interpretation.mir test/DebugInfo/MIR/ARM/dbgcall-site-interpretation.mir -------------- next part -------------- A non-text attachment was scrubbed... Name: D67556.223848.patch Type: text/x-patch Size: 27010 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 07:29:21 2019 From: llvm-commits at lists.llvm.org (Sid Manning via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 14:29:21 +0000 (UTC) Subject: [PATCH] D66542: R_HEX_B15_PCREL_X/R_HEX_B9_PCREL_X can be in shared objects In-Reply-To: References: Message-ID: <7e581fd96327939296623b384abd4692@localhost.localdomain> This revision was automatically updated to reflect the committed changes. 
Closed by commit rGaca5d395d541: [lld][Hexagon] Support PLT relocation R_HEX_B15_PCREL_X/R_HEX_B9_PCREL_X (authored by sidneym). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D66542/new/ https://reviews.llvm.org/D66542 Files: lld/ELF/Arch/Hexagon.cpp lld/test/ELF/hexagon-plt.s lld/test/ELF/hexagon-shared.s -------------- next part -------------- A non-text attachment was scrubbed... Name: D66542.223852.patch Type: text/x-patch Size: 7333 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 07:29:37 2019 From: llvm-commits at lists.llvm.org (Clement Courbet via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 14:29:37 +0000 (UTC) Subject: [PATCH] D68642: [llvm-exegesis] Add options to SnippetGenerator. In-Reply-To: References: Message-ID: <0f2fe743deacf0db9749e19fae35d43e@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rG2cd0f2895946: [llvm-exegesis] Add options to SnippetGenerator. (authored by courbet). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68642/new/ https://reviews.llvm.org/D68642 Files: llvm/docs/CommandGuide/llvm-exegesis.rst llvm/test/tools/llvm-exegesis/X86/max-configs.test llvm/tools/llvm-exegesis/lib/Latency.h llvm/tools/llvm-exegesis/lib/SnippetGenerator.cpp llvm/tools/llvm-exegesis/lib/SnippetGenerator.h llvm/tools/llvm-exegesis/lib/Target.cpp llvm/tools/llvm-exegesis/lib/Target.h llvm/tools/llvm-exegesis/lib/Uops.h llvm/tools/llvm-exegesis/lib/X86/Target.cpp llvm/tools/llvm-exegesis/llvm-exegesis.cpp llvm/unittests/tools/llvm-exegesis/X86/SnippetGeneratorTest.cpp -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68642.223853.patch Type: text/x-patch Size: 12034 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 07:38:35 2019 From: llvm-commits at lists.llvm.org (Hideto Ueno via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 14:38:35 +0000 (UTC) Subject: [PATCH] D65402: [Attributor][MustExec] Deduce dereferenceable and nonnull attribute using MustBeExecutedContextExplorer In-Reply-To: References: Message-ID: <2d04ea15a89b0c1acaa8307dae521a0e@localhost.localdomain> uenoku updated this revision to Diff 223854. uenoku added a comment. Rebase. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D65402/new/ https://reviews.llvm.org/D65402 Files: llvm/include/llvm/Transforms/IPO/Attributor.h llvm/lib/Transforms/IPO/Attributor.cpp llvm/test/Transforms/FunctionAttrs/align.ll llvm/test/Transforms/FunctionAttrs/arg_nocapture.ll llvm/test/Transforms/FunctionAttrs/arg_returned.ll llvm/test/Transforms/FunctionAttrs/callbacks.ll llvm/test/Transforms/FunctionAttrs/dereferenceable.ll llvm/test/Transforms/FunctionAttrs/internal-noalias.ll llvm/test/Transforms/FunctionAttrs/liveness.ll llvm/test/Transforms/FunctionAttrs/noalias_returned.ll llvm/test/Transforms/FunctionAttrs/nocapture.ll llvm/test/Transforms/FunctionAttrs/nonnull.ll llvm/test/Transforms/FunctionAttrs/norecurse.ll llvm/test/Transforms/FunctionAttrs/nosync.ll llvm/test/Transforms/FunctionAttrs/read_write_returned_arguments_scc.ll llvm/test/Transforms/FunctionAttrs/readattrs.ll llvm/test/Transforms/InferFunctionAttrs/dereferenceable.ll -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D65402.223854.patch Type: text/x-patch Size: 45248 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 07:38:35 2019 From: llvm-commits at lists.llvm.org (Bardia Mahjour via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 14:38:35 +0000 (UTC) Subject: [PATCH] D68474: [DirectedGraph]: Add setTargetNode member function In-Reply-To: References: Message-ID: bmahjour added a comment. LGTM, but I'll let another reviewer give the green light :) Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68474/new/ https://reviews.llvm.org/D68474 From llvm-commits at lists.llvm.org Tue Oct 8 07:38:59 2019 From: llvm-commits at lists.llvm.org (Florian Hahn via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 14:38:59 +0000 (UTC) Subject: [PATCH] D68639: [MachineScheduler] Add a flag to enable scheduling of cfi instructions In-Reply-To: References: Message-ID: fhahn added a comment. In D68639#1699596, @tellenbach wrote: > In D68639#1699548, @fhahn wrote: > > > I've not looked too closely yet, but isn't that basically the same as what we do for debug values? If that's the case, I think we should consider unifying the code to handle debug & CFI. Specifically, would it be possible to just extend the current handling of debug instructions to also handle CFI instructions? That should need only some renaming and a few code changes, I think. > > > You are absolutely correct, this is essentially the same as for debug values with the addition that it can be disabled. I will try to unify both approaches; maybe both loops (currently one for the vector of debug values and one for the vector of cfi instructions) can be fused into one. I think we could just use the same map for both debug and CFI instructions and then just rename placeDebugInstruction to something like placeUnscheduledInstructions. 
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68639/new/ https://reviews.llvm.org/D68639 From llvm-commits at lists.llvm.org Tue Oct 8 07:39:00 2019 From: llvm-commits at lists.llvm.org (Mirko Brkusanin via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 14:39:00 +0000 (UTC) Subject: [PATCH] D68390: [Mips] Emit proper ABI for _mcount calls In-Reply-To: References: Message-ID: <5722397cd9d5efaecf95b546881971cd@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rG45e0f2437327: [Mips] Emit proper ABI for _mcount calls (authored by mbrkusanin). Herald added subscribers: jrtc27, hiraditya. Changed prior to commit: https://reviews.llvm.org/D68390?vs=223002&id=223860#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68390/new/ https://reviews.llvm.org/D68390 Files: llvm/lib/Target/Mips/MipsSEISelDAGToDAG.cpp llvm/lib/Target/Mips/MipsSEISelDAGToDAG.h llvm/test/CodeGen/Mips/mcount.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D68390.223860.patch Type: text/x-patch Size: 7911 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 07:48:17 2019 From: llvm-commits at lists.llvm.org (Pavel Labath via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 14:48:17 +0000 (UTC) Subject: [PATCH] D68645: MinidumpYAML: Add support for the thread list stream Message-ID: labath created this revision. labath added reviewers: amccarth, jhenderson, clayborg. Herald added a project: LLVM. The implementation is fairly straight-forward and uses the same patterns as the existing streams. 
The yaml form does not attempt to preserve the data in the "gaps" that can be created by setting a larger-than-required header or entry size in the stream header, because the existing consumer (lldb) does not make use of the information in the gap in any way, and attempting to preserve that would make the implementation more complicated. Repository: rL LLVM https://reviews.llvm.org/D68645 Files: include/llvm/BinaryFormat/Minidump.h include/llvm/ObjectYAML/MinidumpYAML.h lib/ObjectYAML/MinidumpEmitter.cpp lib/ObjectYAML/MinidumpYAML.cpp test/tools/obj2yaml/basic-minidump.yaml -------------- next part -------------- A non-text attachment was scrubbed... Name: D68645.223862.patch Type: text/x-patch Size: 11434 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 07:48:18 2019 From: llvm-commits at lists.llvm.org (Clement Courbet via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 14:48:18 +0000 (UTC) Subject: [PATCH] D68646: [llvm-exegesis] Explore LEA addressing modes. Message-ID: courbet created this revision. courbet added a reviewer: gchatelet. Herald added a subscriber: tschuett. Herald added a project: LLVM. courbet added a comment. Example results for uops on HSW: F10195180: lea_hsw_uops.html This will help for PR32326. This shows the well-known issue with `RBP` and `R13` as base registers. Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D68646 Files: llvm/test/tools/llvm-exegesis/X86/latency-LEA64r.s llvm/test/tools/llvm-exegesis/X86/uops-LEA64r.s llvm/tools/llvm-exegesis/lib/RegisterAliasing.h llvm/tools/llvm-exegesis/lib/Uops.cpp llvm/tools/llvm-exegesis/lib/X86/Target.cpp -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68646.223863.patch Type: text/x-patch Size: 9289 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 07:48:24 2019 From: llvm-commits at lists.llvm.org (Clement Courbet via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 14:48:24 +0000 (UTC) Subject: [PATCH] D68646: [llvm-exegesis] Explore LEA addressing modes. In-Reply-To: References: Message-ID: <0561b3689a60d9f0f3929781a4592260@localhost.localdomain> courbet added a comment. Example results for uops on HSW: F10195180: lea_hsw_uops.html Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68646/new/ https://reviews.llvm.org/D68646 From llvm-commits at lists.llvm.org Tue Oct 8 07:57:52 2019 From: llvm-commits at lists.llvm.org (Joseph Tremoulet via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 14:57:52 +0000 (UTC) Subject: [PATCH] D68645: MinidumpYAML: Add support for the thread list stream In-Reply-To: References: Message-ID: <1e25d39b0be8890109f6e063cdeb157e@localhost.localdomain> JosephTremoulet added a comment. Nit: Title says "thread" rather than "memory info" Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68645/new/ https://reviews.llvm.org/D68645 From llvm-commits at lists.llvm.org Tue Oct 8 07:57:52 2019 From: llvm-commits at lists.llvm.org (Hideto Ueno via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 14:57:52 +0000 (UTC) Subject: [PATCH] D65402: [Attributor][MustExec] Deduce dereferenceable and nonnull attribute using MustBeExecutedContextExplorer In-Reply-To: References: Message-ID: <39d5208022ae2afa7f432e6eb045fa5c@localhost.localdomain> uenoku updated this revision to Diff 223867. uenoku marked 2 inline comments as done. uenoku added a comment. Address the last comments. 
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D65402/new/ https://reviews.llvm.org/D65402 Files: llvm/include/llvm/Transforms/IPO/Attributor.h llvm/lib/Transforms/IPO/Attributor.cpp llvm/test/Transforms/FunctionAttrs/align.ll llvm/test/Transforms/FunctionAttrs/arg_nocapture.ll llvm/test/Transforms/FunctionAttrs/arg_returned.ll llvm/test/Transforms/FunctionAttrs/callbacks.ll llvm/test/Transforms/FunctionAttrs/dereferenceable.ll llvm/test/Transforms/FunctionAttrs/internal-noalias.ll llvm/test/Transforms/FunctionAttrs/liveness.ll llvm/test/Transforms/FunctionAttrs/noalias_returned.ll llvm/test/Transforms/FunctionAttrs/nocapture.ll llvm/test/Transforms/FunctionAttrs/nonnull.ll llvm/test/Transforms/FunctionAttrs/norecurse.ll llvm/test/Transforms/FunctionAttrs/nosync.ll llvm/test/Transforms/FunctionAttrs/read_write_returned_arguments_scc.ll llvm/test/Transforms/FunctionAttrs/readattrs.ll llvm/test/Transforms/InferFunctionAttrs/dereferenceable.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D65402.223867.patch Type: text/x-patch Size: 45387 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 07:58:05 2019 From: llvm-commits at lists.llvm.org (Hideto Ueno via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 14:58:05 +0000 (UTC) Subject: [PATCH] D65402: [Attributor][MustExec] Deduce dereferenceable and nonnull attribute using MustBeExecutedContextExplorer In-Reply-To: References: Message-ID: uenoku added inline comments. 
================ Comment at: llvm/lib/Transforms/IPO/Attributor.cpp:1563 + (int64_t)DL.getTypeStoreSize( + getPointerOperand(I)->getType()->getPointerElementType()); + ---------------- jdoerfert wrote: > I'm still unsure about this logic, correct me if I'm wrong but we have: > > `Base = Offset + I.getPointer()` > and we know due to the access I that there are `D` dereferenceable bytes with > `D = DL.getTypeStoreSize(getPointerOperand(I)->getType()->getPointerElementType());` > Now, deref from `Base` should be: > `max(0, D + Offset)` > which is the same we have `AADereferenceableFloating::updateImpl`, but with an offset in the other direction, I think. I agree. ================ Comment at: llvm/test/Transforms/FunctionAttrs/dereferenceable.ll:199 +define i32* @test_for_minus_index(i32* %p) { +; FIXME: This should be define nonnull dereferenceable(8) i32* @test_for_minus_index(i32* dereferenceable(4) nonnull %p) +; ATTRIBUTOR: define nonnull dereferenceable(8) i32* @test_for_minus_index(i32* "no-capture-maybe-returned" %p) ---------------- jdoerfert wrote: > I don't think we can deduce much about %p but only about the return, so I would expect > `FIXME: This should be define nonnull dereferenceable(8) i32* @test_for_minus_index(i32* nonnull %p)` > or an explanation why %p is deref. > It is my mistake. Fixed. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D65402/new/ https://reviews.llvm.org/D65402 From llvm-commits at lists.llvm.org Tue Oct 8 05:56:36 2019 From: llvm-commits at lists.llvm.org (Graham Hunter via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 12:56:36 +0000 (UTC) Subject: [PATCH] D53137: Scalable vector core instruction support + size queries In-Reply-To: References: Message-ID: <19fd6d1d7a269afe0df9d9575bc95f1a@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rGb302561b763a: [SVE][IR] Scalable Vector size queries and IR instruction support (authored by huntergr). 
Changed prior to commit: https://reviews.llvm.org/D53137?vs=222593&id=223835#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D53137/new/ https://reviews.llvm.org/D53137 Files: clang/lib/CodeGen/CGCall.cpp clang/lib/CodeGen/CGStmt.cpp clang/lib/CodeGen/CodeGenFunction.cpp llvm/include/llvm/ADT/DenseMapInfo.h llvm/include/llvm/IR/DataLayout.h llvm/include/llvm/IR/DerivedTypes.h llvm/include/llvm/IR/InstrTypes.h llvm/include/llvm/IR/Type.h llvm/include/llvm/Support/MachineValueType.h llvm/include/llvm/Support/ScalableSize.h llvm/include/llvm/Support/TypeSize.h llvm/lib/Analysis/InlineCost.cpp llvm/lib/CodeGen/Analysis.cpp llvm/lib/IR/DataLayout.cpp llvm/lib/IR/Instructions.cpp llvm/lib/IR/Type.cpp llvm/lib/Target/AArch64/AArch64ISelLowering.cpp llvm/lib/Transforms/Scalar/SROA.cpp llvm/test/Other/scalable-vectors-core-ir.ll llvm/unittests/CodeGen/ScalableVectorMVTsTest.cpp llvm/unittests/IR/VectorTypesTest.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D53137.223835.patch Type: text/x-patch Size: 50606 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 06:05:53 2019 From: llvm-commits at lists.llvm.org (George Rimar via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 13:05:53 +0000 (UTC) Subject: [PATCH] D68636: [llvm-readobj] - Refine the LLVM-style output to be consistent. Message-ID: grimar created this revision. grimar added reviewers: jhenderson, MaskRay. Herald added subscribers: pzheng, seiya, s.egerton, lenary, jsji, jocewei, rupprecht, PkmX, the_o, brucehoult, MartinMosbeck, rogfer01, atanasyan, edward-jones, zzheng, niosHD, sabuasal, apazos, simoncook, jakehehrlich, johnrusso, rbar, asb, kbarton, arichardson, nhaehnle, jvesely, nemanjai, sdardis, emaste. Herald added a reviewer: espindola. Herald added a reviewer: alexshap. Herald added a reviewer: rupprecht. Our LLVM-style output was inconsistent. 
This patch changes the output in the following way: 1. ElfHeader { -> ElfHeader [ 2. Version symbols { -> VersionSymbols [ 3. SHT_GNU_verdef { -> VersionDefinitions [ 4. SHT_GNU_verneed { -> VersionRequirements [ 5. EH_FRAME Header [ -> EHFrameHeader [ https://reviews.llvm.org/D68636 Files: test/MC/AArch64/elf-globaladdress.ll test/MC/ARM/elf-eflags-eabi.s test/MC/ELF/basic-elf-32.s test/MC/ELF/basic-elf-64.s test/MC/Mips/elf_basic.s test/MC/Mips/elf_header.s test/MC/PowerPC/ppc64-localentry.s test/MC/RISCV/elf-header.s test/Object/multiple-sections.yaml test/tools/llvm-objcopy/ELF/binary-output-target.test test/tools/llvm-objcopy/ELF/many-sections.test test/tools/llvm-objcopy/ELF/strip-sections.test test/tools/llvm-readobj/all.test test/tools/llvm-readobj/amdgpu-elf-definitions.test test/tools/llvm-readobj/archive.test test/tools/llvm-readobj/elf-file-headers.test test/tools/llvm-readobj/elf-file-types.test test/tools/llvm-readobj/elf-invalid-shstrndx.test test/tools/llvm-readobj/elf-versioninfo.test test/tools/llvm-readobj/file-name.test test/tools/llvm-readobj/thin-archive.test test/tools/yaml2obj/verdef-section.yaml test/tools/yaml2obj/verneed-section.yaml test/tools/yaml2obj/versym-section.yaml tools/llvm-readobj/DwarfCFIEHPrinter.h tools/llvm-readobj/ELFDumper.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D68636.223834.patch Type: text/x-patch Size: 21561 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 06:42:26 2019 From: llvm-commits at lists.llvm.org (James Henderson via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 13:42:26 +0000 (UTC) Subject: [PATCH] D68636: [llvm-readobj] - Refine the LLVM-style output to be consistent. In-Reply-To: References: Message-ID: <7458f24fda430da3d634c3595cd4b208@localhost.localdomain> jhenderson added a comment. Herald added a subscriber: wuzish. Did you remember to run LLD tests? I'd expect to see changes there... 
================
Comment at: tools/llvm-readobj/DwarfCFIEHPrinter.h:102
 uint64_t EHFrameHdrSize) const {
- ListScope L(W, "EH_FRAME Header");
+ ListScope L(W, "EHFrameHeader");
 W.startLine() << format("Address: 0x%" PRIx64 "\n", EHFrameHdrAddress);
----------------
The data printed isn't a list, because it prints some metadata. I think this should be a dictionary too.

================
Comment at: tools/llvm-readobj/ELFDumper.cpp:5128
 {
- DictScope D(W, "ElfHeader");
+ ListScope D(W, "ElfHeader");
 {
----------------
I don't think the ElfHeader should be a ListScope: it's not a list, but a dictionary (unlike, say, symbols).

================
Comment at: tools/llvm-readobj/ELFDumper.cpp:5598
 const Elf_Shdr *Sec) {
- DictScope SS(W, "Version symbols");
+ ListScope SS(W, "VersionSymbols");
 if (!Sec)
----------------
Ditto, though I'm wondering here why the VersionSymbols data includes stuff to do with its section header? If it didn't have that stuff, it would be a list.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68636/new/

https://reviews.llvm.org/D68636

From llvm-commits at lists.llvm.org Tue Oct 8 06:42:26 2019
From: llvm-commits at lists.llvm.org (George Rimar via Phabricator via llvm-commits)
Date: Tue, 08 Oct 2019 13:42:26 +0000 (UTC)
Subject: [PATCH] D68636: [llvm-readobj] - Refine the LLVM-style output to be consistent.
In-Reply-To:
References:
Message-ID: <67c5ba6fd827639d79b0d1c5ecca44cf@localhost.localdomain>

grimar added a comment.

In D68636#1699479, @jhenderson wrote:

> Did you remember to run LLD tests? I'd expect to see changes there...

Sure. That will be a separate change, once we agree with LLVM side.
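
The dictionary-versus-list distinction raised in the inline comments above can be sketched as follows. This is only an illustration, assuming (as the `{ ->` `[` rewrites in the patch summary suggest) that a dictionary scope prints `Name {` ... `}` and a list scope prints `Name [` ... `]`. `SketchScope` and `renderSketch` are hypothetical names made up for this example; they are not the `DictScope`/`ListScope` classes the patch touches, and the field values are placeholders.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical RAII scope: prints "Name <open>" on entry and "<close>" on
// exit. It only imitates the bracket styles discussed in the review.
class SketchScope {
public:
  SketchScope(std::ostream &OS, const std::string &Name, char Open, char Close)
      : OS(OS), Close(Close) {
    OS << Name << ' ' << Open << '\n';
  }
  ~SketchScope() { OS << Close << '\n'; }

private:
  std::ostream &OS;
  char Close;
};

// Renders one dictionary-like block (distinct key/value fields) and one
// list-like block (repeated entries of the same kind).
std::string renderSketch() {
  std::ostringstream OS;
  {
    SketchScope Header(OS, "ElfHeader", '{', '}'); // dictionary: named fields
    OS << "  Type: Relocatable\n";
  }
  {
    SketchScope Symbols(OS, "Symbols", '[', ']'); // list: homogeneous entries
    OS << "  Symbol { Name: foo }\n";
    OS << "  Symbol { Name: bar }\n";
  }
  return OS.str();
}
```

Under that assumption, a block that mixes metadata fields with its entries reads more naturally with the `{` `}` form, which seems to be the point of the objection to turning `ElfHeader` into a list.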
CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68636/new/

https://reviews.llvm.org/D68636

From llvm-commits at lists.llvm.org Tue Oct 8 06:51:41 2019
From: llvm-commits at lists.llvm.org (George Rimar via Phabricator via llvm-commits)
Date: Tue, 08 Oct 2019 13:51:41 +0000 (UTC)
Subject: [PATCH] D68636: [llvm-readobj] - Refine the LLVM-style output to be consistent.
In-Reply-To:
References:
Message-ID: <98a3a64991e12e746618e6b1b8c03793@localhost.localdomain>

grimar marked an inline comment as done.
grimar added inline comments.

================
Comment at: tools/llvm-readobj/ELFDumper.cpp:5598
 const Elf_Shdr *Sec) {
- DictScope SS(W, "Version symbols");
+ ListScope SS(W, "VersionSymbols");
 if (!Sec)
----------------
jhenderson wrote:
> Ditto, though I'm wondering here why the VersionSymbols data includes stuff to do with its section header? If it didn't have that stuff, it would be a list.
> though I'm wondering here why the VersionSymbols data includes stuff to do with its section header?

I do not know. The same information is printed under "Sections [" tag anyways, so it is not useful probably:

```
Section {
  Index: 3
  Name: .gnu.version (30)
  Type: SHT_GNU_versym (0x6FFFFFFF)
  Flags [ (0x0)
  ]
  Address: 0x0
  Offset: 0xB4
  Size: 2
  Link: 0
  Info: 0
  AddressAlignment: 0
  EntrySize: 2
}
```

Should we remove "Section Name"/"Address"/"Offset"/"Link" and make it to be a list?

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68636/new/

https://reviews.llvm.org/D68636

From llvm-commits at lists.llvm.org Tue Oct 8 07:01:08 2019
From: llvm-commits at lists.llvm.org (James Henderson via Phabricator via llvm-commits)
Date: Tue, 08 Oct 2019 14:01:08 +0000 (UTC)
Subject: [PATCH] D68636: [llvm-readobj] - Refine the LLVM-style output to be consistent.
In-Reply-To:
References:
Message-ID: <99f499d6e769f9894d64ccd5e2eef9bc@localhost.localdomain>

jhenderson added inline comments.
================
Comment at: tools/llvm-readobj/ELFDumper.cpp:5598
 const Elf_Shdr *Sec) {
- DictScope SS(W, "Version symbols");
+ ListScope SS(W, "VersionSymbols");
 if (!Sec)
----------------
grimar wrote:
> jhenderson wrote:
> > Ditto, though I'm wondering here why the VersionSymbols data includes stuff to do with its section header? If it didn't have that stuff, it would be a list.
> > though I'm wondering here why the VersionSymbols data includes stuff to do with its section header?
>
> I do not know. The same information is printed under "Sections [" tag anyways, so it is not useful probably:
>
> ```
> Section {
>   Index: 3
>   Name: .gnu.version (30)
>   Type: SHT_GNU_versym (0x6FFFFFFF)
>   Flags [ (0x0)
>   ]
>   Address: 0x0
>   Offset: 0xB4
>   Size: 2
>   Link: 0
>   Info: 0
>   AddressAlignment: 0
>   EntrySize: 2
> }
> ```
>
> Should we remove "Section Name"/"Address"/"Offset"/"Link" and make it to be a list?
I'd be inclined to do that personally, but it should be a separate change.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68636/new/

https://reviews.llvm.org/D68636

From llvm-commits at lists.llvm.org Tue Oct 8 07:01:15 2019
From: llvm-commits at lists.llvm.org (Graham Hunter via Phabricator via llvm-commits)
Date: Tue, 08 Oct 2019 14:01:15 +0000 (UTC)
Subject: [PATCH] D53137: Scalable vector core instruction support + size queries
In-Reply-To:
References:
Message-ID: <06eae444ca4adb455aaf06056783bd41@localhost.localdomain>

huntergr added a comment.

Hmm, forgot to add the last round of minor fixes before committing. Sorry about that, will push them as well.
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D53137/new/ https://reviews.llvm.org/D53137 From llvm-commits at lists.llvm.org Tue Oct 8 07:19:53 2019 From: llvm-commits at lists.llvm.org (Sam Elliott via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 14:19:53 +0000 (UTC) Subject: [PATCH] D67046: [RISCV] Add InstrInfo areMemAccessesTriviallyDisjoint hook In-Reply-To: References: Message-ID: lenary added inline comments. ================ Comment at: llvm/test/CodeGen/RISCV/disjoint.ll:6 +; RUN: -o /dev/null 2>&1 | FileCheck %s + +define i32 @test_disjoint(i32* %P, i32 %v) { ---------------- Please can you write a few sentences explaining how this tests that these two SW's are disjoint? I presume it's to do with the fact that the scheduler does not deem the `SW %1:gpr, %0:gpr, 8` to have to come after the first SW (it not being in the successors list). Knowing how this tests the disjoint hook will help us understand how to update the test if the exact output ever changes. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67046/new/ https://reviews.llvm.org/D67046 From llvm-commits at lists.llvm.org Tue Oct 8 07:29:11 2019 From: llvm-commits at lists.llvm.org (Lewis Revill via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 14:29:11 +0000 (UTC) Subject: [PATCH] D62190: [RISCV] Allow shrink wrapping for RISC-V In-Reply-To: References: Message-ID: <3597c681bb63fd6d931b08c74b3d6aac@localhost.localdomain> lewis-revill updated this revision to Diff 223849. lewis-revill added a comment. Make this patch independent of save/restore compatibility Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D62190/new/ https://reviews.llvm.org/D62190 Files: llvm/lib/Target/RISCV/RISCVFrameLowering.cpp llvm/test/CodeGen/RISCV/shrinkwrap.ll -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D62190.223849.patch Type: text/x-patch Size: 4976 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 07:29:14 2019 From: llvm-commits at lists.llvm.org (Sam Elliott via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 14:29:14 +0000 (UTC) Subject: [PATCH] D68559: [RISCV] Support fast calling convention In-Reply-To: References: Message-ID: <6c7302d01a53f4d38d50ae10265de9f9@localhost.localdomain> lenary added a comment. I like this change. `fastcc` is LLVM-internal only, right? Checking that we don't have to care about splitting operands that don't fit into registers, or any other psABI details, right? You said you had some performance numbers, it would be useful to have them in a comment on this patch. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68559/new/ https://reviews.llvm.org/D68559 From llvm-commits at lists.llvm.org Tue Oct 8 07:38:39 2019 From: llvm-commits at lists.llvm.org (Lewis Revill via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 14:38:39 +0000 (UTC) Subject: [PATCH] D62686: [RISCV] Add support for save/restore of callee-saved registers via libcalls In-Reply-To: References: Message-ID: <6e2bb791638c5128ba788bd4c14eb4a7@localhost.localdomain> lewis-revill updated this revision to Diff 223855. lewis-revill added a comment. 
Rebased to fix conflicts with recent split SP adjustments Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D62686/new/ https://reviews.llvm.org/D62686 Files: clang/lib/Driver/ToolChains/Arch/RISCV.cpp clang/test/Driver/riscv-features.c llvm/lib/Target/RISCV/RISCV.td llvm/lib/Target/RISCV/RISCVFrameLowering.cpp llvm/lib/Target/RISCV/RISCVFrameLowering.h llvm/lib/Target/RISCV/RISCVMachineFunctionInfo.h llvm/lib/Target/RISCV/RISCVRegisterInfo.cpp llvm/lib/Target/RISCV/RISCVRegisterInfo.h llvm/lib/Target/RISCV/RISCVSubtarget.h llvm/test/CodeGen/RISCV/saverestore.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D62686.223855.patch Type: text/x-patch Size: 46224 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 07:38:42 2019 From: llvm-commits at lists.llvm.org (Lewis Revill via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 14:38:42 +0000 (UTC) Subject: [PATCH] D68644: [RISCV] Prevent unsafe shrink wrapping with save/restore enabled Message-ID: lewis-revill created this revision. lewis-revill added reviewers: asb, luismarques, apazos. Herald added subscribers: llvm-commits, pzheng, simoncook, s.egerton, lenary, Jim, benna, psnobl, jocewei, PkmX, rkruppe, the_o, brucehoult, MartinMosbeck, rogfer01, edward-jones, zzheng, MaskRay, jrtc27, shiva0217, kito-cheng, niosHD, sabuasal, johnrusso, rbar, hiraditya. Herald added a project: LLVM. lewis-revill added parent revisions: D62190: [RISCV] Allow shrink wrapping for RISC-V, D62686: [RISCV] Add support for save/restore of callee-saved registers via libcalls. Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D68644 Files: llvm/lib/Target/RISCV/RISCVFrameLowering.cpp llvm/lib/Target/RISCV/RISCVFrameLowering.h llvm/test/CodeGen/RISCV/shrinkwrap.ll -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68644.223857.patch Type: text/x-patch Size: 4486 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 07:38:50 2019 From: llvm-commits at lists.llvm.org (Fangrui Song via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 14:38:50 +0000 (UTC) Subject: [PATCH] D68636: [llvm-readobj] - Refine the LLVM-style output to be consistent. In-Reply-To: References: Message-ID: <7447646a29fa97accd19cd7af544d9d7@localhost.localdomain> MaskRay added inline comments. ================ Comment at: tools/llvm-readobj/ELFDumper.cpp:5598 const Elf_Shdr *Sec) { - DictScope SS(W, "Version symbols"); + ListScope SS(W, "VersionSymbols"); if (!Sec) ---------------- jhenderson wrote: > grimar wrote: > > jhenderson wrote: > > > Ditto, though I'm wondering here why the VersionSymbols data includes stuff to do with its section header? If it didn't have that stuff, it would be a list. > > > though I'm wondering here why the VersionSymbols data includes stuff to do with its section header? > > > > I do not know. The same information is printed under "Sections [" tag anyways, so it is not useful probably: > > > > ``` > > Section { > > Index: 3 > > Name: .gnu.version (30) > > Type: SHT_GNU_versym (0x6FFFFFFF) > > Flags [ (0x0) > > ] > > Address: 0x0 > > Offset: 0xB4 > > Size: 2 > > Link: 0 > > Info: 0 > > AddressAlignment: 0 > > EntrySize: 2 > > } > > ``` > > > > Should we remove "Section Name"/"Address"/"Offset"/"Link" and make it to be a list? > I'd be inclined to do that personally, but it should be a separate change. The Linux Standard Base calls this "Symbol Version Table" but this is named "VersionSymbols" here... What do you think if we just use the regular section type name "SHT_GNU_versym"? It may improve discoverability as well. 
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68636/new/ https://reviews.llvm.org/D68636 From llvm-commits at lists.llvm.org Tue Oct 8 07:48:15 2019 From: llvm-commits at lists.llvm.org (Lewis Revill via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 14:48:15 +0000 (UTC) Subject: [PATCH] D66210: [RISCV] Enable the machine outliner for RISC-V In-Reply-To: References: Message-ID: <6fb6e1a7986bc3ff92aa2bf3ff043e52@localhost.localdomain> lewis-revill updated this revision to Diff 223859. lewis-revill retitled this revision from "[RFC/WIP][RISCV] Enable the machine outliner for RISC-V" to "[RISCV] Enable the machine outliner for RISC-V". lewis-revill added a comment. Rebased prior to commit; Will run this through the testsuite once more first. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D66210/new/ https://reviews.llvm.org/D66210 Files: llvm/lib/Target/RISCV/RISCVInstrInfo.cpp llvm/lib/Target/RISCV/RISCVInstrInfo.h llvm/lib/Target/RISCV/RISCVTargetMachine.cpp llvm/test/CodeGen/RISCV/machineoutliner.mir -------------- next part -------------- A non-text attachment was scrubbed... Name: D66210.223859.patch Type: text/x-patch Size: 11934 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 07:48:20 2019 From: llvm-commits at lists.llvm.org (George Rimar via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 14:48:20 +0000 (UTC) Subject: [PATCH] D68636: [llvm-readobj] - Refine the LLVM-style output to be consistent. In-Reply-To: References: Message-ID: <27f49f122c3b738fb4d656bfed7c2319@localhost.localdomain> grimar updated this revision to Diff 223861. grimar marked 4 inline comments as done. grimar edited the summary of this revision. grimar added a comment. - Addressed review comments. 
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68636/new/ https://reviews.llvm.org/D68636 Files: test/Object/multiple-sections.yaml test/tools/llvm-objcopy/ELF/binary-output-target.test test/tools/llvm-objcopy/ELF/many-sections.test test/tools/llvm-readobj/all.test test/tools/llvm-readobj/elf-file-headers.test test/tools/llvm-readobj/elf-versioninfo.test test/tools/llvm-readobj/file-name.test test/tools/yaml2obj/verdef-section.yaml test/tools/yaml2obj/verneed-section.yaml test/tools/yaml2obj/versym-section.yaml tools/llvm-readobj/DwarfCFIEHPrinter.h tools/llvm-readobj/ELFDumper.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D68636.223861.patch Type: text/x-patch Size: 9121 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 07:48:24 2019 From: llvm-commits at lists.llvm.org (George Rimar via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 14:48:24 +0000 (UTC) Subject: [PATCH] D68636: [llvm-readobj] - Refine the LLVM-style output to be consistent. In-Reply-To: References: Message-ID: <381e46a36ff787c4bec4c4d649eb2f6d@localhost.localdomain> grimar marked an inline comment as done. grimar added inline comments. ================ Comment at: tools/llvm-readobj/ELFDumper.cpp:5598 const Elf_Shdr *Sec) { - DictScope SS(W, "Version symbols"); + ListScope SS(W, "VersionSymbols"); if (!Sec) ---------------- MaskRay wrote: > jhenderson wrote: > > grimar wrote: > > > jhenderson wrote: > > > > Ditto, though I'm wondering here why the VersionSymbols data includes stuff to do with its section header? If it didn't have that stuff, it would be a list. > > > > though I'm wondering here why the VersionSymbols data includes stuff to do with its section header? > > > > > > I do not know. 
The same information is printed under "Sections [" tag anyways, so it is not useful probably: > > > > > > ``` > > > Section { > > > Index: 3 > > > Name: .gnu.version (30) > > > Type: SHT_GNU_versym (0x6FFFFFFF) > > > Flags [ (0x0) > > > ] > > > Address: 0x0 > > > Offset: 0xB4 > > > Size: 2 > > > Link: 0 > > > Info: 0 > > > AddressAlignment: 0 > > > EntrySize: 2 > > > } > > > ``` > > > > > > Should we remove "Section Name"/"Address"/"Offset"/"Link" and make it to be a list? > > I'd be inclined to do that personally, but it should be a separate change. > The Linux Standard Base calls this "Symbol Version Table" but this is named "VersionSymbols" here... What do you think if we just use the regular section type name "SHT_GNU_versym"? It may improve discoverability as well. > What do you think if we just use the regular section type name "SHT_GNU_versym"? It may improve discoverability as well. I.e. this is an opposite direction to what this patch does: ``` SHT_GNU_verdef { -> VersionDefinitions [ SHT_GNU_verneed { -> VersionRequirements [ ``` It will be only sections for which we use type names. Should we? CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68636/new/ https://reviews.llvm.org/D68636 From llvm-commits at lists.llvm.org Tue Oct 8 08:07:37 2019 From: llvm-commits at lists.llvm.org (Cyndy Ishida via llvm-commits) Date: Tue, 08 Oct 2019 15:07:37 -0000 Subject: [llvm] r374058 - [TextAPI] Introduce TBDv4 Message-ID: <20191008150737.3014B8F12E@lists.llvm.org> Author: cishida Date: Tue Oct 8 08:07:36 2019 New Revision: 374058 URL: http://llvm.org/viewvc/llvm-project?rev=374058&view=rev Log: [TextAPI] Introduce TBDv4 Summary: This format introduces new features and platforms The motivation for this format is to support more than 1 platform since previous versions only supported additional architectures and 1 platform, for example ios + ios-simulator and macCatalyst. 
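
As a rough illustration of the per-target strings this summary refers to (for example `x86_64-maccatalyst` or `x86_64-ios-simulator`), the sketch below splits a target at its first `-` into an architecture part and a platform part, so that a platform name containing its own dash stays intact. `SketchTarget` and `parseSketchTarget` are hypothetical names made up for this example; this is not the `Target::create` function added by the commit, and it does no validation of the architecture or platform names.

```cpp
#include <string>

// Hypothetical stand-in for an (architecture, platform) pair.
struct SketchTarget {
  std::string Arch;
  std::string Platform;
};

// Splits "arch-platform" at the FIRST '-'. Returns false (leaving Out
// untouched) when there is no dash or when either half would be empty.
bool parseSketchTarget(const std::string &Text, SketchTarget &Out) {
  const std::string::size_type Dash = Text.find('-');
  if (Dash == std::string::npos || Dash == 0 || Dash + 1 == Text.size())
    return false;
  Out.Arch = Text.substr(0, Dash);
  Out.Platform = Text.substr(Dash + 1);
  return true;
}
```

Splitting at the first dash rather than the last is the design choice that matters here: it keeps multi-word platform names such as `ios-simulator` in one piece.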
Reviewers: ributzka, steven_wu Reviewed By: ributzka Subscribers: mgorny, hiraditya, mgrang, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67529 Added: llvm/trunk/unittests/TextAPI/TextStubV4Tests.cpp Modified: llvm/trunk/include/llvm/TextAPI/MachO/InterfaceFile.h llvm/trunk/include/llvm/TextAPI/MachO/Symbol.h llvm/trunk/include/llvm/TextAPI/MachO/Target.h llvm/trunk/lib/TextAPI/MachO/Target.cpp llvm/trunk/lib/TextAPI/MachO/TextStub.cpp llvm/trunk/lib/TextAPI/MachO/TextStubCommon.cpp llvm/trunk/unittests/TextAPI/CMakeLists.txt Modified: llvm/trunk/include/llvm/TextAPI/MachO/InterfaceFile.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/TextAPI/MachO/InterfaceFile.h?rev=374058&r1=374057&r2=374058&view=diff ============================================================================== --- llvm/trunk/include/llvm/TextAPI/MachO/InterfaceFile.h (original) +++ llvm/trunk/include/llvm/TextAPI/MachO/InterfaceFile.h Tue Oct 8 08:07:36 2019 @@ -67,6 +67,9 @@ enum FileType : unsigned { /// Text-based stub file (.tbd) version 3.0 TBD_V3 = 1U << 2, + /// Text-based stub file (.tbd) version 4.0 + TBD_V4 = 1U << 3, + All = ~0U, LLVM_MARK_AS_BITMASK_ENUM(/*LargestValue=*/All), Modified: llvm/trunk/include/llvm/TextAPI/MachO/Symbol.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/TextAPI/MachO/Symbol.h?rev=374058&r1=374057&r2=374058&view=diff ============================================================================== --- llvm/trunk/include/llvm/TextAPI/MachO/Symbol.h (original) +++ llvm/trunk/include/llvm/TextAPI/MachO/Symbol.h Tue Oct 8 08:07:36 2019 @@ -38,7 +38,10 @@ enum class SymbolFlags : uint8_t { /// Undefined Undefined = 1U << 3, - LLVM_MARK_AS_BITMASK_ENUM(/*LargestValue=*/Undefined), + /// Rexported + Rexported = 1U << 4, + + LLVM_MARK_AS_BITMASK_ENUM(/*LargestValue=*/Rexported), }; // clang-format on @@ -50,7 +53,7 @@ enum class SymbolKind : uint8_t { ObjectiveCInstanceVariable, }; -using 
TargetList = SmallVector; +using TargetList = SmallVector; class Symbol { public: Symbol(SymbolKind Kind, StringRef Name, TargetList Targets, SymbolFlags Flags) @@ -81,6 +84,10 @@ public: return (Flags & SymbolFlags::Undefined) == SymbolFlags::Undefined; } + bool isReexported() const { + return (Flags & SymbolFlags::Rexported) == SymbolFlags::Rexported; + } + using const_target_iterator = TargetList::const_iterator; using const_target_range = llvm::iterator_range; const_target_range targets() const { return {Targets}; } Modified: llvm/trunk/include/llvm/TextAPI/MachO/Target.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/TextAPI/MachO/Target.h?rev=374058&r1=374057&r2=374058&view=diff ============================================================================== --- llvm/trunk/include/llvm/TextAPI/MachO/Target.h (original) +++ llvm/trunk/include/llvm/TextAPI/MachO/Target.h Tue Oct 8 08:07:36 2019 @@ -29,6 +29,8 @@ public: explicit Target(const llvm::Triple &Triple) : Arch(mapToArchitecture(Triple)), Platform(mapToPlatformKind(Triple)) {} + static llvm::Expected create(StringRef Target); + operator std::string() const; Architecture Arch; Modified: llvm/trunk/lib/TextAPI/MachO/Target.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/TextAPI/MachO/Target.cpp?rev=374058&r1=374057&r2=374058&view=diff ============================================================================== --- llvm/trunk/lib/TextAPI/MachO/Target.cpp (original) +++ llvm/trunk/lib/TextAPI/MachO/Target.cpp Tue Oct 8 08:07:36 2019 @@ -17,6 +17,44 @@ namespace llvm { namespace MachO { +Expected Target::create(StringRef TargetValue) { + auto Result = TargetValue.split('-'); + auto ArchitectureStr = Result.first; + auto Architecture = getArchitectureFromName(ArchitectureStr); + if (Architecture == AK_unknown) + return make_error("invalid architecture", + inconvertibleErrorCode()); + auto PlatformStr = Result.second; + PlatformKind Platform; + Platform = 
StringSwitch(PlatformStr) + .Case("macos", PlatformKind::macOS) + .Case("ios", PlatformKind::iOS) + .Case("tvos", PlatformKind::tvOS) + .Case("watchos", PlatformKind::watchOS) + .Case("bridgeos", PlatformKind::bridgeOS) + .Case("maccatalyst", PlatformKind::macCatalyst) + .Case("ios-simulator", PlatformKind::iOSSimulator) + .Case("tvos-simulator", PlatformKind::tvOSSimulator) + .Case("watchos-simulator", PlatformKind::watchOSSimulator) + .Default(PlatformKind::unknown); + + if (Platform == PlatformKind::unknown) { + if (PlatformStr.startswith("<") && PlatformStr.endswith(">")) { + PlatformStr = PlatformStr.drop_front().drop_back(); + unsigned long long RawValue; + if (PlatformStr.getAsInteger(10, RawValue)) + return make_error("invalid platform number", + inconvertibleErrorCode()); + + Platform = (PlatformKind)RawValue; + } + return make_error("invalid platform", + inconvertibleErrorCode()); + } + + return Target{Architecture, Platform}; +} + Target::operator std::string() const { return (getArchitectureName(Arch) + " (" + getPlatformName(Platform) + ")") .str(); @@ -42,4 +80,4 @@ ArchitectureSet mapToArchitectureSet(Arr } } // end namespace MachO. -} // end namespace llvm. \ No newline at end of file +} // end namespace llvm. Modified: llvm/trunk/lib/TextAPI/MachO/TextStub.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/TextAPI/MachO/TextStub.cpp?rev=374058&r1=374057&r2=374058&view=diff ============================================================================== --- llvm/trunk/lib/TextAPI/MachO/TextStub.cpp (original) +++ llvm/trunk/lib/TextAPI/MachO/TextStub.cpp Tue Oct 8 08:07:36 2019 @@ -147,6 +147,58 @@ Each undefineds section is defined as fo objc-ivars: [] # Optional: List of Objective C Instance Variables weak-ref-symbols: [] # Optional: List of weak defined symbols */ + +/* + + YAML Format specification. 
+ +--- !tapi-tbd +tbd-version: 4 # The tbd version for format +targets: [ armv7-ios, x86_64-maccatalyst ] # The list of applicable tapi supported target triples +uuids: # Optional: List of target and UUID pairs. + - target: armv7-ios + value: ... + - target: x86_64-maccatalyst + value: ... +flags: [] # Optional: +install-name: /u/l/libfoo.dylib # +current-version: 1.2.3 # Optional: defaults to 1.0 +compatibility-version: 1.0 # Optional: defaults to 1.0 +swift-abi-version: 0 # Optional: defaults to 0 +parent-umbrella: # Optional: +allowable-clients: + - targets: [ armv7-ios ] # Optional: + clients: [ clientA ] +exports: # List of export sections +... +re-exports: # List of reexport sections +... +undefineds: # List of undefineds sections +... + +Each export and reexport section is defined as following: + +- targets: [ arm64-macos ] # The list of target triples associated with symbols + symbols: [ _symA ] # Optional: List of symbols + objc-classes: [] # Optional: List of Objective-C classes + objc-eh-types: [] # Optional: List of Objective-C classes + # with EH + objc-ivars: [] # Optional: List of Objective C Instance + # Variables + weak-symbols: [] # Optional: List of weak defined symbols + thread-local-symbols: [] # Optional: List of thread local symbols +- targets: [ arm64-macos, x86_64-maccatalyst ] # Optional: Targets for applicable additional symbols + symbols: [ _symB ] # Optional: List of symbols + +Each undefineds section is defined as following: +- targets: [ arm64-macos ] # The list of target triples associated with symbols + symbols: [ _symC ] # Optional: List of symbols + objc-classes: [] # Optional: List of Objective-C classes + objc-eh-types: [] # Optional: List of Objective-C classes + # with EH + objc-ivars: [] # Optional: List of Objective C Instance Variables + weak-symbols: [] # Optional: List of weak defined symbols +*/ // clang-format on using namespace llvm; @@ -175,6 +227,38 @@ struct UndefinedSection { std::vector WeakRefSymbols; }; +// 
Sections for direct target mapping in TBDv4 +struct SymbolSection { + TargetList Targets; + std::vector Symbols; + std::vector Classes; + std::vector ClassEHs; + std::vector Ivars; + std::vector WeakSymbols; + std::vector TlvSymbols; +}; + +struct MetadataSection { + enum Option { Clients, Libraries }; + std::vector Targets; + std::vector Values; +}; + +struct UmbrellaSection { + std::vector Targets; + std::string Umbrella; +}; + +// UUID's for TBDv4 are mapped to target not arch +struct UUIDv4 { + Target TargetID; + std::string Value; + + UUIDv4() = default; + UUIDv4(const Target &TargetID, const std::string &Value) + : TargetID(TargetID), Value(Value) {} +}; + // clang-format off enum TBDFlags : unsigned { None = 0U, @@ -189,6 +273,12 @@ enum TBDFlags : unsigned { LLVM_YAML_IS_FLOW_SEQUENCE_VECTOR(Architecture) LLVM_YAML_IS_SEQUENCE_VECTOR(ExportSection) LLVM_YAML_IS_SEQUENCE_VECTOR(UndefinedSection) +// Specific to TBDv4 +LLVM_YAML_IS_SEQUENCE_VECTOR(SymbolSection) +LLVM_YAML_IS_SEQUENCE_VECTOR(MetadataSection) +LLVM_YAML_IS_SEQUENCE_VECTOR(UmbrellaSection) +LLVM_YAML_IS_FLOW_SEQUENCE_VECTOR(Target) +LLVM_YAML_IS_SEQUENCE_VECTOR(UUIDv4) namespace llvm { namespace yaml { @@ -231,6 +321,49 @@ template <> struct MappingTraits struct MappingTraits { + static void mapping(IO &IO, SymbolSection &Section) { + IO.mapRequired("targets", Section.Targets); + IO.mapOptional("symbols", Section.Symbols); + IO.mapOptional("objc-classes", Section.Classes); + IO.mapOptional("objc-eh-types", Section.ClassEHs); + IO.mapOptional("objc-ivars", Section.Ivars); + IO.mapOptional("weak-symbols", Section.WeakSymbols); + IO.mapOptional("thread-local-symbols", Section.TlvSymbols); + } +}; + +template <> struct MappingTraits { + static void mapping(IO &IO, UmbrellaSection &Section) { + IO.mapRequired("targets", Section.Targets); + IO.mapRequired("umbrella", Section.Umbrella); + } +}; + +template <> struct MappingTraits { + static void mapping(IO &IO, UUIDv4 &UUID) { + 
IO.mapRequired("target", UUID.TargetID); + IO.mapRequired("value", UUID.Value); + } +}; + +template <> +struct MappingContextTraits { + static void mapping(IO &IO, MetadataSection &Section, + MetadataSection::Option &OptionKind) { + IO.mapRequired("targets", Section.Targets); + switch (OptionKind) { + case MetadataSection::Option::Clients: + IO.mapRequired("clients", Section.Values); + return; + case MetadataSection::Option::Libraries: + IO.mapRequired("libraries", Section.Values); + return; + } + llvm_unreachable("unexpected option for metadata"); + } +}; + template <> struct ScalarBitSetTraits { static void bitset(IO &IO, TBDFlags &Flags) { IO.bitSetCase(Flags, "flat_namespace", TBDFlags::FlatNamespace); @@ -240,6 +373,55 @@ template <> struct ScalarBitSetTraits struct ScalarTraits { + static void output(const Target &Value, void *, raw_ostream &OS) { + OS << Value.Arch << "-"; + switch (Value.Platform) { + default: + OS << "unknown"; + break; + case PlatformKind::macOS: + OS << "macos"; + break; + case PlatformKind::iOS: + OS << "ios"; + break; + case PlatformKind::tvOS: + OS << "tvos"; + break; + case PlatformKind::watchOS: + OS << "watchos"; + break; + case PlatformKind::bridgeOS: + OS << "bridgeos"; + break; + case PlatformKind::macCatalyst: + OS << "maccatalyst"; + break; + case PlatformKind::iOSSimulator: + OS << "ios-simulator"; + break; + case PlatformKind::tvOSSimulator: + OS << "tvos-simulator"; + break; + case PlatformKind::watchOSSimulator: + OS << "watchos-simulator"; + break; + } + } + + static StringRef input(StringRef Scalar, void *, Target &Value) { + auto Result = Target::create(Scalar); + if (!Result) + return toString(Result.takeError()); + + Value = *Result; + return {}; + } + + static QuotingType mustQuote(StringRef) { return QuotingType::None; } +}; + template <> struct MappingTraits { struct NormalizedTBD { explicit NormalizedTBD(IO &IO) {} @@ -555,71 +737,339 @@ template <> struct MappingTraits Undefineds; }; + static void 
setFileTypeForInput(TextAPIContext *Ctx, IO &IO) { + if (IO.mapTag("!tapi-tbd", false)) + Ctx->FileKind = FileType::TBD_V4; + else if (IO.mapTag("!tapi-tbd-v3", false)) + Ctx->FileKind = FileType::TBD_V3; + else if (IO.mapTag("!tapi-tbd-v2", false)) + Ctx->FileKind = FileType::TBD_V2; + else if (IO.mapTag("!tapi-tbd-v1", false) || + IO.mapTag("tag:yaml.org,2002:map", false)) + Ctx->FileKind = FileType::TBD_V1; + else { + Ctx->FileKind = FileType::Invalid; + return; + } + } + static void mapping(IO &IO, const InterfaceFile *&File) { auto *Ctx = reinterpret_cast(IO.getContext()); assert((!Ctx || !IO.outputting() || (Ctx && Ctx->FileKind != FileType::Invalid)) && "File type is not set in YAML context"); - MappingNormalization Keys(IO, File); - // prope file type when reading. if (!IO.outputting()) { - if (IO.mapTag("!tapi-tbd-v3", false)) - Ctx->FileKind = FileType::TBD_V3; - else if (IO.mapTag("!tapi-tbd-v2", false)) - Ctx->FileKind = FileType::TBD_V2; - else if (IO.mapTag("!tapi-tbd-v1", false) || - IO.mapTag("tag:yaml.org,2002:map", false)) - Ctx->FileKind = FileType::TBD_V1; - else { + setFileTypeForInput(Ctx, IO); + switch (Ctx->FileKind) { + default: + break; + case FileType::TBD_V4: + mapKeysToValuesV4(IO, File); + return; + case FileType::Invalid: IO.setError("unsupported file type"); return; } - } - - // Set file type when writing. - if (IO.outputting()) { + } else { + // Set file type when writing. switch (Ctx->FileKind) { default: llvm_unreachable("unexpected file type"); - case FileType::TBD_V1: - // Don't write the tag into the .tbd file for TBD v1. 
+ case FileType::TBD_V4: + mapKeysToValuesV4(IO, File); + return; + case FileType::TBD_V3: + IO.mapTag("!tapi-tbd-v3", true); break; case FileType::TBD_V2: IO.mapTag("!tapi-tbd-v2", true); break; - case FileType::TBD_V3: - IO.mapTag("!tapi-tbd-v3", true); + case FileType::TBD_V1: + // Don't write the tag into the .tbd file for TBD v1 break; } } + mapKeysToValues(Ctx->FileKind, IO, File); + } + + using SectionList = std::vector; + struct NormalizedTBD_V4 { + explicit NormalizedTBD_V4(IO &IO) {} + NormalizedTBD_V4(IO &IO, const InterfaceFile *&File) { + auto Ctx = reinterpret_cast(IO.getContext()); + assert(Ctx); + TBDVersion = Ctx->FileKind >> 1; + Targets.insert(Targets.begin(), File->targets().begin(), + File->targets().end()); + for (const auto &IT : File->uuids()) + UUIDs.emplace_back(IT.first, IT.second); + InstallName = File->getInstallName(); + CurrentVersion = File->getCurrentVersion(); + CompatibilityVersion = File->getCompatibilityVersion(); + SwiftVersion = File->getSwiftABIVersion(); + + Flags = TBDFlags::None; + if (!File->isApplicationExtensionSafe()) + Flags |= TBDFlags::NotApplicationExtensionSafe; + + if (!File->isTwoLevelNamespace()) + Flags |= TBDFlags::FlatNamespace; + + if (File->isInstallAPI()) + Flags |= TBDFlags::InstallAPI; + + { + using TargetList = SmallVector; + std::map valueToTargetList; + for (const auto &it : File->umbrellas()) + valueToTargetList[it.second].emplace_back(it.first); + + for (const auto &it : valueToTargetList) { + UmbrellaSection CurrentSection; + CurrentSection.Targets.insert(CurrentSection.Targets.begin(), + it.second.begin(), it.second.end()); + CurrentSection.Umbrella = it.first; + ParentUmbrellas.emplace_back(std::move(CurrentSection)); + } + } + + assignTargetsToLibrary(File->allowableClients(), AllowableClients); + assignTargetsToLibrary(File->reexportedLibraries(), ReexportedLibraries); + + auto handleSymbols = + [](SectionList &CurrentSections, + InterfaceFile::const_filtered_symbol_range Symbols, + 
std::function Pred) { + using TargetList = SmallVector; + std::set TargetSet; + std::map SymbolToTargetList; + for (const auto *Symbol : Symbols) { + if (!Pred(Symbol)) + continue; + TargetList Targets(Symbol->targets()); + SymbolToTargetList[Symbol] = Targets; + TargetSet.emplace(std::move(Targets)); + } + for (const auto &TargetIDs : TargetSet) { + SymbolSection CurrentSection; + CurrentSection.Targets.insert(CurrentSection.Targets.begin(), + TargetIDs.begin(), TargetIDs.end()); + + for (const auto &IT : SymbolToTargetList) { + if (IT.second != TargetIDs) + continue; + + const auto *Symbol = IT.first; + switch (Symbol->getKind()) { + case SymbolKind::GlobalSymbol: + if (Symbol->isWeakDefined()) + CurrentSection.WeakSymbols.emplace_back(Symbol->getName()); + else if (Symbol->isThreadLocalValue()) + CurrentSection.TlvSymbols.emplace_back(Symbol->getName()); + else + CurrentSection.Symbols.emplace_back(Symbol->getName()); + break; + case SymbolKind::ObjectiveCClass: + CurrentSection.Classes.emplace_back(Symbol->getName()); + break; + case SymbolKind::ObjectiveCClassEHType: + CurrentSection.ClassEHs.emplace_back(Symbol->getName()); + break; + case SymbolKind::ObjectiveCInstanceVariable: + CurrentSection.Ivars.emplace_back(Symbol->getName()); + break; + } + } + sort(CurrentSection.Symbols); + sort(CurrentSection.Classes); + sort(CurrentSection.ClassEHs); + sort(CurrentSection.Ivars); + sort(CurrentSection.WeakSymbols); + sort(CurrentSection.TlvSymbols); + CurrentSections.emplace_back(std::move(CurrentSection)); + } + }; + + handleSymbols(Exports, File->exports(), [](const Symbol *Symbol) { + return !Symbol->isReexported(); + }); + handleSymbols(Reexports, File->exports(), [](const Symbol *Symbol) { + return Symbol->isReexported(); + }); + handleSymbols(Undefineds, File->undefineds(), + [](const Symbol *Symbol) { return true; }); + } + + const InterfaceFile *denormalize(IO &IO) { + auto Ctx = reinterpret_cast(IO.getContext()); + assert(Ctx); + + auto *File = new 
InterfaceFile; + File->setPath(Ctx->Path); + File->setFileType(Ctx->FileKind); + for (auto &id : UUIDs) + File->addUUID(id.TargetID, id.Value); + File->addTargets(Targets); + File->setInstallName(InstallName); + File->setCurrentVersion(CurrentVersion); + File->setCompatibilityVersion(CompatibilityVersion); + File->setSwiftABIVersion(SwiftVersion); + for (const auto &CurrentSection : ParentUmbrellas) + for (const auto &target : CurrentSection.Targets) + File->addParentUmbrella(target, CurrentSection.Umbrella); + File->setTwoLevelNamespace(!(Flags & TBDFlags::FlatNamespace)); + File->setApplicationExtensionSafe( + !(Flags & TBDFlags::NotApplicationExtensionSafe)); + File->setInstallAPI(Flags & TBDFlags::InstallAPI); + + for (const auto &CurrentSection : AllowableClients) { + for (const auto &lib : CurrentSection.Values) + for (const auto &Target : CurrentSection.Targets) + File->addAllowableClient(lib, Target); + } + + for (const auto &CurrentSection : ReexportedLibraries) { + for (const auto &Lib : CurrentSection.Values) + for (const auto &Target : CurrentSection.Targets) + File->addReexportedLibrary(Lib, Target); + } + + auto handleSymbols = [File](const SectionList &CurrentSections, + SymbolFlags Flag = SymbolFlags::None) { + for (const auto &CurrentSection : CurrentSections) { + for (auto &sym : CurrentSection.Symbols) + File->addSymbol(SymbolKind::GlobalSymbol, sym, + CurrentSection.Targets, Flag); + + for (auto &sym : CurrentSection.Classes) + File->addSymbol(SymbolKind::ObjectiveCClass, sym, + CurrentSection.Targets); + + for (auto &sym : CurrentSection.ClassEHs) + File->addSymbol(SymbolKind::ObjectiveCClassEHType, sym, + CurrentSection.Targets); + + for (auto &sym : CurrentSection.Ivars) + File->addSymbol(SymbolKind::ObjectiveCInstanceVariable, sym, + CurrentSection.Targets); + + for (auto &sym : CurrentSection.WeakSymbols) + File->addSymbol(SymbolKind::GlobalSymbol, sym, + CurrentSection.Targets); + for (auto &sym : CurrentSection.TlvSymbols) + 
File->addSymbol(SymbolKind::GlobalSymbol, sym, + CurrentSection.Targets, + SymbolFlags::ThreadLocalValue); + } + }; + + handleSymbols(Exports); + handleSymbols(Reexports, SymbolFlags::Rexported); + handleSymbols(Undefineds, SymbolFlags::Undefined); + + return File; + } + + unsigned TBDVersion; + std::vector UUIDs; + TargetList Targets; + StringRef InstallName; + PackedVersion CurrentVersion; + PackedVersion CompatibilityVersion; + SwiftVersion SwiftVersion{0}; + std::vector AllowableClients; + std::vector ReexportedLibraries; + TBDFlags Flags{TBDFlags::None}; + std::vector ParentUmbrellas; + SectionList Exports; + SectionList Reexports; + SectionList Undefineds; + + private: + using TargetList = SmallVector; + void assignTargetsToLibrary(const std::vector &Libraries, + std::vector &Section) { + std::set targetSet; + std::map valueToTargetList; + for (const auto &library : Libraries) { + TargetList targets(library.targets()); + valueToTargetList[&library] = targets; + targetSet.emplace(std::move(targets)); + } + + for (const auto &targets : targetSet) { + MetadataSection CurrentSection; + CurrentSection.Targets.insert(CurrentSection.Targets.begin(), + targets.begin(), targets.end()); + + for (const auto &it : valueToTargetList) { + if (it.second != targets) + continue; + CurrentSection.Values.emplace_back(it.first->getInstallName()); + } + llvm::sort(CurrentSection.Values); + Section.emplace_back(std::move(CurrentSection)); + } + } + }; + + static void mapKeysToValues(FileType FileKind, IO &IO, + const InterfaceFile *&File) { + MappingNormalization Keys(IO, File); IO.mapRequired("archs", Keys->Architectures); - if (Ctx->FileKind != FileType::TBD_V1) + if (FileKind != FileType::TBD_V1) IO.mapOptional("uuids", Keys->UUIDs); IO.mapRequired("platform", Keys->Platforms); - if (Ctx->FileKind != FileType::TBD_V1) + if (FileKind != FileType::TBD_V1) IO.mapOptional("flags", Keys->Flags, TBDFlags::None); IO.mapRequired("install-name", Keys->InstallName); 
IO.mapOptional("current-version", Keys->CurrentVersion, PackedVersion(1, 0, 0)); IO.mapOptional("compatibility-version", Keys->CompatibilityVersion, PackedVersion(1, 0, 0)); - if (Ctx->FileKind != FileType::TBD_V3) + if (FileKind != FileType::TBD_V3) IO.mapOptional("swift-version", Keys->SwiftABIVersion, SwiftVersion(0)); else IO.mapOptional("swift-abi-version", Keys->SwiftABIVersion, SwiftVersion(0)); IO.mapOptional("objc-constraint", Keys->ObjCConstraint, - (Ctx->FileKind == FileType::TBD_V1) + (FileKind == FileType::TBD_V1) ? ObjCConstraintType::None : ObjCConstraintType::Retain_Release); - if (Ctx->FileKind != FileType::TBD_V1) + if (FileKind != FileType::TBD_V1) IO.mapOptional("parent-umbrella", Keys->ParentUmbrella, StringRef()); IO.mapOptional("exports", Keys->Exports); - if (Ctx->FileKind != FileType::TBD_V1) + if (FileKind != FileType::TBD_V1) IO.mapOptional("undefineds", Keys->Undefineds); } + + static void mapKeysToValuesV4(IO &IO, const InterfaceFile *&File) { + MappingNormalization Keys(IO, + File); + IO.mapTag("!tapi-tbd", true); + IO.mapRequired("tbd-version", Keys->TBDVersion); + IO.mapRequired("targets", Keys->Targets); + IO.mapOptional("uuids", Keys->UUIDs); + IO.mapOptional("flags", Keys->Flags, TBDFlags::None); + IO.mapRequired("install-name", Keys->InstallName); + IO.mapOptional("current-version", Keys->CurrentVersion, + PackedVersion(1, 0, 0)); + IO.mapOptional("compatibility-version", Keys->CompatibilityVersion, + PackedVersion(1, 0, 0)); + IO.mapOptional("swift-abi-version", Keys->SwiftVersion, SwiftVersion(0)); + IO.mapOptional("parent-umbrella", Keys->ParentUmbrellas); + auto OptionKind = MetadataSection::Option::Clients; + IO.mapOptionalWithContext("allowable-clients", Keys->AllowableClients, + OptionKind); + OptionKind = MetadataSection::Option::Libraries; + IO.mapOptionalWithContext("reexported-libraries", Keys->ReexportedLibraries, + OptionKind); + IO.mapOptional("exports", Keys->Exports); + IO.mapOptional("reexports", 
Keys->Reexports); + IO.mapOptional("undefineds", Keys->Undefineds); + } }; template <> Modified: llvm/trunk/lib/TextAPI/MachO/TextStubCommon.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/TextAPI/MachO/TextStubCommon.cpp?rev=374058&r1=374057&r2=374058&view=diff ============================================================================== --- llvm/trunk/lib/TextAPI/MachO/TextStubCommon.cpp (original) +++ llvm/trunk/lib/TextAPI/MachO/TextStubCommon.cpp Tue Oct 8 08:07:36 2019 @@ -172,14 +172,25 @@ void ScalarTraits::output( break; } } -StringRef ScalarTraits::input(StringRef Scalar, void *, +StringRef ScalarTraits::input(StringRef Scalar, void *IO, SwiftVersion &Value) { - Value = StringSwitch(Scalar) - .Case("1.0", 1) - .Case("1.1", 2) - .Case("2.0", 3) - .Case("3.0", 4) - .Default(0); + const auto *Ctx = reinterpret_cast(IO); + assert((!Ctx || Ctx->FileKind != FileType::Invalid) && + "File type is not set in context"); + + if (Ctx->FileKind == FileType::TBD_V4) { + if (Scalar.getAsInteger(10, Value)) + return "invalid Swift ABI version."; + return {}; + } else { + Value = StringSwitch(Scalar) + .Case("1.0", 1) + .Case("1.1", 2) + .Case("2.0", 3) + .Case("3.0", 4) + .Default(0); + } + if (Value != SwiftVersion(0)) return {}; Modified: llvm/trunk/unittests/TextAPI/CMakeLists.txt URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/unittests/TextAPI/CMakeLists.txt?rev=374058&r1=374057&r2=374058&view=diff ============================================================================== --- llvm/trunk/unittests/TextAPI/CMakeLists.txt (original) +++ llvm/trunk/unittests/TextAPI/CMakeLists.txt Tue Oct 8 08:07:36 2019 @@ -7,6 +7,7 @@ add_llvm_unittest(TextAPITests TextStubV1Tests.cpp TextStubV2Tests.cpp TextStubV3Tests.cpp + TextStubV4Tests.cpp ) target_link_libraries(TextAPITests PRIVATE LLVMTestingSupport) Added: llvm/trunk/unittests/TextAPI/TextStubV4Tests.cpp URL: 
http://llvm.org/viewvc/llvm-project/llvm/trunk/unittests/TextAPI/TextStubV4Tests.cpp?rev=374058&view=auto ============================================================================== --- llvm/trunk/unittests/TextAPI/TextStubV4Tests.cpp (added) +++ llvm/trunk/unittests/TextAPI/TextStubV4Tests.cpp Tue Oct 8 08:07:36 2019 @@ -0,0 +1,558 @@ +//===-- TextStubV4Tests.cpp - TBD V4 File Test ----------------------------===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===-----------------------------------------------------------------------===/ +#include "llvm/TextAPI/MachO/InterfaceFile.h" +#include "llvm/TextAPI/MachO/TextAPIReader.h" +#include "llvm/TextAPI/MachO/TextAPIWriter.h" +#include "gtest/gtest.h" +#include +#include + +using namespace llvm; +using namespace llvm::MachO; + +struct ExampleSymbol { + SymbolKind Kind; + std::string Name; + bool WeakDefined; + bool ThreadLocalValue; +}; +using ExampleSymbolSeq = std::vector; +using UUIDs = std::vector>; + +inline bool operator<(const ExampleSymbol &LHS, const ExampleSymbol &RHS) { + return std::tie(LHS.Kind, LHS.Name) < std::tie(RHS.Kind, RHS.Name); +} + +inline bool operator==(const ExampleSymbol &LHS, const ExampleSymbol &RHS) { + return std::tie(LHS.Kind, LHS.Name, LHS.WeakDefined, LHS.ThreadLocalValue) == + std::tie(RHS.Kind, RHS.Name, RHS.WeakDefined, RHS.ThreadLocalValue); +} + +static ExampleSymbol TBDv4ExportedSymbols[] = { + {SymbolKind::GlobalSymbol, "_symA", false, false}, + {SymbolKind::GlobalSymbol, "_symAB", false, false}, + {SymbolKind::GlobalSymbol, "_symB", false, false}, +}; + +static ExampleSymbol TBDv4ReexportedSymbols[] = { + {SymbolKind::GlobalSymbol, "_symC", false, false}, +}; + +static ExampleSymbol TBDv4UndefinedSymbols[] = { + {SymbolKind::GlobalSymbol, "_symD", false, false}, +}; + +namespace TBDv4 { + +TEST(TBDv4, 
ReadFile) { + static const char tbd_v4_file[] = + "--- !tapi-tbd\n" + "tbd-version: 4\n" + "targets: [ i386-macos, x86_64-macos, x86_64-ios ]\n" + "uuids:\n" + " - target: i386-macos\n" + " value: 00000000-0000-0000-0000-000000000000\n" + " - target: x86_64-macos\n" + " value: 11111111-1111-1111-1111-111111111111\n" + " - target: x86_64-ios\n" + " value: 11111111-1111-1111-1111-111111111111\n" + "flags: [ flat_namespace, installapi ]\n" + "install-name: Umbrella.framework/Umbrella\n" + "current-version: 1.2.3\n" + "compatibility-version: 1.2\n" + "swift-abi-version: 5\n" + "parent-umbrella:\n" + " - targets: [ i386-macos, x86_64-macos, x86_64-ios ]\n" + " umbrella: System\n" + "allowable-clients:\n" + " - targets: [ i386-macos, x86_64-macos, x86_64-ios ]\n" + " clients: [ ClientA ]\n" + "reexported-libraries:\n" + " - targets: [ i386-macos ]\n" + " libraries: [ /System/Library/Frameworks/A.framework/A ]\n" + "exports:\n" + " - targets: [ i386-macos ]\n" + " symbols: [ _symA ]\n" + " objc-classes: []\n" + " objc-eh-types: []\n" + " objc-ivars: []\n" + " weak-symbols: []\n" + " thread-local-symbols: []\n" + " - targets: [ x86_64-ios ]\n" + " symbols: [_symB]\n" + " - targets: [ x86_64-macos, x86_64-ios ]\n" + " symbols: [_symAB]\n" + "reexports:\n" + " - targets: [ i386-macos ]\n" + " symbols: [_symC]\n" + " objc-classes: []\n" + " objc-eh-types: []\n" + " objc-ivars: []\n" + " weak-symbols: []\n" + " thread-local-symbols: []\n" + "undefineds:\n" + " - targets: [ i386-macos ]\n" + " symbols: [ _symD ]\n" + " objc-classes: []\n" + " objc-eh-types: []\n" + " objc-ivars: []\n" + " weak-symbols: []\n" + " thread-local-symbols: []\n" + "...\n"; + + auto Result = TextAPIReader::get(MemoryBufferRef(tbd_v4_file, "Test.tbd")); + EXPECT_TRUE(!!Result); + auto File = std::move(Result.get()); + EXPECT_EQ(FileType::TBD_V4, File->getFileType()); + PlatformSet Platforms; + Platforms.insert(PlatformKind::macOS); + Platforms.insert(PlatformKind::iOS); + auto Archs = AK_i386 | 
AK_x86_64; + TargetList Targets = { + Target(AK_i386, PlatformKind::macOS), + Target(AK_x86_64, PlatformKind::macOS), + Target(AK_x86_64, PlatformKind::iOS), + }; + UUIDs uuids = {{Targets[0], "00000000-0000-0000-0000-000000000000"}, + {Targets[1], "11111111-1111-1111-1111-111111111111"}, + {Targets[2], "11111111-1111-1111-1111-111111111111"}}; + EXPECT_EQ(Archs, File->getArchitectures()); + EXPECT_EQ(uuids, File->uuids()); + EXPECT_EQ(Platforms.size(), File->getPlatforms().size()); + for (auto Platform : File->getPlatforms()) + EXPECT_EQ(Platforms.count(Platform), 1U); + EXPECT_EQ(std::string("Umbrella.framework/Umbrella"), File->getInstallName()); + EXPECT_EQ(PackedVersion(1, 2, 3), File->getCurrentVersion()); + EXPECT_EQ(PackedVersion(1, 2, 0), File->getCompatibilityVersion()); + EXPECT_EQ(5U, File->getSwiftABIVersion()); + EXPECT_FALSE(File->isTwoLevelNamespace()); + EXPECT_TRUE(File->isApplicationExtensionSafe()); + EXPECT_TRUE(File->isInstallAPI()); + InterfaceFileRef client("ClientA", Targets); + InterfaceFileRef reexport("/System/Library/Frameworks/A.framework/A", + {Targets[0]}); + EXPECT_EQ(1U, File->allowableClients().size()); + EXPECT_EQ(client, File->allowableClients().front()); + EXPECT_EQ(1U, File->reexportedLibraries().size()); + EXPECT_EQ(reexport, File->reexportedLibraries().front()); + + ExampleSymbolSeq Exports, Reexports, Undefineds; + ExampleSymbol temp; + for (const auto *Sym : File->symbols()) { + temp = ExampleSymbol{Sym->getKind(), Sym->getName(), Sym->isWeakDefined(), + Sym->isThreadLocalValue()}; + EXPECT_FALSE(Sym->isWeakReferenced()); + if (Sym->isUndefined()) + Undefineds.emplace_back(std::move(temp)); + else + Sym->isReexported() ? 
Reexports.emplace_back(std::move(temp)) + : Exports.emplace_back(std::move(temp)); + } + llvm::sort(Exports.begin(), Exports.end()); + llvm::sort(Reexports.begin(), Reexports.end()); + llvm::sort(Undefineds.begin(), Undefineds.end()); + + EXPECT_EQ(sizeof(TBDv4ExportedSymbols) / sizeof(ExampleSymbol), + Exports.size()); + EXPECT_EQ(sizeof(TBDv4ReexportedSymbols) / sizeof(ExampleSymbol), + Reexports.size()); + EXPECT_EQ(sizeof(TBDv4UndefinedSymbols) / sizeof(ExampleSymbol), + Undefineds.size()); + EXPECT_TRUE(std::equal(Exports.begin(), Exports.end(), + std::begin(TBDv4ExportedSymbols))); + EXPECT_TRUE(std::equal(Reexports.begin(), Reexports.end(), + std::begin(TBDv4ReexportedSymbols))); + EXPECT_TRUE(std::equal(Undefineds.begin(), Undefineds.end(), + std::begin(TBDv4UndefinedSymbols))); +} + +TEST(TBDv4, WriteFile) { + static const char tbd_v4_file[] = + "--- !tapi-tbd\n" + "tbd-version: 4\n" + "targets: [ i386-macos, x86_64-ios-simulator ]\n" + "uuids:\n" + " - target: i386-macos\n" + " value: 00000000-0000-0000-0000-000000000000\n" + " - target: x86_64-ios-simulator\n" + " value: 11111111-1111-1111-1111-111111111111\n" + "flags: [ installapi ]\n" + "install-name: 'Umbrella.framework/Umbrella'\n" + "current-version: 1.2.3\n" + "compatibility-version: 0\n" + "swift-abi-version: 5\n" + "parent-umbrella:\n" + " - targets: [ i386-macos, x86_64-ios-simulator ]\n" + " umbrella: System\n" + "allowable-clients:\n" + " - targets: [ i386-macos ]\n" + " clients: [ ClientA ]\n" + "exports:\n" + " - targets: [ i386-macos ]\n" + " symbols: [ _symA ]\n" + " objc-classes: [ Class1 ]\n" + " weak-symbols: [ _symC ]\n" + " - targets: [ x86_64-ios-simulator ]\n" + " symbols: [ _symB ]\n" + "...\n"; + + InterfaceFile File; + TargetList Targets = { + Target(AK_i386, PlatformKind::macOS), + Target(AK_x86_64, PlatformKind::iOSSimulator), + }; + UUIDs uuids = {{Targets[0], "00000000-0000-0000-0000-000000000000"}, + {Targets[1], "11111111-1111-1111-1111-111111111111"}}; + 
File.setInstallName("Umbrella.framework/Umbrella"); + File.setFileType(FileType::TBD_V4); + File.addTargets(Targets); + File.addUUID(uuids[0].first, uuids[0].second); + File.addUUID(uuids[1].first, uuids[1].second); + File.setCurrentVersion(PackedVersion(1, 2, 3)); + File.setTwoLevelNamespace(); + File.setInstallAPI(true); + File.setApplicationExtensionSafe(true); + File.setSwiftABIVersion(5); + File.addAllowableClient("ClientA", Targets[0]); + File.addParentUmbrella(Targets[0], "System"); + File.addParentUmbrella(Targets[1], "System"); + File.addSymbol(SymbolKind::GlobalSymbol, "_symA", {Targets[0]}); + File.addSymbol(SymbolKind::GlobalSymbol, "_symB", {Targets[1]}); + File.addSymbol(SymbolKind::GlobalSymbol, "_symC", {Targets[0]}, + SymbolFlags::WeakDefined); + File.addSymbol(SymbolKind::ObjectiveCClass, "Class1", {Targets[0]}); + + SmallString<4096> Buffer; + raw_svector_ostream OS(Buffer); + auto Result = TextAPIWriter::writeToStream(OS, File); + EXPECT_FALSE(Result); + EXPECT_STREQ(tbd_v4_file, Buffer.c_str()); +} + +TEST(TBDv4, MultipleTargets) { + static const char tbd_multiple_targets[] = + "--- !tapi-tbd\n" + "tbd-version: 4\n" + "targets: [ i386-maccatalyst, x86_64-tvos, arm64-ios ]\n" + "install-name: Test.dylib\n" + "...\n"; + + auto Result = + TextAPIReader::get(MemoryBufferRef(tbd_multiple_targets, "Test.tbd")); + EXPECT_TRUE(!!Result); + PlatformSet Platforms; + Platforms.insert(PlatformKind::macCatalyst); + Platforms.insert(PlatformKind::tvOS); + Platforms.insert(PlatformKind::iOS); + auto File = std::move(Result.get()); + EXPECT_EQ(FileType::TBD_V4, File->getFileType()); + EXPECT_EQ(AK_x86_64 | AK_arm64 | AK_i386, File->getArchitectures()); + EXPECT_EQ(Platforms.size(), File->getPlatforms().size()); + for (auto Platform : File->getPlatforms()) + EXPECT_EQ(Platforms.count(Platform), 1U); +} + +TEST(TBDv4, MultipleTargetsSameArch) { + static const char tbd_targets_same_arch[] = + "--- !tapi-tbd\n" + "tbd-version: 4\n" + "targets: [ 
x86_64-maccatalyst, x86_64-tvos ]\n" + "install-name: Test.dylib\n" + "...\n"; + + auto Result = + TextAPIReader::get(MemoryBufferRef(tbd_targets_same_arch, "Test.tbd")); + EXPECT_TRUE(!!Result); + PlatformSet Platforms; + Platforms.insert(PlatformKind::tvOS); + Platforms.insert(PlatformKind::macCatalyst); + auto File = std::move(Result.get()); + EXPECT_EQ(FileType::TBD_V4, File->getFileType()); + EXPECT_EQ(ArchitectureSet(AK_x86_64), File->getArchitectures()); + EXPECT_EQ(Platforms.size(), File->getPlatforms().size()); + for (auto Platform : File->getPlatforms()) + EXPECT_EQ(Platforms.count(Platform), 1U); +} + +TEST(TBDv4, MultipleTargetsSamePlatform) { + static const char tbd_multiple_targets_same_platform[] = + "--- !tapi-tbd\n" + "tbd-version: 4\n" + "targets: [ arm64-ios, armv7k-ios ]\n" + "install-name: Test.dylib\n" + "...\n"; + + auto Result = TextAPIReader::get( + MemoryBufferRef(tbd_multiple_targets_same_platform, "Test.tbd")); + EXPECT_TRUE(!!Result); + auto File = std::move(Result.get()); + EXPECT_EQ(FileType::TBD_V4, File->getFileType()); + EXPECT_EQ(AK_arm64 | AK_armv7k, File->getArchitectures()); + EXPECT_EQ(File->getPlatforms().size(), 1U); + EXPECT_EQ(PlatformKind::iOS, *File->getPlatforms().begin()); +} + +TEST(TBDv4, Target_maccatalyst) { + static const char tbd_target_maccatalyst[] = + "--- !tapi-tbd\n" + "tbd-version: 4\n" + "targets: [ x86_64-maccatalyst ]\n" + "install-name: Test.dylib\n" + "...\n"; + + auto Result = + TextAPIReader::get(MemoryBufferRef(tbd_target_maccatalyst, "Test.tbd")); + EXPECT_TRUE(!!Result); + auto File = std::move(Result.get()); + EXPECT_EQ(FileType::TBD_V4, File->getFileType()); + EXPECT_EQ(ArchitectureSet(AK_x86_64), File->getArchitectures()); + EXPECT_EQ(File->getPlatforms().size(), 1U); + EXPECT_EQ(PlatformKind::macCatalyst, *File->getPlatforms().begin()); +} + +TEST(TBDv4, Target_x86_ios) { + static const char tbd_target_x86_ios[] = "--- !tapi-tbd\n" + "tbd-version: 4\n" + "targets: [ x86_64-ios ]\n" + 
"install-name: Test.dylib\n" + "...\n"; + + auto Result = + TextAPIReader::get(MemoryBufferRef(tbd_target_x86_ios, "Test.tbd")); + EXPECT_TRUE(!!Result); + auto File = std::move(Result.get()); + EXPECT_EQ(FileType::TBD_V4, File->getFileType()); + EXPECT_EQ(ArchitectureSet(AK_x86_64), File->getArchitectures()); + EXPECT_EQ(File->getPlatforms().size(), 1U); + EXPECT_EQ(PlatformKind::iOS, *File->getPlatforms().begin()); +} + +TEST(TBDv4, Target_arm_bridgeOS) { + static const char tbd_platform_bridgeos[] = "--- !tapi-tbd\n" + "tbd-version: 4\n" + "targets: [ armv7k-bridgeos ]\n" + "install-name: Test.dylib\n" + "...\n"; + + auto Result = + TextAPIReader::get(MemoryBufferRef(tbd_platform_bridgeos, "Test.tbd")); + EXPECT_TRUE(!!Result); + auto File = std::move(Result.get()); + EXPECT_EQ(FileType::TBD_V4, File->getFileType()); + EXPECT_EQ(File->getPlatforms().size(), 1U); + EXPECT_EQ(PlatformKind::bridgeOS, *File->getPlatforms().begin()); + EXPECT_EQ(ArchitectureSet(AK_armv7k), File->getArchitectures()); +} + +TEST(TBDv4, Target_x86_macos) { + static const char tbd_x86_macos[] = "--- !tapi-tbd\n" + "tbd-version: 4\n" + "targets: [ x86_64-macos ]\n" + "install-name: Test.dylib\n" + "...\n"; + + auto Result = TextAPIReader::get(MemoryBufferRef(tbd_x86_macos, "Test.tbd")); + EXPECT_TRUE(!!Result); + auto File = std::move(Result.get()); + EXPECT_EQ(FileType::TBD_V4, File->getFileType()); + EXPECT_EQ(ArchitectureSet(AK_x86_64), File->getArchitectures()); + EXPECT_EQ(File->getPlatforms().size(), 1U); + EXPECT_EQ(PlatformKind::macOS, *File->getPlatforms().begin()); +} + +TEST(TBDv4, Target_x86_ios_simulator) { + static const char tbd_x86_ios_sim[] = "--- !tapi-tbd\n" + "tbd-version: 4\n" + "targets: [ x86_64-ios-simulator ]\n" + "install-name: Test.dylib\n" + "...\n"; + + auto Result = + TextAPIReader::get(MemoryBufferRef(tbd_x86_ios_sim, "Test.tbd")); + EXPECT_TRUE(!!Result); + auto File = std::move(Result.get()); + EXPECT_EQ(FileType::TBD_V4, File->getFileType()); + 
EXPECT_EQ(ArchitectureSet(AK_x86_64), File->getArchitectures()); + EXPECT_EQ(File->getPlatforms().size(), 1U); + EXPECT_EQ(PlatformKind::iOSSimulator, *File->getPlatforms().begin()); +} + +TEST(TBDv4, Target_x86_tvos_simulator) { + static const char tbd_x86_tvos_sim[] = + "--- !tapi-tbd\n" + "tbd-version: 4\n" + "targets: [ x86_64-tvos-simulator ]\n" + "install-name: Test.dylib\n" + "...\n"; + + auto Result = + TextAPIReader::get(MemoryBufferRef(tbd_x86_tvos_sim, "Test.tbd")); + EXPECT_TRUE(!!Result); + auto File = std::move(Result.get()); + EXPECT_EQ(FileType::TBD_V4, File->getFileType()); + EXPECT_EQ(ArchitectureSet(AK_x86_64), File->getArchitectures()); + EXPECT_EQ(File->getPlatforms().size(), 1U); + EXPECT_EQ(PlatformKind::tvOSSimulator, *File->getPlatforms().begin()); +} + +TEST(TBDv4, Target_i386_watchos_simulator) { + static const char tbd_i386_watchos_sim[] = + "--- !tapi-tbd\n" + "tbd-version: 4\n" + "targets: [ i386-watchos-simulator ]\n" + "install-name: Test.dylib\n" + "...\n"; + + auto Result = + TextAPIReader::get(MemoryBufferRef(tbd_i386_watchos_sim, "Test.tbd")); + EXPECT_TRUE(!!Result); + auto File = std::move(Result.get()); + EXPECT_EQ(FileType::TBD_V4, File->getFileType()); + EXPECT_EQ(ArchitectureSet(AK_i386), File->getArchitectures()); + EXPECT_EQ(File->getPlatforms().size(), 1U); + EXPECT_EQ(PlatformKind::watchOSSimulator, *File->getPlatforms().begin()); +} + +TEST(TBDv4, Swift_1) { + static const char tbd_swift_1[] = "--- !tapi-tbd\n" + "tbd-version: 4\n" + "targets: [ x86_64-macos ]\n" + "install-name: Test.dylib\n" + "swift-abi-version: 1\n" + "...\n"; + + auto Result = TextAPIReader::get(MemoryBufferRef(tbd_swift_1, "Test.tbd")); + EXPECT_TRUE(!!Result); + auto File = std::move(Result.get()); + EXPECT_EQ(FileType::TBD_V4, File->getFileType()); + EXPECT_EQ(1U, File->getSwiftABIVersion()); +} + +TEST(TBDv4, Swift_2) { + static const char tbd_v1_swift_2[] = "--- !tapi-tbd\n" + "tbd-version: 4\n" + "targets: [ x86_64-macos ]\n" + 
"install-name: Test.dylib\n" + "swift-abi-version: 2\n" + "...\n"; + + auto Result = TextAPIReader::get(MemoryBufferRef(tbd_v1_swift_2, "Test.tbd")); + EXPECT_TRUE(!!Result); + auto File = std::move(Result.get()); + EXPECT_EQ(FileType::TBD_V4, File->getFileType()); + EXPECT_EQ(2U, File->getSwiftABIVersion()); +} + +TEST(TBDv4, Swift_5) { + static const char tbd_swift_5[] = "--- !tapi-tbd\n" + "tbd-version: 4\n" + "targets: [ x86_64-macos ]\n" + "install-name: Test.dylib\n" + "swift-abi-version: 5\n" + "...\n"; + + auto Result = TextAPIReader::get(MemoryBufferRef(tbd_swift_5, "Test.tbd")); + EXPECT_TRUE(!!Result); + auto File = std::move(Result.get()); + EXPECT_EQ(FileType::TBD_V4, File->getFileType()); + EXPECT_EQ(5U, File->getSwiftABIVersion()); +} + +TEST(TBDv4, Swift_99) { + static const char tbd_swift_99[] = "--- !tapi-tbd\n" + "tbd-version: 4\n" + "targets: [ x86_64-macos ]\n" + "install-name: Test.dylib\n" + "swift-abi-version: 99\n" + "...\n"; + + auto Result = TextAPIReader::get(MemoryBufferRef(tbd_swift_99, "Test.tbd")); + EXPECT_TRUE(!!Result); + auto File = std::move(Result.get()); + EXPECT_EQ(FileType::TBD_V4, File->getFileType()); + EXPECT_EQ(99U, File->getSwiftABIVersion()); +} + +TEST(TBDv4, InvalidArchitecture) { + static const char tbd_file_unknown_architecture[] = + "--- !tapi-tbd\n" + "tbd-version: 4\n" + "targets: [ foo-macos ]\n" + "install-name: Test.dylib\n" + "...\n"; + + auto Result = TextAPIReader::get( + MemoryBufferRef(tbd_file_unknown_architecture, "Test.tbd")); + EXPECT_FALSE(!!Result); + auto errorMessage = toString(Result.takeError()); + ASSERT_TRUE(errorMessage.compare(0, 15, "malformed file\n") == 0); +} + +TEST(TBDv4, InvalidPlatform) { + static const char tbd_file_invalid_platform[] = "--- !tapi-tbd\n" + "tbd-version: 4\n" + "targets: [ x86_64-maos ]\n" + "install-name: Test.dylib\n" + "...\n"; + + auto Result = TextAPIReader::get( + MemoryBufferRef(tbd_file_invalid_platform, "Test.tbd")); + EXPECT_FALSE(!!Result); + auto 
errorMessage = toString(Result.takeError()); + ASSERT_TRUE(errorMessage.compare(0, 15, "malformed file\n") == 0); +} + +TEST(TBDv4, MalformedFile1) { + static const char malformed_file1[] = "--- !tapi-tbd\n" + "tbd-version: 4\n" + "...\n"; + + auto Result = + TextAPIReader::get(MemoryBufferRef(malformed_file1, "Test.tbd")); + EXPECT_FALSE(!!Result); + auto errorMessage = toString(Result.takeError()); + ASSERT_EQ("malformed file\nTest.tbd:2:1: error: missing required key " + "'targets'\ntbd-version: 4\n^\n", + errorMessage); +} + +TEST(TBDv4, MalformedFile2) { + static const char malformed_file2[] = "--- !tapi-tbd\n" + "tbd-version: 4\n" + "targets: [ x86_64-macos ]\n" + "install-name: Test.dylib\n" + "foobar: \"unsupported key\"\n"; + + auto Result = + TextAPIReader::get(MemoryBufferRef(malformed_file2, "Test.tbd")); + EXPECT_FALSE(!!Result); + auto errorMessage = toString(Result.takeError()); + ASSERT_EQ( + "malformed file\nTest.tbd:5:9: error: unknown key 'foobar'\nfoobar: " + "\"unsupported key\"\n ^~~~~~~~~~~~~~~~~\n", + errorMessage); +} + +TEST(TBDv4, MalformedFile3) { + static const char tbd_v1_swift_1_1[] = "--- !tapi-tbd\n" + "tbd-version: 4\n" + "targets: [ x86_64-macos ]\n" + "install-name: Test.dylib\n" + "swift-abi-version: 1.1\n" + "...\n"; + + auto Result = + TextAPIReader::get(MemoryBufferRef(tbd_v1_swift_1_1, "Test.tbd")); + EXPECT_FALSE(!!Result); + auto errorMessage = toString(Result.takeError()); + EXPECT_EQ("malformed file\nTest.tbd:5:20: error: invalid Swift ABI " + "version.\nswift-abi-version: 1.1\n ^~~\n", + errorMessage); +} + +} // end namespace TBDv4 From llvm-commits at lists.llvm.org Tue Oct 8 08:07:40 2019 From: llvm-commits at lists.llvm.org (Pavel Labath via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 15:07:40 +0000 (UTC) Subject: [PATCH] D68645: MinidumpYAML: Add support for the memory info list stream In-Reply-To: References: Message-ID: <66ecc81933c9b2d5f1194cdbd0626d40@localhost.localdomain> labath added a comment. 
In D68645#1699705, @JosephTremoulet wrote: > Nit: Title says "thread" rather than "memory info" Whoops, sorry about that. Should be fixed now. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68645/new/ https://reviews.llvm.org/D68645 From llvm-commits at lists.llvm.org Tue Oct 8 08:07:41 2019 From: llvm-commits at lists.llvm.org (Nicolai Hähnle via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 15:07:41 +0000 (UTC) Subject: [PATCH] D68092: [AMDGPU] Invert the handling of skip insertion. In-Reply-To: References: Message-ID: <12c206a2b27b84daebe9fad7a29bd615@localhost.localdomain> nhaehnle added inline comments. ================ Comment at: lib/Target/AMDGPU/SIRemoveShortExecBranches.cpp:112 + + TII->analyzeBranch(SrcMBB, TrueMBB, FalseMBB, Cond); + if (!FalseMBB) ---------------- analyzeBranch's return value must be checked. ================ Comment at: lib/Target/AMDGPU/SIRemoveShortExecBranches.cpp:116-117 + + if (MDT->dominates(TrueMBB, &SrcMBB) || + mustRetainExeczBranch(*FalseMBB, *TrueMBB)) + return false; ---------------- What's the logic here behind using domination as a criterion? Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68092/new/ https://reviews.llvm.org/D68092 From llvm-commits at lists.llvm.org Tue Oct 8 08:07:46 2019 From: llvm-commits at lists.llvm.org (Cyndy Ishida via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 15:07:46 +0000 (UTC) Subject: [PATCH] D67529: [TextAPI] Introduce TBDv4 In-Reply-To: References: Message-ID: <9d70552a39015a64cd0d232b23fcafd1@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rG5d566c5a46ae: [TextAPI] Introduce TBDv4 (authored by cishida).
Changed prior to commit: https://reviews.llvm.org/D67529?vs=223630&id=223868#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67529/new/ https://reviews.llvm.org/D67529 Files: llvm/include/llvm/TextAPI/MachO/InterfaceFile.h llvm/include/llvm/TextAPI/MachO/Symbol.h llvm/include/llvm/TextAPI/MachO/Target.h llvm/lib/TextAPI/MachO/Target.cpp llvm/lib/TextAPI/MachO/TextStub.cpp llvm/lib/TextAPI/MachO/TextStubCommon.cpp llvm/unittests/TextAPI/CMakeLists.txt llvm/unittests/TextAPI/TextStubV4Tests.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D67529.223868.patch Type: text/x-patch Size: 50972 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 08:07:58 2019 From: llvm-commits at lists.llvm.org (Louis Dionne via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 15:07:58 +0000 (UTC) Subject: [PATCH] D68648: [CMake] Only detect the linker once in AddLLVM.cmake Message-ID: ldionne created this revision. ldionne added a reviewer: smeenai. Herald added subscribers: llvm-commits, dexonsmith, jkorous, fedor.sergeev, mgorny. Herald added a project: LLVM. Otherwise, the build output contains a bunch of "Linker detection: " lines that are really redundant. We also make redundant calls to the linker, although that's a smaller concern. 
Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D68648 Files: llvm/cmake/modules/AddLLVM.cmake Index: llvm/cmake/modules/AddLLVM.cmake =================================================================== --- llvm/cmake/modules/AddLLVM.cmake +++ llvm/cmake/modules/AddLLVM.cmake @@ -162,49 +162,51 @@ set(LLVM_COMMON_DEPENDS ${LLVM_COMMON_DEPENDS} PARENT_SCOPE) endfunction(add_llvm_symbol_exports) -if(APPLE) - execute_process( - COMMAND "${CMAKE_LINKER}" -v - ERROR_VARIABLE stderr - ) - set(LLVM_LINKER_DETECTED YES) - if("${stderr}" MATCHES "PROJECT:ld64") - set(LLVM_LINKER_IS_LD64 YES) - message(STATUS "Linker detection: ld64") - else() - set(LLVM_LINKER_DETECTED NO) - message(STATUS "Linker detection: unknown") - endif() -elseif(NOT WIN32) - # Detect what linker we have here - if( LLVM_USE_LINKER ) - set(command ${CMAKE_C_COMPILER} -fuse-ld=${LLVM_USE_LINKER} -Wl,--version) - else() - separate_arguments(flags UNIX_COMMAND "${CMAKE_EXE_LINKER_FLAGS}") - set(command ${CMAKE_C_COMPILER} ${flags} -Wl,--version) - endif() - execute_process( - COMMAND ${command} - OUTPUT_VARIABLE stdout - ERROR_VARIABLE stderr - ) - set(LLVM_LINKER_DETECTED YES) - if("${stdout}" MATCHES "GNU gold") - set(LLVM_LINKER_IS_GOLD YES) - message(STATUS "Linker detection: GNU Gold") - elseif("${stdout}" MATCHES "^LLD") - set(LLVM_LINKER_IS_LLD YES) - message(STATUS "Linker detection: LLD") - elseif("${stdout}" MATCHES "GNU ld") - set(LLVM_LINKER_IS_GNULD YES) - message(STATUS "Linker detection: GNU ld") - elseif("${stderr}" MATCHES "Solaris Link Editors" OR - "${stdout}" MATCHES "Solaris Link Editors") - set(LLVM_LINKER_IS_SOLARISLD YES) - message(STATUS "Linker detection: Solaris ld") - else() - set(LLVM_LINKER_DETECTED NO) - message(STATUS "Linker detection: unknown") +if (NOT DEFINED LLVM_LINKER_DETECTED) + if(APPLE) + execute_process( + COMMAND "${CMAKE_LINKER}" -v + ERROR_VARIABLE stderr + ) + set(LLVM_LINKER_DETECTED YES) + if("${stderr}" MATCHES "PROJECT:ld64") + 
set(LLVM_LINKER_IS_LD64 YES) + message(STATUS "Linker detection: ld64") + else() + set(LLVM_LINKER_DETECTED NO) + message(STATUS "Linker detection: unknown") + endif() + elseif(NOT WIN32) + # Detect what linker we have here + if( LLVM_USE_LINKER ) + set(command ${CMAKE_C_COMPILER} -fuse-ld=${LLVM_USE_LINKER} -Wl,--version) + else() + separate_arguments(flags UNIX_COMMAND "${CMAKE_EXE_LINKER_FLAGS}") + set(command ${CMAKE_C_COMPILER} ${flags} -Wl,--version) + endif() + execute_process( + COMMAND ${command} + OUTPUT_VARIABLE stdout + ERROR_VARIABLE stderr + ) + set(LLVM_LINKER_DETECTED YES) + if("${stdout}" MATCHES "GNU gold") + set(LLVM_LINKER_IS_GOLD YES) + message(STATUS "Linker detection: GNU Gold") + elseif("${stdout}" MATCHES "^LLD") + set(LLVM_LINKER_IS_LLD YES) + message(STATUS "Linker detection: LLD") + elseif("${stdout}" MATCHES "GNU ld") + set(LLVM_LINKER_IS_GNULD YES) + message(STATUS "Linker detection: GNU ld") + elseif("${stderr}" MATCHES "Solaris Link Editors" OR + "${stdout}" MATCHES "Solaris Link Editors") + set(LLVM_LINKER_IS_SOLARISLD YES) + message(STATUS "Linker detection: Solaris ld") + else() + set(LLVM_LINKER_DETECTED NO) + message(STATUS "Linker detection: unknown") + endif() endif() endif() -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68648.223869.patch Type: text/x-patch Size: 3402 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 08:12:39 2019 From: llvm-commits at lists.llvm.org (GN Sync Bot via llvm-commits) Date: Tue, 08 Oct 2019 15:12:39 -0000 Subject: [llvm] r374059 - gn build: Merge r374058 Message-ID: <20191008151239.16EB885008@lists.llvm.org> Author: gnsyncbot Date: Tue Oct 8 08:12:38 2019 New Revision: 374059 URL: http://llvm.org/viewvc/llvm-project?rev=374059&view=rev Log: gn build: Merge r374058 Modified: llvm/trunk/utils/gn/secondary/llvm/unittests/TextAPI/BUILD.gn Modified: llvm/trunk/utils/gn/secondary/llvm/unittests/TextAPI/BUILD.gn URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/gn/secondary/llvm/unittests/TextAPI/BUILD.gn?rev=374059&r1=374058&r2=374059&view=diff ============================================================================== --- llvm/trunk/utils/gn/secondary/llvm/unittests/TextAPI/BUILD.gn (original) +++ llvm/trunk/utils/gn/secondary/llvm/unittests/TextAPI/BUILD.gn Tue Oct 8 08:12:38 2019 @@ -10,5 +10,6 @@ unittest("TextAPITests") { "TextStubV1Tests.cpp", "TextStubV2Tests.cpp", "TextStubV3Tests.cpp", + "TextStubV4Tests.cpp", ] } From llvm-commits at lists.llvm.org Tue Oct 8 08:16:48 2019 From: llvm-commits at lists.llvm.org (Sean Fertile via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 15:16:48 +0000 (UTC) Subject: [PATCH] D67008: implement parsing relocation information for 32-bit xcoff objectfile In-Reply-To: References: Message-ID: sfertile marked an inline comment as done. sfertile added inline comments. ================ Comment at: llvm/include/llvm/Object/XCOFFObjectFile.h:168 + +bool isRelocationSigned(XCOFFRelocation32 &Reloc); + ---------------- hubert.reinterpretcast wrote: > DiggerLin wrote: > > hubert.reinterpretcast wrote: > > > Do these need to be declared in the header? Are they called only in one `.cpp` file? If so, they can be made `static` in the `.cpp` file. 
Otherwise, it seems odd that these aren't `const` member functions of `XCOFFRelocation32`. > > the llvm-readobj is using those functions and obj2yaml will use them too. > It is still odd to me that these aren't `const` non-static member functions of `XCOFFRelocation32`. I think these were originally templated to work with both 32-bit and 64-bit relocations, which explains why they aren't member functions. ================ Comment at: llvm/test/tools/llvm-readobj/reloc_overflow.ll:1 +# RUN: llvm-readobj --sections %p/Inputs/xcoff-reloc-overflow.o | \ +# RUN: FileCheck --check-prefix=SECOVERFLOW %s ---------------- DiggerLin wrote: > sfertile wrote: > > The `.ll` suffix implies the test is written in LLVM IR. Use `.test` instead. > I will change the name I still see the `.ll` suffix. ================ Comment at: llvm/tools/llvm-readobj/XCOFFDumper.cpp:140 + // Only the .text, .data, .tdata, and STYP_DWARF sections have relocation. + if (Sec.Flags != XCOFF::STYP_TEXT && Sec.Flags != XCOFF::STYP_DATA && + Sec.Flags != XCOFF::STYP_TDATA && Sec.Flags != XCOFF::STYP_DWARF) ---------------- DiggerLin wrote: > sfertile wrote: > > Is this specified in the docs? I wasn't able to find it specified anywhere. What about the exception section? I don't know anything about the exception implementation on AIX so I could be wrong, but I suspect it might contain relocations. > > > > I did find the specification of the special relocations in the loader table, and that they are a different format from the 'normal' relocations implemented in this patch. Does the loader section use the relocation pointer and relocation count in the section header table for these different relocations, or do we find them through fields defined in the loader section itself? > from the xcoff document. > s_relptr Recognized for the .text, .data, .tdata, and STYP_DWARF sections only. Thanks.
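[Editorial sketch, not part of the patch or the review thread.] The section-type filter discussed in the XCOFFDumper.cpp comment can be illustrated in isolation. This is a minimal sketch under stated assumptions: the `STYP_*` values are copied from the XCOFF format documentation rather than taken from the patch, the low 16 bits of `s_flags` are assumed to carry the section type, and `mayHaveRelocations` and the `sketch` namespace are hypothetical names that do not exist in LLVM.

```cpp
#include <cstdint>

namespace sketch {

// Section-type flag values (assumed, per the XCOFF format documentation).
constexpr uint32_t STYP_DWARF = 0x0010;
constexpr uint32_t STYP_TEXT = 0x0020;
constexpr uint32_t STYP_DATA = 0x0040;
constexpr uint32_t STYP_BSS = 0x0080;
constexpr uint32_t STYP_TDATA = 0x0400;

// Hypothetical helper: returns true when the section type is one of the four
// for which the documentation says s_relptr is recognized. Only the low 16
// bits are compared (assumption: the high bits may hold a subtype).
constexpr bool mayHaveRelocations(uint32_t Flags) {
  return (Flags & 0xFFFFu) == STYP_TEXT || (Flags & 0xFFFFu) == STYP_DATA ||
         (Flags & 0xFFFFu) == STYP_TDATA || (Flags & 0xFFFFu) == STYP_DWARF;
}

} // namespace sketch
```

Masking before comparing is a design choice of this sketch only; the quoted patch compares `Sec.Flags` directly, and whether that matters depends on how the flags field is populated.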
Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67008/new/ https://reviews.llvm.org/D67008 From llvm-commits at lists.llvm.org Tue Oct 8 08:16:49 2019 From: llvm-commits at lists.llvm.org (Kristof Beyls via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 15:16:49 +0000 (UTC) Subject: [PATCH] D66887: [test-suite][WIP] Add GCC C Torture Suite as External Test Suite In-Reply-To: References: Message-ID: kristof.beyls added a comment. Thanks for persevering Sam. I think the latest changes look good. I only have 2 more minor nits, see below. After those are addressed, I think this will be fine to commit. Thanks! ================ Comment at: SingleSource/Regression/C/gcc-c-torture/README:14-17 +# Licensing + +The testcases in SingleSource/Regression/C/gcc-c-torture/execute are covered by +the GPL. See the files whose names start with COPYING for copying permission. ---------------- Given licensing information is also present in SingleSource/Regression/C/gcc-c-torture/execute/LICENSE.TXT and called out in the top-level LICENSE.TXT file in the test-suite, I don't think there's much value in repeating the information here too. Maybe best to delete this part? ================ Comment at: SingleSource/Regression/C/gcc-c-torture/execute/LICENCE.TXT:1-2 +The testcases in this directory are covered by the GPL. See the files whose +names start with COPYING for copying permission. ---------------- it seems the file name here is still LICENCE.TXT rather than LICENSE.TXT, which would be preferred for consistency with the rest of the test-suite? 
Repository: rT test-suite CHANGES SINCE LAST ACTION https://reviews.llvm.org/D66887/new/ https://reviews.llvm.org/D66887 From llvm-commits at lists.llvm.org Tue Oct 8 08:16:50 2019 From: llvm-commits at lists.llvm.org (Nicolai Hähnle via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 15:16:50 +0000 (UTC) Subject: [PATCH] D51932: [AMDGPU] Fix-up cases where writelane has 2 SGPR operands In-Reply-To: References: Message-ID: <1754d0f6d3927a6a8c9dc3de381e6d2b@localhost.localdomain> nhaehnle added a comment. I think this should be good to go. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D51932/new/ https://reviews.llvm.org/D51932 From llvm-commits at lists.llvm.org Tue Oct 8 08:16:52 2019 From: llvm-commits at lists.llvm.org (Miloš Stojanović via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 15:16:52 +0000 (UTC) Subject: [PATCH] D68649: [Mips][llvm-exegesis] Add a Mips target Message-ID: mstojanovic created this revision. mstojanovic added reviewers: courbet, gchatelet, atanasyan, petarj. Herald added subscribers: tschuett, arichardson, mgorny, sdardis. The target does just enough to be able to run llvm-exegesis in latency mode for at least some opcodes. https://reviews.llvm.org/D68649 Files: lib/Target/Mips/CMakeLists.txt lib/Target/Mips/Mips.td lib/Target/Mips/MipsPfmCounters.td tools/llvm-exegesis/lib/Assembler.cpp tools/llvm-exegesis/lib/CMakeLists.txt tools/llvm-exegesis/lib/Mips/CMakeLists.txt tools/llvm-exegesis/lib/Mips/LLVMBuild.txt tools/llvm-exegesis/lib/Mips/Target.cpp unittests/tools/llvm-exegesis/CMakeLists.txt unittests/tools/llvm-exegesis/Mips/CMakeLists.txt unittests/tools/llvm-exegesis/Mips/TargetTest.cpp -------------- next part -------------- A non-text attachment was scrubbed...
Name: D68649.223870.patch Type: text/x-patch Size: 10589 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 08:20:19 2019 From: llvm-commits at lists.llvm.org (Hideto Ueno via llvm-commits) Date: Tue, 08 Oct 2019 15:20:19 -0000 Subject: [llvm] r374060 - [Attributor] Add helper class to compose two structured deduction. Message-ID: <20191008152019.BCDC889559@lists.llvm.org> Author: uenoku Date: Tue Oct 8 08:20:19 2019 New Revision: 374060 URL: http://llvm.org/viewvc/llvm-project?rev=374060&view=rev Log: [Attributor] Add helper class to compose two structured deduction. Summary: This patch introduces a generic way to compose two structured deductions. This will be used for composing generic deduction with `MustBeExecutedExplorer` and other existing generic deduction. Reviewers: jdoerfert, sstefan1 Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66645 Modified: llvm/trunk/lib/Transforms/IPO/Attributor.cpp Modified: llvm/trunk/lib/Transforms/IPO/Attributor.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/IPO/Attributor.cpp?rev=374060&r1=374059&r2=374060&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/IPO/Attributor.cpp (original) +++ llvm/trunk/lib/Transforms/IPO/Attributor.cpp Tue Oct 8 08:20:19 2019 @@ -560,6 +560,21 @@ static void clampReturnedValueStates(Att S ^= *T; } +/// Helper class to compose two generic deduction +template <typename AAType, typename Base, typename StateType, + template <typename...> class F, template <typename...> class G> +struct AAComposeTwoGenericDeduction + : public F<AAType, G<AAType, Base, StateType>, StateType> { + AAComposeTwoGenericDeduction(const IRPosition &IRP) + : F<AAType, G<AAType, Base, StateType>, StateType>(IRP) {} + + /// See AbstractAttribute::updateImpl(...). + ChangeStatus updateImpl(Attributor &A) override { + return F<AAType, G<AAType, Base, StateType>, StateType>::updateImpl(A) | + G<AAType, Base, StateType>::updateImpl(A); + } +}; + /// Helper class for generic deduction: return value -> returned position.
template From llvm-commits at lists.llvm.org Tue Oct 8 08:24:37 2019 From: llvm-commits at lists.llvm.org (Cyndy Ishida via llvm-commits) Date: Tue, 08 Oct 2019 15:24:37 -0000 Subject: [llvm] r374062 - Revert [TextAPI] Introduce TBDv4 Message-ID: <20191008152437.BCA778939C@lists.llvm.org> Author: cishida Date: Tue Oct 8 08:24:37 2019 New Revision: 374062 URL: http://llvm.org/viewvc/llvm-project?rev=374062&view=rev Log: Revert [TextAPI] Introduce TBDv4 This reverts r374058 (git commit 5d566c5a46aeaa1fa0e5c0b823c9d5f84036dc9a) Removed: llvm/trunk/unittests/TextAPI/TextStubV4Tests.cpp Modified: llvm/trunk/include/llvm/TextAPI/MachO/InterfaceFile.h llvm/trunk/include/llvm/TextAPI/MachO/Symbol.h llvm/trunk/include/llvm/TextAPI/MachO/Target.h llvm/trunk/lib/TextAPI/MachO/Target.cpp llvm/trunk/lib/TextAPI/MachO/TextStub.cpp llvm/trunk/lib/TextAPI/MachO/TextStubCommon.cpp llvm/trunk/unittests/TextAPI/CMakeLists.txt Modified: llvm/trunk/include/llvm/TextAPI/MachO/InterfaceFile.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/TextAPI/MachO/InterfaceFile.h?rev=374062&r1=374061&r2=374062&view=diff ============================================================================== --- llvm/trunk/include/llvm/TextAPI/MachO/InterfaceFile.h (original) +++ llvm/trunk/include/llvm/TextAPI/MachO/InterfaceFile.h Tue Oct 8 08:24:37 2019 @@ -67,9 +67,6 @@ enum FileType : unsigned { /// Text-based stub file (.tbd) version 3.0 TBD_V3 = 1U << 2, - /// Text-based stub file (.tbd) version 4.0 - TBD_V4 = 1U << 3, - All = ~0U, LLVM_MARK_AS_BITMASK_ENUM(/*LargestValue=*/All), Modified: llvm/trunk/include/llvm/TextAPI/MachO/Symbol.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/TextAPI/MachO/Symbol.h?rev=374062&r1=374061&r2=374062&view=diff ============================================================================== --- llvm/trunk/include/llvm/TextAPI/MachO/Symbol.h (original) +++ llvm/trunk/include/llvm/TextAPI/MachO/Symbol.h Tue Oct 8 08:24:37 2019 @@ -38,10 
+38,7 @@ enum class SymbolFlags : uint8_t { /// Undefined Undefined = 1U << 3, - /// Rexported - Rexported = 1U << 4, - - LLVM_MARK_AS_BITMASK_ENUM(/*LargestValue=*/Rexported), + LLVM_MARK_AS_BITMASK_ENUM(/*LargestValue=*/Undefined), }; // clang-format on @@ -53,7 +50,7 @@ enum class SymbolKind : uint8_t { ObjectiveCInstanceVariable, }; -using TargetList = SmallVector; +using TargetList = SmallVector; class Symbol { public: Symbol(SymbolKind Kind, StringRef Name, TargetList Targets, SymbolFlags Flags) @@ -84,10 +81,6 @@ public: return (Flags & SymbolFlags::Undefined) == SymbolFlags::Undefined; } - bool isReexported() const { - return (Flags & SymbolFlags::Rexported) == SymbolFlags::Rexported; - } - using const_target_iterator = TargetList::const_iterator; using const_target_range = llvm::iterator_range; const_target_range targets() const { return {Targets}; } Modified: llvm/trunk/include/llvm/TextAPI/MachO/Target.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/TextAPI/MachO/Target.h?rev=374062&r1=374061&r2=374062&view=diff ============================================================================== --- llvm/trunk/include/llvm/TextAPI/MachO/Target.h (original) +++ llvm/trunk/include/llvm/TextAPI/MachO/Target.h Tue Oct 8 08:24:37 2019 @@ -29,8 +29,6 @@ public: explicit Target(const llvm::Triple &Triple) : Arch(mapToArchitecture(Triple)), Platform(mapToPlatformKind(Triple)) {} - static llvm::Expected create(StringRef Target); - operator std::string() const; Architecture Arch; Modified: llvm/trunk/lib/TextAPI/MachO/Target.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/TextAPI/MachO/Target.cpp?rev=374062&r1=374061&r2=374062&view=diff ============================================================================== --- llvm/trunk/lib/TextAPI/MachO/Target.cpp (original) +++ llvm/trunk/lib/TextAPI/MachO/Target.cpp Tue Oct 8 08:24:37 2019 @@ -17,44 +17,6 @@ namespace llvm { namespace MachO { -Expected Target::create(StringRef TargetValue) { - 
auto Result = TargetValue.split('-'); - auto ArchitectureStr = Result.first; - auto Architecture = getArchitectureFromName(ArchitectureStr); - if (Architecture == AK_unknown) - return make_error("invalid architecture", - inconvertibleErrorCode()); - auto PlatformStr = Result.second; - PlatformKind Platform; - Platform = StringSwitch(PlatformStr) - .Case("macos", PlatformKind::macOS) - .Case("ios", PlatformKind::iOS) - .Case("tvos", PlatformKind::tvOS) - .Case("watchos", PlatformKind::watchOS) - .Case("bridgeos", PlatformKind::bridgeOS) - .Case("maccatalyst", PlatformKind::macCatalyst) - .Case("ios-simulator", PlatformKind::iOSSimulator) - .Case("tvos-simulator", PlatformKind::tvOSSimulator) - .Case("watchos-simulator", PlatformKind::watchOSSimulator) - .Default(PlatformKind::unknown); - - if (Platform == PlatformKind::unknown) { - if (PlatformStr.startswith("<") && PlatformStr.endswith(">")) { - PlatformStr = PlatformStr.drop_front().drop_back(); - unsigned long long RawValue; - if (PlatformStr.getAsInteger(10, RawValue)) - return make_error("invalid platform number", - inconvertibleErrorCode()); - - Platform = (PlatformKind)RawValue; - } - return make_error("invalid platform", - inconvertibleErrorCode()); - } - - return Target{Architecture, Platform}; -} - Target::operator std::string() const { return (getArchitectureName(Arch) + " (" + getPlatformName(Platform) + ")") .str(); @@ -80,4 +42,4 @@ ArchitectureSet mapToArchitectureSet(Arr } } // end namespace MachO. -} // end namespace llvm. +} // end namespace llvm. 
\ No newline at end of file Modified: llvm/trunk/lib/TextAPI/MachO/TextStub.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/TextAPI/MachO/TextStub.cpp?rev=374062&r1=374061&r2=374062&view=diff ============================================================================== --- llvm/trunk/lib/TextAPI/MachO/TextStub.cpp (original) +++ llvm/trunk/lib/TextAPI/MachO/TextStub.cpp Tue Oct 8 08:24:37 2019 @@ -147,58 +147,6 @@ Each undefineds section is defined as fo objc-ivars: [] # Optional: List of Objective C Instance Variables weak-ref-symbols: [] # Optional: List of weak defined symbols */ - -/* - - YAML Format specification. - ---- !tapi-tbd -tbd-version: 4 # The tbd version for format -targets: [ armv7-ios, x86_64-maccatalyst ] # The list of applicable tapi supported target triples -uuids: # Optional: List of target and UUID pairs. - - target: armv7-ios - value: ... - - target: x86_64-maccatalyst - value: ... -flags: [] # Optional: -install-name: /u/l/libfoo.dylib # -current-version: 1.2.3 # Optional: defaults to 1.0 -compatibility-version: 1.0 # Optional: defaults to 1.0 -swift-abi-version: 0 # Optional: defaults to 0 -parent-umbrella: # Optional: -allowable-clients: - - targets: [ armv7-ios ] # Optional: - clients: [ clientA ] -exports: # List of export sections -... -re-exports: # List of reexport sections -... -undefineds: # List of undefineds sections -... 
- -Each export and reexport section is defined as following: - -- targets: [ arm64-macos ] # The list of target triples associated with symbols - symbols: [ _symA ] # Optional: List of symbols - objc-classes: [] # Optional: List of Objective-C classes - objc-eh-types: [] # Optional: List of Objective-C classes - # with EH - objc-ivars: [] # Optional: List of Objective C Instance - # Variables - weak-symbols: [] # Optional: List of weak defined symbols - thread-local-symbols: [] # Optional: List of thread local symbols -- targets: [ arm64-macos, x86_64-maccatalyst ] # Optional: Targets for applicable additional symbols - symbols: [ _symB ] # Optional: List of symbols - -Each undefineds section is defined as following: -- targets: [ arm64-macos ] # The list of target triples associated with symbols - symbols: [ _symC ] # Optional: List of symbols - objc-classes: [] # Optional: List of Objective-C classes - objc-eh-types: [] # Optional: List of Objective-C classes - # with EH - objc-ivars: [] # Optional: List of Objective C Instance Variables - weak-symbols: [] # Optional: List of weak defined symbols -*/ // clang-format on using namespace llvm; @@ -227,38 +175,6 @@ struct UndefinedSection { std::vector WeakRefSymbols; }; -// Sections for direct target mapping in TBDv4 -struct SymbolSection { - TargetList Targets; - std::vector Symbols; - std::vector Classes; - std::vector ClassEHs; - std::vector Ivars; - std::vector WeakSymbols; - std::vector TlvSymbols; -}; - -struct MetadataSection { - enum Option { Clients, Libraries }; - std::vector Targets; - std::vector Values; -}; - -struct UmbrellaSection { - std::vector Targets; - std::string Umbrella; -}; - -// UUID's for TBDv4 are mapped to target not arch -struct UUIDv4 { - Target TargetID; - std::string Value; - - UUIDv4() = default; - UUIDv4(const Target &TargetID, const std::string &Value) - : TargetID(TargetID), Value(Value) {} -}; - // clang-format off enum TBDFlags : unsigned { None = 0U, @@ -273,12 +189,6 @@ enum 
TBDFlags : unsigned { LLVM_YAML_IS_FLOW_SEQUENCE_VECTOR(Architecture) LLVM_YAML_IS_SEQUENCE_VECTOR(ExportSection) LLVM_YAML_IS_SEQUENCE_VECTOR(UndefinedSection) -// Specific to TBDv4 -LLVM_YAML_IS_SEQUENCE_VECTOR(SymbolSection) -LLVM_YAML_IS_SEQUENCE_VECTOR(MetadataSection) -LLVM_YAML_IS_SEQUENCE_VECTOR(UmbrellaSection) -LLVM_YAML_IS_FLOW_SEQUENCE_VECTOR(Target) -LLVM_YAML_IS_SEQUENCE_VECTOR(UUIDv4) namespace llvm { namespace yaml { @@ -321,49 +231,6 @@ template <> struct MappingTraits struct MappingTraits { - static void mapping(IO &IO, SymbolSection &Section) { - IO.mapRequired("targets", Section.Targets); - IO.mapOptional("symbols", Section.Symbols); - IO.mapOptional("objc-classes", Section.Classes); - IO.mapOptional("objc-eh-types", Section.ClassEHs); - IO.mapOptional("objc-ivars", Section.Ivars); - IO.mapOptional("weak-symbols", Section.WeakSymbols); - IO.mapOptional("thread-local-symbols", Section.TlvSymbols); - } -}; - -template <> struct MappingTraits { - static void mapping(IO &IO, UmbrellaSection &Section) { - IO.mapRequired("targets", Section.Targets); - IO.mapRequired("umbrella", Section.Umbrella); - } -}; - -template <> struct MappingTraits { - static void mapping(IO &IO, UUIDv4 &UUID) { - IO.mapRequired("target", UUID.TargetID); - IO.mapRequired("value", UUID.Value); - } -}; - -template <> -struct MappingContextTraits { - static void mapping(IO &IO, MetadataSection &Section, - MetadataSection::Option &OptionKind) { - IO.mapRequired("targets", Section.Targets); - switch (OptionKind) { - case MetadataSection::Option::Clients: - IO.mapRequired("clients", Section.Values); - return; - case MetadataSection::Option::Libraries: - IO.mapRequired("libraries", Section.Values); - return; - } - llvm_unreachable("unexpected option for metadata"); - } -}; - template <> struct ScalarBitSetTraits { static void bitset(IO &IO, TBDFlags &Flags) { IO.bitSetCase(Flags, "flat_namespace", TBDFlags::FlatNamespace); @@ -373,55 +240,6 @@ template <> struct ScalarBitSetTraits 
struct ScalarTraits { - static void output(const Target &Value, void *, raw_ostream &OS) { - OS << Value.Arch << "-"; - switch (Value.Platform) { - default: - OS << "unknown"; - break; - case PlatformKind::macOS: - OS << "macos"; - break; - case PlatformKind::iOS: - OS << "ios"; - break; - case PlatformKind::tvOS: - OS << "tvos"; - break; - case PlatformKind::watchOS: - OS << "watchos"; - break; - case PlatformKind::bridgeOS: - OS << "bridgeos"; - break; - case PlatformKind::macCatalyst: - OS << "maccatalyst"; - break; - case PlatformKind::iOSSimulator: - OS << "ios-simulator"; - break; - case PlatformKind::tvOSSimulator: - OS << "tvos-simulator"; - break; - case PlatformKind::watchOSSimulator: - OS << "watchos-simulator"; - break; - } - } - - static StringRef input(StringRef Scalar, void *, Target &Value) { - auto Result = Target::create(Scalar); - if (!Result) - return toString(Result.takeError()); - - Value = *Result; - return {}; - } - - static QuotingType mustQuote(StringRef) { return QuotingType::None; } -}; - template <> struct MappingTraits { struct NormalizedTBD { explicit NormalizedTBD(IO &IO) {} @@ -737,339 +555,71 @@ template <> struct MappingTraits Undefineds; }; - static void setFileTypeForInput(TextAPIContext *Ctx, IO &IO) { - if (IO.mapTag("!tapi-tbd", false)) - Ctx->FileKind = FileType::TBD_V4; - else if (IO.mapTag("!tapi-tbd-v3", false)) - Ctx->FileKind = FileType::TBD_V3; - else if (IO.mapTag("!tapi-tbd-v2", false)) - Ctx->FileKind = FileType::TBD_V2; - else if (IO.mapTag("!tapi-tbd-v1", false) || - IO.mapTag("tag:yaml.org,2002:map", false)) - Ctx->FileKind = FileType::TBD_V1; - else { - Ctx->FileKind = FileType::Invalid; - return; - } - } - static void mapping(IO &IO, const InterfaceFile *&File) { auto *Ctx = reinterpret_cast(IO.getContext()); assert((!Ctx || !IO.outputting() || (Ctx && Ctx->FileKind != FileType::Invalid)) && "File type is not set in YAML context"); + MappingNormalization Keys(IO, File); + // prope file type when reading. 
if (!IO.outputting()) { - setFileTypeForInput(Ctx, IO); - switch (Ctx->FileKind) { - default: - break; - case FileType::TBD_V4: - mapKeysToValuesV4(IO, File); - return; - case FileType::Invalid: + if (IO.mapTag("!tapi-tbd-v3", false)) + Ctx->FileKind = FileType::TBD_V3; + else if (IO.mapTag("!tapi-tbd-v2", false)) + Ctx->FileKind = FileType::TBD_V2; + else if (IO.mapTag("!tapi-tbd-v1", false) || + IO.mapTag("tag:yaml.org,2002:map", false)) + Ctx->FileKind = FileType::TBD_V1; + else { IO.setError("unsupported file type"); return; } - } else { - // Set file type when writing. + } + + // Set file type when writing. + if (IO.outputting()) { switch (Ctx->FileKind) { default: llvm_unreachable("unexpected file type"); - case FileType::TBD_V4: - mapKeysToValuesV4(IO, File); - return; - case FileType::TBD_V3: - IO.mapTag("!tapi-tbd-v3", true); + case FileType::TBD_V1: + // Don't write the tag into the .tbd file for TBD v1. break; case FileType::TBD_V2: IO.mapTag("!tapi-tbd-v2", true); break; - case FileType::TBD_V1: - // Don't write the tag into the .tbd file for TBD v1 + case FileType::TBD_V3: + IO.mapTag("!tapi-tbd-v3", true); break; } } - mapKeysToValues(Ctx->FileKind, IO, File); - } - - using SectionList = std::vector; - struct NormalizedTBD_V4 { - explicit NormalizedTBD_V4(IO &IO) {} - NormalizedTBD_V4(IO &IO, const InterfaceFile *&File) { - auto Ctx = reinterpret_cast(IO.getContext()); - assert(Ctx); - TBDVersion = Ctx->FileKind >> 1; - Targets.insert(Targets.begin(), File->targets().begin(), - File->targets().end()); - for (const auto &IT : File->uuids()) - UUIDs.emplace_back(IT.first, IT.second); - InstallName = File->getInstallName(); - CurrentVersion = File->getCurrentVersion(); - CompatibilityVersion = File->getCompatibilityVersion(); - SwiftVersion = File->getSwiftABIVersion(); - - Flags = TBDFlags::None; - if (!File->isApplicationExtensionSafe()) - Flags |= TBDFlags::NotApplicationExtensionSafe; - - if (!File->isTwoLevelNamespace()) - Flags |= 
TBDFlags::FlatNamespace; - - if (File->isInstallAPI()) - Flags |= TBDFlags::InstallAPI; - - { - using TargetList = SmallVector; - std::map valueToTargetList; - for (const auto &it : File->umbrellas()) - valueToTargetList[it.second].emplace_back(it.first); - - for (const auto &it : valueToTargetList) { - UmbrellaSection CurrentSection; - CurrentSection.Targets.insert(CurrentSection.Targets.begin(), - it.second.begin(), it.second.end()); - CurrentSection.Umbrella = it.first; - ParentUmbrellas.emplace_back(std::move(CurrentSection)); - } - } - - assignTargetsToLibrary(File->allowableClients(), AllowableClients); - assignTargetsToLibrary(File->reexportedLibraries(), ReexportedLibraries); - - auto handleSymbols = - [](SectionList &CurrentSections, - InterfaceFile::const_filtered_symbol_range Symbols, - std::function Pred) { - using TargetList = SmallVector; - std::set TargetSet; - std::map SymbolToTargetList; - for (const auto *Symbol : Symbols) { - if (!Pred(Symbol)) - continue; - TargetList Targets(Symbol->targets()); - SymbolToTargetList[Symbol] = Targets; - TargetSet.emplace(std::move(Targets)); - } - for (const auto &TargetIDs : TargetSet) { - SymbolSection CurrentSection; - CurrentSection.Targets.insert(CurrentSection.Targets.begin(), - TargetIDs.begin(), TargetIDs.end()); - - for (const auto &IT : SymbolToTargetList) { - if (IT.second != TargetIDs) - continue; - - const auto *Symbol = IT.first; - switch (Symbol->getKind()) { - case SymbolKind::GlobalSymbol: - if (Symbol->isWeakDefined()) - CurrentSection.WeakSymbols.emplace_back(Symbol->getName()); - else if (Symbol->isThreadLocalValue()) - CurrentSection.TlvSymbols.emplace_back(Symbol->getName()); - else - CurrentSection.Symbols.emplace_back(Symbol->getName()); - break; - case SymbolKind::ObjectiveCClass: - CurrentSection.Classes.emplace_back(Symbol->getName()); - break; - case SymbolKind::ObjectiveCClassEHType: - CurrentSection.ClassEHs.emplace_back(Symbol->getName()); - break; - case 
SymbolKind::ObjectiveCInstanceVariable: - CurrentSection.Ivars.emplace_back(Symbol->getName()); - break; - } - } - sort(CurrentSection.Symbols); - sort(CurrentSection.Classes); - sort(CurrentSection.ClassEHs); - sort(CurrentSection.Ivars); - sort(CurrentSection.WeakSymbols); - sort(CurrentSection.TlvSymbols); - CurrentSections.emplace_back(std::move(CurrentSection)); - } - }; - - handleSymbols(Exports, File->exports(), [](const Symbol *Symbol) { - return !Symbol->isReexported(); - }); - handleSymbols(Reexports, File->exports(), [](const Symbol *Symbol) { - return Symbol->isReexported(); - }); - handleSymbols(Undefineds, File->undefineds(), - [](const Symbol *Symbol) { return true; }); - } - - const InterfaceFile *denormalize(IO &IO) { - auto Ctx = reinterpret_cast(IO.getContext()); - assert(Ctx); - - auto *File = new InterfaceFile; - File->setPath(Ctx->Path); - File->setFileType(Ctx->FileKind); - for (auto &id : UUIDs) - File->addUUID(id.TargetID, id.Value); - File->addTargets(Targets); - File->setInstallName(InstallName); - File->setCurrentVersion(CurrentVersion); - File->setCompatibilityVersion(CompatibilityVersion); - File->setSwiftABIVersion(SwiftVersion); - for (const auto &CurrentSection : ParentUmbrellas) - for (const auto &target : CurrentSection.Targets) - File->addParentUmbrella(target, CurrentSection.Umbrella); - File->setTwoLevelNamespace(!(Flags & TBDFlags::FlatNamespace)); - File->setApplicationExtensionSafe( - !(Flags & TBDFlags::NotApplicationExtensionSafe)); - File->setInstallAPI(Flags & TBDFlags::InstallAPI); - - for (const auto &CurrentSection : AllowableClients) { - for (const auto &lib : CurrentSection.Values) - for (const auto &Target : CurrentSection.Targets) - File->addAllowableClient(lib, Target); - } - - for (const auto &CurrentSection : ReexportedLibraries) { - for (const auto &Lib : CurrentSection.Values) - for (const auto &Target : CurrentSection.Targets) - File->addReexportedLibrary(Lib, Target); - } - - auto handleSymbols = 
[File](const SectionList &CurrentSections, - SymbolFlags Flag = SymbolFlags::None) { - for (const auto &CurrentSection : CurrentSections) { - for (auto &sym : CurrentSection.Symbols) - File->addSymbol(SymbolKind::GlobalSymbol, sym, - CurrentSection.Targets, Flag); - - for (auto &sym : CurrentSection.Classes) - File->addSymbol(SymbolKind::ObjectiveCClass, sym, - CurrentSection.Targets); - - for (auto &sym : CurrentSection.ClassEHs) - File->addSymbol(SymbolKind::ObjectiveCClassEHType, sym, - CurrentSection.Targets); - - for (auto &sym : CurrentSection.Ivars) - File->addSymbol(SymbolKind::ObjectiveCInstanceVariable, sym, - CurrentSection.Targets); - - for (auto &sym : CurrentSection.WeakSymbols) - File->addSymbol(SymbolKind::GlobalSymbol, sym, - CurrentSection.Targets); - for (auto &sym : CurrentSection.TlvSymbols) - File->addSymbol(SymbolKind::GlobalSymbol, sym, - CurrentSection.Targets, - SymbolFlags::ThreadLocalValue); - } - }; - - handleSymbols(Exports); - handleSymbols(Reexports, SymbolFlags::Rexported); - handleSymbols(Undefineds, SymbolFlags::Undefined); - - return File; - } - - unsigned TBDVersion; - std::vector UUIDs; - TargetList Targets; - StringRef InstallName; - PackedVersion CurrentVersion; - PackedVersion CompatibilityVersion; - SwiftVersion SwiftVersion{0}; - std::vector AllowableClients; - std::vector ReexportedLibraries; - TBDFlags Flags{TBDFlags::None}; - std::vector ParentUmbrellas; - SectionList Exports; - SectionList Reexports; - SectionList Undefineds; - - private: - using TargetList = SmallVector; - void assignTargetsToLibrary(const std::vector &Libraries, - std::vector &Section) { - std::set targetSet; - std::map valueToTargetList; - for (const auto &library : Libraries) { - TargetList targets(library.targets()); - valueToTargetList[&library] = targets; - targetSet.emplace(std::move(targets)); - } - - for (const auto &targets : targetSet) { - MetadataSection CurrentSection; - CurrentSection.Targets.insert(CurrentSection.Targets.begin(), - 
targets.begin(), targets.end()); - - for (const auto &it : valueToTargetList) { - if (it.second != targets) - continue; - CurrentSection.Values.emplace_back(it.first->getInstallName()); - } - llvm::sort(CurrentSection.Values); - Section.emplace_back(std::move(CurrentSection)); - } - } - }; - - static void mapKeysToValues(FileType FileKind, IO &IO, - const InterfaceFile *&File) { - MappingNormalization Keys(IO, File); IO.mapRequired("archs", Keys->Architectures); - if (FileKind != FileType::TBD_V1) + if (Ctx->FileKind != FileType::TBD_V1) IO.mapOptional("uuids", Keys->UUIDs); IO.mapRequired("platform", Keys->Platforms); - if (FileKind != FileType::TBD_V1) + if (Ctx->FileKind != FileType::TBD_V1) IO.mapOptional("flags", Keys->Flags, TBDFlags::None); IO.mapRequired("install-name", Keys->InstallName); IO.mapOptional("current-version", Keys->CurrentVersion, PackedVersion(1, 0, 0)); IO.mapOptional("compatibility-version", Keys->CompatibilityVersion, PackedVersion(1, 0, 0)); - if (FileKind != FileType::TBD_V3) + if (Ctx->FileKind != FileType::TBD_V3) IO.mapOptional("swift-version", Keys->SwiftABIVersion, SwiftVersion(0)); else IO.mapOptional("swift-abi-version", Keys->SwiftABIVersion, SwiftVersion(0)); IO.mapOptional("objc-constraint", Keys->ObjCConstraint, - (FileKind == FileType::TBD_V1) + (Ctx->FileKind == FileType::TBD_V1) ? 
ObjCConstraintType::None : ObjCConstraintType::Retain_Release); - if (FileKind != FileType::TBD_V1) + if (Ctx->FileKind != FileType::TBD_V1) IO.mapOptional("parent-umbrella", Keys->ParentUmbrella, StringRef()); IO.mapOptional("exports", Keys->Exports); - if (FileKind != FileType::TBD_V1) + if (Ctx->FileKind != FileType::TBD_V1) IO.mapOptional("undefineds", Keys->Undefineds); } - - static void mapKeysToValuesV4(IO &IO, const InterfaceFile *&File) { - MappingNormalization Keys(IO, - File); - IO.mapTag("!tapi-tbd", true); - IO.mapRequired("tbd-version", Keys->TBDVersion); - IO.mapRequired("targets", Keys->Targets); - IO.mapOptional("uuids", Keys->UUIDs); - IO.mapOptional("flags", Keys->Flags, TBDFlags::None); - IO.mapRequired("install-name", Keys->InstallName); - IO.mapOptional("current-version", Keys->CurrentVersion, - PackedVersion(1, 0, 0)); - IO.mapOptional("compatibility-version", Keys->CompatibilityVersion, - PackedVersion(1, 0, 0)); - IO.mapOptional("swift-abi-version", Keys->SwiftVersion, SwiftVersion(0)); - IO.mapOptional("parent-umbrella", Keys->ParentUmbrellas); - auto OptionKind = MetadataSection::Option::Clients; - IO.mapOptionalWithContext("allowable-clients", Keys->AllowableClients, - OptionKind); - OptionKind = MetadataSection::Option::Libraries; - IO.mapOptionalWithContext("reexported-libraries", Keys->ReexportedLibraries, - OptionKind); - IO.mapOptional("exports", Keys->Exports); - IO.mapOptional("reexports", Keys->Reexports); - IO.mapOptional("undefineds", Keys->Undefineds); - } }; template <> Modified: llvm/trunk/lib/TextAPI/MachO/TextStubCommon.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/TextAPI/MachO/TextStubCommon.cpp?rev=374062&r1=374061&r2=374062&view=diff ============================================================================== --- llvm/trunk/lib/TextAPI/MachO/TextStubCommon.cpp (original) +++ llvm/trunk/lib/TextAPI/MachO/TextStubCommon.cpp Tue Oct 8 08:24:37 2019 @@ -172,25 +172,14 @@ void ScalarTraits::output( break; } 
} -StringRef ScalarTraits::input(StringRef Scalar, void *IO, +StringRef ScalarTraits::input(StringRef Scalar, void *, SwiftVersion &Value) { - const auto *Ctx = reinterpret_cast(IO); - assert((!Ctx || Ctx->FileKind != FileType::Invalid) && - "File type is not set in context"); - - if (Ctx->FileKind == FileType::TBD_V4) { - if (Scalar.getAsInteger(10, Value)) - return "invalid Swift ABI version."; - return {}; - } else { - Value = StringSwitch(Scalar) - .Case("1.0", 1) - .Case("1.1", 2) - .Case("2.0", 3) - .Case("3.0", 4) - .Default(0); - } - + Value = StringSwitch(Scalar) + .Case("1.0", 1) + .Case("1.1", 2) + .Case("2.0", 3) + .Case("3.0", 4) + .Default(0); if (Value != SwiftVersion(0)) return {}; Modified: llvm/trunk/unittests/TextAPI/CMakeLists.txt URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/unittests/TextAPI/CMakeLists.txt?rev=374062&r1=374061&r2=374062&view=diff ============================================================================== --- llvm/trunk/unittests/TextAPI/CMakeLists.txt (original) +++ llvm/trunk/unittests/TextAPI/CMakeLists.txt Tue Oct 8 08:24:37 2019 @@ -7,7 +7,6 @@ add_llvm_unittest(TextAPITests TextStubV1Tests.cpp TextStubV2Tests.cpp TextStubV3Tests.cpp - TextStubV4Tests.cpp ) target_link_libraries(TextAPITests PRIVATE LLVMTestingSupport) Removed: llvm/trunk/unittests/TextAPI/TextStubV4Tests.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/unittests/TextAPI/TextStubV4Tests.cpp?rev=374061&view=auto ============================================================================== --- llvm/trunk/unittests/TextAPI/TextStubV4Tests.cpp (original) +++ llvm/trunk/unittests/TextAPI/TextStubV4Tests.cpp (removed) @@ -1,558 +0,0 @@ -//===-- TextStubV4Tests.cpp - TBD V4 File Test ----------------------------===// -// -// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. -// See https://llvm.org/LICENSE.txt for license information. 
-// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception -// -//===-----------------------------------------------------------------------===/ -#include "llvm/TextAPI/MachO/InterfaceFile.h" -#include "llvm/TextAPI/MachO/TextAPIReader.h" -#include "llvm/TextAPI/MachO/TextAPIWriter.h" -#include "gtest/gtest.h" -#include -#include - -using namespace llvm; -using namespace llvm::MachO; - -struct ExampleSymbol { - SymbolKind Kind; - std::string Name; - bool WeakDefined; - bool ThreadLocalValue; -}; -using ExampleSymbolSeq = std::vector; -using UUIDs = std::vector>; - -inline bool operator<(const ExampleSymbol &LHS, const ExampleSymbol &RHS) { - return std::tie(LHS.Kind, LHS.Name) < std::tie(RHS.Kind, RHS.Name); -} - -inline bool operator==(const ExampleSymbol &LHS, const ExampleSymbol &RHS) { - return std::tie(LHS.Kind, LHS.Name, LHS.WeakDefined, LHS.ThreadLocalValue) == - std::tie(RHS.Kind, RHS.Name, RHS.WeakDefined, RHS.ThreadLocalValue); -} - -static ExampleSymbol TBDv4ExportedSymbols[] = { - {SymbolKind::GlobalSymbol, "_symA", false, false}, - {SymbolKind::GlobalSymbol, "_symAB", false, false}, - {SymbolKind::GlobalSymbol, "_symB", false, false}, -}; - -static ExampleSymbol TBDv4ReexportedSymbols[] = { - {SymbolKind::GlobalSymbol, "_symC", false, false}, -}; - -static ExampleSymbol TBDv4UndefinedSymbols[] = { - {SymbolKind::GlobalSymbol, "_symD", false, false}, -}; - -namespace TBDv4 { - -TEST(TBDv4, ReadFile) { - static const char tbd_v4_file[] = - "--- !tapi-tbd\n" - "tbd-version: 4\n" - "targets: [ i386-macos, x86_64-macos, x86_64-ios ]\n" - "uuids:\n" - " - target: i386-macos\n" - " value: 00000000-0000-0000-0000-000000000000\n" - " - target: x86_64-macos\n" - " value: 11111111-1111-1111-1111-111111111111\n" - " - target: x86_64-ios\n" - " value: 11111111-1111-1111-1111-111111111111\n" - "flags: [ flat_namespace, installapi ]\n" - "install-name: Umbrella.framework/Umbrella\n" - "current-version: 1.2.3\n" - "compatibility-version: 1.2\n" - "swift-abi-version: 
5\n" - "parent-umbrella:\n" - " - targets: [ i386-macos, x86_64-macos, x86_64-ios ]\n" - " umbrella: System\n" - "allowable-clients:\n" - " - targets: [ i386-macos, x86_64-macos, x86_64-ios ]\n" - " clients: [ ClientA ]\n" - "reexported-libraries:\n" - " - targets: [ i386-macos ]\n" - " libraries: [ /System/Library/Frameworks/A.framework/A ]\n" - "exports:\n" - " - targets: [ i386-macos ]\n" - " symbols: [ _symA ]\n" - " objc-classes: []\n" - " objc-eh-types: []\n" - " objc-ivars: []\n" - " weak-symbols: []\n" - " thread-local-symbols: []\n" - " - targets: [ x86_64-ios ]\n" - " symbols: [_symB]\n" - " - targets: [ x86_64-macos, x86_64-ios ]\n" - " symbols: [_symAB]\n" - "reexports:\n" - " - targets: [ i386-macos ]\n" - " symbols: [_symC]\n" - " objc-classes: []\n" - " objc-eh-types: []\n" - " objc-ivars: []\n" - " weak-symbols: []\n" - " thread-local-symbols: []\n" - "undefineds:\n" - " - targets: [ i386-macos ]\n" - " symbols: [ _symD ]\n" - " objc-classes: []\n" - " objc-eh-types: []\n" - " objc-ivars: []\n" - " weak-symbols: []\n" - " thread-local-symbols: []\n" - "...\n"; - - auto Result = TextAPIReader::get(MemoryBufferRef(tbd_v4_file, "Test.tbd")); - EXPECT_TRUE(!!Result); - auto File = std::move(Result.get()); - EXPECT_EQ(FileType::TBD_V4, File->getFileType()); - PlatformSet Platforms; - Platforms.insert(PlatformKind::macOS); - Platforms.insert(PlatformKind::iOS); - auto Archs = AK_i386 | AK_x86_64; - TargetList Targets = { - Target(AK_i386, PlatformKind::macOS), - Target(AK_x86_64, PlatformKind::macOS), - Target(AK_x86_64, PlatformKind::iOS), - }; - UUIDs uuids = {{Targets[0], "00000000-0000-0000-0000-000000000000"}, - {Targets[1], "11111111-1111-1111-1111-111111111111"}, - {Targets[2], "11111111-1111-1111-1111-111111111111"}}; - EXPECT_EQ(Archs, File->getArchitectures()); - EXPECT_EQ(uuids, File->uuids()); - EXPECT_EQ(Platforms.size(), File->getPlatforms().size()); - for (auto Platform : File->getPlatforms()) - EXPECT_EQ(Platforms.count(Platform), 1U); - 
EXPECT_EQ(std::string("Umbrella.framework/Umbrella"), File->getInstallName()); - EXPECT_EQ(PackedVersion(1, 2, 3), File->getCurrentVersion()); - EXPECT_EQ(PackedVersion(1, 2, 0), File->getCompatibilityVersion()); - EXPECT_EQ(5U, File->getSwiftABIVersion()); - EXPECT_FALSE(File->isTwoLevelNamespace()); - EXPECT_TRUE(File->isApplicationExtensionSafe()); - EXPECT_TRUE(File->isInstallAPI()); - InterfaceFileRef client("ClientA", Targets); - InterfaceFileRef reexport("/System/Library/Frameworks/A.framework/A", - {Targets[0]}); - EXPECT_EQ(1U, File->allowableClients().size()); - EXPECT_EQ(client, File->allowableClients().front()); - EXPECT_EQ(1U, File->reexportedLibraries().size()); - EXPECT_EQ(reexport, File->reexportedLibraries().front()); - - ExampleSymbolSeq Exports, Reexports, Undefineds; - ExampleSymbol temp; - for (const auto *Sym : File->symbols()) { - temp = ExampleSymbol{Sym->getKind(), Sym->getName(), Sym->isWeakDefined(), - Sym->isThreadLocalValue()}; - EXPECT_FALSE(Sym->isWeakReferenced()); - if (Sym->isUndefined()) - Undefineds.emplace_back(std::move(temp)); - else - Sym->isReexported() ? 
Reexports.emplace_back(std::move(temp)) - : Exports.emplace_back(std::move(temp)); - } - llvm::sort(Exports.begin(), Exports.end()); - llvm::sort(Reexports.begin(), Reexports.end()); - llvm::sort(Undefineds.begin(), Undefineds.end()); - - EXPECT_EQ(sizeof(TBDv4ExportedSymbols) / sizeof(ExampleSymbol), - Exports.size()); - EXPECT_EQ(sizeof(TBDv4ReexportedSymbols) / sizeof(ExampleSymbol), - Reexports.size()); - EXPECT_EQ(sizeof(TBDv4UndefinedSymbols) / sizeof(ExampleSymbol), - Undefineds.size()); - EXPECT_TRUE(std::equal(Exports.begin(), Exports.end(), - std::begin(TBDv4ExportedSymbols))); - EXPECT_TRUE(std::equal(Reexports.begin(), Reexports.end(), - std::begin(TBDv4ReexportedSymbols))); - EXPECT_TRUE(std::equal(Undefineds.begin(), Undefineds.end(), - std::begin(TBDv4UndefinedSymbols))); -} - -TEST(TBDv4, WriteFile) { - static const char tbd_v4_file[] = - "--- !tapi-tbd\n" - "tbd-version: 4\n" - "targets: [ i386-macos, x86_64-ios-simulator ]\n" - "uuids:\n" - " - target: i386-macos\n" - " value: 00000000-0000-0000-0000-000000000000\n" - " - target: x86_64-ios-simulator\n" - " value: 11111111-1111-1111-1111-111111111111\n" - "flags: [ installapi ]\n" - "install-name: 'Umbrella.framework/Umbrella'\n" - "current-version: 1.2.3\n" - "compatibility-version: 0\n" - "swift-abi-version: 5\n" - "parent-umbrella:\n" - " - targets: [ i386-macos, x86_64-ios-simulator ]\n" - " umbrella: System\n" - "allowable-clients:\n" - " - targets: [ i386-macos ]\n" - " clients: [ ClientA ]\n" - "exports:\n" - " - targets: [ i386-macos ]\n" - " symbols: [ _symA ]\n" - " objc-classes: [ Class1 ]\n" - " weak-symbols: [ _symC ]\n" - " - targets: [ x86_64-ios-simulator ]\n" - " symbols: [ _symB ]\n" - "...\n"; - - InterfaceFile File; - TargetList Targets = { - Target(AK_i386, PlatformKind::macOS), - Target(AK_x86_64, PlatformKind::iOSSimulator), - }; - UUIDs uuids = {{Targets[0], "00000000-0000-0000-0000-000000000000"}, - {Targets[1], "11111111-1111-1111-1111-111111111111"}}; - 
File.setInstallName("Umbrella.framework/Umbrella"); - File.setFileType(FileType::TBD_V4); - File.addTargets(Targets); - File.addUUID(uuids[0].first, uuids[0].second); - File.addUUID(uuids[1].first, uuids[1].second); - File.setCurrentVersion(PackedVersion(1, 2, 3)); - File.setTwoLevelNamespace(); - File.setInstallAPI(true); - File.setApplicationExtensionSafe(true); - File.setSwiftABIVersion(5); - File.addAllowableClient("ClientA", Targets[0]); - File.addParentUmbrella(Targets[0], "System"); - File.addParentUmbrella(Targets[1], "System"); - File.addSymbol(SymbolKind::GlobalSymbol, "_symA", {Targets[0]}); - File.addSymbol(SymbolKind::GlobalSymbol, "_symB", {Targets[1]}); - File.addSymbol(SymbolKind::GlobalSymbol, "_symC", {Targets[0]}, - SymbolFlags::WeakDefined); - File.addSymbol(SymbolKind::ObjectiveCClass, "Class1", {Targets[0]}); - - SmallString<4096> Buffer; - raw_svector_ostream OS(Buffer); - auto Result = TextAPIWriter::writeToStream(OS, File); - EXPECT_FALSE(Result); - EXPECT_STREQ(tbd_v4_file, Buffer.c_str()); -} - -TEST(TBDv4, MultipleTargets) { - static const char tbd_multiple_targets[] = - "--- !tapi-tbd\n" - "tbd-version: 4\n" - "targets: [ i386-maccatalyst, x86_64-tvos, arm64-ios ]\n" - "install-name: Test.dylib\n" - "...\n"; - - auto Result = - TextAPIReader::get(MemoryBufferRef(tbd_multiple_targets, "Test.tbd")); - EXPECT_TRUE(!!Result); - PlatformSet Platforms; - Platforms.insert(PlatformKind::macCatalyst); - Platforms.insert(PlatformKind::tvOS); - Platforms.insert(PlatformKind::iOS); - auto File = std::move(Result.get()); - EXPECT_EQ(FileType::TBD_V4, File->getFileType()); - EXPECT_EQ(AK_x86_64 | AK_arm64 | AK_i386, File->getArchitectures()); - EXPECT_EQ(Platforms.size(), File->getPlatforms().size()); - for (auto Platform : File->getPlatforms()) - EXPECT_EQ(Platforms.count(Platform), 1U); -} - -TEST(TBDv4, MultipleTargetsSameArch) { - static const char tbd_targets_same_arch[] = - "--- !tapi-tbd\n" - "tbd-version: 4\n" - "targets: [ 
x86_64-maccatalyst, x86_64-tvos ]\n" - "install-name: Test.dylib\n" - "...\n"; - - auto Result = - TextAPIReader::get(MemoryBufferRef(tbd_targets_same_arch, "Test.tbd")); - EXPECT_TRUE(!!Result); - PlatformSet Platforms; - Platforms.insert(PlatformKind::tvOS); - Platforms.insert(PlatformKind::macCatalyst); - auto File = std::move(Result.get()); - EXPECT_EQ(FileType::TBD_V4, File->getFileType()); - EXPECT_EQ(ArchitectureSet(AK_x86_64), File->getArchitectures()); - EXPECT_EQ(Platforms.size(), File->getPlatforms().size()); - for (auto Platform : File->getPlatforms()) - EXPECT_EQ(Platforms.count(Platform), 1U); -} - -TEST(TBDv4, MultipleTargetsSamePlatform) { - static const char tbd_multiple_targets_same_platform[] = - "--- !tapi-tbd\n" - "tbd-version: 4\n" - "targets: [ arm64-ios, armv7k-ios ]\n" - "install-name: Test.dylib\n" - "...\n"; - - auto Result = TextAPIReader::get( - MemoryBufferRef(tbd_multiple_targets_same_platform, "Test.tbd")); - EXPECT_TRUE(!!Result); - auto File = std::move(Result.get()); - EXPECT_EQ(FileType::TBD_V4, File->getFileType()); - EXPECT_EQ(AK_arm64 | AK_armv7k, File->getArchitectures()); - EXPECT_EQ(File->getPlatforms().size(), 1U); - EXPECT_EQ(PlatformKind::iOS, *File->getPlatforms().begin()); -} - -TEST(TBDv4, Target_maccatalyst) { - static const char tbd_target_maccatalyst[] = - "--- !tapi-tbd\n" - "tbd-version: 4\n" - "targets: [ x86_64-maccatalyst ]\n" - "install-name: Test.dylib\n" - "...\n"; - - auto Result = - TextAPIReader::get(MemoryBufferRef(tbd_target_maccatalyst, "Test.tbd")); - EXPECT_TRUE(!!Result); - auto File = std::move(Result.get()); - EXPECT_EQ(FileType::TBD_V4, File->getFileType()); - EXPECT_EQ(ArchitectureSet(AK_x86_64), File->getArchitectures()); - EXPECT_EQ(File->getPlatforms().size(), 1U); - EXPECT_EQ(PlatformKind::macCatalyst, *File->getPlatforms().begin()); -} - -TEST(TBDv4, Target_x86_ios) { - static const char tbd_target_x86_ios[] = "--- !tapi-tbd\n" - "tbd-version: 4\n" - "targets: [ x86_64-ios ]\n" - 
"install-name: Test.dylib\n" - "...\n"; - - auto Result = - TextAPIReader::get(MemoryBufferRef(tbd_target_x86_ios, "Test.tbd")); - EXPECT_TRUE(!!Result); - auto File = std::move(Result.get()); - EXPECT_EQ(FileType::TBD_V4, File->getFileType()); - EXPECT_EQ(ArchitectureSet(AK_x86_64), File->getArchitectures()); - EXPECT_EQ(File->getPlatforms().size(), 1U); - EXPECT_EQ(PlatformKind::iOS, *File->getPlatforms().begin()); -} - -TEST(TBDv4, Target_arm_bridgeOS) { - static const char tbd_platform_bridgeos[] = "--- !tapi-tbd\n" - "tbd-version: 4\n" - "targets: [ armv7k-bridgeos ]\n" - "install-name: Test.dylib\n" - "...\n"; - - auto Result = - TextAPIReader::get(MemoryBufferRef(tbd_platform_bridgeos, "Test.tbd")); - EXPECT_TRUE(!!Result); - auto File = std::move(Result.get()); - EXPECT_EQ(FileType::TBD_V4, File->getFileType()); - EXPECT_EQ(File->getPlatforms().size(), 1U); - EXPECT_EQ(PlatformKind::bridgeOS, *File->getPlatforms().begin()); - EXPECT_EQ(ArchitectureSet(AK_armv7k), File->getArchitectures()); -} - -TEST(TBDv4, Target_x86_macos) { - static const char tbd_x86_macos[] = "--- !tapi-tbd\n" - "tbd-version: 4\n" - "targets: [ x86_64-macos ]\n" - "install-name: Test.dylib\n" - "...\n"; - - auto Result = TextAPIReader::get(MemoryBufferRef(tbd_x86_macos, "Test.tbd")); - EXPECT_TRUE(!!Result); - auto File = std::move(Result.get()); - EXPECT_EQ(FileType::TBD_V4, File->getFileType()); - EXPECT_EQ(ArchitectureSet(AK_x86_64), File->getArchitectures()); - EXPECT_EQ(File->getPlatforms().size(), 1U); - EXPECT_EQ(PlatformKind::macOS, *File->getPlatforms().begin()); -} - -TEST(TBDv4, Target_x86_ios_simulator) { - static const char tbd_x86_ios_sim[] = "--- !tapi-tbd\n" - "tbd-version: 4\n" - "targets: [ x86_64-ios-simulator ]\n" - "install-name: Test.dylib\n" - "...\n"; - - auto Result = - TextAPIReader::get(MemoryBufferRef(tbd_x86_ios_sim, "Test.tbd")); - EXPECT_TRUE(!!Result); - auto File = std::move(Result.get()); - EXPECT_EQ(FileType::TBD_V4, File->getFileType()); - 
EXPECT_EQ(ArchitectureSet(AK_x86_64), File->getArchitectures()); - EXPECT_EQ(File->getPlatforms().size(), 1U); - EXPECT_EQ(PlatformKind::iOSSimulator, *File->getPlatforms().begin()); -} - -TEST(TBDv4, Target_x86_tvos_simulator) { - static const char tbd_x86_tvos_sim[] = - "--- !tapi-tbd\n" - "tbd-version: 4\n" - "targets: [ x86_64-tvos-simulator ]\n" - "install-name: Test.dylib\n" - "...\n"; - - auto Result = - TextAPIReader::get(MemoryBufferRef(tbd_x86_tvos_sim, "Test.tbd")); - EXPECT_TRUE(!!Result); - auto File = std::move(Result.get()); - EXPECT_EQ(FileType::TBD_V4, File->getFileType()); - EXPECT_EQ(ArchitectureSet(AK_x86_64), File->getArchitectures()); - EXPECT_EQ(File->getPlatforms().size(), 1U); - EXPECT_EQ(PlatformKind::tvOSSimulator, *File->getPlatforms().begin()); -} - -TEST(TBDv4, Target_i386_watchos_simulator) { - static const char tbd_i386_watchos_sim[] = - "--- !tapi-tbd\n" - "tbd-version: 4\n" - "targets: [ i386-watchos-simulator ]\n" - "install-name: Test.dylib\n" - "...\n"; - - auto Result = - TextAPIReader::get(MemoryBufferRef(tbd_i386_watchos_sim, "Test.tbd")); - EXPECT_TRUE(!!Result); - auto File = std::move(Result.get()); - EXPECT_EQ(FileType::TBD_V4, File->getFileType()); - EXPECT_EQ(ArchitectureSet(AK_i386), File->getArchitectures()); - EXPECT_EQ(File->getPlatforms().size(), 1U); - EXPECT_EQ(PlatformKind::watchOSSimulator, *File->getPlatforms().begin()); -} - -TEST(TBDv4, Swift_1) { - static const char tbd_swift_1[] = "--- !tapi-tbd\n" - "tbd-version: 4\n" - "targets: [ x86_64-macos ]\n" - "install-name: Test.dylib\n" - "swift-abi-version: 1\n" - "...\n"; - - auto Result = TextAPIReader::get(MemoryBufferRef(tbd_swift_1, "Test.tbd")); - EXPECT_TRUE(!!Result); - auto File = std::move(Result.get()); - EXPECT_EQ(FileType::TBD_V4, File->getFileType()); - EXPECT_EQ(1U, File->getSwiftABIVersion()); -} - -TEST(TBDv4, Swift_2) { - static const char tbd_v1_swift_2[] = "--- !tapi-tbd\n" - "tbd-version: 4\n" - "targets: [ x86_64-macos ]\n" - 
"install-name: Test.dylib\n" - "swift-abi-version: 2\n" - "...\n"; - - auto Result = TextAPIReader::get(MemoryBufferRef(tbd_v1_swift_2, "Test.tbd")); - EXPECT_TRUE(!!Result); - auto File = std::move(Result.get()); - EXPECT_EQ(FileType::TBD_V4, File->getFileType()); - EXPECT_EQ(2U, File->getSwiftABIVersion()); -} - -TEST(TBDv4, Swift_5) { - static const char tbd_swift_5[] = "--- !tapi-tbd\n" - "tbd-version: 4\n" - "targets: [ x86_64-macos ]\n" - "install-name: Test.dylib\n" - "swift-abi-version: 5\n" - "...\n"; - - auto Result = TextAPIReader::get(MemoryBufferRef(tbd_swift_5, "Test.tbd")); - EXPECT_TRUE(!!Result); - auto File = std::move(Result.get()); - EXPECT_EQ(FileType::TBD_V4, File->getFileType()); - EXPECT_EQ(5U, File->getSwiftABIVersion()); -} - -TEST(TBDv4, Swift_99) { - static const char tbd_swift_99[] = "--- !tapi-tbd\n" - "tbd-version: 4\n" - "targets: [ x86_64-macos ]\n" - "install-name: Test.dylib\n" - "swift-abi-version: 99\n" - "...\n"; - - auto Result = TextAPIReader::get(MemoryBufferRef(tbd_swift_99, "Test.tbd")); - EXPECT_TRUE(!!Result); - auto File = std::move(Result.get()); - EXPECT_EQ(FileType::TBD_V4, File->getFileType()); - EXPECT_EQ(99U, File->getSwiftABIVersion()); -} - -TEST(TBDv4, InvalidArchitecture) { - static const char tbd_file_unknown_architecture[] = - "--- !tapi-tbd\n" - "tbd-version: 4\n" - "targets: [ foo-macos ]\n" - "install-name: Test.dylib\n" - "...\n"; - - auto Result = TextAPIReader::get( - MemoryBufferRef(tbd_file_unknown_architecture, "Test.tbd")); - EXPECT_FALSE(!!Result); - auto errorMessage = toString(Result.takeError()); - ASSERT_TRUE(errorMessage.compare(0, 15, "malformed file\n") == 0); -} - -TEST(TBDv4, InvalidPlatform) { - static const char tbd_file_invalid_platform[] = "--- !tapi-tbd\n" - "tbd-version: 4\n" - "targets: [ x86_64-maos ]\n" - "install-name: Test.dylib\n" - "...\n"; - - auto Result = TextAPIReader::get( - MemoryBufferRef(tbd_file_invalid_platform, "Test.tbd")); - EXPECT_FALSE(!!Result); - auto 
errorMessage = toString(Result.takeError()); - ASSERT_TRUE(errorMessage.compare(0, 15, "malformed file\n") == 0); -} - -TEST(TBDv4, MalformedFile1) { - static const char malformed_file1[] = "--- !tapi-tbd\n" - "tbd-version: 4\n" - "...\n"; - - auto Result = - TextAPIReader::get(MemoryBufferRef(malformed_file1, "Test.tbd")); - EXPECT_FALSE(!!Result); - auto errorMessage = toString(Result.takeError()); - ASSERT_EQ("malformed file\nTest.tbd:2:1: error: missing required key " - "'targets'\ntbd-version: 4\n^\n", - errorMessage); -} - -TEST(TBDv4, MalformedFile2) { - static const char malformed_file2[] = "--- !tapi-tbd\n" - "tbd-version: 4\n" - "targets: [ x86_64-macos ]\n" - "install-name: Test.dylib\n" - "foobar: \"unsupported key\"\n"; - - auto Result = - TextAPIReader::get(MemoryBufferRef(malformed_file2, "Test.tbd")); - EXPECT_FALSE(!!Result); - auto errorMessage = toString(Result.takeError()); - ASSERT_EQ( - "malformed file\nTest.tbd:5:9: error: unknown key 'foobar'\nfoobar: " - "\"unsupported key\"\n ^~~~~~~~~~~~~~~~~\n", - errorMessage); -} - -TEST(TBDv4, MalformedFile3) { - static const char tbd_v1_swift_1_1[] = "--- !tapi-tbd\n" - "tbd-version: 4\n" - "targets: [ x86_64-macos ]\n" - "install-name: Test.dylib\n" - "swift-abi-version: 1.1\n" - "...\n"; - - auto Result = - TextAPIReader::get(MemoryBufferRef(tbd_v1_swift_1_1, "Test.tbd")); - EXPECT_FALSE(!!Result); - auto errorMessage = toString(Result.takeError()); - EXPECT_EQ("malformed file\nTest.tbd:5:20: error: invalid Swift ABI " - "version.\nswift-abi-version: 1.1\n ^~~\n", - errorMessage); -} - -} // end namespace TBDv4 From llvm-commits at lists.llvm.org Tue Oct 8 08:25:57 2019 From: llvm-commits at lists.llvm.org (Hideto Ueno via llvm-commits) Date: Tue, 08 Oct 2019 15:25:57 -0000 Subject: [llvm] r374063 - [Attributor][MustExec] Deduce dereferenceable and nonnull attribute using MustBeExecutedContextExplorer Message-ID: <20191008152557.332598939C@lists.llvm.org> Author: uenoku Date: Tue Oct 8 08:25:56 
2019
New Revision: 374063

URL: http://llvm.org/viewvc/llvm-project?rev=374063&view=rev
Log:
[Attributor][MustExec] Deduce dereferenceable and nonnull attribute using MustBeExecutedContextExplorer

Summary:
In D65186 and related patches, MustBeExecutedContextExplorer is introduced. This enables us to traverse instructions guaranteed to execute from function entry. If we know the argument is used as `dereferenceable` or `nonnull` in these instructions, we can mark it `dereferenceable` or `nonnull` in the argument definition:

1. Memory instruction (similar to D64258)
Trace the pointer operand of a memory instruction. Currently, only inbounds GEPs are traced.

```
define i64* @f(i64* %a) {
entry:
  %add.ptr = getelementptr inbounds i64, i64* %a, i64 1
  ; (because of inbounds GEP we can know that %a is at least dereferenceable(16))
  store i64 1, i64* %add.ptr, align 8
  ret i64* %add.ptr ; dereferenceable 8 (because above instruction stores into it)
}
```

2. Propagation from callsite (similar to D27855)
If `deref` or `nonnull` is known from the call site parameter attributes, we can also say that the argument has that attribute.
```
declare void @use3(i8* %x, i8* %y, i8* %z);
declare void @use3nonnull(i8* nonnull %x, i8* nonnull %y, i8* nonnull %z);

define void @parent1(i8* %a, i8* %b, i8* %c) {
  call void @use3nonnull(i8* %b, i8* %c, i8* %a)
  ; The above instruction is always executed, so we can say parent1(i8* nonnull %a, i8* nonnull %b, i8* nonnull %c)
  call void @use3(i8* %c, i8* %a, i8* %b)
  ret void
}
```

Reviewers: jdoerfert, sstefan1, spatel, reames

Reviewed By: jdoerfert

Subscribers: xbolva00, hiraditya, jfb, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65402

Modified:
    llvm/trunk/include/llvm/Transforms/IPO/Attributor.h
    llvm/trunk/lib/Transforms/IPO/Attributor.cpp
    llvm/trunk/test/Transforms/FunctionAttrs/align.ll
    llvm/trunk/test/Transforms/FunctionAttrs/arg_nocapture.ll
    llvm/trunk/test/Transforms/FunctionAttrs/arg_returned.ll
    llvm/trunk/test/Transforms/FunctionAttrs/callbacks.ll
    llvm/trunk/test/Transforms/FunctionAttrs/dereferenceable.ll
    llvm/trunk/test/Transforms/FunctionAttrs/internal-noalias.ll
    llvm/trunk/test/Transforms/FunctionAttrs/liveness.ll
    llvm/trunk/test/Transforms/FunctionAttrs/noalias_returned.ll
    llvm/trunk/test/Transforms/FunctionAttrs/nocapture.ll
    llvm/trunk/test/Transforms/FunctionAttrs/nonnull.ll
    llvm/trunk/test/Transforms/FunctionAttrs/norecurse.ll
    llvm/trunk/test/Transforms/FunctionAttrs/nosync.ll
    llvm/trunk/test/Transforms/FunctionAttrs/read_write_returned_arguments_scc.ll
    llvm/trunk/test/Transforms/FunctionAttrs/readattrs.ll
    llvm/trunk/test/Transforms/InferFunctionAttrs/dereferenceable.ll

Modified: llvm/trunk/include/llvm/Transforms/IPO/Attributor.h
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Transforms/IPO/Attributor.h?rev=374063&r1=374062&r2=374063&view=diff
==============================================================================
--- llvm/trunk/include/llvm/Transforms/IPO/Attributor.h (original)
+++ llvm/trunk/include/llvm/Transforms/IPO/Attributor.h Tue Oct 8 08:25:56 2019
@@ -101,6 +101,7 @@
 #include
"llvm/ADT/SetVector.h" #include "llvm/Analysis/AliasAnalysis.h" #include "llvm/Analysis/CallGraph.h" +#include "llvm/Analysis/MustExecute.h" #include "llvm/Analysis/TargetLibraryInfo.h" #include "llvm/IR/CallSite.h" #include "llvm/IR/PassManager.h" @@ -595,7 +596,7 @@ private: /// instance down in the abstract attributes. struct InformationCache { InformationCache(const Module &M, AnalysisGetter &AG) - : DL(M.getDataLayout()), AG(AG) { + : DL(M.getDataLayout()), Explorer(/* ExploreInterBlock */ true), AG(AG) { CallGraph *CG = AG.getAnalysis(M); if (!CG) @@ -626,6 +627,11 @@ struct InformationCache { return FuncRWInstsMap[&F]; } + /// Return MustBeExecutedContextExplorer + MustBeExecutedContextExplorer &getMustBeExecutedContextExplorer() { + return Explorer; + } + /// Return TargetLibraryInfo for function \p F. TargetLibraryInfo *getTargetLibraryInfoForFunction(const Function &F) { return AG.getAnalysis(F); @@ -663,6 +669,9 @@ private: /// The datalayout used in the module. const DataLayout &DL; + /// MustBeExecutedContextExplorer + MustBeExecutedContextExplorer Explorer; + /// Getters for analysis. AnalysisGetter &AG; @@ -1714,6 +1723,11 @@ struct AADereferenceable return NonNullAA && NonNullAA->isAssumedNonNull(); } + /// Return true if we know that the underlying value is nonnull. + bool isKnownNonNull() const { + return NonNullAA && NonNullAA->isKnownNonNull(); + } + /// Return true if we assume that underlying value is /// dereferenceable(_or_null) globally. 
bool isAssumedGlobal() const { return GlobalState.getAssumed(); } Modified: llvm/trunk/lib/Transforms/IPO/Attributor.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/IPO/Attributor.cpp?rev=374063&r1=374062&r2=374063&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/IPO/Attributor.cpp (original) +++ llvm/trunk/lib/Transforms/IPO/Attributor.cpp Tue Oct 8 08:25:56 2019 @@ -288,6 +288,35 @@ static bool addIfNotExistent(LLVMContext llvm_unreachable("Expected enum or string attribute!"); } +static const Value *getPointerOperand(const Instruction *I) { + if (auto *LI = dyn_cast<LoadInst>(I)) + if (!LI->isVolatile()) + return LI->getPointerOperand(); + + if (auto *SI = dyn_cast<StoreInst>(I)) + if (!SI->isVolatile()) + return SI->getPointerOperand(); + + if (auto *CXI = dyn_cast<AtomicCmpXchgInst>(I)) + if (!CXI->isVolatile()) + return CXI->getPointerOperand(); + + if (auto *RMWI = dyn_cast<AtomicRMWInst>(I)) + if (!RMWI->isVolatile()) + return RMWI->getPointerOperand(); + + return nullptr; +} +static const Value *getBasePointerOfAccessPointerOperand(const Instruction *I, + int64_t &BytesOffset, + const DataLayout &DL) { + const Value *Ptr = getPointerOperand(I); + if (!Ptr) + return nullptr; + + return GetPointerBaseWithConstantOffset(Ptr, BytesOffset, DL, + /*AllowNonInbounds*/ false); +} ChangeStatus AbstractAttribute::update(Attributor &A) { ChangeStatus HasChanged = ChangeStatus::UNCHANGED; @@ -654,7 +683,8 @@ struct AAArgumentFromCallSiteArguments : }; /// Helper class for generic replication: function returned -> cs returned. -template +template struct AACallSiteReturnedFromReturned : public Base { AACallSiteReturnedFromReturned(const IRPosition &IRP) : Base(IRP) {} @@ -678,6 +708,80 @@ struct AACallSiteReturnedFromReturned : } }; +/// Helper class for generic deduction using must-be-executed-context +/// Base class is required to have `followUse` method.
+ +/// bool followUse(Attributor &A, const Use *U, const Instruction *I) +/// \param U Underlying use. +/// \param I The user of the \p U. +/// `followUse` returns true if the value should be tracked transitively. + +template +struct AAFromMustBeExecutedContext : public Base { + AAFromMustBeExecutedContext(const IRPosition &IRP) : Base(IRP) {} + + void initialize(Attributor &A) override { + Base::initialize(A); + IRPosition &IRP = this->getIRPosition(); + Instruction *CtxI = IRP.getCtxI(); + + if (!CtxI) + return; + + for (const Use &U : IRP.getAssociatedValue().uses()) + Uses.insert(&U); + } + + /// See AbstractAttribute::updateImpl(...). + ChangeStatus updateImpl(Attributor &A) override { + auto BeforeState = this->getState(); + auto &S = this->getState(); + Instruction *CtxI = this->getIRPosition().getCtxI(); + if (!CtxI) + return ChangeStatus::UNCHANGED; + + MustBeExecutedContextExplorer &Explorer = + A.getInfoCache().getMustBeExecutedContextExplorer(); + + SetVector NextUses; + + for (const Use *U : Uses) { + if (const Instruction *UserI = dyn_cast(U->getUser())) { + auto EIt = Explorer.begin(CtxI), EEnd = Explorer.end(CtxI); + bool Found = EIt.count(UserI); + while (!Found && ++EIt != EEnd) + Found = EIt.getCurrentInst() == UserI; + if (Found && Base::followUse(A, U, UserI)) + for (const Use &Us : UserI->uses()) + NextUses.insert(&Us); + } + } + for (const Use *U : NextUses) + Uses.insert(U); + + return BeforeState == S ? ChangeStatus::UNCHANGED : ChangeStatus::CHANGED; + } + +private: + /// Container for (transitive) uses of the associated value. 
+ SetVector Uses; +}; + +template +using AAArgumentFromCallSiteArgumentsAndMustBeExecutedContext = + AAComposeTwoGenericDeduction; + +template +using AACallSiteReturnedFromReturnedAndMustBeExecutedContext = + AAComposeTwoGenericDeduction; + /// -----------------------NoUnwind Function Attribute-------------------------- struct AANoUnwindImpl : AANoUnwind { @@ -1434,6 +1538,46 @@ struct AANoFreeCallSite final : AANoFree }; /// ------------------------ NonNull Argument Attribute ------------------------ +static int64_t getKnownNonNullAndDerefBytesForUse( + Attributor &A, AbstractAttribute &QueryingAA, Value &AssociatedValue, + const Use *U, const Instruction *I, bool &IsNonNull, bool &TrackUse) { + // TODO: Add GEP support + TrackUse = false; + + const Function *F = I->getFunction(); + bool NullPointerIsDefined = F ? F->nullPointerIsDefined() : true; + const DataLayout &DL = A.getInfoCache().getDL(); + if (ImmutableCallSite ICS = ImmutableCallSite(I)) { + if (ICS.isBundleOperand(U)) + return 0; + + if (ICS.isCallee(U)) { + IsNonNull |= !NullPointerIsDefined; + return 0; + } + + unsigned ArgNo = ICS.getArgumentNo(U); + IRPosition IRP = IRPosition::callsite_argument(ICS, ArgNo); + auto &DerefAA = A.getAAFor(QueryingAA, IRP); + IsNonNull |= DerefAA.isKnownNonNull(); + return DerefAA.getKnownDereferenceableBytes(); + } + + int64_t Offset; + if (const Value *Base = getBasePointerOfAccessPointerOperand(I, Offset, DL)) { + if (Base == &AssociatedValue) { + int64_t DerefBytes = + Offset + + (int64_t)DL.getTypeStoreSize( + getPointerOperand(I)->getType()->getPointerElementType()); + + IsNonNull |= !NullPointerIsDefined; + return DerefBytes; + } + } + + return 0; +} struct AANonNullImpl : AANonNull { AANonNullImpl(const IRPosition &IRP) : AANonNull(IRP) {} @@ -1445,6 +1589,16 @@ struct AANonNullImpl : AANonNull { AANonNull::initialize(A); } + /// See AAFromMustBeExecutedContext + bool followUse(Attributor &A, const Use *U, const Instruction *I) { + bool IsNonNull = false; + 
bool TrackUse = false; + getKnownNonNullAndDerefBytesForUse(A, *this, getAssociatedValue(), U, I, + IsNonNull, TrackUse); + takeKnownMaximum(IsNonNull); + return TrackUse; + } + /// See AbstractAttribute::getAsStr(). const std::string getAsStr() const override { return getAssumed() ? "nonnull" : "may-null"; @@ -1452,12 +1606,14 @@ struct AANonNullImpl : AANonNull { }; /// NonNull attribute for a floating value. -struct AANonNullFloating : AANonNullImpl { - AANonNullFloating(const IRPosition &IRP) : AANonNullImpl(IRP) {} +struct AANonNullFloating + : AAFromMustBeExecutedContext { + using Base = AAFromMustBeExecutedContext; + AANonNullFloating(const IRPosition &IRP) : Base(IRP) {} /// See AbstractAttribute::initialize(...). void initialize(Attributor &A) override { - AANonNullImpl::initialize(A); + Base::initialize(A); if (isAtFixpoint()) return; @@ -1475,6 +1631,10 @@ struct AANonNullFloating : AANonNullImpl /// See AbstractAttribute::updateImpl(...). ChangeStatus updateImpl(Attributor &A) override { + ChangeStatus Change = Base::updateImpl(A); + if (isKnownNonNull()) + return Change; + const DataLayout &DL = A.getDataLayout(); auto VisitValueCB = [&](Value &V, AAAlign::StateType &T, @@ -1518,9 +1678,12 @@ struct AANonNullReturned final /// NonNull attribute for function argument. struct AANonNullArgument final - : AAArgumentFromCallSiteArguments { + : AAArgumentFromCallSiteArgumentsAndMustBeExecutedContext { AANonNullArgument(const IRPosition &IRP) - : AAArgumentFromCallSiteArguments(IRP) {} + : AAArgumentFromCallSiteArgumentsAndMustBeExecutedContext( + IRP) {} /// See AbstractAttribute::trackStatistics() void trackStatistics() const override { STATS_DECLTRACK_ARG_ATTR(nonnull) } @@ -1535,9 +1698,12 @@ struct AANonNullCallSiteArgument final : /// NonNull attribute for a call site return position. 
struct AANonNullCallSiteReturned final - : AACallSiteReturnedFromReturned { + : AACallSiteReturnedFromReturnedAndMustBeExecutedContext { AANonNullCallSiteReturned(const IRPosition &IRP) - : AACallSiteReturnedFromReturned(IRP) {} + : AACallSiteReturnedFromReturnedAndMustBeExecutedContext( + IRP) {} /// See AbstractAttribute::trackStatistics() void trackStatistics() const override { STATS_DECLTRACK_CSRET_ATTR(nonnull) } @@ -2290,6 +2456,16 @@ struct AADereferenceableImpl : AADerefer const StateType &getState() const override { return *this; } /// } + /// See AAFromMustBeExecutedContext + bool followUse(Attributor &A, const Use *U, const Instruction *I) { + bool IsNonNull = false; + bool TrackUse = false; + int64_t DerefBytes = getKnownNonNullAndDerefBytesForUse( + A, *this, getAssociatedValue(), U, I, IsNonNull, TrackUse); + takeKnownDerefBytesMaximum(DerefBytes); + return TrackUse; + } + void getDeducedAttributes(LLVMContext &Ctx, SmallVectorImpl &Attrs) const override { // TODO: Add *_globally support @@ -2314,12 +2490,16 @@ struct AADereferenceableImpl : AADerefer }; /// Dereferenceable attribute for a floating value. -struct AADereferenceableFloating : AADereferenceableImpl { - AADereferenceableFloating(const IRPosition &IRP) - : AADereferenceableImpl(IRP) {} +struct AADereferenceableFloating + : AAFromMustBeExecutedContext { + using Base = + AAFromMustBeExecutedContext; + AADereferenceableFloating(const IRPosition &IRP) : Base(IRP) {} /// See AbstractAttribute::updateImpl(...). 
ChangeStatus updateImpl(Attributor &A) override { + ChangeStatus Change = Base::updateImpl(A); + const DataLayout &DL = A.getDataLayout(); auto VisitValueCB = [&](Value &V, DerefState &T, bool Stripped) -> bool { @@ -2378,7 +2558,7 @@ struct AADereferenceableFloating : AADer A, getIRPosition(), *this, T, VisitValueCB)) return indicatePessimisticFixpoint(); - return clampStateAndIndicateChange(getState(), T); + return Change | clampStateAndIndicateChange(getState(), T); } /// See AbstractAttribute::trackStatistics() @@ -2403,12 +2583,11 @@ struct AADereferenceableReturned final /// Dereferenceable attribute for an argument struct AADereferenceableArgument final - : AAArgumentFromCallSiteArguments { - AADereferenceableArgument(const IRPosition &IRP) - : AAArgumentFromCallSiteArguments( - IRP) {} + : AAArgumentFromCallSiteArgumentsAndMustBeExecutedContext< + AADereferenceable, AADereferenceableImpl, DerefState> { + using Base = AAArgumentFromCallSiteArgumentsAndMustBeExecutedContext< + AADereferenceable, AADereferenceableImpl, DerefState>; + AADereferenceableArgument(const IRPosition &IRP) : Base(IRP) {} /// See AbstractAttribute::trackStatistics() void trackStatistics() const override { @@ -2428,13 +2607,16 @@ struct AADereferenceableCallSiteArgument }; /// Dereferenceable attribute deduction for a call site return value. -struct AADereferenceableCallSiteReturned final : AADereferenceableImpl { - AADereferenceableCallSiteReturned(const IRPosition &IRP) - : AADereferenceableImpl(IRP) {} +struct AADereferenceableCallSiteReturned final + : AACallSiteReturnedFromReturnedAndMustBeExecutedContext< + AADereferenceable, AADereferenceableImpl> { + using Base = AACallSiteReturnedFromReturnedAndMustBeExecutedContext< + AADereferenceable, AADereferenceableImpl>; + AADereferenceableCallSiteReturned(const IRPosition &IRP) : Base(IRP) {} /// See AbstractAttribute::initialize(...). 
void initialize(Attributor &A) override { - AADereferenceableImpl::initialize(A); + Base::initialize(A); Function *F = getAssociatedFunction(); if (!F) indicatePessimisticFixpoint(); @@ -2446,11 +2628,14 @@ struct AADereferenceableCallSiteReturned // call site specific liveness information and then it makes // sense to specialize attributes for call sites arguments instead of // redirecting requests to the callee argument. + + ChangeStatus Change = Base::updateImpl(A); Function *F = getAssociatedFunction(); const IRPosition &FnPos = IRPosition::returned(*F); auto &FnAA = A.getAAFor(*this, FnPos); - return clampStateAndIndicateChange( - getState(), static_cast(FnAA.getState())); + return Change | + clampStateAndIndicateChange( + getState(), static_cast(FnAA.getState())); } /// See AbstractAttribute::trackStatistics() Modified: llvm/trunk/test/Transforms/FunctionAttrs/align.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/FunctionAttrs/align.ll?rev=374063&r1=374062&r2=374063&view=diff ============================================================================== --- llvm/trunk/test/Transforms/FunctionAttrs/align.ll (original) +++ llvm/trunk/test/Transforms/FunctionAttrs/align.ll Tue Oct 8 08:25:56 2019 @@ -175,7 +175,7 @@ define void @test9_traversal(i1 %c, i32* ; FIXME: This will work with an upcoming patch (D66618 or similar) ; define align 32 i32* @test10a(i32* align 32 "no-capture-maybe-returned" %p) -; ATTRIBUTOR: define i32* @test10a(i32* align 32 "no-capture-maybe-returned" %p) +; ATTRIBUTOR: define i32* @test10a(i32* nonnull align 32 dereferenceable(4) "no-capture-maybe-returned" %p) define i32* @test10a(i32* align 32 %p) { ; ATTRIBUTOR: %l = load i32, i32* %p, align 32 %l = load i32, i32* %p @@ -203,7 +203,7 @@ e: ; FIXME: This will work with an upcoming patch (D66618 or similar) ; define align 32 i32* @test10b(i32* align 32 "no-capture-maybe-returned" %p) -; ATTRIBUTOR: define i32* @test10b(i32* align 32 "no-capture-maybe-returned" %p) 
+; ATTRIBUTOR: define i32* @test10b(i32* nonnull align 32 dereferenceable(4) "no-capture-maybe-returned" %p) define i32* @test10b(i32* align 32 %p) { ; ATTRIBUTOR: %l = load i32, i32* %p, align 32 %l = load i32, i32* %p Modified: llvm/trunk/test/Transforms/FunctionAttrs/arg_nocapture.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/FunctionAttrs/arg_nocapture.ll?rev=374063&r1=374062&r2=374063&view=diff ============================================================================== --- llvm/trunk/test/Transforms/FunctionAttrs/arg_nocapture.ll (original) +++ llvm/trunk/test/Transforms/FunctionAttrs/arg_nocapture.ll Tue Oct 8 08:25:56 2019 @@ -244,7 +244,8 @@ declare i32 @printf(i8* nocapture, ...) ; } ; ; There should *not* be a no-capture attribute on %a -; CHECK: define i64* @not_captured_but_returned_0(i64* returned writeonly "no-capture-maybe-returned" %a) +; CHECK: define nonnull dereferenceable(8) i64* @not_captured_but_returned_0(i64* nonnull returned writeonly dereferenceable(8) "no-capture-maybe-returned" %a) + define i64* @not_captured_but_returned_0(i64* %a) #0 { entry: store i64 0, i64* %a, align 8 @@ -354,6 +355,7 @@ entry: ; ; CHECK: define i32* @ret_arg_or_unknown(i32* readnone %b) ; CHECK: define i32* @ret_arg_or_unknown_through_phi(i32* readnone %b) + declare i32* @unknown() define i32* @ret_arg_or_unknown(i32* %b) #0 { Modified: llvm/trunk/test/Transforms/FunctionAttrs/arg_returned.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/FunctionAttrs/arg_returned.ll?rev=374063&r1=374062&r2=374063&view=diff ============================================================================== --- llvm/trunk/test/Transforms/FunctionAttrs/arg_returned.ll (original) +++ llvm/trunk/test/Transforms/FunctionAttrs/arg_returned.ll Tue Oct 8 08:25:56 2019 @@ -254,7 +254,7 @@ return: ; ; FNATTR: define i32* @rt0(i32* readonly %a) ; BOTH: Function Attrs: nofree noinline noreturn nosync nounwind readonly uwtable -; BOTH-NEXT: define 
noalias nonnull align 536870912 dereferenceable(4294967295) i32* @rt0(i32* nocapture readonly %a) +; BOTH-NEXT: define noalias nonnull align 536870912 dereferenceable(4294967295) i32* @rt0(i32* nocapture nonnull readonly dereferenceable(4) %a) define i32* @rt0(i32* %a) #0 { entry: %v = load i32, i32* %a, align 4 @@ -272,7 +272,7 @@ entry: ; ; FNATTR: define noalias i32* @rt1(i32* nocapture readonly %a) ; BOTH: Function Attrs: nofree noinline noreturn nosync nounwind readonly uwtable -; BOTH-NEXT: define noalias nonnull align 536870912 dereferenceable(4294967295) i32* @rt1(i32* nocapture readonly %a) +; BOTH-NEXT: define noalias nonnull align 536870912 dereferenceable(4294967295) i32* @rt1(i32* nocapture nonnull readonly dereferenceable(4) %a) define i32* @rt1(i32* %a) #0 { entry: %v = load i32, i32* %a, align 4 Modified: llvm/trunk/test/Transforms/FunctionAttrs/callbacks.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/FunctionAttrs/callbacks.ll?rev=374063&r1=374062&r2=374063&view=diff ============================================================================== --- llvm/trunk/test/Transforms/FunctionAttrs/callbacks.ll (original) +++ llvm/trunk/test/Transforms/FunctionAttrs/callbacks.ll Tue Oct 8 08:25:56 2019 @@ -1,5 +1,5 @@ ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py -; RUN: opt -S -passes=attributor -aa-pipeline='basic-aa' -attributor-disable=false -attributor-max-iterations-verify -attributor-max-iterations=1 < %s | FileCheck %s +; RUN: opt -S -passes=attributor -aa-pipeline='basic-aa' -attributor-disable=false -attributor-max-iterations-verify -attributor-max-iterations=2 < %s | FileCheck %s ; ModuleID = 'callback_simple.c' target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128" @@ -14,7 +14,7 @@ target datalayout = "e-m:e-p270:32:32-p2 ; FIXME: The callee -> call site direction is not working yet. 
define void @t0_caller(i32* %a) { -; CHECK: @t0_caller(i32* [[A:%.*]]) +; CHECK-LABEL: @t0_caller( ; CHECK-NEXT: entry: ; CHECK-NEXT: [[B:%.*]] = alloca i32, align 32 ; CHECK-NEXT: [[C:%.*]] = alloca i32*, align 64 @@ -39,7 +39,7 @@ entry: ; Note that the first two arguments are provided by the callback_broker according to the callback in !1 below! ; The others are annotated with alignment information, amongst others, or even replaced by the constants passed to the call. define internal void @t0_callback_callee(i32* %is_not_null, i32* %ptr, i32* %a, i64 %b, i32** %c) { -; CHECK: @t0_callback_callee(i32* nocapture writeonly [[IS_NOT_NULL:%.*]], i32* nocapture readonly [[PTR:%.*]], i32* [[A:%.*]], i64 [[B:%.*]], i32** nocapture nonnull readonly align 64 dereferenceable(8) [[C:%.*]]) +; CHECK-LABEL: @t0_callback_callee( ; CHECK-NEXT: entry: ; CHECK-NEXT: [[PTR_VAL:%.*]] = load i32, i32* [[PTR:%.*]], align 8 ; CHECK-NEXT: store i32 [[PTR_VAL]], i32* [[IS_NOT_NULL:%.*]] Modified: llvm/trunk/test/Transforms/FunctionAttrs/dereferenceable.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/FunctionAttrs/dereferenceable.ll?rev=374063&r1=374062&r2=374063&view=diff ============================================================================== --- llvm/trunk/test/Transforms/FunctionAttrs/dereferenceable.ll (original) +++ llvm/trunk/test/Transforms/FunctionAttrs/dereferenceable.ll Tue Oct 8 08:25:56 2019 @@ -1,4 +1,4 @@ -; RUN: opt -attributor -attributor-manifest-internal --attributor-disable=false -attributor-max-iterations-verify -attributor-max-iterations=1 -S < %s | FileCheck %s --check-prefixes=ATTRIBUTOR +; RUN: opt -attributor -attributor-manifest-internal --attributor-disable=false -attributor-max-iterations-verify -attributor-max-iterations=2 -S < %s | FileCheck %s --check-prefixes=ATTRIBUTOR declare void @deref_phi_user(i32* %a); @@ -111,3 +111,94 @@ for.inc: for.end: ; preds = %for.cond.cleanup ret void } + +; TEST 7 +; share known infomation in 
must-be-executed-context +declare i32* @unkown_ptr() willreturn nounwind +declare i32 @unkown_f(i32*) willreturn nounwind +define i32* @f7_0(i32* %ptr) { +; ATTRIBUTOR: define nonnull dereferenceable(8) i32* @f7_0(i32* nonnull returned dereferenceable(8) %ptr) + %T = tail call i32 @unkown_f(i32* dereferenceable(8) %ptr) + ret i32* %ptr +} + +; ATTRIBUTOR: define void @f7_1(i32* nonnull dereferenceable(4) %ptr, i1 %c) +define void @f7_1(i32* %ptr, i1 %c) { + +; ATTRIBUTOR: %A = tail call i32 @unkown_f(i32* nonnull dereferenceable(4) %ptr) + %A = tail call i32 @unkown_f(i32* %ptr) + + %ptr.0 = load i32, i32* %ptr + ; deref 4 hold + +; FIXME: this should be %B = tail call i32 @unkown_f(i32* nonnull dereferenceable(4) %ptr) +; ATTRIBUTOR: %B = tail call i32 @unkown_f(i32* nonnull dereferenceable(4) %ptr) + %B = tail call i32 @unkown_f(i32* dereferenceable(1) %ptr) + + br i1%c, label %if.true, label %if.false +if.true: +; ATTRIBUTOR: %C = tail call i32 @unkown_f(i32* nonnull dereferenceable(8) %ptr) + %C = tail call i32 @unkown_f(i32* %ptr) + +; ATTRIBUTOR: %D = tail call i32 @unkown_f(i32* nonnull dereferenceable(8) %ptr) + %D = tail call i32 @unkown_f(i32* dereferenceable(8) %ptr) + +; FIXME: This should be tail call i32 @unkown_f(i32* nonnull dereferenceable(8) %ptr) +; Making must-be-executed-context backward exploration will fix this. 
+; ATTRIBUTOR: %E = tail call i32 @unkown_f(i32* nonnull dereferenceable(4) %ptr) + %E = tail call i32 @unkown_f(i32* %ptr) + + ret void + +if.false: + ret void +} + +; ATTRIBUTOR: define void @f7_2(i1 %c) +define void @f7_2(i1 %c) { + + %ptr = tail call i32* @unkown_ptr() + +; ATTRIBUTOR: %A = tail call i32 @unkown_f(i32* nonnull dereferenceable(4) %ptr) + %A = tail call i32 @unkown_f(i32* %ptr) + + %arg_a.0 = load i32, i32* %ptr + ; deref 4 hold + +; ATTRIBUTOR: %B = tail call i32 @unkown_f(i32* nonnull dereferenceable(4) %ptr) + %B = tail call i32 @unkown_f(i32* dereferenceable(1) %ptr) + + br i1%c, label %if.true, label %if.false +if.true: + +; ATTRIBUTOR: %C = tail call i32 @unkown_f(i32* nonnull dereferenceable(8) %ptr) + %C = tail call i32 @unkown_f(i32* %ptr) + +; ATTRIBUTOR: %D = tail call i32 @unkown_f(i32* nonnull dereferenceable(8) %ptr) + %D = tail call i32 @unkown_f(i32* dereferenceable(8) %ptr) + + %E = tail call i32 @unkown_f(i32* %ptr) +; FIXME: This should be @unkown_f(i32* nonnull dereferenceable(8) %ptr) +; Making must-be-executed-context backward exploration will fix this. 
+; ATTRIBUTOR: %E = tail call i32 @unkown_f(i32* nonnull dereferenceable(4) %ptr) + + ret void + +if.false: + ret void +} + +define i32* @f7_3() { +; ATTRIBUTOR: define nonnull dereferenceable(4) i32* @f7_3() + %ptr = tail call i32* @unkown_ptr() + store i32 10, i32* %ptr, align 16 + ret i32* %ptr +} + +define i32* @test_for_minus_index(i32* %p) { +; FIXME: This should be define nonnull dereferenceable(8) i32* @test_for_minus_index(i32* nonnull %p) +; ATTRIBUTOR: define nonnull dereferenceable(8) i32* @test_for_minus_index(i32* writeonly "no-capture-maybe-returned" %p) + %q = getelementptr inbounds i32, i32* %p, i32 -2 + store i32 1, i32* %q + ret i32* %q +} Modified: llvm/trunk/test/Transforms/FunctionAttrs/internal-noalias.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/FunctionAttrs/internal-noalias.ll?rev=374063&r1=374062&r2=374063&view=diff ============================================================================== --- llvm/trunk/test/Transforms/FunctionAttrs/internal-noalias.ll (original) +++ llvm/trunk/test/Transforms/FunctionAttrs/internal-noalias.ll Tue Oct 8 08:25:56 2019 @@ -8,7 +8,7 @@ entry: ret i32 %add } -; CHECK: define private i32 @noalias_args(i32* nocapture readonly %A, i32* noalias nocapture readonly %B) +; CHECK: define private i32 @noalias_args(i32* nocapture nonnull readonly dereferenceable(4) %A, i32* noalias nocapture nonnull readonly dereferenceable(4) %B) define private i32 @noalias_args(i32* %A, i32* %B) #0 { entry: @@ -23,7 +23,8 @@ entry: ; FIXME: Should be something like this. 
; define internal i32 @noalias_args_argmem(i32* noalias nocapture readonly %A, i32* noalias nocapture readonly %B) -; CHECK: define internal i32 @noalias_args_argmem(i32* nocapture readonly %A, i32* nocapture readonly %B) +; CHECK: define internal i32 @noalias_args_argmem(i32* nocapture nonnull readonly dereferenceable(4) %A, i32* nocapture nonnull readonly dereferenceable(4) %B) + ; define internal i32 @noalias_args_argmem(i32* %A, i32* %B) #1 { entry: Modified: llvm/trunk/test/Transforms/FunctionAttrs/liveness.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/FunctionAttrs/liveness.ll?rev=374063&r1=374062&r2=374063&view=diff ============================================================================== --- llvm/trunk/test/Transforms/FunctionAttrs/liveness.ll (original) +++ llvm/trunk/test/Transforms/FunctionAttrs/liveness.ll Tue Oct 8 08:25:56 2019 @@ -40,7 +40,7 @@ define i32 @volatile_load(i32*) norecurs } ; CHECK: Function Attrs: nofree norecurse nosync nounwind readonly uwtable willreturn -; CHECK-NEXT: define internal i32 @internal_load(i32* nocapture nonnull readonly %0) +; CHECK-NEXT: define internal i32 @internal_load(i32* nocapture nonnull readonly dereferenceable(4) %0) define internal i32 @internal_load(i32*) norecurse nounwind uwtable { %2 = load i32, i32* %0, align 4 ret i32 %2 Modified: llvm/trunk/test/Transforms/FunctionAttrs/noalias_returned.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/FunctionAttrs/noalias_returned.ll?rev=374063&r1=374062&r2=374063&view=diff ============================================================================== --- llvm/trunk/test/Transforms/FunctionAttrs/noalias_returned.ll (original) +++ llvm/trunk/test/Transforms/FunctionAttrs/noalias_returned.ll Tue Oct 8 08:25:56 2019 @@ -1,4 +1,4 @@ -; RUN: opt -S -passes=attributor -aa-pipeline='basic-aa' -attributor-disable=false -attributor-max-iterations-verify -attributor-max-iterations=1 < %s | FileCheck %s +; RUN: opt -S 
-passes=attributor -aa-pipeline='basic-aa' -attributor-disable=false -attributor-max-iterations-verify -attributor-max-iterations=2 < %s | FileCheck %s ; TEST 1 - negative. Modified: llvm/trunk/test/Transforms/FunctionAttrs/nocapture.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/FunctionAttrs/nocapture.ll?rev=374063&r1=374062&r2=374063&view=diff ============================================================================== --- llvm/trunk/test/Transforms/FunctionAttrs/nocapture.ll (original) +++ llvm/trunk/test/Transforms/FunctionAttrs/nocapture.ll Tue Oct 8 08:25:56 2019 @@ -120,7 +120,9 @@ define void @nc2(i32* %p, i32* %q) { ret void } -; EITHER: define void @nc3(void ()* nocapture %p) + +; FNATTR: define void @nc3(void ()* nocapture %p) +; ATTRIBUTOR: define void @nc3(void ()* nocapture nonnull %p) define void @nc3(void ()* %p) { call void %p() ret void @@ -133,7 +135,8 @@ define void @nc4(i8* %p) { ret void } -; EITHER: define void @nc5(void (i8*)* nocapture %f, i8* nocapture %p) +; FNATTR: define void @nc5(void (i8*)* nocapture %f, i8* nocapture %p) +; ATTRIBUTOR: define void @nc5(void (i8*)* nocapture nonnull %f, i8* nocapture %p) define void @nc5(void (i8*)* %f, i8* %p) { call void %f(i8* %p) readonly nounwind call void %f(i8* nocapture %p) @@ -213,19 +216,22 @@ define void @test6_2(i8* %x6_2, i8* %y6_ ret void } -; EITHER: define void @test_cmpxchg(i32* nocapture %p) +; FNATTR: define void @test_cmpxchg(i32* nocapture %p) +; ATTRIBUTOR: define void @test_cmpxchg(i32* nocapture nonnull dereferenceable(4) %p) define void @test_cmpxchg(i32* %p) { cmpxchg i32* %p, i32 0, i32 1 acquire monotonic ret void } -; EITHER: define void @test_cmpxchg_ptr(i32** nocapture %p, i32* %q) +; FNATTR: define void @test_cmpxchg_ptr(i32** nocapture %p, i32* %q) +; ATTRIBUTOR: define void @test_cmpxchg_ptr(i32** nocapture nonnull dereferenceable(8) %p, i32* %q) define void @test_cmpxchg_ptr(i32** %p, i32* %q) { cmpxchg i32** %p, i32* null, i32* %q acquire 
monotonic ret void } -; EITHER: define void @test_atomicrmw(i32* nocapture %p) +; FNATTR: define void @test_atomicrmw(i32* nocapture %p) +; ATTRIBUTOR: define void @test_atomicrmw(i32* nocapture nonnull dereferenceable(4) %p) define void @test_atomicrmw(i32* %p) { atomicrmw add i32* %p, i32 1 seq_cst ret void Modified: llvm/trunk/test/Transforms/FunctionAttrs/nonnull.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/FunctionAttrs/nonnull.ll?rev=374063&r1=374062&r2=374063&view=diff ============================================================================== --- llvm/trunk/test/Transforms/FunctionAttrs/nonnull.ll (original) +++ llvm/trunk/test/Transforms/FunctionAttrs/nonnull.ll Tue Oct 8 08:25:56 2019 @@ -236,6 +236,104 @@ define void @f15(i8* %arg) { ret void } +declare void @fun0() #1 +declare void @fun1(i8*) #1 +declare void @fun2(i8*, i8*) #1 +declare void @fun3(i8*, i8*, i8*) #1 +; TEST 16 simple path test +; if(..) +; fun2(nonnull %a, nonnull %b) +; else +; fun2(nonnull %a, %b) +; We can say that %a is nonnull but %b is not. +define void @f16(i8* %a, i8 * %b, i8 %c) { +; FIXME: missing nonnull on %a +; ATTRIBUTOR: define void @f16(i8* %a, i8* %b, i8 %c) + %cmp = icmp eq i8 %c, 0 + br i1 %cmp, label %if.then, label %if.else +if.then: + tail call void @fun2(i8* nonnull %a, i8* nonnull %b) + ret void +if.else: + tail call void @fun2(i8* nonnull %a, i8* %b) + ret void +} +; TEST 17 explore child BB test +; if(..) +; ... (willreturn & nounwind) +; else +; ... (willreturn & nounwind) +; fun1(nonnull %a) +; We can say that %a is nonnull +define void @f17(i8* %a, i8 %c) { +; FIXME: missing nonnull on %a +; ATTRIBUTOR: define void @f17(i8* %a, i8 %c) + %cmp = icmp eq i8 %c, 0 + br i1 %cmp, label %if.then, label %if.else +if.then: + tail call void @fun0() + br label %cont +if.else: + tail call void @fun0() + br label %cont +cont: + tail call void @fun1(i8* nonnull %a) + ret void +} +; TEST 18 More complex test +; if(..) +; ... 
(willreturn & nounwind) +; else +; ... (willreturn & nounwind) +; if(..) +; ... (willreturn & nounwind) +; else +; ... (willreturn & nounwind) +; fun1(nonnull %a) + +define void @f18(i8* %a, i8* %b, i8 %c) { +; FIXME: missing nonnull on %a +; ATTRIBUTOR: define void @f18(i8* %a, i8* %b, i8 %c) + %cmp1 = icmp eq i8 %c, 0 + br i1 %cmp1, label %if.then, label %if.else +if.then: + tail call void @fun0() + br label %cont +if.else: + tail call void @fun0() + br label %cont +cont: + %cmp2 = icmp eq i8 %c, 1 + br i1 %cmp2, label %cont.then, label %cont.else +cont.then: + tail call void @fun1(i8* nonnull %b) + br label %cont2 +cont.else: + tail call void @fun0() + br label %cont2 +cont2: + tail call void @fun1(i8* nonnull %a) + ret void +} + +; TEST 19: Loop + +define void @f19(i8* %a, i8* %b, i8 %c) { +; FIXME: missing nonnull on %b +; ATTRIBUTOR: define void @f19(i8* %a, i8* %b, i8 %c) + br label %loop.header +loop.header: + %cmp2 = icmp eq i8 %c, 0 + br i1 %cmp2, label %loop.body, label %loop.exit +loop.body: + tail call void @fun1(i8* nonnull %b) + tail call void @fun1(i8* nonnull %a) + br label %loop.header +loop.exit: + tail call void @fun1(i8* nonnull %b) + ret void +} + ; Test propagation of nonnull callsite args back to caller. 
declare void @use1(i8* %x) @@ -268,14 +366,9 @@ define void @parent2(i8* %a, i8* %b, i8* ; FNATTR-NEXT: call void @use3nonnull(i8* %b, i8* %c, i8* %a) ; FNATTR-NEXT: call void @use3(i8* %c, i8* %a, i8* %b) -; FIXME: missing "nonnull", it should be -; @parent2(i8* nonnull %a, i8* nonnull %b, i8* nonnull %c) -; call void @use3nonnull(i8* nonnull %b, i8* nonnull %c, i8* nonnull %a) -; call void @use3(i8* nonnull %c, i8* nonnull %a, i8* nonnull %b) - -; ATTRIBUTOR-LABEL: @parent2(i8* %a, i8* %b, i8* %c) +; ATTRIBUTOR-LABEL: @parent2(i8* nonnull %a, i8* nonnull %b, i8* nonnull %c) ; ATTRIBUTOR-NEXT: call void @use3nonnull(i8* nonnull %b, i8* nonnull %c, i8* nonnull %a) -; ATTRIBUTOR-NEXT: call void @use3(i8* %c, i8* %a, i8* %b) +; ATTRIBUTOR-NEXT: call void @use3(i8* nonnull %c, i8* nonnull %a, i8* nonnull %b) ; BOTH-NEXT: ret void call void @use3nonnull(i8* %b, i8* %c, i8* %a) @@ -290,13 +383,9 @@ define void @parent3(i8* %a, i8* %b, i8* ; FNATTR-NEXT: call void @use1nonnull(i8* %a) ; FNATTR-NEXT: call void @use3(i8* %c, i8* %b, i8* %a) -; FIXME: missing "nonnull", it should be, -; @parent3(i8* nonnull %a, i8* %b, i8* %c) -; call void @use1nonnull(i8* nonnull %a) -; call void @use3(i8* %c, i8* %b, i8* nonnull %a) -; ATTRIBUTOR-LABEL: @parent3(i8* %a, i8* %b, i8* %c) +; ATTRIBUTOR-LABEL: @parent3(i8* nonnull %a, i8* %b, i8* %c) ; ATTRIBUTOR-NEXT: call void @use1nonnull(i8* nonnull %a) -; ATTRIBUTOR-NEXT: call void @use3(i8* %c, i8* %b, i8* %a) +; ATTRIBUTOR-NEXT: call void @use3(i8* %c, i8* %b, i8* nonnull %a) ; BOTH-NEXT: ret void @@ -313,16 +402,10 @@ define void @parent4(i8* %a, i8* %b, i8* ; CHECK-NEXT: call void @use2(i8* %a, i8* %c) ; CHECK-NEXT: call void @use1(i8* %b) -; FIXME : missing "nonnull", it should be -; @parent4(i8* %a, i8* nonnull %b, i8* nonnull %c) -; call void @use2nonnull(i8* nonnull %c, i8* nonull %b) -; call void @use2(i8* %a, i8* nonnull %c) -; call void @use1(i8* nonnull %b) - -; ATTRIBUTOR-LABEL: @parent4(i8* %a, i8* %b, i8* %c) +; 
ATTRIBUTOR-LABEL: @parent4(i8* %a, i8* nonnull %b, i8* nonnull %c) ; ATTRIBUTOR-NEXT: call void @use2nonnull(i8* nonnull %c, i8* nonnull %b) -; ATTRIBUTOR-NEXT: call void @use2(i8* %a, i8* %c) -; ATTRIBUTOR-NEXT: call void @use1(i8* %b) +; ATTRIBUTOR-NEXT: call void @use2(i8* %a, i8* nonnull %c) +; ATTRIBUTOR-NEXT: call void @use1(i8* nonnull %b) ; BOTH: ret void @@ -359,8 +442,7 @@ f: define i8 @parent6(i8* %a, i8* %b) { ; FNATTR-LABEL: @parent6(i8* nonnull %a, i8* %b) -; FIXME: missing "nonnull" -; ATTRIBUTOR-LABEL: @parent6(i8* %a, i8* %b) +; ATTRIBUTOR-LABEL: @parent6(i8* nonnull %a, i8* %b) ; BOTH-NEXT: [[C:%.*]] = load volatile i8, i8* %b ; FNATTR-NEXT: call void @use1nonnull(i8* %a) ; ATTRIBUTOR-NEXT: call void @use1nonnull(i8* nonnull %a) @@ -378,14 +460,9 @@ define i8 @parent7(i8* %a) { ; FNATTR-NEXT: [[RET:%.*]] = call i8 @use1safecall(i8* %a) ; FNATTR-NEXT: call void @use1nonnull(i8* %a) -; FIXME : missing "nonnull", it should be -; @parent7(i8* nonnull %a) -; [[RET:%.*]] = call i8 @use1safecall(i8* nonnull %a) -; call void @use1nonnull(i8* nonnull %a) -; ret i8 [[RET]] -; ATTRIBUTOR-LABEL: @parent7(i8* %a) -; ATTRIBUTOR-NEXT: [[RET:%.*]] = call i8 @use1safecall(i8* %a) +; ATTRIBUTOR-LABEL: @parent7(i8* nonnull %a) +; ATTRIBUTOR-NEXT: [[RET:%.*]] = call i8 @use1safecall(i8* nonnull %a) ; ATTRIBUTOR-NEXT: call void @use1nonnull(i8* nonnull %a) ; BOTH-NEXT: ret i8 [[RET]] @@ -400,9 +477,7 @@ define i8 @parent7(i8* %a) { declare i32 @esfp(...) 
define i1 @parent8(i8* %a, i8* %bogus1, i8* %b) personality i8* bitcast (i32 (...)* @esfp to i8*){ -; FNATTR-LABEL: @parent8(i8* nonnull %a, i8* nocapture readnone %bogus1, i8* nonnull %b) -; FIXME : missing "nonnull", it should be @parent8(i8* nonnull %a, i8* %bogus1, i8* nonnull %b) -; ATTRIBUTOR-LABEL: @parent8(i8* %a, i8* nocapture readnone %bogus1, i8* %b) +; BOTH-LABEL: @parent8(i8* nonnull %a, i8* nocapture readnone %bogus1, i8* nonnull %b) ; BOTH-NEXT: entry: ; FNATTR-NEXT: invoke void @use2nonnull(i8* %a, i8* %b) ; ATTRIBUTOR-NEXT: invoke void @use2nonnull(i8* nonnull %a, i8* nonnull %b) @@ -470,4 +545,6 @@ define weak_odr void @weak_caller(i32* n ret void } + attributes #0 = { "null-pointer-is-valid"="true" } +attributes #1 = { nounwind willreturn} Modified: llvm/trunk/test/Transforms/FunctionAttrs/norecurse.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/FunctionAttrs/norecurse.ll?rev=374063&r1=374062&r2=374063&view=diff ============================================================================== --- llvm/trunk/test/Transforms/FunctionAttrs/norecurse.ll (original) +++ llvm/trunk/test/Transforms/FunctionAttrs/norecurse.ll Tue Oct 8 08:25:56 2019 @@ -130,8 +130,15 @@ define linkonce_odr i32 @leaf_redefinabl ; Call through a function pointer ; ATTRIBUTOR-NOT: Function Attrs -; ATTRIBUTOR: define i32 @eval_func(i32 (i32)* nocapture %0, i32 %1) -define i32 @eval_func(i32 (i32)* , i32) local_unnamed_addr { +; ATTRIBUTOR: define i32 @eval_func1(i32 (i32)* nocapture nonnull %0, i32 %1) +define i32 @eval_func1(i32 (i32)* , i32) local_unnamed_addr { + %3 = tail call i32 %0(i32 %1) #2 + ret i32 %3 +} + +; ATTRIBUTOR-NOT: Function Attrs +; ATTRIBUTOR: define i32 @eval_func2(i32 (i32)* nocapture %0, i32 %1) +define i32 @eval_func2(i32 (i32)* , i32) local_unnamed_addr "null-pointer-is-valid"="true"{ %3 = tail call i32 %0(i32 %1) #2 ret i32 %3 } Modified: llvm/trunk/test/Transforms/FunctionAttrs/nosync.ll URL: 
http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/FunctionAttrs/nosync.ll?rev=374063&r1=374062&r2=374063&view=diff ============================================================================== --- llvm/trunk/test/Transforms/FunctionAttrs/nosync.ll (original) +++ llvm/trunk/test/Transforms/FunctionAttrs/nosync.ll Tue Oct 8 08:25:56 2019 @@ -45,7 +45,7 @@ entry: ; FNATTR: Function Attrs: nofree norecurse nounwind uwtable ; FNATTR-NEXT: define i32 @load_monotonic(i32* nocapture readonly %0) ; ATTRIBUTOR: Function Attrs: nofree norecurse nosync nounwind uwtable -; ATTRIBUTOR-NEXT: define i32 @load_monotonic(i32* nocapture readonly %0) +; ATTRIBUTOR-NEXT: define i32 @load_monotonic(i32* nocapture nonnull readonly dereferenceable(4) %0) define i32 @load_monotonic(i32* nocapture readonly %0) norecurse nounwind uwtable { %2 = load atomic i32, i32* %0 monotonic, align 4 ret i32 %2 @@ -61,7 +61,7 @@ define i32 @load_monotonic(i32* nocaptur ; FNATTR: Function Attrs: nofree norecurse nounwind uwtable ; FNATTR-NEXT: define void @store_monotonic(i32* nocapture %0) ; ATTRIBUTOR: Function Attrs: nofree norecurse nosync nounwind uwtable -; ATTRIBUTOR-NEXT: define void @store_monotonic(i32* nocapture writeonly %0) +; ATTRIBUTOR-NEXT: define void @store_monotonic(i32* nocapture nonnull writeonly dereferenceable(4) %0) define void @store_monotonic(i32* nocapture %0) norecurse nounwind uwtable { store atomic i32 10, i32* %0 monotonic, align 4 ret void @@ -78,7 +78,7 @@ define void @store_monotonic(i32* nocapt ; FNATTR-NEXT: define i32 @load_acquire(i32* nocapture readonly %0) ; ATTRIBUTOR: Function Attrs: nofree norecurse nounwind uwtable ; ATTRIBUTOR-NOT: nosync -; ATTRIBUTOR-NEXT: define i32 @load_acquire(i32* nocapture readonly %0) +; ATTRIBUTOR-NEXT: define i32 @load_acquire(i32* nocapture nonnull readonly dereferenceable(4) %0) define i32 @load_acquire(i32* nocapture readonly %0) norecurse nounwind uwtable { %2 = load atomic i32, i32* %0 acquire, align 4 ret i32 %2 
@@ -224,7 +224,8 @@ define void @scc2(i32* %0) noinline noun ; FNATTR: Function Attrs: nofree norecurse nounwind ; FNATTR-NEXT: define void @foo1(i32* nocapture %0, %"struct.std::atomic"* nocapture %1) ; ATTRIBUTOR-NOT: nosync -; ATTRIBUTOR: define void @foo1(i32* nocapture writeonly %0, %"struct.std::atomic"* nocapture writeonly %1) +; ATTRIBUTOR: define void @foo1(i32* nocapture nonnull writeonly dereferenceable(4) %0, %"struct.std::atomic"* nocapture writeonly %1) + define void @foo1(i32* %0, %"struct.std::atomic"* %1) { store i32 100, i32* %0, align 4 fence release @@ -255,8 +256,9 @@ define void @bar(i32* %0, %"struct.std:: ; TEST 13 - Fence syncscope("singlethread") seq_cst ; FNATTR: Function Attrs: nofree norecurse nounwind ; FNATTR-NEXT: define void @foo1_singlethread(i32* nocapture %0, %"struct.std::atomic"* nocapture %1) -; ATTRIBUTOR: Function Attrs: nofree nosync -; ATTRIBUTOR: define void @foo1_singlethread(i32* nocapture writeonly %0, %"struct.std::atomic"* nocapture writeonly %1) +; ATTRIBUTOR: Function Attrs: nofree nosync nounwind willreturn +; ATTRIBUTOR: define void @foo1_singlethread(i32* nocapture nonnull writeonly dereferenceable(4) %0, %"struct.std::atomic"* nocapture writeonly %1) + define void @foo1_singlethread(i32* %0, %"struct.std::atomic"* %1) { store i32 100, i32* %0, align 4 fence syncscope("singlethread") release @@ -267,7 +269,7 @@ define void @foo1_singlethread(i32* %0, ; FNATTR: Function Attrs: nofree norecurse nounwind ; FNATTR-NEXT: define void @bar_singlethread(i32* nocapture readnone %0, %"struct.std::atomic"* nocapture readonly %1) -; ATTRIBUTOR: Function Attrs: nofree nosync +; ATTRIBUTOR: Function Attrs: nofree nosync nounwind ; ATTRIBUTOR: define void @bar_singlethread(i32* nocapture readnone %0, %"struct.std::atomic"* nocapture readonly %1) define void @bar_singlethread(i32* %0, %"struct.std::atomic"* %1) { %3 = getelementptr inbounds %"struct.std::atomic", %"struct.std::atomic"* %1, i64 0, i32 0, i32 0 Modified: 
llvm/trunk/test/Transforms/FunctionAttrs/read_write_returned_arguments_scc.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/FunctionAttrs/read_write_returned_arguments_scc.ll?rev=374063&r1=374062&r2=374063&view=diff ============================================================================== --- llvm/trunk/test/Transforms/FunctionAttrs/read_write_returned_arguments_scc.ll (original) +++ llvm/trunk/test/Transforms/FunctionAttrs/read_write_returned_arguments_scc.ll Tue Oct 8 08:25:56 2019 @@ -70,7 +70,7 @@ return: } ; CHECK: Function Attrs: nofree nosync nounwind -; CHECK-NEXT: define internal i32* @internal_ret1_rrw(i32* %r0, i32* returned %r1, i32* %w0) +; CHECK-NEXT: define internal i32* @internal_ret1_rrw(i32* nonnull dereferenceable(4) %r0, i32* returned %r1, i32* %w0) define internal i32* @internal_ret1_rrw(i32* %r0, i32* %r1, i32* %w0) { entry: %0 = load i32, i32* %r0, align 4 @@ -121,7 +121,7 @@ return: } ; CHECK: Function Attrs: nofree nosync nounwind -; CHECK-NEXT: define internal i32* @internal_ret1_rw(i32* %r0, i32* returned %w0) +; CHECK-NEXT: define internal i32* @internal_ret1_rw(i32* nonnull dereferenceable(4) %r0, i32* returned %w0) define internal i32* @internal_ret1_rw(i32* %r0, i32* %w0) { entry: %0 = load i32, i32* %r0, align 4 Modified: llvm/trunk/test/Transforms/FunctionAttrs/readattrs.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/FunctionAttrs/readattrs.ll?rev=374063&r1=374062&r2=374063&view=diff ============================================================================== --- llvm/trunk/test/Transforms/FunctionAttrs/readattrs.ll (original) +++ llvm/trunk/test/Transforms/FunctionAttrs/readattrs.ll Tue Oct 8 08:25:56 2019 @@ -39,7 +39,7 @@ define void @test4_2(i8* %p) { } ; FNATTR: define void @test5(i8** nocapture %p, i8* %q) -; ATTRIBUTOR: define void @test5(i8** nocapture writeonly %p, i8* %q) +; ATTRIBUTOR: define void @test5(i8** nocapture nonnull writeonly dereferenceable(8) %p, i8* %q) ; 
Missed optz'n: we could make %q readnone, but don't break test6! define void @test5(i8** %p, i8* %q) { store i8* %q, i8** %p @@ -48,7 +48,7 @@ define void @test5(i8** %p, i8* %q) { declare void @test6_1() ; FNATTR: define void @test6_2(i8** nocapture %p, i8* %q) -; ATTRIBUTOR: define void @test6_2(i8** nocapture writeonly %p, i8* %q) +; ATTRIBUTOR: define void @test6_2(i8** nocapture nonnull writeonly dereferenceable(8) %p, i8* %q) ; This is not a missed optz'n. define void @test6_2(i8** %p, i8* %q) { store i8* %q, i8** %p Modified: llvm/trunk/test/Transforms/InferFunctionAttrs/dereferenceable.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/InferFunctionAttrs/dereferenceable.ll?rev=374063&r1=374062&r2=374063&view=diff ============================================================================== --- llvm/trunk/test/Transforms/InferFunctionAttrs/dereferenceable.ll (original) +++ llvm/trunk/test/Transforms/InferFunctionAttrs/dereferenceable.ll Tue Oct 8 08:25:56 2019 @@ -1,10 +1,17 @@ ; RUN: opt < %s -inferattrs -S | FileCheck %s +; RUN: opt < %s -attributor --attributor-disable=false -S | FileCheck %s --check-prefix=ATTRIBUTOR + + ; Determine dereference-ability before unused loads get deleted: ; https://bugs.llvm.org/show_bug.cgi?id=21780 define <4 x double> @PR21780(double* %ptr) { ; CHECK-LABEL: @PR21780(double* %ptr) +; FIXME: this should be @PR21780(double* nonnull dereferenceable(32) %ptr) +; trakcing use of GEP in Attributor would fix this problem. +; ATTRIBUTOR-LABEL: @PR21780(double* nocapture nonnull readonly dereferenceable(8) %ptr) + ; GEP of index 0 is simplified away. 
%arrayidx1 = getelementptr inbounds double, double* %ptr, i64 1 %arrayidx2 = getelementptr inbounds double, double* %ptr, i64 2 @@ -23,6 +30,43 @@ define <4 x double> @PR21780(double* %pt ret <4 x double> %shuffle } + +define double @PR21780_only_access3_with_inbounds(double* %ptr) { +; CHECK-LABEL: @PR21780_only_access3_with_inbounds(double* %ptr) +; FIXME: this should be @PR21780_only_access3_with_inbounds(double* nonnull dereferenceable(32) %ptr) +; trakcing use of GEP in Attributor would fix this problem. +; ATTRIBUTOR-LABEL: @PR21780_only_access3_with_inbounds(double* nocapture readonly %ptr) + + %arrayidx3 = getelementptr inbounds double, double* %ptr, i64 3 + %t3 = load double, double* %arrayidx3, align 8 + ret double %t3 +} + +define double @PR21780_only_access3_without_inbounds(double* %ptr) { +; CHECK-LABEL: @PR21780_only_access3_without_inbounds(double* %ptr) +; ATTRIBUTOR-LABEL: @PR21780_only_access3_without_inbounds(double* nocapture readonly %ptr) + %arrayidx3 = getelementptr double, double* %ptr, i64 3 + %t3 = load double, double* %arrayidx3, align 8 + ret double %t3 +} + +define double @PR21780_without_inbounds(double* %ptr) { +; CHECK-LABEL: @PR21780_without_inbounds(double* %ptr) +; FIXME: this should be @PR21780_without_inbounds(double* nonnull dereferenceable(32) %ptr) +; ATTRIBUTOR-LABEL: @PR21780_without_inbounds(double* nocapture nonnull readonly dereferenceable(8) %ptr) + + %arrayidx1 = getelementptr double, double* %ptr, i64 1 + %arrayidx2 = getelementptr double, double* %ptr, i64 2 + %arrayidx3 = getelementptr double, double* %ptr, i64 3 + + %t0 = load double, double* %ptr, align 8 + %t1 = load double, double* %arrayidx1, align 8 + %t2 = load double, double* %arrayidx2, align 8 + %t3 = load double, double* %arrayidx3, align 8 + + ret double %t3 +} + ; Unsimplified, but still valid. Also, throw in some bogus arguments. 
define void @gep0(i8* %unused, i8* %other, i8* %ptr) {

From llvm-commits at lists.llvm.org Tue Oct 8 08:26:05 2019
From: llvm-commits at lists.llvm.org (Greg Clayton via Phabricator via llvm-commits)
Date: Tue, 08 Oct 2019 15:26:05 +0000 (UTC)
Subject: [PATCH] D68645: MinidumpYAML: Add support for the memory info list stream
In-Reply-To:
References:
Message-ID: <1d72deef48426db467d5e08df0e89551@localhost.localdomain>

clayborg added a comment.

LGTM

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68645/new/

https://reviews.llvm.org/D68645

From llvm-commits at lists.llvm.org Tue Oct 8 08:26:05 2019
From: llvm-commits at lists.llvm.org (Simon Tatham via Phabricator via llvm-commits)
Date: Tue, 08 Oct 2019 15:26:05 +0000 (UTC)
Subject: [PATCH] D68081: Allow update_test_checks.py to not scrub names
In-Reply-To:
References:
Message-ID:

simon_tatham added a comment.

I had `update_cc_test_checks.py` give me a Python traceback just now which I think must be to do with this commit:

  Traceback (most recent call last):
    File "llvm/utils/update_cc_test_checks.py", line 267, in <module>
      sys.exit(main())
    File "llvm/utils/update_cc_test_checks.py", line 255, in main
      common.add_ir_checks(output_lines, '//', run_list, func_dict, mangled)
  TypeError: add_ir_checks() missing 1 required positional argument: 'preserve_names'

The new parameter to `add_ir_checks` is mandatory, but it's only been added to the call in `update_test_checks.py`, and not to the one in `update_cc_test_checks.py`.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68081/new/

https://reviews.llvm.org/D68081

From llvm-commits at lists.llvm.org Tue Oct 8 08:26:05 2019
From: llvm-commits at lists.llvm.org (Cameron McInally via Phabricator via llvm-commits)
Date: Tue, 08 Oct 2019 15:26:05 +0000 (UTC)
Subject: [PATCH] D67749: [AArch64] Stackframe accesses to SVE objects.
In-Reply-To:
References:
Message-ID: <56764bbfc3155d89fa6e6fea7b0376ca@localhost.localdomain>

cameron.mcinally accepted this revision.
cameron.mcinally marked an inline comment as done.
cameron.mcinally added a comment.
This revision is now accepted and ready to land.

LGTM with a moderate confidence level. Maybe give other reviewers a day or so to comment before committing.


================
Comment at: lib/Target/AArch64/AArch64InstrInfo.cpp:3453
+    SOffset = StackOffset(Offset, MVT::i8) +
+              StackOffset(SOffset.getScalableBytes(), MVT::nxv1i8);
   return AArch64FrameOffsetCanUpdate |
----------------
sdesmalen wrote:
> cameron.mcinally wrote:
> > Would you shed some light on what this change is doing?
> >
> > `IsMulVL` indicates there are scalable objects on the stack, right? What is the reason for the behavior change of the legacy code when `!IsMulVL`? I.e. the addition of `StackOffset(SOffset.getScalableBytes(), MVT::nxv1i8)` in the else block.
> `isAArch64FrameOffsetLegal` tries to determine if the immediate field of the instruction can fit the given StackOffset, and will attempt to fold as much of the offset in the immediate.
>
> Although a StackOffset can contain both a scalable and a non-scalable part, it will depend on the instruction whether the immediate is scalable or non-scalable. For example, in
> ```LDR <Zt>, [<Xn|SP>{, #<imm>, MUL VL}]```
> the immediate is `mul vl`, so scalable, which means the instruction can only handle the "scalable" part of the StackOffset. The rest of the offset will need to be handled elsewhere.
>
> The variable `int64_t Offset` uses either the scalable or non-scalable part of the StackOffset, which happens on line 3411. After that, this function does its magic to determine what part of the offset can be folded into the immediate.
>
> On line 3448, the remaining part of the offset that could *not* be folded into the immediate will need to be reflected in the in/out parameter `SOffset`, which is a StackOffset.
> If `IsMulVL` is true, then variable `Offset` is scalable and will at this point contain the part of the scalable offset that could not be folded into the immediate.
> SOffset.getBytes() just passes through the fixed-size part of the offset that is not handled by the instruction.
> Conversely, if `IsMulVL` is false, then the variable `Offset` is non-scalable and will contain the part of the fixed-size offset that could not be folded into the immediate. It then has to pass through SOffset.getScalableBytes() that is not handled by the instruction, so it can be handled elsewhere.
>
> For example:
>
> ```isAArch64FrameOffsetLegal(AArch64::LDR_ZXI, {16, MVT::i8} + {16, MVT::nxv1i8})```
> would fold `{16, MVT::nxv1i8}` into the immediate, and the resulting SOffset would be `{16, MVT::i8}`. It would return `AArch64FrameOffsetCanUpdate`.
>
> ```isAArch64FrameOffsetLegal(AArch64::LDR_ZXI, {16, MVT::i8} + {4096, MVT::nxv1i8})```
> would fold `{4080, MVT::nxv1i8}` into the immediate (note that its immediate goes up to `#255 MUL VL`), and the resulting SOffset would be `{16, MVT::i8} + {16, MVT::nxv1i8}`.

Ah, okay. So it's determining the addressing mode...


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D67749/new/

https://reviews.llvm.org/D67749

From llvm-commits at lists.llvm.org Tue Oct 8 08:26:06 2019
From: llvm-commits at lists.llvm.org (Pavel Labath via Phabricator via llvm-commits)
Date: Tue, 08 Oct 2019 15:26:06 +0000 (UTC)
Subject: [PATCH] D68270: DWARFDebugLoc: Add a function to get the address range of an entry
In-Reply-To:
References:
Message-ID:

labath marked 2 inline comments as done.
labath added inline comments.
================ Comment at: lib/DebugInfo/DWARF/DWARFDebugLoc.cpp:291-295 + EntryIterator Absolute = + getAbsoluteLocations( + SectionedAddress{BaseAddr, SectionedAddress::UndefSection}, + LookupPooledAddress) + .begin(); ---------------- dblaikie wrote: > labath wrote: > > dblaikie wrote: > > > labath wrote: > > > > This parallel iteration is not completely nice, but I think it's worth being able to reuse the absolute range computation code. I'm open to ideas for improvement though. > > > Ah, I see - this is what you meant about "In particular it makes it possible to reuse this stuff in the dumping code, which would have been pretty hard with callbacks.". > > > > > > I'm wondering if that might be worth revisiting somewhat. A full iterator abstraction for one user here (well, two once you include lldb - but I assume it's likely going to build its own data structure from the iteration anyway, right? (it's not going to keep the iterator around, do anything interesting like partial iterations, re-iterate/etc - such that a callback would suffice)) > > > > > > I could imagine two callback APIs for this - one that gets entries and locations and one that only gets locations by filtering on the entry version. > > > > > > eg: > > > > > > // for non-verbose output: > > > LL.forEachEntry([&](const Entry &E, Expected L) { > > > if (Verbose && actually dumping debug_loc) > > > print(E) // print any LLE_*, raw parameters, etc > > > if (L) > > > print(*L) // print the resulting address range, section name (if verbose), > > > else > > > print(error stuff) > > > }); > > > > > > One question would be "when/where do we print the DWARF expression" - if there's an error computing the address range, we can still print the expression, so maybe that happens unconditionally at the end of the callback, using the expression in the Entry? 
(then, arguably, the expression doesn't need to be in the DWARFLocation - and I'd say make the DWARFLocation a sectioned range, exactly the same type as for ranges so that part of the dumping code, etc, can be maximally reused) > > Actually, what lldb currently does is that it does not build any data structures at all (except storing the pointer to the right place in the debug_loc section. Then, whenever it wants to do something to the loclist, it parses it afresh. I don't know why it does this exactly, but I assume it has something to do with most locations never being used, or being only a couple of times, and the actual parsing being fairly fast. What this means is that lldb is not really a single "user", but there are like four or five places where it iterates through the list, depending on what does it actually want to do with it. It also does partial iteration where it stops as soon as it find the entry it was interested in. > > Now, all of that is possible with a callback (though I am generally trying to avoid them), but it does resurface the issue of what should be the value of the second argument for DW_LLE_base_address entries (the thing which I originally used a error type for). > > Maybe this should be actually one callback API, taking two callback functions, with one of them being invoked for base_address entries, and one for others? However, if we stick to the current approaches in both LLE and RLE of making the address pool resolution function a parameter (which I'd like to keep, as it makes my job in lldb easier), then this would actually be three callbacks, which starts to get unwieldy. Though one of those callbacks could be removed with the "DWARFUnit implementing a AddrOffsetResolver interface" idea, which I really like. :) > Ah, thanks for the details on LLDB's location parsing logic. That's interesting indeed! 
> > I can appreciate an iterator-based API if that's the sort of usage we've got, though I expect it doesn't have any interest in the low-level encoding & just wants the fully processed address ranges/locations - it doesn't want base_address or end_of_list entries? & I think the dual-iteration is a fairly awkward API design, trying to iterate them in lock-step, etc. I'd rather avoid that if reasonably possible. > > Either having an iterator API that gives only the fully processed data/semantic view & a completely different API if you want to access the low level primitives (LLE, etc) (this is how ranges works - there's an API that gives a collection of ranges & abstracts over v4/v5/rnglists/etc - though that's partly motivated by a strong multi-client need for that functionality for symbolizing, etc - but I think it's a good abstraction/model anyway (& one of the reasons the inline range list printing doesn't include encoding information, the API it uses is too high level to even have access to it)) > > > Now, all of that is possible with a callback (though I am generally trying to avoid them), but it does resurface the issue of what should be the value of the second argument for DW_LLE_base_address entries (the thing which I originally used a error type for). > > Sorry, my intent in the above API was for the second argument to be Optional's "None" state when... oh, I see, I did use Expected there, rather than Optional, because there are legit error cases. > > I know it's sort of awkward, but I might be inclined to use Optional> there. I realize two layers of wrapping is a bit weird, but I think it'd be nicer than having an error state for what, I think, isn't erroneous. > > > Maybe this should be actually one callback API, taking two callback functions, with one of them being invoked for base_address entries, and one for others? 
However, if we stick to the current approaches in both LLE and RLE of making the address pool resolution function a parameter (which I'd like to keep, as it makes my job in lldb easier), then this would actually be three callbacks, which starts to get unwieldy. > > Don't mind three callbacks too much. > > > Though one of those callbacks could be removed with the "DWARFUnit implementing a AddrOffsetResolver interface" idea, which I really like. :) > > Sorry, I haven't really looked at where the address resolver callback is registered and alternative designs being discussed - but yeah, going off just the one-sentence, it seems reasonable to have the DWARFUnit own an address resolver/be the thing you consult when you want to resolve an address (just through a normal function call in DWARFUnit, perhaps - which might, internally, use a callback registered when it was constructed). > I know it's sort of awkward, but I might be inclined to use Optional> there. I realize two layers of wrapping is a bit weird, but I think it'd be nicer than having an error state for what, I think, isn't erroneous. Actually, my very first attempt at this patch used an `Expected>`, but then I scrapped it because I didn't think you'd like it. It's not the friendliest of APIs, but I think we can go with that. > Sorry, I haven't really looked at where the address resolver callback is registered and alternative designs being discussed - but yeah, going off just the one-sentence, it seems reasonable to have the DWARFUnit own an address resolver/be the thing you consult when you want to resolve an address (just through a normal function call in DWARFUnit, perhaps - which might, internally, use a callback registered when it was constructed). I think you got that backwards. I don't want the DWARFUnit to be the source of truth for address pool resolutions, as that would make it hard to use from lldb (it's far from ready to start using the llvm version right now). 
What I wanted was to replace the lambda/function_ref with a single-method interface. Then both DWARFUnits could implement that interface so that passing a DWARFUnit& would "just work" (but you wouldn't be limited to DWARFUnits as anyone could implement that interface, just like anyone can write a lambda). ================ Comment at: test/CodeGen/X86/debug-loclists.ll:16 ; CHECK-NEXT: 0x00000000: -; CHECK-NEXT: [0x0000000000000000, 0x0000000000000004): DW_OP_breg5 RDI+0 -; CHECK-NEXT: [0x0000000000000004, 0x0000000000000012): DW_OP_breg3 RBX+0 - -; There is no way to use llvm-dwarfdump atm (2018, october) to verify the DW_LLE_* codes emited, -; because dumper is not yet implements that. Use asm code to do this check instead. -; -; RUN: llc -mtriple=x86_64-pc-linux -filetype=asm < %s -o - | FileCheck %s --check-prefix=ASM -; ASM: .section .debug_loclists,"", at progbits -; ASM-NEXT: .long .Ldebug_loclist_table_end0-.Ldebug_loclist_table_start0 # Length -; ASM-NEXT: .Ldebug_loclist_table_start0: -; ASM-NEXT: .short 5 # Version -; ASM-NEXT: .byte 8 # Address size -; ASM-NEXT: .byte 0 # Segment selector size -; ASM-NEXT: .long 0 # Offset entry count -; ASM-NEXT: .Lloclists_table_base0: -; ASM-NEXT: .Ldebug_loc0: -; ASM-NEXT: .byte 4 # DW_LLE_offset_pair -; ASM-NEXT: .uleb128 .Lfunc_begin0-.Lfunc_begin0 # starting offset -; ASM-NEXT: .uleb128 .Ltmp0-.Lfunc_begin0 # ending offset -; ASM-NEXT: .byte 2 # Loc expr size -; ASM-NEXT: .byte 117 # DW_OP_breg5 -; ASM-NEXT: .byte 0 # 0 -; ASM-NEXT: .byte 4 # DW_LLE_offset_pair -; ASM-NEXT: .uleb128 .Ltmp0-.Lfunc_begin0 # starting offset -; ASM-NEXT: .uleb128 .Ltmp1-.Lfunc_begin0 # ending offset -; ASM-NEXT: .byte 2 # Loc expr size -; ASM-NEXT: .byte 115 # DW_OP_breg3 -; ASM-NEXT: .byte 0 # 0 -; ASM-NEXT: .byte 0 # DW_LLE_end_of_list -; ASM-NEXT: .Ldebug_loclist_table_end0: +; CHECK-NEXT: [DW_LLE_offset_pair ]: 0x0000000000000000, 0x0000000000000004 => [0x0000000000000000, 0x0000000000000004) DW_OP_breg5 RDI+0 +; CHECK-NEXT: 
[DW_LLE_offset_pair ]: 0x0000000000000004, 0x0000000000000012 => [0x0000000000000004, 0x0000000000000012) DW_OP_breg3 RBX+0 ---------------- dblaikie wrote: > labath wrote: > > dblaikie wrote: > > > labath wrote: > > > > This tries to follow the RLE format as closely as possible, but I think something like > > > > ``` > > > > [DW_LLE_offset_pair, 0x0000000000000000, 0x0000000000000004] => [0x0000000000000000, 0x0000000000000004): DW_OP_breg5 RDI+0 > > > > ``` > > > > would make more sense (both here and for RLE). > > > Yep, that'd make more sense to me - are you planning to unify the codepaths for this? I think that'd be for the best. > > > > > > If I were picking a printing from scratch, I might go with: > > > > > > DW_LLE_offset_pair(0x0000, 0x0004) => [0x0000, 0x0004): DW_OP_breg5 RDI+0 > > > > > > Making it look a bit more like a function call and function arguments. Though the () might be confusing with the range notation. > > > > > > I'm also undecided on the " => " separator. Whether a ':' might be better/fine, etc. > > > > > > Totally open to ideas, but mostly I'd really love these to use loclist and ranges to use the same code as much as possible, so we can get consistency and any readability benefits, etc in both. > > I like the function call format. I hoping to get some code reuse, though it's still not fully clear to me how to achieve that.. > I've posted my unification of range/loc/v4/v5 emission here: https://reviews.llvm.org/D68620 - & I'd imagine something similar in the parsing side. cool. I'll see what I can do with that. 
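
To make the "function call" printing idea above a bit more concrete, here is a minimal standalone sketch. It is not the llvm-dwarfdump code: `OffsetPairEntry` and `formatOffsetPair` are hypothetical names invented for this illustration, and the DWARF expression is assumed to be pretty-printed already. It prints the raw `DW_LLE_offset_pair` operands first and the resolved half-open address range after the `=>`.

```cpp
#include <cassert>
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical, simplified stand-in for a DW_LLE_offset_pair entry.
// This is NOT the LLVM data structure, only an illustration.
struct OffsetPairEntry {
  uint64_t StartOffset; // start of the range, relative to the base address
  uint64_t EndOffset;   // end of the range (exclusive), relative to the base
  std::string Expr;     // DWARF expression, assumed to be printed already
};

// Render the raw encoding in "function call" style, followed by the resolved
// half-open address range and the location expression.
std::string formatOffsetPair(const OffsetPairEntry &E, uint64_t BaseAddr) {
  char Buf[256];
  std::snprintf(Buf, sizeof(Buf),
                "DW_LLE_offset_pair(0x%016" PRIx64 ", 0x%016" PRIx64
                ") => [0x%016" PRIx64 ", 0x%016" PRIx64 "): ",
                E.StartOffset, E.EndOffset, BaseAddr + E.StartOffset,
                BaseAddr + E.EndOffset);
  return std::string(Buf) + E.Expr;
}
```

With a base address of zero the raw operands and the resolved range coincide, as in the CHECK lines of the test; with a non-zero base only the part after `=>` changes, which is the reason for showing both.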
Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68270/new/

https://reviews.llvm.org/D68270

From llvm-commits at lists.llvm.org Tue Oct 8 08:26:06 2019
From: llvm-commits at lists.llvm.org (Sam Elliott via Phabricator via llvm-commits)
Date: Tue, 08 Oct 2019 15:26:06 +0000 (UTC)
Subject: [PATCH] D66887: [test-suite][WIP] Add GCC C Torture Suite as External Test Suite
In-Reply-To:
References:
Message-ID:

lenary updated this revision to Diff 223872.
lenary marked 2 inline comments as done.
lenary added a comment.

- Update LICENSE.TXT spelling (American spelling this time)
- Remove Licensing info duplication from README

Repository:
  rT test-suite

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D66887/new/

https://reviews.llvm.org/D66887

Files:
  LICENSE.TXT
  SingleSource/Regression/C/CMakeLists.txt
  SingleSource/Regression/C/gcc-c-torture/CMakeLists.txt
  SingleSource/Regression/C/gcc-c-torture/README
  SingleSource/Regression/C/gcc-c-torture/execute/CMakeLists.txt
  SingleSource/Regression/C/gcc-c-torture/execute/COPYING
  SingleSource/Regression/C/gcc-c-torture/execute/COPYING3
  SingleSource/Regression/C/gcc-c-torture/execute/LICENSE.TXT
  SingleSource/Regression/C/gcc-c-torture/execute/ieee/CMakeLists.txt
  SingleSource/Regression/C/gcc-c-torture/lit.local.cfg

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D66887.223872.patch
Type: text/x-patch
Size: 68367 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Tue Oct 8 08:26:06 2019
From: llvm-commits at lists.llvm.org (Nicolai Hähnle via Phabricator via llvm-commits)
Date: Tue, 08 Oct 2019 15:26:06 +0000 (UTC)
Subject: [PATCH] D68600: AMDGPU/GlobalISel: Fix crash on wide constant load with VGPR pointer
In-Reply-To:
References:
Message-ID:

nhaehnle added inline comments.
================ Comment at: test/CodeGen/AMDGPU/GlobalISel/legalize-extract.mir:1118-1131 +name: extract_s16_s64_offset18 + +body: | + bb.0: + liveins: $vgpr0_vgpr1 + ; CHECK-LABEL: name: extract_s16_s64_offset18 + ; CHECK: [[COPY:%[0-9]+]]:_(s64) = COPY $vgpr0_vgpr1 ---------------- This change seems unrelated. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68600/new/ https://reviews.llvm.org/D68600 From llvm-commits at lists.llvm.org Tue Oct 8 08:26:07 2019 From: llvm-commits at lists.llvm.org (Jeroen Dobbelaere via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 15:26:07 +0000 (UTC) Subject: [PATCH] D68484: [PATCH 01/38] [noalias] LangRef: noalias intrinsics and noalias_sidechannel documentation. In-Reply-To: References: Message-ID: <16c3f6616b60c87f1eab343c19b3a406@localhost.localdomain> jeroen.dobbelaere marked 3 inline comments as done. jeroen.dobbelaere added a comment. Here is an example `test.c`: struct FOO { int* restrict pA; int* pB; int* restrict pC; }; void bar(int* a, int* b, int* c) { struct FOO tmp = { a, b, c }; *tmp.pA=42; *tmp.pB=43; *tmp.pC=44; } Compiled as: clang -mllvm --print-before-all -mllvm -debug -emit-llvm -O2 test.c -S -o - Before SROA: %tmp = alloca %struct.FOO, align 8 ... %1 = call i8* @llvm.noalias.decl.p0i8.p0s_struct.FOOs.i64(%struct.FOO* %tmp, i64 0, metadata !6) ... %pA1 = getelementptr inbounds %struct.FOO, %struct.FOO* %tmp, i32 0, i32 0 %5 = load i32*, i32** %pA1, align 8, !tbaa !9, !noalias !6 %6 = call i32* @llvm.noalias.p0i32.p0i8.p0p0i32.i64(i32* %5, i8* %1, i32** %pA1, i64 0, metadata !6), !tbaa !9, !noalias !6 ... %pC3 = getelementptr inbounds %struct.FOO, %struct.FOO* %tmp, i32 0, i32 2 %8 = load i32*, i32** %pC3, align 8, !tbaa !12, !noalias !6 %9 = call i32* @llvm.noalias.p0i32.p0i8.p0p0i32.i64(i32* %8, i8* %1, i32** %pC3, i64 0, metadata !6), !tbaa !12, !noalias !6 During SROA: Notice how llvm.noalias.decl and llvm.noalias is split, using 0, 8 and 16 for the p.objId : ... 
Rewriting alloca partition [0,8) to: %tmp.sroa.0 = alloca i32* Found llvm.noalias.decl: %1 = call i8* @llvm.noalias.decl.p0i8.p0s_struct.FOOs.i64(%struct.FOO* %tmp, i64 0, metadata !6) New llvm.noalias.decl: %1 = call i8* @llvm.noalias.decl.p0i8.p0p0i32.i64(i32** %tmp.sroa.0, i64 0, metadata !6) [...] rewriting [0,8) slice #2 original: %7 = call i32* @llvm.noalias.p0i32.p0i8.p0p0i32.i64(i32* %tmp.sroa.0.0., i8* %2, i32** %pA1, i64 0, metadata !6), !tbaa !9, !noalias !6 to: %7 = call i32* @llvm.noalias.p0i32.p0i8.p0p0i32.i64(i32* %tmp.sroa.0.0., i8* %1, i32** %tmp.sroa.0, i64 0, metadata !6), !tbaa !9, !noalias !6 [...] rewriting split [0,24) slice #4 (splittable) original: %2 = call i8* @llvm.noalias.decl.p0i8.p0s_struct.FOOs.i64(%struct.FOO* %tmp, i64 0, metadata !6) to: %1 = call i8* @llvm.noalias.decl.p0i8.p0p0i32.i64(i32** %tmp.sroa.0, i64 0, metadata !6) [...] Rewriting alloca partition [8,16) to: %tmp.sroa.6 = alloca i32* Found llvm.noalias.decl: %2 = call i8* @llvm.noalias.decl.p0i8.p0s_struct.FOOs.i64(%struct.FOO* %tmp, i64 0, metadata !6) New llvm.noalias.decl: %2 = call i8* @llvm.noalias.decl.p0i8.p0p0i32.i64(i32** %tmp.sroa.6, i64 8, metadata !6) [...] rewriting split [0,24) slice #4 (splittable) original: %3 = call i8* @llvm.noalias.decl.p0i8.p0s_struct.FOOs.i64(%struct.FOO* %tmp, i64 0, metadata !6) to: %2 = call i8* @llvm.noalias.decl.p0i8.p0p0i32.i64(i32** %tmp.sroa.6, i64 8, metadata !6) [...] rewriting [8,16) slice #7 original: %9 = load i32*, i32** %pB2, align 8, !tbaa !11, !noalias !6 to: %tmp.sroa.6.8. = load i32*, i32** %tmp.sroa.6, !tbaa !11, !noalias !6 Rewriting alloca partition [16,24) to: %tmp.sroa.8 = alloca i32* Found llvm.noalias.decl: %3 = call i8* @llvm.noalias.decl.p0i8.p0s_struct.FOOs.i64(%struct.FOO* %tmp, i64 0, metadata !6) New llvm.noalias.decl: %3 = call i8* @llvm.noalias.decl.p0i8.p0p0i32.i64(i32** %tmp.sroa.8, i64 16, metadata !6) [...] 
rewriting split [0,24) slice #4 (splittable) original: %4 = call i8* @llvm.noalias.decl.p0i8.p0s_struct.FOOs.i64(%struct.FOO* %tmp, i64 0, metadata !6) to: %3 = call i8* @llvm.noalias.decl.p0i8.p0p0i32.i64(i32** %tmp.sroa.8, i64 16, metadata !6) [...] rewriting [16,24) slice #10 original: %12 = call i32* @llvm.noalias.p0i32.p0i8.p0p0i32.i64(i32* %tmp.sroa.8.16., i8* %4, i32** %pC3, i64 0, metadata !6), !tbaa !12, !noalias !6 to: %12 = call i32* @llvm.noalias.p0i32.p0i8.p0p0i32.i64(i32* %tmp.sroa.8.16., i8* %3, i32** %tmp.sroa.8, i64 16, metadata !6), !tbaa !12, !noalias !6 Speculating PHIs Speculating Selects Then later on: Promoting allocas with mem2reg... Zeoring noalias.decl dep: %0 = call i8* @llvm.noalias.decl.p0i8.p0p0i32.i64(i32** %tmp.sroa.0, i64 0, metadata !6) Zeroing operand 2 of %3 = call i32* @llvm.noalias.p0i32.p0i8.p0p0i32.i64(i32* %tmp.sroa.0.0., i8* %0, i32** %tmp.sroa.0, i64 0, metadata !6), !tbaa !9, !noalias !6 [...] Zeoring noalias.decl dep: %2 = call i8* @llvm.noalias.decl.p0i8.p0p0i32.i64(i32** %tmp.sroa.8, i64 16, metadata !2) Zeroing operand 2 of %4 = call i32* @llvm.noalias.p0i32.p0i8.p0p0i32.i64(i32* %tmp.sroa.8.16., i8* %2, i32** %tmp.sroa.8, i64 16, metadata !2), !tbaa !10, !noalias !2 [...] Zeoring noalias.decl dep: %1 = call i8* @llvm.noalias.decl.p0i8.p0p0i32.i64(i32** %tmp.sroa.6, i64 8, metadata !2) [...] (aargh, 'Zeoring' should of course be 'Zeroing' ;) ) After this pass, we get: define dso_local void @bar(i32* %a, i32* %b, i32* %c) #0 { entry: %0 = call i8* @llvm.noalias.decl.p0i8.p0p0i32.i64(i32** null, i64 0, metadata !2) %1 = call i8* @llvm.noalias.decl.p0i8.p0p0i32.i64(i32** null, i64 8, metadata !2) ; This one will be removed later on, as it is not used anywhere. 
%2 = call i8* @llvm.noalias.decl.p0i8.p0p0i32.i64(i32** null, i64 16, metadata !2) %3 = call i32* @llvm.noalias.p0i32.p0i8.p0p0i32.i64(i32* %a, i8* %0, i32** null, i64 0, metadata !2), !tbaa !5, !noalias !2 store i32 42, i32* %3, align 4, !tbaa !10, !noalias !2 store i32 43, i32* %b, align 4, !tbaa !10, !noalias !2 %4 = call i32* @llvm.noalias.p0i32.p0i8.p0p0i32.i64(i32* %c, i8* %2, i32** null, i64 16, metadata !2), !tbaa !12, !noalias !2 store i32 44, i32* %4, align 4, !tbaa !10, !noalias !2 ret void } [...] !2 = !{!3} ; p.scope: 'tmp' in function 'bar', also recycled by !noalias, as it is the only restrict declaration !3 = distinct !{!3, !4, !"bar: tmp"} !4 = distinct !{!4, !"bar"} ================ Comment at: llvm/docs/LangRef.rst:16303 +- ``p.objId``: a number that can be used to differentiate different *object P* + when ``%p.addr`` is optimized away. +- ``!p.scope``: metadata argument that refers to a list of alias.scope metadata ---------------- jdoerfert wrote: > jeroen.dobbelaere wrote: > > jdoerfert wrote: > > > This seems odd, why introduce two things that do the same thing. > > The original idea was to treat '%p.addr' sometimes as a pointer to an object and sometimes as an offset. Later it needed to be separated: SROA first splits alloca's into multiple smaller alloca's. Each separate restrict pointer now points to its own alloca (%p.addr), and there is no place to put the offset. You can differentiate by splitting the p.scope, but that would imply duplicating scopes all over the place. The p.objId serves as a convenient and less costly solution to differentiate the pointers in this case. > So `objId` is an offset into `p.addr`? If so, let's document it that way. > > How does this work if there are multiple restrict pointers in the object, e.g. `struct { restrict *a; restrict *b }`? Maybe it would help if you point me towards the place where I can see this intrinsic in action. At least then I might be able to provide better feedback on the wording. 
This is the confusing part for me for the LangRef vs the usage: should the LangRef describe only the high level effect, or can it also describe how llvm treats/optimizes stuff internally? I somehow have the feeling that we might want to have a separate restrict handling document, describing how the intrinsics and metadata work together. Or do you think such a thing also belongs to the LangRef? ================ Comment at: llvm/docs/LangRef.rst:16306 + entries that contains exactly one element. It represents the variable + declaration that contains one or more restrict pointers. +- ``%p.decl``: points to the ``@llvm.noalias.decl`` intrinsic associated with ---------------- jdoerfert wrote: > jeroen.dobbelaere wrote: > > jdoerfert wrote: > > > "entries with a single element each." > > > > It represents the variable declaration that contains one or more restrict pointers. > > > I do not understand this sentence. > > hmm. Not sure how to explain it further. What I want to say is (shown with an example): > > int *restrict A; // one !p.scope, one restrict pointer > > int *restrict B[10]; // another (single) !p.scope, ten restrict pointers > > struct FOO { int* restrict mA; int * mB; int* restrict mC; } C; // yet another !p.scope, 2 restrict pointers > > > > > > > > > In that example, what do the `p.scopes` look like? Or, asked differently, is the `p.scope` a consequence of the declaration, hence does it uniquely identify a declaration? Yes, the p.scope is a result of the declaration and uniquely identifies one. ================ Comment at: llvm/docs/LangRef.rst:16496 +not really represent a value. It is merely used to track a dependency on the +declaration. + ---------------- jdoerfert wrote: > jeroen.dobbelaere wrote: > > jdoerfert wrote: > > > The above reads funny, maybe: > > > "The returned value is a handle to track dependences on the declaration. There is no explicit relationship to the value of the arguments." > > > Also, why do we want an `i8*` then?
We have `tokens` and we have `i32`, I'd prefer either over an `i8*` which is more confusing in this context full of `i8*` that are actually pointers (IMHO). > > I think a token has too many restrictions (no PHI, no select). i32 might do. I didn't think too much about it and just settled on i8*. > If the token is too restrictive I'd still prefer an i32 (or similar) to avoid confusion with all the i8 pointers that fly around. The wording will then make it clear that these are tokens. OK, we can consider that. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68484/new/ https://reviews.llvm.org/D68484 From llvm-commits at lists.llvm.org Tue Oct 8 08:26:06 2019 From: llvm-commits at lists.llvm.org (Sam Elliott via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 15:26:06 +0000 (UTC) Subject: [PATCH] D66887: [test-suite][WIP] Add GCC C Torture Suite as External Test Suite In-Reply-To: References: Message-ID: <4d6903cfb053ee6caab26d793e6f883c@localhost.localdomain> lenary marked 4 inline comments as done. lenary added inline comments. ================ Comment at: SingleSource/Regression/C/gcc-c-torture/README:14-17 +# Licensing + +The testcases in SingleSource/Regression/C/gcc-c-torture/execute are covered by +the GPL. See the files whose names start with COPYING for copying permission. ---------------- kristof.beyls wrote: > Given licensing information is also present in SingleSource/Regression/C/gcc-c-torture/execute/LICENSE.TXT and called out in the top-level LICENSE.TXT file in the test-suite, I don't think there's much value in repeating the information here too. Maybe best to delete this part? Sure, deleted. ================ Comment at: SingleSource/Regression/C/gcc-c-torture/execute/LICENCE.TXT:1-2 +The testcases in this directory are covered by the GPL. See the files whose +names start with COPYING for copying permission.
---------------- kristof.beyls wrote: > it seems the file name here is still LICENCE.TXT rather than LICENSE.TXT, which would be preferred for consistency with the rest of the test-suite? I'm really sorry about you having to note this twice. I must have been having a british/american spelling moment and missed the C/S difference. Updated. Repository: rT test-suite CHANGES SINCE LAST ACTION https://reviews.llvm.org/D66887/new/ https://reviews.llvm.org/D66887 From llvm-commits at lists.llvm.org Tue Oct 8 08:28:36 2019 From: llvm-commits at lists.llvm.org (GN Sync Bot via llvm-commits) Date: Tue, 08 Oct 2019 15:28:36 -0000 Subject: [llvm] r374064 - gn build: Merge r374061 Message-ID: <20191008152836.AD9038F4A1@lists.llvm.org> Author: gnsyncbot Date: Tue Oct 8 08:28:36 2019 New Revision: 374064 URL: http://llvm.org/viewvc/llvm-project?rev=374064&view=rev Log: gn build: Merge r374061 Modified: llvm/trunk/utils/gn/secondary/clang/lib/Driver/BUILD.gn Modified: llvm/trunk/utils/gn/secondary/clang/lib/Driver/BUILD.gn URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/gn/secondary/clang/lib/Driver/BUILD.gn?rev=374064&r1=374063&r2=374064&view=diff ============================================================================== --- llvm/trunk/utils/gn/secondary/clang/lib/Driver/BUILD.gn (original) +++ llvm/trunk/utils/gn/secondary/clang/lib/Driver/BUILD.gn Tue Oct 8 08:28:36 2019 @@ -67,6 +67,7 @@ static_library("Driver") { "ToolChains/Haiku.cpp", "ToolChains/Hexagon.cpp", "ToolChains/Hurd.cpp", + "ToolChains/InterfaceStubs.cpp", "ToolChains/Linux.cpp", "ToolChains/MSP430.cpp", "ToolChains/MSVC.cpp", From llvm-commits at lists.llvm.org Tue Oct 8 08:26:22 2019 From: llvm-commits at lists.llvm.org (Hideto Ueno via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 15:26:22 +0000 (UTC) Subject: [PATCH] D66645: [Attributor] Add helper class to compose two structured deduction. 
In-Reply-To: References: Message-ID: This revision was not accepted when it landed; it landed in state "Needs Review". This revision was automatically updated to reflect the committed changes. Closed by commit rG08daf8cf0a55: [Attributor] Add helper class to compose two structured deduction. (authored by uenoku). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D66645/new/ https://reviews.llvm.org/D66645 Files: llvm/lib/Transforms/IPO/Attributor.cpp Index: llvm/lib/Transforms/IPO/Attributor.cpp =================================================================== --- llvm/lib/Transforms/IPO/Attributor.cpp +++ llvm/lib/Transforms/IPO/Attributor.cpp @@ -560,6 +560,21 @@ S ^= *T; } +/// Helper class to compose two generic deduction +template class F, template class G> +struct AAComposeTwoGenericDeduction + : public F, StateType> { + AAComposeTwoGenericDeduction(const IRPosition &IRP) + : F, StateType>(IRP) {} + + /// See AbstractAttribute::updateImpl(...). + ChangeStatus updateImpl(Attributor &A) override { + return F, StateType>::updateImpl(A) | + G::updateImpl(A); + } +}; + /// Helper class for generic deduction: return value -> returned position. template -------------- next part -------------- A non-text attachment was scrubbed... Name: D66645.223873.patch Type: text/x-patch Size: 1077 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 08:26:46 2019 From: llvm-commits at lists.llvm.org (Hideto Ueno via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 15:26:46 +0000 (UTC) Subject: [PATCH] D65402: [Attributor][MustExec] Deduce dereferenceable and nonnull attribute using MustBeExecutedContextExplorer In-Reply-To: References: Message-ID: <2e74b0704e2365f47cf1f5c107b38ee5@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rG96e6ce4cd361: [Attributor][MustExec] Deduce dereferenceable and nonnull attribute using… (authored by uenoku). 
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D65402/new/ https://reviews.llvm.org/D65402 Files: llvm/include/llvm/Transforms/IPO/Attributor.h llvm/lib/Transforms/IPO/Attributor.cpp llvm/test/Transforms/FunctionAttrs/align.ll llvm/test/Transforms/FunctionAttrs/arg_nocapture.ll llvm/test/Transforms/FunctionAttrs/arg_returned.ll llvm/test/Transforms/FunctionAttrs/callbacks.ll llvm/test/Transforms/FunctionAttrs/dereferenceable.ll llvm/test/Transforms/FunctionAttrs/internal-noalias.ll llvm/test/Transforms/FunctionAttrs/liveness.ll llvm/test/Transforms/FunctionAttrs/noalias_returned.ll llvm/test/Transforms/FunctionAttrs/nocapture.ll llvm/test/Transforms/FunctionAttrs/nonnull.ll llvm/test/Transforms/FunctionAttrs/norecurse.ll llvm/test/Transforms/FunctionAttrs/nosync.ll llvm/test/Transforms/FunctionAttrs/read_write_returned_arguments_scc.ll llvm/test/Transforms/FunctionAttrs/readattrs.ll llvm/test/Transforms/InferFunctionAttrs/dereferenceable.ll -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D65402.223874.patch Type: text/x-patch Size: 45387 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 08:34:52 2019 From: llvm-commits at lists.llvm.org (GN Sync Bot via llvm-commits) Date: Tue, 08 Oct 2019 15:34:52 -0000 Subject: [llvm] r374065 - gn build: Merge r374062 Message-ID: <20191008153452.80BE78C78A@lists.llvm.org> Author: gnsyncbot Date: Tue Oct 8 08:34:52 2019 New Revision: 374065 URL: http://llvm.org/viewvc/llvm-project?rev=374065&view=rev Log: gn build: Merge r374062 Modified: llvm/trunk/utils/gn/secondary/llvm/unittests/TextAPI/BUILD.gn Modified: llvm/trunk/utils/gn/secondary/llvm/unittests/TextAPI/BUILD.gn URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/gn/secondary/llvm/unittests/TextAPI/BUILD.gn?rev=374065&r1=374064&r2=374065&view=diff ============================================================================== --- llvm/trunk/utils/gn/secondary/llvm/unittests/TextAPI/BUILD.gn (original) +++ llvm/trunk/utils/gn/secondary/llvm/unittests/TextAPI/BUILD.gn Tue Oct 8 08:34:52 2019 @@ -10,6 +10,5 @@ unittest("TextAPITests") { "TextStubV1Tests.cpp", "TextStubV2Tests.cpp", "TextStubV3Tests.cpp", - "TextStubV4Tests.cpp", ] } From llvm-commits at lists.llvm.org Tue Oct 8 08:35:14 2019 From: llvm-commits at lists.llvm.org (Hubert Tong via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 15:35:14 +0000 (UTC) Subject: [PATCH] D68575: implement parsing overflow section header. In-Reply-To: References: Message-ID: <21e2a93b2ec6b6057d18698845b42583@localhost.localdomain> hubert.reinterpretcast added inline comments. ================ Comment at: llvm/test/tools/llvm-readobj/xcoff-overflow-section.test:37 +# SECOVERFLOW-NEXT: Name: .ovrflo +# SECOVERFLOW-NEXT: NumberOfRelocations: 0x3 +# SECOVERFLOW-NEXT: NumberOfLineNumbers: 0x0 ---------------- The same information, when conveyed not via an overflow header, is expressed in decimal format (and not hexadecimal). 
================ Comment at: llvm/test/tools/llvm-readobj/xcoff-overflow-section.test:38 +# SECOVERFLOW-NEXT: NumberOfRelocations: 0x3 +# SECOVERFLOW-NEXT: NumberOfLineNumbers: 0x0 +# SECOVERFLOW-NEXT: Size: 0x0 ---------------- Ditto. ================ Comment at: llvm/test/tools/llvm-readobj/xcoff-overflow-section.test:41 +# SECOVERFLOW-NEXT: RawDataOffset: 0x0 +# SECOVERFLOW-NEXT: RelocationPointer: 0x0 +# SECOVERFLOW-NEXT: LineNumberPointer: 0x0 ---------------- According to the XCOFF documentation: The `s_relptr` and `s_lnnoptr` fields must have the same values as in the corresponding primary section header. ================ Comment at: llvm/test/tools/llvm-readobj/xcoff-overflow-section.test:42 +# SECOVERFLOW-NEXT: RelocationPointer: 0x0 +# SECOVERFLOW-NEXT: LineNumberPointer: 0x0 +# SECOVERFLOW-NEXT: IndexOfSectionOverflowed: 2 ---------------- See above. ================ Comment at: llvm/tools/llvm-readobj/XCOFFDumper.cpp:437 + uint16_t Flags = Sec.Flags & 0xffffu; + switch (Flags) { ---------------- Can we encode this into a function (`getSectionType(const T &Section)`) or constant (`SectionFlagsTypeMask`) with the comment: The low order 16 bits of s_flags denotes the section type. Also, the variable should be named `SectionType` or similar. ================ Comment at: llvm/tools/llvm-readobj/XCOFFDumper.cpp:441 + if (Obj.is64Bit()) + W.printString( + "64 bits XCOFF objectfile should not have overflow section!"); ---------------- I think we should not be printing a "warning" to the same stream as the output. ================ Comment at: llvm/tools/llvm-readobj/XCOFFDumper.cpp:442 + W.printString( + "64 bits XCOFF objectfile should not have overflow section!"); + printOverflowSectionHeader(Sec); ---------------- "64-bit XCOFF object file". ================ Comment at: llvm/tools/llvm-readobj/XCOFFDumper.cpp:448 + case XCOFF::STYP_TYPCHK: + // TODO : The interpret of loader, exception, type check section header + // are different from generic section header. 
We will implement them ---------------- s/TODO : /TODO /; s/interpret/interpretation/; s/, type check section header/, and type check section headers/; s/from/from that of/; s/generic section header/generic section headers/; s/generic seciton header now/generic section headers for now/; ================ Comment at: llvm/tools/llvm-readobj/XCOFFDumper.cpp:456 // The most significant 16-bits represent the DWARF section subtype. For // now we just dump the section type flags. if (Flags & SectionFlagsReservedMask) ---------------- Since we removed the context of the code line from here, the comment is ambiguous. This should say that, for now, we dump only the section type (low order 16 bits). Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68575/new/ https://reviews.llvm.org/D68575 From llvm-commits at lists.llvm.org Tue Oct 8 08:35:15 2019 From: llvm-commits at lists.llvm.org (Kristof Beyls via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 15:35:15 +0000 (UTC) Subject: [PATCH] D66887: [test-suite][WIP] Add GCC C Torture Suite as External Test Suite In-Reply-To: References: Message-ID: <93e911f7d21fbc1f7c8c7731c526d47e@localhost.localdomain> kristof.beyls accepted this revision. kristof.beyls added a comment. This revision is now accepted and ready to land. LGTM, thanks! Repository: rT test-suite CHANGES SINCE LAST ACTION https://reviews.llvm.org/D66887/new/ https://reviews.llvm.org/D66887 From llvm-commits at lists.llvm.org Tue Oct 8 08:35:16 2019 From: llvm-commits at lists.llvm.org (Sam Elliott via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 15:35:16 +0000 (UTC) Subject: [PATCH] D66887: [test-suite][WIP] Add GCC C Torture Suite as External Test Suite In-Reply-To: References: Message-ID: <70e67fac5e87588776d5a4bde9ea161a@localhost.localdomain> lenary added a comment. 
In D66887#1676164, @kristof.beyls wrote: > Were you planning once you commit to quickly add skip lists for other architectures too so that all public bots for all architectures will keep on passing? Sorry, I missed addressing this. There are two options: 1. I do nothing, people see that tests are failing and help me update the CMakeLists tests to skip. I think this would happen very quickly, as there would be failing tests. 2. I add an `if(ARCH MATCHES "x86" OR ARCH MATCHES "riscv")` around the `add_subdirectory(...)` in `SingleSource/Regression/C/CMakeLists.txt`. This will mean no errors, but I don't have a way of testing on other architectures so it could take a while for the torture suite to be supported by other architectures. I'm not sure of the best way forward. If we want to avoid breakage: 2; if we're optimising for maximal test coverage: 1. Repository: rT test-suite CHANGES SINCE LAST ACTION https://reviews.llvm.org/D66887/new/ https://reviews.llvm.org/D66887 From llvm-commits at lists.llvm.org Tue Oct 8 08:35:17 2019 From: llvm-commits at lists.llvm.org (Owen Reynolds via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 15:35:17 +0000 (UTC) Subject: [PATCH] D68472: [test] Depend on C.UTF-8 dependency for mri-utf8.test In-Reply-To: References: Message-ID: <4747a30aec08e5240e5e6c4325afee1c@localhost.localdomain> gbreynoo added inline comments. ================ Comment at: llvm/test/tools/llvm-ar/mri-nonascii.test:15 + +# Use input redirection to work around problems launching processess that +# include arguments with non-ascii characters. ---------------- thopre wrote: > MaskRay wrote: > > hubert.reinterpretcast wrote: > > > Minor nit: s/processess/processes/; > > What problems do you work around? POSIX.1-2017 3.282 Portable Filename Character Set consists of the classical Latin alphabet, 0~9, period, underscore, and hyphen-minus.
A filename consisting of the UTF-8 byte sequence 0xc2 0xa3 (£) may be disallowed by some implementations but it is unlikely that the implementation can arbitrarily reinterpret the byte sequence and cause the test to fail. > > > > I suggest deleting the comment. > The original message is not mine so I'm not sure what it referred to; it might be that arguments are passed down to the program being invoked without interpretation, thus the filename would be UTF-8 encoded since that is what mri-utf8.test is encoded in. This would fail on Windows where filenames must be UTF-16 and the output redirection of the earlier line would have created a filename in UTF-16. > > I'll let Owen confirm. Sorry that the comment was not clear. The issue I had was explicitly with the behaviour differences between python versions and OS, which caused strings not to be encoded in the right format, so the file in question failed to open. Originally I used python rather than FileCheck so as to be explicit with the expected characters and encoding. However if everyone is happier with MaskRay's suggestion and it functions as expected, I'm not sure I can argue. Avoiding the dependency on the locale would be great. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68472/new/ https://reviews.llvm.org/D68472 From llvm-commits at lists.llvm.org Tue Oct 8 08:35:17 2019 From: llvm-commits at lists.llvm.org (Owen Reynolds via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 15:35:17 +0000 (UTC) Subject: [PATCH] D68472: [test] Depend on C.UTF-8 dependency for mri-utf8.test In-Reply-To: References: Message-ID: <12686a2c07fbdc2033819c85b28b1836@localhost.localdomain> gbreynoo added a comment. This works fine on Windows.
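As a small illustration of the encoding point raised in this thread (a sketch only, not part of the patch or of the test under review): the character £ is the two bytes 0xc2 0xa3 in UTF-8, as quoted in the review, but a different byte sequence in UTF-16, so a name written in one encoding need not match a file created under the other. The variable names are made up for the example, and the latin-1 step is just one assumed way a mismatch could show up.

```python
# Sketch only: compare the UTF-8 and UTF-16 encodings of U+00A3 (pound sign).
pound = "\u00a3"
utf8_bytes = pound.encode("utf-8")       # two bytes: 0xc2 0xa3
utf16_bytes = pound.encode("utf-16-le")  # one 16-bit code unit, little-endian

print(utf8_bytes.hex(), utf16_bytes.hex())

# Decoding the UTF-8 bytes with a different codec does not give the name back.
mismatched = utf8_bytes.decode("latin-1")
print(mismatched == pound)
```

Reading a name with a different encoding than the one it was written in gives a different string, which is the kind of failure to open a file described above.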
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68472/new/ https://reviews.llvm.org/D68472 From llvm-commits at lists.llvm.org Tue Oct 8 08:43:12 2019 From: llvm-commits at lists.llvm.org (Nikola Prica via llvm-commits) Date: Tue, 08 Oct 2019 15:43:12 -0000 Subject: [llvm] r374068 - [DebugInfo][If-Converter] Update call site info during the optimization Message-ID: <20191008154312.8604E891D1@lists.llvm.org> Author: nikolaprica Date: Tue Oct 8 08:43:12 2019 New Revision: 374068 URL: http://llvm.org/viewvc/llvm-project?rev=374068&view=rev Log: [DebugInfo][If-Converter] Update call site info during the optimization During the If-Converter optimization pay attention when copying or deleting call instructions in order to keep call site information in valid state. Reviewers: aprantl, vsk, efriedma Reviewed By: vsk, efriedma Differential Revision: https://reviews.llvm.org/D66955 Added: llvm/trunk/test/DebugInfo/MIR/ARM/if-coverter-call-site-info.mir Modified: llvm/trunk/include/llvm/CodeGen/MachineFunction.h llvm/trunk/lib/CodeGen/BranchFolding.cpp llvm/trunk/lib/CodeGen/IfConversion.cpp llvm/trunk/lib/CodeGen/InlineSpiller.cpp llvm/trunk/lib/CodeGen/LiveRangeEdit.cpp llvm/trunk/lib/CodeGen/MachineFunction.cpp llvm/trunk/lib/CodeGen/MachineOutliner.cpp llvm/trunk/lib/CodeGen/PeepholeOptimizer.cpp llvm/trunk/lib/CodeGen/TargetInstrInfo.cpp llvm/trunk/lib/CodeGen/UnreachableBlockElim.cpp llvm/trunk/lib/CodeGen/XRayInstrumentation.cpp llvm/trunk/lib/Target/ARM/ARMExpandPseudoInsts.cpp llvm/trunk/lib/Target/X86/X86ExpandPseudo.cpp llvm/trunk/test/CodeGen/ARM/smml.ll Modified: llvm/trunk/include/llvm/CodeGen/MachineFunction.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/CodeGen/MachineFunction.h?rev=374068&r1=374067&r2=374068&view=diff ============================================================================== --- llvm/trunk/include/llvm/CodeGen/MachineFunction.h (original) +++ 
llvm/trunk/include/llvm/CodeGen/MachineFunction.h Tue Oct 8 08:43:12 2019 @@ -36,6 +36,7 @@ #include "llvm/Support/Compiler.h" #include "llvm/Support/ErrorHandling.h" #include "llvm/Support/Recycler.h" +#include "llvm/Target/TargetMachine.h" #include #include #include @@ -400,6 +401,17 @@ private: /// Map a call instruction to call site arguments forwarding info. CallSiteInfoMap CallSitesInfo; + /// A helper function that returns call site info for a give call + /// instruction if debug entry value support is enabled. + CallSiteInfoMap::iterator getCallSiteInfo(const MachineInstr *MI) { + assert(MI->isCall() && + "Call site info refers only to call instructions!"); + + if (!Target.Options.EnableDebugEntryValues) + return CallSitesInfo.end(); + return CallSitesInfo.find(MI); + } + // Callbacks for insertion and removal. void handleInsertion(MachineInstr &MI); void handleRemoval(MachineInstr &MI); @@ -977,12 +989,24 @@ public: return CallSitesInfo; } - /// Update call sites info by deleting entry for \p Old call instruction. - /// If \p New is present then transfer \p Old call info to it. This function - /// should be called before removing call instruction or before replacing - /// call instruction with new one. - void updateCallSiteInfo(const MachineInstr *Old, - const MachineInstr *New = nullptr); + /// Following functions update call site info. They should be called before + /// removing, replacing or copying call instruction. + + /// Move the call site info from \p Old to \New call site info. This function + /// is used when we are replacing one call instruction with another one to + /// the same callee. + void moveCallSiteInfo(const MachineInstr *Old, + const MachineInstr *New); + + /// Erase the call site info for \p MI. It is used to remove a call + /// instruction from the instruction stream. + void eraseCallSiteInfo(const MachineInstr *MI); + + /// Copy the call site info from \p Old to \ New. 
Its usage is when we are + /// making a copy of the instruction that will be inserted at different point + /// of the instruction stream. + void copyCallSiteInfo(const MachineInstr *Old, + const MachineInstr *New); }; //===--------------------------------------------------------------------===// Modified: llvm/trunk/lib/CodeGen/BranchFolding.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/BranchFolding.cpp?rev=374068&r1=374067&r2=374068&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/BranchFolding.cpp (original) +++ llvm/trunk/lib/CodeGen/BranchFolding.cpp Tue Oct 8 08:43:12 2019 @@ -162,6 +162,11 @@ void BranchFolder::RemoveDeadBlock(Machi // Avoid matching if this pointer gets reused. TriedMerging.erase(MBB); + // Update call site info. + std::for_each(MBB->begin(), MBB->end(), [MF](const MachineInstr &MI) { + if (MI.isCall(MachineInstr::IgnoreBundle)) + MF->eraseCallSiteInfo(&MI); + }); // Remove the block. MF->erase(MBB); EHScopeMembership.erase(MBB); Modified: llvm/trunk/lib/CodeGen/IfConversion.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/IfConversion.cpp?rev=374068&r1=374067&r2=374068&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/IfConversion.cpp (original) +++ llvm/trunk/lib/CodeGen/IfConversion.cpp Tue Oct 8 08:43:12 2019 @@ -1743,6 +1743,11 @@ bool IfConverter::IfConvertDiamondCommon ++i; } while (NumDups1 != 0) { + // Since this instruction is going to be deleted, update call + // site info state if the instruction is call instruction. + if (DI2->isCall(MachineInstr::IgnoreBundle)) + MBB2.getParent()->eraseCallSiteInfo(&*DI2); + ++DI2; if (DI2 == MBB2.end()) break; @@ -1784,7 +1789,14 @@ bool IfConverter::IfConvertDiamondCommon // NumDups2 only counted non-dbg_value instructions, so this won't // run off the head of the list. 
assert(DI1 != MBB1.begin()); + --DI1; + + // Since this instruction is going to be deleted, update call + // site info state if the instruction is call instruction. + if (DI1->isCall(MachineInstr::IgnoreBundle)) + MBB1.getParent()->eraseCallSiteInfo(&*DI1); + // skip dbg_value instructions if (!DI1->isDebugInstr()) ++i; @@ -2069,6 +2081,10 @@ void IfConverter::CopyAndPredicateBlock( break; MachineInstr *MI = MF.CloneMachineInstr(&I); + // Make a copy of the call site info. + if (MI->isCall(MachineInstr::IgnoreBundle)) + MF.copyCallSiteInfo(&I,MI); + ToBBI.BB->insert(ToBBI.BB->end(), MI); ToBBI.NonPredSize++; unsigned ExtraPredCost = TII->getPredicationCost(I); Modified: llvm/trunk/lib/CodeGen/InlineSpiller.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/InlineSpiller.cpp?rev=374068&r1=374067&r2=374068&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/InlineSpiller.cpp (original) +++ llvm/trunk/lib/CodeGen/InlineSpiller.cpp Tue Oct 8 08:43:12 2019 @@ -867,7 +867,7 @@ foldMemoryOperand(ArrayRefisCall()) - MI->getMF()->updateCallSiteInfo(MI, FoldMI); + MI->getMF()->moveCallSiteInfo(MI, FoldMI); MI->eraseFromParent(); // Insert any new instructions other than FoldMI into the LIS maps. 
Modified: llvm/trunk/lib/CodeGen/LiveRangeEdit.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/LiveRangeEdit.cpp?rev=374068&r1=374067&r2=374068&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/LiveRangeEdit.cpp (original) +++ llvm/trunk/lib/CodeGen/LiveRangeEdit.cpp Tue Oct 8 08:43:12 2019 @@ -232,7 +232,7 @@ bool LiveRangeEdit::foldAsLoad(LiveInter LLVM_DEBUG(dbgs() << " folded: " << *FoldMI); LIS.ReplaceMachineInstrInMaps(*UseMI, *FoldMI); if (UseMI->isCall()) - UseMI->getMF()->updateCallSiteInfo(UseMI, FoldMI); + UseMI->getMF()->moveCallSiteInfo(UseMI, FoldMI); UseMI->eraseFromParent(); DefMI->addRegisterDead(LI->reg, nullptr); Dead.push_back(DefMI); Modified: llvm/trunk/lib/CodeGen/MachineFunction.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/MachineFunction.cpp?rev=374068&r1=374067&r2=374068&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/MachineFunction.cpp (original) +++ llvm/trunk/lib/CodeGen/MachineFunction.cpp Tue Oct 8 08:43:12 2019 @@ -835,20 +835,36 @@ void MachineFunction::addCodeViewHeapAll CodeViewHeapAllocSites.push_back(std::make_tuple(BeginLabel, EndLabel, DI)); } -void MachineFunction::updateCallSiteInfo(const MachineInstr *Old, - const MachineInstr *New) { - if (!Target.Options.EnableDebugEntryValues || Old == New) - return; +void MachineFunction::moveCallSiteInfo(const MachineInstr *Old, + const MachineInstr *New) { + assert(New->isCall() && "Call site info refers only to call instructions!"); - assert(Old->isCall() && (!New || New->isCall()) && - "Call site info referes only to call instructions!"); - CallSiteInfoMap::iterator CSIt = CallSitesInfo.find(Old); + CallSiteInfoMap::iterator CSIt = getCallSiteInfo(Old); if (CSIt == CallSitesInfo.end()) return; + CallSiteInfo CSInfo = std::move(CSIt->second); CallSitesInfo.erase(CSIt); - if (New) - CallSitesInfo[New] = 
CSInfo; + CallSitesInfo[New] = CSInfo; +} + +void MachineFunction::eraseCallSiteInfo(const MachineInstr *MI) { + CallSiteInfoMap::iterator CSIt = getCallSiteInfo(MI); + if (CSIt == CallSitesInfo.end()) + return; + CallSitesInfo.erase(CSIt); +} + +void MachineFunction::copyCallSiteInfo(const MachineInstr *Old, + const MachineInstr *New) { + assert(New->isCall() && "Call site info refers only to call instructions!"); + + CallSiteInfoMap::iterator CSIt = getCallSiteInfo(Old); + if (CSIt == CallSitesInfo.end()) + return; + + CallSiteInfo CSInfo = CSIt->second; + CallSitesInfo[New] = CSInfo; } /// \} Modified: llvm/trunk/lib/CodeGen/MachineOutliner.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/MachineOutliner.cpp?rev=374068&r1=374067&r2=374068&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/MachineOutliner.cpp (original) +++ llvm/trunk/lib/CodeGen/MachineOutliner.cpp Tue Oct 8 08:43:12 2019 @@ -1260,7 +1260,7 @@ bool MachineOutliner::outline(Module &M, true /* isImp = true */)); } if (MI.isCall()) - MI.getMF()->updateCallSiteInfo(&MI); + MI.getMF()->eraseCallSiteInfo(&MI); }; // Copy over the defs in the outlined range. 
// First inst in outlined range <-- Anything that's defined in this Modified: llvm/trunk/lib/CodeGen/PeepholeOptimizer.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/PeepholeOptimizer.cpp?rev=374068&r1=374067&r2=374068&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/PeepholeOptimizer.cpp (original) +++ llvm/trunk/lib/CodeGen/PeepholeOptimizer.cpp Tue Oct 8 08:43:12 2019 @@ -1776,7 +1776,7 @@ bool PeepholeOptimizer::runOnMachineFunc LocalMIs.erase(DefMI); LocalMIs.insert(FoldMI); if (MI->isCall()) - MI->getMF()->updateCallSiteInfo(MI, FoldMI); + MI->getMF()->moveCallSiteInfo(MI, FoldMI); MI->eraseFromParent(); DefMI->eraseFromParent(); MRI->markUsesInDebugValueAsUndef(FoldedReg); Modified: llvm/trunk/lib/CodeGen/TargetInstrInfo.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/TargetInstrInfo.cpp?rev=374068&r1=374067&r2=374068&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/TargetInstrInfo.cpp (original) +++ llvm/trunk/lib/CodeGen/TargetInstrInfo.cpp Tue Oct 8 08:43:12 2019 @@ -143,7 +143,7 @@ TargetInstrInfo::ReplaceTailWithBranchTo while (Tail != MBB->end()) { auto MI = Tail++; if (MI->isCall()) - MBB->getParent()->updateCallSiteInfo(&*MI); + MBB->getParent()->eraseCallSiteInfo(&*MI); MBB->erase(MI); } Modified: llvm/trunk/lib/CodeGen/UnreachableBlockElim.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/UnreachableBlockElim.cpp?rev=374068&r1=374067&r2=374068&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/UnreachableBlockElim.cpp (original) +++ llvm/trunk/lib/CodeGen/UnreachableBlockElim.cpp Tue Oct 8 08:43:12 2019 @@ -151,7 +151,7 @@ bool UnreachableMachineBlockElim::runOnM // Remove any call site information for calls in the block. 
for (auto &I : DeadBlocks[i]->instrs()) if (I.isCall(MachineInstr::IgnoreBundle)) - DeadBlocks[i]->getParent()->updateCallSiteInfo(&I); + DeadBlocks[i]->getParent()->eraseCallSiteInfo(&I); DeadBlocks[i]->eraseFromParent(); } Modified: llvm/trunk/lib/CodeGen/XRayInstrumentation.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/XRayInstrumentation.cpp?rev=374068&r1=374067&r2=374068&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/XRayInstrumentation.cpp (original) +++ llvm/trunk/lib/CodeGen/XRayInstrumentation.cpp Tue Oct 8 08:43:12 2019 @@ -111,7 +111,7 @@ void XRayInstrumentation::replaceRetWith MIB.add(MO); Terminators.push_back(&T); if (T.isCall()) - MF.updateCallSiteInfo(&T); + MF.eraseCallSiteInfo(&T); } } } Modified: llvm/trunk/lib/Target/ARM/ARMExpandPseudoInsts.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMExpandPseudoInsts.cpp?rev=374068&r1=374067&r2=374068&view=diff ============================================================================== --- llvm/trunk/lib/Target/ARM/ARMExpandPseudoInsts.cpp (original) +++ llvm/trunk/lib/Target/ARM/ARMExpandPseudoInsts.cpp Tue Oct 8 08:43:12 2019 @@ -1207,7 +1207,7 @@ bool ARMExpandPseudo::ExpandMI(MachineBa // Update call site info and delete the pseudo instruction TCRETURN. 
- MBB.getParent()->updateCallSiteInfo(&MI, &*NewMI); + MBB.getParent()->moveCallSiteInfo(&MI, &*NewMI); MBB.erase(MBBI); MBBI = NewMI; @@ -1439,7 +1439,7 @@ bool ARMExpandPseudo::ExpandMI(MachineBa MIB.cloneMemRefs(MI); TransferImpOps(MI, MIB, MIB); - MI.getMF()->updateCallSiteInfo(&MI, &*MIB); + MI.getMF()->moveCallSiteInfo(&MI, &*MIB); MI.eraseFromParent(); return true; } Modified: llvm/trunk/lib/Target/X86/X86ExpandPseudo.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ExpandPseudo.cpp?rev=374068&r1=374067&r2=374068&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86ExpandPseudo.cpp (original) +++ llvm/trunk/lib/Target/X86/X86ExpandPseudo.cpp Tue Oct 8 08:43:12 2019 @@ -275,7 +275,7 @@ bool X86ExpandPseudo::ExpandMI(MachineBa MachineInstr &NewMI = *std::prev(MBBI); NewMI.copyImplicitOps(*MBBI->getParent()->getParent(), *MBBI); - MBB.getParent()->updateCallSiteInfo(&*MBBI, &NewMI); + MBB.getParent()->moveCallSiteInfo(&*MBBI, &NewMI); // Delete the pseudo instruction TCRETURN. MBB.erase(MBBI); Modified: llvm/trunk/test/CodeGen/ARM/smml.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/ARM/smml.ll?rev=374068&r1=374067&r2=374068&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/ARM/smml.ll (original) +++ llvm/trunk/test/CodeGen/ARM/smml.ll Tue Oct 8 08:43:12 2019 @@ -8,6 +8,13 @@ ; RUN: llc -mtriple=thumbv7m-eabi %s -o - | FileCheck %s -check-prefix=CHECK -check-prefix=CHECK-V4 ; RUN: llc -mtriple=thumbv7em-eabi %s -o - | FileCheck %s -check-prefix=CHECK -check-prefix=CHECK-THUMBV6T2 +; Next test would previously trigger an assertion responsible for verification of +; call site info state. 
+; RUN: llc -stop-after=if-converter -debug-entry-values -mtriple=thumbv6t2-eabi %s -o -| FileCheck %s -check-prefix=CHECK-CALLSITE +; CHECK-CALLSITE: name: test_used_flags +; CHECK-CALLSITE: callSites: + + define i32 @Test0(i32 %a, i32 %b, i32 %c) nounwind readnone ssp { entry: ; CHECK-LABEL: Test0 Added: llvm/trunk/test/DebugInfo/MIR/ARM/if-coverter-call-site-info.mir URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/DebugInfo/MIR/ARM/if-coverter-call-site-info.mir?rev=374068&view=auto ============================================================================== --- llvm/trunk/test/DebugInfo/MIR/ARM/if-coverter-call-site-info.mir (added) +++ llvm/trunk/test/DebugInfo/MIR/ARM/if-coverter-call-site-info.mir Tue Oct 8 08:43:12 2019 @@ -0,0 +1,165 @@ +# RUN: llc -mtriple=arm-linux-gnu -debug-entry-values -run-pass if-converter %s -o -| FileCheck %s + +# Vefify that the call site info will be updated after the optimization. +# This test case would previously trigger an assertion when +# deleting the call instruction. + +# Test case is generated from: +# extern void +# foo (int* seg, int subseg); +# extern int* mri_common_symbol; +# +# void +# baa (int* secptr, int subseg) +# { +# if (! (secptr == 0 && subseg == 0)) +# foo (secptr, subseg); +# mri_common_symbol = 0; +# } +# +# With slight change of MIR - substitution of BL instruction with BL_pred +# in order to trigger optimization. 
+# clang -target arm-linux-gnu -g -O2 -Xclang -femit-debug-entry-values +# %s -stop-before=if-convert +# +# CHECK: callSites: +# CHECK-NEXT: - { bb: {{.*}}, offset: {{.*}}, fwdArgRegs: +# CHECK-NEXT: - { arg: 0, reg: '$r0' } +# CHECK-NEXT: - { arg: 1, reg: '$r1' } } + +--- | + ; ModuleID = 'if-convert-call-site-info.c' + source_filename = "if-convert-call-site-info.c" + target datalayout = "e-m:e-p:32:32-Fi8-i64:64-v128:64:128-a:0:32-n32-S64" + target triple = "armv6kz-unknown-linux-gnueabihf" + + @mri_common_symbol = external dso_local local_unnamed_addr global i32*, align 4 + + ; Function Attrs: nounwind + define dso_local void @baa(i32* %secptr, i32 %subseg) local_unnamed_addr #0 !dbg !14 { + entry: + call void @llvm.dbg.value(metadata i32* %secptr, metadata !16, metadata !DIExpression()), !dbg !18 + call void @llvm.dbg.value(metadata i32 %subseg, metadata !17, metadata !DIExpression()), !dbg !18 + %cmp = icmp eq i32* %secptr, null, !dbg !19 + %cmp1 = icmp eq i32 %subseg, 0, !dbg !21 + %or.cond = and i1 %cmp, %cmp1, !dbg !22 + br i1 %or.cond, label %if.end, label %if.then, !dbg !22 + + if.then: ; preds = %entry + tail call void @foo(i32* %secptr, i32 %subseg), !dbg !23 + br label %if.end, !dbg !23 + + if.end: ; preds = %entry, %if.then + store i32* null, i32** @mri_common_symbol, align 4, !dbg !24, !tbaa !25 + ret void, !dbg !29 + } + + declare !dbg !4 dso_local void @foo(i32*, i32) local_unnamed_addr + + ; Function Attrs: nounwind readnone speculatable willreturn + declare void @llvm.dbg.value(metadata, metadata, metadata) + + ; Function Attrs: nounwind + declare void @llvm.stackprotector(i8*, i8**) + + attributes #0 = { "frame-pointer"="all" } + + !llvm.dbg.cu = !{!0} + !llvm.module.flags = !{!9, !10, !11, !12} + !llvm.ident = !{!13} + + !0 = distinct !DICompileUnit(language: DW_LANG_C99, file: !1, producer: "clang version 10.0.0 ", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, enums: !2, retainedTypes: !3, nameTableKind: None) + !1 = 
!DIFile(filename: "if-convert-call-site-info.c", directory: "/") + !2 = !{} + !3 = !{!4} + !4 = !DISubprogram(name: "foo", scope: !1, file: !1, line: 10, type: !5, flags: DIFlagPrototyped, spFlags: DISPFlagOptimized, retainedNodes: !2) + !5 = !DISubroutineType(types: !6) + !6 = !{null, !7, !8} + !7 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !8, size: 32) + !8 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed) + !9 = !{i32 2, !"Dwarf Version", i32 4} + !10 = !{i32 2, !"Debug Info Version", i32 3} + !11 = !{i32 1, !"wchar_size", i32 4} + !12 = !{i32 1, !"min_enum_size", i32 4} + !13 = !{!"clang version 10.0.0 "} + !14 = distinct !DISubprogram(name: "baa", scope: !1, file: !1, line: 14, type: !5, scopeLine: 15, flags: DIFlagPrototyped | DIFlagAllCallsDescribed, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !0, retainedNodes: !15) + !15 = !{!16, !17} + !16 = !DILocalVariable(name: "secptr", arg: 1, scope: !14, file: !1, line: 14, type: !7, flags: DIFlagArgumentNotModified) + !17 = !DILocalVariable(name: "subseg", arg: 2, scope: !14, file: !1, line: 14, type: !8, flags: DIFlagArgumentNotModified) + !18 = !DILocation(line: 0, scope: !14) + !19 = !DILocation(line: 16, column: 17, scope: !20) + !20 = distinct !DILexicalBlock(scope: !14, file: !1, line: 16, column: 7) + !21 = !DILocation(line: 16, column: 32, scope: !20) + !22 = !DILocation(line: 16, column: 22, scope: !20) + !23 = !DILocation(line: 17, column: 5, scope: !20) + !24 = !DILocation(line: 18, column: 21, scope: !14) + !25 = !{!26, !26, i64 0} + !26 = !{!"any pointer", !27, i64 0} + !27 = !{!"omnipotent char", !28, i64 0} + !28 = !{!"Simple C/C++ TBAA"} + !29 = !DILocation(line: 19, column: 1, scope: !14) + +... 
+--- +name: baa +alignment: 2 +tracksRegLiveness: true +liveins: + - { reg: '$r0' } + - { reg: '$r1' } +frameInfo: + stackSize: 8 + maxAlignment: 4 + adjustsStack: true + hasCalls: true + maxCallFrameSize: 0 +stack: + - { id: 0, type: spill-slot, offset: -4, size: 4, alignment: 4, callee-saved-register: '$lr', + callee-saved-restored: false } + - { id: 1, type: spill-slot, offset: -8, size: 4, alignment: 4, callee-saved-register: '$r11' } +callSites: + - { bb: 2, offset: 0, fwdArgRegs: + - { arg: 0, reg: '$r0' } + - { arg: 1, reg: '$r1' } } +constants: + - id: 0 + value: 'i32** null' + alignment: 4 +machineFunctionInfo: {} +body: | + bb.0.entry: + successors: %bb.1(0x60000000), %bb.2(0x20000000) + liveins: $r0, $r1, $lr + + DBG_VALUE $r0, $noreg, !16, !DIExpression(), debug-location !18 + DBG_VALUE $r0, $noreg, !16, !DIExpression(), debug-location !18 + DBG_VALUE $r1, $noreg, !17, !DIExpression(), debug-location !18 + DBG_VALUE $r1, $noreg, !17, !DIExpression(), debug-location !18 + $sp = frame-setup STMDB_UPD $sp, 14, $noreg, killed $r11, killed $lr + frame-setup CFI_INSTRUCTION def_cfa_offset 8 + frame-setup CFI_INSTRUCTION offset $lr, -4 + frame-setup CFI_INSTRUCTION offset $r11, -8 + $r11 = frame-setup MOVr killed $sp, 14, $noreg, $noreg + frame-setup CFI_INSTRUCTION def_cfa_register $r11 + CMPri renamable $r0, 0, 14, $noreg, implicit-def $cpsr, debug-location !22 + Bcc %bb.2, 1, killed $cpsr, debug-location !22 + + bb.1.entry: + successors: %bb.3(0x55555555), %bb.2(0x2aaaaaab) + liveins: $r0, $r1 + + CMPri renamable $r1, 0, 14, $noreg, implicit-def $cpsr, debug-location !22 + Bcc %bb.3, 0, killed $cpsr, debug-location !22 + + bb.2.if.then: + liveins: $r0, $r1 + + BL_pred @foo, 14, $noreg, csr_aapcs, implicit-def dead $lr, implicit $sp, implicit $r0, implicit $r1, implicit-def $sp, debug-location !23 + + bb.3.if.end: + renamable $r0 = LDRi12 %const.0, 0, 14, $noreg, debug-location !24 :: (load 4 from constant-pool) + renamable $r1 = MOVi 0, 14, $noreg, $noreg + 
STRi12 killed renamable $r1, killed renamable $r0, 0, 14, $noreg, debug-location !24 :: (store 4 into @mri_common_symbol, !tbaa !25) + $sp = LDMIA_RET $sp, 14, $noreg, def $r11, def $pc, debug-location !29 + +... From llvm-commits at lists.llvm.org Tue Oct 8 08:45:36 2019 From: llvm-commits at lists.llvm.org (David Carlier via llvm-commits) Date: Tue, 08 Oct 2019 15:45:36 -0000 Subject: [compiler-rt] r374070 - [builtins] Unbreak build on FreeBSD armv7 after D60351 Message-ID: <20191008154536.241478875F@lists.llvm.org> Author: devnexen Date: Tue Oct 8 08:45:35 2019 New Revision: 374070 URL: http://llvm.org/viewvc/llvm-project?rev=374070&view=rev Log: [builtins] Unbreak build on FreeBSD armv7 after D60351 headers include reordering. Reviewers: phosek, echristo Reviewed-By: phosek Differential Revsion: https://reviews.llvm.org/D68045 Modified: compiler-rt/trunk/lib/builtins/atomic.c compiler-rt/trunk/lib/builtins/clear_cache.c Modified: compiler-rt/trunk/lib/builtins/atomic.c URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/builtins/atomic.c?rev=374070&r1=374069&r2=374070&view=diff ============================================================================== --- compiler-rt/trunk/lib/builtins/atomic.c (original) +++ compiler-rt/trunk/lib/builtins/atomic.c Tue Oct 8 08:45:35 2019 @@ -51,9 +51,11 @@ static const long SPINLOCK_MASK = SPINLO //////////////////////////////////////////////////////////////////////////////// #ifdef __FreeBSD__ #include -#include +// clang-format off #include +#include #include +// clang-format on typedef struct _usem Lock; __inline static void unlock(Lock *l) { __c11_atomic_store((_Atomic(uint32_t) *)&l->_count, 1, __ATOMIC_RELEASE); Modified: compiler-rt/trunk/lib/builtins/clear_cache.c URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/builtins/clear_cache.c?rev=374070&r1=374069&r2=374070&view=diff ============================================================================== --- 
compiler-rt/trunk/lib/builtins/clear_cache.c (original) +++ compiler-rt/trunk/lib/builtins/clear_cache.c Tue Oct 8 08:45:35 2019 @@ -23,8 +23,10 @@ uintptr_t GetCurrentProcess(void); #endif #if defined(__FreeBSD__) && defined(__arm__) -#include +// clang-format off #include +#include +// clang-format on #endif #if defined(__NetBSD__) && defined(__arm__) @@ -32,8 +34,10 @@ uintptr_t GetCurrentProcess(void); #endif #if defined(__OpenBSD__) && defined(__mips__) -#include +// clang-format off #include +#include +// clang-format on #endif #if defined(__linux__) && defined(__mips__)

From llvm-commits at lists.llvm.org Tue Oct 8 08:45:16 2019 From: llvm-commits at lists.llvm.org (Digger via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 15:45:16 +0000 (UTC) Subject: [PATCH] D68650: [AIX][XCOFF][NFC] Change the SectionLen field name of CSect Auxiliary entry to SectionOrLength. Message-ID:

DiggerLin created this revision. DiggerLin added reviewers: hubert.reinterpretcast, sfertile, jasonliu. Herald added subscribers: llvm-commits, seiya, rupprecht. Herald added a project: LLVM. DiggerLin retitled this revision from "[AIX][NFC] Change the SectionLen field name of CSect Auxiliary entry to SectionOrLength." to "[AIX][XCOFF][NFC] Change the SectionLen field name of CSect Auxiliary entry to SectionOrLength.".

According to the XCOFF document, the meaning of x_scnlen depends on the symbol type:

  If      Then
  XTY_SD  x_scnlen contains the csect length.
  XTY_LD  x_scnlen contains the symbol table index of the containing csect.
  XTY_CM  x_scnlen contains the csect length.
  XTY_ER  x_scnlen contains 0.

Changing the SectionLen member name to SectionOrLength is therefore more reasonable.
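The symbol-type rule in the description above can be sketched in standalone C++. This is a minimal illustration under stated assumptions, not the LLVM implementation: the struct `CsectAuxEnt`, the function `describeSectionOrLength`, the numeric values given to the `XTY_*` enumerators, and the mask value `0x07` are all declared locally for the example, and the field is kept in host byte order rather than as `support::ubig32_t`.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Assumption: symbol-type codes as commonly listed for XCOFF; declared
// locally here, not taken from the LLVM headers.
enum SymbolType : uint8_t { XTY_ER = 0, XTY_SD = 1, XTY_LD = 2, XTY_CM = 3 };

// Assumption: the low three bits of the combined byte hold the symbol type.
constexpr uint8_t SymbolTypeMask = 0x07;

// Hypothetical, simplified stand-in for the csect auxiliary entry.
struct CsectAuxEnt {
  uint32_t SectionOrLength;       // host byte order, for illustration only
  uint8_t SymbolAlignmentAndType; // alignment in the high bits, type in the low bits
};

// Describes what SectionOrLength means for this entry's symbol type.
std::string describeSectionOrLength(const CsectAuxEnt &Ent) {
  switch (Ent.SymbolAlignmentAndType & SymbolTypeMask) {
  case XTY_SD:
  case XTY_CM:
    // For these types the field is the csect length.
    return "csect length " + std::to_string(Ent.SectionOrLength);
  case XTY_LD:
    // For a label definition the field is a symbol table index.
    return "containing csect symbol index " +
           std::to_string(Ent.SectionOrLength);
  case XTY_ER:
    // For an external reference the field is expected to be 0.
    return "unused (expected 0)";
  default:
    return "unknown symbol type";
  }
}

// Small self-check showing that the same raw value reads differently
// depending on the symbol type.
inline void checkExamples() {
  assert(describeSectionOrLength(CsectAuxEnt{64, 0x19}) == "csect length 64");
  assert(describeSectionOrLength(CsectAuxEnt{7, 0x02}) ==
         "containing csect symbol index 7");
}
```

The same 32-bit value is reported either as a length or as a symbol index, which is the ambiguity the proposed name `SectionOrLength` is meant to make visible.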
Repository: rL LLVM https://reviews.llvm.org/D68650 Files: llvm/include/llvm/Object/XCOFFObjectFile.h llvm/tools/llvm-readobj/XCOFFDumper.cpp Index: llvm/tools/llvm-readobj/XCOFFDumper.cpp =================================================================== --- llvm/tools/llvm-readobj/XCOFFDumper.cpp +++ llvm/tools/llvm-readobj/XCOFFDumper.cpp @@ -213,9 +213,9 @@ W.printNumber("Index", Obj.getSymbolIndex(reinterpret_cast(AuxEntPtr))); if ((AuxEntPtr->SymbolAlignmentAndType & SymbolTypeMask) == XCOFF::XTY_LD) - W.printNumber("ContainingCsectSymbolIndex", AuxEntPtr->SectionLen); + W.printNumber("ContainingCsectSymbolIndex", AuxEntPtr->SectionOrLength); else - W.printNumber("SectionLen", AuxEntPtr->SectionLen); + W.printNumber("SectionLen", AuxEntPtr->SectionOrLength); W.printHex("ParameterHashIndex", AuxEntPtr->ParameterHashIndex); W.printHex("TypeChkSectNum", AuxEntPtr->TypeChkSectNum); // Print out symbol alignment and type. Index: llvm/include/llvm/Object/XCOFFObjectFile.h =================================================================== --- llvm/include/llvm/Object/XCOFFObjectFile.h +++ llvm/include/llvm/Object/XCOFFObjectFile.h @@ -113,7 +113,11 @@ }; struct XCOFFCsectAuxEnt32 { - support::ubig32_t SectionLen; + support::ubig32_t + SectionOrLength; // If the symbol type is XTY_SD,XTY_CM,it contains the + // csect length. If the symbol type is XTY_LD, it + // contains the symbol table index of the containing + // csect. If the symbol type is XTY_ER, it contains 0. support::ubig32_t ParameterHashIndex; support::ubig16_t TypeChkSectNum; uint8_t SymbolAlignmentAndType; -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68650.223876.patch Type: text/x-patch Size: 1599 bytes Desc: not available URL:

From llvm-commits at lists.llvm.org Tue Oct 8 08:45:16 2019 From: llvm-commits at lists.llvm.org (Owen Reynolds via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 15:45:16 +0000 (UTC) Subject: [PATCH] D68472: [test] Depend on C.UTF-8 dependency for mri-utf8.test In-Reply-To: References: Message-ID:

gbreynoo added a comment.

In D68472#1698154, @thopre wrote:
> In D68472#1697691, @hubert.reinterpretcast wrote:
>
> > Thanks for adding the BOM. With the BOM, would it make sense to leave `mri-utf8.test` as the name of the file?
>
> I think the test file name should reflect what is being tested, since that's the test identifier (i.e. when a test fails, lit prints the relative file path), so the fact that the file is encoded in UTF-8 is irrelevant. Here the test is about llvm-ar handling non-ASCII filenames, as the first comment explains. How the .txt file is encoded would make a bit more sense as a name, but then, as I mentioned, AFAIK the filename is encoded in UTF-16 on Windows anyway. In summary, I think the renaming is warranted.

I agree that the new name makes more sense. LGTM once the comment is fixed.

Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68472/new/ https://reviews.llvm.org/D68472

From llvm-commits at lists.llvm.org Tue Oct 8 08:45:16 2019 From: llvm-commits at lists.llvm.org (Heejin Ahn via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 15:45:16 +0000 (UTC) Subject: [PATCH] D68527: [WebAssembly] v8x16.swizzle and rewrite BUILD_VECTOR lowering In-Reply-To: References: Message-ID: <0a8b57c3b2b23ad14d2ff21e97c1e9e8@localhost.localdomain>

aheejin added a comment.

- I remember that we previously had somewhat complicated logic to calculate the total size in bytes of the instructions for the case where we use `v128.const` versus the case where we use splats. Don't we need that anymore? Can we make the decision solely based on the number of swizzles, consts, and splats?
- Is the performance of `v128.const` better than splats? How is the performance of swizzles compared to `v128.const`?

================ Comment at: llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp:1350 } + }; + ----------------
Would using `count_if` in place of `find_if` be simpler?

================ Comment at: llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp:1384 + SDValue SwizzleSrc; + SDValue SwizzleIndices; + size_t NumSwizzleLanes = 0; ----------------
Nit: Variable names for the same things in `GetSwizzleSrcs` are `SrcVec` and `IndexVec`. Making the variable names the same in the two places might make reading easier.

================ Comment at: llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp:1388 + std::forward_as_tuple(std::tie(SwizzleSrc, SwizzleIndices), + NumSwizzleLanes) = GetMostCommon(SwizzleCounts); + ----------------
Is using `forward_as_tuple` any different from using `tie` again in this case, given that this is not passed as an argument to a function?

================ Comment at: llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp:1426 + (SplattedLoad = dyn_cast(SplatValue)) && + SplattedLoad->getMemoryVT() == VecT.getVectorElementType()) { + Result = DAG.getNode(WebAssemblyISD::LOAD_SPLAT, DL, VecT, SplatValue); ----------------
It's not in this CL, but is there a case where this condition is not satisfied?

Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68527/new/ https://reviews.llvm.org/D68527

From llvm-commits at lists.llvm.org Tue Oct 8 08:54:19 2019 From: llvm-commits at lists.llvm.org (Evandro Menezes via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 15:54:19 +0000 (UTC) Subject: [PATCH] D68257: [Support] Add mathematical constants In-Reply-To: References: Message-ID: <4c7d72765b4960182f88272c9d566f09@localhost.localdomain>

evandro marked 3 inline comments as done. evandro added inline comments.
================ Comment at: llvm/include/llvm/Support/MathExtras.h:66 + inv_sqrtpi = 0.5641895835477563, // https://oeis.org/A087197 + sqrt2 = 1.414213562373095, // https://oeis.org/A002193 + inv_sqrt2 = 0.7071067811865475, ----------------
efriedma wrote:
> evandro wrote:
> > efriedma wrote:
> > > The correct value of sqrt(2) in double-precision is 1.4142135623730951.
> > >
> > > And now I don't trust any of the other values...
> > `double` has a precision of 15 or 16 significant digits. I don't understand why you are suggesting 17 significant digits when you asked to trim the precision down.
> >
> > Besides, the reference I provided states that this value is 1.41421356237309505. Whether it's rounded to 1.4142135623730950 or 1.4142135623730951 is a bit moot, IMO.
> I asked for "the smallest number of digits required to produce the correct double-precision result". This is what you get if, for example, you ask Python 2.7 or later to convert the value to a string with `repr()` (`printf "import math\nprint(repr(math.sqrt(2)))" | python`). `1.414213562373095` produces a value that's different by one ulp.
>
> Yes, a one ulp difference is unlikely to matter for most uses, but if we're going to take the time to define these, we should define them correctly.

You're assuming that Python is correct. `bc` says 1.41421356237309504880. glibc's `math.h` says 1.41421356237309504880 as well. And none of these is the same as your 1.4142135623730951.

As I said, the precision of `double` is 15 to 16 digits and of `float`, 6 to 7 digits. `math.h` defines them with 20 digits, which is probably an agreeable precision, yes? But I believe that we can all live with a difference of ±1ulp.
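The literals argued over in this comment can be compared directly. The following is a minimal sketch, assuming `double` is IEEE 754 binary64 and that the compiler rounds decimal literals to the nearest representable value; the names `Sqrt2Patch`, `Sqrt2Review`, `Sqrt2Long`, `differByOneUlp` and `checkSqrt2Literals` are made up for the illustration and are not part of `MathExtras.h`.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical names for the three literals quoted in the thread.
constexpr double Sqrt2Patch = 1.414213562373095;      // 16 digits, from the patch
constexpr double Sqrt2Review = 1.4142135623730951;    // 17 digits, from the review
constexpr double Sqrt2Long = 1.41421356237309504880;  // 20 digits, bc / math.h

// 17 significant decimal digits are enough to round-trip any binary64 value.
static_assert(std::numeric_limits<double>::max_digits10 == 17,
              "sketch assumes IEEE 754 binary64 doubles");

// Returns true when A and B are distinct, adjacent representable doubles.
inline bool differByOneUlp(double A, double B) {
  return A != B && std::nextafter(A, B) == B;
}

// Compares the literals after they have been rounded to double.
inline void checkSqrt2Literals() {
  // The 17-digit and 20-digit spellings round to the same double.
  assert(Sqrt2Review == Sqrt2Long);
  // The 16-digit spelling rounds to the neighbouring double.
  assert(differByOneUlp(Sqrt2Patch, Sqrt2Review));
  // A correctly rounded square root agrees with the 17-digit spelling.
  assert(std::sqrt(2.0) == Sqrt2Review);
}
```

Under these assumptions the 20-digit `math.h` spelling and the 17-digit spelling denote the same `double`, while the 16-digit spelling lands on the adjacent value one ulp below.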
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68257/new/ https://reviews.llvm.org/D68257 From llvm-commits at lists.llvm.org Tue Oct 8 08:54:20 2019 From: llvm-commits at lists.llvm.org (Heejin Ahn via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 15:54:20 +0000 (UTC) Subject: [PATCH] D68531: [WebAssembly] Add builtin and intrinsic for v8x16.swizzle In-Reply-To: References: Message-ID: aheejin accepted this revision. aheejin added a comment. This revision is now accepted and ready to land. > LLVM produces a poison value if the dynamic swizzle indices are greater than the vector size, but the WebAssembly instruction sets the corresponding output lane to zero. Where do we set those undef or poison lanes to zero? ================ Comment at: clang/include/clang/Basic/BuiltinsWebAssembly.def:63 // SIMD builtins +TARGET_BUILTIN(__builtin_wasm_swizzle_v8x16, "V16cV16cV16c", "nc", "unimplemented-simd128") + ---------------- Is the second indices vector always v8x16 too? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68531/new/ https://reviews.llvm.org/D68531 From llvm-commits at lists.llvm.org Tue Oct 8 08:54:45 2019 From: llvm-commits at lists.llvm.org (David CARLIER via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 15:54:45 +0000 (UTC) Subject: [PATCH] D68045: [builtins] Unbreak build on FreeBSD armv7 after D60351 In-Reply-To: References: Message-ID: <97f634a784736bcb788fdb9cc691aa81@localhost.localdomain> devnexen added a comment. In D68045#1699532 , @jbeich wrote: > Can someone land this change? Only tested on FreeBSD because contributing guide didn't cover how to elsewhere i.e., on platforms one doesn't have access to. Merged now. 
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68045/new/ https://reviews.llvm.org/D68045

From llvm-commits at lists.llvm.org Tue Oct 8 09:04:20 2019 From: llvm-commits at lists.llvm.org (Paul Robinson via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 16:04:20 +0000 (UTC) Subject: [PATCH] D68465: [DebugInfo] Trim call-clobbered location list entries when tuning for GDB In-Reply-To: References: Message-ID:

probinson added a comment.

In D68465#1698682, @dblaikie wrote:
> In D68465#1697824, @probinson wrote:
>
> > @dblaikie I'm also not clear what you're suggesting about a .debug_addr entry plus offset. DW_LLE_offset_pair does this, derived from the base address, which ought to be available for any given function, assuming DWARF v5. Can you explain more clearly what's missing?
>
> Right - for loclists there's no need for new forms, etc. It was specifically related to the other review related to this that modifies the in-memory representation of the debug_addr (llvm::AddressPool) - which I assume meant a difference in output in the address pool, but it seems it doesn't add the offset inside the pool (but may end up with redundant entries in the pool, which should be fixed in any case).

I agree, address pool management should be able to eliminate redundant entries.

> My point was generally that the debug_addr section shouldn't be including addresses with offsets; it should be the places that refer to debug_addr that use the offsets. The specific place I'd like to use offsets would be from FORM_addr in debug_info. But, yes, in this case, the support for debug addr references from loclists, the forms are already sufficiently descriptive for this.

Ah, got it. The problem you're facing is that for a DIE, a FORM describes one value, while you want to describe two: the address (or index into .debug_addr) and a separate offset. For a DIE attribute, this would normally be done using an expression. Would it work to have (say) DW_AT_low_pc be allowed to have class exprloc? (It currently must have class address, either FORM_addr or one of the FORM_addrx's.) The expression can index into .debug_addr and then add an offset to the result.

Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68465/new/ https://reviews.llvm.org/D68465

From llvm-commits at lists.llvm.org Tue Oct 8 09:04:20 2019 From: llvm-commits at lists.llvm.org (Dave Green via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 16:04:20 +0000 (UTC) Subject: [PATCH] D68651: [InstCombine] Signed saturation patterns Message-ID:

dmgreen created this revision. dmgreen added reviewers: spatel, nikic, lebedev.ri, efriedma. Herald added a subscriber: hiraditya. Herald added a project: LLVM.

This adds an instcombine match for code that attempts to perform signed saturating arithmetic by casting to a higher type. For example: https://godbolt.org/z/9knBnP

As can be seen, the unsigned cases are already matched, but the signed ones are not. Adding them for these cases involves matching the min(max(add a b)) nodes, with proper truncs and extends.

There is some work in D68643 to make the default lowering for these better. With that, I believe that the intrinsic is a better choice than extending into a larger type (although it's obviously hard to tell for all architectures). This also helps with vectorization, especially in DSP routines that often want to use saturating arithmetic.

This also adds an m_SpecificAPInt matcher, which seemed to be useful. It is similar to m_SpecificInt but takes an APInt, for cases like this where using an APInt makes sense.

https://reviews.llvm.org/D68651 Files: llvm/include/llvm/IR/PatternMatch.h llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp llvm/test/Transforms/InstCombine/sadd_sat.ll

-------------- next part -------------- A non-text attachment was scrubbed...
Name: D68651.223875.patch Type: text/x-patch Size: 15510 bytes Desc: not available URL:

From llvm-commits at lists.llvm.org Tue Oct 8 09:04:20 2019 From: llvm-commits at lists.llvm.org (Saleem Abdulrasool via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 16:04:20 +0000 (UTC) Subject: [PATCH] D66904: [libunwind] Fix memory leak in handling of DW_CFA_remember_state and DW_CFA_restore_state In-Reply-To: References: Message-ID:

compnerd added inline comments. Herald added a reviewer: mclow.lists.

================ Comment at: libunwind/test/remember_state_leak.pass.sh.s:1 +# REQUIRES: linux +# RUN: %build ----------------
jgorbe wrote:
> compnerd wrote:
> > Can you add a `REQUIRES: x86` as well please? This shouldn't be executed on non-x86 platforms. I think that explicitly listing the triple is a good idea in the build rules (you could default to x86 builds which won't work with x86_64). I think that we *could* add a requirement on `ASAN` as well, but, that isn't as big of a concern.
> I agree that this shouldn't run on non-x86. Thanks for catching this. That said, shouldn't it be `REQUIRES: x86_64` or similar? The asm code below uses 64-bit registers.
>
> Anyway, I just tried both `REQUIRES: x86` and `REQUIRES: x86_64`, and when I run the tests with `ninja check-unwind` on my x86_64 machine the test doesn't run and it's counted within "Unsupported Tests" in the final summary instead.
>
> What do you mean by listing the triple in the build rules? I can't find any example of it in the existing tests and my knowledge of the testing infrastructure has a lot of gaps :(

Nope, it should be `x86`. The backend is called x86 and supports (kinda) 16-bit, 32-bit and 64-bit code.
As to the triple, what I mean is that we should add a `-target x86_64-unknown-linux-gnu` to the build invocation (you should be able to add that after `%build`) ================ Comment at: libunwind/test/remember_state_leak.pass.sh.s:39 +.cfi_def_cfa_register %rbp +subq $48, %rsp +.cfi_remember_state ---------------- jgorbe wrote: > compnerd wrote: > > Mind extracting this value out into a macro (`SIZEOF_UNWIND_EXCEPTION`) so others can easily identify where the value comes from? > sizeof(_Unwind_Exception) seems to be 32 instead. I didn't build the original C++ file with optimizations, so there was probably a frame pointer in there too. > > I have rebuilt the example with -O2 and now it's subtracting 40 from rsp, which I believe is equal to 8 bytes to align the stack to a 16-byte boundary plus 32 for the exception object. Would you prefer to have two `sub` instructions instead, one that adjusts the stack and another one that reserves 32 additional bytes using a `SIZEOF_UNWIND_EXCEPTION` macro? (and then two matching `add` instructions in the epilog?) I think that the readability of the test is more important, so lets go with the two subs. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D66904/new/ https://reviews.llvm.org/D66904 From llvm-commits at lists.llvm.org Tue Oct 8 09:15:39 2019 From: llvm-commits at lists.llvm.org (Heejin Ahn via llvm-commits) Date: Tue, 08 Oct 2019 16:15:39 -0000 Subject: [llvm] r374073 - [WebAssembly] Fix a bug in 'try' placement Message-ID: <20191008161539.BDC968EC43@lists.llvm.org> Author: aheejin Date: Tue Oct 8 09:15:39 2019 New Revision: 374073 URL: http://llvm.org/viewvc/llvm-project?rev=374073&view=rev Log: [WebAssembly] Fix a bug in 'try' placement Summary: When searching for local expression tree created by stackified registers, for 'block' placement, we start the search from the previous instruction of a BB's terminator. 
But in 'try''s case, we should start from the previous instruction of a call that can throw, or a EH_LABEL that precedes the call, because the return values of the call's previous instructions can be stackified and consumed by the throwing call. For example, ``` i32.call @foo call @bar ; may throw br $label0 ``` In this case, if we start the search from the previous instruction of the terminator (`br` here), we end up stopping at `call @bar` and place a 'try' between `i32.call @foo` and `call @bar`, because `call @bar` does not have a return value so it is not a local expression tree of `br`. But in this case, unlike when placing 'block's, we should start the search from `call @bar`, because the return value of `i32.call @foo` is stackified and used by `call @bar`. Reviewers: dschuff Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68619 Modified: llvm/trunk/lib/Target/WebAssembly/WebAssemblyCFGStackify.cpp llvm/trunk/test/CodeGen/WebAssembly/cfg-stackify-eh.ll Modified: llvm/trunk/lib/Target/WebAssembly/WebAssemblyCFGStackify.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/WebAssembly/WebAssemblyCFGStackify.cpp?rev=374073&r1=374072&r2=374073&view=diff ============================================================================== --- llvm/trunk/lib/Target/WebAssembly/WebAssemblyCFGStackify.cpp (original) +++ llvm/trunk/lib/Target/WebAssembly/WebAssemblyCFGStackify.cpp Tue Oct 8 09:15:39 2019 @@ -526,40 +526,50 @@ void WebAssemblyCFGStackify::placeTryMar AfterSet.insert(&MI); } - // Local expression tree should go after the TRY. 
- for (auto I = Header->getFirstTerminator(), E = Header->begin(); I != E; - --I) { - if (std::prev(I)->isDebugInstr() || std::prev(I)->isPosition()) - continue; - if (WebAssembly::isChild(*std::prev(I), MFI)) - AfterSet.insert(&*std::prev(I)); - else - break; - } - // If Header unwinds to MBB (= Header contains 'invoke'), the try block should // contain the call within it. So the call should go after the TRY. The // exception is when the header's terminator is a rethrow instruction, in // which case that instruction, not a call instruction before it, is gonna // throw. + MachineInstr *ThrowingCall = nullptr; if (MBB.isPredecessor(Header)) { auto TermPos = Header->getFirstTerminator(); if (TermPos == Header->end() || TermPos->getOpcode() != WebAssembly::RETHROW) { - for (const auto &MI : reverse(*Header)) { + for (auto &MI : reverse(*Header)) { if (MI.isCall()) { AfterSet.insert(&MI); + ThrowingCall = &MI; // Possibly throwing calls are usually wrapped by EH_LABEL // instructions. We don't want to split them and the call. if (MI.getIterator() != Header->begin() && - std::prev(MI.getIterator())->isEHLabel()) + std::prev(MI.getIterator())->isEHLabel()) { AfterSet.insert(&*std::prev(MI.getIterator())); + ThrowingCall = &*std::prev(MI.getIterator()); + } break; } } } } + // Local expression tree should go after the TRY. + // For BLOCK placement, we start the search from the previous instruction of a + // BB's terminator, but in TRY's case, we should start from the previous + // instruction of a call that can throw, or a EH_LABEL that precedes the call, + // because the return values of the call's previous instructions can be + // stackified and consumed by the throwing call. + auto SearchStartPt = ThrowingCall ? 
MachineBasicBlock::iterator(ThrowingCall) + : Header->getFirstTerminator(); + for (auto I = SearchStartPt, E = Header->begin(); I != E; --I) { + if (std::prev(I)->isDebugInstr() || std::prev(I)->isPosition()) + continue; + if (WebAssembly::isChild(*std::prev(I), MFI)) + AfterSet.insert(&*std::prev(I)); + else + break; + } + // Add the TRY. auto InsertPos = getLatestInsertPos(Header, BeforeSet, AfterSet); MachineInstr *Begin = Modified: llvm/trunk/test/CodeGen/WebAssembly/cfg-stackify-eh.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/WebAssembly/cfg-stackify-eh.ll?rev=374073&r1=374072&r2=374073&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/WebAssembly/cfg-stackify-eh.ll (original) +++ llvm/trunk/test/CodeGen/WebAssembly/cfg-stackify-eh.ll Tue Oct 8 09:15:39 2019 @@ -704,14 +704,41 @@ ehcleanup: cleanupret from %0 unwind to caller } +; Tests if 'try' marker is placed correctly. In this test, 'try' should be +; placed before the call to 'nothrow_i32' and not between the call to +; 'nothrow_i32' and 'fun', because the return value of 'nothrow_i32' is +; stackified and pushed onto the stack to be consumed by the call to 'fun'. 
+ +; CHECK-LABEL: test11 +; CHECK: try +; CHECK: i32.call $push{{.*}}=, nothrow_i32 +; CHECK: call fun, $pop{{.*}} +define void @test11() personality i8* bitcast (i32 (...)* @__gxx_wasm_personality_v0 to i8*) { +entry: + %call = call i32 @nothrow_i32() + invoke void @fun(i32 %call) + to label %invoke.cont unwind label %terminate + +invoke.cont: ; preds = %entry + ret void + +terminate: ; preds = %entry + %0 = cleanuppad within none [] + %1 = tail call i8* @llvm.wasm.get.exception(token %0) + call void @__clang_call_terminate(i8* %1) [ "funclet"(token %0) ] + unreachable +} + ; Check if the unwind destination mismatch stats are correct ; NOSORT-STAT: 11 wasm-cfg-stackify - Number of EH pad unwind mismatches found declare void @foo() declare void @bar() declare i32 @baz() +declare void @fun(i32) ; Function Attrs: nounwind declare void @nothrow(i32) #0 +declare i32 @nothrow_i32() #0 ; Function Attrs: nounwind declare %class.Object* @_ZN6ObjectD2Ev(%class.Object* returned) #0 declare i32 @__gxx_wasm_personality_v0(...) From llvm-commits at lists.llvm.org Tue Oct 8 09:13:34 2019 From: llvm-commits at lists.llvm.org (Hubert Tong via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 16:13:34 +0000 (UTC) Subject: [PATCH] D68650: [AIX][XCOFF][NFC] Change the SectionLen field name of CSect Auxiliary entry to SectionOrLength. In-Reply-To: References: Message-ID: <4d828d2a16fefda7c0f83a36a5798632@localhost.localdomain> hubert.reinterpretcast added a comment. Thanks @DiggerLin. Just minor comments. ================ Comment at: llvm/include/llvm/Object/XCOFFObjectFile.h:117 + support::ubig32_t + SectionOrLength; // If the symbol type is XTY_SD,XTY_CM,it contains the + // csect length. 
If the symbol type is XTY_LD, it ---------------- s/it contains//g; s/XTY_SD,XTY_CM/XTY_SD or XTY_CM/; s/is XTY_ER/is XTY_ER/; Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68650/new/ https://reviews.llvm.org/D68650 From llvm-commits at lists.llvm.org Tue Oct 8 09:13:34 2019 From: llvm-commits at lists.llvm.org (Roman Lebedev via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 16:13:34 +0000 (UTC) Subject: [PATCH] D68651: [InstCombine] Signed saturation patterns In-Reply-To: References: Message-ID: <7e8790dddf6f5102cbe7e8cf8c0066af@localhost.localdomain> lebedev.ri added a comment. Do we already form these saturating intrinsics in instcombine, if they are in their native format? Can that code be extended to also use this knowledge about truncation? CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68651/new/ https://reviews.llvm.org/D68651 From llvm-commits at lists.llvm.org Tue Oct 8 09:16:26 2019 From: llvm-commits at lists.llvm.org (Amaury Sechet via llvm-commits) Date: Tue, 08 Oct 2019 16:16:26 -0000 Subject: [llvm] r374074 - (Re)generate various tests. NFC Message-ID: <20191008161626.CD5128F5EA@lists.llvm.org> Author: deadalnix Date: Tue Oct 8 09:16:26 2019 New Revision: 374074 URL: http://llvm.org/viewvc/llvm-project?rev=374074&view=rev Log: (Re)generate various tests. 
NFC Modified: llvm/trunk/test/CodeGen/AArch64/arm64-rev.ll llvm/trunk/test/CodeGen/AMDGPU/lshr.v2i16.ll llvm/trunk/test/CodeGen/AMDGPU/shl.v2i16.ll llvm/trunk/test/CodeGen/ARM/rev.ll llvm/trunk/test/CodeGen/Thumb/rev.ll Modified: llvm/trunk/test/CodeGen/AArch64/arm64-rev.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AArch64/arm64-rev.ll?rev=374074&r1=374073&r2=374074&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/AArch64/arm64-rev.ll (original) +++ llvm/trunk/test/CodeGen/AArch64/arm64-rev.ll Tue Oct 8 09:16:26 2019 @@ -8,10 +8,11 @@ define i32 @test_rev_w(i32 %a) nounwind ; CHECK: // %bb.0: // %entry ; CHECK-NEXT: rev w0, w0 ; CHECK-NEXT: ret -; GISEL-LABEL: test_rev_w: -; GISEL: // %bb.0: // %entry -; GISEL-NEXT: rev w0, w0 -; GISEL-NEXT: ret +; +; FALLBACK-LABEL: test_rev_w: +; FALLBACK: // %bb.0: // %entry +; FALLBACK-NEXT: rev w0, w0 +; FALLBACK-NEXT: ret entry: %0 = tail call i32 @llvm.bswap.i32(i32 %a) ret i32 %0 @@ -23,10 +24,11 @@ define i64 @test_rev_x(i64 %a) nounwind ; CHECK: // %bb.0: // %entry ; CHECK-NEXT: rev x0, x0 ; CHECK-NEXT: ret -; GISEL-LABEL: test_rev_x: -; GISEL: // %bb.0: // %entry -; GISEL-NEXT: rev x0, x0 -; GISEL-NEXT: ret +; +; FALLBACK-LABEL: test_rev_x: +; FALLBACK: // %bb.0: // %entry +; FALLBACK-NEXT: rev x0, x0 +; FALLBACK-NEXT: ret entry: %0 = tail call i64 @llvm.bswap.i64(i64 %a) ret i64 %0 @@ -40,6 +42,13 @@ define i32 @test_rev_w_srl16(i16 %a) { ; CHECK-NEXT: and w8, w0, #0xffff ; CHECK-NEXT: rev16 w0, w8 ; CHECK-NEXT: ret +; +; FALLBACK-LABEL: test_rev_w_srl16: +; FALLBACK: // %bb.0: // %entry +; FALLBACK-NEXT: and w8, w0, #0xffff +; FALLBACK-NEXT: rev w8, w8 +; FALLBACK-NEXT: lsr w0, w8, #16 +; FALLBACK-NEXT: ret entry: %0 = zext i16 %a to i32 %1 = tail call i32 @llvm.bswap.i32(i32 %0) @@ -53,6 +62,13 @@ define i32 @test_rev_w_srl16_load(i16 *% ; CHECK-NEXT: ldrh w8, [x0] ; CHECK-NEXT: rev16 w0, w8 ; CHECK-NEXT: ret +; +; 
FALLBACK-LABEL: test_rev_w_srl16_load: +; FALLBACK: // %bb.0: // %entry +; FALLBACK-NEXT: ldrh w8, [x0] +; FALLBACK-NEXT: rev w8, w8 +; FALLBACK-NEXT: lsr w0, w8, #16 +; FALLBACK-NEXT: ret entry: %0 = load i16, i16 *%a %1 = zext i16 %0 to i32 @@ -68,6 +84,14 @@ define i32 @test_rev_w_srl16_add(i8 %a, ; CHECK-NEXT: add w8, w8, w1, uxtb ; CHECK-NEXT: rev16 w0, w8 ; CHECK-NEXT: ret +; +; FALLBACK-LABEL: test_rev_w_srl16_add: +; FALLBACK: // %bb.0: // %entry +; FALLBACK-NEXT: and w8, w1, #0xff +; FALLBACK-NEXT: add w8, w8, w0, uxtb +; FALLBACK-NEXT: rev w8, w8 +; FALLBACK-NEXT: lsr w0, w8, #16 +; FALLBACK-NEXT: ret entry: %0 = zext i8 %a to i32 %1 = zext i8 %b to i32 @@ -85,6 +109,14 @@ define i64 @test_rev_x_srl32(i32 %a) { ; CHECK-NEXT: mov w8, w0 ; CHECK-NEXT: rev32 x0, x8 ; CHECK-NEXT: ret +; +; FALLBACK-LABEL: test_rev_x_srl32: +; FALLBACK: // %bb.0: // %entry +; FALLBACK-NEXT: // kill: def $w0 killed $w0 def $x0 +; FALLBACK-NEXT: ubfx x8, x0, #0, #32 +; FALLBACK-NEXT: rev x8, x8 +; FALLBACK-NEXT: lsr x0, x8, #32 +; FALLBACK-NEXT: ret entry: %0 = zext i32 %a to i64 %1 = tail call i64 @llvm.bswap.i64(i64 %0) @@ -98,6 +130,13 @@ define i64 @test_rev_x_srl32_load(i32 *% ; CHECK-NEXT: ldr w8, [x0] ; CHECK-NEXT: rev32 x0, x8 ; CHECK-NEXT: ret +; +; FALLBACK-LABEL: test_rev_x_srl32_load: +; FALLBACK: // %bb.0: // %entry +; FALLBACK-NEXT: ldr w8, [x0] +; FALLBACK-NEXT: rev x8, x8 +; FALLBACK-NEXT: lsr x0, x8, #32 +; FALLBACK-NEXT: ret entry: %0 = load i32, i32 *%a %1 = zext i32 %0 to i64 @@ -112,6 +151,14 @@ define i64 @test_rev_x_srl32_shift(i64 % ; CHECK-NEXT: ubfx x8, x0, #2, #29 ; CHECK-NEXT: rev32 x0, x8 ; CHECK-NEXT: ret +; +; FALLBACK-LABEL: test_rev_x_srl32_shift: +; FALLBACK: // %bb.0: // %entry +; FALLBACK-NEXT: lsl x8, x0, #33 +; FALLBACK-NEXT: lsr x8, x8, #35 +; FALLBACK-NEXT: rev x8, x8 +; FALLBACK-NEXT: lsr x0, x8, #32 +; FALLBACK-NEXT: ret entry: %0 = shl i64 %a, 33 %1 = lshr i64 %0, 35 @@ -128,6 +175,19 @@ define i32 @test_rev16_w(i32 %X) nounwin ; CHECK: 
// %bb.0: // %entry ; CHECK-NEXT: rev16 w0, w0 ; CHECK-NEXT: ret +; +; FALLBACK-LABEL: test_rev16_w: +; FALLBACK: // %bb.0: // %entry +; FALLBACK-NEXT: lsr w8, w0, #8 +; FALLBACK-NEXT: lsl w9, w0, #8 +; FALLBACK-NEXT: and w10, w8, #0xff0000 +; FALLBACK-NEXT: and w11, w9, #0xff000000 +; FALLBACK-NEXT: and w9, w9, #0xff00 +; FALLBACK-NEXT: orr w10, w11, w10 +; FALLBACK-NEXT: and w8, w8, #0xff +; FALLBACK-NEXT: orr w9, w10, w9 +; FALLBACK-NEXT: orr w0, w9, w8 +; FALLBACK-NEXT: ret entry: %tmp1 = lshr i32 %X, 8 %X15 = bitcast i32 %X to i32 @@ -151,6 +211,13 @@ define i64 @test_rev16_x(i64 %a) nounwin ; CHECK-NEXT: rev x8, x0 ; CHECK-NEXT: ror x0, x8, #16 ; CHECK-NEXT: ret +; +; FALLBACK-LABEL: test_rev16_x: +; FALLBACK: // %bb.0: // %entry +; FALLBACK-NEXT: rev x8, x0 +; FALLBACK-NEXT: lsl x9, x8, #48 +; FALLBACK-NEXT: orr x0, x9, x8, lsr #16 +; FALLBACK-NEXT: ret entry: %0 = tail call i64 @llvm.bswap.i64(i64 %a) %1 = lshr i64 %0, 16 @@ -164,6 +231,13 @@ define i64 @test_rev32_x(i64 %a) nounwin ; CHECK: // %bb.0: // %entry ; CHECK-NEXT: rev32 x0, x0 ; CHECK-NEXT: ret +; +; FALLBACK-LABEL: test_rev32_x: +; FALLBACK: // %bb.0: // %entry +; FALLBACK-NEXT: rev x8, x0 +; FALLBACK-NEXT: lsl x9, x8, #32 +; FALLBACK-NEXT: orr x0, x9, x8, lsr #32 +; FALLBACK-NEXT: ret entry: %0 = tail call i64 @llvm.bswap.i64(i64 %a) %1 = lshr i64 %0, 32 @@ -178,6 +252,12 @@ define <8 x i8> @test_vrev64D8(<8 x i8>* ; CHECK-NEXT: ldr d0, [x0] ; CHECK-NEXT: rev64.8b v0, v0 ; CHECK-NEXT: ret +; +; FALLBACK-LABEL: test_vrev64D8: +; FALLBACK: // %bb.0: +; FALLBACK-NEXT: ldr d0, [x0] +; FALLBACK-NEXT: rev64.8b v0, v0 +; FALLBACK-NEXT: ret %tmp1 = load <8 x i8>, <8 x i8>* %A %tmp2 = shufflevector <8 x i8> %tmp1, <8 x i8> undef, <8 x i32> ret <8 x i8> %tmp2 @@ -189,6 +269,12 @@ define <4 x i16> @test_vrev64D16(<4 x i1 ; CHECK-NEXT: ldr d0, [x0] ; CHECK-NEXT: rev64.4h v0, v0 ; CHECK-NEXT: ret +; +; FALLBACK-LABEL: test_vrev64D16: +; FALLBACK: // %bb.0: +; FALLBACK-NEXT: ldr d0, [x0] +; FALLBACK-NEXT: 
rev64.4h v0, v0 +; FALLBACK-NEXT: ret %tmp1 = load <4 x i16>, <4 x i16>* %A %tmp2 = shufflevector <4 x i16> %tmp1, <4 x i16> undef, <4 x i32> ret <4 x i16> %tmp2 @@ -200,6 +286,17 @@ define <2 x i32> @test_vrev64D32(<2 x i3 ; CHECK-NEXT: ldr d0, [x0] ; CHECK-NEXT: rev64.2s v0, v0 ; CHECK-NEXT: ret +; +; FALLBACK-LABEL: test_vrev64D32: +; FALLBACK: // %bb.0: +; FALLBACK-NEXT: ldr d0, [x0] +; FALLBACK-NEXT: adrp x8, .LCPI13_0 +; FALLBACK-NEXT: ldr d1, [x8, :lo12:.LCPI13_0] +; FALLBACK-NEXT: mov.s v2[1], w8 +; FALLBACK-NEXT: mov.d v0[1], v2[0] +; FALLBACK-NEXT: tbl.16b v0, { v0 }, v1 +; FALLBACK-NEXT: // kill: def $d0 killed $d0 killed $q0 +; FALLBACK-NEXT: ret %tmp1 = load <2 x i32>, <2 x i32>* %A %tmp2 = shufflevector <2 x i32> %tmp1, <2 x i32> undef, <2 x i32> ret <2 x i32> %tmp2 @@ -211,6 +308,17 @@ define <2 x float> @test_vrev64Df(<2 x f ; CHECK-NEXT: ldr d0, [x0] ; CHECK-NEXT: rev64.2s v0, v0 ; CHECK-NEXT: ret +; +; FALLBACK-LABEL: test_vrev64Df: +; FALLBACK: // %bb.0: +; FALLBACK-NEXT: ldr d0, [x0] +; FALLBACK-NEXT: adrp x8, .LCPI14_0 +; FALLBACK-NEXT: ldr d1, [x8, :lo12:.LCPI14_0] +; FALLBACK-NEXT: mov.s v2[1], w8 +; FALLBACK-NEXT: mov.d v0[1], v2[0] +; FALLBACK-NEXT: tbl.16b v0, { v0 }, v1 +; FALLBACK-NEXT: // kill: def $d0 killed $d0 killed $q0 +; FALLBACK-NEXT: ret %tmp1 = load <2 x float>, <2 x float>* %A %tmp2 = shufflevector <2 x float> %tmp1, <2 x float> undef, <2 x i32> ret <2 x float> %tmp2 @@ -222,6 +330,12 @@ define <16 x i8> @test_vrev64Q8(<16 x i8 ; CHECK-NEXT: ldr q0, [x0] ; CHECK-NEXT: rev64.16b v0, v0 ; CHECK-NEXT: ret +; +; FALLBACK-LABEL: test_vrev64Q8: +; FALLBACK: // %bb.0: +; FALLBACK-NEXT: ldr q0, [x0] +; FALLBACK-NEXT: rev64.16b v0, v0 +; FALLBACK-NEXT: ret %tmp1 = load <16 x i8>, <16 x i8>* %A %tmp2 = shufflevector <16 x i8> %tmp1, <16 x i8> undef, <16 x i32> ret <16 x i8> %tmp2 @@ -233,6 +347,12 @@ define <8 x i16> @test_vrev64Q16(<8 x i1 ; CHECK-NEXT: ldr q0, [x0] ; CHECK-NEXT: rev64.8h v0, v0 ; CHECK-NEXT: ret +; +; FALLBACK-LABEL: 
test_vrev64Q16: +; FALLBACK: // %bb.0: +; FALLBACK-NEXT: ldr q0, [x0] +; FALLBACK-NEXT: rev64.8h v0, v0 +; FALLBACK-NEXT: ret %tmp1 = load <8 x i16>, <8 x i16>* %A %tmp2 = shufflevector <8 x i16> %tmp1, <8 x i16> undef, <8 x i32> ret <8 x i16> %tmp2 @@ -244,6 +364,14 @@ define <4 x i32> @test_vrev64Q32(<4 x i3 ; CHECK-NEXT: ldr q0, [x0] ; CHECK-NEXT: rev64.4s v0, v0 ; CHECK-NEXT: ret +; +; FALLBACK-LABEL: test_vrev64Q32: +; FALLBACK: // %bb.0: +; FALLBACK-NEXT: adrp x8, .LCPI17_0 +; FALLBACK-NEXT: ldr q0, [x0] +; FALLBACK-NEXT: ldr q2, [x8, :lo12:.LCPI17_0] +; FALLBACK-NEXT: tbl.16b v0, { v0, v1 }, v2 +; FALLBACK-NEXT: ret %tmp1 = load <4 x i32>, <4 x i32>* %A %tmp2 = shufflevector <4 x i32> %tmp1, <4 x i32> undef, <4 x i32> ret <4 x i32> %tmp2 @@ -255,6 +383,14 @@ define <4 x float> @test_vrev64Qf(<4 x f ; CHECK-NEXT: ldr q0, [x0] ; CHECK-NEXT: rev64.4s v0, v0 ; CHECK-NEXT: ret +; +; FALLBACK-LABEL: test_vrev64Qf: +; FALLBACK: // %bb.0: +; FALLBACK-NEXT: adrp x8, .LCPI18_0 +; FALLBACK-NEXT: ldr q0, [x0] +; FALLBACK-NEXT: ldr q2, [x8, :lo12:.LCPI18_0] +; FALLBACK-NEXT: tbl.16b v0, { v0, v1 }, v2 +; FALLBACK-NEXT: ret %tmp1 = load <4 x float>, <4 x float>* %A %tmp2 = shufflevector <4 x float> %tmp1, <4 x float> undef, <4 x i32> ret <4 x float> %tmp2 @@ -266,6 +402,12 @@ define <8 x i8> @test_vrev32D8(<8 x i8>* ; CHECK-NEXT: ldr d0, [x0] ; CHECK-NEXT: rev32.8b v0, v0 ; CHECK-NEXT: ret +; +; FALLBACK-LABEL: test_vrev32D8: +; FALLBACK: // %bb.0: +; FALLBACK-NEXT: ldr d0, [x0] +; FALLBACK-NEXT: rev32.8b v0, v0 +; FALLBACK-NEXT: ret %tmp1 = load <8 x i8>, <8 x i8>* %A %tmp2 = shufflevector <8 x i8> %tmp1, <8 x i8> undef, <8 x i32> ret <8 x i8> %tmp2 @@ -277,6 +419,12 @@ define <4 x i16> @test_vrev32D16(<4 x i1 ; CHECK-NEXT: ldr d0, [x0] ; CHECK-NEXT: rev32.4h v0, v0 ; CHECK-NEXT: ret +; +; FALLBACK-LABEL: test_vrev32D16: +; FALLBACK: // %bb.0: +; FALLBACK-NEXT: ldr d0, [x0] +; FALLBACK-NEXT: rev32.4h v0, v0 +; FALLBACK-NEXT: ret %tmp1 = load <4 x i16>, <4 x i16>* %A 
%tmp2 = shufflevector <4 x i16> %tmp1, <4 x i16> undef, <4 x i32> ret <4 x i16> %tmp2 @@ -288,6 +436,12 @@ define <16 x i8> @test_vrev32Q8(<16 x i8 ; CHECK-NEXT: ldr q0, [x0] ; CHECK-NEXT: rev32.16b v0, v0 ; CHECK-NEXT: ret +; +; FALLBACK-LABEL: test_vrev32Q8: +; FALLBACK: // %bb.0: +; FALLBACK-NEXT: ldr q0, [x0] +; FALLBACK-NEXT: rev32.16b v0, v0 +; FALLBACK-NEXT: ret %tmp1 = load <16 x i8>, <16 x i8>* %A %tmp2 = shufflevector <16 x i8> %tmp1, <16 x i8> undef, <16 x i32> ret <16 x i8> %tmp2 @@ -299,6 +453,12 @@ define <8 x i16> @test_vrev32Q16(<8 x i1 ; CHECK-NEXT: ldr q0, [x0] ; CHECK-NEXT: rev32.8h v0, v0 ; CHECK-NEXT: ret +; +; FALLBACK-LABEL: test_vrev32Q16: +; FALLBACK: // %bb.0: +; FALLBACK-NEXT: ldr q0, [x0] +; FALLBACK-NEXT: rev32.8h v0, v0 +; FALLBACK-NEXT: ret %tmp1 = load <8 x i16>, <8 x i16>* %A %tmp2 = shufflevector <8 x i16> %tmp1, <8 x i16> undef, <8 x i32> ret <8 x i16> %tmp2 @@ -310,6 +470,12 @@ define <8 x i8> @test_vrev16D8(<8 x i8>* ; CHECK-NEXT: ldr d0, [x0] ; CHECK-NEXT: rev16.8b v0, v0 ; CHECK-NEXT: ret +; +; FALLBACK-LABEL: test_vrev16D8: +; FALLBACK: // %bb.0: +; FALLBACK-NEXT: ldr d0, [x0] +; FALLBACK-NEXT: rev16.8b v0, v0 +; FALLBACK-NEXT: ret %tmp1 = load <8 x i8>, <8 x i8>* %A %tmp2 = shufflevector <8 x i8> %tmp1, <8 x i8> undef, <8 x i32> ret <8 x i8> %tmp2 @@ -321,6 +487,12 @@ define <16 x i8> @test_vrev16Q8(<16 x i8 ; CHECK-NEXT: ldr q0, [x0] ; CHECK-NEXT: rev16.16b v0, v0 ; CHECK-NEXT: ret +; +; FALLBACK-LABEL: test_vrev16Q8: +; FALLBACK: // %bb.0: +; FALLBACK-NEXT: ldr q0, [x0] +; FALLBACK-NEXT: rev16.16b v0, v0 +; FALLBACK-NEXT: ret %tmp1 = load <16 x i8>, <16 x i8>* %A %tmp2 = shufflevector <16 x i8> %tmp1, <16 x i8> undef, <16 x i32> ret <16 x i8> %tmp2 @@ -334,6 +506,12 @@ define <8 x i8> @test_vrev64D8_undef(<8 ; CHECK-NEXT: ldr d0, [x0] ; CHECK-NEXT: rev64.8b v0, v0 ; CHECK-NEXT: ret +; +; FALLBACK-LABEL: test_vrev64D8_undef: +; FALLBACK: // %bb.0: +; FALLBACK-NEXT: ldr d0, [x0] +; FALLBACK-NEXT: rev64.8b v0, v0 +; 
FALLBACK-NEXT: ret %tmp1 = load <8 x i8>, <8 x i8>* %A %tmp2 = shufflevector <8 x i8> %tmp1, <8 x i8> undef, <8 x i32> ret <8 x i8> %tmp2 @@ -345,6 +523,12 @@ define <8 x i16> @test_vrev32Q16_undef(< ; CHECK-NEXT: ldr q0, [x0] ; CHECK-NEXT: rev32.8h v0, v0 ; CHECK-NEXT: ret +; +; FALLBACK-LABEL: test_vrev32Q16_undef: +; FALLBACK: // %bb.0: +; FALLBACK-NEXT: ldr q0, [x0] +; FALLBACK-NEXT: rev32.8h v0, v0 +; FALLBACK-NEXT: ret %tmp1 = load <8 x i16>, <8 x i16>* %A %tmp2 = shufflevector <8 x i16> %tmp1, <8 x i16> undef, <8 x i32> ret <8 x i16> %tmp2 @@ -359,6 +543,14 @@ define void @test_vrev64(<4 x i16>* noca ; CHECK-NEXT: st1.h { v0 }[5], [x8] ; CHECK-NEXT: st1.h { v0 }[6], [x1] ; CHECK-NEXT: ret +; +; FALLBACK-LABEL: test_vrev64: +; FALLBACK: // %bb.0: // %entry +; FALLBACK-NEXT: ldr q0, [x0] +; FALLBACK-NEXT: add x8, x1, #2 // =2 +; FALLBACK-NEXT: st1.h { v0 }[5], [x8] +; FALLBACK-NEXT: st1.h { v0 }[6], [x1] +; FALLBACK-NEXT: ret entry: %0 = bitcast <4 x i16>* %source to <8 x i16>* %tmp2 = load <8 x i16>, <8 x i16>* %0, align 4 @@ -381,6 +573,19 @@ define void @float_vrev64(float* nocaptu ; CHECK-NEXT: rev64.4s v0, v0 ; CHECK-NEXT: str q0, [x1, #176] ; CHECK-NEXT: ret +; +; FALLBACK-LABEL: float_vrev64: +; FALLBACK: // %bb.0: // %entry +; FALLBACK-NEXT: fmov s0, wzr +; FALLBACK-NEXT: mov.s v0[1], v0[0] +; FALLBACK-NEXT: mov.s v0[2], v0[0] +; FALLBACK-NEXT: adrp x8, .LCPI28_0 +; FALLBACK-NEXT: mov.s v0[3], v0[0] +; FALLBACK-NEXT: ldr q1, [x0] +; FALLBACK-NEXT: ldr q2, [x8, :lo12:.LCPI28_0] +; FALLBACK-NEXT: tbl.16b v0, { v0, v1 }, v2 +; FALLBACK-NEXT: str q0, [x1, #176] +; FALLBACK-NEXT: ret entry: %0 = bitcast float* %source to <4 x float>* %tmp2 = load <4 x float>, <4 x float>* %0, align 4 @@ -396,10 +601,11 @@ define <4 x i32> @test_vrev32_bswap(<4 x ; CHECK: // %bb.0: ; CHECK-NEXT: rev32.16b v0, v0 ; CHECK-NEXT: ret -; GISEL-LABEL: test_vrev32_bswap: -; GISEL: // %bb.0: -; GISEL-NEXT: rev32.16b v0, v0 -; GISEL-NEXT: ret +; +; FALLBACK-LABEL: test_vrev32_bswap: 
+; FALLBACK: // %bb.0: +; FALLBACK-NEXT: rev32.16b v0, v0 +; FALLBACK-NEXT: ret %bswap = call <4 x i32> @llvm.bswap.v4i32(<4 x i32> %source) ret <4 x i32> %bswap } Modified: llvm/trunk/test/CodeGen/AMDGPU/lshr.v2i16.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/lshr.v2i16.ll?rev=374074&r1=374073&r2=374074&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/AMDGPU/lshr.v2i16.ll (original) +++ llvm/trunk/test/CodeGen/AMDGPU/lshr.v2i16.ll Tue Oct 8 09:16:26 2019 @@ -1,46 +1,134 @@ +; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py ; RUN: llc -march=amdgcn -mcpu=gfx900 -verify-machineinstrs < %s | FileCheck -enable-var-scope -check-prefixes=GCN,GFX9 %s ; RUN: llc -march=amdgcn -mcpu=tonga -verify-machineinstrs < %s | FileCheck -enable-var-scope -check-prefixes=GCN,VI,CIVI %s ; RUN: llc -march=amdgcn -mcpu=bonaire -verify-machineinstrs < %s | FileCheck -enable-var-scope -check-prefixes=GCN,CI,CIVI %s -; GCN-LABEL: {{^}}s_lshr_v2i16: -; GFX9: s_load_dword [[LHS:s[0-9]+]] -; GFX9: s_load_dword [[RHS:s[0-9]+]] -; GFX9: v_mov_b32_e32 [[VLHS:v[0-9]+]], [[LHS]] -; GFX9: v_pk_lshrrev_b16 [[RESULT:v[0-9]+]], [[RHS]], [[VLHS]] - -; CIVI: s_load_dword [[LHS:s[0-9]+]] -; CIVI: s_load_dword [[RHS:s[0-9]+]] -; CIVI: s_lshr_b32 s{{[0-9]+}}, s{{[0-9]+}}, 16 -; CIVI: s_lshr_b32 s{{[0-9]+}}, s{{[0-9]+}}, 16 -; CIVI: s_lshr_b32 s{{[0-9]+}}, s{{[0-9]+}}, s{{[0-9]+}} -; CIVI-DAG: v_bfe_u32 v{{[0-9]+}}, s{{[0-9]+}}, v{{[0-9]+}}, 16 -; CIVI-DAG: s_lshl_b32 -; CIVI: v_or_b32_e32 define amdgpu_kernel void @s_lshr_v2i16(<2 x i16> addrspace(1)* %out, <2 x i16> %lhs, <2 x i16> %rhs) #0 { +; GFX9-LABEL: s_lshr_v2i16: +; GFX9: ; %bb.0: +; GFX9-NEXT: s_load_dwordx2 s[2:3], s[0:1], 0x24 +; GFX9-NEXT: s_load_dword s4, s[0:1], 0x2c +; GFX9-NEXT: s_load_dword s0, s[0:1], 0x30 +; GFX9-NEXT: s_waitcnt lgkmcnt(0) +; GFX9-NEXT: v_mov_b32_e32 v0, s2 +; GFX9-NEXT: v_mov_b32_e32 v2, s4 +; 
GFX9-NEXT: v_mov_b32_e32 v1, s3 +; GFX9-NEXT: v_pk_lshrrev_b16 v2, s0, v2 +; GFX9-NEXT: global_store_dword v[0:1], v2, off +; GFX9-NEXT: s_endpgm +; +; VI-LABEL: s_lshr_v2i16: +; VI: ; %bb.0: +; VI-NEXT: s_load_dwordx2 s[2:3], s[0:1], 0x24 +; VI-NEXT: s_load_dword s5, s[0:1], 0x2c +; VI-NEXT: s_load_dword s0, s[0:1], 0x30 +; VI-NEXT: s_mov_b32 s4, 0xffff +; VI-NEXT: s_waitcnt lgkmcnt(0) +; VI-NEXT: s_and_b32 s1, s5, s4 +; VI-NEXT: s_and_b32 s4, s0, s4 +; VI-NEXT: s_lshr_b32 s5, s5, 16 +; VI-NEXT: s_lshr_b32 s0, s0, 16 +; VI-NEXT: s_lshr_b32 s0, s5, s0 +; VI-NEXT: v_mov_b32_e32 v0, s4 +; VI-NEXT: v_bfe_u32 v0, s1, v0, 16 +; VI-NEXT: s_lshl_b32 s0, s0, 16 +; VI-NEXT: v_or_b32_e32 v2, s0, v0 +; VI-NEXT: v_mov_b32_e32 v0, s2 +; VI-NEXT: v_mov_b32_e32 v1, s3 +; VI-NEXT: flat_store_dword v[0:1], v2 +; VI-NEXT: s_endpgm +; +; CI-LABEL: s_lshr_v2i16: +; CI: ; %bb.0: +; CI-NEXT: s_load_dwordx2 s[4:5], s[0:1], 0x9 +; CI-NEXT: s_load_dword s2, s[0:1], 0xb +; CI-NEXT: s_load_dword s0, s[0:1], 0xc +; CI-NEXT: s_mov_b32 s3, 0xffff +; CI-NEXT: s_mov_b32 s7, 0xf000 +; CI-NEXT: s_mov_b32 s6, -1 +; CI-NEXT: s_waitcnt lgkmcnt(0) +; CI-NEXT: s_lshr_b32 s1, s2, 16 +; CI-NEXT: s_lshr_b32 s8, s0, 16 +; CI-NEXT: s_and_b32 s0, s0, s3 +; CI-NEXT: v_mov_b32_e32 v0, s0 +; CI-NEXT: s_lshr_b32 s0, s1, s8 +; CI-NEXT: s_and_b32 s2, s2, s3 +; CI-NEXT: v_bfe_u32 v0, s2, v0, 16 +; CI-NEXT: s_lshl_b32 s0, s0, 16 +; CI-NEXT: v_or_b32_e32 v0, s0, v0 +; CI-NEXT: buffer_store_dword v0, off, s[4:7], 0 +; CI-NEXT: s_endpgm %result = lshr <2 x i16> %lhs, %rhs store <2 x i16> %result, <2 x i16> addrspace(1)* %out ret void } -; GCN-LABEL: {{^}}v_lshr_v2i16: -; GCN: {{buffer|flat|global}}_load_dword [[LHS:v[0-9]+]] -; GCN: {{buffer|flat|global}}_load_dword [[RHS:v[0-9]+]] -; GFX9: v_pk_lshrrev_b16 [[RESULT:v[0-9]+]], [[RHS]], [[LHS]] - -; VI: v_lshrrev_b16_e32 v{{[0-9]+}}, v{{[0-9]+}}, v{{[0-9]+}} -; VI: v_lshrrev_b16_sdwa v{{[0-9]+}}, v{{[0-9]+}}, v{{[0-9]+}} dst_sel:WORD_1 dst_unused:UNUSED_PAD 
src0_sel:WORD_1 src1_sel:WORD_1 -; VI: v_or_b32_e32 v{{[0-9]+}}, v{{[0-9]+}}, v{{[0-9]+}} - -; CI: s_mov_b32 [[MASK:s[0-9]+]], 0xffff{{$}} -; CI-DAG: v_lshrrev_b32_e32 v{{[0-9]+}}, 16, [[LHS]] -; CI-DAG: v_lshrrev_b32_e32 v{{[0-9]+}}, 16, [[RHS]] -; CI: v_and_b32_e32 v{{[0-9]+}}, [[MASK]], v{{[0-9]+}} -; CI: v_and_b32_e32 v{{[0-9]+}}, [[MASK]], v{{[0-9]+}} -; CI: v_bfe_u32 v{{[0-9]+}}, v{{[0-9]+}}, v{{[0-9]+}}, 16 -; CI: v_lshrrev_b32_e32 v{{[0-9]+}}, v{{[0-9]+}}, v{{[0-9]+}} -; CI: v_lshlrev_b32_e32 v{{[0-9]+}}, 16, v{{[0-9]+}} -; CI: v_or_b32_e32 v{{[0-9]+}}, v{{[0-9]+}}, v{{[0-9]+}} define amdgpu_kernel void @v_lshr_v2i16(<2 x i16> addrspace(1)* %out, <2 x i16> addrspace(1)* %in) #0 { +; GFX9-LABEL: v_lshr_v2i16: +; GFX9: ; %bb.0: +; GFX9-NEXT: s_load_dwordx4 s[0:3], s[0:1], 0x24 +; GFX9-NEXT: v_lshlrev_b32_e32 v2, 2, v0 +; GFX9-NEXT: s_waitcnt lgkmcnt(0) +; GFX9-NEXT: v_mov_b32_e32 v1, s3 +; GFX9-NEXT: v_add_co_u32_e32 v0, vcc, s2, v2 +; GFX9-NEXT: v_addc_co_u32_e32 v1, vcc, 0, v1, vcc +; GFX9-NEXT: global_load_dword v3, v[0:1], off +; GFX9-NEXT: global_load_dword v4, v[0:1], off offset:4 +; GFX9-NEXT: v_add_co_u32_e32 v0, vcc, s0, v2 +; GFX9-NEXT: v_mov_b32_e32 v1, s1 +; GFX9-NEXT: v_addc_co_u32_e32 v1, vcc, 0, v1, vcc +; GFX9-NEXT: s_waitcnt vmcnt(0) +; GFX9-NEXT: v_pk_lshrrev_b16 v2, v4, v3 +; GFX9-NEXT: global_store_dword v[0:1], v2, off +; GFX9-NEXT: s_endpgm +; +; VI-LABEL: v_lshr_v2i16: +; VI: ; %bb.0: +; VI-NEXT: s_load_dwordx4 s[0:3], s[0:1], 0x24 +; VI-NEXT: v_lshlrev_b32_e32 v4, 2, v0 +; VI-NEXT: s_waitcnt lgkmcnt(0) +; VI-NEXT: v_mov_b32_e32 v1, s3 +; VI-NEXT: v_add_u32_e32 v0, vcc, s2, v4 +; VI-NEXT: v_addc_u32_e32 v1, vcc, 0, v1, vcc +; VI-NEXT: v_add_u32_e32 v2, vcc, 4, v0 +; VI-NEXT: v_addc_u32_e32 v3, vcc, 0, v1, vcc +; VI-NEXT: flat_load_dword v5, v[0:1] +; VI-NEXT: flat_load_dword v2, v[2:3] +; VI-NEXT: v_mov_b32_e32 v1, s1 +; VI-NEXT: v_add_u32_e32 v0, vcc, s0, v4 +; VI-NEXT: v_addc_u32_e32 v1, vcc, 0, v1, vcc +; VI-NEXT: s_waitcnt vmcnt(0) 
lgkmcnt(0) +; VI-NEXT: v_lshrrev_b16_e32 v3, v2, v5 +; VI-NEXT: v_lshrrev_b16_sdwa v2, v2, v5 dst_sel:WORD_1 dst_unused:UNUSED_PAD src0_sel:WORD_1 src1_sel:WORD_1 +; VI-NEXT: v_or_b32_e32 v2, v3, v2 +; VI-NEXT: flat_store_dword v[0:1], v2 +; VI-NEXT: s_endpgm +; +; CI-LABEL: v_lshr_v2i16: +; CI: ; %bb.0: +; CI-NEXT: s_load_dwordx4 s[0:3], s[0:1], 0x9 +; CI-NEXT: s_mov_b32 s7, 0xf000 +; CI-NEXT: s_mov_b32 s6, 0 +; CI-NEXT: v_lshlrev_b32_e32 v0, 2, v0 +; CI-NEXT: v_mov_b32_e32 v1, 0 +; CI-NEXT: s_waitcnt lgkmcnt(0) +; CI-NEXT: s_mov_b64 s[4:5], s[2:3] +; CI-NEXT: buffer_load_dword v2, v[0:1], s[4:7], 0 addr64 +; CI-NEXT: buffer_load_dword v3, v[0:1], s[4:7], 0 addr64 offset:4 +; CI-NEXT: s_mov_b32 s8, 0xffff +; CI-NEXT: s_mov_b64 s[2:3], s[6:7] +; CI-NEXT: s_waitcnt vmcnt(1) +; CI-NEXT: v_lshrrev_b32_e32 v4, 16, v2 +; CI-NEXT: s_waitcnt vmcnt(0) +; CI-NEXT: v_lshrrev_b32_e32 v5, 16, v3 +; CI-NEXT: v_and_b32_e32 v2, s8, v2 +; CI-NEXT: v_and_b32_e32 v3, s8, v3 +; CI-NEXT: v_bfe_u32 v2, v2, v3, 16 +; CI-NEXT: v_lshrrev_b32_e32 v3, v5, v4 +; CI-NEXT: v_lshlrev_b32_e32 v3, 16, v3 +; CI-NEXT: v_or_b32_e32 v2, v2, v3 +; CI-NEXT: buffer_store_dword v2, v[0:1], s[0:3], 0 addr64 +; CI-NEXT: s_endpgm %tid = call i32 @llvm.amdgcn.workitem.id.x() %tid.ext = sext i32 %tid to i64 %in.gep = getelementptr inbounds <2 x i16>, <2 x i16> addrspace(1)* %in, i64 %tid.ext @@ -53,11 +141,71 @@ define amdgpu_kernel void @v_lshr_v2i16( ret void } -; GCN-LABEL: {{^}}lshr_v_s_v2i16: -; GFX9: s_load_dword [[RHS:s[0-9]+]] -; GFX9: {{buffer|flat|global}}_load_dword [[LHS:v[0-9]+]] -; GFX9: v_pk_lshrrev_b16 [[RESULT:v[0-9]+]], [[RHS]], [[LHS]] define amdgpu_kernel void @lshr_v_s_v2i16(<2 x i16> addrspace(1)* %out, <2 x i16> addrspace(1)* %in, <2 x i16> %sgpr) #0 { +; GFX9-LABEL: lshr_v_s_v2i16: +; GFX9: ; %bb.0: +; GFX9-NEXT: s_load_dwordx4 s[4:7], s[0:1], 0x24 +; GFX9-NEXT: v_lshlrev_b32_e32 v2, 2, v0 +; GFX9-NEXT: s_load_dword s0, s[0:1], 0x34 +; GFX9-NEXT: s_waitcnt lgkmcnt(0) +; GFX9-NEXT: 
v_mov_b32_e32 v1, s7 +; GFX9-NEXT: v_add_co_u32_e32 v0, vcc, s6, v2 +; GFX9-NEXT: v_addc_co_u32_e32 v1, vcc, 0, v1, vcc +; GFX9-NEXT: global_load_dword v3, v[0:1], off +; GFX9-NEXT: v_add_co_u32_e32 v0, vcc, s4, v2 +; GFX9-NEXT: v_mov_b32_e32 v1, s5 +; GFX9-NEXT: v_addc_co_u32_e32 v1, vcc, 0, v1, vcc +; GFX9-NEXT: s_waitcnt vmcnt(0) +; GFX9-NEXT: v_pk_lshrrev_b16 v2, s0, v3 +; GFX9-NEXT: global_store_dword v[0:1], v2, off +; GFX9-NEXT: s_endpgm +; +; VI-LABEL: lshr_v_s_v2i16: +; VI: ; %bb.0: +; VI-NEXT: s_load_dwordx4 s[4:7], s[0:1], 0x24 +; VI-NEXT: v_lshlrev_b32_e32 v2, 2, v0 +; VI-NEXT: s_load_dword s0, s[0:1], 0x34 +; VI-NEXT: s_waitcnt lgkmcnt(0) +; VI-NEXT: v_mov_b32_e32 v1, s7 +; VI-NEXT: v_add_u32_e32 v0, vcc, s6, v2 +; VI-NEXT: v_addc_u32_e32 v1, vcc, 0, v1, vcc +; VI-NEXT: flat_load_dword v3, v[0:1] +; VI-NEXT: s_lshr_b32 s1, s0, 16 +; VI-NEXT: v_mov_b32_e32 v4, s1 +; VI-NEXT: v_add_u32_e32 v0, vcc, s4, v2 +; VI-NEXT: v_mov_b32_e32 v1, s5 +; VI-NEXT: v_addc_u32_e32 v1, vcc, 0, v1, vcc +; VI-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; VI-NEXT: v_lshrrev_b16_e32 v2, s0, v3 +; VI-NEXT: v_lshrrev_b16_sdwa v3, v4, v3 dst_sel:WORD_1 dst_unused:UNUSED_PAD src0_sel:DWORD src1_sel:WORD_1 +; VI-NEXT: v_or_b32_e32 v2, v2, v3 +; VI-NEXT: flat_store_dword v[0:1], v2 +; VI-NEXT: s_endpgm +; +; CI-LABEL: lshr_v_s_v2i16: +; CI: ; %bb.0: +; CI-NEXT: s_load_dwordx4 s[4:7], s[0:1], 0x9 +; CI-NEXT: s_load_dword s8, s[0:1], 0xd +; CI-NEXT: s_mov_b32 s3, 0xf000 +; CI-NEXT: s_mov_b32 s2, 0 +; CI-NEXT: v_lshlrev_b32_e32 v0, 2, v0 +; CI-NEXT: s_waitcnt lgkmcnt(0) +; CI-NEXT: s_mov_b64 s[0:1], s[6:7] +; CI-NEXT: v_mov_b32_e32 v1, 0 +; CI-NEXT: buffer_load_dword v2, v[0:1], s[0:3], 0 addr64 +; CI-NEXT: s_lshr_b32 s9, s8, 16 +; CI-NEXT: s_mov_b32 s10, 0xffff +; CI-NEXT: s_and_b32 s8, s8, s10 +; CI-NEXT: s_mov_b64 s[6:7], s[2:3] +; CI-NEXT: s_waitcnt vmcnt(0) +; CI-NEXT: v_lshrrev_b32_e32 v3, 16, v2 +; CI-NEXT: v_and_b32_e32 v2, s10, v2 +; CI-NEXT: v_lshrrev_b32_e32 v3, s9, v3 +; CI-NEXT: 
v_bfe_u32 v2, v2, s8, 16 +; CI-NEXT: v_lshlrev_b32_e32 v3, 16, v3 +; CI-NEXT: v_or_b32_e32 v2, v2, v3 +; CI-NEXT: buffer_store_dword v2, v[0:1], s[4:7], 0 addr64 +; CI-NEXT: s_endpgm %tid = call i32 @llvm.amdgcn.workitem.id.x() %tid.ext = sext i32 %tid to i64 %in.gep = getelementptr inbounds <2 x i16>, <2 x i16> addrspace(1)* %in, i64 %tid.ext @@ -68,11 +216,71 @@ define amdgpu_kernel void @lshr_v_s_v2i1 ret void } -; GCN-LABEL: {{^}}lshr_s_v_v2i16: -; GFX9: s_load_dword [[LHS:s[0-9]+]] -; GFX9: {{buffer|flat|global}}_load_dword [[RHS:v[0-9]+]] -; GFX9: v_pk_lshrrev_b16 [[RESULT:v[0-9]+]], [[RHS]], [[LHS]] define amdgpu_kernel void @lshr_s_v_v2i16(<2 x i16> addrspace(1)* %out, <2 x i16> addrspace(1)* %in, <2 x i16> %sgpr) #0 { +; GFX9-LABEL: lshr_s_v_v2i16: +; GFX9: ; %bb.0: +; GFX9-NEXT: s_load_dwordx4 s[4:7], s[0:1], 0x24 +; GFX9-NEXT: v_lshlrev_b32_e32 v2, 2, v0 +; GFX9-NEXT: s_load_dword s0, s[0:1], 0x34 +; GFX9-NEXT: s_waitcnt lgkmcnt(0) +; GFX9-NEXT: v_mov_b32_e32 v1, s7 +; GFX9-NEXT: v_add_co_u32_e32 v0, vcc, s6, v2 +; GFX9-NEXT: v_addc_co_u32_e32 v1, vcc, 0, v1, vcc +; GFX9-NEXT: global_load_dword v3, v[0:1], off +; GFX9-NEXT: v_add_co_u32_e32 v0, vcc, s4, v2 +; GFX9-NEXT: v_mov_b32_e32 v1, s5 +; GFX9-NEXT: v_addc_co_u32_e32 v1, vcc, 0, v1, vcc +; GFX9-NEXT: s_waitcnt vmcnt(0) +; GFX9-NEXT: v_pk_lshrrev_b16 v2, v3, s0 +; GFX9-NEXT: global_store_dword v[0:1], v2, off +; GFX9-NEXT: s_endpgm +; +; VI-LABEL: lshr_s_v_v2i16: +; VI: ; %bb.0: +; VI-NEXT: s_load_dwordx4 s[4:7], s[0:1], 0x24 +; VI-NEXT: v_lshlrev_b32_e32 v2, 2, v0 +; VI-NEXT: s_load_dword s0, s[0:1], 0x34 +; VI-NEXT: s_waitcnt lgkmcnt(0) +; VI-NEXT: v_mov_b32_e32 v1, s7 +; VI-NEXT: v_add_u32_e32 v0, vcc, s6, v2 +; VI-NEXT: v_addc_u32_e32 v1, vcc, 0, v1, vcc +; VI-NEXT: flat_load_dword v3, v[0:1] +; VI-NEXT: s_lshr_b32 s1, s0, 16 +; VI-NEXT: v_mov_b32_e32 v4, s1 +; VI-NEXT: v_add_u32_e32 v0, vcc, s4, v2 +; VI-NEXT: v_mov_b32_e32 v1, s5 +; VI-NEXT: v_addc_u32_e32 v1, vcc, 0, v1, vcc +; VI-NEXT: 
s_waitcnt vmcnt(0) lgkmcnt(0) +; VI-NEXT: v_lshrrev_b16_e64 v2, v3, s0 +; VI-NEXT: v_lshrrev_b16_sdwa v3, v3, v4 dst_sel:WORD_1 dst_unused:UNUSED_PAD src0_sel:WORD_1 src1_sel:DWORD +; VI-NEXT: v_or_b32_e32 v2, v2, v3 +; VI-NEXT: flat_store_dword v[0:1], v2 +; VI-NEXT: s_endpgm +; +; CI-LABEL: lshr_s_v_v2i16: +; CI: ; %bb.0: +; CI-NEXT: s_load_dwordx4 s[4:7], s[0:1], 0x9 +; CI-NEXT: s_load_dword s8, s[0:1], 0xd +; CI-NEXT: s_mov_b32 s3, 0xf000 +; CI-NEXT: s_mov_b32 s2, 0 +; CI-NEXT: v_lshlrev_b32_e32 v0, 2, v0 +; CI-NEXT: s_waitcnt lgkmcnt(0) +; CI-NEXT: s_mov_b64 s[0:1], s[6:7] +; CI-NEXT: v_mov_b32_e32 v1, 0 +; CI-NEXT: buffer_load_dword v2, v[0:1], s[0:3], 0 addr64 +; CI-NEXT: s_lshr_b32 s9, s8, 16 +; CI-NEXT: s_mov_b32 s10, 0xffff +; CI-NEXT: s_and_b32 s8, s8, s10 +; CI-NEXT: s_mov_b64 s[6:7], s[2:3] +; CI-NEXT: s_waitcnt vmcnt(0) +; CI-NEXT: v_lshrrev_b32_e32 v3, 16, v2 +; CI-NEXT: v_and_b32_e32 v2, s10, v2 +; CI-NEXT: v_lshr_b32_e32 v3, s9, v3 +; CI-NEXT: v_bfe_u32 v2, s8, v2, 16 +; CI-NEXT: v_lshlrev_b32_e32 v3, 16, v3 +; CI-NEXT: v_or_b32_e32 v2, v2, v3 +; CI-NEXT: buffer_store_dword v2, v[0:1], s[4:7], 0 addr64 +; CI-NEXT: s_endpgm %tid = call i32 @llvm.amdgcn.workitem.id.x() %tid.ext = sext i32 %tid to i64 %in.gep = getelementptr inbounds <2 x i16>, <2 x i16> addrspace(1)* %in, i64 %tid.ext @@ -83,10 +291,64 @@ define amdgpu_kernel void @lshr_s_v_v2i1 ret void } -; GCN-LABEL: {{^}}lshr_imm_v_v2i16: -; GCN: {{buffer|flat|global}}_load_dword [[RHS:v[0-9]+]] -; GFX9: v_pk_lshrrev_b16 [[RESULT:v[0-9]+]], [[RHS]], 8 define amdgpu_kernel void @lshr_imm_v_v2i16(<2 x i16> addrspace(1)* %out, <2 x i16> addrspace(1)* %in) #0 { +; GFX9-LABEL: lshr_imm_v_v2i16: +; GFX9: ; %bb.0: +; GFX9-NEXT: s_load_dwordx4 s[0:3], s[0:1], 0x24 +; GFX9-NEXT: v_lshlrev_b32_e32 v2, 2, v0 +; GFX9-NEXT: s_waitcnt lgkmcnt(0) +; GFX9-NEXT: v_mov_b32_e32 v1, s3 +; GFX9-NEXT: v_add_co_u32_e32 v0, vcc, s2, v2 +; GFX9-NEXT: v_addc_co_u32_e32 v1, vcc, 0, v1, vcc +; GFX9-NEXT: global_load_dword 
v3, v[0:1], off +; GFX9-NEXT: v_add_co_u32_e32 v0, vcc, s0, v2 +; GFX9-NEXT: v_mov_b32_e32 v1, s1 +; GFX9-NEXT: v_addc_co_u32_e32 v1, vcc, 0, v1, vcc +; GFX9-NEXT: s_waitcnt vmcnt(0) +; GFX9-NEXT: v_pk_lshrrev_b16 v2, v3, 8 op_sel_hi:[1,0] +; GFX9-NEXT: global_store_dword v[0:1], v2, off +; GFX9-NEXT: s_endpgm +; +; VI-LABEL: lshr_imm_v_v2i16: +; VI: ; %bb.0: +; VI-NEXT: s_load_dwordx4 s[0:3], s[0:1], 0x24 +; VI-NEXT: v_lshlrev_b32_e32 v2, 2, v0 +; VI-NEXT: v_mov_b32_e32 v3, 8 +; VI-NEXT: s_waitcnt lgkmcnt(0) +; VI-NEXT: v_mov_b32_e32 v1, s3 +; VI-NEXT: v_add_u32_e32 v0, vcc, s2, v2 +; VI-NEXT: v_addc_u32_e32 v1, vcc, 0, v1, vcc +; VI-NEXT: flat_load_dword v4, v[0:1] +; VI-NEXT: v_add_u32_e32 v0, vcc, s0, v2 +; VI-NEXT: v_mov_b32_e32 v1, s1 +; VI-NEXT: v_addc_u32_e32 v1, vcc, 0, v1, vcc +; VI-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; VI-NEXT: v_lshrrev_b16_e64 v2, v4, 8 +; VI-NEXT: v_lshrrev_b16_sdwa v3, v4, v3 dst_sel:WORD_1 dst_unused:UNUSED_PAD src0_sel:WORD_1 src1_sel:DWORD +; VI-NEXT: v_or_b32_e32 v2, v2, v3 +; VI-NEXT: flat_store_dword v[0:1], v2 +; VI-NEXT: s_endpgm +; +; CI-LABEL: lshr_imm_v_v2i16: +; CI: ; %bb.0: +; CI-NEXT: s_load_dwordx4 s[0:3], s[0:1], 0x9 +; CI-NEXT: s_mov_b32 s7, 0xf000 +; CI-NEXT: s_mov_b32 s6, 0 +; CI-NEXT: v_lshlrev_b32_e32 v0, 2, v0 +; CI-NEXT: v_mov_b32_e32 v1, 0 +; CI-NEXT: s_waitcnt lgkmcnt(0) +; CI-NEXT: s_mov_b64 s[4:5], s[2:3] +; CI-NEXT: buffer_load_dword v2, v[0:1], s[4:7], 0 addr64 +; CI-NEXT: s_mov_b64 s[2:3], s[6:7] +; CI-NEXT: s_waitcnt vmcnt(0) +; CI-NEXT: v_lshrrev_b32_e32 v3, 16, v2 +; CI-NEXT: v_and_b32_e32 v2, 0xffff, v2 +; CI-NEXT: v_lshr_b32_e32 v3, 8, v3 +; CI-NEXT: v_bfe_u32 v2, 8, v2, 16 +; CI-NEXT: v_lshlrev_b32_e32 v3, 16, v3 +; CI-NEXT: v_or_b32_e32 v2, v2, v3 +; CI-NEXT: buffer_store_dword v2, v[0:1], s[0:3], 0 addr64 +; CI-NEXT: s_endpgm %tid = call i32 @llvm.amdgcn.workitem.id.x() %tid.ext = sext i32 %tid to i64 %in.gep = getelementptr inbounds <2 x i16>, <2 x i16> addrspace(1)* %in, i64 %tid.ext @@ -97,10 
+359,59 @@ define amdgpu_kernel void @lshr_imm_v_v2 ret void } -; GCN-LABEL: {{^}}lshr_v_imm_v2i16: -; GCN: {{buffer|flat|global}}_load_dword [[LHS:v[0-9]+]] -; GFX9: v_pk_lshrrev_b16 [[RESULT:v[0-9]+]], 8, [[LHS]] define amdgpu_kernel void @lshr_v_imm_v2i16(<2 x i16> addrspace(1)* %out, <2 x i16> addrspace(1)* %in) #0 { +; GFX9-LABEL: lshr_v_imm_v2i16: +; GFX9: ; %bb.0: +; GFX9-NEXT: s_load_dwordx4 s[0:3], s[0:1], 0x24 +; GFX9-NEXT: v_lshlrev_b32_e32 v2, 2, v0 +; GFX9-NEXT: s_waitcnt lgkmcnt(0) +; GFX9-NEXT: v_mov_b32_e32 v1, s3 +; GFX9-NEXT: v_add_co_u32_e32 v0, vcc, s2, v2 +; GFX9-NEXT: v_addc_co_u32_e32 v1, vcc, 0, v1, vcc +; GFX9-NEXT: global_load_dword v3, v[0:1], off +; GFX9-NEXT: v_add_co_u32_e32 v0, vcc, s0, v2 +; GFX9-NEXT: v_mov_b32_e32 v1, s1 +; GFX9-NEXT: v_addc_co_u32_e32 v1, vcc, 0, v1, vcc +; GFX9-NEXT: s_waitcnt vmcnt(0) +; GFX9-NEXT: v_pk_lshrrev_b16 v2, 8, v3 op_sel_hi:[0,1] +; GFX9-NEXT: global_store_dword v[0:1], v2, off +; GFX9-NEXT: s_endpgm +; +; VI-LABEL: lshr_v_imm_v2i16: +; VI: ; %bb.0: +; VI-NEXT: s_load_dwordx4 s[0:3], s[0:1], 0x24 +; VI-NEXT: v_lshlrev_b32_e32 v2, 2, v0 +; VI-NEXT: s_waitcnt lgkmcnt(0) +; VI-NEXT: v_mov_b32_e32 v1, s3 +; VI-NEXT: v_add_u32_e32 v0, vcc, s2, v2 +; VI-NEXT: v_addc_u32_e32 v1, vcc, 0, v1, vcc +; VI-NEXT: flat_load_dword v3, v[0:1] +; VI-NEXT: v_add_u32_e32 v0, vcc, s0, v2 +; VI-NEXT: v_mov_b32_e32 v1, s1 +; VI-NEXT: v_addc_u32_e32 v1, vcc, 0, v1, vcc +; VI-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; VI-NEXT: v_lshrrev_b32_e32 v2, 24, v3 +; VI-NEXT: v_lshlrev_b32_e32 v2, 16, v2 +; VI-NEXT: v_or_b32_sdwa v2, v3, v2 dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:BYTE_1 src1_sel:DWORD +; VI-NEXT: flat_store_dword v[0:1], v2 +; VI-NEXT: s_endpgm +; +; CI-LABEL: lshr_v_imm_v2i16: +; CI: ; %bb.0: +; CI-NEXT: s_load_dwordx4 s[0:3], s[0:1], 0x9 +; CI-NEXT: s_mov_b32 s7, 0xf000 +; CI-NEXT: s_mov_b32 s6, 0 +; CI-NEXT: v_lshlrev_b32_e32 v0, 2, v0 +; CI-NEXT: v_mov_b32_e32 v1, 0 +; CI-NEXT: s_waitcnt lgkmcnt(0) +; CI-NEXT: 
s_mov_b64 s[4:5], s[2:3] +; CI-NEXT: buffer_load_dword v2, v[0:1], s[4:7], 0 addr64 +; CI-NEXT: s_mov_b64 s[2:3], s[6:7] +; CI-NEXT: s_waitcnt vmcnt(0) +; CI-NEXT: v_lshrrev_b32_e32 v2, 8, v2 +; CI-NEXT: v_and_b32_e32 v2, 0xff00ff, v2 +; CI-NEXT: buffer_store_dword v2, v[0:1], s[0:3], 0 addr64 +; CI-NEXT: s_endpgm %tid = call i32 @llvm.amdgcn.workitem.id.x() %tid.ext = sext i32 %tid to i64 %in.gep = getelementptr inbounds <2 x i16>, <2 x i16> addrspace(1)* %in, i64 %tid.ext @@ -111,13 +422,84 @@ define amdgpu_kernel void @lshr_v_imm_v2 ret void } -; GCN-LABEL: {{^}}v_lshr_v4i16: -; GCN: {{buffer|flat|global}}_load_dwordx2 -; GCN: {{buffer|flat|global}}_load_dwordx2 -; GFX9: v_pk_lshrrev_b16 v{{[0-9]+}}, v{{[0-9]+}}, v{{[0-9]+}} -; GFX9: v_pk_lshrrev_b16 v{{[0-9]+}}, v{{[0-9]+}}, v{{[0-9]+}} -; GCN: {{buffer|flat|global}}_store_dwordx2 define amdgpu_kernel void @v_lshr_v4i16(<4 x i16> addrspace(1)* %out, <4 x i16> addrspace(1)* %in) #0 { +; GFX9-LABEL: v_lshr_v4i16: +; GFX9: ; %bb.0: +; GFX9-NEXT: s_load_dwordx4 s[0:3], s[0:1], 0x24 +; GFX9-NEXT: v_lshlrev_b32_e32 v4, 3, v0 +; GFX9-NEXT: s_waitcnt lgkmcnt(0) +; GFX9-NEXT: v_mov_b32_e32 v1, s3 +; GFX9-NEXT: v_add_co_u32_e32 v0, vcc, s2, v4 +; GFX9-NEXT: v_addc_co_u32_e32 v1, vcc, 0, v1, vcc +; GFX9-NEXT: global_load_dwordx2 v[2:3], v[0:1], off +; GFX9-NEXT: global_load_dwordx2 v[0:1], v[0:1], off offset:8 +; GFX9-NEXT: v_mov_b32_e32 v5, s1 +; GFX9-NEXT: v_add_co_u32_e32 v4, vcc, s0, v4 +; GFX9-NEXT: v_addc_co_u32_e32 v5, vcc, 0, v5, vcc +; GFX9-NEXT: s_waitcnt vmcnt(0) +; GFX9-NEXT: v_pk_lshrrev_b16 v1, v1, v3 +; GFX9-NEXT: v_pk_lshrrev_b16 v0, v0, v2 +; GFX9-NEXT: global_store_dwordx2 v[4:5], v[0:1], off +; GFX9-NEXT: s_endpgm +; +; VI-LABEL: v_lshr_v4i16: +; VI: ; %bb.0: +; VI-NEXT: s_load_dwordx4 s[0:3], s[0:1], 0x24 +; VI-NEXT: v_lshlrev_b32_e32 v4, 3, v0 +; VI-NEXT: s_waitcnt lgkmcnt(0) +; VI-NEXT: v_mov_b32_e32 v1, s3 +; VI-NEXT: v_add_u32_e32 v0, vcc, s2, v4 +; VI-NEXT: v_addc_u32_e32 v1, vcc, 0, v1, vcc +; 
VI-NEXT: v_add_u32_e32 v2, vcc, 8, v0 +; VI-NEXT: v_addc_u32_e32 v3, vcc, 0, v1, vcc +; VI-NEXT: flat_load_dwordx2 v[0:1], v[0:1] +; VI-NEXT: flat_load_dwordx2 v[2:3], v[2:3] +; VI-NEXT: v_mov_b32_e32 v5, s1 +; VI-NEXT: v_add_u32_e32 v4, vcc, s0, v4 +; VI-NEXT: v_addc_u32_e32 v5, vcc, 0, v5, vcc +; VI-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; VI-NEXT: v_lshrrev_b16_e32 v6, v3, v1 +; VI-NEXT: v_lshrrev_b16_sdwa v1, v3, v1 dst_sel:WORD_1 dst_unused:UNUSED_PAD src0_sel:WORD_1 src1_sel:WORD_1 +; VI-NEXT: v_lshrrev_b16_e32 v3, v2, v0 +; VI-NEXT: v_lshrrev_b16_sdwa v0, v2, v0 dst_sel:WORD_1 dst_unused:UNUSED_PAD src0_sel:WORD_1 src1_sel:WORD_1 +; VI-NEXT: v_or_b32_e32 v1, v6, v1 +; VI-NEXT: v_or_b32_e32 v0, v3, v0 +; VI-NEXT: flat_store_dwordx2 v[4:5], v[0:1] +; VI-NEXT: s_endpgm +; +; CI-LABEL: v_lshr_v4i16: +; CI: ; %bb.0: +; CI-NEXT: s_load_dwordx4 s[0:3], s[0:1], 0x9 +; CI-NEXT: s_mov_b32 s7, 0xf000 +; CI-NEXT: s_mov_b32 s6, 0 +; CI-NEXT: v_lshlrev_b32_e32 v0, 3, v0 +; CI-NEXT: v_mov_b32_e32 v1, 0 +; CI-NEXT: s_waitcnt lgkmcnt(0) +; CI-NEXT: s_mov_b64 s[4:5], s[2:3] +; CI-NEXT: buffer_load_dwordx2 v[2:3], v[0:1], s[4:7], 0 addr64 +; CI-NEXT: buffer_load_dwordx2 v[4:5], v[0:1], s[4:7], 0 addr64 offset:8 +; CI-NEXT: s_mov_b32 s8, 0xffff +; CI-NEXT: s_mov_b64 s[2:3], s[6:7] +; CI-NEXT: s_waitcnt vmcnt(1) +; CI-NEXT: v_lshrrev_b32_e32 v6, 16, v2 +; CI-NEXT: v_lshrrev_b32_e32 v7, 16, v3 +; CI-NEXT: s_waitcnt vmcnt(0) +; CI-NEXT: v_lshrrev_b32_e32 v8, 16, v4 +; CI-NEXT: v_lshrrev_b32_e32 v9, 16, v5 +; CI-NEXT: v_and_b32_e32 v2, s8, v2 +; CI-NEXT: v_and_b32_e32 v4, s8, v4 +; CI-NEXT: v_and_b32_e32 v3, s8, v3 +; CI-NEXT: v_and_b32_e32 v5, s8, v5 +; CI-NEXT: v_bfe_u32 v3, v3, v5, 16 +; CI-NEXT: v_lshrrev_b32_e32 v5, v9, v7 +; CI-NEXT: v_bfe_u32 v2, v2, v4, 16 +; CI-NEXT: v_lshrrev_b32_e32 v4, v8, v6 +; CI-NEXT: v_lshlrev_b32_e32 v5, 16, v5 +; CI-NEXT: v_lshlrev_b32_e32 v4, 16, v4 +; CI-NEXT: v_or_b32_e32 v3, v3, v5 +; CI-NEXT: v_or_b32_e32 v2, v2, v4 +; CI-NEXT: 
buffer_store_dwordx2 v[2:3], v[0:1], s[0:3], 0 addr64 +; CI-NEXT: s_endpgm %tid = call i32 @llvm.amdgcn.workitem.id.x() %tid.ext = sext i32 %tid to i64 %in.gep = getelementptr inbounds <4 x i16>, <4 x i16> addrspace(1)* %in, i64 %tid.ext @@ -130,12 +512,66 @@ define amdgpu_kernel void @v_lshr_v4i16( ret void } -; GCN-LABEL: {{^}}lshr_v_imm_v4i16: -; GCN: {{buffer|flat|global}}_load_dwordx2 -; GFX9: v_pk_lshrrev_b16 v{{[0-9]+}}, 8, v{{[0-9]+}} -; GFX9: v_pk_lshrrev_b16 v{{[0-9]+}}, 8, v{{[0-9]+}} -; GCN: {{buffer|flat|global}}_store_dwordx2 define amdgpu_kernel void @lshr_v_imm_v4i16(<4 x i16> addrspace(1)* %out, <4 x i16> addrspace(1)* %in) #0 { +; GFX9-LABEL: lshr_v_imm_v4i16: +; GFX9: ; %bb.0: +; GFX9-NEXT: s_load_dwordx4 s[0:3], s[0:1], 0x24 +; GFX9-NEXT: v_lshlrev_b32_e32 v2, 3, v0 +; GFX9-NEXT: s_waitcnt lgkmcnt(0) +; GFX9-NEXT: v_mov_b32_e32 v1, s3 +; GFX9-NEXT: v_add_co_u32_e32 v0, vcc, s2, v2 +; GFX9-NEXT: v_addc_co_u32_e32 v1, vcc, 0, v1, vcc +; GFX9-NEXT: global_load_dwordx2 v[0:1], v[0:1], off +; GFX9-NEXT: v_mov_b32_e32 v3, s1 +; GFX9-NEXT: v_add_co_u32_e32 v2, vcc, s0, v2 +; GFX9-NEXT: v_addc_co_u32_e32 v3, vcc, 0, v3, vcc +; GFX9-NEXT: s_waitcnt vmcnt(0) +; GFX9-NEXT: v_pk_lshrrev_b16 v1, 8, v1 op_sel_hi:[0,1] +; GFX9-NEXT: v_pk_lshrrev_b16 v0, 8, v0 op_sel_hi:[0,1] +; GFX9-NEXT: global_store_dwordx2 v[2:3], v[0:1], off +; GFX9-NEXT: s_endpgm +; +; VI-LABEL: lshr_v_imm_v4i16: +; VI: ; %bb.0: +; VI-NEXT: s_load_dwordx4 s[0:3], s[0:1], 0x24 +; VI-NEXT: v_lshlrev_b32_e32 v2, 3, v0 +; VI-NEXT: s_waitcnt lgkmcnt(0) +; VI-NEXT: v_mov_b32_e32 v1, s3 +; VI-NEXT: v_add_u32_e32 v0, vcc, s2, v2 +; VI-NEXT: v_addc_u32_e32 v1, vcc, 0, v1, vcc +; VI-NEXT: flat_load_dwordx2 v[0:1], v[0:1] +; VI-NEXT: v_mov_b32_e32 v3, s1 +; VI-NEXT: v_add_u32_e32 v2, vcc, s0, v2 +; VI-NEXT: v_addc_u32_e32 v3, vcc, 0, v3, vcc +; VI-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; VI-NEXT: v_lshrrev_b32_e32 v4, 24, v1 +; VI-NEXT: v_lshrrev_b32_e32 v5, 24, v0 +; VI-NEXT: v_lshlrev_b32_e32 v4, 
16, v4 +; VI-NEXT: v_lshlrev_b32_e32 v5, 16, v5 +; VI-NEXT: v_or_b32_sdwa v1, v1, v4 dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:BYTE_1 src1_sel:DWORD +; VI-NEXT: v_or_b32_sdwa v0, v0, v5 dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:BYTE_1 src1_sel:DWORD +; VI-NEXT: flat_store_dwordx2 v[2:3], v[0:1] +; VI-NEXT: s_endpgm +; +; CI-LABEL: lshr_v_imm_v4i16: +; CI: ; %bb.0: +; CI-NEXT: s_load_dwordx4 s[0:3], s[0:1], 0x9 +; CI-NEXT: s_mov_b32 s7, 0xf000 +; CI-NEXT: s_mov_b32 s6, 0 +; CI-NEXT: v_lshlrev_b32_e32 v0, 3, v0 +; CI-NEXT: v_mov_b32_e32 v1, 0 +; CI-NEXT: s_waitcnt lgkmcnt(0) +; CI-NEXT: s_mov_b64 s[4:5], s[2:3] +; CI-NEXT: buffer_load_dwordx2 v[2:3], v[0:1], s[4:7], 0 addr64 +; CI-NEXT: s_mov_b32 s8, 0xff00ff +; CI-NEXT: s_mov_b64 s[2:3], s[6:7] +; CI-NEXT: s_waitcnt vmcnt(0) +; CI-NEXT: v_lshrrev_b32_e32 v3, 8, v3 +; CI-NEXT: v_lshrrev_b32_e32 v2, 8, v2 +; CI-NEXT: v_and_b32_e32 v3, s8, v3 +; CI-NEXT: v_and_b32_e32 v2, s8, v2 +; CI-NEXT: buffer_store_dwordx2 v[2:3], v[0:1], s[0:3], 0 addr64 +; CI-NEXT: s_endpgm %tid = call i32 @llvm.amdgcn.workitem.id.x() %tid.ext = sext i32 %tid to i64 %in.gep = getelementptr inbounds <4 x i16>, <4 x i16> addrspace(1)* %in, i64 %tid.ext Modified: llvm/trunk/test/CodeGen/AMDGPU/shl.v2i16.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/shl.v2i16.ll?rev=374074&r1=374073&r2=374074&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/AMDGPU/shl.v2i16.ll (original) +++ llvm/trunk/test/CodeGen/AMDGPU/shl.v2i16.ll Tue Oct 8 09:16:26 2019 @@ -1,60 +1,135 @@ +; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py ; RUN: llc -march=amdgcn -mcpu=gfx900 -mattr=-flat-for-global -verify-machineinstrs < %s | FileCheck -enable-var-scope -check-prefixes=GCN,GFX9 %s ; RUN: llc -march=amdgcn -mcpu=tonga -mattr=-flat-for-global -verify-machineinstrs < %s | FileCheck -enable-var-scope -check-prefixes=GCN,VI,CIVI %s ; RUN: llc 
-march=amdgcn -mcpu=bonaire -verify-machineinstrs < %s | FileCheck -enable-var-scope -check-prefixes=GCN,CI,CIVI %s -; GCN-LABEL: {{^}}s_shl_v2i16: -; GFX9: s_load_dword [[LHS:s[0-9]+]] -; GFX9: s_load_dword [[RHS:s[0-9]+]] -; GFX9: v_mov_b32_e32 [[VLHS:v[0-9]+]], [[LHS]] -; GFX9: v_pk_lshlrev_b16 [[RESULT:v[0-9]+]], [[RHS]], [[VLHS]] - -; VI: s_load_dword s -; VI: s_load_dword s -; VI: s_lshr_b32 -; VI: s_lshr_b32 -; VI: s_and_b32 -; VI: s_and_b32 -; VI: s_lshl_b32 -; VI: s_lshl_b32 -; VI: s_lshl_b32 -; VI: s_and_b32 -; VI: s_or_b32 - -; CI: s_load_dword s -; CI: s_load_dword s -; CI: s_lshr_b32 -; CI: s_and_b32 -; CI: s_lshr_b32 -; CI: s_lshl_b32 -; CI: s_lshl_b32 -; CI: s_lshl_b32 -; CI: s_and_b32 -; CI: s_or_b32 -; CI: _store_dword define amdgpu_kernel void @s_shl_v2i16(<2 x i16> addrspace(1)* %out, <2 x i16> %lhs, <2 x i16> %rhs) #0 { +; GFX9-LABEL: s_shl_v2i16: +; GFX9: ; %bb.0: +; GFX9-NEXT: s_load_dwordx2 s[4:5], s[0:1], 0x24 +; GFX9-NEXT: s_load_dword s2, s[0:1], 0x2c +; GFX9-NEXT: s_load_dword s0, s[0:1], 0x30 +; GFX9-NEXT: s_mov_b32 s7, 0xf000 +; GFX9-NEXT: s_mov_b32 s6, -1 +; GFX9-NEXT: s_waitcnt lgkmcnt(0) +; GFX9-NEXT: v_mov_b32_e32 v0, s2 +; GFX9-NEXT: v_pk_lshlrev_b16 v0, s0, v0 +; GFX9-NEXT: buffer_store_dword v0, off, s[4:7], 0 +; GFX9-NEXT: s_endpgm +; +; VI-LABEL: s_shl_v2i16: +; VI: ; %bb.0: +; VI-NEXT: s_load_dwordx2 s[4:5], s[0:1], 0x24 +; VI-NEXT: s_load_dword s2, s[0:1], 0x2c +; VI-NEXT: s_load_dword s0, s[0:1], 0x30 +; VI-NEXT: s_mov_b32 s3, 0xffff +; VI-NEXT: s_mov_b32 s7, 0xf000 +; VI-NEXT: s_mov_b32 s6, -1 +; VI-NEXT: s_waitcnt lgkmcnt(0) +; VI-NEXT: s_lshr_b32 s1, s2, 16 +; VI-NEXT: s_lshr_b32 s8, s0, 16 +; VI-NEXT: s_and_b32 s2, s2, s3 +; VI-NEXT: s_and_b32 s0, s0, s3 +; VI-NEXT: s_lshl_b32 s0, s2, s0 +; VI-NEXT: s_lshl_b32 s1, s1, s8 +; VI-NEXT: s_lshl_b32 s1, s1, 16 +; VI-NEXT: s_and_b32 s0, s0, s3 +; VI-NEXT: s_or_b32 s0, s0, s1 +; VI-NEXT: v_mov_b32_e32 v0, s0 +; VI-NEXT: buffer_store_dword v0, off, s[4:7], 0 +; VI-NEXT: s_endpgm 
+; +; CI-LABEL: s_shl_v2i16: +; CI: ; %bb.0: +; CI-NEXT: s_load_dwordx2 s[4:5], s[0:1], 0x9 +; CI-NEXT: s_load_dword s2, s[0:1], 0xb +; CI-NEXT: s_load_dword s0, s[0:1], 0xc +; CI-NEXT: s_mov_b32 s3, 0xffff +; CI-NEXT: s_mov_b32 s7, 0xf000 +; CI-NEXT: s_mov_b32 s6, -1 +; CI-NEXT: s_waitcnt lgkmcnt(0) +; CI-NEXT: s_lshr_b32 s1, s2, 16 +; CI-NEXT: s_and_b32 s8, s0, s3 +; CI-NEXT: s_lshr_b32 s0, s0, 16 +; CI-NEXT: s_lshl_b32 s0, s1, s0 +; CI-NEXT: s_lshl_b32 s1, s2, s8 +; CI-NEXT: s_lshl_b32 s0, s0, 16 +; CI-NEXT: s_and_b32 s1, s1, s3 +; CI-NEXT: s_or_b32 s0, s1, s0 +; CI-NEXT: v_mov_b32_e32 v0, s0 +; CI-NEXT: buffer_store_dword v0, off, s[4:7], 0 +; CI-NEXT: s_endpgm %result = shl <2 x i16> %lhs, %rhs store <2 x i16> %result, <2 x i16> addrspace(1)* %out ret void } -; GCN-LABEL: {{^}}v_shl_v2i16: -; GCN: {{buffer|flat|global}}_load_dword [[LHS:v[0-9]+]] -; GCN: {{buffer|flat|global}}_load_dword [[RHS:v[0-9]+]] -; GFX9: v_pk_lshlrev_b16 [[RESULT:v[0-9]+]], [[RHS]], [[LHS]] - -; VI: v_lshlrev_b16_e32 v{{[0-9]+}}, v{{[0-9]+}}, v{{[0-9]+}} -; VI: v_lshlrev_b16_sdwa v{{[0-9]+}}, v{{[0-9]+}}, v{{[0-9]+}} dst_sel:WORD_1 dst_unused:UNUSED_PAD src0_sel:WORD_1 src1_sel:WORD_1 -; VI: v_or_b32_e32 v{{[0-9]+}}, v{{[0-9]+}}, v{{[0-9]+}} - -; CI: s_mov_b32 [[MASK:s[0-9]+]], 0xffff{{$}} -; CI: v_lshrrev_b32_e32 v{{[0-9]+}}, 16, [[LHS]] -; CI: v_lshrrev_b32_e32 v{{[0-9]+}}, 16, v{{[0-9]+}} -; CI: v_lshlrev_b32_e32 v{{[0-9]+}}, v{{[0-9]+}}, v{{[0-9]+}} -; CI: v_lshlrev_b32_e32 v{{[0-9]+}}, v{{[0-9]+}}, v{{[0-9]+}} -; CI: v_lshlrev_b32_e32 v{{[0-9]+}}, 16, v{{[0-9]+}} -; CI: v_and_b32_e32 v{{[0-9]+}}, [[MASK]], v{{[0-9]+}} -; CI: v_or_b32_e32 v{{[0-9]+}}, v{{[0-9]+}}, v{{[0-9]+}} define amdgpu_kernel void @v_shl_v2i16(<2 x i16> addrspace(1)* %out, <2 x i16> addrspace(1)* %in) #0 { +; GFX9-LABEL: v_shl_v2i16: +; GFX9: ; %bb.0: +; GFX9-NEXT: s_load_dwordx4 s[0:3], s[0:1], 0x24 +; GFX9-NEXT: v_lshlrev_b32_e32 v2, 2, v0 +; GFX9-NEXT: s_waitcnt lgkmcnt(0) +; GFX9-NEXT: v_mov_b32_e32 v1, s3 
+; GFX9-NEXT: v_add_co_u32_e32 v0, vcc, s2, v2 +; GFX9-NEXT: v_addc_co_u32_e32 v1, vcc, 0, v1, vcc +; GFX9-NEXT: global_load_dword v3, v[0:1], off +; GFX9-NEXT: global_load_dword v4, v[0:1], off offset:4 +; GFX9-NEXT: v_add_co_u32_e32 v0, vcc, s0, v2 +; GFX9-NEXT: v_mov_b32_e32 v1, s1 +; GFX9-NEXT: v_addc_co_u32_e32 v1, vcc, 0, v1, vcc +; GFX9-NEXT: s_waitcnt vmcnt(0) +; GFX9-NEXT: v_pk_lshlrev_b16 v2, v4, v3 +; GFX9-NEXT: global_store_dword v[0:1], v2, off +; GFX9-NEXT: s_endpgm +; +; VI-LABEL: v_shl_v2i16: +; VI: ; %bb.0: +; VI-NEXT: s_load_dwordx4 s[0:3], s[0:1], 0x24 +; VI-NEXT: v_lshlrev_b32_e32 v4, 2, v0 +; VI-NEXT: s_waitcnt lgkmcnt(0) +; VI-NEXT: v_mov_b32_e32 v1, s3 +; VI-NEXT: v_add_u32_e32 v0, vcc, s2, v4 +; VI-NEXT: v_addc_u32_e32 v1, vcc, 0, v1, vcc +; VI-NEXT: v_add_u32_e32 v2, vcc, 4, v0 +; VI-NEXT: v_addc_u32_e32 v3, vcc, 0, v1, vcc +; VI-NEXT: flat_load_dword v5, v[0:1] +; VI-NEXT: flat_load_dword v2, v[2:3] +; VI-NEXT: v_mov_b32_e32 v1, s1 +; VI-NEXT: v_add_u32_e32 v0, vcc, s0, v4 +; VI-NEXT: v_addc_u32_e32 v1, vcc, 0, v1, vcc +; VI-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; VI-NEXT: v_lshlrev_b16_e32 v3, v2, v5 +; VI-NEXT: v_lshlrev_b16_sdwa v2, v2, v5 dst_sel:WORD_1 dst_unused:UNUSED_PAD src0_sel:WORD_1 src1_sel:WORD_1 +; VI-NEXT: v_or_b32_e32 v2, v3, v2 +; VI-NEXT: flat_store_dword v[0:1], v2 +; VI-NEXT: s_endpgm +; +; CI-LABEL: v_shl_v2i16: +; CI: ; %bb.0: +; CI-NEXT: s_load_dwordx4 s[0:3], s[0:1], 0x9 +; CI-NEXT: s_mov_b32 s7, 0xf000 +; CI-NEXT: s_mov_b32 s6, 0 +; CI-NEXT: v_lshlrev_b32_e32 v0, 2, v0 +; CI-NEXT: v_mov_b32_e32 v1, 0 +; CI-NEXT: s_waitcnt lgkmcnt(0) +; CI-NEXT: s_mov_b64 s[4:5], s[2:3] +; CI-NEXT: buffer_load_dword v2, v[0:1], s[4:7], 0 addr64 +; CI-NEXT: buffer_load_dword v3, v[0:1], s[4:7], 0 addr64 offset:4 +; CI-NEXT: s_mov_b32 s8, 0xffff +; CI-NEXT: s_mov_b64 s[2:3], s[6:7] +; CI-NEXT: s_waitcnt vmcnt(1) +; CI-NEXT: v_lshrrev_b32_e32 v4, 16, v2 +; CI-NEXT: s_waitcnt vmcnt(0) +; CI-NEXT: v_and_b32_e32 v5, s8, v3 +; CI-NEXT: 
v_lshrrev_b32_e32 v3, 16, v3 +; CI-NEXT: v_lshlrev_b32_e32 v3, v3, v4 +; CI-NEXT: v_lshlrev_b32_e32 v2, v5, v2 +; CI-NEXT: v_lshlrev_b32_e32 v3, 16, v3 +; CI-NEXT: v_and_b32_e32 v2, s8, v2 +; CI-NEXT: v_or_b32_e32 v2, v2, v3 +; CI-NEXT: buffer_store_dword v2, v[0:1], s[0:3], 0 addr64 +; CI-NEXT: s_endpgm %tid = call i32 @llvm.amdgcn.workitem.id.x() %tid.ext = sext i32 %tid to i64 %in.gep = getelementptr inbounds <2 x i16>, <2 x i16> addrspace(1)* %in, i64 %tid.ext @@ -67,11 +142,71 @@ define amdgpu_kernel void @v_shl_v2i16(< ret void } -; GCN-LABEL: {{^}}shl_v_s_v2i16: -; GFX9: s_load_dword [[RHS:s[0-9]+]] -; GFX9: {{buffer|flat|global}}_load_dword [[LHS:v[0-9]+]] -; GFX9: v_pk_lshlrev_b16 [[RESULT:v[0-9]+]], [[RHS]], [[LHS]] define amdgpu_kernel void @shl_v_s_v2i16(<2 x i16> addrspace(1)* %out, <2 x i16> addrspace(1)* %in, <2 x i16> %sgpr) #0 { +; GFX9-LABEL: shl_v_s_v2i16: +; GFX9: ; %bb.0: +; GFX9-NEXT: s_load_dwordx4 s[4:7], s[0:1], 0x24 +; GFX9-NEXT: v_lshlrev_b32_e32 v2, 2, v0 +; GFX9-NEXT: s_load_dword s0, s[0:1], 0x34 +; GFX9-NEXT: s_waitcnt lgkmcnt(0) +; GFX9-NEXT: v_mov_b32_e32 v1, s7 +; GFX9-NEXT: v_add_co_u32_e32 v0, vcc, s6, v2 +; GFX9-NEXT: v_addc_co_u32_e32 v1, vcc, 0, v1, vcc +; GFX9-NEXT: global_load_dword v3, v[0:1], off +; GFX9-NEXT: v_add_co_u32_e32 v0, vcc, s4, v2 +; GFX9-NEXT: v_mov_b32_e32 v1, s5 +; GFX9-NEXT: v_addc_co_u32_e32 v1, vcc, 0, v1, vcc +; GFX9-NEXT: s_waitcnt vmcnt(0) +; GFX9-NEXT: v_pk_lshlrev_b16 v2, s0, v3 +; GFX9-NEXT: global_store_dword v[0:1], v2, off +; GFX9-NEXT: s_endpgm +; +; VI-LABEL: shl_v_s_v2i16: +; VI: ; %bb.0: +; VI-NEXT: s_load_dwordx4 s[4:7], s[0:1], 0x24 +; VI-NEXT: v_lshlrev_b32_e32 v2, 2, v0 +; VI-NEXT: s_load_dword s0, s[0:1], 0x34 +; VI-NEXT: s_waitcnt lgkmcnt(0) +; VI-NEXT: v_mov_b32_e32 v1, s7 +; VI-NEXT: v_add_u32_e32 v0, vcc, s6, v2 +; VI-NEXT: v_addc_u32_e32 v1, vcc, 0, v1, vcc +; VI-NEXT: flat_load_dword v3, v[0:1] +; VI-NEXT: s_lshr_b32 s1, s0, 16 +; VI-NEXT: v_mov_b32_e32 v4, s1 +; VI-NEXT: 
v_add_u32_e32 v0, vcc, s4, v2 +; VI-NEXT: v_mov_b32_e32 v1, s5 +; VI-NEXT: v_addc_u32_e32 v1, vcc, 0, v1, vcc +; VI-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; VI-NEXT: v_lshlrev_b16_e32 v2, s0, v3 +; VI-NEXT: v_lshlrev_b16_sdwa v3, v4, v3 dst_sel:WORD_1 dst_unused:UNUSED_PAD src0_sel:DWORD src1_sel:WORD_1 +; VI-NEXT: v_or_b32_e32 v2, v2, v3 +; VI-NEXT: flat_store_dword v[0:1], v2 +; VI-NEXT: s_endpgm +; +; CI-LABEL: shl_v_s_v2i16: +; CI: ; %bb.0: +; CI-NEXT: s_load_dwordx4 s[4:7], s[0:1], 0x9 +; CI-NEXT: s_load_dword s8, s[0:1], 0xd +; CI-NEXT: s_mov_b32 s3, 0xf000 +; CI-NEXT: s_mov_b32 s2, 0 +; CI-NEXT: v_lshlrev_b32_e32 v0, 2, v0 +; CI-NEXT: s_waitcnt lgkmcnt(0) +; CI-NEXT: s_mov_b64 s[0:1], s[6:7] +; CI-NEXT: v_mov_b32_e32 v1, 0 +; CI-NEXT: buffer_load_dword v2, v[0:1], s[0:3], 0 addr64 +; CI-NEXT: s_mov_b32 s9, 0xffff +; CI-NEXT: s_lshr_b32 s10, s8, 16 +; CI-NEXT: s_and_b32 s8, s8, s9 +; CI-NEXT: s_mov_b64 s[6:7], s[2:3] +; CI-NEXT: s_waitcnt vmcnt(0) +; CI-NEXT: v_lshrrev_b32_e32 v3, 16, v2 +; CI-NEXT: v_lshlrev_b32_e32 v2, s8, v2 +; CI-NEXT: v_lshlrev_b32_e32 v3, s10, v3 +; CI-NEXT: v_and_b32_e32 v2, s9, v2 +; CI-NEXT: v_lshlrev_b32_e32 v3, 16, v3 +; CI-NEXT: v_or_b32_e32 v2, v2, v3 +; CI-NEXT: buffer_store_dword v2, v[0:1], s[4:7], 0 addr64 +; CI-NEXT: s_endpgm %tid = call i32 @llvm.amdgcn.workitem.id.x() %tid.ext = sext i32 %tid to i64 %in.gep = getelementptr inbounds <2 x i16>, <2 x i16> addrspace(1)* %in, i64 %tid.ext @@ -82,11 +217,71 @@ define amdgpu_kernel void @shl_v_s_v2i16 ret void } -; GCN-LABEL: {{^}}shl_s_v_v2i16: -; GFX9: s_load_dword [[LHS:s[0-9]+]] -; GFX9: {{buffer|flat|global}}_load_dword [[RHS:v[0-9]+]] -; GFX9: v_pk_lshlrev_b16 [[RESULT:v[0-9]+]], [[RHS]], [[LHS]] define amdgpu_kernel void @shl_s_v_v2i16(<2 x i16> addrspace(1)* %out, <2 x i16> addrspace(1)* %in, <2 x i16> %sgpr) #0 { +; GFX9-LABEL: shl_s_v_v2i16: +; GFX9: ; %bb.0: +; GFX9-NEXT: s_load_dwordx4 s[4:7], s[0:1], 0x24 +; GFX9-NEXT: v_lshlrev_b32_e32 v2, 2, v0 +; GFX9-NEXT: 
s_load_dword s0, s[0:1], 0x34 +; GFX9-NEXT: s_waitcnt lgkmcnt(0) +; GFX9-NEXT: v_mov_b32_e32 v1, s7 +; GFX9-NEXT: v_add_co_u32_e32 v0, vcc, s6, v2 +; GFX9-NEXT: v_addc_co_u32_e32 v1, vcc, 0, v1, vcc +; GFX9-NEXT: global_load_dword v3, v[0:1], off +; GFX9-NEXT: v_add_co_u32_e32 v0, vcc, s4, v2 +; GFX9-NEXT: v_mov_b32_e32 v1, s5 +; GFX9-NEXT: v_addc_co_u32_e32 v1, vcc, 0, v1, vcc +; GFX9-NEXT: s_waitcnt vmcnt(0) +; GFX9-NEXT: v_pk_lshlrev_b16 v2, v3, s0 +; GFX9-NEXT: global_store_dword v[0:1], v2, off +; GFX9-NEXT: s_endpgm +; +; VI-LABEL: shl_s_v_v2i16: +; VI: ; %bb.0: +; VI-NEXT: s_load_dwordx4 s[4:7], s[0:1], 0x24 +; VI-NEXT: v_lshlrev_b32_e32 v2, 2, v0 +; VI-NEXT: s_load_dword s0, s[0:1], 0x34 +; VI-NEXT: s_waitcnt lgkmcnt(0) +; VI-NEXT: v_mov_b32_e32 v1, s7 +; VI-NEXT: v_add_u32_e32 v0, vcc, s6, v2 +; VI-NEXT: v_addc_u32_e32 v1, vcc, 0, v1, vcc +; VI-NEXT: flat_load_dword v3, v[0:1] +; VI-NEXT: s_lshr_b32 s1, s0, 16 +; VI-NEXT: v_mov_b32_e32 v4, s1 +; VI-NEXT: v_add_u32_e32 v0, vcc, s4, v2 +; VI-NEXT: v_mov_b32_e32 v1, s5 +; VI-NEXT: v_addc_u32_e32 v1, vcc, 0, v1, vcc +; VI-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; VI-NEXT: v_lshlrev_b16_e64 v2, v3, s0 +; VI-NEXT: v_lshlrev_b16_sdwa v3, v3, v4 dst_sel:WORD_1 dst_unused:UNUSED_PAD src0_sel:WORD_1 src1_sel:DWORD +; VI-NEXT: v_or_b32_e32 v2, v2, v3 +; VI-NEXT: flat_store_dword v[0:1], v2 +; VI-NEXT: s_endpgm +; +; CI-LABEL: shl_s_v_v2i16: +; CI: ; %bb.0: +; CI-NEXT: s_load_dwordx4 s[4:7], s[0:1], 0x9 +; CI-NEXT: s_load_dword s8, s[0:1], 0xd +; CI-NEXT: s_mov_b32 s3, 0xf000 +; CI-NEXT: s_mov_b32 s2, 0 +; CI-NEXT: v_lshlrev_b32_e32 v0, 2, v0 +; CI-NEXT: s_waitcnt lgkmcnt(0) +; CI-NEXT: s_mov_b64 s[0:1], s[6:7] +; CI-NEXT: v_mov_b32_e32 v1, 0 +; CI-NEXT: buffer_load_dword v2, v[0:1], s[0:3], 0 addr64 +; CI-NEXT: s_mov_b32 s0, 0xffff +; CI-NEXT: s_lshr_b32 s1, s8, 16 +; CI-NEXT: s_mov_b64 s[6:7], s[2:3] +; CI-NEXT: s_waitcnt vmcnt(0) +; CI-NEXT: v_and_b32_e32 v3, s0, v2 +; CI-NEXT: v_lshrrev_b32_e32 v2, 16, v2 +; CI-NEXT: 
v_lshl_b32_e32 v2, s1, v2 +; CI-NEXT: v_lshl_b32_e32 v3, s8, v3 +; CI-NEXT: v_lshlrev_b32_e32 v2, 16, v2 +; CI-NEXT: v_and_b32_e32 v3, s0, v3 +; CI-NEXT: v_or_b32_e32 v2, v3, v2 +; CI-NEXT: buffer_store_dword v2, v[0:1], s[4:7], 0 addr64 +; CI-NEXT: s_endpgm %tid = call i32 @llvm.amdgcn.workitem.id.x() %tid.ext = sext i32 %tid to i64 %in.gep = getelementptr inbounds <2 x i16>, <2 x i16> addrspace(1)* %in, i64 %tid.ext @@ -97,10 +292,66 @@ define amdgpu_kernel void @shl_s_v_v2i16 ret void } -; GCN-LABEL: {{^}}shl_imm_v_v2i16: -; GCN: {{buffer|flat|global}}_load_dword [[RHS:v[0-9]+]] -; GFX9: v_pk_lshlrev_b16 [[RESULT:v[0-9]+]], [[RHS]], 8 define amdgpu_kernel void @shl_imm_v_v2i16(<2 x i16> addrspace(1)* %out, <2 x i16> addrspace(1)* %in) #0 { +; GFX9-LABEL: shl_imm_v_v2i16: +; GFX9: ; %bb.0: +; GFX9-NEXT: s_load_dwordx4 s[0:3], s[0:1], 0x24 +; GFX9-NEXT: v_lshlrev_b32_e32 v2, 2, v0 +; GFX9-NEXT: s_waitcnt lgkmcnt(0) +; GFX9-NEXT: v_mov_b32_e32 v1, s3 +; GFX9-NEXT: v_add_co_u32_e32 v0, vcc, s2, v2 +; GFX9-NEXT: v_addc_co_u32_e32 v1, vcc, 0, v1, vcc +; GFX9-NEXT: global_load_dword v3, v[0:1], off +; GFX9-NEXT: v_add_co_u32_e32 v0, vcc, s0, v2 +; GFX9-NEXT: v_mov_b32_e32 v1, s1 +; GFX9-NEXT: v_addc_co_u32_e32 v1, vcc, 0, v1, vcc +; GFX9-NEXT: s_waitcnt vmcnt(0) +; GFX9-NEXT: v_pk_lshlrev_b16 v2, v3, 8 op_sel_hi:[1,0] +; GFX9-NEXT: global_store_dword v[0:1], v2, off +; GFX9-NEXT: s_endpgm +; +; VI-LABEL: shl_imm_v_v2i16: +; VI: ; %bb.0: +; VI-NEXT: s_load_dwordx4 s[0:3], s[0:1], 0x24 +; VI-NEXT: v_lshlrev_b32_e32 v2, 2, v0 +; VI-NEXT: v_mov_b32_e32 v3, 8 +; VI-NEXT: s_waitcnt lgkmcnt(0) +; VI-NEXT: v_mov_b32_e32 v1, s3 +; VI-NEXT: v_add_u32_e32 v0, vcc, s2, v2 +; VI-NEXT: v_addc_u32_e32 v1, vcc, 0, v1, vcc +; VI-NEXT: flat_load_dword v4, v[0:1] +; VI-NEXT: v_add_u32_e32 v0, vcc, s0, v2 +; VI-NEXT: v_mov_b32_e32 v1, s1 +; VI-NEXT: v_addc_u32_e32 v1, vcc, 0, v1, vcc +; VI-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; VI-NEXT: v_lshlrev_b16_e64 v2, v4, 8 +; VI-NEXT: 
v_lshlrev_b16_sdwa v3, v4, v3 dst_sel:WORD_1 dst_unused:UNUSED_PAD src0_sel:WORD_1 src1_sel:DWORD +; VI-NEXT: v_or_b32_e32 v2, v2, v3 +; VI-NEXT: flat_store_dword v[0:1], v2 +; VI-NEXT: s_endpgm +; +; CI-LABEL: shl_imm_v_v2i16: +; CI: ; %bb.0: +; CI-NEXT: s_load_dwordx4 s[0:3], s[0:1], 0x9 +; CI-NEXT: s_mov_b32 s7, 0xf000 +; CI-NEXT: s_mov_b32 s6, 0 +; CI-NEXT: v_lshlrev_b32_e32 v0, 2, v0 +; CI-NEXT: v_mov_b32_e32 v1, 0 +; CI-NEXT: s_waitcnt lgkmcnt(0) +; CI-NEXT: s_mov_b64 s[4:5], s[2:3] +; CI-NEXT: buffer_load_dword v2, v[0:1], s[4:7], 0 addr64 +; CI-NEXT: s_mov_b32 s4, 0xffff +; CI-NEXT: s_mov_b64 s[2:3], s[6:7] +; CI-NEXT: s_waitcnt vmcnt(0) +; CI-NEXT: v_and_b32_e32 v3, s4, v2 +; CI-NEXT: v_lshrrev_b32_e32 v2, 16, v2 +; CI-NEXT: v_lshl_b32_e32 v2, 8, v2 +; CI-NEXT: v_lshl_b32_e32 v3, 8, v3 +; CI-NEXT: v_lshlrev_b32_e32 v2, 16, v2 +; CI-NEXT: v_and_b32_e32 v3, s4, v3 +; CI-NEXT: v_or_b32_e32 v2, v3, v2 +; CI-NEXT: buffer_store_dword v2, v[0:1], s[0:3], 0 addr64 +; CI-NEXT: s_endpgm %tid = call i32 @llvm.amdgcn.workitem.id.x() %tid.ext = sext i32 %tid to i64 %in.gep = getelementptr inbounds <2 x i16>, <2 x i16> addrspace(1)* %in, i64 %tid.ext @@ -111,10 +362,60 @@ define amdgpu_kernel void @shl_imm_v_v2i ret void } -; GCN-LABEL: {{^}}shl_v_imm_v2i16: -; GCN: {{buffer|flat|global}}_load_dword [[LHS:v[0-9]+]] -; GFX9: v_pk_lshlrev_b16 [[RESULT:v[0-9]+]], 8, [[LHS]] define amdgpu_kernel void @shl_v_imm_v2i16(<2 x i16> addrspace(1)* %out, <2 x i16> addrspace(1)* %in) #0 { +; GFX9-LABEL: shl_v_imm_v2i16: +; GFX9: ; %bb.0: +; GFX9-NEXT: s_load_dwordx4 s[0:3], s[0:1], 0x24 +; GFX9-NEXT: v_lshlrev_b32_e32 v2, 2, v0 +; GFX9-NEXT: s_waitcnt lgkmcnt(0) +; GFX9-NEXT: v_mov_b32_e32 v1, s3 +; GFX9-NEXT: v_add_co_u32_e32 v0, vcc, s2, v2 +; GFX9-NEXT: v_addc_co_u32_e32 v1, vcc, 0, v1, vcc +; GFX9-NEXT: global_load_dword v3, v[0:1], off +; GFX9-NEXT: v_add_co_u32_e32 v0, vcc, s0, v2 +; GFX9-NEXT: v_mov_b32_e32 v1, s1 +; GFX9-NEXT: v_addc_co_u32_e32 v1, vcc, 0, v1, vcc +; 
GFX9-NEXT: s_waitcnt vmcnt(0) +; GFX9-NEXT: v_pk_lshlrev_b16 v2, 8, v3 op_sel_hi:[0,1] +; GFX9-NEXT: global_store_dword v[0:1], v2, off +; GFX9-NEXT: s_endpgm +; +; VI-LABEL: shl_v_imm_v2i16: +; VI: ; %bb.0: +; VI-NEXT: s_load_dwordx4 s[0:3], s[0:1], 0x24 +; VI-NEXT: v_lshlrev_b32_e32 v2, 2, v0 +; VI-NEXT: s_waitcnt lgkmcnt(0) +; VI-NEXT: v_mov_b32_e32 v1, s3 +; VI-NEXT: v_add_u32_e32 v0, vcc, s2, v2 +; VI-NEXT: v_addc_u32_e32 v1, vcc, 0, v1, vcc +; VI-NEXT: flat_load_dword v3, v[0:1] +; VI-NEXT: v_add_u32_e32 v0, vcc, s0, v2 +; VI-NEXT: v_mov_b32_e32 v1, s1 +; VI-NEXT: v_addc_u32_e32 v1, vcc, 0, v1, vcc +; VI-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; VI-NEXT: v_lshlrev_b32_e32 v2, 8, v3 +; VI-NEXT: v_and_b32_e32 v2, 0xff000000, v2 +; VI-NEXT: v_lshlrev_b16_e32 v3, 8, v3 +; VI-NEXT: v_or_b32_e32 v2, v3, v2 +; VI-NEXT: flat_store_dword v[0:1], v2 +; VI-NEXT: s_endpgm +; +; CI-LABEL: shl_v_imm_v2i16: +; CI: ; %bb.0: +; CI-NEXT: s_load_dwordx4 s[0:3], s[0:1], 0x9 +; CI-NEXT: s_mov_b32 s7, 0xf000 +; CI-NEXT: s_mov_b32 s6, 0 +; CI-NEXT: v_lshlrev_b32_e32 v0, 2, v0 +; CI-NEXT: v_mov_b32_e32 v1, 0 +; CI-NEXT: s_waitcnt lgkmcnt(0) +; CI-NEXT: s_mov_b64 s[4:5], s[2:3] +; CI-NEXT: buffer_load_dword v2, v[0:1], s[4:7], 0 addr64 +; CI-NEXT: s_mov_b64 s[2:3], s[6:7] +; CI-NEXT: s_waitcnt vmcnt(0) +; CI-NEXT: v_lshlrev_b32_e32 v2, 8, v2 +; CI-NEXT: v_and_b32_e32 v2, 0xff00ff00, v2 +; CI-NEXT: buffer_store_dword v2, v[0:1], s[0:3], 0 addr64 +; CI-NEXT: s_endpgm %tid = call i32 @llvm.amdgcn.workitem.id.x() %tid.ext = sext i32 %tid to i64 %in.gep = getelementptr inbounds <2 x i16>, <2 x i16> addrspace(1)* %in, i64 %tid.ext @@ -125,13 +426,84 @@ define amdgpu_kernel void @shl_v_imm_v2i ret void } -; GCN-LABEL: {{^}}v_shl_v4i16: -; GCN: {{buffer|flat|global}}_load_dwordx2 -; GCN: {{buffer|flat|global}}_load_dwordx2 -; GFX9: v_pk_lshlrev_b16 v{{[0-9]+}}, v{{[0-9]+}}, v{{[0-9]+}} -; GFX9: v_pk_lshlrev_b16 v{{[0-9]+}}, v{{[0-9]+}}, v{{[0-9]+}} -; GCN: {{buffer|flat|global}}_store_dwordx2 
define amdgpu_kernel void @v_shl_v4i16(<4 x i16> addrspace(1)* %out, <4 x i16> addrspace(1)* %in) #0 { +; GFX9-LABEL: v_shl_v4i16: +; GFX9: ; %bb.0: +; GFX9-NEXT: s_load_dwordx4 s[0:3], s[0:1], 0x24 +; GFX9-NEXT: v_lshlrev_b32_e32 v4, 3, v0 +; GFX9-NEXT: s_waitcnt lgkmcnt(0) +; GFX9-NEXT: v_mov_b32_e32 v1, s3 +; GFX9-NEXT: v_add_co_u32_e32 v0, vcc, s2, v4 +; GFX9-NEXT: v_addc_co_u32_e32 v1, vcc, 0, v1, vcc +; GFX9-NEXT: global_load_dwordx2 v[2:3], v[0:1], off +; GFX9-NEXT: global_load_dwordx2 v[0:1], v[0:1], off offset:8 +; GFX9-NEXT: v_mov_b32_e32 v5, s1 +; GFX9-NEXT: v_add_co_u32_e32 v4, vcc, s0, v4 +; GFX9-NEXT: v_addc_co_u32_e32 v5, vcc, 0, v5, vcc +; GFX9-NEXT: s_waitcnt vmcnt(0) +; GFX9-NEXT: v_pk_lshlrev_b16 v1, v1, v3 +; GFX9-NEXT: v_pk_lshlrev_b16 v0, v0, v2 +; GFX9-NEXT: global_store_dwordx2 v[4:5], v[0:1], off +; GFX9-NEXT: s_endpgm +; +; VI-LABEL: v_shl_v4i16: +; VI: ; %bb.0: +; VI-NEXT: s_load_dwordx4 s[0:3], s[0:1], 0x24 +; VI-NEXT: v_lshlrev_b32_e32 v4, 3, v0 +; VI-NEXT: s_waitcnt lgkmcnt(0) +; VI-NEXT: v_mov_b32_e32 v1, s3 +; VI-NEXT: v_add_u32_e32 v0, vcc, s2, v4 +; VI-NEXT: v_addc_u32_e32 v1, vcc, 0, v1, vcc +; VI-NEXT: v_add_u32_e32 v2, vcc, 8, v0 +; VI-NEXT: v_addc_u32_e32 v3, vcc, 0, v1, vcc +; VI-NEXT: flat_load_dwordx2 v[0:1], v[0:1] +; VI-NEXT: flat_load_dwordx2 v[2:3], v[2:3] +; VI-NEXT: v_mov_b32_e32 v5, s1 +; VI-NEXT: v_add_u32_e32 v4, vcc, s0, v4 +; VI-NEXT: v_addc_u32_e32 v5, vcc, 0, v5, vcc +; VI-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; VI-NEXT: v_lshlrev_b16_e32 v6, v3, v1 +; VI-NEXT: v_lshlrev_b16_sdwa v1, v3, v1 dst_sel:WORD_1 dst_unused:UNUSED_PAD src0_sel:WORD_1 src1_sel:WORD_1 +; VI-NEXT: v_lshlrev_b16_e32 v3, v2, v0 +; VI-NEXT: v_lshlrev_b16_sdwa v0, v2, v0 dst_sel:WORD_1 dst_unused:UNUSED_PAD src0_sel:WORD_1 src1_sel:WORD_1 +; VI-NEXT: v_or_b32_e32 v1, v6, v1 +; VI-NEXT: v_or_b32_e32 v0, v3, v0 +; VI-NEXT: flat_store_dwordx2 v[4:5], v[0:1] +; VI-NEXT: s_endpgm +; +; CI-LABEL: v_shl_v4i16: +; CI: ; %bb.0: +; CI-NEXT: 
s_load_dwordx4 s[0:3], s[0:1], 0x9 +; CI-NEXT: s_mov_b32 s7, 0xf000 +; CI-NEXT: s_mov_b32 s6, 0 +; CI-NEXT: v_lshlrev_b32_e32 v0, 3, v0 +; CI-NEXT: v_mov_b32_e32 v1, 0 +; CI-NEXT: s_waitcnt lgkmcnt(0) +; CI-NEXT: s_mov_b64 s[4:5], s[2:3] +; CI-NEXT: buffer_load_dwordx2 v[2:3], v[0:1], s[4:7], 0 addr64 +; CI-NEXT: buffer_load_dwordx2 v[4:5], v[0:1], s[4:7], 0 addr64 offset:8 +; CI-NEXT: s_mov_b32 s8, 0xffff +; CI-NEXT: s_mov_b64 s[2:3], s[6:7] +; CI-NEXT: s_waitcnt vmcnt(1) +; CI-NEXT: v_lshrrev_b32_e32 v6, 16, v2 +; CI-NEXT: s_waitcnt vmcnt(0) +; CI-NEXT: v_and_b32_e32 v8, s8, v4 +; CI-NEXT: v_lshrrev_b32_e32 v4, 16, v4 +; CI-NEXT: v_and_b32_e32 v9, s8, v5 +; CI-NEXT: v_lshrrev_b32_e32 v7, 16, v3 +; CI-NEXT: v_lshrrev_b32_e32 v5, 16, v5 +; CI-NEXT: v_lshlrev_b32_e32 v5, v5, v7 +; CI-NEXT: v_lshlrev_b32_e32 v3, v9, v3 +; CI-NEXT: v_lshlrev_b32_e32 v4, v4, v6 +; CI-NEXT: v_lshlrev_b32_e32 v2, v8, v2 +; CI-NEXT: v_lshlrev_b32_e32 v5, 16, v5 +; CI-NEXT: v_and_b32_e32 v3, s8, v3 +; CI-NEXT: v_lshlrev_b32_e32 v4, 16, v4 +; CI-NEXT: v_and_b32_e32 v2, s8, v2 +; CI-NEXT: v_or_b32_e32 v3, v3, v5 +; CI-NEXT: v_or_b32_e32 v2, v2, v4 +; CI-NEXT: buffer_store_dwordx2 v[2:3], v[0:1], s[0:3], 0 addr64 +; CI-NEXT: s_endpgm %tid = call i32 @llvm.amdgcn.workitem.id.x() %tid.ext = sext i32 %tid to i64 %in.gep = getelementptr inbounds <4 x i16>, <4 x i16> addrspace(1)* %in, i64 %tid.ext @@ -144,12 +516,73 @@ define amdgpu_kernel void @v_shl_v4i16(< ret void } -; GCN-LABEL: {{^}}shl_v_imm_v4i16: -; GCN: {{buffer|flat|global}}_load_dwordx2 -; GFX9: v_pk_lshlrev_b16 v{{[0-9]+}}, 8, v{{[0-9]+}} -; GFX9: v_pk_lshlrev_b16 v{{[0-9]+}}, 8, v{{[0-9]+}} -; GCN: {{buffer|flat|global}}_store_dwordx2 define amdgpu_kernel void @shl_v_imm_v4i16(<4 x i16> addrspace(1)* %out, <4 x i16> addrspace(1)* %in) #0 { +; GFX9-LABEL: shl_v_imm_v4i16: +; GFX9: ; %bb.0: +; GFX9-NEXT: s_load_dwordx4 s[0:3], s[0:1], 0x24 +; GFX9-NEXT: v_lshlrev_b32_e32 v2, 3, v0 +; GFX9-NEXT: s_waitcnt lgkmcnt(0) +; GFX9-NEXT: 
v_mov_b32_e32 v1, s3 +; GFX9-NEXT: v_add_co_u32_e32 v0, vcc, s2, v2 +; GFX9-NEXT: v_addc_co_u32_e32 v1, vcc, 0, v1, vcc +; GFX9-NEXT: global_load_dwordx2 v[0:1], v[0:1], off +; GFX9-NEXT: v_mov_b32_e32 v3, s1 +; GFX9-NEXT: v_add_co_u32_e32 v2, vcc, s0, v2 +; GFX9-NEXT: v_addc_co_u32_e32 v3, vcc, 0, v3, vcc +; GFX9-NEXT: s_waitcnt vmcnt(0) +; GFX9-NEXT: v_pk_lshlrev_b16 v1, 8, v1 op_sel_hi:[0,1] +; GFX9-NEXT: v_pk_lshlrev_b16 v0, 8, v0 op_sel_hi:[0,1] +; GFX9-NEXT: global_store_dwordx2 v[2:3], v[0:1], off +; GFX9-NEXT: s_endpgm +; +; VI-LABEL: shl_v_imm_v4i16: +; VI: ; %bb.0: +; VI-NEXT: s_load_dwordx4 s[0:3], s[0:1], 0x24 +; VI-NEXT: v_lshlrev_b32_e32 v2, 3, v0 +; VI-NEXT: s_mov_b32 s4, 0xff000000 +; VI-NEXT: s_waitcnt lgkmcnt(0) +; VI-NEXT: v_mov_b32_e32 v1, s3 +; VI-NEXT: v_add_u32_e32 v0, vcc, s2, v2 +; VI-NEXT: v_addc_u32_e32 v1, vcc, 0, v1, vcc +; VI-NEXT: flat_load_dwordx2 v[0:1], v[0:1] +; VI-NEXT: v_mov_b32_e32 v3, s1 +; VI-NEXT: v_add_u32_e32 v2, vcc, s0, v2 +; VI-NEXT: v_addc_u32_e32 v3, vcc, 0, v3, vcc +; VI-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; VI-NEXT: v_lshlrev_b32_e32 v4, 8, v1 +; VI-NEXT: v_lshlrev_b16_e32 v5, 8, v0 +; VI-NEXT: v_lshlrev_b32_e32 v0, 8, v0 +; VI-NEXT: v_and_b32_e32 v0, s4, v0 +; VI-NEXT: v_lshlrev_b16_e32 v1, 8, v1 +; VI-NEXT: v_and_b32_e32 v4, s4, v4 +; VI-NEXT: v_or_b32_e32 v1, v1, v4 +; VI-NEXT: v_or_b32_e32 v0, v5, v0 +; VI-NEXT: flat_store_dwordx2 v[2:3], v[0:1] +; VI-NEXT: s_endpgm +; +; CI-LABEL: shl_v_imm_v4i16: +; CI: ; %bb.0: +; CI-NEXT: s_load_dwordx4 s[0:3], s[0:1], 0x9 +; CI-NEXT: s_mov_b32 s7, 0xf000 +; CI-NEXT: s_mov_b32 s6, 0 +; CI-NEXT: v_lshlrev_b32_e32 v0, 3, v0 +; CI-NEXT: v_mov_b32_e32 v1, 0 +; CI-NEXT: s_waitcnt lgkmcnt(0) +; CI-NEXT: s_mov_b64 s[4:5], s[2:3] +; CI-NEXT: buffer_load_dwordx2 v[2:3], v[0:1], s[4:7], 0 addr64 +; CI-NEXT: s_mov_b32 s8, 0xff00 +; CI-NEXT: s_mov_b64 s[2:3], s[6:7] +; CI-NEXT: s_waitcnt vmcnt(0) +; CI-NEXT: v_lshrrev_b32_e32 v4, 8, v3 +; CI-NEXT: v_lshlrev_b32_e32 v3, 8, v3 +; 
CI-NEXT: v_and_b32_e32 v4, s8, v4 +; CI-NEXT: v_lshlrev_b32_e32 v2, 8, v2 +; CI-NEXT: v_and_b32_e32 v3, s8, v3 +; CI-NEXT: v_lshlrev_b32_e32 v4, 16, v4 +; CI-NEXT: v_or_b32_e32 v3, v3, v4 +; CI-NEXT: v_and_b32_e32 v2, 0xff00ff00, v2 +; CI-NEXT: buffer_store_dwordx2 v[2:3], v[0:1], s[0:3], 0 addr64 +; CI-NEXT: s_endpgm %tid = call i32 @llvm.amdgcn.workitem.id.x() %tid.ext = sext i32 %tid to i64 %in.gep = getelementptr inbounds <4 x i16>, <4 x i16> addrspace(1)* %in, i64 %tid.ext Modified: llvm/trunk/test/CodeGen/ARM/rev.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/ARM/rev.ll?rev=374074&r1=374073&r2=374074&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/ARM/rev.ll (original) +++ llvm/trunk/test/CodeGen/ARM/rev.ll Tue Oct 8 09:16:26 2019 @@ -1,8 +1,11 @@ +; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py ; RUN: llc -mtriple=arm-eabi -mattr=+v6 %s -o - | FileCheck %s define i32 @test1(i32 %X) nounwind { -; CHECK-LABEL: test1 -; CHECK: rev16 r0, r0 +; CHECK-LABEL: test1: +; CHECK: @ %bb.0: +; CHECK-NEXT: rev16 r0, r0 +; CHECK-NEXT: bx lr %tmp1 = lshr i32 %X, 8 %X15 = bitcast i32 %X to i32 %tmp4 = shl i32 %X15, 8 @@ -17,8 +20,10 @@ define i32 @test1(i32 %X) nounwind { } define i32 @test2(i32 %X) nounwind { -; CHECK-LABEL: test2 -; CHECK: revsh r0, r0 +; CHECK-LABEL: test2: +; CHECK: @ %bb.0: +; CHECK-NEXT: revsh r0, r0 +; CHECK-NEXT: bx lr %tmp1 = lshr i32 %X, 8 %tmp1.upgrd.1 = trunc i32 %tmp1 to i16 %tmp3 = trunc i32 %X to i16 @@ -31,9 +36,11 @@ define i32 @test2(i32 %X) nounwind { ; rdar://9147637 define i32 @test3(i16 zeroext %a) nounwind { -entry: ; CHECK-LABEL: test3: -; CHECK: revsh r0, r0 +; CHECK: @ %bb.0: @ %entry +; CHECK-NEXT: revsh r0, r0 +; CHECK-NEXT: bx lr +entry: %0 = tail call i16 @llvm.bswap.i16(i16 %a) %1 = sext i16 %0 to i32 ret i32 %1 @@ -42,9 +49,11 @@ entry: declare i16 @llvm.bswap.i16(i16) nounwind readnone define i32 @test4(i16 
zeroext %a) nounwind { -entry: ; CHECK-LABEL: test4: -; CHECK: revsh r0, r0 +; CHECK: @ %bb.0: @ %entry +; CHECK-NEXT: revsh r0, r0 +; CHECK-NEXT: bx lr +entry: %conv = zext i16 %a to i32 %shr9 = lshr i16 %a, 8 %conv2 = zext i16 %shr9 to i32 @@ -57,9 +66,11 @@ entry: ; rdar://9609059 define i32 @test5(i32 %i) nounwind readnone { +; CHECK-LABEL: test5: +; CHECK: @ %bb.0: @ %entry +; CHECK-NEXT: revsh r0, r0 +; CHECK-NEXT: bx lr entry: -; CHECK-LABEL: test5 -; CHECK: revsh r0, r0 %shl = shl i32 %i, 24 %shr = ashr exact i32 %shl, 16 %shr23 = lshr i32 %i, 8 @@ -70,9 +81,11 @@ entry: ; rdar://9609108 define i32 @test6(i32 %x) nounwind readnone { +; CHECK-LABEL: test6: +; CHECK: @ %bb.0: @ %entry +; CHECK-NEXT: rev16 r0, r0 +; CHECK-NEXT: bx lr entry: -; CHECK-LABEL: test6 -; CHECK: rev16 r0, r0 %and = shl i32 %x, 8 %shl = and i32 %and, 65280 %and2 = lshr i32 %x, 8 @@ -87,10 +100,12 @@ entry: ; rdar://9164521 define i32 @test7(i32 %a) nounwind readnone { +; CHECK-LABEL: test7: +; CHECK: @ %bb.0: @ %entry +; CHECK-NEXT: rev r0, r0 +; CHECK-NEXT: lsr r0, r0, #16 +; CHECK-NEXT: bx lr entry: -; CHECK-LABEL: test7 -; CHECK: rev r0, r0 -; CHECK: lsr r0, r0, #16 %and = lshr i32 %a, 8 %shr3 = and i32 %and, 255 %and2 = shl i32 %a, 8 @@ -100,9 +115,11 @@ entry: } define i32 @test8(i32 %a) nounwind readnone { +; CHECK-LABEL: test8: +; CHECK: @ %bb.0: @ %entry +; CHECK-NEXT: revsh r0, r0 +; CHECK-NEXT: bx lr entry: -; CHECK-LABEL: test8 -; CHECK: revsh r0, r0 %and = lshr i32 %a, 8 %shr4 = and i32 %and, 255 %and2 = shl i32 %a, 8 @@ -114,9 +131,11 @@ entry: ; rdar://10750814 define zeroext i16 @test9(i16 zeroext %v) nounwind readnone { +; CHECK-LABEL: test9: +; CHECK: @ %bb.0: @ %entry +; CHECK-NEXT: rev16 r0, r0 +; CHECK-NEXT: bx lr entry: -; CHECK-LABEL: test9 -; CHECK: rev16 r0, r0 %conv = zext i16 %v to i32 %shr4 = lshr i32 %conv, 8 %shl = shl nuw nsw i32 %conv, 8 Modified: llvm/trunk/test/CodeGen/Thumb/rev.ll URL: 
http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/Thumb/rev.ll?rev=374074&r1=374073&r2=374074&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/Thumb/rev.ll (original) +++ llvm/trunk/test/CodeGen/Thumb/rev.ll Tue Oct 8 09:16:26 2019 @@ -1,8 +1,11 @@ +; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py ; RUN: llc -mtriple=thumb-eabi -mattr=+v6 %s -o - | FileCheck %s define i32 @test1(i32 %X) nounwind { -; CHECK: test1 -; CHECK: rev16 r0, r0 +; CHECK-LABEL: test1: +; CHECK: @ %bb.0: +; CHECK-NEXT: rev16 r0, r0 +; CHECK-NEXT: bx lr %tmp1 = lshr i32 %X, 8 %X15 = bitcast i32 %X to i32 %tmp4 = shl i32 %X15, 8 @@ -17,8 +20,10 @@ define i32 @test1(i32 %X) nounwind { } define i32 @test2(i32 %X) nounwind { -; CHECK: test2 -; CHECK: revsh r0, r0 +; CHECK-LABEL: test2: +; CHECK: @ %bb.0: +; CHECK-NEXT: revsh r0, r0 +; CHECK-NEXT: bx lr %tmp1 = lshr i32 %X, 8 %tmp1.upgrd.1 = trunc i32 %tmp1 to i16 %tmp3 = trunc i32 %X to i16 @@ -31,9 +36,11 @@ define i32 @test2(i32 %X) nounwind { ; rdar://9147637 define i32 @test3(i16 zeroext %a) nounwind { -entry: ; CHECK-LABEL: test3: -; CHECK: revsh r0, r0 +; CHECK: @ %bb.0: @ %entry +; CHECK-NEXT: revsh r0, r0 +; CHECK-NEXT: bx lr +entry: %0 = tail call i16 @llvm.bswap.i16(i16 %a) %1 = sext i16 %0 to i32 ret i32 %1 @@ -42,9 +49,11 @@ entry: declare i16 @llvm.bswap.i16(i16) nounwind readnone define i32 @test4(i16 zeroext %a) nounwind { -entry: ; CHECK-LABEL: test4: -; CHECK: revsh r0, r0 +; CHECK: @ %bb.0: @ %entry +; CHECK-NEXT: revsh r0, r0 +; CHECK-NEXT: bx lr +entry: %conv = zext i16 %a to i32 %shr9 = lshr i16 %a, 8 %conv2 = zext i16 %shr9 to i32 From llvm-commits at lists.llvm.org Tue Oct 8 09:21:13 2019 From: llvm-commits at lists.llvm.org (Roman Lebedev via llvm-commits) Date: Tue, 08 Oct 2019 16:21:13 -0000 Subject: [llvm] r374075 - [NFC][CVP] Add tests where we can replace sext with zext Message-ID: 
<20191008162113.93DA88D978@lists.llvm.org> Author: lebedevri Date: Tue Oct 8 09:21:13 2019 New Revision: 374075 URL: http://llvm.org/viewvc/llvm-project?rev=374075&view=rev Log: [NFC][CVP] Add tests where we can replace sext with zext If the sign bit of the value that is being sign-extended is not set, i.e. the value is non-negative (s>= 0), then zero-extension will suffice, and is better for analysis: https://rise4fun.com/Alive/a8PD Added: llvm/trunk/test/Transforms/CorrelatedValuePropagation/sext.ll Added: llvm/trunk/test/Transforms/CorrelatedValuePropagation/sext.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/CorrelatedValuePropagation/sext.ll?rev=374075&view=auto ============================================================================== --- llvm/trunk/test/Transforms/CorrelatedValuePropagation/sext.ll (added) +++ llvm/trunk/test/Transforms/CorrelatedValuePropagation/sext.ll Tue Oct 8 09:21:13 2019 @@ -0,0 +1,107 @@ +; NOTE: Assertions have been autogenerated by utils/update_test_checks.py +; RUN: opt < %s -correlated-propagation -S | FileCheck %s + +; Check that debug locations are preserved. 
For more info see: +; https://llvm.org/docs/SourceLevelDebugging.html#fixing-errors +; RUN: opt < %s -enable-debugify -correlated-propagation -S 2>&1 | \ +; RUN: FileCheck %s -check-prefix=DEBUG +; DEBUG: CheckModuleDebugify: PASS + +declare void @use64(i64) + +define void @test1(i32 %n) { +; CHECK-LABEL: @test1( +; CHECK-NEXT: entry: +; CHECK-NEXT: br label [[FOR_COND:%.*]] +; CHECK: for.cond: +; CHECK-NEXT: [[A:%.*]] = phi i32 [ [[N:%.*]], [[ENTRY:%.*]] ], [ [[EXT:%.*]], [[FOR_BODY:%.*]] ] +; CHECK-NEXT: [[CMP:%.*]] = icmp sgt i32 [[A]], 1 +; CHECK-NEXT: br i1 [[CMP]], label [[FOR_BODY]], label [[FOR_END:%.*]] +; CHECK: for.body: +; CHECK-NEXT: [[EXT_WIDE:%.*]] = sext i32 [[A]] to i64 +; CHECK-NEXT: call void @use64(i64 [[EXT_WIDE]]) +; CHECK-NEXT: [[EXT]] = trunc i64 [[EXT_WIDE]] to i32 +; CHECK-NEXT: br label [[FOR_COND]] +; CHECK: for.end: +; CHECK-NEXT: ret void +; +entry: + br label %for.cond + +for.cond: ; preds = %for.body, %entry + %a = phi i32 [ %n, %entry ], [ %ext, %for.body ] + %cmp = icmp sgt i32 %a, 1 + br i1 %cmp, label %for.body, label %for.end + +for.body: ; preds = %for.cond + %ext.wide = sext i32 %a to i64 + call void @use64(i64 %ext.wide) + %ext = trunc i64 %ext.wide to i32 + br label %for.cond + +for.end: ; preds = %for.cond + ret void +} + +;; Negative test to show transform doesn't happen unless n > 0. 
+define void @test2(i32 %n) { +; CHECK-LABEL: @test2( +; CHECK-NEXT: entry: +; CHECK-NEXT: br label [[FOR_COND:%.*]] +; CHECK: for.cond: +; CHECK-NEXT: [[A:%.*]] = phi i32 [ [[N:%.*]], [[ENTRY:%.*]] ], [ [[EXT:%.*]], [[FOR_BODY:%.*]] ] +; CHECK-NEXT: [[CMP:%.*]] = icmp sgt i32 [[A]], -2 +; CHECK-NEXT: br i1 [[CMP]], label [[FOR_BODY]], label [[FOR_END:%.*]] +; CHECK: for.body: +; CHECK-NEXT: [[EXT_WIDE:%.*]] = sext i32 [[A]] to i64 +; CHECK-NEXT: call void @use64(i64 [[EXT_WIDE]]) +; CHECK-NEXT: [[EXT]] = trunc i64 [[EXT_WIDE]] to i32 +; CHECK-NEXT: br label [[FOR_COND]] +; CHECK: for.end: +; CHECK-NEXT: ret void +; +entry: + br label %for.cond + +for.cond: ; preds = %for.body, %entry + %a = phi i32 [ %n, %entry ], [ %ext, %for.body ] + %cmp = icmp sgt i32 %a, -2 + br i1 %cmp, label %for.body, label %for.end + +for.body: ; preds = %for.cond + %ext.wide = sext i32 %a to i64 + call void @use64(i64 %ext.wide) + %ext = trunc i64 %ext.wide to i32 + br label %for.cond + +for.end: ; preds = %for.cond + ret void +} + +;; Non looping test case. 
+define void @test3(i32 %n) { +; CHECK-LABEL: @test3( +; CHECK-NEXT: entry: +; CHECK-NEXT: [[CMP:%.*]] = icmp sgt i32 [[N:%.*]], 0 +; CHECK-NEXT: br i1 [[CMP]], label [[BB:%.*]], label [[EXIT:%.*]] +; CHECK: bb: +; CHECK-NEXT: [[EXT_WIDE:%.*]] = sext i32 [[N]] to i64 +; CHECK-NEXT: call void @use64(i64 [[EXT_WIDE]]) +; CHECK-NEXT: [[EXT:%.*]] = trunc i64 [[EXT_WIDE]] to i32 +; CHECK-NEXT: br label [[EXIT]] +; CHECK: exit: +; CHECK-NEXT: ret void +; +entry: + %cmp = icmp sgt i32 %n, 0 + br i1 %cmp, label %bb, label %exit + +bb: + %ext.wide = sext i32 %n to i64 + call void @use64(i64 %ext.wide) + %ext = trunc i64 %ext.wide to i32 + br label %exit + +exit: + ret void +} From llvm-commits at lists.llvm.org Tue Oct 8 09:22:57 2019 From: llvm-commits at lists.llvm.org (David Tellenbach via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 16:22:57 +0000 (UTC) Subject: [PATCH] D68639: [MachineScheduler] Add a flag to enable scheduling of cfi instructions In-Reply-To: References: Message-ID: tellenbach updated this revision to Diff 223881. tellenbach added a comment. Move test for scheduling boundaries into return statement Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68639/new/ https://reviews.llvm.org/D68639 Files: llvm/include/llvm/CodeGen/MachineScheduler.h llvm/include/llvm/CodeGen/ScheduleDAGInstrs.h llvm/lib/CodeGen/MachineScheduler.cpp llvm/lib/CodeGen/ScheduleDAGInstrs.cpp llvm/test/CodeGen/AArch64/cfiinstrs-no-uwtable-scheduling.ll llvm/test/CodeGen/AArch64/cfiinstrs-scheduling.ll -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68639.223881.patch Type: text/x-patch Size: 12226 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 09:23:14 2019 From: llvm-commits at lists.llvm.org (Heejin Ahn via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 16:23:14 +0000 (UTC) Subject: [PATCH] D68619: [WebAssembly] Fix a bug in 'try' placement In-Reply-To: References: Message-ID: <7ddd35ce18dffec80c40d8ed9dc5ef37@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rG6a37c5d6fcae: [WebAssembly] Fix a bug in 'try' placement (authored by aheejin). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68619/new/ https://reviews.llvm.org/D68619 Files: llvm/lib/Target/WebAssembly/WebAssemblyCFGStackify.cpp llvm/test/CodeGen/WebAssembly/cfg-stackify-eh.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D68619.223882.patch Type: text/x-patch Size: 4579 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 09:25:42 2019 From: llvm-commits at lists.llvm.org (David Greene via llvm-commits) Date: Tue, 08 Oct 2019 16:25:42 -0000 Subject: [llvm] r374078 - [UpdateCCTestChecks] Detect function mangled name on separate line Message-ID: <20191008162542.5D0378F659@lists.llvm.org> Author: greened Date: Tue Oct 8 09:25:42 2019 New Revision: 374078 URL: http://llvm.org/viewvc/llvm-project?rev=374078&view=rev Log: [UpdateCCTestChecks] Detect function mangled name on separate line Sometimes functions with large comment blocks in front of them have their declarations output on several lines by c-index-test. Hence the one-line function name/line/mangled pattern will not work to detect them. Break the pattern up into two patterns and keep state after seeing the name/line information until we finally see the mangled name. 
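The two-pattern scan described in the log message above can be sketched as a small stand-alone function. This is only an illustration: the two regular expressions are copied from the patch, while the function name `parse_decls`, the `pending` variable and the sample input are hypothetical, and the MacOS leading-underscore hack from the real script is left out.

```python
import re

# Regexes as in the patch: one for the name/line part, one for the mangled name.
DECL_RE = re.compile(r'^FunctionDecl=(\w+):(\d+):\d+ \(Definition\)')
MANGLE_RE = re.compile(r'.*\[mangled=([^]]+)\]')


def parse_decls(output):
    """Map zero-based source line -> (spelling, mangled name).

    Hypothetical helper: keeps the last seen declaration as state until a
    line carrying the mangled name shows up, which may be the same line or
    a later one.
    """
    ret = {}
    pending = None  # (spell, lineno) of a declaration still waiting for its mangled name
    for line in output.splitlines():
        decl_m = DECL_RE.match(line)
        if decl_m:
            pending = decl_m.groups()
        mangle_m = MANGLE_RE.match(line)
        if pending is None or not mangle_m:
            continue
        spell, lineno = pending
        pending = None
        ret[int(lineno) - 1] = (spell, mangle_m.group(1))
    return ret
```

A mangled-name line that is not preceded by a declaration is ignored, which is the reason for keeping explicit state instead of matching the second pattern on its own.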
Differential Revision: https://reviews.llvm.org/D68272 Modified: llvm/trunk/utils/update_cc_test_checks.py Modified: llvm/trunk/utils/update_cc_test_checks.py URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/update_cc_test_checks.py?rev=374078&r1=374077&r2=374078&view=diff ============================================================================== --- llvm/trunk/utils/update_cc_test_checks.py (original) +++ llvm/trunk/utils/update_cc_test_checks.py Tue Oct 8 09:25:42 2019 @@ -50,18 +50,32 @@ def get_line2spell_and_mangled(args, cla '-test-print-mangle', f.name]) if sys.version_info[0] > 2: output = output.decode() - - RE = re.compile(r'^FunctionDecl=(\w+):(\d+):\d+ \(Definition\) \[mangled=([^]]+)\]') + DeclRE = re.compile(r'^FunctionDecl=(\w+):(\d+):\d+ \(Definition\)') + MangleRE = re.compile(r'.*\[mangled=([^]]+)\]') + MatchedDecl = False for line in output.splitlines(): - m = RE.match(line) - if not m: continue - spell, line, mangled = m.groups() + # Get the function source name, line number and mangled name. Sometimes + # c-index-test outputs the mangled name on a separate line (this can happen + # with block comments in front of functions). Keep scanning until we see + # the mangled name. + decl_m = DeclRE.match(line) + mangle_m = MangleRE.match(line) + + if decl_m: + MatchedDecl = True + spell, lineno = decl_m.groups() + if MatchedDecl and mangle_m: + mangled = mangle_m.group(1) + MatchedDecl = False + else: + continue + if mangled == '_' + spell: # HACK for MacOS (where the mangled name includes an _ for C but the IR won't): mangled = spell # Note -test-print-mangle does not print file names so if #include is used, # the line number may come from an included file. 
- ret[int(line)-1] = (spell, mangled) + ret[int(lineno)-1] = (spell, mangled) if args.verbose: for line, func_name in sorted(ret.items()): print('line {}: found function {}'.format(line+1, func_name), file=sys.stderr) From llvm-commits at lists.llvm.org Tue Oct 8 09:23:47 2019 From: llvm-commits at lists.llvm.org (Paul Robinson via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 16:23:47 +0000 (UTC) Subject: [PATCH] D68620: DebugInfo: Use base address selection entries for debug_loc In-Reply-To: References: Message-ID: probinson accepted this revision. probinson added a comment. This revision is now accepted and ready to land. > Anyone know whether they have consumers (LLDB, the Sony debugger) that would need to be updated for either the v4 changes (use of base address specifiers in classic debug_loc lists) or v5 (base_addressx, etc, etc)? I'll ask re Sony debugger. I have no direct visibility to that code. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68620/new/ https://reviews.llvm.org/D68620 From llvm-commits at lists.llvm.org Tue Oct 8 09:24:32 2019 From: llvm-commits at lists.llvm.org (Nico Weber via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 16:24:32 +0000 (UTC) Subject: [PATCH] D66645: [Attributor] Add helper class to compose two structured deduction. In-Reply-To: References: Message-ID: thakis added a comment. 
Looks like this fails on Windows: http://45.33.8.238/win/74/step_8.txt Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D66645/new/ https://reviews.llvm.org/D66645 From llvm-commits at lists.llvm.org Tue Oct 8 09:26:27 2019 From: llvm-commits at lists.llvm.org (David Greene via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 16:26:27 +0000 (UTC) Subject: [PATCH] D68272: [UpdateCCTestChecks] Detect function mangled name on separate line In-Reply-To: References: Message-ID: <679720af4776cc8651be9b5506d37bab@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rGeb6698572623: [UpdateCCTestChecks] Detect function mangled name on separate line (authored by greened). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68272/new/ https://reviews.llvm.org/D68272 Files: llvm/utils/update_cc_test_checks.py Index: llvm/utils/update_cc_test_checks.py =================================================================== --- llvm/utils/update_cc_test_checks.py +++ llvm/utils/update_cc_test_checks.py @@ -50,18 +50,32 @@ '-test-print-mangle', f.name]) if sys.version_info[0] > 2: output = output.decode() - - RE = re.compile(r'^FunctionDecl=(\w+):(\d+):\d+ \(Definition\) \[mangled=([^]]+)\]') + DeclRE = re.compile(r'^FunctionDecl=(\w+):(\d+):\d+ \(Definition\)') + MangleRE = re.compile(r'.*\[mangled=([^]]+)\]') + MatchedDecl = False for line in output.splitlines(): - m = RE.match(line) - if not m: continue - spell, line, mangled = m.groups() + # Get the function source name, line number and mangled name. Sometimes + # c-index-test outputs the mangled name on a separate line (this can happen + # with block comments in front of functions). Keep scanning until we see + # the mangled name. 
+ decl_m = DeclRE.match(line) + mangle_m = MangleRE.match(line) + + if decl_m: + MatchedDecl = True + spell, lineno = decl_m.groups() + if MatchedDecl and mangle_m: + mangled = mangle_m.group(1) + MatchedDecl = False + else: + continue + if mangled == '_' + spell: # HACK for MacOS (where the mangled name includes an _ for C but the IR won't): mangled = spell # Note -test-print-mangle does not print file names so if #include is used, # the line number may come from an included file. - ret[int(line)-1] = (spell, mangled) + ret[int(lineno)-1] = (spell, mangled) if args.verbose: for line, func_name in sorted(ret.items()): print('line {}: found function {}'.format(line+1, func_name), file=sys.stderr) -------------- next part -------------- A non-text attachment was scrubbed... Name: D68272.223884.patch Type: text/x-patch Size: 1733 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 09:34:02 2019 From: llvm-commits at lists.llvm.org (Sean Fertile via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 16:34:02 +0000 (UTC) Subject: [PATCH] D68341: [AIX] TOC pseudo expansion for 64bit large + 64bit small + 32bit large modes In-Reply-To: References: Message-ID: <42151ffa58431c0e7573941077d9c90f@localhost.localdomain> sfertile added inline comments. ================ Comment at: llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp:752 + const MCSymbolRefExpr::VariantKind VK = + !IsAIX ? MCSymbolRefExpr::VK_PPC_TOC : MCSymbolRefExpr::VK_None; const MCExpr *Exp = ---------------- Very minor nit: We can slightly simplify if you drop the `!' and reverse the true/false sides. ================ Comment at: llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp:826 + // Change the opcode to ADDIS8. If the global address is the address of + // an external symbol, is a jump table address, is a block address; or if + // large code model is enabled then generate a TOC entry and reference that. ---------------- Was the ';' after 'block-address' meant to be a comma? 
================ Comment at: llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp:837 + bool GlobalToc = false; if (MO.isGlobal()) { ---------------- Minor nit: We can fold away the if: `bool GlobalToc = MO.isGlobal() && Subtarget->isGVIndirectSymbol(MO.getGlobal());` ================ Comment at: llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp:879 + + if (MO.isGlobal()) { LLVM_DEBUG( ---------------- Instead of using an 'if' statement that will be empty when assertions are disabled, can we fold this condition into the assert? i.e. ``` LLVM_DEBUG( assert((!MO.isGlobal() || Subtarget->isGVIndirectSymbol(MO.getGlobal())) && ... ``` ================ Comment at: llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp:912 if (MO.isGlobal()) { - const GlobalValue *GV = MO.getGlobal(); - LLVM_DEBUG(assert(!(Subtarget->isGVIndirectSymbol(GV)) && + LLVM_DEBUG(assert(!(Subtarget->isGVIndirectSymbol(MO.getGlobal())) && "Interposable definitions must use indirect access.")); ---------------- Ditto on folding away the if. ================ Comment at: llvm/test/CodeGen/PowerPC/lower-globaladdr32-aix-asm.ll:2 ; RUN: llc -mtriple powerpc-ibm-aix-xcoff \ -; RUN: -code-model=small < %s | FileCheck %s +; RUN: -code-model=small < %s | FileCheck %s --check-prefix=SMALL + ---------------- Please add `-verify-machineinstrs` and an mcpu option to each llc invocation. ================ Comment at: llvm/test/CodeGen/PowerPC/lower-globaladdr64-aix-asm.ll:2 +; RUN: llc -mtriple powerpc64-ibm-aix-xcoff \ +; RUN: -code-model=small < %s | FileCheck %s --check-prefix=SMALL + ---------------- Ditto.
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68341/new/ https://reviews.llvm.org/D68341 From llvm-commits at lists.llvm.org Tue Oct 8 09:34:03 2019 From: llvm-commits at lists.llvm.org (Hubert Tong via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 16:34:03 +0000 (UTC) Subject: [PATCH] D68341: [AIX] TOC pseudo expansion for 64bit large + 64bit small + 32bit large modes In-Reply-To: References: Message-ID: hubert.reinterpretcast added inline comments. ================ Comment at: llvm/lib/Target/PowerPC/MCTargetDesc/PPCInstPrinter.cpp:77 + assert((MI->getOperand(0).isReg() && MI->getOperand(1).isReg()) && + "The first and the second operand of addis instruction" + " should be registers."); ---------------- Add "an" before "addis". ================ Comment at: llvm/lib/Target/PowerPC/MCTargetDesc/PPCInstPrinter.cpp:80 + + assert(dyn_cast(MI->getOperand(2).getExpr()) && + "The third operand of addis instruction should be a symbol " ---------------- Do not use `dyn_cast` for its Boolean value. Use `isa`. ================ Comment at: llvm/lib/Target/PowerPC/MCTargetDesc/PPCInstPrinter.cpp:81 + assert(dyn_cast(MI->getOperand(2).getExpr()) && + "The third operand of addis instruction should be a symbol " + "reference expression."); ---------------- Same comment regarding "an". ================ Comment at: llvm/lib/Target/PowerPC/MCTargetDesc/PPCInstPrinter.cpp:82 + "The third operand of addis instruction should be a symbol " + "reference expression."); + ---------------- Add "if it is an expression at all" to the end of the sentence. ================ Comment at: llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp:926 // Into: %xd = ADDIS8 %x2, sym at got@tlsgd at ha - assert(Subtarget->isPPC64() && "Not supported for 32-bit PowerPC"); + assert(IsPPC64 && "Not supported for 32-bit PowerPC"); const MachineOperand &MO = MI->getOperand(2); ---------------- There is quite a bit of noise in this patch from these NFC changes. 
Please split them out. ================ Comment at: llvm/lib/Target/PowerPC/PPCInstrInfo.td:3172 +let hasSideEffects = 0, isReMaterializable = 1 in { +def ADDIStocHA: PPCEmitTimePseudo<(outs gprc:$rD), (ins gprc_nor0:$reg, tocentry32:$disp), + "#ADDIStocHA", ---------------- Why the whitespace change on this line? The surrounding `def`s have spaces around the colon. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68341/new/ https://reviews.llvm.org/D68341 From llvm-commits at lists.llvm.org Tue Oct 8 09:47:10 2019 From: llvm-commits at lists.llvm.org (Jan Kratochvil via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 16:47:10 +0000 (UTC) Subject: [PATCH] D68612: [CMake] Track test dependencies with add_lldb_test_dependency In-Reply-To: References: Message-ID: <9e43294d9f55521d34b1aec8ab0e0040@localhost.localdomain> jankratochvil added a comment. On Linux OS (Fedora 30 x86_64) with GIT monorepo: After rL374000 : #rm -rf * cmake ~/redhat/llvm-monorepo2/llvm/ -DCMAKE_BUILD_TYPE=Release -DLLVM_USE_LINKER=gold -DLLVM_ENABLE_PROJECTS="lldb;clang;lld" -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ -DLLVM_ENABLE_ASSERTIONS=ON make check-lldb ... 
llvm-lit: /home/jkratoch/redhat/llvm-monorepo2/llvm/utils/lit/lit/llvm/subst.py:134: fatal: Did not find count in /home/jkratoch/redhat/llvm-monorepo2-clangassert/./bin After rL373996 (this patch): llvm-lit: llvm-monorepo2/llvm/utils/lit/lit/llvm/subst.py:127: note: Did not find obj2yaml in llvm-monorepo2-clangassert/./bin:llvm-monorepo2-clangassert/./bin llvm-lit: llvm-monorepo2/llvm/utils/lit/lit/llvm/subst.py:127: note: Did not find llvm-pdbutil in llvm-monorepo2-clangassert/./bin:llvm-monorepo2-clangassert/./bin llvm-lit: llvm-monorepo2/llvm/utils/lit/lit/llvm/subst.py:127: note: Did not find llvm-mc in llvm-monorepo2-clangassert/./bin:llvm-monorepo2-clangassert/./bin llvm-lit: llvm-monorepo2/llvm/utils/lit/lit/llvm/subst.py:127: note: Did not find llvm-readobj in llvm-monorepo2-clangassert/./bin:llvm-monorepo2-clangassert/./bin llvm-lit: llvm-monorepo2/llvm/utils/lit/lit/llvm/subst.py:127: note: Did not find llvm-objdump in llvm-monorepo2-clangassert/./bin:llvm-monorepo2-clangassert/./bin llvm-lit: llvm-monorepo2/llvm/utils/lit/lit/llvm/subst.py:127: note: Did not find llvm-objcopy in llvm-monorepo2-clangassert/./bin:llvm-monorepo2-clangassert/./bin llvm-lit: llvm-monorepo2/llvm/utils/lit/lit/llvm/subst.py:127: note: Did not find lli in llvm-monorepo2-clangassert/./bin:llvm-monorepo2-clangassert/./bin llvm-lit: llvm-monorepo2/llvm/utils/lit/lit/llvm/config.py:169: fatal: Could not run process ['llvm-monorepo2-clangassert/./bin/llvm-config', '--assertion-mode', '--build-mode', '--targets-built'] Surprisingly a Fedora buildbot does not have this problem . 
Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68612/new/ https://reviews.llvm.org/D68612 From llvm-commits at lists.llvm.org Tue Oct 8 09:47:10 2019 From: llvm-commits at lists.llvm.org (Thomas Preud'homme via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 16:47:10 +0000 (UTC) Subject: [PATCH] D68472: [test] Use system locale for mri-utf8.test In-Reply-To: References: Message-ID: <5161d272386319aacd1d3fe4258ea1c2@localhost.localdomain> thopre updated this revision to Diff 223889. thopre added a comment. Remove the comment about why file redirection is used and add a new one explaining the encoding/decoding steps performed on the file, since they are key to why the test works across OSes. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68472/new/ https://reviews.llvm.org/D68472 Files: llvm/test/tools/llvm-ar/mri-nonascii.test llvm/test/tools/llvm-ar/mri-utf8.test Index: llvm/test/tools/llvm-ar/mri-utf8.test =================================================================== --- llvm/test/tools/llvm-ar/mri-utf8.test +++ /dev/null @@ -1,23 +0,0 @@ -# Test non-ascii archive members -# XFAIL: system-darwin - -RUN: rm -rf %t && mkdir -p %t/extracted - -RUN: echo "contents" > %t/£.txt - -RUN: echo "CREATE %t/mri.ar" > %t/script.mri -RUN: echo "ADDMOD %t/£.txt" >> %t/script.mri -RUN: echo "SAVE" >> %t/script.mri - -RUN: llvm-ar -M < %t/script.mri -RUN: cd %t/extracted && llvm-ar x %t/mri.ar - -# This works around problems launching processess that -# include arguments with non-ascii characters. -# Python on Linux defaults to ASCII encoding unless the -# environment specifies otherwise, so it is explicitly set. -# The reliance the test has on this locale is not ideal, -# however alternate solutions have been difficult due to -# behaviour differences with python 2 vs python 3, -# and linux vs windows.
-RUN: env LANG=en_US.UTF-8 %python -c "assert open(u'\U000000A3.txt', 'rb').read() == b'contents\n'" Index: llvm/test/tools/llvm-ar/mri-nonascii.test =================================================================== --- /dev/null +++ llvm/test/tools/llvm-ar/mri-nonascii.test @@ -0,0 +1,22 @@ +# Test non-ascii archive members +# XFAIL: system-darwin + +RUN: rm -rf %t && mkdir -p %t/extracted + +# Note: lit's Python will read this UTF-8 encoded mri-nonascii.txt file, +# decode it to unicode. The filename in the redirection below will then +# be encoded in the system's filename encoding (e.g. UTF-16 for +# Microsoft Windows). +RUN: echo "contents" > %t/£.txt + +RUN: echo "CREATE %t/mri.ar" > %t/script.mri +RUN: echo "ADDMOD %t/£.txt" >> %t/script.mri +RUN: echo "SAVE" >> %t/script.mri + +RUN: llvm-ar -M < %t/script.mri +RUN: cd %t/extracted && llvm-ar x %t/mri.ar + +# Same as above. +RUN: FileCheck --strict-whitespace %s <£.txt +CHECK:{{^}} +CHECK-SAME:{{^}}contents{{$}} -------------- next part -------------- A non-text attachment was scrubbed... Name: D68472.223889.patch Type: text/x-patch Size: 1942 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 09:47:10 2019 From: llvm-commits at lists.llvm.org (Julian Lettner via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 16:47:10 +0000 (UTC) Subject: [PATCH] D68589: [lit] Leverage argparse features to remove some code In-Reply-To: References: Message-ID: yln marked 4 inline comments as done. yln added inline comments. ================ Comment at: llvm/utils/lit/lit/cl_arguments.py:204 + n = int(arg) + except: + raise _arg_error('positive integer', arg) ---------------- serge-sans-paille wrote: > It's generally better to catch the conversion error explicitly (here ``ValueError``) Done. Thanks! 
================ Comment at: llvm/utils/lit/lit/cl_arguments.py:211 +def _arg_error(desc, arg): + msg = "require %s, but found '%s'" % (desc, arg) + return argparse.ArgumentTypeError(msg) ---------------- serge-sans-paille wrote: > require*s* Updated. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68589/new/ https://reviews.llvm.org/D68589 From llvm-commits at lists.llvm.org Tue Oct 8 09:47:12 2019 From: llvm-commits at lists.llvm.org (Amaury SECHET via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 16:47:12 +0000 (UTC) Subject: [PATCH] D68250: [DAGCombine] Match more patterns for half word bswap In-Reply-To: References: Message-ID: deadalnix updated this revision to Diff 223892. deadalnix added a comment. Fix an erroneously inverted condition and ensure that previously broken tests pass again. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68250/new/ https://reviews.llvm.org/D68250 Files: lib/CodeGen/SelectionDAG/DAGCombiner.cpp test/CodeGen/X86/bswap_tree.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D68250.223892.patch Type: text/x-patch Size: 4270 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 09:47:11 2019 From: llvm-commits at lists.llvm.org (Julian Lettner via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 16:47:11 +0000 (UTC) Subject: [PATCH] D68589: [lit] Leverage argparse features to remove some code In-Reply-To: References: Message-ID: <26341faa5dd7cb545cd456b9dbca9d5b@localhost.localdomain> yln updated this revision to Diff 223891. yln marked 2 inline comments as done. yln added a comment. Address comments.
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68589/new/ https://reviews.llvm.org/D68589 Files: llvm/utils/lit/lit/cl_arguments.py llvm/utils/lit/tests/max-failures.py llvm/utils/lit/tests/selecting.py -------------- next part -------------- A non-text attachment was scrubbed... Name: D68589.223891.patch Type: text/x-patch Size: 5953 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 09:47:12 2019 From: llvm-commits at lists.llvm.org (Craig Topper via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 16:47:12 +0000 (UTC) Subject: [PATCH] D68632: [X86] Make memcmp() use PTEST if possible and also enable AVX1 In-Reply-To: References: Message-ID: craig.topper added inline comments. ================ Comment at: lib/Target/X86/X86ISelLowering.cpp:20964 + // Use PTEST when explicitly requested. + if (Op0.getOpcode() == X86ISD::PTEST && isNullConstant(Op1) && ---------------- Remove this. It doesn't make sense. ================ Comment at: lib/Target/X86/X86ISelLowering.cpp:42403 + if (VecVT == CmpVT && PT) { + auto Cmp1 = SDValue(DAG.getMachineNode(XorOp, DL, VecVT, A, B), 0); + auto Cmp2 = SDValue(DAG.getMachineNode(XorOp, DL, VecVT, C, D), 0); ---------------- Why are these machine opcodes and not ISD::XOR? ================ Comment at: lib/Target/X86/X86ISelLowering.cpp:42430 + auto PT = DAG.getNode(X86ISD::PTEST, DL, MVT::i32, BCCmp, BCCmp); + return DAG.getSetCC(DL, VT, PT, DAG.getConstant(0, DL, MVT::i32), CC); + } ---------------- PTEST returns an i32 representing EFLAGS. You can't pass it to ISD::SETCC it doesn't mean anything. The DAG.getSetcc call needs to call the X86ISelLowering.cpp copy of getSetcc that creates an X86ISD::SETCC. You'll need to pass it the X86::COND_E/X86::COND_NE here. This will return an MVT::i8 so you'll need to emit an ISD::TRUNCATE to VT after it. 
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68632/new/ https://reviews.llvm.org/D68632 From llvm-commits at lists.llvm.org Tue Oct 8 09:47:12 2019 From: llvm-commits at lists.llvm.org (Florian Hahn via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 16:47:12 +0000 (UTC) Subject: [PATCH] D68194: [LCSSA] Forget values we create LCSSA phis for In-Reply-To: References: Message-ID: <0c84091f1add5089b99c02e5d14a75b8@localhost.localdomain> fhahn added a comment. In D68194#1698311, @efriedma wrote: > > I know this contradicts existing code in SCEV that tries not to break LCSSA, I'm wondering if we should just remove those in favor of teaching SCEV expander about preserving LCSSA. > > Currently, SCEV deliberately doesn't look through LCSSA PHI nodes (there's code in ScalarEvolution::createNodeForPHI etc.). You're saying we should change that? It might be a good idea. I recently ran into a performance issue involving a nested loop where SCEV for the outer loop was getting confused by an LCSSA PHI for the inner loop. I'd be worried about loop passes getting confused, though; for example, if they assume all AddRecs are part of the current loop nest. > > That said, this patch probably isn't the right place to have that discussion. AFAICT, there are at least 2 issues that need addressing if we want to handle LCSSA PHIs differently: 1) teaching SCEV to look through LCSSA PHIs and 2) updating the SCEV expander to insert LCSSA PHI nodes, if required. Incidentally, I started out with a patch to fix the issue in the expander. But addressing both 1) and 2) seems like a more fundamental change. As a first step I guess we could try to find some cases where we would get more precise results by looking through LCSSA PHIs. IMO that would be a better motivation than just the expander issue. 
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68194/new/ https://reviews.llvm.org/D68194 From llvm-commits at lists.llvm.org Tue Oct 8 09:47:12 2019 From: llvm-commits at lists.llvm.org (David Li via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 16:47:12 +0000 (UTC) Subject: [PATCH] D68616: [CodeExtractor] Factor out and reuse shrinkwrap analysis In-Reply-To: References: Message-ID: davidxl accepted this revision. davidxl added a comment. This revision is now accepted and ready to land. lgtm CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68616/new/ https://reviews.llvm.org/D68616 From llvm-commits at lists.llvm.org Tue Oct 8 09:56:01 2019 From: llvm-commits at lists.llvm.org (Stanislav Mekhanoshin via llvm-commits) Date: Tue, 08 Oct 2019 16:56:01 -0000 Subject: [llvm] r374083 - [AMDGPU] Disable unused gfx10 dpp instructions Message-ID: <20191008165601.9DDA388CE3@lists.llvm.org> Author: rampitec Date: Tue Oct 8 09:56:01 2019 New Revision: 374083 URL: http://llvm.org/viewvc/llvm-project?rev=374083&view=rev Log: [AMDGPU] Disable unused gfx10 dpp instructions Inhibit generation of unused real dpp instructions on gfx10 just like it is done on other subtargets. This does not change anything because these are illegal anyway and not accepted, but it does reduce the number of instruction definitions generated. 
Differential Revision: https://reviews.llvm.org/D68607 Modified: llvm/trunk/lib/Target/AMDGPU/VOP1Instructions.td llvm/trunk/lib/Target/AMDGPU/VOP2Instructions.td Modified: llvm/trunk/lib/Target/AMDGPU/VOP1Instructions.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/VOP1Instructions.td?rev=374083&r1=374082&r2=374083&view=diff ============================================================================== --- llvm/trunk/lib/Target/AMDGPU/VOP1Instructions.td (original) +++ llvm/trunk/lib/Target/AMDGPU/VOP1Instructions.td Tue Oct 8 09:56:01 2019 @@ -506,11 +506,13 @@ let AssemblerPredicate = isGFX10Plus, De } } multiclass VOP1_Real_dpp_gfx10 op> { + foreach _ = BoolToList(NAME#"_e32").Pfl.HasExtDPP>.ret in def _dpp_gfx10 : VOP1_DPP16(NAME#"_e32")> { let DecoderNamespace = "SDWA10"; } } multiclass VOP1_Real_dpp8_gfx10 op> { + foreach _ = BoolToList(NAME#"_e32").Pfl.HasExtDPP>.ret in def _dpp8_gfx10 : VOP1_DPP8(NAME#"_e32")> { let DecoderNamespace = "DPP8"; } Modified: llvm/trunk/lib/Target/AMDGPU/VOP2Instructions.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/VOP2Instructions.td?rev=374083&r1=374082&r2=374083&view=diff ============================================================================== --- llvm/trunk/lib/Target/AMDGPU/VOP2Instructions.td (original) +++ llvm/trunk/lib/Target/AMDGPU/VOP2Instructions.td Tue Oct 8 09:56:01 2019 @@ -939,11 +939,13 @@ let AssemblerPredicate = isGFX10Plus, De } } multiclass VOP2_Real_dpp_gfx10 op> { + foreach _ = BoolToList(NAME#"_e32").Pfl.HasExtDPP>.ret in def _dpp_gfx10 : VOP2_DPP16(NAME#"_e32")> { let DecoderNamespace = "SDWA10"; } } multiclass VOP2_Real_dpp8_gfx10 op> { + foreach _ = BoolToList(NAME#"_e32").Pfl.HasExtDPP>.ret in def _dpp8_gfx10 : VOP2_DPP8(NAME#"_e32")> { let DecoderNamespace = "DPP8"; } @@ -981,6 +983,7 @@ let AssemblerPredicate = isGFX10Plus, De } multiclass VOP2_Real_dpp_gfx10_with_name op, string opName, string asmName> { + foreach _ = 
BoolToList(opName#"_e32").Pfl.HasExtDPP>.ret in def _dpp_gfx10 : VOP2_DPP16(opName#"_e32")> { VOP2_Pseudo ps = !cast(opName#"_e32"); let AsmString = asmName # ps.Pfl.AsmDPP16; @@ -988,6 +991,7 @@ let AssemblerPredicate = isGFX10Plus, De } multiclass VOP2_Real_dpp8_gfx10_with_name op, string opName, string asmName> { + foreach _ = BoolToList(opName#"_e32").Pfl.HasExtDPP>.ret in def _dpp8_gfx10 : VOP2_DPP8(opName#"_e32")> { VOP2_Pseudo ps = !cast(opName#"_e32"); let AsmString = asmName # ps.Pfl.AsmDPP8; @@ -1018,12 +1022,14 @@ let AssemblerPredicate = isGFX10Plus, De let AsmString = asmName # !subst(", vcc", "", Ps.AsmOperands); let DecoderNamespace = "SDWA10"; } + foreach _ = BoolToList(opName#"_e32").Pfl.HasExtDPP>.ret in def _dpp_gfx10 : VOP2_DPP16(opName#"_e32"), asmName> { string AsmDPP = !cast(opName#"_e32").Pfl.AsmDPP16; let AsmString = asmName # !subst(", vcc", "", AsmDPP); let DecoderNamespace = "SDWA10"; } + foreach _ = BoolToList(opName#"_e32").Pfl.HasExtDPP>.ret in def _dpp8_gfx10 : VOP2_DPP8(opName#"_e32"), asmName> { string AsmDPP8 = !cast(opName#"_e32").Pfl.AsmDPP8; From llvm-commits at lists.llvm.org Tue Oct 8 09:57:09 2019 From: llvm-commits at lists.llvm.org (Dave Green via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 16:57:09 +0000 (UTC) Subject: [PATCH] D68651: [InstCombine] Signed saturation patterns In-Reply-To: References: Message-ID: <0c1cccce6a883f93334d9733f33bb5de@localhost.localdomain> dmgreen added a comment. Hello. Can you explain what you mean by "native format"? Do you mean without the extends/truncs, as a different way of specifying them? (I think the problem at least from C is dealing with overflowing arithmetic being undefined. If you extend at least one bit then the arithmetic can't overflow, so you can do the min/max like it's done here). I don't think there is anywhere in instcombine that currently forms a sadd_sat or ssub_sat (as opposed to uadd_sat or usub_sat), unless it's from an existing sadd_sat. 
We do form uadd_sat as in rL357012 and usub_sat from selects. I really just need some way to generate sadd_sats for vectorisation. If there's a better way than this, I'm all ears :) CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68651/new/ https://reviews.llvm.org/D68651 From llvm-commits at lists.llvm.org Tue Oct 8 10:00:01 2019 From: llvm-commits at lists.llvm.org (Simon Pilgrim via llvm-commits) Date: Tue, 08 Oct 2019 17:00:01 -0000 Subject: [llvm] r374085 - CodeGenPrepare - silence static analyzer dyn_cast<> null dereference warnings. NFCI. Message-ID: <20191008170001.779018F741@lists.llvm.org> Author: rksimon Date: Tue Oct 8 10:00:01 2019 New Revision: 374085 URL: http://llvm.org/viewvc/llvm-project?rev=374085&view=rev Log: CodeGenPrepare - silence static analyzer dyn_cast<> null dereference warnings. NFCI. The static analyzer is warning about potential null dereferences, but in these cases we should be able to use cast<> directly and if not assert will fire for us. Modified: llvm/trunk/lib/CodeGen/CodeGenPrepare.cpp Modified: llvm/trunk/lib/CodeGen/CodeGenPrepare.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/CodeGenPrepare.cpp?rev=374085&r1=374084&r2=374085&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/CodeGenPrepare.cpp (original) +++ llvm/trunk/lib/CodeGen/CodeGenPrepare.cpp Tue Oct 8 10:00:01 2019 @@ -1524,7 +1524,7 @@ SinkShiftAndTruncate(BinaryOperator *Shi const TargetLowering &TLI, const DataLayout &DL) { BasicBlock *UserBB = User->getParent(); DenseMap InsertedTruncs; - TruncInst *TruncI = dyn_cast(User); + auto *TruncI = cast(User); bool MadeChange = false; for (Value::user_iterator TruncUI = TruncI->user_begin(), @@ -3046,7 +3046,7 @@ public: To = dyn_cast(OldReplacement); OldReplacement = Get(From); } - assert(Get(To) == To && "Replacement PHI node is already replaced."); + assert(To && Get(To) == To && "Replacement PHI node is already replaced."); 
Put(From, To); From->replaceAllUsesWith(To); AllPhiNodes.erase(From); @@ -3410,11 +3410,10 @@ private: Select->setFalseValue(ST.Get(Map[FalseValue])); } else { // Must be a Phi node then. - PHINode *PHI = cast(V); - auto *CurrentPhi = dyn_cast(Current); + auto *PHI = cast(V); // Fill the Phi node with values from predecessors. for (auto B : predecessors(PHI->getParent())) { - Value *PV = CurrentPhi->getIncomingValueForBlock(B); + Value *PV = cast(Current)->getIncomingValueForBlock(B); assert(Map.find(PV) != Map.end() && "No predecessor Value!"); PHI->addIncoming(ST.Get(Map[PV]), B); } @@ -3783,13 +3782,11 @@ bool TypePromotionHelper::canGetThrough( // poisoned value regular value // It should be OK since undef covers valid value. if (Inst->getOpcode() == Instruction::Shl && Inst->hasOneUse()) { - const Instruction *ExtInst = - dyn_cast(*Inst->user_begin()); + const auto *ExtInst = cast(*Inst->user_begin()); if (ExtInst->hasOneUse()) { - const Instruction *AndInst = - dyn_cast(*ExtInst->user_begin()); + const auto *AndInst = dyn_cast(*ExtInst->user_begin()); if (AndInst && AndInst->getOpcode() == Instruction::And) { - const ConstantInt *Cst = dyn_cast(AndInst->getOperand(1)); + const auto *Cst = dyn_cast(AndInst->getOperand(1)); if (Cst && Cst->getValue().isIntN(Inst->getType()->getIntegerBitWidth())) return true; @@ -5814,7 +5811,7 @@ bool CodeGenPrepare::optimizeLoadExt(Loa return false; IRBuilder<> Builder(Load->getNextNode()); - auto *NewAnd = dyn_cast( + auto *NewAnd = cast( Builder.CreateAnd(Load, ConstantInt::get(Ctx, DemandBits))); // Mark this instruction as "inserted by CGP", so that other // optimizations don't touch it. From llvm-commits at lists.llvm.org Tue Oct 8 09:58:26 2019 From: llvm-commits at lists.llvm.org (Dimitry Andric via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 16:58:26 +0000 (UTC) Subject: [PATCH] D68045: [builtins] Unbreak build on FreeBSD armv7 after D60351 In-Reply-To: References: Message-ID: dim closed this revision. 
dim added a comment. For some reason this didn't get closed by Phabricator. Committed in rCRT374070. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68045/new/ https://reviews.llvm.org/D68045 From llvm-commits at lists.llvm.org Tue Oct 8 10:01:57 2019 From: llvm-commits at lists.llvm.org (Hideto Ueno via llvm-commits) Date: Tue, 08 Oct 2019 17:01:57 -0000 Subject: [llvm] r374086 - [Attributor][Fix] Temporary fix for windows build bot failure Message-ID: <20191008170157.3715C8EBB3@lists.llvm.org> Author: uenoku Date: Tue Oct 8 10:01:56 2019 New Revision: 374086 URL: http://llvm.org/viewvc/llvm-project?rev=374086&view=rev Log: [Attributor][Fix] Temporary fix for windows build bot failure D65402 causes a test failure related to attributor-max-iterations. This commit removes attributor-max-iterations-verify for now. I'll investigate the cause, after which the flag should be restored. Modified: llvm/trunk/test/Transforms/FunctionAttrs/callbacks.ll Modified: llvm/trunk/test/Transforms/FunctionAttrs/callbacks.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/FunctionAttrs/callbacks.ll?rev=374086&r1=374085&r2=374086&view=diff ============================================================================== --- llvm/trunk/test/Transforms/FunctionAttrs/callbacks.ll (original) +++ llvm/trunk/test/Transforms/FunctionAttrs/callbacks.ll Tue Oct 8 10:01:56 2019 @@ -1,5 +1,7 @@ ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py -; RUN: opt -S -passes=attributor -aa-pipeline='basic-aa' -attributor-disable=false -attributor-max-iterations-verify -attributor-max-iterations=2 < %s | FileCheck %s +; FIXME: Add -attributor-max-iterations-verify -attributor-max-iterations below. +; This flag was removed because max iterations is 2 in most cases, but in windows it is 1. 
+; RUN: opt -S -passes=attributor -aa-pipeline='basic-aa' -attributor-disable=false < %s | FileCheck %s ; ModuleID = 'callback_simple.c' target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128" From llvm-commits at lists.llvm.org Tue Oct 8 10:04:51 2019 From: llvm-commits at lists.llvm.org (Tom Stellard via llvm-commits) Date: Tue, 08 Oct 2019 17:04:51 -0000 Subject: [llvm] r374087 - AMDGPU: Add offsets to MMO when lowering buffer intrinsics Message-ID: <20191008170451.717458923A@lists.llvm.org> Author: tstellar Date: Tue Oct 8 10:04:51 2019 New Revision: 374087 URL: http://llvm.org/viewvc/llvm-project?rev=374087&view=rev Log: AMDGPU: Add offsets to MMO when lowering buffer intrinsics Summary: Without offsets on the MachineMemOperands (MMOs), MachineInstr::mayAlias() will return true for all reads and writes to the same resource descriptor. This leads to O(N^2) complexity in the MachineScheduler when analyzing dependencies of buffer loads and stores. It also limits the SILoadStoreOptimizer from merging more instructions. This patch reduces the compile time of one pathological compute shader from 12 seconds to 1 second. 
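As a reading aid for the summary above, the following is a small Python model of the offset rule that this patch's getBufferOffsetForMMO helper describes in its comment: the offset recorded on the memory operand is the sum of the three offset inputs when all of them are constants and the index, if present, is the constant zero; in every other case it is 0. This is only an illustration and not LLVM code. Const and Dynamic are hypothetical stand-ins for constant and non-constant operands, and buffer_offset_for_mmo is a made-up name.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Const:
    """Hypothetical stand-in for an operand whose value is a known constant."""
    value: int


class Dynamic:
    """Hypothetical stand-in for an operand that is not a compile-time constant."""


Operand = Union[Const, Dynamic]


def buffer_offset_for_mmo(voffset: Operand, soffset: Operand, offset: Operand,
                          vindex: Optional[Operand] = None) -> int:
    # Any non-constant offset input means the final address is unknown: report 0.
    if not all(isinstance(x, Const) for x in (voffset, soffset, offset)):
        return 0
    # When an index operand is supplied, only a constant zero keeps the offset known.
    if vindex is not None:
        if not isinstance(vindex, Const) or vindex.value != 0:
            return 0
    # All inputs are constants: the known offset is their sum.
    return voffset.value + soffset.value + offset.value
```

Under this model, a plain offset access with an immediate of 16 records an offset of 16, while the same access with a non-zero index records 0, which matches the "+ 16" versus no-offset memory operands in the new test file.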
Reviewers: arsenm, nhaehnle Reviewed By: arsenm Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, hiraditya, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65097 Added: llvm/trunk/test/CodeGen/AMDGPU/buffer-intrinsics-mmo-offsets.ll Modified: llvm/trunk/lib/Target/AMDGPU/SIISelLowering.cpp llvm/trunk/lib/Target/AMDGPU/SIISelLowering.h Modified: llvm/trunk/lib/Target/AMDGPU/SIISelLowering.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/SIISelLowering.cpp?rev=374087&r1=374086&r2=374087&view=diff ============================================================================== --- llvm/trunk/lib/Target/AMDGPU/SIISelLowering.cpp (original) +++ llvm/trunk/lib/Target/AMDGPU/SIISelLowering.cpp Tue Oct 8 10:04:51 2019 @@ -6092,6 +6092,30 @@ SDValue SITargetLowering::LowerINTRINSIC } } +// This function computes an appropriate offset to pass to +// MachineMemOperand::setOffset() based on the offset inputs to +// an intrinsic. If any of the offsets are non-contstant or +// if VIndex is non-zero then this function returns 0. Otherwise, +// it returns the sum of VOffset, SOffset, and Offset. 
+static unsigned getBufferOffsetForMMO(SDValue VOffset, + SDValue SOffset, + SDValue Offset, + SDValue VIndex = SDValue()) { + + if (!isa(VOffset) || !isa(SOffset) || + !isa(Offset)) + return 0; + + if (VIndex) { + if (!isa(VIndex) || !cast(VIndex)->isNullValue()) + return 0; + } + + return cast(VOffset)->getSExtValue() + + cast(SOffset)->getSExtValue() + + cast(Offset)->getSExtValue(); +} + SDValue SITargetLowering::LowerINTRINSIC_W_CHAIN(SDValue Op, SelectionDAG &DAG) const { unsigned IntrID = cast(Op.getOperand(1))->getZExtValue(); @@ -6238,13 +6262,18 @@ SDValue SITargetLowering::LowerINTRINSIC DAG.getTargetConstant(IdxEn, DL, MVT::i1), // idxen }; - setBufferOffsets(Op.getOperand(4), DAG, &Ops[3]); + unsigned Offset = setBufferOffsets(Op.getOperand(4), DAG, &Ops[3]); + // We don't know the offset if vindex is non-zero, so clear it. + if (IdxEn) + Offset = 0; + unsigned Opc = (IntrID == Intrinsic::amdgcn_buffer_load) ? AMDGPUISD::BUFFER_LOAD : AMDGPUISD::BUFFER_LOAD_FORMAT; EVT VT = Op.getValueType(); EVT IntVT = VT.changeTypeToInteger(); auto *M = cast(Op); + M->getMemOperand()->setOffset(Offset); EVT LoadVT = Op.getValueType(); if (LoadVT.getScalarType() == MVT::f16) @@ -6275,7 +6304,9 @@ SDValue SITargetLowering::LowerINTRINSIC DAG.getTargetConstant(0, DL, MVT::i1), // idxen }; - return lowerIntrinsicLoad(cast(Op), IsFormat, DAG, Ops); + auto *M = cast(Op); + M->getMemOperand()->setOffset(getBufferOffsetForMMO(Ops[3], Ops[4], Ops[5])); + return lowerIntrinsicLoad(M, IsFormat, DAG, Ops); } case Intrinsic::amdgcn_struct_buffer_load: case Intrinsic::amdgcn_struct_buffer_load_format: { @@ -6293,6 +6324,9 @@ SDValue SITargetLowering::LowerINTRINSIC DAG.getTargetConstant(1, DL, MVT::i1), // idxen }; + auto *M = cast(Op); + M->getMemOperand()->setOffset(getBufferOffsetForMMO(Ops[3], Ops[4], Ops[5], + Ops[2])); return lowerIntrinsicLoad(cast(Op), IsFormat, DAG, Ops); } case Intrinsic::amdgcn_tbuffer_load: { @@ -6398,10 +6432,14 @@ SDValue 
SITargetLowering::LowerINTRINSIC DAG.getTargetConstant(Slc << 1, DL, MVT::i32), // cachepolicy DAG.getTargetConstant(IdxEn, DL, MVT::i1), // idxen }; - setBufferOffsets(Op.getOperand(5), DAG, &Ops[4]); + unsigned Offset = setBufferOffsets(Op.getOperand(5), DAG, &Ops[4]); + // We don't know the offset if vindex is non-zero, so clear it. + if (IdxEn) + Offset = 0; EVT VT = Op.getValueType(); auto *M = cast(Op); + M->getMemOperand()->setOffset(Offset); unsigned Opcode = 0; switch (IntrID) { @@ -6469,6 +6507,7 @@ SDValue SITargetLowering::LowerINTRINSIC EVT VT = Op.getValueType(); auto *M = cast(Op); + M->getMemOperand()->setOffset(getBufferOffsetForMMO(Ops[4], Ops[5], Ops[6])); unsigned Opcode = 0; switch (IntrID) { @@ -6542,6 +6581,8 @@ SDValue SITargetLowering::LowerINTRINSIC EVT VT = Op.getValueType(); auto *M = cast(Op); + M->getMemOperand()->setOffset(getBufferOffsetForMMO(Ops[4], Ops[5], Ops[6], + Ops[3])); unsigned Opcode = 0; switch (IntrID) { @@ -6605,9 +6646,13 @@ SDValue SITargetLowering::LowerINTRINSIC DAG.getTargetConstant(Slc << 1, DL, MVT::i32), // cachepolicy DAG.getTargetConstant(IdxEn, DL, MVT::i1), // idxen }; - setBufferOffsets(Op.getOperand(6), DAG, &Ops[5]); + unsigned Offset = setBufferOffsets(Op.getOperand(6), DAG, &Ops[5]); + // We don't know the offset if vindex is non-zero, so clear it. 
+ if (IdxEn) + Offset = 0; EVT VT = Op.getValueType(); auto *M = cast(Op); + M->getMemOperand()->setOffset(Offset); return DAG.getMemIntrinsicNode(AMDGPUISD::BUFFER_ATOMIC_CMPSWAP, DL, Op->getVTList(), Ops, VT, M->getMemOperand()); @@ -6628,6 +6673,7 @@ SDValue SITargetLowering::LowerINTRINSIC }; EVT VT = Op.getValueType(); auto *M = cast(Op); + M->getMemOperand()->setOffset(getBufferOffsetForMMO(Ops[5], Ops[6], Ops[7])); return DAG.getMemIntrinsicNode(AMDGPUISD::BUFFER_ATOMIC_CMPSWAP, DL, Op->getVTList(), Ops, VT, M->getMemOperand()); @@ -6648,6 +6694,8 @@ SDValue SITargetLowering::LowerINTRINSIC }; EVT VT = Op.getValueType(); auto *M = cast(Op); + M->getMemOperand()->setOffset(getBufferOffsetForMMO(Ops[5], Ops[6], Ops[7], + Ops[4])); return DAG.getMemIntrinsicNode(AMDGPUISD::BUFFER_ATOMIC_CMPSWAP, DL, Op->getVTList(), Ops, VT, M->getMemOperand()); @@ -6889,11 +6937,15 @@ SDValue SITargetLowering::LowerINTRINSIC DAG.getTargetConstant(Glc | (Slc << 1), DL, MVT::i32), // cachepolicy DAG.getTargetConstant(IdxEn, DL, MVT::i1), // idxen }; - setBufferOffsets(Op.getOperand(5), DAG, &Ops[4]); + unsigned Offset = setBufferOffsets(Op.getOperand(5), DAG, &Ops[4]); + // We don't know the offset if vindex is non-zero, so clear it. + if (IdxEn) + Offset = 0; unsigned Opc = IntrinsicID == Intrinsic::amdgcn_buffer_store ? AMDGPUISD::BUFFER_STORE : AMDGPUISD::BUFFER_STORE_FORMAT; Opc = IsD16 ? AMDGPUISD::BUFFER_STORE_FORMAT_D16 : Opc; MemSDNode *M = cast(Op); + M->getMemOperand()->setOffset(Offset); // Handle BUFFER_STORE_BYTE/SHORT overloaded intrinsics EVT VDataType = VData.getValueType().getScalarType(); @@ -6938,6 +6990,7 @@ SDValue SITargetLowering::LowerINTRINSIC IsFormat ? AMDGPUISD::BUFFER_STORE_FORMAT : AMDGPUISD::BUFFER_STORE; Opc = IsD16 ? 
AMDGPUISD::BUFFER_STORE_FORMAT_D16 : Opc; MemSDNode *M = cast(Op); + M->getMemOperand()->setOffset(getBufferOffsetForMMO(Ops[4], Ops[5], Ops[6])); // Handle BUFFER_STORE_BYTE/SHORT overloaded intrinsics if (!IsD16 && !VDataVT.isVector() && EltType.getSizeInBits() < 32) @@ -6982,6 +7035,8 @@ SDValue SITargetLowering::LowerINTRINSIC AMDGPUISD::BUFFER_STORE : AMDGPUISD::BUFFER_STORE_FORMAT; Opc = IsD16 ? AMDGPUISD::BUFFER_STORE_FORMAT_D16 : Opc; MemSDNode *M = cast(Op); + M->getMemOperand()->setOffset(getBufferOffsetForMMO(Ops[4], Ops[5], Ops[6], + Ops[3])); // Handle BUFFER_STORE_BYTE/SHORT overloaded intrinsics EVT VDataType = VData.getValueType().getScalarType(); @@ -7008,10 +7063,14 @@ SDValue SITargetLowering::LowerINTRINSIC DAG.getTargetConstant(Slc << 1, DL, MVT::i32), // cachepolicy DAG.getTargetConstant(IdxEn, DL, MVT::i1), // idxen }; - setBufferOffsets(Op.getOperand(5), DAG, &Ops[4]); + unsigned Offset = setBufferOffsets(Op.getOperand(5), DAG, &Ops[4]); + // We don't know the offset if vindex is non-zero, so clear it. + if (IdxEn) + Offset = 0; EVT VT = Op.getOperand(2).getValueType(); auto *M = cast(Op); + M->getMemOperand()->setOffset(Offset); unsigned Opcode = VT.isVector() ? AMDGPUISD::BUFFER_ATOMIC_PK_FADD : AMDGPUISD::BUFFER_ATOMIC_FADD; @@ -7105,7 +7164,7 @@ std::pair SITargetLowe // Analyze a combined offset from an amdgcn_buffer_ intrinsic and store the // three offsets (voffset, soffset and instoffset) into the SDValue[3] array // pointed to by Offsets. 
-void SITargetLowering::setBufferOffsets(SDValue CombinedOffset, +unsigned SITargetLowering::setBufferOffsets(SDValue CombinedOffset, SelectionDAG &DAG, SDValue *Offsets, unsigned Align) const { SDLoc DL(CombinedOffset); @@ -7116,7 +7175,7 @@ void SITargetLowering::setBufferOffsets( Offsets[0] = DAG.getConstant(0, DL, MVT::i32); Offsets[1] = DAG.getConstant(SOffset, DL, MVT::i32); Offsets[2] = DAG.getTargetConstant(ImmOffset, DL, MVT::i32); - return; + return SOffset + ImmOffset; } } if (DAG.isBaseWithConstantOffset(CombinedOffset)) { @@ -7129,12 +7188,13 @@ void SITargetLowering::setBufferOffsets( Offsets[0] = N0; Offsets[1] = DAG.getConstant(SOffset, DL, MVT::i32); Offsets[2] = DAG.getTargetConstant(ImmOffset, DL, MVT::i32); - return; + return 0; } } Offsets[0] = CombinedOffset; Offsets[1] = DAG.getConstant(0, DL, MVT::i32); Offsets[2] = DAG.getTargetConstant(0, DL, MVT::i32); + return 0; } // Handle 8 bit and 16 bit buffer loads Modified: llvm/trunk/lib/Target/AMDGPU/SIISelLowering.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/SIISelLowering.h?rev=374087&r1=374086&r2=374087&view=diff ============================================================================== --- llvm/trunk/lib/Target/AMDGPU/SIISelLowering.h (original) +++ llvm/trunk/lib/Target/AMDGPU/SIISelLowering.h Tue Oct 8 10:04:51 2019 @@ -203,8 +203,10 @@ private: // Analyze a combined offset from an amdgcn_buffer_ intrinsic and store the // three offsets (voffset, soffset and instoffset) into the SDValue[3] array // pointed to by Offsets. - void setBufferOffsets(SDValue CombinedOffset, SelectionDAG &DAG, - SDValue *Offsets, unsigned Align = 4) const; + /// \returns 0 If there is a non-constant offset or if the offset is 0. + /// Otherwise returns the constant offset. 
+ unsigned setBufferOffsets(SDValue CombinedOffset, SelectionDAG &DAG, + SDValue *Offsets, unsigned Align = 4) const; // Handle 8 bit and 16 bit buffer loads SDValue handleByteShortBufferLoads(SelectionDAG &DAG, EVT LoadVT, SDLoc DL, Added: llvm/trunk/test/CodeGen/AMDGPU/buffer-intrinsics-mmo-offsets.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/buffer-intrinsics-mmo-offsets.ll?rev=374087&view=auto ============================================================================== --- llvm/trunk/test/CodeGen/AMDGPU/buffer-intrinsics-mmo-offsets.ll (added) +++ llvm/trunk/test/CodeGen/AMDGPU/buffer-intrinsics-mmo-offsets.ll Tue Oct 8 10:04:51 2019 @@ -0,0 +1,414 @@ +; NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py +; RUN: llc -march=amdgcn -verify-machineinstrs -stop-after=amdgpu-isel -o - %s | FileCheck -check-prefix=GCN %s + +define amdgpu_cs void @mmo_offsets0(<4 x i32> addrspace(6)* inreg noalias dereferenceable(18446744073709551615) %arg0, i32 %arg1) { + ; GCN-LABEL: name: mmo_offsets0 + ; GCN: bb.0.bb.0: + ; GCN: liveins: $sgpr0, $vgpr0 + ; GCN: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0 + ; GCN: [[COPY1:%[0-9]+]]:sgpr_32 = COPY $sgpr0 + ; GCN: [[S_MOV_B32_:%[0-9]+]]:sreg_32_xm0 = S_MOV_B32 0 + ; GCN: [[REG_SEQUENCE:%[0-9]+]]:sgpr_64 = REG_SEQUENCE [[COPY1]], %subreg.sub0, [[S_MOV_B32_]], %subreg.sub1 + ; GCN: [[S_LOAD_DWORDX4_IMM:%[0-9]+]]:sreg_128 = S_LOAD_DWORDX4_IMM killed [[REG_SEQUENCE]], 0, 0, 0 :: (dereferenceable invariant load 16 from %ir.arg0, addrspace 6) + ; GCN: [[BUFFER_LOAD_DWORDX4_OFFSET:%[0-9]+]]:vreg_128 = BUFFER_LOAD_DWORDX4_OFFSET [[S_LOAD_DWORDX4_IMM]], [[S_MOV_B32_]], 16, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load 16 from custom TargetCustom7 + 16, align 1, addrspace 4) + ; GCN: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 1, implicit $exec + ; GCN: [[BUFFER_LOAD_DWORDX4_IDXEN:%[0-9]+]]:vreg_128 = BUFFER_LOAD_DWORDX4_IDXEN [[V_MOV_B32_e32_]], [[S_LOAD_DWORDX4_IMM]], 
[[S_MOV_B32_]], 16, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load 16 from custom TargetCustom7, align 1, addrspace 4) + ; GCN: [[BUFFER_LOAD_DWORDX4_OFFEN:%[0-9]+]]:vreg_128 = BUFFER_LOAD_DWORDX4_OFFEN [[COPY]], [[S_LOAD_DWORDX4_IMM]], [[S_MOV_B32_]], 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load 16 from custom TargetCustom7, align 1, addrspace 4) + ; GCN: [[BUFFER_LOAD_DWORDX4_IDXEN1:%[0-9]+]]:vreg_128 = BUFFER_LOAD_DWORDX4_IDXEN [[COPY]], [[S_LOAD_DWORDX4_IMM]], [[S_MOV_B32_]], 16, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load 16 from custom TargetCustom7, align 1, addrspace 4) + ; GCN: INLINEASM &"", 1 + ; GCN: BUFFER_STORE_DWORDX4_OFFSET_exact killed [[BUFFER_LOAD_DWORDX4_OFFSET]], [[S_LOAD_DWORDX4_IMM]], [[S_MOV_B32_]], 32, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable store 16 into custom TargetCustom7 + 32, align 1, addrspace 4) + ; GCN: BUFFER_STORE_DWORDX4_OFFEN_exact killed [[BUFFER_LOAD_DWORDX4_OFFEN]], [[COPY]], [[S_LOAD_DWORDX4_IMM]], [[S_MOV_B32_]], 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable store 16 into custom TargetCustom7, align 1, addrspace 4) + ; GCN: BUFFER_STORE_DWORDX4_IDXEN_exact killed [[BUFFER_LOAD_DWORDX4_IDXEN]], [[V_MOV_B32_e32_]], [[S_LOAD_DWORDX4_IMM]], [[S_MOV_B32_]], 32, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable store 16 into custom TargetCustom7, align 1, addrspace 4) + ; GCN: BUFFER_STORE_DWORDX4_IDXEN_exact killed [[BUFFER_LOAD_DWORDX4_IDXEN1]], [[COPY]], [[S_LOAD_DWORDX4_IMM]], [[S_MOV_B32_]], 32, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable store 16 into custom TargetCustom7, align 1, addrspace 4) + ; GCN: INLINEASM &"", 1 + ; GCN: [[BUFFER_LOAD_FORMAT_XYZW_OFFSET:%[0-9]+]]:vreg_128 = BUFFER_LOAD_FORMAT_XYZW_OFFSET [[S_LOAD_DWORDX4_IMM]], [[S_MOV_B32_]], 48, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load 16 from custom TargetCustom7 + 48, align 1, addrspace 4) + ; GCN: [[BUFFER_LOAD_FORMAT_XYZW_IDXEN:%[0-9]+]]:vreg_128 = BUFFER_LOAD_FORMAT_XYZW_IDXEN 
[[V_MOV_B32_e32_]], [[S_LOAD_DWORDX4_IMM]], [[S_MOV_B32_]], 48, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load 16 from custom TargetCustom7, align 1, addrspace 4) + ; GCN: [[BUFFER_LOAD_FORMAT_XYZW_OFFEN:%[0-9]+]]:vreg_128 = BUFFER_LOAD_FORMAT_XYZW_OFFEN [[COPY]], [[S_LOAD_DWORDX4_IMM]], [[S_MOV_B32_]], 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load 16 from custom TargetCustom7, align 1, addrspace 4) + ; GCN: [[BUFFER_LOAD_FORMAT_XYZW_IDXEN1:%[0-9]+]]:vreg_128 = BUFFER_LOAD_FORMAT_XYZW_IDXEN [[COPY]], [[S_LOAD_DWORDX4_IMM]], [[S_MOV_B32_]], 48, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load 16 from custom TargetCustom7, align 1, addrspace 4) + ; GCN: INLINEASM &"", 1 + ; GCN: BUFFER_STORE_FORMAT_XYZW_OFFSET_exact killed [[BUFFER_LOAD_FORMAT_XYZW_OFFSET]], [[S_LOAD_DWORDX4_IMM]], [[S_MOV_B32_]], 64, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable store 16 into custom TargetCustom7 + 64, align 1, addrspace 4) + ; GCN: BUFFER_STORE_FORMAT_XYZW_OFFEN_exact killed [[BUFFER_LOAD_FORMAT_XYZW_OFFEN]], [[COPY]], [[S_LOAD_DWORDX4_IMM]], [[S_MOV_B32_]], 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable store 16 into custom TargetCustom7, align 1, addrspace 4) + ; GCN: BUFFER_STORE_FORMAT_XYZW_IDXEN_exact killed [[BUFFER_LOAD_FORMAT_XYZW_IDXEN]], [[V_MOV_B32_e32_]], [[S_LOAD_DWORDX4_IMM]], [[S_MOV_B32_]], 64, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable store 16 into custom TargetCustom7, align 1, addrspace 4) + ; GCN: BUFFER_STORE_FORMAT_XYZW_IDXEN_exact killed [[BUFFER_LOAD_FORMAT_XYZW_IDXEN1]], [[COPY]], [[S_LOAD_DWORDX4_IMM]], [[S_MOV_B32_]], 64, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable store 16 into custom TargetCustom7, align 1, addrspace 4) + ; GCN: INLINEASM &"", 1 + ; GCN: BUFFER_ATOMIC_ADD_OFFSET [[COPY]], [[S_LOAD_DWORDX4_IMM]], [[S_MOV_B32_]], 80, 0, implicit $exec :: (volatile dereferenceable load store 4 on custom TargetCustom7 + 80, align 1, addrspace 4) + ; GCN: BUFFER_ATOMIC_ADD_OFFEN [[COPY]], [[COPY]], 
[[S_LOAD_DWORDX4_IMM]], [[S_MOV_B32_]], 0, 0, implicit $exec :: (volatile dereferenceable load store 4 on custom TargetCustom7, align 1, addrspace 4) + ; GCN: BUFFER_ATOMIC_ADD_IDXEN [[COPY]], [[V_MOV_B32_e32_]], [[S_LOAD_DWORDX4_IMM]], [[S_MOV_B32_]], 80, 0, implicit $exec :: (volatile dereferenceable load store 4 on custom TargetCustom7, align 1, addrspace 4) + ; GCN: BUFFER_ATOMIC_ADD_IDXEN [[COPY]], [[COPY]], [[S_LOAD_DWORDX4_IMM]], [[S_MOV_B32_]], 80, 0, implicit $exec :: (volatile dereferenceable load store 4 on custom TargetCustom7, align 1, addrspace 4) + ; GCN: INLINEASM &"", 1 + ; GCN: [[REG_SEQUENCE1:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[COPY]], %subreg.sub0, [[COPY]], %subreg.sub1 + ; GCN: [[DEF:%[0-9]+]]:vreg_64 = IMPLICIT_DEF + ; GCN: BUFFER_ATOMIC_CMPSWAP_OFFSET [[REG_SEQUENCE1]], [[S_LOAD_DWORDX4_IMM]], [[S_MOV_B32_]], 96, 0, implicit $exec :: (volatile dereferenceable load store 4 on custom TargetCustom7 + 96, align 1, addrspace 4) + ; GCN: [[COPY2:%[0-9]+]]:sreg_32_xm0 = COPY [[DEF]].sub0 + ; GCN: [[DEF1:%[0-9]+]]:vreg_64 = IMPLICIT_DEF + ; GCN: BUFFER_ATOMIC_CMPSWAP_OFFEN [[REG_SEQUENCE1]], [[COPY]], [[S_LOAD_DWORDX4_IMM]], [[S_MOV_B32_]], 0, 0, implicit $exec :: (volatile dereferenceable load store 4 on custom TargetCustom7, align 1, addrspace 4) + ; GCN: [[COPY3:%[0-9]+]]:sreg_32_xm0 = COPY [[DEF1]].sub0 + ; GCN: [[DEF2:%[0-9]+]]:vreg_64 = IMPLICIT_DEF + ; GCN: BUFFER_ATOMIC_CMPSWAP_IDXEN [[REG_SEQUENCE1]], [[V_MOV_B32_e32_]], [[S_LOAD_DWORDX4_IMM]], [[S_MOV_B32_]], 96, 0, implicit $exec :: (volatile dereferenceable load store 4 on custom TargetCustom7, align 1, addrspace 4) + ; GCN: [[COPY4:%[0-9]+]]:sreg_32_xm0 = COPY [[DEF2]].sub0 + ; GCN: [[DEF3:%[0-9]+]]:vreg_64 = IMPLICIT_DEF + ; GCN: BUFFER_ATOMIC_CMPSWAP_IDXEN [[REG_SEQUENCE1]], [[COPY]], [[S_LOAD_DWORDX4_IMM]], [[S_MOV_B32_]], 96, 0, implicit $exec :: (volatile dereferenceable load store 4 on custom TargetCustom7, align 1, addrspace 4) + ; GCN: [[COPY5:%[0-9]+]]:sreg_32_xm0 = COPY 
[[DEF3]].sub0 + ; GCN: INLINEASM &"", 1 + ; GCN: [[V_MOV_B32_e32_1:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 1065353216, implicit $exec + ; GCN: BUFFER_ATOMIC_ADD_F32_OFFSET [[V_MOV_B32_e32_1]], [[S_LOAD_DWORDX4_IMM]], [[S_MOV_B32_]], 112, 0, implicit $exec :: (load store 4 on custom TargetCustom7 + 112, addrspace 4) + ; GCN: BUFFER_ATOMIC_ADD_F32_OFFEN [[V_MOV_B32_e32_1]], [[COPY]], [[S_LOAD_DWORDX4_IMM]], [[S_MOV_B32_]], 0, 0, implicit $exec :: (load store 4 on custom TargetCustom7, addrspace 4) + ; GCN: BUFFER_ATOMIC_ADD_F32_IDXEN [[V_MOV_B32_e32_1]], [[V_MOV_B32_e32_]], [[S_LOAD_DWORDX4_IMM]], [[S_MOV_B32_]], 112, 0, implicit $exec :: (load store 4 on custom TargetCustom7, addrspace 4) + ; GCN: BUFFER_ATOMIC_ADD_F32_IDXEN [[V_MOV_B32_e32_1]], [[COPY]], [[S_LOAD_DWORDX4_IMM]], [[S_MOV_B32_]], 112, 0, implicit $exec :: (load store 4 on custom TargetCustom7, addrspace 4) + ; GCN: INLINEASM &"", 1 + ; GCN: [[BUFFER_LOAD_DWORDX4_OFFSET1:%[0-9]+]]:vreg_128 = BUFFER_LOAD_DWORDX4_OFFSET [[S_LOAD_DWORDX4_IMM]], [[S_MOV_B32_]], 128, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load 16 from custom TargetCustom7 + 128, align 1, addrspace 4) + ; GCN: [[S_MOV_B32_1:%[0-9]+]]:sreg_32_xm0 = S_MOV_B32 64 + ; GCN: [[BUFFER_LOAD_DWORDX4_OFFSET2:%[0-9]+]]:vreg_128 = BUFFER_LOAD_DWORDX4_OFFSET [[S_LOAD_DWORDX4_IMM]], killed [[S_MOV_B32_1]], 64, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load 16 from custom TargetCustom7 + 128, align 1, addrspace 4) + ; GCN: [[S_MOV_B32_2:%[0-9]+]]:sreg_32_xm0 = S_MOV_B32 128 + ; GCN: [[BUFFER_LOAD_DWORDX4_OFFSET3:%[0-9]+]]:vreg_128 = BUFFER_LOAD_DWORDX4_OFFSET [[S_LOAD_DWORDX4_IMM]], [[S_MOV_B32_2]], 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load 16 from custom TargetCustom7 + 128, align 1, addrspace 4) + ; GCN: [[BUFFER_LOAD_DWORDX4_OFFEN1:%[0-9]+]]:vreg_128 = BUFFER_LOAD_DWORDX4_OFFEN [[COPY]], [[S_LOAD_DWORDX4_IMM]], [[S_MOV_B32_2]], 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load 16 from custom TargetCustom7, align 1, 
addrspace 4) + ; GCN: [[COPY6:%[0-9]+]]:sreg_32 = COPY [[COPY]] + ; GCN: [[BUFFER_LOAD_DWORDX4_OFFSET4:%[0-9]+]]:vreg_128 = BUFFER_LOAD_DWORDX4_OFFSET [[S_LOAD_DWORDX4_IMM]], [[COPY6]], 128, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load 16 from custom TargetCustom7, align 1, addrspace 4) + ; GCN: INLINEASM &"", 1 + ; GCN: [[BUFFER_LOAD_FORMAT_XYZW_OFFSET1:%[0-9]+]]:vreg_128 = BUFFER_LOAD_FORMAT_XYZW_OFFSET [[S_LOAD_DWORDX4_IMM]], [[S_MOV_B32_]], 144, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load 16 from custom TargetCustom7 + 144, align 1, addrspace 4) + ; GCN: [[S_MOV_B32_3:%[0-9]+]]:sreg_32_xm0 = S_MOV_B32 72 + ; GCN: [[BUFFER_LOAD_FORMAT_XYZW_OFFSET2:%[0-9]+]]:vreg_128 = BUFFER_LOAD_FORMAT_XYZW_OFFSET [[S_LOAD_DWORDX4_IMM]], killed [[S_MOV_B32_3]], 72, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load 16 from custom TargetCustom7 + 144, align 1, addrspace 4) + ; GCN: [[S_MOV_B32_4:%[0-9]+]]:sreg_32_xm0 = S_MOV_B32 144 + ; GCN: [[BUFFER_LOAD_FORMAT_XYZW_OFFSET3:%[0-9]+]]:vreg_128 = BUFFER_LOAD_FORMAT_XYZW_OFFSET [[S_LOAD_DWORDX4_IMM]], [[S_MOV_B32_4]], 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load 16 from custom TargetCustom7 + 144, align 1, addrspace 4) + ; GCN: [[BUFFER_LOAD_FORMAT_XYZW_OFFEN1:%[0-9]+]]:vreg_128 = BUFFER_LOAD_FORMAT_XYZW_OFFEN [[COPY]], [[S_LOAD_DWORDX4_IMM]], [[S_MOV_B32_4]], 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load 16 from custom TargetCustom7, align 1, addrspace 4) + ; GCN: [[COPY7:%[0-9]+]]:sreg_32 = COPY [[COPY]] + ; GCN: [[BUFFER_LOAD_FORMAT_XYZW_OFFSET4:%[0-9]+]]:vreg_128 = BUFFER_LOAD_FORMAT_XYZW_OFFSET [[S_LOAD_DWORDX4_IMM]], [[COPY7]], 144, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load 16 from custom TargetCustom7, align 1, addrspace 4) + ; GCN: INLINEASM &"", 1 + ; GCN: BUFFER_ATOMIC_ADD_OFFSET [[COPY]], [[S_LOAD_DWORDX4_IMM]], [[S_MOV_B32_]], 160, 0, implicit $exec :: (volatile dereferenceable load store 4 on custom TargetCustom7 + 160, align 1, addrspace 4) + ; GCN: 
[[S_MOV_B32_5:%[0-9]+]]:sreg_32_xm0 = S_MOV_B32 80 + ; GCN: BUFFER_ATOMIC_ADD_OFFSET [[COPY]], [[S_LOAD_DWORDX4_IMM]], killed [[S_MOV_B32_5]], 80, 0, implicit $exec :: (volatile dereferenceable load store 4 on custom TargetCustom7 + 160, align 1, addrspace 4) + ; GCN: [[S_MOV_B32_6:%[0-9]+]]:sreg_32_xm0 = S_MOV_B32 160 + ; GCN: BUFFER_ATOMIC_ADD_OFFSET [[COPY]], [[S_LOAD_DWORDX4_IMM]], [[S_MOV_B32_6]], 0, 0, implicit $exec :: (volatile dereferenceable load store 4 on custom TargetCustom7 + 160, align 1, addrspace 4) + ; GCN: BUFFER_ATOMIC_ADD_OFFEN [[COPY]], [[COPY]], [[S_LOAD_DWORDX4_IMM]], [[S_MOV_B32_6]], 0, 0, implicit $exec :: (volatile dereferenceable load store 4 on custom TargetCustom7, align 1, addrspace 4) + ; GCN: [[COPY8:%[0-9]+]]:sreg_32 = COPY [[COPY]] + ; GCN: BUFFER_ATOMIC_ADD_OFFSET [[COPY]], [[S_LOAD_DWORDX4_IMM]], [[COPY8]], 160, 0, implicit $exec :: (volatile dereferenceable load store 4 on custom TargetCustom7, align 1, addrspace 4) + ; GCN: INLINEASM &"", 1 + ; GCN: [[DEF4:%[0-9]+]]:vreg_64 = IMPLICIT_DEF + ; GCN: BUFFER_ATOMIC_CMPSWAP_OFFSET [[REG_SEQUENCE1]], [[S_LOAD_DWORDX4_IMM]], [[S_MOV_B32_]], 176, 0, implicit $exec :: (volatile dereferenceable load store 4 on custom TargetCustom7 + 176, align 1, addrspace 4) + ; GCN: [[COPY9:%[0-9]+]]:sreg_32_xm0 = COPY [[DEF4]].sub0 + ; GCN: [[S_MOV_B32_7:%[0-9]+]]:sreg_32_xm0 = S_MOV_B32 88 + ; GCN: [[DEF5:%[0-9]+]]:vreg_64 = IMPLICIT_DEF + ; GCN: BUFFER_ATOMIC_CMPSWAP_OFFSET [[REG_SEQUENCE1]], [[S_LOAD_DWORDX4_IMM]], killed [[S_MOV_B32_7]], 88, 0, implicit $exec :: (volatile dereferenceable load store 4 on custom TargetCustom7 + 176, align 1, addrspace 4) + ; GCN: [[COPY10:%[0-9]+]]:sreg_32_xm0 = COPY [[DEF5]].sub0 + ; GCN: [[S_MOV_B32_8:%[0-9]+]]:sreg_32_xm0 = S_MOV_B32 176 + ; GCN: [[DEF6:%[0-9]+]]:vreg_64 = IMPLICIT_DEF + ; GCN: BUFFER_ATOMIC_CMPSWAP_OFFSET [[REG_SEQUENCE1]], [[S_LOAD_DWORDX4_IMM]], [[S_MOV_B32_8]], 0, 0, implicit $exec :: (volatile dereferenceable load store 4 on custom 
TargetCustom7 + 176, align 1, addrspace 4) + ; GCN: [[COPY11:%[0-9]+]]:sreg_32_xm0 = COPY [[DEF6]].sub0 + ; GCN: [[DEF7:%[0-9]+]]:vreg_64 = IMPLICIT_DEF + ; GCN: BUFFER_ATOMIC_CMPSWAP_OFFEN [[REG_SEQUENCE1]], [[COPY]], [[S_LOAD_DWORDX4_IMM]], [[S_MOV_B32_8]], 0, 0, implicit $exec :: (volatile dereferenceable load store 4 on custom TargetCustom7, align 1, addrspace 4) + ; GCN: [[COPY12:%[0-9]+]]:sreg_32_xm0 = COPY [[DEF7]].sub0 + ; GCN: [[COPY13:%[0-9]+]]:sreg_32 = COPY [[COPY]] + ; GCN: [[DEF8:%[0-9]+]]:vreg_64 = IMPLICIT_DEF + ; GCN: BUFFER_ATOMIC_CMPSWAP_OFFSET [[REG_SEQUENCE1]], [[S_LOAD_DWORDX4_IMM]], [[COPY13]], 176, 0, implicit $exec :: (volatile dereferenceable load store 4 on custom TargetCustom7, align 1, addrspace 4) + ; GCN: [[COPY14:%[0-9]+]]:sreg_32_xm0 = COPY [[DEF8]].sub0 + ; GCN: INLINEASM &"", 1 + ; GCN: BUFFER_STORE_DWORDX4_OFFSET_exact killed [[BUFFER_LOAD_DWORDX4_OFFSET1]], [[S_LOAD_DWORDX4_IMM]], [[S_MOV_B32_]], 192, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable store 16 into custom TargetCustom7 + 192, align 1, addrspace 4) + ; GCN: [[S_MOV_B32_9:%[0-9]+]]:sreg_32_xm0 = S_MOV_B32 96 + ; GCN: BUFFER_STORE_DWORDX4_OFFSET_exact killed [[BUFFER_LOAD_DWORDX4_OFFSET2]], [[S_LOAD_DWORDX4_IMM]], killed [[S_MOV_B32_9]], 96, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable store 16 into custom TargetCustom7 + 192, align 1, addrspace 4) + ; GCN: [[S_MOV_B32_10:%[0-9]+]]:sreg_32_xm0 = S_MOV_B32 192 + ; GCN: BUFFER_STORE_DWORDX4_OFFSET_exact killed [[BUFFER_LOAD_DWORDX4_OFFSET3]], [[S_LOAD_DWORDX4_IMM]], [[S_MOV_B32_10]], 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable store 16 into custom TargetCustom7 + 192, align 1, addrspace 4) + ; GCN: BUFFER_STORE_DWORDX4_OFFEN_exact killed [[BUFFER_LOAD_DWORDX4_OFFEN1]], [[COPY]], [[S_LOAD_DWORDX4_IMM]], [[S_MOV_B32_10]], 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable store 16 into custom TargetCustom7, align 1, addrspace 4) + ; GCN: [[COPY15:%[0-9]+]]:sreg_32 = COPY [[COPY]] + ; GCN: 
BUFFER_STORE_DWORDX4_OFFSET_exact killed [[BUFFER_LOAD_DWORDX4_OFFSET4]], [[S_LOAD_DWORDX4_IMM]], [[COPY15]], 192, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable store 16 into custom TargetCustom7, align 1, addrspace 4) + ; GCN: INLINEASM &"", 1 + ; GCN: BUFFER_STORE_FORMAT_XYZW_OFFSET_exact killed [[BUFFER_LOAD_FORMAT_XYZW_OFFSET1]], [[S_LOAD_DWORDX4_IMM]], [[S_MOV_B32_]], 208, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable store 16 into custom TargetCustom7 + 208, align 1, addrspace 4) + ; GCN: [[S_MOV_B32_11:%[0-9]+]]:sreg_32_xm0 = S_MOV_B32 104 + ; GCN: BUFFER_STORE_FORMAT_XYZW_OFFSET_exact killed [[BUFFER_LOAD_FORMAT_XYZW_OFFSET2]], [[S_LOAD_DWORDX4_IMM]], killed [[S_MOV_B32_11]], 104, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable store 16 into custom TargetCustom7 + 208, align 1, addrspace 4) + ; GCN: [[S_MOV_B32_12:%[0-9]+]]:sreg_32_xm0 = S_MOV_B32 208 + ; GCN: BUFFER_STORE_FORMAT_XYZW_OFFSET_exact killed [[BUFFER_LOAD_FORMAT_XYZW_OFFSET3]], [[S_LOAD_DWORDX4_IMM]], [[S_MOV_B32_12]], 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable store 16 into custom TargetCustom7 + 208, align 1, addrspace 4) + ; GCN: BUFFER_STORE_FORMAT_XYZW_OFFEN_exact killed [[BUFFER_LOAD_FORMAT_XYZW_OFFEN1]], [[COPY]], [[S_LOAD_DWORDX4_IMM]], [[S_MOV_B32_12]], 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable store 16 into custom TargetCustom7, align 1, addrspace 4) + ; GCN: [[COPY16:%[0-9]+]]:sreg_32 = COPY [[COPY]] + ; GCN: BUFFER_STORE_FORMAT_XYZW_OFFSET_exact killed [[BUFFER_LOAD_FORMAT_XYZW_OFFSET4]], [[S_LOAD_DWORDX4_IMM]], [[COPY16]], 208, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable store 16 into custom TargetCustom7, align 1, addrspace 4) + ; GCN: INLINEASM &"", 1 + ; GCN: [[COPY17:%[0-9]+]]:vgpr_32 = COPY [[S_MOV_B32_]] + ; GCN: [[BUFFER_LOAD_DWORDX4_IDXEN2:%[0-9]+]]:vreg_128 = BUFFER_LOAD_DWORDX4_IDXEN [[COPY17]], [[S_LOAD_DWORDX4_IMM]], [[S_MOV_B32_]], 224, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load 16 from custom TargetCustom7 + 224, align 
1, addrspace 4) + ; GCN: [[S_MOV_B32_13:%[0-9]+]]:sreg_32_xm0 = S_MOV_B32 112 + ; GCN: [[COPY18:%[0-9]+]]:vgpr_32 = COPY [[S_MOV_B32_]] + ; GCN: [[BUFFER_LOAD_DWORDX4_IDXEN3:%[0-9]+]]:vreg_128 = BUFFER_LOAD_DWORDX4_IDXEN [[COPY18]], [[S_LOAD_DWORDX4_IMM]], killed [[S_MOV_B32_13]], 112, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load 16 from custom TargetCustom7 + 224, align 1, addrspace 4) + ; GCN: [[S_MOV_B32_14:%[0-9]+]]:sreg_32_xm0 = S_MOV_B32 224 + ; GCN: [[COPY19:%[0-9]+]]:vgpr_32 = COPY [[S_MOV_B32_]] + ; GCN: [[BUFFER_LOAD_DWORDX4_IDXEN4:%[0-9]+]]:vreg_128 = BUFFER_LOAD_DWORDX4_IDXEN [[COPY19]], [[S_LOAD_DWORDX4_IMM]], [[S_MOV_B32_14]], 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load 16 from custom TargetCustom7 + 224, align 1, addrspace 4) + ; GCN: [[REG_SEQUENCE2:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[COPY]], %subreg.sub1 + ; GCN: [[BUFFER_LOAD_DWORDX4_BOTHEN:%[0-9]+]]:vreg_128 = BUFFER_LOAD_DWORDX4_BOTHEN [[REG_SEQUENCE2]], [[S_LOAD_DWORDX4_IMM]], [[S_MOV_B32_14]], 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load 16 from custom TargetCustom7, align 1, addrspace 4) + ; GCN: [[COPY20:%[0-9]+]]:vgpr_32 = COPY [[S_MOV_B32_]] + ; GCN: [[COPY21:%[0-9]+]]:sreg_32 = COPY [[COPY]] + ; GCN: [[BUFFER_LOAD_DWORDX4_IDXEN5:%[0-9]+]]:vreg_128 = BUFFER_LOAD_DWORDX4_IDXEN [[COPY20]], [[S_LOAD_DWORDX4_IMM]], [[COPY21]], 224, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load 16 from custom TargetCustom7, align 1, addrspace 4) + ; GCN: [[BUFFER_LOAD_DWORDX4_IDXEN6:%[0-9]+]]:vreg_128 = BUFFER_LOAD_DWORDX4_IDXEN [[V_MOV_B32_e32_]], [[S_LOAD_DWORDX4_IMM]], [[S_MOV_B32_]], 224, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load 16 from custom TargetCustom7, align 1, addrspace 4) + ; GCN: [[BUFFER_LOAD_DWORDX4_IDXEN7:%[0-9]+]]:vreg_128 = BUFFER_LOAD_DWORDX4_IDXEN [[COPY]], [[S_LOAD_DWORDX4_IMM]], [[S_MOV_B32_]], 224, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load 16 from custom TargetCustom7, align 1, addrspace 
4) + ; GCN: INLINEASM &"", 1 + ; GCN: [[COPY22:%[0-9]+]]:vgpr_32 = COPY [[S_MOV_B32_]] + ; GCN: [[BUFFER_LOAD_FORMAT_XYZW_IDXEN2:%[0-9]+]]:vreg_128 = BUFFER_LOAD_FORMAT_XYZW_IDXEN [[COPY22]], [[S_LOAD_DWORDX4_IMM]], [[S_MOV_B32_]], 240, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load 16 from custom TargetCustom7 + 240, align 1, addrspace 4) + ; GCN: [[S_MOV_B32_15:%[0-9]+]]:sreg_32_xm0 = S_MOV_B32 120 + ; GCN: [[COPY23:%[0-9]+]]:vgpr_32 = COPY [[S_MOV_B32_]] + ; GCN: [[BUFFER_LOAD_FORMAT_XYZW_IDXEN3:%[0-9]+]]:vreg_128 = BUFFER_LOAD_FORMAT_XYZW_IDXEN [[COPY23]], [[S_LOAD_DWORDX4_IMM]], killed [[S_MOV_B32_15]], 120, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load 16 from custom TargetCustom7 + 240, align 1, addrspace 4) + ; GCN: [[S_MOV_B32_16:%[0-9]+]]:sreg_32_xm0 = S_MOV_B32 240 + ; GCN: [[COPY24:%[0-9]+]]:vgpr_32 = COPY [[S_MOV_B32_]] + ; GCN: [[BUFFER_LOAD_FORMAT_XYZW_IDXEN4:%[0-9]+]]:vreg_128 = BUFFER_LOAD_FORMAT_XYZW_IDXEN [[COPY24]], [[S_LOAD_DWORDX4_IMM]], [[S_MOV_B32_16]], 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load 16 from custom TargetCustom7 + 240, align 1, addrspace 4) + ; GCN: [[BUFFER_LOAD_FORMAT_XYZW_BOTHEN:%[0-9]+]]:vreg_128 = BUFFER_LOAD_FORMAT_XYZW_BOTHEN [[REG_SEQUENCE2]], [[S_LOAD_DWORDX4_IMM]], [[S_MOV_B32_16]], 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load 16 from custom TargetCustom7, align 1, addrspace 4) + ; GCN: [[COPY25:%[0-9]+]]:vgpr_32 = COPY [[S_MOV_B32_]] + ; GCN: [[COPY26:%[0-9]+]]:sreg_32 = COPY [[COPY]] + ; GCN: [[BUFFER_LOAD_FORMAT_XYZW_IDXEN5:%[0-9]+]]:vreg_128 = BUFFER_LOAD_FORMAT_XYZW_IDXEN [[COPY25]], [[S_LOAD_DWORDX4_IMM]], [[COPY26]], 240, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load 16 from custom TargetCustom7, align 1, addrspace 4) + ; GCN: [[BUFFER_LOAD_FORMAT_XYZW_IDXEN6:%[0-9]+]]:vreg_128 = BUFFER_LOAD_FORMAT_XYZW_IDXEN [[V_MOV_B32_e32_]], [[S_LOAD_DWORDX4_IMM]], [[S_MOV_B32_]], 240, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load 16 from custom TargetCustom7, 
align 1, addrspace 4) + ; GCN: [[BUFFER_LOAD_FORMAT_XYZW_IDXEN7:%[0-9]+]]:vreg_128 = BUFFER_LOAD_FORMAT_XYZW_IDXEN [[COPY]], [[S_LOAD_DWORDX4_IMM]], [[S_MOV_B32_]], 240, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load 16 from custom TargetCustom7, align 1, addrspace 4) + ; GCN: INLINEASM &"", 1 + ; GCN: [[COPY27:%[0-9]+]]:vgpr_32 = COPY [[S_MOV_B32_]] + ; GCN: BUFFER_ATOMIC_ADD_IDXEN [[COPY]], [[COPY27]], [[S_LOAD_DWORDX4_IMM]], [[S_MOV_B32_]], 256, 0, implicit $exec :: (volatile dereferenceable load store 4 on custom TargetCustom7 + 256, align 1, addrspace 4) + ; GCN: [[COPY28:%[0-9]+]]:vgpr_32 = COPY [[S_MOV_B32_]] + ; GCN: BUFFER_ATOMIC_ADD_IDXEN [[COPY]], [[COPY28]], [[S_LOAD_DWORDX4_IMM]], [[S_MOV_B32_2]], 128, 0, implicit $exec :: (volatile dereferenceable load store 4 on custom TargetCustom7 + 256, align 1, addrspace 4) + ; GCN: [[S_MOV_B32_17:%[0-9]+]]:sreg_32_xm0 = S_MOV_B32 256 + ; GCN: [[COPY29:%[0-9]+]]:vgpr_32 = COPY [[S_MOV_B32_]] + ; GCN: BUFFER_ATOMIC_ADD_IDXEN [[COPY]], [[COPY29]], [[S_LOAD_DWORDX4_IMM]], [[S_MOV_B32_17]], 0, 0, implicit $exec :: (volatile dereferenceable load store 4 on custom TargetCustom7 + 256, align 1, addrspace 4) + ; GCN: BUFFER_ATOMIC_ADD_BOTHEN [[COPY]], [[REG_SEQUENCE2]], [[S_LOAD_DWORDX4_IMM]], [[S_MOV_B32_17]], 0, 0, implicit $exec :: (volatile dereferenceable load store 4 on custom TargetCustom7, align 1, addrspace 4) + ; GCN: [[COPY30:%[0-9]+]]:vgpr_32 = COPY [[S_MOV_B32_]] + ; GCN: [[COPY31:%[0-9]+]]:sreg_32 = COPY [[COPY]] + ; GCN: BUFFER_ATOMIC_ADD_IDXEN [[COPY]], [[COPY30]], [[S_LOAD_DWORDX4_IMM]], [[COPY31]], 256, 0, implicit $exec :: (volatile dereferenceable load store 4 on custom TargetCustom7, align 1, addrspace 4) + ; GCN: BUFFER_ATOMIC_ADD_IDXEN [[COPY]], [[V_MOV_B32_e32_]], [[S_LOAD_DWORDX4_IMM]], [[S_MOV_B32_]], 256, 0, implicit $exec :: (volatile dereferenceable load store 4 on custom TargetCustom7, align 1, addrspace 4) + ; GCN: BUFFER_ATOMIC_ADD_IDXEN [[COPY]], [[COPY]], [[S_LOAD_DWORDX4_IMM]], 
[[S_MOV_B32_]], 256, 0, implicit $exec :: (volatile dereferenceable load store 4 on custom TargetCustom7, align 1, addrspace 4) + ; GCN: INLINEASM &"", 1 + ; GCN: [[COPY32:%[0-9]+]]:vgpr_32 = COPY [[S_MOV_B32_]] + ; GCN: [[DEF9:%[0-9]+]]:vreg_64 = IMPLICIT_DEF + ; GCN: BUFFER_ATOMIC_CMPSWAP_IDXEN [[REG_SEQUENCE1]], [[COPY32]], [[S_LOAD_DWORDX4_IMM]], [[S_MOV_B32_]], 272, 0, implicit $exec :: (volatile dereferenceable load store 4 on custom TargetCustom7 + 272, align 1, addrspace 4) + ; GCN: [[COPY33:%[0-9]+]]:sreg_32_xm0 = COPY [[DEF9]].sub0 + ; GCN: [[S_MOV_B32_18:%[0-9]+]]:sreg_32_xm0 = S_MOV_B32 136 + ; GCN: [[COPY34:%[0-9]+]]:vgpr_32 = COPY [[S_MOV_B32_]] + ; GCN: [[DEF10:%[0-9]+]]:vreg_64 = IMPLICIT_DEF + ; GCN: BUFFER_ATOMIC_CMPSWAP_IDXEN [[REG_SEQUENCE1]], [[COPY34]], [[S_LOAD_DWORDX4_IMM]], killed [[S_MOV_B32_18]], 136, 0, implicit $exec :: (volatile dereferenceable load store 4 on custom TargetCustom7 + 272, align 1, addrspace 4) + ; GCN: [[COPY35:%[0-9]+]]:sreg_32_xm0 = COPY [[DEF10]].sub0 + ; GCN: [[S_MOV_B32_19:%[0-9]+]]:sreg_32_xm0 = S_MOV_B32 272 + ; GCN: [[COPY36:%[0-9]+]]:vgpr_32 = COPY [[S_MOV_B32_]] + ; GCN: [[DEF11:%[0-9]+]]:vreg_64 = IMPLICIT_DEF + ; GCN: BUFFER_ATOMIC_CMPSWAP_IDXEN [[REG_SEQUENCE1]], [[COPY36]], [[S_LOAD_DWORDX4_IMM]], [[S_MOV_B32_19]], 0, 0, implicit $exec :: (volatile dereferenceable load store 4 on custom TargetCustom7 + 272, align 1, addrspace 4) + ; GCN: [[COPY37:%[0-9]+]]:sreg_32_xm0 = COPY [[DEF11]].sub0 + ; GCN: [[DEF12:%[0-9]+]]:vreg_64 = IMPLICIT_DEF + ; GCN: BUFFER_ATOMIC_CMPSWAP_BOTHEN [[REG_SEQUENCE1]], [[REG_SEQUENCE2]], [[S_LOAD_DWORDX4_IMM]], [[S_MOV_B32_19]], 0, 0, implicit $exec :: (volatile dereferenceable load store 4 on custom TargetCustom7, align 1, addrspace 4) + ; GCN: [[COPY38:%[0-9]+]]:sreg_32_xm0 = COPY [[DEF12]].sub0 + ; GCN: [[COPY39:%[0-9]+]]:vgpr_32 = COPY [[S_MOV_B32_]] + ; GCN: [[COPY40:%[0-9]+]]:sreg_32 = COPY [[COPY]] + ; GCN: [[DEF13:%[0-9]+]]:vreg_64 = IMPLICIT_DEF + ; GCN: 
BUFFER_ATOMIC_CMPSWAP_IDXEN [[REG_SEQUENCE1]], [[COPY39]], [[S_LOAD_DWORDX4_IMM]], [[COPY40]], 272, 0, implicit $exec :: (volatile dereferenceable load store 4 on custom TargetCustom7, align 1, addrspace 4) + ; GCN: [[COPY41:%[0-9]+]]:sreg_32_xm0 = COPY [[DEF13]].sub0 + ; GCN: [[DEF14:%[0-9]+]]:vreg_64 = IMPLICIT_DEF + ; GCN: BUFFER_ATOMIC_CMPSWAP_IDXEN [[REG_SEQUENCE1]], [[V_MOV_B32_e32_]], [[S_LOAD_DWORDX4_IMM]], [[S_MOV_B32_]], 272, 0, implicit $exec :: (volatile dereferenceable load store 4 on custom TargetCustom7, align 1, addrspace 4) + ; GCN: [[COPY42:%[0-9]+]]:sreg_32_xm0 = COPY [[DEF14]].sub0 + ; GCN: [[DEF15:%[0-9]+]]:vreg_64 = IMPLICIT_DEF + ; GCN: BUFFER_ATOMIC_CMPSWAP_IDXEN [[REG_SEQUENCE1]], [[COPY]], [[S_LOAD_DWORDX4_IMM]], [[S_MOV_B32_]], 272, 0, implicit $exec :: (volatile dereferenceable load store 4 on custom TargetCustom7, align 1, addrspace 4) + ; GCN: [[COPY43:%[0-9]+]]:sreg_32_xm0 = COPY [[DEF15]].sub0 + ; GCN: INLINEASM &"", 1 + ; GCN: [[COPY44:%[0-9]+]]:vgpr_32 = COPY [[S_MOV_B32_]] + ; GCN: BUFFER_STORE_DWORDX4_IDXEN_exact killed [[BUFFER_LOAD_DWORDX4_IDXEN2]], [[COPY44]], [[S_LOAD_DWORDX4_IMM]], [[S_MOV_B32_]], 288, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable store 16 into custom TargetCustom7 + 288, align 1, addrspace 4) + ; GCN: [[COPY45:%[0-9]+]]:vgpr_32 = COPY [[S_MOV_B32_]] + ; GCN: BUFFER_STORE_DWORDX4_IDXEN_exact killed [[BUFFER_LOAD_DWORDX4_IDXEN3]], [[COPY45]], [[S_LOAD_DWORDX4_IMM]], [[S_MOV_B32_4]], 144, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable store 16 into custom TargetCustom7 + 288, align 1, addrspace 4) + ; GCN: [[S_MOV_B32_20:%[0-9]+]]:sreg_32_xm0 = S_MOV_B32 288 + ; GCN: [[COPY46:%[0-9]+]]:vgpr_32 = COPY [[S_MOV_B32_]] + ; GCN: BUFFER_STORE_DWORDX4_IDXEN_exact killed [[BUFFER_LOAD_DWORDX4_IDXEN4]], [[COPY46]], [[S_LOAD_DWORDX4_IMM]], [[S_MOV_B32_20]], 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable store 16 into custom TargetCustom7 + 288, align 1, addrspace 4) + ; GCN: 
BUFFER_STORE_DWORDX4_BOTHEN_exact killed [[BUFFER_LOAD_DWORDX4_BOTHEN]], [[REG_SEQUENCE2]], [[S_LOAD_DWORDX4_IMM]], [[S_MOV_B32_20]], 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable store 16 into custom TargetCustom7, align 1, addrspace 4) + ; GCN: [[COPY47:%[0-9]+]]:vgpr_32 = COPY [[S_MOV_B32_]] + ; GCN: [[COPY48:%[0-9]+]]:sreg_32 = COPY [[COPY]] + ; GCN: BUFFER_STORE_DWORDX4_IDXEN_exact killed [[BUFFER_LOAD_DWORDX4_IDXEN5]], [[COPY47]], [[S_LOAD_DWORDX4_IMM]], [[COPY48]], 288, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable store 16 into custom TargetCustom7, align 1, addrspace 4) + ; GCN: BUFFER_STORE_DWORDX4_IDXEN_exact killed [[BUFFER_LOAD_DWORDX4_IDXEN6]], [[V_MOV_B32_e32_]], [[S_LOAD_DWORDX4_IMM]], [[S_MOV_B32_]], 288, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable store 16 into custom TargetCustom7, align 1, addrspace 4) + ; GCN: BUFFER_STORE_DWORDX4_IDXEN_exact killed [[BUFFER_LOAD_DWORDX4_IDXEN7]], [[COPY]], [[S_LOAD_DWORDX4_IMM]], [[S_MOV_B32_]], 288, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable store 16 into custom TargetCustom7, align 1, addrspace 4) + ; GCN: INLINEASM &"", 1 + ; GCN: [[COPY49:%[0-9]+]]:vgpr_32 = COPY [[S_MOV_B32_]] + ; GCN: BUFFER_STORE_FORMAT_XYZW_IDXEN_exact killed [[BUFFER_LOAD_FORMAT_XYZW_IDXEN2]], [[COPY49]], [[S_LOAD_DWORDX4_IMM]], [[S_MOV_B32_]], 304, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable store 16 into custom TargetCustom7 + 304, align 1, addrspace 4) + ; GCN: [[S_MOV_B32_21:%[0-9]+]]:sreg_32_xm0 = S_MOV_B32 152 + ; GCN: [[COPY50:%[0-9]+]]:vgpr_32 = COPY [[S_MOV_B32_]] + ; GCN: BUFFER_STORE_FORMAT_XYZW_IDXEN_exact killed [[BUFFER_LOAD_FORMAT_XYZW_IDXEN3]], [[COPY50]], [[S_LOAD_DWORDX4_IMM]], killed [[S_MOV_B32_21]], 152, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable store 16 into custom TargetCustom7 + 304, align 1, addrspace 4) + ; GCN: [[S_MOV_B32_22:%[0-9]+]]:sreg_32_xm0 = S_MOV_B32 304 + ; GCN: [[COPY51:%[0-9]+]]:vgpr_32 = COPY [[S_MOV_B32_]] + ; GCN: BUFFER_STORE_FORMAT_XYZW_IDXEN_exact 
killed [[BUFFER_LOAD_FORMAT_XYZW_IDXEN4]], [[COPY51]], [[S_LOAD_DWORDX4_IMM]], [[S_MOV_B32_22]], 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable store 16 into custom TargetCustom7 + 304, align 1, addrspace 4) + ; GCN: BUFFER_STORE_FORMAT_XYZW_BOTHEN_exact killed [[BUFFER_LOAD_FORMAT_XYZW_BOTHEN]], [[REG_SEQUENCE2]], [[S_LOAD_DWORDX4_IMM]], [[S_MOV_B32_22]], 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable store 16 into custom TargetCustom7, align 1, addrspace 4) + ; GCN: [[COPY52:%[0-9]+]]:vgpr_32 = COPY [[S_MOV_B32_]] + ; GCN: [[COPY53:%[0-9]+]]:sreg_32 = COPY [[COPY]] + ; GCN: BUFFER_STORE_FORMAT_XYZW_IDXEN_exact killed [[BUFFER_LOAD_FORMAT_XYZW_IDXEN5]], [[COPY52]], [[S_LOAD_DWORDX4_IMM]], [[COPY53]], 304, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable store 16 into custom TargetCustom7, align 1, addrspace 4) + ; GCN: BUFFER_STORE_FORMAT_XYZW_IDXEN_exact killed [[BUFFER_LOAD_FORMAT_XYZW_IDXEN6]], [[V_MOV_B32_e32_]], [[S_LOAD_DWORDX4_IMM]], [[S_MOV_B32_]], 304, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable store 16 into custom TargetCustom7, align 1, addrspace 4) + ; GCN: BUFFER_STORE_FORMAT_XYZW_IDXEN_exact killed [[BUFFER_LOAD_FORMAT_XYZW_IDXEN7]], [[COPY]], [[S_LOAD_DWORDX4_IMM]], [[S_MOV_B32_]], 304, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable store 16 into custom TargetCustom7, align 1, addrspace 4) + ; GCN: S_ENDPGM 0 +bb.0: + %tmp0 = load <4 x i32>, <4 x i32> addrspace(6)* %arg0, align 16, !invariant.load !0 + %buffer0 = call nsz <4 x float> @llvm.amdgcn.buffer.load.v4f32(<4 x i32> %tmp0, i32 0, i32 16, i1 false, i1 false) #0 + %buffer1 = call nsz <4 x float> @llvm.amdgcn.buffer.load.v4f32(<4 x i32> %tmp0, i32 0, i32 %arg1, i1 false, i1 false) #0 + %buffer2 = call nsz <4 x float> @llvm.amdgcn.buffer.load.v4f32(<4 x i32> %tmp0, i32 1, i32 16, i1 false, i1 false) #0 + %buffer3 = call nsz <4 x float> @llvm.amdgcn.buffer.load.v4f32(<4 x i32> %tmp0, i32 %arg1, i32 16, i1 false, i1 false) #0 + + ; Insert inline asm to keep the different 
instruction types from being mixed. This makes the output easier to read. + call void asm sideeffect "", "" () + + call void @llvm.amdgcn.buffer.store.v4f32(<4 x float> %buffer0, <4 x i32> %tmp0, i32 0, i32 32, i1 false, i1 false) #1 + call void @llvm.amdgcn.buffer.store.v4f32(<4 x float> %buffer1, <4 x i32> %tmp0, i32 0, i32 %arg1, i1 false, i1 false) #1 + call void @llvm.amdgcn.buffer.store.v4f32(<4 x float> %buffer2, <4 x i32> %tmp0, i32 1, i32 32, i1 false, i1 false) #1 + call void @llvm.amdgcn.buffer.store.v4f32(<4 x float> %buffer3, <4 x i32> %tmp0, i32 %arg1, i32 32, i1 false, i1 false) #1 + + call void asm sideeffect "", "" () + + %buffer_format0 = call nsz <4 x float> @llvm.amdgcn.buffer.load.format.v4f32(<4 x i32> %tmp0, i32 0, i32 48, i1 false, i1 false) #0 + %buffer_format1 = call nsz <4 x float> @llvm.amdgcn.buffer.load.format.v4f32(<4 x i32> %tmp0, i32 0, i32 %arg1, i1 false, i1 false) #0 + %buffer_format2 = call nsz <4 x float> @llvm.amdgcn.buffer.load.format.v4f32(<4 x i32> %tmp0, i32 1, i32 48, i1 false, i1 false) #0 + %buffer_format3 = call nsz <4 x float> @llvm.amdgcn.buffer.load.format.v4f32(<4 x i32> %tmp0, i32 %arg1, i32 48, i1 false, i1 false) #0 + + call void asm sideeffect "", "" () + + call void @llvm.amdgcn.buffer.store.format.v4f32(<4 x float> %buffer_format0, <4 x i32> %tmp0, i32 0, i32 64, i1 false, i1 false) #1 + call void @llvm.amdgcn.buffer.store.format.v4f32(<4 x float> %buffer_format1, <4 x i32> %tmp0, i32 0, i32 %arg1, i1 false, i1 false) #1 + call void @llvm.amdgcn.buffer.store.format.v4f32(<4 x float> %buffer_format2, <4 x i32> %tmp0, i32 1, i32 64, i1 false, i1 false) #1 + call void @llvm.amdgcn.buffer.store.format.v4f32(<4 x float> %buffer_format3, <4 x i32> %tmp0, i32 %arg1, i32 64, i1 false, i1 false) #1 + + call void asm sideeffect "", "" () + + %atomic_add0 = call i32 @llvm.amdgcn.buffer.atomic.add.i32(i32 %arg1, <4 x i32> %tmp0, i32 0, i32 80, i1 false) #2 + %atomic_add1 = call i32 @llvm.amdgcn.buffer.atomic.add.i32(i32 
%arg1, <4 x i32> %tmp0, i32 0, i32 %arg1, i1 false) #2 + %atomic_add2 = call i32 @llvm.amdgcn.buffer.atomic.add.i32(i32 %arg1, <4 x i32> %tmp0, i32 1, i32 80, i1 false) #2 + %atomic_add3 = call i32 @llvm.amdgcn.buffer.atomic.add.i32(i32 %arg1, <4 x i32> %tmp0, i32 %arg1, i32 80, i1 false) #2 + + call void asm sideeffect "", "" () + + %atomic_cmpswap0 = call i32 @llvm.amdgcn.buffer.atomic.cmpswap(i32 %arg1, i32 %arg1, <4 x i32> %tmp0, i32 0, i32 96, i1 false) #2 + %atomic_cmpswap1 = call i32 @llvm.amdgcn.buffer.atomic.cmpswap(i32 %arg1, i32 %arg1, <4 x i32> %tmp0, i32 0, i32 %arg1, i1 false) #2 + %atomic_cmpswap2 = call i32 @llvm.amdgcn.buffer.atomic.cmpswap(i32 %arg1, i32 %arg1, <4 x i32> %tmp0, i32 1, i32 96, i1 false) #2 + %atomic_cmpswap3 = call i32 @llvm.amdgcn.buffer.atomic.cmpswap(i32 %arg1, i32 %arg1, <4 x i32> %tmp0, i32 %arg1, i32 96, i1 false) #2 + + call void asm sideeffect "", "" () + + call void @llvm.amdgcn.buffer.atomic.fadd.f32(float 1.0, <4 x i32> %tmp0, i32 0, i32 112, i1 false) #2 + call void @llvm.amdgcn.buffer.atomic.fadd.f32(float 1.0, <4 x i32> %tmp0, i32 0, i32 %arg1, i1 false) #2 + call void @llvm.amdgcn.buffer.atomic.fadd.f32(float 1.0, <4 x i32> %tmp0, i32 1, i32 112, i1 false) #2 + call void @llvm.amdgcn.buffer.atomic.fadd.f32(float 1.0, <4 x i32> %tmp0, i32 %arg1, i32 112, i1 false) #2 + + call void asm sideeffect "", "" () + + ; rsrc, offset, soffset, cachepolicy + %raw_buffer0 = call <4 x float> @llvm.amdgcn.raw.buffer.load.v4f32(<4 x i32> %tmp0, i32 128, i32 0, i32 0) #0 + %raw_buffer1 = call <4 x float> @llvm.amdgcn.raw.buffer.load.v4f32(<4 x i32> %tmp0, i32 64, i32 64, i32 0) #0 + %raw_buffer2 = call <4 x float> @llvm.amdgcn.raw.buffer.load.v4f32(<4 x i32> %tmp0, i32 0, i32 128, i32 0) #0 + %raw_buffer3 = call <4 x float> @llvm.amdgcn.raw.buffer.load.v4f32(<4 x i32> %tmp0, i32 %arg1, i32 128, i32 0) #0 + %raw_buffer4 = call <4 x float> @llvm.amdgcn.raw.buffer.load.v4f32(<4 x i32> %tmp0, i32 128, i32 %arg1, i32 0) #0 + + call void 
asm sideeffect "", "" () + + %raw_buffer_format0 = call <4 x float> @llvm.amdgcn.raw.buffer.load.format.v4f32(<4 x i32> %tmp0, i32 144, i32 0, i32 0) #0 + %raw_buffer_format1 = call <4 x float> @llvm.amdgcn.raw.buffer.load.format.v4f32(<4 x i32> %tmp0, i32 72, i32 72, i32 0) #0 + %raw_buffer_format2 = call <4 x float> @llvm.amdgcn.raw.buffer.load.format.v4f32(<4 x i32> %tmp0, i32 0, i32 144, i32 0) #0 + %raw_buffer_format3 = call <4 x float> @llvm.amdgcn.raw.buffer.load.format.v4f32(<4 x i32> %tmp0, i32 %arg1, i32 144, i32 0) #0 + %raw_buffer_format4 = call <4 x float> @llvm.amdgcn.raw.buffer.load.format.v4f32(<4 x i32> %tmp0, i32 144, i32 %arg1, i32 0) #0 + + call void asm sideeffect "", "" () + + %raw_atomic_add0 = call i32 @llvm.amdgcn.raw.buffer.atomic.add.i32(i32 %arg1, <4 x i32> %tmp0, i32 160, i32 0, i32 0) #2 + %raw_atomic_add1 = call i32 @llvm.amdgcn.raw.buffer.atomic.add.i32(i32 %arg1, <4 x i32> %tmp0, i32 80, i32 80, i32 0) #2 + %raw_atomic_add2 = call i32 @llvm.amdgcn.raw.buffer.atomic.add.i32(i32 %arg1, <4 x i32> %tmp0, i32 0, i32 160, i32 0) #2 + %raw_atomic_add3 = call i32 @llvm.amdgcn.raw.buffer.atomic.add.i32(i32 %arg1, <4 x i32> %tmp0, i32 %arg1, i32 160, i32 0) #2 + %raw_atomic_add4 = call i32 @llvm.amdgcn.raw.buffer.atomic.add.i32(i32 %arg1, <4 x i32> %tmp0, i32 160, i32 %arg1, i32 0) #2 + + call void asm sideeffect "", "" () + + %raw_atomic_cmpswap0 = call i32 @llvm.amdgcn.raw.buffer.atomic.cmpswap.i32(i32 %arg1, i32 %arg1, <4 x i32> %tmp0, i32 176, i32 0, i32 0) #2 + %raw_atomic_cmpswap1 = call i32 @llvm.amdgcn.raw.buffer.atomic.cmpswap.i32(i32 %arg1, i32 %arg1, <4 x i32> %tmp0, i32 88, i32 88, i32 0) #2 + %raw_atomic_cmpswap2 = call i32 @llvm.amdgcn.raw.buffer.atomic.cmpswap.i32(i32 %arg1, i32 %arg1, <4 x i32> %tmp0, i32 0, i32 176, i32 0) #2 + %raw_atomic_cmpswap3 = call i32 @llvm.amdgcn.raw.buffer.atomic.cmpswap.i32(i32 %arg1, i32 %arg1, <4 x i32> %tmp0, i32 %arg1, i32 176, i32 0) #2 + %raw_atomic_cmpswap4 = call i32 
@llvm.amdgcn.raw.buffer.atomic.cmpswap.i32(i32 %arg1, i32 %arg1, <4 x i32> %tmp0, i32 176, i32 %arg1, i32 0) #2 + + call void asm sideeffect "", "" () + + call void @llvm.amdgcn.raw.buffer.store.v4f32(<4 x float> %raw_buffer0, <4 x i32> %tmp0, i32 192, i32 0, i32 0) #2 + call void @llvm.amdgcn.raw.buffer.store.v4f32(<4 x float> %raw_buffer1, <4 x i32> %tmp0, i32 96, i32 96, i32 0) #2 + call void @llvm.amdgcn.raw.buffer.store.v4f32(<4 x float> %raw_buffer2, <4 x i32> %tmp0, i32 0, i32 192, i32 0) #2 + call void @llvm.amdgcn.raw.buffer.store.v4f32(<4 x float> %raw_buffer3, <4 x i32> %tmp0, i32 %arg1, i32 192, i32 0) #2 + call void @llvm.amdgcn.raw.buffer.store.v4f32(<4 x float> %raw_buffer4, <4 x i32> %tmp0, i32 192, i32 %arg1, i32 0) #2 + + call void asm sideeffect "", "" () + + call void @llvm.amdgcn.raw.buffer.store.format.v4f32(<4 x float> %raw_buffer_format0, <4 x i32> %tmp0, i32 208, i32 0, i32 0) #2 + call void @llvm.amdgcn.raw.buffer.store.format.v4f32(<4 x float> %raw_buffer_format1, <4 x i32> %tmp0, i32 104, i32 104, i32 0) #2 + call void @llvm.amdgcn.raw.buffer.store.format.v4f32(<4 x float> %raw_buffer_format2, <4 x i32> %tmp0, i32 0, i32 208, i32 0) #2 + call void @llvm.amdgcn.raw.buffer.store.format.v4f32(<4 x float> %raw_buffer_format3, <4 x i32> %tmp0, i32 %arg1, i32 208, i32 0) #2 + call void @llvm.amdgcn.raw.buffer.store.format.v4f32(<4 x float> %raw_buffer_format4, <4 x i32> %tmp0, i32 208, i32 %arg1, i32 0) #2 + + call void asm sideeffect "", "" () + + ; rsrc, vindex, offset, soffset, cachepolicy + %struct_buffer0 = call <4 x float> @llvm.amdgcn.struct.buffer.load.v4f32(<4 x i32> %tmp0, i32 0, i32 224, i32 0, i32 0) #0 + %struct_buffer1 = call <4 x float> @llvm.amdgcn.struct.buffer.load.v4f32(<4 x i32> %tmp0, i32 0, i32 112, i32 112, i32 0) #0 + %struct_buffer2 = call <4 x float> @llvm.amdgcn.struct.buffer.load.v4f32(<4 x i32> %tmp0, i32 0, i32 0, i32 224, i32 0) #0 + %struct_buffer3 = call <4 x float> @llvm.amdgcn.struct.buffer.load.v4f32(<4 x 
i32> %tmp0, i32 0, i32 %arg1, i32 224, i32 0) #0 + %struct_buffer4 = call <4 x float> @llvm.amdgcn.struct.buffer.load.v4f32(<4 x i32> %tmp0, i32 0, i32 224, i32 %arg1, i32 0) #0 + %struct_buffer5 = call <4 x float> @llvm.amdgcn.struct.buffer.load.v4f32(<4 x i32> %tmp0, i32 1, i32 224, i32 0, i32 0) #0 + %struct_buffer6 = call <4 x float> @llvm.amdgcn.struct.buffer.load.v4f32(<4 x i32> %tmp0, i32 %arg1, i32 224, i32 0, i32 0) #0 + + call void asm sideeffect "", "" () + + %struct_buffer_format0 = call <4 x float> @llvm.amdgcn.struct.buffer.load.format.v4f32(<4 x i32> %tmp0, i32 0, i32 240, i32 0, i32 0) #0 + %struct_buffer_format1 = call <4 x float> @llvm.amdgcn.struct.buffer.load.format.v4f32(<4 x i32> %tmp0, i32 0, i32 120, i32 120, i32 0) #0 + %struct_buffer_format2 = call <4 x float> @llvm.amdgcn.struct.buffer.load.format.v4f32(<4 x i32> %tmp0, i32 0, i32 0, i32 240, i32 0) #0 + %struct_buffer_format3 = call <4 x float> @llvm.amdgcn.struct.buffer.load.format.v4f32(<4 x i32> %tmp0, i32 0, i32 %arg1, i32 240, i32 0) #0 + %struct_buffer_format4 = call <4 x float> @llvm.amdgcn.struct.buffer.load.format.v4f32(<4 x i32> %tmp0, i32 0, i32 240, i32 %arg1, i32 0) #0 + %struct_buffer_format5 = call <4 x float> @llvm.amdgcn.struct.buffer.load.format.v4f32(<4 x i32> %tmp0, i32 1, i32 240, i32 0, i32 0) #0 + %struct_buffer_format6 = call <4 x float> @llvm.amdgcn.struct.buffer.load.format.v4f32(<4 x i32> %tmp0, i32 %arg1, i32 240, i32 0, i32 0) #0 + + call void asm sideeffect "", "" () + + %struct_atomic_add0 = call i32 @llvm.amdgcn.struct.buffer.atomic.add.i32(i32 %arg1, <4 x i32> %tmp0, i32 0, i32 256, i32 0, i32 0) #2 + %struct_atomic_add1 = call i32 @llvm.amdgcn.struct.buffer.atomic.add.i32(i32 %arg1, <4 x i32> %tmp0, i32 0, i32 128, i32 128, i32 0) #2 + %struct_atomic_add2 = call i32 @llvm.amdgcn.struct.buffer.atomic.add.i32(i32 %arg1, <4 x i32> %tmp0, i32 0, i32 0, i32 256, i32 0) #2 + %struct_atomic_add3 = call i32 @llvm.amdgcn.struct.buffer.atomic.add.i32(i32 %arg1, <4 
x i32> %tmp0, i32 0, i32 %arg1, i32 256, i32 0) #2 + %struct_atomic_add4 = call i32 @llvm.amdgcn.struct.buffer.atomic.add.i32(i32 %arg1, <4 x i32> %tmp0, i32 0, i32 256, i32 %arg1, i32 0) #2 + %struct_atomic_add5 = call i32 @llvm.amdgcn.struct.buffer.atomic.add.i32(i32 %arg1, <4 x i32> %tmp0, i32 1, i32 256, i32 0, i32 0) #2 + %struct_atomic_add6 = call i32 @llvm.amdgcn.struct.buffer.atomic.add.i32(i32 %arg1, <4 x i32> %tmp0, i32 %arg1, i32 256, i32 0, i32 0) #2 + + call void asm sideeffect "", "" () + + %struct_atomic_cmpswap0 = call i32 @llvm.amdgcn.struct.buffer.atomic.cmpswap.i32(i32 %arg1, i32 %arg1, <4 x i32> %tmp0, i32 0, i32 272, i32 0, i32 0) #2 + %struct_atomic_cmpswap1 = call i32 @llvm.amdgcn.struct.buffer.atomic.cmpswap.i32(i32 %arg1, i32 %arg1, <4 x i32> %tmp0, i32 0, i32 136, i32 136, i32 0) #2 + %struct_atomic_cmpswap2 = call i32 @llvm.amdgcn.struct.buffer.atomic.cmpswap.i32(i32 %arg1, i32 %arg1, <4 x i32> %tmp0, i32 0, i32 0, i32 272, i32 0) #2 + %struct_atomic_cmpswap3 = call i32 @llvm.amdgcn.struct.buffer.atomic.cmpswap.i32(i32 %arg1, i32 %arg1, <4 x i32> %tmp0, i32 0, i32 %arg1, i32 272, i32 0) #2 + %struct_atomic_cmpswap4 = call i32 @llvm.amdgcn.struct.buffer.atomic.cmpswap.i32(i32 %arg1, i32 %arg1, <4 x i32> %tmp0, i32 0, i32 272, i32 %arg1, i32 0) #2 + %struct_atomic_cmpswap5 = call i32 @llvm.amdgcn.struct.buffer.atomic.cmpswap.i32(i32 %arg1, i32 %arg1, <4 x i32> %tmp0, i32 1, i32 272, i32 0, i32 0) #2 + %struct_atomic_cmpswap6 = call i32 @llvm.amdgcn.struct.buffer.atomic.cmpswap.i32(i32 %arg1, i32 %arg1, <4 x i32> %tmp0, i32 %arg1, i32 272, i32 0, i32 0) #2 + + call void asm sideeffect "", "" () + + call void @llvm.amdgcn.struct.buffer.store.v4f32(<4 x float> %struct_buffer0, <4 x i32> %tmp0, i32 0, i32 288, i32 0, i32 0) #2 + call void @llvm.amdgcn.struct.buffer.store.v4f32(<4 x float> %struct_buffer1, <4 x i32> %tmp0, i32 0, i32 144, i32 144, i32 0) #2 + call void @llvm.amdgcn.struct.buffer.store.v4f32(<4 x float> %struct_buffer2, <4 x i32> 
%tmp0, i32 0, i32 0, i32 288, i32 0) #2 + call void @llvm.amdgcn.struct.buffer.store.v4f32(<4 x float> %struct_buffer3, <4 x i32> %tmp0, i32 0, i32 %arg1, i32 288, i32 0) #2 + call void @llvm.amdgcn.struct.buffer.store.v4f32(<4 x float> %struct_buffer4, <4 x i32> %tmp0, i32 0, i32 288, i32 %arg1, i32 0) #2 + call void @llvm.amdgcn.struct.buffer.store.v4f32(<4 x float> %struct_buffer5, <4 x i32> %tmp0, i32 1, i32 288, i32 0, i32 0) #2 + call void @llvm.amdgcn.struct.buffer.store.v4f32(<4 x float> %struct_buffer6, <4 x i32> %tmp0, i32 %arg1, i32 288, i32 0, i32 0) #2 + + call void asm sideeffect "", "" () + + call void @llvm.amdgcn.struct.buffer.store.format.v4f32(<4 x float> %struct_buffer_format0, <4 x i32> %tmp0, i32 0, i32 304, i32 0, i32 0) #2 + call void @llvm.amdgcn.struct.buffer.store.format.v4f32(<4 x float> %struct_buffer_format1, <4 x i32> %tmp0, i32 0, i32 152, i32 152, i32 0) #2 + call void @llvm.amdgcn.struct.buffer.store.format.v4f32(<4 x float> %struct_buffer_format2, <4 x i32> %tmp0, i32 0, i32 0, i32 304, i32 0) #2 + call void @llvm.amdgcn.struct.buffer.store.format.v4f32(<4 x float> %struct_buffer_format3, <4 x i32> %tmp0, i32 0, i32 %arg1, i32 304, i32 0) #2 + call void @llvm.amdgcn.struct.buffer.store.format.v4f32(<4 x float> %struct_buffer_format4, <4 x i32> %tmp0, i32 0, i32 304, i32 %arg1, i32 0) #2 + call void @llvm.amdgcn.struct.buffer.store.format.v4f32(<4 x float> %struct_buffer_format5, <4 x i32> %tmp0, i32 1, i32 304, i32 0, i32 0) #2 + call void @llvm.amdgcn.struct.buffer.store.format.v4f32(<4 x float> %struct_buffer_format6, <4 x i32> %tmp0, i32 %arg1, i32 304, i32 0, i32 0) #2 + + ret void +} + +declare <4 x float> @llvm.amdgcn.buffer.load.v4f32(<4 x i32>, i32, i32, i1, i1) #0 +declare void @llvm.amdgcn.buffer.store.v4f32(<4 x float>, <4 x i32>, i32, i32, i1, i1) #1 +declare <4 x float> @llvm.amdgcn.buffer.load.format.v4f32(<4 x i32>, i32, i32, i1, i1) #0 +declare void @llvm.amdgcn.buffer.store.format.v4f32(<4 x float>, <4 x i32>, 
i32, i32, i1, i1) #1 +declare i32 @llvm.amdgcn.buffer.atomic.add.i32(i32, <4 x i32>, i32, i32, i1) #2 +declare i32 @llvm.amdgcn.buffer.atomic.cmpswap(i32, i32, <4 x i32>, i32, i32, i1) #2 +declare void @llvm.amdgcn.buffer.atomic.fadd.f32(float, <4 x i32>, i32, i32, i1) #2 +declare <4 x float> @llvm.amdgcn.raw.buffer.load.v4f32(<4 x i32>, i32, i32, i32) #0 +declare <4 x float> @llvm.amdgcn.raw.buffer.load.format.v4f32(<4 x i32>, i32, i32, i32) #0 +declare i32 @llvm.amdgcn.raw.buffer.atomic.add.i32(i32, <4 x i32>, i32, i32, i32) #2 +declare i32 @llvm.amdgcn.raw.buffer.atomic.cmpswap.i32(i32, i32, <4 x i32>, i32, i32, i32) #2 +declare void @llvm.amdgcn.raw.buffer.store.v4f32(<4 x float>, <4 x i32>, i32, i32, i32) #2 +declare void @llvm.amdgcn.raw.buffer.store.format.v4f32(<4 x float>, <4 x i32>, i32, i32, i32) #2 + +declare <4 x float> @llvm.amdgcn.struct.buffer.load.v4f32(<4 x i32>, i32, i32, i32, i32) #0 +declare <4 x float> @llvm.amdgcn.struct.buffer.load.format.v4f32(<4 x i32>, i32, i32, i32, i32) #0 +declare i32 @llvm.amdgcn.struct.buffer.atomic.add.i32(i32, <4 x i32>, i32, i32, i32, i32) #2 +declare i32 @llvm.amdgcn.struct.buffer.atomic.cmpswap.i32(i32, i32, <4 x i32>, i32, i32, i32, i32) #2 +declare void @llvm.amdgcn.struct.buffer.store.v4f32(<4 x float>, <4 x i32>, i32, i32, i32, i32) #2 +declare void @llvm.amdgcn.struct.buffer.store.format.v4f32(<4 x float>, <4 x i32>, i32, i32, i32, i32) #2 + +attributes #0 = { nounwind readonly } +attributes #1 = { nounwind writeonly } +attributes #2 = { nounwind } + +!0 = !{} From llvm-commits at lists.llvm.org Tue Oct 8 10:06:27 2019 From: llvm-commits at lists.llvm.org (Vitaly Buka via llvm-commits) Date: Tue, 08 Oct 2019 17:06:27 -0000 Subject: [compiler-rt] r374088 - [sanitizer] Disable crypt*.cpp tests on Android Message-ID: <20191008170627.B3FFA8F336@lists.llvm.org> Author: vitalybuka Date: Tue Oct 8 10:06:27 2019 New Revision: 374088 URL: http://llvm.org/viewvc/llvm-project?rev=374088&view=rev Log: [sanitizer] 
Disable crypt*.cpp tests on Android Modified: compiler-rt/trunk/test/sanitizer_common/TestCases/Linux/crypt_r.cpp compiler-rt/trunk/test/sanitizer_common/TestCases/Posix/crypt.cpp Modified: compiler-rt/trunk/test/sanitizer_common/TestCases/Linux/crypt_r.cpp URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/test/sanitizer_common/TestCases/Linux/crypt_r.cpp?rev=374088&r1=374087&r2=374088&view=diff ============================================================================== --- compiler-rt/trunk/test/sanitizer_common/TestCases/Linux/crypt_r.cpp (original) +++ compiler-rt/trunk/test/sanitizer_common/TestCases/Linux/crypt_r.cpp Tue Oct 8 10:06:27 2019 @@ -1,15 +1,14 @@ // RUN: %clangxx -O0 -g %s -lcrypt -o %t && %run %t +// crypt.h is missing from Android. +// UNSUPPORTED: android + #include #include #include #include -#include - -int -main (int argc, char** argv) -{ +int main(int argc, char **argv) { { crypt_data cd; cd.initialized = 0; Modified: compiler-rt/trunk/test/sanitizer_common/TestCases/Posix/crypt.cpp URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/test/sanitizer_common/TestCases/Posix/crypt.cpp?rev=374088&r1=374087&r2=374088&view=diff ============================================================================== --- compiler-rt/trunk/test/sanitizer_common/TestCases/Posix/crypt.cpp (original) +++ compiler-rt/trunk/test/sanitizer_common/TestCases/Posix/crypt.cpp Tue Oct 8 10:06:27 2019 @@ -1,5 +1,8 @@ // RUN: %clangxx -O0 -g %s -o %t -lcrypt && %run %t +// crypt is missing from Android. +// UNSUPPORTED: android + #include #include #include From llvm-commits at lists.llvm.org Tue Oct 8 10:06:50 2019 From: llvm-commits at lists.llvm.org (John McCall via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 17:06:50 +0000 (UTC) Subject: [PATCH] D62731: [RFC] Add support for options -frounding-math, -fp-model=, and -fp-exception-behavior=, : Specify floating point behavior In-Reply-To: References: Message-ID: rjmccall added a comment. 
Thank you, this looks very clean now. ================ Comment at: clang/docs/UsersManual.rst:1318 + mode informs the compiler that it must not assume any particular + rounding mode. + ---------------- "represent *the* corresponding IEEE rounding rules" ================ Comment at: clang/docs/UsersManual.rst:1330 + and ``fast``. + Details: + ---------------- "provided by other, single-purpose floating point options." ================ Comment at: clang/docs/UsersManual.rst:1341 + has been selected, then the compiler will issue a diagnostic warning + that the override has occurred. + ---------------- That's not typical driver behavior; why this choice? Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D62731/new/ https://reviews.llvm.org/D62731 From llvm-commits at lists.llvm.org Tue Oct 8 10:06:50 2019 From: llvm-commits at lists.llvm.org (Stanislav Mekhanoshin via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 17:06:50 +0000 (UTC) Subject: [PATCH] D68635: [AMDGPU] Come back patch for the 'Assign register class for cross block values according to the divergence.' In-Reply-To: References: Message-ID: <689bf4e103a91c315fba9b9dfe96738a@localhost.localdomain> rampitec added inline comments. ================ Comment at: llvm/lib/Target/AMDGPU/SIISelLowering.cpp:10903 + +static bool hasCFUser(const Value *V, SetVector &Visited) { + if (Visited.count(V)) ---------------- Why SetVector and not SmallPtrSet as usual? ================ Comment at: llvm/lib/Target/AMDGPU/SIISelLowering.cpp:10904 +static bool hasCFUser(const Value *V, SetVector &Visited) { + if (Visited.count(V)) + return false; ---------------- You can do if (!Visited.insert(V)) return false; ================ Comment at: llvm/lib/Target/AMDGPU/SIInstrInfo.cpp:3978 // No VOP2 instructions support AGPRs. - if (Src0.isReg() && RI.isAGPR(MRI, Src0.getReg())) + if (Src0.isReg() && RI.hasAGPRs(Register::isVirtualRegister(Src0.getReg()) + ? 
MRI.getRegClass(Src0.getReg()) ---------------- What's wrong with isAGPR() call? CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68635/new/ https://reviews.llvm.org/D68635 From llvm-commits at lists.llvm.org Tue Oct 8 10:06:51 2019 From: llvm-commits at lists.llvm.org (Paul Robinson via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 17:06:51 +0000 (UTC) Subject: [PATCH] D68620: DebugInfo: Use base address selection entries for debug_loc In-Reply-To: References: Message-ID: <020dbca496a5ac050b71a86ba9f98031@localhost.localdomain> probinson added a comment. BTW the BinaryFormat part LGTM and can go in on its own if you like. Should have been done that way in the first place. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68620/new/ https://reviews.llvm.org/D68620 From llvm-commits at lists.llvm.org Tue Oct 8 10:06:51 2019 From: llvm-commits at lists.llvm.org (Hideto Ueno via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 17:06:51 +0000 (UTC) Subject: [PATCH] D66645: [Attributor] Add helper class to compose two structured deduction. In-Reply-To: References: Message-ID: <6b6671e2115a151fc2418150260c29c3@localhost.localdomain> uenoku added a comment. In D66645#1699907, @thakis wrote: > Looks like this fails on Windows: http://45.33.8.238/win/74/step_8.txt Thank you for teaching me. A temporary fix is done in rL374086. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D66645/new/ https://reviews.llvm.org/D66645 From llvm-commits at lists.llvm.org Tue Oct 8 10:06:51 2019 From: llvm-commits at lists.llvm.org (John McCall via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 17:06:51 +0000 (UTC) Subject: [PATCH] D68611: [IRGen] Emit lifetime markers for temporary struct allocas In-Reply-To: References: Message-ID: <0c86984bb921deb3c4fd108653981dc8@localhost.localdomain> rjmccall added inline comments. 
================ Comment at: clang/lib/CodeGen/CGCall.cpp:4014 + if (LifetimeSize) // In case we disabled lifetime markers. + CallLifetimeEndAfterCall.emplace_back(AI, LifetimeSize); + ---------------- Please push this immediately after emitting the lifetime start. That'll just make it more obvious that the `copyInto` call doesn't affect this. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68611/new/ https://reviews.llvm.org/D68611 From llvm-commits at lists.llvm.org Tue Oct 8 10:07:18 2019 From: llvm-commits at lists.llvm.org (Matt Arsenault via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 17:07:18 +0000 (UTC) Subject: [PATCH] D68600: AMDGPU/GlobalISel: Fix crash on wide constant load with VGPR pointer In-Reply-To: References: Message-ID: <236e26efcb7a91647f1a5b3b97e0bd6f@localhost.localdomain> arsenm updated this revision to Diff 223895. arsenm added a comment. Fix wrong test CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68600/new/ https://reviews.llvm.org/D68600 Files: lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp test/CodeGen/AMDGPU/GlobalISel/regbankselect-load.mir From llvm-commits at lists.llvm.org Tue Oct 8 10:07:21 2019 From: llvm-commits at lists.llvm.org (Tom Stellard via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 17:07:21 +0000 (UTC) Subject: [PATCH] D65097: AMDGPU: Add offsets to MMO when lowering buffer intrinsics In-Reply-To: References: Message-ID: <3dd04e7a16e8fde50ce5e867e0b482db@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rG3a8d80944b77: AMDGPU: Add offsets to MMO when lowering buffer intrinsics (authored by tstellar). 
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D65097/new/ https://reviews.llvm.org/D65097 Files: llvm/lib/Target/AMDGPU/SIISelLowering.cpp llvm/lib/Target/AMDGPU/SIISelLowering.h llvm/test/CodeGen/AMDGPU/buffer-intrinsics-mmo-offsets.ll From llvm-commits at lists.llvm.org Tue Oct 8 10:08:12 2019 From: llvm-commits at lists.llvm.org (Paul Robinson via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 17:08:12 +0000 (UTC) Subject: [PATCH] D68620: DebugInfo: Use base address selection entries for debug_loc In-Reply-To: References: Message-ID: <7ddc14f3a9c64e5ce5ed6e3ae02c3762@localhost.localdomain> probinson requested changes to this revision. probinson added a comment. This revision now requires changes to proceed. For some reason a previous comment caused it to set Accept, and the only way I know to undo it is to set Request Changes. Sorry about that. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68620/new/ https://reviews.llvm.org/D68620 From llvm-commits at lists.llvm.org Tue Oct 8 10:09:04 2019 From: llvm-commits at lists.llvm.org (Christudasan Devadasan via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 17:09:04 +0000 (UTC) Subject: [PATCH] D68092: [AMDGPU] Invert the handling of skip insertion. In-Reply-To: References: Message-ID: cdevadas marked 3 inline comments as done. cdevadas added inline comments. ================ Comment at: lib/Target/AMDGPU/SIRemoveShortExecBranches.cpp:112 + + TII->analyzeBranch(SrcMBB, TrueMBB, FalseMBB, Cond); + if (!FalseMBB) ---------------- nhaehnle wrote: > analyzeBranch's return value must be checked. Sure. Will add that. 
================ Comment at: lib/Target/AMDGPU/SIRemoveShortExecBranches.cpp:116-117 + + if (MDT->dominates(TrueMBB, &SrcMBB) || + mustRetainExeczBranch(*FalseMBB, *TrueMBB)) + return false; ---------------- nhaehnle wrote: > What's the logic here behind using domination as a criterion? There could be a situation in which execnz (inserted during SI_LOOP lowering) can be inverted to execz by an optimization (for instance, BranchFolding). This execz should always be retained. This special check is added to handle it. Unfortunately, I couldn't write/find a test-case to reproduce it. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68092/new/ https://reviews.llvm.org/D68092 From llvm-commits at lists.llvm.org Tue Oct 8 10:17:51 2019 From: llvm-commits at lists.llvm.org (Vedant Kumar via llvm-commits) Date: Tue, 08 Oct 2019 17:17:51 -0000 Subject: [llvm] r374089 - [CodeExtractor] Factor out and reuse shrinkwrap analysis Message-ID: <20191008171751.CE1538F5CB@lists.llvm.org> Author: vedantk Date: Tue Oct 8 10:17:51 2019 New Revision: 374089 URL: http://llvm.org/viewvc/llvm-project?rev=374089&view=rev Log: [CodeExtractor] Factor out and reuse shrinkwrap analysis Factor out CodeExtractor's analysis of allocas (for shrinkwrapping purposes), and allow the analysis to be reused. This resolves a quadratic compile-time bug observed when compiling AMDGPUDisassembler.cpp.o. Pre-patch (Release + LTO clang): ``` ---User Time--- --System Time-- --User+System-- ---Wall Time--- --- Name --- 176.5278 ( 57.8%) 0.4915 ( 18.5%) 177.0192 ( 57.4%) 177.4112 ( 57.3%) Hot Cold Splitting ``` Post-patch (ReleaseAsserts clang): ``` ---User Time--- --System Time-- --User+System-- ---Wall Time--- --- Name --- 1.4051 ( 3.3%) 0.0079 ( 0.3%) 1.4129 ( 3.2%) 1.4129 ( 3.2%) Hot Cold Splitting ``` Testing: check-llvm, and comparing the AMDGPUDisassembler.cpp.o binary pre- vs. post-patch. 
An alternate approach is to hide CodeExtractorAnalysisCache from clients of CodeExtractor, and to recompute the analysis from scratch inside of CodeExtractor::extractCodeRegion(). This eliminates some redundant work in the shrinkwrapping legality check. However, some clients continue to exhibit O(n^2) compile time behavior as computing the analysis is O(n). rdar://55912966 Differential Revision: https://reviews.llvm.org/D68616 Modified: llvm/trunk/include/llvm/Transforms/IPO/HotColdSplitting.h llvm/trunk/include/llvm/Transforms/Utils/CodeExtractor.h llvm/trunk/lib/Transforms/IPO/BlockExtractor.cpp llvm/trunk/lib/Transforms/IPO/HotColdSplitting.cpp llvm/trunk/lib/Transforms/IPO/LoopExtractor.cpp llvm/trunk/lib/Transforms/IPO/PartialInlining.cpp llvm/trunk/lib/Transforms/Utils/CodeExtractor.cpp llvm/trunk/unittests/Transforms/Utils/CodeExtractorTest.cpp Modified: llvm/trunk/include/llvm/Transforms/IPO/HotColdSplitting.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Transforms/IPO/HotColdSplitting.h?rev=374089&r1=374088&r2=374089&view=diff ============================================================================== --- llvm/trunk/include/llvm/Transforms/IPO/HotColdSplitting.h (original) +++ llvm/trunk/include/llvm/Transforms/IPO/HotColdSplitting.h Tue Oct 8 10:17:51 2019 @@ -23,6 +23,7 @@ class TargetTransformInfo; class OptimizationRemarkEmitter; class AssumptionCache; class DominatorTree; +class CodeExtractorAnalysisCache; /// A sequence of basic blocks. 
/// @@ -43,8 +44,10 @@ private: bool isFunctionCold(const Function &F) const; bool shouldOutlineFrom(const Function &F) const; bool outlineColdRegions(Function &F, bool HasProfileSummary); - Function *extractColdRegion(const BlockSequence &Region, DominatorTree &DT, - BlockFrequencyInfo *BFI, TargetTransformInfo &TTI, + Function *extractColdRegion(const BlockSequence &Region, + const CodeExtractorAnalysisCache &CEAC, + DominatorTree &DT, BlockFrequencyInfo *BFI, + TargetTransformInfo &TTI, OptimizationRemarkEmitter &ORE, AssumptionCache *AC, unsigned Count); ProfileSummaryInfo *PSI; Modified: llvm/trunk/include/llvm/Transforms/Utils/CodeExtractor.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Transforms/Utils/CodeExtractor.h?rev=374089&r1=374088&r2=374089&view=diff ============================================================================== --- llvm/trunk/include/llvm/Transforms/Utils/CodeExtractor.h (original) +++ llvm/trunk/include/llvm/Transforms/Utils/CodeExtractor.h Tue Oct 8 10:17:51 2019 @@ -22,6 +22,7 @@ namespace llvm { +class AllocaInst; class BasicBlock; class BlockFrequency; class BlockFrequencyInfo; @@ -36,6 +37,38 @@ class Module; class Type; class Value; +/// A cache for the CodeExtractor analysis. The operation \ref +/// CodeExtractor::extractCodeRegion is guaranteed not to invalidate this +/// object. This object should conservatively be considered invalid if any +/// other mutating operations on the IR occur. +/// +/// Constructing this object is O(n) in the size of the function. +class CodeExtractorAnalysisCache { + /// The allocas in the function. + SmallVector Allocas; + + /// Base memory addresses of load/store instructions, grouped by block. + DenseMap> BaseMemAddrs; + + /// Blocks which contain instructions which may have unknown side-effects + /// on memory. 
+ DenseSet SideEffectingBlocks; + + void findSideEffectInfoForBlock(BasicBlock &BB); + +public: + CodeExtractorAnalysisCache(Function &F); + + /// Get the allocas in the function at the time the analysis was created. + /// Note that some of these allocas may no longer be present in the function, + /// due to \ref CodeExtractor::extractCodeRegion. + ArrayRef getAllocas() const { return Allocas; } + + /// Check whether \p BB contains an instruction thought to load from, store + /// to, or otherwise clobber the alloca \p Addr. + bool doesBlockContainClobberOfAddr(BasicBlock &BB, AllocaInst *Addr) const; +}; + /// Utility class for extracting code into a new function. /// /// This utility provides a simple interface for extracting some sequence of @@ -104,7 +137,7 @@ class Value; /// /// Returns zero when called on a CodeExtractor instance where isEligible /// returns false. - Function *extractCodeRegion(); + Function *extractCodeRegion(const CodeExtractorAnalysisCache &CEAC); /// Verify that assumption cache isn't stale after a region is extracted. /// Returns false when verifier finds errors. AssumptionCache is passed as @@ -135,7 +168,9 @@ class Value; /// region. /// /// Returns true if it is safe to do the code motion. - bool isLegalToShrinkwrapLifetimeMarkers(Instruction *AllocaAddr) const; + bool + isLegalToShrinkwrapLifetimeMarkers(const CodeExtractorAnalysisCache &CEAC, + Instruction *AllocaAddr) const; /// Find the set of allocas whose life ranges are contained within the /// outlined region. @@ -145,7 +180,8 @@ class Value; /// are used by the lifetime markers are also candidates for shrink- /// wrapping. The instructions that need to be sunk are collected in /// 'Allocas'. 
- void findAllocas(ValueSet &SinkCands, ValueSet &HoistCands, + void findAllocas(const CodeExtractorAnalysisCache &CEAC, + ValueSet &SinkCands, ValueSet &HoistCands, BasicBlock *&ExitBlock) const; /// Find or create a block within the outline region for placing hoisted @@ -166,8 +202,9 @@ class Value; Instruction *LifeEnd = nullptr; }; - LifetimeMarkerInfo getLifetimeMarkers(Instruction *Addr, - BasicBlock *ExitBlock) const; + LifetimeMarkerInfo + getLifetimeMarkers(const CodeExtractorAnalysisCache &CEAC, + Instruction *Addr, BasicBlock *ExitBlock) const; void severSplitPHINodesOfEntry(BasicBlock *&Header); void severSplitPHINodesOfExits(const SmallPtrSetImpl &Exits); Modified: llvm/trunk/lib/Transforms/IPO/BlockExtractor.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/IPO/BlockExtractor.cpp?rev=374089&r1=374088&r2=374089&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/IPO/BlockExtractor.cpp (original) +++ llvm/trunk/lib/Transforms/IPO/BlockExtractor.cpp Tue Oct 8 10:17:51 2019 @@ -206,7 +206,8 @@ bool BlockExtractor::runOnModule(Module ++NumExtracted; Changed = true; } - Function *F = CodeExtractor(BlocksToExtractVec).extractCodeRegion(); + CodeExtractorAnalysisCache CEAC(*BBs[0]->getParent()); + Function *F = CodeExtractor(BlocksToExtractVec).extractCodeRegion(CEAC); if (F) LLVM_DEBUG(dbgs() << "Extracted group '" << (*BBs.begin())->getName() << "' in: " << F->getName() << '\n'); Modified: llvm/trunk/lib/Transforms/IPO/HotColdSplitting.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/IPO/HotColdSplitting.cpp?rev=374089&r1=374088&r2=374089&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/IPO/HotColdSplitting.cpp (original) +++ llvm/trunk/lib/Transforms/IPO/HotColdSplitting.cpp Tue Oct 8 10:17:51 2019 @@ -290,13 +290,10 @@ static int getOutliningPenalty(ArrayRef< return Penalty; } 
-Function *HotColdSplitting::extractColdRegion(const BlockSequence &Region, - DominatorTree &DT, - BlockFrequencyInfo *BFI, - TargetTransformInfo &TTI, - OptimizationRemarkEmitter &ORE, - AssumptionCache *AC, - unsigned Count) { +Function *HotColdSplitting::extractColdRegion( + const BlockSequence &Region, const CodeExtractorAnalysisCache &CEAC, + DominatorTree &DT, BlockFrequencyInfo *BFI, TargetTransformInfo &TTI, + OptimizationRemarkEmitter &ORE, AssumptionCache *AC, unsigned Count) { assert(!Region.empty()); // TODO: Pass BFI and BPI to update profile information. @@ -318,7 +315,7 @@ Function *HotColdSplitting::extractColdR return nullptr; Function *OrigF = Region[0]->getParent(); - if (Function *OutF = CE.extractCodeRegion()) { + if (Function *OutF = CE.extractCodeRegion(CEAC)) { User *U = *OutF->user_begin(); CallInst *CI = cast(U); CallSite CS(CI); @@ -606,9 +603,14 @@ bool HotColdSplitting::outlineColdRegion } } + if (OutliningWorklist.empty()) + return Changed; + // Outline single-entry cold regions, splitting up larger regions as needed. unsigned OutlinedFunctionID = 1; - while (!OutliningWorklist.empty()) { + // Cache and recycle the CodeExtractor analysis to avoid O(n^2) compile-time. 
+ CodeExtractorAnalysisCache CEAC(F); + do { OutliningRegion Region = OutliningWorklist.pop_back_val(); assert(!Region.empty() && "Empty outlining region in worklist"); do { @@ -619,14 +621,14 @@ bool HotColdSplitting::outlineColdRegion BB->dump(); }); - Function *Outlined = extractColdRegion(SubRegion, *DT, BFI, TTI, ORE, AC, - OutlinedFunctionID); + Function *Outlined = extractColdRegion(SubRegion, CEAC, *DT, BFI, TTI, + ORE, AC, OutlinedFunctionID); if (Outlined) { ++OutlinedFunctionID; Changed = true; } } while (!Region.empty()); - } + } while (!OutliningWorklist.empty()); return Changed; } Modified: llvm/trunk/lib/Transforms/IPO/LoopExtractor.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/IPO/LoopExtractor.cpp?rev=374089&r1=374088&r2=374089&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/IPO/LoopExtractor.cpp (original) +++ llvm/trunk/lib/Transforms/IPO/LoopExtractor.cpp Tue Oct 8 10:17:51 2019 @@ -141,10 +141,12 @@ bool LoopExtractor::runOnLoop(Loop *L, L if (NumLoops == 0) return Changed; --NumLoops; AssumptionCache *AC = nullptr; + Function &Func = *L->getHeader()->getParent(); if (auto *ACT = getAnalysisIfAvailable()) - AC = ACT->lookupAssumptionCache(*L->getHeader()->getParent()); + AC = ACT->lookupAssumptionCache(Func); + CodeExtractorAnalysisCache CEAC(Func); CodeExtractor Extractor(DT, *L, false, nullptr, nullptr, AC); - if (Extractor.extractCodeRegion() != nullptr) { + if (Extractor.extractCodeRegion(CEAC) != nullptr) { Changed = true; // After extraction, the loop is replaced by a function call, so // we shouldn't try to run any more loop passes on it. 
Modified: llvm/trunk/lib/Transforms/IPO/PartialInlining.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/IPO/PartialInlining.cpp?rev=374089&r1=374088&r2=374089&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/IPO/PartialInlining.cpp (original) +++ llvm/trunk/lib/Transforms/IPO/PartialInlining.cpp Tue Oct 8 10:17:51 2019 @@ -1122,6 +1122,9 @@ bool PartialInlinerImpl::FunctionCloner: BranchProbabilityInfo BPI(*ClonedFunc, LI); ClonedFuncBFI.reset(new BlockFrequencyInfo(*ClonedFunc, BPI, LI)); + // Cache and recycle the CodeExtractor analysis to avoid O(n^2) compile-time. + CodeExtractorAnalysisCache CEAC(*ClonedFunc); + SetVector Inputs, Outputs, Sinks; for (FunctionOutliningMultiRegionInfo::OutlineRegionInfo RegionInfo : ClonedOMRI->ORI) { @@ -1148,7 +1151,7 @@ bool PartialInlinerImpl::FunctionCloner: if (Outputs.size() > 0 && !ForceLiveExit) continue; - Function *OutlinedFunc = CE.extractCodeRegion(); + Function *OutlinedFunc = CE.extractCodeRegion(CEAC); if (OutlinedFunc) { CallSite OCS = PartialInlinerImpl::getOneCallSiteTo(OutlinedFunc); @@ -1210,11 +1213,12 @@ PartialInlinerImpl::FunctionCloner::doSi } // Extract the body of the if. 
+ CodeExtractorAnalysisCache CEAC(*ClonedFunc); Function *OutlinedFunc = CodeExtractor(ToExtract, &DT, /*AggregateArgs*/ false, ClonedFuncBFI.get(), &BPI, LookupAC(*ClonedFunc), /* AllowVarargs */ true) - .extractCodeRegion(); + .extractCodeRegion(CEAC); if (OutlinedFunc) { BasicBlock *OutliningCallBB = Modified: llvm/trunk/lib/Transforms/Utils/CodeExtractor.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Utils/CodeExtractor.cpp?rev=374089&r1=374088&r2=374089&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Utils/CodeExtractor.cpp (original) +++ llvm/trunk/lib/Transforms/Utils/CodeExtractor.cpp Tue Oct 8 10:17:51 2019 @@ -305,52 +305,79 @@ static BasicBlock *getCommonExitBlock(co return CommonExitBlock; } -bool CodeExtractor::isLegalToShrinkwrapLifetimeMarkers( - Instruction *Addr) const { - AllocaInst *AI = cast(Addr->stripInBoundsConstantOffsets()); - Function *Func = (*Blocks.begin())->getParent(); - for (BasicBlock &BB : *Func) { - if (Blocks.count(&BB)) - continue; - for (Instruction &II : BB) { - if (isa(II)) - continue; +CodeExtractorAnalysisCache::CodeExtractorAnalysisCache(Function &F) { + for (BasicBlock &BB : F) { + for (Instruction &II : BB.instructionsWithoutDebug()) + if (auto *AI = dyn_cast(&II)) + Allocas.push_back(AI); - unsigned Opcode = II.getOpcode(); - Value *MemAddr = nullptr; - switch (Opcode) { - case Instruction::Store: - case Instruction::Load: { - if (Opcode == Instruction::Store) { - StoreInst *SI = cast(&II); - MemAddr = SI->getPointerOperand(); - } else { - LoadInst *LI = cast(&II); - MemAddr = LI->getPointerOperand(); - } - // Global variable can not be aliased with locals. 
- if (dyn_cast(MemAddr)) - break; - Value *Base = MemAddr->stripInBoundsConstantOffsets(); - if (!isa(Base) || Base == AI) - return false; + findSideEffectInfoForBlock(BB); + } +} + +void CodeExtractorAnalysisCache::findSideEffectInfoForBlock(BasicBlock &BB) { + for (Instruction &II : BB.instructionsWithoutDebug()) { + unsigned Opcode = II.getOpcode(); + Value *MemAddr = nullptr; + switch (Opcode) { + case Instruction::Store: + case Instruction::Load: { + if (Opcode == Instruction::Store) { + StoreInst *SI = cast(&II); + MemAddr = SI->getPointerOperand(); + } else { + LoadInst *LI = cast(&II); + MemAddr = LI->getPointerOperand(); + } + // Global variable can not be aliased with locals. + if (dyn_cast(MemAddr)) break; + Value *Base = MemAddr->stripInBoundsConstantOffsets(); + if (!isa(Base)) { + SideEffectingBlocks.insert(&BB); + return; } - default: { - IntrinsicInst *IntrInst = dyn_cast(&II); - if (IntrInst) { - if (IntrInst->isLifetimeStartOrEnd()) - break; - return false; - } - // Treat all the other cases conservatively if it has side effects. - if (II.mayHaveSideEffects()) - return false; + BaseMemAddrs[&BB].insert(Base); + break; + } + default: { + IntrinsicInst *IntrInst = dyn_cast(&II); + if (IntrInst) { + if (IntrInst->isLifetimeStartOrEnd()) + break; + SideEffectingBlocks.insert(&BB); + return; } + // Treat all the other cases conservatively if it has side effects. 
+ if (II.mayHaveSideEffects()) { + SideEffectingBlocks.insert(&BB); + return; } } + } } +} +bool CodeExtractorAnalysisCache::doesBlockContainClobberOfAddr( + BasicBlock &BB, AllocaInst *Addr) const { + if (SideEffectingBlocks.count(&BB)) + return true; + auto It = BaseMemAddrs.find(&BB); + if (It != BaseMemAddrs.end()) + return It->second.count(Addr); + return false; +} + +bool CodeExtractor::isLegalToShrinkwrapLifetimeMarkers( + const CodeExtractorAnalysisCache &CEAC, Instruction *Addr) const { + AllocaInst *AI = cast(Addr->stripInBoundsConstantOffsets()); + Function *Func = (*Blocks.begin())->getParent(); + for (BasicBlock &BB : *Func) { + if (Blocks.count(&BB)) + continue; + if (CEAC.doesBlockContainClobberOfAddr(BB, AI)) + return false; + } return true; } @@ -413,7 +440,8 @@ CodeExtractor::findOrCreateBlockForHoist // outline region. If there are not other untracked uses of the address, return // the pair of markers if found; otherwise return a pair of nullptr. CodeExtractor::LifetimeMarkerInfo -CodeExtractor::getLifetimeMarkers(Instruction *Addr, +CodeExtractor::getLifetimeMarkers(const CodeExtractorAnalysisCache &CEAC, + Instruction *Addr, BasicBlock *ExitBlock) const { LifetimeMarkerInfo Info; @@ -445,7 +473,7 @@ CodeExtractor::getLifetimeMarkers(Instru Info.HoistLifeEnd = !definedInRegion(Blocks, Info.LifeEnd); // Do legality check. if ((Info.SinkLifeStart || Info.HoistLifeEnd) && - !isLegalToShrinkwrapLifetimeMarkers(Addr)) + !isLegalToShrinkwrapLifetimeMarkers(CEAC, Addr)) return {}; // Check to see if we have a place to do hoisting, if not, bail. 
@@ -455,7 +483,8 @@ CodeExtractor::getLifetimeMarkers(Instru return Info; } -void CodeExtractor::findAllocas(ValueSet &SinkCands, ValueSet &HoistCands, +void CodeExtractor::findAllocas(const CodeExtractorAnalysisCache &CEAC, + ValueSet &SinkCands, ValueSet &HoistCands, BasicBlock *&ExitBlock) const { Function *Func = (*Blocks.begin())->getParent(); ExitBlock = getCommonExitBlock(Blocks); @@ -476,60 +505,64 @@ void CodeExtractor::findAllocas(ValueSet return true; }; - for (BasicBlock &BB : *Func) { - if (Blocks.count(&BB)) + // Look up allocas in the original function in CodeExtractorAnalysisCache, as + // this is much faster than walking all the instructions. + for (AllocaInst *AI : CEAC.getAllocas()) { + BasicBlock *BB = AI->getParent(); + if (Blocks.count(BB)) continue; - for (Instruction &II : BB) { - auto *AI = dyn_cast(&II); - if (!AI) - continue; - LifetimeMarkerInfo MarkerInfo = getLifetimeMarkers(AI, ExitBlock); - bool Moved = moveOrIgnoreLifetimeMarkers(MarkerInfo); - if (Moved) { - LLVM_DEBUG(dbgs() << "Sinking alloca: " << *AI << "\n"); - SinkCands.insert(AI); - continue; - } + // As a prior call to extractCodeRegion() may have shrinkwrapped the alloca, + // check whether it is actually still in the original function. + Function *AIFunc = BB->getParent(); + if (AIFunc != Func) + continue; - // Follow any bitcasts. - SmallVector Bitcasts; - SmallVector BitcastLifetimeInfo; - for (User *U : AI->users()) { - if (U->stripInBoundsConstantOffsets() == AI) { - Instruction *Bitcast = cast(U); - LifetimeMarkerInfo LMI = getLifetimeMarkers(Bitcast, ExitBlock); - if (LMI.LifeStart) { - Bitcasts.push_back(Bitcast); - BitcastLifetimeInfo.push_back(LMI); - continue; - } - } + LifetimeMarkerInfo MarkerInfo = getLifetimeMarkers(CEAC, AI, ExitBlock); + bool Moved = moveOrIgnoreLifetimeMarkers(MarkerInfo); + if (Moved) { + LLVM_DEBUG(dbgs() << "Sinking alloca: " << *AI << "\n"); + SinkCands.insert(AI); + continue; + } - // Found unknown use of AI. 
- if (!definedInRegion(Blocks, U)) { - Bitcasts.clear(); - break; + // Follow any bitcasts. + SmallVector Bitcasts; + SmallVector BitcastLifetimeInfo; + for (User *U : AI->users()) { + if (U->stripInBoundsConstantOffsets() == AI) { + Instruction *Bitcast = cast(U); + LifetimeMarkerInfo LMI = getLifetimeMarkers(CEAC, Bitcast, ExitBlock); + if (LMI.LifeStart) { + Bitcasts.push_back(Bitcast); + BitcastLifetimeInfo.push_back(LMI); + continue; } } - // Either no bitcasts reference the alloca or there are unknown uses. - if (Bitcasts.empty()) - continue; + // Found unknown use of AI. + if (!definedInRegion(Blocks, U)) { + Bitcasts.clear(); + break; + } + } - LLVM_DEBUG(dbgs() << "Sinking alloca (via bitcast): " << *AI << "\n"); - SinkCands.insert(AI); - for (unsigned I = 0, E = Bitcasts.size(); I != E; ++I) { - Instruction *BitcastAddr = Bitcasts[I]; - const LifetimeMarkerInfo &LMI = BitcastLifetimeInfo[I]; - assert(LMI.LifeStart && - "Unsafe to sink bitcast without lifetime markers"); - moveOrIgnoreLifetimeMarkers(LMI); - if (!definedInRegion(Blocks, BitcastAddr)) { - LLVM_DEBUG(dbgs() << "Sinking bitcast-of-alloca: " << *BitcastAddr - << "\n"); - SinkCands.insert(BitcastAddr); - } + // Either no bitcasts reference the alloca or there are unknown uses. 
+ if (Bitcasts.empty()) + continue; + + LLVM_DEBUG(dbgs() << "Sinking alloca (via bitcast): " << *AI << "\n"); + SinkCands.insert(AI); + for (unsigned I = 0, E = Bitcasts.size(); I != E; ++I) { + Instruction *BitcastAddr = Bitcasts[I]; + const LifetimeMarkerInfo &LMI = BitcastLifetimeInfo[I]; + assert(LMI.LifeStart && + "Unsafe to sink bitcast without lifetime markers"); + moveOrIgnoreLifetimeMarkers(LMI); + if (!definedInRegion(Blocks, BitcastAddr)) { + LLVM_DEBUG(dbgs() << "Sinking bitcast-of-alloca: " << *BitcastAddr + << "\n"); + SinkCands.insert(BitcastAddr); } } } @@ -1349,7 +1382,8 @@ void CodeExtractor::calculateNewCallTerm MDBuilder(TI->getContext()).createBranchWeights(BranchWeights)); } -Function *CodeExtractor::extractCodeRegion() { +Function * +CodeExtractor::extractCodeRegion(const CodeExtractorAnalysisCache &CEAC) { if (!isEligible()) return nullptr; @@ -1435,7 +1469,7 @@ Function *CodeExtractor::extractCodeRegi ValueSet inputs, outputs, SinkingCands, HoistingCands; BasicBlock *CommonExit = nullptr; - findAllocas(SinkingCands, HoistingCands, CommonExit); + findAllocas(CEAC, SinkingCands, HoistingCands, CommonExit); assert(HoistingCands.empty() || CommonExit); // Find inputs to, outputs from the code region. 
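As a rough illustration of the caching idea in the CodeExtractor.cpp change above, here is a minimal standalone sketch. It does not use LLVM at all: `Inst`, `Block`, `Func`, `AnalysisCache`, `blockClobbers` and this free-function `isLegalToShrinkwrap` are hypothetical stand-ins, and a negative `Base` is an assumed encoding for "memory access whose base is not a known alloca". The point is only the shape of the optimisation: scan the function once, then answer each per-alloca legality query with set lookups instead of re-walking every instruction.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Toy stand-ins for IR; these types are hypothetical, not LLVM's.
enum class Kind { MemAccess, SideEffect, Pure };

struct Inst {
  Kind K;
  int Base; // For MemAccess: id of the alloca accessed, or -1 if unknown.
};

using Block = std::vector<Inst>;
using Func = std::vector<Block>;

// Scans the function once; later queries are plain set lookups.
class AnalysisCache {
  std::vector<bool> SideEffecting;      // Block must be treated conservatively.
  std::vector<std::set<int>> BaseAddrs; // Allocas accessed in each block.

public:
  explicit AnalysisCache(const Func &F)
      : SideEffecting(F.size(), false), BaseAddrs(F.size()) {
    for (std::size_t B = 0; B < F.size(); ++B) {
      for (const Inst &I : F[B]) {
        if (I.K == Kind::SideEffect ||
            (I.K == Kind::MemAccess && I.Base < 0)) {
          SideEffecting[B] = true;
          break; // Nothing later in this block can change the answer.
        }
        if (I.K == Kind::MemAccess)
          BaseAddrs[B].insert(I.Base);
      }
    }
  }

  // True if block B may read or write the given alloca.
  bool blockClobbers(std::size_t B, int Alloca) const {
    assert(B < SideEffecting.size() && "block index out of range");
    return SideEffecting[B] || BaseAddrs[B].count(Alloca) != 0;
  }
};

// True if no block outside Region may touch Alloca.
bool isLegalToShrinkwrap(const AnalysisCache &C, std::size_t NumBlocks,
                         const std::set<std::size_t> &Region, int Alloca) {
  for (std::size_t B = 0; B < NumBlocks; ++B)
    if (Region.count(B) == 0 && C.blockClobbers(B, Alloca))
      return false;
  return true;
}
```

In this sketch a caller would build one `AnalysisCache` per function and pass it to every legality query for that function, which is roughly how the patch threads `CEAC` through `extractCodeRegion`.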
Modified: llvm/trunk/unittests/Transforms/Utils/CodeExtractorTest.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/unittests/Transforms/Utils/CodeExtractorTest.cpp?rev=374089&r1=374088&r2=374089&view=diff ============================================================================== --- llvm/trunk/unittests/Transforms/Utils/CodeExtractorTest.cpp (original) +++ llvm/trunk/unittests/Transforms/Utils/CodeExtractorTest.cpp Tue Oct 8 10:17:51 2019 @@ -62,7 +62,8 @@ TEST(CodeExtractor, ExitStub) { CodeExtractor CE(Candidates); EXPECT_TRUE(CE.isEligible()); - Function *Outlined = CE.extractCodeRegion(); + CodeExtractorAnalysisCache CEAC(*Func); + Function *Outlined = CE.extractCodeRegion(CEAC); EXPECT_TRUE(Outlined); BasicBlock *Exit = getBlockByName(Func, "notExtracted"); BasicBlock *ExitSplit = getBlockByName(Outlined, "notExtracted.split"); @@ -112,7 +113,8 @@ TEST(CodeExtractor, ExitPHIOnePredFromRe CodeExtractor CE(ExtractedBlocks); EXPECT_TRUE(CE.isEligible()); - Function *Outlined = CE.extractCodeRegion(); + CodeExtractorAnalysisCache CEAC(*Func); + Function *Outlined = CE.extractCodeRegion(CEAC); EXPECT_TRUE(Outlined); BasicBlock *Exit1 = getBlockByName(Func, "exit1"); BasicBlock *Exit2 = getBlockByName(Func, "exit2"); @@ -186,7 +188,8 @@ TEST(CodeExtractor, StoreOutputInvokeRes CodeExtractor CE(ExtractedBlocks); EXPECT_TRUE(CE.isEligible()); - Function *Outlined = CE.extractCodeRegion(); + CodeExtractorAnalysisCache CEAC(*Func); + Function *Outlined = CE.extractCodeRegion(CEAC); EXPECT_TRUE(Outlined); EXPECT_FALSE(verifyFunction(*Outlined, &errs())); EXPECT_FALSE(verifyFunction(*Func, &errs())); @@ -220,7 +223,8 @@ TEST(CodeExtractor, StoreOutputInvokeRes CodeExtractor CE(Blocks); EXPECT_TRUE(CE.isEligible()); - Function *Outlined = CE.extractCodeRegion(); + CodeExtractorAnalysisCache CEAC(*Func); + Function *Outlined = CE.extractCodeRegion(CEAC); EXPECT_TRUE(Outlined); EXPECT_FALSE(verifyFunction(*Outlined)); EXPECT_FALSE(verifyFunction(*Func)); @@ -271,7 
+275,8 @@ TEST(CodeExtractor, ExtractAndInvalidate CodeExtractor CE(Blocks, nullptr, false, nullptr, nullptr, &AC); EXPECT_TRUE(CE.isEligible()); - Function *Outlined = CE.extractCodeRegion(); + CodeExtractorAnalysisCache CEAC(*Func); + Function *Outlined = CE.extractCodeRegion(CEAC); EXPECT_TRUE(Outlined); EXPECT_FALSE(verifyFunction(*Outlined)); EXPECT_FALSE(verifyFunction(*Func)); From llvm-commits at lists.llvm.org Tue Oct 8 10:18:32 2019 From: llvm-commits at lists.llvm.org (Sanjay Patel via llvm-commits) Date: Tue, 08 Oct 2019 17:18:32 -0000 Subject: [llvm] r374090 - [SLP] add test with prefer-vector-width function attribute; NFC (PR43578) Message-ID: <20191008171832.2C1D980853@lists.llvm.org> Author: spatel Date: Tue Oct 8 10:18:32 2019 New Revision: 374090 URL: http://llvm.org/viewvc/llvm-project?rev=374090&view=rev Log: [SLP] add test with prefer-vector-width function attribute; NFC (PR43578) Modified: llvm/trunk/test/Transforms/SLPVectorizer/X86/load-merge.ll Modified: llvm/trunk/test/Transforms/SLPVectorizer/X86/load-merge.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/SLPVectorizer/X86/load-merge.ll?rev=374090&r1=374089&r2=374090&view=diff ============================================================================== --- llvm/trunk/test/Transforms/SLPVectorizer/X86/load-merge.ll (original) +++ llvm/trunk/test/Transforms/SLPVectorizer/X86/load-merge.ll Tue Oct 8 10:18:32 2019 @@ -142,3 +142,62 @@ define <4 x float> @PR16739_byval(<4 x f %t15 = insertelement <4 x float> %t14, float %t13, i32 3 ret <4 x float> %t15 } + +define void @PR43578_prefer128(i32* %r, i64* %p, i64* %q) #0 { +; CHECK-LABEL: @PR43578_prefer128( +; CHECK-NEXT: [[P0:%.*]] = getelementptr inbounds i64, i64* [[P:%.*]], i64 0 +; CHECK-NEXT: [[P1:%.*]] = getelementptr inbounds i64, i64* [[P]], i64 1 +; CHECK-NEXT: [[P2:%.*]] = getelementptr inbounds i64, i64* [[P]], i64 2 +; CHECK-NEXT: [[P3:%.*]] = getelementptr inbounds i64, i64* [[P]], i64 3 +; CHECK-NEXT: 
[[Q0:%.*]] = getelementptr inbounds i64, i64* [[Q:%.*]], i64 0 +; CHECK-NEXT: [[Q1:%.*]] = getelementptr inbounds i64, i64* [[Q]], i64 1 +; CHECK-NEXT: [[Q2:%.*]] = getelementptr inbounds i64, i64* [[Q]], i64 2 +; CHECK-NEXT: [[Q3:%.*]] = getelementptr inbounds i64, i64* [[Q]], i64 3 +; CHECK-NEXT: [[TMP1:%.*]] = bitcast i64* [[P0]] to <4 x i64>* +; CHECK-NEXT: [[TMP2:%.*]] = load <4 x i64>, <4 x i64>* [[TMP1]], align 2 +; CHECK-NEXT: [[TMP3:%.*]] = bitcast i64* [[Q0]] to <4 x i64>* +; CHECK-NEXT: [[TMP4:%.*]] = load <4 x i64>, <4 x i64>* [[TMP3]], align 2 +; CHECK-NEXT: [[TMP5:%.*]] = sub nsw <4 x i64> [[TMP2]], [[TMP4]] +; CHECK-NEXT: [[TMP6:%.*]] = extractelement <4 x i64> [[TMP5]], i32 0 +; CHECK-NEXT: [[G0:%.*]] = getelementptr inbounds i32, i32* [[R:%.*]], i64 [[TMP6]] +; CHECK-NEXT: [[TMP7:%.*]] = extractelement <4 x i64> [[TMP5]], i32 1 +; CHECK-NEXT: [[G1:%.*]] = getelementptr inbounds i32, i32* [[R]], i64 [[TMP7]] +; CHECK-NEXT: [[TMP8:%.*]] = extractelement <4 x i64> [[TMP5]], i32 2 +; CHECK-NEXT: [[G2:%.*]] = getelementptr inbounds i32, i32* [[R]], i64 [[TMP8]] +; CHECK-NEXT: [[TMP9:%.*]] = extractelement <4 x i64> [[TMP5]], i32 3 +; CHECK-NEXT: [[G3:%.*]] = getelementptr inbounds i32, i32* [[R]], i64 [[TMP9]] +; CHECK-NEXT: ret void +; + %p0 = getelementptr inbounds i64, i64* %p, i64 0 + %p1 = getelementptr inbounds i64, i64* %p, i64 1 + %p2 = getelementptr inbounds i64, i64* %p, i64 2 + %p3 = getelementptr inbounds i64, i64* %p, i64 3 + + %q0 = getelementptr inbounds i64, i64* %q, i64 0 + %q1 = getelementptr inbounds i64, i64* %q, i64 1 + %q2 = getelementptr inbounds i64, i64* %q, i64 2 + %q3 = getelementptr inbounds i64, i64* %q, i64 3 + + %x0 = load i64, i64* %p0, align 2 + %x1 = load i64, i64* %p1, align 2 + %x2 = load i64, i64* %p2, align 2 + %x3 = load i64, i64* %p3, align 2 + + %y0 = load i64, i64* %q0, align 2 + %y1 = load i64, i64* %q1, align 2 + %y2 = load i64, i64* %q2, align 2 + %y3 = load i64, i64* %q3, align 2 + + %sub0 = sub nsw i64 %x0, 
%y0 + %sub1 = sub nsw i64 %x1, %y1 + %sub2 = sub nsw i64 %x2, %y2 + %sub3 = sub nsw i64 %x3, %y3 + + %g0 = getelementptr inbounds i32, i32* %r, i64 %sub0 + %g1 = getelementptr inbounds i32, i32* %r, i64 %sub1 + %g2 = getelementptr inbounds i32, i32* %r, i64 %sub2 + %g3 = getelementptr inbounds i32, i32* %r, i64 %sub3 + ret void +} + +attributes #0 = { "prefer-vector-width"="128" } From llvm-commits at lists.llvm.org Tue Oct 8 10:17:15 2019 From: llvm-commits at lists.llvm.org (Roman Lebedev via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 17:17:15 +0000 (UTC) Subject: [PATCH] D68651: [InstCombine] Signed saturation patterns In-Reply-To: References: Message-ID: <88b631512a481246c73585b4880ac2f1@localhost.localdomain> lebedev.ri added a comment. In D68651#1699976 , @dmgreen wrote: > Hello. > > Can you explain what you mean by "native format"? Do you mean without the extends/truncs, as a different way of specifying them? Err, i meant native bitwidth, so yes, without any extends/truncs. > (I think the problem at least from C is dealing with overflowing arithmetic being undefined. If you extend at least one bit then the arithmetic can't overflow, so you can do the min/max like it's done here). True. This highlights that we also might want to form it from `.with.overflow()` intrinsics. > I don't think there is anywhere in instcombine that currently forms a sadd_sat or ssub_sat (as opposed to uadd_sat or usub_sat), unless it's from an existing sadd_sat. We do form uadd_sat as in rL357012 and usub_sat from selects. > > I really just need some way to generate sadd_sats for vectorisation. 
If there's a better way than this, I'm all ears :) CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68651/new/ https://reviews.llvm.org/D68651 From llvm-commits at lists.llvm.org Tue Oct 8 10:17:18 2019 From: llvm-commits at lists.llvm.org (Adrian Prantl via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 17:17:18 +0000 (UTC) Subject: [PATCH] D68633: fix debug info affects output when opt inline In-Reply-To: References: Message-ID: aprantl added inline comments. ================ Comment at: llvm/lib/Transforms/Utils/InlineFunction.cpp:1842-1843 + // Debuginfo (@llvm.dbg.value) will make different result, skip while allocas scanning + while (isa(I)) ++I; + ---------------- jmorse wrote: > Is there a possibility of an unrelated debug instruction being skipped here, and becoming part of the slice moved by lines 1847-1857? Moving dbg.values of arguments to the start of the caller may create a debug use-before-def situation, there could be other problem scenarios too. > > Using a debug-instruction filtering iterator (like here [0]) might just do-the-right-thing, I don't know whether feeding one to splice would behave correctly though. > > [0] https://github.com/llvm/llvm-project/blob/fdaa74217420729140f1786ea037ac445a724c8e/llvm/lib/Transforms/Utils/SimplifyCFG.cpp#L2592 Don't we have an iterator that automatically skips debug intrinsics? CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68633/new/ https://reviews.llvm.org/D68633 From llvm-commits at lists.llvm.org Tue Oct 8 10:17:19 2019 From: llvm-commits at lists.llvm.org (Jordan Rupprecht via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 17:17:19 +0000 (UTC) Subject: [PATCH] D68570: Unify the two CRC implementations In-Reply-To: References: Message-ID: <42a4c078b62628022c53699b9ac68cf0@localhost.localdomain> rupprecht added a comment. In D68570#1699204 , @hans wrote: > In D68570#1697872 , @rupprecht wrote: > > > Do you have any benchmarks? > > > Of what? 
That generating the table is slower than just using it directly? I'd say no benchmark is needed to conclude that :-) There is also the cost of code complexity to consider. Sure, the table generation goes away, but does that even show up on any benchmark/real world usage of this? Nobody's exercising the table generation code. People are either check-summing once (in which case, it doesn't matter as much if it's slow, because it happens once) or check-summing 1000 times (in which case the one-time table generation is probably not the CRC bottleneck). If there's no performance win, is it worth the potential code complexity of an opaque hex array over constructing it with an algorithm that can be inspected? FWIW I'm still in favor of this patch, and the objcopy part that Herald added me for LGTM. Addressing my comments would help me understand this code but feel free to ignore if I'm the only one that feels this way. ================ Comment at: llvm/lib/Support/CRC.cpp:26 -uint32_t llvm::crc32(uint32_t CRC, StringRef S) { - static llvm::once_flag InitFlag; - static CRC32Table Tbl; - llvm::call_once(InitFlag, initCRC32Table, &Tbl); +static const uint32_t CRCTable[256] = { + 0x00000000, 0x77073096, 0xee0e612c, 0x990951ba, ---------------- hans wrote: > hiraditya wrote: > > rupprecht wrote: > > > Can you leave a comment how this table was generated/how it could be regenerated if needed in the future? And/or a unit test to assert the values are correct? > > +1 > I'm adding a comment with references for how the algorithm and the table works. There is already a unit test in llvm/unittests/Support/CRCTest.cpp From what I can tell: - This algorithm makes use of one byte from this table per byte in the input array - The table has 256 entries - The test only includes "The quick brown fox jumps over the lazy dog" and "123456789" (43 total values if I'm counting correctly) There's no way that test sufficiently tests that all 256 values here are correct.
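One way to get the coverage asked for here, sketched under assumptions: regenerate the table in a test and compare all 256 entries against the hard-coded array. The code below is a standalone illustration, not the code under review. `makeCRC32Table` and this `crc32` helper are hypothetical names, and the polynomial 0xEDB88320 (the reflected CRC-32 polynomial used by zlib) is assumed from the first table entries quoted above.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <string>

// Builds the 256-entry table for the reflected CRC-32 polynomial 0xEDB88320.
std::array<std::uint32_t, 256> makeCRC32Table() {
  std::array<std::uint32_t, 256> Table{};
  for (std::uint32_t I = 0; I < 256; ++I) {
    std::uint32_t V = I;
    // Process the eight bits of the index, least significant bit first.
    for (int Bit = 0; Bit < 8; ++Bit)
      V = (V & 1u) ? (V >> 1) ^ 0xEDB88320u : V >> 1;
    Table[I] = V;
  }
  return Table;
}

// Byte-at-a-time CRC-32: one table lookup per input byte.
std::uint32_t crc32(const std::array<std::uint32_t, 256> &Table,
                    const std::string &S) {
  std::uint32_t CRC = 0xFFFFFFFFu;
  for (unsigned char C : S)
    CRC = Table[(CRC ^ C) & 0xFFu] ^ (CRC >> 8);
  return CRC ^ 0xFFFFFFFFu;
}

// Returns the number of entries where a hard-coded table disagrees with a
// freshly generated one; a unit test would expect zero.
std::size_t countTableMismatches(const std::uint32_t (&Hardcoded)[256]) {
  const std::array<std::uint32_t, 256> Generated = makeCRC32Table();
  std::size_t Mismatches = 0;
  for (std::size_t I = 0; I < 256; ++I)
    if (Hardcoded[I] != Generated[I])
      ++Mismatches;
  return Mismatches;
}
```

A test built this way checks every table entry directly, so a single mistyped hex digit would be caught regardless of which input bytes the string-based tests happen to exercise.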
Maybe check summing a very large (>100k) buffer would give confidence that -- probably -- all values are being used. I think the unit test is sufficient for testing a generated table because it's hard to mess up the generation in a way that only affects some values. However with a hard-coded table, it's much more likely that a single value here could be wrong due to mechanical issues (e.g. accidentally changing a character in the process of formatting it). ================ Comment at: llvm/lib/Support/CRC.cpp:29-37 - for (size_t I = 0; I < Tbl->size(); ++I) { - uint32_t V = Shuffle(I); - V = Shuffle(V); - V = Shuffle(V); - V = Shuffle(V); - V = Shuffle(V); - V = Shuffle(V); ---------------- I'd personally find keeping this as a comment to be more useful in understanding how the table is generated than digging up papers CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68570/new/ https://reviews.llvm.org/D68570 From llvm-commits at lists.llvm.org Tue Oct 8 10:17:39 2019 From: llvm-commits at lists.llvm.org (Vedant Kumar via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 17:17:39 +0000 (UTC) Subject: [PATCH] D68616: [CodeExtractor] Factor out and reuse shrinkwrap analysis In-Reply-To: References: Message-ID: <7468c9802d6d4d3f08d5a03198673e1d@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rG9852699dcb18: [CodeExtractor] Factor out and reuse shrinkwrap analysis (authored by vsk). 
Changed prior to commit: https://reviews.llvm.org/D68616?vs=223711&id=223897#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68616/new/ https://reviews.llvm.org/D68616 Files: llvm/include/llvm/Transforms/IPO/HotColdSplitting.h llvm/include/llvm/Transforms/Utils/CodeExtractor.h llvm/lib/Transforms/IPO/BlockExtractor.cpp llvm/lib/Transforms/IPO/HotColdSplitting.cpp llvm/lib/Transforms/IPO/LoopExtractor.cpp llvm/lib/Transforms/IPO/PartialInlining.cpp llvm/lib/Transforms/Utils/CodeExtractor.cpp llvm/unittests/Transforms/Utils/CodeExtractorTest.cpp From llvm-commits at lists.llvm.org Tue Oct 8 10:26:52 2019 From: llvm-commits at lists.llvm.org (Hubert Tong via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 17:26:52 +0000 (UTC) Subject: [PATCH] D68341: [AIX] TOC pseudo expansion for 64bit large + 64bit small + 32bit large modes In-Reply-To: References: Message-ID: hubert.reinterpretcast added inline comments. ================ Comment at: llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp:826 + // Change the opcode to ADDIS8. If the global address is the address of + // an external symbol, is a jump table address, is a block address; or if + // large code model is enabled then generate a TOC entry and reference that. ---------------- sfertile wrote: > Was the ';' after 'block-address' meant to be a comma? Add "or" before "is a block address". Add "the" before "large code model". Add a comma after "Otherwise".
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68341/new/ https://reviews.llvm.org/D68341 From llvm-commits at lists.llvm.org Tue Oct 8 10:26:53 2019 From: llvm-commits at lists.llvm.org (Matt Arsenault via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 17:26:53 +0000 (UTC) Subject: [PATCH] D68092: [AMDGPU] Invert the handling of skip insertion. In-Reply-To: References: Message-ID: <881989a6df489a93bdb804daa5a6eeb8@localhost.localdomain> arsenm added inline comments. ================ Comment at: lib/Target/AMDGPU/SIRemoveShortExecBranches.cpp:113-114 + TII->analyzeBranch(SrcMBB, TrueMBB, FalseMBB, Cond); + if (!FalseMBB) + FalseMBB = SrcMBB.getNextNode(); + ---------------- I think this reinterpreting analyzeBranch's outputs the way is potentially confusing. I think you don't actually need to check analyzeBranch directly here; I think MachineBasicBlock::getFallThrough does exactly this anyway (and handles the case where there's an unconditional branch as well) ================ Comment at: lib/Target/AMDGPU/SIRemoveShortExecBranches.cpp:116-117 + + if (MDT->dominates(TrueMBB, &SrcMBB) || + mustRetainExeczBranch(*FalseMBB, *TrueMBB)) + return false; ---------------- cdevadas wrote: > nhaehnle wrote: > > What's the logic here behind using domination as a criterion? > There could be a situation in which execnz (inserted during SI_LOOP lowering) can be inverted to execz by an optimization (for instance, BranchFolding). This execz should always be retained. This special check is added to handle it. > Unfortunately, I couldn't write/find a test-case to reproduce it. 
I'm not sure dominance is sufficient for irreducible loops, which you won't run into in practice (as in, they probably hit another control flow bug long before this) but we should handle it correctly Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68092/new/ https://reviews.llvm.org/D68092 From llvm-commits at lists.llvm.org Tue Oct 8 10:28:19 2019 From: llvm-commits at lists.llvm.org (walter erquinigo via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 17:28:19 +0000 (UTC) Subject: [PATCH] D68289: [lldb-server/android] Show more processes by relaxing some checks In-Reply-To: References: Message-ID: wallace added a comment. thanks, @labath ! Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68289/new/ https://reviews.llvm.org/D68289 From llvm-commits at lists.llvm.org Tue Oct 8 10:32:56 2019 From: llvm-commits at lists.llvm.org (Jinsong Ji via llvm-commits) Date: Tue, 08 Oct 2019 17:32:56 -0000 Subject: [llvm] r374091 - Revert "[LoopVectorize][PowerPC] Estimate int and float register pressure separately in loop-vectorize" Message-ID: <20191008173256.9E9A689395@lists.llvm.org> Author: jsji Date: Tue Oct 8 10:32:56 2019 New Revision: 374091 URL: http://llvm.org/viewvc/llvm-project?rev=374091&view=rev Log: Revert "[LoopVectorize][PowerPC] Estimate int and float register pressure separately in loop-vectorize" Also Revert "[LoopVectorize] Fix non-debug builds after rL374017" This reverts commit 9f41deccc0e648a006c9f38e11919f181b6c7e0a. This reverts commit 18b6fe07bcf44294f200bd2b526cb737ed275c04. The patch is breaking PowerPC internal build, checked with author, reverting on behalf of him for now due to timezone. 
Removed: llvm/trunk/test/Transforms/LoopVectorize/PowerPC/reg-usage.ll Modified: llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h llvm/trunk/include/llvm/Analysis/TargetTransformInfoImpl.h llvm/trunk/include/llvm/CodeGen/BasicTTIImpl.h llvm/trunk/lib/Analysis/TargetTransformInfo.cpp llvm/trunk/lib/Target/AArch64/AArch64TargetTransformInfo.h llvm/trunk/lib/Target/ARM/ARMTargetTransformInfo.h llvm/trunk/lib/Target/PowerPC/PPCTargetTransformInfo.cpp llvm/trunk/lib/Target/PowerPC/PPCTargetTransformInfo.h llvm/trunk/lib/Target/SystemZ/SystemZTargetTransformInfo.cpp llvm/trunk/lib/Target/SystemZ/SystemZTargetTransformInfo.h llvm/trunk/lib/Target/WebAssembly/WebAssemblyTargetTransformInfo.cpp llvm/trunk/lib/Target/WebAssembly/WebAssemblyTargetTransformInfo.h llvm/trunk/lib/Target/X86/X86TargetTransformInfo.cpp llvm/trunk/lib/Target/X86/X86TargetTransformInfo.h llvm/trunk/lib/Target/XCore/XCoreTargetTransformInfo.h llvm/trunk/lib/Transforms/Scalar/LoopStrengthReduce.cpp llvm/trunk/lib/Transforms/Vectorize/LoopVectorize.cpp llvm/trunk/lib/Transforms/Vectorize/SLPVectorizer.cpp llvm/trunk/test/Transforms/LoopVectorize/X86/reg-usage-debug.ll llvm/trunk/test/Transforms/LoopVectorize/X86/reg-usage.ll Modified: llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h?rev=374091&r1=374090&r2=374091&view=diff ============================================================================== --- llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h (original) +++ llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h Tue Oct 8 10:32:56 2019 @@ -788,23 +788,10 @@ public: /// Additional properties of an operand's values. enum OperandValueProperties { OP_None = 0, OP_PowerOf2 = 1 }; - /// \return the number of registers in the target-provided register class. 
- unsigned getNumberOfRegisters(unsigned ClassID) const; - - /// \return the target-provided register class ID for the provided type, - /// accounting for type promotion and other type-legalization techniques that the target might apply. - /// However, it specifically does not account for the scalarization or splitting of vector types. - /// Should a vector type require scalarization or splitting into multiple underlying vector registers, - /// that type should be mapped to a register class containing no registers. - /// Specifically, this is designed to provide a simple, high-level view of the register allocation - /// later performed by the backend. These register classes don't necessarily map onto the - /// register classes used by the backend. - /// FIXME: It's not currently possible to determine how many registers - /// are used by the provided type. - unsigned getRegisterClassForType(bool Vector, Type *Ty = nullptr) const; - - /// \return the target-provided register class name - const char* getRegisterClassName(unsigned ClassID) const; + /// \return The number of scalar or vector registers that the target has. + /// If 'Vectors' is true, it returns the number of vector registers. If it is + /// set to false, it returns the number of scalar registers. + unsigned getNumberOfRegisters(bool Vector) const; /// \return The width of the largest scalar or vector register type. 
unsigned getRegisterBitWidth(bool Vector) const; @@ -1256,9 +1243,7 @@ public: Type *Ty) = 0; virtual int getIntImmCost(Intrinsic::ID IID, unsigned Idx, const APInt &Imm, Type *Ty) = 0; - virtual unsigned getNumberOfRegisters(unsigned ClassID) const = 0; - virtual unsigned getRegisterClassForType(bool Vector, Type *Ty = nullptr) const = 0; - virtual const char* getRegisterClassName(unsigned ClassID) const = 0; + virtual unsigned getNumberOfRegisters(bool Vector) = 0; virtual unsigned getRegisterBitWidth(bool Vector) const = 0; virtual unsigned getMinVectorRegisterBitWidth() = 0; virtual bool shouldMaximizeVectorBandwidth(bool OptSize) const = 0; @@ -1601,14 +1586,8 @@ public: Type *Ty) override { return Impl.getIntImmCost(IID, Idx, Imm, Ty); } - unsigned getNumberOfRegisters(unsigned ClassID) const override { - return Impl.getNumberOfRegisters(ClassID); - } - unsigned getRegisterClassForType(bool Vector, Type *Ty = nullptr) const override { - return Impl.getRegisterClassForType(Vector, Ty); - } - const char* getRegisterClassName(unsigned ClassID) const override { - return Impl.getRegisterClassName(ClassID); + unsigned getNumberOfRegisters(bool Vector) override { + return Impl.getNumberOfRegisters(Vector); } unsigned getRegisterBitWidth(bool Vector) const override { return Impl.getRegisterBitWidth(Vector); Modified: llvm/trunk/include/llvm/Analysis/TargetTransformInfoImpl.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Analysis/TargetTransformInfoImpl.h?rev=374091&r1=374090&r2=374091&view=diff ============================================================================== --- llvm/trunk/include/llvm/Analysis/TargetTransformInfoImpl.h (original) +++ llvm/trunk/include/llvm/Analysis/TargetTransformInfoImpl.h Tue Oct 8 10:32:56 2019 @@ -354,20 +354,7 @@ public: return TTI::TCC_Free; } - unsigned getNumberOfRegisters(unsigned ClassID) const { return 8; } - - unsigned getRegisterClassForType(bool Vector, Type *Ty = nullptr) const { - return Vector ? 
1 : 0; - }; - - const char* getRegisterClassName(unsigned ClassID) const { - switch (ClassID) { - default: - return "Generic::Unknown Register Class"; - case 0: return "Generic::ScalarRC"; - case 1: return "Generic::VectorRC"; - } - } + unsigned getNumberOfRegisters(bool Vector) { return 8; } unsigned getRegisterBitWidth(bool Vector) const { return 32; } Modified: llvm/trunk/include/llvm/CodeGen/BasicTTIImpl.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/CodeGen/BasicTTIImpl.h?rev=374091&r1=374090&r2=374091&view=diff ============================================================================== --- llvm/trunk/include/llvm/CodeGen/BasicTTIImpl.h (original) +++ llvm/trunk/include/llvm/CodeGen/BasicTTIImpl.h Tue Oct 8 10:32:56 2019 @@ -519,6 +519,8 @@ public: /// \name Vector TTI Implementations /// @{ + unsigned getNumberOfRegisters(bool Vector) { return Vector ? 0 : 1; } + unsigned getRegisterBitWidth(bool Vector) const { return 32; } /// Estimate the overhead of scalarizing an instruction. 
Insert and Extract Modified: llvm/trunk/lib/Analysis/TargetTransformInfo.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/TargetTransformInfo.cpp?rev=374091&r1=374090&r2=374091&view=diff ============================================================================== --- llvm/trunk/lib/Analysis/TargetTransformInfo.cpp (original) +++ llvm/trunk/lib/Analysis/TargetTransformInfo.cpp Tue Oct 8 10:32:56 2019 @@ -466,16 +466,8 @@ int TargetTransformInfo::getIntImmCost(I return Cost; } -unsigned TargetTransformInfo::getNumberOfRegisters(unsigned ClassID) const { - return TTIImpl->getNumberOfRegisters(ClassID); -} - -unsigned TargetTransformInfo::getRegisterClassForType(bool Vector, Type *Ty) const { - return TTIImpl->getRegisterClassForType(Vector, Ty); -} - -const char* TargetTransformInfo::getRegisterClassName(unsigned ClassID) const { - return TTIImpl->getRegisterClassName(ClassID); +unsigned TargetTransformInfo::getNumberOfRegisters(bool Vector) const { + return TTIImpl->getNumberOfRegisters(Vector); } unsigned TargetTransformInfo::getRegisterBitWidth(bool Vector) const { Modified: llvm/trunk/lib/Target/AArch64/AArch64TargetTransformInfo.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AArch64/AArch64TargetTransformInfo.h?rev=374091&r1=374090&r2=374091&view=diff ============================================================================== --- llvm/trunk/lib/Target/AArch64/AArch64TargetTransformInfo.h (original) +++ llvm/trunk/lib/Target/AArch64/AArch64TargetTransformInfo.h Tue Oct 8 10:32:56 2019 @@ -85,8 +85,7 @@ public: bool enableInterleavedAccessVectorization() { return true; } - unsigned getNumberOfRegisters(unsigned ClassID) const { - bool Vector = (ClassID == 1); + unsigned getNumberOfRegisters(bool Vector) { if (Vector) { if (ST->hasNEON()) return 32; Modified: llvm/trunk/lib/Target/ARM/ARMTargetTransformInfo.h URL: 
http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMTargetTransformInfo.h?rev=374091&r1=374090&r2=374091&view=diff ============================================================================== --- llvm/trunk/lib/Target/ARM/ARMTargetTransformInfo.h (original) +++ llvm/trunk/lib/Target/ARM/ARMTargetTransformInfo.h Tue Oct 8 10:32:56 2019 @@ -122,8 +122,7 @@ public: /// \name Vector TTI Implementations /// @{ - unsigned getNumberOfRegisters(unsigned ClassID) const { - bool Vector = (ClassID == 1); + unsigned getNumberOfRegisters(bool Vector) { if (Vector) { if (ST->hasNEON()) return 16; Modified: llvm/trunk/lib/Target/PowerPC/PPCTargetTransformInfo.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/PowerPC/PPCTargetTransformInfo.cpp?rev=374091&r1=374090&r2=374091&view=diff ============================================================================== --- llvm/trunk/lib/Target/PowerPC/PPCTargetTransformInfo.cpp (original) +++ llvm/trunk/lib/Target/PowerPC/PPCTargetTransformInfo.cpp Tue Oct 8 10:32:56 2019 @@ -594,37 +594,10 @@ bool PPCTTIImpl::enableInterleavedAccess return true; } -unsigned PPCTTIImpl::getNumberOfRegisters(unsigned ClassID) const { - assert(ClassID == GPRRC || ClassID == FPRRC || - ClassID == VRRC || ClassID == VSXRC); - if (ST->hasVSX()) { - assert(ClassID == GPRRC || ClassID == VSXRC); - return ClassID == GPRRC ? 32 : 64; - } - assert(ClassID == GPRRC || ClassID == FPRRC || ClassID == VRRC); - return 32; -} - -unsigned PPCTTIImpl::getRegisterClassForType(bool Vector, Type *Ty) const { - if (Vector) - return ST->hasVSX() ? VSXRC : VRRC; - else if (Ty && Ty->getScalarType()->isFloatTy()) - return ST->hasVSX() ? 
VSXRC : FPRRC; - else - return GPRRC; -} - -const char* PPCTTIImpl::getRegisterClassName(unsigned ClassID) const { - - switch (ClassID) { - default: - llvm_unreachable("unknown register class"); - return "PPC::unknown register class"; - case GPRRC: return "PPC::GPRRC"; - case FPRRC: return "PPC::FPRRC"; - case VRRC: return "PPC::VRRC"; - case VSXRC: return "PPC::VSXRC"; - } +unsigned PPCTTIImpl::getNumberOfRegisters(bool Vector) { + if (Vector && !ST->hasAltivec() && !ST->hasQPX()) + return 0; + return ST->hasVSX() ? 64 : 32; } unsigned PPCTTIImpl::getRegisterBitWidth(bool Vector) const { Modified: llvm/trunk/lib/Target/PowerPC/PPCTargetTransformInfo.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/PowerPC/PPCTargetTransformInfo.h?rev=374091&r1=374090&r2=374091&view=diff ============================================================================== --- llvm/trunk/lib/Target/PowerPC/PPCTargetTransformInfo.h (original) +++ llvm/trunk/lib/Target/PowerPC/PPCTargetTransformInfo.h Tue Oct 8 10:32:56 2019 @@ -72,13 +72,7 @@ public: TTI::MemCmpExpansionOptions enableMemCmpExpansion(bool OptSize, bool IsZeroCmp) const; bool enableInterleavedAccessVectorization(); - - enum PPCRegisterClass { - GPRRC, FPRRC, VRRC, VSXRC - }; - unsigned getNumberOfRegisters(unsigned ClassID) const; - unsigned getRegisterClassForType(bool Vector, Type *Ty = nullptr) const; - const char* getRegisterClassName(unsigned ClassID) const; + unsigned getNumberOfRegisters(bool Vector); unsigned getRegisterBitWidth(bool Vector) const; unsigned getCacheLineSize(); unsigned getPrefetchDistance(); Modified: llvm/trunk/lib/Target/SystemZ/SystemZTargetTransformInfo.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/SystemZ/SystemZTargetTransformInfo.cpp?rev=374091&r1=374090&r2=374091&view=diff ============================================================================== --- llvm/trunk/lib/Target/SystemZ/SystemZTargetTransformInfo.cpp (original) +++ 
llvm/trunk/lib/Target/SystemZ/SystemZTargetTransformInfo.cpp Tue Oct 8 10:32:56 2019 @@ -304,8 +304,7 @@ bool SystemZTTIImpl::isLSRCostLess(Targe C2.ScaleCost, C2.SetupCost); } -unsigned SystemZTTIImpl::getNumberOfRegisters(unsigned ClassID) const { - bool Vector = (ClassID == 1); +unsigned SystemZTTIImpl::getNumberOfRegisters(bool Vector) { if (!Vector) // Discount the stack pointer. Also leave out %r0, since it can't // be used in an address. Modified: llvm/trunk/lib/Target/SystemZ/SystemZTargetTransformInfo.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/SystemZ/SystemZTargetTransformInfo.h?rev=374091&r1=374090&r2=374091&view=diff ============================================================================== --- llvm/trunk/lib/Target/SystemZ/SystemZTargetTransformInfo.h (original) +++ llvm/trunk/lib/Target/SystemZ/SystemZTargetTransformInfo.h Tue Oct 8 10:32:56 2019 @@ -56,7 +56,7 @@ public: /// \name Vector TTI Implementations /// @{ - unsigned getNumberOfRegisters(unsigned ClassID) const; + unsigned getNumberOfRegisters(bool Vector); unsigned getRegisterBitWidth(bool Vector) const; unsigned getCacheLineSize() { return 256; } Modified: llvm/trunk/lib/Target/WebAssembly/WebAssemblyTargetTransformInfo.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/WebAssembly/WebAssemblyTargetTransformInfo.cpp?rev=374091&r1=374090&r2=374091&view=diff ============================================================================== --- llvm/trunk/lib/Target/WebAssembly/WebAssemblyTargetTransformInfo.cpp (original) +++ llvm/trunk/lib/Target/WebAssembly/WebAssemblyTargetTransformInfo.cpp Tue Oct 8 10:32:56 2019 @@ -25,11 +25,10 @@ WebAssemblyTTIImpl::getPopcntSupport(uns return TargetTransformInfo::PSK_FastHardware; } -unsigned WebAssemblyTTIImpl::getNumberOfRegisters(unsigned ClassID) const { - unsigned Result = BaseT::getNumberOfRegisters(ClassID); +unsigned WebAssemblyTTIImpl::getNumberOfRegisters(bool Vector) { + unsigned Result = 
BaseT::getNumberOfRegisters(Vector); // For SIMD, use at least 16 registers, as a rough guess. - bool Vector = (ClassID == 1); if (Vector) Result = std::max(Result, 16u); Modified: llvm/trunk/lib/Target/WebAssembly/WebAssemblyTargetTransformInfo.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/WebAssembly/WebAssemblyTargetTransformInfo.h?rev=374091&r1=374090&r2=374091&view=diff ============================================================================== --- llvm/trunk/lib/Target/WebAssembly/WebAssemblyTargetTransformInfo.h (original) +++ llvm/trunk/lib/Target/WebAssembly/WebAssemblyTargetTransformInfo.h Tue Oct 8 10:32:56 2019 @@ -53,7 +53,7 @@ public: /// \name Vector TTI Implementations /// @{ - unsigned getNumberOfRegisters(unsigned ClassID) const; + unsigned getNumberOfRegisters(bool Vector); unsigned getRegisterBitWidth(bool Vector) const; unsigned getArithmeticInstrCost( unsigned Opcode, Type *Ty, Modified: llvm/trunk/lib/Target/X86/X86TargetTransformInfo.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86TargetTransformInfo.cpp?rev=374091&r1=374090&r2=374091&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86TargetTransformInfo.cpp (original) +++ llvm/trunk/lib/Target/X86/X86TargetTransformInfo.cpp Tue Oct 8 10:32:56 2019 @@ -116,8 +116,7 @@ llvm::Optional X86TTIImpl::get llvm_unreachable("Unknown TargetTransformInfo::CacheLevel"); } -unsigned X86TTIImpl::getNumberOfRegisters(unsigned ClassID) const { - bool Vector = (ClassID == 1); +unsigned X86TTIImpl::getNumberOfRegisters(bool Vector) { if (Vector && !ST->hasSSE1()) return 0; Modified: llvm/trunk/lib/Target/X86/X86TargetTransformInfo.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86TargetTransformInfo.h?rev=374091&r1=374090&r2=374091&view=diff ============================================================================== --- 
llvm/trunk/lib/Target/X86/X86TargetTransformInfo.h (original) +++ llvm/trunk/lib/Target/X86/X86TargetTransformInfo.h Tue Oct 8 10:32:56 2019 @@ -116,7 +116,7 @@ public: /// \name Vector TTI Implementations /// @{ - unsigned getNumberOfRegisters(unsigned ClassID) const; + unsigned getNumberOfRegisters(bool Vector); unsigned getRegisterBitWidth(bool Vector) const; unsigned getLoadStoreVecRegBitWidth(unsigned AS) const; unsigned getMaxInterleaveFactor(unsigned VF); Modified: llvm/trunk/lib/Target/XCore/XCoreTargetTransformInfo.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/XCore/XCoreTargetTransformInfo.h?rev=374091&r1=374090&r2=374091&view=diff ============================================================================== --- llvm/trunk/lib/Target/XCore/XCoreTargetTransformInfo.h (original) +++ llvm/trunk/lib/Target/XCore/XCoreTargetTransformInfo.h Tue Oct 8 10:32:56 2019 @@ -40,8 +40,7 @@ public: : BaseT(TM, F.getParent()->getDataLayout()), ST(TM->getSubtargetImpl()), TLI(ST->getTargetLowering()) {} - unsigned getNumberOfRegisters(unsigned ClassID) const { - bool Vector = (ClassID == 1); + unsigned getNumberOfRegisters(bool Vector) { if (Vector) { return 0; } Modified: llvm/trunk/lib/Transforms/Scalar/LoopStrengthReduce.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/LoopStrengthReduce.cpp?rev=374091&r1=374090&r2=374091&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Scalar/LoopStrengthReduce.cpp (original) +++ llvm/trunk/lib/Transforms/Scalar/LoopStrengthReduce.cpp Tue Oct 8 10:32:56 2019 @@ -1386,9 +1386,7 @@ void Cost::RateFormula(const Formula &F, // Treat every new register that exceeds TTI.getNumberOfRegisters() - 1 as // additional instruction (at least fill). - // TODO: Need distinguish register class? 
- unsigned TTIRegNum = TTI->getNumberOfRegisters( - TTI->getRegisterClassForType(false, F.getType())) - 1; + unsigned TTIRegNum = TTI->getNumberOfRegisters(false) - 1; if (C.NumRegs > TTIRegNum) { // Cost already exceeded TTIRegNum, then only newly added register can add // new instructions. Modified: llvm/trunk/lib/Transforms/Vectorize/LoopVectorize.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Vectorize/LoopVectorize.cpp?rev=374091&r1=374090&r2=374091&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Vectorize/LoopVectorize.cpp (original) +++ llvm/trunk/lib/Transforms/Vectorize/LoopVectorize.cpp Tue Oct 8 10:32:56 2019 @@ -983,11 +983,10 @@ public: /// of a loop. struct RegisterUsage { /// Holds the number of loop invariant values that are used in the loop. - /// The key is ClassID of target-provided register class. - SmallMapVector LoopInvariantRegs; + unsigned LoopInvariantRegs; + /// Holds the maximum number of concurrent live intervals in the loop. - /// The key is ClassID of target-provided register class. - SmallMapVector MaxLocalUsers; + unsigned MaxLocalUsers; }; /// \return Returns information about the register usages of the loop for the @@ -4963,14 +4962,9 @@ LoopVectorizationCostModel::computeFeasi // Select the largest VF which doesn't require more registers than existing // ones. 
+ unsigned TargetNumRegisters = TTI.getNumberOfRegisters(true); for (int i = RUs.size() - 1; i >= 0; --i) { - bool Selected = true; - for (auto& pair : RUs[i].MaxLocalUsers) { - unsigned TargetNumRegisters = TTI.getNumberOfRegisters(pair.first); - if (pair.second > TargetNumRegisters) - Selected = false; - } - if (Selected) { + if (RUs[i].MaxLocalUsers <= TargetNumRegisters) { MaxVF = VFs[i]; break; } @@ -5121,12 +5115,22 @@ unsigned LoopVectorizationCostModel::sel if (TC > 1 && TC < TinyTripCountInterleaveThreshold) return 1; + unsigned TargetNumRegisters = TTI.getNumberOfRegisters(VF > 1); + LLVM_DEBUG(dbgs() << "LV: The target has " << TargetNumRegisters + << " registers\n"); + + if (VF == 1) { + if (ForceTargetNumScalarRegs.getNumOccurrences() > 0) + TargetNumRegisters = ForceTargetNumScalarRegs; + } else { + if (ForceTargetNumVectorRegs.getNumOccurrences() > 0) + TargetNumRegisters = ForceTargetNumVectorRegs; + } + RegisterUsage R = calculateRegisterUsage({VF})[0]; // We divide by these constants so assume that we have at least one // instruction that uses at least one register. - for (auto& pair : R.MaxLocalUsers) { - pair.second = std::max(pair.second, 1U); - } + R.MaxLocalUsers = std::max(R.MaxLocalUsers, 1U); // We calculate the interleave count using the following formula. // Subtract the number of loop invariants from the number of available @@ -5139,35 +5143,13 @@ unsigned LoopVectorizationCostModel::sel // We also want power of two interleave counts to ensure that the induction // variable of the vector loop wraps to zero, when tail is folded by masking; // this currently happens when OptForSize, in which case IC is set to 1 above. 
- unsigned IC = UINT_MAX; + unsigned IC = PowerOf2Floor((TargetNumRegisters - R.LoopInvariantRegs) / + R.MaxLocalUsers); - for (auto& pair : R.MaxLocalUsers) { - unsigned TargetNumRegisters = TTI.getNumberOfRegisters(pair.first); - LLVM_DEBUG(dbgs() << "LV: The target has " << TargetNumRegisters - << " registers of " - << TTI.getRegisterClassName(pair.first) << " register class\n"); - if (VF == 1) { - if (ForceTargetNumScalarRegs.getNumOccurrences() > 0) - TargetNumRegisters = ForceTargetNumScalarRegs; - } else { - if (ForceTargetNumVectorRegs.getNumOccurrences() > 0) - TargetNumRegisters = ForceTargetNumVectorRegs; - } - unsigned MaxLocalUsers = pair.second; - unsigned LoopInvariantRegs = 0; - if (R.LoopInvariantRegs.find(pair.first) != R.LoopInvariantRegs.end()) - LoopInvariantRegs = R.LoopInvariantRegs[pair.first]; - - unsigned TmpIC = PowerOf2Floor((TargetNumRegisters - LoopInvariantRegs) / MaxLocalUsers); - // Don't count the induction variable as interleaved. - if (EnableIndVarRegisterHeur) { - TmpIC = - PowerOf2Floor((TargetNumRegisters - LoopInvariantRegs - 1) / - std::max(1U, (MaxLocalUsers - 1))); - } - - IC = std::min(IC, TmpIC); - } + // Don't count the induction variable as interleaved. + if (EnableIndVarRegisterHeur) + IC = PowerOf2Floor((TargetNumRegisters - R.LoopInvariantRegs - 1) / + std::max(1U, (R.MaxLocalUsers - 1))); // Clamp the interleave ranges to reasonable counts. unsigned MaxInterleaveCount = TTI.getMaxInterleaveFactor(VF); @@ -5349,7 +5331,7 @@ LoopVectorizationCostModel::calculateReg const DataLayout &DL = TheFunction->getParent()->getDataLayout(); SmallVector RUs(VFs.size()); - SmallVector, 8> MaxUsages(VFs.size()); + SmallVector MaxUsages(VFs.size(), 0); LLVM_DEBUG(dbgs() << "LV(REG): Calculating max register usage:\n"); @@ -5379,45 +5361,21 @@ LoopVectorizationCostModel::calculateReg // For each VF find the maximum usage of registers. for (unsigned j = 0, e = VFs.size(); j < e; ++j) { - // Count the number of live intervals. 
- SmallMapVector RegUsage; - if (VFs[j] == 1) { - for (auto Inst : OpenIntervals) { - unsigned ClassID = TTI.getRegisterClassForType(false, Inst->getType()); - if (RegUsage.find(ClassID) == RegUsage.end()) - RegUsage[ClassID] = 1; - else - RegUsage[ClassID] += 1; - } - } else { - collectUniformsAndScalars(VFs[j]); - for (auto Inst : OpenIntervals) { - // Skip ignored values for VF > 1. - if (VecValuesToIgnore.find(Inst) != VecValuesToIgnore.end()) - continue; - if (isScalarAfterVectorization(Inst, VFs[j])) { - unsigned ClassID = TTI.getRegisterClassForType(false, Inst->getType()); - if (RegUsage.find(ClassID) == RegUsage.end()) - RegUsage[ClassID] = 1; - else - RegUsage[ClassID] += 1; - } else { - unsigned ClassID = TTI.getRegisterClassForType(true, Inst->getType()); - if (RegUsage.find(ClassID) == RegUsage.end()) - RegUsage[ClassID] = GetRegUsage(Inst->getType(), VFs[j]); - else - RegUsage[ClassID] += GetRegUsage(Inst->getType(), VFs[j]); - } - } + MaxUsages[j] = std::max(MaxUsages[j], OpenIntervals.size()); + continue; } - - for (auto& pair : RegUsage) { - if (MaxUsages[j].find(pair.first) != MaxUsages[j].end()) - MaxUsages[j][pair.first] = std::max(MaxUsages[j][pair.first], pair.second); - else - MaxUsages[j][pair.first] = pair.second; + collectUniformsAndScalars(VFs[j]); + // Count the number of live intervals. + unsigned RegUsage = 0; + for (auto Inst : OpenIntervals) { + // Skip ignored values for VF > 1. + if (VecValuesToIgnore.find(Inst) != VecValuesToIgnore.end() || + isScalarAfterVectorization(Inst, VFs[j])) + continue; + RegUsage += GetRegUsage(Inst->getType(), VFs[j]); } + MaxUsages[j] = std::max(MaxUsages[j], RegUsage); } LLVM_DEBUG(dbgs() << "LV(REG): At #" << i << " Interval # " @@ -5428,34 +5386,18 @@ LoopVectorizationCostModel::calculateReg } for (unsigned i = 0, e = VFs.size(); i < e; ++i) { - SmallMapVector Invariant; - - for (auto Inst : LoopInvariants) { - unsigned Usage = VFs[i] == 1 ? 
1 : GetRegUsage(Inst->getType(), VFs[i]); - unsigned ClassID = TTI.getRegisterClassForType(VFs[i] > 1, Inst->getType()); - if (Invariant.find(ClassID) == Invariant.end()) - Invariant[ClassID] = Usage; - else - Invariant[ClassID] += Usage; + unsigned Invariant = 0; + if (VFs[i] == 1) + Invariant = LoopInvariants.size(); + else { + for (auto Inst : LoopInvariants) + Invariant += GetRegUsage(Inst->getType(), VFs[i]); } LLVM_DEBUG(dbgs() << "LV(REG): VF = " << VFs[i] << '\n'); - LLVM_DEBUG(dbgs() << "LV(REG): Found max usage: " - << MaxUsages[i].size() << " item\n"); - for (const auto& Pair : MaxUsages[i]) { - (void)Pair; - LLVM_DEBUG(dbgs() << "LV(REG): RegisterClass: " - << TTI.getRegisterClassName(Pair.first) - << ", " << Pair.second << " registers \n"); - } - LLVM_DEBUG(dbgs() << "LV(REG): Found invariant usage: " - << Invariant.size() << " item\n"); - for (const auto& Pair : Invariant) { - (void)Pair; - LLVM_DEBUG(dbgs() << "LV(REG): RegisterClass: " - << TTI.getRegisterClassName(Pair.first) - << ", " << Pair.second << " registers \n"); - } + LLVM_DEBUG(dbgs() << "LV(REG): Found max usage: " << MaxUsages[i] << '\n'); + LLVM_DEBUG(dbgs() << "LV(REG): Found invariant usage: " << Invariant + << '\n'); RU.LoopInvariantRegs = Invariant; RU.MaxLocalUsers = MaxUsages[i]; @@ -7820,8 +7762,7 @@ bool LoopVectorizePass::runImpl( // The second condition is necessary because, even if the target has no // vector registers, loop vectorization may still enable scalar // interleaving. 
- if (!TTI->getNumberOfRegisters(TTI->getRegisterClassForType(true)) && - TTI->getMaxInterleaveFactor(1) < 2) + if (!TTI->getNumberOfRegisters(true) && TTI->getMaxInterleaveFactor(1) < 2) return false; bool Changed = false; Modified: llvm/trunk/lib/Transforms/Vectorize/SLPVectorizer.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Vectorize/SLPVectorizer.cpp?rev=374091&r1=374090&r2=374091&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Vectorize/SLPVectorizer.cpp (original) +++ llvm/trunk/lib/Transforms/Vectorize/SLPVectorizer.cpp Tue Oct 8 10:32:56 2019 @@ -5237,7 +5237,7 @@ bool SLPVectorizerPass::runImpl(Function // If the target claims to have no vector registers don't attempt // vectorization. - if (!TTI->getNumberOfRegisters(TTI->getRegisterClassForType(true))) + if (!TTI->getNumberOfRegisters(true)) return false; // Don't vectorize when the attribute NoImplicitFloat is used. Removed: llvm/trunk/test/Transforms/LoopVectorize/PowerPC/reg-usage.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/LoopVectorize/PowerPC/reg-usage.ll?rev=374090&view=auto ============================================================================== --- llvm/trunk/test/Transforms/LoopVectorize/PowerPC/reg-usage.ll (original) +++ llvm/trunk/test/Transforms/LoopVectorize/PowerPC/reg-usage.ll (removed) @@ -1,179 +0,0 @@ -; RUN: opt < %s -debug-only=loop-vectorize -loop-vectorize -vectorizer-maximize-bandwidth -O2 -mtriple=powerpc64-unknown-linux -S -mcpu=pwr8 2>&1 | FileCheck %s --check-prefixes=CHECK,CHECK-PWR8 -; RUN: opt < %s -debug-only=loop-vectorize -loop-vectorize -vectorizer-maximize-bandwidth -O2 -mtriple=powerpc64le-unknown-linux -S -mcpu=pwr9 2>&1 | FileCheck %s --check-prefixes=CHECK,CHECK-PWR9 -; REQUIRES: asserts - -@a = global [1024 x i8] zeroinitializer, align 16 -@b = global [1024 x i8] zeroinitializer, align 16 - -define i32 @foo() { -; -; CHECK-LABEL: 
foo - -; CHECK: LV(REG): VF = 8 -; CHECK-NEXT: LV(REG): Found max usage: 2 item -; CHECK-NEXT: LV(REG): RegisterClass: PPC::GPRRC, 2 registers -; CHECK-NEXT: LV(REG): RegisterClass: PPC::VSXRC, 7 registers -; CHECK-NEXT: LV(REG): Found invariant usage: 0 item -; CHECK: LV(REG): VF = 16 -; CHECK-NEXT: LV(REG): Found max usage: 2 item -; CHECK-NEXT: LV(REG): RegisterClass: PPC::GPRRC, 2 registers -; CHECK-NEXT: LV(REG): RegisterClass: PPC::VSXRC, 13 registers -; CHECK-NEXT: LV(REG): Found invariant usage: 0 item - -; CHECK-PWR8: LV(REG): VF = 16 -; CHECK-PWR8-NEXT: LV(REG): Found max usage: 2 item -; CHECK-PWR8-NEXT: LV(REG): RegisterClass: PPC::GPRRC, 2 registers -; CHECK-PWR8-NEXT: LV(REG): RegisterClass: PPC::VSXRC, 13 registers -; CHECK-PWR8-NEXT: LV(REG): Found invariant usage: 0 item -; CHECK-PWR8: Setting best plan to VF=16, UF=4 - -; CHECK-PWR9: LV(REG): VF = 8 -; CHECK-PWR9-NEXT: LV(REG): Found max usage: 2 item -; CHECK-PWR9-NEXT: LV(REG): RegisterClass: PPC::GPRRC, 2 registers -; CHECK-PWR9-NEXT: LV(REG): RegisterClass: PPC::VSXRC, 7 registers -; CHECK-PWR9-NEXT: LV(REG): Found invariant usage: 0 item -; CHECK-PWR9: Setting best plan to VF=8, UF=8 - - -entry: - br label %for.body - -for.cond.cleanup: - %add.lcssa = phi i32 [ %add, %for.body ] - ret i32 %add.lcssa - -for.body: - %indvars.iv = phi i64 [ 0, %entry ], [ %indvars.iv.next, %for.body ] - %s.015 = phi i32 [ 0, %entry ], [ %add, %for.body ] - %arrayidx = getelementptr inbounds [1024 x i8], [1024 x i8]* @a, i64 0, i64 %indvars.iv - %0 = load i8, i8* %arrayidx, align 1 - %conv = zext i8 %0 to i32 - %arrayidx2 = getelementptr inbounds [1024 x i8], [1024 x i8]* @b, i64 0, i64 %indvars.iv - %1 = load i8, i8* %arrayidx2, align 1 - %conv3 = zext i8 %1 to i32 - %sub = sub nsw i32 %conv, %conv3 - %ispos = icmp sgt i32 %sub, -1 - %neg = sub nsw i32 0, %sub - %2 = select i1 %ispos, i32 %sub, i32 %neg - %add = add nsw i32 %2, %s.015 - %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1 - %exitcond = icmp eq i64 
%indvars.iv.next, 1024 - br i1 %exitcond, label %for.cond.cleanup, label %for.body -} - -define i32 @goo() { -; For indvars.iv used in a computating chain only feeding into getelementptr or cmp, -; it will not have vector version and the vector register usage will not exceed the -; available vector register number. -; CHECK-LABEL: goo -; CHECK: LV(REG): VF = 8 -; CHECK-NEXT: LV(REG): Found max usage: 2 item -; CHECK-NEXT: LV(REG): RegisterClass: PPC::GPRRC, 2 registers -; CHECK-NEXT: LV(REG): RegisterClass: PPC::VSXRC, 7 registers -; CHECK-NEXT: LV(REG): Found invariant usage: 0 item -; CHECK: LV(REG): VF = 16 -; CHECK-NEXT: LV(REG): Found max usage: 2 item -; CHECK-NEXT: LV(REG): RegisterClass: PPC::GPRRC, 2 registers -; CHECK-NEXT: LV(REG): RegisterClass: PPC::VSXRC, 13 registers -; CHECK-NEXT: LV(REG): Found invariant usage: 0 item -; CHECK: LV(REG): VF = 16 -; CHECK-NEXT: LV(REG): Found max usage: 2 item -; CHECK-NEXT: LV(REG): RegisterClass: PPC::GPRRC, 2 registers -; CHECK-NEXT: LV(REG): RegisterClass: PPC::VSXRC, 13 registers -; CHECK-NEXT: LV(REG): Found invariant usage: 0 item - -; CHECK: Setting best plan to VF=16, UF=4 - -entry: - br label %for.body - -for.cond.cleanup: ; preds = %for.body - %add.lcssa = phi i32 [ %add, %for.body ] - ret i32 %add.lcssa - -for.body: ; preds = %for.body, %entry - %indvars.iv = phi i64 [ 0, %entry ], [ %indvars.iv.next, %for.body ] - %s.015 = phi i32 [ 0, %entry ], [ %add, %for.body ] - %tmp1 = add nsw i64 %indvars.iv, 3 - %arrayidx = getelementptr inbounds [1024 x i8], [1024 x i8]* @a, i64 0, i64 %tmp1 - %tmp = load i8, i8* %arrayidx, align 1 - %conv = zext i8 %tmp to i32 - %tmp2 = add nsw i64 %indvars.iv, 2 - %arrayidx2 = getelementptr inbounds [1024 x i8], [1024 x i8]* @b, i64 0, i64 %tmp2 - %tmp3 = load i8, i8* %arrayidx2, align 1 - %conv3 = zext i8 %tmp3 to i32 - %sub = sub nsw i32 %conv, %conv3 - %ispos = icmp sgt i32 %sub, -1 - %neg = sub nsw i32 0, %sub - %tmp4 = select i1 %ispos, i32 %sub, i32 %neg - %add = add nsw 
i32 %tmp4, %s.015 - %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1 - %exitcond = icmp eq i64 %indvars.iv.next, 1024 - br i1 %exitcond, label %for.cond.cleanup, label %for.body -} - -define i64 @bar(i64* nocapture %a) { -; CHECK-LABEL: bar -; CHECK: LV(REG): VF = 2 -; CHECK-NEXT: LV(REG): Found max usage: 2 item -; CHECK-NEXT: LV(REG): RegisterClass: PPC::VSXRC, 3 registers -; CHECK-NEXT: LV(REG): RegisterClass: PPC::GPRRC, 1 registers -; CHECK-NEXT: LV(REG): Found invariant usage: 0 item - -; CHECK: Setting best plan to VF=2, UF=12 - -entry: - br label %for.body - -for.cond.cleanup: - %add2.lcssa = phi i64 [ %add2, %for.body ] - ret i64 %add2.lcssa - -for.body: - %i.012 = phi i64 [ 0, %entry ], [ %inc, %for.body ] - %s.011 = phi i64 [ 0, %entry ], [ %add2, %for.body ] - %arrayidx = getelementptr inbounds i64, i64* %a, i64 %i.012 - %0 = load i64, i64* %arrayidx, align 8 - %add = add nsw i64 %0, %i.012 - store i64 %add, i64* %arrayidx, align 8 - %add2 = add nsw i64 %add, %s.011 - %inc = add nuw nsw i64 %i.012, 1 - %exitcond = icmp eq i64 %inc, 1024 - br i1 %exitcond, label %for.cond.cleanup, label %for.body -} - -@d = external global [0 x i64], align 8 -@e = external global [0 x i32], align 4 -@c = external global [0 x i32], align 4 - -define void @hoo(i32 %n) { -; CHECK-LABEL: hoo -; CHECK: LV(REG): VF = 4 -; CHECK-NEXT: LV(REG): Found max usage: 2 item -; CHECK-NEXT: LV(REG): RegisterClass: PPC::GPRRC, 2 registers -; CHECK-NEXT: LV(REG): RegisterClass: PPC::VSXRC, 2 registers -; CHECK-NEXT: LV(REG): Found invariant usage: 0 item -; CHECK: LV(REG): VF = 1 -; CHECK-NEXT: LV(REG): Found max usage: 1 item -; CHECK-NEXT: LV(REG): RegisterClass: PPC::GPRRC, 2 registers -; CHECK-NEXT: LV(REG): Found invariant usage: 0 item -; CHECK: Setting best plan to VF=1, UF=12 - -entry: - br label %for.body - -for.body: ; preds = %for.body, %entry - %indvars.iv = phi i64 [ 0, %entry ], [ %indvars.iv.next, %for.body ] - %arrayidx = getelementptr inbounds [0 x i64], [0 x 
i64]* @d, i64 0, i64 %indvars.iv - %tmp = load i64, i64* %arrayidx, align 8 - %arrayidx1 = getelementptr inbounds [0 x i32], [0 x i32]* @e, i64 0, i64 %tmp - %tmp1 = load i32, i32* %arrayidx1, align 4 - %arrayidx3 = getelementptr inbounds [0 x i32], [0 x i32]* @c, i64 0, i64 %indvars.iv - store i32 %tmp1, i32* %arrayidx3, align 4 - %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1 - %exitcond = icmp eq i64 %indvars.iv.next, 10000 - br i1 %exitcond, label %for.end, label %for.body - -for.end: ; preds = %for.body - ret void -} Modified: llvm/trunk/test/Transforms/LoopVectorize/X86/reg-usage-debug.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/LoopVectorize/X86/reg-usage-debug.ll?rev=374091&r1=374090&r2=374091&view=diff ============================================================================== --- llvm/trunk/test/Transforms/LoopVectorize/X86/reg-usage-debug.ll (original) +++ llvm/trunk/test/Transforms/LoopVectorize/X86/reg-usage-debug.ll Tue Oct 8 10:32:56 2019 @@ -22,11 +22,7 @@ target datalayout = "e-m:e-i64:64-f80:12 target triple = "x86_64-unknown-linux-gnu" ; CHECK: LV: Checking a loop in "test_g" -; CHECK: LV(REG): Found max usage: 2 item -; CHECK-NEXT: LV(REG): RegisterClass: Generic::ScalarRC, 2 registers -; CHECK-NEXT: LV(REG): RegisterClass: Generic::VectorRC, 2 registers -; CHECK-NEXT: LV(REG): Found invariant usage: 1 item -; CHECK-NEXT: LV(REG): RegisterClass: Generic::VectorRC, 2 registers +; CHECK: LV(REG): Found max usage: 2 define i32 @test_g(i32* nocapture readonly %a, i32 %n) local_unnamed_addr !dbg !6 { entry: @@ -64,11 +60,7 @@ for.end: } ; CHECK: LV: Checking a loop in "test" -; CHECK: LV(REG): Found max usage: 2 item -; CHECK-NEXT: LV(REG): RegisterClass: Generic::ScalarRC, 2 registers -; CHECK-NEXT: LV(REG): RegisterClass: Generic::VectorRC, 2 registers -; CHECK-NEXT: LV(REG): Found invariant usage: 1 item -; CHECK-NEXT: LV(REG): RegisterClass: Generic::VectorRC, 2 registers +; CHECK: LV(REG): Found max usage: 2 
define i32 @test(i32* nocapture readonly %a, i32 %n) local_unnamed_addr { entry: Modified: llvm/trunk/test/Transforms/LoopVectorize/X86/reg-usage.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/LoopVectorize/X86/reg-usage.ll?rev=374091&r1=374090&r2=374091&view=diff ============================================================================== --- llvm/trunk/test/Transforms/LoopVectorize/X86/reg-usage.ll (original) +++ llvm/trunk/test/Transforms/LoopVectorize/X86/reg-usage.ll Tue Oct 8 10:32:56 2019 @@ -11,15 +11,9 @@ define i32 @foo() { ; ; CHECK-LABEL: foo ; CHECK: LV(REG): VF = 8 -; CHECK-NEXT: LV(REG): Found max usage: 2 item -; CHECK-NEXT: LV(REG): RegisterClass: Generic::ScalarRC, 2 registers -; CHECK-NEXT: LV(REG): RegisterClass: Generic::VectorRC, 7 registers -; CHECK-NEXT: LV(REG): Found invariant usage: 0 item +; CHECK-NEXT: LV(REG): Found max usage: 7 ; CHECK: LV(REG): VF = 16 -; CHECK-NEXT: LV(REG): Found max usage: 2 item -; CHECK-NEXT: LV(REG): RegisterClass: Generic::ScalarRC, 2 registers -; CHECK-NEXT: LV(REG): RegisterClass: Generic::VectorRC, 13 registers -; CHECK-NEXT: LV(REG): Found invariant usage: 0 item +; CHECK-NEXT: LV(REG): Found max usage: 13 entry: br label %for.body @@ -53,15 +47,9 @@ define i32 @goo() { ; available vector register number. 
; CHECK-LABEL: goo ; CHECK: LV(REG): VF = 8 -; CHECK-NEXT: LV(REG): Found max usage: 2 item -; CHECK-NEXT: LV(REG): RegisterClass: Generic::ScalarRC, 2 registers -; CHECK-NEXT: LV(REG): RegisterClass: Generic::VectorRC, 7 registers -; CHECK-NEXT: LV(REG): Found invariant usage: 0 item +; CHECK-NEXT: LV(REG): Found max usage: 7 ; CHECK: LV(REG): VF = 16 -; CHECK-NEXT: LV(REG): Found max usage: 2 item -; CHECK-NEXT: LV(REG): RegisterClass: Generic::ScalarRC, 2 registers -; CHECK-NEXT: LV(REG): RegisterClass: Generic::VectorRC, 13 registers -; CHECK-NEXT: LV(REG): Found invariant usage: 0 item +; CHECK-NEXT: LV(REG): Found max usage: 13 entry: br label %for.body @@ -93,11 +81,8 @@ for.body: define i64 @bar(i64* nocapture %a) { ; CHECK-LABEL: bar ; CHECK: LV(REG): VF = 2 -; CHECK-NEXT: LV(REG): Found max usage: 2 item -; CHECK-NEXT: LV(REG): RegisterClass: Generic::VectorRC, 3 registers -; CHECK-NEXT: LV(REG): RegisterClass: Generic::ScalarRC, 1 registers -; CHECK-NEXT: LV(REG): Found invariant usage: 0 item - +; CHECK: LV(REG): Found max usage: 3 +; entry: br label %for.body @@ -128,11 +113,8 @@ define void @hoo(i32 %n) { ; so the max usage of AVX512 vector register will be 2. 
; AVX512F-LABEL: bar ; AVX512F: LV(REG): VF = 16 -; AVX512F-CHECK: LV(REG): Found max usage: 2 item -; AVX512F-CHECK: LV(REG): RegisterClass: Generic::ScalarRC, 2 registers -; AVX512F-CHECK: LV(REG): RegisterClass: Generic::VectorRC, 2 registers -; AVX512F-CHECK: LV(REG): Found invariant usage: 0 item - +; AVX512F: LV(REG): Found max usage: 2 +; entry: br label %for.body From llvm-commits at lists.llvm.org Tue Oct 8 10:36:38 2019 From: llvm-commits at lists.llvm.org (Matt Arsenault via llvm-commits) Date: Tue, 08 Oct 2019 17:36:38 -0000 Subject: [llvm] r374092 - AMDGPU: Fix i16 arithmetic pattern redundancy Message-ID: <20191008173639.126F284C99@lists.llvm.org> Author: arsenm Date: Tue Oct 8 10:36:38 2019 New Revision: 374092 URL: http://llvm.org/viewvc/llvm-project?rev=374092&view=rev Log: AMDGPU: Fix i16 arithmetic pattern redundancy There were two problems here. First, these patterns were duplicated to handle the inverted shift operands instead of using the commuted PatFrags. Second, the zext folding patterns don't apply to subtargets that do not zero the high bits. They should be skipped there instead of inserting the extension. The code that zeroes the high bits would be emitted when necessary anyway. This was also emitting unnecessary zexts in cases where the high bits were undefined.
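The second point in the log can be illustrated with a small model. This is only a sketch under stated assumptions, not LLVM or hardware code: the names `add16_in_reg`, `zext16`, `zext_is_free` and the `zeroes_hi16` flag are hypothetical. It models a 16-bit add that writes the low half of a 32-bit register and either zeroes the high half (what the comment in the diff describes for GFX8 and GFX9) or leaves it untouched (GFX10+).

```python
MASK16 = 0xFFFF


def add16_in_reg(dst_before, a, b, zeroes_hi16):
    """Model a 16-bit add whose result lands in a 32-bit register.

    dst_before  -- prior 32-bit contents of the destination register
    zeroes_hi16 -- True: bits 31:16 are written as 0
                   False: bits 31:16 keep their previous value
    """
    lo = (a + b) & MASK16  # the 16-bit result wraps around
    hi = 0 if zeroes_hi16 else (dst_before & 0xFFFF0000)
    return hi | lo


def zext16(reg):
    """Zero-extend the low 16 bits of a register to 32 bits."""
    return reg & MASK16


def zext_is_free(dst_before, a, b, zeroes_hi16):
    """True when the register already equals the zero-extended result,
    i.e. when folding the zext into the 16-bit op is valid."""
    reg = add16_in_reg(dst_before, a, b, zeroes_hi16)
    return reg == zext16(reg)
```

In this model the fold is valid only when the high half is zeroed, which matches the reasoning in the log: on a subtarget that preserves the high bits, the zext pattern has nothing to fold into, so an explicit extension is needed only where a zero-extended value is actually required.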
Modified: llvm/trunk/lib/Target/AMDGPU/VOP2Instructions.td llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/inst-select-ashr.s16.mir llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/inst-select-lshr.s16.mir llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/inst-select-shl.s16.mir llvm/trunk/test/CodeGen/AMDGPU/idot2.ll llvm/trunk/test/CodeGen/AMDGPU/idot4s.ll llvm/trunk/test/CodeGen/AMDGPU/idot4u.ll llvm/trunk/test/CodeGen/AMDGPU/idot8s.ll llvm/trunk/test/CodeGen/AMDGPU/idot8u.ll llvm/trunk/test/CodeGen/AMDGPU/preserve-hi16.ll llvm/trunk/test/CodeGen/AMDGPU/sdwa-peephole.ll Modified: llvm/trunk/lib/Target/AMDGPU/VOP2Instructions.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/VOP2Instructions.td?rev=374092&r1=374091&r2=374092&view=diff ============================================================================== --- llvm/trunk/lib/Target/AMDGPU/VOP2Instructions.td (original) +++ llvm/trunk/lib/Target/AMDGPU/VOP2Instructions.td Tue Oct 8 10:36:38 2019 @@ -608,9 +608,9 @@ def V_MADMK_F16 : VOP2_Pseudo <"v_madmk_ defm V_LDEXP_F16 : VOP2Inst <"v_ldexp_f16", VOP_F16_F16_I32, AMDGPUldexp>; } // End FPDPRounding = 1 -defm V_LSHLREV_B16 : VOP2Inst <"v_lshlrev_b16", VOP_I16_I16_I16>; -defm V_LSHRREV_B16 : VOP2Inst <"v_lshrrev_b16", VOP_I16_I16_I16>; -defm V_ASHRREV_I16 : VOP2Inst <"v_ashrrev_i16", VOP_I16_I16_I16>; +defm V_LSHLREV_B16 : VOP2Inst <"v_lshlrev_b16", VOP_I16_I16_I16, lshl_rev>; +defm V_LSHRREV_B16 : VOP2Inst <"v_lshrrev_b16", VOP_I16_I16_I16, lshr_rev>; +defm V_ASHRREV_I16 : VOP2Inst <"v_ashrrev_i16", VOP_I16_I16_I16, ashr_rev>; let isCommutable = 1 in { let FPDPRounding = 1 in { @@ -620,16 +620,16 @@ defm V_SUBREV_F16 : VOP2Inst <"v_subrev_ defm V_MUL_F16 : VOP2Inst <"v_mul_f16", VOP_F16_F16_F16, fmul>; def V_MADAK_F16 : VOP2_Pseudo <"v_madak_f16", VOP_MADAK_F16, [], "">; } // End FPDPRounding = 1 -defm V_ADD_U16 : VOP2Inst <"v_add_u16", VOP_I16_I16_I16>; -defm V_SUB_U16 : VOP2Inst <"v_sub_u16" , VOP_I16_I16_I16>; +defm V_ADD_U16 : VOP2Inst <"v_add_u16", 
VOP_I16_I16_I16, add>; +defm V_SUB_U16 : VOP2Inst <"v_sub_u16" , VOP_I16_I16_I16, sub>; defm V_SUBREV_U16 : VOP2Inst <"v_subrev_u16", VOP_I16_I16_I16, null_frag, "v_sub_u16">; -defm V_MUL_LO_U16 : VOP2Inst <"v_mul_lo_u16", VOP_I16_I16_I16>; +defm V_MUL_LO_U16 : VOP2Inst <"v_mul_lo_u16", VOP_I16_I16_I16, mul>; defm V_MAX_F16 : VOP2Inst <"v_max_f16", VOP_F16_F16_F16, fmaxnum_like>; defm V_MIN_F16 : VOP2Inst <"v_min_f16", VOP_F16_F16_F16, fminnum_like>; -defm V_MAX_U16 : VOP2Inst <"v_max_u16", VOP_I16_I16_I16>; -defm V_MAX_I16 : VOP2Inst <"v_max_i16", VOP_I16_I16_I16>; -defm V_MIN_U16 : VOP2Inst <"v_min_u16", VOP_I16_I16_I16>; -defm V_MIN_I16 : VOP2Inst <"v_min_i16", VOP_I16_I16_I16>; +defm V_MAX_U16 : VOP2Inst <"v_max_u16", VOP_I16_I16_I16, umax>; +defm V_MAX_I16 : VOP2Inst <"v_max_i16", VOP_I16_I16_I16, smax>; +defm V_MIN_U16 : VOP2Inst <"v_min_u16", VOP_I16_I16_I16, umin>; +defm V_MIN_I16 : VOP2Inst <"v_min_i16", VOP_I16_I16_I16, smin>; let Constraints = "$vdst = $src2", DisableEncoding="$src2", isConvertibleToThreeAddress = 1 in { @@ -722,53 +722,17 @@ defm V_PK_FMAC_F16 : VOP2Inst<"v_pk_fmac // Note: 16-bit instructions produce a 0 result in the high 16-bits // on GFX8 and GFX9 and preserve high 16 bits on GFX10+ -def ClearHI16 : OutPatFrag<(ops node:$op), - (V_AND_B32_e64 $op, (V_MOV_B32_e32 (i32 0xffff)))>; - -multiclass Arithmetic_i16_Pats { - -def : GCNPat< - (op i16:$src0, i16:$src1), - !if(!eq(PreservesHI16,1), (ClearHI16 (inst $src0, $src1)), (inst $src0, $src1)) ->; +multiclass Arithmetic_i16_0Hi_Pats { def : GCNPat< (i32 (zext (op i16:$src0, i16:$src1))), - !if(!eq(PreservesHI16,1), (ClearHI16 (inst $src0, $src1)), (inst $src0, $src1)) + (inst $src0, $src1) >; def : GCNPat< (i64 (zext (op i16:$src0, i16:$src1))), (REG_SEQUENCE VReg_64, - !if(!eq(PreservesHI16,1), (ClearHI16 (inst $src0, $src1)), (inst $src0, $src1)), - sub0, - (V_MOV_B32_e32 (i32 0)), sub1) ->; -} - -multiclass Bits_OpsRev_i16_Pats { - -def : GCNPat< - (op i16:$src0, i16:$src1), - 
!if(!eq(PreservesHI16,1), (ClearHI16 (inst VSrc_b32:$src1, VSrc_b32:$src0)), - (inst VSrc_b32:$src1, VSrc_b32:$src0)) ->; - -def : GCNPat< - (i32 (zext (op i16:$src0, i16:$src1))), - !if(!eq(PreservesHI16,1), (ClearHI16 (inst VSrc_b32:$src1, VSrc_b32:$src0)), - (inst VSrc_b32:$src1, VSrc_b32:$src0)) ->; - - -def : GCNPat< - (i64 (zext (op i16:$src0, i16:$src1))), - (REG_SEQUENCE VReg_64, - !if(!eq(PreservesHI16,1), (ClearHI16 (inst VSrc_b32:$src1, VSrc_b32:$src0)), - (inst VSrc_b32:$src1, VSrc_b32:$src0)), - sub0, + (inst $src0, $src1), sub0, (V_MOV_B32_e32 (i32 0)), sub1) >; } @@ -800,35 +764,16 @@ def : GCNPat < let Predicates = [Has16BitInsts] in { let Predicates = [Has16BitInsts, isGFX7GFX8GFX9] in { -defm : Arithmetic_i16_Pats; -defm : Arithmetic_i16_Pats; -defm : Arithmetic_i16_Pats; -defm : Arithmetic_i16_Pats; -defm : Arithmetic_i16_Pats; -defm : Arithmetic_i16_Pats; -defm : Arithmetic_i16_Pats; -} - -let Predicates = [Has16BitInsts, isGFX10Plus] in { -defm : Arithmetic_i16_Pats; -defm : Arithmetic_i16_Pats; -defm : Arithmetic_i16_Pats; -defm : Arithmetic_i16_Pats; -defm : Arithmetic_i16_Pats; -defm : Arithmetic_i16_Pats; -defm : Arithmetic_i16_Pats; -} - -let Predicates = [Has16BitInsts, isGFX7GFX8GFX9] in { -defm : Bits_OpsRev_i16_Pats; -defm : Bits_OpsRev_i16_Pats; -defm : Bits_OpsRev_i16_Pats; -} - -let Predicates = [Has16BitInsts, isGFX10Plus] in { -defm : Bits_OpsRev_i16_Pats; -defm : Bits_OpsRev_i16_Pats; -defm : Bits_OpsRev_i16_Pats; +defm : Arithmetic_i16_0Hi_Pats; +defm : Arithmetic_i16_0Hi_Pats; +defm : Arithmetic_i16_0Hi_Pats; +defm : Arithmetic_i16_0Hi_Pats; +defm : Arithmetic_i16_0Hi_Pats; +defm : Arithmetic_i16_0Hi_Pats; +defm : Arithmetic_i16_0Hi_Pats; +defm : Arithmetic_i16_0Hi_Pats; +defm : Arithmetic_i16_0Hi_Pats; +defm : Arithmetic_i16_0Hi_Pats; } def : ZExt_i16_i1_Pat; Modified: llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/inst-select-ashr.s16.mir URL: 
http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/inst-select-ashr.s16.mir?rev=374092&r1=374091&r2=374092&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/inst-select-ashr.s16.mir (original) +++ llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/inst-select-ashr.s16.mir Tue Oct 8 10:36:38 2019 @@ -78,10 +78,8 @@ body: | ; GFX10: $vcc_hi = IMPLICIT_DEF ; GFX10: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0 ; GFX10: [[COPY1:%[0-9]+]]:sreg_32_xm0 = COPY $sgpr0 - ; GFX10: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 65535, implicit $exec ; GFX10: [[V_ASHRREV_I16_e64_:%[0-9]+]]:vgpr_32 = V_ASHRREV_I16_e64 [[COPY1]], [[COPY]], implicit $exec - ; GFX10: [[V_AND_B32_e64_:%[0-9]+]]:vgpr_32 = V_AND_B32_e64 [[V_ASHRREV_I16_e64_]], [[V_MOV_B32_e32_]], implicit $exec - ; GFX10: S_ENDPGM 0, implicit [[V_AND_B32_e64_]] + ; GFX10: S_ENDPGM 0, implicit [[V_ASHRREV_I16_e64_]] %0:vgpr(s32) = COPY $vgpr0 %1:sgpr(s32) = COPY $sgpr0 %2:vgpr(s16) = G_TRUNC %0 @@ -147,10 +145,8 @@ body: | ; GFX10: $vcc_hi = IMPLICIT_DEF ; GFX10: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0 ; GFX10: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1 - ; GFX10: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 65535, implicit $exec ; GFX10: [[V_ASHRREV_I16_e64_:%[0-9]+]]:vgpr_32 = V_ASHRREV_I16_e64 [[COPY1]], [[COPY]], implicit $exec - ; GFX10: [[V_AND_B32_e64_:%[0-9]+]]:vgpr_32 = V_AND_B32_e64 [[V_ASHRREV_I16_e64_]], [[V_MOV_B32_e32_]], implicit $exec - ; GFX10: S_ENDPGM 0, implicit [[V_AND_B32_e64_]] + ; GFX10: S_ENDPGM 0, implicit [[V_ASHRREV_I16_e64_]] %0:vgpr(s32) = COPY $vgpr0 %1:vgpr(s32) = COPY $vgpr1 %2:vgpr(s16) = G_TRUNC %0 @@ -184,10 +180,8 @@ body: | ; GFX10: $vcc_hi = IMPLICIT_DEF ; GFX10: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0 ; GFX10: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1 - ; GFX10: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 65535, implicit $exec ; GFX10: [[V_ASHRREV_I16_e64_:%[0-9]+]]:vgpr_32 = 
V_ASHRREV_I16_e64 [[COPY1]], [[COPY]], implicit $exec - ; GFX10: [[V_AND_B32_e64_:%[0-9]+]]:vgpr_32 = V_AND_B32_e64 [[V_ASHRREV_I16_e64_]], [[V_MOV_B32_e32_]], implicit $exec - ; GFX10: [[V_BFE_U32_:%[0-9]+]]:vgpr_32 = V_BFE_U32 [[V_AND_B32_e64_]], 0, 16, implicit $exec + ; GFX10: [[V_BFE_U32_:%[0-9]+]]:vgpr_32 = V_BFE_U32 [[V_ASHRREV_I16_e64_]], 0, 16, implicit $exec ; GFX10: S_ENDPGM 0, implicit [[V_BFE_U32_]] %0:vgpr(s32) = COPY $vgpr0 %1:vgpr(s32) = COPY $vgpr1 @@ -329,10 +323,8 @@ body: | ; GFX10: $vcc_hi = IMPLICIT_DEF ; GFX10: [[COPY:%[0-9]+]]:sreg_32_xm0 = COPY $sgpr0 ; GFX10: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr0 - ; GFX10: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 65535, implicit $exec ; GFX10: [[V_ASHRREV_I16_e64_:%[0-9]+]]:vgpr_32 = V_ASHRREV_I16_e64 [[COPY1]], [[COPY]], implicit $exec - ; GFX10: [[V_AND_B32_e64_:%[0-9]+]]:vgpr_32 = V_AND_B32_e64 [[V_ASHRREV_I16_e64_]], [[V_MOV_B32_e32_]], implicit $exec - ; GFX10: S_ENDPGM 0, implicit [[V_AND_B32_e64_]] + ; GFX10: S_ENDPGM 0, implicit [[V_ASHRREV_I16_e64_]] %0:sgpr(s32) = COPY $sgpr0 %1:vgpr(s32) = COPY $vgpr0 %2:sgpr(s16) = G_TRUNC %0 Modified: llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/inst-select-lshr.s16.mir URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/inst-select-lshr.s16.mir?rev=374092&r1=374091&r2=374092&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/inst-select-lshr.s16.mir (original) +++ llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/inst-select-lshr.s16.mir Tue Oct 8 10:36:38 2019 @@ -78,10 +78,8 @@ body: | ; GFX10: $vcc_hi = IMPLICIT_DEF ; GFX10: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0 ; GFX10: [[COPY1:%[0-9]+]]:sreg_32_xm0 = COPY $sgpr0 - ; GFX10: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 65535, implicit $exec ; GFX10: [[V_LSHRREV_B16_e64_:%[0-9]+]]:vgpr_32 = V_LSHRREV_B16_e64 [[COPY1]], [[COPY]], implicit $exec - ; GFX10: 
[[V_AND_B32_e64_:%[0-9]+]]:vgpr_32 = V_AND_B32_e64 [[V_LSHRREV_B16_e64_]], [[V_MOV_B32_e32_]], implicit $exec - ; GFX10: S_ENDPGM 0, implicit [[V_AND_B32_e64_]] + ; GFX10: S_ENDPGM 0, implicit [[V_LSHRREV_B16_e64_]] %0:vgpr(s32) = COPY $vgpr0 %1:sgpr(s32) = COPY $sgpr0 %2:vgpr(s16) = G_TRUNC %0 @@ -147,10 +145,8 @@ body: | ; GFX10: $vcc_hi = IMPLICIT_DEF ; GFX10: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0 ; GFX10: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1 - ; GFX10: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 65535, implicit $exec ; GFX10: [[V_LSHRREV_B16_e64_:%[0-9]+]]:vgpr_32 = V_LSHRREV_B16_e64 [[COPY1]], [[COPY]], implicit $exec - ; GFX10: [[V_AND_B32_e64_:%[0-9]+]]:vgpr_32 = V_AND_B32_e64 [[V_LSHRREV_B16_e64_]], [[V_MOV_B32_e32_]], implicit $exec - ; GFX10: S_ENDPGM 0, implicit [[V_AND_B32_e64_]] + ; GFX10: S_ENDPGM 0, implicit [[V_LSHRREV_B16_e64_]] %0:vgpr(s32) = COPY $vgpr0 %1:vgpr(s32) = COPY $vgpr1 %2:vgpr(s16) = G_TRUNC %0 @@ -184,10 +180,8 @@ body: | ; GFX10: $vcc_hi = IMPLICIT_DEF ; GFX10: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0 ; GFX10: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1 - ; GFX10: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 65535, implicit $exec ; GFX10: [[V_LSHRREV_B16_e64_:%[0-9]+]]:vgpr_32 = V_LSHRREV_B16_e64 [[COPY1]], [[COPY]], implicit $exec - ; GFX10: [[V_AND_B32_e64_:%[0-9]+]]:vgpr_32 = V_AND_B32_e64 [[V_LSHRREV_B16_e64_]], [[V_MOV_B32_e32_]], implicit $exec - ; GFX10: [[V_BFE_U32_:%[0-9]+]]:vgpr_32 = V_BFE_U32 [[V_AND_B32_e64_]], 0, 16, implicit $exec + ; GFX10: [[V_BFE_U32_:%[0-9]+]]:vgpr_32 = V_BFE_U32 [[V_LSHRREV_B16_e64_]], 0, 16, implicit $exec ; GFX10: S_ENDPGM 0, implicit [[V_BFE_U32_]] %0:vgpr(s32) = COPY $vgpr0 %1:vgpr(s32) = COPY $vgpr1 @@ -329,10 +323,8 @@ body: | ; GFX10: $vcc_hi = IMPLICIT_DEF ; GFX10: [[COPY:%[0-9]+]]:sreg_32_xm0 = COPY $sgpr0 ; GFX10: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr0 - ; GFX10: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 65535, implicit $exec ; GFX10: 
[[V_LSHRREV_B16_e64_:%[0-9]+]]:vgpr_32 = V_LSHRREV_B16_e64 [[COPY1]], [[COPY]], implicit $exec - ; GFX10: [[V_AND_B32_e64_:%[0-9]+]]:vgpr_32 = V_AND_B32_e64 [[V_LSHRREV_B16_e64_]], [[V_MOV_B32_e32_]], implicit $exec - ; GFX10: S_ENDPGM 0, implicit [[V_AND_B32_e64_]] + ; GFX10: S_ENDPGM 0, implicit [[V_LSHRREV_B16_e64_]] %0:sgpr(s32) = COPY $sgpr0 %1:vgpr(s32) = COPY $vgpr0 %2:sgpr(s16) = G_TRUNC %0 Modified: llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/inst-select-shl.s16.mir URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/inst-select-shl.s16.mir?rev=374092&r1=374091&r2=374092&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/inst-select-shl.s16.mir (original) +++ llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/inst-select-shl.s16.mir Tue Oct 8 10:36:38 2019 @@ -78,10 +78,8 @@ body: | ; GFX10: $vcc_hi = IMPLICIT_DEF ; GFX10: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0 ; GFX10: [[COPY1:%[0-9]+]]:sreg_32_xm0 = COPY $sgpr0 - ; GFX10: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 65535, implicit $exec ; GFX10: [[V_LSHLREV_B16_e64_:%[0-9]+]]:vgpr_32 = V_LSHLREV_B16_e64 [[COPY1]], [[COPY]], implicit $exec - ; GFX10: [[V_AND_B32_e64_:%[0-9]+]]:vgpr_32 = V_AND_B32_e64 [[V_LSHLREV_B16_e64_]], [[V_MOV_B32_e32_]], implicit $exec - ; GFX10: S_ENDPGM 0, implicit [[V_AND_B32_e64_]] + ; GFX10: S_ENDPGM 0, implicit [[V_LSHLREV_B16_e64_]] %0:vgpr(s32) = COPY $vgpr0 %1:sgpr(s32) = COPY $sgpr0 %2:vgpr(s16) = G_TRUNC %0 @@ -147,10 +145,8 @@ body: | ; GFX10: $vcc_hi = IMPLICIT_DEF ; GFX10: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0 ; GFX10: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1 - ; GFX10: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 65535, implicit $exec ; GFX10: [[V_LSHLREV_B16_e64_:%[0-9]+]]:vgpr_32 = V_LSHLREV_B16_e64 [[COPY1]], [[COPY]], implicit $exec - ; GFX10: [[V_AND_B32_e64_:%[0-9]+]]:vgpr_32 = V_AND_B32_e64 [[V_LSHLREV_B16_e64_]], [[V_MOV_B32_e32_]], implicit 
$exec - ; GFX10: S_ENDPGM 0, implicit [[V_AND_B32_e64_]] + ; GFX10: S_ENDPGM 0, implicit [[V_LSHLREV_B16_e64_]] %0:vgpr(s32) = COPY $vgpr0 %1:vgpr(s32) = COPY $vgpr1 %2:vgpr(s16) = G_TRUNC %0 @@ -184,10 +180,8 @@ body: | ; GFX10: $vcc_hi = IMPLICIT_DEF ; GFX10: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0 ; GFX10: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1 - ; GFX10: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 65535, implicit $exec ; GFX10: [[V_LSHLREV_B16_e64_:%[0-9]+]]:vgpr_32 = V_LSHLREV_B16_e64 [[COPY1]], [[COPY]], implicit $exec - ; GFX10: [[V_AND_B32_e64_:%[0-9]+]]:vgpr_32 = V_AND_B32_e64 [[V_LSHLREV_B16_e64_]], [[V_MOV_B32_e32_]], implicit $exec - ; GFX10: [[V_BFE_U32_:%[0-9]+]]:vgpr_32 = V_BFE_U32 [[V_AND_B32_e64_]], 0, 16, implicit $exec + ; GFX10: [[V_BFE_U32_:%[0-9]+]]:vgpr_32 = V_BFE_U32 [[V_LSHLREV_B16_e64_]], 0, 16, implicit $exec ; GFX10: S_ENDPGM 0, implicit [[V_BFE_U32_]] %0:vgpr(s32) = COPY $vgpr0 %1:vgpr(s32) = COPY $vgpr1 @@ -329,10 +323,8 @@ body: | ; GFX10: $vcc_hi = IMPLICIT_DEF ; GFX10: [[COPY:%[0-9]+]]:sreg_32_xm0 = COPY $sgpr0 ; GFX10: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr0 - ; GFX10: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 65535, implicit $exec ; GFX10: [[V_LSHLREV_B16_e64_:%[0-9]+]]:vgpr_32 = V_LSHLREV_B16_e64 [[COPY1]], [[COPY]], implicit $exec - ; GFX10: [[V_AND_B32_e64_:%[0-9]+]]:vgpr_32 = V_AND_B32_e64 [[V_LSHLREV_B16_e64_]], [[V_MOV_B32_e32_]], implicit $exec - ; GFX10: S_ENDPGM 0, implicit [[V_AND_B32_e64_]] + ; GFX10: S_ENDPGM 0, implicit [[V_LSHLREV_B16_e64_]] %0:sgpr(s32) = COPY $sgpr0 %1:vgpr(s32) = COPY $vgpr0 %2:sgpr(s16) = G_TRUNC %0 Modified: llvm/trunk/test/CodeGen/AMDGPU/idot2.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/idot2.ll?rev=374092&r1=374091&r2=374092&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/AMDGPU/idot2.ll (original) +++ llvm/trunk/test/CodeGen/AMDGPU/idot2.ll Tue Oct 8 10:36:38 2019 @@ -2775,7 
+2775,6 @@ define amdgpu_kernel void @notsdot2_sext ; GFX10-DL: ; %bb.0: ; %entry ; GFX10-DL-NEXT: s_load_dwordx4 s[4:7], s[0:1], 0x24 ; GFX10-DL-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x34 -; GFX10-DL-NEXT: v_mov_b32_e32 v4, 0xffff ; GFX10-DL-NEXT: ; implicit-def: $vcc_hi ; GFX10-DL-NEXT: s_waitcnt lgkmcnt(0) ; GFX10-DL-NEXT: v_mov_b32_e32 v0, s6 @@ -2784,13 +2783,13 @@ define amdgpu_kernel void @notsdot2_sext ; GFX10-DL-NEXT: v_mov_b32_e32 v1, s7 ; GFX10-DL-NEXT: s_load_dword s2, s[0:1], 0x0 ; GFX10-DL-NEXT: global_load_ushort v2, v[2:3], off -; GFX10-DL-NEXT: global_load_ushort v7, v[0:1], off +; GFX10-DL-NEXT: global_load_ushort v0, v[0:1], off ; GFX10-DL-NEXT: s_waitcnt vmcnt(1) -; GFX10-DL-NEXT: v_and_b32_sdwa v1, v2, v4 dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:BYTE_1 src1_sel:DWORD +; GFX10-DL-NEXT: v_lshrrev_b16_e64 v1, 8, v2 ; GFX10-DL-NEXT: s_waitcnt vmcnt(0) -; GFX10-DL-NEXT: v_and_b32_sdwa v3, v7, v4 dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:BYTE_1 src1_sel:DWORD +; GFX10-DL-NEXT: v_lshrrev_b16_e64 v3, 8, v0 ; GFX10-DL-NEXT: v_bfe_i32 v2, v2, 0, 8 -; GFX10-DL-NEXT: v_bfe_i32 v0, v7, 0, 8 +; GFX10-DL-NEXT: v_bfe_i32 v0, v0, 0, 8 ; GFX10-DL-NEXT: v_bfe_i32 v1, v1, 0, 8 ; GFX10-DL-NEXT: v_bfe_i32 v3, v3, 0, 8 ; GFX10-DL-NEXT: s_waitcnt lgkmcnt(0) Modified: llvm/trunk/test/CodeGen/AMDGPU/idot4s.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/idot4s.ll?rev=374092&r1=374091&r2=374092&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/AMDGPU/idot4s.ll (original) +++ llvm/trunk/test/CodeGen/AMDGPU/idot4s.ll Tue Oct 8 10:36:38 2019 @@ -841,7 +841,6 @@ define amdgpu_kernel void @idot4_acc32_v ; GFX10-DL: ; %bb.0: ; %entry ; GFX10-DL-NEXT: s_load_dwordx4 s[4:7], s[0:1], 0x24 ; GFX10-DL-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x34 -; GFX10-DL-NEXT: v_mov_b32_e32 v2, 0xffff ; GFX10-DL-NEXT: ; implicit-def: $vcc_hi ; GFX10-DL-NEXT: s_waitcnt lgkmcnt(0) ; GFX10-DL-NEXT: 
s_load_dword s2, s[4:5], 0x0 @@ -850,19 +849,19 @@ define amdgpu_kernel void @idot4_acc32_v ; GFX10-DL-NEXT: v_mov_b32_e32 v0, s0 ; GFX10-DL-NEXT: v_mov_b32_e32 v1, s1 ; GFX10-DL-NEXT: s_waitcnt lgkmcnt(0) -; GFX10-DL-NEXT: v_and_b32_sdwa v3, s2, v2 dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:BYTE_1 src1_sel:DWORD -; GFX10-DL-NEXT: v_and_b32_sdwa v2, s3, v2 dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:BYTE_1 src1_sel:DWORD +; GFX10-DL-NEXT: v_lshrrev_b16_e64 v2, 8, s2 +; GFX10-DL-NEXT: v_lshrrev_b16_e64 v3, 8, s3 ; GFX10-DL-NEXT: v_mov_b32_e32 v4, s4 ; GFX10-DL-NEXT: s_sext_i32_i8 s0, s2 ; GFX10-DL-NEXT: s_sext_i32_i8 s1, s3 -; GFX10-DL-NEXT: v_bfe_i32 v3, v3, 0, 8 ; GFX10-DL-NEXT: v_bfe_i32 v2, v2, 0, 8 +; GFX10-DL-NEXT: v_bfe_i32 v3, v3, 0, 8 ; GFX10-DL-NEXT: s_bfe_i32 s4, s2, 0x80010 ; GFX10-DL-NEXT: s_bfe_i32 s5, s3, 0x80010 ; GFX10-DL-NEXT: v_mad_i32_i24 v4, s0, s1, v4 ; GFX10-DL-NEXT: s_ashr_i32 s0, s2, 24 ; GFX10-DL-NEXT: s_ashr_i32 s1, s3, 24 -; GFX10-DL-NEXT: v_mad_i32_i24 v2, v3, v2, v4 +; GFX10-DL-NEXT: v_mad_i32_i24 v2, v2, v3, v4 ; GFX10-DL-NEXT: v_mad_i32_i24 v2, s4, s5, v2 ; GFX10-DL-NEXT: v_mad_i32_i24 v2, s0, s1, v2 ; GFX10-DL-NEXT: global_store_dword v[0:1], v2, off @@ -1057,16 +1056,16 @@ define amdgpu_kernel void @idot4_acc16_v ; GFX10-DL-NEXT: s_bfe_i32 s1, s3, 0x80000 ; GFX10-DL-NEXT: s_lshr_b32 s4, s2, 16 ; GFX10-DL-NEXT: s_lshr_b32 s5, s3, 16 -; GFX10-DL-NEXT: v_and_b32_sdwa v4, sext(s2), v2 dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:BYTE_1 src1_sel:DWORD +; GFX10-DL-NEXT: v_ashrrev_i16_e64 v4, 8, s2 ; GFX10-DL-NEXT: v_and_b32_e32 v7, s0, v2 ; GFX10-DL-NEXT: v_and_b32_e32 v6, s1, v2 -; GFX10-DL-NEXT: v_and_b32_sdwa v5, sext(s3), v2 dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:BYTE_1 src1_sel:DWORD +; GFX10-DL-NEXT: v_ashrrev_i16_e64 v5, 8, s3 ; GFX10-DL-NEXT: s_bfe_i32 s0, s4, 0x80000 ; GFX10-DL-NEXT: s_bfe_i32 s1, s5, 0x80000 ; GFX10-DL-NEXT: v_lshl_or_b32 v4, v4, 16, v7 -; GFX10-DL-NEXT: v_and_b32_sdwa v8, sext(s4), v2 dst_sel:DWORD 
dst_unused:UNUSED_PAD src0_sel:BYTE_1 src1_sel:DWORD +; GFX10-DL-NEXT: v_ashrrev_i16_e64 v8, 8, s4 ; GFX10-DL-NEXT: v_lshl_or_b32 v5, v5, 16, v6 -; GFX10-DL-NEXT: v_and_b32_sdwa v6, sext(s5), v2 dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:BYTE_1 src1_sel:DWORD +; GFX10-DL-NEXT: v_ashrrev_i16_e64 v6, 8, s5 ; GFX10-DL-NEXT: v_and_b32_e32 v7, s1, v2 ; GFX10-DL-NEXT: v_and_b32_e32 v2, s0, v2 ; GFX10-DL-NEXT: v_pk_mul_lo_u16 v4, v4, v5 Modified: llvm/trunk/test/CodeGen/AMDGPU/idot4u.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/idot4u.ll?rev=374092&r1=374091&r2=374092&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/AMDGPU/idot4u.ll (original) +++ llvm/trunk/test/CodeGen/AMDGPU/idot4u.ll Tue Oct 8 10:36:38 2019 @@ -1738,28 +1738,30 @@ define amdgpu_kernel void @udot4_acc32_v ; GFX10-DL: ; %bb.0: ; %entry ; GFX10-DL-NEXT: s_load_dwordx4 s[4:7], s[0:1], 0x24 ; GFX10-DL-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x34 -; GFX10-DL-NEXT: s_movk_i32 s2, 0xff -; GFX10-DL-NEXT: v_mov_b32_e32 v2, 0xffff +; GFX10-DL-NEXT: s_movk_i32 s3, 0xff +; GFX10-DL-NEXT: s_mov_b32 s2, 0xffff ; GFX10-DL-NEXT: ; implicit-def: $vcc_hi ; GFX10-DL-NEXT: s_waitcnt lgkmcnt(0) -; GFX10-DL-NEXT: s_load_dword s3, s[4:5], 0x0 -; GFX10-DL-NEXT: s_load_dword s4, s[6:7], 0x0 -; GFX10-DL-NEXT: s_load_dword s5, s[0:1], 0x0 +; GFX10-DL-NEXT: s_load_dword s4, s[4:5], 0x0 +; GFX10-DL-NEXT: s_load_dword s5, s[6:7], 0x0 +; GFX10-DL-NEXT: s_load_dword s6, s[0:1], 0x0 ; GFX10-DL-NEXT: v_mov_b32_e32 v0, s0 ; GFX10-DL-NEXT: v_mov_b32_e32 v1, s1 ; GFX10-DL-NEXT: s_waitcnt lgkmcnt(0) -; GFX10-DL-NEXT: s_and_b32 s0, s3, s2 -; GFX10-DL-NEXT: s_and_b32 s1, s4, s2 +; GFX10-DL-NEXT: v_mov_b32_e32 v2, s4 ; GFX10-DL-NEXT: v_mov_b32_e32 v3, s5 -; GFX10-DL-NEXT: v_and_b32_sdwa v4, s3, v2 dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:BYTE_1 src1_sel:DWORD -; GFX10-DL-NEXT: v_and_b32_sdwa v2, s4, v2 dst_sel:DWORD dst_unused:UNUSED_PAD 
src0_sel:BYTE_1 src1_sel:DWORD -; GFX10-DL-NEXT: s_bfe_u32 s2, s3, 0x80010 -; GFX10-DL-NEXT: s_bfe_u32 s5, s4, 0x80010 -; GFX10-DL-NEXT: v_mad_u32_u24 v3, s0, s1, v3 -; GFX10-DL-NEXT: s_lshr_b32 s0, s3, 24 -; GFX10-DL-NEXT: s_lshr_b32 s1, s4, 24 -; GFX10-DL-NEXT: v_mad_u32_u24 v2, v4, v2, v3 -; GFX10-DL-NEXT: v_mad_u32_u24 v2, s2, s5, v2 +; GFX10-DL-NEXT: s_and_b32 s0, s4, s3 +; GFX10-DL-NEXT: s_and_b32 s1, s5, s3 +; GFX10-DL-NEXT: v_mov_b32_e32 v4, s6 +; GFX10-DL-NEXT: v_and_b32_sdwa v2, s2, v2 dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:DWORD src1_sel:BYTE_1 +; GFX10-DL-NEXT: v_and_b32_sdwa v3, s2, v3 dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:DWORD src1_sel:BYTE_1 +; GFX10-DL-NEXT: s_bfe_u32 s3, s4, 0x80010 +; GFX10-DL-NEXT: s_bfe_u32 s2, s5, 0x80010 +; GFX10-DL-NEXT: v_mad_u32_u24 v4, s0, s1, v4 +; GFX10-DL-NEXT: s_lshr_b32 s0, s4, 24 +; GFX10-DL-NEXT: s_lshr_b32 s1, s5, 24 +; GFX10-DL-NEXT: v_mad_u32_u24 v2, v2, v3, v4 +; GFX10-DL-NEXT: v_mad_u32_u24 v2, s3, s2, v2 ; GFX10-DL-NEXT: v_mad_u32_u24 v2, s0, s1, v2 ; GFX10-DL-NEXT: global_store_dword v[0:1], v2, off ; GFX10-DL-NEXT: s_endpgm @@ -1938,9 +1940,9 @@ define amdgpu_kernel void @udot4_acc16_v ; GFX10-DL-NEXT: v_mov_b32_e32 v1, s1 ; GFX10-DL-NEXT: global_load_ushort v3, v[0:1], off ; GFX10-DL-NEXT: s_waitcnt lgkmcnt(0) -; GFX10-DL-NEXT: v_and_b32_sdwa v4, s2, v2 dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:BYTE_1 src1_sel:DWORD +; GFX10-DL-NEXT: v_lshrrev_b16_e64 v4, 8, s2 ; GFX10-DL-NEXT: v_and_b32_sdwa v7, v2, s2 dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:DWORD src1_sel:BYTE_0 -; GFX10-DL-NEXT: v_and_b32_sdwa v5, s3, v2 dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:BYTE_1 src1_sel:DWORD +; GFX10-DL-NEXT: v_lshrrev_b16_e64 v5, 8, s3 ; GFX10-DL-NEXT: v_and_b32_sdwa v6, v2, s3 dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:DWORD src1_sel:BYTE_0 ; GFX10-DL-NEXT: s_lshr_b32 s0, s2, 16 ; GFX10-DL-NEXT: s_lshr_b32 s1, s3, 16 @@ -2150,40 +2152,36 @@ define amdgpu_kernel void @udot4_acc8_ve ; GFX10-DL: ; 
%bb.0: ; %entry ; GFX10-DL-NEXT: s_load_dwordx4 s[4:7], s[0:1], 0x24 ; GFX10-DL-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x34 -; GFX10-DL-NEXT: v_mov_b32_e32 v2, 0xffff -; GFX10-DL-NEXT: s_movk_i32 s2, 0xff ; GFX10-DL-NEXT: ; implicit-def: $vcc_hi ; GFX10-DL-NEXT: s_waitcnt lgkmcnt(0) -; GFX10-DL-NEXT: s_load_dword s3, s[4:5], 0x0 -; GFX10-DL-NEXT: s_load_dword s4, s[6:7], 0x0 +; GFX10-DL-NEXT: s_load_dword s2, s[4:5], 0x0 +; GFX10-DL-NEXT: s_load_dword s3, s[6:7], 0x0 ; GFX10-DL-NEXT: v_mov_b32_e32 v0, s0 ; GFX10-DL-NEXT: v_mov_b32_e32 v1, s1 -; GFX10-DL-NEXT: global_load_ubyte v3, v[0:1], off +; GFX10-DL-NEXT: global_load_ubyte v2, v[0:1], off ; GFX10-DL-NEXT: s_waitcnt lgkmcnt(0) -; GFX10-DL-NEXT: v_and_b32_sdwa v4, s3, v2 dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:BYTE_1 src1_sel:DWORD -; GFX10-DL-NEXT: v_and_b32_sdwa v5, s4, v2 dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:BYTE_1 src1_sel:DWORD -; GFX10-DL-NEXT: s_lshr_b32 s0, s3, 24 -; GFX10-DL-NEXT: s_lshr_b32 s1, s3, 16 -; GFX10-DL-NEXT: v_mul_lo_u16_e64 v6, s3, s4 -; GFX10-DL-NEXT: s_lshr_b32 s3, s4, 16 -; GFX10-DL-NEXT: v_mul_lo_u16_e64 v4, v4, v5 -; GFX10-DL-NEXT: s_lshr_b32 s4, s4, 24 -; GFX10-DL-NEXT: v_and_b32_sdwa v5, v6, s2 dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:WORD_0 src1_sel:DWORD -; GFX10-DL-NEXT: v_mul_lo_u16_e64 v6, s1, s3 -; GFX10-DL-NEXT: v_and_b32_sdwa v4, v4, v2 dst_sel:BYTE_1 dst_unused:UNUSED_PAD src0_sel:DWORD src1_sel:DWORD -; GFX10-DL-NEXT: v_mul_lo_u16_e64 v7, s0, s4 -; GFX10-DL-NEXT: v_or_b32_sdwa v4, v5, v4 dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:DWORD src1_sel:WORD_0 -; GFX10-DL-NEXT: v_and_b32_sdwa v5, v6, s2 dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:WORD_0 src1_sel:DWORD -; GFX10-DL-NEXT: v_and_b32_sdwa v2, v7, v2 dst_sel:BYTE_1 dst_unused:UNUSED_PAD src0_sel:DWORD src1_sel:DWORD -; GFX10-DL-NEXT: v_and_b32_e32 v4, 0xffff, v4 -; GFX10-DL-NEXT: v_or_b32_sdwa v2, v5, v2 dst_sel:WORD_1 dst_unused:UNUSED_PAD src0_sel:DWORD src1_sel:WORD_0 -; GFX10-DL-NEXT: v_or_b32_e32 
v2, v4, v2
-; GFX10-DL-NEXT: v_lshrrev_b32_e32 v5, 8, v2
+; GFX10-DL-NEXT: v_lshrrev_b16_e64 v3, 8, s2
+; GFX10-DL-NEXT: v_lshrrev_b16_e64 v4, 8, s3
+; GFX10-DL-NEXT: s_lshr_b32 s0, s2, 24
+; GFX10-DL-NEXT: s_lshr_b32 s1, s3, 24
+; GFX10-DL-NEXT: s_lshr_b32 s4, s2, 16
+; GFX10-DL-NEXT: v_mul_lo_u16_e64 v5, s2, s3
+; GFX10-DL-NEXT: v_mul_lo_u16_e64 v3, v3, v4
+; GFX10-DL-NEXT: v_mul_lo_u16_e64 v4, s0, s1
+; GFX10-DL-NEXT: s_lshr_b32 s0, s3, 16
+; GFX10-DL-NEXT: v_lshlrev_b16_e64 v3, 8, v3
+; GFX10-DL-NEXT: v_lshlrev_b16_e64 v4, 8, v4
+; GFX10-DL-NEXT: v_or_b32_sdwa v3, v5, v3 dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:BYTE_0 src1_sel:DWORD
+; GFX10-DL-NEXT: v_mul_lo_u16_e64 v5, s4, s0
+; GFX10-DL-NEXT: v_and_b32_e32 v3, 0xffff, v3
+; GFX10-DL-NEXT: v_or_b32_sdwa v4, v5, v4 dst_sel:WORD_1 dst_unused:UNUSED_PAD src0_sel:BYTE_0 src1_sel:DWORD
+; GFX10-DL-NEXT: v_or_b32_e32 v4, v3, v4
+; GFX10-DL-NEXT: v_lshrrev_b32_e32 v5, 8, v4
 ; GFX10-DL-NEXT: s_waitcnt vmcnt(0)
-; GFX10-DL-NEXT: v_add_nc_u32_e32 v3, v4, v3
-; GFX10-DL-NEXT: v_add_nc_u32_e32 v3, v3, v5
-; GFX10-DL-NEXT: v_add_nc_u32_sdwa v3, v3, v2 dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:DWORD src1_sel:WORD_1
-; GFX10-DL-NEXT: v_add_nc_u32_sdwa v2, v3, v2 dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:DWORD src1_sel:BYTE_3
+; GFX10-DL-NEXT: v_add_nc_u32_e32 v2, v3, v2
+; GFX10-DL-NEXT: v_add_nc_u32_e32 v2, v2, v5
+; GFX10-DL-NEXT: v_add_nc_u32_sdwa v2, v2, v4 dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:DWORD src1_sel:WORD_1
+; GFX10-DL-NEXT: v_add_nc_u32_sdwa v2, v2, v4 dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:DWORD src1_sel:BYTE_3
 ; GFX10-DL-NEXT: global_store_byte v[0:1], v2, off
 ; GFX10-DL-NEXT: s_endpgm
 <4 x i8> addrspace(1)* %src2,

Modified: llvm/trunk/test/CodeGen/AMDGPU/idot8s.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/idot8s.ll?rev=374092&r1=374091&r2=374092&view=diff
==============================================================================
--- llvm/trunk/test/CodeGen/AMDGPU/idot8s.ll (original)
+++ llvm/trunk/test/CodeGen/AMDGPU/idot8s.ll Tue Oct 8 10:36:38 2019
@@ -473,49 +473,47 @@ define amdgpu_kernel void @idot8_acc16(<
 ; GFX10-DL: ; %bb.0: ; %entry
 ; GFX10-DL-NEXT: s_load_dwordx4 s[4:7], s[0:1], 0x24
 ; GFX10-DL-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x34
-; GFX10-DL-NEXT: v_mov_b32_e32 v2, 0xffff
+; GFX10-DL-NEXT: s_mov_b32 s2, 0xffff
 ; GFX10-DL-NEXT: ; implicit-def: $vcc_hi
 ; GFX10-DL-NEXT: s_waitcnt lgkmcnt(0)
-; GFX10-DL-NEXT: s_load_dword s2, s[4:5], 0x0
-; GFX10-DL-NEXT: s_load_dword s4, s[6:7], 0x0
+; GFX10-DL-NEXT: s_load_dword s4, s[4:5], 0x0
+; GFX10-DL-NEXT: s_load_dword s5, s[6:7], 0x0
 ; GFX10-DL-NEXT: v_mov_b32_e32 v0, s0
 ; GFX10-DL-NEXT: v_mov_b32_e32 v1, s1
-; GFX10-DL-NEXT: global_load_ushort v3, v[0:1], off
+; GFX10-DL-NEXT: global_load_ushort v2, v[0:1], off
 ; GFX10-DL-NEXT: s_waitcnt lgkmcnt(0)
-; GFX10-DL-NEXT: s_lshr_b32 s0, s2, 12
-; GFX10-DL-NEXT: s_lshr_b32 s1, s4, 12
-; GFX10-DL-NEXT: s_bfe_i32 s5, s2, 0x40000
+; GFX10-DL-NEXT: s_lshr_b32 s0, s4, 12
+; GFX10-DL-NEXT: s_lshr_b32 s1, s5, 12
 ; GFX10-DL-NEXT: s_bfe_i32 s6, s4, 0x40000
-; GFX10-DL-NEXT: s_bfe_i32 s7, s2, 0x40004
-; GFX10-DL-NEXT: v_lshlrev_b16_e64 v4, 12, s0
-; GFX10-DL-NEXT: v_lshlrev_b16_e64 v5, 12, s1
-; GFX10-DL-NEXT: s_bfe_i32 s0, s4, 0x40004
-; GFX10-DL-NEXT: s_bfe_i32 s1, s2, 0x40008
-; GFX10-DL-NEXT: s_bfe_i32 s8, s4, 0x40008
-; GFX10-DL-NEXT: v_and_b32_e32 v4, v4, v2
-; GFX10-DL-NEXT: v_and_b32_e32 v5, v5, v2
-; GFX10-DL-NEXT: s_bfe_i32 s9, s2, 0x40010
-; GFX10-DL-NEXT: s_bfe_i32 s10, s4, 0x40010
-; GFX10-DL-NEXT: v_mul_i32_i24_e64 v6, s1, s8
+; GFX10-DL-NEXT: s_bfe_i32 s7, s5, 0x40000
+; GFX10-DL-NEXT: s_bfe_i32 s8, s4, 0x40004
+; GFX10-DL-NEXT: v_lshlrev_b16_e64 v3, 12, s0
+; GFX10-DL-NEXT: v_lshlrev_b16_e64 v4, 12, s1
+; GFX10-DL-NEXT: s_bfe_i32 s9, s5, 0x40004
+; GFX10-DL-NEXT: s_bfe_i32 s10, s4, 0x40008
+; GFX10-DL-NEXT: s_bfe_i32 s11, s5, 0x40008
+; GFX10-DL-NEXT: v_ashrrev_i16_e64 v3, 12, v3
 ; GFX10-DL-NEXT: v_ashrrev_i16_e64 v4, 12, v4
-; GFX10-DL-NEXT: v_ashrrev_i16_e64 v5, 12, v5
-; GFX10-DL-NEXT: s_bfe_i32 s1, s2, 0x40014
-; GFX10-DL-NEXT: s_bfe_i32 s8, s4, 0x40014
-; GFX10-DL-NEXT: s_bfe_i32 s11, s2, 0x40018
-; GFX10-DL-NEXT: v_and_b32_e32 v4, v4, v2
-; GFX10-DL-NEXT: v_and_b32_e32 v2, v5, v2
+; GFX10-DL-NEXT: s_bfe_i32 s0, s4, 0x40010
+; GFX10-DL-NEXT: s_bfe_i32 s1, s5, 0x40010
+; GFX10-DL-NEXT: v_mul_i32_i24_e64 v5, s10, s11
+; GFX10-DL-NEXT: v_and_b32_e32 v3, s2, v3
+; GFX10-DL-NEXT: v_and_b32_e32 v4, s2, v4
+; GFX10-DL-NEXT: s_bfe_i32 s10, s4, 0x40014
+; GFX10-DL-NEXT: s_bfe_i32 s11, s5, 0x40014
 ; GFX10-DL-NEXT: s_bfe_i32 s12, s4, 0x40018
-; GFX10-DL-NEXT: s_ashr_i32 s2, s2, 28
+; GFX10-DL-NEXT: s_bfe_i32 s2, s5, 0x40018
 ; GFX10-DL-NEXT: s_ashr_i32 s4, s4, 28
+; GFX10-DL-NEXT: s_ashr_i32 s5, s5, 28
 ; GFX10-DL-NEXT: s_waitcnt vmcnt(0)
-; GFX10-DL-NEXT: v_mad_i32_i24 v3, s5, s6, v3
-; GFX10-DL-NEXT: v_mad_i32_i24 v3, s7, s0, v3
-; GFX10-DL-NEXT: v_add_nc_u32_sdwa v3, v3, v6 dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:WORD_0 src1_sel:WORD_0
-; GFX10-DL-NEXT: v_mad_u32_u24 v2, v4, v2, v3
-; GFX10-DL-NEXT: v_mad_i32_i24 v2, s9, s10, v2
-; GFX10-DL-NEXT: v_mad_i32_i24 v2, s1, s8, v2
-; GFX10-DL-NEXT: v_mad_i32_i24 v2, s11, s12, v2
-; GFX10-DL-NEXT: v_mad_i32_i24 v2, s2, s4, v2
+; GFX10-DL-NEXT: v_mad_i32_i24 v2, s6, s7, v2
+; GFX10-DL-NEXT: v_mad_i32_i24 v2, s8, s9, v2
+; GFX10-DL-NEXT: v_add_nc_u32_sdwa v2, v2, v5 dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:WORD_0 src1_sel:WORD_0
+; GFX10-DL-NEXT: v_mad_u32_u24 v2, v3, v4, v2
+; GFX10-DL-NEXT: v_mad_i32_i24 v2, s0, s1, v2
+; GFX10-DL-NEXT: v_mad_i32_i24 v2, s10, s11, v2
+; GFX10-DL-NEXT: v_mad_i32_i24 v2, s12, s2, v2
+; GFX10-DL-NEXT: v_mad_i32_i24 v2, s4, s5, v2
 ; GFX10-DL-NEXT: global_store_short v[0:1], v2, off
 ; GFX10-DL-NEXT: s_endpgm
 <8 x i4> addrspace(1)* %src2,
@@ -818,7 +816,6 @@ define amdgpu_kernel void @idot8_acc8(<8
 ; GFX10-DL: ; %bb.0: ; %entry
 ; GFX10-DL-NEXT: s_load_dwordx4 s[4:7], s[0:1], 0x24
 ; GFX10-DL-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x34
-; GFX10-DL-NEXT: v_mov_b32_e32 v2, 0xffff
 ; GFX10-DL-NEXT: s_movk_i32 s2, 0xff
 ; GFX10-DL-NEXT: ; implicit-def: $vcc_hi
 ; GFX10-DL-NEXT: s_waitcnt lgkmcnt(0)
@@ -826,40 +823,38 @@ define amdgpu_kernel void @idot8_acc8(<8
 ; GFX10-DL-NEXT: s_load_dword s5, s[6:7], 0x0
 ; GFX10-DL-NEXT: v_mov_b32_e32 v0, s0
 ; GFX10-DL-NEXT: v_mov_b32_e32 v1, s1
-; GFX10-DL-NEXT: global_load_ubyte v3, v[0:1], off
+; GFX10-DL-NEXT: global_load_ubyte v2, v[0:1], off
 ; GFX10-DL-NEXT: s_waitcnt lgkmcnt(0)
 ; GFX10-DL-NEXT: s_lshr_b32 s0, s4, 12
 ; GFX10-DL-NEXT: s_lshr_b32 s1, s5, 12
 ; GFX10-DL-NEXT: s_bfe_i32 s6, s4, 0x40000
 ; GFX10-DL-NEXT: s_bfe_i32 s7, s5, 0x40000
 ; GFX10-DL-NEXT: s_bfe_i32 s8, s4, 0x40004
-; GFX10-DL-NEXT: v_lshlrev_b16_e64 v4, 12, s0
-; GFX10-DL-NEXT: v_lshlrev_b16_e64 v5, 12, s1
-; GFX10-DL-NEXT: s_bfe_i32 s0, s5, 0x40004
-; GFX10-DL-NEXT: s_bfe_i32 s1, s4, 0x40008
-; GFX10-DL-NEXT: s_bfe_i32 s9, s5, 0x40008
-; GFX10-DL-NEXT: v_and_b32_e32 v4, v4, v2
-; GFX10-DL-NEXT: v_and_b32_e32 v2, v5, v2
-; GFX10-DL-NEXT: s_bfe_i32 s10, s4, 0x40010
-; GFX10-DL-NEXT: s_bfe_i32 s11, s5, 0x40010
-; GFX10-DL-NEXT: v_mul_i32_i24_e64 v5, s1, s9
+; GFX10-DL-NEXT: v_lshlrev_b16_e64 v3, 12, s0
+; GFX10-DL-NEXT: v_lshlrev_b16_e64 v4, 12, s1
+; GFX10-DL-NEXT: s_bfe_i32 s9, s5, 0x40004
+; GFX10-DL-NEXT: s_bfe_i32 s10, s4, 0x40008
+; GFX10-DL-NEXT: s_bfe_i32 s11, s5, 0x40008
+; GFX10-DL-NEXT: v_ashrrev_i16_e64 v3, 12, v3
 ; GFX10-DL-NEXT: v_ashrrev_i16_e64 v4, 12, v4
-; GFX10-DL-NEXT: v_ashrrev_i16_e64 v2, 12, v2
-; GFX10-DL-NEXT: s_bfe_i32 s1, s4, 0x40014
-; GFX10-DL-NEXT: s_bfe_i32 s9, s5, 0x40014
+; GFX10-DL-NEXT: s_bfe_i32 s0, s4, 0x40010
+; GFX10-DL-NEXT: s_bfe_i32 s1, s5, 0x40010
+; GFX10-DL-NEXT: v_mul_i32_i24_e64 v5, s10, s11
+; GFX10-DL-NEXT: v_and_b32_e32 v3, s2, v3
+; GFX10-DL-NEXT: v_and_b32_e32 v4, s2, v4
+; GFX10-DL-NEXT: s_bfe_i32 s10, s4, 0x40014
+; GFX10-DL-NEXT: s_bfe_i32 s11, s5, 0x40014
 ; GFX10-DL-NEXT: s_bfe_i32 s12, s4, 0x40018
-; GFX10-DL-NEXT: v_and_b32_sdwa v4, v4, s2 dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:WORD_0 src1_sel:DWORD
-; GFX10-DL-NEXT: v_and_b32_sdwa v2, v2, s2 dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:WORD_0 src1_sel:DWORD
 ; GFX10-DL-NEXT: s_bfe_i32 s2, s5, 0x40018
 ; GFX10-DL-NEXT: s_ashr_i32 s4, s4, 28
 ; GFX10-DL-NEXT: s_ashr_i32 s5, s5, 28
 ; GFX10-DL-NEXT: s_waitcnt vmcnt(0)
-; GFX10-DL-NEXT: v_mad_i32_i24 v3, s6, s7, v3
-; GFX10-DL-NEXT: v_mad_i32_i24 v3, s8, s0, v3
-; GFX10-DL-NEXT: v_add_nc_u32_sdwa v3, v3, v5 dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:BYTE_0 src1_sel:BYTE_0
-; GFX10-DL-NEXT: v_mad_u32_u24 v2, v4, v2, v3
+; GFX10-DL-NEXT: v_mad_i32_i24 v2, s6, s7, v2
+; GFX10-DL-NEXT: v_mad_i32_i24 v2, s8, s9, v2
+; GFX10-DL-NEXT: v_add_nc_u32_sdwa v2, v2, v5 dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:BYTE_0 src1_sel:BYTE_0
+; GFX10-DL-NEXT: v_mad_u32_u24 v2, v3, v4, v2
+; GFX10-DL-NEXT: v_mad_i32_i24 v2, s0, s1, v2
 ; GFX10-DL-NEXT: v_mad_i32_i24 v2, s10, s11, v2
-; GFX10-DL-NEXT: v_mad_i32_i24 v2, s1, s9, v2
 ; GFX10-DL-NEXT: v_mad_i32_i24 v2, s12, s2, v2
 ; GFX10-DL-NEXT: v_mad_i32_i24 v2, s4, s5, v2
 ; GFX10-DL-NEXT: global_store_byte v[0:1], v2, off
@@ -2289,133 +2284,94 @@ define amdgpu_kernel void @idot8_acc8_ve
 ;
 ; GFX10-DL-LABEL: idot8_acc8_vecMul:
 ; GFX10-DL: ; %bb.0: ; %entry
-; GFX10-DL-NEXT: s_load_dwordx2 s[4:5], s[0:1], 0x34
-; GFX10-DL-NEXT: v_mov_b32_e32 v2, 0xffff
-; GFX10-DL-NEXT: s_movk_i32 s2, 0xff
-; GFX10-DL-NEXT: ; implicit-def: $vcc_hi
-; GFX10-DL-NEXT: s_waitcnt lgkmcnt(0)
-; GFX10-DL-NEXT: v_mov_b32_e32 v0, s4
-; GFX10-DL-NEXT: v_mov_b32_e32 v1, s5
 ; GFX10-DL-NEXT: s_load_dwordx4 s[4:7], s[0:1], 0x24
-; GFX10-DL-NEXT: global_load_ubyte v3, v[0:1], off
+; GFX10-DL-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x34
+; GFX10-DL-NEXT: s_mov_b32 s2, 0xffff
+; GFX10-DL-NEXT: ; implicit-def: $vcc_hi
 ; GFX10-DL-NEXT: s_waitcnt lgkmcnt(0)
-; GFX10-DL-NEXT: s_load_dword s0, s[4:5], 0x0
-; GFX10-DL-NEXT: s_load_dword s1, s[6:7], 0x0
-; GFX10-DL-NEXT: s_mov_b32 s4, 0xffff
+; GFX10-DL-NEXT: s_load_dword s4, s[4:5], 0x0
+; GFX10-DL-NEXT: s_load_dword s5, s[6:7], 0x0
+; GFX10-DL-NEXT: v_mov_b32_e32 v0, s0
+; GFX10-DL-NEXT: v_mov_b32_e32 v1, s1
+; GFX10-DL-NEXT: global_load_ubyte v2, v[0:1], off
 ; GFX10-DL-NEXT: s_waitcnt lgkmcnt(0)
-; GFX10-DL-NEXT: s_lshr_b32 s9, s0, 4
-; GFX10-DL-NEXT: s_lshr_b32 s16, s1, 4
-; GFX10-DL-NEXT: v_lshlrev_b16_e64 v4, 12, s0
-; GFX10-DL-NEXT: v_lshlrev_b16_e64 v5, 12, s1
-; GFX10-DL-NEXT: s_lshr_b32 s10, s0, 8
-; GFX10-DL-NEXT: v_lshlrev_b16_e64 v8, 12, s9
-; GFX10-DL-NEXT: v_lshlrev_b16_e64 v15, 12, s16
-; GFX10-DL-NEXT: s_lshr_b32 s11, s0, 12
-; GFX10-DL-NEXT: s_lshr_b32 s17, s1, 8
-; GFX10-DL-NEXT: s_lshr_b32 s18, s1, 12
-; GFX10-DL-NEXT: v_lshlrev_b16_e64 v7, 12, s10
-; GFX10-DL-NEXT: v_and_b32_e32 v4, v4, v2
-; GFX10-DL-NEXT: v_and_b32_e32 v5, v5, v2
-; GFX10-DL-NEXT: v_and_b32_e32 v8, v8, v2
-; GFX10-DL-NEXT: v_lshlrev_b16_e64 v13, 12, s18
-; GFX10-DL-NEXT: v_and_b32_e32 v15, v15, v2
-; GFX10-DL-NEXT: s_lshr_b32 s7, s0, 24
-; GFX10-DL-NEXT: v_lshlrev_b16_e64 v23, 12, s11
-; GFX10-DL-NEXT: v_lshlrev_b16_e64 v31, 12, s17
-; GFX10-DL-NEXT: v_and_b32_e32 v7, v7, v2
+; GFX10-DL-NEXT: s_lshr_b32 s0, s4, 4
+; GFX10-DL-NEXT: s_lshr_b32 s1, s5, 4
+; GFX10-DL-NEXT: s_lshr_b32 s6, s4, 12
+; GFX10-DL-NEXT: s_lshr_b32 s7, s5, 12
+; GFX10-DL-NEXT: v_lshlrev_b16_e64 v6, 12, s4
+; GFX10-DL-NEXT: v_lshlrev_b16_e64 v3, 12, s0
+; GFX10-DL-NEXT: v_lshlrev_b16_e64 v4, 12, s1
+; GFX10-DL-NEXT: v_lshlrev_b16_e64 v5, 12, s6
+; GFX10-DL-NEXT: v_lshlrev_b16_e64 v7, 12, s7
+; GFX10-DL-NEXT: v_lshlrev_b16_e64 v8, 12, s5
+; GFX10-DL-NEXT: s_lshr_b32 s8, s4, 8
+; GFX10-DL-NEXT: s_lshr_b32 s0, s5, 8
+; GFX10-DL-NEXT: v_ashrrev_i16_e64 v3, 12, v3
 ; GFX10-DL-NEXT: v_ashrrev_i16_e64 v4, 12, v4
-; GFX10-DL-NEXT: v_and_b32_e32 v13, v13, v2
-; GFX10-DL-NEXT: v_and_b32_e32 v6, v23, v2
 ; GFX10-DL-NEXT: v_ashrrev_i16_e64 v5, 12, v5
+; GFX10-DL-NEXT: v_lshlrev_b16_e64 v9, 12, s8
+; GFX10-DL-NEXT: v_lshlrev_b16_e64 v10, 12, s0
+; GFX10-DL-NEXT: v_ashrrev_i16_e64 v7, 12, v7
+; GFX10-DL-NEXT: v_mul_lo_u16_e64 v3, v3, v4
+; GFX10-DL-NEXT: v_ashrrev_i16_e64 v6, 12, v6
 ; GFX10-DL-NEXT: v_ashrrev_i16_e64 v8, 12, v8
-; GFX10-DL-NEXT: v_ashrrev_i16_e64 v15, 12, v15
-; GFX10-DL-NEXT: v_lshlrev_b16_e64 v27, 12, s7
-; GFX10-DL-NEXT: v_and_b32_e32 v14, v31, v2
-; GFX10-DL-NEXT: v_ashrrev_i16_e64 v23, 12, v6
+; GFX10-DL-NEXT: v_ashrrev_i16_e64 v4, 12, v9
+; GFX10-DL-NEXT: v_mul_lo_u16_e64 v5, v5, v7
+; GFX10-DL-NEXT: s_lshr_b32 s0, s4, 20
+; GFX10-DL-NEXT: v_ashrrev_i16_e64 v9, 12, v10
+; GFX10-DL-NEXT: s_lshr_b32 s1, s5, 20
+; GFX10-DL-NEXT: v_mul_lo_u16_e64 v6, v6, v8
+; GFX10-DL-NEXT: v_lshlrev_b16_e64 v3, 8, v3
+; GFX10-DL-NEXT: v_lshlrev_b16_e64 v7, 12, s0
+; GFX10-DL-NEXT: s_lshr_b32 s8, s5, 16
+; GFX10-DL-NEXT: v_lshlrev_b16_e64 v8, 12, s1
+; GFX10-DL-NEXT: s_lshr_b32 s9, s5, 28
+; GFX10-DL-NEXT: v_or_b32_sdwa v3, v6, v3 dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:BYTE_0 src1_sel:DWORD
+; GFX10-DL-NEXT: s_lshr_b32 s7, s4, 28
+; GFX10-DL-NEXT: s_lshr_b32 s6, s4, 16
+; GFX10-DL-NEXT: v_mul_lo_u16_e64 v4, v4, v9
+; GFX10-DL-NEXT: v_lshlrev_b16_e64 v5, 8, v5
+; GFX10-DL-NEXT: v_lshlrev_b16_e64 v10, 12, s9
+; GFX10-DL-NEXT: s_lshr_b32 s0, s4, 24
+; GFX10-DL-NEXT: s_lshr_b32 s1, s5, 24
+; GFX10-DL-NEXT: v_lshlrev_b16_e64 v6, 12, s7
+; GFX10-DL-NEXT: v_lshlrev_b16_e64 v9, 12, s6
+; GFX10-DL-NEXT: v_and_b32_e32 v3, s2, v3
+; GFX10-DL-NEXT: v_or_b32_sdwa v4, v4, v5 dst_sel:WORD_1 dst_unused:UNUSED_PAD src0_sel:BYTE_0 src1_sel:DWORD
 ; GFX10-DL-NEXT: v_ashrrev_i16_e64 v7, 12, v7
-; GFX10-DL-NEXT: v_ashrrev_i16_e64 v13, 12, v13
-; GFX10-DL-NEXT: v_and_b32_e32 v10, v27, v2
-; GFX10-DL-NEXT: s_lshr_b32 s5, s0, 16
-; GFX10-DL-NEXT: s_lshr_b32 s6, s0, 20
-; GFX10-DL-NEXT: s_lshr_b32 s12, s1, 16
-; GFX10-DL-NEXT: s_lshr_b32 s13, s1, 20
-; GFX10-DL-NEXT: v_and_b32_e32 v4, v4, v2
-; GFX10-DL-NEXT: v_and_b32_e32 v5, v5, v2
-; GFX10-DL-NEXT: v_ashrrev_i16_e64 v27, 12, v14
-; GFX10-DL-NEXT: v_and_b32_e32 v8, v8, v2
-; GFX10-DL-NEXT: v_and_b32_e32 v15, v15, v2
-; GFX10-DL-NEXT: v_lshlrev_b16_e64 v11, 12, s6
-; GFX10-DL-NEXT: v_mul_lo_u16_e64 v4, v4, v5
-; GFX10-DL-NEXT: v_lshlrev_b16_e64 v12, 12, s5
-; GFX10-DL-NEXT: v_lshlrev_b16_e64 v19, 12, s12
-; GFX10-DL-NEXT: s_lshr_b32 s8, s0, 28
-; GFX10-DL-NEXT: s_lshr_b32 s14, s1, 24
-; GFX10-DL-NEXT: s_lshr_b32 s15, s1, 28
-; GFX10-DL-NEXT: v_lshlrev_b16_e64 v35, 12, s13
-; GFX10-DL-NEXT: v_and_b32_e32 v6, v23, v2
-; GFX10-DL-NEXT: v_and_b32_e32 v7, v7, v2
-; GFX10-DL-NEXT: v_and_b32_e32 v5, v27, v2
-; GFX10-DL-NEXT: v_and_b32_e32 v13, v13, v2
-; GFX10-DL-NEXT: v_mul_lo_u16_e64 v8, v8, v15
-; GFX10-DL-NEXT: v_lshlrev_b16_e64 v9, 12, s8
-; GFX10-DL-NEXT: v_lshlrev_b16_e64 v16, 12, s15
-; GFX10-DL-NEXT: v_mul_lo_u16_e64 v5, v7, v5
-; GFX10-DL-NEXT: v_lshlrev_b16_e64 v17, 12, s14
-; GFX10-DL-NEXT: v_and_b32_e32 v11, v11, v2
-; GFX10-DL-NEXT: v_and_b32_e32 v12, v12, v2
-; GFX10-DL-NEXT: v_and_b32_e32 v18, v35, v2
-; GFX10-DL-NEXT: v_and_b32_e32 v19, v19, v2
-; GFX10-DL-NEXT: v_mul_lo_u16_e64 v15, v6, v13
-; GFX10-DL-NEXT: v_and_b32_sdwa v4, v4, s2 dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:WORD_0 src1_sel:DWORD
-; GFX10-DL-NEXT: v_and_b32_sdwa v7, v8, v2 dst_sel:BYTE_1 dst_unused:UNUSED_PAD src0_sel:DWORD src1_sel:DWORD
-; GFX10-DL-NEXT: v_and_b32_e32 v9, v9, v2
-; GFX10-DL-NEXT: v_and_b32_e32 v16, v16, v2
-; GFX10-DL-NEXT: v_and_b32_e32 v17, v17, v2
+; GFX10-DL-NEXT: v_ashrrev_i16_e64 v8, 12, v8
+; GFX10-DL-NEXT: v_lshlrev_b16_e64 v13, 12, s8
+; GFX10-DL-NEXT: v_lshlrev_b16_e64 v11, 12, s0
+; GFX10-DL-NEXT: v_lshlrev_b16_e64 v12, 12, s1
+; GFX10-DL-NEXT: v_ashrrev_i16_e64 v15, 12, v9
+; GFX10-DL-NEXT: v_ashrrev_i16_e64 v19, 12, v6
+; GFX10-DL-NEXT: v_ashrrev_i16_e64 v9, 12, v13
+; GFX10-DL-NEXT: v_ashrrev_i16_e64 v10, 12, v10
+; GFX10-DL-NEXT: v_or_b32_e32 v4, v3, v4
+; GFX10-DL-NEXT: v_mul_lo_u16_e64 v7, v7, v8
 ; GFX10-DL-NEXT: v_ashrrev_i16_e64 v11, 12, v11
-; GFX10-DL-NEXT: v_or_b32_sdwa v4, v4, v7 dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:DWORD src1_sel:WORD_0
 ; GFX10-DL-NEXT: v_ashrrev_i16_e64 v12, 12, v12
-; GFX10-DL-NEXT: v_ashrrev_i16_e64 v35, 12, v18
-; GFX10-DL-NEXT: v_ashrrev_i16_e64 v19, 12, v19
-; GFX10-DL-NEXT: v_and_b32_sdwa v5, v5, s2 dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:WORD_0 src1_sel:DWORD
-; GFX10-DL-NEXT: v_and_b32_sdwa v6, v15, v2 dst_sel:BYTE_1 dst_unused:UNUSED_PAD src0_sel:DWORD src1_sel:DWORD
-; GFX10-DL-NEXT: v_ashrrev_i16_e64 v9, 12, v9
-; GFX10-DL-NEXT: v_ashrrev_i16_e64 v31, 12, v10
-; GFX10-DL-NEXT: v_ashrrev_i16_e64 v16, 12, v16
-; GFX10-DL-NEXT: v_and_b32_e32 v7, v11, v2
-; GFX10-DL-NEXT: v_or_b32_sdwa v5, v5, v6 dst_sel:WORD_1 dst_unused:UNUSED_PAD src0_sel:DWORD src1_sel:WORD_0
-; GFX10-DL-NEXT: v_and_b32_e32 v4, s4, v4
-; GFX10-DL-NEXT: v_ashrrev_i16_e64 v17, 12, v17
-; GFX10-DL-NEXT: v_and_b32_e32 v10, v12, v2
-; GFX10-DL-NEXT: v_and_b32_e32 v11, v19, v2
-; GFX10-DL-NEXT: v_and_b32_e32 v6, v35, v2
-; GFX10-DL-NEXT: v_and_b32_e32 v8, v9, v2
-; GFX10-DL-NEXT: v_and_b32_e32 v13, v16, v2
-; GFX10-DL-NEXT: v_and_b32_e32 v9, v31, v2
-; GFX10-DL-NEXT: v_mul_lo_u16_e64 v10, v10, v11
-; GFX10-DL-NEXT: v_and_b32_e32 v12, v17, v2
-; GFX10-DL-NEXT: v_mul_lo_u16_e64 v11, v7, v6
-; GFX10-DL-NEXT: v_or_b32_e32 v5, v4, v5
-; GFX10-DL-NEXT: v_mul_lo_u16_e64 v8, v8, v13
-; GFX10-DL-NEXT: v_mul_lo_u16_e64 v7, v9, v12
-; GFX10-DL-NEXT: v_and_b32_sdwa v9, v10, s2 dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:WORD_0 src1_sel:DWORD
-; GFX10-DL-NEXT: v_lshrrev_b32_e32 v10, 8, v5
-; GFX10-DL-NEXT: v_and_b32_sdwa v7, v7, s2 dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:WORD_0 src1_sel:DWORD
+; GFX10-DL-NEXT: v_mul_lo_u16_e64 v6, v19, v10
+; GFX10-DL-NEXT: v_mul_lo_u16_e64 v5, v15, v9
+; GFX10-DL-NEXT: v_lshrrev_b32_e32 v8, 8, v4
+; GFX10-DL-NEXT: v_lshlrev_b16_e64 v6, 8, v6
 ; GFX10-DL-NEXT: s_waitcnt vmcnt(0)
-; GFX10-DL-NEXT: v_add_nc_u32_e32 v3, v4, v3
-; GFX10-DL-NEXT: v_and_b32_sdwa v4, v11, v2 dst_sel:BYTE_1 dst_unused:UNUSED_PAD src0_sel:DWORD src1_sel:DWORD
-; GFX10-DL-NEXT: v_and_b32_sdwa v2, v8, v2 dst_sel:BYTE_1 dst_unused:UNUSED_PAD src0_sel:DWORD src1_sel:DWORD
-; GFX10-DL-NEXT: v_add_nc_u32_e32 v3, v3, v10
-; GFX10-DL-NEXT: v_or_b32_sdwa v4, v9, v4 dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:DWORD src1_sel:WORD_0
-; GFX10-DL-NEXT: v_or_b32_sdwa v2, v7, v2 dst_sel:WORD_1 dst_unused:UNUSED_PAD src0_sel:DWORD src1_sel:WORD_0
-; GFX10-DL-NEXT: v_add_nc_u32_sdwa v3, v3, v5 dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:BYTE_0 src1_sel:BYTE_2
-; GFX10-DL-NEXT: v_and_b32_e32 v4, s4, v4
-; GFX10-DL-NEXT: v_add_nc_u32_sdwa v3, v3, v5 dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:DWORD src1_sel:BYTE_3
-; GFX10-DL-NEXT: v_or_b32_e32 v2, v4, v2
-; GFX10-DL-NEXT: v_add_nc_u32_e32 v3, v3, v4
-; GFX10-DL-NEXT: v_lshrrev_b32_e32 v4, 8, v2
-; GFX10-DL-NEXT: v_add_nc_u32_e32 v3, v3, v4
-; GFX10-DL-NEXT: v_add_nc_u32_sdwa v3, v3, v2 dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:DWORD src1_sel:WORD_1
-; GFX10-DL-NEXT: v_add_nc_u32_sdwa v2, v3, v2 dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:DWORD src1_sel:BYTE_3
+; GFX10-DL-NEXT: v_add_nc_u32_e32 v2, v3, v2
+; GFX10-DL-NEXT: v_lshlrev_b16_e64 v3, 8, v7
+; GFX10-DL-NEXT: v_mul_lo_u16_e64 v7, v11, v12
+; GFX10-DL-NEXT: v_add_nc_u32_e32 v2, v2, v8
+; GFX10-DL-NEXT: v_or_b32_sdwa v3, v5, v3 dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:BYTE_0 src1_sel:DWORD
+; GFX10-DL-NEXT: v_or_b32_sdwa v5, v7, v6 dst_sel:WORD_1 dst_unused:UNUSED_PAD src0_sel:BYTE_0 src1_sel:DWORD
+; GFX10-DL-NEXT: v_add_nc_u32_sdwa v2, v2, v4 dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:BYTE_0 src1_sel:BYTE_2
+; GFX10-DL-NEXT: v_and_b32_e32 v3, s2, v3
+; GFX10-DL-NEXT: v_add_nc_u32_sdwa v2, v2, v4 dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:DWORD src1_sel:BYTE_3
+; GFX10-DL-NEXT: v_or_b32_e32 v4, v3, v5
+; GFX10-DL-NEXT: v_add_nc_u32_e32 v2, v2, v3
+; GFX10-DL-NEXT: v_lshrrev_b32_e32 v3, 8, v4
+; GFX10-DL-NEXT: v_add_nc_u32_e32 v2, v2, v3
+; GFX10-DL-NEXT: v_add_nc_u32_sdwa v2, v2, v4 dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:DWORD src1_sel:WORD_1
+; GFX10-DL-NEXT: v_add_nc_u32_sdwa v2, v2, v4 dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:DWORD src1_sel:BYTE_3
 ; GFX10-DL-NEXT: global_store_byte v[0:1], v2, off
 ; GFX10-DL-NEXT: s_endpgm
 <8 x i4> addrspace(1)* %src2,

Modified: llvm/trunk/test/CodeGen/AMDGPU/idot8u.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/idot8u.ll?rev=374092&r1=374091&r2=374092&view=diff
==============================================================================
--- llvm/trunk/test/CodeGen/AMDGPU/idot8u.ll (original)
+++ llvm/trunk/test/CodeGen/AMDGPU/idot8u.ll Tue Oct 8 10:36:38 2019
@@ -2550,7 +2550,6 @@ define amdgpu_kernel void @udot8_acc8_ve
 ; GFX10-DL: ; %bb.0: ; %entry
 ; GFX10-DL-NEXT: s_load_dwordx4 s[4:7], s[0:1], 0x24
 ; GFX10-DL-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x34
-; GFX10-DL-NEXT: v_mov_b32_e32 v2, 0xffff
 ; GFX10-DL-NEXT: s_mov_b32 s2, 0xffff
 ; GFX10-DL-NEXT: ; implicit-def: $vcc_hi
 ; GFX10-DL-NEXT: s_waitcnt lgkmcnt(0)
@@ -2558,7 +2557,7 @@ define amdgpu_kernel void @udot8_acc8_ve
 ; GFX10-DL-NEXT: s_load_dword s5, s[6:7], 0x0
 ; GFX10-DL-NEXT: v_mov_b32_e32 v0, s0
 ; GFX10-DL-NEXT: v_mov_b32_e32 v1, s1
-; GFX10-DL-NEXT: global_load_ubyte v3, v[0:1], off
+; GFX10-DL-NEXT: global_load_ubyte v2, v[0:1], off
 ; GFX10-DL-NEXT: s_waitcnt lgkmcnt(0)
 ; GFX10-DL-NEXT: s_bfe_u32 s0, s4, 0x40004
 ; GFX10-DL-NEXT: s_bfe_u32 s1, s5, 0x40004
@@ -2566,47 +2565,47 @@ define amdgpu_kernel void @udot8_acc8_ve
 ; GFX10-DL-NEXT: s_and_b32 s8, s5, 15
 ; GFX10-DL-NEXT: s_bfe_u32 s7, s4, 0x4000c
 ; GFX10-DL-NEXT: s_bfe_u32 s9, s5, 0x4000c
-; GFX10-DL-NEXT: v_mul_lo_u16_e64 v4, s0, s1
+; GFX10-DL-NEXT: v_mul_lo_u16_e64 v3, s0, s1
 ; GFX10-DL-NEXT: s_bfe_u32 s0, s4, 0x40008
-; GFX10-DL-NEXT: v_mul_lo_u16_e64 v5, s6, s8
+; GFX10-DL-NEXT: v_mul_lo_u16_e64 v4, s6, s8
 ; GFX10-DL-NEXT: s_bfe_u32 s1, s5, 0x40008
-; GFX10-DL-NEXT: v_mul_lo_u16_e64 v11, s7, s9
-; GFX10-DL-NEXT: v_and_b32_sdwa v4, v4, v2 dst_sel:BYTE_1 dst_unused:UNUSED_PAD src0_sel:DWORD src1_sel:DWORD
+; GFX10-DL-NEXT: v_mul_lo_u16_e64 v5, s7, s9
+; GFX10-DL-NEXT: v_lshlrev_b16_e64 v3, 8, v3
 ; GFX10-DL-NEXT: s_bfe_u32 s6, s4, 0x40014
 ; GFX10-DL-NEXT: s_lshr_b32 s7, s4, 28
-; GFX10-DL-NEXT: v_mul_lo_u16_e64 v7, s0, s1
-; GFX10-DL-NEXT: v_and_b32_sdwa v6, v11, v2 dst_sel:BYTE_1 dst_unused:UNUSED_PAD src0_sel:DWORD src1_sel:DWORD
-; GFX10-DL-NEXT: v_or_b32_sdwa v4, v5, v4 dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:WORD_0 src1_sel:WORD_0
+; GFX10-DL-NEXT: v_mul_lo_u16_e64 v6, s0, s1
+; GFX10-DL-NEXT: v_lshlrev_b16_e64 v5, 8, v5
+; GFX10-DL-NEXT: v_or_b32_e32 v3, v4, v3
 ; GFX10-DL-NEXT: s_bfe_u32 s0, s5, 0x40014
 ; GFX10-DL-NEXT: s_lshr_b32 s9, s5, 28
 ; GFX10-DL-NEXT: s_bfe_u32 s1, s4, 0x40010
-; GFX10-DL-NEXT: v_or_b32_sdwa v5, v7, v6 dst_sel:WORD_1 dst_unused:UNUSED_PAD src0_sel:WORD_0 src1_sel:WORD_0
-; GFX10-DL-NEXT: v_and_b32_e32 v4, s2, v4
-; GFX10-DL-NEXT: v_mul_lo_u16_e64 v11, s6, s0
+; GFX10-DL-NEXT: v_or_b32_sdwa v4, v6, v5 dst_sel:WORD_1 dst_unused:UNUSED_PAD src0_sel:DWORD src1_sel:DWORD
+; GFX10-DL-NEXT: v_and_b32_e32 v3, s2, v3
+; GFX10-DL-NEXT: v_mul_lo_u16_e64 v5, s6, s0
 ; GFX10-DL-NEXT: s_bfe_u32 s8, s5, 0x40010
 ; GFX10-DL-NEXT: s_bfe_u32 s0, s4, 0x40018
 ; GFX10-DL-NEXT: s_bfe_u32 s4, s5, 0x40018
-; GFX10-DL-NEXT: v_or_b32_e32 v5, v4, v5
-; GFX10-DL-NEXT: v_and_b32_sdwa v6, v11, v2 dst_sel:BYTE_1 dst_unused:UNUSED_PAD src0_sel:DWORD src1_sel:DWORD
-; GFX10-DL-NEXT: v_mul_lo_u16_e64 v7, s1, s8
-; GFX10-DL-NEXT: v_mul_lo_u16_e64 v8, s7, s9
-; GFX10-DL-NEXT: v_mul_lo_u16_e64 v9, s0, s4
-; GFX10-DL-NEXT: v_lshrrev_b32_e32 v10, 8, v5
-; GFX10-DL-NEXT: v_or_b32_sdwa v6, v7, v6 dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:WORD_0 src1_sel:WORD_0
-; GFX10-DL-NEXT: v_and_b32_sdwa v2, v8, v2 dst_sel:BYTE_1 dst_unused:UNUSED_PAD src0_sel:DWORD src1_sel:DWORD
-; GFX10-DL-NEXT: v_and_b32_e32 v6, s2, v6
-; GFX10-DL-NEXT: v_or_b32_sdwa v7, v9, v2 dst_sel:WORD_1 dst_unused:UNUSED_PAD src0_sel:WORD_0 src1_sel:WORD_0
-; GFX10-DL-NEXT: v_or_b32_e32 v2, v6, v7
-; GFX10-DL-NEXT: v_lshrrev_b32_e32 v14, 8, v2
+; GFX10-DL-NEXT: v_or_b32_e32 v4, v3, v4
+; GFX10-DL-NEXT: v_lshlrev_b16_e64 v5, 8, v5
+; GFX10-DL-NEXT: v_mul_lo_u16_e64 v6, s1, s8
+; GFX10-DL-NEXT: v_mul_lo_u16_e64 v7, s7, s9
+; GFX10-DL-NEXT: v_mul_lo_u16_e64 v8, s0, s4
+; GFX10-DL-NEXT: v_lshrrev_b32_e32 v9, 8, v4
+; GFX10-DL-NEXT: v_or_b32_e32 v5, v6, v5
+; GFX10-DL-NEXT: v_lshlrev_b16_e64 v6, 8, v7
+; GFX10-DL-NEXT: v_and_b32_e32 v5, s2, v5
+; GFX10-DL-NEXT: v_or_b32_sdwa v6, v8, v6 dst_sel:WORD_1 dst_unused:UNUSED_PAD src0_sel:DWORD src1_sel:DWORD
+; GFX10-DL-NEXT: v_or_b32_e32 v11, v5, v6
+; GFX10-DL-NEXT: v_lshrrev_b32_e32 v7, 8, v11
 ; GFX10-DL-NEXT: s_waitcnt vmcnt(0)
-; GFX10-DL-NEXT: v_add_nc_u32_e32 v3, v4, v3
-; GFX10-DL-NEXT: v_add_nc_u32_e32 v3, v3, v10
-; GFX10-DL-NEXT: v_add_nc_u32_sdwa v3, v3, v5 dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:BYTE_0 src1_sel:BYTE_2
-; GFX10-DL-NEXT: v_add_nc_u32_sdwa v3, v3, v5 dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:DWORD src1_sel:BYTE_3
-; GFX10-DL-NEXT: v_add_nc_u32_e32 v3, v3, v6
-; GFX10-DL-NEXT: v_add_nc_u32_e32 v3, v3, v14
-; GFX10-DL-NEXT: v_add_nc_u32_sdwa v3, v3, v2 dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:DWORD src1_sel:WORD_1
-; GFX10-DL-NEXT: v_add_nc_u32_sdwa v2, v3, v2 dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:DWORD src1_sel:BYTE_3
+; GFX10-DL-NEXT: v_add_nc_u32_e32 v2, v3, v2
+; GFX10-DL-NEXT: v_add_nc_u32_e32 v2, v2, v9
+; GFX10-DL-NEXT: v_add_nc_u32_sdwa v2, v2, v4 dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:BYTE_0 src1_sel:BYTE_2
+; GFX10-DL-NEXT: v_add_nc_u32_sdwa v2, v2, v4 dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:DWORD src1_sel:BYTE_3
+; GFX10-DL-NEXT: v_add_nc_u32_e32 v2, v2, v5
+; GFX10-DL-NEXT: v_add_nc_u32_e32 v2, v2, v7
+; GFX10-DL-NEXT: v_add_nc_u32_sdwa v2, v2, v11 dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:DWORD src1_sel:WORD_1
+; GFX10-DL-NEXT: v_add_nc_u32_sdwa v2, v2, v11 dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:DWORD src1_sel:BYTE_3
 ; GFX10-DL-NEXT: global_store_byte v[0:1], v2, off
 ; GFX10-DL-NEXT: s_endpgm
 <8 x i4> addrspace(1)* %src2,

Modified: llvm/trunk/test/CodeGen/AMDGPU/preserve-hi16.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/preserve-hi16.ll?rev=374092&r1=374091&r2=374092&view=diff
==============================================================================
--- llvm/trunk/test/CodeGen/AMDGPU/preserve-hi16.ll (original)
+++ llvm/trunk/test/CodeGen/AMDGPU/preserve-hi16.ll Tue Oct 8 10:36:38 2019
@@ -3,8 +3,8 @@
 ; GCN-LABEL: {{^}}shl_i16:
 ; GCN: v_lshlrev_b16_e{{32|64}} [[OP:v[0-9]+]],
-; GFX9-NEXT: s_setpc_b64
-; GFX10: v_and_b32_e32 v{{[0-9]+}}, 0xffff, [[OP]]
+; GFX10-NEXT: ; implicit-def: $vcc_hi
+; GCN-NEXT: s_setpc_b64
 define i16 @shl_i16(i16 %x, i16 %y) {
   %res = shl i16 %x, %y
   ret i16 %res
@@ -12,8 +12,8 @@ define i16 @shl_i16(i16 %x, i16 %y) {
 ; GCN-LABEL: {{^}}lshr_i16:
 ; GCN: v_lshrrev_b16_e{{32|64}} [[OP:v[0-9]+]],
-; GFX9-NEXT: s_setpc_b64
-; GFX10: v_and_b32_e32 v{{[0-9]+}}, 0xffff, [[OP]]
+; GFX10-NEXT: ; implicit-def: $vcc_hi
+; GCN-NEXT: s_setpc_b64
 define i16 @lshr_i16(i16 %x, i16 %y) {
   %res = lshr i16 %x, %y
   ret i16 %res
@@ -21,8 +21,8 @@ define i16 @lshr_i16(i16 %x, i16 %y) {
 ; GCN-LABEL: {{^}}ashr_i16:
 ; GCN: v_ashrrev_i16_e{{32|64}} [[OP:v[0-9]+]],
-; GFX9-NEXT: s_setpc_b64
-; GFX10: v_and_b32_e32 v{{[0-9]+}}, 0xffff, [[OP]]
+; GFX10-NEXT: ; implicit-def: $vcc_hi
+; GCN-NEXT: s_setpc_b64
 define i16 @ashr_i16(i16 %x, i16 %y) {
   %res = ashr i16 %x, %y
   ret i16 %res
@@ -30,8 +30,8 @@ define i16 @ashr_i16(i16 %x, i16 %y) {
 ; GCN-LABEL: {{^}}add_u16:
 ; GCN: v_add_{{(nc_)*}}u16_e{{32|64}} [[OP:v[0-9]+]],
-; GFX9-NEXT: s_setpc_b64
-; GFX10: v_and_b32_e32 v{{[0-9]+}}, 0xffff, [[OP]]
+; GFX10-NEXT: ; implicit-def: $vcc_hi
+; GCN-NEXT: s_setpc_b64
 define i16 @add_u16(i16 %x, i16 %y) {
   %res = add i16 %x, %y
   ret i16 %res
@@ -39,8 +39,8 @@ define i16 @add_u16(i16 %x, i16 %y) {
 ; GCN-LABEL: {{^}}sub_u16:
 ; GCN: v_sub_{{(nc_)*}}u16_e{{32|64}} [[OP:v[0-9]+]],
-; GFX9-NEXT: s_setpc_b64
-; GFX10: v_and_b32_e32 v{{[0-9]+}}, 0xffff, [[OP]]
+; GFX10-NEXT: ; implicit-def: $vcc_hi
+; GCN-NEXT: s_setpc_b64
 define i16 @sub_u16(i16 %x, i16 %y) {
   %res = sub i16 %x, %y
   ret i16 %res
@@ -48,8 +48,8 @@ define i16 @sub_u16(i16 %x, i16 %y) {
 ; GCN-LABEL: {{^}}mul_lo_u16:
 ; GCN: v_mul_lo_u16_e{{32|64}} [[OP:v[0-9]+]],
-; GFX9-NEXT: s_setpc_b64
-; GFX10: v_and_b32_e32 v{{[0-9]+}}, 0xffff, [[OP]]
+; GFX10-NEXT: ; implicit-def: $vcc_hi
+; GCN-NEXT: s_setpc_b64
 define i16 @mul_lo_u16(i16 %x, i16 %y) {
   %res = mul i16 %x, %y
   ret i16 %res
@@ -57,8 +57,8 @@ define i16 @mul_lo_u16(i16 %x, i16 %y) {
 ; GCN-LABEL: {{^}}min_u16:
 ; GCN: v_min_u16_e{{32|64}} [[OP:v[0-9]+]],
-; GFX9-NEXT: s_setpc_b64
-; GFX10: v_and_b32_e32 v{{[0-9]+}}, 0xffff, [[OP]]
+; GFX10-NEXT: ; implicit-def: $vcc_hi
+; GCN-NEXT: s_setpc_b64
 define i16 @min_u16(i16 %x, i16 %y) {
   %cmp = icmp ule i16 %x, %y
   %res = select i1 %cmp, i16 %x, i16 %y
@@ -67,8 +67,8 @@ define i16 @min_u16(i16 %x, i16 %y) {
 ; GCN-LABEL: {{^}}min_i16:
 ; GCN: v_min_i16_e{{32|64}} [[OP:v[0-9]+]],
-; GFX9-NEXT: s_setpc_b64
-; GFX10: v_and_b32_e32 v{{[0-9]+}}, 0xffff, [[OP]]
+; GFX10-NEXT: ; implicit-def: $vcc_hi
+; GCN-NEXT: s_setpc_b64
 define i16 @min_i16(i16 %x, i16 %y) {
   %cmp = icmp sle i16 %x, %y
   %res = select i1 %cmp, i16 %x, i16 %y
@@ -77,8 +77,8 @@ define i16 @min_i16(i16 %x, i16 %y) {
 ; GCN-LABEL: {{^}}max_u16:
 ; GCN: v_max_u16_e{{32|64}} [[OP:v[0-9]+]],
-; GFX9-NEXT: s_setpc_b64
-; GFX10: v_and_b32_e32 v{{[0-9]+}}, 0xffff, [[OP]]
+; GFX10-NEXT: ; implicit-def: $vcc_hi
+; GCN-NEXT: s_setpc_b64
 define i16 @max_u16(i16 %x, i16 %y) {
   %cmp = icmp uge i16 %x, %y
   %res = select i1 %cmp, i16 %x, i16 %y
@@ -87,10 +87,124 @@ define i16 @max_u16(i16 %x, i16 %y) {
 ; GCN-LABEL: {{^}}max_i16:
 ; GCN: v_max_i16_e{{32|64}} [[OP:v[0-9]+]],
-; GFX9-NEXT: s_setpc_b64
-; GFX10: v_and_b32_e32 v{{[0-9]+}}, 0xffff, [[OP]]
+; GFX10-NEXT: ; implicit-def: $vcc_hi
+; GCN-NEXT: s_setpc_b64
 define i16 @max_i16(i16 %x, i16 %y) {
   %cmp = icmp sge i16 %x, %y
   %res = select i1 %cmp, i16 %x, i16 %y
   ret i16 %res
 }
+
+; GCN-LABEL: {{^}}shl_i16_zext_i32:
+; GCN: v_lshlrev_b16_e{{32|64}} [[OP:v[0-9]+]],
+; GFX10-NEXT: ; implicit-def: $vcc_hi
+; GFX10-NEXT: v_and_b32_e32 v{{[0-9]+}}, 0xffff, [[OP]]
+; GCN-NEXT: s_setpc_b64
+define i32 @shl_i16_zext_i32(i16 %x, i16 %y) {
+  %res = shl i16 %x, %y
+  %zext = zext i16 %res to i32
+  ret i32 %zext
+}
+
+; GCN-LABEL: {{^}}lshr_i16_zext_i32:
+; GCN: v_lshrrev_b16_e{{32|64}} [[OP:v[0-9]+]],
+; GFX10-NEXT: ; implicit-def: $vcc_hi
+; GFX10-NEXT: v_and_b32_e32 v{{[0-9]+}}, 0xffff, [[OP]]
+; GCN-NEXT: s_setpc_b64
+define i32 @lshr_i16_zext_i32(i16 %x, i16 %y) {
+  %res = lshr i16 %x, %y
+  %zext = zext i16 %res to i32
+  ret i32 %zext
+}
+
+; GCN-LABEL: {{^}}ashr_i16_zext_i32:
+; GCN: v_ashrrev_i16_e{{32|64}} [[OP:v[0-9]+]],
+; GFX10-NEXT: ; implicit-def: $vcc_hi
+; GFX10-NEXT: v_and_b32_e32 v{{[0-9]+}}, 0xffff, [[OP]]
+; GCN-NEXT: s_setpc_b64
+define i32 @ashr_i16_zext_i32(i16 %x, i16 %y) {
+  %res = ashr i16 %x, %y
+  %zext = zext i16 %res to i32
+  ret i32 %zext
+}
+
+; GCN-LABEL: {{^}}add_u16_zext_i32:
+; GCN: v_add_{{(nc_)*}}u16_e{{32|64}} [[OP:v[0-9]+]],
+; GFX10-NEXT: ; implicit-def: $vcc_hi
+; GFX10-NEXT: v_and_b32_e32 v{{[0-9]+}}, 0xffff, [[OP]]
+; GCN-NEXT: s_setpc_b64
+define i32 @add_u16_zext_i32(i16 %x, i16 %y) {
+  %res = add i16 %x, %y
+  %zext = zext i16 %res to i32
+  ret i32 %zext
+}
+
+; GCN-LABEL: {{^}}sub_u16_zext_i32:
+; GCN: v_sub_{{(nc_)*}}u16_e{{32|64}} [[OP:v[0-9]+]],
+; GFX10-NEXT: ; implicit-def: $vcc_hi
+; GFX10-NEXT: v_and_b32_e32 v{{[0-9]+}}, 0xffff, [[OP]]
+; GCN-NEXT: s_setpc_b64
+define i32 @sub_u16_zext_i32(i16 %x, i16 %y) {
+  %res = sub i16 %x, %y
+  %zext = zext i16 %res to i32
+  ret i32 %zext
+}
+
+; GCN-LABEL: {{^}}mul_lo_u16_zext_i32:
+; GCN: v_mul_lo_u16_e{{32|64}} [[OP:v[0-9]+]],
+; GFX10-NEXT: ; implicit-def: $vcc_hi
+; GFX10-NEXT: v_and_b32_e32 v{{[0-9]+}}, 0xffff, [[OP]]
+; GCN-NEXT: s_setpc_b64
+define i32 @mul_lo_u16_zext_i32(i16 %x, i16 %y) {
+  %res = mul i16 %x, %y
+  %zext = zext i16 %res to i32
+  ret i32 %zext
+}
+
+; GCN-LABEL: {{^}}min_u16_zext_i32:
+; GCN: v_min_u16_e{{32|64}} [[OP:v[0-9]+]],
+; GFX10-NEXT: ; implicit-def: $vcc_hi
+; GFX10-NEXT: v_and_b32_e32 v{{[0-9]+}}, 0xffff, [[OP]]
+; GCN-NEXT: s_setpc_b64
+define i32 @min_u16_zext_i32(i16 %x, i16 %y) {
+  %cmp = icmp ule i16 %x, %y
+  %res = select i1 %cmp, i16 %x, i16 %y
+  %zext = zext i16 %res to i32
+  ret i32 %zext
+}
+
+; GCN-LABEL: {{^}}min_i16_zext_i32:
+; GCN: v_min_i16_e{{32|64}} [[OP:v[0-9]+]],
+; GFX10-NEXT: ; implicit-def: $vcc_hi
+; GFX10-NEXT: v_and_b32_e32 v{{[0-9]+}}, 0xffff, [[OP]]
+; GCN-NEXT: s_setpc_b64
+define i32 @min_i16_zext_i32(i16 %x, i16 %y) {
+  %cmp = icmp sle i16 %x, %y
+  %res = select i1 %cmp, i16 %x, i16 %y
+  %zext = zext i16 %res to i32
+  ret i32 %zext
+}
+
+; GCN-LABEL: {{^}}max_u16_zext_i32:
+; GCN: v_max_u16_e{{32|64}} [[OP:v[0-9]+]],
+; GFX10-NEXT: ; implicit-def: $vcc_hi
+; GFX10-NEXT: v_and_b32_e32 v{{[0-9]+}}, 0xffff, [[OP]]
+; GCN-NEXT: s_setpc_b64
+define i32 @max_u16_zext_i32(i16 %x, i16 %y) {
+  %cmp = icmp uge i16 %x, %y
+  %res = select i1 %cmp, i16 %x, i16 %y
+  %zext = zext i16 %res to i32
+  ret i32 %zext
+}
+
+; GCN-LABEL: {{^}}max_i16_zext_i32:
+; GCN: v_max_i16_e{{32|64}} [[OP:v[0-9]+]],
+; GFX10-NEXT: ; implicit-def: $vcc_hi
+; GFX10-NEXT: v_and_b32_e32 v{{[0-9]+}}, 0xffff, [[OP]]
+; GCN-NEXT: s_setpc_b64
+define i32 @max_i16_zext_i32(i16 %x, i16 %y) {
+  %cmp = icmp sge i16 %x, %y
+  %res = select i1 %cmp, i16 %x, i16 %y
+  %zext = zext i16 %res to i32
+  ret i32 %zext
+}

Modified: llvm/trunk/test/CodeGen/AMDGPU/sdwa-peephole.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/sdwa-peephole.ll?rev=374092&r1=374091&r2=374092&view=diff
==============================================================================
--- llvm/trunk/test/CodeGen/AMDGPU/sdwa-peephole.ll (original)
+++ llvm/trunk/test/CodeGen/AMDGPU/sdwa-peephole.ll Tue Oct 8 10:36:38 2019
@@ -283,11 +283,8 @@ entry:
 ; GFX9: v_or_b32_sdwa v{{[0-9]+}}, v{{[0-9]+}}, v{{[0-9]+}} dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:BYTE_0 src1_sel:DWORD
-; GFX10-DAG: v_and_b32_sdwa v{{[0-9]+}}, v{{[0-9]+}}, v{{[0-9]+}} dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:WORD_0 src1_sel:DWORD
-; GFX10-DAG: v_and_b32_sdwa v{{[0-9]+}}, v{{[0-9]+}}, v{{[0-9]+}} dst_sel:BYTE_1 dst_unused:UNUSED_PAD src0_sel:DWORD src1_sel:DWORD
-; GFX10: v_or_b32_sdwa v{{[0-9]+}}, v{{[0-9]+}}, v{{[0-9]+}} dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:DWORD src1_sel:WORD_0
-
-
+; GFX10: v_lshlrev_b16_e64 v{{[0-9]+}}, 8, v
+; GFX10: v_or_b32_sdwa v{{[0-9]+}}, v{{[0-9]+}}, v{{[0-9]+}} dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:BYTE_0 src1_sel:DWORD
 define amdgpu_kernel void @mul_v2i8(<2 x i8> addrspace(1)* %out, <2 x i8> addrspace(1)* %ina, <2 x i8> addrspace(1)* %inb) {
 entry:
   %a = load <2 x i8>, <2 x i8> addrspace(1)* %ina, align 4
@@ -501,10 +498,10 @@ store_label:
 ;
 ; GFX89: v_or_b32_sdwa v{{[0-9]+}}, v{{[0-9]+}}, v{{[0-9]+}} dst_sel:WORD_1 dst_unused:UNUSED_PAD src0_sel:DWORD src1_sel:DWORD
 ;
-; GFX10: v_or_b32_sdwa v{{[0-9]+}}, v{{[0-9]+}}, v{{[0-9]+}} dst_sel:WORD_1 dst_unused:UNUSED_PAD src0_sel:DWORD src1_sel:WORD_0
-; GFX10: v_or_b32_sdwa v{{[0-9]+}}, v{{[0-9]+}}, v{{[0-9]+}} dst_sel:WORD_1 dst_unused:UNUSED_PAD src0_sel:DWORD src1_sel:WORD_0
-; GFX10: v_or_b32_sdwa v{{[0-9]+}}, v{{[0-9]+}}, v{{[0-9]+}} dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:WORD_0 src1_sel:DWORD
-; GFX10: v_or_b32_sdwa v{{[0-9]+}}, v{{[0-9]+}}, v{{[0-9]+}} dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:WORD_0 src1_sel:DWORD
+; GFX10: v_or_b32_sdwa v{{[0-9]+}}, v{{[0-9]+}}, v{{[0-9]+}} dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:BYTE_0 src1_sel:DWORD
+; GFX10: v_or_b32_sdwa v{{[0-9]+}}, v{{[0-9]+}}, v{{[0-9]+}} dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:BYTE_0 src1_sel:DWORD
+; GFX10: v_or_b32_sdwa v{{[0-9]+}}, v{{[0-9]+}}, v{{[0-9]+}}
dst_sel:WORD_1 dst_unused:UNUSED_PAD src0_sel:DWORD src1_sel:DWORD
+; GFX10: v_or_b32_sdwa v{{[0-9]+}}, v{{[0-9]+}}, v{{[0-9]+}} dst_sel:WORD_1 dst_unused:UNUSED_PAD src0_sel:DWORD src1_sel:DWORD
 define amdgpu_kernel void @pulled_out_test(<8 x i8> addrspace(1)* %sourceA, <8 x i8> addrspace(1)* %destValues) {
 entry:

From llvm-commits at lists.llvm.org Tue Oct 8 10:36:52 2019
From: llvm-commits at lists.llvm.org (David Majnemer via Phabricator via llvm-commits)
Date: Tue, 08 Oct 2019 17:36:52 +0000 (UTC)
Subject: [PATCH] D68570: Unify the two CRC implementations
In-Reply-To:
References:
Message-ID:

majnemer accepted this revision.
majnemer added a comment.

LGTM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68570/new/

https://reviews.llvm.org/D68570

From llvm-commits at lists.llvm.org Tue Oct 8 10:36:53 2019
From: llvm-commits at lists.llvm.org (Alexei Starovoitov via Phabricator via llvm-commits)
Date: Tue, 08 Oct 2019 17:36:53 +0000 (UTC)
Subject: [PATCH] D67980: [BPF] do compile-once run-everywhere relocation for bitfields
In-Reply-To:
References:
Message-ID: <7659bffa92240c5e3c3851948bf48d7e@localhost.localdomain>

ast accepted this revision.
ast added a comment.

perfect. ship it

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D67980/new/

https://reviews.llvm.org/D67980

From llvm-commits at lists.llvm.org Tue Oct 8 10:36:54 2019
From: llvm-commits at lists.llvm.org (Kostya Kortchinsky via Phabricator via llvm-commits)
Date: Tue, 08 Oct 2019 17:36:54 +0000 (UTC)
Subject: [PATCH] D68653: [scudo][standalone] Get statistics in a char buffer
Message-ID:

cryptoad created this revision.
cryptoad added reviewers: morehouse, hctim, vitalybuka, eugenis, cferris.
Herald added subscribers: Sanitizers, delcypher.
Herald added projects: LLVM, Sanitizers.

Following up on D68471, this CL introduces some `getStats` APIs to
gather statistics in char buffers (`ScopedString` really) instead of
printing them out right away. Ultimately `printStats` will just
output the buffer, but that allows us to potentially do some work
on the intermediate buffer, and can be used for a `mallocz` type
of functionality. This allows us to pretty much get rid of all the
`Printf` calls around, but I am keeping the function in for
debugging purposes.

This changes the existing tests to use the new APIs when required.

I will add new tests as suggested in D68471 in another CL.

Repository:
  rCRT Compiler Runtime

https://reviews.llvm.org/D68653

Files:
  lib/scudo/standalone/combined.h
  lib/scudo/standalone/crc32_hw.cpp
  lib/scudo/standalone/primary32.h
  lib/scudo/standalone/primary64.h
  lib/scudo/standalone/quarantine.h
  lib/scudo/standalone/secondary.cpp
  lib/scudo/standalone/secondary.h
  lib/scudo/standalone/size_class_map.h
  lib/scudo/standalone/string_utils.cpp
  lib/scudo/standalone/string_utils.h
  lib/scudo/standalone/tests/primary_test.cpp
  lib/scudo/standalone/tests/quarantine_test.cpp
  lib/scudo/standalone/tests/secondary_test.cpp

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68653.223899.patch
Type: text/x-patch
Size: 16388 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Tue Oct 8 10:36:54 2019
From: llvm-commits at lists.llvm.org (Matt Arsenault via Phabricator via llvm-commits)
Date: Tue, 08 Oct 2019 17:36:54 +0000 (UTC)
Subject: [PATCH] D68635: [AMDGPU] Come back patch for the 'Assign register class for cross block values according to the divergence.'
In-Reply-To:
References:
Message-ID:

arsenm added inline comments.

================
Comment at: llvm/lib/Target/AMDGPU/SIFixSGPRCopies.cpp:653
  }
  case AMDGPU::PHI: {
+   unsigned hasVGPRUses = 0;
----------------
Can you split this into a function?

================
Comment at: llvm/lib/Target/AMDGPU/SIFixSGPRCopies.cpp:654
  case AMDGPU::PHI: {
-   Register Reg = MI.getOperand(0).getReg();
-   if (!TRI->isSGPRClass(MRI.getRegClass(Reg)))
-     break;
-
-   // We don't need to fix the PHI if the common dominator of the
-   // two incoming blocks terminates with a uniform branch.
-   bool HasVGPROperand = phiHasVGPROperands(MI, MRI, TRI, TII);
-   if (MI.getNumExplicitOperands() == 5 && !HasVGPROperand) {
-     MachineBasicBlock *MBB0 = MI.getOperand(2).getMBB();
-     MachineBasicBlock *MBB1 = MI.getOperand(4).getMBB();
-
-     if (!predsHasDivergentTerminator(MBB0, TRI) &&
-         !predsHasDivergentTerminator(MBB1, TRI)) {
-       LLVM_DEBUG(dbgs()
-                  << "Not fixing PHI for uniform branch: " << MI << '\n');
+   unsigned hasVGPRUses = 0;
+   SetVector worklist;
----------------
This isn't a bool, so shouldn't start with has

================
Comment at: llvm/lib/Target/AMDGPU/SIFixSGPRCopies.cpp:664
+ if (UseMI->isCopy() &&
+     Register::isPhysicalRegister(UseMI->getOperand(0).getReg()) &&
+     !TRI->isSGPRReg(MRI, UseMI->getOperand(0).getReg())) {
----------------
.isPhysical()

================
Comment at: llvm/lib/Target/AMDGPU/SIFixSGPRCopies.cpp:680-693
+ unsigned OpNo = UseMI->getOperandNo(&Use);
+ const MCInstrDesc &Desc = TII->get(UseMI->getOpcode());
+ if (!Desc.isPseudo() && Desc.OpInfo &&
+     OpNo < Desc.getNumOperands() &&
+     Desc.OpInfo[OpNo].RegClass != -1) {
+   const TargetRegisterClass *OpRC =
+       TRI->getRegClass(Desc.OpInfo[OpNo].RegClass);
----------------
This seems to be reproducing SIInstrInfo::getOpRegClass

================
Comment at: llvm/lib/Target/AMDGPU/SIFixSGPRCopies.cpp:702
+ const TargetRegisterClass *RC =
+     Register::isVirtualRegister(SrcReg)
+         ? MRI.getRegClass(SrcReg)
----------------
SrcReg.isVirtual()

================
Comment at: llvm/lib/Target/AMDGPU/SIInstrInfo.cpp:3978
  // No VOP2 instructions support AGPRs.
- if (Src0.isReg() && RI.isAGPR(MRI, Src0.getReg())) + if (Src0.isReg() && RI.hasAGPRs(Register::isVirtualRegister(Src0.getReg()) + ? MRI.getRegClass(Src0.getReg()) ---------------- rampitec wrote: > What's wrong with isAGPR() call? .isVirtual() CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68635/new/ https://reviews.llvm.org/D68635 From llvm-commits at lists.llvm.org Tue Oct 8 10:36:55 2019 From: llvm-commits at lists.llvm.org (Matt Arsenault via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 17:36:55 +0000 (UTC) Subject: [PATCH] D68583: AMDGPU: Fix i16 arithmetic pattern redundancy In-Reply-To: References: Message-ID: arsenm closed this revision. arsenm added a comment. r374092 CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68583/new/ https://reviews.llvm.org/D68583 From llvm-commits at lists.llvm.org Tue Oct 8 10:36:55 2019 From: llvm-commits at lists.llvm.org (Johannes Doerfert via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 17:36:55 +0000 (UTC) Subject: [PATCH] D68633: fix debug info affects output when opt inline In-Reply-To: References: Message-ID: <01561c81f11450b9f58bdd01f3a2ae5c@localhost.localdomain> jdoerfert added inline comments. ================ Comment at: llvm/lib/Transforms/Utils/InlineFunction.cpp:1851 ++I; } ---------------- Don't we need to have similar logic here? What happens if there are two allocas, then the dbg intrinsic, then another one? CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68633/new/ https://reviews.llvm.org/D68633 From llvm-commits at lists.llvm.org Tue Oct 8 10:48:07 2019 From: llvm-commits at lists.llvm.org (Digger via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 17:48:07 +0000 (UTC) Subject: [PATCH] D66969: Output XCOFF object text section header and symbol entry for program code In-Reply-To: References: Message-ID: DiggerLin marked an inline comment as done. DiggerLin added inline comments. 
================ Comment at: llvm/lib/MC/XCOFFObjectWriter.cpp:357 + // Now output the auxiliary entry. + W.write(CSectionRef.SymbolTableIndex); + // Parameter typecheck hash. Not supported. ---------------- hubert.reinterpretcast wrote: > sfertile wrote: > > hubert.reinterpretcast wrote: > > > Since the field is named `SectionLen` in `llvm::object::XCOFFCsectAuxEnt32`, a comment is warranted regarding its use also for referencing the containing csect by symbol table index. Please also add a comment in `include/llvm/Object/XCOFFObjectFile.h`. > > I'm not disagreeing with this, but it should be done in a separate patch. > @DiggerLin, please post such a patch (perhaps going so far as to rename the field to `SectionOrLength`) so we do not lose track of this. I created a NFC patch. https://reviews.llvm.org/D68650 Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D66969/new/ https://reviews.llvm.org/D66969 From llvm-commits at lists.llvm.org Tue Oct 8 10:48:09 2019 From: llvm-commits at lists.llvm.org (Matt Arsenault via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 17:48:09 +0000 (UTC) Subject: [PATCH] D51932: [AMDGPU] Fix-up cases where writelane has 2 SGPR operands In-Reply-To: References: Message-ID: <816901803f0935c86a7cb43eb82c9689@localhost.localdomain> arsenm accepted this revision. arsenm added a comment. LGTM Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D51932/new/ https://reviews.llvm.org/D51932 From llvm-commits at lists.llvm.org Tue Oct 8 10:57:33 2019 From: llvm-commits at lists.llvm.org (Reid Kleckner via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 17:57:33 +0000 (UTC) Subject: [PATCH] D68570: Unify the two CRC implementations In-Reply-To: References: Message-ID: <3218b5441084cf22b7c8064ee2527c78@localhost.localdomain> rnk added a comment. Maybe a dumb idea: can we compute the table with constexpr evaluation? 
You could set up a constexpr function that returns a struct that wraps the array, and then the body of the function would construct the array imperatively as the old initialization code did. Set up a constexpr global with an initializer that calls the function. If not all compilers support this, we could instead use static_assert to check that the table is correct under an `#ifdef __clang__`, or whatever conditions apply. One possible drawback of this approach is compile time. If that ends up mattering, it could be ifdef EXPENSIVE_CHECKS or something awful like that. Another drawback is that I have seen MSVC accept globals marked constexpr, but then sometimes they get dynamic initializers anyway. The static_assert approach might be safer. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68570/new/ https://reviews.llvm.org/D68570 From llvm-commits at lists.llvm.org Tue Oct 8 10:57:34 2019 From: llvm-commits at lists.llvm.org (Jason Liu via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 17:57:34 +0000 (UTC) Subject: [PATCH] D68341: [AIX] TOC pseudo expansion for 64bit large + 64bit small + 32bit large modes In-Reply-To: References: Message-ID: <04fefaf492fb158137f264c389d061da@localhost.localdomain> jasonliu added inline comments. ================ Comment at: llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp:909 + + MCSymbol *MOSymbol = getMCSymbolForMO(MO, *this); ---------------- nit: MOSymbol could move down a bit to be closer to where it get used. Also I would want to add const to this MOSymbol. But it seems that a lot of similar MOSymbols in this file do not have const with them, only add const for this one here would create inconsistency. Not sure if it's worth to have a NFC patch to add const for all the MOSymbol in this file first. 
================ Comment at: llvm/lib/Target/PowerPC/PPCInstrInfo.td:3171 (PPCtoc_entry tglobaladdr:$disp, i32:$reg))]>; -def ADDIStocHA : PPCEmitTimePseudo<(outs gprc:$rD), (ins gprc_nor0:$reg, tocentry32:$disp), - "#ADDIStocHA", - [(set i32:$rD, - (PPCtoc_entry i32:$reg, tglobaladdr:$disp))]>; +let hasSideEffects = 0, isReMaterializable = 1 in { +def ADDIStocHA: PPCEmitTimePseudo<(outs gprc:$rD), (ins gprc_nor0:$reg, tocentry32:$disp), ---------------- Curious about what the effect is for adding hasSideEffects and isReMaterializable. We already have ADDIStocHA before for the other targets, but they did not require hasSideEffects = 0, and isReMaterializable = 1. So what is special about the AIX target that we need to add them? Or is it simply an omission before and it's actually needed for the other targets as well? I try to remove this line here and no test case would fail. If we need those, should we add test case for it? ================ Comment at: llvm/test/CodeGen/PowerPC/lower-globaladdr32-aix-asm.ll:2 ; RUN: llc -mtriple powerpc-ibm-aix-xcoff \ -; RUN: -code-model=small < %s | FileCheck %s +; RUN: -code-model=small < %s | FileCheck %s --check-prefix=SMALL + ---------------- Do we want to add -verify-machineinstrs to every llc invocation? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68341/new/ https://reviews.llvm.org/D68341 From llvm-commits at lists.llvm.org Tue Oct 8 10:57:40 2019 From: llvm-commits at lists.llvm.org (David Greene via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 17:57:40 +0000 (UTC) Subject: [PATCH] D67728: Scrub FileCheck regex delimiters from test checks In-Reply-To: References: Message-ID: <1e894402a2d06c8189f768d716dcd38d@localhost.localdomain> greened added a comment. In D67728#1686494 , @greened wrote: > In D67728#1678378 , @xbolva00 wrote: > > > >> I could mock-up a testcase but where does it go? > > > > As you mentioned SCEV, you can add new/update test for SCEV. 
But I dont think this "step" is required. > > > So, that leaves me with a question. What else needs to be done before getting this committed? Ping. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67728/new/ https://reviews.llvm.org/D67728 From llvm-commits at lists.llvm.org Tue Oct 8 10:58:03 2019 From: llvm-commits at lists.llvm.org (Jordan Rupprecht via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 17:58:03 +0000 (UTC) Subject: [PATCH] D66282: [llvm-objcopy][MachO] Implement --remove-section In-Reply-To: References: Message-ID: <495389124024d31fc0b8bfa9f2caecd7@localhost.localdomain> rupprecht accepted this revision. rupprecht added inline comments. ================ Comment at: llvm/tools/llvm-objcopy/MachO/MachOObjcopy.cpp:42-46 if (!Config.OnlySection.empty()) { RemovePred = [&Config, RemovePred](const Section &Sec) { return !Config.OnlySection.matches(Sec.CanonicalName); }; } ---------------- Not related to this patch, but looks like this discards `RemovePred`, e.g. `--strip-all --only-section` will effectively work like `--only-section`. I found this after noticing that `RemovePred` is not referenced in the bit you added, which is actually fine there but error prone if the iterative construction of `RemovePred` is ever reordered. ELF objcopy works like this too, but correctly chains `RemovePred` for the `--only-section` switch. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D66282/new/ https://reviews.llvm.org/D66282 From llvm-commits at lists.llvm.org Tue Oct 8 11:07:10 2019 From: llvm-commits at lists.llvm.org (Paul Robinson via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 18:07:10 +0000 (UTC) Subject: [PATCH] D68639: [MachineScheduler] Add a flag to enable scheduling of cfi instructions In-Reply-To: References: Message-ID: <2feaa40ed500a54cdc6aba5075ecf392@localhost.localdomain> probinson added inline comments. 
================ Comment at: llvm/lib/CodeGen/MachineScheduler.cpp:445 +/// If the option CFIInstructionScheduling is not set, cfi instructions act as +/// scheduling boundaries, otherwise they do. This allows to schedule cfi +/// instructions. ---------------- "If , act as scheduling boundaries, otherwise they do." There seems to be a "not" missing somewhere. ================ Comment at: llvm/lib/CodeGen/ScheduleDAGInstrs.cpp:825 + } + SUnit *SU = MISUnitMap[&MI]; ---------------- I realize you are planning to rework this to handle CFI and DBG instructions more in the same way. But the tests don't appear to exercise the case of having a mix of CFI and DBG instructions, and I am not sure the current code will correctly handle all cases where CFI and DBG instructions are adjacent. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68639/new/ https://reviews.llvm.org/D68639 From llvm-commits at lists.llvm.org Tue Oct 8 11:16:34 2019 From: llvm-commits at lists.llvm.org (Thomas Lively via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 18:16:34 +0000 (UTC) Subject: [PATCH] D68531: [WebAssembly] Add builtin and intrinsic for v8x16.swizzle In-Reply-To: References: Message-ID: <0f63b8e53cf8df9b7baf1d06264f9283@localhost.localdomain> tlively marked an inline comment as done. tlively added a comment. In D68531#1699855 , @aheejin wrote: > > LLVM produces a poison value if the dynamic swizzle indices are greater than the vector size, but the WebAssembly instruction sets the corresponding output lane to zero. > > Where do we set those undef or poison lanes to zero? We don't do that in the toolchain. It's the runtime semantics implemented in engines that set the output lanes to zero. 
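The runtime behaviour described in this reply can be sketched as a small reference model. This is only an illustration under stated assumptions: `V16` and `swizzleModel` are hypothetical names, not LLVM, clang, or engine code, and the model covers only the "out-of-range index yields a zero lane" rule mentioned above.

```cpp
#include <cstddef>
#include <cstdint>

// Sixteen 8-bit lanes, standing in for a v8x16 vector (hypothetical type).
struct V16 {
  std::uint8_t Lane[16];
};

// Hypothetical reference model: each output lane selects a byte of Src using
// the matching lane of Indices. An index of 16 or more yields zero in that
// lane, which is the engine-side behaviour the reply refers to.
constexpr V16 swizzleModel(const V16 &Src, const V16 &Indices) {
  V16 Result{};
  for (std::size_t I = 0; I < 16; ++I) {
    const std::uint8_t Idx = Indices.Lane[I];
    Result.Lane[I] = Idx < 16 ? Src.Lane[Idx] : std::uint8_t{0};
  }
  return Result;
}
```

Nothing in this model is done by the toolchain; it only spells out what an engine is expected to compute when the indices are dynamic.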
================ Comment at: clang/include/clang/Basic/BuiltinsWebAssembly.def:63 // SIMD builtins +TARGET_BUILTIN(__builtin_wasm_swizzle_v8x16, "V16cV16cV16c", "nc", "unimplemented-simd128") + ---------------- aheejin wrote: > Is the second indices vector always v8x16 too? Yes, that's the only version of swizzling we have for now. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68531/new/ https://reviews.llvm.org/D68531 From llvm-commits at lists.llvm.org Tue Oct 8 11:16:34 2019 From: llvm-commits at lists.llvm.org (Paul Robinson via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 18:16:34 +0000 (UTC) Subject: [PATCH] D68270: DWARFDebugLoc: Add a function to get the address range of an entry In-Reply-To: References: Message-ID: <07655c1ac59da5b011acb4cc302aac33@localhost.localdomain> probinson added a comment. Do we care whether llvm-dwarfdump's output bears any similarities to the output from GNU readelf or objdump? There has been a push lately to get the LLVM "binutils" to behave more like GNU's, although AFAIK it hasn't gotten to the DWARF dumping part. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68270/new/ https://reviews.llvm.org/D68270 From llvm-commits at lists.llvm.org Tue Oct 8 11:16:35 2019 From: llvm-commits at lists.llvm.org (Roman Lebedev via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 18:16:35 +0000 (UTC) Subject: [PATCH] D68654: [CVP} Replace SExt with ZExt if the input is known-non-negative Message-ID: lebedev.ri created this revision. lebedev.ri added reviewers: nikic, spatel, reames, dberlin. lebedev.ri added a project: LLVM. Herald added subscribers: jfb, hiraditya. zero-extension is far more friendly for further analysis. While this doesn't directly help with the shift-by-signext problem, this is not unrelated. 
This has the following effect on test-suite (numbers collected after the finish of middle-end module pass manager):

| Statistic | old | new | delta | percent change |
| correlated-value-propagation.NumSExt | 0 | 6026 | 6026 | +100.00% |
| instcount.NumAddInst | 272860 | 271283 | -1577 | -0.58% |
| instcount.NumAllocaInst | 27227 | 27226 | -1 | 0.00% |
| instcount.NumAndInst | 63502 | 63320 | -182 | -0.29% |
| instcount.NumAShrInst | 13498 | 13407 | -91 | -0.67% |
| instcount.NumAtomicCmpXchgInst | 1159 | 1159 | 0 | 0.00% |
| instcount.NumAtomicRMWInst | 5036 | 5036 | 0 | 0.00% |
| instcount.NumBitCastInst | 672482 | 672353 | -129 | -0.02% |
| instcount.NumBrInst | 702768 | 702195 | -573 | -0.08% |
| instcount.NumCallInst | 518285 | 518205 | -80 | -0.02% |
| instcount.NumExtractElementInst | 18481 | 18482 | 1 | 0.01% |
| instcount.NumExtractValueInst | 18290 | 18288 | -2 | -0.01% |
| instcount.NumFAddInst | 139035 | 138963 | -72 | -0.05% |
| instcount.NumFCmpInst | 10358 | 10348 | -10 | -0.10% |
| instcount.NumFDivInst | 30310 | 30302 | -8 | -0.03% |
| instcount.NumFenceInst | 387 | 387 | 0 | 0.00% |
| instcount.NumFMulInst | 93873 | 93806 | -67 | -0.07% |
| instcount.NumFPExtInst | 7148 | 7144 | -4 | -0.06% |
| instcount.NumFPToSIInst | 2823 | 2838 | 15 | 0.53% |
| instcount.NumFPToUIInst | 1251 | 1251 | 0 | 0.00% |
| instcount.NumFPTruncInst | 2195 | 2191 | -4 | -0.18% |
| instcount.NumFSubInst | 92109 | 92103 | -6 | -0.01% |
| instcount.NumGetElementPtrInst | 1221423 | 1219157 | -2266 | -0.19% |
| instcount.NumICmpInst | 479140 | 478929 | -211 | -0.04% |
| instcount.NumIndirectBrInst | 2 | 2 | 0 | 0.00% |
| instcount.NumInsertElementInst | 66089 | 66094 | 5 | 0.01% |
| instcount.NumInsertValueInst | 2032 | 2030 | -2 | -0.10% |
| instcount.NumIntToPtrInst | 19641 | 19641 | 0 | 0.00% |
| instcount.NumInvokeInst | 21789 | 21788 | -1 | 0.00% |
| instcount.NumLandingPadInst | 12051 | 12051 | 0 | 0.00% |
| instcount.NumLoadInst | 880079 | 878673 | -1406 | -0.16% |
| instcount.NumLShrInst | 25919 | 25921 | 2 | 0.01% |
| instcount.NumMulInst | 42416 | 42417 | 1 | 0.00% |
| instcount.NumOrInst | 100826 | 100576 | -250 | -0.25% |
| instcount.NumPHIInst | 315118 | 314092 | -1026 | -0.33% |
| instcount.NumPtrToIntInst | 15933 | 15939 | 6 | 0.04% |
| instcount.NumResumeInst | 2156 | 2156 | 0 | 0.00% |
| instcount.NumRetInst | 84485 | 84484 | -1 | 0.00% |
| instcount.NumSDivInst | 8599 | 8597 | -2 | -0.02% |
| instcount.NumSelectInst | 45577 | 45913 | 336 | 0.74% |
| instcount.NumSExtInst | 84026 | 78278 | -5748 | -6.84% |
| instcount.NumShlInst | 39796 | 39726 | -70 | -0.18% |
| instcount.NumShuffleVectorInst | 100272 | 100292 | 20 | 0.02% |
| instcount.NumSIToFPInst | 29131 | 29113 | -18 | -0.06% |
| instcount.NumSRemInst | 1543 | 1543 | 0 | 0.00% |
| instcount.NumStoreInst | 805394 | 804351 | -1043 | -0.13% |
| instcount.NumSubInst | 61337 | 61414 | 77 | 0.13% |
| instcount.NumSwitchInst | 8527 | 8524 | -3 | -0.04% |
| instcount.NumTruncInst | 60523 | 60484 | -39 | -0.06% |
| instcount.NumUDivInst | 2381 | 2381 | 0 | 0.00% |
| instcount.NumUIToFPInst | 5549 | 5549 | 0 | 0.00% |
| instcount.NumUnreachableInst | 9855 | 9855 | 0 | 0.00% |
| instcount.NumURemInst | 1305 | 1305 | 0 | 0.00% |
| instcount.NumXorInst | 10230 | 10081 | -149 | -1.46% |
| instcount.NumZExtInst | 60353 | 66840 | 6487 | 10.75% |
| instcount.TotalBlocks | 829582 | 829004 | -578 | -0.07% |
| instcount.TotalFuncs | 83818 | 83817 | -1 | 0.00% |
| instcount.TotalInsts | 7316574 | 7308483 | -8091 | -0.11% |

TLDR: we produce 0.11% fewer instructions, 6.84% fewer `sext`, and 10.75% more `zext`. Note that, clearly, not all new `zext`s are produced by this fold.
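As a minimal sketch of why replacing `sext` with `zext` is sound once the input is known non-negative, the two extensions can be modelled on plain integers. The helper names `signExtend32To64` and `zeroExtend32To64` are hypothetical and exist only for this illustration; they are not part of the patch.

```cpp
#include <cstdint>

// Models sext i32 -> i64: the sign bit is replicated into the upper 32 bits.
constexpr std::uint64_t signExtend32To64(std::int32_t X) {
  return static_cast<std::uint64_t>(static_cast<std::int64_t>(X));
}

// Models zext i32 -> i64: the upper 32 bits are filled with zeros.
constexpr std::uint64_t zeroExtend32To64(std::int32_t X) {
  return static_cast<std::uint64_t>(static_cast<std::uint32_t>(X));
}
```

For any input with a clear sign bit both helpers return the same 64-bit pattern, which is the condition the fold establishes through LazyValueInfo before rewriting. For a negative input they differ in the upper 32 bits, so the fold has to bail out when non-negativity cannot be proven.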
(And now i guess it might have been interesting to measure this for D68103 :S) Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D68654 Files: llvm/lib/Transforms/Scalar/CorrelatedValuePropagation.cpp llvm/test/Transforms/CorrelatedValuePropagation/sext.ll Index: llvm/test/Transforms/CorrelatedValuePropagation/sext.ll =================================================================== --- llvm/test/Transforms/CorrelatedValuePropagation/sext.ll +++ llvm/test/Transforms/CorrelatedValuePropagation/sext.ll @@ -18,9 +18,9 @@ ; CHECK-NEXT: [[CMP:%.*]] = icmp sgt i32 [[A]], 1 ; CHECK-NEXT: br i1 [[CMP]], label [[FOR_BODY]], label [[FOR_END:%.*]] ; CHECK: for.body: -; CHECK-NEXT: [[EXT_WIDE:%.*]] = sext i32 [[A]] to i64 -; CHECK-NEXT: call void @use64(i64 [[EXT_WIDE]]) -; CHECK-NEXT: [[EXT]] = trunc i64 [[EXT_WIDE]] to i32 +; CHECK-NEXT: [[EXT_WIDE1:%.*]] = zext i32 [[A]] to i64 +; CHECK-NEXT: call void @use64(i64 [[EXT_WIDE1]]) +; CHECK-NEXT: [[EXT]] = trunc i64 [[EXT_WIDE1]] to i32 ; CHECK-NEXT: br label [[FOR_COND]] ; CHECK: for.end: ; CHECK-NEXT: ret void @@ -85,9 +85,9 @@ ; CHECK-NEXT: [[CMP:%.*]] = icmp sgt i32 [[N:%.*]], 0 ; CHECK-NEXT: br i1 [[CMP]], label [[BB:%.*]], label [[EXIT:%.*]] ; CHECK: bb: -; CHECK-NEXT: [[EXT_WIDE:%.*]] = sext i32 [[N]] to i64 -; CHECK-NEXT: call void @use64(i64 [[EXT_WIDE]]) -; CHECK-NEXT: [[EXT:%.*]] = trunc i64 [[EXT_WIDE]] to i32 +; CHECK-NEXT: [[EXT_WIDE1:%.*]] = zext i32 [[N]] to i64 +; CHECK-NEXT: call void @use64(i64 [[EXT_WIDE1]]) +; CHECK-NEXT: [[EXT:%.*]] = trunc i64 [[EXT_WIDE1]] to i32 ; CHECK-NEXT: br label [[EXIT]] ; CHECK: exit: ; CHECK-NEXT: ret void Index: llvm/lib/Transforms/Scalar/CorrelatedValuePropagation.cpp =================================================================== --- llvm/lib/Transforms/Scalar/CorrelatedValuePropagation.cpp +++ llvm/lib/Transforms/Scalar/CorrelatedValuePropagation.cpp @@ -62,6 +62,7 @@ STATISTIC(NumUDivs, "Number of udivs whose width was decreased"); STATISTIC(NumAShrs, 
"Number of ashr converted to lshr"); STATISTIC(NumSRems, "Number of srem converted to urem"); +STATISTIC(NumSExt, "Number of sext converted to zext"); STATISTIC(NumOverflows, "Number of overflow checks removed"); STATISTIC(NumSaturating, "Number of saturating arithmetics converted to normal arithmetics"); @@ -637,6 +638,27 @@ return true; } +static bool processSExt(SExtInst *SDI, LazyValueInfo *LVI) { + if (SDI->getType()->isVectorTy()) + return false; + + Value *Base = SDI->getOperand(0); + + Constant *Zero = ConstantInt::get(Base->getType(), 0); + if (LVI->getPredicateAt(ICmpInst::ICMP_SGE, Base, Zero, SDI) != + LazyValueInfo::True) + return false; + + ++NumSExt; + auto *ZExt = + CastInst::CreateZExtOrBitCast(Base, SDI->getType(), SDI->getName(), SDI); + ZExt->setDebugLoc(SDI->getDebugLoc()); + SDI->replaceAllUsesWith(ZExt); + SDI->eraseFromParent(); + + return true; +} + static bool processBinOp(BinaryOperator *BinOp, LazyValueInfo *LVI) { using OBO = OverflowingBinaryOperator; @@ -745,6 +767,9 @@ case Instruction::AShr: BBChanged |= processAShr(cast(II), LVI); break; + case Instruction::SExt: + BBChanged |= processSExt(cast(II), LVI); + break; case Instruction::Add: case Instruction::Sub: BBChanged |= processBinOp(cast(II), LVI); -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68654.223904.patch Type: text/x-patch Size: 3250 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 11:23:17 2019 From: llvm-commits at lists.llvm.org (Yonghong Song via llvm-commits) Date: Tue, 08 Oct 2019 18:23:17 -0000 Subject: [llvm] r374099 - [BPF] do compile-once run-everywhere relocation for bitfields Message-ID: <20191008182317.F2D468ED51@lists.llvm.org> Author: yhs Date: Tue Oct 8 11:23:17 2019 New Revision: 374099 URL: http://llvm.org/viewvc/llvm-project?rev=374099&view=rev Log: [BPF] do compile-once run-everywhere relocation for bitfields A bpf specific clang intrinsic is introduced: u32 __builtin_preserve_field_info(member_access, info_kind) Depending on info_kind, different information will be returned to the program. A relocation is also recorded for this builtin so that bpf loader can patch the instruction on the target host. This clang intrinsic is used to get certain information to facilitate struct/union member relocations. The offset relocation is extended by 4 bytes to include relocation kind. Currently supported relocation kinds are enum { FIELD_BYTE_OFFSET = 0, FIELD_BYTE_SIZE, FIELD_EXISTENCE, FIELD_SIGNEDNESS, FIELD_LSHIFT_U64, FIELD_RSHIFT_U64, }; for __builtin_preserve_field_info. The old access offset relocation is covered by FIELD_BYTE_OFFSET = 0. 
An example:

    struct s {
        int a;
        int b1:9;
        int b2:4;
    };
    enum {
        FIELD_BYTE_OFFSET = 0,
        FIELD_BYTE_SIZE,
        FIELD_EXISTENCE,
        FIELD_SIGNEDNESS,
        FIELD_LSHIFT_U64,
        FIELD_RSHIFT_U64,
    };

    void bpf_probe_read(void *, unsigned, const void *);
    int field_read(struct s *arg) {
      unsigned long long ull = 0;
      unsigned offset = __builtin_preserve_field_info(arg->b2, FIELD_BYTE_OFFSET);
      unsigned size = __builtin_preserve_field_info(arg->b2, FIELD_BYTE_SIZE);
    #ifdef USE_PROBE_READ
      bpf_probe_read(&ull, size, (const void *)arg + offset);
      unsigned lshift = __builtin_preserve_field_info(arg->b2, FIELD_LSHIFT_U64);
    #if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
      lshift = lshift + (size << 3) - 64;
    #endif
    #else
      switch(size) {
      case 1:
        ull = *(unsigned char *)((void *)arg + offset); break;
      case 2:
        ull = *(unsigned short *)((void *)arg + offset); break;
      case 4:
        ull = *(unsigned int *)((void *)arg + offset); break;
      case 8:
        ull = *(unsigned long long *)((void *)arg + offset); break;
      }
      unsigned lshift = __builtin_preserve_field_info(arg->b2, FIELD_LSHIFT_U64);
    #endif
      ull <<= lshift;
      if (__builtin_preserve_field_info(arg->b2, FIELD_SIGNEDNESS))
        return (long long)ull >> __builtin_preserve_field_info(arg->b2, FIELD_RSHIFT_U64);
      return ull >> __builtin_preserve_field_info(arg->b2, FIELD_RSHIFT_U64);
    }

There is a minor overhead for bpf_probe_read() on big endian. The code and relocation generated for field_read where bpf_probe_read() is used to access argument data on little endian mode:

    r3 = r1
    r1 = 0
    r1 = 4 <=== relocation (FIELD_BYTE_OFFSET)
    r3 += r1
    r1 = r10
    r1 += -8
    r2 = 4 <=== relocation (FIELD_BYTE_SIZE)
    call bpf_probe_read
    r2 = 51 <=== relocation (FIELD_LSHIFT_U64)
    r1 = *(u64 *)(r10 - 8)
    r1 <<= r2
    r2 = 60 <=== relocation (FIELD_RSHIFT_U64)
    r0 = r1
    r0 >>= r2
    r3 = 1 <=== relocation (FIELD_SIGNEDNESS)
    if r3 == 0 goto LBB0_2
    r1 s>>= r2
    r0 = r1
  LBB0_2:
    exit

Compared to the above code between relocations FIELD_LSHIFT_U64 and FIELD_RSHIFT_U64, the code with big endian mode has four more instructions.
    r1 = 41 <=== relocation (FIELD_LSHIFT_U64)
    r6 += r1
    r6 += -64
    r6 <<= 32
    r6 >>= 32
    r1 = *(u64 *)(r10 - 8)
    r1 <<= r6
    r2 = 60 <=== relocation (FIELD_RSHIFT_U64)

The code and relocation generated when using direct load.

    r2 = 0
    r3 = 4
    r4 = 4
    if r4 s> 3 goto LBB0_3
    if r4 == 1 goto LBB0_5
    if r4 == 2 goto LBB0_6
    goto LBB0_9
  LBB0_6: # %sw.bb1
    r1 += r3
    r2 = *(u16 *)(r1 + 0)
    goto LBB0_9
  LBB0_3: # %entry
    if r4 == 4 goto LBB0_7
    if r4 == 8 goto LBB0_8
    goto LBB0_9
  LBB0_8: # %sw.bb9
    r1 += r3
    r2 = *(u64 *)(r1 + 0)
    goto LBB0_9
  LBB0_5: # %sw.bb
    r1 += r3
    r2 = *(u8 *)(r1 + 0)
    goto LBB0_9
  LBB0_7: # %sw.bb5
    r1 += r3
    r2 = *(u32 *)(r1 + 0)
  LBB0_9: # %sw.epilog
    r1 = 51
    r2 <<= r1
    r1 = 60
    r0 = r2
    r0 >>= r1
    r3 = 1
    if r3 == 0 goto LBB0_11
    r2 s>>= r1
    r0 = r2
  LBB0_11: # %sw.epilog
    exit

Since the verifier is able to do limited constant propagation following branches, the following is the code actually traversed.

    r2 = 0
    r3 = 4 <=== relocation
    r4 = 4 <=== relocation
    if r4 s> 3 goto LBB0_3
  LBB0_3: # %entry
    if r4 == 4 goto LBB0_7
  LBB0_7: # %sw.bb5
    r1 += r3
    r2 = *(u32 *)(r1 + 0)
  LBB0_9: # %sw.epilog
    r1 = 51 <=== relocation
    r2 <<= r1
    r1 = 60 <=== relocation
    r0 = r2
    r0 >>= r1
    r3 = 1
    if r3 == 0 goto LBB0_11
    r2 s>>= r1
    r0 = r2
  LBB0_11: # %sw.epilog
    exit

For the native load case, the load size is calculated to be the same as the load width LLVM would otherwise use to load the value from which the bitfield is then extracted.
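The shift pair in the listings above (51 and 60 for `b2` on little endian) can be checked with a small host-side sketch. It assumes the usual little-endian layout in which the 4-bit `b2` occupies bits 9 to 12 of the 32-bit word at byte offset 4, so that 51 = 64 - 13 and 60 = 64 - 4. `extractField` is a hypothetical helper written for this illustration; it takes the relocated values as ordinary parameters and is not part of the patch or of any BPF loader.

```cpp
#include <cstdint>

// Hypothetical helper: mirrors the tail of field_read(), i.e. shift left so
// the field's top bit lands in bit 63, then shift right to bit 0.
constexpr std::int64_t extractField(std::uint64_t Loaded, unsigned LShift,
                                    unsigned RShift, bool IsSigned) {
  // Dropping everything above the field: its most significant bit is now bit 63.
  const std::uint64_t Shifted = Loaded << LShift;
  // The signed path assumes >> on a negative value is an arithmetic shift.
  // GCC and Clang behave this way, and C++20 guarantees it.
  return IsSigned ? static_cast<std::int64_t>(Shifted) >> RShift
                  : static_cast<std::int64_t>(Shifted >> RShift);
}
```

With `b1` set to all ones and `b2` holding the bit pattern 1101, the loaded word is 0x1BFF; the signed path then yields -3 and the unsigned path 13. On big endian the `bpf_probe_read()` variant first adjusts the left shift as in the example, 41 + (4 << 3) - 64 = 9, before the same two shifts are applied.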
Differential Revision: https://reviews.llvm.org/D67980 Added: llvm/trunk/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-byte-size-1.ll llvm/trunk/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-byte-size-2.ll llvm/trunk/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-byte-size-3.ll llvm/trunk/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-existence-1.ll llvm/trunk/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-existence-2.ll llvm/trunk/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-existence-3.ll llvm/trunk/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-lshift-1.ll llvm/trunk/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-lshift-2.ll llvm/trunk/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-rshift-1.ll llvm/trunk/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-rshift-2.ll llvm/trunk/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-rshift-3.ll llvm/trunk/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-signedness-1.ll llvm/trunk/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-signedness-2.ll llvm/trunk/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-signedness-3.ll llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-fieldinfo-1.ll llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-fieldinfo-2.ll Modified: llvm/trunk/include/llvm/IR/IntrinsicsBPF.td llvm/trunk/lib/Target/BPF/BPF.h llvm/trunk/lib/Target/BPF/BPFAbstractMemberAccess.cpp llvm/trunk/lib/Target/BPF/BPFCORE.h llvm/trunk/lib/Target/BPF/BPFTargetMachine.cpp llvm/trunk/lib/Target/BPF/BTF.h llvm/trunk/lib/Target/BPF/BTFDebug.cpp llvm/trunk/lib/Target/BPF/BTFDebug.h llvm/trunk/test/CodeGen/BPF/CORE/intrinsic-array.ll llvm/trunk/test/CodeGen/BPF/CORE/intrinsic-struct.ll llvm/trunk/test/CodeGen/BPF/CORE/intrinsic-union.ll llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-access-str.ll llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-basic.ll llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-cast-array-1.ll llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-cast-array-2.ll llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-cast-struct-1.ll llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-cast-struct-2.ll 
llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-cast-struct-3.ll llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-cast-union-1.ll llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-cast-union-2.ll llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-end-load.ll llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-end-ret.ll llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-global-1.ll llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-global-2.ll llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-global-3.ll llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-ignore.ll llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-middle-chain.ll llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-multi-array-1.ll llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-multi-array-2.ll llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-multilevel.ll llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-pointer-1.ll llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-pointer-2.ll llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-struct-anonymous.ll llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-struct-array.ll llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-typedef-array.ll llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-typedef-struct.ll llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-typedef-union.ll llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-typedef.ll llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-union.ll Modified: llvm/trunk/include/llvm/IR/IntrinsicsBPF.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/IR/IntrinsicsBPF.td?rev=374099&r1=374098&r2=374099&view=diff ============================================================================== --- llvm/trunk/include/llvm/IR/IntrinsicsBPF.td (original) +++ llvm/trunk/include/llvm/IR/IntrinsicsBPF.td Tue Oct 8 11:23:17 2019 @@ -20,4 +20,7 @@ let TargetPrefix = "bpf" in { // All in Intrinsic<[llvm_i64_ty], [llvm_ptr_ty, llvm_i64_ty], [IntrReadMem]>; def int_bpf_pseudo : GCCBuiltin<"__builtin_bpf_pseudo">, Intrinsic<[llvm_i64_ty], [llvm_i64_ty, llvm_i64_ty]>; + def int_bpf_preserve_field_info : 
GCCBuiltin<"__builtin_bpf_preserve_field_info">, + Intrinsic<[llvm_i32_ty], [llvm_anyptr_ty, llvm_i64_ty], + [IntrNoMem, ImmArg<1>]>; } Modified: llvm/trunk/lib/Target/BPF/BPF.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/BPF/BPF.h?rev=374099&r1=374098&r2=374099&view=diff ============================================================================== --- llvm/trunk/lib/Target/BPF/BPF.h (original) +++ llvm/trunk/lib/Target/BPF/BPF.h Tue Oct 8 11:23:17 2019 @@ -15,7 +15,7 @@ namespace llvm { class BPFTargetMachine; -ModulePass *createBPFAbstractMemberAccess(); +ModulePass *createBPFAbstractMemberAccess(BPFTargetMachine *TM); FunctionPass *createBPFISelDag(BPFTargetMachine &TM); FunctionPass *createBPFMISimplifyPatchablePass(); Modified: llvm/trunk/lib/Target/BPF/BPFAbstractMemberAccess.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/BPF/BPFAbstractMemberAccess.cpp?rev=374099&r1=374098&r2=374099&view=diff ============================================================================== --- llvm/trunk/lib/Target/BPF/BPFAbstractMemberAccess.cpp (original) +++ llvm/trunk/lib/Target/BPF/BPFAbstractMemberAccess.cpp Tue Oct 8 11:23:17 2019 @@ -50,6 +50,28 @@ // addr = preserve_struct_access_index(base, gep_index, di_index) // !llvm.preserve.access.index // +// Bitfield member access needs special attention. User cannot take the +// address of a bitfield acceess. To facilitate kernel verifier +// for easy bitfield code optimization, a new clang intrinsic is introduced: +// uint32_t __builtin_preserve_field_info(member_access, info_kind) +// In IR, a chain with two (or more) intrinsic calls will be generated: +// ... 
+// addr = preserve_struct_access_index(base, 1, 1) !struct s +// uint32_t result = bpf_preserve_field_info(addr, info_kind) +// +// Suppose the info_kind is FIELD_SIGNEDNESS, +// The above two IR intrinsics will be replaced with +// a relocatable insn: +// signness = /* signness of member_access */ +// and signness can be changed by bpf loader based on the +// types on the host. +// +// User can also test whether a field exists or not with +// uint32_t result = bpf_preserve_field_info(member_access, FIELD_EXISTENCE) +// The field will be always available (result = 1) during initial +// compilation, but bpf loader can patch with the correct value +// on the target host where the member_access may or may not be available +// //===----------------------------------------------------------------------===// #include "BPF.h" @@ -88,7 +110,11 @@ class BPFAbstractMemberAccess final : pu public: static char ID; - BPFAbstractMemberAccess() : ModulePass(ID) {} + TargetMachine *TM; + // Add optional BPFTargetMachine parameter so that BPF backend can add the phase + // with target machine to find out the endianness. The default constructor (without + // parameters) is used by the pass manager for managing purposes. + BPFAbstractMemberAccess(BPFTargetMachine *TM = nullptr) : ModulePass(ID), TM(TM) {} struct CallInfo { uint32_t Kind; @@ -96,19 +122,21 @@ public: MDNode *Metadata; Value *Base; }; + typedef std::stack> CallInfoStack; private: enum : uint32_t { BPFPreserveArrayAI = 1, BPFPreserveUnionAI = 2, BPFPreserveStructAI = 3, + BPFPreserveFieldInfoAI = 4, }; std::map GEPGlobals; // A map to link preserve_*_access_index instrinsic calls. std::map> AIChain; // A map to hold all the base preserve_*_access_index instrinsic calls. - // The base call is not an input of any other preserve_*_access_index + // The base call is not an input of any other preserve_* // intrinsics. 
std::map BaseAICalls; @@ -127,6 +155,12 @@ private: bool removePreserveAccessIndexIntrinsic(Module &M); void replaceWithGEP(std::vector &CallList, uint32_t NumOfZerosIndex, uint32_t DIIndex); + bool HasPreserveFieldInfoCall(CallInfoStack &CallStack); + void GetStorageBitRange(DICompositeType *CTy, DIDerivedType *MemberTy, + uint32_t AccessIndex, uint32_t &StartBitOffset, + uint32_t &EndBitOffset); + uint32_t GetFieldInfo(uint32_t InfoKind, DICompositeType *CTy, + uint32_t AccessIndex, uint32_t PatchImm); Value *computeBaseAndAccessKey(CallInst *Call, CallInfo &CInfo, std::string &AccessKey, MDNode *&BaseMeta); @@ -139,8 +173,8 @@ char BPFAbstractMemberAccess::ID = 0; INITIALIZE_PASS(BPFAbstractMemberAccess, DEBUG_TYPE, "abstracting struct/union member accessees", false, false) -ModulePass *llvm::createBPFAbstractMemberAccess() { - return new BPFAbstractMemberAccess(); +ModulePass *llvm::createBPFAbstractMemberAccess(BPFTargetMachine *TM) { + return new BPFAbstractMemberAccess(TM); } bool BPFAbstractMemberAccess::runOnModule(Module &M) { @@ -231,6 +265,16 @@ bool BPFAbstractMemberAccess::IsPreserve CInfo.Base = Call->getArgOperand(0); return true; } + if (GV->getName().startswith("llvm.bpf.preserve.field.info")) { + CInfo.Kind = BPFPreserveFieldInfoAI; + CInfo.Metadata = nullptr; + // Check validity of info_kind as clang did not check this. + uint64_t InfoKind = getConstant(Call->getArgOperand(1)); + if (InfoKind >= BPFCoreSharedInfo::MAX_FIELD_RELOC_KIND) + report_fatal_error("Incorrect info_kind for llvm.bpf.preserve.field.info intrinsic"); + CInfo.AccessIndex = InfoKind; + return true; + } return false; } @@ -306,6 +350,9 @@ bool BPFAbstractMemberAccess::removePres bool BPFAbstractMemberAccess::IsValidAIChain(const MDNode *ParentType, uint32_t ParentAI, const MDNode *ChildType) { + if (!ChildType) + return true; // preserve_field_info, no type comparison needed. 
+ const DIType *PType = stripQualifiers(cast(ParentType)); const DIType *CType = stripQualifiers(cast(ChildType)); @@ -463,7 +510,187 @@ uint64_t BPFAbstractMemberAccess::getCon return CV->getValue().getZExtValue(); } -/// Compute the base of the whole preserve_*_access_index chains, i.e., the base +/// Get the start and the end of storage offset for \p MemberTy. +/// The storage bits are corresponding to the LLVM internal types, +/// and the storage bits for the member determines what load width +/// to use in order to extract the bitfield value. +void BPFAbstractMemberAccess::GetStorageBitRange(DICompositeType *CTy, + DIDerivedType *MemberTy, + uint32_t AccessIndex, + uint32_t &StartBitOffset, + uint32_t &EndBitOffset) { + auto SOff = dyn_cast(MemberTy->getStorageOffsetInBits()); + assert(SOff); + StartBitOffset = SOff->getZExtValue(); + + EndBitOffset = CTy->getSizeInBits(); + uint32_t Index = AccessIndex + 1; + for (; Index < CTy->getElements().size(); ++Index) { + auto Member = cast(CTy->getElements()[Index]); + if (!Member->getStorageOffsetInBits()) { + EndBitOffset = Member->getOffsetInBits(); + break; + } + SOff = dyn_cast(Member->getStorageOffsetInBits()); + assert(SOff); + unsigned BitOffset = SOff->getZExtValue(); + if (BitOffset != StartBitOffset) { + EndBitOffset = BitOffset; + break; + } + } +} + +uint32_t BPFAbstractMemberAccess::GetFieldInfo(uint32_t InfoKind, + DICompositeType *CTy, + uint32_t AccessIndex, + uint32_t PatchImm) { + if (InfoKind == BPFCoreSharedInfo::FIELD_EXISTENCE) + return 1; + + uint32_t Tag = CTy->getTag(); + if (InfoKind == BPFCoreSharedInfo::FIELD_BYTE_OFFSET) { + if (Tag == dwarf::DW_TAG_array_type) { + auto *EltTy = stripQualifiers(CTy->getBaseType()); + PatchImm += AccessIndex * calcArraySize(CTy, 1) * + (EltTy->getSizeInBits() >> 3); + } else if (Tag == dwarf::DW_TAG_structure_type) { + auto *MemberTy = cast(CTy->getElements()[AccessIndex]); + if (!MemberTy->isBitField()) { + PatchImm += MemberTy->getOffsetInBits() >> 3; + 
} else { + auto SOffset = dyn_cast(MemberTy->getStorageOffsetInBits()); + assert(SOffset); + PatchImm += SOffset->getZExtValue() >> 3; + } + } + return PatchImm; + } + + if (InfoKind == BPFCoreSharedInfo::FIELD_BYTE_SIZE) { + if (Tag == dwarf::DW_TAG_array_type) { + auto *EltTy = stripQualifiers(CTy->getBaseType()); + return calcArraySize(CTy, 1) * (EltTy->getSizeInBits() >> 3); + } else { + auto *MemberTy = cast(CTy->getElements()[AccessIndex]); + uint32_t SizeInBits = MemberTy->getSizeInBits(); + if (!MemberTy->isBitField()) + return SizeInBits >> 3; + + unsigned SBitOffset, NextSBitOffset; + GetStorageBitRange(CTy, MemberTy, AccessIndex, SBitOffset, NextSBitOffset); + SizeInBits = NextSBitOffset - SBitOffset; + if (SizeInBits & (SizeInBits - 1)) + report_fatal_error("Unsupported field expression for llvm.bpf.preserve.field.info"); + return SizeInBits >> 3; + } + } + + if (InfoKind == BPFCoreSharedInfo::FIELD_SIGNEDNESS) { + const DIType *BaseTy; + if (Tag == dwarf::DW_TAG_array_type) { + // Signedness only checked when final array elements are accessed. + if (CTy->getElements().size() != 1) + report_fatal_error("Invalid array expression for llvm.bpf.preserve.field.info"); + BaseTy = stripQualifiers(CTy->getBaseType()); + } else { + auto *MemberTy = cast(CTy->getElements()[AccessIndex]); + BaseTy = stripQualifiers(MemberTy->getBaseType()); + } + + // Only basic types and enum types have signedness. + const auto *BTy = dyn_cast(BaseTy); + while (!BTy) { + const auto *CompTy = dyn_cast(BaseTy); + // Report an error if the field expression does not have signedness. 
+ if (!CompTy || CompTy->getTag() != dwarf::DW_TAG_enumeration_type) + report_fatal_error("Invalid field expression for llvm.bpf.preserve.field.info"); + BaseTy = stripQualifiers(CompTy->getBaseType()); + BTy = dyn_cast(BaseTy); + } + uint32_t Encoding = BTy->getEncoding(); + return (Encoding == dwarf::DW_ATE_signed || Encoding == dwarf::DW_ATE_signed_char); + } + + if (InfoKind == BPFCoreSharedInfo::FIELD_LSHIFT_U64) { + // The value is loaded into a value with FIELD_BYTE_SIZE size, + // and then zero or sign extended to U64. + // FIELD_LSHIFT_U64 and FIELD_RSHIFT_U64 are operations + // to extract the original value. + const Triple &Triple = TM->getTargetTriple(); + DIDerivedType *MemberTy = nullptr; + bool IsBitField = false; + uint32_t SizeInBits; + + if (Tag == dwarf::DW_TAG_array_type) { + auto *EltTy = stripQualifiers(CTy->getBaseType()); + SizeInBits = calcArraySize(CTy, 1) * EltTy->getSizeInBits(); + } else { + MemberTy = cast(CTy->getElements()[AccessIndex]); + SizeInBits = MemberTy->getSizeInBits(); + IsBitField = MemberTy->isBitField(); + } + + if (!IsBitField) { + if (SizeInBits > 64) + report_fatal_error("too big field size for llvm.bpf.preserve.field.info"); + return 64 - SizeInBits; + } + + unsigned SBitOffset, NextSBitOffset; + GetStorageBitRange(CTy, MemberTy, AccessIndex, SBitOffset, NextSBitOffset); + if (NextSBitOffset - SBitOffset > 64) + report_fatal_error("too big field size for llvm.bpf.preserve.field.info"); + + unsigned OffsetInBits = MemberTy->getOffsetInBits(); + if (Triple.getArch() == Triple::bpfel) + return SBitOffset + 64 - OffsetInBits - SizeInBits; + else + return OffsetInBits + 64 - NextSBitOffset; + } + + if (InfoKind == BPFCoreSharedInfo::FIELD_RSHIFT_U64) { + DIDerivedType *MemberTy = nullptr; + bool IsBitField = false; + uint32_t SizeInBits; + if (Tag == dwarf::DW_TAG_array_type) { + auto *EltTy = stripQualifiers(CTy->getBaseType()); + SizeInBits = calcArraySize(CTy, 1) * EltTy->getSizeInBits(); + } else { + MemberTy = 
cast(CTy->getElements()[AccessIndex]); + SizeInBits = MemberTy->getSizeInBits(); + IsBitField = MemberTy->isBitField(); + } + + if (!IsBitField) { + if (SizeInBits > 64) + report_fatal_error("too big field size for llvm.bpf.preserve.field.info"); + return 64 - SizeInBits; + } + + unsigned SBitOffset, NextSBitOffset; + GetStorageBitRange(CTy, MemberTy, AccessIndex, SBitOffset, NextSBitOffset); + if (NextSBitOffset - SBitOffset > 64) + report_fatal_error("too big field size for llvm.bpf.preserve.field.info"); + + return 64 - SizeInBits; + } + + llvm_unreachable("Unknown llvm.bpf.preserve.field.info info kind"); +} + +bool BPFAbstractMemberAccess::HasPreserveFieldInfoCall(CallInfoStack &CallStack) { + // This is called in error return path, no need to maintain CallStack. + while (CallStack.size()) { + auto StackElem = CallStack.top(); + if (StackElem.second.Kind == BPFPreserveFieldInfoAI) + return true; + CallStack.pop(); + } + return false; +} + +/// Compute the base of the whole preserve_* intrinsics chains, i.e., the base /// pointer of the first preserve_*_access_index call, and construct the access /// string, which will be the name of a global variable. Value *BPFAbstractMemberAccess::computeBaseAndAccessKey(CallInst *Call, @@ -472,7 +699,7 @@ Value *BPFAbstractMemberAccess::computeB MDNode *&TypeMeta) { Value *Base = nullptr; std::string TypeName; - std::stack> CallStack; + CallInfoStack CallStack; // Put the access chain into a stack with the top as the head of the chain. while (Call) { @@ -492,7 +719,8 @@ Value *BPFAbstractMemberAccess::computeB // int a[10][20]; ... __builtin_preserve_access_index(&a[2][3]) ... // we will skip them. 
uint32_t FirstIndex = 0; - uint32_t AccessOffset = 0; + uint32_t PatchImm = 0; // AccessOffset or the requested field info + uint32_t InfoKind = BPFCoreSharedInfo::FIELD_BYTE_OFFSET; while (CallStack.size()) { auto StackElem = CallStack.top(); Call = StackElem.first; @@ -507,10 +735,12 @@ Value *BPFAbstractMemberAccess::computeB // struct or union type TypeName = Ty->getName(); TypeMeta = Ty; - AccessOffset += FirstIndex * Ty->getSizeInBits() >> 3; + PatchImm += FirstIndex * (Ty->getSizeInBits() >> 3); break; } + assert(CInfo.Kind == BPFPreserveArrayAI); + // Array entries will always be consumed for accumulative initial index. CallStack.pop(); @@ -546,16 +776,22 @@ Value *BPFAbstractMemberAccess::computeB if (CheckElemType) { auto *CTy = dyn_cast(BaseTy); - if (!CTy) + if (!CTy) { + if (HasPreserveFieldInfoCall(CallStack)) + report_fatal_error("Invalid field access for llvm.preserve.field.info intrinsic"); return nullptr; + } unsigned CTag = CTy->getTag(); - if (CTag != dwarf::DW_TAG_structure_type && CTag != dwarf::DW_TAG_union_type) - return nullptr; - else + if (CTag == dwarf::DW_TAG_structure_type || CTag == dwarf::DW_TAG_union_type) { TypeName = CTy->getName(); + } else { + if (HasPreserveFieldInfoCall(CallStack)) + report_fatal_error("Invalid field access for llvm.preserve.field.info intrinsic"); + return nullptr; + } TypeMeta = CTy; - AccessOffset += FirstIndex * CTy->getSizeInBits() >> 3; + PatchImm += FirstIndex * (CTy->getSizeInBits() >> 3); break; } } @@ -569,6 +805,20 @@ Value *BPFAbstractMemberAccess::computeB CInfo = StackElem.second; CallStack.pop(); + if (CInfo.Kind == BPFPreserveFieldInfoAI) + break; + + // If the next Call (the top of the stack) is a BPFPreserveFieldInfoAI, + // the action will be extracting field info. 
+ if (CallStack.size()) { + auto StackElem2 = CallStack.top(); + CallInfo CInfo2 = StackElem2.second; + if (CInfo2.Kind == BPFPreserveFieldInfoAI) { + InfoKind = CInfo2.AccessIndex; + assert(CallStack.size() == 1); + } + } + // Access Index uint64_t AccessIndex = CInfo.AccessIndex; AccessKey += ":" + std::to_string(AccessIndex); @@ -576,20 +826,13 @@ Value *BPFAbstractMemberAccess::computeB MDNode *MDN = CInfo.Metadata; // At this stage, it cannot be pointer type. auto *CTy = cast(stripQualifiers(cast(MDN))); - uint32_t Tag = CTy->getTag(); - if (Tag == dwarf::DW_TAG_structure_type) { - auto *MemberTy = cast(CTy->getElements()[AccessIndex]); - AccessOffset += MemberTy->getOffsetInBits() >> 3; - } else if (Tag == dwarf::DW_TAG_array_type) { - auto *EltTy = stripQualifiers(CTy->getBaseType()); - AccessOffset += AccessIndex * calcArraySize(CTy, 1) * - EltTy->getSizeInBits() >> 3; - } + PatchImm = GetFieldInfo(InfoKind, CTy, AccessIndex, PatchImm); } - // Access key is the type name + access string, uniquely identifying - // one kernel memory access. - AccessKey = TypeName + ":" + std::to_string(AccessOffset) + "$" + AccessKey; + // Access key is the type name + reloc type + patched imm + access string, + // uniquely identifying one relocation. + AccessKey = TypeName + ":" + std::to_string(InfoKind) + ":" + + std::to_string(PatchImm) + "$" + AccessKey; return Base; } @@ -605,22 +848,18 @@ bool BPFAbstractMemberAccess::transformG if (!Base) return false; - // Do the transformation - // For any original GEP Call and Base %2 like - // %4 = bitcast %struct.net_device** %dev1 to i64* - // it is transformed to: - // %6 = load sk_buff:50:$0:0:0:2:0 - // %7 = bitcast %struct.sk_buff* %2 to i8* - // %8 = getelementptr i8, i8* %7, %6 - // %9 = bitcast i8* %8 to i64* - // using %9 instead of %4 - // The original Call inst is removed. 
BasicBlock *BB = Call->getParent(); GlobalVariable *GV; if (GEPGlobals.find(AccessKey) == GEPGlobals.end()) { - GV = new GlobalVariable(M, Type::getInt64Ty(BB->getContext()), false, - GlobalVariable::ExternalLinkage, NULL, AccessKey); + IntegerType *VarType; + if (CInfo.Kind == BPFPreserveFieldInfoAI) + VarType = Type::getInt32Ty(BB->getContext()); // 32bit return value + else + VarType = Type::getInt64Ty(BB->getContext()); // 64bit ptr arith + + GV = new GlobalVariable(M, VarType, false, GlobalVariable::ExternalLinkage, + NULL, AccessKey); GV->addAttribute(BPFCoreSharedInfo::AmaAttr); GV->setMetadata(LLVMContext::MD_preserve_access_index, TypeMeta); GEPGlobals[AccessKey] = GV; @@ -628,6 +867,25 @@ bool BPFAbstractMemberAccess::transformG GV = GEPGlobals[AccessKey]; } + if (CInfo.Kind == BPFPreserveFieldInfoAI) { + // Load the global variable which represents the returned field info. + auto *LDInst = new LoadInst(Type::getInt32Ty(BB->getContext()), GV); + BB->getInstList().insert(Call->getIterator(), LDInst); + Call->replaceAllUsesWith(LDInst); + Call->eraseFromParent(); + return true; + } + + // For any original GEP Call and Base %2 like + // %4 = bitcast %struct.net_device** %dev1 to i64* + // it is transformed to: + // %6 = load sk_buff:50:$0:0:0:2:0 + // %7 = bitcast %struct.sk_buff* %2 to i8* + // %8 = getelementptr i8, i8* %7, %6 + // %9 = bitcast i8* %8 to i64* + // using %9 instead of %4 + // The original Call inst is removed. + // Load the global variable. 
auto *LDInst = new LoadInst(Type::getInt64Ty(BB->getContext()), GV); BB->getInstList().insert(Call->getIterator(), LDInst); Modified: llvm/trunk/lib/Target/BPF/BPFCORE.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/BPF/BPFCORE.h?rev=374099&r1=374098&r2=374099&view=diff ============================================================================== --- llvm/trunk/lib/Target/BPF/BPFCORE.h (original) +++ llvm/trunk/lib/Target/BPF/BPFCORE.h Tue Oct 8 11:23:17 2019 @@ -13,6 +13,16 @@ namespace llvm { class BPFCoreSharedInfo { public: + enum OffsetRelocKind : uint32_t { + FIELD_BYTE_OFFSET = 0, + FIELD_BYTE_SIZE, + FIELD_EXISTENCE, + FIELD_SIGNEDNESS, + FIELD_LSHIFT_U64, + FIELD_RSHIFT_U64, + + MAX_FIELD_RELOC_KIND, + }; /// The attribute attached to globals representing a member offset static const std::string AmaAttr; /// The section name to identify a patchable external global Modified: llvm/trunk/lib/Target/BPF/BPFTargetMachine.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/BPF/BPFTargetMachine.cpp?rev=374099&r1=374098&r2=374099&view=diff ============================================================================== --- llvm/trunk/lib/Target/BPF/BPFTargetMachine.cpp (original) +++ llvm/trunk/lib/Target/BPF/BPFTargetMachine.cpp Tue Oct 8 11:23:17 2019 @@ -94,7 +94,7 @@ TargetPassConfig *BPFTargetMachine::crea void BPFPassConfig::addIRPasses() { - addPass(createBPFAbstractMemberAccess()); + addPass(createBPFAbstractMemberAccess(&getBPFTargetMachine())); TargetPassConfig::addIRPasses(); } Modified: llvm/trunk/lib/Target/BPF/BTF.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/BPF/BTF.h?rev=374099&r1=374098&r2=374099&view=diff ============================================================================== --- llvm/trunk/lib/Target/BPF/BTF.h (original) +++ llvm/trunk/lib/Target/BPF/BTF.h Tue Oct 8 11:23:17 2019 @@ -17,7 +17,7 @@ /// /// The binary layout for .BTF.ext section: /// struct ExtHeader -/// FuncInfo, LineInfo, 
OffsetReloc and ExternReloc subsections +/// FuncInfo, LineInfo, FieldReloc and ExternReloc subsections /// The FuncInfo subsection is defined as below: /// BTFFuncInfo Size /// struct SecFuncInfo for ELF section #1 @@ -32,12 +32,12 @@ /// struct SecLineInfo for ELF section #2 /// A number of struct BPFLineInfo for ELF section #2 /// ... -/// The OffsetReloc subsection is defined as below: -/// BPFOffsetReloc Size -/// struct SecOffsetReloc for ELF section #1 -/// A number of struct BPFOffsetReloc for ELF section #1 -/// struct SecOffsetReloc for ELF section #2 -/// A number of struct BPFOffsetReloc for ELF section #2 +/// The FieldReloc subsection is defined as below: +/// BPFFieldReloc Size +/// struct SecFieldReloc for ELF section #1 +/// A number of struct BPFFieldReloc for ELF section #1 +/// struct SecFieldReloc for ELF section #2 +/// A number of struct BPFFieldReloc for ELF section #2 /// ... /// The ExternReloc subsection is defined as below: /// BPFExternReloc Size @@ -72,11 +72,11 @@ enum { BTFDataSecVarSize = 12, SecFuncInfoSize = 8, SecLineInfoSize = 8, - SecOffsetRelocSize = 8, + SecFieldRelocSize = 8, SecExternRelocSize = 8, BPFFuncInfoSize = 8, BPFLineInfoSize = 16, - BPFOffsetRelocSize = 12, + BPFFieldRelocSize = 16, BPFExternRelocSize = 8, }; @@ -213,8 +213,8 @@ struct ExtHeader { uint32_t FuncInfoLen; ///< Length of func info section uint32_t LineInfoOff; ///< Offset of line info section uint32_t LineInfoLen; ///< Length of line info section - uint32_t OffsetRelocOff; ///< Offset of offset reloc section - uint32_t OffsetRelocLen; ///< Length of offset reloc section + uint32_t FieldRelocOff; ///< Offset of offset reloc section + uint32_t FieldRelocLen; ///< Length of offset reloc section uint32_t ExternRelocOff; ///< Offset of extern reloc section uint32_t ExternRelocLen; ///< Length of extern reloc section }; @@ -247,16 +247,17 @@ struct SecLineInfo { }; /// Specifying one offset relocation. 
-struct BPFOffsetReloc { +struct BPFFieldReloc { uint32_t InsnOffset; ///< Byte offset in this section uint32_t TypeID; ///< TypeID for the relocation uint32_t OffsetNameOff; ///< The string to traverse types + uint32_t RelocKind; ///< What to patch the instruction }; /// Specifying offset relocation's in one section. -struct SecOffsetReloc { +struct SecFieldReloc { uint32_t SecNameOff; ///< Section name index in the .BTF string table - uint32_t NumOffsetReloc; ///< Number of offset reloc's in this section + uint32_t NumFieldReloc; ///< Number of offset reloc's in this section }; /// Specifying one offset relocation. Modified: llvm/trunk/lib/Target/BPF/BTFDebug.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/BPF/BTFDebug.cpp?rev=374099&r1=374098&r2=374099&view=diff ============================================================================== --- llvm/trunk/lib/Target/BPF/BTFDebug.cpp (original) +++ llvm/trunk/lib/Target/BPF/BTFDebug.cpp Tue Oct 8 11:23:17 2019 @@ -754,7 +754,7 @@ void BTFDebug::emitBTFSection() { void BTFDebug::emitBTFExtSection() { // Do not emit section if empty FuncInfoTable and LineInfoTable. if (!FuncInfoTable.size() && !LineInfoTable.size() && - !OffsetRelocTable.size() && !ExternRelocTable.size()) + !FieldRelocTable.size() && !ExternRelocTable.size()) return; MCContext &Ctx = OS.getContext(); @@ -766,8 +766,8 @@ void BTFDebug::emitBTFExtSection() { // Account for FuncInfo/LineInfo record size as well. uint32_t FuncLen = 4, LineLen = 4; - // Do not account for optional OffsetReloc/ExternReloc. - uint32_t OffsetRelocLen = 0, ExternRelocLen = 0; + // Do not account for optional FieldReloc/ExternReloc. 
+ uint32_t FieldRelocLen = 0, ExternRelocLen = 0; for (const auto &FuncSec : FuncInfoTable) { FuncLen += BTF::SecFuncInfoSize; FuncLen += FuncSec.second.size() * BTF::BPFFuncInfoSize; @@ -776,17 +776,17 @@ void BTFDebug::emitBTFExtSection() { LineLen += BTF::SecLineInfoSize; LineLen += LineSec.second.size() * BTF::BPFLineInfoSize; } - for (const auto &OffsetRelocSec : OffsetRelocTable) { - OffsetRelocLen += BTF::SecOffsetRelocSize; - OffsetRelocLen += OffsetRelocSec.second.size() * BTF::BPFOffsetRelocSize; + for (const auto &FieldRelocSec : FieldRelocTable) { + FieldRelocLen += BTF::SecFieldRelocSize; + FieldRelocLen += FieldRelocSec.second.size() * BTF::BPFFieldRelocSize; } for (const auto &ExternRelocSec : ExternRelocTable) { ExternRelocLen += BTF::SecExternRelocSize; ExternRelocLen += ExternRelocSec.second.size() * BTF::BPFExternRelocSize; } - if (OffsetRelocLen) - OffsetRelocLen += 4; + if (FieldRelocLen) + FieldRelocLen += 4; if (ExternRelocLen) ExternRelocLen += 4; @@ -795,8 +795,8 @@ void BTFDebug::emitBTFExtSection() { OS.EmitIntValue(FuncLen, 4); OS.EmitIntValue(LineLen, 4); OS.EmitIntValue(FuncLen + LineLen, 4); - OS.EmitIntValue(OffsetRelocLen, 4); - OS.EmitIntValue(FuncLen + LineLen + OffsetRelocLen, 4); + OS.EmitIntValue(FieldRelocLen, 4); + OS.EmitIntValue(FuncLen + LineLen + FieldRelocLen, 4); OS.EmitIntValue(ExternRelocLen, 4); // Emit func_info table. @@ -831,19 +831,20 @@ void BTFDebug::emitBTFExtSection() { } } - // Emit offset reloc table. 
- if (OffsetRelocLen) { - OS.AddComment("OffsetReloc"); - OS.EmitIntValue(BTF::BPFOffsetRelocSize, 4); - for (const auto &OffsetRelocSec : OffsetRelocTable) { - OS.AddComment("Offset reloc section string offset=" + - std::to_string(OffsetRelocSec.first)); - OS.EmitIntValue(OffsetRelocSec.first, 4); - OS.EmitIntValue(OffsetRelocSec.second.size(), 4); - for (const auto &OffsetRelocInfo : OffsetRelocSec.second) { - Asm->EmitLabelReference(OffsetRelocInfo.Label, 4); - OS.EmitIntValue(OffsetRelocInfo.TypeID, 4); - OS.EmitIntValue(OffsetRelocInfo.OffsetNameOff, 4); + // Emit field reloc table. + if (FieldRelocLen) { + OS.AddComment("FieldReloc"); + OS.EmitIntValue(BTF::BPFFieldRelocSize, 4); + for (const auto &FieldRelocSec : FieldRelocTable) { + OS.AddComment("Field reloc section string offset=" + + std::to_string(FieldRelocSec.first)); + OS.EmitIntValue(FieldRelocSec.first, 4); + OS.EmitIntValue(FieldRelocSec.second.size(), 4); + for (const auto &FieldRelocInfo : FieldRelocSec.second) { + Asm->EmitLabelReference(FieldRelocInfo.Label, 4); + OS.EmitIntValue(FieldRelocInfo.TypeID, 4); + OS.EmitIntValue(FieldRelocInfo.OffsetNameOff, 4); + OS.EmitIntValue(FieldRelocInfo.RelocKind, 4); } } } @@ -958,23 +959,27 @@ unsigned BTFDebug::populateStructType(co return Id; } -/// Generate a struct member offset relocation. -void BTFDebug::generateOffsetReloc(const MachineInstr *MI, +/// Generate a struct member field relocation. 
+void BTFDebug::generateFieldReloc(const MachineInstr *MI, const MCSymbol *ORSym, DIType *RootTy, StringRef AccessPattern) { unsigned RootId = populateStructType(RootTy); size_t FirstDollar = AccessPattern.find_first_of('$'); size_t FirstColon = AccessPattern.find_first_of(':'); + size_t SecondColon = AccessPattern.find_first_of(':', FirstColon + 1); StringRef IndexPattern = AccessPattern.substr(FirstDollar + 1); - StringRef OffsetStr = AccessPattern.substr(FirstColon + 1, - FirstDollar - FirstColon); - - BTFOffsetReloc OffsetReloc; - OffsetReloc.Label = ORSym; - OffsetReloc.OffsetNameOff = addString(IndexPattern); - OffsetReloc.TypeID = RootId; - AccessOffsets[AccessPattern.str()] = std::stoi(OffsetStr); - OffsetRelocTable[SecNameOff].push_back(OffsetReloc); + StringRef RelocKindStr = AccessPattern.substr(FirstColon + 1, + SecondColon - FirstColon); + StringRef PatchImmStr = AccessPattern.substr(SecondColon + 1, + FirstDollar - SecondColon); + + BTFFieldReloc FieldReloc; + FieldReloc.Label = ORSym; + FieldReloc.OffsetNameOff = addString(IndexPattern); + FieldReloc.TypeID = RootId; + FieldReloc.RelocKind = std::stoull(RelocKindStr); + PatchImms[AccessPattern.str()] = std::stoul(PatchImmStr); + FieldRelocTable[SecNameOff].push_back(FieldReloc); } void BTFDebug::processLDimm64(const MachineInstr *MI) { @@ -982,7 +987,7 @@ void BTFDebug::processLDimm64(const Mach // will generate an .BTF.ext record. // // If the insn is "r2 = LD_imm64 @__BTF_...", - // add this insn into the .BTF.ext OffsetReloc subsection. + // add this insn into the .BTF.ext FieldReloc subsection. // Relocation looks like: // . SecName: // . 
InstOffset @@ -1013,7 +1018,7 @@ void BTFDebug::processLDimm64(const Mach MDNode *MDN = GVar->getMetadata(LLVMContext::MD_preserve_access_index); DIType *Ty = dyn_cast(MDN); - generateOffsetReloc(MI, ORSym, Ty, GVar->getName()); + generateFieldReloc(MI, ORSym, Ty, GVar->getName()); } else if (GVar && !GVar->hasInitializer() && GVar->hasExternalLinkage() && GVar->getSection() == BPFCoreSharedInfo::PatchableExtSecName) { MCSymbol *ORSym = OS.getContext().createTempSymbol(); @@ -1154,8 +1159,8 @@ bool BTFDebug::InstLower(const MachineIn const GlobalValue *GVal = MO.getGlobal(); auto *GVar = dyn_cast(GVal); if (GVar && GVar->hasAttribute(BPFCoreSharedInfo::AmaAttr)) { - // Emit "mov ri, " for abstract member accesses. - int64_t Imm = AccessOffsets[GVar->getName().str()]; + // Emit "mov ri, " for patched immediate. + uint32_t Imm = PatchImms[GVar->getName().str()]; OutMI.setOpcode(BPF::MOV_ri); OutMI.addOperand(MCOperand::createReg(MI->getOperand(0).getReg())); OutMI.addOperand(MCOperand::createImm(Imm)); Modified: llvm/trunk/lib/Target/BPF/BTFDebug.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/BPF/BTFDebug.h?rev=374099&r1=374098&r2=374099&view=diff ============================================================================== --- llvm/trunk/lib/Target/BPF/BTFDebug.h (original) +++ llvm/trunk/lib/Target/BPF/BTFDebug.h Tue Oct 8 11:23:17 2019 @@ -195,7 +195,7 @@ class BTFStringTable { /// A mapping from string table offset to the index /// of the Table. It is used to avoid putting /// duplicated strings in the table. - std::unordered_map OffsetToIdMap; + std::map OffsetToIdMap; /// A vector of strings to represent the string table. std::vector Table; @@ -224,10 +224,11 @@ struct BTFLineInfo { }; /// Represent one offset relocation. 
-struct BTFOffsetReloc { +struct BTFFieldReloc { const MCSymbol *Label; ///< MCSymbol identifying insn for the reloc uint32_t TypeID; ///< Type ID uint32_t OffsetNameOff; ///< The string to traverse types + uint32_t RelocKind; ///< What to patch the instruction }; /// Represent one extern relocation. @@ -249,12 +250,12 @@ class BTFDebug : public DebugHandlerBase std::unordered_map DIToIdMap; std::map> FuncInfoTable; std::map> LineInfoTable; - std::map> OffsetRelocTable; + std::map> FieldRelocTable; std::map> ExternRelocTable; StringMap> FileContent; std::map> DataSecEntries; std::vector StructTypes; - std::map AccessOffsets; + std::map PatchImms; std::map>> FixupDerivedTypes; @@ -300,7 +301,7 @@ class BTFDebug : public DebugHandlerBase void processGlobals(bool ProcessingMapDef); /// Generate one offset relocation record. - void generateOffsetReloc(const MachineInstr *MI, const MCSymbol *ORSym, + void generateFieldReloc(const MachineInstr *MI, const MCSymbol *ORSym, DIType *RootTy, StringRef AccessPattern); /// Populating unprocessed struct type. 
Modified: llvm/trunk/test/CodeGen/BPF/CORE/intrinsic-array.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/BPF/CORE/intrinsic-array.ll?rev=374099&r1=374098&r2=374099&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/BPF/CORE/intrinsic-array.ll (original) +++ llvm/trunk/test/CodeGen/BPF/CORE/intrinsic-array.ll Tue Oct 8 11:23:17 2019 @@ -28,12 +28,13 @@ entry: ; CHECK: exit ; ; CHECK: .section .BTF.ext,"", at progbits -; CHECK: .long 12 # OffsetReloc -; CHECK-NEXT: .long 20 # Offset reloc section string offset=20 +; CHECK: .long 16 # FieldReloc +; CHECK-NEXT: .long 20 # Field reloc section string offset=20 ; CHECK-NEXT: .long 1 ; CHECK-NEXT: .long [[RELOC]] ; CHECK-NEXT: .long 2 ; CHECK-NEXT: .long 26 +; CHECK-NEXT: .long 0 declare dso_local i32 @get_value(i8*) local_unnamed_addr #1 Added: llvm/trunk/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-byte-size-1.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-byte-size-1.ll?rev=374099&view=auto ============================================================================== --- llvm/trunk/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-byte-size-1.ll (added) +++ llvm/trunk/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-byte-size-1.ll Tue Oct 8 11:23:17 2019 @@ -0,0 +1,148 @@ +; RUN: llc -march=bpfel -filetype=asm -o - %s | FileCheck -check-prefixes=CHECK %s +; RUN: llc -march=bpfeb -filetype=asm -o - %s | FileCheck -check-prefixes=CHECK %s +; Source code: +; typedef struct s1 { int a1:7; int a2:4; int a3:5; int a4:16;} __s1; +; union u1 { int b1; __s1 b2; }; +; enum { FIELD_BYTE_SIZE = 1, }; +; int test(union u1 *arg) { +; unsigned r1 = __builtin_preserve_field_info(arg->b2.a1, FIELD_BYTE_SIZE); +; unsigned r2 = __builtin_preserve_field_info(arg->b2.a2, FIELD_BYTE_SIZE); +; unsigned r3 = __builtin_preserve_field_info(arg->b2.a3, FIELD_BYTE_SIZE); +; unsigned r4 = __builtin_preserve_field_info(arg->b2.a4, 
FIELD_BYTE_SIZE); +; /* r1: 4, r2: 4, r3: 4, r4: 4 */ +; return r1 + r2 + r3 + r4; +; } +; Compilation flag: +; clang -target bpf -O2 -g -S -emit-llvm test.c + +%union.u1 = type { i32 } +%struct.s1 = type { i32 } + +; Function Attrs: nounwind readnone +define dso_local i32 @test(%union.u1* %arg) local_unnamed_addr #0 !dbg !11 { +entry: + call void @llvm.dbg.value(metadata %union.u1* %arg, metadata !28, metadata !DIExpression()), !dbg !33 + %0 = tail call %union.u1* @llvm.preserve.union.access.index.p0s_union.u1s.p0s_union.u1s(%union.u1* %arg, i32 1), !dbg !34, !llvm.preserve.access.index !16 + %b2 = bitcast %union.u1* %0 to %struct.s1*, !dbg !34 + %1 = tail call i32* @llvm.preserve.struct.access.index.p0i32.p0s_struct.s1s(%struct.s1* %b2, i32 0, i32 0), !dbg !35, !llvm.preserve.access.index !21 + %2 = tail call i32 @llvm.bpf.preserve.field.info.p0i32(i32* %1, i64 1), !dbg !36 + call void @llvm.dbg.value(metadata i32 %2, metadata !29, metadata !DIExpression()), !dbg !33 + %3 = tail call i32* @llvm.preserve.struct.access.index.p0i32.p0s_struct.s1s(%struct.s1* %b2, i32 0, i32 1), !dbg !37, !llvm.preserve.access.index !21 + %4 = tail call i32 @llvm.bpf.preserve.field.info.p0i32(i32* %3, i64 1), !dbg !38 + call void @llvm.dbg.value(metadata i32 %4, metadata !30, metadata !DIExpression()), !dbg !33 + %5 = tail call i32* @llvm.preserve.struct.access.index.p0i32.p0s_struct.s1s(%struct.s1* %b2, i32 0, i32 2), !dbg !39, !llvm.preserve.access.index !21 + %6 = tail call i32 @llvm.bpf.preserve.field.info.p0i32(i32* %5, i64 1), !dbg !40 + call void @llvm.dbg.value(metadata i32 %6, metadata !31, metadata !DIExpression()), !dbg !33 + %7 = tail call i32* @llvm.preserve.struct.access.index.p0i32.p0s_struct.s1s(%struct.s1* %b2, i32 0, i32 3), !dbg !41, !llvm.preserve.access.index !21 + %8 = tail call i32 @llvm.bpf.preserve.field.info.p0i32(i32* %7, i64 1), !dbg !42 + call void @llvm.dbg.value(metadata i32 %8, metadata !32, metadata !DIExpression()), !dbg !33 + %add = add i32 %4, %2, 
!dbg !43 + %add4 = add i32 %add, %6, !dbg !44 + %add5 = add i32 %add4, %8, !dbg !45 + ret i32 %add5, !dbg !46 +} + +; CHECK: r1 = 4 +; CHECK: r0 = 4 +; CHECK: r0 += r1 +; CHECK: r1 = 4 +; CHECK: r0 += r1 +; CHECK: r1 = 4 +; CHECK: r0 += r1 +; CHECK: exit + +; CHECK: .long 1 # BTF_KIND_UNION(id = 2) +; CHECK: .ascii "u1" # string offset=1 +; CHECK: .ascii ".text" # string offset=43 +; CHECK: .ascii "0:1:0" # string offset=49 +; CHECK: .ascii "0:1:1" # string offset=92 +; CHECK: .ascii "0:1:2" # string offset=98 +; CHECK: .ascii "0:1:3" # string offset=104 + +; CHECK: .long 16 # FieldReloc +; CHECK-NEXT: .long 43 # Field reloc section string offset=43 +; CHECK-NEXT: .long 4 +; CHECK-NEXT: .long .Ltmp{{[0-9]+}} +; CHECK-NEXT: .long 2 +; CHECK-NEXT: .long 49 +; CHECK-NEXT: .long 1 +; CHECK-NEXT: .long .Ltmp{{[0-9]+}} +; CHECK-NEXT: .long 2 +; CHECK-NEXT: .long 92 +; CHECK-NEXT: .long 1 +; CHECK-NEXT: .long .Ltmp{{[0-9]+}} +; CHECK-NEXT: .long 2 +; CHECK-NEXT: .long 98 +; CHECK-NEXT: .long 1 +; CHECK-NEXT: .long .Ltmp{{[0-9]+}} +; CHECK-NEXT: .long 2 +; CHECK-NEXT: .long 104 +; CHECK-NEXT: .long 1 + +; Function Attrs: nounwind readnone +declare %union.u1* @llvm.preserve.union.access.index.p0s_union.u1s.p0s_union.u1s(%union.u1*, i32) #1 + +; Function Attrs: nounwind readnone +declare i32* @llvm.preserve.struct.access.index.p0i32.p0s_struct.s1s(%struct.s1*, i32, i32) #1 + +; Function Attrs: nounwind readnone +declare i32 @llvm.bpf.preserve.field.info.p0i32(i32*, i64) #1 + +; Function Attrs: nounwind readnone speculatable willreturn +declare void @llvm.dbg.value(metadata, metadata, metadata) #2 + +attributes #0 = { nounwind readnone "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "frame-pointer"="all" "less-precise-fpmad"="false" "min-legal-vector-width"="0" "no-infs-fp-math"="false" "no-jump-tables"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="false" "stack-protector-buffer-size"="8" 
"unsafe-fp-math"="false" "use-soft-float"="false" } +attributes #1 = { nounwind readnone } +attributes #2 = { nounwind readnone speculatable willreturn } + +!llvm.dbg.cu = !{!0} +!llvm.module.flags = !{!7, !8, !9} +!llvm.ident = !{!10} + +!0 = distinct !DICompileUnit(language: DW_LANG_C99, file: !1, producer: "clang version 10.0.0 (https://github.com/llvm/llvm-project.git 4a60741b74384f14b21fdc0131ede326438840ab)", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, enums: !2, nameTableKind: None) +!1 = !DIFile(filename: "test.c", directory: "/tmp/home/yhs/work/tests/core") +!2 = !{!3} +!3 = !DICompositeType(tag: DW_TAG_enumeration_type, file: !1, line: 3, baseType: !4, size: 32, elements: !5) +!4 = !DIBasicType(name: "unsigned int", size: 32, encoding: DW_ATE_unsigned) +!5 = !{!6} +!6 = !DIEnumerator(name: "FIELD_BYTE_SIZE", value: 1, isUnsigned: true) +!7 = !{i32 2, !"Dwarf Version", i32 4} +!8 = !{i32 2, !"Debug Info Version", i32 3} +!9 = !{i32 1, !"wchar_size", i32 4} +!10 = !{!"clang version 10.0.0 (https://github.com/llvm/llvm-project.git 4a60741b74384f14b21fdc0131ede326438840ab)"} +!11 = distinct !DISubprogram(name: "test", scope: !1, file: !1, line: 4, type: !12, scopeLine: 4, flags: DIFlagPrototyped, isDefinition: true, isOptimized: true, unit: !0, retainedNodes: !27) +!12 = !DISubroutineType(types: !13) +!13 = !{!14, !15} +!14 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed) +!15 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !16, size: 64) +!16 = distinct !DICompositeType(tag: DW_TAG_union_type, name: "u1", file: !1, line: 2, size: 32, elements: !17) +!17 = !{!18, !19} +!18 = !DIDerivedType(tag: DW_TAG_member, name: "b1", scope: !16, file: !1, line: 2, baseType: !14, size: 32) +!19 = !DIDerivedType(tag: DW_TAG_member, name: "b2", scope: !16, file: !1, line: 2, baseType: !20, size: 32) +!20 = !DIDerivedType(tag: DW_TAG_typedef, name: "__s1", file: !1, line: 1, baseType: !21) +!21 = distinct !DICompositeType(tag: 
DW_TAG_structure_type, name: "s1", file: !1, line: 1, size: 32, elements: !22) +!22 = !{!23, !24, !25, !26} +!23 = !DIDerivedType(tag: DW_TAG_member, name: "a1", scope: !21, file: !1, line: 1, baseType: !14, size: 7, flags: DIFlagBitField, extraData: i64 0) +!24 = !DIDerivedType(tag: DW_TAG_member, name: "a2", scope: !21, file: !1, line: 1, baseType: !14, size: 4, offset: 7, flags: DIFlagBitField, extraData: i64 0) +!25 = !DIDerivedType(tag: DW_TAG_member, name: "a3", scope: !21, file: !1, line: 1, baseType: !14, size: 5, offset: 11, flags: DIFlagBitField, extraData: i64 0) +!26 = !DIDerivedType(tag: DW_TAG_member, name: "a4", scope: !21, file: !1, line: 1, baseType: !14, size: 16, offset: 16, flags: DIFlagBitField, extraData: i64 0) +!27 = !{!28, !29, !30, !31, !32} +!28 = !DILocalVariable(name: "arg", arg: 1, scope: !11, file: !1, line: 4, type: !15) +!29 = !DILocalVariable(name: "r1", scope: !11, file: !1, line: 5, type: !4) +!30 = !DILocalVariable(name: "r2", scope: !11, file: !1, line: 6, type: !4) +!31 = !DILocalVariable(name: "r3", scope: !11, file: !1, line: 7, type: !4) +!32 = !DILocalVariable(name: "r4", scope: !11, file: !1, line: 8, type: !4) +!33 = !DILocation(line: 0, scope: !11) +!34 = !DILocation(line: 5, column: 52, scope: !11) +!35 = !DILocation(line: 5, column: 55, scope: !11) +!36 = !DILocation(line: 5, column: 17, scope: !11) +!37 = !DILocation(line: 6, column: 55, scope: !11) +!38 = !DILocation(line: 6, column: 17, scope: !11) +!39 = !DILocation(line: 7, column: 55, scope: !11) +!40 = !DILocation(line: 7, column: 17, scope: !11) +!41 = !DILocation(line: 8, column: 55, scope: !11) +!42 = !DILocation(line: 8, column: 17, scope: !11) +!43 = !DILocation(line: 10, column: 13, scope: !11) +!44 = !DILocation(line: 10, column: 18, scope: !11) +!45 = !DILocation(line: 10, column: 23, scope: !11) +!46 = !DILocation(line: 10, column: 3, scope: !11) Added: llvm/trunk/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-byte-size-2.ll URL: 
http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-byte-size-2.ll?rev=374099&view=auto ============================================================================== --- llvm/trunk/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-byte-size-2.ll (added) +++ llvm/trunk/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-byte-size-2.ll Tue Oct 8 11:23:17 2019 @@ -0,0 +1,138 @@ +; RUN: llc -march=bpfel -filetype=asm -o - %s | FileCheck -check-prefixes=CHECK %s +; RUN: llc -march=bpfeb -filetype=asm -o - %s | FileCheck -check-prefixes=CHECK %s +; Source code: +; typedef struct s1 { int a1; char a2; } __s1; +; union u1 { int b1; __s1 b2; }; +; enum { FIELD_BYTE_SIZE = 1, }; +; int test(union u1 *arg) { +; unsigned r1 = __builtin_preserve_field_info(arg->b2, FIELD_BYTE_SIZE); +; unsigned r2 = __builtin_preserve_field_info(arg->b2.a1, FIELD_BYTE_SIZE); +; unsigned r3 = __builtin_preserve_field_info(arg->b2.a2, FIELD_BYTE_SIZE); +; /* r1: 8, r2: 4, r3: 1 */ +; return r1 + r2 + r3; +; } +; Compilation flag: +; clang -target bpf -O2 -g -S -emit-llvm test.c + +%union.u1 = type { %struct.s1 } +%struct.s1 = type { i32, i8 } + +; Function Attrs: nounwind readnone +define dso_local i32 @test(%union.u1* %arg) local_unnamed_addr #0 !dbg !11 { +entry: + call void @llvm.dbg.value(metadata %union.u1* %arg, metadata !27, metadata !DIExpression()), !dbg !31 + %0 = tail call %union.u1* @llvm.preserve.union.access.index.p0s_union.u1s.p0s_union.u1s(%union.u1* %arg, i32 1), !dbg !32, !llvm.preserve.access.index !16 + %b2 = getelementptr inbounds %union.u1, %union.u1* %0, i64 0, i32 0, !dbg !32 + %1 = tail call i32 @llvm.bpf.preserve.field.info.p0s_struct.s1s(%struct.s1* %b2, i64 1), !dbg !33 + call void @llvm.dbg.value(metadata i32 %1, metadata !28, metadata !DIExpression()), !dbg !31 + %2 = tail call i32* @llvm.preserve.struct.access.index.p0i32.p0s_struct.s1s(%struct.s1* %b2, i32 0, i32 0), !dbg !34, !llvm.preserve.access.index !21 + %3 = tail call i32 
@llvm.bpf.preserve.field.info.p0i32(i32* %2, i64 1), !dbg !35 + call void @llvm.dbg.value(metadata i32 %3, metadata !29, metadata !DIExpression()), !dbg !31 + %4 = tail call i8* @llvm.preserve.struct.access.index.p0i8.p0s_struct.s1s(%struct.s1* %b2, i32 1, i32 1), !dbg !36, !llvm.preserve.access.index !21 + %5 = tail call i32 @llvm.bpf.preserve.field.info.p0i8(i8* %4, i64 1), !dbg !37 + call void @llvm.dbg.value(metadata i32 %5, metadata !30, metadata !DIExpression()), !dbg !31 + %add = add i32 %3, %1, !dbg !38 + %add3 = add i32 %add, %5, !dbg !39 + ret i32 %add3, !dbg !40 +} + +; CHECK: r1 = 8 +; CHECK: r0 = 4 +; CHECK: r0 += r1 +; CHECK: r1 = 1 +; CHECK: r0 += r1 +; CHECK: exit + +; CHECK: .long 1 # BTF_KIND_UNION(id = 2) +; CHECK: .ascii "u1" # string offset=1 +; CHECK: .ascii ".text" # string offset=42 +; CHECK: .ascii "0:1" # string offset=48 +; CHECK: .ascii "0:1:0" # string offset=89 +; CHECK: .ascii "0:1:1" # string offset=95 + +; CHECK: .long 16 # FieldReloc +; CHECK-NEXT: .long 42 # Field reloc section string offset=42 +; CHECK-NEXT: .long 3 +; CHECK-NEXT: .long .Ltmp{{[0-9]+}} +; CHECK-NEXT: .long 2 +; CHECK-NEXT: .long 48 +; CHECK-NEXT: .long 1 +; CHECK-NEXT: .long .Ltmp{{[0-9]+}} +; CHECK-NEXT: .long 2 +; CHECK-NEXT: .long 89 +; CHECK-NEXT: .long 1 +; CHECK-NEXT: .long .Ltmp{{[0-9]+}} +; CHECK-NEXT: .long 2 +; CHECK-NEXT: .long 95 +; CHECK-NEXT: .long 1 + +; Function Attrs: nounwind readnone +declare %union.u1* @llvm.preserve.union.access.index.p0s_union.u1s.p0s_union.u1s(%union.u1*, i32) #1 + +; Function Attrs: nounwind readnone +declare i32 @llvm.bpf.preserve.field.info.p0s_struct.s1s(%struct.s1*, i64) #1 + +; Function Attrs: nounwind readnone +declare i32* @llvm.preserve.struct.access.index.p0i32.p0s_struct.s1s(%struct.s1*, i32, i32) #1 + +; Function Attrs: nounwind readnone +declare i32 @llvm.bpf.preserve.field.info.p0i32(i32*, i64) #1 + +; Function Attrs: nounwind readnone +declare i8* 
@llvm.preserve.struct.access.index.p0i8.p0s_struct.s1s(%struct.s1*, i32, i32) #1 + +; Function Attrs: nounwind readnone +declare i32 @llvm.bpf.preserve.field.info.p0i8(i8*, i64) #1 + +; Function Attrs: nounwind readnone speculatable willreturn +declare void @llvm.dbg.value(metadata, metadata, metadata) #2 + +attributes #0 = { nounwind readnone "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "frame-pointer"="all" "less-precise-fpmad"="false" "min-legal-vector-width"="0" "no-infs-fp-math"="false" "no-jump-tables"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="false" "stack-protector-buffer-size"="8" "unsafe-fp-math"="false" "use-soft-float"="false" } +attributes #1 = { nounwind readnone } +attributes #2 = { nounwind readnone speculatable willreturn } + +!llvm.dbg.cu = !{!0} +!llvm.module.flags = !{!7, !8, !9} +!llvm.ident = !{!10} + +!0 = distinct !DICompileUnit(language: DW_LANG_C99, file: !1, producer: "clang version 10.0.0 (https://github.com/llvm/llvm-project.git 4a60741b74384f14b21fdc0131ede326438840ab)", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, enums: !2, nameTableKind: None) +!1 = !DIFile(filename: "test.c", directory: "/tmp/home/yhs/work/tests/core") +!2 = !{!3} +!3 = !DICompositeType(tag: DW_TAG_enumeration_type, file: !1, line: 3, baseType: !4, size: 32, elements: !5) +!4 = !DIBasicType(name: "unsigned int", size: 32, encoding: DW_ATE_unsigned) +!5 = !{!6} +!6 = !DIEnumerator(name: "FIELD_BYTE_SIZE", value: 1, isUnsigned: true) +!7 = !{i32 2, !"Dwarf Version", i32 4} +!8 = !{i32 2, !"Debug Info Version", i32 3} +!9 = !{i32 1, !"wchar_size", i32 4} +!10 = !{!"clang version 10.0.0 (https://github.com/llvm/llvm-project.git 4a60741b74384f14b21fdc0131ede326438840ab)"} +!11 = distinct !DISubprogram(name: "test", scope: !1, file: !1, line: 4, type: !12, scopeLine: 4, flags: DIFlagPrototyped, isDefinition: true, isOptimized: true, unit: !0, retainedNodes: !26) +!12 = 
!DISubroutineType(types: !13) +!13 = !{!14, !15} +!14 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed) +!15 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !16, size: 64) +!16 = distinct !DICompositeType(tag: DW_TAG_union_type, name: "u1", file: !1, line: 2, size: 64, elements: !17) +!17 = !{!18, !19} +!18 = !DIDerivedType(tag: DW_TAG_member, name: "b1", scope: !16, file: !1, line: 2, baseType: !14, size: 32) +!19 = !DIDerivedType(tag: DW_TAG_member, name: "b2", scope: !16, file: !1, line: 2, baseType: !20, size: 64) +!20 = !DIDerivedType(tag: DW_TAG_typedef, name: "__s1", file: !1, line: 1, baseType: !21) +!21 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "s1", file: !1, line: 1, size: 64, elements: !22) +!22 = !{!23, !24} +!23 = !DIDerivedType(tag: DW_TAG_member, name: "a1", scope: !21, file: !1, line: 1, baseType: !14, size: 32) +!24 = !DIDerivedType(tag: DW_TAG_member, name: "a2", scope: !21, file: !1, line: 1, baseType: !25, size: 8, offset: 32) +!25 = !DIBasicType(name: "char", size: 8, encoding: DW_ATE_signed_char) +!26 = !{!27, !28, !29, !30} +!27 = !DILocalVariable(name: "arg", arg: 1, scope: !11, file: !1, line: 4, type: !15) +!28 = !DILocalVariable(name: "r1", scope: !11, file: !1, line: 5, type: !4) +!29 = !DILocalVariable(name: "r2", scope: !11, file: !1, line: 6, type: !4) +!30 = !DILocalVariable(name: "r3", scope: !11, file: !1, line: 7, type: !4) +!31 = !DILocation(line: 0, scope: !11) +!32 = !DILocation(line: 5, column: 52, scope: !11) +!33 = !DILocation(line: 5, column: 17, scope: !11) +!34 = !DILocation(line: 6, column: 55, scope: !11) +!35 = !DILocation(line: 6, column: 17, scope: !11) +!36 = !DILocation(line: 7, column: 55, scope: !11) +!37 = !DILocation(line: 7, column: 17, scope: !11) +!38 = !DILocation(line: 9, column: 13, scope: !11) +!39 = !DILocation(line: 9, column: 18, scope: !11) +!40 = !DILocation(line: 9, column: 3, scope: !11) Added: 
llvm/trunk/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-byte-size-3.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-byte-size-3.ll?rev=374099&view=auto ============================================================================== --- llvm/trunk/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-byte-size-3.ll (added) +++ llvm/trunk/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-byte-size-3.ll Tue Oct 8 11:23:17 2019 @@ -0,0 +1,130 @@ +; RUN: llc -march=bpfel -filetype=asm -o - %s | FileCheck -check-prefixes=CHECK %s +; RUN: llc -march=bpfeb -filetype=asm -o - %s | FileCheck -check-prefixes=CHECK %s +; Source code: +; typedef struct s1 { int a1[10][10]; } __s1; +; union u1 { int b1; __s1 b2; }; +; enum { FIELD_BYTE_SIZE = 1, }; +; int test(union u1 *arg) { +; unsigned r1 = __builtin_preserve_field_info(arg->b2.a1[5], FIELD_BYTE_SIZE); +; unsigned r2 = __builtin_preserve_field_info(arg->b2.a1[5][5], FIELD_BYTE_SIZE); +; /* r1: 40, r2: 4 */ +; return r1 + r2; +; } +; Compilation flag: +; clang -target bpf -O2 -g -S -emit-llvm test.c + +%union.u1 = type { %struct.s1 } +%struct.s1 = type { [10 x [10 x i32]] } + +; Function Attrs: nounwind readnone +define dso_local i32 @test(%union.u1* %arg) local_unnamed_addr #0 !dbg !18 { +entry: + call void @llvm.dbg.value(metadata %union.u1* %arg, metadata !31, metadata !DIExpression()), !dbg !34 + %0 = tail call %union.u1* @llvm.preserve.union.access.index.p0s_union.u1s.p0s_union.u1s(%union.u1* %arg, i32 1), !dbg !35, !llvm.preserve.access.index !22 + %b2 = getelementptr inbounds %union.u1, %union.u1* %0, i64 0, i32 0, !dbg !35 + %1 = tail call [10 x [10 x i32]]* @llvm.preserve.struct.access.index.p0a10a10i32.p0s_struct.s1s(%struct.s1* %b2, i32 0, i32 0), !dbg !36, !llvm.preserve.access.index !27 + %2 = tail call [10 x i32]* @llvm.preserve.array.access.index.p0a10i32.p0a10a10i32([10 x [10 x i32]]* %1, i32 1, i32 5), !dbg !37, !llvm.preserve.access.index !8 + %3 = tail call i32 
@llvm.bpf.preserve.field.info.p0a10i32([10 x i32]* %2, i64 1), !dbg !38 + call void @llvm.dbg.value(metadata i32 %3, metadata !32, metadata !DIExpression()), !dbg !34 + %4 = tail call i32* @llvm.preserve.array.access.index.p0i32.p0a10i32([10 x i32]* %2, i32 1, i32 5), !dbg !39, !llvm.preserve.access.index !12 + %5 = tail call i32 @llvm.bpf.preserve.field.info.p0i32(i32* %4, i64 1), !dbg !40 + call void @llvm.dbg.value(metadata i32 %5, metadata !33, metadata !DIExpression()), !dbg !34 + %add = add i32 %5, %3, !dbg !41 + ret i32 %add, !dbg !42 +} + +; CHECK: r1 = 40 +; CHECK: r0 = 4 +; CHECK: r0 += r1 +; CHECK: exit + +; CHECK: .long 1 # BTF_KIND_UNION(id = 2) +; CHECK: .ascii "u1" # string offset=1 +; CHECK: .ascii ".text" # string offset=54 +; CHECK: .ascii "0:1:0:5" # string offset=60 +; CHECK: .ascii "0:1:0:5:5" # string offset=105 + +; CHECK: .long 16 # FieldReloc +; CHECK-NEXT: .long 54 # Field reloc section string offset=54 +; CHECK-NEXT: .long 2 +; CHECK-NEXT: .long .Ltmp{{[0-9]+}} +; CHECK-NEXT: .long 2 +; CHECK-NEXT: .long 60 +; CHECK-NEXT: .long 1 +; CHECK-NEXT: .long .Ltmp{{[0-9]+}} +; CHECK-NEXT: .long 2 +; CHECK-NEXT: .long 105 +; CHECK-NEXT: .long 1 + +; Function Attrs: nounwind readnone +declare %union.u1* @llvm.preserve.union.access.index.p0s_union.u1s.p0s_union.u1s(%union.u1*, i32) #1 + +; Function Attrs: nounwind readnone +declare [10 x [10 x i32]]* @llvm.preserve.struct.access.index.p0a10a10i32.p0s_struct.s1s(%struct.s1*, i32, i32) #1 + +; Function Attrs: nounwind readnone +declare [10 x i32]* @llvm.preserve.array.access.index.p0a10i32.p0a10a10i32([10 x [10 x i32]]*, i32, i32) #1 + +; Function Attrs: nounwind readnone +declare i32 @llvm.bpf.preserve.field.info.p0a10i32([10 x i32]*, i64) #1 + +; Function Attrs: nounwind readnone +declare i32* @llvm.preserve.array.access.index.p0i32.p0a10i32([10 x i32]*, i32, i32) #1 + +; Function Attrs: nounwind readnone +declare i32 @llvm.bpf.preserve.field.info.p0i32(i32*, i64) #1 + +; Function Attrs: nounwind 
readnone speculatable willreturn +declare void @llvm.dbg.value(metadata, metadata, metadata) #2 + +attributes #0 = { nounwind readnone "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "frame-pointer"="all" "less-precise-fpmad"="false" "min-legal-vector-width"="0" "no-infs-fp-math"="false" "no-jump-tables"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="false" "stack-protector-buffer-size"="8" "unsafe-fp-math"="false" "use-soft-float"="false" } +attributes #1 = { nounwind readnone } +attributes #2 = { nounwind readnone speculatable willreturn } + +!llvm.dbg.cu = !{!0} +!llvm.module.flags = !{!14, !15, !16} +!llvm.ident = !{!17} + +!0 = distinct !DICompileUnit(language: DW_LANG_C99, file: !1, producer: "clang version 10.0.0 (https://github.com/llvm/llvm-project.git c1e02f16f1105ffaf1c35ee8bc38b7d6db5c6ea9)", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, enums: !2, retainedTypes: !7, nameTableKind: None) +!1 = !DIFile(filename: "test.c", directory: "/tmp/home/yhs/work/tests/core") +!2 = !{!3} +!3 = !DICompositeType(tag: DW_TAG_enumeration_type, file: !1, line: 3, baseType: !4, size: 32, elements: !5) +!4 = !DIBasicType(name: "unsigned int", size: 32, encoding: DW_ATE_unsigned) +!5 = !{!6} +!6 = !DIEnumerator(name: "FIELD_BYTE_SIZE", value: 1, isUnsigned: true) +!7 = !{!8, !12} +!8 = !DICompositeType(tag: DW_TAG_array_type, baseType: !9, size: 3200, elements: !10) +!9 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed) +!10 = !{!11, !11} +!11 = !DISubrange(count: 10) +!12 = !DICompositeType(tag: DW_TAG_array_type, baseType: !9, size: 320, elements: !13) +!13 = !{!11} +!14 = !{i32 2, !"Dwarf Version", i32 4} +!15 = !{i32 2, !"Debug Info Version", i32 3} +!16 = !{i32 1, !"wchar_size", i32 4} +!17 = !{!"clang version 10.0.0 (https://github.com/llvm/llvm-project.git c1e02f16f1105ffaf1c35ee8bc38b7d6db5c6ea9)"} +!18 = distinct !DISubprogram(name: "test", scope: !1, file: !1, line: 
4, type: !19, scopeLine: 4, flags: DIFlagPrototyped, isDefinition: true, isOptimized: true, unit: !0, retainedNodes: !30) +!19 = !DISubroutineType(types: !20) +!20 = !{!9, !21} +!21 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !22, size: 64) +!22 = distinct !DICompositeType(tag: DW_TAG_union_type, name: "u1", file: !1, line: 2, size: 3200, elements: !23) +!23 = !{!24, !25} +!24 = !DIDerivedType(tag: DW_TAG_member, name: "b1", scope: !22, file: !1, line: 2, baseType: !9, size: 32) +!25 = !DIDerivedType(tag: DW_TAG_member, name: "b2", scope: !22, file: !1, line: 2, baseType: !26, size: 3200) +!26 = !DIDerivedType(tag: DW_TAG_typedef, name: "__s1", file: !1, line: 1, baseType: !27) +!27 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "s1", file: !1, line: 1, size: 3200, elements: !28) +!28 = !{!29} +!29 = !DIDerivedType(tag: DW_TAG_member, name: "a1", scope: !27, file: !1, line: 1, baseType: !8, size: 3200) +!30 = !{!31, !32, !33} +!31 = !DILocalVariable(name: "arg", arg: 1, scope: !18, file: !1, line: 4, type: !21) +!32 = !DILocalVariable(name: "r1", scope: !18, file: !1, line: 5, type: !4) +!33 = !DILocalVariable(name: "r2", scope: !18, file: !1, line: 6, type: !4) +!34 = !DILocation(line: 0, scope: !18) +!35 = !DILocation(line: 5, column: 52, scope: !18) +!36 = !DILocation(line: 5, column: 55, scope: !18) +!37 = !DILocation(line: 5, column: 47, scope: !18) +!38 = !DILocation(line: 5, column: 17, scope: !18) +!39 = !DILocation(line: 6, column: 47, scope: !18) +!40 = !DILocation(line: 6, column: 17, scope: !18) +!41 = !DILocation(line: 8, column: 13, scope: !18) +!42 = !DILocation(line: 8, column: 3, scope: !18) Added: llvm/trunk/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-existence-1.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-existence-1.ll?rev=374099&view=auto ============================================================================== --- 
llvm/trunk/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-existence-1.ll (added) +++ llvm/trunk/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-existence-1.ll Tue Oct 8 11:23:17 2019 @@ -0,0 +1,162 @@ +; RUN: llc -march=bpfel -filetype=asm -o - %s | FileCheck -check-prefixes=CHECK %s +; RUN: llc -march=bpfeb -filetype=asm -o - %s | FileCheck -check-prefixes=CHECK %s +; Source code: +; typedef unsigned __uint; +; struct s1 { int a1; __uint a2:9; __uint a3:4; }; +; union u1 { int b1; __uint b2:9; __uint b3:4; }; +; enum { FIELD_EXISTENCE = 2, }; +; int test(struct s1 *arg1, union u1 *arg2) { +; unsigned r1 = __builtin_preserve_field_info(arg1->a1, FIELD_EXISTENCE); +; unsigned r2 = __builtin_preserve_field_info(arg1->a3, FIELD_EXISTENCE); +; unsigned r3 = __builtin_preserve_field_info(arg2->b1, FIELD_EXISTENCE); +; unsigned r4 = __builtin_preserve_field_info(arg2->b3, FIELD_EXISTENCE); +; return r1 + r2 + r3 + r4; +; } +; Compilation flag: +; clang -target bpf -O2 -g -S -emit-llvm test.c + +%struct.s1 = type { i32, i16 } +%union.u1 = type { i32 } + +; Function Attrs: nounwind readnone +define dso_local i32 @test(%struct.s1* %arg1, %union.u1* %arg2) local_unnamed_addr #0 !dbg !11 { +entry: + call void @llvm.dbg.value(metadata %struct.s1* %arg1, metadata !29, metadata !DIExpression()), !dbg !35 + call void @llvm.dbg.value(metadata %union.u1* %arg2, metadata !30, metadata !DIExpression()), !dbg !35 + %0 = tail call i32* @llvm.preserve.struct.access.index.p0i32.p0s_struct.s1s(%struct.s1* %arg1, i32 0, i32 0), !dbg !36, !llvm.preserve.access.index !16 + %1 = tail call i32 @llvm.bpf.preserve.field.info.p0i32(i32* %0, i64 2), !dbg !37 + call void @llvm.dbg.value(metadata i32 %1, metadata !31, metadata !DIExpression()), !dbg !35 + %2 = tail call i16* @llvm.preserve.struct.access.index.p0i16.p0s_struct.s1s(%struct.s1* %arg1, i32 1, i32 2), !dbg !38, !llvm.preserve.access.index !16 + %3 = tail call i32 @llvm.bpf.preserve.field.info.p0i16(i16* %2, i64 2), !dbg !39 + call void 
@llvm.dbg.value(metadata i32 %3, metadata !32, metadata !DIExpression()), !dbg !35 + %4 = tail call %union.u1* @llvm.preserve.union.access.index.p0s_union.u1s.p0s_union.u1s(%union.u1* %arg2, i32 0), !dbg !40, !llvm.preserve.access.index !23 + %b1 = getelementptr inbounds %union.u1, %union.u1* %4, i64 0, i32 0, !dbg !40 + %5 = tail call i32 @llvm.bpf.preserve.field.info.p0i32(i32* %b1, i64 2), !dbg !41 + call void @llvm.dbg.value(metadata i32 %5, metadata !33, metadata !DIExpression()), !dbg !35 + %6 = tail call i32* @llvm.preserve.struct.access.index.p0i32.p0s_union.u1s(%union.u1* %arg2, i32 0, i32 2), !dbg !42, !llvm.preserve.access.index !23 + %7 = bitcast i32* %6 to i8*, !dbg !42 + %8 = tail call i32 @llvm.bpf.preserve.field.info.p0i8(i8* %7, i64 2), !dbg !43 + call void @llvm.dbg.value(metadata i32 %8, metadata !34, metadata !DIExpression()), !dbg !35 + %add = add i32 %3, %1, !dbg !44 + %add1 = add i32 %add, %5, !dbg !45 + %add2 = add i32 %add1, %8, !dbg !46 + ret i32 %add2, !dbg !47 +} + +; CHECK: r1 = 1 +; CHECK: r0 = 1 +; CHECK: r0 += r1 +; CHECK: r1 = 1 +; CHECK: r0 += r1 +; CHECK: r1 = 1 +; CHECK: r0 += r1 +; CHECK: exit + +; CHECK: .long 1 # BTF_KIND_STRUCT(id = 2) +; CHECK: .long 37 # BTF_KIND_UNION(id = 7) +; CHECK: .ascii "s1" # string offset=1 +; CHECK: .ascii "u1" # string offset=37 +; CHECK: .ascii ".text" # string offset=64 +; CHECK: .ascii "0:0" # string offset=70 +; CHECK: .ascii "0:2" # string offset=111 + +; CHECK: .long 16 # FieldReloc +; CHECK-NEXT: .long 64 # Field reloc section string offset=64 +; CHECK-NEXT: .long 4 +; CHECK-NEXT: .long .Ltmp{{[0-9]+}} +; CHECK-NEXT: .long 2 +; CHECK-NEXT: .long 70 +; CHECK-NEXT: .long 2 +; CHECK-NEXT: .long .Ltmp{{[0-9]+}} +; CHECK-NEXT: .long 2 +; CHECK-NEXT: .long 111 +; CHECK-NEXT: .long 2 +; CHECK-NEXT: .long .Ltmp{{[0-9]+}} +; CHECK-NEXT: .long 7 +; CHECK-NEXT: .long 70 +; CHECK-NEXT: .long 2 +; CHECK-NEXT: .long .Ltmp{{[0-9]+}} +; CHECK-NEXT: .long 7 +; CHECK-NEXT: .long 111 +; CHECK-NEXT: .long 2 + 
+; Function Attrs: nounwind readnone +declare i32* @llvm.preserve.struct.access.index.p0i32.p0s_struct.s1s(%struct.s1*, i32, i32) #1 + +; Function Attrs: nounwind readnone +declare i32 @llvm.bpf.preserve.field.info.p0i32(i32*, i64) #1 + +; Function Attrs: nounwind readnone +declare i16* @llvm.preserve.struct.access.index.p0i16.p0s_struct.s1s(%struct.s1*, i32, i32) #1 + +; Function Attrs: nounwind readnone +declare i32 @llvm.bpf.preserve.field.info.p0i16(i16*, i64) #1 + +; Function Attrs: nounwind readnone +declare %union.u1* @llvm.preserve.union.access.index.p0s_union.u1s.p0s_union.u1s(%union.u1*, i32) #1 + +; Function Attrs: nounwind readnone +declare i32* @llvm.preserve.struct.access.index.p0i32.p0s_union.u1s(%union.u1*, i32, i32) #1 + +; Function Attrs: nounwind readnone +declare i32 @llvm.bpf.preserve.field.info.p0i8(i8*, i64) #1 + +; Function Attrs: nounwind readnone speculatable willreturn +declare void @llvm.dbg.value(metadata, metadata, metadata) #2 + +attributes #0 = { nounwind readnone "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "frame-pointer"="all" "less-precise-fpmad"="false" "min-legal-vector-width"="0" "no-infs-fp-math"="false" "no-jump-tables"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="false" "stack-protector-buffer-size"="8" "unsafe-fp-math"="false" "use-soft-float"="false" } +attributes #1 = { nounwind readnone } +attributes #2 = { nounwind readnone speculatable willreturn } + +!llvm.dbg.cu = !{!0} +!llvm.module.flags = !{!7, !8, !9} +!llvm.ident = !{!10} + +!0 = distinct !DICompileUnit(language: DW_LANG_C99, file: !1, producer: "clang version 10.0.0 (https://github.com/llvm/llvm-project.git 4a60741b74384f14b21fdc0131ede326438840ab)", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, enums: !2, nameTableKind: None) +!1 = !DIFile(filename: "test.c", directory: "/tmp/home/yhs/work/tests/core") +!2 = !{!3} +!3 = !DICompositeType(tag: DW_TAG_enumeration_type, 
file: !1, line: 4, baseType: !4, size: 32, elements: !5) +!4 = !DIBasicType(name: "unsigned int", size: 32, encoding: DW_ATE_unsigned) +!5 = !{!6} +!6 = !DIEnumerator(name: "FIELD_EXISTENCE", value: 2, isUnsigned: true) +!7 = !{i32 2, !"Dwarf Version", i32 4} +!8 = !{i32 2, !"Debug Info Version", i32 3} +!9 = !{i32 1, !"wchar_size", i32 4} +!10 = !{!"clang version 10.0.0 (https://github.com/llvm/llvm-project.git 4a60741b74384f14b21fdc0131ede326438840ab)"} +!11 = distinct !DISubprogram(name: "test", scope: !1, file: !1, line: 5, type: !12, scopeLine: 5, flags: DIFlagPrototyped, isDefinition: true, isOptimized: true, unit: !0, retainedNodes: !28) +!12 = !DISubroutineType(types: !13) +!13 = !{!14, !15, !22} +!14 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed) +!15 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !16, size: 64) +!16 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "s1", file: !1, line: 2, size: 64, elements: !17) +!17 = !{!18, !19, !21} +!18 = !DIDerivedType(tag: DW_TAG_member, name: "a1", scope: !16, file: !1, line: 2, baseType: !14, size: 32) +!19 = !DIDerivedType(tag: DW_TAG_member, name: "a2", scope: !16, file: !1, line: 2, baseType: !20, size: 9, offset: 32, flags: DIFlagBitField, extraData: i64 32) +!20 = !DIDerivedType(tag: DW_TAG_typedef, name: "__uint", file: !1, line: 1, baseType: !4) +!21 = !DIDerivedType(tag: DW_TAG_member, name: "a3", scope: !16, file: !1, line: 2, baseType: !20, size: 4, offset: 41, flags: DIFlagBitField, extraData: i64 32) +!22 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !23, size: 64) +!23 = distinct !DICompositeType(tag: DW_TAG_union_type, name: "u1", file: !1, line: 3, size: 32, elements: !24) +!24 = !{!25, !26, !27} +!25 = !DIDerivedType(tag: DW_TAG_member, name: "b1", scope: !23, file: !1, line: 3, baseType: !14, size: 32) +!26 = !DIDerivedType(tag: DW_TAG_member, name: "b2", scope: !23, file: !1, line: 3, baseType: !20, size: 9, flags: DIFlagBitField, extraData: i64 0) +!27 
= !DIDerivedType(tag: DW_TAG_member, name: "b3", scope: !23, file: !1, line: 3, baseType: !20, size: 4, flags: DIFlagBitField, extraData: i64 0) +!28 = !{!29, !30, !31, !32, !33, !34} +!29 = !DILocalVariable(name: "arg1", arg: 1, scope: !11, file: !1, line: 5, type: !15) +!30 = !DILocalVariable(name: "arg2", arg: 2, scope: !11, file: !1, line: 5, type: !22) +!31 = !DILocalVariable(name: "r1", scope: !11, file: !1, line: 6, type: !4) +!32 = !DILocalVariable(name: "r2", scope: !11, file: !1, line: 7, type: !4) +!33 = !DILocalVariable(name: "r3", scope: !11, file: !1, line: 8, type: !4) +!34 = !DILocalVariable(name: "r4", scope: !11, file: !1, line: 9, type: !4) +!35 = !DILocation(line: 0, scope: !11) +!36 = !DILocation(line: 6, column: 53, scope: !11) +!37 = !DILocation(line: 6, column: 17, scope: !11) +!38 = !DILocation(line: 7, column: 53, scope: !11) +!39 = !DILocation(line: 7, column: 17, scope: !11) +!40 = !DILocation(line: 8, column: 53, scope: !11) +!41 = !DILocation(line: 8, column: 17, scope: !11) +!42 = !DILocation(line: 9, column: 53, scope: !11) +!43 = !DILocation(line: 9, column: 17, scope: !11) +!44 = !DILocation(line: 10, column: 13, scope: !11) +!45 = !DILocation(line: 10, column: 18, scope: !11) +!46 = !DILocation(line: 10, column: 23, scope: !11) +!47 = !DILocation(line: 10, column: 3, scope: !11) Added: llvm/trunk/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-existence-2.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-existence-2.ll?rev=374099&view=auto ============================================================================== --- llvm/trunk/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-existence-2.ll (added) +++ llvm/trunk/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-existence-2.ll Tue Oct 8 11:23:17 2019 @@ -0,0 +1,121 @@ +; RUN: llc -march=bpfel -filetype=asm -o - %s | FileCheck -check-prefixes=CHECK %s +; RUN: llc -march=bpfeb -filetype=asm -o - %s | FileCheck -check-prefixes=CHECK %s +; Source code: +; 
typedef unsigned __uint; +; struct s1 { int a1; __uint a2:9; __uint a3:4; }; +; union u1 { int b1; struct s1 b2; }; +; enum { FIELD_EXISTENCE = 2, }; +; int test(union u1 *arg) { +; unsigned r1 = __builtin_preserve_field_info(arg->b2.a1, FIELD_EXISTENCE); +; unsigned r2 = __builtin_preserve_field_info(arg->b2.a3, FIELD_EXISTENCE); +; return r1 + r2; +; } +; Compilation flag: +; clang -target bpf -O2 -g -S -emit-llvm test.c + +%union.u1 = type { %struct.s1 } +%struct.s1 = type { i32, i16 } + +; Function Attrs: nounwind readnone +define dso_local i32 @test(%union.u1* %arg) local_unnamed_addr #0 !dbg !11 { +entry: + call void @llvm.dbg.value(metadata %union.u1* %arg, metadata !27, metadata !DIExpression()), !dbg !30 + %0 = tail call %union.u1* @llvm.preserve.union.access.index.p0s_union.u1s.p0s_union.u1s(%union.u1* %arg, i32 1), !dbg !31, !llvm.preserve.access.index !16 + %b2 = getelementptr inbounds %union.u1, %union.u1* %0, i64 0, i32 0, !dbg !31 + %1 = tail call i32* @llvm.preserve.struct.access.index.p0i32.p0s_struct.s1s(%struct.s1* %b2, i32 0, i32 0), !dbg !32, !llvm.preserve.access.index !20 + %2 = tail call i32 @llvm.bpf.preserve.field.info.p0i32(i32* %1, i64 2), !dbg !33 + call void @llvm.dbg.value(metadata i32 %2, metadata !28, metadata !DIExpression()), !dbg !30 + %3 = tail call i16* @llvm.preserve.struct.access.index.p0i16.p0s_struct.s1s(%struct.s1* %b2, i32 1, i32 2), !dbg !34, !llvm.preserve.access.index !20 + %4 = tail call i32 @llvm.bpf.preserve.field.info.p0i16(i16* %3, i64 2), !dbg !35 + call void @llvm.dbg.value(metadata i32 %4, metadata !29, metadata !DIExpression()), !dbg !30 + %add = add i32 %4, %2, !dbg !36 + ret i32 %add, !dbg !37 +} + +; CHECK: r1 = 1 +; CHECK: r0 = 1 +; CHECK: r0 += r1 +; CHECK: exit + +; CHECK: .long 1 # BTF_KIND_UNION(id = 2) +; CHECK: .ascii "u1" # string offset=1 +; CHECK: .ascii ".text" # string offset=55 +; CHECK: .ascii "0:1:0" # string offset=61 +; CHECK: .ascii "0:1:2" # string offset=104 + +; CHECK: .long 16 # 
FieldReloc +; CHECK-NEXT: .long 55 # Field reloc section string offset=55 +; CHECK-NEXT: .long 2 +; CHECK-NEXT: .long .Ltmp{{[0-9]+}} +; CHECK-NEXT: .long 2 +; CHECK-NEXT: .long 61 +; CHECK-NEXT: .long 2 +; CHECK-NEXT: .long .Ltmp{{[0-9]+}} +; CHECK-NEXT: .long 2 +; CHECK-NEXT: .long 104 +; CHECK-NEXT: .long 2 + +; Function Attrs: nounwind readnone +declare %union.u1* @llvm.preserve.union.access.index.p0s_union.u1s.p0s_union.u1s(%union.u1*, i32) #1 + +; Function Attrs: nounwind readnone +declare i32* @llvm.preserve.struct.access.index.p0i32.p0s_struct.s1s(%struct.s1*, i32, i32) #1 + +; Function Attrs: nounwind readnone +declare i32 @llvm.bpf.preserve.field.info.p0i32(i32*, i64) #1 + +; Function Attrs: nounwind readnone +declare i16* @llvm.preserve.struct.access.index.p0i16.p0s_struct.s1s(%struct.s1*, i32, i32) #1 + +; Function Attrs: nounwind readnone +declare i32 @llvm.bpf.preserve.field.info.p0i16(i16*, i64) #1 + +; Function Attrs: nounwind readnone speculatable willreturn +declare void @llvm.dbg.value(metadata, metadata, metadata) #2 + +attributes #0 = { nounwind readnone "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "frame-pointer"="all" "less-precise-fpmad"="false" "min-legal-vector-width"="0" "no-infs-fp-math"="false" "no-jump-tables"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="false" "stack-protector-buffer-size"="8" "unsafe-fp-math"="false" "use-soft-float"="false" } +attributes #1 = { nounwind readnone } +attributes #2 = { nounwind readnone speculatable willreturn } + +!llvm.dbg.cu = !{!0} +!llvm.module.flags = !{!7, !8, !9} +!llvm.ident = !{!10} + +!0 = distinct !DICompileUnit(language: DW_LANG_C99, file: !1, producer: "clang version 10.0.0 (https://github.com/llvm/llvm-project.git 4a60741b74384f14b21fdc0131ede326438840ab)", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, enums: !2, nameTableKind: None) +!1 = !DIFile(filename: "test.c", directory: 
"/tmp/home/yhs/work/tests/core") +!2 = !{!3} +!3 = !DICompositeType(tag: DW_TAG_enumeration_type, file: !1, line: 4, baseType: !4, size: 32, elements: !5) +!4 = !DIBasicType(name: "unsigned int", size: 32, encoding: DW_ATE_unsigned) +!5 = !{!6} +!6 = !DIEnumerator(name: "FIELD_EXISTENCE", value: 2, isUnsigned: true) +!7 = !{i32 2, !"Dwarf Version", i32 4} +!8 = !{i32 2, !"Debug Info Version", i32 3} +!9 = !{i32 1, !"wchar_size", i32 4} +!10 = !{!"clang version 10.0.0 (https://github.com/llvm/llvm-project.git 4a60741b74384f14b21fdc0131ede326438840ab)"} +!11 = distinct !DISubprogram(name: "test", scope: !1, file: !1, line: 5, type: !12, scopeLine: 5, flags: DIFlagPrototyped, isDefinition: true, isOptimized: true, unit: !0, retainedNodes: !26) +!12 = !DISubroutineType(types: !13) +!13 = !{!14, !15} +!14 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed) +!15 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !16, size: 64) +!16 = distinct !DICompositeType(tag: DW_TAG_union_type, name: "u1", file: !1, line: 3, size: 64, elements: !17) +!17 = !{!18, !19} +!18 = !DIDerivedType(tag: DW_TAG_member, name: "b1", scope: !16, file: !1, line: 3, baseType: !14, size: 32) +!19 = !DIDerivedType(tag: DW_TAG_member, name: "b2", scope: !16, file: !1, line: 3, baseType: !20, size: 64) +!20 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "s1", file: !1, line: 2, size: 64, elements: !21) +!21 = !{!22, !23, !25} +!22 = !DIDerivedType(tag: DW_TAG_member, name: "a1", scope: !20, file: !1, line: 2, baseType: !14, size: 32) +!23 = !DIDerivedType(tag: DW_TAG_member, name: "a2", scope: !20, file: !1, line: 2, baseType: !24, size: 9, offset: 32, flags: DIFlagBitField, extraData: i64 32) +!24 = !DIDerivedType(tag: DW_TAG_typedef, name: "__uint", file: !1, line: 1, baseType: !4) +!25 = !DIDerivedType(tag: DW_TAG_member, name: "a3", scope: !20, file: !1, line: 2, baseType: !24, size: 4, offset: 41, flags: DIFlagBitField, extraData: i64 32) +!26 = !{!27, !28, !29} +!27 = 
!DILocalVariable(name: "arg", arg: 1, scope: !11, file: !1, line: 5, type: !15) +!28 = !DILocalVariable(name: "r1", scope: !11, file: !1, line: 6, type: !4) +!29 = !DILocalVariable(name: "r2", scope: !11, file: !1, line: 7, type: !4) +!30 = !DILocation(line: 0, scope: !11) +!31 = !DILocation(line: 6, column: 52, scope: !11) +!32 = !DILocation(line: 6, column: 55, scope: !11) +!33 = !DILocation(line: 6, column: 17, scope: !11) +!34 = !DILocation(line: 7, column: 55, scope: !11) +!35 = !DILocation(line: 7, column: 17, scope: !11) +!36 = !DILocation(line: 8, column: 13, scope: !11) +!37 = !DILocation(line: 8, column: 3, scope: !11) Added: llvm/trunk/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-existence-3.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-existence-3.ll?rev=374099&view=auto ============================================================================== --- llvm/trunk/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-existence-3.ll (added) +++ llvm/trunk/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-existence-3.ll Tue Oct 8 11:23:17 2019 @@ -0,0 +1,129 @@ +; RUN: llc -march=bpfel -filetype=asm -o - %s | FileCheck -check-prefixes=CHECK %s +; RUN: llc -march=bpfeb -filetype=asm -o - %s | FileCheck -check-prefixes=CHECK %s +; Source code: +; typedef struct s1 { int a1[10][10]; } __s1; +; union u1 { int b1; __s1 b2; }; +; enum { FIELD_EXISTENCE = 2, }; +; int test(union u1 *arg) { +; unsigned r1 = __builtin_preserve_field_info(arg->b2.a1[5], FIELD_EXISTENCE); +; unsigned r2 = __builtin_preserve_field_info(arg->b2.a1[5][5], FIELD_EXISTENCE); +; return r1 + r2; +; } +; Compilation flag: +; clang -target bpf -O2 -g -S -emit-llvm test.c + +%union.u1 = type { %struct.s1 } +%struct.s1 = type { [10 x [10 x i32]] } + +; Function Attrs: nounwind readnone +define dso_local i32 @test(%union.u1* %arg) local_unnamed_addr #0 !dbg !18 { +entry: + call void @llvm.dbg.value(metadata %union.u1* %arg, metadata !31, metadata !DIExpression()), 
!dbg !34 + %0 = tail call %union.u1* @llvm.preserve.union.access.index.p0s_union.u1s.p0s_union.u1s(%union.u1* %arg, i32 1), !dbg !35, !llvm.preserve.access.index !22 + %b2 = getelementptr inbounds %union.u1, %union.u1* %0, i64 0, i32 0, !dbg !35 + %1 = tail call [10 x [10 x i32]]* @llvm.preserve.struct.access.index.p0a10a10i32.p0s_struct.s1s(%struct.s1* %b2, i32 0, i32 0), !dbg !36, !llvm.preserve.access.index !27 + %2 = tail call [10 x i32]* @llvm.preserve.array.access.index.p0a10i32.p0a10a10i32([10 x [10 x i32]]* %1, i32 1, i32 5), !dbg !37, !llvm.preserve.access.index !8 + %3 = tail call i32 @llvm.bpf.preserve.field.info.p0a10i32([10 x i32]* %2, i64 2), !dbg !38 + call void @llvm.dbg.value(metadata i32 %3, metadata !32, metadata !DIExpression()), !dbg !34 + %4 = tail call i32* @llvm.preserve.array.access.index.p0i32.p0a10i32([10 x i32]* %2, i32 1, i32 5), !dbg !39, !llvm.preserve.access.index !12 + %5 = tail call i32 @llvm.bpf.preserve.field.info.p0i32(i32* %4, i64 2), !dbg !40 + call void @llvm.dbg.value(metadata i32 %5, metadata !33, metadata !DIExpression()), !dbg !34 + %add = add i32 %5, %3, !dbg !41 + ret i32 %add, !dbg !42 +} + +; CHECK: r1 = 1 +; CHECK: r0 = 1 +; CHECK: r0 += r1 +; CHECK: exit + +; CHECK: .long 1 # BTF_KIND_UNION(id = 2) +; CHECK: .ascii "u1" # string offset=1 +; CHECK: .ascii ".text" # string offset=54 +; CHECK: .ascii "0:1:0:5" # string offset=60 +; CHECK: .ascii "0:1:0:5:5" # string offset=105 + +; CHECK: .long 16 # FieldReloc +; CHECK-NEXT: .long 54 # Field reloc section string offset=54 +; CHECK-NEXT: .long 2 +; CHECK-NEXT: .long .Ltmp{{[0-9]+}} +; CHECK-NEXT: .long 2 +; CHECK-NEXT: .long 60 +; CHECK-NEXT: .long 2 +; CHECK-NEXT: .long .Ltmp{{[0-9]+}} +; CHECK-NEXT: .long 2 +; CHECK-NEXT: .long 105 +; CHECK-NEXT: .long 2 + +; Function Attrs: nounwind readnone +declare %union.u1* @llvm.preserve.union.access.index.p0s_union.u1s.p0s_union.u1s(%union.u1*, i32) #1 + +; Function Attrs: nounwind readnone +declare [10 x [10 x i32]]* 
@llvm.preserve.struct.access.index.p0a10a10i32.p0s_struct.s1s(%struct.s1*, i32, i32) #1 + +; Function Attrs: nounwind readnone +declare [10 x i32]* @llvm.preserve.array.access.index.p0a10i32.p0a10a10i32([10 x [10 x i32]]*, i32, i32) #1 + +; Function Attrs: nounwind readnone +declare i32 @llvm.bpf.preserve.field.info.p0a10i32([10 x i32]*, i64) #1 + +; Function Attrs: nounwind readnone +declare i32* @llvm.preserve.array.access.index.p0i32.p0a10i32([10 x i32]*, i32, i32) #1 + +; Function Attrs: nounwind readnone +declare i32 @llvm.bpf.preserve.field.info.p0i32(i32*, i64) #1 + +; Function Attrs: nounwind readnone speculatable willreturn +declare void @llvm.dbg.value(metadata, metadata, metadata) #2 + +attributes #0 = { nounwind readnone "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "frame-pointer"="all" "less-precise-fpmad"="false" "min-legal-vector-width"="0" "no-infs-fp-math"="false" "no-jump-tables"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="false" "stack-protector-buffer-size"="8" "unsafe-fp-math"="false" "use-soft-float"="false" } +attributes #1 = { nounwind readnone } +attributes #2 = { nounwind readnone speculatable willreturn } + +!llvm.dbg.cu = !{!0} +!llvm.module.flags = !{!14, !15, !16} +!llvm.ident = !{!17} + +!0 = distinct !DICompileUnit(language: DW_LANG_C99, file: !1, producer: "clang version 10.0.0 (https://github.com/llvm/llvm-project.git c1e02f16f1105ffaf1c35ee8bc38b7d6db5c6ea9)", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, enums: !2, retainedTypes: !7, nameTableKind: None) +!1 = !DIFile(filename: "test.c", directory: "/tmp/home/yhs/work/tests/core") +!2 = !{!3} +!3 = !DICompositeType(tag: DW_TAG_enumeration_type, file: !1, line: 3, baseType: !4, size: 32, elements: !5) +!4 = !DIBasicType(name: "unsigned int", size: 32, encoding: DW_ATE_unsigned) +!5 = !{!6} +!6 = !DIEnumerator(name: "FIELD_EXISTENCE", value: 2, isUnsigned: true) +!7 = !{!8, !12} +!8 = 
!DICompositeType(tag: DW_TAG_array_type, baseType: !9, size: 3200, elements: !10) +!9 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed) +!10 = !{!11, !11} +!11 = !DISubrange(count: 10) +!12 = !DICompositeType(tag: DW_TAG_array_type, baseType: !9, size: 320, elements: !13) +!13 = !{!11} +!14 = !{i32 2, !"Dwarf Version", i32 4} +!15 = !{i32 2, !"Debug Info Version", i32 3} +!16 = !{i32 1, !"wchar_size", i32 4} +!17 = !{!"clang version 10.0.0 (https://github.com/llvm/llvm-project.git c1e02f16f1105ffaf1c35ee8bc38b7d6db5c6ea9)"} +!18 = distinct !DISubprogram(name: "test", scope: !1, file: !1, line: 4, type: !19, scopeLine: 4, flags: DIFlagPrototyped, isDefinition: true, isOptimized: true, unit: !0, retainedNodes: !30) +!19 = !DISubroutineType(types: !20) +!20 = !{!9, !21} +!21 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !22, size: 64) +!22 = distinct !DICompositeType(tag: DW_TAG_union_type, name: "u1", file: !1, line: 2, size: 3200, elements: !23) +!23 = !{!24, !25} +!24 = !DIDerivedType(tag: DW_TAG_member, name: "b1", scope: !22, file: !1, line: 2, baseType: !9, size: 32) +!25 = !DIDerivedType(tag: DW_TAG_member, name: "b2", scope: !22, file: !1, line: 2, baseType: !26, size: 3200) +!26 = !DIDerivedType(tag: DW_TAG_typedef, name: "__s1", file: !1, line: 1, baseType: !27) +!27 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "s1", file: !1, line: 1, size: 3200, elements: !28) +!28 = !{!29} +!29 = !DIDerivedType(tag: DW_TAG_member, name: "a1", scope: !27, file: !1, line: 1, baseType: !8, size: 3200) +!30 = !{!31, !32, !33} +!31 = !DILocalVariable(name: "arg", arg: 1, scope: !18, file: !1, line: 4, type: !21) +!32 = !DILocalVariable(name: "r1", scope: !18, file: !1, line: 5, type: !4) +!33 = !DILocalVariable(name: "r2", scope: !18, file: !1, line: 6, type: !4) +!34 = !DILocation(line: 0, scope: !18) +!35 = !DILocation(line: 5, column: 52, scope: !18) +!36 = !DILocation(line: 5, column: 55, scope: !18) +!37 = !DILocation(line: 5, column: 
47, scope: !18) +!38 = !DILocation(line: 5, column: 17, scope: !18) +!39 = !DILocation(line: 6, column: 47, scope: !18) +!40 = !DILocation(line: 6, column: 17, scope: !18) +!41 = !DILocation(line: 7, column: 13, scope: !18) +!42 = !DILocation(line: 7, column: 3, scope: !18) Added: llvm/trunk/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-lshift-1.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-lshift-1.ll?rev=374099&view=auto ============================================================================== --- llvm/trunk/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-lshift-1.ll (added) +++ llvm/trunk/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-lshift-1.ll Tue Oct 8 11:23:17 2019 @@ -0,0 +1,153 @@ +; RUN: llc -march=bpfel -filetype=asm -o - %s | FileCheck -check-prefixes=CHECK,CHECK-EL %s +; RUN: llc -march=bpfeb -filetype=asm -o - %s | FileCheck -check-prefixes=CHECK,CHECK-EB %s +; Source code: +; typedef struct s1 { int a1:7; int a2:4; int a3:5; int a4:16;} __s1; +; union u1 { int b1; __s1 b2; }; +; enum { FIELD_LSHIFT_U64 = 4, }; +; int test(union u1 *arg) { +; unsigned r1 = __builtin_preserve_field_info(arg->b2.a1, FIELD_LSHIFT_U64); +; unsigned r2 = __builtin_preserve_field_info(arg->b2.a2, FIELD_LSHIFT_U64); +; unsigned r3 = __builtin_preserve_field_info(arg->b2.a3, FIELD_LSHIFT_U64); +; unsigned r4 = __builtin_preserve_field_info(arg->b2.a4, FIELD_LSHIFT_U64); +; /* big endian: r1: 32, r2: 39, r3: 43, r4: 48 */ +; /* little endian: r1: 57, r2: 53, r3: 48, r4: 32 */ +; return r1 + r2 + r3 + r4; +; } +; Compilation flag: +; clang -target bpf -O2 -g -S -emit-llvm test.c + +%union.u1 = type { i32 } +%struct.s1 = type { i32 } + +; Function Attrs: nounwind readnone +define dso_local i32 @test(%union.u1* %arg) local_unnamed_addr #0 !dbg !11 { +entry: + call void @llvm.dbg.value(metadata %union.u1* %arg, metadata !28, metadata !DIExpression()), !dbg !33 + %0 = tail call %union.u1* 
@llvm.preserve.union.access.index.p0s_union.u1s.p0s_union.u1s(%union.u1* %arg, i32 1), !dbg !34, !llvm.preserve.access.index !16 + %b2 = bitcast %union.u1* %0 to %struct.s1*, !dbg !34 + %1 = tail call i32* @llvm.preserve.struct.access.index.p0i32.p0s_struct.s1s(%struct.s1* %b2, i32 0, i32 0), !dbg !35, !llvm.preserve.access.index !21 + %2 = tail call i32 @llvm.bpf.preserve.field.info.p0i32(i32* %1, i64 4), !dbg !36 + call void @llvm.dbg.value(metadata i32 %2, metadata !29, metadata !DIExpression()), !dbg !33 + %3 = tail call i32* @llvm.preserve.struct.access.index.p0i32.p0s_struct.s1s(%struct.s1* %b2, i32 0, i32 1), !dbg !37, !llvm.preserve.access.index !21 + %4 = tail call i32 @llvm.bpf.preserve.field.info.p0i32(i32* %3, i64 4), !dbg !38 + call void @llvm.dbg.value(metadata i32 %4, metadata !30, metadata !DIExpression()), !dbg !33 + %5 = tail call i32* @llvm.preserve.struct.access.index.p0i32.p0s_struct.s1s(%struct.s1* %b2, i32 0, i32 2), !dbg !39, !llvm.preserve.access.index !21 + %6 = tail call i32 @llvm.bpf.preserve.field.info.p0i32(i32* %5, i64 4), !dbg !40 + call void @llvm.dbg.value(metadata i32 %6, metadata !31, metadata !DIExpression()), !dbg !33 + %7 = tail call i32* @llvm.preserve.struct.access.index.p0i32.p0s_struct.s1s(%struct.s1* %b2, i32 0, i32 3), !dbg !41, !llvm.preserve.access.index !21 + %8 = tail call i32 @llvm.bpf.preserve.field.info.p0i32(i32* %7, i64 4), !dbg !42 + call void @llvm.dbg.value(metadata i32 %8, metadata !32, metadata !DIExpression()), !dbg !33 + %add = add i32 %4, %2, !dbg !43 + %add4 = add i32 %add, %6, !dbg !44 + %add5 = add i32 %add4, %8, !dbg !45 + ret i32 %add5, !dbg !46 +} + +; CHECK-EL: r1 = 57 +; CHECK-EL: r0 = 53 +; CHECK-EB: r1 = 32 +; CHECK-EB: r0 = 39 +; CHECK: r0 += r1 +; CHECK-EL: r1 = 48 +; CHECK-EB: r1 = 43 +; CHECK: r0 += r1 +; CHECK-EL: r1 = 32 +; CHECK-EB: r1 = 48 +; CHECK: r0 += r1 +; CHECK: exit + +; CHECK: .long 1 # BTF_KIND_UNION(id = 2) +; CHECK: .ascii "u1" # string offset=1 +; CHECK: .ascii ".text" # 
string offset=43 +; CHECK: .ascii "0:1:0" # string offset=49 +; CHECK: .ascii "0:1:1" # string offset=92 +; CHECK: .ascii "0:1:2" # string offset=98 +; CHECK: .ascii "0:1:3" # string offset=104 + +; CHECK: .long 16 # FieldReloc +; CHECK-NEXT: .long 43 # Field reloc section string offset=43 +; CHECK-NEXT: .long 4 +; CHECK-NEXT: .long .Ltmp{{[0-9]+}} +; CHECK-NEXT: .long 2 +; CHECK-NEXT: .long 49 +; CHECK-NEXT: .long 4 +; CHECK-NEXT: .long .Ltmp{{[0-9]+}} +; CHECK-NEXT: .long 2 +; CHECK-NEXT: .long 92 +; CHECK-NEXT: .long 4 +; CHECK-NEXT: .long .Ltmp{{[0-9]+}} +; CHECK-NEXT: .long 2 +; CHECK-NEXT: .long 98 +; CHECK-NEXT: .long 4 +; CHECK-NEXT: .long .Ltmp{{[0-9]+}} +; CHECK-NEXT: .long 2 +; CHECK-NEXT: .long 104 +; CHECK-NEXT: .long 4 + +; Function Attrs: nounwind readnone +declare %union.u1* @llvm.preserve.union.access.index.p0s_union.u1s.p0s_union.u1s(%union.u1*, i32) #1 + +; Function Attrs: nounwind readnone +declare i32* @llvm.preserve.struct.access.index.p0i32.p0s_struct.s1s(%struct.s1*, i32, i32) #1 + +; Function Attrs: nounwind readnone +declare i32 @llvm.bpf.preserve.field.info.p0i32(i32*, i64) #1 + +; Function Attrs: nounwind readnone speculatable willreturn +declare void @llvm.dbg.value(metadata, metadata, metadata) #2 + +attributes #0 = { nounwind readnone "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "frame-pointer"="all" "less-precise-fpmad"="false" "min-legal-vector-width"="0" "no-infs-fp-math"="false" "no-jump-tables"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="false" "stack-protector-buffer-size"="8" "unsafe-fp-math"="false" "use-soft-float"="false" } +attributes #1 = { nounwind readnone } +attributes #2 = { nounwind readnone speculatable willreturn } + +!llvm.dbg.cu = !{!0} +!llvm.module.flags = !{!7, !8, !9} +!llvm.ident = !{!10} + +!0 = distinct !DICompileUnit(language: DW_LANG_C99, file: !1, producer: "clang version 10.0.0 (https://github.com/llvm/llvm-project.git 
5635073377f153f7f2ff9b34c77af3c79885ff4a)", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, enums: !2, nameTableKind: None) +!1 = !DIFile(filename: "test.c", directory: "/tmp/home/yhs/work/tests/core") +!2 = !{!3} +!3 = !DICompositeType(tag: DW_TAG_enumeration_type, file: !1, line: 3, baseType: !4, size: 32, elements: !5) +!4 = !DIBasicType(name: "unsigned int", size: 32, encoding: DW_ATE_unsigned) +!5 = !{!6} +!6 = !DIEnumerator(name: "FIELD_LSHIFT_U64", value: 4, isUnsigned: true) +!7 = !{i32 2, !"Dwarf Version", i32 4} +!8 = !{i32 2, !"Debug Info Version", i32 3} +!9 = !{i32 1, !"wchar_size", i32 4} +!10 = !{!"clang version 10.0.0 (https://github.com/llvm/llvm-project.git 5635073377f153f7f2ff9b34c77af3c79885ff4a)"} +!11 = distinct !DISubprogram(name: "test", scope: !1, file: !1, line: 4, type: !12, scopeLine: 4, flags: DIFlagPrototyped, isDefinition: true, isOptimized: true, unit: !0, retainedNodes: !27) +!12 = !DISubroutineType(types: !13) +!13 = !{!14, !15} +!14 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed) +!15 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !16, size: 64) +!16 = distinct !DICompositeType(tag: DW_TAG_union_type, name: "u1", file: !1, line: 2, size: 32, elements: !17) +!17 = !{!18, !19} +!18 = !DIDerivedType(tag: DW_TAG_member, name: "b1", scope: !16, file: !1, line: 2, baseType: !14, size: 32) +!19 = !DIDerivedType(tag: DW_TAG_member, name: "b2", scope: !16, file: !1, line: 2, baseType: !20, size: 32) +!20 = !DIDerivedType(tag: DW_TAG_typedef, name: "__s1", file: !1, line: 1, baseType: !21) +!21 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "s1", file: !1, line: 1, size: 32, elements: !22) +!22 = !{!23, !24, !25, !26} +!23 = !DIDerivedType(tag: DW_TAG_member, name: "a1", scope: !21, file: !1, line: 1, baseType: !14, size: 7, flags: DIFlagBitField, extraData: i64 0) +!24 = !DIDerivedType(tag: DW_TAG_member, name: "a2", scope: !21, file: !1, line: 1, baseType: !14, size: 4, offset: 7, flags: 
DIFlagBitField, extraData: i64 0) +!25 = !DIDerivedType(tag: DW_TAG_member, name: "a3", scope: !21, file: !1, line: 1, baseType: !14, size: 5, offset: 11, flags: DIFlagBitField, extraData: i64 0) +!26 = !DIDerivedType(tag: DW_TAG_member, name: "a4", scope: !21, file: !1, line: 1, baseType: !14, size: 16, offset: 16, flags: DIFlagBitField, extraData: i64 0) +!27 = !{!28, !29, !30, !31, !32} +!28 = !DILocalVariable(name: "arg", arg: 1, scope: !11, file: !1, line: 4, type: !15) +!29 = !DILocalVariable(name: "r1", scope: !11, file: !1, line: 5, type: !4) +!30 = !DILocalVariable(name: "r2", scope: !11, file: !1, line: 6, type: !4) +!31 = !DILocalVariable(name: "r3", scope: !11, file: !1, line: 7, type: !4) +!32 = !DILocalVariable(name: "r4", scope: !11, file: !1, line: 8, type: !4) +!33 = !DILocation(line: 0, scope: !11) +!34 = !DILocation(line: 5, column: 52, scope: !11) +!35 = !DILocation(line: 5, column: 55, scope: !11) +!36 = !DILocation(line: 5, column: 17, scope: !11) +!37 = !DILocation(line: 6, column: 55, scope: !11) +!38 = !DILocation(line: 6, column: 17, scope: !11) +!39 = !DILocation(line: 7, column: 55, scope: !11) +!40 = !DILocation(line: 7, column: 17, scope: !11) +!41 = !DILocation(line: 8, column: 55, scope: !11) +!42 = !DILocation(line: 8, column: 17, scope: !11) +!43 = !DILocation(line: 11, column: 13, scope: !11) +!44 = !DILocation(line: 11, column: 18, scope: !11) +!45 = !DILocation(line: 11, column: 23, scope: !11) +!46 = !DILocation(line: 11, column: 3, scope: !11) Added: llvm/trunk/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-lshift-2.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-lshift-2.ll?rev=374099&view=auto ============================================================================== --- llvm/trunk/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-lshift-2.ll (added) +++ llvm/trunk/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-lshift-2.ll Tue Oct 8 11:23:17 2019 @@ -0,0 +1,122 @@ +; RUN: llc -march=bpfel 
-filetype=asm -o - %s | FileCheck -check-prefixes=CHECK %s +; RUN: llc -march=bpfeb -filetype=asm -o - %s | FileCheck -check-prefixes=CHECK %s +; Source code: +; typedef struct s1 { int a1; short a2; } __s1; +; union u1 { int b1; __s1 b2; }; +; enum { FIELD_LSHIFT_U64 = 4, }; +; int test(union u1 *arg) { +; unsigned r1 = __builtin_preserve_field_info(arg->b2.a1, FIELD_LSHIFT_U64); +; unsigned r2 = __builtin_preserve_field_info(arg->b2.a2, FIELD_LSHIFT_U64); +; /* big endian: r1: 32, r2: 48 */ +; /* little endian: r1: 32, r2: 48 */ +; return r1 + r2; +; } +; Compilation flag: +; clang -target bpf -O2 -g -S -emit-llvm test.c + +%union.u1 = type { %struct.s1 } +%struct.s1 = type { i32, i16 } + +; Function Attrs: nounwind readnone +define dso_local i32 @test(%union.u1* %arg) local_unnamed_addr #0 !dbg !11 { +entry: + call void @llvm.dbg.value(metadata %union.u1* %arg, metadata !27, metadata !DIExpression()), !dbg !30 + %0 = tail call %union.u1* @llvm.preserve.union.access.index.p0s_union.u1s.p0s_union.u1s(%union.u1* %arg, i32 1), !dbg !31, !llvm.preserve.access.index !16 + %b2 = getelementptr %union.u1, %union.u1* %0, i64 0, i32 0, !dbg !31 + %1 = tail call i32* @llvm.preserve.struct.access.index.p0i32.p0s_struct.s1s(%struct.s1* %b2, i32 0, i32 0), !dbg !32, !llvm.preserve.access.index !21 + %2 = tail call i32 @llvm.bpf.preserve.field.info.p0i32(i32* %1, i64 4), !dbg !33 + call void @llvm.dbg.value(metadata i32 %2, metadata !28, metadata !DIExpression()), !dbg !30 + %3 = tail call i16* @llvm.preserve.struct.access.index.p0i16.p0s_struct.s1s(%struct.s1* %b2, i32 1, i32 1), !dbg !34, !llvm.preserve.access.index !21 + %4 = tail call i32 @llvm.bpf.preserve.field.info.p0i16(i16* %3, i64 4), !dbg !35 + call void @llvm.dbg.value(metadata i32 %4, metadata !29, metadata !DIExpression()), !dbg !30 + %add = add i32 %4, %2, !dbg !36 + ret i32 %add, !dbg !37 +} + +; CHECK: r1 = 32 +; CHECK: r0 = 48 +; CHECK: r0 += r1 +; CHECK: exit + +; CHECK: .long 1 # BTF_KIND_UNION(id = 2) +; 
CHECK: .ascii "u1" # string offset=1 +; CHECK: .ascii ".text" # string offset=43 +; CHECK: .ascii "0:1:0" # string offset=49 +; CHECK: .ascii "0:1:1" # string offset=92 + +; CHECK: .long 16 # FieldReloc +; CHECK-NEXT: .long 43 # Field reloc section string offset=43 +; CHECK-NEXT: .long 2 +; CHECK-NEXT: .long .Ltmp{{[0-9]+}} +; CHECK-NEXT: .long 2 +; CHECK-NEXT: .long 49 +; CHECK-NEXT: .long 4 +; CHECK-NEXT: .long .Ltmp{{[0-9]+}} +; CHECK-NEXT: .long 2 +; CHECK-NEXT: .long 92 +; CHECK-NEXT: .long 4 + +; Function Attrs: nounwind readnone +declare %union.u1* @llvm.preserve.union.access.index.p0s_union.u1s.p0s_union.u1s(%union.u1*, i32) #1 + +; Function Attrs: nounwind readnone +declare i32* @llvm.preserve.struct.access.index.p0i32.p0s_struct.s1s(%struct.s1*, i32, i32) #1 + +; Function Attrs: nounwind readnone +declare i32 @llvm.bpf.preserve.field.info.p0i32(i32*, i64) #1 + +; Function Attrs: nounwind readnone +declare i16* @llvm.preserve.struct.access.index.p0i16.p0s_struct.s1s(%struct.s1*, i32, i32) #1 + +; Function Attrs: nounwind readnone +declare i32 @llvm.bpf.preserve.field.info.p0i16(i16*, i64) #1 + +; Function Attrs: nounwind readnone speculatable willreturn +declare void @llvm.dbg.value(metadata, metadata, metadata) #2 + +attributes #0 = { nounwind readnone "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "frame-pointer"="all" "less-precise-fpmad"="false" "min-legal-vector-width"="0" "no-infs-fp-math"="false" "no-jump-tables"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="false" "stack-protector-buffer-size"="8" "unsafe-fp-math"="false" "use-soft-float"="false" } +attributes #1 = { nounwind readnone } +attributes #2 = { nounwind readnone speculatable willreturn } + +!llvm.dbg.cu = !{!0} +!llvm.module.flags = !{!7, !8, !9} +!llvm.ident = !{!10} + +!0 = distinct !DICompileUnit(language: DW_LANG_C99, file: !1, producer: "clang version 10.0.0 (https://github.com/llvm/llvm-project.git 
5635073377f153f7f2ff9b34c77af3c79885ff4a)", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, enums: !2, nameTableKind: None) +!1 = !DIFile(filename: "test.c", directory: "/tmp/home/yhs/work/tests/core") +!2 = !{!3} +!3 = !DICompositeType(tag: DW_TAG_enumeration_type, file: !1, line: 3, baseType: !4, size: 32, elements: !5) +!4 = !DIBasicType(name: "unsigned int", size: 32, encoding: DW_ATE_unsigned) +!5 = !{!6} +!6 = !DIEnumerator(name: "FIELD_LSHIFT_U64", value: 4, isUnsigned: true) +!7 = !{i32 2, !"Dwarf Version", i32 4} +!8 = !{i32 2, !"Debug Info Version", i32 3} +!9 = !{i32 1, !"wchar_size", i32 4} +!10 = !{!"clang version 10.0.0 (https://github.com/llvm/llvm-project.git 5635073377f153f7f2ff9b34c77af3c79885ff4a)"} +!11 = distinct !DISubprogram(name: "test", scope: !1, file: !1, line: 4, type: !12, scopeLine: 4, flags: DIFlagPrototyped, isDefinition: true, isOptimized: true, unit: !0, retainedNodes: !26) +!12 = !DISubroutineType(types: !13) +!13 = !{!14, !15} +!14 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed) +!15 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !16, size: 64) +!16 = distinct !DICompositeType(tag: DW_TAG_union_type, name: "u1", file: !1, line: 2, size: 64, elements: !17) +!17 = !{!18, !19} +!18 = !DIDerivedType(tag: DW_TAG_member, name: "b1", scope: !16, file: !1, line: 2, baseType: !14, size: 32) +!19 = !DIDerivedType(tag: DW_TAG_member, name: "b2", scope: !16, file: !1, line: 2, baseType: !20, size: 64) +!20 = !DIDerivedType(tag: DW_TAG_typedef, name: "__s1", file: !1, line: 1, baseType: !21) +!21 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "s1", file: !1, line: 1, size: 64, elements: !22) +!22 = !{!23, !24} +!23 = !DIDerivedType(tag: DW_TAG_member, name: "a1", scope: !21, file: !1, line: 1, baseType: !14, size: 32) +!24 = !DIDerivedType(tag: DW_TAG_member, name: "a2", scope: !21, file: !1, line: 1, baseType: !25, size: 16, offset: 32) +!25 = !DIBasicType(name: "short", size: 16, encoding: 
DW_ATE_signed) +!26 = !{!27, !28, !29} +!27 = !DILocalVariable(name: "arg", arg: 1, scope: !11, file: !1, line: 4, type: !15) +!28 = !DILocalVariable(name: "r1", scope: !11, file: !1, line: 5, type: !4) +!29 = !DILocalVariable(name: "r2", scope: !11, file: !1, line: 6, type: !4) +!30 = !DILocation(line: 0, scope: !11) +!31 = !DILocation(line: 5, column: 52, scope: !11) +!32 = !DILocation(line: 5, column: 55, scope: !11) +!33 = !DILocation(line: 5, column: 17, scope: !11) +!34 = !DILocation(line: 6, column: 55, scope: !11) +!35 = !DILocation(line: 6, column: 17, scope: !11) +!36 = !DILocation(line: 9, column: 13, scope: !11) +!37 = !DILocation(line: 9, column: 3, scope: !11) Added: llvm/trunk/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-rshift-1.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-rshift-1.ll?rev=374099&view=auto ============================================================================== --- llvm/trunk/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-rshift-1.ll (added) +++ llvm/trunk/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-rshift-1.ll Tue Oct 8 11:23:17 2019 @@ -0,0 +1,148 @@ +; RUN: llc -march=bpfel -filetype=asm -o - %s | FileCheck -check-prefixes=CHECK %s +; RUN: llc -march=bpfeb -filetype=asm -o - %s | FileCheck -check-prefixes=CHECK %s +; Source code: +; typedef struct s1 { int a1:7; int a2:4; int a3:5; int a4:16;} __s1; +; union u1 { int b1; __s1 b2; }; +; enum { FIELD_RSHIFT_U64 = 5, }; +; int test(union u1 *arg) { +; unsigned r1 = __builtin_preserve_field_info(arg->b2.a1, FIELD_RSHIFT_U64); +; unsigned r2 = __builtin_preserve_field_info(arg->b2.a2, FIELD_RSHIFT_U64); +; unsigned r3 = __builtin_preserve_field_info(arg->b2.a3, FIELD_RSHIFT_U64); +; unsigned r4 = __builtin_preserve_field_info(arg->b2.a4, FIELD_RSHIFT_U64); +; /* r1: 57, r2: 60, r3: 59, r4: 48 */ +; return r1 + r2 + r3 + r4; +; } +; Compilation flag: +; clang -target bpf -O2 -g -S -emit-llvm test.c + +%union.u1 = type { i32 } +%struct.s1 = 
type { i32 } + +; Function Attrs: nounwind readnone +define dso_local i32 @test(%union.u1* %arg) local_unnamed_addr #0 !dbg !11 { +entry: + call void @llvm.dbg.value(metadata %union.u1* %arg, metadata !28, metadata !DIExpression()), !dbg !33 + %0 = tail call %union.u1* @llvm.preserve.union.access.index.p0s_union.u1s.p0s_union.u1s(%union.u1* %arg, i32 1), !dbg !34, !llvm.preserve.access.index !16 + %b2 = bitcast %union.u1* %0 to %struct.s1*, !dbg !34 + %1 = tail call i32* @llvm.preserve.struct.access.index.p0i32.p0s_struct.s1s(%struct.s1* %b2, i32 0, i32 0), !dbg !35, !llvm.preserve.access.index !21 + %2 = tail call i32 @llvm.bpf.preserve.field.info.p0i32(i32* %1, i64 5), !dbg !36 + call void @llvm.dbg.value(metadata i32 %2, metadata !29, metadata !DIExpression()), !dbg !33 + %3 = tail call i32* @llvm.preserve.struct.access.index.p0i32.p0s_struct.s1s(%struct.s1* %b2, i32 0, i32 1), !dbg !37, !llvm.preserve.access.index !21 + %4 = tail call i32 @llvm.bpf.preserve.field.info.p0i32(i32* %3, i64 5), !dbg !38 + call void @llvm.dbg.value(metadata i32 %4, metadata !30, metadata !DIExpression()), !dbg !33 + %5 = tail call i32* @llvm.preserve.struct.access.index.p0i32.p0s_struct.s1s(%struct.s1* %b2, i32 0, i32 2), !dbg !39, !llvm.preserve.access.index !21 + %6 = tail call i32 @llvm.bpf.preserve.field.info.p0i32(i32* %5, i64 5), !dbg !40 + call void @llvm.dbg.value(metadata i32 %6, metadata !31, metadata !DIExpression()), !dbg !33 + %7 = tail call i32* @llvm.preserve.struct.access.index.p0i32.p0s_struct.s1s(%struct.s1* %b2, i32 0, i32 3), !dbg !41, !llvm.preserve.access.index !21 + %8 = tail call i32 @llvm.bpf.preserve.field.info.p0i32(i32* %7, i64 5), !dbg !42 + call void @llvm.dbg.value(metadata i32 %8, metadata !32, metadata !DIExpression()), !dbg !33 + %add = add i32 %4, %2, !dbg !43 + %add4 = add i32 %add, %6, !dbg !44 + %add5 = add i32 %add4, %8, !dbg !45 + ret i32 %add5, !dbg !46 +} + +; CHECK: r1 = 57 +; CHECK: r0 = 60 +; CHECK: r0 += r1 +; CHECK: r1 = 59 +; CHECK: r0 
+= r1 +; CHECK: r1 = 48 +; CHECK: r0 += r1 +; CHECK: exit + +; CHECK: .long 1 # BTF_KIND_UNION(id = 2) +; CHECK: .ascii "u1" # string offset=1 +; CHECK: .ascii ".text" # string offset=43 +; CHECK: .ascii "0:1:0" # string offset=49 +; CHECK: .ascii "0:1:1" # string offset=92 +; CHECK: .ascii "0:1:2" # string offset=98 +; CHECK: .ascii "0:1:3" # string offset=104 + +; CHECK: .long 16 # FieldReloc +; CHECK-NEXT: .long 43 # Field reloc section string offset=43 +; CHECK-NEXT: .long 4 +; CHECK-NEXT: .long .Ltmp{{[0-9]+}} +; CHECK-NEXT: .long 2 +; CHECK-NEXT: .long 49 +; CHECK-NEXT: .long 5 +; CHECK-NEXT: .long .Ltmp{{[0-9]+}} +; CHECK-NEXT: .long 2 +; CHECK-NEXT: .long 92 +; CHECK-NEXT: .long 5 +; CHECK-NEXT: .long .Ltmp{{[0-9]+}} +; CHECK-NEXT: .long 2 +; CHECK-NEXT: .long 98 +; CHECK-NEXT: .long 5 +; CHECK-NEXT: .long .Ltmp{{[0-9]+}} +; CHECK-NEXT: .long 2 +; CHECK-NEXT: .long 104 +; CHECK-NEXT: .long 5 + +; Function Attrs: nounwind readnone +declare %union.u1* @llvm.preserve.union.access.index.p0s_union.u1s.p0s_union.u1s(%union.u1*, i32) #1 + +; Function Attrs: nounwind readnone +declare i32* @llvm.preserve.struct.access.index.p0i32.p0s_struct.s1s(%struct.s1*, i32, i32) #1 + +; Function Attrs: nounwind readnone +declare i32 @llvm.bpf.preserve.field.info.p0i32(i32*, i64) #1 + +; Function Attrs: nounwind readnone speculatable willreturn +declare void @llvm.dbg.value(metadata, metadata, metadata) #2 + +attributes #0 = { nounwind readnone "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "frame-pointer"="all" "less-precise-fpmad"="false" "min-legal-vector-width"="0" "no-infs-fp-math"="false" "no-jump-tables"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="false" "stack-protector-buffer-size"="8" "unsafe-fp-math"="false" "use-soft-float"="false" } +attributes #1 = { nounwind readnone } +attributes #2 = { nounwind readnone speculatable willreturn } + +!llvm.dbg.cu = !{!0} +!llvm.module.flags = !{!7, !8, !9} 
+!llvm.ident = !{!10} + +!0 = distinct !DICompileUnit(language: DW_LANG_C99, file: !1, producer: "clang version 10.0.0 (https://github.com/llvm/llvm-project.git 4a60741b74384f14b21fdc0131ede326438840ab)", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, enums: !2, nameTableKind: None) +!1 = !DIFile(filename: "test.c", directory: "/tmp/home/yhs/work/tests/core") +!2 = !{!3} +!3 = !DICompositeType(tag: DW_TAG_enumeration_type, file: !1, line: 3, baseType: !4, size: 32, elements: !5) +!4 = !DIBasicType(name: "unsigned int", size: 32, encoding: DW_ATE_unsigned) +!5 = !{!6} +!6 = !DIEnumerator(name: "FIELD_RSHIFT_U64", value: 5, isUnsigned: true) +!7 = !{i32 2, !"Dwarf Version", i32 4} +!8 = !{i32 2, !"Debug Info Version", i32 3} +!9 = !{i32 1, !"wchar_size", i32 4} +!10 = !{!"clang version 10.0.0 (https://github.com/llvm/llvm-project.git 4a60741b74384f14b21fdc0131ede326438840ab)"} +!11 = distinct !DISubprogram(name: "test", scope: !1, file: !1, line: 4, type: !12, scopeLine: 4, flags: DIFlagPrototyped, isDefinition: true, isOptimized: true, unit: !0, retainedNodes: !27) +!12 = !DISubroutineType(types: !13) +!13 = !{!14, !15} +!14 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed) +!15 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !16, size: 64) +!16 = distinct !DICompositeType(tag: DW_TAG_union_type, name: "u1", file: !1, line: 2, size: 32, elements: !17) +!17 = !{!18, !19} +!18 = !DIDerivedType(tag: DW_TAG_member, name: "b1", scope: !16, file: !1, line: 2, baseType: !14, size: 32) +!19 = !DIDerivedType(tag: DW_TAG_member, name: "b2", scope: !16, file: !1, line: 2, baseType: !20, size: 32) +!20 = !DIDerivedType(tag: DW_TAG_typedef, name: "__s1", file: !1, line: 1, baseType: !21) +!21 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "s1", file: !1, line: 1, size: 32, elements: !22) +!22 = !{!23, !24, !25, !26} +!23 = !DIDerivedType(tag: DW_TAG_member, name: "a1", scope: !21, file: !1, line: 1, baseType: !14, size: 7, flags: 
DIFlagBitField, extraData: i64 0) +!24 = !DIDerivedType(tag: DW_TAG_member, name: "a2", scope: !21, file: !1, line: 1, baseType: !14, size: 4, offset: 7, flags: DIFlagBitField, extraData: i64 0) +!25 = !DIDerivedType(tag: DW_TAG_member, name: "a3", scope: !21, file: !1, line: 1, baseType: !14, size: 5, offset: 11, flags: DIFlagBitField, extraData: i64 0) +!26 = !DIDerivedType(tag: DW_TAG_member, name: "a4", scope: !21, file: !1, line: 1, baseType: !14, size: 16, offset: 16, flags: DIFlagBitField, extraData: i64 0) +!27 = !{!28, !29, !30, !31, !32} +!28 = !DILocalVariable(name: "arg", arg: 1, scope: !11, file: !1, line: 4, type: !15) +!29 = !DILocalVariable(name: "r1", scope: !11, file: !1, line: 5, type: !4) +!30 = !DILocalVariable(name: "r2", scope: !11, file: !1, line: 6, type: !4) +!31 = !DILocalVariable(name: "r3", scope: !11, file: !1, line: 7, type: !4) +!32 = !DILocalVariable(name: "r4", scope: !11, file: !1, line: 8, type: !4) +!33 = !DILocation(line: 0, scope: !11) +!34 = !DILocation(line: 5, column: 52, scope: !11) +!35 = !DILocation(line: 5, column: 55, scope: !11) +!36 = !DILocation(line: 5, column: 17, scope: !11) +!37 = !DILocation(line: 6, column: 55, scope: !11) +!38 = !DILocation(line: 6, column: 17, scope: !11) +!39 = !DILocation(line: 7, column: 55, scope: !11) +!40 = !DILocation(line: 7, column: 17, scope: !11) +!41 = !DILocation(line: 8, column: 55, scope: !11) +!42 = !DILocation(line: 8, column: 17, scope: !11) +!43 = !DILocation(line: 10, column: 13, scope: !11) +!44 = !DILocation(line: 10, column: 18, scope: !11) +!45 = !DILocation(line: 10, column: 23, scope: !11) +!46 = !DILocation(line: 10, column: 3, scope: !11) Added: llvm/trunk/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-rshift-2.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-rshift-2.ll?rev=374099&view=auto ============================================================================== --- 
llvm/trunk/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-rshift-2.ll (added) +++ llvm/trunk/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-rshift-2.ll Tue Oct 8 11:23:17 2019 @@ -0,0 +1,121 @@ +; RUN: llc -march=bpfel -filetype=asm -o - %s | FileCheck -check-prefixes=CHECK %s +; RUN: llc -march=bpfeb -filetype=asm -o - %s | FileCheck -check-prefixes=CHECK %s +; Source code: +; typedef struct s1 { int a1; char a2; } __s1; +; union u1 { int b1; __s1 b2; }; +; enum { FIELD_RSHIFT_U64 = 5, }; +; int test(union u1 *arg) { +; unsigned r1 = __builtin_preserve_field_info(arg->b2.a1, FIELD_RSHIFT_U64); +; unsigned r2 = __builtin_preserve_field_info(arg->b2.a2, FIELD_RSHIFT_U64); +; /* r1: 32, r2: 56 */ +; return r1 + r2; +; } +; Compilation flag: +; clang -target bpf -O2 -g -S -emit-llvm test.c + +%union.u1 = type { %struct.s1 } +%struct.s1 = type { i32, i8 } + +; Function Attrs: nounwind readnone +define dso_local i32 @test(%union.u1* %arg) local_unnamed_addr #0 !dbg !11 { +entry: + call void @llvm.dbg.value(metadata %union.u1* %arg, metadata !27, metadata !DIExpression()), !dbg !30 + %0 = tail call %union.u1* @llvm.preserve.union.access.index.p0s_union.u1s.p0s_union.u1s(%union.u1* %arg, i32 1), !dbg !31, !llvm.preserve.access.index !16 + %b2 = getelementptr inbounds %union.u1, %union.u1* %0, i64 0, i32 0, !dbg !31 + %1 = tail call i32* @llvm.preserve.struct.access.index.p0i32.p0s_struct.s1s(%struct.s1* %b2, i32 0, i32 0), !dbg !32, !llvm.preserve.access.index !21 + %2 = tail call i32 @llvm.bpf.preserve.field.info.p0i32(i32* %1, i64 5), !dbg !33 + call void @llvm.dbg.value(metadata i32 %2, metadata !28, metadata !DIExpression()), !dbg !30 + %3 = tail call i8* @llvm.preserve.struct.access.index.p0i8.p0s_struct.s1s(%struct.s1* %b2, i32 1, i32 1), !dbg !34, !llvm.preserve.access.index !21 + %4 = tail call i32 @llvm.bpf.preserve.field.info.p0i8(i8* %3, i64 5), !dbg !35 + call void @llvm.dbg.value(metadata i32 %4, metadata !29, metadata !DIExpression()), !dbg !30 + %add = add i32 %4, 
%2, !dbg !36 + ret i32 %add, !dbg !37 +} + +; CHECK: r1 = 32 +; CHECK: r0 = 56 +; CHECK: r0 += r1 +; CHECK: exit + +; CHECK: .long 1 # BTF_KIND_UNION(id = 2) +; CHECK: .ascii "u1" # string offset=1 +; CHECK: .ascii ".text" # string offset=42 +; CHECK: .ascii "0:1:0" # string offset=48 +; CHECK: .ascii "0:1:1" # string offset=91 + +; CHECK: .long 16 # FieldReloc +; CHECK-NEXT: .long 42 # Field reloc section string offset=42 +; CHECK-NEXT: .long 2 +; CHECK-NEXT: .long .Ltmp{{[0-9]+}} +; CHECK-NEXT: .long 2 +; CHECK-NEXT: .long 48 +; CHECK-NEXT: .long 5 +; CHECK-NEXT: .long .Ltmp{{[0-9]+}} +; CHECK-NEXT: .long 2 +; CHECK-NEXT: .long 91 +; CHECK-NEXT: .long 5 + +; Function Attrs: nounwind readnone +declare %union.u1* @llvm.preserve.union.access.index.p0s_union.u1s.p0s_union.u1s(%union.u1*, i32) #1 + +; Function Attrs: nounwind readnone +declare i32* @llvm.preserve.struct.access.index.p0i32.p0s_struct.s1s(%struct.s1*, i32, i32) #1 + +; Function Attrs: nounwind readnone +declare i32 @llvm.bpf.preserve.field.info.p0i32(i32*, i64) #1 + +; Function Attrs: nounwind readnone +declare i8* @llvm.preserve.struct.access.index.p0i8.p0s_struct.s1s(%struct.s1*, i32, i32) #1 + +; Function Attrs: nounwind readnone +declare i32 @llvm.bpf.preserve.field.info.p0i8(i8*, i64) #1 + +; Function Attrs: nounwind readnone speculatable willreturn +declare void @llvm.dbg.value(metadata, metadata, metadata) #2 + +attributes #0 = { nounwind readnone "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "frame-pointer"="all" "less-precise-fpmad"="false" "min-legal-vector-width"="0" "no-infs-fp-math"="false" "no-jump-tables"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="false" "stack-protector-buffer-size"="8" "unsafe-fp-math"="false" "use-soft-float"="false" } +attributes #1 = { nounwind readnone } +attributes #2 = { nounwind readnone speculatable willreturn } + +!llvm.dbg.cu = !{!0} +!llvm.module.flags = !{!7, !8, !9} +!llvm.ident = 
!{!10} + +!0 = distinct !DICompileUnit(language: DW_LANG_C99, file: !1, producer: "clang version 10.0.0 (https://github.com/llvm/llvm-project.git 4a60741b74384f14b21fdc0131ede326438840ab)", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, enums: !2, nameTableKind: None) +!1 = !DIFile(filename: "test.c", directory: "/tmp/home/yhs/work/tests/core") +!2 = !{!3} +!3 = !DICompositeType(tag: DW_TAG_enumeration_type, file: !1, line: 3, baseType: !4, size: 32, elements: !5) +!4 = !DIBasicType(name: "unsigned int", size: 32, encoding: DW_ATE_unsigned) +!5 = !{!6} +!6 = !DIEnumerator(name: "FIELD_RSHIFT_U64", value: 5, isUnsigned: true) +!7 = !{i32 2, !"Dwarf Version", i32 4} +!8 = !{i32 2, !"Debug Info Version", i32 3} +!9 = !{i32 1, !"wchar_size", i32 4} +!10 = !{!"clang version 10.0.0 (https://github.com/llvm/llvm-project.git 4a60741b74384f14b21fdc0131ede326438840ab)"} +!11 = distinct !DISubprogram(name: "test", scope: !1, file: !1, line: 4, type: !12, scopeLine: 4, flags: DIFlagPrototyped, isDefinition: true, isOptimized: true, unit: !0, retainedNodes: !26) +!12 = !DISubroutineType(types: !13) +!13 = !{!14, !15} +!14 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed) +!15 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !16, size: 64) +!16 = distinct !DICompositeType(tag: DW_TAG_union_type, name: "u1", file: !1, line: 2, size: 64, elements: !17) +!17 = !{!18, !19} +!18 = !DIDerivedType(tag: DW_TAG_member, name: "b1", scope: !16, file: !1, line: 2, baseType: !14, size: 32) +!19 = !DIDerivedType(tag: DW_TAG_member, name: "b2", scope: !16, file: !1, line: 2, baseType: !20, size: 64) +!20 = !DIDerivedType(tag: DW_TAG_typedef, name: "__s1", file: !1, line: 1, baseType: !21) +!21 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "s1", file: !1, line: 1, size: 64, elements: !22) +!22 = !{!23, !24} +!23 = !DIDerivedType(tag: DW_TAG_member, name: "a1", scope: !21, file: !1, line: 1, baseType: !14, size: 32) +!24 = !DIDerivedType(tag: 
DW_TAG_member, name: "a2", scope: !21, file: !1, line: 1, baseType: !25, size: 8, offset: 32) +!25 = !DIBasicType(name: "char", size: 8, encoding: DW_ATE_signed_char) +!26 = !{!27, !28, !29} +!27 = !DILocalVariable(name: "arg", arg: 1, scope: !11, file: !1, line: 4, type: !15) +!28 = !DILocalVariable(name: "r1", scope: !11, file: !1, line: 5, type: !4) +!29 = !DILocalVariable(name: "r2", scope: !11, file: !1, line: 6, type: !4) +!30 = !DILocation(line: 0, scope: !11) +!31 = !DILocation(line: 5, column: 52, scope: !11) +!32 = !DILocation(line: 5, column: 55, scope: !11) +!33 = !DILocation(line: 5, column: 17, scope: !11) +!34 = !DILocation(line: 6, column: 55, scope: !11) +!35 = !DILocation(line: 6, column: 17, scope: !11) +!36 = !DILocation(line: 8, column: 13, scope: !11) +!37 = !DILocation(line: 8, column: 3, scope: !11) Added: llvm/trunk/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-rshift-3.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-rshift-3.ll?rev=374099&view=auto ============================================================================== --- llvm/trunk/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-rshift-3.ll (added) +++ llvm/trunk/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-rshift-3.ll Tue Oct 8 11:23:17 2019 @@ -0,0 +1,131 @@ +; RUN: llc -march=bpfel -filetype=asm -o - %s | FileCheck -check-prefixes=CHECK %s +; RUN: llc -march=bpfeb -filetype=asm -o - %s | FileCheck -check-prefixes=CHECK %s +; Source code: +; typedef struct s1 { char a1 [5][5]; } __s1; +; union u1 { int b1; __s1 b2; }; +; enum { FIELD_RSHIFT_U64 = 5, }; +; int test(union u1 *arg) { +; unsigned r1 = __builtin_preserve_field_info(arg->b2.a1[3], FIELD_RSHIFT_U64); +; unsigned r2 = __builtin_preserve_field_info(arg->b2.a1[3][3], FIELD_RSHIFT_U64); +; /* r1 : 24, r2 : 56 */ +; return r1 + r2; +; } +; Compilation flag: +; clang -target bpf -O2 -g -S -emit-llvm test.c + +%union.u1 = type { i32, [24 x i8] } +%struct.s1 = type { [5 x [5 x i8]] } + +; 
Function Attrs: nounwind readnone +define dso_local i32 @test(%union.u1* %arg) local_unnamed_addr #0 !dbg !18 { +entry: + call void @llvm.dbg.value(metadata %union.u1* %arg, metadata !32, metadata !DIExpression()), !dbg !35 + %0 = tail call %union.u1* @llvm.preserve.union.access.index.p0s_union.u1s.p0s_union.u1s(%union.u1* %arg, i32 1), !dbg !36, !llvm.preserve.access.index !23 + %b2 = bitcast %union.u1* %0 to %struct.s1*, !dbg !36 + %1 = tail call [5 x [5 x i8]]* @llvm.preserve.struct.access.index.p0a5a5i8.p0s_struct.s1s(%struct.s1* %b2, i32 0, i32 0), !dbg !37, !llvm.preserve.access.index !28 + %2 = tail call [5 x i8]* @llvm.preserve.array.access.index.p0a5i8.p0a5a5i8([5 x [5 x i8]]* %1, i32 1, i32 3), !dbg !38, !llvm.preserve.access.index !8 + %3 = tail call i32 @llvm.bpf.preserve.field.info.p0a5i8([5 x i8]* %2, i64 5), !dbg !39 + call void @llvm.dbg.value(metadata i32 %3, metadata !33, metadata !DIExpression()), !dbg !35 + %4 = tail call i8* @llvm.preserve.array.access.index.p0i8.p0a5i8([5 x i8]* %2, i32 1, i32 3), !dbg !40, !llvm.preserve.access.index !12 + %5 = tail call i32 @llvm.bpf.preserve.field.info.p0i8(i8* %4, i64 5), !dbg !41 + call void @llvm.dbg.value(metadata i32 %5, metadata !34, metadata !DIExpression()), !dbg !35 + %add = add i32 %5, %3, !dbg !42 + ret i32 %add, !dbg !43 +} + +; CHECK: r1 = 24 +; CHECK: r0 = 56 +; CHECK: r0 += r1 +; CHECK: exit + +; CHECK: .long 1 # BTF_KIND_UNION(id = 2) +; CHECK: .ascii "u1" # string offset=1 +; CHECK: .ascii ".text" # string offset=59 +; CHECK: .ascii "0:1:0:3" # string offset=65 +; CHECK: .ascii "0:1:0:3:3" # string offset=110 + +; CHECK: .long 16 # FieldReloc +; CHECK-NEXT: .long 59 # Field reloc section string offset=59 +; CHECK-NEXT: .long 2 +; CHECK-NEXT: .long .Ltmp{{[0-9]+}} +; CHECK-NEXT: .long 2 +; CHECK-NEXT: .long 65 +; CHECK-NEXT: .long 5 +; CHECK-NEXT: .long .Ltmp{{[0-9]+}} +; CHECK-NEXT: .long 2 +; CHECK-NEXT: .long 110 +; CHECK-NEXT: .long 5 + +; Function Attrs: nounwind readnone +declare 
%union.u1* @llvm.preserve.union.access.index.p0s_union.u1s.p0s_union.u1s(%union.u1*, i32) #1 + +; Function Attrs: nounwind readnone +declare [5 x [5 x i8]]* @llvm.preserve.struct.access.index.p0a5a5i8.p0s_struct.s1s(%struct.s1*, i32, i32) #1 + +; Function Attrs: nounwind readnone +declare [5 x i8]* @llvm.preserve.array.access.index.p0a5i8.p0a5a5i8([5 x [5 x i8]]*, i32, i32) #1 + +; Function Attrs: nounwind readnone +declare i32 @llvm.bpf.preserve.field.info.p0a5i8([5 x i8]*, i64) #1 + +; Function Attrs: nounwind readnone +declare i8* @llvm.preserve.array.access.index.p0i8.p0a5i8([5 x i8]*, i32, i32) #1 + +; Function Attrs: nounwind readnone +declare i32 @llvm.bpf.preserve.field.info.p0i8(i8*, i64) #1 + +; Function Attrs: nounwind readnone speculatable willreturn +declare void @llvm.dbg.value(metadata, metadata, metadata) #2 + +attributes #0 = { nounwind readnone "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "frame-pointer"="all" "less-precise-fpmad"="false" "min-legal-vector-width"="0" "no-infs-fp-math"="false" "no-jump-tables"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="false" "stack-protector-buffer-size"="8" "unsafe-fp-math"="false" "use-soft-float"="false" } +attributes #1 = { nounwind readnone } +attributes #2 = { nounwind readnone speculatable willreturn } + +!llvm.dbg.cu = !{!0} +!llvm.module.flags = !{!14, !15, !16} +!llvm.ident = !{!17} + +!0 = distinct !DICompileUnit(language: DW_LANG_C99, file: !1, producer: "clang version 10.0.0 (https://github.com/llvm/llvm-project.git c1e02f16f1105ffaf1c35ee8bc38b7d6db5c6ea9)", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, enums: !2, retainedTypes: !7, nameTableKind: None) +!1 = !DIFile(filename: "test.c", directory: "/tmp/home/yhs/work/tests/core") +!2 = !{!3} +!3 = !DICompositeType(tag: DW_TAG_enumeration_type, file: !1, line: 3, baseType: !4, size: 32, elements: !5) +!4 = !DIBasicType(name: "unsigned int", size: 32, 
encoding: DW_ATE_unsigned) +!5 = !{!6} +!6 = !DIEnumerator(name: "FIELD_RSHIFT_U64", value: 5, isUnsigned: true) +!7 = !{!8, !12} +!8 = !DICompositeType(tag: DW_TAG_array_type, baseType: !9, size: 200, elements: !10) +!9 = !DIBasicType(name: "char", size: 8, encoding: DW_ATE_signed_char) +!10 = !{!11, !11} +!11 = !DISubrange(count: 5) +!12 = !DICompositeType(tag: DW_TAG_array_type, baseType: !9, size: 40, elements: !13) +!13 = !{!11} +!14 = !{i32 2, !"Dwarf Version", i32 4} +!15 = !{i32 2, !"Debug Info Version", i32 3} +!16 = !{i32 1, !"wchar_size", i32 4} +!17 = !{!"clang version 10.0.0 (https://github.com/llvm/llvm-project.git c1e02f16f1105ffaf1c35ee8bc38b7d6db5c6ea9)"} +!18 = distinct !DISubprogram(name: "test", scope: !1, file: !1, line: 4, type: !19, scopeLine: 4, flags: DIFlagPrototyped, isDefinition: true, isOptimized: true, unit: !0, retainedNodes: !31) +!19 = !DISubroutineType(types: !20) +!20 = !{!21, !22} +!21 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed) +!22 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !23, size: 64) +!23 = distinct !DICompositeType(tag: DW_TAG_union_type, name: "u1", file: !1, line: 2, size: 224, elements: !24) +!24 = !{!25, !26} +!25 = !DIDerivedType(tag: DW_TAG_member, name: "b1", scope: !23, file: !1, line: 2, baseType: !21, size: 32) +!26 = !DIDerivedType(tag: DW_TAG_member, name: "b2", scope: !23, file: !1, line: 2, baseType: !27, size: 200) +!27 = !DIDerivedType(tag: DW_TAG_typedef, name: "__s1", file: !1, line: 1, baseType: !28) +!28 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "s1", file: !1, line: 1, size: 200, elements: !29) +!29 = !{!30} +!30 = !DIDerivedType(tag: DW_TAG_member, name: "a1", scope: !28, file: !1, line: 1, baseType: !8, size: 200) +!31 = !{!32, !33, !34} +!32 = !DILocalVariable(name: "arg", arg: 1, scope: !18, file: !1, line: 4, type: !22) +!33 = !DILocalVariable(name: "r1", scope: !18, file: !1, line: 5, type: !4) +!34 = !DILocalVariable(name: "r2", scope: !18, file: 
!1, line: 6, type: !4) +!35 = !DILocation(line: 0, scope: !18) +!36 = !DILocation(line: 5, column: 52, scope: !18) +!37 = !DILocation(line: 5, column: 55, scope: !18) +!38 = !DILocation(line: 5, column: 47, scope: !18) +!39 = !DILocation(line: 5, column: 17, scope: !18) +!40 = !DILocation(line: 6, column: 47, scope: !18) +!41 = !DILocation(line: 6, column: 17, scope: !18) +!42 = !DILocation(line: 8, column: 13, scope: !18) +!43 = !DILocation(line: 8, column: 3, scope: !18) Added: llvm/trunk/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-signedness-1.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-signedness-1.ll?rev=374099&view=auto ============================================================================== --- llvm/trunk/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-signedness-1.ll (added) +++ llvm/trunk/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-signedness-1.ll Tue Oct 8 11:23:17 2019 @@ -0,0 +1,162 @@ +; RUN: llc -march=bpfel -filetype=asm -o - %s | FileCheck -check-prefixes=CHECK %s +; RUN: llc -march=bpfeb -filetype=asm -o - %s | FileCheck -check-prefixes=CHECK %s +; Source code: +; typedef unsigned __uint; +; struct s1 { int a1; __uint a2:9; __uint a3:4; }; +; union u1 { int b1; __uint b2:9; __uint b3:4; }; +; enum { FIELD_SIGNEDNESS = 3, }; +; int test(struct s1 *arg1, union u1 *arg2) { +; unsigned r1 = __builtin_preserve_field_info(arg1->a1, FIELD_SIGNEDNESS); +; unsigned r2 = __builtin_preserve_field_info(arg1->a3, FIELD_SIGNEDNESS); +; unsigned r3 = __builtin_preserve_field_info(arg2->b1, FIELD_SIGNEDNESS); +; unsigned r4 = __builtin_preserve_field_info(arg2->b3, FIELD_SIGNEDNESS); +; return r1 + r2 + r3 + r4; +; } +; Compilation flag: +; clang -target bpf -O2 -g -S -emit-llvm test.c + +%struct.s1 = type { i32, i16 } +%union.u1 = type { i32 } + +; Function Attrs: nounwind readnone +define dso_local i32 @test(%struct.s1* %arg1, %union.u1* %arg2) local_unnamed_addr #0 !dbg !11 { +entry: + call void 
@llvm.dbg.value(metadata %struct.s1* %arg1, metadata !29, metadata !DIExpression()), !dbg !35 + call void @llvm.dbg.value(metadata %union.u1* %arg2, metadata !30, metadata !DIExpression()), !dbg !35 + %0 = tail call i32* @llvm.preserve.struct.access.index.p0i32.p0s_struct.s1s(%struct.s1* %arg1, i32 0, i32 0), !dbg !36, !llvm.preserve.access.index !16 + %1 = tail call i32 @llvm.bpf.preserve.field.info.p0i32(i32* %0, i64 3), !dbg !37 + call void @llvm.dbg.value(metadata i32 %1, metadata !31, metadata !DIExpression()), !dbg !35 + %2 = tail call i16* @llvm.preserve.struct.access.index.p0i16.p0s_struct.s1s(%struct.s1* %arg1, i32 1, i32 2), !dbg !38, !llvm.preserve.access.index !16 + %3 = tail call i32 @llvm.bpf.preserve.field.info.p0i16(i16* %2, i64 3), !dbg !39 + call void @llvm.dbg.value(metadata i32 %3, metadata !32, metadata !DIExpression()), !dbg !35 + %4 = tail call %union.u1* @llvm.preserve.union.access.index.p0s_union.u1s.p0s_union.u1s(%union.u1* %arg2, i32 0), !dbg !40, !llvm.preserve.access.index !23 + %b1 = getelementptr inbounds %union.u1, %union.u1* %4, i64 0, i32 0, !dbg !40 + %5 = tail call i32 @llvm.bpf.preserve.field.info.p0i32(i32* %b1, i64 3), !dbg !41 + call void @llvm.dbg.value(metadata i32 %5, metadata !33, metadata !DIExpression()), !dbg !35 + %6 = tail call i32* @llvm.preserve.struct.access.index.p0i32.p0s_union.u1s(%union.u1* %arg2, i32 0, i32 2), !dbg !42, !llvm.preserve.access.index !23 + %7 = bitcast i32* %6 to i8*, !dbg !42 + %8 = tail call i32 @llvm.bpf.preserve.field.info.p0i8(i8* %7, i64 3), !dbg !43 + call void @llvm.dbg.value(metadata i32 %8, metadata !34, metadata !DIExpression()), !dbg !35 + %add = add i32 %3, %1, !dbg !44 + %add1 = add i32 %add, %5, !dbg !45 + %add2 = add i32 %add1, %8, !dbg !46 + ret i32 %add2, !dbg !47 +} + +; CHECK: r1 = 1 +; CHECK: r0 = 0 +; CHECK: r0 += r1 +; CHECK: r1 = 1 +; CHECK: r0 += r1 +; CHECK: r1 = 0 +; CHECK: r0 += r1 +; CHECK: exit + +; CHECK: .long 1 # BTF_KIND_STRUCT(id = 2) +; CHECK: .long 37 # 
BTF_KIND_UNION(id = 7) +; CHECK: .ascii "s1" # string offset=1 +; CHECK: .ascii "u1" # string offset=37 +; CHECK: .ascii ".text" # string offset=64 +; CHECK: .ascii "0:0" # string offset=70 +; CHECK: .ascii "0:2" # string offset=111 + +; CHECK: .long 16 # FieldReloc +; CHECK-NEXT: .long 64 # Field reloc section string offset=64 +; CHECK-NEXT: .long 4 +; CHECK-NEXT: .long .Ltmp{{[0-9]+}} +; CHECK-NEXT: .long 2 +; CHECK-NEXT: .long 70 +; CHECK-NEXT: .long 3 +; CHECK-NEXT: .long .Ltmp{{[0-9]+}} +; CHECK-NEXT: .long 2 +; CHECK-NEXT: .long 111 +; CHECK-NEXT: .long 3 +; CHECK-NEXT: .long .Ltmp{{[0-9]+}} +; CHECK-NEXT: .long 7 +; CHECK-NEXT: .long 70 +; CHECK-NEXT: .long 3 +; CHECK-NEXT: .long .Ltmp{{[0-9]+}} +; CHECK-NEXT: .long 7 +; CHECK-NEXT: .long 111 +; CHECK-NEXT: .long 3 + +; Function Attrs: nounwind readnone +declare i32* @llvm.preserve.struct.access.index.p0i32.p0s_struct.s1s(%struct.s1*, i32, i32) #1 + +; Function Attrs: nounwind readnone +declare i32 @llvm.bpf.preserve.field.info.p0i32(i32*, i64) #1 + +; Function Attrs: nounwind readnone +declare i16* @llvm.preserve.struct.access.index.p0i16.p0s_struct.s1s(%struct.s1*, i32, i32) #1 + +; Function Attrs: nounwind readnone +declare i32 @llvm.bpf.preserve.field.info.p0i16(i16*, i64) #1 + +; Function Attrs: nounwind readnone +declare %union.u1* @llvm.preserve.union.access.index.p0s_union.u1s.p0s_union.u1s(%union.u1*, i32) #1 + +; Function Attrs: nounwind readnone +declare i32* @llvm.preserve.struct.access.index.p0i32.p0s_union.u1s(%union.u1*, i32, i32) #1 + +; Function Attrs: nounwind readnone +declare i32 @llvm.bpf.preserve.field.info.p0i8(i8*, i64) #1 + +; Function Attrs: nounwind readnone speculatable willreturn +declare void @llvm.dbg.value(metadata, metadata, metadata) #2 + +attributes #0 = { nounwind readnone "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "frame-pointer"="all" "less-precise-fpmad"="false" "min-legal-vector-width"="0" "no-infs-fp-math"="false" 
"no-jump-tables"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="false" "stack-protector-buffer-size"="8" "unsafe-fp-math"="false" "use-soft-float"="false" } +attributes #1 = { nounwind readnone } +attributes #2 = { nounwind readnone speculatable willreturn } + +!llvm.dbg.cu = !{!0} +!llvm.module.flags = !{!7, !8, !9} +!llvm.ident = !{!10} + +!0 = distinct !DICompileUnit(language: DW_LANG_C99, file: !1, producer: "clang version 10.0.0 (https://github.com/llvm/llvm-project.git 4a60741b74384f14b21fdc0131ede326438840ab)", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, enums: !2, nameTableKind: None) +!1 = !DIFile(filename: "test.c", directory: "/tmp/home/yhs/work/tests/core") +!2 = !{!3} +!3 = !DICompositeType(tag: DW_TAG_enumeration_type, file: !1, line: 4, baseType: !4, size: 32, elements: !5) +!4 = !DIBasicType(name: "unsigned int", size: 32, encoding: DW_ATE_unsigned) +!5 = !{!6} +!6 = !DIEnumerator(name: "FIELD_SIGNEDNESS", value: 3, isUnsigned: true) +!7 = !{i32 2, !"Dwarf Version", i32 4} +!8 = !{i32 2, !"Debug Info Version", i32 3} +!9 = !{i32 1, !"wchar_size", i32 4} +!10 = !{!"clang version 10.0.0 (https://github.com/llvm/llvm-project.git 4a60741b74384f14b21fdc0131ede326438840ab)"} +!11 = distinct !DISubprogram(name: "test", scope: !1, file: !1, line: 5, type: !12, scopeLine: 5, flags: DIFlagPrototyped, isDefinition: true, isOptimized: true, unit: !0, retainedNodes: !28) +!12 = !DISubroutineType(types: !13) +!13 = !{!14, !15, !22} +!14 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed) +!15 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !16, size: 64) +!16 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "s1", file: !1, line: 2, size: 64, elements: !17) +!17 = !{!18, !19, !21} +!18 = !DIDerivedType(tag: DW_TAG_member, name: "a1", scope: !16, file: !1, line: 2, baseType: !14, size: 32) +!19 = !DIDerivedType(tag: DW_TAG_member, name: "a2", scope: !16, file: !1, line: 2, 
baseType: !20, size: 9, offset: 32, flags: DIFlagBitField, extraData: i64 32) +!20 = !DIDerivedType(tag: DW_TAG_typedef, name: "__uint", file: !1, line: 1, baseType: !4) +!21 = !DIDerivedType(tag: DW_TAG_member, name: "a3", scope: !16, file: !1, line: 2, baseType: !20, size: 4, offset: 41, flags: DIFlagBitField, extraData: i64 32) +!22 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !23, size: 64) +!23 = distinct !DICompositeType(tag: DW_TAG_union_type, name: "u1", file: !1, line: 3, size: 32, elements: !24) +!24 = !{!25, !26, !27} +!25 = !DIDerivedType(tag: DW_TAG_member, name: "b1", scope: !23, file: !1, line: 3, baseType: !14, size: 32) +!26 = !DIDerivedType(tag: DW_TAG_member, name: "b2", scope: !23, file: !1, line: 3, baseType: !20, size: 9, flags: DIFlagBitField, extraData: i64 0) +!27 = !DIDerivedType(tag: DW_TAG_member, name: "b3", scope: !23, file: !1, line: 3, baseType: !20, size: 4, flags: DIFlagBitField, extraData: i64 0) +!28 = !{!29, !30, !31, !32, !33, !34} +!29 = !DILocalVariable(name: "arg1", arg: 1, scope: !11, file: !1, line: 5, type: !15) +!30 = !DILocalVariable(name: "arg2", arg: 2, scope: !11, file: !1, line: 5, type: !22) +!31 = !DILocalVariable(name: "r1", scope: !11, file: !1, line: 6, type: !4) +!32 = !DILocalVariable(name: "r2", scope: !11, file: !1, line: 7, type: !4) +!33 = !DILocalVariable(name: "r3", scope: !11, file: !1, line: 8, type: !4) +!34 = !DILocalVariable(name: "r4", scope: !11, file: !1, line: 9, type: !4) +!35 = !DILocation(line: 0, scope: !11) +!36 = !DILocation(line: 6, column: 53, scope: !11) +!37 = !DILocation(line: 6, column: 17, scope: !11) +!38 = !DILocation(line: 7, column: 53, scope: !11) +!39 = !DILocation(line: 7, column: 17, scope: !11) +!40 = !DILocation(line: 8, column: 53, scope: !11) +!41 = !DILocation(line: 8, column: 17, scope: !11) +!42 = !DILocation(line: 9, column: 53, scope: !11) +!43 = !DILocation(line: 9, column: 17, scope: !11) +!44 = !DILocation(line: 10, column: 13, scope: !11) +!45 = 
!DILocation(line: 10, column: 18, scope: !11) +!46 = !DILocation(line: 10, column: 23, scope: !11) +!47 = !DILocation(line: 10, column: 3, scope: !11) Added: llvm/trunk/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-signedness-2.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-signedness-2.ll?rev=374099&view=auto ============================================================================== --- llvm/trunk/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-signedness-2.ll (added) +++ llvm/trunk/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-signedness-2.ll Tue Oct 8 11:23:17 2019 @@ -0,0 +1,151 @@ +; RUN: llc -march=bpfel -filetype=asm -o - %s | FileCheck -check-prefixes=CHECK %s +; RUN: llc -march=bpfeb -filetype=asm -o - %s | FileCheck -check-prefixes=CHECK %s +; Source code: +; enum A { AA = -1, AB = 0, }; /* signed */ +; enum B { BA = 0, BB = 1, }; /* unsigned */ +; typedef enum A __A; +; typedef enum B __B; +; typedef int __int; /* signed */ +; struct s1 { __A a1; __B a2:9; __int a3:4; }; +; union u1 { int b1; struct s1 b2; }; +; enum { FIELD_SIGNEDNESS = 3, }; +; int test(union u1 *arg) { +; unsigned r1 = __builtin_preserve_field_info(arg->b2.a1, FIELD_SIGNEDNESS); +; unsigned r2 = __builtin_preserve_field_info(arg->b2.a2, FIELD_SIGNEDNESS); +; unsigned r3 = __builtin_preserve_field_info(arg->b2.a3, FIELD_SIGNEDNESS); +; return r1 + r2 + r3; +; } +; Compilation flag: +; clang -target bpf -O2 -g -S -emit-llvm test.c + +%union.u1 = type { %struct.s1 } +%struct.s1 = type { i32, i16 } + +; Function Attrs: nounwind readnone +define dso_local i32 @test(%union.u1* %arg) local_unnamed_addr #0 !dbg !20 { +entry: + call void @llvm.dbg.value(metadata %union.u1* %arg, metadata !37, metadata !DIExpression()), !dbg !41 + %0 = tail call %union.u1* @llvm.preserve.union.access.index.p0s_union.u1s.p0s_union.u1s(%union.u1* %arg, i32 1), !dbg !42, !llvm.preserve.access.index !24 + %b2 = getelementptr inbounds %union.u1, %union.u1* %0, i64 0, i32 0, 
!dbg !42 + %1 = tail call i32* @llvm.preserve.struct.access.index.p0i32.p0s_struct.s1s(%struct.s1* %b2, i32 0, i32 0), !dbg !43, !llvm.preserve.access.index !28 + %2 = tail call i32 @llvm.bpf.preserve.field.info.p0i32(i32* %1, i64 3), !dbg !44 + call void @llvm.dbg.value(metadata i32 %2, metadata !38, metadata !DIExpression()), !dbg !41 + %3 = tail call i16* @llvm.preserve.struct.access.index.p0i16.p0s_struct.s1s(%struct.s1* %b2, i32 1, i32 1), !dbg !45, !llvm.preserve.access.index !28 + %4 = tail call i32 @llvm.bpf.preserve.field.info.p0i16(i16* %3, i64 3), !dbg !46 + call void @llvm.dbg.value(metadata i32 %4, metadata !39, metadata !DIExpression()), !dbg !41 + %5 = tail call i16* @llvm.preserve.struct.access.index.p0i16.p0s_struct.s1s(%struct.s1* %b2, i32 1, i32 2), !dbg !47, !llvm.preserve.access.index !28 + %6 = tail call i32 @llvm.bpf.preserve.field.info.p0i16(i16* %5, i64 3), !dbg !48 + call void @llvm.dbg.value(metadata i32 %6, metadata !40, metadata !DIExpression()), !dbg !41 + %add = add i32 %4, %2, !dbg !49 + %add3 = add i32 %add, %6, !dbg !50 + ret i32 %add3, !dbg !51 +} + +; CHECK: r1 = 1 +; CHECK: r0 = 0 +; CHECK: r0 += r1 +; CHECK: r1 = 1 +; CHECK: r0 += r1 +; CHECK: exit + +; CHECK: .long 1 # BTF_KIND_UNION(id = 2) +; CHECK: .ascii "u1" # string offset=1 +; CHECK: .ascii ".text" # string offset=65 +; CHECK: .ascii "0:1:0" # string offset=71 +; CHECK: .ascii "0:1:1" # string offset=114 +; CHECK: .ascii "0:1:2" # string offset=120 + +; CHECK: .long 16 # FieldReloc +; CHECK-NEXT: .long 65 # Field reloc section string offset=65 +; CHECK-NEXT: .long 3 +; CHECK-NEXT: .long .Ltmp{{[0-9]+}} +; CHECK-NEXT: .long 2 +; CHECK-NEXT: .long 71 +; CHECK-NEXT: .long 3 +; CHECK-NEXT: .long .Ltmp{{[0-9]+}} +; CHECK-NEXT: .long 2 +; CHECK-NEXT: .long 114 +; CHECK-NEXT: .long 3 +; CHECK-NEXT: .long .Ltmp{{[0-9]+}} +; CHECK-NEXT: .long 2 +; CHECK-NEXT: .long 120 +; CHECK-NEXT: .long 3 + +; Function Attrs: nounwind readnone +declare %union.u1* 
@llvm.preserve.union.access.index.p0s_union.u1s.p0s_union.u1s(%union.u1*, i32) #1 + +; Function Attrs: nounwind readnone +declare i32* @llvm.preserve.struct.access.index.p0i32.p0s_struct.s1s(%struct.s1*, i32, i32) #1 + +; Function Attrs: nounwind readnone +declare i32 @llvm.bpf.preserve.field.info.p0i32(i32*, i64) #1 + +; Function Attrs: nounwind readnone +declare i16* @llvm.preserve.struct.access.index.p0i16.p0s_struct.s1s(%struct.s1*, i32, i32) #1 + +; Function Attrs: nounwind readnone +declare i32 @llvm.bpf.preserve.field.info.p0i16(i16*, i64) #1 + +; Function Attrs: nounwind readnone speculatable willreturn +declare void @llvm.dbg.value(metadata, metadata, metadata) #2 + +attributes #0 = { nounwind readnone "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "frame-pointer"="all" "less-precise-fpmad"="false" "min-legal-vector-width"="0" "no-infs-fp-math"="false" "no-jump-tables"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="false" "stack-protector-buffer-size"="8" "unsafe-fp-math"="false" "use-soft-float"="false" } +attributes #1 = { nounwind readnone } +attributes #2 = { nounwind readnone speculatable willreturn } + +!llvm.dbg.cu = !{!0} +!llvm.module.flags = !{!16, !17, !18} +!llvm.ident = !{!19} + +!0 = distinct !DICompileUnit(language: DW_LANG_C99, file: !1, producer: "clang version 10.0.0 (https://github.com/llvm/llvm-project.git 4a60741b74384f14b21fdc0131ede326438840ab)", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, enums: !2, nameTableKind: None) +!1 = !DIFile(filename: "test.c", directory: "/tmp/home/yhs/work/tests/core") +!2 = !{!3, !8, !13} +!3 = !DICompositeType(tag: DW_TAG_enumeration_type, name: "A", file: !1, line: 1, baseType: !4, size: 32, elements: !5) +!4 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed) +!5 = !{!6, !7} +!6 = !DIEnumerator(name: "AA", value: -1) +!7 = !DIEnumerator(name: "AB", value: 0) +!8 = !DICompositeType(tag: 
DW_TAG_enumeration_type, name: "B", file: !1, line: 2, baseType: !9, size: 32, elements: !10) +!9 = !DIBasicType(name: "unsigned int", size: 32, encoding: DW_ATE_unsigned) +!10 = !{!11, !12} +!11 = !DIEnumerator(name: "BA", value: 0, isUnsigned: true) +!12 = !DIEnumerator(name: "BB", value: 1, isUnsigned: true) +!13 = !DICompositeType(tag: DW_TAG_enumeration_type, file: !1, line: 8, baseType: !9, size: 32, elements: !14) +!14 = !{!15} +!15 = !DIEnumerator(name: "FIELD_SIGNEDNESS", value: 3, isUnsigned: true) +!16 = !{i32 2, !"Dwarf Version", i32 4} +!17 = !{i32 2, !"Debug Info Version", i32 3} +!18 = !{i32 1, !"wchar_size", i32 4} +!19 = !{!"clang version 10.0.0 (https://github.com/llvm/llvm-project.git 4a60741b74384f14b21fdc0131ede326438840ab)"} +!20 = distinct !DISubprogram(name: "test", scope: !1, file: !1, line: 9, type: !21, scopeLine: 9, flags: DIFlagPrototyped, isDefinition: true, isOptimized: true, unit: !0, retainedNodes: !36) +!21 = !DISubroutineType(types: !22) +!22 = !{!4, !23} +!23 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !24, size: 64) +!24 = distinct !DICompositeType(tag: DW_TAG_union_type, name: "u1", file: !1, line: 7, size: 64, elements: !25) +!25 = !{!26, !27} +!26 = !DIDerivedType(tag: DW_TAG_member, name: "b1", scope: !24, file: !1, line: 7, baseType: !4, size: 32) +!27 = !DIDerivedType(tag: DW_TAG_member, name: "b2", scope: !24, file: !1, line: 7, baseType: !28, size: 64) +!28 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "s1", file: !1, line: 6, size: 64, elements: !29) +!29 = !{!30, !32, !34} +!30 = !DIDerivedType(tag: DW_TAG_member, name: "a1", scope: !28, file: !1, line: 6, baseType: !31, size: 32) +!31 = !DIDerivedType(tag: DW_TAG_typedef, name: "__A", file: !1, line: 3, baseType: !3) +!32 = !DIDerivedType(tag: DW_TAG_member, name: "a2", scope: !28, file: !1, line: 6, baseType: !33, size: 9, offset: 32, flags: DIFlagBitField, extraData: i64 32) +!33 = !DIDerivedType(tag: DW_TAG_typedef, name: "__B", file: !1, 
line: 4, baseType: !8) +!34 = !DIDerivedType(tag: DW_TAG_member, name: "a3", scope: !28, file: !1, line: 6, baseType: !35, size: 4, offset: 41, flags: DIFlagBitField, extraData: i64 32) +!35 = !DIDerivedType(tag: DW_TAG_typedef, name: "__int", file: !1, line: 5, baseType: !4) +!36 = !{!37, !38, !39, !40} +!37 = !DILocalVariable(name: "arg", arg: 1, scope: !20, file: !1, line: 9, type: !23) +!38 = !DILocalVariable(name: "r1", scope: !20, file: !1, line: 10, type: !9) +!39 = !DILocalVariable(name: "r2", scope: !20, file: !1, line: 11, type: !9) +!40 = !DILocalVariable(name: "r3", scope: !20, file: !1, line: 12, type: !9) +!41 = !DILocation(line: 0, scope: !20) +!42 = !DILocation(line: 10, column: 52, scope: !20) +!43 = !DILocation(line: 10, column: 55, scope: !20) +!44 = !DILocation(line: 10, column: 17, scope: !20) +!45 = !DILocation(line: 11, column: 55, scope: !20) +!46 = !DILocation(line: 11, column: 17, scope: !20) +!47 = !DILocation(line: 12, column: 55, scope: !20) +!48 = !DILocation(line: 12, column: 17, scope: !20) +!49 = !DILocation(line: 13, column: 13, scope: !20) +!50 = !DILocation(line: 13, column: 18, scope: !20) +!51 = !DILocation(line: 13, column: 3, scope: !20) Added: llvm/trunk/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-signedness-3.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-signedness-3.ll?rev=374099&view=auto ============================================================================== --- llvm/trunk/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-signedness-3.ll (added) +++ llvm/trunk/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-signedness-3.ll Tue Oct 8 11:23:17 2019 @@ -0,0 +1,149 @@ +; RUN: llc -march=bpfel -filetype=asm -o - %s | FileCheck -check-prefixes=CHECK %s +; RUN: llc -march=bpfeb -filetype=asm -o - %s | FileCheck -check-prefixes=CHECK %s +; Source code: +; enum A { AA = -1, AB = 0, }; +; enum B { BA = 0, BB = 1, }; +; typedef enum A __A; +; typedef enum B __B; +; typedef struct s1 { __A 
a1[10]; __B a2[10][10]; } __s1; +; union u1 { int b1; __s1 b2; }; +; enum { FIELD_SIGNEDNESS = 3, }; +; int test(union u1 *arg) { +; unsigned r1 = __builtin_preserve_field_info(arg->b2.a1[5], FIELD_SIGNEDNESS); +; unsigned r2 = __builtin_preserve_field_info(arg->b2.a2[5][5], FIELD_SIGNEDNESS); +; /* r1 : 1, r2 : 0 */ +; return r1 + r2; +; } +; Compilation flag: +; clang -target bpf -O2 -g -S -emit-llvm test.c + +%union.u1 = type { %struct.s1 } +%struct.s1 = type { [10 x i32], [10 x [10 x i32]] } + +; Function Attrs: nounwind readnone +define dso_local i32 @test(%union.u1* %arg) local_unnamed_addr #0 !dbg !29 { +entry: + call void @llvm.dbg.value(metadata %union.u1* %arg, metadata !43, metadata !DIExpression()), !dbg !46 + %0 = tail call %union.u1* @llvm.preserve.union.access.index.p0s_union.u1s.p0s_union.u1s(%union.u1* %arg, i32 1), !dbg !47, !llvm.preserve.access.index !33 + %b2 = getelementptr inbounds %union.u1, %union.u1* %0, i64 0, i32 0, !dbg !47 + %1 = tail call [10 x i32]* @llvm.preserve.struct.access.index.p0a10i32.p0s_struct.s1s(%struct.s1* %b2, i32 0, i32 0), !dbg !48, !llvm.preserve.access.index !38 + %2 = tail call i32* @llvm.preserve.array.access.index.p0i32.p0a10i32([10 x i32]* %1, i32 1, i32 5), !dbg !49, !llvm.preserve.access.index !17 + %3 = tail call i32 @llvm.bpf.preserve.field.info.p0i32(i32* %2, i64 3), !dbg !50 + call void @llvm.dbg.value(metadata i32 %3, metadata !44, metadata !DIExpression()), !dbg !46 + %4 = tail call [10 x [10 x i32]]* @llvm.preserve.struct.access.index.p0a10a10i32.p0s_struct.s1s(%struct.s1* %b2, i32 1, i32 1), !dbg !51, !llvm.preserve.access.index !38 + %5 = tail call [10 x i32]* @llvm.preserve.array.access.index.p0a10i32.p0a10a10i32([10 x [10 x i32]]* %4, i32 1, i32 5), !dbg !52, !llvm.preserve.access.index !21 + %6 = tail call i32* @llvm.preserve.array.access.index.p0i32.p0a10i32([10 x i32]* %5, i32 1, i32 5), !dbg !52, !llvm.preserve.access.index !24 + %7 = tail call i32 @llvm.bpf.preserve.field.info.p0i32(i32* %6, 
i64 3), !dbg !53 + call void @llvm.dbg.value(metadata i32 %7, metadata !45, metadata !DIExpression()), !dbg !46 + %add = add i32 %7, %3, !dbg !54 + ret i32 %add, !dbg !55 +} + +; CHECK: r1 = 1 +; CHECK: r0 = 0 +; CHECK: r0 += r1 +; CHECK: exit + +; CHECK: .long 1 # BTF_KIND_UNION(id = 2) +; CHECK: .ascii "u1" # string offset=1 +; CHECK: .ascii ".text" # string offset=81 +; CHECK: .ascii "0:1:0:5" # string offset=87 +; CHECK: .ascii "0:1:1:5:5" # string offset=132 + +; CHECK: .long 16 # FieldReloc +; CHECK-NEXT: .long 81 # Field reloc section string offset=81 +; CHECK-NEXT: .long 2 +; CHECK-NEXT: .long .Ltmp{{[0-9]+}} +; CHECK-NEXT: .long 2 +; CHECK-NEXT: .long 87 +; CHECK-NEXT: .long 3 +; CHECK-NEXT: .long .Ltmp{{[0-9]+}} +; CHECK-NEXT: .long 2 +; CHECK-NEXT: .long 132 +; CHECK-NEXT: .long 3 + +; Function Attrs: nounwind readnone +declare %union.u1* @llvm.preserve.union.access.index.p0s_union.u1s.p0s_union.u1s(%union.u1*, i32) #1 + +; Function Attrs: nounwind readnone +declare [10 x i32]* @llvm.preserve.struct.access.index.p0a10i32.p0s_struct.s1s(%struct.s1*, i32, i32) #1 + +; Function Attrs: nounwind readnone +declare i32* @llvm.preserve.array.access.index.p0i32.p0a10i32([10 x i32]*, i32, i32) #1 + +; Function Attrs: nounwind readnone +declare i32 @llvm.bpf.preserve.field.info.p0i32(i32*, i64) #1 + +; Function Attrs: nounwind readnone +declare [10 x [10 x i32]]* @llvm.preserve.struct.access.index.p0a10a10i32.p0s_struct.s1s(%struct.s1*, i32, i32) #1 + +; Function Attrs: nounwind readnone +declare [10 x i32]* @llvm.preserve.array.access.index.p0a10i32.p0a10a10i32([10 x [10 x i32]]*, i32, i32) #1 + +; Function Attrs: nounwind readnone speculatable willreturn +declare void @llvm.dbg.value(metadata, metadata, metadata) #2 + +attributes #0 = { nounwind readnone "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "frame-pointer"="all" "less-precise-fpmad"="false" "min-legal-vector-width"="0" "no-infs-fp-math"="false" "no-jump-tables"="false" 
"no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="false" "stack-protector-buffer-size"="8" "unsafe-fp-math"="false" "use-soft-float"="false" } +attributes #1 = { nounwind readnone } +attributes #2 = { nounwind readnone speculatable willreturn } + +!llvm.dbg.cu = !{!0} +!llvm.module.flags = !{!25, !26, !27} +!llvm.ident = !{!28} + +!0 = distinct !DICompileUnit(language: DW_LANG_C99, file: !1, producer: "clang version 10.0.0 (https://github.com/llvm/llvm-project.git c1e02f16f1105ffaf1c35ee8bc38b7d6db5c6ea9)", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, enums: !2, retainedTypes: !16, nameTableKind: None) +!1 = !DIFile(filename: "test.c", directory: "/tmp/home/yhs/work/tests/core") +!2 = !{!3, !8, !13} +!3 = !DICompositeType(tag: DW_TAG_enumeration_type, name: "A", file: !1, line: 1, baseType: !4, size: 32, elements: !5) +!4 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed) +!5 = !{!6, !7} +!6 = !DIEnumerator(name: "AA", value: -1) +!7 = !DIEnumerator(name: "AB", value: 0) +!8 = !DICompositeType(tag: DW_TAG_enumeration_type, name: "B", file: !1, line: 2, baseType: !9, size: 32, elements: !10) +!9 = !DIBasicType(name: "unsigned int", size: 32, encoding: DW_ATE_unsigned) +!10 = !{!11, !12} +!11 = !DIEnumerator(name: "BA", value: 0, isUnsigned: true) +!12 = !DIEnumerator(name: "BB", value: 1, isUnsigned: true) +!13 = !DICompositeType(tag: DW_TAG_enumeration_type, file: !1, line: 7, baseType: !9, size: 32, elements: !14) +!14 = !{!15} +!15 = !DIEnumerator(name: "FIELD_SIGNEDNESS", value: 3, isUnsigned: true) +!16 = !{!17, !21, !24} +!17 = !DICompositeType(tag: DW_TAG_array_type, baseType: !18, size: 320, elements: !19) +!18 = !DIDerivedType(tag: DW_TAG_typedef, name: "__A", file: !1, line: 3, baseType: !3) +!19 = !{!20} +!20 = !DISubrange(count: 10) +!21 = !DICompositeType(tag: DW_TAG_array_type, baseType: !22, size: 3200, elements: !23) +!22 = !DIDerivedType(tag: DW_TAG_typedef, name: "__B", file: !1, line: 4, 
baseType: !8) +!23 = !{!20, !20} +!24 = !DICompositeType(tag: DW_TAG_array_type, baseType: !22, size: 320, elements: !19) +!25 = !{i32 2, !"Dwarf Version", i32 4} +!26 = !{i32 2, !"Debug Info Version", i32 3} +!27 = !{i32 1, !"wchar_size", i32 4} +!28 = !{!"clang version 10.0.0 (https://github.com/llvm/llvm-project.git c1e02f16f1105ffaf1c35ee8bc38b7d6db5c6ea9)"} +!29 = distinct !DISubprogram(name: "test", scope: !1, file: !1, line: 8, type: !30, scopeLine: 8, flags: DIFlagPrototyped, isDefinition: true, isOptimized: true, unit: !0, retainedNodes: !42) +!30 = !DISubroutineType(types: !31) +!31 = !{!4, !32} +!32 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !33, size: 64) +!33 = distinct !DICompositeType(tag: DW_TAG_union_type, name: "u1", file: !1, line: 6, size: 3520, elements: !34) +!34 = !{!35, !36} +!35 = !DIDerivedType(tag: DW_TAG_member, name: "b1", scope: !33, file: !1, line: 6, baseType: !4, size: 32) +!36 = !DIDerivedType(tag: DW_TAG_member, name: "b2", scope: !33, file: !1, line: 6, baseType: !37, size: 3520) +!37 = !DIDerivedType(tag: DW_TAG_typedef, name: "__s1", file: !1, line: 5, baseType: !38) +!38 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "s1", file: !1, line: 5, size: 3520, elements: !39) +!39 = !{!40, !41} +!40 = !DIDerivedType(tag: DW_TAG_member, name: "a1", scope: !38, file: !1, line: 5, baseType: !17, size: 320) +!41 = !DIDerivedType(tag: DW_TAG_member, name: "a2", scope: !38, file: !1, line: 5, baseType: !21, size: 3200, offset: 320) +!42 = !{!43, !44, !45} +!43 = !DILocalVariable(name: "arg", arg: 1, scope: !29, file: !1, line: 8, type: !32) +!44 = !DILocalVariable(name: "r1", scope: !29, file: !1, line: 9, type: !9) +!45 = !DILocalVariable(name: "r2", scope: !29, file: !1, line: 10, type: !9) +!46 = !DILocation(line: 0, scope: !29) +!47 = !DILocation(line: 9, column: 52, scope: !29) +!48 = !DILocation(line: 9, column: 55, scope: !29) +!49 = !DILocation(line: 9, column: 47, scope: !29) +!50 = !DILocation(line: 9, 
column: 17, scope: !29) +!51 = !DILocation(line: 10, column: 55, scope: !29) +!52 = !DILocation(line: 10, column: 47, scope: !29) +!53 = !DILocation(line: 10, column: 17, scope: !29) +!54 = !DILocation(line: 12, column: 13, scope: !29) +!55 = !DILocation(line: 12, column: 3, scope: !29) Modified: llvm/trunk/test/CodeGen/BPF/CORE/intrinsic-struct.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/BPF/CORE/intrinsic-struct.ll?rev=374099&r1=374098&r2=374099&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/BPF/CORE/intrinsic-struct.ll (original) +++ llvm/trunk/test/CodeGen/BPF/CORE/intrinsic-struct.ll Tue Oct 8 11:23:17 2019 @@ -28,12 +28,13 @@ entry: ; CHECK: exit ; ; CHECK: .section .BTF.ext,"",@progbits -; CHECK: .long 12 # OffsetReloc -; CHECK-NEXT: .long 20 # Offset reloc section string offset=20 +; CHECK: .long 16 # FieldReloc +; CHECK-NEXT: .long 20 # Field reloc section string offset=20 ; CHECK-NEXT: .long 1 ; CHECK-NEXT: .long [[RELOC]] ; CHECK-NEXT: .long 2 ; CHECK-NEXT: .long 26 +; CHECK-NEXT: .long 0 declare dso_local i32 @get_value(i8*) local_unnamed_addr #1 Modified: llvm/trunk/test/CodeGen/BPF/CORE/intrinsic-union.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/BPF/CORE/intrinsic-union.ll?rev=374099&r1=374098&r2=374099&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/BPF/CORE/intrinsic-union.ll (original) +++ llvm/trunk/test/CodeGen/BPF/CORE/intrinsic-union.ll Tue Oct 8 11:23:17 2019 @@ -27,12 +27,13 @@ entry: ; CHECK: exit ; CHECK: .section .BTF.ext,"",@progbits -; CHECK: .long 12 # OffsetReloc -; CHECK-NEXT: .long 20 # Offset reloc section string offset=20 +; CHECK: .long 16 # FieldReloc +; CHECK-NEXT: .long 20 # Field reloc section string offset=20 ; CHECK-NEXT: .long 1 ; CHECK-NEXT: .long [[RELOC]] ; CHECK-NEXT: .long 2 ; CHECK-NEXT: .long 26 +; CHECK-NEXT: .long 0 declare 
dso_local i32 @get_value(i8*) local_unnamed_addr #1 Modified: llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-access-str.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-access-str.ll?rev=374099&r1=374098&r2=374099&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-access-str.ll (original) +++ llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-access-str.ll Tue Oct 8 11:23:17 2019 @@ -33,15 +33,17 @@ entry: ; CHECK: .ascii "0:1" # string offset=[[ACCESS_STR:[0-9]+]] ; CHECK-NEXT: .byte 0 ; CHECK: .section .BTF.ext,"",@progbits -; CHECK: .long 12 # OffsetReloc -; CHECK-NEXT: .long [[SEC_INDEX]] # Offset reloc section string offset=[[SEC_INDEX]] +; CHECK: .long 16 # FieldReloc +; CHECK-NEXT: .long [[SEC_INDEX]] # Field reloc section string offset=[[SEC_INDEX]] ; CHECK-NEXT: .long 2 ; CHECK-NEXT: .long .Ltmp{{[0-9]+}} ; CHECK-NEXT: .long {{[0-9]+}} ; CHECK-NEXT: .long [[ACCESS_STR]] +; CHECK-NEXT: .long 0 ; CHECK-NEXT: .long .Ltmp{{[0-9]+}} ; CHECK-NEXT: .long {{[0-9]+}} ; CHECK-NEXT: .long [[ACCESS_STR]] +; CHECK-NEXT: .long 0 declare dso_local i32 @get_value(i8*, i8*) local_unnamed_addr #1 Modified: llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-basic.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-basic.ll?rev=374099&r1=374098&r2=374099&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-basic.ll (original) +++ llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-basic.ll Tue Oct 8 11:23:17 2019 @@ -109,17 +109,18 @@ define dso_local i32 @bpf_prog(%struct.s ; CHECK-NEXT: .long 20 ; CHECK-NEXT: .long 124 ; CHECK-NEXT: .long 144 -; CHECK-NEXT: .long 24 -; CHECK-NEXT: .long 168 +; CHECK-NEXT: .long 28 +; CHECK-NEXT: .long 172 ; CHECK-NEXT: .long 0 ; CHECK-NEXT: .long 8 # FuncInfo -; CHECK: .long 12 # OffsetReloc -; CHECK-NEXT: 
.long 43 # Offset reloc section string offset=43 +; CHECK: .long 16 # FieldReloc +; CHECK-NEXT: .long 43 # Field reloc section string offset=43 ; CHECK-NEXT: .long 1 ; CHECK-NEXT: .long .Ltmp2 ; CHECK-NEXT: .long 2 ; CHECK-NEXT: .long 86 +; CHECK-NEXT: .long 0 ; Function Attrs: argmemonly nounwind declare void @llvm.lifetime.start.p0i8(i64 immarg, i8* nocapture) #1 Modified: llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-cast-array-1.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-cast-array-1.ll?rev=374099&r1=374098&r2=374099&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-cast-array-1.ll (original) +++ llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-cast-array-1.ll Tue Oct 8 11:23:17 2019 @@ -45,15 +45,17 @@ entry: ; CHECK: .ascii "0:1:0" # string offset=52 ; CHECK: .ascii "2:1" # string offset=107 -; CHECK: .long 12 # OffsetReloc -; CHECK-NEXT: .long 46 # Offset reloc section string offset=46 +; CHECK: .long 16 # FieldReloc +; CHECK-NEXT: .long 46 # Field reloc section string offset=46 ; CHECK-NEXT: .long 2 ; CHECK-NEXT: .long .Ltmp{{[0-9]+}} ; CHECK-NEXT: .long [[TID1]] ; CHECK-NEXT: .long 52 +; CHECK-NEXT: .long 0 ; CHECK-NEXT: .long .Ltmp{{[0-9]+}} ; CHECK-NEXT: .long [[TID2]] ; CHECK-NEXT: .long 107 +; CHECK-NEXT: .long 0 declare dso_local i32 @get_value(i32*) local_unnamed_addr #1 Modified: llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-cast-array-2.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-cast-array-2.ll?rev=374099&r1=374098&r2=374099&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-cast-array-2.ll (original) +++ llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-cast-array-2.ll Tue Oct 8 11:23:17 2019 @@ -47,15 +47,17 @@ entry: ; CHECK: .ascii "v1" # string offset=100 ; CHECK: .ascii "11:1" # 
string offset=107 -; CHECK: .long 12 # OffsetReloc -; CHECK-NEXT: .long 46 # Offset reloc section string offset=46 +; CHECK: .long 16 # FieldReloc +; CHECK-NEXT: .long 46 # Field reloc section string offset=46 ; CHECK-NEXT: .long 2 ; CHECK-NEXT: .long .Ltmp{{[0-9]+}} ; CHECK-NEXT: .long [[TID1]] ; CHECK-NEXT: .long 52 +; CHECK-NEXT: .long 0 ; CHECK-NEXT: .long .Ltmp{{[0-9]+}} ; CHECK-NEXT: .long [[TID2]] ; CHECK-NEXT: .long 107 +; CHECK-NEXT: .long 0 declare dso_local i32 @get_value(i32*) local_unnamed_addr #1 Modified: llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-cast-struct-1.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-cast-struct-1.ll?rev=374099&r1=374098&r2=374099&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-cast-struct-1.ll (original) +++ llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-cast-struct-1.ll Tue Oct 8 11:23:17 2019 @@ -46,15 +46,17 @@ entry: ; CHECK: .ascii "v1" # string offset=81 ; CHECK-NEXT: .byte 0 -; CHECK: .long 12 # OffsetReloc -; CHECK-NEXT: .long [[SEC_STR]] # Offset reloc section string offset=[[SEC_STR]] +; CHECK: .long 16 # FieldReloc +; CHECK-NEXT: .long [[SEC_STR]] # Field reloc section string offset=[[SEC_STR]] ; CHECK-NEXT: .long 2 ; CHECK-NEXT: .long .Ltmp{{[0-9]+}} ; CHECK-NEXT: .long [[V3_TID]] ; CHECK-NEXT: .long [[ACCESS_STR]] +; CHECK-NEXT: .long 0 ; CHECK-NEXT: .long .Ltmp{{[0-9]+}} ; CHECK-NEXT: .long [[V1_TID]] ; CHECK-NEXT: .long [[ACCESS_STR]] +; CHECK-NEXT: .long 0 declare dso_local i32 @get_value(i32*) local_unnamed_addr #1 Modified: llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-cast-struct-2.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-cast-struct-2.ll?rev=374099&r1=374098&r2=374099&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-cast-struct-2.ll 
(original) +++ llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-cast-struct-2.ll Tue Oct 8 11:23:17 2019 @@ -46,15 +46,17 @@ entry: ; CHECK: .ascii "v1" # string offset=91 -; CHECK: .long 12 # OffsetReloc -; CHECK-NEXT: .long 39 # Offset reloc section string offset=39 +; CHECK: .long 16 # FieldReloc +; CHECK-NEXT: .long 39 # Field reloc section string offset=39 ; CHECK-NEXT: .long 2 ; CHECK-NEXT: .long .Ltmp{{[0-9]+}} ; CHECK-NEXT: .long [[TID1]] ; CHECK-NEXT: .long 45 +; CHECK-NEXT: .long 0 ; CHECK-NEXT: .long .Ltmp{{[0-9]+}} ; CHECK-NEXT: .long [[TID2]] ; CHECK-NEXT: .long 45 +; CHECK-NEXT: .long 0 declare dso_local i32 @get_value(i32*) local_unnamed_addr #1 Modified: llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-cast-struct-3.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-cast-struct-3.ll?rev=374099&r1=374098&r2=374099&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-cast-struct-3.ll (original) +++ llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-cast-struct-3.ll Tue Oct 8 11:23:17 2019 @@ -45,15 +45,17 @@ entry: ; CHECK: .ascii "v1" # string offset=111 ; CHECK: .ascii "0:1" # string offset=118 -; CHECK: .long 12 # OffsetReloc -; CHECK-NEXT: .long 57 # Offset reloc section string offset=57 +; CHECK: .long 16 # FieldReloc +; CHECK-NEXT: .long 57 # Field reloc section string offset=57 ; CHECK-NEXT: .long 2 ; CHECK-NEXT: .long .Ltmp{{[0-9]+}} ; CHECK-NEXT: .long [[TID1]] ; CHECK-NEXT: .long 63 +; CHECK-NEXT: .long 0 ; CHECK-NEXT: .long .Ltmp{{[0-9]+}} ; CHECK-NEXT: .long [[TID2]] ; CHECK-NEXT: .long 118 +; CHECK-NEXT: .long 0 declare dso_local i32 @get_value(i32*) local_unnamed_addr #1 Modified: llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-cast-union-1.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-cast-union-1.ll?rev=374099&r1=374098&r2=374099&view=diff 
============================================================================== --- llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-cast-union-1.ll (original) +++ llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-cast-union-1.ll Tue Oct 8 11:23:17 2019 @@ -46,15 +46,17 @@ entry: ; CHECK: .ascii "0:1" # string offset=45 ; CHECK: .ascii "v1" # string offset=91 -; CHECK: .long 12 # OffsetReloc -; CHECK-NEXT: .long 39 # Offset reloc section string offset=39 +; CHECK: .long 16 # FieldReloc +; CHECK-NEXT: .long 39 # Field reloc section string offset=39 ; CHECK-NEXT: .long 2 ; CHECK-NEXT: .long .Ltmp{{[0-9]+}} ; CHECK-NEXT: .long [[TID1]] ; CHECK-NEXT: .long 45 +; CHECK-NEXT: .long 0 ; CHECK-NEXT: .long .Ltmp{{[0-9]+}} ; CHECK-NEXT: .long [[TID2]] ; CHECK-NEXT: .long 45 +; CHECK-NEXT: .long 0 declare dso_local i32 @get_value(i32*) local_unnamed_addr #1 Modified: llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-cast-union-2.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-cast-union-2.ll?rev=374099&r1=374098&r2=374099&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-cast-union-2.ll (original) +++ llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-cast-union-2.ll Tue Oct 8 11:23:17 2019 @@ -47,15 +47,17 @@ entry: ; CHECK: .ascii "v1" # string offset=111 ; CHECK: .ascii "0:1" # string offset=118 -; CHECK: .long 12 # OffsetReloc -; CHECK-NEXT: .long 57 # Offset reloc section string offset=57 +; CHECK: .long 16 # FieldReloc +; CHECK-NEXT: .long 57 # Field reloc section string offset=57 ; CHECK-NEXT: .long 2 ; CHECK-NEXT: .long .Ltmp{{[0-9]+}} ; CHECK-NEXT: .long [[TID1]] ; CHECK-NEXT: .long 63 +; CHECK-NEXT: .long 0 ; CHECK-NEXT: .long .Ltmp{{[0-9]+}} ; CHECK-NEXT: .long [[TID2]] ; CHECK-NEXT: .long 118 +; CHECK-NEXT: .long 0 declare dso_local i32 @get_value(i32*) local_unnamed_addr #1 Modified: llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-end-load.ll URL: 
http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-end-load.ll?rev=374099&r1=374098&r2=374099&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-end-load.ll (original) +++ llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-end-load.ll Tue Oct 8 11:23:17 2019 @@ -30,12 +30,13 @@ entry: ; CHECK: .ascii ".text" # string offset=20 ; CHECK: .ascii "0:1" # string offset=26 ; -; CHECK: .long 12 # OffsetReloc -; CHECK-NEXT: .long 20 # Offset reloc section string offset=20 +; CHECK: .long 16 # FieldReloc +; CHECK-NEXT: .long 20 # Field reloc section string offset=20 ; CHECK-NEXT: .long 1 ; CHECK-NEXT: .long .Ltmp{{[0-9]+}} ; CHECK-NEXT: .long 2 ; CHECK-NEXT: .long 26 +; CHECK-NEXT: .long 0 ; Function Attrs: nounwind readnone declare i32* @llvm.preserve.struct.access.index.p0i32.p0s_struct.ss(%struct.s*, i32, i32) #1 Modified: llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-end-ret.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-end-ret.ll?rev=374099&r1=374098&r2=374099&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-end-ret.ll (original) +++ llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-end-ret.ll Tue Oct 8 11:23:17 2019 @@ -30,12 +30,13 @@ entry: ; CHECK: .ascii ".text" # string offset=20 ; CHECK: .ascii "0:1" # string offset=63 ; -; CHECK: .long 12 # OffsetReloc -; CHECK-NEXT: .long 20 # Offset reloc section string offset=20 +; CHECK: .long 16 # FieldReloc +; CHECK-NEXT: .long 20 # Field reloc section string offset=20 ; CHECK-NEXT: .long 1 ; CHECK-NEXT: .long .Ltmp{{[0-9]+}} ; CHECK-NEXT: .long 2 ; CHECK-NEXT: .long 63 +; CHECK-NEXT: .long 0 ; Function Attrs: nounwind readnone declare i32* @llvm.preserve.struct.access.index.p0i32.p0s_struct.ss(%struct.s*, i32, i32) #1 Added: 
llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-fieldinfo-1.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-fieldinfo-1.ll?rev=374099&view=auto ============================================================================== --- llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-fieldinfo-1.ll (added) +++ llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-fieldinfo-1.ll Tue Oct 8 11:23:17 2019 @@ -0,0 +1,189 @@ +; RUN: llc -march=bpfel -filetype=asm -o - %s | FileCheck -check-prefixes=CHECK %s +; Source code: +; struct s { +; int a; +; int b1:9; +; int b2:4; +; }; +; enum { +; FIELD_BYTE_OFFSET = 0, +; FIELD_BYTE_SIZE, +; FIELD_EXISTENCE, +; FIELD_SIGNEDNESS, +; FIELD_LSHIFT_U64, +; FIELD_RSHIFT_U64, +; }; +; void bpf_probe_read(void *, unsigned, const void *); +; int field_read(struct s *arg) { +; unsigned long long ull; +; unsigned offset = __builtin_preserve_field_info(arg->b2, FIELD_BYTE_OFFSET); +; unsigned size = __builtin_preserve_field_info(arg->b2, FIELD_BYTE_SIZE); +; unsigned lshift; +; +; bpf_probe_read(&ull, size, (const void *)arg + offset); +; lshift = __builtin_preserve_field_info(arg->b2, FIELD_LSHIFT_U64); +; #if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__ +; lshift = lshift + (size << 3) - 64; +; #endif +; ull <<= lshift; +; if (__builtin_preserve_field_info(arg->b2, FIELD_SIGNEDNESS)) +; return (long long)ull >> __builtin_preserve_field_info(arg->b2, FIELD_RSHIFT_U64); +; return ull >> __builtin_preserve_field_info(arg->b2, FIELD_RSHIFT_U64); +; } +; Compilation flag: +; clang -target bpfel -O2 -g -S -emit-llvm test.c + +%struct.s = type { i32, i16 } + +; Function Attrs: nounwind +define dso_local i32 @field_read(%struct.s* %arg) local_unnamed_addr #0 !dbg !20 { +entry: + %ull = alloca i64, align 8 + call void @llvm.dbg.value(metadata %struct.s* %arg, metadata !31, metadata !DIExpression()), !dbg !37 + %0 = bitcast i64* %ull to i8*, !dbg !38 + call void @llvm.lifetime.start.p0i8(i64 8, i8* nonnull %0) #5, !dbg !38 + %1 = 
tail call i16* @llvm.preserve.struct.access.index.p0i16.p0s_struct.ss(%struct.s* %arg, i32 1, i32 2), !dbg !39, !llvm.preserve.access.index !25 + %2 = tail call i32 @llvm.bpf.preserve.field.info.p0i16(i16* %1, i64 0), !dbg !40 + call void @llvm.dbg.value(metadata i32 %2, metadata !34, metadata !DIExpression()), !dbg !37 + %3 = tail call i32 @llvm.bpf.preserve.field.info.p0i16(i16* %1, i64 1), !dbg !41 + call void @llvm.dbg.value(metadata i32 %3, metadata !35, metadata !DIExpression()), !dbg !37 + %4 = bitcast %struct.s* %arg to i8*, !dbg !42 + %idx.ext = zext i32 %2 to i64, !dbg !43 + %add.ptr = getelementptr i8, i8* %4, i64 %idx.ext, !dbg !43 + call void @bpf_probe_read(i8* nonnull %0, i32 %3, i8* %add.ptr) #5, !dbg !44 + %5 = call i32 @llvm.bpf.preserve.field.info.p0i16(i16* %1, i64 4), !dbg !45 + call void @llvm.dbg.value(metadata i32 %5, metadata !36, metadata !DIExpression()), !dbg !37 + %6 = load i64, i64* %ull, align 8, !dbg !46, !tbaa !47 + call void @llvm.dbg.value(metadata i64 %6, metadata !32, metadata !DIExpression()), !dbg !37 + %sh_prom = zext i32 %5 to i64, !dbg !46 + %shl = shl i64 %6, %sh_prom, !dbg !46 + call void @llvm.dbg.value(metadata i64 %shl, metadata !32, metadata !DIExpression()), !dbg !37 + %7 = call i32 @llvm.bpf.preserve.field.info.p0i16(i16* %1, i64 3), !dbg !51 + %tobool = icmp eq i32 %7, 0, !dbg !51 + %8 = call i32 @llvm.bpf.preserve.field.info.p0i16(i16* %1, i64 5), !dbg !37 + %sh_prom1 = zext i32 %8 to i64, !dbg !37 + %shr = ashr i64 %shl, %sh_prom1, !dbg !53 + %shr3 = lshr i64 %shl, %sh_prom1, !dbg !53 + %retval.0.in = select i1 %tobool, i64 %shr3, i64 %shr, !dbg !53 + %retval.0 = trunc i64 %retval.0.in to i32, !dbg !37 + call void @llvm.lifetime.end.p0i8(i64 8, i8* nonnull %0) #5, !dbg !54 + ret i32 %retval.0, !dbg !54 +} + +; CHECK: r{{[0-9]+}} = 4 +; CHECK: r{{[0-9]+}} = 4 +; CHECK: r{{[0-9]+}} = 51 +; CHECK: r{{[0-9]+}} = 60 +; CHECK: r{{[0-9]+}} = 1 + +; CHECK: .byte 115 # string offset=1 +; CHECK: .ascii ".text" # string 
offset=30 +; CHECK: .ascii "0:2" # string offset=73 + +; CHECK: .long 16 # FieldReloc +; CHECK-NEXT: .long 30 # Field reloc section string offset=30 +; CHECK-NEXT: .long 5 +; CHECK-NEXT: .long .Ltmp{{[0-9]+}} +; CHECK-NEXT: .long 2 +; CHECK-NEXT: .long 73 +; CHECK-NEXT: .long 0 +; CHECK-NEXT: .long .Ltmp{{[0-9]+}} +; CHECK-NEXT: .long 2 +; CHECK-NEXT: .long 73 +; CHECK-NEXT: .long 1 +; CHECK-NEXT: .long .Ltmp{{[0-9]+}} +; CHECK-NEXT: .long 2 +; CHECK-NEXT: .long 73 +; CHECK-NEXT: .long 4 +; CHECK-NEXT: .long .Ltmp{{[0-9]+}} +; CHECK-NEXT: .long 2 +; CHECK-NEXT: .long 73 +; CHECK-NEXT: .long 5 +; CHECK-NEXT: .long .Ltmp{{[0-9]+}} +; CHECK-NEXT: .long 2 +; CHECK-NEXT: .long 73 +; CHECK-NEXT: .long 3 + +; Function Attrs: argmemonly nounwind willreturn +declare void @llvm.lifetime.start.p0i8(i64, i8* nocapture) #1 + +; Function Attrs: nounwind readnone +declare i16* @llvm.preserve.struct.access.index.p0i16.p0s_struct.ss(%struct.s*, i32, i32) #2 + +; Function Attrs: nounwind readnone +declare i32 @llvm.bpf.preserve.field.info.p0i16(i16*, i64) #2 + +declare dso_local void @bpf_probe_read(i8*, i32, i8*) local_unnamed_addr #3 + +; Function Attrs: argmemonly nounwind willreturn +declare void @llvm.lifetime.end.p0i8(i64, i8* nocapture) #1 + +; Function Attrs: nounwind readnone speculatable willreturn +declare void @llvm.dbg.value(metadata, metadata, metadata) #4 + +attributes #0 = { nounwind "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "frame-pointer"="all" "less-precise-fpmad"="false" "min-legal-vector-width"="0" "no-infs-fp-math"="false" "no-jump-tables"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="false" "stack-protector-buffer-size"="8" "unsafe-fp-math"="false" "use-soft-float"="false" } +attributes #1 = { argmemonly nounwind willreturn } +attributes #2 = { nounwind readnone } +attributes #3 = { "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "frame-pointer"="all" 
"less-precise-fpmad"="false" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="false" "stack-protector-buffer-size"="8" "unsafe-fp-math"="false" "use-soft-float"="false" } +attributes #4 = { nounwind readnone speculatable willreturn } +attributes #5 = { nounwind } + +!llvm.dbg.cu = !{!0} +!llvm.module.flags = !{!16, !17, !18} +!llvm.ident = !{!19} + +!0 = distinct !DICompileUnit(language: DW_LANG_C99, file: !1, producer: "clang version 10.0.0 (https://github.com/llvm/llvm-project.git 923aa0ce806f7739b754167239fee2c9a15e2f31)", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, enums: !2, retainedTypes: !12, nameTableKind: None) +!1 = !DIFile(filename: "test.c", directory: "/tmp/home/yhs/work/tests/core") +!2 = !{!3} +!3 = !DICompositeType(tag: DW_TAG_enumeration_type, file: !1, line: 6, baseType: !4, size: 32, elements: !5) +!4 = !DIBasicType(name: "unsigned int", size: 32, encoding: DW_ATE_unsigned) +!5 = !{!6, !7, !8, !9, !10, !11} +!6 = !DIEnumerator(name: "FIELD_BYTE_OFFSET", value: 0, isUnsigned: true) +!7 = !DIEnumerator(name: "FIELD_BYTE_SIZE", value: 1, isUnsigned: true) +!8 = !DIEnumerator(name: "FIELD_EXISTENCE", value: 2, isUnsigned: true) +!9 = !DIEnumerator(name: "FIELD_SIGNEDNESS", value: 3, isUnsigned: true) +!10 = !DIEnumerator(name: "FIELD_LSHIFT_U64", value: 4, isUnsigned: true) +!11 = !DIEnumerator(name: "FIELD_RSHIFT_U64", value: 5, isUnsigned: true) +!12 = !{!13, !15} +!13 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !14, size: 64) +!14 = !DIDerivedType(tag: DW_TAG_const_type, baseType: null) +!15 = !DIBasicType(name: "long long int", size: 64, encoding: DW_ATE_signed) +!16 = !{i32 2, !"Dwarf Version", i32 4} +!17 = !{i32 2, !"Debug Info Version", i32 3} +!18 = !{i32 1, !"wchar_size", i32 4} +!19 = !{!"clang version 10.0.0 (https://github.com/llvm/llvm-project.git 923aa0ce806f7739b754167239fee2c9a15e2f31)"} +!20 = distinct !DISubprogram(name: "field_read", scope: !1, 
file: !1, line: 15, type: !21, scopeLine: 15, flags: DIFlagPrototyped, isDefinition: true, isOptimized: true, unit: !0, retainedNodes: !30) +!21 = !DISubroutineType(types: !22) +!22 = !{!23, !24} +!23 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed) +!24 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !25, size: 64) +!25 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "s", file: !1, line: 1, size: 64, elements: !26) +!26 = !{!27, !28, !29} +!27 = !DIDerivedType(tag: DW_TAG_member, name: "a", scope: !25, file: !1, line: 2, baseType: !23, size: 32) +!28 = !DIDerivedType(tag: DW_TAG_member, name: "b1", scope: !25, file: !1, line: 3, baseType: !23, size: 9, offset: 32, flags: DIFlagBitField, extraData: i64 32) +!29 = !DIDerivedType(tag: DW_TAG_member, name: "b2", scope: !25, file: !1, line: 4, baseType: !23, size: 4, offset: 41, flags: DIFlagBitField, extraData: i64 32) +!30 = !{!31, !32, !34, !35, !36} +!31 = !DILocalVariable(name: "arg", arg: 1, scope: !20, file: !1, line: 15, type: !24) +!32 = !DILocalVariable(name: "ull", scope: !20, file: !1, line: 16, type: !33) +!33 = !DIBasicType(name: "long long unsigned int", size: 64, encoding: DW_ATE_unsigned) +!34 = !DILocalVariable(name: "offset", scope: !20, file: !1, line: 17, type: !4) +!35 = !DILocalVariable(name: "size", scope: !20, file: !1, line: 18, type: !4) +!36 = !DILocalVariable(name: "lshift", scope: !20, file: !1, line: 19, type: !4) +!37 = !DILocation(line: 0, scope: !20) +!38 = !DILocation(line: 16, column: 3, scope: !20) +!39 = !DILocation(line: 17, column: 56, scope: !20) +!40 = !DILocation(line: 17, column: 21, scope: !20) +!41 = !DILocation(line: 18, column: 19, scope: !20) +!42 = !DILocation(line: 21, column: 30, scope: !20) +!43 = !DILocation(line: 21, column: 48, scope: !20) +!44 = !DILocation(line: 21, column: 3, scope: !20) +!45 = !DILocation(line: 22, column: 12, scope: !20) +!46 = !DILocation(line: 26, column: 7, scope: !20) +!47 = !{!48, !48, i64 0} +!48 = 
!{!"long long", !49, i64 0} +!49 = !{!"omnipotent char", !50, i64 0} +!50 = !{!"Simple C/C++ TBAA"} +!51 = !DILocation(line: 27, column: 7, scope: !52) +!52 = distinct !DILexicalBlock(scope: !20, file: !1, line: 27, column: 7) +!53 = !DILocation(line: 27, column: 7, scope: !20) +!54 = !DILocation(line: 30, column: 1, scope: !20) Added: llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-fieldinfo-2.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-fieldinfo-2.ll?rev=374099&view=auto ============================================================================== --- llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-fieldinfo-2.ll (added) +++ llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-fieldinfo-2.ll Tue Oct 8 11:23:17 2019 @@ -0,0 +1,246 @@ +; RUN: llc -march=bpfel -filetype=asm -o - %s | FileCheck -check-prefixes=CHECK,CHECK-EL %s +; RUN: llc -march=bpfeb -filetype=asm -o - %s | FileCheck -check-prefixes=CHECK,CHECK-EB %s +; Source code: +; struct s { +; int a; +; int b1:9; +; int b2:4; +; }; +; enum { +; FIELD_BYTE_OFFSET = 0, +; FIELD_BYTE_SIZE, +; FIELD_EXISTENCE, +; FIELD_SIGNEDNESS, +; FIELD_LSHIFT_U64, +; FIELD_RSHIFT_U64, +; }; +; int field_read(struct s *arg) { +; unsigned long long ull; +; unsigned offset = __builtin_preserve_field_info(arg->b2, FIELD_BYTE_OFFSET); +; unsigned size = __builtin_preserve_field_info(arg->b2, FIELD_BYTE_SIZE); +; switch(size) { +; case 1: +; ull = *(unsigned char *)((void *)arg + offset); break; +; case 2: +; ull = *(unsigned short *)((void *)arg + offset); break; +; case 4: +; ull = *(unsigned int *)((void *)arg + offset); break; +; case 8: +; ull = *(unsigned long long *)((void *)arg + offset); break; +; } +; ull <<= __builtin_preserve_field_info(arg->b2, FIELD_LSHIFT_U64); +; if (__builtin_preserve_field_info(arg->b2, FIELD_SIGNEDNESS)) +; return ((long long)ull) >>__builtin_preserve_field_info(arg->b2, FIELD_RSHIFT_U64); +; return ull >> __builtin_preserve_field_info(arg->b2, 
FIELD_RSHIFT_U64); +; } +; Compilation flag: +; clang -target bpf -O2 -g -S -emit-llvm test.c + +%struct.s = type { i32, i16 } + +; Function Attrs: nounwind readonly +define dso_local i32 @field_read(%struct.s* %arg) local_unnamed_addr #0 !dbg !26 { +entry: + call void @llvm.dbg.value(metadata %struct.s* %arg, metadata !37, metadata !DIExpression()), !dbg !41 + %0 = tail call i16* @llvm.preserve.struct.access.index.p0i16.p0s_struct.ss(%struct.s* %arg, i32 1, i32 2), !dbg !42, !llvm.preserve.access.index !31 + %1 = tail call i32 @llvm.bpf.preserve.field.info.p0i16(i16* %0, i64 0), !dbg !43 + call void @llvm.dbg.value(metadata i32 %1, metadata !39, metadata !DIExpression()), !dbg !41 + %2 = tail call i32 @llvm.bpf.preserve.field.info.p0i16(i16* %0, i64 1), !dbg !44 + call void @llvm.dbg.value(metadata i32 %2, metadata !40, metadata !DIExpression()), !dbg !41 + switch i32 %2, label %sw.epilog [ + i32 1, label %sw.bb + i32 2, label %sw.bb1 + i32 4, label %sw.bb5 + i32 8, label %sw.bb9 + ], !dbg !45 + +sw.bb: ; preds = %entry + %3 = bitcast %struct.s* %arg to i8*, !dbg !46 + %idx.ext = zext i32 %1 to i64, !dbg !48 + %add.ptr = getelementptr i8, i8* %3, i64 %idx.ext, !dbg !48 + %4 = load i8, i8* %add.ptr, align 1, !dbg !49, !tbaa !50 + %conv = zext i8 %4 to i64, !dbg !49 + call void @llvm.dbg.value(metadata i64 %conv, metadata !38, metadata !DIExpression()), !dbg !41 + br label %sw.epilog, !dbg !53 + +sw.bb1: ; preds = %entry + %5 = bitcast %struct.s* %arg to i8*, !dbg !54 + %idx.ext2 = zext i32 %1 to i64, !dbg !55 + %add.ptr3 = getelementptr i8, i8* %5, i64 %idx.ext2, !dbg !55 + %6 = bitcast i8* %add.ptr3 to i16*, !dbg !56 + %7 = load i16, i16* %6, align 2, !dbg !57, !tbaa !58 + %conv4 = zext i16 %7 to i64, !dbg !57 + call void @llvm.dbg.value(metadata i64 %conv4, metadata !38, metadata !DIExpression()), !dbg !41 + br label %sw.epilog, !dbg !60 + +sw.bb5: ; preds = %entry + %8 = bitcast %struct.s* %arg to i8*, !dbg !61 + %idx.ext6 = zext i32 %1 to i64, !dbg !62 + 
%add.ptr7 = getelementptr i8, i8* %8, i64 %idx.ext6, !dbg !62 + %9 = bitcast i8* %add.ptr7 to i32*, !dbg !63 + %10 = load i32, i32* %9, align 4, !dbg !64, !tbaa !65 + %conv8 = zext i32 %10 to i64, !dbg !64 + call void @llvm.dbg.value(metadata i64 %conv8, metadata !38, metadata !DIExpression()), !dbg !41 + br label %sw.epilog, !dbg !67 + +sw.bb9: ; preds = %entry + %11 = bitcast %struct.s* %arg to i8*, !dbg !68 + %idx.ext10 = zext i32 %1 to i64, !dbg !69 + %add.ptr11 = getelementptr i8, i8* %11, i64 %idx.ext10, !dbg !69 + %12 = bitcast i8* %add.ptr11 to i64*, !dbg !70 + %13 = load i64, i64* %12, align 8, !dbg !71, !tbaa !72 + call void @llvm.dbg.value(metadata i64 %13, metadata !38, metadata !DIExpression()), !dbg !41 + br label %sw.epilog, !dbg !74 + +sw.epilog: ; preds = %entry, %sw.bb9, %sw.bb5, %sw.bb1, %sw.bb + %ull.0 = phi i64 [ undef, %entry ], [ %13, %sw.bb9 ], [ %conv8, %sw.bb5 ], [ %conv4, %sw.bb1 ], [ %conv, %sw.bb ] + call void @llvm.dbg.value(metadata i64 %ull.0, metadata !38, metadata !DIExpression()), !dbg !41 + %14 = tail call i32 @llvm.bpf.preserve.field.info.p0i16(i16* %0, i64 4), !dbg !75 + %sh_prom = zext i32 %14 to i64, !dbg !76 + %shl = shl i64 %ull.0, %sh_prom, !dbg !76 + call void @llvm.dbg.value(metadata i64 %shl, metadata !38, metadata !DIExpression()), !dbg !41 + %15 = tail call i32 @llvm.bpf.preserve.field.info.p0i16(i16* %0, i64 3), !dbg !77 + %tobool = icmp eq i32 %15, 0, !dbg !77 + %16 = tail call i32 @llvm.bpf.preserve.field.info.p0i16(i16* %0, i64 5), !dbg !41 + %sh_prom12 = zext i32 %16 to i64, !dbg !41 + %shr = ashr i64 %shl, %sh_prom12, !dbg !79 + %shr15 = lshr i64 %shl, %sh_prom12, !dbg !79 + %retval.0.in = select i1 %tobool, i64 %shr15, i64 %shr, !dbg !79 + %retval.0 = trunc i64 %retval.0.in to i32, !dbg !41 + ret i32 %retval.0, !dbg !80 +} + +; CHECK: r{{[0-9]+}} = 4 +; CHECK: r{{[0-9]+}} = 4 +; CHECK-EL: r{{[0-9]+}} = 51 +; CHECK-EB: r{{[0-9]+}} = 41 +; CHECK: r{{[0-9]+}} = 60 +; CHECK: r{{[0-9]+}} = 1 + +; CHECK: .long 1 # 
BTF_KIND_STRUCT(id = 2) +; CHECK: .byte 115 # string offset=1 +; CHECK: .ascii ".text" # string offset=30 +; CHECK: .ascii "0:2" # string offset=36 + +; CHECK: .long 16 # FieldReloc +; CHECK-NEXT: .long 30 # Field reloc section string offset=30 +; CHECK-NEXT: .long 5 +; CHECK-NEXT: .long .Ltmp{{[0-9]+}} +; CHECK-NEXT: .long 2 +; CHECK-NEXT: .long 36 +; CHECK-NEXT: .long 0 +; CHECK-NEXT: .long .Ltmp{{[0-9]+}} +; CHECK-NEXT: .long 2 +; CHECK-NEXT: .long 36 +; CHECK-NEXT: .long 1 +; CHECK-NEXT: .long .Ltmp{{[0-9]+}} +; CHECK-NEXT: .long 2 +; CHECK-NEXT: .long 36 +; CHECK-NEXT: .long 4 +; CHECK-NEXT: .long .Ltmp{{[0-9]+}} +; CHECK-NEXT: .long 2 +; CHECK-NEXT: .long 36 +; CHECK-NEXT: .long 5 +; CHECK-NEXT: .long .Ltmp{{[0-9]+}} +; CHECK-NEXT: .long 2 +; CHECK-NEXT: .long 36 +; CHECK-NEXT: .long 3 + +; Function Attrs: nounwind readnone +declare i16* @llvm.preserve.struct.access.index.p0i16.p0s_struct.ss(%struct.s*, i32, i32) #1 + +; Function Attrs: nounwind readnone +declare i32 @llvm.bpf.preserve.field.info.p0i16(i16*, i64) #1 + +; Function Attrs: nounwind readnone speculatable willreturn +declare void @llvm.dbg.value(metadata, metadata, metadata) #2 + +attributes #0 = { nounwind readonly "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "frame-pointer"="all" "less-precise-fpmad"="false" "min-legal-vector-width"="0" "no-infs-fp-math"="false" "no-jump-tables"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="false" "stack-protector-buffer-size"="8" "unsafe-fp-math"="false" "use-soft-float"="false" } +attributes #1 = { nounwind readnone } +attributes #2 = { nounwind readnone speculatable willreturn } + +!llvm.dbg.cu = !{!0} +!llvm.module.flags = !{!22, !23, !24} +!llvm.ident = !{!25} + +!0 = distinct !DICompileUnit(language: DW_LANG_C99, file: !1, producer: "clang version 10.0.0 (https://github.com/llvm/llvm-project.git 923aa0ce806f7739b754167239fee2c9a15e2f31)", isOptimized: true, runtimeVersion: 0, 
emissionKind: FullDebug, enums: !2, retainedTypes: !12, nameTableKind: None) +!1 = !DIFile(filename: "test.c", directory: "/tmp/home/yhs/work/tests/core") +!2 = !{!3} +!3 = !DICompositeType(tag: DW_TAG_enumeration_type, file: !1, line: 6, baseType: !4, size: 32, elements: !5) +!4 = !DIBasicType(name: "unsigned int", size: 32, encoding: DW_ATE_unsigned) +!5 = !{!6, !7, !8, !9, !10, !11} +!6 = !DIEnumerator(name: "FIELD_BYTE_OFFSET", value: 0, isUnsigned: true) +!7 = !DIEnumerator(name: "FIELD_BYTE_SIZE", value: 1, isUnsigned: true) +!8 = !DIEnumerator(name: "FIELD_EXISTENCE", value: 2, isUnsigned: true) +!9 = !DIEnumerator(name: "FIELD_SIGNEDNESS", value: 3, isUnsigned: true) +!10 = !DIEnumerator(name: "FIELD_LSHIFT_U64", value: 4, isUnsigned: true) +!11 = !DIEnumerator(name: "FIELD_RSHIFT_U64", value: 5, isUnsigned: true) +!12 = !{!13, !15, !16, !18, !19, !21} +!13 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !14, size: 64) +!14 = !DIBasicType(name: "unsigned char", size: 8, encoding: DW_ATE_unsigned_char) +!15 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: null, size: 64) +!16 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !17, size: 64) +!17 = !DIBasicType(name: "unsigned short", size: 16, encoding: DW_ATE_unsigned) +!18 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !4, size: 64) +!19 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !20, size: 64) +!20 = !DIBasicType(name: "long long unsigned int", size: 64, encoding: DW_ATE_unsigned) +!21 = !DIBasicType(name: "long long int", size: 64, encoding: DW_ATE_signed) +!22 = !{i32 2, !"Dwarf Version", i32 4} +!23 = !{i32 2, !"Debug Info Version", i32 3} +!24 = !{i32 1, !"wchar_size", i32 4} +!25 = !{!"clang version 10.0.0 (https://github.com/llvm/llvm-project.git 923aa0ce806f7739b754167239fee2c9a15e2f31)"} +!26 = distinct !DISubprogram(name: "field_read", scope: !1, file: !1, line: 14, type: !27, scopeLine: 14, flags: DIFlagPrototyped, isDefinition: true, isOptimized: true, unit: !0, 
retainedNodes: !36) +!27 = !DISubroutineType(types: !28) +!28 = !{!29, !30} +!29 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed) +!30 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !31, size: 64) +!31 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "s", file: !1, line: 1, size: 64, elements: !32) +!32 = !{!33, !34, !35} +!33 = !DIDerivedType(tag: DW_TAG_member, name: "a", scope: !31, file: !1, line: 2, baseType: !29, size: 32) +!34 = !DIDerivedType(tag: DW_TAG_member, name: "b1", scope: !31, file: !1, line: 3, baseType: !29, size: 9, offset: 32, flags: DIFlagBitField, extraData: i64 32) +!35 = !DIDerivedType(tag: DW_TAG_member, name: "b2", scope: !31, file: !1, line: 4, baseType: !29, size: 4, offset: 41, flags: DIFlagBitField, extraData: i64 32) +!36 = !{!37, !38, !39, !40} +!37 = !DILocalVariable(name: "arg", arg: 1, scope: !26, file: !1, line: 14, type: !30) +!38 = !DILocalVariable(name: "ull", scope: !26, file: !1, line: 15, type: !20) +!39 = !DILocalVariable(name: "offset", scope: !26, file: !1, line: 16, type: !4) +!40 = !DILocalVariable(name: "size", scope: !26, file: !1, line: 17, type: !4) +!41 = !DILocation(line: 0, scope: !26) +!42 = !DILocation(line: 16, column: 56, scope: !26) +!43 = !DILocation(line: 16, column: 21, scope: !26) +!44 = !DILocation(line: 17, column: 19, scope: !26) +!45 = !DILocation(line: 18, column: 3, scope: !26) +!46 = !DILocation(line: 20, column: 30, scope: !47) +!47 = distinct !DILexicalBlock(scope: !26, file: !1, line: 18, column: 16) +!48 = !DILocation(line: 20, column: 42, scope: !47) +!49 = !DILocation(line: 20, column: 11, scope: !47) +!50 = !{!51, !51, i64 0} +!51 = !{!"omnipotent char", !52, i64 0} +!52 = !{!"Simple C/C++ TBAA"} +!53 = !DILocation(line: 20, column: 53, scope: !47) +!54 = !DILocation(line: 22, column: 31, scope: !47) +!55 = !DILocation(line: 22, column: 43, scope: !47) +!56 = !DILocation(line: 22, column: 12, scope: !47) +!57 = !DILocation(line: 22, column: 11, scope: !47) 
+!58 = !{!59, !59, i64 0} +!59 = !{!"short", !51, i64 0} +!60 = !DILocation(line: 22, column: 54, scope: !47) +!61 = !DILocation(line: 24, column: 29, scope: !47) +!62 = !DILocation(line: 24, column: 41, scope: !47) +!63 = !DILocation(line: 24, column: 12, scope: !47) +!64 = !DILocation(line: 24, column: 11, scope: !47) +!65 = !{!66, !66, i64 0} +!66 = !{!"int", !51, i64 0} +!67 = !DILocation(line: 24, column: 52, scope: !47) +!68 = !DILocation(line: 26, column: 35, scope: !47) +!69 = !DILocation(line: 26, column: 47, scope: !47) +!70 = !DILocation(line: 26, column: 12, scope: !47) +!71 = !DILocation(line: 26, column: 11, scope: !47) +!72 = !{!73, !73, i64 0} +!73 = !{!"long long", !51, i64 0} +!74 = !DILocation(line: 26, column: 58, scope: !47) +!75 = !DILocation(line: 28, column: 11, scope: !26) +!76 = !DILocation(line: 28, column: 7, scope: !26) +!77 = !DILocation(line: 29, column: 7, scope: !78) +!78 = distinct !DILexicalBlock(scope: !26, file: !1, line: 29, column: 7) +!79 = !DILocation(line: 29, column: 7, scope: !26) +!80 = !DILocation(line: 32, column: 1, scope: !26) Modified: llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-global-1.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-global-1.ll?rev=374099&r1=374098&r2=374099&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-global-1.ll (original) +++ llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-global-1.ll Tue Oct 8 11:23:17 2019 @@ -34,12 +34,13 @@ entry: ; CHECK: .ascii "v3" # string offset=16 ; CHECK: .ascii "0:1" # string offset=23 -; CHECK: .long 12 # OffsetReloc -; CHECK-NEXT: .long 10 # Offset reloc section string offset=10 +; CHECK: .long 16 # FieldReloc +; CHECK-NEXT: .long 10 # Field reloc section string offset=10 ; CHECK-NEXT: .long 1 ; CHECK-NEXT: .long .Ltmp{{[0-9]+}} ; CHECK-NEXT: .long [[TID1]] ; CHECK-NEXT: .long 23 +; CHECK-NEXT: .long 0 declare dso_local i32 
@get_value(i32*) local_unnamed_addr #1

Modified: llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-global-2.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-global-2.ll?rev=374099&r1=374098&r2=374099&view=diff
==============================================================================
--- llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-global-2.ll (original)
+++ llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-global-2.ll Tue Oct 8 11:23:17 2019
@@ -36,12 +36,13 @@ entry:
 ; CHECK: .ascii "v3" # string offset=16
 ; CHECK: .ascii "7:1" # string offset=23
-; CHECK: .long 12 # OffsetReloc
-; CHECK-NEXT: .long 10 # Offset reloc section string offset=10
+; CHECK: .long 16 # FieldReloc
+; CHECK-NEXT: .long 10 # Field reloc section string offset=10
 ; CHECK-NEXT: .long 1
 ; CHECK-NEXT: .long .Ltmp{{[0-9]+}}
 ; CHECK-NEXT: .long [[TID1]]
 ; CHECK-NEXT: .long 23
+; CHECK-NEXT: .long 0
 declare dso_local i32 @get_value(i32*) local_unnamed_addr #1

Modified: llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-global-3.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-global-3.ll?rev=374099&r1=374098&r2=374099&view=diff
==============================================================================
--- llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-global-3.ll (original)
+++ llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-global-3.ll Tue Oct 8 11:23:17 2019
@@ -34,12 +34,13 @@ entry:
 ; CHECK: .ascii "v3" # string offset=16
 ; CHECK: .ascii "0:1" # string offset=23
-; CHECK: .long 12 # OffsetReloc
-; CHECK-NEXT: .long 10 # Offset reloc section string offset=10
+; CHECK: .long 16 # FieldReloc
+; CHECK-NEXT: .long 10 # Field reloc section string offset=10
 ; CHECK-NEXT: .long 1
 ; CHECK-NEXT: .long .Ltmp{{[0-9]+}}
 ; CHECK-NEXT: .long [[TID1]]
 ; CHECK-NEXT: .long 23
+; CHECK-NEXT: .long 0
 declare dso_local i32 @get_value(i32*) local_unnamed_addr #1

Modified: llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-ignore.ll
URL:
http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-ignore.ll?rev=374099&r1=374098&r2=374099&view=diff
==============================================================================
--- llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-ignore.ll (original)
+++ llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-ignore.ll Tue Oct 8 11:23:17 2019
@@ -21,7 +21,7 @@ entry:
 ; CHECK: r1 += 16
 ; CHECK: call get_value
 ; CHECK: .section .BTF.ext,"",@progbits
-; CHECK-NOT: .long 12 # OffsetReloc
+; CHECK-NOT: .long 16 # FieldReloc
 declare dso_local i32 @get_value(i32*) local_unnamed_addr #1

Modified: llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-middle-chain.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-middle-chain.ll?rev=374099&r1=374098&r2=374099&view=diff
==============================================================================
--- llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-middle-chain.ll (original)
+++ llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-middle-chain.ll Tue Oct 8 11:23:17 2019
@@ -50,18 +50,21 @@ entry:
 ; CHECK: .ascii "0:0:0" # string offset=76
 ; CHECK: .ascii "0:0:0:0" # string offset=82
-; CHECK: .long 12 # OffsetReloc
-; CHECK-NEXT: .long 29 # Offset reloc section string offset=29
+; CHECK: .long 16 # FieldReloc
+; CHECK-NEXT: .long 29 # Field reloc section string offset=29
 ; CHECK-NEXT: .long 3
 ; CHECK_NEXT: .long .Ltmp{{[0-9]+}}
 ; CHECK_NEXT: .long 2
 ; CHECK_NEXT: .long 72
+; CHECK_NEXT: .long 0
 ; CHECK_NEXT: .long .Ltmp{{[0-9]+}}
 ; CHECK_NEXT: .long 2
 ; CHECK_NEXT: .long 76
+; CHECK_NEXT: .long 0
 ; CHECK_NEXT: .long .Ltmp{{[0-9]+}}
 ; CHECK_NEXT: .long 2
 ; CHECK_NEXT: .long 82
+; CHECK_NEXT: .long 0
 ; Function Attrs: nounwind readnone
 declare %struct.s1* @llvm.preserve.struct.access.index.p0s_struct.s1s.p0s_struct.r1s(%struct.r1*, i32, i32) #1

Modified: llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-multi-array-1.ll
URL:
http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-multi-array-1.ll?rev=374099&r1=374098&r2=374099&view=diff
==============================================================================
--- llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-multi-array-1.ll (original)
+++ llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-multi-array-1.ll Tue Oct 8 11:23:17 2019
@@ -35,12 +35,13 @@ entry:
 ; CHECK: .ascii ".text" # string offset=52
 ; CHECK: .ascii "1:1:2:3" # string offset=58
-; CHECK: .long 12 # OffsetReloc
-; CHECK-NEXT: .long 52 # Offset reloc section string offset=52
+; CHECK: .long 16 # FieldReloc
+; CHECK-NEXT: .long 52 # Field reloc section string offset=52
 ; CHECK-NEXT: .long 1
 ; CHECK-NEXT: .long .Ltmp{{[0-9]+}}
 ; CHECK-NEXT: .long [[TID1]]
 ; CHECK-NEXT: .long 58
+; CHECK-NEXT: .long 0
 declare dso_local i32 @get_value(i32*) local_unnamed_addr #1

Modified: llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-multi-array-2.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-multi-array-2.ll?rev=374099&r1=374098&r2=374099&view=diff
==============================================================================
--- llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-multi-array-2.ll (original)
+++ llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-multi-array-2.ll Tue Oct 8 11:23:17 2019
@@ -36,12 +36,13 @@ entry:
 ; CHECK: .ascii ".text" # string offset=52
 ; CHECK: .ascii "1:1:2:3:2" # string offset=58
-; CHECK: .long 12 # OffsetReloc
-; CHECK-NEXT: .long 52 # Offset reloc section string offset=52
+; CHECK: .long 16 # FieldReloc
+; CHECK-NEXT: .long 52 # Field reloc section string offset=52
 ; CHECK-NEXT: .long 1
 ; CHECK-NEXT: .long .Ltmp{{[0-9]+}}
 ; CHECK-NEXT: .long [[TID1]]
 ; CHECK-NEXT: .long 58
+; CHECK-NEXT: .long 0
 declare dso_local i32 @get_value(i32*) local_unnamed_addr #1

Modified: llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-multilevel.ll
URL:
http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-multilevel.ll?rev=374099&r1=374098&r2=374099&view=diff
==============================================================================
--- llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-multilevel.ll (original)
+++ llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-multilevel.ll Tue Oct 8 11:23:17 2019
@@ -117,17 +117,18 @@ define dso_local i32 @bpf_prog(%struct.s
 ; CHECK-NEXT: .long 20
 ; CHECK-NEXT: .long 76
 ; CHECK-NEXT: .long 96
-; CHECK-NEXT: .long 24
-; CHECK-NEXT: .long 120
+; CHECK-NEXT: .long 28
+; CHECK-NEXT: .long 124
 ; CHECK-NEXT: .long 0
 ; CHECK-NEXT: .long 8 # FuncInfo
-; CHECK: .long 12 # OffsetReloc
-; CHECK-NEXT: .long 57 # Offset reloc section string offset=57
+; CHECK: .long 16 # FieldReloc
+; CHECK-NEXT: .long 57 # Field reloc section string offset=57
 ; CHECK-NEXT: .long 1
 ; CHECK-NEXT: .long .Ltmp2
 ; CHECK-NEXT: .long 2
 ; CHECK-NEXT: .long 100
+; CHECK-NEXT: .long 0
 ; Function Attrs: argmemonly nounwind
 declare void @llvm.lifetime.start.p0i8(i64 immarg, i8* nocapture) #1

Modified: llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-pointer-1.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-pointer-1.ll?rev=374099&r1=374098&r2=374099&view=diff
==============================================================================
--- llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-pointer-1.ll (original)
+++ llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-pointer-1.ll Tue Oct 8 11:23:17 2019
@@ -32,12 +32,13 @@ entry:
 ; CHECK: .ascii ".text" # string offset=26
 ; CHECK: .byte 49 # string offset=32
-; CHECK: .long 12 # OffsetReloc
-; CHECK-NEXT: .long 26 # Offset reloc section string offset=26
+; CHECK: .long 16 # FieldReloc
+; CHECK-NEXT: .long 26 # Field reloc section string offset=26
 ; CHECK-NEXT: .long 1
 ; CHECK-NEXT: .long .Ltmp{{[0-9]+}}
 ; CHECK-NEXT: .long [[TID1]]
 ; CHECK-NEXT: .long 32
+; CHECK-NEXT: .long 0
 declare dso_local i32
@get_value(i32*) local_unnamed_addr #1

Modified: llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-pointer-2.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-pointer-2.ll?rev=374099&r1=374098&r2=374099&view=diff
==============================================================================
--- llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-pointer-2.ll (original)
+++ llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-pointer-2.ll Tue Oct 8 11:23:17 2019
@@ -31,12 +31,13 @@ entry:
 ; CHECK: .ascii ".text" # string offset=26
 ; CHECK: .ascii "1:1" # string offset=32
-; CHECK: .long 12 # OffsetReloc
-; CHECK-NEXT: .long 26 # Offset reloc section string offset=26
+; CHECK: .long 16 # FieldReloc
+; CHECK-NEXT: .long 26 # Field reloc section string offset=26
 ; CHECK-NEXT: .long 1
 ; CHECK-NEXT: .long .Ltmp{{[0-9]+}}
 ; CHECK-NEXT: .long [[TID1]]
 ; CHECK-NEXT: .long 32
+; CHECK-NEXT: .long 0
 declare dso_local i32 @get_value(i32*) local_unnamed_addr #1

Modified: llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-struct-anonymous.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-struct-anonymous.ll?rev=374099&r1=374098&r2=374099&view=diff
==============================================================================
--- llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-struct-anonymous.ll (original)
+++ llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-struct-anonymous.ll Tue Oct 8 11:23:17 2019
@@ -127,17 +127,18 @@ define dso_local i32 @bpf_prog(%struct.s
 ; CHECK-NEXT: .long 20
 ; CHECK-NEXT: .long 76
 ; CHECK-NEXT: .long 96
-; CHECK-NEXT: .long 24
-; CHECK-NEXT: .long 120
+; CHECK-NEXT: .long 28
+; CHECK-NEXT: .long 124
 ; CHECK-NEXT: .long 0
 ; CHECK-NEXT: .long 8 # FuncInfo
-; CHECK: .long 12 # OffsetReloc
-; CHECK-NEXT: .long 66 # Offset reloc section string offset=66
+; CHECK: .long 16 # FieldReloc
+; CHECK-NEXT: .long 66 # Field reloc section string offset=66
 ; CHECK-NEXT: .long 1
 ; CHECK-NEXT: .long .Ltmp2
 ;
CHECK-NEXT: .long 2
 ; CHECK-NEXT: .long 109
+; CHECK-NEXT: .long 0
 ; Function Attrs: argmemonly nounwind
 declare void @llvm.lifetime.start.p0i8(i64 immarg, i8* nocapture) #1

Modified: llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-struct-array.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-struct-array.ll?rev=374099&r1=374098&r2=374099&view=diff
==============================================================================
--- llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-struct-array.ll (original)
+++ llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-struct-array.ll Tue Oct 8 11:23:17 2019
@@ -130,17 +130,18 @@ define dso_local i32 @bpf_prog(%struct.s
 ; CHECK-NEXT: .long 20
 ; CHECK-NEXT: .long 76
 ; CHECK-NEXT: .long 96
-; CHECK-NEXT: .long 24
-; CHECK-NEXT: .long 120
+; CHECK-NEXT: .long 28
+; CHECK-NEXT: .long 124
 ; CHECK-NEXT: .long 0
 ; CHECK-NEXT: .long 8 # FuncInfo
-; CHECK: .long 12 # OffsetReloc
-; CHECK-NEXT: .long 77 # Offset reloc section string offset=77
+; CHECK: .long 16 # FieldReloc
+; CHECK-NEXT: .long 77 # Field reloc section string offset=77
 ; CHECK-NEXT: .long 1
 ; CHECK-NEXT: .long .Ltmp2
 ; CHECK-NEXT: .long 2
 ; CHECK-NEXT: .long 120
+; CHECK-NEXT: .long 0
 ; Function Attrs: argmemonly nounwind
 declare void @llvm.lifetime.start.p0i8(i64 immarg, i8* nocapture) #1

Modified: llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-typedef-array.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-typedef-array.ll?rev=374099&r1=374098&r2=374099&view=diff
==============================================================================
--- llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-typedef-array.ll (original)
+++ llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-typedef-array.ll Tue Oct 8 11:23:17 2019
@@ -38,12 +38,13 @@ entry:
 ; CHECK-NEXT: .byte 0
 ; CHECK: .ascii "0:0:1" # string offset=[[ACCESS_STR:[0-9]+]]
 ; CHECK-NEXT: .byte 0
-; CHECK: .long 12 # OffsetReloc
-; CHECK-NEXT: .long
[[SEC_INDEX]] # Offset reloc section string offset=[[SEC_INDEX]]
+; CHECK: .long 16 # FieldReloc
+; CHECK-NEXT: .long [[SEC_INDEX]] # Field reloc section string offset=[[SEC_INDEX]]
 ; CHECK-NEXT: .long 1
 ; CHECK-NEXT: .long [[RELOC]]
 ; CHECK-NEXT: .long [[TYPE_ID]]
 ; CHECK-NEXT: .long [[ACCESS_STR]]
+; CHECK-NEXT: .long 0
 declare dso_local i32 @get_value(i8*) local_unnamed_addr #1

Modified: llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-typedef-struct.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-typedef-struct.ll?rev=374099&r1=374098&r2=374099&view=diff
==============================================================================
--- llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-typedef-struct.ll (original)
+++ llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-typedef-struct.ll Tue Oct 8 11:23:17 2019
@@ -37,12 +37,13 @@ entry:
 ; CHECK-NEXT: .byte 0
 ; CHECK: .ascii "0:1" # string offset=[[ACCESS_STR:[0-9]+]]
 ; CHECK-NEXT: .byte 0
-; CHECK: .long 12 # OffsetReloc
-; CHECK-NEXT: .long [[SEC_STR]] # Offset reloc section string offset={{[0-9]+}}
+; CHECK: .long 16 # FieldReloc
+; CHECK-NEXT: .long [[SEC_STR]] # Field reloc section string offset={{[0-9]+}}
 ; CHECK-NEXT: .long 1
 ; CHECK-NEXT: .long [[RELOC]]
 ; CHECK-NEXT: .long [[TYPE_ID]]
 ; CHECK-NEXT: .long [[ACCESS_STR]]
+; CHECK-NEXT: .long 0
 declare dso_local i32 @get_value(i8*) local_unnamed_addr #1

Modified: llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-typedef-union.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-typedef-union.ll?rev=374099&r1=374098&r2=374099&view=diff
==============================================================================
--- llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-typedef-union.ll (original)
+++ llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-typedef-union.ll Tue Oct 8 11:23:17 2019
@@ -37,12 +37,13 @@ entry:
 ; CHECK-NEXT: .byte 0
 ; CHECK: .ascii "0:1" # string offset=[[ACCESS_STR:[0-9]+]]
 ;
CHECK-NEXT: .byte 0
-; CHECK: .long 12 # OffsetReloc
-; CHECK-NEXT: .long [[SEC_INDEX]] # Offset reloc section string offset=[[SEC_INDEX]]
+; CHECK: .long 16 # FieldReloc
+; CHECK-NEXT: .long [[SEC_INDEX]] # Field reloc section string offset=[[SEC_INDEX]]
 ; CHECK-NEXT: .long 1
 ; CHECK-NEXT: .long [[RELOC]]
 ; CHECK-NEXT: .long [[TYPE_ID]]
 ; CHECK-NEXT: .long [[ACCESS_STR]]
+; CHECK-NEXT: .long 0
 declare dso_local i32 @get_value(i8*) local_unnamed_addr #1

Modified: llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-typedef.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-typedef.ll?rev=374099&r1=374098&r2=374099&view=diff
==============================================================================
--- llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-typedef.ll (original)
+++ llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-typedef.ll Tue Oct 8 11:23:17 2019
@@ -45,12 +45,13 @@ entry:
 ; CHECK-NEXT: .byte 0
 ; CHECK: .ascii "1:1:1" # string offset=[[ACCESS_STR:[0-9]+]]
 ; CHECK-NEXT: .byte 0
-; CHECK: .long 12 # OffsetReloc
-; CHECK-NEXT: .long [[SEC_STR:[0-9]+]] # Offset reloc section string offset=[[SEC_STR:[0-9]+]]
+; CHECK: .long 16 # FieldReloc
+; CHECK-NEXT: .long [[SEC_STR:[0-9]+]] # Field reloc section string offset=[[SEC_STR:[0-9]+]]
 ; CHECK-NEXT: .long 1
 ; CHECK-NEXT: .long [[RELOC:.Ltmp[0-9]+]]
 ; CHECK-NEXT: .long [[TYPE_ID:[0-9]+]]
 ; CHECK-NEXT: .long [[ACCESS_STR:[0-9]+]]
+; CHECK-NEXT: .long 0
 declare dso_local i32 @get_value(i8*) local_unnamed_addr #1

Modified: llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-union.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-union.ll?rev=374099&r1=374098&r2=374099&view=diff
==============================================================================
--- llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-union.ll (original)
+++ llvm/trunk/test/CodeGen/BPF/CORE/offset-reloc-union.ll Tue Oct 8 11:23:17 2019
@@ -133,17 +133,18 @@ define dso_local i32
@bpf_prog(%union.sk ; CHECK-NEXT: .long 20 ; CHECK-NEXT: .long 76 ; CHECK-NEXT: .long 96 -; CHECK-NEXT: .long 24 -; CHECK-NEXT: .long 120 +; CHECK-NEXT: .long 28 +; CHECK-NEXT: .long 124 ; CHECK-NEXT: .long 0 ; CHECK-NEXT: .long 8 # FuncInfo -; CHECK: .long 12 # OffsetReloc -; CHECK-NEXT: .long 54 # Offset reloc section string offset=54 +; CHECK: .long 16 # FieldReloc +; CHECK-NEXT: .long 54 # Field reloc section string offset=54 ; CHECK-NEXT: .long 1 ; CHECK-NEXT: .long .Ltmp2 ; CHECK-NEXT: .long 2 ; CHECK-NEXT: .long 97 +; CHECK-NEXT: .long 0 ; Function Attrs: argmemonly nounwind declare void @llvm.lifetime.start.p0i8(i64 immarg, i8* nocapture) #1 From llvm-commits at lists.llvm.org Tue Oct 8 11:25:51 2019 From: llvm-commits at lists.llvm.org (Francis Visoiu Mistrih via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 18:25:51 +0000 (UTC) Subject: [PATCH] D68611: [IRGen] Emit lifetime markers for temporary struct allocas In-Reply-To: References: Message-ID: <1c5c8d7e8391d0e9078925897712f9e2@localhost.localdomain> thegameg updated this revision to Diff 223907. thegameg added a comment. Move cleanup code after emitting lifetime start. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68611/new/ https://reviews.llvm.org/D68611 Files: clang/lib/CodeGen/CGCall.cpp clang/test/CodeGen/aarch64-byval-temp.c -------------- next part -------------- A non-text attachment was scrubbed... Name: D68611.223907.patch Type: text/x-patch Size: 7414 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 11:25:52 2019 From: llvm-commits at lists.llvm.org (Siva Chandra via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 18:25:52 +0000 (UTC) Subject: [PATCH] D67867: [libc] Add few docs and implementation of strcpy and strcat. In-Reply-To: References: Message-ID: sivachandra added a comment. In D67867#1698316 , @jyknight wrote: > That this is now committed does not change anything w.r.t. needing to respond to outstanding comments. Absolutely! 
I have heard everyone, and will try my best to address everything. FWIW, I am not happy myself that I am unable to address jyknight's concerns on post-processing in an acceptable fashion. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67867/new/ https://reviews.llvm.org/D67867 From llvm-commits at lists.llvm.org Tue Oct 8 11:25:53 2019 From: llvm-commits at lists.llvm.org (Melanie Blower via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 18:25:53 +0000 (UTC) Subject: [PATCH] D62731: [RFC] Add support for options -frounding-math, -fp-model=, and -fp-exception-behavior=, : Specify floating point behavior In-Reply-To: References: Message-ID: mibintc updated this revision to Diff 223908. mibintc added a comment. I made a couple wording changes suggested by @rjmccall Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D62731/new/ https://reviews.llvm.org/D62731 Files: clang/docs/UsersManual.rst clang/include/clang/Basic/CodeGenOptions.def clang/include/clang/Basic/LangOptions.h clang/include/clang/Driver/Options.td clang/lib/CodeGen/BackendUtil.cpp clang/lib/CodeGen/CodeGenFunction.cpp clang/lib/CodeGen/CodeGenFunction.h clang/lib/Driver/ToolChains/Clang.cpp clang/lib/Frontend/CompilerInvocation.cpp clang/test/CodeGen/fpconstrained.c clang/test/Driver/clang_f_opts.c clang/test/Driver/fast-math.c llvm/include/llvm/Target/TargetOptions.h -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D62731.223908.patch Type: text/x-patch Size: 30598 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 11:26:04 2019 From: llvm-commits at lists.llvm.org (Yonghong Song via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 18:26:04 +0000 (UTC) Subject: [PATCH] D67980: [BPF] do compile-once run-everywhere relocation for bitfields In-Reply-To: References: Message-ID: <8f8621838c7d96024719a5801ad70421@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rG05e46979d2f4: [BPF] do compile-once run-everywhere relocation for bitfields (authored by yonghong-song). Changed prior to commit: https://reviews.llvm.org/D67980?vs=223792&id=223909#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67980/new/ https://reviews.llvm.org/D67980 Files: clang/include/clang/Basic/BuiltinsBPF.def clang/include/clang/Basic/DiagnosticSemaKinds.td clang/include/clang/Basic/TargetBuiltins.h clang/include/clang/Sema/Sema.h clang/include/clang/module.modulemap clang/lib/Basic/Targets/BPF.cpp clang/lib/Basic/Targets/BPF.h clang/lib/CodeGen/CGBuiltin.cpp clang/lib/CodeGen/CGExpr.cpp clang/lib/CodeGen/CodeGenFunction.h clang/lib/Sema/SemaChecking.cpp clang/test/CodeGen/builtins-bpf-preserve-field-info-1.c clang/test/CodeGen/builtins-bpf-preserve-field-info-2.c clang/test/Sema/builtins-bpf.c llvm/include/llvm/IR/IntrinsicsBPF.td llvm/lib/Target/BPF/BPF.h llvm/lib/Target/BPF/BPFAbstractMemberAccess.cpp llvm/lib/Target/BPF/BPFCORE.h llvm/lib/Target/BPF/BPFTargetMachine.cpp llvm/lib/Target/BPF/BTF.h llvm/lib/Target/BPF/BTFDebug.cpp llvm/lib/Target/BPF/BTFDebug.h llvm/test/CodeGen/BPF/CORE/intrinsic-array.ll llvm/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-byte-size-1.ll llvm/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-byte-size-2.ll llvm/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-byte-size-3.ll llvm/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-existence-1.ll 
llvm/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-existence-2.ll llvm/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-existence-3.ll llvm/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-lshift-1.ll llvm/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-lshift-2.ll llvm/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-rshift-1.ll llvm/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-rshift-2.ll llvm/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-rshift-3.ll llvm/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-signedness-1.ll llvm/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-signedness-2.ll llvm/test/CodeGen/BPF/CORE/intrinsic-fieldinfo-signedness-3.ll llvm/test/CodeGen/BPF/CORE/intrinsic-struct.ll llvm/test/CodeGen/BPF/CORE/intrinsic-union.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-access-str.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-basic.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-cast-array-1.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-cast-array-2.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-cast-struct-1.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-cast-struct-2.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-cast-struct-3.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-cast-union-1.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-cast-union-2.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-end-load.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-end-ret.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-fieldinfo-1.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-fieldinfo-2.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-global-1.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-global-2.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-global-3.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-ignore.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-middle-chain.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-multi-array-1.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-multi-array-2.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-multilevel.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-pointer-1.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-pointer-2.ll 
llvm/test/CodeGen/BPF/CORE/offset-reloc-struct-anonymous.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-struct-array.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-typedef-array.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-typedef-struct.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-typedef-union.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-typedef.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-union.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D67980.223909.patch Type: text/x-patch Size: 226230 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 11:35:22 2019 From: llvm-commits at lists.llvm.org (Jordan Rupprecht via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 18:35:22 +0000 (UTC) Subject: [PATCH] D66282: [llvm-objcopy][MachO] Implement --remove-section In-Reply-To: References: Message-ID: <1111f0767d27aeeb3102a331d7c77601@localhost.localdomain> rupprecht marked an inline comment as done. rupprecht added inline comments. ================ Comment at: llvm/tools/llvm-objcopy/MachO/MachOObjcopy.cpp:42-46 if (!Config.OnlySection.empty()) { RemovePred = [&Config, RemovePred](const Section &Sec) { return !Config.OnlySection.matches(Sec.CanonicalName); }; } ---------------- rupprecht wrote: > Not related to this patch, but looks like this discards `RemovePred`, e.g. `--strip-all --only-section` will effectively work like `--only-section`. > > I found this after noticing that `RemovePred` is not referenced in the bit you added, which is actually fine there but error prone if the iterative construction of `RemovePred` is ever reordered. ELF objcopy works like this too, but correctly chains `RemovePred` for the `--only-section` switch. Nevermind, that flag takes priority, so that's WAI. Ignore this comment. 
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D66282/new/ https://reviews.llvm.org/D66282 From llvm-commits at lists.llvm.org Tue Oct 8 11:35:23 2019 From: llvm-commits at lists.llvm.org (Melanie Blower via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 18:35:23 +0000 (UTC) Subject: [PATCH] D62731: [RFC] Add support for options -frounding-math, -fp-model=, and -fp-exception-behavior=, : Specify floating point behavior In-Reply-To: References: Message-ID: <508db959d37ae9c79448b28508c68687@localhost.localdomain> mibintc marked 2 inline comments as done. mibintc added inline comments. ================ Comment at: clang/docs/UsersManual.rst:1341 + has been selected, then the compiler will issue a diagnostic warning + that the override has occurred. + ---------------- rjmccall wrote: > That's not typical driver behavior; why this choice? The rationale for the warnings is that the floating point options are sufficiently complicated that it makes sense to warn the users that one of the later options supplied on the command line is undoing a choice made earlier. It's not obvious that e.g. the setting for fassociative-math is also controlled by -fp-model=strict. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D62731/new/ https://reviews.llvm.org/D62731 From llvm-commits at lists.llvm.org Tue Oct 8 11:36:04 2019 From: llvm-commits at lists.llvm.org (Melanie Blower via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 18:36:04 +0000 (UTC) Subject: [PATCH] D62731: [RFC] Add support for options -frounding-math, -fp-model=, and -fp-exception-behavior=, : Specify floating point behavior In-Reply-To: References: Message-ID: <71cfa5b520697db2732986e07f7d108c@localhost.localdomain> mibintc marked 2 inline comments as done. mibintc added inline comments.
================ Comment at: clang/include/clang/Driver/Options.td:927 def fdenormal_fp_math_EQ : Joined<["-"], "fdenormal-fp-math=">, Group, Flags<[CC1Option]>; +def ffp_model_EQ : Joined<["-"], "ffp-model=">, Group, Flags<[DriverOption]>, + HelpText<"Controls the semantics of floating-point calculations.">; ---------------- The ffp-model= option is just a Driver option, it is rewritten into combinations of lower level options like ffp-exception-behavior and frounding-math: it's not a cc1 option. ================ Comment at: clang/lib/Driver/ToolChains/Clang.cpp:2326 bool SignedZeros = true; - bool TrappingMath = true; + bool TrappingMath = false; + bool RoundingFPMath = false; ---------------- By default, floating point exceptions are masked. Previously this was set to true, but the value wasn't used. This patch implements support for trapping-math Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D62731/new/ https://reviews.llvm.org/D62731 From llvm-commits at lists.llvm.org Tue Oct 8 11:41:32 2019 From: llvm-commits at lists.llvm.org (Daniel Sanders via llvm-commits) Date: Tue, 08 Oct 2019 18:41:32 -0000 Subject: [llvm] r374101 - [tblgen] Add getOperatorAsDef() to Record Message-ID: <20191008184132.A769C877C8@lists.llvm.org> Author: dsanders Date: Tue Oct 8 11:41:32 2019 New Revision: 374101 URL: http://llvm.org/viewvc/llvm-project?rev=374101&view=rev Log: [tblgen] Add getOperatorAsDef() to Record Summary: While working with DagInit's, it's often the case that you expect the operator to be a reference to a def. This patch adds a wrapper for this common case to reduce the amount of boilerplate callers need to duplicate repeatedly. getOperatorAsDef() returns the record if the DagInit has an operator that is a DefInit. Otherwise, it prints a fatal error. 
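For readers unfamiliar with the pattern, the following is a minimal standalone sketch of what such a wrapper does. It is not the committed code: the types `Record`, `Init`, `DefInit`, `IntInit` and `DagInit` are hypothetical stand-ins for the TableGen classes, `dynamic_cast` stands in for `dyn_cast`, the source-location argument is omitted, and a thrown exception stands in for `PrintFatalError`. The helper `operatorIsNotADef` exists only for the illustration.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical stand-ins for the TableGen classes; not the real LLVM types.
struct Record {
  std::string Name;
};

struct Init {
  virtual ~Init() = default;
};

// An initializer that refers to a record definition.
struct DefInit : Init {
  explicit DefInit(Record *R) : Def(R) {}
  Record *getDef() const { return Def; }
  Record *Def;
};

// Some other kind of initializer, used to exercise the error path.
struct IntInit : Init {};

struct DagInit {
  explicit DagInit(Init *Op) : Val(Op) {}
  Init *getOperator() const { return Val; }

  // Returns the record when the operator is a DefInit; otherwise reports an
  // error. This sketch throws where the real helper calls PrintFatalError.
  Record *getOperatorAsDef() const {
    if (auto *DefI = dynamic_cast<DefInit *>(Val))
      return DefI->getDef();
    throw std::runtime_error("Expected record as operator");
  }

  Init *Val;
};

// Illustration-only helper: true if getOperatorAsDef() reports an error.
inline bool operatorIsNotADef(const DagInit &D) {
  try {
    D.getOperatorAsDef();
  } catch (const std::runtime_error &) {
    return true;
  }
  return false;
}
```

With a helper of this shape, a caller replaces the cast, the null check and the error call with a single `getOperatorAsDef()` call, at the cost of a more generic error message.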
There's only a few pre-existing examples in LLVM at the moment and I've left a few instances of the code this simplifies as they had more specific error messages than the generic one this produces. I'm going to be using this a fair bit in my subsequent patches. Reviewers: bogner, volkan, nhaehnle Reviewed By: nhaehnle Subscribers: nhaehnle, hiraditya, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, lenary, s.egerton, pzheng, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68424 Modified: llvm/trunk/include/llvm/TableGen/Record.h llvm/trunk/lib/TableGen/Record.cpp llvm/trunk/utils/TableGen/AsmWriterEmitter.cpp llvm/trunk/utils/TableGen/RISCVCompressInstEmitter.cpp Modified: llvm/trunk/include/llvm/TableGen/Record.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/TableGen/Record.h?rev=374101&r1=374100&r2=374101&view=diff ============================================================================== --- llvm/trunk/include/llvm/TableGen/Record.h (original) +++ llvm/trunk/include/llvm/TableGen/Record.h Tue Oct 8 11:41:32 2019 @@ -1330,6 +1330,7 @@ public: void Profile(FoldingSetNodeID &ID) const; Init *getOperator() const { return Val; } + Record *getOperatorAsDef(ArrayRef<SMLoc> Loc) const; StringInit *getName() const { return ValName; } Modified: llvm/trunk/lib/TableGen/Record.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/TableGen/Record.cpp?rev=374101&r1=374100&r2=374101&view=diff ============================================================================== --- llvm/trunk/lib/TableGen/Record.cpp (original) +++ llvm/trunk/lib/TableGen/Record.cpp Tue Oct 8 11:41:32 2019 @@ -1930,6 +1930,13 @@ void DagInit::Profile(FoldingSetNodeID & ProfileDagInit(ID, Val, ValName, makeArrayRef(getTrailingObjects<Init *>(), NumArgs), makeArrayRef(getTrailingObjects<StringInit *>(), NumArgNames)); } +Record *DagInit::getOperatorAsDef(ArrayRef<SMLoc> Loc)
const { + if (DefInit *DefI = dyn_cast<DefInit>(Val)) + return DefI->getDef(); + PrintFatalError(Loc, "Expected record as operator"); + return nullptr; +} + Init *DagInit::resolveReferences(Resolver &R) const { SmallVector<Init*, 8> NewArgs; NewArgs.reserve(arg_size()); Modified: llvm/trunk/utils/TableGen/AsmWriterEmitter.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/TableGen/AsmWriterEmitter.cpp?rev=374101&r1=374100&r2=374101&view=diff ============================================================================== --- llvm/trunk/utils/TableGen/AsmWriterEmitter.cpp (original) +++ llvm/trunk/utils/TableGen/AsmWriterEmitter.cpp Tue Oct 8 11:41:32 2019 @@ -784,8 +784,7 @@ void AsmWriterEmitter::EmitPrintAliasIns continue; // Aliases with priority 0 are never emitted. const DagInit *DI = R->getValueAsDag("ResultInst"); - const DefInit *Op = cast<DefInit>(DI->getOperator()); - AliasMap[getQualifiedName(Op->getDef())].insert( + AliasMap[getQualifiedName(DI->getOperatorAsDef(R->getLoc()))].insert( std::make_pair(CodeGenInstAlias(R, Target), Priority)); } Modified: llvm/trunk/utils/TableGen/RISCVCompressInstEmitter.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/TableGen/RISCVCompressInstEmitter.cpp?rev=374101&r1=374100&r2=374101&view=diff ============================================================================== --- llvm/trunk/utils/TableGen/RISCVCompressInstEmitter.cpp (original) +++ llvm/trunk/utils/TableGen/RISCVCompressInstEmitter.cpp Tue Oct 8 11:41:32 2019 @@ -411,12 +411,8 @@ void RISCVCompressInstEmitter::evaluateC assert(SourceDag && "Missing 'Input' in compress pattern!"); LLVM_DEBUG(dbgs() << "Input: " << *SourceDag << "\n"); - DefInit *OpDef = dyn_cast<DefInit>(SourceDag->getOperator()); - if (!OpDef) - PrintFatalError(Rec->getLoc(), - Rec->getName() + " has unexpected operator type!"); // Checking we are transforming from compressed to uncompressed instructions.
- Record *Operator = OpDef->getDef(); + Record *Operator = SourceDag->getOperatorAsDef(Rec->getLoc()); if (!Operator->isSubClassOf("RVInst")) PrintFatalError(Rec->getLoc(), "Input instruction '" + Operator->getName() + "' is not a 32 bit wide instruction!"); @@ -428,12 +424,7 @@ void RISCVCompressInstEmitter::evaluateC assert(DestDag && "Missing 'Output' in compress pattern!"); LLVM_DEBUG(dbgs() << "Output: " << *DestDag << "\n"); - DefInit *DestOpDef = dyn_cast<DefInit>(DestDag->getOperator()); - if (!DestOpDef) - PrintFatalError(Rec->getLoc(), - Rec->getName() + " has unexpected operator type!"); - - Record *DestOperator = DestOpDef->getDef(); + Record *DestOperator = DestDag->getOperatorAsDef(Rec->getLoc()); if (!DestOperator->isSubClassOf("RVInst16")) PrintFatalError(Rec->getLoc(), "Output instruction '" + DestOperator->getName() + From llvm-commits at lists.llvm.org Tue Oct 8 11:44:32 2019 From: llvm-commits at lists.llvm.org (Douglas Gliner via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 18:44:32 +0000 (UTC) Subject: [PATCH] D51018: [sancov] Fixed malformed JSON when symbolizing coverage information In-Reply-To: References: Message-ID: <23c4964aa206a880bfbf5883f51c0243@localhost.localdomain> dgg5503 added a comment. Ping. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D51018/new/ https://reviews.llvm.org/D51018 From llvm-commits at lists.llvm.org Tue Oct 8 11:44:32 2019 From: llvm-commits at lists.llvm.org (Cameron McInally via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 18:44:32 +0000 (UTC) Subject: [PATCH] D68203: [SelectionDAG][SVE] Add ISD node for VSCALE. In-Reply-To: References: Message-ID: <0fc7b6ae9ba9d506c6bb8628e1343117@localhost.localdomain> cameron.mcinally added a comment. What happens if we see IR like: %1 = mul i16 vscale, -2 Would that be promoted to i32 through `PromoteIntRes_VSCALE(...)`?
================ Comment at: lib/CodeGen/SelectionDAG/SelectionDAG.cpp:5141 + } + LLVM_FALLTHROUGH; case ISD::SRA: ---------------- This should probably have a test. Same for ISD::MUL. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68203/new/ https://reviews.llvm.org/D68203 From llvm-commits at lists.llvm.org Tue Oct 8 11:44:32 2019 From: llvm-commits at lists.llvm.org (Melanie Blower via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 18:44:32 +0000 (UTC) Subject: [PATCH] D62731: [RFC] Add support for options -frounding-math, -fp-model=, and -fp-exception-behavior=, : Specify floating point behavior In-Reply-To: References: Message-ID: <01e19451f637de52250a1b4f4c9d540c@localhost.localdomain> mibintc updated this revision to Diff 223911. mibintc added a comment. clean up some dead code Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D62731/new/ https://reviews.llvm.org/D62731 Files: clang/docs/UsersManual.rst clang/include/clang/Basic/CodeGenOptions.def clang/include/clang/Basic/LangOptions.h clang/include/clang/Driver/Options.td clang/lib/CodeGen/BackendUtil.cpp clang/lib/CodeGen/CodeGenFunction.cpp clang/lib/CodeGen/CodeGenFunction.h clang/lib/Driver/ToolChains/Clang.cpp clang/lib/Frontend/CompilerInvocation.cpp clang/test/CodeGen/fpconstrained.c clang/test/Driver/clang_f_opts.c clang/test/Driver/fast-math.c llvm/include/llvm/Target/TargetOptions.h -------------- next part -------------- A non-text attachment was scrubbed... Name: D62731.223911.patch Type: text/x-patch Size: 30129 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 11:44:48 2019 From: llvm-commits at lists.llvm.org (Daniel Sanders via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 18:44:48 +0000 (UTC) Subject: [PATCH] D68424: [tblgen] Add getOperatorAsDef() to Record In-Reply-To: References: Message-ID: This revision was automatically updated to reflect the committed changes. 
Closed by commit rG4b7cabf1e16f: [tblgen] Add getOperatorAsDef() to Record (authored by dsanders). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68424/new/ https://reviews.llvm.org/D68424 Files: llvm/include/llvm/TableGen/Record.h llvm/lib/TableGen/Record.cpp llvm/utils/TableGen/AsmWriterEmitter.cpp llvm/utils/TableGen/RISCVCompressInstEmitter.cpp Index: llvm/utils/TableGen/RISCVCompressInstEmitter.cpp =================================================================== --- llvm/utils/TableGen/RISCVCompressInstEmitter.cpp +++ llvm/utils/TableGen/RISCVCompressInstEmitter.cpp @@ -411,12 +411,8 @@ assert(SourceDag && "Missing 'Input' in compress pattern!"); LLVM_DEBUG(dbgs() << "Input: " << *SourceDag << "\n"); - DefInit *OpDef = dyn_cast<DefInit>(SourceDag->getOperator()); - if (!OpDef) - PrintFatalError(Rec->getLoc(), - Rec->getName() + " has unexpected operator type!"); // Checking we are transforming from compressed to uncompressed instructions. - Record *Operator = OpDef->getDef(); + Record *Operator = SourceDag->getOperatorAsDef(Rec->getLoc()); if (!Operator->isSubClassOf("RVInst")) PrintFatalError(Rec->getLoc(), "Input instruction '" + Operator->getName() + "' is not a 32 bit wide instruction!"); @@ -428,12 +424,7 @@ assert(DestDag && "Missing 'Output' in compress pattern!"); LLVM_DEBUG(dbgs() << "Output: " << *DestDag << "\n"); - DefInit *DestOpDef = dyn_cast<DefInit>(DestDag->getOperator()); - if (!DestOpDef) - PrintFatalError(Rec->getLoc(), - Rec->getName() + " has unexpected operator type!"); - - Record *DestOperator = DestOpDef->getDef(); + Record *DestOperator = DestDag->getOperatorAsDef(Rec->getLoc()); if (!DestOperator->isSubClassOf("RVInst16")) PrintFatalError(Rec->getLoc(), "Output instruction '" + DestOperator->getName() + Index: llvm/utils/TableGen/AsmWriterEmitter.cpp =================================================================== --- llvm/utils/TableGen/AsmWriterEmitter.cpp +++ llvm/utils/TableGen/AsmWriterEmitter.cpp @@
-784,8 +784,7 @@ continue; // Aliases with priority 0 are never emitted. const DagInit *DI = R->getValueAsDag("ResultInst"); - const DefInit *Op = cast<DefInit>(DI->getOperator()); - AliasMap[getQualifiedName(Op->getDef())].insert( + AliasMap[getQualifiedName(DI->getOperatorAsDef(R->getLoc()))].insert( std::make_pair(CodeGenInstAlias(R, Target), Priority)); } Index: llvm/lib/TableGen/Record.cpp =================================================================== --- llvm/lib/TableGen/Record.cpp +++ llvm/lib/TableGen/Record.cpp @@ -1930,6 +1930,13 @@ ProfileDagInit(ID, Val, ValName, makeArrayRef(getTrailingObjects<Init *>(), NumArgs), makeArrayRef(getTrailingObjects<StringInit *>(), NumArgNames)); } +Record *DagInit::getOperatorAsDef(ArrayRef<SMLoc> Loc) const { + if (DefInit *DefI = dyn_cast<DefInit>(Val)) + return DefI->getDef(); + PrintFatalError(Loc, "Expected record as operator"); + return nullptr; +} + Init *DagInit::resolveReferences(Resolver &R) const { SmallVector<Init*, 8> NewArgs; NewArgs.reserve(arg_size()); Index: llvm/include/llvm/TableGen/Record.h =================================================================== --- llvm/include/llvm/TableGen/Record.h +++ llvm/include/llvm/TableGen/Record.h @@ -1330,6 +1330,7 @@ void Profile(FoldingSetNodeID &ID) const; Init *getOperator() const { return Val; } + Record *getOperatorAsDef(ArrayRef<SMLoc> Loc) const; StringInit *getName() const { return ValName; } -------------- next part -------------- A non-text attachment was scrubbed... Name: D68424.223912.patch Type: text/x-patch Size: 3300 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 11:53:57 2019 From: llvm-commits at lists.llvm.org (Joseph Tremoulet via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 18:53:57 +0000 (UTC) Subject: [PATCH] D68656: Add ExceptionStream to llvm::Object::minidump Message-ID: JosephTremoulet created this revision. Herald added a project: LLVM. Herald added a subscriber: llvm-commits. This will allow updating MinidumpYAML and LLDB to use this common definition.
Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D68656 Files: llvm/include/llvm/BinaryFormat/Minidump.h llvm/include/llvm/BinaryFormat/MinidumpConstants.def llvm/include/llvm/Object/Minidump.h llvm/unittests/Object/MinidumpTest.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D68656.223913.patch Type: text/x-patch Size: 11063 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 11:53:58 2019 From: llvm-commits at lists.llvm.org (Joseph Tremoulet via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 18:53:58 +0000 (UTC) Subject: [PATCH] D68657: Update MinidumpYAML to use minidump::Exception for exception stream Message-ID: JosephTremoulet created this revision. Herald added subscribers: llvm-commits, hiraditya. Herald added a project: LLVM. Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D68657 Files: llvm/include/llvm/ObjectYAML/MinidumpYAML.h llvm/lib/ObjectYAML/MinidumpEmitter.cpp llvm/lib/ObjectYAML/MinidumpYAML.cpp llvm/unittests/ObjectYAML/MinidumpYAMLTest.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D68657.223914.patch Type: text/x-patch Size: 15415 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 12:01:48 2019 From: llvm-commits at lists.llvm.org (Jordan Rose via llvm-commits) Date: Tue, 08 Oct 2019 19:01:48 -0000 Subject: [llvm] r374102 - Mark several PointerIntPair methods as lvalue-only Message-ID: <20191008190148.D275282A29@lists.llvm.org> Author: jrose Date: Tue Oct 8 12:01:48 2019 New Revision: 374102 URL: http://llvm.org/viewvc/llvm-project?rev=374102&view=rev Log: Mark several PointerIntPair methods as lvalue-only No point in mutating 'this' if it's just going to be thrown away. 
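As a rough illustration of what "lvalue-only" means here: since C++11 a member function can carry a trailing `&` ref-qualifier, which stops it from being called on a temporary. The sketch below uses a hypothetical `Pair` class, not `PointerIntPair`, and spells the qualifier directly rather than going through the `LLVM_LVALUE_FUNCTION` macro that the patch uses. `CanSetInt` is an illustration-only detection helper.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical class for illustration; not llvm::PointerIntPair.
class Pair {
public:
  int getInt() const { return Value; }

  // The trailing '&' restricts this setter to lvalue objects, so mutating a
  // temporary that is about to be discarded no longer compiles.
  void setInt(int V) & { Value = V; }

private:
  int Value = 0;
};

// Detection helper: CanSetInt<T>::value is true when setInt(int) can be
// called on an expression of type T (Pair& models an lvalue, Pair an rvalue).
template <typename T, typename = void>
struct CanSetInt : std::false_type {};

template <typename T>
struct CanSetInt<T, decltype(std::declval<T>().setInt(0))> : std::true_type {};

static_assert(CanSetInt<Pair &>::value, "callable on an lvalue");
static_assert(!CanSetInt<Pair>::value, "rejected on an rvalue");
```

Under these assumptions, `Pair P; P.setInt(7);` compiles as before, while `Pair().setInt(7);` is rejected at compile time, which is the kind of accidental no-op the log message describes.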
https://reviews.llvm.org/D63945 Modified: llvm/trunk/include/llvm/ADT/PointerIntPair.h Modified: llvm/trunk/include/llvm/ADT/PointerIntPair.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/ADT/PointerIntPair.h?rev=374102&r1=374101&r2=374102&view=diff ============================================================================== --- llvm/trunk/include/llvm/ADT/PointerIntPair.h (original) +++ llvm/trunk/include/llvm/ADT/PointerIntPair.h Tue Oct 8 12:01:48 2019 @@ -13,6 +13,7 @@ #ifndef LLVM_ADT_POINTERINTPAIR_H #define LLVM_ADT_POINTERINTPAIR_H +#include "llvm/Support/Compiler.h" #include "llvm/Support/PointerLikeTypeTraits.h" #include "llvm/Support/type_traits.h" #include <cassert> @@ -59,19 +60,19 @@ public: IntType getInt() const { return (IntType)Info::getInt(Value); } - void setPointer(PointerTy PtrVal) { + void setPointer(PointerTy PtrVal) LLVM_LVALUE_FUNCTION { Value = Info::updatePointer(Value, PtrVal); } - void setInt(IntType IntVal) { + void setInt(IntType IntVal) LLVM_LVALUE_FUNCTION { Value = Info::updateInt(Value, static_cast<intptr_t>(IntVal)); } - void initWithPointer(PointerTy PtrVal) { + void initWithPointer(PointerTy PtrVal) LLVM_LVALUE_FUNCTION { Value = Info::updatePointer(0, PtrVal); } - void setPointerAndInt(PointerTy PtrVal, IntType IntVal) { + void setPointerAndInt(PointerTy PtrVal, IntType IntVal) LLVM_LVALUE_FUNCTION { Value = Info::updateInt(Info::updatePointer(0, PtrVal), static_cast<intptr_t>(IntVal)); } @@ -89,7 +90,7 @@ public: void *getOpaqueValue() const { return reinterpret_cast<void *>(Value); } - void setFromOpaqueValue(void *Val) { + void setFromOpaqueValue(void *Val) LLVM_LVALUE_FUNCTION { Value = reinterpret_cast<intptr_t>(Val); } From llvm-commits at lists.llvm.org Tue Oct 8 12:03:22 2019 From: llvm-commits at lists.llvm.org (Evandro Menezes via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 19:03:22 +0000 (UTC) Subject: [PATCH] D68257: [Support] Add mathematical constants In-Reply-To: References: Message-ID:
<7444c0fbdc5ee3ee3eafdd9d999bb8ff@localhost.localdomain> evandro updated this revision to Diff 223915. evandro added a comment. @efriedma, you had a point. I now trimmed the precision down to one digit short of seeing a change in the mantissa bits. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68257/new/ https://reviews.llvm.org/D68257 Files: llvm/include/llvm/Support/MathExtras.h llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D68257.223915.patch Type: text/x-patch Size: 5421 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 12:03:22 2019 From: llvm-commits at lists.llvm.org (Jordan Rose via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 19:03:22 +0000 (UTC) Subject: [PATCH] D63945: Mark several PointerIntPair methods as lvalue-only In-Reply-To: References: Message-ID: <6441b9accab74f7ffccfadc249650851@localhost.localdomain> jordan_rose closed this revision. jordan_rose added a comment. Committed in rL374102 . Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D63945/new/ https://reviews.llvm.org/D63945 From llvm-commits at lists.llvm.org Tue Oct 8 12:12:43 2019 From: llvm-commits at lists.llvm.org (Han Shen via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 19:12:43 +0000 (UTC) Subject: [PATCH] D68062: Propeller lld framework for basicblock sections In-Reply-To: References: Message-ID: shenhan added inline comments. ================ Comment at: lld/ELF/Propeller.cpp:65-66 + ++LineNo; + if (line.empty()) continue; + if (line[0] != '@') break; + ++outputFileTagSeen; ---------------- ruiu wrote: > Is this how clang-format formatted? If not, could you please run clang-format-diff to format this patch? Ah, sorry, I didn't re-run clang-format after I made changes. 
================ Comment at: lld/ELF/Propeller.cpp:82 +SymbolEntry *Propfile::findSymbol(StringRef symName) { + std::pair<StringRef, StringRef> symNameSplit = symName.split(".llvm."); + StringRef funcName; ---------------- ruiu wrote: > Could you write a comment as to what this magic string `.llvm.` is? The special handling of ".llvm." in the symbols is no longer needed. Removed all such occurrences. ================ Comment at: lld/ELF/PropellerELFCfg.cpp:1 +//===-------------------- PropellerELFCfg.cpp -----------------------------===// +// ---------------- MaskRay wrote: > There are still open questions about the linker rewriting approach: https://lists.llvm.org/pipermail/llvm-dev/2019-October/135616.html > > Let's see what conclusions people will reach. Yup, we (Sri, Rahman and I) are working on replies for the points brought out by bolt engineers. ================ Comment at: lld/ELF/PropellerELFCfg.cpp:19 +#include "llvm/Object/ObjectFile.h" +// Needed by ELFSectionRef & ELFSymbolRef. +#include "llvm/Object/ELFObjectFile.h" ---------------- MaskRay wrote: > Delete the comment. > > I suspect a comment on its own line may interact badly with clang-format. Deleted all comments regarding header inclusion. ================ Comment at: lld/ELF/PropellerELFCfg.h:185 + // See implementaion comments in .cpp. + void buildCFG(ControlFlowGraph &cfg, const SymbolRef &cfgSym, + std::map> &nodeMap); ---------------- MaskRay wrote: > If the ordered property is not required, map -> unordered_map Yup, actually, the key of the map is the ordinal, which reflects the address order of each symbol. So here, the order property is required. ================ Comment at: lld/include/lld/Common/PropellerCommon.h:1 +#ifndef LLD_ELF_PROPELLER_COMMON_H +#define LLD_ELF_PROPELLER_COMMON_H ---------------- ruiu wrote: > shenhan wrote: > > ruiu wrote: > > > shenhan wrote: > > > > ruiu wrote: > > > > > This header is included only once by another header, so please merge them together. > > > > Yup, let me explain.
We have the create_llvm_prof tool, which is part of the google/autofdo repository (https://github.com/shenhanc78/autofdo/blob/plo-dev/llvm_propeller_profile_writer.h#L14). The create_llvm_prof tool and Propeller need exactly the same SymbolEntry definition to cooperate. Copying the class definition into the google/autofdo repository is not a good idea, so we created this common header file. > > > > > > > > > > > This is lld's private header directory. If you have a header that is shared by multiple LLVM subprojects, consider moving it to LLVM. > > I found PropellerCommon.h exported into the llvm installation directory here: > > llvm-install/include/lld/Common > > along with llvm-install/include/llvm and llvm-install/include/clang. > > > > And I think it's probably inappropriate to place PropellerCommon.h anywhere under llvm-install/include/clang or llvm-install/include/llvm. > So you are using this file only within lld? If so, move this to lld/ELF, because we don't have a non-ELF implementation yet. Also, please consider renaming SymbolEntry, even if this file is auto-generated by some other script. That name is extremely confusing in the linker's context. "Symbol" is one of the central data structures in the linker, and when you say "symbol", that means the symbol that we read from object files. No, I am sharing this file between lld and create_llvm_prof (which sits in another Google repository - https://github.com/shenhanc78/autofdo/blob/plo-dev/llvm_propeller_profile_writer.h#L14 ). I'm thinking of renaming it to "BBSectionEntry", what do you think?
Repository: rLLD LLVM Linker CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68062/new/ https://reviews.llvm.org/D68062 From llvm-commits at lists.llvm.org Tue Oct 8 12:12:43 2019 From: llvm-commits at lists.llvm.org (Nikita Popov via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 19:12:43 +0000 (UTC) Subject: [PATCH] D68654: [CVP} Replace SExt with ZExt if the input is known-non-negative In-Reply-To: References: Message-ID: nikic accepted this revision. nikic added a comment. This revision is now accepted and ready to land. LGTM. Especially as we already have the corresponding `ashr` to `lshr` transform, this seems like an obvious extension. ================ Comment at: llvm/test/Transforms/CorrelatedValuePropagation/sext.ll:46 ;; Negative test to show transform doesn't happen unless n > 0. define void @test2(i32 %n) { ---------------- nit: `n >= 0` Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68654/new/ https://reviews.llvm.org/D68654 From llvm-commits at lists.llvm.org Tue Oct 8 12:12:43 2019 From: llvm-commits at lists.llvm.org (Cameron McInally via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 19:12:43 +0000 (UTC) Subject: [PATCH] D68202: [LLVM IR] Add `vscale` as a symbolic constant. In-Reply-To: References: Message-ID: <0702e4f5c897536d037ae832276df784@localhost.localdomain> cameron.mcinally added a comment. Looks pretty good. This could probably use some tests in `TypeAndConstantValueParsing` of `llvm/unittests/AsmParser/AsmParserTest.cpp`. In particular, an `isa(...)` test would be good. And maybe a negative test to make sure something like `` can't be parsed. 
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68202/new/ https://reviews.llvm.org/D68202 From llvm-commits at lists.llvm.org Tue Oct 8 12:12:42 2019 From: llvm-commits at lists.llvm.org (Han Shen via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 19:12:42 +0000 (UTC) Subject: [PATCH] D68062: Propeller lld framework for basicblock sections In-Reply-To: References: Message-ID: <57f76bfc0b4345ceff01f35ac70e3e4a@localhost.localdomain> shenhan updated this revision to Diff 223918. shenhan marked 22 inline comments as done. Repository: rLLD LLVM Linker CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68062/new/ https://reviews.llvm.org/D68062 Files: lld/ELF/Propeller.cpp lld/ELF/Propeller.h lld/include/lld/Common/PropellerCommon.h lld/test/ELF/propeller/Inputs/bad-propeller-1.data lld/test/ELF/propeller/Inputs/bad-propeller-2.data lld/test/ELF/propeller/Inputs/bad-propeller-3.data lld/test/ELF/propeller/Inputs/bad-propeller-4.data lld/test/ELF/propeller/Inputs/bad-propeller-5.data lld/test/ELF/propeller/Inputs/propeller-2.data lld/test/ELF/propeller/Inputs/propeller-3.data lld/test/ELF/propeller/Inputs/propeller.data lld/test/ELF/propeller/Inputs/sample.c lld/test/ELF/propeller/propeller-bad-profile-1.s lld/test/ELF/propeller/propeller-bad-profile-2.s lld/test/ELF/propeller/propeller-bad-profile-3.s lld/test/ELF/propeller/propeller-bad-profile-4.s lld/test/ELF/propeller/propeller-bad-profile-5.s lld/test/ELF/propeller/propeller-bbsections-dump.s lld/test/ELF/propeller/propeller-compressed-strtab-lto.s lld/test/ELF/propeller/propeller-compressed-strtab.s lld/test/ELF/propeller/propeller-error-on-bblabels.s lld/test/ELF/propeller/propeller-keep-bb-symbols.s lld/test/ELF/propeller/propeller-lto-bbsections-dump.s lld/test/ELF/propeller/propeller-opt-all-combinations.s lld/test/ELF/propeller/propeller-skip.s lld/test/ELF/propeller/propeller-symbol-order-dump.s -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68062.223918.patch Type: text/x-patch Size: 69044 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 12:22:04 2019 From: llvm-commits at lists.llvm.org (Cameron McInally via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 19:22:04 +0000 (UTC) Subject: [PATCH] D68203: [SelectionDAG][SVE] Add ISD node for VSCALE. In-Reply-To: References: Message-ID: cameron.mcinally added inline comments. ================ Comment at: lib/CodeGen/SelectionDAG/SelectionDAG.cpp:5141 + } + LLVM_FALLTHROUGH; case ISD::SRA: ---------------- cameron.mcinally wrote: > This should probably have a test. Same for ISD::MUL. Sorry, just SHL. MUL tests are there... CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68203/new/ https://reviews.llvm.org/D68203 From llvm-commits at lists.llvm.org Tue Oct 8 12:22:04 2019 From: llvm-commits at lists.llvm.org (Alina Sbirlea via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 19:22:04 +0000 (UTC) Subject: [PATCH] D68659: [MemorySSA] Make the use of moveAllAfterMergeBlocks consistent. Message-ID: asbirlea created this revision. asbirlea added a reviewer: george.burgess.iv. Herald added subscribers: sanjoy.google, Prazek. Herald added a project: LLVM. The rule for the moveAllAfterMergeBlocks API is for all instructions from `From` to have been moved to `To`, while keeping the CFG edges (and block terminators) unchanged. Update all the callsites for moveAllAfterMergeBlocks to follow this. Pending follow-up: since the same behavior is needed every time, merge all callsites into one. The common denominator may be the call to `MergeBlockIntoPredecessor`. Resolves PR43569. Repository: rL LLVM https://reviews.llvm.org/D68659 Files: lib/Analysis/MemorySSAUpdater.cpp lib/Transforms/Scalar/LoopUnswitch.cpp lib/Transforms/Utils/BasicBlockUtils.cpp lib/Transforms/Utils/LoopRotationUtils.cpp test/Analysis/MemorySSA/pr43569.ll -------------- next part -------------- A non-text attachment was scrubbed...
Name: D68659.223920.patch Type: text/x-patch Size: 9275 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 12:29:09 2019 From: llvm-commits at lists.llvm.org (Azhar Mohammed via llvm-commits) Date: Tue, 08 Oct 2019 12:29:09 -0700 Subject: [compiler-rt] r373993 - [msan] Add interceptors: crypt, crypt_r. In-Reply-To: <20191008000030.E4A6B81D62@lists.llvm.org> References: <20191008000030.E4A6B81D62@lists.llvm.org> Message-ID: <7F17858A-231D-4A25-8EEF-B715E69D97FA@apple.com> This is failing on Darwin, can you please take a look. http://green.lab.llvm.org/green/job/clang-stage1-RA/2649/consoleFull ******************** TEST 'SanitizerCommon-asan-x86_64-Darwin :: Posix/crypt.cpp' FAILED ******************** Script: -- : 'RUN: at line 1'; /Users/buildslave/jenkins/workspace/clang-stage1-RA/clang-build/./bin/clang --driver-mode=g++ -gline-tables-only -fsanitize=address -arch x86_64 -stdlib=libc++ -mmacosx-version-min=10.9 -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.14.sdk -O0 -g /Users/buildslave/jenkins/workspace/clang-stage1-RA/llvm-project/compiler-rt/test/sanitizer_common/TestCases/Posix/crypt.cpp -o /Users/buildslave/jenkins/workspace/clang-stage1-RA/clang-build/tools/clang/runtime/compiler-rt-bins/test/sanitizer_common/asan-x86_64-Darwin/Posix/Output/crypt.cpp.tmp -lcrypt && /Users/buildslave/jenkins/workspace/clang-stage1-RA/clang-build/tools/clang/runtime/compiler-rt-bins/test/sanitizer_common/asan-x86_64-Darwin/Posix/Output/crypt.cpp.tmp -- Exit Code: 1 Command Output (stderr): -- <> <> <> <>ld: library not found for -lcrypt <> <> <> <> <>clang-10: error: linker command failed with exit code 1 (use -v to see invocation) > On Oct 7, 2019, at 5:00 PM, Evgeniy Stepanov via llvm-commits wrote: > > Author: eugenis > Date: Mon Oct 7 17:00:30 2019 > New Revision: 373993 > > URL: http://llvm.org/viewvc/llvm-project?rev=373993&view=rev > Log: > [msan] Add interceptors: crypt, crypt_r. 
> > Reviewers: vitalybuka > > Subscribers: srhines, #sanitizers, llvm-commits > > Tags: #sanitizers, #llvm > > Differential Revision: https://reviews.llvm.org/D68431 > > Added: > compiler-rt/trunk/test/sanitizer_common/TestCases/Linux/crypt_r.cpp > compiler-rt/trunk/test/sanitizer_common/TestCases/Posix/crypt.cpp > Modified: > compiler-rt/trunk/lib/sanitizer_common/sanitizer_common_interceptors.inc > compiler-rt/trunk/lib/sanitizer_common/sanitizer_platform_interceptors.h > compiler-rt/trunk/lib/sanitizer_common/sanitizer_platform_limits_posix.cpp > compiler-rt/trunk/lib/sanitizer_common/sanitizer_platform_limits_posix.h > > Modified: compiler-rt/trunk/lib/sanitizer_common/sanitizer_common_interceptors.inc > URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/sanitizer_common/sanitizer_common_interceptors.inc?rev=373993&r1=373992&r2=373993&view=diff > ============================================================================== > --- compiler-rt/trunk/lib/sanitizer_common/sanitizer_common_interceptors.inc (original) > +++ compiler-rt/trunk/lib/sanitizer_common/sanitizer_common_interceptors.inc Mon Oct 7 17:00:30 2019 > @@ -9573,6 +9573,41 @@ INTERCEPTOR(SSIZE_T, getrandom, void *bu > #define INIT_GETRANDOM > #endif > > +#if SANITIZER_INTERCEPT_CRYPT > +INTERCEPTOR(char *, crypt, char *key, char *salt) { > + void *ctx; > + COMMON_INTERCEPTOR_ENTER(ctx, crypt, key, salt); > + COMMON_INTERCEPTOR_READ_RANGE(ctx, key, internal_strlen(key) + 1); > + COMMON_INTERCEPTOR_READ_RANGE(ctx, salt, internal_strlen(salt) + 1); > + char *res = REAL(crypt)(key, salt); > + if (res != nullptr) > + COMMON_INTERCEPTOR_INITIALIZE_RANGE(res, internal_strlen(res) + 1); > + return res; > +} > +#define INIT_CRYPT COMMON_INTERCEPT_FUNCTION(crypt); > +#else > +#define INIT_CRYPT > +#endif > + > +#if SANITIZER_INTERCEPT_CRYPT_R > +INTERCEPTOR(char *, crypt_r, char *key, char *salt, void *data) { > + void *ctx; > + COMMON_INTERCEPTOR_ENTER(ctx, crypt_r, key, salt, data); > + 
COMMON_INTERCEPTOR_READ_RANGE(ctx, key, internal_strlen(key) + 1); > + COMMON_INTERCEPTOR_READ_RANGE(ctx, salt, internal_strlen(salt) + 1); > + char *res = REAL(crypt_r)(key, salt, data); > + if (res != nullptr) { > + COMMON_INTERCEPTOR_WRITE_RANGE(ctx, data, > + __sanitizer::struct_crypt_data_sz); > + COMMON_INTERCEPTOR_INITIALIZE_RANGE(res, internal_strlen(res) + 1); > + } > + return res; > +} > +#define INIT_CRYPT_R COMMON_INTERCEPT_FUNCTION(crypt_r); > +#else > +#define INIT_CRYPT_R > +#endif > + > static void InitializeCommonInterceptors() { > #if SI_POSIX > static u64 metadata_mem[sizeof(MetadataHashMap) / sizeof(u64) + 1]; > @@ -9871,6 +9906,8 @@ static void InitializeCommonInterceptors > INIT_GETUSERSHELL; > INIT_SL_INIT; > INIT_GETRANDOM; > + INIT_CRYPT; > + INIT_CRYPT_R; > > INIT___PRINTF_CHK; > } > > Modified: compiler-rt/trunk/lib/sanitizer_common/sanitizer_platform_interceptors.h > URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/sanitizer_common/sanitizer_platform_interceptors.h?rev=373993&r1=373992&r2=373993&view=diff > ============================================================================== > --- compiler-rt/trunk/lib/sanitizer_common/sanitizer_platform_interceptors.h (original) > +++ compiler-rt/trunk/lib/sanitizer_common/sanitizer_platform_interceptors.h Mon Oct 7 17:00:30 2019 > @@ -566,6 +566,8 @@ > #define SANITIZER_INTERCEPT_FDEVNAME SI_FREEBSD > #define SANITIZER_INTERCEPT_GETUSERSHELL (SI_POSIX && !SI_ANDROID) > #define SANITIZER_INTERCEPT_SL_INIT (SI_FREEBSD || SI_NETBSD) > +#define SANITIZER_INTERCEPT_CRYPT (SI_POSIX && !SI_ANDROID) > +#define SANITIZER_INTERCEPT_CRYPT_R (SI_LINUX && !SI_ANDROID) > > #define SANITIZER_INTERCEPT_GETRANDOM (SI_LINUX && __GLIBC_PREREQ(2, 25)) > #define SANITIZER_INTERCEPT___CXA_ATEXIT SI_NETBSD > > Modified: compiler-rt/trunk/lib/sanitizer_common/sanitizer_platform_limits_posix.cpp > URL: 
http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/sanitizer_common/sanitizer_platform_limits_posix.cpp?rev=373993&r1=373992&r2=373993&view=diff > ============================================================================== > --- compiler-rt/trunk/lib/sanitizer_common/sanitizer_platform_limits_posix.cpp (original) > +++ compiler-rt/trunk/lib/sanitizer_common/sanitizer_platform_limits_posix.cpp Mon Oct 7 17:00:30 2019 > @@ -140,6 +140,7 @@ typedef struct user_fpregs elf_fpregset_ > #include > #include > #include > +#include > #endif // SANITIZER_LINUX && !SANITIZER_ANDROID > > #if SANITIZER_ANDROID > @@ -240,6 +241,7 @@ namespace __sanitizer { > unsigned struct_ustat_sz = SIZEOF_STRUCT_USTAT; > unsigned struct_rlimit64_sz = sizeof(struct rlimit64); > unsigned struct_statvfs64_sz = sizeof(struct statvfs64); > + unsigned struct_crypt_data_sz = sizeof(struct crypt_data); > #endif // SANITIZER_LINUX && !SANITIZER_ANDROID > > #if SANITIZER_LINUX && !SANITIZER_ANDROID > > Modified: compiler-rt/trunk/lib/sanitizer_common/sanitizer_platform_limits_posix.h > URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/sanitizer_common/sanitizer_platform_limits_posix.h?rev=373993&r1=373992&r2=373993&view=diff > ============================================================================== > --- compiler-rt/trunk/lib/sanitizer_common/sanitizer_platform_limits_posix.h (original) > +++ compiler-rt/trunk/lib/sanitizer_common/sanitizer_platform_limits_posix.h Mon Oct 7 17:00:30 2019 > @@ -304,6 +304,7 @@ extern unsigned struct_msqid_ds_sz; > extern unsigned struct_mq_attr_sz; > extern unsigned struct_timex_sz; > extern unsigned struct_statvfs_sz; > +extern unsigned struct_crypt_data_sz; > #endif // SANITIZER_LINUX && !SANITIZER_ANDROID > > struct __sanitizer_iovec { > > Added: compiler-rt/trunk/test/sanitizer_common/TestCases/Linux/crypt_r.cpp > URL: 
http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/test/sanitizer_common/TestCases/Linux/crypt_r.cpp?rev=373993&view=auto > ============================================================================== > --- compiler-rt/trunk/test/sanitizer_common/TestCases/Linux/crypt_r.cpp (added) > +++ compiler-rt/trunk/test/sanitizer_common/TestCases/Linux/crypt_r.cpp Mon Oct 7 17:00:30 2019 > @@ -0,0 +1,37 @@ > +// RUN: %clangxx -O0 -g %s -lcrypt -o %t && %run %t > + > +#include > +#include > +#include > +#include > + > +#include > + > +int > +main (int argc, char** argv) > +{ > + { > + crypt_data cd; > + cd.initialized = 0; > + char *p = crypt_r("abcdef", "xz", &cd); > + volatile size_t z = strlen(p); > + } > + { > + crypt_data cd; > + cd.initialized = 0; > + char *p = crypt_r("abcdef", "$1$", &cd); > + volatile size_t z = strlen(p); > + } > + { > + crypt_data cd; > + cd.initialized = 0; > + char *p = crypt_r("abcdef", "$5$", &cd); > + volatile size_t z = strlen(p); > + } > + { > + crypt_data cd; > + cd.initialized = 0; > + char *p = crypt_r("abcdef", "$6$", &cd); > + volatile size_t z = strlen(p); > + } > +} > > Added: compiler-rt/trunk/test/sanitizer_common/TestCases/Posix/crypt.cpp > URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/test/sanitizer_common/TestCases/Posix/crypt.cpp?rev=373993&view=auto > ============================================================================== > --- compiler-rt/trunk/test/sanitizer_common/TestCases/Posix/crypt.cpp (added) > +++ compiler-rt/trunk/test/sanitizer_common/TestCases/Posix/crypt.cpp Mon Oct 7 17:00:30 2019 > @@ -0,0 +1,26 @@ > +// RUN: %clangxx -O0 -g %s -o %t -lcrypt && %run %t > + > +#include > +#include > +#include > + > +int > +main (int argc, char** argv) > +{ > + { > + char *p = crypt("abcdef", "xz"); > + volatile size_t z = strlen(p); > + } > + { > + char *p = crypt("abcdef", "$1$"); > + volatile size_t z = strlen(p); > + } > + { > + char *p = crypt("abcdef", "$5$"); > + volatile size_t z = strlen(p); > + } 
> + { > + char *p = crypt("abcdef", "$6$"); > + volatile size_t z = strlen(p); > + } > +} > > > _______________________________________________ > llvm-commits mailing list > llvm-commits at lists.llvm.org > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits -------------- next part -------------- An HTML attachment was scrubbed... URL: From llvm-commits at lists.llvm.org Tue Oct 8 12:31:34 2019 From: llvm-commits at lists.llvm.org (Thomas Lively via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 19:31:34 +0000 (UTC) Subject: [PATCH] D68527: [WebAssembly] v8x16.swizzle and rewrite BUILD_VECTOR lowering In-Reply-To: References: Message-ID: <532dfacb0b62d99836d2ed07e6aded0e@localhost.localdomain> tlively updated this revision to Diff 223922. tlively marked 3 inline comments as done. tlively added a comment. - Address variable naming comment Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68527/new/ https://reviews.llvm.org/D68527 Files: llvm/lib/Target/WebAssembly/WebAssemblyISD.def llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td llvm/test/CodeGen/WebAssembly/simd-build-vector.ll llvm/test/MC/WebAssembly/simd-encodings.s -------------- next part -------------- A non-text attachment was scrubbed... Name: D68527.223922.patch Type: text/x-patch Size: 21468 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 12:32:29 2019 From: llvm-commits at lists.llvm.org (Roman Lebedev via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 19:32:29 +0000 (UTC) Subject: [PATCH] D68654: [CVP} Replace SExt with ZExt if the input is known-non-negative In-Reply-To: References: Message-ID: <4355158bfec52ab7c881221f7308b137@localhost.localdomain> lebedev.ri added a comment. In D68654#1700223 , @nikic wrote: > LGTM. Especially as we already have the corresponding `ashr` to `lshr` transform, this seems like an obvious extension. 
Thank you for the review. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68654/new/ https://reviews.llvm.org/D68654 From llvm-commits at lists.llvm.org Tue Oct 8 12:41:06 2019 From: llvm-commits at lists.llvm.org (Alex Lorenz via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 19:41:06 +0000 (UTC) Subject: [PATCH] D68093: [clang-scan-deps][static analyzer] Support for clang --analyze in scan-deps In-Reply-To: References: Message-ID: <3ab40f97560e79f91082ee04fbf7cd08@localhost.localdomain> arphaman accepted this revision. arphaman added a comment. This revision is now accepted and ready to land. LGTM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68093/new/ https://reviews.llvm.org/D68093 From llvm-commits at lists.llvm.org Tue Oct 8 12:41:51 2019 From: llvm-commits at lists.llvm.org (Matt Arsenault via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 19:41:51 +0000 (UTC) Subject: [PATCH] D51932: [AMDGPU] Fix-up cases where writelane has 2 SGPR operands In-Reply-To: References: Message-ID: arsenm added inline comments. ================ Comment at: lib/Target/AMDGPU/SIFixSGPRCopies.cpp:815-816 + + // Check for trivially easy constant prop into one of the operands + // If this is the case then perform the operation now to resolve SGPR + // issue. If we don't do that here we will always insert a mov to m0 ---------------- I'm working on a patch to stop reserving m0; I suspect this will avoid the need for the special case propagation Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D51932/new/ https://reviews.llvm.org/D51932 From llvm-commits at lists.llvm.org Tue Oct 8 12:50:22 2019 From: llvm-commits at lists.llvm.org (Nico Weber via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 19:50:22 +0000 (UTC) Subject: [PATCH] D68133: [Symbolize] Use the local MSVC C++ demangler instead of relying on dbghelp. NFC. 
In-Reply-To: References: Message-ID: <4908f4982dece0f23f1141eeac890992@localhost.localdomain> thakis added a comment. But again, maybe the tests are just overly strict here? Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68133/new/ https://reviews.llvm.org/D68133 From llvm-commits at lists.llvm.org Tue Oct 8 12:50:22 2019 From: llvm-commits at lists.llvm.org (Nico Weber via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 19:50:22 +0000 (UTC) Subject: [PATCH] D68133: [Symbolize] Use the local MSVC C++ demangler instead of relying on dbghelp. NFC. In-Reply-To: References: Message-ID: <91e461be39762ba6eb60e231a5ced1e8@localhost.localdomain> thakis added a comment. In D68133#1694180, @mstorsjo wrote: > Had to revert this one, as it broke sanitizer tests: http://lab.llvm.org:8011/builders/sanitizer-windows/builds/52441 > > So apparently it seems (again) like it would be useful if the llvm microsoft demangler could be configured to leave out parts of the output. In D68134 (in LLDB) we can consider changing the testcases to include the new demangler output, but there was a concern that there might be cases within LLDB that try to parse the demangled function names. And here it significantly changes the libfuzzer deduplication token output, which I think also isn't ideal. > > @thakis - you brought it up earlier that adding options to it could be possible. Do you, who I think is familiar with that demangler, have time to make a PoC of it? Otherwise I might have a stab at it sometime later... That link is dead. Which customization is needed here? microsoftDemangle() already gets a Flags argument; we can just add more flags to it.
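As a rough illustration of the "just add more flags" idea above, here is a minimal, self-contained sketch of a bitmask that suppresses individual parts of a demangled name. This is not the LLVM demangler's API or implementation: `SymbolParts`, `SketchFlags`, the `SF_*` names, and `renderSymbol` are all hypothetical and exist only to show the pattern.

```cpp
#include <string>

// Hypothetical pieces of an already-demangled MSVC symbol. This is an
// assumption for illustration, not LLVM's internal representation.
struct SymbolParts {
  std::string Access;      // e.g. "public:"
  std::string ReturnType;  // e.g. "void"
  std::string CallingConv; // e.g. "__cdecl"
  std::string Name;        // e.g. "Foo::bar(int)"
};

// Hypothetical flag names; each bit suppresses one part of the output.
enum SketchFlags : unsigned {
  SF_None = 0,
  SF_NoAccessSpecifier = 1u << 0,
  SF_NoReturnType = 1u << 1,
  SF_NoCallingConvention = 1u << 2,
};

// Appends Part followed by a space, unless it is empty or suppressed.
static void appendUnless(std::string &Out, const std::string &Part,
                         bool Suppress) {
  if (Suppress || Part.empty())
    return;
  Out += Part;
  Out += ' ';
}

// Builds the printable name, leaving out whatever the flags suppress.
std::string renderSymbol(const SymbolParts &S, unsigned Flags) {
  std::string Out;
  appendUnless(Out, S.Access, Flags & SF_NoAccessSpecifier);
  appendUnless(Out, S.ReturnType, Flags & SF_NoReturnType);
  appendUnless(Out, S.CallingConv, Flags & SF_NoCallingConvention);
  Out += S.Name;
  return Out;
}
```

With all three bits set, a caller that only wants the bare function name (for example a test that matches on it) would get just `Name`, while the default keeps the full decorated form.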
Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68133/new/ https://reviews.llvm.org/D68133 From llvm-commits at lists.llvm.org Tue Oct 8 12:59:37 2019 From: llvm-commits at lists.llvm.org (Martin Storsjö via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 19:59:37 +0000 (UTC) Subject: [PATCH] D68133: [Symbolize] Use the local MSVC C++ demangler instead of relying on dbghelp. NFC. In-Reply-To: References: Message-ID: <6fa32780a663d8bff587babe0dbdc500@localhost.localdomain> mstorsjo added a comment. In D68133#1700276, @thakis wrote: > That link is dead. Which customization is needed here? Ah crap, I should have quoted the relevant bits... IIRC, it at least had extra `public:` at the start of the symbol, extra return type and calling convention specifiers. D68134 has a more concrete case of what needs to change in LLDB if the llvm demangler is used as-is there. > microsoftDemangle() already gets a Flags argument; we can just add more flags to it. Oh, that's nice! Then this probably shouldn't be too big a deal to extend. In D68133#1700277, @thakis wrote: > But again, maybe the tests are just overly strict here? I think for this particular case, for the fuzzer sanitizer's backtrace disambiguation log, there might be a risk that something is expecting to process the log, which might not be ready to handle extra unexpected keywords. I might be overly cautious though. In the LLDB case, a concern was voiced that some parts of LLDB might try to parse the demangled symbol names, and not expect those extra keywords there.
Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68133/new/ https://reviews.llvm.org/D68133 From llvm-commits at lists.llvm.org Tue Oct 8 12:59:39 2019 From: llvm-commits at lists.llvm.org (Evandro Menezes via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 19:59:39 +0000 (UTC) Subject: [PATCH] D68257: [Support] Add mathematical constants In-Reply-To: References: Message-ID: <30bf87affaa48f5f7b2456e1eee3fd35@localhost.localdomain> evandro updated this revision to Diff 223925. evandro added a comment. Added the hex FP constants for easier auditing. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68257/new/ https://reviews.llvm.org/D68257 Files: llvm/include/llvm/Support/MathExtras.h llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D68257.223925.patch Type: text/x-patch Size: 6048 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 12:59:44 2019 From: llvm-commits at lists.llvm.org (Roman Lebedev via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 19:59:44 +0000 (UTC) Subject: [PATCH] D29011: [IR] Add Freeze instruction In-Reply-To: References: Message-ID: lebedev.ri requested changes to this revision. lebedev.ri added a comment. This revision now requires changes to proceed. (marking as reviewed as per previous comments - constantexpr, typechecking inconsistent with langref) CHANGES SINCE LAST ACTION https://reviews.llvm.org/D29011/new/ https://reviews.llvm.org/D29011 From llvm-commits at lists.llvm.org Tue Oct 8 13:06:01 2019 From: llvm-commits at lists.llvm.org (Dan Liew via llvm-commits) Date: Tue, 08 Oct 2019 20:06:01 -0000 Subject: [compiler-rt] r374109 - Fix `compiler_rt_logbf_test.c` test failure for Builtins-i386-darwin test suite. 
Message-ID: <20191008200601.BEFC18F9C6@lists.llvm.org> Author: delcypher Date: Tue Oct 8 13:06:01 2019 New Revision: 374109 URL: http://llvm.org/viewvc/llvm-project?rev=374109&view=rev Log: Fix `compiler_rt_logbf_test.c` test failure for Builtins-i386-darwin test suite. Summary: It seems that compiler-rt's implementation and Darwin libm's implementation of `logbf()` differ when given a NaN with raised sign bit. Strangely this behaviour only happens with i386 Darwin libm. For x86_64 and x86_64h the existing compiler-rt implementation matched Darwin libm. To workaround this the `compiler_rt_logbf_test.c` has been modified to do a comparison on the `fp_t` type and if that fails check if both values are NaN. If both values are NaN they are equivalent and no error needs to be raised. rdar://problem/55565503 Reviewers: rupprecht, scanon, compnerd, echristo Subscribers: #sanitizers, llvm-commits Tags: #llvm, #sanitizers Differential Revision: https://reviews.llvm.org/D67999 Modified: compiler-rt/trunk/test/builtins/Unit/compiler_rt_logbf_test.c Modified: compiler-rt/trunk/test/builtins/Unit/compiler_rt_logbf_test.c URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/test/builtins/Unit/compiler_rt_logbf_test.c?rev=374109&r1=374108&r2=374109&view=diff ============================================================================== --- compiler-rt/trunk/test/builtins/Unit/compiler_rt_logbf_test.c (original) +++ compiler-rt/trunk/test/builtins/Unit/compiler_rt_logbf_test.c Tue Oct 8 13:06:01 2019 @@ -13,15 +13,19 @@ //===----------------------------------------------------------------------===// #define SINGLE_PRECISION +#include "fp_lib.h" +#include "int_math.h" #include #include -#include "fp_lib.h" int test__compiler_rt_logbf(fp_t x) { fp_t crt_value = __compiler_rt_logbf(x); fp_t libm_value = logbf(x); - // Compare actual rep, e.g. 
to avoid NaN != the same NaN - if (toRep(crt_value) != toRep(libm_value)) { + // `!=` operator on fp_t returns false for NaNs so also check if operands are + // both NaN. We don't do `toRepr(crt_value) != toRepr(libm_value)` because + // that treats different representations of NaN as not equivalent. + if (crt_value != libm_value && + !(crt_isnan(crt_value) && crt_isnan(libm_value))) { printf("error: in __compiler_rt_logb(%a [%X]) = %a [%X] != %a [%X]\n", x, toRep(x), crt_value, toRep(crt_value), libm_value, toRep(libm_value)); From llvm-commits at lists.llvm.org Tue Oct 8 13:14:58 2019 From: llvm-commits at lists.llvm.org (Cameron McInally via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 20:14:58 +0000 (UTC) Subject: [PATCH] D61675: [WIP] Update IRBuilder::CreateFNeg(...) to return a UnaryOperator In-Reply-To: References: Message-ID: cameron.mcinally added a comment. Sorry for the slow response. I've been busy with other projects. Are there any reservations about merging the Clang unary FNeg change? Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D61675/new/ https://reviews.llvm.org/D61675 From llvm-commits at lists.llvm.org Tue Oct 8 13:14:59 2019 From: llvm-commits at lists.llvm.org (Digger via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 20:14:59 +0000 (UTC) Subject: [PATCH] D68575: [llvm-readobj][xcoff] implement parsing overflow section header. In-Reply-To: References: Message-ID: <8302749b2695468908188918f961e7e6@localhost.localdomain> DiggerLin marked 7 inline comments as done. DiggerLin added inline comments. ================ Comment at: llvm/test/tools/llvm-readobj/xcoff-overflow-section.test:37 +# SECOVERFLOW-NEXT: Name: .ovrflo +# SECOVERFLOW-NEXT: NumberOfRelocations: 0x3 +# SECOVERFLOW-NEXT: NumberOfLineNumbers: 0x0 ---------------- hubert.reinterpretcast wrote: > The same information, when conveyed not via an overflow header, is expressed in decimal format (and not hexadecimal). changed as suggestion. 
================ Comment at: llvm/test/tools/llvm-readobj/xcoff-overflow-section.test:38 +# SECOVERFLOW-NEXT: NumberOfRelocations: 0x3 +# SECOVERFLOW-NEXT: NumberOfLineNumbers: 0x0 +# SECOVERFLOW-NEXT: Size: 0x0 ---------------- hubert.reinterpretcast wrote: > Ditto. changed as suggestion. ================ Comment at: llvm/test/tools/llvm-readobj/xcoff-overflow-section.test:41 +# SECOVERFLOW-NEXT: RawDataOffset: 0x0 +# SECOVERFLOW-NEXT: RelocationPointer: 0x0 +# SECOVERFLOW-NEXT: LineNumberPointer: 0x0 ---------------- hubert.reinterpretcast wrote: > According to the XCOFF documentation: The `s_relptr` and `s_lnnoptr` fields must have the same values as in the corresponding primary section header. changed raw data and test case ================ Comment at: llvm/tools/llvm-readobj/XCOFFDumper.cpp:441 + if (Obj.is64Bit()) + W.printString( + "64 bits XCOFF objectfile should not have overflow section!"); ---------------- hubert.reinterpretcast wrote: > I think we should not be printing a "warning" to the same stream as the output. changed ================ Comment at: llvm/tools/llvm-readobj/XCOFFDumper.cpp:448 + case XCOFF::STYP_TYPCHK: + // TODO : The interpret of loader, exception, type check section header + // are different from generic section header. We will implement them ---------------- hubert.reinterpretcast wrote: > s/TODO : /TODO /; > s/interpret/interpretation/; > s/, type check section header/, and type check section headers/; > s/from/from that of/; > s/generic section header/generic section headers/; > s/generic seciton header now/generic section headers for now/; changed as suggestion. ================ Comment at: llvm/tools/llvm-readobj/XCOFFDumper.cpp:456 // The most significant 16-bits represent the DWARF section subtype. For // now we just dump the section type flags. if (Flags & SectionFlagsReservedMask) ---------------- hubert.reinterpretcast wrote: > Since we removed the context of the code line from here, the comment is ambiguous. 
This should say that, for now, we dump only the section type (low order 16 bits). Changed as suggested. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68575/new/ https://reviews.llvm.org/D68575 From llvm-commits at lists.llvm.org Tue Oct 8 13:15:00 2019 From: llvm-commits at lists.llvm.org (Martin Storsjö via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 20:15:00 +0000 (UTC) Subject: [PATCH] D68133: [Symbolize] Use the local MSVC C++ demangler instead of relying on dbghelp. NFC. In-Reply-To: References: Message-ID: mstorsjo added a comment. In D68133#1700289, @mstorsjo wrote: > In D68133#1700276, @thakis wrote: > > > That link is dead. Which customization is needed here? > > > Ah crap, I should have quoted the relevant bits... > > IIRC, it at least had extra `public:` at the start of the symbol, extra return type and calling convention specifiers. D68134 has a more concrete case of what needs to change in LLDB if the llvm demangler is used as-is there. I guess ideally we would have individual flags for all the UnDecorateSymbolName flags used here: `UNDNAME_NO_ACCESS_SPECIFIERS`, `UNDNAME_NO_ALLOCATION_LANGUAGE`, `UNDNAME_NO_THROW_SIGNATURES`, `UNDNAME_NO_MEMBER_TYPE`, `UNDNAME_NO_MS_KEYWORDS`, `UNDNAME_NO_FUNCTION_RETURNS`. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68133/new/ https://reviews.llvm.org/D68133 From llvm-commits at lists.llvm.org Tue Oct 8 13:15:01 2019 From: llvm-commits at lists.llvm.org (John McCall via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 20:15:01 +0000 (UTC) Subject: [PATCH] D68611: [IRGen] Emit lifetime markers for temporary struct allocas In-Reply-To: References: Message-ID: <51c5f20be7d3d97333814f405186fa8f@localhost.localdomain> rjmccall accepted this revision. rjmccall added a comment. This revision is now accepted and ready to land.
LGTM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68611/new/ https://reviews.llvm.org/D68611 From llvm-commits at lists.llvm.org Tue Oct 8 13:15:01 2019 From: llvm-commits at lists.llvm.org (Nikita Popov via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 20:15:01 +0000 (UTC) Subject: [PATCH] D68643: [Codegen] Alter the default promotion for add_sat and sub_sat In-Reply-To: References: Message-ID: nikic accepted this revision. nikic added a comment. This revision is now accepted and ready to land. LGTM > This still uses the existing promotion when the promoted add/sub_sat is legal. In many situations (but not all) it would probably be better to just perform the new promotion. I guess at least for the unsigned case, it would be reasonable to use the shift+sat expansion only if both the sat is legal *and* min/max is not legal. The umax+sub expansion at least is clearly better than usubsat + 3 shifts. But that's probably better left for a followup patch. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68643/new/ https://reviews.llvm.org/D68643 From llvm-commits at lists.llvm.org Tue Oct 8 13:15:02 2019 From: llvm-commits at lists.llvm.org (Digger via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 20:15:02 +0000 (UTC) Subject: [PATCH] D68575: [llvm-readobj][xcoff] implement parsing overflow section header. In-Reply-To: References: Message-ID: DiggerLin updated this revision to Diff 223929. DiggerLin marked an inline comment as done. DiggerLin added a comment. address some comments Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68575/new/ https://reviews.llvm.org/D68575 Files: llvm/test/tools/llvm-readobj/Inputs/xcoff-reloc-overflow.o llvm/test/tools/llvm-readobj/xcoff-overflow-section.test llvm/tools/llvm-readobj/XCOFFDumper.cpp -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68575.223929.patch Type: text/x-patch Size: 6832 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 13:15:02 2019 From: llvm-commits at lists.llvm.org (David Blaikie via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 20:15:02 +0000 (UTC) Subject: [PATCH] D68465: [DebugInfo] Trim call-clobbered location list entries when tuning for GDB In-Reply-To: References: Message-ID: dblaikie added a comment. In D68465#1699862 , @probinson wrote: > In D68465#1698682 , @dblaikie wrote: > > > In D68465#1697824 , @probinson wrote: > > > > > @dblaikie I'm also not clear what you're suggesting about .debug_addr entry plus offset. DW_LLE_offset_pair does this, derived from the base address, which ought to be available for any given function, assuming DWARF v5. Can you explain more clearly what's missing? > > > > > > My point was generally that the debug_addr section shouldn't be including addresses with offsets, it should be the places that refer to debug_addr that use the offsets. The specific place I'd like to use offsets would be from FORM_addr in debug_info. But, yes, in this case the support for debug addr references from loclists, the forms are already sufficiently descriptive for this. > > > Ah, got it. The problem you're facing is that for a DIE, a FORM describes one value, while you want to describe two--the address (or index into .debug_addr), and a separate offset. For a DIE attribute, this would normally be done using an expression. Would it work to have (say) DW_AT_low_pc be allowed to have class exprloc? (It currently must have class address, either FORM_addr or one of the FORM_addrx's.) The expression can index into .debug_addr and then add an offset to the result. Yep, we (you, me, DWARF committee, other folks) have talked about it before, years ago - didn't mean to derail/rehash the whole thing.
I've prototyped it with expressions - you're right, doesn't require much change to the DWARF standard (if any, really - DWARF being permissive & all) but maybe a fair bit of change/work for DWARF consumers compared to a more specific encoding (& I'd previously asked/discussed what encoding would be suitable - given that both the address and the offset support various different fixed (& then variable) length encodings and having the combination of all of those would be painful - so having only the 2xuleb encoding was the last thing discussed). But, yeah, once we've got DWARFv5 fully implemented & deployed/functioning with LLDB and GDB I'll probably look at whether addr+offset is worthwhile. (the other, sort of cheaper workaround, is to use a rnglist of length 1 - not sure how much the addr+offset would save compared to a rnglist of length 1, etc - lots to compare, see what's worthwhile) Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68465/new/ https://reviews.llvm.org/D68465 From llvm-commits at lists.llvm.org Tue Oct 8 13:16:04 2019 From: llvm-commits at lists.llvm.org (Dan Liew via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 20:16:04 +0000 (UTC) Subject: [PATCH] D67999: Fix `compiler_rt_logbf_test.c` test failure for Builtins-i386-darwin test suite. In-Reply-To: References: Message-ID: <2d479a6a8245f0ff99a96d3217937836@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rG196eae533b09: Fix `compiler_rt_logbf_test.c` test failure for Builtins-i386-darwin test suite. (authored by delcypher).
Changed prior to commit: https://reviews.llvm.org/D67999?vs=222653&id=223931#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67999/new/ https://reviews.llvm.org/D67999 Files: compiler-rt/test/builtins/Unit/compiler_rt_logbf_test.c Index: compiler-rt/test/builtins/Unit/compiler_rt_logbf_test.c =================================================================== --- compiler-rt/test/builtins/Unit/compiler_rt_logbf_test.c +++ compiler-rt/test/builtins/Unit/compiler_rt_logbf_test.c @@ -13,15 +13,19 @@ //===----------------------------------------------------------------------===// #define SINGLE_PRECISION +#include "fp_lib.h" +#include "int_math.h" #include #include -#include "fp_lib.h" int test__compiler_rt_logbf(fp_t x) { fp_t crt_value = __compiler_rt_logbf(x); fp_t libm_value = logbf(x); - // Compare actual rep, e.g. to avoid NaN != the same NaN - if (toRep(crt_value) != toRep(libm_value)) { + // `!=` operator on fp_t returns false for NaNs so also check if operands are + // both NaN. We don't do `toRepr(crt_value) != toRepr(libm_value)` because + // that treats different representations of NaN as not equivalent. + if (crt_value != libm_value && + !(crt_isnan(crt_value) && crt_isnan(libm_value))) { printf("error: in __compiler_rt_logb(%a [%X]) = %a [%X] != %a [%X]\n", x, toRep(x), crt_value, toRep(crt_value), libm_value, toRep(libm_value)); -------------- next part -------------- A non-text attachment was scrubbed... Name: D67999.223931.patch Type: text/x-patch Size: 1206 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 13:29:37 2019 From: llvm-commits at lists.llvm.org (Roman Lebedev via llvm-commits) Date: Tue, 08 Oct 2019 20:29:37 -0000 Subject: [llvm] r374111 - [CVP][NFC] Revisit sext vs. 
zext test Message-ID: <20191008202937.11DBB86A3F@lists.llvm.org> Author: lebedevri Date: Tue Oct 8 13:29:36 2019 New Revision: 374111 URL: http://llvm.org/viewvc/llvm-project?rev=374111&view=rev Log: [CVP][NFC] Revisit sext vs. zext test Modified: llvm/trunk/test/Transforms/CorrelatedValuePropagation/sext.ll Modified: llvm/trunk/test/Transforms/CorrelatedValuePropagation/sext.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/CorrelatedValuePropagation/sext.ll?rev=374111&r1=374110&r2=374111&view=diff ============================================================================== --- llvm/trunk/test/Transforms/CorrelatedValuePropagation/sext.ll (original) +++ llvm/trunk/test/Transforms/CorrelatedValuePropagation/sext.ll Tue Oct 8 13:29:36 2019 @@ -15,7 +15,7 @@ define void @test1(i32 %n) { ; CHECK-NEXT: br label [[FOR_COND:%.*]] ; CHECK: for.cond: ; CHECK-NEXT: [[A:%.*]] = phi i32 [ [[N:%.*]], [[ENTRY:%.*]] ], [ [[EXT:%.*]], [[FOR_BODY:%.*]] ] -; CHECK-NEXT: [[CMP:%.*]] = icmp sgt i32 [[A]], 1 +; CHECK-NEXT: [[CMP:%.*]] = icmp sgt i32 [[A]], -1 ; CHECK-NEXT: br i1 [[CMP]], label [[FOR_BODY]], label [[FOR_END:%.*]] ; CHECK: for.body: ; CHECK-NEXT: [[EXT_WIDE:%.*]] = sext i32 [[A]] to i64 @@ -30,7 +30,7 @@ entry: for.cond: ; preds = %for.body, %entry %a = phi i32 [ %n, %entry ], [ %ext, %for.body ] - %cmp = icmp sgt i32 %a, 1 + %cmp = icmp sgt i32 %a, -1 br i1 %cmp, label %for.body, label %for.end for.body: ; preds = %for.cond @@ -43,7 +43,7 @@ for.end: ret void } -;; Negative test to show transform doesn't happen unless n > 0. +;; Negative test to show transform doesn't happen unless n >= 0. 
define void @test2(i32 %n) { ; CHECK-LABEL: @test2( ; CHECK-NEXT: entry: @@ -82,7 +82,7 @@ for.end: define void @test3(i32 %n) { ; CHECK-LABEL: @test3( ; CHECK-NEXT: entry: -; CHECK-NEXT: [[CMP:%.*]] = icmp sgt i32 [[N:%.*]], 0 +; CHECK-NEXT: [[CMP:%.*]] = icmp sgt i32 [[N:%.*]], -1 ; CHECK-NEXT: br i1 [[CMP]], label [[BB:%.*]], label [[EXIT:%.*]] ; CHECK: bb: ; CHECK-NEXT: [[EXT_WIDE:%.*]] = sext i32 [[N]] to i64 @@ -93,7 +93,35 @@ define void @test3(i32 %n) { ; CHECK-NEXT: ret void ; entry: - %cmp = icmp sgt i32 %n, 0 + %cmp = icmp sgt i32 %n, -1 + br i1 %cmp, label %bb, label %exit + +bb: + %ext.wide = sext i32 %n to i64 + call void @use64(i64 %ext.wide) + %ext = trunc i64 %ext.wide to i32 + br label %exit + +exit: + ret void +} + +;; Non looping negative test case. +define void @test4(i32 %n) { +; CHECK-LABEL: @test4( +; CHECK-NEXT: entry: +; CHECK-NEXT: [[CMP:%.*]] = icmp sgt i32 [[N:%.*]], -2 +; CHECK-NEXT: br i1 [[CMP]], label [[BB:%.*]], label [[EXIT:%.*]] +; CHECK: bb: +; CHECK-NEXT: [[EXT_WIDE:%.*]] = sext i32 [[N]] to i64 +; CHECK-NEXT: call void @use64(i64 [[EXT_WIDE]]) +; CHECK-NEXT: [[EXT:%.*]] = trunc i64 [[EXT_WIDE]] to i32 +; CHECK-NEXT: br label [[EXIT]] +; CHECK: exit: +; CHECK-NEXT: ret void +; +entry: + %cmp = icmp sgt i32 %n, -2 br i1 %cmp, label %bb, label %exit bb: From llvm-commits at lists.llvm.org Tue Oct 8 13:29:48 2019 From: llvm-commits at lists.llvm.org (Roman Lebedev via llvm-commits) Date: Tue, 08 Oct 2019 20:29:48 -0000 Subject: [llvm] r374112 - [CVP} Replace SExt with ZExt if the input is known-non-negative Message-ID: <20191008202948.888548FCD1@lists.llvm.org> Author: lebedevri Date: Tue Oct 8 13:29:48 2019 New Revision: 374112 URL: http://llvm.org/viewvc/llvm-project?rev=374112&view=rev Log: [CVP} Replace SExt with ZExt if the input is known-non-negative Summary: zero-extension is far more friendly for further analysis. While this doesn't directly help with the shift-by-signext problem, this is not unrelated. 
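The claim that `sext` can be replaced by `zext` for a known-non-negative input can be illustrated outside LLVM with plain integers. The following is only a minimal sketch: `signExtend`, `zeroExtend`, `extensionsAgree`, and `demo` are hypothetical helper names chosen for illustration, not LLVM APIs, and the pass itself establishes non-negativity through `LazyValueInfo` rather than by inspecting concrete values.

```cpp
#include <cassert>
#include <cstdint>

// Sign-extend a 32-bit value to 64 bits (models `sext i32 %x to i64`).
std::int64_t signExtend(std::int32_t X) { return static_cast<std::int64_t>(X); }

// Zero-extend the same 32 bits to 64 bits (models `zext i32 %x to i64`).
std::int64_t zeroExtend(std::int32_t X) {
  return static_cast<std::int64_t>(static_cast<std::uint32_t>(X));
}

// True when both extensions produce the same 64-bit result.
bool extensionsAgree(std::int32_t X) { return signExtend(X) == zeroExtend(X); }

// Small self-check: the extensions agree for non-negative inputs and
// differ once the sign bit is set.
void demo() {
  assert(extensionsAgree(0));
  assert(extensionsAgree(INT32_MAX));
  assert(!extensionsAgree(-1));
}
```

When the sign bit is clear, both extensions fill the upper 32 bits with zeros, so the results match; with the sign bit set, `sext` fills them with ones and the results differ. That is presumably why the fold is guarded by a "known non-negative" query.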
This has the following effect on test-suite (numbers collected after the finish of middle-end module pass manager):

| Statistic | old | new | delta | percent change |
| correlated-value-propagation.NumSExt | 0 | 6026 | 6026 | +100.00% |
| instcount.NumAddInst | 272860 | 271283 | -1577 | -0.58% |
| instcount.NumAllocaInst | 27227 | 27226 | -1 | 0.00% |
| instcount.NumAndInst | 63502 | 63320 | -182 | -0.29% |
| instcount.NumAShrInst | 13498 | 13407 | -91 | -0.67% |
| instcount.NumAtomicCmpXchgInst | 1159 | 1159 | 0 | 0.00% |
| instcount.NumAtomicRMWInst | 5036 | 5036 | 0 | 0.00% |
| instcount.NumBitCastInst | 672482 | 672353 | -129 | -0.02% |
| instcount.NumBrInst | 702768 | 702195 | -573 | -0.08% |
| instcount.NumCallInst | 518285 | 518205 | -80 | -0.02% |
| instcount.NumExtractElementInst | 18481 | 18482 | 1 | 0.01% |
| instcount.NumExtractValueInst | 18290 | 18288 | -2 | -0.01% |
| instcount.NumFAddInst | 139035 | 138963 | -72 | -0.05% |
| instcount.NumFCmpInst | 10358 | 10348 | -10 | -0.10% |
| instcount.NumFDivInst | 30310 | 30302 | -8 | -0.03% |
| instcount.NumFenceInst | 387 | 387 | 0 | 0.00% |
| instcount.NumFMulInst | 93873 | 93806 | -67 | -0.07% |
| instcount.NumFPExtInst | 7148 | 7144 | -4 | -0.06% |
| instcount.NumFPToSIInst | 2823 | 2838 | 15 | 0.53% |
| instcount.NumFPToUIInst | 1251 | 1251 | 0 | 0.00% |
| instcount.NumFPTruncInst | 2195 | 2191 | -4 | -0.18% |
| instcount.NumFSubInst | 92109 | 92103 | -6 | -0.01% |
| instcount.NumGetElementPtrInst | 1221423 | 1219157 | -2266 | -0.19% |
| instcount.NumICmpInst | 479140 | 478929 | -211 | -0.04% |
| instcount.NumIndirectBrInst | 2 | 2 | 0 | 0.00% |
| instcount.NumInsertElementInst | 66089 | 66094 | 5 | 0.01% |
| instcount.NumInsertValueInst | 2032 | 2030 | -2 | -0.10% |
| instcount.NumIntToPtrInst | 19641 | 19641 | 0 | 0.00% |
| instcount.NumInvokeInst | 21789 | 21788 | -1 | 0.00% |
| instcount.NumLandingPadInst | 12051 | 12051 | 0 | 0.00% |
| instcount.NumLoadInst | 880079 | 878673 | -1406 | -0.16% |
| instcount.NumLShrInst | 25919 | 25921 | 2 | 0.01% |
| instcount.NumMulInst | 42416 | 42417 | 1 | 0.00% |
| instcount.NumOrInst | 100826 | 100576 | -250 | -0.25% |
| instcount.NumPHIInst | 315118 | 314092 | -1026 | -0.33% |
| instcount.NumPtrToIntInst | 15933 | 15939 | 6 | 0.04% |
| instcount.NumResumeInst | 2156 | 2156 | 0 | 0.00% |
| instcount.NumRetInst | 84485 | 84484 | -1 | 0.00% |
| instcount.NumSDivInst | 8599 | 8597 | -2 | -0.02% |
| instcount.NumSelectInst | 45577 | 45913 | 336 | 0.74% |
| instcount.NumSExtInst | 84026 | 78278 | -5748 | -6.84% |
| instcount.NumShlInst | 39796 | 39726 | -70 | -0.18% |
| instcount.NumShuffleVectorInst | 100272 | 100292 | 20 | 0.02% |
| instcount.NumSIToFPInst | 29131 | 29113 | -18 | -0.06% |
| instcount.NumSRemInst | 1543 | 1543 | 0 | 0.00% |
| instcount.NumStoreInst | 805394 | 804351 | -1043 | -0.13% |
| instcount.NumSubInst | 61337 | 61414 | 77 | 0.13% |
| instcount.NumSwitchInst | 8527 | 8524 | -3 | -0.04% |
| instcount.NumTruncInst | 60523 | 60484 | -39 | -0.06% |
| instcount.NumUDivInst | 2381 | 2381 | 0 | 0.00% |
| instcount.NumUIToFPInst | 5549 | 5549 | 0 | 0.00% |
| instcount.NumUnreachableInst | 9855 | 9855 | 0 | 0.00% |
| instcount.NumURemInst | 1305 | 1305 | 0 | 0.00% |
| instcount.NumXorInst | 10230 | 10081 | -149 | -1.46% |
| instcount.NumZExtInst | 60353 | 66840 | 6487 | 10.75% |
| instcount.TotalBlocks | 829582 | 829004 | -578 | -0.07% |
| instcount.TotalFuncs | 83818 | 83817 | -1 | 0.00% |
| instcount.TotalInsts | 7316574 | 7308483 | -8091 | -0.11% |

TLDR: we produce -0.11% less instructions, -6.84% less `sext`, +10.75% more `zext`. To be noted, clearly, not all new `zext`'s are produced by this fold.
(And now i guess it might have been interesting to measure this for D68103 :S) Reviewers: nikic, spatel, reames, dberlin Reviewed By: nikic Subscribers: hiraditya, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68654 Modified: llvm/trunk/lib/Transforms/Scalar/CorrelatedValuePropagation.cpp llvm/trunk/test/Transforms/CorrelatedValuePropagation/sext.ll Modified: llvm/trunk/lib/Transforms/Scalar/CorrelatedValuePropagation.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/CorrelatedValuePropagation.cpp?rev=374112&r1=374111&r2=374112&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Scalar/CorrelatedValuePropagation.cpp (original) +++ llvm/trunk/lib/Transforms/Scalar/CorrelatedValuePropagation.cpp Tue Oct 8 13:29:48 2019 @@ -62,6 +62,7 @@ STATISTIC(NumSDivs, "Number of sdiv STATISTIC(NumUDivs, "Number of udivs whose width was decreased"); STATISTIC(NumAShrs, "Number of ashr converted to lshr"); STATISTIC(NumSRems, "Number of srem converted to urem"); +STATISTIC(NumSExt, "Number of sext converted to zext"); STATISTIC(NumOverflows, "Number of overflow checks removed"); STATISTIC(NumSaturating, "Number of saturating arithmetics converted to normal arithmetics"); @@ -637,6 +638,27 @@ static bool processAShr(BinaryOperator * return true; } +static bool processSExt(SExtInst *SDI, LazyValueInfo *LVI) { + if (SDI->getType()->isVectorTy()) + return false; + + Value *Base = SDI->getOperand(0); + + Constant *Zero = ConstantInt::get(Base->getType(), 0); + if (LVI->getPredicateAt(ICmpInst::ICMP_SGE, Base, Zero, SDI) != + LazyValueInfo::True) + return false; + + ++NumSExt; + auto *ZExt = + CastInst::CreateZExtOrBitCast(Base, SDI->getType(), SDI->getName(), SDI); + ZExt->setDebugLoc(SDI->getDebugLoc()); + SDI->replaceAllUsesWith(ZExt); + SDI->eraseFromParent(); + + return true; +} + static bool processBinOp(BinaryOperator *BinOp, LazyValueInfo *LVI) { using 
OBO = OverflowingBinaryOperator; @@ -745,6 +767,9 @@ static bool runImpl(Function &F, LazyVal case Instruction::AShr: BBChanged |= processAShr(cast(II), LVI); break; + case Instruction::SExt: + BBChanged |= processSExt(cast(II), LVI); + break; case Instruction::Add: case Instruction::Sub: BBChanged |= processBinOp(cast(II), LVI); Modified: llvm/trunk/test/Transforms/CorrelatedValuePropagation/sext.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/CorrelatedValuePropagation/sext.ll?rev=374112&r1=374111&r2=374112&view=diff ============================================================================== --- llvm/trunk/test/Transforms/CorrelatedValuePropagation/sext.ll (original) +++ llvm/trunk/test/Transforms/CorrelatedValuePropagation/sext.ll Tue Oct 8 13:29:48 2019 @@ -18,9 +18,9 @@ define void @test1(i32 %n) { ; CHECK-NEXT: [[CMP:%.*]] = icmp sgt i32 [[A]], -1 ; CHECK-NEXT: br i1 [[CMP]], label [[FOR_BODY]], label [[FOR_END:%.*]] ; CHECK: for.body: -; CHECK-NEXT: [[EXT_WIDE:%.*]] = sext i32 [[A]] to i64 -; CHECK-NEXT: call void @use64(i64 [[EXT_WIDE]]) -; CHECK-NEXT: [[EXT]] = trunc i64 [[EXT_WIDE]] to i32 +; CHECK-NEXT: [[EXT_WIDE1:%.*]] = zext i32 [[A]] to i64 +; CHECK-NEXT: call void @use64(i64 [[EXT_WIDE1]]) +; CHECK-NEXT: [[EXT]] = trunc i64 [[EXT_WIDE1]] to i32 ; CHECK-NEXT: br label [[FOR_COND]] ; CHECK: for.end: ; CHECK-NEXT: ret void @@ -85,9 +85,9 @@ define void @test3(i32 %n) { ; CHECK-NEXT: [[CMP:%.*]] = icmp sgt i32 [[N:%.*]], -1 ; CHECK-NEXT: br i1 [[CMP]], label [[BB:%.*]], label [[EXIT:%.*]] ; CHECK: bb: -; CHECK-NEXT: [[EXT_WIDE:%.*]] = sext i32 [[N]] to i64 -; CHECK-NEXT: call void @use64(i64 [[EXT_WIDE]]) -; CHECK-NEXT: [[EXT:%.*]] = trunc i64 [[EXT_WIDE]] to i32 +; CHECK-NEXT: [[EXT_WIDE1:%.*]] = zext i32 [[N]] to i64 +; CHECK-NEXT: call void @use64(i64 [[EXT_WIDE1]]) +; CHECK-NEXT: [[EXT:%.*]] = trunc i64 [[EXT_WIDE1]] to i32 ; CHECK-NEXT: br label [[EXIT]] ; CHECK: exit: ; CHECK-NEXT: ret void From llvm-commits at 
lists.llvm.org Tue Oct 8 13:31:15 2019 From: llvm-commits at lists.llvm.org (Thomas Lively via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 20:31:15 +0000 (UTC) Subject: [PATCH] D68527: [WebAssembly] v8x16.swizzle and rewrite BUILD_VECTOR lowering In-Reply-To: References: Message-ID: <371f1bcde3e8223a5bdfed38a858c759@localhost.localdomain> tlively added a comment. In D68527#1699832 , @aheejin wrote: > - I remember before we had a somewhat complicated logic to calculate the number of bytes of total instructions of each case of the case we use `v128.const` and vs. when we use splats. Don't we need that anymore? Can we make the decision solely based the number of swizzles / consts / and splats? Minimizing code size is not as important for SIMD as maximizing performance, so I dumped that complicated logic. I am led to believe that `v128.const` will be faster than splats once it is implemented, but we have no way to measure yet. We can make the decision based on whatever heuristic we want, but minimizing number of instructions seems like a good metric for now until we can run experiments to tune the selection algorithm. > - Is the performance of `v128.const` better than splats? How is performance of swizzles compared to `v128.const`? Yes, I believe `v128.const` will be faster than splats. I don't know how swizzles and `v128.const` compare, but I do know that emulating swizzles requires a lot of instructions per lane but emulating a v128.const only requires a single `replace_lane` and constant per lane. So it makes sense to prefer swizzles over `v128.const`s for now. ================ Comment at: llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp:1350 } + }; + ---------------- aheejin wrote: > Would using `count_if` in place of `find_if` be simpler? No, I need to get the iterator to the proper entry since I'm using a vector as an associative array here. 
If I used `count_if` and it returned 1, I would still need to find the entry to increment the count, so `find_if` is simpler because it gives me that entry directly. ================ Comment at: llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp:1388 + std::forward_as_tuple(std::tie(SwizzleSrc, SwizzleIndices), + NumSwizzleLanes) = GetMostCommon(SwizzleCounts); + ---------------- aheejin wrote: > Is using `forward_as_tuple` any different from using `tie` again in this case, given that this is not passed as an argument to a function? It turns out you can't nest `std::tie`. I have no idea why, but I got this solution from https://stackoverflow.com/questions/21298732/can-we-do-deep-tie-with-a-c1y-stdtie-like-function. ================ Comment at: llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp:1426 + (SplattedLoad = dyn_cast(SplatValue)) && + SplattedLoad->getMemoryVT() == VecT.getVectorElementType()) { + Result = DAG.getNode(WebAssemblyISD::LOAD_SPLAT, DL, VecT, SplatValue); ---------------- aheejin wrote: > It's not in this CL, but is there a case this condition is not satisfied? Yes, for example when doing a sign extending load of an i8 to an i32 then splatting that i32. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68527/new/ https://reviews.llvm.org/D68527 From llvm-commits at lists.llvm.org Tue Oct 8 13:35:10 2019 From: llvm-commits at lists.llvm.org (Sanjay Patel via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 20:35:10 +0000 (UTC) Subject: [PATCH] D61675: [WIP] Update IRBuilder::CreateFNeg(...) to return a UnaryOperator In-Reply-To: References: Message-ID: spatel accepted this revision. spatel added a comment. Still LGTM. 
Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D61675/new/ https://reviews.llvm.org/D61675 From llvm-commits at lists.llvm.org Tue Oct 8 13:36:17 2019 From: llvm-commits at lists.llvm.org (Roman Lebedev via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 20:36:17 +0000 (UTC) Subject: [PATCH] D68654: [CVP} Replace SExt with ZExt if the input is known-non-negative In-Reply-To: References: Message-ID: This revision was automatically updated to reflect the committed changes. Closed by commit rG354ba6985ca0: [CVP} Replace SExt with ZExt if the input is known-non-negative (authored by lebedev.ri). Changed prior to commit: https://reviews.llvm.org/D68654?vs=223904&id=223933#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68654/new/ https://reviews.llvm.org/D68654 Files: llvm/lib/Transforms/Scalar/CorrelatedValuePropagation.cpp llvm/test/Transforms/CorrelatedValuePropagation/sext.ll Index: llvm/test/Transforms/CorrelatedValuePropagation/sext.ll =================================================================== --- llvm/test/Transforms/CorrelatedValuePropagation/sext.ll +++ llvm/test/Transforms/CorrelatedValuePropagation/sext.ll @@ -18,9 +18,9 @@ ; CHECK-NEXT: [[CMP:%.*]] = icmp sgt i32 [[A]], -1 ; CHECK-NEXT: br i1 [[CMP]], label [[FOR_BODY]], label [[FOR_END:%.*]] ; CHECK: for.body: -; CHECK-NEXT: [[EXT_WIDE:%.*]] = sext i32 [[A]] to i64 -; CHECK-NEXT: call void @use64(i64 [[EXT_WIDE]]) -; CHECK-NEXT: [[EXT]] = trunc i64 [[EXT_WIDE]] to i32 +; CHECK-NEXT: [[EXT_WIDE1:%.*]] = zext i32 [[A]] to i64 +; CHECK-NEXT: call void @use64(i64 [[EXT_WIDE1]]) +; CHECK-NEXT: [[EXT]] = trunc i64 [[EXT_WIDE1]] to i32 ; CHECK-NEXT: br label [[FOR_COND]] ; CHECK: for.end: ; CHECK-NEXT: ret void @@ -85,9 +85,9 @@ ; CHECK-NEXT: [[CMP:%.*]] = icmp sgt i32 [[N:%.*]], -1 ; CHECK-NEXT: br i1 [[CMP]], label [[BB:%.*]], label [[EXIT:%.*]] ; CHECK: bb: -; CHECK-NEXT: [[EXT_WIDE:%.*]] = sext i32 [[N]] to i64 -; CHECK-NEXT: 
call void @use64(i64 [[EXT_WIDE]]) -; CHECK-NEXT: [[EXT:%.*]] = trunc i64 [[EXT_WIDE]] to i32 +; CHECK-NEXT: [[EXT_WIDE1:%.*]] = zext i32 [[N]] to i64 +; CHECK-NEXT: call void @use64(i64 [[EXT_WIDE1]]) +; CHECK-NEXT: [[EXT:%.*]] = trunc i64 [[EXT_WIDE1]] to i32 ; CHECK-NEXT: br label [[EXIT]] ; CHECK: exit: ; CHECK-NEXT: ret void Index: llvm/lib/Transforms/Scalar/CorrelatedValuePropagation.cpp =================================================================== --- llvm/lib/Transforms/Scalar/CorrelatedValuePropagation.cpp +++ llvm/lib/Transforms/Scalar/CorrelatedValuePropagation.cpp @@ -62,6 +62,7 @@ STATISTIC(NumUDivs, "Number of udivs whose width was decreased"); STATISTIC(NumAShrs, "Number of ashr converted to lshr"); STATISTIC(NumSRems, "Number of srem converted to urem"); +STATISTIC(NumSExt, "Number of sext converted to zext"); STATISTIC(NumOverflows, "Number of overflow checks removed"); STATISTIC(NumSaturating, "Number of saturating arithmetics converted to normal arithmetics"); @@ -637,6 +638,27 @@ return true; } +static bool processSExt(SExtInst *SDI, LazyValueInfo *LVI) { + if (SDI->getType()->isVectorTy()) + return false; + + Value *Base = SDI->getOperand(0); + + Constant *Zero = ConstantInt::get(Base->getType(), 0); + if (LVI->getPredicateAt(ICmpInst::ICMP_SGE, Base, Zero, SDI) != + LazyValueInfo::True) + return false; + + ++NumSExt; + auto *ZExt = + CastInst::CreateZExtOrBitCast(Base, SDI->getType(), SDI->getName(), SDI); + ZExt->setDebugLoc(SDI->getDebugLoc()); + SDI->replaceAllUsesWith(ZExt); + SDI->eraseFromParent(); + + return true; +} + static bool processBinOp(BinaryOperator *BinOp, LazyValueInfo *LVI) { using OBO = OverflowingBinaryOperator; @@ -745,6 +767,9 @@ case Instruction::AShr: BBChanged |= processAShr(cast(II), LVI); break; + case Instruction::SExt: + BBChanged |= processSExt(cast(II), LVI); + break; case Instruction::Add: case Instruction::Sub: BBChanged |= processBinOp(cast(II), LVI); -------------- next part -------------- A non-text 
attachment was scrubbed... Name: D68654.223933.patch Type: text/x-patch Size: 3252 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 08:07:41 2019 From: llvm-commits at lists.llvm.org (James Henderson via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 15:07:41 +0000 (UTC) Subject: [PATCH] D68636: [llvm-readobj] - Refine the LLVM-style output to be consistent. In-Reply-To: References: Message-ID: <596d5324b32f13994bd733b699b64ae9@localhost.localdomain> jhenderson added a comment. In D68636#1699484 , @grimar wrote: > In D68636#1699479 , @jhenderson wrote: > > > Did you remember to run LLD tests? I'd expect to see changes there... > > > Sure. That will be a separate change, once we agree with LLVM side. Just to clarify, are you still using the SVN layout rather than the monorepo? Using the monorepo would allow having all the changes in the same patch. Aside from the discussion on the titles of Version data, this looks good to me. I don't have a strong preference whichever way is used. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68636/new/ https://reviews.llvm.org/D68636 From llvm-commits at lists.llvm.org Tue Oct 8 08:16:52 2019 From: llvm-commits at lists.llvm.org (=?utf-8?q?Nicolai_H=C3=A4hnle_via_Phabricator?= via llvm-commits) Date: Tue, 08 Oct 2019 15:16:52 +0000 (UTC) Subject: [PATCH] D68607: [AMDGPU] Disable unused gfx10 dpp instructions In-Reply-To: References: Message-ID: nhaehnle accepted this revision. nhaehnle added a comment. This revision is now accepted and ready to land. LGTM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68607/new/ https://reviews.llvm.org/D68607 From llvm-commits at lists.llvm.org Tue Oct 8 08:26:07 2019 From: llvm-commits at lists.llvm.org (George Rimar via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 15:26:07 +0000 (UTC) Subject: [PATCH] D68636: [llvm-readobj] - Refine the LLVM-style output to be consistent. 
In-Reply-To: References: Message-ID: <9df2cfcd01e9d0df90a833fb59413765@localhost.localdomain> grimar added a comment. In D68636#1699712 , @jhenderson wrote: > In D68636#1699484 , @grimar wrote: > > > In D68636#1699479 , @jhenderson wrote: > > > > > Did you remember to run LLD tests? I'd expect to see changes there... > > > > > > Sure. That will be a separate change, once we agree with LLVM side. > > > Just to clarify, are you still using the SVN layout rather than the monorepo? Using the monorepo would allow having all the changes in the same patch. Yes, I do not use the monorepo yet. I am going to switch soon, as later this month svn will become read-only. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68636/new/ https://reviews.llvm.org/D68636 From llvm-commits at lists.llvm.org Tue Oct 8 08:45:33 2019 From: llvm-commits at lists.llvm.org (James Clarke via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 15:45:33 +0000 (UTC) Subject: [PATCH] D68559: [RISCV] Support fast calling convention In-Reply-To: References: Message-ID: jrtc27 added a comment. I wonder whether it would be nicer to add `CC_RISCV_FastCC` as a function, much like `CC_RISCV`, instead, to match the current code style. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68559/new/ https://reviews.llvm.org/D68559 From llvm-commits at lists.llvm.org Tue Oct 8 08:45:33 2019 From: llvm-commits at lists.llvm.org (Nikola Prica via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 15:45:33 +0000 (UTC) Subject: [PATCH] D66955: [DebugInfo][If-Converter] Update call site info during the optimization In-Reply-To: References: Message-ID: This revision was automatically updated to reflect the committed changes. Closed by commit rG98603a815308: [DebugInfo][If-Converter] Update call site info during the optimization (authored by NikolaPrica). Herald added a subscriber: hiraditya. Herald added a project: LLVM.
Changed prior to commit: https://reviews.llvm.org/D66955?vs=220641&id=223878#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D66955/new/ https://reviews.llvm.org/D66955 Files: llvm/include/llvm/CodeGen/MachineFunction.h llvm/lib/CodeGen/BranchFolding.cpp llvm/lib/CodeGen/IfConversion.cpp llvm/lib/CodeGen/InlineSpiller.cpp llvm/lib/CodeGen/LiveRangeEdit.cpp llvm/lib/CodeGen/MachineFunction.cpp llvm/lib/CodeGen/MachineOutliner.cpp llvm/lib/CodeGen/PeepholeOptimizer.cpp llvm/lib/CodeGen/TargetInstrInfo.cpp llvm/lib/CodeGen/UnreachableBlockElim.cpp llvm/lib/CodeGen/XRayInstrumentation.cpp llvm/lib/Target/ARM/ARMExpandPseudoInsts.cpp llvm/lib/Target/X86/X86ExpandPseudo.cpp llvm/test/CodeGen/ARM/smml.ll llvm/test/DebugInfo/MIR/ARM/if-coverter-call-site-info.mir -------------- next part -------------- A non-text attachment was scrubbed... Name: D66955.223878.patch Type: text/x-patch Size: 19115 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 08:54:19 2019 From: llvm-commits at lists.llvm.org (Steven Wan via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 15:54:19 +0000 (UTC) Subject: [PATCH] D67148: [LoopVectorize][PowerPC] Estimate int and float register pressure separately in loop-vectorize In-Reply-To: References: Message-ID: <5603b5f0349829f76c873b776c1975f6@localhost.localdomain> stevewan added a comment. Hi, This is breaking our internal benchmarks, could you please take a look? Thanks! 
Steven Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67148/new/ https://reviews.llvm.org/D67148 From llvm-commits at lists.llvm.org Tue Oct 8 09:57:30 2019 From: llvm-commits at lists.llvm.org (Stanislav Mekhanoshin via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 16:57:30 +0000 (UTC) Subject: [PATCH] D68607: [AMDGPU] Disable unused gfx10 dpp instructions In-Reply-To: References: Message-ID: <980d4dff68b34d41d3c13eb5091c895a@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rG8f002193bf49: [AMDGPU] Disable unused gfx10 dpp instructions (authored by rampitec). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68607/new/ https://reviews.llvm.org/D68607 Files: llvm/lib/Target/AMDGPU/VOP1Instructions.td llvm/lib/Target/AMDGPU/VOP2Instructions.td Index: llvm/lib/Target/AMDGPU/VOP2Instructions.td =================================================================== --- llvm/lib/Target/AMDGPU/VOP2Instructions.td +++ llvm/lib/Target/AMDGPU/VOP2Instructions.td @@ -939,11 +939,13 @@ } } multiclass VOP2_Real_dpp_gfx10 op> { + foreach _ = BoolToList(NAME#"_e32").Pfl.HasExtDPP>.ret in def _dpp_gfx10 : VOP2_DPP16(NAME#"_e32")> { let DecoderNamespace = "SDWA10"; } } multiclass VOP2_Real_dpp8_gfx10 op> { + foreach _ = BoolToList(NAME#"_e32").Pfl.HasExtDPP>.ret in def _dpp8_gfx10 : VOP2_DPP8(NAME#"_e32")> { let DecoderNamespace = "DPP8"; } @@ -981,6 +983,7 @@ } multiclass VOP2_Real_dpp_gfx10_with_name op, string opName, string asmName> { + foreach _ = BoolToList(opName#"_e32").Pfl.HasExtDPP>.ret in def _dpp_gfx10 : VOP2_DPP16(opName#"_e32")> { VOP2_Pseudo ps = !cast(opName#"_e32"); let AsmString = asmName # ps.Pfl.AsmDPP16; @@ -988,6 +991,7 @@ } multiclass VOP2_Real_dpp8_gfx10_with_name op, string opName, string asmName> { + foreach _ = BoolToList(opName#"_e32").Pfl.HasExtDPP>.ret in def _dpp8_gfx10 : VOP2_DPP8(opName#"_e32")> { 
VOP2_Pseudo ps = !cast(opName#"_e32"); let AsmString = asmName # ps.Pfl.AsmDPP8; @@ -1018,12 +1022,14 @@ let AsmString = asmName # !subst(", vcc", "", Ps.AsmOperands); let DecoderNamespace = "SDWA10"; } + foreach _ = BoolToList(opName#"_e32").Pfl.HasExtDPP>.ret in def _dpp_gfx10 : VOP2_DPP16(opName#"_e32"), asmName> { string AsmDPP = !cast(opName#"_e32").Pfl.AsmDPP16; let AsmString = asmName # !subst(", vcc", "", AsmDPP); let DecoderNamespace = "SDWA10"; } + foreach _ = BoolToList(opName#"_e32").Pfl.HasExtDPP>.ret in def _dpp8_gfx10 : VOP2_DPP8(opName#"_e32"), asmName> { string AsmDPP8 = !cast(opName#"_e32").Pfl.AsmDPP8; Index: llvm/lib/Target/AMDGPU/VOP1Instructions.td =================================================================== --- llvm/lib/Target/AMDGPU/VOP1Instructions.td +++ llvm/lib/Target/AMDGPU/VOP1Instructions.td @@ -506,11 +506,13 @@ } } multiclass VOP1_Real_dpp_gfx10 op> { + foreach _ = BoolToList(NAME#"_e32").Pfl.HasExtDPP>.ret in def _dpp_gfx10 : VOP1_DPP16(NAME#"_e32")> { let DecoderNamespace = "SDWA10"; } } multiclass VOP1_Real_dpp8_gfx10 op> { + foreach _ = BoolToList(NAME#"_e32").Pfl.HasExtDPP>.ret in def _dpp8_gfx10 : VOP1_DPP8(NAME#"_e32")> { let DecoderNamespace = "DPP8"; } -------------- next part -------------- A non-text attachment was scrubbed... Name: D68607.223893.patch Type: text/x-patch Size: 3141 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 11:25:52 2019 From: llvm-commits at lists.llvm.org (Shiva Chen via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 18:25:52 +0000 (UTC) Subject: [PATCH] D68559: [RISCV] Support fast calling convention In-Reply-To: References: Message-ID: shiva0217 updated this revision to Diff 223906. shiva0217 added a comment. Update patch to address the feedbacks Hi @lenary and @jrtc27, thanks for the comments. 
Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68559/new/

https://reviews.llvm.org/D68559

Files:
  llvm/lib/Target/RISCV/RISCVISelLowering.cpp
  llvm/test/CodeGen/RISCV/fastcc-float.ll
  llvm/test/CodeGen/RISCV/fastcc-int.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68559.223906.patch
Type: text/x-patch
Size: 9227 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Tue Oct 8 11:26:15 2019
From: llvm-commits at lists.llvm.org (Shiva Chen via Phabricator via llvm-commits)
Date: Tue, 08 Oct 2019 18:26:15 +0000 (UTC)
Subject: [PATCH] D68559: [RISCV] Support fast calling convention
In-Reply-To:
References:
Message-ID:

shiva0217 added a comment.

In D68559#1699606, @lenary wrote:

> I like this change.
>
> `fastcc` is LLVM-internal only, right? Checking that we don't have to care about splitting operands that don't fit into registers, or any other psABI details, right?

Yes, to my understanding, `fastcc` doesn't need to care about psABI details.

In D68559#1699842, @jrtc27 wrote:

> I wonder whether it would be nicer to add `CC_RISCV_FastCC` as a function, much like `CC_RISCV`, instead, to match the current code style.

OK, the code style has been synced, and it is also easier to extend when we support RV32E in the future. Thanks.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68559/new/

https://reviews.llvm.org/D68559

From llvm-commits at lists.llvm.org Tue Oct 8 13:47:01 2019
From: llvm-commits at lists.llvm.org (Joel E. Denny via Phabricator via llvm-commits)
Date: Tue, 08 Oct 2019 20:47:01 +0000 (UTC)
Subject: [PATCH] D67643: [lit] Extend internal diff to support `-` argument
In-Reply-To:
References:
Message-ID:

jdenny marked an inline comment as done.
jdenny added inline comments.

================
Comment at: llvm/utils/lit/lit/builtin_commands/diff.py:27-29
+  # FIXME: How can we restart stdin if the encoding is wrong? How can we
+  # read stdin with a different encoding or in binary mode in a way that's
+  # compatible with python 2 and 3?
----------------
jdenny wrote:
> rnk wrote:
> > This function seems fragile to me. It reads each file three times:
> > 1. without an encoding (system default?), may fail
> > 2. with a utf-8 encoding, may fail
> > 3. inside compareTwo*Files, reads them yet again
> >
> > I think the best way to be portable between Python 2 & 3 will probably be to work in bytes as much as possible.
> >
> > I think you can use this pattern to get a new file descriptor for stdin that reads bytes:
> > ```
> > fileno = sys.stdin.fileno()
> > if fileno is not None:
> >   new_stdin = os.fdopen(os.dup(fileno), 'rb') # read in binary
> > ```
> >
> > Here's how I would do it:
> > 1. read both inputs completely in binary (can't fail)
> > 2. try decoding the entire file in a few encodings (try default, try utf-8, if that fails, stick with bytes and diff_bytes)
> > 3. split file on u'\n' or b'\n' as appropriate, perhaps stripping trailing '\r' if present, depending on flags
> >
> > Does that seem reasonable?
> I don't know the motivation for the original implementation. Your plan sounds better. Sorry, but I don't have time to implement and test this right now. Hopefully soon. Thanks for the review.

> I think you can use this pattern to get a new file descriptor for stdin that reads bytes:
>
> ```
> fileno = sys.stdin.fileno()
> if fileno is not None:
> ```

I don't see documentation saying that `sys.stdin.fileno` might return `None`. Is that possible?

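A minimal sketch of the pattern quoted above, for illustration only; it is not code from the patch, and `reopen_as_binary` is a hypothetical helper name. It assumes that a stream without a real descriptor signals this by raising from `fileno()` (the `io` documentation describes an `OSError` in that case) rather than by returning `None`, so the guard is a `try`/`except` instead of an `is not None` check:

```python
import os


def reopen_as_binary(stream):
    """Return a binary-mode reader sharing `stream`'s descriptor, or None.

    Hypothetical helper for illustration; not part of lit.
    """
    try:
        fileno = stream.fileno()
    except (AttributeError, ValueError, OSError):
        # No usable descriptor: the stream is None, closed, or an
        # in-memory object such as io.StringIO.
        return None
    # Duplicate the descriptor so that closing the returned object does
    # not close the descriptor owned by `stream`.
    return os.fdopen(os.dup(fileno), 'rb')
```

In lit the argument would presumably be `sys.stdin`. Because the descriptor is duplicated, closing the binary reader leaves the original stream usable.
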
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67643/new/ https://reviews.llvm.org/D67643 From llvm-commits at lists.llvm.org Tue Oct 8 13:50:46 2019 From: llvm-commits at lists.llvm.org (Vitaly Buka via llvm-commits) Date: Tue, 08 Oct 2019 20:50:46 -0000 Subject: [compiler-rt] r374115 - [sanitizer] Fix crypt.cpp test on Darwin Message-ID: <20191008205046.E11908FA1A@lists.llvm.org> Author: vitalybuka Date: Tue Oct 8 13:50:46 2019 New Revision: 374115 URL: http://llvm.org/viewvc/llvm-project?rev=374115&view=rev Log: [sanitizer] Fix crypt.cpp test on Darwin Modified: compiler-rt/trunk/test/sanitizer_common/TestCases/Posix/crypt.cpp compiler-rt/trunk/test/sanitizer_common/lit.common.cfg.py Modified: compiler-rt/trunk/test/sanitizer_common/TestCases/Posix/crypt.cpp URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/test/sanitizer_common/TestCases/Posix/crypt.cpp?rev=374115&r1=374114&r2=374115&view=diff ============================================================================== --- compiler-rt/trunk/test/sanitizer_common/TestCases/Posix/crypt.cpp (original) +++ compiler-rt/trunk/test/sanitizer_common/TestCases/Posix/crypt.cpp Tue Oct 8 13:50:46 2019 @@ -1,4 +1,4 @@ -// RUN: %clangxx -O0 -g %s -o %t -lcrypt && %run %t +// RUN: %clangxx -O0 -g %s -o %t && %run %t // crypt is missing from Android. 
// UNSUPPORTED: android Modified: compiler-rt/trunk/test/sanitizer_common/lit.common.cfg.py URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/test/sanitizer_common/lit.common.cfg.py?rev=374115&r1=374114&r2=374115&view=diff ============================================================================== --- compiler-rt/trunk/test/sanitizer_common/lit.common.cfg.py (original) +++ compiler-rt/trunk/test/sanitizer_common/lit.common.cfg.py Tue Oct 8 13:50:46 2019 @@ -46,10 +46,13 @@ if default_tool_options_str: config.environment[tool_options] = default_tool_options_str default_tool_options_str += ':' +extra_link_flags = [] + if config.host_os in ['Linux']: - extra_link_flags = ["-ldl"] -else: - extra_link_flags = [] + extra_link_flags += ["-ldl"] + +if config.host_os in ['Linux', 'NetBSD', 'FreeBSD']: + extra_link_flags += ["-lcrypt"] clang_cflags = config.debug_info_flags + tool_cflags + [config.target_cflags] clang_cflags += extra_link_flags From llvm-commits at lists.llvm.org Tue Oct 8 13:48:35 2019 From: llvm-commits at lists.llvm.org (Vitaly Buka via llvm-commits) Date: Tue, 8 Oct 2019 13:48:35 -0700 Subject: [compiler-rt] r373993 - [msan] Add interceptors: crypt, crypt_r. In-Reply-To: <7F17858A-231D-4A25-8EEF-B715E69D97FA@apple.com> References: <20191008000030.E4A6B81D62@lists.llvm.org> <7F17858A-231D-4A25-8EEF-B715E69D97FA@apple.com> Message-ID: r374115 On Tue, Oct 8, 2019 at 12:29 PM Azhar Mohammed via llvm-commits < llvm-commits at lists.llvm.org> wrote: > This is failing on Darwin, can you please take a look. 
> http://green.lab.llvm.org/green/job/clang-stage1-RA/2649/consoleFull > > > ******************** TEST 'SanitizerCommon-asan-x86_64-Darwin :: > Posix/crypt.cpp' FAILED ******************** Script: -- : 'RUN: at line > 1'; > /Users/buildslave/jenkins/workspace/clang-stage1-RA/clang-build/./bin/clang > --driver-mode=g++ -gline-tables-only -fsanitize=address -arch x86_64 > -stdlib=libc++ -mmacosx-version-min=10.9 -isysroot > /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.14.sdk > -O0 -g > /Users/buildslave/jenkins/workspace/clang-stage1-RA/llvm-project/compiler-rt/test/sanitizer_common/TestCases/Posix/crypt.cpp > -o > /Users/buildslave/jenkins/workspace/clang-stage1-RA/clang-build/tools/clang/runtime/compiler-rt-bins/test/sanitizer_common/asan-x86_64-Darwin/Posix/Output/crypt.cpp.tmp > -lcrypt && > /Users/buildslave/jenkins/workspace/clang-stage1-RA/clang-build/tools/clang/runtime/compiler-rt-bins/test/sanitizer_common/asan-x86_64-Darwin/Posix/Output/crypt.cpp.tmp > -- Exit Code: 1 > > Command Output (stderr): -- ld: library not found for -lcrypt clang-10: > error: linker command failed with exit code 1 (use -v to see invocation) > > > > On Oct 7, 2019, at 5:00 PM, Evgeniy Stepanov via llvm-commits < > llvm-commits at lists.llvm.org> wrote: > > Author: eugenis > Date: Mon Oct 7 17:00:30 2019 > New Revision: 373993 > > URL: http://llvm.org/viewvc/llvm-project?rev=373993&view=rev > Log: > [msan] Add interceptors: crypt, crypt_r. 
> > Reviewers: vitalybuka > > Subscribers: srhines, #sanitizers, llvm-commits > > Tags: #sanitizers, #llvm > > Differential Revision: https://reviews.llvm.org/D68431 > > Added: > compiler-rt/trunk/test/sanitizer_common/TestCases/Linux/crypt_r.cpp > compiler-rt/trunk/test/sanitizer_common/TestCases/Posix/crypt.cpp > Modified: > compiler-rt/trunk/lib/sanitizer_common/sanitizer_common_interceptors.inc > compiler-rt/trunk/lib/sanitizer_common/sanitizer_platform_interceptors.h > > compiler-rt/trunk/lib/sanitizer_common/sanitizer_platform_limits_posix.cpp > compiler-rt/trunk/lib/sanitizer_common/sanitizer_platform_limits_posix.h > > Modified: > compiler-rt/trunk/lib/sanitizer_common/sanitizer_common_interceptors.inc > URL: > http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/sanitizer_common/sanitizer_common_interceptors.inc?rev=373993&r1=373992&r2=373993&view=diff > > ============================================================================== > --- > compiler-rt/trunk/lib/sanitizer_common/sanitizer_common_interceptors.inc > (original) > +++ > compiler-rt/trunk/lib/sanitizer_common/sanitizer_common_interceptors.inc > Mon Oct 7 17:00:30 2019 > @@ -9573,6 +9573,41 @@ INTERCEPTOR(SSIZE_T, getrandom, void *bu > #define INIT_GETRANDOM > #endif > > +#if SANITIZER_INTERCEPT_CRYPT > +INTERCEPTOR(char *, crypt, char *key, char *salt) { > + void *ctx; > + COMMON_INTERCEPTOR_ENTER(ctx, crypt, key, salt); > + COMMON_INTERCEPTOR_READ_RANGE(ctx, key, internal_strlen(key) + 1); > + COMMON_INTERCEPTOR_READ_RANGE(ctx, salt, internal_strlen(salt) + 1); > + char *res = REAL(crypt)(key, salt); > + if (res != nullptr) > + COMMON_INTERCEPTOR_INITIALIZE_RANGE(res, internal_strlen(res) + 1); > + return res; > +} > +#define INIT_CRYPT COMMON_INTERCEPT_FUNCTION(crypt); > +#else > +#define INIT_CRYPT > +#endif > + > +#if SANITIZER_INTERCEPT_CRYPT_R > +INTERCEPTOR(char *, crypt_r, char *key, char *salt, void *data) { > + void *ctx; > + COMMON_INTERCEPTOR_ENTER(ctx, crypt_r, key, salt, 
data); > + COMMON_INTERCEPTOR_READ_RANGE(ctx, key, internal_strlen(key) + 1); > + COMMON_INTERCEPTOR_READ_RANGE(ctx, salt, internal_strlen(salt) + 1); > + char *res = REAL(crypt_r)(key, salt, data); > + if (res != nullptr) { > + COMMON_INTERCEPTOR_WRITE_RANGE(ctx, data, > + __sanitizer::struct_crypt_data_sz); > + COMMON_INTERCEPTOR_INITIALIZE_RANGE(res, internal_strlen(res) + 1); > + } > + return res; > +} > +#define INIT_CRYPT_R COMMON_INTERCEPT_FUNCTION(crypt_r); > +#else > +#define INIT_CRYPT_R > +#endif > + > static void InitializeCommonInterceptors() { > #if SI_POSIX > static u64 metadata_mem[sizeof(MetadataHashMap) / sizeof(u64) + 1]; > @@ -9871,6 +9906,8 @@ static void InitializeCommonInterceptors > INIT_GETUSERSHELL; > INIT_SL_INIT; > INIT_GETRANDOM; > + INIT_CRYPT; > + INIT_CRYPT_R; > > INIT___PRINTF_CHK; > } > > Modified: > compiler-rt/trunk/lib/sanitizer_common/sanitizer_platform_interceptors.h > URL: > http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/sanitizer_common/sanitizer_platform_interceptors.h?rev=373993&r1=373992&r2=373993&view=diff > > ============================================================================== > --- > compiler-rt/trunk/lib/sanitizer_common/sanitizer_platform_interceptors.h > (original) > +++ > compiler-rt/trunk/lib/sanitizer_common/sanitizer_platform_interceptors.h > Mon Oct 7 17:00:30 2019 > @@ -566,6 +566,8 @@ > #define SANITIZER_INTERCEPT_FDEVNAME SI_FREEBSD > #define SANITIZER_INTERCEPT_GETUSERSHELL (SI_POSIX && !SI_ANDROID) > #define SANITIZER_INTERCEPT_SL_INIT (SI_FREEBSD || SI_NETBSD) > +#define SANITIZER_INTERCEPT_CRYPT (SI_POSIX && !SI_ANDROID) > +#define SANITIZER_INTERCEPT_CRYPT_R (SI_LINUX && !SI_ANDROID) > > #define SANITIZER_INTERCEPT_GETRANDOM (SI_LINUX && __GLIBC_PREREQ(2, 25)) > #define SANITIZER_INTERCEPT___CXA_ATEXIT SI_NETBSD > > Modified: > compiler-rt/trunk/lib/sanitizer_common/sanitizer_platform_limits_posix.cpp > URL: > 
http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/sanitizer_common/sanitizer_platform_limits_posix.cpp?rev=373993&r1=373992&r2=373993&view=diff > > ============================================================================== > --- > compiler-rt/trunk/lib/sanitizer_common/sanitizer_platform_limits_posix.cpp > (original) > +++ > compiler-rt/trunk/lib/sanitizer_common/sanitizer_platform_limits_posix.cpp > Mon Oct 7 17:00:30 2019 > @@ -140,6 +140,7 @@ typedef struct user_fpregs elf_fpregset_ > #include > #include > #include > +#include > #endif // SANITIZER_LINUX && !SANITIZER_ANDROID > > #if SANITIZER_ANDROID > @@ -240,6 +241,7 @@ namespace __sanitizer { > unsigned struct_ustat_sz = SIZEOF_STRUCT_USTAT; > unsigned struct_rlimit64_sz = sizeof(struct rlimit64); > unsigned struct_statvfs64_sz = sizeof(struct statvfs64); > + unsigned struct_crypt_data_sz = sizeof(struct crypt_data); > #endif // SANITIZER_LINUX && !SANITIZER_ANDROID > > #if SANITIZER_LINUX && !SANITIZER_ANDROID > > Modified: > compiler-rt/trunk/lib/sanitizer_common/sanitizer_platform_limits_posix.h > URL: > http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/sanitizer_common/sanitizer_platform_limits_posix.h?rev=373993&r1=373992&r2=373993&view=diff > > ============================================================================== > --- > compiler-rt/trunk/lib/sanitizer_common/sanitizer_platform_limits_posix.h > (original) > +++ > compiler-rt/trunk/lib/sanitizer_common/sanitizer_platform_limits_posix.h > Mon Oct 7 17:00:30 2019 > @@ -304,6 +304,7 @@ extern unsigned struct_msqid_ds_sz; > extern unsigned struct_mq_attr_sz; > extern unsigned struct_timex_sz; > extern unsigned struct_statvfs_sz; > +extern unsigned struct_crypt_data_sz; > #endif // SANITIZER_LINUX && !SANITIZER_ANDROID > > struct __sanitizer_iovec { > > Added: compiler-rt/trunk/test/sanitizer_common/TestCases/Linux/crypt_r.cpp > URL: > 
http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/test/sanitizer_common/TestCases/Linux/crypt_r.cpp?rev=373993&view=auto > > ============================================================================== > --- compiler-rt/trunk/test/sanitizer_common/TestCases/Linux/crypt_r.cpp > (added) > +++ compiler-rt/trunk/test/sanitizer_common/TestCases/Linux/crypt_r.cpp > Mon Oct 7 17:00:30 2019 > @@ -0,0 +1,37 @@ > +// RUN: %clangxx -O0 -g %s -lcrypt -o %t && %run %t > + > +#include > +#include > +#include > +#include > + > +#include > + > +int > +main (int argc, char** argv) > +{ > + { > + crypt_data cd; > + cd.initialized = 0; > + char *p = crypt_r("abcdef", "xz", &cd); > + volatile size_t z = strlen(p); > + } > + { > + crypt_data cd; > + cd.initialized = 0; > + char *p = crypt_r("abcdef", "$1$", &cd); > + volatile size_t z = strlen(p); > + } > + { > + crypt_data cd; > + cd.initialized = 0; > + char *p = crypt_r("abcdef", "$5$", &cd); > + volatile size_t z = strlen(p); > + } > + { > + crypt_data cd; > + cd.initialized = 0; > + char *p = crypt_r("abcdef", "$6$", &cd); > + volatile size_t z = strlen(p); > + } > +} > > Added: compiler-rt/trunk/test/sanitizer_common/TestCases/Posix/crypt.cpp > URL: > http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/test/sanitizer_common/TestCases/Posix/crypt.cpp?rev=373993&view=auto > > ============================================================================== > --- compiler-rt/trunk/test/sanitizer_common/TestCases/Posix/crypt.cpp > (added) > +++ compiler-rt/trunk/test/sanitizer_common/TestCases/Posix/crypt.cpp Mon > Oct 7 17:00:30 2019 > @@ -0,0 +1,26 @@ > +// RUN: %clangxx -O0 -g %s -o %t -lcrypt && %run %t > + > +#include > +#include > +#include > + > +int > +main (int argc, char** argv) > +{ > + { > + char *p = crypt("abcdef", "xz"); > + volatile size_t z = strlen(p); > + } > + { > + char *p = crypt("abcdef", "$1$"); > + volatile size_t z = strlen(p); > + } > + { > + char *p = crypt("abcdef", "$5$"); > + volatile size_t z = 
strlen(p); > + } > + { > + char *p = crypt("abcdef", "$6$"); > + volatile size_t z = strlen(p); > + } > +} > > > _______________________________________________ > llvm-commits mailing list > llvm-commits at lists.llvm.org > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits > > > _______________________________________________ > llvm-commits mailing list > llvm-commits at lists.llvm.org > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits > -------------- next part -------------- An HTML attachment was scrubbed... URL: From llvm-commits at lists.llvm.org Tue Oct 8 13:56:22 2019 From: llvm-commits at lists.llvm.org (Joel E. Denny via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 20:56:22 +0000 (UTC) Subject: [PATCH] D68664: [lit] Clean up internal diff's encoding handling Message-ID: jdenny created this revision. jdenny added reviewers: probinson, stella.stamenova, bd1976llvm, jlpeyton, rnk, mgorny. Herald added a subscriber: delcypher. Herald added a project: LLVM. jdenny added a parent revision: D66574: [lit] Make internal diff work in pipelines. jdenny added a child revision: D67643: [lit] Extend internal diff to support `-` argument. As suggested by rnk at D67643#1673043 , instead of reading files multiple times until an appropriate encoding is found, read them once as binary, and then try to decode what was read. For python >= 3.5, don't fail when attempting to decode the `diff_bytes` output in order to print it. Finally, add some tests for encoding handling. 
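The approach described in the summary can be sketched roughly as follows. This is an illustration under stated assumptions, not the patch itself: `diff_once` and its `encodings` parameter are hypothetical names, and the sketch assumes Python 3.5 or newer, where `difflib.diff_bytes` exists and the `backslashreplace` error handler works for decoding.

```python
import difflib


def diff_once(a_bytes, b_bytes, encodings=("utf-8",), a_name="a", b_name="b"):
    """Diff two inputs that were each read exactly once, in binary mode.

    Hypothetical helper for illustration; not the lit implementation.
    """
    for encoding in encodings:
        try:
            # Decode what was already read instead of re-reading the files.
            a_text = a_bytes.decode(encoding).splitlines(True)
            b_text = b_bytes.decode(encoding).splitlines(True)
        except UnicodeDecodeError:
            continue  # try the next candidate encoding
        return list(difflib.unified_diff(a_text, b_text, a_name, b_name))
    # No candidate decoded both inputs: diff the raw bytes instead, and
    # escape undecodable bytes so that printing the result cannot fail.
    raw_diff = difflib.diff_bytes(difflib.unified_diff,
                                  a_bytes.splitlines(True),
                                  b_bytes.splitlines(True),
                                  a_name.encode(), b_name.encode())
    return [line.decode("utf-8", errors="backslashreplace")
            for line in raw_diff]
```

Passing `locale.getpreferredencoding(False)` ahead of `"utf-8"` in `encodings` would correspond to the "try default, try utf-8" order suggested in the review; the default here is kept to UTF-8 only so that the behavior does not depend on the host locale.
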
Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D68664 Files: llvm/utils/lit/lit/builtin_commands/diff.py llvm/utils/lit/tests/Inputs/shtest-shell/diff-encodings.txt llvm/utils/lit/tests/Inputs/shtest-shell/diff-in.bin llvm/utils/lit/tests/Inputs/shtest-shell/diff-in.utf16 llvm/utils/lit/tests/Inputs/shtest-shell/diff-in.utf8 llvm/utils/lit/tests/max-failures.py llvm/utils/lit/tests/shtest-shell.py -------------- next part -------------- A non-text attachment was scrubbed... Name: D68664.223935.patch Type: text/x-patch Size: 6172 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 13:56:22 2019 From: llvm-commits at lists.llvm.org (Vedant Kumar via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 20:56:22 +0000 (UTC) Subject: [PATCH] D51018: [sancov] Fixed malformed JSON when symbolizing coverage information In-Reply-To: References: Message-ID: <4b6f4e7d9e5dffdcc1f349d484559ed2@localhost.localdomain> vsk added subscribers: Dor1s, vsk. vsk added a comment. @Dor1s - any chance you know more folks actively working on sancov who have the bandwidth to review? ================ Comment at: llvm/tools/sancov/sancov.cpp:351 static void printJSONStringLiteral(StringRef S, raw_ostream &OS) { - if (S.find('"') == std::string::npos) { + if (S.find('"') == std::string::npos && + S.find('\\') == std::string::npos) { ---------------- Perhaps this whole function should be simplified to `OS << json::Value(S)` (see 'llvm/Support/JSON.h')? CHANGES SINCE LAST ACTION https://reviews.llvm.org/D51018/new/ https://reviews.llvm.org/D51018 From llvm-commits at lists.llvm.org Tue Oct 8 13:56:22 2019 From: llvm-commits at lists.llvm.org (Kostya Kortchinsky via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 20:56:22 +0000 (UTC) Subject: [PATCH] D68653: [scudo][standalone] Get statistics in a char buffer In-Reply-To: References: Message-ID: <574327250a5ec187e0ca231ae5d18131@localhost.localdomain> cryptoad updated this revision to Diff 223937. 
cryptoad added a comment.

Added a test specific to `getStats(char *, uptr)`, and unified the stats output a little for consistency. I don't want to go too far there, as I am pretty sure the format of the output will change in the near future.

Repository:
  rCRT Compiler Runtime

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68653/new/

https://reviews.llvm.org/D68653

Files:
  lib/scudo/standalone/combined.h
  lib/scudo/standalone/crc32_hw.cpp
  lib/scudo/standalone/primary32.h
  lib/scudo/standalone/primary64.h
  lib/scudo/standalone/quarantine.h
  lib/scudo/standalone/secondary.cpp
  lib/scudo/standalone/secondary.h
  lib/scudo/standalone/size_class_map.h
  lib/scudo/standalone/string_utils.cpp
  lib/scudo/standalone/string_utils.h
  lib/scudo/standalone/tests/combined_test.cpp
  lib/scudo/standalone/tests/primary_test.cpp
  lib/scudo/standalone/tests/quarantine_test.cpp
  lib/scudo/standalone/tests/secondary_test.cpp

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68653.223937.patch
Type: text/x-patch
Size: 17855 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Tue Oct 8 13:56:23 2019
From: llvm-commits at lists.llvm.org (Kostya Kortchinsky via Phabricator via llvm-commits)
Date: Tue, 08 Oct 2019 20:56:23 +0000 (UTC)
Subject: [PATCH] D68653: [scudo][standalone] Get statistics in a char buffer
In-Reply-To:
References:
Message-ID: <965b3375378095bb9bba6b33c789055d@localhost.localdomain>

cryptoad updated this revision to Diff 223938.
cryptoad added a comment.

clang-format is not the best at strings.

Repository:
  rCRT Compiler Runtime

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68653/new/

https://reviews.llvm.org/D68653

Files:
  lib/scudo/standalone/combined.h
  lib/scudo/standalone/crc32_hw.cpp
  lib/scudo/standalone/primary32.h
  lib/scudo/standalone/primary64.h
  lib/scudo/standalone/quarantine.h
  lib/scudo/standalone/secondary.cpp
  lib/scudo/standalone/secondary.h
  lib/scudo/standalone/size_class_map.h
  lib/scudo/standalone/string_utils.cpp
  lib/scudo/standalone/string_utils.h
  lib/scudo/standalone/tests/combined_test.cpp
  lib/scudo/standalone/tests/primary_test.cpp
  lib/scudo/standalone/tests/quarantine_test.cpp
  lib/scudo/standalone/tests/secondary_test.cpp

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68653.223938.patch
Type: text/x-patch
Size: 17860 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Tue Oct 8 13:56:24 2019
From: llvm-commits at lists.llvm.org (Digger via Phabricator via llvm-commits)
Date: Tue, 08 Oct 2019 20:56:24 +0000 (UTC)
Subject: [PATCH] D68650: [AIX][XCOFF][NFC] Change the SectionLen field name of CSect Auxiliary entry to SectionOrLength.
In-Reply-To:
References:
Message-ID:

DiggerLin marked an inline comment as done.
DiggerLin added inline comments.

================
Comment at: llvm/include/llvm/Object/XCOFFObjectFile.h:117
+  support::ubig32_t
+      SectionOrLength; // If the symbol type is XTY_SD,XTY_CM,it contains the
+                       // csect length. If the symbol type is XTY_LD, it
----------------
hubert.reinterpretcast wrote:
> s/it contains//g;
>
> s/XTY_SD,XTY_CM/XTY_SD or XTY_CM/;
> s/is XTY_ER/is XTY_ER/;

Changed as suggested.

Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68650/new/ https://reviews.llvm.org/D68650 From llvm-commits at lists.llvm.org Tue Oct 8 13:56:24 2019 From: llvm-commits at lists.llvm.org (Digger via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 20:56:24 +0000 (UTC) Subject: [PATCH] D68650: [AIX][XCOFF][NFC] Change the SectionLen field name of CSect Auxiliary entry to SectionOrLength. In-Reply-To: References: Message-ID: <96b291c27bdb0c81afff24f01745b052@localhost.localdomain> DiggerLin updated this revision to Diff 223939. DiggerLin added a comment. addressed comment Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68650/new/ https://reviews.llvm.org/D68650 Files: llvm/include/llvm/Object/XCOFFObjectFile.h llvm/tools/llvm-readobj/XCOFFDumper.cpp Index: llvm/tools/llvm-readobj/XCOFFDumper.cpp =================================================================== --- llvm/tools/llvm-readobj/XCOFFDumper.cpp +++ llvm/tools/llvm-readobj/XCOFFDumper.cpp @@ -213,9 +213,9 @@ W.printNumber("Index", Obj.getSymbolIndex(reinterpret_cast(AuxEntPtr))); if ((AuxEntPtr->SymbolAlignmentAndType & SymbolTypeMask) == XCOFF::XTY_LD) - W.printNumber("ContainingCsectSymbolIndex", AuxEntPtr->SectionLen); + W.printNumber("ContainingCsectSymbolIndex", AuxEntPtr->SectionOrLength); else - W.printNumber("SectionLen", AuxEntPtr->SectionLen); + W.printNumber("SectionLen", AuxEntPtr->SectionOrLength); W.printHex("ParameterHashIndex", AuxEntPtr->ParameterHashIndex); W.printHex("TypeChkSectNum", AuxEntPtr->TypeChkSectNum); // Print out symbol alignment and type. 
Index: llvm/include/llvm/Object/XCOFFObjectFile.h
===================================================================
--- llvm/include/llvm/Object/XCOFFObjectFile.h
+++ llvm/include/llvm/Object/XCOFFObjectFile.h
@@ -113,7 +113,12 @@
 };
 
 struct XCOFFCsectAuxEnt32 {
-  support::ubig32_t SectionLen;
+  support::ubig32_t
+      SectionOrLength; // If the symbol type is XTY_SD or XTY_CM, the csect
+                       // length.
+                       // If the symbol type is XTY_LD, the symbol table
+                       // index of the containing csect.
+                       // If the symbol type is XTY_ER, 0.
   support::ubig32_t ParameterHashIndex;
   support::ubig16_t TypeChkSectNum;
   uint8_t SymbolAlignmentAndType;

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68650.223939.patch
Type: text/x-patch
Size: 1594 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Tue Oct 8 14:05:52 2019
From: llvm-commits at lists.llvm.org (Greg Clayton via Phabricator via llvm-commits)
Date: Tue, 08 Oct 2019 21:05:52 +0000 (UTC)
Subject: [PATCH] D68270: DWARFDebugLoc: Add a function to get the address range of an entry
In-Reply-To:
References:
Message-ID:

clayborg added a comment.

In D68270#1700108, @probinson wrote:

> Do we care whether llvm-dwarfdump's output bears any similarities to the output from GNU readelf or objdump? There has been a push lately to get the LLVM "binutils" to behave more like GNU's, although AFAIK it hasn't gotten to the DWARF dumping part.

I am not too fond of the readelf output, at least for the .debug_info dumping. I like to see indentation. But I do see the appeal of consistent output.

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68270/new/

https://reviews.llvm.org/D68270

From llvm-commits at lists.llvm.org Tue Oct 8 14:05:54 2019
From: llvm-commits at lists.llvm.org (Melanie Blower via Phabricator via llvm-commits)
Date: Tue, 08 Oct 2019 21:05:54 +0000 (UTC)
Subject: [PATCH] D62731: [RFC] Add support for options -frounding-math, -fp-model=, and -fp-exception-behavior=, : Specify floating point behavior
In-Reply-To:
References:
Message-ID:

mibintc updated this revision to Diff 223940.
mibintc added a comment.

I added a test case to show the warning diagnostics when options conflicting with fp-model are provided. I fixed a couple of bugs in RenderFloatingPointOptions when issuing diagnostics. I still owe a test case showing how the fp-model, rounding, and trapping options are rendered by the Driver for cc1.

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D62731/new/

https://reviews.llvm.org/D62731

Files:
  clang/docs/UsersManual.rst
  clang/include/clang/Basic/CodeGenOptions.def
  clang/include/clang/Basic/LangOptions.h
  clang/include/clang/Driver/Options.td
  clang/lib/CodeGen/BackendUtil.cpp
  clang/lib/CodeGen/CodeGenFunction.cpp
  clang/lib/CodeGen/CodeGenFunction.h
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/lib/Frontend/CompilerInvocation.cpp
  clang/test/CodeGen/fpconstrained.c
  clang/test/Driver/clang_f_opts.c
  clang/test/Driver/fast-math.c
  clang/test/Driver/fp-model.c
  llvm/include/llvm/Target/TargetOptions.h

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D62731.223940.patch
Type: text/x-patch
Size: 33744 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Tue Oct 8 14:15:08 2019
From: llvm-commits at lists.llvm.org (Xiangling Liao via Phabricator via llvm-commits)
Date: Tue, 08 Oct 2019 21:15:08 +0000 (UTC)
Subject: [PATCH] D68341: [AIX] TOC pseudo expansion for 64bit large + 64bit small + 32bit large modes
In-Reply-To:
References:
Message-ID:

Xiangling_L marked 14 inline comments as done.
Xiangling_L added inline comments.

================
Comment at: llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp:909
+
+  MCSymbol *MOSymbol = getMCSymbolForMO(MO, *this);
----------------
jasonliu wrote:
> nit: MOSymbol could move down a bit to be closer to where it gets used.
>
> Also I would want to add const to this MOSymbol. But it seems that a lot of similar MOSymbols in this file do not have const with them, so adding const only for this one would create inconsistency. Not sure if it's worth having an NFC patch to add const for all the MOSymbols in this file first.

As Sean mentioned to me, Stefan is doing this NFC patch about MOSymbol, so I think I can put this up there: https://github.ibm.com/compiler/llvm-project/pull/672.

================
Comment at: llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp:926
     // Into: %xd = ADDIS8 %x2, sym@got@tlsgd@ha
-    assert(Subtarget->isPPC64() && "Not supported for 32-bit PowerPC");
+    assert(IsPPC64 && "Not supported for 32-bit PowerPC");
     const MachineOperand &MO = MI->getOperand(2);
----------------
hubert.reinterpretcast wrote:
> There is quite a bit of noise in this patch from these NFC changes. Please split them out.

Thanks for mentioning this, will do.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68341/new/

https://reviews.llvm.org/D68341

From llvm-commits at lists.llvm.org Tue Oct 8 14:15:12 2019
From: llvm-commits at lists.llvm.org (Joel E. Denny via Phabricator via llvm-commits)
Date: Tue, 08 Oct 2019 21:15:12 +0000 (UTC)
Subject: [PATCH] D67643: [lit] Extend internal diff to support `-` argument
In-Reply-To:
References:
Message-ID: <4271dcce0a957a86547488d5d89b3d6a@localhost.localdomain>

jdenny updated this revision to Diff 223941.
jdenny edited the summary of this revision.
jdenny added a comment.

Rebased onto D68664, and extended its encoding tests for `-`.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D67643/new/

https://reviews.llvm.org/D67643

Files:
  llvm/utils/lit/lit/builtin_commands/diff.py
  llvm/utils/lit/tests/Inputs/shtest-shell/diff-encodings.txt
  llvm/utils/lit/tests/Inputs/shtest-shell/diff-pipes.txt
  llvm/utils/lit/tests/Inputs/shtest-shell/diff-r-error-7.txt
  llvm/utils/lit/tests/Inputs/shtest-shell/diff-r-error-8.txt
  llvm/utils/lit/tests/max-failures.py
  llvm/utils/lit/tests/shtest-shell.py

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D67643.223941.patch
Type: text/x-patch
Size: 7915 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Tue Oct 8 14:24:28 2019
From: llvm-commits at lists.llvm.org (Adrian McCarthy via Phabricator via llvm-commits)
Date: Tue, 08 Oct 2019 21:24:28 +0000 (UTC)
Subject: [PATCH] D68134: [LLDB] Use the llvm microsoft demangler instead of the windows dbghelp api
In-Reply-To:
References:
Message-ID: <132c2bd7c52ee4b9b0078bb7e6132b81@localhost.localdomain>

amccarth accepted this revision.
amccarth added a comment.

LGTM after one question.

================
Comment at: lldb/lit/SymbolFile/PDB/udt-layout.test:1
 REQUIRES: system-windows, lld
 RUN: %build --compiler=clang-cl --output=%t.exe %S/Inputs/UdtLayoutTest.cpp
----------------
Is `system-windows` still required after you've removed the dependency on dbghelp?

CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68134/new/ https://reviews.llvm.org/D68134 From llvm-commits at lists.llvm.org Tue Oct 8 14:24:29 2019 From: llvm-commits at lists.llvm.org (Reid Kleckner via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 21:24:29 +0000 (UTC) Subject: [PATCH] D68664: [lit] Clean up internal diff's encoding handling In-Reply-To: References: Message-ID: <1a354aa3012a1a4e4b7acd0debb772c5@localhost.localdomain> rnk accepted this revision. rnk added a comment. This revision is now accepted and ready to land. lgtm I'm guessing you've tested with Python 2.7 and 3.5, and that's probably what matters. Thanks for working on this, and sorry for ever expanding scope of the suggested refactorings. ================ Comment at: llvm/utils/lit/lit/builtin_commands/diff.py:36 + locale.getpreferredencoding(False)) + except UnicodeDecodeError: + try: ---------------- try / except UnicodeDecodeError is an exciting code pattern, but I guess it's the existing behavior. :) ================ Comment at: llvm/utils/lit/lit/builtin_commands/diff.py:43 +def compareTwoBinaryFiles(flags, filepaths, filelines): + #sys.stderr.write("Trying as binary....\n") exitCode = 0 ---------------- Extra logging? ================ Comment at: llvm/utils/lit/lit/builtin_commands/diff.py:63 +def compareTwoTextFiles(flags, filepaths, filelines_bin, encoding): + #sys.stderr.write("Trying with encoding {}....\n".format(encoding)) filelines = [] ---------------- Ditto Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68664/new/ https://reviews.llvm.org/D68664 From llvm-commits at lists.llvm.org Tue Oct 8 14:24:30 2019 From: llvm-commits at lists.llvm.org (Krzysztof Parzyszek via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 21:24:30 +0000 (UTC) Subject: [PATCH] D68666: LiveIntervals: Split live intervals on multiple dead defs Message-ID: kparzysz created this revision. kparzysz added reviewers: qcolombet, arsenm. 
Herald added subscribers: aheejin, jgravelle-google, sbc100, nhaehnle, wdng, jvesely, dschuff, MatzeB. Herald added a project: LLVM. This is a follow-up to D67448 . Split live intervals with multiple dead defs during the initial execution of the live interval analysis, but do it outside of the function `createAndComputeVirtRegInterval`. Repository: rL LLVM https://reviews.llvm.org/D68666 Files: include/llvm/CodeGen/LiveIntervals.h lib/CodeGen/LiveIntervals.cpp test/CodeGen/AMDGPU/live-intervals-multiple-dead-defs.mir test/DebugInfo/WebAssembly/dbg-value-move-reg-stackify.mir test/DebugInfo/X86/live-debug-vars-discard-invalid.mir -------------- next part -------------- A non-text attachment was scrubbed... Name: D68666.223945.patch Type: text/x-patch Size: 5263 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 14:33:53 2019 From: llvm-commits at lists.llvm.org (Reid Kleckner via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 21:33:53 +0000 (UTC) Subject: [PATCH] D67643: [lit] Extend internal diff to support `-` argument In-Reply-To: References: Message-ID: rnk accepted this revision. rnk added a comment. This revision is now accepted and ready to land. lgtm ================ Comment at: llvm/utils/lit/lit/builtin_commands/diff.py:27-29 + # FIXME: How can we restart stdin if the encoding is wrong? How can we + # read stdin with a different encoding or in binary mode in a way that's + # compatible with python 2 and 3? ---------------- jdenny wrote: > jdenny wrote: > > rnk wrote: > > > This function seems fragile to me. It reads each file three times: > > > 1. without an encoding (system default?), may fail > > > 2. with a utf-8 encoding, may fail > > > 3. inside compareTwo*Files, reads them yet again > > > > > > I think the best way to be portable between Python 2 & 3 will probably be to work in bytes as much as possible. 
> > > > > > I think you can use this pattern to get a new file descriptor for stdin that reads bytes: > > > ``` > > > fileno = sys.stdin.fileno() > > > if fileno is not None: > > > new_stdin = os.fdopen(os.dup(fileno), 'rb') # read in binary > > > ``` > > > > > > Here's how I would do it: > > > 1. read both inputs completely in binary (can't fail) > > > 2. try decoding the entire file in a few encodings (try default, try utf-8, if that fails, stick with bytes and diff_bytes) > > > 3. split file on u'\n' or b'\n' as appropriate, perhaps stripping trailing '\r' if present, depending on flags > > > > > > Does that seem reasonable? > > I don't know the motivation by the original implementation. Your plan sounds better. Sorry, but I don't have time to implement and test this right now. Hopefully soon. Thanks for the review. > > I think you can use this pattern to get a new file descriptor for stdin that reads bytes: > > > > ``` > > fileno = sys.stdin.fileno() > > if fileno is not None: > > ``` > > I don't see documentation saying that `sys.stdin.fileno` might return `None`. Is that possible? I think I just copied it from stack overflow here: https://stackoverflow.com/questions/32199552/what-is-sys-stdin-fileno-in-python I guess in this case os.dup would fail, which is a reasonable failure mode. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67643/new/ https://reviews.llvm.org/D67643 From llvm-commits at lists.llvm.org Tue Oct 8 14:33:53 2019 From: llvm-commits at lists.llvm.org (Hideki Saito via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 21:33:53 +0000 (UTC) Subject: [PATCH] D68651: [InstCombine] Signed saturation patterns In-Reply-To: References: Message-ID: <58cf95491d076c3c835679c3aed56b98@localhost.localdomain> hsaito added a comment. In D68651#1699976 , @dmgreen wrote: > We do form uadd_sat as in rL357012 and usub_sat from selects. > > I really just need some way to generate sadd_sats for vectorisation. 
If there's a better way than this, I'm all ears :) If we just need vectorization with saturating add to happen, we just need a better pattern matcher utility. There should be benefits for going to intrinsic that compensates for the possible loss of optimizations that we might get with a sequence of Instructions. Is this (i.e., saturating add/sub intrinsic as the canonical form) something already discussed in llvm-dev that I missed? When I was involved in the "vector idioms" discussion few years ago, general consensus (among 10+ people interested in that topic) was that we need better pattern matchers than pushing for vector idiom specific intrinsics. That's why I'm asking. The same question also applies to rL357012 . CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68651/new/ https://reviews.llvm.org/D68651 From llvm-commits at lists.llvm.org Tue Oct 8 14:33:54 2019 From: llvm-commits at lists.llvm.org (Ayal Zaks via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 21:33:54 +0000 (UTC) Subject: [PATCH] D68082: [LV] Emitting SCEV checks with OptForSize In-Reply-To: References: Message-ID: <4c8f569a7189397754e8682a51b4c111@localhost.localdomain> Ayal added a comment. In D68082#1696991 , @SjoerdMeijer wrote: > Comments addressed: > > - cleaned up the test case a bit. Can be cleaned up a bit more, "header" block is still redundant. > - couldn't reuse an existing run-line, I guess because of -mcpu=skx, but a separate run line seems fine to me. Interesting. Better to check instead of guess. This distinction means that 1. @pr43371() testcase belongs in test/Transforms/LoopVectorize/optsize.ll instead of test/Transforms/LoopVectorize/X86/optsize.ll 2. InterleavedAccessInfo::collectConstStrideAccesses() in Analysis/VectorUtils.cpp is responsible for creating the predicate under skx early enough to bailout (despite having no interleave groups eventually). Best teach it too to call getPtrStride() with Assume=!hasOptSize instead of 'true'. 
This is another performance opportunity rather than prevention of asserts. > > >> The other suggested fix in LoopAccessInfo::collectStridedAccess() indeed deserves a separate patch > > Thanks for that suggestions, and I will address this separately. I have unfinished business in the vectorizer, and will add this to me my list of things to do next. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68082/new/ https://reviews.llvm.org/D68082 From llvm-commits at lists.llvm.org Tue Oct 8 14:43:27 2019 From: llvm-commits at lists.llvm.org (Joel E. Denny via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 21:43:27 +0000 (UTC) Subject: [PATCH] D68664: [lit] Clean up internal diff's encoding handling In-Reply-To: References: Message-ID: <1b01ab62ce3ed37347061ff0f873b208@localhost.localdomain> jdenny marked 2 inline comments as done. jdenny added a comment. In D68664#1700456 , @rnk wrote: > I'm guessing you've tested with Python 2.7 and 3.5, and that's probably what matters. I have 2.7.15 and 3.6.8, but I assume that's sufficient. With each, I've run check-lit and manually tried diff.py. I still need to try check-all before pushing everything. > Thanks for working on this, and sorry for ever expanding scope of the suggested refactorings. It all needed to be done. I just ran out of time last month. Thanks for the reviews. ================ Comment at: llvm/utils/lit/lit/builtin_commands/diff.py:36 + locale.getpreferredencoding(False)) + except UnicodeDecodeError: + try: ---------------- rnk wrote: > try / except UnicodeDecodeError is an exciting code pattern, but I guess it's the existing behavior. :) I'm not quite sure how to interpret this remark. Indeed, it's the existing behavior. Do you recommend an alternative? ================ Comment at: llvm/utils/lit/lit/builtin_commands/diff.py:43 +def compareTwoBinaryFiles(flags, filepaths, filelines): + #sys.stderr.write("Trying as binary....\n") exitCode = 0 ---------------- rnk wrote: > Extra logging? Hmm. 
I'm being sloppy. I'll remove. Thanks. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68664/new/ https://reviews.llvm.org/D68664 From llvm-commits at lists.llvm.org Tue Oct 8 14:48:47 2019 From: llvm-commits at lists.llvm.org (David Blaikie via llvm-commits) Date: Tue, 08 Oct 2019 21:48:47 -0000 Subject: [llvm] r374122 - DebugInfo: Move LLE enum handling to .def to match RLE handling Message-ID: <20191008214847.5B73D89069@lists.llvm.org> Author: dblaikie Date: Tue Oct 8 14:48:46 2019 New Revision: 374122 URL: http://llvm.org/viewvc/llvm-project?rev=374122&view=rev Log: DebugInfo: Move LLE enum handling to .def to match RLE handling Modified: llvm/trunk/include/llvm/BinaryFormat/Dwarf.def llvm/trunk/include/llvm/BinaryFormat/Dwarf.h llvm/trunk/lib/BinaryFormat/Dwarf.cpp llvm/trunk/lib/DebugInfo/DWARF/DWARFDebugLoc.cpp Modified: llvm/trunk/include/llvm/BinaryFormat/Dwarf.def URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/BinaryFormat/Dwarf.def?rev=374122&r1=374121&r2=374122&view=diff ============================================================================== --- llvm/trunk/include/llvm/BinaryFormat/Dwarf.def (original) +++ llvm/trunk/include/llvm/BinaryFormat/Dwarf.def Tue Oct 8 14:48:46 2019 @@ -17,7 +17,7 @@ defined HANDLE_DW_VIRTUALITY || defined HANDLE_DW_DEFAULTED || \ defined HANDLE_DW_CC || defined HANDLE_DW_LNS || defined HANDLE_DW_LNE || \ defined HANDLE_DW_LNCT || defined HANDLE_DW_MACRO || \ - defined HANDLE_DW_RLE || \ + defined HANDLE_DW_RLE || defined HANDLE_DW_LLE || \ (defined HANDLE_DW_CFA && defined HANDLE_DW_CFA_PRED) || \ defined HANDLE_DW_APPLE_PROPERTY || defined HANDLE_DW_UT || \ defined HANDLE_DWARF_SECTION || defined HANDLE_DW_IDX || \ @@ -91,6 +91,10 @@ #define HANDLE_DW_RLE(ID, NAME) #endif +#ifndef HANDLE_DW_LLE +#define HANDLE_DW_LLE(ID, NAME) +#endif + #ifndef HANDLE_DW_CFA #define HANDLE_DW_CFA(ID, NAME) #endif @@ -825,6 +829,17 @@ HANDLE_DW_RLE(0x05, base_address) 
HANDLE_DW_RLE(0x06, start_end) HANDLE_DW_RLE(0x07, start_length) +// DWARF v5 Loc List Entry encoding values. +HANDLE_DW_LLE(0x00, end_of_list) +HANDLE_DW_LLE(0x01, base_addressx) +HANDLE_DW_LLE(0x02, startx_endx) +HANDLE_DW_LLE(0x03, startx_length) +HANDLE_DW_LLE(0x04, offset_pair) +HANDLE_DW_LLE(0x05, default_location) +HANDLE_DW_LLE(0x06, base_address) +HANDLE_DW_LLE(0x07, start_end) +HANDLE_DW_LLE(0x08, start_length) + // Call frame instruction encodings. HANDLE_DW_CFA(0x00, nop) HANDLE_DW_CFA(0x40, advance_loc) @@ -939,6 +954,7 @@ HANDLE_DW_IDX(0x05, type_hash) #undef HANDLE_DW_LNCT #undef HANDLE_DW_MACRO #undef HANDLE_DW_RLE +#undef HANDLE_DW_LLE #undef HANDLE_DW_CFA #undef HANDLE_DW_CFA_PRED #undef HANDLE_DW_APPLE_PROPERTY Modified: llvm/trunk/include/llvm/BinaryFormat/Dwarf.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/BinaryFormat/Dwarf.h?rev=374122&r1=374121&r2=374122&view=diff ============================================================================== --- llvm/trunk/include/llvm/BinaryFormat/Dwarf.h (original) +++ llvm/trunk/include/llvm/BinaryFormat/Dwarf.h Tue Oct 8 14:48:46 2019 @@ -308,11 +308,17 @@ enum MacroEntryType { }; /// DWARF v5 range list entry encoding values. -enum RangeListEntries { +enum RnglistEntries { #define HANDLE_DW_RLE(ID, NAME) DW_RLE_##NAME = ID, #include "llvm/BinaryFormat/Dwarf.def" }; +/// DWARF v5 loc list entry encoding values. +enum LoclistEntries { +#define HANDLE_DW_LLE(ID, NAME) DW_LLE_##NAME = ID, +#include "llvm/BinaryFormat/Dwarf.def" +}; + /// Call frame instruction encodings. enum CallFrameInfo { #define HANDLE_DW_CFA(ID, NAME) DW_CFA_##NAME = ID, @@ -348,19 +354,6 @@ enum Constants { DW_EH_PE_indirect = 0x80 }; -/// Constants for location lists in DWARF v5. 
-enum LocationListEntry : unsigned char { - DW_LLE_end_of_list = 0x00, - DW_LLE_base_addressx = 0x01, - DW_LLE_startx_endx = 0x02, - DW_LLE_startx_length = 0x03, - DW_LLE_offset_pair = 0x04, - DW_LLE_default_location = 0x05, - DW_LLE_base_address = 0x06, - DW_LLE_start_end = 0x07, - DW_LLE_start_length = 0x08 -}; - /// Constants for the DW_APPLE_PROPERTY_attributes attribute. /// Keep this list in sync with clang's DeclSpec.h ObjCPropertyAttributeKind! enum ApplePropertyAttributes { @@ -475,6 +468,7 @@ StringRef LNStandardString(unsigned Stan StringRef LNExtendedString(unsigned Encoding); StringRef MacinfoString(unsigned Encoding); StringRef RangeListEncodingString(unsigned Encoding); +StringRef LocListEncodingString(unsigned Encoding); StringRef CallFrameString(unsigned Encoding, Triple::ArchType Arch); StringRef ApplePropertyString(unsigned); StringRef UnitTypeString(unsigned); Modified: llvm/trunk/lib/BinaryFormat/Dwarf.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/BinaryFormat/Dwarf.cpp?rev=374122&r1=374121&r2=374122&view=diff ============================================================================== --- llvm/trunk/lib/BinaryFormat/Dwarf.cpp (original) +++ llvm/trunk/lib/BinaryFormat/Dwarf.cpp Tue Oct 8 14:48:46 2019 @@ -472,6 +472,17 @@ StringRef llvm::dwarf::RangeListEncoding } } +StringRef llvm::dwarf::LocListEncodingString(unsigned Encoding) { + switch (Encoding) { + default: + return StringRef(); +#define HANDLE_DW_LLE(ID, NAME) \ + case DW_LLE_##NAME: \ + return "DW_LLE_" #NAME; +#include "llvm/BinaryFormat/Dwarf.def" + } +} + StringRef llvm::dwarf::CallFrameString(unsigned Encoding, Triple::ArchType Arch) { assert(Arch != llvm::Triple::ArchType::UnknownArch); Modified: llvm/trunk/lib/DebugInfo/DWARF/DWARFDebugLoc.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/DebugInfo/DWARF/DWARFDebugLoc.cpp?rev=374122&r1=374121&r2=374122&view=diff ============================================================================== --- 
llvm/trunk/lib/DebugInfo/DWARF/DWARFDebugLoc.cpp (original) +++ llvm/trunk/lib/DebugInfo/DWARF/DWARFDebugLoc.cpp Tue Oct 8 14:48:46 2019 @@ -143,7 +143,7 @@ DWARFDebugLoclists::parseOneLocationList DataExtractor::Cursor C(*Offset); // dwarf::DW_LLE_end_of_list_entry is 0 and indicates the end of the list. - while (auto Kind = static_cast(Data.getU8(C))) { + while (auto Kind = static_cast(Data.getU8(C))) { Entry E; E.Kind = Kind; switch (Kind) { From llvm-commits at lists.llvm.org Tue Oct 8 14:52:46 2019 From: llvm-commits at lists.llvm.org (Hubert Tong via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 21:52:46 +0000 (UTC) Subject: [PATCH] D68650: [AIX][XCOFF][NFC] Change the SectionLen field name of CSect Auxiliary entry to SectionOrLength. In-Reply-To: References: Message-ID: <9d358bb2905b221250a572cfeda2d01b@localhost.localdomain> hubert.reinterpretcast accepted this revision. hubert.reinterpretcast added a comment. This revision is now accepted and ready to land. LGTM. Thanks again. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68650/new/ https://reviews.llvm.org/D68650 From llvm-commits at lists.llvm.org Tue Oct 8 14:52:47 2019 From: llvm-commits at lists.llvm.org (Nikita Popov via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 21:52:47 +0000 (UTC) Subject: [PATCH] D68651: [InstCombine] Signed saturation patterns In-Reply-To: References: Message-ID: nikic added a comment. @hsaito We've put quite a bit of effort into making sure that saturating intrinsics optimize as well or better than expanded IR sequences, which is why they are indeed considered canonical IR. Whether intrinsics are canonical needs to be decided on a case by case basis, there is no general rule about it. For example the unsigned mul overflow intrinsic is considered canonical (for obvious reasons -- the alternatives tend to be much more expensive) and is formed in InstCombine. 
Unsigned add/sub overflow on the other hand are not, because they tends to optimize much worse than expanded IR. Those are formed in CGP instead. It should be noted that the saturating add/sub intrinsics are not intended primarily as vector intrinsics -- even though that's the case where they will most typically select to hardware instructions rather than be expanded in DAG. The intrinsic representation is simply generally useful for optimization. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68651/new/ https://reviews.llvm.org/D68651 From llvm-commits at lists.llvm.org Tue Oct 8 15:03:14 2019 From: llvm-commits at lists.llvm.org (Jonas Devlieghere via llvm-commits) Date: Tue, 08 Oct 2019 22:03:14 -0000 Subject: [llvm] r374123 - [dsymutil] Improve verbose output (NFC) Message-ID: <20191008220314.19DEA8FA21@lists.llvm.org> Author: jdevlieghere Date: Tue Oct 8 15:03:13 2019 New Revision: 374123 URL: http://llvm.org/viewvc/llvm-project?rev=374123&view=rev Log: [dsymutil] Improve verbose output (NFC) The verbose output for finding relocations assumed that we'd always dump the DIE after (which starts with a newline) and therefore didn't include one itself. However, this isn't always true, leading to garbled output. This patch adds a newline to the verbose output and adds a line that says that the DIE is being kept (which isn't obvious otherwise). It also adds a 0x prefix to the relocations. 
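A minimal sketch of the output format described in the log above, written in Python rather than the C++ of DwarfLinker.cpp. The helper name `format_mapping` and the use of `None` for a missing object address are assumptions made for illustration; only the `0x` prefix, the 16-digit zero-padded hex, the trailing newline, and the all-ones fallback are taken from the patch and its test expectations.

```python
UINT64_MAX = (1 << 64) - 1  # stands in for the maximum uint64_t value


def format_mapping(name, object_address, binary_address):
    """Format one verbose 'debug map entry' line (hypothetical helper)."""
    # A missing object address is printed as all ones, as in the _val test line.
    obj = UINT64_MAX if object_address is None else object_address
    # 0x prefix, 16 zero-padded hex digits, and a trailing newline.
    return "Found valid debug map entry: {}\t0x{:016x} => 0x{:016x}\n".format(
        name, obj, binary_address
    )


print(format_mapping("_main", 0x0, 0x100000EA0), end="")
print(format_mapping("_val", None, 0x100001004), end="")
```

Ending the line in the formatter itself means the message stays readable even when no DIE dump follows it.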
Modified: llvm/trunk/test/tools/dsymutil/basic-linking.test llvm/trunk/tools/dsymutil/DwarfLinker.cpp Modified: llvm/trunk/test/tools/dsymutil/basic-linking.test URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/dsymutil/basic-linking.test?rev=374123&r1=374122&r2=374123&view=diff ============================================================================== --- llvm/trunk/test/tools/dsymutil/basic-linking.test (original) +++ llvm/trunk/test/tools/dsymutil/basic-linking.test Tue Oct 8 15:03:13 2019 @@ -25,36 +25,44 @@ CHECK-NOT: TAG CHECK: AT_name {{.*}}basic3.c CHECK-NOT: Found valid debug map entry -CHECK: Found valid debug map entry: _main 0000000000000000 => 0000000100000ea0 +CHECK: Found valid debug map entry: _main 0x0000000000000000 => 0x0000000100000ea0 +CHECK-NEXT: Keeping subprogram DIE: CHECK-NEXT: DW_TAG_subprogram CHECK-NEXT: DW_AT_name{{.*}}"main" -CHECK: Found valid debug map entry: _private_int 0000000000000560 => 0000000100001008 +CHECK: Found valid debug map entry: _private_int 0x0000000000000560 => 0x0000000100001008 +CHECK-NEXT: Keeping variable DIE: CHECK-NEXT: DW_TAG_variable CHECK-NEXT: DW_AT_name {{.*}}"private_int" CHECK-NOT: Found valid debug map entry -CHECK: Found valid debug map entry: _baz 0000000000000310 => 0000000100001000 +CHECK: Found valid debug map entry: _baz 0x0000000000000310 => 0x0000000100001000 +CHECK-NEXT: Keeping variable DIE: CHECK-NEXT: DW_TAG_variable CHECK-NEXT: DW_AT_name {{.*}}"baz" CHECK-NOT: Found valid debug map entry -CHECK: Found valid debug map entry: _foo 0000000000000020 => 0000000100000ed0 +CHECK: Found valid debug map entry: _foo 0x0000000000000020 => 0x0000000100000ed0 +CHECK-NEXT: Keeping subprogram DIE: CHECK-NEXT: DW_TAG_subprogram CHECK-NEXT: DW_AT_name {{.*}}"foo" CHECK-NOT: Found valid debug map entry -CHECK: Found valid debug map entry: _inc 0000000000000070 => 0000000100000f20 +CHECK: Found valid debug map entry: _inc 0x0000000000000070 => 0x0000000100000f20 +CHECK-NEXT: Keeping 
subprogram DIE: CHECK-NEXT: DW_TAG_subprogram CHECK-NEXT: DW_AT_name {{.*}}"inc" CHECK-NOT: Found valid debug map entry -CHECK: Found valid debug map entry: _val ffffffffffffffff => 0000000100001004 +CHECK: Found valid debug map entry: _val 0xffffffffffffffff => 0x0000000100001004 +CHECK-NEXT: Keeping variable DIE: CHECK-NEXT: DW_TAG_variable CHECK-NEXT: DW_AT_name {{.*}}"val" CHECK-NOT: Found valid debug map entry -CHECK: Found valid debug map entry: _bar 0000000000000020 => 0000000100000f40 +CHECK: Found valid debug map entry: _bar 0x0000000000000020 => 0x0000000100000f40 +CHECK-NEXT: Keeping subprogram DIE: CHECK-NEXT: DW_TAG_subprogram CHECK-NEXT: DW_AT_name {{.*}}"bar" CHECK-NOT: Found valid debug map entry -CHECK: Found valid debug map entry: _inc 0000000000000070 => 0000000100000f90 +CHECK: Found valid debug map entry: _inc 0x0000000000000070 => 0x0000000100000f90 +CHECK-NEXT: Keeping subprogram DIE: CHECK-NEXT: DW_TAG_subprogram CHECK-NEXT: DW_AT_name {{.*}}"inc") @@ -75,27 +83,33 @@ CHECK-LTO-NOT: TAG CHECK-LTO: AT_name {{.*}}basic3.c CHECK-LTO-NOT: Found valid debug map entry -CHECK-LTO: Found valid debug map entry: _main 0000000000000000 => 0000000100000f40 +CHECK-LTO: Found valid debug map entry: _main 0x0000000000000000 => 0x0000000100000f40 +CHECK-LTO-NEXT: Keeping subprogram DIE: CHECK-LTO-NEXT: DW_TAG_subprogram CHECK-LTO-NEXT: DW_AT_name {{.*}}"main" CHECK-LTO-NOT: Found valid debug map entry -CHECK-LTO: Found valid debug map entry: _private_int 00000000000008e8 => 0000000100001008 +CHECK-LTO: Found valid debug map entry: _private_int 0x00000000000008e8 => 0x0000000100001008 +CHECK-LTO-NEXT: Keeping variable DIE: CHECK-LTO-NEXT: DW_TAG_variable CHECK-LTO-NEXT: DW_AT_name {{.*}}"private_int" CHECK-LTO-NOT: Found valid debug map entry -CHECK-LTO: Found valid debug map entry: _baz 0000000000000658 => 0000000100001000 +CHECK-LTO: Found valid debug map entry: _baz 0x0000000000000658 => 0x0000000100001000 +CHECK-LTO-NEXT: Keeping variable DIE: 
CHECK-LTO-NEXT: DW_TAG_variable CHECK-LTO-NEXT: DW_AT_name {{.*}} "baz" CHECK-LTO-NOT: Found valid debug map entry -CHECK-LTO: Found valid debug map entry: _foo 0000000000000010 => 0000000100000f50 +CHECK-LTO: Found valid debug map entry: _foo 0x0000000000000010 => 0x0000000100000f50 +CHECK-LTO-NEXT: Keeping subprogram DIE: CHECK-LTO-NEXT: DW_TAG_subprogram CHECK-LTO-NEXT: DW_AT_name {{.*}}"foo" CHECK-LTO-NOT: Found valid debug map entry -CHECK-LTO: Found valid debug map entry: _val 00000000000008ec => 0000000100001004 +CHECK-LTO: Found valid debug map entry: _val 0x00000000000008ec => 0x0000000100001004 +CHECK-LTO-NEXT: Keeping variable DIE: CHECK-LTO-NEXT: DW_TAG_variable CHECK-LTO-NEXT: DW_AT_name {{.*}}"val" CHECK-LTO-NOT: Found valid debug map entry -CHECK-LTO: Found valid debug map entry: _bar 0000000000000050 => 0000000100000f90 +CHECK-LTO: Found valid debug map entry: _bar 0x0000000000000050 => 0x0000000100000f90 +CHECK-LTO-NEXT: Keeping subprogram DIE: CHECK-LTO-NEXT: DW_TAG_subprogram CHECK-LTO-NEXT: DW_AT_name {{.*}}"bar" @@ -120,36 +134,44 @@ CHECK-ARCHIVE-NOT: TAG CHECK-ARCHIVE: AT_name {{.*}}basic3.c CHECK-ARCHIVE-NOT: Found valid debug map entry -CHECK-ARCHIVE: Found valid debug map entry: _main 0000000000000000 => 0000000100000ea0 +CHECK-ARCHIVE: Found valid debug map entry: _main 0x0000000000000000 => 0x0000000100000ea0 +CHECK-ARCHIVE-NEXT: Keeping subprogram DIE: CHECK-ARCHIVE-NEXT: DW_TAG_subprogram CHECK-ARCHIVE-NEXT: DW_AT_name{{.*}}"main" CHECK-ARCHIVE-NOT: Found valid debug map entry -CHECK-ARCHIVE: Found valid debug map entry: _private_int 0000000000000560 => 0000000100001004 +CHECK-ARCHIVE: Found valid debug map entry: _private_int 0x0000000000000560 => 0x0000000100001004 +CHECK-ARCHIVE-NEXT: Keeping variable DIE: CHECK-ARCHIVE-NEXT: DW_TAG_variable CHECK-ARCHIVE-NEXT: DW_AT_name {{.*}}"private_int" CHECK-ARCHIVE-NOT: Found valid debug map entry -CHECK-ARCHIVE: Found valid debug map entry: _baz 0000000000000310 => 0000000100001000 
+CHECK-ARCHIVE: Found valid debug map entry: _baz 0x0000000000000310 => 0x0000000100001000 +CHECK-ARCHIVE-NEXT: Keeping variable DIE: CHECK-ARCHIVE-NEXT: DW_TAG_variable CHECK-ARCHIVE-NEXT: DW_AT_name {{.*}}"baz" CHECK-ARCHIVE-NOT: Found valid debug map entry -CHECK-ARCHIVE: Found valid debug map entry: _foo 0000000000000020 => 0000000100000ed0 +CHECK-ARCHIVE: Found valid debug map entry: _foo 0x0000000000000020 => 0x0000000100000ed0 +CHECK-ARCHIVE-NEXT: Keeping subprogram DIE: CHECK-ARCHIVE-NEXT: DW_TAG_subprogram CHECK-ARCHIVE-NEXT: DW_AT_name {{.*}}"foo" CHECK-ARCHIVE-NOT: Found valid debug map entry -CHECK-ARCHIVE: Found valid debug map entry: _inc 0000000000000070 => 0000000100000f20 +CHECK-ARCHIVE: Found valid debug map entry: _inc 0x0000000000000070 => 0x0000000100000f20 +CHECK-ARCHIVE-NEXT: Keeping subprogram DIE: CHECK-ARCHIVE-NEXT: DW_TAG_subprogram CHECK-ARCHIVE-NEXT: DW_AT_name {{.*}}"inc" CHECK-ARCHIVE-NOT: Found valid debug map entry -CHECK-ARCHIVE: Found valid debug map entry: _val ffffffffffffffff => 0000000100001008 +CHECK-ARCHIVE: Found valid debug map entry: _val 0xffffffffffffffff => 0x0000000100001008 +CHECK-ARCHIVE-NEXT: Keeping variable DIE: CHECK-ARCHIVE-NEXT: DW_TAG_variable CHECK-ARCHIVE-NEXT: DW_AT_name {{.*}}"val" CHECK-ARCHIVE-NOT: Found valid debug map entry -CHECK-ARCHIVE: Found valid debug map entry: _bar 0000000000000020 => 0000000100000f40 +CHECK-ARCHIVE: Found valid debug map entry: _bar 0x0000000000000020 => 0x0000000100000f40 +CHECK-ARCHIVE-NEXT: Keeping subprogram DIE: CHECK-ARCHIVE-NEXT: DW_TAG_subprogram CHECK-ARCHIVE-NEXT: DW_AT_name {{.*}}"bar" CHECK-ARCHIVE-NOT: Found valid debug map entry -CHECK-ARCHIVE: Found valid debug map entry: _inc 0000000000000070 => 0000000100000f90 +CHECK-ARCHIVE: Found valid debug map entry: _inc 0x0000000000000070 => 0x0000000100000f90 +CHECK-ARCHIVE-NEXT: Keeping subprogram DIE: CHECK-ARCHIVE-NEXT: DW_TAG_subprogram CHECK-ARCHIVE-NEXT: DW_AT_name {{.*}}"inc") Modified: 
llvm/trunk/tools/dsymutil/DwarfLinker.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/dsymutil/DwarfLinker.cpp?rev=374123&r1=374122&r2=374123&view=diff ============================================================================== --- llvm/trunk/tools/dsymutil/DwarfLinker.cpp (original) +++ llvm/trunk/tools/dsymutil/DwarfLinker.cpp Tue Oct 8 15:03:13 2019 @@ -578,16 +578,17 @@ bool DwarfLinker::RelocationManager::has const auto &ValidReloc = ValidRelocs[NextValidReloc++]; const auto &Mapping = ValidReloc.Mapping->getValue(); - uint64_t ObjectAddress = Mapping.ObjectAddress - ? uint64_t(*Mapping.ObjectAddress) - : std::numeric_limits::max(); + const uint64_t BinaryAddress = Mapping.BinaryAddress; + const uint64_t ObjectAddress = Mapping.ObjectAddress + ? uint64_t(*Mapping.ObjectAddress) + : std::numeric_limits::max(); if (Linker.Options.Verbose) outs() << "Found valid debug map entry: " << ValidReloc.Mapping->getKey() - << " " - << format("\t%016" PRIx64 " => %016" PRIx64, ObjectAddress, - uint64_t(Mapping.BinaryAddress)); + << "\t" + << format("0x%016" PRIx64 " => 0x%016" PRIx64 "\n", ObjectAddress, + BinaryAddress); - Info.AddrAdjust = int64_t(Mapping.BinaryAddress) + ValidReloc.Addend; + Info.AddrAdjust = BinaryAddress + ValidReloc.Addend; if (Mapping.ObjectAddress) Info.AddrAdjust -= ObjectAddress; Info.InDebugMap = true; @@ -644,7 +645,7 @@ unsigned DwarfLinker::shouldKeepVariable // See if there is a relocation to a valid debug map entry inside // this variable's location. The order is important here. We want to - // always check in the variable has a valid relocation, so that the + // always check if the variable has a valid relocation, so that the // DIEInfo is filled. However, we don't want a static variable in a // function to force us to keep the enclosing function. 
if (!RelocMgr.hasValidRelocation(LocationOffset, LocationEndOffset, MyInfo) || @@ -652,6 +653,7 @@ unsigned DwarfLinker::shouldKeepVariable return Flags; if (Options.Verbose) { + outs() << "Keeping variable DIE:"; DIDumpOptions DumpOpts; DumpOpts.ChildRecurseDepth = 0; DumpOpts.Verbose = Options.Verbose; @@ -688,6 +690,7 @@ unsigned DwarfLinker::shouldKeepSubprogr return Flags; if (Options.Verbose) { + outs() << "Keeping subprogram DIE:"; DIDumpOptions DumpOpts; DumpOpts.ChildRecurseDepth = 0; DumpOpts.Verbose = Options.Verbose; From llvm-commits at lists.llvm.org Tue Oct 8 15:02:27 2019 From: llvm-commits at lists.llvm.org (Volkan Keles via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 22:02:27 +0000 (UTC) Subject: [PATCH] D68479: GlobalISel: Implement fewerElementsVector for G_BUILD_VECTOR In-Reply-To: References: Message-ID: <0a3f48c1393190ff76141e5145aa1236@localhost.localdomain> volkan accepted this revision. volkan added a comment. This revision is now accepted and ready to land. LGTM. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68479/new/ https://reviews.llvm.org/D68479 From llvm-commits at lists.llvm.org Tue Oct 8 15:02:28 2019 From: llvm-commits at lists.llvm.org (Sanjay Patel via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 22:02:28 +0000 (UTC) Subject: [PATCH] D68667: [SLP] respect target register width for GEP vectorization (PR43578) Message-ID: spatel created this revision. spatel added reviewers: craig.topper, RKSimon, xbolva00, ABataev. Herald added subscribers: hiraditya, kristof.beyls, mcrosier. Herald added a project: LLVM. We failed to account for the target register width (max vector factor) when vectorizing starting from GEPs. This causes vectorization to proceed to obviously illegal widths as in: https://bugs.llvm.org/show_bug.cgi?id=43578 For x86, this also means that SLP can produce rogue AVX or AVX512 code even when the user specifies a narrower vector width. 
The AArch64 test in ext-trunc.ll appears to be better using the narrower width. I'm not exactly sure what getelementptr.ll is trying to do, but it's testing with "-slp-threshold=-18", so I'm not worried about those diffs. The x86 test is an over-reduction from SPEC h264; this patch appears to restore the perf loss caused by SLP when using -march=haswell. https://reviews.llvm.org/D68667 Files: llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp llvm/test/Transforms/SLPVectorizer/AArch64/ext-trunc.ll llvm/test/Transforms/SLPVectorizer/AArch64/getelementptr.ll llvm/test/Transforms/SLPVectorizer/X86/load-merge.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D68667.223949.patch Type: text/x-patch Size: 14033 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 15:06:09 2019 From: llvm-commits at lists.llvm.org (Bill Wendling via llvm-commits) Date: Tue, 08 Oct 2019 22:06:09 -0000 Subject: [llvm] r374124 - [IA] Add tests for a few other edge cases Message-ID: <20191008220609.6900782044@lists.llvm.org> Author: void Date: Tue Oct 8 15:06:09 2019 New Revision: 374124 URL: http://llvm.org/viewvc/llvm-project?rev=374124&view=rev Log: [IA] Add tests for a few other edge cases Test with the last eight bits within the range [7F, FF] and with lower-case hex letters. 
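The arithmetic these edge cases rely on can be checked with a short Python sketch. It assumes, as the comments in the test added by this commit state, that only the low eight bits of a multi-digit hex escape are kept; the helper names `low_byte` and `as_octal_escape` are hypothetical and are not part of the assembler.

```python
def low_byte(hex_digits):
    """Return the low eight bits of the value spelled by hex_digits."""
    # int(..., 16) accepts both upper-case and lower-case hex letters.
    return int(hex_digits, 16) & 0xFF


def as_octal_escape(hex_digits):
    """Render the kept byte as a three-digit octal escape."""
    return "\\{:03o}".format(low_byte(hex_digits))


print(as_octal_escape("face"))  # digits of \xface
print(as_octal_escape("0Fe"))   # digits of \x0Fe
print(chr(low_byte("64")) + chr(low_byte("a6B")))  # digits of \x64 and \Xa6B
```

The last line reproduces the existing `"dk"` expectation: 0x64 is `d`, and 0xa6B masked to 0x6B is `k`.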
Modified: llvm/trunk/test/MC/AsmParser/directive_ascii.s Modified: llvm/trunk/test/MC/AsmParser/directive_ascii.s URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/MC/AsmParser/directive_ascii.s?rev=374124&r1=374123&r2=374124&view=diff ============================================================================== --- llvm/trunk/test/MC/AsmParser/directive_ascii.s (original) +++ llvm/trunk/test/MC/AsmParser/directive_ascii.s Tue Oct 8 15:06:09 2019 @@ -42,5 +42,9 @@ TEST6: # CHECK: TEST7: # CHECK: .ascii "dk" +# 0xFACE & 0xFF == 0xCE == 0o316 +# 0x0FE & 0xFF == 0xFE == 0o376 +# CHECK: .ascii "\316\376" TEST7: .ascii "\x64\Xa6B" + .ascii "\xface\x0Fe" From llvm-commits at lists.llvm.org Tue Oct 8 15:03:59 2019 From: llvm-commits at lists.llvm.org (Bill Wendling via llvm-commits) Date: Tue, 8 Oct 2019 15:03:59 -0700 Subject: [PATCH] D68598: [IA] Recognize hexadecimal escape sequences In-Reply-To: <28b04d29ea5f06b6f6f78d0221075d3f@localhost.localdomain> References: <28b04d29ea5f06b6f6f78d0221075d3f@localhost.localdomain> Message-ID: Done: r374124. On Tue, Oct 8, 2019 at 12:01 AM Jian Cai via Phabricator < reviews at reviews.llvm.org> wrote: > jcai19 added inline comments. > > > ================ > Comment at: llvm/test/MC/AsmParser/directive_ascii.s:46 > +TEST7: > + .ascii "\x64\Xa6B" > ---------------- > Thanks for the update. Could we have one more test case with last eight > bits within the range of 7f and ff, and maybe with lower-case letter, e.g. > \x8a? > > > Repository: > rL LLVM > > CHANGES SINCE LAST ACTION > https://reviews.llvm.org/D68598/new/ > > https://reviews.llvm.org/D68598 > > > >
From llvm-commits at lists.llvm.org Tue Oct 8 15:09:52 2019 From: llvm-commits at lists.llvm.org (Vitaly Buka via llvm-commits) Date: Tue, 08 Oct 2019 22:09:52 -0000 Subject: [compiler-rt] r374125 - [sanitizer] Fix crypt.cpp on Android again Message-ID: <20191008220952.216478FBB3@lists.llvm.org> Author: vitalybuka Date: Tue Oct 8 15:09:51 2019 New Revision: 374125 URL: http://llvm.org/viewvc/llvm-project?rev=374125&view=rev Log: [sanitizer] Fix crypt.cpp on Android again Modified: compiler-rt/trunk/test/sanitizer_common/TestCases/Posix/crypt.cpp compiler-rt/trunk/test/sanitizer_common/lit.common.cfg.py Modified: compiler-rt/trunk/test/sanitizer_common/TestCases/Posix/crypt.cpp URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/test/sanitizer_common/TestCases/Posix/crypt.cpp?rev=374125&r1=374124&r2=374125&view=diff ============================================================================== --- compiler-rt/trunk/test/sanitizer_common/TestCases/Posix/crypt.cpp (original) +++ compiler-rt/trunk/test/sanitizer_common/TestCases/Posix/crypt.cpp Tue Oct 8 15:09:51 2019 @@ -1,7 +1,7 @@ -// RUN: %clangxx -O0 -g %s -o %t && %run %t +// RUN: %clangxx -O0 -g %s -o %t -lcrypt && %run %t -// crypt is missing from Android. -// UNSUPPORTED: android +// crypt() is missing from Android and -lcrypt from darwin.
+// UNSUPPORTED: android, darwin #include #include Modified: compiler-rt/trunk/test/sanitizer_common/lit.common.cfg.py URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/test/sanitizer_common/lit.common.cfg.py?rev=374125&r1=374124&r2=374125&view=diff ============================================================================== --- compiler-rt/trunk/test/sanitizer_common/lit.common.cfg.py (original) +++ compiler-rt/trunk/test/sanitizer_common/lit.common.cfg.py Tue Oct 8 15:09:51 2019 @@ -51,9 +51,6 @@ extra_link_flags = [] if config.host_os in ['Linux']: extra_link_flags += ["-ldl"] -if config.host_os in ['Linux', 'NetBSD', 'FreeBSD']: - extra_link_flags += ["-lcrypt"] - clang_cflags = config.debug_info_flags + tool_cflags + [config.target_cflags] clang_cflags += extra_link_flags clang_cxxflags = config.cxx_mode_flags + clang_cflags From llvm-commits at lists.llvm.org Tue Oct 8 15:12:05 2019 From: llvm-commits at lists.llvm.org (Joel E. Denny via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 22:12:05 +0000 (UTC) Subject: [PATCH] D68668: [lit] Extend internal diff to support -U Message-ID: jdenny created this revision. jdenny added reviewers: probinson, stella.stamenova, bd1976llvm, jlpeyton, rnk, mgorny. Herald added a subscriber: delcypher. Herald added a project: LLVM. jdenny added a parent revision: D67643: [lit] Extend internal diff to support `-` argument. jdenny added a child revision: D66506: [lit] Fix internal env calling other internal commands. When using lit's internal shell, RUN lines like the following accidentally execute an external `diff` instead of lit's internal `diff`: # RUN: program | diff -U1 file - Such cases exist now, in `clang/test/Analysis` for example. We are preparing patches to ensure lit's internal `diff` is called in such cases, which will then fail because lit's internal `diff` doesn't recognize `-U` as a command-line option. This patch adds `-U` support. 
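For context on what `-U` controls, here is a minimal sketch using Python's standard difflib module. It shows how the number of context lines changes a unified diff. The helper name and the sample inputs are made up for illustration, and this is not a copy of the code in lit's diff.py.

```python
import difflib


def unified(old_lines, new_lines, context):
    """Return a unified diff with `context` lines around each change.

    `context` plays the role of the N in `diff -UN`.
    """
    return list(difflib.unified_diff(
        old_lines, new_lines,
        fromfile="a", tofile="b",
        n=context, lineterm=""))


if __name__ == "__main__":
    old = ["1", "2", "3", "4", "5"]
    new = ["1", "2", "X", "4", "5"]
    # Comparable to `diff -U1 a b`: one context line on each side of the change.
    for line in unified(old, new, 1):
        print(line)
```

With `context=0` the same call yields only the removed and added lines, which is why a RUN line written for `-U1` cannot simply fall back to a default context width.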
Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D68668 Files: llvm/utils/lit/lit/builtin_commands/diff.py llvm/utils/lit/tests/Inputs/shtest-shell/diff-unified.txt llvm/utils/lit/tests/max-failures.py llvm/utils/lit/tests/shtest-shell.py -------------- next part -------------- A non-text attachment was scrubbed... Name: D68668.223948.patch Type: text/x-patch Size: 7015 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 15:12:05 2019 From: llvm-commits at lists.llvm.org (ham9174@yahoo.com via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 22:12:05 +0000 (UTC) Subject: [PATCH] D68650: [AIX][XCOFF][NFC] Change the SectionLen field name of CSect Auxiliary entry to SectionOrLength. In-Reply-To: References: Message-ID: ham999 added a comment. In D68650#1700496 , @hubert.reinterpretcast wrote: > LGTM. Thanks again. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68650/new/ https://reviews.llvm.org/D68650 From llvm-commits at lists.llvm.org Tue Oct 8 15:12:06 2019 From: llvm-commits at lists.llvm.org (Dan Liew via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 22:12:06 +0000 (UTC) Subject: [PATCH] D68064: [Builtins] Provide a mechanism to selectively disable tests based on whether an implementation is provided by a builtin library. In-Reply-To: References: Message-ID: <6c290ca3f692bab187b407c5a98e64a7@localhost.localdomain> delcypher added a comment. @beanz @steven_wu @arphaman @phosek @dexonsmith @thakis I've improved this patch by annotating as many tests as possible. Okay to land? 
Repository: rCRT Compiler Runtime CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68064/new/ https://reviews.llvm.org/D68064 From llvm-commits at lists.llvm.org Tue Oct 8 15:12:06 2019 From: llvm-commits at lists.llvm.org (Dan Liew via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 22:12:06 +0000 (UTC) Subject: [PATCH] D68064: [Builtins] Provide a mechanism to selectively disable tests based on whether an implementation is provided by a builtin library. In-Reply-To: References: Message-ID: <630d3a5c53762621af8b9cb0aea899a1@localhost.localdomain> delcypher updated this revision to Diff 223950. delcypher added a comment. - Rename `crt_has_*` -> `librt_has_*` for consistency with the `%librt` lit substitution. - Annotate as many tests as possible. - Make the Python fixups suggested by Petr Hosek. Repository: rCRT Compiler Runtime CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68064/new/ https://reviews.llvm.org/D68064 Files: test/builtins/CMakeLists.txt test/builtins/Unit/absvdi2_test.c test/builtins/Unit/absvsi2_test.c test/builtins/Unit/absvti2_test.c test/builtins/Unit/adddf3vfp_test.c test/builtins/Unit/addsf3vfp_test.c test/builtins/Unit/addtf3_test.c test/builtins/Unit/addvdi3_test.c test/builtins/Unit/addvsi3_test.c test/builtins/Unit/addvti3_test.c test/builtins/Unit/ashldi3_test.c test/builtins/Unit/ashlti3_test.c test/builtins/Unit/ashrdi3_test.c test/builtins/Unit/ashrti3_test.c test/builtins/Unit/bswapdi2_test.c test/builtins/Unit/bswapsi2_test.c test/builtins/Unit/clear_cache_test.c test/builtins/Unit/clzdi2_test.c test/builtins/Unit/clzsi2_test.c test/builtins/Unit/clzti2_test.c test/builtins/Unit/cmpdi2_test.c test/builtins/Unit/cmpti2_test.c test/builtins/Unit/comparedf2_test.c test/builtins/Unit/comparesf2_test.c test/builtins/Unit/cpu_model_test.c test/builtins/Unit/ctzdi2_test.c test/builtins/Unit/ctzsi2_test.c test/builtins/Unit/ctzti2_test.c test/builtins/Unit/divdc3_test.c test/builtins/Unit/divdf3_test.c test/builtins/Unit/divdf3vfp_test.c 
test/builtins/Unit/divdi3_test.c test/builtins/Unit/divmodsi4_test.c test/builtins/Unit/divsc3_test.c test/builtins/Unit/divsf3_test.c test/builtins/Unit/divsf3vfp_test.c test/builtins/Unit/divsi3_test.c test/builtins/Unit/divtc3_test.c test/builtins/Unit/divtf3_test.c test/builtins/Unit/divti3_test.c test/builtins/Unit/divxc3_test.c test/builtins/Unit/enable_execute_stack_test.c test/builtins/Unit/eqdf2vfp_test.c test/builtins/Unit/eqsf2vfp_test.c test/builtins/Unit/eqtf2_test.c test/builtins/Unit/extebdsfdf2vfp_test.c test/builtins/Unit/extenddftf2_test.c test/builtins/Unit/extendhfsf2_test.c test/builtins/Unit/extendsfdf2vfp_test.c test/builtins/Unit/extendsftf2_test.c test/builtins/Unit/ffsdi2_test.c test/builtins/Unit/ffssi2_test.c test/builtins/Unit/ffsti2_test.c test/builtins/Unit/fixdfdi_test.c test/builtins/Unit/fixdfsivfp_test.c test/builtins/Unit/fixdfti_test.c test/builtins/Unit/fixsfdi_test.c test/builtins/Unit/fixsfsivfp_test.c test/builtins/Unit/fixsfti_test.c test/builtins/Unit/fixtfdi_test.c test/builtins/Unit/fixtfsi_test.c test/builtins/Unit/fixtfti_test.c test/builtins/Unit/fixunsdfdi_test.c test/builtins/Unit/fixunsdfsi_test.c test/builtins/Unit/fixunsdfsivfp_test.c test/builtins/Unit/fixunsdfti_test.c test/builtins/Unit/fixunssfdi_test.c test/builtins/Unit/fixunssfsi_test.c test/builtins/Unit/fixunssfsivfp_test.c test/builtins/Unit/fixunssfti_test.c test/builtins/Unit/fixunstfdi_test.c test/builtins/Unit/fixunstfsi_test.c test/builtins/Unit/fixunstfti_test.c test/builtins/Unit/fixunsxfdi_test.c test/builtins/Unit/fixunsxfsi_test.c test/builtins/Unit/fixunsxfti_test.c test/builtins/Unit/fixxfdi_test.c test/builtins/Unit/fixxfti_test.c test/builtins/Unit/floatdidf_test.c test/builtins/Unit/floatdisf_test.c test/builtins/Unit/floatditf_test.c test/builtins/Unit/floatdixf_test.c test/builtins/Unit/floatsidfvfp_test.c test/builtins/Unit/floatsisfvfp_test.c test/builtins/Unit/floatsitf_test.c test/builtins/Unit/floattidf_test.c 
test/builtins/Unit/floattisf_test.c test/builtins/Unit/floattitf_test.c test/builtins/Unit/floattixf_test.c test/builtins/Unit/floatundidf_test.c test/builtins/Unit/floatundisf_test.c test/builtins/Unit/floatunditf_test.c test/builtins/Unit/floatundixf_test.c test/builtins/Unit/floatunsitf_test.c test/builtins/Unit/floatunssidfvfp_test.c test/builtins/Unit/floatunssisfvfp_test.c test/builtins/Unit/floatuntidf_test.c test/builtins/Unit/floatuntisf_test.c test/builtins/Unit/floatuntitf_test.c test/builtins/Unit/floatuntixf_test.c test/builtins/Unit/gedf2vfp_test.c test/builtins/Unit/gesf2vfp_test.c test/builtins/Unit/getf2_test.c test/builtins/Unit/gtdf2vfp_test.c test/builtins/Unit/gtsf2vfp_test.c test/builtins/Unit/gttf2_test.c test/builtins/Unit/ledf2vfp_test.c test/builtins/Unit/lesf2vfp_test.c test/builtins/Unit/letf2_test.c test/builtins/Unit/lit.cfg.py test/builtins/Unit/lit.site.cfg.py.in test/builtins/Unit/lshrdi3_test.c test/builtins/Unit/lshrti3_test.c test/builtins/Unit/ltdf2vfp_test.c test/builtins/Unit/ltsf2vfp_test.c test/builtins/Unit/lttf2_test.c test/builtins/Unit/moddi3_test.c test/builtins/Unit/modsi3_test.c test/builtins/Unit/modti3_test.c test/builtins/Unit/muldc3_test.c test/builtins/Unit/muldf3vfp_test.c test/builtins/Unit/muldi3_test.c test/builtins/Unit/mulodi4_test.c test/builtins/Unit/mulosi4_test.c test/builtins/Unit/muloti4_test.c test/builtins/Unit/mulsc3_test.c test/builtins/Unit/mulsf3vfp_test.c test/builtins/Unit/multc3_test.c test/builtins/Unit/multf3_test.c test/builtins/Unit/multi3_test.c test/builtins/Unit/mulvdi3_test.c test/builtins/Unit/mulvsi3_test.c test/builtins/Unit/mulvti3_test.c test/builtins/Unit/mulxc3_test.c test/builtins/Unit/nedf2vfp_test.c test/builtins/Unit/negdf2vfp_test.c test/builtins/Unit/negdi2_test.c test/builtins/Unit/negsf2vfp_test.c test/builtins/Unit/negti2_test.c test/builtins/Unit/negvdi2_test.c test/builtins/Unit/negvsi2_test.c test/builtins/Unit/negvti2_test.c test/builtins/Unit/nesf2vfp_test.c 
test/builtins/Unit/netf2_test.c test/builtins/Unit/paritydi2_test.c test/builtins/Unit/paritysi2_test.c test/builtins/Unit/parityti2_test.c test/builtins/Unit/popcountdi2_test.c test/builtins/Unit/popcountsi2_test.c test/builtins/Unit/popcountti2_test.c test/builtins/Unit/powidf2_test.c test/builtins/Unit/powisf2_test.c test/builtins/Unit/powitf2_test.c test/builtins/Unit/powixf2_test.c test/builtins/Unit/subdf3vfp_test.c test/builtins/Unit/subsf3vfp_test.c test/builtins/Unit/subtf3_test.c test/builtins/Unit/subvdi3_test.c test/builtins/Unit/subvsi3_test.c test/builtins/Unit/subvti3_test.c test/builtins/Unit/trampoline_setup_test.c test/builtins/Unit/truncdfhf2_test.c test/builtins/Unit/truncdfsf2_test.c test/builtins/Unit/truncdfsf2vfp_test.c test/builtins/Unit/truncsfhf2_test.c test/builtins/Unit/trunctfdf2_test.c test/builtins/Unit/trunctfsf2_test.c test/builtins/Unit/ucmpdi2_test.c test/builtins/Unit/ucmpti2_test.c test/builtins/Unit/udivdi3_test.c test/builtins/Unit/udivmoddi4_test.c test/builtins/Unit/udivmodsi4_test.c test/builtins/Unit/udivmodti4_test.c test/builtins/Unit/udivsi3_test.c test/builtins/Unit/udivti3_test.c test/builtins/Unit/umoddi3_test.c test/builtins/Unit/umodsi3_test.c test/builtins/Unit/umodti3_test.c test/builtins/Unit/unorddf2vfp_test.c test/builtins/Unit/unordsf2vfp_test.c test/builtins/Unit/unordtf2_test.c -------------- next part -------------- A non-text attachment was scrubbed... Name: D68064.223950.patch Type: text/x-patch Size: 80401 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 15:12:06 2019 From: llvm-commits at lists.llvm.org (Jordan Rupprecht via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 22:12:06 +0000 (UTC) Subject: [PATCH] D68669: [llvm-objdump][WIP] Make llvm-objdump -h compatible with GNU objdump. Message-ID: rupprecht created this revision. Herald added subscribers: llvm-commits, cfe-commits, seiya, arphaman, jakehehrlich, aheejin, arichardson, sbc100, emaste. 
Herald added a reviewer: espindola. Herald added a reviewer: alexshap. Herald added a reviewer: jhenderson. Herald added projects: clang, LLVM. rupprecht planned changes to this revision. rupprecht added a comment. Note: herald added reviewers, but this patch is just to provide context. I'll send the real patches for review in the coming days. Note: this patch is large and not intended for submission as-is. Instead, this patch presents a poor implementation that makes llvm-objdump GNU compatible for this option (with all existing tests passing, but few added tests, hacky code, etc.), and will serve as context for smaller changes to be submitted separately with more careful review. llvm-objdump -h was implemented to be similar to readelf -S (see rL141579 ). However, it is not completely compatible with that, and anyone that does want headers displayed that way can use llvm-readelf -S now that it exists. Make llvm-objdump compatible with GNU objdump instead. A brief overview of changes:
- Add file offset (with implementations for all filetypes supported by llvm-readobj).
- Add 2**n section alignment column (with implementations for all filetypes supported by llvm-readobj).
- Section numbers are not the actual section numbers, but something different, corresponding to libbfd section numbers. llvm-readelf -s prints the actual section numbers if those are desired.
- Filter out certain sections like symtabs/strtabs/relocs. The actual logic is a lot more convoluted (and probably isn't fully compatible, but is pretty close).
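To illustrate the "2**n section alignment column" item in the overview above, here is a small sketch of how a byte alignment maps to that notation. The function names are hypothetical, and treating an alignment of 0 like 1 (no constraint) is an assumption about ELF-style inputs, not something taken from the patch.

```python
def alignment_exponent(alignment):
    """Return n such that alignment == 2**n.

    Assumption: 0 is treated like 1 (no alignment constraint).
    Raises ValueError for values that are not powers of two.
    """
    if alignment == 0:
        alignment = 1
    if alignment < 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a power of two: %r" % alignment)
    return alignment.bit_length() - 1


def format_algn(alignment):
    """Render an alignment the way a 2**n column would show it."""
    return "2**%d" % alignment_exponent(alignment)


if __name__ == "__main__":
    for a in (1, 4, 16, 4096):
        print(a, format_algn(a))
```

A section aligned to 16 bytes would therefore show `2**4` in such a column.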
Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D68669 Files: clang/test/Modules/pch_container.m lld/test/ELF/bss-start-common.s lld/test/ELF/edata-etext.s lld/test/ELF/edata-no-bss.s lld/test/ELF/emit-relocs-gc.s lld/test/ELF/gc-sections-metadata.s lld/test/ELF/init_fini_priority.s lld/test/ELF/invalid-fde-rel.s lld/test/ELF/linkerscript/addr.test lld/test/ELF/linkerscript/align-empty.test lld/test/ELF/linkerscript/align1.test lld/test/ELF/linkerscript/align2.test lld/test/ELF/linkerscript/align3.test lld/test/ELF/linkerscript/at2.test lld/test/ELF/linkerscript/constructor.test lld/test/ELF/linkerscript/define.test lld/test/ELF/linkerscript/double-bss.test lld/test/ELF/linkerscript/eh-frame-emit-relocs.s lld/test/ELF/linkerscript/emit-reloc-section-names.s lld/test/ELF/linkerscript/expr-sections.test lld/test/ELF/linkerscript/input-sec-dup.s lld/test/ELF/linkerscript/insert-after.test lld/test/ELF/linkerscript/insert-before.test lld/test/ELF/linkerscript/memory-include.test lld/test/ELF/linkerscript/memory.s lld/test/ELF/linkerscript/memory3.s lld/test/ELF/linkerscript/memory4.test lld/test/ELF/linkerscript/memory5.test lld/test/ELF/linkerscript/multi-sections-constraint.s lld/test/ELF/linkerscript/non-absolute2.test lld/test/ELF/linkerscript/numbers.s lld/test/ELF/linkerscript/orphan.s lld/test/ELF/linkerscript/orphans.s lld/test/ELF/linkerscript/out-of-order-section-in-region.test lld/test/ELF/linkerscript/out-of-order.s lld/test/ELF/linkerscript/output-section-include.test lld/test/ELF/linkerscript/region-alias.s lld/test/ELF/linkerscript/repsection-va.s lld/test/ELF/linkerscript/section-include.test lld/test/ELF/linkerscript/sections-constraint.s lld/test/ELF/linkerscript/sections-gc2.s lld/test/ELF/linkerscript/sections-keep.s lld/test/ELF/linkerscript/sections-sort.s lld/test/ELF/linkerscript/sections.s lld/test/ELF/linkerscript/sizeof.s lld/test/ELF/linkerscript/symbol-only.test lld/test/ELF/linkerscript/va.s 
lld/test/ELF/linkerscript/wildcards.s lld/test/ELF/linkerscript/wildcards2.s lld/test/ELF/relocatable-sections.s lld/test/ELF/relocatable.s lld/test/ELF/relro-omagic.s lld/test/ELF/section-name.s lld/test/ELF/sectionstart-noallochdr.s lld/test/ELF/sectionstart.s lld/test/ELF/strip-all.s lld/test/ELF/synthetic-got.s llvm/test/MC/COFF/assoc-private.s llvm/test/Object/objdump-no-sectionheaders.test llvm/test/Object/objdump-sectionheaders.test llvm/test/ObjectYAML/CodeView/sections.yaml llvm/test/tools/llvm-objcopy/ELF/symtab-error-on-remove-strtab.test llvm/test/tools/llvm-objdump/X86/adjust-vma.test llvm/test/tools/llvm-objdump/X86/macho-section-headers.test llvm/test/tools/llvm-objdump/X86/phdrs-lma.test llvm/test/tools/llvm-objdump/X86/phdrs-lma2.test llvm/test/tools/llvm-objdump/X86/section-index.s llvm/test/tools/llvm-objdump/section-filter.test llvm/test/tools/llvm-objdump/wasm.txt llvm/test/tools/llvm-objdump/xcoff-section-headers.test llvm/tools/llvm-objdump/llvm-objdump.cpp llvm/tools/llvm-objdump/llvm-objdump.h -------------- next part -------------- A non-text attachment was scrubbed... Name: D68669.223951.patch Type: text/x-patch Size: 87616 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 15:12:06 2019 From: llvm-commits at lists.llvm.org (Jordan Rupprecht via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 22:12:06 +0000 (UTC) Subject: [PATCH] D68669: [llvm-objdump][WIP] Make llvm-objdump -h compatible with GNU objdump. In-Reply-To: References: Message-ID: <93ea683d5eed00c59a8c5c932dc021f6@localhost.localdomain> rupprecht planned changes to this revision. rupprecht added a comment. Note: herald added reviewers, but this patch is just to provide context. I'll send the real patches for review in the coming days. 
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68669/new/ https://reviews.llvm.org/D68669 From llvm-commits at lists.llvm.org Tue Oct 8 15:12:22 2019 From: llvm-commits at lists.llvm.org (Wei Mi via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 22:12:22 +0000 (UTC) Subject: [PATCH] D68601: [SampleFDO] Add indexing for function profiles so they can be loaded on demand in ExtBinary format In-Reply-To: References: Message-ID: <49652b2dfc2033edc8fb860880a812f9@localhost.localdomain> wmi marked 2 inline comments as done. wmi added a comment. In D68601#1699113 , @wenlei wrote: > Thanks for making indexing available for extended binary format. We've recently adding indexing to binary format as well in an internal patch, and also observed similar (~30%) build time reduction for some large services. Thanks for providing the data! > We also have have the need to differentiate dead/cold symbols vs new symbols, so symbol list in extended binary format is useful to us as well - will try it out. (I noticed that a small change in profile generation tool (the equivalent of https://github.com/google/autofdo) is needed to populate that list though, it'd be nice if these are all part of LLVM) You can use llvm-profdata to attach the profile symbol list section. Symbol list can be provided to llvm-profdata in a plain text file. > Left some comments inline. In addition, I think `llvm-profdata/Inputs/sample-profile.proftext` need to be updated as well to include the new section for `roundtrip.test` The roundtrip test doesn't contain any golden file in extbinary format. It generates the extbinary format profile on the fly. The input sample-profile.proftext doesn't need change. 
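As a rough picture of the idea under review in D68601 (an offset table that lets a reader load only the function profiles a module needs), here is a toy sketch. The byte layout, the function names, and the table shape are all invented for illustration. The real ExtBinary format and the SampleProfReader code differ.

```python
def build_profile(funcs):
    """Serialize {name: payload bytes} into (blob, offset_table).

    Toy layout: each record is a 4-byte little-endian length plus the payload.
    """
    blob = bytearray()
    offsets = {}
    for name, body in funcs.items():
        offsets[name] = len(blob)
        blob += len(body).to_bytes(4, "little") + body
    return bytes(blob), offsets


def read_func_profile(blob, offset):
    """Read one record at `offset`, checking it stays inside the blob."""
    size = int.from_bytes(blob[offset:offset + 4], "little")
    start = offset + 4
    if start + size > len(blob):
        raise ValueError("record runs past the end of the section")
    return blob[start:start + size]


def load_on_demand(blob, offsets, funcs_to_use):
    """Load only the records named in `funcs_to_use`; skip unknown names."""
    loaded = {}
    for name in funcs_to_use:
        offset = offsets.get(name)
        if offset is None:
            continue  # no profile for this function
        loaded[name] = read_func_profile(blob, offset)
    return loaded


if __name__ == "__main__":
    blob, table = build_profile({"foo": b"aa", "bar": b"bbbb", "baz": b"c"})
    print(load_on_demand(blob, table, {"bar", "missing"}))
```

The bounds check in `read_func_profile` corresponds loosely to the reviewer's request that random access never touch anything beyond the section size.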
================ Comment at: llvm/include/llvm/ProfileData/SampleProfWriter.h:206 virtual void initSectionLayout() override { SectionLayout = {{SecProfSummary, 0, 0, 0}, {SecNameTable, 0, 0, 0}, ---------------- wenlei wrote: > With the addition of offset table at the end of sections but in the middle of section header, I found the name SectionLayout a bit confusing. This array actually represents the layout/order of section header (e.g. the reader order), but not the section (payload) layout (the writer order). I guess renaming it SectionHdrLayout or something alike may help.. > > Besides, some comments explaining why SecFuncOffsetTable need to be in the middle could help too, it looks like a hidden trick until I saw the comments in SampleProfileReaderExtBinaryBase::getFileSize(). Indeed SectionHdrLayout is a better name. I changed it and added comment to explain why SecFuncOffsetTable is after SecLBRProfile in the profile but is before SecLBRProfile in the section header table. ================ Comment at: llvm/lib/ProfileData/SampleProfReader.cpp:547-548 + continue; + Data = Start + iter->second; + if (std::error_code EC = readFuncProfile()) + return EC; ---------------- wenlei wrote: > nit: similar to line 539, we could assert the random access here never attempts to touch anything beyond Size? assertion added. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68601/new/ https://reviews.llvm.org/D68601 From llvm-commits at lists.llvm.org Tue Oct 8 15:12:28 2019 From: llvm-commits at lists.llvm.org (Francis Visoiu Mistrih via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 22:12:28 +0000 (UTC) Subject: [PATCH] D68611: [IRGen] Emit lifetime markers for temporary struct allocas In-Reply-To: References: Message-ID: <2e20dfac62dee12fc2c438d61e0de315@localhost.localdomain> This revision was automatically updated to reflect the committed changes. thegameg marked an inline comment as done. 
Closed by commit rG143f6b837790: [IRGen] Emit lifetime markers for temporary struct allocas (authored by thegameg). Herald added a project: clang. Herald added a subscriber: cfe-commits. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68611/new/ https://reviews.llvm.org/D68611 Files: clang/lib/CodeGen/CGCall.cpp clang/test/CodeGen/aarch64-byval-temp.c -------------- next part -------------- A non-text attachment was scrubbed... Name: D68611.223952.patch Type: text/x-patch Size: 7414 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 15:28:14 2019 From: llvm-commits at lists.llvm.org (Roman Lebedev via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 22:28:14 +0000 (UTC) Subject: [PATCH] D48498: [APInt] Add helpers for rounding u/sdivs. In-Reply-To: References: Message-ID: lebedev.ri added inline comments. Herald added subscribers: sanjoy.google, kristina, dexonsmith. Herald added a project: LLVM. ================ Comment at: llvm/trunk/unittests/ADT/APIntTest.cpp:2299 + + for (uint64_t Bi = -128; Bi <= 127; Bi++) { + if (Bi == 0) ---------------- This doesn't do what you think it does.. In reality this inner loop simply never runs. ================ Comment at: llvm/trunk/unittests/ADT/APIntTest.cpp:2307 + auto Prod = Quo.sext(16) * B.sext(16); + EXPECT_TRUE(Prod.uge(A)); + if (Prod.ugt(A)) { ---------------- This always asserts. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D48498/new/ https://reviews.llvm.org/D48498 From llvm-commits at lists.llvm.org Tue Oct 8 15:28:15 2019 From: llvm-commits at lists.llvm.org (Matt Arsenault via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 22:28:15 +0000 (UTC) Subject: [PATCH] D29295: Move core RDF files from lib/Target/Hexagon to CodeGen In-Reply-To: References: Message-ID: <634fe29a88b4dc745be6eb2c7558564c@localhost.localdomain> arsenm added a comment. Herald added subscribers: steven.zhang, mgrang, kristof.beyls. 
Herald added a project: LLVM. Is there any progress on this? Can you post a rebased version? Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D29295/new/ https://reviews.llvm.org/D29295 From llvm-commits at lists.llvm.org Tue Oct 8 15:28:15 2019 From: llvm-commits at lists.llvm.org (Wei Mi via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 22:28:15 +0000 (UTC) Subject: [PATCH] D68601: [SampleFDO] Add indexing for function profiles so they can be loaded on demand in ExtBinary format In-Reply-To: References: Message-ID: <84d12eb07a0bb0bb532667895807bbd1@localhost.localdomain> wmi updated this revision to Diff 223953. wmi added a comment. Address Wenlei's comment. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68601/new/ https://reviews.llvm.org/D68601 Files: llvm/include/llvm/ProfileData/SampleProf.h llvm/include/llvm/ProfileData/SampleProfReader.h llvm/include/llvm/ProfileData/SampleProfWriter.h llvm/lib/ProfileData/SampleProfReader.cpp llvm/lib/ProfileData/SampleProfWriter.cpp llvm/test/Transforms/SampleProfile/Inputs/inline.extbinary.afdo llvm/test/Transforms/SampleProfile/Inputs/profsampleacc.extbinary.afdo llvm/unittests/ProfileData/SampleProfTest.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D68601.223953.patch Type: text/x-patch Size: 16400 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 15:28:16 2019 From: llvm-commits at lists.llvm.org (Jan Korous via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 22:28:16 +0000 (UTC) Subject: [PATCH] D68093: [clang-scan-deps][static analyzer] Support for clang --analyze in scan-deps In-Reply-To: References: Message-ID: <8dd61ae3ba66b888ef3329d9bd4e5244@localhost.localdomain> jkorous added a comment. @NoQ , @hiraditya any suggestions for the option name and/or description? 
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68093/new/ https://reviews.llvm.org/D68093 From llvm-commits at lists.llvm.org Tue Oct 8 15:45:24 2019 From: llvm-commits at lists.llvm.org (Sourabh Singh Tomar via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 22:45:24 +0000 (UTC) Subject: [PATCH] D68117: [DWARF-5] Support for C++11 defaulted, deleted member functions. In-Reply-To: References: Message-ID: <0128760c632e42df19eef336b5ec29cc@localhost.localdomain> SouraVX updated this revision to Diff 223954. SouraVX added a comment. Addressed comments, regarding flags. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68117/new/ https://reviews.llvm.org/D68117 Files: clang/lib/CodeGen/CGDebugInfo.cpp clang/test/CodeGenCXX/dbg-info-all-calls-described.cpp clang/test/CodeGenCXX/debug-info-defaulted-in-class.cpp clang/test/CodeGenCXX/debug-info-defaulted-out-of-class.cpp clang/test/CodeGenCXX/debug-info-deleted.cpp clang/test/CodeGenCXX/debug-info-not-defaulted.cpp llvm/include/llvm/BinaryFormat/Dwarf.h llvm/include/llvm/IR/DebugInfoFlags.def llvm/include/llvm/IR/DebugInfoMetadata.h llvm/lib/BinaryFormat/Dwarf.cpp llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp llvm/lib/IR/DebugInfoMetadata.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D68117.223954.patch Type: text/x-patch Size: 16435 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 15:45:26 2019 From: llvm-commits at lists.llvm.org (Reid Kleckner via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 22:45:26 +0000 (UTC) Subject: [PATCH] D68668: [lit] Extend internal diff to support -U In-Reply-To: References: Message-ID: rnk accepted this revision. rnk added a comment. This revision is now accepted and ready to land. 
lgtm Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68668/new/ https://reviews.llvm.org/D68668 From llvm-commits at lists.llvm.org Tue Oct 8 15:45:26 2019 From: llvm-commits at lists.llvm.org (Wenlei He via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 22:45:26 +0000 (UTC) Subject: [PATCH] D68601: [SampleFDO] Add indexing for function profiles so they can be loaded on demand in ExtBinary format In-Reply-To: References: Message-ID: <00dc00b053b155ca4c14502532f407cd@localhost.localdomain> wenlei accepted this revision. wenlei added a comment. This revision is now accepted and ready to land. LGTM. Thanks! > Symbol list can be provided to llvm-profdata in a plain text file. I thought it would be more convenient to have the PSL auto-populated by the tool that generates the AutoFDO profile. Is there any reason for not using an auto-generated PSL and instead providing a plain text file as side input? Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68601/new/ https://reviews.llvm.org/D68601 From llvm-commits at lists.llvm.org Tue Oct 8 15:45:27 2019 From: llvm-commits at lists.llvm.org (Sourabh Singh Tomar via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 22:45:27 +0000 (UTC) Subject: [PATCH] D68117: [DWARF-5] Support for C++11 defaulted, deleted member functions. In-Reply-To: References: Message-ID: SouraVX marked 3 inline comments as done. SouraVX added inline comments. ================ Comment at: clang/lib/CodeGen/CGDebugInfo.cpp:1619 + else { + SPFlags |= llvm::DISubprogram::SPFlagNotDefaulted; + } ---------------- SPFlagNotDefaulted was previously given the same value as SPFlagZero, its normal value, to save a bit. Hence this flag does not appear in the generated IR; 0 is emitted instead. As a result, test cases that check for DISPFlagNotDefaulted in the IR are failing. 
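The behaviour described in the comment above follows from how bit flags work in general: a flag whose value is zero cannot be told apart from "no flags" once it has been OR-ed in. The sketch below shows this with plain integers. The constant names, bit positions, and printer are illustrative stand-ins, not LLVM's definitions.

```python
# Illustrative values only. The point is that NOT_DEFAULTED aliases zero.
SP_FLAG_ZERO = 0
SP_FLAG_NOT_DEFAULTED = 0          # shares the value of SP_FLAG_ZERO
SP_FLAG_LOCAL_TO_UNIT = 1 << 2     # hypothetical bit position
SP_FLAG_DEFINITION = 1 << 3        # hypothetical bit position

FLAG_NAMES = [
    (SP_FLAG_LOCAL_TO_UNIT, "DISPFlagLocalToUnit"),
    (SP_FLAG_DEFINITION, "DISPFlagDefinition"),
]


def print_sp_flags(flags):
    """Render a flag word as names joined by ' | ', or '0' if no bit is set."""
    names = [name for bit, name in FLAG_NAMES if flags & bit]
    return " | ".join(names) if names else "0"


if __name__ == "__main__":
    # OR-ing in a zero-valued flag changes nothing, so a printer has no
    # bit from which to recover the name "DISPFlagNotDefaulted".
    print(print_sp_flags(SP_FLAG_ZERO | SP_FLAG_NOT_DEFAULTED))
    print(print_sp_flags(SP_FLAG_DEFINITION | SP_FLAG_NOT_DEFAULTED))
```

A textual check for the flag name can therefore never match, while a query that compares the relevant bits against zero still works.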
================ Comment at: clang/test/CodeGenCXX/debug-info-not-defaulted.cpp:9 + +// ATTR: DISubprogram(name: "not_defaulted", {{.*}}, flags: DIFlagPublic | DIFlagPrototyped, spFlags: DISPFlagNotDefaulted +// ATTR: DISubprogram(name: "not_defaulted", {{.*}}, flags: DIFlagPublic | DIFlagPrototyped, spFlags: DISPFlagNotDefaulted ---------------- This test case is failing, checking DISPFlagNotDefaulted. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68117/new/ https://reviews.llvm.org/D68117 From llvm-commits at lists.llvm.org Tue Oct 8 15:49:35 2019 From: llvm-commits at lists.llvm.org (Jordan Rupprecht via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 22:49:35 +0000 (UTC) Subject: [PATCH] D67999: Fix `compiler_rt_logbf_test.c` test failure for Builtins-i386-darwin test suite. In-Reply-To: References: Message-ID: <44f392950d17ac40cca2b0680cf2c1b2@localhost.localdomain> rupprecht added a comment. In D67999#1690052 , @delcypher wrote: > @rupprecht I talked with @scanon offline and he came up with a much simpler solution. Could you re-approve this change if you're happy with it? Sorry, didn't see this earlier. The committed patch LGTM! Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67999/new/ https://reviews.llvm.org/D67999 From llvm-commits at lists.llvm.org Tue Oct 8 15:58:24 2019 From: llvm-commits at lists.llvm.org (David Li via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 22:58:24 +0000 (UTC) Subject: [PATCH] D68601: [SampleFDO] Add indexing for function profiles so they can be loaded on demand in ExtBinary format In-Reply-To: References: Message-ID: <596c32467ca99a121adfc4f8b9e70f96@localhost.localdomain> davidxl added inline comments. ================ Comment at: llvm/include/llvm/ProfileData/SampleProfReader.h:554 + /// Collect functions to be used when compiling Module \p M. 
+ void collectFuncsToUse(const Module &M) override; }; ---------------- Nit: collectFuncsFrom(const Module &M) ================ Comment at: llvm/lib/ProfileData/SampleProfReader.cpp:507 + for (auto &F : M) { + StringRef CanonName = FunctionSamples::getCanonicalFnName(F); + FuncsToUse.insert(CanonName); ---------------- Skip declarations? ================ Comment at: llvm/lib/ProfileData/SampleProfReader.cpp:533 +std::error_code SampleProfileReaderExtBinary::readFuncProfiles(uint64_t Size) { + const uint8_t *Start = Data; + if (UseAllFuncs) { ---------------- End = Data + Size ================ Comment at: llvm/lib/ProfileData/SampleProfReader.cpp:536 + while (Data < Start + Size) { + if (std::error_code EC = readFuncProfile()) + return EC; ---------------- It is more readable if readFuncProfile is taking the pointer to the data address: readFuncProfile(&Data); Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68601/new/ https://reviews.llvm.org/D68601 From llvm-commits at lists.llvm.org Tue Oct 8 15:58:24 2019 From: llvm-commits at lists.llvm.org (Sourabh Singh Tomar via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 22:58:24 +0000 (UTC) Subject: [PATCH] D68117: [DWARF-5] Support for C++11 defaulted, deleted member functions. In-Reply-To: References: Message-ID: <107addd5ca08506fb778196f9ec0825e@localhost.localdomain> SouraVX marked 2 inline comments as done. SouraVX added inline comments. ================ Comment at: clang/test/CodeGenCXX/debug-info-defaulted-out-of-class.cpp:25 + + //FIXME -- clang will not mark above member funtions, excluding constructors + // as out of class. If we did not mark destructor or other member functions ---------------- This is the case, checking for Out of class definition. I've been mentioning in llvm-dev mails. 
================
Comment at: clang/test/CodeGenCXX/debug-info-not-defaulted.cpp:9
+
+// ATTR: DISubprogram(name: "not_defaulted", {{.*}}, flags: DIFlagPublic | DIFlagPrototyped, spFlags: DISPFlagNotDefaulted
+// ATTR: DISubprogram(name: "not_defaulted", {{.*}}, flags: DIFlagPublic | DIFlagPrototyped, spFlags: DISPFlagNotDefaulted
----------------
SouraVX wrote:
> This test case is failing, checking DISPFlagNotDefaulted.

Please note that the backend and llvm-dwarfdump are fine without this. Since its value is '0', we are able to query it using isNotDefaulted(); hence the attribute DW_AT_defaulted with value DW_DEFAULTED_no is set, emitted, and dumped correctly by llvm-dwarfdump.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68117/new/

https://reviews.llvm.org/D68117

From llvm-commits at lists.llvm.org Tue Oct 8 16:08:19 2019
From: llvm-commits at lists.llvm.org (Nico Weber via llvm-commits)
Date: Tue, 08 Oct 2019 23:08:19 -0000
Subject: [llvm] r374129 - gn build: unbreak libcxx build after r374116 by restoring gen_link_script.py for gn
Message-ID: <20191008230819.4F4498891C@lists.llvm.org>

Author: nico
Date: Tue Oct 8 16:08:18 2019
New Revision: 374129

URL: http://llvm.org/viewvc/llvm-project?rev=374129&view=rev
Log:
gn build: unbreak libcxx build after r374116 by restoring gen_link_script.py for gn

Added:
    llvm/trunk/utils/gn/secondary/libcxx/utils/
    llvm/trunk/utils/gn/secondary/libcxx/utils/gen_link_script.py (with props)
Modified:
    llvm/trunk/utils/gn/secondary/libcxx/src/BUILD.gn

Modified: llvm/trunk/utils/gn/secondary/libcxx/src/BUILD.gn
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/gn/secondary/libcxx/src/BUILD.gn?rev=374129&r1=374128&r2=374129&view=diff
==============================================================================
--- llvm/trunk/utils/gn/secondary/libcxx/src/BUILD.gn (original)
+++ llvm/trunk/utils/gn/secondary/libcxx/src/BUILD.gn Tue Oct 8 16:08:18 2019
@@ -230,7 +230,7 @@ if (libcxx_enable_shared) {
   if (libcxx_enable_abi_linker_script) {
     action("cxx_linker_script") {
-      script = "//libcxx/utils/gen_link_script.py"
+      script = "//llvm/utils/gn/secondary/libcxx/utils/gen_link_script.py"
       outputs = [
         "$runtimes_dir/libc++.so",
       ]

Added: llvm/trunk/utils/gn/secondary/libcxx/utils/gen_link_script.py
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/gn/secondary/libcxx/utils/gen_link_script.py?rev=374129&view=auto
==============================================================================
--- llvm/trunk/utils/gn/secondary/libcxx/utils/gen_link_script.py (added)
+++ llvm/trunk/utils/gn/secondary/libcxx/utils/gen_link_script.py Tue Oct 8 16:08:18 2019
@@ -0,0 +1,50 @@
+#!/usr/bin/env python
+#===----------------------------------------------------------------------===##
+#
+# Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+# See https://llvm.org/LICENSE.txt for license information.
+# SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+#
+#===----------------------------------------------------------------------===##
+
+"""
+Generate a linker script that links libc++ to the proper ABI library.
+An example script for c++abi would look like "INPUT(libc++.so.1 -lc++abi)".
+"""
+
+import argparse
+import os
+import sys
+
+
+def main():
+    parser = argparse.ArgumentParser(description=__doc__)
+    parser.add_argument("--input", help="Path to libc++ library", required=True)
+    parser.add_argument("--output", help="Path to libc++ linker script",
+                        required=True)
+    parser.add_argument("libraries", nargs="+",
+                        help="List of libraries libc++ depends on")
+    args = parser.parse_args()
+
+    # Use the relative path for the libc++ library.
+    libcxx = os.path.relpath(args.input, os.path.dirname(args.output))
+
+    # Prepare the list of public libraries to link.
+    public_libs = ['-l%s' % l for l in args.libraries]
+
+    # Generate the linker script contents.
+    contents = "INPUT(%s)" % ' '.join([libcxx] + public_libs)
+
+    # Remove the existing libc++ symlink if it exists.
+    if os.path.islink(args.output):
+        os.unlink(args.output)
+
+    # Replace it with the linker script.
+    with open(args.output, 'w') as f:
+        f.write(contents + "\n")
+
+    return 0
+
+
+if __name__ == '__main__':
+    sys.exit(main())

Propchange: llvm/trunk/utils/gn/secondary/libcxx/utils/gen_link_script.py
------------------------------------------------------------------------------
    svn:executable = *

From llvm-commits at lists.llvm.org Tue Oct 8 16:08:50 2019
From: llvm-commits at lists.llvm.org (Jian Cai via Phabricator via llvm-commits)
Date: Tue, 08 Oct 2019 23:08:50 +0000 (UTC)
Subject: [PATCH] D68598: [IA] Recognize hexadecimal escape sequences
In-Reply-To:
References:
Message-ID:

jcai19 added inline comments.

================
Comment at: llvm/test/MC/AsmParser/directive_ascii.s:46
+TEST7:
+ .ascii "\x64\Xa6B"
----------------
jcai19 wrote:
> Thanks for the update. Could we have one more test case with last eight bits within the range of 7f and ff, and maybe with lower-case letter, e.g. \x8a?

Test cases added at r374124.

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68598/new/

https://reviews.llvm.org/D68598

From llvm-commits at lists.llvm.org Tue Oct 8 16:18:19 2019
From: llvm-commits at lists.llvm.org (Alina Sbirlea via Phabricator via llvm-commits)
Date: Tue, 08 Oct 2019 23:18:19 +0000 (UTC)
Subject: [PATCH] D68670: [Utils] Cleanup similar cases to MergeBlockIntoPredecessor.
Message-ID:

asbirlea created this revision.
asbirlea added reviewers: chandlerc, sanjoy.google, george.burgess.iv.
Herald added a project: LLVM.

There are two cases where a block is merged into its predecessor and the MergeBlockIntoPredecessor API is not used. Update the API so it can be reused in the other cases, in order to avoid code duplication.
Cleanup motivated by D68659.

Repository: rL LLVM https://reviews.llvm.org/D68670 Files: include/llvm/Transforms/Utils/BasicBlockUtils.h lib/Transforms/Scalar/LoopUnswitch.cpp lib/Transforms/Utils/BasicBlockUtils.cpp lib/Transforms/Utils/LoopRotationUtils.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D68670.223957.patch Type: text/x-patch Size: 8414 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 16:37:00 2019 From: llvm-commits at lists.llvm.org (Roman Lebedev via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 23:37:00 +0000 (UTC) Subject: [PATCH] D68672: [APInt] Rounding right-shifts In-Reply-To: References: Message-ID: <633cd8bbdf77f4c818e59411b6858210@localhost.localdomain> lebedev.ri added a reviewer: timshen. lebedev.ri marked an inline comment as done. lebedev.ri added a subscriber: timshen. lebedev.ri added inline comments. ================ Comment at: llvm/unittests/ADT/APIntTest.cpp:2529-2538 + EXPECT_TRUE(Prod.sge(A)); + if (Prod.sgt(A)) { EXPECT_TRUE(((Quo - 1).sext(16) * B.sext(16)).ult(A)); } } { APInt Quo = APIntOps::RoundingSDiv(A, B, APInt::Rounding::DOWN); ---------------- @sanjoy @timshen FIXME: that's the intended check, right? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68672/new/ https://reviews.llvm.org/D68672 From llvm-commits at lists.llvm.org Tue Oct 8 16:36:59 2019 From: llvm-commits at lists.llvm.org (Roman Lebedev via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 23:36:59 +0000 (UTC) Subject: [PATCH] D68672: [APInt] Rounding right-shifts Message-ID: lebedev.ri created this revision. lebedev.ri added reviewers: sanjoy, nikic, craig.topper, RKSimon. lebedev.ri added a project: LLVM. Herald added subscribers: dexonsmith, hiraditya. lebedev.ri added a reviewer: timshen. lebedev.ri marked an inline comment as done. lebedev.ri added a subscriber: timshen. lebedev.ri added inline comments. 
================
Comment at: llvm/unittests/ADT/APIntTest.cpp:2529-2538
+ EXPECT_TRUE(Prod.sge(A));
+ if (Prod.sgt(A)) {
    EXPECT_TRUE(((Quo - 1).sext(16) * B.sext(16)).ult(A));
  }
  }
  {
    APInt Quo = APIntOps::RoundingSDiv(A, B, APInt::Rounding::DOWN);
----------------
@sanjoy @timshen FIXME: that's the intended check, right?

There's already rounding division, which is used in `ConstantRange` to implement e.g. `makeExactMulNUWRegion()`/`makeExactMulNSWRegion()`, but there are no versions for right-shifts. I'd like to try to extend `ConstantRange::makeGuaranteedNoWrapRegion()` to deal with `Instruction::Shl`, so I believe I need rounding right shifts.

The test coverage is confusing. The existing `RoundingSDiv()` test didn't actually pass; it never ran. I've fixed that, and adjusted it to pass. I think we should be checking signed predicates?

Then I've added `APIntOps::RoundingLShr()`, `APIntOps::RoundingAShr()`, standalone test coverage for them, and a cross-test to verify that they produce identical output as compared to their `div` friends. They do.

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D68672

Files:
  llvm/include/llvm/ADT/APInt.h
  llvm/lib/Support/APInt.cpp
  llvm/unittests/ADT/APIntTest.cpp

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68672.223959.patch
Type: text/x-patch
Size: 6905 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Tue Oct 8 16:46:54 2019
From: llvm-commits at lists.llvm.org (Dávid Bolvanský via Phabricator via llvm-commits)
Date: Tue, 08 Oct 2019 23:46:54 +0000 (UTC)
Subject: [PATCH] D57779: [SLP] Add support for throttling.
In-Reply-To:
References:
Message-ID: <3de338cb76e4648e8a255af7167dc0fe@localhost.localdomain>

xbolva00 added a comment.

@ABataev - any comments?

CHANGES SINCE LAST ACTION https://reviews.llvm.org/D57779/new/ https://reviews.llvm.org/D57779 From llvm-commits at lists.llvm.org Tue Oct 8 16:46:54 2019 From: llvm-commits at lists.llvm.org (Roman Lebedev via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 23:46:54 +0000 (UTC) Subject: [PATCH] D68672: [APInt] Rounding right-shifts In-Reply-To: References: Message-ID: <8b82c6d3d51c13d08b6089516d734ea6@localhost.localdomain> lebedev.ri updated this revision to Diff 223960. lebedev.ri added a comment. Upload correct patch - i've meant to disable the `RoundingSDiv` test - it doesn't pass after making it actually run. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68672/new/ https://reviews.llvm.org/D68672 Files: llvm/include/llvm/ADT/APInt.h llvm/lib/Support/APInt.cpp llvm/unittests/ADT/APIntTest.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D68672.223960.patch Type: text/x-patch Size: 7575 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 16:46:54 2019 From: llvm-commits at lists.llvm.org (Evgenii Stepanov via Phabricator via llvm-commits) Date: Tue, 08 Oct 2019 23:46:54 +0000 (UTC) Subject: [PATCH] D68469: [AArch64] Ensure no tagged memory is left in the unallocated portion of the stack In-Reply-To: References: Message-ID: <228aace5aea7323c527b60dcf58b5a48@localhost.localdomain> eugenis accepted this revision. eugenis added a comment. This revision is now accepted and ready to land. Everything looks great, thanks! ================ Comment at: llvm/lib/Target/AArch64/AArch64StackTagging.cpp:511 + llvm_unreachable("Corrupt instruction list"); + return false; +} ---------------- Remove this line, it's unreachable. 
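
The point of the inline comment above can be shown with a minimal sketch. It does not use LLVM itself: `UNREACHABLE` is a hypothetical stand-in that only mimics the no-return behaviour of `llvm_unreachable` by calling `std::abort`, and `findIndex` is an invented helper. Since the macro never returns, a `return` placed after it is dead code and can be dropped.

```cpp
#include <cstdio>
#include <cstdlib>
#include <vector>

// Hypothetical stand-in for llvm_unreachable (an assumption for this sketch):
// it prints a message and aborts, so control never continues past it.
#define UNREACHABLE(msg)                                                       \
  do {                                                                         \
    std::fprintf(stderr, "UNREACHABLE: %s\n", msg);                            \
    std::abort();                                                              \
  } while (false)

// Hypothetical helper: the caller guarantees that Needle is present in List.
int findIndex(const std::vector<int> &List, int Needle) {
  for (int I = 0, E = static_cast<int>(List.size()); I != E; ++I)
    if (List[I] == Needle)
      return I;
  UNREACHABLE("Corrupt list");
  // No `return` statement here: the macro above never returns, so any
  // statement placed after it could not execute.
}
```

Calling `findIndex({1, 2, 3}, 2)` returns the index of the match; the abort path is only taken if the caller's guarantee is violated.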
CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68469/new/

https://reviews.llvm.org/D68469

From llvm-commits at lists.llvm.org Tue Oct 8 16:55:59 2019
From: llvm-commits at lists.llvm.org (Eli Friedman via Phabricator via llvm-commits)
Date: Tue, 08 Oct 2019 23:55:59 +0000 (UTC)
Subject: [PATCH] D68257: [Support] Add mathematical constants
In-Reply-To:
References:
Message-ID: <76d7380b9e4820c89c985479e1c459c2@localhost.localdomain>

efriedma accepted this revision.
efriedma added a comment.
This revision is now accepted and ready to land.

LGTM

================
Comment at: llvm/include/llvm/Support/MathExtras.h:66
+ inv_sqrtpi = 0.5641895835477563, // https://oeis.org/A087197
+ sqrt2 = 1.414213562373095, // https://oeis.org/A002193
+ inv_sqrt2 = 0.7071067811865475,
----------------
evandro wrote:
> efriedma wrote:
> > evandro wrote:
> > > efriedma wrote:
> > > > The correct value of sqrt(2) in double-precision is 1.4142135623730951.
> > > >
> > > > And now I don't trust any of the other values...
> > > `double` has a precision of 15 or 16 significant digits. I don't understand why you are suggesting 17 significant digits when you asked to trim the precision down.
> > >
> > > Besides, the reference I provided states that this value is 1.41421356237309505. Whether it's rounded to 1.4142135623730950 or 1.4142135623730951 is a bit moot, IMO.
> > I asked for "the smallest number of digits required to produce the correct double-precision result". This is what you get if, for example, you ask Python 2.7 or later to convert the value to a string with `repr()` (`printf "import math\nprint(repr(math.sqrt(2)))" | python`). `1.414213562373095` produces a value that's different by one ulp.
> >
> > Yes, a one ulp difference is unlikely to matter for most uses, but if we're going to take the time to define these, we should define them correctly.
> You're assuming that Python is correct. `bc` says 1.41421356237309504880. glibc's `math.h` says 1.41421356237309504880 as well. And none of these is the same as your 1.4142135623730951.
>
> As I said, the precision of `double` is 15 to 16 digits and of `float`, 6 to 7 digits. `math.h` defines them with 20 digits, which is probably an agreeable precision, yes? But I believe that we can all live with a difference of ±1ulp.

1.4142135623730951 is the shortest decimal representation that produces the same double-precision number as 0x1.6a09e667f3bcdP+0. It isn't "correct" in any other sense, sure.

A few extra digits is okay, I guess.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68257/new/

https://reviews.llvm.org/D68257

From llvm-commits at lists.llvm.org Tue Oct 8 17:32:57 2019
From: llvm-commits at lists.llvm.org (Eli Friedman via Phabricator via llvm-commits)
Date: Wed, 09 Oct 2019 00:32:57 +0000 (UTC)
Subject: [PATCH] D68675: [9.0 branch][ARM] VFPv2 only supports 16 D registers.
Message-ID:

efriedma created this revision.
efriedma added reviewers: t.p.northover, tstellar.
Herald added subscribers: kristina, hiraditya, kristof.beyls.
Herald added a project: LLVM.

Patch for 9.0.1.

Simplified version of r372186/r372187: fix the meaning of the "vfpv2" and "vfpv2sp" features, but keep around the useless "vfp2d16" and "vfp2d16sp" features, to reduce the risk on the release branch.

Fixes https://bugs.llvm.org/show_bug.cgi?id=43365

Repository:
  rL LLVM

https://reviews.llvm.org/D68675

Files:
  clang/test/CodeGen/arm-target-features.c
  llvm/lib/Support/ARMTargetParser.cpp
  llvm/lib/Target/ARM/ARM.td
  llvm/test/MC/ARM/vfp-aliases-diagnostics.s

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68675.223964.patch Type: text/x-patch Size: 10482 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 18:00:52 2019 From: llvm-commits at lists.llvm.org (Julian Lettner via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 01:00:52 +0000 (UTC) Subject: [PATCH] D68676: [ASan] Do not misrepresent high value address dereferences as null dereferences Message-ID: yln created this revision. Herald added a reviewer: jfb. Herald added projects: Sanitizers, LLVM. Herald added subscribers: llvm-commits, Sanitizers. Dereferences with addresses above the 48-bit hardware addressable range produce "invalid instruction" (instead of "invalid access") hardware exceptions (there is no hardware address decoding logic for those bits), and the address provided by this exception is the address of the instruction (not the faulting address). The kernel maps the "invalid instruction" to SEGV, but fails to provide the real fault address. Because of this ASan lies and says that those cases are null dereferences. This downgrades the severity of a found bug in terms of security. In the ASan signal handler, we can not provide the real faulting address, but at least we can try not to lie. rdar://50366151 Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D68676 Files: compiler-rt/lib/asan/asan_errors.h compiler-rt/lib/sanitizer_common/sanitizer_common.h compiler-rt/lib/sanitizer_common/sanitizer_posix.cpp compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_report.cpp compiler-rt/lib/sanitizer_common/sanitizer_win.cpp compiler-rt/test/asan/TestCases/Darwin/high-address-dereference.c -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68676.223966.patch Type: text/x-patch Size: 7002 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 18:00:52 2019 From: llvm-commits at lists.llvm.org (Eli Friedman via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 01:00:52 +0000 (UTC) Subject: [PATCH] D66935: [AArch64][DebugInfo] Do not recompute CalleeSavedStackSize In-Reply-To: References: Message-ID: <854f2016244ef366928683d0490351a5@localhost.localdomain> efriedma added subscribers: krasimir, bkramer. efriedma added a comment. Thanks for pointing that out. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D66935/new/ https://reviews.llvm.org/D66935 From llvm-commits at lists.llvm.org Tue Oct 8 18:19:13 2019 From: llvm-commits at lists.llvm.org (Julian Lettner via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 01:19:13 +0000 (UTC) Subject: [PATCH] D68525: [lit] Refactor ProgressDisplay In-Reply-To: References: Message-ID: <8f12576ce877928e88c7ca2062ed821f@localhost.localdomain> yln added a comment. *ping* Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68525/new/ https://reviews.llvm.org/D68525 From llvm-commits at lists.llvm.org Tue Oct 8 18:19:13 2019 From: llvm-commits at lists.llvm.org (ham9174@yahoo.com via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 01:19:13 +0000 (UTC) Subject: [PATCH] D17325: Introduce llvm/ADT/OptionSet.h utility class In-Reply-To: References: Message-ID: <7ebcd27f95761ff5d5b085d0db2138ae@localhost.localdomain> ham999 added a comment. 
ConversionFunction

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D17325/new/

https://reviews.llvm.org/D17325

From llvm-commits at lists.llvm.org Tue Oct 8 18:28:19 2019
From: llvm-commits at lists.llvm.org (Shoaib Meenai via Phabricator via llvm-commits)
Date: Wed, 09 Oct 2019 01:28:19 +0000 (UTC)
Subject: [PATCH] D68648: [CMake] Only detect the linker once in AddLLVM.cmake
In-Reply-To:
References:
Message-ID:

smeenai accepted this revision.
smeenai added a comment.
This revision is now accepted and ready to land.

This only works because of a specific detail of LLVM's build system. The variable will get set in the directory scope, so different directories including this module will still duplicate the check. Here's a simple test I did, using cmake version 3.12.1 (run this as a script):

  mkdir /tmp/cmaketest
  cd /tmp/cmaketest
  cat > CMakeLists.txt <<'EOF'
  cmake_minimum_required(VERSION 3.4.3)
  list(APPEND CMAKE_MODULE_PATH ${CMAKE_CURRENT_LIST_DIR}/modules)
  add_subdirectory(dir1)
  add_subdirectory(dir2)
  EOF
  mkdir modules
  cat > modules/detect.cmake <<'EOF'
  if(NOT DEFINED DETECTED)
    message(STATUS Detected)
    set(DETECTED YES)
  endif()
  EOF
  mkdir dir1
  cat > dir1/CMakeLists.txt <<'EOF'
  include(detect)
  add_subdirectory(subdir)
  EOF
  mkdir dir1/subdir
  echo 'include(detect)' > dir1/subdir/CMakeLists.txt
  mkdir dir2
  echo 'include(detect)' > dir2/CMakeLists.txt
  mkdir build
  cd build
  cmake -G Ninja ..

For me, the "Detected" message is still printed twice, for dir1 and dir2. It doesn't print for subdir, since that inherits the variable for dir1, but independent directories still duplicate the message.

In LLVM, we `include(AddLLVM)` in the top-level CMakeLists.txt, so any variable created in that directory scope is essentially global. Nevertheless, it'd be nicer to explicitly use a global property for this, but I'm also okay with this going in as-is if you prefer.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68648/new/

https://reviews.llvm.org/D68648

From llvm-commits at lists.llvm.org Tue Oct 8 18:37:27 2019
From: llvm-commits at lists.llvm.org (Jonas Devlieghere via Phabricator via llvm-commits)
Date: Wed, 09 Oct 2019 01:37:27 +0000 (UTC)
Subject: [PATCH] D68680: [dsymutil] Fix handling of common symbols in multiple object files.
Message-ID:

JDevlieghere created this revision.
JDevlieghere added a reviewer: friss.
Herald added a subscriber: aprantl.
Herald added a project: LLVM.

For common symbols the linker emits only a single symbol entry in the debug map. This caused dsymutil to not relocate common symbols when linking DWARF coming from object files that did not have this entry. This patch fixes that by keeping track of common symbols in the object files and synthesizing a debug map entry for them using the address from the main binary.

Repository:
  rL LLVM

https://reviews.llvm.org/D68680

Files:
  llvm/test/tools/dsymutil/Inputs/private/tmp/common/com
  llvm/test/tools/dsymutil/Inputs/private/tmp/common/com1.o
  llvm/test/tools/dsymutil/Inputs/private/tmp/common/com2.o
  llvm/test/tools/dsymutil/X86/common-sym-multi.test
  llvm/tools/dsymutil/MachODebugMapParser.cpp

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68680.223972.patch
Type: text/x-patch
Size: 4306 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Tue Oct 8 18:37:27 2019
From: llvm-commits at lists.llvm.org (Matt Morehouse via Phabricator via llvm-commits)
Date: Wed, 09 Oct 2019 01:37:27 +0000 (UTC)
Subject: [PATCH] D68653: [scudo][standalone] Get statistics in a char buffer
In-Reply-To:
References:
Message-ID: <19c8d42c5e45463679ab004ab85067ee@localhost.localdomain>

morehouse accepted this revision.
morehouse added inline comments.
This revision is now accepted and ready to land.

================ Comment at: lib/scudo/standalone/tests/combined_test.cpp:141 + scudo::uptr BufferSize = 8192; + char *Buffer = new char[BufferSize]; + scudo::uptr ActualSize = Allocator->getStats(Buffer, BufferSize); ---------------- Nit: `std::unique_ptr` or `std::vector` Repository: rCRT Compiler Runtime CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68653/new/ https://reviews.llvm.org/D68653 From llvm-commits at lists.llvm.org Tue Oct 8 19:00:36 2019 From: llvm-commits at lists.llvm.org (Douglas Gliner via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 02:00:36 +0000 (UTC) Subject: [PATCH] D51018: [sancov] Accommodate sancov and coverage report server for use under Windows In-Reply-To: References: Message-ID: <334880ff9d95013b24651904756939bb@localhost.localdomain> dgg5503 updated this revision to Diff 223973. dgg5503 retitled this revision from "[sancov] Fixed malformed JSON when symbolizing coverage information" to "[sancov] Accommodate sancov and coverage report server for use under Windows". dgg5503 edited the summary of this revision. dgg5503 added a comment. @vsk thanks for the review! It looks like the JSON support library implements what `JSONWriter` does in this tool. To reduce maintenance, I've updated sancov to use the JSON support library implementation instead. The only downside to this change is that the JSON text format differs compared to the original implementation. I'm open to reverting this diff and simply adding your suggested change which also worked. Let me know what you think. 
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D51018/new/ https://reviews.llvm.org/D51018 Files: llvm/test/tools/sancov/blacklist.test llvm/test/tools/sancov/covered_functions.test llvm/test/tools/sancov/merge.test llvm/test/tools/sancov/not_covered_functions.test llvm/test/tools/sancov/print.test llvm/test/tools/sancov/stats.test llvm/test/tools/sancov/symbolize.test llvm/test/tools/sancov/symbolize_noskip_dead_files.test llvm/test/tools/sancov/validation.test llvm/tools/sancov/coverage-report-server.py llvm/tools/sancov/sancov.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D51018.223973.patch Type: text/x-patch Size: 19497 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 19:30:06 2019 From: llvm-commits at lists.llvm.org (Rui Ueyama via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 02:30:06 +0000 (UTC) Subject: [PATCH] D68352: [lld] Don't create hints-section if Hint/Name Table is empty In-Reply-To: References: Message-ID: <8b94f3d25bd8f25fc03348b31c8bbda8@localhost.localdomain> ruiu accepted this revision. ruiu added a comment. This revision is now accepted and ready to land. LGTM ================ Comment at: lld/test/COFF/imports-ordinal-only.s:5 +# RUN: llvm-mc -triple=i386-pc-win32 %s -filetype=obj -o %t.obj +# RUN: lld-link -out:%t.exe -entry:main -subsystem:console -safeseh:no -debug %t.obj %t-implib.a + ---------------- thrimbor wrote: > ruiu wrote: > > I'd dump the import table to verify that a correct import table is actually created in the resulting executable. > I updated the test. Is this what you had in mind? Yes, but please add `--match-full-lines` to FileCheck so that if Name exists in an actual output the test will fail. By default, if an actual output is a substring of a CHECK line, it will succeed. 
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68352/new/ https://reviews.llvm.org/D68352 From llvm-commits at lists.llvm.org Tue Oct 8 20:02:58 2019 From: llvm-commits at lists.llvm.org (Frederic Riss via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 03:02:58 +0000 (UTC) Subject: [PATCH] D68680: [dsymutil] Fix handling of common symbols in multiple object files. In-Reply-To: References: Message-ID: <799c892a7ea7234a1d833f9ce2361f84@localhost.localdomain> friss accepted this revision. friss added a comment. This revision is now accepted and ready to land. LGTM Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68680/new/ https://reviews.llvm.org/D68680 From llvm-commits at lists.llvm.org Tue Oct 8 20:12:43 2019 From: llvm-commits at lists.llvm.org (Qing Shan Zhang via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 03:12:43 +0000 (UTC) Subject: [PATCH] D67950: [TableGen] Fix a bug that MCSchedClassDesc is interfered between different SchedModel In-Reply-To: References: Message-ID: steven.zhang updated this revision to Diff 223976. steven.zhang marked an inline comment as done. steven.zhang added a comment. rebase the patch. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67950/new/ https://reviews.llvm.org/D67950 Files: llvm/test/CodeGen/ARM/ParallelDSP/unroll-n-jam-smlad.ll llvm/test/TableGen/InvalidMCSchedClassDesc.td llvm/utils/TableGen/SubtargetEmitter.cpp Index: llvm/utils/TableGen/SubtargetEmitter.cpp =================================================================== --- llvm/utils/TableGen/SubtargetEmitter.cpp +++ llvm/utils/TableGen/SubtargetEmitter.cpp @@ -1057,6 +1057,7 @@ LLVM_DEBUG(dbgs() << ProcModel.ModelName << " does not have resources for class " << SC.Name << '\n'); + SCDesc.NumMicroOps = MCSchedClassDesc::InvalidNumMicroOps; } } // Sum resources across all operand writes. 
Index: llvm/test/TableGen/InvalidMCSchedClassDesc.td =================================================================== --- /dev/null +++ llvm/test/TableGen/InvalidMCSchedClassDesc.td @@ -0,0 +1,47 @@ +// RUN: llvm-tblgen -gen-subtarget -I %p/../../include %s 2>&1 | FileCheck %s +// Check if it is valid MCSchedClassDesc if didn't have the resources. + +include "llvm/Target/Target.td" + +def MyTarget : Target; + +let OutOperandList = (outs), InOperandList = (ins) in { + def Inst_A : Instruction; + def Inst_B : Instruction; +} + +let CompleteModel = 0 in { + def SchedModel_A: SchedMachineModel; + def SchedModel_B: SchedMachineModel; + def SchedModel_C: SchedMachineModel; +} + +// Inst_B didn't have the resoures, and it is invalid. +// CHECK: SchedModel_ASchedClasses[] = { +// CHECK: {DBGFIELD("Inst_A") 1 +// CHECK-NEXT: {DBGFIELD("Inst_B") 16383 +let SchedModel = SchedModel_A in { + def Write_A : SchedWriteRes<[]>; + def : InstRW<[Write_A], (instrs Inst_A)>; +} + +// Inst_A didn't have the resoures, and it is invalid. 
+// CHECK: SchedModel_BSchedClasses[] = { +// CHECK: {DBGFIELD("Inst_A") 16383 +// CHECK-NEXT: {DBGFIELD("Inst_B") 1 +let SchedModel = SchedModel_B in { + def Write_B: SchedWriteRes<[]>; + def : InstRW<[Write_B], (instrs Inst_B)>; +} + +// CHECK: SchedModel_CSchedClasses[] = { +// CHECK: {DBGFIELD("Inst_A") 1 +// CHECK-NEXT: {DBGFIELD("Inst_B") 1 +let SchedModel = SchedModel_C in { + def Write_C: SchedWriteRes<[]>; + def : InstRW<[Write_C], (instrs Inst_A, Inst_B)>; +} + +def ProcessorA: ProcessorModel<"ProcessorA", SchedModel_A, []>; +def ProcessorB: ProcessorModel<"ProcessorB", SchedModel_B, []>; +def ProcessorC: ProcessorModel<"ProcessorC", SchedModel_C, []>; Index: llvm/test/CodeGen/ARM/ParallelDSP/unroll-n-jam-smlad.ll =================================================================== --- llvm/test/CodeGen/ARM/ParallelDSP/unroll-n-jam-smlad.ll +++ llvm/test/CodeGen/ARM/ParallelDSP/unroll-n-jam-smlad.ll @@ -45,7 +45,6 @@ ; CHECK-REG-PRESSURE: ldr{{.*}}, [sp ; CHECK-REG-PRESSURE: ldr{{.*}}, [sp ; CHECK-REG-PRESSURE: ldr{{.*}}, [sp -; CHECK-REG-PRESSURE: ldr{{.*}}, [sp ; CHECK-REG-PRESSURE: bne .LBB0_1 for.body: -------------- next part -------------- A non-text attachment was scrubbed... Name: D67950.223976.patch Type: text/x-patch Size: 2791 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 20:12:44 2019 From: llvm-commits at lists.llvm.org (Qing Shan Zhang via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 03:12:44 +0000 (UTC) Subject: [PATCH] D67950: [TableGen] Fix a bug that MCSchedClassDesc is interfered between different SchedModel In-Reply-To: References: Message-ID: <671e820a70d005ebbe70d102764e06c5@localhost.localdomain> steven.zhang added a comment. I have rebased the patch. 
================ Comment at: llvm/test/CodeGen/ARM/ParallelDSP/multi-use-loads.ll:23 ; CHECK-LE-NEXT: sxtah r1, r1, lr +; CHECK-LE-NEXT: subs r0, #1 ; CHECK-LE-NEXT: smlad r12, r4, lr, r12 ---------------- jsji wrote: > samparker wrote: > > jsji wrote: > > > Do these changes in scheduling means `ParallelDSP ` SchedModel has some interference with others? > > > If so, maybe we need another patch to fix that. > > I think if this patch was rebased, this change would go away. This brought to our attention that some instructions weren't modelled, but I hope this is fixed now. > Good to know that! Thanks @samparker ! Good catch, and yes, it has been fixed by https://reviews.llvm.org/rL373163 CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67950/new/ https://reviews.llvm.org/D67950 From llvm-commits at lists.llvm.org Tue Oct 8 20:31:27 2019 From: llvm-commits at lists.llvm.org (John McCall via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 03:31:27 +0000 (UTC) Subject: [PATCH] D62731: [RFC] Add support for options -frounding-math, -fp-model=, and -fp-exception-behavior=, : Specify floating point behavior In-Reply-To: References: Message-ID: rjmccall added inline comments. ================ Comment at: clang/docs/UsersManual.rst:1330 + and ``fast``. + Details: + ---------------- rjmccall wrote: > "provided by other, single-purpose floating point options." I don't know why you keep including "clang" as a modifier here; this is the clang documentation, and all of these options are clang options no matter where they might have been borrowed from. ================ Comment at: clang/docs/UsersManual.rst:1341 + has been selected, then the compiler will issue a diagnostic warning + that the override has occurred. + ---------------- mibintc wrote: > rjmccall wrote: > > That's not typical driver behavior; why this choice? 
> The rationale for the warnings is that the floating point options are sufficiently complicated that it makes sense to warn the uses that one of the later options supplied on the command line is undoing a choice made earlier. It's not obvious that e.g. the setting for fassociative-math is also controlled by -fp-model=strict Okay. Well, it's a new option, so new behavior is alright, but if you're worried about the collisions having arbitrary effects that you'll have to maintain compatibility with, you should consider making it an error instead, because a warning still means it's permitted. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D62731/new/ https://reviews.llvm.org/D62731 From llvm-commits at lists.llvm.org Tue Oct 8 20:49:59 2019 From: llvm-commits at lists.llvm.org (George Burgess IV via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 03:49:59 +0000 (UTC) Subject: [PATCH] D68659: [MemorySSA] Make the use of moveAllAfterMergeBlocks consistent. In-Reply-To: References: Message-ID: <9a2a0ad7bf8f4af7b43ae1b4e98e1100@localhost.localdomain> george.burgess.iv accepted this revision. george.burgess.iv added a comment. This revision is now accepted and ready to land. Thanks for this! LGTM with a few nits ================ Comment at: lib/Analysis/MemorySSAUpdater.cpp:1185 + auto *Defs = MSSA->getWritableBlockDefs(From); + if (Defs && Defs->begin() != Defs->end()) + if (auto *Phi = dyn_cast(&*Defs->begin())) ---------------- nit: `!Defs->empty()`? (Though I'm surprised this check is required -- shouldn't we be removing entries from the map as they become empty?) ================ Comment at: lib/Analysis/MemorySSAUpdater.cpp:1186 + if (Defs && Defs->begin() != Defs->end()) + if (auto *Phi = dyn_cast(&*Defs->begin())) + tryRemoveTrivialPhi(Phi); ---------------- If I'm reading the description properly, this function assumes that everything in `From` is now in `To`, no? 
If so, `cast()` seems more appropriate here, since the only thing that can remain is a single Phi ================ Comment at: test/Analysis/MemorySSA/pr43569.ll:1 +; RUN: opt -pgo-kind=pgo-instr-gen-pipeline -aa-pipeline=default -passes="default" -enable-nontrivial-unswitch -S < %s | FileCheck %s + ---------------- ; REQUIRES: asserts? Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68659/new/ https://reviews.llvm.org/D68659 From llvm-commits at lists.llvm.org Tue Oct 8 20:50:47 2019 From: llvm-commits at lists.llvm.org (Kai Luo via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 03:50:47 +0000 (UTC) Subject: [PATCH] D67794: [MachineCopyPropagation] Extend MCP to do trivial copy backward propagation In-Reply-To: References: Message-ID: <12f463ed07c08e0eebbd479ba40d5d16@localhost.localdomain> lkail added a comment. ping Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67794/new/ https://reviews.llvm.org/D67794 From llvm-commits at lists.llvm.org Tue Oct 8 21:00:03 2019 From: llvm-commits at lists.llvm.org (Kristina Brooks via llvm-commits) Date: Wed, 09 Oct 2019 04:00:03 -0000 Subject: [llvm] r374138 - [TypeSize] Fix module builds (cassert) Message-ID: <20191009040003.BCD6A8FE8D@lists.llvm.org> Author: kristina Date: Tue Oct 8 21:00:03 2019 New Revision: 374138 URL: http://llvm.org/viewvc/llvm-project?rev=374138&view=rev Log: [TypeSize] Fix module builds (cassert) TypeSize.h uses `assert` statements without including the header first which leads to failures in modular builds. 
Modified: llvm/trunk/include/llvm/Support/TypeSize.h Modified: llvm/trunk/include/llvm/Support/TypeSize.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Support/TypeSize.h?rev=374138&r1=374137&r2=374138&view=diff ============================================================================== --- llvm/trunk/include/llvm/Support/TypeSize.h (original) +++ llvm/trunk/include/llvm/Support/TypeSize.h Tue Oct 8 21:00:03 2019 @@ -15,6 +15,7 @@ #ifndef LLVM_SUPPORT_TYPESIZE_H #define LLVM_SUPPORT_TYPESIZE_H +#include #include namespace llvm { From llvm-commits at lists.llvm.org Tue Oct 8 21:16:19 2019 From: llvm-commits at lists.llvm.org (Jonas Devlieghere via llvm-commits) Date: Wed, 09 Oct 2019 04:16:19 -0000 Subject: [llvm] r374139 - [dsymutil] Fix handling of common symbols in multiple object files. Message-ID: <20191009041619.307A6906D2@lists.llvm.org> Author: jdevlieghere Date: Tue Oct 8 21:16:18 2019 New Revision: 374139 URL: http://llvm.org/viewvc/llvm-project?rev=374139&view=rev Log: [dsymutil] Fix handling of common symbols in multiple object files. For common symbols the linker emits only a single symbol entry in the debug map. This caused dsymutil to not relocate common symbols when linking DWARF coming form object files that did not have this entry. This patch fixes that by keeping track of common symbols in the object files and synthesizing a debug map entry for them using the address from the main binary. 
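The bookkeeping described in the log message above can be sketched roughly as follows. This is a simplified, standalone illustration and not the dsymutil code itself: `ObjectEntry`, `SymbolMapping`, and the `std::map` standing in for the main binary's symbol table are hypothetical names, and C++17 is assumed for `std::optional`.

```cpp
#include <cstdint>
#include <map>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for one debug map entry.
struct SymbolMapping {
  std::optional<uint64_t> ObjectAddress; // left unset for common symbols
  uint64_t BinaryAddress;                // address in the linked binary
};

// Hypothetical stand-in for the per-object-file part of a debug map.
struct ObjectEntry {
  std::map<std::string, SymbolMapping> Symbols;

  // Returns false if a symbol with this name is already present.
  bool addSymbol(const std::string &Name, std::optional<uint64_t> ObjAddr,
                 uint64_t BinAddr) {
    return Symbols.emplace(Name, SymbolMapping{ObjAddr, BinAddr}).second;
  }
};

// Synthesize entries for the common symbols seen in one object file, taking
// the address from the main binary's symbol table, then reset the pending
// list so the next object file starts from a clean state.
void addCommonSymbols(ObjectEntry &Obj, std::vector<std::string> &CommonSymbols,
                      const std::map<std::string, uint64_t> &MainBinarySymbols) {
  for (const std::string &Name : CommonSymbols) {
    auto It = MainBinarySymbols.find(Name);
    if (It == MainBinarySymbols.end() || It->second == 0)
      continue; // The main binary has no address for this symbol.
    Obj.addSymbol(Name, std::nullopt, It->second); // no-op if already present
  }
  CommonSymbols.clear();
}
```

In this sketch, each object file that references the common symbol ends up with its own entry pointing at the same binary address, which is the behavior the DEBUGMAP checks in the new test appear to expect.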
Differential revision: https://reviews.llvm.org/D68680 Added: llvm/trunk/test/tools/dsymutil/Inputs/private/ llvm/trunk/test/tools/dsymutil/Inputs/private/tmp/ llvm/trunk/test/tools/dsymutil/Inputs/private/tmp/common/ llvm/trunk/test/tools/dsymutil/Inputs/private/tmp/common/com (with props) llvm/trunk/test/tools/dsymutil/Inputs/private/tmp/common/com1.o (with props) llvm/trunk/test/tools/dsymutil/Inputs/private/tmp/common/com2.o (with props) llvm/trunk/test/tools/dsymutil/X86/common-sym-multi.test Modified: llvm/trunk/tools/dsymutil/MachODebugMapParser.cpp Added: llvm/trunk/test/tools/dsymutil/Inputs/private/tmp/common/com URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/dsymutil/Inputs/private/tmp/common/com?rev=374139&view=auto ============================================================================== Binary file - no diff available. Propchange: llvm/trunk/test/tools/dsymutil/Inputs/private/tmp/common/com ------------------------------------------------------------------------------ svn:executable = * Propchange: llvm/trunk/test/tools/dsymutil/Inputs/private/tmp/common/com ------------------------------------------------------------------------------ svn:mime-type = application/octet-stream Added: llvm/trunk/test/tools/dsymutil/Inputs/private/tmp/common/com1.o URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/dsymutil/Inputs/private/tmp/common/com1.o?rev=374139&view=auto ============================================================================== Binary file - no diff available. 
Propchange: llvm/trunk/test/tools/dsymutil/Inputs/private/tmp/common/com1.o ------------------------------------------------------------------------------ svn:mime-type = application/octet-stream Added: llvm/trunk/test/tools/dsymutil/Inputs/private/tmp/common/com2.o URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/dsymutil/Inputs/private/tmp/common/com2.o?rev=374139&view=auto ============================================================================== Binary file - no diff available. Propchange: llvm/trunk/test/tools/dsymutil/Inputs/private/tmp/common/com2.o ------------------------------------------------------------------------------ svn:mime-type = application/octet-stream Added: llvm/trunk/test/tools/dsymutil/X86/common-sym-multi.test URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/dsymutil/X86/common-sym-multi.test?rev=374139&view=auto ============================================================================== --- llvm/trunk/test/tools/dsymutil/X86/common-sym-multi.test (added) +++ llvm/trunk/test/tools/dsymutil/X86/common-sym-multi.test Tue Oct 8 21:16:18 2019 @@ -0,0 +1,39 @@ +RUN: dsymutil -oso-prepend-path %p/../Inputs %p/../Inputs/private/tmp/common/com -f -o - | llvm-dwarfdump -debug-info - | FileCheck %s +RUN: dsymutil -oso-prepend-path %p/../Inputs %p/../Inputs/private/tmp/common/com -dump-debug-map | FileCheck %s --check-prefix DEBUGMAP + +The test was compiled from two source files: +$ cd /private/tmp/common +$ cat com1.c +int i[1000]; +int main() { + return i[1]; +} +$ cat com2.c +extern int i[1000]; +int bar() { + return i[0]; +} +$ clang -fcommon -g -c com1.c -o com1.o +$ clang -fcommon -g -c com2.c -o com2.o +$ clang -fcommon -g com1.o com2.o -o com + +CHECK: DW_TAG_compile_unit +CHECK: DW_TAG_variable +CHECK-NOT: {{NULL|DW_TAG}} +CHECK: DW_AT_name{{.*}}"i" +CHECK-NOT: {{NULL|DW_TAG}} +CHECK: DW_AT_location{{.*}}DW_OP_addr 0x100001000) + +CHECK: DW_TAG_compile_unit +CHECK: DW_TAG_variable +CHECK-NOT: 
{{NULL|DW_TAG}} +CHECK: DW_AT_name{{.*}}"i" +CHECK-NOT: {{NULL|DW_TAG}} +CHECK: DW_AT_location{{.*}}DW_OP_addr 0x100001000) + +DEBUGMAP: filename:{{.*}}com1.o +DEBUGMAP: symbols: +DEBUGMAP: sym: _i, binAddr: 0x0000000100001000, size: 0x00000000 +DEBUGMAP: filename:{{.*}}com2.o +DEBUGMAP: symbols: +DEBUGMAP: sym: _i, binAddr: 0x0000000100001000, size: 0x00000000 Modified: llvm/trunk/tools/dsymutil/MachODebugMapParser.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/dsymutil/MachODebugMapParser.cpp?rev=374139&r1=374138&r2=374139&view=diff ============================================================================== --- llvm/trunk/tools/dsymutil/MachODebugMapParser.cpp (original) +++ llvm/trunk/tools/dsymutil/MachODebugMapParser.cpp Tue Oct 8 21:16:18 2019 @@ -14,6 +14,7 @@ #include "llvm/Support/Path.h" #include "llvm/Support/WithColor.h" #include "llvm/Support/raw_ostream.h" +#include namespace { using namespace llvm; @@ -51,6 +52,8 @@ private: StringRef MainBinaryStrings; /// The constructed DebugMap. std::unique_ptr Result; + /// List of common symbols that need to be added to the debug map. + std::vector CommonSymbols; /// Map of the currently processed object file symbol addresses. StringMap> CurrentObjectAddresses; @@ -81,6 +84,8 @@ private: STE.n_value); } + void addCommonSymbols(); + /// Dump the symbol table output header. void dumpSymTabHeader(raw_ostream &OS, StringRef Arch); @@ -122,11 +127,32 @@ void MachODebugMapParser::resetParserSta CurrentDebugMapObject = nullptr; } +/// Commons symbols won't show up in the symbol map but might need to be +/// relocated. We can add them to the symbol table ourselves by combining the +/// information in the object file (the symbol name) and the main binary (the +/// address). 
+void MachODebugMapParser::addCommonSymbols() { + for (auto &CommonSymbol : CommonSymbols) { + uint64_t CommonAddr = getMainBinarySymbolAddress(CommonSymbol); + if (CommonAddr == 0) { + // The main binary doesn't have an address for the given symbol. + continue; + } + if (!CurrentDebugMapObject->addSymbol(CommonSymbol, None /*ObjectAddress*/, + CommonAddr, 0 /*size*/)) { + // The symbol is already present. + continue; + } + } + CommonSymbols.clear(); +} + /// Create a new DebugMapObject. This function resets the state of the /// parser that was referring to the last object file and sets /// everything up to add symbols to the new one. void MachODebugMapParser::switchToNewDebugMapObject( StringRef Filename, sys::TimePoint Timestamp) { + addCommonSymbols(); resetParserState(); SmallString<80> Path(PathPrefix); @@ -466,10 +492,15 @@ void MachODebugMapParser::loadCurrentObj // relocations will use the symbol itself, and won't need an // object file address. The object file address field is optional // in the DebugMap, leave it unassigned for these symbols. - if (Sym.getFlags() & (SymbolRef::SF_Absolute | SymbolRef::SF_Common)) + uint32_t Flags = Sym.getFlags(); + if (Flags & SymbolRef::SF_Absolute) { CurrentObjectAddresses[*Name] = None; - else + } else if (Flags & SymbolRef::SF_Common) { + CurrentObjectAddresses[*Name] = None; + CommonSymbols.push_back(*Name); + } else { CurrentObjectAddresses[*Name] = Addr; + } } } From llvm-commits at lists.llvm.org Tue Oct 8 21:14:36 2019 From: llvm-commits at lists.llvm.org (Jinsong Ji via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 04:14:36 +0000 (UTC) Subject: [PATCH] D67950: [TableGen] Fix a bug that MCSchedClassDesc is interfered between different SchedModel In-Reply-To: References: Message-ID: <5a3e4091b21a8a2281bcd86a560ef17a@localhost.localdomain> jsji accepted this revision as: jsji. jsji added a comment. This revision is now accepted and ready to land. LGTM. 
It would be great if you can figure out why we still have difference in ARM tests, but it shouldn't block this. ================ Comment at: llvm/test/CodeGen/ARM/ParallelDSP/unroll-n-jam-smlad.ll:48 ; CHECK-REG-PRESSURE: ldr{{.*}}, [sp -; CHECK-REG-PRESSURE: ldr{{.*}}, [sp ; CHECK-REG-PRESSURE: bne .LBB0_1 ---------------- Why do we still have this difference? Shouldn't it be fixed as well? CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67950/new/ https://reviews.llvm.org/D67950 From llvm-commits at lists.llvm.org Tue Oct 8 21:19:33 2019 From: llvm-commits at lists.llvm.org (Jonas Devlieghere via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 04:19:33 +0000 (UTC) Subject: [PATCH] D68680: [dsymutil] Fix handling of common symbols in multiple object files. In-Reply-To: References: Message-ID: <7f16ba0edd99b75e6c543dd9ecd43318@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rG4ac388f7cacc: [dsymutil] Fix handling of common symbols in multiple object files. (authored by JDevlieghere). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68680/new/ https://reviews.llvm.org/D68680 Files: llvm/test/tools/dsymutil/Inputs/private/tmp/common/com llvm/test/tools/dsymutil/Inputs/private/tmp/common/com1.o llvm/test/tools/dsymutil/Inputs/private/tmp/common/com2.o llvm/test/tools/dsymutil/X86/common-sym-multi.test llvm/tools/dsymutil/MachODebugMapParser.cpp -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68680.223985.patch Type: text/x-patch Size: 4306 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 21:41:08 2019 From: llvm-commits at lists.llvm.org (Stefan Schmidt via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 04:41:08 +0000 (UTC) Subject: [PATCH] D68352: [lld] Don't create hints-section if Hint/Name Table is empty In-Reply-To: References: Message-ID: <65ea98ece2c086a8410511498ed6c930@localhost.localdomain> thrimbor added a comment. I updated the patch to add `--match-full-lines` as requested. Can you commit the patch for me, as I don't have commit access? CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68352/new/ https://reviews.llvm.org/D68352 From llvm-commits at lists.llvm.org Tue Oct 8 21:41:13 2019 From: llvm-commits at lists.llvm.org (Stefan Schmidt via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 04:41:13 +0000 (UTC) Subject: [PATCH] D68352: [lld] Don't create hints-section if Hint/Name Table is empty In-Reply-To: References: Message-ID: <0493130bd9e81a6deb5fb4e21b262a39@localhost.localdomain> thrimbor updated this revision to Diff 223986. 
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68352/new/ https://reviews.llvm.org/D68352 Files: lld/COFF/Writer.cpp lld/test/COFF/Inputs/ordinal-only-implib.def lld/test/COFF/imports-ordinal-only.s Index: lld/test/COFF/imports-ordinal-only.s =================================================================== --- /dev/null +++ lld/test/COFF/imports-ordinal-only.s @@ -0,0 +1,20 @@ +# REQUIRES: x86 +# +# RUN: llvm-dlltool -k -m i386 --input-def %p/Inputs/ordinal-only-implib.def --output-lib %t-implib.a +# RUN: llvm-mc -triple=i386-pc-win32 %s -filetype=obj -o %t.obj +# RUN: lld-link -out:%t.exe -entry:main -subsystem:console -safeseh:no -debug %t.obj %t-implib.a +# RUN: llvm-objdump -private-headers %t.exe | FileCheck --match-full-lines %s + +.text +.global _main +_main: +call _ByOrdinalFunction +ret + +# CHECK: The Import Tables: +# CHECK-NEXT: lookup 000020b4 time 00000000 fwd 00000000 name 000020c4 addr 000020bc +# CHECK-EMPTY: +# CHECK-NEXT: DLL Name: test.dll +# CHECK-NEXT: Hint/Ord Name +# CHECK-NEXT: 1 +# CHECK-EMPTY: Index: lld/test/COFF/Inputs/ordinal-only-implib.def =================================================================== --- /dev/null +++ lld/test/COFF/Inputs/ordinal-only-implib.def @@ -0,0 +1,3 @@ +LIBRARY test.dll +EXPORTS +ByOrdinalFunction @ 1 NONAME Index: lld/COFF/Writer.cpp =================================================================== --- lld/COFF/Writer.cpp +++ lld/COFF/Writer.cpp @@ -743,7 +743,8 @@ add(".idata$2", idata.dirs); add(".idata$4", idata.lookups); add(".idata$5", idata.addresses); - add(".idata$6", idata.hints); + if (!idata.hints.empty()) + add(".idata$6", idata.hints); add(".idata$7", idata.dllNames); } -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68352.223986.patch Type: text/x-patch Size: 1523 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 21:49:57 2019 From: llvm-commits at lists.llvm.org (Kristina Brooks via llvm-commits) Date: Wed, 9 Oct 2019 05:49:57 +0100 Subject: [llvm] r374139 - [dsymutil] Fix handling of common symbols in multiple object files. In-Reply-To: References: <20191009041619.307A6906D2@lists.llvm.org> Message-ID: +llvm-commits On Wed, Oct 9, 2019 at 5:44 AM Kristina Brooks wrote: > > Hi, > > Do you mind regenerating these test inputs so they have something like > `comm`/`comm1.o` etc. since these names are reserved on Windows: > > test/tools/dsymutil/Inputs/private/tmp/common/com > test/tools/dsymutil/Inputs/private/tmp/common/com1.o > test/tools/dsymutil/Inputs/private/tmp/common/com2.o > > (According to MS docs - "Also avoid these names followed immediately > by an extension; for example, NUL.txt is not recommended.") > > Thank you. > > > On Wed, Oct 9, 2019 at 5:13 AM Jonas Devlieghere via llvm-commits > wrote: > > > > Author: jdevlieghere > > Date: Tue Oct 8 21:16:18 2019 > > New Revision: 374139 > > > > URL: http://llvm.org/viewvc/llvm-project?rev=374139&view=rev > > Log: > > [dsymutil] Fix handling of common symbols in multiple object files. > > > > For common symbols the linker emits only a single symbol entry in the > > debug map. This caused dsymutil to not relocate common symbols when > > linking DWARF coming form object files that did not have this entry. > > This patch fixes that by keeping track of common symbols in the object > > files and synthesizing a debug map entry for them using the address from > > the main binary. 
> > > > Differential revision: https://reviews.llvm.org/D68680 > > > > Added: > > llvm/trunk/test/tools/dsymutil/Inputs/private/ > > llvm/trunk/test/tools/dsymutil/Inputs/private/tmp/ > > llvm/trunk/test/tools/dsymutil/Inputs/private/tmp/common/ > > llvm/trunk/test/tools/dsymutil/Inputs/private/tmp/common/com (with props) > > llvm/trunk/test/tools/dsymutil/Inputs/private/tmp/common/com1.o (with props) > > llvm/trunk/test/tools/dsymutil/Inputs/private/tmp/common/com2.o (with props) > > llvm/trunk/test/tools/dsymutil/X86/common-sym-multi.test > > Modified: > > llvm/trunk/tools/dsymutil/MachODebugMapParser.cpp > > > > Added: llvm/trunk/test/tools/dsymutil/Inputs/private/tmp/common/com > > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/dsymutil/Inputs/private/tmp/common/com?rev=374139&view=auto > > ============================================================================== > > Binary file - no diff available. > > > > Propchange: llvm/trunk/test/tools/dsymutil/Inputs/private/tmp/common/com > > ------------------------------------------------------------------------------ > > svn:executable = * > > > > Propchange: llvm/trunk/test/tools/dsymutil/Inputs/private/tmp/common/com > > ------------------------------------------------------------------------------ > > svn:mime-type = application/octet-stream > > > > Added: llvm/trunk/test/tools/dsymutil/Inputs/private/tmp/common/com1.o > > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/dsymutil/Inputs/private/tmp/common/com1.o?rev=374139&view=auto > > ============================================================================== > > Binary file - no diff available. 
> > > > Propchange: llvm/trunk/test/tools/dsymutil/Inputs/private/tmp/common/com1.o > > ------------------------------------------------------------------------------ > > svn:mime-type = application/octet-stream > > > > Added: llvm/trunk/test/tools/dsymutil/Inputs/private/tmp/common/com2.o > > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/dsymutil/Inputs/private/tmp/common/com2.o?rev=374139&view=auto > > ============================================================================== > > Binary file - no diff available. > > > > Propchange: llvm/trunk/test/tools/dsymutil/Inputs/private/tmp/common/com2.o > > ------------------------------------------------------------------------------ > > svn:mime-type = application/octet-stream > > > > Added: llvm/trunk/test/tools/dsymutil/X86/common-sym-multi.test > > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/dsymutil/X86/common-sym-multi.test?rev=374139&view=auto > > ============================================================================== > > --- llvm/trunk/test/tools/dsymutil/X86/common-sym-multi.test (added) > > +++ llvm/trunk/test/tools/dsymutil/X86/common-sym-multi.test Tue Oct 8 21:16:18 2019 > > @@ -0,0 +1,39 @@ > > +RUN: dsymutil -oso-prepend-path %p/../Inputs %p/../Inputs/private/tmp/common/com -f -o - | llvm-dwarfdump -debug-info - | FileCheck %s > > +RUN: dsymutil -oso-prepend-path %p/../Inputs %p/../Inputs/private/tmp/common/com -dump-debug-map | FileCheck %s --check-prefix DEBUGMAP > > + > > +The test was compiled from two source files: > > +$ cd /private/tmp/common > > +$ cat com1.c > > +int i[1000]; > > +int main() { > > + return i[1]; > > +} > > +$ cat com2.c > > +extern int i[1000]; > > +int bar() { > > + return i[0]; > > +} > > +$ clang -fcommon -g -c com1.c -o com1.o > > +$ clang -fcommon -g -c com2.c -o com2.o > > +$ clang -fcommon -g com1.o com2.o -o com > > + > > +CHECK: DW_TAG_compile_unit > > +CHECK: DW_TAG_variable > > +CHECK-NOT: {{NULL|DW_TAG}} > > +CHECK: 
DW_AT_name{{.*}}"i" > > +CHECK-NOT: {{NULL|DW_TAG}} > > +CHECK: DW_AT_location{{.*}}DW_OP_addr 0x100001000) > > + > > +CHECK: DW_TAG_compile_unit > > +CHECK: DW_TAG_variable > > +CHECK-NOT: {{NULL|DW_TAG}} > > +CHECK: DW_AT_name{{.*}}"i" > > +CHECK-NOT: {{NULL|DW_TAG}} > > +CHECK: DW_AT_location{{.*}}DW_OP_addr 0x100001000) > > + > > +DEBUGMAP: filename:{{.*}}com1.o > > +DEBUGMAP: symbols: > > +DEBUGMAP: sym: _i, binAddr: 0x0000000100001000, size: 0x00000000 > > +DEBUGMAP: filename:{{.*}}com2.o > > +DEBUGMAP: symbols: > > +DEBUGMAP: sym: _i, binAddr: 0x0000000100001000, size: 0x00000000 > > > > Modified: llvm/trunk/tools/dsymutil/MachODebugMapParser.cpp > > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/dsymutil/MachODebugMapParser.cpp?rev=374139&r1=374138&r2=374139&view=diff > > ============================================================================== > > --- llvm/trunk/tools/dsymutil/MachODebugMapParser.cpp (original) > > +++ llvm/trunk/tools/dsymutil/MachODebugMapParser.cpp Tue Oct 8 21:16:18 2019 > > @@ -14,6 +14,7 @@ > > #include "llvm/Support/Path.h" > > #include "llvm/Support/WithColor.h" > > #include "llvm/Support/raw_ostream.h" > > +#include > > > > namespace { > > using namespace llvm; > > @@ -51,6 +52,8 @@ private: > > StringRef MainBinaryStrings; > > /// The constructed DebugMap. > > std::unique_ptr Result; > > + /// List of common symbols that need to be added to the debug map. > > + std::vector CommonSymbols; > > > > /// Map of the currently processed object file symbol addresses. > > StringMap> CurrentObjectAddresses; > > @@ -81,6 +84,8 @@ private: > > STE.n_value); > > } > > > > + void addCommonSymbols(); > > + > > /// Dump the symbol table output header. > > void dumpSymTabHeader(raw_ostream &OS, StringRef Arch); > > > > @@ -122,11 +127,32 @@ void MachODebugMapParser::resetParserSta > > CurrentDebugMapObject = nullptr; > > } > > > > +/// Commons symbols won't show up in the symbol map but might need to be > > +/// relocated. 
We can add them to the symbol table ourselves by combining the > > +/// information in the object file (the symbol name) and the main binary (the > > +/// address). > > +void MachODebugMapParser::addCommonSymbols() { > > + for (auto &CommonSymbol : CommonSymbols) { > > + uint64_t CommonAddr = getMainBinarySymbolAddress(CommonSymbol); > > + if (CommonAddr == 0) { > > + // The main binary doesn't have an address for the given symbol. > > + continue; > > + } > > + if (!CurrentDebugMapObject->addSymbol(CommonSymbol, None /*ObjectAddress*/, > > + CommonAddr, 0 /*size*/)) { > > + // The symbol is already present. > > + continue; > > + } > > + } > > + CommonSymbols.clear(); > > +} > > + > > /// Create a new DebugMapObject. This function resets the state of the > > /// parser that was referring to the last object file and sets > > /// everything up to add symbols to the new one. > > void MachODebugMapParser::switchToNewDebugMapObject( > > StringRef Filename, sys::TimePoint Timestamp) { > > + addCommonSymbols(); > > resetParserState(); > > > > SmallString<80> Path(PathPrefix); > > @@ -466,10 +492,15 @@ void MachODebugMapParser::loadCurrentObj > > // relocations will use the symbol itself, and won't need an > > // object file address. The object file address field is optional > > // in the DebugMap, leave it unassigned for these symbols. 
> > - if (Sym.getFlags() & (SymbolRef::SF_Absolute | SymbolRef::SF_Common)) > > + uint32_t Flags = Sym.getFlags(); > > + if (Flags & SymbolRef::SF_Absolute) { > > CurrentObjectAddresses[*Name] = None; > > - else > > + } else if (Flags & SymbolRef::SF_Common) { > > + CurrentObjectAddresses[*Name] = None; > > + CommonSymbols.push_back(*Name); > > + } else { > > CurrentObjectAddresses[*Name] = Addr; > > + } > > } > > } > > > > > > > > _______________________________________________ > > llvm-commits mailing list > > llvm-commits at lists.llvm.org > > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits From llvm-commits at lists.llvm.org Tue Oct 8 21:59:19 2019 From: llvm-commits at lists.llvm.org (Hubert Tong via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 04:59:19 +0000 (UTC) Subject: [PATCH] D67008: [llvm-readobj][XCOFF]implement parsing relocation information for 32-bit xcoff objectfile In-Reply-To: References: Message-ID: <891d95ba54a43105b71bb45cdb71487a@localhost.localdomain> hubert.reinterpretcast added inline comments. ================ Comment at: llvm/include/llvm/Object/XCOFFObjectFile.h:168 + +bool isRelocationSigned(XCOFFRelocation32 &Reloc); + ---------------- sfertile wrote: > hubert.reinterpretcast wrote: > > DiggerLin wrote: > > > hubert.reinterpretcast wrote: > > > > Do these need to be declared in the header? Are they called only in one `.cpp` file? If so, they can be made `static` in the `.cpp` file. Otherwise, it seems odd that these aren't `const` member functions of `XCOFFRelocation32`. > > > the llvm-readobj is using those function and obj2yaml will use them too. > > It is still odd to me that these aren't `const` non-static member functions of `XCOFFRelocation32`. > I think were these originally templated to work with both 32-bit and 64-bit relocations, which explains why they aren't member functions. Would using CRTP with a base class template work for that case? 
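As a rough illustration of the CRTP idea raised above, the helpers could live in a base class template that both relocation structs inherit from. This is only a sketch under stated assumptions: `RelocationHelpers`, `Reloc32`, and `Reloc64` are hypothetical names, the field layout is simplified, and the bit masks (0x80 sign, 0x40 fixup, low six bits holding length minus one) are assumed from the XCOFF relocation format rather than taken from the patch.

```cpp
#include <cstdint>

// Hypothetical CRTP base: helpers are written once and read the Info byte
// of whichever relocation struct derives from this template.
template <typename Derived> struct RelocationHelpers {
  // Assumed encoding of the Info byte (not taken from the patch).
  static constexpr uint8_t SignMask = 0x80;
  static constexpr uint8_t FixupMask = 0x40;
  static constexpr uint8_t BiasedLengthMask = 0x3f;

  bool isSigned() const { return (self().Info & SignMask) != 0; }
  bool isFixupIndicated() const { return (self().Info & FixupMask) != 0; }
  // The length field is stored biased by one.
  uint8_t getLength() const {
    return static_cast<uint8_t>((self().Info & BiasedLengthMask) + 1);
  }

private:
  const Derived &self() const { return static_cast<const Derived &>(*this); }
};

// Simplified 32-bit relocation entry (hypothetical layout).
struct Reloc32 : RelocationHelpers<Reloc32> {
  uint32_t VirtualAddress;
  uint32_t SymbolIndex;
  uint8_t Info;
  uint8_t Type;
};

// Simplified 64-bit relocation entry: only the address width differs.
struct Reloc64 : RelocationHelpers<Reloc64> {
  uint64_t VirtualAddress;
  uint32_t SymbolIndex;
  uint8_t Info;
  uint8_t Type;
};
```

With this shape, callers such as llvm-readobj or obj2yaml would write `Reloc.isSigned()` for either width, and the helpers become `const` non-static member functions as requested earlier in the thread.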
Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67008/new/ https://reviews.llvm.org/D67008 From llvm-commits at lists.llvm.org Tue Oct 8 22:21:16 2019 From: llvm-commits at lists.llvm.org (Rui Ueyama via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 05:21:16 +0000 (UTC) Subject: [PATCH] D68352: [lld] Don't create hints-section if Hint/Name Table is empty In-Reply-To: References: Message-ID: ruiu added a comment. The test failed on my Linux box with the following message. Is that expected? /usr/local/google/home/ruiu/llvm/lld/test/COFF/imports-ordinal-only.s:15:15: error: CHECK-NEXT: expected string not found in input CHECK-NEXT: lookup 000020b4 time 00000000 fwd 00000000 name 000020c4 addr 000020bc ================================================================================== ^ :5:2: note: scanning from here lookup 000020bc time 00000000 fwd 00000000 name 000020cc addr 000020c4 I'll run the test on Windows machine too. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68352/new/ https://reviews.llvm.org/D68352 From llvm-commits at lists.llvm.org Tue Oct 8 22:21:16 2019 From: llvm-commits at lists.llvm.org (Hubert Tong via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 05:21:16 +0000 (UTC) Subject: [PATCH] D68575: [llvm-readobj][xcoff] implement parsing overflow section header. In-Reply-To: References: Message-ID: hubert.reinterpretcast added inline comments. ================ Comment at: llvm/tools/llvm-readobj/XCOFFDumper.cpp:58 + static constexpr unsigned SectionFlagsTypeMask = 0xffffu; const XCOFFObjectFile &Obj; }; ---------------- Add a blank line here. Also, I am wondering if this should be part of `llvm/BinaryFormat/XCOFF.h` (perhaps in `SectionHeader32`, or in a base class thereof when 64-bit support lands). 
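To make the role of the mask concrete, here is a minimal sketch of how the section type could be extracted from a 32-bit flags word. Only the `0xffffu` value of `SectionFlagsTypeMask` comes from the snippet under review; the helper name `sectionTypeFromFlags` and the sample flag values are assumptions made for illustration.

```cpp
#include <cstdint>

// Value taken from the snippet under review: the low 16 bits of the flags
// word carry the section type.
constexpr uint32_t SectionFlagsTypeMask = 0xffffu;

// Hypothetical helper: keep only the section type portion of the flags.
constexpr uint32_t sectionTypeFromFlags(uint32_t Flags) {
  return Flags & SectionFlagsTypeMask;
}
```

If the constant moved into a shared header as suggested, a helper of this kind could sit next to it so that every tool masks the flags the same way.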
================ Comment at: llvm/tools/llvm-readobj/XCOFFDumper.cpp:455 + case XCOFF::STYP_TYPCHK: + // TODO : The interpretation of loader, exception, type check section + // headers are different from that of generic section header. We will ---------------- The "TODO" still has a colon surrounded by spaces on both sides after it. I do not think that we have been using colons after "TODO". Still missing "and" before "type check section headers". Still missing "s" after "generic section header". Typo "seciton" is still present. ================ Comment at: llvm/tools/llvm-readobj/XCOFFDumper.cpp:463 + } + // For now we just dump the section type flags. + if (SectionType & SectionFlagsReservedMask) ---------------- Suggestion: "For now we just dump the section type portion of the flags." Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68575/new/ https://reviews.llvm.org/D68575 From llvm-commits at lists.llvm.org Tue Oct 8 22:21:16 2019 From: llvm-commits at lists.llvm.org (Thomas Lively via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 05:21:16 +0000 (UTC) Subject: [PATCH] D68684: [WebAssembly] Make returns variadic Message-ID: tlively created this revision. tlively added reviewers: aheejin, dschuff. Herald added subscribers: llvm-commits, sunfish, hiraditya, jgravelle-google, sbc100. Herald added a project: LLVM. This is necessary and sufficient to get simple cases of multiple return working with multivalue enabled. More complex cases will require block and loop signatures to be generalized to potentially be type indices as well. 
Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D68684 Files: llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyInstPrinter.cpp llvm/lib/Target/WebAssembly/WebAssemblyAsmPrinter.cpp llvm/lib/Target/WebAssembly/WebAssemblyCFGStackify.cpp llvm/lib/Target/WebAssembly/WebAssemblyFastISel.cpp llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp llvm/lib/Target/WebAssembly/WebAssemblyInstrControl.td llvm/lib/Target/WebAssembly/WebAssemblyInstrInfo.td llvm/lib/Target/WebAssembly/WebAssemblyMachineFunctionInfo.cpp llvm/lib/Target/WebAssembly/WebAssemblyPeephole.cpp llvm/test/CodeGen/MIR/WebAssembly/int-type-register-class-name.mir llvm/test/CodeGen/WebAssembly/atomic-fence.mir llvm/test/CodeGen/WebAssembly/eh-labels.mir llvm/test/CodeGen/WebAssembly/explicit-locals.mir llvm/test/CodeGen/WebAssembly/function-info.mir llvm/test/CodeGen/WebAssembly/llround-conv-i32.ll llvm/test/CodeGen/WebAssembly/multivalue.ll llvm/test/CodeGen/WebAssembly/reg-argument.mir llvm/test/CodeGen/WebAssembly/reg-copy.mir llvm/test/DebugInfo/WebAssembly/dbg-value-move-clone.mir llvm/test/DebugInfo/WebAssembly/dbg-value-move-reg-stackify.mir -------------- next part -------------- A non-text attachment was scrubbed... Name: D68684.223989.patch Type: text/x-patch Size: 26242 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 22:30:34 2019 From: llvm-commits at lists.llvm.org (Rui Ueyama via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 05:30:34 +0000 (UTC) Subject: [PATCH] D68352: [lld] Don't create hints-section if Hint/Name Table is empty In-Reply-To: References: Message-ID: <91913a60bf412ccb2f1f8d87ab5f0ff5@localhost.localdomain> ruiu added a comment. It looks like the same test fails on Windows with the same output, so I'll fix it to match the actual output and submit. Does this sound OK? 
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68352/new/ https://reviews.llvm.org/D68352 From llvm-commits at lists.llvm.org Tue Oct 8 22:35:34 2019 From: llvm-commits at lists.llvm.org (Stefan Schmidt via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 05:35:34 +0000 (UTC) Subject: [PATCH] D68352: [lld] Don't create hints-section if Hint/Name Table is empty In-Reply-To: References: Message-ID: thrimbor added a comment. The output on my Arch Linux machine matches the strings the test checks for - is it normal for these values to change? Should we ignore them? CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68352/new/ https://reviews.llvm.org/D68352 From llvm-commits at lists.llvm.org Tue Oct 8 22:44:30 2019 From: llvm-commits at lists.llvm.org (Rui Ueyama via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 05:44:30 +0000 (UTC) Subject: [PATCH] D68352: [lld] Don't create hints-section if Hint/Name Table is empty In-Reply-To: References: Message-ID: <7c417609296af677bea38d1694e55199@localhost.localdomain> ruiu added a comment. Build reproducibility (a property that if you feed the exact same inputs to compilers or linkers you can get the exact same output) is important. Unfortunately, by default, COFF has a timestamp field that changes virtually every time, but it looks like the difference between you and me is a difference in how things are laid out in the file. That is not supposed to vary. We can just ignore the line and check only the following three lines # CHECK-NEXT: DLL Name: test.dll # CHECK-NEXT: Hint/Ord Name # CHECK-NEXT: 1 but I'd like to understand what is going on. Are you at the repo head? 
CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68352/new/

https://reviews.llvm.org/D68352

From llvm-commits at lists.llvm.org Tue Oct 8 23:02:51 2019
From: llvm-commits at lists.llvm.org (Heejin Ahn via Phabricator via llvm-commits)
Date: Wed, 09 Oct 2019 06:02:51 +0000 (UTC)
Subject: [PATCH] D68527: [WebAssembly] v8x16.swizzle and rewrite BUILD_VECTOR lowering
In-Reply-To:
References:
Message-ID:

aheejin accepted this revision.
aheejin added a comment.
This revision is now accepted and ready to land.

> We can make the decision based on whatever heuristic we want, but minimizing number of instructions seems like a good metric for now until we can run experiments to tune the selection algorithm.

Wouldn't minimizing the number of instructions be the same thing as minimizing the number of bytes, only more inaccurate?

> I don't know how swizzles and v128.const compare, but I do know that emulating swizzles requires a lot of instructions per lane but emulating a v128.const only requires a single replace_lane and constant per lane.

I don't understand this part well. If swizzles are a lot more complicated than `v128.const` in execution, doesn't that mean swizzles are likely to take longer to execute in wasm? Why the opposite?

The above are just some passing questions, but I'm not suggesting we restore the byte computation logic or another more complicated logic at this point, given that we don't have any measurable performance data at hand. Optimizing at this point seems too premature anyway. LGTM.

================
Comment at: llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp:1384
+  SDValue SwizzleSrc;
+  SDValue SwizzleIndices;
+  size_t NumSwizzleLanes = 0;
----------------
aheejin wrote:
> Nit: Variable names for the same things in `GetSwizzleSrcs` are `SrcVec` and `IndexVec`. Making the variable names same in the two places might make reading easier.
In `GetSwizzleSrcs`, `IndexVec` is still `IndexVec`, while `SrcVec` was changed to `SwizzleSrc`.
Was that intentional?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68527/new/

https://reviews.llvm.org/D68527

From llvm-commits at lists.llvm.org Tue Oct 8 23:08:13 2019
From: llvm-commits at lists.llvm.org (Rahman Lavaee via Phabricator via llvm-commits)
Date: Wed, 09 Oct 2019 06:08:13 +0000 (UTC)
Subject: [PATCH] D68073: Propeller code layout optimizations
In-Reply-To:
References:
Message-ID: <073c404a1c8b7f8b5693383687f30dc9@localhost.localdomain>

rahmanl updated this revision to Diff 223990.
rahmanl added a comment.

This update addresses previous comments from @ruiu. Most importantly, it removes the propeller-align-basic-block functionality since it does not fit with the current code. I will add this functionality in a later patch.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68073/new/

https://reviews.llvm.org/D68073

Files:
  lld/ELF/PropellerBBReordering.cpp
  lld/ELF/PropellerBBReordering.h
  lld/ELF/PropellerFuncOrdering.cpp
  lld/ELF/PropellerFuncOrdering.h
  lld/test/ELF/propeller/propeller-layout-function-ordering.s
  lld/test/ELF/propeller/propeller-layout-function-with-loop.s
  lld/test/ELF/propeller/propeller-layout-optimal-fallthrough.s
  lld/test/ELF/propeller/propeller-opt-all-combinations.s

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68073.223990.patch
Type: text/x-patch
Size: 66095 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Tue Oct 8 23:25:39 2019
From: llvm-commits at lists.llvm.org (Stefan Schmidt via Phabricator via llvm-commits)
Date: Wed, 09 Oct 2019 06:25:39 +0000 (UTC)
Subject: [PATCH] D68352: [lld] Don't create hints-section if Hint/Name Table is empty
In-Reply-To:
References:
Message-ID:

thrimbor added a comment.

> but I'd like to understand what is going on. Are you at the repo head?

My copy was a few days old, but I just updated and the test still works for me.
I have a suspicion though - the path to the PDB gets embedded into the .rdata section , so that might shift things around depending on length of the path to the LLVM repo. I'll update the patch to ignore the addresses. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68352/new/ https://reviews.llvm.org/D68352 From llvm-commits at lists.llvm.org Tue Oct 8 23:25:40 2019 From: llvm-commits at lists.llvm.org (Stefan Schmidt via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 06:25:40 +0000 (UTC) Subject: [PATCH] D68352: [lld] Don't create hints-section if Hint/Name Table is empty In-Reply-To: References: Message-ID: <27f79418c2dceff2306e2207f0a69573@localhost.localdomain> thrimbor updated this revision to Diff 223991. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68352/new/ https://reviews.llvm.org/D68352 Files: lld/COFF/Writer.cpp lld/test/COFF/Inputs/ordinal-only-implib.def lld/test/COFF/imports-ordinal-only.s Index: lld/test/COFF/imports-ordinal-only.s =================================================================== --- /dev/null +++ lld/test/COFF/imports-ordinal-only.s @@ -0,0 +1,18 @@ +# REQUIRES: x86 +# +# RUN: llvm-dlltool -k -m i386 --input-def %p/Inputs/ordinal-only-implib.def --output-lib %t-implib.a +# RUN: llvm-mc -triple=i386-pc-win32 %s -filetype=obj -o %t.obj +# RUN: lld-link -out:%t.exe -entry:main -subsystem:console -safeseh:no -debug %t.obj %t-implib.a +# RUN: llvm-objdump -private-headers %t.exe | FileCheck --match-full-lines %s + +.text +.global _main +_main: +call _ByOrdinalFunction +ret + +# CHECK: The Import Tables: +# CHECK: DLL Name: test.dll +# CHECK-NEXT: Hint/Ord Name +# CHECK-NEXT: 1 +# CHECK-EMPTY: Index: lld/test/COFF/Inputs/ordinal-only-implib.def =================================================================== --- /dev/null +++ lld/test/COFF/Inputs/ordinal-only-implib.def @@ -0,0 +1,3 @@ +LIBRARY test.dll +EXPORTS +ByOrdinalFunction @ 1 NONAME Index: lld/COFF/Writer.cpp 
=================================================================== --- lld/COFF/Writer.cpp +++ lld/COFF/Writer.cpp @@ -743,7 +743,8 @@ add(".idata$2", idata.dirs); add(".idata$4", idata.lookups); add(".idata$5", idata.addresses); - add(".idata$6", idata.hints); + if (!idata.hints.empty()) + add(".idata$6", idata.hints); add(".idata$7", idata.dllNames); } -------------- next part -------------- A non-text attachment was scrubbed... Name: D68352.223991.patch Type: text/x-patch Size: 1414 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 23:34:45 2019 From: llvm-commits at lists.llvm.org (Rui Ueyama via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 06:34:45 +0000 (UTC) Subject: [PATCH] D68352: [lld] Don't create hints-section if Hint/Name Table is empty In-Reply-To: References: Message-ID: ruiu added a comment. Is it OK to drop `-debug` option from the lld-link command line? I think we don't need that option there. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68352/new/ https://reviews.llvm.org/D68352 From llvm-commits at lists.llvm.org Tue Oct 8 23:43:43 2019 From: llvm-commits at lists.llvm.org (Stefan Schmidt via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 06:43:43 +0000 (UTC) Subject: [PATCH] D68352: [lld] Don't create hints-section if Hint/Name Table is empty In-Reply-To: References: Message-ID: <7b2eeb7baa511c1d02a3106d83b05d1a@localhost.localdomain> thrimbor added a comment. 
Without the `-debug` parameter the test can be false positive as it won't reach the assert in the PDB code (in `addLinkerModuleCoffGroup`) then CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68352/new/ https://reviews.llvm.org/D68352 From llvm-commits at lists.llvm.org Tue Oct 8 23:48:24 2019 From: llvm-commits at lists.llvm.org (Rui Ueyama via llvm-commits) Date: Wed, 09 Oct 2019 06:48:24 -0000 Subject: [lld] r374140 - [lld] Don't create hints-section if Hint/Name Table is empty Message-ID: <20191009064824.E60F087E9D@lists.llvm.org> Author: ruiu Date: Tue Oct 8 23:48:24 2019 New Revision: 374140 URL: http://llvm.org/viewvc/llvm-project?rev=374140&view=rev Log: [lld] Don't create hints-section if Hint/Name Table is empty Fixes assert in addLinkerModuleCoffGroup() when using by-ordinal imports only. Patch by Stefan Schmidt. Differential revision: https://reviews.llvm.org/D68352 Added: lld/trunk/test/COFF/Inputs/ordinal-only-implib.def lld/trunk/test/COFF/imports-ordinal-only.s Modified: lld/trunk/COFF/Writer.cpp Modified: lld/trunk/COFF/Writer.cpp URL: http://llvm.org/viewvc/llvm-project/lld/trunk/COFF/Writer.cpp?rev=374140&r1=374139&r2=374140&view=diff ============================================================================== --- lld/trunk/COFF/Writer.cpp (original) +++ lld/trunk/COFF/Writer.cpp Tue Oct 8 23:48:24 2019 @@ -743,7 +743,8 @@ void Writer::addSyntheticIdata() { add(".idata$2", idata.dirs); add(".idata$4", idata.lookups); add(".idata$5", idata.addresses); - add(".idata$6", idata.hints); + if (!idata.hints.empty()) + add(".idata$6", idata.hints); add(".idata$7", idata.dllNames); } Added: lld/trunk/test/COFF/Inputs/ordinal-only-implib.def URL: http://llvm.org/viewvc/llvm-project/lld/trunk/test/COFF/Inputs/ordinal-only-implib.def?rev=374140&view=auto ============================================================================== --- lld/trunk/test/COFF/Inputs/ordinal-only-implib.def (added) +++ lld/trunk/test/COFF/Inputs/ordinal-only-implib.def Tue 
Oct 8 23:48:24 2019 @@ -0,0 +1,3 @@ +LIBRARY test.dll +EXPORTS +ByOrdinalFunction @ 1 NONAME Added: lld/trunk/test/COFF/imports-ordinal-only.s URL: http://llvm.org/viewvc/llvm-project/lld/trunk/test/COFF/imports-ordinal-only.s?rev=374140&view=auto ============================================================================== --- lld/trunk/test/COFF/imports-ordinal-only.s (added) +++ lld/trunk/test/COFF/imports-ordinal-only.s Tue Oct 8 23:48:24 2019 @@ -0,0 +1,18 @@ +# REQUIRES: x86 +# +# RUN: llvm-dlltool -k -m i386 --input-def %p/Inputs/ordinal-only-implib.def --output-lib %t-implib.a +# RUN: llvm-mc -triple=i386-pc-win32 %s -filetype=obj -o %t.obj +# RUN: lld-link -out:%t.exe -entry:main -subsystem:console -safeseh:no -debug %t.obj %t-implib.a +# RUN: llvm-objdump -private-headers %t.exe | FileCheck --match-full-lines %s + +.text +.global _main +_main: +call _ByOrdinalFunction +ret + +# CHECK: The Import Tables: +# CHECK: DLL Name: test.dll +# CHECK-NEXT: Hint/Ord Name +# CHECK-NEXT: 1 +# CHECK-EMPTY: From llvm-commits at lists.llvm.org Tue Oct 8 23:52:47 2019 From: llvm-commits at lists.llvm.org (Rui Ueyama via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 06:52:47 +0000 (UTC) Subject: [PATCH] D68352: [lld] Don't create hints-section if Hint/Name Table is empty In-Reply-To: References: Message-ID: ruiu added a comment. OK, I'll just apply your change and submit. Thanks! CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68352/new/ https://reviews.llvm.org/D68352 From llvm-commits at lists.llvm.org Tue Oct 8 23:52:48 2019 From: llvm-commits at lists.llvm.org (ChenZheng via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 06:52:48 +0000 (UTC) Subject: [PATCH] D68189: [InstCombine] recognize popcount implemented in hacker's delight. In-Reply-To: References: Message-ID: <8b2e9abd3f8970707b134b2679787d6a@localhost.localdomain> shchenz updated this revision to Diff 223992. shchenz added a comment. 
minor fix for testcase CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68189/new/ https://reviews.llvm.org/D68189 Files: llvm/include/llvm/IR/PatternMatch.h llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp llvm/test/Transforms/AggressiveInstCombine/popcount.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D68189.223992.patch Type: text/x-patch Size: 13228 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 23:52:53 2019 From: llvm-commits at lists.llvm.org (Rui Ueyama via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 06:52:53 +0000 (UTC) Subject: [PATCH] D68352: [lld] Don't create hints-section if Hint/Name Table is empty In-Reply-To: References: Message-ID: This revision was automatically updated to reflect the committed changes. Closed by commit rGc3c5e0fbbf76: [lld] Don't create hints-section if Hint/Name Table is empty (authored by ruiu). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68352/new/ https://reviews.llvm.org/D68352 Files: lld/COFF/Writer.cpp lld/test/COFF/Inputs/ordinal-only-implib.def lld/test/COFF/imports-ordinal-only.s Index: lld/test/COFF/imports-ordinal-only.s =================================================================== --- /dev/null +++ lld/test/COFF/imports-ordinal-only.s @@ -0,0 +1,18 @@ +# REQUIRES: x86 +# +# RUN: llvm-dlltool -k -m i386 --input-def %p/Inputs/ordinal-only-implib.def --output-lib %t-implib.a +# RUN: llvm-mc -triple=i386-pc-win32 %s -filetype=obj -o %t.obj +# RUN: lld-link -out:%t.exe -entry:main -subsystem:console -safeseh:no -debug %t.obj %t-implib.a +# RUN: llvm-objdump -private-headers %t.exe | FileCheck --match-full-lines %s + +.text +.global _main +_main: +call _ByOrdinalFunction +ret + +# CHECK: The Import Tables: +# CHECK: DLL Name: test.dll +# CHECK-NEXT: Hint/Ord Name +# CHECK-NEXT: 1 +# CHECK-EMPTY: Index: lld/test/COFF/Inputs/ordinal-only-implib.def 
=================================================================== --- /dev/null +++ lld/test/COFF/Inputs/ordinal-only-implib.def @@ -0,0 +1,3 @@ +LIBRARY test.dll +EXPORTS +ByOrdinalFunction @ 1 NONAME Index: lld/COFF/Writer.cpp =================================================================== --- lld/COFF/Writer.cpp +++ lld/COFF/Writer.cpp @@ -743,7 +743,8 @@ add(".idata$2", idata.dirs); add(".idata$4", idata.lookups); add(".idata$5", idata.addresses); - add(".idata$6", idata.hints); + if (!idata.hints.empty()) + add(".idata$6", idata.hints); add(".idata$7", idata.dllNames); } -------------- next part -------------- A non-text attachment was scrubbed... Name: D68352.223993.patch Type: text/x-patch Size: 1414 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 23:53:05 2019 From: llvm-commits at lists.llvm.org (=?utf-8?q?Martin_Storsj=C3=B6_via_Phabricator?= via llvm-commits) Date: Wed, 09 Oct 2019 06:53:05 +0000 (UTC) Subject: [PATCH] D68352: [lld] Don't create hints-section if Hint/Name Table is empty In-Reply-To: References: Message-ID: mstorsjo added inline comments. ================ Comment at: lld/test/COFF/imports-ordinal-only.s:3 +# +# RUN: llvm-dlltool -k -m i386 --input-def %p/Inputs/ordinal-only-implib.def --output-lib %t-implib.a +# RUN: llvm-mc -triple=i386-pc-win32 %s -filetype=obj -o %t.obj ---------------- A general comment (I meant to write this yesterday but I forgot); llvm-dlltool generally is a MinGW specific tool which can imply MinGW specific quirks. 
In this case I don't think there's anything specific which would introduce unexpected/unintended details, but I think it also might be possible to generate the implib with `lld-link -def:ordinal-only-implib.def -implib:%t-implib.a` Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68352/new/ https://reviews.llvm.org/D68352 From llvm-commits at lists.llvm.org Wed Oct 9 00:04:38 2019 From: llvm-commits at lists.llvm.org (Rui Ueyama via llvm-commits) Date: Wed, 09 Oct 2019 07:04:38 -0000 Subject: [lld] r374142 - Use lld-link instead of llvm-dlltool to create an implib Message-ID: <20191009070438.BD449907AD@lists.llvm.org> Author: ruiu Date: Wed Oct 9 00:04:38 2019 New Revision: 374142 URL: http://llvm.org/viewvc/llvm-project?rev=374142&view=rev Log: Use lld-link instead of llvm-dlltool to create an implib Suggested by Martin Storsjö. Modified: lld/trunk/test/COFF/imports-ordinal-only.s Modified: lld/trunk/test/COFF/imports-ordinal-only.s URL: http://llvm.org/viewvc/llvm-project/lld/trunk/test/COFF/imports-ordinal-only.s?rev=374142&r1=374141&r2=374142&view=diff ============================================================================== --- lld/trunk/test/COFF/imports-ordinal-only.s (original) +++ lld/trunk/test/COFF/imports-ordinal-only.s Wed Oct 9 00:04:38 2019 @@ -1,6 +1,6 @@ # REQUIRES: x86 # -# RUN: llvm-dlltool -k -m i386 --input-def %p/Inputs/ordinal-only-implib.def --output-lib %t-implib.a +# RUN: lld-link -machine:x86 -def:%p/Inputs/ordinal-only-implib.def -implib:%t-implib.a # RUN: llvm-mc -triple=i386-pc-win32 %s -filetype=obj -o %t.obj # RUN: lld-link -out:%t.exe -entry:main -subsystem:console -safeseh:no -debug %t.obj %t-implib.a # RUN: llvm-objdump -private-headers %t.exe | FileCheck --match-full-lines %s From llvm-commits at lists.llvm.org Wed Oct 9 00:02:15 2019 From: llvm-commits at lists.llvm.org (Rui Ueyama via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 07:02:15 +0000 (UTC) Subject: [PATCH] D68352: 
[lld] Don't create hints-section if Hint/Name Table is empty
In-Reply-To:
References:
Message-ID:

ruiu added inline comments.

================
Comment at: lld/test/COFF/imports-ordinal-only.s:3
+#
+# RUN: llvm-dlltool -k -m i386 --input-def %p/Inputs/ordinal-only-implib.def --output-lib %t-implib.a
+# RUN: llvm-mc -triple=i386-pc-win32 %s -filetype=obj -o %t.obj
----------------
mstorsjo wrote:
> A general comment (I meant to write this yesterday but I forgot); llvm-dlltool generally is a MinGW specific tool which can imply MinGW specific quirks. In this case I don't think there's anything specific which would introduce unexpected/unintended details, but I think it also might be possible to generate the implib with `lld-link -def:ordinal-only-implib.def -implib:%t-implib.a`
I will make the change as you suggested. Thanks!

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68352/new/

https://reviews.llvm.org/D68352

From llvm-commits at lists.llvm.org Wed Oct 9 00:20:20 2019
From: llvm-commits at lists.llvm.org (Seiya Nuta via Phabricator via llvm-commits)
Date: Wed, 09 Oct 2019 07:20:20 +0000 (UTC)
Subject: [PATCH] D66282: [llvm-objcopy][MachO] Implement --remove-section
In-Reply-To:
References:
Message-ID: <5bf043cfeabf3d8f8e128ea42d209c6c@localhost.localdomain>

seiya marked an inline comment as done.
seiya added inline comments.

================
Comment at: llvm/tools/llvm-objcopy/MachO/MachOObjcopy.cpp:42-46
   if (!Config.OnlySection.empty()) {
     RemovePred = [&Config, RemovePred](const Section &Sec) {
       return !Config.OnlySection.matches(Sec.CanonicalName);
     };
   }
----------------
rupprecht wrote:
> rupprecht wrote:
> > Not related to this patch, but looks like this discards `RemovePred`, e.g. `--strip-all --only-section` will effectively work like `--only-section`.
> > > > I found this after noticing that `RemovePred` is not referenced in the bit you added, which is actually fine there but error prone if the iterative construction of `RemovePred` is ever reordered. ELF objcopy works like this too, but correctly chains `RemovePred` for the `--only-section` switch. > Nevermind, that flag takes priority, so that's WAI. Ignore this comment. I'll add a comment for it. It's worth mentioning in the code. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D66282/new/ https://reviews.llvm.org/D66282 From llvm-commits at lists.llvm.org Wed Oct 9 00:20:23 2019 From: llvm-commits at lists.llvm.org (Bjorn Pettersson via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 07:20:23 +0000 (UTC) Subject: [PATCH] D68633: fix debug info affects output when opt inline In-Reply-To: References: Message-ID: bjope added a comment. In D68633#1699421 , @bjope wrote: > I do not understand how this helps. The code is written in a way that it skips any instruction, but moves contigous blocks of allocas in one splice (not sure exactly why, is that really faster?). Maybe the difference is that the check for AI->useEmpty() only is done for the first alloca in a sequence of alloca instructions? Or can't we just remove the loop at line 1847 (only moving one alloca at a time). The patch description really needs to be updated. Afaict the old code has been doing "Skip dbg instr while allocas scanning.", but with this patch you in some sense start to scan for both alloca instructions _and_ dbg intrinsics. And then some dbg intrinsics will be part of the splice. As Johannes mentioned this is incomplete (since not all dbg intrinsics are considered, only those appearing after the first alloca in for each lap in the outer for loop). Besides, inlining decisions should not be impacted by if we move the dbg intrinsics or not, right? That is why I think this fix isn't solving the problem seen in the bugzilla ticket. 
It only hides the actual problem (given the reduced reproducer).

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68633/new/

https://reviews.llvm.org/D68633

From llvm-commits at lists.llvm.org Wed Oct 9 00:29:53 2019
From: llvm-commits at lists.llvm.org (Pengfei Wang via Phabricator via llvm-commits)
Date: Wed, 09 Oct 2019 07:29:53 +0000 (UTC)
Subject: [PATCH] D68686: [X86] Add strict fp support for instructions fadd/fsub/fmul/fdiv
Message-ID:

pengfei created this revision.
pengfei added reviewers: craig.topper, RKSimon, andrew.w.kaylor, uweigand, kpn, spatel, cameron.mcinally.
Herald added subscribers: llvm-commits, hiraditya.
Herald added a project: LLVM.

This patch adds MXCSR as a reserved physical register and models its use by related instructions. It also adds flag "mayRaiseFPException" for them. Following what SystemZ and other targets do, only the current rounding modes and the IEEE exception masks are modeled. *Changes* of the MXCSR due to exceptions are not modeled.

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D68686

Files:
  llvm/lib/Target/X86/X86ISelLowering.cpp
  llvm/lib/Target/X86/X86InstrAVX512.td
  llvm/lib/Target/X86/X86InstrFormats.td
  llvm/lib/Target/X86/X86InstrInfo.cpp
  llvm/lib/Target/X86/X86InstrSSE.td
  llvm/lib/Target/X86/X86RegisterInfo.cpp
  llvm/lib/Target/X86/X86RegisterInfo.td
  llvm/test/CodeGen/MIR/X86/constant-pool.mir
  llvm/test/CodeGen/MIR/X86/fastmath.mir
  llvm/test/CodeGen/MIR/X86/memory-operands.mir
  llvm/test/CodeGen/X86/evex-to-vex-compress.mir
  llvm/test/CodeGen/X86/fp-strict-avx.ll
  llvm/test/CodeGen/X86/fp-strict-avx512.ll
  llvm/test/CodeGen/X86/fp-strict-sse.ll
  llvm/test/CodeGen/X86/ipra-reg-usage.ll
  llvm/test/CodeGen/X86/vector-constrained-fp-intrinsics.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68686.223995.patch Type: text/x-patch Size: 116637 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 00:38:34 2019 From: llvm-commits at lists.llvm.org (=?utf-8?q?Martin_Storsj=C3=B6_via_Phabricator?= via llvm-commits) Date: Wed, 09 Oct 2019 07:38:34 +0000 (UTC) Subject: [PATCH] D68134: [LLDB] Use the llvm microsoft demangler instead of the windows dbghelp api In-Reply-To: References: Message-ID: <66cb8785ebce2df07de0928133c39075@localhost.localdomain> mstorsjo marked an inline comment as done. mstorsjo added a comment. In D68134#1700453 , @amccarth wrote: > LGTM after one question. Thanks! I'll hold off committing for a bit still, as I might try to add more options to the microsoft demangler, to match the previous output. ================ Comment at: lldb/lit/SymbolFile/PDB/udt-layout.test:1 REQUIRES: system-windows, lld RUN: %build --compiler=clang-cl --output=%t.exe %S/Inputs/UdtLayoutTest.cpp ---------------- amccarth wrote: > Is `system-windows` still required after you've removed the dependency on dbghelp? Yes, this test depends on the system-provided PDB support, so it needs to run on windows. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68134/new/ https://reviews.llvm.org/D68134 From llvm-commits at lists.llvm.org Wed Oct 9 00:52:07 2019 From: llvm-commits at lists.llvm.org (Clement Courbet via llvm-commits) Date: Wed, 09 Oct 2019 07:52:07 -0000 Subject: [llvm] r374143 - [llvm-exegesis][NFC] Remove unecessary `using llvm::` directives. Message-ID: <20191009075207.BDCCE8F58F@lists.llvm.org> Author: courbet Date: Wed Oct 9 00:52:07 2019 New Revision: 374143 URL: http://llvm.org/viewvc/llvm-project?rev=374143&view=rev Log: [llvm-exegesis][NFC] Remove unecessary `using llvm::` directives. We've been in namespace llvm for at least a year. 
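To make the rationale in this log message concrete, here is a small self-contained C++ sketch. It does not use the real LLVM headers: the `llvm` namespace, `APInt`, `X86::EAX` and `regFor` below are toy stand-ins invented for illustration. It shows that code already nested inside `namespace llvm` finds `APInt` by ordinary unqualified lookup, so `using llvm::APInt;` adds nothing, while `using X86::EAX;` still shortens a nested name.

```cpp
// Toy stand-in for the LLVM namespace; none of these are the real classes.
namespace llvm {
struct APInt {
  int Value = 0;
};
namespace X86 {
constexpr int EAX = 1;
} // namespace X86

namespace exegesis {
namespace {
// No `using llvm::APInt;` needed here: unqualified lookup searches the
// enclosing namespaces and finds llvm::APInt on its own.
// `using X86::EAX;` is still useful, and `X86` itself resolves to llvm::X86.
using X86::EAX;

// Hypothetical helper that only exists to exercise both names.
int regFor(const APInt &V) { return V.Value == 0 ? EAX : 0; }
} // namespace
} // namespace exegesis
} // namespace llvm
```

From outside the namespace the helper would be reached as `llvm::exegesis::regFor(llvm::APInt{})`; inside it, neither name needs an `llvm::` prefix.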
Modified: llvm/trunk/unittests/tools/llvm-exegesis/AArch64/TargetTest.cpp llvm/trunk/unittests/tools/llvm-exegesis/X86/AssemblerTest.cpp llvm/trunk/unittests/tools/llvm-exegesis/X86/TargetTest.cpp Modified: llvm/trunk/unittests/tools/llvm-exegesis/AArch64/TargetTest.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/unittests/tools/llvm-exegesis/AArch64/TargetTest.cpp?rev=374143&r1=374142&r2=374143&view=diff ============================================================================== --- llvm/trunk/unittests/tools/llvm-exegesis/AArch64/TargetTest.cpp (original) +++ llvm/trunk/unittests/tools/llvm-exegesis/AArch64/TargetTest.cpp Wed Oct 9 00:52:07 2019 @@ -24,8 +24,6 @@ void InitializeAArch64ExegesisTarget(); namespace { -using llvm::APInt; -using llvm::MCInst; using testing::Gt; using testing::IsEmpty; using testing::Not; Modified: llvm/trunk/unittests/tools/llvm-exegesis/X86/AssemblerTest.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/unittests/tools/llvm-exegesis/X86/AssemblerTest.cpp?rev=374143&r1=374142&r2=374143&view=diff ============================================================================== --- llvm/trunk/unittests/tools/llvm-exegesis/X86/AssemblerTest.cpp (original) +++ llvm/trunk/unittests/tools/llvm-exegesis/X86/AssemblerTest.cpp Wed Oct 9 00:52:07 2019 @@ -16,12 +16,11 @@ void InitializeX86ExegesisTarget(); namespace { -using llvm::MCInstBuilder; -using llvm::X86::EAX; -using llvm::X86::MOV32ri; -using llvm::X86::MOV64ri32; -using llvm::X86::RAX; -using llvm::X86::XOR32rr; +using X86::EAX; +using X86::MOV32ri; +using X86::MOV64ri32; +using X86::RAX; +using X86::XOR32rr; class X86MachineFunctionGeneratorTest : public MachineFunctionGeneratorBaseTest { Modified: llvm/trunk/unittests/tools/llvm-exegesis/X86/TargetTest.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/unittests/tools/llvm-exegesis/X86/TargetTest.cpp?rev=374143&r1=374142&r2=374143&view=diff 
============================================================================== --- llvm/trunk/unittests/tools/llvm-exegesis/X86/TargetTest.cpp (original) +++ llvm/trunk/unittests/tools/llvm-exegesis/X86/TargetTest.cpp Wed Oct 9 00:52:07 2019 @@ -60,11 +60,6 @@ using testing::NotNull; using testing::Property; using testing::SizeIs; -using llvm::APInt; -using llvm::MCInst; -using llvm::MCInstBuilder; -using llvm::MCOperand; - Matcher IsImm(int64_t Value) { return AllOf(Property(&MCOperand::isImm, Eq(true)), Property(&MCOperand::getImm, Eq(Value))); From llvm-commits at lists.llvm.org Wed Oct 9 00:57:01 2019 From: llvm-commits at lists.llvm.org (Clement Courbet via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 07:57:01 +0000 (UTC) Subject: [PATCH] D68649: [Mips][llvm-exegesis] Add a Mips target In-Reply-To: References: Message-ID: <8c2bc08c7343d2219eceb5afe2dde493@localhost.localdomain> courbet added a comment. Thanks for the contribution. I only have cosmetic comments. ================ Comment at: unittests/tools/llvm-exegesis/Mips/TargetTest.cpp:27 + +using llvm::APInt; +using llvm::MCInst; ---------------- You don't need these. I guess this is a copy-paste from the X86 tests: these are remains from a time when we were not in namespace llvm. (Fixed in rL374143). ================ Comment at: unittests/tools/llvm-exegesis/Mips/TargetTest.cpp:56 +TEST_F(MipsTargetTest, SetRegToConstant) { + const auto Insts = setRegTo(llvm::Mips::T0, llvm::APInt()); + EXPECT_THAT(Insts, Not(IsEmpty())); ---------------- ditto: remove `llvm::` ================ Comment at: unittests/tools/llvm-exegesis/Mips/TargetTest.cpp:57 + const auto Insts = setRegTo(llvm::Mips::T0, llvm::APInt()); + EXPECT_THAT(Insts, Not(IsEmpty())); +} ---------------- Don't you want to test the actual instruction ? 
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68649/new/ https://reviews.llvm.org/D68649 From llvm-commits at lists.llvm.org Wed Oct 9 01:06:36 2019 From: llvm-commits at lists.llvm.org (Clement Courbet via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 08:06:36 +0000 (UTC) Subject: [PATCH] D68687: [llvm-exegesis][NFC] Remove extra `llvm::` qualifications. Message-ID: courbet created this revision. Herald added subscribers: jsji, MaskRay, tschuett, nemanjai. Herald added a project: LLVM. First patch: in unit tests. Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D68687 Files: llvm/unittests/tools/llvm-exegesis/AArch64/TargetTest.cpp llvm/unittests/tools/llvm-exegesis/ARM/AssemblerTest.cpp llvm/unittests/tools/llvm-exegesis/Common/AssemblerUtils.h llvm/unittests/tools/llvm-exegesis/PerfHelperTest.cpp llvm/unittests/tools/llvm-exegesis/PowerPC/AnalysisTest.cpp llvm/unittests/tools/llvm-exegesis/PowerPC/TargetTest.cpp llvm/unittests/tools/llvm-exegesis/RegisterValueTest.cpp llvm/unittests/tools/llvm-exegesis/X86/AssemblerTest.cpp llvm/unittests/tools/llvm-exegesis/X86/BenchmarkResultTest.cpp llvm/unittests/tools/llvm-exegesis/X86/RegisterAliasingTest.cpp llvm/unittests/tools/llvm-exegesis/X86/SchedClassResolutionTest.cpp llvm/unittests/tools/llvm-exegesis/X86/SnippetGeneratorTest.cpp llvm/unittests/tools/llvm-exegesis/X86/TargetTest.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D68687.223996.patch Type: text/x-patch Size: 47732 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 01:15:40 2019 From: llvm-commits at lists.llvm.org (Guillaume Chatelet via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 08:15:40 +0000 (UTC) Subject: [PATCH] D68646: [llvm-exegesis] Explore LEA addressing modes. In-Reply-To: References: Message-ID: gchatelet accepted this revision. gchatelet added inline comments. This revision is now accepted and ready to land. 
================ Comment at: llvm/tools/llvm-exegesis/lib/RegisterAliasing.h:106 +// a = a & ~b; +inline void remove(llvm::BitVector &A, const llvm::BitVector &B) { ---------------- Why not implement it this way ? Is this because of knowledge that B has few bits set? ================ Comment at: llvm/tools/llvm-exegesis/lib/X86/Target.cpp:181 +// Helper to fill a mempry operand with a value. +static void setMemOp(InstructionTemplate &IT, int OpIdx, ---------------- typo ================ Comment at: llvm/tools/llvm-exegesis/lib/X86/Target.cpp:217 + for (int LogScale = 0; LogScale <= 3; ++LogScale) { + for (const int Disp : {0, 42}) { + InstructionTemplate IT(Instr); ---------------- You should provide a rationale for why only `{0, 42}` Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68646/new/ https://reviews.llvm.org/D68646 From llvm-commits at lists.llvm.org Wed Oct 9 01:15:40 2019 From: llvm-commits at lists.llvm.org (Guillaume Chatelet via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 08:15:40 +0000 (UTC) Subject: [PATCH] D68398: [Alignment][NFC] Value::getPointerAlignment returns MaybeAlign In-Reply-To: References: Message-ID: gchatelet marked an inline comment as done. gchatelet added inline comments. ================ Comment at: llvm/lib/Analysis/Loads.cpp:48 + return false; } ---------------- jdoerfert wrote: > While I'm all for making the "getAlign" function explicit eventually, I think it would be good to keep it as is for this patch as there doesn't seem to be a reason to do this here. > > fwiw: I'm behind but eventually I was going to refactor this: D66618 > Which function are you talking about exactly? I don't see a `getAlign` function. 
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68398/new/ https://reviews.llvm.org/D68398 From llvm-commits at lists.llvm.org Wed Oct 9 01:24:41 2019 From: llvm-commits at lists.llvm.org (Clement Courbet via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 08:24:41 +0000 (UTC) Subject: [PATCH] D68646: [llvm-exegesis] Explore LEA addressing modes. In-Reply-To: References: Message-ID: <1bce4abcf2f8d8fd9da2341e6e8f40c9@localhost.localdomain> courbet updated this revision to Diff 223997. courbet marked 3 inline comments as done. courbet added a comment. Address comments Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68646/new/ https://reviews.llvm.org/D68646 Files: llvm/test/tools/llvm-exegesis/X86/latency-LEA64r.s llvm/test/tools/llvm-exegesis/X86/uops-LEA64r.s llvm/tools/llvm-exegesis/lib/RegisterAliasing.h llvm/tools/llvm-exegesis/lib/Uops.cpp llvm/tools/llvm-exegesis/lib/X86/Target.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D68646.223997.patch Type: text/x-patch Size: 9418 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 01:24:40 2019 From: llvm-commits at lists.llvm.org (serge via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 08:24:40 +0000 (UTC) Subject: [PATCH] D68525: [lit] Refactor ProgressDisplay In-Reply-To: References: Message-ID: serge-sans-paille added a comment. LGTM, thanks for the refactoring. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68525/new/ https://reviews.llvm.org/D68525 From llvm-commits at lists.llvm.org Wed Oct 9 01:24:41 2019 From: llvm-commits at lists.llvm.org (Clement Courbet via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 08:24:41 +0000 (UTC) Subject: [PATCH] D68646: [llvm-exegesis] Explore LEA addressing modes. 
In-Reply-To: References: Message-ID: <795e16d18c934a210a907c35d6fc27c5@localhost.localdomain> courbet added a comment. Thanks ! ================ Comment at: llvm/tools/llvm-exegesis/lib/RegisterAliasing.h:106 +// a = a & ~b; +inline void remove(llvm::BitVector &A, const llvm::BitVector &B) { ---------------- gchatelet wrote: > Why not implement it this way ? Is this because of knowledge that B has few bits set? Right, this warrants a comment. Done. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68646/new/ https://reviews.llvm.org/D68646 From llvm-commits at lists.llvm.org Wed Oct 9 01:27:48 2019 From: llvm-commits at lists.llvm.org (Jeremy Morse via llvm-commits) Date: Wed, 09 Oct 2019 08:27:48 -0000 Subject: [llvm] r374144 - Revert r374139, "[dsymutil] Fix handling of common symbols in multiple object files." Message-ID: <20191009082748.824518F208@lists.llvm.org> Author: jmorse Date: Wed Oct 9 01:27:48 2019 New Revision: 374144 URL: http://llvm.org/viewvc/llvm-project?rev=374144&view=rev Log: Revert r374139, "[dsymutil] Fix handling of common symbols in multiple object files." The added test files ("com", "com1.o", "com2.o") are reserved names on Windows, and makes 'git checkout' fail with a filesystem error. 
Removed: llvm/trunk/test/tools/dsymutil/Inputs/private/ llvm/trunk/test/tools/dsymutil/X86/common-sym-multi.test Modified: llvm/trunk/tools/dsymutil/MachODebugMapParser.cpp Removed: llvm/trunk/test/tools/dsymutil/X86/common-sym-multi.test URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/dsymutil/X86/common-sym-multi.test?rev=374143&view=auto ============================================================================== --- llvm/trunk/test/tools/dsymutil/X86/common-sym-multi.test (original) +++ llvm/trunk/test/tools/dsymutil/X86/common-sym-multi.test (removed) @@ -1,39 +0,0 @@ -RUN: dsymutil -oso-prepend-path %p/../Inputs %p/../Inputs/private/tmp/common/com -f -o - | llvm-dwarfdump -debug-info - | FileCheck %s -RUN: dsymutil -oso-prepend-path %p/../Inputs %p/../Inputs/private/tmp/common/com -dump-debug-map | FileCheck %s --check-prefix DEBUGMAP - -The test was compiled from two source files: -$ cd /private/tmp/common -$ cat com1.c -int i[1000]; -int main() { - return i[1]; -} -$ cat com2.c -extern int i[1000]; -int bar() { - return i[0]; -} -$ clang -fcommon -g -c com1.c -o com1.o -$ clang -fcommon -g -c com2.c -o com2.o -$ clang -fcommon -g com1.o com2.o -o com - -CHECK: DW_TAG_compile_unit -CHECK: DW_TAG_variable -CHECK-NOT: {{NULL|DW_TAG}} -CHECK: DW_AT_name{{.*}}"i" -CHECK-NOT: {{NULL|DW_TAG}} -CHECK: DW_AT_location{{.*}}DW_OP_addr 0x100001000) - -CHECK: DW_TAG_compile_unit -CHECK: DW_TAG_variable -CHECK-NOT: {{NULL|DW_TAG}} -CHECK: DW_AT_name{{.*}}"i" -CHECK-NOT: {{NULL|DW_TAG}} -CHECK: DW_AT_location{{.*}}DW_OP_addr 0x100001000) - -DEBUGMAP: filename:{{.*}}com1.o -DEBUGMAP: symbols: -DEBUGMAP: sym: _i, binAddr: 0x0000000100001000, size: 0x00000000 -DEBUGMAP: filename:{{.*}}com2.o -DEBUGMAP: symbols: -DEBUGMAP: sym: _i, binAddr: 0x0000000100001000, size: 0x00000000 Modified: llvm/trunk/tools/dsymutil/MachODebugMapParser.cpp URL: 
http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/dsymutil/MachODebugMapParser.cpp?rev=374144&r1=374143&r2=374144&view=diff ============================================================================== --- llvm/trunk/tools/dsymutil/MachODebugMapParser.cpp (original) +++ llvm/trunk/tools/dsymutil/MachODebugMapParser.cpp Wed Oct 9 01:27:48 2019 @@ -14,7 +14,6 @@ #include "llvm/Support/Path.h" #include "llvm/Support/WithColor.h" #include "llvm/Support/raw_ostream.h" -#include namespace { using namespace llvm; @@ -52,8 +51,6 @@ private: StringRef MainBinaryStrings; /// The constructed DebugMap. std::unique_ptr Result; - /// List of common symbols that need to be added to the debug map. - std::vector CommonSymbols; /// Map of the currently processed object file symbol addresses. StringMap> CurrentObjectAddresses; @@ -84,8 +81,6 @@ private: STE.n_value); } - void addCommonSymbols(); - /// Dump the symbol table output header. void dumpSymTabHeader(raw_ostream &OS, StringRef Arch); @@ -127,32 +122,11 @@ void MachODebugMapParser::resetParserSta CurrentDebugMapObject = nullptr; } -/// Commons symbols won't show up in the symbol map but might need to be -/// relocated. We can add them to the symbol table ourselves by combining the -/// information in the object file (the symbol name) and the main binary (the -/// address). -void MachODebugMapParser::addCommonSymbols() { - for (auto &CommonSymbol : CommonSymbols) { - uint64_t CommonAddr = getMainBinarySymbolAddress(CommonSymbol); - if (CommonAddr == 0) { - // The main binary doesn't have an address for the given symbol. - continue; - } - if (!CurrentDebugMapObject->addSymbol(CommonSymbol, None /*ObjectAddress*/, - CommonAddr, 0 /*size*/)) { - // The symbol is already present. - continue; - } - } - CommonSymbols.clear(); -} - /// Create a new DebugMapObject. This function resets the state of the /// parser that was referring to the last object file and sets /// everything up to add symbols to the new one. 
void MachODebugMapParser::switchToNewDebugMapObject( StringRef Filename, sys::TimePoint Timestamp) { - addCommonSymbols(); resetParserState(); SmallString<80> Path(PathPrefix); @@ -492,15 +466,10 @@ void MachODebugMapParser::loadCurrentObj // relocations will use the symbol itself, and won't need an // object file address. The object file address field is optional // in the DebugMap, leave it unassigned for these symbols. - uint32_t Flags = Sym.getFlags(); - if (Flags & SymbolRef::SF_Absolute) { - CurrentObjectAddresses[*Name] = None; - } else if (Flags & SymbolRef::SF_Common) { + if (Sym.getFlags() & (SymbolRef::SF_Absolute | SymbolRef::SF_Common)) CurrentObjectAddresses[*Name] = None; - CommonSymbols.push_back(*Name); - } else { + else CurrentObjectAddresses[*Name] = Addr; - } } } From llvm-commits at lists.llvm.org Wed Oct 9 01:32:10 2019 From: llvm-commits at lists.llvm.org (Jeremy Morse via llvm-commits) Date: Wed, 9 Oct 2019 09:32:10 +0100 Subject: [llvm] r374139 - [dsymutil] Fix handling of common symbols in multiple object files. In-Reply-To: References: <20191009041619.307A6906D2@lists.llvm.org> Message-ID: Hi Jonas, FYI I reverted this in r374144, the reserved-names issue Kristina points out causes "git checkout" to fail on Windows. -- Thanks, Jeremy From llvm-commits at lists.llvm.org Wed Oct 9 01:33:47 2019 From: llvm-commits at lists.llvm.org (Hans Wennborg via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 08:33:47 +0000 (UTC) Subject: [PATCH] D68570: Unify the two CRC implementations In-Reply-To: References: Message-ID: <85b2835e89ec9895ddf240839a7c6b57@localhost.localdomain> hans marked 4 inline comments as done. hans added a comment. In D68570#1700024 , @rupprecht wrote: > In D68570#1699204 , @hans wrote: > > > In D68570#1697872 , @rupprecht wrote: > > > > > Do you have any benchmarks? > > > > > > Of what? That generating the table is slower than just using it directly? 
I'd say no benchmark is needed to conclude that :-) > > > There is also the cost of code complexity to consider. Sure, the table generation goes away, but does that even show up on any benchmark/real world usage of this? Nobody's exercising the table generation code. People are either check-summing once (in which case, it doesn't matter as much if it's slow, because it happens once) or check-summing 1000 times (in which case the one-time table generation is probably not the CRC bottleneck). If there's no performance win, is it worth the potential code complexity of an opaque hex array over constructing it with an algorithm that can be inspected? You're right that it probably doesn't matter much, it's more the principle that it's wasteful to compute it at runtime. I think maybe our disagreement comes from that I actually see including the table directly as *less* code complexity than the way we currently generate it. It's a code pattern for computing CRC-32. If you search the web for how to compute CRC-32, this table shows up. Some specs even define CRC-32 in terms of the table (e.g. https://docs.microsoft.com/en-us/openspecs/office_protocols/ms-abs/06966aa2-70da-4bf9-8448-3355f277cd77). I think the code to generate the table (and to use it) is just as opaque. Where did the current code come from? It's from https://github.com/libarchive/libarchive/blob/master/libarchive/archive_crc32.h How did it work? There are no comments. It's a fancy algorithm and there's no way around reading some external sources to figure out how it works. Using the pre-computed table directly removes some unnecessary work and in my opinion it's clearer than the previous code. > FWIW I'm still in favor of this patch, and the objcopy part that Herald added me for LGTM. Addressing my comments would help me understand this code but feel free to ignore if I'm the only one that feels this way. I appreciate your comments and I hope I've addressed them satisfactorily. 
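For readers following the table discussion, here is a minimal sketch of the table-driven pattern, assuming the standard reflected CRC-32 polynomial 0xEDB88320. The names `makeCRC32Table` and `crc32Sketch` are hypothetical and chosen for illustration only; they are not the functions in llvm/lib/Support/CRC.cpp. The sketch generates the 256-entry table at run time so that its values can be compared against a hard-coded copy (the first four entries quoted from the patch are 0x00000000, 0x77073096, 0xee0e612c, 0x990951ba):

```cpp
#include <array>
#include <cstdint>
#include <string>

// Hypothetical helper (not LLVM's API): builds the 256-entry lookup table
// for the reflected CRC-32 polynomial 0xEDB88320.
std::array<uint32_t, 256> makeCRC32Table() {
  std::array<uint32_t, 256> Table{};
  for (uint32_t I = 0; I < 256; ++I) {
    uint32_t V = I;
    // Process one input byte bit by bit: shift right, and fold in the
    // polynomial whenever the bit shifted out was set.
    for (int Bit = 0; Bit < 8; ++Bit)
      V = (V & 1) ? (V >> 1) ^ 0xEDB88320u : (V >> 1);
    Table[I] = V;
  }
  return Table;
}

// Hypothetical helper (not LLVM's API): table-driven CRC-32 over a byte
// string, consuming one table entry per input byte.
uint32_t crc32Sketch(const std::string &Data, uint32_t CRC = 0) {
  static const std::array<uint32_t, 256> Table = makeCRC32Table();
  CRC ^= 0xFFFFFFFFu; // Conventional pre-inversion.
  for (unsigned char Byte : Data)
    CRC = Table[(CRC ^ Byte) & 0xFF] ^ (CRC >> 8);
  return CRC ^ 0xFFFFFFFFu; // Conventional post-inversion.
}
```

With this sketch, check-summing "123456789" yields 0xCBF43926, the usual CRC-32 check value. Comparing a generated table entry by entry against a hard-coded one is one way to catch a single mistyped constant.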
In D68570#1700080 , @rnk wrote: > Maybe a dumb idea: can we compute the table with constexpr evaluation? You could set up a constexpr function that returns a struct that wraps the array, and then the body of the function would construct the array imperatively as the old initialization code did. Set up a constexpr global with an initializer that calls the function. This sounds fun, but I don't think it would be a step towards less code complexity :-) > If not all compilers support this, we could instead use static_assert to check that the table is correct under an `#ifdef __clang__`, or whatever conditions apply. > > One possible drawback of this approach is compile time. If that ends up mattering, it could be ifdef EXPENSIVE_CHECKS or something awful like that. > > Another drawback is that I have seen MSVC accept globals marked constexpr, but then sometimes they get dynamic initializers anyway. The static_assert approach might be safer. I've extended the unit test to exercise the whole table. That should give us the benefit of static_assert'ing that it's correct, without the code complexity or compile time. ================ Comment at: llvm/lib/Support/CRC.cpp:26 -uint32_t llvm::crc32(uint32_t CRC, StringRef S) { - static llvm::once_flag InitFlag; - static CRC32Table Tbl; - llvm::call_once(InitFlag, initCRC32Table, &Tbl); +static const uint32_t CRCTable[256] = { + 0x00000000, 0x77073096, 0xee0e612c, 0x990951ba, ---------------- rupprecht wrote: > hans wrote: > > hiraditya wrote: > > > rupprecht wrote: > > > > Can you leave a comment how this table was generated/how it could be regenerated if needed in the future? And/or a unit test to assert the values are correct? > > > +1 > > I'm adding a comment with references for how the algorithm and the table works. 
There is already a unit test in llvm/unittests/Support/CRCTest.cpp > From what I can tell: > - This algorithm makes use of one byte from this table per byte in the input array > - The table has 256 entries > - The test only includes "The quick brown fox jumps over the lazy dog" and "123456789" (43 total values if I'm counting correctly) > There's no way that test sufficiently tests that all 256 values here are correct. Maybe check summing a very large (>100k) buffer would give confidence that -- probably -- all values are being used. > > I think the unit test is sufficient for testing a generated table because it's hard to mess up the generation in a way that only affects some values. However with a hard-coded table, it's much more likely that a single value here could be wrong due to mechanical issues (e.g. accidentally changing a character in the process of formatting it). That's a fair point. I've added a test that exercises the whole table. ================ Comment at: llvm/lib/Support/CRC.cpp:29-37 - for (size_t I = 0; I < Tbl->size(); ++I) { - uint32_t V = Shuffle(I); - V = Shuffle(V); - V = Shuffle(V); - V = Shuffle(V); - V = Shuffle(V); - V = Shuffle(V); ---------------- rupprecht wrote: > I'd personally find keeping this as a comment to be more useful in understanding how the table is generated than digging up papers But to understand this code, one would need to dig up papers too. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68570/new/ https://reviews.llvm.org/D68570 From llvm-commits at lists.llvm.org Wed Oct 9 01:33:47 2019 From: llvm-commits at lists.llvm.org (Vitaly Buka via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 08:33:47 +0000 (UTC) Subject: [PATCH] D68676: [ASan] Do not misrepresent high value address dereferences as null dereferences In-Reply-To: References: Message-ID: vitalybuka added inline comments. 
================ Comment at: compiler-rt/lib/sanitizer_common/sanitizer_posix.cpp:302 + // The real codes (e.g., SEGV_MAPERR, SEGV_MAPERR) are non-zero + return si->si_signo == SIGSEGV && si->si_code != 0; +} ---------------- Can you move it into the Darwin-specific code? On Linux, si_code is SI_KERNEL in similar cases, so we will need different implementations. ================ Comment at: compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_report.cpp:194 const char *description = sig.Describe(); - Report("ERROR: %s: %s on unknown address %p (pc %p bp %p sp %p T%d)\n", - SanitizerToolName, description, (void *)sig.addr, (void *)sig.pc, - (void *)sig.bp, (void *)sig.sp, tid); + Report("ERROR: %s: %s on unknown address", SanitizerToolName, description); + if (!sig.is_memory_access || sig.is_true_faulting_addr) ---------------- Can you change it to ``` if (trueaddress) Print("Full error with address") else Print("Full error without address") ``` I think that is more readable, and it causes fewer problems when multiple processes write to the same terminal. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68676/new/ https://reviews.llvm.org/D68676 From llvm-commits at lists.llvm.org Wed Oct 9 01:34:04 2019 From: llvm-commits at lists.llvm.org (Hans Wennborg via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 08:34:04 +0000 (UTC) Subject: [PATCH] D68570: Unify the two CRC implementations In-Reply-To: References: Message-ID: hans updated this revision to Diff 223999. hans marked 2 inline comments as done. hans added a comment. Extend unit test; clang-format.
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68570/new/ https://reviews.llvm.org/D68570 Files: clang/lib/AST/MicrosoftMangle.cpp lld/COFF/PDB.cpp lldb/source/Plugins/ObjectFile/ELF/ObjectFileELF.cpp llvm/include/llvm/Support/CRC.h llvm/include/llvm/Support/JamCRC.h llvm/lib/DebugInfo/PDB/Native/Hash.cpp llvm/lib/DebugInfo/PDB/Native/PDBFileBuilder.cpp llvm/lib/DebugInfo/PDB/Native/TpiHashing.cpp llvm/lib/DebugInfo/Symbolize/Symbolize.cpp llvm/lib/MC/WinCOFFObjectWriter.cpp llvm/lib/Support/CMakeLists.txt llvm/lib/Support/CRC.cpp llvm/lib/Support/JamCRC.cpp llvm/lib/Transforms/Instrumentation/PGOInstrumentation.cpp llvm/tools/llvm-objcopy/COFF/COFFObjcopy.cpp llvm/tools/llvm-objcopy/CopyConfig.cpp llvm/unittests/Support/CRCTest.cpp llvm/utils/gn/secondary/llvm/lib/Support/BUILD.gn -------------- next part -------------- A non-text attachment was scrubbed... Name: D68570.223999.patch Type: text/x-patch Size: 27379 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 01:42:50 2019 From: llvm-commits at lists.llvm.org (Dave Green via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 08:42:50 +0000 (UTC) Subject: [PATCH] D68283: [ARM] Selection for MVE VMOVN In-Reply-To: References: Message-ID: dmgreen updated this revision to Diff 224000. dmgreen added a comment. It turns out that this was incorrect, just not for the reason I was thinking. I was missing type checks, so we were trying to form VMOVNs for 32-bit types (which would fail to select). I've added the type checks and some extra tests for such cases. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68283/new/ https://reviews.llvm.org/D68283 Files: llvm/lib/Target/ARM/ARMISelLowering.cpp llvm/lib/Target/ARM/ARMISelLowering.h llvm/lib/Target/ARM/ARMInstrMVE.td llvm/test/CodeGen/Thumb2/mve-vmovn.ll -------------- next part -------------- A non-text attachment was scrubbed...
Name: D68283.224000.patch Type: text/x-patch Size: 20002 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 01:42:54 2019 From: llvm-commits at lists.llvm.org (David Stenberg via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 08:42:54 +0000 (UTC) Subject: [PATCH] D67556: [ARM][AArch64][DebugInfo] Improve call site instruction interpretation In-Reply-To: References: Message-ID: <6162ae86e6d9015a7d51aea0bfcede1e@localhost.localdomain> dstenb added inline comments. ================ Comment at: include/llvm/CodeGen/TargetInstrInfo.h:888 + /// If the specific machine instruction is an instruction that adds an + /// immediate value to its first operand and stores it in the first, return + /// true along with @Source machine operand to which @Offset has been ---------------- NikolaPrica wrote: > dstenb wrote: > > NikolaPrica wrote: > > > dstenb wrote: > > > > dstenb wrote: > > > > > I wonder if the hook should allow the source and destination to be different, as we then for example could describe cases like this: > > > > > > > > > > ``` > > > > > $reg0 = add $frame-ptr, -13 > > > > > ``` > > > > > > > > > > If so, would it then make sense to move the LEA part of X86's `describeLoadedValue()` hook into this hook instead? > > > > If so, should we perhaps also consider generalizing the hook so that it has a Destination out-parameter, e.g. same as `isCopyInstr()`? That could probably be helpful if we make the `describeLoadedValue()` hook aware of which register it should describe, as we discussed in D67225. > > > > I wonder if the hook should allow the source and destination to be different > > > > > > > > > In fact we should only relay on situations were source and destination operands are different. Such restriction should be used at general part of `describeLoadedValue()`. There is no use of describing situatios like > > > > > > $reg0 = add $reg0, 4 > > > > > > This case would require recursive description of $reg0. 
Describing such instruction is a different story. > > > > > > > If so, would it then make sense to move the LEA part of X86's describeLoadedValue() hook into this hook instead? > > > > > > The LEA instruction is more complex than add immidiate instruction. It could be observed as add immidiate for one case but it could also have addition of multiple source registers with some multiplication operations. IMHO it would be better to keep this API function clean in sense of recognizing only clear add immidiate instruction for purpose of further usage. > > > > > > > If so, should we perhaps also consider generalizing the hook so that it has a Destination out-parameter, e.g. same as isCopyInstr()? That could probably be helpful if we make the describeLoadedValue() hook aware of which register it should describe, as we discussed in D67225. > > > > > > In the first version of this function I've added a destination operand but I've removed it since there was no current use of it. But for further flexibilty I will add it. > > > > > > In fact we should only relay on situations were source and destination operands are different. Such restriction should be used at general part of describeLoadedValue(). There is no use of describing situatios like > > > > > > $reg0 = add $reg0, 4 > > > > In previous revisions of the downstream target we develop for we had to resort to: > > > > ``` > > $reg0 = mov $frame-ptr > > $reg0 = add $reg0, $offset > > ``` > > > > instead of loading the frame pointer with an offset in one instruction. Perhaps there is some upstream target that requires the same? > > > > > This case would require recursive description of $reg0. Describing such instruction is a different story. > > > > Is that due to the issue with expressions in collectCallSiteParameters() which we discussed earlier in this patch? > > > > > The LEA instruction is more complex than add immidiate instruction. 
It could be observed as add immidiate for one case but it could also have addition of multiple source registers with some multiplication operations. IMHO it would be better to keep this API function clean in sense of recognizing only clear add immidiate instruction for purpose of further usage. > > > > Okay, that sounds fair. Moving some parts to the LEA implementation to this hook, and keeping the rest in `describeLoadedValue()` would probably not be ideal. > > > > > In the first version of this function I've added a destination operand but I've removed it since there was no current use of it. But for further flexibilty I will add it. > > > > Okay, thanks! > > Is that due to the issue with expressions in collectCallSiteParameters() which we discussed earlier in this patch? > > Yes. You are right. Such cases should be handled the way we discussed there. But until such support is provided such instructions ($reg0 = add $reg0, 4) should be omitted. I will emphasize that as TODO comment. > > [...] adds an immediate value to its first operand and stores it in the first, [...] This part of the comment needs to be updated to reflect that the destination and source can be any operands. ================ Comment at: lib/Target/ARM/ARMBaseInstrInfo.cpp:5335 + // TODO: Third operand can be global address (usually some string). Since + // strings can be relocated we cannot calculate theirs offsets for + // now. 
---------------- nit: //theirs -> their// ================ Comment at: test/DebugInfo/MIR/AArch64/dbgcall-site-interpretation.mir:2 +# RUN: llc -mtriple aarch64-linux-gnu -debug-entry-values -start-after=machineverifier -filetype=obj %s -o -| llvm-dwarfdump -| FileCheck %s +# CHECK: DW_TAG_GNU_call_site +# CHECK-NEXT: DW_AT_abstract_origin {{.*}} "func2" ---------------- I'm sorry for such a late comment about this, but if you still have the C reproducers readily available I think it would be helpful to add them to the tests (assuming that the IR or MIR haven't been substantially modified). CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67556/new/ https://reviews.llvm.org/D67556 From llvm-commits at lists.llvm.org Wed Oct 9 01:42:55 2019 From: llvm-commits at lists.llvm.org (Aleksandr Urakov via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 08:42:55 +0000 (UTC) Subject: [PATCH] D68680: [dsymutil] Fix handling of common symbols in multiple object files. In-Reply-To: References: Message-ID: aleksandr.urakov added a comment. Hi! Could you choose some other names for `com1.o` and `com2.o`, please? :) They are reserved names on Windows, so there are problems with checking out this change. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68680/new/ https://reviews.llvm.org/D68680 From llvm-commits at lists.llvm.org Wed Oct 9 01:49:13 2019 From: llvm-commits at lists.llvm.org (Clement Courbet via llvm-commits) Date: Wed, 09 Oct 2019 08:49:13 -0000 Subject: [llvm] r374146 - [llvm-exegesis] Explore LEA addressing modes. Message-ID: <20191009084913.9A8A58FE97@lists.llvm.org> Author: courbet Date: Wed Oct 9 01:49:13 2019 New Revision: 374146 URL: http://llvm.org/viewvc/llvm-project?rev=374146&view=rev Log: [llvm-exegesis] Explore LEA addressing modes. Summary: This will help for PR32326. This shows the well-known issue with `RBP` and `R13` as base registers. 
Reviewers: gchatelet Subscribers: tschuett, llvm-commits, RKSimon, andreadb Tags: #llvm Differential Revision: https://reviews.llvm.org/D68646 Added: llvm/trunk/test/tools/llvm-exegesis/X86/latency-LEA64r.s llvm/trunk/test/tools/llvm-exegesis/X86/uops-LEA64r.s Modified: llvm/trunk/tools/llvm-exegesis/lib/RegisterAliasing.h llvm/trunk/tools/llvm-exegesis/lib/Uops.cpp llvm/trunk/tools/llvm-exegesis/lib/X86/Target.cpp Added: llvm/trunk/test/tools/llvm-exegesis/X86/latency-LEA64r.s URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/llvm-exegesis/X86/latency-LEA64r.s?rev=374146&view=auto ============================================================================== --- llvm/trunk/test/tools/llvm-exegesis/X86/latency-LEA64r.s (added) +++ llvm/trunk/test/tools/llvm-exegesis/X86/latency-LEA64r.s Wed Oct 9 01:49:13 2019 @@ -0,0 +1,16 @@ +# RUN: llvm-exegesis -mode=latency -opcode-name=LEA64r -repetition-mode=duplicate -max-configs-per-opcode=2 | FileCheck %s +# RUN: llvm-exegesis -mode=latency -opcode-name=LEA64r -repetition-mode=loop -max-configs-per-opcode=2 | FileCheck %s + +CHECK: --- +CHECK-NEXT: mode: latency +CHECK-NEXT: key: +CHECK-NEXT: instructions: +CHECK-NEXT: LEA64r +CHECK-NEXT: config: '0(%[[REG1:[A-Z0-9]+]], %[[REG1]], 1)' + +CHECK: --- +CHECK-NEXT: mode: latency +CHECK-NEXT: key: +CHECK-NEXT: instructions: +CHECK-NEXT: LEA64r +CHECK-NEXT: config: '42(%[[REG2:[A-Z0-9]+]], %[[REG2]], 1)' Added: llvm/trunk/test/tools/llvm-exegesis/X86/uops-LEA64r.s URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/llvm-exegesis/X86/uops-LEA64r.s?rev=374146&view=auto ============================================================================== --- llvm/trunk/test/tools/llvm-exegesis/X86/uops-LEA64r.s (added) +++ llvm/trunk/test/tools/llvm-exegesis/X86/uops-LEA64r.s Wed Oct 9 01:49:13 2019 @@ -0,0 +1,16 @@ +# RUN: llvm-exegesis -mode=uops -opcode-name=LEA64r -repetition-mode=duplicate -max-configs-per-opcode=2 | FileCheck %s +# RUN: llvm-exegesis 
-mode=uops -opcode-name=LEA64r -repetition-mode=loop -max-configs-per-opcode=2 | FileCheck %s + +CHECK: --- +CHECK-NEXT: mode: uops +CHECK-NEXT: key: +CHECK-NEXT: instructions: +CHECK-NEXT: LEA64r +CHECK-NEXT: config: '0(%[[REG1:[A-Z0-9]+]], %[[REG2:[A-Z0-9]+]], 1)' + +CHECK: --- +CHECK-NEXT: mode: uops +CHECK-NEXT: key: +CHECK-NEXT: instructions: +CHECK-NEXT: LEA64r +CHECK-NEXT: config: '42(%[[REG3:[A-Z0-9]+]], %[[REG4:[A-Z0-9]+]], 1)' Modified: llvm/trunk/tools/llvm-exegesis/lib/RegisterAliasing.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-exegesis/lib/RegisterAliasing.h?rev=374146&r1=374145&r2=374146&view=diff ============================================================================== --- llvm/trunk/tools/llvm-exegesis/lib/RegisterAliasing.h (original) +++ llvm/trunk/tools/llvm-exegesis/lib/RegisterAliasing.h Wed Oct 9 01:49:13 2019 @@ -103,6 +103,13 @@ private: RegisterClasses; }; +// `a = a & ~b`, optimized for few bit sets in B and no allocation. +inline void remove(llvm::BitVector &A, const llvm::BitVector &B) { + assert(A.size() == B.size()); + for (auto I : B.set_bits()) + A.reset(I); +} + } // namespace exegesis } // namespace llvm Modified: llvm/trunk/tools/llvm-exegesis/lib/Uops.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-exegesis/lib/Uops.cpp?rev=374146&r1=374145&r2=374146&view=diff ============================================================================== --- llvm/trunk/tools/llvm-exegesis/lib/Uops.cpp (original) +++ llvm/trunk/tools/llvm-exegesis/lib/Uops.cpp Wed Oct 9 01:49:13 2019 @@ -89,12 +89,6 @@ getVariablesWithTiedOperands(const Instr return Result; } -static void remove(llvm::BitVector &a, const llvm::BitVector &b) { - assert(a.size() == b.size()); - for (auto I : b.set_bits()) - a.reset(I); -} - UopsBenchmarkRunner::~UopsBenchmarkRunner() = default; UopsSnippetGenerator::~UopsSnippetGenerator() = default; Modified: llvm/trunk/tools/llvm-exegesis/lib/X86/Target.cpp URL: 
http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-exegesis/lib/X86/Target.cpp?rev=374146&r1=374145&r2=374146&view=diff ============================================================================== --- llvm/trunk/tools/llvm-exegesis/lib/X86/Target.cpp (original) +++ llvm/trunk/tools/llvm-exegesis/lib/X86/Target.cpp Wed Oct 9 01:49:13 2019 @@ -17,6 +17,7 @@ #include "X86RegisterInfo.h" #include "X86Subtarget.h" #include "llvm/MC/MCInstBuilder.h" +#include "llvm/Support/FormatVariadic.h" namespace llvm { namespace exegesis { @@ -177,6 +178,72 @@ static unsigned getX86FPFlags(const Inst return Instr.Description->TSFlags & llvm::X86II::FPTypeMask; } +// Helper to fill a memory operand with a value. +static void setMemOp(InstructionTemplate &IT, int OpIdx, + const MCOperand &OpVal) { + const auto Op = IT.Instr.Operands[OpIdx]; + assert(Op.isExplicit() && "invalid memory pattern"); + IT.getValueFor(Op) = OpVal; +}; + +// Common (latency, uops) code for LEA templates. `GetDestReg` takes the +// addressing base and index registers and returns the LEA destination register. 
+static llvm::Expected> generateLEATemplatesCommon( + const Instruction &Instr, const BitVector &ForbiddenRegisters, + const LLVMState &State, const SnippetGenerator::Options &Opts, + std::function GetDestReg) { + assert(Instr.Operands.size() == 6 && "invalid LEA"); + assert(X86II::getMemoryOperandNo(Instr.Description->TSFlags) == 1 && + "invalid LEA"); + + constexpr const int kDestOp = 0; + constexpr const int kBaseOp = 1; + constexpr const int kIndexOp = 3; + auto PossibleDestRegs = + Instr.Operands[kDestOp].getRegisterAliasing().sourceBits(); + remove(PossibleDestRegs, ForbiddenRegisters); + auto PossibleBaseRegs = + Instr.Operands[kBaseOp].getRegisterAliasing().sourceBits(); + remove(PossibleBaseRegs, ForbiddenRegisters); + auto PossibleIndexRegs = + Instr.Operands[kIndexOp].getRegisterAliasing().sourceBits(); + remove(PossibleIndexRegs, ForbiddenRegisters); + + const auto &RegInfo = State.getRegInfo(); + std::vector Result; + for (const unsigned BaseReg : PossibleBaseRegs.set_bits()) { + for (const unsigned IndexReg : PossibleIndexRegs.set_bits()) { + for (int LogScale = 0; LogScale <= 3; ++LogScale) { + // FIXME: Add an option for controlling how we explore immediates. + for (const int Disp : {0, 42}) { + InstructionTemplate IT(Instr); + const int64_t Scale = 1ull << LogScale; + setMemOp(IT, 1, MCOperand::createReg(BaseReg)); + setMemOp(IT, 2, MCOperand::createImm(Scale)); + setMemOp(IT, 3, MCOperand::createReg(IndexReg)); + setMemOp(IT, 4, MCOperand::createImm(Disp)); + // SegmentReg must be 0 for LEA. + setMemOp(IT, 5, MCOperand::createReg(0)); + + // Output reg is selected by the caller. 
+ setMemOp(IT, 0, MCOperand::createReg(GetDestReg(BaseReg, IndexReg))); + + CodeTemplate CT; + CT.Instructions.push_back(std::move(IT)); + CT.Config = formatv("{3}(%{0}, %{1}, {2})", RegInfo.getName(BaseReg), + RegInfo.getName(IndexReg), Scale, Disp) + .str(); + Result.push_back(std::move(CT)); + if (Result.size() >= Opts.MaxConfigsPerOpcode) + return Result; + } + } + } + } + + return Result; +} + namespace { class X86LatencySnippetGenerator : public LatencySnippetGenerator { public: @@ -194,6 +261,17 @@ X86LatencySnippetGenerator::generateCode if (auto E = IsInvalidOpcode(Instr)) return std::move(E); + // LEA gets special attention. + const auto Opcode = Instr.Description->getOpcode(); + if (Opcode == X86::LEA64r || Opcode == X86::LEA64_32r) { + return generateLEATemplatesCommon(Instr, ForbiddenRegisters, State, Opts, + [](unsigned BaseReg, unsigned IndexReg) { + // We just select the same base and + // output register. + return BaseReg; + }); + } + switch (getX86FPFlags(Instr)) { case llvm::X86II::NotFP: return LatencySnippetGenerator::generateCodeTemplates(Instr, @@ -225,6 +303,7 @@ public: generateCodeTemplates(const Instruction &Instr, const BitVector &ForbiddenRegisters) const override; }; + } // namespace llvm::Expected> @@ -233,6 +312,28 @@ X86UopsSnippetGenerator::generateCodeTem if (auto E = IsInvalidOpcode(Instr)) return std::move(E); + // LEA gets special attention. + const auto Opcode = Instr.Description->getOpcode(); + if (Opcode == X86::LEA64r || Opcode == X86::LEA64_32r) { + // Any destination register that is not used for adddressing is fine. 
+ auto PossibleDestRegs = + Instr.Operands[0].getRegisterAliasing().sourceBits(); + remove(PossibleDestRegs, ForbiddenRegisters); + return generateLEATemplatesCommon( + Instr, ForbiddenRegisters, State, Opts, + [this, &PossibleDestRegs](unsigned BaseReg, unsigned IndexReg) { + auto PossibleDestRegsNow = PossibleDestRegs; + remove(PossibleDestRegsNow, + State.getRATC().getRegister(BaseReg).aliasedBits()); + remove(PossibleDestRegsNow, + State.getRATC().getRegister(IndexReg).aliasedBits()); + assert(PossibleDestRegsNow.set_bits().begin() != + PossibleDestRegsNow.set_bits().end() && + "no remaining registers"); + return *PossibleDestRegsNow.set_bits().begin(); + }); + } + switch (getX86FPFlags(Instr)) { case llvm::X86II::NotFP: return UopsSnippetGenerator::generateCodeTemplates(Instr, @@ -548,17 +649,11 @@ void ExegesisX86Target::fillMemoryOperan ++MemOpIdx; } } - // Now fill in the memory operands. - const auto SetOp = [&IT](int OpIdx, const MCOperand &OpVal) { - const auto Op = IT.Instr.Operands[OpIdx]; - assert(Op.isMemory() && Op.isExplicit() && "invalid memory pattern"); - IT.getValueFor(Op) = OpVal; - }; - SetOp(MemOpIdx + 0, MCOperand::createReg(Reg)); // BaseReg - SetOp(MemOpIdx + 1, MCOperand::createImm(1)); // ScaleAmt - SetOp(MemOpIdx + 2, MCOperand::createReg(0)); // IndexReg - SetOp(MemOpIdx + 3, MCOperand::createImm(Offset)); // Disp - SetOp(MemOpIdx + 4, MCOperand::createReg(0)); // Segment + setMemOp(IT, MemOpIdx + 0, MCOperand::createReg(Reg)); // BaseReg + setMemOp(IT, MemOpIdx + 1, MCOperand::createImm(1)); // ScaleAmt + setMemOp(IT, MemOpIdx + 2, MCOperand::createReg(0)); // IndexReg + setMemOp(IT, MemOpIdx + 3, MCOperand::createImm(Offset)); // Disp + setMemOp(IT, MemOpIdx + 4, MCOperand::createReg(0)); // Segment } void ExegesisX86Target::decrementLoopCounterAndJump( From llvm-commits at lists.llvm.org Wed Oct 9 01:52:17 2019 From: llvm-commits at lists.llvm.org (Thomas Preud'homme via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 08:52:17 
+0000 (UTC)
Subject: [PATCH] D68220: [LNT] Python 3 support: stable showtests output
In-Reply-To:
References:
Message-ID: <1bd984be40dc92733a8a58a3f7279b29@localhost.localdomain>

thopre added a comment.

In D68220#1688136, @hubert.reinterpretcast wrote:

> My general impression is that this is harmless. As it is, there is no other key available to sort on. @cmatthews, can you confirm?

Ping @cmatthews ?

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68220/new/

https://reviews.llvm.org/D68220

From llvm-commits at lists.llvm.org Wed Oct 9 01:52:18 2019
From: llvm-commits at lists.llvm.org (Clement Courbet via Phabricator via llvm-commits)
Date: Wed, 09 Oct 2019 08:52:18 +0000 (UTC)
Subject: [PATCH] D68687: [llvm-exegesis][NFC] Remove extra `llvm::` qualifications.
In-Reply-To:
References:
Message-ID:

courbet updated this revision to Diff 224001.
courbet added a comment.

Fix build

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68687/new/

https://reviews.llvm.org/D68687

Files:
  llvm/unittests/tools/llvm-exegesis/AArch64/TargetTest.cpp
  llvm/unittests/tools/llvm-exegesis/ARM/AssemblerTest.cpp
  llvm/unittests/tools/llvm-exegesis/Common/AssemblerUtils.h
  llvm/unittests/tools/llvm-exegesis/PerfHelperTest.cpp
  llvm/unittests/tools/llvm-exegesis/PowerPC/AnalysisTest.cpp
  llvm/unittests/tools/llvm-exegesis/PowerPC/TargetTest.cpp
  llvm/unittests/tools/llvm-exegesis/RegisterValueTest.cpp
  llvm/unittests/tools/llvm-exegesis/X86/AssemblerTest.cpp
  llvm/unittests/tools/llvm-exegesis/X86/BenchmarkResultTest.cpp
  llvm/unittests/tools/llvm-exegesis/X86/RegisterAliasingTest.cpp
  llvm/unittests/tools/llvm-exegesis/X86/SchedClassResolutionTest.cpp
  llvm/unittests/tools/llvm-exegesis/X86/SnippetGeneratorTest.cpp
  llvm/unittests/tools/llvm-exegesis/X86/TargetTest.cpp

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68687.224001.patch
Type: text/x-patch
Size: 47722 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Wed Oct 9 01:52:24 2019
From: llvm-commits at lists.llvm.org (Clement Courbet via Phabricator via llvm-commits)
Date: Wed, 09 Oct 2019 08:52:24 +0000 (UTC)
Subject: [PATCH] D68646: [llvm-exegesis] Explore LEA addressing modes.
In-Reply-To:
References:
Message-ID: <3de25b816e92bb0b668d91f7a0d9784d@localhost.localdomain>

This revision was automatically updated to reflect the committed changes.
Closed by commit rGc3a7fb759931: [llvm-exegesis] Explore LEA addressing modes. (authored by courbet).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68646/new/

https://reviews.llvm.org/D68646

Files:
  llvm/test/tools/llvm-exegesis/X86/latency-LEA64r.s
  llvm/test/tools/llvm-exegesis/X86/uops-LEA64r.s
  llvm/tools/llvm-exegesis/lib/RegisterAliasing.h
  llvm/tools/llvm-exegesis/lib/Uops.cpp
  llvm/tools/llvm-exegesis/lib/X86/Target.cpp

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68646.224002.patch
Type: text/x-patch
Size: 9418 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Wed Oct 9 02:03:43 2019
From: llvm-commits at lists.llvm.org (Clement Courbet via llvm-commits)
Date: Wed, 09 Oct 2019 09:03:43 -0000
Subject: [llvm] r374147 - [llvm-exegesis][NFC] Fix rL374146.
Message-ID: <20191009090343.62F4487389@lists.llvm.org>

Author: courbet
Date: Wed Oct 9 02:03:42 2019
New Revision: 374147

URL: http://llvm.org/viewvc/llvm-project?rev=374147&view=rev
Log:
[llvm-exegesis][NFC] Fix rL374146.
Remove extra semicolon: Target.cpp:187:2: warning: extra ‘;’ [-Wpedantic]

Modified:
    llvm/trunk/tools/llvm-exegesis/lib/X86/Target.cpp

Modified: llvm/trunk/tools/llvm-exegesis/lib/X86/Target.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-exegesis/lib/X86/Target.cpp?rev=374147&r1=374146&r2=374147&view=diff
==============================================================================
--- llvm/trunk/tools/llvm-exegesis/lib/X86/Target.cpp (original)
+++ llvm/trunk/tools/llvm-exegesis/lib/X86/Target.cpp Wed Oct 9 02:03:42 2019
@@ -184,7 +184,7 @@ static void setMemOp(InstructionTemplate
   const auto Op = IT.Instr.Operands[OpIdx];
   assert(Op.isExplicit() && "invalid memory pattern");
   IT.getValueFor(Op) = OpVal;
-};
+}
 
 // Common (latency, uops) code for LEA templates. `GetDestReg` takes the
 // addressing base and index registers and returns the LEA destination register.

From llvm-commits at lists.llvm.org Wed Oct 9 02:06:30 2019
From: llvm-commits at lists.llvm.org (Hans Wennborg via llvm-commits)
Date: Wed, 09 Oct 2019 09:06:30 -0000
Subject: [lld] r374148 - Unify the two CRC implementations
Message-ID: <20191009090630.E67FF879D6@lists.llvm.org>

Author: hans
Date: Wed Oct 9 02:06:30 2019
New Revision: 374148

URL: http://llvm.org/viewvc/llvm-project?rev=374148&view=rev
Log:
Unify the two CRC implementations

David added the JamCRC implementation in r246590. More recently, Eugene added a CRC-32 implementation in r357901, which falls back to zlib's crc32 function if present. These checksums are essentially the same, so having multiple implementations seems unnecessary. This replaces the CRC-32 implementation with the simpler one from JamCRC, and implements the JamCRC interface in terms of CRC-32 since this means it can use zlib's implementation when available, saving a few bytes and potentially making it faster. JamCRC took an ArrayRef argument, and CRC-32 took a StringRef.
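
A minimal sketch of the relationship this log message relies on, not LLVM's actual code: JamCRC and CRC-32 share the same reflected polynomial (0xEDB88320) and differ only in whether the result is XORed with 0xFFFFFFFF at the end, so one can be written in terms of the other. The names `sketchCrc32`, `sketchJamCrcUpdate`, and `checkKnownVectors` are hypothetical, and the bit-at-a-time loop stands in for the table lookup or zlib call a real implementation would use.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: running CRC-32 over `len` bytes, where `crc` is the
// previous checksum value (0 when starting). Processes one bit at a time.
std::uint32_t sketchCrc32(std::uint32_t crc, const unsigned char *data,
                          std::size_t len) {
  crc ^= 0xFFFFFFFFu; // CRC-32 initial inversion
  for (std::size_t i = 0; i < len; ++i) {
    crc ^= data[i];
    for (int bit = 0; bit < 8; ++bit)
      crc = (crc >> 1) ^ ((crc & 1u) ? 0xEDB88320u : 0u);
  }
  return crc ^ 0xFFFFFFFFu; // CRC-32 final inversion
}

// Hypothetical helper: one JamCRC update step built on CRC-32. The two XORs
// cancel CRC-32's initial and final inversions, leaving the raw register.
std::uint32_t sketchJamCrcUpdate(std::uint32_t state, const unsigned char *data,
                                 std::size_t len) {
  return sketchCrc32(state ^ 0xFFFFFFFFu, data, len) ^ 0xFFFFFFFFu;
}

// Checks both functions against the standard "123456789" check values.
void checkKnownVectors() {
  const unsigned char msg[] = "123456789";
  assert(sketchCrc32(0, msg, 9) == 0xCBF43926u);
  assert(sketchJamCrcUpdate(0xFFFFFFFFu, msg, 9) == 0x340BC6D9u);
}
```

Starting the JamCRC state at 0xFFFFFFFF and feeding it the ASCII string "123456789" gives 0x340BC6D9, which is the bitwise complement of the CRC-32 check value 0xCBF43926.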
This patch changes it to ArrayRef which I think is the best choice, and simplifies a few of the callers nicely.

Differential revision: https://reviews.llvm.org/D68570

Modified:
    lld/trunk/COFF/PDB.cpp

Modified: lld/trunk/COFF/PDB.cpp
URL: http://llvm.org/viewvc/llvm-project/lld/trunk/COFF/PDB.cpp?rev=374148&r1=374147&r2=374148&view=diff
==============================================================================
--- lld/trunk/COFF/PDB.cpp (original)
+++ lld/trunk/COFF/PDB.cpp Wed Oct 9 02:06:30 2019
@@ -51,10 +51,10 @@
 #include "llvm/Object/COFF.h"
 #include "llvm/Object/CVDebugRecord.h"
 #include "llvm/Support/BinaryByteStream.h"
+#include "llvm/Support/CRC.h"
 #include "llvm/Support/Endian.h"
 #include "llvm/Support/Errc.h"
 #include "llvm/Support/FormatVariadic.h"
-#include "llvm/Support/JamCRC.h"
 #include "llvm/Support/Path.h"
 #include "llvm/Support/ScopedPrinter.h"
 #include
@@ -965,9 +965,7 @@ static pdb::SectionContrib createSection
     sc.Imod = secChunk->file->moduleDBI->getModuleIndex();
     ArrayRef contents = secChunk->getContents();
     JamCRC crc(0);
-    ArrayRef charContents = makeArrayRef(
-        reinterpret_cast(contents.data()), contents.size());
-    crc.update(charContents);
+    crc.update(contents);
     sc.DataCrc = crc.getCRC();
   } else {
     sc.Characteristics = os ? os->header.Characteristics : 0;

From llvm-commits at lists.llvm.org Wed Oct 9 02:06:30 2019
From: llvm-commits at lists.llvm.org (Hans Wennborg via llvm-commits)
Date: Wed, 09 Oct 2019 09:06:30 -0000
Subject: [llvm] r374148 - Unify the two CRC implementations
Message-ID: <20191009090631.100EF8A610@lists.llvm.org>

Author: hans
Date: Wed Oct 9 02:06:30 2019
New Revision: 374148

URL: http://llvm.org/viewvc/llvm-project?rev=374148&view=rev
Log:
Unify the two CRC implementations

David added the JamCRC implementation in r246590. More recently, Eugene added a CRC-32 implementation in r357901, which falls back to zlib's crc32 function if present.
These checksums are essentially the same, so having multiple implementations seems unnecessary. This replaces the CRC-32 implementation with the simpler one from JamCRC, and implements the JamCRC interface in terms of CRC-32 since this means it can use zlib's implementation when available, saving a few bytes and potentially making it faster. JamCRC took an ArrayRef argument, and CRC-32 took a StringRef. This patch changes it to ArrayRef which I think is the best choice, and simplifies a few of the callers nicely. Differential revision: https://reviews.llvm.org/D68570 Removed: llvm/trunk/include/llvm/Support/JamCRC.h llvm/trunk/lib/Support/JamCRC.cpp Modified: llvm/trunk/include/llvm/Support/CRC.h llvm/trunk/lib/DebugInfo/PDB/Native/Hash.cpp llvm/trunk/lib/DebugInfo/PDB/Native/PDBFileBuilder.cpp llvm/trunk/lib/DebugInfo/PDB/Native/TpiHashing.cpp llvm/trunk/lib/DebugInfo/Symbolize/Symbolize.cpp llvm/trunk/lib/MC/WinCOFFObjectWriter.cpp llvm/trunk/lib/Support/CMakeLists.txt llvm/trunk/lib/Support/CRC.cpp llvm/trunk/lib/Transforms/Instrumentation/PGOInstrumentation.cpp llvm/trunk/tools/llvm-objcopy/COFF/COFFObjcopy.cpp llvm/trunk/tools/llvm-objcopy/CopyConfig.cpp llvm/trunk/unittests/Support/CRCTest.cpp llvm/trunk/utils/gn/secondary/llvm/lib/Support/BUILD.gn Modified: llvm/trunk/include/llvm/Support/CRC.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Support/CRC.h?rev=374148&r1=374147&r2=374148&view=diff ============================================================================== --- llvm/trunk/include/llvm/Support/CRC.h (original) +++ llvm/trunk/include/llvm/Support/CRC.h Wed Oct 9 02:06:30 2019 @@ -6,20 +6,55 @@ // //===----------------------------------------------------------------------===// // -// This file contains basic functions for calculating Cyclic Redundancy Check -// or CRC. +// This file contains implementations of CRC functions. 
// //===----------------------------------------------------------------------===// #ifndef LLVM_SUPPORT_CRC_H #define LLVM_SUPPORT_CRC_H -#include "llvm/ADT/StringRef.h" #include "llvm/Support/DataTypes.h" namespace llvm { -/// zlib independent CRC32 calculation. -uint32_t crc32(uint32_t CRC, StringRef S); +template class ArrayRef; + +// Compute the CRC-32 of Data. +uint32_t crc32(ArrayRef Data); + +// Compute the running CRC-32 of Data, with CRC being the previous value of the +// checksum. +uint32_t crc32(uint32_t CRC, ArrayRef Data); + +// Class for computing the JamCRC. +// +// We will use the "Rocksoft^tm Model CRC Algorithm" to describe the properties +// of this CRC: +// Width : 32 +// Poly : 04C11DB7 +// Init : FFFFFFFF +// RefIn : True +// RefOut : True +// XorOut : 00000000 +// Check : 340BC6D9 (result of CRC for "123456789") +// +// In other words, this is the same as CRC-32, except that XorOut is 0 instead +// of FFFFFFFF. +// +// N.B. We permit flexibility of the "Init" value. Some consumers of this need +// it to be zero. +class JamCRC { +public: + JamCRC(uint32_t Init = 0xFFFFFFFFU) : CRC(Init) {} + + // Update the CRC calculation with Data. + void update(ArrayRef Data); + + uint32_t getCRC() const { return CRC; } + +private: + uint32_t CRC; +}; + } // end namespace llvm #endif Removed: llvm/trunk/include/llvm/Support/JamCRC.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Support/JamCRC.h?rev=374147&view=auto ============================================================================== --- llvm/trunk/include/llvm/Support/JamCRC.h (original) +++ llvm/trunk/include/llvm/Support/JamCRC.h (removed) @@ -1,48 +0,0 @@ -//===-- llvm/Support/JamCRC.h - Cyclic Redundancy Check ---------*- C++ -*-===// -// -// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. -// See https://llvm.org/LICENSE.txt for license information. 
-// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception -// -//===----------------------------------------------------------------------===// -// -// This file contains an implementation of JamCRC. -// -// We will use the "Rocksoft^tm Model CRC Algorithm" to describe the properties -// of this CRC: -// Width : 32 -// Poly : 04C11DB7 -// Init : FFFFFFFF -// RefIn : True -// RefOut : True -// XorOut : 00000000 -// Check : 340BC6D9 (result of CRC for "123456789") -// -// N.B. We permit flexibility of the "Init" value. Some consumers of this need -// it to be zero. -// -//===----------------------------------------------------------------------===// - -#ifndef LLVM_SUPPORT_JAMCRC_H -#define LLVM_SUPPORT_JAMCRC_H - -#include "llvm/Support/DataTypes.h" - -namespace llvm { -template class ArrayRef; - -class JamCRC { -public: - JamCRC(uint32_t Init = 0xFFFFFFFFU) : CRC(Init) {} - - // Update the CRC calculation with Data. - void update(ArrayRef Data); - - uint32_t getCRC() const { return CRC; } - -private: - uint32_t CRC; -}; -} // End of namespace llvm - -#endif Modified: llvm/trunk/lib/DebugInfo/PDB/Native/Hash.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/DebugInfo/PDB/Native/Hash.cpp?rev=374148&r1=374147&r2=374148&view=diff ============================================================================== --- llvm/trunk/lib/DebugInfo/PDB/Native/Hash.cpp (original) +++ llvm/trunk/lib/DebugInfo/PDB/Native/Hash.cpp Wed Oct 9 02:06:30 2019 @@ -8,8 +8,8 @@ #include "llvm/DebugInfo/PDB/Native/Hash.h" #include "llvm/ADT/ArrayRef.h" +#include "llvm/Support/CRC.h" #include "llvm/Support/Endian.h" -#include "llvm/Support/JamCRC.h" #include using namespace llvm; @@ -79,7 +79,6 @@ uint32_t pdb::hashStringV2(StringRef Str // Corresponds to `SigForPbCb` in langapi/shared/crc32.h. 
uint32_t pdb::hashBufferV8(ArrayRef Buf) { JamCRC JC(/*Init=*/0U); - JC.update(makeArrayRef(reinterpret_cast(Buf.data()), - Buf.size())); + JC.update(Buf); return JC.getCRC(); } Modified: llvm/trunk/lib/DebugInfo/PDB/Native/PDBFileBuilder.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/DebugInfo/PDB/Native/PDBFileBuilder.cpp?rev=374148&r1=374147&r2=374148&view=diff ============================================================================== --- llvm/trunk/lib/DebugInfo/PDB/Native/PDBFileBuilder.cpp (original) +++ llvm/trunk/lib/DebugInfo/PDB/Native/PDBFileBuilder.cpp Wed Oct 9 02:06:30 2019 @@ -22,7 +22,7 @@ #include "llvm/DebugInfo/PDB/Native/TpiStreamBuilder.h" #include "llvm/Support/BinaryStream.h" #include "llvm/Support/BinaryStreamWriter.h" -#include "llvm/Support/JamCRC.h" +#include "llvm/Support/CRC.h" #include "llvm/Support/Path.h" #include "llvm/Support/xxhash.h" @@ -174,8 +174,7 @@ Error PDBFileBuilder::finalizeMsfLayout( if (!InjectedSources.empty()) { for (const auto &IS : InjectedSources) { JamCRC CRC(0); - CRC.update(makeArrayRef(IS.Content->getBufferStart(), - IS.Content->getBufferSize())); + CRC.update(arrayRefFromStringRef(IS.Content->getBuffer())); SrcHeaderBlockEntry Entry; ::memset(&Entry, 0, sizeof(SrcHeaderBlockEntry)); Modified: llvm/trunk/lib/DebugInfo/PDB/Native/TpiHashing.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/DebugInfo/PDB/Native/TpiHashing.cpp?rev=374148&r1=374147&r2=374148&view=diff ============================================================================== --- llvm/trunk/lib/DebugInfo/PDB/Native/TpiHashing.cpp (original) +++ llvm/trunk/lib/DebugInfo/PDB/Native/TpiHashing.cpp Wed Oct 9 02:06:30 2019 @@ -10,7 +10,7 @@ #include "llvm/DebugInfo/CodeView/TypeDeserializer.h" #include "llvm/DebugInfo/PDB/Native/Hash.h" -#include "llvm/Support/JamCRC.h" +#include "llvm/Support/CRC.h" using namespace llvm; using namespace llvm::codeview; @@ -124,8 +124,6 @@ Expected llvm::pdb::hashTypeRe // Run CRC32 over 
the bytes. This corresponds to `hashBufv8`. JamCRC JC(/*Init=*/0U); - ArrayRef Bytes(reinterpret_cast(Rec.data().data()), - Rec.data().size()); - JC.update(Bytes); + JC.update(Rec.data()); return JC.getCRC(); } Modified: llvm/trunk/lib/DebugInfo/Symbolize/Symbolize.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/DebugInfo/Symbolize/Symbolize.cpp?rev=374148&r1=374147&r2=374148&view=diff ============================================================================== --- llvm/trunk/lib/DebugInfo/Symbolize/Symbolize.cpp (original) +++ llvm/trunk/lib/DebugInfo/Symbolize/Symbolize.cpp Wed Oct 9 02:06:30 2019 @@ -205,7 +205,7 @@ bool checkFileCRC(StringRef Path, uint32 MemoryBuffer::getFileOrSTDIN(Path); if (!MB) return false; - return CRCHash == llvm::crc32(0, MB.get()->getBuffer()); + return CRCHash == llvm::crc32(arrayRefFromStringRef(MB.get()->getBuffer())); } bool findDebugBinary(const std::string &OrigPath, Modified: llvm/trunk/lib/MC/WinCOFFObjectWriter.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/MC/WinCOFFObjectWriter.cpp?rev=374148&r1=374147&r2=374148&view=diff ============================================================================== --- llvm/trunk/lib/MC/WinCOFFObjectWriter.cpp (original) +++ llvm/trunk/lib/MC/WinCOFFObjectWriter.cpp Wed Oct 9 02:06:30 2019 @@ -31,10 +31,10 @@ #include "llvm/MC/MCValue.h" #include "llvm/MC/MCWinCOFFObjectWriter.h" #include "llvm/MC/StringTableBuilder.h" +#include "llvm/Support/CRC.h" #include "llvm/Support/Casting.h" #include "llvm/Support/Endian.h" #include "llvm/Support/ErrorHandling.h" -#include "llvm/Support/JamCRC.h" #include "llvm/Support/LEB128.h" #include "llvm/Support/MathExtras.h" #include "llvm/Support/raw_ostream.h" @@ -605,7 +605,7 @@ uint32_t WinCOFFObjectWriter::writeSecti // Calculate our CRC with an initial value of '0', this is not how // JamCRC is specified but it aligns with the expected output. 
JamCRC JC(/*Init=*/0); - JC.update(Buf); + JC.update(makeArrayRef(reinterpret_cast(Buf.data()), Buf.size())); return JC.getCRC(); } Modified: llvm/trunk/lib/Support/CMakeLists.txt URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Support/CMakeLists.txt?rev=374148&r1=374147&r2=374148&view=diff ============================================================================== --- llvm/trunk/lib/Support/CMakeLists.txt (original) +++ llvm/trunk/lib/Support/CMakeLists.txt Wed Oct 9 02:06:30 2019 @@ -103,7 +103,6 @@ add_llvm_library(LLVMSupport IntEqClasses.cpp IntervalMap.cpp ItaniumManglingCanonicalizer.cpp - JamCRC.cpp JSON.cpp KnownBits.cpp LEB128.cpp Modified: llvm/trunk/lib/Support/CRC.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Support/CRC.cpp?rev=374148&r1=374147&r2=374148&view=diff ============================================================================== --- llvm/trunk/lib/Support/CRC.cpp (original) +++ llvm/trunk/lib/Support/CRC.cpp Wed Oct 9 02:06:30 2019 @@ -6,63 +6,94 @@ // //===----------------------------------------------------------------------===// // -// This file implements llvm::crc32 function. +// This file contains implementations of CRC functions. +// +// The implementation technique is the one mentioned in: +// D. V. Sarwate. 1988. Computation of cyclic redundancy checks via table +// look-up. Commun. ACM 31, 8 (August 1988) +// +// See also Ross N. Williams "A Painless Guide to CRC Error Detection +// Algorithms" (https://zlib.net/crc_v3.txt) or Hacker's Delight (2nd ed.) +// Chapter 14 (Figure 14-7 in particular) for how the algorithm works. 
// //===----------------------------------------------------------------------===// #include "llvm/Support/CRC.h" + +#include "llvm/ADT/ArrayRef.h" #include "llvm/Config/config.h" -#include "llvm/ADT/StringRef.h" -#include "llvm/Support/Threading.h" -#include using namespace llvm; #if LLVM_ENABLE_ZLIB == 0 || !HAVE_ZLIB_H -using CRC32Table = std::array; - -static void initCRC32Table(CRC32Table *Tbl) { - auto Shuffle = [](uint32_t V) { - return (V & 1) ? (V >> 1) ^ 0xEDB88320U : V >> 1; - }; - - for (size_t I = 0; I < Tbl->size(); ++I) { - uint32_t V = Shuffle(I); - V = Shuffle(V); - V = Shuffle(V); - V = Shuffle(V); - V = Shuffle(V); - V = Shuffle(V); - V = Shuffle(V); - (*Tbl)[I] = Shuffle(V); - } -} -uint32_t llvm::crc32(uint32_t CRC, StringRef S) { - static llvm::once_flag InitFlag; - static CRC32Table Tbl; - llvm::call_once(InitFlag, initCRC32Table, &Tbl); +static const uint32_t CRCTable[256] = { + 0x00000000, 0x77073096, 0xee0e612c, 0x990951ba, 0x076dc419, 0x706af48f, + 0xe963a535, 0x9e6495a3, 0x0edb8832, 0x79dcb8a4, 0xe0d5e91e, 0x97d2d988, + 0x09b64c2b, 0x7eb17cbd, 0xe7b82d07, 0x90bf1d91, 0x1db71064, 0x6ab020f2, + 0xf3b97148, 0x84be41de, 0x1adad47d, 0x6ddde4eb, 0xf4d4b551, 0x83d385c7, + 0x136c9856, 0x646ba8c0, 0xfd62f97a, 0x8a65c9ec, 0x14015c4f, 0x63066cd9, + 0xfa0f3d63, 0x8d080df5, 0x3b6e20c8, 0x4c69105e, 0xd56041e4, 0xa2677172, + 0x3c03e4d1, 0x4b04d447, 0xd20d85fd, 0xa50ab56b, 0x35b5a8fa, 0x42b2986c, + 0xdbbbc9d6, 0xacbcf940, 0x32d86ce3, 0x45df5c75, 0xdcd60dcf, 0xabd13d59, + 0x26d930ac, 0x51de003a, 0xc8d75180, 0xbfd06116, 0x21b4f4b5, 0x56b3c423, + 0xcfba9599, 0xb8bda50f, 0x2802b89e, 0x5f058808, 0xc60cd9b2, 0xb10be924, + 0x2f6f7c87, 0x58684c11, 0xc1611dab, 0xb6662d3d, 0x76dc4190, 0x01db7106, + 0x98d220bc, 0xefd5102a, 0x71b18589, 0x06b6b51f, 0x9fbfe4a5, 0xe8b8d433, + 0x7807c9a2, 0x0f00f934, 0x9609a88e, 0xe10e9818, 0x7f6a0dbb, 0x086d3d2d, + 0x91646c97, 0xe6635c01, 0x6b6b51f4, 0x1c6c6162, 0x856530d8, 0xf262004e, + 0x6c0695ed, 0x1b01a57b, 0x8208f4c1, 0xf50fc457, 
0x65b0d9c6, 0x12b7e950, + 0x8bbeb8ea, 0xfcb9887c, 0x62dd1ddf, 0x15da2d49, 0x8cd37cf3, 0xfbd44c65, + 0x4db26158, 0x3ab551ce, 0xa3bc0074, 0xd4bb30e2, 0x4adfa541, 0x3dd895d7, + 0xa4d1c46d, 0xd3d6f4fb, 0x4369e96a, 0x346ed9fc, 0xad678846, 0xda60b8d0, + 0x44042d73, 0x33031de5, 0xaa0a4c5f, 0xdd0d7cc9, 0x5005713c, 0x270241aa, + 0xbe0b1010, 0xc90c2086, 0x5768b525, 0x206f85b3, 0xb966d409, 0xce61e49f, + 0x5edef90e, 0x29d9c998, 0xb0d09822, 0xc7d7a8b4, 0x59b33d17, 0x2eb40d81, + 0xb7bd5c3b, 0xc0ba6cad, 0xedb88320, 0x9abfb3b6, 0x03b6e20c, 0x74b1d29a, + 0xead54739, 0x9dd277af, 0x04db2615, 0x73dc1683, 0xe3630b12, 0x94643b84, + 0x0d6d6a3e, 0x7a6a5aa8, 0xe40ecf0b, 0x9309ff9d, 0x0a00ae27, 0x7d079eb1, + 0xf00f9344, 0x8708a3d2, 0x1e01f268, 0x6906c2fe, 0xf762575d, 0x806567cb, + 0x196c3671, 0x6e6b06e7, 0xfed41b76, 0x89d32be0, 0x10da7a5a, 0x67dd4acc, + 0xf9b9df6f, 0x8ebeeff9, 0x17b7be43, 0x60b08ed5, 0xd6d6a3e8, 0xa1d1937e, + 0x38d8c2c4, 0x4fdff252, 0xd1bb67f1, 0xa6bc5767, 0x3fb506dd, 0x48b2364b, + 0xd80d2bda, 0xaf0a1b4c, 0x36034af6, 0x41047a60, 0xdf60efc3, 0xa867df55, + 0x316e8eef, 0x4669be79, 0xcb61b38c, 0xbc66831a, 0x256fd2a0, 0x5268e236, + 0xcc0c7795, 0xbb0b4703, 0x220216b9, 0x5505262f, 0xc5ba3bbe, 0xb2bd0b28, + 0x2bb45a92, 0x5cb36a04, 0xc2d7ffa7, 0xb5d0cf31, 0x2cd99e8b, 0x5bdeae1d, + 0x9b64c2b0, 0xec63f226, 0x756aa39c, 0x026d930a, 0x9c0906a9, 0xeb0e363f, + 0x72076785, 0x05005713, 0x95bf4a82, 0xe2b87a14, 0x7bb12bae, 0x0cb61b38, + 0x92d28e9b, 0xe5d5be0d, 0x7cdcefb7, 0x0bdbdf21, 0x86d3d2d4, 0xf1d4e242, + 0x68ddb3f8, 0x1fda836e, 0x81be16cd, 0xf6b9265b, 0x6fb077e1, 0x18b74777, + 0x88085ae6, 0xff0f6a70, 0x66063bca, 0x11010b5c, 0x8f659eff, 0xf862ae69, + 0x616bffd3, 0x166ccf45, 0xa00ae278, 0xd70dd2ee, 0x4e048354, 0x3903b3c2, + 0xa7672661, 0xd06016f7, 0x4969474d, 0x3e6e77db, 0xaed16a4a, 0xd9d65adc, + 0x40df0b66, 0x37d83bf0, 0xa9bcae53, 0xdebb9ec5, 0x47b2cf7f, 0x30b5ffe9, + 0xbdbdf21c, 0xcabac28a, 0x53b39330, 0x24b4a3a6, 0xbad03605, 0xcdd70693, + 0x54de5729, 0x23d967bf, 0xb3667a2e, 0xc4614ab8, 
0x5d681b02, 0x2a6f2b94, + 0xb40bbe37, 0xc30c8ea1, 0x5a05df1b, 0x2d02ef8d}; - const uint8_t *P = reinterpret_cast(S.data()); - size_t Len = S.size(); +uint32_t llvm::crc32(uint32_t CRC, ArrayRef Data) { CRC ^= 0xFFFFFFFFU; - for (; Len >= 8; Len -= 8) { - CRC = Tbl[(CRC ^ *P++) & 0xFF] ^ (CRC >> 8); - CRC = Tbl[(CRC ^ *P++) & 0xFF] ^ (CRC >> 8); - CRC = Tbl[(CRC ^ *P++) & 0xFF] ^ (CRC >> 8); - CRC = Tbl[(CRC ^ *P++) & 0xFF] ^ (CRC >> 8); - CRC = Tbl[(CRC ^ *P++) & 0xFF] ^ (CRC >> 8); - CRC = Tbl[(CRC ^ *P++) & 0xFF] ^ (CRC >> 8); - CRC = Tbl[(CRC ^ *P++) & 0xFF] ^ (CRC >> 8); - CRC = Tbl[(CRC ^ *P++) & 0xFF] ^ (CRC >> 8); + for (uint8_t Byte : Data) { + int TableIdx = (CRC ^ Byte) & 0xff; + CRC = CRCTable[TableIdx] ^ (CRC >> 8); } - while (Len--) - CRC = Tbl[(CRC ^ *P++) & 0xFF] ^ (CRC >> 8); return CRC ^ 0xFFFFFFFFU; } + #else + #include -uint32_t llvm::crc32(uint32_t CRC, StringRef S) { - return ::crc32(CRC, (const Bytef *)S.data(), S.size()); +uint32_t llvm::crc32(uint32_t CRC, ArrayRef Data) { + return ::crc32(CRC, (const Bytef *)Data.data(), Data.size()); } + #endif + +uint32_t llvm::crc32(ArrayRef Data) { return crc32(0, Data); } + +void JamCRC::update(ArrayRef Data) { + CRC ^= 0xFFFFFFFFU; // Undo CRC-32 Init. + CRC = crc32(CRC, Data); + CRC ^= 0xFFFFFFFFU; // Undo CRC-32 XorOut. +} Removed: llvm/trunk/lib/Support/JamCRC.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Support/JamCRC.cpp?rev=374147&view=auto ============================================================================== --- llvm/trunk/lib/Support/JamCRC.cpp (original) +++ llvm/trunk/lib/Support/JamCRC.cpp (removed) @@ -1,96 +0,0 @@ -//===-- JamCRC.cpp - Cyclic Redundancy Check --------------------*- C++ -*-===// -// -// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. -// See https://llvm.org/LICENSE.txt for license information. 
-// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception -// -//===----------------------------------------------------------------------===// -// -// This file contains an implementation of JamCRC. -// -//===----------------------------------------------------------------------===// -// -// The implementation technique is the one mentioned in: -// D. V. Sarwate. 1988. Computation of cyclic redundancy checks via table -// look-up. Commun. ACM 31, 8 (August 1988) -// -//===----------------------------------------------------------------------===// - -#include "llvm/Support/JamCRC.h" -#include "llvm/ADT/ArrayRef.h" - -using namespace llvm; - -static const uint32_t CRCTable[256] = { - 0x00000000, 0x77073096, 0xee0e612c, 0x990951ba, - 0x076dc419, 0x706af48f, 0xe963a535, 0x9e6495a3, - 0x0edb8832, 0x79dcb8a4, 0xe0d5e91e, 0x97d2d988, - 0x09b64c2b, 0x7eb17cbd, 0xe7b82d07, 0x90bf1d91, - 0x1db71064, 0x6ab020f2, 0xf3b97148, 0x84be41de, - 0x1adad47d, 0x6ddde4eb, 0xf4d4b551, 0x83d385c7, - 0x136c9856, 0x646ba8c0, 0xfd62f97a, 0x8a65c9ec, - 0x14015c4f, 0x63066cd9, 0xfa0f3d63, 0x8d080df5, - 0x3b6e20c8, 0x4c69105e, 0xd56041e4, 0xa2677172, - 0x3c03e4d1, 0x4b04d447, 0xd20d85fd, 0xa50ab56b, - 0x35b5a8fa, 0x42b2986c, 0xdbbbc9d6, 0xacbcf940, - 0x32d86ce3, 0x45df5c75, 0xdcd60dcf, 0xabd13d59, - 0x26d930ac, 0x51de003a, 0xc8d75180, 0xbfd06116, - 0x21b4f4b5, 0x56b3c423, 0xcfba9599, 0xb8bda50f, - 0x2802b89e, 0x5f058808, 0xc60cd9b2, 0xb10be924, - 0x2f6f7c87, 0x58684c11, 0xc1611dab, 0xb6662d3d, - 0x76dc4190, 0x01db7106, 0x98d220bc, 0xefd5102a, - 0x71b18589, 0x06b6b51f, 0x9fbfe4a5, 0xe8b8d433, - 0x7807c9a2, 0x0f00f934, 0x9609a88e, 0xe10e9818, - 0x7f6a0dbb, 0x086d3d2d, 0x91646c97, 0xe6635c01, - 0x6b6b51f4, 0x1c6c6162, 0x856530d8, 0xf262004e, - 0x6c0695ed, 0x1b01a57b, 0x8208f4c1, 0xf50fc457, - 0x65b0d9c6, 0x12b7e950, 0x8bbeb8ea, 0xfcb9887c, - 0x62dd1ddf, 0x15da2d49, 0x8cd37cf3, 0xfbd44c65, - 0x4db26158, 0x3ab551ce, 0xa3bc0074, 0xd4bb30e2, - 0x4adfa541, 0x3dd895d7, 0xa4d1c46d, 0xd3d6f4fb, - 
0x4369e96a, 0x346ed9fc, 0xad678846, 0xda60b8d0, - 0x44042d73, 0x33031de5, 0xaa0a4c5f, 0xdd0d7cc9, - 0x5005713c, 0x270241aa, 0xbe0b1010, 0xc90c2086, - 0x5768b525, 0x206f85b3, 0xb966d409, 0xce61e49f, - 0x5edef90e, 0x29d9c998, 0xb0d09822, 0xc7d7a8b4, - 0x59b33d17, 0x2eb40d81, 0xb7bd5c3b, 0xc0ba6cad, - 0xedb88320, 0x9abfb3b6, 0x03b6e20c, 0x74b1d29a, - 0xead54739, 0x9dd277af, 0x04db2615, 0x73dc1683, - 0xe3630b12, 0x94643b84, 0x0d6d6a3e, 0x7a6a5aa8, - 0xe40ecf0b, 0x9309ff9d, 0x0a00ae27, 0x7d079eb1, - 0xf00f9344, 0x8708a3d2, 0x1e01f268, 0x6906c2fe, - 0xf762575d, 0x806567cb, 0x196c3671, 0x6e6b06e7, - 0xfed41b76, 0x89d32be0, 0x10da7a5a, 0x67dd4acc, - 0xf9b9df6f, 0x8ebeeff9, 0x17b7be43, 0x60b08ed5, - 0xd6d6a3e8, 0xa1d1937e, 0x38d8c2c4, 0x4fdff252, - 0xd1bb67f1, 0xa6bc5767, 0x3fb506dd, 0x48b2364b, - 0xd80d2bda, 0xaf0a1b4c, 0x36034af6, 0x41047a60, - 0xdf60efc3, 0xa867df55, 0x316e8eef, 0x4669be79, - 0xcb61b38c, 0xbc66831a, 0x256fd2a0, 0x5268e236, - 0xcc0c7795, 0xbb0b4703, 0x220216b9, 0x5505262f, - 0xc5ba3bbe, 0xb2bd0b28, 0x2bb45a92, 0x5cb36a04, - 0xc2d7ffa7, 0xb5d0cf31, 0x2cd99e8b, 0x5bdeae1d, - 0x9b64c2b0, 0xec63f226, 0x756aa39c, 0x026d930a, - 0x9c0906a9, 0xeb0e363f, 0x72076785, 0x05005713, - 0x95bf4a82, 0xe2b87a14, 0x7bb12bae, 0x0cb61b38, - 0x92d28e9b, 0xe5d5be0d, 0x7cdcefb7, 0x0bdbdf21, - 0x86d3d2d4, 0xf1d4e242, 0x68ddb3f8, 0x1fda836e, - 0x81be16cd, 0xf6b9265b, 0x6fb077e1, 0x18b74777, - 0x88085ae6, 0xff0f6a70, 0x66063bca, 0x11010b5c, - 0x8f659eff, 0xf862ae69, 0x616bffd3, 0x166ccf45, - 0xa00ae278, 0xd70dd2ee, 0x4e048354, 0x3903b3c2, - 0xa7672661, 0xd06016f7, 0x4969474d, 0x3e6e77db, - 0xaed16a4a, 0xd9d65adc, 0x40df0b66, 0x37d83bf0, - 0xa9bcae53, 0xdebb9ec5, 0x47b2cf7f, 0x30b5ffe9, - 0xbdbdf21c, 0xcabac28a, 0x53b39330, 0x24b4a3a6, - 0xbad03605, 0xcdd70693, 0x54de5729, 0x23d967bf, - 0xb3667a2e, 0xc4614ab8, 0x5d681b02, 0x2a6f2b94, - 0xb40bbe37, 0xc30c8ea1, 0x5a05df1b, 0x2d02ef8d -}; - -void JamCRC::update(ArrayRef Data) { - for (char Byte : Data) { - int TableIdx = (CRC ^ Byte) & 
0xff; - CRC = CRCTable[TableIdx] ^ (CRC >> 8); - } -} Modified: llvm/trunk/lib/Transforms/Instrumentation/PGOInstrumentation.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Instrumentation/PGOInstrumentation.cpp?rev=374148&r1=374147&r2=374148&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Instrumentation/PGOInstrumentation.cpp (original) +++ llvm/trunk/lib/Transforms/Instrumentation/PGOInstrumentation.cpp Wed Oct 9 02:06:30 2019 @@ -96,6 +96,7 @@ #include "llvm/ProfileData/InstrProf.h" #include "llvm/ProfileData/InstrProfReader.h" #include "llvm/Support/BranchProbability.h" +#include "llvm/Support/CRC.h" #include "llvm/Support/Casting.h" #include "llvm/Support/CommandLine.h" #include "llvm/Support/DOTGraphTraits.h" @@ -103,7 +104,6 @@ #include "llvm/Support/Error.h" #include "llvm/Support/ErrorHandling.h" #include "llvm/Support/GraphWriter.h" -#include "llvm/Support/JamCRC.h" #include "llvm/Support/raw_ostream.h" #include "llvm/Transforms/Instrumentation.h" #include "llvm/Transforms/Instrumentation/PGOInstrumentation.h" @@ -609,7 +609,7 @@ public: // value of each BB in the CFG. The higher 32 bits record the number of edges. 
template void FuncPGOInstrumentation::computeCFGHash() { - std::vector Indexes; + std::vector Indexes; JamCRC JC; for (auto &BB : F) { const Instruction *TI = BB.getTerminator(); @@ -620,7 +620,7 @@ void FuncPGOInstrumentationIndex; for (int J = 0; J < 4; J++) - Indexes.push_back((char)(Index >> (J * 8))); + Indexes.push_back((uint8_t)(Index >> (J * 8))); } } JC.update(Indexes); Modified: llvm/trunk/tools/llvm-objcopy/COFF/COFFObjcopy.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-objcopy/COFF/COFFObjcopy.cpp?rev=374148&r1=374147&r2=374148&view=diff ============================================================================== --- llvm/trunk/tools/llvm-objcopy/COFF/COFFObjcopy.cpp (original) +++ llvm/trunk/tools/llvm-objcopy/COFF/COFFObjcopy.cpp Wed Oct 9 02:06:30 2019 @@ -16,8 +16,8 @@ #include "llvm/Object/Binary.h" #include "llvm/Object/COFF.h" +#include "llvm/Support/CRC.h" #include "llvm/Support/Errc.h" -#include "llvm/Support/JamCRC.h" #include "llvm/Support/Path.h" #include @@ -40,22 +40,13 @@ static uint64_t getNextRVA(const Object Obj.IsPE ? Obj.PeHeader.SectionAlignment : 1); } -static uint32_t getCRC32(StringRef Data) { - JamCRC CRC; - CRC.update(ArrayRef(Data.data(), Data.size())); - // The CRC32 value needs to be complemented because the JamCRC dosn't - // finalize the CRC32 value. It also dosn't negate the initial CRC32 value - // but it starts by default at 0xFFFFFFFF which is the complement of zero. 
- return ~CRC.getCRC(); -} - static std::vector createGnuDebugLinkSectionContents(StringRef File) { ErrorOr> LinkTargetOrErr = MemoryBuffer::getFile(File); if (!LinkTargetOrErr) error("'" + File + "': " + LinkTargetOrErr.getError().message()); auto LinkTarget = std::move(*LinkTargetOrErr); - uint32_t CRC32 = getCRC32(LinkTarget->getBuffer()); + uint32_t CRC32 = llvm::crc32(arrayRefFromStringRef(LinkTarget->getBuffer())); StringRef FileName = sys::path::filename(File); size_t CRCPos = alignTo(FileName.size() + 1, 4); Modified: llvm/trunk/tools/llvm-objcopy/CopyConfig.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-objcopy/CopyConfig.cpp?rev=374148&r1=374147&r2=374148&view=diff ============================================================================== --- llvm/trunk/tools/llvm-objcopy/CopyConfig.cpp (original) +++ llvm/trunk/tools/llvm-objcopy/CopyConfig.cpp Wed Oct 9 02:06:30 2019 @@ -14,10 +14,10 @@ #include "llvm/ADT/StringSet.h" #include "llvm/Option/Arg.h" #include "llvm/Option/ArgList.h" +#include "llvm/Support/CRC.h" #include "llvm/Support/CommandLine.h" #include "llvm/Support/Compression.h" #include "llvm/Support/Errc.h" -#include "llvm/Support/JamCRC.h" #include "llvm/Support/MemoryBuffer.h" #include "llvm/Support/StringSaver.h" #include @@ -461,12 +461,8 @@ Expected parseObjcopyOptio if (!DebugOrErr) return createFileError(Config.AddGnuDebugLink, DebugOrErr.getError()); auto Debug = std::move(*DebugOrErr); - JamCRC CRC; - CRC.update( - ArrayRef(Debug->getBuffer().data(), Debug->getBuffer().size())); - // The CRC32 value needs to be complemented because the JamCRC doesn't - // finalize the CRC32 value. 
- Config.GnuDebugLinkCRC32 = ~CRC.getCRC(); + Config.GnuDebugLinkCRC32 = + llvm::crc32(arrayRefFromStringRef(Debug->getBuffer())); } Config.BuildIdLinkDir = InputArgs.getLastArgValue(OBJCOPY_build_id_link_dir); if (InputArgs.hasArg(OBJCOPY_build_id_link_input)) Modified: llvm/trunk/unittests/Support/CRCTest.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/unittests/Support/CRCTest.cpp?rev=374148&r1=374147&r2=374148&view=diff ============================================================================== --- llvm/trunk/unittests/Support/CRCTest.cpp (original) +++ llvm/trunk/unittests/Support/CRCTest.cpp Wed Oct 9 02:06:30 2019 @@ -11,6 +11,7 @@ //===----------------------------------------------------------------------===// #include "llvm/Support/CRC.h" +#include "llvm/ADT/StringExtras.h" #include "gtest/gtest.h" using namespace llvm; @@ -18,12 +19,26 @@ using namespace llvm; namespace { TEST(CRCTest, CRC32) { - EXPECT_EQ(0x414FA339U, - llvm::crc32( - 0, StringRef("The quick brown fox jumps over the lazy dog"))); + EXPECT_EQ(0x414FA339U, llvm::crc32(arrayRefFromStringRef( + "The quick brown fox jumps over the lazy dog"))); + // CRC-32/ISO-HDLC test vector // http://reveng.sourceforge.net/crc-catalogue/17plus.htm#crc.cat.crc-32c - EXPECT_EQ(0xCBF43926U, llvm::crc32(0, StringRef("123456789"))); + EXPECT_EQ(0xCBF43926U, llvm::crc32(arrayRefFromStringRef("123456789"))); + + // Check the CRC-32 of each byte value, exercising all of CRCTable. + for (int i = 0; i < 256; i++) { + // Compute CRCTable[i] using Hacker's Delight (2nd ed.) Figure 14-7. + uint32_t crc = i; + for (int j = 7; j >= 0; j--) { + uint32_t mask = -(crc & 1); + crc = (crc >> 1) ^ (0xEDB88320 & mask); + } + + // CRCTable[i] is the CRC-32 of i without the initial and final bit flips. 
+ uint8_t byte = i; + EXPECT_EQ(crc, ~llvm::crc32(0xFFFFFFFFU, byte)); + } } } // end anonymous namespace Modified: llvm/trunk/utils/gn/secondary/llvm/lib/Support/BUILD.gn URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/gn/secondary/llvm/lib/Support/BUILD.gn?rev=374148&r1=374147&r2=374148&view=diff ============================================================================== --- llvm/trunk/utils/gn/secondary/llvm/lib/Support/BUILD.gn (original) +++ llvm/trunk/utils/gn/secondary/llvm/lib/Support/BUILD.gn Wed Oct 9 02:06:30 2019 @@ -79,7 +79,6 @@ static_library("Support") { "IntervalMap.cpp", "ItaniumManglingCanonicalizer.cpp", "JSON.cpp", - "JamCRC.cpp", "KnownBits.cpp", "LEB128.cpp", "LineIterator.cpp", From llvm-commits at lists.llvm.org Wed Oct 9 02:07:21 2019 From: llvm-commits at lists.llvm.org (Clement Courbet via llvm-commits) Date: Wed, 09 Oct 2019 09:07:21 -0000 Subject: [llvm] r374149 - [llvm-exegesis] Add missing std::move in rL374146. Message-ID: <20191009090721.91C5589060@lists.llvm.org> Author: courbet Date: Wed Oct 9 02:07:21 2019 New Revision: 374149 URL: http://llvm.org/viewvc/llvm-project?rev=374149&view=rev Log: [llvm-exegesis] Add missing std::move in rL374146. 
This was breaking some bots: /home/buildbots/ppc64le-clang-lnt-test/clang-ppc64le-lnt/llvm/include/llvm/Support/Error.h:483:5: required from ‘llvm::Expected::Expected(OtherT&&, typename std::enable_if::value>::type*) [with OtherT = std::vector&; T = std::vector; typename std::enable_if::value>::type = void]’ /home/buildbots/ppc64le-clang-lnt-test/clang-ppc64le-lnt/llvm/tools/llvm-exegesis/lib/X86/Target.cpp:238:20: required from here /usr/include/c++/6/bits/stl_construct.h:75:7: error: use of deleted function ‘llvm::exegesis::CodeTemplate::CodeTemplate(const llvm::exegesis::CodeTemplate&)’ { ::new(static_cast(__p)) _T1(std::forward<_Args>(__args)...); } ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Modified: llvm/trunk/tools/llvm-exegesis/lib/X86/Target.cpp Modified: llvm/trunk/tools/llvm-exegesis/lib/X86/Target.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-exegesis/lib/X86/Target.cpp?rev=374149&r1=374148&r2=374149&view=diff ============================================================================== --- llvm/trunk/tools/llvm-exegesis/lib/X86/Target.cpp (original) +++ llvm/trunk/tools/llvm-exegesis/lib/X86/Target.cpp Wed Oct 9 02:07:21 2019 @@ -235,13 +235,13 @@ static llvm::Expected= Opts.MaxConfigsPerOpcode) - return Result; + return std::move(Result); } } } } - return Result; + return std::move(Result); } namespace { From llvm-commits at lists.llvm.org Tue Oct 8 17:05:40 2019 From: llvm-commits at lists.llvm.org (Stanislav Mekhanoshin via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 00:05:40 +0000 (UTC) Subject: [PATCH] D68673: [AMDGPU] Support mov dpp with 64 bit operands Message-ID: rampitec created this revision. rampitec added reviewers: kzhuravl, arsenm, b-sumner. Herald added subscribers: hiraditya, t-tye, tpr, dstuttard, yaxunl, nhaehnle, wdng, jvesely. Herald added a project: LLVM. We define mov/update dpp intrinsics as overloaded but do not support i64, which is a practically useful type. 
Fix the selection and lowering. https://reviews.llvm.org/D68673 Files: llvm/lib/Target/AMDGPU/AMDGPUSubtarget.h llvm/lib/Target/AMDGPU/SIInstrInfo.cpp llvm/lib/Target/AMDGPU/SIInstructions.td llvm/test/CodeGen/AMDGPU/llvm.amdgcn.mov.dpp.ll llvm/test/CodeGen/AMDGPU/llvm.amdgcn.update.dpp.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D68673.223962.patch Type: text/x-patch Size: 8414 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 17:07:39 2019 From: llvm-commits at lists.llvm.org (Richard Smith - zygoloid via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 00:07:39 +0000 (UTC) Subject: [PATCH] D67122: [UBSan][clang][compiler-rt] Applying non-zero offset to nullptr is undefined behaviour In-Reply-To: References: Message-ID: <7775e7eb0779b50e99b96997754ab36b@localhost.localdomain> rsmith added a comment. Looks fine to me with some doc improvements. ================ Comment at: clang/docs/ReleaseNotes.rst:64-66 + non-zero offset to ``nullptr`` (or making non-``nullptr`` a ``nullptr``, + by subtracting pointer's integral value from the pointer itself; in C, also, + applying *any* (even zero) offset to ``nullptr``) is undefined behaviour. ---------------- The parenthetical here makes this awkward to read. Also, I don't think we need to say which LLVM revision this happened in. How about: "In both C and C++ (C17 6.5.6p8, C++ [expr.add]), pointer arithmetic is only permitted within arrays. In particular, the behavior of a program is not defined if it adds a non-zero offset (or in C, any offset) to a null pointer, or that forms a null pointer by subtracting an integer from a non-null pointer, and the LLVM optimizer now uses those guarantees for transformations. This may lead to unintended behavior in code that performs these operations. The Undefined Behavior Sanitizer `-fsanitize=pointer-overflow` check has been extended to detect these cases, so that code relying on them can be detected and fixed." 
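
For readers following along, here is a minimal sketch (not part of the patch under review) of the C++ rule described in the suggested wording above, written as a predicate over integer addresses. The name `offsetIsDefinedCxx` is hypothetical, and the sketch assumes the null pointer is represented by address 0. It models only the two null-pointer cases; it does not model the existing overflow check or the stricter C rule, where any offset applied to a null pointer is undefined.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, for illustration only: models the C++ rule on integer
// addresses instead of performing real pointer arithmetic.
// Assumption: the null pointer is represented by address 0.
bool offsetIsDefinedCxx(std::uintptr_t Base, std::intptr_t Offset) {
  // Unsigned arithmetic wraps, so computing the result address here is
  // itself well defined.
  std::uintptr_t Result = Base + static_cast<std::uintptr_t>(Offset);
  if (Base == 0)
    return Offset == 0; // null + 0 is fine in C++; null + non-zero is not.
  return Result != 0;   // A non-null pointer must not become null.
}
```

Working on integer addresses keeps the sketch itself free of the undefined behavior it describes.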
================ Comment at: clang/docs/ReleaseNotes.rst:242 -- ... +- * ``pointer-overflow`` check was extended added to catch the cases where + a non-zero offset being applied, either to a ``nullptr``, or the result ---------------- Add "The" to the start of this bullet. ================ Comment at: clang/docs/ReleaseNotes.rst:243-244 +- * ``pointer-overflow`` check was extended added to catch the cases where + a non-zero offset being applied, either to a ``nullptr``, or the result + of applying of the offset is a ``nullptr``. + As per C++ Standard ``[expr.add]`` that is undefined behaviour. ---------------- "[...] where a non-zero offset is applied to a null pointer, or the result of applying the offset is a null pointer." ================ Comment at: clang/docs/ReleaseNotes.rst:245 + of applying of the offset is a ``nullptr``. + As per C++ Standard ``[expr.add]`` that is undefined behaviour. + ---------------- I don't think we really need to say this is undefined behavior here. ================ Comment at: clang/docs/UndefinedBehaviorSanitizer.rst:133 - ``-fsanitize=pointer-overflow``: Performing pointer arithmetic which - overflows. + overflows; applying a non-zero (in C++; in C - any) offset to either a + non-``nullptr``, or pointer becoming ``nullptr`` after applying the offset. ---------------- Simplify this a bit: "Performing pointer arithmetic which overflows, or where either the old or new pointer value is a null pointer (or in C, when they both are)." ================ Comment at: clang/lib/CodeGen/CGExprScalar.cpp:4657 + Builder.GetInsertBlock()->getParent(), PtrTy->getPointerAddressSpace()); + // Check for overflows unless the GEP got constant-folded, + // and only in the default address space ---------------- If we want to split out the "constant folded" case to avoid issuing too many sanitizer traps on bogus but common patterns, we should have another sanitizer group to re-enable those diagnostics for the constant-folded cases. 
(I'm fine with not doing that in this patch, though.)


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D67122/new/

https://reviews.llvm.org/D67122



From llvm-commits at lists.llvm.org Tue Oct 8 17:23:47 2019
From: llvm-commits at lists.llvm.org (Matt Arsenault via Phabricator via llvm-commits)
Date: Wed, 09 Oct 2019 00:23:47 +0000 (UTC)
Subject: [PATCH] D68673: [AMDGPU] Support mov dpp with 64 bit operands
In-Reply-To:
References:
Message-ID:

arsenm added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/SIInstructions.td:1869-1875
+  (i64 (int_amdgcn_mov_dpp i64:$src, timm:$dpp_ctrl, timm:$row_mask, timm:$bank_mask,
+  timm:$bound_ctrl)),
+  (V_MOV_B64_DPP_PSEUDO $src, $src, (as_i32imm $dpp_ctrl),
+  (as_i32imm $row_mask), (as_i32imm $bank_mask),
+  (as_i1imm $bound_ctrl))
+>;
+
----------------
Why not do the split here? Why treat it as a post-RA pseudo? At the latest, I would have expected this to be expanded in FinalizeISel.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68673/new/

https://reviews.llvm.org/D68673



From llvm-commits at lists.llvm.org Tue Oct 8 20:13:01 2019
From: llvm-commits at lists.llvm.org (Fangrui Song via Phabricator via llvm-commits)
Date: Wed, 09 Oct 2019 03:13:01 +0000 (UTC)
Subject: [PATCH] D68636: [llvm-readobj] - Refine the LLVM-style output to be consistent.
In-Reply-To:
References:
Message-ID: <8f8243cf3419dc18e5046dae5d3b1829@localhost.localdomain>

MaskRay added inline comments.


================
Comment at: tools/llvm-readobj/ELFDumper.cpp:5598
 const Elf_Shdr *Sec) {
-  DictScope SS(W, "Version symbols");
+  ListScope SS(W, "VersionSymbols");

   if (!Sec)
----------------
grimar wrote:
> MaskRay wrote:
> > jhenderson wrote:
> > > grimar wrote:
> > > > jhenderson wrote:
> > > > > Ditto, though I'm wondering here why the VersionSymbols data includes stuff to do with its section header? If it didn't have that stuff, it would be a list.
> > > > > though I'm wondering here why the VersionSymbols data includes stuff to do with its section header?
> > > >
> > > > I do not know. The same information is printed under "Sections [" tag anyways, so it is not useful probably:
> > > >
> > > > ```
> > > > Section {
> > > >   Index: 3
> > > >   Name: .gnu.version (30)
> > > >   Type: SHT_GNU_versym (0x6FFFFFFF)
> > > >   Flags [ (0x0)
> > > >   ]
> > > >   Address: 0x0
> > > >   Offset: 0xB4
> > > >   Size: 2
> > > >   Link: 0
> > > >   Info: 0
> > > >   AddressAlignment: 0
> > > >   EntrySize: 2
> > > > }
> > > > ```
> > > >
> > > > Should we remove "Section Name"/"Address"/"Offset"/"Link" and make it to be a list?
> > > I'd be inclined to do that personally, but it should be a separate change.
> > The Linux Standard Base calls this "Symbol Version Table" but this is named "VersionSymbols" here... What do you think if we just use the regular section type name "SHT_GNU_versym"? It may improve discoverability as well.
> > What do you think if we just use the regular section type name "SHT_GNU_versym"? It may improve discoverability as well.
>
> I.e. this is an opposite direction to what this patch does:
>
> ```
> SHT_GNU_verdef { -> VersionDefinitions [
> SHT_GNU_verneed { -> VersionRequirements [
> ```
>
> It will be only sections for which we use type names. Should we?

> I.e. this is an opposite direction to what this patch does:

Yes. .gnu.version is currently not consistent with .gnu.version_r and .gnu.version_d, and I know this patch tries to make them consistent. I am not clear which direction we should go. I have a very weak preference for SHT_GNU_versym.

The naming does not seem very consistent here. While LSB names .gnu.version_r "version requirements", binutils-gdb elf.h names it "version needs section".
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68636/new/ https://reviews.llvm.org/D68636 From llvm-commits at lists.llvm.org Tue Oct 8 22:26:33 2019 From: llvm-commits at lists.llvm.org (Michael Collison via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 05:26:33 +0000 (UTC) Subject: [PATCH] D68685: [RISCV] Scheduler description for Rocket Core Message-ID: compilerguy created this revision. compilerguy added a reviewer: asb. Herald added subscribers: llvm-commits, pzheng, s.egerton, lenary, Jim, benna, psnobl, jocewei, PkmX, jfb, rkruppe, the_o, brucehoult, MartinMosbeck, rogfer01, edward-jones, zzheng, MaskRay, jrtc27, shiva0217, kito-cheng, niosHD, sabuasal, apazos, simoncook, johnrusso, rbar, hiraditya. Herald added a project: LLVM. [RISCV] Pipeline scheduler model for RISCV Rocket micro-architecture using the MIScheduler interface. Support for 32 and 64-bit Rocket cores is implemented. Repository: rL LLVM https://reviews.llvm.org/D68685 Files: llvm/lib/Target/RISCV/RISCV.td llvm/lib/Target/RISCV/RISCVInstrInfo.td llvm/lib/Target/RISCV/RISCVInstrInfoA.td llvm/lib/Target/RISCV/RISCVInstrInfoC.td llvm/lib/Target/RISCV/RISCVInstrInfoD.td llvm/lib/Target/RISCV/RISCVInstrInfoF.td llvm/lib/Target/RISCV/RISCVInstrInfoM.td llvm/lib/Target/RISCV/RISCVSchedRocket32.td llvm/lib/Target/RISCV/RISCVSchedRocket64.td llvm/lib/Target/RISCV/RISCVSchedule.td -------------- next part -------------- A non-text attachment was scrubbed... Name: D68685.223988.patch Type: text/x-patch Size: 81356 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Tue Oct 8 23:08:12 2019 From: llvm-commits at lists.llvm.org (Kito Cheng via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 06:08:12 +0000 (UTC) Subject: [PATCH] D68685: [RISCV] Scheduler description for Rocket Core In-Reply-To: References: Message-ID: <8582bf7e366c1ab3be4c8d05d7780fac@localhost.localdomain> kito-cheng added inline comments. 
================ Comment at: llvm/lib/Target/RISCV/RISCVInstrInfoM.td:35 -def REMU : ALU_rr<0b0000001, 0b111, "remu">; -} // Predicates = [HasStdExtM] ---------------- Predicates are removed by accident? Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68685/new/ https://reviews.llvm.org/D68685 From llvm-commits at lists.llvm.org Wed Oct 9 00:11:25 2019 From: llvm-commits at lists.llvm.org (Zixuan Wu (Zeson) via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 07:11:25 +0000 (UTC) Subject: [PATCH] D67148: [LoopVectorize][PowerPC] Estimate int and float register pressure separately in loop-vectorize In-Reply-To: References: Message-ID: <5c8e32a819ed631ca4209aff15da6e52@localhost.localdomain> wuzish updated this revision to Diff 223994. wuzish added a comment. add `; REQUIRES: asserts` for test case CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67148/new/ https://reviews.llvm.org/D67148 Files: llvm/include/llvm/Analysis/TargetTransformInfo.h llvm/include/llvm/Analysis/TargetTransformInfoImpl.h llvm/include/llvm/CodeGen/BasicTTIImpl.h llvm/lib/Analysis/TargetTransformInfo.cpp llvm/lib/Target/AArch64/AArch64TargetTransformInfo.h llvm/lib/Target/ARM/ARMTargetTransformInfo.h llvm/lib/Target/PowerPC/PPCTargetTransformInfo.cpp llvm/lib/Target/PowerPC/PPCTargetTransformInfo.h llvm/lib/Target/SystemZ/SystemZTargetTransformInfo.cpp llvm/lib/Target/SystemZ/SystemZTargetTransformInfo.h llvm/lib/Target/WebAssembly/WebAssemblyTargetTransformInfo.cpp llvm/lib/Target/WebAssembly/WebAssemblyTargetTransformInfo.h llvm/lib/Target/X86/X86TargetTransformInfo.cpp llvm/lib/Target/X86/X86TargetTransformInfo.h llvm/lib/Target/XCore/XCoreTargetTransformInfo.h llvm/lib/Transforms/Scalar/LoopStrengthReduce.cpp llvm/lib/Transforms/Vectorize/LoopVectorize.cpp llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp llvm/test/Transforms/LoopVectorize/PowerPC/reg-usage.ll llvm/test/Transforms/LoopVectorize/X86/reg-usage-debug.ll 
llvm/test/Transforms/LoopVectorize/X86/reg-usage.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D67148.223994.patch Type: text/x-patch Size: 35090 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 02:11:18 2019 From: llvm-commits at lists.llvm.org (Hans via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 09:11:18 +0000 (UTC) Subject: [PATCH] D68570: Unify the two CRC implementations In-Reply-To: References: Message-ID: <3d4e84cbac44c9b5e11888eeb48b2389@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rG1e1e3ba2526e: Unify the two CRC implementations (authored by hansw). Herald added projects: clang, LLDB. Herald added subscribers: lldb-commits, cfe-commits. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68570/new/ https://reviews.llvm.org/D68570 Files: clang/lib/AST/MicrosoftMangle.cpp lld/COFF/PDB.cpp lldb/source/Plugins/ObjectFile/ELF/ObjectFileELF.cpp llvm/include/llvm/Support/CRC.h llvm/include/llvm/Support/JamCRC.h llvm/lib/DebugInfo/PDB/Native/Hash.cpp llvm/lib/DebugInfo/PDB/Native/PDBFileBuilder.cpp llvm/lib/DebugInfo/PDB/Native/TpiHashing.cpp llvm/lib/DebugInfo/Symbolize/Symbolize.cpp llvm/lib/MC/WinCOFFObjectWriter.cpp llvm/lib/Support/CMakeLists.txt llvm/lib/Support/CRC.cpp llvm/lib/Support/JamCRC.cpp llvm/lib/Transforms/Instrumentation/PGOInstrumentation.cpp llvm/tools/llvm-objcopy/COFF/COFFObjcopy.cpp llvm/tools/llvm-objcopy/CopyConfig.cpp llvm/unittests/Support/CRCTest.cpp llvm/utils/gn/secondary/llvm/lib/Support/BUILD.gn -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68570.224004.patch Type: text/x-patch Size: 27379 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 02:15:34 2019 From: llvm-commits at lists.llvm.org (James Molloy via llvm-commits) Date: Wed, 09 Oct 2019 09:15:34 -0000 Subject: [llvm] r374150 - [TableGen] Fix crash when using HwModes in CodeEmitterGen Message-ID: <20191009091534.E95C08FCE7@lists.llvm.org> Author: jamesm Date: Wed Oct 9 02:15:34 2019 New Revision: 374150 URL: http://llvm.org/viewvc/llvm-project?rev=374150&view=rev Log: [TableGen] Fix crash when using HwModes in CodeEmitterGen When an instruction has an encoding definition for only a subset of the available HwModes, ensure we just avoid generating an encoding rather than crash. Modified: llvm/trunk/test/TableGen/HwModeEncodeDecode.td llvm/trunk/utils/TableGen/CodeEmitterGen.cpp Modified: llvm/trunk/test/TableGen/HwModeEncodeDecode.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/TableGen/HwModeEncodeDecode.td?rev=374150&r1=374149&r2=374150&view=diff ============================================================================== --- llvm/trunk/test/TableGen/HwModeEncodeDecode.td (original) +++ llvm/trunk/test/TableGen/HwModeEncodeDecode.td Wed Oct 9 02:15:34 2019 @@ -56,6 +56,15 @@ def bar: Instruction { let Inst{1-0} = 0b10; let AsmString = "bar $factor"; } + +def baz : Instruction { + let InOperandList = (ins i32imm:$factor); + bits<32> Inst; + let EncodingInfos = EncodingByHwMode< + [ModeB], [fooTypeEncA] + >; + let AsmString = "foo $factor"; +} } // DECODER-LABEL: DecoderTable_ModeA32[] = Modified: llvm/trunk/utils/TableGen/CodeEmitterGen.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/TableGen/CodeEmitterGen.cpp?rev=374150&r1=374149&r2=374150&view=diff ============================================================================== --- llvm/trunk/utils/TableGen/CodeEmitterGen.cpp (original) +++ llvm/trunk/utils/TableGen/CodeEmitterGen.cpp Wed Oct 9 02:15:34 2019 @@ -367,7 +367,8 @@ void 
CodeEmitterGen::emitInstructionBase if (const RecordVal *RV = R->getValue("EncodingInfos")) { if (auto *DI = dyn_cast_or_null(RV->getValue())) { EncodingInfoByHwMode EBM(DI->getDef(), HWM); - EncodingDef = EBM.get(HwMode); + if (EBM.hasMode(HwMode)) + EncodingDef = EBM.get(HwMode); } } BitsInit *BI = EncodingDef->getValueAsBitsInit("Inst"); From llvm-commits at lists.llvm.org Wed Oct 9 02:20:18 2019 From: llvm-commits at lists.llvm.org (James Henderson via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 09:20:18 +0000 (UTC) Subject: [PATCH] D68645: MinidumpYAML: Add support for the memory info list stream In-Reply-To: References: Message-ID: <80ccd4257c7e8c8cd99a2244ed530cab@localhost.localdomain> jhenderson added subscribers: MaskRay, grimar. jhenderson added a comment. FYI, I'm going to be away for 2 and a half weeks from the end of work today, so won't have time to look at these if I don't get to them later today. I have no issues with other people reviewing them. You might want to add @grimar and @MaskRay to the reviews as they've been doing a lot of work in obj2yaml/yaml2obj recently. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68645/new/ https://reviews.llvm.org/D68645 From llvm-commits at lists.llvm.org Wed Oct 9 02:20:53 2019 From: llvm-commits at lists.llvm.org (ChenZheng via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 09:20:53 +0000 (UTC) Subject: [PATCH] D66329: [PowerPC] [Peephole] fold frame offset by using index form to save add. In-Reply-To: References: Message-ID: shchenz updated this revision to Diff 224003. shchenz added a comment. `ScaleReg` can not redefined between ADD and Imm instruction. 
CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D66329/new/

https://reviews.llvm.org/D66329

Files:
  llvm/lib/Target/PowerPC/PPCInstrInfo.cpp
  llvm/lib/Target/PowerPC/PPCInstrInfo.h
  llvm/lib/Target/PowerPC/PPCPreEmitPeephole.cpp
  llvm/lib/Target/PowerPC/PPCRegisterInfo.h
  llvm/test/CodeGen/PowerPC/fold-frame-offset-using-rr.mir

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D66329.224003.patch
Type: text/x-patch
Size: 14095 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Wed Oct 9 02:47:26 2019
From: llvm-commits at lists.llvm.org (Dávid Bolvanský via Phabricator via llvm-commits)
Date: Wed, 09 Oct 2019 09:47:26 +0000 (UTC)
Subject: [PATCH] D68667: [SLP] respect target register width for GEP vectorization (PR43578)
In-Reply-To:
References:
Message-ID:

xbolva00 added a comment.

I can confirm that the h264 benchmark is now at least as good as plain -O3.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68667/new/

https://reviews.llvm.org/D68667



From llvm-commits at lists.llvm.org Wed Oct 9 02:47:27 2019
From: llvm-commits at lists.llvm.org (Oliver Stannard (Linaro) via Phabricator via llvm-commits)
Date: Wed, 09 Oct 2019 09:47:27 +0000 (UTC)
Subject: [PATCH] D67350: [IfCvt][ARM] Optimise diamond if-conversion for code size
In-Reply-To:
References:
Message-ID: <3a77c3cd92ed6181cbad681822e5af0d@localhost.localdomain>

ostannard added a comment.

Ping


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D67350/new/

https://reviews.llvm.org/D67350



From llvm-commits at lists.llvm.org Wed Oct 9 02:47:27 2019
From: llvm-commits at lists.llvm.org (Simon Atanasyan via Phabricator via llvm-commits)
Date: Wed, 09 Oct 2019 09:47:27 +0000 (UTC)
Subject: [PATCH] D68649: [Mips][llvm-exegesis] Add a Mips target
In-Reply-To:
References:
Message-ID: <2707991647bf28790ca5aa1102292e19@localhost.localdomain>

atanasyan added inline comments.


================
Comment at: tools/llvm-exegesis/lib/Assembler.cpp:241
   // - prologepilog: saves and restore callee saved registers.
-  for (const char *PassName : {"machineverifier", "prologepilog"})
+  for (const char *PassName : {"postrapseudos", "machineverifier", "prologepilog"})
     if (addPass(PM, PassName, *TPC))
----------------
Please clang-format this line; it's too long.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68649/new/

https://reviews.llvm.org/D68649



From llvm-commits at lists.llvm.org Wed Oct 9 03:05:50 2019
From: llvm-commits at lists.llvm.org (Dávid Bolvanský via Phabricator via llvm-commits)
Date: Wed, 09 Oct 2019 10:05:50 +0000 (UTC)
Subject: [PATCH] D68667: [SLP] respect target register width for GEP vectorization (PR43578)
In-Reply-To:
References:
Message-ID:

xbolva00 added a comment.

Generally, I think there are more bugs for -march=haswell. Only in rare cases is the perf of binaries built with -march=haswell better than plain -O3. I tried this patch with zstd but nothing improved.

Plain -O3

  ./zstd -b selesiafiles/* -f
  3# 13 files : 251919670 -> 97724903 (2.578), 182.0 MB/s , 923.2 MB/s

-O3 -march=haswell

  /zstd -b selesiafiles/* -f
  3# 13 files : 251919670 -> 97724903 (2.578), 185.7 MB/s , 866.9 MB/s

-O3 -march=haswell -mprefer-vector-width=128

  ./zstd -b bench/* -f
  3# 13 files : 251919670 -> 97724903 (2.578), 188.5 MB/s , 806.8 MB/s

for example gcc-10's results for -march=haswell

  ./zstd -b bench/* -f
  3# 13 files : 251919670 -> 97724903 (2.578), 188.7 MB/s ,1032.8 MB/s


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68667/new/

https://reviews.llvm.org/D68667



From llvm-commits at lists.llvm.org Wed Oct 9 03:33:03 2019
From: llvm-commits at lists.llvm.org (Renato Golin via Phabricator via llvm-commits)
Date: Wed, 09 Oct 2019 10:33:03 +0000 (UTC)
Subject: [PATCH] D68577: [LV] Apply sink-after & interleave-groups as VPlan transformations (NFC)
In-Reply-To:
References:
Message-ID: <86bc06055849bcf853eb782d4b9b62e1@localhost.localdomain>

rengolin added inline comments.


================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:6472

+bool LoopVectorizationPlanner::tryToInterleaveMemory(
+    const InterleaveGroup *IG, VFRange &Range) {
----------------
Other try{something} functions return a recipe pointer, while this one returns a boolean.

If you rename this to "check" or "can" (instead of try), then you shouldn't clamp the range.

I'm not sure what's best here, but this way looks a bit odd.


Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68577/new/

https://reviews.llvm.org/D68577



From llvm-commits at lists.llvm.org Wed Oct 9 03:33:04 2019
From: llvm-commits at lists.llvm.org (Jay Foad via Phabricator via llvm-commits)
Date: Wed, 09 Oct 2019 10:33:04 +0000 (UTC)
Subject: [PATCH] D68563: [AMDGPU] Disable a test that was relying on misched behavior
In-Reply-To:
References:
Message-ID: <83846ccf354cbc1f4d8226f1623dae42@localhost.localdomain>

foad added a comment.
In D68563#1698059 , @arsenm wrote: > I'm curious what the scheduler is able to do here? Everything is volatile and non-reorderable It moves a bunch of sgpr to vgpr copies: # *** IR Dump Before Machine Instruction Scheduler ***: # Machine code for function max_9_sgprs: NoPHIs, TracksLiveness 0B bb.0 (%ir-block.0): 16B %4:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM undef %13:sreg_64, 0, 0, 0 :: (volatile load 4 from `i32 addrspace(4)* undef`, addrspace 4) 32B %5:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM undef %15:sreg_64, 0, 0, 0 :: (volatile load 4 from `i32 addrspace(4)* undef`, addrspace 4) 48B %6:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM undef %17:sreg_64, 0, 0, 0 :: (volatile load 4 from `i32 addrspace(4)* undef`, addrspace 4) 64B %7:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM undef %19:sreg_64, 0, 0, 0 :: (volatile load 4 from `i32 addrspace(4)* undef`, addrspace 4) 80B %8:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM undef %21:sreg_64, 0, 0, 0 :: (volatile load 4 from `i32 addrspace(4)* undef`, addrspace 4) 96B %9:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM undef %23:sreg_64, 0, 0, 0 :: (volatile load 4 from `i32 addrspace(4)* undef`, addrspace 4) 112B %10:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM undef %25:sreg_64, 0, 0, 0 :: (volatile load 4 from `i32 addrspace(4)* undef`, addrspace 4) 128B %11:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM undef %27:sreg_64, 0, 0, 0 :: (volatile load 4 from `i32 addrspace(4)* undef`, addrspace 4) 144B %28:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM undef %29:sreg_64, 0, 0, 0 :: (volatile load 4 from `i32 addrspace(4)* undef`, addrspace 4) 160B %30:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM undef %31:sreg_64, 0, 0, 0 :: (volatile load 4 from `i32 addrspace(4)* undef`, addrspace 4) 176B INLINEASM &"" [sideeffect] [attdialect], $0:[reguse:SReg_32_XM0], %4:sreg_32_xm0_xexec, $1:[reguse:SReg_32_XM0], %5:sreg_32_xm0_xexec, $2:[reguse:SReg_32_XM0], %6:sreg_32_xm0_xexec, $3:[reguse:SReg_32_XM0], %7:sreg_32_xm0_xexec, $4:[reguse:SReg_32_XM0], %8:sreg_32_xm0_xexec, $5:[reguse:SReg_32_XM0], 
%9:sreg_32_xm0_xexec, $6:[reguse:SReg_32_XM0], %10:sreg_32_xm0_xexec, $7:[reguse:SReg_32_XM0], %11:sreg_32_xm0_xexec 192B %34:vgpr_32 = COPY %4:sreg_32_xm0_xexec 208B FLAT_STORE_DWORD undef %33:vreg_64, %34:vgpr_32, 0, 0, 0, 0, implicit $exec, implicit $flat_scr :: (volatile store 4 into `i32 addrspace(1)* undef`, addrspace 1) 224B %37:vgpr_32 = COPY %5:sreg_32_xm0_xexec 240B FLAT_STORE_DWORD undef %36:vreg_64, %37:vgpr_32, 0, 0, 0, 0, implicit $exec, implicit $flat_scr :: (volatile store 4 into `i32 addrspace(1)* undef`, addrspace 1) 256B %40:vgpr_32 = COPY %6:sreg_32_xm0_xexec 272B FLAT_STORE_DWORD undef %39:vreg_64, %40:vgpr_32, 0, 0, 0, 0, implicit $exec, implicit $flat_scr :: (volatile store 4 into `i32 addrspace(1)* undef`, addrspace 1) 288B %43:vgpr_32 = COPY %7:sreg_32_xm0_xexec 304B FLAT_STORE_DWORD undef %42:vreg_64, %43:vgpr_32, 0, 0, 0, 0, implicit $exec, implicit $flat_scr :: (volatile store 4 into `i32 addrspace(1)* undef`, addrspace 1) 320B %46:vgpr_32 = COPY %8:sreg_32_xm0_xexec 336B FLAT_STORE_DWORD undef %45:vreg_64, %46:vgpr_32, 0, 0, 0, 0, implicit $exec, implicit $flat_scr :: (volatile store 4 into `i32 addrspace(1)* undef`, addrspace 1) 352B %49:vgpr_32 = COPY %9:sreg_32_xm0_xexec 368B FLAT_STORE_DWORD undef %48:vreg_64, %49:vgpr_32, 0, 0, 0, 0, implicit $exec, implicit $flat_scr :: (volatile store 4 into `i32 addrspace(1)* undef`, addrspace 1) 384B %52:vgpr_32 = COPY %10:sreg_32_xm0_xexec 400B FLAT_STORE_DWORD undef %51:vreg_64, %52:vgpr_32, 0, 0, 0, 0, implicit $exec, implicit $flat_scr :: (volatile store 4 into `i32 addrspace(1)* undef`, addrspace 1) 416B %55:vgpr_32 = COPY %11:sreg_32_xm0_xexec 432B FLAT_STORE_DWORD undef %54:vreg_64, %55:vgpr_32, 0, 0, 0, 0, implicit $exec, implicit $flat_scr :: (volatile store 4 into `i32 addrspace(1)* undef`, addrspace 1) 448B %58:vgpr_32 = COPY %28:sreg_32_xm0_xexec 464B FLAT_STORE_DWORD undef %57:vreg_64, %58:vgpr_32, 0, 0, 0, 0, implicit $exec, implicit $flat_scr :: (volatile store 4 into `i32 
addrspace(1)* undef`, addrspace 1) 480B %61:vgpr_32 = COPY %30:sreg_32_xm0_xexec 496B FLAT_STORE_DWORD undef %60:vreg_64, %61:vgpr_32, 0, 0, 0, 0, implicit $exec, implicit $flat_scr :: (volatile store 4 into `i32 addrspace(1)* undef`, addrspace 1) 512B S_ENDPGM 0 # End machine code for function max_9_sgprs. # *** IR Dump After Machine Instruction Scheduler ***: # Machine code for function max_9_sgprs: NoPHIs, TracksLiveness 0B bb.0 (%ir-block.0): 16B %4:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM undef %13:sreg_64, 0, 0, 0 :: (volatile load 4 from `i32 addrspace(4)* undef`, addrspace 4) 32B %5:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM undef %15:sreg_64, 0, 0, 0 :: (volatile load 4 from `i32 addrspace(4)* undef`, addrspace 4) 48B %6:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM undef %17:sreg_64, 0, 0, 0 :: (volatile load 4 from `i32 addrspace(4)* undef`, addrspace 4) 64B %7:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM undef %19:sreg_64, 0, 0, 0 :: (volatile load 4 from `i32 addrspace(4)* undef`, addrspace 4) 72B %34:vgpr_32 = COPY %4:sreg_32_xm0_xexec 76B %37:vgpr_32 = COPY %5:sreg_32_xm0_xexec 84B %40:vgpr_32 = COPY %6:sreg_32_xm0_xexec 88B %43:vgpr_32 = COPY %7:sreg_32_xm0_xexec 92B %8:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM undef %21:sreg_64, 0, 0, 0 :: (volatile load 4 from `i32 addrspace(4)* undef`, addrspace 4) 100B %46:vgpr_32 = COPY %8:sreg_32_xm0_xexec 108B %9:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM undef %23:sreg_64, 0, 0, 0 :: (volatile load 4 from `i32 addrspace(4)* undef`, addrspace 4) 116B %49:vgpr_32 = COPY %9:sreg_32_xm0_xexec 124B %10:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM undef %25:sreg_64, 0, 0, 0 :: (volatile load 4 from `i32 addrspace(4)* undef`, addrspace 4) 132B %52:vgpr_32 = COPY %10:sreg_32_xm0_xexec 140B %11:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM undef %27:sreg_64, 0, 0, 0 :: (volatile load 4 from `i32 addrspace(4)* undef`, addrspace 4) 148B %55:vgpr_32 = COPY %11:sreg_32_xm0_xexec 156B %28:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM undef %29:sreg_64, 0, 0, 0 :: (volatile load 4 from 
`i32 addrspace(4)* undef`, addrspace 4) 164B %58:vgpr_32 = COPY %28:sreg_32_xm0_xexec 172B %30:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM undef %31:sreg_64, 0, 0, 0 :: (volatile load 4 from `i32 addrspace(4)* undef`, addrspace 4) 176B INLINEASM &"" [sideeffect] [attdialect], $0:[reguse:SReg_32_XM0], %4:sreg_32_xm0_xexec, $1:[reguse:SReg_32_XM0], %5:sreg_32_xm0_xexec, $2:[reguse:SReg_32_XM0], %6:sreg_32_xm0_xexec, $3:[reguse:SReg_32_XM0], %7:sreg_32_xm0_xexec, $4:[reguse:SReg_32_XM0], %8:sreg_32_xm0_xexec, $5:[reguse:SReg_32_XM0], %9:sreg_32_xm0_xexec, $6:[reguse:SReg_32_XM0], %10:sreg_32_xm0_xexec, $7:[reguse:SReg_32_XM0], %11:sreg_32_xm0_xexec 208B FLAT_STORE_DWORD undef %33:vreg_64, %34:vgpr_32, 0, 0, 0, 0, implicit $exec, implicit $flat_scr :: (volatile store 4 into `i32 addrspace(1)* undef`, addrspace 1) 240B FLAT_STORE_DWORD undef %36:vreg_64, %37:vgpr_32, 0, 0, 0, 0, implicit $exec, implicit $flat_scr :: (volatile store 4 into `i32 addrspace(1)* undef`, addrspace 1) 272B FLAT_STORE_DWORD undef %39:vreg_64, %40:vgpr_32, 0, 0, 0, 0, implicit $exec, implicit $flat_scr :: (volatile store 4 into `i32 addrspace(1)* undef`, addrspace 1) 304B FLAT_STORE_DWORD undef %42:vreg_64, %43:vgpr_32, 0, 0, 0, 0, implicit $exec, implicit $flat_scr :: (volatile store 4 into `i32 addrspace(1)* undef`, addrspace 1) 336B FLAT_STORE_DWORD undef %45:vreg_64, %46:vgpr_32, 0, 0, 0, 0, implicit $exec, implicit $flat_scr :: (volatile store 4 into `i32 addrspace(1)* undef`, addrspace 1) 368B FLAT_STORE_DWORD undef %48:vreg_64, %49:vgpr_32, 0, 0, 0, 0, implicit $exec, implicit $flat_scr :: (volatile store 4 into `i32 addrspace(1)* undef`, addrspace 1) 400B FLAT_STORE_DWORD undef %51:vreg_64, %52:vgpr_32, 0, 0, 0, 0, implicit $exec, implicit $flat_scr :: (volatile store 4 into `i32 addrspace(1)* undef`, addrspace 1) 432B FLAT_STORE_DWORD undef %54:vreg_64, %55:vgpr_32, 0, 0, 0, 0, implicit $exec, implicit $flat_scr :: (volatile store 4 into `i32 addrspace(1)* undef`, addrspace 1) 464B 
FLAT_STORE_DWORD undef %57:vreg_64, %58:vgpr_32, 0, 0, 0, 0, implicit $exec, implicit $flat_scr :: (volatile store 4 into `i32 addrspace(1)* undef`, addrspace 1) 480B %61:vgpr_32 = COPY %30:sreg_32_xm0_xexec 496B FLAT_STORE_DWORD undef %60:vreg_64, %61:vgpr_32, 0, 0, 0, 0, implicit $exec, implicit $flat_scr :: (volatile store 4 into `i32 addrspace(1)* undef`, addrspace 1) 512B S_ENDPGM 0 # End machine code for function max_9_sgprs. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68563/new/ https://reviews.llvm.org/D68563 From llvm-commits at lists.llvm.org Wed Oct 9 03:42:01 2019 From: llvm-commits at lists.llvm.org (=?utf-8?q?Nicolai_H=C3=A4hnle_via_Phabricator?= via llvm-commits) Date: Wed, 09 Oct 2019 10:42:01 +0000 (UTC) Subject: [PATCH] D68092: [AMDGPU] Invert the handling of skip insertion. In-Reply-To: References: Message-ID: nhaehnle added inline comments. ================ Comment at: lib/Target/AMDGPU/SIRemoveShortExecBranches.cpp:116-117 + + if (MDT->dominates(TrueMBB, &SrcMBB) || + mustRetainExeczBranch(*FalseMBB, *TrueMBB)) + return false; ---------------- arsenm wrote: > cdevadas wrote: > > nhaehnle wrote: > > > What's the logic here behind using domination as a criterion? > > There could be a situation in which execnz (inserted during SI_LOOP lowering) can be inverted to execz by an optimization (for instance, BranchFolding). This execz should always be retained. This special check is added to handle it. > > Unfortunately, I couldn't write/find a test-case to reproduce it. > I'm not sure dominance is sufficient for irreducible loops, which you won't run into in practice (as in, they probably hit another control flow bug long before this) but we should handle it correctly I have seen irreducible loops go all the way through compilation (because they triggered a bug somewhere, I believe in waitcount insertion), so yeah, that needs to be handled correctly. 
I still think a reasonable way to do this is just to scan forward like mustRetainExeczBranch already does, see if we encounter the execz target block during that scan, and only remove the execz branch in that case.

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68092/new/

https://reviews.llvm.org/D68092

From llvm-commits at lists.llvm.org Wed Oct 9 03:43:22 2019
From: llvm-commits at lists.llvm.org (Sam Elliott via Phabricator via llvm-commits)
Date: Wed, 09 Oct 2019 10:43:22 +0000 (UTC)
Subject: [PATCH] D66887: [test-suite] Add GCC C Torture Suite
In-Reply-To:
References:
Message-ID:

lenary updated this revision to Diff 224008.
lenary added a comment.

- Add architecture-testing guard to gcc c torture suite

Repository:
  rT test-suite

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D66887/new/

https://reviews.llvm.org/D66887

Files:
  LICENSE.TXT
  SingleSource/Regression/C/CMakeLists.txt
  SingleSource/Regression/C/gcc-c-torture/CMakeLists.txt
  SingleSource/Regression/C/gcc-c-torture/README
  SingleSource/Regression/C/gcc-c-torture/execute/CMakeLists.txt
  SingleSource/Regression/C/gcc-c-torture/execute/COPYING
  SingleSource/Regression/C/gcc-c-torture/execute/COPYING3
  SingleSource/Regression/C/gcc-c-torture/execute/LICENSE.TXT
  SingleSource/Regression/C/gcc-c-torture/execute/ieee/CMakeLists.txt
  SingleSource/Regression/C/gcc-c-torture/lit.local.cfg

From llvm-commits at lists.llvm.org Wed Oct 9 03:50:44 2019
From: llvm-commits at lists.llvm.org (Sam Elliott via llvm-commits)
Date: Wed, 09 Oct 2019 10:50:44 -0000
Subject: [test-suite] r374155 - [test-suite] Add GCC C Torture Suite
Message-ID: <20191009105044.658339086F@lists.llvm.org>

Author: lenary
Date: Wed Oct 9 03:50:44 2019
New Revision: 374155

URL: http://llvm.org/viewvc/llvm-project?rev=374155&view=rev
Log:
[test-suite] Add GCC C Torture Suite

Summary:
This patch adds support for testing clang/LLVM against the GCC Torture suite. This patch adds the CMake configuration and licence information for these tests. A follow-up patch will add the testcases themselves (which are too large to review, and included without modifications). They will be committed together.

Reviewers: hfinkel, kristof.beyls, asb

Reviewed By: kristof.beyls

Subscribers: khcheang, mehdi_amini, jvesely, krytarowski, fedor.sergeev, zzheng, steven_wu, dexonsmith, arphaman, jfb, mstorsjo, lewis-revill, simoncook, s.egerton, riccibruno, asb, mgorny, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66887

Added:
    test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/
    test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/CMakeLists.txt
    test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/README
    test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/
    test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/CMakeLists.txt
    test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/COPYING
    test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/COPYING3
    test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/LICENSE.TXT
    test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/
    test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/CMakeLists.txt
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/lit.local.cfg Modified: test-suite/trunk/LICENSE.TXT test-suite/trunk/SingleSource/Regression/C/CMakeLists.txt Modified: test-suite/trunk/LICENSE.TXT URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/LICENSE.TXT?rev=374155&r1=374154&r2=374155&view=diff ============================================================================== --- test-suite/trunk/LICENSE.TXT (original) +++ test-suite/trunk/LICENSE.TXT Wed Oct 9 03:50:44 2019 @@ -304,6 +304,7 @@ zlib: llvm-test/MultiSourc Rodinia: llvm-test/MultiSource/Benchmarks/Rodinia Rodinia: llvm-test/MultiSource/Benchmarks/Rodinia Rodinia: llvm-test/MultiSource/Benchmarks/Rodinia +gcc-c-torture: llvm-test/SingleSource/Regression/C/gcc-c-torture/execute ============================================================================== Legacy LLVM License (ttps://llvm.org/docs/DeveloperPolicy.html#legacy): Modified: test-suite/trunk/SingleSource/Regression/C/CMakeLists.txt URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/CMakeLists.txt?rev=374155&r1=374154&r2=374155&view=diff ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/CMakeLists.txt (original) +++ test-suite/trunk/SingleSource/Regression/C/CMakeLists.txt Wed Oct 9 03:50:44 2019 @@ -1,2 +1,6 @@ +if(ARCH MATCHES "x86" OR ARCH MATCHES "riscv") + add_subdirectory(gcc-c-torture) +endif() + list(APPEND LDFLAGS -lm) llvm_singlesource(PREFIX "Regression-C-") Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/CMakeLists.txt URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/CMakeLists.txt?rev=374155&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/CMakeLists.txt (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/CMakeLists.txt Wed 
Oct 9 03:50:44 2019 @@ -0,0 +1,62 @@ +# The following cause errors if they are passed to clang via CFLAGS +set(CLANG_ERRORING_CFLAGS + "-fno-early-inlining" + "-fno-ira-share-spill-slots" + "-ftree-loop-distribution" + "-fno-tree-bit-ccp" + "-fno-tree-coalesce-vars" +) + +# This pulls out options in dg-options into `${Variable}` +function(gcc_torture_dg_options_cflags Variable File) + # Some files have dg-options which we need to pick up. These should be in + # the first line but often aren't. + # + # We also need to be careful not to pick up target-specific dg-options. + set(DG_CFLAGS) + + file(STRINGS ${File} FileLines) + foreach(FileLine ${FileLines}) + # Looking for `dg-options "..."` or `dg-additional-options "..."` without + # `{ target` afterwards (ignoring spaces). + if(FileLine MATCHES "dg-(additional-)?options ({ )?\"([^\"]*)\"( })?(.*)") + # This is needed to turn the string into a list of CFLAGS + separate_arguments(FILE_CFLAGS UNIX_COMMAND ${CMAKE_MATCH_3}) + # This does the negative lookahead for `{ target` anywhere in the rest of + # the line + if(NOT ${CMAKE_MATCH_5} MATCHES "{ +target") + list(APPEND DG_CFLAGS ${FILE_CFLAGS}) + endif() + endif() + endforeach() + + # Remove any flags that will make clang error + list(REMOVE_ITEM DG_CFLAGS ${CLANG_ERRORING_CFLAGS}) + + # Set the parent scope variable + set(${Variable} ${DG_CFLAGS} PARENT_SCOPE) +endfunction() + +function(gcc_torture_execute_test File) + cmake_parse_arguments(_TORTURE "" "PREFIX" "CFLAGS;LDFLAGS" ${ARGN}) + # There are a few tests with duplicate filenames, and CMake wants all target + # names to be unique, so we add a disambiguator to the target name. The + # disambiguator is based upon the directory structure, swapping `/` for `-`. 
+ get_filename_component(Name ${File} NAME_WE) + set(_target "${_TORTURE_PREFIX}-${Name}") + + gcc_torture_dg_options_cflags(DG_CFLAGS ${File}) + + # Add any flags that were requested + list(APPEND CFLAGS ${DG_CFLAGS} ${_TORTURE_CFLAGS}) + list(APPEND LDFLAGS ${_TORTURE_LDFLAGS}) + + llvm_test_executable_no_test(${_target} ${File}) + llvm_test_run() + + llvm_add_test_for_target(${_target}) +endfunction() + +file(COPY lit.local.cfg DESTINATION "${CMAKE_CURRENT_BINARY_DIR}") + +add_subdirectory(execute) \ No newline at end of file Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/README URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/README?rev=374155&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/README (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/README Wed Oct 9 03:50:44 2019 @@ -0,0 +1,17 @@ +# GCC C Torture Suite + +This directory contains a checkout of +svn://gcc.gnu.org/svn/gcc/trunk/gcc/testsuite/gcc.c-torture/execute . + +Last checked at SVN version 275024 + +CMake Build configuration added by the LLVM project. The tests in +`execute/builtins` are not run, and each CMakeLists.txt contains a list of tests +to skip. These are both general and architecture-specific. + +There are not Makefiles for this part of the test suite. + +# Platform-specific concerns: + +RISC-V: The list of tests to exclude on RISC-V was devised using glibc and qemu. +You may have to exclude additional tests when running with newlib and/or spike. 
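Editorial note on the gcc-c-torture/CMakeLists.txt added earlier in this commit: the filtering done by `gcc_torture_dg_options_cflags` can be easier to follow outside CMake. The sketch below is only an illustration in Python, not part of the commit. It reuses the same regular expression and the same `CLANG_ERRORING_CFLAGS` list as the CMake code, and assumes that `shlex.split` is a close enough stand-in for `separate_arguments(... UNIX_COMMAND ...)`. The function name `dg_options_cflags` is hypothetical.

```python
import re
import shlex

# Same flags as CLANG_ERRORING_CFLAGS in the CMake file.
CLANG_ERRORING_CFLAGS = {
    "-fno-early-inlining",
    "-fno-ira-share-spill-slots",
    "-ftree-loop-distribution",
    "-fno-tree-bit-ccp",
    "-fno-tree-coalesce-vars",
}

# Same pattern as the CMake MATCHES expression, with braces escaped.
# Group 3 holds the quoted option string, group 5 the rest of the line.
DG_OPTIONS = re.compile(r'dg-(additional-)?options (\{ )?"([^"]*)"( \})?(.*)')
TARGET_SELECTOR = re.compile(r"\{ +target")


def dg_options_cflags(lines):
    """Collect non-target-specific dg-options flags that clang accepts."""
    flags = []
    for line in lines:
        match = DG_OPTIONS.search(line)
        if match is None:
            continue
        # Skip directives followed by a `{ target ...}` selector.
        if TARGET_SELECTOR.search(match.group(5)):
            continue
        # Assumption: shell-style splitting approximates UNIX_COMMAND parsing.
        flags.extend(shlex.split(match.group(3)))
    # Drop every occurrence of a flag that makes clang error out.
    return [flag for flag in flags if flag not in CLANG_ERRORING_CFLAGS]
```

For a test whose first line is `/* { dg-options "-O2 -fno-early-inlining" } */`, this keeps only `-O2`, while a directive carrying a `{ target ... }` selector contributes nothing. That second case is the reason the commit has to skip 20101011-1.c, whose required define sits behind a target selector.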
Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/CMakeLists.txt URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/CMakeLists.txt?rev=374155&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/CMakeLists.txt (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/CMakeLists.txt Wed Oct 9 03:50:44 2019 @@ -0,0 +1,278 @@ +add_subdirectory(ieee) + +# GCC C Torture Suite is conventionally run without warnings +list(APPEND CFLAGS "-w") + +set(TestsToSkip) + +## +## Main Test Blacklist for Clang +## + +# Tests with features unsupported by Clang (usually GCC extensions) +# (Big list of naughty tests) +file(GLOB UnsupportedTests CONFIGURE_DEPENDS + # GCC Extension: Nested functions + 20000822-1.c + 20010209-1.c + 20010605-1.c + 20030501-1.c + 20040520-1.c + 20061220-1.c + 20090219-1.c + 920415-1.c + 920428-2.c + 920501-7.c + 920612-2.c + 920721-4.c + 921017-1.c + 921215-1.c + 931002-1.c + comp-goto-2.c + nest-align-1.c + nest-stdar-1.c + nestfunc-1.c + nestfunc-2.c + nestfunc-3.c + nestfunc-5.c + nestfunc-6.c + nestfunc-7.c + pr22061-3.c + pr22061-4.c + pr24135.c + pr51447.c + pr71494.c + + # Variable length arrays in structs + 20020412-1.c + 20040308-1.c + 20040423-1.c + 20041218-2.c + 20070919-1.c + align-nest.c + pr41935.c + pr82210.c + + # Initialization of flexible array member + pr28865.c + + # GCC Extension: __builtin_* + 20071018-1.c # __builtin_malloc + 20071120-1.c # __builtin_malloc + builtin-bitops-1.c # __builtin_clrsb, __builtin_clrsbl, __builtin_clrsbll + pr36765.c # __builtin_malloc + pr39228.c # __builtin_isinff, __builtin_isinfl + pr43008.c # __builtin_malloc + pr47237.c # __builtin_apply, __builtin_apply_args + pr78586.c # __builtin_sprintf + pr79327.c # __builtin_sprintf + pr84339.c # __builtin_malloc, __builtin_free + pr84478.c # __builtin_malloc + 
pr85331.c # __builtin_shuffle + strlen-7.c # __builtin_malloc + va-arg-pack-1.c # __builtin_va_arg_pack + + # Clang does not support 'DD' suffix on floating constant + pr80692.c + + # Test requires compiler to recognise llabs without including - + # clang will only recognise this function if the header is included. + 20021127-1.c + + # Tests __attribute__((noinit)) + noinit-attribute.c + + # We are unable to parse the dg-additional-options for this test, which is + # required for it to work (we discard any with `target`, but we need the + # define for this test) + 20101011-1.c + + # The following rely on C Undefined Behavior + + # Test relies on UB around (float)INT_MAX + 20031003-1.c + + # UB: Expects very specific behavior around setjmp/longjmp and allocas, which + # clang is not obliged to replicate. + pr64242.c + + # The following all expect very specific optimiser behavior from the compiler + + # __builtin_return_address(n) with n > 0 not guaranteed to give expected result + 20010122-1.c + + # Expects gnu89 inline behavior + 20001121-1.c + 20020107-1.c + 930526-1.c + 961223-1.c + 980608-1.c + bcp-1.c + loop-2c.c + p18298.c + restrict-1.c + unroll-1.c + va-arg-7.c + va-arg-8.c + + # Clang at O0 does not work out the code referencing the undefined symbol can + # never be executed + medce-1.c + + # Expects that function is always inlined + 990208-1.c + + # pragma optimize("-option") is ignored by Clang + alias-1.c + pr79043.c + + # The following all expect very specific optimiser behavior from the compiler + # around __printf_chk and friends. 
+ fprintf-chk-1.c + printf-chk-1.c + vfprintf-chk-1.c + vprintf-chk-1.c + +) +list(APPEND TestsToSkip ${UnsupportedTests}) + +# Tests where clang currently has bugs or issues +file(GLOB FailingTests CONFIGURE_DEPENDS + + # Handling of bitfields is different between clang and GCC: + # http://lists.llvm.org/pipermail/llvm-dev/2017-October/118507.html + # https://gcc.gnu.org/ml/gcc/2017-10/msg00192.html + # http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1260.htm + bitfld-3.c + bitfld-5.c + pr32244-1.c + pr34971.c + + # This causes a stacktrace on x86 in X86TargetLowering::LowerCallTo + pr84169.c + + # clang complains the array is too large + 991014-1.c + + # __builtin_setjmp/__builtin_longjmp are interacting badly with optimisation + pr60003.c +) +list(APPEND TestsToSkip ${FailingTests}) + +## +## Tests that require extra CFLAGS in Clang +## + +# Tests that require -fwrapv +file(GLOB TestRequiresFWrapV CONFIGURE_DEPENDS + # Test relies on undefined signed overflow behavior (int foo - INT_MIN). 
+ 20040409-1.c + 20040409-2.c + 20040409-3.c +) + +# Tests that require -Wno-return-type +file(GLOB TestRequiresWNoReturnType CONFIGURE_DEPENDS + # Non-void function must return a value + 920302-1.c + 920501-3.c + 920728-1.c +) + +# Tests that require libm (-lm ldflag) +file(GLOB TestRequiresLibM CONFIGURE_DEPENDS + 980709-1.c + float-floor.c +) + +# Tests that require newnlib Nano IO (--undefined=_printf_float ldflag) +file(GLOB TestRequiresNanoIO CONFIGURE_DEPENDS + 920501-8.c + 930513-1.c +) + +## +## Architecture-specific Test Blacklists +## + + +# Tests that require mmap +include(CheckSymbolExists) +CHECK_SYMBOL_EXISTS("mmap" "sys/types.h;sys/mman.h" HAVE_MMAP) +if (NOT HAVE_MMAP) + file(GLOB MMapTests + loop-2f.c + loop-2g.c + ) + list(APPEND TestsToSkip ${MMapTests}) +endif() + +# x86-only Tests +if(NOT ARCH MATCHES "x86") + file(GLOB X86OnlyTests CONFIGURE_DEPENDS + 990413-2.c + ) + + list(APPEND TestsToSkip ${X86OnlyTests}) +endif() + +# RISC-V Test Blacklist +if(ARCH MATCHES "riscv") + file(GLOB RISCVTestsToSkip CONFIGURE_DEPENDS + # No backend support for __builtin_longjmp/__builtin_setjmp + built-in-setjmp.c + pr84521.c + ) + + # RISC-V 32-bit Test Blacklist + if (ARCH MATCHES "riscv32") + file(GLOB RISCV32TestsToSkip CONFIGURE_DEPENDS + # No support for __int128 on rv32 + pr84748.c + ) + + list(APPEND RISCVTestsToSkip ${RISCV32TestsToSkip}) + endif() + + list(APPEND TestsToSkip ${RISCVTestsToSkip}) +endif() + +## +## Test target setup +## + +file(GLOB TestFiles CONFIGURE_DEPENDS + *.c +) +foreach(TestToSkip ${TestsToSkip}) + list(REMOVE_ITEM TestFiles ${TestToSkip}) +endforeach() + +foreach(File ${TestFiles}) + set(MaybeCFlags) + set(MaybeLDFlags) + + # Add Test-specific CFLAGS/LDFLAGS here + + if(${File} IN_LIST TestRequiresLibM) + list(APPEND MaybeLDFlags "-lm") + endif() + + if(${File} IN_LIST TestRequiresWNoReturnType) + list(APPEND MaybeCFlags "-Wno-return-type") + endif() + + if(${File} IN_LIST TestRequiresFWrapV) + list(APPEND MaybeCFlags 
"-fwrapv") + endif() + + if(${File} IN_LIST TestRequiresNanoIO) + list(APPEND MaybeLDFlags "-Wl,-u,_printf_float") + endif() + + # Add Test Target + gcc_torture_execute_test(${File} + PREFIX "GCC-C-execute" + CFLAGS ${MaybeCFlags} + LDFLAGS ${MaybeLDFlags}) +endforeach() Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/COPYING URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/COPYING?rev=374155&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/COPYING (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/COPYING Wed Oct 9 03:50:44 2019 @@ -0,0 +1,340 @@ + GNU GENERAL PUBLIC LICENSE + Version 2, June 1991 + + Copyright (C) 1989, 1991 Free Software Foundation, Inc. + 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + Everyone is permitted to copy and distribute verbatim copies + of this license document, but changing it is not allowed. + + Preamble + + The licenses for most software are designed to take away your +freedom to share and change it. By contrast, the GNU General Public +License is intended to guarantee your freedom to share and change free +software--to make sure the software is free for all its users. This +General Public License applies to most of the Free Software +Foundation's software and to any other program whose authors commit to +using it. (Some other Free Software Foundation software is covered by +the GNU Library General Public License instead.) You can apply it to +your programs, too. + + When we speak of free software, we are referring to freedom, not +price. 
Our General Public Licenses are designed to make sure that you +have the freedom to distribute copies of free software (and charge for +this service if you wish), that you receive source code or can get it +if you want it, that you can change the software or use pieces of it +in new free programs; and that you know you can do these things. + + To protect your rights, we need to make restrictions that forbid +anyone to deny you these rights or to ask you to surrender the rights. +These restrictions translate to certain responsibilities for you if you +distribute copies of the software, or if you modify it. + + For example, if you distribute copies of such a program, whether +gratis or for a fee, you must give the recipients all the rights that +you have. You must make sure that they, too, receive or can get the +source code. And you must show them these terms so they know their +rights. + + We protect your rights with two steps: (1) copyright the software, and +(2) offer you this license which gives you legal permission to copy, +distribute and/or modify the software. + + Also, for each author's protection and ours, we want to make certain +that everyone understands that there is no warranty for this free +software. If the software is modified by someone else and passed on, we +want its recipients to know that what they have is not the original, so +that any problems introduced by others will not reflect on the original +authors' reputations. + + Finally, any free program is threatened constantly by software +patents. We wish to avoid the danger that redistributors of a free +program will individually obtain patent licenses, in effect making the +program proprietary. To prevent this, we have made it clear that any +patent must be licensed for everyone's free use or not licensed at all. + + The precise terms and conditions for copying, distribution and +modification follow. 
+ + GNU GENERAL PUBLIC LICENSE + TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION + + 0. This License applies to any program or other work which contains +a notice placed by the copyright holder saying it may be distributed +under the terms of this General Public License. The "Program", below, +refers to any such program or work, and a "work based on the Program" +means either the Program or any derivative work under copyright law: +that is to say, a work containing the Program or a portion of it, +either verbatim or with modifications and/or translated into another +language. (Hereinafter, translation is included without limitation in +the term "modification".) Each licensee is addressed as "you". + +Activities other than copying, distribution and modification are not +covered by this License; they are outside its scope. The act of +running the Program is not restricted, and the output from the Program +is covered only if its contents constitute a work based on the +Program (independent of having been made by running the Program). +Whether that is true depends on what the Program does. + + 1. You may copy and distribute verbatim copies of the Program's +source code as you receive it, in any medium, provided that you +conspicuously and appropriately publish on each copy an appropriate +copyright notice and disclaimer of warranty; keep intact all the +notices that refer to this License and to the absence of any warranty; +and give any other recipients of the Program a copy of this License +along with the Program. + +You may charge a fee for the physical act of transferring a copy, and +you may at your option offer warranty protection in exchange for a fee. + + 2. 
You may modify your copy or copies of the Program or any portion +of it, thus forming a work based on the Program, and copy and +distribute such modifications or work under the terms of Section 1 +above, provided that you also meet all of these conditions: + + a) You must cause the modified files to carry prominent notices + stating that you changed the files and the date of any change. + + b) You must cause any work that you distribute or publish, that in + whole or in part contains or is derived from the Program or any + part thereof, to be licensed as a whole at no charge to all third + parties under the terms of this License. + + c) If the modified program normally reads commands interactively + when run, you must cause it, when started running for such + interactive use in the most ordinary way, to print or display an + announcement including an appropriate copyright notice and a + notice that there is no warranty (or else, saying that you provide + a warranty) and that users may redistribute the program under + these conditions, and telling the user how to view a copy of this + License. (Exception: if the Program itself is interactive but + does not normally print such an announcement, your work based on + the Program is not required to print an announcement.) + +These requirements apply to the modified work as a whole. If +identifiable sections of that work are not derived from the Program, +and can be reasonably considered independent and separate works in +themselves, then this License, and its terms, do not apply to those +sections when you distribute them as separate works. But when you +distribute the same sections as part of a whole which is a work based +on the Program, the distribution of the whole must be on the terms of +this License, whose permissions for other licensees extend to the +entire whole, and thus to each and every part regardless of who wrote it. 
+ +Thus, it is not the intent of this section to claim rights or contest +your rights to work written entirely by you; rather, the intent is to +exercise the right to control the distribution of derivative or +collective works based on the Program. + +In addition, mere aggregation of another work not based on the Program +with the Program (or with a work based on the Program) on a volume of +a storage or distribution medium does not bring the other work under +the scope of this License. + + 3. You may copy and distribute the Program (or a work based on it, +under Section 2) in object code or executable form under the terms of +Sections 1 and 2 above provided that you also do one of the following: + + a) Accompany it with the complete corresponding machine-readable + source code, which must be distributed under the terms of Sections + 1 and 2 above on a medium customarily used for software interchange; or, + + b) Accompany it with a written offer, valid for at least three + years, to give any third party, for a charge no more than your + cost of physically performing source distribution, a complete + machine-readable copy of the corresponding source code, to be + distributed under the terms of Sections 1 and 2 above on a medium + customarily used for software interchange; or, + + c) Accompany it with the information you received as to the offer + to distribute corresponding source code. (This alternative is + allowed only for noncommercial distribution and only if you + received the program in object code or executable form with such + an offer, in accord with Subsection b above.) + +The source code for a work means the preferred form of the work for +making modifications to it. For an executable work, complete source +code means all the source code for all modules it contains, plus any +associated interface definition files, plus the scripts used to +control compilation and installation of the executable. 
However, as a +special exception, the source code distributed need not include +anything that is normally distributed (in either source or binary +form) with the major components (compiler, kernel, and so on) of the +operating system on which the executable runs, unless that component +itself accompanies the executable. + +If distribution of executable or object code is made by offering +access to copy from a designated place, then offering equivalent +access to copy the source code from the same place counts as +distribution of the source code, even though third parties are not +compelled to copy the source along with the object code. + + 4. You may not copy, modify, sublicense, or distribute the Program +except as expressly provided under this License. Any attempt +otherwise to copy, modify, sublicense or distribute the Program is +void, and will automatically terminate your rights under this License. +However, parties who have received copies, or rights, from you under +this License will not have their licenses terminated so long as such +parties remain in full compliance. + + 5. You are not required to accept this License, since you have not +signed it. However, nothing else grants you permission to modify or +distribute the Program or its derivative works. These actions are +prohibited by law if you do not accept this License. Therefore, by +modifying or distributing the Program (or any work based on the +Program), you indicate your acceptance of this License to do so, and +all its terms and conditions for copying, distributing or modifying +the Program or works based on it. + + 6. Each time you redistribute the Program (or any work based on the +Program), the recipient automatically receives a license from the +original licensor to copy, distribute or modify the Program subject to +these terms and conditions. You may not impose any further +restrictions on the recipients' exercise of the rights granted herein. 
+You are not responsible for enforcing compliance by third parties to +this License. + + 7. If, as a consequence of a court judgment or allegation of patent +infringement or for any other reason (not limited to patent issues), +conditions are imposed on you (whether by court order, agreement or +otherwise) that contradict the conditions of this License, they do not +excuse you from the conditions of this License. If you cannot +distribute so as to satisfy simultaneously your obligations under this +License and any other pertinent obligations, then as a consequence you +may not distribute the Program at all. For example, if a patent +license would not permit royalty-free redistribution of the Program by +all those who receive copies directly or indirectly through you, then +the only way you could satisfy both it and this License would be to +refrain entirely from distribution of the Program. + +If any portion of this section is held invalid or unenforceable under +any particular circumstance, the balance of the section is intended to +apply and the section as a whole is intended to apply in other +circumstances. + +It is not the purpose of this section to induce you to infringe any +patents or other property right claims or to contest validity of any +such claims; this section has the sole purpose of protecting the +integrity of the free software distribution system, which is +implemented by public license practices. Many people have made +generous contributions to the wide range of software distributed +through that system in reliance on consistent application of that +system; it is up to the author/donor to decide if he or she is willing +to distribute software through any other system and a licensee cannot +impose that choice. + +This section is intended to make thoroughly clear what is believed to +be a consequence of the rest of this License. + + 8. 
If the distribution and/or use of the Program is restricted in +certain countries either by patents or by copyrighted interfaces, the +original copyright holder who places the Program under this License +may add an explicit geographical distribution limitation excluding +those countries, so that distribution is permitted only in or among +countries not thus excluded. In such case, this License incorporates +the limitation as if written in the body of this License. + + 9. The Free Software Foundation may publish revised and/or new versions +of the General Public License from time to time. Such new versions will +be similar in spirit to the present version, but may differ in detail to +address new problems or concerns. + +Each version is given a distinguishing version number. If the Program +specifies a version number of this License which applies to it and "any +later version", you have the option of following the terms and conditions +either of that version or of any later version published by the Free +Software Foundation. If the Program does not specify a version number of +this License, you may choose any version ever published by the Free Software +Foundation. + + 10. If you wish to incorporate parts of the Program into other free +programs whose distribution conditions are different, write to the author +to ask for permission. For software which is copyrighted by the Free +Software Foundation, write to the Free Software Foundation; we sometimes +make exceptions for this. Our decision will be guided by the two goals +of preserving the free status of all derivatives of our free software and +of promoting the sharing and reuse of software generally. + + NO WARRANTY + + 11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY +FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. 
EXCEPT WHEN +OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES +PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED +OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF +MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS +TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE +PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, +REPAIR OR CORRECTION. + + 12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING +WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR +REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, +INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING +OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED +TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY +YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER +PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE +POSSIBILITY OF SUCH DAMAGES. + + END OF TERMS AND CONDITIONS + + How to Apply These Terms to Your New Programs + + If you develop a new program, and you want it to be of the greatest +possible use to the public, the best way to achieve this is to make it +free software which everyone can redistribute and change under these terms. + + To do so, attach the following notices to the program. It is safest +to attach them to the start of each source file to most effectively +convey the exclusion of warranty; and each file should have at least +the "copyright" line and a pointer to where the full notice is found. + + + Copyright (C) + + This program is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 2 of the License, or + (at your option) any later version. 
+ + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program; if not, write to the Free Software + Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + + +Also add information on how to contact you by electronic and paper mail. + +If the program is interactive, make it output a short notice like this +when it starts in an interactive mode: + + Gnomovision version 69, Copyright (C) year name of author + Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'. + This is free software, and you are welcome to redistribute it + under certain conditions; type `show c' for details. + +The hypothetical commands `show w' and `show c' should show the appropriate +parts of the General Public License. Of course, the commands you use may +be called something other than `show w' and `show c'; they could even be +mouse-clicks or menu items--whatever suits your program. + +You should also get your employer (if you work as a programmer) or your +school, if any, to sign a "copyright disclaimer" for the program, if +necessary. Here is a sample; alter the names: + + Yoyodyne, Inc., hereby disclaims all copyright interest in the program + `Gnomovision' (which makes passes at compilers) written by James Hacker. + + , 1 April 1989 + Ty Coon, President of Vice + +This General Public License does not permit incorporating your program into +proprietary programs. If your program is a subroutine library, you may +consider it more useful to permit linking proprietary applications with the +library. If this is what you want to do, use the GNU Library General +Public License instead of this License. 
Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/COPYING3 URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/COPYING3?rev=374155&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/COPYING3 (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/COPYING3 Wed Oct 9 03:50:44 2019 @@ -0,0 +1,674 @@ + GNU GENERAL PUBLIC LICENSE + Version 3, 29 June 2007 + + Copyright (C) 2007 Free Software Foundation, Inc. + Everyone is permitted to copy and distribute verbatim copies + of this license document, but changing it is not allowed. + + Preamble + + The GNU General Public License is a free, copyleft license for +software and other kinds of works. + + The licenses for most software and other practical works are designed +to take away your freedom to share and change the works. By contrast, +the GNU General Public License is intended to guarantee your freedom to +share and change all versions of a program--to make sure it remains free +software for all its users. We, the Free Software Foundation, use the +GNU General Public License for most of our software; it applies also to +any other work released this way by its authors. You can apply it to +your programs, too. + + When we speak of free software, we are referring to freedom, not +price. Our General Public Licenses are designed to make sure that you +have the freedom to distribute copies of free software (and charge for +them if you wish), that you receive source code or can get it if you +want it, that you can change the software or use pieces of it in new +free programs, and that you know you can do these things. + + To protect your rights, we need to prevent others from denying you +these rights or asking you to surrender the rights. 
Therefore, you have +certain responsibilities if you distribute copies of the software, or if +you modify it: responsibilities to respect the freedom of others. + + For example, if you distribute copies of such a program, whether +gratis or for a fee, you must pass on to the recipients the same +freedoms that you received. You must make sure that they, too, receive +or can get the source code. And you must show them these terms so they +know their rights. + + Developers that use the GNU GPL protect your rights with two steps: +(1) assert copyright on the software, and (2) offer you this License +giving you legal permission to copy, distribute and/or modify it. + + For the developers' and authors' protection, the GPL clearly explains +that there is no warranty for this free software. For both users' and +authors' sake, the GPL requires that modified versions be marked as +changed, so that their problems will not be attributed erroneously to +authors of previous versions. + + Some devices are designed to deny users access to install or run +modified versions of the software inside them, although the manufacturer +can do so. This is fundamentally incompatible with the aim of +protecting users' freedom to change the software. The systematic +pattern of such abuse occurs in the area of products for individuals to +use, which is precisely where it is most unacceptable. Therefore, we +have designed this version of the GPL to prohibit the practice for those +products. If such problems arise substantially in other domains, we +stand ready to extend this provision to those domains in future versions +of the GPL, as needed to protect the freedom of users. + + Finally, every program is threatened constantly by software patents. +States should not allow patents to restrict development and use of +software on general-purpose computers, but in those that do, we wish to +avoid the special danger that patents applied to a free program could +make it effectively proprietary. 
To prevent this, the GPL assures that +patents cannot be used to render the program non-free. + + The precise terms and conditions for copying, distribution and +modification follow. + + TERMS AND CONDITIONS + + 0. Definitions. + + "This License" refers to version 3 of the GNU General Public License. + + "Copyright" also means copyright-like laws that apply to other kinds of +works, such as semiconductor masks. + + "The Program" refers to any copyrightable work licensed under this +License. Each licensee is addressed as "you". "Licensees" and +"recipients" may be individuals or organizations. + + To "modify" a work means to copy from or adapt all or part of the work +in a fashion requiring copyright permission, other than the making of an +exact copy. The resulting work is called a "modified version" of the +earlier work or a work "based on" the earlier work. + + A "covered work" means either the unmodified Program or a work based +on the Program. + + To "propagate" a work means to do anything with it that, without +permission, would make you directly or secondarily liable for +infringement under applicable copyright law, except executing it on a +computer or modifying a private copy. Propagation includes copying, +distribution (with or without modification), making available to the +public, and in some countries other activities as well. + + To "convey" a work means any kind of propagation that enables other +parties to make or receive copies. Mere interaction with a user through +a computer network, with no transfer of a copy, is not conveying. + + An interactive user interface displays "Appropriate Legal Notices" +to the extent that it includes a convenient and prominently visible +feature that (1) displays an appropriate copyright notice, and (2) +tells the user that there is no warranty for the work (except to the +extent that warranties are provided), that licensees may convey the +work under this License, and how to view a copy of this License. 
If +the interface presents a list of user commands or options, such as a +menu, a prominent item in the list meets this criterion. + + 1. Source Code. + + The "source code" for a work means the preferred form of the work +for making modifications to it. "Object code" means any non-source +form of a work. + + A "Standard Interface" means an interface that either is an official +standard defined by a recognized standards body, or, in the case of +interfaces specified for a particular programming language, one that +is widely used among developers working in that language. + + The "System Libraries" of an executable work include anything, other +than the work as a whole, that (a) is included in the normal form of +packaging a Major Component, but which is not part of that Major +Component, and (b) serves only to enable use of the work with that +Major Component, or to implement a Standard Interface for which an +implementation is available to the public in source code form. A +"Major Component", in this context, means a major essential component +(kernel, window system, and so on) of the specific operating system +(if any) on which the executable work runs, or a compiler used to +produce the work, or an object code interpreter used to run it. + + The "Corresponding Source" for a work in object code form means all +the source code needed to generate, install, and (for an executable +work) run the object code and to modify the work, including scripts to +control those activities. However, it does not include the work's +System Libraries, or general-purpose tools or generally available free +programs which are used unmodified in performing those activities but +which are not part of the work. 
For example, Corresponding Source +includes interface definition files associated with source files for +the work, and the source code for shared libraries and dynamically +linked subprograms that the work is specifically designed to require, +such as by intimate data communication or control flow between those +subprograms and other parts of the work. + + The Corresponding Source need not include anything that users +can regenerate automatically from other parts of the Corresponding +Source. + + The Corresponding Source for a work in source code form is that +same work. + + 2. Basic Permissions. + + All rights granted under this License are granted for the term of +copyright on the Program, and are irrevocable provided the stated +conditions are met. This License explicitly affirms your unlimited +permission to run the unmodified Program. The output from running a +covered work is covered by this License only if the output, given its +content, constitutes a covered work. This License acknowledges your +rights of fair use or other equivalent, as provided by copyright law. + + You may make, run and propagate covered works that you do not +convey, without conditions so long as your license otherwise remains +in force. You may convey covered works to others for the sole purpose +of having them make modifications exclusively for you, or provide you +with facilities for running those works, provided that you comply with +the terms of this License in conveying all material for which you do +not control copyright. Those thus making or running the covered works +for you must do so exclusively on your behalf, under your direction +and control, on terms that prohibit them from making any copies of +your copyrighted material outside their relationship with you. + + Conveying under any other circumstances is permitted solely under +the conditions stated below. Sublicensing is not allowed; section 10 +makes it unnecessary. + + 3. 
Protecting Users' Legal Rights From Anti-Circumvention Law. + + No covered work shall be deemed part of an effective technological +measure under any applicable law fulfilling obligations under article +11 of the WIPO copyright treaty adopted on 20 December 1996, or +similar laws prohibiting or restricting circumvention of such +measures. + + When you convey a covered work, you waive any legal power to forbid +circumvention of technological measures to the extent such circumvention +is effected by exercising rights under this License with respect to +the covered work, and you disclaim any intention to limit operation or +modification of the work as a means of enforcing, against the work's +users, your or third parties' legal rights to forbid circumvention of +technological measures. + + 4. Conveying Verbatim Copies. + + You may convey verbatim copies of the Program's source code as you +receive it, in any medium, provided that you conspicuously and +appropriately publish on each copy an appropriate copyright notice; +keep intact all notices stating that this License and any +non-permissive terms added in accord with section 7 apply to the code; +keep intact all notices of the absence of any warranty; and give all +recipients a copy of this License along with the Program. + + You may charge any price or no price for each copy that you convey, +and you may offer support or warranty protection for a fee. + + 5. Conveying Modified Source Versions. + + You may convey a work based on the Program, or the modifications to +produce it from the Program, in the form of source code under the +terms of section 4, provided that you also meet all of these conditions: + + a) The work must carry prominent notices stating that you modified + it, and giving a relevant date. + + b) The work must carry prominent notices stating that it is + released under this License and any conditions added under section + 7. 
This requirement modifies the requirement in section 4 to + "keep intact all notices". + + c) You must license the entire work, as a whole, under this + License to anyone who comes into possession of a copy. This + License will therefore apply, along with any applicable section 7 + additional terms, to the whole of the work, and all its parts, + regardless of how they are packaged. This License gives no + permission to license the work in any other way, but it does not + invalidate such permission if you have separately received it. + + d) If the work has interactive user interfaces, each must display + Appropriate Legal Notices; however, if the Program has interactive + interfaces that do not display Appropriate Legal Notices, your + work need not make them do so. + + A compilation of a covered work with other separate and independent +works, which are not by their nature extensions of the covered work, +and which are not combined with it such as to form a larger program, +in or on a volume of a storage or distribution medium, is called an +"aggregate" if the compilation and its resulting copyright are not +used to limit the access or legal rights of the compilation's users +beyond what the individual works permit. Inclusion of a covered work +in an aggregate does not cause this License to apply to the other +parts of the aggregate. + + 6. Conveying Non-Source Forms. + + You may convey a covered work in object code form under the terms +of sections 4 and 5, provided that you also convey the +machine-readable Corresponding Source under the terms of this License, +in one of these ways: + + a) Convey the object code in, or embodied in, a physical product + (including a physical distribution medium), accompanied by the + Corresponding Source fixed on a durable physical medium + customarily used for software interchange. 
+ + b) Convey the object code in, or embodied in, a physical product + (including a physical distribution medium), accompanied by a + written offer, valid for at least three years and valid for as + long as you offer spare parts or customer support for that product + model, to give anyone who possesses the object code either (1) a + copy of the Corresponding Source for all the software in the + product that is covered by this License, on a durable physical + medium customarily used for software interchange, for a price no + more than your reasonable cost of physically performing this + conveying of source, or (2) access to copy the + Corresponding Source from a network server at no charge. + + c) Convey individual copies of the object code with a copy of the + written offer to provide the Corresponding Source. This + alternative is allowed only occasionally and noncommercially, and + only if you received the object code with such an offer, in accord + with subsection 6b. + + d) Convey the object code by offering access from a designated + place (gratis or for a charge), and offer equivalent access to the + Corresponding Source in the same way through the same place at no + further charge. You need not require recipients to copy the + Corresponding Source along with the object code. If the place to + copy the object code is a network server, the Corresponding Source + may be on a different server (operated by you or a third party) + that supports equivalent copying facilities, provided you maintain + clear directions next to the object code saying where to find the + Corresponding Source. Regardless of what server hosts the + Corresponding Source, you remain obligated to ensure that it is + available for as long as needed to satisfy these requirements. 
+ + e) Convey the object code using peer-to-peer transmission, provided + you inform other peers where the object code and Corresponding + Source of the work are being offered to the general public at no + charge under subsection 6d. + + A separable portion of the object code, whose source code is excluded +from the Corresponding Source as a System Library, need not be +included in conveying the object code work. + + A "User Product" is either (1) a "consumer product", which means any +tangible personal property which is normally used for personal, family, +or household purposes, or (2) anything designed or sold for incorporation +into a dwelling. In determining whether a product is a consumer product, +doubtful cases shall be resolved in favor of coverage. For a particular +product received by a particular user, "normally used" refers to a +typical or common use of that class of product, regardless of the status +of the particular user or of the way in which the particular user +actually uses, or expects or is expected to use, the product. A product +is a consumer product regardless of whether the product has substantial +commercial, industrial or non-consumer uses, unless such uses represent +the only significant mode of use of the product. + + "Installation Information" for a User Product means any methods, +procedures, authorization keys, or other information required to install +and execute modified versions of a covered work in that User Product from +a modified version of its Corresponding Source. The information must +suffice to ensure that the continued functioning of the modified object +code is in no case prevented or interfered with solely because +modification has been made. 
+ + If you convey an object code work under this section in, or with, or +specifically for use in, a User Product, and the conveying occurs as +part of a transaction in which the right of possession and use of the +User Product is transferred to the recipient in perpetuity or for a +fixed term (regardless of how the transaction is characterized), the +Corresponding Source conveyed under this section must be accompanied +by the Installation Information. But this requirement does not apply +if neither you nor any third party retains the ability to install +modified object code on the User Product (for example, the work has +been installed in ROM). + + The requirement to provide Installation Information does not include a +requirement to continue to provide support service, warranty, or updates +for a work that has been modified or installed by the recipient, or for +the User Product in which it has been modified or installed. Access to a +network may be denied when the modification itself materially and +adversely affects the operation of the network or violates the rules and +protocols for communication across the network. + + Corresponding Source conveyed, and Installation Information provided, +in accord with this section must be in a format that is publicly +documented (and with an implementation available to the public in +source code form), and must require no special password or key for +unpacking, reading or copying. + + 7. Additional Terms. + + "Additional permissions" are terms that supplement the terms of this +License by making exceptions from one or more of its conditions. +Additional permissions that are applicable to the entire Program shall +be treated as though they were included in this License, to the extent +that they are valid under applicable law. 
If additional permissions +apply only to part of the Program, that part may be used separately +under those permissions, but the entire Program remains governed by +this License without regard to the additional permissions. + + When you convey a copy of a covered work, you may at your option +remove any additional permissions from that copy, or from any part of +it. (Additional permissions may be written to require their own +removal in certain cases when you modify the work.) You may place +additional permissions on material, added by you to a covered work, +for which you have or can give appropriate copyright permission. + + Notwithstanding any other provision of this License, for material you +add to a covered work, you may (if authorized by the copyright holders of +that material) supplement the terms of this License with terms: + + a) Disclaiming warranty or limiting liability differently from the + terms of sections 15 and 16 of this License; or + + b) Requiring preservation of specified reasonable legal notices or + author attributions in that material or in the Appropriate Legal + Notices displayed by works containing it; or + + c) Prohibiting misrepresentation of the origin of that material, or + requiring that modified versions of such material be marked in + reasonable ways as different from the original version; or + + d) Limiting the use for publicity purposes of names of licensors or + authors of the material; or + + e) Declining to grant rights under trademark law for use of some + trade names, trademarks, or service marks; or + + f) Requiring indemnification of licensors and authors of that + material by anyone who conveys the material (or modified versions of + it) with contractual assumptions of liability to the recipient, for + any liability that these contractual assumptions directly impose on + those licensors and authors. + + All other non-permissive additional terms are considered "further +restrictions" within the meaning of section 10. 
If the Program as you +received it, or any part of it, contains a notice stating that it is +governed by this License along with a term that is a further +restriction, you may remove that term. If a license document contains +a further restriction but permits relicensing or conveying under this +License, you may add to a covered work material governed by the terms +of that license document, provided that the further restriction does +not survive such relicensing or conveying. + + If you add terms to a covered work in accord with this section, you +must place, in the relevant source files, a statement of the +additional terms that apply to those files, or a notice indicating +where to find the applicable terms. + + Additional terms, permissive or non-permissive, may be stated in the +form of a separately written license, or stated as exceptions; +the above requirements apply either way. + + 8. Termination. + + You may not propagate or modify a covered work except as expressly +provided under this License. Any attempt otherwise to propagate or +modify it is void, and will automatically terminate your rights under +this License (including any patent licenses granted under the third +paragraph of section 11). + + However, if you cease all violation of this License, then your +license from a particular copyright holder is reinstated (a) +provisionally, unless and until the copyright holder explicitly and +finally terminates your license, and (b) permanently, if the copyright +holder fails to notify you of the violation by some reasonable means +prior to 60 days after the cessation. + + Moreover, your license from a particular copyright holder is +reinstated permanently if the copyright holder notifies you of the +violation by some reasonable means, this is the first time you have +received notice of violation of this License (for any work) from that +copyright holder, and you cure the violation prior to 30 days after +your receipt of the notice. 
+ + Termination of your rights under this section does not terminate the +licenses of parties who have received copies or rights from you under +this License. If your rights have been terminated and not permanently +reinstated, you do not qualify to receive new licenses for the same +material under section 10. + + 9. Acceptance Not Required for Having Copies. + + You are not required to accept this License in order to receive or +run a copy of the Program. Ancillary propagation of a covered work +occurring solely as a consequence of using peer-to-peer transmission +to receive a copy likewise does not require acceptance. However, +nothing other than this License grants you permission to propagate or +modify any covered work. These actions infringe copyright if you do +not accept this License. Therefore, by modifying or propagating a +covered work, you indicate your acceptance of this License to do so. + + 10. Automatic Licensing of Downstream Recipients. + + Each time you convey a covered work, the recipient automatically +receives a license from the original licensors, to run, modify and +propagate that work, subject to this License. You are not responsible +for enforcing compliance by third parties with this License. + + An "entity transaction" is a transaction transferring control of an +organization, or substantially all assets of one, or subdividing an +organization, or merging organizations. If propagation of a covered +work results from an entity transaction, each party to that +transaction who receives a copy of the work also receives whatever +licenses to the work the party's predecessor in interest had or could +give under the previous paragraph, plus a right to possession of the +Corresponding Source of the work from the predecessor in interest, if +the predecessor has it or can get it with reasonable efforts. + + You may not impose any further restrictions on the exercise of the +rights granted or affirmed under this License. 
For example, you may +not impose a license fee, royalty, or other charge for exercise of +rights granted under this License, and you may not initiate litigation +(including a cross-claim or counterclaim in a lawsuit) alleging that +any patent claim is infringed by making, using, selling, offering for +sale, or importing the Program or any portion of it. + + 11. Patents. + + A "contributor" is a copyright holder who authorizes use under this +License of the Program or a work on which the Program is based. The +work thus licensed is called the contributor's "contributor version". + + A contributor's "essential patent claims" are all patent claims +owned or controlled by the contributor, whether already acquired or +hereafter acquired, that would be infringed by some manner, permitted +by this License, of making, using, or selling its contributor version, +but do not include claims that would be infringed only as a +consequence of further modification of the contributor version. For +purposes of this definition, "control" includes the right to grant +patent sublicenses in a manner consistent with the requirements of +this License. + + Each contributor grants you a non-exclusive, worldwide, royalty-free +patent license under the contributor's essential patent claims, to +make, use, sell, offer for sale, import and otherwise run, modify and +propagate the contents of its contributor version. + + In the following three paragraphs, a "patent license" is any express +agreement or commitment, however denominated, not to enforce a patent +(such as an express permission to practice a patent or covenant not to +sue for patent infringement). To "grant" such a patent license to a +party means to make such an agreement or commitment not to enforce a +patent against the party. 
+ + If you convey a covered work, knowingly relying on a patent license, +and the Corresponding Source of the work is not available for anyone +to copy, free of charge and under the terms of this License, through a +publicly available network server or other readily accessible means, +then you must either (1) cause the Corresponding Source to be so +available, or (2) arrange to deprive yourself of the benefit of the +patent license for this particular work, or (3) arrange, in a manner +consistent with the requirements of this License, to extend the patent +license to downstream recipients. "Knowingly relying" means you have +actual knowledge that, but for the patent license, your conveying the +covered work in a country, or your recipient's use of the covered work +in a country, would infringe one or more identifiable patents in that +country that you have reason to believe are valid. + + If, pursuant to or in connection with a single transaction or +arrangement, you convey, or propagate by procuring conveyance of, a +covered work, and grant a patent license to some of the parties +receiving the covered work authorizing them to use, propagate, modify +or convey a specific copy of the covered work, then the patent license +you grant is automatically extended to all recipients of the covered +work and works based on it. + + A patent license is "discriminatory" if it does not include within +the scope of its coverage, prohibits the exercise of, or is +conditioned on the non-exercise of one or more of the rights that are +specifically granted under this License. 
You may not convey a covered +work if you are a party to an arrangement with a third party that is +in the business of distributing software, under which you make payment +to the third party based on the extent of your activity of conveying +the work, and under which the third party grants, to any of the +parties who would receive the covered work from you, a discriminatory +patent license (a) in connection with copies of the covered work +conveyed by you (or copies made from those copies), or (b) primarily +for and in connection with specific products or compilations that +contain the covered work, unless you entered into that arrangement, +or that patent license was granted, prior to 28 March 2007. + + Nothing in this License shall be construed as excluding or limiting +any implied license or other defenses to infringement that may +otherwise be available to you under applicable patent law. + + 12. No Surrender of Others' Freedom. + + If conditions are imposed on you (whether by court order, agreement or +otherwise) that contradict the conditions of this License, they do not +excuse you from the conditions of this License. If you cannot convey a +covered work so as to satisfy simultaneously your obligations under this +License and any other pertinent obligations, then as a consequence you may +not convey it at all. For example, if you agree to terms that obligate you +to collect a royalty for further conveying from those to whom you convey +the Program, the only way you could satisfy both those terms and this +License would be to refrain entirely from conveying the Program. + + 13. Use with the GNU Affero General Public License. + + Notwithstanding any other provision of this License, you have +permission to link or combine any covered work with a work licensed +under version 3 of the GNU Affero General Public License into a single +combined work, and to convey the resulting work. 
The terms of this +License will continue to apply to the part which is the covered work, +but the special requirements of the GNU Affero General Public License, +section 13, concerning interaction through a network will apply to the +combination as such. + + 14. Revised Versions of this License. + + The Free Software Foundation may publish revised and/or new versions of +the GNU General Public License from time to time. Such new versions will +be similar in spirit to the present version, but may differ in detail to +address new problems or concerns. + + Each version is given a distinguishing version number. If the +Program specifies that a certain numbered version of the GNU General +Public License "or any later version" applies to it, you have the +option of following the terms and conditions either of that numbered +version or of any later version published by the Free Software +Foundation. If the Program does not specify a version number of the +GNU General Public License, you may choose any version ever published +by the Free Software Foundation. + + If the Program specifies that a proxy can decide which future +versions of the GNU General Public License can be used, that proxy's +public statement of acceptance of a version permanently authorizes you +to choose that version for the Program. + + Later license versions may give you additional or different +permissions. However, no additional obligations are imposed on any +author or copyright holder as a result of your choosing to follow a +later version. + + 15. Disclaimer of Warranty. + + THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY +APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT +HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY +OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, +THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR +PURPOSE. 
THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM +IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF +ALL NECESSARY SERVICING, REPAIR OR CORRECTION. + + 16. Limitation of Liability. + + IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING +WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS +THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY +GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE +USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF +DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD +PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), +EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF +SUCH DAMAGES. + + 17. Interpretation of Sections 15 and 16. + + If the disclaimer of warranty and limitation of liability provided +above cannot be given local legal effect according to their terms, +reviewing courts shall apply local law that most closely approximates +an absolute waiver of all civil liability in connection with the +Program, unless a warranty or assumption of liability accompanies a +copy of the Program in return for a fee. + + END OF TERMS AND CONDITIONS + + How to Apply These Terms to Your New Programs + + If you develop a new program, and you want it to be of the greatest +possible use to the public, the best way to achieve this is to make it +free software which everyone can redistribute and change under these terms. + + To do so, attach the following notices to the program. It is safest +to attach them to the start of each source file to most effectively +state the exclusion of warranty; and each file should have at least +the "copyright" line and a pointer to where the full notice is found. 
+ + + Copyright (C) + + This program is free software: you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation, either version 3 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program. If not, see . + +Also add information on how to contact you by electronic and paper mail. + + If the program does terminal interaction, make it output a short +notice like this when it starts in an interactive mode: + + Copyright (C) + This program comes with ABSOLUTELY NO WARRANTY; for details type `show w'. + This is free software, and you are welcome to redistribute it + under certain conditions; type `show c' for details. + +The hypothetical commands `show w' and `show c' should show the appropriate +parts of the General Public License. Of course, your program's commands +might be different; for a GUI interface, you would use an "about box". + + You should also get your employer (if you work as a programmer) or school, +if any, to sign a "copyright disclaimer" for the program, if necessary. +For more information on this, and how to apply and follow the GNU GPL, see +. + + The GNU General Public License does not permit incorporating your program +into proprietary programs. If your program is a subroutine library, you +may consider it more useful to permit linking proprietary applications with +the library. If this is what you want to do, use the GNU Lesser General +Public License instead of this License. But first, please read +. 
Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/LICENSE.TXT URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/LICENSE.TXT?rev=374155&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/LICENSE.TXT (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/LICENSE.TXT Wed Oct 9 03:50:44 2019 @@ -0,0 +1,2 @@ +The testcases in this directory are covered by the GPL. See the files whose +names start with COPYING for copying permission. Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/CMakeLists.txt URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/CMakeLists.txt?rev=374155&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/CMakeLists.txt (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/CMakeLists.txt Wed Oct 9 03:50:44 2019 @@ -0,0 +1,73 @@ +# GCC C Torture Suite is conventionally run without warnings +list(APPEND CFLAGS "-w") + +set(TestsToSkip) + +## +## Main Test Blacklist for Clang +## + +# Tests with features unsupported by Clang (usually GCC extensions) +# (Big list of naughty tests) +file(GLOB UnsupportedTests + CONFIGURE_DEPENDS + + # The following all expect very specific optimiser behavior from the compiler + + # Clang at O0 does not work out the code referencing the undefined symbol can + # never be executed + fp-cmp-7.c +) +list(APPEND TestsToSkip ${UnsupportedTests}) + +## +## Tests that require extra CFLAGS in Clang +## + +# Tests that require libm (-lm flag) +file(GLOB TestRequiresLibM CONFIGURE_DEPENDS + 20041213-1.c + mzero4.c +) + +# Tests that require -fno-trapping-math +file(GLOB TestRequiresFNoTrappingMath 
CONFIGURE_DEPENDS + # Needs additional flags from compare-fp-3.x + compare-fp-3.c +) + +## +## Architecture-specific Test Blacklists +## + +## +## Test target setup +## + +file(GLOB TestFiles CONFIGURE_DEPENDS + *.c +) +foreach(TestToSkip ${TestsToSkip}) + list(REMOVE_ITEM TestFiles ${TestToSkip}) +endforeach() + +foreach(File ${TestFiles}) + set(MaybeCFlags) + set(MaybeLDFlags) + + # Add Test-specific CFLAGS/LDFLAGS here + + if (${File} IN_LIST TestRequiresLibM) + list(APPEND MaybeLDFlags "-lm") + endif() + + if (${File} IN_LIST TestRequiresFNoTrappingMath) + list(APPEND MaybeCFlags "-fno-trapping-math") + endif() + + # Add Test Target + gcc_torture_execute_test(${File} + PREFIX "GCC-C-execute-ieee" + CFLAGS ${MaybeCFlags} + LDFLAGS ${MaybeLDFlags}) +endforeach() Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/lit.local.cfg URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/lit.local.cfg?rev=374155&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/lit.local.cfg (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/lit.local.cfg Wed Oct 9 03:50:44 2019 @@ -0,0 +1 @@ +config.traditional_output = False From llvm-commits at lists.llvm.org Wed Oct 9 03:51:08 2019 From: llvm-commits at lists.llvm.org (Nicolai Hähnle via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 10:51:08 +0000 (UTC) Subject: [PATCH] D68600: AMDGPU/GlobalISel: Fix crash on wide constant load with VGPR pointer In-Reply-To: References: Message-ID: <0c7070c7b05b097101d8cef3b726065f@localhost.localdomain> nhaehnle accepted this revision. nhaehnle added a comment. This revision is now accepted and ready to land. I do have one question. Apart from that it LGTM. ================ Comment at: lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp:326-327 +// FIXME: Returns uniform if there's no source value information.
This is +// probably wrong. static bool isInstrUniformNonExtLoadAlign4(const MachineInstr &MI) { ---------------- You mean because `isUniformMMO` returns true if the MMO doesn't have a pointer? There's a comment in that function which justifies that (though I'm not sure whether that comment is correct). CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68600/new/ https://reviews.llvm.org/D68600 From llvm-commits at lists.llvm.org Wed Oct 9 03:51:08 2019 From: llvm-commits at lists.llvm.org (Martin Storsjö via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 10:51:08 +0000 (UTC) Subject: [PATCH] D68688: [LLD] [MinGW] Add a testcase for -l:name style library options. NFC. Message-ID: mstorsjo created this revision. mstorsjo added a reviewer: ruiu. Herald added a project: LLVM. Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D68688 Files: lld/test/MinGW/lib.test Index: lld/test/MinGW/lib.test =================================================================== --- lld/test/MinGW/lib.test +++ lld/test/MinGW/lib.test @@ -7,6 +7,13 @@ RUN: ld.lld -### -m i386pep -lfoo -L%t/lib | FileCheck -check-prefix=LIB2 %s LIB2: libfoo.dll.a +RUN: not ld.lld -### -m i386pep -l:barefilename -L%t/lib 2>&1 | FileCheck -check-prefix=LIB-LITERAL-FAIL %s +LIB-LITERAL-FAIL: unable to find library -l:barefilename + +RUN: echo > %t/lib/barefilename +RUN: ld.lld -### -m i386pep -l:barefilename -L%t/lib 2>&1 | FileCheck -check-prefix=LIB-LITERAL %s +LIB-LITERAL: barefilename + RUN: not ld.lld -### -m i386pep -Bstatic -lfoo -L%t/lib 2>&1 | FileCheck -check-prefix=LIB3 %s LIB3: unable to find library -lfoo -------------- next part -------------- A non-text attachment was scrubbed...
Name: D68688.224010.patch Type: text/x-patch Size: 737 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 03:51:08 2019 From: llvm-commits at lists.llvm.org (Sam Elliott via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 10:51:08 +0000 (UTC) Subject: [PATCH] D66887: [test-suite] Add GCC C Torture Suite In-Reply-To: References: Message-ID: lenary updated this revision to Diff 224009. lenary added a comment. - Rebase changes Repository: rT test-suite CHANGES SINCE LAST ACTION https://reviews.llvm.org/D66887/new/ https://reviews.llvm.org/D66887 Files: LICENSE.TXT SingleSource/Regression/C/CMakeLists.txt SingleSource/Regression/C/gcc-c-torture/CMakeLists.txt SingleSource/Regression/C/gcc-c-torture/README SingleSource/Regression/C/gcc-c-torture/execute/CMakeLists.txt SingleSource/Regression/C/gcc-c-torture/execute/COPYING SingleSource/Regression/C/gcc-c-torture/execute/COPYING3 SingleSource/Regression/C/gcc-c-torture/execute/LICENSE.TXT SingleSource/Regression/C/gcc-c-torture/execute/ieee/CMakeLists.txt SingleSource/Regression/C/gcc-c-torture/lit.local.cfg -------------- next part -------------- A non-text attachment was scrubbed... Name: D66887.224009.patch Type: text/x-patch Size: 68426 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 03:51:09 2019 From: llvm-commits at lists.llvm.org (Martin Storsjö via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 10:51:09 +0000 (UTC) Subject: [PATCH] D68689: [LLD] [MinGW] Look for other library patterns with -l Message-ID: mstorsjo created this revision. mstorsjo added reviewers: ruiu, rnk. Herald added a project: LLVM. GNU ld looks for a number of other patterns than just lib<name>.dll.a and lib<name>.a. GNU ld does support linking directly against a DLL without using an import library.
If that's the only match for a -l argument, point out that the user needs to use an import library, instead of leaving the user with a puzzling message about the -l argument not being found at all. Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D68689 Files: lld/MinGW/Driver.cpp lld/test/MinGW/lib.test Index: lld/test/MinGW/lib.test =================================================================== --- lld/test/MinGW/lib.test +++ lld/test/MinGW/lib.test @@ -26,3 +26,16 @@ RUN: ld.lld -### -m i386pep -Bstatic -lfoo -Bdynamic -lbar -L%t/lib | FileCheck -check-prefix=LIB5 %s LIB5: libfoo.a LIB5-SAME: libbar.dll.a + +RUN: echo > %t/lib/noprefix.dll.a +RUN: echo > %t/lib/msvcstyle.lib +RUN: ld.lld -### -m i386pep -L%t/lib -lnoprefix -lmsvcstyle | FileCheck -check-prefix=OTHERSTYLES %s +OTHERSTYLES: noprefix.dll.a +OTHERSTYLES-SAME: msvcstyle.lib + +RUN: echo > %t/lib/libnoimplib.dll +RUN: echo > %t/lib/noprefix_noimplib.dll +RUN: not ld.lld -### -m i386pep -L%t/lib -lnoimplib 2>&1 | FileCheck -check-prefix=UNSUPPORTED-DLL1 %s +RUN: not ld.lld -### -m i386pep -L%t/lib -lnoprefix_noimplib 2>&1 | FileCheck -check-prefix=UNSUPPORTED-DLL2 %s +UNSUPPORTED-DLL1: lld doesn't support linking directly against {{.*}}libnoimplib.dll, use an import library +UNSUPPORTED-DLL2: lld doesn't support linking directly against {{.*}}noprefix_noimplib.dll, use an import library Index: lld/MinGW/Driver.cpp =================================================================== --- lld/MinGW/Driver.cpp +++ lld/MinGW/Driver.cpp @@ -129,11 +129,24 @@ } for (StringRef dir : searchPaths) { - if (!bStatic) + if (!bStatic) { if (Optional s = findFile(dir, "lib" + name + ".dll.a")) return *s; + if (Optional s = findFile(dir, name + ".dll.a")) + return *s; + } if (Optional s = findFile(dir, "lib" + name + ".a")) return *s; + if (!bStatic) { + if (Optional s = findFile(dir, name + ".lib")) + return *s; + if (Optional s = findFile(dir, "lib" + name + ".dll")) + fatal("lld doesn't support 
linking directly against " + *s + + ", use an import library"); + if (Optional s = findFile(dir, name + ".dll")) + fatal("lld doesn't support linking directly against " + *s + + ", use an import library"); + } } fatal("unable to find library -l" + name); } -------------- next part -------------- A non-text attachment was scrubbed... Name: D68689.224011.patch Type: text/x-patch Size: 2152 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 03:51:09 2019 From: llvm-commits at lists.llvm.org (Nicolai Hähnle via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 10:51:09 +0000 (UTC) Subject: [PATCH] D65961: AMDGPU/SILoadStoreOptimizer: Optimize scanning for mergeable instructions In-Reply-To: References: Message-ID: nhaehnle added a comment. Thank you for doing this, it seems quite useful. As a follow-up to this change, do you think it makes sense to refactor CombineInfo a bit? We have a list of mergeable instructions, but the CombineInfo structure also has fields for a second instruction, which are only for temporary use, which is a bit odd.
Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D65961/new/ https://reviews.llvm.org/D65961 From llvm-commits at lists.llvm.org Wed Oct 9 04:01:53 2019 From: llvm-commits at lists.llvm.org (Sam Elliott via llvm-commits) Date: Wed, 09 Oct 2019 11:01:53 -0000 Subject: [test-suite] r374156 - Add GCC Torture Suite Sources Message-ID: <20191009110200.47EB990780@lists.llvm.org> Author: lenary Date: Wed Oct 9 04:01:46 2019 New Revision: 374156 URL: http://llvm.org/viewvc/llvm-project?rev=374156&view=rev Log: Add GCC Torture Suite Sources Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000112-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000113-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000121-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000205-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000217-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000223-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000224-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000225-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000227-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000313-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000314-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000314-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000314-3.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000402-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000403-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000412-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000412-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000412-3.c 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000412-4.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000412-5.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000412-6.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000419-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000422-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000503-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000511-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000519-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000519-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000523-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000528-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000603-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000605-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000605-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000605-3.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000622-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000703-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000706-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000706-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000706-3.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000706-4.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000706-5.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000707-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000715-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000715-2.c 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000717-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000717-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000717-3.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000717-4.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000717-5.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000722-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000726-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000731-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000731-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000801-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000801-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000801-3.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000801-4.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000808-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000815-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000818-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000819-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000822-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000910-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000910-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000914-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000917-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001009-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001009-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001011-1.c 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001013-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001017-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001017-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001024-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001026-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001027-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001031-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001101.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001108-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001111-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001112-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001121-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001124-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001130-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001130-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001203-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001203-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001221-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001228-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001229-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010106-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010114-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010116-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010118-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010119-1.c 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010122-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010123-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010129-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010206-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010209-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010221-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010222-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010224-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010325-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010329-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010403-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010409-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010422-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010518-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010518-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010520-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010604-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010605-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010605-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010711-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010717-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010723-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010904-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010904-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010910-1.c 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010915-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010924-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010925-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20011008-3.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20011019-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20011024-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20011109-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20011109-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20011113-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20011114-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20011115-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20011121-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20011126-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20011126-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20011128-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20011217-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20011219-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20011223-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020103-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020107-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020108-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020118-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020127-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020129-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020201-1.c 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020206-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020206-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020213-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020215-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020216-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020219-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020225-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020225-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020226-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020227-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020307-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020314-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020320-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020321-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020328-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020402-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020402-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020402-3.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020404-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020406-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020411-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020412-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020413-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020418-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020423-1.c 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020503-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020506-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020508-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020508-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020508-3.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020510-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020529-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020611-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020614-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020615-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020619-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020716-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020720-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020805-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020810-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020819-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020904-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020911-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020916-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020920-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20021010-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20021010-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20021011-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20021015-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20021024-1.c 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20021111-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20021113-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20021118-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20021118-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20021118-3.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20021119-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20021120-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20021120-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20021120-3.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20021127-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20021204-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20021219-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030105-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030109-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030117-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030120-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030120-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030125-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030128-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030203-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030209-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030216-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030218-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030221-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030222-1.c 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030224-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030307-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030313-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030316-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030323-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030330-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030401-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030403-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030404-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030408-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030501-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030606-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030613-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030626-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030626-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030714-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030715-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030717-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030718-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030811-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030821-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030828-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030828-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030903-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030909-1.c 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030910-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030913-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030914-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030914-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030916-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030920-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030928-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20031003-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20031010-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20031011-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20031012-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20031020-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20031201-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20031204-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20031211-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20031211-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20031214-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20031215-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20031216-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040208-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040218-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040223-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040302-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040307-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040308-1.c 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040309-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040311-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040313-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040319-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040331-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040409-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040409-1w.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040409-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040409-2w.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040409-3.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040409-3w.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040411-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040423-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040520-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040625-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040629-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040703-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040704-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040705-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040705-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040706-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040707-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040709-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040709-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040709-3.c 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040805-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040811-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040820-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040823-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040831-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040917-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20041011-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20041019-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20041112-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20041113-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20041114-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20041124-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20041126-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20041201-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20041210-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20041212-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20041213-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20041214-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20041218-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20041218-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050104-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050106-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050107-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050111-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050119-1.c 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050119-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050121-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050124-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050125-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050131-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050203-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050215-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050218-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050224-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050316-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050316-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050316-3.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050410-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050502-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050502-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050604-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050607-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050613-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050713-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050826-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050826-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050929-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20051012-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20051021-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20051104-1.c 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20051110-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20051110-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20051113-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20051215-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20060102-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20060110-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20060110-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20060127-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20060412-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20060420-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20060905-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20060910-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20060929-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20060930-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20060930-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20061031-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20061101-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20061101-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20061220-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20070201-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20070212-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20070212-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20070212-3.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20070424-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20070517-1.c 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20070614-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20070623-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20070724-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20070824-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20070919-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20071011-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20071018-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20071029-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20071030-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20071108-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20071120-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20071202-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20071205-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20071210-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20071211-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20071213-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20071216-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20071219-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20071220-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20071220-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20080117-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20080122-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20080222-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20080408-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20080424-1.c 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20080502-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20080506-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20080506-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20080519-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20080522-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20080529-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20080604-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20080719-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20080813-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20081103-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20081112-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20081117-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20081218-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20090113-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20090113-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20090113-3.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20090207-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20090219-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20090527-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20090623-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20090711-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20090814-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20091229-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20100209-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20100316-1.c 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20100416-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20100430-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20100708-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20100805-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20100827-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20101011-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20101013-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20101025-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20110418-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20111208-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20111212-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20111227-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20120105-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20120111-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20120207-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20120427-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20120427-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20120615-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20120808-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20120817-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20120919-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20121108-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20131127-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20140212-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20140212-2.c 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20140326-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20140425-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20140622-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20140828-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20141022-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20141107-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20141125-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20150611-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20170111-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20170401-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20170401-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20170419-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20171008-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20180112-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20180131-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20180226-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20180921-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20181120-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20190228-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20190820-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/900409-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920202-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920302-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920409-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920410-1.c 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920411-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920415-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920428-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920428-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920429-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920501-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920501-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920501-3.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920501-4.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920501-5.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920501-6.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920501-7.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920501-8.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920501-9.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920506-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920520-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920603-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920604-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920612-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920612-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920618-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920625-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920710-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920711-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920721-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920721-2.c 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920721-3.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920721-4.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920726-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920728-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920730-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920731-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920810-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920812-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920829-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920908-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920908-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920909-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920922-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920929-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921006-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921007-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921013-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921016-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921017-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921019-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921019-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921029-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921104-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921110-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921112-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921113-1.c 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921117-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921123-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921123-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921124-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921202-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921202-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921204-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921207-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921208-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921208-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921215-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921218-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921218-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930106-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930111-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930123-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930126-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930208-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930406-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930408-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930429-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930429-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930513-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930513-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930518-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930526-1.c 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930527-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930529-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930603-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930603-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930603-3.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930608-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930614-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930614-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930621-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930622-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930622-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930628-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930630-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930702-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930713-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930718-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930719-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930725-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930818-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930916-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930921-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930929-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930930-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930930-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931002-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931004-1.c 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931004-10.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931004-11.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931004-12.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931004-13.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931004-14.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931004-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931004-3.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931004-4.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931004-5.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931004-6.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931004-7.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931004-8.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931004-9.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931005-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931009-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931012-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931017-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931018-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931031-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931102-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931102-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931110-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931110-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931208-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931228-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/940115-1.c 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/940122-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/941014-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/941014-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/941015-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/941021-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/941025-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/941031-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/941101-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/941110-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/941202-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950221-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950322-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950426-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950426-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950503-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950511-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950512-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950605-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950607-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950607-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950612-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950621-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950628-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950704-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950706-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950710-1.c 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950714-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950809-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950906-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950915-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950929-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/951003-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/951115-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/951204-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960116-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960117-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960209-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960215-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960218-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960219-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960301-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960302-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960311-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960311-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960311-3.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960312-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960317-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960321-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960326-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960327-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960402-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960405-1.c 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960416-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960419-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960419-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960512-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960513-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960521-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960608-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960801-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960802-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960830-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960909-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/961004-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/961017-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/961017-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/961026-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/961112-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/961122-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/961122-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/961125-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/961206-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/961213-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/961223-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/970214-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/970214-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/970217-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/970923-1.c 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980205.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980223.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980424-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980505-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980505-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980506-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980506-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980506-3.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980526-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980526-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980526-3.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980602-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980602-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980604-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980605-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980608-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980612-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980617-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980618-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980701-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980707-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980709-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980716-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980929-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/981001-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/981019-1.c 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/981130-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/981206-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990106-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990106-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990117-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990127-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990127-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990128-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990130-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990208-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990211-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990222-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990324-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990326-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990404-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990413-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990513-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990524-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990525-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990525-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990527-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990531-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990604-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990628-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990804-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990811-1.c 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990826-0.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990827-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990829-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990923-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/991014-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/991016-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/991019-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/991023-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/991030-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/991112-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/991118-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/991201-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/991202-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/991202-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/991202-3.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/991216-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/991216-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/991216-4.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/991221-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/991227-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/991228-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/alias-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/alias-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/alias-3.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/alias-4.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/alias-access-path-1.c 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/align-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/align-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/align-3.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/align-nest.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/alloca-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/anon-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/arith-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/arith-rand-ll.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/arith-rand.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ashldi-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ashrdi-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/bcp-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/bf-layout-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/bf-pack-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/bf-sign-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/bf-sign-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/bf64-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/bitfld-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/bitfld-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/bitfld-3.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/bitfld-4.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/bitfld-5.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/bitfld-6.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/bitfld-7.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/bswap-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/bswap-2.c 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/built-in-setjmp.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtin-bitops-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtin-constant.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtin-prefetch-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtin-prefetch-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtin-prefetch-3.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtin-prefetch-4.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtin-prefetch-5.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtin-prefetch-6.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtin-types-compatible-p.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/20010124-1-lib.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/20010124-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/20010124-1.x test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/abs-1-lib.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/abs-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/abs-1.x test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/abs-2-lib.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/abs-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/abs-3-lib.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/abs-3.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/builtins.exp test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/chk.h 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/complex-1-lib.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/complex-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/fprintf-lib.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/fprintf.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/fprintf.x test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/fputs-lib.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/fputs.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/fputs.x test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/abs.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/bfill.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/bzero.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/chk.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/fprintf.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/main.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/memchr.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/memcmp.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/memmove.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/mempcpy.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/memset.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/printf.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/sprintf.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/stpcpy.c 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/strcat.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/strchr.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/strcmp.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/strcpy.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/strcspn.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/strlen.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/strncat.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/strncmp.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/strncpy.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/strnlen.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/strpbrk.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/strrchr.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/strspn.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/strstr.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memchr-lib.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memchr.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memcmp-lib.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memcmp.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memcpy-chk-lib.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memcpy-chk.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memcpy-chk.x test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memmove-2-lib.c 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memmove-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memmove-chk-lib.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memmove-chk.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memmove-chk.x test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memmove-lib.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memmove.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memops-asm-lib.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memops-asm.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memops-asm.x test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/mempcpy-2-lib.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/mempcpy-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/mempcpy-chk-lib.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/mempcpy-chk.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/mempcpy-chk.x test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/mempcpy-lib.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/mempcpy.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memset-chk-lib.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memset-chk.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memset-chk.x test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memset-lib.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memset.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/pr22237-lib.c 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/pr22237.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/pr23484-chk-lib.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/pr23484-chk.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/pr23484-chk.x test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/printf-lib.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/printf.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/snprintf-chk-lib.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/snprintf-chk.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/snprintf-chk.x test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/sprintf-chk-lib.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/sprintf-chk.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/sprintf-chk.x test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/sprintf-lib.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/sprintf.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/stpcpy-chk-lib.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/stpcpy-chk.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/stpcpy-chk.x test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/stpncpy-chk-lib.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/stpncpy-chk.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/stpncpy-chk.x test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strcat-chk-lib.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strcat-chk.c 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strcat-chk.x test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strcat-lib.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strcat.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strchr-lib.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strchr.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strcmp-lib.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strcmp.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strcpy-2-lib.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strcpy-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strcpy-chk-lib.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strcpy-chk.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strcpy-chk.x test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strcpy-lib.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strcpy.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strcspn-lib.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strcspn.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strlen-2-lib.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strlen-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strlen-3-lib.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strlen-3.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strlen-lib.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strlen.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strncat-chk-lib.c 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strncat-chk.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strncat-chk.x test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strncat-lib.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strncat.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strncmp-2-lib.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strncmp-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strncmp-lib.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strncmp.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strncpy-chk-lib.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strncpy-chk.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strncpy-chk.x test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strncpy-lib.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strncpy.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strnlen-lib.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strnlen.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strnlen.x test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strpbrk-lib.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strpbrk.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strpcpy-2-lib.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strpcpy-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strpcpy-lib.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strpcpy.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strrchr-lib.c 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strrchr.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strspn-lib.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strspn.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strstr-asm-lib.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strstr-asm.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strstr-asm.x test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strstr-lib.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strstr.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/vsnprintf-chk-lib.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/vsnprintf-chk.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/vsnprintf-chk.x test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/vsprintf-chk-lib.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/vsprintf-chk.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/vsprintf-chk.x test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/call-trap-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/cbrt.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/cmpdi-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/cmpsf-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/cmpsi-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/cmpsi-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/comp-goto-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/comp-goto-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/compare-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/compare-2.c 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/compare-3.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/complex-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/complex-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/complex-3.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/complex-4.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/complex-5.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/complex-6.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/complex-7.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/compndlit-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/const-addr-expr-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/conversion.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/cvt-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/dbra-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/divcmp-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/divcmp-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/divcmp-3.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/divcmp-4.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/divcmp-5.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/divconst-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/divconst-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/divconst-3.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/divmod-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/doloop-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/doloop-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/eeprof-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/enum-1.c 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/enum-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/enum-3.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/execute.exp test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/extzvsi.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ffs-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ffs-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/float-floor.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/floatunsisf-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/fprintf-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/fprintf-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/fprintf-chk-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/frame-address.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/func-ptr-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/gofast.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/20000320-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/20000320-1.x test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/20001122-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/20010114-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/20010114-2.x test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/20010226-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/20011123-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/20030331-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/20030331-1.x test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/20041213-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/920518-1.c 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/920518-1.x test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/920810-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/920810-1.x test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/930529-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/980619-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/980619-1.x test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/acc1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/acc2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/builtin-nan-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/compare-fp-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/compare-fp-1.x test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/compare-fp-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/compare-fp-3.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/compare-fp-3.x test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/compare-fp-4.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/compare-fp-4.x test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/copysign1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/copysign2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-1.x test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-2.x test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-3.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-3.x 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-4.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-4e.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-4f.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-4f.x test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-4l.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-5.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-6.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-6.x test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-7.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-7.x test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-8.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-8e.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-8f.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-8f.x test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-8l.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/hugeval.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/hugeval.x test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/ieee.exp test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/inf-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/inf-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/inf-3.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/minuszero.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/mul-subnormal-single-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/mul-subnormal-single-1.x 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/mzero2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/mzero2.x test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/mzero3.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/mzero4.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/mzero5.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/mzero6.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/pr28634.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/pr29302-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/pr29302-1.x test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/pr30704.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/pr30704.x test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/pr36332.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/pr38016.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/pr38016.x test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/pr50310.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/pr67218.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/pr72824-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/pr72824.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/pr84235.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/rbug.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/rbug.x test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/unsafe-fp-assoc-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/unsafe-fp-assoc-1.x test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/unsafe-fp-assoc.c 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ifcvt-onecmpl-abs-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/index-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/inst-check.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/int-compare.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ipa-sra-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ipa-sra-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/longlong.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-10.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-11.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-12.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-13.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-14.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-15.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-2b.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-2c.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-2d.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-2e.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-2f.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-2g.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-3.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-3b.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-3c.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-4.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-4b.c 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-5.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-6.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-7.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-8.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-9.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-ivopts-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-ivopts-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/lshrdi-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/lto-tbaa-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/mayalias-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/mayalias-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/mayalias-3.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/medce-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/memchr-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/memcpy-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/memcpy-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/memcpy-bi.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/memset-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/memset-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/memset-3.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/memset-4.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/mod-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/mode-dependent-address.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/multdi-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/multi-ix.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/nest-align-1.c 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/nest-stdar-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/nestfunc-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/nestfunc-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/nestfunc-3.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/nestfunc-4.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/nestfunc-5.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/nestfunc-6.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/nestfunc-7.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/noinit-attribute.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/p18298.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/packed-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/packed-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pending-4.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/postmod-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr15262-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr15262-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr15262.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr15296.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr16790-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr17078-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr17133.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr17252.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr17377.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr19005.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr19449.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr19515.c 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr19606.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr19687.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr19689.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr20100-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr20187-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr20466-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr20527-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr20601-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr20621-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr21173.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr21331.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr21964-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr22061-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr22061-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr22061-3.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr22061-4.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr22098-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr22098-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr22098-3.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr22141-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr22141-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr22348.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr22429.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr22493-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr22630.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr23047.c 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr23135.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr23324.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr23467.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr23604.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr23941.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr24135.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr24141.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr24142.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr24716.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr24851.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr25125.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr25737.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr27073.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr27260.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr27285.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr27364.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr27671-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr28289.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr28403.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr28651.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr28778.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr28865.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr28982a.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr28982b.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr29006.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr29156.c 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr29695-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr29695-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr29797-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr29797-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr29798.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr30185.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr30778.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr31072.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr31136.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr31169.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr31448-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr31448.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr31605.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr32244-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr32500.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr33142.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr33382.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr33631.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr33669.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr33779-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr33779-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr33870-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr33870.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr33992.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr34070-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr34070-2.c 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr34099-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr34099.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr34130.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr34154.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr34176.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr34415.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr34456.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr34768-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr34768-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr34971.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr34982.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr35163.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr35231.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr35390.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr35456.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr35472.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr35800.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr36034-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr36034-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr36038.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr36077.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr36093.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr36321.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr36339.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr36343.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr36691.c 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr36765.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr37102.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr37125.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr37573.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr37780.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr37882.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr37924.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr37931.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr38048-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr38048-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr38051.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr38151.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr38212.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr38236.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr38422.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr38533.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr38819.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr38969.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr39100.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr39120.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr39228.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr39233.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr39240.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr39339.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr39501.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr40022.c 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr40057.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr40386.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr40404.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr40493.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr40579.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr40657.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr40668.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr40747.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr41239.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr41317.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr41395-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr41395-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr41463.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr41750.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr41917.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr41919.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr41935.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr42006.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr42142.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr42154.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr42231.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr42248.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr42269-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr42512.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr42544.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr42570.c 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr42614.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr42691.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr42721.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr42833.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr43008.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr43220.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr43236.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr43269.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr43385.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr43438.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr43560.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr43629.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr43783.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr43784.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr43835.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr43987.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr44164.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr44202-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr44468.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr44555.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr44575.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr44683.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr44828.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr44852.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr44858.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr44942.c 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr45034.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr45070.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr45262.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr45695.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr46019.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr46309.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr46316.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr46909-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr46909-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr47148.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr47155.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr47237.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr47299.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr47337.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr47538.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr47925.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr48197.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr48571-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr48717.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr48809.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr48814-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr48814-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr48973-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr48973-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr49039.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr49073.c 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr49123.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr49161.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr49186.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr49218.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr49279.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr49281.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr49390.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr49419.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr49644.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr49712.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr49768.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr49886.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr50865.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr51023.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr51323.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr51447.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr51466.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr51581-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr51581-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr51877.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr51933.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr52129.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr52209.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr52286.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr52760.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr52979-1.c 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr52979-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr53084.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr53160.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr53465.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr53645-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr53645.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr53688.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr54471.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr54937.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr54985.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr55137.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr55750.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr55875.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr56051.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr56205.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr56250.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr56799.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr56837.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr56866.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr56899.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr56962.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr56982.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr57124.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr57130.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr57131.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr57144.c 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr57281.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr57321.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr57344-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr57344-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr57344-3.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr57344-4.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr57568.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr57829.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr57860.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr57861.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr57875.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr57876.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr57877.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58209.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58277-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58277-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58364.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58365.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58385.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58387.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58419.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58431.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58564.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58570.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58574.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58640-2.c 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58640.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58662.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58726.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58831.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58943.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58984.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr59014-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr59014.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr59101.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr59221.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr59229.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr59358.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr59387.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr59388.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr59413.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr59643.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr59747.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr60003.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr60017.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr60062.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr60072.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr60454.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr60822.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr60960.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr61306-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr61306-2.c 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr61306-3.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr61375.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr61517.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr61673.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr61682.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr61725.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr62151.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr63209.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr63302.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr63641.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr63659.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr63843.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr64006.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr64242.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr64255.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr64260.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr64682.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr64718.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr64756.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr64957.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr64979.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr65053-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr65053-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr65170.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr65215-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr65215-2.c 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr65215-3.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr65215-4.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr65215-5.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr65216.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr65369.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr65401.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr65418-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr65418-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr65427.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr65648.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr65956.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr66187.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr66233.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr66556.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr66757.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr66940.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr67037.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr67226.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr67714.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr67781.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr67929_1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr68143_1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr68185.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr68249.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr68250.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr68321.c 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr68328.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr68376-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr68376-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr68381.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr68390.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr68506.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr68532.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr68624.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr68648.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr68841.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr68911.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr69097-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr69097-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr69320-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr69320-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr69320-3.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr69320-4.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr69403.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr69447.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr69691.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr70005.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr70127.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr70222-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr70222-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr70429.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr70460.c 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr70566.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr70586.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr70602.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr70903.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr71083.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr71335.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr71494.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr71550.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr71554.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr71626-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr71626-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr71631.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr71700.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr7284-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr77718.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr77766.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr77767.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr78170.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr78378.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr78436.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr78438.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr78477.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr78559.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr78586.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr78617.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr78622.c 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr78675.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr78720.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr78726.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr78791.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr78856.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr79043.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr79121.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr79286.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr79327.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr79354.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr79388.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr79450.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr79737-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr79737-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr80153.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr80421.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr80501.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr80692.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr81281.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr81423.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr81503.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr81555.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr81556.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr81588.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr81913.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr82192.c 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr82210.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr82387.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr82388.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr82524.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr82954.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr83269.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr83298.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr83362.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr83383.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr83477.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr84169.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr84339.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr84478.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr84521.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr84524.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr84748.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr85095.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr85156.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr85169.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr85331.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr85529-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr85529-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr85582-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr85582-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr85582-3.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr85756.c 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr86231.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr86492.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr86528.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr86714.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr86844.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr87053.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr87290.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr87623.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr88693.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr88714.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr88739.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr88904.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr89195.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr89369.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr89434.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr89634.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr89826.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr90025.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr90949.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr91137.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/printf-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/printf-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/printf-chk-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pta-field-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pta-field-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ptr-arith-1.c 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pure-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pushpop_macro.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/regstack-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/restrict-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/return-addr.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/scal-to-vec1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/scal-to-vec2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/scal-to-vec3.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/scope-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/shiftdi-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/shiftdi.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/shiftopt-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/simd-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/simd-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/simd-4.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/simd-5.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/simd-6.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ssad-run.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/stdarg-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/stdarg-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/stdarg-3.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/stdarg-4.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/stkalign.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/strcmp-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/strcpy-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/strcpy-2.c 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/strct-pack-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/strct-pack-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/strct-pack-3.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/strct-pack-4.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/strct-stdarg-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/strct-varg-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/string-opt-17.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/string-opt-18.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/string-opt-5.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/strlen-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/strlen-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/strlen-3.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/strlen-4.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/strlen-5.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/strlen-6.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/strlen-7.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/strncmp-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/struct-aliasing-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/struct-cpy-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/struct-ini-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/struct-ini-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/struct-ini-3.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/struct-ini-4.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/struct-ret-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/struct-ret-2.c 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/switch-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/tstdi-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/unroll-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/usad-run.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/user-printf.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/usmul.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-10.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-11.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-12.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-13.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-14.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-15.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-16.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-17.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-18.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-19.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-20.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-21.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-22.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-23.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-24.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-26.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-4.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-5.c 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-6.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-7.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-8.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-9.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-pack-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-trap-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/vfprintf-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/vfprintf-chk-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/vla-dealloc-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/vprintf-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/vprintf-chk-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/vrp-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/vrp-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/vrp-3.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/vrp-4.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/vrp-5.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/vrp-6.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/vrp-7.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/wchar_t-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/widechar-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/widechar-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/widechar-3.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/zero-struct-1.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/zero-struct-2.c test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/zerolen-1.c 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/zerolen-2.c Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000112-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000112-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000112-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000112-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,20 @@ +#include + +static int +special_format (fmt) + const char *fmt; +{ + return (strchr (fmt, '*') != 0 + || strchr (fmt, 'V') != 0 + || strchr (fmt, 'S') != 0 + || strchr (fmt, 'n') != 0); +} + +main() +{ + if (special_format ("ee")) + abort (); + if (!special_format ("*e")) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000113-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000113-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000113-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000113-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,22 @@ +struct x { + unsigned x1:1; + unsigned x2:2; + unsigned x3:3; +}; + +foobar (int x, int y, int z) +{ + struct x a = {x, y, z}; + struct x b = {x, y, z}; + struct x *c = &b; + + c->x3 += (a.x2 - a.x1) * c->x2; + if (a.x1 != 1 || c->x3 != 5) + abort (); + exit (0); +} + +main() +{ + foobar (1, 2, 3); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000121-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000121-1.c?rev=374156&view=auto 
============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000121-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000121-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,14 @@ +void big(long long u) { } + +void doit(unsigned int a,unsigned int b,char *id) +{ + big(*id); + big(a); + big(b); +} + +int main(void) +{ + doit(1,1,"\n"); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000205-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000205-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000205-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000205-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,17 @@ +static int f (int a) +{ + if (a == 0) + return 0; + do + if (a & 128) + return 1; + while (f (0)); + return 0; +} + +int main(void) +{ + if (f (~128)) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000217-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000217-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000217-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000217-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,16 @@ +unsigned short int showbug(unsigned short int *a, unsigned short int *b) +{ + *a += *b -8; + return (*a >= 8); +} + +int main() +{ + unsigned short int x = 0; + unsigned short int y = 10; + + if (showbug(&x, &y) != 0) + abort (); + + exit (0); +} Added: 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000223-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000223-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000223-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000223-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,61 @@ +/* Copyright (C) 2000 Free Software Foundation, Inc. + Contributed by Nathan Sidwell 23 Feb 2000 */ + +/* __alignof__ should never return a non-power of 2 + eg, sizeof(long double) might be 12, but that means it must be alignable + on a 4 byte boundary. */ + +void check (char const *type, int align) +{ + if ((align & -align) != align) + { + abort (); + } +} + +#define QUOTE_(s) #s +#define QUOTE(s) QUOTE_(s) + +#define check(t) check(QUOTE(t), __alignof__(t)) + +// This struct should have an alignment of the lcm of all the types. If one of +// the base alignments is not a power of two, then A cannot be power of two +// aligned. 
+struct A +{ + char c; + signed short ss; + unsigned short us; + signed int si; + unsigned int ui; + signed long sl; + unsigned long ul; + signed long long sll; + unsigned long long ull; + float f; + double d; + long double ld; + void *dp; + void (*fp)(); +}; + +int main () +{ + check (void); + check (char); + check (signed short); + check (unsigned short); + check (signed int); + check (unsigned int); + check (signed long); + check (unsigned long); + check (signed long long); + check (unsigned long long); + check (float); + check (double); + check (long double); + check (void *); + check (void (*)()); + check (struct A); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000224-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000224-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000224-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000224-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,27 @@ +int loop_1 = 100; +int loop_2 = 7; +int flag = 0; + +int test (void) +{ + int i; + int counter = 0; + + while (loop_1 > counter) { + if (flag & 1) { + for (i = 0; i < loop_2; i++) { + counter++; + } + } + flag++; + } + return 1; +} + +int main() +{ + if (test () != 1) + abort (); + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000225-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000225-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000225-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000225-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,19 @@ +int main () +{ + int 
nResult; + int b=0; + int i = -1; + + do + { + if (b!=0) { + abort (); + nResult=1; + } else { + nResult=0; + } + i++; + b=(i+2)*4; + } while (i < 0); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000227-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000227-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000227-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000227-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,15 @@ +static const unsigned char f[] = "\0\377"; +static const unsigned char g[] = "\0ÿ"; + +int main(void) +{ + if (sizeof f != 3 || sizeof g != 3) + abort (); + if (f[0] != g[0]) + abort (); + if (f[1] != g[1]) + abort (); + if (f[2] != g[2]) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000313-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000313-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000313-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000313-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,20 @@ +unsigned int buggy (unsigned int *param) +{ + unsigned int accu, zero = 0, borrow; + accu = - *param; + borrow = - (accu > zero); + *param += accu; + return borrow; +} + +int main (void) +{ + unsigned int param = 1; + unsigned int borrow = buggy (&param); + + if (param != 0) + abort (); + if (borrow + 1 != 0) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000314-1.c URL:
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000314-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000314-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000314-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,15 @@ +int main () +{ + long winds = 0; + + while (winds != 0) + { + if (*(char *) winds) + break; + } + + if (winds == 0 || winds != 0 || *(char *) winds) + exit (0); + + abort (); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000314-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000314-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000314-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000314-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,21 @@ +typedef unsigned long long uint64; +const uint64 bigconst = 1ULL << 34; + +int a = 1; + +static +uint64 getmask(void) +{ + if (a) + return bigconst; + else + return 0; +} + +main() +{ + uint64 f = getmask(); + if (sizeof (long long) == 8 + && f != bigconst) abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000314-3.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000314-3.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000314-3.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000314-3.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,41 @@ +extern void abort (void); + +static char arg0[] = "arg0"; +static char 
arg1[] = "arg1"; + +static void attr_rtx (char *, char *); +static char *attr_string (char *); +static void attr_eq (char *, char *); + +static void +attr_rtx (char *varg0, char *varg1) +{ + if (varg0 != arg0) + abort (); + + if (varg1 != arg1) + abort (); + + return; +} + +static void +attr_eq (name, value) + char *name, *value; +{ + return attr_rtx (attr_string (name), + attr_string (value)); +} + +static char * +attr_string (str) + char *str; +{ + return str; +} + +int main() +{ + attr_eq (arg0, arg1); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000402-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000402-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000402-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000402-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,33 @@ +#include + +#if ULONG_LONG_MAX != 18446744073709551615ull && ULONG_MAX != 18446744073709551615ull +int main(void) { exit (0); } +#else +#if ULONG_MAX != 18446744073709551615ull +typedef unsigned long long ull; +#else +typedef unsigned long ull; +#endif + +#include + +void checkit(int); + +main () { + const ull a = 0x1400000000ULL; + const ull b = 0x80000000ULL; + const ull c = a/b; + const ull d = 0x1400000000ULL / 0x80000000ULL; + + checkit ((int) c); + checkit ((int) d); + + exit(0); +} + +void checkit (int a) +{ + if (a != 40) + abort(); +} +#endif Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000403-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000403-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000403-1.c (added) +++ 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000403-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,29 @@ +extern unsigned long aa[], bb[]; + +int seqgt (unsigned long a, unsigned short win, unsigned long b); + +int seqgt2 (unsigned long a, unsigned short win, unsigned long b); + +main() +{ + if (! seqgt (*aa, 0x1000, *bb) || ! seqgt2 (*aa, 0x1000, *bb)) + abort (); + + exit (0); +} + +int +seqgt (unsigned long a, unsigned short win, unsigned long b) +{ + return (long) ((a + win) - b) > 0; +} + +int +seqgt2 (unsigned long a, unsigned short win, unsigned long b) +{ + long l = ((a + win) - b); + return l > 0; +} + +unsigned long aa[] = { (1UL << (sizeof (long) * 8 - 1)) - 0xfff }; +unsigned long bb[] = { (1UL << (sizeof (long) * 8 - 1)) - 0xfff }; Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000412-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000412-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000412-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000412-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,17 @@ +short int i = -1; +const char * const wordlist[207]; + +const char * const * +foo(void) +{ + register const char * const *wordptr = &wordlist[207u + i]; + return wordptr; +} + +int +main() +{ + if (foo() != &wordlist[206]) + abort (); + exit(0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000412-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000412-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000412-2.c (added) +++ 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000412-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,16 @@ +int f(int a,int *y) +{ + int x = a; + + if (a==0) + return *y; + + return f(a-1,&x); +} + +int main(int argc,char **argv) +{ + if (f (100, (int *) 0) != 1) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000412-3.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000412-3.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000412-3.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000412-3.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,31 @@ +typedef struct { + char y; + char x[32]; +} X; + +int z (void) +{ + X xxx; + xxx.x[0] = + xxx.x[31] = '0'; + xxx.y = 0xf; + return f (xxx, xxx); +} + +int main (void) +{ + int val; + + val = z (); + if (val != 0x60) + abort (); + exit (0); +} + +int f(X x, X y) +{ + if (x.y != y.y) + return 'F'; + + return x.x[0] + y.x[0]; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000412-4.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000412-4.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000412-4.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000412-4.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,25 @@ + void f(int i, int j, int radius, int width, int N) + { + const int diff = i-radius; + const int lowk = (diff>0 ? 
diff : 0 ); + int k; + + for(k=lowk; k<= 2; k++){ + int idx = ((k-i+radius)*width-j+radius); + if (idx < 0) + abort (); + } + + for(k=lowk; k<= 2; k++); + } + + + int main(int argc, char **argv) + { + int exc_rad=2; + int N=8; + int i; + for(i=1; i<4; i++) + f(i,1,exc_rad,2*exc_rad + 1, N); + exit (0); + } Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000412-5.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000412-5.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000412-5.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000412-5.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,10 @@ +int main( void ) { + struct { + int node; + int type; + } lastglob[1] = { { 0 , 1 } }; + + if (lastglob[0].node != 0 || lastglob[0].type != 1) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000412-6.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000412-6.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000412-6.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000412-6.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,22 @@ +unsigned bug (unsigned short value, unsigned short *buffer, + unsigned short *bufend); + +unsigned short buf[] = {1, 4, 16, 64, 256}; +int main() +{ + if (bug (512, buf, buf + 3) != 491) + abort (); + + exit (0); +} + +unsigned +bug (unsigned short value, unsigned short *buffer, unsigned short *bufend) +{ + unsigned short *tmp; + + for (tmp = buffer; tmp < bufend; tmp++) + value -= *tmp; + + return value; +} Added: 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000419-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000419-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000419-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000419-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,22 @@ +struct foo { int a, b, c; }; + +void +brother (int a, int b, int c) +{ + if (a) + abort (); +} + +void +sister (struct foo f, int b, int c) +{ + brother ((f.b == b), b, c); +} + +int +main () +{ + struct foo f = { 7, 8, 9 }; + sister (f, 1, 2); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000422-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000422-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000422-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000422-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,38 @@ +int ops[13] = +{ + 11, 12, 46, 3, 2, 2, 3, 2, 1, 3, 2, 1, 2 +}; + +int correct[13] = +{ + 46, 12, 11, 3, 3, 3, 2, 2, 2, 2, 2, 1, 1 +}; + +int num = 13; + +int main() +{ + int i; + + for (i = 0; i < num; i++) + { + int j; + + for (j = num - 1; j > i; j--) + { + if (ops[j-1] < ops[j]) + { + int op = ops[j]; + ops[j] = ops[j-1]; + ops[j-1] = op; + } + } + } + + + for (i = 0; i < num; i++) + if (ops[i] != correct[i]) + abort (); + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000503-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000503-1.c?rev=374156&view=auto 
============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000503-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000503-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,13 @@ +unsigned long +sub (int a) +{ + return ((0 > a - 2) ? 0 : a - 2) * sizeof (long); +} + +main () +{ + if (sub (0) != 0) + abort (); + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000511-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000511-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000511-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000511-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,33 @@ +void f (int value, int expect) +{ + if (value != expect) + abort (); +} + +int main() +{ + int a = 7, b = 6, c = 4, d = 7, e = 2; + + f (a||b%c, 1); + f (a?b%c:0, 2); + f (a=b%c, 2); + f (a*=b%c, 4); + f (a/=b%c, 2); + f (a%=b%c, 0); + f (a+=b%c, 2); + f (d||c&&e, 1); + f (d?c&&e:0, 1); + f (d=c&&e, 1); + f (d*=c&&e, 1); + f (d%=c&&e, 0); + f (d+=c&&e, 1); + f (d-=c&&e, 0); + f (d||c||e, 1); + f (d?c||e:0, 0); + f (d=c||e, 1); + f (d*=c||e, 1); + f (d%=c||e, 0); + f (d+=c||e, 1); + f (d-=c||e, 0); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000519-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000519-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000519-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000519-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,30 @@ 
+#include + +int +bar (int a, va_list ap) +{ + int b; + + do + b = va_arg (ap, int); + while (b > 10); + + return a + b; +} + +int +foo (int a, ...) +{ + va_list ap; + + va_start (ap, a); + return bar (a, ap); +} + +int +main () +{ + if (foo (1, 2, 3) != 3) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000519-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000519-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000519-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000519-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,11 @@ +long x = -1L; + +int main() +{ + long b = (x != -1L); + + if (b) + abort(); + + exit(0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000523-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000523-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000523-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000523-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,22 @@ +int +main (void) +{ + long long x; + int n; + + if (sizeof (long long) < 8) + exit (0); + + n = 9; + x = (((long long) n) << 55) / 0xff; + + if (x == 0) + abort (); + + x = (((long long) 9) << 55) / 0xff; + + if (x == 0) + abort (); + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000528-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000528-1.c?rev=374156&view=auto ============================================================================== --- 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000528-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000528-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,13 @@ +/* Copyright (C) 2000 Free Software Foundation */ +/* Contributed by Alexandre Oliva */ + +unsigned long l = (unsigned long)-2; +unsigned short s; + +int main () { + long t = l; + s = t; + if (s != (unsigned short)-2) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000603-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000603-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000603-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000603-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,22 @@ +/* It is not clear whether this test is conforming. See DR#236 + http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_236.htm. However, + there seems to be consensus that the presence of a union to aggregate + struct s1 and struct s2 should make it conforming. 
*/ +struct s1 { double d; }; +struct s2 { double d; }; +union u { struct s1 x; struct s2 y; }; + +double f(struct s1 *a, struct s2 *b) +{ + a->d = 1.0; + return b->d + 1.0; +} + +int main() +{ + union u a; + a.x.d = 0.0; + if (f (&a.x, &a.y) != 2.0) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000605-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000605-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000605-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000605-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,48 @@ +typedef struct _RenderInfo RenderInfo; +struct _RenderInfo +{ + int y; + float scaley; + int src_y; +}; + +static void bar(void) { } + +static int +render_image_rgb_a (RenderInfo * info) +{ + int y, ye; + float error; + float step; + + y = info->y; + ye = 256; + + step = 1.0 / info->scaley; + + error = y * step; + error -= ((int) error) - step; + + for (; y < ye; y++) { + if (error >= 1.0) { + info->src_y += (int) error; + error -= (int) error; + bar(); + } + error += step; + } + return info->src_y; +} + +int main (void) +{ + RenderInfo info; + + info.y = 0; + info.src_y = 0; + info.scaley = 1.0; + + if (render_image_rgb_a(&info) != 256) + abort (); + exit(0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000605-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000605-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000605-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000605-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,18 @@ +struct F { int 
i; }; + +void f1(struct F *x, struct F *y) +{ + int timeout = 0; + for (; ((const struct F*)x)->i < y->i ; x->i++) + if (++timeout > 5) + abort (); +} + +main() +{ + struct F x, y; + x.i = 0; + y.i = 1; + f1 (&x, &y); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000605-3.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000605-3.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000605-3.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000605-3.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,18 @@ +struct F { int x; int y; }; + +int main() +{ + int timeout = 0; + int x = 0; + while (1) + { + const struct F i = { x++, }; + if (i.x > 0) + break; + if (++timeout > 5) + goto die; + } + return 0; + die: + abort (); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000622-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000622-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000622-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000622-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,21 @@ +long foo(long a, long b, long c) +{ + if (a != 12 || b != 1 || c != 11) + abort(); + return 0; +} +long bar (long a, long b) +{ + return b; +} +void baz (long a, long b, void *c) +{ + long d; + d = (long)c; + foo(d, bar (a, 1), b); +} +int main() +{ + baz (10, 11, (void *)12); + exit(0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000703-1.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000703-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000703-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000703-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,39 @@ +void abort(void); +void exit(int); +struct baz +{ + char a[17]; + char b[3]; + unsigned int c; + unsigned int d; +}; + +void foo(struct baz *p, unsigned int c, unsigned int d) +{ + __builtin_memcpy (p->b, "abc", 3); + p->c = c; + p->d = d; +} + +void bar(struct baz *p, unsigned int c, unsigned int d) +{ + ({ void *s = (p); + __builtin_memset (s, '\0', sizeof (struct baz)); + s; }); + __builtin_memcpy (p->a, "01234567890123456", 17); + __builtin_memcpy (p->b, "abc", 3); + p->c = c; + p->d = d; +} + +int main() +{ + struct baz p; + foo(&p, 71, 18); + if (p.c != 71 || p.d != 18) + abort(); + bar(&p, 59, 26); + if (p.c != 59 || p.d != 26) + abort(); + exit(0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000706-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000706-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000706-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000706-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,31 @@ +extern void abort(void); +extern void exit(int); + +struct baz { + int a, b, c, d, e; +}; + +void bar(struct baz *x, int f, int g, int h, int i, int j) +{ + if (x->a != 1 || x->b != 2 || x->c != 3 || x->d != 4 || x->e != 5 || + f != 6 || g != 7 || h != 8 || i != 9 || j != 10) + abort(); +} + +void foo(struct baz x, char **y) +{ + bar(&x,6,7,8,9,10); +} + +int main() +{ + struct baz 
x; + + x.a = 1; + x.b = 2; + x.c = 3; + x.d = 4; + x.e = 5; + foo(x,(char **)0); + exit(0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000706-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000706-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000706-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000706-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,31 @@ +extern void abort(void); +extern void exit(int); + +struct baz { + int a, b, c, d, e; +}; + +void bar(struct baz *x, int f, int g, int h, int i, int j) +{ + if (x->a != 1 || x->b != 2 || x->c != 3 || x->d != 4 || x->e != 5 || + f != 6 || g != 7 || h != 8 || i != 9 || j != 10) + abort(); +} + +void foo(char *z, struct baz x, char *y) +{ + bar(&x,6,7,8,9,10); +} + +int main() +{ + struct baz x; + + x.a = 1; + x.b = 2; + x.c = 3; + x.d = 4; + x.e = 5; + foo((char *)0,x,(char *)0); + exit(0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000706-3.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000706-3.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000706-3.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000706-3.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,27 @@ +extern void abort(void); +extern void exit(int); + +int c; + +void baz(int *p) +{ + c = *p; +} + +void bar(int b) +{ + if (c != 1 || b != 2) + abort(); +} + +void foo(int a, int b) +{ + baz(&a); + bar(b); +} + +int main() +{ + foo(1, 2); + exit(0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000706-4.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000706-4.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000706-4.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000706-4.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,22 @@ +extern void abort(void); +extern void exit(int); + +int *c; + +void bar(int b) +{ + if (*c != 1 || b != 2) + abort(); +} + +void foo(int a, int b) +{ + c = &a; + bar(b); +} + +int main() +{ + foo(1, 2); + exit(0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000706-5.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000706-5.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000706-5.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000706-5.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,28 @@ +extern void abort(void); +extern void exit(int); + +struct baz { int a, b, c; }; + +struct baz *c; + +void bar(int b) +{ + if (c->a != 1 || c->b != 2 || c->c != 3 || b != 4) + abort(); +} + +void foo(struct baz a, int b) +{ + c = &a; + bar(b); +} + +int main() +{ + struct baz a; + a.a = 1; + a.b = 2; + a.c = 3; + foo(a, 4); + exit(0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000707-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000707-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000707-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000707-1.c Wed Oct 9 04:01:46 
2019 @@ -0,0 +1,27 @@ +extern void abort(void); +extern void exit(int); + +struct baz { + int a, b, c; +}; + +void +foo (int a, int b, int c) +{ + if (a != 4) + abort (); +} + +void +bar (struct baz x, int b, int c) +{ + foo (x.b, b, c); +} + +int +main () +{ + struct baz x = { 3, 4, 5 }; + bar (x, 1, 2); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000715-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000715-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000715-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000715-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,118 @@ +void abort(void); +void exit(int); + +void +test1(void) +{ + int x = 3, y = 2; + + if ((x < y ? x++ : y++) != 2) + abort (); + + if (x != 3) + abort (); + + if (y != 3) + abort (); +} + +void +test2(void) +{ + int x = 3, y = 2, z; + + z = (x < y) ? x++ : y++; + if (z != 2) + abort (); + + if (x != 3) + abort (); + + if (y != 3) + abort (); +} + +void +test3(void) +{ + int x = 3, y = 2; + int xx = 3, yy = 2; + + if ((xx < yy ? x++ : y++) != 2) + abort (); + + if (x != 3) + abort (); + + if (y != 3) + abort (); +} + +int x, y; + +static void +init_xy(void) +{ + x = 3; + y = 2; +} + +void +test4(void) +{ + init_xy(); + if ((x < y ? x++ : y++) != 2) + abort (); + + if (x != 3) + abort (); + + if (y != 3) + abort (); +} + +void +test5(void) +{ + int z; + + init_xy(); + z = (x < y) ? x++ : y++; + if (z != 2) + abort (); + + if (x != 3) + abort (); + + if (y != 3) + abort (); +} + +void +test6(void) +{ + int xx = 3, yy = 2; + int z; + + init_xy(); + z = (xx < y) ? 
x++ : y++; + if (z != 2) + abort (); + + if (x != 3) + abort (); + + if (y != 3) + abort (); +} + +int +main(){ + test1 (); + test2 (); + test3 (); + test4 (); + test5 (); + test6 (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000715-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000715-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000715-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000715-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,14 @@ +extern void abort(void); +extern void exit(int); + +unsigned int foo(unsigned int a) +{ + return ((unsigned char)(a + 1)) * 4; +} + +int main(void) +{ + if (foo((unsigned char)~0)) + abort (); + exit(0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000717-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000717-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000717-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000717-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,22 @@ +typedef struct trio { int a, b, c; } trio; + +int +bar (int i, trio t) +{ + if (t.a == t.b || t.a == t.c) + abort (); +} + +int +foo (trio t, int i) +{ + return bar (i, t); +} + +main () +{ + trio t = { 1, 2, 3 }; + + foo (t, 4); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000717-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000717-2.c?rev=374156&view=auto ============================================================================== --- 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000717-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000717-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,11 @@ +static void +compare (long long foo) +{ + if (foo < 4294967297LL) + abort(); +} +int main(void) +{ + compare (8589934591LL); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000717-3.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000717-3.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000717-3.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000717-3.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,25 @@ +int c = -1; + +foo (p) + int *p; +{ + int x; + int a; + + a = p[0]; + x = a + 5; + a = c; + p[0] = x - 15; + return a; +} + +int main() +{ + int b = 1; + int a = foo(&b); + + if (a != -1 || b != (1 + 5 - 15)) + abort (); + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000717-4.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000717-4.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000717-4.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000717-4.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,26 @@ +/* Extracted from gas. Incorrectly generated non-pic code at -O0 for + IA-64, which produces linker errors on some operating systems. 
*/ + +struct +{ + int offset; + struct slot + { + int field[6]; + } + slot[4]; +} s; + +int +x () +{ + int toggle = 0; + int r = s.slot[0].field[!toggle]; + return r; +} + +int +main () +{ + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000717-5.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000717-5.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000717-5.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000717-5.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,23 @@ +typedef struct trio { int a, b, c; } trio; + +int +bar (int i, int j, int k, trio t) +{ + if (t.a != 1 || t.b != 2 || t.c != 3 || + i != 4 || j != 5 || k != 6) + abort (); +} + +int +foo (trio t, int i, int j, int k) +{ + return bar (i, j, k, t); +} + +main () +{ + trio t = { 1, 2, 3 }; + + foo (t, 4, 5, 6); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000722-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000722-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000722-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000722-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,24 @@ +struct s { char *p; int t; }; + +extern void bar (void); +extern void foo (struct s *); + +int main(void) +{ + bar (); + bar (); + exit (0); +} + +void +bar (void) +{ + foo (& (struct s) { "hi", 1 }); +} + +void foo (struct s *p) +{ + if (p->t != 1) + abort(); + p->t = 2; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000726-1.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000726-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000726-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000726-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,31 @@ +void adjust_xy (short *, short *); + +struct adjust_template +{ + short kx_x; + short kx_y; + short kx; + short kz; +}; + +static struct adjust_template adjust = {0, 0, 1, 1}; + +main () +{ + short x = 1, y = 1; + + adjust_xy (&x, &y); + + if (x != 1) + abort (); + + exit (0); +} + +void +adjust_xy (x, y) + short *x; + short *y; +{ + *x = adjust.kx_x * *x + adjust.kx_y * *y + adjust.kx; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000731-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000731-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000731-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000731-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,23 @@ +double +foo (void) +{ + return 0.0; +} + +void +do_sibcall (void) +{ + (void) foo (); +} + +int +main (void) +{ + double x; + + for (x = 0; x < 20; x++) + do_sibcall (); + if (!(x >= 10)) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000731-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000731-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000731-2.c (added) +++ 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000731-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,16 @@ +int +main() +{ + int i = 1; + int j = 0; + + while (i != 1024 || j <= 0) { + i *= 2; + ++ j; + } + + if (j != 10) + abort (); + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000801-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000801-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000801-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000801-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,39 @@ +extern void abort(void); +extern void exit(int); + +void +foo (char *bp, unsigned n) +{ + register char c; + register char *ep = bp + n; + register char *sp; + + while (bp < ep) + { + sp = bp + 3; + c = *sp; + *sp = *bp; + *bp++ = c; + sp = bp + 1; + c = *sp; + *sp = *bp; + *bp++ = c; + bp += 2; + } +} + +int main(void) +{ + int one = 1; + + if (sizeof(int) != 4 * sizeof(char)) + exit(0); + + foo((char *)&one, sizeof(one)); + foo((char *)&one, sizeof(one)); + + if (one != 1) + abort(); + + exit(0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000801-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000801-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000801-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000801-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,40 @@ +extern void abort(void); +extern void exit(int); +int bar(void); +int baz(void); + +struct foo { + struct foo *next; +}; + +struct foo *test(struct foo *node) +{ + while (node) { + if (bar() && 
!baz()) + break; + node = node->next; + } + return node; +} + +int bar (void) +{ + return 0; +} + +int baz (void) +{ + return 0; +} + +int main(void) +{ + struct foo a, b, *c; + + a.next = &b; + b.next = (struct foo *)0; + c = test(&a); + if (c) + abort(); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000801-3.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000801-3.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000801-3.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000801-3.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,24 @@ +/* Origin: PR c/92 from Simon Marlow , adapted + to a testcase by Joseph Myers . +*/ + +typedef struct { } empty; + +typedef struct { + int i; + empty e; + int i2; +} st; + +st s = { .i = 0, .i2 = 1 }; + +extern void abort (void); + +int +main (void) +{ + if (s.i2 == 1) + exit (0); + else + abort (); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000801-4.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000801-4.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000801-4.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000801-4.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,30 @@ +/* Origin: PR c/128 from Martin Sebor , adapted + as a testcase by Joseph Myers . +*/ +/* Character arrays initialized by a string literal must have + uninitialized elements zeroed. This isn't clear in the 1990 + standard, but was fixed in TC2 and C99; see DRs #060, #092. 
+*/ +extern void abort (void); + +int +foo (void) +{ + char s[2] = ""; + return 0 == s[1]; +} + +char *t; + +int +main (void) +{ + { + char s[] = "x"; + t = s; + } + if (foo ()) + exit (0); + else + abort (); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000808-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000808-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000808-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000808-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,56 @@ +typedef struct { + long int p_x, p_y; +} Point; + +void +bar () +{ +} + +void +f (p0, p1, p2, p3, p4, p5) + Point p0, p1, p2, p3, p4, p5; +{ + if (p0.p_x != 0 || p0.p_y != 1 + || p1.p_x != -1 || p1.p_y != 0 + || p2.p_x != 1 || p2.p_y != -1 + || p3.p_x != -1 || p3.p_y != 1 + || p4.p_x != 0 || p4.p_y != -1 + || p5.p_x != 1 || p5.p_y != 0) + abort (); +} + +void +foo () +{ + Point p0, p1, p2, p3, p4, p5; + + bar(); + + p0.p_x = 0; + p0.p_y = 1; + + p1.p_x = -1; + p1.p_y = 0; + + p2.p_x = 1; + p2.p_y = -1; + + p3.p_x = -1; + p3.p_y = 1; + + p4.p_x = 0; + p4.p_y = -1; + + p5.p_x = 1; + p5.p_y = 0; + + f (p0, p1, p2, p3, p4, p5); +} + +int +main() +{ + foo(); + exit(0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000815-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000815-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000815-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000815-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,70 @@ +struct table_elt +{ + void *exp; + struct table_elt *next_same_hash; + 
struct table_elt *prev_same_hash; + struct table_elt *next_same_value; + struct table_elt *prev_same_value; + struct table_elt *first_same_value; + struct table_elt *related_value; + int cost; + int mode; + char in_memory; + char in_struct; + char is_const; + char flag; +}; + +struct write_data +{ + int sp : 1; + int var : 1; + int nonscalar : 1; + int all : 1; +}; + +int cse_rtx_addr_varies_p(void *); +void remove_from_table(struct table_elt *, int); +static struct table_elt *table[32]; + +void +invalidate_memory (writes) + struct write_data *writes; +{ + register int i; + register struct table_elt *p, *next; + int all = writes->all; + int nonscalar = writes->nonscalar; + + for (i = 0; i < 31; i++) + for (p = table[i]; p; p = next) + { + next = p->next_same_hash; + if (p->in_memory + && (all + || (nonscalar && p->in_struct) + || cse_rtx_addr_varies_p (p->exp))) + remove_from_table (p, i); + } +} + +int cse_rtx_addr_varies_p(void *x) { return 0; } +void remove_from_table(struct table_elt *x, int y) { abort (); } + +int +main() +{ + struct write_data writes; + struct table_elt elt; + + __builtin_memset(&elt, 0, sizeof(elt)); + elt.in_memory = 1; + table[0] = &elt; + + __builtin_memset(&writes, 0, sizeof(writes)); + writes.var = 1; + writes.nonscalar = 1; + + invalidate_memory(&writes); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000818-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000818-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000818-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000818-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,60 @@ +/* Copyright (C) 2000 Free Software Foundation. 
+ + by Manfred Hollstein */ + +void *temporary_obstack; + +static int input (void); +static int ISALNUM (int ch); +static void obstack_1grow (void **ptr, int ch); + +int yylex (void); +int main (void); + +int main (void) +{ + int ch = yylex (); + + exit (0); +} + +int yylex (void) +{ + int ch; + +#ifndef WORK_AROUND + for (;;) + { + ch = input (); + if (ISALNUM (ch)) + obstack_1grow (&temporary_obstack, ch); + else if (ch != '_') + break; + } +#else + do + { + ch = input (); + if (ISALNUM (ch)) + obstack_1grow (&temporary_obstack, ch); + } while (ch == '_'); +#endif + + return ch; +} + +static int input (void) +{ + return 0; +} + +static int ISALNUM (int ch) +{ + return ((ch >= 'A' && ch <= 'Z') + || (ch >= 'a' && ch <= 'z') + || (ch >= '0' && ch <= '0')); +} + +static void obstack_1grow (void **ptr, int ch) +{ +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000819-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000819-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000819-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000819-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,17 @@ +int a[2] = { 2, 0 }; + +void foo(int *sp, int cnt) +{ + int *p, *top; + + top = sp; sp -= cnt; + + for(p = sp; p <= top; p++) + if (*p < 2) exit(0); +} + +int main() +{ + foo(a + 1, 1); + abort(); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000822-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000822-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000822-1.c (added) +++ 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000822-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,27 @@ +/* { dg-require-effective-target trampolines } */ + +int f0(int (*fn)(int *), int *p) +{ + return (*fn) (p); +} + +int f1(void) +{ + int i = 0; + + int f2(int *p) + { + i = 1; + return *p + 1; + } + + return f0(f2, &i); +} + +int main() +{ + if (f1() != 2) + abort (); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000910-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000910-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000910-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000910-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,27 @@ +/* Copyright (C) 2000 Free Software Foundation */ +/* by Alexandre Oliva */ + +#include + +void bar (int); +void foo (int *); + +int main () { + static int a[] = { 0, 1, 2 }; + int *i = &a[sizeof(a)/sizeof(*a)]; + + while (i-- > a) + foo (i); + + exit (0); +} + +void baz (int, int); + +void bar (int i) { baz (i, i); } +void foo (int *i) { bar (*i); } + +void baz (int i, int j) { + if (i != j) + abort (); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000910-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000910-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000910-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000910-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,28 @@ +/* Copyright (C) 2000 Free Software Foundation */ +/* by Alexandre Oliva */ + +#include +#include + +char *list[] = { "*", "e" }; + +static int 
bar (const char *fmt) { + return (strchr (fmt, '*') != 0); +} + +static void foo () { + int i; + for (i = 0; i < sizeof (list) / sizeof (*list); i++) { + const char *fmt = list[i]; + if (bar (fmt)) + continue; + if (i == 0) + abort (); + else + exit (0); + } +} + +int main () { + foo (); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000914-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000914-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000914-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000914-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,292 @@ +extern void *malloc(__SIZE_TYPE__); + +enum tree_code { +ERROR_MARK, +IDENTIFIER_NODE, +OP_IDENTIFIER, +TREE_LIST, +TREE_VEC, +BLOCK, +VOID_TYPE, +INTEGER_TYPE, +REAL_TYPE, +COMPLEX_TYPE, +VECTOR_TYPE, +ENUMERAL_TYPE, +BOOLEAN_TYPE, +CHAR_TYPE, +POINTER_TYPE, +OFFSET_TYPE, +REFERENCE_TYPE, +METHOD_TYPE, +FILE_TYPE, +ARRAY_TYPE, +SET_TYPE, +RECORD_TYPE, +UNION_TYPE, +QUAL_UNION_TYPE, +FUNCTION_TYPE, +LANG_TYPE, +INTEGER_CST, +REAL_CST, +COMPLEX_CST, +STRING_CST, +FUNCTION_DECL, +LABEL_DECL, +CONST_DECL, +TYPE_DECL, +VAR_DECL, +PARM_DECL, +RESULT_DECL, +FIELD_DECL, +NAMESPACE_DECL, +COMPONENT_REF, +BIT_FIELD_REF, +INDIRECT_REF, +BUFFER_REF, +ARRAY_REF, +CONSTRUCTOR, +COMPOUND_EXPR, +MODIFY_EXPR, +INIT_EXPR, +TARGET_EXPR, +COND_EXPR, +BIND_EXPR, +CALL_EXPR, +METHOD_CALL_EXPR, +WITH_CLEANUP_EXPR, +CLEANUP_POINT_EXPR, +PLACEHOLDER_EXPR, +WITH_RECORD_EXPR, +PLUS_EXPR, +MINUS_EXPR, +MULT_EXPR, +TRUNC_DIV_EXPR, +CEIL_DIV_EXPR, +FLOOR_DIV_EXPR, +ROUND_DIV_EXPR, +TRUNC_MOD_EXPR, +CEIL_MOD_EXPR, +FLOOR_MOD_EXPR, +ROUND_MOD_EXPR, +RDIV_EXPR, +EXACT_DIV_EXPR, +FIX_TRUNC_EXPR, +FIX_CEIL_EXPR, +FIX_FLOOR_EXPR, +FIX_ROUND_EXPR, +FLOAT_EXPR, +EXPON_EXPR, +NEGATE_EXPR, +MIN_EXPR, 
+MAX_EXPR, +ABS_EXPR, +FFS_EXPR, +LSHIFT_EXPR, +RSHIFT_EXPR, +LROTATE_EXPR, +RROTATE_EXPR, +BIT_IOR_EXPR, +BIT_XOR_EXPR, +BIT_AND_EXPR, +BIT_ANDTC_EXPR, +BIT_NOT_EXPR, +TRUTH_ANDIF_EXPR, +TRUTH_ORIF_EXPR, +TRUTH_AND_EXPR, +TRUTH_OR_EXPR, +TRUTH_XOR_EXPR, +TRUTH_NOT_EXPR, +LT_EXPR, +LE_EXPR, +GT_EXPR, +GE_EXPR, +EQ_EXPR, +NE_EXPR, +UNORDERED_EXPR, +ORDERED_EXPR, +UNLT_EXPR, +UNLE_EXPR, +UNGT_EXPR, +UNGE_EXPR, +UNEQ_EXPR, +IN_EXPR, +SET_LE_EXPR, +CARD_EXPR, +RANGE_EXPR, +CONVERT_EXPR, +NOP_EXPR, +NON_LVALUE_EXPR, +SAVE_EXPR, +UNSAVE_EXPR, +RTL_EXPR, +ADDR_EXPR, +REFERENCE_EXPR, +ENTRY_VALUE_EXPR, +COMPLEX_EXPR, +CONJ_EXPR, +REALPART_EXPR, +IMAGPART_EXPR, +PREDECREMENT_EXPR, +PREINCREMENT_EXPR, +POSTDECREMENT_EXPR, +POSTINCREMENT_EXPR, +VA_ARG_EXPR, +TRY_CATCH_EXPR, +TRY_FINALLY_EXPR, +GOTO_SUBROUTINE_EXPR, +POPDHC_EXPR, +POPDCC_EXPR, +LABEL_EXPR, +GOTO_EXPR, +RETURN_EXPR, +EXIT_EXPR, +LOOP_EXPR, +LABELED_BLOCK_EXPR, +EXIT_BLOCK_EXPR, +EXPR_WITH_FILE_LOCATION, +SWITCH_EXPR, + LAST_AND_UNUSED_TREE_CODE +}; +typedef union tree_node *tree; +struct tree_common +{ + union tree_node *chain; + union tree_node *type; + enum tree_code code : 8; + unsigned side_effects_flag : 1; + unsigned constant_flag : 1; + unsigned permanent_flag : 1; + unsigned addressable_flag : 1; + unsigned volatile_flag : 1; + unsigned readonly_flag : 1; + unsigned unsigned_flag : 1; + unsigned asm_written_flag: 1; + unsigned used_flag : 1; + unsigned nothrow_flag : 1; + unsigned static_flag : 1; + unsigned public_flag : 1; + unsigned private_flag : 1; + unsigned protected_flag : 1; + unsigned bounded_flag : 1; + unsigned lang_flag_0 : 1; + unsigned lang_flag_1 : 1; + unsigned lang_flag_2 : 1; + unsigned lang_flag_3 : 1; + unsigned lang_flag_4 : 1; + unsigned lang_flag_5 : 1; + unsigned lang_flag_6 : 1; +}; +union tree_node +{ + struct tree_common common; + }; +enum c_tree_code { + C_DUMMY_TREE_CODE = LAST_AND_UNUSED_TREE_CODE, +SRCLOC, +SIZEOF_EXPR, +ARROW_EXPR, +ALIGNOF_EXPR, +EXPR_STMT, 
+COMPOUND_STMT, +DECL_STMT, +IF_STMT, +FOR_STMT, +WHILE_STMT, +DO_STMT, +RETURN_STMT, +BREAK_STMT, +CONTINUE_STMT, +SWITCH_STMT, +GOTO_STMT, +LABEL_STMT, +ASM_STMT, +SCOPE_STMT, +CASE_LABEL, +STMT_EXPR, + LAST_C_TREE_CODE +}; +enum cplus_tree_code { + CP_DUMMY_TREE_CODE = LAST_C_TREE_CODE, +OFFSET_REF, +PTRMEM_CST, +NEW_EXPR, +VEC_NEW_EXPR, +DELETE_EXPR, +VEC_DELETE_EXPR, +SCOPE_REF, +MEMBER_REF, +TYPE_EXPR, +AGGR_INIT_EXPR, +THROW_EXPR, +EMPTY_CLASS_EXPR, +TEMPLATE_DECL, +TEMPLATE_PARM_INDEX, +TEMPLATE_TYPE_PARM, +TEMPLATE_TEMPLATE_PARM, +BOUND_TEMPLATE_TEMPLATE_PARM, +TYPENAME_TYPE, +TYPEOF_TYPE, +USING_DECL, +DEFAULT_ARG, +TEMPLATE_ID_EXPR, +CPLUS_BINDING, +OVERLOAD, +WRAPPER, +LOOKUP_EXPR, +FUNCTION_NAME, +MODOP_EXPR, +CAST_EXPR, +REINTERPRET_CAST_EXPR, +CONST_CAST_EXPR, +STATIC_CAST_EXPR, +DYNAMIC_CAST_EXPR, +DOTSTAR_EXPR, +TYPEID_EXPR, +PSEUDO_DTOR_EXPR, +SUBOBJECT, +CTOR_STMT, +CLEANUP_STMT, +START_CATCH_STMT, +CTOR_INITIALIZER, +RETURN_INIT, +TRY_BLOCK, +HANDLER, +TAG_DEFN, +IDENTITY_CONV, +LVALUE_CONV, +QUAL_CONV, +STD_CONV, +PTR_CONV, +PMEM_CONV, +BASE_CONV, +REF_BIND, +USER_CONV, +AMBIG_CONV, +RVALUE_CONV, + LAST_CPLUS_TREE_CODE +}; + +blah(){} + +convert_like_real (convs) + tree convs; +{ + switch (((enum tree_code) (convs)->common.code)) + { + case AMBIG_CONV: + return blah(); + default: + break; + }; + abort (); +} + +main() +{ + tree convs = (void *)malloc (sizeof (struct tree_common));; + + convs->common.code = AMBIG_CONV; + convert_like_real (convs); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000917-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000917-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000917-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20000917-1.c Wed Oct 9 
04:01:46 2019 @@ -0,0 +1,42 @@ +/* This bug exists in gcc-2.95, egcs-1.1.2, gcc-2.7.2 and probably + every other version as well. */ + +typedef struct int3 { int a, b, c; } int3; + +int3 +one (void) +{ + return (int3) { 1, 1, 1 }; +} + +int3 +zero (void) +{ + return (int3) { 0, 0, 0 }; +} + +int +main (void) +{ + int3 a; + + /* gcc allocates a temporary for the inner expression statement + to store the return value of `one'. + + gcc frees the temporaries for the inner expression statement. + + gcc realloates the same temporary slot to store the return + value of `zero'. + + gcc expands the call to zero ahead of the expansion of the + statement expressions. The temporary gets the value of `zero'. + + gcc expands statement expressions and the stale temporary is + clobbered with the value of `one'. The bad value is copied from + the temporary into *&a. */ + + *({ ({ one (); &a; }); }) = zero (); + if (a.a && a.b && a.c) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001009-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001009-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001009-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001009-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,12 @@ +int a,b; +main() +{ + int c=-2; + int d=0xfe; + int e=a&1; + int f=b&2; + if ((char)(c|(e&f)) == (char)d) + return 0; + else + abort(); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001009-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001009-2.c?rev=374156&view=auto ============================================================================== --- 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001009-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001009-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,21 @@ +int b=1; +int foo() +{ + int a; + int c; + a=0xff; + for (;b;b--) + { + c=1; + asm(""::"r"(c)); + c=(signed char)a; + } + if (c!=-1) + abort(); + return c; +} +int main() +{ + foo(); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001011-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001011-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001011-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001011-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,14 @@ +extern void abort(void); +extern int strcmp(const char *, const char *); + +int foo(const char *a) +{ + return strcmp(a, "main"); +} + +int main(void) +{ + if(foo(__FUNCTION__)) + abort(); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001013-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001013-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001013-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001013-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,17 @@ +struct x { + int a, b; +} z = { -4028, 4096 }; + +int foo(struct x *p, int y) +{ + if ((y & 0xff) != y || -p->b >= p->a) + return 1; + return 0; +} + +main() +{ + if (foo (&z, 10)) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001017-1.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001017-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001017-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001017-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,16 @@ + +void bug (double *Cref, char transb, int m, int n, int k, + double a, double *A, int fdA, double *B, int fdB, + double b, double *C, int fdC) +{ + if (C != Cref) abort (); +} + +int main (void) +{ + double A[1], B[1], C[1]; + + bug (C, 'B', 1, 2, 3, 4.0, A, 5, B, 6, 7.0, C, 8); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001017-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001017-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001017-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001017-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,17 @@ +void +fn_4parms (unsigned char a, long *b, long *c, unsigned int *d) +{ + if (*b != 1 || *c != 2 || *d != 3) + abort (); +} + +int +main () +{ + unsigned char a = 0; + unsigned long b = 1, c = 2; + unsigned int d = 3; + + fn_4parms (a, &b, &c, &d); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001024-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001024-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001024-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001024-1.c Wed Oct 9 
04:01:46 2019 @@ -0,0 +1,34 @@ +struct a; + +extern int baz (struct a *__restrict x); + +struct a { + long v; + long w; +}; + +struct b { + struct a c; + struct a d; +}; + +int bar (int x, const struct b *__restrict y, struct b *__restrict z) +{ + if (y->c.v || y->c.w != 250000 || y->d.v || y->d.w != 250000) + abort(); +} + +void foo(void) +{ + struct b x; + x.c.v = 0; + x.c.w = 250000; + x.d = x.c; + bar(0, &x, ((void *)0)); +} + +int main() +{ + foo(); + exit(0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001026-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001026-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001026-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001026-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,44 @@ +extern void abort (void); + +typedef struct { + long r[(19 + sizeof (long))/(sizeof (long))]; +} realvaluetype; + +typedef void *tree; + +static realvaluetype +real_value_from_int_cst (tree x, tree y) +{ + realvaluetype r; + int i; + for (i = 0; i < sizeof(r.r)/sizeof(long); ++i) + r.r[i] = -1; + return r; +} + +struct brfic_args +{ + tree type; + tree i; + realvaluetype d; +}; + +static void +build_real_from_int_cst_1 (data) + void * data; +{ + struct brfic_args *args = (struct brfic_args *) data; + args->d = real_value_from_int_cst (args->type, args->i); +} + +int main() +{ + struct brfic_args args; + + __builtin_memset (&args, 0, sizeof(args)); + build_real_from_int_cst_1 (&args); + + if (args.d.r[0] == 0) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001027-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001027-1.c?rev=374156&view=auto 
============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001027-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001027-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,11 @@ +int x,*p=&x; + +int main() +{ + int i=0; + x=1; + p[i]=2; + if (x != 2) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001031-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001031-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001031-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001031-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,37 @@ +extern void abort (void); +extern void exit (int); + +void t1 (int x) +{ + if (x != 4100) + abort (); +} + +int t2 (void) +{ + int i; + t1 ((i = 4096) + 4); + return i; +} + +void t3 (long long x) +{ + if (x != 0x80000fffULL) + abort (); +} + +long long t4 (void) +{ + long long i; + t3 ((i = 4096) + 0x7fffffffULL); + return i; +} + +main () +{ + if (t2 () != 4096) + abort (); + if (t4 () != 4096) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001101.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001101.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001101.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001101.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,41 @@ +/* { dg-require-effective-target untyped_assembly } */ +extern void abort(void); + +typedef struct +{ + unsigned int unchanging : 1; +} struc, *rtx; + 
+rtx dummy ( int *a, rtx *b) +{ + *a = 1; + *b = (rtx)7; + return (rtx)1; +} + +void bogus (insn, thread, delay_list) + rtx insn; + rtx thread; + rtx delay_list; +{ + rtx new_thread; + int must_annul; + + delay_list = dummy ( &must_annul, &new_thread); + if (delay_list == 0 && new_thread ) + { + thread = new_thread; + } + if (delay_list && must_annul) + insn->unchanging = 1; + if (new_thread != thread ) + abort(); +} + +int main() +{ + struc baz; + bogus (&baz, (rtx)7, 0); + exit(0); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001108-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001108-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001108-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001108-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,25 @@ +long long +signed_poly (long long sum, long x) +{ + sum += (long long) (long) sum * (long long) x; + return sum; +} + +unsigned long long +unsigned_poly (unsigned long long sum, unsigned long x) +{ + sum += (unsigned long long) (unsigned long) sum * (unsigned long long) x; + return sum; +} + +int +main (void) +{ + if (signed_poly (2LL, -3) != -4LL) + abort (); + + if (unsigned_poly (2ULL, 3) != 8ULL) + abort (); + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001111-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001111-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001111-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001111-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,41 @@ + +static int next_buffer = 
0; +void bar (void); + +static int t = 1, u = 0; + +long +foo (unsigned int offset) +{ + unsigned i, buffer; + int x; + char *data; + + i = u; + if (i) + return i * 0xce2f; + + buffer = next_buffer; + data = buffer * 0xce2f; + for (i = 0; i < 2; i++) + bar (); + buffer = next_buffer; + return buffer * 0xce2f + offset; + +} + +void +bar (void) +{ +} + +int +main () +{ + if (foo (3) != 3) + abort (); + next_buffer = 1; + if (foo (2) != 0xce2f + 2) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001112-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001112-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001112-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001112-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,10 @@ +int main () +{ + long long i = 1; + + i = i * 2 + 1; + + if (i != 3) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001121-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001121-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001121-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001121-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,24 @@ +/* { dg-options "-fgnu89-inline" } */ + +extern void abort (void); +extern void exit (int); + +double d; + +__inline__ double foo (void) +{ + return d; +} + +__inline__ int bar (void) +{ + foo(); + return 0; +} + +int main (void) +{ + if (bar ()) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001124-1.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001124-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001124-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001124-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,76 @@ + +struct inode { + long long i_size; + struct super_block *i_sb; +}; + +struct file { + long long f_pos; +}; + +struct super_block { + int s_blocksize; + unsigned char s_blocksize_bits; + int s_hs; +}; + +static char * +isofs_bread(unsigned int block) +{ + if (block) + abort (); + exit(0); +} + +static int +do_isofs_readdir(struct inode *inode, struct file *filp) +{ + int bufsize = inode->i_sb->s_blocksize; + unsigned char bufbits = inode->i_sb->s_blocksize_bits; + unsigned int block, offset; + char *bh = 0; + int hs; + + if (filp->f_pos >= inode->i_size) + return 0; + + offset = filp->f_pos & (bufsize - 1); + block = filp->f_pos >> bufbits; + hs = inode->i_sb->s_hs; + + while (filp->f_pos < inode->i_size) { + if (!bh) + bh = isofs_bread(block); + + hs += block << bufbits; + + if (hs == 0) + filp->f_pos++; + + if (offset >= bufsize) + offset &= bufsize - 1; + + if (*bh) + filp->f_pos++; + + filp->f_pos++; + } + return 0; +} + +struct super_block s; +struct inode i; +struct file f; + +int +main(int argc, char **argv) +{ + s.s_blocksize = 512; + s.s_blocksize_bits = 9; + i.i_size = 2048; + i.i_sb = &s; + f.f_pos = 0; + + do_isofs_readdir(&i,&f); + abort (); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001130-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001130-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001130-1.c (added) 
+++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001130-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,21 @@ +static inline int bar(void) { return 1; } +static int mem[3]; + +static int foo(int x) +{ + if (x != 0) + return x; + + mem[x++] = foo(bar()); + + if (x != 1) + abort(); + + return 0; +} + +int main() +{ + foo(0); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001130-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001130-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001130-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001130-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,31 @@ +static int which_alternative = 3; + +static const char *i960_output_ldconst (void); + +static const char * +output_25 (void) +{ + switch (which_alternative) + { + case 0: + return "mov %1,%0"; + case 1: + return i960_output_ldconst (); + case 2: + return "ld %1,%0"; + case 3: + return "st %1,%0"; + } +} + +static const char *i960_output_ldconst (void) +{ + return "foo"; +} +int main(void) +{ + const char *s = output_25 () ; + if (s[0] != 's') + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001203-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001203-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001203-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001203-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,24 @@ +/* Origin: PR c/410 from Jan Echternach + , + adapted to a testcase by Joseph Myers . 
+*/ + +extern void exit (int); + +static void +foo (void) +{ + struct { + long a; + char b[1]; + } x = { 2, { 0 } }; +} + +int +main (void) +{ + int tmp; + foo (); + tmp = 1; + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001203-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001203-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001203-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001203-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,123 @@ +struct obstack +{ + long chunk_size; + struct _obstack_chunk *chunk; + char *object_base; + char *next_free; + char *chunk_limit; + int alignment_mask; + unsigned maybe_empty_object; +}; + +struct objfile + { + struct objfile *next; + struct obstack type_obstack; + }; + +struct type + { + unsigned length; + struct objfile *objfile; + short nfields; + struct field + { + union field_location + { + int bitpos; + unsigned long physaddr; + char *physname; + } + loc; + int bitsize; + struct type *type; + char *name; + } + *fields; + }; + +struct type *alloc_type (void); +void * xmalloc (unsigned int z); +void _obstack_newchunk (struct obstack *o, int i); +void get_discrete_bounds (long long *lowp, long long *highp); + +extern void *memset(void *, int, __SIZE_TYPE__); + +struct type * +create_array_type (struct type *result_type, struct type *element_type) +{ + long long low_bound, high_bound; + if (result_type == ((void *)0)) + { + result_type = alloc_type (); + } + get_discrete_bounds (&low_bound, &high_bound); + (result_type)->length = + (element_type)->length * (high_bound - low_bound + 1); + (result_type)->nfields = 1; + (result_type)->fields = + (struct field *) ((result_type)->objfile != ((void *)0) + ? 
( + { + struct obstack *__h = + (&(result_type)->objfile -> type_obstack); + { + struct obstack *__o = (__h); + int __len = ((sizeof (struct field))); + if (__o->chunk_limit - __o->next_free < __len) + _obstack_newchunk (__o, __len); + __o->next_free += __len; (void) 0; + }; + ({ + struct obstack *__o1 = (__h); + void *value; + value = (void *) __o1->object_base; + if (__o1->next_free == value) + __o1->maybe_empty_object = 1; + __o1->next_free = (((((__o1->next_free) - (char *) 0) + +__o1->alignment_mask) + & ~ (__o1->alignment_mask)) + + (char *) 0); + if (__o1->next_free - (char *)__o1->chunk + > __o1->chunk_limit - (char *)__o1->chunk) + __o1->next_free = __o1->chunk_limit; + __o1->object_base = __o1->next_free; + value; + }); + }) : xmalloc (sizeof (struct field))); + return (result_type); +} + +struct type * +alloc_type (void) +{ + abort (); +} +void * xmalloc (unsigned int z) +{ + return 0; +} +void _obstack_newchunk (struct obstack *o, int i) +{ + abort (); +} +void +get_discrete_bounds (long long *lowp, long long *highp) +{ + *lowp = 0; + *highp = 2; +} + +int main(void) +{ + struct type element_type; + struct type result_type; + + memset (&element_type, 0, sizeof (struct type)); + memset (&result_type, 0, sizeof (struct type)); + element_type.length = 4; + create_array_type (&result_type, &element_type); + if (result_type.length != 12) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001221-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001221-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001221-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001221-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,7 @@ +int main () +{ + unsigned long long a; + if (! 
(a = 0xfedcba9876543210ULL)) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001228-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001228-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001228-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001228-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,28 @@ +int foo1(void) +{ + union { + char a[sizeof (unsigned)]; + unsigned b; + } u; + + u.b = 0x01; + return u.a[0]; +} + +int foo2(void) +{ + volatile union { + char a[sizeof (unsigned)]; + unsigned b; + } u; + + u.b = 0x01; + return u.a[0]; +} + +int main(void) +{ + if (foo1() != foo2()) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001229-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001229-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001229-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20001229-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,43 @@ +/* This testcase originally provoked an unaligned access fault on Alpha. + + Since Digital Unix and Linux (and probably others) by default fix + these up in the kernel, the failure was not visible unless one + is sitting at the console examining logs. + + So: If we know how, ask the kernel to deliver SIGBUS instead so + that the test case visibly fails. 
*/ + +#if defined(__alpha__) && defined(__linux__) +#include +#include + +static inline int +setsysinfo(unsigned long op, void *buffer, unsigned long size, + int *start, void *arg, unsigned long flag) +{ + syscall(__NR_osf_setsysinfo, op, buffer, size, start, arg, flag); +} + +static void __attribute__((constructor)) +trap_unaligned(void) +{ + unsigned int buf[2]; + buf[0] = SSIN_UACPROC; + buf[1] = UAC_SIGBUS | UAC_NOPRINT; + setsysinfo(SSI_NVPAIRS, buf, 1, 0, 0, 0); +} +#endif /* alpha */ + +void foo(char *a, char *b) { } + +void showinfo() +{ + char uname[33] = "", tty[38] = "/dev/"; + foo(uname, tty); +} + +int main() +{ + showinfo (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010106-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010106-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010106-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010106-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,30 @@ +/* Copyright 2001 Free Software Foundation + Contributed by Alexandre Oliva */ + +int f(int i) { + switch (i) + { + case -2: + return 33; + case -1: + return 0; + case 0: + return 7; + case 1: + return 4; + case 2: + return 3; + case 3: + return 15; + case 4: + return 9; + default: + abort (); + } +} + +int main() { + if (f(-1)) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010114-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010114-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010114-1.c (added) +++ 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010114-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,15 @@ +/* Origin: PR c/1540 from Mattias Lampe , + adapted to a testcase by Joseph Myers . + GCC 2.95.2 fails, CVS GCC of 2001-01-13 passes. */ +extern void abort (void); +extern void exit (int); + +int +main (void) +{ + int array1[1] = { 1 }; + int array2[2][1]= { { 1 }, { 0 } }; + if (array1[0] != 1) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010116-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010116-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010116-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010116-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,32 @@ +/* Distilled from optimization/863. */ + +extern void abort (void); +extern void exit (int); +extern void ok (int); + +typedef struct +{ + int x, y, z; +} Data; + +void find (Data *first, Data *last) +{ + int i; + for (i = (last - first) >> 2; i > 0; --i) + ok(i); + abort (); +} + +void ok(int i) +{ + if (i != 1) + abort (); + exit (0); +} + +int +main () +{ + Data DataList[4]; + find (DataList + 0, DataList + 4); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010118-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010118-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010118-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010118-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,37 @@ +typedef struct { + int a, b, c, d, e, f; +} A; + +void foo (A *v, int w, int x, int *y, int *z) +{ 
+} + +void +bar (A *v, int x, int y, int w, int h) +{ + if (v->a != x || v->b != y) { + int oldw = w; + int oldh = h; + int e = v->e; + int f = v->f; + int dx, dy; + foo(v, 0, 0, &w, &h); + dx = (oldw - w) * (double) e/2.0; + dy = (oldh - h) * (double) f/2.0; + x += dx; + y += dy; + v->a = x; + v->b = y; + v->c = w; + v->d = h; + } +} + +int main () +{ + A w = { 100, 110, 20, 30, -1, -1 }; + bar (&w,400,420,50,70); + if (w.d != 70) + abort(); + exit(0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010119-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010119-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010119-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010119-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,23 @@ +#ifdef __OPTIMIZE__ +extern void undef (void); + +void bar (unsigned x) { } +void baz (unsigned x) { } + +extern inline void foo (int a, int b) +{ + int c = 0; + while (c++ < b) + (__builtin_constant_p (a) ? ((a) > 20000 ? 
undef () : bar (a)) : baz (a)); +} +#else +void foo (int a, int b) +{ +} +#endif + +int main (void) +{ + foo(10, 100); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010122-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010122-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010122-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010122-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,204 @@ +/* { dg-skip-if "requires frame pointers" { *-*-* } "-fomit-frame-pointer" "" } */ +/* { dg-require-effective-target return_address } */ + +extern void exit (int); +extern void abort (void); +extern void *alloca (__SIZE_TYPE__); +char *dummy (void); + +#define NOINLINE __attribute__((noinline)) __attribute__ ((noclone)) + +void *save_ret1[6]; +void *test4a (char *); +void *test5a (char *); +void *test6a (char *); + +void NOINLINE *test1 (void) +{ + void * temp; + temp = __builtin_return_address (0); + return temp; +} + +void NOINLINE *test2 (void) +{ + void * temp; + dummy (); + temp = __builtin_return_address (0); + return temp; +} + +void NOINLINE *test3 (void) +{ + void * temp; + temp = __builtin_return_address (0); + dummy (); + return temp; +} + +void NOINLINE *test4 (void) +{ + char * save = (char*) alloca (4); + + return test4a (save); +} + +void *NOINLINE test4a (char * p) +{ + void * temp; + temp = __builtin_return_address (1); + return temp; +} + +void NOINLINE *test5 (void) +{ + char * save = (char*) alloca (4); + + return test5a (save); +} + +void NOINLINE *test5a (char * p) +{ + void * temp; + dummy (); + temp = __builtin_return_address (1); + return temp; +} + +void NOINLINE *test6 (void) +{ + char * save = (char*) alloca (4); + + return test6a (save); +} + +void NOINLINE *test6a (char * p) +{ + void * temp; + 
temp = __builtin_return_address (1); + dummy (); + return temp; +} + +void *(*func1[6])(void) = { test1, test2, test3, test4, test5, test6 }; + +char * NOINLINE call_func1 (int i) +{ + save_ret1[i] = func1[i] (); +} + +static void *ret_addr; +void *save_ret2[6]; +void test10a (char *); +void test11a (char *); +void test12a (char *); + +void NOINLINE test7 (void) +{ + ret_addr = __builtin_return_address (0); + return; +} + +void NOINLINE test8 (void) +{ + dummy (); + ret_addr = __builtin_return_address (0); + return; +} + +void NOINLINE test9 (void) +{ + ret_addr = __builtin_return_address (0); + dummy (); + return; +} + +void NOINLINE test10 (void) +{ + char * save = (char*) alloca (4); + + test10a (save); +} + +void NOINLINE test10a (char * p) +{ + ret_addr = __builtin_return_address (1); + return; +} + +void NOINLINE test11 (void) +{ + char * save = (char*) alloca (4); + + test11a (save); +} + +void NOINLINE test11a (char * p) +{ + dummy (); + ret_addr = __builtin_return_address (1); + return; +} + +void NOINLINE test12 (void) +{ + char * save = (char*) alloca (4); + + test12a (save); +} + +void NOINLINE test12a (char * p) +{ + ret_addr = __builtin_return_address (1); + dummy (); + return; +} + +char * dummy (void) +{ + char * save = (char*) alloca (4); + + return save; +} + +void (*func2[6])(void) = { test7, test8, test9, test10, test11, test12 }; + +void NOINLINE call_func2 (int i) +{ + func2[i] (); + save_ret2[i] = ret_addr; +} + +int main (void) +{ + int i; + + for (i = 0; i < 6; i++) { + call_func1(i); + } + + if (save_ret1[0] != save_ret1[1] + || save_ret1[1] != save_ret1[2]) + abort (); + if (save_ret1[3] != save_ret1[4] + || save_ret1[4] != save_ret1[5]) + abort (); + if (save_ret1[3] && save_ret1[0] != save_ret1[3]) + abort (); + + + for (i = 0; i < 6; i++) { + call_func2(i); + } + + if (save_ret2[0] != save_ret2[1] + || save_ret2[1] != save_ret2[2]) + abort (); + if (save_ret2[3] != save_ret2[4] + || save_ret2[4] != save_ret2[5]) + abort (); + if 
(save_ret2[3] && save_ret2[0] != save_ret2[3]) + abort (); + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010123-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010123-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010123-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010123-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,23 @@ +extern void abort (); +extern void exit (int); + +struct s +{ + int value; + char *string; +}; + +int main (void) +{ + int i; + for (i = 0; i < 4; i++) + { + struct s *t = & (struct s) { 3, "hey there" }; + if (t->value != 3) + abort(); + t->value = 4; + if (t->value != 4) + abort(); + } + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010129-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010129-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010129-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010129-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,69 @@ +/* { dg-options "-mtune=i686" { target { { i?86-*-* x86_64-*-* } && ia32 } } } */ + +extern void abort (void); +extern void exit (int); + +long baz1 (void *a) +{ + static long l; + return l++; +} + +int baz2 (const char *a) +{ + return 0; +} + +int baz3 (int i) +{ + if (!i) + abort (); + return 1; +} + +void **bar; + +int foo (void *a, long b, int c) +{ + int d = 0, e, f = 0, i; + char g[256]; + void **h; + + g[0] = '\n'; + g[1] = 0; + + while (baz1 (a) < b) { + if (g[0] != ' ' && g[0] != '\t') { + f = 1; + e = 0; + if (!d && baz2 (g) == 0) { + if ((c & 0x10) == 0) + continue; 
+ e = d = 1; + } + if (!((c & 0x10) && (c & 0x4000) && e) && (c & 2)) + continue; + if ((c & 0x2000) && baz2 (g) == 0) + continue; + if ((c & 0x1408) && baz2 (g) == 0) + continue; + if ((c & 0x200) && baz2 (g) == 0) + continue; + if (c & 0x80) { + for (h = bar, i = 0; h; h = (void **)*h, i++) + if (baz3 (i)) + break; + } + f = 0; + } + } + return 0; +} + +int main () +{ + void *n = 0; + bar = &n; + foo (&n, 1, 0xc811); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010206-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010206-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010206-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010206-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,16 @@ +int foo (void) +{ + int i; +#line 1 "20010206-1.c" + if (0) i = 1; else i +#line 1 "20010206-1.c" + = 26; + return i; +} + +int main () +{ + if (foo () != 26) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010209-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010209-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010209-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010209-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,22 @@ +/* { dg-require-effective-target alloca } */ +int b; +int foo (void) +{ + int x[b]; + int bar (int t[b]) + { + int i; + for (i = 0; i < b; i++) + t[i] = i + (i > 0 ? 
t[i-1] : 0); + return t[b-1]; + } + return bar (x); +} + +int main () +{ + b = 6; + if (foo () != 15) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010221-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010221-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010221-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010221-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,17 @@ + +int n = 2; + +main () +{ + int i, x = 45; + + for (i = 0; i < n; i++) + { + if (i != 0) + x = ( i > 0 ) ? i : 0; + } + + if (x != 1) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010222-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010222-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010222-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010222-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,9 @@ +int a[2] = { 18, 6 }; + +int main () +{ + int b = (-3 * a[0] -3 * a[1]) / 12; + if (b != -6) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010224-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010224-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010224-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010224-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,43 @@ +extern void abort (void); + 
+typedef signed short int16_t; +typedef unsigned short uint16_t; + +int16_t logadd (int16_t *a, int16_t *b); +void ba_compute_psd (int16_t start); + +int16_t masktab[6] = { 1, 2, 3, 4, 5}; +int16_t psd[6] = { 50, 40, 30, 20, 10}; +int16_t bndpsd[6] = { 1, 2, 3, 4, 5}; + +void ba_compute_psd (int16_t start) +{ + int i,j,k; + int16_t lastbin = 4; + + j = start; + k = masktab[start]; + + bndpsd[k] = psd[j]; + j++; + + for (i = j; i < lastbin; i++) { + bndpsd[k] = logadd(&bndpsd[k], &psd[j]); + j++; + } +} + +int16_t logadd (int16_t *a, int16_t *b) +{ + return *a + *b; +} + +int main (void) +{ + int i; + + ba_compute_psd (0); + + if (bndpsd[1] != 140) abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010325-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010325-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010325-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010325-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,15 @@ +/* Origin: Joseph Myers . + + This tests for inconsistency in whether wide STRING_CSTs use the host + or the target endianness. 
*/ + +extern void exit (int); +extern void abort (void); + +int +main (void) +{ + if (L"a" "b"[1] != L'b') + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010329-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010329-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010329-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010329-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,14 @@ +#include <limits.h> + +int main (void) +{ + void *x = ((void *)((unsigned int)INT_MAX + 2)); + void *y = ((void *)((unsigned long)LONG_MAX + 2)); + if (x >= ((void *)((unsigned int)INT_MAX + 1)) + && x <= ((void *)((unsigned int)INT_MAX + 6)) + && y >= ((void *)((unsigned long)LONG_MAX + 1)) + && y <= ((void *)((unsigned long)LONG_MAX + 6))) + exit (0); + else + abort (); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010403-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010403-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010403-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010403-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,36 @@ +void b (int *); +void c (int, int); +void d (int); + +int e; + +void a (int x, int y) +{ + int f = x ?
e : 0; + int z = y; + + b (&y); + c (z, y); + d (f); +} + +void b (int *y) +{ + (*y)++; +} + +void c (int x, int y) +{ + if (x == y) + abort (); +} + +void d (int x) +{ +} + +int main (void) +{ + a (0, 0); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010409-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010409-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010409-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010409-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,43 @@ +typedef __SIZE_TYPE__ size_t; +extern size_t strlen (const char *s); + +typedef struct A { + int a, b; +} A; + +typedef struct B { + struct A **a; + int b; +} B; + +A *a; +int b = 1, c; +B d[1]; + +void foo (A *x, const char *y, int z) +{ + c = y[4] + z * 25; +} + +A *bar (const char *v, int w, int x, const char *y, int z) +{ + if (w) + abort (); + exit (0); +} + +void test (const char *x, int *y) +{ + foo (d->a[d->b], "test", 200); + d->a[d->b] = bar (x, b ? 
0 : 65536, strlen (x), "test", 201); + d->a[d->b]->a++; + if (y) + d->a[d->b]->b = *y; +} + +int main () +{ + d->b = 0; + d->a = &a; + test ("", 0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010422-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010422-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010422-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010422-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,15 @@ +unsigned int foo(unsigned int x) +{ + if (x < 5) + x = 4; + else + x = 8; + return x; +} + +int main(void) +{ + if (foo (8) != 8) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010518-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010518-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010518-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010518-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,28 @@ +/* Leaf functions with many arguments. 
*/ + +int +add (int a, + int b, + int c, + int d, + int e, + int f, + int g, + int h, + int i, + int j, + int k, + int l, + int m) +{ + return a+b+c+d+e+f+g+h+i+j+k+l+m; +} + +int +main(void) +{ + if (add (1,2,3,4,5,6,7,8,9,10,11,12,13) != 91) + abort (); + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010518-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010518-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010518-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010518-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,38 @@ +/* Mis-aligned packed structures. */ + +typedef struct +{ + char b0; + char b1; + char b2; + char b3; + char b4; + char b5; +} __attribute__ ((packed)) b_struct; + + +typedef struct +{ + short a; + long b; + short c; + short d; + b_struct e; +} __attribute__ ((packed)) a_struct; + + +int +main(void) +{ + volatile a_struct *a; + volatile a_struct b; + + a = &b; + *a = (a_struct){1,2,3,4}; + a->e.b4 = 'c'; + + if (a->a != 1 || a->b != 2 || a->c != 3 || a->d != 4 || a->e.b4 != 'c') + abort (); + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010520-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010520-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010520-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010520-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,12 @@ +static unsigned int expr_hash_table_size = 1; + +int +main () +{ + int del = 1; + unsigned int i = 0; + + if (i < expr_hash_table_size && del) + exit (0); + abort (); +} Added: 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010604-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010604-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010604-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010604-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,14 @@ +#include <stdbool.h> + +int f (int a, int b, int c, _Bool d, _Bool e, _Bool f, char g) +{ + if (g != 1 || d != true || e != true || f != true) abort (); + return a + b + c; +} + +int main (void) +{ + if (f (1, 2, -3, true, true, true, '\001')) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010605-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010605-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010605-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010605-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,11 @@ +int main () +{ + int v = 42; + + inline int fff (int x) + { + return x*10; + } + + return (fff (v) != 420); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010605-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010605-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010605-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010605-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,35 @@ +void foo (), bar (), baz (); +int main () +{ + __complex__ double x; + __complex__
float y; + __complex__ long double z; + __real__ x = 1.0; + __imag__ x = 2.0; + foo (x); + __real__ y = 3.0f; + __imag__ y = 4.0f; + bar (y); + __real__ z = 5.0L; + __imag__ z = 6.0L; + baz (z); + exit (0); +} + +void foo (__complex__ double x) +{ + if (__real__ x != 1.0 || __imag__ x != 2.0) + abort (); +} + +void bar (__complex__ float x) +{ + if (__real__ x != 3.0f || __imag__ x != 4.0f) + abort (); +} + +void baz (__complex__ long double x) +{ + if (__real__ x != 5.0L || __imag__ x != 6.0L) + abort (); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010711-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010711-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010711-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010711-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,13 @@ +void foo (int *a) {} + +int main () +{ + int a; + if (&a == 0) + abort (); + else + { + foo (&a); + exit (0); + } +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010717-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010717-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010717-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010717-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,22 @@ +extern void abort (void); + +int +main () +{ + int i, j; + unsigned long u, r1, r2; + + i = -16; + j = 1; + u = i + j; + + /* no sign extension upon shift */ + r1 = u >> 1; + /* sign extension upon shift, but there shouldn't be */ + r2 = ((unsigned long) (i + j)) >> 1; + + if (r1 != r2) + abort (); + + return 0; +} Added: 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010723-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010723-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010723-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010723-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,16 @@ +int +test () +{ + int biv,giv; + for (biv = 0, giv = 0; giv != 8; biv++) + giv = biv*8; + return giv; +} + + +main() +{ + if (test () != 8) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010904-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010904-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010904-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010904-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,19 @@ +/* If some target has a Max alignment less than 32, please create + a #ifdef around the alignment and add your alignment. 
*/ +#ifdef __pdp11__ +#define alignment 2 +#else +#define alignment 32 +#endif + +typedef struct x { int a; int b; } __attribute__((aligned(alignment))) X; +typedef struct y { X x[32]; int c; } Y; + +Y y[2]; + +int main(void) +{ + if (((char *)&y[1] - (char *)&y[0]) & 31) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010904-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010904-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010904-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010904-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,19 @@ +/* If some target has a Max alignment less than 32, please create + a #ifdef around the alignment and add your alignment. */ +#ifdef __pdp11__ +#define alignment 2 +#else +#define alignment 32 +#endif + +typedef struct x { int a; int b; } __attribute__((aligned(alignment))) X; +typedef struct y { X x; X y[31]; int c; } Y; + +Y y[2]; + +int main(void) +{ + if (((char *)&y[1] - (char *)&y[0]) & 31) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010910-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010910-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010910-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010910-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,59 @@ +/* Test case contributed by Ingo Rohloff . + Code distilled from Linux kernel. */ + +/* Compile this program with a gcc-2.95.2 using + "gcc -O2" and run it. 
The result will be that + rx_ring[1].next == 0 (it should be == 14) + and + ep.skbuff[4] == 5 (it should be 0) +*/ + +extern void abort(void); + +struct epic_rx_desc +{ + unsigned int next; +}; + +struct epic_private +{ + struct epic_rx_desc *rx_ring; + unsigned int rx_skbuff[5]; +}; + +static void epic_init_ring(struct epic_private *ep) +{ + int i; + + for (i = 0; i < 5; i++) + { + ep->rx_ring[i].next = 10 + (i+1)*2; + ep->rx_skbuff[i] = 0; + } + ep->rx_ring[i-1].next = 10; +} + +static int check_rx_ring[5] = { 12,14,16,18,10 }; + +int main(void) +{ + struct epic_private ep; + struct epic_rx_desc rx_ring[5]; + int i; + + for (i=0;i<5;i++) + { + rx_ring[i].next=0; + ep.rx_skbuff[i]=5; + } + + ep.rx_ring=rx_ring; + epic_init_ring(&ep); + + for (i=0;i<5;i++) + { + if ( rx_ring[i].next != check_rx_ring[i] ) abort(); + if ( ep.rx_skbuff[i] != 0 ) abort(); + } + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010915-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010915-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010915-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010915-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,71 @@ +/* Bug in reorg.c, deleting the "++" in the last loop in main. + Origin: . 
*/ + +extern void f (void); +extern int x (int, char **); +extern int r (const char *); +extern char *s (char *, char **); +extern char *m (char *); +char *u; +char *h; +int check = 0; +int o = 0; + +int main (int argc, char **argv) +{ + char *args[] = {"a", "b", "c", "d", "e"}; + if (x (5, args) != 0 || check != 2 || o != 5) + abort (); + exit (0); +} + +int x (int argc, char **argv) +{ + int opt = 0; + char *g = 0; + char *p = 0; + + if (argc > o && argc > 2 && argv[o]) + { + g = s (argv[o], &p); + if (g) + { + *g++ = '\0'; + h = s (g, &p); + if (g == p) + h = m (g); + } + u = s (argv[o], &p); + if (argv[o] == p) + u = m (argv[o]); + } + else + abort (); + + while (++o < argc) + if (r (argv[o]) == 0) + return 1; + + return 0; +} + +char *m (char *x) { abort (); } +char *s (char *v, char **pp) +{ + if (strcmp (v, "a") != 0 || check++ > 1) + abort (); + *pp = v+1; + return 0; +} + +int r (const char *f) +{ + static char c[2] = "b"; + static int cnt = 0; + + if (*f != *c || f[1] != c[1] || cnt > 3) + abort (); + c[0]++; + cnt++; + return 1; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010924-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010924-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010924-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010924-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,71 @@ +/* Verify that flexible arrays can be initialized from STRING_CST + constructors. */ + +/* Baselines. */ +struct { + char a1c; + char *a1p; +} a1 = { + '4', + "62" +}; + +struct { + char a2c; + char a2p[2]; +} a2 = { + 'v', + "cq" +}; + +/* The tests. 
*/ +struct { + char a3c; + char a3p[]; +} a3 = { + 'o', + "wx" +}; + +struct { + char a4c; + char a4p[]; +} a4 = { + '9', + { 'e', 'b' } +}; + +main() +{ + if (a1.a1c != '4') + abort(); + if (a1.a1p[0] != '6') + abort(); + if (a1.a1p[1] != '2') + abort(); + if (a1.a1p[2] != '\0') + abort(); + + if (a2.a2c != 'v') + abort(); + if (a2.a2p[0] != 'c') + abort(); + if (a2.a2p[1] != 'q') + abort(); + + if (a3.a3c != 'o') + abort(); + if (a3.a3p[0] != 'w') + abort(); + if (a3.a3p[1] != 'x') + abort(); + + if (a4.a4c != '9') + abort(); + if (a4.a4p[0] != 'e') + abort(); + if (a4.a4p[1] != 'b') + abort(); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010925-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010925-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010925-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20010925-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,24 @@ +extern void exit(int); +extern void abort (void); + +extern void * memcpy (void *, const void *, __SIZE_TYPE__); +int foo (void *, void *, unsigned int c); + +int src[10]; +int dst[10]; + +int main() +{ + if (foo (dst, src, 10) != 0) + abort(); + exit(0); +} + +int foo (void *a, void *b, unsigned int c) +{ + if (c == 0) + return 1; + + memcpy (a, b, c); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20011008-3.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20011008-3.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20011008-3.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20011008-3.c Wed Oct 9 
04:01:46 2019 @@ -0,0 +1,107 @@ +/* { dg-add-options stack_size } */ + +extern void exit (int); +extern void abort (void); + +typedef unsigned int u_int32_t; +typedef unsigned char u_int8_t; +typedef int int32_t; + +typedef enum { + TXNLIST_DELETE, + TXNLIST_LSN, + TXNLIST_TXNID, + TXNLIST_PGNO +} db_txnlist_type; + +struct __db_lsn; typedef struct __db_lsn DB_LSN; +struct __db_lsn { + u_int32_t file; + u_int32_t offset; +}; +struct __db_txnlist; typedef struct __db_txnlist DB_TXNLIST; + +struct __db_txnlist { + db_txnlist_type type; + struct { struct __db_txnlist *le_next; struct __db_txnlist **le_prev; } links; + union { + struct { + u_int32_t txnid; + int32_t generation; + int32_t aborted; + } t; + struct { + + + u_int32_t flags; + int32_t fileid; + u_int32_t count; + char *fname; + } d; + struct { + int32_t ntxns; + int32_t maxn; + DB_LSN *lsn_array; + } l; + struct { + int32_t nentries; + int32_t maxentry; + char *fname; + int32_t fileid; + void *pgno_array; + u_int8_t uid[20]; + } p; + } u; +}; + +int log_compare (const DB_LSN *a, const DB_LSN *b) +{ + return 1; +} + + +int +__db_txnlist_lsnadd(int val, DB_TXNLIST *elp, DB_LSN *lsnp, u_int32_t flags) +{ + int i; + + for (i = 0; i < (!(flags & (0x1)) ? 
1 : elp->u.l.ntxns); i++) + { + int __j; + DB_LSN __tmp; + val++; + for (__j = 0; __j < elp->u.l.ntxns - 1; __j++) + if (log_compare(&elp->u.l.lsn_array[__j], &elp->u.l.lsn_array[__j + 1]) < 0) + { + __tmp = elp->u.l.lsn_array[__j]; + elp->u.l.lsn_array[__j] = elp->u.l.lsn_array[__j + 1]; + elp->u.l.lsn_array[__j + 1] = __tmp; + } + } + + *lsnp = elp->u.l.lsn_array[0]; + return val; +} + +#if defined (STACK_SIZE) && STACK_SIZE < 12350 +#define VLEN (STACK_SIZE/10) +#else +#define VLEN 1235 +#endif + +int main (void) +{ + DB_TXNLIST el; + DB_LSN lsn, lsn_a[VLEN]; + + el.u.l.ntxns = VLEN-1; + el.u.l.lsn_array = lsn_a; + + if (__db_txnlist_lsnadd (0, &el, &lsn, 0) != 1) + abort (); + + if (__db_txnlist_lsnadd (0, &el, &lsn, 1) != VLEN-1) + abort (); + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20011019-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20011019-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20011019-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20011019-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,18 @@ +extern void exit (int); +extern void abort (void); + +struct { int a; int b[5]; } x; +int *y; + +int foo (void) +{ + return y - x.b; +} + +int main (void) +{ + y = x.b; + if (foo ()) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20011024-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20011024-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20011024-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20011024-1.c Wed Oct 9 04:01:46 
2019 @@ -0,0 +1,22 @@ +/* Test whether store motion recognizes pure functions as potentially reading + any memory. */ + +typedef __SIZE_TYPE__ size_t; +extern void *memcpy (void *dest, const void *src, size_t n); +extern size_t strlen (const char *s); +extern int strcmp (const char *s1, const char *s2) __attribute__((pure)); + +char buf[50]; + +static void foo (void) +{ + if (memcpy (buf, "abc", 4) != buf) abort (); + if (strcmp (buf, "abc")) abort (); + memcpy (buf, "abcdefgh", strlen ("abcdefgh") + 1); +} + +int main (void) +{ + foo (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20011109-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20011109-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20011109-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20011109-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,58 @@ +void fail1(void) +{ + abort (); +} +void fail2(void) +{ + abort (); +} +void fail3(void) +{ + abort (); +} +void fail4(void) +{ + abort (); +} + + +void foo(long x) +{ + switch (x) + { + case -6: + fail1 (); break; + case 0: + fail2 (); break; + case 1: case 2: + break; + case 3: case 4: case 5: + fail3 (); + break; + default: + fail4 (); + break; + } + switch (x) + { + + case -3: + fail1 (); break; + case 0: case 4: + fail2 (); break; + case 1: case 3: + break; + case 2: case 8: + abort (); + break; + default: + fail4 (); + break; + } +} + +int main(void) +{ + foo (1); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20011109-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20011109-2.c?rev=374156&view=auto ============================================================================== --- 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20011109-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20011109-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,10 @@ +int main(void) +{ + char *c1 = "foo"; + char *c2 = "foo"; + int i; + for (i = 0; i < 3; i++) + if (c1[i] != c2[i]) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20011113-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20011113-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20011113-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20011113-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,55 @@ +typedef __SIZE_TYPE__ size_t; +extern void *memcpy (void *__restrict, const void *__restrict, size_t); +extern void abort (void); +extern void exit (int); + +typedef struct t +{ + unsigned a : 16; + unsigned b : 8; + unsigned c : 8; + long d[4]; +} *T; + +typedef struct { + long r[3]; +} U; + +T bar (U, unsigned int); + +T foo (T x) +{ + U d, u; + + memcpy (&u, &x->d[1], sizeof u); + d = u; + return bar (d, x->b); +} + +T baz (T x) +{ + U d, u; + + d.r[0] = 0x123456789; + d.r[1] = 0xfedcba987; + d.r[2] = 0xabcdef123; + memcpy (&u, &x->d[1], sizeof u); + d = u; + return bar (d, x->b); +} + +T bar (U d, unsigned int m) +{ + if (d.r[0] != 21 || d.r[1] != 22 || d.r[2] != 23) + abort (); + return 0; +} + +struct t t = { 26, 0, 0, { 0, 21, 22, 23 }}; + +int main (void) +{ + baz (&t); + foo (&t); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20011114-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20011114-1.c?rev=374156&view=auto ============================================================================== --- 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20011114-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20011114-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,11 @@ +char foo(char bar[]) +{ + return bar[1]; +} +extern char foo(char *); +int main(void) +{ + if (foo("xy") != 'y') + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20011115-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20011115-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20011115-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20011115-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,26 @@ +extern void exit (int); + +static inline int +foo (void) +{ +#ifdef __OPTIMIZE__ + extern int undefined_reference; + return undefined_reference; +#else + return 0; +#endif +} + +static inline int +bar (void) +{ + if (foo == foo) + return 1; + else + return foo (); +} + +int main (void) +{ + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20011121-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20011121-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20011121-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20011121-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,19 @@ +struct s +{ + int i[18]; + char f; + char b[2]; +}; + +struct s s1; + +int +main() +{ + struct s s2; + s2.b[0] = 100; + __builtin_memcpy(&s2, &s1, ((unsigned int) &((struct s *)0)->b)); + if (s2.b[0] != 100) + abort(); + exit(0); +} Added: 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20011126-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20011126-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20011126-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20011126-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,19 @@ +/* Produced a overflow in ifcvt.c, causing S to contain 0xffffffff7fffffff. */ + +int a = 1; + +int main () +{ + long long s; + + s = a; + if (s < 0) + s = -2147483648LL; + else + s = 2147483647LL; + + if (s < 0) + abort (); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20011126-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20011126-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20011126-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20011126-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,43 @@ +/* Problem originally visible on ia64. + + There is a partial redundancy of "in + 1" that makes GCSE want to + transform the final while loop to + + p = in + 1; + tmp = p; + ... + goto start; + top: + tmp = tmp + 1; + start: + in = tmp; + if (in < p) goto top; + + We miscalculate the number of loop iterations as (p - tmp) = 0 + instead of (p - in) = 1, which results in overflow in the doloop + optimization. 
*/ + +static const char * +test (const char *in, char *out) +{ + while (1) + { + if (*in == 'a') + { + const char *p = in + 1; + while (*p == 'x') + ++p; + if (*p == 'b') + return p; + while (in < p) + *out++ = *in++; + } + } +} + +int main () +{ + char out[4]; + test ("aab", out); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20011128-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20011128-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20011128-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20011128-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,5 @@ +main() +{ + char blah[33] = "01234567890123456789"; + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20011217-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20011217-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20011217-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20011217-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,10 @@ +int +main() +{ + double x = 1.0; + double y = 2.0; + + if ((y > x--) != 1) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20011219-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20011219-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20011219-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20011219-1.c Wed Oct 9 04:01:46 
2019 @@ -0,0 +1,49 @@ +/* This testcase failed on IA-32 at -O and above, because combine attached + a REG_LABEL note to jump instruction already using JUMP_LABEL. */ + +extern void abort (void); +extern void exit (int); + +enum X { A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q }; + +void +bar (const char *x, int y, const char *z) +{ +} + +long +foo (enum X x, const void *y) +{ + long a; + + switch (x) + { + case K: + a = *(long *)y; + break; + case L: + a = *(long *)y; + break; + case M: + a = *(long *)y; + break; + case N: + a = *(long *)y; + break; + case O: + a = *(long *)y; + break; + default: + bar ("foo", 1, "bar"); + } + return a; +} + +int +main () +{ + long i = 24; + if (foo (N, &i) != 24) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20011223-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20011223-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20011223-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20011223-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,22 @@ +/* Origin: Joseph Myers . */ +/* Case labels in a switch statement are converted to the promoted + type of the controlling expression, not an unpromoted version. + Reported as PR c/2454 by + Andreas Krakowczyk . 
*/ + +extern void exit (int); +extern void abort (void); + +static int i; + +int +main (void) +{ + i = -1; + switch ((signed char) i) { + case 255: + abort (); + default: + exit (0); + } +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020103-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020103-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020103-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020103-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,28 @@ +/* On h8300 port, the following used to be broken with -mh or -ms. */ + +extern void abort (void); +extern void exit (int); + +unsigned long +foo (unsigned long a) +{ + return a ^ 0x0000ffff; +} + +unsigned long +bar (unsigned long a) +{ + return a ^ 0xffff0000; +} + +int +main () +{ + if (foo (0) != 0x0000ffff) + abort (); + + if (bar (0) != 0xffff0000) + abort (); + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020107-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020107-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020107-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020107-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,29 @@ +/* This testcase failed because - 1 - buf was simplified into ~buf and when + later expanding it back into - buf + -1, -1 got lost. 
*/ +/* { dg-options "-fgnu89-inline" } */ + +extern void abort (void); +extern void exit (int); + +static void +bar (int x) +{ + if (!x) + abort (); +} + +char buf[10]; + +inline char * +foo (char *tmp) +{ + asm ("" : "=r" (tmp) : "0" (tmp)); + return tmp + 2; +} + +int +main (void) +{ + bar ((foo (buf) - 1 - buf) == 1); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020108-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020108-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020108-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020108-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,203 @@ +/* This file tests shifts in various integral modes. */ + +#include <limits.h> + +#define CAT(A, B) A ## B + +#define REPEAT_8 \ +REPEAT_FN ( 0) \ +REPEAT_FN ( 1) \ +REPEAT_FN ( 2) \ +REPEAT_FN ( 3) \ +REPEAT_FN ( 4) \ +REPEAT_FN ( 5) \ +REPEAT_FN ( 6) \ +REPEAT_FN ( 7) + +#define REPEAT_16 \ +REPEAT_8 \ +REPEAT_FN ( 8) \ +REPEAT_FN ( 9) \ +REPEAT_FN (10) \ +REPEAT_FN (11) \ +REPEAT_FN (12) \ +REPEAT_FN (13) \ +REPEAT_FN (14) \ +REPEAT_FN (15) + +#define REPEAT_32 \ +REPEAT_16 \ +REPEAT_FN (16) \ +REPEAT_FN (17) \ +REPEAT_FN (18) \ +REPEAT_FN (19) \ +REPEAT_FN (20) \ +REPEAT_FN (21) \ +REPEAT_FN (22) \ +REPEAT_FN (23) \ +REPEAT_FN (24) \ +REPEAT_FN (25) \ +REPEAT_FN (26) \ +REPEAT_FN (27) \ +REPEAT_FN (28) \ +REPEAT_FN (29) \ +REPEAT_FN (30) \ +REPEAT_FN (31) + +/* Define 8-bit shifts.
*/ +#if CHAR_BIT == 8 +typedef unsigned int u8 __attribute__((mode(QI))); +typedef signed int s8 __attribute__((mode(QI))); + +#define REPEAT_FN(COUNT) \ +u8 CAT (ashift_qi_, COUNT) (u8 n) { return n << COUNT; } +REPEAT_8 +#undef REPEAT_FN + +#define REPEAT_FN(COUNT) \ +u8 CAT (lshiftrt_qi_, COUNT) (u8 n) { return n >> COUNT; } +REPEAT_8 +#undef REPEAT_FN + +#define REPEAT_FN(COUNT) \ +s8 CAT (ashiftrt_qi_, COUNT) (s8 n) { return n >> COUNT; } +REPEAT_8 +#undef REPEAT_FN +#endif /* CHAR_BIT == 8 */ + +/* Define 16-bit shifts. */ +#if CHAR_BIT == 8 || CHAR_BIT == 16 +#if CHAR_BIT == 8 +typedef unsigned int u16 __attribute__((mode(HI))); +typedef signed int s16 __attribute__((mode(HI))); +#elif CHAR_BIT == 16 +typedef unsigned int u16 __attribute__((mode(QI))); +typedef signed int s16 __attribute__((mode(QI))); +#endif + +#define REPEAT_FN(COUNT) \ +u16 CAT (ashift_hi_, COUNT) (u16 n) { return n << COUNT; } +REPEAT_16 +#undef REPEAT_FN + +#define REPEAT_FN(COUNT) \ +u16 CAT (lshiftrt_hi_, COUNT) (u16 n) { return n >> COUNT; } +REPEAT_16 +#undef REPEAT_FN + +#define REPEAT_FN(COUNT) \ +s16 CAT (ashiftrt_hi_, COUNT) (s16 n) { return n >> COUNT; } +REPEAT_16 +#undef REPEAT_FN +#endif /* CHAR_BIT == 8 || CHAR_BIT == 16 */ + +/* Define 32-bit shifts. 
*/ +#if CHAR_BIT == 8 || CHAR_BIT == 16 || CHAR_BIT == 32 +#if CHAR_BIT == 8 +typedef unsigned int u32 __attribute__((mode(SI))); +typedef signed int s32 __attribute__((mode(SI))); +#elif CHAR_BIT == 16 +typedef unsigned int u32 __attribute__((mode(HI))); +typedef signed int s32 __attribute__((mode(HI))); +#elif CHAR_BIT == 32 +typedef unsigned int u32 __attribute__((mode(QI))); +typedef signed int s32 __attribute__((mode(QI))); +#endif + +#define REPEAT_FN(COUNT) \ +u32 CAT (ashift_si_, COUNT) (u32 n) { return n << COUNT; } +REPEAT_32 +#undef REPEAT_FN + +#define REPEAT_FN(COUNT) \ +u32 CAT (lshiftrt_si_, COUNT) (u32 n) { return n >> COUNT; } +REPEAT_32 +#undef REPEAT_FN + +#define REPEAT_FN(COUNT) \ +s32 CAT (ashiftrt_si_, COUNT) (s32 n) { return n >> COUNT; } +REPEAT_32 +#undef REPEAT_FN +#endif /* CHAR_BIT == 8 || CHAR_BIT == 16 || CHAR_BIT == 32 */ + +extern void abort (void); +extern void exit (int); + +int +main () +{ + /* Test 8-bit shifts. */ +#if CHAR_BIT == 8 +# define REPEAT_FN(COUNT) \ + if (CAT (ashift_qi_, COUNT) (0xff) != (u8) ((u8)0xff << COUNT)) abort (); + REPEAT_8; +# undef REPEAT_FN + +# define REPEAT_FN(COUNT) \ + if (CAT (lshiftrt_qi_, COUNT) (0xff) != (u8) ((u8)0xff >> COUNT)) abort (); + REPEAT_8; +# undef REPEAT_FN + +# define REPEAT_FN(COUNT) \ + if (CAT (ashiftrt_qi_, COUNT) (-1) != -1) abort (); + REPEAT_8; +# undef REPEAT_FN + +# define REPEAT_FN(COUNT) \ + if (CAT (ashiftrt_qi_, COUNT) (0) != 0) abort (); + REPEAT_8; +# undef REPEAT_FN +#endif /* CHAR_BIT == 8 */ + + /* Test 16-bit shifts. 
*/ +#if CHAR_BIT == 8 || CHAR_BIT == 16 +# define REPEAT_FN(COUNT) \ + if (CAT (ashift_hi_, COUNT) (0xffff) \ + != (u16) ((u16) 0xffff << COUNT)) abort (); + REPEAT_16; +# undef REPEAT_FN + +# define REPEAT_FN(COUNT) \ + if (CAT (lshiftrt_hi_, COUNT) (0xffff) \ + != (u16) ((u16) 0xffff >> COUNT)) abort (); + REPEAT_16; +# undef REPEAT_FN + +# define REPEAT_FN(COUNT) \ + if (CAT (ashiftrt_hi_, COUNT) (-1) != -1) abort (); + REPEAT_16; +# undef REPEAT_FN + +# define REPEAT_FN(COUNT) \ + if (CAT (ashiftrt_hi_, COUNT) (0) != 0) abort (); + REPEAT_16; +# undef REPEAT_FN +#endif /* CHAR_BIT == 8 || CHAR_BIT == 16 */ + + /* Test 32-bit shifts. */ +#if CHAR_BIT == 8 || CHAR_BIT == 16 || CHAR_BIT == 32 +# define REPEAT_FN(COUNT) \ + if (CAT (ashift_si_, COUNT) (0xffffffff) \ + != (u32) ((u32) 0xffffffff << COUNT)) abort (); + REPEAT_32; +# undef REPEAT_FN + +# define REPEAT_FN(COUNT) \ + if (CAT (lshiftrt_si_, COUNT) (0xffffffff) \ + != (u32) ((u32) 0xffffffff >> COUNT)) abort (); + REPEAT_32; +# undef REPEAT_FN + +# define REPEAT_FN(COUNT) \ + if (CAT (ashiftrt_si_, COUNT) (-1) != -1) abort (); + REPEAT_32; +# undef REPEAT_FN + +# define REPEAT_FN(COUNT) \ + if (CAT (ashiftrt_si_, COUNT) (0) != 0) abort (); + REPEAT_32; +# undef REPEAT_FN +#endif /* CHAR_BIT == 8 || CHAR_BIT == 16 || CHAR_BIT == 32 */ + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020118-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020118-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020118-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020118-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,33 @@ +/* This tests an insn length of sign extension on h8300 port. 
*/ + +extern void exit (int); + +volatile signed char *q; +volatile signed int n; + +void +foo (void) +{ + signed char *p; + + for (;;) + { + p = (signed char *) q; n = p[2]; + p = (signed char *) q; n = p[2]; + p = (signed char *) q; n = p[2]; + p = (signed char *) q; n = p[2]; + p = (signed char *) q; n = p[2]; + p = (signed char *) q; n = p[2]; + p = (signed char *) q; n = p[2]; + p = (signed char *) q; n = p[2]; + p = (signed char *) q; n = p[2]; + p = (signed char *) q; n = p[2]; + p = (signed char *) q; n = p[2]; + } +} + +int +main () +{ + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020127-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020127-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020127-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020127-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,22 @@ +/* This used to fail on h8300. 
*/ + +extern void abort (void); +extern void exit (int); + +unsigned long +foo (unsigned long n) +{ + return (~n >> 3) & 1; +} + +int +main () +{ + if (foo (1 << 3) != 0) + abort (); + + if (foo (0) != 1) + abort (); + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020129-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020129-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020129-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020129-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,51 @@ +/* This testcase failed at -O2 on IA-64, because scheduling did not take + into account conditional execution when using cselib for alias + analysis. */ + +struct D { int d1; struct D *d2; }; +struct C { struct D c1; long c2, c3, c4, c5, c6; }; +struct A { struct A *a1; struct C *a2; }; +struct B { struct C b1; struct A *b2; }; + +extern void abort (void); +extern void exit (int); + +void +foo (struct B *x, struct B *y) +{ + if (x->b2 == 0) + { + struct A *a; + + x->b2 = a = y->b2; + y->b2 = 0; + for (; a; a = a->a1) + a->a2 = &x->b1; + } + + if (y->b2 != 0) + abort (); + + if (x->b1.c3 == -1) + { + x->b1.c3 = y->b1.c3; + x->b1.c4 = y->b1.c4; + y->b1.c3 = -1; + y->b1.c4 = 0; + } + + if (y->b1.c3 != -1) + abort (); +} + +struct B x, y; + +int main () +{ + y.b1.c1.d1 = 6; + y.b1.c3 = 145; + y.b1.c4 = 2448; + x.b1.c3 = -1; + foo (&x, &y); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020201-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020201-1.c?rev=374156&view=auto ============================================================================== --- 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020201-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020201-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,37 @@ +/* Test whether division by constant works properly. */ + +extern void abort (void); +extern void exit (int); + +unsigned char cx = 7; +unsigned short sx = 14; +unsigned int ix = 21; +unsigned long lx = 28; +unsigned long long Lx = 35; + +int +main () +{ + unsigned char cy; + unsigned short sy; + unsigned int iy; + unsigned long ly; + unsigned long long Ly; + + cy = cx / 6; if (cy != 1) abort (); + cy = cx % 6; if (cy != 1) abort (); + + sy = sx / 6; if (sy != 2) abort (); + sy = sx % 6; if (sy != 2) abort (); + + iy = ix / 6; if (iy != 3) abort (); + iy = ix % 6; if (iy != 3) abort (); + + ly = lx / 6; if (ly != 4) abort (); + ly = lx % 6; if (ly != 4) abort (); + + Ly = Lx / 6; if (Ly != 5) abort (); + Ly = Lx % 6; if (Ly != 5) abort (); + + exit(0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020206-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020206-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020206-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020206-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,26 @@ +struct A { + unsigned int a, b, c; +}; + +extern void abort (void); +extern void exit (int); + +struct A bar (void) +{ + return (struct A) { 176, 52, 31 }; +} + +void baz (struct A *a) +{ + if (a->a != 176 || a->b != 52 || a->c != 31) + abort (); +} + +int main () +{ + struct A d; + + d = ({ ({ bar (); }); }); + baz (&d); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020206-2.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020206-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020206-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020206-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,24 @@ +/* Origin: PR c/5420 from David Mosberger . + This testcase was miscompiled when tail call optimizing, because a + compound literal initialization was emitted only in the tail call insn + chain, not in the normal call insn chain. */ + +typedef struct { unsigned short a; } A; + +extern void abort (void); +extern void exit (int); + +void foo (unsigned int x) +{ + if (x != 0x800 && x != 0x810) + abort (); +} + +int +main (int argc, char **argv) +{ + int i; + for (i = 0; i < 2; ++i) + foo (((A) { ((!(i >> 4) ? 8 : 64 + (i >> 4)) << 8) + (i << 4) } ).a); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020213-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020213-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020213-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020213-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,34 @@ +/* PR c/5681 + This testcase failed on IA-32 at -O0, because safe_from_p + incorrectly assumed it is safe to first write into a.a2 b-1 + and then read the original value from it. */ + +int bar (float); + +struct A { + float a1; + int a2; +} a; + +int b; + +void foo (void) +{ + a.a2 = bar (a.a1); + a.a2 = a.a2 < b - 1 ? 
a.a2 : b - 1; + if (a.a2 >= b - 1) + abort (); +} + +int bar (float x) +{ + return 2241; +} + +int main() +{ + a.a1 = 1.0f; + b = 3384; + foo (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020215-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020215-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020215-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020215-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,33 @@ +/* Test failed on an architecture that: + + - had 16-bit registers, + - passed 64-bit structures in registers, + - only allowed SImode values in even numbered registers. + + Before reload, s.i2 in foo() was represented as: + + (subreg:SI (reg:DI 0) 2) + + find_dummy_reload would return (reg:SI 1) for the subreg reload, + despite that not being a valid register. */ + +struct s +{ + short i1; + long i2; + short i3; +}; + +struct s foo (struct s s) +{ + s.i2++; + return s; +} + +int main () +{ + struct s s = foo ((struct s) { 1000, 2000L, 3000 }); + if (s.i1 != 1000 || s.i2 != 2001L || s.i3 != 3000) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020216-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020216-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020216-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020216-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,24 @@ +/* PR c/3444 + This used to fail because bitwise xor was improperly computed in char type + and sign extended to int type. 
*/ + +extern void abort (); +extern void exit (int); + +signed char c = (signed char) 0xffffffff; + +int foo (void) +{ + return (unsigned short) c ^ (signed char) 0x99999999; +} + +int main (void) +{ + if ((unsigned char) -1 != 0xff + || sizeof (short) != 2 + || sizeof (int) != 4) + exit (0); + if (foo () != (int) 0xffff0066) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020219-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020219-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020219-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020219-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,21 @@ +/* PR c/4308 + This testcase failed because 0x8000000000000000 >> 0 + was incorrectly folded into 0xffffffff00000000. */ + +extern void abort (void); +extern void exit (int); + +long long foo (void) +{ + long long C = 1ULL << 63, X; + int Y = 32; + X = C >> (Y & 31); + return X; +} + +int main (void) +{ + if (foo () != 1ULL << 63) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020225-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020225-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020225-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020225-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,17 @@ +/* This testcase failed at -O2 on powerpc64 due to andsi3 writing + nonzero bits to the high 32 bits of a 64 bit register. 
*/ + +extern void abort (void); +extern void exit (int); + +unsigned long foo (unsigned long base, unsigned int val) +{ + return base + (val & 0x80000001); +} + +int main (void) +{ + if (foo (0L, 0x0ffffff0) != 0L) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020225-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020225-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020225-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020225-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,19 @@ +static int +test(int x) +{ + union + { + int i; + double d; + } a; + a.d = 0; + a.i = 1; + return x >> a.i; +} + +int main(void) +{ + if (test (5) != 2) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020226-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020226-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020226-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020226-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,104 @@ +/* This tests the rotate patterns that some machines support. 
*/ + +#include <limits.h> + +#ifndef CHAR_BIT +#define CHAR_BIT 8 +#endif + +#define ROR(a,b) (((a) >> (b)) | ((a) << ((sizeof (a) * CHAR_BIT) - (b)))) +#define ROL(a,b) (((a) << (b)) | ((a) >> ((sizeof (a) * CHAR_BIT) - (b)))) + +#define CHAR_VALUE ((unsigned char)0x1234U) +#define SHORT_VALUE ((unsigned short)0x1234U) +#define INT_VALUE 0x1234U +#define LONG_VALUE 0x12345678LU +#define LL_VALUE 0x12345678abcdef0LLU + +#define SHIFT1 4 +#define SHIFT2 ((sizeof (long long) * CHAR_BIT) - SHIFT1) + +unsigned char uc = CHAR_VALUE; +unsigned short us = SHORT_VALUE; +unsigned int ui = INT_VALUE; +unsigned long ul = LONG_VALUE; +unsigned long long ull = LL_VALUE; +int shift1 = SHIFT1; +int shift2 = SHIFT2; + +main () +{ + if (ROR (uc, shift1) != ROR (CHAR_VALUE, SHIFT1)) + abort (); + + if (ROR (uc, SHIFT1) != ROR (CHAR_VALUE, SHIFT1)) + abort (); + + if (ROR (us, shift1) != ROR (SHORT_VALUE, SHIFT1)) + abort (); + + if (ROR (us, SHIFT1) != ROR (SHORT_VALUE, SHIFT1)) + abort (); + + if (ROR (ui, shift1) != ROR (INT_VALUE, SHIFT1)) + abort (); + + if (ROR (ui, SHIFT1) != ROR (INT_VALUE, SHIFT1)) + abort (); + + if (ROR (ul, shift1) != ROR (LONG_VALUE, SHIFT1)) + abort (); + + if (ROR (ul, SHIFT1) != ROR (LONG_VALUE, SHIFT1)) + abort (); + + if (ROR (ull, shift1) != ROR (LL_VALUE, SHIFT1)) + abort (); + + if (ROR (ull, SHIFT1) != ROR (LL_VALUE, SHIFT1)) + abort (); + + if (ROR (ull, shift2) != ROR (LL_VALUE, SHIFT2)) + abort (); + + if (ROR (ull, SHIFT2) != ROR (LL_VALUE, SHIFT2)) + abort (); + + if (ROL (uc, shift1) != ROL (CHAR_VALUE, SHIFT1)) + abort (); + + if (ROL (uc, SHIFT1) != ROL (CHAR_VALUE, SHIFT1)) + abort (); + + if (ROL (us, shift1) != ROL (SHORT_VALUE, SHIFT1)) + abort (); + + if (ROL (us, SHIFT1) != ROL (SHORT_VALUE, SHIFT1)) + abort (); + + if (ROL (ui, shift1) != ROL (INT_VALUE, SHIFT1)) + abort (); + + if (ROL (ui, SHIFT1) != ROL (INT_VALUE, SHIFT1)) + abort (); + + if (ROL (ul, shift1) != ROL (LONG_VALUE, SHIFT1)) + abort (); + + if (ROL (ul, SHIFT1) != ROL 
(LONG_VALUE, SHIFT1)) + abort (); + + if (ROL (ull, shift1) != ROL (LL_VALUE, SHIFT1)) + abort (); + + if (ROL (ull, SHIFT1) != ROL (LL_VALUE, SHIFT1)) + abort (); + + if (ROL (ull, shift2) != ROL (LL_VALUE, SHIFT2)) + abort (); + + if (ROL (ull, SHIFT2) != ROL (LL_VALUE, SHIFT2)) + abort (); + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020227-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020227-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020227-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020227-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,30 @@ +/* This testcase failed on mmix-knuth-mmixware. Problem was with storing + to an unaligned mem:SC, gcc tried doing it by parts from a (concat:SC + (reg:SF 293) (reg:SF 294)). 
*/ + +typedef __complex__ float cf; +struct x { char c; cf f; } __attribute__ ((__packed__)); +extern void f2 (struct x*); +extern void f1 (void); +int +main (void) +{ + f1 (); + exit (0); +} + +void +f1 (void) +{ + struct x s; + s.f = 1; + s.c = 42; + f2 (&s); +} + +void +f2 (struct x *y) +{ + if (y->f != 1 || y->c != 42) + abort (); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020307-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020307-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020307-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020307-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,69 @@ +#define MASK(N) ((1UL << (N)) - 1) +#define BITS(N) ((1UL << ((N) - 1)) + 2) + +#define FUNC(N) void f##N(long j) { if ((j & MASK(N)) >= BITS(N)) abort();} + +FUNC(3) +FUNC(4) +FUNC(5) +FUNC(6) +FUNC(7) +FUNC(8) +FUNC(9) +FUNC(10) +FUNC(11) +FUNC(12) +FUNC(13) +FUNC(14) +FUNC(15) +FUNC(16) +FUNC(17) +FUNC(18) +FUNC(19) +FUNC(20) +FUNC(21) +FUNC(22) +FUNC(23) +FUNC(24) +FUNC(25) +FUNC(26) +FUNC(27) +FUNC(28) +FUNC(29) +FUNC(30) +FUNC(31) + +int main () +{ + f3(0); + f4(0); + f5(0); + f6(0); + f7(0); + f8(0); + f9(0); + f10(0); + f11(0); + f12(0); + f13(0); + f14(0); + f15(0); + f16(0); + f17(0); + f18(0); + f19(0); + f20(0); + f21(0); + f22(0); + f23(0); + f24(0); + f25(0); + f26(0); + f27(0); + f28(0); + f29(0); + f30(0); + f31(0); + + exit(0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020314-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020314-1.c?rev=374156&view=auto ============================================================================== --- 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020314-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020314-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,37 @@ +/* { dg-require-effective-target alloca } */ +void f(void * a, double y) +{ +} + +double g (double a, double b, double c, double d) +{ + double x, y, z; + void *p; + + x = a + b; + y = c * d; + + p = alloca (16); + + f(p, y); + z = x * y * a; + + return z + b; +} + +main () +{ + double a, b, c, d; + a = 1.0; + b = 0.0; + c = 10.0; + d = 0.0; + + if (g (a, b, c, d) != 0.0) + abort (); + + if (a != 1.0 || b != 0.0 || c != 10.0 || d != 0.0) + abort (); + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020320-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020320-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020320-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020320-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,23 @@ +/* PR c/5354 */ +/* Verify that GCC preserves relevant stack slots. 
*/ + +extern void abort(void); +extern void exit(int); + +struct large { int x, y[9]; }; + +int main() +{ + int fixed; + + fixed = ({ int temp1 = 2; temp1; }) - ({ int temp2 = 1; temp2; }); + if (fixed != 1) + abort(); + + fixed = ({ struct large temp3; temp3.x = 2; temp3; }).x + - ({ struct large temp4; temp4.x = 1; temp4; }).x; + if (fixed != 1) + abort(); + + exit(0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020321-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020321-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020321-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020321-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,20 @@ +/* PR 3177 */ +/* Produced a SIGILL on ia64 with sibcall from F to G. We hadn't + widened the register window to allow for the fourth outgoing + argument as an "in" register. 
*/ + +float g (void *a, void *b, int e, int c, float d) +{ + return d; +} + +float f (void *a, void *b, int c, float d) +{ + return g (a, b, 0, c, d); +} + +int main () +{ + f (0, 0, 1, 1); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020328-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020328-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020328-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020328-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,25 @@ +int b = 0; + +func () { } + +void +testit(int x) +{ + if (x != 20) + abort (); +} + +int +main() + +{ + int a = 0; + + if (b) + func(); + + /* simplify_and_const_int would incorrectly omit the mask in + the line below. */ + testit ((a + 23) & 0xfffffffc); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020402-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020402-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020402-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020402-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,40 @@ +/* derived from PR c/2100 */ + +extern void abort (); +extern void exit (int); + +#define SMALL_N 2 +#define NUM_ELEM 4 + +int main(void) +{ + int listElem[NUM_ELEM]={30,2,10,5}; + int listSmall[SMALL_N]; + int i, j; + int posGreatest=-1, greatest=-1; + + for (i=0; i greatest) { + posGreatest = i; + greatest = listElem[i]; + } + } + + for (i=SMALL_N; i greatest) { + posGreatest = j; + greatest = listSmall[j]; + } + } + } + + if (listSmall[0] != 5 || listSmall[1] != 2) + abort (); + 
exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020402-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020402-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020402-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020402-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,230 @@ +/* PR 3967 + + local-alloc screwed up consideration of high+lo_sum and created + reg_equivs that it shouldn't have, resulting in lo_sum with + uninitialized data, resulting in segv. The test has to remain + relatively large, since register spilling is required to twig + the bug. */ + +unsigned long *Local1; +unsigned long *Local2; +unsigned long *Local3; +unsigned long *RDbf1; +unsigned long *RDbf2; +unsigned long *RDbf3; +unsigned long *IntVc1; +unsigned long *IntVc2; +unsigned long *IntCode3; +unsigned long *IntCode4; +unsigned long *IntCode5; +unsigned long *IntCode6; +unsigned long *Lom1; +unsigned long *Lom2; +unsigned long *Lom3; +unsigned long *Lom4; +unsigned long *Lom5; +unsigned long *Lom6; +unsigned long *Lom7; +unsigned long *Lom8; +unsigned long *Lom9; +unsigned long *Lom10; +unsigned long *RDbf11; +unsigned long *RDbf12; + +typedef struct + { + long a1; + unsigned long n1; + unsigned long local1; + unsigned long local2; + unsigned long local3; + unsigned long rdbf1; + unsigned long rdbf2; + unsigned long milli; + unsigned long frames1; + unsigned long frames2; + unsigned long nonShared; + long newPrivate; + long freeLimit; + unsigned long cache1; + unsigned long cache2; + unsigned long cache3; + unsigned long cache4; + unsigned long cache5; + unsigned long time6; + unsigned long frames7; + unsigned long page8; + unsigned long ot9; + unsigned long data10; + unsigned long bm11; + unsigned long misc12; + } +ShrPcCommonStatSType; + + 
+typedef struct + { + unsigned long sharedAttached; + unsigned long totalAttached; + long avgPercentShared; + unsigned long numberOfFreeFrames; + unsigned long localDirtyPageCount; + unsigned long globalDirtyPageCount; + long wakeupInterval; + unsigned long numActiveProcesses; + unsigned long numRecentActiveProcesses; + unsigned long gemDirtyPageKinds[10]; + unsigned long stoneDirtyPageKinds[10]; + unsigned long gemsInCacheCount; + long targetFreeFrameCount; + } +ShrPcMonStatSType; + +typedef struct + { + unsigned long c1; + unsigned long c2; + unsigned long c3; + unsigned long c4; + unsigned long c5; + unsigned long c6; + unsigned long c7; + unsigned long c8; + unsigned long c9; + unsigned long c10; + unsigned long c11; + unsigned long c12; + unsigned long a1; + unsigned long a2; + unsigned long a3; + unsigned long a4; + unsigned long a5; + unsigned long a6; + unsigned long a7; + unsigned long a8; + unsigned long a9; + unsigned long a10; + unsigned long a11; + unsigned long a12; + unsigned long a13; + unsigned long a14; + unsigned long a15; + unsigned long a16; + unsigned long a17; + unsigned long a18; + unsigned long a19; + unsigned long sessionStats[40]; + } +ShrPcGemStatSType; + +union ShrPcStatUnion + { + ShrPcMonStatSType monitor; + ShrPcGemStatSType gem; + }; + +typedef struct + { + int processId; + int sessionId; + ShrPcCommonStatSType cmn; + union ShrPcStatUnion u; + } ShrPcStatsSType; + +typedef struct + { + unsigned long *p1; + unsigned long *p2; + unsigned long *p3; + unsigned long *p4; + unsigned long *p5; + unsigned long *p6; + unsigned long *p7; + unsigned long *p8; + unsigned long *p9; + unsigned long *p10; + unsigned long *p11; + } +WorkEntrySType; + +WorkEntrySType Workspace; + +static void +setStatPointers (ShrPcStatsSType * statsPtr, long sessionId) +{ + statsPtr->sessionId = sessionId; + statsPtr->cmn.a1 = 0; + statsPtr->cmn.n1 = 5; + + Local1 = &statsPtr->cmn.local1; + Local2 = &statsPtr->cmn.local2; + Local3 = &statsPtr->cmn.local3; + RDbf1 = 
&statsPtr->cmn.rdbf1; + RDbf2 = &statsPtr->cmn.rdbf2; + RDbf3 = &statsPtr->cmn.milli; + *RDbf3 = 1; + + IntVc1 = &statsPtr->u.gem.a1; + IntVc2 = &statsPtr->u.gem.a2; + IntCode3 = &statsPtr->u.gem.a3; + IntCode4 = &statsPtr->u.gem.a4; + IntCode5 = &statsPtr->u.gem.a5; + IntCode6 = &statsPtr->u.gem.a6; + + { + WorkEntrySType *workSpPtr; + workSpPtr = &Workspace; + workSpPtr->p1 = &statsPtr->u.gem.a7; + workSpPtr->p2 = &statsPtr->u.gem.a8; + workSpPtr->p3 = &statsPtr->u.gem.a9; + workSpPtr->p4 = &statsPtr->u.gem.a10; + workSpPtr->p5 = &statsPtr->u.gem.a11; + workSpPtr->p6 = &statsPtr->u.gem.a12; + workSpPtr->p7 = &statsPtr->u.gem.a13; + workSpPtr->p8 = &statsPtr->u.gem.a14; + workSpPtr->p9 = &statsPtr->u.gem.a15; + workSpPtr->p10 = &statsPtr->u.gem.a16; + workSpPtr->p11 = &statsPtr->u.gem.a17; + } + Lom1 = &statsPtr->u.gem.c1; + Lom2 = &statsPtr->u.gem.c2; + Lom3 = &statsPtr->u.gem.c3; + Lom4 = &statsPtr->u.gem.c4; + Lom5 = &statsPtr->u.gem.c5; + Lom6 = &statsPtr->u.gem.c6; + Lom7 = &statsPtr->u.gem.c7; + Lom8 = &statsPtr->u.gem.c8; + Lom9 = &statsPtr->u.gem.c9; + Lom10 = &statsPtr->u.gem.c10; + RDbf11 = &statsPtr->u.gem.c11; + RDbf12 = &statsPtr->u.gem.c12; +} + +typedef struct +{ + ShrPcStatsSType stats; +} ShrPcPteSType; + +ShrPcPteSType MyPte; + +static void +initPte (void *shrpcPtr, long sessionId) +{ + ShrPcPteSType *ptePtr; + + ptePtr = &MyPte; + setStatPointers (&ptePtr->stats, sessionId); +} + +void +InitCache (int sessionId) +{ + initPte (0, sessionId); +} + +int +main (int argc, char *argv[]) +{ + InitCache (5); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020402-3.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020402-3.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020402-3.c (added) +++ 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020402-3.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,81 @@ +/* extracted from gdb sources */ + +typedef unsigned long long CORE_ADDR; + +struct blockvector; + +struct symtab { + struct blockvector *blockvector; +}; + +struct sec { + void *unused; +}; + +struct symbol { + int len; + char *name; +}; + +struct block { + CORE_ADDR startaddr, endaddr; + struct symbol *function; + struct block *superblock; + unsigned char gcc_compile_flag; + int nsyms; + struct symbol syms[1]; +}; + +struct blockvector { + int nblocks; + struct block *block[2]; +}; + +struct blockvector *blockvector_for_pc_sect(register CORE_ADDR pc, + struct symtab *symtab) +{ + register struct block *b; + register int bot, top, half; + struct blockvector *bl; + + bl = symtab->blockvector; + b = bl->block[0]; + + bot = 0; + top = bl->nblocks; + + while (top - bot > 1) + { + half = (top - bot + 1) >> 1; + b = bl->block[bot + half]; + if (b->startaddr <= pc) + bot += half; + else + top = bot + half; + } + + while (bot >= 0) + { + b = bl->block[bot]; + if (b->endaddr > pc) + { + return bl; + } + bot--; + } + return 0; +} + +int main(void) +{ + struct block a = { 0, 0x10000, 0, 0, 1, 20 }; + struct block b = { 0x10000, 0x20000, 0, 0, 1, 20 }; + struct blockvector bv = { 2, { &a, &b } }; + struct symtab s = { &bv }; + + struct blockvector *ret; + + ret = blockvector_for_pc_sect(0x500, &s); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020404-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020404-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020404-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020404-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,104 @@ +/* { dg-require-effective-target 
int32plus } */ +/* { dg-skip-if "pointers can be truncated" { m32c-*-* } } */ +/* Extracted from GDB sources. */ + +typedef long long bfd_signed_vma; +typedef bfd_signed_vma file_ptr; + +typedef enum bfd_boolean {false, true} boolean; + +typedef unsigned long long bfd_size_type; + +typedef unsigned int flagword; + +typedef unsigned long long CORE_ADDR; +typedef unsigned long long bfd_vma; + +struct bfd_struct { + int x; +}; + +struct asection_struct { + unsigned int user_set_vma : 1; + bfd_vma vma; + bfd_vma lma; + unsigned int alignment_power; + unsigned int entsize; +}; + +typedef struct bfd_struct bfd; +typedef struct asection_struct asection; + +static bfd * +bfd_openw_with_cleanup (char *filename, const char *target, char *mode); + +static asection * +bfd_make_section_anyway (bfd *abfd, const char *name); + +static boolean +bfd_set_section_size (bfd *abfd, asection *sec, bfd_size_type val); + +static boolean +bfd_set_section_flags (bfd *abfd, asection *sec, flagword flags); + +static boolean +bfd_set_section_contents (bfd *abfd, asection *section, void * data, file_ptr offset, bfd_size_type count); + +static void +dump_bfd_file (char *filename, char *mode, + char *target, CORE_ADDR vaddr, + char *buf, int len) +{ + bfd *obfd; + asection *osection; + + obfd = bfd_openw_with_cleanup (filename, target, mode); + osection = bfd_make_section_anyway (obfd, ".newsec"); + bfd_set_section_size (obfd, osection, len); + (((osection)->vma = (osection)->lma= (vaddr)), ((osection)->user_set_vma = (boolean)true), true); + (((osection)->alignment_power = (0)),true); + bfd_set_section_flags (obfd, osection, 0x203); + osection->entsize = 0; + bfd_set_section_contents (obfd, osection, buf, 0, len); +} + +static bfd * +bfd_openw_with_cleanup (char *filename, const char *target, char *mode) +{ + static bfd foo_bfd = { 0 }; + return &foo_bfd; +} + +static asection * +bfd_make_section_anyway (bfd *abfd, const char *name) +{ + static asection foo_section = { false, 0x0, 0x0, 0 }; + + 
return &foo_section; +} + +static boolean +bfd_set_section_size (bfd *abfd, asection *sec, bfd_size_type val) +{ + return true; +} + +static boolean +bfd_set_section_flags (bfd *abfd, asection *sec, flagword flags) +{ +} + +static boolean +bfd_set_section_contents (bfd *abfd, asection *section, void * data, file_ptr offset, bfd_size_type count) +{ + if (count != (bfd_size_type)0x1eadbeef) + abort(); +} + +static char hello[] = "hello"; + +int main(void) +{ + dump_bfd_file(0, 0, 0, (CORE_ADDR)0xdeadbeef, hello, (int)0x1eadbeef); + exit(0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020406-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020406-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020406-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020406-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,126 @@ +// Origin: abbott at dima.unige.it +// PR c/5120 + +extern void * malloc (__SIZE_TYPE__); +extern void * calloc (__SIZE_TYPE__, __SIZE_TYPE__); + +typedef unsigned int FFelem; + +FFelem FFmul(const FFelem x, const FFelem y) +{ + return x; +} + + +struct DUPFFstruct +{ + int maxdeg; + int deg; + FFelem *coeffs; +}; + +typedef struct DUPFFstruct *DUPFF; + + +int DUPFFdeg(const DUPFF f) +{ + return f->deg; +} + + +DUPFF DUPFFnew(const int maxdeg) +{ + DUPFF ans = (DUPFF)malloc(sizeof(struct DUPFFstruct)); + ans->coeffs = 0; + if (maxdeg >= 0) ans->coeffs = (FFelem*)calloc(maxdeg+1,sizeof(FFelem)); + ans->maxdeg = maxdeg; + ans->deg = -1; + return ans; +} + +void DUPFFfree(DUPFF x) +{ +} + +void DUPFFswap(DUPFF x, DUPFF y) +{ +} + + +DUPFF DUPFFcopy(const DUPFF x) +{ + return x; +} + + +void DUPFFshift_add(DUPFF f, const DUPFF g, int deg, const FFelem coeff) +{ +} + + +DUPFF DUPFFexgcd(DUPFF *fcofac, DUPFF *gcofac, const 
DUPFF f, const DUPFF g) +{ + DUPFF u, v, uf, ug, vf, vg; + FFelem q, lcu, lcvrecip, p; + int df, dg, du, dv; + + printf("DUPFFexgcd called on degrees %d and %d\n", DUPFFdeg(f), DUPFFdeg(g)); + if (DUPFFdeg(f) < DUPFFdeg(g)) return DUPFFexgcd(gcofac, fcofac, g, f); /*** BUG IN THIS LINE ***/ + if (DUPFFdeg(f) != 2 || DUPFFdeg(g) != 1) abort(); + if (f->coeffs[0] == 0) return f; + /****** NEVER REACH HERE IN THE EXAMPLE ******/ + p = 2; + + df = DUPFFdeg(f); if (df < 0) df = 0; /* both inputs are zero */ + dg = DUPFFdeg(g); if (dg < 0) dg = 0; /* one input is zero */ + u = DUPFFcopy(f); + v = DUPFFcopy(g); + + uf = DUPFFnew(dg); uf->coeffs[0] = 1; uf->deg = 0; + ug = DUPFFnew(df); + vf = DUPFFnew(dg); + vg = DUPFFnew(df); vg->coeffs[0] = 1; vg->deg = 0; + + while (DUPFFdeg(v) > 0) + { + dv = DUPFFdeg(v); + lcvrecip = FFmul(1, v->coeffs[dv]); + while (DUPFFdeg(u) >= dv) + { + du = DUPFFdeg(u); + lcu = u->coeffs[du]; + q = FFmul(lcu, lcvrecip); + DUPFFshift_add(u, v, du-dv, p-q); + DUPFFshift_add(uf, vf, du-dv, p-q); + DUPFFshift_add(ug, vg, du-dv, p-q); + } + DUPFFswap(u, v); + DUPFFswap(uf, vf); + DUPFFswap(ug, vg); + } + if (DUPFFdeg(v) == 0) + { + DUPFFswap(u, v); + DUPFFswap(uf, vf); + DUPFFswap(ug, vg); + } + DUPFFfree(vf); + DUPFFfree(vg); + DUPFFfree(v); + *fcofac = uf; + *gcofac = ug; + return u; +} + + + +int main() +{ + DUPFF f, g, cf, cg, h; + f = DUPFFnew(1); f->coeffs[1] = 1; f->deg = 1; + g = DUPFFnew(2); g->coeffs[2] = 1; g->deg = 2; + + printf("calling DUPFFexgcd on degrees %d and %d\n", DUPFFdeg(f), DUPFFdeg(g)) ; + h = DUPFFexgcd(&cf, &cg, f, g); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020411-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020411-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020411-1.c (added) 
+++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020411-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,25 @@ +/* PR optimization/6177 + This testcase ICEd because expr.c did not expect to see a CONCAT + as array rtl. */ + +extern void abort (void); +extern void exit (int); + +__complex__ float foo (void) +{ + __complex__ float f[1]; + __real__ f[0] = 1.0; + __imag__ f[0] = 1.0; + f[0] = __builtin_conjf (f[0]); + return f[0]; +} + +int main (void) +{ + __complex__ double d[1]; + d[0] = foo (); + if (__real__ d[0] != 1.0 + || __imag__ d[0] != -1.0) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020412-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020412-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020412-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020412-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,57 @@ +/* PR c/3711 + This testcase ICEd on IA-32 at -O0 and was miscompiled otherwise, + because std_expand_builtin_va_arg didn't handle variable size types. */ +/* { dg-require-effective-target alloca } */ + +#include + +extern void abort (void); +extern void exit (int); + +void bar (int c) +{ + static int d = '0'; + + if (c != d++) + abort (); + if (c < '0' || c > '9') + abort (); +} + +void foo (int size, ...) 
+{ + struct + { + char x[size]; + } d; + va_list ap; + int i; + + va_start (ap, size); + d = va_arg (ap, typeof (d)); + for (i = 0; i < size; i++) + bar (d.x[i]); + d = va_arg (ap, typeof (d)); + for (i = 0; i < size; i++) + bar (d.x[i]); + va_end (ap); +} + +int main (void) +{ + int z = 5; + struct { char a[z]; } x, y; + + x.a[0] = '0'; + x.a[1] = '1'; + x.a[2] = '2'; + x.a[3] = '3'; + x.a[4] = '4'; + y.a[0] = '5'; + y.a[1] = '6'; + y.a[2] = '7'; + y.a[3] = '8'; + y.a[4] = '9'; + foo (z, x, y); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020413-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020413-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020413-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020413-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,36 @@ +void test(long double val, int *eval) +{ + long double tmp = 1.0l; + int i = 0; + + if (val < 0.0l) + val = -val; + + if (val >= tmp) + while (tmp < val) + { + tmp *= 2.0l; + if (i++ >= 10) + abort (); + } + else if (val != 0.0l) + while (val < tmp) + { + tmp /= 2.0l; + if (i++ >= 10) + abort (); + } + + *eval = i; +} + +int main(void) +{ + int eval; + + test(3.0, &eval); + test(3.5, &eval); + test(4.0, &eval); + test(5.0, &eval); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020418-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020418-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020418-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020418-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,24 
@@ +/* ifcvt accidentally deletes a referenced label while generating + conditional traps on machines having such patterns */ + +struct foo { int a; }; + +void gcc_crash(struct foo *p) +{ + if (__builtin_expect(p->a < 52, 0)) + __builtin_trap(); + top: + p->a++; + if (p->a >= 62) + goto top; +} + +int main(void) +{ + struct foo x; + + x.a = 53; + gcc_crash(&x); + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020423-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020423-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020423-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020423-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,33 @@ +/* PR c/5430 */ +/* Verify that the multiplicative folding code is not fooled + by the mix between signed variables and unsigned constants. 
*/ + +extern void abort (void); +extern void exit (int); + +int main (void) +{ + int my_int = 924; + unsigned int result; + + result = ((my_int*2 + 4) - 8U) / 2; + if (result != 922U) + abort(); + + result = ((my_int*2 - 4U) + 2) / 2; + if (result != 923U) + abort(); + + result = (((my_int + 2) * 2) - 8U - 4) / 2; + if (result != 920U) + abort(); + result = (((my_int + 2) * 2) - (8U + 4)) / 2; + if (result != 920U) + abort(); + + result = ((my_int*4 + 2U) - 4U) / 2; + if (result != 1847U) + abort(); + + exit(0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020503-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020503-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020503-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020503-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,31 @@ +/* PR 6534 */ +/* GCSE unified the two i<0 tests, but if-conversion to ui=abs(i) + insertted the code at the wrong place corrupting the i<0 test. 
*/ + +void abort (void); +static char * +inttostr (long i, char buf[128]) +{ + unsigned long ui = i; + char *p = buf + 127; + *p = '\0'; + if (i < 0) + ui = -ui; + do + *--p = '0' + ui % 10; + while ((ui /= 10) != 0); + if (i < 0) + *--p = '-'; + return p; +} + +int +main () +{ + char buf[128], *p; + + p = inttostr (-1, buf); + if (*p != '-') + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020506-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020506-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020506-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020506-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,332 @@ +/* Copyright (C) 2002 Free Software Foundation. + + Test that (A & C1) op C2 optimizations behave correctly where C1 is + a constant power of 2, op is == or !=, and C2 is C1 or zero. + + Written by Roger Sayle, 5th May 2002. 
*/ + +#include + +extern void abort (void); + +void test1 (signed char c, int set); +void test2 (unsigned char c, int set); +void test3 (short s, int set); +void test4 (unsigned short s, int set); +void test5 (int i, int set); +void test6 (unsigned int i, int set); +void test7 (long long l, int set); +void test8 (unsigned long long l, int set); + +#ifndef LONG_LONG_MAX +#define LONG_LONG_MAX __LONG_LONG_MAX__ +#endif +#ifndef LONG_LONG_MIN +#define LONG_LONG_MIN (-LONG_LONG_MAX-1) +#endif +#ifndef ULONG_LONG_MAX +#define ULONG_LONG_MAX (LONG_LONG_MAX * 2ULL + 1) +#endif + + +void +test1 (signed char c, int set) +{ + if ((c & (SCHAR_MAX+1)) == 0) + { + if (set) abort (); + } + else + if (!set) abort (); + + if ((c & (SCHAR_MAX+1)) != 0) + { + if (!set) abort (); + } + else + if (set) abort (); + + if ((c & (SCHAR_MAX+1)) == (SCHAR_MAX+1)) + { + if (!set) abort (); + } + else + if (set) abort (); + + if ((c & (SCHAR_MAX+1)) != (SCHAR_MAX+1)) + { + if (set) abort (); + } + else + if (!set) abort (); +} + +void +test2 (unsigned char c, int set) +{ + if ((c & (SCHAR_MAX+1)) == 0) + { + if (set) abort (); + } + else + if (!set) abort (); + + if ((c & (SCHAR_MAX+1)) != 0) + { + if (!set) abort (); + } + else + if (set) abort (); + + if ((c & (SCHAR_MAX+1)) == (SCHAR_MAX+1)) + { + if (!set) abort (); + } + else + if (set) abort (); + + if ((c & (SCHAR_MAX+1)) != (SCHAR_MAX+1)) + { + if (set) abort (); + } + else + if (!set) abort (); +} + +void +test3 (short s, int set) +{ + if ((s & (SHRT_MAX+1)) == 0) + { + if (set) abort (); + } + else + if (!set) abort (); + + if ((s & (SHRT_MAX+1)) != 0) + { + if (!set) abort (); + } + else + if (set) abort (); + + if ((s & (SHRT_MAX+1)) == (SHRT_MAX+1)) + { + if (!set) abort (); + } + else + if (set) abort (); + + if ((s & (SHRT_MAX+1)) != (SHRT_MAX+1)) + { + if (set) abort (); + } + else + if (!set) abort (); +} + +void +test4 (unsigned short s, int set) +{ + if ((s & (SHRT_MAX+1)) == 0) + { + if (set) abort (); + } + else + if 
(!set) abort (); + + if ((s & (SHRT_MAX+1)) != 0) + { + if (!set) abort (); + } + else + if (set) abort (); + + if ((s & (SHRT_MAX+1)) == (SHRT_MAX+1)) + { + if (!set) abort (); + } + else + if (set) abort (); + + if ((s & (SHRT_MAX+1)) != (SHRT_MAX+1)) + { + if (set) abort (); + } + else + if (!set) abort (); +} + +void +test5 (int i, int set) +{ + if ((i & (INT_MAX+1U)) == 0) + { + if (set) abort (); + } + else + if (!set) abort (); + + if ((i & (INT_MAX+1U)) != 0) + { + if (!set) abort (); + } + else + if (set) abort (); + + if ((i & (INT_MAX+1U)) == (INT_MAX+1U)) + { + if (!set) abort (); + } + else + if (set) abort (); + + if ((i & (INT_MAX+1U)) != (INT_MAX+1U)) + { + if (set) abort (); + } + else + if (!set) abort (); +} + +void +test6 (unsigned int i, int set) +{ + if ((i & (INT_MAX+1U)) == 0) + { + if (set) abort (); + } + else + if (!set) abort (); + + if ((i & (INT_MAX+1U)) != 0) + { + if (!set) abort (); + } + else + if (set) abort (); + + if ((i & (INT_MAX+1U)) == (INT_MAX+1U)) + { + if (!set) abort (); + } + else + if (set) abort (); + + if ((i & (INT_MAX+1U)) != (INT_MAX+1U)) + { + if (set) abort (); + } + else + if (!set) abort (); +} + +void +test7 (long long l, int set) +{ + if ((l & (LONG_LONG_MAX+1ULL)) == 0) + { + if (set) abort (); + } + else + if (!set) abort (); + + if ((l & (LONG_LONG_MAX+1ULL)) != 0) + { + if (!set) abort (); + } + else + if (set) abort (); + + if ((l & (LONG_LONG_MAX+1ULL)) == (LONG_LONG_MAX+1ULL)) + { + if (!set) abort (); + } + else + if (set) abort (); + + if ((l & (LONG_LONG_MAX+1ULL)) != (LONG_LONG_MAX+1ULL)) + { + if (set) abort (); + } + else + if (!set) abort (); +} + +void +test8 (unsigned long long l, int set) +{ + if ((l & (LONG_LONG_MAX+1ULL)) == 0) + { + if (set) abort (); + } + else + if (!set) abort (); + + if ((l & (LONG_LONG_MAX+1ULL)) != 0) + { + if (!set) abort (); + } + else + if (set) abort (); + + if ((l & (LONG_LONG_MAX+1ULL)) == (LONG_LONG_MAX+1ULL)) + { + if (!set) abort (); + } + else + if (set) 
abort (); + + if ((l & (LONG_LONG_MAX+1ULL)) != (LONG_LONG_MAX+1ULL)) + { + if (set) abort (); + } + else + if (!set) abort (); +} + +int +main () +{ + test1 (0, 0); + test1 (SCHAR_MAX, 0); + test1 (SCHAR_MIN, 1); + test1 (UCHAR_MAX, 1); + + test2 (0, 0); + test2 (SCHAR_MAX, 0); + test2 (SCHAR_MIN, 1); + test2 (UCHAR_MAX, 1); + + test3 (0, 0); + test3 (SHRT_MAX, 0); + test3 (SHRT_MIN, 1); + test3 (USHRT_MAX, 1); + + test4 (0, 0); + test4 (SHRT_MAX, 0); + test4 (SHRT_MIN, 1); + test4 (USHRT_MAX, 1); + + test5 (0, 0); + test5 (INT_MAX, 0); + test5 (INT_MIN, 1); + test5 (UINT_MAX, 1); + + test6 (0, 0); + test6 (INT_MAX, 0); + test6 (INT_MIN, 1); + test6 (UINT_MAX, 1); + + test7 (0, 0); + test7 (LONG_LONG_MAX, 0); + test7 (LONG_LONG_MIN, 1); + test7 (ULONG_LONG_MAX, 1); + + test8 (0, 0); + test8 (LONG_LONG_MAX, 0); + test8 (LONG_LONG_MIN, 1); + test8 (ULONG_LONG_MAX, 1); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020508-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020508-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020508-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020508-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,104 @@ +/* This tests the rotate patterns that some machines support. 
*/ + +#include + +#ifndef CHAR_BIT +#define CHAR_BIT 8 +#endif + +#define ROR(a,b) (((a) >> (b)) | ((a) << ((sizeof (a) * CHAR_BIT) - (b)))) +#define ROL(a,b) (((a) << (b)) | ((a) >> ((sizeof (a) * CHAR_BIT) - (b)))) + +#define CHAR_VALUE ((unsigned char)0xf234U) +#define SHORT_VALUE ((unsigned short)0xf234U) +#define INT_VALUE 0xf234U +#define LONG_VALUE 0xf2345678LU +#define LL_VALUE 0xf2345678abcdef0LLU + +#define SHIFT1 4 +#define SHIFT2 ((sizeof (long long) * CHAR_BIT) - SHIFT1) + +unsigned char uc = CHAR_VALUE; +unsigned short us = SHORT_VALUE; +unsigned int ui = INT_VALUE; +unsigned long ul = LONG_VALUE; +unsigned long long ull = LL_VALUE; +int shift1 = SHIFT1; +int shift2 = SHIFT2; + +main () +{ + if (ROR (uc, shift1) != ROR (CHAR_VALUE, SHIFT1)) + abort (); + + if (ROR (uc, SHIFT1) != ROR (CHAR_VALUE, SHIFT1)) + abort (); + + if (ROR (us, shift1) != ROR (SHORT_VALUE, SHIFT1)) + abort (); + + if (ROR (us, SHIFT1) != ROR (SHORT_VALUE, SHIFT1)) + abort (); + + if (ROR (ui, shift1) != ROR (INT_VALUE, SHIFT1)) + abort (); + + if (ROR (ui, SHIFT1) != ROR (INT_VALUE, SHIFT1)) + abort (); + + if (ROR (ul, shift1) != ROR (LONG_VALUE, SHIFT1)) + abort (); + + if (ROR (ul, SHIFT1) != ROR (LONG_VALUE, SHIFT1)) + abort (); + + if (ROR (ull, shift1) != ROR (LL_VALUE, SHIFT1)) + abort (); + + if (ROR (ull, SHIFT1) != ROR (LL_VALUE, SHIFT1)) + abort (); + + if (ROR (ull, shift2) != ROR (LL_VALUE, SHIFT2)) + abort (); + + if (ROR (ull, SHIFT2) != ROR (LL_VALUE, SHIFT2)) + abort (); + + if (ROL (uc, shift1) != ROL (CHAR_VALUE, SHIFT1)) + abort (); + + if (ROL (uc, SHIFT1) != ROL (CHAR_VALUE, SHIFT1)) + abort (); + + if (ROL (us, shift1) != ROL (SHORT_VALUE, SHIFT1)) + abort (); + + if (ROL (us, SHIFT1) != ROL (SHORT_VALUE, SHIFT1)) + abort (); + + if (ROL (ui, shift1) != ROL (INT_VALUE, SHIFT1)) + abort (); + + if (ROL (ui, SHIFT1) != ROL (INT_VALUE, SHIFT1)) + abort (); + + if (ROL (ul, shift1) != ROL (LONG_VALUE, SHIFT1)) + abort (); + + if (ROL (ul, SHIFT1) != ROL 
(LONG_VALUE, SHIFT1)) + abort (); + + if (ROL (ull, shift1) != ROL (LL_VALUE, SHIFT1)) + abort (); + + if (ROL (ull, SHIFT1) != ROL (LL_VALUE, SHIFT1)) + abort (); + + if (ROL (ull, shift2) != ROL (LL_VALUE, SHIFT2)) + abort (); + + if (ROL (ull, SHIFT2) != ROL (LL_VALUE, SHIFT2)) + abort (); + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020508-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020508-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020508-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020508-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,102 @@ +#include + +#ifndef CHAR_BIT +#define CHAR_BIT 8 +#endif + +#define ROR(a,b) (((a) >> (b)) | ((a) << ((sizeof (a) * CHAR_BIT) - (b)))) +#define ROL(a,b) (((a) << (b)) | ((a) >> ((sizeof (a) * CHAR_BIT) - (b)))) + +#define CHAR_VALUE ((char)0x1234) +#define SHORT_VALUE ((short)0x1234) +#define INT_VALUE ((int)0x1234) +#define LONG_VALUE ((long)0x12345678L) +#define LL_VALUE ((long long)0x12345678abcdef0LL) + +#define SHIFT1 4 +#define SHIFT2 ((sizeof (long long) * CHAR_BIT) - SHIFT1) + +char c = CHAR_VALUE; +short s = SHORT_VALUE; +int i = INT_VALUE; +long l = LONG_VALUE; +long long ll = LL_VALUE; +int shift1 = SHIFT1; +int shift2 = SHIFT2; + +main () +{ + if (ROR (c, shift1) != ROR (CHAR_VALUE, SHIFT1)) + abort (); + + if (ROR (c, SHIFT1) != ROR (CHAR_VALUE, SHIFT1)) + abort (); + + if (ROR (s, shift1) != ROR (SHORT_VALUE, SHIFT1)) + abort (); + + if (ROR (s, SHIFT1) != ROR (SHORT_VALUE, SHIFT1)) + abort (); + + if (ROR (i, shift1) != ROR (INT_VALUE, SHIFT1)) + abort (); + + if (ROR (i, SHIFT1) != ROR (INT_VALUE, SHIFT1)) + abort (); + + if (ROR (l, shift1) != ROR (LONG_VALUE, SHIFT1)) + abort (); + + if (ROR (l, SHIFT1) != ROR (LONG_VALUE, SHIFT1)) + 
abort (); + + if (ROR (ll, shift1) != ROR (LL_VALUE, SHIFT1)) + abort (); + + if (ROR (ll, SHIFT1) != ROR (LL_VALUE, SHIFT1)) + abort (); + + if (ROR (ll, shift2) != ROR (LL_VALUE, SHIFT2)) + abort (); + + if (ROR (ll, SHIFT2) != ROR (LL_VALUE, SHIFT2)) + abort (); + + if (ROL (c, shift1) != ROL (CHAR_VALUE, SHIFT1)) + abort (); + + if (ROL (c, SHIFT1) != ROL (CHAR_VALUE, SHIFT1)) + abort (); + + if (ROL (s, shift1) != ROL (SHORT_VALUE, SHIFT1)) + abort (); + + if (ROL (s, SHIFT1) != ROL (SHORT_VALUE, SHIFT1)) + abort (); + + if (ROL (i, shift1) != ROL (INT_VALUE, SHIFT1)) + abort (); + + if (ROL (i, SHIFT1) != ROL (INT_VALUE, SHIFT1)) + abort (); + + if (ROL (l, shift1) != ROL (LONG_VALUE, SHIFT1)) + abort (); + + if (ROL (l, SHIFT1) != ROL (LONG_VALUE, SHIFT1)) + abort (); + + if (ROL (ll, shift1) != ROL (LL_VALUE, SHIFT1)) + abort (); + + if (ROL (ll, SHIFT1) != ROL (LL_VALUE, SHIFT1)) + abort (); + + if (ROL (ll, shift2) != ROL (LL_VALUE, SHIFT2)) + abort (); + + if (ROL (ll, SHIFT2) != ROL (LL_VALUE, SHIFT2)) + abort (); + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020508-3.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020508-3.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020508-3.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020508-3.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,102 @@ +#include + +#ifndef CHAR_BIT +#define CHAR_BIT 8 +#endif + +#define ROR(a,b) (((a) >> (b)) | ((a) << ((sizeof (a) * CHAR_BIT) - (b)))) +#define ROL(a,b) (((a) << (b)) | ((a) >> ((sizeof (a) * CHAR_BIT) - (b)))) + +#define CHAR_VALUE ((char)0xf234) +#define SHORT_VALUE ((short)0xf234) +#define INT_VALUE ((int)0xf234) +#define LONG_VALUE ((long)0xf2345678L) +#define LL_VALUE ((long long)0xf2345678abcdef0LL) + +#define 
SHIFT1 4 +#define SHIFT2 ((sizeof (long long) * CHAR_BIT) - SHIFT1) + +char c = CHAR_VALUE; +short s = SHORT_VALUE; +int i = INT_VALUE; +long l = LONG_VALUE; +long long ll = LL_VALUE; +int shift1 = SHIFT1; +int shift2 = SHIFT2; + +main () +{ + if (ROR (c, shift1) != ROR (CHAR_VALUE, SHIFT1)) + abort (); + + if (ROR (c, SHIFT1) != ROR (CHAR_VALUE, SHIFT1)) + abort (); + + if (ROR (s, shift1) != ROR (SHORT_VALUE, SHIFT1)) + abort (); + + if (ROR (s, SHIFT1) != ROR (SHORT_VALUE, SHIFT1)) + abort (); + + if (ROR (i, shift1) != ROR (INT_VALUE, SHIFT1)) + abort (); + + if (ROR (i, SHIFT1) != ROR (INT_VALUE, SHIFT1)) + abort (); + + if (ROR (l, shift1) != ROR (LONG_VALUE, SHIFT1)) + abort (); + + if (ROR (l, SHIFT1) != ROR (LONG_VALUE, SHIFT1)) + abort (); + + if (ROR (ll, shift1) != ROR (LL_VALUE, SHIFT1)) + abort (); + + if (ROR (ll, SHIFT1) != ROR (LL_VALUE, SHIFT1)) + abort (); + + if (ROR (ll, shift2) != ROR (LL_VALUE, SHIFT2)) + abort (); + + if (ROR (ll, SHIFT2) != ROR (LL_VALUE, SHIFT2)) + abort (); + + if (ROL (c, shift1) != ROL (CHAR_VALUE, SHIFT1)) + abort (); + + if (ROL (c, SHIFT1) != ROL (CHAR_VALUE, SHIFT1)) + abort (); + + if (ROL (s, shift1) != ROL (SHORT_VALUE, SHIFT1)) + abort (); + + if (ROL (s, SHIFT1) != ROL (SHORT_VALUE, SHIFT1)) + abort (); + + if (ROL (i, shift1) != ROL (INT_VALUE, SHIFT1)) + abort (); + + if (ROL (i, SHIFT1) != ROL (INT_VALUE, SHIFT1)) + abort (); + + if (ROL (l, shift1) != ROL (LONG_VALUE, SHIFT1)) + abort (); + + if (ROL (l, SHIFT1) != ROL (LONG_VALUE, SHIFT1)) + abort (); + + if (ROL (ll, shift1) != ROL (LL_VALUE, SHIFT1)) + abort (); + + if (ROL (ll, SHIFT1) != ROL (LL_VALUE, SHIFT1)) + abort (); + + if (ROL (ll, shift2) != ROL (LL_VALUE, SHIFT2)) + abort (); + + if (ROL (ll, SHIFT2) != ROL (LL_VALUE, SHIFT2)) + abort (); + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020510-1.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020510-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020510-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020510-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,84 @@ +/* Copyright (C) 2002 Free Software Foundation. + + Test that optimizing ((c>=1) && (c<=127)) into (signed char)c < 0 + doesn't cause any problems for the compiler and behaves correctly. + + Written by Roger Sayle, 8th May 2002. */ + +#include + +extern void abort (void); + +void +testc (unsigned char c, int ok) +{ + if ((c>=1) && (c<=SCHAR_MAX)) + { + if (!ok) abort (); + } + else + if (ok) abort (); +} + +void +tests (unsigned short s, int ok) +{ + if ((s>=1) && (s<=SHRT_MAX)) + { + if (!ok) abort (); + } + else + if (ok) abort (); +} + +void +testi (unsigned int i, int ok) +{ + if ((i>=1) && (i<=INT_MAX)) + { + if (!ok) abort (); + } + else + if (ok) abort (); +} + +void +testl (unsigned long l, int ok) +{ + if ((l>=1) && (l<=LONG_MAX)) + { + if (!ok) abort (); + } + else + if (ok) abort (); +} + +int +main () +{ + testc (0, 0); + testc (1, 1); + testc (SCHAR_MAX, 1); + testc (SCHAR_MAX+1, 0); + testc (UCHAR_MAX, 0); + + tests (0, 0); + tests (1, 1); + tests (SHRT_MAX, 1); + tests (SHRT_MAX+1, 0); + tests (USHRT_MAX, 0); + + testi (0, 0); + testi (1, 1); + testi (INT_MAX, 1); + testi (INT_MAX+1U, 0); + testi (UINT_MAX, 0); + + testl (0, 0); + testl (1, 1); + testl (LONG_MAX, 1); + testl (LONG_MAX+1UL, 0); + testl (ULONG_MAX, 0); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020529-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020529-1.c?rev=374156&view=auto 
============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020529-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020529-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,82 @@ +/* PR target/6838 from cato at df.lth.se. + cris-elf got an ICE with -O2: the insn matching + (insn 49 48 52 (parallel[ + (set (mem/s:HI (plus:SI (reg/v/f:SI 0 r0 [24]) + (const_int 8 [0x8])) [5 .c+0 S2 A8]) + (reg:HI 2 r2 [27])) + (set (reg/f:SI 2 r2 [31]) + (plus:SI (reg/v/f:SI 0 r0 [24]) + (const_int 8 [0x8]))) + ] ) 24 {*mov_sidehi_mem} (nil) + (nil)) + forced a splitter through the output pattern "#", but there was no + matching splitter. */ + +/* The ptx assembler appears to clobber 'b' inside foo during the f1 call. + Reported to nvidia 2016-05-18. */ +/* { dg-skip-if "PTX assembler bug" { nvptx-*-* } { "-O0" } { "" } } */ + +struct xx + { + int a; + struct xx *b; + short c; + }; + +int f1 (struct xx *); +void f2 (void); + +int +foo (struct xx *p, int b, int c, int d) +{ + int a; + + for (;;) + { + a = f1(p); + if (a) + return (0); + if (b) + continue; + p->c = d; + if (p->a) + f2 (); + if (c) + f2 (); + d = p->c; + switch (a) + { + case 1: + if (p->b) + f2 (); + if (c) + f2 (); + default: + break; + } + } + return d; +} + +int main (void) +{ + struct xx s = {0, &s, 23}; + if (foo (&s, 0, 0, 0) != 0 || s.a != 0 || s.b != &s || s.c != 0) + abort (); + exit (0); +} + +int +f1 (struct xx *p) +{ + static int beenhere = 0; + if (beenhere++ > 1) + abort (); + return beenhere > 1; +} + +void +f2 (void) +{ + abort (); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020611-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020611-1.c?rev=374156&view=auto ============================================================================== --- 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020611-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020611-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,32 @@ +/* PR target/6997. Missing (set_attr "cc" "none") in sleu pattern in + cris.md. Testcase from hp at axis.com. */ + +int p; +int k; +unsigned int n; + +void x () +{ + unsigned int h; + + h = n <= 30; + if (h) + p = 1; + else + p = 0; + + if (h) + k = 1; + else + k = 0; +} + +unsigned int n = 30; + +main () +{ + x (); + if (p != 1 || k != 1) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020614-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020614-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020614-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020614-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,41 @@ +/* PR c/6677 */ +/* Verify that GCC doesn't perform illegal simplifications + when folding constants. 
*/ + +#include <limits.h> + +extern void abort (void); +extern void exit (int); + +int main (void) +{ + int i; + signed char j; + unsigned char k; + + i = SCHAR_MAX; + + j = ((signed char) (i << 1)) / 2; + + if (j != -1) + abort(); + + j = ((signed char) (i * 2)) / 2; + + if (j != -1) + abort(); + + i = UCHAR_MAX; + + k = ((unsigned char) (i << 1)) / 2; + + if (k != UCHAR_MAX/2) + abort(); + + k = ((unsigned char) (i * 2)) / 2; + + if (k != UCHAR_MAX/2) + abort(); + + exit(0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020615-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020615-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020615-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020615-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,59 @@ +/* PR target/7042. When reorg.c changed branches into return insns, it + completely forgot about any current_function_epilogue_delay_list and + dropped those insns. Uncovered on cris-axis-elf, where an insn in an + epilogue delay-slot set the return-value register with the testcase + below. Derived from ghostscript-6.52 (GPL) by hp at axis.com. */ + +typedef struct font_hints_s { + int axes_swapped; + int x_inverted, y_inverted; +} font_hints; +typedef struct gs_fixed_point_s { + long x, y; +} gs_fixed_point; + +int +line_hints(const font_hints *fh, const gs_fixed_point *p0, + const gs_fixed_point *p1) +{ + long dx = p1->x - p0->x; + long dy = p1->y - p0->y; + long adx, ady; + int xi = fh->x_inverted, yi = fh->y_inverted; + int hints; + if (xi) + dx = -dx; + if (yi) + dy = -dy; + if (fh->axes_swapped) { + long t = dx; + int ti = xi; + dx = dy, xi = yi; + dy = t, yi = ti; + } + adx = dx < 0 ? -dx : dx; + ady = dy < 0 ? -dy : dy; + if (dy != 0 && (adx <= ady >> 4)) { + hints = dy > 0 ?
2 : 1; + if (xi) + hints ^= 3; + } else if (dx != 0 && (ady <= adx >> 4)) { + hints = dx < 0 ? 8 : 4; + if (yi) + hints ^= 12; + } else + hints = 0; + return hints; +} +int main () +{ + static font_hints fh[] = {{0, 1, 0}, {0, 0, 1}, {0, 0, 0}}; + static gs_fixed_point gsf[] + = {{0x30000, 0x13958}, {0x30000, 0x18189}, + {0x13958, 0x30000}, {0x18189, 0x30000}}; + if (line_hints (fh, gsf, gsf + 1) != 1 + || line_hints (fh + 1, gsf + 2, gsf + 3) != 8 + || line_hints (fh + 2, gsf + 2, gsf + 3) != 4) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020619-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020619-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020619-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020619-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,31 @@ +#if (__SIZEOF_INT__ == 4) +typedef int int32; +#elif (__SIZEOF_LONG__ == 4) +typedef long int32; +#else +#error Add target support for int32 +#endif +static int32 ref(void) +{ + union { + char c[5]; + int32 i; + } u; + + __builtin_memset (&u, 0, sizeof(u)); + u.c[0] = 1; + u.c[1] = 2; + u.c[2] = 3; + u.c[3] = 4; + + return u.i; +} + +int main() +{ + int32 b = ref(); + if (b != 0x01020304 + && b != 0x04030201) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020716-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020716-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020716-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020716-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 
+1,36 @@ +extern void abort (void); +extern void exit (int); + +int sub1 (int val) +{ + return val; +} + +int testcond (int val) +{ + int flag1; + + { + int t1 = val; + { + int t2 = t1; + { + flag1 = sub1 (t2) ==0; + goto lab1; + }; + } + lab1: ; + } + + if (flag1 != 0) + return 0x4d0000; + else + return 0; +} + +int main (void) +{ + if (testcond (1)) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020720-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020720-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020720-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020720-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,35 @@ +/* Copyright (C) 2002 Free Software Foundation. + + Ensure that fabs(x) < 0.0 optimization is working. + + Written by Roger Sayle, 20th July 2002. */ + +extern void abort (void); +extern double fabs (double); +extern void link_error (void); + +void +foo (double x) +{ + double p, q; + + p = fabs (x); + q = 0.0; + if (p < q) + link_error (); +} + +int +main() +{ + foo (1.0); + return 0; +} + +#ifndef __OPTIMIZE__ +void +link_error () +{ + abort (); +} +#endif Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020805-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020805-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020805-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020805-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,21 @@ +/* This testcase was miscompiled on IA-32, because fold-const + assumed associate_trees is always done on PLUS_EXPR. 
*/ + +extern void abort (void); +extern void exit (int); + +void check (unsigned int m) +{ + if (m != (unsigned int) -1) + abort (); +} + +unsigned int n = 1; + +int main (void) +{ + unsigned int m; + m = (1 | (2 - n)) | (-n); + check (m); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020810-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020810-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020810-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020810-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,38 @@ +/* PR target/7559 + This testcase was miscompiled on x86-64, because classify_argument + wrongly computed the offset of nested structure fields. */ + +extern void abort (void); + +struct A +{ + long x; +}; + +struct R +{ + struct A a, b; +}; + +struct R R = { 100, 200 }; + +void f (struct R r) +{ + if (r.a.x != R.a.x || r.b.x != R.b.x) + abort (); +} + +struct R g (void) +{ + return R; +} + +int main (void) +{ + struct R r; + f(R); + r = g(); + if (r.a.x != R.a.x || r.b.x != R.b.x) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020819-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020819-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020819-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020819-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,22 @@ +foo () +{ + return 0; +} + +main() +{ + int i, j, k, ccp_bad = 0; + + for (i = 0; i < 10; i++) + { + for (j = 0; j < 10; j++) + if (foo ()) + ccp_bad = 1; + + k = ccp_bad != 0; + if (k) + abort 
(); + } + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020904-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020904-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020904-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020904-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,19 @@ +/* PR c/7102 */ + +/* Verify that GCC zero-extends integer constants + in unsigned binary operations. */ + +typedef unsigned char u8; + +u8 fun(u8 y) +{ + u8 x=((u8)255)/y; + return x; +} + +int main(void) +{ + if (fun((u8)2) != 127) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020911-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020911-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020911-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020911-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,8 @@ +extern void abort (void); +unsigned short c = 0x8000; +int main() +{ + if ((c-0x8000) < 0 || (c-0x8000) > 0x7fff) + abort(); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020916-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020916-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020916-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020916-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,18 @@ +/* Distilled from 
try_pre_increment in flow.c. If-conversion inserted + new instructions at the wrong place on ppc. */ + +int foo(int a) +{ + int x; + x = 0; + if (a > 0) x = 1; + if (a < 0) x = 1; + return x; +} + +int main() +{ + if (foo(1) != 1) + abort(); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020920-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020920-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020920-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20020920-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,31 @@ +extern void abort (void); +extern void exit (int); + +struct B +{ + int x; + int y; +}; + +struct A +{ + int z; + struct B b; +}; + +struct A +f () +{ + struct B b = { 0, 1 }; + struct A a = { 2, b }; + return a; +} + +int +main (void) +{ + struct A a = f (); + if (a.z != 2 || a.b.x != 0 || a.b.y != 1) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20021010-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20021010-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20021010-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20021010-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,21 @@ +#include <limits.h> + +int +sub () +{ + int dummy = 0, a = 16; + + if (a / INT_MAX / 16 == 0) + return 0; + else + return a / INT_MAX / 16; +} + +int +main () +{ + if (sub () != 0) + abort (); + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20021010-2.c URL:
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20021010-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20021010-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20021010-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,37 @@ +/* cse.c failure on x86 target. + Contributed by Stuart Hastings 10 Oct 2002 */ +#include + +typedef signed short SInt16; + +typedef struct { + SInt16 minx; + SInt16 maxx; + SInt16 miny; + SInt16 maxy; +} IOGBounds; + +int expectedwidth = 50; + +unsigned int *global_vramPtr = (unsigned int *)0xa000; + +IOGBounds global_bounds = { 100, 150, 100, 150 }; +IOGBounds global_saveRect = { 75, 175, 75, 175 }; + +main() +{ + unsigned int *vramPtr; + int width; + IOGBounds saveRect = global_saveRect; + IOGBounds bounds = global_bounds; + + if (saveRect.minx < bounds.minx) saveRect.minx = bounds.minx; + if (saveRect.maxx > bounds.maxx) saveRect.maxx = bounds.maxx; + + vramPtr = global_vramPtr + (saveRect.miny - bounds.miny) ; + width = saveRect.maxx - saveRect.minx; + + if (width != expectedwidth) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20021011-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20021011-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20021011-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20021011-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,24 @@ +/* PR opt/8165. 
*/ + +extern void abort (void); + +char buf[64]; + +int +main (void) +{ + int i; + + __builtin_strcpy (buf, "mystring"); + if (__builtin_strcmp (buf, "mystring") != 0) + abort (); + + for (i = 0; i < 16; ++i) + { + __builtin_strcpy (buf + i, "mystring"); + if (__builtin_strcmp (buf + i, "mystring") != 0) + abort (); + } + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20021015-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20021015-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20021015-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20021015-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,29 @@ +/* PR opt/7409. */ + +extern void abort (void); + +char g_list[] = { '1' }; + +void g (void *p, char *list, int length, char **elementPtr, char **nextPtr) +{ + if (*nextPtr != g_list) + abort (); + + **nextPtr = 0; +} + +int main (void) +{ + char *list = g_list; + char *element; + int i, length = 100; + + for (i = 0; *list != 0; i++) + { + char *prevList = list; + g (0, list, length, &element, &list); + length -= (list - prevList); + } + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20021024-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20021024-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20021024-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20021024-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,44 @@ +/* Origin: PR target/6981 from Mattias Engdegaard . 
*/ +/* { dg-require-effective-target int32plus } */ + +void exit (int); +void abort (void); + +unsigned long long *cp, m; + +void foo (void) +{ +} + +void bar (unsigned rop, unsigned long long *r) +{ + unsigned rs1, rs2, rd; + +top: + rs2 = (rop >> 23) & 0x1ff; + rs1 = (rop >> 9) & 0x1ff; + rd = rop & 0x1ff; + + *cp = 1; + m = r[rs1] + r[rs2]; + *cp = 2; + foo(); + if (!rd) + goto top; + r[rd] = 1; +} + +int main(void) +{ + static unsigned long long r[64]; + unsigned long long cr; + cp = &cr; + + r[4] = 47; + r[8] = 11; + bar((8 << 23) | (4 << 9) | 15, r); + + if (m != 47 + 11) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20021111-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20021111-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20021111-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20021111-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,31 @@ +/* Origin: PR c/8467 */ + +extern void abort (void); +extern void exit (int); + +int aim_callhandler(int sess, int conn, unsigned short family, unsigned short type); + +int aim_callhandler(int sess, int conn, unsigned short family, unsigned short type) +{ + static int i = 0; + + if (!conn) + return 0; + + if (type == 0xffff) + { + return 0; + } + + if (i >= 1) + abort (); + + i++; + return aim_callhandler(sess, conn, family, (unsigned short) 0xffff); +} + +int main (void) +{ + aim_callhandler (0, 1, 0, 0); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20021113-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20021113-1.c?rev=374156&view=auto ============================================================================== --- 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20021113-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20021113-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,19 @@ +/* { dg-require-effective-target alloca } */ + +/* This program tests a data flow bug that would cause constant propagation + to propagate constants through function calls. */ + +foo (int *p) +{ + *p = 10; +} + +main() +{ + int *ptr = alloca (sizeof (int)); + *ptr = 5; + foo (ptr); + if (*ptr == 5) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20021118-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20021118-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20021118-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20021118-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,15 @@ +struct s { int f[4]; }; + +int foo (struct s s, int x1, int x2, int x3, int x4, int x5, int x6, int x7) +{ + return s.f[3] + x7; +} + +int main () +{ + struct s s = { 1, 2, 3, 4 }; + + if (foo (s, 100, 200, 300, 400, 500, 600, 700) != 704) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20021118-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20021118-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20021118-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20021118-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,50 @@ +/* Originally added to test SH constant pool layout. t1() failed for + non-PIC and t2() failed for PIC. 
*/ + +int t1 (float *f, int i, + void (*f1) (double), + void (*f2) (float, float)) +{ + f1 (3.0); + f[i] = f[i + 1]; + f2 (2.5f, 3.5f); +} + +int t2 (float *f, int i, + void (*f1) (double), + void (*f2) (float, float), + void (*f3) (float)) +{ + f3 (6.0f); + f1 (3.0); + f[i] = f[i + 1]; + f2 (2.5f, 3.5f); +} + +void f1 (double d) +{ + if (d != 3.0) + abort (); +} + +void f2 (float f1, float f2) +{ + if (f1 != 2.5f || f2 != 3.5f) + abort (); +} + +void f3 (float f) +{ + if (f != 6.0f) + abort (); +} + +int main () +{ + float f[3] = { 2.0f, 3.0f, 4.0f }; + t1 (f, 0, f1, f2); + t2 (f, 1, f1, f2, f3); + if (f[0] != 3.0f && f[1] != 4.0f) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20021118-3.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20021118-3.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20021118-3.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20021118-3.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,18 @@ +extern void abort (void); +extern void exit (int); + +int +foo (int x) +{ + if (x == -2 || -x - 100 >= 0) + abort (); + return 0; +} + +int +main () +{ + foo (-3); + foo (-99); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20021119-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20021119-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20021119-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20021119-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,17 @@ +/* PR 8639. 
*/ + +extern void abort(void); + +int foo (int i) +{ + int r; + r = (80 - 4 * i) / 20; + return r; +} + +int main () +{ + if (foo (1) != 3) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20021120-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20021120-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20021120-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20021120-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,58 @@ +/* Macros to emit "L Nxx R" for each octal number xx between 000 and 037. */ +#define OP1(L, N, R, I, J) L N##I##J R +#define OP2(L, N, R, I) \ + OP1(L, N, R, 0, I), OP1(L, N, R, 1, I), \ + OP1(L, N, R, 2, I), OP1(L, N, R, 3, I) +#define OP(L, N, R) \ + OP2(L, N, R, 0), OP2(L, N, R, 1), OP2(L, N, R, 2), OP2(L, N, R, 3), \ + OP2(L, N, R, 4), OP2(L, N, R, 5), OP2(L, N, R, 6), OP2(L, N, R, 7) + +/* Declare 32 unique variables with prefix N. */ +#define DECLARE(N) OP (, N,) + +/* Copy 32 variables with prefix N from the array at ADDR. + Leave ADDR pointing to the end of the array. */ +#define COPYIN(N, ADDR) OP (, N, = *(ADDR++)) + +/* Likewise, but copy the other way. */ +#define COPYOUT(N, ADDR) OP (*(ADDR++) =, N,) + +/* Add the contents of the array at ADDR to 32 variables with prefix N. + Leave ADDR pointing to the end of the array. 
*/ +#define ADD(N, ADDR) OP (, N, += *(ADDR++)) + +volatile double gd[32]; +volatile float gf[32]; + +void foo (int n) +{ + double DECLARE(d); + float DECLARE(f); + volatile double *pd; + volatile float *pf; + int i; + + pd = gd; COPYIN (d, pd); + for (i = 0; i < n; i++) + { + pf = gf; COPYIN (f, pf); + pd = gd; ADD (d, pd); + pd = gd; ADD (d, pd); + pd = gd; ADD (d, pd); + pf = gf; COPYOUT (f, pf); + } + pd = gd; COPYOUT (d, pd); +} + +int main () +{ + int i; + + for (i = 0; i < 32; i++) + gd[i] = i, gf[i] = i; + foo (1); + for (i = 0; i < 32; i++) + if (gd[i] != i * 4 || gf[i] != i) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20021120-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20021120-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20021120-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20021120-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,21 @@ +int g1, g2; + +void foo (int x) +{ + int y; + + if (x) + y = 793; + else + y = 793; + g1 = 7930 / y; + g2 = 7930 / x; +} + +int main () +{ + foo (793); + if (g1 != 10 || g2 != 10) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20021120-3.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20021120-3.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20021120-3.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20021120-3.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,26 @@ +/* Test whether a partly call-clobbered register will be moved over a call. 
+ Although the original test case didn't use any GNUisms, it proved + difficult to reduce without the named register extension. */ +#if __SH64__ == 32 +#define LOC asm ("r10") +#else +#define LOC +#endif + +unsigned int foo (char *c, unsigned int x, unsigned int y) +{ + register unsigned int z LOC; + + sprintf (c, "%d", x / y); + z = x + 1; + return z / (y + 1); +} + +int main () +{ + char c[16]; + + if (foo (c, ~1U, 4) != (~0U / 5)) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20021127-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20021127-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20021127-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20021127-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,16 @@ +/* { dg-options "-std=c99" } */ + +long long a = -1; +long long llabs (long long); +void abort (void); +int +main() +{ + if (llabs (a) != 1) + abort (); + return 0; +} +long long llabs (long long b) +{ + abort (); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20021204-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20021204-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20021204-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20021204-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,25 @@ +/* This test was miscompiled when using sibling call optimization, + because X ? Y : Y - 1 optimization changed X into !X in place + and haven't reverted it if do_store_flag was successful, so + when expanding the expression the second time it was + !X ? Y : Y - 1. 
*/ + +extern void abort (void); +extern void exit (int); + +void foo (int x) +{ + if (x != 1) + abort (); +} + +int z; + +int main (int argc, char **argv) +{ + char *a = "test"; + char *b = a + 2; + + foo (z > 0 ? b - a : b - a - 1); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20021219-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20021219-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20021219-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20021219-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,18 @@ +/* PR optimization/8988 */ +/* Contributed by Kevin Easton */ + +void foo(char *p1, char **p2) +{} + +int main(void) +{ + char str[] = "foo { xx }"; + char *ptr = str + 5; + + foo(ptr, &ptr); + + while (*ptr && (*ptr == 13 || *ptr == 32)) + ptr++; + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030105-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030105-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030105-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030105-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,20 @@ +int __attribute__ ((noinline)) +foo () +{ + const int a[8] = { 0, 1, 2, 3, 4, 5, 6, 7 }; + int i, sum; + + sum = 0; + for (i = 0; i < sizeof (a) / sizeof (*a); i++) + sum += a[i]; + + return sum; +} + +int +main () +{ + if (foo () != 28) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030109-1.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030109-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030109-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030109-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,20 @@ +/* PR c/8032 */ +/* Verify that an empty initializer inside a partial + parent initializer doesn't confuse GCC. */ + +struct X +{ + int a; + int b; + int z[]; +}; + +struct X x = { .b = 40, .z = {} }; + +int main () +{ + if (x.b != 40) + abort (); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030117-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030117-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030117-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030117-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,23 @@ +int foo (int, int, int); +int bar (int, int, int); + +int main (void) +{ + if (foo (5, 10, 21) != 12) + abort (); + + if (bar (9, 12, 15) != 150) + abort (); + + exit (0); +} + +int foo (int x, int y, int z) +{ + return (x + y + z) / 3; +} + +int bar (int x, int y, int z) +{ + return foo (x * x, y * y, z * z); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030120-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030120-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030120-1.c (added) +++ 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030120-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,50 @@ +/* On H8/300 port, NOTICE_UPDATE_CC had a bug that causes the final + pass to remove test insns that should be kept. */ + +unsigned short +test1 (unsigned short w) +{ + if ((w & 0xff00) == 0) + { + if (w == 0) + w = 2; + } + return w; +} + +unsigned long +test2 (unsigned long w) +{ + if ((w & 0xffff0000) == 0) + { + if (w == 0) + w = 2; + } + return w; +} + +int +test3 (unsigned short a) +{ + if (a & 1) + return 1; + else if (a) + return 1; + else + return 0; +} + +int +main () +{ + if (test1 (1) != 1) + abort (); + + if (test2 (1) != 1) + abort (); + + if (test3 (2) != 1) + abort (); + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030120-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030120-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030120-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030120-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,19 @@ +/* PR 8848 */ + +extern void abort (); + +int foo(int status) +{ + int s = 0; + if (status == 1) s=1; + if (status == 3) s=3; + if (status == 4) s=4; + return s; +} + +int main() +{ + if (foo (3) != 3) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030125-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030125-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030125-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030125-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,56 @@ +/* 
Verify whether math functions are simplified. */ +/* { dg-require-effective-target c99_runtime } */ +/* { dg-require-weak "" } */ +double sin(double); +double floor(double); +float +t(float a) +{ + return sin(a); +} +float +q(float a) +{ + return floor(a); +} +double +q1(float a) +{ + return floor(a); +} +main() +{ +#ifdef __OPTIMIZE__ + if (t(0)!=0) + abort (); + if (q(0)!=0) + abort (); + if (q1(0)!=0) + abort (); +#endif + return 0; +} +__attribute__ ((weak)) +double +floor(double a) +{ + abort (); +} +__attribute__ ((weak)) +float +floorf(float a) +{ + return a; +} +__attribute__ ((weak)) +double +sin(double a) +{ + return a; +} +__attribute__ ((weak)) +float +sinf(float a) +{ + abort (); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030128-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030128-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030128-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030128-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,10 @@ +unsigned char x = 50; +volatile short y = -5; + +int main () +{ + x /= y; + if (x != (unsigned char) -10) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030203-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030203-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030203-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030203-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,23 @@ +void f(int); +int do_layer3(int single) +{ + int stereo1; + + if(single >= 0) /* stream is stereo, but force to mono */ + 
stereo1 = 1; + else + stereo1 = 2; + f(single); + + return stereo1; +} + +extern void abort (); +int main() +{ + if (do_layer3(-1) != 2) + abort (); + return 0; +} + +void f(int i) {} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030209-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030209-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030209-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030209-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,13 @@ +/* { dg-require-stack-size "8*100*100" } */ + +double x[100][100]; +int main () +{ + int i; + + i = 99; + x[i][0] = 42; + if (x[99][0] != 42) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030216-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030216-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030216-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030216-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,15 @@ +/* This test requires constant propagation of loads and stores to be + enabled. This is only guaranteed at -O2 and higher. Do not run + at -O1. 
*/ +/* { dg-skip-if "requires higher optimization" { *-*-* } "-O1" "" } */ + +void link_error (void); +const double one=1.0; +main () +{ +#ifdef __OPTIMIZE__ + if ((int) one != 1) + link_error (); +#endif + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030218-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030218-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030218-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030218-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,25 @@ +/* On H8, the predicate general_operand_src(op,mode) used to ignore + mode when op is a (mem (post_inc ...)). As a result, the pattern + for extendhisi2 was recognized as extendqisi2. */ + +extern void abort (); +extern void exit (int); + +short *q; + +long +foo (short *p) +{ + long b = *p; + q = p + 1; + return b; +} + +int +main () +{ + short a = 0xff00; + if (foo (&a) != (long) (short) 0xff00) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030221-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030221-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030221-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030221-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,17 @@ +/* PR optimization/8613 */ +/* Contributed by Glen Nakamura */ + +extern void abort (void); + +int main (void) +{ + char buf[16] = "1234567890"; + char *p = buf; + + *p++ = (char) __builtin_strlen (buf); + + if ((buf[0] != 10) || (p - buf != 1)) + abort (); + + return 0; +} Added: 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030222-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030222-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030222-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030222-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,28 @@ +/* Verify that we get the low part of the long long as an int. We + used to get it wrong on big-endian machines, if register allocation + succeeded at all. We use volatile to make sure the long long is + actually truncated to int, in case a single register is wide enough + for a long long. */ +/* { dg-skip-if "asm would require extra shift-left-4-byte" { spu-*-* } } */ +/* { dg-skip-if "asm requires register allocation" { nvptx-*-* } } */ +#include <limits.h> + +void +ll_to_int (long long x, volatile int *p) +{ + int i; + asm ("" : "=r" (i) : "0" (x)); + *p = i; +} + +int val = INT_MIN + 1; + +int main() { + volatile int i; + + ll_to_int ((long long)val, &i); + if (i != val) + abort (); + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030224-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030224-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030224-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030224-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,28 @@ +/* Make sure that we don't free any temp stack slots associated with + initializing marker before we're finished with them.
*/ + +extern void abort(); + +typedef struct { short v16; } __attribute__((packed)) jint16_t; + +struct node { + jint16_t magic; + jint16_t nodetype; + int totlen; +} __attribute__((packed)); + +struct node node, *node_p = &node; + +int main() +{ + struct node marker = { + .magic = (jint16_t) {0x1985}, + .nodetype = (jint16_t) {0x2003}, + .totlen = node_p->totlen + }; + if (marker.magic.v16 != 0x1985) + abort(); + if (marker.nodetype.v16 != 0x2003) + abort(); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030307-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030307-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030307-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030307-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,26 @@ +/* PR optimization/8726 */ +/* Originator: Paul Eggert */ + +/* Verify that GCC doesn't miscompile tail calls on Sparc. 
*/ + +extern void abort(void); + +int fcntl_lock(int fd, int op, long long offset, long long count, int type); + +int vfswrap_lock(char *fsp, int fd, int op, long long offset, long long count, int type) +{ + return fcntl_lock(fd, op, offset, count, type); +} + +int fcntl_lock(int fd, int op, long long offset, long long count, int type) +{ + return type; +} + +int main(void) +{ + if (vfswrap_lock (0, 1, 2, 3, 4, 5) != 5) + abort(); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030313-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030313-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030313-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030313-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,68 @@ +struct A +{ + unsigned long p, q, r, s; +} x = { 13, 14, 15, 16 }; + +extern void abort (void); +extern void exit (int); + +static inline struct A * +bar (void) +{ + struct A *r; + + switch (8) + { + case 2: + abort (); + break; + case 8: + r = &x; + break; + default: + abort (); + break; + } + return r; +} + +void +foo (unsigned long *x, int y) +{ + if (y != 12) + abort (); + if (x[0] != 1 || x[1] != 11) + abort (); + if (x[2] != 2 || x[3] != 12) + abort (); + if (x[4] != 3 || x[5] != 13) + abort (); + if (x[6] != 4 || x[7] != 14) + abort (); + if (x[8] != 5 || x[9] != 15) + abort (); + if (x[10] != 6 || x[11] != 16) + abort (); +} + +int +main (void) +{ + unsigned long a[40]; + int b = 0; + + a[b++] = 1; + a[b++] = 11; + a[b++] = 2; + a[b++] = 12; + a[b++] = 3; + a[b++] = bar()->p; + a[b++] = 4; + a[b++] = bar()->q; + a[b++] = 5; + a[b++] = bar()->r; + a[b++] = 6; + a[b++] = bar()->s; + foo (a, b); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030316-1.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030316-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030316-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030316-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,12 @@ +/* PR target/9164 */ +/* The comparison operand was sign extended erraneously. */ + +int +main (void) +{ + long j = 0x40000000; + if ((unsigned int) (0x40000000 + j) < 0L) + abort (); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030323-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030323-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030323-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030323-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,115 @@ +/* PR opt/10116 */ +/* { dg-require-effective-target return_address } */ +/* Removed tablejump while label still in use; this is really a link test. 
*/ + +void *NSReturnAddress(int offset) +{ + switch (offset) { + case 0: return __builtin_return_address(0 + 1); + case 1: return __builtin_return_address(1 + 1); + case 2: return __builtin_return_address(2 + 1); + case 3: return __builtin_return_address(3 + 1); + case 4: return __builtin_return_address(4 + 1); + case 5: return __builtin_return_address(5 + 1); + case 6: return __builtin_return_address(6 + 1); + case 7: return __builtin_return_address(7 + 1); + case 8: return __builtin_return_address(8 + 1); + case 9: return __builtin_return_address(9 + 1); + case 10: return __builtin_return_address(10 + 1); + case 11: return __builtin_return_address(11 + 1); + case 12: return __builtin_return_address(12 + 1); + case 13: return __builtin_return_address(13 + 1); + case 14: return __builtin_return_address(14 + 1); + case 15: return __builtin_return_address(15 + 1); + case 16: return __builtin_return_address(16 + 1); + case 17: return __builtin_return_address(17 + 1); + case 18: return __builtin_return_address(18 + 1); + case 19: return __builtin_return_address(19 + 1); + case 20: return __builtin_return_address(20 + 1); + case 21: return __builtin_return_address(21 + 1); + case 22: return __builtin_return_address(22 + 1); + case 23: return __builtin_return_address(23 + 1); + case 24: return __builtin_return_address(24 + 1); + case 25: return __builtin_return_address(25 + 1); + case 26: return __builtin_return_address(26 + 1); + case 27: return __builtin_return_address(27 + 1); + case 28: return __builtin_return_address(28 + 1); + case 29: return __builtin_return_address(29 + 1); + case 30: return __builtin_return_address(30 + 1); + case 31: return __builtin_return_address(31 + 1); + case 32: return __builtin_return_address(32 + 1); + case 33: return __builtin_return_address(33 + 1); + case 34: return __builtin_return_address(34 + 1); + case 35: return __builtin_return_address(35 + 1); + case 36: return __builtin_return_address(36 + 1); + case 37: return 
__builtin_return_address(37 + 1); + case 38: return __builtin_return_address(38 + 1); + case 39: return __builtin_return_address(39 + 1); + case 40: return __builtin_return_address(40 + 1); + case 41: return __builtin_return_address(41 + 1); + case 42: return __builtin_return_address(42 + 1); + case 43: return __builtin_return_address(43 + 1); + case 44: return __builtin_return_address(44 + 1); + case 45: return __builtin_return_address(45 + 1); + case 46: return __builtin_return_address(46 + 1); + case 47: return __builtin_return_address(47 + 1); + case 48: return __builtin_return_address(48 + 1); + case 49: return __builtin_return_address(49 + 1); + case 50: return __builtin_return_address(50 + 1); + case 51: return __builtin_return_address(51 + 1); + case 52: return __builtin_return_address(52 + 1); + case 53: return __builtin_return_address(53 + 1); + case 54: return __builtin_return_address(54 + 1); + case 55: return __builtin_return_address(55 + 1); + case 56: return __builtin_return_address(56 + 1); + case 57: return __builtin_return_address(57 + 1); + case 58: return __builtin_return_address(58 + 1); + case 59: return __builtin_return_address(59 + 1); + case 60: return __builtin_return_address(60 + 1); + case 61: return __builtin_return_address(61 + 1); + case 62: return __builtin_return_address(62 + 1); + case 63: return __builtin_return_address(63 + 1); + case 64: return __builtin_return_address(64 + 1); + case 65: return __builtin_return_address(65 + 1); + case 66: return __builtin_return_address(66 + 1); + case 67: return __builtin_return_address(67 + 1); + case 68: return __builtin_return_address(68 + 1); + case 69: return __builtin_return_address(69 + 1); + case 70: return __builtin_return_address(70 + 1); + case 71: return __builtin_return_address(71 + 1); + case 72: return __builtin_return_address(72 + 1); + case 73: return __builtin_return_address(73 + 1); + case 74: return __builtin_return_address(74 + 1); + case 75: return 
__builtin_return_address(75 + 1); + case 76: return __builtin_return_address(76 + 1); + case 77: return __builtin_return_address(77 + 1); + case 78: return __builtin_return_address(78 + 1); + case 79: return __builtin_return_address(79 + 1); + case 80: return __builtin_return_address(80 + 1); + case 81: return __builtin_return_address(81 + 1); + case 82: return __builtin_return_address(82 + 1); + case 83: return __builtin_return_address(83 + 1); + case 84: return __builtin_return_address(84 + 1); + case 85: return __builtin_return_address(85 + 1); + case 86: return __builtin_return_address(86 + 1); + case 87: return __builtin_return_address(87 + 1); + case 88: return __builtin_return_address(88 + 1); + case 89: return __builtin_return_address(89 + 1); + case 90: return __builtin_return_address(90 + 1); + case 91: return __builtin_return_address(91 + 1); + case 92: return __builtin_return_address(92 + 1); + case 93: return __builtin_return_address(93 + 1); + case 94: return __builtin_return_address(94 + 1); + case 95: return __builtin_return_address(95 + 1); + case 96: return __builtin_return_address(96 + 1); + case 97: return __builtin_return_address(97 + 1); + case 98: return __builtin_return_address(98 + 1); + case 99: return __builtin_return_address(99 + 1); + } + return 0; +} + +int main() +{ + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030330-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030330-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030330-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030330-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,14 @@ +/* PR opt/10011 */ +/* This is link test for builtin_constant_p simplification + DCE. 
*/ + +extern void link_error(void); +static void usb_hub_port_wait_reset(unsigned int delay) +{ + int delay_time; + for (delay_time = 0; delay_time < 500; delay_time += delay) { + if (__builtin_constant_p(delay)) + link_error(); + } +} + +int main() { return 0; } Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030401-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030401-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030401-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030401-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,26 @@ +/* Testcase for PR fortran/9974. This was a miscompilation of the g77 + front-end caused by the jump bypassing optimizations not handling + instructions inserted on CFG edges. */ + +extern void abort (); + +int bar () +{ + return 1; +} + +void foo (int x) +{ + unsigned char error = 0; + + if (! (error = ((x == 0) || bar ()))) + bar (); + if (! error) + abort (); +} + +int main() +{ + foo (1); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030403-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030403-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030403-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030403-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,15 @@ +/* The non-destructive folder was always emitting >= when folding + comparisons to signed_max+1. 
*/ + +#include <limits.h> + +int +main () +{ + unsigned long count = 8; + + if (count > INT_MAX) + abort (); + + return (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030404-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030404-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030404-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030404-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,23 @@ +/* This exposed a bug in tree-ssa-ccp.c. Since 'j' and 'i' are never + defined, CCP was not traversing the edges out of the if(), which caused + the PHI node for 'k' at the top of the while to only be visited once. + This ended up causing CCP to think that 'k' was the constant '1'. */ +main() +{ + int i, j, k; + + k = 0; + while (k < 10) + { + k++; + if (j > i) + j = 5; + else + j =3; + } + + if (k != 10) + abort (); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030408-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030408-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030408-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030408-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,69 @@ +/* PR optimization/8634 */ +/* Contributed by Glen Nakamura */ + +extern void abort (void); + +struct foo { + char a, b, c, d, e, f, g, h, i, j; +}; + +int test1 () +{ + const char X[8] = { 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H' }; + char buffer[8]; + __builtin_memcpy (buffer, X, 8); + if (buffer[0] != 'A' || buffer[1] != 'B' + || buffer[2] != 'C' || buffer[3] != 'D' + || buffer[4] != 'E' || buffer[5] !=
'F' + || buffer[6] != 'G' || buffer[7] != 'H') + abort (); + return 0; +} + +int test2 () +{ + const char X[10] = { 'A', 'B', 'C', 'D', 'E' }; + char buffer[10]; + __builtin_memcpy (buffer, X, 10); + if (buffer[0] != 'A' || buffer[1] != 'B' + || buffer[2] != 'C' || buffer[3] != 'D' + || buffer[4] != 'E' || buffer[5] != '\0' + || buffer[6] != '\0' || buffer[7] != '\0' + || buffer[8] != '\0' || buffer[9] != '\0') + abort (); + return 0; +} + +int test3 () +{ + const struct foo X = { a : 'A', c : 'C', e : 'E', g : 'G', i : 'I' }; + char buffer[10]; + __builtin_memcpy (buffer, &X, 10); + if (buffer[0] != 'A' || buffer[1] != '\0' + || buffer[2] != 'C' || buffer[3] != '\0' + || buffer[4] != 'E' || buffer[5] != '\0' + || buffer[6] != 'G' || buffer[7] != '\0' + || buffer[8] != 'I' || buffer[9] != '\0') + abort (); + return 0; +} + +int test4 () +{ + const struct foo X = { .b = 'B', .d = 'D', .f = 'F', .h = 'H' , .j = 'J' }; + char buffer[10]; + __builtin_memcpy (buffer, &X, 10); + if (buffer[0] != '\0' || buffer[1] != 'B' + || buffer[2] != '\0' || buffer[3] != 'D' + || buffer[4] != '\0' || buffer[5] != 'F' + || buffer[6] != '\0' || buffer[7] != 'H' + || buffer[8] != '\0' || buffer[9] != 'J') + abort (); + return 0; +} + +int main () +{ + test1 (); test2 (); test3 (); test4 (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030501-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030501-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030501-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030501-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,17 @@ +int +main (int argc, char **argv) +{ + int size = 10; + + { + int retframe_block() + { + return size + 5; + } + + if (retframe_block() != 15) + abort (); + exit (0); + + } +} 
Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030606-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030606-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030606-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030606-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,27 @@ + +int * foo (int *x, int b) +{ + + *(x++) = 55; + if (b) + *(x++) = b; + + return x; +} + +main() +{ + int a[5]; + + memset (a, 1, sizeof (a)); + + if (foo(a, 0) - a != 1 || a[0] != 55 || a[1] != a[4]) + abort(); + + memset (a, 1, sizeof (a)); + + if (foo(a, 2) - a != 2 || a[0] != 55 || a[1] != 2) + abort(); + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030613-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030613-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030613-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030613-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,62 @@ +/* PR optimization/10955 */ +/* Originator: */ + +/* This used to fail on SPARC32 at -O3 because the loop unroller + wrongly thought it could eliminate a pseudo in a loop, while + the pseudo was used outside the loop. 
*/ + +extern void abort(void); + +#define COMPLEX struct CS + +COMPLEX { + long x; + long y; +}; + + +static COMPLEX CCID (COMPLEX x) +{ + COMPLEX a; + + a.x = x.x; + a.y = x.y; + + return a; +} + + +static COMPLEX CPOW (COMPLEX x, int y) +{ + COMPLEX a; + a = x; + + while (--y > 0) + a=CCID(a); + + return a; +} + + +static int c5p (COMPLEX x) +{ + COMPLEX a,b; + a = CPOW (x, 2); + b = CCID( CPOW(a,2) ); + + return (b.x == b.y); +} + + +int main (void) +{ + COMPLEX x; + + x.x = -7; + x.y = -7; + + if (!c5p(x)) + abort(); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030626-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030626-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030626-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030626-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,12 @@ +char buf[10]; + +extern void abort (void); +extern int sprintf (char*, const char*, ...); + +int main() +{ + int l = sprintf (buf, "foo\0bar"); + if (l != 3) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030626-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030626-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030626-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030626-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,13 @@ +char buf[40]; + +extern int sprintf (char*, const char*, ...); +extern void abort (void); + +int main() +{ + int i = 0; + int l = sprintf (buf, "%s", i++ ? 
"string" : "other string"); + if (l != sizeof ("other string") - 1 || i != 1) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030714-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030714-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030714-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030714-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,193 @@ +/* derived from PR optimization/11440 */ + +extern void abort (void); +extern void exit (int); + +typedef _Bool bool; +const bool false = 0; +const bool true = 1; + +enum EPosition +{ + STATIC, RELATIVE, ABSOLUTE, FIXED +}; +typedef enum EPosition EPosition; + +enum EFloat +{ + FNONE = 0, FLEFT, FRIGHT +}; +typedef enum EFloat EFloat; + +struct RenderBox +{ + int unused[6]; + short m_verticalPosition; + + bool m_layouted : 1; + bool m_unused : 1; + bool m_minMaxKnown : 1; + bool m_floating : 1; + + bool m_positioned : 1; + bool m_overhangingContents : 1; + bool m_relPositioned : 1; + bool m_paintSpecial : 1; + + bool m_isAnonymous : 1; + bool m_recalcMinMax : 1; + bool m_isText : 1; + bool m_inline : 1; + + bool m_replaced : 1; + bool m_mouseInside : 1; + bool m_hasFirstLine : 1; + bool m_isSelectionBorder : 1; + + bool (*isTableCell) (struct RenderBox *this); +}; + +typedef struct RenderBox RenderBox; + +struct RenderStyle +{ + struct NonInheritedFlags + { + union + { + struct + { + unsigned int _display : 4; + unsigned int _bg_repeat : 2; + bool _bg_attachment : 1; + unsigned int _overflow : 4 ; + unsigned int _vertical_align : 4; + unsigned int _clear : 2; + EPosition _position : 2; + EFloat _floating : 2; + unsigned int _table_layout : 1; + bool _flowAroundFloats :1; + + unsigned int _styleType : 3; + bool _hasHover : 1; + bool _hasActive : 1; + 
bool _clipSpecified : 1; + unsigned int _unicodeBidi : 2; + int _unused : 1; + } f; + int _niflags; + }; + } noninherited_flags; +}; + +typedef struct RenderStyle RenderStyle; + +extern void RenderObject_setStyle(RenderBox *this, RenderStyle *_style); +extern void removeFromSpecialObjects(RenderBox *this); + + + +void RenderBox_setStyle(RenderBox *thisin, RenderStyle *_style) +{ + RenderBox *this = thisin; + bool oldpos, tmp; + EPosition tmppo; + + tmp = this->m_positioned; + + oldpos = tmp; + + RenderObject_setStyle(this, _style); + + tmppo = _style->noninherited_flags.f._position; + + switch(tmppo) + { + case ABSOLUTE: + case FIXED: + { + bool ltrue = true; + this->m_positioned = ltrue; + break; + } + + default: + { + EFloat tmpf; + EPosition tmpp; + if (oldpos) + { + bool ltrue = true; + this->m_positioned = ltrue; + removeFromSpecialObjects(this); + } + { + bool lfalse = false; + this->m_positioned = lfalse; + } + + tmpf = _style->noninherited_flags.f._floating; + + if(!this->isTableCell (this) && !(tmpf == FNONE)) + { + bool ltrue = true; + this->m_floating = ltrue; + } + else + { + tmpp = _style->noninherited_flags.f._position; + if (tmpp == RELATIVE) + { + bool ltrue = true; + this->m_relPositioned = ltrue; + } + } + } + } +} + + + + +RenderBox g_this; +RenderStyle g__style; + +void RenderObject_setStyle(RenderBox *this, RenderStyle *_style) +{ + (void) this; + (void) _style; +} + +void removeFromSpecialObjects(RenderBox *this) +{ + (void) this; +} + +bool RenderBox_isTableCell (RenderBox *this) +{ + (void) this; + return false; +} + +int main (void) +{ + + g_this.m_relPositioned = false; + g_this.m_positioned = false; + g_this.m_floating = false; + g_this.isTableCell = RenderBox_isTableCell; + + g__style.noninherited_flags.f._position = FIXED; + g__style.noninherited_flags.f._floating = FNONE; + + RenderBox_setStyle (&g_this, &g__style); + + if (g_this.m_positioned != true) + abort (); + if (g_this.m_relPositioned != false) + abort (); + if 
(g_this.m_floating != false) + abort (); + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030715-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030715-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030715-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030715-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,35 @@ +/* PR optimization/11320 */ +/* Origin: Andreas Schwab */ + +/* Verify that the scheduler correctly computes the dependencies + in the presence of conditional instructions. */ + +int strcmp (const char *, const char *); +int ap_standalone; + +const char *ap_check_cmd_context (void *a, int b) +{ + return 0; +} + +const char *server_type (void *a, void *b, char *arg) +{ + const char *err = ap_check_cmd_context (a, 0x01|0x02|0x04|0x08|0x10); + if (err) + return err; + + if (!strcmp (arg, "inetd")) + ap_standalone = 0; + else if (!strcmp (arg, "standalone")) + ap_standalone = 1; + else + return "ServerType must be either 'inetd' or 'standalone'"; + + return 0; +} + +int main () +{ + server_type (0, 0, "standalone"); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030717-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030717-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030717-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030717-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,69 @@ +/* PR target/11087 + This testcase was miscompiled on ppc64, because basic_induction_var called + convert_modes, yet did not expect it to emit any new instructions. 
+ Those were emitted at the end of the function and destroyed during life + analysis, while the program used uninitialized pseudos created by + convert_modes. */ + +struct A +{ + unsigned short a1; + unsigned long a2; +}; + +struct B +{ + int b1, b2, b3, b4, b5; +}; + +struct C +{ + struct B c1[1]; + int c2, c3; +}; + +static +int foo (int x) +{ + return x < 0 ? -x : x; +} + +int bar (struct C *x, struct A *y) +{ + int a = x->c3; + const int b = y->a1 >> 9; + const unsigned long c = y->a2; + int d = a; + unsigned long e, f; + + f = foo (c - x->c1[d].b4); + do + { + if (d <= 0) + d = x->c2; + d--; + + e = foo (c-x->c1[d].b4); + if (e < f) + a = d; + } + while (d != x->c3); + x->c1[a].b4 = c + b; + return a; +} + +int +main () +{ + struct A a; + struct C b; + int c; + + a.a1 = 512; + a.a2 = 4242; + __builtin_memset (&b, 0, sizeof (b)); + b.c1[0].b3 = 424242; + b.c2 = 1; + c = bar (&b, &a); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030718-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030718-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030718-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030718-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,11 @@ +/* PR c/10320 + The function temp was not being emitted in a prerelease of 3.4 20030406. 
+ Contributed by pinskia at physics.uc.edu */ + +static inline void temp(); +int main() +{ + temp(); + return 0; +} +static void temp(){} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030811-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030811-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030811-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030811-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,36 @@ +/* Origin: PR target/11535 from H. J. Lu */ +/* { dg-require-effective-target return_address } */ + +void vararg (int i, ...) +{ + (void) i; +} + +int i0[0], i1; + +void test1 (void) +{ + int a = (int) (long long) __builtin_return_address (0); + vararg (0, a); +} + +void test2 (void) +{ + i0[0] = (int) (long long) __builtin_return_address (0); +} + +void test3 (void) +{ + i1 = (int) (long long) __builtin_return_address (0); +} + +void test4 (void) +{ + volatile long long a = (long long) __builtin_return_address (0); + i0[0] = (int) a; +} + +int main (void) +{ + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030821-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030821-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030821-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030821-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,16 @@ +extern void abort (void); + +int +foo (int x) +{ + if ((int) (x & 0x80ffffff) != (int) (0x8000fffe)) + abort (); + + return 0; +} + +int +main () +{ + return foo (0x8000fffe); +} Added: 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030828-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030828-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030828-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030828-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,18 @@ +const int *p; + +int bar (void) +{ + return *p + 1; +} + +main () +{ + /* Variable 'i' is never used but it's aliased to a global pointer. The + alias analyzer was not considering that 'i' may be used in the call to + bar(). */ + const int i = 5; + p = &i; + if (bar() != 6) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030828-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030828-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030828-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030828-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,28 @@ +struct rtx_def +{ + int code; +}; + +main() +{ + int tmp[2]; + struct rtx_def *r, s; + int *p, *q; + + /* The alias analyzer was creating the same memory tag for r, p and q + because 'struct rtx_def *' is type-compatible with 'int *'. However, + the alias set of 'int[2]' is not the same as 'int *', so variable + 'tmp' was deemed not aliased with anything. */ + r = &s; + r->code = 39; + + /* If 'r' wasn't declared, then q and tmp would have had the same memory + tag. 
*/ + p = tmp; + q = p + 1; + *q = 0; + tmp[1] = 39; + if (*q != 39) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030903-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030903-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030903-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030903-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,21 @@ +/* Test that we don't let stmt.c think that the enumeration's values are + the entire set of possibilities. Such an assumption is false for C, + but true for other languages. */ + +enum X { X1 = 1, X2, X3, X4 }; +static volatile enum X test = 0; +static void y(int); + +int main() +{ + switch (test) + { + case X1: y(1); break; + case X2: y(2); break; + case X3: y(3); break; + case X4: y(4); break; + } + return 0; +} + +static void y(int x) { abort (); } Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030909-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030909-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030909-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030909-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,33 @@ +void abort (); +void exit (int); + +void test(int x, int y) +{ + if (x == y) + abort (); +} + +void foo(int x, int y) +{ + if (x == y) + goto a; + else + { +a:; + if (x == y) + goto b; + else + { +b:; + if (x != y) + test (x, y); + } + } +} + +int main(void) +{ + foo (0, 0); + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030910-1.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030910-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030910-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030910-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,12 @@ +/* The gimplifier was inserting unwanted temporaries for REALPART_EXPR + nodes. These need to be treated like a COMPONENT_REF so their address can + be taken. */ + +int main() +{ + __complex double dc; + double *dp = &(__real dc); + *dp = 3.14; + if ((__real dc) != 3.14) abort(); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030913-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030913-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030913-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030913-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,26 @@ +/* Assignments via pointers pointing to global variables were being killed + by SSA-DCE. 
Test contributed by Paul Brook */ + +int glob; + +void +fn2(int ** q) +{ + *q = &glob; +} + +void test() +{ + int *p; + + fn2(&p); + + *p=42; +} + +int main() +{ + test(); + if (glob != 42) abort(); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030914-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030914-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030914-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030914-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,26 @@ +/* On IRIX 6, PB is passed partially in registers and partially on the + stack, with an odd number of words in the register part. Check that + the long double stack argument (PC) is still accessed properly. */ + +struct s { int val[16]; }; + +long double f (int pa, struct s pb, long double pc) +{ + int i; + + for (i = 0; i < 16; i++) + pc += pb.val[i]; + return pc; +} + +int main () +{ + struct s x; + int i; + + for (i = 0; i < 16; i++) + x.val[i] = i + 1; + if (f (1, x, 10000.0L) != 10136.0L) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030914-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030914-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030914-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030914-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,21 @@ +/* On IRIX 6, PA is passed partially in registers and partially on the + stack. We therefore have two potential uses of pretend_args_size: + one for the partial argument and one for the varargs save area. 
+ Make sure that these uses don't conflict. */ + +struct s { int i[18]; }; + +int f (struct s pa, int pb, ...) +{ + return pb; +} + +struct s gs; + +int main () +{ + if (f (gs, 0x1234) != 0x1234) + abort (); + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030916-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030916-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030916-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030916-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,35 @@ +/* "i" overflows in f(). Check that x[i] is not treated as a giv. */ +#include + +#if CHAR_BIT == 8 + +void f (unsigned int *x) +{ + unsigned char i; + int j; + + i = 0x10; + for (j = 0; j < 0x10; j++) + { + i += 0xe8; + x[i] = 0; + i -= 0xe7; + } +} + +int main () +{ + unsigned int x[256]; + int i; + + for (i = 0; i < 256; i++) + x[i] = 1; + f (x); + for (i = 0; i < 256; i++) + if (x[i] != (i >= 0x08 && i < 0xf8)) + abort (); + exit (0); +} +#else +int main () { exit (0); } +#endif Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030920-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030920-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030920-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030920-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,14 @@ +extern void abort (void); + +int main() +{ + int hicount = 0; + unsigned char *c; + char *str = "\x7f\xff"; + for (c = (unsigned char *)str; *c ; c++) { + if (!(((unsigned int)(*c)) < 0x80)) hicount++; + } + if (hicount != 1) + abort (); + return 0; 
+} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030928-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030928-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030928-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20030928-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,32 @@ +#include + +#if INT_MAX <= 32767 +int main () { exit (0); } +#else +void get_addrs (const char**x, int *y) +{ + x[0] = "a1111" + (y[0] - 0x10000) * 2; + x[1] = "a1112" + (y[1] - 0x20000) * 2; + x[2] = "a1113" + (y[2] - 0x30000) * 2; + x[3] = "a1114" + (y[3] - 0x40000) * 2; + x[4] = "a1115" + (y[4] - 0x50000) * 2; + x[5] = "a1116" + (y[5] - 0x60000) * 2; + x[6] = "a1117" + (y[6] - 0x70000) * 2; + x[7] = "a1118" + (y[7] - 0x80000) * 2; +} + +int main () +{ + const char *x[8]; + int y[8]; + int i; + + for (i = 0; i < 8; i++) + y[i] = 0x10000 * (i + 1); + get_addrs (x, y); + for (i = 0; i < 8; i++) + if (*x[i] != 'a') + abort (); + exit (0); +} +#endif Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20031003-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20031003-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20031003-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20031003-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,32 @@ +/* PR optimization/9325 */ + +#include + +extern void abort (void); + +int f1() +{ + return (int)2147483648.0f; +} + +int f2() +{ + return (int)(float)(2147483647); +} + +int main() +{ +#if INT_MAX == 2147483647 + if (f1() != 2147483647) + abort (); +#ifdef __SPU__ + /* SPU float rounds towards zero. 
*/ + if (f2() != 0x7fffff80) + abort (); +#else + if (f2() != 2147483647) + abort (); +#endif +#endif + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20031010-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20031010-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20031010-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20031010-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,34 @@ +/* A reminder to process ops in generate_expr_as_of_bb exactly once. */ + +long __attribute__((noinline)) +foo (long ct, long cf, _Bool p1, _Bool p2, _Bool p3) +{ + long diff; + + diff = ct - cf; + + if (p1) + { + if (p2) + { + if (p3) + { + long tmp = ct; + ct = cf; + cf = tmp; + } + diff = ct - cf; + } + + return diff; + } + + abort (); +} + +int main () +{ + if (foo(2, 3, 1, 1, 1) == 0) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20031011-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20031011-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20031011-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20031011-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,30 @@ +/* Check that MAX_EXPR and MIN_EXPR are working properly. */ + +#define MAX(X,Y) ((X) > (Y) ? (X) : (Y)) +#define MIN(X,Y) ((X) < (Y) ? 
(X) : (Y)) + +extern void abort (void); + +int main() +{ + int ll_bitsize, ll_bitpos; + int rl_bitsize, rl_bitpos; + int end_bit; + + ll_bitpos = 32; ll_bitsize = 32; + rl_bitpos = 0; rl_bitsize = 32; + + end_bit = MAX (ll_bitpos + ll_bitsize, rl_bitpos + rl_bitsize); + if (end_bit != 64) + abort (); + end_bit = MAX (rl_bitpos + rl_bitsize, ll_bitpos + ll_bitsize); + if (end_bit != 64) + abort (); + end_bit = MIN (ll_bitpos + ll_bitsize, rl_bitpos + rl_bitsize); + if (end_bit != 32) + abort (); + end_bit = MIN (rl_bitpos + rl_bitsize, ll_bitpos + ll_bitsize); + if (end_bit != 32) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20031012-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20031012-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20031012-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20031012-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,34 @@ +/* { dg-add-options stack_size } */ + +/* PR optimization/8750 + Used to fail under Cygwin with + -O2 -fomit-frame-pointer + Testcase by David B. 
Trout */ + +#if defined(STACK_SIZE) && STACK_SIZE < 16000 +#define ARRAY_SIZE (STACK_SIZE / 2) +#define STRLEN (ARRAY_SIZE - 9) +#else +#define ARRAY_SIZE 15000 +#define STRLEN 13371 +#endif + +extern void *memset (void *, int, __SIZE_TYPE__); +extern void abort (void); + +static void foo () +{ + char a[ARRAY_SIZE]; + + a[0]=0; + memset( &a[0], 0xCD, STRLEN ); + a[STRLEN]=0; + if (strlen(a) != STRLEN) + abort (); +} + +int main ( int argc, char* argv[] ) +{ + foo(); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20031020-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20031020-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20031020-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20031020-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,23 @@ +/* PR target/12654 + The Alpha backend tried to do a >= 1024 as (a - 1024) >= 0, which fails + for very large negative values. */ +/* Origin: tg at swox.com */ + +#include + +extern void abort (void); + +void __attribute__((noinline)) +foo (long x) +{ + if (x >= 1024) + abort (); +} + +int +main () +{ + foo (LONG_MIN); + foo (LONG_MIN + 10000); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20031201-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20031201-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20031201-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20031201-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,76 @@ +/* Copyright (C) 2003 Free Software Foundation. 
+ PR target/13256 + STRICT_LOW_PART was handled incorrectly in delay slots. + Origin: Hans-Peter Nilsson. */ + +typedef struct { unsigned int e0 : 16; unsigned int e1 : 16; } s1; +typedef struct { unsigned int e0 : 16; unsigned int e1 : 16; } s2; +typedef struct { s1 i12; s2 i16; } io; +static int test_length = 2; +static io *i; +static int m = 1; +static int d = 1; +static unsigned long test_t0; +static unsigned long test_t1; +void test(void) __attribute__ ((__noinline__)); +extern int f1 (void *port) __attribute__ ((__noinline__)); +extern void f0 (void) __attribute__ ((__noinline__)); +int +f1 (void *port) +{ + int fail_count = 0; + unsigned long tlen; + s1 x0 = {0}; + s2 x1 = {0}; + + i = port; + x0.e0 = x1.e0 = 32; + i->i12 = x0; + i->i16 = x1; + do f0(); while (test_t1); + x0.e0 = x1.e0 = 8; + i->i12 = x0; + i->i16 = x1; + test (); + if (m) + { + unsigned long e = 1000000000 / 460800 * test_length; + tlen = test_t1 - test_t0; + if (((tlen-e) & 0x7FFFFFFF) > 1000) + f0(); + } + if (d) + { + unsigned long e = 1000000000 / 460800 * test_length; + tlen = test_t1 - test_t0; + if (((tlen - e) & 0x7FFFFFFF) > 1000) + f0(); + } + return fail_count != 0 ? 
1 : 0; +} + +int +main () +{ + io io0; + f1 (&io0); + abort (); +} + +void +test (void) +{ + io *iop = i; + if (iop->i12.e0 != 8 || iop->i16.e0 != 8) + abort (); + exit (0); +} + +void +f0 (void) +{ + static int washere = 0; + io *iop = i; + if (washere++ || iop->i12.e0 != 32 || iop->i16.e0 != 32) + abort (); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20031204-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20031204-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20031204-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20031204-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,49 @@ +/* PR optimization/13260 */ + +#include + +typedef unsigned long u32; + +u32 in_aton(const char* x) +{ + return 0x0a0b0c0d; +} + +u32 root_nfs_parse_addr(char *name) +{ + u32 addr; + int octets = 0; + char *cp, *cq; + + cp = cq = name; + while (octets < 4) { + while (*cp >= '0' && *cp <= '9') + cp++; + if (cp == cq || cp - cq > 3) + break; + if (*cp == '.' 
|| octets == 3) + octets++; + if (octets < 4) + cp++; + cq = cp; + } + + if (octets == 4 && (*cp == ':' || *cp == '\0')) { + if (*cp == ':') + *cp++ = '\0'; + addr = in_aton(name); + strcpy(name, cp); + } else + addr = (-1); + + return addr; +} + +int +main() +{ + static char addr[] = "10.11.12.13:/hello"; + u32 result = root_nfs_parse_addr(addr); + if (result != 0x0a0b0c0d) { abort(); } + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20031211-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20031211-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20031211-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20031211-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,13 @@ +struct a { unsigned int bitfield : 1; }; + +unsigned int x; + +main() +{ + struct a a = {0}; + x = 0xbeef; + a.bitfield |= x; + if (a.bitfield != 1) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20031211-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20031211-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20031211-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20031211-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,19 @@ +struct a +{ + unsigned int bitfield : 3; +}; + +int main() +{ + struct a a; + + a.bitfield = 131; + foo (a.bitfield); + exit (0); +} + +foo(unsigned int z) +{ + if (z != 3) + abort (); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20031214-1.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20031214-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20031214-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20031214-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,32 @@ +/* PR optimization/10312 */ +/* Originator: Peter van Hoof
    */ + +/* Verify that the strength reduction pass doesn't find + illegitimate givs. */ + +struct +{ + double a; + int n[2]; +} g = { 0., { 1, 2}}; + +int k = 0; + +void +b (int *j) +{ +} + +int +main () +{ + int j; + + for (j = 0; j < 2; j++) + k = (k > g.n[j]) ? k : g.n[j]; + + k++; + b (&j); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20031215-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20031215-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20031215-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20031215-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,37 @@ +/* PR middle-end/13400 */ +/* The following test used to fail at run-time with a write to read-only + memory, caused by if-conversion converting a conditional write into an + unconditional write. 
*/ + +typedef struct {int c, l; char ch[3];} pstr; +const pstr ao = {2, 2, "OK"}; +const pstr * const a = &ao; + +void test1(void) +{ + if (a->ch[a->l]) { + ((char *)a->ch)[a->l] = 0; + } +} + +void test2(void) +{ + if (a->ch[a->l]) { + ((char *)a->ch)[a->l] = -1; + } +} + +void test3(void) +{ + if (a->ch[a->l]) { + ((char *)a->ch)[a->l] = 1; + } +} + +int main(void) +{ + test1(); + test2(); + test3(); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20031216-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20031216-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20031216-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20031216-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,23 @@ +/* PR optimization/13313 */ +/* Origin: Mike Lerwill */ + +extern void abort(void); + +void DisplayNumber (unsigned long v) +{ + if (v != 0x9aL) + abort(); +} + +unsigned long ReadNumber (void) +{ + return 0x009a0000L; +} + +int main (void) +{ + unsigned long tmp; + tmp = (ReadNumber() & 0x00ff0000L) >> 16; + DisplayNumber (tmp); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040208-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040208-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040208-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040208-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,10 @@ +int main () +{ + long double x; + + x = 0x1.0p-500L; + x *= 0x1.0p-522L; + if (x != 0x1.0p-1022L) + abort (); + exit (0); +} Added: 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040218-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040218-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040218-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040218-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,37 @@ +/* PR target/14209. Bug in cris.md, shrinking access size of + postincrement. + Origin: . */ + +long int xb (long int *y) __attribute__ ((__noinline__)); +long int xw (long int *y) __attribute__ ((__noinline__)); +short int yb (short int *y) __attribute__ ((__noinline__)); + +long int xb (long int *y) +{ + long int xx = *y & 255; + return xx + y[1]; +} + +long int xw (long int *y) +{ + long int xx = *y & 65535; + return xx + y[1]; +} + +short int yb (short int *y) +{ + short int xx = *y & 255; + return xx + y[1]; +} + +int main (void) +{ + long int y[] = {-1, 16000}; + short int yw[] = {-1, 16000}; + + if (xb (y) != 16255 + || xw (y) != 81535 + || yb (yw) != 16255) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040223-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040223-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040223-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040223-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,17 @@ +/* { dg-require-effective-target alloca } */ +#include +#include + +void +a(void *x,int y) +{ + if (y != 1234) + abort (); +} + +int +main() +{ + a(strcpy(alloca(100),"abc"),1234); + return 0; +} Added: 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040302-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040302-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040302-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040302-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,25 @@ +/* { dg-require-effective-target label_values } */ +int code[]={0,0,0,0,1}; + +void foo(int x) { + volatile int b; + b = 0xffffffff; +} + +void bar(int *pc) { + static const void *l[] = {&&lab0, &&end}; + + foo(0); + goto *l[*pc]; + lab0: + foo(0); + pc++; + goto *l[*pc]; + end: + return; +} + +int main() { + bar(code); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040307-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040307-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040307-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040307-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,24 @@ +int main() +{ + int b = 0; + + struct { + unsigned int bit0:1; + unsigned int bit1:1; + unsigned int bit2:1; + unsigned int bit3:1; + unsigned int bit4:1; + unsigned int bit5:1; + unsigned int bit6:1; + unsigned int bit7:1; + } sdata = {0x01}; + + while ( sdata.bit0-- > 0 ) { + b++ ; + if ( b > 100 ) break; + } + + if (b != 1) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040308-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040308-1.c?rev=374156&view=auto 
============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040308-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040308-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,22 @@ +/* { dg-require-effective-target alloca } */ +/* This used to fail on SPARC with an unaligned memory access. */ + +void foo(int n) +{ + struct S { + int i[n]; + unsigned int b:1; + int i2; + } __attribute__ ((packed)) __attribute__ ((aligned (4))); + + struct S s; + + s.i2 = 0; +} + +int main(void) +{ + foo(4); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040309-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040309-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040309-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040309-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,23 @@ +extern void abort (); + +int foo(unsigned short x) +{ + unsigned short y; + y = x > 32767 ? 
x - 32768 : 0; + return y; +} + +int main() +{ + if (foo (0) != 0) + abort (); + if (foo (32767) != 0) + abort (); + if (foo (32768) != 0) + abort (); + if (foo (32769) != 1) + abort (); + if (foo (65535) != 32767) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040311-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040311-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040311-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040311-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,67 @@ +/* Copyright (C) 2004 Free Software Foundation. + + Check that constant folding and RTL simplification of -(x >> y) doesn't + break anything and produces the expected results. + + Written by Roger Sayle, 11th March 2004. */ + +extern void abort (void); + +#define INT_BITS (sizeof(int)*8) + +int test1(int x) +{ + return -(x >> (INT_BITS-1)); +} + +int test2(unsigned int x) +{ + return -((int)(x >> (INT_BITS-1))); +} + +int test3(int x) +{ + int y; + y = INT_BITS-1; + return -(x >> y); +} + +int test4(unsigned int x) +{ + int y; + y = INT_BITS-1; + return -((int)(x >> y)); +} + +int main() +{ + if (test1(0) != 0) + abort (); + if (test1(1) != 0) + abort (); + if (test1(-1) != 1) + abort (); + + if (test2(0) != 0) + abort (); + if (test2(1) != 0) + abort (); + if (test2((unsigned int)-1) != -1) + abort (); + + if (test3(0) != 0) + abort (); + if (test3(1) != 0) + abort (); + if (test3(-1) != 1) + abort (); + + if (test4(0) != 0) + abort (); + if (test4(1) != 0) + abort (); + if (test4((unsigned int)-1) != -1) + abort (); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040313-1.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040313-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040313-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040313-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,17 @@ +/* PR middle-end/14470 */ +/* Origin: Lodewijk Voge */ + +extern void abort(void); + +int main() +{ + int t[1025] = { 1024 }, d; + + d = 0; + d = t[d]++; + if (t[0] != 1025) + abort(); + if (d != 1024) + abort(); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040319-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040319-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040319-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040319-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,17 @@ +int +blah (int zzz) +{ + int foo; + if (zzz >= 0) + return 1; + foo = (zzz >= 0 ? 
(zzz) : -(zzz)); + return foo; +} + +main() +{ + if (blah (-1) != 1) + abort (); + else + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040331-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040331-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040331-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040331-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,20 @@ +/* PR c++/14755 */ +extern void abort (void); +extern void exit (int); + +int +main (void) +{ +#if __INT_MAX__ >= 2147483647 + struct { int count: 31; } s = { 0 }; + while (s.count--) + abort (); +#elif __INT_MAX__ >= 32767 + struct { int count: 15; } s = { 0 }; + while (s.count--) + abort (); +#else + /* Don't bother because __INT_MAX__ is too small. */ +#endif + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040409-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040409-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040409-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040409-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,111 @@ +#include <limits.h> + +extern void abort (); + +int test1(int x) +{ + return x ^ INT_MIN; +} + +unsigned int test1u(unsigned int x) +{ + return x ^ (unsigned int)INT_MIN; +} + +unsigned int test2u(unsigned int x) +{ + return x + (unsigned int)INT_MIN; +} + +unsigned int test3u(unsigned int x) +{ + return x - (unsigned int)INT_MIN; +} + +int test4(int x) +{ + int y = INT_MIN; + return x ^ y; +} + +unsigned int test4u(unsigned int x) +{ + unsigned int y = (unsigned int)INT_MIN; + return 
x ^ y; +} + +unsigned int test5u(unsigned int x) +{ + unsigned int y = (unsigned int)INT_MIN; + return x + y; +} + +unsigned int test6u(unsigned int x) +{ + unsigned int y = (unsigned int)INT_MIN; + return x - y; +} + + + +void test(int a, int b) +{ + if (test1(a) != b) + abort(); + if (test4(a) != b) + abort(); +} + +void testu(unsigned int a, unsigned int b) +{ + if (test1u(a) != b) + abort(); + if (test2u(a) != b) + abort(); + if (test3u(a) != b) + abort(); + if (test4u(a) != b) + abort(); + if (test5u(a) != b) + abort(); + if (test6u(a) != b) + abort(); +} + + +int main() +{ +#if INT_MAX == 2147483647 + test(0x00000000,0x80000000); + test(0x80000000,0x00000000); + test(0x12345678,0x92345678); + test(0x92345678,0x12345678); + test(0x7fffffff,0xffffffff); + test(0xffffffff,0x7fffffff); + + testu(0x00000000,0x80000000); + testu(0x80000000,0x00000000); + testu(0x12345678,0x92345678); + testu(0x92345678,0x12345678); + testu(0x7fffffff,0xffffffff); + testu(0xffffffff,0x7fffffff); +#endif + +#if INT_MAX == 32767 + test(0x0000,0x8000); + test(0x8000,0x0000); + test(0x1234,0x9234); + test(0x9234,0x1234); + test(0x7fff,0xffff); + test(0xffff,0x7fff); + + testu(0x0000,0x8000); + testu(0x8000,0x0000); + testu(0x1234,0x9234); + testu(0x9234,0x1234); + testu(0x7fff,0xffff); + testu(0xffff,0x7fff); +#endif + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040409-1w.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040409-1w.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040409-1w.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040409-1w.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,65 @@ +/* { dg-additional-options "-fwrapv" } */ + +#include <limits.h> + +extern void abort (); + +int test2(int x) +{ + return x + INT_MIN; +} + +int test3(int 
x) +{ + return x - INT_MIN; +} + +int test5(int x) +{ + int y = INT_MIN; + return x + y; +} + +int test6(int x) +{ + int y = INT_MIN; + return x - y; +} + + + +void test(int a, int b) +{ + if (test2(a) != b) + abort(); + if (test3(a) != b) + abort(); + if (test5(a) != b) + abort(); + if (test6(a) != b) + abort(); +} + + +int main() +{ +#if INT_MAX == 2147483647 + test(0x00000000,0x80000000); + test(0x80000000,0x00000000); + test(0x12345678,0x92345678); + test(0x92345678,0x12345678); + test(0x7fffffff,0xffffffff); + test(0xffffffff,0x7fffffff); +#endif + +#if INT_MAX == 32767 + test(0x0000,0x8000); + test(0x8000,0x0000); + test(0x1234,0x9234); + test(0x9234,0x1234); + test(0x7fff,0xffff); + test(0xffff,0x7fff); +#endif + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040409-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040409-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040409-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040409-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,178 @@ +#include <limits.h> + +extern void abort (); + +int test1(int x) +{ + return (x ^ INT_MIN) ^ 0x1234; +} + +unsigned int test1u(unsigned int x) +{ + return (x ^ (unsigned int)INT_MIN) ^ 0x1234; +} + +int test2(int x) +{ + return (x ^ 0x1234) ^ INT_MIN; +} + +unsigned int test2u(unsigned int x) +{ + return (x ^ 0x1234) ^ (unsigned int)INT_MIN; +} + +unsigned int test3u(unsigned int x) +{ + return (x + (unsigned int)INT_MIN) ^ 0x1234; +} + +unsigned int test4u(unsigned int x) +{ + return (x ^ 0x1234) + (unsigned int)INT_MIN; +} + +unsigned int test5u(unsigned int x) +{ + return (x - (unsigned int)INT_MIN) ^ 0x1234; +} + +unsigned int test6u(unsigned int x) +{ + return (x ^ 0x1234) - (unsigned int)INT_MIN; +} + +int test7(int x) +{ + int y 
= INT_MIN; + int z = 0x1234; + return (x ^ y) ^ z; +} + +unsigned int test7u(unsigned int x) +{ + unsigned int y = (unsigned int)INT_MIN; + unsigned int z = 0x1234; + return (x ^ y) ^ z; +} + +int test8(int x) +{ + int y = 0x1234; + int z = INT_MIN; + return (x ^ y) ^ z; +} + +unsigned int test8u(unsigned int x) +{ + unsigned int y = 0x1234; + unsigned int z = (unsigned int)INT_MIN; + return (x ^ y) ^ z; +} + +unsigned int test9u(unsigned int x) +{ + unsigned int y = (unsigned int)INT_MIN; + unsigned int z = 0x1234; + return (x + y) ^ z; +} + +unsigned int test10u(unsigned int x) +{ + unsigned int y = 0x1234; + unsigned int z = (unsigned int)INT_MIN; + return (x ^ y) + z; +} + +unsigned int test11u(unsigned int x) +{ + unsigned int y = (unsigned int)INT_MIN; + unsigned int z = 0x1234; + return (x - y) ^ z; +} + +unsigned int test12u(unsigned int x) +{ + unsigned int y = 0x1234; + unsigned int z = (unsigned int)INT_MIN; + return (x ^ y) - z; +} + + +void test(int a, int b) +{ + if (test1(a) != b) + abort(); + if (test2(a) != b) + abort(); + if (test7(a) != b) + abort(); + if (test8(a) != b) + abort(); +} + +void testu(unsigned int a, unsigned int b) +{ + if (test1u(a) != b) + abort(); + if (test2u(a) != b) + abort(); + if (test3u(a) != b) + abort(); + if (test4u(a) != b) + abort(); + if (test5u(a) != b) + abort(); + if (test6u(a) != b) + abort(); + if (test7u(a) != b) + abort(); + if (test8u(a) != b) + abort(); + if (test9u(a) != b) + abort(); + if (test10u(a) != b) + abort(); + if (test11u(a) != b) + abort(); + if (test12u(a) != b) + abort(); +} + + +int main() +{ +#if INT_MAX == 2147483647 + test(0x00000000,0x80001234); + test(0x00001234,0x80000000); + test(0x80000000,0x00001234); + test(0x80001234,0x00000000); + test(0x7fffffff,0xffffedcb); + test(0xffffffff,0x7fffedcb); + + testu(0x00000000,0x80001234); + testu(0x00001234,0x80000000); + testu(0x80000000,0x00001234); + testu(0x80001234,0x00000000); + testu(0x7fffffff,0xffffedcb); + testu(0xffffffff,0x7fffedcb); 
+#endif + +#if INT_MAX == 32767 + test(0x0000,0x9234); + test(0x1234,0x8000); + test(0x8000,0x1234); + test(0x9234,0x0000); + test(0x7fff,0xedcb); + test(0xffff,0x6dcb); + + testu(0x0000,0x9234); + testu(0x8000,0x1234); + testu(0x1234,0x8000); + testu(0x9234,0x0000); + testu(0x7fff,0xedcb); + testu(0xffff,0x6dcb); +#endif + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040409-2w.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040409-2w.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040409-2w.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040409-2w.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,98 @@ +/* { dg-additional-options "-fwrapv" } */ + +#include <limits.h> + +extern void abort (); + +int test3(int x) +{ + return (x + INT_MIN) ^ 0x1234; +} + +int test4(int x) +{ + return (x ^ 0x1234) + INT_MIN; +} + +int test5(int x) +{ + return (x - INT_MIN) ^ 0x1234; +} + +int test6(int x) +{ + return (x ^ 0x1234) - INT_MIN; +} + +int test9(int x) +{ + int y = INT_MIN; + int z = 0x1234; + return (x + y) ^ z; +} + +int test10(int x) +{ + int y = 0x1234; + int z = INT_MIN; + return (x ^ y) + z; +} + +int test11(int x) +{ + int y = INT_MIN; + int z = 0x1234; + return (x - y) ^ z; +} + +int test12(int x) +{ + int y = 0x1234; + int z = INT_MIN; + return (x ^ y) - z; +} + + +void test(int a, int b) +{ + if (test3(a) != b) + abort(); + if (test4(a) != b) + abort(); + if (test5(a) != b) + abort(); + if (test6(a) != b) + abort(); + if (test9(a) != b) + abort(); + if (test10(a) != b) + abort(); + if (test11(a) != b) + abort(); + if (test12(a) != b) + abort(); +} + + +int main() +{ +#if INT_MAX == 2147483647 + test(0x00000000,0x80001234); + test(0x00001234,0x80000000); + test(0x80000000,0x00001234); + test(0x80001234,0x00000000); + 
test(0x7fffffff,0xffffedcb); + test(0xffffffff,0x7fffedcb); +#endif + +#if INT_MAX == 32767 + test(0x0000,0x9234); + test(0x1234,0x8000); + test(0x8000,0x1234); + test(0x9234,0x0000); + test(0x7fff,0xedcb); + test(0xffff,0x6dcb); +#endif + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040409-3.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040409-3.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040409-3.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040409-3.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,111 @@ +#include <limits.h> + +extern void abort (); + +int test1(int x) +{ + return ~(x ^ INT_MIN); +} + +unsigned int test1u(unsigned int x) +{ + return ~(x ^ (unsigned int)INT_MIN); +} + +unsigned int test2u(unsigned int x) +{ + return ~(x + (unsigned int)INT_MIN); +} + +unsigned int test3u(unsigned int x) +{ + return ~(x - (unsigned int)INT_MIN); +} + +int test4(int x) +{ + int y = INT_MIN; + return ~(x ^ y); +} + +unsigned int test4u(unsigned int x) +{ + unsigned int y = (unsigned int)INT_MIN; + return ~(x ^ y); +} + +unsigned int test5u(unsigned int x) +{ + unsigned int y = (unsigned int)INT_MIN; + return ~(x + y); +} + +unsigned int test6u(unsigned int x) +{ + unsigned int y = (unsigned int)INT_MIN; + return ~(x - y); +} + + + +void test(int a, int b) +{ + if (test1(a) != b) + abort(); + if (test4(a) != b) + abort(); +} + +void testu(unsigned int a, unsigned int b) +{ + if (test1u(a) != b) + abort(); + if (test2u(a) != b) + abort(); + if (test3u(a) != b) + abort(); + if (test4u(a) != b) + abort(); + if (test5u(a) != b) + abort(); + if (test6u(a) != b) + abort(); +} + + +int main() +{ +#if INT_MAX == 2147483647 + test(0x00000000,0x7fffffff); + test(0x80000000,0xffffffff); + test(0x12345678,0x6dcba987); + 
test(0x92345678,0xedcba987); + test(0x7fffffff,0x00000000); + test(0xffffffff,0x80000000); + + testu(0x00000000,0x7fffffff); + testu(0x80000000,0xffffffff); + testu(0x12345678,0x6dcba987); + testu(0x92345678,0xedcba987); + testu(0x7fffffff,0x00000000); + testu(0xffffffff,0x80000000); +#endif + +#if INT_MAX == 32767 + test(0x0000,0x7fff); + test(0x8000,0xffff); + test(0x1234,0x6dcb); + test(0x9234,0xedcb); + test(0x7fff,0x0000); + test(0xffff,0x8000); + + testu(0x0000,0x7fff); + testu(0x8000,0xffff); + testu(0x1234,0x6dcb); + testu(0x9234,0xedcb); + testu(0x7fff,0x0000); + testu(0xffff,0x8000); +#endif + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040409-3w.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040409-3w.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040409-3w.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040409-3w.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,64 @@ +/* { dg-additional-options "-fwrapv" } */ + +#include <limits.h> + +extern void abort (); + +int test2(int x) +{ + return ~(x + INT_MIN); +} + +int test3(int x) +{ + return ~(x - INT_MIN); +} + +int test5(int x) +{ + int y = INT_MIN; + return ~(x + y); +} + +int test6(int x) +{ + int y = INT_MIN; + return ~(x - y); +} + + +void test(int a, int b) +{ + if (test2(a) != b) + abort(); + if (test3(a) != b) + abort(); + if (test5(a) != b) + abort(); + if (test6(a) != b) + abort(); +} + + +int main() +{ +#if INT_MAX == 2147483647 + test(0x00000000,0x7fffffff); + test(0x80000000,0xffffffff); + test(0x12345678,0x6dcba987); + test(0x92345678,0xedcba987); + test(0x7fffffff,0x00000000); + test(0xffffffff,0x80000000); +#endif + +#if INT_MAX == 32767 + test(0x0000,0x7fff); + test(0x8000,0xffff); + test(0x1234,0x6dcb); + test(0x9234,0xedcb); + test(0x7fff,0x0000); 
+ test(0xffff,0x8000); +#endif + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040411-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040411-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040411-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040411-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,23 @@ +int +sub1 (int i, int j) +{ + typedef int c[i+2]; + int x[10], y[10]; + + if (j == 2) + { + memcpy (x, y, 10 * sizeof (int)); + return sizeof (c); + } + else + return sizeof (c) * 3; +} + +int +main () +{ + if (sub1 (20, 3) != 66 * sizeof (int)) + abort (); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040423-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040423-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040423-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040423-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,30 @@ +int +sub1 (int i, int j) +{ + typedef struct + { + int c[i+2]; + }c; + int x[10], y[10]; + + if (j == 2) + { + memcpy (x, y, 10 * sizeof (int)); + return sizeof (c); + } + else + return sizeof (c) * 3; +} + +int +main () +{ + typedef struct + { + int c[22]; + }c; + if (sub1 (20, 3) != sizeof (c)*3) + abort (); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040520-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040520-1.c?rev=374156&view=auto ============================================================================== --- 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040520-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040520-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,17 @@ +/* PR 15454 */ + +void abort (); +int main () { + int foo; + int bar (void) + { + int baz = 0; + if (foo!=45) + baz = foo; + return baz; + } + foo = 1; + if (!bar ()) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040625-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040625-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040625-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040625-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,20 @@ +/* From PR target/16176 */ +struct __attribute__ ((packed)) s { struct s *next; }; + +struct s * __attribute__ ((noinline)) +maybe_next (struct s *s, int t) +{ + if (t) + s = s->next; + return s; +} + +int main () +{ + struct s s1, s2; + + s1.next = &s2; + if (maybe_next (&s1, 1) != &s2) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040629-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040629-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040629-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040629-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,141 @@ +/* { dg-require-effective-target int32plus } */ + +/* Test arithmetics on bitfields. 
*/ +#ifndef T + +extern void abort (void); +extern void exit (int); + +#ifndef FIELDS1 +#define FIELDS1 +#endif +#ifndef FIELDS2 +#define FIELDS2 +#endif + +struct { FIELDS1 unsigned int i : 6, j : 11, k : 15; FIELDS2 } b; +struct { FIELDS1 unsigned int i : 5, j : 1, k : 26; FIELDS2 } c; +struct { FIELDS1 unsigned int i : 16, j : 8, k : 8; FIELDS2 } d; + +unsigned int ret1 (void) { return b.i; } +unsigned int ret2 (void) { return b.j; } +unsigned int ret3 (void) { return b.k; } +unsigned int ret4 (void) { return c.i; } +unsigned int ret5 (void) { return c.j; } +unsigned int ret6 (void) { return c.k; } +unsigned int ret7 (void) { return d.i; } +unsigned int ret8 (void) { return d.j; } +unsigned int ret9 (void) { return d.k; } + +#define T(n, pre, post, op) \ +void fn1_##n (unsigned int x) { pre b.i post; } \ +void fn2_##n (unsigned int x) { pre b.j post; } \ +void fn3_##n (unsigned int x) { pre b.k post; } \ +void fn4_##n (unsigned int x) { pre c.i post; } \ +void fn5_##n (unsigned int x) { pre c.j post; } \ +void fn6_##n (unsigned int x) { pre c.k post; } \ +void fn7_##n (unsigned int x) { pre d.i post; } \ +void fn8_##n (unsigned int x) { pre d.j post; } \ +void fn9_##n (unsigned int x) { pre d.k post; } + +#include "20040629-1.c" +#undef T + +#define FAIL(n, i) abort () + +int +main (void) +{ +#define T(n, pre, post, op) \ + b.i = 51; \ + b.j = 636; \ + b.k = 31278; \ + c.i = 21; \ + c.j = 1; \ + c.k = 33554432; \ + d.i = 26812; \ + d.j = 156; \ + d.k = 187; \ + fn1_##n (3); \ + if (ret1 () != (op (51, 3) & ((1 << 6) - 1))) \ + FAIL (n, 1); \ + b.i = 51; \ + fn2_##n (251); \ + if (ret2 () != (op (636, 251) & ((1 << 11) - 1))) \ + FAIL (n, 2); \ + b.j = 636; \ + fn3_##n (13279); \ + if (ret3 () != (op (31278, 13279) & ((1 << 15) - 1))) \ + FAIL (n, 3); \ + b.j = 31278; \ + fn4_##n (24); \ + if (ret4 () != (op (21, 24) & ((1 << 5) - 1))) \ + FAIL (n, 4); \ + c.i = 21; \ + fn5_##n (1); \ + if (ret5 () != (op (1, 1) & ((1 << 1) - 1))) \ + FAIL (n, 5); \ + c.j = 1; \ 
+ fn6_##n (264151); \ + if (ret6 () != (op (33554432, 264151) & ((1 << 26) - 1))) \ + FAIL (n, 6); \ + c.k = 33554432; \ + fn7_##n (713); \ + if (ret7 () != (op (26812, 713) & ((1 << 16) - 1))) \ + FAIL (n, 7); \ + d.i = 26812; \ + fn8_##n (17); \ + if (ret8 () != (op (156, 17) & ((1 << 8) - 1))) \ + FAIL (n, 8); \ + d.j = 156; \ + fn9_##n (199); \ + if (ret9 () != (op (187, 199) & ((1 << 8) - 1))) \ + FAIL (n, 9); \ + d.k = 187; + +#include "20040629-1.c" +#undef T + return 0; +} + +#else + +#ifndef opadd +#define opadd(x, y) (x + y) +#define opsub(x, y) (x - y) +#define opinc(x, y) (x + 1) +#define opdec(x, y) (x - 1) +#define opand(x, y) (x & y) +#define opior(x, y) (x | y) +#define opxor(x, y) (x ^ y) +#define opdiv(x, y) (x / y) +#define oprem(x, y) (x % y) +#define opadd3(x, y) (x + 3) +#define opsub7(x, y) (x - 7) +#define opand21(x, y) (x & 21) +#define opior19(x, y) (x | 19) +#define opxor37(x, y) (x ^ 37) +#define opdiv17(x, y) (x / 17) +#define oprem19(x, y) (x % 19) +#endif + +T(1, , += x, opadd) +T(2, ++, , opinc) +T(3, , ++, opinc) +T(4, , -= x, opsub) +T(5, --, , opdec) +T(6, , --, opdec) +T(7, , &= x, opand) +T(8, , |= x, opior) +T(9, , ^= x, opxor) +T(a, , /= x, opdiv) +T(b, , %= x, oprem) +T(c, , += 3, opadd3) +T(d, , -= 7, opsub7) +T(e, , &= 21, opand21) +T(f, , |= 19, opior19) +T(g, , ^= 37, opxor37) +T(h, , /= 17, opdiv17) +T(i, , %= 19, oprem19) + +#endif Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040703-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040703-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040703-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040703-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,148 @@ +/* PR 16341 */ +/* { dg-require-effective-target int32plus } */ + +#define 
PART_PRECISION (sizeof (cpp_num_part) * 8) + +typedef unsigned int cpp_num_part; +typedef struct cpp_num cpp_num; +struct cpp_num +{ + cpp_num_part high; + cpp_num_part low; + int unsignedp; /* True if value should be treated as unsigned. */ + int overflow; /* True if the most recent calculation overflowed. */ +}; + +static int +num_positive (cpp_num num, unsigned int precision) +{ + if (precision > PART_PRECISION) + { + precision -= PART_PRECISION; + return (num.high & (cpp_num_part) 1 << (precision - 1)) == 0; + } + + return (num.low & (cpp_num_part) 1 << (precision - 1)) == 0; +} + +static cpp_num +num_trim (cpp_num num, unsigned int precision) +{ + if (precision > PART_PRECISION) + { + precision -= PART_PRECISION; + if (precision < PART_PRECISION) + num.high &= ((cpp_num_part) 1 << precision) - 1; + } + else + { + if (precision < PART_PRECISION) + num.low &= ((cpp_num_part) 1 << precision) - 1; + num.high = 0; + } + + return num; +} + +/* Shift NUM, of width PRECISION, right by N bits. */ +static cpp_num +num_rshift (cpp_num num, unsigned int precision, unsigned int n) +{ + cpp_num_part sign_mask; + int x = num_positive (num, precision); + + if (num.unsignedp || x) + sign_mask = 0; + else + sign_mask = ~(cpp_num_part) 0; + + if (n >= precision) + num.high = num.low = sign_mask; + else + { + /* Sign-extend. 
*/ + if (precision < PART_PRECISION) + num.high = sign_mask, num.low |= sign_mask << precision; + else if (precision < 2 * PART_PRECISION) + num.high |= sign_mask << (precision - PART_PRECISION); + + if (n >= PART_PRECISION) + { + n -= PART_PRECISION; + num.low = num.high; + num.high = sign_mask; + } + + if (n) + { + num.low = (num.low >> n) | (num.high << (PART_PRECISION - n)); + num.high = (num.high >> n) | (sign_mask << (PART_PRECISION - n)); + } + } + + num = num_trim (num, precision); + num.overflow = 0; + return num; +} + #define num_zerop(num) ((num.low | num.high) == 0) +#define num_eq(num1, num2) (num1.low == num2.low && num1.high == num2.high) + +cpp_num +num_lshift (cpp_num num, unsigned int precision, unsigned int n) +{ + if (n >= precision) + { + num.overflow = !num.unsignedp && !num_zerop (num); + num.high = num.low = 0; + } + else + { + cpp_num orig; + unsigned int m = n; + + orig = num; + if (m >= PART_PRECISION) + { + m -= PART_PRECISION; + num.high = num.low; + num.low = 0; + } + if (m) + { + num.high = (num.high << m) | (num.low >> (PART_PRECISION - m)); + num.low <<= m; + } + num = num_trim (num, precision); + + if (num.unsignedp) + num.overflow = 0; + else + { + cpp_num maybe_orig = num_rshift (num, precision, n); + num.overflow = !num_eq (orig, maybe_orig); + } + } + + return num; +} + +unsigned int precision = 64; +unsigned int n = 16; + +cpp_num num = { 0, 3, 0, 0 }; + +int main() +{ + cpp_num res = num_lshift (num, 64, n); + + if (res.low != 0x30000) + abort (); + + if (res.high != 0) + abort (); + + if (res.overflow != 0) + abort (); + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040704-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040704-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040704-1.c (added) 
+++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040704-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,14 @@ +/* PR 16348: Make sure that condition-first false loops DTRT. */ + +extern void abort (); + +int main() +{ + for (; 0 ;) + { + abort (); + label: + return 0; + } + goto label; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040705-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040705-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040705-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040705-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,4 @@ +/* { dg-require-effective-target int32plus } */ + +#define FIELDS1 long long l; +#include "20040629-1.c" Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040705-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040705-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040705-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040705-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,3 @@ +/* { dg-require-effective-target int32plus } */ +#define FIELDS2 long long l; +#include "20040629-1.c" Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040706-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040706-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040706-1.c (added) +++ 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040706-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,9 @@ +int main () +{ + int i; + for (i = 0; i < 10; i++) + continue; + if (i < 10) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040707-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040707-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040707-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040707-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,12 @@ +struct s { char c1, c2; }; +void foo (struct s s) +{ + static struct s s1; + s1 = s; +} +int main () +{ + static struct s s2; + foo (s2); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040709-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040709-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040709-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040709-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,149 @@ +/* { dg-require-effective-target int32plus } */ + +/* Test arithmetics on bitfields. 
*/ + +extern void abort (void); +extern void exit (int); + +unsigned int +myrnd (void) +{ + static unsigned int s = 1388815473; + s *= 1103515245; + s += 12345; + return (s / 65536) % 2048; +} + +#define T(S) \ +struct S s##S; \ +struct S retme##S (struct S x) \ +{ \ + return x; \ +} \ + \ +unsigned int fn1##S (unsigned int x) \ +{ \ + struct S y = s##S; \ + y.k += x; \ + y = retme##S (y); \ + return y.k; \ +} \ + \ +unsigned int fn2##S (unsigned int x) \ +{ \ + struct S y = s##S; \ + y.k += x; \ + y.k %= 15; \ + return y.k; \ +} \ + \ +unsigned int retit##S (void) \ +{ \ + return s##S.k; \ +} \ + \ +unsigned int fn3##S (unsigned int x) \ +{ \ + s##S.k += x; \ + return retit##S (); \ +} \ + \ +void test##S (void) \ +{ \ + int i; \ + unsigned int mask, v, a, r; \ + struct S x; \ + char *p = (char *) &s##S; \ + for (i = 0; i < sizeof (s##S); ++i) \ + *p++ = myrnd (); \ + if (__builtin_classify_type (s##S.l) == 8) \ + s##S.l = 5.25; \ + s##S.k = -1; \ + mask = s##S.k; \ + v = myrnd (); \ + a = myrnd (); \ + s##S.k = v; \ + x = s##S; \ + r = fn1##S (a); \ + if (x.i != s##S.i || x.j != s##S.j \ + || x.k != s##S.k || x.l != s##S.l \ + || ((v + a) & mask) != r) \ + abort (); \ + v = myrnd (); \ + a = myrnd (); \ + s##S.k = v; \ + x = s##S; \ + r = fn2##S (a); \ + if (x.i != s##S.i || x.j != s##S.j \ + || x.k != s##S.k || x.l != s##S.l \ + || ((((v + a) & mask) % 15) & mask) != r) \ + abort (); \ + v = myrnd (); \ + a = myrnd (); \ + s##S.k = v; \ + x = s##S; \ + r = fn3##S (a); \ + if (x.i != s##S.i || x.j != s##S.j \ + || s##S.k != r || x.l != s##S.l \ + || ((v + a) & mask) != r) \ + abort (); \ +} + +struct A { unsigned int i : 6, l : 1, j : 10, k : 15; }; T(A) +struct B { unsigned int i : 6, j : 11, k : 15; unsigned int l; }; T(B) +struct C { unsigned int l; unsigned int i : 6, j : 11, k : 15; }; T(C) +struct D { unsigned long long l : 6, i : 6, j : 23, k : 29; }; T(D) +struct E { unsigned long long l, i : 12, j : 23, k : 29; }; T(E) +struct F { unsigned long long i : 
12, j : 23, k : 29, l; }; T(F) +struct G { unsigned int i : 12, j : 13, k : 7; unsigned long long l; }; T(G) +struct H { unsigned int i : 12, j : 11, k : 9; unsigned long long l; }; T(H) +struct I { unsigned short i : 1, j : 6, k : 9; unsigned long long l; }; T(I) +struct J { unsigned short i : 1, j : 8, k : 7; unsigned short l; }; T(J) +struct K { unsigned int k : 6, l : 1, j : 10, i : 15; }; T(K) +struct L { unsigned int k : 6, j : 11, i : 15; unsigned int l; }; T(L) +struct M { unsigned int l; unsigned int k : 6, j : 11, i : 15; }; T(M) +struct N { unsigned long long l : 6, k : 6, j : 23, i : 29; }; T(N) +struct O { unsigned long long l, k : 12, j : 23, i : 29; }; T(O) +struct P { unsigned long long k : 12, j : 23, i : 29, l; }; T(P) +struct Q { unsigned int k : 12, j : 13, i : 7; unsigned long long l; }; T(Q) +struct R { unsigned int k : 12, j : 11, i : 9; unsigned long long l; }; T(R) +struct S { unsigned short k : 1, j : 6, i : 9; unsigned long long l; }; T(S) +struct T { unsigned short k : 1, j : 8, i : 7; unsigned short l; }; T(T) +struct U { unsigned short j : 6, k : 1, i : 9; unsigned long long l; }; T(U) +struct V { unsigned short j : 8, k : 1, i : 7; unsigned short l; }; T(V) +struct W { long double l; unsigned int k : 12, j : 13, i : 7; }; T(W) +struct X { unsigned int k : 12, j : 13, i : 7; long double l; }; T(X) +struct Y { unsigned int k : 12, j : 11, i : 9; long double l; }; T(Y) +struct Z { long double l; unsigned int j : 13, i : 7, k : 12; }; T(Z) + +int +main (void) +{ + testA (); + testB (); + testC (); + testD (); + testE (); + testF (); + testG (); + testH (); + testI (); + testJ (); + testK (); + testL (); + testM (); + testN (); + testO (); + testP (); + testQ (); + testR (); + testS (); + testT (); + testU (); + testV (); + testW (); + testX (); + testY (); + testZ (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040709-2.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040709-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040709-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040709-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,149 @@ +/* Test arithmetics on bitfields. */ +/* { dg-require-effective-target int32plus } */ + +extern void abort (void); +extern void exit (int); + +unsigned int +myrnd (void) +{ + static unsigned int s = 1388815473; + s *= 1103515245; + s += 12345; + return (s / 65536) % 2048; +} + +#define T(S) \ +struct S s##S; \ +struct S retme##S (struct S x) \ +{ \ + return x; \ +} \ + \ +unsigned int fn1##S (unsigned int x) \ +{ \ + struct S y = s##S; \ + y.k += x; \ + y = retme##S (y); \ + return y.k; \ +} \ + \ +unsigned int fn2##S (unsigned int x) \ +{ \ + struct S y = s##S; \ + y.k += x; \ + y.k %= 15; \ + return y.k; \ +} \ + \ +unsigned int retit##S (void) \ +{ \ + return s##S.k; \ +} \ + \ +unsigned int fn3##S (unsigned int x) \ +{ \ + s##S.k += x; \ + return retit##S (); \ +} \ + \ +void test##S (void) \ +{ \ + int i; \ + unsigned int mask, v, a, r; \ + struct S x; \ + char *p = (char *) &s##S; \ + for (i = 0; i < sizeof (s##S); ++i) \ + *p++ = myrnd (); \ + if (__builtin_classify_type (s##S.l) == 8) \ + s##S.l = 5.25; \ + s##S.k = -1; \ + mask = s##S.k; \ + v = myrnd (); \ + a = myrnd (); \ + s##S.k = v; \ + x = s##S; \ + r = fn1##S (a); \ + if (x.i != s##S.i || x.j != s##S.j \ + || x.k != s##S.k || x.l != s##S.l \ + || ((v + a) & mask) != r) \ + abort (); \ + v = myrnd (); \ + a = myrnd (); \ + s##S.k = v; \ + x = s##S; \ + r = fn2##S (a); \ + if (x.i != s##S.i || x.j != s##S.j \ + || x.k != s##S.k || x.l != s##S.l \ + || ((((v + a) & mask) % 15) & mask) != r) \ + abort (); \ + v = myrnd (); \ + a = myrnd (); \ + s##S.k = v; \ + x = s##S; \ + r = 
fn3##S (a); \ + if (x.i != s##S.i || x.j != s##S.j \ + || s##S.k != r || x.l != s##S.l \ + || ((v + a) & mask) != r) \ + abort (); \ +} + +#define pck __attribute__((packed)) +struct pck A { unsigned short i : 1, l : 1, j : 3, k : 11; }; T(A) +struct pck B { unsigned short i : 4, j : 1, k : 11; unsigned int l; }; T(B) +struct pck C { unsigned int l; unsigned short i : 4, j : 1, k : 11; }; T(C) +struct pck D { unsigned long long l : 6, i : 6, j : 23, k : 29; }; T(D) +struct pck E { unsigned long long l, i : 12, j : 23, k : 29; }; T(E) +struct pck F { unsigned long long i : 12, j : 23, k : 29, l; }; T(F) +struct pck G { unsigned short i : 1, j : 1, k : 6; unsigned long long l; }; T(G) +struct pck H { unsigned short i : 6, j : 2, k : 8; unsigned long long l; }; T(H) +struct pck I { unsigned short i : 1, j : 6, k : 1; unsigned long long l; }; T(I) +struct pck J { unsigned short i : 1, j : 8, k : 7; unsigned short l; }; T(J) +struct pck K { unsigned int k : 6, l : 1, j : 10, i : 15; }; T(K) +struct pck L { unsigned int k : 6, j : 11, i : 15; unsigned int l; }; T(L) +struct pck M { unsigned int l; unsigned short k : 6, j : 11, i : 15; }; T(M) +struct pck N { unsigned long long l : 6, k : 6, j : 23, i : 29; }; T(N) +struct pck O { unsigned long long l, k : 12, j : 23, i : 29; }; T(O) +struct pck P { unsigned long long k : 12, j : 23, i : 29, l; }; T(P) +struct pck Q { unsigned short k : 12, j : 1, i : 3; unsigned long long l; }; T(Q) +struct pck R { unsigned short k : 2, j : 11, i : 3; unsigned long long l; }; T(R) +struct pck S { unsigned short k : 1, j : 6, i : 9; unsigned long long l; }; T(S) +struct pck T { unsigned short k : 1, j : 8, i : 7; unsigned short l; }; T(T) +struct pck U { unsigned short j : 6, k : 1, i : 9; unsigned long long l; }; T(U) +struct pck V { unsigned short j : 8, k : 1, i : 7; unsigned short l; }; T(V) +struct pck W { long double l; unsigned int k : 12, j : 13, i : 7; }; T(W) +struct pck X { unsigned int k : 12, j : 13, i : 7; long double l; }; 
T(X) +struct pck Y { unsigned int k : 12, j : 11, i : 9; long double l; }; T(Y) +struct pck Z { long double l; unsigned int j : 13, i : 7, k : 12; }; T(Z) + +int +main (void) +{ + testA (); + testB (); + testC (); + testD (); + testE (); + testF (); + testG (); + testH (); + testI (); + testJ (); + testK (); + testL (); + testM (); + testN (); + testO (); + testP (); + testQ (); + testR (); + testS (); + testT (); + testU (); + testV (); + testW (); + testX (); + testY (); + testZ (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040709-3.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040709-3.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040709-3.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040709-3.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,5 @@ +/* PR rtl-optimization/68205 */ +/* { dg-require-effective-target int32plus } */ +/* { dg-additional-options "-fno-common" } */ + +#include "20040709-2.c" Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040805-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040805-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040805-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040805-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,33 @@ +/* { dg-require-stack-size "0x12000" } */ + +#if __INT_MAX__ < 32768 +int main () { exit (0); } +#else +int a[2] = { 2, 3 }; + +static int __attribute__((noinline)) +bar (int x, void *b) +{ + a[0]++; + return x; +} + +static int __attribute__((noinline)) +foo (int x) +{ + char buf[0x10000]; + int y = 
a[0]; + a[1] = y; + x = bar (x, buf); + y = bar (y, buf); + return x + y; +} + +int +main () +{ + if (foo (100) != 102) + abort (); + exit (0); +} +#endif Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040811-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040811-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040811-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040811-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,22 @@ +/* { dg-require-effective-target int32plus } */ +/* { dg-require-effective-target alloca } */ + +/* Ensure that we deallocate X when branching back before its + declaration. */ + +void *volatile p; + +int +main (void) +{ + int n = 0; + lab:; + int x[n % 1000 + 1]; + x[0] = 1; + x[n % 1000] = 2; + p = x; + n++; + if (n < 1000000) + goto lab; + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040820-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040820-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040820-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040820-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,24 @@ +/* PR rtl-optimization/17099 */ + +extern void exit (int); +extern void abort (void); + +void +check (int a) +{ + if (a != 1) + abort (); +} + +void +test (int a, int b) +{ + check ((a ? 1 : 0) | (b ? 
2 : 0)); +} + +int +main (void) +{ + test (1, 0); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040823-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040823-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040823-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040823-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,29 @@ +/* Ensure that we create VUSE operands also for noreturn functions. */ + +#include +#include + +int *pwarn; + +void bla (void) __attribute__ ((noreturn)); + +void bla (void) +{ + if (!*pwarn) + abort (); + + exit (0); +} + +int main (void) +{ + int warn; + + memset (&warn, 0, sizeof (warn)); + + pwarn = &warn; + + warn = 1; + + bla (); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040831-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040831-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040831-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040831-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,14 @@ +/* This testcase was being miscompiled, because operand_equal_p + returned that (unsigned long) d and (long) d are equal. */ +extern void abort (void); +extern void exit (int); + +int +main (void) +{ + double d = -12.0; + long l = (d > 10000) ? 
(unsigned long) d : (long) d; + if (l != -12) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040917-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040917-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040917-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20040917-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,32 @@ +/* submitted by kenneth zadeck */ + +static int test_var; + +/* the idea here is that not only is inlinable, inlinable but since it + is static, the cgraph node will not be marked as output. The + current version of the code ignores these cgraph nodes. */ + +void not_inlinable() __attribute__((noinline)); + +static void +inlinable () +{ + test_var = -10; +} + +void +not_inlinable () +{ + inlinable(); +} + +main () +{ + test_var = 10; + /* Variable test_var should be considered call-clobbered by the call + to not_inlinable(). 
*/ + not_inlinable (); + if (test_var == 10) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20041011-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20041011-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20041011-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20041011-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,60 @@ +typedef unsigned long long ull; +volatile int gvol[32]; +ull gull; + +#define MULTI(X) \ + X( 1), X( 2), X( 3), X( 4), X( 5), X( 6), X( 7), X( 8), X( 9), X(10), \ + X(11), X(12), X(13), X(14), X(15), X(16), X(17), X(18), X(19), X(20), \ + X(21), X(22), X(23), X(24), X(25), X(26), X(27), X(28), X(29), X(30) + +#define DECLARE(INDEX) x##INDEX +#define COPYIN(INDEX) x##INDEX = gvol[INDEX] +#define COPYOUT(INDEX) gvol[INDEX] = x##INDEX + +#define BUILD_TEST(NAME, N) \ + ull __attribute__((noinline)) \ + NAME (int n, ull x) \ + { \ + while (n--) \ + { \ + int MULTI (DECLARE); \ + MULTI (COPYIN); \ + MULTI (COPYOUT); \ + x += N; \ + } \ + return x; \ + } + +#define RUN_TEST(NAME, N) \ + if (NAME (3, ~0ULL) != N * 3 - 1) \ + abort (); \ + if (NAME (3, 0xffffffffULL) \ + != N * 3 + 0xffffffffULL) \ + abort (); + +#define DO_TESTS(DO_TEST) \ + DO_TEST (t1, -2048) \ + DO_TEST (t2, -513) \ + DO_TEST (t3, -512) \ + DO_TEST (t4, -511) \ + DO_TEST (t5, -1) \ + DO_TEST (t6, 1) \ + DO_TEST (t7, 511) \ + DO_TEST (t8, 512) \ + DO_TEST (t9, 513) \ + DO_TEST (t10, gull) \ + DO_TEST (t11, -gull) + +DO_TESTS (BUILD_TEST) + +ull neg (ull x) { return -x; } + +int +main () +{ + gull = 100; + DO_TESTS (RUN_TEST) + if (neg (gull) != -100ULL) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20041019-1.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20041019-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20041019-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20041019-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,52 @@ +test_store_ccp (int i) +{ + int *p, a, b, c; + + if (i < 5) + p = &a; + else if (i > 8) + p = &b; + else + p = &c; + + *p = 10; + b = 3; + + /* STORE-CCP was wrongfully propagating 10 into *p. */ + return *p + 2; +} + + +test_store_copy_prop (int i) +{ + int *p, a, b, c; + + if (i < 5) + p = &a; + else if (i > 8) + p = &b; + else + p = &c; + + *p = i; + b = i + 1; + + /* STORE-COPY-PROP was wrongfully propagating i into *p. */ + return *p; +} + + +main() +{ + int x; + + x = test_store_ccp (10); + if (x == 12) + abort (); + + x = test_store_copy_prop (9); + if (x == 9) + abort (); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20041112-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20041112-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20041112-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20041112-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,40 @@ +/* This was failing on Alpha because the comparison (p != -1) was rewritten + as (p+1 != 0) and p+1 isn't allowed to wrap for pointers. 
*/ + +extern void abort(void); + +typedef __SIZE_TYPE__ size_t; + +int global; + +static void *foo(int p) +{ + if (p == 0) + { + global++; + return &global; + } + + return (void *)(size_t)-1; +} + +int bar(void) +{ + void *p; + + p = foo(global); + if (p != (void *)(size_t)-1) + return 1; + + global++; + return 0; +} + +int main(void) +{ + global = 1; + if (bar () != 0) + abort(); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20041113-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20041113-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20041113-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20041113-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,24 @@ +#include + +void test (int x, ...) +{ + va_list ap; + int i; + va_start (ap, x); + if (va_arg (ap, int) != 1) + abort (); + if (va_arg (ap, int) != 2) + abort (); + if (va_arg (ap, int) != 3) + abort (); + if (va_arg (ap, int) != 4) + abort (); +} + +double a = 40.0; + +int main(int argc, char *argv[]) +{ + test(0, 1, 2, 3, (int)(a / 10.0)); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20041114-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20041114-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20041114-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20041114-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,35 @@ +/* Verify that + + var <= 0 || ((long unsigned) (unsigned) (var - 1) < MAX_UNSIGNED_INT) + + gets folded to 1. 
*/ + +#include + +void abort (void); +void link_failure (void); + +volatile int v; + +void +foo (int var) +{ + if (!(var <= 0 + || ((long unsigned) (unsigned) (var - 1) < UINT_MAX))) + link_failure (); +} + +int +main (int argc, char **argv) +{ + foo (v); + return 0; +} + +#ifndef __OPTIMIZE__ +void +link_failure (void) +{ + abort (); +} +#endif Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20041124-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20041124-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20041124-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20041124-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,10 @@ +struct s { _Complex unsigned short x; }; +struct s gs = { 100 + 200i }; +struct s __attribute__((noinline)) foo (void) { return gs; } + +int main () +{ + if (foo ().x != gs.x) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20041126-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20041126-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20041126-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20041126-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,26 @@ +extern int abs (int); +extern void abort (void); + +void +check (int *p) +{ + int i; + for (i = 0; i < 5; ++i) + if (p[i]) + abort (); + for (; i < 10; ++i) + if (p[i] != i + 1) + abort (); +} + +int +main (void) +{ + int a[10] = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 }; + int i; + + for (i = -5; i < 0; i++) + a[abs (i - 10) - 11] = 0; + check (a); + return 0; +} Added: 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20041201-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20041201-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20041201-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20041201-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,15 @@ +/* PR rtl-opt/15289 */ + +typedef struct { _Complex char a; _Complex char b; } Scc2; + +Scc2 s = { 1+2i, 3+4i }; + +int checkScc2 (Scc2 s) +{ + return s.a != 1+2i || s.b != 3+4i; +} + +int main (void) +{ + return checkScc2 (s); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20041210-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20041210-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20041210-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20041210-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,14 @@ +/* The FR-V port used to fail this test because the andcc patterns + wrongly claimed to set the C and V flags. 
*/ +#include + +int x[4] = { INT_MIN / 2, INT_MAX, 2, 4 }; + +int +main () +{ + if (x[0] < x[1]) + if ((x[2] & x[3]) < 0) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20041212-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20041212-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20041212-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20041212-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,15 @@ +/* A function pointer compared with a void pointer should not be canonicalized. + See PR middle-end/17564. */ +void *f (void) __attribute__ ((__noinline__)); +void * +f (void) +{ + return f; +} +int +main (void) +{ + if (f () != f) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20041213-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20041213-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20041213-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20041213-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,32 @@ +/* PR tree-optimization/18694 + + The dominator optimization didn't take the PHI evaluation order + into account when threading an edge. 
*/ + +extern void abort (void) __attribute__((noreturn)); +extern void exit (int) __attribute__((noreturn)); + +void __attribute__((noinline)) +foo (int i) +{ + int next_n = 1; + int j = 0; + + for (; i != 0; i--) + { + int n; + + for (n = next_n; j < n; j++) + next_n++; + + if (j != n) + abort (); + } +} + +int +main (void) +{ + foo (2); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20041214-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20041214-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20041214-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20041214-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,69 @@ +/* { dg-require-effective-target indirect_jumps } */ + +typedef long unsigned int size_t; +extern void abort (void); +extern char *strcpy (char *, const char *); +extern int strcmp (const char *, const char *); +typedef __builtin_va_list va_list; +static const char null[] = "(null)"; +int g (char *s, const char *format, va_list ap) +{ + const char *f; + const char *string; + char spec; + static const void *step0_jumps[] = { + &&do_precision, + &&do_form_integer, + &&do_form_string, + }; + f = format; + if (*f == '\0') + goto all_done; + do + { + spec = (*++f); + goto *(step0_jumps[2]); + + /* begin switch table. */ + do_precision: + ++f; + __builtin_va_arg (ap, int); + spec = *f; + goto *(step0_jumps[2]); + + do_form_integer: + __builtin_va_arg (ap, unsigned long int); + goto end; + + do_form_string: + string = __builtin_va_arg (ap, const char *); + strcpy (s, string); + + /* End of switch table. */ + end: + ++f; + } + while (*f != '\0'); + +all_done: + return 0; +} + +void +f (char *s, const char *f, ...) 
+{ + va_list ap; + __builtin_va_start (ap, f); + g (s, f, ap); + __builtin_va_end (ap); +} + +int +main (void) +{ + char buf[10]; + f (buf, "%s", "asdf", 0); + if (strcmp (buf, "asdf")) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20041218-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20041218-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20041218-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20041218-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,117 @@ +/* PR rtl-optimization/16968 */ +/* Testcase by Jakub Jelinek */ + +struct T +{ + unsigned int b, c, *d; + unsigned char e; +}; +struct S +{ + unsigned int a; + struct T f; +}; +struct U +{ + struct S g, h; +}; +struct V +{ + unsigned int i; + struct U j; +}; + +extern void exit (int); +extern void abort (void); + +void * +dummy1 (void *x) +{ + return ""; +} + +void * +dummy2 (void *x, void *y) +{ + exit (0); +} + +struct V * +baz (unsigned int x) +{ + static struct V v; + __builtin_memset (&v, 0x55, sizeof (v)); + return &v; +} + +int +check (void *x, struct S *y) +{ + if (y->a || y->f.b || y->f.c || y->f.d || y->f.e) + abort (); + return 1; +} + +static struct V * +bar (unsigned int x, void *y) +{ + const struct T t = { 0, 0, (void *) 0, 0 }; + struct V *u; + void *v; + v = dummy1 (y); + if (!v) + return (void *) 0; + + u = baz (sizeof (struct V)); + u->i = x; + u->j.g.a = 0; + u->j.g.f = t; + u->j.h.a = 0; + u->j.h.f = t; + + if (!check (v, &u->j.g) || !check (v, &u->j.h)) + return (void *) 0; + return u; +} + +int +foo (unsigned int *x, unsigned int y, void **z) +{ + void *v; + unsigned int i, j; + + *z = v = (void *) 0; + + for (i = 0; i < y; i++) + { + struct V *c; + + j = *x; + + switch (j) + { + case 1: + c = bar (j, x); + break; + 
default: + c = 0; + break; + } + if (c) + v = dummy2 (v, c); + else + return 1; + } + + *z = v; + return 0; +} + +int +main (void) +{ + unsigned int one = 1; + void *p; + foo (&one, 1, &p); + abort (); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20041218-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20041218-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20041218-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20041218-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,15 @@ +extern void abort (void); + +int test(int n) +{ + struct s { char b[n]; } __attribute__((packed)); + n++; + return sizeof(struct s); +} + +int main() +{ + if (test(123) != 123) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050104-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050104-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050104-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050104-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,23 @@ +/* PR tree-optimization/19060 */ + +void abort (void); + +static +long long min () +{ + return -__LONG_LONG_MAX__ - 1; +} + +void +foo (long long j) +{ + if (j > 10 || j < min ()) + abort (); +} + +int +main (void) +{ + foo (10); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050106-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050106-1.c?rev=374156&view=auto ============================================================================== --- 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050106-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050106-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,19 @@ +/* PR tree-optimization/19283 */ + +void abort (void); + +static inline unsigned short +foo (unsigned int *p) +{ + return *p; +}; + +unsigned int u; + +int +main () +{ + if ((foo (&u) & 0x8000) != 0) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050107-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050107-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050107-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050107-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,25 @@ +typedef enum { C = 1, D = 2 } B; +extern void abort (void); + +struct S +{ + B __attribute__ ((mode (byte))) a; + B __attribute__ ((mode (byte))) b; +}; + +void +foo (struct S *x) +{ + if (x->a != C || x->b != D) + abort (); +} + +int +main (void) +{ + struct S s; + s.a = C; + s.b = D; + foo (&s); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050111-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050111-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050111-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050111-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,39 @@ +/* PR middle-end/19084, rtl-optimization/19348 */ + +unsigned int +foo (unsigned long long x) +{ + unsigned int u; + + if (x == 0) + return 0; + u = (unsigned int) (x >> 32); + return u; +} + +unsigned long long 
+bar (unsigned short x) +{ + return (unsigned long long) x << 32; +} + +extern void abort (void); + +int +main (void) +{ + if (sizeof (long long) != 8) + return 0; + + if (foo (0) != 0) + abort (); + if (foo (0xffffffffULL) != 0) + abort (); + if (foo (0x25ff00ff00ULL) != 0x25) + abort (); + if (bar (0) != 0) + abort (); + if (bar (0x25) != 0x2500000000ULL) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050119-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050119-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050119-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050119-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,37 @@ +/* PR c/19342 */ +typedef enum { A, B, C, D } E; + +struct S { + E __attribute__ ((mode (__byte__))) a; + E __attribute__ ((mode (__byte__))) b; + E __attribute__ ((mode (__byte__))) c; + E __attribute__ ((mode (__byte__))) d; +}; + +extern void abort (void); +extern void exit (int); + +void +foo (struct S *s) +{ + if (s->a != s->b) + abort (); + if (s->c != C) + abort (); +} + +int +main (void) +{ + struct S s[2]; + s[0].a = B; + s[0].b = B; + s[0].c = C; + s[0].d = D; + s[1].a = D; + s[1].b = C; + s[1].c = B; + s[1].d = A; + foo (s); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050119-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050119-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050119-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050119-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,39 @@ +/* PR 
middle-end/19874 */ +typedef enum { A, B, C, D } E; + +struct S { + E __attribute__ ((mode (__byte__))) a; + E __attribute__ ((mode (__byte__))) b; + E __attribute__ ((mode (__byte__))) c; + E __attribute__ ((mode (__byte__))) d; +}; + +extern void abort (void); +extern void exit (int); + +E +foo (struct S *s) +{ + if (s->a != s->b) + abort (); + if (s->c != C) + abort (); + return s->d; +} + +int +main (void) +{ + struct S s[2]; + s[0].a = B; + s[0].b = B; + s[0].c = C; + s[0].d = D; + s[1].a = D; + s[1].b = C; + s[1].c = B; + s[1].d = A; + if (foo (s) != D) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050121-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050121-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050121-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050121-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,63 @@ +/* PR middle-end/19551 */ + +extern void abort (); + +#define T(type, name) \ +__attribute__((pure)) _Complex type \ +foo_##name (int x) \ +{ \ + _Complex type r; \ + __real r = x + 1; \ + __imag r = x - 1; \ + return r; \ +} \ + \ +void \ +bar_##name (type *x) \ +{ \ + *x = __real foo_##name (5); \ +} \ + \ +void \ +baz_##name (type *x) \ +{ \ + *x = __imag foo_##name (5); \ +} + +typedef long double ldouble_t; +typedef long long llong; + +T (float, float) +T (double, double) +T (long double, ldouble_t) +T (char, char) +T (short, short) +T (int, int) +T (long, long) +T (long long, llong) +#undef T + +int +main (void) +{ +#define T(type, name) \ + { \ + type var = 0; \ + bar_##name (&var); \ + if (var != 6) \ + abort (); \ + var = 0; \ + baz_##name (&var); \ + if (var != 4) \ + abort (); \ + } + T (float, float) + T (double, double) + T (long double, ldouble_t) + T (char, 
char) + T (short, short) + T (int, int) + T (long, long) + T (long long, llong) + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050124-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050124-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050124-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050124-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,41 @@ +/* PR rtl-optimization/19579 */ + +extern void abort (void); + +int +foo (int i, int j) +{ + int k = i + 1; + + if (j) + { + if (k > 0) + k++; + else if (k < 0) + k--; + } + + return k; +} + +int +main (void) +{ + if (foo (-2, 0) != -1) + abort (); + if (foo (-1, 0) != 0) + abort (); + if (foo (0, 0) != 1) + abort (); + if (foo (1, 0) != 2) + abort (); + if (foo (-2, 1) != -2) + abort (); + if (foo (-1, 1) != 0) + abort (); + if (foo (0, 1) != 2) + abort (); + if (foo (1, 1) != 3) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050125-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050125-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050125-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050125-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,35 @@ +/* Verify that the CALL sideeffect isn't optimized away. 
*/ +/* Contributed by Greg Parker 25 Jan 2005 */ + +#include +#include + +struct parse { + char *next; + char *end; + int error; +}; + +int seterr(struct parse *p, int err) +{ + p->error = err; + return 0; +} + +void bracket_empty(struct parse *p) +{ + if (((p->next < p->end) && (*p->next++) == ']') || seterr(p, 7)) { } +} + +int main(int argc __attribute__((unused)), char **argv __attribute__((unused))) +{ + struct parse p; + p.next = p.end = (char *)0x12345; + + p.error = 0; + bracket_empty(&p); + if (p.error != 7) + abort (); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050131-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050131-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050131-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050131-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,18 @@ +/* Verify that we do not lose side effects on a MOD expression. */ + +#include +#include + +int +foo (int a) +{ + int x = 0 % a++; + return a; +} + +main() +{ + if (foo (9) != 10) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050203-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050203-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050203-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050203-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,29 @@ +/* Reduced testcase extracted from Samba source code. 
*/ + +#include + +static void __attribute__((__noinline__)) + foo (unsigned char *p) { + *p = 0x81; +} + +static void __attribute__((__noinline__)) + bar (int x) { + asm (""); +} + +int main() { + unsigned char b; + + foo(&b); + if (b & 0x80) + { + bar (b & 0x7f); + exit (0); + } + else + { + bar (b & 1); + abort (); + } +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050215-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050215-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050215-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050215-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,25 @@ +/* PR middle-end/19857 */ + +typedef struct { char c[8]; } V +#ifdef __ELF__ + __attribute__ ((aligned (8))) +#endif + ; +typedef __SIZE_TYPE__ size_t; +V v; +void abort (void); + +int +main (void) +{ + V *w = &v; + if (((size_t) ((float *) ((size_t) w & ~(size_t) 3)) % 8) != 0 + || ((size_t) w & 1)) + { +#ifndef __ELF__ + if (((size_t) &v & 7) == 0) +#endif + abort (); + } + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050218-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050218-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050218-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050218-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,30 @@ +/* PR tree-optimization/19828 */ +typedef __SIZE_TYPE__ size_t; +extern size_t strlen (const char *s); +extern int strncmp (const char *s1, const char *s2, size_t n); +extern void abort (void); + +const char *a[16] = { "a", "bc", "de", "fgh" }; 
+ +int +foo (char *x, const char *y, size_t n) +{ + size_t i, j = 0; + for (i = 0; i < n; i++) + { + if (strncmp (x + j, a[i], strlen (a[i])) != 0) + return 2; + j += strlen (a[i]); + if (y) + j += strlen (y); + } + return 0; +} + +int +main (void) +{ + if (foo ("abcde", (const char *) 0, 3) != 0) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050224-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050224-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050224-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050224-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,33 @@ +/* Origin: Mikael Pettersson and the Linux kernel. */ + +extern void abort (void); +unsigned long a = 0xc0000000, b = 0xd0000000; +unsigned long c = 0xc01bb958, d = 0xc0264000; +unsigned long e = 0xc0288000, f = 0xc02d4378; + +void +foo (int x, int y, int z) +{ + if (x != 245 || y != 36 || z != 444) + abort (); +} + +int +main (void) +{ + unsigned long g; + int h = 0, i = 0, j = 0; + + if (sizeof (unsigned long) < 4) + return 0; + + for (g = a; g < b; g += 0x1000) + if (g < c) + h++; + else if (g >= d && g < e) + j++; + else if (g < f) + i++; + foo (i, j, h); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050316-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050316-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050316-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050316-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,71 @@ +/* PR rtl-optimization/16104 */ +/* { 
dg-require-effective-target int32plus } */ +/* { dg-options "-Wno-psabi" } */ + +extern void abort (void); + +typedef int V2SI __attribute__ ((vector_size (8))); +typedef unsigned int V2USI __attribute__ ((vector_size (8))); +typedef short V2HI __attribute__ ((vector_size (4))); +typedef unsigned int V2UHI __attribute__ ((vector_size (4))); + +int +test1 (void) +{ + return (long long) (V2SI) 0LL; +} + +int +test2 (V2SI x) +{ + return (long long) x; +} + +V2SI +test3 (void) +{ + return (V2SI) (long long) (int) (V2HI) 0; +} + +V2SI +test4 (V2HI x) +{ + return (V2SI) (long long) (int) x; +} + +V2SI +test5 (V2USI x) +{ + return (V2SI) x; +} + +int +main (void) +{ + if (sizeof (short) != 2 || sizeof (int) != 4 || sizeof (long long) != 8) + return 0; + + if (test1 () != 0) + abort (); + + V2SI x = { 2, 2 }; + if (test2 (x) != 2) + abort (); + + union { V2SI x; int y[2]; V2USI z; long long l; } u; + u.x = test3 (); + if (u.y[0] != 0 || u.y[1] != 0) + abort (); + + V2HI y = { 4, 4 }; + union { V2SI x; long long y; } v; + v.x = test4 (y); + if (v.y != 0x40004) + abort (); + + V2USI z = { 6, 6 }; + u.x = test5 (z); + if (u.y[0] != 6 || u.y[1] != 6) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050316-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050316-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050316-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050316-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,54 @@ +/* This testcase generates MMX instructions together with x87 instructions. + Currently, there is no "emms" generated to switch between register sets, + so the testcase fails for targets where MMX insns are enabled. 
*/ +/* { dg-options "-mno-mmx -Wno-psabi" { target { x86_64-*-* i?86-*-* } } } */ + +extern void abort (void); + +typedef int V2SI __attribute__ ((vector_size (8))); +typedef unsigned int V2USI __attribute__ ((vector_size (8))); +typedef float V2SF __attribute__ ((vector_size (8))); +typedef short V2HI __attribute__ ((vector_size (4))); +typedef unsigned int V2UHI __attribute__ ((vector_size (4))); + +long long +test1 (V2SF x) +{ + return (long long) (V2SI) x; +} + +long long +test2 (V2SF x) +{ + return (long long) x; +} + +long long +test3 (V2SI x) +{ + return (long long) (V2SF) x; +} + +int +main (void) +{ + if (sizeof (short) != 2 || sizeof (int) != 4 || sizeof (long long) != 8) + return 0; + + V2SF x = { 2.0, 2.0 }; + union { long long l; float f[2]; int i[2]; } u; + u.l = test1 (x); + if (u.f[0] != 2.0 || u.f[1] != 2.0) + abort (); + + V2SF y = { 6.0, 6.0 }; + u.l = test2 (y); + if (u.f[0] != 6.0 || u.f[1] != 6.0) + abort (); + + V2SI z = { 4, 4 }; + u.l = test3 (z); + if (u.i[0] != 4 || u.i[1] != 4) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050316-3.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050316-3.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050316-3.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050316-3.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,37 @@ +/* { dg-options "-Wno-psabi" } */ +extern void abort (void); + +typedef int V2SI __attribute__ ((vector_size (8))); +typedef unsigned int V2USI __attribute__ ((vector_size (8))); +typedef short V2HI __attribute__ ((vector_size (4))); +typedef unsigned int V2UHI __attribute__ ((vector_size (4))); + +V2USI +test1 (V2SI x) +{ + return (V2USI) (V2SI) (long long) x; +} + +long long +test2 (V2SI x) +{ + return (long long) (V2USI) 
(V2SI) (long long) x; +} + +int +main (void) +{ + if (sizeof (short) != 2 || sizeof (int) != 4 || sizeof (long long) != 8) + return 0; + + union { V2SI x; int y[2]; V2USI z; long long l; } u; + V2SI a = { -3, -3 }; + u.z = test1 (a); + if (u.y[0] != -3 || u.y[1] != -3) + abort (); + + u.l = test2 (a); + if (u.y[0] != -3 || u.y[1] != -3) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050410-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050410-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050410-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050410-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,13 @@ +int s = 200; +int __attribute__((noinline)) +foo (void) +{ + return (signed char) (s - 100) - 5; +} +int +main (void) +{ + if (foo () != 95) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050502-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050502-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050502-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050502-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,67 @@ +/* PR rtl-optimization/21330 */ + +extern void abort (void); +extern int strcmp (const char *, const char *); + +int +__attribute__((noinline)) +bar (const char **x) +{ + return *(*x)++; +} + +int +__attribute__((noinline)) +baz (int c) +{ + return c != '@'; +} + +void +__attribute__((noinline)) +foo (const char **w, char *x, _Bool y, _Bool z) +{ + char c = bar (w); + int i = 0; + + while (1) + { + x[i++] = c; + c = bar (w); 
+ if (y && c == '\'') + break; + if (z && c == '\"') + break; + if (!y && !z && !baz (c)) + break; + } + x[i] = 0; +} + +int +main (void) +{ + char buf[64]; + const char *p; + p = "abcde'fgh"; + foo (&p, buf, 1, 0); + if (strcmp (p, "fgh") != 0 || strcmp (buf, "abcde") != 0) + abort (); + p = "ABCDEFG\"HI"; + foo (&p, buf, 0, 1); + if (strcmp (p, "HI") != 0 || strcmp (buf, "ABCDEFG") != 0) + abort (); + p = "abcd\"e'fgh"; + foo (&p, buf, 1, 1); + if (strcmp (p, "e'fgh") != 0 || strcmp (buf, "abcd") != 0) + abort (); + p = "ABCDEF'G\"HI"; + foo (&p, buf, 1, 1); + if (strcmp (p, "G\"HI") != 0 || strcmp (buf, "ABCDEF") != 0) + abort (); + p = "abcdef@gh"; + foo (&p, buf, 0, 0); + if (strcmp (p, "gh") != 0 || strcmp (buf, "abcdef") != 0) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050502-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050502-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050502-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050502-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,34 @@ +/* PR target/21297 */ +typedef __SIZE_TYPE__ size_t; +extern int memcmp (const char *, const char *, size_t); +extern void abort (); + +void +foo (char *x) +{ + int i; + for (i = 0; i < 2; i++); + x[i + i] = '\0'; +} + +void +bar (char *x) +{ + int i; + for (i = 0; i < 2; i++); + x[i + i + i + i] = '\0'; +} + +int +main (void) +{ + char x[] = "IJKLMNOPQR"; + foo (x); + if (memcmp (x, "IJKL\0NOPQR", sizeof x) != 0) + abort (); + x[4] = 'M'; + bar (x); + if (memcmp (x, "IJKLMNOP\0R", sizeof x) != 0) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050604-1.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050604-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050604-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050604-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,43 @@ +/* PR regression/21897 */ +/* This testcase generates MMX instructions together with x87 instructions. + Currently, there is no "emms" generated to switch between register sets, + so the testcase fails for targets where MMX insns are enabled. */ +/* { dg-options "-mno-mmx" { target { x86_64-*-* i?86-*-* } } } */ + +extern void abort (void); + +typedef unsigned short v4hi __attribute__ ((vector_size (8))); +typedef float v4sf __attribute__ ((vector_size (16))); + +union +{ + v4hi v; + short s[4]; +} u; + +union +{ + v4sf v; + float f[4]; +} v; + +void +foo (void) +{ + unsigned int i; + for (i = 0; i < 2; i++) + u.v += (v4hi) { 12, 32768 }; + for (i = 0; i < 2; i++) + v.v += (v4sf) { 18.0, 20.0, 22 }; +} + +int +main (void) +{ + foo (); + if (u.s[0] != 24 || u.s[1] != 0 || u.s[2] || u.s[3]) + abort (); + if (v.f[0] != 36.0 || v.f[1] != 40.0 || v.f[2] != 44.0 || v.f[3] != 0.0) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050607-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050607-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050607-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050607-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,16 @@ +/* PR middle-end/21850 */ + +extern void abort (void); + +typedef int V2SI __attribute__ ((vector_size (8))); + +int +main (void) +{ +#if 
(__INT_MAX__ == 2147483647) \ + && (__LONG_LONG_MAX__ == 9223372036854775807LL) + if (((int)(long long)(V2SI){ 2, 2 }) != 2) + abort (); +#endif + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050613-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050613-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050613-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050613-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,27 @@ +/* PR tree-optimization/22043 */ + +extern void abort (void); + +struct A { int i; int j; int k; int l; }; +struct B { struct A a; int r[1]; }; +struct C { struct A a; int r[0]; }; +struct D { struct A a; int r[]; }; + +void +foo (struct A *x) +{ + if (x->i != 0 || x->j != 5 || x->k != 0 || x->l != 0) + abort (); +} + +int +main () +{ + struct B b = { .a.j = 5 }; + struct C c = { .a.j = 5 }; + struct D d = { .a.j = 5 }; + foo (&b.a); + foo (&c.a); + foo (&d.a); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050713-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050713-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050713-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050713-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,56 @@ +/* Test that sibling call is not used if there is an argument overlap. 
*/ + +extern void abort (void); + +struct S +{ + int a, b, c; +}; + +int +foo2 (struct S x, struct S y) +{ + if (x.a != 3 || x.b != 4 || x.c != 5) + abort (); + if (y.a != 6 || y.b != 7 || y.c != 8) + abort (); + return 0; +} + +int +foo3 (struct S x, struct S y, struct S z) +{ + foo2 (x, y); + if (z.a != 9 || z.b != 10 || z.c != 11) + abort (); + return 0; +} + +int +bar2 (struct S x, struct S y) +{ + return foo2 (y, x); +} + +int +bar3 (struct S x, struct S y, struct S z) +{ + return foo3 (y, x, z); +} + +int +baz3 (struct S x, struct S y, struct S z) +{ + return foo3 (y, z, x); +} + +int +main (void) +{ + struct S a = { 3, 4, 5 }, b = { 6, 7, 8 }, c = { 9, 10, 11 }; + + bar2 (b, a); + bar3 (b, a, c); + baz3 (c, a, b); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050826-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050826-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050826-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050826-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,44 @@ +/* PR rtl-optimization/23561 */ + +struct A +{ + char a1[1]; + char a2[5]; + char a3[1]; + char a4[2048 - 7]; +} a; + +typedef __SIZE_TYPE__ size_t; +extern void *memset (void *, int, size_t); +extern void *memcpy (void *, const void *, size_t); +extern int memcmp (const void *, const void *, size_t); +extern void abort (void); + +void +bar (struct A *x) +{ + size_t i; + if (memcmp (x, "\1HELLO\1", sizeof "\1HELLO\1")) + abort (); + for (i = 0; i < sizeof (x->a4); i++) + if (x->a4[i]) + abort (); +} + +int +foo (void) +{ + memset (&a, 0, sizeof (a)); + a.a1[0] = 1; + memcpy (a.a2, "HELLO", sizeof "HELLO"); + a.a3[0] = 1; + bar (&a); + return 0; +} + +int +main (void) +{ + foo (); + return 0; +} Added: 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050826-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050826-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050826-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050826-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,62 @@ +/* PR rtl-optimization/23560 */ + +struct rtattr +{ + unsigned short rta_len; + unsigned short rta_type; +}; + +__attribute__ ((noinline)) +int inet_check_attr (void *r, struct rtattr **rta) +{ + int i; + + for (i = 1; i <= 14; i++) + { + struct rtattr *attr = rta[i - 1]; + if (attr) + { + if (attr->rta_len - sizeof (struct rtattr) < 4) + return -22; + if (i != 9 && i != 8) + rta[i - 1] = attr + 1; + } + } + return 0; +} + +extern void abort (void); + +int +main (void) +{ + struct rtattr rt[2]; + struct rtattr *rta[14]; + int i; + + rt[0].rta_len = sizeof (struct rtattr) + 8; + rt[0].rta_type = 0; + rt[1] = rt[0]; + for (i = 0; i < 14; i++) + rta[i] = &rt[0]; + if (inet_check_attr (0, rta) != 0) + abort (); + for (i = 0; i < 14; i++) + if (rta[i] != &rt[i != 7 && i != 8]) + abort (); + for (i = 0; i < 14; i++) + rta[i] = &rt[0]; + rta[1] = 0; + rt[1].rta_len -= 8; + rta[5] = &rt[1]; + if (inet_check_attr (0, rta) != -22) + abort (); + for (i = 0; i < 14; i++) + if (i == 1 && rta[i] != 0) + abort (); + else if (i != 1 && i <= 5 && rta[i] != &rt[1]) + abort (); + else if (i > 5 && rta[i] != &rt[0]) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050929-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050929-1.c?rev=374156&view=auto ============================================================================== --- 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050929-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20050929-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,20 @@ +/* PR middle-end/24109 */ + +extern void abort (void); + +struct A { int i; int j; }; +struct B { struct A *a; struct A *b; }; +struct C { struct B *c; struct A *d; }; +struct C e = { &(struct B) { &(struct A) { 1, 2 }, &(struct A) { 3, 4 } }, &(struct A) { 5, 6 } }; + +int +main (void) +{ + if (e.c->a->i != 1 || e.c->a->j != 2) + abort (); + if (e.c->b->i != 3 || e.c->b->j != 4) + abort (); + if (e.d->i != 5 || e.d->j != 6) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20051012-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20051012-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20051012-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20051012-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,25 @@ +/* { dg-require-effective-target untyped_assembly } */ +extern void abort (void); + +struct type +{ + int *a; + + int b:16; + unsigned int p:9; +} t; + +unsigned int +foo () +{ + return t.p; +} + +int +main (void) +{ + t.p = 8; + if (foo (t) != 8) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20051021-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20051021-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20051021-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20051021-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,28 @@ +/* Verify that 
TRUTH_AND_EXPR is not wrongly changed to TRUTH_ANDIF_EXPR. */ + +extern void abort (void); + +int count = 0; + +int foo1(void) +{ + count++; + return 0; +} + +int foo2(void) +{ + count++; + return 0; +} + +int main(void) +{ + if ((foo1() == 1) & (foo2() == 1)) + abort (); + + if (count != 2) + abort (); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20051104-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20051104-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20051104-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20051104-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,17 @@ +/* PR rtl-optimization/23567 */ + +struct +{ + int len; + char *name; +} s; + +int +main (void) +{ + s.len = 0; + s.name = ""; + if (s.name [s.len] != 0) + s.name [s.len] = 0; + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20051110-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20051110-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20051110-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20051110-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,31 @@ +void add_unwind_adjustsp (long); +void abort (void); + +unsigned char bytes[5]; + +void +add_unwind_adjustsp (long offset) +{ + int n; + unsigned long o; + + o = (long) ((offset - 0x204) >> 2); + + n = 0; + while (o) + { + bytes[n] = o & 0x7f; + o >>= 7; + if (o) + bytes[n] |= 0x80; + n++; + } +} + +int main(void) +{ + add_unwind_adjustsp (4132); + if (bytes[0] != 0x88 || bytes[1] != 0x07) + abort (); + return 0; +} Added: 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20051110-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20051110-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20051110-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20051110-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,39 @@ +void add_unwind_adjustsp (long); +void abort (void); + +unsigned char bytes[5]; + +int flag; + +void +add_unwind_adjustsp (long offset) +{ + int n; + unsigned long o; + + o = (long) ((offset - 0x204) >> 2); + + n = 0; + do + { +a: + bytes[n] = o & 0x7f; + o >>= 7; + if (o) + { + bytes[n] |= 0x80; + if (flag) + goto a; + } + n++; + } + while (o); +} + +int main(void) +{ + add_unwind_adjustsp (4132); + if (bytes[0] != 0x88 || bytes[1] != 0x07) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20051113-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20051113-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20051113-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20051113-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,71 @@ +extern void *malloc(__SIZE_TYPE__); +extern void *memset(void *, int, __SIZE_TYPE__); +typedef struct +{ + short a; + unsigned short b; + unsigned short c; + unsigned long long Count; + long long Count2; +} __attribute__((packed)) Struct1; + +typedef struct +{ + short a; + unsigned short b; + unsigned short c; + unsigned long long d; + long long e; + long long f; +} __attribute__((packed)) Struct2; + +typedef union +{ + Struct1 a; + Struct2 b; +} Union; + +typedef struct +{ + int Count; + 
Union List[0]; +} __attribute__((packed)) Struct3; + +unsigned long long Sum (Struct3 *instrs) __attribute__((noinline)); +unsigned long long Sum (Struct3 *instrs) +{ + unsigned long long count = 0; + int i; + + for (i = 0; i < instrs->Count; i++) { + count += instrs->List[i].a.Count; + } + return count; +} +long long Sum2 (Struct3 *instrs) __attribute__((noinline)); +long long Sum2 (Struct3 *instrs) +{ + long long count = 0; + int i; + + for (i = 0; i < instrs->Count; i++) { + count += instrs->List[i].a.Count2; + } + return count; +} +main() { + Struct3 *p = malloc (sizeof (int) + 3 * sizeof(Union)); + memset(p, 0, sizeof(int) + 3*sizeof(Union)); + p->Count = 3; + p->List[0].a.Count = 555; + p->List[1].a.Count = 999; + p->List[2].a.Count = 0x101010101ULL; + p->List[0].a.Count2 = 555; + p->List[1].a.Count2 = 999; + p->List[2].a.Count2 = 0x101010101LL; + if (Sum(p) != 555 + 999 + 0x101010101ULL) + abort(); + if (Sum2(p) != 555 + 999 + 0x101010101LL) + abort(); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20051215-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20051215-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20051215-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20051215-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,28 @@ +/* PR rtl-optimization/24899 */ + +extern void abort (void); + +__attribute__ ((noinline)) int +foo (int x, int y, int *z) +{ + int a, b, c, d; + + a = b = 0; + for (d = 0; d < y; d++) + { + if (z) + b = d * *z; + for (c = 0; c < x; c++) + a += b; + } + + return a; +} + +int +main (void) +{ + if (foo (3, 2, 0) != 0) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20060102-1.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20060102-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20060102-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20060102-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,16 @@ +extern void abort (); + +int f(int x) +{ + return (x >> (sizeof (x) * __CHAR_BIT__ - 1)) ? -1 : 1; +} + +volatile int one = 1; +int main (void) +{ + /* Test that the function above returns different values for + different signs. */ + if (f(one) == f(-one)) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20060110-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20060110-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20060110-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20060110-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,16 @@ +extern void abort (void); + +long long +f (long long a) +{ + return (a << 32) >> 32; +} +long long a = 0x1234567876543210LL; +long long b = (0x1234567876543210LL << 32) >> 32; +int +main () +{ + if (f (a) != b) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20060110-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20060110-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20060110-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20060110-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,19 @@ +extern void 
abort (void); + +long long +f (long long a, long long b) +{ + return ((a + b) << 32) >> 32; +} + +long long a = 0x1234567876543210LL; +long long b = 0x2345678765432101LL; +long long c = ((0x1234567876543210LL + 0x2345678765432101LL) << 32) >> 32; + +int +main () +{ + if (f (a, b) != c) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20060127-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20060127-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20060127-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20060127-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,17 @@ +void abort (); + +void +f (long long a) +{ + if ((a & 0xffffffffLL) != 0) + abort (); +} + +long long a = 0x1234567800000000LL; + +int +main () +{ + f (a); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20060412-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20060412-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20060412-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20060412-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,33 @@ +extern void abort (void); + +struct S +{ + long o; +}; + +struct T +{ + long o; + struct S m[82]; +}; + +struct T t; + +int +main () +{ + struct S *p, *q; + + p = (struct S *) &t; + p = &((struct T *) p)->m[0]; + q = p + 82; + while (--q > p) + q->o = -1; + q->o = 0; + + if (q > p) + abort (); + if (q - p > 0) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20060420-1.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20060420-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20060420-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20060420-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,73 @@ +extern void abort (void); + +typedef float v4flt __attribute__ ((vector_size (16))); + +void __attribute__ ((noinline)) foo (float *dst, float **src, int a, int n) +{ + int i, j; + int z = sizeof (v4flt) / sizeof (float); + unsigned m = sizeof (v4flt) - 1; + + for (j = 0; j < n && (((unsigned long) dst + j) & m); ++j) + { + float t = src[0][j]; + for (i = 1; i < a; ++i) + t += src[i][j]; + dst[j] = t; + } + + for (; j < (n - (4 * z - 1)); j += 4 * z) + { + v4flt t0 = *(v4flt *) (src[0] + j + 0 * z); + v4flt t1 = *(v4flt *) (src[0] + j + 1 * z); + v4flt t2 = *(v4flt *) (src[0] + j + 2 * z); + v4flt t3 = *(v4flt *) (src[0] + j + 3 * z); + for (i = 1; i < a; ++i) + { + t0 += *(v4flt *) (src[i] + j + 0 * z); + t1 += *(v4flt *) (src[i] + j + 1 * z); + t2 += *(v4flt *) (src[i] + j + 2 * z); + t3 += *(v4flt *) (src[i] + j + 3 * z); + } + *(v4flt *) (dst + j + 0 * z) = t0; + *(v4flt *) (dst + j + 1 * z) = t1; + *(v4flt *) (dst + j + 2 * z) = t2; + *(v4flt *) (dst + j + 3 * z) = t3; + } + for (; j < n; ++j) + { + float t = src[0][j]; + for (i = 1; i < a; ++i) + t += src[i][j]; + dst[j] = t; + } +} + +float buffer[64]; + +int +main (void) +{ + int i; + float *dst, *src[2]; + char *cptr; + + cptr = (char *)buffer; + cptr += (-(long int) buffer & (16 * sizeof (float) - 1)); + dst = (float *)cptr; + src[0] = dst + 16; + src[1] = dst + 32; + for (i = 0; i < 16; ++i) + { + src[0][i] = (float) i + 11 * (float) i; + src[1][i] = (float) i + 12 * (float) i; + } + foo (dst, src, 2, 16); + for (i = 0; i < 16; ++i) + { + float e = (float) i + 11 * (float) i + (float) 
i + 12 * (float) i; + if (dst[i] != e) + abort (); + } + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20060905-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20060905-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20060905-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20060905-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,35 @@ +/* PR rtl-optimization/28386 */ +/* Origin: Volker Reichelt */ + +extern void abort(void); + +volatile char s[256][3]; + +char g; + +static void dummy(char a) +{ + g = a; +} + +static int foo(void) +{ + int i, j=0; + + for (i = 0; i < 256; i++) + if (i >= 128 && i < 256) + { + dummy (s[i - 128][0]); + ++j; + } + + return j; +} + +int main(void) +{ + if (foo () != 128) + abort (); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20060910-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20060910-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20060910-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20060910-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,37 @@ +/* PR rtl-optimization/28636 */ +/* Origin: Andreas Schwab */ + +extern void abort(void); + +struct input_ty +{ + unsigned char *buffer_position; + unsigned char *buffer_end; +}; + +int input_getc_complicated (struct input_ty *x) { return 0; } + +int check_header (struct input_ty *deeper) +{ + unsigned len; + for (len = 0; len < 6; len++) + if (((deeper)->buffer_position < (deeper)->buffer_end + ? 
*((deeper)->buffer_position)++ + : input_getc_complicated((deeper))) < 0) + return 0; + return 1; +} + +struct input_ty s; +unsigned char b[6]; + +int main (void) +{ + s.buffer_position = b; + s.buffer_end = b + sizeof b; + if (!check_header(&s)) + abort(); + if (s.buffer_position != s.buffer_end) + abort(); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20060929-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20060929-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20060929-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20060929-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,44 @@ +/* PR c/29154 */ + +extern void abort (void); + +void +foo (int **p, int *q) +{ + *(*p++)++ = *q++; +} + +void +bar (int **p, int *q) +{ + **p = *q++; + *(*p++)++; +} + +void +baz (int **p, int *q) +{ + **p = *q++; + (*p++)++; +} + +int +main (void) +{ + int i = 42, j = 0; + int *p = &i; + foo (&p, &j); + if (p - 1 != &i || j != 0 || i != 0) + abort (); + i = 43; + p = &i; + bar (&p, &j); + if (p - 1 != &i || j != 0 || i != 0) + abort (); + i = 44; + p = &i; + baz (&p, &j); + if (p - 1 != &i || j != 0 || i != 0) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20060930-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20060930-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20060930-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20060930-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,42 @@ +/* PR rtl-optimization/28096 */ +/* Origin: Jan Stein */ + +extern void abort (void); + +int 
bar (int, int) __attribute__((noinline)); +int bar (int a, int b) +{ + if (b != 1) + abort (); +} + +void foo(int, int) __attribute__((noinline)); +void foo (int e, int n) +{ + int i, bb2, bb5; + + if (e > 0) + e = -e; + + for (i = 0; i < n; i++) + { + if (e >= 0) + { + bb2 = 0; + bb5 = 0; + } + else + { + bb5 = -e; + bb2 = bb5; + } + + bar (bb5, bb2); + } +} + +int main(void) +{ + foo (1, 1); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20060930-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20060930-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20060930-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20060930-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,31 @@ +/* PR middle-end/29272 */ + +extern void abort (void); + +struct S { struct S *s; } s; +struct T { struct T *t; } t; + +static inline void +foo (void *s) +{ + struct T *p = s; + __builtin_memcpy (&p->t, &t.t, sizeof (t.t)); +} + +void * +__attribute__((noinline)) +bar (void *p, struct S *q) +{ + q->s = &s; + foo (p); + return q->s; +} + +int +main (void) +{ + t.t = &t; + if (bar (&s, &s) != (void *) &t) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20061031-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20061031-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20061031-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20061031-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,28 @@ +/* PR rtl-optimization/29631 */ +/* Origin: Falk Hueffner */ + +const signed char nunmap[] = { 17, -1, 1 }; + 
+__attribute__((noinline)) +void ff(int i) { + asm volatile(""); +} + +__attribute__((noinline)) +void f(short delta) +{ + short p0 = 2, s; + for (s = 0; s < 2; s++) + { + p0 += delta; + ff(s); + if (nunmap[p0] == 17) + asm volatile(""); + } +} + +int main(void) +{ + f(-1); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20061101-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20061101-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20061101-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20061101-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,33 @@ +/* PR rtl-optimization/28970 */ +/* Origin: Peter Bergner */ +/* { dg-require-effective-target int32plus } */ + +extern void abort (void); + +int tar (int i) +{ + if (i != 36863) + abort (); + + return -1; +} + +void bug(int q, int bcount) +{ + int j = 0; + int outgo = 0; + + while(j != -1) + { + outgo++; + if (outgo > q-1) + outgo = q-1; + j = tar (outgo*bcount); + } +} + +int main(void) +{ + bug(5, 36863); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20061101-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20061101-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20061101-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20061101-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,32 @@ +/* PR rtl-optimization/28970 */ +/* Origin: Peter Bergner */ + +extern void abort (void); + +int tar (long i) +{ + if (i != 36863) + abort (); + + return -1; +} + +void bug(int q, long bcount) +{ + int j = 0; + int outgo = 0; + + while(j != -1) + 
{ + outgo++; + if (outgo > q-1) + outgo = q-1; + j = tar (outgo*bcount); + } +} + +int main(void) +{ + bug(5, 36863); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20061220-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20061220-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20061220-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20061220-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,73 @@ +/* PR middle-end/30262 */ +/* { dg-skip-if "asm statements do not work as expected" { rl78-*-* } } */ +extern void abort (void); + +int +foo (void) +{ + unsigned int x = 0; + + void nested (void) + { + x = 254; + } + + nested (); + asm volatile ("" :: "r" (x)); + asm volatile ("" :: "m" (x)); + asm volatile ("" :: "mr" (x)); + asm volatile ("" : "=r" (x) : "0" (x)); + asm volatile ("" : "=m" (x) : "m" (x)); + return x; +} + +int +bar (void) +{ + unsigned int x = 0; + + void nested (void) + { + asm volatile ("" :: "r" (x)); + asm volatile ("" :: "m" (x)); + asm volatile ("" :: "mr" (x)); + x += 4; + asm volatile ("" : "=r" (x) : "0" (x)); + asm volatile ("" : "=m" (x) : "m" (x)); + } + + nested (); + return x; +} + +int +baz (void) +{ + unsigned int x = 0; + + void nested (void) + { + void nested2 (void) + { + asm volatile ("" :: "r" (x)); + asm volatile ("" :: "m" (x)); + asm volatile ("" :: "mr" (x)); + x += 4; + asm volatile ("" : "=r" (x) : "0" (x)); + asm volatile ("" : "=m" (x) : "m" (x)); + } + nested2 (); + nested2 (); + } + + nested (); + return x; +} + +int +main (void) +{ + if (foo () != 254 || bar () != 4 || baz () != 8) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20070201-1.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20070201-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20070201-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20070201-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,20 @@ +/* PR middle-end/30473 */ + +extern int sprintf (char *, const char *, ...); +extern void abort (void); + +char * +foo (char *buf, char *p) +{ + sprintf (buf, "abcde", p++); + return p; +} + +int +main (void) +{ + char buf[6]; + if (foo (buf, &buf[2]) != &buf[3]) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20070212-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20070212-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20070212-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20070212-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,26 @@ +struct f +{ + int i; +}; + +int g(int i, int c, struct f *ff, int *p) +{ + int *t; + if (c) + t = &i; + else + t = &ff->i; + *p = 0; + return *t; +} + +extern void abort(void); + +int main() +{ + struct f f; + f.i = 1; + if (g(5, 0, &f, &f.i) != 0) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20070212-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20070212-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20070212-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20070212-2.c Wed Oct 9 
04:01:46 2019 @@ -0,0 +1,19 @@ +int f(int k, int i1, int j1) +{ + int *f1; + if(k) + f1 = &i1; + else + f1 = &j1; + i1 = 0; + return *f1; +} + +extern void abort (void); + +int main() +{ + if (f(1, 1, 2) != 0) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20070212-3.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20070212-3.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20070212-3.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20070212-3.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,30 @@ +struct foo { int i; int j; }; + +int bar (struct foo *k, int k2, int f, int f2) +{ + int *p, *q; + int res; + if (f) + p = &k->i; + else + p = &k->j; + res = *p; + k->i = 1; + if (f2) + q = p; + else + q = &k2; + return res + *q; +} + +extern void abort (void); + +int main() +{ + struct foo k; + k.i = 0; + k.j = 1; + if (bar (&k, 1, 1, 1) != 1) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20070424-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20070424-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20070424-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20070424-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,27 @@ +extern void abort (void); +extern void exit (int); + +void do_exit (void) { exit (0); } +void do_abort (void) { abort (); } + +void foo (int x, int a) +{ + if (x < a) + goto doit; + do_exit (); + if (x != a) + goto doit; + + /* else */ + do_abort (); + return; + +doit: + do_abort (); +} + +int main() +{ + foo (1, 0); + return 0; +} Added: 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20070517-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20070517-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20070517-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20070517-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,41 @@ +/* PR rtl-optimization/31691 */ +/* Origin: Chi-Hua Chen */ + +extern void abort (void); + +static int get_kind(int) __attribute__ ((noinline)); + +static int get_kind(int v) +{ + volatile int k = v; + return k; +} + +static int some_call(void) __attribute__ ((noinline)); + +static int some_call(void) +{ + return 0; +} + +static void example (int arg) +{ + int tmp, kind = get_kind (arg); + + if (kind == 9 || kind == 10 || kind == 5) + { + if (some_call() == 0) + { + if (kind == 9 || kind == 10) + tmp = arg; + else + abort(); + } + } +} + +int main(void) +{ + example(10); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20070614-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20070614-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20070614-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20070614-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,33 @@ +extern void abort (void); + +_Complex v = 3.0 + 1.0iF; + +void +foo (_Complex z, int *x) +{ + if (z != v) + abort (); +} + +_Complex bar (_Complex z) __attribute__ ((pure)); +_Complex +bar (_Complex z) +{ + return v; +} + +int +baz (void) +{ + int a, i; + for (i = 0; i < 6; i++) + foo (bar (1.0iF * i), &a); + return 0; +} + +int +main () +{ + baz (); + return 0; +} Added: 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20070623-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20070623-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20070623-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20070623-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,41 @@ +#include <limits.h> + +int __attribute__((noinline)) nge(int a, int b) {return -(a >= b);} +int __attribute__((noinline)) ngt(int a, int b) {return -(a > b);} +int __attribute__((noinline)) nle(int a, int b) {return -(a <= b);} +int __attribute__((noinline)) nlt(int a, int b) {return -(a < b);} +int __attribute__((noinline)) neq(int a, int b) {return -(a == b);} +int __attribute__((noinline)) nne(int a, int b) {return -(a != b);} +int __attribute__((noinline)) ngeu(unsigned a, unsigned b) {return -(a >= b);} +int __attribute__((noinline)) ngtu(unsigned a, unsigned b) {return -(a > b);} +int __attribute__((noinline)) nleu(unsigned a, unsigned b) {return -(a <= b);} +int __attribute__((noinline)) nltu(unsigned a, unsigned b) {return -(a < b);} + + +int main() +{ + if (nge(INT_MIN, INT_MAX) != 0) abort(); + if (nge(INT_MAX, INT_MIN) != -1) abort(); + if (ngt(INT_MIN, INT_MAX) != 0) abort(); + if (ngt(INT_MAX, INT_MIN) != -1) abort(); + if (nle(INT_MIN, INT_MAX) != -1) abort(); + if (nle(INT_MAX, INT_MIN) != 0) abort(); + if (nlt(INT_MIN, INT_MAX) != -1) abort(); + if (nlt(INT_MAX, INT_MIN) != 0) abort(); + + if (neq(INT_MIN, INT_MAX) != 0) abort(); + if (neq(INT_MAX, INT_MIN) != 0) abort(); + if (nne(INT_MIN, INT_MAX) != -1) abort(); + if (nne(INT_MAX, INT_MIN) != -1) abort(); + + if (ngeu(0, ~0U) != 0) abort(); + if (ngeu(~0U, 0) != -1) abort(); + if (ngtu(0, ~0U) != 0) abort(); + if (ngtu(~0U, 0) != -1) abort(); + if (nleu(0, ~0U) != -1) abort(); + if (nleu(~0U, 0) != 0) 
abort(); + if (nltu(0, ~0U) != -1) abort(); + if (nltu(~0U, 0) != 0) abort(); + + exit(0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20070724-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20070724-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20070724-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20070724-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,11 @@ +void abort (void); + +static unsigned char magic[] = "\235"; +static unsigned char value = '\235'; + +int main() +{ + if (value != magic[0]) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20070824-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20070824-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20070824-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20070824-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,25 @@ +/* PR tree-optimization/33136 */ +/* { dg-require-effective-target alloca } */ + +extern void abort (void); + +struct S +{ + struct S *a; + int b; +}; + +int +main (void) +{ + struct S *s = (struct S *) 0, **p, *n; + for (p = &s; *p; p = &(*p)->a); + n = (struct S *) __builtin_alloca (sizeof (*n)); + n->a = *p; + n->b = 1; + *p = n; + + if (!s) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20070919-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20070919-1.c?rev=374156&view=auto ============================================================================== --- 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20070919-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20070919-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,42 @@ +/* PR c/33238 */ +/* { dg-require-effective-target alloca } */ + +typedef __SIZE_TYPE__ size_t; +int memcmp (const void *, const void *, size_t); +void abort (void); + +void +__attribute__((noinline)) +bar (void *x, void *y) +{ + struct S { char w[8]; } *p = x, *q = y; + if (memcmp (p->w, "zyxwvut", 8) != 0) + abort (); + if (memcmp (q[0].w, "abcdefg", 8) != 0) + abort (); + if (memcmp (q[1].w, "ABCDEFG", 8) != 0) + abort (); + if (memcmp (q[2].w, "zyxwvut", 8) != 0) + abort (); + if (memcmp (q[3].w, "zyxwvut", 8) != 0) + abort (); +} + +void +__attribute__((noinline)) +foo (void *x, int y) +{ + struct S { char w[y]; } *p = x, a; + int i; + a = ({ struct S b; b = p[2]; p[3] = b; }); + bar (&a, x); +} + +int +main (void) +{ + struct S { char w[8]; } p[4] + = { "abcdefg", "ABCDEFG", "zyxwvut", "ZYXWVUT" }; + foo (p, 8); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20071011-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20071011-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20071011-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20071011-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,19 @@ +extern void abort(void); +void foo(int *p) +{ + int x; + int y; + x = *p; + *p = 0; + y = *p; + if (x != y) + return; + abort (); +} + +int main() +{ + int a = 1; + foo(&a); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20071018-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20071018-1.c?rev=374156&view=auto 
============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20071018-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20071018-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,31 @@ +extern void abort(void); + +struct foo { + int rank; + char *name; +}; + +struct mem { + struct foo *x[4]; +}; + +void __attribute__((noinline)) bar(struct foo **f) +{ + *f = __builtin_malloc(sizeof(struct foo)); +} +struct foo * __attribute__((noinline, noclone)) foo(int rank) +{ + void *x = __builtin_malloc(sizeof(struct mem)); + struct mem *as = x; + struct foo **upper = &as->x[rank * 8 - 5]; + *upper = 0; + bar(upper); + return *upper; +} + +int main() +{ + if (foo(1) == 0) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20071029-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20071029-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20071029-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20071029-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,56 @@ +extern void exit (int); +extern void abort (void); + +typedef union +{ + struct + { + int f1, f2, f3, f4, f5, f6, f7, f8; + long int f9, f10; + int f11; + } f; + char s[56]; + long int a; +} T; + +__attribute__((noinline)) +void +test (T *t) +{ + static int i = 11; + if (t->f.f1 != i++) + abort (); + if (t->f.f2 || t->f.f3 || t->f.f4 || t->f.f5 || t->f.f6 + || t->f.f7 || t->f.f8 || t->f.f9 || t->f.f10 || t->f.f11) + abort (); + if (i == 20) + exit (0); +} + +__attribute__((noinline)) +void +foo (int i) +{ + T t; +again: + t = (T) { { ++i, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 } }; + test (&t); + goto again; +} + +int +main (void) +{ + T *t1, *t2; + int cnt = 0; + t1 = (T *) 0; 
+loop: + t2 = t1; + t1 = & (T) { .f.f9 = cnt++ }; + if (cnt < 3) + goto loop; + if (t1 != t2 || t1->f.f9 != 2) + abort (); + foo (10); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20071030-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20071030-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20071030-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20071030-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,79 @@ +/* PR target/11044 */ +/* Originator: Tim McGrath */ +/* Testcase contributed by Eric Botcazou */ + +/* Testcase copied from gcc.target/i386/loop-3.c */ + +extern void *memset (void *, int, __SIZE_TYPE__); +extern void abort (void); + +typedef struct +{ + unsigned char colormod; +} entity_state_t; + +typedef struct +{ + int num_entities; + entity_state_t *entities; +} packet_entities_t; + +typedef struct +{ + double senttime; + float ping_time; + packet_entities_t entities; +} client_frame_t; + +typedef enum +{ + cs_free, + cs_server, + cs_zombie, + cs_connected, + cs_spawned +} sv_client_state_t; + +typedef struct client_s +{ + sv_client_state_t state; + int ping; + client_frame_t frames[64]; +} client_t; + +int CalcPing (client_t *cl) +{ + float ping; + int count, i; + register client_frame_t *frame; + + if (cl->state == cs_server) + return cl->ping; + ping = 0; + count = 0; + for (frame = cl->frames, i = 0; i < 64; i++, frame++) { + if (frame->ping_time > 0) { + ping += frame->ping_time; + count++; + } + } + if (!count) + return 9999; + ping /= count; + + return ping * 1000; +} + +int main(void) +{ + client_t cl; + + memset(&cl, 0, sizeof(cl)); + + cl.frames[0].ping_time = 1.0f; + + if (CalcPing(&cl) != 1000) + abort(); + + return 0; +} Added: 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20071108-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20071108-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20071108-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20071108-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,53 @@ +/* PR tree-optimization/32575 */ + +extern void abort (void); + +struct S +{ + void *s1, *s2; + unsigned char s3, s4, s5; +}; + +__attribute__ ((noinline)) +void * +foo (void) +{ + static struct S s; + return &s; +} + +__attribute__ ((noinline)) +void * +bar () +{ + return (void *) 0; +} + +__attribute__ ((noinline)) +struct S * +test (void *a, void *b) +{ + struct S *p, q; + p = foo (); + if (p == 0) + { + p = &q; + __builtin_memset (p, 0, sizeof (*p)); + } + p->s1 = a; + p->s2 = b; + if (p == &q) + p = 0; + return p; +} + +int +main (void) +{ + int a; + int b; + struct S *z = test ((void *) &a, (void *) &b); + if (z == 0 || z->s1 != (void *) &a || z->s2 != (void *) &b || z->s3 || z->s4) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20071120-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20071120-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20071120-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20071120-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,81 @@ +extern void abort (void); + +void __attribute__((noinline,noreturn)) +vec_assert_fail (void) +{ + abort (); +} + +struct ggc_root_tab { + void *base; +}; + +typedef struct deferred_access_check {} VEC_deferred_access_check_gc; + +typedef struct 
deferred_access { + VEC_deferred_access_check_gc* deferred_access_checks; + int deferring_access_checks_kind; +} deferred_access; + +typedef struct VEC_deferred_access_base { + unsigned num; + deferred_access vec[1]; +} VEC_deferred_access_base; + +static __inline__ deferred_access * +VEC_deferred_access_base_last (VEC_deferred_access_base *vec_) +{ + (void)((vec_ && vec_->num) ? 0 : (vec_assert_fail (), 0)); + return &vec_->vec[vec_->num - 1]; +} + +static __inline__ void +VEC_deferred_access_base_pop (VEC_deferred_access_base *vec_) +{ + (void)((vec_->num) ? 0 : (vec_assert_fail (), 0)); + --vec_->num; +} + +void __attribute__((noinline)) +perform_access_checks (VEC_deferred_access_check_gc* p) +{ + abort (); +} + +typedef struct VEC_deferred_access_gc { + VEC_deferred_access_base base; +} VEC_deferred_access_gc; + +static VEC_deferred_access_gc *deferred_access_stack; +static unsigned deferred_access_no_check; + +const struct ggc_root_tab gt_pch_rs_gt_cp_semantics_h[] = { + { + &deferred_access_no_check + } +}; + +void __attribute__((noinline)) pop_to_parent_deferring_access_checks (void) +{ + if (deferred_access_no_check) + deferred_access_no_check--; + else + { + VEC_deferred_access_check_gc *checks; + deferred_access *ptr; + checks = (VEC_deferred_access_base_last(deferred_access_stack ? &deferred_access_stack->base : 0))->deferred_access_checks; + VEC_deferred_access_base_pop(deferred_access_stack ? &deferred_access_stack->base : 0); + ptr = VEC_deferred_access_base_last(deferred_access_stack ? 
&deferred_access_stack->base : 0); + if (ptr->deferring_access_checks_kind == 0) + perform_access_checks (checks); + } +} + +int main() +{ + deferred_access_stack = __builtin_malloc (sizeof(VEC_deferred_access_gc) + sizeof(deferred_access) * 8); + deferred_access_stack->base.num = 2; + deferred_access_stack->base.vec[0].deferring_access_checks_kind = 1; + pop_to_parent_deferring_access_checks (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20071202-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20071202-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20071202-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20071202-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,25 @@ +extern void abort (void); +struct T { int t; int r[8]; }; +struct S { int a; int b; int c[6]; struct T d; }; + +__attribute__((noinline)) void +foo (struct S *s) +{ + *s = (struct S) { s->b, s->a, { 0, 0, 0, 0, 0, 0 }, s->d }; +} + +int +main (void) +{ + struct S s = { 6, 12, { 1, 2, 3, 4, 5, 6 }, + { 7, { 8, 9, 10, 11, 12, 13, 14, 15 } } }; + foo (&s); + if (s.a != 12 || s.b != 6 + || s.c[0] || s.c[1] || s.c[2] || s.c[3] || s.c[4] || s.c[5]) + abort (); + if (s.d.t != 7 || s.d.r[0] != 8 || s.d.r[1] != 9 || s.d.r[2] != 10 + || s.d.r[3] != 11 || s.d.r[4] != 12 || s.d.r[5] != 13 + || s.d.r[6] != 14 || s.d.r[7] != 15) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20071205-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20071205-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20071205-1.c (added) +++ 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20071205-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,17 @@ +/* PR middle-end/34337 */ + +extern void abort (void); + +int +foo (int x) +{ + return ((x << 8) & 65535) | 255; +} + +int +main (void) +{ + if (foo (0x32) != 0x32ff || foo (0x174) != 0x74ff) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20071210-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20071210-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20071210-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20071210-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,69 @@ +/* PR rtl-optimization/34302 */ +/* { dg-require-effective-target label_values } */ +/* { dg-require-effective-target indirect_jumps } */ + +extern void abort (void); + +struct S +{ + int n1, n2, n3, n4; +}; + +__attribute__((noinline)) struct S +foo (int x, int y, int z) +{ + if (x != 10 || y != 9 || z != 8) + abort (); + struct S s = { 1, 2, 3, 4 }; + return s; +} + +__attribute__((noinline)) void ** +bar (void **u, int *v) +{ + void **w = u; + int *s = v, x, y, z; + void **p, **q; + static void *l[] = { &&lab1, &&lab1, &&lab2, &&lab3, &&lab4 }; + + if (!u) + return l; + + q = *w++; + goto *q; +lab2: + p = q; + q = *w++; + x = s[2]; + y = s[1]; + z = s[0]; + s -= 1; + struct S r = foo (x, y, z); + s[3] = r.n1; + s[2] = r.n2; + s[1] = r.n3; + s[0] = r.n4; + goto *q; +lab3: + p = q; + q = *w++; + s += 1; + s[0] = 23; +lab1: + goto *q; +lab4: + return 0; +} + +int +main (void) +{ + void **u = bar ((void **) 0, (int *) 0); + void *t[] = { u[2], u[4] }; + int s[] = { 7, 8, 9, 10, 11, 12 }; + if (bar (t, &s[1]) != (void **) 0 + || s[0] != 4 || s[1] != 3 || s[2] != 2 || s[3] != 1 + || s[4] != 11 || s[5] != 12) + abort (); + 
return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20071211-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20071211-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20071211-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20071211-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,23 @@ +extern void abort() __attribute__ ((noreturn)); + +struct s +{ + unsigned long long f1 : 40; +#if(__SIZEOF_INT__ >= 4) + unsigned int f2 : 24; +#else + unsigned long int f2 : 24; +#endif +} sv; + +int main() +{ + int f2; + sv.f2 = (1 << 24) - 1; + __asm__ volatile ("" : : : "memory"); + ++sv.f2; + f2 = sv.f2; + if (f2 != 0) + abort(); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20071213-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20071213-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20071213-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20071213-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,53 @@ +/* PR target/34281 */ + +#include <stdarg.h> + +extern void abort (void); + +void +h (int x, va_list ap) +{ + switch (x) + { + case 1: + if (va_arg (ap, int) != 3 || va_arg (ap, int) != 4) + abort (); + return; + case 5: + if (va_arg (ap, int) != 9 || va_arg (ap, int) != 10) + abort (); + return; + default: + abort (); + } +} + +void +f1 (int i, long long int j, ...) +{ + va_list ap; + va_start (ap, j); + h (i, ap); + if (i != 1 || j != 2) + abort (); + va_end (ap); +} + +void +f2 (int i, int j, int k, long long int l, ...)
+{ + va_list ap; + va_start (ap, l); + h (i, ap); + if (i != 5 || j != 6 || k != 7 || l != 8) + abort (); + va_end (ap); +} + +int +main () +{ + f1 (1, 2, 3, 4); + f2 (5, 6, 7, 8, 9, 10); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20071216-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20071216-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20071216-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20071216-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,38 @@ +/* PR rtl-optimization/34490 */ + +extern void abort (void); + +static int x; + +int +__attribute__((noinline)) +bar (void) +{ + return x; +} + +int +foo (void) +{ + long int b = bar (); + if ((unsigned long) b < -4095L) + return b; + if (-b != 38) + b = -2; + return b + 1; +} + +int +main (void) +{ + x = 26; + if (foo () != 26) + abort (); + x = -39; + if (foo () != -1) + abort (); + x = -38; + if (foo () != -37) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20071219-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20071219-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20071219-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20071219-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,71 @@ +/* PR c++/34459 */ + +extern void abort (void); +extern void *memset (void *s, int c, __SIZE_TYPE__ n); + +struct S +{ + char s[25]; +}; + +struct S *p; + +void __attribute__((noinline,noclone)) +foo (struct S *x, int set) +{ + int i; + for (i = 0; i < sizeof (x->s); ++i) + if (x->s[i] != 0) + abort (); + else if 
(set) + x->s[i] = set; + p = x; +} + +void __attribute__((noinline,noclone)) +test1 (void) +{ + struct S a; + memset (&a.s, '\0', sizeof (a.s)); + foo (&a, 0); + struct S b = a; + foo (&b, 1); + b = a; + b = b; + foo (&b, 0); +} + +void __attribute__((noinline,noclone)) +test2 (void) +{ + struct S a; + memset (&a.s, '\0', sizeof (a.s)); + foo (&a, 0); + struct S b = a; + foo (&b, 1); + b = a; + b = *p; + foo (&b, 0); +} + +void __attribute__((noinline,noclone)) +test3 (void) +{ + struct S a; + memset (&a.s, '\0', sizeof (a.s)); + foo (&a, 0); + struct S b = a; + foo (&b, 1); + *p = a; + *p = b; + foo (&b, 0); +} + +int +main (void) +{ + test1 (); + test2 (); + test3 (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20071220-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20071220-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20071220-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20071220-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,42 @@ +/* PR tree-optimization/29484 */ +/* { dg-require-effective-target label_values } */ +/* { dg-require-effective-target indirect_jumps } */ + +extern void abort (void); + +void *__attribute__((noinline)) +baz (void **lab) +{ + asm volatile ("" : "+r" (lab)); + return *lab; +} + +static inline +int bar (void) +{ + static void *b[] = { &&addr }; + void *p = baz (b); + goto *p; +addr: + return 17; +} + +int __attribute__((noinline)) +f1 (void) +{ + return bar (); +} + +int __attribute__((noinline)) +f2 (void) +{ + return bar (); +} + +int +main (void) +{ + if (f1 () != 17 || f1 () != 17 || f2 () != 17 || f2 () != 17) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20071220-2.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20071220-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20071220-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20071220-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,40 @@ +/* PR tree-optimization/29484 */ +/* { dg-require-effective-target label_values } */ + +extern void abort (void); + +void *__attribute__((noinline)) +baz (void **lab) +{ + asm volatile ("" : "+r" (lab)); + return *lab; +} + +static inline +int bar (void) +{ + static void *b[] = { &&addr }; + baz (b); +addr: + return 17; +} + +int __attribute__((noinline)) +f1 (void) +{ + return bar (); +} + +int __attribute__((noinline)) +f2 (void) +{ + return bar (); +} + +int +main (void) +{ + if (f1 () != 17 || f1 () != 17 || f2 () != 17 || f2 () != 17) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20080117-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20080117-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20080117-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20080117-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,23 @@ +typedef struct gs_imager_state_s { + struct { + int half_width; + int cap; + float miter_limit; + } line_params; +} gs_imager_state; +static const gs_imager_state gstate_initial = { { 1 } }; +void gstate_path_memory(gs_imager_state *pgs) { + *pgs = gstate_initial; +} +int gs_state_update_overprint(void) +{ + return gstate_initial.line_params.half_width; +} + +extern void abort (void); +int main() +{ + if (gs_state_update_overprint() != 1) + abort (); + return 0; +} Added: 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20080122-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20080122-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20080122-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20080122-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,35 @@ +/* PR rtl-optimization/34628 */ +/* Origin: Martin Michlmayr */ + +typedef unsigned short u16; +typedef unsigned char u8; + +static void +do_segfault(u8 in_buf[], const u8 out_buf[], const int len) +{ + int i; + + for (i = 0; i < len; i++) { + asm(""); + + in_buf[2*i] = ( out_buf[2*i] | out_buf[(2*i)+1]<<8 ) & 0xFF; + + asm(""); + + in_buf[(2*i)+1] = ( out_buf[2*i] | out_buf[(2*i)+1]<<8 ) >> 8; + + asm(""); + } +} + +int main(int argc, char *argv[]) +{ + u8 outbuf[32] = "buffer "; + u8 inbuf[32] = "\f"; + + asm(""); + do_segfault(inbuf, outbuf, 12); + asm(""); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20080222-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20080222-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20080222-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20080222-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,22 @@ +extern void abort (void); + +struct container +{ + unsigned char data[1]; +}; + +unsigned char space[6] = {1, 2, 3, 4, 5, 6}; + +int +foo (struct container *p) +{ + return p->data[4]; +} + +int +main () +{ + if (foo ((struct container *) space) != 5) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20080408-1.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20080408-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20080408-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20080408-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,10 @@ +extern void abort (void); +int main () +{ + short ssi = 126; + unsigned short usi = 65280; + int fail = !(ssi < usi); + if (fail) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20080424-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20080424-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20080424-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20080424-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,31 @@ +/* PR tree-optimization/36008 */ + +extern void abort (void); + +int g[48][3][3]; + +void __attribute__ ((noinline)) +bar (int x[3][3], int y[3][3]) +{ + static int i; + if (x != g[i + 8] || y != g[i++]) + abort (); +} + +static inline void __attribute__ ((always_inline)) +foo (int x[][3][3]) +{ + int i; + for (i = 0; i < 8; i++) + { + int k = i + 8; + bar (x[k], x[k - 8]); + } +} + +int +main () +{ + foo (g); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20080502-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20080502-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20080502-1.c (added) +++ 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20080502-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,16 @@ +/* PR target/36090 */ + +extern void abort (void); + +long double __attribute__ ((noinline)) foo (long double x) +{ + return __builtin_signbit (x) ? 3.1415926535897932384626433832795029L : 0.0; +} + +int +main (void) +{ + if (foo (-1.0L) != 3.1415926535897932384626433832795029L) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20080506-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20080506-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20080506-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20080506-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,21 @@ +/* PR middle-end/36137 */ +extern void abort (void); + +#define MIN(a, b) ((a) < (b) ? (a) : (b)) +#define MAX(a, b) ((a) > (b) ? 
(a) : (b)) + +int +main () +{ + unsigned int u; + int i = -1; + + u = MAX ((unsigned int) MAX (i, 0), 1); + if (u != 1) + abort (); + + u = MIN ((unsigned int) MAX (i, 0), (unsigned int) i); + if (u != 0) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20080506-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20080506-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20080506-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20080506-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,21 @@ +/* PR middle-end/36013 */ + +extern void abort (void); + +void __attribute__((noinline)) +foo (int **__restrict p, int **__restrict q) +{ + *p[0] = 1; + *q[0] = 2; + if (*p[0] != 2) + abort (); +} + +int +main (void) +{ + int a; + int *p1 = &a, *p2 = &a; + foo (&p1, &p2); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20080519-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20080519-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20080519-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20080519-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,58 @@ +extern void abort (void); + +typedef unsigned long HARD_REG_SET[2]; +HARD_REG_SET reg_class_contents[2]; + +struct du_chain +{ + struct du_chain *next_use; + int cl; +}; + +void __attribute__((noinline)) +merge_overlapping_regs (HARD_REG_SET *p) +{ + if ((*p)[0] != -1 || (*p)[1] != -1) + abort (); +} + +void __attribute__((noinline)) +regrename_optimize (struct du_chain *this) +{ + HARD_REG_SET this_unavailable; + unsigned long *scan_fp_; 
+ int n_uses; + struct du_chain *last; + + this_unavailable[0] = 0; + this_unavailable[1] = 0; + + n_uses = 0; + for (last = this; last->next_use; last = last->next_use) + { + scan_fp_ = reg_class_contents[last->cl]; + n_uses++; + this_unavailable[0] |= ~ scan_fp_[0]; + this_unavailable[1] |= ~ scan_fp_[1]; + } + if (n_uses < 1) + return; + + scan_fp_ = reg_class_contents[last->cl]; + this_unavailable[0] |= ~ scan_fp_[0]; + this_unavailable[1] |= ~ scan_fp_[1]; + + merge_overlapping_regs (&this_unavailable); +} + +int main() +{ + struct du_chain du1 = { 0, 0 }; + struct du_chain du0 = { &du1, 1 }; + reg_class_contents[0][0] = -1; + reg_class_contents[0][1] = -1; + reg_class_contents[1][0] = 0; + reg_class_contents[1][1] = 0; + regrename_optimize (&du0); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20080522-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20080522-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20080522-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20080522-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,43 @@ +/* This testcase is to make sure we have i in referenced vars and that we + properly compute aliasing for the loads and stores. 
*/ + +extern void abort (void); + +static int i; +static int *p = &i; + +int __attribute__((noinline)) +foo(int *q) +{ + *p = 1; + *q = 2; + return *p; +} + +int __attribute__((noinline)) +bar(int *q) +{ + *q = 2; + *p = 1; + return *q; +} + +int main() +{ + int j = 0; + + if (foo(&i) != 2) + abort (); + if (bar(&i) != 1) + abort (); + if (foo(&j) != 1) + abort (); + if (j != 2) + abort (); + if (bar(&j) != 2) + abort (); + if (j != 2) + abort (); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20080529-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20080529-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20080529-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20080529-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,17 @@ +/* PR target/36362 */ + +extern void abort (void); + +int +test (float c) +{ + return !!c * 7LL == 0; +} + +int +main (void) +{ + if (test (1.0f) != 0) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20080604-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20080604-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20080604-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20080604-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,31 @@ +struct barstruct { char const* some_string; } x; +extern void abort (void); +void __attribute__((noinline)) +foo(void) +{ + if (!x.some_string) + abort (); +} +void baz(int b) +{ + struct barstruct bar; + struct barstruct* barptr; + if (b) + barptr = &bar; + else + { + barptr = &x + 1; + barptr = barptr - 1; + } + 
barptr->some_string = "Everything OK"; + foo(); + barptr->some_string = "Everything OK"; +} +int main() +{ + x.some_string = (void *)0; + baz(0); + if (!x.some_string) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20080719-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20080719-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20080719-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20080719-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,65 @@ +typedef unsigned int u32; + +static const u32 deadfish = 0xdeadf155; + +static const u32 cfb_tab8_be[] = { + 0x00000000,0x000000ff,0x0000ff00,0x0000ffff, + 0x00ff0000,0x00ff00ff,0x00ffff00,0x00ffffff, + 0xff000000,0xff0000ff,0xff00ff00,0xff00ffff, + 0xffff0000,0xffff00ff,0xffffff00,0xffffffff +}; + +static const u32 cfb_tab8_le[] = { + 0x00000000,0xff000000,0x00ff0000,0xffff0000, + 0x0000ff00,0xff00ff00,0x00ffff00,0xffffff00, + 0x000000ff,0xff0000ff,0x00ff00ff,0xffff00ff, + 0x0000ffff,0xff00ffff,0x00ffffff,0xffffffff +}; + +static const u32 cfb_tab16_be[] = { + 0x00000000, 0x0000ffff, 0xffff0000, 0xffffffff +}; + +static const u32 cfb_tab16_le[] = { + 0x00000000, 0xffff0000, 0x0000ffff, 0xffffffff +}; + +static const u32 cfb_tab32[] = { + 0x00000000, 0xffffffff +}; + + + + + + +const u32 *xxx(int bpp) +{ + const u32 *tab; + +if (0) return &deadfish; + + switch (bpp) { + case 8: + tab = cfb_tab8_be; + break; + case 16: + tab = cfb_tab16_be; + break; + case 32: + default: + tab = cfb_tab32; + break; + } + + return tab; +} + +int main(void) +{ + const u32 *a = xxx(8); + int b = a[0]; + if (b != cfb_tab8_be[0]) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20080813-1.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20080813-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20080813-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20080813-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,30 @@ +/* PR middle-end/37103 */ + +extern void abort (void); + +void +foo (unsigned short x) +{ + signed char y = -1; + if (x == y) + abort (); +} + +void +bar (unsigned short x) +{ + unsigned char y = -1; + if (x == y) + abort (); +} + +int +main (void) +{ + if (sizeof (int) == sizeof (short)) + return 0; + foo (-1); + if (sizeof (short) > 1) + bar (-1); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20081103-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20081103-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20081103-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20081103-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,17 @@ +struct S { char c; char arr[4]; float f; }; + +char A[4] = { '1', '2', '3', '4' }; + +void foo (struct S s) +{ + if (__builtin_memcmp (s.arr, A, 4)) + __builtin_abort (); +} + +int main (void) +{ + struct S s; + __builtin_memcpy (s.arr, A, 4); + foo (s); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20081112-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20081112-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20081112-1.c (added) +++ 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20081112-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,17 @@ +#include <limits.h> + +extern void abort (void); + +static __attribute__((noinline)) void foo (int a) +{ + int b = (a - 1) + INT_MIN; + + if (b != INT_MIN) + abort (); +} + +int main (void) +{ + foo (1); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20081117-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20081117-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20081117-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20081117-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,25 @@ +/* { dg-require-effective-target int32plus } */ +extern void abort (void); + +struct s +{ + unsigned long long a:16; + unsigned long long b:32; + unsigned long long c:16; +}; + +__attribute__ ((noinline)) unsigned +f (struct s s, unsigned i) +{ + return s.b == i; +} + +struct s s = { 1, 0x87654321u, 2}; + +int +main () +{ + if (!f (s, 0x87654321u)) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20081218-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20081218-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20081218-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20081218-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,39 @@ +struct A { int i, j; char pad[512]; } a; + +int __attribute__((noinline)) +foo (void) +{ + __builtin_memset (&a, 0x26, sizeof a); + return a.i; +} + +void __attribute__((noinline)) +bar (void) +{ + __builtin_memset (&a, 0x36, sizeof a); + a.i = 0x36363636;
+ a.j = 0x36373636; +} + +int +main (void) +{ + int i; + if (sizeof (int) != 4 || __CHAR_BIT__ != 8) + return 0; + + if (foo () != 0x26262626) + __builtin_abort (); + for (i = 0; i < sizeof a; i++) + if (((char *)&a)[i] != 0x26) + __builtin_abort (); + + bar (); + if (a.j != 0x36373636) + __builtin_abort (); + a.j = 0x36363636; + for (i = 0; i < sizeof a; i++) + if (((char *)&a)[i] != 0x36) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20090113-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20090113-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20090113-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20090113-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,61 @@ +typedef struct descriptor_dimension +{ + int stride; + int lbound; + int ubound; +} descriptor_dimension; +typedef struct { + int *data; + int dtype; + descriptor_dimension dim[7]; +} gfc_array_i4; + +void +msum_i4 (gfc_array_i4 * const retarray, + gfc_array_i4 * const array, + const int * const pdim) +{ + int count[7]; + int extent[7]; + int * dest; + const int * base; + int dim; + int n; + int len; + + dim = (*pdim) - 1; + len = array->dim[dim].ubound + 1 - array->dim[dim].lbound; + + for (n = 0; n < dim; n++) + { + extent[n] = array->dim[n].ubound + 1 - array->dim[n].lbound; + count[n] = 0; + } + + dest = retarray->data; + base = array->data; + + do + { + int result = 0; + + for (n = 0; n < len; n++, base++) + result += *base; + *dest = result; + + count[0]++; + dest += 1; + } + while (count[0] != extent[0]); +} + +int main() +{ + int rdata[3]; + int adata[9]; + gfc_array_i4 retarray = { rdata, 265, { { 1, 1, 3 } } }; + gfc_array_i4 array = { adata, 266, { { 1, 1, 3 }, { 3, 1, 3 } } }; + int dim = 2; + msum_i4 (&retarray, &array, 
&dim); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20090113-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20090113-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20090113-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20090113-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,159 @@ +struct obstack {}; +struct bitmap_head_def; +typedef struct bitmap_head_def *bitmap; +typedef const struct bitmap_head_def *const_bitmap; +typedef unsigned long BITMAP_WORD; +typedef struct bitmap_obstack +{ + struct bitmap_element_def *elements; + struct bitmap_head_def *heads; + struct obstack obstack; +} bitmap_obstack; +typedef struct bitmap_element_def +{ + struct bitmap_element_def *next; + struct bitmap_element_def *prev; + unsigned int indx; + BITMAP_WORD bits[((128 + (8 * 8 * 1u) - 1) / (8 * 8 * 1u))]; +} bitmap_element; + +struct bitmap_descriptor; + +typedef struct bitmap_head_def { + bitmap_element *first; + bitmap_element *current; + unsigned int indx; + bitmap_obstack *obstack; +} bitmap_head; + +bitmap_element bitmap_zero_bits; + +typedef struct +{ + bitmap_element *elt1; + bitmap_element *elt2; + unsigned word_no; + BITMAP_WORD bits; +} bitmap_iterator; + +static void __attribute__((noinline)) +bmp_iter_set_init (bitmap_iterator *bi, const_bitmap map, + unsigned start_bit, unsigned *bit_no) +{ + bi->elt1 = map->first; + bi->elt2 = ((void *)0); + + while (1) + { + if (!bi->elt1) + { + bi->elt1 = &bitmap_zero_bits; + break; + } + + if (bi->elt1->indx >= start_bit / (((128 + (8 * 8 * 1u) - 1) / (8 * 8 * 1u)) * (8 * 8 * 1u))) + break; + bi->elt1 = bi->elt1->next; + } + + if (bi->elt1->indx != start_bit / (((128 + (8 * 8 * 1u) - 1) / (8 * 8 * 1u)) * (8 * 8 * 1u))) + start_bit = bi->elt1->indx * (((128 + (8 * 8 * 1u) - 1) / 
(8 * 8 * 1u)) * (8 * 8 * 1u)); + + bi->word_no = start_bit / (8 * 8 * 1u) % ((128 + (8 * 8 * 1u) - 1) / (8 * 8 * 1u)); + bi->bits = bi->elt1->bits[bi->word_no]; + bi->bits >>= start_bit % (8 * 8 * 1u); + + start_bit += !bi->bits; + + *bit_no = start_bit; +} + +static void __attribute__((noinline)) +bmp_iter_next (bitmap_iterator *bi, unsigned *bit_no) +{ + bi->bits >>= 1; + *bit_no += 1; +} + +static unsigned char __attribute__((noinline)) +bmp_iter_set_tail (bitmap_iterator *bi, unsigned *bit_no) +{ + while (!(bi->bits & 1)) + { + bi->bits >>= 1; + *bit_no += 1; + } + return 1; +} + +static __inline__ unsigned char +bmp_iter_set (bitmap_iterator *bi, unsigned *bit_no) +{ + unsigned bno = *bit_no; + BITMAP_WORD bits = bi->bits; + bitmap_element *elt1; + + if (bits) + { + while (!(bits & 1)) + { + bits >>= 1; + bno += 1; + } + *bit_no = bno; + return 1; + } + + *bit_no = ((bno + 64 - 1) / 64 * 64); + bi->word_no++; + + elt1 = bi->elt1; + while (1) + { + while (bi->word_no != 2) + { + bi->bits = elt1->bits[bi->word_no]; + if (bi->bits) + { + bi->elt1 = elt1; + return bmp_iter_set_tail (bi, bit_no); + } + *bit_no += 64; + bi->word_no++; + } + + elt1 = elt1->next; + if (!elt1) + { + bi->elt1 = elt1; + return 0; + } + *bit_no = elt1->indx * (2 * 64); + bi->word_no = 0; + } +} + +extern void abort (void); + +static void __attribute__((noinline)) catchme(int i) +{ + if (i != 0 && i != 64) + abort (); +} +static void __attribute__((noinline)) foobar (bitmap_head *chain) +{ + bitmap_iterator rsi; + unsigned int regno; + for (bmp_iter_set_init (&(rsi), (chain), (0), &(regno)); + bmp_iter_set (&(rsi), &(regno)); + bmp_iter_next (&(rsi), &(regno))) + catchme(regno); +} + +int main() +{ + bitmap_element elem = { (void *)0, (void *)0, 0, { 1, 1 } }; + bitmap_head live_throughout = { &elem, &elem, 0, (void *)0 }; + foobar (&live_throughout); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20090113-3.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20090113-3.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20090113-3.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20090113-3.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,137 @@ +struct obstack {}; +struct bitmap_head_def; +typedef struct bitmap_head_def *bitmap; +typedef const struct bitmap_head_def *const_bitmap; +typedef unsigned long BITMAP_WORD; + +typedef struct bitmap_obstack +{ + struct bitmap_element_def *elements; + struct bitmap_head_def *heads; + struct obstack obstack; +} bitmap_obstack; +typedef struct bitmap_element_def +{ + struct bitmap_element_def *next; + struct bitmap_element_def *prev; + unsigned int indx; + BITMAP_WORD bits[(2)]; +} bitmap_element; + +struct bitmap_descriptor; + +typedef struct bitmap_head_def { + bitmap_element *first; + bitmap_element *current; + unsigned int indx; + bitmap_obstack *obstack; +} bitmap_head; + +bitmap_element bitmap_zero_bits; + +typedef struct +{ + bitmap_element *elt1; + bitmap_element *elt2; + unsigned word_no; + BITMAP_WORD bits; +} bitmap_iterator; + +static __attribute__((noinline)) void +bmp_iter_set_init (bitmap_iterator *bi, const_bitmap map, + unsigned start_bit, unsigned *bit_no) +{ + bi->elt1 = map->first; + bi->elt2 = ((void *)0); + + while (1) + { + if (!bi->elt1) + { + bi->elt1 = &bitmap_zero_bits; + break; + } + + if (bi->elt1->indx >= start_bit / (128u)) + break; + bi->elt1 = bi->elt1->next; + } + + if (bi->elt1->indx != start_bit / (128u)) + start_bit = bi->elt1->indx * (128u); + + bi->word_no = start_bit / 64u % (2); + bi->bits = bi->elt1->bits[bi->word_no]; + bi->bits >>= start_bit % 64u; + + start_bit += !bi->bits; + + *bit_no = start_bit; +} + +static __inline__ __attribute__((always_inline)) void +bmp_iter_next (bitmap_iterator *bi, unsigned 
*bit_no) +{ + bi->bits >>= 1; + *bit_no += 1; +} + +static __inline__ __attribute__((always_inline)) unsigned char +bmp_iter_set (bitmap_iterator *bi, unsigned *bit_no) +{ + if (bi->bits) + { + while (!(bi->bits & 1)) + { + bi->bits >>= 1; + *bit_no += 1; + } + return 1; + } + + *bit_no = ((*bit_no + 64u - 1) / 64u * 64u); + bi->word_no++; + + while (1) + { + while (bi->word_no != (2)) + { + bi->bits = bi->elt1->bits[bi->word_no]; + if (bi->bits) + { + while (!(bi->bits & 1)) + { + bi->bits >>= 1; + *bit_no += 1; + } + return 1; + } + *bit_no += 64u; + bi->word_no++; + } + + bi->elt1 = bi->elt1->next; + if (!bi->elt1) + return 0; + *bit_no = bi->elt1->indx * (128u); + bi->word_no = 0; + } +} + +static void __attribute__((noinline)) +foobar (bitmap_head *live_throughout) +{ + bitmap_iterator rsi; + unsigned int regno; + for (bmp_iter_set_init (&(rsi), (live_throughout), (0), &(regno)); + bmp_iter_set (&(rsi), &(regno)); + bmp_iter_next (&(rsi), &(regno))) + ; +} +int main() +{ + bitmap_element elem = { (void *)0, (void *)0, 0, { 1, 1 } }; + bitmap_head live_throughout = { &elem, &elem, 0, (void *)0 }; + foobar (&live_throughout); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20090207-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20090207-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20090207-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20090207-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,16 @@ +int foo(int i) +{ + int a[32]; + a[1] = 3; + a[0] = 1; + a[i] = 2; + return a[0]; +} +extern void abort (void); +int main() +{ + if (foo (0) != 2 + || foo (1) != 1) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20090219-1.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20090219-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20090219-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20090219-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,29 @@ +/* On ARM, BAR used to get a bogus number in E due to stack + misalignment. */ + +extern void abort (void); +extern void exit (int); + +void +foo (void) +{ + int f = 0; + + void bar (int a, int b, int c, int d, int e) + { + if (e != 0) + { + f = 1; + abort (); + } + } + + bar (0, 0, 0, 0, 0); +} + +int +main (void) +{ + foo (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20090527-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20090527-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20090527-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20090527-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,38 @@ +typedef enum { POSITION_ASIS, POSITION_UNSPECIFIED } unit_position; + +typedef enum { STATUS_UNKNOWN, STATUS_UNSPECIFIED } unit_status; + +typedef struct +{ + unit_position position; + unit_status status; +} unit_flags; + +extern void abort (void); + +void +new_unit (unit_flags * flags) +{ + if (flags->status == STATUS_UNSPECIFIED) + flags->status = STATUS_UNKNOWN; + + if (flags->position == POSITION_UNSPECIFIED) + flags->position = POSITION_ASIS; + + switch (flags->status) + { + case STATUS_UNKNOWN: + break; + + default: + abort (); + } +} + +int main() +{ + unit_flags f; + f.status = STATUS_UNSPECIFIED; + new_unit (&f); + return 0; +} Added: 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20090623-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20090623-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20090623-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20090623-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,18 @@ +int * __restrict__ x; + +int foo (int y) +{ + *x = y; + return *x; +} + +extern void abort (void); + +int main() +{ + int i = 0; + x = &i; + if (foo(1) != 1) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20090711-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20090711-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20090711-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20090711-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,20 @@ +/* Used to be miscompiled at -O0 due to incorrect choice of sign extension + vs. zero extension. __attribute__ ((noinline)) added to try to make it + fail at higher optimization levels too. 
*/ + +extern void abort (void); + +long long __attribute__ ((noinline)) +div (long long val) +{ + return val / 32768; +} + +int main (void) +{ + long long d1 = -990000000; + long long d2 = div(d1); + if (d2 != -30212) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20090814-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20090814-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20090814-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20090814-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,23 @@ +int __attribute__((noinline)) +bar (int *a) +{ + return *a; +} +int i; +int __attribute__((noinline)) +foo (int (*a)[2]) +{ + return bar (&(*a)[i]); +} + +extern void abort (void); +int a[2]; +int main() +{ + a[0] = -1; + a[1] = 42; + i = 1; + if (foo (&a) != 42) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20091229-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20091229-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20091229-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20091229-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,2 @@ +long long foo(long long v) { return v / -0x080000000LL; } +int main(int argc, char **argv) { if (foo(0x080000000LL) != -1) abort(); exit (0); } Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20100209-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20100209-1.c?rev=374156&view=auto 
============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20100209-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20100209-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,12 @@ +int bar(int foo) +{ + return (int)(((unsigned long long)(long long)foo) / 8); +} +extern void abort (void); +int main() +{ + if (sizeof (long long) > sizeof (int) + && bar(-1) != -1) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20100316-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20100316-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20100316-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20100316-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,24 @@ +struct Foo { + int i; + unsigned precision : 10; + unsigned blah : 3; +} f; + +int __attribute__((noinline,noclone)) +foo (struct Foo *p) +{ + struct Foo *q = p; + return (*q).precision; +} + +extern void abort (void); + +int main() +{ + f.i = -1; + f.precision = 0; + f.blah = -1; + if (foo (&f) != 0) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20100416-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20100416-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20100416-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20100416-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,40 @@ +void abort(void); + +int +movegt(int x, int y, long long a) +{ + int i; + int ret = 0; + for (i = 0; i < y; i++) + { + if (a >= 
(long long) 0xf000000000000000LL) + ret = x; + else + ret = y; + } + return ret; +} + +struct test +{ + long long val; + int ret; +} tests[] = { + { 0xf000000000000000LL, -1 }, + { 0xefffffffffffffffLL, 1 }, + { 0xf000000000000001LL, -1 }, + { 0x0000000000000000LL, -1 }, + { 0x8000000000000000LL, 1 }, +}; + +int +main() +{ + int i; + for (i = 0; i < sizeof (tests) / sizeof (tests[0]); i++) + { + if (movegt (-1, 1, tests[i].val) != tests[i].ret) + abort (); + } + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20100430-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20100430-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20100430-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20100430-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,51 @@ +/* This used to generate unaligned accesses at -O2 because of IVOPTS. 
*/ + +struct packed_struct +{ + struct packed_struct1 + { + unsigned char cc11; + unsigned char cc12; + } __attribute__ ((packed)) pst1; + struct packed_struct2 + { + unsigned char cc21; + unsigned char cc22; + unsigned short ss[104]; + unsigned char cc23[13]; + } __attribute__ ((packed)) pst2[4]; +} __attribute__ ((packed)); + +typedef struct +{ + int ii; + struct packed_struct buf; +} info_t; + +static unsigned short g; + +static void __attribute__((noinline)) +dummy (unsigned short s) +{ + g = s; +} + +static int +foo (info_t *info) +{ + int i, j; + + for (i = 0; i < info->buf.pst1.cc11; i++) + for (j = 0; j < info->buf.pst2[i].cc22; j++) + dummy (info->buf.pst2[i].ss[j]); + + return 0; +} + +int main(void) +{ + info_t info; + info.buf.pst1.cc11 = 2; + info.buf.pst2[0].cc22 = info.buf.pst2[1].cc22 = 8; + return foo (&info); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20100708-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20100708-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20100708-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20100708-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,29 @@ +/* PR middle-end/44843 */ +/* Verify that we don't use the alignment of struct S for inner accesses. 
*/ + +struct S +{ + double for_alignment; + struct { int x, y, z; } a[16]; +}; + +void f(struct S *s) __attribute__((noinline)); + +void f(struct S *s) +{ + unsigned int i; + + for (i = 0; i < 16; ++i) + { + s->a[i].x = 0; + s->a[i].y = 0; + s->a[i].z = 0; + } +} + +int main (void) +{ + struct S s; + f (&s); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20100805-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20100805-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20100805-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20100805-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,15 @@ +unsigned int foo (unsigned int a, unsigned int b) +{ + unsigned i; + a = a & 1; + for (i = 0; i < b; ++i) + a = a << 1 | a >> (sizeof (unsigned int) * 8 - 1); + return a; +} +extern void abort (void); +int main() +{ + if (foo (1, sizeof (unsigned int) * 8 + 1) != 2) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20100827-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20100827-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20100827-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20100827-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,23 @@ +extern void abort (void); +int __attribute__((noinline,noclone)) +foo (char *p) +{ + int h = 0; + do + { + if (*p == '\0') + break; + ++h; + if (p == 0) + abort (); + ++p; + } + while (1); + return h; +} +int main() +{ + if (foo("a") != 1) + abort (); + return 0; +} Added: 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20101011-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20101011-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20101011-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20101011-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,145 @@ +/* { dg-options "-fnon-call-exceptions" } */ +/* With -fnon-call-exceptions 0 / 0 should not be eliminated. */ +/* { dg-additional-options "-DSIGNAL_SUPPRESS" { target { ! signal } } } */ + +#ifdef SIGNAL_SUPPRESS +# define DO_TEST 0 +#elif defined (__powerpc__) || defined (__PPC__) || defined (__ppc__) || defined (__POWERPC__) || defined (__ppc) + /* On PPC division by zero does not trap. */ +# define DO_TEST 0 +#elif defined (__riscv) + /* On RISC-V division by zero does not trap. */ +# define DO_TEST 0 +#elif defined (__SPU__) + /* On SPU division by zero does not trap. */ +# define DO_TEST 0 +#elif defined (__sh__) + /* On SH division by zero does not trap. */ +# define DO_TEST 0 +#elif defined (__v850__) + /* On V850 division by zero does not trap. */ +# define DO_TEST 0 +#elif defined (__MSP430__) + /* On MSP430 division by zero does not trap. */ +# define DO_TEST 0 +#elif defined (__RL78__) + /* On RL78 division by zero does not trap. */ +# define DO_TEST 0 +#elif defined (__RX__) + /* On RX division by zero does not trap. */ +# define DO_TEST 0 +#elif defined (__aarch64__) + /* On AArch64 integer division by zero does not trap. */ +# define DO_TEST 0 +#elif defined (__TMS320C6X__) + /* On TI C6X division by zero does not trap. */ +# define DO_TEST 0 +#elif defined (__VISIUM__) + /* On Visium division by zero does not trap. 
*/ +# define DO_TEST 0 +#elif defined (__mips__) && !defined(__linux__) + /* MIPS divisions do trap by default, but libgloss targets do not + intercept the trap and raise a SIGFPE. The same is probably + true of other bare-metal environments, so restrict the test to + systems that use the Linux kernel. */ +# define DO_TEST 0 +#elif defined (__mips16) && defined(__linux__) + /* Not all Linux kernels deal correctly the breakpoints generated by + MIPS16 divisions by zero. They show up as a SIGTRAP instead. */ +# define DO_TEST 0 +#elif defined (__MICROBLAZE__) +/* We cannot rely on division by zero generating a trap. */ +# define DO_TEST 0 +#elif defined (__epiphany__) + /* Epiphany does not have hardware division, and the software implementation + has truly undefined behavior for division by 0. */ +# define DO_TEST 0 +#elif defined (__m68k__) && !defined(__linux__) + /* Attempting to trap division-by-zero in this way isn't likely to work on + bare-metal m68k systems. */ +# define DO_TEST 0 +#elif defined (__CRIS__) + /* No SIGFPE for CRIS integer division. */ +# define DO_TEST 0 +#elif defined (__MMIX__) +/* By default we emit a sequence with DIVU, which "never signals an + exceptional condition, even when dividing by zero". */ +# define DO_TEST 0 +#elif defined (__arc__) + /* No SIGFPE for ARC integer division. */ +# define DO_TEST 0 +#elif defined (__arm__) && defined (__ARM_EABI__) +# ifdef __ARM_ARCH_EXT_IDIV__ + /* Hardware division instructions may not trap, and handle trapping + differently anyway. Skip the test if we have those instructions. */ +# define DO_TEST 0 +# else +# include <signal.h> + /* ARM division-by-zero behavior is to call a helper function, which + can do several different things, depending on requirements. Emulate + the behavior of other targets here by raising SIGFPE. 
*/ +int __attribute__((used)) +__aeabi_idiv0 (int return_value) +{ + raise (SIGFPE); + return return_value; +} +# define DO_TEST 1 +# endif +#elif defined (__nios2__) + /* Nios II requires both hardware support and user configuration to + raise an exception on divide by zero. */ +# define DO_TEST 0 +#elif defined (__nvptx__) +/* There isn't even a signal function. */ +# define DO_TEST 0 +#elif defined (__csky__) + /* This presently doesn't raise SIGFPE even on csky-linux-gnu, much + less bare metal. See the implementation of __divsi3 in libgcc. */ +# define DO_TEST 0 +#elif defined (__moxie__) + /* Not all moxie configurations may raise exceptions. */ +# define DO_TEST 0 +#elif defined (__or1k__) + /* On OpenRISC division by zero does not trap. */ +# define DO_TEST 0 +#elif defined (__pru__) +/* There isn't even a signal function. */ +# define DO_TEST 0 +#else +# define DO_TEST 1 +#endif + +extern void abort (void); +extern void exit (int); + +#if DO_TEST + +#include <signal.h> + +void +sigfpe (int signum __attribute__ ((unused))) +{ + exit (0); +} + +#endif + +/* When optimizing, the compiler is smart enough to constant fold the + static unset variables i and j to produce 0 / 0, but it can't + eliminate the assignment to the global k. 
*/ +static int i; +static int j; +int k __attribute__ ((used)); + +int +main () +{ +#if DO_TEST + signal (SIGFPE, sigfpe); + k = i / j; + abort (); +#else + exit (0); +#endif +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20101013-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20101013-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20101013-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20101013-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,36 @@ +/* PR rtl-optimization/45912 */ + +extern void abort (void); + +static void* __attribute__((noinline,noclone)) +get_addr_base_and_unit_offset (void *base, long long *i) +{ + *i = 0; + return base; +} + +static void* __attribute__((noinline,noclone)) +build_int_cst (void *base, long long offset) +{ + if (offset != 4) + abort (); + + return base; +} + +static void* __attribute__((noinline,noclone)) +build_ref_for_offset (void *base, long long offset) +{ + long long base_offset; + base = get_addr_base_and_unit_offset (base, &base_offset); + return build_int_cst (base, base_offset + offset / 8); +} + +int +main (void) +{ + void *ret = build_ref_for_offset ((void *)0, 32); + if (ret != (void *)0) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20101025-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20101025-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20101025-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20101025-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,30 @@ +static int g_7; +static int *volatile g_6 = &g_7; +int 
g_3; + +static int f1 (int *p_58) +{ + return *p_58; +} + +void f2 (int i) __attribute__ ((noinline)); +void f2 (int i) +{ + g_3 = i; +} + +int f3 (void) __attribute__ ((noinline)); +int f3 (void) +{ + *g_6 = 1; + f2 (f1 (&g_7)); + return 0; +} + +int main () +{ + f3 (); + if (g_3 != 1) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20110418-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20110418-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20110418-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20110418-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,28 @@ +typedef unsigned long long uint64_t; +void f(uint64_t *a, uint64_t aa) __attribute__((noinline)); +void f(uint64_t *a, uint64_t aa) +{ + uint64_t new_value = aa; + uint64_t old_value = *a; + int bit_size = 32; + uint64_t mask = (uint64_t)(unsigned)(-1); + uint64_t tmp = old_value & mask; + new_value &= mask; + /* On overflow we need to add 1 in the upper bits */ + if (tmp > new_value) + new_value += 1ull< */ + +typedef __SIZE_TYPE__ size_t; + +extern void *memcpy (void *__restrict __dest, + __const void *__restrict __src, size_t __n) + __attribute__ ((__nothrow__)) __attribute__ ((__nonnull__ (1, 2))); + +extern size_t strlen (__const char *__s) + __attribute__ ((__nothrow__)) __attribute__ ((__pure__)) __attribute__ ((__nonnull__ (1))); + +typedef __INT16_TYPE__ int16_t; +typedef __INT32_TYPE__ int32_t; + +extern void abort (void); + +int a; + +static void __attribute__ ((noinline,noclone)) +do_something (int item) +{ + a = item; +} + +int +pack_unpack (char *s, char *p) +{ + char *send, *pend; + char type; + int integer_size; + + send = s + strlen (s); + pend = p + strlen (p); + + while (p < pend) + { + type = *p++; + + switch (type) + { + case 
's': + integer_size = 2; + goto unpack_integer; + + case 'l': + integer_size = 4; + goto unpack_integer; + + unpack_integer: + switch (integer_size) + { + case 2: + { + union + { + int16_t i; + char a[sizeof (int16_t)]; + } + v; + memcpy (v.a, s, sizeof (int16_t)); + s += sizeof (int16_t); + do_something (v.i); + } + break; + + case 4: + { + union + { + int32_t i; + char a[sizeof (int32_t)]; + } + v; + memcpy (v.a, s, sizeof (int32_t)); + s += sizeof (int32_t); + do_something (v.i); + } + break; + } + break; + } + } + return (int) *s; +} + +int +main (void) +{ + int n = pack_unpack ("\200\001\377\376\035\300", "sl"); + if (n != 0) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20111212-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20111212-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20111212-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20111212-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,34 @@ +/* PR tree-optimization/50569 */ +/* Reported by Paul Koning */ +/* Reduced testcase by Mikael Pettersson */ + +struct event { + struct { + unsigned int sec; + } sent __attribute__((packed)); +}; + +void __attribute__((noinline,noclone)) frob_entry(char *buf) +{ + struct event event; + + __builtin_memcpy(&event, buf, sizeof(event)); + if (event.sent.sec < 64) { + event.sent.sec = -1U; + __builtin_memcpy(buf, &event, sizeof(event)); + } +} + +int main(void) +{ + union { + char buf[1 + sizeof(struct event)]; + int align; + } u; + + __builtin_memset(&u, 0, sizeof u); + + frob_entry(&u.buf[1]); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20111227-1.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20111227-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20111227-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20111227-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,30 @@ +/* PR rtl-optimization/51667 */ +/* Testcase by Uros Bizjak */ + +extern void abort (void); + +void __attribute__((noinline,noclone)) +bar (int a) +{ + if (a != -1) + abort (); +} + +void __attribute__((noinline,noclone)) +foo (short *a, int t) +{ + short r = *a; + + if (t) + bar ((unsigned short) r); + else + bar ((signed short) r); +} + +short v = -1; + +int main(void) +{ + foo (&v, 0); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20120105-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20120105-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20120105-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20120105-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,24 @@ +struct __attribute__((packed)) S +{ + int a, b, c; +}; + +static int __attribute__ ((noinline,noclone)) +extract(const char *p) +{ + struct S s; + __builtin_memcpy (&s, p, sizeof(struct S)); + return s.a; +} + +volatile int i; + +int main (void) +{ + char p[sizeof(struct S) + 1]; + + __builtin_memset (p, 0, sizeof(struct S) + 1); + i = extract (p + 1); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20120111-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20120111-1.c?rev=374156&view=auto 
============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20120111-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20120111-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,18 @@ +#include +#include + +uint32_t f0a (uint64_t arg2) __attribute__((noinline)); + +uint32_t +f0a (uint64_t arg) +{ + return ~((unsigned) (arg > -3)); +} + +int main() { + uint32_t r1; + r1 = f0a (12094370573988097329ULL); + if (r1 != ~0U) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20120207-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20120207-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20120207-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20120207-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,27 @@ +/* PR middle-end/51994 */ +/* Testcase by Uros Bizjak */ + +extern char *strcpy (char *, const char *); +extern void abort (void); + +char __attribute__((noinline)) +test (int a) +{ + char buf[16]; + char *output = buf; + + strcpy (&buf[0], "0123456789"); + + output += a; + output -= 1; + + return output[0]; +} + +int main () +{ + if (test (2) != '1') + abort (); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20120427-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20120427-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20120427-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20120427-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,36 @@ +typedef struct sreal +{ 
+ unsigned sig; /* Significant. */ + int exp; /* Exponent. */ +} sreal; + +sreal_compare (sreal *a, sreal *b) +{ + if (a->exp > b->exp) + return 1; + if (a->exp < b->exp) + return -1; + if (a->sig > b->sig) + return 1; + return -(a->sig < b->sig); +} + +sreal a[] = { + { 0, 0 }, + { 1, 0 }, + { 0, 1 }, + { 1, 1 } +}; + +int main() +{ + int i, j; + for (i = 0; i <= 3; i++) { + for (j = 0; j < 3; j++) { + if (i < j && sreal_compare(&a[i], &a[j]) != -1) abort(); + if (i == j && sreal_compare(&a[i], &a[j]) != 0) abort(); + if (i > j && sreal_compare(&a[i], &a[j]) != 1) abort(); + } + } + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20120427-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20120427-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20120427-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20120427-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,38 @@ +typedef struct sreal +{ + unsigned sig; /* Significant. */ + int exp; /* Exponent. 
*/ +} sreal; + +sreal_compare (sreal *a, sreal *b) +{ + if (a->exp > b->exp) + return 1; + if (a->exp < b->exp) + return -1; + if (a->sig > b->sig) + return 1; + if (a->sig < b->sig) + return -1; + return 0; +} + +sreal a[] = { + { 0, 0 }, + { 1, 0 }, + { 0, 1 }, + { 1, 1 } +}; + +int main() +{ + int i, j; + for (i = 0; i <= 3; i++) { + for (j = 0; j < 3; j++) { + if (i < j && sreal_compare(&a[i], &a[j]) != -1) abort(); + if (i == j && sreal_compare(&a[i], &a[j]) != 0) abort(); + if (i > j && sreal_compare(&a[i], &a[j]) != 1) abort(); + } + } + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20120615-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20120615-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20120615-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20120615-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,16 @@ +extern void abort (void); + +void __attribute__((noinline,noclone)) + test1(int i) +{ + if (i == 12) + return; + if (i != 17) + { + if (i == 15) + return; + abort (); + } +} + +int main() { test1 (15); return 0; } Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20120808-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20120808-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20120808-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20120808-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,37 @@ +extern void exit (int); +extern void abort (void); + +volatile int i; +unsigned char *volatile cp; +unsigned char d[32] = { 0 }; + +int +main (void) +{ + unsigned char c[32] = { 0 }; + 
unsigned char *p = d + i; + int j; + for (j = 0; j < 30; j++) + { + int x = 0xff; + int y = *++p; + switch (j) + { + case 1: x ^= 2; break; + case 2: x ^= 4; break; + case 25: x ^= 1; break; + default: break; + } + c[j] = y | x; + cp = p; + } + if (c[0] != 0xff + || c[1] != 0xfd + || c[2] != 0xfb + || c[3] != 0xff + || c[4] != 0xff + || c[25] != 0xfe + || cp != d + 30) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20120817-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20120817-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20120817-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20120817-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,14 @@ +typedef unsigned long long u64; +unsigned long foo = 0; +u64 f() __attribute__((noinline)); + +u64 f() { + return ((u64)40) + ((u64) 24) * (int)(foo - 1); +} + +int main () +{ + if (f () != 16) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20120919-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20120919-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20120919-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20120919-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,38 @@ +/* PR rtl-optimization/54290 */ +/* Testcase by Eric Volk */ +/* { dg-require-effective-target int32plus } */ + +double vd[2] = {1., 0.}; +int vi[2] = {1234567890, 0}; +double *pd = vd; +int *pi = vi; + +extern void abort(void); + +void init (int *n, int *dummy) __attribute__ ((noinline,noclone)); + +void init (int *n, int *dummy) +{ + if(0 
== n) dummy[0] = 0; +} + +int main (void) +{ + int dummy[1532]; + int i = -1, n = 1, s = 0; + init (&n, dummy); + while (i < n) { + if (i == 0) { + if (pd[i] > 0) { + if (pi[i] > 0) { + s += pi[i]; + } + } + pd[i] = pi[i]; + } + ++i; + } + if (s != 1234567890) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20121108-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20121108-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20121108-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20121108-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,50 @@ +char temp[] = "192.168.190.160"; +unsigned result = (((((192u<<8)|168u)<<8)|190u)<<8)|160u; + +int strtoul1(const char *a, char **b, int c) __attribute__((noinline, noclone)); +int strtoul1(const char *a, char **b, int c) +{ + *b = a+3; + if (a == temp) + return 192; + else if (a == temp+4) + return 168; + else if (a == temp+8) + return 190; + else if (a == temp+12) + return 160; + __builtin_abort(); +} + +int string_to_ip(const char *s) __attribute__((noinline,noclone)); +int string_to_ip(const char *s) +{ + int addr; + char *e; + int i; + + if (s == 0) + return(0); + + for (addr=0, i=0; i<4; ++i) { + int val = s ? strtoul1(s, &e, 10) : 0; + addr <<= 8; + addr |= (val & 0xFF); + if (s) { + s = (*e) ? 
e+1 : e; + } + } + + return addr; +} + +int main(void) +{ + int t = string_to_ip (temp); + printf ("%x\n", t); + printf ("%x\n", result); + if (t != result) + __builtin_abort (); + printf ("WORKS.\n"); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20131127-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20131127-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20131127-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20131127-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,34 @@ +/* PR middle-end/59138 */ +/* Testcase by John Regehr */ + +extern void abort (void); + +#pragma pack(1) + +struct S0 { + int f0; + int f1; + int f2; + short f3; +}; + +short a = 1; + +struct S0 b = { 1 }, c, d, e; + +struct S0 fn1() { return c; } + +void fn2 (void) +{ + b = fn1 (); + a = 0; + d = e; +} + +int main (void) +{ + fn2 (); + if (a != 0) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20140212-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20140212-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20140212-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20140212-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,37 @@ +/* PR rtl-optimization/60116 */ +/* Reported by Zhendong Su */ + +extern void abort (void); + +int a, b, c, d = 1, e, f = 1, h, i, k; +char g, j; + +void +fn1 (void) +{ + int l; + e = 0; + c = 0; + for (;;) + { + k = a && b; + j = k * 54; + g = j * 147; + l = ~g + (long long) e && 1; + if (d) + c = l; + else + h = i = l * 9UL; + if (f) + return; + } +} + +int +main 
(void) +{ + fn1 (); + if (c != 1) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20140212-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20140212-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20140212-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20140212-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,21 @@ +/* This used to fail as we would convert f into just return (unsigned int)usVlanID + which is wrong. */ + +int f(unsigned short usVlanID) __attribute__((noinline,noclone)); +int f(unsigned short usVlanID) +{ + unsigned int uiVlanID = 0xffffffff; + int i; + if ((unsigned short)0xffff != usVlanID) + uiVlanID = (unsigned int)usVlanID; + return uiVlanID; +} + +int main(void) +{ + if (f(1) != 1) + __builtin_abort (); + if (f(0xffff) != -1) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20140326-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20140326-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20140326-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20140326-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,10 @@ +int a; + +int +main (void) +{ + char e[2] = { 0, 0 }, f = 0; + if (a == 131072) + f = e[a]; + return f; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20140425-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20140425-1.c?rev=374156&view=auto ============================================================================== --- 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20140425-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20140425-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,23 @@ +/* PR target/60941 */ +/* Reported by Martin Husemann */ + +extern void abort (void); + +static void __attribute__((noinline)) +set (unsigned long *l) +{ + *l = 31; +} + +int main (void) +{ + unsigned long l; + int i; + + set (&l); + i = (int) l; + l = (unsigned long)(2U << i); + if (l != 0) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20140622-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20140622-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20140622-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20140622-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,27 @@ +unsigned p; + +long __attribute__((noinline, noclone)) +test (unsigned a) +{ + return (long)(p + a) - (long)p; +} + +int +main () +{ + p = (unsigned) -2; + if (test (0) != 0) + __builtin_abort (); + if (test (1) != 1) + __builtin_abort (); + if (test (2) != -(long)(unsigned)-2) + __builtin_abort (); + p = (unsigned) -1; + if (test (0) != 0) + __builtin_abort (); + if (test (1) != -(long)(unsigned)-1) + __builtin_abort (); + if (test (2) != -(long)(unsigned)-2) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20140828-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20140828-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20140828-1.c (added) +++ 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20140828-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,22 @@ +short *f(short *a, int b, int *d) __attribute__((noinline,noclone)); + +short *f(short *a, int b, int *d) +{ + short c = *a; + a++; + c = b << c; + *d = c; + return a; +} + +int main(void) +{ + int d; + short a[2]; + a[0] = 0; + if (f(a, 1, &d) != &a[1]) + __builtin_abort (); + if (d != 1) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20141022-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20141022-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20141022-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20141022-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,25 @@ +#define ABORT() do { __builtin_printf("assert.\n"); __builtin_abort (); }while(0) +int f(int a) __attribute__((noinline)); +int f(int a) +{ + int fem_key_src; + int D2930 = a & 4294967291; + fem_key_src = a == 6 ? 0 : 15; + fem_key_src = D2930 != 1 ? 
fem_key_src : 0; + return fem_key_src; +} + +int main(void) +{ + if (f(0) != 15) + ABORT (); + if (f(1) != 0) + ABORT (); + if (f(6) != 0) + ABORT (); + if (f(5) != 0) + ABORT (); + if (f(15) != 15) + ABORT (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20141107-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20141107-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20141107-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20141107-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,27 @@ +#define bool _Bool + +bool f(int a, bool c) __attribute__((noinline)); +bool f(int a, bool c) +{ + if (!a) + c = !c; + return c; +} + +void checkf(int a, bool b) +{ + bool c = f(a, b); + char d; + __builtin_memcpy (&d, &c, 1); + if ( d != (a==0)^b) + __builtin_abort(); +} + +int main(void) +{ + checkf(0, 0); + checkf(0, 1); + checkf(1, 1); + checkf(1, 0); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20141125-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20141125-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20141125-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20141125-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,16 @@ +int f(long long a) __attribute__((noinline,noclone)); +int f(long long a) +{ + if (a & 0x3ffffffffffffffull) + return 1; + return 1024; +} + +int main(void) +{ + if(f(0x48375d8000000000ull) != 1) + __builtin_abort (); + if (f(0xfc00000000000000ull) != 1024) + __builtin_abort (); + return 0; +} Added: 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20150611-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20150611-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20150611-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20150611-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,18 @@ +int a, c, d; +short b; + +int +main () +{ + int e[1]; + for (; b < 2; b++) + { + a = 0; + if (b == 28378) + a = e[b]; + if (!(d || b)) + for (; c;) + ; + } + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20170111-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20170111-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20170111-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20170111-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,33 @@ +/* PR rtl-optimization/79032 */ +/* Reported by Daniel Cederman */ + +extern void abort (void); + +struct S { + short a; + long long b; + short c; + char d; + unsigned short e; + long *f; +}; + +static long foo (struct S *s) __attribute__((noclone, noinline)); + +static long foo (struct S *s) +{ + long a = 1; + a /= s->e; + s->f[a]--; + return a; +} + +int main (void) +{ + long val = 1; + struct S s = { 0, 0, 0, 0, 2, &val }; + val = foo (&s); + if (val != 0) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20170401-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20170401-1.c?rev=374156&view=auto ============================================================================== --- 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20170401-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20170401-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,53 @@ +/* PR45070 */ +extern void abort(void); + +struct packed_ushort { + unsigned short ucs; +} __attribute__((packed)); + +struct source { + int pos, length; +}; + +static int flag; + +static void __attribute__((noinline)) fetch(struct source *p) +{ + p->length = 128; +} + +static struct packed_ushort __attribute__((noinline)) next(struct source *p) +{ + struct packed_ushort rv; + + if (p->pos >= p->length) { + if (flag) { + flag = 0; + fetch(p); + return next(p); + } + flag = 1; + rv.ucs = 0xffff; + return rv; + } + rv.ucs = 0; + return rv; +} + +int main(void) +{ + struct source s; + int i; + + s.pos = 0; + s.length = 0; + flag = 0; + + for (i = 0; i < 16; i++) { + struct packed_ushort rv = next(&s); + if ((i == 0 && rv.ucs != 0xffff) + || (i > 0 && rv.ucs != 0)) + abort(); + } + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20170401-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20170401-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20170401-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20170401-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,29 @@ +void adjust_xy (short *, short *); + +struct adjust_template +{ + short kx_x; + short kx_y; +}; + +static struct adjust_template adjust = {1, 1}; + +main () +{ + short x = 1, y = 1; + + adjust_xy (&x, &y); + + if (x != 2) + abort (); + + exit (0); +} + +void +adjust_xy (x, y) + short *x; + short *y; +{ + *x = adjust.kx_x * *x + adjust.kx_y * *y; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20170419-1.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20170419-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20170419-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20170419-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,23 @@ +/* PR tree-optimization/80426 */ +/* Testcase by */ + +#define INT_MAX 0x7fffffff +#define INT_MIN (-INT_MAX-1) + +int x; + +int main (void) +{ + volatile int a = 0; + volatile int b = -INT_MAX; + int j; + + for(j = 0; j < 18; j += 1) { + x = ( (a == 0) != (b - (int)(INT_MIN) ) ); + } + + if (x != 0) + __builtin_abort (); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20171008-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20171008-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20171008-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20171008-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,38 @@ +struct S { char c1, c2, c3, c4; } __attribute__((aligned(4))); + +static char bar (char **p) __attribute__((noclone, noinline)); +static struct S foo (void) __attribute__((noclone, noinline)); + +int i; + +static char +bar (char **p) +{ + i = 1; + return 0; +} + +static struct S +foo (void) +{ + struct S ret; + char r, s, c1, c2; + char *p = &r; + + s = bar (&p); + if (s) + c2 = *p; + c1 = 0; + + ret.c1 = c1; + ret.c2 = c2; + return ret; +} + +int main (void) +{ + struct S s = foo (); + if (s.c1 != 0) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20180112-1.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20180112-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20180112-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20180112-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,32 @@ +/* PR rtl-optimization/83565 */ +/* Testcase by Sergei Trofimovich */ + +extern void abort (void); + +typedef __UINT32_TYPE__ u32; + +u32 bug (u32 * result) __attribute__((noinline)); +u32 bug (u32 * result) +{ + volatile u32 ss = 0xFFFFffff; + volatile u32 d = 0xEEEEeeee; + u32 tt = d & 0x00800000; + u32 r = tt << 8; + + r = (r >> 31) | (r << 1); + + u32 u = r^ss; + u32 off = u >> 1; + + *result = tt; + return off; +} + +int main(void) +{ + u32 l; + u32 off = bug(&l); + if (off != 0x7fffffff) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20180131-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20180131-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20180131-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20180131-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,28 @@ +/* PR rtl-optimization/84071 */ +/* Reported by Wilco */ + +extern void abort (void); + +typedef union +{ + signed short ss; + unsigned short us; + int x; +} U; + +int f(int x, int y, int z, int a, U u) __attribute__((noclone, noinline)); + +int f(int x, int y, int z, int a, U u) +{ + return (u.ss <= 0) + u.us; +} + +int main (void) +{ + U u = { .ss = -1 }; + + if (f (0, 0, 0, 0, u) != (1 << sizeof (short) * 8)) + abort (); + + return 0; +} Added: 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20180226-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20180226-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20180226-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20180226-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,31 @@ +/* PR rtl-optimization/83496 */ +/* Reported by Hauke Mehrtens */ + +extern void abort (void); + +typedef unsigned long mp_digit; + +typedef struct { int used, alloc, sign; mp_digit *dp; } mp_int; + +int mytest(mp_int *a, mp_digit b) __attribute__((noclone, noinline)); + +int mytest(mp_int *a, mp_digit b) +{ + if (a->sign == 1) + return -1; + if (a->used > 1) + return 1; + if (a->dp[0] > b) + return 1; + if (a->dp[0] < b) + return -1; + return 0; +} + +int main (void) +{ + mp_int i = { 2, 0, -1 }; + if (mytest (&i, 0) != 1) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20180921-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20180921-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20180921-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20180921-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,161 @@ +/* PR tree-optimization/86990 */ +/* Testcase by Zhendong Su */ + +const char *ss; + +int __attribute__((noipa)) dummy (const char *s, ...) 
+{ + ss = s; +} + +int i[6]; +static int j, v, e, f, h = 5, k, l, n, o, p, q, r, s, u, w, x, y, z, aa, ab, ac, + ad, ae, af, ag = 8, ah, ai, aj, ak, al; +char c; +struct a { + unsigned b; + int c : 9; + int d; +} static g = {9, 5}; +static short m[1], t = 95, am; +int an, ao, ap; +void aq(int ar) { + j = j & 5 ^ i[j ^ v & 5]; + j = j & 5 ^ i[(j ^ v) & 5]; + j = j & 4095 ^ (j ^ v) & 5; +} +void as(int ar) { + if (n) + s = 0; +} +static unsigned at() { + int au[] = {2080555007, 0}; + for (; al; al--) { + if (r) + --x; + if (g.d) + l++; + dummy("", j); + if (u) + ae = n = au[al]; + } + r = 0; + return 0; +} +int aw(int ar) { + int ax[] = {9, 5, 5, 9, 5}, ay = 3; + struct a az = {1, 3}; +av: + an = (as((at(), ax)[2]), ax[4]); + { + int ba[] = {5, 5, 9, 8, 1, 0, 5, 5, 9, 8, 1, 0, + 5, 5, 9, 8, 1, 0, 5, 5, 9, 8, 1}; + int a[] = {8, 2, 8, 2, 8, 2, 8}; + int b[] = {1027239, 8, 1, 7, 9, 2, 9, 4, 4, 2, 8, 1, 0, 4, 4, 2, + 4, 4, 2, 9, 2, 9, 8, 1, 7, 9, 2, 9, 4, 4, 2}; + if (z) { + struct a bc; + bb: + for (; e; e++) + for (; q;) + return ax[e]; + if (bc.c < g.d <= a[7]) + aa--; + } + { + struct a bd = {5}; + int d[20] = {1, 9, 7, 7, 8, 4, 4, 4, 4, 8, 1, 9, 7, 7, 8, 4, 4, 4, 4}; + c = h | r % g.c ^ x; + dummy("", g); + am -= t | x; + if (h) + while (1) { + if (a[o]) { + struct a be; + if (ar) { + struct a bf = {908, 5, 3}; + int bg[3], bh = k, bj = ag | ae, bk = aj + 3, bl = u << e; + if (f) + if (ac) + ak = w; + ag = -(ag & t); + af = ag ^ af; + if (8 < af) + break; + if (bj) + goto bi; + if (s) + dummy("", 6); + be.d = k; + w = f - bh; + dummy("", be); + if (w) + goto bb; + ao = r - aa && g.b; + if (y) + k++; + goto av; + bi: + if (aa) + continue; + if (f) + if (k) + dummy("", g); + aj = ac + k ^ g.c; + g.c = bk; + ah = 0; + for (; ah < 3; ah++) + if (s) + bg[ah] = 8; + if (!ay) + dummy("", ai); + u = bl; + g = bf; + } else + for (;; o += a[ap]) + ; + int bm[] = {0}; + for (; p; p++) + c = ad; + ad = l; + if (bd.c) { + dummy(" "); + goto bi; + } + } + int bn[] = {5, 2, 2, 5, 
2, 2, 5, 2, 2, 5, 2, 2, 5, 2, 2, 5, + 2, 2, 5, 2, 2, 5, 2, 2, 5, 2, 2, 5, 2, 2, 5, 2, + 2, 5, 2, 2, 5, 2, 2, 5, 2, 2, 5, 2, 2, 5, 2}; + struct a a[] = {3440025416, 2, 8, 4, 2, 8, 4, 4, 2, 8, 4}; + struct a b = {3075920}; + if (f) { + aq(m[am + e]); + dummy("", j); + dummy("", e); + ab--; + } + if (ax[4]) { + if (l) + goto av; + ++f; + } else + ay = az.c && a; + for (; ac; ac++) + m[f] = 0; + } + h = 9; + for (; y; y = 1) + if (f) + goto av; + } + } + return 0; +} + +int main (void) +{ + aw(1); + if (g.c!= 5) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20181120-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20181120-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20181120-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20181120-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,28 @@ +/* PR rtl-optimization/85925 */ +/* { dg-require-effective-target int32plus } */ +/* Testcase by */ + +int a, c, d; +volatile int b; +int *e = &d; + +union U1 { + unsigned f0; + unsigned f1 : 15; +}; +volatile union U1 u = { 0x4030201 }; + +int main (void) +{ + for (c = 0; c <= 1; c++) { + union U1 f = {0x4030201}; + if (c == 1) + b; + *e = f.f1; + } + + if (d != u.f1) + __builtin_abort (); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20190228-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20190228-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20190228-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20190228-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,16 @@ 
+/* PR tree-optimization/89536 */ +/* Testcase by Zhendong Su */ + +int a = 1; + +int main (void) +{ + a = ~(a && 1); + if (a < -1) + a = ~a; + + if (!a) + __builtin_abort (); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20190820-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20190820-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20190820-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/20190820-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,111 @@ +/* PR rtl-optimization/91347 */ +/* Reported by John David Anglin */ + +typedef unsigned short __u16; +typedef __signed__ int __s32; +typedef unsigned int __u32; +typedef __signed__ long long __s64; +typedef unsigned long long __u64; +typedef __u16 u16; +typedef __s32 s32; +typedef __u32 u32; +typedef __u64 u64; +typedef _Bool bool; +typedef s32 int32_t; +typedef u32 uint32_t; +typedef u64 uint64_t; + +char hex_asc_upper[16]; +u16 decpair[100]; + +static __attribute__ ((noipa)) void +put_dec_full4 (char *buf, unsigned r) +{ + unsigned q; + q = (r * 0x147b) >> 19; + *((u16 *)buf) = decpair[r - 100*q]; + buf += 2; + *((u16 *)buf) = decpair[q]; +} + +static __attribute__ ((noipa)) unsigned +put_dec_helper4 (char *buf, unsigned x) +{ + uint32_t q = (x * (uint64_t)0x346DC5D7) >> 43; + put_dec_full4(buf, x - q * 10000); + return q; +} + +static __attribute__ ((noipa)) char * +put_dec (char *buf, unsigned long long n) +{ + uint32_t d3, d2, d1, q, h; + d1 = ((uint32_t)n >> 16); + h = (n >> 32); + d2 = (h ) & 0xffff; + d3 = (h >> 16); + q = 656 * d3 + 7296 * d2 + 5536 * d1 + ((uint32_t)n & 0xffff); + q = put_dec_helper4(buf, q); + q += 7671 * d3 + 9496 * d2 + 6 * d1; + q = put_dec_helper4(buf+4, q); + q += 4749 * d3 + 42 * d2; + q = put_dec_helper4(buf+8, q); + return buf; +} 
+ +struct printf_spec { + unsigned int type:8; + signed int field_width:24; + unsigned int flags:8; + unsigned int base:8; + signed int precision:16; +} __attribute__((__packed__)); + +static __attribute__ ((noipa)) char * +number (char *buf, char *end, unsigned long long num, struct printf_spec spec) +{ + + char tmp[3 * sizeof(num)] __attribute__((__aligned__(2))); + char sign; + char locase; + int need_pfx = ((spec.flags & 64) && spec.base != 10); + int i; + bool is_zero = num == 0LL; + int field_width = spec.field_width; + int precision = spec.precision; + + i = 0; + if (num < spec.base) + tmp[i++] = hex_asc_upper[num] | locase; + else if (spec.base != 10) { + int mask = spec.base - 1; + int shift = 3; + if (spec.base == 16) + shift = 4; + else + __builtin_abort (); + do { + tmp[i++] = (hex_asc_upper[((unsigned char)num) & mask] | locase); + num >>= shift; + } while (num); + } else { + i = put_dec(tmp, num) - tmp; + } + return buf; +} + +static __attribute__ ((noipa)) char * +pointer_string (char *buf, char *end, const void *ptr, struct printf_spec spec) +{ + spec.base = 16; + spec.flags = 0; + return number(buf, end, 100, spec); +} + +int +main (void) +{ + struct printf_spec spec; + char *s = pointer_string (0, 0, 0, spec); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/900409-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/900409-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/900409-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/900409-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,20 @@ +long f1(long a){return a&0xff000000L;} +long f2 (long a){return a&~0xff000000L;} +long f3(long a){return a&0x000000ffL;} +long f4(long a){return a&~0x000000ffL;} +long f5(long a){return a&0x0000ffffL;} +long f6(long a){return 
a&~0x0000ffffL;} + +main () +{ + long a = 0x89ABCDEF; + + if (f1(a)!=0x89000000L|| + f2(a)!=0x00ABCDEFL|| + f3(a)!=0x000000EFL|| + f4(a)!=0x89ABCD00L|| + f5(a)!=0x0000CDEFL|| + f6(a)!=0x89AB0000L) + abort(); + exit(0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920202-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920202-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920202-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920202-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,18 @@ +static int rule_text_needs_stack_pop = 0; +static int input_stack_pos = 1; + +f (void) +{ + rule_text_needs_stack_pop = 1; + + if (input_stack_pos <= 0) + return 1; + else + return 0; +} + +main () +{ + f (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920302-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920302-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920302-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920302-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,40 @@ +/* { dg-require-effective-target label_values } */ + +short optab[5]; +char buf[10]; +execute (ip) + register short *ip; +{ + register void *base = &&x; + char *bp = buf; + static void *tab[] = {&&x, &&y, &&z}; + if (ip == 0) + { + int i; + for (i = 0; i < 3; ++i) + optab[i] = (short)(tab[i] - base); + return; + } +x: *bp++='x'; + goto *(base + *ip++); +y: *bp++='y'; + goto *(base + *ip++); +z: *bp++='z'; + *bp=0; + return; +} + +short p[5]; + +main () +{ + execute ((short *) 0); + p[0] = optab[1]; + p[1] = optab[0]; + p[2] = 
optab[1]; + p[3] = optab[2]; + execute (&p); + if (strcmp (buf, "xyxyz")) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920409-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920409-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920409-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920409-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1 @@ +x(){signed char c=-1;return c<0;}main(){if(x()!=1)abort();exit(0);} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920410-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920410-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920410-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920410-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,3 @@ +/* { dg-require-stack-size "40000 * 4 + 256" } */ + +main(){int d[40000];d[0]=0;exit(0);} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920411-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920411-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920411-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920411-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,24 @@ +long f (w) + char *w; +{ + long k, i, c = 0, x; + char *p = (char*) &x; + for (i = 0; i < 1; i++) + { + for (k = 0; k < sizeof (long); k++) + p[k] = w[k]; + c += x; + } + return c; +} + +main () +{ + int i; + char 
a[sizeof (long)]; + + for (i = sizeof (long); --i >= 0;) a[i] = ' '; + if (f (a) != ~0UL / (unsigned char) ~0 * ' ') + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920415-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920415-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920415-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920415-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,2 @@ +/* { dg-require-effective-target label_values } */ +main(){__label__ l;void*x(){return&&l;}goto*x();abort();return;l:exit(0);} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920428-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920428-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920428-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920428-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,2 @@ +x(const char*s){char a[1];const char*ss=s;a[*s++]|=1;return(int)ss+1==(int)s;} +main(){if(x("")!=1)abort();exit(0);} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920428-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920428-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920428-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920428-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,6 @@ +/* { dg-require-effective-target label_values } */ +/* { dg-require-effective-target 
trampolines } */ + +s(i){if(i>0){__label__ l1;int f(int i){if(i==2)goto l1;return 0;}return f(i);l1:;}return 1;} +x(){return s(0)==1&&s(1)==0&&s(2)==1;} +main(){if(x()!=1)abort();exit(0);} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920429-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920429-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920429-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920429-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,3 @@ +typedef unsigned char t;int i,j; +t*f(t*p){t c;c=*p++;i=((c&2)?1:0);j=(c&7)+1;return p;} +main(){t*p0="ab",*p1;p1=f(p0);if(p0+1!=p1)abort();exit(0);} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920501-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920501-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920501-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920501-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,4 @@ +/* { dg-require-effective-target untyped_assembly } */ +int s[2]; +x(){if(!s[0]){s[1+s[1]]=s[1];return 1;}} +main(){s[0]=s[1]=0;if(x(0)!=1)abort();exit(0);} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920501-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920501-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920501-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920501-2.c Wed Oct 9 
04:01:46 2019 @@ -0,0 +1,114 @@ +unsigned long +gcd_ll (unsigned long long x, unsigned long long y) +{ + for (;;) + { + if (y == 0) + return (unsigned long) x; + x = x % y; + if (x == 0) + return (unsigned long) y; + y = y % x; + } +} + +unsigned long long +powmod_ll (unsigned long long b, unsigned e, unsigned long long m) +{ + unsigned t; + unsigned long long pow; + int i; + + if (e == 0) + return 1; + + /* Find the most significant bit in E. */ + t = e; + for (i = 0; t != 0; i++) + t >>= 1; + + /* The most sign bit in E is handled outside of the loop, by beginning + with B in POW, and decrementing I. */ + pow = b; + i -= 2; + + for (; i >= 0; i--) + { + pow = pow * pow % m; + if ((1 << i) & e) + pow = pow * b % m; + } + + return pow; +} + +unsigned long factab[10]; + +void +facts (t, a_int, x0, p) + unsigned long long t; + int a_int; + int x0; + unsigned p; +{ + unsigned long *xp = factab; + unsigned long long x, y; + unsigned long q = 1; + unsigned long long a = a_int; + int i; + unsigned long d; + int j = 1; + unsigned long tmp; + int jj = 0; + + x = x0; + y = x0; + + for (i = 1; i < 10000; i++) + { + x = powmod_ll (x, p, t) + a; + y = powmod_ll (y, p, t) + a; + y = powmod_ll (y, p, t) + a; + + if (x > y) + tmp = x - y; + else + tmp = y - x; + q = (unsigned long long) q * tmp % t; + + if (i == j) + { + jj += 1; + j += jj; + d = gcd_ll (q, t); + if (d != 1) + { + *xp++ = d; + t /= d; + if (t == 1) + { + return; + *xp = 0; + } + } + } + } +} + +main () +{ + unsigned long long t; + unsigned x0, a; + unsigned p; + + p = 27; + t = (1ULL << p) - 1; + + a = -1; + x0 = 3; + + facts (t, a, x0, p); + if (factab[0] != 7 || factab[1] != 73 || factab[2] != 262657) + abort(); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920501-3.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920501-3.c?rev=374156&view=auto 
============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920501-3.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920501-3.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,30 @@ +/* { dg-require-effective-target label_values } */ + +int tab[9]; +execute(oip, ip) + unsigned short *oip, *ip; +{ + int x = 0; + int *xp = tab; +base: + x++; + if (x == 4) + { + *xp = 0; + return; + } + *xp++ = ip - oip; + goto *(&&base + *ip++); +} + +main() +{ + unsigned short ip[10]; + int i; + for (i = 0; i < 10; i++) + ip[i] = 0; + execute(ip, ip); + if (tab[0] != 0 || tab[1] != 1 || tab[2] != 2 || tab[3] != 0) + abort(); + exit(0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920501-4.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920501-4.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920501-4.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920501-4.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,24 @@ +/* { dg-require-effective-target label_values } */ + +int +x (int i) +{ + static const void *j[] = {&& x, && y, && z}; + + goto *j[i]; + + x: return 2; + y: return 3; + z: return 5; +} + +int +main (void) +{ + if ( x (0) != 2 + || x (1) != 3 + || x (2) != 5) + abort (); + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920501-5.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920501-5.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920501-5.c (added) +++ 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920501-5.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,17 @@ +/* { dg-require-effective-target label_values } */ + +x (int i) +{ + void *j[] = {&&x, &&y, &&z}; + goto *j[i]; + x:return 2; + y:return 3; + z:return 5; + +} +main () +{ + if (x (0) != 2 || x (1) != 3 || x (2) != 5) + abort(); + exit(0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920501-6.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920501-6.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920501-6.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920501-6.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,91 @@ +#include + +/* Convert a decimal string to a long long unsigned. No error check is + performed. */ + +long long unsigned +str2llu (str) + char *str; +{ + long long unsigned acc; + int d; + acc = *str++ - '0'; + for (;;) + { + d = *str++; + if (d == '\0') + break; + d -= '0'; + acc = acc * 10 + d; + } + + return acc; +} + +/* isqrt(t) - computes the square root of t. 
(tege 86-10-27) */ + +long unsigned +sqrtllu (long long unsigned t) +{ + long long unsigned s; + long long unsigned b; + + for (b = 0, s = t; b++, (s >>= 1) != 0; ) + ; + + s = 1LL << (b >> 1); + + if (b & 1) + s += s >> 1; + + do + { + b = t / s; + s = (s + b) >> 1; + } + while (b < s); + + return s; +} + + +int plist (p0, p1, tab) + long long unsigned p0, p1; + long long unsigned *tab; +{ + long long unsigned p; + long unsigned d; + long unsigned s; + long long unsigned *xp = tab; + + for (p = p0; p <= p1; p += 2) + { + s = sqrtllu (p); + + for (d = 3; d <= s; d += 2) + { + long long unsigned q = p % d; + if (q == 0) + goto not_prime; + } + + *xp++ = p; + not_prime:; + } + *xp = 0; + return xp - tab; +} + +main (argc, argv) + int argc; + char *argv[]; +{ + long long tab[10]; + int nprimes; + nprimes = plist (str2llu ("1234111111"), str2llu ("1234111127"), tab); + + if(tab[0]!=1234111117LL||tab[1]!=1234111121LL||tab[2]!=1234111127LL||tab[3]!=0) + abort(); + + exit(0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920501-7.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920501-7.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920501-7.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920501-7.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,31 @@ +/* { dg-require-effective-target label_values } */ +/* { dg-require-effective-target trampolines } */ +/* { dg-add-options stack_size } */ + +#ifdef STACK_SIZE +#define DEPTH ((STACK_SIZE) / 512 + 1) +#else +#define DEPTH 1000 +#endif + +x(a) +{ + __label__ xlab; + void y(a) + { + if (a==0) + goto xlab; + y (a-1); + } + y (a); + xlab:; + return a; +} + +main () +{ + if (x (DEPTH) != DEPTH) + abort (); + + exit (0); +} Added: 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920501-8.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920501-8.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920501-8.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920501-8.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,40 @@ +/* { dg-additional-options "-Wl,-u,_printf_float" { target newlib_nano_io } } */ + +#include +#include + +char buf[50]; +int +va (int a, double b, int c, ...) +{ + va_list ap; + int d, e, f, g, h, i, j, k, l, m, n, o, p; + va_start (ap, c); + + d = va_arg (ap, int); + e = va_arg (ap, int); + f = va_arg (ap, int); + g = va_arg (ap, int); + h = va_arg (ap, int); + i = va_arg (ap, int); + j = va_arg (ap, int); + k = va_arg (ap, int); + l = va_arg (ap, int); + m = va_arg (ap, int); + n = va_arg (ap, int); + o = va_arg (ap, int); + p = va_arg (ap, int); + + sprintf (buf, + "%d,%f,%d,%d,%d,%d,%d,%d,%d,%d,%d,%d,%d,%d,%d,%d", + a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p); + va_end (ap); +} + +main() +{ + va (1, 1.0, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15); + if (strcmp ("1,1.000000,2,3,4,5,6,7,8,9,10,11,12,13,14,15", buf)) + abort(); + exit(0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920501-9.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920501-9.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920501-9.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920501-9.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,27 @@ +#include + +long long proc1(){return 1LL;} +long long proc2(){return 0x12345678LL;} +long long proc3(){return 0xaabbccdd12345678LL;} 
+long long proc4(){return -1LL;} +long long proc5(){return 0xaabbccddLL;} + +print_longlong(x,buf) + long long x; + char *buf; +{ + unsigned long l; + l= (x >> 32) & 0xffffffff; + if (l != 0) + sprintf(buf,"%lx%08.lx",l,((unsigned long)x & 0xffffffff)); + else + sprintf(buf,"%lx",((unsigned long)x & 0xffffffff)); +} + +main(){char buf[100]; +print_longlong(proc1(),buf);if(strcmp("1",buf))abort(); +print_longlong(proc2(),buf);if(strcmp("12345678",buf))abort(); +print_longlong(proc3(),buf);if(strcmp("aabbccdd12345678",buf))abort(); +print_longlong(proc4(),buf);if(strcmp("ffffffffffffffff",buf))abort(); +print_longlong(proc5(),buf);if(strcmp("aabbccdd",buf))abort(); +exit(0);} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920506-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920506-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920506-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920506-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,2 @@ +int l[]={0,1}; +main(){int*p=l;switch(*p++){case 0:exit(0);case 1:break;case 2:break;case 3:case 4:break;}abort();} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920520-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920520-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920520-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920520-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,36 @@ +foo(int *bar) +{ + *bar = 8; +} + +bugger() +{ + int oldDepth, newDepth; + + foo(&oldDepth); + + switch (oldDepth) + { + case 8: + case 500: + newDepth = 8; + break; + + case 5000: 
+ newDepth = 500; + break; + + default: + newDepth = 17; + break; + } + + return newDepth - oldDepth; +} + +main() +{ + if (bugger() != 0) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920603-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920603-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920603-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920603-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,2 @@ +f(got){if(got!=0xffff)abort();} +main(){signed char c=-1;unsigned u=(unsigned short)c;f(u);exit(0);} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920604-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920604-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920604-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920604-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,13 @@ +long long +mod (a, b) + long long a, b; +{ + return a % b; +} + +int +main () +{ + mod (1LL, 2LL); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920612-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920612-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920612-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920612-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,7 @@ +/* { dg-options "-fwrapv" } */ + +extern void abort (void); +extern void exit (int); + 
+int f(j)int j;{return++j>0;} +int main(){if(f((~0U)>>1))abort();exit(0);} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920612-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920612-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920612-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920612-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,17 @@ +/* { dg-require-effective-target trampolines } */ + +main () +{ + int i = 0; + int a (int x) + { + while (x) + i++, x--; + return x; + } + + if (a (2) != 0) + abort (); + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920618-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920618-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920618-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920618-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1 @@ +main(){if(1.17549435e-38F<=1.1)exit(0);abort();} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920625-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920625-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920625-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920625-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,38 @@ +#include + +typedef struct{double x,y;}point; +point pts[]={{1.0,2.0},{3.0,4.0},{5.0,6.0},{7.0,8.0}}; +static int va1(int nargs,...) 
+{ + va_list args; + int i; + point pi; + va_start(args,nargs); + for(i=0;i 1.84467440737096e+19) + abort(); + + if (16777217L != (float)16777217e0) + abort(); + + exit(0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920711-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920711-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920711-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920711-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,7 @@ +/* { dg-options "-fwrapv" } */ + +extern void abort (void); +extern void exit (int); + +int f(long a){return (--a > 0);} +int main(){if(f(0x80000000L)==0)abort();exit(0);} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920721-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920721-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920721-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920721-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,2 @@ +long f(short a,short b){return (long)a/b;} +main(){if(f(-32768,-1)!=32768L)abort();else exit(0);} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920721-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920721-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920721-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920721-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,3 @@ +/* { dg-skip-if "requires 
alloca" { ! alloca } { "-O0" } { "" } } */ +f(){} +main(){int n=2;double x[n];f();exit(0);} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920721-3.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920721-3.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920721-3.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920721-3.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,26 @@ +static inline fu (unsigned short data) +{ + return data; +} +ru(i) +{ + if(fu(i++)!=5)abort(); + if(fu(++i)!=7)abort(); +} +static inline fs (signed short data) +{ + return data; +} +rs(i) +{ + if(fs(i++)!=5)abort(); + if(fs(++i)!=7)abort(); +} + + +main() +{ + ru(5); + rs(5); + exit(0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920721-4.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920721-4.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920721-4.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920721-4.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,58 @@ +/* { dg-require-effective-target label_values } */ + +int try (int num) { + __label__ lab1, lab2, lab3, lab4, lab5, lab6, default_lab; + + void *do_switch (int num) { + switch(num) { + case 1: + return &&lab1; + case 2: + return &&lab2; + case 3: + return &&lab3; + case 4: + return &&lab4; + case 5: + return &&lab5; + case 6: + return &&lab6; + default: + return &&default_lab; + } + } + + goto *do_switch (num); + + lab1: + return 1; + + lab2: + return 2; + + lab3: + return 3; + + lab4: + return 4; + + lab5: + return 5; + + lab6: + return 6; + + default_lab: + return -1; +} + +main() +{ + 
int i; + for (i = 1; i <= 6; i++) + { + if (try (i) != i) + abort(); + } + exit(0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920726-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920726-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920726-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920726-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,63 @@ +#include +#include + +struct spurious +{ + int anumber; +}; + +int first(char *buf, char *fmt, ...) +{ + int pos, number; + va_list args; + int dummy; + char *bp = buf; + + va_start(args, fmt); + for (pos = 0; fmt[pos]; pos++) + if (fmt[pos] == 'i') + { + number = va_arg(args, int); + sprintf(bp, "%d", number); + bp += strlen(bp); + } + else + *bp++ = fmt[pos]; + + va_end(args); + *bp = 0; + return dummy; +} + +struct spurious second(char *buf,char *fmt, ...) 
+{ + int pos, number; + va_list args; + struct spurious dummy; + char *bp = buf; + + va_start(args, fmt); + for (pos = 0; fmt[pos]; pos++) + if (fmt[pos] == 'i') + { + number = va_arg(args, int); + sprintf(bp, "%d", number); + bp += strlen(bp); + } + else + *bp++ = fmt[pos]; + + va_end(args); + *bp = 0; + return dummy; +} + +main() +{ + char buf1[100], buf2[100]; + first(buf1, "i i ", 5, 20); + second(buf2, "i i ", 5, 20); + if (strcmp ("5 20 ", buf1) || strcmp ("5 20 ", buf2)) + abort(); + exit(0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920728-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920728-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920728-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920728-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,24 @@ +typedef struct {int dims[0]; } *A; + +f(unsigned long obj) +{ + unsigned char y = obj >> 24; + y &= ~4; + + if ((y==0)||(y!=251 )) + abort(); + + if(((int)obj&7)!=7)return; + + REST_OF_CODE_JUST_HERE_TO_TRIGGER_THE_BUG: + + { + unsigned char t = obj >> 24; + if (!(t==0)&&(t<=0x03)) + return 0; + return ((A)(obj&0x00FFFFFFL))->dims[1]; + } +} + +long g(){return 0xff000000L;} +main (){int x;f(g());exit(0);} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920730-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920730-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920730-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920730-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,32 @@ +/* 920730-1.c */ +#include +f1() +{ + int b=INT_MIN; + return 
b>=INT_MIN; +} + +f2() +{ + int b=INT_MIN+1; + return b>= (unsigned)(INT_MAX+2); +} + +f3() +{ + int b=INT_MAX; + return b>=INT_MAX; +} + +f4() +{ + int b=-1; + return b>=UINT_MAX; +} + +main () +{ + if((f1()&f2()&f3()&f4())!=1) + abort(); + exit(0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920731-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920731-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920731-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920731-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,2 @@ +f(x){int i;for(i=0;i<8&&(x&1)==0;x>>=1,i++);return i;} +main(){if(f(4)!=2)abort();exit(0);} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920810-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920810-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920810-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920810-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,23 @@ +#include +#include +#include + +typedef struct{void*super;int name;int size;}t; +t*f(t*clas,int size) +{ + t*child=(t*)malloc(size); + memcpy(child,clas,clas->size); + child->super=clas; + child->name=0; + child->size=size; + return child; +} +main() +{ + t foo,*bar; + memset(&foo,37,sizeof(t)); + foo.size=sizeof(t); + bar=f(&foo,sizeof(t)); + if(bar->super!=&foo||bar->name!=0||bar->size!=sizeof(t))abort(); + exit(0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920812-1.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920812-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920812-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920812-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,3 @@ +typedef int t; +f(t y){switch(y){case 1:return 1;}return 0;} +main(){if(f((t)1)!=1)abort();exit(0);} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920829-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920829-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920829-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920829-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,2 @@ +long long c=2863311530LL,c3=2863311530LL*3; +main(){if(c*3!=c3)abort();exit(0);} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920908-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920908-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920908-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920908-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,28 @@ +/* REPRODUCED:RUN:SIGNAL MACHINE:mips OPTIONS: */ + +#include + +typedef struct{int A;}T; + +T f(int x,...) 
+{ +va_list ap; +T X; +va_start(ap,x); +X=va_arg(ap,T); +if(X.A!=10)abort(); +X=va_arg(ap,T); +if(X.A!=20)abort(); +va_end(ap); +return X; +} + +main() +{ +T X,Y; +int i; +X.A=10; +Y.A=20; +f(2,X,Y); +exit(0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920908-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920908-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920908-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920908-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,33 @@ +/* The bit-field below would have a problem if __INT_MAX__ is too + small. */ +#if __INT_MAX__ < 2147483647 +int +main (void) +{ + exit (0); +} +#else +/* +CONF:m68k-sun-sunos4.1.1 +OPTIONS:-O +*/ +struct T +{ +unsigned i:8; +unsigned c:24; +}; +f(struct T t) +{ +struct T s[1]; +s[0]=t; +return(char)s->c; +} +main() +{ +struct T t; +t.i=0xff; +t.c=0xffff11; +if(f(t)!=0x11)abort(); +exit(0); +} +#endif Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920909-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920909-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920909-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920909-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,2 @@ +f(a){switch(a){case 0x402:return a+1;case 0x403:return a+2;case 0x404:return a+3;case 0x405:return a+4;case 0x406:return 1;case 0x407:return 4;}return 0;} +main(){if(f(1))abort();exit(0);} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920922-1.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920922-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920922-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920922-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,14 @@ +unsigned long* +f(p)unsigned long*p; +{ + unsigned long a = (*p++) >> 24; + return p + a; +} + +main () +{ + unsigned long x = 0x80000000UL; + if (f(&x) != &x + 0x81) + abort(); + exit(0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920929-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920929-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920929-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/920929-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,14 @@ +/* { dg-skip-if "requires alloca" { ! 
alloca } { "-O0" } { "" } } */ +/* REPRODUCED:RUN:SIGNAL MACHINE:sparc OPTIONS: */ +f(int n) +{ +int i; +double v[n]; +for(i=0;i=0)abort(); +exit(0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921013-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921013-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921013-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921013-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,20 @@ +f(d,x,y,n) +int*d; +float*x,*y; +int n; +{ + while(n--){*d++=*x++==*y++;} +} + +main() +{ + int r[4]; + float a[]={5,1,3,5}; + float b[]={2,4,3,0}; + int i; + f(r,a,b,4); + for(i=0;i<4;i++) + if((a[i]==b[i])!=r[i]) + abort(); + exit(0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921016-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921016-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921016-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921016-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,10 @@ +main() +{ +int j=1081; +struct +{ +signed int m:11; +}l; +if((l.m=j)==j)abort(); +exit(0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921017-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921017-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921017-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921017-1.c Wed Oct 9 04:01:46 2019 
@@ -0,0 +1,19 @@ +/* { dg-skip-if "requires alloca" { ! alloca } { "-O0" } { "" } } */ +/* { dg-require-effective-target trampolines } */ + +f(n) +{ + int a[n]; + int g(i) + { + return a[i]; + } + a[1]=4711; + return g(1); +} +main() +{ + if(f(2)!=4711)abort(); + + exit(0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921019-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921019-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921019-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921019-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,8 @@ +void *foo[]={(void *)&("X"[0])}; + +main () +{ + if (((char*)foo[0])[0] != 'X') + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921019-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921019-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921019-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921019-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,8 @@ +main() +{ + double x,y=0.5; + x=y/0.2; + if(x!=x) + abort(); + exit(0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921029-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921029-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921029-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921029-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,42 @@ +typedef 
unsigned long long ULL; +ULL back; +ULL hpart, lpart; +ULL +build(long h, long l) +{ + hpart = h; + hpart <<= 32; + lpart = l; + lpart &= 0xFFFFFFFFLL; + back = hpart | lpart; + return back; +} + +main() +{ + if (build(0, 1) != 0x0000000000000001LL) + abort(); + if (build(0, 0) != 0x0000000000000000LL) + abort(); + if (build(0, 0xFFFFFFFF) != 0x00000000FFFFFFFFLL) + abort(); + if (build(0, 0xFFFFFFFE) != 0x00000000FFFFFFFELL) + abort(); + if (build(1, 1) != 0x0000000100000001LL) + abort(); + if (build(1, 0) != 0x0000000100000000LL) + abort(); + if (build(1, 0xFFFFFFFF) != 0x00000001FFFFFFFFLL) + abort(); + if (build(1, 0xFFFFFFFE) != 0x00000001FFFFFFFELL) + abort(); + if (build(0xFFFFFFFF, 1) != 0xFFFFFFFF00000001LL) + abort(); + if (build(0xFFFFFFFF, 0) != 0xFFFFFFFF00000000LL) + abort(); + if (build(0xFFFFFFFF, 0xFFFFFFFF) != 0xFFFFFFFFFFFFFFFFLL) + abort(); + if (build(0xFFFFFFFF, 0xFFFFFFFE) != 0xFFFFFFFFFFFFFFFELL) + abort(); + exit(0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921104-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921104-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921104-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921104-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,8 @@ +main () +{ + unsigned long val = 1; + + if (val > (unsigned long) ~0) + abort(); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921110-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921110-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921110-1.c (added) +++ 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921110-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,8 @@ +extern void abort(void); +typedef void (*frob)(); +frob f[] = {abort}; + +int main(void) +{ + exit(0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921112-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921112-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921112-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921112-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,21 @@ +union u { + struct { int i1, i2; } t; + double d; +} x[2], v; + +f (x, v) + union u *x, v; +{ + *++x = v; +} + +main() +{ + x[1].t.i1 = x[1].t.i2 = 0; + v.t.i1 = 1; + v.t.i2 = 2; + f (x, v); + if (x[1].t.i1 != 1 || x[1].t.i2 != 2) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921113-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921113-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921113-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921113-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,58 @@ +/* { dg-require-stack-size "128 * 128 * 4 + 1024" } */ + +typedef struct { + float wsx; +} struct_list; + +typedef struct_list *list_t; + +typedef struct { + float x, y; +} vector_t; + +w(float x, float y) {} + +f1(float x, float y) +{ + if (x != 0 || y != 0) + abort(); +} +f2(float x, float y) +{ + if (x != 1 || y != 1) + abort(); +} + +gitter(int count, vector_t pos[], list_t list, int *nww, vector_t limit[2], float r) +{ + float d; + int gitt[128][128]; + + f1(limit[0].x, limit[0].y); + 
f2(limit[1].x, limit[1].y); + + *nww = 0; + + d = pos[0].x; + if (d <= 0.) + { + w(d, r); + if (d <= r * 0.5) + { + w(d, r); + list[0].wsx = 1; + } + } +} + +vector_t pos[1] = {{0., 0.}}; +vector_t limit[2] = {{0.,0.},{1.,1.}}; + +main() +{ + int nww; + struct_list list; + + gitter(1, pos, &list, &nww, limit, 1.); + exit(0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921117-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921117-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921117-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921117-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,22 @@ +struct s { + char text[11]; + int flag; +} cell; + +int +check (struct s p) +{ + if (p.flag != 99) + return 1; + return strcmp (p.text, "0123456789"); +} + +main () +{ + cell.flag = 99; + strcpy (cell.text, "0123456789"); + + if (check (cell)) + abort(); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921123-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921123-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921123-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921123-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,13 @@ +f(short *p) +{ + short x = *p; + return (--x < 0); +} + +main() +{ + short x = -10; + if (!f(&x)) + abort(); + exit(0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921123-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921123-2.c?rev=374156&view=auto 
============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921123-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921123-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,24 @@ +typedef struct +{ + unsigned short b0, b1, b2, b3; +} four_quarters; + +four_quarters x; +int a, b; + +void f (four_quarters j) +{ + b = j.b2; + a = j.b3; +} + +main () +{ + four_quarters x; + x.b0 = x.b1 = x.b2 = 0; + x.b3 = 38; + f(x); + if (a != 38) + abort(); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921124-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921124-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921124-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921124-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,19 @@ +f(x, d1, d2, d3) + double d1, d2, d3; +{ + return x; +} + +g(b,s,x,y,i,j) + char *b,*s; + double x,y; +{ + if (x != 1.0 || y != 2.0 || i != 3 || j != 4) + abort(); +} + +main() +{ + g("","", 1.0, 2.0, f(3, 0.0, 0.0, 0.0), f(4, 0.0, 0.0, 0.0)); + exit(0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921202-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921202-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921202-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921202-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,44 @@ +/* { dg-require-effective-target untyped_assembly } */ +/* { dg-add-options stack_size } */ + +#ifndef STACK_SIZE +#define VLEN 2055 +#else +#define VLEN 
((STACK_SIZE/16) - 1) +#endif +main () +{ + long dx[VLEN+1]; + long dy[VLEN+1]; + long s1[VLEN]; + int cyx, cyy; + int i; + long size; + + for (;;) + { + size = VLEN; + mpn_random2 (s1, size); + + for (i = 0; i < 1; i++) + ; + + dy[size] = 0x12345678; + + for (i = 0; i < 1; i++) + cyy = mpn_mul_1 (dy, s1, size); + + if (cyx != cyy || mpn_cmp (dx, dy, size + 1) != 0 || dx[size] != 0x12345678) + { + foo ("", 8, cyy); mpn_print (dy, size); + } + exxit(); + } +} + +foo (){} +mpn_mul_1(){} +mpn_print (){} +mpn_random2(){} +mpn_cmp(){} +exxit(){exit(0);} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921202-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921202-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921202-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921202-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,13 @@ +int +f(long long x) +{ + x >>= 8; + return x & 0xff; +} + +main() +{ + if (f(0x0123456789ABCDEFLL) != 0xCD) + abort(); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921204-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921204-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921204-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921204-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,46 @@ +/* The bit-field below would have a problem if __INT_MAX__ is too + small. 
*/ +#if __INT_MAX__ < 2147483647 +int +main (void) +{ + exit (0); +} +#else +typedef struct { + unsigned b0:1, f1:17, b18:1, b19:1, b20:1, f2:11; +} bf; + +typedef union { + bf b; + unsigned w; +} bu; + +bu +f(bu i) +{ + bu o = i; + + if (o.b.b0) + o.b.b18 = 1, + o.b.b20 = 1; + else + o.b.b18 = 0, + o.b.b20 = 0; + + return o; +} + +main() +{ + bu a; + bu r; + + a.w = 0x4000000; + a.b.b0 = 0; + r = f(a); + if (a.w != r.w) + abort(); + exit(0); +} +#endif Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921207-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921207-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921207-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921207-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,15 @@ +f() +{ + unsigned b = 0; + + if (b > ~0U) + b = ~0U; + + return b; +} +main() +{ + if (f()!=0) + abort(); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921208-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921208-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921208-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921208-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,18 @@ +double +f(double x) +{ + return x*x; +} + +double +Int(double (*f)(double), double a) +{ + return (*f)(a); +} + +main() +{ + if (Int(&f,2.0) != 4.0) + abort(); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921208-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921208-2.c?rev=374156&view=auto 
============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921208-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921208-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,22 @@ +/* { dg-require-effective-target untyped_assembly } */ +/* { dg-require-stack-size "100000 * 4 + 1024" } */ + +g(){} + +f() +{ + int i; + float a[100000]; + + for (i = 0; i < 1; i++) + { + g(1.0, 1.0 + i / 2.0 * 3.0); + g(2.0, 1.0 + i / 2.0 * 3.0); + } +} + +main () +{ + f(); + exit(0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921215-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921215-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921215-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921215-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,25 @@ +/* { dg-require-effective-target trampolines } */ + +main() +{ + void p(void ((*f) (void ()))) + { + void r() + { + foo (); + } + + f(r); + } + + void q(void ((*f)())) + { + f(); + } + + p(q); + + exit(0); +} + +foo(){} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921218-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921218-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921218-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921218-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,11 @@ +f() +{ + return (unsigned char)("\377"[0]); +} + +main() +{ + if (f() != (unsigned char)(0377)) + abort(); + exit (0); +} Added: 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921218-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921218-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921218-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/921218-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,17 @@ +f() +{ + long l2; + unsigned short us; + unsigned long ul; + short s2; + + ul = us = l2 = s2 = -1; + return ul; +} + +main() +{ + if (f()!=(unsigned short)-1) + abort(); + exit(0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930106-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930106-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930106-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930106-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,29 @@ +/* { dg-add-options stack_size } */ + +#if defined (STACK_SIZE) +#define DUMMY_SIZE 9 +#else +#define DUMMY_SIZE 399999 +#endif + +double g() +{ + return 1.0; +} + +f() +{ + char dummy[DUMMY_SIZE]; + double f1, f2, f3; + f1 = g(); + f2 = g(); + f3 = g(); + return f1 + f2 + f3; +} + +main() +{ + if (f() != 3.0) + abort(); + exit(0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930111-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930111-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930111-1.c (added) +++ 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930111-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,22 @@ +main() +{ + if (wwrite((long long) 0) != 123) + abort(); + exit(0); +} + +int +wwrite(long long i) +{ + switch(i) + { + case 3: + case 10: + case 23: + case 28: + case 47: + return 0; + default: + return 123; + } +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930123-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930123-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930123-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930123-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,16 @@ +f(int *x) +{ + *x = 0; +} + +main() +{ + int s, c, x; + char a[] = "c"; + + f(&s); + a[c = 0] = s == 0 ? (x=1, 'a') : (x=2, 'b'); + if (a[c] != 'a') + abort(); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930126-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930126-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930126-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930126-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,20 @@ +struct s { + unsigned long long a:8, b:32; +}; + +struct s +f(struct s x) +{ + x.b = 0xcdef1234; + return x; +} + +main() +{ + static struct s i; + i.a = 12; + i = f(i); + if (i.a != 12 || i.b != 0xcdef1234) + abort(); + exit(0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930208-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930208-1.c?rev=374156&view=auto 
============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930208-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930208-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,23 @@ +typedef union { + long l; + struct { char b3, b2, b1, b0; } c; +} T; + +f (T u) +{ + ++u.c.b0; + ++u.c.b3; + return (u.c.b1 != 2 || u.c.b2 != 2); +} + +main () +{ + T u; + u.c.b1 = 2; + u.c.b2 = 2; + u.c.b0 = ~0; + u.c.b3 = ~0; + if (f (u)) + abort(); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930406-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930406-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930406-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930406-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,25 @@ +/* { dg-add-options stack_size } */ + +f() +{ + int x = 1; +#if defined(STACK_SIZE) + char big[STACK_SIZE/2]; +#else + char big[0x1000]; +#endif + + ({ + __label__ mylabel; + mylabel: + x++; + if (x != 3) + goto mylabel; + }); + exit(0); +} + +main() +{ + f(); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930408-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930408-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930408-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930408-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,27 @@ +typedef enum foo E; +enum foo { e0, e1 }; + +struct { + E eval; +} s; + +p() +{ + abort(); +} + +f() +{ + switch (s.eval) + { + case e0: + p(); + } +} + 
+main() +{ + s.eval = e1; + f(); + exit(0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930429-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930429-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930429-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930429-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,14 @@ +char * +f (char *p) +{ + short x = *p++ << 16; + return p; +} + +main () +{ + char *p = ""; + if (f (p) != p + 1) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930429-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930429-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930429-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930429-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,14 @@ +int +f (b) +{ + return (b >> 1) > 0; +} + +main () +{ + if (!f (9)) + abort (); + if (f (-9)) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930513-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930513-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930513-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930513-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,18 @@ +/* { dg-additional-options "-Wl,-u,_printf_float" { target newlib_nano_io } } */ + +#include +char buf[2]; + +f (fp) + int (*fp)(char *, const char *, ...); +{ 
+ (*fp)(buf, "%.0f", 5.0); +} + +main () +{ + f (&sprintf); + if (buf[0] != '5' || buf[1] != 0) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930513-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930513-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930513-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930513-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,27 @@ +sub3 (i) + const int *i; +{ +} + +eq (a, b) +{ + static int i = 0; + if (a != i) + abort (); + i++; +} + +main () +{ + int i; + + for (i = 0; i < 4; i++) + { + const int j = i; + int k; + sub3 (&j); + k = j; + eq (k, k); + } + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930518-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930518-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930518-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930518-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,24 @@ +int bar = 0; + +f (p) + int *p; +{ + int foo = 2; + + while (foo > bar) + { + foo -= bar; + *p++ = foo; + bar = 1; + } +} + +main () +{ + int tab[2]; + tab[0] = tab[1] = 0; + f (tab); + if (tab[0] != 2 || tab[1] != 1) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930526-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930526-1.c?rev=374156&view=auto ============================================================================== --- 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930526-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930526-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,23 @@ +/* { dg-options "-fgnu89-inline" } */ + +extern void exit (int); + +inline void +f (int x) +{ + int *(p[25]); + int m[25*7]; + int i; + + for (i = 0; i < 25; i++) + p[i] = m + x*i; + + p[1][0] = 0; +} + +int +main () +{ + f (7); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930527-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930527-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930527-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930527-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,11 @@ +f (unsigned char x) +{ + return (0x50 | (x >> 4)) ^ 0xff; +} + +main () +{ + if (f (0) != 0xaf) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930529-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930529-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930529-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930529-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,51 @@ +/* { dg-options { "-fwrapv" } } */ + +extern void abort (void); +extern void exit (int); + +int dd (int x, int d) { return x / d; } + +int +main () +{ + int i; + for (i = -3; i <= 3; i++) + { + if (dd (i, 1) != i / 1) + abort (); + if (dd (i, 2) != i / 2) + abort (); + if (dd (i, 3) != i / 3) + abort (); + if (dd (i, 4) != i / 4) + abort (); + if (dd (i, 5) != i / 5) + abort (); + if (dd (i, 6) != i / 
6) + abort (); + if (dd (i, 7) != i / 7) + abort (); + if (dd (i, 8) != i / 8) + abort (); + } + for (i = ((unsigned) ~0 >> 1) - 3; i <= ((unsigned) ~0 >> 1) + 3; i++) + { + if (dd (i, 1) != i / 1) + abort (); + if (dd (i, 2) != i / 2) + abort (); + if (dd (i, 3) != i / 3) + abort (); + if (dd (i, 4) != i / 4) + abort (); + if (dd (i, 5) != i / 5) + abort (); + if (dd (i, 6) != i / 6) + abort (); + if (dd (i, 7) != i / 7) + abort (); + if (dd (i, 8) != i / 8) + abort (); + } + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930603-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930603-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930603-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930603-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,22 @@ +float fx (x) + float x; +{ + return 1.0 + 3.0 / (2.302585093 * x); +} + +main () +{ + float fx (), inita (), initc (), a, b, c; + a = inita (); + c = initc (); + f (); + b = fx (c) + a; + f (); + if (a != 3.0 || b < 4.3257 || b > 4.3258 || c != 4.0) + abort (); + exit (0); +} + +float inita () { return 3.0; } +float initc () { return 4.0; } +f () {} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930603-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930603-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930603-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930603-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,19 @@ +int w[2][2]; + +f () +{ + int i, j; + + for (i = 0; i < 2; i++) + for (j = 0; j < 2; j++) + if (i == j) + w[i][j] = 1; +} + +main () 
+{ + f (); + if (w[0][0] != 1 || w[1][1] != 1 || w[1][0] != 0 || w[0][1] != 0) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930603-3.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930603-3.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930603-3.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930603-3.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,30 @@ +f (b, c) + unsigned char *b; + int c; +{ + unsigned long v = 0; + switch (c) + { + case 'd': + v = ((unsigned long)b[0] << 8) + b[1]; + v >>= 9; + break; + + case 'k': + v = b[3] >> 4; + break; + + default: + abort (); + } + + return v; +} +main () +{ + char buf[4]; + buf[0] = 170; buf[1] = 5; + if (f (buf, 'd') != 85) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930608-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930608-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930608-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930608-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,11 @@ +double f (double a) {} +double (* const a[]) (double) = {&f}; + +main () +{ + double (*p) (); + p = &f; + if (p != a[0]) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930614-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930614-1.c?rev=374156&view=auto ============================================================================== --- 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930614-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930614-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,19 @@ +f (double *ty) +{ + *ty = -1.0; +} + +main () +{ + double foo[6]; + double tx = 0.0, ty, d; + + f (&ty); + + if (ty < 0) + ty = -ty; + d = (tx > ty) ? tx : ty; + if (ty != d) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930614-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930614-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930614-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930614-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,20 @@ +main () +{ + int i, j, k, l; + float x[8][2][8][2]; + + for (i = 0; i < 8; i++) + for (j = i; j < 8; j++) + for (k = 0; k < 2; k++) + for (l = 0; l < 2; l++) + { + if ((i == j) && (k == l)) + x[i][k][j][l] = 0.8; + else + x[i][k][j][l] = 0.8; + if (x[i][k][j][l] < 0.0) + abort (); + } + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930621-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930621-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930621-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930621-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,29 @@ +/* The bit-field below would have a problem if __INT_MAX__ is too + small. 
*/ +#if __INT_MAX__ < 2147483647 +int +main (void) +{ + exit (0); +} +#else +f () +{ + struct { + int x : 18; + int y : 14; + } foo; + + foo.x = 10; + foo.y = 20; + + return foo.y; +} + +main () +{ + if (f () != 20) + abort (); + exit (0); +} +#endif Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930622-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930622-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930622-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930622-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,22 @@ +int a = 1, b; + +g () { return 0; } +h (x) {} + +f () +{ + if (g () == -1) + return 0; + a = g (); + if (b >= 1) + h (a); + return 0; +} + +main () +{ + f (); + if (a != 0) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930622-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930622-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930622-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930622-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,24 @@ +long double +ll_to_ld (long long n) +{ + return n; +} + +long long +ld_to_ll (long double n) +{ + return n; +} + +main () +{ + long long n; + + if (ll_to_ld (10LL) != 10.0) + abort (); + + if (ld_to_ll (10.0) != 10) + abort (); + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930628-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930628-1.c?rev=374156&view=auto 
============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930628-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930628-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,27 @@ +f (double x[2], double y[2]) +{ + if (x == y) + abort (); +} + +main () +{ + struct { int f[3]; double x[1][2]; } tp[4][2]; + int i, j, ki, kj, mi, mj; + float bdm[4][2][4][2]; + + for (i = 0; i < 4; i++) + for (j = i; j < 4; j++) + for (ki = 0; ki < 2; ki++) + for (kj = 0; kj < 2; kj++) + if ((j == i) && (ki == kj)) + bdm[i][ki][j][kj] = 1000.0; + else + { + for (mi = 0; mi < 1; mi++) + for (mj = 0; mj < 1; mj++) + f (tp[i][ki].x[mi], tp[j][kj].x[mj]); + bdm[i][ki][j][kj] = 1000.0; + } + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930630-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930630-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930630-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930630-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,29 @@ +/* The bit-field below would have a problem if __INT_MAX__ is too + small. 
*/ +#if __INT_MAX__ < 2147483647 +int +main (void) +{ + exit (0); +} +#else +main () +{ + struct + { + signed int bf0:17; + signed int bf1:7; + } bf; + + bf.bf1 = 7; + f (bf.bf1); + exit (0); +} + +f (x) + int x; +{ + if (x != 7) + abort (); +} +#endif Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930702-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930702-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930702-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930702-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,14 @@ +fp (double a, int b) +{ + if (a != 33 || b != 11) + abort (); +} + +main () +{ + int (*f) (double, int) = fp; + + fp (33, 11); + f (33, 11); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930713-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930713-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930713-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930713-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,23 @@ +typedef struct +{ + char x; +} T; + +T +f (s1) + T s1; +{ + T s1a; + s1a.x = 17; + return s1a; +} + +main () +{ + T s1a, s1b; + s1a.x = 13; + s1b = f (s1a); + if (s1a.x != 13 || s1b.x != 17) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930718-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930718-1.c?rev=374156&view=auto ============================================================================== --- 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930718-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930718-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,34 @@ +typedef struct rtx_def +{ + int f1 :1; + int f2 :1; +} *rtx; + +static rtx +f (orig) + register rtx orig; +{ + if (orig->f1 || orig->f2) + return orig; + orig->f2 = 1; + return orig; +} + +void +f2 () +{ + abort (); +} + +main () +{ + struct rtx_def foo; + rtx bar; + + foo.f1 = 1; + foo.f2 = 0; + bar = f (&foo); + if (bar != &foo || bar->f2 != 0) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930719-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930719-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930719-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930719-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,23 @@ +int +f (foo, bar, com) +{ + unsigned align; + if (foo) + return 0; + while (1) + { + switch (bar) + { + case 1: + if (com != 0) + return align; + *(char *) 0 = 0; + } + } +} + +main () +{ + f (0, 1, 1); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930725-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930725-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930725-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930725-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,21 @@ +int v; + +char * +g () +{ + return ""; +} + +char * +f () +{ + return (v == 0 ? 
g () : "abc"); +} + +main () +{ + v = 1; + if (!strcmp (f (), "abc")) + exit (0); + abort(); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930818-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930818-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930818-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930818-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,15 @@ +static double one = 1.0; + +f() +{ + int colinear; + colinear = (one == 0.0); + if (colinear) + abort (); + return colinear; +} +main() +{ + if (f()) abort(); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930916-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930916-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930916-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930916-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,13 @@ +f (n) + unsigned n; +{ + if ((int) n >= 0) + abort (); +} + +main () +{ + unsigned x = ~0; + f (x); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930921-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930921-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930921-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930921-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,15 @@ +f (x) + unsigned x; +{ + return (unsigned) (((unsigned long long) x * 0xAAAAAAAB) >> 
32) >> 1; +} + +main () +{ + unsigned i; + + for (i = 0; i < 10000; i++) + if (f (i) != i / 3) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930929-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930929-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930929-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930929-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,36 @@ +sub1 (i) + int i; +{ + return i - (5 - i); +} + +sub2 (i) + int i; +{ + return i + (5 + i); +} + +sub3 (i) + int i; +{ + return i - (5 + i); +} + +sub4 (i) + int i; +{ + return i + (5 - i); +} + +main() +{ + if (sub1 (20) != 35) + abort (); + if (sub2 (20) != 45) + abort (); + if (sub3 (20) != -5) + abort (); + if (sub4 (20) != 5) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930930-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930930-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930930-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930930-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,36 @@ +__extension__ typedef __PTRDIFF_TYPE__ ptr_t; +ptr_t *wm_TR; +ptr_t *wm_HB; +ptr_t *wm_SPB; + +ptr_t mem[100]; + +f (mr_TR, mr_SPB, mr_HB, reg1, reg2) + ptr_t *mr_TR; + ptr_t *mr_SPB; + ptr_t *mr_HB; + ptr_t *reg1; + ptr_t *reg2; +{ + ptr_t *x = mr_TR; + + for (;;) + { + if (reg1 < reg2) + goto out; + if ((ptr_t *) *reg1 < mr_HB && (ptr_t *) *reg1 >= mr_SPB) + *--mr_TR = *reg1; + reg1--; + } + out: + + if (x != mr_TR) + abort (); +} + +main () +{ + mem[99] = (ptr_t) mem; + f (mem + 100, mem + 
6, mem + 8, mem + 99, mem + 99); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930930-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930930-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930930-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/930930-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,30 @@ +int +test_endianness() +{ + union doubleword + { + double d; + unsigned long u[2]; + } dw; + dw.d = 10; + return dw.u[0] != 0 ? 1 : 0; +} + +int +test_endianness_vol() +{ + union doubleword + { + volatile double d; + volatile long u[2]; + } dw; + dw.d = 10; + return dw.u[0] != 0 ? 1 : 0; +} + +main () +{ + if (test_endianness () != test_endianness_vol ()) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931002-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931002-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931002-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931002-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,29 @@ +/* { dg-require-effective-target trampolines } */ + +f (void (*func) ()) +{ + func (); +} + +main () +{ + void t0 () + { + } + + void t1 () + { + f (t0); + } + + void t2 () + { + t1 (); + } + + t1 (); + t1 (); + t2 (); + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931004-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931004-1.c?rev=374156&view=auto ============================================================================== --- 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931004-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931004-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,29 @@ +struct tiny +{ + int c; +}; + +f (int n, struct tiny x, struct tiny y, struct tiny z, long l) +{ + if (x.c != 10) + abort(); + + if (y.c != 11) + abort(); + + if (z.c != 12) + abort(); + + if (l != 123) + abort (); +} + +main () +{ + struct tiny x[3]; + x[0].c = 10; + x[1].c = 11; + x[2].c = 12; + f (3, x[0], x[1], x[2], (long) 123); + exit(0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931004-10.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931004-10.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931004-10.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931004-10.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,43 @@ +#include <stdarg.h> + +struct tiny +{ + char c; + char d; +}; + +f (int n, ...) 
+{ + struct tiny x; + int i; + + va_list ap; + va_start (ap,n); + for (i = 0; i < n; i++) + { + x = va_arg (ap,struct tiny); + if (x.c != i + 10) + abort(); + if (x.d != i + 20) + abort(); + } + { + long x = va_arg (ap, long); + if (x != 123) + abort(); + } + va_end (ap); +} + +main () +{ + struct tiny x[3]; + x[0].c = 10; + x[1].c = 11; + x[2].c = 12; + x[0].d = 20; + x[1].d = 21; + x[2].d = 22; + f (3, x[0], x[1], x[2], (long) 123); + exit(0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931004-11.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931004-11.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931004-11.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931004-11.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,49 @@ +struct tiny +{ + char c; + char d; + char e; +}; + +f (int n, struct tiny x, struct tiny y, struct tiny z, long l) +{ + if (x.c != 10) + abort(); + if (x.d != 20) + abort(); + if (x.e != 30) + abort(); + + if (y.c != 11) + abort(); + if (y.d != 21) + abort(); + if (y.e != 31) + abort(); + + if (z.c != 12) + abort(); + if (z.d != 22) + abort(); + if (z.e != 32) + abort(); + + if (l != 123) + abort (); +} + +main () +{ + struct tiny x[3]; + x[0].c = 10; + x[1].c = 11; + x[2].c = 12; + x[0].d = 20; + x[1].d = 21; + x[2].d = 22; + x[0].e = 30; + x[1].e = 31; + x[2].e = 32; + f (3, x[0], x[1], x[2], (long) 123); + exit(0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931004-12.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931004-12.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931004-12.c (added) +++ 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931004-12.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,49 @@ +#include <stdarg.h> + +struct tiny +{ + char c; + char d; + char e; +}; + +f (int n, ...) +{ + struct tiny x; + int i; + + va_list ap; + va_start (ap,n); + for (i = 0; i < n; i++) + { + x = va_arg (ap,struct tiny); + if (x.c != i + 10) + abort(); + if (x.d != i + 20) + abort(); + if (x.e != i + 30) + abort(); + } + { + long x = va_arg (ap, long); + if (x != 123) + abort(); + } + va_end (ap); +} + +main () +{ + struct tiny x[3]; + x[0].c = 10; + x[1].c = 11; + x[2].c = 12; + x[0].d = 20; + x[1].d = 21; + x[2].d = 22; + x[0].e = 30; + x[1].e = 31; + x[2].e = 32; + f (3, x[0], x[1], x[2], (long) 123); + exit(0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931004-13.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931004-13.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931004-13.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931004-13.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,59 @@ +struct tiny +{ + char c; + char d; + char e; + char f; +}; + +f (int n, struct tiny x, struct tiny y, struct tiny z, long l) +{ + if (x.c != 10) + abort(); + if (x.d != 20) + abort(); + if (x.e != 30) + abort(); + if (x.f != 40) + abort(); + + if (y.c != 11) + abort(); + if (y.d != 21) + abort(); + if (y.e != 31) + abort(); + if (y.f != 41) + abort(); + + if (z.c != 12) + abort(); + if (z.d != 22) + abort(); + if (z.e != 32) + abort(); + if (z.f != 42) + abort(); + + if (l != 123) + abort (); +} + +main () +{ + struct tiny x[3]; + x[0].c = 10; + x[1].c = 11; + x[2].c = 12; + x[0].d = 20; + x[1].d = 21; + x[2].d = 22; + x[0].e = 30; + x[1].e = 31; + x[2].e = 32; + x[0].f = 40; + x[1].f = 41; + x[2].f = 42; + f (3, x[0], x[1], x[2], 
(long) 123); + 
exit(0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931004-14.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931004-14.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931004-14.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931004-14.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,55 @@ +#include <stdarg.h> + +struct tiny +{ + char c; + char d; + char e; + char f; +}; + +f (int n, ...) +{ + struct tiny x; + int i; + + va_list ap; + va_start (ap,n); + for (i = 0; i < n; i++) + { + x = va_arg (ap,struct tiny); + if (x.c != i + 10) + abort(); + if (x.d != i + 20) + abort(); + if (x.e != i + 30) + abort(); + if (x.f != i + 40) + abort(); + } + { + long x = va_arg (ap, long); + if (x != 123) + abort(); + } + va_end (ap); +} + +main () +{ + struct tiny x[3]; + x[0].c = 10; + x[1].c = 11; + x[2].c = 12; + x[0].d = 20; + x[1].d = 21; + x[2].d = 22; + x[0].e = 30; + x[1].e = 31; + x[2].e = 32; + x[0].f = 40; + x[1].f = 41; + x[2].f = 42; + f (3, x[0], x[1], x[2], (long) 123); + exit(0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931004-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931004-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931004-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931004-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,37 @@ +#include <stdarg.h> + +struct tiny +{ + int c; +}; + +f (int n, ...) 
+{ + struct tiny x; + int i; + + va_list ap; + va_start (ap,n); + for (i = 0; i < n; i++) + { + x = va_arg (ap,struct tiny); + if (x.c != i + 10) + abort(); + } + { + long x = va_arg (ap, long); + if (x != 123) + abort(); + } + va_end (ap); +} + +main () +{ + struct tiny x[3]; + x[0].c = 10; + x[1].c = 11; + x[2].c = 12; + f (3, x[0], x[1], x[2], (long) 123); + exit(0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931004-3.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931004-3.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931004-3.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931004-3.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,29 @@ +struct tiny +{ + short c; +}; + +f (int n, struct tiny x, struct tiny y, struct tiny z, long l) +{ + if (x.c != 10) + abort(); + + if (y.c != 11) + abort(); + + if (z.c != 12) + abort(); + + if (l != 123) + abort (); +} + +main () +{ + struct tiny x[3]; + x[0].c = 10; + x[1].c = 11; + x[2].c = 12; + f (3, x[0], x[1], x[2], (long) 123); + exit(0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931004-4.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931004-4.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931004-4.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931004-4.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,37 @@ +#include <stdarg.h> + +struct tiny +{ + short c; +}; + +f (int n, ...) 
+{ + struct tiny x; + int i; + + va_list ap; + va_start (ap,n); + for (i = 0; i < n; i++) + { + x = va_arg (ap,struct tiny); + if (x.c != i + 10) + abort(); + } + { + long x = va_arg (ap, long); + if (x != 123) + abort(); + } + va_end (ap); +} + +main () +{ + struct tiny x[3]; + x[0].c = 10; + x[1].c = 11; + x[2].c = 12; + f (3, x[0], x[1], x[2], (long) 123); + exit(0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931004-5.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931004-5.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931004-5.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931004-5.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,39 @@ +struct tiny +{ + short c; + short d; +}; + +f (int n, struct tiny x, struct tiny y, struct tiny z, long l) +{ + if (x.c != 10) + abort(); + if (x.d != 20) + abort(); + + if (y.c != 11) + abort(); + if (y.d != 21) + abort(); + + if (z.c != 12) + abort(); + if (z.d != 22) + abort(); + + if (l != 123) + abort (); +} + +main () +{ + struct tiny x[3]; + x[0].c = 10; + x[1].c = 11; + x[2].c = 12; + x[0].d = 20; + x[1].d = 21; + x[2].d = 22; + f (3, x[0], x[1], x[2], (long) 123); + exit(0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931004-6.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931004-6.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931004-6.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931004-6.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,43 @@ +#include <stdarg.h> + +struct tiny +{ + short c; + short d; +}; + +f (int n, ...) 
+{ + struct tiny x; + int i; + + va_list ap; + va_start (ap,n); + for (i = 0; i < n; i++) + { + x = va_arg (ap,struct tiny); + if (x.c != i + 10) + abort(); + if (x.d != i + 20) + abort(); + } + { + long x = va_arg (ap, long); + if (x != 123) + abort(); + } + va_end (ap); +} + +main () +{ + struct tiny x[3]; + x[0].c = 10; + x[1].c = 11; + x[2].c = 12; + x[0].d = 20; + x[1].d = 21; + x[2].d = 22; + f (3, x[0], x[1], x[2], (long) 123); + exit(0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931004-7.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931004-7.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931004-7.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931004-7.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,29 @@ +struct tiny +{ + char c; +}; + +f (int n, struct tiny x, struct tiny y, struct tiny z, long l) +{ + if (x.c != 10) + abort(); + + if (y.c != 11) + abort(); + + if (z.c != 12) + abort(); + + if (l != 123) + abort (); +} + +main () +{ + struct tiny x[3]; + x[0].c = 10; + x[1].c = 11; + x[2].c = 12; + f (3, x[0], x[1], x[2], (long) 123); + exit(0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931004-8.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931004-8.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931004-8.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931004-8.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,37 @@ +#include <stdarg.h> + +struct tiny +{ + char c; +}; + +f (int n, ...) 
+{ + struct tiny x; + int i; + + va_list ap; + va_start (ap,n); + for (i = 0; i < n; i++) + { + x = va_arg (ap,struct tiny); + if (x.c != i + 10) + abort(); + } + { + long x = va_arg (ap, long); + if (x != 123) + abort(); + } + va_end (ap); +} + +main () +{ + struct tiny x[3]; + x[0].c = 10; + x[1].c = 11; + x[2].c = 12; + f (3, x[0], x[1], x[2], (long) 123); + exit(0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931004-9.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931004-9.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931004-9.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931004-9.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,39 @@ +struct tiny +{ + char c; + char d; +}; + +f (int n, struct tiny x, struct tiny y, struct tiny z, long l) +{ + if (x.c != 10) + abort(); + if (x.d != 20) + abort(); + + if (y.c != 11) + abort(); + if (y.d != 21) + abort(); + + if (z.c != 12) + abort(); + if (z.d != 22) + abort(); + + if (l != 123) + abort (); +} + +main () +{ + struct tiny x[3]; + x[0].c = 10; + x[1].c = 11; + x[2].c = 12; + x[0].d = 20; + x[1].d = 21; + x[2].d = 22; + f (3, x[0], x[1], x[2], (long) 123); + exit(0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931005-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931005-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931005-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931005-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,23 @@ +typedef struct +{ + char x; +} T; + +T +f (s1) + T s1; +{ + T s1a; + s1a.x = s1.x; + return s1a; +} + +main () +{ 
+ T s1a, s1b; + s1a.x = 100; + s1b = f (s1a); + if (s1b.x != 100) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931009-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931009-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931009-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931009-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,26 @@ +main () +{ + f (); + exit (0); +} + +static +g (out, size, lo, hi) + int *out, size, lo, hi; +{ + int j; + + for (j = 0; j < size; j++) + out[j] = j * (hi - lo); +} + + +f () +{ + int a[2]; + + g (a, 2, 0, 1); + + if (a[0] != 0 || a[1] != 1) + abort (); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931012-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931012-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931012-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931012-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,13 @@ +f (int b, int c) +{ + if (b != 0 && b != 1 && c != 0) + b = 0; + return b; +} + +main () +{ + if (!f (1, 2)) + abort(); + exit(0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931017-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931017-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931017-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931017-1.c Wed Oct 9 04:01:46 2019 @@ 
-0,0 +1,54 @@ +int v; + +main () +{ + f (); + exit (0); +} + +h1 () +{ + return 0; +} + +h2 (e) + int *e; +{ + if (e != &v) + abort (); + return 0; +} + +g (c) + char *c; +{ + int i; + int b; + + do + { + i = h1 (); + if (i == -1) + return 0; + else if (i == 1) + h1 (); + } + while (i == 1); + + do + b = h2 (&v); + while (i == 5); + + if (i != 2) + return b; + *c = 'a'; + + return 0; +} + + +f () +{ + char c; + g (&c); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931018-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931018-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931018-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931018-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,18 @@ +unsigned int a[0x1000]; +extern const unsigned long v; + +main () +{ + f (v); + f (v); + exit (0); +} + +f (a) + unsigned long a; +{ + if (a != 0xdeadbeefL) + abort(); +} + +const unsigned long v = 0xdeadbeefL; Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931031-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931031-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931031-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931031-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,34 @@ +/* The bit-field below would have a problem if __INT_MAX__ is too + small. 
*/ +#if __INT_MAX__ < 2147483647 +int +main (void) +{ + exit (0); +} +#else +struct foo +{ + unsigned y:1; + unsigned x:32; +}; + +int +f (x) + struct foo x; +{ + int t = x.x; + if (t < 0) + return 1; + return t+1; +} + +main () +{ + struct foo x; + x.x = -1; + if (f (x) == 0) + abort (); + exit (0); +} +#endif Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931102-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931102-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931102-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931102-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,29 @@ +typedef union +{ + struct + { + char h, l; + } b; +} T; + +f (x) + int x; +{ + int num = 0; + T reg; + + reg.b.l = x; + while ((reg.b.l & 1) == 0) + { + num++; + reg.b.l >>= 1; + } + return num; +} + +main () +{ + if (f (2) != 1) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931102-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931102-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931102-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931102-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,30 @@ +typedef union +{ + long align; + struct + { + short h, l; + } b; +} T; + +f (x) + int x; +{ + int num = 0; + T reg; + + reg.b.l = x; + while ((reg.b.l & 1) == 0) + { + num++; + reg.b.l >>= 1; + } + return num; +} + +main () +{ + if (f (2) != 1) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931110-1.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931110-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931110-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931110-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,18 @@ +typedef struct +{ + short f:3, g:3, h:10; +} small; + +struct +{ + int i; + small s[10]; +} x; + +main () +{ + int i; + for (i = 0; i < 10; i++) + x.s[i].f = 0; + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931110-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931110-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931110-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931110-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,12 @@ +main () +{ + static int a[] = {3, 4}; + register int *b; + int c; + + b = a; + c = *b++ % 8; + if (c != 3) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931208-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931208-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931208-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931208-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,14 @@ +f () +{ + unsigned long x, y = 1; + + x = ((y * 8192) - 216) / 16; + return x; +} + +main () +{ + if (f () != 498) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931228-1.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931228-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931228-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/931228-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,15 @@ +f (x) +{ + x &= 010000; + x &= 007777; + x ^= 017777; + x &= 017770; + return x; +} + +main () +{ + if (f (-1) != 017770) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/940115-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/940115-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/940115-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/940115-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,13 @@ +f (cp, end) + char *cp; + char *end; +{ + return (cp < end); +} + +main () +{ + if (! 
f ((char *) 0, (char *) 1)) + abort(); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/940122-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/940122-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/940122-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/940122-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,21 @@ +char *a = 0; +char *b = 0; + +g (x) + int x; +{ + if ((!!a) != (!!b)) + abort (); +} + +f (x) + int x; +{ + g (x * x); +} + +main () +{ + f (100); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/941014-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/941014-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/941014-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/941014-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,15 @@ +int f (int a, int b) { } + +main () +{ + unsigned long addr1; + unsigned long addr2; + + addr1 = (unsigned long) &f; + addr1 += 5; + addr2 = 5 + (unsigned long) &f; + + if (addr1 != addr2) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/941014-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/941014-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/941014-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/941014-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,33 @@ +#include <stdio.h> +#include <stdlib.h> + 
+typedef struct
{ + unsigned short a; + unsigned short b; +} foo_t; + +void a1 (unsigned long offset) {} + +volatile foo_t * +f () +{ + volatile foo_t *foo_p = (volatile foo_t *)malloc (sizeof (foo_t)); + + a1((unsigned long)foo_p-30); + if (foo_p->a & 0xf000) + printf("%d\n", foo_p->a); + foo_p->b = 0x0100; + a1 ((unsigned long)foo_p + 2); + a1 ((unsigned long)foo_p - 30); + return foo_p; +} + +main () +{ + volatile foo_t *foo_p; + + foo_p = f (); + if (foo_p->b != 0x0100) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/941015-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/941015-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/941015-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/941015-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,35 @@ +int +foo1 (value) + long long value; +{ + register const long long constant = 0xc000000080000000LL; + + if (value < constant) + return 1; + else + return 2; +} + +int +foo2 (value) + unsigned long long value; +{ + register const unsigned long long constant = 0xc000000080000000LL; + + if (value < constant) + return 1; + else + return 2; +} + +main () +{ + unsigned long long value = 0xc000000000000001LL; + int x, y; + + x = foo1 (value); + y = foo2 (value); + if (x != y || x != 1) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/941021-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/941021-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/941021-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/941021-1.c Wed Oct 9 04:01:46 
2019 @@ -0,0 +1,20 @@ +double glob_dbl; + +f (pdbl, value) + double *pdbl; + double value; +{ + if (pdbl == 0) + pdbl = &glob_dbl; + + *pdbl = value; +} + +main () +{ + f ((void *) 0, 55.1); + + if (glob_dbl != 55.1) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/941025-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/941025-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/941025-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/941025-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,12 @@ +long f (x, y) + long x,y; +{ + return (x > 1) ? y : (y & 1); +} + +main () +{ + if (f (2L, 0xdecadeL) != 0xdecadeL) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/941031-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/941031-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/941031-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/941031-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,27 @@ +typedef long mpt; + +int +f (mpt us, mpt vs) +{ + long aus; + long avs; + + aus = us >= 0 ? us : -us; + avs = vs >= 0 ? 
vs : -vs; + + if (aus < avs) + { + long t = aus; + aus = avs; + avs = aus; + } + + return avs; +} + +main () +{ + if (f ((mpt) 3, (mpt) 17) != 17) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/941101-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/941101-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/941101-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/941101-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,15 @@ +f () +{ + int var = 7; + + if ((var/7) == 1) + return var/7; + return 0; +} + +main () +{ + if (f () != 1) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/941110-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/941110-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/941110-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/941110-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,16 @@ +f (const int x) +{ + int y = 0; + y = x ? 
y : -y; + { + const int *p = &x; + } + return y; +} + +main () +{ + if (f (0)) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/941202-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/941202-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/941202-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/941202-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,22 @@ +/* { dg-skip-if "requires alloca" { ! alloca } { "-O0" } { "" } } */ +g (x, y) +{ + if (x != 3) + abort (); +} + +static inline +f (int i) +{ + int *tmp; + + tmp = (int *) alloca (sizeof (i)); + *tmp = i; + g (*tmp, 0); +} + +main () +{ + f (3); + exit (0); +}; Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950221-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950221-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950221-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950221-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,57 @@ +/* { dg-add-options stack_size } */ + +struct parsefile +{ + long fd; + char *buf; +}; +struct parsefile basepf; +struct parsefile *parsefile = &basepf; +#ifdef STACK_SIZE +int filler[STACK_SIZE / (2*sizeof(int))]; +#else +int filler[0x3000]; +#endif +int el; + +char * +g1 (a, b) + int a; + int *b; +{ +} + +g2 (a) + long a; +{ + if (a != 0xdeadbeefL) + abort (); + exit (0); +} + +f () +{ + register char *p, *q; + register int i; + register int something; + + if (parsefile->fd == 0L && el) + { + const char *rl_cp; + int len; + rl_cp = g1 (el, &len); + strcpy (p, rl_cp); + } + else + { + alabel: + i = g2 
(parsefile->fd); + } +} + +main () +{ + el = 0; + parsefile->fd = 0xdeadbeefL; + f (); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950322-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950322-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950322-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950322-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,30 @@ +f (unsigned char *a) +{ + int i, j; + int x, y; + + j = a[1]; + i = a[0] - j; + if (i < 0) + { + x = 1; + y = -i; + } + else + { + x = 0; + y = i; + } + return x + y; +} + + +main () +{ + unsigned char a[2]; + a[0] = 8; + a[1] = 9; + if (f (a) != 2) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950426-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950426-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950426-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950426-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,32 @@ + +struct tag { + int m1; + char *m2[5]; +} s1, *p1; + +int i; + +main() +{ + s1.m1 = -1; + p1 = &s1; + + if ( func1( &p1->m1 ) == -1 ) + foo ("ok"); + else + abort (); + + i = 3; + s1.m2[3]= "123"; + + if ( strlen( (p1->m2[i])++ ) == 3 ) + foo ("ok"); + else + abort (); + + exit (0); +} + +func1(int *p) { return(*p); } + +foo (char *s) {} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950426-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950426-2.c?rev=374156&view=auto 
============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950426-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950426-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,13 @@ +main() +{ + long int i = -2147483647L - 1L; /* 0x80000000 */ + char ca = 1; + + if (i >> ca != -1073741824L) + abort (); + + if (i >> i / -2000000000L != -1073741824L) + abort (); + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950503-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950503-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950503-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950503-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,14 @@ +main () +{ + int tmp; + unsigned long long utmp1, utmp2; + + tmp = 16; + + utmp1 = (~((unsigned long long) 0)) >> tmp; + utmp2 = (~((unsigned long long) 0)) >> 16; + + if (utmp1 != utmp2) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950511-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950511-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950511-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950511-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,11 @@ +main () +{ + unsigned long long xx; + unsigned long long *x = (unsigned long long *) &xx; + + *x = -3; + *x = *x * *x; + if (*x != 9) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950512-1.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950512-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950512-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950512-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,24 @@ +unsigned +f1 (x) +{ + return ((unsigned) (x != 0) - 3) / 2; +} + +unsigned long long +f2 (x) +{ + return ((unsigned long long) (x != 0) - 3) / 2; +} + +main () +{ + if (f1 (1) != (~(unsigned) 0) >> 1) + abort (); + if (f1 (0) != ((~(unsigned) 0) >> 1) - 1) + abort (); + if (f2 (1) != (~(unsigned long long) 0) >> 1) + abort (); + if (f2 (0) != ((~(unsigned long long) 0) >> 1) - 1) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950605-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950605-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950605-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950605-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,12 @@ +f (c) + unsigned char c; +{ + if (c != 0xFF) + abort (); +} + +main () +{ + f (-1); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950607-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950607-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950607-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950607-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,9 @@ +main () +{ + struct { long status; } h; + + 
h.status = 0; + if (((h.status & 128) == 1) && ((h.status & 32) == 0)) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950607-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950607-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950607-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950607-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,41 @@ +typedef struct { + long int p_x, p_y; +} Point; + +int +f (Point basePt, Point pt1, Point pt2) +{ + long long vector; + + vector = + (long long) (pt1.p_x - basePt.p_x) * (long long) (pt2.p_y - basePt.p_y) - + (long long) (pt1.p_y - basePt.p_y) * (long long) (pt2.p_x - basePt.p_x); + + if (vector > (long long) 0) + return 0; + else if (vector < (long long) 0) + return 1; + else + return 2; +} + +main () +{ + Point b, p1, p2; + int answer; + + b.p_x = -23250; + b.p_y = 23250; + + p1.p_x = 23250; + p1.p_y = -23250; + + p2.p_x = -23250; + p2.p_y = -23250; + + answer = f (b, p1, p2); + + if (answer != 1) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950612-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950612-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950612-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950612-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,48 @@ +unsigned int +f1 (int diff) +{ + return ((unsigned int) (diff < 0 ? -diff : diff)); +} + +unsigned int +f2 (unsigned int diff) +{ + return ((unsigned int) ((signed int) diff < 0 ? 
-diff : diff)); +} + +unsigned long long +f3 (long long diff) +{ + return ((unsigned long long) (diff < 0 ? -diff : diff)); +} + +unsigned long long +f4 (unsigned long long diff) +{ + return ((unsigned long long) ((signed long long) diff < 0 ? -diff : diff)); +} + +main () +{ + int i; + for (i = 0; i <= 10; i++) + { + if (f1 (i) != i) + abort (); + if (f1 (-i) != i) + abort (); + if (f2 (i) != i) + abort (); + if (f2 (-i) != i) + abort (); + if (f3 ((long long) i) != i) + abort (); + if (f3 ((long long) -i) != i) + abort (); + if (f4 ((long long) i) != i) + abort (); + if (f4 ((long long) -i) != i) + abort (); + } + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950621-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950621-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950621-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950621-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,20 @@ +struct s +{ + int a; + int b; + struct s *dummy; +}; + +f (struct s *sp) +{ + return sp && sp->a == -1 && sp->b == -1; +} + +main () +{ + struct s x; + x.a = x.b = -1; + if (f (&x) == 0) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950628-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950628-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950628-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950628-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,31 @@ +typedef struct +{ + char hours, day, month; + short year; +} T; + +T g (void) +{ + T now; + + now.hours = 1; + now.day = 2; + 
now.month = 3; + now.year = 4; + return now; +} + +T f (void) +{ + T virk; + + virk = g (); + return virk; +} + +main () +{ + if (f ().hours != 1 || f ().day != 2 || f ().month != 3 || f ().year != 4) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950704-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950704-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950704-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950704-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,59 @@ +int errflag; + +long long +f (long long x, long long y) +{ + long long r; + + errflag = 0; + r = x + y; + if (x >= 0) + { + if ((y < 0) || (r >= 0)) + return r; + } + else + { + if ((y > 0) || (r < 0)) + return r; + } + errflag = 1; + return 0; +} + +main () +{ + f (0, 0); + if (errflag) + abort (); + + f (1, -1); + if (errflag) + abort (); + + f (-1, 1); + if (errflag) + abort (); + + f (0x8000000000000000LL, 0x8000000000000000LL); + if (!errflag) + abort (); + + f (0x8000000000000000LL, -1LL); + if (!errflag) + abort (); + + f (0x7fffffffffffffffLL, 0x7fffffffffffffffLL); + if (!errflag) + abort (); + + f (0x7fffffffffffffffLL, 1LL); + if (!errflag) + abort (); + + f (0x7fffffffffffffffLL, 0x8000000000000000LL); + if (errflag) + abort (); + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950706-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950706-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950706-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950706-1.c Wed Oct 9 04:01:46 2019 
@@ -0,0 +1,16 @@ +int +f (int n) +{ + return (n > 0) - (n < 0); +} + +main () +{ + if (f (-1) != -1) + abort (); + if (f (1) != 1) + abort (); + if (f (0) != 0) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950710-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950710-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950710-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950710-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,54 @@ +struct twelve +{ + int a; + int b; + int c; +}; + +struct pair +{ + int first; + int second; +}; + +struct pair +g () +{ + struct pair p; + return p; +} + +static void +f () +{ + int i; + for (i = 0; i < 1; i++) + { + int j; + for (j = 0; j < 1; j++) + { + if (0) + { + int k; + for (k = 0; k < 1; k++) + { + struct pair e = g (); + } + } + else + { + struct twelve a, b; + if ((((char *) &b - (char *) &a) < 0 + ? 
(-((char *) &b - (char *) &a)) + : ((char *) &b - (char *) &a)) < sizeof (a)) + abort (); + } + } + } +} + +main () +{ + f (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950714-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950714-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950714-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950714-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,17 @@ +int array[10] = {1, 1, 1, 1, 1, 1, 1, 1, 1, 1}; + +main () +{ + int i, j; + int *p; + + for (i = 0; i < 10; i++) + for (p = &array[0]; p != &array[9]; p++) + if (*p == i) + goto label; + + label: + if (i != 1) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950809-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950809-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950809-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950809-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,33 @@ +struct S +{ + int *sp, fc, *sc, a[2]; +}; + +f (struct S *x) +{ + int *t = x->sc; + int t1 = t[0]; + int t2 = t[1]; + int t3 = t[2]; + int a0 = x->a[0]; + int a1 = x->a[1]; + t[2] = t1; + t[0] = a1; + x->a[1] = a0; + x->a[0] = t3; + x->fc = t2; + x->sp = t; +} + +main () +{ + struct S s; + static int sc[3] = {2, 3, 4}; + s.sc = sc; + s.a[0] = 10; + s.a[1] = 11; + f (&s); + if (s.sp[2] != 2) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950906-1.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950906-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950906-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950906-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,16 @@ +g (int i) +{ +} + +f (int i) +{ + g (0); + while ( ({ i--; }) ) + g (0); +} + +main () +{ + f (10); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950915-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950915-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950915-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950915-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,15 @@ +long int a = 100000; +long int b = 21475; + +long +f () +{ + return ((long long) a * (long long) b) >> 16; +} + +main () +{ + if (f () < 0) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950929-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950929-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950929-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/950929-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,21 @@ +int f (char *p) { } + +main () +{ + char c; + char c2; + int i = 0; + char *pc = &c; + char *pc2 = &c2; + int *pi = &i; + + *pc2 = 1; + *pi = 1; + *pc2 &= *pi; + f (pc2); + *pc2 = 1; + *pc2 &= *pi; + if (*pc2 != 1) + abort (); + exit (0); +} Added: 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/951003-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/951003-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/951003-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/951003-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,21 @@ +int f (i) { return 12; } +int g () { return 0; } + +main () +{ + int i, s; + + for (i = 0; i < 32; i++) + { + s = f (i); + + if (i == g ()) + s = 42; + if (i == 0 || s == 12) + ; + else + abort (); + } + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/951115-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/951115-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/951115-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/951115-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,24 @@ +int var = 0; + +g () +{ + var = 1; +} + +f () +{ + int f2 = 0; + + if (f2 == 0) + ; + + g (); +} + +main () +{ + f (); + if (var != 1) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/951204-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/951204-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/951204-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/951204-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,18 @@ +f (char *x) +{ + *x = 'x'; +} + +main () +{ + int i; + char x = '\0'; + + for (i = 0; i < 100; 
++i) + { + f (&x); + if (*(const char *) &x != 'x') + abort (); + } + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960116-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960116-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960116-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960116-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,20 @@ +static inline +p (int *p) +{ + return !((long) p & 1); +} + +int +f (int *q) +{ + if (p (q) && *q) + return 1; + return 0; +} + +main () +{ + if (f ((int*) 0xffffffff) != 0) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960117-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960117-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960117-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960117-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,46 @@ +static char id_space[2] [32 +1]; +typedef short COUNT; + +typedef char TEXT; + +union T_VALS +{ + TEXT *id __attribute__ ((aligned (2), packed)) ; +}; +typedef union T_VALS VALS; + +struct T_VAL +{ + COUNT pos __attribute__ ((aligned (2), packed)) ; + VALS vals __attribute__ ((aligned (2), packed)) ; +}; +typedef struct T_VAL VAL; + +VAL curval = {0}; + +static short idc = 0; +static int cur_line; +static int char_pos; + +typedef unsigned short WORD; + +WORD +get_id (char c) +{ + curval.vals.id[0] = c; +} + +WORD +get_tok () +{ + char c = 'c'; + curval.vals.id = id_space[idc]; + curval.pos = (cur_line << 10) | char_pos; + return get_id (c); +} + +main () +{ + get_tok (); + exit (0); +} 
Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960209-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960209-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960209-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960209-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,43 @@ +struct a_struct +{ + unsigned char a_character; +}; + +struct a_struct an_array[5]; +struct a_struct *a_ptr; +int yabba = 1; + +int +f (a, b) + unsigned char a; + unsigned long b; +{ + long i, j, p, q, r, s; + + if (b != (unsigned long) 0) + { + if (yabba) + return -1; + s = 4000000 / b; + for (i = 0; i < 11; i++) + { + for (j = 0; j < 256; j++) + { + if (((p - s < 0) ? -s : 0) < (( q - s < 0) ? -s : q)) + r = i; + } + } + } + + if (yabba) + return 0; + a_ptr = &an_array[a]; + a_ptr->a_character = (unsigned char) r; +} + +main () +{ + if (f (1, 0UL) != 0) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960215-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960215-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960215-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960215-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,25 @@ +long double C = 2; +long double U = 1; +long double Y2 = 3; +long double Y1 = 1; +long double X, Y, Z, T, R, S; +main () +{ + X = (C + U) * Y2; + Y = C - U - U; + Z = C + U + U; + T = (C - U) * Y1; + X = X - (Z + U); + R = Y * Y1; + S = Z * Y2; + T = T - Y; + Y = (U - Y) + R; + Z = S - (Z + U + U); + R = (Y2 + U) * Y1; + Y1 = Y2 * Y1; + R = R - Y2; + Y1 = Y1 - 0.5L; + if (Z != 6) + abort 
(); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960218-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960218-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960218-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960218-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,22 @@ +int glob; + +g (x) +{ + glob = x; + return 0; +} + +f (x) +{ + int a = ~x; + while (a) + a = g (a); +} + +main () +{ + f (3); + if (glob != -4) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960219-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960219-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960219-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960219-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,11 @@ +f (int i) +{ + if (((1 << i) & 1) == 0) + abort (); +} + +main () +{ + f (0); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960301-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960301-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960301-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960301-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,22 @@ +struct foo { + unsigned : 12; + unsigned field : 4; +} foo; +unsigned oldfoo; + +int +bar (unsigned k) +{ + oldfoo = foo.field; + foo.field = k; + if (k) + return 1; + return 2; +} + 
+main () +{ + if (bar (1U) != 1) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960302-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960302-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960302-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960302-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,21 @@ +long a = 1; + +foo () +{ + switch (a % 2 % 2 % 2 % 2 % 2 % 2 % 2 % 2) + { + case 0: + return 0; + case 1: + return 1; + default: + return -1; + } +} + +main () +{ + if (foo () != 1) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960311-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960311-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960311-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960311-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,69 @@ +#include <stdio.h> + +#ifdef DEBUG +#define abort() printf ("error, line %d\n", __LINE__) +#endif + +int count; + +void a1() { ++count; } + +void +b (unsigned char data) +{ + if (data & 0x80) a1(); + data <<= 1; + + if (data & 0x80) a1(); + data <<= 1; + + if (data & 0x80) a1(); +} + +main () +{ + count = 0; + b (0); + if (count != 0) + abort (); + + count = 0; + b (0x80); + if (count != 1) + abort (); + + count = 0; + b (0x40); + if (count != 1) + abort (); + + count = 0; + b (0x20); + if (count != 1) + abort (); + + count = 0; + b (0xc0); + if (count != 2) + abort (); + + count = 0; + b (0xa0); + if (count != 2) + abort (); + + count = 0; + b (0x60); + if (count != 2) + abort (); + + count = 0; + b (0xe0); + 
if (count != 3) + abort (); + +#ifdef DEBUG + printf ("Done.\n"); +#endif + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960311-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960311-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960311-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960311-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,69 @@ +#include <stdio.h> + +#ifdef DEBUG +#define abort() printf ("error, line %d\n", __LINE__) +#endif + +int count; + +void a1() { ++count; } + +void +b (unsigned short data) +{ + if (data & 0x8000) a1(); + data <<= 1; + + if (data & 0x8000) a1(); + data <<= 1; + + if (data & 0x8000) a1(); +} + +main () +{ + count = 0; + b (0); + if (count != 0) + abort (); + + count = 0; + b (0x8000); + if (count != 1) + abort (); + + count = 0; + b (0x4000); + if (count != 1) + abort (); + + count = 0; + b (0x2000); + if (count != 1) + abort (); + + count = 0; + b (0xc000); + if (count != 2) + abort (); + + count = 0; + b (0xa000); + if (count != 2) + abort (); + + count = 0; + b (0x6000); + if (count != 2) + abort (); + + count = 0; + b (0xe000); + if (count != 3) + abort (); + +#ifdef DEBUG + printf ("Done.\n"); +#endif + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960311-3.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960311-3.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960311-3.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960311-3.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,69 @@ +#include <stdio.h> + +#ifdef DEBUG +#define abort() printf ("error, line %d\n", 
__LINE__) +#endif + +int count; + +void a1() { ++count; } + +void +b (unsigned long data) +{ + if (data & 0x80000000) a1(); + data <<= 1; + + if (data & 0x80000000) a1(); + data <<= 1; + + if (data & 0x80000000) a1(); +} + +main () +{ + count = 0; + b (0); + if (count != 0) + abort (); + + count = 0; + b (0x80000000); + if (count != 1) + abort (); + + count = 0; + b (0x40000000); + if (count != 1) + abort (); + + count = 0; + b (0x20000000); + if (count != 1) + abort (); + + count = 0; + b (0xc0000000); + if (count != 2) + abort (); + + count = 0; + b (0xa0000000); + if (count != 2) + abort (); + + count = 0; + b (0x60000000); + if (count != 2) + abort (); + + count = 0; + b (0xe0000000); + if (count != 3) + abort (); + +#ifdef DEBUG + printf ("Done.\n"); +#endif + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960312-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960312-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960312-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960312-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,34 @@ +struct S +{ + int *sp, fc, *sc, a[2]; +}; + +f (struct S *x) +{ + int *t = x->sc; + int t1 = t[0]; + int t2 = t[1]; + int t3 = t[2]; + int a0 = x->a[0]; + int a1 = x->a[1]; + asm("": :"r" (t2), "r" (t3)); + t[2] = t1; + t[0] = a1; + x->a[1] = a0; + x->a[0] = t3; + x->fc = t2; + x->sp = t; +} + +main () +{ + struct S s; + static int sc[3] = {2, 3, 4}; + s.sc = sc; + s.a[0] = 10; + s.a[1] = 11; + f (&s); + if (s.sp[2] != 2) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960317-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960317-1.c?rev=374156&view=auto 
============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960317-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960317-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,22 @@ +int +f (unsigned bitcount, int mant) +{ + int mask = -1 << bitcount; + { + if (! (mant & -mask)) + goto ab; + if (mant & ~mask) + goto auf; + } +ab: + return 0; +auf: + return 1; +} + +main () +{ + if (f (0, -1)) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960321-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960321-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960321-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960321-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,14 @@ +char a[10] = "deadbeef"; + +char +acc_a (long i) +{ + return a[i-2000000000L]; +} + +main () +{ + if (acc_a (2000000000L) != 'd') + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960326-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960326-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960326-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960326-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,16 @@ +struct s +{ + int a; + int b; + short c; + int d[3]; +}; + +struct s s = { .b = 3, .d = {2,0,0} }; + +main () +{ + if (s.b != 3) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960327-1.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960327-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960327-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960327-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,30 @@ +#include <stdio.h> +g () +{ + return '\n'; +} + +f () +{ + char s[] = "abcedfg012345"; + char *sp = s + 12; + + switch (g ()) + { + case '\n': + break; + } + + while (*--sp == '0') + ; + sprintf (sp + 1, "X"); + + if (s[12] != 'X') + abort (); +} + +main () +{ + f (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960402-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960402-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960402-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960402-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,11 @@ +f (signed long long int x) +{ + return x > 0xFFFFFFFFLL || x < -0x80000000LL; +} + +main () +{ + if (f (0) != 0) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960405-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960405-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960405-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960405-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,13 @@ +#define X 5.9486574767861588254287966331400356538172e4931L + +long double x = X + X; +long double y = 2.0L * X; + +main () +{ +#if ! 
defined (__vax__) && ! defined (_CRAY) + if (x != y) + abort (); +#endif + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960416-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960416-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960416-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960416-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,63 @@ +typedef unsigned long int st; +typedef unsigned long long dt; +typedef union +{ + dt d; + struct + { + st h, l; + } + s; +} t_be; + +typedef union +{ + dt d; + struct + { + st l, h; + } + s; +} t_le; + +#define df(f, t) \ +int \ +f (t afh, t bfh) \ +{ \ + t hh; \ + t hp, lp, dp, m; \ + st ad, bd; \ + int s; \ + s = 0; \ + ad = afh.s.h - afh.s.l; \ + bd = bfh.s.l - bfh.s.h; \ + if (bd > bfh.s.l) \ + { \ + bd = -bd; \ + s = ~s; \ + } \ + lp.d = (dt) afh.s.l * bfh.s.l; \ + hp.d = (dt) afh.s.h * bfh.s.h; \ + dp.d = (dt) ad *bd; \ + dp.d ^= s; \ + hh.d = hp.d + hp.s.h + lp.s.h + dp.s.h; \ + m.d = (dt) lp.s.h + hp.s.l + lp.s.l + dp.s.l; \ + return hh.s.l + m.s.l; \ +} + +df(f_le, t_le) +df(f_be, t_be) + +main () +{ + t_be x; + x.s.h = 0x10000000U; + x.s.l = 0xe0000000U; + if (x.d == 0x10000000e0000000ULL + && f_be ((t_be) 0x100000000ULL, (t_be) 0x100000000ULL) != -1) + abort (); + if (x.d == 0xe000000010000000ULL + && f_le ((t_le) 0x100000000ULL, (t_le) 0x100000000ULL) != -1) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960419-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960419-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960419-1.c (added) 
+++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960419-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,17 @@ +static int i; + +void +check(x) + int x; +{ + if (!x) + abort(); +} + +main() +{ + int *p = &i; + + check(p != (void *)0); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960419-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960419-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960419-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960419-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,13 @@ +#define SIZE 8 + +main() +{ + int a[SIZE] = {1}; + int i; + + for (i = 1; i < SIZE; i++) + if (a[i] != 0) + abort(); + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960512-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960512-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960512-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960512-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,19 @@ +__complex__ +double f () +{ + int a[40]; + __complex__ double c; + + a[9] = 0; + c = a[9]; + return c; +} + +main () +{ + __complex__ double c; + + if (c = f ()) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960513-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960513-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960513-1.c (added) +++ 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960513-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,25 @@ +long double +f (d, i) + long double d; + int i; +{ + long double e; + + d = -d; + e = d; + if (i == 1) + d *= 2; + d -= e * d; + d -= e * d; + d -= e * d; + d -= e * d; + d -= e * d; + return d; +} + +main () +{ + if (! (int) (f (2.0L, 1))) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960521-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960521-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960521-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960521-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,32 @@ +/* { dg-add-options stack_size } */ + +#include <stdlib.h> + +int *a, *b; +int n; + +#ifdef STACK_SIZE +#define BLOCK_SIZE (STACK_SIZE / (sizeof (*a) + sizeof (*b))) +#else +#define BLOCK_SIZE 32768 +#endif +foo () +{ + int i; + for (i = 0; i < n; i++) + a[i] = -1; + for (i = 0; i < BLOCK_SIZE - 1; i++) + b[i] = -1; +} + +main () +{ + n = BLOCK_SIZE; + a = malloc (n * sizeof(*a)); + b = malloc (n * sizeof(*b)); + *b++ = 0; + foo (); + if (b[-1]) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960608-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960608-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960608-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960608-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,34 @@ +typedef struct +{ + unsigned char a : 2; + unsigned char b : 3; + unsigned char c : 1; + unsigned char d : 1; + unsigned char e 
: 1; +} a_struct; + +foo (flags) + a_struct *flags; +{ + return (flags->c != 0 + || flags->d != 1 + || flags->e != 1 + || flags->a != 2 + || flags->b != 3); +} + +main () +{ + a_struct flags; + + flags.c = 0; + flags.d = 1; + flags.e = 1; + flags.a = 2; + flags.b = 3; + + if (foo (&flags) != 0) + abort (); + else + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960801-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960801-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960801-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960801-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,32 @@ +unsigned +f () +{ + long long l2; + unsigned short us; + unsigned long long ul; + short s2; + + ul = us = l2 = s2 = -1; + return ul; +} + +unsigned long long +g () +{ + long long l2; + unsigned short us; + unsigned long long ul; + short s2; + + ul = us = l2 = s2 = -1; + return ul; +} + +main () +{ + if (f () != (unsigned short) -1) + abort (); + if (g () != (unsigned short) -1) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960802-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960802-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960802-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960802-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,36 @@ +long val = 0x5e000000; + +long +f1 (void) +{ + return 0x132; +} + +long +f2 (void) +{ + return 0x5e000000; +} + +void +f3 (long b) +{ + val = b; +} + +void +f4 () +{ + long v = f1 (); + long o = f2 (); + v = (v & 0x00ffffff) | (o & 
0xff000000); + f3 (v); +} + +main () +{ + f4 (); + if (val != 0x5e000132) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960830-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960830-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960830-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960830-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,22 @@ +#ifdef __i386__ +f (rp) + unsigned int *rp; +{ + __asm__ ("mull %3" : "=a" (rp[0]), "=d" (rp[1]) : "%0" (7), "rm" (7)); +} + +main () +{ + unsigned int s[2]; + + f (s); + if (s[1] != 0 || s[0] != 49) + abort (); + exit (0); +} +#else +main () +{ + exit (0); +} +#endif Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960909-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960909-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960909-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/960909-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,29 @@ +int +ffs (x) + int x; +{ + int bit, mask; + + if (x == 0) + return 0; + + for (bit = 1, mask = 1; !(x & mask); bit++, mask <<= 1) + ; + + return bit; +} + +f (x) + int x; +{ + int y; + y = ffs (x) - 1; + if (y < 0) + abort (); +} + +main () +{ + f (1); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/961004-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/961004-1.c?rev=374156&view=auto ============================================================================== --- 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/961004-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/961004-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,22 @@ +int k = 0; + +main() +{ + int i; + int j; + + for (i = 0; i < 2; i++) + { + if (k) + { + if (j != 2) + abort (); + } + else + { + j = 2; + k++; + } + } + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/961017-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/961017-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/961017-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/961017-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,8 @@ +main () +{ + unsigned char z = 0; + + do ; + while (--z > 0); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/961017-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/961017-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/961017-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/961017-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,32 @@ +main () +{ + int i = 0; + + + if (sizeof (unsigned long int) == 4) + { + unsigned long int z = 0; + + do { + z -= 0x00004000; + i++; + if (i > 0x00040000) + abort (); + } while (z > 0); + exit (0); + } + else if (sizeof (unsigned int) == 4) + { + unsigned int z = 0; + + do { + z -= 0x00004000; + i++; + if (i > 0x00040000) + abort (); + } while (z > 0); + exit (0); + } + else + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/961026-1.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/961026-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/961026-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/961026-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,17 @@ +int +test (arg) + int arg; +{ + if (arg > 0 || arg == 0) + return 0; + return -1; +} + +main () +{ + if (test (0) != 0) + abort (); + if (test (-1) != -1) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/961112-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/961112-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/961112-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/961112-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,13 @@ +f (x) +{ + if (x != 0 || x == 0) + return 0; + return 1; +} + +main () +{ + if (f (3)) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/961122-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/961122-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/961122-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/961122-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,23 @@ +long long acc; + +addhi (short a) +{ + acc += (long long) a << 32; +} + +subhi (short a) +{ + acc -= (long long) a << 32; +} + +main () +{ + acc = 0xffff00000000ll; + addhi (1); + if (acc != 0x1000000000000ll) + abort (); + subhi (1); + if (acc != 
0xffff00000000ll) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/961122-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/961122-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/961122-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/961122-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,12 @@ +int +f (int a) +{ + return ((a >= 0 && a <= 10) && ! (a >= 0)); +} + +main () +{ + if (f (0)) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/961125-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/961125-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/961125-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/961125-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,33 @@ +static char * +begfield (int tab, char *ptr, char *lim, int sword, int schar) +{ + if (tab) + { + while (ptr < lim && sword--) + { + while (ptr < lim && *ptr != tab) + ++ptr; + if (ptr < lim) + ++ptr; + } + } + else + { + while (1) + ; + } + + if (ptr + schar <= lim) + ptr += schar; + + return ptr; +} + +main () +{ + char *s = ":ab"; + char *lim = s + 3; + if (begfield (':', s, lim, 1, 1) != s + 2) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/961206-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/961206-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/961206-1.c 
(added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/961206-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,52 @@ +int +sub1 (unsigned long long i) +{ + if (i < 0x80000000) + return 1; + else + return 0; +} + +int +sub2 (unsigned long long i) +{ + if (i <= 0x7FFFFFFF) + return 1; + else + return 0; +} + +int +sub3 (unsigned long long i) +{ + if (i >= 0x80000000) + return 0; + else + return 1; +} + +int +sub4 (unsigned long long i) +{ + if (i > 0x7FFFFFFF) + return 0; + else + return 1; +} + +main() +{ + if (sub1 (0x80000000ULL)) + abort (); + + if (sub2 (0x80000000ULL)) + abort (); + + if (sub3 (0x80000000ULL)) + abort (); + + if (sub4 (0x80000000ULL)) + abort (); + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/961213-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/961213-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/961213-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/961213-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,22 @@ +int +g (unsigned long long int *v, int n, unsigned int a[], int b) +{ + int cnt; + *v = 0; + for (cnt = 0; cnt < n; ++cnt) + *v = *v * b + a[cnt]; + return n; +} + +main () +{ + int res; + unsigned int ar[] = { 10, 11, 12, 13, 14 }; + unsigned long long int v; + + res = g (&v, sizeof(ar)/sizeof(ar[0]), ar, 16); + if (v != 0xabcdeUL) + abort (); + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/961223-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/961223-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/961223-1.c (added) +++ 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/961223-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,25 @@ +/* { dg-options "-fgnu89-inline" } */ + +extern void exit (int); +extern void abort (void); + +struct s { + double d; +}; + +inline struct s +sub (struct s s) +{ + s.d += 1.0; + return s; +} + +int +main () +{ + struct s t = { 2.0 }; + t = sub (t); + if (t.d != 3.0) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/970214-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/970214-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/970214-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/970214-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,5 @@ +#define L 1 +main () +{ + exit (L'1' != L'1'); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/970214-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/970214-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/970214-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/970214-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,5 @@ +#define m(L) (L'1' + (L)) +main () +{ + exit (m (0) != L'1'); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/970217-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/970217-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/970217-1.c (added) +++ 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/970217-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,10 @@ +sub (int i, int array[i++]) +{ + return i; +} + +main() +{ + int array[10]; + exit (sub (10, array) != 11); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/970923-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/970923-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/970923-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/970923-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,27 @@ +int +ts(a) + int a; +{ + if (a < 1000 && a > 2000) + return 1; + else + return 0; +} + +int +tu(a) + unsigned int a; +{ + if (a < 1000 && a > 2000) + return 1; + else + return 0; +} + + +main() +{ + if (ts (0) || tu (0)) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980205.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980205.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980205.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980205.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,20 @@ +#include <stdarg.h> + +void fdouble (double one, ...)
+{ + double value; + va_list ap; + + va_start (ap, one); + value = va_arg (ap, double); + va_end (ap); + + if (one != 1.0 || value != 2.0) + abort (); +} + +int main () +{ + fdouble (1.0, 2.0); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980223.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980223.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980223.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980223.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,31 @@ +typedef struct { char *addr; long type; } object; + +object bar (object blah) +{ + abort(); +} + +object foo (object x, object y) +{ + object z = *(object*)(x.addr); + if (z.type & 64) + { + y = *(object*)(z.addr+sizeof(object)); + z = *(object*)(z.addr); + if (z.type & 64) + y = bar(y); + } + return y; +} + +int nil; +object cons1[2] = { {(char *) &nil, 0}, {(char *) &nil, 0} }; +object cons2[2] = { {(char *) &cons1, 64}, {(char *) &nil, 0} }; + +main() +{ + object x = {(char *) &cons2, 64}; + object y = {(char *) &nil, 0}; + object three = foo(x,y); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980424-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980424-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980424-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980424-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,22 @@ +int i, a[99]; + +void f (int one) +{ + if (one != 1) + abort (); +} + +void +g () +{ + f (a[i & 0x3f]); +} + +int +main () +{ + a[0] = 1; + i = 0x40; + g (); + exit (0); +} Added: 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980505-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980505-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980505-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980505-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,11 @@ +static int f(int) __attribute__((const)); +int main() +{ + int f1, f2, x; + x = 1; f1 = f(x); + x = 2; f2 = f(x); + if (f1 != 1 || f2 != 2) + abort (); + exit (0); +} +static int f(int x) { return x; } Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980505-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980505-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980505-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980505-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,22 @@ +typedef unsigned short Uint16; +typedef unsigned int Uint; + +Uint f () +{ + Uint16 token; + Uint count; + static Uint16 values[1] = {0x9300}; + + token = values[0]; + count = token >> 8; + + return count; +} + +int +main () +{ + if (f () != 0x93) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980506-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980506-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980506-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980506-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,26 @@ 
+struct decision +{ + char enforce_mode; + struct decision *next; +}; + + +static void +clear_modes (p) + register struct decision *p; +{ + goto blah; + +foo: + p->enforce_mode = 0; +blah: + if (p) + goto foo; +} + +main() +{ + struct decision *p = 0; + clear_modes (p); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980506-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980506-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980506-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980506-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,22 @@ +static void *self(void *p){ return p; } + +int +f() +{ + struct { int i; } s, *sp; + int *ip = &s.i; + + s.i = 1; + sp = self(&s); + + *ip = 0; + return sp->i+1; +} + +main() +{ + if (f () != 1) + abort (); + else + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980506-3.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980506-3.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980506-3.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980506-3.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,23 @@ +unsigned char lookup_table [257]; + +static int +build_lookup (pattern) + unsigned char *pattern; +{ + int m; + + m = strlen (pattern) - 1; + + memset (lookup_table, ++m, 257); + return m; +} + +int main(argc, argv) + int argc; + char **argv; +{ + if (build_lookup ("bind") != 4) + abort (); + else + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980526-1.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980526-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980526-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980526-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,35 @@ +/* { dg-require-effective-target label_values } */ + +int expect_do1 = 1, expect_do2 = 2; + +static int doit(int x){ + __label__ lbl1; + __label__ lbl2; + static int jtab_init = 0; + static void *jtab[2]; + + if(!jtab_init) { + jtab[0] = &&lbl1; + jtab[1] = &&lbl2; + jtab_init = 1; + } + goto *jtab[x]; +lbl1: + return 1; +lbl2: + return 2; +} + +static void do1(void) { + if (doit(0) != expect_do1) + abort (); +} + +static void do2(void){ + if (doit(1) != expect_do2) + abort (); +} + +int main(void){ + exit(0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980526-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980526-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980526-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980526-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,57 @@ +typedef unsigned int dev_t; +typedef unsigned int kdev_t; + +static inline kdev_t to_kdev_t(int dev) +{ + int major, minor; + + if (sizeof(kdev_t) == 16) + return (kdev_t)dev; + major = (dev >> 8); + minor = (dev & 0xff); + return ((( major ) << 22 ) | ( minor )) ; + +} + +void do_mknod(const char * filename, int mode, kdev_t dev) +{ + if (dev==0x15800078) + exit(0); + else + abort(); +} + + +char * getname(const char * filename) +{ + register unsigned int a1,a2,a3,a4,a5,a6,a7,a8,a9; + a1 = (unsigned int)(filename) *5 + 1; + a2 = (unsigned 
int)(filename) *6 + 2; + a3 = (unsigned int)(filename) *7 + 3; + a4 = (unsigned int)(filename) *8 + 4; + a5 = (unsigned int)(filename) *9 + 5; + a6 = (unsigned int)(filename) *10 + 5; + a7 = (unsigned int)(filename) *11 + 5; + a8 = (unsigned int)(filename) *12 + 5; + a9 = (unsigned int)(filename) *13 + 5; + return (char *)(a1*a2+a3*a4+a5*a6+a7*a8+a9); +} + +int sys_mknod(const char * filename, int mode, dev_t dev) +{ + int error; + char * tmp; + + tmp = getname(filename); + error = ((long)( tmp )) ; + do_mknod(tmp,mode,to_kdev_t(dev)); + return error; +} + +int main(void) +{ + if (sizeof (int) != 4) + exit (0); + + return sys_mknod("test",1,0x12345678); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980526-3.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980526-3.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980526-3.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980526-3.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,20 @@ +int compare(x, y) +unsigned int x; +unsigned int y; +{ + if (x==y) + return 0; + else + return 1; +} + +main() +{ + unsigned int i, j, k, l; + i = 5; j = 2; k=0; l=2; + if (compare(5%(~(unsigned) 2), i%~j) + || compare(0, k%~l)) + abort(); + else + exit(0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980602-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980602-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980602-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980602-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,9 @@ +main() +{ + int i; + for (i = 1; i < 100; i++) + ; + 
if (i == 100) + exit (0); + abort (); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980602-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980602-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980602-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980602-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,21 @@ +/* The bit-field below would have a problem if __INT_MAX__ is too + small. */ +#if __INT_MAX__ < 2147483647 +int +main (void) +{ + exit (0); +} +#else +struct { + unsigned bit : 30; +} t; + +int main() +{ + if (!(t.bit++)) + exit (0); + else + abort (); +} +#endif Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980604-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980604-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980604-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980604-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,19 @@ +int a = 1; +int b = -1; + +int c = 1; +int d = 0; + +main () +{ + double e; + double f; + double g; + + f = c; + g = d; + e = (a < b) ? 
f : g; + if (e) + abort (); + exit(0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980605-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980605-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980605-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980605-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,80 @@ +/* { dg-add-options stack_size } */ + +#include <stdio.h> + +#ifndef STACK_SIZE +#define STACK_SIZE 200000 +#endif + +__inline__ static int +dummy (x) +{ + int y; + y = (long) (x * 4711.3); + return y; +} + +int getval (void); + +int +f2 (double x) +{ + unsigned short s; + int a, b, c, d, e, f, g, h, i, j; + + a = getval (); + b = getval (); + c = getval (); + d = getval (); + e = getval (); + f = getval (); + g = getval (); + h = getval (); + i = getval (); + j = getval (); + + + s = x; + + return a + b + c + d + e + f + g + h + i + j + s; +} + +int x = 1; + +int +getval (void) +{ + return x++; +} + +char buf[10]; + +void +f () +{ + char ar[STACK_SIZE/2]; + int a, b, c, d, e, f, g, h, i, j, k; + + a = getval (); + b = getval (); + c = getval (); + d = getval (); + e = getval (); + f = getval (); + g = getval (); + h = getval (); + i = getval (); + j = getval (); + + k = f2 (17.0); + + sprintf (buf, "%d\n", a + b + c + d + e + f + g + h + i + j + k); + if (a + b + c + d + e + f + g + h + i + j + k != 227) + abort (); +} + +main () +{ + f (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980608-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980608-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980608-1.c
(added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980608-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,41 @@ +/* { dg-options "-fgnu89-inline" } */ + +#include <stdarg.h> + +extern void abort(void); +extern void exit (int); + +void f1(int a,int b,int c,int d,int e, int f,int g,int h,int i,int j, int k,int +l,int m,int n,int o) +{ + return; +} + +inline void debug(const char *msg,...) +{ + va_list ap; + va_start( ap, msg ); + + f1(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15); + + if ( va_arg(ap,int) != 101) + abort(); + if ( va_arg(ap,int) != 102) + abort(); + if ( va_arg(ap,int) != 103) + abort(); + if ( va_arg(ap,int) != 104) + abort(); + if ( va_arg(ap,int) != 105) + abort(); + if ( va_arg(ap,int) != 106) + abort(); + + va_end( ap ); +} + +int main(void) +{ + debug("%d %d %d %d %d %d\n", 101, 102, 103, 104, 105, 106); + exit(0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980612-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980612-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980612-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980612-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,17 @@ +struct fd +{ + unsigned char a; + unsigned char b; +} f = { 5 }; + +struct fd *g() { return &f; } +int h() { return -1; } + +int main() +{ + struct fd *f = g(); + f->b = h(); + if (((f->a & 0x7f) & ~0x10) <= 2) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980617-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980617-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980617-1.c (added) +++
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980617-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,14 @@ +void foo (unsigned int * p) +{ + if ((signed char)(*p & 0xFF) == 17 || (signed char)(*p & 0xFF) == 18) + return; + else + abort (); +} + +int main () +{ + int i = 0x30011; + foo(&i); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980618-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980618-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980618-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980618-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,16 @@ +void func(int, int); + +int main() +{ + int x = 7; + func(!x, !7); + exit (0); +} + +void func(int x, int y) +{ + if (x == y) + return; + else + abort (); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980701-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980701-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980701-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980701-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,22 @@ +ns_name_skip (unsigned char **x, unsigned char *y) +{ + *x = 0; + return 0; +} + +unsigned char a[2]; + +int dn_skipname(unsigned char *ptr, unsigned char *eom) { + unsigned char *saveptr = ptr; + + if (ns_name_skip(&ptr, eom) == -1) + return (-1); + return (ptr - saveptr); +} + +main() +{ + if (dn_skipname (&a[0], &a[1]) == 0) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980707-1.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980707-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980707-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980707-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,45 @@ +#include +#include + +char ** +buildargv (char *input) +{ + static char *arglist[256]; + int numargs = 0; + + while (1) + { + while (*input == ' ') + input++; + if (*input == 0) + break; + arglist [numargs++] = input; + while (*input != ' ' && *input != 0) + input++; + if (*input == 0) + break; + *(input++) = 0; + } + arglist [numargs] = NULL; + return arglist; +} + + +int main() +{ + char **args; + char input[256]; + int i; + + strcpy(input, " a b"); + args = buildargv(input); + + if (strcmp (args[0], "a")) + abort (); + if (strcmp (args[1], "b")) + abort (); + if (args[2] != NULL) + abort (); + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980709-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980709-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980709-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980709-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,15 @@ +/* { dg-xfail-if "Can not call system libm.a with -msoft-float" { powerpc-*-aix* rs6000-*-aix* } { "-msoft-float" } { "" } } */ +#include <math.h> + +main() +{ + volatile double a; + double c; + a = 32.0; + c = pow(a, 1.0/3.0); + if (c + 0.1 > 3.174802 + && c - 0.1 < 3.174802) + exit (0); + else + abort (); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980716-1.c URL:
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980716-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980716-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980716-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,25 @@ +#include <stdarg.h> + +void +stub(int num, ...) +{ + va_list ap; + char *end; + int i; + + for (i = 0; i < 2; i++) { + va_start(ap, num); + while ( 1 ) { + end = va_arg(ap, char *); + if (!end) break; + } + va_end(ap); + } +} + +int +main() +{ + stub(1, "ab", "bc", "cx", (char *)0); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980929-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980929-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980929-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/980929-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,21 @@ +void f(int i) +{ + if (i != 1000) + abort (); +} + + +int main() +{ + int n=1000; + int i; + + f(n); + for(i=0; i<1; ++i) { + f(n); + n=666; + &n; + } + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/981001-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/981001-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/981001-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/981001-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,38 @@ +#define NG 0x100L + +unsigned long flg = 0; + +long sub (int n) +{ + int a, b ; + + if (n >= 2) + {
if (n % 2 == 0) + { + a = sub (n / 2); + + return (a + 2 * sub (n / 2 - 1)) * a; + } + else + { + a = sub (n / 2 + 1); + b = sub (n / 2); + + return a * a + b * b; + } + } + else + return (long) n; +} + +int main (void) +{ + if (sub (30) != 832040L) + flg |= NG; + + if (flg) + abort (); + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/981019-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/981019-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/981019-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/981019-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,46 @@ +/* { dg-skip-if "ptxas seg faults" { nvptx-*-* } { "-O3*" } { "" } } */ + +extern int f2(void); +extern int f3(void); +extern void f1(void); + +void +ff(int fname, int part, int nparts) +{ + if (fname) /* bb 0 */ + { + if (nparts) /* bb 1 */ + f1(); /* bb 2 */ + } + else + fname = 2; /* bb 3 */ + + /* bb 4 is the branch to bb 10 + (bb 10 is physically at the end of the loop) */ + while (f3() /* bb 10 */) + { + if (nparts /* bb 5 */ && f2() /* bb 6 */) + { + f1(); /* bb 7 ... */ + nparts = part; + if (f3()) /* ... 
bb 7 */ + f1(); /* bb 8 */ + f1(); /* bb 9 */ + break; + } + } + + if (nparts) /* bb 11 */ + f1(); /* bb 12 */ + return; /* bb 13 */ +} + +int main(void) +{ + ff(0, 1, 0); + return 0; +} + +int f3(void) { static int x = 0; x = !x; return x; } +void f1(void) { abort(); } +int f2(void) { abort(); } Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/981130-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/981130-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/981130-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/981130-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,29 @@ +/* { dg-xfail-if "alias analysis conflicts with instruction scheduling" { m32r-*-* } { "-O2" "-O1" "-O0" "-Os"} { "" } } */ +struct s { int a; int b;}; +struct s s1; +struct s s2 = { 1, 2, }; + +void +check (a, b) + int a; + int b; +{ + if (a == b) + exit (0); + else + abort (); +} + +int +main () +{ + int * p; + int x; + + s1.a = 9; + p = & s1.a; + s1 = s2; + x = * p; + + check (x, 1); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/981206-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/981206-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/981206-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/981206-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,17 @@ +/* Verify unaligned address aliasing on Alpha EV[45]. 
*/ + +static unsigned short x, y; + +void foo() +{ + x = 0x345; + y = 0x567; +} + +int main() +{ + foo (); + if (x != 0x345 || y != 0x567) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990106-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990106-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990106-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990106-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,16 @@ +foo(bufp) +char *bufp; +{ + int x = 80; + return (*bufp++ = x ? 'a' : 'b'); +} + +main() +{ + char x; + + if (foo (&x) != 'a') + abort (); + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990106-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990106-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990106-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990106-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,20 @@ +unsigned calc_mp(unsigned mod) +{ + unsigned a,b,c; + c=-1; + a=c/mod; + b=0-a*mod; + if (b > mod) { a += 1; b-=mod; } + return b; +} + +int main(int argc, char *argv[]) +{ + unsigned x = 1234; + unsigned y = calc_mp(x); + + if ((sizeof (y) == 4 && y != 680) + || (sizeof (y) == 2 && y != 134)) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990117-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990117-1.c?rev=374156&view=auto ============================================================================== --- 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990117-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990117-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,15 @@ +int +foo (int x, int y, int i, int j) +{ + double tmp1 = ((double) x / y); + double tmp2 = ((double) i / j); + + return tmp1 < tmp2; +} + +main () +{ + if (foo (2, 24, 3, 4) == 0) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990127-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990127-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990127-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990127-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,34 @@ +extern void abort (void); +extern void exit (int); + +main() +{ + int a,b,c; + int *pa, *pb, *pc; + int **ppa, **ppb, **ppc; + int i,j,k,x,y,z; + + a = 10; + b = 20; + c = 30; + pa = &a; pb = &b; pc = &c; + ppa = &pa; ppb = &pb; ppc = &pc; + x = 0; y = 0; z = 0; + + for(i=0;i<10;i++){ + if( pa == &a ) pa = &b; + else pa = &a; + while( (*pa)-- ){ + x++; + if( (*pa) < 3 ) break; + else pa = &b; + } + x++; + pa = &b; + } + + if ((*pa) != -5 || (*pb) != -5 || x != 43) + abort (); + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990127-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990127-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990127-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990127-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,26 @@ +/* { dg-options "-mpc64" { target { i?86-*-* x86_64-*-* } } } */ + 
+extern void abort (void); +extern void exit (int); + +void +fpEq (double x, double y) +{ + if (x != y) + abort (); +} + +void +fpTest (double x, double y) +{ + double result1 = (35.7 * 100.0) / 45.0; + double result2 = (x * 100.0) / y; + fpEq (result1, result2); +} + +int +main () +{ + fpTest (35.7, 45.0); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990128-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990128-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990128-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990128-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,50 @@ +extern int printf (const char *,...); + +struct s { struct s *n; } *p; +struct s ss; +#define MAX 10 +struct s sss[MAX]; +int count = 0; + +void sub( struct s *p, struct s **pp ); +int look( struct s *p, struct s **pp ); + +main() +{ + struct s *pp; + struct s *next; + int i; + + p = &ss; + next = p; + for ( i = 0; i < MAX; i++ ) { + next->n = &sss[i]; + next = next->n; + } + next->n = 0; + + sub( p, &pp ); + if (count != MAX+2) + abort (); + + exit( 0 ); +} + +void sub( struct s *p, struct s **pp ) +{ + for ( ; look( p, pp ); ) { + if ( p ) + p = p->n; + else + break; + } +} + +int look( struct s *p, struct s **pp ) +{ + for ( ; p; p = p->n ) + ; + *pp = p; + count++; + return( 1 ); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990130-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990130-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990130-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990130-1.c 
Wed Oct 9 04:01:46 2019 @@ -0,0 +1,23 @@ +int count = 0; +int dummy; + +static int * +bar(void) +{ + ++count; + return &dummy; +} + +static void +foo(void) +{ + asm("" : "+r"(*bar())); +} + +main() +{ + foo(); + if (count != 1) + abort(); + exit(0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990208-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990208-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990208-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990208-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,47 @@ +/* { dg-require-effective-target label_values } */ + +/* As a quality of implementation issue, we should not prevent inlining + of function explicitly marked inline just because a label therein had + its address taken. */ + +static void *ptr1, *ptr2; +static int i = 1; + +static __inline__ void doit(void **pptr, int cond) +{ + if (cond) { + here: + *pptr = &&here; + } +} + +__attribute__ ((noinline)) +static void f(int cond) +{ + doit (&ptr1, cond); +} + +__attribute__ ((noinline)) +static void g(int cond) +{ + doit (&ptr2, cond); +} + +__attribute__ ((noinline)) +static void bar(void); + +int main() +{ + f (i); + bar(); + g (i); + +#ifdef __OPTIMIZE__ + if (ptr1 == ptr2) + abort (); +#endif + + exit (0); +} + +void bar(void) { } Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990211-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990211-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990211-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990211-1.c Wed Oct 9 04:01:46 2019 @@ 
-0,0 +1,61 @@ +/* Copyright (C) 1999 Free Software Foundation, Inc. + Contributed by Nathan Sidwell 20 Jan 1999 */ + +/* check range combining boolean operations work */ + +extern void abort(); + +#define N 77 + +void func(int i) +{ + /* fold-const does some clever things with range tests. Make sure + we get (some of) them right */ + + /* these must fail, regardless of the value of i */ + if ((i < 0) && (i >= 0)) + abort(); + if ((i > 0) && (i <= 0)) + abort(); + if ((i >= 0) && (i < 0)) + abort(); + if ((i <= 0) && (i > 0)) + abort(); + + if ((i < N) && (i >= N)) + abort(); + if ((i > N) && (i <= N)) + abort(); + if ((i >= N) && (i < N)) + abort(); + if ((i <= N) && (i > N)) + abort(); + + /* these must pass, regardless of the value of i */ + if (! ((i < 0) || (i >= 0))) + abort(); + if (! ((i > 0) || (i <= 0))) + abort(); + if (! ((i >= 0) || (i < 0))) + abort(); + if (! ((i <= 0) || (i > 0))) + abort(); + + if (! ((i < N) || (i >= N))) + abort(); + if (! ((i > N) || (i <= N))) + abort(); + if (! ((i >= N) || (i < N))) + abort(); + if (! 
((i <= N) || (i > N))) + abort(); + + return; +} + +int main() +{ + func(0); + func(1); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990222-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990222-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990222-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990222-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,11 @@ +char line[4] = { '1', '9', '9', '\0' }; + +int main() +{ + char *ptr = line + 3; + + while ((*--ptr += 1) > '9') *ptr = '0'; + if (line[0] != '2' || line[1] != '0' || line[2] != '0') + abort(); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990324-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990324-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990324-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990324-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,12 @@ +void f(long i) +{ + if ((signed char)i < 0 || (signed char)i == 0) + abort (); + else + exit (0); +} + +main() +{ + f(0xffffff01); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990326-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990326-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990326-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990326-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,407 @@ +struct a { + char a, 
b; + short c; +}; + +int +a1() +{ + static struct a x = { 1, 2, ~1 }, y = { 65, 2, ~2 }; + + return (x.a == (y.a & ~64) && x.b == y.b); +} + +int +a2() +{ + static struct a x = { 1, 66, ~1 }, y = { 1, 2, ~2 }; + + return (x.a == y.a && (x.b & ~64) == y.b); +} + +int +a3() +{ + static struct a x = { 9, 66, ~1 }, y = { 33, 18, ~2 }; + + return ((x.a & ~8) == (y.a & ~32) && (x.b & ~64) == (y.b & ~16)); +} + +struct b { + int c; + short b, a; +}; + +int +b1() +{ + static struct b x = { ~1, 2, 1 }, y = { ~2, 2, 65 }; + + return (x.a == (y.a & ~64) && x.b == y.b); +} + +int +b2() +{ + static struct b x = { ~1, 66, 1 }, y = { ~2, 2, 1 }; + + return (x.a == y.a && (x.b & ~64) == y.b); +} + +int +b3() +{ + static struct b x = { ~1, 66, 9 }, y = { ~2, 18, 33 }; + + return ((x.a & ~8) == (y.a & ~32) && (x.b & ~64) == (y.b & ~16)); +} + +struct c { + unsigned int c:4, b:14, a:14; +} __attribute__ ((aligned)); + +int +c1() +{ + static struct c x = { ~1, 2, 1 }, y = { ~2, 2, 65 }; + + return (x.a == (y.a & ~64) && x.b == y.b); +} + +int +c2() +{ + static struct c x = { ~1, 66, 1 }, y = { ~2, 2, 1 }; + + return (x.a == y.a && (x.b & ~64) == y.b); +} + +int +c3() +{ + static struct c x = { ~1, 66, 9 }, y = { ~2, 18, 33 }; + + return ((x.a & ~8) == (y.a & ~32) && (x.b & ~64) == (y.b & ~16)); +} + +struct d { + unsigned int a:14, b:14, c:4; +} __attribute__ ((aligned)); + +int +d1() +{ + static struct d x = { 1, 2, ~1 }, y = { 65, 2, ~2 }; + + return (x.a == (y.a & ~64) && x.b == y.b); +} + +int +d2() +{ + static struct d x = { 1, 66, ~1 }, y = { 1, 2, ~2 }; + + return (x.a == y.a && (x.b & ~64) == y.b); +} + +int +d3() +{ + static struct d x = { 9, 66, ~1 }, y = { 33, 18, ~2 }; + + return ((x.a & ~8) == (y.a & ~32) && (x.b & ~64) == (y.b & ~16)); +} + +struct e { + int c:4, b:14, a:14; +} __attribute__ ((aligned)); + +int +e1() +{ + static struct e x = { ~1, -2, -65 }, y = { ~2, -2, -1 }; + + return (x.a == (y.a & ~64) && x.b == y.b); +} + +int +e2() +{ + static struct e x = { ~1, 
-2, -1 }, y = { ~2, -66, -1 }; + + return (x.a == y.a && (x.b & ~64) == y.b); +} + +int +e3() +{ + static struct e x = { ~1, -18, -33 }, y = { ~2, -66, -9 }; + + return ((x.a & ~8) == (y.a & ~32) && (x.b & ~64) == (y.b & ~16)); +} + +int +e4() +{ + static struct e x = { -1, -1, 0 }; + + return x.a == 0 && x.b & 0x2000; +} + +struct f { + int a:14, b:14, c:4; +} __attribute__ ((aligned)); + +int +f1() +{ + static struct f x = { -65, -2, ~1 }, y = { -1, -2, ~2 }; + + return (x.a == (y.a & ~64) && x.b == y.b); +} + +int +f2() +{ + static struct f x = { -1, -2, ~1 }, y = { -1, -66, ~2 }; + + return (x.a == y.a && (x.b & ~64) == y.b); +} + +int +f3() +{ + static struct f x = { -33, -18, ~1 }, y = { -9, -66, ~2 }; + + return ((x.a & ~8) == (y.a & ~32) && (x.b & ~64) == (y.b & ~16)); +} + +int +f4() +{ + static struct f x = { 0, -1, -1 }; + + return x.a == 0 && x.b & 0x2000; +} + +struct gx { + int c:4, b:14, a:14; +} __attribute__ ((aligned)); +struct gy { + int b:14, a:14, c:4; +} __attribute__ ((aligned)); + +int +g1() +{ + static struct gx x = { ~1, -2, -65 }; + static struct gy y = { -2, -1, ~2 }; + + return (x.a == (y.a & ~64) && x.b == y.b); +} + +int +g2() +{ + static struct gx x = { ~1, -2, -1 }; + static struct gy y = { -66, -1, ~2 }; + + return (x.a == y.a && (x.b & ~64) == y.b); +} + +int +g3() +{ + static struct gx x = { ~1, -18, -33 }; + static struct gy y = { -66, -9, ~2 }; + + return ((x.a & ~8) == (y.a & ~32) && (x.b & ~64) == (y.b & ~16)); +} + +int +g4() +{ + static struct gx x = { ~1, 0x0020, 0x0010 }; + static struct gy y = { 0x0200, 0x0100, ~2 }; + + return ((x.a & 0x00f0) == (y.a & 0x0f00) && + (x.b & 0x00f0) == (y.b & 0x0f00)); +} + +int +g5() +{ + static struct gx x = { ~1, 0x0200, 0x0100 }; + static struct gy y = { 0x0020, 0x0010, ~2 }; + + return ((x.a & 0x0f00) == (y.a & 0x00f0) && + (x.b & 0x0f00) == (y.b & 0x00f0)); +} + +int +g6() +{ + static struct gx x = { ~1, 0xfe20, 0xfd10 }; + static struct gy y = { 0xc22f, 0xc11f, ~2 }; + + return 
((x.a & 0x03ff) == (y.a & 0x3ff0) && + (x.b & 0x03ff) == (y.b & 0x3ff0)); +} + +int +g7() +{ + static struct gx x = { ~1, 0xc22f, 0xc11f }; + static struct gy y = { 0xfe20, 0xfd10, ~2 }; + + return ((x.a & 0x3ff0) == (y.a & 0x03ff) && + (x.b & 0x3ff0) == (y.b & 0x03ff)); +} + +struct hx { + int a:14, b:14, c:4; +} __attribute__ ((aligned)); +struct hy { + int c:4, a:14, b:14; +} __attribute__ ((aligned)); + +int +h1() +{ + static struct hx x = { -65, -2, ~1 }; + static struct hy y = { ~2, -1, -2 }; + + return (x.a == (y.a & ~64) && x.b == y.b); +} + +int +h2() +{ + static struct hx x = { -1, -2, ~1 }; + static struct hy y = { ~2, -1, -66 }; + + return (x.a == y.a && (x.b & ~64) == y.b); +} + +int +h3() +{ + static struct hx x = { -33, -18, ~1 }; + static struct hy y = { ~2, -9, -66 }; + + return ((x.a & ~8) == (y.a & ~32) && (x.b & ~64) == (y.b & ~16)); +} + +int +h4() +{ + static struct hx x = { 0x0010, 0x0020, ~1 }; + static struct hy y = { ~2, 0x0100, 0x0200 }; + + return ((x.a & 0x00f0) == (y.a & 0x0f00) && + (x.b & 0x00f0) == (y.b & 0x0f00)); +} + +int +h5() +{ + static struct hx x = { 0x0100, 0x0200, ~1 }; + static struct hy y = { ~2, 0x0010, 0x0020 }; + + return ((x.a & 0x0f00) == (y.a & 0x00f0) && + (x.b & 0x0f00) == (y.b & 0x00f0)); +} + +int +h6() +{ + static struct hx x = { 0xfd10, 0xfe20, ~1 }; + static struct hy y = { ~2, 0xc11f, 0xc22f }; + + return ((x.a & 0x03ff) == (y.a & 0x3ff0) && + (x.b & 0x03ff) == (y.b & 0x3ff0)); +} + +int +h7() +{ + static struct hx x = { 0xc11f, 0xc22f, ~1 }; + static struct hy y = { ~2, 0xfd10, 0xfe20 }; + + return ((x.a & 0x3ff0) == (y.a & 0x03ff) && + (x.b & 0x3ff0) == (y.b & 0x03ff)); +} + +int +main() +{ + if (!a1 ()) + abort (); + if (!a2 ()) + abort (); + if (!a3 ()) + abort (); + if (!b1 ()) + abort (); + if (!b2 ()) + abort (); + if (!b3 ()) + abort (); + if (!c1 ()) + abort (); + if (!c2 ()) + abort (); + if (!c3 ()) + abort (); + if (!d1 ()) + abort (); + if (!d2 ()) + abort (); + if (!d3 ()) + abort (); + if 
(!e1 ()) + abort (); + if (!e2 ()) + abort (); + if (!e3 ()) + abort (); + if (!e4 ()) + abort (); + if (!f1 ()) + abort (); + if (!f2 ()) + abort (); + if (!f3 ()) + abort (); + if (!f4 ()) + abort (); + if (!g1 ()) + abort (); + if (!g2 ()) + abort (); + if (!g3 ()) + abort (); + if (g4 ()) + abort (); + if (g5 ()) + abort (); + if (!g6 ()) + abort (); + if (!g7 ()) + abort (); + if (!h1 ()) + abort (); + if (!h2 ()) + abort (); + if (!h3 ()) + abort (); + if (h4 ()) + abort (); + if (h5 ()) + abort (); + if (!h6 ()) + abort (); + if (!h7 ()) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990404-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990404-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990404-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990404-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,27 @@ + +int x[10] = { 0,1,2,3,4,5,6,7,8,9}; + +int +main() +{ + int niterations = 0, i; + + for (;;) { + int i, mi, max; + max = 0; + for (i = 0; i < 10 ; i++) { + if (x[i] > max) { + max = x[i]; + mi = i; + } + } + if (max == 0) + break; + x[mi] = 0; + niterations++; + if (niterations > 10) + abort (); + } + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990413-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990413-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990413-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990413-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,51 @@ +/* This tests for a bug in regstack that was breaking glibc's math library. 
*/ +/* { dg-skip-if "" { ! { i?86-*-* x86_64-*-* } } } */ + +extern void abort (void); + +static __inline double +minus_zero (void) +{ + union { double __d; int __i[2]; } __x; + __x.__i[0] = 0x0; + __x.__i[1] = 0x80000000; + return __x.__d; +} + +static __inline long double +__atan2l (long double __y, long double __x) +{ + register long double __value; + __asm __volatile__ ("fpatan\n\t" + : "=t" (__value) + : "0" (__x), "u" (__y) + : "st(1)"); + return __value; +} + +static __inline long double +__sqrtl (long double __x) +{ + register long double __result; + __asm __volatile__ ("fsqrt" : "=t" (__result) : "0" (__x)); + return __result; +} + +static __inline double +asin (double __x) +{ + return __atan2l (__x, __sqrtl (1.0 - __x * __x)); +} + +int +main (void) +{ + double x; + + x = minus_zero(); + x = asin (x); + + if (x != 0.0) /* actually -0.0, but 0.0 == -0.0 */ + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990513-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990513-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990513-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990513-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,26 @@ +#include <string.h> + +void foo (int *BM_tab, int j) +{ + int *BM_tab_base; + + BM_tab_base = BM_tab; + BM_tab += 0400; + while (BM_tab_base != BM_tab) + { + *--BM_tab = j; + *--BM_tab = j; + *--BM_tab = j; + *--BM_tab = j; + } +} + +int main () +{ + int BM_tab[0400]; + memset (BM_tab, 0, sizeof (BM_tab)); + foo (BM_tab, 6); + if (BM_tab[0] != 6) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990524-1.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990524-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990524-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990524-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,26 @@ +char a[] = "12345"; +char b[] = "12345"; + +void loop (char * pz, char * pzDta) +{ + for (;;) { + switch (*(pz++) = *(pzDta++)) { + case 0: + goto loopDone2; + + case '"': + case '\\': + pz[-1] = '\\'; + *(pz++) = pzDta[-1]; + } + } loopDone2:; + + if (a - pz != b - pzDta) + abort (); +} + +main() +{ + loop (a, b); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990525-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990525-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990525-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990525-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,23 @@ +struct blah { + int m1, m2; +}; + +void die(struct blah arg) +{ + int i ; + struct blah buf[1]; + + for (i = 0; i < 1 ; buf[i++] = arg) + ; + if (buf[0].m1 != 1) { + abort (); + } +} + +int main() +{ + struct blah s = { 1, 2 }; + + die(s); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990525-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990525-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990525-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990525-2.c Wed Oct 9 
04:01:46 2019 @@ -0,0 +1,37 @@ +typedef struct { + int v[4]; +} Test1; + +Test1 func2(); + +int func1() +{ + Test1 test; + test = func2(); + + if (test.v[0] != 10) + abort (); + if (test.v[1] != 20) + abort (); + if (test.v[2] != 30) + abort (); + if (test.v[3] != 40) + abort (); +} + +Test1 func2() +{ + Test1 tmp; + tmp.v[0] = 10; + tmp.v[1] = 20; + tmp.v[2] = 30; + tmp.v[3] = 40; + return tmp; +} + + +int main() +{ + func1(); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990527-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990527-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990527-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990527-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,29 @@ +int sum; + +void +g (int i) +{ + sum += i; +} + +void +f(int j) +{ + int i; + + for (i = 0; i < 9; i++) + { + j++; + g (j); + j = 9; + } +} + +int +main () +{ + f (0); + if (sum != 81) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990531-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990531-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990531-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990531-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,23 @@ + unsigned long bad(int reg, unsigned long inWord) + { + union { + unsigned long word; + unsigned char byte[4]; + } data; + + data.word = inWord; + data.byte[reg] = 0; + + return data.word; + } + +main() +{ + /* XXX This test could be generalized. 
*/ + if (sizeof (long) != 4) + exit (0); + + if (bad (0, 0xdeadbeef) == 0xdeadbeef) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990604-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990604-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990604-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990604-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,18 @@ +int b; +void f () +{ + int i = 0; + if (b == 0) + do { + b = i; + i++; + } while (i < 10); +} + +int main () +{ + f (); + if (b != 9) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990628-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990628-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990628-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990628-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,56 @@ +#include <stdlib.h> + +struct { + long sqlcode; +} sqlca; + + +struct data_record { + int dummy; + int a[100]; +} *data_ptr, data_tmp; + + +int +num_records() +{ + return 1; +} + + +void +fetch() +{ + static int fetch_count; + + memset(&data_tmp, 0x55, sizeof(data_tmp)); + sqlca.sqlcode = (++fetch_count > 1 ? 
100 : 0); +} + + +void +load_data() { + struct data_record *p; + int num = num_records(); + + data_ptr = malloc(num * sizeof(struct data_record)); + memset(data_ptr, 0xaa, num * sizeof(struct data_record)); + + fetch(); + p = data_ptr; + while (sqlca.sqlcode == 0) { + *p++ = data_tmp; + fetch(); + } +} + + +main() +{ + load_data(); + if (sizeof (int) == 2 && data_ptr[0].dummy != 0x5555) + abort (); + else if (sizeof (int) > 2 && data_ptr[0].dummy != 0x55555555) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990804-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990804-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990804-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990804-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,29 @@ +int gfbyte ( void ) +{ + return 0; +} + +int main( void ) +{ + int i,j,k ; + + i = gfbyte(); + + i = i + 1 ; + + if ( i == 0 ) + k = -0 ; + else + k = i + 0 ; + + if (i != 1) + abort (); + + k = 1 ; + if ( k <= i) + do + j = gfbyte () ; + while ( k++ < i ) ; + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990811-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990811-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990811-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990811-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,31 @@ +struct s {long a; int b;}; + +int foo(int x, void *y) +{ + switch(x) { + case 0: return ((struct s*)y)->a; + case 1: return *(signed char*)y; + case 2: return *(short*)y; + } + abort(); +} + +int main () +{ + 
struct s s; + short sh[10]; + signed char c[10]; + int i; + + s.a = 1; + s.b = 2; + for (i = 0; i < 10; i++) { + sh[i] = i; + c[i] = i; + } + + if (foo(0, &s) != 1) abort(); + if (foo(1, c+3) != 3) abort(); + if (foo(2, sh+3) != 3) abort(); + exit(0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990826-0.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990826-0.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990826-0.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990826-0.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,29 @@ +/* +From: niles at fan745.gsfc.nasa.gov +To: fortran at gnu.org +Subject: Re: Scary problems in g77 for RedHat 6.0. (glibc-2.1) +Date: Sun, 06 Jun 1999 23:37:23 -0400 +X-UIDL: 9c1e40c572e3b306464f703461764cd5 +*/ + +/* { dg-xfail-if "Can not call system libm.a with -msoft-float" { powerpc-*-aix* rs6000-*-aix* } { "-msoft-float" } { "" } } */ + +#include <stdlib.h> +#include <math.h> + +int +main() +{ + if (floor (0.1) != 0.) + abort (); + return 0; +} + +/* +It will result in 36028797018963968.000000 on Alpha RedHat Linux 6.0 +using glibc-2.1 at least on my 21064. This may result in g77 bug +reports concerning the INT() function, just so you know. + + Thanks, + Rick Niles. 
+*/ Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990827-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990827-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990827-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990827-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,21 @@ +unsigned test(unsigned one , unsigned bit) +{ + unsigned val= bit & 1; + unsigned zero= one >> 1; + + val++; + return zero + ( val>> 1 ); +} + +int main() +{ + if (test (1,0) != 0) + abort (); + if (test (1,1) != 1) + abort (); + if (test (1,65535) != 1) + abort (); + exit (0); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990829-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990829-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990829-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990829-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,15 @@ +double test (const double le, const double ri) +{ + double val = ( ri - le ) / ( ri * ( le + 1.0 ) ); + return val; +} + +int main () +{ + double retval; + + retval = test(1.0,2.0); + if (retval < 0.24 || retval > 0.26) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990923-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990923-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990923-1.c (added) +++ 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/990923-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,19 @@ +#define mask 0xffff0000L +#define value 0xabcd0000L + +long +foo (long x) +{ + if ((x & mask) == value) + return x & 0xffffL; + return 1; +} + +int +main (void) +{ + if (foo (value) != 0 || foo (0) != 1) + abort (); + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/991014-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/991014-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/991014-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/991014-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,53 @@ + +typedef __SIZE_TYPE__ Size_t; + +#if __SIZEOF_LONG__ < __SIZEOF_POINTER__ +#define bufsize ((1LL << (8 * sizeof(Size_t) - 2))-256) +#else +#define bufsize ((1L << (8 * sizeof(Size_t) - 2))-256) +#endif + +struct huge_struct +{ + short buf[bufsize]; + int a; + int b; + int c; + int d; +}; + +union huge_union +{ + int a; + char buf[bufsize]; +}; + +Size_t union_size() +{ + return sizeof(union huge_union); +} + +Size_t struct_size() +{ + return sizeof(struct huge_struct); +} + +Size_t struct_a_offset() +{ + return (Size_t)(&((struct huge_struct *) 0)->a); +} + +int main() +{ + /* Check the exact sizeof value. bufsize is aligned on 256b. 
*/ + if (union_size() != sizeof(char) * bufsize) + abort(); + + if (struct_size() != sizeof(short) * bufsize + 4*sizeof(int)) + abort(); + + if (struct_a_offset() < sizeof(short) * bufsize) + abort(); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/991016-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/991016-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/991016-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/991016-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,55 @@ +/* Two of these types will, on current gcc targets, have the same + mode but have different alias sets. DOIT tries to get gcse to + invalidly hoist one of the values out of the loop. */ + +typedef int T0; +typedef long T1; +typedef long long T2; + +int +doit(int sel, int n, void *p) +{ + T0 * const p0 = p; + T1 * const p1 = p; + T2 * const p2 = p; + + switch (sel) + { + case 0: + do + *p0 += *p0; + while (--n); + return *p0 == 0; + + case 1: + do + *p1 += *p1; + while (--n); + return *p1 == 0; + + case 2: + do + *p2 += *p2; + while (--n); + return *p2 == 0; + + default: + abort (); + } +} + +int +main() +{ + T0 v0; T1 v1; T2 v2; + + v0 = 1; doit(0, 5, &v0); + v1 = 1; doit(1, 5, &v1); + v2 = 1; doit(2, 5, &v2); + + if (v0 != 32) abort (); + if (v1 != 32) abort (); + if (v2 != 32) abort (); + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/991019-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/991019-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/991019-1.c (added) +++ 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/991019-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,28 @@ +typedef struct { + double epsilon; +} material_type; + +material_type foo(double x) +{ + material_type m; + + m.epsilon = 1.0 + x; + return m; +} + +main() +{ + int i; + material_type x; + + /* We must iterate enough times to overflow the FP stack on the + x86. */ + for (i = 0; i < 10; i++) + { + x = foo (1.0); + if (x.epsilon != 1.0 + 1.0) + abort (); + } + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/991023-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/991023-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/991023-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/991023-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,24 @@ + + +int blah; +foo() +{ + int i; + + for (i=0 ; i< 7 ; i++) + { + if (i == 7 - 1) + blah = 0xfcc; + else + blah = 0xfee; + } + return blah; +} + + +main() +{ + if (foo () != 0xfcc) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/991030-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/991030-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/991030-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/991030-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,7 @@ +double x = 0x1.fp1; +int main() +{ + if (x != 3.875) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/991112-1.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/991112-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/991112-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/991112-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,19 @@ +/* This code was miscompiled at -O3 on x86. + Reported by Jim Meyering; distilled from bash. */ + +int rl_show_char (int c) { return 0; } + +int rl_character_len (int c, int pos) +{ + return isprint (c) ? 1 : 2; +} + +int main(void) +{ + int (*x)(int, int) = rl_character_len; + if (x('a', 1) != 1) + abort(); + if (x('\002', 1) != 2) + abort(); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/991118-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/991118-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/991118-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/991118-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,79 @@ +struct tmp +{ + long long int pad : 12; + long long int field : 52; +}; + +struct tmp2 +{ + long long int field : 52; + long long int pad : 12; +}; + +struct tmp3 +{ + long long int pad : 11; + long long int field : 53; +}; + +struct tmp4 +{ + long long int field : 53; + long long int pad : 11; +}; + +struct tmp +sub (struct tmp tmp) +{ + tmp.field ^= 0x0008765412345678LL; + return tmp; +} + +struct tmp2 +sub2 (struct tmp2 tmp2) +{ + tmp2.field ^= 0x0008765412345678LL; + return tmp2; +} + +struct tmp3 +sub3 (struct tmp3 tmp3) +{ + tmp3.field ^= 0x0018765412345678LL; + return tmp3; +} + +struct tmp4 +sub4 (struct tmp4 tmp4) +{ + tmp4.field ^= 0x0018765412345678LL; + return tmp4; +} + +struct tmp 
tmp = {0x123, 0x123456789ABCDLL}; +struct tmp2 tmp2 = {0x123456789ABCDLL, 0x123}; +struct tmp3 tmp3 = {0x123, 0x1FFFF00000000LL}; +struct tmp4 tmp4 = {0x1FFFF00000000LL, 0x123}; + +main() +{ + + if (sizeof (long long) != 8) + exit (0); + + tmp = sub (tmp); + tmp2 = sub2 (tmp2); + + if (tmp.pad != 0x123 || tmp.field != 0xFFF9551175BDFDB5LL) + abort (); + if (tmp2.pad != 0x123 || tmp2.field != 0xFFF9551175BDFDB5LL) + abort (); + + tmp3 = sub3 (tmp3); + tmp4 = sub4 (tmp4); + if (tmp3.pad != 0x123 || tmp3.field != 0xFFF989AB12345678LL) + abort (); + if (tmp4.pad != 0x123 || tmp4.field != 0xFFF989AB12345678LL) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/991201-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/991201-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/991201-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/991201-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,39 @@ +struct vc_data { + unsigned long space; + unsigned char vc_palette[16*3]; +}; + +struct vc { + struct vc_data *d; +}; + +struct vc_data a_con; +struct vc vc_cons[63] = { &a_con }; +int default_red[16]; +int default_grn[16]; +int default_blu[16]; + +extern void bar(int); + +void reset_palette(int currcons) +{ + int j, k; + for (j=k=0; j<16; j++) { + (vc_cons[currcons].d->vc_palette) [k++] = default_red[j]; + (vc_cons[currcons].d->vc_palette) [k++] = default_grn[j]; + (vc_cons[currcons].d->vc_palette) [k++] = default_blu[j]; + } + bar(k); +} + +void bar(int k) +{ + if (k != 16*3) + abort(); +} + +int main() +{ + reset_palette(0); + exit(0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/991202-1.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/991202-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/991202-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/991202-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,15 @@ +int x, y; + +int +main() +{ + x = 2; + y = x; + do + { + x = y; + y = 2 * y; + } + while ( ! ((y - x) >= 20)); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/991202-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/991202-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/991202-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/991202-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,17 @@ + +int +f1 () +{ + unsigned long x, y = 1; + + x = ((y * 8192) - 216) % 16; + return x; +} + +int +main () +{ + if (f1 () != 8) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/991202-3.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/991202-3.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/991202-3.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/991202-3.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,22 @@ + +unsigned int f (unsigned int a) +{ + return a * 65536 / 8; +} + +unsigned int g (unsigned int a) +{ + return a * 65536; +} + +unsigned int h (unsigned int a) +{ + return a / 8; +} + +int main () +{ + if (f (65536) != h (g (65536))) + abort (); + exit (0); +} Added: 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/991216-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/991216-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/991216-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/991216-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,116 @@ +#define VALUE 0x123456789abcdefLL +#define AFTER 0x55 + +void +test1 (int a, long long value, int after) +{ + if (a != 1 + || value != VALUE + || after != AFTER) + abort (); +} + +void +test2 (int a, int b, long long value, int after) +{ + if (a != 1 + || b != 2 + || value != VALUE + || after != AFTER) + abort (); +} + +void +test3 (int a, int b, int c, long long value, int after) +{ + if (a != 1 + || b != 2 + || c != 3 + || value != VALUE + || after != AFTER) + abort (); +} + +void +test4 (int a, int b, int c, int d, long long value, int after) +{ + if (a != 1 + || b != 2 + || c != 3 + || d != 4 + || value != VALUE + || after != AFTER) + abort (); +} + +void +test5 (int a, int b, int c, int d, int e, long long value, int after) +{ + if (a != 1 + || b != 2 + || c != 3 + || d != 4 + || e != 5 + || value != VALUE + || after != AFTER) + abort (); +} + +void +test6 (int a, int b, int c, int d, int e, int f, long long value, int after) +{ + if (a != 1 + || b != 2 + || c != 3 + || d != 4 + || e != 5 + || f != 6 + || value != VALUE + || after != AFTER) + abort (); +} + +void +test7 (int a, int b, int c, int d, int e, int f, int g, long long value, int after) +{ + if (a != 1 + || b != 2 + || c != 3 + || d != 4 + || e != 5 + || f != 6 + || g != 7 + || value != VALUE + || after != AFTER) + abort (); +} + +void +test8 (int a, int b, int c, int d, int e, int f, int g, int h, long long value, int after) +{ + if (a != 1 + || b != 2 + || c != 3 + || d != 4 + || e != 5 + || f != 6 + || 
g != 7 + || h != 8 + || value != VALUE + || after != AFTER) + abort (); +} + +int +main () +{ + test1 (1, VALUE, AFTER); + test2 (1, 2, VALUE, AFTER); + test3 (1, 2, 3, VALUE, AFTER); + test4 (1, 2, 3, 4, VALUE, AFTER); + test5 (1, 2, 3, 4, 5, VALUE, AFTER); + test6 (1, 2, 3, 4, 5, 6, VALUE, AFTER); + test7 (1, 2, 3, 4, 5, 6, 7, VALUE, AFTER); + test8 (1, 2, 3, 4, 5, 6, 7, 8, VALUE, AFTER); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/991216-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/991216-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/991216-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/991216-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,40 @@ +#include <stdarg.h> + +#define VALUE 0x123456789abcdefLL +#define AFTER 0x55 + +void +test (int n, ...) 
+{ + va_list ap; + int i; + + va_start (ap, n); + for (i = 2; i <= n; i++) + { + if (va_arg (ap, int) != i) + abort (); + } + + if (va_arg (ap, long long) != VALUE) + abort (); + + if (va_arg (ap, int) != AFTER) + abort (); + + va_end (ap); +} + +int +main () +{ + test (1, VALUE, AFTER); + test (2, 2, VALUE, AFTER); + test (3, 2, 3, VALUE, AFTER); + test (4, 2, 3, 4, VALUE, AFTER); + test (5, 2, 3, 4, 5, VALUE, AFTER); + test (6, 2, 3, 4, 5, 6, VALUE, AFTER); + test (7, 2, 3, 4, 5, 6, 7, VALUE, AFTER); + test (8, 2, 3, 4, 5, 6, 7, 8, VALUE, AFTER); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/991216-4.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/991216-4.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/991216-4.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/991216-4.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,16 @@ +/* Test against a problem with loop reversal. 
*/ +static void bug(int size, int tries) +{ + int i; + int num = 0; + while (num < size) + { + for (i = 1; i < tries; i++) num++; + } +} + +int main() +{ + bug(5, 10); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/991221-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/991221-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/991221-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/991221-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,13 @@ +int main( void ) +{ + unsigned long totalsize = 80; + unsigned long msize = 64; + + if (sizeof(long) != 4) + exit(0); + + if ( totalsize > (2147483647L * 2UL + 1) + || (msize != 0 && ((msize - 1) > (2147483647L * 2UL + 1) ))) + abort(); + exit( 0 ); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/991227-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/991227-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/991227-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/991227-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,11 @@ +char* doit(int flag) +{ + return 1 + (flag ? 
"\0wrong\n" : "\0right\n"); +} +int main() +{ + char *result = doit(0); + if (*result == 'r' && result[1] == 'i') + exit(0); + abort(); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/991228-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/991228-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/991228-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/991228-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,25 @@ +__extension__ union { double d; int i[2]; } u = { d: -0.25 }; + +/* This assumes the endianness of words in a long long is the same as + that for doubles, which doesn't hold for a few platforms, but we + can probably special case them here, as appropriate. */ +long long endianness_test = 1; +#define MSW (*(int*)&endianness_test) + +int +signbit(double x) +{ + __extension__ union { double d; int i[2]; } u = { d: x }; + return u.i[MSW] < 0; +} + +int main(void) +{ + if (2*sizeof(int) != sizeof(double) || u.i[MSW] >= 0) + exit(0); + + if (!signbit(-0.25)) + abort(); + + exit(0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/alias-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/alias-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/alias-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/alias-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,18 @@ +int val; + +int *ptr = &val; +float *ptr2 = &val; + +__attribute__((optimize ("-fno-strict-aliasing"))) +typepun () +{ + *ptr2=0; +} + +main() +{ + *ptr=1; + typepun (); + if (*ptr) + __builtin_abort (); +} Added: 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/alias-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/alias-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/alias-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/alias-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,12 @@ +/* { dg-require-alias "" } */ +int a[10]={}; +extern int b[10] __attribute__ ((alias("a"))); +int off; +main() +{ + b[off]=1; + a[off]=2; + if (b[off]!=2) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/alias-3.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/alias-3.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/alias-3.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/alias-3.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,18 @@ +/* { dg-require-alias "" } */ +/* { dg-skip-if "" { powerpc-ibm-aix* } } */ +static int a=0; +extern int b __attribute__ ((alias("a"))); +__attribute__ ((noinline)) +static inc() +{ + b++; +} +int +main() +{ + a=0; + inc (); + if (a!=1) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/alias-4.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/alias-4.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/alias-4.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/alias-4.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,19 @@ +/* { dg-require-alias 
"" } */ +int a = 1; +extern int b __attribute__ ((alias ("a"))); +int c = 1; +extern int d __attribute__ ((alias ("c"))); +main (int argc) +{ + int *p; + int *q; + if (argc) + p = &a, q = &b; + else + p = &c, q = &d; + *p = 1; + *q = 2; + if (*p == 1) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/alias-access-path-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/alias-access-path-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/alias-access-path-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/alias-access-path-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,19 @@ +/* Test that variable + int val; + may hold value of tyope "struct c" which has same size. + This is valid in GIMPLE memory model. */ + +struct a {int val;} a={1},a2; +struct b {struct a a;}; +int val; +struct c {struct b b;} *cptr=(void *)&val; + +int +main(void) +{ + cptr->b.a=a; + val = 2; + a2=cptr->b.a; + if (a2.val == a.val) + __builtin_abort (); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/align-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/align-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/align-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/align-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,9 @@ +typedef int new_int __attribute__ ((aligned(16))); +struct S { int x; }; + +int main() +{ + if (sizeof(struct S) != sizeof(int)) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/align-2.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/align-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/align-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/align-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,50 @@ +/* Simple alignment checks; + looking for compiler/assembler alignment disagreements, + agreement between struct initialization and access. */ +struct a_short { char c; short s; } s_c_s = { 'a', 13 }; +struct a_int { char c ; int i; } s_c_i = { 'b', 14 }; +struct b_int { short s; int i; } s_s_i = { 15, 16 }; +struct a_float { char c; float f; } s_c_f = { 'c', 17.0 }; +struct b_float { short s; float f; } s_s_f = { 18, 19.0 }; +struct a_double { char c; double d; } s_c_d = { 'd', 20.0 }; +struct b_double { short s; double d; } s_s_d = { 21, 22.0 }; +struct c_double { int i; double d; } s_i_d = { 23, 24.0 }; +struct d_double { float f; double d; } s_f_d = { 25.0, 26.0 }; +struct a_ldouble { char c; long double ld; } s_c_ld = { 'e', 27.0 }; +struct b_ldouble { short s; long double ld; } s_s_ld = { 28, 29.0 }; +struct c_ldouble { int i; long double ld; } s_i_ld = { 30, 31.0 }; +struct d_ldouble { float f; long double ld; } s_f_ld = { 32.0, 33.0 }; +struct e_ldouble { double d; long double ld; } s_d_ld = { 34.0, 35.0 }; + +int main () +{ + if (s_c_s.c != 'a') abort (); + if (s_c_s.s != 13) abort (); + if (s_c_i.c != 'b') abort (); + if (s_c_i.i != 14) abort (); + if (s_s_i.s != 15) abort (); + if (s_s_i.i != 16) abort (); + if (s_c_f.c != 'c') abort (); + if (s_c_f.f != 17.0) abort (); + if (s_s_f.s != 18) abort (); + if (s_s_f.f != 19.0) abort (); + if (s_c_d.c != 'd') abort (); + if (s_c_d.d != 20.0) abort (); + if (s_s_d.s != 21) abort (); + if (s_s_d.d != 22.0) abort (); + if (s_i_d.i != 23) abort (); + if (s_i_d.d != 24.0) abort (); + if (s_f_d.f != 
25.0) abort (); + if (s_f_d.d != 26.0) abort (); + if (s_c_ld.c != 'e') abort (); + if (s_c_ld.ld != 27.0) abort (); + if (s_s_ld.s != 28) abort (); + if (s_s_ld.ld != 29.0) abort (); + if (s_i_ld.i != 30) abort (); + if (s_i_ld.ld != 31.0) abort (); + if (s_f_ld.f != 32.0) abort (); + if (s_f_ld.ld != 33.0) abort (); + if (s_d_ld.d != 34.0) abort (); + if (s_d_ld.ld != 35.0) abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/align-3.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/align-3.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/align-3.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/align-3.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,14 @@ +/* { dg-skip-if "small alignment" { pdp11-*-* } } */ + +void func(void) __attribute__((aligned(256))); + +void func(void) +{ +} + +int main() +{ + if (__alignof__(func) != 256) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/align-nest.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/align-nest.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/align-nest.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/align-nest.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,28 @@ +/* { dg-skip-if "requires alloca" { ! 
alloca } { "-O0" } { "" } } */ + +void foo(int n) +{ + typedef struct + { + int value; + } myint; + + struct S + { + int i[n]; + unsigned int b:1; + myint mi; + } __attribute__ ((packed)) __attribute__ ((aligned (4))); + + struct S s[2]; + int k; + + for (k = 0; k < 2; k ++) + s[k].mi.value = 0; +} + +int main () +{ + foo (2); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/alloca-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/alloca-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/alloca-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/alloca-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,22 @@ +/* { dg-skip-if "requires alloca" { ! alloca } { "-O0" } { "" } } */ +/* Verify that alloca storage is sufficiently aligned. */ +/* ??? May fail if BIGGEST_ALIGNMENT > STACK_BOUNDARY. Which, I guess + can only happen on !STRICT_ALIGNMENT targets. */ + +typedef __SIZE_TYPE__ size_t; + +struct dummy { int x __attribute__((aligned)); }; +#define BIGGEST_ALIGNMENT __alignof__(struct dummy) + +_Bool foo(void) +{ + char *p = __builtin_alloca(32); + return ((size_t)p & (BIGGEST_ALIGNMENT - 1)) == 0; +} + +int main() +{ + if (!foo()) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/anon-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/anon-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/anon-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/anon-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,29 @@ +/* Copyright (C) 2001 Free Software Foundation, Inc. 
*/ + +/* Source: Neil Booth, 4 Nov 2001, derived from PR 2820 - field lookup in + nested anonymous entities was broken. */ + +struct +{ + int x; + struct + { + int a; + union + { + int b; + }; + }; +} foo; + +int +main(int argc, char *argv[]) +{ + foo.b = 6; + foo.a = 5; + + if (foo.b != 6) + abort (); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/arith-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/arith-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/arith-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/arith-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,47 @@ +unsigned +sat_add (unsigned i) +{ + unsigned ret = i + 1; + if (ret < i) + ret = i; + return ret; +} + +unsigned +sat_add2 (unsigned i) +{ + unsigned ret = i + 1; + if (ret > i) + return ret; + return i; +} + +unsigned +sat_add3 (unsigned i) +{ + unsigned ret = i - 1; + if (ret > i) + ret = i; + return ret; +} + +unsigned +sat_add4 (unsigned i) +{ + unsigned ret = i - 1; + if (ret < i) + return ret; + return i; +} +main () +{ + if (sat_add (~0U) != ~0U) + abort (); + if (sat_add2 (~0U) != ~0U) + abort (); + if (sat_add3 (0U) != 0U) + abort (); + if (sat_add4 (0U) != 0U) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/arith-rand-ll.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/arith-rand-ll.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/arith-rand-ll.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/arith-rand-ll.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,114 @@ +long long +simple_rand () +{ + static 
unsigned long long seed = 47114711; + unsigned long long this = seed * 1103515245 + 12345; + seed = this; + return this >> 8; +} + +unsigned long long int +random_bitstring () +{ + unsigned long long int x; + int n_bits; + long long ran; + int tot_bits = 0; + + x = 0; + for (;;) + { + ran = simple_rand (); + n_bits = (ran >> 1) % 16; + tot_bits += n_bits; + + if (n_bits == 0) + return x; + else + { + x <<= n_bits; + if (ran & 1) + x |= (1 << n_bits) - 1; + + if (tot_bits > 8 * sizeof (long long) + 6) + return x; + } + } +} + +#define ABS(x) ((x) >= 0 ? (x) : -(x)) + +main () +{ + long long int i; + + for (i = 0; i < 10000; i++) + { + unsigned long long x, y; + x = random_bitstring (); + y = random_bitstring (); + + if (sizeof (int) == sizeof (long long)) + goto save_time; + + { unsigned long long xx = x, yy = y, r1, r2; + if (yy == 0) continue; + r1 = xx / yy; + r2 = xx % yy; + if (r2 >= yy || r1 * yy + r2 != xx) + abort (); + } + { signed long long xx = x, yy = y, r1, r2; + if ((unsigned long long) xx << 1 == 0 && yy == -1) + continue; + r1 = xx / yy; + r2 = xx % yy; + if (ABS (r2) >= (unsigned long long) ABS (yy) || (signed long long) (r1 * yy + r2) != xx) + abort (); + } + save_time: + { unsigned int xx = x, yy = y, r1, r2; + if (yy == 0) continue; + r1 = xx / yy; + r2 = xx % yy; + if (r2 >= yy || r1 * yy + r2 != xx) + abort (); + } + { signed int xx = x, yy = y, r1, r2; + if ((unsigned int) xx << 1 == 0 && yy == -1) + continue; + r1 = xx / yy; + r2 = xx % yy; + if (ABS (r2) >= (unsigned int) ABS (yy) || (signed int) (r1 * yy + r2) != xx || ((xx < 0) != (r2 < 0) && r2)) + abort (); + } + { unsigned short xx = x, yy = y, r1, r2; + if (yy == 0) continue; + r1 = xx / yy; + r2 = xx % yy; + if (r2 >= yy || r1 * yy + r2 != xx) + abort (); + } + { signed short xx = x, yy = y, r1, r2; + r1 = xx / yy; + r2 = xx % yy; + if (ABS (r2) >= (unsigned short) ABS (yy) || (signed short) (r1 * yy + r2) != xx) + abort (); + } + { unsigned char xx = x, yy = y, r1, r2; + if (yy == 0) 
continue; + r1 = xx / yy; + r2 = xx % yy; + if (r2 >= yy || r1 * yy + r2 != xx) + abort (); + } + { signed char xx = x, yy = y, r1, r2; + r1 = xx / yy; + r2 = xx % yy; + if (ABS (r2) >= (unsigned char) ABS (yy) || (signed char) (r1 * yy + r2) != xx) + abort (); + } + } + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/arith-rand.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/arith-rand.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/arith-rand.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/arith-rand.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,114 @@ +long +simple_rand () +{ + static unsigned long seed = 47114711; + unsigned long this = seed * 1103515245 + 12345; + seed = this; + return this >> 8; +} + +unsigned long int +random_bitstring () +{ + unsigned long int x; + int n_bits; + long ran; + int tot_bits = 0; + + x = 0; + for (;;) + { + ran = simple_rand (); + n_bits = (ran >> 1) % 16; + tot_bits += n_bits; + + if (n_bits == 0) + return x; + else + { + x <<= n_bits; + if (ran & 1) + x |= (1 << n_bits) - 1; + + if (tot_bits > 8 * sizeof (long) + 6) + return x; + } + } +} + +#define ABS(x) ((x) >= 0 ? 
(x) : -(x)) + +main () +{ + long int i; + + for (i = 0; i < 1000; i++) + { + unsigned long x, y; + x = random_bitstring (); + y = random_bitstring (); + + if (sizeof (int) == sizeof (long)) + goto save_time; + + { unsigned long xx = x, yy = y, r1, r2; + if (yy == 0) continue; + r1 = xx / yy; + r2 = xx % yy; + if (r2 >= yy || r1 * yy + r2 != xx) + abort (); + } + { signed long xx = x, yy = y, r1, r2; + if ((unsigned long) xx << 1 == 0 && yy == -1) + continue; + r1 = xx / yy; + r2 = xx % yy; + if (ABS (r2) >= (unsigned long) ABS (yy) || (signed long) (r1 * yy + r2) != xx) + abort (); + } + save_time: + { unsigned int xx = x, yy = y, r1, r2; + if (yy == 0) continue; + r1 = xx / yy; + r2 = xx % yy; + if (r2 >= yy || r1 * yy + r2 != xx) + abort (); + } + { signed int xx = x, yy = y, r1, r2; + if ((unsigned int) xx << 1 == 0 && yy == -1) + continue; + r1 = xx / yy; + r2 = xx % yy; + if (ABS (r2) >= (unsigned int) ABS (yy) || (signed int) (r1 * yy + r2) != xx) + abort (); + } + { unsigned short xx = x, yy = y, r1, r2; + if (yy == 0) continue; + r1 = xx / yy; + r2 = xx % yy; + if (r2 >= yy || r1 * yy + r2 != xx) + abort (); + } + { signed short xx = x, yy = y, r1, r2; + r1 = xx / yy; + r2 = xx % yy; + if (ABS (r2) >= (unsigned short) ABS (yy) || (signed short) (r1 * yy + r2) != xx) + abort (); + } + { unsigned char xx = x, yy = y, r1, r2; + if (yy == 0) continue; + r1 = xx / yy; + r2 = xx % yy; + if (r2 >= yy || r1 * yy + r2 != xx) + abort (); + } + { signed char xx = x, yy = y, r1, r2; + r1 = xx / yy; + r2 = xx % yy; + if (ABS (r2) >= (unsigned char) ABS (yy) || (signed char) (r1 * yy + r2) != xx) + abort (); + } + } + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ashldi-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ashldi-1.c?rev=374156&view=auto ============================================================================== --- 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ashldi-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ashldi-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,221 @@ +#include + +extern void abort(void); +extern void exit(int); + +#if __LONG_LONG_MAX__ == 9223372036854775807LL +#define BITS 64 + +static unsigned long long const data[64] = { + 0x123456789abcdefULL, + 0x2468acf13579bdeULL, + 0x48d159e26af37bcULL, + 0x91a2b3c4d5e6f78ULL, + 0x123456789abcdef0ULL, + 0x2468acf13579bde0ULL, + 0x48d159e26af37bc0ULL, + 0x91a2b3c4d5e6f780ULL, + 0x23456789abcdef00ULL, + 0x468acf13579bde00ULL, + 0x8d159e26af37bc00ULL, + 0x1a2b3c4d5e6f7800ULL, + 0x3456789abcdef000ULL, + 0x68acf13579bde000ULL, + 0xd159e26af37bc000ULL, + 0xa2b3c4d5e6f78000ULL, + 0x456789abcdef0000ULL, + 0x8acf13579bde0000ULL, + 0x159e26af37bc0000ULL, + 0x2b3c4d5e6f780000ULL, + 0x56789abcdef00000ULL, + 0xacf13579bde00000ULL, + 0x59e26af37bc00000ULL, + 0xb3c4d5e6f7800000ULL, + 0x6789abcdef000000ULL, + 0xcf13579bde000000ULL, + 0x9e26af37bc000000ULL, + 0x3c4d5e6f78000000ULL, + 0x789abcdef0000000ULL, + 0xf13579bde0000000ULL, + 0xe26af37bc0000000ULL, + 0xc4d5e6f780000000ULL, + 0x89abcdef00000000ULL, + 0x13579bde00000000ULL, + 0x26af37bc00000000ULL, + 0x4d5e6f7800000000ULL, + 0x9abcdef000000000ULL, + 0x3579bde000000000ULL, + 0x6af37bc000000000ULL, + 0xd5e6f78000000000ULL, + 0xabcdef0000000000ULL, + 0x579bde0000000000ULL, + 0xaf37bc0000000000ULL, + 0x5e6f780000000000ULL, + 0xbcdef00000000000ULL, + 0x79bde00000000000ULL, + 0xf37bc00000000000ULL, + 0xe6f7800000000000ULL, + 0xcdef000000000000ULL, + 0x9bde000000000000ULL, + 0x37bc000000000000ULL, + 0x6f78000000000000ULL, + 0xdef0000000000000ULL, + 0xbde0000000000000ULL, + 0x7bc0000000000000ULL, + 0xf780000000000000ULL, + 0xef00000000000000ULL, + 0xde00000000000000ULL, + 0xbc00000000000000ULL, + 0x7800000000000000ULL, + 0xf000000000000000ULL, + 0xe000000000000000ULL, + 0xc000000000000000ULL, + 0x8000000000000000ULL +}; + +#elif 
__LONG_LONG_MAX__ == 2147483647LL +#define BITS 32 + +static unsigned long long const data[32] = { + 0x1234567fULL, + 0x2468acfeULL, + 0x48d159fcULL, + 0x91a2b3f8ULL, + 0x234567f0ULL, + 0x468acfe0ULL, + 0x8d159fc0ULL, + 0x1a2b3f80ULL, + 0x34567f00ULL, + 0x68acfe00ULL, + 0xd159fc00ULL, + 0xa2b3f800ULL, + 0x4567f000ULL, + 0x8acfe000ULL, + 0x159fc000ULL, + 0x2b3f8000ULL, + 0x567f0000ULL, + 0xacfe0000ULL, + 0x59fc0000ULL, + 0xb3f80000ULL, + 0x67f00000ULL, + 0xcfe00000ULL, + 0x9fc00000ULL, + 0x3f800000ULL, + 0x7f000000ULL, + 0xfe000000ULL, + 0xfc000000ULL, + 0xf8000000ULL, + 0xf0000000ULL, + 0xe0000000ULL, + 0xc0000000ULL, + 0x80000000ULL +}; + +#else +#error "Update the test case." +#endif + +static unsigned long long +variable_shift(unsigned long long x, int i) +{ + return x << i; +} + +static unsigned long long +constant_shift(unsigned long long x, int i) +{ + switch (i) + { + case 0: x = x << 0; break; + case 1: x = x << 1; break; + case 2: x = x << 2; break; + case 3: x = x << 3; break; + case 4: x = x << 4; break; + case 5: x = x << 5; break; + case 6: x = x << 6; break; + case 7: x = x << 7; break; + case 8: x = x << 8; break; + case 9: x = x << 9; break; + case 10: x = x << 10; break; + case 11: x = x << 11; break; + case 12: x = x << 12; break; + case 13: x = x << 13; break; + case 14: x = x << 14; break; + case 15: x = x << 15; break; + case 16: x = x << 16; break; + case 17: x = x << 17; break; + case 18: x = x << 18; break; + case 19: x = x << 19; break; + case 20: x = x << 20; break; + case 21: x = x << 21; break; + case 22: x = x << 22; break; + case 23: x = x << 23; break; + case 24: x = x << 24; break; + case 25: x = x << 25; break; + case 26: x = x << 26; break; + case 27: x = x << 27; break; + case 28: x = x << 28; break; + case 29: x = x << 29; break; + case 30: x = x << 30; break; + case 31: x = x << 31; break; +#if BITS > 32 + case 32: x = x << 32; break; + case 33: x = x << 33; break; + case 34: x = x << 34; break; + case 35: x = x << 35; break; + 
case 36: x = x << 36; break; + case 37: x = x << 37; break; + case 38: x = x << 38; break; + case 39: x = x << 39; break; + case 40: x = x << 40; break; + case 41: x = x << 41; break; + case 42: x = x << 42; break; + case 43: x = x << 43; break; + case 44: x = x << 44; break; + case 45: x = x << 45; break; + case 46: x = x << 46; break; + case 47: x = x << 47; break; + case 48: x = x << 48; break; + case 49: x = x << 49; break; + case 50: x = x << 50; break; + case 51: x = x << 51; break; + case 52: x = x << 52; break; + case 53: x = x << 53; break; + case 54: x = x << 54; break; + case 55: x = x << 55; break; + case 56: x = x << 56; break; + case 57: x = x << 57; break; + case 58: x = x << 58; break; + case 59: x = x << 59; break; + case 60: x = x << 60; break; + case 61: x = x << 61; break; + case 62: x = x << 62; break; + case 63: x = x << 63; break; +#endif + + default: + abort (); + } + return x; +} + +int +main() +{ + int i; + + for (i = 0; i < BITS; ++i) + { + unsigned long long y = variable_shift (data[0], i); + if (y != data[i]) + abort (); + } + for (i = 0; i < BITS; ++i) + { + unsigned long long y = constant_shift (data[0], i); + if (y != data[i]) + abort (); + } + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ashrdi-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ashrdi-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ashrdi-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ashrdi-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,335 @@ +#include + +extern void abort(void); +extern void exit(int); + +#if __LONG_LONG_MAX__ == 9223372036854775807LL +#define BITS 64 + +static long long const zext[64] = { + 0x7654321fedcba980LL, + 0x3b2a190ff6e5d4c0LL, + 0x1d950c87fb72ea60LL, + 0xeca8643fdb97530LL, + 
0x7654321fedcba98LL, + 0x3b2a190ff6e5d4cLL, + 0x1d950c87fb72ea6LL, + 0xeca8643fdb9753LL, + 0x7654321fedcba9LL, + 0x3b2a190ff6e5d4LL, + 0x1d950c87fb72eaLL, + 0xeca8643fdb975LL, + 0x7654321fedcbaLL, + 0x3b2a190ff6e5dLL, + 0x1d950c87fb72eLL, + 0xeca8643fdb97LL, + 0x7654321fedcbLL, + 0x3b2a190ff6e5LL, + 0x1d950c87fb72LL, + 0xeca8643fdb9LL, + 0x7654321fedcLL, + 0x3b2a190ff6eLL, + 0x1d950c87fb7LL, + 0xeca8643fdbLL, + 0x7654321fedLL, + 0x3b2a190ff6LL, + 0x1d950c87fbLL, + 0xeca8643fdLL, + 0x7654321feLL, + 0x3b2a190ffLL, + 0x1d950c87fLL, + 0xeca8643fLL, + 0x7654321fLL, + 0x3b2a190fLL, + 0x1d950c87LL, + 0xeca8643LL, + 0x7654321LL, + 0x3b2a190LL, + 0x1d950c8LL, + 0xeca864LL, + 0x765432LL, + 0x3b2a19LL, + 0x1d950cLL, + 0xeca86LL, + 0x76543LL, + 0x3b2a1LL, + 0x1d950LL, + 0xeca8LL, + 0x7654LL, + 0x3b2aLL, + 0x1d95LL, + 0xecaLL, + 0x765LL, + 0x3b2LL, + 0x1d9LL, + 0xecLL, + 0x76LL, + 0x3bLL, + 0x1dLL, + 0xeLL, + 0x7LL, + 0x3LL, + 0x1LL, + 0LL +}; + +static long long const sext[64] = { + 0x8edcba9f76543210LL, + 0xc76e5d4fbb2a1908LL, + 0xe3b72ea7dd950c84LL, + 0xf1db9753eeca8642LL, + 0xf8edcba9f7654321LL, + 0xfc76e5d4fbb2a190LL, + 0xfe3b72ea7dd950c8LL, + 0xff1db9753eeca864LL, + 0xff8edcba9f765432LL, + 0xffc76e5d4fbb2a19LL, + 0xffe3b72ea7dd950cLL, + 0xfff1db9753eeca86LL, + 0xfff8edcba9f76543LL, + 0xfffc76e5d4fbb2a1LL, + 0xfffe3b72ea7dd950LL, + 0xffff1db9753eeca8LL, + 0xffff8edcba9f7654LL, + 0xffffc76e5d4fbb2aLL, + 0xffffe3b72ea7dd95LL, + 0xfffff1db9753eecaLL, + 0xfffff8edcba9f765LL, + 0xfffffc76e5d4fbb2LL, + 0xfffffe3b72ea7dd9LL, + 0xffffff1db9753eecLL, + 0xffffff8edcba9f76LL, + 0xffffffc76e5d4fbbLL, + 0xffffffe3b72ea7ddLL, + 0xfffffff1db9753eeLL, + 0xfffffff8edcba9f7LL, + 0xfffffffc76e5d4fbLL, + 0xfffffffe3b72ea7dLL, + 0xffffffff1db9753eLL, + 0xffffffff8edcba9fLL, + 0xffffffffc76e5d4fLL, + 0xffffffffe3b72ea7LL, + 0xfffffffff1db9753LL, + 0xfffffffff8edcba9LL, + 0xfffffffffc76e5d4LL, + 0xfffffffffe3b72eaLL, + 0xffffffffff1db975LL, + 0xffffffffff8edcbaLL, + 0xffffffffffc76e5dLL, + 
0xffffffffffe3b72eLL, + 0xfffffffffff1db97LL, + 0xfffffffffff8edcbLL, + 0xfffffffffffc76e5LL, + 0xfffffffffffe3b72LL, + 0xffffffffffff1db9LL, + 0xffffffffffff8edcLL, + 0xffffffffffffc76eLL, + 0xffffffffffffe3b7LL, + 0xfffffffffffff1dbLL, + 0xfffffffffffff8edLL, + 0xfffffffffffffc76LL, + 0xfffffffffffffe3bLL, + 0xffffffffffffff1dLL, + 0xffffffffffffff8eLL, + 0xffffffffffffffc7LL, + 0xffffffffffffffe3LL, + 0xfffffffffffffff1LL, + 0xfffffffffffffff8LL, + 0xfffffffffffffffcLL, + 0xfffffffffffffffeLL, + 0xffffffffffffffffLL +}; + +#elif __LONG_LONG_MAX__ == 2147483647LL +#define BITS 32 + +static long long const zext[32] = { + 0x76543218LL, + 0x3b2a190cLL, + 0x1d950c86LL, + 0xeca8643LL, + 0x7654321LL, + 0x3b2a190LL, + 0x1d950c8LL, + 0xeca864LL, + 0x765432LL, + 0x3b2a19LL, + 0x1d950cLL, + 0xeca86LL, + 0x76543LL, + 0x3b2a1LL, + 0x1d950LL, + 0xeca8LL, + 0x7654LL, + 0x3b2aLL, + 0x1d95LL, + 0xecaLL, + 0x765LL, + 0x3b2LL, + 0x1d9LL, + 0xecLL, + 0x76LL, + 0x3bLL, + 0x1dLL, + 0xeLL, + 0x7LL, + 0x3LL, + 0x1LL, + 0LL +}; + +static long long const sext[64] = { + 0x87654321LL, + 0xc3b2a190LL, + 0xe1d950c8LL, + 0xf0eca864LL, + 0xf8765432LL, + 0xfc3b2a19LL, + 0xfe1d950cLL, + 0xff0eca86LL, + 0xff876543LL, + 0xffc3b2a1LL, + 0xffe1d950LL, + 0xfff0eca8LL, + 0xfff87654LL, + 0xfffc3b2aLL, + 0xfffe1d95LL, + 0xffff0ecaLL, + 0xffff8765LL, + 0xffffc3b2LL, + 0xffffe1d9LL, + 0xfffff0ecLL, + 0xfffff876LL, + 0xfffffc3bLL, + 0xfffffe1dLL, + 0xffffff0eLL, + 0xffffff87LL, + 0xffffffc3LL, + 0xffffffe1LL, + 0xfffffff0LL, + 0xfffffff8LL, + 0xfffffffcLL, + 0xfffffffeLL, + 0xffffffffLL +}; + +#else +#error "Update the test case." 
+#endif + +static long long +variable_shift(long long x, int i) +{ + return x >> i; +} + +static long long +constant_shift(long long x, int i) +{ + switch (i) + { + case 0: x = x >> 0; break; + case 1: x = x >> 1; break; + case 2: x = x >> 2; break; + case 3: x = x >> 3; break; + case 4: x = x >> 4; break; + case 5: x = x >> 5; break; + case 6: x = x >> 6; break; + case 7: x = x >> 7; break; + case 8: x = x >> 8; break; + case 9: x = x >> 9; break; + case 10: x = x >> 10; break; + case 11: x = x >> 11; break; + case 12: x = x >> 12; break; + case 13: x = x >> 13; break; + case 14: x = x >> 14; break; + case 15: x = x >> 15; break; + case 16: x = x >> 16; break; + case 17: x = x >> 17; break; + case 18: x = x >> 18; break; + case 19: x = x >> 19; break; + case 20: x = x >> 20; break; + case 21: x = x >> 21; break; + case 22: x = x >> 22; break; + case 23: x = x >> 23; break; + case 24: x = x >> 24; break; + case 25: x = x >> 25; break; + case 26: x = x >> 26; break; + case 27: x = x >> 27; break; + case 28: x = x >> 28; break; + case 29: x = x >> 29; break; + case 30: x = x >> 30; break; + case 31: x = x >> 31; break; +#if BITS > 32 + case 32: x = x >> 32; break; + case 33: x = x >> 33; break; + case 34: x = x >> 34; break; + case 35: x = x >> 35; break; + case 36: x = x >> 36; break; + case 37: x = x >> 37; break; + case 38: x = x >> 38; break; + case 39: x = x >> 39; break; + case 40: x = x >> 40; break; + case 41: x = x >> 41; break; + case 42: x = x >> 42; break; + case 43: x = x >> 43; break; + case 44: x = x >> 44; break; + case 45: x = x >> 45; break; + case 46: x = x >> 46; break; + case 47: x = x >> 47; break; + case 48: x = x >> 48; break; + case 49: x = x >> 49; break; + case 50: x = x >> 50; break; + case 51: x = x >> 51; break; + case 52: x = x >> 52; break; + case 53: x = x >> 53; break; + case 54: x = x >> 54; break; + case 55: x = x >> 55; break; + case 56: x = x >> 56; break; + case 57: x = x >> 57; break; + case 58: x = x >> 58; break; + case 59: x 
= x >> 59; break; + case 60: x = x >> 60; break; + case 61: x = x >> 61; break; + case 62: x = x >> 62; break; + case 63: x = x >> 63; break; +#endif + + default: + abort (); + } + return x; +} + +int +main() +{ + int i; + + for (i = 0; i < BITS; ++i) + { + long long y = variable_shift (zext[0], i); + if (y != zext[i]) + abort (); + } + for (i = 0; i < BITS; ++i) + { + long long y = variable_shift (sext[0], i); + if (y != sext[i]) + abort (); + } + for (i = 0; i < BITS; ++i) + { + long long y = constant_shift (zext[0], i); + if (y != zext[i]) + abort (); + } + for (i = 0; i < BITS; ++i) + { + long long y = constant_shift (sext[0], i); + if (y != sext[i]) + abort (); + } + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/bcp-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/bcp-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/bcp-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/bcp-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,96 @@ +/* { dg-options "-fgnu89-inline" } */ + +extern void abort (void); +extern void exit (int); + +__attribute__ ((externally_visible)) int global; +int func(void); + +/* These must fail. 
*/ +int bad0(void) { return __builtin_constant_p(global); } +int bad1(void) { return __builtin_constant_p(global++); } +inline int bad2(int x) { return __builtin_constant_p(x++); } +inline int bad3(int x) { return __builtin_constant_p(x); } +inline int bad4(const char *x) { return __builtin_constant_p(x); } +int bad5(void) { return bad2(1); } +inline int bad6(int x) { return __builtin_constant_p(x+1); } +int bad7(void) { return __builtin_constant_p(func()); } +int bad8(void) { char buf[10]; return __builtin_constant_p(buf); } +int bad9(const char *x) { return __builtin_constant_p(x[123456]); } +int bad10(void) { return __builtin_constant_p(&global); } + +/* These must pass, or we've broken gcc2 functionality. */ +int good0(void) { return __builtin_constant_p(1); } +int good1(void) { return __builtin_constant_p("hi"); } +int good2(void) { return __builtin_constant_p((1234 + 45) & ~7); } + +/* These are extensions to gcc2. Failure indicates an optimization + regression. */ +int opt0(void) { return bad3(1); } +int opt1(void) { return bad6(1); } +int opt2(void) { return __builtin_constant_p("hi"[0]); } + +/* + * Opt3 is known to fail. It is one of the important cases that glibc + * was interested in though, so keep this around as a reminder. + * + * The solution is to add bits to recover bytes from constant pool + * elements given nothing but a constant pool label and an offset. + * When we can do that, and we can simplify strlen after the fact, + * then we can enable recognition of constant pool labels as constants. + */ + +/* int opt3(void) { return bad4("hi"); } */ + + +/* Call through tables so -finline-functions can't screw with us. 
*/ +int (* volatile bad_t0[])(void) = { + bad0, bad1, bad5, bad7, bad8, bad10 +}; + +int (* volatile bad_t1[])(int x) = { + bad2, bad3, bad6 +}; + +int (* volatile bad_t2[])(const char *x) = { + bad4, bad9 +}; + +int (* volatile good_t0[])(void) = { + good0, good1, good2 +}; + +int (* volatile opt_t0[])(void) = { + opt0, opt1, opt2 /* , opt3 */ +}; + +#define N(arr) (sizeof(arr)/sizeof(*arr)) + +int main() +{ + int i; + + for (i = 0; i < N(bad_t0); ++i) + if ((*bad_t0[i])()) + abort(); + + for (i = 0; i < N(bad_t1); ++i) + if ((*bad_t1[i])(1)) + abort(); + + for (i = 0; i < N(bad_t2); ++i) + if ((*bad_t2[i])("hi")) + abort(); + + for (i = 0; i < N(good_t0); ++i) + if (! (*good_t0[i])()) + abort(); + +#ifdef __OPTIMIZE__ + for (i = 0; i < N(opt_t0); ++i) + if (! (*opt_t0[i])()) + abort(); +#endif + + exit(0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/bf-layout-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/bf-layout-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/bf-layout-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/bf-layout-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,9 @@ +struct { long f8:8; long f24:24; } a; +struct { long f32:32; } b; + +main () +{ + if (sizeof (a) != sizeof (b)) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/bf-pack-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/bf-pack-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/bf-pack-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/bf-pack-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,23 @@ 
+struct foo +{ + unsigned half:16; + unsigned long whole:32 __attribute__ ((packed)); +}; + +f (struct foo *q) +{ + if (q->half != 0x1234) + abort (); + if (q->whole != 0x56789abcL) + abort (); +} + +main () +{ + struct foo bar; + + bar.half = 0x1234; + bar.whole = 0x56789abcL; + f (&bar); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/bf-sign-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/bf-sign-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/bf-sign-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/bf-sign-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,18 @@ +main () +{ + struct { + signed int s:3; + unsigned int u:3; + int i:3; + } x = {-1, -1, -1}; + + if (x.u != 7) + abort (); + if (x.s != - 1) + abort (); + + if (x.i != -1 && x.i != 7) + abort (); + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/bf-sign-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/bf-sign-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/bf-sign-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/bf-sign-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,66 @@ +/* + This test checks promotion of bitfields. Bitfields should be promoted + very much like chars and shorts: + + Bitfields (signed or unsigned) should be promoted to signed int if their + value will fit in a signed int, otherwise to an unsigned int if their + value will fit in an unsigned int, otherwise we don't promote them (ANSI/ISO + does not specify the behavior of bitfields larger than an unsigned int). 
+ + We test the behavior by subtracting two from the promoted value: this will + result in a negative value for signed types, a positive value for unsigned + types. This test (of course) assumes that the compiler is correctly + implementing signed and unsigned arithmetic. + */ + +struct X { + unsigned int u3:3; + signed long int s31:31; + signed long int s32:32; + unsigned long int u31:31; + unsigned long int u32:32; + unsigned long long ull3 :3; + unsigned long long ull35:35; + unsigned u15:15; +}; + +struct X x; + +main () +{ + if ((x.u3 - 2) >= 0) /* promoted value should be signed */ + abort (); + + if ((x.s31 - 2) >= 0) /* promoted value should be signed */ + abort (); + + if ((x.s32 - 2) >= 0) /* promoted value should be signed */ + abort (); + + if ((x.u15 - 2) >= 0) /* promoted value should be signed */ + abort (); + + /* Conditionalize check on whether integers are 4 bytes or larger, i.e. + larger than a 31 bit bitfield. */ + if (sizeof (int) >= 4) + { + if ((x.u31 - 2) >= 0) /* promoted value should be signed */ + abort (); + } + else + { + if ((x.u31 - 2) < 0) /* promoted value should be UNsigned */ + abort (); + } + + if ((x.u32 - 2) < 0) /* promoted value should be UNsigned */ + abort (); + + if ((x.ull3 - 2) >= 0) /* promoted value should be signed */ + abort (); + + if ((x.ull35 - 2) < 0) /* promoted value should be UNsigned */ + abort (); + + exit (0); +} From llvm-commits at lists.llvm.org Wed Oct 9 04:01:53 2019 From: llvm-commits at lists.llvm.org (Sam Elliott via llvm-commits) Date: Wed, 09 Oct 2019 11:01:53 -0000 Subject: [test-suite] r374156 - Add GCC Torture Suite Sources Message-ID: <20191009110200.4F7408FEE4@lists.llvm.org> Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/bf64-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/bf64-1.c?rev=374156&view=auto ============================================================================== --- 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/bf64-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/bf64-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,41 @@ +/* { dg-xfail-if "ABI specifies bitfields cannot exceed 32 bits" { mcore-*-* } } */ +struct tmp +{ + long long int pad : 12; + long long int field : 52; +}; + +struct tmp2 +{ + long long int field : 52; + long long int pad : 12; +}; + +struct tmp +sub (struct tmp tmp) +{ + tmp.field |= 0x0008765412345678LL; + return tmp; +} + +struct tmp2 +sub2 (struct tmp2 tmp2) +{ + tmp2.field |= 0x0008765412345678LL; + return tmp2; +} + +main() +{ + struct tmp tmp = {0x123, 0xFFF000FFF000FLL}; + struct tmp2 tmp2 = {0xFFF000FFF000FLL, 0x123}; + + tmp = sub (tmp); + tmp2 = sub2 (tmp2); + + if (tmp.pad != 0x123 || tmp.field != 0xFFFFFF541FFF567FLL) + abort (); + if (tmp2.pad != 0x123 || tmp2.field != 0xFFFFFF541FFF567FLL) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/bitfld-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/bitfld-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/bitfld-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/bitfld-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,54 @@ +/* Copyright 2002 Free Software Foundation, Inc. + + Tests correct signedness of operations on bitfields; in particular + that integer promotions are done correctly, including the case when + casts are present. + + The C front end was eliding the cast of an unsigned bitfield to + unsigned as a no-op, when in fact it forces a conversion to a + full-width unsigned int. (At the time of writing, the C++ front end + has a different bug; it erroneously promotes the uncast unsigned + bitfield to an unsigned int). 
+ + Source: Neil Booth, 25 Jan 2002, based on PR 3325 (and 3326, which + is a different manifestation of the same bug). +*/ + +extern void abort (); + +int +main(int argc, char *argv[]) +{ + struct x { signed int i : 7; unsigned int u : 7; } bit; + + unsigned int u; + int i; + unsigned int unsigned_result = -13U % 61; + int signed_result = -13 % 61; + + bit.u = 61, u = 61; + bit.i = -13, i = -13; + + if (i % u != unsigned_result) + abort (); + if (i % (unsigned int) u != unsigned_result) + abort (); + + /* Somewhat counter-intuitively, bit.u is promoted to an int, making + the operands and result an int. */ + if (i % bit.u != signed_result) + abort (); + + if (bit.i % bit.u != signed_result) + abort (); + + /* But with a cast to unsigned int, the unsigned int is promoted to + itself as a no-op, and the operands and result are unsigned. */ + if (i % (unsigned int) bit.u != unsigned_result) + abort (); + + if (bit.i % (unsigned int) bit.u != unsigned_result) + abort (); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/bitfld-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/bitfld-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/bitfld-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/bitfld-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,28 @@ +/* Test whether bit field boundaries aren't advanced if bit field type + has alignment large enough. */ +extern void abort (void); +extern void exit (int); + +struct A { + unsigned short a : 5; + unsigned short b : 5; + unsigned short c : 6; +}; + +struct B { + unsigned short a : 5; + unsigned short b : 3; + unsigned short c : 8; +}; + +int main () +{ + /* If short is not at least 16 bits wide, don't test anything. 
*/ + if ((unsigned short) 65521 != 65521) + exit (0); + + if (sizeof (struct A) != sizeof (struct B)) + abort (); + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/bitfld-3.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/bitfld-3.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/bitfld-3.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/bitfld-3.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,54 @@ +/* Test that operations on bit-fields yield results reduced to bit-field + type. */ +/* Origin: Joseph Myers */ + +extern void exit (int); +extern void abort (void); + +struct s { + unsigned long long u33: 33; + unsigned long long u40: 40; + unsigned long long u41: 41; +}; + +struct s a = { 0x100000, 0x100000, 0x100000 }; +struct s b = { 0x100000000ULL, 0x100000000ULL, 0x100000000ULL }; +struct s c = { 0x1FFFFFFFFULL, 0, 0 }; + +int +main (void) +{ + if (a.u33 * a.u33 != 0 || a.u33 * a.u40 != 0 || a.u40 * a.u33 != 0 + || a.u40 * a.u40 != 0) + abort (); + if (a.u33 * a.u41 != 0x10000000000ULL + || a.u40 * a.u41 != 0x10000000000ULL + || a.u41 * a.u33 != 0x10000000000ULL + || a.u41 * a.u40 != 0x10000000000ULL + || a.u41 * a.u41 != 0x10000000000ULL) + abort (); + if (b.u33 + b.u33 != 0) + abort (); + if (b.u33 + b.u40 != 0x200000000ULL + || b.u33 + b.u41 != 0x200000000ULL + || b.u40 + b.u33 != 0x200000000ULL + || b.u40 + b.u40 != 0x200000000ULL + || b.u40 + b.u41 != 0x200000000ULL + || b.u41 + b.u33 != 0x200000000ULL + || b.u41 + b.u40 != 0x200000000ULL + || b.u41 + b.u41 != 0x200000000ULL) + abort (); + if (a.u33 - b.u33 != 0x100100000ULL + || a.u33 - b.u40 != 0xFF00100000ULL + || a.u33 - b.u41 != 0x1FF00100000ULL + || a.u40 - b.u33 != 0xFF00100000ULL + || a.u40 - b.u40 != 0xFF00100000ULL + || a.u40 - b.u41 != 0x1FF00100000ULL + || a.u41 - 
b.u33 != 0x1FF00100000ULL + || a.u41 - b.u40 != 0x1FF00100000ULL + || a.u41 - b.u41 != 0x1FF00100000ULL) + abort (); + if (++c.u33 != 0 || --c.u40 != 0xFFFFFFFFFFULL || c.u41-- != 0) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/bitfld-4.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/bitfld-4.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/bitfld-4.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/bitfld-4.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,23 @@ +/* When comparisons of bit-fields to unsigned constants got shortened, + the shortened signed constant was wrongly marked as overflowing, + leading to a later integer_zerop failure and misoptimization. + + Related to bug tree-optimization/16437 but shows the problem on + 32-bit systems. */ +/* Origin: Joseph Myers */ + +/* { dg-require-effective-target int32plus } */ + +extern void abort (void); + +struct s { int a:12, b:20; }; + +struct s x = { -123, -456 }; + +int +main (void) +{ + if (x.a != -123U || x.b != -456U) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/bitfld-5.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/bitfld-5.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/bitfld-5.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/bitfld-5.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,35 @@ +/* See http://gcc.gnu.org/ml/gcc/2009-06/msg00072.html. 
*/ + +extern void abort (void); + +struct s +{ + unsigned long long a:2; + unsigned long long b:40; + unsigned long long c:22; +}; + +__attribute__ ((noinline)) void +g (unsigned long long a, unsigned long long b) +{ + asm (""); + if (a != b) + abort (); +} + +__attribute__ ((noinline)) void +f (struct s s, unsigned long long b) +{ + asm (""); + g (((unsigned long long) (s.b-8)) + 8, b); +} + +int +main () +{ + struct s s = {1, 10, 3}; + struct s t = {1, 2, 3}; + f (s, 10); + f (t, 0x10000000002); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/bitfld-6.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/bitfld-6.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/bitfld-6.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/bitfld-6.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,24 @@ +/* { dg-require-effective-target int32plus } */ +union U +{ + const int a; + unsigned b : 20; +}; + +static union U u = { 0x12345678 }; + +/* Constant folding used to fail to account for endianness when folding a + union. 
*/ + +int +main (void) +{ +#ifdef __BYTE_ORDER__ +#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__ + return u.b - 0x45678; +#else + return u.b - 0x12345; +#endif +#endif + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/bitfld-7.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/bitfld-7.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/bitfld-7.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/bitfld-7.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,24 @@ +/* { dg-require-effective-target int32plus } */ +union U +{ + const int a; + unsigned b : 24; +}; + +static union U u = { 0x12345678 }; + +/* Constant folding used to fail to account for endianness when folding a + union. */ + +int +main (void) +{ +#ifdef __BYTE_ORDER__ +#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__ + return u.b - 0x345678; +#else + return u.b - 0x123456; +#endif +#endif + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/bswap-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/bswap-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/bswap-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/bswap-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,51 @@ +/* Test __builtin_bswap64 . 
*/ + +unsigned long long g(unsigned long long a) __attribute__((noinline)); +unsigned long long g(unsigned long long a) +{ + return __builtin_bswap64(a); +} + + +unsigned long long f(unsigned long long c) +{ + union { + unsigned long long a; + unsigned char b[8]; + } a, b; + a.a = c; + b.b[0] = a.b[7]; + b.b[1] = a.b[6]; + b.b[2] = a.b[5]; + b.b[3] = a.b[4]; + b.b[4] = a.b[3]; + b.b[5] = a.b[2]; + b.b[6] = a.b[1]; + b.b[7] = a.b[0]; + return b.a; +} + +int main(void) +{ + unsigned long long i; + /* The rest of the testcase assumes 8 byte long long. */ + if (sizeof(i) != sizeof(char)*8) + return 0; + if (f(0x12) != g(0x12)) + __builtin_abort(); + if (f(0x1234) != g(0x1234)) + __builtin_abort(); + if (f(0x123456) != g(0x123456)) + __builtin_abort(); + if (f(0x12345678ull) != g(0x12345678ull)) + __builtin_abort(); + if (f(0x1234567890ull) != g(0x1234567890ull)) + __builtin_abort(); + if (f(0x123456789012ull) != g(0x123456789012ull)) + __builtin_abort(); + if (f(0x12345678901234ull) != g(0x12345678901234ull)) + __builtin_abort(); + if (f(0x1234567890123456ull) != g(0x1234567890123456ull)) + __builtin_abort(); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/bswap-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/bswap-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/bswap-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/bswap-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,136 @@ +/* { dg-require-effective-target int32plus } */ + +#ifdef __UINT32_TYPE__ +typedef __UINT32_TYPE__ uint32_t; +#else +typedef __UINT32_TYPE__ unsigned; +#endif + +struct bitfield { + unsigned char f0:7; + unsigned char :1; + unsigned char f1:7; + unsigned char :1; + unsigned char f2:7; + unsigned char :1; + unsigned char f3:7; +}; + +struct ok { + 
unsigned char f0; + unsigned char f1; + unsigned char f2; + unsigned char f3; +}; + +union bf_or_uint32 { + struct ok inval; + struct bitfield bfval; +}; + +__attribute__ ((noinline, noclone)) uint32_t +partial_read_le32 (union bf_or_uint32 in) +{ + return in.bfval.f0 | (in.bfval.f1 << 8) + | (in.bfval.f2 << 16) | (in.bfval.f3 << 24); +} + +__attribute__ ((noinline, noclone)) uint32_t +partial_read_be32 (union bf_or_uint32 in) +{ + return in.bfval.f3 | (in.bfval.f2 << 8) + | (in.bfval.f1 << 16) | (in.bfval.f0 << 24); +} + +__attribute__ ((noinline, noclone)) uint32_t +fake_read_le32 (char *x, char *y) +{ + unsigned char c0, c1, c2, c3; + + c0 = x[0]; + c1 = x[1]; + *y = 1; + c2 = x[2]; + c3 = x[3]; + return c0 | c1 << 8 | c2 << 16 | c3 << 24; +} + +__attribute__ ((noinline, noclone)) uint32_t +fake_read_be32 (char *x, char *y) +{ + unsigned char c0, c1, c2, c3; + + c0 = x[0]; + c1 = x[1]; + *y = 1; + c2 = x[2]; + c3 = x[3]; + return c3 | c2 << 8 | c1 << 16 | c0 << 24; +} + +__attribute__ ((noinline, noclone)) uint32_t +incorrect_read_le32 (char *x, char *y) +{ + unsigned char c0, c1, c2, c3; + + c0 = x[0]; + c1 = x[1]; + c2 = x[2]; + c3 = x[3]; + *y = 1; + return c0 | c1 << 8 | c2 << 16 | c3 << 24; +} + +__attribute__ ((noinline, noclone)) uint32_t +incorrect_read_be32 (char *x, char *y) +{ + unsigned char c0, c1, c2, c3; + + c0 = x[0]; + c1 = x[1]; + c2 = x[2]; + c3 = x[3]; + *y = 1; + return c3 | c2 << 8 | c1 << 16 | c0 << 24; +} + +int +main () +{ + union bf_or_uint32 bfin; + uint32_t out; + char cin[] = { 0x83, 0x85, 0x87, 0x89 }; + + if (sizeof (uint32_t) * __CHAR_BIT__ != 32) + return 0; + bfin.inval = (struct ok) { 0x83, 0x85, 0x87, 0x89 }; + out = partial_read_le32 (bfin); + /* Test what bswap would do if its check are not strict enough instead of + what is the expected result as there is too many possible results with + bitfields. 
*/ + if (out == 0x89878583) + __builtin_abort (); + bfin.inval = (struct ok) { 0x83, 0x85, 0x87, 0x89 }; + out = partial_read_be32 (bfin); + /* Test what bswap would do if its check are not strict enough instead of + what is the expected result as there is too many possible results with + bitfields. */ + if (out == 0x83858789) + __builtin_abort (); + out = fake_read_le32 (cin, &cin[2]); + if (out != 0x89018583) + __builtin_abort (); + cin[2] = 0x87; + out = fake_read_be32 (cin, &cin[2]); + if (out != 0x83850189) + __builtin_abort (); + cin[2] = 0x87; + out = incorrect_read_le32 (cin, &cin[2]); + if (out != 0x89878583) + __builtin_abort (); + cin[2] = 0x87; + out = incorrect_read_be32 (cin, &cin[2]); + if (out != 0x83858789) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/built-in-setjmp.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/built-in-setjmp.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/built-in-setjmp.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/built-in-setjmp.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,42 @@ +/* { dg-require-effective-target indirect_jumps } */ +/* { dg-require-effective-target alloca } */ + +extern int strcmp(const char *, const char *); +extern char *strcpy(char *, const char *); +extern void abort(void); +extern void exit(int); + +void *buf[20]; + +void __attribute__((noinline)) +sub2 (void) +{ + __builtin_longjmp (buf, 1); +} + +int +main () +{ + char *p = (char *) __builtin_alloca (20); + + strcpy (p, "test"); + + if (__builtin_setjmp (buf)) + { + if (strcmp (p, "test") != 0) + abort (); + + exit (0); + } + + { + int *q = (int *) __builtin_alloca (p[2] * sizeof (int)); + int i; + + for (i = 0; i < p[2]; i++) + q[i] = 0; + + while (1) + sub2 (); + } +} Added: 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtin-bitops-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtin-bitops-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtin-bitops-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtin-bitops-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,281 @@ +#include +#include + +#if __INT_MAX__ > 2147483647L +# if __INT_MAX__ >= 9223372036854775807L +# define BITSIZEOF_INT 64 +# else +# define BITSIZEOF_INT 32 +# endif +#else +# if __INT_MAX__ >= 2147483647L +# define BITSIZEOF_INT 32 +# else +# define BITSIZEOF_INT 16 +# endif +#endif + +#if __LONG_MAX__ > 2147483647L +# if __LONG_MAX__ >= 9223372036854775807L +# define BITSIZEOF_LONG 64 +# else +# define BITSIZEOF_LONG 32 +# endif +#else +# define BITSIZEOF_LONG 32 +#endif + +#if __LONG_LONG_MAX__ > 2147483647L +# if __LONG_LONG_MAX__ >= 9223372036854775807L +# define BITSIZEOF_LONG_LONG 64 +# else +# define BITSIZEOF_LONG_LONG 32 +# endif +#else +# define BITSIZEOF_LONG_LONG 32 +#endif + +#define MAKE_FUNS(suffix, type) \ +int my_ffs##suffix(type x) { \ + int i; \ + if (x == 0) \ + return 0; \ + for (i = 0; i < CHAR_BIT * sizeof (type); i++) \ + if (x & ((type) 1 << i)) \ + break; \ + return i + 1; \ +} \ + \ +int my_ctz##suffix(type x) { \ + int i; \ + for (i = 0; i < CHAR_BIT * sizeof (type); i++) \ + if (x & ((type) 1 << i)) \ + break; \ + return i; \ +} \ + \ +int my_clz##suffix(type x) { \ + int i; \ + for (i = 0; i < CHAR_BIT * sizeof (type); i++) \ + if (x & ((type) 1 << ((CHAR_BIT * sizeof (type)) - i - 1))) \ + break; \ + return i; \ +} \ + \ +int my_clrsb##suffix(type x) { \ + int i; \ + int leading = (x >> CHAR_BIT * sizeof (type) - 1) & 1; \ + for (i = 1; i < CHAR_BIT * sizeof (type); i++) \ + if (((x >> ((CHAR_BIT * sizeof 
(type)) - i - 1)) & 1) \ + != leading) \ + break; \ + return i - 1; \ +} \ + \ +int my_popcount##suffix(type x) { \ + int i; \ + int count = 0; \ + for (i = 0; i < CHAR_BIT * sizeof (type); i++) \ + if (x & ((type) 1 << i)) \ + count++; \ + return count; \ +} \ + \ +int my_parity##suffix(type x) { \ + int i; \ + int count = 0; \ + for (i = 0; i < CHAR_BIT * sizeof (type); i++) \ + if (x & ((type) 1 << i)) \ + count++; \ + return count & 1; \ +} + +MAKE_FUNS (, unsigned); +MAKE_FUNS (l, unsigned long); +MAKE_FUNS (ll, unsigned long long); + +extern void abort (void); +extern void exit (int); + +#define NUMS16 \ + { \ + 0x0000U, \ + 0x0001U, \ + 0x8000U, \ + 0x0002U, \ + 0x4000U, \ + 0x0100U, \ + 0x0080U, \ + 0xa5a5U, \ + 0x5a5aU, \ + 0xcafeU, \ + 0xffffU \ + } + +#define NUMS32 \ + { \ + 0x00000000UL, \ + 0x00000001UL, \ + 0x80000000UL, \ + 0x00000002UL, \ + 0x40000000UL, \ + 0x00010000UL, \ + 0x00008000UL, \ + 0xa5a5a5a5UL, \ + 0x5a5a5a5aUL, \ + 0xcafe0000UL, \ + 0x00cafe00UL, \ + 0x0000cafeUL, \ + 0xffffffffUL \ + } + +#define NUMS64 \ + { \ + 0x0000000000000000ULL, \ + 0x0000000000000001ULL, \ + 0x8000000000000000ULL, \ + 0x0000000000000002ULL, \ + 0x4000000000000000ULL, \ + 0x0000000100000000ULL, \ + 0x0000000080000000ULL, \ + 0xa5a5a5a5a5a5a5a5ULL, \ + 0x5a5a5a5a5a5a5a5aULL, \ + 0xcafecafe00000000ULL, \ + 0x0000cafecafe0000ULL, \ + 0x00000000cafecafeULL, \ + 0xffffffffffffffffULL \ + } + +unsigned int ints[] = +#if BITSIZEOF_INT == 64 +NUMS64; +#elif BITSIZEOF_INT == 32 +NUMS32; +#else +NUMS16; +#endif + +unsigned long longs[] = +#if BITSIZEOF_LONG == 64 +NUMS64; +#else +NUMS32; +#endif + +unsigned long long longlongs[] = +#if BITSIZEOF_LONG_LONG == 64 +NUMS64; +#else +NUMS32; +#endif + +#define N(table) (sizeof (table) / sizeof (table[0])) + +int +main (void) +{ + int i; + + for (i = 0; i < N(ints); i++) + { + if (__builtin_ffs (ints[i]) != my_ffs (ints[i])) + abort (); + if (ints[i] != 0 + && __builtin_clz (ints[i]) != my_clz (ints[i])) + abort (); + if 
(ints[i] != 0 + && __builtin_ctz (ints[i]) != my_ctz (ints[i])) + abort (); + if (__builtin_clrsb (ints[i]) != my_clrsb (ints[i])) + abort (); + if (__builtin_popcount (ints[i]) != my_popcount (ints[i])) + abort (); + if (__builtin_parity (ints[i]) != my_parity (ints[i])) + abort (); + } + + for (i = 0; i < N(longs); i++) + { + if (__builtin_ffsl (longs[i]) != my_ffsl (longs[i])) + abort (); + if (longs[i] != 0 + && __builtin_clzl (longs[i]) != my_clzl (longs[i])) + abort (); + if (longs[i] != 0 + && __builtin_ctzl (longs[i]) != my_ctzl (longs[i])) + abort (); + if (__builtin_clrsbl (longs[i]) != my_clrsbl (longs[i])) + abort (); + if (__builtin_popcountl (longs[i]) != my_popcountl (longs[i])) + abort (); + if (__builtin_parityl (longs[i]) != my_parityl (longs[i])) + abort (); + } + + for (i = 0; i < N(longlongs); i++) + { + if (__builtin_ffsll (longlongs[i]) != my_ffsll (longlongs[i])) + abort (); + if (longlongs[i] != 0 + && __builtin_clzll (longlongs[i]) != my_clzll (longlongs[i])) + abort (); + if (longlongs[i] != 0 + && __builtin_ctzll (longlongs[i]) != my_ctzll (longlongs[i])) + abort (); + if (__builtin_clrsbll (longlongs[i]) != my_clrsbll (longlongs[i])) + abort (); + if (__builtin_popcountll (longlongs[i]) != my_popcountll (longlongs[i])) + abort (); + if (__builtin_parityll (longlongs[i]) != my_parityll (longlongs[i])) + abort (); + } + + /* Test constant folding. 
*/ + +#define TEST(x, suffix) \ + if (__builtin_ffs##suffix (x) != my_ffs##suffix (x)) \ + abort (); \ + if (x != 0 && __builtin_clz##suffix (x) != my_clz##suffix (x)) \ + abort (); \ + if (x != 0 && __builtin_ctz##suffix (x) != my_ctz##suffix (x)) \ + abort (); \ + if (__builtin_clrsb##suffix (x) != my_clrsb##suffix (x)) \ + abort (); \ + if (__builtin_popcount##suffix (x) != my_popcount##suffix (x)) \ + abort (); \ + if (__builtin_parity##suffix (x) != my_parity##suffix (x)) \ + abort (); + +#if BITSIZEOF_INT == 32 + TEST(0x00000000UL,); + TEST(0x00000001UL,); + TEST(0x80000000UL,); + TEST(0x40000000UL,); + TEST(0x00010000UL,); + TEST(0x00008000UL,); + TEST(0xa5a5a5a5UL,); + TEST(0x5a5a5a5aUL,); + TEST(0xcafe0000UL,); + TEST(0x00cafe00UL,); + TEST(0x0000cafeUL,); + TEST(0xffffffffUL,); +#endif + +#if BITSIZEOF_LONG_LONG == 64 + TEST(0x0000000000000000ULL, ll); + TEST(0x0000000000000001ULL, ll); + TEST(0x8000000000000000ULL, ll); + TEST(0x0000000000000002ULL, ll); + TEST(0x4000000000000000ULL, ll); + TEST(0x0000000100000000ULL, ll); + TEST(0x0000000080000000ULL, ll); + TEST(0xa5a5a5a5a5a5a5a5ULL, ll); + TEST(0x5a5a5a5a5a5a5a5aULL, ll); + TEST(0xcafecafe00000000ULL, ll); + TEST(0x0000cafecafe0000ULL, ll); + TEST(0x00000000cafecafeULL, ll); + TEST(0xffffffffffffffffULL, ll); +#endif + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtin-constant.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtin-constant.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtin-constant.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtin-constant.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,26 @@ +/* PR optimization/8423. */ + +#define btest(x) __builtin_constant_p(x) ? 
"1" : "0" + +#ifdef __OPTIMIZE__ +void +foo (char *i) +{ + if (*i == '0') + abort (); +} +#else +void +foo (char *i) +{ +} +#endif + +int +main (void) +{ + int size = sizeof (int); + foo (btest (size)); + foo (btest (size)); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtin-prefetch-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtin-prefetch-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtin-prefetch-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtin-prefetch-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,69 @@ +/* Test that __builtin_prefetch does no harm. + + Prefetch using all valid combinations of rw and locality values. + These must be compile-time constants. */ + +#define NO_TEMPORAL_LOCALITY 0 +#define LOW_TEMPORAL_LOCALITY 1 +#define MODERATE_TEMPORAL_LOCALITY 1 +#define HIGH_TEMPORAL_LOCALITY 3 + +#define WRITE_ACCESS 1 +#define READ_ACCESS 0 + +enum locality { none, low, moderate, high }; +enum rw { read, write }; + +int arr[10]; + +void +good_const (const int *p) +{ + __builtin_prefetch (p, 0, 0); + __builtin_prefetch (p, 0, 1); + __builtin_prefetch (p, 0, 2); + __builtin_prefetch (p, READ_ACCESS, 3); + __builtin_prefetch (p, 1, NO_TEMPORAL_LOCALITY); + __builtin_prefetch (p, 1, LOW_TEMPORAL_LOCALITY); + __builtin_prefetch (p, 1, MODERATE_TEMPORAL_LOCALITY); + __builtin_prefetch (p, WRITE_ACCESS, HIGH_TEMPORAL_LOCALITY); +} + +void +good_enum (const int *p) +{ + __builtin_prefetch (p, read, none); + __builtin_prefetch (p, read, low); + __builtin_prefetch (p, read, moderate); + __builtin_prefetch (p, read, high); + __builtin_prefetch (p, write, none); + __builtin_prefetch (p, write, low); + __builtin_prefetch (p, write, moderate); + __builtin_prefetch (p, write, high); +} + +void 
+good_expr (const int *p) +{ + __builtin_prefetch (p, 1 - 1, 6 - (2 * 3)); + __builtin_prefetch (p, 1 + 0, 1 + 2); +} + +void +good_vararg (const int *p) +{ + __builtin_prefetch (p, 0, 3); + __builtin_prefetch (p, 0); + __builtin_prefetch (p, 1); + __builtin_prefetch (p); +} + +int +main () +{ + good_const (arr); + good_enum (arr); + good_expr (arr); + good_vararg (arr); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtin-prefetch-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtin-prefetch-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtin-prefetch-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtin-prefetch-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,152 @@ +/* Test that __builtin_prefetch does no harm. + + Prefetch data using a variety of storage classes and address + expressions. */ + +int glob_int_arr[100]; +int *glob_ptr_int = glob_int_arr; +int glob_int = 4; + +static stat_int_arr[100]; +static int *stat_ptr_int = stat_int_arr; +static int stat_int; + +struct S { + int a; + short b, c; + char d[8]; + struct S *next; +}; + +struct S str; +struct S *ptr_str = &str; + +/* Prefetch global variables using the address of the variable. */ + +void +simple_global () +{ + __builtin_prefetch (glob_int_arr, 0, 0); + __builtin_prefetch (glob_ptr_int, 0, 0); + __builtin_prefetch (&glob_int, 0, 0); +} + +/* Prefetch file-level static variables using the address of the variable. */ + +void +simple_file () +{ + __builtin_prefetch (stat_int_arr, 0, 0); + __builtin_prefetch (stat_ptr_int, 0, 0); + __builtin_prefetch (&stat_int, 0, 0); +} + +/* Prefetch local static variables using the address of the variable. 
*/ + +void +simple_static_local () +{ + static int gx[100]; + static int *hx = gx; + static int ix; + __builtin_prefetch (gx, 0, 0); + __builtin_prefetch (hx, 0, 0); + __builtin_prefetch (&ix, 0, 0); +} + +/* Prefetch local stack variables using the address of the variable. */ + +void +simple_local () +{ + int gx[100]; + int *hx = gx; + int ix; + __builtin_prefetch (gx, 0, 0); + __builtin_prefetch (hx, 0, 0); + __builtin_prefetch (&ix, 0, 0); +} + +/* Prefetch arguments using the address of the variable. */ + +void +simple_arg (int g[100], int *h, int i) +{ + __builtin_prefetch (g, 0, 0); + __builtin_prefetch (h, 0, 0); + __builtin_prefetch (&i, 0, 0); +} + +/* Prefetch using address expressions involving global variables. */ + +void +expr_global (void) +{ + __builtin_prefetch (&str, 0, 0); + __builtin_prefetch (ptr_str, 0, 0); + __builtin_prefetch (&str.b, 0, 0); + __builtin_prefetch (&ptr_str->b, 0, 0); + __builtin_prefetch (&str.d, 0, 0); + __builtin_prefetch (&ptr_str->d, 0, 0); + __builtin_prefetch (str.next, 0, 0); + __builtin_prefetch (ptr_str->next, 0, 0); + __builtin_prefetch (str.next->d, 0, 0); + __builtin_prefetch (ptr_str->next->d, 0, 0); + + __builtin_prefetch (&glob_int_arr, 0, 0); + __builtin_prefetch (glob_ptr_int, 0, 0); + __builtin_prefetch (&glob_int_arr[2], 0, 0); + __builtin_prefetch (&glob_ptr_int[3], 0, 0); + __builtin_prefetch (glob_int_arr+3, 0, 0); + __builtin_prefetch (glob_int_arr+glob_int, 0, 0); + __builtin_prefetch (glob_ptr_int+5, 0, 0); + __builtin_prefetch (glob_ptr_int+glob_int, 0, 0); +} + +/* Prefetch using address expressions involving local variables. 
*/ + +void +expr_local (void) +{ + int b[10]; + int *pb = b; + struct S t; + struct S *pt = &t; + int j = 4; + + __builtin_prefetch (&t, 0, 0); + __builtin_prefetch (pt, 0, 0); + __builtin_prefetch (&t.b, 0, 0); + __builtin_prefetch (&pt->b, 0, 0); + __builtin_prefetch (&t.d, 0, 0); + __builtin_prefetch (&pt->d, 0, 0); + __builtin_prefetch (t.next, 0, 0); + __builtin_prefetch (pt->next, 0, 0); + __builtin_prefetch (t.next->d, 0, 0); + __builtin_prefetch (pt->next->d, 0, 0); + + __builtin_prefetch (&b, 0, 0); + __builtin_prefetch (pb, 0, 0); + __builtin_prefetch (&b[2], 0, 0); + __builtin_prefetch (&pb[3], 0, 0); + __builtin_prefetch (b+3, 0, 0); + __builtin_prefetch (b+j, 0, 0); + __builtin_prefetch (pb+5, 0, 0); + __builtin_prefetch (pb+j, 0, 0); +} + +int +main () +{ + simple_global (); + simple_file (); + simple_static_local (); + simple_local (); + simple_arg (glob_int_arr, glob_ptr_int, glob_int); + + str.next = &str; + expr_global (); + expr_local (); + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtin-prefetch-3.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtin-prefetch-3.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtin-prefetch-3.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtin-prefetch-3.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,113 @@ +/* Test that __builtin_prefetch does no harm. + + Prefetch data using a variety of storage classes and address + expressions with volatile variables and pointers. 
*/ + +int glob_int_arr[100]; +int glob_int = 4; +volatile int glob_vol_int_arr[100]; +int * volatile glob_vol_ptr_int = glob_int_arr; +volatile int *glob_ptr_vol_int = glob_vol_int_arr; +volatile int * volatile glob_vol_ptr_vol_int = glob_vol_int_arr; +volatile int glob_vol_int; + +static stat_int_arr[100]; +static volatile int stat_vol_int_arr[100]; +static int * volatile stat_vol_ptr_int = stat_int_arr; +static volatile int *stat_ptr_vol_int = stat_vol_int_arr; +static volatile int * volatile stat_vol_ptr_vol_int = stat_vol_int_arr; +static volatile int stat_vol_int; + +struct S { + int a; + short b, c; + char d[8]; + struct S *next; +}; + +struct S str; +volatile struct S vol_str; +struct S * volatile vol_ptr_str = &str; +volatile struct S *ptr_vol_str = &vol_str; +volatile struct S * volatile vol_ptr_vol_str = &vol_str; + +/* Prefetch volatile global variables using the address of the variable. */ + +void +simple_vol_global () +{ + __builtin_prefetch (glob_vol_int_arr, 0, 0); + __builtin_prefetch (glob_vol_ptr_int, 0, 0); + __builtin_prefetch (glob_ptr_vol_int, 0, 0); + __builtin_prefetch (glob_vol_ptr_vol_int, 0, 0); + __builtin_prefetch (&glob_vol_int, 0, 0); +} + +/* Prefetch volatile static variables using the address of the variable. */ + +void +simple_vol_file () +{ + __builtin_prefetch (stat_vol_int_arr, 0, 0); + __builtin_prefetch (stat_vol_ptr_int, 0, 0); + __builtin_prefetch (stat_ptr_vol_int, 0, 0); + __builtin_prefetch (stat_vol_ptr_vol_int, 0, 0); + __builtin_prefetch (&stat_vol_int, 0, 0); +} + +/* Prefetch using address expressions involving volatile global variables. 
*/ + +void +expr_vol_global (void) +{ + __builtin_prefetch (&vol_str, 0, 0); + __builtin_prefetch (ptr_vol_str, 0, 0); + __builtin_prefetch (vol_ptr_str, 0, 0); + __builtin_prefetch (vol_ptr_vol_str, 0, 0); + __builtin_prefetch (&vol_str.b, 0, 0); + __builtin_prefetch (&ptr_vol_str->b, 0, 0); + __builtin_prefetch (&vol_ptr_str->b, 0, 0); + __builtin_prefetch (&vol_ptr_vol_str->b, 0, 0); + __builtin_prefetch (&vol_str.d, 0, 0); + __builtin_prefetch (&vol_ptr_str->d, 0, 0); + __builtin_prefetch (&ptr_vol_str->d, 0, 0); + __builtin_prefetch (&vol_ptr_vol_str->d, 0, 0); + __builtin_prefetch (vol_str.next, 0, 0); + __builtin_prefetch (vol_ptr_str->next, 0, 0); + __builtin_prefetch (ptr_vol_str->next, 0, 0); + __builtin_prefetch (vol_ptr_vol_str->next, 0, 0); + __builtin_prefetch (vol_str.next->d, 0, 0); + __builtin_prefetch (vol_ptr_str->next->d, 0, 0); + __builtin_prefetch (ptr_vol_str->next->d, 0, 0); + __builtin_prefetch (vol_ptr_vol_str->next->d, 0, 0); + + __builtin_prefetch (&glob_vol_int_arr, 0, 0); + __builtin_prefetch (glob_vol_ptr_int, 0, 0); + __builtin_prefetch (glob_ptr_vol_int, 0, 0); + __builtin_prefetch (glob_vol_ptr_vol_int, 0, 0); + __builtin_prefetch (&glob_vol_int_arr[2], 0, 0); + __builtin_prefetch (&glob_vol_ptr_int[3], 0, 0); + __builtin_prefetch (&glob_ptr_vol_int[3], 0, 0); + __builtin_prefetch (&glob_vol_ptr_vol_int[3], 0, 0); + __builtin_prefetch (glob_vol_int_arr+3, 0, 0); + __builtin_prefetch (glob_vol_int_arr+glob_vol_int, 0, 0); + __builtin_prefetch (glob_vol_ptr_int+5, 0, 0); + __builtin_prefetch (glob_ptr_vol_int+5, 0, 0); + __builtin_prefetch (glob_vol_ptr_vol_int+5, 0, 0); + __builtin_prefetch (glob_vol_ptr_int+glob_vol_int, 0, 0); + __builtin_prefetch (glob_ptr_vol_int+glob_vol_int, 0, 0); + __builtin_prefetch (glob_vol_ptr_vol_int+glob_vol_int, 0, 0); +} + +int +main () +{ + simple_vol_global (); + simple_vol_file (); + + str.next = &str; + vol_str.next = &str; + expr_vol_global (); + + exit (0); +} Added: 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtin-prefetch-4.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtin-prefetch-4.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtin-prefetch-4.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtin-prefetch-4.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,271 @@ +/* Test that __builtin_prefetch does no harm. + + Check that the expression containing the address to prefetch is + evaluated if it has side effects, even if the target does not support + data prefetch. Check changes to pointers and to array indices that are + either global variables or arguments. */ + +#define ARRSIZE 100 + +int arr[ARRSIZE]; +int *ptr = &arr[20]; +int arrindex = 4; + +/* Check that assignment within a prefetch argument is evaluated. */ + +int +assign_arg_ptr (int *p) +{ + int *q; + __builtin_prefetch ((q = p), 0, 0); + return q == p; +} + +int +assign_glob_ptr (void) +{ + int *q; + __builtin_prefetch ((q = ptr), 0, 0); + return q == ptr; +} + +int +assign_arg_idx (int *p, int i) +{ + int j; + __builtin_prefetch (&p[j = i], 0, 0); + return j == i; +} + +int +assign_glob_idx (void) +{ + int j; + __builtin_prefetch (&ptr[j = arrindex], 0, 0); + return j == arrindex; +} + +/* Check that pre/post increment/decrement within a prefetch argument are + evaluated. 
*/ + +int +preinc_arg_ptr (int *p) +{ + int *q; + q = p + 1; + __builtin_prefetch (++p, 0, 0); + return p == q; +} + +int +preinc_glob_ptr (void) +{ + int *q; + q = ptr + 1; + __builtin_prefetch (++ptr, 0, 0); + return ptr == q; +} + +int +postinc_arg_ptr (int *p) +{ + int *q; + q = p + 1; + __builtin_prefetch (p++, 0, 0); + return p == q; +} + +int +postinc_glob_ptr (void) +{ + int *q; + q = ptr + 1; + __builtin_prefetch (ptr++, 0, 0); + return ptr == q; +} + +int +predec_arg_ptr (int *p) +{ + int *q; + q = p - 1; + __builtin_prefetch (--p, 0, 0); + return p == q; +} + +int +predec_glob_ptr (void) +{ + int *q; + q = ptr - 1; + __builtin_prefetch (--ptr, 0, 0); + return ptr == q; +} + +int +postdec_arg_ptr (int *p) +{ + int *q; + q = p - 1; + __builtin_prefetch (p--, 0, 0); + return p == q; +} + +int +postdec_glob_ptr (void) +{ + int *q; + q = ptr - 1; + __builtin_prefetch (ptr--, 0, 0); + return ptr == q; +} + +int +preinc_arg_idx (int *p, int i) +{ + int j = i + 1; + __builtin_prefetch (&p[++i], 0, 0); + return i == j; +} + + +int +preinc_glob_idx (void) +{ + int j = arrindex + 1; + __builtin_prefetch (&ptr[++arrindex], 0, 0); + return arrindex == j; +} + +int +postinc_arg_idx (int *p, int i) +{ + int j = i + 1; + __builtin_prefetch (&p[i++], 0, 0); + return i == j; +} + +int +postinc_glob_idx (void) +{ + int j = arrindex + 1; + __builtin_prefetch (&ptr[arrindex++], 0, 0); + return arrindex == j; +} + +int +predec_arg_idx (int *p, int i) +{ + int j = i - 1; + __builtin_prefetch (&p[--i], 0, 0); + return i == j; +} + +int +predec_glob_idx (void) +{ + int j = arrindex - 1; + __builtin_prefetch (&ptr[--arrindex], 0, 0); + return arrindex == j; +} + +int +postdec_arg_idx (int *p, int i) +{ + int j = i - 1; + __builtin_prefetch (&p[i--], 0, 0); + return i == j; +} + +int +postdec_glob_idx (void) +{ + int j = arrindex - 1; + __builtin_prefetch (&ptr[arrindex--], 0, 0); + return arrindex == j; +} + +/* Check that function calls within the first prefetch argument are + 
evaluated. */ + +int getptrcnt = 0; + +int * +getptr (int *p) +{ + getptrcnt++; + return p + 1; +} + +int +funccall_arg_ptr (int *p) +{ + __builtin_prefetch (getptr (p), 0, 0); + return getptrcnt == 1; +} + +int getintcnt = 0; + +int +getint (int i) +{ + getintcnt++; + return i + 1; +} + +int +funccall_arg_idx (int *p, int i) +{ + __builtin_prefetch (&p[getint (i)], 0, 0); + return getintcnt == 1; +} + +int +main () +{ + if (!assign_arg_ptr (ptr)) + abort (); + if (!assign_glob_ptr ()) + abort (); + if (!assign_arg_idx (ptr, 4)) + abort (); + if (!assign_glob_idx ()) + abort (); + if (!preinc_arg_ptr (ptr)) + abort (); + if (!preinc_glob_ptr ()) + abort (); + if (!postinc_arg_ptr (ptr)) + abort (); + if (!postinc_glob_ptr ()) + abort (); + if (!predec_arg_ptr (ptr)) + abort (); + if (!predec_glob_ptr ()) + abort (); + if (!postdec_arg_ptr (ptr)) + abort (); + if (!postdec_glob_ptr ()) + abort (); + if (!preinc_arg_idx (ptr, 3)) + abort (); + if (!preinc_glob_idx ()) + abort (); + if (!postinc_arg_idx (ptr, 3)) + abort (); + if (!postinc_glob_idx ()) + abort (); + if (!predec_arg_idx (ptr, 3)) + abort (); + if (!predec_glob_idx ()) + abort (); + if (!postdec_arg_idx (ptr, 3)) + abort (); + if (!postdec_glob_idx ()) + abort (); + if (!funccall_arg_ptr (ptr)) + abort (); + if (!funccall_arg_idx (ptr, 3)) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtin-prefetch-5.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtin-prefetch-5.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtin-prefetch-5.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtin-prefetch-5.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,60 @@ +/* Test that __builtin_prefetch does no harm. 
+ + Use addresses that are unlikely to be word-aligned. Some targets + have alignment requirements for prefetch addresses, so make sure the + compiler takes care of that. This fails if it aborts, anything else + is OK. */ + +struct S { + short a; + short b; + char c[8]; +} s; + +char arr[100]; +char *ptr = arr; +int idx = 3; + +void +arg_ptr (char *p) +{ + __builtin_prefetch (p, 0, 0); +} + +void +arg_idx (char *p, int i) +{ + __builtin_prefetch (&p[i], 0, 0); +} + +void +glob_ptr (void) +{ + __builtin_prefetch (ptr, 0, 0); +} + +void +glob_idx (void) +{ + __builtin_prefetch (&ptr[idx], 0, 0); +} + +int +main () +{ + __builtin_prefetch (&s.b, 0, 0); + __builtin_prefetch (&s.c[1], 0, 0); + + arg_ptr (&s.c[1]); + arg_ptr (ptr+3); + arg_idx (ptr, 3); + arg_idx (ptr+1, 2); + idx = 3; + glob_ptr (); + glob_idx (); + ptr++; + idx = 2; + glob_ptr (); + glob_idx (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtin-prefetch-6.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtin-prefetch-6.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtin-prefetch-6.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtin-prefetch-6.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,46 @@ +/* Test that __builtin_prefetch does no harm. + + Data prefetch should not fault if used with an invalid address. */ + +#include + +#define ARRSIZE 65 +int *bad_addr[ARRSIZE]; +int arr_used; + +/* Fill bad_addr with a range of values in the hopes that on any target + some will be invalid addresses. */ +void +init_addrs (void) +{ + int i; + int bits_per_ptr = sizeof (void *) * 8; + for (i = 0; i < bits_per_ptr; i++) + bad_addr[i] = (void *)(1UL << i); + arr_used = bits_per_ptr + 1; /* The last element used is zero. 
*/ +} + +void +prefetch_for_read (void) +{ + int i; + for (i = 0; i < ARRSIZE; i++) + __builtin_prefetch (bad_addr[i], 0, 0); +} + +void +prefetch_for_write (void) +{ + int i; + for (i = 0; i < ARRSIZE; i++) + __builtin_prefetch (bad_addr[i], 1, 0); +} + +int +main () +{ + init_addrs (); + prefetch_for_read (); + prefetch_for_write (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtin-types-compatible-p.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtin-types-compatible-p.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtin-types-compatible-p.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtin-types-compatible-p.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,35 @@ +int i; +double d; + +/* Make sure we return a constant. */ +float rootbeer[__builtin_types_compatible_p (int, typeof(i))]; + +typedef enum { hot, dog, poo, bear } dingos; +typedef enum { janette, laura, amanda } cranberry; + +typedef float same1; +typedef float same2; + +int main (void); + +int main (void) +{ + /* Compatible types. */ + if (!(__builtin_types_compatible_p (int, const int) + && __builtin_types_compatible_p (typeof (hot), int) + && __builtin_types_compatible_p (typeof (hot), typeof (laura)) + && __builtin_types_compatible_p (int[5], int[]) + && __builtin_types_compatible_p (same1, same2))) + abort (); + + /* Incompatible types. 
*/ + if (__builtin_types_compatible_p (char *, int) + || __builtin_types_compatible_p (char *, const char *) + || __builtin_types_compatible_p (long double, double) + || __builtin_types_compatible_p (typeof (i), typeof (d)) + || __builtin_types_compatible_p (typeof (dingos), typeof (cranberry)) + || __builtin_types_compatible_p (char, int) + || __builtin_types_compatible_p (char *, char **)) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/20010124-1-lib.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/20010124-1-lib.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/20010124-1-lib.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/20010124-1-lib.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,50 @@ +/* Verify that structure return doesn't invoke memcpy on + overlapping objects. 
*/ + +extern void abort (void); +extern int inside_main; +typedef __SIZE_TYPE__ size_t; + +struct S { + char stuff[1024]; +}; + +union U { + struct { + int space; + struct S s; + } a; + struct { + struct S s; + int space; + } b; +}; + +struct S f(struct S *p) +{ + return *p; +} + +void g(union U *p) +{ +} + +void *memcpy(void *a, const void *b, size_t len) +{ + if (inside_main) + { + if (a < b && a+len > b) + abort (); + if (b < a && b+len > a) + abort (); + return a; + } + else + { + char *dst = (char *) a; + const char *src = (const char *) b; + while (len--) + *dst++ = *src++; + return a; + } +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/20010124-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/20010124-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/20010124-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/20010124-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,30 @@ +/* Verify that structure return doesn't invoke memcpy on + overlapping objects. 
*/ + +extern void abort (void); + +struct S { + char stuff[1024]; +}; + +union U { + struct { + int space; + struct S s; + } a; + struct { + struct S s; + int space; + } b; +}; + +struct S f(struct S *); +void g(union U *); + +void main_test(void) +{ + union U u; + u.b.s = f(&u.a.s); + u.a.s = f(&u.b.s); + g(&u); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/20010124-1.x URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/20010124-1.x?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/20010124-1.x (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/20010124-1.x Wed Oct 9 04:01:46 2019 @@ -0,0 +1,10 @@ +load_lib target-supports.exp + +if [istarget "nvptx-*-*"] { + # This test uses memcpy for block move in the same file as it + # defines it. The two decls are not the same, by design, and we + # end up emitting a definition of memcpy, along with a .extern + # declaration. This confuses the ptx assembler. + return 1 +} +return 0 Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/abs-1-lib.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/abs-1-lib.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/abs-1-lib.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/abs-1-lib.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,22 @@ +extern void abort (void); +extern int abs_called; +extern int inside_main; + +/* The labs call should have been optimized, but the abs call + shouldn't have been. */ + +int +abs (int x) +{ + if (inside_main) + abs_called = 1; + return (x < 0 ? 
-x : x); +} + +long +labs (long x) +{ + if (inside_main) + abort (); + return (x < 0 ? -x : x); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/abs-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/abs-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/abs-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/abs-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,22 @@ +/* Test for -fno-builtin-FUNCTION. */ +/* Origin: Joseph Myers . */ +/* GCC normally handles abs and labs as built-in functions even without + optimization. So test that with -fno-builtin-abs, labs is so handled + but abs isn't. */ + +int abs_called = 0; + +extern int abs (int); +extern long labs (long); +extern void abort (void); + +void +main_test (void) +{ + if (labs (0) != 0) + abort (); + if (abs (0) != 0) + abort (); + if (!abs_called) + abort (); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/abs-1.x URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/abs-1.x?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/abs-1.x (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/abs-1.x Wed Oct 9 04:01:46 2019 @@ -0,0 +1,2 @@ +set additional_flags -fno-builtin-abs +return 0 Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/abs-2-lib.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/abs-2-lib.c?rev=374156&view=auto ============================================================================== --- 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/abs-2-lib.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/abs-2-lib.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1 @@ +#include "lib/abs.c" Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/abs-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/abs-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/abs-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/abs-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,106 @@ +/* Test for builtin abs, labs, llabs, imaxabs. */ +/* Origin: Joseph Myers */ + +#include +typedef __INTMAX_TYPE__ intmax_t; +#define INTMAX_MAX __INTMAX_MAX__ + +extern int abs (int); +extern long labs (long); +extern long long llabs (long long); +extern intmax_t imaxabs (intmax_t); +extern void abort (void); +extern void link_error (void); + +void +main_test (void) +{ + /* For each type, test both runtime and compile time (constant folding) + optimization. 
*/ + volatile int i0 = 0, i1 = 1, im1 = -1, imin = -INT_MAX, imax = INT_MAX; + volatile long l0 = 0L, l1 = 1L, lm1 = -1L, lmin = -LONG_MAX, lmax = LONG_MAX; + volatile long long ll0 = 0LL, ll1 = 1LL, llm1 = -1LL; + volatile long long llmin = -__LONG_LONG_MAX__, llmax = __LONG_LONG_MAX__; + volatile intmax_t imax0 = 0, imax1 = 1, imaxm1 = -1; + volatile intmax_t imaxmin = -INTMAX_MAX, imaxmax = INTMAX_MAX; + if (abs (i0) != 0) + abort (); + if (abs (0) != 0) + link_error (); + if (abs (i1) != 1) + abort (); + if (abs (1) != 1) + link_error (); + if (abs (im1) != 1) + abort (); + if (abs (-1) != 1) + link_error (); + if (abs (imin) != INT_MAX) + abort (); + if (abs (-INT_MAX) != INT_MAX) + link_error (); + if (abs (imax) != INT_MAX) + abort (); + if (abs (INT_MAX) != INT_MAX) + link_error (); + if (labs (l0) != 0L) + abort (); + if (labs (0L) != 0L) + link_error (); + if (labs (l1) != 1L) + abort (); + if (labs (1L) != 1L) + link_error (); + if (labs (lm1) != 1L) + abort (); + if (labs (-1L) != 1L) + link_error (); + if (labs (lmin) != LONG_MAX) + abort (); + if (labs (-LONG_MAX) != LONG_MAX) + link_error (); + if (labs (lmax) != LONG_MAX) + abort (); + if (labs (LONG_MAX) != LONG_MAX) + link_error (); + if (llabs (ll0) != 0LL) + abort (); + if (llabs (0LL) != 0LL) + link_error (); + if (llabs (ll1) != 1LL) + abort (); + if (llabs (1LL) != 1LL) + link_error (); + if (llabs (llm1) != 1LL) + abort (); + if (llabs (-1LL) != 1LL) + link_error (); + if (llabs (llmin) != __LONG_LONG_MAX__) + abort (); + if (llabs (-__LONG_LONG_MAX__) != __LONG_LONG_MAX__) + link_error (); + if (llabs (llmax) != __LONG_LONG_MAX__) + abort (); + if (llabs (__LONG_LONG_MAX__) != __LONG_LONG_MAX__) + link_error (); + if (imaxabs (imax0) != 0) + abort (); + if (imaxabs (0) != 0) + link_error (); + if (imaxabs (imax1) != 1) + abort (); + if (imaxabs (1) != 1) + link_error (); + if (imaxabs (imaxm1) != 1) + abort (); + if (imaxabs (-1) != 1) + link_error (); + if (imaxabs (imaxmin) != INTMAX_MAX) 
+ abort (); + if (imaxabs (-INTMAX_MAX) != INTMAX_MAX) + link_error (); + if (imaxabs (imaxmax) != INTMAX_MAX) + abort (); + if (imaxabs (INTMAX_MAX) != INTMAX_MAX) + link_error (); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/abs-3-lib.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/abs-3-lib.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/abs-3-lib.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/abs-3-lib.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1 @@ +#include "lib/abs.c" Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/abs-3.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/abs-3.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/abs-3.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/abs-3.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,102 @@ +/* Test for builtin abs, labs, llabs, imaxabs. Test for __builtin versions. */ +/* Origin: Joseph Myers */ + +#include +typedef __INTMAX_TYPE__ intmax_t; +#define INTMAX_MAX __INTMAX_MAX__ + +extern void abort (void); +extern void link_error (void); + +void +main_test (void) +{ + /* For each type, test both runtime and compile time (constant folding) + optimization. 
*/ + volatile int i0 = 0, i1 = 1, im1 = -1, imin = -INT_MAX, imax = INT_MAX; + volatile long l0 = 0L, l1 = 1L, lm1 = -1L, lmin = -LONG_MAX, lmax = LONG_MAX; + volatile long long ll0 = 0LL, ll1 = 1LL, llm1 = -1LL; + volatile long long llmin = -__LONG_LONG_MAX__, llmax = __LONG_LONG_MAX__; + volatile intmax_t imax0 = 0, imax1 = 1, imaxm1 = -1; + volatile intmax_t imaxmin = -INTMAX_MAX, imaxmax = INTMAX_MAX; + if (__builtin_abs (i0) != 0) + abort (); + if (__builtin_abs (0) != 0) + link_error (); + if (__builtin_abs (i1) != 1) + abort (); + if (__builtin_abs (1) != 1) + link_error (); + if (__builtin_abs (im1) != 1) + abort (); + if (__builtin_abs (-1) != 1) + link_error (); + if (__builtin_abs (imin) != INT_MAX) + abort (); + if (__builtin_abs (-INT_MAX) != INT_MAX) + link_error (); + if (__builtin_abs (imax) != INT_MAX) + abort (); + if (__builtin_abs (INT_MAX) != INT_MAX) + link_error (); + if (__builtin_labs (l0) != 0L) + abort (); + if (__builtin_labs (0L) != 0L) + link_error (); + if (__builtin_labs (l1) != 1L) + abort (); + if (__builtin_labs (1L) != 1L) + link_error (); + if (__builtin_labs (lm1) != 1L) + abort (); + if (__builtin_labs (-1L) != 1L) + link_error (); + if (__builtin_labs (lmin) != LONG_MAX) + abort (); + if (__builtin_labs (-LONG_MAX) != LONG_MAX) + link_error (); + if (__builtin_labs (lmax) != LONG_MAX) + abort (); + if (__builtin_labs (LONG_MAX) != LONG_MAX) + link_error (); + if (__builtin_llabs (ll0) != 0LL) + abort (); + if (__builtin_llabs (0LL) != 0LL) + link_error (); + if (__builtin_llabs (ll1) != 1LL) + abort (); + if (__builtin_llabs (1LL) != 1LL) + link_error (); + if (__builtin_llabs (llm1) != 1LL) + abort (); + if (__builtin_llabs (-1LL) != 1LL) + link_error (); + if (__builtin_llabs (llmin) != __LONG_LONG_MAX__) + abort (); + if (__builtin_llabs (-__LONG_LONG_MAX__) != __LONG_LONG_MAX__) + link_error (); + if (__builtin_llabs (llmax) != __LONG_LONG_MAX__) + abort (); + if (__builtin_llabs (__LONG_LONG_MAX__) != __LONG_LONG_MAX__) 
+ link_error (); + if (__builtin_imaxabs (imax0) != 0) + abort (); + if (__builtin_imaxabs (0) != 0) + link_error (); + if (__builtin_imaxabs (imax1) != 1) + abort (); + if (__builtin_imaxabs (1) != 1) + link_error (); + if (__builtin_imaxabs (imaxm1) != 1) + abort (); + if (__builtin_imaxabs (-1) != 1) + link_error (); + if (__builtin_imaxabs (imaxmin) != INTMAX_MAX) + abort (); + if (__builtin_imaxabs (-INTMAX_MAX) != INTMAX_MAX) + link_error (); + if (__builtin_imaxabs (imaxmax) != INTMAX_MAX) + abort (); + if (__builtin_imaxabs (INTMAX_MAX) != INTMAX_MAX) + link_error (); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/builtins.exp URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/builtins.exp?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/builtins.exp (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/builtins.exp Wed Oct 9 04:01:46 2019 @@ -0,0 +1,60 @@ +# Copyright (C) 2003-2019 Free Software Foundation, Inc. +# +# This file is part of GCC. +# +# GCC is free software; you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation; either version 3, or (at your option) +# any later version. +# +# GCC is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with GCC; see the file COPYING3. If not see +# . + +# This harness is for testing builtin support. Each test has two files: +# +# - foo.c defines the main testing function, main_test(). 
+# - foo-lib.c implements the library functions that foo.c is testing. +# +# The functions in foo-lib.c will often want to abort on certain inputs. +# They can use the global variable inside_main to see whether they are +# being called from the test program or part of the common runtime. +# +# In many cases, the library functions will behave as normal at -O0 +# and abort when optimisation is enabled. Such implementations should +# go into the lib/ directory so that they can be included by any test +# that needs them. They shouldn't call any external functions in case +# those functions were overridden too. + +load_lib torture-options.exp +load_lib c-torture.exp + +torture-init +set-torture-options $C_TORTURE_OPTIONS {{}} $LTO_TORTURE_OPTIONS + +set additional_flags "-fno-tree-dse -fno-tree-loop-distribute-patterns -fno-tracer -fno-ipa-ra" +if [istarget "powerpc-*-darwin*"] { + lappend additional_flags "-Wl,-multiply_defined,suppress" +} +if { [istarget *-*-eabi*] + || [istarget *-*-elf] + || [istarget *-*-mingw*] + || [istarget *-*-rtems*] } { + lappend additional_flags "-Wl,--allow-multiple-definition" +} + +foreach src [lsort [find $srcdir/$subdir *.c]] { + if {![string match *-lib.c $src] && [runtest_file_p $runtests $src]} { + c-torture-execute [list $src \ + [file root $src]-lib.c \ + $srcdir/$subdir/lib/main.c] \ + $additional_flags + } +} + +torture-finish Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/chk.h URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/chk.h?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/chk.h (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/chk.h Wed Oct 9 04:01:46 2019 @@ -0,0 +1,92 @@ +#ifndef os +# define os(ptr) __builtin_object_size (ptr, 0) +#endif + +/* This is one 
of the alternatives for object size checking. + If dst has side-effects, size checking will never be done. */ +#undef memcpy +#define memcpy(dst, src, len) \ + __builtin___memcpy_chk (dst, src, len, os (dst)) +#undef mempcpy +#define mempcpy(dst, src, len) \ + __builtin___mempcpy_chk (dst, src, len, os (dst)) +#undef memmove +#define memmove(dst, src, len) \ + __builtin___memmove_chk (dst, src, len, os (dst)) +#undef memset +#define memset(dst, val, len) \ + __builtin___memset_chk (dst, val, len, os (dst)) +#undef strcpy +#define strcpy(dst, src) \ + __builtin___strcpy_chk (dst, src, os (dst)) +#undef stpcpy +#define stpcpy(dst, src) \ + __builtin___stpcpy_chk (dst, src, os (dst)) +#undef strcat +#define strcat(dst, src) \ + __builtin___strcat_chk (dst, src, os (dst)) +#undef strncpy +#define strncpy(dst, src, len) \ + __builtin___strncpy_chk (dst, src, len, os (dst)) +#undef stpncpy +#define stpncpy(dst, src, len) \ + __builtin___stpncpy_chk (dst, src, len, os (dst)) +#undef strncat +#define strncat(dst, src, len) \ + __builtin___strncat_chk (dst, src, len, os (dst)) +#undef sprintf +#define sprintf(dst, ...) \ + __builtin___sprintf_chk (dst, 0, os (dst), __VA_ARGS__) +#undef vsprintf +#define vsprintf(dst, fmt, ap) \ + __builtin___vsprintf_chk (dst, 0, os (dst), fmt, ap) +#undef snprintf +#define snprintf(dst, len, ...) \ + __builtin___snprintf_chk (dst, len, 0, os (dst), __VA_ARGS__) +#undef vsnprintf +#define vsnprintf(dst, len, fmt, ap) \ + __builtin___vsnprintf_chk (dst, len, 0, os (dst), fmt, ap) + +/* Now "redefine" even builtins for the purpose of testing. 
*/ +#undef __builtin_memcpy +#define __builtin_memcpy(dst, src, len) memcpy (dst, src, len) +#undef __builtin_mempcpy +#define __builtin_mempcpy(dst, src, len) mempcpy (dst, src, len) +#undef __builtin_memmove +#define __builtin_memmove(dst, src, len) memmove (dst, src, len) +#undef __builtin_memset +#define __builtin_memset(dst, val, len) memset (dst, val, len) +#undef __builtin_strcpy +#define __builtin_strcpy(dst, src) strcpy (dst, src) +#undef __builtin_stpcpy +#define __builtin_stpcpy(dst, src) stpcpy (dst, src) +#undef __builtin_strcat +#define __builtin_strcat(dst, src) strcat (dst, src) +#undef __builtin_strncpy +#define __builtin_strncpy(dst, src, len) strncpy (dst, src, len) +#undef __builtin_strncat +#define __builtin_strncat(dst, src, len) strncat (dst, src, len) +#undef __builtin_sprintf +#define __builtin_sprintf(dst, ...) sprintf (dst, __VA_ARGS__) +#undef __builtin_vsprintf +#define __builtin_vsprintf(dst, fmt, ap) vsprintf (dst, fmt, ap) +#undef __builtin_snprintf +#define __builtin_snprintf(dst, len, ...) snprintf (dst, len, __VA_ARGS__) +#undef __builtin_vsnprintf +#define __builtin_vsnprintf(dst, len, fmt, ap) vsnprintf (dst, len, fmt, ap) + +extern void *chk_fail_buf[]; +extern volatile int chk_fail_allowed, chk_calls; +extern volatile int memcpy_disallowed, mempcpy_disallowed, memmove_disallowed; +extern volatile int memset_disallowed, strcpy_disallowed, stpcpy_disallowed; +extern volatile int strncpy_disallowed, stpncpy_disallowed, strcat_disallowed; +extern volatile int strncat_disallowed, sprintf_disallowed, vsprintf_disallowed; +extern volatile int snprintf_disallowed, vsnprintf_disallowed; + +/* A storage class that ensures that declarations bind locally. We want + to test non-static declarations where we know it is safe to do so. 
*/ +#if __PIC__ && !__PIE__ +#define LOCAL static +#else +#define LOCAL +#endif Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/complex-1-lib.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/complex-1-lib.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/complex-1-lib.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/complex-1-lib.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,70 @@ +extern int inside_main; +extern void abort (void); +#ifdef __OPTIMIZE__ +#define ABORT_INSIDE_MAIN do { if (inside_main) abort (); } while (0) +#else +#define ABORT_INSIDE_MAIN do { } while (0) +#endif + +static float _Complex +conjf (float _Complex z) +{ + ABORT_INSIDE_MAIN; + return ~z; +} + +static double _Complex +conj (double _Complex z) +{ + ABORT_INSIDE_MAIN; + return ~z; +} + +static long double _Complex +conjl (long double _Complex z) +{ + ABORT_INSIDE_MAIN; + return ~z; +} + +static float +crealf (float _Complex z) +{ + ABORT_INSIDE_MAIN; + return __real__ z; +} + +static double +creal (double _Complex z) +{ + ABORT_INSIDE_MAIN; + return __real__ z; +} + +static long double +creall (long double _Complex z) +{ + ABORT_INSIDE_MAIN; + return __real__ z; +} + +static float +cimagf (float _Complex z) +{ + ABORT_INSIDE_MAIN; + return __imag__ z; +} + +static double +cimag (double _Complex z) +{ + ABORT_INSIDE_MAIN; + return __imag__ z; +} + +static long double +cimagl (long double _Complex z) +{ + ABORT_INSIDE_MAIN; + return __imag__ z; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/complex-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/complex-1.c?rev=374156&view=auto 
============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/complex-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/complex-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,102 @@ +/* Test for builtin conj, creal, cimag. */ +/* Origin: Joseph Myers */ + +extern float _Complex conjf (float _Complex); +extern double _Complex conj (double _Complex); +extern long double _Complex conjl (long double _Complex); + +extern float crealf (float _Complex); +extern double creal (double _Complex); +extern long double creall (long double _Complex); + +extern float cimagf (float _Complex); +extern double cimag (double _Complex); +extern long double cimagl (long double _Complex); + +extern void abort (void); +extern void link_error (void); + +void +main_test (void) +{ + /* For each type, test both runtime and compile time (constant folding) + optimization. */ + volatile float _Complex fc = 1.0F + 2.0iF; + volatile double _Complex dc = 1.0 + 2.0i; + volatile long double _Complex ldc = 1.0L + 2.0iL; + /* Test floats. */ + if (conjf (fc) != 1.0F - 2.0iF) + abort (); + if (__builtin_conjf (fc) != 1.0F - 2.0iF) + abort (); + if (conjf (1.0F + 2.0iF) != 1.0F - 2.0iF) + link_error (); + if (__builtin_conjf (1.0F + 2.0iF) != 1.0F - 2.0iF) + link_error (); + if (crealf (fc) != 1.0F) + abort (); + if (__builtin_crealf (fc) != 1.0F) + abort (); + if (crealf (1.0F + 2.0iF) != 1.0F) + link_error (); + if (__builtin_crealf (1.0F + 2.0iF) != 1.0F) + link_error (); + if (cimagf (fc) != 2.0F) + abort (); + if (__builtin_cimagf (fc) != 2.0F) + abort (); + if (cimagf (1.0F + 2.0iF) != 2.0F) + link_error (); + if (__builtin_cimagf (1.0F + 2.0iF) != 2.0F) + link_error (); + /* Test doubles. 
*/ + if (conj (dc) != 1.0 - 2.0i) + abort (); + if (__builtin_conj (dc) != 1.0 - 2.0i) + abort (); + if (conj (1.0 + 2.0i) != 1.0 - 2.0i) + link_error (); + if (__builtin_conj (1.0 + 2.0i) != 1.0 - 2.0i) + link_error (); + if (creal (dc) != 1.0) + abort (); + if (__builtin_creal (dc) != 1.0) + abort (); + if (creal (1.0 + 2.0i) != 1.0) + link_error (); + if (__builtin_creal (1.0 + 2.0i) != 1.0) + link_error (); + if (cimag (dc) != 2.0) + abort (); + if (__builtin_cimag (dc) != 2.0) + abort (); + if (cimag (1.0 + 2.0i) != 2.0) + link_error (); + if (__builtin_cimag (1.0 + 2.0i) != 2.0) + link_error (); + /* Test long doubles. */ + if (conjl (ldc) != 1.0L - 2.0iL) + abort (); + if (__builtin_conjl (ldc) != 1.0L - 2.0iL) + abort (); + if (conjl (1.0L + 2.0iL) != 1.0L - 2.0iL) + link_error (); + if (__builtin_conjl (1.0L + 2.0iL) != 1.0L - 2.0iL) + link_error (); + if (creall (ldc) != 1.0L) + abort (); + if (__builtin_creall (ldc) != 1.0L) + abort (); + if (creall (1.0L + 2.0iL) != 1.0L) + link_error (); + if (__builtin_creall (1.0L + 2.0iL) != 1.0L) + link_error (); + if (cimagl (ldc) != 2.0L) + abort (); + if (__builtin_cimagl (ldc) != 2.0L) + abort (); + if (cimagl (1.0L + 2.0iL) != 2.0L) + link_error (); + if (__builtin_cimagl (1.0L + 2.0iL) != 2.0L) + link_error (); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/fprintf-lib.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/fprintf-lib.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/fprintf-lib.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/fprintf-lib.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1 @@ +#include "lib/fprintf.c" Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/fprintf.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/fprintf.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/fprintf.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/fprintf.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,61 @@ +/* Copyright (C) 2001 Free Software Foundation. + + Ensure all expected transformations of builtin fprintf occur and + that we honor side effects in the arguments. + + Written by Kaveh R. Ghazi, 1/7/2001. */ + +#include +extern int fprintf_unlocked (FILE *, const char *, ...); +extern void abort(void); + +void +main_test (void) +{ + FILE *s_array[] = {stdout, NULL}, **s_ptr = s_array; + const char *const s1 = "hello world"; + const char *const s2[] = { s1, 0 }, *const*s3; + + fprintf (*s_ptr, ""); + fprintf (*s_ptr, "%s", ""); + fprintf (*s_ptr, "%s", "hello"); + fprintf (*s_ptr, "%s", "\n"); + fprintf (*s_ptr, "%s", *s2); + s3 = s2; + fprintf (*s_ptr, "%s", *s3++); + if (s3 != s2+1 || *s3 != 0) + abort(); + s3 = s2; + fprintf (*s_ptr++, "%s", *s3++); + if (s3 != s2+1 || *s3 != 0 || s_ptr != s_array+1 || *s_ptr != 0) + abort(); + + s_ptr = s_array; + fprintf (*s_ptr, "%c", '\n'); + fprintf (*s_ptr, "%c", **s2); + s3 = s2; + fprintf (*s_ptr, "%c", **s3++); + if (s3 != s2+1 || *s3 != 0) + abort(); + s3 = s2; + fprintf (*s_ptr++, "%c", **s3++); + if (s3 != s2+1 || *s3 != 0 || s_ptr != s_array+1 || *s_ptr != 0) + abort(); + + s_ptr = s_array; + fprintf (*s_ptr++, "hello world"); + if (s_ptr != s_array+1 || *s_ptr != 0) + abort(); + s_ptr = s_array; + fprintf (*s_ptr, "\n"); + + /* Test at least one instance of the __builtin_ style. We do this + to ensure that it works and that the prototype is correct. 
*/ + __builtin_fprintf (*s_ptr, "%s", "hello world\n"); + /* Check the unlocked style, these evaluate to nothing to avoid + problems on systems without the unlocked functions. */ + fprintf_unlocked (*s_ptr, ""); + __builtin_fprintf_unlocked (*s_ptr, ""); + fprintf_unlocked (*s_ptr, "%s", ""); + __builtin_fprintf_unlocked (*s_ptr, "%s", ""); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/fprintf.x URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/fprintf.x?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/fprintf.x (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/fprintf.x Wed Oct 9 04:01:46 2019 @@ -0,0 +1,7 @@ +load_lib target-supports.exp + +if { [check_effective_target_freestanding] } { + return 1; +} + +return 0; Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/fputs-lib.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/fputs-lib.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/fputs-lib.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/fputs-lib.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,24 @@ +#include +#include +extern void abort (void); +extern int inside_main; +extern size_t strlen(const char *); +int +fputs(const char *string, FILE *stream) +{ + size_t n = strlen(string); + size_t r; +#if defined __OPTIMIZE__ && !defined __OPTIMIZE_SIZE__ + if (inside_main) + abort(); +#endif + r = fwrite (string, 1, n, stream); + return n > r ? EOF : 0; +} + +/* Locking stdio doesn't matter for the purposes of this test. 
*/ +int +fputs_unlocked(const char *string, FILE *stream) +{ + return fputs (string, stream); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/fputs.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/fputs.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/fputs.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/fputs.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,65 @@ +/* Copyright (C) 2000, 2001 Free Software Foundation. + + Ensure all expected transformations of builtin fputs occur and that + we honor side effects in the stream argument. + + Written by Kaveh R. Ghazi, 10/30/2000. */ + +#include +extern void abort(void); + +int i; + +void +main_test(void) +{ + FILE *s_array[] = {stdout, NULL}, **s_ptr = s_array; + const char *const s1 = "hello world"; + + fputs ("", *s_ptr); + fputs ("\n", *s_ptr); + fputs ("bye", *s_ptr); + fputs (s1, *s_ptr); + fputs (s1+5, *s_ptr); + fputs (s1+10, *s_ptr); + fputs (s1+11, *s_ptr); + + /* Check side-effects when transforming fputs -> NOP. */ + fputs ("", *s_ptr++); + if (s_ptr != s_array+1 || *s_ptr != 0) + abort(); + + /* Check side-effects when transforming fputs -> fputc. */ + s_ptr = s_array; + fputs ("\n", *s_ptr++); + if (s_ptr != s_array+1 || *s_ptr != 0) + abort(); + + /* Check side-effects when transforming fputs -> fwrite. */ + s_ptr = s_array; + fputs ("hello\n", *s_ptr++); + if (s_ptr != s_array+1 || *s_ptr != 0) + abort(); + + /* Test at least one instance of the __builtin_ style. We do this + to ensure that it works and that the prototype is correct. */ + s_ptr = s_array; + __builtin_fputs ("", *s_ptr); + /* These builtin stubs are called by __builtin_fputs, ensure their + prototypes are set correctly too. 
*/ + __builtin_fputc ('\n', *s_ptr); + __builtin_fwrite ("hello\n", 1, 6, *s_ptr); + /* Check the unlocked style, these evaluate to nothing to avoid + problems on systems without the unlocked functions. */ + fputs_unlocked ("", *s_ptr); + __builtin_fputs_unlocked ("", *s_ptr); + + /* Check side-effects in conditional expression. */ + s_ptr = s_array; + fputs (i++ ? "f" : "x", *s_ptr++); + if (s_ptr != s_array+1 || *s_ptr != 0 || i != 1) + abort(); + fputs (--i ? "\n" : "\n", *--s_ptr); + if (s_ptr != s_array || i != 0) + abort(); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/fputs.x URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/fputs.x?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/fputs.x (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/fputs.x Wed Oct 9 04:01:46 2019 @@ -0,0 +1,7 @@ +load_lib target-supports.exp + +if { [check_effective_target_freestanding] } { + return 1; +} + +return 0; Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/abs.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/abs.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/abs.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/abs.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,41 @@ +extern int inside_main; +extern void abort (void); +#ifdef __OPTIMIZE__ +#define ABORT_INSIDE_MAIN do { if (inside_main) abort (); } while (0) +#else +#define ABORT_INSIDE_MAIN do { } while (0) +#endif + +typedef __INTMAX_TYPE__ intmax_t; + +__attribute__ ((__noinline__)) +int 
+abs (int x)
+{
+  ABORT_INSIDE_MAIN;
+  return x < 0 ? -x : x;
+}
+
+__attribute__ ((__noinline__))
+long
+labs (long x)
+{
+  ABORT_INSIDE_MAIN;
+  return x < 0 ? -x : x;
+}
+
+__attribute__ ((__noinline__))
+long long
+llabs (long long x)
+{
+  ABORT_INSIDE_MAIN;
+  return x < 0 ? -x : x;
+}
+
+__attribute__ ((__noinline__))
+intmax_t
+imaxabs (intmax_t x)
+{
+  ABORT_INSIDE_MAIN;
+  return x < 0 ? -x : x;
+}

Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/bfill.c
URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/bfill.c?rev=374156&view=auto
==============================================================================
--- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/bfill.c (added)
+++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/bfill.c Wed Oct 9 04:01:46 2019
@@ -0,0 +1,16 @@
+extern int inside_main;
+
+__attribute__ ((__noinline__))
+void
+bfill (void *s, __SIZE_TYPE__ n, int ch)
+{
+  char *p;
+
+  for (p = s; n-- > 0; p++)
+    *p = ch;
+
+#ifdef __OPTIMIZE__
+  if (inside_main)
+    abort ();
+#endif
+}

Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/bzero.c
URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/bzero.c?rev=374156&view=auto
==============================================================================
--- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/bzero.c (added)
+++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/bzero.c Wed Oct 9 04:01:46 2019
@@ -0,0 +1,16 @@
+extern int inside_main;
+
+__attribute__ ((__noinline__))
+void
+bzero (void *s, __SIZE_TYPE__ n)
+{
+  char *p;
+
+  for (p = s; n-- > 0; p++)
+    *p = 0;
+
+#ifdef __OPTIMIZE__
+  if (inside_main)
+    abort ();
+#endif
+}

Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/chk.c
URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/chk.c?rev=374156&view=auto
==============================================================================
--- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/chk.c (added)
+++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/chk.c Wed Oct 9 04:01:46 2019
@@ -0,0 +1,519 @@
+#include
+#ifdef __unix__
+#include
+#endif
+
+/* If some target has a Max alignment less than 16, please create
+   a #ifdef around the alignment and add your alignment. */
+#ifdef __pdp11__
+#define ALIGNMENT 2
+#else
+#define ALIGNMENT 16
+#endif
+
+extern void abort (void);
+
+extern int inside_main;
+void *chk_fail_buf[256] __attribute__((aligned (ALIGNMENT)));
+volatile int chk_fail_allowed, chk_calls;
+volatile int memcpy_disallowed, mempcpy_disallowed, memmove_disallowed;
+volatile int memset_disallowed, strcpy_disallowed, stpcpy_disallowed;
+volatile int strncpy_disallowed, stpncpy_disallowed, strcat_disallowed;
+volatile int strncat_disallowed, sprintf_disallowed, vsprintf_disallowed;
+volatile int snprintf_disallowed, vsnprintf_disallowed;
+extern __SIZE_TYPE__ strlen (const char *);
+extern int vsprintf (char *, const char *, va_list);
+
+void __attribute__((noreturn))
+__chk_fail (void)
+{
+  if (chk_fail_allowed)
+    __builtin_longjmp (chk_fail_buf, 1);
+  abort ();
+}
+
+void *
+memcpy (void *dst, const void *src, __SIZE_TYPE__ n)
+{
+  const char *srcp;
+  char *dstp;
+
+#ifdef __OPTIMIZE__
+  if (memcpy_disallowed && inside_main)
+    abort ();
+#endif
+
+  srcp = src;
+  dstp = dst;
+  while (n-- != 0)
+    *dstp++ = *srcp++;
+
+  return dst;
+}
+
+void *
+__memcpy_chk (void *dst, const void *src, __SIZE_TYPE__ n, __SIZE_TYPE__ size)
+{
+  /* If size is -1, GCC should always optimize the call into memcpy. */
+  if (size == (__SIZE_TYPE__) -1)
+    abort ();
+  ++chk_calls;
+  if (n > size)
+    __chk_fail ();
+  return memcpy (dst, src, n);
+}
+
+void *
+mempcpy (void *dst, const void *src, __SIZE_TYPE__ n)
+{
+  const char *srcp;
+  char *dstp;
+
+#ifdef __OPTIMIZE__
+  if (mempcpy_disallowed && inside_main)
+    abort ();
+#endif
+
+  srcp = src;
+  dstp = dst;
+  while (n-- != 0)
+    *dstp++ = *srcp++;
+
+  return dstp;
+}
+
+void *
+__mempcpy_chk (void *dst, const void *src, __SIZE_TYPE__ n, __SIZE_TYPE__ size)
+{
+  /* If size is -1, GCC should always optimize the call into mempcpy. */
+  if (size == (__SIZE_TYPE__) -1)
+    abort ();
+  ++chk_calls;
+  if (n > size)
+    __chk_fail ();
+  return mempcpy (dst, src, n);
+}
+
+void *
+memmove (void *dst, const void *src, __SIZE_TYPE__ n)
+{
+  const char *srcp;
+  char *dstp;
+
+#ifdef __OPTIMIZE__
+  if (memmove_disallowed && inside_main)
+    abort ();
+#endif
+
+  srcp = src;
+  dstp = dst;
+  if (srcp < dstp)
+    while (n-- != 0)
+      dstp[n] = srcp[n];
+  else
+    while (n-- != 0)
+      *dstp++ = *srcp++;
+
+  return dst;
+}
+
+void *
+__memmove_chk (void *dst, const void *src, __SIZE_TYPE__ n, __SIZE_TYPE__ size)
+{
+  /* If size is -1, GCC should always optimize the call into memmove. */
+  if (size == (__SIZE_TYPE__) -1)
+    abort ();
+  ++chk_calls;
+  if (n > size)
+    __chk_fail ();
+  return memmove (dst, src, n);
+}
+
+void *
+memset (void *dst, int c, __SIZE_TYPE__ n)
+{
+  while (n-- != 0)
+    n[(char *) dst] = c;
+
+  /* Single-byte memsets should be done inline when optimisation
+     is enabled. Do this after the copy in case we're being called to
+     initialize bss. */
+#ifdef __OPTIMIZE__
+  if (memset_disallowed && inside_main && n < 2)
+    abort ();
+#endif
+
+  return dst;
+}
+
+void *
+__memset_chk (void *dst, int c, __SIZE_TYPE__ n, __SIZE_TYPE__ size)
+{
+  /* If size is -1, GCC should always optimize the call into memset. */
+  if (size == (__SIZE_TYPE__) -1)
+    abort ();
+  ++chk_calls;
+  if (n > size)
+    __chk_fail ();
+  return memset (dst, c, n);
+}
+
+char *
+strcpy (char *d, const char *s)
+{
+  char *r = d;
+#ifdef __OPTIMIZE__
+  if (strcpy_disallowed && inside_main)
+    abort ();
+#endif
+  while ((*d++ = *s++));
+  return r;
+}
+
+char *
+__strcpy_chk (char *d, const char *s, __SIZE_TYPE__ size)
+{
+  /* If size is -1, GCC should always optimize the call into strcpy. */
+  if (size == (__SIZE_TYPE__) -1)
+    abort ();
+  ++chk_calls;
+  if (strlen (s) >= size)
+    __chk_fail ();
+  return strcpy (d, s);
+}
+
+char *
+stpcpy (char *dst, const char *src)
+{
+#ifdef __OPTIMIZE__
+  if (stpcpy_disallowed && inside_main)
+    abort ();
+#endif
+
+  while (*src != 0)
+    *dst++ = *src++;
+
+  *dst = 0;
+  return dst;
+}
+
+char *
+__stpcpy_chk (char *d, const char *s, __SIZE_TYPE__ size)
+{
+  /* If size is -1, GCC should always optimize the call into stpcpy. */
+  if (size == (__SIZE_TYPE__) -1)
+    abort ();
+  ++chk_calls;
+  if (strlen (s) >= size)
+    __chk_fail ();
+  return stpcpy (d, s);
+}
+
+char *
+stpncpy (char *dst, const char *src, __SIZE_TYPE__ n)
+{
+#ifdef __OPTIMIZE__
+  if (stpncpy_disallowed && inside_main)
+    abort ();
+#endif
+
+  for (; *src && n; n--)
+    *dst++ = *src++;
+
+  char *ret = dst;
+
+  while (n--)
+    *dst++ = 0;
+
+  return ret;
+}
+
+
+char *
+__stpncpy_chk (char *s1, const char *s2, __SIZE_TYPE__ n, __SIZE_TYPE__ size)
+{
+  /* If size is -1, GCC should always optimize the call into stpncpy. */
+  if (size == (__SIZE_TYPE__) -1)
+    abort ();
+  ++chk_calls;
+  if (n > size)
+    __chk_fail ();
+  return stpncpy (s1, s2, n);
+}
+
+char *
+strncpy (char *s1, const char *s2, __SIZE_TYPE__ n)
+{
+  char *dest = s1;
+#ifdef __OPTIMIZE__
+  if (strncpy_disallowed && inside_main)
+    abort();
+#endif
+  for (; *s2 && n; n--)
+    *s1++ = *s2++;
+  while (n--)
+    *s1++ = 0;
+  return dest;
+}
+
+char *
+__strncpy_chk (char *s1, const char *s2, __SIZE_TYPE__ n, __SIZE_TYPE__ size)
+{
+  /* If size is -1, GCC should always optimize the call into strncpy. */
+  if (size == (__SIZE_TYPE__) -1)
+    abort ();
+  ++chk_calls;
+  if (n > size)
+    __chk_fail ();
+  return strncpy (s1, s2, n);
+}
+
+char *
+strcat (char *dst, const char *src)
+{
+  char *p = dst;
+
+#ifdef __OPTIMIZE__
+  if (strcat_disallowed && inside_main)
+    abort ();
+#endif
+
+  while (*p)
+    p++;
+  while ((*p++ = *src++))
+    ;
+  return dst;
+}
+
+char *
+__strcat_chk (char *d, const char *s, __SIZE_TYPE__ size)
+{
+  /* If size is -1, GCC should always optimize the call into strcat. */
+  if (size == (__SIZE_TYPE__) -1)
+    abort ();
+  ++chk_calls;
+  if (strlen (d) + strlen (s) >= size)
+    __chk_fail ();
+  return strcat (d, s);
+}
+
+char *
+strncat (char *s1, const char *s2, __SIZE_TYPE__ n)
+{
+  char *dest = s1;
+  char c;
+#ifdef __OPTIMIZE__
+  if (strncat_disallowed && inside_main)
+    abort();
+#endif
+  while (*s1) s1++;
+  c = '\0';
+  while (n > 0)
+    {
+      c = *s2++;
+      *s1++ = c;
+      if (c == '\0')
+        return dest;
+      n--;
+    }
+  if (c != '\0')
+    *s1 = '\0';
+  return dest;
+}
+
+char *
+__strncat_chk (char *d, const char *s, __SIZE_TYPE__ n, __SIZE_TYPE__ size)
+{
+  __SIZE_TYPE__ len = strlen (d), n1 = n;
+  const char *s1 = s;
+
+  /* If size is -1, GCC should always optimize the call into strncat. */
+  if (size == (__SIZE_TYPE__) -1)
+    abort ();
+  ++chk_calls;
+  while (len < size && n1 > 0)
+    {
+      if (*s1++ == '\0')
+        break;
+      ++len;
+      --n1;
+    }
+
+  if (len >= size)
+    __chk_fail ();
+  return strncat (d, s, n);
+}
+
+/* No chk test in GCC testsuite needs more bytes than this.
+   As we can't expect vsnprintf to be available on the target,
+   assume 4096 bytes is enough. */
+static char chk_sprintf_buf[4096];
+
+int
+__sprintf_chk (char *str, int flag, __SIZE_TYPE__ size, const char *fmt, ...)
+{
+  int ret;
+  va_list ap;
+
+  /* If size is -1 and flag 0, GCC should always optimize the call into
+     sprintf. */
+  if (size == (__SIZE_TYPE__) -1 && flag == 0)
+    abort ();
+  ++chk_calls;
+#ifdef __OPTIMIZE__
+  if (sprintf_disallowed && inside_main)
+    abort();
+#endif
+  va_start (ap, fmt);
+  ret = vsprintf (chk_sprintf_buf, fmt, ap);
+  va_end (ap);
+  if (ret >= 0)
+    {
+      if (ret >= size)
+        __chk_fail ();
+      memcpy (str, chk_sprintf_buf, ret + 1);
+    }
+  return ret;
+}
+
+int
+__vsprintf_chk (char *str, int flag, __SIZE_TYPE__ size, const char *fmt,
+                va_list ap)
+{
+  int ret;
+
+  /* If size is -1 and flag 0, GCC should always optimize the call into
+     vsprintf. */
+  if (size == (__SIZE_TYPE__) -1 && flag == 0)
+    abort ();
+  ++chk_calls;
+#ifdef __OPTIMIZE__
+  if (vsprintf_disallowed && inside_main)
+    abort();
+#endif
+  ret = vsprintf (chk_sprintf_buf, fmt, ap);
+  if (ret >= 0)
+    {
+      if (ret >= size)
+        __chk_fail ();
+      memcpy (str, chk_sprintf_buf, ret + 1);
+    }
+  return ret;
+}
+
+int
+__snprintf_chk (char *str, __SIZE_TYPE__ len, int flag, __SIZE_TYPE__ size,
+                const char *fmt, ...)
+{
+  int ret;
+  va_list ap;
+
+  /* If size is -1 and flag 0, GCC should always optimize the call into
+     snprintf. */
+  if (size == (__SIZE_TYPE__) -1 && flag == 0)
+    abort ();
+  ++chk_calls;
+  if (size < len)
+    __chk_fail ();
+#ifdef __OPTIMIZE__
+  if (snprintf_disallowed && inside_main)
+    abort();
+#endif
+  va_start (ap, fmt);
+  ret = vsprintf (chk_sprintf_buf, fmt, ap);
+  va_end (ap);
+  if (ret >= 0)
+    {
+      if (ret < len)
+        memcpy (str, chk_sprintf_buf, ret + 1);
+      else
+        {
+          memcpy (str, chk_sprintf_buf, len - 1);
+          str[len - 1] = '\0';
+        }
+    }
+  return ret;
+}
+
+int
+__vsnprintf_chk (char *str, __SIZE_TYPE__ len, int flag, __SIZE_TYPE__ size,
+                 const char *fmt, va_list ap)
+{
+  int ret;
+
+  /* If size is -1 and flag 0, GCC should always optimize the call into
+     vsnprintf. */
+  if (size == (__SIZE_TYPE__) -1 && flag == 0)
+    abort ();
+  ++chk_calls;
+  if (size < len)
+    __chk_fail ();
+#ifdef __OPTIMIZE__
+  if (vsnprintf_disallowed && inside_main)
+    abort();
+#endif
+  ret = vsprintf (chk_sprintf_buf, fmt, ap);
+  if (ret >= 0)
+    {
+      if (ret < len)
+        memcpy (str, chk_sprintf_buf, ret + 1);
+      else
+        {
+          memcpy (str, chk_sprintf_buf, len - 1);
+          str[len - 1] = '\0';
+        }
+    }
+  return ret;
+}
+
+int
+snprintf (char *str, __SIZE_TYPE__ len, const char *fmt, ...)
+{
+  int ret;
+  va_list ap;
+
+#ifdef __OPTIMIZE__
+  if (snprintf_disallowed && inside_main)
+    abort();
+#endif
+  va_start (ap, fmt);
+  ret = vsprintf (chk_sprintf_buf, fmt, ap);
+  va_end (ap);
+  if (ret >= 0)
+    {
+      if (ret < len)
+        memcpy (str, chk_sprintf_buf, ret + 1);
+      else if (len)
+        {
+          memcpy (str, chk_sprintf_buf, len - 1);
+          str[len - 1] = '\0';
+        }
+    }
+  return ret;
+}
+
+/* uClibc's vsprintf calls vsnprintf. */
+#ifndef __UCLIBC__
+int
+vsnprintf (char *str, __SIZE_TYPE__ len, const char *fmt, va_list ap)
+{
+  int ret;
+
+#ifdef __OPTIMIZE__
+  if (vsnprintf_disallowed && inside_main)
+    abort();
+#endif
+  ret = vsprintf (chk_sprintf_buf, fmt, ap);
+  if (ret >= 0)
+    {
+      if (ret < len)
+        memcpy (str, chk_sprintf_buf, ret + 1);
+      else if (len)
+        {
+          memcpy (str, chk_sprintf_buf, len - 1);
+          str[len - 1] = '\0';
+        }
+    }
+  return ret;
+}
+#endif

Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/fprintf.c
URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/fprintf.c?rev=374156&view=auto
==============================================================================
--- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/fprintf.c (added)
+++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/fprintf.c Wed Oct 9 04:01:46 2019
@@ -0,0 +1,37 @@
+#include
+#include
+extern void abort (void);
+extern int inside_main;
+
+__attribute__ ((__noinline__))
+int
+fprintf (FILE *fp, const char *string, ...)
+{
+  va_list ap;
+  int r;
+#ifdef __OPTIMIZE__
+  if (inside_main)
+    abort();
+#endif
+  va_start (ap, string);
+  r = vfprintf (fp, string, ap);
+  va_end (ap);
+  return r;
+}
+
+/* Locking stdio doesn't matter for the purposes of this test. */
+__attribute__ ((__noinline__))
+int
+fprintf_unlocked (FILE *fp, const char *string, ...)
+{
+  va_list ap;
+  int r;
+#ifdef __OPTIMIZE__
+  if (inside_main)
+    abort();
+#endif
+  va_start (ap, string);
+  r = vfprintf (fp, string, ap);
+  va_end (ap);
+  return r;
+}

Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/main.c
URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/main.c?rev=374156&view=auto
==============================================================================
--- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/main.c (added)
+++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/main.c Wed Oct 9 04:01:46 2019
@@ -0,0 +1,25 @@
+extern void abort(void);
+extern void main_test (void);
+extern void abort (void);
+int inside_main;
+
+int
+main ()
+{
+  inside_main = 1;
+  main_test ();
+  inside_main = 0;
+  return 0;
+}
+
+/* When optimizing, all the constant cases should have been
+   constant folded, so no calls to link_error should remain.
+   In any case, link_error should not be called. */
+
+#ifndef __OPTIMIZE__
+void
+link_error (void)
+{
+  abort ();
+}
+#endif

Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/memchr.c
URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/memchr.c?rev=374156&view=auto
==============================================================================
--- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/memchr.c (added)
+++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/memchr.c Wed Oct 9 04:01:46 2019
@@ -0,0 +1,22 @@
+extern void abort(void);
+extern int inside_main;
+
+__attribute__ ((__noinline__))
+void *
+memchr (const void *s, int c, __SIZE_TYPE__ n)
+{
+  const unsigned char uc = c;
+  const unsigned char *sp;
+
+#ifdef __OPTIMIZE__
+  if (inside_main)
+    abort ();
+#endif
+
+  sp = s;
+  for (; n != 0; ++sp, --n)
+    if (*sp == uc)
+      return (void *) sp;
+
+  return 0;
+}

Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/memcmp.c
URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/memcmp.c?rev=374156&view=auto
==============================================================================
--- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/memcmp.c (added)
+++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/memcmp.c Wed Oct 9 04:01:46 2019
@@ -0,0 +1,23 @@
+extern void abort(void);
+extern int inside_main;
+
+__attribute__ ((__noinline__))
+int
+memcmp (const void *s1, const void *s2, __SIZE_TYPE__ len)
+{
+  const unsigned char *sp1, *sp2;
+
+#ifdef __OPTIMIZE__
+  if (inside_main)
+    abort ();
+#endif
+
+  sp1 = s1;
+  sp2 = s2;
+  while (len != 0 && *sp1 == *sp2)
+    sp1++, sp2++, len--;
+
+  if (len == 0)
+    return 0;
+  return *sp1 - *sp2;
+}

Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/memmove.c
URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/memmove.c?rev=374156&view=auto
==============================================================================
--- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/memmove.c (added)
+++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/memmove.c Wed Oct 9 04:01:46 2019
@@ -0,0 +1,32 @@
+extern void abort (void);
+extern int inside_main;
+
+__attribute__ ((__noinline__))
+void *
+memmove (void *dst, const void *src, __SIZE_TYPE__ n)
+{
+  char *dstp;
+  const char *srcp;
+
+#ifdef __OPTIMIZE__
+  if (inside_main)
+    abort ();
+#endif
+
+  srcp = src;
+  dstp = dst;
+  if (srcp < dstp)
+    while (n-- != 0)
+      dstp[n] = srcp[n];
+  else
+    while (n-- != 0)
+      *dstp++ = *srcp++;
+
+  return dst;
+}
+
+void
+bcopy (const void *src, void *dst, __SIZE_TYPE__ n)
+{
+  memmove (dst, src, n);
+}

Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/mempcpy.c
URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/mempcpy.c?rev=374156&view=auto
==============================================================================
--- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/mempcpy.c (added)
+++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/mempcpy.c Wed Oct 9 04:01:46 2019
@@ -0,0 +1,22 @@
+extern void abort (void);
+extern int inside_main;
+
+__attribute__ ((__noinline__))
+void *
+mempcpy (void *dst, const void *src, __SIZE_TYPE__ n)
+{
+  const char *srcp;
+  char *dstp;
+
+#ifdef __OPTIMIZE__
+  if (inside_main)
+    abort ();
+#endif
+
+  srcp = src;
+  dstp = dst;
+  while (n-- != 0)
+    *dstp++ = *srcp++;
+
+  return dstp;
+}

Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/memset.c
URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/memset.c?rev=374156&view=auto
==============================================================================
--- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/memset.c (added)
+++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/memset.c Wed Oct 9 04:01:46 2019
@@ -0,0 +1,20 @@
+extern void abort (void);
+extern int inside_main;
+
+__attribute__ ((__noinline__))
+void *
+memset (void *dst, int c, __SIZE_TYPE__ n)
+{
+  while (n-- != 0)
+    n[(char *) dst] = c;
+
+  /* Single-byte memsets should be done inline when optimisation
+     is enabled. Do this after the copy in case we're being called to
+     initialize bss. */
+#ifdef __OPTIMIZE__
+  if (inside_main && n < 2)
+    abort ();
+#endif
+
+  return dst;
+}

Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/printf.c
URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/printf.c?rev=374156&view=auto
==============================================================================
--- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/printf.c (added)
+++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/printf.c Wed Oct 9 04:01:46 2019
@@ -0,0 +1,38 @@
+#include
+#include
+extern void abort (void);
+extern int inside_main;
+
+__attribute__ ((__noinline__))
+int
+printf (const char *string, ...)
+{
+  va_list ap;
+  int r;
+#ifdef __OPTIMIZE__
+  if (inside_main)
+    abort();
+#endif
+  va_start (ap, string);
+  r = vprintf (string, ap);
+  va_end (ap);
+  return r;
+}
+
+
+/* Locking stdio doesn't matter for the purposes of this test. */
+__attribute__ ((__noinline__))
+int
+printf_unlocked (const char *string, ...)
+{
+  va_list ap;
+  int r;
+#ifdef __OPTIMIZE__
+  if (inside_main)
+    abort();
+#endif
+  va_start (ap, string);
+  r = vprintf (string, ap);
+  va_end (ap);
+  return r;
+}

Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/sprintf.c
URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/sprintf.c?rev=374156&view=auto
==============================================================================
--- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/sprintf.c (added)
+++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/sprintf.c Wed Oct 9 04:01:46 2019
@@ -0,0 +1,20 @@
+#include
+#include
+extern void abort (void);
+extern int inside_main;
+
+__attribute__ ((__noinline__))
+int
+(sprintf) (char *buf, const char *fmt, ...)
+{
+  va_list ap;
+  int r;
+#ifdef __OPTIMIZE__
+  if (inside_main)
+    abort ();
+#endif
+  va_start (ap, fmt);
+  r = vsprintf (buf, fmt, ap);
+  va_end (ap);
+  return r;
+}

Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/stpcpy.c
URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/stpcpy.c?rev=374156&view=auto
==============================================================================
--- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/stpcpy.c (added)
+++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/stpcpy.c Wed Oct 9 04:01:46 2019
@@ -0,0 +1,18 @@
+extern void abort (void);
+extern int inside_main;
+
+__attribute__ ((__noinline__))
+char *
+stpcpy (char *dst, const char *src)
+{
+#ifdef __OPTIMIZE__
+  if (inside_main)
+    abort ();
+#endif
+
+  while (*src != 0)
+    *dst++ = *src++;
+
+  *dst = 0;
+  return dst;
+}

Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/strcat.c
URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/strcat.c?rev=374156&view=auto
==============================================================================
--- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/strcat.c (added)
+++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/strcat.c Wed Oct 9 04:01:46 2019
@@ -0,0 +1,20 @@
+extern int inside_main;
+extern void abort(void);
+
+__attribute__ ((__noinline__))
+char *
+strcat (char *dst, const char *src)
+{
+  char *p = dst;
+
+#ifdef __OPTIMIZE__
+  if (inside_main)
+    abort ();
+#endif
+
+  while (*p)
+    p++;
+  while ((*p++ = *src++))
+    ;
+  return dst;
+}

Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/strchr.c
URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/strchr.c?rev=374156&view=auto
==============================================================================
--- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/strchr.c (added)
+++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/strchr.c Wed Oct 9 04:01:46 2019
@@ -0,0 +1,28 @@
+extern void abort (void);
+extern int inside_main;
+
+__attribute__ ((__noinline__))
+char *
+strchr (const char *s, int c)
+{
+#ifdef __OPTIMIZE__
+  if (inside_main)
+    abort ();
+#endif
+
+  for (;;)
+    {
+      if (*s == c)
+        return (char *) s;
+      if (*s == 0)
+        return 0;
+      s++;
+    }
+}
+
+__attribute__ ((__noinline__))
+char *
+index (const char *s, int c)
+{
+  return strchr (s, c);
+}

Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/strcmp.c
URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/strcmp.c?rev=374156&view=auto
==============================================================================
--- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/strcmp.c (added)
+++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/strcmp.c Wed Oct 9 04:01:46 2019
@@ -0,0 +1,19 @@
+extern void abort (void);
+extern int inside_main;
+
+__attribute__ ((__noinline__))
+int
+strcmp (const char *s1, const char *s2)
+{
+#ifdef __OPTIMIZE__
+  if (inside_main)
+    abort ();
+#endif
+
+  while (*s1 != 0 && *s1 == *s2)
+    s1++, s2++;
+
+  if (*s1 == 0 || *s2 == 0)
+    return (unsigned char) *s1 - (unsigned char) *s2;
+  return *s1 - *s2;
+}

Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/strcpy.c
URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/strcpy.c?rev=374156&view=auto
==============================================================================
--- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/strcpy.c (added)
+++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/strcpy.c Wed Oct 9 04:01:46 2019
@@ -0,0 +1,15 @@
+extern void abort (void);
+extern int inside_main;
+
+__attribute__ ((__noinline__))
+char *
+strcpy (char *d, const char *s)
+{
+  char *r = d;
+#if defined __OPTIMIZE__ && !defined __OPTIMIZE_SIZE__
+  if (inside_main)
+    abort ();
+#endif
+  while ((*d++ = *s++));
+  return r;
+}

Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/strcspn.c
URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/strcspn.c?rev=374156&view=auto
==============================================================================
--- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/strcspn.c (added)
+++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/strcspn.c Wed Oct 9 04:01:46 2019
@@ -0,0 +1,22 @@
+extern void abort (void);
+extern int inside_main;
+
+__attribute__ ((__noinline__))
+__SIZE_TYPE__
+strcspn (const char *s1, const char *s2)
+{
+  const char *p, *q;
+
+#ifdef __OPTIMIZE__
+  if (inside_main)
+    abort();
+#endif
+
+  for (p = s1; *p; p++)
+    for (q = s2; *q; q++)
+      if (*p == *q)
+        goto found;
+
+ found:
+  return p - s1;
+}

Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/strlen.c
URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/strlen.c?rev=374156&view=auto
==============================================================================
--- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/strlen.c (added)
+++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/strlen.c Wed Oct 9 04:01:46 2019
@@ -0,0 +1,20 @@
+extern void abort (void);
+extern int inside_main;
+
+__attribute__ ((__noinline__))
+__SIZE_TYPE__
+strlen (const char *s)
+{
+  __SIZE_TYPE__ i;
+
+#ifdef __OPTIMIZE__
+  if (inside_main)
+    abort ();
+#endif
+
+  i = 0;
+  while (s[i] != 0)
+    i++;
+
+  return i;
+}

Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/strncat.c
URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/strncat.c?rev=374156&view=auto
==============================================================================
--- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/strncat.c (added)
+++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/strncat.c Wed Oct 9 04:01:46 2019
@@ -0,0 +1,29 @@
+extern void abort(void);
+extern int inside_main;
+
+typedef __SIZE_TYPE__ size_t;
+
+__attribute__ ((__noinline__))
+char *
+strncat (char *s1, const char *s2, size_t n)
+{
+  char *dest = s1;
+  char c = '\0';
+#ifdef __OPTIMIZE__
+  if (inside_main)
+    abort();
+#endif
+  while (*s1) s1++;
+  c = '\0';
+  while (n > 0)
+    {
+      c = *s2++;
+      *s1++ = c;
+      if (c == '\0')
+        return dest;
+      n--;
+    }
+  if (c != '\0')
+    *s1 = '\0';
+  return dest;
+}

Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/strncmp.c
URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/strncmp.c?rev=374156&view=auto
==============================================================================
--- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/strncmp.c (added)
+++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/strncmp.c Wed Oct 9 04:01:46 2019
@@ -0,0 +1,27 @@
+extern void abort (void);
+extern int inside_main;
+
+typedef __SIZE_TYPE__ size_t;
+
+__attribute__ ((__noinline__))
+int
+strncmp(const char *s1, const char *s2, size_t n)
+{
+  const unsigned char *u1 = (const unsigned char *)s1;
+  const unsigned char *u2 = (const unsigned char *)s2;
+  unsigned char c1, c2;
+
+#ifdef __OPTIMIZE__
+  if (inside_main)
+    abort();
+#endif
+
+  while (n > 0)
+    {
+      c1 = *u1++, c2 = *u2++;
+      if (c1 == '\0' || c1 != c2)
+        return c1 - c2;
+      n--;
+    }
+  return c1 - c2;
+}

Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/strncpy.c
URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/strncpy.c?rev=374156&view=auto
==============================================================================
--- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/strncpy.c (added)
+++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/strncpy.c Wed Oct 9 04:01:46 2019
@@ -0,0 +1,20 @@
+extern void abort(void);
+extern int inside_main;
+
+typedef __SIZE_TYPE__ size_t;
+
+__attribute__ ((__noinline__))
+char *
+strncpy(char *s1, const char *s2, size_t n)
+{
+  char *dest = s1;
+#ifdef __OPTIMIZE__
+  if (inside_main)
+    abort();
+#endif
+  for (; *s2 && n; n--)
+    *s1++ = *s2++;
+  while (n--)
+    *s1++ = 0;
+  return dest;
+}

Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/strnlen.c
URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/strnlen.c?rev=374156&view=auto
==============================================================================
--- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/strnlen.c (added)
+++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/strnlen.c Wed Oct 9 04:01:46 2019
@@ -0,0 +1,22 @@
+typedef __SIZE_TYPE__ size_t;
+
+extern void abort (void);
+extern int inside_main;
+
+__attribute__ ((__noinline__))
+size_t
+strnlen (const char *s, size_t n)
+{
+  size_t i;
+
+#ifdef __OPTIMIZE__
+  if (inside_main)
+    abort ();
+#endif
+
+  i = 0;
+  while (s[i] != 0 && n--)
+    i++;
+
+  return i;
+}

Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/strpbrk.c
URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/strpbrk.c?rev=374156&view=auto
==============================================================================
--- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/strpbrk.c (added)
+++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/strpbrk.c Wed Oct 9 04:01:46 2019
@@ -0,0 +1,21 @@
+extern void abort (void);
+extern int inside_main;
+
+__attribute__ ((__noinline__))
+char *
+strpbrk(const char *s1, const char *s2)
+{
+  const char *p;
+#ifdef __OPTIMIZE__
+  if (inside_main)
+    abort ();
+#endif
+  while (*s1)
+    {
+      for (p = s2; *p; p++)
+        if (*s1 == *p)
+          return (char *)s1;
+      s1++;
+    }
+  return 0;
+}

Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/strrchr.c
URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/strrchr.c?rev=374156&view=auto
==============================================================================
--- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/strrchr.c (added)
+++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/strrchr.c Wed Oct 9 04:01:46 2019
@@ -0,0 +1,32 @@
+extern void abort (void);
+extern int inside_main;
+
+__attribute__ ((__noinline__))
+char *
+strrchr (const char *s, int c)
+{
+  __SIZE_TYPE__ i;
+
+#ifdef __OPTIMIZE__
+  if (inside_main)
+    abort ();
+#endif
+
+  i = 0;
+  while (s[i] != 0)
+    i++;
+
+  do
+    if (s[i] == c)
+      return (char *) s + i;
+  while (i-- != 0);
+
+  return 0;
+}
+
+__attribute__ ((__noinline__))
+char *
+rindex (const char *s, int c)
+{
+  return strrchr (s, c);
+}

Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/strspn.c
URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/strspn.c?rev=374156&view=auto
==============================================================================
--- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/strspn.c (added)
+++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/strspn.c Wed Oct 9 04:01:46 2019
@@ -0,0 +1,25 @@
+extern void abort (void);
+extern int inside_main;
+
+__attribute__ ((__noinline__))
+__SIZE_TYPE__
+strcspn (const char *s1, const char *s2)
+{
+  const char *p, *q;
+
+#ifdef __OPTIMIZE__
+  if (inside_main)
+    abort();
+#endif
+
+  for (p = s1; *p; p++)
+    {
+      for (q = s2; *q; q++)
+        if (*p == *q)
+          goto proceed;
+      break;
+
+    proceed:;
+    }
+  return p - s1;
+}

Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/strstr.c
URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/strstr.c?rev=374156&view=auto
==============================================================================
--- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/strstr.c (added)
+++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/lib/strstr.c Wed Oct 9 04:01:46 2019
@@ -0,0 +1,29 @@
+extern void abort (void);
+extern int inside_main;
+
+__attribute__ ((__noinline__))
+char *
+strstr(const char *s1, const char *s2)
+{
+  const char *p, *q;
+
+#ifdef __OPTIMIZE__
+  if (inside_main)
+    abort ();
+#endif
+
+  /* deliberately dumb algorithm */
+  for (; *s1; s1++)
+    {
+      p = s1, q = s2;
+      while (*q && *p)
+        {
+          if (*q != *p)
+            break;
+          p++, q++;
+        }
+      if (*q == 0)
+        return (char *)s1;
+    }
+  return 0;
+}

Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memchr-lib.c
URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memchr-lib.c?rev=374156&view=auto
==============================================================================
--- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memchr-lib.c (added)
+++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memchr-lib.c Wed Oct 9 04:01:46 2019
@@ -0,0 +1 @@
+#include "lib/memchr.c"

Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memchr.c
URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memchr.c?rev=374156&view=auto
==============================================================================
--- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memchr.c (added)
+++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memchr.c Wed Oct 9 04:01:46 2019
@@ -0,0 +1,38 @@
+/* Copyright 
(C) 2007 Free Software Foundation. + + Ensure all expected transformations of builtin memchr occur + and perform correctly. + + Written by Paolo Carlini, 10/5/2007. */ + +extern void abort (void); +typedef __SIZE_TYPE__ size_t; +extern void *memchr (const void *, int, size_t); + +void +main_test (void) +{ + const char* const foo1 = "hello world"; + + if (memchr (foo1, 'x', 11)) + abort (); + if (memchr (foo1, 'o', 11) != foo1 + 4) + abort (); + if (memchr (foo1, 'w', 2)) + abort (); + if (memchr (foo1 + 5, 'o', 6) != foo1 + 7) + abort (); + if (memchr (foo1, 'd', 11) != foo1 + 10) + abort (); + if (memchr (foo1, 'd', 10)) + abort (); + if (memchr (foo1, '\0', 11)) + abort (); + if (memchr (foo1, '\0', 12) != foo1 + 11) + abort (); + + /* Test at least one instance of the __builtin_ style. We do this + to ensure that it works and that the prototype is correct. */ + if (__builtin_memchr (foo1, 'r', 11) != foo1 + 8) + abort (); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memcmp-lib.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memcmp-lib.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memcmp-lib.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memcmp-lib.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1 @@ +#include "lib/memcmp.c" Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memcmp.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memcmp.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memcmp.c (added) +++ 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memcmp.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,41 @@ +/* Copyright (C) 2001 Free Software Foundation. + + Ensure that short builtin memcmp are optimized and perform correctly. + On architectures with a cmpstrsi instruction, this test doesn't determine + which optimization is being performed, but it does check for correctness. + + Written by Roger Sayle, 12/02/2001. + Additional tests by Roger Sayle after PR 3508, 12/26/2001. */ + +extern void abort (void); +typedef __SIZE_TYPE__ size_t; +extern int memcmp (const void *, const void *, size_t); +extern char *strcpy (char *, const char *); +extern void link_error (void); + +void +main_test (void) +{ + char str[8]; + + strcpy (str, "3141"); + + if ( memcmp (str, str+2, 0) != 0 ) + abort (); + if ( memcmp (str+1, str+3, 0) != 0 ) + abort (); + + if ( memcmp (str+1, str+3, 1) != 0 ) + abort (); + if ( memcmp (str, str+2, 1) >= 0 ) + abort (); + if ( memcmp (str+2, str, 1) <= 0 ) + abort (); + + if (memcmp ("abcd", "efgh", 4) >= 0) + link_error (); + if (memcmp ("abcd", "abcd", 4) != 0) + link_error (); + if (memcmp ("efgh", "abcd", 4) <= 0) + link_error (); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memcpy-chk-lib.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memcpy-chk-lib.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memcpy-chk-lib.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memcpy-chk-lib.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1 @@ +#include "lib/chk.c" Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memcpy-chk.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memcpy-chk.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memcpy-chk.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memcpy-chk.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,479 @@ +/* Copyright (C) 2004, 2005 Free Software Foundation. + + Ensure builtin __memcpy_chk performs correctly. */ + +extern void abort (void); +typedef __SIZE_TYPE__ size_t; +extern size_t strlen(const char *); +extern void *memcpy (void *, const void *, size_t); +extern int memcmp (const void *, const void *, size_t); + +#include "chk.h" + +const char s1[] = "123"; +char p[32] = ""; +volatile char *s2 = "defg"; /* prevent constant propagation to happen when whole program assumptions are made. */ +volatile char *s3 = "FGH"; /* prevent constant propagation to happen when whole program assumptions are made. */ +volatile size_t l1 = 1; /* prevent constant propagation to happen when whole program assumptions are made. */ + +void +__attribute__((noinline)) +test1 (void) +{ + int i; + +#if defined __i386__ || defined __x86_64__ + /* The functions below might not be optimized into direct stores on all + arches. It depends on how many instructions would be generated and + what limits the architecture chooses in STORE_BY_PIECES_P. */ + memcpy_disallowed = 1; +#endif + + /* All the memcpy calls in this routine except last have fixed length, so + object size checking should be done at compile time if optimizing. 
*/ + chk_calls = 0; + + if (memcpy (p, "ABCDE", 6) != p || memcmp (p, "ABCDE", 6)) + abort (); + if (memcpy (p + 16, "VWX" + 1, 2) != p + 16 + || memcmp (p + 16, "WX\0\0", 5)) + abort (); + if (memcpy (p + 1, "", 1) != p + 1 || memcmp (p, "A\0CDE", 6)) + abort (); + if (memcpy (p + 3, "FGHI", 4) != p + 3 || memcmp (p, "A\0CFGHI", 8)) + abort (); + + i = 8; + memcpy (p + 20, "qrstu", 6); + memcpy (p + 25, "QRSTU", 6); + if (memcpy (p + 25 + 1, s1, 3) != p + 25 + 1 + || memcmp (p + 25, "Q123U", 6)) + abort (); + + if (memcpy (memcpy (p, "abcdEFG", 4) + 4, "efg", 4) != p + 4 + || memcmp (p, "abcdefg", 8)) + abort(); + + /* Test at least one instance of the __builtin_ style. We do this + to ensure that it works and that the prototype is correct. */ + if (__builtin_memcpy (p, "ABCDE", 6) != p || memcmp (p, "ABCDE", 6)) + abort (); + + memcpy (p + 5, s3, 1); + if (memcmp (p, "ABCDEFg", 8)) + abort (); + + memcpy_disallowed = 0; + if (chk_calls) + abort (); + chk_calls = 0; + + memcpy (p + 6, s1 + 1, l1); + if (memcmp (p, "ABCDEF2", 8)) + abort (); + + /* The above memcpy copies into an object with known size, but + unknown length, so it should be a __memcpy_chk call. */ + if (chk_calls != 1) + abort (); +} + +long buf1[64]; +char *buf2 = (char *) (buf1 + 32); +long buf5[20]; +char buf7[20]; + +void +__attribute__((noinline)) +test2_sub (long *buf3, char *buf4, char *buf6, int n) +{ + int i = 0; + + /* All the memcpy/__builtin_memcpy/__builtin___memcpy_chk + calls in this routine are either fixed length, or have + side-effects in __builtin_object_size arguments, or + dst doesn't point into a known object. */ + chk_calls = 0; + + /* These should probably be handled by store_by_pieces on most arches. 
*/ + if (memcpy (buf1, "ABCDEFGHI", 9) != (char *) buf1 + || memcmp (buf1, "ABCDEFGHI\0", 11)) + abort (); + + if (memcpy (buf1, "abcdefghijklmnopq", 17) != (char *) buf1 + || memcmp (buf1, "abcdefghijklmnopq\0", 19)) + abort (); + + if (__builtin_memcpy (buf3, "ABCDEF", 6) != (char *) buf1 + || memcmp (buf1, "ABCDEFghijklmnopq\0", 19)) + abort (); + + if (__builtin_memcpy (buf3, "a", 1) != (char *) buf1 + || memcmp (buf1, "aBCDEFghijklmnopq\0", 19)) + abort (); + + if (memcpy ((char *) buf3 + 2, "bcd" + ++i, 2) != (char *) buf1 + 2 + || memcmp (buf1, "aBcdEFghijklmnopq\0", 19) + || i != 1) + abort (); + + /* These should probably be handled by move_by_pieces on most arches. */ + if (memcpy ((char *) buf3 + 4, buf5, 6) != (char *) buf1 + 4 + || memcmp (buf1, "aBcdRSTUVWklmnopq\0", 19)) + abort (); + + if (__builtin_memcpy ((char *) buf1 + ++i + 8, (char *) buf5 + 1, 1) + != (char *) buf1 + 10 + || memcmp (buf1, "aBcdRSTUVWSlmnopq\0", 19) + || i != 2) + abort (); + + if (memcpy ((char *) buf3 + 14, buf6, 2) != (char *) buf1 + 14 + || memcmp (buf1, "aBcdRSTUVWSlmnrsq\0", 19)) + abort (); + + if (memcpy (buf3, buf5, 8) != (char *) buf1 + || memcmp (buf1, "RSTUVWXYVWSlmnrsq\0", 19)) + abort (); + + if (memcpy (buf3, buf5, 17) != (char *) buf1 + || memcmp (buf1, "RSTUVWXYZ01234567\0", 19)) + abort (); + + __builtin_memcpy (buf3, "aBcdEFghijklmnopq\0", 19); + + /* These should be handled either by movmemendM or memcpy + call. */ + + /* buf3 points to an unknown object, so __memcpy_chk should not be done. */ + if (memcpy ((char *) buf3 + 4, buf5, n + 6) != (char *) buf1 + 4 + || memcmp (buf1, "aBcdRSTUVWklmnopq\0", 19)) + abort (); + + /* This call has side-effects in dst, therefore no checking. 
*/ + if (__builtin___memcpy_chk ((char *) buf1 + ++i + 8, (char *) buf5 + 1, + n + 1, os ((char *) buf1 + ++i + 8)) + != (char *) buf1 + 11 + || memcmp (buf1, "aBcdRSTUVWkSmnopq\0", 19) + || i != 3) + abort (); + + if (memcpy ((char *) buf3 + 14, buf6, n + 2) != (char *) buf1 + 14 + || memcmp (buf1, "aBcdRSTUVWkSmnrsq\0", 19)) + abort (); + + i = 1; + + /* These might be handled by store_by_pieces. */ + if (memcpy (buf2, "ABCDEFGHI", 9) != buf2 + || memcmp (buf2, "ABCDEFGHI\0", 11)) + abort (); + + if (memcpy (buf2, "abcdefghijklmnopq", 17) != buf2 + || memcmp (buf2, "abcdefghijklmnopq\0", 19)) + abort (); + + if (__builtin_memcpy (buf4, "ABCDEF", 6) != buf2 + || memcmp (buf2, "ABCDEFghijklmnopq\0", 19)) + abort (); + + if (__builtin_memcpy (buf4, "a", 1) != buf2 + || memcmp (buf2, "aBCDEFghijklmnopq\0", 19)) + abort (); + + if (memcpy (buf4 + 2, "bcd" + i++, 2) != buf2 + 2 + || memcmp (buf2, "aBcdEFghijklmnopq\0", 19) + || i != 2) + abort (); + + /* These might be handled by move_by_pieces. */ + if (memcpy (buf4 + 4, buf7, 6) != buf2 + 4 + || memcmp (buf2, "aBcdRSTUVWklmnopq\0", 19)) + abort (); + + /* Side effect. */ + if (__builtin___memcpy_chk (buf2 + i++ + 8, buf7 + 1, 1, + os (buf2 + i++ + 8)) + != buf2 + 10 + || memcmp (buf2, "aBcdRSTUVWSlmnopq\0", 19) + || i != 3) + abort (); + + if (memcpy (buf4 + 14, buf6, 2) != buf2 + 14 + || memcmp (buf2, "aBcdRSTUVWSlmnrsq\0", 19)) + abort (); + + __builtin_memcpy (buf4, "aBcdEFghijklmnopq\0", 19); + + /* These should be handled either by movmemendM or memcpy + call. */ + if (memcpy (buf4 + 4, buf7, n + 6) != buf2 + 4 + || memcmp (buf2, "aBcdRSTUVWklmnopq\0", 19)) + abort (); + + /* Side effect. 
*/ + if (__builtin___memcpy_chk (buf2 + i++ + 8, buf7 + 1, n + 1, + os (buf2 + i++ + 8)) + != buf2 + 11 + || memcmp (buf2, "aBcdRSTUVWkSmnopq\0", 19) + || i != 4) + abort (); + + if (memcpy (buf4 + 14, buf6, n + 2) != buf2 + 14 + || memcmp (buf2, "aBcdRSTUVWkSmnrsq\0", 19)) + abort (); + + if (chk_calls) + abort (); +} + +void +__attribute__((noinline)) +test2 (void) +{ + long *x; + char *y; + int z; + __builtin_memcpy (buf5, "RSTUVWXYZ0123456789", 20); + __builtin_memcpy (buf7, "RSTUVWXYZ0123456789", 20); + __asm ("" : "=r" (x) : "0" (buf1)); + __asm ("" : "=r" (y) : "0" (buf2)); + __asm ("" : "=r" (z) : "0" (0)); + test2_sub (x, y, "rstuvwxyz", z); +} + +/* Test whether compile time checking is done where it should + and so is runtime object size checking. */ +void +__attribute__((noinline)) +test3 (void) +{ + struct A { char buf1[10]; char buf2[10]; } a; + char *r = l1 == 1 ? &a.buf1[5] : &a.buf2[4]; + char buf3[20]; + int i; + size_t l; + + /* The following calls should do runtime checking + - length is not known, but destination is. */ + chk_calls = 0; + memcpy (a.buf1 + 2, s3, l1); + memcpy (r, s3, l1 + 1); + r = l1 == 1 ? __builtin_alloca (4) : &a.buf2[7]; + memcpy (r, s2, l1 + 2); + memcpy (r + 2, s3, l1); + r = buf3; + for (i = 0; i < 4; ++i) + { + if (i == l1 - 1) + r = &a.buf1[1]; + else if (i == l1) + r = &a.buf2[7]; + else if (i == l1 + 1) + r = &buf3[5]; + else if (i == l1 + 2) + r = &a.buf1[9]; + } + memcpy (r, s2, l1); + if (chk_calls != 5) + abort (); + + /* Following have known destination and known length, + so if optimizing certainly shouldn't result in the checking + variants. */ + chk_calls = 0; + memcpy (a.buf1 + 2, s3, 1); + memcpy (r, s3, 2); + r = l1 == 1 ? 
__builtin_alloca (4) : &a.buf2[7]; + memcpy (r, s2, 3); + r = buf3; + l = 4; + for (i = 0; i < 4; ++i) + { + if (i == l1 - 1) + r = &a.buf1[1], l = 2; + else if (i == l1) + r = &a.buf2[7], l = 3; + else if (i == l1 + 1) + r = &buf3[5], l = 4; + else if (i == l1 + 2) + r = &a.buf1[9], l = 1; + } + memcpy (r, s2, 1); + /* Here, l is known to be at most 4 and __builtin_object_size (&buf3[16], 0) + is 4, so this doesn't need runtime checking. */ + memcpy (&buf3[16], s2, l); + if (chk_calls) + abort (); + chk_calls = 0; +} + +/* Test whether runtime and/or compile time checking catches + buffer overflows. */ +void +__attribute__((noinline)) +test4 (void) +{ + struct A { char buf1[10]; char buf2[10]; } a; + char buf3[20]; + + chk_fail_allowed = 1; + /* Runtime checks. */ + if (__builtin_setjmp (chk_fail_buf) == 0) + { + memcpy (&a.buf2[9], s2, l1 + 1); + abort (); + } + if (__builtin_setjmp (chk_fail_buf) == 0) + { + memcpy (&a.buf2[7], s3, strlen (s3) + 1); + abort (); + } + /* This should be detectable at compile time already. */ + if (__builtin_setjmp (chk_fail_buf) == 0) + { + memcpy (&buf3[19], "ab", 2); + abort (); + } + chk_fail_allowed = 0; +} + +#ifndef MAX_OFFSET +#define MAX_OFFSET (sizeof (long long)) +#endif + +#ifndef MAX_COPY +#define MAX_COPY (10 * sizeof (long long)) +#endif + +#ifndef MAX_EXTRA +#define MAX_EXTRA (sizeof (long long)) +#endif + +#define MAX_LENGTH (MAX_OFFSET + MAX_COPY + MAX_EXTRA) + +/* Use a sequence length that is not divisible by two, to make it more + likely to detect when words are mixed up. 
*/ +#define SEQUENCE_LENGTH 31 + +static union { + char buf[MAX_LENGTH]; + long long align_int; + long double align_fp; +} u1, u2; + +void +__attribute__((noinline)) +test5 (void) +{ + int off1, off2, len, i; + char *p, *q, c; + + for (off1 = 0; off1 < MAX_OFFSET; off1++) + for (off2 = 0; off2 < MAX_OFFSET; off2++) + for (len = 1; len < MAX_COPY; len++) + { + for (i = 0, c = 'A'; i < MAX_LENGTH; i++, c++) + { + u1.buf[i] = 'a'; + if (c >= 'A' + SEQUENCE_LENGTH) + c = 'A'; + u2.buf[i] = c; + } + + p = memcpy (u1.buf + off1, u2.buf + off2, len); + if (p != u1.buf + off1) + abort (); + + q = u1.buf; + for (i = 0; i < off1; i++, q++) + if (*q != 'a') + abort (); + + for (i = 0, c = 'A' + off2; i < len; i++, q++, c++) + { + if (c >= 'A' + SEQUENCE_LENGTH) + c = 'A'; + if (*q != c) + abort (); + } + + for (i = 0; i < MAX_EXTRA; i++, q++) + if (*q != 'a') + abort (); + } +} + +#define TESTSIZE 80 + +char srcb[TESTSIZE] __attribute__ ((aligned)); +char dstb[TESTSIZE] __attribute__ ((aligned)); + +void +__attribute__((noinline)) +check (char *test, char *match, int n) +{ + if (memcmp (test, match, n)) + abort (); +} + +#define TN(n) \ +{ memset (dstb, 0, n); memcpy (dstb, srcb, n); check (dstb, srcb, n); } +#define T(n) \ +TN (n) \ +TN ((n) + 1) \ +TN ((n) + 2) \ +TN ((n) + 3) + +void +__attribute__((noinline)) +test6 (void) +{ + int i; + + chk_calls = 0; + + for (i = 0; i < sizeof (srcb); ++i) + srcb[i] = 'a' + i % 26; + + T (0); + T (4); + T (8); + T (12); + T (16); + T (20); + T (24); + T (28); + T (32); + T (36); + T (40); + T (44); + T (48); + T (52); + T (56); + T (60); + T (64); + T (68); + T (72); + T (76); + + /* All memcpy calls in this routine have constant arguments. */ + if (chk_calls) + abort (); +} + +void +main_test (void) +{ +#ifndef __OPTIMIZE__ + /* Object size checking is only intended for -O[s123]. 
*/ + return; +#endif + __asm ("" : "=r" (l1) : "0" (l1)); + test1 (); + test2 (); + test3 (); + test4 (); + test5 (); + test6 (); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memcpy-chk.x URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memcpy-chk.x?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memcpy-chk.x (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memcpy-chk.x Wed Oct 9 04:01:46 2019 @@ -0,0 +1,13 @@ +load_lib target-supports.exp + +if { ! [check_effective_target_nonlocal_goto] } { + return 1 +} + +if [istarget "epiphany-*-*"] { + # This test assumes the absence of struct padding. + # to make this true for test4 struct A on epiphany would require + # __attribute__((packed)) . + return 1 +} +return 0 Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memmove-2-lib.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memmove-2-lib.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memmove-2-lib.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memmove-2-lib.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,7 @@ +#include "lib/memmove.c" +#ifdef __vxworks +/* The RTP C library uses bzero and bfill, both of which are defined + in the same file as bcopy. 
*/ +#include "lib/bzero.c" +#include "lib/bfill.c" +#endif Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memmove-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memmove-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memmove-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memmove-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,36 @@ +/* Copyright (C) 2004 Free Software Foundation. + + Check builtin memmove and bcopy optimization when length is 1. + + Written by Jakub Jelinek, 9/14/2004. */ + +extern void abort (void); +typedef __SIZE_TYPE__ size_t; +extern void *memmove (void *, const void *, size_t); +extern void bcopy (const void *, void *, size_t); +extern int memcmp (const void *, const void *, size_t); + +char p[32] = "abcdefg"; +char *q = p + 4; + +void +main_test (void) +{ + /* memmove with length 1 can be optimized into memcpy if it can be + expanded inline. 
*/ + if (memmove (p + 2, p + 3, 1) != p + 2 || memcmp (p, "abddefg", 8)) + abort (); + if (memmove (p + 1, p + 1, 1) != p + 1 || memcmp (p, "abddefg", 8)) + abort (); + if (memmove (q, p + 4, 1) != p + 4 || memcmp (p, "abddefg", 8)) + abort (); + bcopy (p + 5, p + 6, 1); + if (memcmp (p, "abddeff", 8)) + abort (); + bcopy (p + 1, p + 1, 1); + if (memcmp (p, "abddeff", 8)) + abort (); + bcopy (q, p + 4, 1); + if (memcmp (p, "abddeff", 8)) + abort (); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memmove-chk-lib.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memmove-chk-lib.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memmove-chk-lib.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memmove-chk-lib.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1 @@ +#include "lib/chk.c" Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memmove-chk.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memmove-chk.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memmove-chk.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memmove-chk.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,579 @@ +/* Copyright (C) 2004, 2005 Free Software Foundation. + + Ensure builtin __memcpy_chk performs correctly. 
*/ + +extern void abort (void); +typedef __SIZE_TYPE__ size_t; +extern size_t strlen(const char *); +extern void *memcpy (void *, const void *, size_t); +extern void *memmove (void *, const void *, size_t); +extern int memcmp (const void *, const void *, size_t); + +#include "chk.h" + +const char s1[] = "123"; +char p[32] = ""; +volatile char *s2 = "defg"; /* prevent constant propagation to happen when whole program assumptions are made. */ +volatile char *s3 = "FGH"; /* prevent constant propagation to happen when whole program assumptions are made. */ +volatile size_t l1 = 1; /* prevent constant propagation to happen when whole program assumptions are made. */ + +void +__attribute__((noinline)) +test1 (void) +{ + int i; + +#if defined __i386__ || defined __x86_64__ + /* The functions below might not be optimized into direct stores on all + arches. It depends on how many instructions would be generated and + what limits the architecture chooses in STORE_BY_PIECES_P. */ + memmove_disallowed = 1; + memcpy_disallowed = 1; +#endif + + /* All the memmove calls in this routine except last have fixed length, so + object size checking should be done at compile time if optimizing. */ + chk_calls = 0; + + if (memmove (p, "ABCDE", 6) != p || memcmp (p, "ABCDE", 6)) + abort (); + if (memmove (p + 16, "VWX" + 1, 2) != p + 16 + || memcmp (p + 16, "WX\0\0", 5)) + abort (); + if (memmove (p + 1, "", 1) != p + 1 || memcmp (p, "A\0CDE", 6)) + abort (); + if (memmove (p + 3, "FGHI", 4) != p + 3 || memcmp (p, "A\0CFGHI", 8)) + abort (); + + i = 8; + memmove (p + 20, "qrstu", 6); + memmove (p + 25, "QRSTU", 6); + if (memmove (p + 25 + 1, s1, 3) != p + 25 + 1 + || memcmp (p + 25, "Q123U", 6)) + abort (); + + if (memmove (memmove (p, "abcdEFG", 4) + 4, "efg", 4) != p + 4 + || memcmp (p, "abcdefg", 8)) + abort(); + + /* Test at least one instance of the __builtin_ style. We do this + to ensure that it works and that the prototype is correct. 
*/ + if (__builtin_memmove (p, "ABCDE", 6) != p || memcmp (p, "ABCDE", 6)) + abort (); + + memmove (p + 5, s3, 1); + if (memcmp (p, "ABCDEFg", 8)) + abort (); + + memmove_disallowed = 0; + memcpy_disallowed = 0; + if (chk_calls) + abort (); + chk_calls = 0; + + memmove (p + 6, s1 + 1, l1); + if (memcmp (p, "ABCDEF2", 8)) + abort (); + + /* The above memmove copies into an object with known size, but + unknown length, so it should be a __memmove_chk call. */ + if (chk_calls != 1) + abort (); +} + +long buf1[64]; +char *buf2 = (char *) (buf1 + 32); +long buf5[20]; +char buf7[20]; + +void +__attribute__((noinline)) +test2_sub (long *buf3, char *buf4, char *buf6, int n) +{ + int i = 0; + + /* All the memmove/__builtin_memmove/__builtin___memmove_chk + calls in this routine are either fixed length, or have + side-effects in __builtin_object_size arguments, or + dst doesn't point into a known object. */ + chk_calls = 0; + + /* These should probably be handled by store_by_pieces on most arches. */ + if (memmove (buf1, "ABCDEFGHI", 9) != (char *) buf1 + || memcmp (buf1, "ABCDEFGHI\0", 11)) + abort (); + + if (memmove (buf1, "abcdefghijklmnopq", 17) != (char *) buf1 + || memcmp (buf1, "abcdefghijklmnopq\0", 19)) + abort (); + + if (__builtin_memmove (buf3, "ABCDEF", 6) != (char *) buf1 + || memcmp (buf1, "ABCDEFghijklmnopq\0", 19)) + abort (); + + if (__builtin_memmove (buf3, "a", 1) != (char *) buf1 + || memcmp (buf1, "aBCDEFghijklmnopq\0", 19)) + abort (); + + if (memmove ((char *) buf3 + 2, "bcd" + ++i, 2) != (char *) buf1 + 2 + || memcmp (buf1, "aBcdEFghijklmnopq\0", 19) + || i != 1) + abort (); + + /* These should probably be handled by move_by_pieces on most arches. 
*/ + if (memmove ((char *) buf3 + 4, buf5, 6) != (char *) buf1 + 4 + || memcmp (buf1, "aBcdRSTUVWklmnopq\0", 19)) + abort (); + + if (__builtin_memmove ((char *) buf1 + ++i + 8, (char *) buf5 + 1, 1) + != (char *) buf1 + 10 + || memcmp (buf1, "aBcdRSTUVWSlmnopq\0", 19) + || i != 2) + abort (); + + if (memmove ((char *) buf3 + 14, buf6, 2) != (char *) buf1 + 14 + || memcmp (buf1, "aBcdRSTUVWSlmnrsq\0", 19)) + abort (); + + if (memmove (buf3, buf5, 8) != (char *) buf1 + || memcmp (buf1, "RSTUVWXYVWSlmnrsq\0", 19)) + abort (); + + if (memmove (buf3, buf5, 17) != (char *) buf1 + || memcmp (buf1, "RSTUVWXYZ01234567\0", 19)) + abort (); + + __builtin_memmove (buf3, "aBcdEFghijklmnopq\0", 19); + + /* These should be handled either by movmemendM or memmove + call. */ + + /* buf3 points to an unknown object, so __memmove_chk should not be done. */ + if (memmove ((char *) buf3 + 4, buf5, n + 6) != (char *) buf1 + 4 + || memcmp (buf1, "aBcdRSTUVWklmnopq\0", 19)) + abort (); + + /* This call has side-effects in dst, therefore no checking. */ + if (__builtin___memmove_chk ((char *) buf1 + ++i + 8, (char *) buf5 + 1, + n + 1, os ((char *) buf1 + ++i + 8)) + != (char *) buf1 + 11 + || memcmp (buf1, "aBcdRSTUVWkSmnopq\0", 19) + || i != 3) + abort (); + + if (memmove ((char *) buf3 + 14, buf6, n + 2) != (char *) buf1 + 14 + || memcmp (buf1, "aBcdRSTUVWkSmnrsq\0", 19)) + abort (); + + i = 1; + + /* These might be handled by store_by_pieces. 
*/ + if (memmove (buf2, "ABCDEFGHI", 9) != buf2 + || memcmp (buf2, "ABCDEFGHI\0", 11)) + abort (); + + if (memmove (buf2, "abcdefghijklmnopq", 17) != buf2 + || memcmp (buf2, "abcdefghijklmnopq\0", 19)) + abort (); + + if (__builtin_memmove (buf4, "ABCDEF", 6) != buf2 + || memcmp (buf2, "ABCDEFghijklmnopq\0", 19)) + abort (); + + if (__builtin_memmove (buf4, "a", 1) != buf2 + || memcmp (buf2, "aBCDEFghijklmnopq\0", 19)) + abort (); + + if (memmove (buf4 + 2, "bcd" + i++, 2) != buf2 + 2 + || memcmp (buf2, "aBcdEFghijklmnopq\0", 19) + || i != 2) + abort (); + + /* These might be handled by move_by_pieces. */ + if (memmove (buf4 + 4, buf7, 6) != buf2 + 4 + || memcmp (buf2, "aBcdRSTUVWklmnopq\0", 19)) + abort (); + + /* Side effect. */ + if (__builtin___memmove_chk (buf2 + i++ + 8, buf7 + 1, 1, + os (buf2 + i++ + 8)) + != buf2 + 10 + || memcmp (buf2, "aBcdRSTUVWSlmnopq\0", 19) + || i != 3) + abort (); + + if (memmove (buf4 + 14, buf6, 2) != buf2 + 14 + || memcmp (buf2, "aBcdRSTUVWSlmnrsq\0", 19)) + abort (); + + __builtin_memmove (buf4, "aBcdEFghijklmnopq\0", 19); + + /* These should be handled either by movmemendM or memmove + call. */ + if (memmove (buf4 + 4, buf7, n + 6) != buf2 + 4 + || memcmp (buf2, "aBcdRSTUVWklmnopq\0", 19)) + abort (); + + /* Side effect. 
*/ + if (__builtin___memmove_chk (buf2 + i++ + 8, buf7 + 1, n + 1, + os (buf2 + i++ + 8)) + != buf2 + 11 + || memcmp (buf2, "aBcdRSTUVWkSmnopq\0", 19) + || i != 4) + abort (); + + if (memmove (buf4 + 14, buf6, n + 2) != buf2 + 14 + || memcmp (buf2, "aBcdRSTUVWkSmnrsq\0", 19)) + abort (); + + if (chk_calls) + abort (); +} + +void +__attribute__((noinline)) +test2 (void) +{ + long *x; + char *y; + int z; + __builtin_memmove (buf5, "RSTUVWXYZ0123456789", 20); + __builtin_memmove (buf7, "RSTUVWXYZ0123456789", 20); + __asm ("" : "=r" (x) : "0" (buf1)); + __asm ("" : "=r" (y) : "0" (buf2)); + __asm ("" : "=r" (z) : "0" (0)); + test2_sub (x, y, "rstuvwxyz", z); +} + +static const struct foo +{ + char *s; + double d; + long l; +} foo[] = +{ + { "hello world1", 3.14159, 101L }, + { "hello world2", 3.14159, 102L }, + { "hello world3", 3.14159, 103L }, + { "hello world4", 3.14159, 104L }, + { "hello world5", 3.14159, 105L }, + { "hello world6", 3.14159, 106L } +}; + +static const struct bar +{ + char *s; + const struct foo f[3]; +} bar[] = +{ + { + "hello world10", + { + { "hello1", 3.14159, 201L }, + { "hello2", 3.14159, 202L }, + { "hello3", 3.14159, 203L }, + } + }, + { + "hello world11", + { + { "hello4", 3.14159, 204L }, + { "hello5", 3.14159, 205L }, + { "hello6", 3.14159, 206L }, + } + } +}; + +static const int baz[] = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 0 }; + +void +__attribute__((noinline)) +test3 (void) +{ + const char *s; + struct foo f1[sizeof foo/sizeof*foo]; + struct bar b1[sizeof bar/sizeof*bar]; + int bz[sizeof baz/sizeof*baz]; + + /* All the memmove/__builtin_memmove calls in this routine have fixed + length. */ + chk_calls = 0; + + /* All the *memmove calls below have src in read-only memory, so all + of them should be optimized into memcpy. 
*/ + memmove_disallowed = 1; + if (memmove (f1, foo, sizeof (foo)) != f1 || memcmp (f1, foo, sizeof (foo))) + abort (); + if (memmove (b1, bar, sizeof (bar)) != b1 || memcmp (b1, bar, sizeof (bar))) + abort (); + memmove (bz, baz, sizeof (baz)); + if (memcmp (bz, baz, sizeof (baz))) + abort (); + + if (memmove (p, "abcde", 6) != p || memcmp (p, "abcde", 6)) + abort (); + s = s1; + if (memmove (p + 2, ++s, 0) != p + 2 || memcmp (p, "abcde", 6) || s != s1 + 1) + abort (); + if (__builtin_memmove (p + 3, "", 1) != p + 3 || memcmp (p, "abc\0e", 6)) + abort (); + memmove (p + 2, "fghijk", 4); + if (memcmp (p, "abfghi", 7)) + abort (); + s = s1 + 1; + memmove (p + 1, s++, 0); + if (memcmp (p, "abfghi", 7) || s != s1 + 2) + abort (); + __builtin_memmove (p + 4, "ABCDE", 1); + if (memcmp (p, "abfgAi", 7)) + abort (); + + /* memmove with length 1 can be optimized into memcpy if it can be + expanded inline. */ + if (memmove (p + 2, p + 3, 1) != p + 2) + abort (); + if (memcmp (p, "abggAi", 7)) + abort (); + + if (chk_calls) + abort (); + memmove_disallowed = 0; +} + +/* Test whether compile time checking is done where it should + and so is runtime object size checking. */ +void +__attribute__((noinline)) +test4 (void) +{ + struct A { char buf1[10]; char buf2[10]; } a; + char *r = l1 == 1 ? &a.buf1[5] : &a.buf2[4]; + char buf3[20]; + int i; + size_t l; + + /* The following calls should do runtime checking + - length is not known, but destination is. */ + chk_calls = 0; + memmove (a.buf1 + 2, s3, l1); + memmove (r, s3, l1 + 1); + r = l1 == 1 ? 
__builtin_alloca (4) : &a.buf2[7]; + memmove (r, s2, l1 + 2); + memmove (r + 2, s3, l1); + r = buf3; + for (i = 0; i < 4; ++i) + { + if (i == l1 - 1) + r = &a.buf1[1]; + else if (i == l1) + r = &a.buf2[7]; + else if (i == l1 + 1) + r = &buf3[5]; + else if (i == l1 + 2) + r = &a.buf1[9]; + } + memmove (r, s2, l1); + if (chk_calls != 5) + abort (); + + /* Following have known destination and known length, + so if optimizing certainly shouldn't result in the checking + variants. */ + chk_calls = 0; + memmove (a.buf1 + 2, s3, 1); + memmove (r, s3, 2); + r = l1 == 1 ? __builtin_alloca (4) : &a.buf2[7]; + memmove (r, s2, 3); + r = buf3; + l = 4; + for (i = 0; i < 4; ++i) + { + if (i == l1 - 1) + r = &a.buf1[1], l = 2; + else if (i == l1) + r = &a.buf2[7], l = 3; + else if (i == l1 + 1) + r = &buf3[5], l = 4; + else if (i == l1 + 2) + r = &a.buf1[9], l = 1; + } + memmove (r, s2, 1); + /* Here, l is known to be at most 4 and __builtin_object_size (&buf3[16], 0) + is 4, so this doesn't need runtime checking. */ + memmove (&buf3[16], s2, l); + if (chk_calls) + abort (); + chk_calls = 0; +} + +/* Test whether runtime and/or compile time checking catches + buffer overflows. */ +void +__attribute__((noinline)) +test5 (void) +{ + struct A { char buf1[10]; char buf2[10]; } a; + char buf3[20]; + + chk_fail_allowed = 1; + /* Runtime checks. */ + if (__builtin_setjmp (chk_fail_buf) == 0) + { + memmove (&a.buf2[9], s2, l1 + 1); + abort (); + } + if (__builtin_setjmp (chk_fail_buf) == 0) + { + memmove (&a.buf2[7], s3, strlen (s3) + 1); + abort (); + } + /* This should be detectable at compile time already. 
*/ + if (__builtin_setjmp (chk_fail_buf) == 0) + { + memmove (&buf3[19], "ab", 2); + abort (); + } + chk_fail_allowed = 0; +} + +#ifndef MAX_OFFSET +#define MAX_OFFSET (sizeof (long long)) +#endif + +#ifndef MAX_COPY +#define MAX_COPY (10 * sizeof (long long)) +#endif + +#ifndef MAX_EXTRA +#define MAX_EXTRA (sizeof (long long)) +#endif + +#define MAX_LENGTH (MAX_OFFSET + MAX_COPY + MAX_EXTRA) + +/* Use a sequence length that is not divisible by two, to make it more + likely to detect when words are mixed up. */ +#define SEQUENCE_LENGTH 31 + +static union { + char buf[MAX_LENGTH]; + long long align_int; + long double align_fp; +} u1, u2; + +void +__attribute__((noinline)) +test6 (void) +{ + int off1, off2, len, i; + char *p, *q, c; + + for (off1 = 0; off1 < MAX_OFFSET; off1++) + for (off2 = 0; off2 < MAX_OFFSET; off2++) + for (len = 1; len < MAX_COPY; len++) + { + for (i = 0, c = 'A'; i < MAX_LENGTH; i++, c++) + { + u1.buf[i] = 'a'; + if (c >= 'A' + SEQUENCE_LENGTH) + c = 'A'; + u2.buf[i] = c; + } + + p = memmove (u1.buf + off1, u2.buf + off2, len); + if (p != u1.buf + off1) + abort (); + + q = u1.buf; + for (i = 0; i < off1; i++, q++) + if (*q != 'a') + abort (); + + for (i = 0, c = 'A' + off2; i < len; i++, q++, c++) + { + if (c >= 'A' + SEQUENCE_LENGTH) + c = 'A'; + if (*q != c) + abort (); + } + + for (i = 0; i < MAX_EXTRA; i++, q++) + if (*q != 'a') + abort (); + } +} + +#define TESTSIZE 80 + +char srcb[TESTSIZE] __attribute__ ((aligned)); +char dstb[TESTSIZE] __attribute__ ((aligned)); + +void +__attribute__((noinline)) +check (char *test, char *match, int n) +{ + if (memcmp (test, match, n)) + abort (); +} + +#define TN(n) \ +{ memset (dstb, 0, n); memmove (dstb, srcb, n); check (dstb, srcb, n); } +#define T(n) \ +TN (n) \ +TN ((n) + 1) \ +TN ((n) + 2) \ +TN ((n) + 3) + +void +__attribute__((noinline)) +test7 (void) +{ + int i; + + chk_calls = 0; + + for (i = 0; i < sizeof (srcb); ++i) + srcb[i] = 'a' + i % 26; + + T (0); + T (4); + T (8); + T (12); + T (16); 
+ T (20); + T (24); + T (28); + T (32); + T (36); + T (40); + T (44); + T (48); + T (52); + T (56); + T (60); + T (64); + T (68); + T (72); + T (76); + + /* All memmove calls in this routine have constant arguments. */ + if (chk_calls) + abort (); +} + +void +main_test (void) +{ +#ifndef __OPTIMIZE__ + /* Object size checking is only intended for -O[s123]. */ + return; +#endif + __asm ("" : "=r" (l1) : "0" (l1)); + test1 (); + test2 (); + __builtin_memset (p, '\0', sizeof (p)); + test3 (); + test4 (); + test5 (); + test6 (); + test7 (); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memmove-chk.x URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memmove-chk.x?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memmove-chk.x (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memmove-chk.x Wed Oct 9 04:01:46 2019 @@ -0,0 +1,13 @@ +load_lib target-supports.exp + +if { ! [check_effective_target_nonlocal_goto] } { + return 1 +} + +if [istarget "epiphany-*-*"] { + # This test assumes the absence of struct padding. + # to make this true for test5 struct A on epiphany would require + # __attribute__((packed)) . 
+ return 1 +} +return 0 Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memmove-lib.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memmove-lib.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memmove-lib.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memmove-lib.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,7 @@ +#include "lib/memmove.c" +#ifdef __vxworks +/* The RTP C library uses bzero and bfill, both of which are defined + in the same file as bcopy. */ +#include "lib/bzero.c" +#include "lib/bfill.c" +#endif Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memmove.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memmove.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memmove.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memmove.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,90 @@ +/* Copyright (C) 2003, 2004 Free Software Foundation. + + Ensure builtin memmove and bcopy perform correctly. + + Written by Jakub Jelinek, 4/26/2003. 
*/ + +extern void abort (void); +typedef __SIZE_TYPE__ size_t; +extern void *memmove (void *, const void *, size_t); +extern void bcopy (const void *, void *, size_t); +extern int memcmp (const void *, const void *, size_t); + +const char s1[] = "123"; +char p[32] = ""; + +static const struct foo +{ + char *s; + double d; + long l; +} foo[] = +{ + { "hello world1", 3.14159, 101L }, + { "hello world2", 3.14159, 102L }, + { "hello world3", 3.14159, 103L }, + { "hello world4", 3.14159, 104L }, + { "hello world5", 3.14159, 105L }, + { "hello world6", 3.14159, 106L } +}; + +static const struct bar +{ + char *s; + const struct foo f[3]; +} bar[] = +{ + { + "hello world10", + { + { "hello1", 3.14159, 201L }, + { "hello2", 3.14159, 202L }, + { "hello3", 3.14159, 203L }, + } + }, + { + "hello world11", + { + { "hello4", 3.14159, 204L }, + { "hello5", 3.14159, 205L }, + { "hello6", 3.14159, 206L }, + } + } +}; + +static const int baz[] = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 0 }; + +void +main_test (void) +{ + const char *s; + struct foo f1[sizeof foo/sizeof*foo]; + struct bar b1[sizeof bar/sizeof*bar]; + int bz[sizeof baz/sizeof*baz]; + + if (memmove (f1, foo, sizeof (foo)) != f1 || memcmp (f1, foo, sizeof (foo))) + abort (); + if (memmove (b1, bar, sizeof (bar)) != b1 || memcmp (b1, bar, sizeof (bar))) + abort (); + bcopy (baz, bz, sizeof (baz)); + if (memcmp (bz, baz, sizeof (baz))) + abort (); + + if (memmove (p, "abcde", 6) != p || memcmp (p, "abcde", 6)) + abort (); + s = s1; + if (memmove (p + 2, ++s, 0) != p + 2 || memcmp (p, "abcde", 6) || s != s1 + 1) + abort (); + if (__builtin_memmove (p + 3, "", 1) != p + 3 || memcmp (p, "abc\0e", 6)) + abort (); + bcopy ("fghijk", p + 2, 4); + if (memcmp (p, "abfghi", 7)) + abort (); + s = s1 + 1; + bcopy (s++, p + 1, 0); + if (memcmp (p, "abfghi", 7) || s != s1 + 2) + abort (); + __builtin_bcopy ("ABCDE", p + 4, 1); + if (memcmp (p, "abfgAi", 7)) + abort (); +} Added: 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memops-asm-lib.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memops-asm-lib.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memops-asm-lib.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memops-asm-lib.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,117 @@ +extern void abort (void); +extern int inside_main; +typedef __SIZE_TYPE__ size_t; + +#define TEST_ABORT if (inside_main) abort() + +/* LTO code is at the present to able to track that asm alias my_bcopy on builtin + actually refers to this function. See PR47181. */ +__attribute__ ((used)) +void * +my_memcpy (void *d, const void *s, size_t n) +{ + char *dst = (char *) d; + const char *src = (const char *) s; + while (n--) + *dst++ = *src++; + return (char *) d; +} + +/* LTO code is at the present to able to track that asm alias my_bcopy on builtin + actually refers to this function. See PR47181. */ +__attribute__ ((used)) +void +my_bcopy (const void *s, void *d, size_t n) +{ + char *dst = (char *) d; + const char *src = (const char *) s; + if (src >= dst) + while (n--) + *dst++ = *src++; + else + { + dst += n; + src += n; + while (n--) + *--dst = *--src; + } +} + +__attribute__ ((used)) +void * +my_memmove (void *d, const void *s, size_t n) +{ + char *dst = (char *) d; + const char *src = (const char *) s; + if (src >= dst) + while (n--) + *dst++ = *src++; + else + { + dst += n; + src += n; + while (n--) + *--dst = *--src; + } + + return d; +} + +/* LTO code is at the present to able to track that asm alias my_bcopy on builtin + actually refers to this function. See PR47181. 
*/ +__attribute__ ((used)) +void * +my_memset (void *d, int c, size_t n) +{ + char *dst = (char *) d; + while (n--) + *dst++ = c; + return (char *) d; +} + +/* LTO code is at the present to able to track that asm alias my_bcopy on builtin + actually refers to this function. See PR47181. */ +__attribute__ ((used)) +void +my_bzero (void *d, size_t n) +{ + char *dst = (char *) d; + while (n--) + *dst++ = '\0'; +} + +void * +memcpy (void *d, const void *s, size_t n) +{ + void *result = my_memcpy (d, s, n); + TEST_ABORT; + return result; +} + +void +bcopy (const void *s, void *d, size_t n) +{ + my_bcopy (s, d, n); + TEST_ABORT; +} + +void * +memset (void *d, int c, size_t n) +{ + void *result = my_memset (d, c, n); + TEST_ABORT; + return result; +} + +void +bzero (void *d, size_t n) +{ + my_bzero (d, n); + TEST_ABORT; +} + +#ifdef __vxworks +/* The RTP C library uses bfill, which is defined in the same file as + bzero and bcopy. */ +#include "lib/bfill.c" +#endif Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memops-asm.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memops-asm.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memops-asm.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memops-asm.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,53 @@ +/* Copyright (C) 2003 Free Software Foundation. + + Test memcpy and memset in presence of redirect. 
*/ + +#define ASMNAME(cname) ASMNAME2 (__USER_LABEL_PREFIX__, cname) +#define ASMNAME2(prefix, cname) STRING (prefix) cname +#define STRING(x) #x + +typedef __SIZE_TYPE__ size_t; +extern void abort (void); +extern void *memcpy (void *, const void *, size_t) + __asm (ASMNAME ("my_memcpy")); +extern void bcopy (const void *, void *, size_t) + __asm (ASMNAME ("my_bcopy")); +extern void *memmove (void *, const void *, size_t) + __asm (ASMNAME ("my_memmove")); +extern void *memset (void *, int, size_t) + __asm (ASMNAME ("my_memset")); +extern void bzero (void *, size_t) + __asm (ASMNAME ("my_bzero")); +extern int memcmp (const void *, const void *, size_t); + +struct A { char c[32]; } a = { "foobar" }; +char x[64] = "foobar", y[64]; +int i = 39, j = 6, k = 4; + +extern int inside_main; + +void +main_test (void) +{ + struct A b = a; + struct A c = { { 'x' } }; + + inside_main = 1; + + if (memcmp (b.c, x, 32) || c.c[0] != 'x' || memcmp (c.c + 1, x + 32, 31)) + abort (); + if (__builtin_memcpy (y, x, i) != y || memcmp (x, y, 64)) + abort (); + if (memcpy (y + 6, x, j) != y + 6 + || memcmp (x, y, 6) || memcmp (x, y + 6, 58)) + abort (); + if (__builtin_memset (y + 2, 'X', k) != y + 2 + || memcmp (y, "foXXXXfoobar", 13)) + abort (); + bcopy (y + 1, y + 2, 6); + if (memcmp (y, "fooXXXXfobar", 13)) + abort (); + __builtin_bzero (y + 4, 2); + if (memcmp (y, "fooX\0\0Xfobar", 13)) + abort (); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memops-asm.x URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memops-asm.x?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memops-asm.x (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memops-asm.x Wed Oct 9 04:01:46 2019 @@ -0,0 +1,10 @@ +# Different translation units may have 
different user name overrides +# and we do not preserve enough context to known which one we want. + +set torture_eval_before_compile { + if {[string match {*-flto*} "$option"]} { + continue + } +} + +return 0 Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/mempcpy-2-lib.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/mempcpy-2-lib.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/mempcpy-2-lib.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/mempcpy-2-lib.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1 @@ +#include "lib/mempcpy.c" Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/mempcpy-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/mempcpy-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/mempcpy-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/mempcpy-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,153 @@ +/* Copyright (C) 2003 Free Software Foundation. + + Ensure that builtin mempcpy and stpcpy perform correctly. + + Written by Jakub Jelinek, 21/05/2003. */ + +extern void abort (void); +typedef __SIZE_TYPE__ size_t; +extern void *mempcpy (void *, const void *, size_t); +extern int memcmp (const void *, const void *, size_t); +extern int inside_main; + +long buf1[64]; +char *buf2 = (char *) (buf1 + 32); +long buf5[20]; +char buf7[20]; + +void +__attribute__((noinline)) +test (long *buf3, char *buf4, char *buf6, int n) +{ + int i = 0; + + /* These should probably be handled by store_by_pieces on most arches. 
*/ + if (mempcpy (buf1, "ABCDEFGHI", 9) != (char *) buf1 + 9 + || memcmp (buf1, "ABCDEFGHI\0", 11)) + abort (); + + if (mempcpy (buf1, "abcdefghijklmnopq", 17) != (char *) buf1 + 17 + || memcmp (buf1, "abcdefghijklmnopq\0", 19)) + abort (); + + if (__builtin_mempcpy (buf3, "ABCDEF", 6) != (char *) buf1 + 6 + || memcmp (buf1, "ABCDEFghijklmnopq\0", 19)) + abort (); + + if (__builtin_mempcpy (buf3, "a", 1) != (char *) buf1 + 1 + || memcmp (buf1, "aBCDEFghijklmnopq\0", 19)) + abort (); + + if (mempcpy ((char *) buf3 + 2, "bcd" + ++i, 2) != (char *) buf1 + 4 + || memcmp (buf1, "aBcdEFghijklmnopq\0", 19) + || i != 1) + abort (); + + /* These should probably be handled by move_by_pieces on most arches. */ + if (mempcpy ((char *) buf3 + 4, buf5, 6) != (char *) buf1 + 10 + || memcmp (buf1, "aBcdRSTUVWklmnopq\0", 19)) + abort (); + + if (__builtin_mempcpy ((char *) buf1 + ++i + 8, (char *) buf5 + 1, 1) + != (char *) buf1 + 11 + || memcmp (buf1, "aBcdRSTUVWSlmnopq\0", 19) + || i != 2) + abort (); + + if (mempcpy ((char *) buf3 + 14, buf6, 2) != (char *) buf1 + 16 + || memcmp (buf1, "aBcdRSTUVWSlmnrsq\0", 19)) + abort (); + + if (mempcpy (buf3, buf5, 8) != (char *) buf1 + 8 + || memcmp (buf1, "RSTUVWXYVWSlmnrsq\0", 19)) + abort (); + + if (mempcpy (buf3, buf5, 17) != (char *) buf1 + 17 + || memcmp (buf1, "RSTUVWXYZ01234567\0", 19)) + abort (); + + __builtin_memcpy (buf3, "aBcdEFghijklmnopq\0", 19); + + /* These should be handled either by movmemendM or mempcpy + call. */ + if (mempcpy ((char *) buf3 + 4, buf5, n + 6) != (char *) buf1 + 10 + || memcmp (buf1, "aBcdRSTUVWklmnopq\0", 19)) + abort (); + + if (__builtin_mempcpy ((char *) buf1 + ++i + 8, (char *) buf5 + 1, n + 1) + != (char *) buf1 + 12 + || memcmp (buf1, "aBcdRSTUVWkSmnopq\0", 19) + || i != 3) + abort (); + + if (mempcpy ((char *) buf3 + 14, buf6, n + 2) != (char *) buf1 + 16 + || memcmp (buf1, "aBcdRSTUVWkSmnrsq\0", 19)) + abort (); + + i = 1; + + /* These might be handled by store_by_pieces. 
*/ + if (mempcpy (buf2, "ABCDEFGHI", 9) != buf2 + 9 + || memcmp (buf2, "ABCDEFGHI\0", 11)) + abort (); + + if (mempcpy (buf2, "abcdefghijklmnopq", 17) != buf2 + 17 + || memcmp (buf2, "abcdefghijklmnopq\0", 19)) + abort (); + + if (__builtin_mempcpy (buf4, "ABCDEF", 6) != buf2 + 6 + || memcmp (buf2, "ABCDEFghijklmnopq\0", 19)) + abort (); + + if (__builtin_mempcpy (buf4, "a", 1) != buf2 + 1 + || memcmp (buf2, "aBCDEFghijklmnopq\0", 19)) + abort (); + + if (mempcpy (buf4 + 2, "bcd" + i++, 2) != buf2 + 4 + || memcmp (buf2, "aBcdEFghijklmnopq\0", 19) + || i != 2) + abort (); + + /* These might be handled by move_by_pieces. */ + if (mempcpy (buf4 + 4, buf7, 6) != buf2 + 10 + || memcmp (buf2, "aBcdRSTUVWklmnopq\0", 19)) + abort (); + + if (__builtin_mempcpy (buf2 + i++ + 8, buf7 + 1, 1) + != buf2 + 11 + || memcmp (buf2, "aBcdRSTUVWSlmnopq\0", 19) + || i != 3) + abort (); + + if (mempcpy (buf4 + 14, buf6, 2) != buf2 + 16 + || memcmp (buf2, "aBcdRSTUVWSlmnrsq\0", 19)) + abort (); + + __builtin_memcpy (buf4, "aBcdEFghijklmnopq\0", 19); + + /* These should be handled either by movmemendM or mempcpy + call. */ + if (mempcpy (buf4 + 4, buf7, n + 6) != buf2 + 10 + || memcmp (buf2, "aBcdRSTUVWklmnopq\0", 19)) + abort (); + + if (__builtin_mempcpy (buf2 + i++ + 8, buf7 + 1, n + 1) + != buf2 + 12 + || memcmp (buf2, "aBcdRSTUVWkSmnopq\0", 19) + || i != 4) + abort (); + + if (mempcpy (buf4 + 14, buf6, n + 2) != buf2 + 16 + || memcmp (buf2, "aBcdRSTUVWkSmnrsq\0", 19)) + abort (); +} + +void +main_test (void) +{ + /* All these tests are allowed to call mempcpy/stpcpy. 
*/ + inside_main = 0; + __builtin_memcpy (buf5, "RSTUVWXYZ0123456789", 20); + __builtin_memcpy (buf7, "RSTUVWXYZ0123456789", 20); + test (buf1, buf2, "rstuvwxyz", 0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/mempcpy-chk-lib.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/mempcpy-chk-lib.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/mempcpy-chk-lib.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/mempcpy-chk-lib.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1 @@ +#include "lib/chk.c" Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/mempcpy-chk.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/mempcpy-chk.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/mempcpy-chk.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/mempcpy-chk.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,487 @@ +/* Copyright (C) 2004, 2005 Free Software Foundation. + + Ensure builtin __mempcpy_chk performs correctly. */ + +extern void abort (void); +typedef __SIZE_TYPE__ size_t; +extern size_t strlen(const char *); +extern void *memcpy (void *, const void *, size_t); +extern void *mempcpy (void *, const void *, size_t); +extern int memcmp (const void *, const void *, size_t); + +#include "chk.h" + +const char s1[] = "123"; +char p[32] = ""; +volatile char *s2 = "defg"; /* prevent constant propagation to happen when whole program assumptions are made. */ +volatile char *s3 = "FGH"; /* prevent constant propagation to happen when whole program assumptions are made. 
*/ +volatile size_t l1 = 1; /* prevent constant propagation to happen when whole program assumptions are made. */ + +void +__attribute__((noinline)) +test1 (void) +{ + int i; + +#if defined __i386__ || defined __x86_64__ + /* The functions below might not be optimized into direct stores on all + arches. It depends on how many instructions would be generated and + what limits the architecture chooses in STORE_BY_PIECES_P. */ + mempcpy_disallowed = 1; +#endif + + /* All the mempcpy calls in this routine except last have fixed length, so + object size checking should be done at compile time if optimizing. */ + chk_calls = 0; + + if (mempcpy (p, "ABCDE", 6) != p + 6 || memcmp (p, "ABCDE", 6)) + abort (); + if (mempcpy (p + 16, "VWX" + 1, 2) != p + 16 + 2 + || memcmp (p + 16, "WX\0\0", 5)) + abort (); + if (mempcpy (p + 1, "", 1) != p + 1 + 1 || memcmp (p, "A\0CDE", 6)) + abort (); + if (mempcpy (p + 3, "FGHI", 4) != p + 3 + 4 || memcmp (p, "A\0CFGHI", 8)) + abort (); + + i = 8; + memcpy (p + 20, "qrstu", 6); + memcpy (p + 25, "QRSTU", 6); + if (mempcpy (p + 25 + 1, s1, 3) != (p + 25 + 1 + 3) + || memcmp (p + 25, "Q123U", 6)) + abort (); + + if (mempcpy (mempcpy (p, "abcdEFG", 4), "efg", 4) != p + 8 + || memcmp (p, "abcdefg", 8)) + abort(); + + /* Test at least one instance of the __builtin_ style. We do this + to ensure that it works and that the prototype is correct. */ + if (__builtin_mempcpy (p, "ABCDE", 6) != p + 6 || memcmp (p, "ABCDE", 6)) + abort (); + + /* If the result of mempcpy is ignored, gcc should use memcpy. + This should be optimized always, so disallow mempcpy calls. */ + mempcpy_disallowed = 1; + mempcpy (p + 5, s3, 1); + if (memcmp (p, "ABCDEFg", 8)) + abort (); + + if (chk_calls) + abort (); + chk_calls = 0; + + mempcpy (p + 6, s1 + 1, l1); + if (memcmp (p, "ABCDEF2", 8)) + abort (); + + /* The above mempcpy copies into an object with known size, but + unknown length and with result ignored, so it should be a + __memcpy_chk call. 
*/ + if (chk_calls != 1) + abort (); + + mempcpy_disallowed = 0; +} + +long buf1[64]; +char *buf2 = (char *) (buf1 + 32); +long buf5[20]; +char buf7[20]; + +void +__attribute__((noinline)) +test2_sub (long *buf3, char *buf4, char *buf6, int n) +{ + int i = 0; + + /* All the mempcpy/__builtin_mempcpy/__builtin___mempcpy_chk + calls in this routine are either fixed length, or have + side-effects in __builtin_object_size arguments, or + dst doesn't point into a known object. */ + chk_calls = 0; + + /* These should probably be handled by store_by_pieces on most arches. */ + if (mempcpy (buf1, "ABCDEFGHI", 9) != (char *) buf1 + 9 + || memcmp (buf1, "ABCDEFGHI\0", 11)) + abort (); + + if (mempcpy (buf1, "abcdefghijklmnopq", 17) != (char *) buf1 + 17 + || memcmp (buf1, "abcdefghijklmnopq\0", 19)) + abort (); + + if (__builtin_mempcpy (buf3, "ABCDEF", 6) != (char *) buf1 + 6 + || memcmp (buf1, "ABCDEFghijklmnopq\0", 19)) + abort (); + + if (__builtin_mempcpy (buf3, "a", 1) != (char *) buf1 + 1 + || memcmp (buf1, "aBCDEFghijklmnopq\0", 19)) + abort (); + + if (mempcpy ((char *) buf3 + 2, "bcd" + ++i, 2) != (char *) buf1 + 4 + || memcmp (buf1, "aBcdEFghijklmnopq\0", 19) + || i != 1) + abort (); + + /* These should probably be handled by move_by_pieces on most arches. 
*/ + if (mempcpy ((char *) buf3 + 4, buf5, 6) != (char *) buf1 + 10 + || memcmp (buf1, "aBcdRSTUVWklmnopq\0", 19)) + abort (); + + if (__builtin_mempcpy ((char *) buf1 + ++i + 8, (char *) buf5 + 1, 1) + != (char *) buf1 + 11 + || memcmp (buf1, "aBcdRSTUVWSlmnopq\0", 19) + || i != 2) + abort (); + + if (mempcpy ((char *) buf3 + 14, buf6, 2) != (char *) buf1 + 16 + || memcmp (buf1, "aBcdRSTUVWSlmnrsq\0", 19)) + abort (); + + if (mempcpy (buf3, buf5, 8) != (char *) buf1 + 8 + || memcmp (buf1, "RSTUVWXYVWSlmnrsq\0", 19)) + abort (); + + if (mempcpy (buf3, buf5, 17) != (char *) buf1 + 17 + || memcmp (buf1, "RSTUVWXYZ01234567\0", 19)) + abort (); + + __builtin_memcpy (buf3, "aBcdEFghijklmnopq\0", 19); + + /* These should be handled either by movmemendM or mempcpy + call. */ + + /* buf3 points to an unknown object, so __mempcpy_chk should not be done. */ + if (mempcpy ((char *) buf3 + 4, buf5, n + 6) != (char *) buf1 + 10 + || memcmp (buf1, "aBcdRSTUVWklmnopq\0", 19)) + abort (); + + /* This call has side-effects in dst, therefore no checking. */ + if (__builtin___mempcpy_chk ((char *) buf1 + ++i + 8, (char *) buf5 + 1, + n + 1, os ((char *) buf1 + ++i + 8)) + != (char *) buf1 + 12 + || memcmp (buf1, "aBcdRSTUVWkSmnopq\0", 19) + || i != 3) + abort (); + + if (mempcpy ((char *) buf3 + 14, buf6, n + 2) != (char *) buf1 + 16 + || memcmp (buf1, "aBcdRSTUVWkSmnrsq\0", 19)) + abort (); + + i = 1; + + /* These might be handled by store_by_pieces. 
*/ + if (mempcpy (buf2, "ABCDEFGHI", 9) != buf2 + 9 + || memcmp (buf2, "ABCDEFGHI\0", 11)) + abort (); + + if (mempcpy (buf2, "abcdefghijklmnopq", 17) != buf2 + 17 + || memcmp (buf2, "abcdefghijklmnopq\0", 19)) + abort (); + + if (__builtin_mempcpy (buf4, "ABCDEF", 6) != buf2 + 6 + || memcmp (buf2, "ABCDEFghijklmnopq\0", 19)) + abort (); + + if (__builtin_mempcpy (buf4, "a", 1) != buf2 + 1 + || memcmp (buf2, "aBCDEFghijklmnopq\0", 19)) + abort (); + + if (mempcpy (buf4 + 2, "bcd" + i++, 2) != buf2 + 4 + || memcmp (buf2, "aBcdEFghijklmnopq\0", 19) + || i != 2) + abort (); + + /* These might be handled by move_by_pieces. */ + if (mempcpy (buf4 + 4, buf7, 6) != buf2 + 10 + || memcmp (buf2, "aBcdRSTUVWklmnopq\0", 19)) + abort (); + + /* Side effect. */ + if (__builtin___mempcpy_chk (buf2 + i++ + 8, buf7 + 1, 1, + os (buf2 + i++ + 8)) + != buf2 + 11 + || memcmp (buf2, "aBcdRSTUVWSlmnopq\0", 19) + || i != 3) + abort (); + + if (mempcpy (buf4 + 14, buf6, 2) != buf2 + 16 + || memcmp (buf2, "aBcdRSTUVWSlmnrsq\0", 19)) + abort (); + + __builtin_memcpy (buf4, "aBcdEFghijklmnopq\0", 19); + + /* These should be handled either by movmemendM or mempcpy + call. */ + if (mempcpy (buf4 + 4, buf7, n + 6) != buf2 + 10 + || memcmp (buf2, "aBcdRSTUVWklmnopq\0", 19)) + abort (); + + /* Side effect. 
*/ + if (__builtin___mempcpy_chk (buf2 + i++ + 8, buf7 + 1, + n + 1, os (buf2 + i++ + 8)) + != buf2 + 12 + || memcmp (buf2, "aBcdRSTUVWkSmnopq\0", 19) + || i != 4) + abort (); + + if (mempcpy (buf4 + 14, buf6, n + 2) != buf2 + 16 + || memcmp (buf2, "aBcdRSTUVWkSmnrsq\0", 19)) + abort (); + + if (chk_calls) + abort (); +} + +void +__attribute__((noinline)) +test2 (void) +{ + long *x; + char *y; + int z; + __builtin_memcpy (buf5, "RSTUVWXYZ0123456789", 20); + __builtin_memcpy (buf7, "RSTUVWXYZ0123456789", 20); + __asm ("" : "=r" (x) : "0" (buf1)); + __asm ("" : "=r" (y) : "0" (buf2)); + __asm ("" : "=r" (z) : "0" (0)); + test2_sub (x, y, "rstuvwxyz", z); +} + +volatile void *vx; + +/* Test whether compile time checking is done where it should + and so is runtime object size checking. */ +void +__attribute__((noinline)) +test3 (void) +{ + struct A { char buf1[10]; char buf2[10]; } a; + char *r = l1 == 1 ? &a.buf1[5] : &a.buf2[4]; + char buf3[20]; + int i; + size_t l; + + /* The following calls should do runtime checking + - length is not known, but destination is. */ + chk_calls = 0; + vx = mempcpy (a.buf1 + 2, s3, l1); + vx = mempcpy (r, s3, l1 + 1); + r = l1 == 1 ? __builtin_alloca (4) : &a.buf2[7]; + vx = mempcpy (r, s2, l1 + 2); + vx = mempcpy (r + 2, s3, l1); + r = buf3; + for (i = 0; i < 4; ++i) + { + if (i == l1 - 1) + r = &a.buf1[1]; + else if (i == l1) + r = &a.buf2[7]; + else if (i == l1 + 1) + r = &buf3[5]; + else if (i == l1 + 2) + r = &a.buf1[9]; + } + vx = mempcpy (r, s2, l1); + if (chk_calls != 5) + abort (); + + /* Following have known destination and known length, + so if optimizing certainly shouldn't result in the checking + variants. */ + chk_calls = 0; + vx = mempcpy (a.buf1 + 2, s3, 1); + vx = mempcpy (r, s3, 2); + r = l1 == 1 ? 
__builtin_alloca (4) : &a.buf2[7]; + vx = mempcpy (r, s2, 3); + r = buf3; + l = 4; + for (i = 0; i < 4; ++i) + { + if (i == l1 - 1) + r = &a.buf1[1], l = 2; + else if (i == l1) + r = &a.buf2[7], l = 3; + else if (i == l1 + 1) + r = &buf3[5], l = 4; + else if (i == l1 + 2) + r = &a.buf1[9], l = 1; + } + vx = mempcpy (r, s2, 1); + /* Here, l is known to be at most 4 and __builtin_object_size (&buf3[16], 0) + is 4, so this doesn't need runtime checking. */ + vx = mempcpy (&buf3[16], s2, l); + if (chk_calls) + abort (); + chk_calls = 0; +} + +/* Test whether runtime and/or compile time checking catches + buffer overflows. */ +void +__attribute__((noinline)) +test4 (void) +{ + struct A { char buf1[10]; char buf2[10]; } a; + char buf3[20]; + + chk_fail_allowed = 1; + /* Runtime checks. */ + if (__builtin_setjmp (chk_fail_buf) == 0) + { + vx = mempcpy (&a.buf2[9], s2, l1 + 1); + abort (); + } + if (__builtin_setjmp (chk_fail_buf) == 0) + { + vx = mempcpy (&a.buf2[7], s3, strlen (s3) + 1); + abort (); + } + /* This should be detectable at compile time already. */ + if (__builtin_setjmp (chk_fail_buf) == 0) + { + vx = mempcpy (&buf3[19], "ab", 2); + abort (); + } + chk_fail_allowed = 0; +} + +#ifndef MAX_OFFSET +#define MAX_OFFSET (sizeof (long long)) +#endif + +#ifndef MAX_COPY +#define MAX_COPY (10 * sizeof (long long)) +#endif + +#ifndef MAX_EXTRA +#define MAX_EXTRA (sizeof (long long)) +#endif + +#define MAX_LENGTH (MAX_OFFSET + MAX_COPY + MAX_EXTRA) + +/* Use a sequence length that is not divisible by two, to make it more + likely to detect when words are mixed up. 
*/ +#define SEQUENCE_LENGTH 31 + +static union { + char buf[MAX_LENGTH]; + long long align_int; + long double align_fp; +} u1, u2; + +void +__attribute__((noinline)) +test5 (void) +{ + int off1, off2, len, i; + char *p, *q, c; + + for (off1 = 0; off1 < MAX_OFFSET; off1++) + for (off2 = 0; off2 < MAX_OFFSET; off2++) + for (len = 1; len < MAX_COPY; len++) + { + for (i = 0, c = 'A'; i < MAX_LENGTH; i++, c++) + { + u1.buf[i] = 'a'; + if (c >= 'A' + SEQUENCE_LENGTH) + c = 'A'; + u2.buf[i] = c; + } + + p = mempcpy (u1.buf + off1, u2.buf + off2, len); + if (p != u1.buf + off1 + len) + abort (); + + q = u1.buf; + for (i = 0; i < off1; i++, q++) + if (*q != 'a') + abort (); + + for (i = 0, c = 'A' + off2; i < len; i++, q++, c++) + { + if (c >= 'A' + SEQUENCE_LENGTH) + c = 'A'; + if (*q != c) + abort (); + } + + for (i = 0; i < MAX_EXTRA; i++, q++) + if (*q != 'a') + abort (); + } +} + +#define TESTSIZE 80 + +char srcb[TESTSIZE] __attribute__ ((aligned)); +char dstb[TESTSIZE] __attribute__ ((aligned)); + +void +__attribute__((noinline)) +check (char *test, char *match, int n) +{ + if (memcmp (test, match, n)) + abort (); +} + +#define TN(n) \ +{ memset (dstb, 0, n); vx = mempcpy (dstb, srcb, n); check (dstb, srcb, n); } +#define T(n) \ +TN (n) \ +TN ((n) + 1) \ +TN ((n) + 2) \ +TN ((n) + 3) + +void +__attribute__((noinline)) +test6 (void) +{ + int i; + + chk_calls = 0; + + for (i = 0; i < sizeof (srcb); ++i) + srcb[i] = 'a' + i % 26; + + T (0); + T (4); + T (8); + T (12); + T (16); + T (20); + T (24); + T (28); + T (32); + T (36); + T (40); + T (44); + T (48); + T (52); + T (56); + T (60); + T (64); + T (68); + T (72); + T (76); + + /* All mempcpy calls in this routine have constant arguments. */ + if (chk_calls) + abort (); +} + +void +main_test (void) +{ +#ifndef __OPTIMIZE__ + /* Object size checking is only intended for -O[s123]. 
*/ + return; +#endif + __asm ("" : "=r" (l1) : "0" (l1)); + test1 (); + test2 (); + test3 (); + test4 (); + test5 (); + test6 (); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/mempcpy-chk.x URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/mempcpy-chk.x?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/mempcpy-chk.x (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/mempcpy-chk.x Wed Oct 9 04:01:46 2019 @@ -0,0 +1,13 @@ +load_lib target-supports.exp + +if { ! [check_effective_target_nonlocal_goto] } { + return 1 +} + +if [istarget "epiphany-*-*"] { + # This test assumes the absence of struct padding. + # to make this true for test4 struct A on epiphany would require + # __attribute__((packed)) . + return 1 +} +return 0 Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/mempcpy-lib.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/mempcpy-lib.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/mempcpy-lib.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/mempcpy-lib.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1 @@ +#include "lib/mempcpy.c" Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/mempcpy.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/mempcpy.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/mempcpy.c (added) +++ 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/mempcpy.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,68 @@ +/* Copyright (C) 2003 Free Software Foundation. + + Ensure builtin mempcpy performs correctly. + + Written by Kaveh Ghazi, 4/11/2003. */ + +extern void abort (void); +typedef __SIZE_TYPE__ size_t; +extern size_t strlen(const char *); +extern void *memcpy (void *, const void *, size_t); +extern void *mempcpy (void *, const void *, size_t); +extern int memcmp (const void *, const void *, size_t); +extern int inside_main; + +const char s1[] = "123"; +char p[32] = ""; +char *s2 = "defg"; +char *s3 = "FGH"; +size_t l1 = 1; + +void +main_test (void) +{ + int i; + +#if !defined __i386__ && !defined __x86_64__ + /* The functions below might not be optimized into direct stores on all + arches. It depends on how many instructions would be generated and + what limits the architecture chooses in STORE_BY_PIECES_P. */ + inside_main = 0; +#endif + + if (mempcpy (p, "ABCDE", 6) != p + 6 || memcmp (p, "ABCDE", 6)) + abort (); + if (mempcpy (p + 16, "VWX" + 1, 2) != p + 16 + 2 + || memcmp (p + 16, "WX\0\0", 5)) + abort (); + if (mempcpy (p + 1, "", 1) != p + 1 + 1 || memcmp (p, "A\0CDE", 6)) + abort (); + if (mempcpy (p + 3, "FGHI", 4) != p + 3 + 4 || memcmp (p, "A\0CFGHI", 8)) + abort (); + + i = 8; + memcpy (p + 20, "qrstu", 6); + memcpy (p + 25, "QRSTU", 6); + if (mempcpy (p + 25 + 1, s1, 3) != (p + 25 + 1 + 3) + || memcmp (p + 25, "Q123U", 6)) + abort (); + + if (mempcpy (mempcpy (p, "abcdEFG", 4), "efg", 4) != p + 8 + || memcmp (p, "abcdefg", 8)) + abort(); + + /* Test at least one instance of the __builtin_ style. We do this + to ensure that it works and that the prototype is correct. */ + if (__builtin_mempcpy (p, "ABCDE", 6) != p + 6 || memcmp (p, "ABCDE", 6)) + abort (); + + /* If the result of mempcpy is ignored, gcc should use memcpy. + This should be optimized always, so set inside_main again. 
*/ + inside_main = 1; + mempcpy (p + 5, s3, 1); + if (memcmp (p, "ABCDEFg", 8)) + abort (); + mempcpy (p + 6, s1 + 1, l1); + if (memcmp (p, "ABCDEF2", 8)) + abort (); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memset-chk-lib.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memset-chk-lib.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memset-chk-lib.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memset-chk-lib.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1 @@ +#include "lib/chk.c" Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memset-chk.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memset-chk.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memset-chk.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memset-chk.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,721 @@ +/* Copyright (C) 2004, 2005 Free Software Foundation. + + Ensure builtin __memset_chk performs correctly. */ + +extern void abort (void); +typedef __SIZE_TYPE__ size_t; +extern size_t strlen(const char *); +extern void *memcpy (void *, const void *, size_t); +extern void *memset (void *, int, size_t); +extern int memcmp (const void *, const void *, size_t); + +#include "chk.h" + +char buffer[32]; +int argc = 1; +volatile size_t l1 = 1; /* prevent constant propagation to happen when whole program assumptions are made. */ +volatile char *s3 = "FGH"; /* prevent constant propagation to happen when whole program assumptions are made. 
*/ +char *s4; + +void +__attribute__((noinline)) +test1 (void) +{ + memset_disallowed = 1; + chk_calls = 0; + memset (buffer, argc, 0); + memset (buffer, argc, 1); + memset (buffer, argc, 2); + memset (buffer, argc, 3); + memset (buffer, argc, 4); + memset (buffer, argc, 5); + memset (buffer, argc, 6); + memset (buffer, argc, 7); + memset (buffer, argc, 8); + memset (buffer, argc, 9); + memset (buffer, argc, 10); + memset (buffer, argc, 11); + memset (buffer, argc, 12); + memset (buffer, argc, 13); + memset (buffer, argc, 14); + memset (buffer, argc, 15); + memset (buffer, argc, 16); + memset (buffer, argc, 17); + memset_disallowed = 0; + if (chk_calls) + abort (); +} + +/* Test whether compile time checking is done where it should + and so is runtime object size checking. */ +void +__attribute__((noinline)) +test2 (void) +{ + struct A { char buf1[10]; char buf2[10]; } a; + char *r = l1 == 1 ? &a.buf1[5] : &a.buf2[4]; + char buf3[20]; + int i; + size_t l; + + /* The following calls should do runtime checking + - length is not known, but destination is. */ + chk_calls = 0; + memset (a.buf1 + 2, 'a', l1); + memset (r, '\0', l1 + 1); + r = l1 == 1 ? __builtin_alloca (4) : &a.buf2[7]; + memset (r, argc, l1 + 2); + memset (r + 2, 'Q', l1); + r = buf3; + for (i = 0; i < 4; ++i) + { + if (i == l1 - 1) + r = &a.buf1[1]; + else if (i == l1) + r = &a.buf2[7]; + else if (i == l1 + 1) + r = &buf3[5]; + else if (i == l1 + 2) + r = &a.buf1[9]; + } + memset (r, '\0', l1); + if (chk_calls != 5) + abort (); + + /* Following have known destination and known length, + so if optimizing certainly shouldn't result in the checking + variants. */ + chk_calls = 0; + memset (a.buf1 + 2, '\0', 1); + memset (r, argc, 2); + r = l1 == 1 ? 
__builtin_alloca (4) : &a.buf2[7]; + memset (r, 'N', 3); + r = buf3; + l = 4; + for (i = 0; i < 4; ++i) + { + if (i == l1 - 1) + r = &a.buf1[1], l = 2; + else if (i == l1) + r = &a.buf2[7], l = 3; + else if (i == l1 + 1) + r = &buf3[5], l = 4; + else if (i == l1 + 2) + r = &a.buf1[9], l = 1; + } + memset (r, 'H', 1); + /* Here, l is known to be at most 4 and __builtin_object_size (&buf3[16], 0) + is 4, so this doesn't need runtime checking. */ + memset (&buf3[16], 'd', l); + /* Neither length nor destination known. Doesn't need runtime checking. */ + memset (s4, 'a', l1); + memset (s4 + 2, '\0', l1 + 2); + /* Destination unknown. */ + memset (s4 + 4, 'b', 2); + memset (s4 + 6, '\0', 4); + if (chk_calls) + abort (); + chk_calls = 0; +} + +/* Test whether runtime and/or compile time checking catches + buffer overflows. */ +void +__attribute__((noinline)) +test3 (void) +{ + struct A { char buf1[10]; char buf2[10]; } a; + char buf3[20]; + + chk_fail_allowed = 1; + /* Runtime checks. */ + if (__builtin_setjmp (chk_fail_buf) == 0) + { + memset (&a.buf2[9], '\0', l1 + 1); + abort (); + } + if (__builtin_setjmp (chk_fail_buf) == 0) + { + memset (&a.buf2[7], 'T', strlen (s3) + 1); + abort (); + } + /* This should be detectable at compile time already. 
*/ + if (__builtin_setjmp (chk_fail_buf) == 0) + { + memset (&buf3[19], 'b', 2); + abort (); + } + chk_fail_allowed = 0; +} + +#ifndef MAX_OFFSET +#define MAX_OFFSET (sizeof (long long)) +#endif + +#ifndef MAX_COPY +#define MAX_COPY (10 * sizeof (long long)) +#define MAX_COPY2 15 +#else +#define MAX_COPY2 MAX_COPY +#endif + +#ifndef MAX_EXTRA +#define MAX_EXTRA (sizeof (long long)) +#endif + +#define MAX_LENGTH (MAX_OFFSET + MAX_COPY + MAX_EXTRA) +#define MAX_LENGTH2 (MAX_OFFSET + MAX_COPY2 + MAX_EXTRA) + +static union { + char buf[MAX_LENGTH]; + long long align_int; + long double align_fp; +} u; + +char A = 'A'; + +void +__attribute__((noinline)) +test4 (void) +{ + int off, len, i; + char *p, *q; + + for (off = 0; off < MAX_OFFSET; off++) + for (len = 1; len < MAX_COPY; len++) + { + for (i = 0; i < MAX_LENGTH; i++) + u.buf[i] = 'a'; + + p = memset (u.buf + off, '\0', len); + if (p != u.buf + off) + abort (); + + q = u.buf; + for (i = 0; i < off; i++, q++) + if (*q != 'a') + abort (); + + for (i = 0; i < len; i++, q++) + if (*q != '\0') + abort (); + + for (i = 0; i < MAX_EXTRA; i++, q++) + if (*q != 'a') + abort (); + + p = memset (u.buf + off, A, len); + if (p != u.buf + off) + abort (); + + q = u.buf; + for (i = 0; i < off; i++, q++) + if (*q != 'a') + abort (); + + for (i = 0; i < len; i++, q++) + if (*q != 'A') + abort (); + + for (i = 0; i < MAX_EXTRA; i++, q++) + if (*q != 'a') + abort (); + + p = memset (u.buf + off, 'B', len); + if (p != u.buf + off) + abort (); + + q = u.buf; + for (i = 0; i < off; i++, q++) + if (*q != 'a') + abort (); + + for (i = 0; i < len; i++, q++) + if (*q != 'B') + abort (); + + for (i = 0; i < MAX_EXTRA; i++, q++) + if (*q != 'a') + abort (); + } +} + +static union { + char buf[MAX_LENGTH2]; + long long align_int; + long double align_fp; +} u2; + +void reset () +{ + int i; + + for (i = 0; i < MAX_LENGTH2; i++) + u2.buf[i] = 'a'; +} + +void check (int off, int len, int ch) +{ + char *q; + int i; + + q = u2.buf; + for (i = 0; i < 
off; i++, q++) + if (*q != 'a') + abort (); + + for (i = 0; i < len; i++, q++) + if (*q != ch) + abort (); + + for (i = 0; i < MAX_EXTRA; i++, q++) + if (*q != 'a') + abort (); +} + +void +__attribute__((noinline)) +test5 (void) +{ + int off; + char *p; + + /* len == 1 */ + for (off = 0; off < MAX_OFFSET; off++) + { + reset (); + + p = memset (u2.buf + off, '\0', 1); + if (p != u2.buf + off) abort (); + check (off, 1, '\0'); + + p = memset (u2.buf + off, A, 1); + if (p != u2.buf + off) abort (); + check (off, 1, 'A'); + + p = memset (u2.buf + off, 'B', 1); + if (p != u2.buf + off) abort (); + check (off, 1, 'B'); + } + + /* len == 2 */ + for (off = 0; off < MAX_OFFSET; off++) + { + reset (); + + p = memset (u2.buf + off, '\0', 2); + if (p != u2.buf + off) abort (); + check (off, 2, '\0'); + + p = memset (u2.buf + off, A, 2); + if (p != u2.buf + off) abort (); + check (off, 2, 'A'); + + p = memset (u2.buf + off, 'B', 2); + if (p != u2.buf + off) abort (); + check (off, 2, 'B'); + } + + /* len == 3 */ + for (off = 0; off < MAX_OFFSET; off++) + { + reset (); + + p = memset (u2.buf + off, '\0', 3); + if (p != u2.buf + off) abort (); + check (off, 3, '\0'); + + p = memset (u2.buf + off, A, 3); + if (p != u2.buf + off) abort (); + check (off, 3, 'A'); + + p = memset (u2.buf + off, 'B', 3); + if (p != u2.buf + off) abort (); + check (off, 3, 'B'); + } + + /* len == 4 */ + for (off = 0; off < MAX_OFFSET; off++) + { + reset (); + + p = memset (u2.buf + off, '\0', 4); + if (p != u2.buf + off) abort (); + check (off, 4, '\0'); + + p = memset (u2.buf + off, A, 4); + if (p != u2.buf + off) abort (); + check (off, 4, 'A'); + + p = memset (u2.buf + off, 'B', 4); + if (p != u2.buf + off) abort (); + check (off, 4, 'B'); + } + + /* len == 5 */ + for (off = 0; off < MAX_OFFSET; off++) + { + reset (); + + p = memset (u2.buf + off, '\0', 5); + if (p != u2.buf + off) abort (); + check (off, 5, '\0'); + + p = memset (u2.buf + off, A, 5); + if (p != u2.buf + off) abort (); + check (off, 
5, 'A'); + + p = memset (u2.buf + off, 'B', 5); + if (p != u2.buf + off) abort (); + check (off, 5, 'B'); + } + + /* len == 6 */ + for (off = 0; off < MAX_OFFSET; off++) + { + reset (); + + p = memset (u2.buf + off, '\0', 6); + if (p != u2.buf + off) abort (); + check (off, 6, '\0'); + + p = memset (u2.buf + off, A, 6); + if (p != u2.buf + off) abort (); + check (off, 6, 'A'); + + p = memset (u2.buf + off, 'B', 6); + if (p != u2.buf + off) abort (); + check (off, 6, 'B'); + } + + /* len == 7 */ + for (off = 0; off < MAX_OFFSET; off++) + { + reset (); + + p = memset (u2.buf + off, '\0', 7); + if (p != u2.buf + off) abort (); + check (off, 7, '\0'); + + p = memset (u2.buf + off, A, 7); + if (p != u2.buf + off) abort (); + check (off, 7, 'A'); + + p = memset (u2.buf + off, 'B', 7); + if (p != u2.buf + off) abort (); + check (off, 7, 'B'); + } + + /* len == 8 */ + for (off = 0; off < MAX_OFFSET; off++) + { + reset (); + + p = memset (u2.buf + off, '\0', 8); + if (p != u2.buf + off) abort (); + check (off, 8, '\0'); + + p = memset (u2.buf + off, A, 8); + if (p != u2.buf + off) abort (); + check (off, 8, 'A'); + + p = memset (u2.buf + off, 'B', 8); + if (p != u2.buf + off) abort (); + check (off, 8, 'B'); + } + + /* len == 9 */ + for (off = 0; off < MAX_OFFSET; off++) + { + reset (); + + p = memset (u2.buf + off, '\0', 9); + if (p != u2.buf + off) abort (); + check (off, 9, '\0'); + + p = memset (u2.buf + off, A, 9); + if (p != u2.buf + off) abort (); + check (off, 9, 'A'); + + p = memset (u2.buf + off, 'B', 9); + if (p != u2.buf + off) abort (); + check (off, 9, 'B'); + } + + /* len == 10 */ + for (off = 0; off < MAX_OFFSET; off++) + { + reset (); + + p = memset (u2.buf + off, '\0', 10); + if (p != u2.buf + off) abort (); + check (off, 10, '\0'); + + p = memset (u2.buf + off, A, 10); + if (p != u2.buf + off) abort (); + check (off, 10, 'A'); + + p = memset (u2.buf + off, 'B', 10); + if (p != u2.buf + off) abort (); + check (off, 10, 'B'); + } + + /* len == 11 */ + for 
(off = 0; off < MAX_OFFSET; off++) + { + reset (); + + p = memset (u2.buf + off, '\0', 11); + if (p != u2.buf + off) abort (); + check (off, 11, '\0'); + + p = memset (u2.buf + off, A, 11); + if (p != u2.buf + off) abort (); + check (off, 11, 'A'); + + p = memset (u2.buf + off, 'B', 11); + if (p != u2.buf + off) abort (); + check (off, 11, 'B'); + } + + /* len == 12 */ + for (off = 0; off < MAX_OFFSET; off++) + { + reset (); + + p = memset (u2.buf + off, '\0', 12); + if (p != u2.buf + off) abort (); + check (off, 12, '\0'); + + p = memset (u2.buf + off, A, 12); + if (p != u2.buf + off) abort (); + check (off, 12, 'A'); + + p = memset (u2.buf + off, 'B', 12); + if (p != u2.buf + off) abort (); + check (off, 12, 'B'); + } + + /* len == 13 */ + for (off = 0; off < MAX_OFFSET; off++) + { + reset (); + + p = memset (u2.buf + off, '\0', 13); + if (p != u2.buf + off) abort (); + check (off, 13, '\0'); + + p = memset (u2.buf + off, A, 13); + if (p != u2.buf + off) abort (); + check (off, 13, 'A'); + + p = memset (u2.buf + off, 'B', 13); + if (p != u2.buf + off) abort (); + check (off, 13, 'B'); + } + + /* len == 14 */ + for (off = 0; off < MAX_OFFSET; off++) + { + reset (); + + p = memset (u2.buf + off, '\0', 14); + if (p != u2.buf + off) abort (); + check (off, 14, '\0'); + + p = memset (u2.buf + off, A, 14); + if (p != u2.buf + off) abort (); + check (off, 14, 'A'); + + p = memset (u2.buf + off, 'B', 14); + if (p != u2.buf + off) abort (); + check (off, 14, 'B'); + } + + /* len == 15 */ + for (off = 0; off < MAX_OFFSET; off++) + { + reset (); + + p = memset (u2.buf + off, '\0', 15); + if (p != u2.buf + off) abort (); + check (off, 15, '\0'); + + p = memset (u2.buf + off, A, 15); + if (p != u2.buf + off) abort (); + check (off, 15, 'A'); + + p = memset (u2.buf + off, 'B', 15); + if (p != u2.buf + off) abort (); + check (off, 15, 'B'); + } +} + +void +__attribute__((noinline)) +test6 (void) +{ + int len; + char *p; + + /* off == 0 */ + for (len = 0; len < MAX_COPY2; len++) 
+ { + reset (); + + p = memset (u2.buf, '\0', len); + if (p != u2.buf) abort (); + check (0, len, '\0'); + + p = memset (u2.buf, A, len); + if (p != u2.buf) abort (); + check (0, len, 'A'); + + p = memset (u2.buf, 'B', len); + if (p != u2.buf) abort (); + check (0, len, 'B'); + } + + /* off == 1 */ + for (len = 0; len < MAX_COPY2; len++) + { + reset (); + + p = memset (u2.buf+1, '\0', len); + if (p != u2.buf+1) abort (); + check (1, len, '\0'); + + p = memset (u2.buf+1, A, len); + if (p != u2.buf+1) abort (); + check (1, len, 'A'); + + p = memset (u2.buf+1, 'B', len); + if (p != u2.buf+1) abort (); + check (1, len, 'B'); + } + + /* off == 2 */ + for (len = 0; len < MAX_COPY2; len++) + { + reset (); + + p = memset (u2.buf+2, '\0', len); + if (p != u2.buf+2) abort (); + check (2, len, '\0'); + + p = memset (u2.buf+2, A, len); + if (p != u2.buf+2) abort (); + check (2, len, 'A'); + + p = memset (u2.buf+2, 'B', len); + if (p != u2.buf+2) abort (); + check (2, len, 'B'); + } + + /* off == 3 */ + for (len = 0; len < MAX_COPY2; len++) + { + reset (); + + p = memset (u2.buf+3, '\0', len); + if (p != u2.buf+3) abort (); + check (3, len, '\0'); + + p = memset (u2.buf+3, A, len); + if (p != u2.buf+3) abort (); + check (3, len, 'A'); + + p = memset (u2.buf+3, 'B', len); + if (p != u2.buf+3) abort (); + check (3, len, 'B'); + } + + /* off == 4 */ + for (len = 0; len < MAX_COPY2; len++) + { + reset (); + + p = memset (u2.buf+4, '\0', len); + if (p != u2.buf+4) abort (); + check (4, len, '\0'); + + p = memset (u2.buf+4, A, len); + if (p != u2.buf+4) abort (); + check (4, len, 'A'); + + p = memset (u2.buf+4, 'B', len); + if (p != u2.buf+4) abort (); + check (4, len, 'B'); + } + + /* off == 5 */ + for (len = 0; len < MAX_COPY2; len++) + { + reset (); + + p = memset (u2.buf+5, '\0', len); + if (p != u2.buf+5) abort (); + check (5, len, '\0'); + + p = memset (u2.buf+5, A, len); + if (p != u2.buf+5) abort (); + check (5, len, 'A'); + + p = memset (u2.buf+5, 'B', len); + if (p != 
u2.buf+5) abort (); + check (5, len, 'B'); + } + + /* off == 6 */ + for (len = 0; len < MAX_COPY2; len++) + { + reset (); + + p = memset (u2.buf+6, '\0', len); + if (p != u2.buf+6) abort (); + check (6, len, '\0'); + + p = memset (u2.buf+6, A, len); + if (p != u2.buf+6) abort (); + check (6, len, 'A'); + + p = memset (u2.buf+6, 'B', len); + if (p != u2.buf+6) abort (); + check (6, len, 'B'); + } + + /* off == 7 */ + for (len = 0; len < MAX_COPY2; len++) + { + reset (); + + p = memset (u2.buf+7, '\0', len); + if (p != u2.buf+7) abort (); + check (7, len, '\0'); + + p = memset (u2.buf+7, A, len); + if (p != u2.buf+7) abort (); + check (7, len, 'A'); + + p = memset (u2.buf+7, 'B', len); + if (p != u2.buf+7) abort (); + check (7, len, 'B'); + } +} + +void +main_test (void) +{ +#ifndef __OPTIMIZE__ + /* Object size checking is only intended for -O[s123]. */ + return; +#endif + __asm ("" : "=r" (l1) : "0" (l1)); + s4 = buffer; + test1 (); + test2 (); + test3 (); + test4 (); + test5 (); + test6 (); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memset-chk.x URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memset-chk.x?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memset-chk.x (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memset-chk.x Wed Oct 9 04:01:46 2019 @@ -0,0 +1,13 @@ +load_lib target-supports.exp + +if { ! [check_effective_target_nonlocal_goto] } { + return 1 +} + +if [istarget "epiphany-*-*"] { + # This test assumes the absence of struct padding. + # to make this true for test3 struct A on epiphany would require + # __attribute__((packed)) . 
+ return 1 +} +return 0 Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memset-lib.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memset-lib.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memset-lib.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memset-lib.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1 @@ +#include "lib/memset.c" Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memset.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memset.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memset.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/memset.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,36 @@ +/* Copyright (C) 2002, 2003 Free Software Foundation. + + Ensure that builtin memset operations for constant length and + non-constant assigned value don't cause compiler problems. + + Written by Roger Sayle, 21 April 2002. 
*/ + +extern void abort (void); +typedef __SIZE_TYPE__ size_t; +extern void *memset (void *, int, size_t); + +char buffer[32]; +int argc = 1; + +void +main_test (void) +{ + memset (buffer, argc, 0); + memset (buffer, argc, 1); + memset (buffer, argc, 2); + memset (buffer, argc, 3); + memset (buffer, argc, 4); + memset (buffer, argc, 5); + memset (buffer, argc, 6); + memset (buffer, argc, 7); + memset (buffer, argc, 8); + memset (buffer, argc, 9); + memset (buffer, argc, 10); + memset (buffer, argc, 11); + memset (buffer, argc, 12); + memset (buffer, argc, 13); + memset (buffer, argc, 14); + memset (buffer, argc, 15); + memset (buffer, argc, 16); + memset (buffer, argc, 17); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/pr22237-lib.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/pr22237-lib.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/pr22237-lib.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/pr22237-lib.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,27 @@ +extern void abort (void); + +void * +memcpy (void *dst, const void *src, __SIZE_TYPE__ n) +{ + const char *srcp; + char *dstp; + + srcp = src; + dstp = dst; + + if (dst < src) + { + if (dst + n > src) + abort (); + } + else + { + if (src + n > dst) + abort (); + } + + while (n-- != 0) + *dstp++ = *srcp++; + + return dst; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/pr22237.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/pr22237.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/pr22237.c (added) +++ 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/pr22237.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,44 @@ +extern void abort (void); +extern void exit (int); +struct s { unsigned char a[256]; }; +union u { struct { struct s b; int c; } d; struct { int c; struct s b; } e; }; +static union u v; +static union u v0; +static struct s *p = &v.d.b; +static struct s *q = &v.e.b; + +static inline struct s rp (void) { return *p; } +static inline struct s rq (void) { return *q; } +static void pq (void) { *p = rq(); } +static void qp (void) { *q = rp(); } + +static void +init (struct s *sp) +{ + int i; + for (i = 0; i < 256; i++) + sp->a[i] = i; +} + +static void +check (struct s *sp) +{ + int i; + for (i = 0; i < 256; i++) + if (sp->a[i] != i) + abort (); +} + +void +main_test (void) +{ + v = v0; + init (p); + qp (); + check (q); + v = v0; + init (q); + pq (); + check (p); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/pr23484-chk-lib.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/pr23484-chk-lib.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/pr23484-chk-lib.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/pr23484-chk-lib.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1 @@ +#include "lib/chk.c" Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/pr23484-chk.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/pr23484-chk.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/pr23484-chk.c (added) +++ 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/pr23484-chk.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,67 @@ +/* PR middle-end/23484 */ + +extern void abort (void); +typedef __SIZE_TYPE__ size_t; +extern size_t strlen (const char *); +extern void *memcpy (void *, const void *, size_t); +extern void *mempcpy (void *, const void *, size_t); +extern void *memmove (void *, const void *, size_t); +extern int snprintf (char *, size_t, const char *, ...); +extern int memcmp (const void *, const void *, size_t); + +#include "chk.h" + +static char data[8] = "ABCDEFG"; + +int l1; + +void +__attribute__((noinline)) +test1 (void) +{ + char buf[8]; + + /* All the checking calls in this routine have a maximum length, so + object size checking should be done at compile time if optimizing. */ + chk_calls = 0; + + memset (buf, 'I', sizeof (buf)); + if (memcpy (buf, data, l1 ? sizeof (buf) : 4) != buf + || memcmp (buf, "ABCDIIII", 8)) + abort (); + + memset (buf, 'J', sizeof (buf)); + if (mempcpy (buf, data, l1 ? sizeof (buf) : 4) != buf + 4 + || memcmp (buf, "ABCDJJJJ", 8)) + abort (); + + memset (buf, 'K', sizeof (buf)); + if (memmove (buf, data, l1 ? sizeof (buf) : 4) != buf + || memcmp (buf, "ABCDKKKK", 8)) + abort (); + + memset (buf, 'L', sizeof (buf)); +#if(__SIZEOF_INT__ >= 4) + if (snprintf (buf, l1 ? sizeof (buf) : 4, "%d", l1 + 65536) != 5 + || memcmp (buf, "655\0LLLL", 8)) + abort (); +#else + if (snprintf (buf, l1 ? sizeof (buf) : 4, "%d", l1 + 32700) != 5 + || memcmp (buf, "327\0LLLL", 8)) + abort (); +#endif + + if (chk_calls) + abort (); +} + +void +main_test (void) +{ +#ifndef __OPTIMIZE__ + /* Object size checking is only intended for -O[s123]. 
*/ + return; +#endif + __asm ("" : "=r" (l1) : "0" (l1)); + test1 (); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/pr23484-chk.x URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/pr23484-chk.x?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/pr23484-chk.x (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/pr23484-chk.x Wed Oct 9 04:01:46 2019 @@ -0,0 +1,7 @@ +load_lib target-supports.exp + +if { ! [check_effective_target_nonlocal_goto] } { + return 1 +} + +return 0 Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/printf-lib.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/printf-lib.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/printf-lib.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/printf-lib.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1 @@ +#include "lib/printf.c" Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/printf.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/printf.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/printf.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/printf.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,52 @@ +/* Copyright (C) 2000 Free Software Foundation. 
+ + Ensure all expected transformations of builtin printf occur and + that we honor side effects in the arguments. + + Written by Kaveh R. Ghazi, 12/4/2000. */ + +extern int printf (const char *, ...); +extern int printf_unlocked (const char *, ...); +extern void abort(void); + +void +main_test (void) +{ + const char *const s1 = "hello world"; + const char *const s2[] = { s1, 0 }, *const*s3; + + printf ("%s\n", "hello"); + printf ("%s\n", *s2); + s3 = s2; + printf ("%s\n", *s3++); + if (s3 != s2+1 || *s3 != 0) + abort(); + + printf ("%c", '\n'); + printf ("%c", **s2); + s3 = s2; + printf ("%c", **s3++); + if (s3 != s2+1 || *s3 != 0) + abort(); + + printf (""); + printf ("%s", ""); + printf ("\n"); + printf ("%s", "\n"); + printf ("hello world\n"); + printf ("%s", "hello world\n"); + + /* Test at least one instance of the __builtin_ style. We do this + to ensure that it works and that the prototype is correct. */ + __builtin_printf ("%s\n", "hello"); + /* These builtin stubs are called by __builtin_printf, ensure their + prototypes are set correctly too. */ + __builtin_putchar ('\n'); + __builtin_puts ("hello"); + /* Check the unlocked style, these evaluate to nothing to avoid + problems on systems without the unlocked functions. 
*/ + printf_unlocked (""); + __builtin_printf_unlocked (""); + printf_unlocked ("%s", ""); + __builtin_printf_unlocked ("%s", ""); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/snprintf-chk-lib.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/snprintf-chk-lib.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/snprintf-chk-lib.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/snprintf-chk-lib.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1 @@ +#include "lib/chk.c" Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/snprintf-chk.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/snprintf-chk.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/snprintf-chk.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/snprintf-chk.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,220 @@ +/* Copyright (C) 2004, 2005 Free Software Foundation. + + Ensure builtin __snprintf_chk performs correctly. 
*/ + +extern void abort (void); +typedef __SIZE_TYPE__ size_t; +extern size_t strlen(const char *); +extern void *memcpy (void *, const void *, size_t); +extern char *strcpy (char *, const char *); +extern int memcmp (const void *, const void *, size_t); +extern void *memset (void *, int, size_t); +extern int sprintf (char *, const char *, ...); +extern int snprintf (char *, size_t, const char *, ...); + +#include "chk.h" + +const char s1[] = "123"; +char p[32] = ""; +char *s2 = "defg"; +char *s3 = "FGH"; +char *s4; +size_t l1 = 1; +static char buffer[32]; +char * volatile ptr = "barf"; /* prevent constant propagation to happen when whole program assumptions are made. */ + +void +__attribute__((noinline)) +test1 (void) +{ + chk_calls = 0; + /* snprintf_disallowed = 1; */ + + memset (buffer, 'A', 32); + snprintf (buffer, 4, "foo"); + if (memcmp (buffer, "foo", 4) || buffer[4] != 'A') + abort (); + + memset (buffer, 'A', 32); + if (snprintf (buffer, 4, "foo bar") != 7) + abort (); + if (memcmp (buffer, "foo", 4) || buffer[4] != 'A') + abort (); + + memset (buffer, 'A', 32); + snprintf (buffer, 32, "%s", "bar"); + if (memcmp (buffer, "bar", 4) || buffer[4] != 'A') + abort (); + + memset (buffer, 'A', 32); + if (snprintf (buffer, 21, "%s", "bar") != 3) + abort (); + if (memcmp (buffer, "bar", 4) || buffer[4] != 'A') + abort (); + + snprintf_disallowed = 0; + + memset (buffer, 'A', 32); + if (snprintf (buffer, 4, "%d%d%d", (int) l1, (int) l1 + 1, (int) l1 + 12) + != 4) + abort (); + if (memcmp (buffer, "121", 4) || buffer[4] != 'A') + abort (); + + memset (buffer, 'A', 32); + if (snprintf (buffer, 32, "%d%d%d", (int) l1, (int) l1 + 1, (int) l1 + 12) + != 4) + abort (); + if (memcmp (buffer, "1213", 5) || buffer[5] != 'A') + abort (); + + if (chk_calls) + abort (); + + memset (buffer, 'A', 32); + snprintf (buffer, strlen (ptr) + 1, "%s", ptr); + if (memcmp (buffer, "barf", 5) || buffer[5] != 'A') + abort (); + + memset (buffer, 'A', 32); + snprintf (buffer, l1 + 31, "%d 
- %c", (int) l1 + 27, *ptr); + if (memcmp (buffer, "28 - b\0AAAAA", 12)) + abort (); + + if (chk_calls != 2) + abort (); + chk_calls = 0; + + memset (s4, 'A', 32); + snprintf (s4, l1 + 6, "%d - %c", (int) l1 - 17, ptr[1]); + if (memcmp (s4, "-16 - \0AAA", 10)) + abort (); + if (chk_calls) + abort (); +} + +/* Test whether compile time checking is done where it should + and so is runtime object size checking. */ +void +__attribute__((noinline)) +test2 (void) +{ + struct A { char buf1[10]; char buf2[10]; } a; + char *r = l1 == 1 ? &a.buf1[5] : &a.buf2[4]; + char buf3[20]; + int i; + + /* The following calls should do runtime checking + - length is not known, but destination is. */ + chk_calls = 0; + snprintf (a.buf1 + 2, l1, "%s", s3 + 3); + snprintf (r, l1 + 4, "%s%c", s3 + 3, s3[3]); + r = l1 == 1 ? __builtin_alloca (4) : &a.buf2[7]; + snprintf (r, strlen (s2) - 2, "%c %s", s2[2], s2 + 4); + snprintf (r + 2, l1, s3 + 3); + r = buf3; + for (i = 0; i < 4; ++i) + { + if (i == l1 - 1) + r = &a.buf1[1]; + else if (i == l1) + r = &a.buf2[7]; + else if (i == l1 + 1) + r = &buf3[5]; + else if (i == l1 + 2) + r = &a.buf1[9]; + } + snprintf (r, l1, s2 + 4); + if (chk_calls != 5) + abort (); + + /* Following have known destination and known source length, + so if optimizing certainly shouldn't result in the checking + variants. */ + chk_calls = 0; + /* snprintf_disallowed = 1; */ + snprintf (a.buf1 + 2, 4, ""); + snprintf (r, 1, "a"); + r = l1 == 1 ? __builtin_alloca (4) : &a.buf2[7]; + snprintf (r, 3, "%s", s1 + 1); + r = buf3; + for (i = 0; i < 4; ++i) + { + if (i == l1 - 1) + r = &a.buf1[1]; + else if (i == l1) + r = &a.buf2[7]; + else if (i == l1 + 1) + r = &buf3[5]; + else if (i == l1 + 2) + r = &a.buf1[9]; + } + snprintf (r, 1, "%s", ""); + snprintf (r, 0, "%s", ""); + snprintf_disallowed = 0; + /* Unknown destination and source, no checking. 
*/ + snprintf (s4, l1 + 31, "%s %d", s3, 0); + if (chk_calls) + abort (); +} + +/* Test whether runtime and/or compile time checking catches + buffer overflows. */ +void +__attribute__((noinline)) +test3 (void) +{ + struct A { char buf1[10]; char buf2[10]; } a; + char buf3[20]; + + chk_fail_allowed = 1; + /* Runtime checks. */ + if (__builtin_setjmp (chk_fail_buf) == 0) + { + snprintf (&a.buf2[9], l1 + 1, "%c%s", s2[3], s2 + 4); + abort (); + } + if (__builtin_setjmp (chk_fail_buf) == 0) + { + snprintf (&a.buf2[7], l1 + 30, "%s%c", s3 + strlen (s3) - 2, *s3); + abort (); + } + if (__builtin_setjmp (chk_fail_buf) == 0) + { + snprintf (&a.buf2[7], l1 + 3, "%d", (int) l1 + 9999); + abort (); + } + /* This should be detectable at compile time already. */ + if (__builtin_setjmp (chk_fail_buf) == 0) + { + snprintf (&buf3[19], 2, "a"); + abort (); + } + if (__builtin_setjmp (chk_fail_buf) == 0) + { + snprintf (&buf3[17], 4, "a"); + abort (); + } + if (__builtin_setjmp (chk_fail_buf) == 0) + { + snprintf (&buf3[17], 4, "%s", "abc"); + abort (); + } + chk_fail_allowed = 0; +} + +void +main_test (void) +{ +#ifndef __OPTIMIZE__ + /* Object size checking is only intended for -O[s123]. */ + return; +#endif + __asm ("" : "=r" (s2) : "0" (s2)); + __asm ("" : "=r" (s3) : "0" (s3)); + __asm ("" : "=r" (l1) : "0" (l1)); + s4 = p; + test1 (); + test2 (); + test3 (); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/snprintf-chk.x URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/snprintf-chk.x?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/snprintf-chk.x (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/snprintf-chk.x Wed Oct 9 04:01:46 2019 @@ -0,0 +1,13 @@ +load_lib target-supports.exp + +if { ! 
[check_effective_target_nonlocal_goto] } { + return 1 +} + +if [istarget "epiphany-*-*"] { + # This test assumes the absence of struct padding. + # to make this true for test3 struct A on epiphany would require + # __attribute__((packed)) . + return 1 +} +return 0 Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/sprintf-chk-lib.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/sprintf-chk-lib.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/sprintf-chk-lib.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/sprintf-chk-lib.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1 @@ +#include "lib/chk.c" Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/sprintf-chk.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/sprintf-chk.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/sprintf-chk.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/sprintf-chk.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,197 @@ +/* Copyright (C) 2004, 2005 Free Software Foundation. + + Ensure builtin __sprintf_chk performs correctly. 
*/ + +extern void abort (void); +typedef __SIZE_TYPE__ size_t; +extern size_t strlen(const char *); +extern void *memcpy (void *, const void *, size_t); +extern char *strcpy (char *, const char *); +extern int memcmp (const void *, const void *, size_t); +extern void *memset (void *, int, size_t); +extern int sprintf (char *, const char *, ...); + +#include "chk.h" + +LOCAL const char s1[] = "123"; +char p[32] = ""; +char *s2 = "defg"; +char *s3 = "FGH"; +char *s4; +size_t l1 = 1; +static char buffer[32]; +char * volatile ptr = "barf"; /* prevent constant propagation to happen when whole program assumptions are made. */ + +void +__attribute__((noinline)) +test1 (void) +{ + chk_calls = 0; + sprintf_disallowed = 1; + + memset (buffer, 'A', 32); + sprintf (buffer, "foo"); + if (memcmp (buffer, "foo", 4) || buffer[4] != 'A') + abort (); + + memset (buffer, 'A', 32); + if (sprintf (buffer, "foo") != 3) + abort (); + if (memcmp (buffer, "foo", 4) || buffer[4] != 'A') + abort (); + + memset (buffer, 'A', 32); + sprintf (buffer, "%s", "bar"); + if (memcmp (buffer, "bar", 4) || buffer[4] != 'A') + abort (); + + memset (buffer, 'A', 32); + if (sprintf (buffer, "%s", "bar") != 3) + abort (); + if (memcmp (buffer, "bar", 4) || buffer[4] != 'A') + abort (); + + if (chk_calls) + abort (); + sprintf_disallowed = 0; + + memset (buffer, 'A', 32); + sprintf (buffer, "%s", ptr); + if (memcmp (buffer, "barf", 5) || buffer[5] != 'A') + abort (); + + memset (buffer, 'A', 32); + sprintf (buffer, "%d - %c", (int) l1 + 27, *ptr); + if (memcmp (buffer, "28 - b\0AAAAA", 12)) + abort (); + + if (chk_calls != 2) + abort (); + chk_calls = 0; + + sprintf (s4, "%d - %c", (int) l1 - 17, ptr[1]); + if (memcmp (s4, "-16 - a", 8)) + abort (); + if (chk_calls) + abort (); +} + +/* Test whether compile time checking is done where it should + and so is runtime object size checking. */ +void +__attribute__((noinline)) +test2 (void) +{ + struct A { char buf1[10]; char buf2[10]; } a; + char *r = l1 == 1 ? 
&a.buf1[5] : &a.buf2[4]; + char buf3[20]; + int i; + + /* The following calls should do runtime checking + - source length is not known, but destination is. */ + chk_calls = 0; + sprintf (a.buf1 + 2, "%s", s3 + 3); + sprintf (r, "%s%c", s3 + 3, s3[3]); + r = l1 == 1 ? __builtin_alloca (4) : &a.buf2[7]; + sprintf (r, "%c %s", s2[2], s2 + 4); + sprintf (r + 2, s3 + 3); + r = buf3; + for (i = 0; i < 4; ++i) + { + if (i == l1 - 1) + r = &a.buf1[1]; + else if (i == l1) + r = &a.buf2[7]; + else if (i == l1 + 1) + r = &buf3[5]; + else if (i == l1 + 2) + r = &a.buf1[9]; + } + sprintf (r, s2 + 4); + if (chk_calls != 5) + abort (); + + /* Following have known destination and known source length, + so if optimizing certainly shouldn't result in the checking + variants. */ + chk_calls = 0; + sprintf_disallowed = 1; + sprintf (a.buf1 + 2, ""); + sprintf (r, "a"); + r = l1 == 1 ? __builtin_alloca (4) : &a.buf2[7]; + sprintf (r, "%s", s1 + 1); + r = buf3; + for (i = 0; i < 4; ++i) + { + if (i == l1 - 1) + r = &a.buf1[1]; + else if (i == l1) + r = &a.buf2[7]; + else if (i == l1 + 1) + r = &buf3[5]; + else if (i == l1 + 2) + r = &a.buf1[9]; + } + sprintf (r, "%s", ""); + sprintf_disallowed = 0; + /* Unknown destination and source, no checking. */ + sprintf (s4, "%s %d", s3, 0); + if (chk_calls) + abort (); +} + +/* Test whether runtime and/or compile time checking catches + buffer overflows. */ +void +__attribute__((noinline)) +test3 (void) +{ + struct A { char buf1[10]; char buf2[10]; } a; + char buf3[20]; + + chk_fail_allowed = 1; + /* Runtime checks. */ + if (__builtin_setjmp (chk_fail_buf) == 0) + { + sprintf (&a.buf2[9], "%c%s", s2[3], s2 + 4); + abort (); + } + if (__builtin_setjmp (chk_fail_buf) == 0) + { + sprintf (&a.buf2[7], "%s%c", s3 + strlen (s3) - 2, *s3); + abort (); + } + if (__builtin_setjmp (chk_fail_buf) == 0) + { + sprintf (&a.buf2[7], "%d", (int) l1 + 9999); + abort (); + } + /* This should be detectable at compile time already. 
*/ + if (__builtin_setjmp (chk_fail_buf) == 0) + { + sprintf (&buf3[19], "a"); + abort (); + } + if (__builtin_setjmp (chk_fail_buf) == 0) + { + sprintf (&buf3[17], "%s", "abc"); + abort (); + } + chk_fail_allowed = 0; +} + +void +main_test (void) +{ +#ifndef __OPTIMIZE__ + /* Object size checking is only intended for -O[s123]. */ + return; +#endif + __asm ("" : "=r" (s2) : "0" (s2)); + __asm ("" : "=r" (s3) : "0" (s3)); + __asm ("" : "=r" (l1) : "0" (l1)); + s4 = p; + test1 (); + test2 (); + test3 (); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/sprintf-chk.x URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/sprintf-chk.x?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/sprintf-chk.x (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/sprintf-chk.x Wed Oct 9 04:01:46 2019 @@ -0,0 +1,13 @@ +load_lib target-supports.exp + +if { ! [check_effective_target_nonlocal_goto] } { + return 1 +} + +if [istarget "epiphany-*-*"] { + # This test assumes the absence of struct padding. + # to make this true for test3 struct A on epiphany would require + # __attribute__((packed)) . 
+ return 1 +} +return 0 Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/sprintf-lib.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/sprintf-lib.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/sprintf-lib.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/sprintf-lib.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1 @@ +#include "lib/sprintf.c" Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/sprintf.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/sprintf.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/sprintf.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/sprintf.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,71 @@ +/* Copyright (C) 2003 Free Software Foundation. + + Test sprintf optimizations don't break anything and return the + correct results. + + Written by Roger Sayle, June 22, 2003. 
*/ + +static char buffer[32]; + +extern void abort (); +typedef __SIZE_TYPE__ size_t; +extern int sprintf(char*, const char*, ...); +extern void *memset(void*, int, size_t); +extern int memcmp(const void*, const void*, size_t); + +void test1() +{ + sprintf(buffer,"foo"); +} + +int test2() +{ + return sprintf(buffer,"foo"); +} + +void test3() +{ + sprintf(buffer,"%s","bar"); +} + +int test4() +{ + return sprintf(buffer,"%s","bar"); +} + +void test5(char *ptr) +{ + sprintf(buffer,"%s",ptr); +} + + +void +main_test (void) +{ + memset (buffer, 'A', 32); + test1 (); + if (memcmp(buffer, "foo", 4) || buffer[4] != 'A') + abort (); + + memset (buffer, 'A', 32); + if (test2 () != 3) + abort (); + if (memcmp(buffer, "foo", 4) || buffer[4] != 'A') + abort (); + + memset (buffer, 'A', 32); + test3 (); + if (memcmp(buffer, "bar", 4) || buffer[4] != 'A') + abort (); + + memset (buffer, 'A', 32); + if (test4 () != 3) + abort (); + if (memcmp(buffer, "bar", 4) || buffer[4] != 'A') + abort (); + + memset (buffer, 'A', 32); + test5 ("barf"); + if (memcmp(buffer, "barf", 5) || buffer[5] != 'A') + abort (); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/stpcpy-chk-lib.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/stpcpy-chk-lib.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/stpcpy-chk-lib.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/stpcpy-chk-lib.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1 @@ +#include "lib/chk.c" Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/stpcpy-chk.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/stpcpy-chk.c?rev=374156&view=auto 
============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/stpcpy-chk.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/stpcpy-chk.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,265 @@ +/* Copyright (C) 2004, 2005 Free Software Foundation. + + Ensure builtin __stpcpy_chk performs correctly. */ + +extern void abort (void); +typedef __SIZE_TYPE__ size_t; +extern size_t strlen(const char *); +extern void *memcpy (void *, const void *, size_t); +extern char *stpcpy (char *, const char *); +extern int memcmp (const void *, const void *, size_t); + +#include "chk.h" + +LOCAL const char s1[] = "123"; +char p[32] = ""; +char *s2 = "defg"; +char *s3 = "FGH"; +char *s4; +size_t l1 = 1; + +void +__attribute__((noinline)) +test1 (void) +{ + int i = 8; + +#if defined __i386__ || defined __x86_64__ + /* The functions below might not be optimized into direct stores on all + arches. It depends on how many instructions would be generated and + what limits the architecture chooses in STORE_BY_PIECES_P. */ + stpcpy_disallowed = 1; +#endif + if (stpcpy (p, "abcde") != p + 5 || memcmp (p, "abcde", 6)) + abort (); + if (stpcpy (p + 16, "vwxyz" + 1) != p + 16 + 4 || memcmp (p + 16, "wxyz", 5)) + abort (); + if (stpcpy (p + 1, "") != p + 1 + 0 || memcmp (p, "a\0cde", 6)) + abort (); + if (stpcpy (p + 3, "fghij") != p + 3 + 5 || memcmp (p, "a\0cfghij", 9)) + abort (); + + if (stpcpy ((i++, p + 20 + 1), "23") != (p + 20 + 1 + 2) + || i != 9 || memcmp (p + 19, "z\0""23\0", 5)) + abort (); + + if (stpcpy (stpcpy (p, "ABCD"), "EFG") != p + 7 || memcmp (p, "ABCDEFG", 8)) + abort(); + + /* Test at least one instance of the __builtin_ style. We do this + to ensure that it works and that the prototype is correct. 
*/ + if (__builtin_stpcpy (p, "abcde") != p + 5 || memcmp (p, "abcde", 6)) + abort (); + + /* If return value of stpcpy is ignored, it should be optimized into + strcpy call. */ + stpcpy_disallowed = 1; + stpcpy (p + 1, "abcd"); + stpcpy_disallowed = 0; + if (memcmp (p, "aabcd", 6)) + abort (); + + if (chk_calls) + abort (); + + chk_calls = 0; + strcpy_disallowed = 1; + if (stpcpy (p, s2) != p + 4 || memcmp (p, "defg\0", 6)) + abort (); + strcpy_disallowed = 0; + stpcpy_disallowed = 1; + stpcpy (p + 2, s3); + stpcpy_disallowed = 0; + if (memcmp (p, "deFGH", 6)) + abort (); + if (chk_calls != 2) + abort (); +} + +#ifndef MAX_OFFSET +#define MAX_OFFSET (sizeof (long long)) +#endif + +#ifndef MAX_COPY +#define MAX_COPY (10 * sizeof (long long)) +#endif + +#ifndef MAX_EXTRA +#define MAX_EXTRA (sizeof (long long)) +#endif + +#define MAX_LENGTH (MAX_OFFSET + MAX_COPY + 1 + MAX_EXTRA) + +/* Use a sequence length that is not divisible by two, to make it more + likely to detect when words are mixed up. 
*/ +#define SEQUENCE_LENGTH 31 + +static union { + char buf[MAX_LENGTH]; + long long align_int; + long double align_fp; +} u1, u2; + +volatile char *vx; + +void +__attribute__((noinline)) +test2 (void) +{ + int off1, off2, len, i; + char *p, *q, c; + + for (off1 = 0; off1 < MAX_OFFSET; off1++) + for (off2 = 0; off2 < MAX_OFFSET; off2++) + for (len = 1; len < MAX_COPY; len++) + { + for (i = 0, c = 'A'; i < MAX_LENGTH; i++, c++) + { + u1.buf[i] = 'a'; + if (c >= 'A' + SEQUENCE_LENGTH) + c = 'A'; + u2.buf[i] = c; + } + u2.buf[off2 + len] = '\0'; + + p = stpcpy (u1.buf + off1, u2.buf + off2); + if (p != u1.buf + off1 + len) + abort (); + + q = u1.buf; + for (i = 0; i < off1; i++, q++) + if (*q != 'a') + abort (); + + for (i = 0, c = 'A' + off2; i < len; i++, q++, c++) + { + if (c >= 'A' + SEQUENCE_LENGTH) + c = 'A'; + if (*q != c) + abort (); + } + + if (*q++ != '\0') + abort (); + for (i = 0; i < MAX_EXTRA; i++, q++) + if (*q != 'a') + abort (); + } +} + +/* Test whether compile time checking is done where it should + and so is runtime object size checking. */ +void +__attribute__((noinline)) +test3 (void) +{ + struct A { char buf1[10]; char buf2[10]; } a; + char *r = l1 == 1 ? &a.buf1[5] : &a.buf2[4]; + char buf3[20]; + int i; + const char *l; + + /* The following calls should do runtime checking + - source length is not known, but destination is. */ + chk_calls = 0; + vx = stpcpy (a.buf1 + 2, s3 + 3); + vx = stpcpy (r, s3 + 2); + r = l1 == 1 ? __builtin_alloca (4) : &a.buf2[7]; + vx = stpcpy (r, s2 + 2); + vx = stpcpy (r + 2, s3 + 3); + r = buf3; + for (i = 0; i < 4; ++i) + { + if (i == l1 - 1) + r = &a.buf1[1]; + else if (i == l1) + r = &a.buf2[7]; + else if (i == l1 + 1) + r = &buf3[5]; + else if (i == l1 + 2) + r = &a.buf1[9]; + } + vx = stpcpy (r, s2 + 4); + if (chk_calls != 5) + abort (); + + /* Following have known destination and known source length, + so if optimizing certainly shouldn't result in the checking + variants. 
*/ + chk_calls = 0; + vx = stpcpy (a.buf1 + 2, ""); + vx = stpcpy (r, "a"); + r = l1 == 1 ? __builtin_alloca (4) : &a.buf2[7]; + vx = stpcpy (r, s1 + 1); + r = buf3; + l = "abc"; + for (i = 0; i < 4; ++i) + { + if (i == l1 - 1) + r = &a.buf1[1], l = "e"; + else if (i == l1) + r = &a.buf2[7], l = "gh"; + else if (i == l1 + 1) + r = &buf3[5], l = "jkl"; + else if (i == l1 + 2) + r = &a.buf1[9], l = ""; + } + vx = stpcpy (r, ""); + /* Here, strlen (l) + 1 is known to be at most 4 and + __builtin_object_size (&buf3[16], 0) is 4, so this doesn't need + runtime checking. */ + vx = stpcpy (&buf3[16], l); + /* Unknown destination and source, no checking. */ + vx = stpcpy (s4, s3); + stpcpy (s4 + 4, s3); + if (chk_calls) + abort (); + chk_calls = 0; +} + +/* Test whether runtime and/or compile time checking catches + buffer overflows. */ +void +__attribute__((noinline)) +test4 (void) +{ + struct A { char buf1[10]; char buf2[10]; } a; + char buf3[20]; + + chk_fail_allowed = 1; + /* Runtime checks. */ + if (__builtin_setjmp (chk_fail_buf) == 0) + { + vx = stpcpy (&a.buf2[9], s2 + 3); + abort (); + } + if (__builtin_setjmp (chk_fail_buf) == 0) + { + vx = stpcpy (&a.buf2[7], s3 + strlen (s3) - 3); + abort (); + } + /* This should be detectable at compile time already. */ + if (__builtin_setjmp (chk_fail_buf) == 0) + { + vx = stpcpy (&buf3[19], "a"); + abort (); + } + chk_fail_allowed = 0; +} + +void +main_test (void) +{ +#ifndef __OPTIMIZE__ + /* Object size checking is only intended for -O[s123]. 
*/ + return; +#endif + __asm ("" : "=r" (s2) : "0" (s2)); + __asm ("" : "=r" (s3) : "0" (s3)); + __asm ("" : "=r" (l1) : "0" (l1)); + test1 (); + s4 = p; + test2 (); + test3 (); + test4 (); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/stpcpy-chk.x URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/stpcpy-chk.x?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/stpcpy-chk.x (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/stpcpy-chk.x Wed Oct 9 04:01:46 2019 @@ -0,0 +1,13 @@ +load_lib target-supports.exp + +if { ! [check_effective_target_nonlocal_goto] } { + return 1 +} + +if [istarget "epiphany-*-*"] { + # This test assumes the absence of struct padding. + # to make this true for test4 struct A on epiphany would require + # __attribute__((packed)) . 
+ return 1 +} +return 0 Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/stpncpy-chk-lib.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/stpncpy-chk-lib.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/stpncpy-chk-lib.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/stpncpy-chk-lib.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1 @@ +#include "lib/chk.c" Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/stpncpy-chk.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/stpncpy-chk.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/stpncpy-chk.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/stpncpy-chk.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,261 @@ +/* Copyright (C) 2004, 2005, 2011 Free Software Foundation. + + Ensure builtin __stpncpy_chk performs correctly. */ + +extern void abort (void); +typedef __SIZE_TYPE__ size_t; +extern size_t strlen(const char *); +extern void *memcpy (void *, const void *, size_t); +extern char *stpncpy (char *, const char *, size_t); +extern int memcmp (const void *, const void *, size_t); +extern int strcmp (const char *, const char *); +extern int strncmp (const char *, const char *, size_t); +extern void *memset (void *, int, size_t); + +#include "chk.h" + +const char s1[] = "123"; +char p[32] = ""; +char * volatile s2 = "defg"; /* prevent constant propagation to happen when whole program assumptions are made. */ +char * volatile s3 = "FGH"; /* prevent constant propagation to happen when whole program assumptions are made. 
*/ +char *s4; +volatile size_t l1 = 1; /* prevent constant propagation to happen when whole program assumptions are made. */ +int i; + +void +__attribute__((noinline)) +test1 (void) +{ + const char *const src = "hello world"; + const char *src2; + char dst[64], *dst2; + + chk_calls = 0; + + memset (dst, 0, sizeof (dst)); + if (stpncpy (dst, src, 4) != dst+4 || strncmp (dst, src, 4)) + abort(); + + memset (dst, 0, sizeof (dst)); + if (stpncpy (dst+16, src, 4) != dst+20 || strncmp (dst+16, src, 4)) + abort(); + + memset (dst, 0, sizeof (dst)); + if (stpncpy (dst+32, src+5, 4) != dst+36 || strncmp (dst+32, src+5, 4)) + abort(); + + memset (dst, 0, sizeof (dst)); + dst2 = dst; + if (stpncpy (++dst2, src+5, 4) != dst+5 || strncmp (dst2, src+5, 4) + || dst2 != dst+1) + abort(); + + memset (dst, 0, sizeof (dst)); + if (stpncpy (dst, src, 0) != dst || strcmp (dst, "")) + abort(); + + memset (dst, 0, sizeof (dst)); + dst2 = dst; src2 = src; + if (stpncpy (++dst2, ++src2, 0) != dst+1 || strcmp (dst2, "") + || dst2 != dst+1 || src2 != src+1) + abort(); + + memset (dst, 0, sizeof (dst)); + dst2 = dst; src2 = src; + if (stpncpy (++dst2+5, ++src2+5, 0) != dst+6 || strcmp (dst2+5, "") + || dst2 != dst+1 || src2 != src+1) + abort(); + + memset (dst, 0, sizeof (dst)); + if (stpncpy (dst, src, 12) != dst+11 || strcmp (dst, src)) + abort(); + + /* Test at least one instance of the __builtin_ style. We do this + to ensure that it works and that the prototype is correct. */ + memset (dst, 0, sizeof (dst)); + if (__builtin_stpncpy (dst, src, 4) != dst+4 || strncmp (dst, src, 4)) + abort(); + + memset (dst, 0, sizeof (dst)); + if (stpncpy (dst, i++ ? "xfoo" + 1 : "bar", 4) != dst+3 + || strcmp (dst, "bar") + || i != 1) + abort (); + + /* If return value of stpncpy is ignored, it should be optimized into + stpncpy call. 
*/ + stpncpy_disallowed = 1; + stpncpy (dst + 1, src, 4); + stpncpy_disallowed = 0; + if (strncmp (dst + 1, src, 4)) + abort (); + + if (chk_calls) + abort (); +} + +void +__attribute__((noinline)) +test2 (void) +{ + chk_calls = 0; + + /* No runtime checking should be done here, both destination + and length are unknown. */ + size_t cpy_length = l1 < 4 ? l1 + 1 : 4; + if (stpncpy (s4, "abcd", l1 + 1) != s4 + cpy_length || strncmp (s4, "abcd", cpy_length)) + abort (); + + if (chk_calls) + abort (); +} + +/* Test whether compile time checking is done where it should + and so is runtime object size checking. */ +void +__attribute__((noinline)) +test3 (void) +{ + struct A { char buf1[10]; char buf2[10]; } a; + char *r = l1 == 1 ? &a.buf1[5] : &a.buf2[4]; + char buf3[20]; + int i; + const char *l; + size_t l2; + + /* The following calls should do runtime checking + - source length is not known, but destination is. + The returned value is checked so that stpncpy calls + are not rewritten to strncpy calls. */ + chk_calls = 0; + if (!stpncpy (a.buf1 + 2, s3 + 3, l1)) + abort(); + if (!stpncpy (r, s3 + 2, l1 + 2)) + abort(); + r = l1 == 1 ? __builtin_alloca (4) : &a.buf2[7]; + if (!stpncpy (r, s2 + 2, l1 + 2)) + abort(); + if (!stpncpy (r + 2, s3 + 3, l1)) + abort(); + r = buf3; + for (i = 0; i < 4; ++i) + { + if (i == l1 - 1) + r = &a.buf1[1]; + else if (i == l1) + r = &a.buf2[7]; + else if (i == l1 + 1) + r = &buf3[5]; + else if (i == l1 + 2) + r = &a.buf1[9]; + } + if (!stpncpy (r, s2 + 4, l1)) + abort(); + if (chk_calls != 5) + abort (); + + /* Following have known destination and known length, + so if optimizing certainly shouldn't result in the checking + variants. */ + chk_calls = 0; + if (!stpncpy (a.buf1 + 2, "", 3)) + abort (); + if (!stpncpy (a.buf1 + 2, "", 0)) + abort (); + if (!stpncpy (r, "a", 1)) + abort (); + if (!stpncpy (r, "a", 3)) + abort (); + r = l1 == 1 ? 
__builtin_alloca (4) : &a.buf2[7]; + if (!stpncpy (r, s1 + 1, 3)) + abort (); + if (!stpncpy (r, s1 + 1, 2)) + abort (); + r = buf3; + l = "abc"; + l2 = 4; + for (i = 0; i < 4; ++i) + { + if (i == l1 - 1) + r = &a.buf1[1], l = "e", l2 = 2; + else if (i == l1) + r = &a.buf2[7], l = "gh", l2 = 3; + else if (i == l1 + 1) + r = &buf3[5], l = "jkl", l2 = 4; + else if (i == l1 + 2) + r = &a.buf1[9], l = "", l2 = 1; + } + if (!stpncpy (r, "", 1)) + abort (); + /* Here, strlen (l) + 1 is known to be at most 4 and + __builtin_object_size (&buf3[16], 0) is 4, so this doesn't need + runtime checking. */ + if (!stpncpy (&buf3[16], l, l2)) + abort (); + if (!stpncpy (&buf3[15], "abc", l2)) + abort (); + if (!stpncpy (&buf3[10], "fghij", l2)) + abort (); + if (chk_calls) + abort (); + chk_calls = 0; +} + +/* Test whether runtime and/or compile time checking catches + buffer overflows. */ +void +__attribute__((noinline)) +test4 (void) +{ + struct A { char buf1[10]; char buf2[10]; } a; + char buf3[20]; + + chk_fail_allowed = 1; + /* Runtime checks. */ + if (__builtin_setjmp (chk_fail_buf) == 0) + { + if (stpncpy (&a.buf2[9], s2 + 4, l1 + 1)) + // returned value used to prevent stpncpy calls + // to be rewritten in strncpy calls + i++; + abort (); + } + if (__builtin_setjmp (chk_fail_buf) == 0) + { + if (stpncpy (&a.buf2[7], s3, l1 + 4)) + i++; + abort (); + } + /* This should be detectable at compile time already. */ + if (__builtin_setjmp (chk_fail_buf) == 0) + { + if (stpncpy (&buf3[19], "abc", 2)) + i++; + abort (); + } + if (__builtin_setjmp (chk_fail_buf) == 0) + { + if (stpncpy (&buf3[18], "", 3)) + i++; + abort (); + } + chk_fail_allowed = 0; +} + +void +main_test (void) +{ +#ifndef __OPTIMIZE__ + /* Object size checking is only intended for -O[s123]. 
*/ + return; +#endif + __asm ("" : "=r" (s2) : "0" (s2)); + __asm ("" : "=r" (s3) : "0" (s3)); + __asm ("" : "=r" (l1) : "0" (l1)); + test1 (); + + s4 = p; + test2 (); + test3 (); + test4 (); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/stpncpy-chk.x URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/stpncpy-chk.x?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/stpncpy-chk.x (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/stpncpy-chk.x Wed Oct 9 04:01:46 2019 @@ -0,0 +1,13 @@ +load_lib target-supports.exp + +if { ! [check_effective_target_nonlocal_goto] } { + return 1 +} + +if [istarget "epiphany-*-*"] { + # This test assumes the absence of struct padding. + # to make this true for test4 struct A on epiphany would require + # __attribute__((packed)) . 
+ return 1 +} +return 0 Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strcat-chk-lib.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strcat-chk-lib.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strcat-chk-lib.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strcat-chk-lib.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1 @@ +#include "lib/chk.c" Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strcat-chk.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strcat-chk.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strcat-chk.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strcat-chk.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,204 @@ +/* Copyright (C) 2004, 2005 Free Software Foundation. + + Ensure builtin __strcat_chk performs correctly. 
*/ + +extern void abort (void); +typedef __SIZE_TYPE__ size_t; +extern size_t strlen(const char *); +extern void *memcpy (void *, const void *, size_t); +extern char *strcat (char *, const char *); +extern int memcmp (const void *, const void *, size_t); +extern char *strcpy (char *, const char *); +extern int strcmp (const char *, const char *); +extern void *memset (void *, int, size_t); +#define RESET_DST_WITH(FILLER) \ + do { memset (dst, 'X', sizeof (dst)); strcpy (dst, (FILLER)); } while (0) + +#include "chk.h" + +const char s1[] = "123"; +char p[32] = ""; +char *s2 = "defg"; +char *s3 = "FGH"; +char *s4; +size_t l1 = 1; +char *s5; + +void +__attribute__((noinline)) +test1 (void) +{ + const char *const x1 = "hello world"; + const char *const x2 = ""; + char dst[64], *d2; + + chk_calls = 0; + strcat_disallowed = 1; + /* Following strcat calls should be optimized out at compile time. */ + RESET_DST_WITH (x1); + if (strcat (dst, "") != dst || strcmp (dst, x1)) + abort (); + RESET_DST_WITH (x1); + if (strcat (dst, x2) != dst || strcmp (dst, x1)) + abort (); + RESET_DST_WITH (x1); d2 = dst; + if (strcat (++d2, x2) != dst+1 || d2 != dst+1 || strcmp (dst, x1)) + abort (); + RESET_DST_WITH (x1); d2 = dst; + if (strcat (++d2+5, x2) != dst+6 || d2 != dst+1 || strcmp (dst, x1)) + abort (); + RESET_DST_WITH (x1); d2 = dst; + if (strcat (++d2+5, x1+11) != dst+6 || d2 != dst+1 || strcmp (dst, x1)) + abort (); + if (chk_calls) + abort (); + strcat_disallowed = 0; + + RESET_DST_WITH (x1); + if (strcat (dst, " 1111") != dst + || memcmp (dst, "hello world 1111\0XXX", 20)) + abort (); + + RESET_DST_WITH (x1); + if (strcat (dst+5, " 2222") != dst+5 + || memcmp (dst, "hello world 2222\0XXX", 20)) + abort (); + + RESET_DST_WITH (x1); d2 = dst; + if (strcat (++d2+5, " 3333") != dst+6 || d2 != dst+1 + || memcmp (dst, "hello world 3333\0XXX", 20)) + abort (); + + RESET_DST_WITH (x1); + strcat (strcat (strcat (strcat (strcat (strcat (dst, ": this "), ""), + "is "), "a "), "test"), 
"."); + if (memcmp (dst, "hello world: this is a test.\0X", 30)) + abort (); + + chk_calls = 0; + strcat_disallowed = 1; + /* Test at least one instance of the __builtin_ style. We do this + to ensure that it works and that the prototype is correct. */ + RESET_DST_WITH (x1); + if (__builtin_strcat (dst, "") != dst || strcmp (dst, x1)) + abort (); + if (chk_calls) + abort (); + strcat_disallowed = 0; +} + + +/* Test whether compile time checking is done where it should + and so is runtime object size checking. */ +void +__attribute__((noinline)) +test2 (void) +{ + struct A { char buf1[10]; char buf2[10]; } a; + char *r = l1 == 1 ? &a.buf1[5] : &a.buf2[4]; + char buf3[20]; + int i; + + /* The following calls should do runtime checking + - source length is not known, but destination is. */ + memset (&a, '\0', sizeof (a)); + s5 = (char *) &a; + __asm __volatile ("" : : "r" (s5) : "memory"); + chk_calls = 0; + strcat (a.buf1 + 2, s3 + 3); + strcat (r, s3 + 2); + r = l1 == 1 ? __builtin_alloca (4) : &a.buf2[7]; + memset (r, '\0', 3); + __asm __volatile ("" : : "r" (r) : "memory"); + strcat (r, s2 + 2); + strcat (r + 2, s3 + 3); + r = buf3; + for (i = 0; i < 4; ++i) + { + if (i == l1 - 1) + r = &a.buf1[1]; + else if (i == l1) + r = &a.buf2[7]; + else if (i == l1 + 1) + r = &buf3[5]; + else if (i == l1 + 2) + r = &a.buf1[9]; + } + strcat (r, s2 + 4); + if (chk_calls != 5) + abort (); + + /* Following have known destination and known source length, + but we don't know the length of dest string, so runtime checking + is needed too. */ + memset (&a, '\0', sizeof (a)); + chk_calls = 0; + s5 = (char *) &a; + __asm __volatile ("" : : "r" (s5) : "memory"); + strcat (a.buf1 + 2, "a"); + strcat (r, ""); + r = l1 == 1 ? __builtin_alloca (4) : &a.buf2[7]; + memset (r, '\0', 3); + __asm __volatile ("" : : "r" (r) : "memory"); + strcat (r, s1 + 1); + if (chk_calls != 2) + abort (); + chk_calls = 0; + /* Unknown destination and source, no checking. 
*/ + strcat (s4, s3); + if (chk_calls) + abort (); + chk_calls = 0; +} + +/* Test whether runtime and/or compile time checking catches + buffer overflows. */ +void +__attribute__((noinline)) +test3 (void) +{ + struct A { char buf1[10]; char buf2[10]; } a; + char buf3[20]; + + memset (&a, '\0', sizeof (a)); + memset (buf3, '\0', sizeof (buf3)); + s5 = (char *) &a; + __asm __volatile ("" : : "r" (s5) : "memory"); + s5 = buf3; + __asm __volatile ("" : : "r" (s5) : "memory"); + chk_fail_allowed = 1; + /* Runtime checks. */ + if (__builtin_setjmp (chk_fail_buf) == 0) + { + strcat (&a.buf2[9], s2 + 3); + abort (); + } + if (__builtin_setjmp (chk_fail_buf) == 0) + { + strcat (&a.buf2[7], s3 + strlen (s3) - 3); + abort (); + } + if (__builtin_setjmp (chk_fail_buf) == 0) + { + strcat (&buf3[19], "a"); + abort (); + } + chk_fail_allowed = 0; +} + +void +main_test (void) +{ +#ifndef __OPTIMIZE__ + /* Object size checking is only intended for -O[s123]. */ + return; +#endif + __asm ("" : "=r" (s2) : "0" (s2)); + __asm ("" : "=r" (s3) : "0" (s3)); + __asm ("" : "=r" (l1) : "0" (l1)); + s4 = p; + test1 (); + memset (p, '\0', sizeof (p)); + test2 (); + test3 (); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strcat-chk.x URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strcat-chk.x?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strcat-chk.x (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strcat-chk.x Wed Oct 9 04:01:46 2019 @@ -0,0 +1,13 @@ +load_lib target-supports.exp + +if { ! [check_effective_target_nonlocal_goto] } { + return 1 +} + +if [istarget "epiphany-*-*"] { + # This test assumes the absence of struct padding. 
+ # to make this true for test3 struct A on epiphany would require + # __attribute__((packed)) . + return 1 +} +return 0 Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strcat-lib.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strcat-lib.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strcat-lib.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strcat-lib.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1 @@ +#include "lib/strcat.c" Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strcat.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strcat.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strcat.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strcat.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,81 @@ +/* Copyright (C) 2000, 2003 Free Software Foundation. + + Ensure all expected transformations of builtin strcat occur and + perform correctly. + + Written by Kaveh R. Ghazi, 11/27/2000. 
*/ + +extern int inside_main; +extern void abort (void); +typedef __SIZE_TYPE__ size_t; +extern char *strcat (char *, const char *); +extern char *strcpy (char *, const char *); +extern void *memset (void *, int, size_t); +extern int memcmp (const void *, const void *, size_t); +#define RESET_DST_WITH(FILLER) \ + do { memset (dst, 'X', sizeof (dst)); strcpy (dst, (FILLER)); } while (0) + +void main_test (void) +{ + const char *const s1 = "hello world"; + const char *const s2 = ""; + char dst[64], *d2; + + RESET_DST_WITH (s1); + if (strcat (dst, "") != dst || memcmp (dst, "hello world\0XXX", 15)) + abort(); + RESET_DST_WITH (s1); + if (strcat (dst, s2) != dst || memcmp (dst, "hello world\0XXX", 15)) + abort(); + RESET_DST_WITH (s1); d2 = dst; + if (strcat (++d2, s2) != dst+1 || d2 != dst+1 + || memcmp (dst, "hello world\0XXX", 15)) + abort(); + RESET_DST_WITH (s1); d2 = dst; + if (strcat (++d2+5, s2) != dst+6 || d2 != dst+1 + || memcmp (dst, "hello world\0XXX", 15)) + abort(); + RESET_DST_WITH (s1); d2 = dst; + if (strcat (++d2+5, s1+11) != dst+6 || d2 != dst+1 + || memcmp (dst, "hello world\0XXX", 15)) + abort(); + +#ifndef __OPTIMIZE_SIZE__ +# if !defined __i386__ && !defined __x86_64__ + /* The functions below might not be optimized into direct stores on all + arches. It depends on how many instructions would be generated and + what limits the architecture chooses in STORE_BY_PIECES_P. 
*/ + inside_main = 0; +# endif + + RESET_DST_WITH (s1); + if (strcat (dst, " 1111") != dst + || memcmp (dst, "hello world 1111\0XXX", 20)) + abort(); + + RESET_DST_WITH (s1); + if (strcat (dst+5, " 2222") != dst+5 + || memcmp (dst, "hello world 2222\0XXX", 20)) + abort(); + + RESET_DST_WITH (s1); d2 = dst; + if (strcat (++d2+5, " 3333") != dst+6 || d2 != dst+1 + || memcmp (dst, "hello world 3333\0XXX", 20)) + abort(); + + RESET_DST_WITH (s1); + strcat (strcat (strcat (strcat (strcat (strcat (dst, ": this "), ""), + "is "), "a "), "test"), "."); + if (memcmp (dst, "hello world: this is a test.\0X", 30)) + abort(); + + /* Set inside_main again. */ + inside_main = 1; +#endif + + /* Test at least one instance of the __builtin_ style. We do this + to ensure that it works and that the prototype is correct. */ + RESET_DST_WITH (s1); + if (__builtin_strcat (dst, "") != dst || memcmp (dst, "hello world\0XXX", 15)) + abort(); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strchr-lib.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strchr-lib.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strchr-lib.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strchr-lib.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,8 @@ +#include "lib/strchr.c" +#ifdef __vxworks +/* The RTP C library uses bzero, bfill and bcopy, all of which are defined + in the same file as index. 
*/ +#include "lib/bzero.c" +#include "lib/bfill.c" +#include "lib/memmove.c" +#endif Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strchr.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strchr.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strchr.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strchr.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,36 @@ +/* Copyright (C) 2000, 2003 Free Software Foundation. + + Ensure all expected transformations of builtin strchr and index + occur and perform correctly. + + Written by Jakub Jelinek, 11/7/2000. */ + +extern void abort (void); +extern char *strchr (const char *, int); +extern char *index (const char *, int); + +void +main_test (void) +{ + const char *const foo = "hello world"; + + if (strchr (foo, 'x')) + abort (); + if (strchr (foo, 'o') != foo + 4) + abort (); + if (strchr (foo + 5, 'o') != foo + 7) + abort (); + if (strchr (foo, '\0') != foo + 11) + abort (); + /* Test only one instance of index since the code path is the same + as that of strchr. */ + if (index ("hello", 'z') != 0) + abort (); + + /* Test at least one instance of the __builtin_ style. We do this + to ensure that it works and that the prototype is correct. 
*/ + if (__builtin_strchr (foo, 'o') != foo + 4) + abort (); + if (__builtin_index (foo, 'o') != foo + 4) + abort (); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strcmp-lib.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strcmp-lib.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strcmp-lib.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strcmp-lib.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1 @@ +#include "lib/strcmp.c" Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strcmp.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strcmp.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strcmp.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strcmp.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,48 @@ +/* Copyright (C) 2000, 2003, 2004 Free Software Foundation. + + Ensure all expected transformations of builtin strcmp + occur and perform correctly. + + Written by Jakub Jelinek, 11/7/2000. 
*/ + +extern void abort (void); +extern int strcmp (const char *, const char *); + +int x = 7; +char *bar = "hi world"; + +void +main_test (void) +{ + const char *const foo = "hello world"; + + if (strcmp (foo, "hello") <= 0) + abort (); + if (strcmp (foo + 2, "llo") <= 0) + abort (); + if (strcmp (foo, foo) != 0) + abort (); + if (strcmp (foo, "hello world ") >= 0) + abort (); + if (strcmp (foo + 10, "dx") >= 0) + abort (); + if (strcmp (10 + foo, "dx") >= 0) + abort (); + if (strcmp (bar, "") <= 0) + abort (); + if (strcmp ("", bar) >= 0) + abort (); + if (strcmp (bar+8, "") != 0) + abort (); + if (strcmp ("", bar+8) != 0) + abort (); + if (strcmp (bar+(--x), "") <= 0 || x != 6) + abort (); + if (strcmp ("", bar+(++x)) >= 0 || x != 7) + abort (); + + /* Test at least one instance of the __builtin_ style. We do this + to ensure that it works and that the prototype is correct. */ + if (__builtin_strcmp (foo, "hello") <= 0) + abort (); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strcpy-2-lib.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strcpy-2-lib.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strcpy-2-lib.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strcpy-2-lib.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1 @@ +#include "lib/strcpy.c" Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strcpy-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strcpy-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strcpy-2.c (added) +++ 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strcpy-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,47 @@ +/* Copyright (C) 2004 Free Software Foundation. + + Ensure builtin strcpy is optimized into memcpy + even when there is more than one possible string literal + passed to it, but all string literals passed to it + have equal length. + + Written by Jakub Jelinek, 9/15/2004. */ + +extern void abort (void); +extern char *strcpy (char *, const char *); +typedef __SIZE_TYPE__ size_t; +extern void *memcpy (void *, const void *, size_t); +extern int memcmp (const void *, const void *, size_t); + +char buf[32], *p; +int i; + +char * +__attribute__((noinline)) +test (void) +{ + int j; + const char *q = "abcdefg"; + for (j = 0; j < 3; ++j) + { + if (j == i) + q = "bcdefgh"; + else if (j == i + 1) + q = "cdefghi"; + else if (j == i + 2) + q = "defghij"; + } + p = strcpy (buf, q); + return strcpy (buf + 16, q); +} + +void +main_test (void) +{ +#ifndef __OPTIMIZE_SIZE__ + /* For -Os, strcpy above is not replaced with + memcpy (buf, q, 8);, as that is larger. 
*/ + if (test () != buf + 16 || p != buf) + abort (); +#endif +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strcpy-chk-lib.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strcpy-chk-lib.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strcpy-chk-lib.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strcpy-chk-lib.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1 @@ +#include "lib/chk.c" Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strcpy-chk.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strcpy-chk.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strcpy-chk.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strcpy-chk.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,234 @@ +/* Copyright (C) 2004, 2005 Free Software Foundation. + + Ensure builtin __strcpy_chk performs correctly. 
*/ + +extern void abort (void); +typedef __SIZE_TYPE__ size_t; +extern size_t strlen(const char *); +extern void *memcpy (void *, const void *, size_t); +extern char *strcpy (char *, const char *); +extern int memcmp (const void *, const void *, size_t); + +#include "chk.h" + +LOCAL const char s1[] = "123"; +char p[32] = ""; +char *s2 = "defg"; +char *s3 = "FGH"; +char *s4; +size_t l1 = 1; + +void +__attribute__((noinline)) +test1 (void) +{ + chk_calls = 0; +#ifndef __OPTIMIZE_SIZE__ + strcpy_disallowed = 1; +#else + strcpy_disallowed = 0; +#endif + + if (strcpy (p, "abcde") != p || memcmp (p, "abcde", 6)) + abort (); + if (strcpy (p + 16, "vwxyz" + 1) != p + 16 || memcmp (p + 16, "wxyz", 5)) + abort (); + if (strcpy (p + 1, "") != p + 1 || memcmp (p, "a\0cde", 6)) + abort (); + if (strcpy (p + 3, "fghij") != p + 3 || memcmp (p, "a\0cfghij", 9)) + abort (); + + /* Test at least one instance of the __builtin_ style. We do this + to ensure that it works and that the prototype is correct. */ + if (__builtin_strcpy (p, "abcde") != p || memcmp (p, "abcde", 6)) + abort (); + + strcpy_disallowed = 0; + if (chk_calls) + abort (); +} + +#ifndef MAX_OFFSET +#define MAX_OFFSET (sizeof (long long)) +#endif + +#ifndef MAX_COPY +#define MAX_COPY (10 * sizeof (long long)) +#endif + +#ifndef MAX_EXTRA +#define MAX_EXTRA (sizeof (long long)) +#endif + +#define MAX_LENGTH (MAX_OFFSET + MAX_COPY + 1 + MAX_EXTRA) + +/* Use a sequence length that is not divisible by two, to make it more + likely to detect when words are mixed up. 
*/ +#define SEQUENCE_LENGTH 31 + +static union { + char buf[MAX_LENGTH]; + long long align_int; + long double align_fp; +} u1, u2; + +void +__attribute__((noinline)) +test2 (void) +{ + int off1, off2, len, i; + char *p, *q, c; + + for (off1 = 0; off1 < MAX_OFFSET; off1++) + for (off2 = 0; off2 < MAX_OFFSET; off2++) + for (len = 1; len < MAX_COPY; len++) + { + for (i = 0, c = 'A'; i < MAX_LENGTH; i++, c++) + { + u1.buf[i] = 'a'; + if (c >= 'A' + SEQUENCE_LENGTH) + c = 'A'; + u2.buf[i] = c; + } + u2.buf[off2 + len] = '\0'; + + p = strcpy (u1.buf + off1, u2.buf + off2); + if (p != u1.buf + off1) + abort (); + + q = u1.buf; + for (i = 0; i < off1; i++, q++) + if (*q != 'a') + abort (); + + for (i = 0, c = 'A' + off2; i < len; i++, q++, c++) + { + if (c >= 'A' + SEQUENCE_LENGTH) + c = 'A'; + if (*q != c) + abort (); + } + + if (*q++ != '\0') + abort (); + for (i = 0; i < MAX_EXTRA; i++, q++) + if (*q != 'a') + abort (); + } +} + +/* Test whether compile time checking is done where it should + and so is runtime object size checking. */ +void +__attribute__((noinline)) +test3 (void) +{ + struct A { char buf1[10]; char buf2[10]; } a; + char *r = l1 == 1 ? &a.buf1[5] : &a.buf2[4]; + char buf3[20]; + int i; + const char *l; + + /* The following calls should do runtime checking + - source length is not known, but destination is. */ + chk_calls = 0; + strcpy (a.buf1 + 2, s3 + 3); + strcpy (r, s3 + 2); + r = l1 == 1 ? __builtin_alloca (4) : &a.buf2[7]; + strcpy (r, s2 + 2); + strcpy (r + 2, s3 + 3); + r = buf3; + for (i = 0; i < 4; ++i) + { + if (i == l1 - 1) + r = &a.buf1[1]; + else if (i == l1) + r = &a.buf2[7]; + else if (i == l1 + 1) + r = &buf3[5]; + else if (i == l1 + 2) + r = &a.buf1[9]; + } + strcpy (r, s2 + 4); + if (chk_calls != 5) + abort (); + + /* Following have known destination and known source length, + so if optimizing certainly shouldn't result in the checking + variants. */ + chk_calls = 0; + strcpy (a.buf1 + 2, ""); + strcpy (r, "a"); + r = l1 == 1 ? 
__builtin_alloca (4) : &a.buf2[7]; + strcpy (r, s1 + 1); + r = buf3; + l = "abc"; + for (i = 0; i < 4; ++i) + { + if (i == l1 - 1) + r = &a.buf1[1], l = "e"; + else if (i == l1) + r = &a.buf2[7], l = "gh"; + else if (i == l1 + 1) + r = &buf3[5], l = "jkl"; + else if (i == l1 + 2) + r = &a.buf1[9], l = ""; + } + strcpy (r, ""); + /* Here, strlen (l) + 1 is known to be at most 4 and + __builtin_object_size (&buf3[16], 0) is 4, so this doesn't need + runtime checking. */ + strcpy (&buf3[16], l); + /* Unknown destination and source, no checking. */ + strcpy (s4, s3); + if (chk_calls) + abort (); + chk_calls = 0; +} + +/* Test whether runtime and/or compile time checking catches + buffer overflows. */ +void +__attribute__((noinline)) +test4 (void) +{ + struct A { char buf1[10]; char buf2[10]; } a; + char buf3[20]; + + chk_fail_allowed = 1; + /* Runtime checks. */ + if (__builtin_setjmp (chk_fail_buf) == 0) + { + strcpy (&a.buf2[9], s2 + 3); + abort (); + } + if (__builtin_setjmp (chk_fail_buf) == 0) + { + strcpy (&a.buf2[7], s3 + strlen (s3) - 3); + abort (); + } + /* This should be detectable at compile time already. */ + if (__builtin_setjmp (chk_fail_buf) == 0) + { + strcpy (&buf3[19], "a"); + abort (); + } + chk_fail_allowed = 0; +} + +void +main_test (void) +{ +#ifndef __OPTIMIZE__ + /* Object size checking is only intended for -O[s123]. 
*/ + return; +#endif + __asm ("" : "=r" (s2) : "0" (s2)); + __asm ("" : "=r" (s3) : "0" (s3)); + __asm ("" : "=r" (l1) : "0" (l1)); + test1 (); + test2 (); + s4 = p; + test3 (); + test4 (); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strcpy-chk.x URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strcpy-chk.x?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strcpy-chk.x (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strcpy-chk.x Wed Oct 9 04:01:46 2019 @@ -0,0 +1,13 @@ +load_lib target-supports.exp + +if { ! [check_effective_target_nonlocal_goto] } { + return 1 +} + +if [istarget "epiphany-*-*"] { + # This test assumes the absence of struct padding. + # to make this true for test4 struct A on epiphany would require + # __attribute__((packed)) . 
+ return 1 +} +return 0 Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strcpy-lib.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strcpy-lib.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strcpy-lib.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strcpy-lib.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1 @@ +#include "lib/strcpy.c" Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strcpy.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strcpy.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strcpy.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strcpy.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,41 @@ +/* Copyright (C) 2000 Free Software Foundation. + + Ensure builtin memcpy and strcpy perform correctly. + + Written by Jakub Jelinek, 11/24/2000. 
*/ + +extern void abort (void); +extern char *strcpy (char *, const char *); +typedef __SIZE_TYPE__ size_t; +extern void *memcpy (void *, const void *, size_t); +extern int memcmp (const void *, const void *, size_t); + +char p[32] = ""; + +void +main_test (void) +{ + if (strcpy (p, "abcde") != p || memcmp (p, "abcde", 6)) + abort (); + if (strcpy (p + 16, "vwxyz" + 1) != p + 16 || memcmp (p + 16, "wxyz", 5)) + abort (); + if (strcpy (p + 1, "") != p + 1 || memcmp (p, "a\0cde", 6)) + abort (); + if (strcpy (p + 3, "fghij") != p + 3 || memcmp (p, "a\0cfghij", 9)) + abort (); + if (memcpy (p, "ABCDE", 6) != p || memcmp (p, "ABCDE", 6)) + abort (); + if (memcpy (p + 16, "VWX" + 1, 2) != p + 16 || memcmp (p + 16, "WXyz", 5)) + abort (); + if (memcpy (p + 1, "", 1) != p + 1 || memcmp (p, "A\0CDE", 6)) + abort (); + if (memcpy (p + 3, "FGHI", 4) != p + 3 || memcmp (p, "A\0CFGHIj", 9)) + abort (); + + /* Test at least one instance of the __builtin_ style. We do this + to ensure that it works and that the prototype is correct. 
*/ + if (__builtin_strcpy (p, "abcde") != p || memcmp (p, "abcde", 6)) + abort (); + if (__builtin_memcpy (p, "ABCDE", 6) != p || memcmp (p, "ABCDE", 6)) + abort (); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strcspn-lib.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strcspn-lib.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strcspn-lib.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strcspn-lib.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1 @@ +#include "lib/strcspn.c" Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strcspn.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strcspn.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strcspn.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strcspn.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,54 @@ +/* Copyright (C) 2000, 2004 Free Software Foundation. + + Ensure all expected transformations of builtin strcspn occur and + perform correctly. + + Written by Kaveh R. Ghazi, 11/27/2000. 
*/ + +extern void abort (void); +typedef __SIZE_TYPE__ size_t; +extern size_t strcspn (const char *, const char *); +extern char *strcpy (char *, const char *); + +void +main_test (void) +{ + const char *const s1 = "hello world"; + char dst[64], *d2; + + if (strcspn (s1, "hello") != 0) + abort(); + if (strcspn (s1, "z") != 11) + abort(); + if (strcspn (s1+4, "z") != 7) + abort(); + if (strcspn (s1, "hello world") != 0) + abort(); + if (strcspn (s1, "") != 11) + abort(); + strcpy (dst, s1); + if (strcspn (dst, "") != 11) + abort(); + strcpy (dst, s1); d2 = dst; + if (strcspn (++d2, "") != 10 || d2 != dst+1) + abort(); + strcpy (dst, s1); d2 = dst; + if (strcspn (++d2+5, "") != 5 || d2 != dst+1) + abort(); + if (strcspn ("", s1) != 0) + abort(); + strcpy (dst, s1); + if (strcspn ("", dst) != 0) + abort(); + strcpy (dst, s1); d2 = dst; + if (strcspn ("", ++d2) != 0 || d2 != dst+1) + abort(); + strcpy (dst, s1); d2 = dst; + if (strcspn ("", ++d2+5) != 0 || d2 != dst+1) + abort(); + + /* Test at least one instance of the __builtin_ style. We do this + to ensure that it works and that the prototype is correct. 
*/ + if (__builtin_strcspn (s1, "z") != 11) + abort(); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strlen-2-lib.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strlen-2-lib.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strlen-2-lib.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strlen-2-lib.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1 @@ +#include "lib/strlen.c" Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strlen-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strlen-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strlen-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strlen-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,41 @@ +/* Copyright (C) 2003 Free Software Foundation. + + Test strlen optimizations on conditional expressions. + + Written by Jakub Jelinek, June 23, 2003. */ + +typedef __SIZE_TYPE__ size_t; +extern size_t strlen (const char *); +extern char *strcpy (char *, const char *); +extern int memcmp (const void *, const void *, size_t); +extern void abort (void); +extern int inside_main; + +size_t g, h, i, j, k, l; + +size_t +foo (void) +{ + if (l) + abort (); + return ++l; +} + +void +main_test (void) +{ + if (strlen (i ? "foo" + 1 : j ? "bar" + 1 : "baz" + 1) != 2) + abort (); + if (strlen (g++ ? "foo" : "bar") != 3 || g != 1) + abort (); + if (strlen (h++ ? 
"xfoo" + 1 : "bar") != 3 || h != 1) + abort (); + if (strlen ((i++, "baz")) != 3 || i != 1) + abort (); + /* The following calls might not optimize strlen call away. */ + inside_main = 0; + if (strlen (j ? "foo" + k++ : "bar" + k++) != 3 || k != 1) + abort (); + if (strlen (foo () ? "foo" : "bar") != 3 || l != 1) + abort (); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strlen-3-lib.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strlen-3-lib.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strlen-3-lib.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strlen-3-lib.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1 @@ +#include "lib/strlen.c" Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strlen-3.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strlen-3.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strlen-3.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strlen-3.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,76 @@ +/* Copyright (C) 2004 Free Software Foundation. + + Test strlen on const variables initialized to string literals. + + Written by Jakub Jelinek, 9/14/2004. 
*/ + +extern void abort (void); +extern __SIZE_TYPE__ strlen (const char *); +extern char *strcpy (char *, const char *); +static const char bar[] = "Hello, World!"; +static const char baz[] = "hello, world?"; +static const char larger[20] = "short string"; +extern int inside_main; + +int l1 = 1; +int x = 6; + +void +main_test(void) +{ + inside_main = 1; + +#ifdef __OPTIMIZE__ + const char *foo; + int i; +#endif + + if (strlen (bar) != 13) + abort (); + + if (strlen (bar + 3) != 10) + abort (); + + if (strlen (&bar[6]) != 7) + abort (); + + if (strlen (bar + (x++ & 7)) != 7) + abort (); + if (x != 7) + abort (); + +#ifdef __OPTIMIZE__ + foo = bar; + for (i = 0; i < 4; ++i) + { + if (i == l1 - 1) + foo = "HELLO, WORLD!"; + else if (i == l1) + foo = bar; + else if (i == l1 + 1) + foo = "hello, world!"; + else + foo = baz; + } + if (strlen (foo) != 13) + abort (); +#endif + + if (strlen (larger) != 12) + abort (); + if (strlen (&larger[10]) != 2) + abort (); + + inside_main = 0; + /* The following call may or may not be folded depending on + the optimization level, and when it isn't folded (such + as may be the case with -Og) it may or may not result in + a library call, depending on whether or not it's expanded + inline (e.g., powerpc64 makes a call while x86_64 expands + it inline). 
*/ + if (strlen (larger + (x++ & 7)) != 5) + abort (); + if (x != 8) + abort (); + inside_main = 1; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strlen-lib.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strlen-lib.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strlen-lib.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strlen-lib.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1 @@ +#include "lib/strlen.c" Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strlen.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strlen.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strlen.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strlen.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,71 @@ +/* Copyright (C) 2000, 2001, 2003, 2004 Free Software Foundation. + + Ensure all expected transformations of builtin strlen + occur and perform correctly. + + Written by Jakub Jelinek, 11/7/2000. + + Additional tests written by Roger Sayle, 11/02/2001: + Ensure all builtin strlen comparisons against zero are optimized + and perform correctly. The multiple calls to strcpy are to prevent + the potentially "pure" strlen calls from being removed by CSE. + + Modified by Ben Elliston, 2006-10-25: + The multiple calls to strcpy that Roger mentions above are + problematic on systems where strcpy is implemented using strlen + (which this test overrides to call abort). So, rather than use + strcpy, we perform the identical operations using array indexing + and char assignments. 
*/ + +extern void abort (void); +extern __SIZE_TYPE__ strlen (const char *); +extern char *strcpy (char *, const char *); + +int x = 6; + +void +main_test(void) +{ + const char *const foo = "hello world"; + char str[8]; + char *ptr; + + if (strlen (foo) != 11) + abort (); + if (strlen (foo + 4) != 7) + abort (); + if (strlen (foo + (x++ & 7)) != 5) + abort (); + if (x != 7) + abort (); + + ptr = str; + ptr[0] = 'n'; ptr[1] = 't'; ptr[2] = 's'; ptr[3] = '\0'; + if (strlen (ptr) == 0) + abort (); + + ptr[0] = 'n'; ptr[1] = 't'; ptr[2] = 's'; ptr[3] = '\0'; + if (strlen (ptr) < 1) + abort (); + + ptr[0] = 'n'; ptr[1] = 't'; ptr[2] = 's'; ptr[3] = '\0'; + if (strlen (ptr) <= 0) + abort (); + + ptr[0] = 'n'; ptr[1] = 't'; ptr[2] = 's'; ptr[3] = '\0'; + if (strlen (ptr+3) != 0) + abort (); + + ptr[0] = 'n'; ptr[1] = 't'; ptr[2] = 's'; ptr[3] = '\0'; + if (strlen (ptr+3) > 0) + abort (); + + ptr[0] = 'n'; ptr[1] = 't'; ptr[2] = 's'; ptr[3] = '\0'; + if (strlen (str+3) >= 1) + abort (); + + /* Test at least one instance of the __builtin_ style. We do this + to ensure that it works and that the prototype is correct. 
*/ + if (__builtin_strlen (foo) != 11) + abort (); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strncat-chk-lib.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strncat-chk-lib.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strncat-chk-lib.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strncat-chk-lib.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1 @@ +#include "lib/chk.c" Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strncat-chk.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strncat-chk.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strncat-chk.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strncat-chk.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,229 @@ +/* Copyright (C) 2004, 2005 Free Software Foundation. + + Ensure builtin __strncat_chk performs correctly. 
*/ + +extern void abort (void); +typedef __SIZE_TYPE__ size_t; +extern size_t strlen (const char *); +extern void *memcpy (void *, const void *, size_t); +extern char *strcat (char *, const char *); +extern char *strncat (char *, const char *, size_t); +extern int memcmp (const void *, const void *, size_t); +extern char *strcpy (char *, const char *); +extern int strcmp (const char *, const char *); +extern void *memset (void *, int, size_t); + +#include "chk.h" + +const char s1[] = "123"; +char p[32] = ""; +char *s2 = "defg"; +char *s3 = "FGH"; +char *s4; +size_t l1 = 1; +char *s5; +int x = 123; + +void +__attribute__((noinline)) +test1 (void) +{ + const char *const s1 = "hello world"; + const char *const s2 = ""; + const char *s3; + char dst[64], *d2; + + /* Following strncat calls should be all optimized out. */ + chk_calls = 0; + strncat_disallowed = 1; + strcat_disallowed = 1; + strcpy (dst, s1); + if (strncat (dst, "", 100) != dst || strcmp (dst, s1)) + abort (); + strcpy (dst, s1); + if (strncat (dst, s2, 100) != dst || strcmp (dst, s1)) + abort (); + strcpy (dst, s1); d2 = dst; + if (strncat (++d2, s2, 100) != dst+1 || d2 != dst+1 || strcmp (dst, s1)) + abort (); + strcpy (dst, s1); d2 = dst; + if (strncat (++d2+5, s2, 100) != dst+6 || d2 != dst+1 || strcmp (dst, s1)) + abort (); + strcpy (dst, s1); d2 = dst; + if (strncat (++d2+5, s1+11, 100) != dst+6 || d2 != dst+1 || strcmp (dst, s1)) + abort (); + strcpy (dst, s1); d2 = dst; + if (strncat (++d2+5, s1, 0) != dst+6 || d2 != dst+1 || strcmp (dst, s1)) + abort (); + strcpy (dst, s1); d2 = dst; s3 = s1; + if (strncat (++d2+5, ++s3, 0) != dst+6 || d2 != dst+1 || strcmp (dst, s1) + || s3 != s1 + 1) + abort (); + strcpy (dst, s1); d2 = dst; + if (strncat (++d2+5, "", ++x) != dst+6 || d2 != dst+1 || x != 124 + || strcmp (dst, s1)) + abort (); + if (chk_calls) + abort (); + strcat_disallowed = 0; + + /* These __strncat_chk calls should be optimized into __strcat_chk, + as strlen (src) <= len. 
*/ + strcpy (dst, s1); + if (strncat (dst, "foo", 3) != dst || strcmp (dst, "hello worldfoo")) + abort (); + strcpy (dst, s1); + if (strncat (dst, "foo", 100) != dst || strcmp (dst, "hello worldfoo")) + abort (); + strcpy (dst, s1); + if (strncat (dst, s1, 100) != dst || strcmp (dst, "hello worldhello world")) + abort (); + if (chk_calls != 3) + abort (); + + chk_calls = 0; + /* The following calls have side-effects in dest, so are not checked. */ + strcpy (dst, s1); d2 = dst; + if (__builtin___strncat_chk (++d2, s1, 100, os (++d2)) != dst+1 + || d2 != dst+1 || strcmp (dst, "hello worldhello world")) + abort (); + strcpy (dst, s1); d2 = dst; + if (__builtin___strncat_chk (++d2+5, s1, 100, os (++d2+5)) != dst+6 + || d2 != dst+1 || strcmp (dst, "hello worldhello world")) + abort (); + strcpy (dst, s1); d2 = dst; + if (__builtin___strncat_chk (++d2+5, s1+5, 100, os (++d2+5)) != dst+6 + || d2 != dst+1 || strcmp (dst, "hello world world")) + abort (); + if (chk_calls) + abort (); + + chk_calls = 0; + strcat_disallowed = 1; + + /* Test at least one instance of the __builtin_ style. We do this + to ensure that it works and that the prototype is correct. */ + strcpy (dst, s1); + if (__builtin_strncat (dst, "", 100) != dst || strcmp (dst, s1)) + abort (); + + if (chk_calls) + abort (); + strncat_disallowed = 0; + strcat_disallowed = 0; +} + +/* Test whether compile time checking is done where it should + and so is runtime object size checking. */ +void +__attribute__((noinline)) +test2 (void) +{ + struct A { char buf1[10]; char buf2[10]; } a; + char *r = l1 == 1 ? &a.buf1[5] : &a.buf2[4]; + char buf3[20]; + int i; + + /* The following calls should do runtime checking. */ + memset (&a, '\0', sizeof (a)); + s5 = (char *) &a; + __asm __volatile ("" : : "r" (s5) : "memory"); + chk_calls = 0; + strncat (a.buf1 + 2, s3 + 3, l1 - 1); + strncat (r, s3 + 2, l1); + r = l1 == 1 ? 
__builtin_alloca (4) : &a.buf2[7]; + memset (r, '\0', 3); + __asm __volatile ("" : : "r" (r) : "memory"); + strncat (r, s2 + 2, l1 + 1); + strncat (r + 2, s3 + 3, l1 - 1); + r = buf3; + for (i = 0; i < 4; ++i) + { + if (i == l1 - 1) + r = &a.buf1[1]; + else if (i == l1) + r = &a.buf2[7]; + else if (i == l1 + 1) + r = &buf3[5]; + else if (i == l1 + 2) + r = &a.buf1[9]; + } + strncat (r, s2 + 4, l1); + if (chk_calls != 5) + abort (); + + /* Following have known destination and known source length, + but we don't know the length of dest string, so runtime checking + is needed too. */ + memset (&a, '\0', sizeof (a)); + chk_calls = 0; + s5 = (char *) &a; + __asm __volatile ("" : : "r" (s5) : "memory"); + strncat (a.buf1 + 2, "a", 5); + strncat (r, "def", 0); + r = l1 == 1 ? __builtin_alloca (4) : &a.buf2[7]; + memset (r, '\0', 3); + __asm __volatile ("" : : "r" (r) : "memory"); + strncat (r, s1 + 1, 2); + if (chk_calls != 2) + abort (); + chk_calls = 0; + strcat_disallowed = 1; + /* Unknown destination and source, no checking. */ + strncat (s4, s3, l1 + 1); + strcat_disallowed = 0; + if (chk_calls) + abort (); +} + +/* Test whether runtime and/or compile time checking catches + buffer overflows. */ +void +__attribute__((noinline)) +test3 (void) +{ + struct A { char buf1[10]; char buf2[10]; } a; + char buf3[20]; + + memset (&a, '\0', sizeof (a)); + memset (buf3, '\0', sizeof (buf3)); + s5 = (char *) &a; + __asm __volatile ("" : : "r" (s5) : "memory"); + s5 = buf3; + __asm __volatile ("" : : "r" (s5) : "memory"); + chk_fail_allowed = 1; + /* Runtime checks. 
*/ + if (__builtin_setjmp (chk_fail_buf) == 0) + { + strncat (&a.buf2[9], s2 + 3, 4); + abort (); + } + if (__builtin_setjmp (chk_fail_buf) == 0) + { + strncat (&a.buf2[7], s3 + strlen (s3) - 3, 3); + abort (); + } + if (__builtin_setjmp (chk_fail_buf) == 0) + { + strncat (&buf3[19], "abcde", 1); + abort (); + } + chk_fail_allowed = 0; +} + +void +main_test (void) +{ +#ifndef __OPTIMIZE__ + /* Object size checking is only intended for -O[s123]. */ + return; +#endif + __asm ("" : "=r" (s2) : "0" (s2)); + __asm ("" : "=r" (s3) : "0" (s3)); + __asm ("" : "=r" (l1) : "0" (l1)); + s4 = p; + test1 (); + memset (p, '\0', sizeof (p)); + test2 (); + test3 (); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strncat-chk.x URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strncat-chk.x?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strncat-chk.x (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strncat-chk.x Wed Oct 9 04:01:46 2019 @@ -0,0 +1,13 @@ +load_lib target-supports.exp + +if { ! [check_effective_target_nonlocal_goto] } { + return 1 +} + +if [istarget "epiphany-*-*"] { + # This test assumes the absence of struct padding. + # to make this true for test3 struct A on epiphany would require + # __attribute__((packed)) . 
+ return 1 +} +return 0 Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strncat-lib.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strncat-lib.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strncat-lib.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strncat-lib.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1 @@ +#include "lib/strncat.c" Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strncat.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strncat.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strncat.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strncat.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,82 @@ +/* Copyright (C) 2000, 2003 Free Software Foundation. + + Ensure all expected transformations of builtin strncat occur and + perform correctly. + + Written by Kaveh R. Ghazi, 11/27/2000. */ + +extern void abort (void); +typedef __SIZE_TYPE__ size_t; +extern char *strncat (char *, const char *, size_t); +extern char *strcpy (char *, const char *); +extern void *memset (void *, int, size_t); +extern int memcmp (const void *, const void *, size_t); +int x = 123; + +/* Reset the destination buffer to a known state. 
*/ +#define RESET_DST_WITH(FILLER) \ + do { memset (dst, 'X', sizeof (dst)); strcpy (dst, (FILLER)); } while (0) + +void +main_test (void) +{ + const char *const s1 = "hello world"; + const char *const s2 = ""; + char dst[64], *d2; + + RESET_DST_WITH (s1); + if (strncat (dst, "", 100) != dst || memcmp (dst, "hello world\0XXX", 15)) + abort(); + RESET_DST_WITH (s1); + if (strncat (dst, s2, 100) != dst || memcmp (dst, "hello world\0XXX", 15)) + abort(); + RESET_DST_WITH (s1); d2 = dst; + if (strncat (++d2, s2, 100) != dst+1 || d2 != dst+1 + || memcmp (dst, "hello world\0XXX", 15)) + abort(); + RESET_DST_WITH (s1); d2 = dst; + if (strncat (++d2+5, s2, 100) != dst+6 || d2 != dst+1 + || memcmp (dst, "hello world\0XXX", 15)) + abort(); + RESET_DST_WITH (s1); d2 = dst; + if (strncat (++d2+5, s1+11, 100) != dst+6 || d2 != dst+1 + || memcmp (dst, "hello world\0XXX", 15)) + abort(); + RESET_DST_WITH (s1); d2 = dst; + if (strncat (++d2+5, s1, 0) != dst+6 || d2 != dst+1 + || memcmp (dst, "hello world\0XXX", 15)) + abort(); + RESET_DST_WITH (s1); d2 = dst; + if (strncat (++d2+5, "", ++x) != dst+6 || d2 != dst+1 || x != 124 + || memcmp (dst, "hello world\0XXX", 15)) + abort(); + + RESET_DST_WITH (s1); + if (strncat (dst, "foo", 3) != dst || memcmp (dst, "hello worldfoo\0XXX", 18)) + abort(); + RESET_DST_WITH (s1); + if (strncat (dst, "foo", 100) != dst || memcmp (dst, "hello worldfoo\0XXX", 18)) + abort(); + RESET_DST_WITH (s1); + if (strncat (dst, s1, 100) != dst || memcmp (dst, "hello worldhello world\0XXX", 26)) + abort(); + RESET_DST_WITH (s1); d2 = dst; + if (strncat (++d2, s1, 100) != dst+1 || d2 != dst+1 + || memcmp (dst, "hello worldhello world\0XXX", 26)) + abort(); + RESET_DST_WITH (s1); d2 = dst; + if (strncat (++d2+5, s1, 100) != dst+6 || d2 != dst+1 + || memcmp (dst, "hello worldhello world\0XXX", 26)) + abort(); + RESET_DST_WITH (s1); d2 = dst; + if (strncat (++d2+5, s1+5, 100) != dst+6 || d2 != dst+1 + || memcmp (dst, "hello world world\0XXX", 21)) + abort(); + + 
/* Test at least one instance of the __builtin_ style. We do this + to ensure that it works and that the prototype is correct. */ + RESET_DST_WITH (s1); + if (__builtin_strncat (dst, "", 100) != dst + || memcmp (dst, "hello world\0XXX", 15)) + abort(); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strncmp-2-lib.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strncmp-2-lib.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strncmp-2-lib.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strncmp-2-lib.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1 @@ +#include "lib/strncmp.c" Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strncmp-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strncmp-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strncmp-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strncmp-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,173 @@ +/* Copyright (C) 2000, 2001, 2003, 2005 Free Software Foundation. + + Ensure all expected transformations of builtin strncmp occur and + perform correctly. + + Written by Kaveh R. Ghazi, 11/26/2000. */ + +extern void abort (void); +typedef __SIZE_TYPE__ size_t; +extern int strncmp (const char *, const char *, size_t); + +void +main_test (void) +{ +#if !defined(__OPTIMIZE__) || ((defined(__sh__) || defined(__i386__) || defined (__x86_64__)) && !defined(__OPTIMIZE_SIZE__)) + /* These tests work on platforms which support cmpstrsi. 
We test it + at -O0 on all platforms to ensure the strncmp logic is correct. */ + const char *const s1 = "hello world"; + const char *s2; + int n = 6, x; + + s2 = s1; + if (strncmp (++s2, "ello", 3) != 0 || s2 != s1+1) + abort(); + s2 = s1; + if (strncmp ("ello", ++s2, 3) != 0 || s2 != s1+1) + abort(); + s2 = s1; + if (strncmp (++s2, "ello", 4) != 0 || s2 != s1+1) + abort(); + s2 = s1; + if (strncmp ("ello", ++s2, 4) != 0 || s2 != s1+1) + abort(); + s2 = s1; + if (strncmp (++s2, "ello", 5) <= 0 || s2 != s1+1) + abort(); + s2 = s1; + if (strncmp ("ello", ++s2, 5) >= 0 || s2 != s1+1) + abort(); + s2 = s1; + if (strncmp (++s2, "ello", 6) <= 0 || s2 != s1+1) + abort(); + s2 = s1; + if (strncmp ("ello", ++s2, 6) >= 0 || s2 != s1+1) + abort(); + + s2 = s1; + if (strncmp (++s2, "zllo", 3) >= 0 || s2 != s1+1) + abort(); + s2 = s1; + if (strncmp ("zllo", ++s2, 3) <= 0 || s2 != s1+1) + abort(); + s2 = s1; + if (strncmp (++s2, "zllo", 4) >= 0 || s2 != s1+1) + abort(); + s2 = s1; + if (strncmp ("zllo", ++s2, 4) <= 0 || s2 != s1+1) + abort(); + s2 = s1; + if (strncmp (++s2, "zllo", 5) >= 0 || s2 != s1+1) + abort(); + s2 = s1; + if (strncmp ("zllo", ++s2, 5) <= 0 || s2 != s1+1) + abort(); + s2 = s1; + if (strncmp (++s2, "zllo", 6) >= 0 || s2 != s1+1) + abort(); + s2 = s1; + if (strncmp ("zllo", ++s2, 6) <= 0 || s2 != s1+1) + abort(); + + s2 = s1; + if (strncmp (++s2, "allo", 3) <= 0 || s2 != s1+1) + abort(); + s2 = s1; + if (strncmp ("allo", ++s2, 3) >= 0 || s2 != s1+1) + abort(); + s2 = s1; + if (strncmp (++s2, "allo", 4) <= 0 || s2 != s1+1) + abort(); + s2 = s1; + if (strncmp ("allo", ++s2, 4) >= 0 || s2 != s1+1) + abort(); + s2 = s1; + if (strncmp (++s2, "allo", 5) <= 0 || s2 != s1+1) + abort(); + s2 = s1; + if (strncmp ("allo", ++s2, 5) >= 0 || s2 != s1+1) + abort(); + s2 = s1; + if (strncmp (++s2, "allo", 6) <= 0 || s2 != s1+1) + abort(); + s2 = s1; + if (strncmp ("allo", ++s2, 6) >= 0 || s2 != s1+1) + abort(); + + s2 = s1; n = 2; x = 1; + if (strncmp (++s2, s1+(x&3), ++n) 
!= 0 || s2 != s1+1 || n != 3) + abort(); + s2 = s1; n = 2; x = 1; + if (strncmp (s1+(x&3), ++s2, ++n) != 0 || s2 != s1+1 || n != 3) + abort(); + s2 = s1; n = 3; x = 1; + if (strncmp (++s2, s1+(x&3), ++n) != 0 || s2 != s1+1 || n != 4) + abort(); + s2 = s1; n = 3; x = 1; + if (strncmp (s1+(x&3), ++s2, ++n) != 0 || s2 != s1+1 || n != 4) + abort(); + s2 = s1; n = 4; x = 1; + if (strncmp (++s2, s1+(x&3), ++n) != 0 || s2 != s1+1 || n != 5) + abort(); + s2 = s1; n = 4; x = 1; + if (strncmp (s1+(x&3), ++s2, ++n) != 0 || s2 != s1+1 || n != 5) + abort(); + s2 = s1; n = 5; x = 1; + if (strncmp (++s2, s1+(x&3), ++n) != 0 || s2 != s1+1 || n != 6) + abort(); + s2 = s1; n = 5; x = 1; + if (strncmp (s1+(x&3), ++s2, ++n) != 0 || s2 != s1+1 || n != 6) + abort(); + + s2 = s1; n = 2; + if (strncmp (++s2, "zllo", ++n) >= 0 || s2 != s1+1 || n != 3) + abort(); + s2 = s1; n = 2; x = 1; + if (strncmp ("zllo", ++s2, ++n) <= 0 || s2 != s1+1 || n != 3) + abort(); + s2 = s1; n = 3; x = 1; + if (strncmp (++s2, "zllo", ++n) >= 0 || s2 != s1+1 || n != 4) + abort(); + s2 = s1; n = 3; x = 1; + if (strncmp ("zllo", ++s2, ++n) <= 0 || s2 != s1+1 || n != 4) + abort(); + s2 = s1; n = 4; x = 1; + if (strncmp (++s2, "zllo", ++n) >= 0 || s2 != s1+1 || n != 5) + abort(); + s2 = s1; n = 4; x = 1; + if (strncmp ("zllo", ++s2, ++n) <= 0 || s2 != s1+1 || n != 5) + abort(); + s2 = s1; n = 5; x = 1; + if (strncmp (++s2, "zllo", ++n) >= 0 || s2 != s1+1 || n != 6) + abort(); + s2 = s1; n = 5; x = 1; + if (strncmp ("zllo", ++s2, ++n) <= 0 || s2 != s1+1 || n != 6) + abort(); + + s2 = s1; n = 2; + if (strncmp (++s2, "allo", ++n) <= 0 || s2 != s1+1 || n != 3) + abort(); + s2 = s1; n = 2; x = 1; + if (strncmp ("allo", ++s2, ++n) >= 0 || s2 != s1+1 || n != 3) + abort(); + s2 = s1; n = 3; x = 1; + if (strncmp (++s2, "allo", ++n) <= 0 || s2 != s1+1 || n != 4) + abort(); + s2 = s1; n = 3; x = 1; + if (strncmp ("allo", ++s2, ++n) >= 0 || s2 != s1+1 || n != 4) + abort(); + s2 = s1; n = 4; x = 1; + if (strncmp (++s2, "allo", 
++n) <= 0 || s2 != s1+1 || n != 5) + abort(); + s2 = s1; n = 4; x = 1; + if (strncmp ("allo", ++s2, ++n) >= 0 || s2 != s1+1 || n != 5) + abort(); + s2 = s1; n = 5; x = 1; + if (strncmp (++s2, "allo", ++n) <= 0 || s2 != s1+1 || n != 6) + abort(); + s2 = s1; n = 5; x = 1; + if (strncmp ("allo", ++s2, ++n) >= 0 || s2 != s1+1 || n != 6) + abort(); + +#endif +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strncmp-lib.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strncmp-lib.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strncmp-lib.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strncmp-lib.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1 @@ +#include "lib/strncmp.c" Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strncmp.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strncmp.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strncmp.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strncmp.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,73 @@ +/* Copyright (C) 2000, 2001, 2003 Free Software Foundation. + + Ensure all expected transformations of builtin strncmp occur and + perform correctly. + + Written by Kaveh R. Ghazi, 11/26/2000. 
*/ + +extern void abort (void); +typedef __SIZE_TYPE__ size_t; +extern int strncmp (const char *, const char *, size_t); + +void +main_test (void) +{ + const char *const s1 = "hello world"; + const char *s2, *s3; + + if (strncmp (s1, "hello world", 12) != 0) + abort(); + if (strncmp ("hello world", s1, 12) != 0) + abort(); + if (strncmp ("hello", "hello", 6) != 0) + abort(); + if (strncmp ("hello", "hello", 2) != 0) + abort(); + if (strncmp ("hello", "hello", 100) != 0) + abort(); + if (strncmp (s1+10, "d", 100) != 0) + abort(); + if (strncmp (10+s1, "d", 100) != 0) + abort(); + if (strncmp ("d", s1+10, 1) != 0) + abort(); + if (strncmp ("d", 10+s1, 1) != 0) + abort(); + if (strncmp ("hello", "aaaaa", 100) <= 0) + abort(); + if (strncmp ("aaaaa", "hello", 100) >= 0) + abort(); + if (strncmp ("hello", "aaaaa", 1) <= 0) + abort(); + if (strncmp ("aaaaa", "hello", 1) >= 0) + abort(); + + s2 = s1; s3 = s1+4; + if (strncmp (++s2, ++s3, 0) != 0 || s2 != s1+1 || s3 != s1+5) + abort(); + s2 = s1; + if (strncmp (++s2, "", 1) <= 0 || s2 != s1+1) + abort(); + if (strncmp ("", ++s2, 1) >= 0 || s2 != s1+2) + abort(); + if (strncmp (++s2, "", 100) <= 0 || s2 != s1+3) + abort(); + if (strncmp ("", ++s2, 100) >= 0 || s2 != s1+4) + abort(); + if (strncmp (++s2+6, "", 100) != 0 || s2 != s1+5) + abort(); + if (strncmp ("", ++s2+5, 100) != 0 || s2 != s1+6) + abort(); + if (strncmp ("ozz", ++s2, 1) != 0 || s2 != s1+7) + abort(); + if (strncmp (++s2, "rzz", 1) != 0 || s2 != s1+8) + abort(); + s2 = s1; s3 = s1+4; + if (strncmp (++s2, ++s3+2, 1) >= 0 || s2 != s1+1 || s3 != s1+5) + abort(); + + /* Test at least one instance of the __builtin_ style. We do this + to ensure that it works and that the prototype is correct. 
*/ + if (__builtin_strncmp ("hello", "a", 100) <= 0) + abort(); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strncpy-chk-lib.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strncpy-chk-lib.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strncpy-chk-lib.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strncpy-chk-lib.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1 @@ +#include "lib/chk.c" Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strncpy-chk.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strncpy-chk.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strncpy-chk.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strncpy-chk.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,227 @@ +/* Copyright (C) 2004, 2005 Free Software Foundation. + + Ensure builtin __strncpy_chk performs correctly. */ + +extern void abort (void); +typedef __SIZE_TYPE__ size_t; +extern size_t strlen(const char *); +extern void *memcpy (void *, const void *, size_t); +extern char *strncpy (char *, const char *, size_t); +extern int memcmp (const void *, const void *, size_t); +extern int strcmp (const char *, const char *); +extern int strncmp (const char *, const char *, size_t); +extern void *memset (void *, int, size_t); + +#include "chk.h" + +const char s1[] = "123"; +char p[32] = ""; +char * volatile s2 = "defg"; /* prevent constant propagation to happen when whole program assumptions are made. 
*/ +char * volatile s3 = "FGH"; /* prevent constant propagation to happen when whole program assumptions are made. */ +char *s4; +volatile size_t l1 = 1; /* prevent constant propagation to happen when whole program assumptions are made. */ +int i; + +void +__attribute__((noinline)) +test1 (void) +{ + const char *const src = "hello world"; + const char *src2; + char dst[64], *dst2; + + strncpy_disallowed = 1; + chk_calls = 0; + + memset (dst, 0, sizeof (dst)); + if (strncpy (dst, src, 4) != dst || strncmp (dst, src, 4)) + abort(); + + memset (dst, 0, sizeof (dst)); + if (strncpy (dst+16, src, 4) != dst+16 || strncmp (dst+16, src, 4)) + abort(); + + memset (dst, 0, sizeof (dst)); + if (strncpy (dst+32, src+5, 4) != dst+32 || strncmp (dst+32, src+5, 4)) + abort(); + + memset (dst, 0, sizeof (dst)); + dst2 = dst; + if (strncpy (++dst2, src+5, 4) != dst+1 || strncmp (dst2, src+5, 4) + || dst2 != dst+1) + abort(); + + memset (dst, 0, sizeof (dst)); + if (strncpy (dst, src, 0) != dst || strcmp (dst, "")) + abort(); + + memset (dst, 0, sizeof (dst)); + dst2 = dst; src2 = src; + if (strncpy (++dst2, ++src2, 0) != dst+1 || strcmp (dst2, "") + || dst2 != dst+1 || src2 != src+1) + abort(); + + memset (dst, 0, sizeof (dst)); + dst2 = dst; src2 = src; + if (strncpy (++dst2+5, ++src2+5, 0) != dst+6 || strcmp (dst2+5, "") + || dst2 != dst+1 || src2 != src+1) + abort(); + + memset (dst, 0, sizeof (dst)); + if (strncpy (dst, src, 12) != dst || strcmp (dst, src)) + abort(); + + /* Test at least one instance of the __builtin_ style. We do this + to ensure that it works and that the prototype is correct. */ + memset (dst, 0, sizeof (dst)); + if (__builtin_strncpy (dst, src, 4) != dst || strncmp (dst, src, 4)) + abort(); + + memset (dst, 0, sizeof (dst)); + if (strncpy (dst, i++ ? 
"xfoo" + 1 : "bar", 4) != dst + || strcmp (dst, "bar") + || i != 1) + abort (); + + if (chk_calls) + abort (); + strncpy_disallowed = 0; +} + +void +__attribute__((noinline)) +test2 (void) +{ + chk_calls = 0; + /* No runtime checking should be done here, both destination + and length are unknown. */ + strncpy (s4, "abcd", l1 + 1); + if (chk_calls) + abort (); +} + +/* Test whether compile time checking is done where it should + and so is runtime object size checking. */ +void +__attribute__((noinline)) +test3 (void) +{ + struct A { char buf1[10]; char buf2[10]; } a; + char *r = l1 == 1 ? &a.buf1[5] : &a.buf2[4]; + char buf3[20]; + int i; + const char *l; + size_t l2; + + /* The following calls should do runtime checking + - source length is not known, but destination is. */ + chk_calls = 0; + strncpy (a.buf1 + 2, s3 + 3, l1); + strncpy (r, s3 + 2, l1 + 2); + r = l1 == 1 ? __builtin_alloca (4) : &a.buf2[7]; + strncpy (r, s2 + 2, l1 + 2); + strncpy (r + 2, s3 + 3, l1); + r = buf3; + for (i = 0; i < 4; ++i) + { + if (i == l1 - 1) + r = &a.buf1[1]; + else if (i == l1) + r = &a.buf2[7]; + else if (i == l1 + 1) + r = &buf3[5]; + else if (i == l1 + 2) + r = &a.buf1[9]; + } + strncpy (r, s2 + 4, l1); + if (chk_calls != 5) + abort (); + + /* Following have known destination and known length, + so if optimizing certainly shouldn't result in the checking + variants. */ + chk_calls = 0; + strncpy (a.buf1 + 2, "", 3); + strncpy (a.buf1 + 2, "", 0); + strncpy (r, "a", 1); + strncpy (r, "a", 3); + r = l1 == 1 ? 
__builtin_alloca (4) : &a.buf2[7]; + strncpy (r, s1 + 1, 3); + strncpy (r, s1 + 1, 2); + r = buf3; + l = "abc"; + l2 = 4; + for (i = 0; i < 4; ++i) + { + if (i == l1 - 1) + r = &a.buf1[1], l = "e", l2 = 2; + else if (i == l1) + r = &a.buf2[7], l = "gh", l2 = 3; + else if (i == l1 + 1) + r = &buf3[5], l = "jkl", l2 = 4; + else if (i == l1 + 2) + r = &a.buf1[9], l = "", l2 = 1; + } + strncpy (r, "", 1); + /* Here, strlen (l) + 1 is known to be at most 4 and + __builtin_object_size (&buf3[16], 0) is 4, so this doesn't need + runtime checking. */ + strncpy (&buf3[16], l, l2); + strncpy (&buf3[15], "abc", l2); + strncpy (&buf3[10], "fghij", l2); + if (chk_calls) + abort (); + chk_calls = 0; +} + +/* Test whether runtime and/or compile time checking catches + buffer overflows. */ +void +__attribute__((noinline)) +test4 (void) +{ + struct A { char buf1[10]; char buf2[10]; } a; + char buf3[20]; + + chk_fail_allowed = 1; + /* Runtime checks. */ + if (__builtin_setjmp (chk_fail_buf) == 0) + { + strncpy (&a.buf2[9], s2 + 4, l1 + 1); + abort (); + } + if (__builtin_setjmp (chk_fail_buf) == 0) + { + strncpy (&a.buf2[7], s3, l1 + 4); + abort (); + } + /* This should be detectable at compile time already. */ + if (__builtin_setjmp (chk_fail_buf) == 0) + { + strncpy (&buf3[19], "abc", 2); + abort (); + } + if (__builtin_setjmp (chk_fail_buf) == 0) + { + strncpy (&buf3[18], "", 3); + abort (); + } + chk_fail_allowed = 0; +} + +void +main_test (void) +{ +#ifndef __OPTIMIZE__ + /* Object size checking is only intended for -O[s123]. 
*/ + return; +#endif + __asm ("" : "=r" (s2) : "0" (s2)); + __asm ("" : "=r" (s3) : "0" (s3)); + __asm ("" : "=r" (l1) : "0" (l1)); + test1 (); + s4 = p; + test2 (); + test3 (); + test4 (); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strncpy-chk.x URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strncpy-chk.x?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strncpy-chk.x (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strncpy-chk.x Wed Oct 9 04:01:46 2019 @@ -0,0 +1,13 @@ +load_lib target-supports.exp + +if { ! [check_effective_target_nonlocal_goto] } { + return 1 +} + +if [istarget "epiphany-*-*"] { + # This test assumes the absence of struct padding. + # to make this true for test4 struct A on epiphany would require + # __attribute__((packed)) . 
+ return 1 +} +return 0 Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strncpy-lib.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strncpy-lib.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strncpy-lib.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strncpy-lib.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1 @@ +#include "lib/strncpy.c" Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strncpy.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strncpy.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strncpy.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strncpy.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,75 @@ +/* Copyright (C) 2000, 2005 Free Software Foundation. + + Ensure all expected transformations of builtin strncpy occur and + perform correctly. + + Written by Kaveh R. Ghazi, 11/25/2000. */ + +extern void abort (void); +typedef __SIZE_TYPE__ size_t; +extern char *strncpy (char *, const char *, size_t); +extern int memcmp (const void *, const void *, size_t); +extern void *memset (void *, int, size_t); + +/* Reset the destination buffer to a known state. 
*/ +#define RESET_DST memset(dst, 'X', sizeof(dst)) + +int i; + +void +main_test (void) +{ + const char *const src = "hello world"; + const char *src2; + char dst[64], *dst2; + + RESET_DST; + if (strncpy (dst, src, 4) != dst || memcmp (dst, "hellXXX", 7)) + abort(); + + RESET_DST; + if (strncpy (dst+16, src, 4) != dst+16 || memcmp (dst+16, "hellXXX", 7)) + abort(); + + RESET_DST; + if (strncpy (dst+32, src+5, 4) != dst+32 || memcmp (dst+32, " worXXX", 7)) + abort(); + + RESET_DST; + dst2 = dst; + if (strncpy (++dst2, src+5, 4) != dst+1 || memcmp (dst2, " worXXX", 7) + || dst2 != dst+1) + abort(); + + RESET_DST; + if (strncpy (dst, src, 0) != dst || memcmp (dst, "XXX", 3)) + abort(); + + RESET_DST; + dst2 = dst; src2 = src; + if (strncpy (++dst2, ++src2, 0) != dst+1 || memcmp (dst2, "XXX", 3) + || dst2 != dst+1 || src2 != src+1) + abort(); + + RESET_DST; + dst2 = dst; src2 = src; + if (strncpy (++dst2+5, ++src2+5, 0) != dst+6 || memcmp (dst2+5, "XXX", 3) + || dst2 != dst+1 || src2 != src+1) + abort(); + + RESET_DST; + if (strncpy (dst, src, 12) != dst || memcmp (dst, "hello world\0XXX", 15)) + abort(); + + /* Test at least one instance of the __builtin_ style. We do this + to ensure that it works and that the prototype is correct. */ + RESET_DST; + if (__builtin_strncpy (dst, src, 4) != dst || memcmp (dst, "hellXXX", 7)) + abort(); + + RESET_DST; + if (strncpy (dst, i++ ? 
"xfoo" + 1 : "bar", 4) != dst + || memcmp (dst, "bar\0XXX", 7) + || i != 1) + abort (); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strnlen-lib.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strnlen-lib.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strnlen-lib.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strnlen-lib.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1 @@ +#include "lib/strnlen.c" Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strnlen.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strnlen.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strnlen.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strnlen.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,92 @@ +/* PR tree-optimization/81384 - built-in form of strnlen missing + Test to verify that strnlen built-in expansion works correctly. */ + +#define PTRDIFF_MAX __PTRDIFF_MAX__ +#define SIZE_MAX __SIZE_MAX__ +#define NOIPA __attribute__ ((noipa)) + +typedef __SIZE_TYPE__ size_t; + +extern void abort (void); +extern size_t strnlen (const char *, size_t); + +#define A(expr) \ + ((expr) ? 
(void)0 \ + : (__builtin_printf ("assertion on line %i failed: %s\n", \ + __LINE__, #expr), \ + abort ())) + +NOIPA void test_strnlen_str_cst (void) +{ + A (strnlen ("", 0) == 0); + A (strnlen ("", 1) == 0); + A (strnlen ("", 9) == 0); + A (strnlen ("", PTRDIFF_MAX) == 0); + A (strnlen ("", SIZE_MAX) == 0); + A (strnlen ("", -1) == 0); + + A (strnlen ("1", 0) == 0); + A (strnlen ("1", 1) == 1); + A (strnlen ("1", 9) == 1); + A (strnlen ("1", PTRDIFF_MAX) == 1); + A (strnlen ("1", SIZE_MAX) == 1); + A (strnlen ("1", -2) == 1); + + A (strnlen ("123", 0) == 0); + A (strnlen ("123", 1) == 1); + A (strnlen ("123", 2) == 2); + A (strnlen ("123", 3) == 3); + A (strnlen ("123", 9) == 3); + A (strnlen ("123", PTRDIFF_MAX) == 3); + A (strnlen ("123", SIZE_MAX) == 3); + A (strnlen ("123", -2) == 3); +} + +NOIPA void test_strnlen_str_range (size_t x) +{ + size_t r_0_3 = x & 3; + size_t r_1_3 = r_0_3 | 1; + size_t r_2_3 = r_0_3 | 2; + + A (strnlen ("", r_0_3) == 0); + A (strnlen ("1", r_0_3) <= 1); + A (strnlen ("12", r_0_3) <= 2); + A (strnlen ("123", r_0_3) <= 3); + A (strnlen ("1234", r_0_3) <= 3); + + A (strnlen ("", r_1_3) == 0); + A (strnlen ("1", r_1_3) == 1); + A (strnlen ("12", r_1_3) <= 2); + A (strnlen ("123", r_1_3) <= 3); + A (strnlen ("1234", r_1_3) <= 3); + + A (strnlen ("", r_2_3) == 0); + A (strnlen ("1", r_2_3) == 1); + A (strnlen ("12", r_2_3) == 2); + A (strnlen ("123", r_2_3) <= 3); + A (strnlen ("1234", r_2_3) <= 3); +} + +NOIPA void test_strnlen_str_range_side_effect (size_t x) +{ + size_t r_0_3 = x & 3; + size_t r_1_3 = r_0_3 | 1; + size_t r_2_3 = r_0_3 | 2; + size_t n = r_2_3; + + int i = 0; + + A (strnlen ("1234" + i++, n) <= 3); + A (i == 1); + + A (strnlen ("1234", n++) <= 3); + A (n == r_2_3 + 1); +} + +void +main_test (void) +{ + test_strnlen_str_cst (); + test_strnlen_str_range ((size_t)""); + test_strnlen_str_range_side_effect ((size_t)""); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strnlen.x URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strnlen.x?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strnlen.x (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strnlen.x Wed Oct 9 04:01:46 2019 @@ -0,0 +1,13 @@ +# At -Og no pass records the global range information +# necessary to optimize the strnlen calls down to +# a constant. The framework assumes that the test +# will never call strnlen when the optimizer is +# enabled. So we filter out the -Og run here. + +set torture_eval_before_compile { + if {[string match {*-Og*} "$option"]} { + continue + } +} + +return 0 Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strpbrk-lib.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strpbrk-lib.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strpbrk-lib.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strpbrk-lib.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1 @@ +#include "lib/strpbrk.c" Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strpbrk.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strpbrk.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strpbrk.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strpbrk.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,42 @@ +/* Copyright (C) 2000 Free Software Foundation. 
+ + Ensure all expected transformations of builtin strpbrk occur and + perform correctly. + + Written by Kaveh R. Ghazi, 11/6/2000. */ + +extern void abort(void); +extern char *strpbrk (const char *, const char *); +extern int strcmp (const char *, const char *); + +void fn (const char *foo, const char *const *bar) +{ + if (strcmp(strpbrk ("hello world", "lrooo"), "llo world") != 0) + abort(); + if (strpbrk (foo, "") != 0) + abort(); + if (strpbrk (foo + 4, "") != 0) + abort(); + if (strpbrk (*bar--, "") != 0) + abort(); + if (strpbrk (*bar, "h") != foo) + abort(); + if (strpbrk (foo, "h") != foo) + abort(); + if (strpbrk (foo, "w") != foo + 6) + abort(); + if (strpbrk (foo + 6, "o") != foo + 7) + abort(); + + /* Test at least one instance of the __builtin_ style. We do this + to ensure that it works and that the prototype is correct. */ + if (__builtin_strpbrk (foo + 6, "o") != foo + 7) + abort(); +} + +void +main_test (void) +{ + const char *const foo[] = { "hello world", "bye bye world" }; + fn (foo[0], foo + 1); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strpcpy-2-lib.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strpcpy-2-lib.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strpcpy-2-lib.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strpcpy-2-lib.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1 @@ +#include "lib/stpcpy.c" Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strpcpy-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strpcpy-2.c?rev=374156&view=auto ============================================================================== --- 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strpcpy-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strpcpy-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,45 @@ +/* Copyright (C) 2003 Free Software Foundation. + + Ensure that builtin stpcpy performs correctly. + + Written by Jakub Jelinek, 21/05/2003. */ + +extern void abort (void); +typedef __SIZE_TYPE__ size_t; +extern int memcmp (const void *, const void *, size_t); +extern char *stpcpy (char *, const char *); +extern int inside_main; + +long buf1[64]; +char *buf2 = (char *) (buf1 + 32); +long buf5[20]; +char buf7[20]; + +void +__attribute__((noinline)) +test (long *buf3, char *buf4, char *buf6, int n) +{ + int i = 4; + + if (stpcpy ((char *) buf3, "abcdefghijklmnop") != (char *) buf1 + 16 + || memcmp (buf1, "abcdefghijklmnop", 17)) + abort (); + + if (__builtin_stpcpy ((char *) buf3, "ABCDEFG") != (char *) buf1 + 7 + || memcmp (buf1, "ABCDEFG\0ijklmnop", 17)) + abort (); + + if (stpcpy ((char *) buf3 + i++, "x") != (char *) buf1 + 5 + || memcmp (buf1, "ABCDx\0G\0ijklmnop", 17)) + abort (); +} + +void +main_test (void) +{ + /* All these tests are allowed to call mempcpy/stpcpy. 
*/ + inside_main = 0; + __builtin_memcpy (buf5, "RSTUVWXYZ0123456789", 20); + __builtin_memcpy (buf7, "RSTUVWXYZ0123456789", 20); + test (buf1, buf2, "rstuvwxyz", 0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strpcpy-lib.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strpcpy-lib.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strpcpy-lib.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strpcpy-lib.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1 @@ +#include "lib/stpcpy.c" Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strpcpy.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strpcpy.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strpcpy.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strpcpy.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,60 @@ +/* Copyright (C) 2003, 2004 Free Software Foundation. + + Ensure builtin stpcpy performs correctly. + + Written by Kaveh Ghazi, 4/11/2003. */ + +typedef __SIZE_TYPE__ size_t; + +extern void abort (void); +extern char *strcpy (char *, const char *); +extern char *stpcpy (char *, const char *); +extern int memcmp (const void *, const void *, size_t); + +extern int inside_main; + +const char s1[] = "123"; +char p[32] = ""; +char *s2 = "defg"; +char *s3 = "FGH"; +size_t l1 = 1; + +void +main_test (void) +{ + int i = 8; + +#if !defined __i386__ && !defined __x86_64__ + /* The functions below might not be optimized into direct stores on all + arches. 
It depends on how many instructions would be generated and + what limits the architecture chooses in STORE_BY_PIECES_P. */ + inside_main = 0; +#endif + if (stpcpy (p, "abcde") != p + 5 || memcmp (p, "abcde", 6)) + abort (); + if (stpcpy (p + 16, "vwxyz" + 1) != p + 16 + 4 || memcmp (p + 16, "wxyz", 5)) + abort (); + if (stpcpy (p + 1, "") != p + 1 + 0 || memcmp (p, "a\0cde", 6)) + abort (); + if (stpcpy (p + 3, "fghij") != p + 3 + 5 || memcmp (p, "a\0cfghij", 9)) + abort (); + + if (stpcpy ((i++, p + 20 + 1), "23") != (p + 20 + 1 + 2) + || i != 9 || memcmp (p + 19, "z\0""23\0", 5)) + abort (); + + if (stpcpy (stpcpy (p, "ABCD"), "EFG") != p + 7 || memcmp (p, "ABCDEFG", 8)) + abort(); + + /* Test at least one instance of the __builtin_ style. We do this + to ensure that it works and that the prototype is correct. */ + if (__builtin_stpcpy (p, "abcde") != p + 5 || memcmp (p, "abcde", 6)) + abort (); + + /* If the result of stpcpy is ignored, gcc should use strcpy. + This should be optimized always, so set inside_main again. */ + inside_main = 1; + stpcpy (p + 3, s3); + if (memcmp (p, "abcFGH", 6)) + abort (); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strrchr-lib.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strrchr-lib.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strrchr-lib.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strrchr-lib.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,8 @@ +#include "lib/strrchr.c" +#ifdef __vxworks +/* The RTP C library uses bzero, bfill and bcopy, all of which are defined + in the same file as rindex. 
*/ +#include "lib/bzero.c" +#include "lib/bfill.c" +#include "lib/memmove.c" +#endif Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strrchr.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strrchr.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strrchr.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strrchr.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,49 @@ +/* Copyright (C) 2000, 2003, 2004 Free Software Foundation. + + Ensure all expected transformations of builtin strrchr and rindex + occur and perform correctly. + + Written by Jakub Jelinek, 11/7/2000. */ + +extern void abort (void); +extern char *strrchr (const char *, int); +extern char *rindex (const char *, int); + +char *bar = "hi world"; +int x = 7; + +void +main_test (void) +{ + const char *const foo = "hello world"; + + if (strrchr (foo, 'x')) + abort (); + if (strrchr (foo, 'o') != foo + 7) + abort (); + if (strrchr (foo, 'e') != foo + 1) + abort (); + if (strrchr (foo + 3, 'e')) + abort (); + if (strrchr (foo, '\0') != foo + 11) + abort (); + if (strrchr (bar, '\0') != bar + 8) + abort (); + if (strrchr (bar + 4, '\0') != bar + 8) + abort (); + if (strrchr (bar + (x++ & 3), '\0') != bar + 8) + abort (); + if (x != 8) + abort (); + /* Test only one instance of rindex since the code path is the same + as that of strrchr. */ + if (rindex ("hello", 'z') != 0) + abort (); + + /* Test at least one instance of the __builtin_ style. We do this + to ensure that it works and that the prototype is correct. 
*/ + if (__builtin_strrchr (foo, 'o') != foo + 7) + abort (); + if (__builtin_rindex (foo, 'o') != foo + 7) + abort (); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strspn-lib.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strspn-lib.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strspn-lib.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strspn-lib.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1 @@ +#include "lib/strspn.c" Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strspn.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strspn.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strspn.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strspn.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,54 @@ +/* Copyright (C) 2000 Free Software Foundation. + + Ensure all expected transformations of builtin strspn occur and + perform correctly. + + Written by Kaveh R. Ghazi, 11/27/2000. 
*/ + +extern void abort (void); +typedef __SIZE_TYPE__ size_t; +extern size_t strspn (const char *, const char *); +extern char *strcpy (char *, const char *); + +void +main_test (void) +{ + const char *const s1 = "hello world"; + char dst[64], *d2; + + if (strspn (s1, "hello") != 5) + abort(); + if (strspn (s1+4, "hello") != 1) + abort(); + if (strspn (s1, "z") != 0) + abort(); + if (strspn (s1, "hello world") != 11) + abort(); + if (strspn (s1, "") != 0) + abort(); + strcpy (dst, s1); + if (strspn (dst, "") != 0) + abort(); + strcpy (dst, s1); d2 = dst; + if (strspn (++d2, "") != 0 || d2 != dst+1) + abort(); + strcpy (dst, s1); d2 = dst; + if (strspn (++d2+5, "") != 0 || d2 != dst+1) + abort(); + if (strspn ("", s1) != 0) + abort(); + strcpy (dst, s1); + if (strspn ("", dst) != 0) + abort(); + strcpy (dst, s1); d2 = dst; + if (strspn ("", ++d2) != 0 || d2 != dst+1) + abort(); + strcpy (dst, s1); d2 = dst; + if (strspn ("", ++d2+5) != 0 || d2 != dst+1) + abort(); + + /* Test at least one instance of the __builtin_ style. We do this + to ensure that it works and that the prototype is correct. 
*/ + if (__builtin_strspn (s1, "hello") != 5) + abort(); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strstr-asm-lib.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strstr-asm-lib.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strstr-asm-lib.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strstr-asm-lib.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,38 @@ +extern void abort (void); +typedef __SIZE_TYPE__ size_t; +extern size_t strlen(const char *); +extern char *strchr(const char *, int); +extern int strcmp(const char *, const char *); +extern int strncmp(const char *, const char *, size_t); +extern int inside_main; +extern const char *p; + +__attribute__ ((used)) +char * +my_strstr (const char *s1, const char *s2) +{ + const size_t len = strlen (s2); + +#ifdef __OPTIMIZE__ + /* If optimizing, we should be called only in the strstr (foo + 2, p) + case. All other cases should be optimized. 
*/ + if (inside_main) + if (s2 != p || strcmp (s1, "hello world" + 2) != 0) + abort (); +#endif + if (len == 0) + return (char *) s1; + for (s1 = strchr (s1, *s2); s1; s1 = strchr (s1 + 1, *s2)) + if (strncmp (s1, s2, len) == 0) + return (char *) s1; + return (char *) 0; +} + +char * +strstr (const char *s1, const char *s2) +{ + if (inside_main) + abort (); + + return my_strstr (s1, s2); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strstr-asm.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strstr-asm.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strstr-asm.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strstr-asm.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,45 @@ +/* Copyright (C) 2000, 2003 Free Software Foundation. + + Ensure all expected transformations of builtin strstr occur and + perform correctly in presence of redirect. 
*/ + +#define ASMNAME(cname) ASMNAME2 (__USER_LABEL_PREFIX__, cname) +#define ASMNAME2(prefix, cname) STRING (prefix) cname +#define STRING(x) #x + +typedef __SIZE_TYPE__ size_t; +extern void abort (void); +extern char *strstr (const char *, const char *) + __asm (ASMNAME ("my_strstr")); + +const char *p = "rld", *q = "hello world"; + +void +main_test (void) +{ + const char *const foo = "hello world"; + + if (strstr (foo, "") != foo) + abort (); + if (strstr (foo + 4, "") != foo + 4) + abort (); + if (strstr (foo, "h") != foo) + abort (); + if (strstr (foo, "w") != foo + 6) + abort (); + if (strstr (foo + 6, "o") != foo + 7) + abort (); + if (strstr (foo + 1, "world") != foo + 6) + abort (); + if (strstr (foo + 2, p) != foo + 8) + abort (); + if (strstr (q, "") != q) + abort (); + if (strstr (q + 1, "o") != q + 4) + abort (); + + /* Test at least one instance of the __builtin_ style. We do this + to ensure that it works and that the prototype is correct. */ + if (__builtin_strstr (foo + 1, "world") != foo + 6) + abort (); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strstr-asm.x URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strstr-asm.x?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strstr-asm.x (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strstr-asm.x Wed Oct 9 04:01:46 2019 @@ -0,0 +1,10 @@ +# Different translation units may have different user name overrides +# and we do not preserve enough context to know which one we want. 
+ +set torture_eval_before_compile { + if {[string match {*-flto*} "$option"]} { + continue + } +} + +return 0 Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strstr-lib.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strstr-lib.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strstr-lib.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strstr-lib.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1 @@ +#include "lib/strstr.c" Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strstr.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strstr.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strstr.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/strstr.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,33 @@ +/* Copyright (C) 2000 Free Software Foundation. + + Ensure all expected transformations of builtin strstr occur and + perform correctly. + + Written by Kaveh R. Ghazi, 11/6/2000. */ + +extern void abort(void); +extern char *strstr (const char *, const char *); + +void +main_test (void) +{ + const char *const foo = "hello world"; + + if (strstr (foo, "") != foo) + abort(); + if (strstr (foo + 4, "") != foo + 4) + abort(); + if (strstr (foo, "h") != foo) + abort(); + if (strstr (foo, "w") != foo + 6) + abort(); + if (strstr (foo + 6, "o") != foo + 7) + abort(); + if (strstr (foo + 1, "world") != foo + 6) + abort(); + + /* Test at least one instance of the __builtin_ style. We do this + to ensure that it works and that the prototype is correct. 
*/ + if (__builtin_strstr (foo + 1, "world") != foo + 6) + abort(); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/vsnprintf-chk-lib.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/vsnprintf-chk-lib.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/vsnprintf-chk-lib.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/vsnprintf-chk-lib.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1 @@ +#include "lib/chk.c" Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/vsnprintf-chk.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/vsnprintf-chk.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/vsnprintf-chk.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/vsnprintf-chk.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,321 @@ +/* Copyright (C) 2004, 2005 Free Software Foundation. + + Ensure builtin __vsnprintf_chk performs correctly. */ + +#include <stdarg.h> + +extern void abort (void); +typedef __SIZE_TYPE__ size_t; +extern size_t strlen(const char *); +extern void *memcpy (void *, const void *, size_t); +extern char *strcpy (char *, const char *); +extern int memcmp (const void *, const void *, size_t); +extern void *memset (void *, int, size_t); +extern int vsnprintf (char *, size_t, const char *, va_list); + +#include "chk.h" + +const char s1[] = "123"; +char p[32] = ""; +char *s2 = "defg"; +char *s3 = "FGH"; +char *s4; +size_t l1 = 1; +static char buffer[32]; +char * volatile ptr = "barf"; /* prevent constant propagation to happen when whole program assumptions are made. 
*/ + +int +__attribute__((noinline)) +test1_sub (int i, ...) +{ + int ret = 0; + va_list ap; + va_start (ap, i); + switch (i) + { + case 0: + vsnprintf (buffer, 4, "foo", ap); + break; + case 1: + ret = vsnprintf (buffer, 4, "foo bar", ap); + break; + case 2: + vsnprintf (buffer, 32, "%s", ap); + break; + case 3: + ret = vsnprintf (buffer, 21, "%s", ap); + break; + case 4: + ret = vsnprintf (buffer, 4, "%d%d%d", ap); + break; + case 5: + ret = vsnprintf (buffer, 32, "%d%d%d", ap); + break; + case 6: + ret = vsnprintf (buffer, strlen (ptr) + 1, "%s", ap); + break; + case 7: + vsnprintf (buffer, l1 + 31, "%d - %c", ap); + break; + case 8: + vsnprintf (s4, l1 + 6, "%d - %c", ap); + break; + } + va_end (ap); + return ret; +} + +void +__attribute__((noinline)) +test1 (void) +{ + chk_calls = 0; + /* vsnprintf_disallowed = 1; */ + + memset (buffer, 'A', 32); + test1_sub (0); + if (memcmp (buffer, "foo", 4) || buffer[4] != 'A') + abort (); + + memset (buffer, 'A', 32); + if (test1_sub (1) != 7) + abort (); + if (memcmp (buffer, "foo", 4) || buffer[4] != 'A') + abort (); + + vsnprintf_disallowed = 0; + + memset (buffer, 'A', 32); + test1_sub (2, "bar"); + if (memcmp (buffer, "bar", 4) || buffer[4] != 'A') + abort (); + + memset (buffer, 'A', 32); + if (test1_sub (3, "bar") != 3) + abort (); + if (memcmp (buffer, "bar", 4) || buffer[4] != 'A') + abort (); + + memset (buffer, 'A', 32); + if (test1_sub (4, (int) l1, (int) l1 + 1, (int) l1 + 12) != 4) + abort (); + if (memcmp (buffer, "121", 4) || buffer[4] != 'A') + abort (); + + memset (buffer, 'A', 32); + if (test1_sub (5, (int) l1, (int) l1 + 1, (int) l1 + 12) != 4) + abort (); + if (memcmp (buffer, "1213", 5) || buffer[5] != 'A') + abort (); + + if (chk_calls) + abort (); + + memset (buffer, 'A', 32); + test1_sub (6, ptr); + if (memcmp (buffer, "barf", 5) || buffer[5] != 'A') + abort (); + + memset (buffer, 'A', 32); + test1_sub (7, (int) l1 + 27, *ptr); + if (memcmp (buffer, "28 - b\0AAAAA", 12)) + abort (); + + if 
(chk_calls != 2) + abort (); + chk_calls = 0; + + memset (s4, 'A', 32); + test1_sub (8, (int) l1 - 17, ptr[1]); + if (memcmp (s4, "-16 - \0AAA", 10)) + abort (); + if (chk_calls) + abort (); +} + +void +__attribute__((noinline)) +test2_sub (int i, ...) +{ + va_list ap; + struct A { char buf1[10]; char buf2[10]; } a; + char *r = l1 == 1 ? &a.buf1[5] : &a.buf2[4]; + char buf3[20]; + int j; + + va_start (ap, i); + /* The following calls should do runtime checking + - length is not known, but destination is. */ + switch (i) + { + case 0: + vsnprintf (a.buf1 + 2, l1, "%s", ap); + break; + case 1: + vsnprintf (r, l1 + 4, "%s%c", ap); + break; + case 2: + r = l1 == 1 ? __builtin_alloca (4) : &a.buf2[7]; + vsnprintf (r, strlen (s2) - 2, "%c %s", ap); + break; + case 3: + r = l1 == 1 ? __builtin_alloca (4) : &a.buf2[7]; + vsnprintf (r + 2, l1, s3 + 3, ap); + break; + case 4: + case 7: + r = buf3; + for (j = 0; j < 4; ++j) + { + if (j == l1 - 1) + r = &a.buf1[1]; + else if (j == l1) + r = &a.buf2[7]; + else if (j == l1 + 1) + r = &buf3[5]; + else if (j == l1 + 2) + r = &a.buf1[9]; + } + if (i == 4) + vsnprintf (r, l1, s2 + 4, ap); + else + vsnprintf (r, 1, "a", ap); + break; + case 5: + r = l1 == 1 ? __builtin_alloca (4) : &a.buf2[7]; + vsnprintf (r, l1 + 3, "%s", ap); + break; + case 6: + vsnprintf (a.buf1 + 2, 4, "", ap); + break; + case 8: + vsnprintf (s4, 3, "%s %d", ap); + break; + } + va_end (ap); +} + +/* Test whether compile time checking is done where it should + and so is runtime object size checking. */ +void +__attribute__((noinline)) +test2 (void) +{ + /* The following calls should do runtime checking + - length is not known, but destination is. 
*/ + chk_calls = 0; + test2_sub (0, s3 + 3); + test2_sub (1, s3 + 3, s3[3]); + test2_sub (2, s2[2], s2 + 4); + test2_sub (3); + test2_sub (4); + test2_sub (5, s1 + 1); + if (chk_calls != 6) + abort (); + + /* Following have known destination and known source length, + so if optimizing certainly shouldn't result in the checking + variants. */ + chk_calls = 0; + /* vsnprintf_disallowed = 1; */ + test2_sub (6); + test2_sub (7); + vsnprintf_disallowed = 0; + /* Unknown destination and source, no checking. */ + test2_sub (8, s3, 0); + if (chk_calls) + abort (); +} + +void +__attribute__((noinline)) +test3_sub (int i, ...) +{ + va_list ap; + struct A { char buf1[10]; char buf2[10]; } a; + char buf3[20]; + + va_start (ap, i); + /* The following calls should do runtime checking + - source length is not known, but destination is. */ + switch (i) + { + case 0: + vsnprintf (&a.buf2[9], l1 + 1, "%c%s", ap); + break; + case 1: + vsnprintf (&a.buf2[7], l1 + 30, "%s%c", ap); + break; + case 2: + vsnprintf (&a.buf2[7], l1 + 3, "%d", ap); + break; + case 3: + vsnprintf (&buf3[17], l1 + 3, "%s", ap); + break; + case 4: + vsnprintf (&buf3[19], 2, "a", ap); + break; + case 5: + vsnprintf (&buf3[16], 5, "a", ap); + break; + } + va_end (ap); +} + +/* Test whether runtime and/or compile time checking catches + buffer overflows. */ +void +__attribute__((noinline)) +test3 (void) +{ + chk_fail_allowed = 1; + /* Runtime checks. */ + if (__builtin_setjmp (chk_fail_buf) == 0) + { + test3_sub (0, s2[3], s2 + 4); + abort (); + } + if (__builtin_setjmp (chk_fail_buf) == 0) + { + test3_sub (1, s3 + strlen (s3) - 2, *s3); + abort (); + } + if (__builtin_setjmp (chk_fail_buf) == 0) + { + test3_sub (2, (int) l1 + 9999); + abort (); + } + if (__builtin_setjmp (chk_fail_buf) == 0) + { + test3_sub (3, "abc"); + abort (); + } + /* This should be detectable at compile time already. 
*/ + if (__builtin_setjmp (chk_fail_buf) == 0) + { + test3_sub (4); + abort (); + } + if (__builtin_setjmp (chk_fail_buf) == 0) + { + test3_sub (5); + abort (); + } + chk_fail_allowed = 0; +} + +void +main_test (void) +{ +#ifndef __OPTIMIZE__ + /* Object size checking is only intended for -O[s123]. */ + return; +#endif + __asm ("" : "=r" (s2) : "0" (s2)); + __asm ("" : "=r" (s3) : "0" (s3)); + __asm ("" : "=r" (l1) : "0" (l1)); + s4 = p; + test1 (); + test2 (); + test3 (); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/vsnprintf-chk.x URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/vsnprintf-chk.x?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/vsnprintf-chk.x (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/vsnprintf-chk.x Wed Oct 9 04:01:46 2019 @@ -0,0 +1,13 @@ +load_lib target-supports.exp + +if { ! [check_effective_target_nonlocal_goto] } { + return 1 +} + +if [istarget "epiphany-*-*"] { + # This test assumes the absence of struct padding. + # to make this true for test3_sub struct A on epiphany would require + # __attribute__((packed)) . 
+ return 1 +} +return 0 Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/vsprintf-chk-lib.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/vsprintf-chk-lib.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/vsprintf-chk-lib.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/vsprintf-chk-lib.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1 @@ +#include "lib/chk.c" Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/vsprintf-chk.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/vsprintf-chk.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/vsprintf-chk.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/vsprintf-chk.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,290 @@ +/* Copyright (C) 2004, 2005 Free Software Foundation. + + Ensure builtin __vsprintf_chk performs correctly. */ + +#include <stdarg.h> + +extern void abort (void); +typedef __SIZE_TYPE__ size_t; +extern size_t strlen(const char *); +extern void *memcpy (void *, const void *, size_t); +extern char *strcpy (char *, const char *); +extern int memcmp (const void *, const void *, size_t); +extern void *memset (void *, int, size_t); +extern int vsprintf (char *, const char *, va_list); + +#include "chk.h" + +const char s1[] = "123"; +char p[32] = ""; +char *s2 = "defg"; +char *s3 = "FGH"; +char *s4; +size_t l1 = 1; +static char buffer[32]; +char * volatile ptr = "barf"; /* prevent constant propagation to happen when whole program assumptions are made. */ + +int +__attribute__((noinline)) +test1_sub (int i, ...)
+{ + int ret = 0; + va_list ap; + va_start (ap, i); + switch (i) + { + case 0: + vsprintf (buffer, "foo", ap); + break; + case 1: + ret = vsprintf (buffer, "foo", ap); + break; + case 2: + vsprintf (buffer, "%s", ap); + break; + case 3: + ret = vsprintf (buffer, "%s", ap); + break; + case 4: + vsprintf (buffer, "%d - %c", ap); + break; + case 5: + vsprintf (s4, "%d - %c", ap); + break; + } + va_end (ap); + return ret; +} + +void +__attribute__((noinline)) +test1 (void) +{ + chk_calls = 0; + vsprintf_disallowed = 1; + + memset (buffer, 'A', 32); + test1_sub (0); + if (memcmp (buffer, "foo", 4) || buffer[4] != 'A') + abort (); + + memset (buffer, 'A', 32); + if (test1_sub (1) != 3) + abort (); + if (memcmp (buffer, "foo", 4) || buffer[4] != 'A') + abort (); + + if (chk_calls) + abort (); + vsprintf_disallowed = 0; + + memset (buffer, 'A', 32); + test1_sub (2, "bar"); + if (memcmp (buffer, "bar", 4) || buffer[4] != 'A') + abort (); + + memset (buffer, 'A', 32); + if (test1_sub (3, "bar") != 3) + abort (); + if (memcmp (buffer, "bar", 4) || buffer[4] != 'A') + abort (); + + memset (buffer, 'A', 32); + test1_sub (2, ptr); + if (memcmp (buffer, "barf", 5) || buffer[5] != 'A') + abort (); + + memset (buffer, 'A', 32); + test1_sub (4, (int) l1 + 27, *ptr); + if (memcmp (buffer, "28 - b\0AAAAA", 12)) + abort (); + + if (chk_calls != 4) + abort (); + chk_calls = 0; + + test1_sub (5, (int) l1 - 17, ptr[1]); + if (memcmp (s4, "-16 - a", 8)) + abort (); + if (chk_calls) + abort (); +} + +void +__attribute__((noinline)) +test2_sub (int i, ...) +{ + va_list ap; + struct A { char buf1[10]; char buf2[10]; } a; + char *r = l1 == 1 ? &a.buf1[5] : &a.buf2[4]; + char buf3[20]; + int j; + + va_start (ap, i); + /* The following calls should do runtime checking + - source length is not known, but destination is. */ + switch (i) + { + case 0: + vsprintf (a.buf1 + 2, "%s", ap); + break; + case 1: + vsprintf (r, "%s%c", ap); + break; + case 2: + r = l1 == 1 ? 
__builtin_alloca (4) : &a.buf2[7]; + vsprintf (r, "%c %s", ap); + break; + case 3: + r = l1 == 1 ? __builtin_alloca (4) : &a.buf2[7]; + vsprintf (r + 2, s3 + 3, ap); + break; + case 4: + case 7: + r = buf3; + for (j = 0; j < 4; ++j) + { + if (j == l1 - 1) + r = &a.buf1[1]; + else if (j == l1) + r = &a.buf2[7]; + else if (j == l1 + 1) + r = &buf3[5]; + else if (j == l1 + 2) + r = &a.buf1[9]; + } + if (i == 4) + vsprintf (r, s2 + 4, ap); + else + vsprintf (r, "a", ap); + break; + case 5: + r = l1 == 1 ? __builtin_alloca (4) : &a.buf2[7]; + vsprintf (r, "%s", ap); + break; + case 6: + vsprintf (a.buf1 + 2, "", ap); + break; + case 8: + vsprintf (s4, "%s %d", ap); + break; + } + va_end (ap); +} + +/* Test whether compile time checking is done where it should + and so is runtime object size checking. */ +void +__attribute__((noinline)) +test2 (void) +{ + /* The following calls should do runtime checking + - source length is not known, but destination is. */ + chk_calls = 0; + test2_sub (0, s3 + 3); + test2_sub (1, s3 + 3, s3[3]); + test2_sub (2, s2[2], s2 + 4); + test2_sub (3); + test2_sub (4); + test2_sub (5, s1 + 1); + if (chk_calls != 6) + abort (); + + /* Following have known destination and known source length, + so if optimizing certainly shouldn't result in the checking + variants. */ + chk_calls = 0; + vsprintf_disallowed = 1; + test2_sub (6); + test2_sub (7); + vsprintf_disallowed = 0; + /* Unknown destination and source, no checking. */ + test2_sub (8, s3, 0); + if (chk_calls) + abort (); +} + +void +__attribute__((noinline)) +test3_sub (int i, ...) +{ + va_list ap; + struct A { char buf1[10]; char buf2[10]; } a; + char buf3[20]; + + va_start (ap, i); + /* The following calls should do runtime checking + - source length is not known, but destination is. 
*/ + switch (i) + { + case 0: + vsprintf (&a.buf2[9], "%c%s", ap); + break; + case 1: + vsprintf (&a.buf2[7], "%s%c", ap); + break; + case 2: + vsprintf (&a.buf2[7], "%d", ap); + break; + case 3: + vsprintf (&buf3[17], "%s", ap); + break; + case 4: + vsprintf (&buf3[19], "a", ap); + break; + } + va_end (ap); +} + +/* Test whether runtime and/or compile time checking catches + buffer overflows. */ +void +__attribute__((noinline)) +test3 (void) +{ + chk_fail_allowed = 1; + /* Runtime checks. */ + if (__builtin_setjmp (chk_fail_buf) == 0) + { + test3_sub (0, s2[3], s2 + 4); + abort (); + } + if (__builtin_setjmp (chk_fail_buf) == 0) + { + test3_sub (1, s3 + strlen (s3) - 2, *s3); + abort (); + } + if (__builtin_setjmp (chk_fail_buf) == 0) + { + test3_sub (2, (int) l1 + 9999); + abort (); + } + if (__builtin_setjmp (chk_fail_buf) == 0) + { + test3_sub (3, "abc"); + abort (); + } + /* This should be detectable at compile time already. */ + if (__builtin_setjmp (chk_fail_buf) == 0) + { + test3_sub (4); + abort (); + } + chk_fail_allowed = 0; +} + +void +main_test (void) +{ +#ifndef __OPTIMIZE__ + /* Object size checking is only intended for -O[s123]. */ + return; +#endif + __asm ("" : "=r" (s2) : "0" (s2)); + __asm ("" : "=r" (s3) : "0" (s3)); + __asm ("" : "=r" (l1) : "0" (l1)); + s4 = p; + test1 (); + test2 (); + test3 (); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/vsprintf-chk.x URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/vsprintf-chk.x?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/vsprintf-chk.x (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/builtins/vsprintf-chk.x Wed Oct 9 04:01:46 2019 @@ -0,0 +1,13 @@ +load_lib target-supports.exp + +if { ! 
[check_effective_target_nonlocal_goto] } { + return 1 +} + +if [istarget "epiphany-*-*"] { + # This test assumes the absence of struct padding. + # to make this true for test3_sub struct A on epiphany would require + # __attribute__((packed)) . + return 1 +} +return 0 Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/call-trap-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/call-trap-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/call-trap-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/call-trap-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,27 @@ +/* Undefined behavior from a call to a function cast to a different + type does not appear until after the function designator and + arguments have been evaluated. PR 38483. */ +/* Origin: Joseph Myers */ +/* { dg-require-effective-target untyped_assembly } */ + +extern void exit (int); +extern void abort (void); + +int +foo (void) +{ + exit (0); + return 0; +} + +void +bar (void) +{ +} + +int +main (void) +{ + ((long (*)(int))bar) (foo ()); + abort (); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/cbrt.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/cbrt.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/cbrt.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/cbrt.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,92 @@ +/* + * ==================================================== + * Copyright (C) 1993 by Sun Microsystems, Inc. All rights reserved. + * + * Developed at SunPro, a Sun Microsystems, Inc. business. 
+ * Permission to use, copy, modify, and distribute this + * software is freely granted, provided that this notice + * is preserved. + * ==================================================== +*/ + +#ifndef __vax__ +static const unsigned long + B1 = 715094163, /* B1 = (682-0.03306235651)*2**20 */ + B2 = 696219795; /* B2 = (664-0.03306235651)*2**20 */ + +static const double + C = 5.42857142857142815906e-01, /* 19/35 = 0x3FE15F15, 0xF15F15F1 */ + D = -7.05306122448979611050e-01, /* -864/1225 = 0xBFE691DE, 0x2532C834 */ + E = 1.41428571428571436819e+00, /* 99/70 = 0x3FF6A0EA, 0x0EA0EA0F */ + F = 1.60714285714285720630e+00, /* 45/28 = 0x3FF9B6DB, 0x6DB6DB6E */ + G = 3.57142857142857150787e-01; /* 5/14 = 0x3FD6DB6D, 0xB6DB6DB7 */ + +double +cbrtl (double x) +{ + long hx; + double r,s,w; + double lt; + unsigned sign; + typedef unsigned unsigned32 __attribute__((mode(SI))); + union { + double t; + unsigned32 pt[2]; + } ut, ux; + int n0; + + ut.t = 1.0; + n0 = (ut.pt[0] == 0); + + ut.t = 0.0; + ux.t = x; + + hx = ux.pt[n0]; /* high word of x */ + sign=hx&0x80000000; /* sign= sign(x) */ + hx ^=sign; + if(hx>=0x7ff00000) return(x+x); /* cbrt(NaN,INF) is itself */ + if((hx| ux.pt[1-n0])==0) + return(ux.t); /* cbrt(0) is itself */ + + ux.pt[n0] = hx; + /* rough cbrt to 5 bits */ + if(hx<0x00100000) /* subnormal number */ + {ut.pt[n0]=0x43500000; /* set t= 2**54 */ + ut.t*=x; ut.pt[n0]=ut.pt[n0]/3+B2; + } + else + ut.pt[n0]=hx/3+B1; + + /* new cbrt to 23 bits, may be implemented in single precision */ + r=ut.t*ut.t/ux.t; + s=C+r*ut.t; + ut.t*=G+F/(s+E+D/s); + + /* chopped to 20 bits and make it larger than cbrt(x) */ + ut.pt[1-n0]=0; ut.pt[n0]+=0x00000001; + + /* one step newton iteration to 53 bits with error less than 0.667 ulps */ + s=ut.t*ut.t; /* t*t is exact */ + r=ux.t/s; + w=ut.t+ut.t; + r=(r-ut.t)/(w+r); /* r-s is exact */ + ut.t=ut.t+ut.t*r; + + /* restore the sign bit */ + ut.pt[n0] |= sign; + + lt = ut.t; + lt -= (lt - (x/(lt*lt))) * 0.333333333333333333333; + return 
lt; +} + +main () +{ + if ((int) (cbrtl (27.0) + 0.5) != 3) + abort (); + + exit (0); +} +#else +main () { exit (0); } +#endif Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/cmpdi-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/cmpdi-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/cmpdi-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/cmpdi-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,218 @@ +#define F 140 +#define T 13 + +feq (x, y) + long long int x; + long long int y; +{ + if (x == y) + return T; + else + return F; +} + +fne (x, y) + long long int x; + long long int y; +{ + if (x != y) + return T; + else + return F; +} + +flt (x, y) + long long int x; + long long int y; +{ + if (x < y) + return T; + else + return F; +} + +fge (x, y) + long long int x; + long long int y; +{ + if (x >= y) + return T; + else + return F; +} + +fgt (x, y) + long long int x; + long long int y; +{ + if (x > y) + return T; + else + return F; +} + +fle (x, y) + long long int x; + long long int y; +{ + if (x <= y) + return T; + else + return F; +} + +fltu (x, y) + unsigned long long int x; + unsigned long long int y; +{ + if (x < y) + return T; + else + return F; +} + +fgeu (x, y) + unsigned long long int x; + unsigned long long int y; +{ + if (x >= y) + return T; + else + return F; +} + +fgtu (x, y) + unsigned long long int x; + unsigned long long int y; +{ + if (x > y) + return T; + else + return F; +} + +fleu (x, y) + unsigned long long int x; + unsigned long long int y; +{ + if (x <= y) + return T; + else + return F; +} + +long long args[] = +{ + 0LL, + 1LL, + -1LL, + 0x7fffffffffffffffLL, + 0x8000000000000000LL, + 0x8000000000000001LL, + 0x1A3F237394D36C58LL, + 0x93850E92CAAC1B04LL +}; + +int correct_results[] = +{ + T, F, F, T, F, T, F, T, F, T, + 
F, T, T, F, F, T, T, F, F, T, + F, T, F, T, T, F, T, F, F, T, + F, T, T, F, F, T, T, F, F, T, + F, T, F, T, T, F, T, F, F, T, + F, T, F, T, T, F, T, F, F, T, + F, T, T, F, F, T, T, F, F, T, + F, T, F, T, T, F, T, F, F, T, + F, T, F, T, T, F, F, T, T, F, + T, F, F, T, F, T, F, T, F, T, + F, T, F, T, T, F, T, F, F, T, + F, T, T, F, F, T, T, F, F, T, + F, T, F, T, T, F, T, F, F, T, + F, T, F, T, T, F, T, F, F, T, + F, T, T, F, F, T, T, F, F, T, + F, T, F, T, T, F, T, F, F, T, + F, T, T, F, F, T, F, T, T, F, + F, T, T, F, F, T, F, T, T, F, + T, F, F, T, F, T, F, T, F, T, + F, T, T, F, F, T, F, T, T, F, + F, T, F, T, T, F, F, T, T, F, + F, T, F, T, T, F, F, T, T, F, + F, T, T, F, F, T, F, T, T, F, + F, T, F, T, T, F, F, T, T, F, + F, T, F, T, T, F, F, T, T, F, + F, T, F, T, T, F, F, T, T, F, + F, T, F, T, T, F, T, F, F, T, + T, F, F, T, F, T, F, T, F, T, + F, T, F, T, T, F, T, F, F, T, + F, T, F, T, T, F, T, F, F, T, + F, T, F, T, T, F, F, T, T, F, + F, T, F, T, T, F, T, F, F, T, + F, T, T, F, F, T, F, T, T, F, + F, T, T, F, F, T, F, T, T, F, + F, T, T, F, F, T, T, F, F, T, + F, T, T, F, F, T, F, T, T, F, + T, F, F, T, F, T, F, T, F, T, + F, T, T, F, F, T, T, F, F, T, + F, T, T, F, F, T, F, T, T, F, + F, T, T, F, F, T, T, F, F, T, + F, T, T, F, F, T, F, T, T, F, + F, T, T, F, F, T, F, T, T, F, + F, T, T, F, F, T, T, F, F, T, + F, T, T, F, F, T, F, T, T, F, + F, T, F, T, T, F, F, T, T, F, + T, F, F, T, F, T, F, T, F, T, + F, T, T, F, F, T, F, T, T, F, + F, T, T, F, F, T, T, F, F, T, + F, T, F, T, T, F, F, T, T, F, + F, T, F, T, T, F, F, T, T, F, + F, T, F, T, T, F, T, F, F, T, + F, T, T, F, F, T, T, F, F, T, + F, T, F, T, T, F, T, F, F, T, + F, T, F, T, T, F, T, F, F, T, + T, F, F, T, F, T, F, T, F, T, + F, T, F, T, T, F, T, F, F, T, + F, T, T, F, F, T, F, T, T, F, + F, T, T, F, F, T, F, T, T, F, + F, T, T, F, F, T, T, F, F, T, + F, T, T, F, F, T, F, T, T, F, + F, T, F, T, T, F, F, T, T, F, + F, T, F, T, T, F, F, T, T, F, + F, T, T, F, F, T, F, T, T, F, + T, F, F, T, F, 
T, F, T, F, T +}; + +main () +{ + int i, j, *res = correct_results; + + for (i = 0; i < 8; i++) + { + long long arg0 = args[i]; + for (j = 0; j < 8; j++) + { + long long arg1 = args[j]; + + if (feq (arg0, arg1) != *res++) + abort (); + if (fne (arg0, arg1) != *res++) + abort (); + if (flt (arg0, arg1) != *res++) + abort (); + if (fge (arg0, arg1) != *res++) + abort (); + if (fgt (arg0, arg1) != *res++) + abort (); + if (fle (arg0, arg1) != *res++) + abort (); + if (fltu (arg0, arg1) != *res++) + abort (); + if (fgeu (arg0, arg1) != *res++) + abort (); + if (fgtu (arg0, arg1) != *res++) + abort (); + if (fleu (arg0, arg1) != *res++) + abort (); + } + } + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/cmpsf-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/cmpsf-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/cmpsf-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/cmpsf-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,161 @@ +#include + +#define F 140 +#define T 13 + +feq (float x, float y) +{ + if (x == y) + return T; + else + return F; +} + +fne (float x, float y) +{ + if (x != y) + return T; + else + return F; +} + +flt (float x, float y) +{ + if (x < y) + return T; + else + return F; +} + +fge (float x, float y) +{ + if (x >= y) + return T; + else + return F; +} + +fgt (float x, float y) +{ + if (x > y) + return T; + else + return F; +} + +fle (float x, float y) +{ + if (x <= y) + return T; + else + return F; +} + +float args[] = +{ + 0.0F, + 1.0F, + -1.0F, + __FLT_MAX__, + __FLT_MIN__, + 0.0000000000001F, + 123456789.0F, + -987654321.0F +}; + +int correct_results[] = +{ + T, F, F, T, F, T, + F, T, T, F, F, T, + F, T, F, T, T, F, + F, T, T, F, F, T, + F, T, T, F, F, T, + F, T, T, F, F, T, + F, T, T, F, F, T, + F, T, F, T, 
T, F, + F, T, F, T, T, F, + T, F, F, T, F, T, + F, T, F, T, T, F, + F, T, T, F, F, T, + F, T, F, T, T, F, + F, T, F, T, T, F, + F, T, T, F, F, T, + F, T, F, T, T, F, + F, T, T, F, F, T, + F, T, T, F, F, T, + T, F, F, T, F, T, + F, T, T, F, F, T, + F, T, T, F, F, T, + F, T, T, F, F, T, + F, T, T, F, F, T, + F, T, F, T, T, F, + F, T, F, T, T, F, + F, T, F, T, T, F, + F, T, F, T, T, F, + T, F, F, T, F, T, + F, T, F, T, T, F, + F, T, F, T, T, F, + F, T, F, T, T, F, + F, T, F, T, T, F, + F, T, F, T, T, F, + F, T, T, F, F, T, + F, T, F, T, T, F, + F, T, T, F, F, T, + T, F, F, T, F, T, + F, T, T, F, F, T, + F, T, T, F, F, T, + F, T, F, T, T, F, + F, T, F, T, T, F, + F, T, T, F, F, T, + F, T, F, T, T, F, + F, T, T, F, F, T, + F, T, F, T, T, F, + T, F, F, T, F, T, + F, T, T, F, F, T, + F, T, F, T, T, F, + F, T, F, T, T, F, + F, T, F, T, T, F, + F, T, F, T, T, F, + F, T, T, F, F, T, + F, T, F, T, T, F, + F, T, F, T, T, F, + T, F, F, T, F, T, + F, T, F, T, T, F, + F, T, T, F, F, T, + F, T, T, F, F, T, + F, T, T, F, F, T, + F, T, T, F, F, T, + F, T, T, F, F, T, + F, T, T, F, F, T, + F, T, T, F, F, T, + T, F, F, T, F, T, +}; + +int +main (void) +{ + int i, j, *res = correct_results; + + for (i = 0; i < 8; i++) + { + float arg0 = args[i]; + for (j = 0; j < 8; j++) + { + float arg1 = args[j]; + + if (feq (arg0, arg1) != *res++) + abort (); + if (fne (arg0, arg1) != *res++) + abort (); + if (flt (arg0, arg1) != *res++) + abort (); + if (fge (arg0, arg1) != *res++) + abort (); + if (fgt (arg0, arg1) != *res++) + abort (); + if (fle (arg0, arg1) != *res++) + abort (); + } + } + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/cmpsi-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/cmpsi-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/cmpsi-1.c (added) +++ 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/cmpsi-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,32 @@ +f1 (unsigned int x, unsigned int y) +{ + if (x == 0) + dummy (); + x -= y; + /* 0xfffffff2 < 0x80000000? */ + if (x < ~(~(unsigned int) 0 >> 1)) + abort (); + return x; +} + +f2 (unsigned long int x, unsigned long int y) +{ + if (x == 0) + dummy (); + x -= y; + /* 0xfffffff2 < 0x80000000? */ + if (x < ~(~(unsigned long int) 0 >> 1)) + abort (); + return x; +} + + +dummy () {} + +main () +{ + /* 0x7ffffff3 0x80000001 */ + f1 ((~(unsigned int) 0 >> 1) - 12, ~(~(unsigned int) 0 >> 1) + 1); + f2 ((~(unsigned long int) 0 >> 1) - 12, ~(~(unsigned long int) 0 >> 1) + 1); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/cmpsi-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/cmpsi-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/cmpsi-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/cmpsi-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,200 @@ +/* { dg-require-effective-target int32plus } */ +#define F 140 +#define T 13 + +feq (int x, int y) +{ + if (x == y) + return T; + else + return F; +} + +fne (int x, int y) +{ + if (x != y) + return T; + else + return F; +} + +flt (int x, int y) +{ + if (x < y) + return T; + else + return F; +} + +fge (int x, int y) +{ + if (x >= y) + return T; + else + return F; +} + +fgt (int x, int y) +{ + if (x > y) + return T; + else + return F; +} + +fle (int x, int y) +{ + if (x <= y) + return T; + else + return F; +} + +fltu (unsigned int x, unsigned int y) +{ + if (x < y) + return T; + else + return F; +} + +fgeu (unsigned int x, unsigned int y) +{ + if (x >= y) + return T; + else + return F; +} + +fgtu (unsigned int x, unsigned int y) +{ + if (x > y) + return T; + else + return F; +} + 
+fleu (unsigned int x, unsigned int y) +{ + if (x <= y) + return T; + else + return F; +} + +unsigned int args[] = +{ + 0L, + 1L, + -1L, + 0x7fffffffL, + 0x80000000L, + 0x80000001L, + 0x1A3F2373L, + 0x93850E92L +}; + +int correct_results[] = +{ + T, F, F, T, F, T, F, T, F, T, + F, T, T, F, F, T, T, F, F, T, + F, T, F, T, T, F, T, F, F, T, + F, T, T, F, F, T, T, F, F, T, + F, T, F, T, T, F, T, F, F, T, + F, T, F, T, T, F, T, F, F, T, + F, T, T, F, F, T, T, F, F, T, + F, T, F, T, T, F, T, F, F, T, + F, T, F, T, T, F, F, T, T, F, + T, F, F, T, F, T, F, T, F, T, + F, T, F, T, T, F, T, F, F, T, + F, T, T, F, F, T, T, F, F, T, + F, T, F, T, T, F, T, F, F, T, + F, T, F, T, T, F, T, F, F, T, + F, T, T, F, F, T, T, F, F, T, + F, T, F, T, T, F, T, F, F, T, + F, T, T, F, F, T, F, T, T, F, + F, T, T, F, F, T, F, T, T, F, + T, F, F, T, F, T, F, T, F, T, + F, T, T, F, F, T, F, T, T, F, + F, T, F, T, T, F, F, T, T, F, + F, T, F, T, T, F, F, T, T, F, + F, T, T, F, F, T, F, T, T, F, + F, T, F, T, T, F, F, T, T, F, + F, T, F, T, T, F, F, T, T, F, + F, T, F, T, T, F, F, T, T, F, + F, T, F, T, T, F, T, F, F, T, + T, F, F, T, F, T, F, T, F, T, + F, T, F, T, T, F, T, F, F, T, + F, T, F, T, T, F, T, F, F, T, + F, T, F, T, T, F, F, T, T, F, + F, T, F, T, T, F, T, F, F, T, + F, T, T, F, F, T, F, T, T, F, + F, T, T, F, F, T, F, T, T, F, + F, T, T, F, F, T, T, F, F, T, + F, T, T, F, F, T, F, T, T, F, + T, F, F, T, F, T, F, T, F, T, + F, T, T, F, F, T, T, F, F, T, + F, T, T, F, F, T, F, T, T, F, + F, T, T, F, F, T, T, F, F, T, + F, T, T, F, F, T, F, T, T, F, + F, T, T, F, F, T, F, T, T, F, + F, T, T, F, F, T, T, F, F, T, + F, T, T, F, F, T, F, T, T, F, + F, T, F, T, T, F, F, T, T, F, + T, F, F, T, F, T, F, T, F, T, + F, T, T, F, F, T, F, T, T, F, + F, T, T, F, F, T, T, F, F, T, + F, T, F, T, T, F, F, T, T, F, + F, T, F, T, T, F, F, T, T, F, + F, T, F, T, T, F, T, F, F, T, + F, T, T, F, F, T, T, F, F, T, + F, T, F, T, T, F, T, F, F, T, + F, T, F, T, T, F, T, F, F, T, + T, F, F, T, F, T, F, T, 
F, T, + F, T, F, T, T, F, T, F, F, T, + F, T, T, F, F, T, F, T, T, F, + F, T, T, F, F, T, F, T, T, F, + F, T, T, F, F, T, T, F, F, T, + F, T, T, F, F, T, F, T, T, F, + F, T, F, T, T, F, F, T, T, F, + F, T, F, T, T, F, F, T, T, F, + F, T, T, F, F, T, F, T, T, F, + T, F, F, T, F, T, F, T, F, T +}; + +int +main (void) +{ + int i, j, *res = correct_results; + + for (i = 0; i < 8; i++) + { + unsigned int arg0 = args[i]; + for (j = 0; j < 8; j++) + { + unsigned int arg1 = args[j]; + + if (feq (arg0, arg1) != *res++) + abort (); + if (fne (arg0, arg1) != *res++) + abort (); + if (flt (arg0, arg1) != *res++) + abort (); + if (fge (arg0, arg1) != *res++) + abort (); + if (fgt (arg0, arg1) != *res++) + abort (); + if (fle (arg0, arg1) != *res++) + abort (); + if (fltu (arg0, arg1) != *res++) + abort (); + if (fgeu (arg0, arg1) != *res++) + abort (); + if (fgtu (arg0, arg1) != *res++) + abort (); + if (fleu (arg0, arg1) != *res++) + abort (); + } + } + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/comp-goto-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/comp-goto-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/comp-goto-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/comp-goto-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,167 @@ +/* { dg-require-effective-target label_values } */ +/* { dg-require-stack-size "4000" } */ + +#include + +#if __INT_MAX__ >= 2147483647 +typedef unsigned int uint32; +typedef signed int sint32; + +typedef uint32 reg_t; + +typedef unsigned long int host_addr_t; +typedef uint32 target_addr_t; +typedef sint32 target_saddr_t; + +typedef union +{ + struct + { + signed int offset:18; + unsigned int ignore:4; + unsigned int s1:8; + int :2; + signed int simm:14; + unsigned int s3:8; + unsigned int s2:8; + int 
pad2:2; + } f1; + long long ll; + double d; +} insn_t; + +typedef struct +{ + target_addr_t vaddr_tag; + unsigned long int rigged_paddr; +} tlb_entry_t; + +typedef struct +{ + insn_t *pc; + reg_t registers[256]; + insn_t *program; + tlb_entry_t tlb_tab[0x100]; +} environment_t; + +enum operations +{ + LOAD32_RR, + METAOP_DONE +}; + +host_addr_t +f () +{ + abort (); +} + +reg_t +simulator_kernel (int what, environment_t *env) +{ + register insn_t *pc = env->pc; + register reg_t *regs = env->registers; + register insn_t insn; + register int s1; + register reg_t r2; + register void *base_addr = &&sim_base_addr; + register tlb_entry_t *tlb = env->tlb_tab; + + if (what != 0) + { + int i; + static void *op_map[] = + { + &&L_LOAD32_RR, + &&L_METAOP_DONE, + }; + insn_t *program = env->program; + for (i = 0; i < what; i++) + program[i].f1.offset = op_map[program[i].f1.offset] - base_addr; + } + + sim_base_addr:; + + insn = *pc++; + r2 = (*(reg_t *) (((char *) regs) + (insn.f1.s2 << 2))); + s1 = (insn.f1.s1 << 2); + goto *(base_addr + insn.f1.offset); + + L_LOAD32_RR: + { + target_addr_t vaddr_page = r2 / 4096; + unsigned int x = vaddr_page % 0x100; + insn = *pc++; + + for (;;) + { + target_addr_t tag = tlb[x].vaddr_tag; + host_addr_t rigged_paddr = tlb[x].rigged_paddr; + + if (tag == vaddr_page) + { + *(reg_t *) (((char *) regs) + s1) = *(uint32 *) (rigged_paddr + r2); + r2 = *(reg_t *) (((char *) regs) + (insn.f1.s2 << 2)); + s1 = insn.f1.s1 << 2; + goto *(base_addr + insn.f1.offset); + } + + if (((target_saddr_t) tag < 0)) + { + *(reg_t *) (((char *) regs) + s1) = *(uint32 *) f (); + r2 = *(reg_t *) (((char *) regs) + (insn.f1.s2 << 2)); + s1 = insn.f1.s1 << 2; + goto *(base_addr + insn.f1.offset); + } + + x = (x - 1) % 0x100; + } + + L_METAOP_DONE: + return (*(reg_t *) (((char *) regs) + s1)); + } +} + +insn_t program[2 + 1]; + +void *malloc (); + +int +main () +{ + environment_t env; + insn_t insn; + int i, res; + host_addr_t a_page = (host_addr_t) malloc (2 * 4096); + 
target_addr_t a_vaddr = 0x123450; + target_addr_t vaddr_page = a_vaddr / 4096; + a_page = (a_page + 4096 - 1) & -4096; + + env.tlb_tab[((vaddr_page) % 0x100)].vaddr_tag = vaddr_page; + env.tlb_tab[((vaddr_page) % 0x100)].rigged_paddr = a_page - vaddr_page * 4096; + insn.f1.offset = LOAD32_RR; + env.registers[0] = 0; + env.registers[2] = a_vaddr; + *(sint32 *) (a_page + a_vaddr % 4096) = 88; + insn.f1.s1 = 0; + insn.f1.s2 = 2; + + for (i = 0; i < 2; i++) + program[i] = insn; + + insn.f1.offset = METAOP_DONE; + insn.f1.s1 = 0; + program[2] = insn; + + env.pc = program; + env.program = program; + + res = simulator_kernel (2 + 1, &env); + + if (res != 88) + abort (); + exit (0); +} +#else +main(){ exit (0); } +#endif Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/comp-goto-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/comp-goto-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/comp-goto-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/comp-goto-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,38 @@ +/* { dg-require-effective-target label_values } */ +/* { dg-require-effective-target trampolines } */ +/* { dg-add-options stack_size } */ + +/* A slight variation of 920501-7.c. 
*/ + +#ifdef STACK_SIZE +#define DEPTH ((STACK_SIZE) / 512 + 1) +#else +#define DEPTH 1000 +#endif + +x(a) +{ + __label__ xlab; + void y(a) + { + void *x = &&llab; + if (a==-1) + goto *x; + if (a==0) + goto xlab; + llab: + y (a-1); + } + y (a); + xlab:; + return a; +} + +main () +{ + + if (x (DEPTH) != DEPTH) + abort (); + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/compare-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/compare-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/compare-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/compare-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,118 @@ +/* Copyright (C) 2002 Free Software Foundation. + + Test for correctness of composite comparisons. + + Written by Roger Sayle, 3rd June 2002. */ + +extern void abort (void); + +int ieq (int x, int y, int ok) +{ + if ((x<=y) && (x>=y)) + { + if (!ok) abort (); + } + else + if (ok) abort (); + + if ((x<=y) && (x==y)) + { + if (!ok) abort (); + } + else + if (ok) abort (); + + if ((x<=y) && (y<=x)) + { + if (!ok) abort (); + } + else + if (ok) abort (); + + if ((y==x) && (x<=y)) + { + if (!ok) abort (); + } + else + if (ok) abort (); +} + +int ine (int x, int y, int ok) +{ + if ((x<y) || (x>y)) + { + if (!ok) abort (); + } + else + if (ok) abort (); +} + +int ilt (int x, int y, int ok) +{ + if ((x<y) && (x!=y)) + { + if (!ok) abort (); + } + else + if (ok) abort (); +} + +int ile (int x, int y, int ok) +{ + if ((x<y) || (x==y)) + { + if (!ok) abort (); + } + else + if (ok) abort (); +} + +int igt (int x, int y, int ok) +{ + if ((x>y) && (x!=y)) + { + if (!ok) abort (); + } + else + if (ok) abort (); +} + +int ige (int x, int y, int ok) +{ + if ((x>y) || (x==y)) + { + if (!ok) abort (); + } + else + if (ok) abort (); +} + +int +main () +{ + ieq (1, 4, 0); + ieq (3, 3, 1); + ieq (5, 2, 0); + + ine (1, 4, 1); + ine (3, 3, 0); + ine (5, 2, 1); + + ilt (1, 4, 1); + ilt (3, 3, 0); + ilt (5, 2, 0); + + ile (1, 4, 1); + ile (3, 3, 1); + ile (5, 2, 0); + + igt (1, 4, 0); + igt (3, 
3, 0); + igt (5, 2, 1); + + ige (1, 4, 0); + ige (3, 3, 1); + ige (5, 2, 1); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/compare-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/compare-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/compare-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/compare-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,23 @@ +/* Copyright (C) 2002 Free Software Foundation. + + Ensure that the composite comparison optimization doesn't misfire + and attempt to combine a signed comparison with an unsigned one. + + Written by Roger Sayle, 3rd June 2002. */ + +extern void abort (void); + +int +foo (int x, int y) +{ + /* If miscompiled the following may become "x == y". */ + return (x<=y) && ((unsigned int)x >= (unsigned int)y); +} + +int +main () +{ + if (! foo (-1,0)) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/compare-3.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/compare-3.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/compare-3.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/compare-3.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,85 @@ +/* Copyright (C) 2002 Free Software Foundation. + + Test for composite comparison always true/false optimization. + + Written by Roger Sayle, 7th June 2002. 
*/ + +extern void link_error0 (); +extern void link_error1 (); + +void +test1 (int x, int y) +{ + if ((x==y) && (x!=y)) + link_error0(); +} + +void +test2 (int x, int y) +{ + if ((xy)) + link_error0(); +} + +void +test3 (int x, int y) +{ + if ((x=y) || (x +#include + +int err; + +#define TEST(TYPE, FUNC) \ +__complex__ TYPE \ +ctest_ ## FUNC (__complex__ TYPE x) \ +{ \ + __complex__ TYPE res; \ + \ + res = ~x; \ + \ + return res; \ +} \ + \ +void \ +test_ ## FUNC (void) \ +{ \ + __complex__ TYPE res, x; \ + \ + x = 1.0 + 2.0i; \ + \ + res = ctest_ ## FUNC (x); \ + \ + if (res != 1.0 - 2.0i) \ + { \ + printf ("test_" #FUNC " failed\n"); \ + ++err; \ + } \ +} + + +TEST(float, float) +TEST(double, double) +TEST(long double, long_double) +TEST(int, int) +TEST(long int, long_int) + +int +main (void) +{ + + err = 0; + + test_float (); + test_double (); + test_long_double (); + test_int (); + test_long_int (); + + if (err != 0) + abort (); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/complex-7.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/complex-7.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/complex-7.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/complex-7.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,56 @@ +/* Test argument passing of complex values. The MIPS64 compiler had a + bug when they were split between registers and the stack. 
*/ +/* Origin: Joseph Myers */ + +volatile _Complex float f1 = 1.1f + 2.2if; +volatile _Complex float f2 = 3.3f + 4.4if; +volatile _Complex float f3 = 5.5f + 6.6if; +volatile _Complex float f4 = 7.7f + 8.8if; +volatile _Complex float f5 = 9.9f + 10.1if; +volatile _Complex double d1 = 1.1 + 2.2i; +volatile _Complex double d2 = 3.3 + 4.4i; +volatile _Complex double d3 = 5.5 + 6.6i; +volatile _Complex double d4 = 7.7 + 8.8i; +volatile _Complex double d5 = 9.9 + 10.1i; +volatile _Complex long double ld1 = 1.1L + 2.2iL; +volatile _Complex long double ld2 = 3.3L + 4.4iL; +volatile _Complex long double ld3 = 5.5L + 6.6iL; +volatile _Complex long double ld4 = 7.7L + 8.8iL; +volatile _Complex long double ld5 = 9.9L + 10.1iL; + +extern void abort (void); +extern void exit (int); + +__attribute__((noinline)) void +check_float (int a, _Complex float a1, _Complex float a2, + _Complex float a3, _Complex float a4, _Complex float a5) +{ + if (a1 != f1 || a2 != f2 || a3 != f3 || a4 != f4 || a5 != f5) + abort (); +} + +__attribute__((noinline)) void +check_double (int a, _Complex double a1, _Complex double a2, + _Complex double a3, _Complex double a4, _Complex double a5) +{ + if (a1 != d1 || a2 != d2 || a3 != d3 || a4 != d4 || a5 != d5) + abort (); +} + +__attribute__((noinline)) void +check_long_double (int a, _Complex long double a1, _Complex long double a2, + _Complex long double a3, _Complex long double a4, + _Complex long double a5) +{ + if (a1 != ld1 || a2 != ld2 || a3 != ld3 || a4 != ld4 || a5 != ld5) + abort (); +} + +int +main (void) +{ + check_float (0, f1, f2, f3, f4, f5); + check_double (0, d1, d2, d3, d4, d5); + check_long_double (0, ld1, ld2, ld3, ld4, ld5); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/compndlit-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/compndlit-1.c?rev=374156&view=auto 
============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/compndlit-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/compndlit-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,25 @@ +/* The bit-field below would have a problem if __INT_MAX__ is too + small. */ +#if __INT_MAX__ < 2147483647 +int +main (void) +{ + exit (0); +} +#else +struct S +{ + int a:3; + unsigned b:1, c:28; +}; + +struct S x = {1, 1, 1}; + +main () +{ + x = (struct S) {b:0, a:0, c:({ struct S o = x; o.a == 1 ? 10 : 20;})}; + if (x.c != 10) + abort (); + exit (0); +} +#endif Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/const-addr-expr-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/const-addr-expr-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/const-addr-expr-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/const-addr-expr-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,33 @@ +#include +#include +extern void abort(); + +typedef struct foo +{ + int uaattrid; + char *name; +} FOO; + +FOO Upgrade_items[] = +{ + {1, "1"}, + {2, "2"}, + {0, NULL} +}; + +int *Upgd_minor_ID = + (int *) &((Upgrade_items + 1)->uaattrid); + +int *Upgd_minor_ID1 = + (int *) &((Upgrade_items)->uaattrid); + +int +main(int argc, char **argv) +{ + if (*Upgd_minor_ID != 2) + abort(); + + if (*Upgd_minor_ID1 != 1) + abort(); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/conversion.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/conversion.c?rev=374156&view=auto ============================================================================== --- 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/conversion.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/conversion.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,551 @@ +/* Test front-end conversions, optimizer conversions, and run-time + conversions between different arithmetic types. + + Constants are specified in a non-obvious way to make them work for + any word size. Their value on a 32-bit machine is indicated in the + comments. + + Note that this code is NOT intended for testing of accuracy of fp + conversions. */ + +float +u2f(u) + unsigned int u; +{ + return u; +} + +double +u2d(u) + unsigned int u; +{ + return u; +} + +long double +u2ld(u) + unsigned int u; +{ + return u; +} + +float +s2f(s) + int s; +{ + return s; +} + +double +s2d(s) + int s; +{ + return s; +} + +long double +s2ld(s) + int s; +{ + return s; +} + +int +fnear (float x, float y) +{ + float t = x - y; + return t == 0 || x / t > 1000000.0; +} + +int +dnear (double x, double y) +{ + double t = x - y; + return t == 0 || x / t > 100000000000000.0; +} + +int +ldnear (long double x, long double y) +{ + long double t = x - y; + return t == 0 || x / t > 100000000000000000000000000000000.0; +} + +test_integer_to_float() +{ + if (u2f(0U) != (float) 0U) /* 0 */ + abort(); + if (!fnear (u2f(~0U), (float) ~0U)) /* 0xffffffff */ + abort(); + if (!fnear (u2f((~0U) >> 1), (float) ((~0U) >> 1))) /* 0x7fffffff */ + abort(); + if (u2f(~((~0U) >> 1)) != (float) ~((~0U) >> 1)) /* 0x80000000 */ + abort(); + + if (u2d(0U) != (double) 0U) /* 0 */ + abort(); + if (!dnear (u2d(~0U), (double) ~0U)) /* 0xffffffff */ + abort(); + if (!dnear (u2d((~0U) >> 1),(double) ((~0U) >> 1))) /* 0x7fffffff */ + abort(); + if (u2d(~((~0U) >> 1)) != (double) ~((~0U) >> 1)) /* 0x80000000 */ + abort(); + + if (u2ld(0U) != (long double) 0U) /* 0 */ + abort(); + if (!ldnear (u2ld(~0U), (long double) ~0U)) /* 0xffffffff */ + abort(); + if (!ldnear (u2ld((~0U) >> 1),(long double) ((~0U) >> 1))) /* 
0x7fffffff */ + abort(); + if (u2ld(~((~0U) >> 1)) != (long double) ~((~0U) >> 1)) /* 0x80000000 */ + abort(); + + if (s2f(0) != (float) 0) /* 0 */ + abort(); + if (!fnear (s2f(~0), (float) ~0)) /* 0xffffffff */ + abort(); + if (!fnear (s2f((int)((~0U) >> 1)), (float)(int)((~0U) >> 1))) /* 0x7fffffff */ + abort(); + if (s2f((int)(~((~0U) >> 1))) != (float)(int)~((~0U) >> 1)) /* 0x80000000 */ + abort(); + + if (s2d(0) != (double) 0) /* 0 */ + abort(); + if (!dnear (s2d(~0), (double) ~0)) /* 0xffffffff */ + abort(); + if (!dnear (s2d((int)((~0U) >> 1)), (double)(int)((~0U) >> 1))) /* 0x7fffffff */ + abort(); + if (s2d((int)~((~0U) >> 1)) != (double)(int)~((~0U) >> 1)) /* 0x80000000 */ + abort(); + + if (s2ld(0) != (long double) 0) /* 0 */ + abort(); + if (!ldnear (s2ld(~0), (long double) ~0)) /* 0xffffffff */ + abort(); + if (!ldnear (s2ld((int)((~0U) >> 1)), (long double)(int)((~0U) >> 1))) /* 0x7fffffff */ + abort(); + if (s2ld((int)~((~0U) >> 1)) != (long double)(int)~((~0U) >> 1)) /* 0x80000000 */ + abort(); +} + +#if __GNUC__ +float +ull2f(u) + unsigned long long int u; +{ + return u; +} + +double +ull2d(u) + unsigned long long int u; +{ + return u; +} + +long double +ull2ld(u) + unsigned long long int u; +{ + return u; +} + +float +sll2f(s) + long long int s; +{ + return s; +} + +double +sll2d(s) + long long int s; +{ + return s; +} + +long double +sll2ld(s) + long long int s; +{ + return s; +} + +test_longlong_integer_to_float() +{ + if (ull2f(0ULL) != (float) 0ULL) /* 0 */ + abort(); + if (ull2f(~0ULL) != (float) ~0ULL) /* 0xffffffff */ + abort(); + if (ull2f((~0ULL) >> 1) != (float) ((~0ULL) >> 1)) /* 0x7fffffff */ + abort(); + if (ull2f(~((~0ULL) >> 1)) != (float) ~((~0ULL) >> 1)) /* 0x80000000 */ + abort(); + + if (ull2d(0ULL) != (double) 0ULL) /* 0 */ + abort(); +#if __HAVE_68881__ + /* Some 68881 targets return values in fp0, with excess precision. + But the compile-time conversion to double works correctly. */ + if (! 
dnear (ull2d(~0ULL), (double) ~0ULL)) /* 0xffffffff */ + abort(); + if (! dnear (ull2d((~0ULL) >> 1), (double) ((~0ULL) >> 1))) /* 0x7fffffff */ + abort(); +#else + if (ull2d(~0ULL) != (double) ~0ULL) /* 0xffffffff */ + abort(); + if (ull2d((~0ULL) >> 1) != (double) ((~0ULL) >> 1)) /* 0x7fffffff */ + abort(); +#endif + if (ull2d(~((~0ULL) >> 1)) != (double) ~((~0ULL) >> 1)) /* 0x80000000 */ + abort(); + + if (ull2ld(0ULL) != (long double) 0ULL) /* 0 */ + abort(); + if (ull2ld(~0ULL) != (long double) ~0ULL) /* 0xffffffff */ + abort(); + if (ull2ld((~0ULL) >> 1) != (long double) ((~0ULL) >> 1)) /* 0x7fffffff */ + abort(); + if (ull2ld(~((~0ULL) >> 1)) != (long double) ~((~0ULL) >> 1)) /* 0x80000000 */ + abort(); + + if (sll2f(0LL) != (float) 0LL) /* 0 */ + abort(); + if (sll2f(~0LL) != (float) ~0LL) /* 0xffffffff */ + abort(); + if (! fnear (sll2f((long long int)((~0ULL) >> 1)), (float)(long long int)((~0ULL) >> 1))) /* 0x7fffffff */ + abort(); + if (sll2f((long long int)(~((~0ULL) >> 1))) != (float)(long long int)~((~0ULL) >> 1)) /* 0x80000000 */ + abort(); + + if (sll2d(0LL) != (double) 0LL) /* 0 */ + abort(); + if (sll2d(~0LL) != (double) ~0LL) /* 0xffffffff */ + abort(); + if (!dnear (sll2d((long long int)((~0ULL) >> 1)), (double)(long long int)((~0ULL) >> 1))) /* 0x7fffffff */ + abort(); + if (! dnear (sll2d((long long int)~((~0ULL) >> 1)), (double)(long long int)~((~0ULL) >> 1))) /* 0x80000000 */ + abort(); + + if (sll2ld(0LL) != (long double) 0LL) /* 0 */ + abort(); + if (sll2ld(~0LL) != (long double) ~0LL) /* 0xffffffff */ + abort(); + if (!ldnear (sll2ld((long long int)((~0ULL) >> 1)), (long double)(long long int)((~0ULL) >> 1))) /* 0x7fffffff */ + abort(); + if (! 
ldnear (sll2ld((long long int)~((~0ULL) >> 1)), (long double)(long long int)~((~0ULL) >> 1))) /* 0x80000000 */ + abort(); +} +#endif + +unsigned int +f2u(float f) +{ + return (unsigned) f; +} + +unsigned int +d2u(double d) +{ + return (unsigned) d; +} + +unsigned int +ld2u(long double d) +{ + return (unsigned) d; +} + +int +f2s(float f) +{ + return (int) f; +} + +int +d2s(double d) +{ + return (int) d; +} + +int +ld2s(long double d) +{ + return (int) d; +} + +test_float_to_integer() +{ + if (f2u(0.0) != 0) + abort(); + if (f2u(0.999) != 0) + abort(); + if (f2u(1.0) != 1) + abort(); + if (f2u(1.99) != 1) + abort(); +#ifdef __SPU__ + /* SPU float rounds towards zero. */ + if (f2u((float) ((~0U) >> 1)) != 0x7fffff80) + abort(); +#else + if (f2u((float) ((~0U) >> 1)) != (~0U) >> 1 && /* 0x7fffffff */ + f2u((float) ((~0U) >> 1)) != ((~0U) >> 1) + 1) + abort(); +#endif + if (f2u((float) ~((~0U) >> 1)) != ~((~0U) >> 1)) /* 0x80000000 */ + abort(); + + /* These tests require double precision, so for hosts that don't offer + that much precision, just ignore these test. */ + if (sizeof (double) >= 8) { + if (d2u(0.0) != 0) + abort(); + if (d2u(0.999) != 0) + abort(); + if (d2u(1.0) != 1) + abort(); + if (d2u(1.99) != 1) + abort(); + if (d2u((double) (~0U)) != ~0U) /* 0xffffffff */ + abort(); + if (d2u((double) ((~0U) >> 1)) != (~0U) >> 1) /* 0x7fffffff */ + abort(); + if (d2u((double) ~((~0U) >> 1)) != ~((~0U) >> 1)) /* 0x80000000 */ + abort(); + } + + /* These tests require long double precision, so for hosts that don't offer + that much precision, just ignore these test. 
*/ + if (sizeof (long double) >= 8) { + if (ld2u(0.0) != 0) + abort(); + if (ld2u(0.999) != 0) + abort(); + if (ld2u(1.0) != 1) + abort(); + if (ld2u(1.99) != 1) + abort(); + if (ld2u((long double) (~0U)) != ~0U) /* 0xffffffff */ + abort(); + if (ld2u((long double) ((~0U) >> 1)) != (~0U) >> 1) /* 0x7fffffff */ + abort(); + if (ld2u((long double) ~((~0U) >> 1)) != ~((~0U) >> 1)) /* 0x80000000 */ + abort(); + } + + if (f2s(0.0) != 0) + abort(); + if (f2s(0.999) != 0) + abort(); + if (f2s(1.0) != 1) + abort(); + if (f2s(1.99) != 1) + abort(); + if (f2s(-0.999) != 0) + abort(); + if (f2s(-1.0) != -1) + abort(); + if (f2s(-1.99) != -1) + abort(); + if (f2s((float)(int)~((~0U) >> 1)) != (int)~((~0U) >> 1)) /* 0x80000000 */ + abort(); + + /* These tests require double precision, so for hosts that don't offer + that much precision, just ignore these test. */ + if (sizeof (double) >= 8) { + if (d2s(0.0) != 0) + abort(); + if (d2s(0.999) != 0) + abort(); + if (d2s(1.0) != 1) + abort(); + if (d2s(1.99) != 1) + abort(); + if (d2s(-0.999) != 0) + abort(); + if (d2s(-1.0) != -1) + abort(); + if (d2s(-1.99) != -1) + abort(); + if (d2s((double) ((~0U) >> 1)) != (~0U) >> 1) /* 0x7fffffff */ + abort(); + if (d2s((double)(int)~((~0U) >> 1)) != (int)~((~0U) >> 1)) /* 0x80000000 */ + abort(); + } + + /* These tests require long double precision, so for hosts that don't offer + that much precision, just ignore these test. 
*/ + if (sizeof (long double) >= 8) { + if (ld2s(0.0) != 0) + abort(); + if (ld2s(0.999) != 0) + abort(); + if (ld2s(1.0) != 1) + abort(); + if (ld2s(1.99) != 1) + abort(); + if (ld2s(-0.999) != 0) + abort(); + if (ld2s(-1.0) != -1) + abort(); + if (ld2s(-1.99) != -1) + abort(); + if (ld2s((long double) ((~0U) >> 1)) != (~0U) >> 1) /* 0x7fffffff */ + abort(); + if (ld2s((long double)(int)~((~0U) >> 1)) != (int)~((~0U) >> 1)) /* 0x80000000 */ + abort(); + } +} + +#if __GNUC__ +unsigned long long int +f2ull(float f) +{ + return (unsigned long long int) f; +} + +unsigned long long int +d2ull(double d) +{ + return (unsigned long long int) d; +} + +unsigned long long int +ld2ull(long double d) +{ + return (unsigned long long int) d; +} + +long long int +f2sll(float f) +{ + return (long long int) f; +} + +long long int +d2sll(double d) +{ + return (long long int) d; +} + +long long int +ld2sll(long double d) +{ + return (long long int) d; +} + +test_float_to_longlong_integer() +{ + if (f2ull(0.0) != 0LL) + abort(); + if (f2ull(0.999) != 0LL) + abort(); + if (f2ull(1.0) != 1LL) + abort(); + if (f2ull(1.99) != 1LL) + abort(); +#ifdef __SPU__ + /* SPU float rounds towards zero. 
*/ + if (f2ull((float) ((~0ULL) >> 1)) != 0x7fffff8000000000ULL) + abort(); +#else + if (f2ull((float) ((~0ULL) >> 1)) != (~0ULL) >> 1 && /* 0x7fffffff */ + f2ull((float) ((~0ULL) >> 1)) != ((~0ULL) >> 1) + 1) + abort(); +#endif + if (f2ull((float) ~((~0ULL) >> 1)) != ~((~0ULL) >> 1)) /* 0x80000000 */ + abort(); + + if (d2ull(0.0) != 0LL) + abort(); + if (d2ull(0.999) != 0LL) + abort(); + if (d2ull(1.0) != 1LL) + abort(); + if (d2ull(1.99) != 1LL) + abort(); + if (d2ull((double) ((~0ULL) >> 1)) != (~0ULL) >> 1 && /* 0x7fffffff */ + d2ull((double) ((~0ULL) >> 1)) != ((~0ULL) >> 1) + 1) + abort(); + if (d2ull((double) ~((~0ULL) >> 1)) != ~((~0ULL) >> 1)) /* 0x80000000 */ + abort(); + + if (ld2ull(0.0) != 0LL) + abort(); + if (ld2ull(0.999) != 0LL) + abort(); + if (ld2ull(1.0) != 1LL) + abort(); + if (ld2ull(1.99) != 1LL) + abort(); + if (ld2ull((long double) ((~0ULL) >> 1)) != (~0ULL) >> 1 && /* 0x7fffffff */ + ld2ull((long double) ((~0ULL) >> 1)) != ((~0ULL) >> 1) + 1) + abort(); + if (ld2ull((long double) ~((~0ULL) >> 1)) != ~((~0ULL) >> 1)) /* 0x80000000 */ + abort(); + + + if (f2sll(0.0) != 0LL) + abort(); + if (f2sll(0.999) != 0LL) + abort(); + if (f2sll(1.0) != 1LL) + abort(); + if (f2sll(1.99) != 1LL) + abort(); + if (f2sll(-0.999) != 0LL) + abort(); + if (f2sll(-1.0) != -1LL) + abort(); + if (f2sll(-1.99) != -1LL) + abort(); + if (f2sll((float)(long long int)~((~0ULL) >> 1)) != (long long int)~((~0ULL) >> 1)) /* 0x80000000 */ + abort(); + + if (d2sll(0.0) != 0LL) + abort(); + if (d2sll(0.999) != 0LL) + abort(); + if (d2sll(1.0) != 1LL) + abort(); + if (d2sll(1.99) != 1LL) + abort(); + if (d2sll(-0.999) != 0LL) + abort(); + if (d2sll(-1.0) != -1LL) + abort(); + if (d2sll(-1.99) != -1LL) + abort(); + if (d2sll((double)(long long int)~((~0ULL) >> 1)) != (long long int)~((~0ULL) >> 1)) /* 0x80000000 */ + abort(); + + if (ld2sll(0.0) != 0LL) + abort(); + if (ld2sll(0.999) != 0LL) + abort(); + if (ld2sll(1.0) != 1LL) + abort(); + if (ld2sll(1.99) != 1LL) + abort(); 
+ if (ld2sll(-0.999) != 0LL) + abort(); + if (ld2sll(-1.0) != -1LL) + abort(); + if (ld2sll(-1.99) != -1LL) + abort(); + if (ld2sll((long double)(long long int)~((~0ULL) >> 1)) != (long long int)~((~0ULL) >> 1)) /* 0x80000000 */ + abort(); +} +#endif + +main() +{ + test_integer_to_float(); + test_float_to_integer(); +#if __GNUC__ + test_longlong_integer_to_float(); + test_float_to_longlong_integer(); +#endif + exit(0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/cvt-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/cvt-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/cvt-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/cvt-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,28 @@ +static inline long +g1 (double x) +{ + return (double) (long) x; +} + +long +g2 (double f) +{ + return f; +} + +double +f (long i) +{ + if (g1 (i) != g2 (i)) + abort (); + return g2 (i); +} + +main () +{ + if (f (123456789L) != 123456789L) + abort (); + if (f (123456789L) != g2 (123456789L)) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/dbra-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/dbra-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/dbra-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/dbra-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,95 @@ +f1 (a) + long a; +{ + int i; + for (i = 0; i < 10; i++) + { + if (--a == -1) + return i; + } + return -1; +} + +f2 (a) + long a; +{ + int i; + for (i = 0; i < 10; i++) + { + if (--a != -1) + return i; + } + return -1; +} + +f3 (a) + long a; +{ + int i; 
+ for (i = 0; i < 10; i++) + { + if (--a == 0) + return i; + } + return -1; +} + +f4 (a) + long a; +{ + int i; + for (i = 0; i < 10; i++) + { + if (--a != 0) + return i; + } + return -1; +} + +f5 (a) + long a; +{ + int i; + for (i = 0; i < 10; i++) + { + if (++a == 0) + return i; + } + return -1; +} + +f6 (a) + long a; +{ + int i; + for (i = 0; i < 10; i++) + { + if (++a != 0) + return i; + } + return -1; +} + + +main() +{ + if (f1 (5L) != 5) + abort (); + if (f2 (1L) != 0) + abort (); + if (f2 (0L) != 1) + abort (); + if (f3 (5L) != 4) + abort (); + if (f4 (1L) != 1) + abort (); + if (f4 (0L) != 0) + abort (); + if (f5 (-5L) != 4) + abort (); + if (f6 (-1L) != 1) + abort (); + if (f6 (0L) != 0) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/divcmp-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/divcmp-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/divcmp-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/divcmp-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,355 @@ +extern void abort(void); + +int test1(int x) +{ + return x/10 == 2; +} + +int test1u(unsigned int x) +{ + return x/10U == 2; +} + +int test2(int x) +{ + return x/10 == 0; +} + +int test2u(unsigned int x) +{ + return x/10U == 0; +} + +int test3(int x) +{ + return x/10 != 2; +} + +int test3u(unsigned int x) +{ + return x/10U != 2; +} + +int test4(int x) +{ + return x/10 != 0; +} + +int test4u(unsigned int x) +{ + return x/10U != 0; +} + +int test5(int x) +{ + return x/10 < 2; +} + +int test5u(unsigned int x) +{ + return x/10U < 2; +} + +int test6(int x) +{ + return x/10 < 0; +} + +int test7(int x) +{ + return x/10 <= 2; +} + +int test7u(unsigned int x) +{ + return x/10U <= 2; +} + +int test8(int x) +{ + return x/10 <= 0; +} + +int test8u(unsigned 
int x) +{ + return x/10U <= 0; +} + +int test9(int x) +{ + return x/10 > 2; +} + +int test9u(unsigned int x) +{ + return x/10U > 2; +} + +int test10(int x) +{ + return x/10 > 0; +} + +int test10u(unsigned int x) +{ + return x/10U > 0; +} + +int test11(int x) +{ + return x/10 >= 2; +} + +int test11u(unsigned int x) +{ + return x/10U >= 2; +} + +int test12(int x) +{ + return x/10 >= 0; +} + + +int main() +{ + if (test1(19) != 0) + abort (); + if (test1(20) != 1) + abort (); + if (test1(29) != 1) + abort (); + if (test1(30) != 0) + abort (); + + if (test1u(19) != 0) + abort (); + if (test1u(20) != 1) + abort (); + if (test1u(29) != 1) + abort (); + if (test1u(30) != 0) + abort (); + + if (test2(0) != 1) + abort (); + if (test2(9) != 1) + abort (); + if (test2(10) != 0) + abort (); + if (test2(-1) != 1) + abort (); + if (test2(-9) != 1) + abort (); + if (test2(-10) != 0) + abort (); + + if (test2u(0) != 1) + abort (); + if (test2u(9) != 1) + abort (); + if (test2u(10) != 0) + abort (); + if (test2u(-1) != 0) + abort (); + if (test2u(-9) != 0) + abort (); + if (test2u(-10) != 0) + abort (); + + if (test3(19) != 1) + abort (); + if (test3(20) != 0) + abort (); + if (test3(29) != 0) + abort (); + if (test3(30) != 1) + abort (); + + if (test3u(19) != 1) + abort (); + if (test3u(20) != 0) + abort (); + if (test3u(29) != 0) + abort (); + if (test3u(30) != 1) + abort (); + + if (test4(0) != 0) + abort (); + if (test4(9) != 0) + abort (); + if (test4(10) != 1) + abort (); + if (test4(-1) != 0) + abort (); + if (test4(-9) != 0) + abort (); + if (test4(-10) != 1) + abort (); + + if (test4u(0) != 0) + abort (); + if (test4u(9) != 0) + abort (); + if (test4u(10) != 1) + abort (); + if (test4u(-1) != 1) + abort (); + if (test4u(-9) != 1) + abort (); + if (test4u(-10) != 1) + abort (); + + if (test5(19) != 1) + abort (); + if (test5(20) != 0) + abort (); + if (test5(29) != 0) + abort (); + if (test5(30) != 0) + abort (); + + if (test5u(19) != 1) + abort (); + if (test5u(20) != 0) + 
abort (); + if (test5u(29) != 0) + abort (); + if (test5u(30) != 0) + abort (); + + if (test6(0) != 0) + abort (); + if (test6(9) != 0) + abort (); + if (test6(10) != 0) + abort (); + if (test6(-1) != 0) + abort (); + if (test6(-9) != 0) + abort (); + if (test6(-10) != 1) + abort (); + + if (test7(19) != 1) + abort (); + if (test7(20) != 1) + abort (); + if (test7(29) != 1) + abort (); + if (test7(30) != 0) + abort (); + + if (test7u(19) != 1) + abort (); + if (test7u(20) != 1) + abort (); + if (test7u(29) != 1) + abort (); + if (test7u(30) != 0) + abort (); + + if (test8(0) != 1) + abort (); + if (test8(9) != 1) + abort (); + if (test8(10) != 0) + abort (); + if (test8(-1) != 1) + abort (); + if (test8(-9) != 1) + abort (); + if (test8(-10) != 1) + abort (); + + if (test8u(0) != 1) + abort (); + if (test8u(9) != 1) + abort (); + if (test8u(10) != 0) + abort (); + if (test8u(-1) != 0) + abort (); + if (test8u(-9) != 0) + abort (); + if (test8u(-10) != 0) + abort (); + + if (test9(19) != 0) + abort (); + if (test9(20) != 0) + abort (); + if (test9(29) != 0) + abort (); + if (test9(30) != 1) + abort (); + + if (test9u(19) != 0) + abort (); + if (test9u(20) != 0) + abort (); + if (test9u(29) != 0) + abort (); + if (test9u(30) != 1) + abort (); + + if (test10(0) != 0) + abort (); + if (test10(9) != 0) + abort (); + if (test10(10) != 1) + abort (); + if (test10(-1) != 0) + abort (); + if (test10(-9) != 0) + abort (); + if (test10(-10) != 0) + abort (); + + if (test10u(0) != 0) + abort (); + if (test10u(9) != 0) + abort (); + if (test10u(10) != 1) + abort (); + if (test10u(-1) != 1) + abort (); + if (test10u(-9) != 1) + abort (); + if (test10u(-10) != 1) + abort (); + + if (test11(19) != 0) + abort (); + if (test11(20) != 1) + abort (); + if (test11(29) != 1) + abort (); + if (test11(30) != 1) + abort (); + + if (test11u(19) != 0) + abort (); + if (test11u(20) != 1) + abort (); + if (test11u(29) != 1) + abort (); + if (test11u(30) != 1) + abort (); + + if (test12(0) != 
1) + abort (); + if (test12(9) != 1) + abort (); + if (test12(10) != 1) + abort (); + if (test12(-1) != 1) + abort (); + if (test12(-9) != 1) + abort (); + if (test12(-10) != 0) + abort (); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/divcmp-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/divcmp-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/divcmp-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/divcmp-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,91 @@ +extern void abort (void); + +int test1(int x) +{ + return x/10 == 2; +} + +int test2(int x) +{ + return x/10 == 0; +} + +int test3(int x) +{ + return x/10 == -2; +} + +int test4(int x) +{ + return x/-10 == 2; +} + +int test5(int x) +{ + return x/-10 == 0; +} + +int test6(int x) +{ + return x/-10 == -2; +} + + +int main() +{ + if (test1(19) != 0) + abort (); + if (test1(20) != 1) + abort (); + if (test1(29) != 1) + abort (); + if (test1(30) != 0) + abort (); + + if (test2(-10) != 0) + abort (); + if (test2(-9) != 1) + abort (); + if (test2(9) != 1) + abort (); + if (test2(10) != 0) + abort (); + + if (test3(-30) != 0) + abort (); + if (test3(-29) != 1) + abort (); + if (test3(-20) != 1) + abort (); + if (test3(-19) != 0) + abort (); + + if (test4(-30) != 0) + abort (); + if (test4(-29) != 1) + abort (); + if (test4(-20) != 1) + abort (); + if (test4(-19) != 0) + abort (); + + if (test5(-10) != 0) + abort (); + if (test5(-9) != 1) + abort (); + if (test5(9) != 1) + abort (); + if (test5(10) != 0) + abort (); + + if (test6(19) != 0) + abort (); + if (test6(20) != 1) + abort (); + if (test6(29) != 1) + abort (); + if (test6(30) != 0) + abort (); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/divcmp-3.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/divcmp-3.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/divcmp-3.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/divcmp-3.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,96 @@ +extern void abort(void); + +int test1(char x) +{ + return x/100 == 3; +} + +int test1u(unsigned char x) +{ + return x/100 == 3; +} + +int test2(char x) +{ + return x/100 != 3; +} + +int test2u(unsigned char x) +{ + return x/100 != 3; +} + +int test3(char x) +{ + return x/100 < 3; +} + +int test3u(unsigned char x) +{ + return x/100 < 3; +} + +int test4(char x) +{ + return x/100 <= 3; +} + +int test4u(unsigned char x) +{ + return x/100 <= 3; +} + +int test5(char x) +{ + return x/100 > 3; +} + +int test5u(unsigned char x) +{ + return x/100 > 3; +} + +int test6(char x) +{ + return x/100 >= 3; +} + +int test6u(unsigned char x) +{ + return x/100 >= 3; +} + + +int main() +{ + int c; + + for (c=-128; c<256; c++) + { + if (test1(c) != 0) + abort (); + if (test1u(c) != 0) + abort (); + if (test2(c) != 1) + abort (); + if (test2u(c) != 1) + abort (); + if (test3(c) != 1) + abort (); + if (test3u(c) != 1) + abort (); + if (test4(c) != 1) + abort (); + if (test4u(c) != 1) + abort (); + if (test5(c) != 0) + abort (); + if (test5u(c) != 0) + abort (); + if (test6(c) != 0) + abort (); + if (test6u(c) != 0) + abort (); + } + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/divcmp-4.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/divcmp-4.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/divcmp-4.c (added) +++ 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/divcmp-4.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,201 @@ +/* PR middle-end/17894 */ + +extern void abort(void); + +int test1(int x) +{ + return x/-10 == 2; +} + +int test2(int x) +{ + return x/-10 == 0; +} + +int test3(int x) +{ + return x/-10 != 2; +} + +int test4(int x) +{ + return x/-10 != 0; +} + +int test5(int x) +{ + return x/-10 < 2; +} + +int test6(int x) +{ + return x/-10 < 0; +} + +int test7(int x) +{ + return x/-10 <= 2; +} + +int test8(int x) +{ + return x/-10 <= 0; +} + +int test9(int x) +{ + return x/-10 > 2; +} + +int test10(int x) +{ + return x/-10 > 0; +} + +int test11(int x) +{ + return x/-10 >= 2; +} + +int test12(int x) +{ + return x/-10 >= 0; +} + + +int main() +{ + if (test1(-30) != 0) + abort (); + if (test1(-29) != 1) + abort (); + if (test1(-20) != 1) + abort (); + if (test1(-19) != 0) + abort (); + + if (test2(0) != 1) + abort (); + if (test2(9) != 1) + abort (); + if (test2(10) != 0) + abort (); + if (test2(-1) != 1) + abort (); + if (test2(-9) != 1) + abort (); + if (test2(-10) != 0) + abort (); + + if (test3(-30) != 1) + abort (); + if (test3(-29) != 0) + abort (); + if (test3(-20) != 0) + abort (); + if (test3(-19) != 1) + abort (); + + if (test4(0) != 0) + abort (); + if (test4(9) != 0) + abort (); + if (test4(10) != 1) + abort (); + if (test4(-1) != 0) + abort (); + if (test4(-9) != 0) + abort (); + if (test4(-10) != 1) + abort (); + + if (test5(-30) != 0) + abort (); + if (test5(-29) != 0) + abort (); + if (test5(-20) != 0) + abort (); + if (test5(-19) != 1) + abort (); + + if (test6(0) != 0) + abort (); + if (test6(9) != 0) + abort (); + if (test6(10) != 1) + abort (); + if (test6(-1) != 0) + abort (); + if (test6(-9) != 0) + abort (); + if (test6(-10) != 0) + abort (); + + if (test7(-30) != 0) + abort (); + if (test7(-29) != 1) + abort (); + if (test7(-20) != 1) + abort (); + if (test7(-19) != 1) + abort (); + + if (test8(0) != 1) + abort (); + if (test8(9) != 1) + abort 
(); + if (test8(10) != 1) + abort (); + if (test8(-1) != 1) + abort (); + if (test8(-9) != 1) + abort (); + if (test8(-10) != 0) + abort (); + + if (test9(-30) != 1) + abort (); + if (test9(-29) != 0) + abort (); + if (test9(-20) != 0) + abort (); + if (test9(-19) != 0) + abort (); + + if (test10(0) != 0) + abort (); + if (test10(9) != 0) + abort (); + if (test10(10) != 0) + abort (); + if (test10(-1) != 0) + abort (); + if (test10(-9) != 0) + abort (); + if (test10(-10) != 1) + abort (); + + if (test11(-30) != 1) + abort (); + if (test11(-29) != 1) + abort (); + if (test11(-20) != 1) + abort (); + if (test11(-19) != 0) + abort (); + + if (test12(0) != 1) + abort (); + if (test12(9) != 1) + abort (); + if (test12(10) != 0) + abort (); + if (test12(-1) != 1) + abort (); + if (test12(-9) != 1) + abort (); + if (test12(-10) != 1) + abort (); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/divcmp-5.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/divcmp-5.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/divcmp-5.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/divcmp-5.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,30 @@ +/* PR middle-end/26561 */ + +extern void abort(void); + +int always_one_1 (int a) +{ + if (a/100 >= -999999999) + return 1; + else + return 0; +} + +int always_one_2 (int a) +{ + if (a/100 < -999999999) + return 0; + else + return 1; +} + +int main(void) +{ + if (always_one_1 (0) != 1) + abort (); + + if (always_one_2 (0) != 1) + abort (); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/divconst-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/divconst-1.c?rev=374156&view=auto 
============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/divconst-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/divconst-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,21 @@ +typedef struct +{ + unsigned a, b, c, d; +} t1; + +f (t1 *ps) +{ + ps->a = 10000; + ps->b = ps->a / 3; + ps->c = 10000; + ps->d = ps->c / 3; +} + +main () +{ + t1 s; + f (&s); + if (s.a != 10000 || s.b != 3333 || s.c != 10000 || s.d != 3333) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/divconst-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/divconst-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/divconst-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/divconst-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,39 @@ +long +f (long x) +{ + return x / (-0x7fffffffL - 1L); +} + +long +r (long x) +{ + return x % (-0x7fffffffL - 1L); +} + +/* Since we have a negative divisor, this equation must hold for the + results of / and %; no specific results are guaranteed. */ +long +std_eqn (long num, long denom, long quot, long rem) +{ + /* For completeness, a check for "ABS (rem) < ABS (denom)" belongs here, + but causes trouble on 32-bit machines and isn't worthwhile. 
*/ + return quot * (-0x7fffffffL - 1L) + rem == num; +} + +long nums[] = +{ + -1L, 0x7fffffffL, -0x7fffffffL - 1L +}; + +main () +{ + int i; + + for (i = 0; + i < sizeof (nums) / sizeof (nums[0]); + i++) + if (std_eqn (nums[i], -0x7fffffffL - 1L, f (nums[i]), r (nums[i])) == 0) + abort (); + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/divconst-3.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/divconst-3.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/divconst-3.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/divconst-3.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,12 @@ +long long +f (long long x) +{ + return x / 10000000000LL; +} + +main () +{ + if (f (10000000000LL) != 1 || f (100000000000LL) != 10) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/divmod-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/divmod-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/divmod-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/divmod-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,77 @@ +div1 (signed char x) +{ + return x / -1; +} + +div2 (signed short x) +{ + return x / -1; +} + +div3 (signed char x, signed char y) +{ + return x / y; +} + +div4 (signed short x, signed short y) +{ + return x / y; +} + +mod1 (signed char x) +{ + return x % -1; +} + +mod2 (signed short x) +{ + return x % -1; +} + +mod3 (signed char x, signed char y) +{ + return x % y; +} + +mod4 (signed short x, signed short y) +{ + return x % y; +} + +signed long +mod5 (signed long x, signed long y) +{ + return x % 
y; +} + +unsigned long +mod6 (unsigned long x, unsigned long y) +{ + return x % y; +} + +main () +{ + if (div1 (-(1 << 7)) != 1 << 7) + abort (); + if (div2 (-(1 << 15)) != 1 << 15) + abort (); + if (div3 (-(1 << 7), -1) != 1 << 7) + abort (); + if (div4 (-(1 << 15), -1) != 1 << 15) + abort (); + if (mod1 (-(1 << 7)) != 0) + abort (); + if (mod2 (-(1 << 15)) != 0) + abort (); + if (mod3 (-(1 << 7), -1) != 0) + abort (); + if (mod4 (-(1 << 15), -1) != 0) + abort (); + if (mod5 (0x50000000, 2) != 0) + abort (); + if (mod6 (0x50000000, 2) != 0) + abort (); + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/doloop-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/doloop-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/doloop-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/doloop-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,18 @@ +#include <limits.h> + +extern void exit (int); +extern void abort (void); + +volatile unsigned int i; + +int +main (void) +{ + unsigned char z = 0; + + do ++i; + while (--z > 0); + if (i != UCHAR_MAX + 1U) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/doloop-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/doloop-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/doloop-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/doloop-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,18 @@ +#include <limits.h> + +extern void exit (int); +extern void abort (void); + +volatile unsigned int i; + +int +main (void) +{ + unsigned short z = 0; + + do ++i; + while (--z > 0); + if (i != 
USHRT_MAX + 1U) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/eeprof-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/eeprof-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/eeprof-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/eeprof-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,79 @@ +/* { dg-require-effective-target return_address } */ +/* { dg-options "-finstrument-functions" } */ +/* { dg-xfail-run-if "" { powerpc-ibm-aix* } } */ + +extern void abort (void); + +#define ASSERT(X) if (!(X)) abort (); +#define NOCHK __attribute__ ((no_instrument_function)) + +int entry_calls, exit_calls; +void (*last_fn_entered)(); +void (*last_fn_exited)(); + +__attribute__ ((noinline)) +int main () NOCHK; + +__attribute__ ((noinline)) +void foo () +{ + ASSERT (last_fn_entered == foo); +} + +__attribute__ ((noinline)) +static void foo2 () +{ + ASSERT (entry_calls == 1 && exit_calls == 0); + ASSERT (last_fn_entered == foo2); + foo (); + ASSERT (entry_calls == 2 && exit_calls == 1); + ASSERT (last_fn_entered == foo); + ASSERT (last_fn_exited == foo); +} + +__attribute__ ((noinline)) +void nfoo (void) NOCHK; +void nfoo () +{ + ASSERT (entry_calls == 2 && exit_calls == 2); + ASSERT (last_fn_entered == foo); + ASSERT (last_fn_exited == foo2); + foo (); + ASSERT (entry_calls == 3 && exit_calls == 3); + ASSERT (last_fn_entered == foo); + ASSERT (last_fn_exited == foo); +} + +int main () +{ + ASSERT (entry_calls == 0 && exit_calls == 0); + + foo2 (); + + ASSERT (entry_calls == 2 && exit_calls == 2); + ASSERT (last_fn_entered == foo); + ASSERT (last_fn_exited == foo2); + + nfoo (); + + ASSERT (entry_calls == 3 && exit_calls == 3); + ASSERT (last_fn_entered == foo); + + return 0; +} + +void __cyg_profile_func_enter (void*, 
void*) NOCHK; +void __cyg_profile_func_exit (void*, void*) NOCHK; + +__attribute__ ((noinline)) +void __cyg_profile_func_enter (void *fn, void *parent) +{ + entry_calls++; + last_fn_entered = (void (*)())fn; +} +__attribute__ ((noinline)) +void __cyg_profile_func_exit (void *fn, void *parent) +{ + exit_calls++; + last_fn_exited = (void (*)())fn; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/enum-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/enum-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/enum-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/enum-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,43 @@ +typedef enum +{ + END = -1, + EMPTY = (1 << 8 ) , + BACKREF, + BEGLINE, + ENDLINE, + BEGWORD, + ENDWORD, + LIMWORD, + NOTLIMWORD, + QMARK, + STAR, + PLUS, + REPMN, + CAT, + OR, + ORTOP, + LPAREN, + RPAREN, + CSET +} token; + +static token tok; + +static int +atom () +{ + if ((tok >= 0 && tok < (1 << 8 ) ) || tok >= CSET || tok == BACKREF + || tok == BEGLINE || tok == ENDLINE || tok == BEGWORD + || tok == ENDWORD || tok == LIMWORD || tok == NOTLIMWORD) + return 1; + else + return 0; +} + +main () +{ + tok = 0; + if (atom () != 1) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/enum-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/enum-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/enum-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/enum-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,21 @@ +/* Copyright (C) 2000 Free Software Foundation */ +/* by Alexandre Oliva */ + 
+enum foo { FOO, BAR }; + +/* Even though the underlying type of an enum is unspecified, the type + of enumeration constants is explicitly defined as int (6.4.4.3/2 in + the C99 Standard). Therefore, `i' must not be promoted to + `unsigned' in the comparison below; we must exit the loop when it + becomes negative. */ + +int +main () +{ + int i; + for (i = BAR; i >= FOO; --i) + if (i == -1) + abort (); + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/enum-3.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/enum-3.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/enum-3.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/enum-3.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,24 @@ +/* The composite type of int and an enum compatible with int might be + either of the two types, but it isn't an unsigned type. */ +/* Origin: Joseph Myers */ + +#include + +#include + +extern void abort (void); +extern void exit (int); + +enum e { a = INT_MIN }; + +int *p; +enum e *q; +int +main (void) +{ + enum e x = a; + q = &x; + if (*(1 ? q : p) > 0) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/execute.exp URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/execute.exp?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/execute.exp (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/execute.exp Wed Oct 9 04:01:46 2019 @@ -0,0 +1,38 @@ +# Copyright (C) 1991-2019 Free Software Foundation, Inc. 
+ +# This program is free software; you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation; either version 3 of the License, or +# (at your option) any later version. +# +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with GCC; see the file COPYING3. If not see +# . + +# This file was written by Rob Savoye. (rob at cygnus.com) +# Modified and maintained by Jeffrey Wheat (cassidy at cygnus.com) + +# +# These tests come from Torbjorn Granlund (tege at cygnus.com) +# C torture test suite. +# + +# Load support procs. +load_lib gcc-dg.exp + +# Initialize `dg'. +dg-init + +# Main loop. +set saved-dg-do-what-default ${dg-do-what-default} +set dg-do-what-default "run" +gcc-dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/*.\[cS\]]] "" "-w" +set dg-do-what-default ${saved-dg-do-what-default} + +# All done. +dg-finish Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/extzvsi.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/extzvsi.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/extzvsi.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/extzvsi.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,41 @@ +/* The bit-field below would have a problem if __INT_MAX__ is too + small. */ +#if __INT_MAX__ < 2147483647 +int +main (void) +{ + exit (0); +} +#else +/* Failed on powerpc due to bad extzvsi pattern. 
*/ + +struct ieee +{ + unsigned int negative:1; + unsigned int exponent:11; + unsigned int mantissa0:20; + unsigned int mantissa1:32; +} x; + +unsigned int +foo (void) +{ + unsigned int exponent; + + exponent = x.exponent; + if (exponent == 0) + return 1; + else if (exponent > 1) + return 2; + return 0; +} + +int +main (void) +{ + x.exponent = 1; + if (foo () != 0) + abort (); + return 0; +} +#endif Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ffs-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ffs-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ffs-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ffs-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,12 @@ +__volatile int a = 0; + +extern void abort (void); +extern void exit (int); + +int +main (void) +{ + if (__builtin_ffs (a) != 0) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ffs-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ffs-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ffs-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ffs-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,44 @@ +struct + { + int input; + int output; + } +ffstesttab[] = + { +#if __INT_MAX__ >= 2147483647 + /* at least 32-bit integers */ + { 0x80000000, 32 }, + { 0xa5a5a5a5, 1 }, + { 0x5a5a5a5a, 2 }, + { 0xcafe0000, 18 }, +#endif +#if __INT_MAX__ >= 32767 + /* at least 16-bit integers */ + { 0x8000, 16 }, + { 0xa5a5, 1 }, + { 0x5a5a, 2 }, + { 0x0ca0, 6 }, +#endif +#if __INT_MAX__ < 32767 +#error integers are too small +#endif + }; + +#define 
NFFSTESTS (sizeof (ffstesttab) / sizeof (ffstesttab[0])) + +extern void abort (void); +extern void exit (int); + +int +main (void) +{ + int i; + + for (i = 0; i < NFFSTESTS; i++) + { + if (__builtin_ffs (ffstesttab[i].input) != ffstesttab[i].output) + abort (); + } + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/float-floor.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/float-floor.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/float-floor.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/float-floor.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,21 @@ + +#if(__SIZEOF_DOUBLE__==8) +double d = 1024.0 - 1.0 / 32768.0; +#else +double d = 1024.0 - 1.0 / 16384.0; +#endif + +extern double floor(double); +extern float floorf(float); +extern void abort(); + +int main() { + + double df = floor(d); + float f1 = (float)floor(d); + + if ((int)df != 1023 || (int)f1 != 1023) + abort (); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/floatunsisf-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/floatunsisf-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/floatunsisf-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/floatunsisf-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,21 @@ +/* The fp-bit.c function __floatunsisf had a latent bug where guard bits + could be lost leading to incorrect rounding. 
*/ +/* Origin: Joseph Myers */ + +extern void abort (void); +extern void exit (int); +#if __INT_MAX__ >= 0x7fffffff +volatile unsigned u = 0x80000081; +#else +volatile unsigned long u = 0x80000081; +#endif +volatile float f1, f2; +int +main (void) +{ + f1 = (float) u; + f2 = (float) 0x80000081; + if (f1 != f2) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/fprintf-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/fprintf-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/fprintf-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/fprintf-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,25 @@ +/* { dg-skip-if "requires io" { freestanding } } */ + +#include +#include + +int +main (void) +{ +#define test(ret, args...) \ + fprintf (stdout, args); \ + if (fprintf (stdout, args) != ret) \ + abort (); + test (5, "hello"); + test (6, "hello\n"); + test (1, "a"); + test (0, ""); + test (5, "%s", "hello"); + test (6, "%s", "hello\n"); + test (1, "%s", "a"); + test (0, "%s", ""); + test (1, "%c", 'x'); + test (7, "%s\n", "hello\n"); + test (2, "%d\n", 0); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/fprintf-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/fprintf-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/fprintf-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/fprintf-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,53 @@ +/* Verify that calls to fprintf don't get eliminated even if their + result on success can be computed at compile time (they can fail). 
+ The calls can still be transformed into those of other functions. + { dg-skip-if "requires io" { freestanding } } */ + +#include +#include +#include + +int main (void) +{ + char *tmpfname = tmpnam (0); + FILE *f = fopen (tmpfname, "w"); + if (!f) + { + perror ("fopen for writing"); + return 1; + } + + fprintf (f, "1"); + fprintf (f, "%c", '2'); + fprintf (f, "%c%c", '3', '4'); + fprintf (f, "%s", "5"); + fprintf (f, "%s%s", "6", "7"); + fprintf (f, "%i", 8); + fprintf (f, "%.1s\n", "9x"); + fclose (f); + + f = fopen (tmpfname, "r"); + if (!f) + { + perror ("fopen for reading"); + remove (tmpfname); + return 1; + } + + char buf[12] = ""; + if (1 != fscanf (f, "%s", buf)) + { + perror ("fscanf"); + fclose (f); + remove (tmpfname); + return 1; + } + + fclose (f); + remove (tmpfname); + + if (strcmp (buf, "123456789")) + abort (); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/fprintf-chk-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/fprintf-chk-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/fprintf-chk-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/fprintf-chk-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,51 @@ +/* { dg-skip-if "requires io" { freestanding } } */ + +#include +#include +#include + +volatile int should_optimize; + +int +__attribute__((noinline)) +__fprintf_chk (FILE *f, int flag, const char *fmt, ...) +{ + va_list ap; + int ret; +#ifdef __OPTIMIZE__ + if (should_optimize) + abort (); +#endif + should_optimize = 1; + va_start (ap, fmt); + ret = vfprintf (f, fmt, ap); + va_end (ap); + return ret; +} + +int +main (void) +{ +#define test(ret, opt, args...) 
\ + should_optimize = opt; \ + __fprintf_chk (stdout, 1, args); \ + if (!should_optimize) \ + abort (); \ + should_optimize = 0; \ + if (__fprintf_chk (stdout, 1, args) != ret) \ + abort (); \ + if (!should_optimize) \ + abort (); + test (5, 1, "hello"); + test (6, 1, "hello\n"); + test (1, 1, "a"); + test (0, 1, ""); + test (5, 1, "%s", "hello"); + test (6, 1, "%s", "hello\n"); + test (1, 1, "%s", "a"); + test (0, 1, "%s", ""); + test (1, 1, "%c", 'x'); + test (7, 0, "%s\n", "hello\n"); + test (2, 0, "%d\n", 0); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/frame-address.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/frame-address.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/frame-address.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/frame-address.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,45 @@ +/* { dg-require-effective-target return_address } */ +int check_fa_work (const char *, const char *) __attribute__((noinline)); +int check_fa_mid (const char *) __attribute__((noinline)); +int check_fa (char *) __attribute__((noinline)); +int how_much (void) __attribute__((noinline)); + +int check_fa_work (const char *c, const char *f) +{ + const char d = 0; + + if (c >= &d) + return c >= f && f >= &d; + else + return c <= f && f <= &d; +} + +int check_fa_mid (const char *c) +{ + const char *f = __builtin_frame_address (0); + + /* Prevent a tail call to check_fa_work, eliding the current stack frame. */ + return check_fa_work (c, f) != 0; +} + +int check_fa (char *unused) +{ + const char c = 0; + + /* Prevent a tail call to check_fa_mid, eliding the current stack frame. 
*/ + return check_fa_mid (&c) != 0; +} + +int how_much (void) +{ + return 8; +} + +int main (void) +{ + char *unused = __builtin_alloca (how_much ()); + + if (!check_fa(unused)) + abort(); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/func-ptr-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/func-ptr-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/func-ptr-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/func-ptr-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,16 @@ +static double f (float a); +static double (*fp) (float a); + +main () +{ + fp = f; + if (fp ((float) 1) != 1.0) + abort (); + exit (0); +} + +static double +f (float a) +{ + return a; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/gofast.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/gofast.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/gofast.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/gofast.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,101 @@ +/* { dg-skip-if "requires io" { freestanding } } */ + +/* Program to test gcc's usage of the gofast library. */ + +/* The main guiding themes are to make it trivial to add test cases over time + and to make it easy for a program to parse the output to see if the right + libcalls are being made. 
*/ + +#include + +float fp_add (float a, float b) { return a + b; } +float fp_sub (float a, float b) { return a - b; } +float fp_mul (float a, float b) { return a * b; } +float fp_div (float a, float b) { return a / b; } +float fp_neg (float a) { return -a; } + +double dp_add (double a, double b) { return a + b; } +double dp_sub (double a, double b) { return a - b; } +double dp_mul (double a, double b) { return a * b; } +double dp_div (double a, double b) { return a / b; } +double dp_neg (double a) { return -a; } + +double fp_to_dp (float f) { return f; } +float dp_to_fp (double d) { return d; } + +int eqsf2 (float a, float b) { return a == b; } +int nesf2 (float a, float b) { return a != b; } +int gtsf2 (float a, float b) { return a > b; } +int gesf2 (float a, float b) { return a >= b; } +int ltsf2 (float a, float b) { return a < b; } +int lesf2 (float a, float b) { return a <= b; } + +int eqdf2 (double a, double b) { return a == b; } +int nedf2 (double a, double b) { return a != b; } +int gtdf2 (double a, double b) { return a > b; } +int gedf2 (double a, double b) { return a >= b; } +int ltdf2 (double a, double b) { return a < b; } +int ledf2 (double a, double b) { return a <= b; } + +float floatsisf (int i) { return i; } +double floatsidf (int i) { return i; } +int fixsfsi (float f) { return f; } +int fixdfsi (double d) { return d; } +unsigned int fixunssfsi (float f) { return f; } +unsigned int fixunsdfsi (double d) { return d; } + +int fail_count = 0; + +int +fail (char *msg) +{ + fail_count++; + fprintf (stderr, "Test failed: %s\n", msg); +} + +int +main() +{ + if (fp_add (1, 1) != 2) fail ("fp_add 1+1"); + if (fp_sub (3, 2) != 1) fail ("fp_sub 3-2"); + if (fp_mul (2, 3) != 6) fail ("fp_mul 2*3"); + if (fp_div (3, 2) != 1.5) fail ("fp_div 3/2"); + if (fp_neg (1) != -1) fail ("fp_neg 1"); + + if (dp_add (1, 1) != 2) fail ("dp_add 1+1"); + if (dp_sub (3, 2) != 1) fail ("dp_sub 3-2"); + if (dp_mul (2, 3) != 6) fail ("dp_mul 2*3"); + if (dp_div (3, 2) != 1.5) 
fail ("dp_div 3/2"); + if (dp_neg (1) != -1) fail ("dp_neg 1"); + + if (fp_to_dp (1.5) != 1.5) fail ("fp_to_dp 1.5"); + if (dp_to_fp (1.5) != 1.5) fail ("dp_to_fp 1.5"); + + if (floatsisf (1) != 1) fail ("floatsisf 1"); + if (floatsidf (1) != 1) fail ("floatsidf 1"); + if (fixsfsi (1.42) != 1) fail ("fixsfsi 1.42"); + if (fixunssfsi (1.42) != 1) fail ("fixunssfsi 1.42"); + if (fixdfsi (1.42) != 1) fail ("fixdfsi 1.42"); + if (fixunsdfsi (1.42) != 1) fail ("fixunsdfsi 1.42"); + + if (eqsf2 (1, 1) == 0) fail ("eqsf2 1==1"); + if (eqsf2 (1, 2) != 0) fail ("eqsf2 1==2"); + if (nesf2 (1, 2) == 0) fail ("nesf2 1!=1"); + if (nesf2 (1, 1) != 0) fail ("nesf2 1!=1"); + if (gtsf2 (2, 1) == 0) fail ("gtsf2 2>1"); + if (gtsf2 (1, 1) != 0) fail ("gtsf2 1>1"); + if (gtsf2 (0, 1) != 0) fail ("gtsf2 0>1"); + if (gesf2 (2, 1) == 0) fail ("gesf2 2>=1"); + if (gesf2 (1, 1) == 0) fail ("gesf2 1>=1"); + if (gesf2 (0, 1) != 0) fail ("gesf2 0>=1"); + if (ltsf2 (1, 2) == 0) fail ("ltsf2 1<2"); + if (ltsf2 (1, 1) != 0) fail ("ltsf2 1<1"); + if (ltsf2 (1, 0) != 0) fail ("ltsf2 1<0"); + if (lesf2 (1, 2) == 0) fail ("lesf2 1<=2"); + if (lesf2 (1, 1) == 0) fail ("lesf2 1<=1"); + if (lesf2 (1, 0) != 0) fail ("lesf2 1<=0"); + + if (fail_count != 0) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/20000320-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/20000320-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/20000320-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/20000320-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,93 @@ +#if __INT_MAX__ != 2147483647 || (__LONG_LONG_MAX__ != 9223372036854775807ll && __LONG_MAX__ != 9223372036854775807ll) +int main(void) { exit (0); } +#else +#if __LONG_MAX__ != 
9223372036854775807ll +typedef unsigned long long ull; +#else +typedef unsigned long ull; +#endif +typedef unsigned ul; + +union fl { + float f; + ul l; +} uf; +union dl { + double d; + ull ll; +} ud; + +int failed = 0; + +void c(ull d, ul f) +{ + ud.ll = d; + uf.f = (float) ud.d; + if (uf.l != f) + { + failed++; + } +} + +int main() +{ + if (sizeof (float) != sizeof (ul) + || sizeof (double) != sizeof (ull)) + exit (0); + +#if (defined __arm__ || defined __thumb__) && ! (defined __ARMEB__ || defined __VFP_FP__) + /* The ARM always stores FP numbers in big-wordian format, + even when running in little-byteian mode. */ + c(0x0000000036900000ULL, 0x00000000U); + c(0x0000000136900000ULL, 0x00000001U); + c(0xffffffff369fffffULL, 0x00000001U); + c(0x0000000036A00000ULL, 0x00000001U); + c(0xffffffff36A7ffffULL, 0x00000001U); + c(0x0000000036A80000ULL, 0x00000002U); + c(0xffffffff36AfffffULL, 0x00000002U); + c(0x0000000036b00000ULL, 0x00000002U); + c(0x0000000136b00000ULL, 0x00000002U); + + c(0xdfffffff380fffffULL, 0x007fffffU); + c(0xe0000000380fffffULL, 0x00800000U); + c(0xe0000001380fffffULL, 0x00800000U); + c(0xffffffff380fffffULL, 0x00800000U); + c(0x0000000038100000ULL, 0x00800000U); + c(0x0000000138100000ULL, 0x00800000U); + c(0x1000000038100000ULL, 0x00800000U); + c(0x1000000138100000ULL, 0x00800001U); + c(0x2fffffff38100000ULL, 0x00800001U); + c(0x3000000038100000ULL, 0x00800002U); + c(0x5000000038100000ULL, 0x00800002U); + c(0x5000000138100000ULL, 0x00800003U); +#else + c(0x3690000000000000ULL, 0x00000000U); + c(0x3690000000000001ULL, 0x00000001U); + c(0x369fffffffffffffULL, 0x00000001U); + c(0x36A0000000000000ULL, 0x00000001U); + c(0x36A7ffffffffffffULL, 0x00000001U); + c(0x36A8000000000000ULL, 0x00000002U); + c(0x36AfffffffffffffULL, 0x00000002U); + c(0x36b0000000000000ULL, 0x00000002U); + c(0x36b0000000000001ULL, 0x00000002U); + + c(0x380fffffdfffffffULL, 0x007fffffU); + c(0x380fffffe0000000ULL, 0x00800000U); + c(0x380fffffe0000001ULL, 0x00800000U); + 
c(0x380fffffffffffffULL, 0x00800000U); + c(0x3810000000000000ULL, 0x00800000U); + c(0x3810000000000001ULL, 0x00800000U); + c(0x3810000010000000ULL, 0x00800000U); + c(0x3810000010000001ULL, 0x00800001U); + c(0x381000002fffffffULL, 0x00800001U); + c(0x3810000030000000ULL, 0x00800002U); + c(0x3810000050000000ULL, 0x00800002U); + c(0x3810000050000001ULL, 0x00800003U); +#endif + + if (failed) + abort (); + else + exit (0); +} +#endif Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/20000320-1.x URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/20000320-1.x?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/20000320-1.x (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/20000320-1.x Wed Oct 9 04:01:46 2019 @@ -0,0 +1,15 @@ +if {[istarget "m68k-*-*"] && [check_effective_target_coldfire_fpu]} { + # ColdFire FPUs require software handling of subnormals. We are + # not aware of any system that has this. + set torture_execute_xfail "m68k-*-*" +} +if [istarget "avr-*-*"] { + # AVR doubles are floats + return 1 +} +if { [istarget "tic6x-*-*"] && [check_effective_target_ti_c67x] } { + # C6X floating point hardware turns denormals to zero in FP conversions. 
+ set torture_execute_xfail "tic6x-*-*" + return 1 +} +return 0 Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/20001122-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/20001122-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/20001122-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/20001122-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,22 @@ +volatile double a, *p; + +int main () +{ + double c, d; + volatile double b; + + d = 1.0; + p = &b; + do + { + c = d; + d = c * 0.5; + b = 1 + d; + } while (b != 1.0); + + a = 1.0 + c; + if (a == 1.0) + abort(); + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/20010114-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/20010114-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/20010114-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/20010114-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,31 @@ +extern void exit (int); +extern void abort (void); + +float +rintf (float x) +{ + static const float TWO23 = 8388608.0; + + if (__builtin_fabs (x) < TWO23) + { + if (x > 0.0) + { + x += TWO23; + x -= TWO23; + } + else if (x < 0.0) + { + x = TWO23 - x; + x = -(x - TWO23); + } + } + + return x; +} + +int main (void) +{ + if (rintf (-1.5) != -2.0) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/20010114-2.x URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/20010114-2.x?rev=374156&view=auto 
============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/20010114-2.x (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/20010114-2.x Wed Oct 9 04:01:46 2019 @@ -0,0 +1,6 @@ +if [istarget "spu-*-*"] { + # This doesn't work on the SPU because single precision floats are + # always rounded toward 0. + return 1 +} +return 0 Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/20010226-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/20010226-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/20010226-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/20010226-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,24 @@ +#include <float.h> + +long double dfrom = 1.1L; +long double m1; +long double m2; +unsigned long mant_long; + +int main() +{ + /* Some targets don't support a conforming long double type. This is + common with very small parts which set long double == float. Look + to see if the type has at least 32 bits of precision. 
*/ + if (LDBL_EPSILON > 0x1p-31L) + return 0; + + m1 = dfrom / 2.0L; + m2 = m1 * 4294967296.0L; + mant_long = ((unsigned long) m2) & 0xffffffff; + + if (mant_long == 0x8ccccccc) + return 0; + else + abort(); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/20011123-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/20011123-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/20011123-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/20011123-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,12 @@ +main() +{ + double db1 = 1.7976931348623157e+308; + long double ldb1 = db1; + + if (sizeof (double) != 8 || sizeof (long double) != 16) + exit (0); + + if (ldb1 != 1.7976931348623157e+308) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/20030331-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/20030331-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/20030331-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/20030331-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,32 @@ +extern void exit (int); +extern void abort (void); +float x = -1.5f; + +float +rintf () +{ + static const float TWO23 = 8388608.0; + + if (__builtin_fabs (x) < TWO23) + { + if (x > 0.0) + { + x += TWO23; + x -= TWO23; + } + else if (x < 0.0) + { + x = TWO23 - x; + x = -(x - TWO23); + } + } + + return x; +} + +int main (void) +{ + if (rintf () != -2.0) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/20030331-1.x URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/20030331-1.x?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/20030331-1.x (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/20030331-1.x Wed Oct 9 04:01:46 2019 @@ -0,0 +1,6 @@ +if [istarget "spu-*-*"] { + # This doesn't work on the SPU because single precision floats are + # always rounded toward 0. + return 1 +} +return 0 Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/20041213-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/20041213-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/20041213-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/20041213-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,17 @@ +extern double sqrt (double); +extern void abort (void); +int once; + +double foo (void) +{ + if (once++) + abort (); + return 0.0 / 0.0; +} + +double x; +int main (void) +{ + x = sqrt (foo ()); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/920518-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/920518-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/920518-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/920518-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,8 @@ +unsigned u=2147483839;float f0=2147483648e0,f1=2147483904e0; +main() +{ + float f=u; + if(f==f0) + abort(); + exit(0); +} Added: 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/920518-1.x URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/920518-1.x?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/920518-1.x (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/920518-1.x Wed Oct 9 04:01:46 2019 @@ -0,0 +1,6 @@ +if [istarget "spu-*-*"] { + # This doesn't work on the SPU because single precision floats are + # always rounded toward 0. + return 1 +} +return 0 Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/920810-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/920810-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/920810-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/920810-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,3 @@ +#include <stdio.h> +double normalize(x)double x;{if(x==0)x=0;return x;} +main(){char b[9];sprintf(b,"%g",normalize(-0.0));if(strcmp(b,"0"))abort();exit(0);} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/920810-1.x URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/920810-1.x?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/920810-1.x (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/920810-1.x Wed Oct 9 04:01:46 2019 @@ -0,0 +1,4 @@ +if { [check_effective_target_newlib_nano_io] } { + lappend additional_flags "-Wl,-u,_printf_float" +} +return 0 Added: 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/930529-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/930529-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/930529-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/930529-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,27 @@ +main () +{ + union { + double d; + unsigned char c[8]; + } d; + + d.d = 1.0/7.0; + + if (sizeof (char) * 8 == sizeof (double)) + { + if (d.c[0] == 0x92 && d.c[1] == 0x24 && d.c[2] == 0x49 && d.c[3] == 0x92 + && d.c[4] == 0x24 && d.c[5] == 0x49 && d.c[6] == 0xc2 && d.c[7] == 0x3f) + exit (0); + if (d.c[7] == 0x92 && d.c[6] == 0x24 && d.c[5] == 0x49 && d.c[4] == 0x92 + && d.c[3] == 0x24 && d.c[2] == 0x49 && d.c[1] == 0xc2 && d.c[0] == 0x3f) + exit (0); +#if defined __arm__ || defined __thumb__ + if (d.c[4] == 0x92 && d.c[5] == 0x24 && d.c[6] == 0x49 && d.c[7] == 0x92 + && d.c[0] == 0x24 && d.c[1] == 0x49 && d.c[2] == 0xc2 && d.c[3] == 0x3f) + exit (0); +#endif + abort (); + } + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/980619-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/980619-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/980619-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/980619-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,22 @@ + int main(void) + { + float reale = 1.0f; + float oneplus; + int i; + + if (sizeof (float) != 4) + exit (0); + + for (i = 0; ; i++) + { + oneplus = 1.0f + reale; + if (oneplus == 1.0f) + break; + reale=reale/2.0f; + } + /* Assumes ieee754 
accurate arithmetic above. */ + if (i != 24) + abort (); + else + exit (0); + } Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/980619-1.x URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/980619-1.x?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/980619-1.x (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/980619-1.x Wed Oct 9 04:01:46 2019 @@ -0,0 +1,15 @@ +# This used to fail on ia32, with or without -ffloat-store. +# It works now, but some people think that's a fluke, so I'm +# keeping this around just in case. + +#set torture_eval_before_execute { +# +# set compiler_conditional_xfail_data { +# "ia32 fp rounding isn't pedantic" \ +# "i?86-*-*" \ +# { "-O3" "-O2" "-O1" "-Os"} \ +# { "" } +# } +#} + +return 0 Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/acc1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/acc1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/acc1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/acc1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,18 @@ +/* Tail call optimizations would reverse the order of additions in func(). 
*/ + +double func (const double *array) +{ + double d = *array; + if (d == 0.0) + return d; + else + return d + func (array + 1); +} + +int main () +{ + double values[] = { 0.1e-100, 1.0, -1.0, 0.0 }; + if (func (values) != 0.1e-100) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/acc2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/acc2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/acc2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/acc2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,19 @@ +/* Tail call optimizations would reverse the order of multiplications + in func(). */ + +double func (const double *array) +{ + double d = *array; + if (d == 1.0) + return d; + else + return d * func (array + 1); +} + +int main () +{ + double values[] = { __DBL_MAX__, 2.0, 0.5, 1.0 }; + if (func (values) != __DBL_MAX__) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/builtin-nan-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/builtin-nan-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/builtin-nan-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/builtin-nan-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,16 @@ +/* PR middle-end/19983 */ + +typedef __SIZE_TYPE__ size_t; + +extern void abort(void); +extern int memcmp(const void *, const void *, size_t); + +double n1 = __builtin_nan("0x1"); +double n2 = __builtin_nan("0X1"); + +int main() +{ + if (memcmp (&n1, &n2, sizeof(double))) + abort(); + return 0; +} Added: 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/compare-fp-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/compare-fp-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/compare-fp-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/compare-fp-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,189 @@ +/* Copyright (C) 2004 Free Software Foundation. + + Test for correctness of composite floating-point comparisons. + + Written by Paolo Bonzini, 26th May 2004. */ + +extern void abort (void); + +#define TEST(c) if ((c) != ok) abort (); +#define ORD(a, b) (!__builtin_isunordered ((a), (b))) +#define UNORD(a, b) (__builtin_isunordered ((a), (b))) +#define UNEQ(a, b) (__builtin_isunordered ((a), (b)) || ((a) == (b))) +#define UNLT(a, b) (__builtin_isunordered ((a), (b)) || ((a) < (b))) +#define UNLE(a, b) (__builtin_isunordered ((a), (b)) || ((a) <= (b))) +#define UNGT(a, b) (__builtin_isunordered ((a), (b)) || ((a) > (b))) +#define UNGE(a, b) (__builtin_isunordered ((a), (b)) || ((a) >= (b))) +#define LTGT(a, b) (__builtin_islessgreater ((a), (b))) + +float pinf; +float ninf; +float NaN; + +int iuneq (float x, float y, int ok) +{ + TEST (UNEQ (x, y)); + TEST (!LTGT (x, y)); + TEST (UNLE (x, y) && UNGE (x,y)); +} + +int ieq (float x, float y, int ok) +{ + TEST (ORD (x, y) && UNEQ (x, y)); +} + +int iltgt (float x, float y, int ok) +{ + TEST (!UNEQ (x, y)); /* Not optimizable. */ + TEST (LTGT (x, y)); /* Same, __builtin_islessgreater does not trap. 
*/ + TEST (ORD (x, y) && (UNLT (x, y) || UNGT (x,y))); +} + +int ine (float x, float y, int ok) +{ + TEST (UNLT (x, y) || UNGT (x, y)); +} + +int iunlt (float x, float y, int ok) +{ + TEST (UNLT (x, y)); + TEST (UNORD (x, y) || (x < y)); +} + +int ilt (float x, float y, int ok) +{ + TEST (ORD (x, y) && UNLT (x, y)); /* Not optimized */ + TEST ((x <= y) && (x != y)); + TEST ((x <= y) && (y != x)); + TEST ((x != y) && (x <= y)); /* Not optimized */ + TEST ((y != x) && (x <= y)); /* Not optimized */ +} + +int iunle (float x, float y, int ok) +{ + TEST (UNLE (x, y)); + TEST (UNORD (x, y) || (x <= y)); +} + +int ile (float x, float y, int ok) +{ + TEST (ORD (x, y) && UNLE (x, y)); /* Not optimized */ + TEST ((x < y) || (x == y)); + TEST ((y > x) || (x == y)); + TEST ((x == y) || (x < y)); /* Not optimized */ + TEST ((y == x) || (x < y)); /* Not optimized */ +} + +int iungt (float x, float y, int ok) +{ + TEST (UNGT (x, y)); + TEST (UNORD (x, y) || (x > y)); +} + +int igt (float x, float y, int ok) +{ + TEST (ORD (x, y) && UNGT (x, y)); /* Not optimized */ + TEST ((x >= y) && (x != y)); + TEST ((x >= y) && (y != x)); + TEST ((x != y) && (x >= y)); /* Not optimized */ + TEST ((y != x) && (x >= y)); /* Not optimized */ +} + +int iunge (float x, float y, int ok) +{ + TEST (UNGE (x, y)); + TEST (UNORD (x, y) || (x >= y)); +} + +int ige (float x, float y, int ok) +{ + TEST (ORD (x, y) && UNGE (x, y)); /* Not optimized */ + TEST ((x > y) || (x == y)); + TEST ((y < x) || (x == y)); + TEST ((x == y) || (x > y)); /* Not optimized */ + TEST ((y == x) || (x > y)); /* Not optimized */ +} + +int +main () +{ + pinf = __builtin_inf (); + ninf = -__builtin_inf (); + NaN = __builtin_nan (""); + + iuneq (ninf, pinf, 0); + iuneq (NaN, NaN, 1); + iuneq (pinf, ninf, 0); + iuneq (1, 4, 0); + iuneq (3, 3, 1); + iuneq (5, 2, 0); + + ieq (1, 4, 0); + ieq (3, 3, 1); + ieq (5, 2, 0); + + iltgt (ninf, pinf, 1); + iltgt (NaN, NaN, 0); + iltgt (pinf, ninf, 1); + iltgt (1, 4, 1); + iltgt (3, 3, 0); + 
iltgt (5, 2, 1); + + ine (1, 4, 1); + ine (3, 3, 0); + ine (5, 2, 1); + + iunlt (NaN, ninf, 1); + iunlt (pinf, NaN, 1); + iunlt (pinf, ninf, 0); + iunlt (pinf, pinf, 0); + iunlt (ninf, ninf, 0); + iunlt (1, 4, 1); + iunlt (3, 3, 0); + iunlt (5, 2, 0); + + ilt (1, 4, 1); + ilt (3, 3, 0); + ilt (5, 2, 0); + + iunle (NaN, ninf, 1); + iunle (pinf, NaN, 1); + iunle (pinf, ninf, 0); + iunle (pinf, pinf, 1); + iunle (ninf, ninf, 1); + iunle (1, 4, 1); + iunle (3, 3, 1); + iunle (5, 2, 0); + + ile (1, 4, 1); + ile (3, 3, 1); + ile (5, 2, 0); + + iungt (NaN, ninf, 1); + iungt (pinf, NaN, 1); + iungt (pinf, ninf, 1); + iungt (pinf, pinf, 0); + iungt (ninf, ninf, 0); + iungt (1, 4, 0); + iungt (3, 3, 0); + iungt (5, 2, 1); + + igt (1, 4, 0); + igt (3, 3, 0); + igt (5, 2, 1); + + iunge (NaN, ninf, 1); + iunge (pinf, NaN, 1); + iunge (ninf, pinf, 0); + iunge (pinf, pinf, 1); + iunge (ninf, ninf, 1); + iunge (1, 4, 0); + iunge (3, 3, 1); + iunge (5, 2, 1); + + ige (1, 4, 0); + ige (3, 3, 1); + ige (5, 2, 1); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/compare-fp-1.x URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/compare-fp-1.x?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/compare-fp-1.x (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/compare-fp-1.x Wed Oct 9 04:01:46 2019 @@ -0,0 +1,6 @@ +if [istarget "spu-*-*"] { + # The SPU single-precision floating point format does not + # support Nan & Inf. 
+ return 1 +} +return 0 Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/compare-fp-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/compare-fp-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/compare-fp-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/compare-fp-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,23 @@ +/* Copyright (C) 2004 Free Software Foundation. + + Ensure that the composite comparison optimization doesn't misfire + and attempt to combine an integer comparison with a floating-point one. + + Written by Paolo Bonzini, 26th May 2004. */ + +extern void abort (void); + +int +foo (double x, double y) +{ + /* If miscompiled the following may become false. */ + return (x > y) && ((int)x == (int)y); +} + +int +main () +{ + if (! foo (1.3,1.0)) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/compare-fp-3.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/compare-fp-3.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/compare-fp-3.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/compare-fp-3.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,96 @@ +/* Copyright (C) 2004 Free Software Foundation. + + Test for composite comparison always true/false optimization. + + Written by Paolo Bonzini, 26th May 2004. 
*/ + +extern void link_error0 (); +extern void link_error1 (); + +void +test1 (float x, float y) +{ + if ((x==y) && (x!=y)) + link_error0(); +} + +void +test2 (float x, float y) +{ + if ((xy)) + link_error0(); +} + +void +test3 (float x, float y) +{ + if ((x=y) || (x= (b)) +#define UNORD(a, b) (!ORD ((a), (b))) +#define UNEQ(a, b) (!LTGT ((a), (b))) +#define UNLT(a, b) (((a) < (b)) || __builtin_isunordered ((a), (b))) +#define UNLE(a, b) (((a) <= (b)) || __builtin_isunordered ((a), (b))) +#define UNGT(a, b) (((a) > (b)) || __builtin_isunordered ((a), (b))) +#define UNGE(a, b) (((a) >= (b)) || __builtin_isunordered ((a), (b))) +#define LTGT(a, b) (((a) < (b)) || (a) > (b)) + +float pinf; +float ninf; +float NaN; + +int iuneq (float x, float y, int ok) +{ + TEST (UNEQ (x, y)); + TEST (!LTGT (x, y)); + TEST (UNLE (x, y) && UNGE (x,y)); +} + +int ieq (float x, float y, int ok) +{ + TEST (ORD (x, y) && UNEQ (x, y)); +} + +int iltgt (float x, float y, int ok) +{ + TEST (!UNEQ (x, y)); + TEST (LTGT (x, y)); + TEST (ORD (x, y) && (UNLT (x, y) || UNGT (x,y))); +} + +int ine (float x, float y, int ok) +{ + TEST (UNLT (x, y) || UNGT (x, y)); + TEST ((x < y) || (x > y) || UNORD (x, y)); +} + +int iunlt (float x, float y, int ok) +{ + TEST (UNLT (x, y)); + TEST (UNORD (x, y) || (x < y)); +} + +int ilt (float x, float y, int ok) +{ + TEST (ORD (x, y) && UNLT (x, y)); + TEST ((x <= y) && (x != y)); + TEST ((x <= y) && (y != x)); + TEST ((x != y) && (x <= y)); + TEST ((y != x) && (x <= y)); +} + +int iunle (float x, float y, int ok) +{ + TEST (UNLE (x, y)); + TEST (UNORD (x, y) || (x <= y)); +} + +int ile (float x, float y, int ok) +{ + TEST (ORD (x, y) && UNLE (x, y)); + TEST ((x < y) || (x == y)); + TEST ((y > x) || (x == y)); + TEST ((x == y) || (x < y)); + TEST ((y == x) || (x < y)); +} + +int iungt (float x, float y, int ok) +{ + TEST (UNGT (x, y)); + TEST (UNORD (x, y) || (x > y)); +} + +int igt (float x, float y, int ok) +{ + TEST (ORD (x, y) && UNGT (x, y)); + TEST ((x >= 
y) && (x != y)); + TEST ((x >= y) && (y != x)); + TEST ((x != y) && (x >= y)); + TEST ((y != x) && (x >= y)); +} + +int iunge (float x, float y, int ok) +{ + TEST (UNGE (x, y)); + TEST (UNORD (x, y) || (x >= y)); +} + +int ige (float x, float y, int ok) +{ + TEST (ORD (x, y) && UNGE (x, y)); + TEST ((x > y) || (x == y)); + TEST ((y < x) || (x == y)); + TEST ((x == y) || (x > y)); + TEST ((y == x) || (x > y)); +} + +int +main () +{ + pinf = __builtin_inf (); + ninf = -__builtin_inf (); + NaN = __builtin_nan (""); + + iuneq (ninf, pinf, 0); + iuneq (NaN, NaN, 1); + iuneq (pinf, ninf, 0); + iuneq (1, 4, 0); + iuneq (3, 3, 1); + iuneq (5, 2, 0); + + ieq (1, 4, 0); + ieq (3, 3, 1); + ieq (5, 2, 0); + + iltgt (ninf, pinf, 1); + iltgt (NaN, NaN, 0); + iltgt (pinf, ninf, 1); + iltgt (1, 4, 1); + iltgt (3, 3, 0); + iltgt (5, 2, 1); + + ine (1, 4, 1); + ine (3, 3, 0); + ine (5, 2, 1); + + iunlt (NaN, ninf, 1); + iunlt (pinf, NaN, 1); + iunlt (pinf, ninf, 0); + iunlt (pinf, pinf, 0); + iunlt (ninf, ninf, 0); + iunlt (1, 4, 1); + iunlt (3, 3, 0); + iunlt (5, 2, 0); + + ilt (1, 4, 1); + ilt (3, 3, 0); + ilt (5, 2, 0); + + iunle (NaN, ninf, 1); + iunle (pinf, NaN, 1); + iunle (pinf, ninf, 0); + iunle (pinf, pinf, 1); + iunle (ninf, ninf, 1); + iunle (1, 4, 1); + iunle (3, 3, 1); + iunle (5, 2, 0); + + ile (1, 4, 1); + ile (3, 3, 1); + ile (5, 2, 0); + + iungt (NaN, ninf, 1); + iungt (pinf, NaN, 1); + iungt (pinf, ninf, 1); + iungt (pinf, pinf, 0); + iungt (ninf, ninf, 0); + iungt (1, 4, 0); + iungt (3, 3, 0); + iungt (5, 2, 1); + + igt (1, 4, 0); + igt (3, 3, 0); + igt (5, 2, 1); + + iunge (NaN, ninf, 1); + iunge (pinf, NaN, 1); + iunge (ninf, pinf, 0); + iunge (pinf, pinf, 1); + iunge (ninf, ninf, 1); + iunge (1, 4, 0); + iunge (3, 3, 1); + iunge (5, 2, 1); + + ige (1, 4, 0); + ige (3, 3, 1); + ige (5, 2, 1); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/compare-fp-4.x URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/compare-fp-4.x?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/compare-fp-4.x (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/compare-fp-4.x Wed Oct 9 04:01:46 2019 @@ -0,0 +1,23 @@ +# The ARM VxWorks kernel uses an external floating-point library in +# which routines like __ledf2 are just aliases for __cmpdf2. These +# routines therefore don't handle NaNs correctly. +if [istarget "arm*-*-vxworks*"] { + set torture_eval_before_execute { + global compiler_conditional_xfail_data + set compiler_conditional_xfail_data { + "The ARM kernel uses a flawed floating-point library." + { "*-*-*" } + {} + { "-mrtp" } + } + } +} + +if [istarget "spu-*-*"] { + # The SPU single-precision floating point format does not + # support Nan & Inf. + return 1 +} + +lappend additional_flags "-fno-trapping-math" +return 0 Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/copysign1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/copysign1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/copysign1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/copysign1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,76 @@ +#include <string.h> +#include <stdlib.h> +#include <float.h> + +#define fpsizeoff sizeof(float) +#define fpsizeof sizeof(double) +#define fpsizeofl sizeof(long double) + +/* Work around the fact that with the Intel double-extended precision, + we've got a 10 byte type stuffed into some amount of padding. 
And + the fact that -ffloat-store is going to stuff this value temporarily + into some bit of stack frame that we've no control over and can't zero. */ +#if LDBL_MANT_DIG == 64 +# if defined(__i386__) || defined(__x86_64__) || defined (__ia64__) +# undef fpsizeofl +# define fpsizeofl 10 +# endif +#endif + +/* Work around the fact that the sign of the second double in the IBM + double-double format is not strictly specified when it contains a zero. + For instance, -0.0L can be represented with either (-0.0, +0.0) or + (-0.0, -0.0). The former is what we'll get from the compiler when it + builds constants; the later is what we'll get from the negation operator + at runtime. */ +/* ??? This hack only works for big-endian, which is fortunately true for + AIX and, Darwin. */ +#if LDBL_MANT_DIG == 106 +# undef fpsizeofl +# define fpsizeofl sizeof(double) +#endif + + +#define TEST(TYPE, EXT) \ +TYPE c##EXT (TYPE x, TYPE y) \ +{ \ + return __builtin_copysign##EXT (x, y); \ +} \ + \ +struct D##EXT { TYPE x, y, z; }; \ + \ +static const struct D##EXT T##EXT[] = { \ + { 1.0, 2.0, 1.0 }, \ + { 1.0, -2.0, -1.0 }, \ + { -1.0, -2.0, -1.0 }, \ + { 0.0, -2.0, -0.0 }, \ + { -0.0, -2.0, -0.0 }, \ + { -0.0, 2.0, 0.0 }, \ + { __builtin_inf##EXT (), -0.0, -__builtin_inf##EXT () }, \ + { -__builtin_nan##EXT (""), __builtin_inf##EXT (), \ + __builtin_nan##EXT ("") } \ +}; \ + \ +void test##EXT (void) \ +{ \ + int i, n = sizeof (T##EXT) / sizeof (T##EXT[0]); \ + TYPE r; \ + for (i = 0; i < n; ++i) \ + { \ + r = c##EXT (T##EXT[i].x, T##EXT[i].y); \ + if (memcmp (&r, &T##EXT[i].z, fpsizeof##EXT) != 0) \ + abort (); \ + } \ +} + +TEST(float, f) +TEST(double, ) +TEST(long double, l) + +int main() +{ + testf(); + test(); + testl(); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/copysign2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/copysign2.c?rev=374156&view=auto 
============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/copysign2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/copysign2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,70 @@ +#include <string.h> +#include <stdlib.h> +#include <float.h> + +#define fpsizeoff sizeof(float) +#define fpsizeof sizeof(double) +#define fpsizeofl sizeof(long double) + +/* Work around the fact that with the Intel double-extended precision, + we've got a 10 byte type stuffed into some amount of padding. And + the fact that -ffloat-store is going to stuff this value temporarily + into some bit of stack frame that we've no control over and can't zero. */ +#if LDBL_MANT_DIG == 64 +# if defined(__i386__) || defined(__x86_64__) || defined (__ia64__) +# undef fpsizeofl +# define fpsizeofl 10 +# endif +#endif + +/* Work around the fact that the sign of the second double in the IBM + double-double format is not strictly specified when it contains a zero. + For instance, -0.0L can be represented with either (-0.0, +0.0) or + (-0.0, -0.0). The former is what we'll get from the compiler when it + builds constants; the later is what we'll get from the negation operator + at runtime. */ +/* ??? This hack only works for big-endian, which is fortunately true for + AIX and Darwin. 
*/ +#if LDBL_MANT_DIG == 106 +# undef fpsizeofl +# define fpsizeofl sizeof(double) +#endif + + +#define TEST(TYPE, EXT) \ +static TYPE Y##EXT[] = { \ + 2.0, -2.0, -2.0, -2.0, -2.0, 2.0, -0.0, __builtin_inf##EXT () \ +}; \ +static const TYPE Z##EXT[] = { \ + 1.0, -1.0, -1.0, -0.0, -0.0, 0.0, -__builtin_inf##EXT (), \ + __builtin_nan##EXT ("") \ +}; \ + \ +void test##EXT (void) \ +{ \ + TYPE r[8]; \ + int i; \ + r[0] = __builtin_copysign##EXT (1.0, Y##EXT[0]); \ + r[1] = __builtin_copysign##EXT (1.0, Y##EXT[1]); \ + r[2] = __builtin_copysign##EXT (-1.0, Y##EXT[2]); \ + r[3] = __builtin_copysign##EXT (0.0, Y##EXT[3]); \ + r[4] = __builtin_copysign##EXT (-0.0, Y##EXT[4]); \ + r[5] = __builtin_copysign##EXT (-0.0, Y##EXT[5]); \ + r[6] = __builtin_copysign##EXT (__builtin_inf##EXT (), Y##EXT[6]); \ + r[7] = __builtin_copysign##EXT (-__builtin_nan##EXT (""), Y##EXT[7]); \ + for (i = 0; i < 8; ++i) \ + if (memcmp (r+i, Z##EXT+i, fpsizeof##EXT) != 0) \ + abort (); \ +} + +TEST(float, f) +TEST(double, ) +TEST(long double, l) + +int main() +{ + testf(); + test(); + testl(); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,43 @@ +#ifndef SIGNAL_SUPPRESS +#include <signal.h> +#endif + +double dnan = 1.0/0.0 - 1.0/0.0; +double x = 1.0; + +void leave () +{ + exit (0); +} + +main () +{ +#if ! defined (__vax__) && ! defined (_CRAY) + /* Move this line earlier, for architectures (like alpha) that issue + SIGFPE on the first comparisons. 
*/ +#ifndef SIGNAL_SUPPRESS + /* Some machines catches a SIGFPE when a NaN is compared. + Let this test succeed o such machines. */ + signal (SIGFPE, leave); +#endif + /* NaN is an IEEE unordered operand. All these test should be false. */ + if (dnan == dnan) + abort (); + if (dnan != x) + x = 1.0; + else + abort (); + + if (dnan < x) + abort (); + if (dnan > x) + abort (); + if (dnan <= x) + abort (); + if (dnan >= x) + abort (); + if (dnan == x) + abort (); +#endif + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-1.x URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-1.x?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-1.x (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-1.x Wed Oct 9 04:01:46 2019 @@ -0,0 +1,16 @@ +# The ARM VxWorks kernel uses an external floating-point library in +# which routines like __ledf2 are just aliases for __cmpdf2. These +# routines therefore don't handle NaNs correctly. +if [istarget "arm*-*-vxworks*"] { + set torture_eval_before_execute { + global compiler_conditional_xfail_data + set compiler_conditional_xfail_data { + "The ARM kernel uses a flawed floating-point library." 
+ { "*-*-*" } + {} + { "-mrtp" } + } + } +} + +return 0 Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,43 @@ +#ifndef SIGNAL_SUPPRESS +#include <signal.h> +#endif + +float fnan = 1.0f/0.0f - 1.0f/0.0f; +float x = 1.0f; + +void leave () +{ + exit (0); +} + +main () +{ +#if ! defined (__vax__) && ! defined (_CRAY) + /* Move this line earlier, for architectures (like alpha) that issue + SIGFPE on the first comparisons. */ +#ifndef SIGNAL_SUPPRESS + /* Some machines catches a SIGFPE when a NaN is compared. + Let this test succeed o such machines. */ + signal (SIGFPE, leave); +#endif + /* NaN is an IEEE unordered operand. All these test should be false. 
*/ + if (fnan == fnan) + abort (); + if (fnan != x) + x = 1.0; + else + abort (); + + if (fnan < x) + abort (); + if (fnan > x) + abort (); + if (fnan <= x) + abort (); + if (fnan >= x) + abort (); + if (fnan == x) + abort (); +#endif + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-2.x URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-2.x?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-2.x (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-2.x Wed Oct 9 04:01:46 2019 @@ -0,0 +1,22 @@ +# The ARM VxWorks kernel uses an external floating-point library in +# which routines like __ledf2 are just aliases for __cmpdf2. These +# routines therefore don't handle NaNs correctly. +if [istarget "arm*-*-vxworks*"] { + set torture_eval_before_execute { + global compiler_conditional_xfail_data + set compiler_conditional_xfail_data { + "The ARM kernel uses a flawed floating-point library." + { "*-*-*" } + {} + { "-mrtp" } + } + } +} + +if [istarget "spu-*-*"] { + # The SPU single-precision floating point format does not + # support Nan & Inf. 
+ return 1 +} + +return 0 Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-3.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-3.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-3.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-3.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,43 @@ +#ifndef SIGNAL_SUPPRESS +#include <signal.h> +#endif + +long double dnan = 1.0l/0.0l - 1.0l/0.0l; +long double x = 1.0l; + +void leave () +{ + exit (0); +} + +main () +{ +#if ! defined (__vax__) && ! defined (_CRAY) + /* Move this line earlier, for architectures (like alpha) that issue + SIGFPE on the first comparisons. */ +#ifndef SIGNAL_SUPPRESS + /* Some machines catches a SIGFPE when a NaN is compared. + Let this test succeed o such machines. */ + signal (SIGFPE, leave); +#endif + /* NaN is an IEEE unordered operand. All these test should be false. 
*/ + if (dnan == dnan) + abort (); + if (dnan != x) + x = 1.0; + else + abort (); + + if (dnan < x) + abort (); + if (dnan > x) + abort (); + if (dnan <= x) + abort (); + if (dnan >= x) + abort (); + if (dnan == x) + abort (); +#endif + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-3.x URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-3.x?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-3.x (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-3.x Wed Oct 9 04:01:46 2019 @@ -0,0 +1,16 @@ +# The ARM VxWorks kernel uses an external floating-point library in +# which routines like __ledf2 are just aliases for __cmpdf2. These +# routines therefore don't handle NaNs correctly. +if [istarget "arm*-*-vxworks*"] { + set torture_eval_before_execute { + global compiler_conditional_xfail_data + set compiler_conditional_xfail_data { + "The ARM kernel uses a flawed floating-point library." + { "*-*-*" } + {} + { "-mrtp" } + } + } +} + +return 0 Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-4.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-4.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-4.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-4.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,135 @@ +#ifndef FLOAT +#define FLOAT double +#endif + +void +test_isunordered(FLOAT x, FLOAT y, int true) +{ + if (__builtin_isunordered(x, y)) + { + if (! 
true) + abort (); + } + else + { + if (true) + abort (); + } +} + +void +test_isless(FLOAT x, FLOAT y, int true) +{ + if (__builtin_isless(x, y)) + { + if (! true) + abort (); + } + else + { + if (true) + abort (); + } +} + +void +test_islessequal(FLOAT x, FLOAT y, int true) +{ + if (__builtin_islessequal(x, y)) + { + if (! true) + abort (); + } + else + { + if (true) + abort (); + } +} + +void +test_isgreater(FLOAT x, FLOAT y, int true) +{ + if (__builtin_isgreater(x, y)) + { + if (! true) + abort (); + } + else + { + if (true) + abort (); + } +} + +void +test_isgreaterequal(FLOAT x, FLOAT y, int true) +{ + if (__builtin_isgreaterequal(x, y)) + { + if (! true) + abort (); + } + else + { + if (true) + abort (); + } +} + +void +test_islessgreater(FLOAT x, FLOAT y, int true) +{ + if (__builtin_islessgreater(x, y)) + { + if (! true) + abort (); + } + else + { + if (true) + abort (); + } +} + +#define NAN (0.0 / 0.0) + +int +main() +{ + struct try + { + FLOAT x, y; + unsigned unord : 1; + unsigned lt : 1; + unsigned le : 1; + unsigned gt : 1; + unsigned ge : 1; + unsigned lg : 1; + }; + + static struct try const data[] = + { + { NAN, NAN, 1, 0, 0, 0, 0, 0 }, + { 0.0, NAN, 1, 0, 0, 0, 0, 0 }, + { NAN, 0.0, 1, 0, 0, 0, 0, 0 }, + { 0.0, 0.0, 0, 0, 1, 0, 1, 0 }, + { 1.0, 2.0, 0, 1, 1, 0, 0, 1 }, + { 2.0, 1.0, 0, 0, 0, 1, 1, 1 }, + }; + + const int n = sizeof(data) / sizeof(data[0]); + int i; + + for (i = 0; i < n; ++i) + { + test_isunordered (data[i].x, data[i].y, data[i].unord); + test_isless (data[i].x, data[i].y, data[i].lt); + test_islessequal (data[i].x, data[i].y, data[i].le); + test_isgreater (data[i].x, data[i].y, data[i].gt); + test_isgreaterequal (data[i].x, data[i].y, data[i].ge); + test_islessgreater (data[i].x, data[i].y, data[i].lg); + } + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-4e.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-4e.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-4e.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-4e.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,10 @@ +#if defined (__ia64__) && defined (__hpux__) +#define FLOAT __float80 +#include "fp-cmp-4.c" +#else +int +main () +{ + return 0; +} +#endif Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-4f.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-4f.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-4f.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-4f.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,2 @@ +#define FLOAT float +#include "fp-cmp-4.c" Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-4f.x URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-4f.x?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-4f.x (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-4f.x Wed Oct 9 04:01:46 2019 @@ -0,0 +1,6 @@ +if [istarget "spu-*-*"] { + # The SPU single-precision floating point format does not + # support Nan & Inf. 
+ return 1 +} +return 0 Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-4l.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-4l.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-4l.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-4l.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,2 @@ +#define FLOAT long double +#include "fp-cmp-4.c" Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-5.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-5.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-5.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-5.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,131 @@ +/* Like fp-cmp-4.c, but test that the setcc patterns are correct. 
*/ + +static int +test_isunordered(double x, double y) +{ + return __builtin_isunordered(x, y); +} + +static int +test_not_isunordered(double x, double y) +{ + return !__builtin_isunordered(x, y); +} + +static int +test_isless(double x, double y) +{ + return __builtin_isless(x, y); +} + +static int +test_not_isless(double x, double y) +{ + return !__builtin_isless(x, y); +} + +static int +test_islessequal(double x, double y) +{ + return __builtin_islessequal(x, y); +} + +static int +test_not_islessequal(double x, double y) +{ + return !__builtin_islessequal(x, y); +} + +static int +test_isgreater(double x, double y) +{ + return __builtin_isgreater(x, y); +} + +static int +test_not_isgreater(double x, double y) +{ + return !__builtin_isgreater(x, y); +} + +static int +test_isgreaterequal(double x, double y) +{ + return __builtin_isgreaterequal(x, y); +} + +static int +test_not_isgreaterequal(double x, double y) +{ + return !__builtin_isgreaterequal(x, y); +} + +static int +test_islessgreater(double x, double y) +{ + return __builtin_islessgreater(x, y); +} + +static int +test_not_islessgreater(double x, double y) +{ + return !__builtin_islessgreater(x, y); +} + +static void +one_test(double x, double y, int expected, + int (*pos) (double, double), int (*neg) (double, double)) +{ + if ((*pos)(x, y) != expected) + abort (); + if ((*neg)(x, y) != !expected) + abort (); +} + +#define NAN (0.0 / 0.0) + +int +main() +{ + struct try + { + double x, y; + int result[6]; + }; + + static struct try const data[] = + { + { NAN, NAN, { 1, 0, 0, 0, 0, 0 } }, + { 0.0, NAN, { 1, 0, 0, 0, 0, 0 } }, + { NAN, 0.0, { 1, 0, 0, 0, 0, 0 } }, + { 0.0, 0.0, { 0, 0, 1, 0, 1, 0 } }, + { 1.0, 2.0, { 0, 1, 1, 0, 0, 1 } }, + { 2.0, 1.0, { 0, 0, 0, 1, 1, 1 } }, + }; + + struct test + { + int (*pos)(double, double); + int (*neg)(double, double); + }; + + static struct test const tests[] = + { + { test_isunordered, test_not_isunordered }, + { test_isless, test_not_isless }, + { test_islessequal, 
test_not_islessequal }, + { test_isgreater, test_not_isgreater }, + { test_isgreaterequal, test_not_isgreaterequal }, + { test_islessgreater, test_not_islessgreater } + }; + + const int n = sizeof(data) / sizeof(data[0]); + int i, j; + + for (i = 0; i < n; ++i) + for (j = 0; j < 6; ++j) + one_test (data[i].x, data[i].y, data[i].result[j], + tests[j].pos, tests[j].neg); + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-6.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-6.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-6.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-6.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,38 @@ + +const double dnan = 1.0/0.0 - 1.0/0.0; +double x = 1.0; + +extern void link_error (void); +extern void abort (void); + +main () +{ +#if ! defined (__vax__) && ! defined (_CRAY) + /* NaN is an IEEE unordered operand. All these test should be false. 
*/ + if (dnan == dnan) + link_error (); + if (dnan != x) + x = 1.0; + else + link_error (); + + if (dnan < x) + link_error (); + if (dnan > x) + link_error (); + if (dnan <= x) + link_error (); + if (dnan >= x) + link_error (); + if (dnan == x) + link_error (); +#endif + exit (0); +} + +#ifndef __OPTIMIZE__ +void link_error (void) +{ + abort (); +} +#endif Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-6.x URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-6.x?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-6.x (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-6.x Wed Oct 9 04:01:46 2019 @@ -0,0 +1,16 @@ +# The ARM VxWorks kernel uses an external floating-point library in +# which routines like __ledf2 are just aliases for __cmpdf2. These +# routines therefore don't handle NaNs correctly. +if [istarget "arm*-*-vxworks*"] { + set torture_eval_before_execute { + global compiler_conditional_xfail_data + set compiler_conditional_xfail_data { + "The ARM kernel uses a flawed floating-point library." 
+ { "*-*-*" } + { "-O0" } + { "-mrtp" } + } + } +} + +return 0 Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-7.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-7.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-7.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-7.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,13 @@ +extern void link_error (); + +void foo(double x) +{ + if (x > __builtin_inf()) + link_error (); +} + +int main () +{ + foo (1.0); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-7.x URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-7.x?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-7.x (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-7.x Wed Oct 9 04:01:46 2019 @@ -0,0 +1,2 @@ +lappend additional_flags "-fno-trapping-math" +return 0 Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-8.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-8.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-8.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-8.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,145 @@ +#ifndef FLOAT +#define FLOAT double +#endif + +/* Like fp-cmp-4.c, but test that the cmove patterns are correct. 
*/ + +static FLOAT +test_isunordered(FLOAT x, FLOAT y, FLOAT a, FLOAT b) +{ + return __builtin_isunordered(x, y) ? a : b; +} + +static FLOAT +test_not_isunordered(FLOAT x, FLOAT y, FLOAT a, FLOAT b) +{ + return !__builtin_isunordered(x, y) ? a : b; +} + +static FLOAT +test_isless(FLOAT x, FLOAT y, FLOAT a, FLOAT b) +{ + return __builtin_isless(x, y) ? a : b; +} + +static FLOAT +test_not_isless(FLOAT x, FLOAT y, FLOAT a, FLOAT b) +{ + return !__builtin_isless(x, y) ? a : b; +} + +static FLOAT +test_islessequal(FLOAT x, FLOAT y, FLOAT a, FLOAT b) +{ + return __builtin_islessequal(x, y) ? a : b; +} + +static FLOAT +test_not_islessequal(FLOAT x, FLOAT y, FLOAT a, FLOAT b) +{ + return !__builtin_islessequal(x, y) ? a : b; +} + +static FLOAT +test_isgreater(FLOAT x, FLOAT y, FLOAT a, FLOAT b) +{ + return __builtin_isgreater(x, y) ? a : b; +} + +static FLOAT +test_not_isgreater(FLOAT x, FLOAT y, FLOAT a, FLOAT b) +{ + return !__builtin_isgreater(x, y) ? a : b; +} + +static FLOAT +test_isgreaterequal(FLOAT x, FLOAT y, FLOAT a, FLOAT b) +{ + return __builtin_isgreaterequal(x, y) ? a : b; +} + +static FLOAT +test_not_isgreaterequal(FLOAT x, FLOAT y, FLOAT a, FLOAT b) +{ + return !__builtin_isgreaterequal(x, y) ? a : b; +} + +static FLOAT +test_islessgreater(FLOAT x, FLOAT y, FLOAT a, FLOAT b) +{ + return __builtin_islessgreater(x, y) ? a : b; +} + +static FLOAT +test_not_islessgreater(FLOAT x, FLOAT y, FLOAT a, FLOAT b) +{ + return !__builtin_islessgreater(x, y) ? 
a : b; +} + +static void +one_test(FLOAT x, FLOAT y, int expected, + FLOAT (*pos) (FLOAT, FLOAT, FLOAT, FLOAT), + FLOAT (*neg) (FLOAT, FLOAT, FLOAT, FLOAT)) +{ + if (((*pos)(x, y, 1.0, 2.0) == 1.0) != expected) + abort (); + if (((*neg)(x, y, 3.0, 4.0) == 4.0) != expected) + abort (); +} + +#define NAN (0.0 / 0.0) +#define INF (1.0 / 0.0) + +int +main() +{ + struct try + { + FLOAT x, y; + int result[6]; + }; + + static struct try const data[] = + { + { NAN, NAN, { 1, 0, 0, 0, 0, 0 } }, + { 0.0, NAN, { 1, 0, 0, 0, 0, 0 } }, + { NAN, 0.0, { 1, 0, 0, 0, 0, 0 } }, + { 0.0, 0.0, { 0, 0, 1, 0, 1, 0 } }, + { 1.0, 2.0, { 0, 1, 1, 0, 0, 1 } }, + { 2.0, 1.0, { 0, 0, 0, 1, 1, 1 } }, + { INF, 0.0, { 0, 0, 0, 1, 1, 1 } }, + { 1.0, INF, { 0, 1, 1, 0, 0, 1 } }, + { INF, INF, { 0, 0, 1, 0, 1, 0 } }, + { 0.0, -INF, { 0, 0, 0, 1, 1, 1 } }, + { -INF, 1.0, { 0, 1, 1, 0, 0, 1 } }, + { -INF, -INF, { 0, 0, 1, 0, 1, 0 } }, + { INF, -INF, { 0, 0, 0, 1, 1, 1 } }, + { -INF, INF, { 0, 1, 1, 0, 0, 1 } }, + }; + + struct test + { + FLOAT (*pos)(FLOAT, FLOAT, FLOAT, FLOAT); + FLOAT (*neg)(FLOAT, FLOAT, FLOAT, FLOAT); + }; + + static struct test const tests[] = + { + { test_isunordered, test_not_isunordered }, + { test_isless, test_not_isless }, + { test_islessequal, test_not_islessequal }, + { test_isgreater, test_not_isgreater }, + { test_isgreaterequal, test_not_isgreaterequal }, + { test_islessgreater, test_not_islessgreater } + }; + + const int n = sizeof(data) / sizeof(data[0]); + int i, j; + + for (i = 0; i < n; ++i) + for (j = 0; j < 6; ++j) + one_test (data[i].x, data[i].y, data[i].result[j], + tests[j].pos, tests[j].neg); + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-8e.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-8e.c?rev=374156&view=auto ============================================================================== --- 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-8e.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-8e.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,10 @@ +#if defined (__ia64__) && defined (__hpux__) +#define FLOAT __float80 +#include "fp-cmp-8.c" +#else +int +main () +{ + return 0; +} +#endif Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-8f.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-8f.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-8f.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-8f.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,2 @@ +#define FLOAT float +#include "fp-cmp-8.c" Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-8f.x URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-8f.x?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-8f.x (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-8f.x Wed Oct 9 04:01:46 2019 @@ -0,0 +1,6 @@ +if [istarget "spu-*-*"] { + # The SPU single-precision floating point format does not + # support Nan & Inf. 
+ return 1 +} +return 0 Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-8l.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-8l.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-8l.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/fp-cmp-8l.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,2 @@ +#define FLOAT long double +#include "fp-cmp-8.c" Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/hugeval.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/hugeval.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/hugeval.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/hugeval.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,27 @@ +#include <math.h> + +static const double zero = 0.0; +static const double pone = 1.0; +static const double none = -1.0; +static const double pinf = 1.0 / 0.0; +static const double ninf = -1.0 / 0.0; + +int +main () +{ + if (pinf != pone/zero) + abort (); + + if (ninf != none/zero) + abort (); + +#ifdef HUGE_VAL + if (HUGE_VAL != pinf) + abort (); + + if (-HUGE_VAL != ninf) + abort (); +#endif + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/hugeval.x URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/hugeval.x?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/hugeval.x (added) +++ 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/hugeval.x Wed Oct 9 04:01:46 2019 @@ -0,0 +1,28 @@ +# This test fails under hpux 9.X and 10.X because HUGE_VAL is DBL_MAX +# instead of +Infinity. + +global target_triplet +if { [istarget "hppa*-*-hpux9*"] || [istarget "hppa*-*-hpux10*"] } { + set torture_execute_xfail "$target_triplet" +} + +# VxWorks kernel mode has the same problem. +if {[istarget "*-*-vxworks*"]} { + set torture_eval_before_execute { + global compiler_conditional_xfail_data + set compiler_conditional_xfail_data { + "The kernel HUGE_VAL is defined to DBL_MAX instead of +Inf." + { "*-*-*" } + {} + { "-mrtp" } + } + } +} + +if { [istarget "tic6x-*-*"] && [check_effective_target_ti_c67x] } { + # C6X uses -freciprocal-math by default. + set torture_execute_xfail "$target_triplet" + return 1 +} + +return 0 Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/ieee.exp URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/ieee.exp?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/ieee.exp (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/ieee.exp Wed Oct 9 04:01:46 2019 @@ -0,0 +1,81 @@ +# +# Expect driver script for GCC Regression Tests +# Copyright (C) 1993-2019 Free Software Foundation, Inc. +# +# This file is free software; you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation; either version 3 of the License, or +# (at your option) any later version. +# +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. 
+# +# You should have received a copy of the GNU General Public License +# along with GCC; see the file COPYING3. If not see +# . +# +# Written by Jeffrey Wheat (cassidy at cygnus.com) +# + +# Load support procs. +load_lib gcc-dg.exp +load_lib torture-options.exp +load_lib c-torture.exp + +# These tests come from Torbjorn Granlund's (tege at cygnus.com) +# C torture test suite, and other contributors. + +# Disable tests on machines with no hardware support for IEEE arithmetic. +if { [istarget "vax-*-*"] || [ istarget "powerpc-*-*spe"] || [istarget "pdp11-*-*"] } { return } + +if $tracelevel then { + strace $tracelevel +} + +torture-init +set-torture-options $C_TORTURE_OPTIONS {{}} $LTO_TORTURE_OPTIONS + +set additional_flags "-fno-inline" + +# We must use -ffloat-store/-mieee to ensure that excess precision on some +# machines does not cause problems +if { ([istarget "i?86-*-*"] || [istarget "x86_64-*-*"]) + && [check_effective_target_ia32] } then { + lappend additional_flags "-ffloat-store" +} +if [istarget "m68k-*-*"] then { + lappend additional_flags "-ffloat-store" +} +if { [istarget "alpha*-*-*"] + || [istarget "sh*-*-*"] } then { + lappend additional_flags "-mieee" +} + +if { ![check_effective_target_signal] } { + lappend additional_flags "-DSIGNAL_SUPPRESS" +} + +# load support procs +load_lib c-torture.exp + +# initialize harness +gcc_init + +# +# main test loop +# + +foreach src [lsort [glob -nocomplain $srcdir/$subdir/*.c]] { + # If we're only testing specific files and this isn't one of them, skip it. + if ![runtest_file_p $runtests $src] then { + continue + } + + c-torture-execute $src $additional_flags +} + +# All done. 
+torture-finish +gcc_finish Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/inf-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/inf-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/inf-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/inf-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,44 @@ +extern void abort (void); + +int main() +{ +#ifndef __SPU__ + /* The SPU single-precision floating point format does not support Inf. */ + float fi = __builtin_inff(); +#endif + double di = __builtin_inf(); + long double li = __builtin_infl(); + + float fh = __builtin_huge_valf(); + double dh = __builtin_huge_val(); + long double lh = __builtin_huge_vall(); + +#ifndef __SPU__ + if (fi + fi != fi) + abort (); +#endif + if (di + di != di) + abort (); + if (li + li != li) + abort (); + +#ifndef __SPU__ + if (fi != fh) + abort (); +#endif + if (di != dh) + abort (); + if (li != lh) + abort (); + +#ifndef __SPU__ + if (fi <= 0) + abort (); +#endif + if (di <= 0) + abort (); + if (li <= 0) + abort (); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/inf-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/inf-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/inf-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/inf-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,82 @@ +extern void abort (void); + +void test(double f, double i) +{ + if (f == __builtin_inf()) + abort (); + if (f == -__builtin_inf()) + abort (); + if (i == -__builtin_inf()) + abort (); + if (i != __builtin_inf()) + 
abort (); + + if (f >= __builtin_inf()) + abort (); + if (f > __builtin_inf()) + abort (); + if (i > __builtin_inf()) + abort (); + if (f <= -__builtin_inf()) + abort (); + if (f < -__builtin_inf()) + abort (); +} + +void testf(float f, float i) +{ +#ifndef __SPU__ + /* The SPU single-precision floating point format does not support Inf. */ + + if (f == __builtin_inff()) + abort (); + if (f == -__builtin_inff()) + abort (); + if (i == -__builtin_inff()) + abort (); + if (i != __builtin_inff()) + abort (); + + if (f >= __builtin_inff()) + abort (); + if (f > __builtin_inff()) + abort (); + if (i > __builtin_inff()) + abort (); + if (f <= -__builtin_inff()) + abort (); + if (f < -__builtin_inff()) + abort (); +#endif +} + +void testl(long double f, long double i) +{ + if (f == __builtin_infl()) + abort (); + if (f == -__builtin_infl()) + abort (); + if (i == -__builtin_infl()) + abort (); + if (i != __builtin_infl()) + abort (); + + if (f >= __builtin_infl()) + abort (); + if (f > __builtin_infl()) + abort (); + if (i > __builtin_infl()) + abort (); + if (f <= -__builtin_infl()) + abort (); + if (f < -__builtin_infl()) + abort (); +} + +int main() +{ + test (34.0, __builtin_inf()); + testf (34.0f, __builtin_inff()); + testl (34.0l, __builtin_infl()); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/inf-3.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/inf-3.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/inf-3.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/inf-3.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,78 @@ +extern void abort (void); + +void test(double f, double i) +{ + if (f == __builtin_huge_val()) + abort (); + if (f == -__builtin_huge_val()) + abort (); + if (i == -__builtin_huge_val()) + abort 
(); + if (i != __builtin_huge_val()) + abort (); + + if (f >= __builtin_huge_val()) + abort (); + if (f > __builtin_huge_val()) + abort (); + if (i > __builtin_huge_val()) + abort (); + if (f <= -__builtin_huge_val()) + abort (); + if (f < -__builtin_huge_val()) + abort (); +} + +void testf(float f, float i) +{ + if (f == __builtin_huge_valf()) + abort (); + if (f == -__builtin_huge_valf()) + abort (); + if (i == -__builtin_huge_valf()) + abort (); + if (i != __builtin_huge_valf()) + abort (); + + if (f >= __builtin_huge_valf()) + abort (); + if (f > __builtin_huge_valf()) + abort (); + if (i > __builtin_huge_valf()) + abort (); + if (f <= -__builtin_huge_valf()) + abort (); + if (f < -__builtin_huge_valf()) + abort (); +} + +void testl(long double f, long double i) +{ + if (f == __builtin_huge_vall()) + abort (); + if (f == -__builtin_huge_vall()) + abort (); + if (i == -__builtin_huge_vall()) + abort (); + if (i != __builtin_huge_vall()) + abort (); + + if (f >= __builtin_huge_vall()) + abort (); + if (f > __builtin_huge_vall()) + abort (); + if (i > __builtin_huge_vall()) + abort (); + if (f <= -__builtin_huge_vall()) + abort (); + if (f < -__builtin_huge_vall()) + abort (); +} + +int main() +{ + test (34.0, __builtin_huge_val()); + testf (34.0f, __builtin_huge_valf()); + testl (34.0l, __builtin_huge_vall()); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/minuszero.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/minuszero.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/minuszero.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/minuszero.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,22 @@ +main () +{ + union + { + double d; + unsigned short i[sizeof (double) / sizeof (short)]; + } u; + int a = 0; + int b 
= -5; + int j; + + u.d = (double) a / b; + + /* Look for the right pattern, but be sloppy since + we don't know the byte order. */ + for (j = 0; j < sizeof (double) / sizeof (short); j++) + { + if (u.i[j] == 0x8000) + exit (0); + } + abort (); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/mul-subnormal-single-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/mul-subnormal-single-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/mul-subnormal-single-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/mul-subnormal-single-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,75 @@ +/* Check that certain subnormal numbers (formerly known as denormalized + numbers) are rounded to within 0.5 ulp. PR other/14354. */ + +/* This test requires that float and unsigned int are the same size and + that the sign-bit of the float is at MSB of the unsigned int. */ + +#if __INT_MAX__ != 2147483647L +int main () { exit (0); } +#else + +union uf +{ + unsigned int u; + float f; +}; + +static float +u2f (unsigned int v) +{ + union uf u; + u.u = v; + return u.f; +} + +static unsigned int +f2u (float v) +{ + union uf u; + u.f = v; + return u.u; +} + +int ok = 1; + +static void +tstmul (unsigned int ux, unsigned int uy, unsigned int ur) +{ + float x = u2f (ux); + float y = u2f (uy); + + if (f2u (x * y) != ur) + /* Set a variable rather than aborting here, to simplify tracing when + several computations are wrong. */ + ok = 0; +} + +/* We don't want to make this const and static, or else we risk inlining + causing the test to fold as constants at compile-time. 
*/ +struct +{ + unsigned int p1, p2, res; +} expected[] = + { + {0xfff, 0x3f800400, 0xfff}, + {0xf, 0x3fc88888, 0x17}, + {0xf, 0x3f844444, 0xf} + }; + +int +main () +{ + unsigned int i; + + for (i = 0; i < sizeof (expected) / sizeof (expected[0]); i++) + { + tstmul (expected[i].p1, expected[i].p2, expected[i].res); + tstmul (expected[i].p2, expected[i].p1, expected[i].res); + } + + if (!ok) + abort (); + + exit (0); +} +#endif Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/mul-subnormal-single-1.x URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/mul-subnormal-single-1.x?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/mul-subnormal-single-1.x (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/mul-subnormal-single-1.x Wed Oct 9 04:01:46 2019 @@ -0,0 +1,26 @@ +if {[istarget "csky-*-*"] && [check_effective_target_hard_float]} { + # The C-SKY hardware FPU only supports flush-to-zero mode. + set torture_execute_xfail "csky-*-*" + return 1 +} +if [istarget "epiphany-*-*"] { + # The Epiphany single-precision floating point format does not + # support subnormals. + return 1 +} +if {[istarget "m68k-*-*"] && [check_effective_target_coldfire_fpu]} { + # ColdFire FPUs require software handling of subnormals. We are + # not aware of any system that has this. + set torture_execute_xfail "m68k-*-*" +} +if [istarget "spu-*-*"] { + # The SPU single-precision floating point format does not + # support subnormals. + return 1 +} +if { [istarget "tic6x-*-*"] && [check_effective_target_ti_c67x] } { + # C6X floating point hardware turns denormals to zero in multiplications. 
+ set torture_execute_xfail "tic6x-*-*" + return 1 +} +return 0 Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/mzero2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/mzero2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/mzero2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/mzero2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,61 @@ +/* Test IEEE +0/-0 rules */ + +static double pzero = +0.0; +static double nzero = -0.0; +static double pinf = +1.0 / 0.0; +static double ninf = -1.0 / 0.0; +static double nan = 0.0 / 0.0; + +void +expect (double value, double expected) +{ + if (expected != expected) /* expected value is Not a number */ + { + if (value == value) /* actual value is a number */ + abort (); + } + + else if (value != value) + abort (); /* actual value is a NaN */ + + else if (memcmp ((void *)&value, (void *)&expected, sizeof (double)) != 0) + abort (); /* values don't match */ +} + +main () +{ + expect (pzero + pzero, pzero); + expect (pzero + nzero, pzero); + expect (nzero + pzero, pzero); + expect (nzero + nzero, nzero); + + expect (pzero - pzero, pzero); + expect (pzero - nzero, pzero); + expect (nzero - pzero, nzero); + expect (nzero - nzero, pzero); + + expect (pzero * pzero, pzero); + expect (pzero * nzero, nzero); + expect (nzero * pzero, nzero); + expect (nzero * nzero, pzero); + + expect (+1.00 * pzero, pzero); + expect (-1.00 * pzero, nzero); + expect (+1.00 * nzero, nzero); + expect (-1.00 * nzero, pzero); + +#ifndef _TMS320C6700 + /* C6X floating point division is implemented using reciprocals. 
*/ + expect (pzero / pzero, nan); + expect (pzero / nzero, nan); + expect (nzero / pzero, nan); + expect (nzero / nzero, nan); + + expect (+1.00 / pzero, pinf); + expect (-1.00 / pzero, ninf); + expect (+1.00 / nzero, ninf); + expect (-1.00 / nzero, pinf); +#endif + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/mzero2.x URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/mzero2.x?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/mzero2.x (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/mzero2.x Wed Oct 9 04:01:46 2019 @@ -0,0 +1,6 @@ +# freebsd sets up the fpu with a different precision control which causes +# this test to "fail". +if { [istarget "i?86-*-freebsd*\[123\]\.*"] } { + set torture_execute_xfail "i?86-*-freebsd*" +} +return 0 Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/mzero3.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/mzero3.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/mzero3.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/mzero3.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,51 @@ +/* Copyright (C) 2002 Free Software Foundation. + by Hans-Peter Nilsson , derived from mzero2.c + + In the MMIX port, negdf2 was bogusly expanding -x into 0 - x. 
*/ + +double nzerod = -0.0; +float nzerof = -0.0; +double zerod = 0.0; +float zerof = 0.0; + +void expectd (double, double); +void expectf (float, float); +double negd (double); +float negf (float); + +main () +{ + expectd (negd (zerod), nzerod); + expectf (negf (zerof), nzerof); + expectd (negd (nzerod), zerod); + expectf (negf (nzerof), zerof); + exit (0); +} + +void +expectd (double value, double expected) +{ + if (value != expected + || memcmp ((void *)&value, (void *) &expected, sizeof (double)) != 0) + abort (); +} + +void +expectf (float value, float expected) +{ + if (value != expected + || memcmp ((void *)&value, (void *) &expected, sizeof (float)) != 0) + abort (); +} + +double +negd (double v) +{ + return -v; +} + +float +negf (float v) +{ + return -v; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/mzero4.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/mzero4.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/mzero4.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/mzero4.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,57 @@ +/* Copyright (C) 2003 Free Software Foundation. + by Roger Sayle , derived from mzero3.c + + Constant folding of sin(-0.0), tan(-0.0) and atan(-0.0) should + all return -0.0, for both double and float forms. 
*/ + +void abort (void); +typedef __SIZE_TYPE__ size_t; +extern int memcmp (const void *, const void *, size_t); + +double sin (double); +double tan (double); +double atan (double); + +float sinf (float); +float tanf (float); +float atanf (float); + +void expectd (double, double); +void expectf (float, float); + +void +expectd (double value, double expected) +{ + if (value != expected + || memcmp ((void *)&value, (void *) &expected, sizeof (double)) != 0) + abort (); +} + +void +expectf (float value, float expected) +{ + if (value != expected + || memcmp ((void *)&value, (void *) &expected, sizeof (float)) != 0) + abort (); +} + +int main () +{ + expectd (sin (0.0), 0.0); + expectd (tan (0.0), 0.0); + expectd (atan (0.0), 0.0); + + expectd (sin (-0.0), -0.0); + expectd (tan (-0.0), -0.0); + expectd (atan (-0.0), -0.0); + + expectf (sinf (0.0f), 0.0f); + expectf (tanf (0.0f), 0.0f); + expectf (atanf (0.0f), 0.0f); + + expectf (sinf (-0.0f), -0.0f); + expectf (tanf (-0.0f), -0.0f); + expectf (atanf (-0.0f), -0.0f); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/mzero5.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/mzero5.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/mzero5.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/mzero5.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,29 @@ +/* Test gcse handling of IEEE 0/-0 rules. 
*/ +static double zero = 0.0; + +int +negzero_check (double d) +{ + if (d == 0) + return !!memcmp ((void *)&zero, (void *)&d, sizeof (double)); + return 0; +} + +int +sub (double d, double e) +{ + if (d == 0.0 && e == 0.0 + && negzero_check (d) == 0 && negzero_check (e) == 0) + return 1; + else + return 0; +} + +int +main (void) +{ + double minus_zero = -0.0; + if (sub (minus_zero, 0)) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/mzero6.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/mzero6.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/mzero6.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/mzero6.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,24 @@ +/* Tail call optimizations would convert func() into the moral equivalent of: + + double acc = 0.0; + for (int i = 0; i <= n; i++) + acc += d; + return acc; + + which mishandles the case where 'd' is -0. They also initialised 'acc' + to a zero int rather than a zero double. */ + +double func (double d, int n) +{ + if (n == 0) + return d; + else + return d + func (d, n - 1); +} + +int main () +{ + if (__builtin_copysign (1.0, func (0.0 / -5.0, 10)) != -1.0) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/pr28634.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/pr28634.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/pr28634.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/pr28634.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,15 @@ +/* PR rtl-optimization/28634. 
On targets with delayed branches, + dbr_schedule could do the next iteration's addition in the + branch delay slot, then subtract the value again if the branch + wasn't taken. This can lead to rounding errors. */ +double x = -0x1.0p53; +double y = 1; +int +main (void) +{ + while (y > 0) + y += x; + if (y != x + 1) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/pr29302-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/pr29302-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/pr29302-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/pr29302-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,16 @@ +extern void abort (void); + +int main (void) +{ + int n; + long double x; + + x = 1/0.0; + + n = (x == 1/0.0); + + if (n == 1) + return 0; + else + abort (); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/pr29302-1.x URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/pr29302-1.x?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/pr29302-1.x (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/pr29302-1.x Wed Oct 9 04:01:46 2019 @@ -0,0 +1,12 @@ +if { [istarget "tic6x-*-*"] && [check_effective_target_ti_c67x] } { + # C6X uses -freciprocal-math by default. + set torture_execute_xfail "tic6x-*-*" + return 1 +} +return 0 +if { [istarget "tic6x-*-*"] && [check_effective_target_ti_c67x] } { + # C6X uses -freciprocal-math by default. 
+ set torture_execute_xfail "tic6x-*-*" + return 1 +} +return 0 Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/pr30704.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/pr30704.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/pr30704.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/pr30704.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,55 @@ +/* PR middle-end/30704 */ + +typedef __SIZE_TYPE__ size_t; +extern void abort (void); +extern int memcmp (const void *, const void *, size_t); +extern void *memcpy (void *, const void *, size_t); + +long long +f1 (void) +{ + long long t; + double d = 0x0.fffffffffffff000p-1022; + memcpy (&t, &d, sizeof (long long)); + return t; +} + +double +f2 (void) +{ + long long t = 0x000fedcba9876543LL; + double d; + memcpy (&d, &t, sizeof (long long)); + return d; +} + +int +main () +{ + union + { + long long ll; + double d; + } u; + + if (sizeof (long long) != sizeof (double) || __DBL_MIN_EXP__ != -1021) + return 0; + + u.ll = f1 (); + if (u.d != 0x0.fffffffffffff000p-1022) + abort (); + + u.d = f2 (); + if (u.ll != 0x000fedcba9876543LL) + abort (); + + double b = 234.0; + long long c; + double d = b; + memcpy (&c, &b, sizeof (double)); + long long e = c; + if (memcmp (&e, &d, sizeof (double)) != 0) + abort (); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/pr30704.x URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/pr30704.x?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/pr30704.x (added) +++ 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/pr30704.x Wed Oct 9 04:01:46 2019 @@ -0,0 +1,5 @@ +if [istarget "avr-*-*"] { + # AVR doubles are floats + return 1 +} +return 0 Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/pr36332.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/pr36332.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/pr36332.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/pr36332.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,15 @@ +/* PR target/36332 */ + +int +foo (long double ld) +{ + return ld == __builtin_infl (); +} + +int +main () +{ + if (foo (__LDBL_MAX__)) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/pr38016.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/pr38016.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/pr38016.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/pr38016.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1 @@ +#include "fp-cmp-8.c" Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/pr38016.x URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/pr38016.x?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/pr38016.x (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/pr38016.x Wed Oct 9 04:01:46 2019 @@ -0,0 +1,2 @@ +lappend additional_flags 
"-fno-ivopts" "-fno-gcse" +return 0 Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/pr50310.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/pr50310.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/pr50310.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/pr50310.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,73 @@ +/* PR target/50310 */ + +extern void abort (void); +double s1[4], s2[4], s3[64]; + +void +foo (void) +{ + int i; + for (i = 0; i < 4; i++) + s3[0 * 4 + i] = __builtin_isgreater (s1[i], s2[i]) ? -1.0 : 0.0; + for (i = 0; i < 4; i++) + s3[1 * 4 + i] = (!__builtin_isgreater (s1[i], s2[i])) ? -1.0 : 0.0; + for (i = 0; i < 4; i++) + s3[2 * 4 + i] = __builtin_isgreaterequal (s1[i], s2[i]) ? -1.0 : 0.0; + for (i = 0; i < 4; i++) + s3[3 * 4 + i] = (!__builtin_isgreaterequal (s1[i], s2[i])) ? -1.0 : 0.0; + for (i = 0; i < 4; i++) + s3[4 * 4 + i] = __builtin_isless (s1[i], s2[i]) ? -1.0 : 0.0; + for (i = 0; i < 4; i++) + s3[5 * 4 + i] = (!__builtin_isless (s1[i], s2[i])) ? -1.0 : 0.0; + for (i = 0; i < 4; i++) + s3[6 * 4 + i] = __builtin_islessequal (s1[i], s2[i]) ? -1.0 : 0.0; + for (i = 0; i < 4; i++) + s3[7 * 4 + i] = (!__builtin_islessequal (s1[i], s2[i])) ? -1.0 : 0.0; + for (i = 0; i < 4; i++) + s3[8 * 4 + i] = __builtin_islessgreater (s1[i], s2[i]) ? -1.0 : 0.0; + for (i = 0; i < 4; i++) + s3[9 * 4 + i] = (!__builtin_islessgreater (s1[i], s2[i])) ? -1.0 : 0.0; + for (i = 0; i < 4; i++) + s3[10 * 4 + i] = __builtin_isunordered (s1[i], s2[i]) ? -1.0 : 0.0; + for (i = 0; i < 4; i++) + s3[11 * 4 + i] = (!__builtin_isunordered (s1[i], s2[i])) ? -1.0 : 0.0; + for (i = 0; i < 4; i++) + s3[12 * 4 + i] = s1[i] > s2[i] ? -1.0 : 0.0; + for (i = 0; i < 4; i++) + s3[13 * 4 + i] = s1[i] <= s2[i] ? 
-1.0 : 0.0; + for (i = 0; i < 4; i++) + s3[14 * 4 + i] = s1[i] < s2[i] ? -1.0 : 0.0; + for (i = 0; i < 4; i++) + s3[15 * 4 + i] = s1[i] >= s2[i] ? -1.0 : 0.0; +} + +int +main () +{ + int i; + s1[0] = 5.0; + s1[1] = 6.0; + s1[2] = 5.0; + s1[3] = __builtin_nan (""); + s2[0] = 6.0; + s2[1] = 5.0; + s2[2] = 5.0; + s2[3] = 5.0; + asm volatile ("" : : : "memory"); + foo (); + asm volatile ("" : : : "memory"); + for (i = 0; i < 16 * 4; i++) + if (i >= 12 * 4 && (i & 3) == 3) + { + if (s3[i] != 0.0) abort (); + } + else + { + static int masks[] = { 2, 2|4, 1, 1|4, 1|2, 8, 2, 1 }; + if (s3[i] + != (((1 << (i & 3)) & ((i & 4) ? ~masks[i / 8] : masks[i / 8])) + ? -1.0 : 0.0)) + abort (); + } + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/pr67218.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/pr67218.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/pr67218.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/pr67218.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,15 @@ +extern void abort (void) __attribute__ ((noreturn)); + +double __attribute__ ((noinline, noclone)) +foo (unsigned int x) +{ + return (double) (float) (x | 0xffff0000); +} + +int +main () +{ + if (foo (1) != 0x1.fffep31) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/pr72824-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/pr72824-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/pr72824-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/pr72824-2.c Wed Oct 9 04:01:46 2019 @@ 
-0,0 +1,21 @@ +/* PR tree-optimization/72824 */ + +typedef float V __attribute__((vector_size (4 * sizeof (float)))); + +static inline void +foo (V *x, V value) +{ + int i; + for (i = 0; i < 32; ++i) + x[i] = value; +} + +int +main () +{ + V x[32]; + foo (x, (V) { 0.f, -0.f, 0.f, -0.f }); + if (__builtin_copysignf (1.0, x[3][1]) != -1.0f) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/pr72824.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/pr72824.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/pr72824.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/pr72824.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,19 @@ +/* PR tree-optimization/72824 */ + +static inline void +foo (float *x, float value) +{ + int i; + for (i = 0; i < 32; ++i) + x[i] = value; +} + +int +main () +{ + float x[32]; + foo (x, -0.f); + if (__builtin_copysignf (1.0, x[3]) != -1.0f) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/pr84235.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/pr84235.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/pr84235.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/pr84235.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,11 @@ +/* PR tree-optimization/84235 */ + +int +main () +{ + double d = 1.0 / 0.0; + _Bool b = d == d && (d - d) != (d - d); + if (!b) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/rbug.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/rbug.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/rbug.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/rbug.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,54 @@ +#if defined(__i386__) && defined(__FreeBSD__) +#include +#endif + +double d (unsigned long long k) +{ + double x; + + x = (double) k; + return x; +} + +float s (unsigned long long k) +{ + float x; + + x = (float) k; + return x; +} + +main () +{ + unsigned long long int k; + double x; + +#if defined(__i386__) && defined(__FreeBSD__) + /* This test case assumes extended-precision, but FreeBSD defaults to + double-precision. Make it so. */ + fpsetprec (FP_PE); +#endif + + if (sizeof (double) >= 8) + { + k = 0x8693ba6d7d220401ULL; + x = d (k); + k = (unsigned long long) x; + if (k != 0x8693ba6d7d220800ULL) + abort (); + } + + k = 0x8234508000000001ULL; + x = s (k); + k = (unsigned long long) x; +#ifdef __SPU__ + /* SPU float rounds towards zero. */ + if (k != 0x8234500000000000ULL) + abort (); +#else + if (k != 0x8234510000000000ULL) + abort (); +#endif + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/rbug.x URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/rbug.x?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/rbug.x (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/rbug.x Wed Oct 9 04:01:46 2019 @@ -0,0 +1,10 @@ +# This doesn't work on d10v if doubles are not 64 bits + +if { [istarget "d10v-*-*"] && ! 
[string-match "*-mdouble64*" $CFLAGS] } { + set torture_execute_xfail "d10v-*-*" +} +if [istarget "avr-*-*"] { + # AVR doubles are floats + return 1 +} +return 0 Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/unsafe-fp-assoc-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/unsafe-fp-assoc-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/unsafe-fp-assoc-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/unsafe-fp-assoc-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,39 @@ +extern void abort(); + +typedef union { + struct { + unsigned int hi; + unsigned int lo; + } i; + double d; +} hexdouble; + +static const double twoTo52 = 0x1.0p+52; + +void func ( double x ) +{ + hexdouble argument; + register double y, z; + unsigned int xHead; + argument.d = x; + xHead = argument.i.hi & 0x7fffffff; + if (__builtin_expect(!!(xHead < 0x43300000u), 1)) + { + y = ( x - twoTo52 ) + twoTo52; + if ( y != x ) + abort(); + z = x - 0.5; + y = ( z - twoTo52 ) + twoTo52; + if ( y == (( x - twoTo52 ) + twoTo52) ) + abort(); + } + return; +} + +int main() +{ + if (sizeof (double) == 4) + return 0; + func((double)1.00); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/unsafe-fp-assoc-1.x URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/unsafe-fp-assoc-1.x?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/unsafe-fp-assoc-1.x (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/unsafe-fp-assoc-1.x Wed Oct 9 04:01:46 2019 @@ -0,0 +1,5 @@ +if [istarget "avr-*-*"] { + # AVR doubles are floats + 
return 1 +} +return 0 Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/unsafe-fp-assoc.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/unsafe-fp-assoc.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/unsafe-fp-assoc.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ieee/unsafe-fp-assoc.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,19 @@ +#include <float.h> + +extern void abort(void); + +static const double C = DBL_MAX; + +double foo(double x) +{ + return ( ( (x * C) * C ) * C); +} + +int main () +{ + double d = foo (0.0); + if (d != 0.0) + abort (); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ifcvt-onecmpl-abs-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ifcvt-onecmpl-abs-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ifcvt-onecmpl-abs-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ifcvt-onecmpl-abs-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,19 @@ + +extern void abort(void); + +__attribute__ ((noinline)) +int foo(int n) +{ + if (n < 0) + n = ~n; + + return n; +} + +int main(void) +{ + if (foo (-1) != 0) + abort (); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/index-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/index-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/index-1.c (added) +++ 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/index-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,20 @@ +int a[] = +{ + 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, + 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, + 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, + 30, 31, 32, 33, 34, 35, 36, 37, 38, 39 +}; + +int +f (long n) +{ + return a[n - 100000]; +} + +main () +{ + if (f (100030L) != 30) + abort(); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/inst-check.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/inst-check.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/inst-check.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/inst-check.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,14 @@ +#include + +f(m) +{ + int i,s=0; + for(i=0;i + +gt (a, b) +{ + return a > b; +} + +ge (a, b) +{ + return a >= b; +} + +lt (a, b) +{ + return a < b; +} + +le (a, b) +{ + return a <= b; +} + +void +true (c) +{ + if (!c) + abort(); +} + +void +false (c) +{ + if (c) + abort(); +} + +f () +{ + true (gt (2, 1)); + false (gt (1, 2)); + + true (gt (INT_MAX, 0)); + false (gt (0, INT_MAX)); + true (gt (INT_MAX, 1)); + false (gt (1, INT_MAX)); + + false (gt (INT_MIN, 0)); + true (gt (0, INT_MIN)); + false (gt (INT_MIN, 1)); + true (gt (1, INT_MIN)); + + true (gt (INT_MAX, INT_MIN)); + false (gt (INT_MIN, INT_MAX)); + + true (ge (2, 1)); + false (ge (1, 2)); + + true (ge (INT_MAX, 0)); + false (ge (0, INT_MAX)); + true (ge (INT_MAX, 1)); + false (ge (1, INT_MAX)); + + false (ge (INT_MIN, 0)); + true (ge (0, INT_MIN)); + false (ge (INT_MIN, 1)); + true (ge (1, INT_MIN)); + + true (ge (INT_MAX, INT_MIN)); + false (ge (INT_MIN, INT_MAX)); + + false (lt (2, 1)); + true (lt (1, 2)); + + false (lt (INT_MAX, 0)); + true (lt (0, INT_MAX)); + false (lt (INT_MAX, 1)); + true (lt 
(1, INT_MAX)); + + true (lt (INT_MIN, 0)); + false (lt (0, INT_MIN)); + true (lt (INT_MIN, 1)); + false (lt (1, INT_MIN)); + + false (lt (INT_MAX, INT_MIN)); + true (lt (INT_MIN, INT_MAX)); + + false (le (2, 1)); + true (le (1, 2)); + + false (le (INT_MAX, 0)); + true (le (0, INT_MAX)); + false (le (INT_MAX, 1)); + true (le (1, INT_MAX)); + + true (le (INT_MIN, 0)); + false (le (0, INT_MIN)); + true (le (INT_MIN, 1)); + false (le (1, INT_MIN)); + + false (le (INT_MAX, INT_MIN)); + true (le (INT_MIN, INT_MAX)); +} + +main () +{ + f (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ipa-sra-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ipa-sra-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ipa-sra-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ipa-sra-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,29 @@ +/* Trivially making sure IPA-SRA does not introduce segfaults where they should + not be. 
*/ + +struct bovid +{ + float red; + int green; + void *blue; +}; + +static int +__attribute__((noinline)) +ox (int fail, struct bovid *cow) +{ + int r; + if (fail) + r = cow->red; + else + r = 0; + return r; +} + +int main (int argc, char *argv[]) +{ + int r; + + r = ox ((argc > 2000), (void *) 0); + return r; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ipa-sra-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ipa-sra-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ipa-sra-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ipa-sra-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,41 @@ +/* { dg-require-effective-target int32plus } */ +struct big +{ + int data[1000000]; +}; + +struct small +{ + int data[10]; +}; + +union both +{ + struct big big; + struct small small; +}; + +extern void *calloc (__SIZE_TYPE__, __SIZE_TYPE__); +extern void free (void *); + +static int __attribute__((noinline)) +foo (int fail, union both *agg) +{ + int r; + if (fail) + r = agg->big.data[999999]; + else + r = agg->small.data[0]; + return r; +} + +int main (int argc, char *argv[]) +{ + union both *agg = calloc (1, sizeof (struct small)); + int r; + + r = foo ((argc > 2000), agg); + + free (agg); + return r; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/longlong.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/longlong.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/longlong.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/longlong.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,39 @@ +/* Source: PR 321 modified for test suite by 
Neil Booth 14 Jan 2001. */ + +typedef unsigned long long uint64; +unsigned long pars; + +uint64 b[32]; +uint64 *r = b; + +void alpha_ep_extbl_i_eq_0() +{ + unsigned int rb, ra, rc; + + rb = (((unsigned long)(pars) >> 27)) & 0x1fUL; + ra = (((unsigned int)(pars) >> 5)) & 0x1fUL; + rc = (((unsigned int)(pars) >> 0)) & 0x1fUL; + { + uint64 temp = ((r[ra] >> ((r[rb] & 0x7) << 3)) & 0x00000000000000FFLL); + if (rc != 31) + r[rc] = temp; + } +} + +int +main(void) +{ + if (sizeof (uint64) == 8) + { + b[17] = 0x0000000000303882ULL; /* rb */ + b[2] = 0x534f4f4c494d000aULL; /* ra & rc */ + + pars = 0x88000042; /* 17, 2, 2 coded */ + alpha_ep_extbl_i_eq_0(); + + if (b[2] != 0x4d) + abort (); + } + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,18 @@ +main () +{ + int i, j, k[3]; + + j = 0; + for (i=0; i < 3; i++) + { + k[i] = j++; + } + + for (i=2; i >= 0; i--) + { + if (k[i] != i) + abort (); + } + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-10.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-10.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-10.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-10.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,31 @@ +/* Reduced from PR optimization/5076, PR optimization/2847 */ + +static int count = 0; + +static void +inc (void) 
+{ + count++; +} + +int +main (void) +{ + int iNbr = 1; + int test = 0; + while (test == 0) + { + inc (); + if (iNbr == 0) + break; + else + { + inc (); + iNbr--; + } + test = 1; + } + if (count != 2) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-11.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-11.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-11.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-11.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,20 @@ +static int a[199]; + +static void +foo () +{ + int i; + for (i = 198; i >= 0; i--) + a[i] = i; +} + +int +main () +{ + int i; + foo (); + for (i = 0; i < 199; i++) + if (a[i] != i) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-12.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-12.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-12.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-12.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,25 @@ +/* Checks that pure functions are not treated as const. */ + +char *p; + +static int __attribute__ ((pure)) +is_end_of_statement (void) +{ + return *p == '\n' || *p == ';' || *p == '!'; +} + +void foo (void) +{ + /* The is_end_of_statement call was moved out of the loop at one stage, + resulting in an endless loop. 
*/ + while (!is_end_of_statement ()) + p++; +} + +int +main (void) +{ + p = "abc\n"; + foo (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-13.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-13.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-13.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-13.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,36 @@ +/* PR opt/7130 */ +#define TYPE long + +void +scale (TYPE *alpha, TYPE *x, int n) +{ + int i, ix; + + if (*alpha != 1) + for (i = 0, ix = 0; i < n; i++, ix += 2) + { + TYPE tmpr, tmpi; + tmpr = *alpha * x[ix]; + tmpi = *alpha * x[ix + 1]; + x[ix] = tmpr; + x[ix + 1] = tmpi; + } +} + +int +main (void) +{ + int i; + TYPE x[10]; + TYPE alpha = 2; + + for (i = 0; i < 10; i++) + x[i] = i; + + scale (&alpha, x, 5); + + if (x[9] != 18) + abort (); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-14.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-14.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-14.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-14.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,20 @@ +int a3[3]; + +void f(int *a) +{ + int i; + + for (i=3; --i;) + a[i] = 42 / i; +} + +int +main () +{ + f(a3); + + if (a3[1] != 42 || a3[2] != 21) + abort (); + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-15.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-15.c?rev=374156&view=auto 
============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-15.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-15.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,40 @@ +/* Bombed with a segfault on powerpc-linux. doloop.c generated wrong + loop count. */ +void +foo (unsigned long *start, unsigned long *end) +{ + unsigned long *temp = end - 1; + + while (end > start) + *end-- = *temp--; +} + +int +main (void) +{ + unsigned long a[5]; + int start, end, k; + + for (start = 0; start < 5; start++) + for (end = 0; end < 5; end++) + { + for (k = 0; k < 5; k++) + a[k] = k; + + foo (a + start, a + end); + + for (k = 0; k <= start; k++) + if (a[k] != k) + abort (); + + for (k = start + 1; k <= end; k++) + if (a[k] != k - 1) + abort (); + + for (k = end + 1; k < 5; k++) + if (a[k] != k) + abort (); + } + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,17 @@ +int a[2]; + +f (b) +{ + unsigned int i; + for (i = 0; i < b; i++) + a[i] = i - 2; +} + +main () +{ + a[0] = a[1] = 0; + f (2); + if (a[0] != -2 || a[1] != -1) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-2b.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-2b.c?rev=374156&view=auto ============================================================================== --- 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-2b.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-2b.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,22 @@ +#include <limits.h> + +int a[2]; + +f (int i) +{ + for (; i < INT_MAX; i++) + { + a[i] = -2; + if (&a[i] == &a[1]) + break; + } +} + +main () +{ + a[0] = a[1] = 0; + f (0); + if (a[0] != -2 || a[1] != -2) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-2c.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-2c.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-2c.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-2c.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,30 @@ +/* { dg-options "-fgnu89-inline -Wno-pointer-to-int-cast" } */ + +extern void abort (void); +extern void exit (int); + +int a[2]; + +__inline__ void f (int b, int o) +{ + unsigned int i; + int *p; + for (p = &a[b], i = b; --i < ~0; ) + *--p = i * 3 + o; +} + +void +g(int b) +{ + f (b, (int)a); +} + +int +main () +{ + a[0] = a[1] = 0; + g (2); + if (a[0] != (int)a || a[1] != (int)a + 3) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-2d.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-2d.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-2d.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-2d.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,18 @@ +int a[2]; + +f (b) +{ + unsigned int i; + int *p; + for (p = &a[b], i = b; --i < ~0; ) + *--p = i * 3 + (int)a; +} + +main () +{ + a[0] 
= a[1] = 
0; + f (2); + if (a[0] != (int)a || a[1] != (int)a + 3) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-2e.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-2e.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-2e.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-2e.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,39 @@ +void f (int *p, int **q) +{ + int i; + for (i = 0; i < 40; i++) + { + *q++ = &p[i]; + } +} + +int main () +{ + void *p; + int *q[40]; + __SIZE_TYPE__ start; + + /* Find the signed middle of the address space. */ + if (sizeof(start) == sizeof(int)) + start = (__SIZE_TYPE__) __INT_MAX__; + else if (sizeof(start) == sizeof(long)) + start = (__SIZE_TYPE__) __LONG_MAX__; + else if (sizeof(start) == sizeof(long long)) + start = (__SIZE_TYPE__) __LONG_LONG_MAX__; + else + return 0; + + /* Arbitrarily align the pointer. */ + start &= -32; + + /* Pretend that's good enough to start address arithmetic. */ + p = (void *)start; + + /* Verify that GIV replacement computes the correct results. 
*/ + q[39] = 0; + f (p, q); + if (q[39] != (int *)p + 39) + abort (); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-2f.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-2f.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-2f.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-2f.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,64 @@ +/* { dg-require-effective-target mmap } */ +/* { dg-skip-if "the executable is at the same position the test tries to remap" { m68k-*-linux* } } */ + +#include + +#include +#include +#include +#include +#ifndef MAP_ANON +#ifdef MAP_ANONYMOUS +#define MAP_ANON MAP_ANONYMOUS +#else +#define MAP_ANON MAP_FILE +#endif +#endif +#ifndef MAP_FILE +#define MAP_FILE 0 +#endif +#ifndef MAP_FIXED +#define MAP_FIXED 0 +#endif + +#define MAP_START (void *)0x7fff8000 +#define MAP_LEN 0x10000 + +#define OFFSET (MAP_LEN/2 - 2 * sizeof (char)); + +f (int s, char *p) +{ + int i; + for (i = s; i >= 0 && &p[i] < &p[40]; i++) + { + p[i] = -2; + } +} + +main () +{ +#ifdef MAP_ANON + char *p; + int dev_zero; + + dev_zero = open ("/dev/zero", O_RDONLY); + /* -1 is OK when we have MAP_ANON; else mmap will flag an error. 
*/ + if (INT_MAX != 0x7fffffffL || sizeof (char *) != sizeof (int)) + exit (0); + p = mmap (MAP_START, MAP_LEN, PROT_READ|PROT_WRITE, + MAP_ANON|MAP_FIXED|MAP_PRIVATE, dev_zero, 0); + if (p != (char *)-1) + { + p += OFFSET; + p[39] = 0; + f (0, p); + if (p[39] != (char)-2) + abort (); + p[39] = 0; + f (-1, p); + if (p[39] != 0) + abort (); + } +#endif + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-2g.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-2g.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-2g.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-2g.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,64 @@ +/* { dg-require-effective-target mmap } */ +/* { dg-skip-if "the executable is at the same position the test tries to remap" { m68k-*-linux* } } */ + +#include + +#include +#include +#include +#include +#ifndef MAP_ANON +#ifdef MAP_ANONYMOUS +#define MAP_ANON MAP_ANONYMOUS +#else +#define MAP_ANON MAP_FILE +#endif +#endif +#ifndef MAP_FILE +#define MAP_FILE 0 +#endif +#ifndef MAP_FIXED +#define MAP_FIXED 0 +#endif + +#define MAP_START (void *)0x7fff8000 +#define MAP_LEN 0x10000 + +#define OFFSET (MAP_LEN/2 - 2 * sizeof (char)); + +f (int s, char *p) +{ + int i; + for (i = s; &p[i] < &p[40] && i >= 0; i++) + { + p[i] = -2; + } +} + +main () +{ +#ifdef MAP_ANON + char *p; + int dev_zero; + + dev_zero = open ("/dev/zero", O_RDONLY); + /* -1 is OK when we have MAP_ANON; else mmap will flag an error. 
*/ + if (INT_MAX != 0x7fffffffL || sizeof (char *) != sizeof (int)) + exit (0); + p = mmap (MAP_START, MAP_LEN, PROT_READ|PROT_WRITE, + MAP_ANON|MAP_FIXED|MAP_PRIVATE, dev_zero, 0); + if (p != (char *)-1) + { + p += OFFSET; + p[39] = 0; + f (0, p); + if (p[39] != (char)-2) + abort (); + p[39] = 0; + f (-1, p); + if (p[39] != 0) + abort (); + } +#endif + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-3.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-3.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-3.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-3.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,27 @@ +#include <limits.h> + +int n = 0; + +g (i) +{ + n++; +} + +f (m) +{ + int i; + i = m; + do + { + g (i * INT_MAX / 2); + } + while (--i > 0); +} + +main () +{ + f (4); + if (n != 4) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-3b.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-3b.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-3b.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-3b.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,28 @@ +#include <limits.h> + +int n = 0; + +g (i) +{ + n++; +} + +f (m) +{ + int i; + i = m; + do + { + g (i * 4); + i -= INT_MAX / 8; + } + while (i > 0); +} + +main () +{ + f (INT_MAX/8*4); + if (n != 4) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-3c.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-3c.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-3c.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-3c.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,31 @@ +#include <limits.h> + +void * a[255]; + +f (m) +{ + int i; + int sh = 0x100; + i = m; + do + { + a[sh >>= 1] = ((unsigned)i << 3) + (char*)a; + i += 4; + } + while (i < INT_MAX/2 + 1 + 4 * 4); +} + +main () +{ + a[0x10] = 0; + a[0x08] = 0; + f (INT_MAX/2 + INT_MAX/4 + 2); + if (a[0x10] || a[0x08]) + abort (); + a[0x10] = 0; + a[0x08] = 0; + f (INT_MAX/2 + 1); + if (! a[0x10] || a[0x08]) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-4.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-4.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-4.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-4.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,16 @@ +int +f() +{ + int j = 1; + long i; + for (i = -0x70000000L; i < 0x60000000L; i += 0x10000000L) j <<= 1; + return j; +} + +int +main () +{ + if (f () != 8192) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-4b.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-4b.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-4b.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-4b.c Wed Oct 9 
04:01:46 2019 
@@ -0,0 +1,21 @@ +int +f() +{ + int j = 1; + long i; + i = 0x60000000L; + do + { + j <<= 1; + i += 0x10000000L; + } while (i < -0x60000000L); + return j; +} + +int +main () +{ + if (f () != 2) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-5.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-5.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-5.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-5.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,36 @@ +static int ap(int i); +static void testit(void){ + int ir[4] = {0,1,2,3}; + int ix,n,m; + n=1; m=3; + for (ix=1;ix<=4;ix++) { + if (n == 1) m = 4; + else m = n-1; + ap(ir[n-1]); + n = m; + } +} + +static int t = 0; +static int a[4]; + +static int ap(int i){ + if (t > 3) + abort(); + a[t++] = i; + return 1; +} + +int main(void) +{ + testit(); + if (a[0] != 0) + abort(); + if (a[1] != 3) + abort(); + if (a[2] != 2) + abort(); + if (a[3] != 1) + abort(); + exit(0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-6.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-6.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-6.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-6.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,15 @@ +main() +{ + char c; + char d; + int nbits; + c = -1; + for (nbits = 1 ; nbits < 100; nbits++) { + d = (1 << nbits) - 1; + if (d == c) + break; + } + if (nbits == 100) + abort(); + exit(0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-7.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-7.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-7.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-7.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,19 @@ +void foo (unsigned int n) +{ + int i, j = -1; + + for (i = 0; i < 10 && j < 0; i++) + { + if ((1UL << i) == n) + j = i; + } + + if (j < 0) + abort (); +} + +main() +{ + foo (64); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-8.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-8.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-8.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-8.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,23 @@ +double a[3] = { 0.0, 1.0, 2.0 }; + +void bar (int x, double *y) +{ + if (x || *y != 1.0) + abort (); +} + +int main () +{ + double c; + int d; + for (d = 0; d < 3; d++) + { + c = a[d]; + if (c > 0.0) goto e; + } + bar(1, &c); + exit (1); +e: + bar(0, &c); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-9.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-9.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-9.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-9.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,21 @@ +/* Source: Neil Booth, from PR # 115. 
*/ + +int false() +{ + return 0; +} + +extern void abort (void); + +int main (int argc,char *argv[]) +{ + int count = 0; + + while (false() || count < -123) + ++count; + + if (count) + abort (); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-ivopts-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-ivopts-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-ivopts-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-ivopts-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,30 @@ +/* From PR 18977. */ +void foo(float * x); + +int main() +{ + float x[4]; + foo (x); + return 0; +} + +void foo (float *x) +{ + int i,j,k; + float temp; + static float t16[16]={1.,2.,3.,4.,5.,6.,7.,8.,9., + 10.,11.,12.,13.,14.,15.,16.}; + static float tmp[4]={0.,0.,0.,0.}; + + for (i=0; i<4; i++) { + k = 3 - i; + temp = t16[5*k]; + for(j=k+1; j<4; j++) { + tmp[k] = t16[k+ j*4] * temp; + } + } + x[0] = tmp[0]; + x[1] = tmp[1]; + x[2] = tmp[2]; + x[3] = tmp[3]; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-ivopts-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-ivopts-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-ivopts-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/loop-ivopts-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,49 @@ +/* PR rtl-optimization/20290 */ + +/* We used to mis-optimize the second loop in main on at least ppc and + arm, because tree loop would change the loop to something like: + + ivtmp.65 = &l[i]; + ivtmp.16 = 113; + goto (); + +:; + *(ivtmp.65 + 4294967292B) = 
9; + i = i + 1; + +:; + ivtmp.16 = ivtmp.16 - 1; + ivtmp.65 = ivtmp.65 + 4B; + if (ivtmp.16 != 0) goto ; + + We used to consider the increment of i as executed in every + iteration, so we'd miscompute the final value. */ + +extern void abort (void); + +void +check (unsigned int *l) +{ + int i; + for (i = 0; i < 288; i++) + if (l[i] != 7 + (i < 256 || i >= 280) + (i >= 144 && i < 256)) + abort (); +} + +int +main (void) +{ + int i; + unsigned int l[288]; + + for (i = 0; i < 144; i++) + l[i] = 8; + for (; i < 256; i++) + l[i] = 9; + for (; i < 280; i++) + l[i] = 7; + for (; i < 288; i++) + l[i] = 8; + check (l); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/lshrdi-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/lshrdi-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/lshrdi-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/lshrdi-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,221 @@ +#include + +extern void abort(void); +extern void exit(int); + +#if __LONG_LONG_MAX__ == 9223372036854775807LL +#define BITS 64 + +static unsigned long long const zext[64] = { + 0x87654321fedcba90ULL, + 0x43b2a190ff6e5d48ULL, + 0x21d950c87fb72ea4ULL, + 0x10eca8643fdb9752ULL, + 0x87654321fedcba9ULL, + 0x43b2a190ff6e5d4ULL, + 0x21d950c87fb72eaULL, + 0x10eca8643fdb975ULL, + 0x87654321fedcbaULL, + 0x43b2a190ff6e5dULL, + 0x21d950c87fb72eULL, + 0x10eca8643fdb97ULL, + 0x87654321fedcbULL, + 0x43b2a190ff6e5ULL, + 0x21d950c87fb72ULL, + 0x10eca8643fdb9ULL, + 0x87654321fedcULL, + 0x43b2a190ff6eULL, + 0x21d950c87fb7ULL, + 0x10eca8643fdbULL, + 0x87654321fedULL, + 0x43b2a190ff6ULL, + 0x21d950c87fbULL, + 0x10eca8643fdULL, + 0x87654321feULL, + 0x43b2a190ffULL, + 0x21d950c87fULL, + 0x10eca8643fULL, + 0x87654321fULL, + 0x43b2a190fULL, + 0x21d950c87ULL, + 
0x10eca8643ULL, + 0x87654321ULL, + 0x43b2a190ULL, + 0x21d950c8ULL, + 0x10eca864ULL, + 0x8765432ULL, + 0x43b2a19ULL, + 0x21d950cULL, + 0x10eca86ULL, + 0x876543ULL, + 0x43b2a1ULL, + 0x21d950ULL, + 0x10eca8ULL, + 0x87654ULL, + 0x43b2aULL, + 0x21d95ULL, + 0x10ecaULL, + 0x8765ULL, + 0x43b2ULL, + 0x21d9ULL, + 0x10ecULL, + 0x876ULL, + 0x43bULL, + 0x21dULL, + 0x10eULL, + 0x87ULL, + 0x43ULL, + 0x21ULL, + 0x10ULL, + 0x8ULL, + 0x4ULL, + 0x2ULL, + 0x1ULL +}; + +#elif __LONG_LONG_MAX__ == 2147483647LL +#define BITS 32 + +static unsigned long long const zext[32] = { + 0x87654321ULL, + 0x43b2a190ULL, + 0x21d950c8ULL, + 0x10eca864ULL, + 0x8765432ULL, + 0x43b2a19ULL, + 0x21d950cULL, + 0x10eca86ULL, + 0x876543ULL, + 0x43b2a1ULL, + 0x21d950ULL, + 0x10eca8ULL, + 0x87654ULL, + 0x43b2aULL, + 0x21d95ULL, + 0x10ecaULL, + 0x8765ULL, + 0x43b2ULL, + 0x21d9ULL, + 0x10ecULL, + 0x876ULL, + 0x43bULL, + 0x21dULL, + 0x10eULL, + 0x87ULL, + 0x43ULL, + 0x21ULL, + 0x10ULL, + 0x8ULL, + 0x4ULL, + 0x2ULL, + 0x1ULL, +}; + +#else +#error "Update the test case." 
+#endif + +static unsigned long long +variable_shift(unsigned long long x, int i) +{ + return x >> i; +} + +static unsigned long long +constant_shift(unsigned long long x, int i) +{ + switch (i) + { + case 0: x = x >> 0; break; + case 1: x = x >> 1; break; + case 2: x = x >> 2; break; + case 3: x = x >> 3; break; + case 4: x = x >> 4; break; + case 5: x = x >> 5; break; + case 6: x = x >> 6; break; + case 7: x = x >> 7; break; + case 8: x = x >> 8; break; + case 9: x = x >> 9; break; + case 10: x = x >> 10; break; + case 11: x = x >> 11; break; + case 12: x = x >> 12; break; + case 13: x = x >> 13; break; + case 14: x = x >> 14; break; + case 15: x = x >> 15; break; + case 16: x = x >> 16; break; + case 17: x = x >> 17; break; + case 18: x = x >> 18; break; + case 19: x = x >> 19; break; + case 20: x = x >> 20; break; + case 21: x = x >> 21; break; + case 22: x = x >> 22; break; + case 23: x = x >> 23; break; + case 24: x = x >> 24; break; + case 25: x = x >> 25; break; + case 26: x = x >> 26; break; + case 27: x = x >> 27; break; + case 28: x = x >> 28; break; + case 29: x = x >> 29; break; + case 30: x = x >> 30; break; + case 31: x = x >> 31; break; +#if BITS > 32 + case 32: x = x >> 32; break; + case 33: x = x >> 33; break; + case 34: x = x >> 34; break; + case 35: x = x >> 35; break; + case 36: x = x >> 36; break; + case 37: x = x >> 37; break; + case 38: x = x >> 38; break; + case 39: x = x >> 39; break; + case 40: x = x >> 40; break; + case 41: x = x >> 41; break; + case 42: x = x >> 42; break; + case 43: x = x >> 43; break; + case 44: x = x >> 44; break; + case 45: x = x >> 45; break; + case 46: x = x >> 46; break; + case 47: x = x >> 47; break; + case 48: x = x >> 48; break; + case 49: x = x >> 49; break; + case 50: x = x >> 50; break; + case 51: x = x >> 51; break; + case 52: x = x >> 52; break; + case 53: x = x >> 53; break; + case 54: x = x >> 54; break; + case 55: x = x >> 55; break; + case 56: x = x >> 56; break; + case 57: x = x >> 57; break; + case 
58: x = x >> 58; break; + case 59: x = x >> 59; break; + case 60: x = x >> 60; break; + case 61: x = x >> 61; break; + case 62: x = x >> 62; break; + case 63: x = x >> 63; break; +#endif + + default: + abort (); + } + return x; +} + +int +main() +{ + int i; + + for (i = 0; i < BITS; ++i) + { + unsigned long long y = variable_shift (zext[0], i); + if (y != zext[i]) + abort (); + } + for (i = 0; i < BITS; ++i) + { + unsigned long long y = constant_shift (zext[0], i); + if (y != zext[i]) + abort (); + } + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/lto-tbaa-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/lto-tbaa-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/lto-tbaa-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/lto-tbaa-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,42 @@ +/* { dg-additional-options "-fno-early-inlining -fno-ipa-cp" } */ +struct a { + float *b; +} *a; +struct b { + int *b; +} b; +struct c { + float *b; +} *c; +int d; +use_a (struct a *a) +{ +} +set_b (int **a) +{ + *a=&d; +} +use_c (struct c *a) +{ +} +__attribute__ ((noinline)) int **retme(int **val) +{ + return val; +} +int e; +struct b b= {&e}; +struct b b2; +struct b b3; +int **ptr = &b2.b; +main () +{ + a= (void *)0; + b.b=&e; + ptr =retme ( &b.b); + set_b (ptr); + b3=b; + if (b3.b != &d) + __builtin_abort (); + c= (void *)0; + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/mayalias-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/mayalias-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/mayalias-1.c (added) +++ 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/mayalias-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,21 @@ +/* Tests that the may_alias attribute works as expected. + Author: Osku Salerma Apr 2002. */ + +extern void abort(void); +extern void exit(int); + +typedef short __attribute__((__may_alias__)) short_a; + +int +main (void) +{ + int a = 0x12345678; + short_a *b = (short_a*) &a; + + b[1] = 0; + + if (a == 0x12345678) + abort(); + + exit(0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/mayalias-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/mayalias-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/mayalias-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/mayalias-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,14 @@ +typedef struct __attribute__((__may_alias__)) { short x; } test; + +int f() { + int a=10; + test *p=(test *)&a; + p->x = 1; + return a; +} + +int main() { + if (f() == 10) + __builtin_abort(); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/mayalias-3.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/mayalias-3.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/mayalias-3.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/mayalias-3.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,24 @@ +typedef struct __attribute__((__may_alias__)) { short x; } test; + +test *p; + +int g(int *a) +{ + p = (test*)a; +} + +int f() +{ + int a; + g(&a); + a = 10; + test s={1}; + *p=s; + return a; +} + +int main() { + if (f() == 10) + __builtin_abort(); + return 0; +} Added: 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/medce-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/medce-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/medce-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/medce-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,32 @@ + +extern void abort (void); +extern void link_error (void); + +static int ok = 0; + +void bar (void) +{ + ok = 1; +} + +void foo(int x) +{ + switch (x) + { + case 0: + if (0) + { + link_error(); + case 1: + bar(); + } + } +} + +int main() +{ + foo (1); + if (!ok) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/memchr-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/memchr-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/memchr-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/memchr-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,207 @@ +/* PR tree-optimization/86711 - wrong folding of memchr + + Verify that memchr() of arrays initialized with string literals + where the nul doesn't fit in the array doesn't find the nul. */ +typedef __SIZE_TYPE__ size_t; +typedef __WCHAR_TYPE__ wchar_t; + +extern void* memchr (const void*, int, size_t); + +#define A(expr) \ + ((expr) \ + ? 
(void)0 \ + : (__builtin_printf ("assertion failed on line %i: %s\n", \ + __LINE__, #expr), \ + __builtin_abort ())) + +static const char c = '1'; +static const char s1[1] = "1"; +static const char s4[4] = "1234"; + +static const char s4_2[2][4] = { "1234", "5678" }; +static const char s5_3[3][5] = { "12345", "6789", "01234" }; + +volatile int v0 = 0; +volatile int v1 = 1; +volatile int v2 = 2; +volatile int v3 = 3; +volatile int v4 = 3; + +void test_narrow (void) +{ + int i0 = 0; + int i1 = i0 + 1; + int i2 = i1 + 1; + int i3 = i2 + 1; + int i4 = i3 + 1; + + A (memchr ("" + 1, 0, 0) == 0); + + A (memchr (&c, 0, sizeof c) == 0); + A (memchr (&c + 1, 0, sizeof c - 1) == 0); + A (memchr (&c + i1, 0, sizeof c - i1) == 0); + A (memchr (&c + v1, 0, sizeof c - v1) == 0); + + A (memchr (s1, 0, sizeof s1) == 0); + A (memchr (s1 + 1, 0, sizeof s1 - 1) == 0); + A (memchr (s1 + i1, 0, sizeof s1 - i1) == 0); + A (memchr (s1 + v1, 0, sizeof s1 - v1) == 0); + + A (memchr (&s1, 0, sizeof s1) == 0); + A (memchr (&s1 + 1, 0, sizeof s1 - 1) == 0); + A (memchr (&s1 + i1, 0, sizeof s1 - i1) == 0); + A (memchr (&s1 + v1, 0, sizeof s1 - v1) == 0); + + A (memchr (&s1[0], 0, sizeof s1) == 0); + A (memchr (&s1[0] + 1, 0, sizeof s1 - 1) == 0); + A (memchr (&s1[0] + i1, 0, sizeof s1 - i1) == 0); + A (memchr (&s1[0] + v1, 0, sizeof s1 - v1) == 0); + + A (memchr (&s1[i0], 0, sizeof s1) == 0); + A (memchr (&s1[i0] + 1, 0, sizeof s1 - 1) == 0); + A (memchr (&s1[i0] + i1, 0, sizeof s1 - i1) == 0); + A (memchr (&s1[i0] + v1, 0, sizeof s1 - v1) == 0); + + A (memchr (&s1[v0], 0, sizeof s1) == 0); + A (memchr (&s1[v0] + 1, 0, sizeof s1 - 1) == 0); + A (memchr (&s1[v0] + i1, 0, sizeof s1 - i1) == 0); + A (memchr (&s1[v0] + v1, 0, sizeof s1 - v1) == 0); + + + A (memchr (s4 + i0, 0, sizeof s4 - i0) == 0); + A (memchr (s4 + i1, 0, sizeof s4 - i1) == 0); + A (memchr (s4 + i2, 0, sizeof s4 - i2) == 0); + A (memchr (s4 + i3, 0, sizeof s4 - i3) == 0); + A (memchr (s4 + i4, 0, sizeof s4 - i4) == 0); + + A 
(memchr (s4 + v0, 0, sizeof s4 - v0) == 0); + A (memchr (s4 + v1, 0, sizeof s4 - v1) == 0); + A (memchr (s4 + v2, 0, sizeof s4 - v2) == 0); + A (memchr (s4 + v3, 0, sizeof s4 - v3) == 0); + A (memchr (s4 + v4, 0, sizeof s4 - v4) == 0); + + + A (memchr (s4_2, 0, sizeof s4_2) == 0); + + A (memchr (s4_2[0], 0, sizeof s4_2[0]) == 0); + A (memchr (s4_2[1], 0, sizeof s4_2[1]) == 0); + + A (memchr (s4_2[0] + 1, 0, sizeof s4_2[0] - 1) == 0); + A (memchr (s4_2[1] + 2, 0, sizeof s4_2[1] - 2) == 0); + A (memchr (s4_2[1] + 3, 0, sizeof s4_2[1] - 3) == 0); + + A (memchr (s4_2[v0], 0, sizeof s4_2[v0]) == 0); + A (memchr (s4_2[v0] + 1, 0, sizeof s4_2[v0] - 1) == 0); + + + /* The following calls must find the nul. */ + A (memchr ("", 0, 1) != 0); + A (memchr (s5_3, 0, sizeof s5_3) == &s5_3[1][4]); + + A (memchr (&s5_3[0][0] + i0, 0, sizeof s5_3 - i0) == &s5_3[1][4]); + A (memchr (&s5_3[0][0] + i1, 0, sizeof s5_3 - i1) == &s5_3[1][4]); + A (memchr (&s5_3[0][0] + i2, 0, sizeof s5_3 - i2) == &s5_3[1][4]); + A (memchr (&s5_3[0][0] + i4, 0, sizeof s5_3 - i4) == &s5_3[1][4]); + + A (memchr (&s5_3[1][i0], 0, sizeof s5_3[1] - i0) == &s5_3[1][4]); +} + +#if 4 == __WCHAR_WIDTH__ + +static const wchar_t wc = L'1'; +static const wchar_t ws1[] = L"1"; +static const wchar_t ws4[] = L"\x00123456\x12005678\x12340078\x12345600"; + +void test_wide (void) +{ + int i0 = 0; + int i1 = i0 + 1; + int i2 = i1 + 1; + int i3 = i2 + 1; + int i4 = i3 + 1; + + A (memchr (L"" + 1, 0, 0) == 0); + A (memchr (&wc + 1, 0, 0) == 0); + A (memchr (L"\x12345678", 0, sizeof (wchar_t)) == 0); + + const size_t nb = sizeof ws4; + const size_t nwb = sizeof (wchar_t); + + const char *pws1 = (const char*)ws1; + const char *pws4 = (const char*)ws4; + +#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__ + A (memchr (ws1, 0, sizeof ws1) == pws1 + 1); + + A (memchr (&ws4[0], 0, nb) == pws4 + 3); + A (memchr (&ws4[1], 0, nb - 1 * nwb) == pws4 + 1 * nwb + 2); + A (memchr (&ws4[2], 0, nb - 2 * nwb) == pws4 + 2 * nwb + 1); + A (memchr 
(&ws4[3], 0, nb - 3 * nwb) == pws4 + 3 * nwb + 0); +#else + A (memchr (ws1, 0, sizeof ws1) == pws1 + 0); + + A (memchr (&ws4[0], 0, nb) == pws4 + 0); + A (memchr (&ws4[1], 0, nb - 1 * nwb) == pws4 + 1 * nwb + 1); + A (memchr (&ws4[2], 0, nb - 2 * nwb) == pws4 + 2 * nwb + 2); + A (memchr (&ws4[3], 0, nb - 3 * nwb) == pws4 + 3 * nwb + 3); +#endif +} + +#elif 2 == __WCHAR_WIDTH__ + +static const wchar_t wc = L'1'; +static const wchar_t ws1[] = L"1"; +static const wchar_t ws2[2] = L"\x1234\x5678"; /* no terminating nul */ +static const wchar_t ws4[] = L"\x0012\x1200\x1234"; + +void test_wide (void) +{ + int i0 = 0; + int i1 = i0 + 1; + int i2 = i1 + 1; + + A (sizeof (wchar_t) == 2); + + A (memchr (L"" + 1, 0, 0) == 0); + A (memchr (&wc + 1, 0, 0) == 0); + A (memchr (L"\x1234", 0, sizeof (wchar_t)) == 0); + + A (memchr (L"" + i1, i0, i0) == 0); + A (memchr (&wc + i1, i0, i0) == 0); + A (memchr (L"\x1234", i0, sizeof (wchar_t)) == 0); + + A (memchr (ws2, 0, sizeof ws2) == 0); + A (memchr (ws2, i0, sizeof ws2) == 0); + + const size_t nb = sizeof ws4; + const size_t nwb = sizeof (wchar_t); + + const char *pws1 = (const char*)ws1; + const char *pws4 = (const char*)ws4; + +#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__ + A (memchr (ws1, i0, sizeof ws1) == pws1 + 1); + + A (memchr (&ws4[0], i0, nb) == pws4 + i1); + A (memchr (&ws4[1], i0, nb - i1 * nwb) == pws4 + i1 * nwb); + A (memchr (&ws4[2], i0, nb - i2 * nwb) == pws4 + i2 * nwb + i2); +#else + A (memchr (ws1, i0, sizeof ws1) == pws1 + 0); + + A (memchr (&ws4[0], i0, nb) == pws4 + 0); + A (memchr (&ws4[1], i0, nb - i1 * nwb) == pws4 + i1 * nwb + i1); + A (memchr (&ws4[2], i0, nb - i2 * nwb) == pws4 + i2 * nwb + i2); +#endif +} + +#else + +void test_wide (void) { } + +#endif + +int main () +{ + test_narrow (); + test_wide (); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/memcpy-1.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/memcpy-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/memcpy-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/memcpy-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,65 @@ +/* { dg-add-options stack_size } */ + +#include + +#if defined (STACK_SIZE) +#define MEMCPY_SIZE (STACK_SIZE / 3) +#else +#define MEMCPY_SIZE (1 << 17) +#endif + + +void *copy (void *o, const void *i, unsigned l) +{ + return memcpy (o, i, l); +} + +main () +{ + unsigned i; + unsigned char src[MEMCPY_SIZE]; + unsigned char dst[MEMCPY_SIZE]; + + for (i = 0; i < MEMCPY_SIZE; i++) + src[i] = (unsigned char) i, dst[i] = 0; + + (void) memcpy (dst, src, MEMCPY_SIZE / 128); + + for (i = 0; i < MEMCPY_SIZE / 128; i++) + if (dst[i] != (unsigned char) i) + abort (); + + (void) memset (dst, 1, MEMCPY_SIZE / 128); + + for (i = 0; i < MEMCPY_SIZE / 128; i++) + if (dst[i] != 1) + abort (); + + (void) memcpy (dst, src, MEMCPY_SIZE); + + for (i = 0; i < MEMCPY_SIZE; i++) + if (dst[i] != (unsigned char) i) + abort (); + + (void) memset (dst, 0, MEMCPY_SIZE); + + for (i = 0; i < MEMCPY_SIZE; i++) + if (dst[i] != 0) + abort (); + + (void) copy (dst, src, MEMCPY_SIZE / 128); + + for (i = 0; i < MEMCPY_SIZE / 128; i++) + if (dst[i] != (unsigned char) i) + abort (); + + (void) memset (dst, 0, MEMCPY_SIZE); + + (void) copy (dst, src, MEMCPY_SIZE); + + for (i = 0; i < MEMCPY_SIZE; i++) + if (dst[i] != (unsigned char) i) + abort (); + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/memcpy-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/memcpy-2.c?rev=374156&view=auto ============================================================================== --- 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/memcpy-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/memcpy-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,75 @@ +/* Copyright (C) 2002 Free Software Foundation. + + Test memcpy with various combinations of pointer alignments and lengths to + make sure any optimizations in the library are correct. + + Written by Michael Meissner, March 9, 2002. */ + +#include + +#ifndef MAX_OFFSET +#define MAX_OFFSET (sizeof (long long)) +#endif + +#ifndef MAX_COPY +#define MAX_COPY (10 * sizeof (long long)) +#endif + +#ifndef MAX_EXTRA +#define MAX_EXTRA (sizeof (long long)) +#endif + +#define MAX_LENGTH (MAX_OFFSET + MAX_COPY + MAX_EXTRA) + + +/* Use a sequence length that is not divisible by two, to make it more + likely to detect when words are mixed up. */ +#define SEQUENCE_LENGTH 31 + +static union { + char buf[MAX_LENGTH]; + long long align_int; + long double align_fp; +} u1, u2; + +main () +{ + int off1, off2, len, i; + char *p, *q, c; + + for (off1 = 0; off1 < MAX_OFFSET; off1++) + for (off2 = 0; off2 < MAX_OFFSET; off2++) + for (len = 1; len < MAX_COPY; len++) + { + for (i = 0, c = 'A'; i < MAX_LENGTH; i++, c++) + { + u1.buf[i] = 'a'; + if (c >= 'A' + SEQUENCE_LENGTH) + c = 'A'; + u2.buf[i] = c; + } + + p = memcpy (u1.buf + off1, u2.buf + off2, len); + if (p != u1.buf + off1) + abort (); + + q = u1.buf; + for (i = 0; i < off1; i++, q++) + if (*q != 'a') + abort (); + + for (i = 0, c = 'A' + off2; i < len; i++, q++, c++) + { + if (c >= 'A' + SEQUENCE_LENGTH) + c = 'A'; + if (*q != c) + abort (); + } + + for (i = 0; i < MAX_EXTRA; i++, q++) + if (*q != 'a') + abort (); + } + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/memcpy-bi.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/memcpy-bi.c?rev=374156&view=auto 
============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/memcpy-bi.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/memcpy-bi.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,53 @@ +/* Test builtin-memcpy (which may emit different code for different N). */ +#include + +#define TESTSIZE 80 + +char src[TESTSIZE] __attribute__ ((aligned)); +char dst[TESTSIZE] __attribute__ ((aligned)); + +void +check (char *test, char *match, int n) +{ + if (memcmp (test, match, n)) + abort (); +} + +#define TN(n) \ +{ memset (dst, 0, n); memcpy (dst, src, n); check (dst, src, n); } +#define T(n) \ +TN (n) \ +TN ((n) + 1) \ +TN ((n) + 2) \ +TN ((n) + 3) + +main () +{ + int i,j; + + for (i = 0; i < sizeof (src); ++i) + src[i] = 'a' + i % 26; + + T (0); + T (4); + T (8); + T (12); + T (16); + T (20); + T (24); + T (28); + T (32); + T (36); + T (40); + T (44); + T (48); + T (52); + T (56); + T (60); + T (64); + T (68); + T (72); + T (76); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/memset-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/memset-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/memset-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/memset-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,96 @@ +/* Copyright (C) 2002 Free Software Foundation. + + Test memset with various combinations of pointer alignments and lengths to + make sure any optimizations in the library are correct. + + Written by Michael Meissner, March 9, 2002. 
*/ + +#include + +#ifndef MAX_OFFSET +#define MAX_OFFSET (sizeof (long long)) +#endif + +#ifndef MAX_COPY +#define MAX_COPY (10 * sizeof (long long)) +#endif + +#ifndef MAX_EXTRA +#define MAX_EXTRA (sizeof (long long)) +#endif + +#define MAX_LENGTH (MAX_OFFSET + MAX_COPY + MAX_EXTRA) + +static union { + char buf[MAX_LENGTH]; + long long align_int; + long double align_fp; +} u; + +char A = 'A'; + +main () +{ + int off, len, i; + char *p, *q; + + for (off = 0; off < MAX_OFFSET; off++) + for (len = 1; len < MAX_COPY; len++) + { + for (i = 0; i < MAX_LENGTH; i++) + u.buf[i] = 'a'; + + p = memset (u.buf + off, '\0', len); + if (p != u.buf + off) + abort (); + + q = u.buf; + for (i = 0; i < off; i++, q++) + if (*q != 'a') + abort (); + + for (i = 0; i < len; i++, q++) + if (*q != '\0') + abort (); + + for (i = 0; i < MAX_EXTRA; i++, q++) + if (*q != 'a') + abort (); + + p = memset (u.buf + off, A, len); + if (p != u.buf + off) + abort (); + + q = u.buf; + for (i = 0; i < off; i++, q++) + if (*q != 'a') + abort (); + + for (i = 0; i < len; i++, q++) + if (*q != 'A') + abort (); + + for (i = 0; i < MAX_EXTRA; i++, q++) + if (*q != 'a') + abort (); + + p = memset (u.buf + off, 'B', len); + if (p != u.buf + off) + abort (); + + q = u.buf; + for (i = 0; i < off; i++, q++) + if (*q != 'a') + abort (); + + for (i = 0; i < len; i++, q++) + if (*q != 'B') + abort (); + + for (i = 0; i < MAX_EXTRA; i++, q++) + if (*q != 'a') + abort (); + } + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/memset-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/memset-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/memset-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/memset-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,333 @@ +/* Copyright (C) 2002 
Free Software Foundation. + + Test memset with various combinations of pointer alignments and constant + lengths to make sure any optimizations in the compiler are correct. + + Written by Roger Sayle, April 22, 2002. */ + +#ifndef MAX_OFFSET +#define MAX_OFFSET (sizeof (long long)) +#endif + +#ifndef MAX_COPY +#define MAX_COPY 15 +#endif + +#ifndef MAX_EXTRA +#define MAX_EXTRA (sizeof (long long)) +#endif + +#define MAX_LENGTH (MAX_OFFSET + MAX_COPY + MAX_EXTRA) + +static union { + char buf[MAX_LENGTH]; + long long align_int; + long double align_fp; +} u; + +char A = 'A'; + +void reset () +{ + int i; + + for (i = 0; i < MAX_LENGTH; i++) + u.buf[i] = 'a'; +} + +void check (int off, int len, int ch) +{ + char *q; + int i; + + q = u.buf; + for (i = 0; i < off; i++, q++) + if (*q != 'a') + abort (); + + for (i = 0; i < len; i++, q++) + if (*q != ch) + abort (); + + for (i = 0; i < MAX_EXTRA; i++, q++) + if (*q != 'a') + abort (); +} + +int main () +{ + int off; + char *p; + + /* len == 1 */ + for (off = 0; off < MAX_OFFSET; off++) + { + reset (); + + p = memset (u.buf + off, '\0', 1); + if (p != u.buf + off) abort (); + check (off, 1, '\0'); + + p = memset (u.buf + off, A, 1); + if (p != u.buf + off) abort (); + check (off, 1, 'A'); + + p = memset (u.buf + off, 'B', 1); + if (p != u.buf + off) abort (); + check (off, 1, 'B'); + } + + /* len == 2 */ + for (off = 0; off < MAX_OFFSET; off++) + { + reset (); + + p = memset (u.buf + off, '\0', 2); + if (p != u.buf + off) abort (); + check (off, 2, '\0'); + + p = memset (u.buf + off, A, 2); + if (p != u.buf + off) abort (); + check (off, 2, 'A'); + + p = memset (u.buf + off, 'B', 2); + if (p != u.buf + off) abort (); + check (off, 2, 'B'); + } + + /* len == 3 */ + for (off = 0; off < MAX_OFFSET; off++) + { + reset (); + + p = memset (u.buf + off, '\0', 3); + if (p != u.buf + off) abort (); + check (off, 3, '\0'); + + p = memset (u.buf + off, A, 3); + if (p != u.buf + off) abort (); + check (off, 3, 'A'); + + p = memset 
(u.buf + off, 'B', 3); + if (p != u.buf + off) abort (); + check (off, 3, 'B'); + } + + /* len == 4 */ + for (off = 0; off < MAX_OFFSET; off++) + { + reset (); + + p = memset (u.buf + off, '\0', 4); + if (p != u.buf + off) abort (); + check (off, 4, '\0'); + + p = memset (u.buf + off, A, 4); + if (p != u.buf + off) abort (); + check (off, 4, 'A'); + + p = memset (u.buf + off, 'B', 4); + if (p != u.buf + off) abort (); + check (off, 4, 'B'); + } + + /* len == 5 */ + for (off = 0; off < MAX_OFFSET; off++) + { + reset (); + + p = memset (u.buf + off, '\0', 5); + if (p != u.buf + off) abort (); + check (off, 5, '\0'); + + p = memset (u.buf + off, A, 5); + if (p != u.buf + off) abort (); + check (off, 5, 'A'); + + p = memset (u.buf + off, 'B', 5); + if (p != u.buf + off) abort (); + check (off, 5, 'B'); + } + + /* len == 6 */ + for (off = 0; off < MAX_OFFSET; off++) + { + reset (); + + p = memset (u.buf + off, '\0', 6); + if (p != u.buf + off) abort (); + check (off, 6, '\0'); + + p = memset (u.buf + off, A, 6); + if (p != u.buf + off) abort (); + check (off, 6, 'A'); + + p = memset (u.buf + off, 'B', 6); + if (p != u.buf + off) abort (); + check (off, 6, 'B'); + } + + /* len == 7 */ + for (off = 0; off < MAX_OFFSET; off++) + { + reset (); + + p = memset (u.buf + off, '\0', 7); + if (p != u.buf + off) abort (); + check (off, 7, '\0'); + + p = memset (u.buf + off, A, 7); + if (p != u.buf + off) abort (); + check (off, 7, 'A'); + + p = memset (u.buf + off, 'B', 7); + if (p != u.buf + off) abort (); + check (off, 7, 'B'); + } + + /* len == 8 */ + for (off = 0; off < MAX_OFFSET; off++) + { + reset (); + + p = memset (u.buf + off, '\0', 8); + if (p != u.buf + off) abort (); + check (off, 8, '\0'); + + p = memset (u.buf + off, A, 8); + if (p != u.buf + off) abort (); + check (off, 8, 'A'); + + p = memset (u.buf + off, 'B', 8); + if (p != u.buf + off) abort (); + check (off, 8, 'B'); + } + + /* len == 9 */ + for (off = 0; off < MAX_OFFSET; off++) + { + reset (); + + p = memset 
(u.buf + off, '\0', 9); + if (p != u.buf + off) abort (); + check (off, 9, '\0'); + + p = memset (u.buf + off, A, 9); + if (p != u.buf + off) abort (); + check (off, 9, 'A'); + + p = memset (u.buf + off, 'B', 9); + if (p != u.buf + off) abort (); + check (off, 9, 'B'); + } + + /* len == 10 */ + for (off = 0; off < MAX_OFFSET; off++) + { + reset (); + + p = memset (u.buf + off, '\0', 10); + if (p != u.buf + off) abort (); + check (off, 10, '\0'); + + p = memset (u.buf + off, A, 10); + if (p != u.buf + off) abort (); + check (off, 10, 'A'); + + p = memset (u.buf + off, 'B', 10); + if (p != u.buf + off) abort (); + check (off, 10, 'B'); + } + + /* len == 11 */ + for (off = 0; off < MAX_OFFSET; off++) + { + reset (); + + p = memset (u.buf + off, '\0', 11); + if (p != u.buf + off) abort (); + check (off, 11, '\0'); + + p = memset (u.buf + off, A, 11); + if (p != u.buf + off) abort (); + check (off, 11, 'A'); + + p = memset (u.buf + off, 'B', 11); + if (p != u.buf + off) abort (); + check (off, 11, 'B'); + } + + /* len == 12 */ + for (off = 0; off < MAX_OFFSET; off++) + { + reset (); + + p = memset (u.buf + off, '\0', 12); + if (p != u.buf + off) abort (); + check (off, 12, '\0'); + + p = memset (u.buf + off, A, 12); + if (p != u.buf + off) abort (); + check (off, 12, 'A'); + + p = memset (u.buf + off, 'B', 12); + if (p != u.buf + off) abort (); + check (off, 12, 'B'); + } + + /* len == 13 */ + for (off = 0; off < MAX_OFFSET; off++) + { + reset (); + + p = memset (u.buf + off, '\0', 13); + if (p != u.buf + off) abort (); + check (off, 13, '\0'); + + p = memset (u.buf + off, A, 13); + if (p != u.buf + off) abort (); + check (off, 13, 'A'); + + p = memset (u.buf + off, 'B', 13); + if (p != u.buf + off) abort (); + check (off, 13, 'B'); + } + + /* len == 14 */ + for (off = 0; off < MAX_OFFSET; off++) + { + reset (); + + p = memset (u.buf + off, '\0', 14); + if (p != u.buf + off) abort (); + check (off, 14, '\0'); + + p = memset (u.buf + off, A, 14); + if (p != u.buf + off) 
abort (); + check (off, 14, 'A'); + + p = memset (u.buf + off, 'B', 14); + if (p != u.buf + off) abort (); + check (off, 14, 'B'); + } + + /* len == 15 */ + for (off = 0; off < MAX_OFFSET; off++) + { + reset (); + + p = memset (u.buf + off, '\0', 15); + if (p != u.buf + off) abort (); + check (off, 15, '\0'); + + p = memset (u.buf + off, A, 15); + if (p != u.buf + off) abort (); + check (off, 15, 'A'); + + p = memset (u.buf + off, 'B', 15); + if (p != u.buf + off) abort (); + check (off, 15, 'B'); + } + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/memset-3.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/memset-3.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/memset-3.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/memset-3.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,207 @@ +/* Copyright (C) 2002 Free Software Foundation. + + Test memset with various combinations of constant pointer alignments and + lengths to make sure any optimizations in the compiler are correct. + + Written by Roger Sayle, July 22, 2002. 
*/ + +#ifndef MAX_OFFSET +#define MAX_OFFSET (sizeof (long long)) +#endif + +#ifndef MAX_COPY +#define MAX_COPY 15 +#endif + +#ifndef MAX_EXTRA +#define MAX_EXTRA (sizeof (long long)) +#endif + +#define MAX_LENGTH (MAX_OFFSET + MAX_COPY + MAX_EXTRA) + +static union { + char buf[MAX_LENGTH]; + long long align_int; + long double align_fp; +} u; + +char A = 'A'; + +void reset () +{ + int i; + + for (i = 0; i < MAX_LENGTH; i++) + u.buf[i] = 'a'; +} + +void check (int off, int len, int ch) +{ + char *q; + int i; + + q = u.buf; + for (i = 0; i < off; i++, q++) + if (*q != 'a') + abort (); + + for (i = 0; i < len; i++, q++) + if (*q != ch) + abort (); + + for (i = 0; i < MAX_EXTRA; i++, q++) + if (*q != 'a') + abort (); +} + +int main () +{ + int len; + char *p; + + /* off == 0 */ + for (len = 0; len < MAX_COPY; len++) + { + reset (); + + p = memset (u.buf, '\0', len); + if (p != u.buf) abort (); + check (0, len, '\0'); + + p = memset (u.buf, A, len); + if (p != u.buf) abort (); + check (0, len, 'A'); + + p = memset (u.buf, 'B', len); + if (p != u.buf) abort (); + check (0, len, 'B'); + } + + /* off == 1 */ + for (len = 0; len < MAX_COPY; len++) + { + reset (); + + p = memset (u.buf+1, '\0', len); + if (p != u.buf+1) abort (); + check (1, len, '\0'); + + p = memset (u.buf+1, A, len); + if (p != u.buf+1) abort (); + check (1, len, 'A'); + + p = memset (u.buf+1, 'B', len); + if (p != u.buf+1) abort (); + check (1, len, 'B'); + } + + /* off == 2 */ + for (len = 0; len < MAX_COPY; len++) + { + reset (); + + p = memset (u.buf+2, '\0', len); + if (p != u.buf+2) abort (); + check (2, len, '\0'); + + p = memset (u.buf+2, A, len); + if (p != u.buf+2) abort (); + check (2, len, 'A'); + + p = memset (u.buf+2, 'B', len); + if (p != u.buf+2) abort (); + check (2, len, 'B'); + } + + /* off == 3 */ + for (len = 0; len < MAX_COPY; len++) + { + reset (); + + p = memset (u.buf+3, '\0', len); + if (p != u.buf+3) abort (); + check (3, len, '\0'); + + p = memset (u.buf+3, A, len); + if (p != 
u.buf+3) abort (); + check (3, len, 'A'); + + p = memset (u.buf+3, 'B', len); + if (p != u.buf+3) abort (); + check (3, len, 'B'); + } + + /* off == 4 */ + for (len = 0; len < MAX_COPY; len++) + { + reset (); + + p = memset (u.buf+4, '\0', len); + if (p != u.buf+4) abort (); + check (4, len, '\0'); + + p = memset (u.buf+4, A, len); + if (p != u.buf+4) abort (); + check (4, len, 'A'); + + p = memset (u.buf+4, 'B', len); + if (p != u.buf+4) abort (); + check (4, len, 'B'); + } + + /* off == 5 */ + for (len = 0; len < MAX_COPY; len++) + { + reset (); + + p = memset (u.buf+5, '\0', len); + if (p != u.buf+5) abort (); + check (5, len, '\0'); + + p = memset (u.buf+5, A, len); + if (p != u.buf+5) abort (); + check (5, len, 'A'); + + p = memset (u.buf+5, 'B', len); + if (p != u.buf+5) abort (); + check (5, len, 'B'); + } + + /* off == 6 */ + for (len = 0; len < MAX_COPY; len++) + { + reset (); + + p = memset (u.buf+6, '\0', len); + if (p != u.buf+6) abort (); + check (6, len, '\0'); + + p = memset (u.buf+6, A, len); + if (p != u.buf+6) abort (); + check (6, len, 'A'); + + p = memset (u.buf+6, 'B', len); + if (p != u.buf+6) abort (); + check (6, len, 'B'); + } + + /* off == 7 */ + for (len = 0; len < MAX_COPY; len++) + { + reset (); + + p = memset (u.buf+7, '\0', len); + if (p != u.buf+7) abort (); + check (7, len, '\0'); + + p = memset (u.buf+7, A, len); + if (p != u.buf+7) abort (); + check (7, len, 'A'); + + p = memset (u.buf+7, 'B', len); + if (p != u.buf+7) abort (); + check (7, len, 'B'); + } + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/memset-4.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/memset-4.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/memset-4.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/memset-4.c 
Wed Oct 9 04:01:46 2019 @@ -0,0 +1,27 @@ +/* Test to make sure memset of small old size works + correctly. */ +#define SIZE 15 + +void f(char *a) __attribute__((noinline)); +void f(char *a) +{ + __builtin_memset (a, 0, SIZE); +} + + +int main(void) +{ + int i; + char b[SIZE]; + for(i = 0; i < sizeof(b); i++) + { + b[i] = i; + } + f(b); + for(i = 0; i < sizeof(b); i++) + { + if (0 != b[i]) + __builtin_abort (); + } + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/mod-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/mod-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/mod-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/mod-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,11 @@ +f (x, y) +{ + if (x % y != 0) + abort (); +} + +main () +{ + f (-5, 5); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/mode-dependent-address.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/mode-dependent-address.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/mode-dependent-address.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/mode-dependent-address.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,51 @@ +/* { dg-require-effective-target stdint_types } */ + +#include +#include +#include + +void f883b (int8_t * result, + int16_t * __restrict arg1, + uint32_t * __restrict arg2, + uint64_t * __restrict arg3, + uint8_t * __restrict arg4) +{ + int idx; + for (idx=0;idx<96;idx += 1) { + result[idx] = (((((((((((-27 + 2+1)>>1) || arg4[idx]) < arg1[idx]) + ? 
(((-27 + 2+1)>>1) || arg4[idx]) + : arg1[idx]) + >> (arg2[idx] & 31)) ^ 1) - -32)>>7) | -5) & arg3[idx]); + } +} + +int8_t result[96]; +int16_t arg1[96]; +uint32_t arg2[96]; +uint64_t arg3[96]; +uint8_t arg4[96]; + +int main (void) +{ + int i; + int correct[] = {0x0,0x1,0x2,0x3,0x0,0x1,0x2,0x3,0x8,0x9,0xa,0xb,0x8,0x9, + 0xa,0xb,0x10,0x11,0x12,0x13,0x10,0x11,0x12,0x13, + 0x18,0x19,0x1a,0x1b,0x18,0x19,0x1a,0x1b,0x20,0x21,0x22, + 0x23,0x20,0x21,0x22,0x23,0x28,0x29,0x2a, + 0x2b,0x28,0x29,0x2a,0x2b,0x30,0x31,0x32,0x33, + 0x30,0x31,0x32,0x33,0x38,0x39,0x3a,0x3b,0x38,0x39,0x3a, + 0x3b,0x40,0x41,0x42,0x43,0x40,0x41,0x42,0x43,0x48,0x49, + 0x4a,0x4b,0x48,0x49,0x4a,0x4b,0x50,0x51, + 0x52,0x53,0x50,0x51,0x52,0x53,0x58,0x59,0x5a,0x5b, + 0x58,0x59,0x5a,0x5b}; + + for (i=0; i < 96; i++) + arg3[i] = arg2[i] = arg1[i] = arg4[i] = i; + + f883b(result, arg1, arg2, arg3, arg4); + + for (i=0; i < 96; i++) + if (result[i] != correct[i]) abort(); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/multdi-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/multdi-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/multdi-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/multdi-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,19 @@ +/* PR target/9348 */ + +#define u_l_l unsigned long long +#define l_l long long + +l_l mpy_res; + +u_l_l mpy (long a, long b) +{ + return (u_l_l) a * (u_l_l) b; +} + +int main(void) +{ + mpy_res = mpy(1,-1); + if (mpy_res != -1LL) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/multi-ix.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/multi-ix.c?rev=374156&view=auto 
============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/multi-ix.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/multi-ix.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,205 @@ +/* { dg-add-options stack_size } */ + +/* Test for a reload bug: + if you have a memory reference using the indexed addressing + mode, and the base address is a pseudo containing an address in the frame + and this pseudo fails to get a hard register, we end up with a double PLUS, + so the frame address gets reloaded. Now, when the index got a hard register, + and it dies in this insn, push_reload will consider that hard register as + a reload register, and disregrad overlaps with rld[n_reloads].in . That is + fine as long as the add can be done with a single insn, but when the + constant is so large that it has to be reloaded into a register first, + that clobbers the index. */ + +#include + +#ifdef STACK_SIZE +/* We need to be careful that we don't blow our stack. Function f, in the + worst case, needs to fit on the stack: + + * 40 int[CHUNK] arrays; + * ~40 ints; + * ~40 pointers for stdarg passing. + + Subtract the last two off STACK_SIZE and figure out what the maximum + chunk size can be. We make the last bit conservative to account for + register saves and other processor-dependent saving. Limit the + chunk size to some sane values. */ + +#define MIN(X,Y) ((X) < (Y) ? (X) : (Y)) +#define MAX(X,Y) ((X) > (Y) ? 
(X) : (Y)) + +#define CHUNK \ + MIN (500, (MAX (1, (signed)(STACK_SIZE-40*sizeof(int)-256*sizeof(void *)) \ + / (signed)(40*sizeof(int))))) +#else +#define CHUNK 500 +#endif + +void s(int, ...); +void z(int, ...); +void c(int, ...); + +typedef int l[CHUNK]; + +void +f (int n) +{ + int i; + l a0, a1, a2, a3, a4, a5, a6, a7, a8, a9; + l a10, a11, a12, a13, a14, a15, a16, a17, a18, a19; + l a20, a21, a22, a23, a24, a25, a26, a27, a28, a29; + l a30, a31, a32, a33, a34, a35, a36, a37, a38, a39; + int i0, i1, i2, i3, i4, i5, i6, i7, i8, i9; + int i10, i11, i12, i13, i14, i15, i16, i17, i18, i19; + int i20, i21, i22, i23, i24, i25, i26, i27, i28, i29; + int i30, i31, i32, i33, i34, i35, i36, i37, i38, i39; + + for (i = 0; i < n; i++) + { + s (40, a0, a1, a2, a3, a4, a5, a6, a7, a8, a9, + a10, a11, a12, a13, a14, a15, a16, a17, a18, a19, + a20, a21, a22, a23, a24, a25, a26, a27, a28, a29, + a30, a31, a32, a33, a34, a35, a36, a37, a38, a39); + i0 = a0[0]; + i1 = a1[0]; + i2 = a2[0]; + i3 = a3[0]; + i4 = a4[0]; + i5 = a5[0]; + i6 = a6[0]; + i7 = a7[0]; + i8 = a8[0]; + i9 = a9[0]; + i10 = a10[0]; + i11 = a11[0]; + i12 = a12[0]; + i13 = a13[0]; + i14 = a14[0]; + i15 = a15[0]; + i16 = a16[0]; + i17 = a17[0]; + i18 = a18[0]; + i19 = a19[0]; + i20 = a20[0]; + i21 = a21[0]; + i22 = a22[0]; + i23 = a23[0]; + i24 = a24[0]; + i25 = a25[0]; + i26 = a26[0]; + i27 = a27[0]; + i28 = a28[0]; + i29 = a29[0]; + i30 = a30[0]; + i31 = a31[0]; + i32 = a32[0]; + i33 = a33[0]; + i34 = a34[0]; + i35 = a35[0]; + i36 = a36[0]; + i37 = a37[0]; + i38 = a38[0]; + i39 = a39[0]; + z (40, a0, a1, a2, a3, a4, a5, a6, a7, a8, a9, + a10, a11, a12, a13, a14, a15, a16, a17, a18, a19, + a20, a21, a22, a23, a24, a25, a26, a27, a28, a29, + a30, a31, a32, a33, a34, a35, a36, a37, a38, a39); + a0[i0] = i0; + a1[i1] = i1; + a2[i2] = i2; + a3[i3] = i3; + a4[i4] = i4; + a5[i5] = i5; + a6[i6] = i6; + a7[i7] = i7; + a8[i8] = i8; + a9[i9] = i9; + a10[i10] = i10; + a11[i11] = i11; + a12[i12] = i12; + a13[i13] = i13; + 
a14[i14] = i14; + a15[i15] = i15; + a16[i16] = i16; + a17[i17] = i17; + a18[i18] = i18; + a19[i19] = i19; + a20[i20] = i20; + a21[i21] = i21; + a22[i22] = i22; + a23[i23] = i23; + a24[i24] = i24; + a25[i25] = i25; + a26[i26] = i26; + a27[i27] = i27; + a28[i28] = i28; + a29[i29] = i29; + a30[i30] = i30; + a31[i31] = i31; + a32[i32] = i32; + a33[i33] = i33; + a34[i34] = i34; + a35[i35] = i35; + a36[i36] = i36; + a37[i37] = i37; + a38[i38] = i38; + a39[i39] = i39; + c (40, a0, a1, a2, a3, a4, a5, a6, a7, a8, a9, + a10, a11, a12, a13, a14, a15, a16, a17, a18, a19, + a20, a21, a22, a23, a24, a25, a26, a27, a28, a29, + a30, a31, a32, a33, a34, a35, a36, a37, a38, a39); + } +} + +int +main () +{ + /* CHUNK needs to be at least 40 to avoid stack corruption, + since index variable i0 in "a[i0] = i0" equals 39. */ + if (CHUNK < 40) + exit (0); + + f (1); + exit (0); +} + +void s(int n, ...) +{ + va_list list; + + va_start (list, n); + while (n--) + { + int *a = va_arg (list, int *); + a[0] = n; + } + va_end (list); +} + +void z(int n, ...) +{ + va_list list; + + va_start (list, n); + while (n--) + { + int *a = va_arg (list, int *); + __builtin_memset (a, 0, sizeof (l)); + } + va_end (list); +} + +void c(int n, ...) +{ + va_list list; + + va_start (list, n); + while (n--) + { + int *a = va_arg (list, int *); + if (a[n] != n) + abort (); + } + va_end (list); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/nest-align-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/nest-align-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/nest-align-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/nest-align-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,43 @@ +/* Test for alignment handling when a variable is accessed by nested + function. 
*/ +/* Origin: Joey Ye */ + +/* Force bigger stack alignment for PowerPC EABI targets. */ +/* { dg-options "-mno-eabi" { target powerpc-*-eabi* } } */ + +#include <stddef.h> + +typedef int aligned __attribute__((aligned)); +extern void abort (void); + +void +check (int *i) +{ + *i = 20; + if ((((ptrdiff_t) i) & (__alignof__(aligned) - 1)) != 0) + abort (); +} + +void +foo (void) +{ + aligned jj; + void bar () + { + jj = -20; + } + jj = 0; + bar (); + if (jj != -20) + abort (); + check (&jj); + if (jj != 20) + abort (); +} + +int +main() +{ + foo (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/nest-stdar-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/nest-stdar-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/nest-stdar-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/nest-stdar-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,19 @@ +#include <stdarg.h> + +main () +{ + double f (int x, ...) 
+ { + va_list args; + double a; + + va_start (args, x); + a = va_arg (args, double); + va_end (args); + return a; + } + + if (f (1, (double)1) != 1.0) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/nestfunc-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/nestfunc-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/nestfunc-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/nestfunc-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,28 @@ +/* { dg-require-effective-target trampolines } */ + +int +g (int a, int b, int (*gi) (int, int)) +{ + if ((*gi) (a, b)) + return a; + else + return b; +} + +f () +{ + int i, j; + int f2 (int a, int b) + { + return a > b; + } + + if (g (1, 2, f2) != 2) + abort (); +} + +main () +{ + f (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/nestfunc-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/nestfunc-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/nestfunc-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/nestfunc-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,49 @@ +/* { dg-require-effective-target trampolines } */ + +extern int foo (int, int, int (*) (int, int, int, int, int, int, int)); + +int z; + +int +main (void) +{ + int sum = 0; + int i; + + int nested (int a, int b, int c, int d, int e, int f, int g) + { + z = c + d + e + f + g; + + if (a > 2 * b) + return a - b; + else + return b - a; + } + + for (i = 0; i < 10; ++i) + { + int j; + + for (j = 0; j < 10; ++j) + { + int k; + + for (k = 0; k < 10; ++k) + sum += foo (i, j > k ? 
j - k : k - j, nested); + } + } + + if (sum != 2300) + abort (); + + if (z != 0x1b) + abort (); + + exit (0); +} + +int +foo (int a, int b, int (* fp) (int, int, int, int, int, int, int)) +{ + return fp (a, b, a, b, a, b, a); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/nestfunc-3.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/nestfunc-3.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/nestfunc-3.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/nestfunc-3.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,59 @@ +/* { dg-require-effective-target trampolines } */ + +extern long foo (long, long, long (*) (long, long)); +extern long use (long (*) (long, long), long, long); + +int +main (void) +{ + long sum = 0; + long i; + + long nested_0 (long a, long b) + { + if (a > 2 * b) + return a - b; + else + return b - a; + } + + long nested_1 (long a, long b) + { + return use (nested_0, b, a) + sum; + } + + long nested_2 (long a, long b) + { + return nested_1 (b, a); + } + + for (i = 0; i < 10; ++i) + { + long j; + + for (j = 0; j < 10; ++j) + { + long k; + + for (k = 0; k < 10; ++k) + sum += foo (i, j > k ? 
j - k : k - j, nested_2); + } + } + + if ((sum & 0xffffffff) != 0xbecfcbf5) + abort (); + + exit (0); +} + +long +use (long (* func)(long, long), long a, long b) +{ + return func (b, a); +} + +long +foo (long a, long b, long (* func) (long, long)) +{ + return func (a, b); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/nestfunc-4.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/nestfunc-4.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/nestfunc-4.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/nestfunc-4.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,39 @@ +/* { dg-add-options stack_size } */ + +/* Origin: hp at bitrange.com + Test that return values come out right from a 1000-level call chain to + functions without parameters that each need at least one "long" + preserved. Exposed problems related to the MMIX port. */ + +long level = 0; +extern long foo (void); +extern long bar (void); + +#ifdef STACK_SIZE +#define DEPTH ((STACK_SIZE) / 512 + 1) +#else +#define DEPTH 500 +#endif + +int +main (void) +{ + if (foo () == -42) + exit (0); + + abort (); +} + +long +foo (void) +{ + long tmp = ++level; + return bar () + tmp; +} + +long +bar (void) +{ + long tmp = level; + return tmp > DEPTH - 1 ? 
-42 - tmp : foo () - tmp; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/nestfunc-5.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/nestfunc-5.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/nestfunc-5.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/nestfunc-5.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,34 @@ +/* { dg-require-effective-target trampolines } */ + +extern void abort (void); +extern void exit (int); + +static void recursive (int n, void (*proc) (void)) +{ + __label__ l1; + + void do_goto (void) + { + goto l1; + } + + if (n == 3) + recursive (n - 1, do_goto); + else if (n > 0) + recursive (n - 1, proc); + else + (*proc) (); + return; + +l1: + if (n == 3) + exit (0); + else + abort (); +} + +int main () +{ + recursive (10, abort); + abort (); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/nestfunc-6.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/nestfunc-6.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/nestfunc-6.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/nestfunc-6.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,25 @@ +/* { dg-require-effective-target trampolines } */ + +/* Test that the GP gets properly restored, either by the nonlocal + receiver or the nested function. 
*/ + +typedef __SIZE_TYPE__ size_t; +extern void abort (void); +extern void exit (int); +extern void qsort(void *, size_t, size_t, int (*)(const void *, const void *)); + +int main () +{ + __label__ nonlocal; + int compare (const void *a, const void *b) + { + goto nonlocal; + } + + char array[3]; + qsort (array, 3, 1, compare); + abort (); + + nonlocal: + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/nestfunc-7.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/nestfunc-7.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/nestfunc-7.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/nestfunc-7.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,43 @@ +struct A +{ + int one; + int two; + int three; + int four; + int five; + int six; +}; + +static int test (void) +{ + int base; + + struct A Foo (void) + { + struct A a; + + a.one = base + 1; + a.two = base + 2; + a.three = base + 3; + a.four = base + 4; + a.five = base + 5; + a.six = base + 6; + + return a; + } + + base = 10; + struct A a = Foo (); + + return (a.one == 11 + && a.two == 12 + && a.three == 13 + && a.four == 14 + && a.five == 15 + && a.six == 16); +} + +int main (void) +{ + return !test (); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/noinit-attribute.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/noinit-attribute.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/noinit-attribute.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/noinit-attribute.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,59 @@ +/* { dg-do run } */ +/* { dg-require-effective-target 
noinit } */ +/* { dg-options "-O2" } */ + +/* This test checks that noinit data is handled correctly. */ + +extern void _start (void) __attribute__ ((noreturn)); +extern void abort (void) __attribute__ ((noreturn)); +extern void exit (int) __attribute__ ((noreturn)); + +int var_common; +int var_zero = 0; +int var_one = 1; +int __attribute__((noinit)) var_noinit; +int var_init = 2; + +int __attribute__((noinit)) func(); /* { dg-warning "attribute only applies to variables" } */ +int __attribute__((section ("mysection"), noinit)) var_section1; /* { dg-warning "because it conflicts with attribute" } */ +int __attribute__((noinit, section ("mysection"))) var_section2; /* { dg-warning "because it conflicts with attribute" } */ + + +int +main (void) +{ + /* Make sure that the C startup code has correctly initialized the ordinary variables. */ + if (var_common != 0) + abort (); + + /* Initialized variables are not re-initialized during startup, so + check their original values only during the first run of this + test. */ + if (var_init == 2) + if (var_zero != 0 || var_one != 1) + abort (); + + switch (var_init) + { + case 2: + /* First time through - change all the values. */ + var_common = var_zero = var_one = var_noinit = var_init = 3; + break; + + case 3: + /* Second time through - make sure that d has not been reset. */ + if (var_noinit != 3) + abort (); + exit (0); + + default: + /* Any other value for var_init is an error. */ + abort (); + } + + /* Simulate a processor reset by calling the C startup code. */ + _start (); + + /* Should never reach here. 
*/ + abort (); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/p18298.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/p18298.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/p18298.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/p18298.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,20 @@ +/* { dg-options "-fgnu89-inline" } */ + +#include +#include +extern void abort (void); +int strcmp (const char*, const char*); +char s[2048] = "a"; +inline bool foo(const char *str) { + return !strcmp(s,str); +} +int main() { +int i = 0; + while(!(foo(""))) { + i ++; + s[0] = '\0'; + if (i>2) + abort (); + } + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/packed-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/packed-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/packed-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/packed-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,19 @@ +short x1 = 17; + +struct +{ + short i __attribute__ ((packed)); +} t; + +f () +{ + t.i = x1; + if (t.i != 17) + abort (); +} + +main () +{ + f (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/packed-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/packed-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/packed-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/packed-2.c Wed Oct 9 
04:01:46 2019 @@ -0,0 +1,12 @@ +typedef struct s { + unsigned short a; + unsigned long b __attribute__ ((packed)); +} s; + +s t; + +int main() +{ + t.b = 0; + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pending-4.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pending-4.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pending-4.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pending-4.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,33 @@ + +void dummy (x, y) + int *x; + int y; +{} + +int +main (argc, argv) + int argc; + char **argv; +{ + int number_columns=9; + int cnt0 = 0; + int cnt1 = 0; + int i,A1; + + for (i = number_columns-1; i != 0; i--) + { + if (i == 1) + { + dummy(&A1, i); + cnt0++; + } + else + { + dummy(&A1, i-1); + cnt1++; + } + } + if (cnt0 != 1 || cnt1 != 7) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/postmod-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/postmod-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/postmod-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/postmod-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,62 @@ +#define DECLARE_ARRAY(A) array##A[0x10] +#define DECLARE_COUNTER(A) counter##A = 0 +#define DECLARE_POINTER(A) *pointer##A = array##A + x +/* Create a loop that allows post-modification of pointerA, followed by + a use of the post-modified address. 
*/ +#define BEFORE(A) counter##A += *pointer##A, pointer##A += 3 +#define AFTER(A) counter##A += pointer##A[x] + +/* Set up the arrays so that one iteration of the loop sets the counter + to 3.0f. */ +#define INIT_ARRAY(A) array##A[1] = 1.0f, array##A[5] = 2.0f + +/* Check that the loop worked correctly for all values. */ +#define CHECK_ARRAY(A) exit_code |= (counter##A != 3.0f) + +/* Having 6 copies triggered the bug for ARM and Thumb. */ +#define MANY(A) A (0), A (1), A (2), A (3), A (4), A (5) + +/* Each addendA should be allocated a register. */ +#define INIT_VOLATILE(A) addend##A = vol +#define ADD_VOLATILE(A) vol += addend##A + +/* Having 5 copies triggered the bug for ARM and Thumb. */ +#define MANY2(A) A (0), A (1), A (2), A (3), A (4) + +float MANY (DECLARE_ARRAY); +float MANY (DECLARE_COUNTER); + +volatile int stop = 1; +volatile int vol; + +void __attribute__((noinline)) +foo (int x) +{ + float MANY (DECLARE_POINTER); + int i; + + do + { + MANY (BEFORE); + MANY (AFTER); + /* Create an inner loop that should ensure the code above + has registers free for reload inheritance. */ + { + int MANY2 (INIT_VOLATILE); + for (i = 0; i < 10; i++) + MANY2 (ADD_VOLATILE); + } + } + while (!stop); +} + +int +main (void) +{ + int exit_code = 0; + + MANY (INIT_ARRAY); + foo (1); + MANY (CHECK_ARRAY); + return exit_code; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr15262-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr15262-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr15262-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr15262-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,47 @@ +/* PR 15262. + The alias analyzer only considers relations between pointers and + symbols. 
If two pointers P and Q point to the same symbol S, then + their respective memory tags will either be the same or they will + have S in their alias set. + + However, if there are no common symbols between P and Q, TBAA will + currently miss their alias relationship altogether. */ +struct A +{ + int t; + int i; +}; + +int foo () { return 3; } + +main () +{ + struct A loc, *locp; + float f, g, *p; + int T355, *T356; + + /* Avoid the partial hack in TBAA that would consider memory tags if + the program had no addressable symbols. */ + f = 3; + g = 2; + p = foo () ? &g : &f; + if (*p > 0.0) + g = 1; + + /* Store into *locp and cache its current value. */ + locp = malloc (sizeof (*locp)); + locp->i = 10; + T355 = locp->i; + + /* Take the address of one of locp's fields and write to it. */ + T356 = &locp->i; + *T356 = 1; + + /* Read the recently stored value. If TBAA fails, this will appear + as a redundant load that will be replaced with '10'. */ + T355 = locp->i; + if (T355 != 1) + abort (); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr15262-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr15262-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr15262-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr15262-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,37 @@ +/* PR 15262. Similar to pr15262-1.c but with no obvious addresses + being taken in function foo(). Without IPA, by only looking inside + foo() we cannot tell for certain whether 'q' and 'b' alias each + other. 
*/ +struct A +{ + int t; + int i; +}; + +struct B +{ + int *p; + float b; +}; + +float X; + +foo (struct B b, struct A *q, float *h) +{ + X += *h; + *(b.p) = 3; + q->t = 2; + return *(b.p); +} + +main() +{ + struct A a; + struct B b; + + b.p = &a.t; + if (foo (b, &a, &X) == 3) + abort (); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr15262.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr15262.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr15262.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr15262.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,47 @@ +/* We used to mis-compile this testcase as we did not know that + &a+offsetof(b,a) was the same as &a.b */ +struct A +{ + int t; + int i; +}; + +void +bar (float *p) +{ + *p = 5.2; +} + +int +foo(struct A *locp, int i, int str) +{ + float f, g, *p; + int T355; + int *T356; + /* Currently, the alias analyzer has limited support for handling + aliases of structure fields when no other variables are aliased. + Introduce additional aliases to confuse it. */ + p = i ? 
&g : &f; + bar (p); + if (*p > 0.0) + str = 1; + + T355 = locp->i; + T356 = &locp->i; + *T356 = str; + T355 = locp->i; + + return T355; +} + +main () +{ + struct A loc; + int str; + + loc.i = 2; + str = foo (&loc, 10, 3); + if (str!=1) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr15296.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr15296.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr15296.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr15296.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,73 @@ +/* PR optimization/15296. The delayed-branch scheduler caused code that + SEGV:d for CRIS; a register was set to -1 in a delay-slot for the + fall-through code, while that register held a pointer used in code at + the branch target. 
*/ + +typedef __INTPTR_TYPE__ intptr_t; +typedef intptr_t W; +union u0 +{ + union u0 *r; + W i; +}; +struct s1 +{ + union u0 **m0; + union u0 m1[4]; +}; + +void f (void *, struct s1 *, const union u0 *, W, W, W) + __attribute__ ((__noinline__)); +void g (void *, char *) __attribute__ ((__noinline__)); + +void +f (void *a, struct s1 *b, const union u0 *h, W v0, W v1, W v4) +{ + union u0 *e = 0; + union u0 *k = 0; + union u0 **v5 = b->m0; + union u0 *c = b->m1; + union u0 **d = &v5[0]; +l0:; + if (v0 < v1) + goto l0; + if (v0 == 0) + goto l3; + v0 = v4; + if (v0 != 0) + goto l3; + c[0].r = *d; + v1 = -1; + e = c[0].r; + if (e != 0) + g (a, ""); + k = e + 3; + k->i = v1; + goto l4; +l3:; + c[0].i = v0; + e = c[1].r; + if (e != 0) + g (a, ""); + e = c[0].r; + if (e == 0) + g (a, ""); + k = e + 2; + k->r = c[1].r; +l4:; +} + +void g (void *a, char *b) { abort (); } + +int +main () +{ + union u0 uv[] = {{ .i = 111 }, { .i = 222 }, { .i = 333 }, { .i = 444 }}; + struct s1 s = { 0, {{ .i = 555 }, { .i = 0 }, { .i = 999 }, { .i = 777 }}}; + f (0, &s, 0, 20000, 10000, (W) uv); + if (s.m1[0].i != (W) uv || s.m1[1].i != 0 || s.m1[2].i != 999 + || s.m1[3].i != 777 || uv[0].i != 111 || uv[1].i != 222 + || uv[2].i != 0 || uv[3].i != 444) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr16790-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr16790-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr16790-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr16790-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,40 @@ +/* PR middle-end/16790. 
*/ + +extern void abort (); + +static void test1(unsigned int u1) +{ + unsigned int y_final_1; + signed short y_middle; + unsigned int y_final_2; + + y_final_1 = (unsigned int)( (signed short)(u1 * 2) * 3 ); + y_middle = (signed short)(u1 * 2); + y_final_2 = (unsigned int)( y_middle * 3 ); + + if (y_final_1 != y_final_2) + abort (); +} + + +static void test2(unsigned int u1) +{ + unsigned int y_final_1; + signed short y_middle; + unsigned int y_final_2; + + y_final_1 = (unsigned int)( (signed short)(u1 << 1) * 3 ); + y_middle = (signed short)(u1 << 1); + y_final_2 = (unsigned int)( y_middle * 3 ); + + if (y_final_1 != y_final_2) + abort (); +} + + +int main() +{ + test1(0x4000U); + test2(0x4000U); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr17078-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr17078-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr17078-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr17078-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,24 @@ +extern void abort(void); + +void test(int *ptr) +{ + int i = 1; + goto useless; + if (0) + { + useless: + i = 0; + } + else + i = 1; + *ptr = i; +} + +int main() +{ + int i = 1; + test(&i); + if (i) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr17133.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr17133.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr17133.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr17133.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,27 @@ +extern void abort (void); + 
+int foo = 0; +void *bar = 0; +unsigned int baz = 100; + +void *pure_alloc () +{ + void *res; + + while (1) + { + res = (void *) ((((unsigned int) (foo + bar))) & ~1); + foo += 2; + if (foo < baz) + return res; + foo = 0; + } +} + +int main () +{ + pure_alloc (); + if (!foo) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr17252.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr17252.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr17252.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr17252.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,21 @@ +/* PR 17252. When a char * pointer P takes its own address, storing + into *P changes P itself. */ + +char *a; + +main () +{ + /* Make 'a' point to itself. */ + a = (char *)&a; + + /* Change what 'a' is pointing to. */ + a[0]++; + + /* If a's memory tag does not contain 'a' in its alias set, we will + think that this predicate is superfluous and change it to + 'if (1)'. */ + if (a == (char *)&a) + abort (); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr17377.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr17377.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr17377.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr17377.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,60 @@ +/* PR target/17377 + Bug in code emitted by "return" pattern on CRIS: missing pop of + forced return address on stack. 
*/ +/* { dg-require-effective-target return_address } */ +int calls = 0; + +void *f (int) __attribute__ ((__noinline__)); +void * +f (int i) +{ + /* The code does a little brittle song and dance to trig the "return" + pattern instead of the function epilogue. This must still be a + leaf function for the bug to be exposed. */ + + if (calls++ == 0) + return __builtin_return_address (0); + + switch (i) + { + case 1: + return f; + case 0: + return __builtin_return_address (0); + } + return 0; +} + +int x; + +void *y (int i) __attribute__ ((__noinline__,__noclone__)); +void * +y (int i) +{ + x = 0; + + /* This must not be a sibling call: the return address must appear + constant for different calls to this function. Postincrementing x + catches otherwise unidentified multiple returns (e.g. through the + return-address register and then this epilogue popping the address + stored on stack in "f"). */ + return (char *) f (i) + x++; +} + +int +main (void) +{ + void *v = y (4); + if (y (1) != f + /* Can't reasonably check the validity of the return address + above, but it's not that important: the test-case will probably + crash on the first call to f with the bug present, or it will + run wild including returning early (in y or here), so we also + try and check the number of calls. 
*/ + || y (0) != v + || y (3) != 0 + || y (-1) != 0 + || calls != 5) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr19005.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr19005.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr19005.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr19005.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,38 @@ +/* PR target/19005 */ +extern void abort (void); + +int v, s; + +void +bar (int a, int b) +{ + unsigned char x = v; + + if (!s) + { + if (a != x || b != (unsigned char) (x + 1)) + abort (); + } + else if (a != (unsigned char) (x + 1) || b != x) + abort (); + s ^= 1; +} + +int +foo (int x) +{ + unsigned char a = x, b = x + 1; + + bar (a, b); + a ^= b; b ^= a; a ^= b; + bar (a, b); + return 0; +} + +int +main (void) +{ + for (v = -10; v < 266; v++) + foo (v); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr19449.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr19449.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr19449.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr19449.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,20 @@ +/* PR c/19449 */ + +extern void abort (void); + +int y; +int z = __builtin_choose_expr (!__builtin_constant_p (y), 3, 4); + +int +foo (int x) +{ + return __builtin_choose_expr (!__builtin_constant_p (x), 3, y++); +} + +int +main () +{ + if (y || z != 3 || foo (4) != 3) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr19515.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr19515.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr19515.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr19515.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,16 @@ +/* PR 19515 */ + +typedef union { + char a2[8]; +}aun; + +void abort (void); + +int main(void) +{ + aun a = {{0}}; + + if (a.a2[2] != 0) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr19606.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr19606.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr19606.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr19606.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,34 @@ +/* PR c/19606 + The C front end used to shorten the type of a division to a type + that does not preserve the semantics of the original computation. + Make sure that won't happen. 
*/ + +signed char a = -4; + +int +foo (void) +{ + return ((unsigned int) (signed int) a) / 2LL; +} + +int +bar (void) +{ + return ((unsigned int) (signed int) a) % 5LL; +} + +int +main (void) +{ + int r; + + r = foo (); + if (r != ((unsigned int) (signed int) (signed char) -4) / 2LL) + abort (); + + r = bar (); + if (r != ((unsigned int) (signed int) (signed char) -4) % 5LL) + abort (); + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr19687.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr19687.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr19687.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr19687.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,18 @@ +extern void abort (void); + +union U +{ + int i, j[4]; +}; + +int main () +{ + union U t = {}; + int i; + + for (i = 0; i < 4; ++i) + if (t.j[i] != 0) + abort (); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr19689.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr19689.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr19689.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr19689.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,20 @@ +/* { dg-require-effective-target int32plus } */ +extern void abort (void); + +struct +{ + int b : 29; +} f; + +void foo (short j) +{ + f.b = j; +} + +int main() +{ + foo (-55); + if (f.b != -55) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr20100-1.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr20100-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr20100-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr20100-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,76 @@ +/* PR tree-optimization/20100 + Pure function being treated as const. + Author: Hans-Peter Nilsson. */ + +static unsigned short g = 0; +static unsigned short p = 0; +unsigned char e; + +static unsigned short +next_g (void) +{ + return g == e - 1 ? 0 : g + 1; +} + +static unsigned short +curr_p (void) +{ + return p; +} + +static unsigned short +inc_g (void) +{ + return g = next_g (); +} + +static unsigned short +curr_g (void) +{ + return g; +} + +static char +ring_empty (void) +{ + if (curr_p () == curr_g ()) + return 1; + else + return 0; +} + +char +frob (unsigned short a, unsigned short b) +{ + g = a; + p = b; + inc_g (); + return ring_empty (); +} + +unsigned short +get_n (void) +{ + unsigned short n = 0; + unsigned short org_g; + org_g = curr_g (); + while (!ring_empty () && n < 5) + { + inc_g (); + n++; + } + + return n; +} + +void abort (void); +void exit (int); +int main (void) +{ + e = 3; + if (frob (0, 2) != 0 || g != 1 || p != 2 || e != 3 + || get_n () != 1 + || g != 2 || p != 2) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr20187-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr20187-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr20187-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr20187-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,15 @@ +int a = 0x101; +int b = 
0x100; + +int +test (void) +{ + return (((unsigned char) (unsigned long long) ((a ? a : 1) & (a * b))) + ? 0 : 1); +} + +int +main (void) +{ + return 1 - test (); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr20466-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr20466-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr20466-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr20466-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,26 @@ +int f (int **, int *, int *, int **, int **) __attribute__ ((__noinline__)); +int +f (int **ipp, int *i1p, int *i2p, int **i3, int **i4) +{ + **ipp = *i1p; + *ipp = i2p; + *i3 = *i4; + **ipp = 99; + return 3; +} + +extern void exit (int); +extern void abort (void); + +int main (void) +{ + int i = 42, i1 = 66, i2 = 1, i3 = -1, i4 = 55; + int *ip = &i; + int *i3p = &i3; + int *i4p = &i4; + + f (&ip, &i1, &i2, &i3p, &i4p); + if (i != 66 || ip != &i2 || i2 != 99 || i3 != -1 || i3p != i4p || i4 != 55) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr20527-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr20527-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr20527-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr20527-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,81 @@ +/* PR rtl-optimization/20527 + Mishandled postincrement. 
This test-case is derived from the + function BZ2_hbCreateDecodeTables in the file huffman.c from + bzip2-1.0.2, hence requiring the following disclaimer copied here: */ + +/*-- + This file is a part of bzip2 and/or libbzip2, a program and + library for lossless, block-sorting data compression. + + Copyright (C) 1996-2002 Julian R Seward. All rights reserved. + + Redistribution and use in source and binary forms, with or without + modification, are permitted provided that the following conditions + are met: + + 1. Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. + + 2. The origin of this software must not be misrepresented; you must + not claim that you wrote the original software. If you use this + software in a product, an acknowledgment in the product + documentation would be appreciated but is not required. + + 3. Altered source versions must be plainly marked as such, and must + not be misrepresented as being the original software. + + 4. The name of the author may not be used to endorse or promote + products derived from this software without specific prior written + permission. + + THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS + OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED + WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY + DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE + GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS + INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, + WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING + NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS + SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + Julian Seward, Cambridge, UK. 
+ jseward at acm.org + bzip2/libbzip2 version 1.0 of 21 March 2000 + + This program is based on (at least) the work of: + Mike Burrows + David Wheeler + Peter Fenwick + Alistair Moffat + Radford Neal + Ian H. Witten + Robert Sedgewick + Jon L. Bentley + + For more information on these sources, see the manual. +--*/ + +void f (long *limit, long *base, long minLen, long maxLen) __attribute__ ((__noinline__)); +void f (long *limit, long *base, long minLen, long maxLen) +{ + long i; + long vec; + vec = 0; + for (i = minLen; i <= maxLen; i++) { + vec += (base[i+1] - base[i]); + limit[i] = vec-1; + } +} +extern void abort (void); +extern void exit (int); +long b[] = {1, 5, 11, 23}; +int main (void) +{ + long l[3]; + f (l, b, 0, 2); + if (l[0] != 3 || l[1] != 9 || l[2] != 21) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr20601-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr20601-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr20601-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr20601-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,123 @@ +/* PR tree-optimization/20601 */ +/* { dg-xfail-if "ptxas crashes" { nvptx-*-* } { "-O1" } { "" } } */ +extern void abort (void); +extern void exit (int); + +struct T +{ + char *t1; + char t2[4096]; + char **t3; +}; + +int a[5]; +int b; +char **c; +int d; +char **e; +struct T t; +char *f[16]; +char *g[] = { "a", "-u", "b", "c" }; + +__attribute__ ((__noreturn__)) void +foo (void) +{ + while (1); +} + +__attribute__ ((noinline)) char * +bar (char *x, unsigned int y) +{ + return 0; +} + +static inline char * +baz (char *x, unsigned int y) +{ + if (sizeof (t.t2) != (unsigned int) -1 && y > sizeof (t.t2)) + foo (); + return bar (x, y); +} + +static inline int +setup1 (int 
x) +{ + char *p; + int rval; + + if (!baz (t.t2, sizeof (t.t2))) + baz (t.t2, sizeof (t.t2)); + + if (x & 0x200) + { + char **h, **i = e; + + ++d; + e = f; + if (t.t1 && *t.t1) + e[0] = t.t1; + else + abort (); + + for (h = e + 1; (*h = *i); ++i, ++h) + ; + } + return 1; +} + +static inline int +setup2 (void) +{ + int j = 1; + + e = c + 1; + d = b - 1; + while (d > 0 && e[0][0] == '-') + { + if (e[0][1] != '\0' && e[0][2] != '\0') + abort (); + + switch (e[0][1]) + { + case 'u': + if (!e[1]) + abort (); + + t.t3 = &e[1]; + d--; + e++; + break; + case 'P': + j |= 0x1000; + break; + case '-': + d--; + e++; + if (j == 1) + j |= 0x600; + return j; + } + d--; + e++; + } + + if (d > 0 && !(j & 1)) + abort (); + + return j; +} + +int +main (void) +{ + int x; + c = g; + b = 4; + x = setup2 (); + t.t1 = "/bin/sh"; + setup1 (x); + /* PRE shouldn't transform x into the constant 0x601 here, it's not legal. */ + if ((x & 0x400) && !a[4]) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr20621-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr20621-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr20621-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr20621-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,8 @@ +/* { dg-require-stack-size "0x10000" } */ + +/* When generating o32 MIPS PIC, main's $gp save slot was out of range + of a single load instruction. */ +struct big { int i[sizeof (int) >= 4 && sizeof (void *) >= 4 ? 
0x4000 : 4]; }; +struct big gb; +int foo (struct big b, int x) { return b.i[x]; } +int main (void) { return foo (gb, 0) + foo (gb, 1); } Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr21173.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr21173.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr21173.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr21173.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,21 @@ +void abort (void); + +char q; +void *a[2]; + +void foo (char *p) +{ + int i; + for (i = 0; i < 2; i++) + a[i] += p - &q; +} + +int main (void) +{ + int i; + foo (&q); + for (i = 0; i < 2; i ++) + if (a[i]) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr21331.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr21331.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr21331.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr21331.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,14 @@ +void abort (void); + +int bar (void) { return -1; } + +unsigned long +foo () +{ unsigned long retval; + retval = bar (); + if (retval == -1) return 0; + return 3; } + +main () +{ if (foo () != 0) abort (); + return 0; } Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr21964-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr21964-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr21964-1.c (added) +++ 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr21964-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,16 @@ +void +foo (int n, int m) +{ + if (m == 0) + exit (0); + else if (n != 0) + abort (); + else + foo (n++, m - 1); +} + +int +main (void) +{ + foo (0, 4); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr22061-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr22061-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr22061-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr22061-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,17 @@ +/* { dg-require-effective-target alloca } */ +int N = 1; +void foo() {} /* Necessary to trigger the original ICE. */ +void bar (char a[2][N]) { a[1][0] = N; } +int +main (void) +{ + void *x; + + N = 4; + x = alloca (2 * N); + memset (x, 0, 2 * N); + bar (x); + if (N[(char *) x] != N) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr22061-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr22061-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr22061-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr22061-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,7 @@ +int *x; +static void bar (char a[2][(*x)++]) {} +int +main (void) +{ + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr22061-3.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr22061-3.c?rev=374156&view=auto ============================================================================== 
--- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr22061-3.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr22061-3.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,18 @@ +void +bar (int N) +{ + int foo (char a[2][++N]) { N += 4; return sizeof (a[0]); } + if (foo (0) != 2) + abort (); + if (foo (0) != 7) + abort (); + if (N != 11) + abort (); +} + +int +main() +{ + bar (1); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr22061-4.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr22061-4.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr22061-4.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr22061-4.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,23 @@ +/* { dg-skip-if "requires alloca" { ! alloca } { "-O0" } { "" } } */ +void +bar (int N) +{ + void foo (int a[2][N++]) {} + int a[2][N]; + foo (a); + int b[2][N]; + foo (b); + if (sizeof (a) != sizeof (int) * 2 * 1) + abort (); + if (sizeof (b) != sizeof (int) * 2 * 2) + abort (); + if (N != 3) + abort (); +} + +int +main (void) +{ + bar (1); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr22098-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr22098-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr22098-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr22098-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,14 @@ +extern void abort (void); +extern void exit (int); +typedef __UINTPTR_TYPE__ uintptr_t; +int +main (void) +{ + int a = 0; + int *p; + uintptr_t b; + b = (uintptr_t)(p = &(int 
[]){0, 1, 2}[++a]); + if (a != 1 || *p != 1 || *(int *)b != 1) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr22098-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr22098-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr22098-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr22098-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,14 @@ +extern void abort (void); +extern void exit (int); +typedef __UINTPTR_TYPE__ uintptr_t; +int +main (void) +{ + int a = 0; + int *p; + uintptr_t b; + b = (uintptr_t)(p = &(int []){0, 1, 2}[1]); + if (*p != 1 || *(int *)b != 1) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr22098-3.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr22098-3.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr22098-3.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr22098-3.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,16 @@ +extern void abort (void); +extern void exit (int); +typedef __UINTPTR_TYPE__ uintptr_t; +int n = 0; +int f (void) { return ++n; } +int +main (void) +{ + int a = 0; + int *p; + uintptr_t b; + b = (uintptr_t)(p = &(int []){0, f(), 2}[1]); + if (*p != 1 || *(int *)b != 1 || n != 1) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr22141-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr22141-1.c?rev=374156&view=auto ============================================================================== --- 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr22141-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr22141-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,122 @@ +/* PR middle-end/22141 */ + +extern void abort (void); + +struct S +{ + struct T + { + char a; + char b; + char c; + char d; + } t; +} u; + +struct U +{ + struct S s[4]; +}; + +void __attribute__((noinline)) +c1 (struct T *p) +{ + if (p->a != 1 || p->b != 2 || p->c != 3 || p->d != 4) + abort (); + __builtin_memset (p, 0xaa, sizeof (*p)); +} + +void __attribute__((noinline)) +c2 (struct S *p) +{ + c1 (&p->t); +} + +void __attribute__((noinline)) +c3 (struct U *p) +{ + c2 (&p->s[2]); +} + +void __attribute__((noinline)) +f1 (void) +{ + u = (struct S) { { 1, 2, 3, 4 } }; +} + +void __attribute__((noinline)) +f2 (void) +{ + u.t.a = 1; + u.t.b = 2; + u.t.c = 3; + u.t.d = 4; +} + +void __attribute__((noinline)) +f3 (void) +{ + u.t.d = 4; + u.t.b = 2; + u.t.a = 1; + u.t.c = 3; +} + +void __attribute__((noinline)) +f4 (void) +{ + struct S v; + v.t.a = 1; + v.t.b = 2; + v.t.c = 3; + v.t.d = 4; + c2 (&v); +} + +void __attribute__((noinline)) +f5 (struct S *p) +{ + p->t.a = 1; + p->t.c = 3; + p->t.d = 4; + p->t.b = 2; +} + +void __attribute__((noinline)) +f6 (void) +{ + struct U v; + v.s[2].t.a = 1; + v.s[2].t.b = 2; + v.s[2].t.c = 3; + v.s[2].t.d = 4; + c3 (&v); +} + +void __attribute__((noinline)) +f7 (struct U *p) +{ + p->s[2].t.a = 1; + p->s[2].t.c = 3; + p->s[2].t.d = 4; + p->s[2].t.b = 2; +} + +int +main (void) +{ + struct U w; + f1 (); + c2 (&u); + f2 (); + c1 (&u.t); + f3 (); + c2 (&u); + f4 (); + f5 (&u); + c2 (&u); + f6 (); + f7 (&w); + c3 (&w); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr22141-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr22141-2.c?rev=374156&view=auto ============================================================================== --- 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr22141-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr22141-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,122 @@ +/* PR middle-end/22141 */ + +extern void abort (void); + +struct S +{ + struct T + { + char a; + char b; + char c; + char d; + } t; +} u __attribute__((aligned)); + +struct U +{ + struct S s[4]; +}; + +void __attribute__((noinline)) +c1 (struct T *p) +{ + if (p->a != 1 || p->b != 2 || p->c != 3 || p->d != 4) + abort (); + __builtin_memset (p, 0xaa, sizeof (*p)); +} + +void __attribute__((noinline)) +c2 (struct S *p) +{ + c1 (&p->t); +} + +void __attribute__((noinline)) +c3 (struct U *p) +{ + c2 (&p->s[2]); +} + +void __attribute__((noinline)) +f1 (void) +{ + u = (struct S) { { 1, 2, 3, 4 } }; +} + +void __attribute__((noinline)) +f2 (void) +{ + u.t.a = 1; + u.t.b = 2; + u.t.c = 3; + u.t.d = 4; +} + +void __attribute__((noinline)) +f3 (void) +{ + u.t.d = 4; + u.t.b = 2; + u.t.a = 1; + u.t.c = 3; +} + +void __attribute__((noinline)) +f4 (void) +{ + struct S v __attribute__((aligned)); + v.t.a = 1; + v.t.b = 2; + v.t.c = 3; + v.t.d = 4; + c2 (&v); +} + +void __attribute__((noinline)) +f5 (struct S *p) +{ + p->t.a = 1; + p->t.c = 3; + p->t.d = 4; + p->t.b = 2; +} + +void __attribute__((noinline)) +f6 (void) +{ + struct U v __attribute__((aligned)); + v.s[2].t.a = 1; + v.s[2].t.b = 2; + v.s[2].t.c = 3; + v.s[2].t.d = 4; + c3 (&v); +} + +void __attribute__((noinline)) +f7 (struct U *p) +{ + p->s[2].t.a = 1; + p->s[2].t.c = 3; + p->s[2].t.d = 4; + p->s[2].t.b = 2; +} + +int +main (void) +{ + struct U w __attribute__((aligned)); + f1 (); + c2 (&u); + f2 (); + c1 (&u.t); + f3 (); + c2 (&u); + f4 (); + f5 (&u); + c2 (&u); + f6 (); + f7 (&w); + c3 (&w); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr22348.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr22348.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr22348.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr22348.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,15 @@ +void abort (void); +void f(int i) +{ + if (i>4 + 3 * 16) + abort(); +} + +int main() +{ + unsigned int buflen, i; + buflen = 4 + 3 * 16; + for (i = 4; i < buflen; i+= 3) + f(i); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr22429.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr22429.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr22429.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr22429.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,17 @@ +extern void abort (void); + +#define N (1 << (sizeof(int) * __CHAR_BIT__ - 2)) + +int f(int n) +{ + if (-N <= n && n <= N-1) + return 1; + return 0; +} + +int main () +{ + if (f (N)) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr22493-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr22493-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr22493-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr22493-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,20 @@ +/* { dg-options "-fwrapv" } */ + +#include <limits.h> +extern void abort (); +extern void exit (int); +void f(int i) +{ + if (i>0) + abort(); + i = -i; + if (i<0) + 
return; + abort (); +} + +int main(int argc, char *argv[]) +{ + f(INT_MIN); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr22630.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr22630.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr22630.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr22630.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,23 @@ +void abort (void); + +int j; + +void bla (int *r) +{ + int *p, *q; + + p = q = r; + if (!p) + p = &j; + + if (p != q) + j = 1; +} + +int main (void) +{ + bla (0); + if (!j) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr23047.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr23047.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr23047.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr23047.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,17 @@ +/* { dg-options "-fwrapv" } */ +#include <limits.h> +extern void abort (); +extern void exit (int); +void f(int i) +{ + i = i > 0 ? 
i : -i; + if (i<0) + return; + abort (); +} + +int main(int argc, char *argv[]) +{ + f(INT_MIN); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr23135.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr23135.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr23135.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr23135.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,140 @@ +/* Based on execute/simd-1.c, modified by joern.rennecke at st.com to + trigger a reload bug. Verified for gcc mainline from 20050722 13:00 UTC + for sh-elf -m4 -O2. */ +/* { dg-options "-Wno-psabi" } */ +/* { dg-add-options stack_size } */ + +#ifndef STACK_SIZE +#define STACK_SIZE (256*1024) +#endif + +extern void abort (void); +extern void exit (int); + +typedef struct { char c[STACK_SIZE/2]; } big_t; + +typedef int __attribute__((mode(SI))) __attribute__((vector_size (8))) vecint; +typedef int __attribute__((mode(SI))) siint; + +vecint i = { 150, 100 }; +vecint j = { 10, 13 }; +vecint k; + +union { + vecint v; + siint i[2]; +} res; + +void +verify (siint a1, siint a2, siint b1, siint b2, big_t big) +{ + if (a1 != b1 + || a2 != b2) + abort (); +} + +int +main () +{ + big_t big; + vecint k0, k1, k2, k3, k4, k5, k6, k7; + + k0 = i + j; + res.v = k0; + + verify (res.i[0], res.i[1], 160, 113, big); + + k1 = i * j; + res.v = k1; + + verify (res.i[0], res.i[1], 1500, 1300, big); + + k2 = i / j; +/* This is the observed failure - reload 0 has the wrong type and thus the + conflict with reload 1 is missed: + +(insn:HI 94 92 96 1 pr23135.c:46 (parallel [ + (set (subreg:SI (reg:DI 253) 0) + (div:SI (reg:SI 4 r4) + (reg:SI 5 r5))) + (clobber (reg:SI 146 pr)) + (clobber (reg:DF 64 fr0)) + (clobber (reg:DF 66 fr2)) + (use (reg:PSI 151 )) + (use (reg/f:SI 256)) + ]) 60 
{divsi3_i4} (insn_list:REG_DEP_TRUE 90 (insn_list:REG_DEP_TRUE 89 +(insn_list:REG_DEP_TRUE 42 (insn_list:REG_DEP_TRUE 83 (insn_list:REG_DEP_TRUE 92 + (insn_list:REG_DEP_TRUE 91 (nil))))))) + (expr_list:REG_DEAD (reg:SI 4 r4) + (expr_list:REG_DEAD (reg:SI 5 r5) + (expr_list:REG_UNUSED (reg:DF 66 fr2) + (expr_list:REG_UNUSED (reg:DF 64 fr0) + (expr_list:REG_UNUSED (reg:SI 146 pr) + (insn_list:REG_RETVAL 91 (nil)))))))) + +Reloads for insn # 94 +Reload 0: reload_in (SI) = (plus:SI (reg/f:SI 14 r14) + (const_int 64 [0x40])) + GENERAL_REGS, RELOAD_FOR_OUTADDR_ADDRESS (opnum = 0) + reload_in_reg: (plus:SI (reg/f:SI 14 r14) + (const_int 64 [0x40])) + reload_reg_rtx: (reg:SI 3 r3) +Reload 1: GENERAL_REGS, RELOAD_FOR_OUTPUT_ADDRESS (opnum = 0), can't combine, se +condary_reload_p + reload_reg_rtx: (reg:SI 3 r3) +Reload 2: reload_out (SI) = (mem:SI (plus:SI (plus:SI (reg/f:SI 14 r14) + (const_int 64 [0x40])) + (const_int 28 [0x1c])) [ 16 S8 A32]) + FPUL_REGS, RELOAD_FOR_OUTPUT (opnum = 0) + reload_out_reg: (subreg:SI (reg:DI 253) 0) + reload_reg_rtx: (reg:SI 150 fpul) + secondary_out_reload = 1 + +Reload 3: reload_in (SI) = (symbol_ref:SI ("__sdivsi3_i4") [flags 0x1]) + GENERAL_REGS, RELOAD_FOR_INPUT (opnum = 1), can't combine + reload_in_reg: (reg/f:SI 256) + reload_reg_rtx: (reg:SI 3 r3) + */ + + + res.v = k2; + + verify (res.i[0], res.i[1], 15, 7, big); + + k3 = i & j; + res.v = k3; + + verify (res.i[0], res.i[1], 2, 4, big); + + k4 = i | j; + res.v = k4; + + verify (res.i[0], res.i[1], 158, 109, big); + + k5 = i ^ j; + res.v = k5; + + verify (res.i[0], res.i[1], 156, 105, big); + + k6 = -i; + res.v = k6; + verify (res.i[0], res.i[1], -150, -100, big); + + k7 = ~i; + res.v = k7; + verify (res.i[0], res.i[1], -151, -101, big); + + k = k0 + k1 + k3 + k4 + k5 + k6 + k7; + res.v = k; + verify (res.i[0], res.i[1], 1675, 1430, big); + + k = k0 * k1 * k3 * k4 * k5 * k6 * k7; + res.v = k; + verify (res.i[0], res.i[1], 1456467968, -1579586240, big); + + k = k0 / k1 / k2 / k3 / k4 
/ k5 / k6 / k7; + res.v = k; + verify (res.i[0], res.i[1], 0, 0, big); + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr23324.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr23324.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr23324.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr23324.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,133 @@ +extern void abort (void); +#define A(x) if (!(x)) abort () + +static union at6 {} vv6 = {}; +static struct et6 +{ + struct bt6 + { + signed av6:6; + signed bv6:7; + signed cv6:6; + signed dv6:5; + unsigned char ev6; + unsigned int fv6; + long int gv6; + } mv6; + unsigned long int nv6; + signed ov6:12; + signed pv6:3; + signed qv6:2; + signed rv6:10; + union ct6 { long int hv6; float iv6; float jv6; } sv6; + int *tv6; + union dt6 { double kv6; float lv6; } uv6; +} wv6 = { + { 8, 9, 2, 4, '\x10', 67426805U, 1047191860L }, + 1366022414UL, 858, 1, 1, 305, + { 1069379046L }, (int *) 358273621U, + { 3318.041978 } +}; +static double xv6 = 19239.101269; +static long long int yv6 = 1207859169L; +static int zv6 = 660195606; + +static union at6 +callee_af6 (struct et6 ap6, double bp6, long long int cp6, int dp6) +{ + A (wv6.mv6.av6 == ap6.mv6.av6); + A (wv6.mv6.bv6 == ap6.mv6.bv6); + A (wv6.mv6.cv6 == ap6.mv6.cv6); + A (wv6.mv6.dv6 == ap6.mv6.dv6); + A (wv6.mv6.ev6 == ap6.mv6.ev6); + A (wv6.mv6.fv6 == ap6.mv6.fv6); + A (wv6.mv6.gv6 == ap6.mv6.gv6); + A (wv6.nv6 == ap6.nv6); + A (wv6.ov6 == ap6.ov6); + A (wv6.pv6 == ap6.pv6); + A (wv6.qv6 == ap6.qv6); + A (wv6.rv6 == ap6.rv6); + A (wv6.sv6.hv6 == ap6.sv6.hv6); + A (wv6.tv6 == ap6.tv6); + A (wv6.uv6.kv6 == ap6.uv6.kv6); + A (xv6 == bp6); + A (yv6 == cp6); + A (zv6 == dp6); + return vv6; +} + +static void +caller_bf6 (void) +{ + union at6 bav6; 
+ bav6 = callee_af6 (wv6, xv6, yv6, zv6); +} + +static unsigned char uv7 = '\x46'; +static float vv7 = 96636.982442; +static double wv7 = 28450.711801; +static union ct7 {} xv7 = {}; +static struct et7 +{ + struct dt7 + { + float iv7; + unsigned short int jv7; + } kv7; + float lv7[0]; + signed mv7:9; + short int nv7; + double ov7; + float pv7; +} yv7 = { + { 30135.996213, 42435 }, + {}, 170, 22116, 26479.628148, 4082.960685 +}; +static union ft7 +{ + float qv7; + float *rv7; + unsigned int *sv7; +} zv7 = { 5042.227886 }; +static int bav7 = 1345451862; +static struct gt7 { double tv7; } bbv7 = { 47875.491954 }; +static long int bcv7[1] = { 1732133482L }; +static long long int bdv7 = 381678602L; + +static unsigned char +callee_af7 (float ap7, double bp7, union ct7 cp7, struct et7 dp7, + union ft7 ep7, int fp7, struct gt7 gp7, long int hp7[1], + long long int ip7) +{ + A (vv7 == ap7); + A (wv7 == bp7); + A (yv7.kv7.iv7 == dp7.kv7.iv7); + A (yv7.kv7.jv7 == dp7.kv7.jv7); + A (yv7.mv7 == dp7.mv7); + A (yv7.nv7 == dp7.nv7); + A (yv7.ov7 == dp7.ov7); + A (yv7.pv7 == dp7.pv7); + A (zv7.qv7 == ep7.qv7); + A (bav7 == fp7); + A (bbv7.tv7 == gp7.tv7); + A (bcv7[0] == hp7[0]); + A (bdv7 == ip7); + return uv7; +} + +static void +caller_bf7 (void) +{ + unsigned char bev7; + + bev7 = callee_af7 (vv7, wv7, xv7, yv7, zv7, bav7, bbv7, bcv7, bdv7); + A (uv7 == bev7); +} + +int +main () +{ + caller_bf6 (); + caller_bf7 (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr23467.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr23467.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr23467.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr23467.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,20 @@ +/* { dg-skip-if "small alignment" { pdp11-*-* } } */ 
+ +struct s1 +{ + int __attribute__ ((aligned (8))) a; +}; + +struct +{ + char c; + struct s1 m; +} v; + +int +main (void) +{ + if ((int)&v.m & 7) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr23604.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr23604.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr23604.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr23604.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,22 @@ +extern void abort (void); + +int g(int i, int j) +{ + if (i>-1) + if (i<2) + { + if (i != j) + { + if (j != 0) + return 0; + } + } + return 1; +} + +int main(void) +{ + if (!g(1, 0)) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr23941.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr23941.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr23941.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr23941.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,9 @@ +extern void abort (void); +double d = __FLT_MIN__ / 2.0; +int main() +{ + double x = __FLT_MIN__ / 2.0; + if (x != d) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr24135.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr24135.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr24135.c (added) +++ 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr24135.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,44 @@ +/* { dg-require-effective-target trampolines } */ + +extern void abort (void); + +int x(int a, int b) +{ + __label__ xlab; + __label__ xlab2; + + void y(int b) + { + switch (b) + { + case 1: goto xlab; + case 2: goto xlab; + } + } + + a = a + 2; + y (b); + + xlab: + return a; + + xlab2: + a++; + return a; + +} + +int main () +{ + int i, j; + + for (j = 1; j <= 2; ++j) + for (i = 1; i <= 2; ++i) + { + int a = x (j, i); + if (a != 2 + j) + abort (); + } + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr24141.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr24141.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr24141.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr24141.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,33 @@ +// reduced testcase, compile with -O2. Also, with --disable-checking +// gcc produces wrong code. 
+ +void abort (void); +int i; + +void g (void) +{ + i = 1; +} + +void f (int a, int b) +{ + int c = 0; + if (a == 0) + c = 1; + if (c) + return; + if (c == 1) + c = 0; + if (b == 0) + c = 1; + if (c) + g (); +} + +int main (void) +{ + f (1, 0); + if (i != 1) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr24142.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr24142.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr24142.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr24142.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,19 @@ +void abort (void); + +int f (int a, int b) +{ + if (a == 1) + a = 0; + if (b == 0) + a = 1; + if (a != 0) + return 0; + return 1; +} + +int main (void) +{ + if (f (1, 1) != 1) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr24716.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr24716.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr24716.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr24716.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,58 @@ +/* PR24716, scalar evolution returning the wrong result + for pdest. 
*/ + +int Link[] = { -1 }; +int W[] = { 2 }; + +extern void abort (void); + +int f (int k, int p) +{ + int pdest, j, D1361; + j = 0; + pdest = 0; + for (;;) { + if (pdest > 2) + do + j--, pdest++; + while (j > 2); + + if (j == 1) + break; + + while (pdest > p) + if (j == p) + pdest++; + + do + { + D1361 = W[k]; + do + if (D1361 != 0) + pdest = 1, W[k] = D1361 = 0; + while (p < 1); + } while (k > 0); + + do + { + p = 0; + k = Link[k]; + while (p < j) + if (k != -1) + pdest++, p++; + } + while (k != -1); + j = 1; + } + + /* The correct return value should be pdest (1 in the call from main). + DOM3 is mistaken and propagates a 0 here. */ + return pdest; +} + +int main () +{ + if (!f (0, 2)) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr24851.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr24851.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr24851.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr24851.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,16 @@ +/* We used to handle pointer addition wrongly + at the time of recombining to an ARRAY_REF + in the case of + p + -4B + where -4B is represented as unsigned. 
*/ + +void abort(void); +int main() +{ + int a[10], *p, *q; + q = &a[1]; + p = &q[-1]; + if (p >= &a[9]) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr25125.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr25125.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr25125.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr25125.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,27 @@ +extern void exit (int); +extern void abort (void); +extern unsigned short f (short a) __attribute__((__noinline__)); + +unsigned short +f (short a) +{ + short b; + + if (a > 0) + return 0; + b = ((int) a) + - (int) 32768; + return b; +} + +int +main (void) +{ + if (sizeof (short) < 2 + || sizeof (short) >= sizeof (int)) + exit (0); + + if (f (-32767) != 1) + abort (); + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr25737.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr25737.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr25737.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr25737.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,22 @@ +extern void abort (void); + +struct delay_block { + struct delay_block *succ; +}; + +static struct delay_block Timer_Queue; + +struct delay_block* time_enqueue (struct delay_block *d) +{ + struct delay_block *q = Timer_Queue.succ; + d->succ = (void *)0; + return Timer_Queue.succ; +} + +int main(void) +{ + Timer_Queue.succ = &Timer_Queue; + if (time_enqueue (&Timer_Queue) != (void*)0) + abort (); + return 0; +} Added: 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr27073.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr27073.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr27073.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr27073.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,25 @@ +void __attribute__((noinline)) +foo (int *p, int d1, int d2, int d3, + short count, int s1, int s2, int s3, int s4, int s5) +{ + int n = count; + while (n--) + { + *p++ = s1; + *p++ = s2; + *p++ = s3; + *p++ = s4; + *p++ = s5; + } +} + +int main() +{ + int x[10], i; + + foo (x, 0, 0, 0, 2, 100, 200, 300, 400, 500); + for (i = 0; i < 10; i++) + if (x[i] != (i % 5 + 1) * 100) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr27260.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr27260.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr27260.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr27260.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,33 @@ +/* PR middle-end/27260 */ + +extern void abort (void); +extern void *memset (void *, int, __SIZE_TYPE__); + +char buf[65]; + +void +foo (int x) +{ + memset (buf, x != 2 ? 
1 : 0, 64); +} + +int +main (void) +{ + int i; + buf[64] = 2; + for (i = 0; i < 64; i++) + if (buf[i] != 0) + abort (); + foo (0); + for (i = 0; i < 64; i++) + if (buf[i] != 1) + abort (); + foo (2); + for (i = 0; i < 64; i++) + if (buf[i] != 0) + abort (); + if (buf[64] != 2) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr27285.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr27285.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr27285.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr27285.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,46 @@ +/* PR tree-optimization/27285 */ + +extern void abort (void); + +struct S { unsigned char a, b, c, d[16]; }; + +void __attribute__ ((noinline)) +foo (struct S *x, struct S *y) +{ + int a, b; + unsigned char c, *d, *e; + + b = x->b; + d = x->d; + e = y->d; + a = 0; + while (b) + { + if (b >= 8) + { + c = 0xff; + b -= 8; + } + else + { + c = 0xff << (8 - b); + b = 0; + } + + e[a] = d[a] & c; + a++; + } +} + +int +main (void) +{ + struct S x = { 0, 25, 0, { 0xaa, 0xbb, 0xcc, 0xdd }}; + struct S y = { 0, 0, 0, { 0 }}; + + foo (&x, &y); + if (x.d[0] != y.d[0] || x.d[1] != y.d[1] + || x.d[2] != y.d[2] || (x.d[3] & 0x80) != y.d[3]) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr27364.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr27364.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr27364.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr27364.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,13 @@ +int f(unsigned 
number_of_digits_to_use) +{ + if (number_of_digits_to_use >1294) + return 0; + return (number_of_digits_to_use * 3321928 / 1000000 + 1) /16; +} + +int main(void) +{ + if (f(11) != 2) + __builtin_abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr27671-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr27671-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr27671-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr27671-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,21 @@ +/* PR rtl-optimization/27671. + The combiner used to simplify "a ^ b == a" to "a" via + simplify_relational_operation_1 in simplify-rtx.c. */ + +extern void abort (void) __attribute__ ((noreturn)); +extern void exit (int) __attribute__ ((noreturn)); + +static int __attribute__((noinline)) +foo (int a, int b) +{ + int c = a ^ b; + if (c == a) + abort (); +} + +int +main (void) +{ + foo (0, 1); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr28289.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr28289.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr28289.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr28289.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,33 @@ +extern int ok (int); +extern void exit (); +static int gen_x86_64_shrd (int); +static int +gen_x86_64_shrd(int a __attribute__ ((__unused__))) +{ + return 0; +} + +extern int gen_x86_shrd_1 (int); +extern void ix86_split_ashr (int); + +void +ix86_split_ashr (int mode) +{ + (mode != 0 + ? 
ok + : gen_x86_64_shrd) (0); +} + +volatile int one = 1; +int +main (void) +{ + ix86_split_ashr (one); + return 1; +} + +int +ok (int i) +{ + exit (i); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr28403.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr28403.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr28403.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr28403.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,23 @@ +typedef unsigned long long ull; +int global; + +int __attribute__((noinline)) +foo (int x1, int x2, int x3, int x4, int x5, int x6, int x7, int x8) +{ + global = x1 + x2 + x3 + x4 + x5 + x6 + x7 + x8; +} + +ull __attribute__((noinline)) +bar (ull x) +{ + foo (1, 2, 1, 3, 1, 4, 1, 5); + return x >> global; +} + +int +main (void) +{ + if (bar (0x123456789abcdefULL) != (0x123456789abcdefULL >> 18)) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr28651.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr28651.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr28651.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr28651.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,18 @@ +#include <limits.h> + +extern void abort (void); +int __attribute__((noinline)) +foo (unsigned int u) +{ + return (int)(u + 4) < (int)u; +} + +int +main (int argc, char *argv[]) +{ + unsigned int u = INT_MAX; + + if (foo (u) == 0) + abort(); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr28778.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr28778.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr28778.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr28778.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,33 @@ +extern void abort(void); +typedef long GLint; +void aglChoosePixelFormat (const GLint *); + +void +find (const int *alistp) +{ + const int *blist; + int list[32]; + if (alistp) + blist = alistp; + else + { + list[3] = 42; + blist = list; + } + aglChoosePixelFormat ((GLint *) blist); +} + +void +aglChoosePixelFormat (const GLint * a) +{ + int *b = (int *) a; + if (b[3] != 42) + abort (); +} + +int +main (void) +{ + find (0); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr28865.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr28865.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr28865.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr28865.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,21 @@ +struct A { int a; char b[]; }; +union B { struct A a; char b[sizeof (struct A) + 31]; }; +union B b = { { 1, "123456789012345678901234567890" } }; +union B c = { { 2, "123456789012345678901234567890" } }; + +__attribute__((noinline, noclone)) void +foo (int *x[2]) +{ + x[0] = &b.a.a; + x[1] = &c.a.a; +} + +int +main () +{ + int *x[2]; + foo (x); + if (*x[0] != 1 || *x[1] != 2) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr28982a.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr28982a.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr28982a.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr28982a.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,65 @@ +/* PR rtl-optimization/28982. Function foo() does the equivalent of: + + float tmp_results[NVARS]; + for (int i = 0; i < NVARS; i++) + { + int inc = incs[i]; + float *ptr = ptrs[i], result = 0; + for (int j = 0; j < n; j++) + result += *ptr, ptr += inc; + tmp_results[i] = result; + } + memcpy (results, tmp_results, sizeof (results)); + + but without the outermost loop. The idea is to create high register + pressure and ensure that some INC and PTR variables are spilled. + + On ARM targets, sequences like "result += *ptr, ptr += inc" can + usually be implemented using (mem (post_modify ...)), and we do + indeed create such MEMs before reload for this testcase. However, + (post_modify ...) is not a valid address for coprocessor loads, so + for -mfloat-abi=softfp, reload reloads the POST_MODIFY into a base + register. GCC did not deal correctly with cases where the base and + index of the POST_MODIFY are themselves reloaded. 
*/ +#define NITER 4 +#define NVARS 20 +#define MULTI(X) \ + X( 0), X( 1), X( 2), X( 3), X( 4), X( 5), X( 6), X( 7), X( 8), X( 9), \ + X(10), X(11), X(12), X(13), X(14), X(15), X(16), X(17), X(18), X(19) + +#define DECLAREI(INDEX) inc##INDEX = incs[INDEX] +#define DECLAREF(INDEX) *ptr##INDEX = ptrs[INDEX], result##INDEX = 0 +#define LOOP(INDEX) result##INDEX += *ptr##INDEX, ptr##INDEX += inc##INDEX +#define COPYOUT(INDEX) results[INDEX] = result##INDEX + +float *ptrs[NVARS]; +float results[NVARS]; +int incs[NVARS]; + +void __attribute__((noinline)) +foo (int n) +{ + int MULTI (DECLAREI); + float MULTI (DECLAREF); + while (n--) + MULTI (LOOP); + MULTI (COPYOUT); +} + +float input[NITER * NVARS]; + +int +main (void) +{ + int i; + + for (i = 0; i < NVARS; i++) + ptrs[i] = input + i, incs[i] = i; + for (i = 0; i < NITER * NVARS; i++) + input[i] = i; + foo (NITER); + for (i = 0; i < NVARS; i++) + if (results[i] != i * NITER * (NITER + 1) / 2) + return 1; + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr28982b.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr28982b.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr28982b.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr28982b.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,56 @@ +/* { dg-require-stack-size "0x80100" } */ + +/* Like pr28982a.c, but with the spill slots outside the range of + a single sp-based load on ARM. This test tests for cases where + the addresses in the base and index reloads require further reloads. 
*/ +#define NITER 4 +#define NVARS 20 +#define MULTI(X) \ + X( 0), X( 1), X( 2), X( 3), X( 4), X( 5), X( 6), X( 7), X( 8), X( 9), \ + X(10), X(11), X(12), X(13), X(14), X(15), X(16), X(17), X(18), X(19) + +#define DECLAREI(INDEX) inc##INDEX = incs[INDEX] +#define DECLAREF(INDEX) *ptr##INDEX = ptrs[INDEX], result##INDEX = 0 +#define LOOP(INDEX) result##INDEX += *ptr##INDEX, ptr##INDEX += inc##INDEX +#define COPYOUT(INDEX) results[INDEX] = result##INDEX + +float *ptrs[NVARS]; +float results[NVARS]; +int incs[NVARS]; + +struct big { int i[0x10000]; }; +void __attribute__((noinline)) +bar (struct big b) +{ + incs[0] += b.i[0]; +} + +void __attribute__((noinline)) +foo (int n) +{ + struct big b = {}; + int MULTI (DECLAREI); + float MULTI (DECLAREF); + while (n--) + MULTI (LOOP); + MULTI (COPYOUT); + bar (b); +} + +float input[NITER * NVARS]; + +int +main (void) +{ + int i; + + for (i = 0; i < NVARS; i++) + ptrs[i] = input + i, incs[i] = i; + for (i = 0; i < NITER * NVARS; i++) + input[i] = i; + foo (NITER); + for (i = 0; i < NVARS; i++) + if (results[i] != i * NITER * (NITER + 1) / 2) + return 1; + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr29006.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr29006.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr29006.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr29006.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,3 @@ +struct __attribute__((__packed__)) s { char c; unsigned long long x; }; +void __attribute__((__noinline__)) foo (struct s *s) { s->x = 0; } +int main (void) { struct s s = { 1, ~0ULL }; foo (&s); return s.x != 0; } Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr29156.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr29156.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr29156.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr29156.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,31 @@ +extern void abort(void); +struct test1 +{ + int a; + int b; +}; +struct test2 +{ + float d; + struct test1 sub; +}; + +int global; + +int bla(struct test1 *xa, struct test2 *xb) +{ + global = 1; + xb->sub.a = 1; + xa->a = 8; + return xb->sub.a; +} + +int main(void) +{ + struct test2 pom; + + if (bla (&pom.sub, &pom) != 8) + abort (); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr29695-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr29695-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr29695-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr29695-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,83 @@ +/* PR middle-end/29695 */ + +extern void abort (void); + +int +f1 (void) +{ + int a = 128; + return (a & 0x80) ? 0x80 : 0; +} + +int +f2 (void) +{ + unsigned char a = 128; + return (a & 0x80) ? 0x80 : 0; +} + +int +f3 (void) +{ + unsigned char a = 128; + return (a & 0x80) ? 0x380 : 0; +} + +int +f4 (void) +{ + unsigned char a = 128; + return (a & 0x80) ? -128 : 0; +} + +long long +f5 (void) +{ + long long a = 0x80000000LL; + return (a & 0x80000000) ? 0x80000000LL : 0LL; +} + +long long +f6 (void) +{ + unsigned int a = 0x80000000; + return (a & 0x80000000) ? 0x80000000LL : 0LL; +} + +long long +f7 (void) +{ + unsigned int a = 0x80000000; + return (a & 0x80000000) ? 
0x380000000LL : 0LL; +} + +long long +f8 (void) +{ + unsigned int a = 0x80000000; + return (a & 0x80000000) ? -2147483648LL : 0LL; +} + +int +main (void) +{ + if ((char) 128 != -128 || (int) 0x80000000 != -2147483648) + return 0; + if (f1 () != 128) + abort (); + if (f2 () != 128) + abort (); + if (f3 () != 896) + abort (); + if (f4 () != -128) + abort (); + if (f5 () != 0x80000000LL) + abort (); + if (f6 () != 0x80000000LL) + abort (); + if (f7 () != 0x380000000LL) + abort (); + if (f8 () != -2147483648LL) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr29695-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr29695-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr29695-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr29695-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,80 @@ +/* PR middle-end/29695 */ + +extern void abort (void); + +int a = 128; +unsigned char b = 128; +long long c = 0x80000000LL; +unsigned int d = 0x80000000; + +int +f1 (void) +{ + return (a & 0x80) ? 0x80 : 0; +} + +int +f2 (void) +{ + return (b & 0x80) ? 0x80 : 0; +} + +int +f3 (void) +{ + return (b & 0x80) ? 0x380 : 0; +} + +int +f4 (void) +{ + return (b & 0x80) ? -128 : 0; +} + +long long +f5 (void) +{ + return (c & 0x80000000) ? 0x80000000LL : 0LL; +} + +long long +f6 (void) +{ + return (d & 0x80000000) ? 0x80000000LL : 0LL; +} + +long long +f7 (void) +{ + return (d & 0x80000000) ? 0x380000000LL : 0LL; +} + +long long +f8 (void) +{ + return (d & 0x80000000) ? 
-2147483648LL : 0LL; +} + +int +main (void) +{ + if ((char) 128 != -128 || (int) 0x80000000 != -2147483648) + return 0; + if (f1 () != 128) + abort (); + if (f2 () != 128) + abort (); + if (f3 () != 896) + abort (); + if (f4 () != -128) + abort (); + if (f5 () != 0x80000000LL) + abort (); + if (f6 () != 0x80000000LL) + abort (); + if (f7 () != 0x380000000LL) + abort (); + if (f8 () != -2147483648LL) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr29797-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr29797-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr29797-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr29797-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,14 @@ +/* { dg-require-effective-target int32plus } */ +extern void abort(void); + +unsigned int bar(void) { return 32768; } + +int main() +{ + unsigned int nStyle = bar (); + if (nStyle & 32768) + nStyle |= 65536; + if (nStyle != (32768 | 65536)) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr29797-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr29797-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr29797-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr29797-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,13 @@ +extern void abort(void); + +unsigned long bar(void) { return 32768; } + +int main() +{ + unsigned long nStyle = bar (); + if (nStyle & 32768) + nStyle |= 65536; + if (nStyle != (32768 | 65536)) + abort (); + return 0; +} Added: 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr29798.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr29798.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr29798.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr29798.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,26 @@ +extern void abort (); + +int +main () +{ + int i; + double oldrho; + double beta = 0.0; + double work = 1.0; + for (i = 1; i <= 2; i++) + { + double rho = work * work; + if (i != 1) + beta = rho / oldrho; + if (beta == 1.0) + abort (); + + /* All targets even remotely likely to ever get supported + use at least an even base, so there will never be any + floating-point rounding. All computation in this test + case is exact for even bases. */ + work /= 2.0; + oldrho = rho; + } + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr30185.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr30185.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr30185.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr30185.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,28 @@ +/* PR target/30185 */ + +extern void abort (void); + +typedef struct S { char a; long long b; } S; + +S +foo (S x, S y) +{ + S z; + z.b = x.b / y.b; + return z; +} + +int +main (void) +{ + S a, b; + a.b = 32LL; + b.b = 4LL; + if (foo (a, b).b != 8LL) + abort (); + a.b = -8LL; + b.b = -2LL; + if (foo (a, b).b != 4LL) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr30778.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr30778.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr30778.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr30778.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,34 @@ +extern void *memset (void *, int, __SIZE_TYPE__); +extern void abort (void); + +struct reg_stat { + void *last_death; + void *last_set; + void *last_set_value; + int last_set_label; + char last_set_sign_bit_copies; + int last_set_mode : 8; + char last_set_invalid; + char sign_bit_copies; + long nonzero_bits; +}; + +static struct reg_stat *reg_stat; + +void __attribute__((noinline)) +init_reg_last (void) +{ + memset (reg_stat, 0, __builtin_offsetof (struct reg_stat, sign_bit_copies)); +} + +int main (void) +{ + struct reg_stat r; + + reg_stat = &r; + r.nonzero_bits = -1; + init_reg_last (); + if (r.nonzero_bits != -1) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr31072.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr31072.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr31072.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr31072.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,10 @@ +extern volatile int ReadyFlag_NotProperlyInitialized; + +volatile int ReadyFlag_NotProperlyInitialized=1; + +int main(void) +{ + if (ReadyFlag_NotProperlyInitialized != 1) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr31136.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr31136.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr31136.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr31136.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,16 @@ +extern void abort (void); + +struct S { + unsigned b4:4; + unsigned b6:6; +} s; + +int main() +{ + s.b6 = 31; + s.b4 = s.b6; + s.b6 = s.b4; + if (s.b6 != 15) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr31169.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr31169.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr31169.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr31169.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,51 @@ +extern void abort(); + +#define HOST_WIDE_INT long +#define HOST_BITS_PER_WIDE_INT (sizeof(long)*8) + +struct tree_type +{ + unsigned int precision : 9; +}; + +int +sign_bit_p (struct tree_type *t, HOST_WIDE_INT val_hi, unsigned HOST_WIDE_INT val_lo) +{ + unsigned HOST_WIDE_INT mask_lo, lo; + HOST_WIDE_INT mask_hi, hi; + int width = t->precision; + + if (width > HOST_BITS_PER_WIDE_INT) + { + hi = (unsigned HOST_WIDE_INT) 1 << (width - HOST_BITS_PER_WIDE_INT - 1); + lo = 0; + + mask_hi = ((unsigned HOST_WIDE_INT) -1 + >> (2 * HOST_BITS_PER_WIDE_INT - width)); + mask_lo = -1; + } + else + { + hi = 0; + lo = (unsigned HOST_WIDE_INT) 1 << (width - 1); + + mask_hi = 0; + mask_lo = ((unsigned HOST_WIDE_INT) -1 + >> (HOST_BITS_PER_WIDE_INT - width)); + } + + if ((val_hi & mask_hi) == hi + && (val_lo & mask_lo) == lo) + return 1; + + return 0; +} + +int main() +{ + struct tree_type t; 
+ t.precision = 1; + if (!sign_bit_p (&t, 0, -1)) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr31448-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr31448-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr31448-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr31448-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,35 @@ +/* PR middle-end/31448, this used to ICE during expand because + reduce_to_bit_field_precision was not ready to handle constants. */ + +typedef struct _st { + long int iIndex : 24; + long int iIndex1 : 24; +} st; +st *next; +void g(void) +{ + st *next = 0; + int nIndx; + const static int constreg[] = { 0,}; + nIndx = 0; + next->iIndex = constreg[nIndx]; +} +void f(void) +{ + int nIndx; + const static long int constreg[] = { 0xFEFEFEFE,}; + nIndx = 0; + next->iIndex = constreg[nIndx]; + next->iIndex1 = constreg[nIndx]; +} +int main(void) +{ + st a; + next = &a; + f(); + if (next->iIndex != 0xFFFEFEFE) + __builtin_abort (); + if (next->iIndex1 != 0xFFFEFEFE) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr31448.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr31448.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr31448.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr31448.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,36 @@ +/* PR middle-end/31448, this used to ICE during expand because + reduce_to_bit_field_precision was not ready to handle constants. 
*/ +/* { dg-require-effective-target int32plus } */ + +typedef struct _st { + int iIndex : 24; + int iIndex1 : 24; +} st; +st *next; +void g(void) +{ + st *next = 0; + int nIndx; + const static int constreg[] = { 0,}; + nIndx = 0; + next->iIndex = constreg[nIndx]; +} +void f(void) +{ + int nIndx; + const static int constreg[] = { 0xFEFEFEFE,}; + nIndx = 0; + next->iIndex = constreg[nIndx]; + next->iIndex1 = constreg[nIndx]; +} +int main(void) +{ + st a; + next = &a; + f(); + if (next->iIndex != 0xFFFEFEFE) + __builtin_abort (); + if (next->iIndex1 != 0xFFFEFEFE) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr31605.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr31605.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr31605.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr31605.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,13 @@ +void put_field (unsigned int start, unsigned int len) +{ + int cur_bitshift = ((start + len) % 8) - 8; + if (cur_bitshift > -8) + exit (0); +} + +int +main () +{ + put_field (0, 1); + abort (); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr32244-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr32244-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr32244-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr32244-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,20 @@ +struct foo +{ + unsigned long long b:40; +} x; + +extern void abort (void); + +void test1(unsigned long long res) +{ + /* The shift is carried out in 40 bit precision. 
*/ + if (x.b<<32 != res) + abort (); +} + +int main() +{ + x.b = 0x0100; + test1(0); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr32500.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr32500.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr32500.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr32500.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,25 @@ +extern void abort(void); +extern void exit(int); +void foo(int) __attribute__((noinline)); +void bar(void) __attribute__((noinline)); + +/* Make sure foo is not inlined or considered pure/const. */ +int x; +void foo(int i) { x = i; } +void bar(void) { exit(0); } + +int +main(int argc, char *argv[]) +{ + int i; + int numbers[4] = { 0xdead, 0xbeef, 0x1337, 0x4242 }; + + for (i = 1; i <= 12; i++) { + if (i <= 4) + foo(numbers[i-1]); + else if (i >= 7 && i <= 9) + bar(); + } + + abort(); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr33142.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr33142.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr33142.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr33142.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,16 @@ +int abs(int j); +extern void abort(void); + +__attribute__((noinline)) int lisp_atan2(long dy, long dx) { + if (dx <= 0) + if (dy > 0) + return abs(dx) <= abs(dy); + return 0; +} + +int main() { + volatile long dy = 63, dx = -77; + if (lisp_atan2(dy, dx)) + abort(); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr33382.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr33382.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr33382.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr33382.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,21 @@ +struct Foo { + int i; + int j[]; +}; + +struct Foo x = { 1, { 2, 0, 2, 3 } }; + +int foo(void) +{ + x.j[0] = 1; + return x.j[1]; +} + +extern void abort(void); + +int main() +{ + if (foo() != 0) + abort(); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr33631.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr33631.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr33631.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr33631.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,14 @@ +typedef union +{ + int __lock; +} pthread_mutex_t; + +extern void abort (void); + +int main() +{ + struct { int c; pthread_mutex_t m; } r = { .m = 0 }; + if (r.c != 0) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr33669.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr33669.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr33669.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr33669.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,40 @@ +extern void abort (void); + +typedef struct foo_t +{ + unsigned int blksz; + unsigned int bf_cnt; +} foo_t; + +#define _RNDUP(x, unit) ((((x) + (unit) - 
1) / (unit)) * (unit)) +#define _RNDDOWN(x, unit) ((x) - ((x)%(unit))) + +long long +foo (foo_t *const pxp, long long offset, unsigned int extent) +{ + long long blkoffset = _RNDDOWN(offset, (long long )pxp->blksz); + unsigned int diff = (unsigned int)(offset - blkoffset); + unsigned int blkextent = _RNDUP(diff + extent, pxp->blksz); + + if (pxp->blksz < blkextent) + return -1LL; + + if (pxp->bf_cnt > pxp->blksz) + pxp->bf_cnt = pxp->blksz; + + return blkoffset; +} + +int +main () +{ + foo_t x; + long long xx; + + x.blksz = 8192; + x.bf_cnt = 0; + xx = foo (&x, 0, 4096); + if (xx != 0LL) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr33779-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr33779-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr33779-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr33779-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,14 @@ +int foo(int i) +{ + if (((unsigned)(i + 1)) * 4 == 0) + return 1; + return 0; +} + +extern void abort(void); +int main() +{ + if (foo(0x3fffffff) == 0) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr33779-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr33779-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr33779-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr33779-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,12 @@ +int foo(int i) +{ + return ((int)((unsigned)(i + 1) * 4)) / 4; +} + +extern void abort(void); +int main() +{ + if (foo(0x3fffffff) != 0) + abort (); + return 0; +} Added: 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr33870-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr33870-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr33870-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr33870-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,94 @@ +extern void abort (void); + +typedef struct PgHdr PgHdr; +typedef unsigned char u8; +struct PgHdr { +int y; +struct { + unsigned int pgno; + PgHdr *pNextHash, *pPrevHash; + PgHdr *pNextFree, *pPrevFree; + PgHdr *pNextAll; + u8 inJournal; + short int nRef; + PgHdr *pDirty, *pPrevDirty; + unsigned int notUsed; +} x; +}; +PgHdr **xx; +volatile int vx; +static inline PgHdr *merge_pagelist(PgHdr *pA, PgHdr *pB) +{ + PgHdr result; + PgHdr *pTail; + xx = &result.x.pDirty; + pTail = &result; + while( pA && pB ){ + if( pA->x.pgno<pB->x.pgno ){ + pTail->x.pDirty = pA; + pTail = pA; + pA = pA->x.pDirty; + }else{ + pTail->x.pDirty = pB; + pTail = pB; + pB = pB->x.pDirty; + } + vx = (*xx)->y; + } + if( pA ){ + pTail->x.pDirty = pA; + }else if( pB ){ + pTail->x.pDirty = pB; + }else{ + pTail->x.pDirty = 0; + } + return result.x.pDirty; +} + +PgHdr * __attribute__((noinline)) sort_pagelist(PgHdr *pIn) +{ + PgHdr *a[25], *p; + int i; + __builtin_memset (a, 0, sizeof (a)); + while( pIn ){ + p = pIn; + pIn = p->x.pDirty; + p->x.pDirty = 0; + for(i=0; i<25 -1; i++){ + if( a[i]==0 ){ + a[i] = p; + break; + }else{ + p = merge_pagelist(a[i], p); + a[i] = 0; + a[i] = 0; + } + } + if( i==25 -1 ){ + a[i] = merge_pagelist(a[i], p); + } + } + p = a[0]; + for(i=1; i<25; i++){ + p = merge_pagelist (p, a[i]); + } + return p; +} + +int main() +{ + PgHdr a[5]; + PgHdr *p; + a[0].x.pgno = 5; + a[0].x.pDirty = &a[1]; + a[1].x.pgno = 4; + a[1].x.pDirty = &a[2]; + a[2].x.pgno = 1; + a[2].x.pDirty = &a[3]; + 
a[3].x.pgno = 3; + a[3].x.pDirty = 0; + p = sort_pagelist (&a[0]); + if (p->x.pDirty == p) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr33870.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr33870.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr33870.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr33870.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,87 @@ +extern void abort (void); + +typedef struct PgHdr PgHdr; +typedef unsigned char u8; +struct PgHdr { + unsigned int pgno; + PgHdr *pNextHash, *pPrevHash; + PgHdr *pNextFree, *pPrevFree; + PgHdr *pNextAll; + u8 inJournal; + short int nRef; + PgHdr *pDirty, *pPrevDirty; + unsigned int notUsed; +}; + +static inline PgHdr *merge_pagelist(PgHdr *pA, PgHdr *pB) +{ + PgHdr result; + PgHdr *pTail; + pTail = &result; + while( pA && pB ){ + if( pA->pgno<pB->pgno ){ + pTail->pDirty = pA; + pTail = pA; + pA = pA->pDirty; + }else{ + pTail->pDirty = pB; + pTail = pB; + pB = pB->pDirty; + } + } + if( pA ){ + pTail->pDirty = pA; + }else if( pB ){ + pTail->pDirty = pB; + }else{ + pTail->pDirty = 0; + } + return result.pDirty; +} + +PgHdr * __attribute__((noinline)) sort_pagelist(PgHdr *pIn) +{ + PgHdr *a[25], *p; + int i; + __builtin_memset (a, 0, sizeof (a)); + while( pIn ){ + p = pIn; + pIn = p->pDirty; + p->pDirty = 0; + for(i=0; i<25 -1; i++){ + if( a[i]==0 ){ + a[i] = p; + break; + }else{ + p = merge_pagelist(a[i], p); + a[i] = 0; + } + } + if( i==25 -1 ){ + a[i] = merge_pagelist(a[i], p); + } + } + p = a[0]; + for(i=1; i<25; i++){ + p = merge_pagelist (p, a[i]); + } + return p; +} + +int main() +{ + PgHdr a[5]; + PgHdr *p; + a[0].pgno = 5; + a[0].pDirty = &a[1]; + a[1].pgno = 4; + a[1].pDirty = &a[2]; + a[2].pgno = 1; + a[2].pDirty = &a[3]; + a[3].pgno = 3; + 
a[3].pDirty = 0; + p = sort_pagelist (&a[0]); + if (p->pDirty == p) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr33992.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr33992.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr33992.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr33992.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,37 @@ +extern void abort (); + +void __attribute__((noinline)) +bar (unsigned long long i) +{ + if (i) + abort (); +} + +static void __attribute__((always_inline)) +foo (unsigned long long *r) +{ + int i; + + for (i = 0; ; i++) + if (*r & ((unsigned long long)1 << (63 - i))) + break; + + bar (i); +} + +void __attribute__((noinline)) +do_test (unsigned long long *r) +{ + int i; + + for (i = 0; i < 2; ++i) + foo (r); +} + +int main() +{ + unsigned long long r = 0x8000000000000001ull; + + do_test (&r); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr34070-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr34070-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr34070-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr34070-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,13 @@ +extern void abort (void); + +int f(unsigned int x) +{ + return ((int)x) % 4; +} + +int main() +{ + if (f(-1) != -1) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr34070-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr34070-2.c?rev=374156&view=auto 
============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr34070-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr34070-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,13 @@ +extern void abort (void); + +int f(unsigned int x, int n) +{ + return ((int)x) / (1 << n); +} + +int main() +{ + if (f(-1, 1) != 0) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr34099-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr34099-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr34099-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr34099-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,46 @@ +int test1 (int b, int c) +{ + char x; + if (b) + return x / c; + else + return 1; +} +int test2 (int b, int c) +{ + int x; + if (b) + return x * c; + else + return 1; +} +int test3 (int b, int c) +{ + int x; + if (b) + return x % c; + else + return 1; +} +int test4 (int b, int c) +{ + char x; + if (b) + return x == c; + else + return 1; +} + +extern void abort (void); +int main() +{ + if (test1(1, 1000) != 0) + abort (); + if (test2(1, 0) != 0) + abort (); + if (test3(1, 1) != 0) + abort (); + if (test4(1, 1000) != 0) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr34099.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr34099.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr34099.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr34099.c Wed Oct 9 04:01:46 
2019 @@ -0,0 +1,15 @@ +int foo (int b, int c) +{ + int x; + if (b) + return x & c; + else + return 1; +} +extern void abort (void); +int main() +{ + if (foo(1, 0) != 0) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr34130.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr34130.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr34130.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr34130.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,12 @@ +extern void abort (void); +int foo (int i) +{ + return -2 * __builtin_abs(i - 2); +} +int main() +{ + if (foo(1) != -2 + || foo(3) != -2) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr34154.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr34154.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr34154.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr34154.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,16 @@ +int foo( unsigned long long aLL ) +{ + switch( aLL ) + { + case 1000000000000000000ULL ... 
9999999999999999999ULL : return 19 ; + default : return 20 ; + }; +}; +extern void abort (void); +int main() +{ + unsigned long long aLL = 1000000000000000000ULL; + if (foo (aLL) != 19) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr34176.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr34176.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr34176.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr34176.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,68 @@ + +typedef __SIZE_TYPE__ size_t; +typedef unsigned int index_ty; +typedef index_ty *index_list_ty; + +struct mult_index +{ + index_ty index; + unsigned int count; +}; + +struct mult_index_list +{ + struct mult_index *item; + size_t nitems; + size_t nitems_max; + + struct mult_index *item2; + size_t nitems2_max; +}; + +int __attribute__((noinline)) +hash_find_entry (size_t *result) +{ + *result = 2; + return 0; +} + +extern void abort (void); +struct mult_index * __attribute__((noinline)) +foo (size_t n) +{ + static count = 0; + if (count++ > 0) + abort (); + return 0; +} + +int +main (void) +{ + size_t nitems = 0; + + for (;;) + { + size_t list; + + hash_find_entry (&list); + { + size_t len2 = list; + struct mult_index *destptr; + struct mult_index *dest; + size_t new_max = nitems + len2; + + if (new_max != len2) + break; + dest = foo (new_max); + + destptr = dest; + while (len2--) + destptr++; + + nitems = destptr - dest; + } + } + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr34415.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr34415.c?rev=374156&view=auto ============================================================================== --- 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr34415.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr34415.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,34 @@ +const char *__attribute__((noinline)) +foo (const char *p) +{ + const char *end; + int len = 1; + for (;;) + { + int c = *p; + c = (c >= 'a' && c <= 'z' ? c - 'a' + 'A' : c); + if (c == 'B') + end = p; + else if (c == 'A') + { + end = p; + do + p++; + while (*p == '+'); + } + else + break; + p++; + len++; + } + if (len > 2 && *p == ':') + p = end; + return p; +} + +int +main (void) +{ + const char *input = "Bbb:"; + return foo (input) != input + 2; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr34456.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr34456.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr34456.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr34456.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,32 @@ +/* { dg-skip-if "requires qsort" { freestanding } } */ + +#include <stdlib.h> + +int __attribute__ ((noinline)) debug (void) { return 1; } +int errors; + +struct s { int elt; int (*compare) (int); }; + +static int +compare (const void *x, const void *y) +{ + const struct s *s1 = x, *s2 = y; + int (*compare1) (int); + int elt2; + + compare1 = s1->compare; + elt2 = s2->elt; + if (elt2 != 0 && debug () && compare1 (s1->elt) != 0) + errors++; + return compare1 (elt2); +} + +int bad_compare (int x) { return -x; } +struct s array[2] = { { 1, bad_compare }, { -1, bad_compare } }; + +int +main (void) +{ + qsort (array, 2, sizeof (struct s), compare); + return errors == 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr34768-1.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr34768-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr34768-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr34768-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,26 @@ +int x; + +void __attribute__((noinline)) foo (void) +{ + x = -x; +} +void __attribute__((const,noinline)) bar (void) +{ +} + +int __attribute__((noinline)) +test (int c) +{ + int tmp = x; + (c ? foo : bar) (); + return tmp + x; +} + +extern void abort (void); +int main() +{ + x = 1; + if (test (1) != 0) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr34768-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr34768-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr34768-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr34768-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,28 @@ +int x; + +int __attribute__((noinline)) foo (void) +{ + x = -x; + return 0; +} +int __attribute__((const,noinline)) bar (void) +{ + return 0; +} + +int __attribute__((noinline)) +test (int c) +{ + int tmp = x; + int res = (c ? 
foo : bar) (); + return tmp + x + res; +} + +extern void abort (void); +int main() +{ + x = 1; + if (test (1) != 0) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr34971.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr34971.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr34971.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr34971.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,22 @@ +struct foo +{ + unsigned long long b:40; +} x; + +extern void abort (void); + +void test1(unsigned long long res) +{ + /* Build a rotate expression on a 40 bit argument. */ + if ((x.b<<8) + (x.b>>32) != res) + abort (); +} + +int main() +{ + x.b = 0x0100000001; + test1(0x0000000101); + x.b = 0x0100000000; + test1(0x0000000001); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr34982.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr34982.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr34982.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr34982.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,15 @@ +extern void abort (void); + +static void something(); + +int main() +{ + something(-1); + return 0; +} + +static void something(int i) +{ + if (i != -1) + abort (); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr35163.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr35163.c?rev=374156&view=auto ============================================================================== --- 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr35163.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr35163.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,14 @@ +extern void abort(void); + +int main() +{ + signed char a = -30; + signed char b = -31; + #if(__SIZEOF_INT__ >= 4) + if (a > (unsigned short)b) +#else + if ((long) a > (unsigned short)b) +#endif + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr35231.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr35231.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr35231.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr35231.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,16 @@ +extern void abort(void); + +int __attribute__((noinline)) +foo(int bits_per_pixel, int depth) +{ + if ((bits_per_pixel | depth) == 1) + abort (); + return bits_per_pixel; +} + +int main() +{ + if (foo(2, 0) != 2) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr35390.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr35390.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr35390.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr35390.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,13 @@ +extern void abort (void); + +unsigned int foo (int n) +{ + return ~((unsigned int)~n); +} + +int main() +{ + if (foo(0) != 0) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr35456.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr35456.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr35456.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr35456.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,22 @@ +/* { dg-skip-if "signed zero not supported" { "vax-*-*" } } */ +extern void abort (void); + +double +__attribute__ ((noinline)) +not_fabs (double x) +{ + return x >= 0.0 ? x : -x; +} + +int main() +{ + double x = -0.0; + double y; + + y = not_fabs (x); + + if (!__builtin_signbit (y)) + abort(); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr35472.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr35472.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr35472.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr35472.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,22 @@ +extern void abort (void); +extern void *memset (void *s, int c, __SIZE_TYPE__ n); +struct S { int i[16]; }; +struct S *p; +void __attribute__((noinline,noclone)) +foo(struct S *a, struct S *b) { a->i[0] = -1; p = b; } +void test (void) +{ + struct S a, b; + memset (&a.i[0], '\0', sizeof (a.i)); + memset (&b.i[0], '\0', sizeof (b.i)); + foo (&a, &b); + *p = a; + *p = b; + if (b.i[0] != -1) + abort (); +} +int main() +{ + test(); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr35800.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr35800.c?rev=374156&view=auto ============================================================================== --- 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr35800.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr35800.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,103 @@ +extern void abort (void); + +int stab_xcoff_builtin_type (int typenum) +{ + const char *name; + if (typenum >= 0 || typenum < -34) + { + return 0; + } + switch (-typenum) + { + case 1: + name = "int"; + break; + case 2: + name = "char"; + case 3: + name = "short"; + break; + case 4: + name = "long"; + case 5: + name = "unsigned char"; + case 6: + name = "signed char"; + case 7: + name = "unsigned short"; + case 8: + name = "unsigned int"; + case 9: + name = "unsigned"; + case 10: + name = "unsigned long"; + case 11: + name = "void"; + case 12: + name = "float"; + case 13: + name = "double"; + case 14: + name = "long double"; + case 15: + name = "integer"; + case 16: + name = "boolean"; + case 17: + name = "short real"; + case 18: + name = "real"; + case 19: + name = "stringptr"; + case 20: + name = "character"; + case 21: + name = "logical*1"; + case 22: + name = "logical*2"; + case 23: + name = "logical*4"; + case 24: + name = "logical"; + case 25: + name = "complex"; + case 26: + name = "double complex"; + case 27: + name = "integer*1"; + case 28: + name = "integer*2"; + case 29: + name = "integer*4"; + case 30: + name = "wchar"; + case 31: + name = "long long"; + case 32: + name = "unsigned long long"; + case 33: + name = "logical*8"; + case 34: + name = "integer*8"; + } + return name[0]; +} + +int main() +{ + int i; + if (stab_xcoff_builtin_type(0) != 0) + abort (); + if (stab_xcoff_builtin_type(-1) != 'i') + abort (); + if (stab_xcoff_builtin_type(-2) != 's') + abort (); + if (stab_xcoff_builtin_type(-3) != 's') + abort (); + for (i = -4; i >= -34; --i) + if (stab_xcoff_builtin_type(i) != 'i') + abort (); + if (stab_xcoff_builtin_type(-35) != 0) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr36034-1.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr36034-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr36034-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr36034-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,32 @@ +double x[5][10] = { { 10, 11, 12, 13, 14, 15, -1, -1, -1, -1 }, + { 21, 22, 23, 24, 25, 26, -1, -1, -1, -1 }, + { 32, 33, 34, 35, 36, 37, -1, -1, -1, -1 }, + { 43, 44, 45, 46, 47, 48, -1, -1, -1, -1 }, + { 54, 55, 56, 57, 58, 59, -1, -1, -1, -1 } }; +double tmp[5][6]; + +void __attribute__((noinline)) +test (void) +{ + int i, j; + for (i = 0; i < 5; ++i) + { + tmp[i][0] = x[i][0]; + tmp[i][1] = x[i][1]; + tmp[i][2] = x[i][2]; + tmp[i][3] = x[i][3]; + tmp[i][4] = x[i][4]; + tmp[i][5] = x[i][5]; + } +} +extern void abort (void); +int main() +{ + int i, j; + test(); + for (i = 0; i < 5; ++i) + for (j = 0; j < 6; ++j) + if (tmp[i][j] == -1) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr36034-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr36034-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr36034-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr36034-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,32 @@ +double x[50] = { 10, 11, 12, 13, 14, 15, -1, -1, -1, -1, + 21, 22, 23, 24, 25, 26, -1, -1, -1, -1, + 32, 33, 34, 35, 36, 37, -1, -1, -1, -1, + 43, 44, 45, 46, 47, 48, -1, -1, -1, -1, + 54, 55, 56, 57, 58, 59, -1, -1, -1, -1 }; +double tmp[30]; + +void __attribute__((noinline)) +test (void) +{ + int i, j; + for (i = 0; i < 5; ++i) + { + tmp[i*6] = x[i*10]; + tmp[i*6+1] = x[i*10+1]; + 
tmp[i*6+2] = x[i*10+2]; + tmp[i*6+3] = x[i*10+3]; + tmp[i*6+4] = x[i*10+4]; + tmp[i*6+5] = x[i*10+5]; + } +} +extern void abort (void); +int main() +{ + int i, j; + test(); + for (i = 0; i < 5; ++i) + for (j = 0; j < 6; ++j) + if (tmp[i*6+j] == -1) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr36038.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr36038.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr36038.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr36038.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,43 @@ +/* PR tree-optimization/36038 */ + +long long list[10]; +long long expect[10] = { 0, 1, 2, 3, 4, 4, 5, 6, 7, 9 }; +long long *stack_base; +int indices[10]; +int *markstack_ptr; + +void +doit (void) +{ + long long *src; + long long *dst; + long long *sp = stack_base + 5; + int diff = 2; + int shift; + int count; + + shift = diff - (markstack_ptr[-1] - markstack_ptr[-2]); + count = (sp - stack_base) - markstack_ptr[-1] + 2; + src = sp; + dst = (sp += shift); + while (--count) + *dst-- = *src--; +} + +int +main () +{ + int i; + for (i = 0; i < 10; i++) + list[i] = i; + + markstack_ptr = indices + 9; + markstack_ptr[-1] = 2; + markstack_ptr[-2] = 1; + + stack_base = list + 2; + doit (); + if (__builtin_memcmp (expect, list, sizeof (list))) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr36077.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr36077.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr36077.c (added) +++ 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr36077.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,13 @@ +extern void abort (void); + +unsigned int test (unsigned int x) +{ + return x / 0x80000001U / 0x00000002U; +} + +int main() +{ + if (test(2) != 0) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr36093.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr36093.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr36093.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr36093.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,30 @@ +/* { dg-skip-if "small alignment" { pdp11-*-* } } */ + +extern void abort (void); + +typedef struct Bar { + char c[129]; +} Bar __attribute__((__aligned__(128))); + +typedef struct Foo { + Bar bar[4]; +} Foo; + +Foo foo[4]; + +int main() +{ + int i, j; + Foo *foop = &foo[0]; + + for (i=0; i < 4; i++) { + Bar *bar = &foop->bar[i]; + for (j=0; j < 129; j++) { + bar->c[j] = 'a' + i; + } + } + + if (foo[0].bar[3].c[128] != 'd') + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr36321.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr36321.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr36321.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr36321.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,26 @@ +/* { dg-skip-if "requires alloca" { ! 
alloca } { "-O0" } { "" } } */ +extern void abort (void); + +extern __SIZE_TYPE__ strlen (const char *); +void foo(char *str) +{ + int len2 = strlen (str); + char *a = (char *) __builtin_alloca (0); + char *b = (char *) __builtin_alloca (len2*3); + + if ((int) (a-b) < (len2*3)) + { +#ifdef _WIN32 + abort (); +#endif + return; + } +} + +static char * volatile argp = "pr36321.x"; + +int main(int argc, char **argv) +{ + foo (argp); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr36339.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr36339.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr36339.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr36339.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,31 @@ +extern void abort (void); + +typedef unsigned long my_uintptr_t; + +int check_a(my_uintptr_t tagged_ptr); + +int __attribute__((noinline)) try_a(my_uintptr_t x) +{ + my_uintptr_t heap[2]; + my_uintptr_t *hp = heap; + + hp[0] = x; + hp[1] = 0; + return check_a((my_uintptr_t)(void*)((char*)hp + 1)); +} + +int __attribute__((noinline)) check_a(my_uintptr_t tagged_ptr) +{ + my_uintptr_t *hp = (my_uintptr_t*)(void*)((char*)tagged_ptr - 1); + + if (hp[0] == 42 && hp[1] == 0) + return 0; + return -1; +} + +int main(void) +{ + if (try_a(42) < 0) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr36343.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr36343.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr36343.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr36343.c Wed Oct 9 
04:01:46 2019 @@ -0,0 +1,31 @@ +extern void abort (void); + +void __attribute__((noinline)) +bar (int **p) +{ + float *q = (float *)p; + *q = 0.0; +} + +float __attribute__((noinline)) +foo (int b) +{ + int *i = 0; + float f = 1.0; + int **p; + if (b) + p = &i; + else + p = (int **)&f; + bar (p); + if (b) + return **p; + return f; +} + +int main() +{ + if (foo(0) != 0.0) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr36691.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr36691.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr36691.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr36691.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,16 @@ +unsigned char g_5; + +void func_1 (void) +{ + for (g_5 = 9; g_5 >= 4; g_5 -= 5) + ; +} + +extern void abort (void); +int main (void) +{ + func_1 (); + if (g_5 != 0) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr36765.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr36765.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr36765.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr36765.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,15 @@ +int __attribute__((noinline)) +foo(int i) +{ + int *p = __builtin_malloc (4 * sizeof(int)); + *p = 0; + p[i] = 1; + return *p; +} +extern void abort (void); +int main() +{ + if (foo(0) != 1) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr37102.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr37102.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr37102.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr37102.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,25 @@ +extern void abort (void); + +unsigned int a, b = 1, c; + +void __attribute__ ((noinline)) +foo (int x) +{ + if (x != 5) + abort (); +} + +int +main () +{ + unsigned int d, e; + for (d = 1; d < 5; d++) + if (c) + a = b; + a = b; + e = a << 1; + if (e) + e = (e << 1) ^ 1; + foo (e); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr37125.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr37125.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr37125.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr37125.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,22 @@ +extern void abort (void); + +static inline unsigned int +mod_rhs(int rhs) +{ + if (rhs == 0) return 1; + return rhs; +} + +void func_44 (unsigned int p_45); +void func_44 (unsigned int p_45) +{ + if (!((p_45 * -9) % mod_rhs (-9))) { + abort(); + } +} + +int main (void) +{ + func_44 (2); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr37573.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr37573.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr37573.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr37573.c Wed Oct 9 
04:01:46 2019 @@ -0,0 +1,65 @@ +/* PR tree-optimization/37573 */ +/* { dg-require-effective-target int32plus } */ + +struct S +{ + unsigned int *a; + unsigned int b; + unsigned int c[624]; +}; + +static unsigned char __attribute__((noinline)) +foo (struct S *s) +{ + unsigned int r; + if (!--s->b) + { + unsigned int *c = s->c; + unsigned int i; + s->a = c; + for (i = 0; i < 227; i++) + c[i] = ((((c[i] ^ c[i + 1]) & 0x7ffffffe) ^ c[i]) >> 1) + ^ ((0 - (c[i + 1] & 1)) & 0x9908b0df) ^ c[i + 397]; + } + r = *(s->a++); + r ^= (r >> 11); + r ^= ((r & 0xff3a58ad) << 7); + r ^= ((r & 0xffffdf8c) << 15); + r ^= (r >> 18); + return (unsigned char) (r >> 1); +} + +static void __attribute__((noinline)) +bar (unsigned char *p, unsigned int q, unsigned int r) +{ + struct S s; + unsigned int i; + unsigned int *c = s.c; + *c = r; + for (i = 1; i < 624; i++) + c[i] = i + 0x6c078965 * ((c[i - 1] >> 30) ^ c[i - 1]); + s.b = 1; + while (q--) + *p++ ^= foo (&s); +}; + +static unsigned char p[23] = { + 0xc0, 0x49, 0x17, 0x32, 0x62, 0x1e, 0x2e, 0xd5, 0x4c, 0x19, 0x28, 0x49, + 0x91, 0xe4, 0x72, 0x83, 0x91, 0x3d, 0x93, 0x83, 0xb3, 0x61, 0x38 +}; + +static unsigned char q[23] = { + 0x3e, 0x41, 0x55, 0x54, 0x4f, 0x49, 0x54, 0x20, 0x55, 0x4e, 0x49, 0x43, + 0x4f, 0x44, 0x45, 0x20, 0x53, 0x43, 0x52, 0x49, 0x50, 0x54, 0x3c +}; + +int +main (void) +{ + unsigned int s; + s = 23; + bar (p, s, s + 0xa25e); + if (__builtin_memcmp (p, q, s) != 0) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr37780.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr37780.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr37780.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr37780.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,49 @@ +/* PR middle-end/37780. 
*/ + +#define VAL (8 * sizeof (int)) + +int __attribute__ ((noinline, noclone)) +fooctz (int i) +{ + return (i == 0) ? VAL : __builtin_ctz (i); +} + +int __attribute__ ((noinline, noclone)) +fooctz2 (int i) +{ + return (i != 0) ? __builtin_ctz (i) : VAL; +} + +unsigned int __attribute__ ((noinline, noclone)) +fooctz3 (unsigned int i) +{ + return (i > 0) ? __builtin_ctz (i) : VAL; +} + +int __attribute__ ((noinline, noclone)) +fooclz (int i) +{ + return (i == 0) ? VAL : __builtin_clz (i); +} + +int __attribute__ ((noinline, noclone)) +fooclz2 (int i) +{ + return (i != 0) ? __builtin_clz (i) : VAL; +} + +unsigned int __attribute__ ((noinline, noclone)) +fooclz3 (unsigned int i) +{ + return (i > 0) ? __builtin_clz (i) : VAL; +} + +int +main (void) +{ + if (fooctz (0) != VAL || fooctz2 (0) != VAL || fooctz3 (0) != VAL + || fooclz (0) != VAL || fooclz2 (0) != VAL || fooclz3 (0) != VAL) + __builtin_abort (); + + return 0; +} \ No newline at end of file Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr37882.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr37882.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr37882.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr37882.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,15 @@ +/* PR middle-end/37882 */ + +struct S +{ + unsigned char b : 3; +} s; + +int +main () +{ + s.b = 4; + if (s.b > 0 && s.b < 4) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr37924.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr37924.c?rev=374156&view=auto ============================================================================== --- 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr37924.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr37924.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,50 @@ +/* PR c/37924 */ + +extern void abort (void); + +signed char a; +unsigned char b; + +int +test1 (void) +{ + int c = -1; + return ((unsigned int) (a ^ c)) >> 9; +} + +int +test2 (void) +{ + int c = -1; + return ((unsigned int) (b ^ c)) >> 9; +} + +int +main (void) +{ + a = 0; + if (test1 () != (-1U >> 9)) + abort (); + a = 0x40; + if (test1 () != (-1U >> 9)) + abort (); + a = 0x80; + if (test1 () != (a < 0) ? 0 : (-1U >> 9)) + abort (); + a = 0xff; + if (test1 () != (a < 0) ? 0 : (-1U >> 9)) + abort (); + b = 0; + if (test2 () != (-1U >> 9)) + abort (); + b = 0x40; + if (test2 () != (-1U >> 9)) + abort (); + b = 0x80; + if (test2 () != (-1U >> 9)) + abort (); + b = 0xff; + if (test2 () != (-1U >> 9)) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr37931.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr37931.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr37931.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr37931.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,23 @@ +/* PR middle-end/37931 */ + +extern void abort (void); + +int +foo (int a, unsigned int b) +{ + return (a | 1) & (b | 1); +} + +int +main (void) +{ + if (foo (6, 0xc6) != 7) + abort (); + if (foo (0x80, 0xc1) != 0x81) + abort (); + if (foo (4, 4) != 5) + abort (); + if (foo (5, 4) != 5) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr38048-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr38048-1.c?rev=374156&view=auto 
==============================================================================
--- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr38048-1.c (added)
+++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr38048-1.c Wed Oct 9 04:01:46 2019
@@ -0,0 +1,21 @@
+extern void abort(void);
+
+int foo ()
+{
+  int mat[2][1];
+  int (*a)[1] = mat;
+  int det = 0;
+  int i;
+  mat[0][0] = 1;
+  mat[1][0] = 2;
+  for (i = 0; i < 2; ++i)
+    det += a[i][0];
+  return det;
+}
+
+int main()
+{
+  if (foo () != 3)
+    abort ();
+  return 0;
+}

Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr38048-2.c
URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr38048-2.c?rev=374156&view=auto
==============================================================================
--- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr38048-2.c (added)
+++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr38048-2.c Wed Oct 9 04:01:46 2019
@@ -0,0 +1,27 @@
+extern void abort (void);
+
+static int inv_J(int a[][2])
+{
+  int i, j;
+  int det = 0.0;
+  for (j=0; j<2; ++j)
+    det += a[j][0] + a[j][1];
+  return det;
+}
+
+int foo()
+{
+  int mat[2][2];
+  mat[0][0] = 1;
+  mat[0][1] = 2;
+  mat[1][0] = 4;
+  mat[1][1] = 8;
+  return inv_J(mat);
+}
+
+int main()
+{
+  if (foo () != 15)
+    abort ();
+  return 0;
+}

Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr38051.c
URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr38051.c?rev=374156&view=auto
==============================================================================
--- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr38051.c (added)
+++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr38051.c Wed Oct 9 04:01:46 2019
@@ -0,0 +1,215 @@
+typedef __SIZE_TYPE__ size_t;
+static int mymemcmp1 (unsigned long int, unsigned long int)
+  __attribute__ ((__nothrow__));
+
+__inline static int
+mymemcmp1 (unsigned long int a, unsigned long int b)
+{
+  long int srcp1 = (long int) &a;
+  long int srcp2 = (long int) &b;
+  unsigned long int a0, b0;
+  do
+    {
+      a0 = ((unsigned char *) srcp1)[0];
+      b0 = ((unsigned char *) srcp2)[0];
+      srcp1 += 1;
+      srcp2 += 1;
+    }
+  while (a0 == b0);
+  return a0 - b0;
+}
+
+static int mymemcmp2 (long, long, size_t) __attribute__ ((__nothrow__));
+
+static int
+mymemcmp2 (long int srcp1, long int srcp2, size_t len)
+{
+  unsigned long int a0, a1;
+  unsigned long int b0, b1;
+  switch (len % 4)
+    {
+    default:
+    case 2:
+      a0 = ((unsigned long int *) srcp1)[0];
+      b0 = ((unsigned long int *) srcp2)[0];
+      srcp1 -= 2 * (sizeof (unsigned long int));
+      srcp2 -= 2 * (sizeof (unsigned long int));
+      len += 2;
+      goto do1;
+    case 3:
+      a1 = ((unsigned long int *) srcp1)[0];
+      b1 = ((unsigned long int *) srcp2)[0];
+      srcp1 -= (sizeof (unsigned long int));
+      srcp2 -= (sizeof (unsigned long int));
+      len += 1;
+      goto do2;
+    case 0:
+      if (16 <= 3 * (sizeof (unsigned long int)) && len == 0)
+        return 0;
+      a0 = ((unsigned long int *) srcp1)[0];
+      b0 = ((unsigned long int *) srcp2)[0];
+      goto do3;
+    case 1:
+      a1 = ((unsigned long int *) srcp1)[0];
+      b1 = ((unsigned long int *) srcp2)[0];
+      srcp1 += (sizeof (unsigned long int));
+      srcp2 += (sizeof (unsigned long int));
+      len -= 1;
+      if (16 <= 3 * (sizeof (unsigned long int)) && len == 0)
+        goto do0;
+    }
+  do
+    {
+      a0 = ((unsigned long int *) srcp1)[0];
+      b0 = ((unsigned long int *) srcp2)[0];
+      if (a1 != b1)
+        return mymemcmp1 ((a1), (b1));
+    do3:
+      a1 = ((unsigned long int *) srcp1)[1];
+      b1 = ((unsigned long int *) srcp2)[1];
+      if (a0 != b0)
+        return mymemcmp1 ((a0), (b0));
+    do2:
+      a0 = ((unsigned long int *) srcp1)[2];
+      b0 = ((unsigned long int *) srcp2)[2];
+      if (a1 != b1)
+        return mymemcmp1 ((a1), (b1));
+    do1:
+      a1 = ((unsigned long int *) srcp1)[3];
+      b1 = ((unsigned long int *) srcp2)[3];
+      if (a0 != b0)
+        return mymemcmp1 ((a0), (b0));
+      srcp1 += 4 * (sizeof (unsigned long int));
+      srcp2 += 4 * (sizeof (unsigned long int));
+      len -= 4;
+    }
+  while (len != 0);
+do0:
+  if (a1 != b1)
+    return mymemcmp1 ((a1), (b1));
+  return 0;
+}
+
+static int mymemcmp3 (long, long, size_t) __attribute__ ((__nothrow__));
+
+static int
+mymemcmp3 (long int srcp1, long int srcp2, size_t len)
+{
+  unsigned long int a0, a1, a2, a3;
+  unsigned long int b0, b1, b2, b3;
+  unsigned long int x;
+  int shl, shr;
+  shl = 8 * (srcp1 % (sizeof (unsigned long int)));
+  shr = 8 * (sizeof (unsigned long int)) - shl;
+  srcp1 &= -(sizeof (unsigned long int));
+  switch (len % 4)
+    {
+    default:
+    case 2:
+      a1 = ((unsigned long int *) srcp1)[0];
+      a2 = ((unsigned long int *) srcp1)[1];
+      b2 = ((unsigned long int *) srcp2)[0];
+      srcp1 -= 1 * (sizeof (unsigned long int));
+      srcp2 -= 2 * (sizeof (unsigned long int));
+      len += 2;
+      goto do1;
+    case 3:
+      a0 = ((unsigned long int *) srcp1)[0];
+      a1 = ((unsigned long int *) srcp1)[1];
+      b1 = ((unsigned long int *) srcp2)[0];
+      srcp2 -= 1 * (sizeof (unsigned long int));
+      len += 1;
+      goto do2;
+    case 0:
+      if (16 <= 3 * (sizeof (unsigned long int)) && len == 0)
+        return 0;
+      a3 = ((unsigned long int *) srcp1)[0];
+      a0 = ((unsigned long int *) srcp1)[1];
+      b0 = ((unsigned long int *) srcp2)[0];
+      srcp1 += 1 * (sizeof (unsigned long int));
+      goto do3;
+    case 1:
+      a2 = ((unsigned long int *) srcp1)[0];
+      a3 = ((unsigned long int *) srcp1)[1];
+      b3 = ((unsigned long int *) srcp2)[0];
+      srcp1 += 2 * (sizeof (unsigned long int));
+      srcp2 += 1 * (sizeof (unsigned long int));
+      len -= 1;
+      if (16 <= 3 * (sizeof (unsigned long int)) && len == 0)
+        goto do0;
+    }
+  do
+    {
+      a0 = ((unsigned long int *) srcp1)[0];
+      b0 = ((unsigned long int *) srcp2)[0];
+      x = (((a2) >> (shl)) | ((a3) << (shr)));
+      if (x != b3)
+        return mymemcmp1 ((x), (b3));
+    do3:
+      a1 = ((unsigned long int *) srcp1)[1];
+      b1 = ((unsigned long int *) srcp2)[1];
+      x = (((a3) >> (shl)) | ((a0) << (shr)));
+      if (x != b0)
+        return mymemcmp1 ((x), (b0));
+    do2:
+      a2 = ((unsigned long int *) srcp1)[2];
+      b2 = ((unsigned long int *) srcp2)[2];
+      x = (((a0) >> (shl)) | ((a1) << (shr)));
+      if (x != b1)
+        return mymemcmp1 ((x), (b1));
+    do1:
+      a3 = ((unsigned long int *) srcp1)[3];
+      b3 = ((unsigned long int *) srcp2)[3];
+      x = (((a1) >> (shl)) | ((a2) << (shr)));
+      if (x != b2)
+        return mymemcmp1 ((x), (b2));
+      srcp1 += 4 * (sizeof (unsigned long int));
+      srcp2 += 4 * (sizeof (unsigned long int));
+      len -= 4;
+    }
+  while (len != 0);
+do0:
+  x = (((a2) >> (shl)) | ((a3) << (shr)));
+  if (x != b3)
+    return mymemcmp1 ((x), (b3));
+  return 0;
+}
+
+__attribute__ ((noinline))
+int mymemcmp (const void *s1, const void *s2, size_t len)
+{
+  unsigned long int a0;
+  unsigned long int b0;
+  long int srcp1 = (long int) s1;
+  long int srcp2 = (long int) s2;
+  if (srcp1 % (sizeof (unsigned long int)) == 0)
+    return mymemcmp2 (srcp1, srcp2, len / (sizeof (unsigned long int)));
+  else
+    return mymemcmp3 (srcp1, srcp2, len / (sizeof (unsigned long int)));
+}
+
+char buf[256];
+
+int
+main (void)
+{
+  char *p;
+  union { long int l; char c[sizeof (long int)]; } u;
+
+  /* The test above assumes little endian and long being the same size
+     as pointer. */
+  if (sizeof (long int) != sizeof (void *) || sizeof (long int) < 4)
+    return 0;
+  u.l = 0x12345678L;
+  if (u.c[0] != 0x78 || u.c[1] != 0x56 || u.c[2] != 0x34 || u.c[3] != 0x12)
+    return 0;
+
+  p = buf + 16 - (((long int) buf) & 15);
+  __builtin_memcpy (p + 9,
+"\x1\x37\x82\xa7\x55\x49\x9d\xbf\xf8\x44\xb6\x55\x17\x8e\xf9", 15);
+  __builtin_memcpy (p + 128 + 24,
+"\x1\x37\x82\xa7\x55\x49\xd0\xf3\xb7\x2a\x6d\x23\x71\x49\x6a", 15);
+  if (mymemcmp (p + 9, p + 128 + 24, 33) != -51)
+    __builtin_abort ();
+  return 0;
+}

Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr38151.c
URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr38151.c?rev=374156&view=auto
==============================================================================
--- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr38151.c (added)
+++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr38151.c Wed Oct 9 04:01:46 2019
@@ -0,0 +1,47 @@
+/* { dg-options "-Wno-psabi" } */
+/* { dg-require-effective-target int32plus } */
+void abort (void);
+
+struct S2848
+{
+  unsigned int a;
+  _Complex int b;
+  struct
+  {
+  } __attribute__ ((aligned)) c;
+};
+
+struct S2848 s2848;
+
+int fails;
+
+void __attribute__((noinline))
+check2848va (int z, ...)
+{
+  struct S2848 arg;
+  __builtin_va_list ap;
+
+  __builtin_va_start (ap, z);
+
+  arg = __builtin_va_arg (ap, struct S2848);
+
+  if (s2848.a != arg.a)
+    ++fails;
+  if (s2848.b != arg.b)
+    ++fails;
+
+  __builtin_va_end (ap);
+}
+
+int main (void)
+{
+  s2848.a = 4027477739U;
+  s2848.b = (723419448 + -218144346 * __extension__ 1i);
+
+  check2848va (1, s2848);
+
+  if (fails)
+    abort ();
+
+  return 0;
+}

Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr38212.c
URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr38212.c?rev=374156&view=auto
==============================================================================
--- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr38212.c (added)
+++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr38212.c Wed Oct 9 04:01:46 2019
@@ -0,0 +1,21 @@
+int __attribute__((noinline))
+foo (int *__restrict p, int i)
+{
+  int *__restrict q;
+  int *__restrict r;
+  int v, w;
+  q = p + 1;
+  r = q - i;
+  v = *r;
+  *p = 1;
+  w = *r;
+  return v + w;
+}
+extern void abort (void);
+int main()
+{
+  int i = 0;
+  if (foo (&i, 1) != 1)
+    abort ();
+  return 0;
+}

Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr38236.c
URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr38236.c?rev=374156&view=auto
==============================================================================
--- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr38236.c (added)
+++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr38236.c Wed Oct 9 04:01:46 2019
@@ -0,0 +1,22 @@
+struct X { int i; };
+
+int __attribute__((noinline))
+foo (struct X *p, int *q, int a, int b)
+{
+  struct X x, y;
+  if (a)
+    p = &x;
+  if (b)
+    q = &x.i;
+  else
+    q = &y.i;
+  *q = 1;
+  return p->i;
+}
+extern void abort (void);
+int main()
+{
+  if (foo((void *)0, (void *)0, 1, 1) != 1)
+    abort ();
+  return 0;
+}

Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr38422.c
URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr38422.c?rev=374156&view=auto
==============================================================================
--- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr38422.c (added)
+++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr38422.c Wed Oct 9 04:01:46 2019
@@ -0,0 +1,24 @@
+/* PR middle-end/38422 */
+
+extern void abort (void);
+
+struct S
+{
+  int s : (sizeof (int) * __CHAR_BIT__ - 2);
+} s;
+
+void
+foo (void)
+{
+  s.s *= 2;
+}
+
+int
+main ()
+{
+  s.s = 24;
+  foo ();
+  if (s.s != 48)
+    abort ();
+  return 0;
+}

Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr38533.c
URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr38533.c?rev=374156&view=auto
==============================================================================
--- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr38533.c (added)
+++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr38533.c Wed Oct 9 04:01:46 2019
@@ -0,0 +1,21 @@
+/* PR middle-end/38533 */
+
+#define A asm volatile ("" : "=r" (f) : "0" (0)); e |= f;
+#define B A A A A A A A A A A A
+#define C B B B B B B B B B B B
+
+int
+foo (void)
+{
+  int e = 0, f;
+  C C B B B B B A A A A A A
+  return e;
+}
+
+int
+main (void)
+{
+  if (foo ())
+    __builtin_abort ();
+  return 0;
+}

Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr38819.c
URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr38819.c?rev=374156&view=auto
==============================================================================
--- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr38819.c (added)
+++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr38819.c Wed Oct 9 04:01:46 2019
@@ -0,0 +1,29 @@
+extern void exit (int);
+extern void abort (void);
+
+volatile int a = 1;
+volatile int b = 0;
+volatile int x = 2;
+volatile signed int r = 8;
+
+void __attribute__((noinline))
+foo (void)
+{
+  exit (0);
+}
+
+int
+main (void)
+{
+  int si1 = a;
+  int si2 = b;
+  int i;
+
+  for (i = 0; i < 100; ++i) {
+      foo ();
+      if (x == 8)
+        i++;
+      r += i + si1 % si2;
+  }
+  abort ();
+}

Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr38969.c
URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr38969.c?rev=374156&view=auto
==============================================================================
--- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr38969.c (added)
+++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr38969.c Wed Oct 9 04:01:46 2019
@@ -0,0 +1,25 @@
+__complex__ float
+__attribute__ ((noinline)) foo (__complex__ float x)
+{
+  return x;
+}
+
+__complex__ float
+__attribute__ ((noinline)) bar (__complex__ float x)
+{
+  return foo (x);
+}
+
+int main()
+{
+  __complex__ float a, b;
+  __real__ a = 9;
+  __imag__ a = 42;
+
+  b = bar (a);
+
+  if (a != b)
+    abort ();
+
+  return 0;
+}

Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr39100.c
URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr39100.c?rev=374156&view=auto
==============================================================================
--- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr39100.c (added)
+++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr39100.c Wed Oct 9 04:01:46 2019
@@ -0,0 +1,63 @@
+/* Bad PTA results (incorrect store handling) was causing us to delete
+   *na = 0 store. */
+
+typedef struct E
+{
+  int p;
+  struct E *n;
+} *EP;
+
+typedef struct C
+{
+  EP x;
+  short cn, cp;
+} *CP;
+
+__attribute__((noinline)) CP
+foo (CP h, EP x)
+{
+  EP pl = 0, *pa = &pl;
+  EP nl = 0, *na = &nl;
+  EP n;
+
+  while (x)
+    {
+      n = x->n;
+      if ((x->p & 1) == 1)
+        {
+          h->cp++;
+          *pa = x;
+          pa = &((*pa)->n);
+        }
+      else
+        {
+          h->cn++;
+          *na = x;
+          na = &((*na)->n);
+        }
+      x = n;
+    }
+  *pa = nl;
+  *na = 0;
+  h->x = pl;
+  return h;
+}
+
+int
+main (void)
+{
+  struct C c = { 0, 0, 0 };
+  struct E e[2] = { { 0, &e[1] }, { 1, 0 } };
+  EP p;
+
+  foo (&c, &e[0]);
+  if (c.cn != 1 || c.cp != 1)
+    __builtin_abort ();
+  if (c.x != &e[1])
+    __builtin_abort ();
+  if (e[1].n != &e[0])
+    __builtin_abort ();
+  if (e[0].n)
+    __builtin_abort ();
+  return 0;
+}

Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr39120.c
URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr39120.c?rev=374156&view=auto
==============================================================================
--- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr39120.c (added)
+++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr39120.c Wed Oct 9 04:01:46 2019
@@ -0,0 +1,18 @@
+struct X { int *p; } x;
+
+struct X __attribute__((noinline))
+foo(int *p) { struct X x; x.p = p; return x; }
+
+void __attribute((noinline))
+bar() { *x.p = 1; }
+
+extern void abort (void);
+int main()
+{
+  int i = 0;
+  x = foo(&i);
+  bar();
+  if (i != 1)
+    abort ();
+  return 0;
+}

Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr39228.c
URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr39228.c?rev=374156&view=auto
==============================================================================
--- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr39228.c (added)
+++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr39228.c Wed Oct 9 04:01:46 2019
@@ -0,0 +1,39 @@
+/* { dg-add-options ieee } */
+/* { dg-skip-if "No Inf/NaN support" { spu-*-* } } */
+
+extern void abort (void);
+
+static inline int __attribute__((always_inline)) testf (float b)
+{
+  float c = 1.01f * b;
+
+  return __builtin_isinff (c);
+}
+
+static inline int __attribute__((always_inline)) test (double b)
+{
+  double c = 1.01 * b;
+
+  return __builtin_isinf (c);
+}
+
+static inline int __attribute__((always_inline)) testl (long double b)
+{
+  long double c = 1.01L * b;
+
+  return __builtin_isinfl (c);
+}
+
+int main()
+{
+  if (testf (__FLT_MAX__) < 1)
+    abort ();
+
+  if (test (__DBL_MAX__) < 1)
+    abort ();
+
+  if (testl (__LDBL_MAX__) < 1)
+    abort ();
+
+  return 0;
+}

Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr39233.c
URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr39233.c?rev=374156&view=auto
==============================================================================
--- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr39233.c (added)
+++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr39233.c Wed Oct 9 04:01:46 2019
@@ -0,0 +1,18 @@
+extern void abort (void);
+
+__attribute__((noinline)) void
+foo (void *p)
+{
+  long l = (long) p;
+  if (l < 0 || l > 6)
+    abort ();
+}
+
+int
+main ()
+{
+  short i;
+  for (i = 6; i >= 0; i--)
+    foo ((void *) (long) i);
+  return 0;
+}

Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr39240.c
URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr39240.c?rev=374156&view=auto
==============================================================================
--- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr39240.c (added)
+++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr39240.c Wed Oct 9 04:01:46 2019
@@ -0,0 +1,105 @@
+/* PR target/39240 */
+
+extern void abort (void);
+
+__attribute__ ((noinline))
+static int foo1 (int x)
+{
+  return x;
+}
+
+__attribute__ ((noinline))
+unsigned int bar1 (int x)
+{
+  return foo1 (x + 6);
+}
+
+volatile unsigned long l1 = (unsigned int) -4;
+
+__attribute__ ((noinline))
+static short int foo2 (int x)
+{
+  return x;
+}
+
+__attribute__ ((noinline))
+unsigned short int bar2 (int x)
+{
+  return foo2 (x + 6);
+}
+
+volatile unsigned long l2 = (unsigned short int) -4;
+
+__attribute__ ((noinline))
+static signed char foo3 (int x)
+{
+  return x;
+}
+
+__attribute__ ((noinline))
+unsigned char bar3 (int x)
+{
+  return foo3 (x + 6);
+}
+
+volatile unsigned long l3 = (unsigned char) -4;
+
+__attribute__ ((noinline))
+static unsigned int foo4 (int x)
+{
+  return x;
+}
+
+__attribute__ ((noinline))
+int bar4 (int x)
+{
+  return foo4 (x + 6);
+}
+
+volatile unsigned long l4 = (int) -4;
+
+__attribute__ ((noinline))
+static unsigned short int foo5 (int x)
+{
+  return x;
+}
+
+__attribute__ ((noinline))
+short int bar5 (int x)
+{
+  return foo5 (x + 6);
+}
+
+volatile unsigned long l5 = (short int) -4;
+
+__attribute__ ((noinline))
+static unsigned char foo6 (int x)
+{
+  return x;
+}
+
+__attribute__ ((noinline))
+signed char bar6 (int x)
+{
+  return foo6 (x + 6);
+}
+
+volatile unsigned long l6 = (signed char) -4;
+
+int
+main (void)
+{
+  if (bar1 (-10) != l1)
+    abort ();
+  if (bar2 (-10) != l2)
+    abort ();
+  if (bar3 (-10) != l3)
+    abort ();
+  if (bar4 (-10) != l4)
+    abort ();
+  if (bar5 (-10) != l5)
+    abort ();
+  if (bar6 (-10) != l6)
+    abort ();
+  return 0;
+}

Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr39339.c
URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr39339.c?rev=374156&view=auto
==============================================================================
--- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr39339.c (added)
+++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr39339.c Wed Oct 9 04:01:46 2019
@@ -0,0 +1,80 @@
+struct C
+{
+  unsigned int c;
+  struct D
+  {
+    unsigned int columns : 4;
+    unsigned int fore : 12;
+    unsigned int back : 6;
+    unsigned int fragment : 1;
+    unsigned int standout : 1;
+    unsigned int underline : 1;
+    unsigned int strikethrough : 1;
+    unsigned int reverse : 1;
+    unsigned int blink : 1;
+    unsigned int half : 1;
+    unsigned int bold : 1;
+    unsigned int invisible : 1;
+    unsigned int pad : 1;
+  } attr;
+};
+
+struct A
+{
+  struct C *data;
+  unsigned int len;
+};
+
+struct B
+{
+  struct A *cells;
+  unsigned char soft_wrapped : 1;
+};
+
+struct E
+{
+  long row, col;
+  struct C defaults;
+};
+
+__attribute__ ((noinline))
+void foo (struct E *screen, unsigned int c, int columns, struct B *row)
+{
+  struct D attr;
+  long col;
+  int i;
+  col = screen->col;
+  attr = screen->defaults.attr;
+  attr.columns = columns;
+  row->cells->data[col].c = c;
+  row->cells->data[col].attr = attr;
+  col++;
+  attr.fragment = 1;
+  for (i = 1; i < columns; i++)
+    {
+      row->cells->data[col].c = c;
+      row->cells->data[col].attr = attr;
+      col++;
+    }
+}
+
+int
+main (void)
+{
+  struct E e = {.row = 5,.col = 0,.defaults =
+                {6, {-1, -1, -1, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0}} };
+  struct C c[4];
+  struct A a = { c, 4 };
+  struct B b = { &a, 1 };
+  struct D d;
+  __builtin_memset (&c, 0, sizeof c);
+  foo (&e, 65, 2, &b);
+  d = e.defaults.attr;
+  d.columns = 2;
+  if (__builtin_memcmp (&d, &c[0].attr, sizeof d))
+    __builtin_abort ();
+  d.fragment = 1;
+  if (__builtin_memcmp (&d, &c[1].attr, sizeof d))
+    __builtin_abort ();
+  return 0;
+}

Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr39501.c
URL:
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr39501.c?rev=374156&view=auto
==============================================================================
--- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr39501.c (added)
+++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr39501.c Wed Oct 9 04:01:46 2019
@@ -0,0 +1,87 @@
+/* { dg-options "-ffast-math" } */
+
+extern void abort (void);
+extern void exit (int);
+
+#define min1(a,b) ((a) < (b) ? (a) : (b))
+#define max1(a,b) ((a) > (b) ? (a) : (b))
+
+#define min2(a,b) ((a) <= (b) ? (a) : (b))
+#define max2(a,b) ((a) >= (b) ? (a) : (b))
+
+#define F(type,n) \
+  type __attribute__((noinline)) type##_##n(type a, type b) \
+  { \
+    return n(a, b); \
+  }
+
+F(float,min1)
+F(float,min2)
+F(float,max1)
+F(float,max2)
+
+F(double,min1)
+F(double,min2)
+F(double,max1)
+F(double,max2)
+
+int main()
+{
+  if (float_min1(0.f, -1.f) != -1.f) abort();
+  if (float_min1(-1.f, 0.f) != -1.f) abort();
+  if (float_min1(0.f, 1.f) != 0.f) abort();
+  if (float_min1(1.f, 0.f) != 0.f) abort();
+  if (float_min1(-1.f, 1.f) != -1.f) abort();
+  if (float_min1(1.f, -1.f) != -1.f) abort();
+
+  if (float_max1(0.f, -1.f) != 0.f) abort();
+  if (float_max1(-1.f, 0.f) != 0.f) abort();
+  if (float_max1(0.f, 1.f) != 1.f) abort();
+  if (float_max1(1.f, 0.f) != 1.f) abort();
+  if (float_max1(-1.f, 1.f) != 1.f) abort();
+  if (float_max1(1.f, -1.f) != 1.f) abort();
+
+  if (float_min2(0.f, -1.f) != -1.f) abort();
+  if (float_min2(-1.f, 0.f) != -1.f) abort();
+  if (float_min2(0.f, 1.f) != 0.f) abort();
+  if (float_min2(1.f, 0.f) != 0.f) abort();
+  if (float_min2(-1.f, 1.f) != -1.f) abort();
+  if (float_min2(1.f, -1.f) != -1.f) abort();
+
+  if (float_max2(0.f, -1.f) != 0.f) abort();
+  if (float_max2(-1.f, 0.f) != 0.f) abort();
+  if (float_max2(0.f, 1.f) != 1.f) abort();
+  if (float_max2(1.f, 0.f) != 1.f) abort();
+  if (float_max2(-1.f, 1.f) != 1.f) abort();
+  if (float_max2(1.f, -1.f) != 1.f) abort();
+
+  if (double_min1(0., -1.) != -1.) abort();
+  if (double_min1(-1., 0.) != -1.) abort();
+  if (double_min1(0., 1.) != 0.) abort();
+  if (double_min1(1., 0.) != 0.) abort();
+  if (double_min1(-1., 1.) != -1.) abort();
+  if (double_min1(1., -1.) != -1.) abort();
+
+  if (double_max1(0., -1.) != 0.) abort();
+  if (double_max1(-1., 0.) != 0.) abort();
+  if (double_max1(0., 1.) != 1.) abort();
+  if (double_max1(1., 0.) != 1.) abort();
+  if (double_max1(-1., 1.) != 1.) abort();
+  if (double_max1(1., -1.) != 1.) abort();
+
+  if (double_min2(0., -1.) != -1.) abort();
+  if (double_min2(-1., 0.) != -1.) abort();
+  if (double_min2(0., 1.) != 0.) abort();
+  if (double_min2(1., 0.) != 0.) abort();
+  if (double_min2(-1., 1.) != -1.) abort();
+  if (double_min2(1., -1.) != -1.) abort();
+
+  if (double_max2(0., -1.) != 0.) abort();
+  if (double_max2(-1., 0.) != 0.) abort();
+  if (double_max2(0., 1.) != 1.) abort();
+  if (double_max2(1., 0.) != 1.) abort();
+  if (double_max2(-1., 1.) != 1.) abort();
+  if (double_max2(1., -1.) != 1.) abort();
+
+  exit(0);
+}

Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr40022.c
URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr40022.c?rev=374156&view=auto
==============================================================================
--- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr40022.c (added)
+++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr40022.c Wed Oct 9 04:01:46 2019
@@ -0,0 +1,50 @@
+extern void abort (void);
+
+struct A
+{
+  struct A *a;
+};
+
+struct B
+{
+  struct A *b;
+};
+
+__attribute__((noinline))
+struct A *
+foo (struct A *x)
+{
+  asm volatile ("" : : "g" (x) : "memory");
+  return x;
+}
+
+__attribute__((noinline))
+void
+bar (struct B *w, struct A *x, struct A *y, struct A *z)
+{
+  struct A **c;
+  c = &w->b;
+  *c = foo (x);
+  while (*c)
+    c = &(*c)->a;
+  *c = foo (y);
+  while (*c)
+    c = &(*c)->a;
+  *c = foo (z);
+}
+
+struct B d;
+struct A e, f, g;
+
+int
+main (void)
+{
+  f.a = &g;
+  bar (&d, &e, &f, 0);
+  if (d.b == 0
+      || d.b->a == 0
+      || d.b->a->a == 0
+      || d.b->a->a->a != 0)
+    abort ();
+  return 0;
+}

Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr40057.c
URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr40057.c?rev=374156&view=auto
==============================================================================
--- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr40057.c (added)
+++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr40057.c Wed Oct 9 04:01:46 2019
@@ -0,0 +1,37 @@
+/* PR middle-end/40057 */
+
+extern void abort (void);
+
+__attribute__((noinline)) int
+foo (unsigned long long x)
+{
+  unsigned long long y = (x >> 31ULL) & 1ULL;
+  if (y == 0ULL)
+    return 0;
+  return -1;
+}
+
+__attribute__((noinline)) int
+bar (long long x)
+{
+  long long y = (x >> 31LL) & 1LL;
+  if (y == 0LL)
+    return 0;
+  return -1;
+}
+
+int
+main (void)
+{
+  if (sizeof (long long) != 8)
+    return 0;
+  if (foo (0x1682a9aaaULL))
+    abort ();
+  if (!foo (0x1882a9aaaULL))
+    abort ();
+  if (bar (0x1682a9aaaLL))
+    abort ();
+  if (!bar (0x1882a9aaaLL))
+    abort ();
+  return 0;
+}

Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr40386.c
URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr40386.c?rev=374156&view=auto
==============================================================================
--- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr40386.c (added)
+++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr40386.c Wed Oct 9 04:01:46 2019
@@ -0,0 +1,104 @@
+/* { dg-options "-fno-ira-share-spill-slots -Wno-shift-overflow" } */
+
+extern void abort (void);
+extern void exit (int);
+
+#define CHAR_BIT 8
+
+#define ROR(a,b) (((a) >> (b)) | ((a) << ((sizeof (a) * CHAR_BIT) - (b))))
+#define ROL(a,b) (((a) << (b)) | ((a) >> ((sizeof (a) * CHAR_BIT) - (b))))
+
+#define CHAR_VALUE ((char)0xf234)
+#define SHORT_VALUE ((short)0xf234)
+#define INT_VALUE ((int)0xf234)
+#define LONG_VALUE ((long)0xf2345678L)
+#define LL_VALUE ((long long)0xf2345678abcdef0LL)
+
+#define SHIFT1 4
+#define SHIFT2 ((sizeof (long long) * CHAR_BIT) - SHIFT1)
+
+char c = CHAR_VALUE;
+short s = SHORT_VALUE;
+int i = INT_VALUE;
+long l = LONG_VALUE;
+long long ll = LL_VALUE;
+int shift1 = SHIFT1;
+int shift2 = SHIFT2;
+
+int
+main ()
+{
+  if (ROR (c, shift1) != ROR (CHAR_VALUE, SHIFT1))
+    abort ();
+
+  if (ROR (c, SHIFT1) != ROR (CHAR_VALUE, SHIFT1))
+    abort ();
+
+  if (ROR (s, shift1) != ROR (SHORT_VALUE, SHIFT1))
+    abort ();
+
+  if (ROR (s, SHIFT1) != ROR (SHORT_VALUE, SHIFT1))
+    abort ();
+
+  if (ROR (i, shift1) != ROR (INT_VALUE, SHIFT1))
+    abort ();
+
+  if (ROR (i, SHIFT1) != ROR (INT_VALUE, SHIFT1))
+    abort ();
+
+  if (ROR (l, shift1) != ROR (LONG_VALUE, SHIFT1))
+    abort ();
+
+  if (ROR (l, SHIFT1) != ROR (LONG_VALUE, SHIFT1))
+    abort ();
+
+  if (ROR (ll, shift1) != ROR (LL_VALUE, SHIFT1))
+    abort ();
+
+  if (ROR (ll, SHIFT1) != ROR (LL_VALUE, SHIFT1))
+    abort ();
+
+  if (ROR (ll, shift2) != ROR (LL_VALUE, SHIFT2))
+    abort ();
+
+  if (ROR (ll, SHIFT2) != ROR (LL_VALUE, SHIFT2))
+    abort ();
+
+  if (ROL (c, shift1) != ROL (CHAR_VALUE, SHIFT1))
+    abort ();
+
+  if (ROL (c, SHIFT1) != ROL (CHAR_VALUE, SHIFT1))
+    abort ();
+
+  if (ROL (s, shift1) != ROL (SHORT_VALUE, SHIFT1))
+    abort ();
+
+  if (ROL (s, SHIFT1) != ROL (SHORT_VALUE, SHIFT1))
+    abort ();
+
+  if (ROL (i, shift1) != ROL (INT_VALUE, SHIFT1))
+    abort ();
+
+  if (ROL (i, SHIFT1) != ROL (INT_VALUE, SHIFT1))
+    abort ();
+
+  if (ROL (l, shift1) != ROL (LONG_VALUE, SHIFT1))
+    abort ();
+
+  if (ROL (l, SHIFT1) != ROL (LONG_VALUE, SHIFT1))
+    abort ();
+
+  if (ROL (ll, shift1) != ROL (LL_VALUE, SHIFT1))
+    abort ();
+
+  if (ROL (ll, SHIFT1) != ROL (LL_VALUE, SHIFT1))
+    abort ();
+
+  if (ROL (ll, shift2) != ROL (LL_VALUE, SHIFT2))
+    abort ();
+
+  if (ROL (ll, SHIFT2) != ROL (LL_VALUE, SHIFT2))
+    abort ();
+
+  exit (0);
+}

Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr40404.c
URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr40404.c?rev=374156&view=auto
==============================================================================
--- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr40404.c (added)
+++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr40404.c Wed Oct 9 04:01:46 2019
@@ -0,0 +1,18 @@
+extern void abort (void);
+
+#if (__SIZEOF_INT__ <= 2)
+struct S {
+  unsigned long ui17 : 17;
+} s;
+#else
+struct S {
+  unsigned int ui17 : 17;
+} s;
+#endif
+int main()
+{
+  s.ui17 = 0x1ffff;
+  if (s.ui17 >= 0xfffffffeu)
+    abort ();
+  return 0;
+}

Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr40493.c
URL:
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr40493.c?rev=374156&view=auto
==============================================================================
--- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr40493.c (added)
+++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr40493.c Wed Oct 9 04:01:46 2019
@@ -0,0 +1,82 @@
+extern void abort (void);
+
+typedef union i386_operand_type
+{
+  struct
+    {
+      unsigned int reg8:1;
+      unsigned int reg16:1;
+      unsigned int reg32:1;
+      unsigned int reg64:1;
+      unsigned int floatreg:1;
+      unsigned int regmmx:1;
+      unsigned int regxmm:1;
+      unsigned int regymm:1;
+      unsigned int control:1;
+      unsigned int debug:1;
+      unsigned int test:1;
+      unsigned int sreg2:1;
+      unsigned int sreg3:1;
+      unsigned int imm1:1;
+      unsigned int imm8:1;
+      unsigned int imm8s:1;
+      unsigned int imm16:1;
+      unsigned int imm32:1;
+      unsigned int imm32s:1;
+      unsigned int imm64:1;
+      unsigned int disp8:1;
+      unsigned int disp16:1;
+      unsigned int disp32:1;
+      unsigned int disp32s:1;
+      unsigned int disp64:1;
+      unsigned int acc:1;
+      unsigned int floatacc:1;
+      unsigned int baseindex:1;
+      unsigned int inoutportreg:1;
+      unsigned int shiftcount:1;
+      unsigned int jumpabsolute:1;
+      unsigned int esseg:1;
+      unsigned int regmem:1;
+      unsigned int mem:1;
+      unsigned int byte:1;
+      unsigned int word:1;
+      unsigned int dword:1;
+      unsigned int fword:1;
+      unsigned int qword:1;
+      unsigned int tbyte:1;
+      unsigned int xmmword:1;
+      unsigned int ymmword:1;
+      unsigned int unspecified:1;
+      unsigned int anysize:1;
+    } bitfield;
+  unsigned int array[2];
+} i386_operand_type;
+
+unsigned int x00, x01, y00, y01;
+
+int main (int argc, char *argv[])
+{
+  i386_operand_type a,b,c,d;
+
+  a.bitfield.reg16 = 1;
+  a.bitfield.imm16 = 0;
+  a.array[1] = 22;
+
+  b = a;
+  x00 = b.array[0];
+  x01 = b.array[1];
+
+  c = b;
+  y00 = c.array[0];
+  y01 = c.array[1];
+
+  d = c;
+  if (d.bitfield.reg16 != 1)
+    abort();
+  if (d.bitfield.imm16 != 0)
+    abort();
+  if (d.array[1] != 22)
+    abort();
+
+  return 0;
+}

Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr40579.c
URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr40579.c?rev=374156&view=auto
==============================================================================
--- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr40579.c (added)
+++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr40579.c Wed Oct 9 04:01:46 2019
@@ -0,0 +1,27 @@
+extern void abort (void);
+static char * __attribute__((noinline))
+itos(int num)
+{
+  return (char *)0;
+}
+static void __attribute__((noinline))
+foo(int i, const char *x)
+{
+  if (i >= 4)
+    abort ();
+}
+int main()
+{
+  int x = -__INT_MAX__ + 3;
+  int i;
+
+  for (i = 0; i < 4; ++i)
+    {
+      char *p;
+      --x;
+      p = itos(x);
+      foo(i, p);
+    }
+
+  return 0;
+}

Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr40657.c
URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr40657.c?rev=374156&view=auto
==============================================================================
--- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr40657.c (added)
+++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr40657.c Wed Oct 9 04:01:46 2019
@@ -0,0 +1,23 @@
+/* Verify that that Thumb-1 epilogue size optimization does not clobber the
+   return value. */
+
+long long v = 0x123456789abc;
+
+__attribute__((noinline)) void bar (int *x)
+{
+  asm volatile ("" : "=m" (x) ::);
+}
+
+__attribute__((noinline)) long long foo()
+{
+  int x;
+  bar(&x);
+  return v;
+}
+
+int main ()
+{
+  if (foo () != v)
+    abort ();
+  exit (0);
+}

Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr40668.c
URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr40668.c?rev=374156&view=auto
==============================================================================
--- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr40668.c (added)
+++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr40668.c Wed Oct 9 04:01:46 2019
@@ -0,0 +1,40 @@
+#if (__SIZEOF_INT__ == 2)
+#define TESTVALUE 0x1234
+#else
+#define TESTVALUE 0x12345678
+#endif
+static void
+foo (unsigned int x, void *p)
+{
+  __builtin_memcpy (p, &x, sizeof x);
+}
+
+void
+bar (int type, void *number)
+{
+  switch (type)
+    {
+    case 1:
+      foo (TESTVALUE, number);
+      break;
+    case 7:
+      foo (0, number);
+      break;
+    case 8:
+      foo (0, number);
+      break;
+    case 9:
+      foo (0, number);
+      break;
+    }
+}
+
+int
+main (void)
+{
+  unsigned int x;
+  bar (1, &x);
+  if (x != TESTVALUE)
+    __builtin_abort ();
+  return 0;
+}

Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr40747.c
URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr40747.c?rev=374156&view=auto
==============================================================================
--- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr40747.c (added)
+++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr40747.c Wed Oct 9 04:01:46 2019
@@ -0,0 +1,22 @@
+/* PR middle-end/40747 */
+
+extern void abort (void);
+
+int
+foo (int i)
+{
+  return (i < 4 && i >= 0) ? i : 4;
+}
+
+int
+main ()
+{
+  if (foo (-1) != 4) abort ();
+  if (foo (0) != 0) abort ();
+  if (foo (1) != 1) abort ();
+  if (foo (2) != 2) abort ();
+  if (foo (3) != 3) abort ();
+  if (foo (4) != 4) abort ();
+  if (foo (5) != 4) abort ();
+  return 0;
+}

Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr41239.c
URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr41239.c?rev=374156&view=auto
==============================================================================
--- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr41239.c (added)
+++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr41239.c Wed Oct 9 04:01:46 2019
@@ -0,0 +1,67 @@
+/* PR rtl-optimization/41239 */
+
+struct S
+{
+  short nargs;
+  unsigned long arg[2];
+};
+
+extern void abort (void);
+extern void exit (int);
+extern char fn1 (int, const char *, int, const char *, const char *);
+extern void fn2 (int, ...);
+extern int fn3 (int);
+extern int fn4 (const char *fmt, ...) __attribute__ ((format (printf, 1, 2)));
+
+unsigned long
+test (struct S *x)
+{
+  signed int arg1 = x->arg[0];
+  long int arg2 = x->arg[1];
+
+  if (arg2 == 0)
+    (fn1 (20, "foo", 924, __func__, ((void *) 0))
+     ? (fn2 (fn3 (0x2040082), fn4 ("division by zero")))
+     : (void) 0);
+
+  return (long int) arg1 / arg2;
+}
+
+int
+main (void)
+{
+  struct S s = { 2, { 5, 0 } };
+  test (&s);
+  abort ();
+}
+
+__attribute__((noinline)) char
+fn1 (int x, const char *y, int z, const char *w, const char *v)
+{
+  asm volatile ("" : : "r" (w), "r" (v) : "memory");
+  asm volatile ("" : "+r" (x) : "r" (y), "r" (z) : "memory");
+  return x;
+}
+
+__attribute__((noinline)) int
+fn3 (int x)
+{
+  asm volatile ("" : "+r" (x) : : "memory");
+  return x;
+}
+
+__attribute__((noinline)) int
+fn4 (const char *x, ...)
+{
+  asm volatile ("" : "+r" (x) : : "memory");
+  return *x;
+}
+
+__attribute__((noinline)) void
+fn2 (int x, ...)
+{ + asm volatile ("" : "+r" (x) : : "memory"); + if (x) + /* Could be a longjmp or throw too. */ + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr41317.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr41317.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr41317.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr41317.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,27 @@ +extern void abort (void); + +struct A +{ + int i; +}; +struct B +{ + struct A a; + int j; +}; + +static void +foo (struct B *p) +{ + ((struct A *)p)->i = 1; +} + +int main() +{ + struct A a; + a.i = 0; + foo ((struct B *)&a); + if (a.i != 1) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr41395-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr41395-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr41395-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr41395-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,28 @@ +struct VEC_char_base +{ + unsigned num; + unsigned alloc; + short vec[1]; +}; + +short __attribute__((noinline)) +foo (struct VEC_char_base *p, int i) +{ + short *q; + p->vec[i] = 0; + q = &p->vec[8]; + *q = 1; + return p->vec[i]; +} + +extern void abort (void); +extern void *malloc (__SIZE_TYPE__); + +int +main() +{ + struct VEC_char_base *p = malloc (sizeof (struct VEC_char_base) + 256); + if (foo (p, 8) != 1) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr41395-2.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr41395-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr41395-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr41395-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,35 @@ +struct VEC_char_base +{ + unsigned num; + unsigned alloc; + union { + short vec[1]; + struct { + int i; + int j; + int k; + } a; + } u; +}; + +short __attribute__((noinline)) +foo (struct VEC_char_base *p, int i) +{ + short *q; + p->u.vec[i] = 0; + q = &p->u.vec[16]; + *q = 1; + return p->u.vec[i]; +} + +extern void abort (void); +extern void *malloc (__SIZE_TYPE__); + +int +main() +{ + struct VEC_char_base *p = malloc (sizeof (struct VEC_char_base) + 256); + if (foo (p, 16) != 1) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr41463.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr41463.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr41463.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr41463.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,55 @@ +#include + +union tree_node; + +struct tree_common +{ + int a; + long b; + long c; + void *p; + int d; +}; + +struct other_tree +{ + struct tree_common common; + int arr[14]; +}; + +struct tree_vec +{ + struct tree_common common; + int length; + union tree_node *a[1]; +}; + +union tree_node +{ + struct other_tree othr; + struct tree_vec vec; +}; + +union tree_node global; + +union tree_node * __attribute__((noinline)) +foo (union tree_node *p, int i) +{ + union tree_node **q; + p->vec.a[i] = (union tree_node *) 0; + q = &p->vec.a[1]; + *q = &global; 
+ return p->vec.a[i]; +} + +extern void abort (void); +extern void *malloc (__SIZE_TYPE__); + +int +main() +{ + union tree_node *p = malloc (sizeof (union tree_node)); + if (foo (p, 1) != &global) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr41750.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr41750.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr41750.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr41750.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,67 @@ +/* PR 41750 - IPA-SRA used to pass hash->sgot by value rather than by + reference. */ + +struct bfd_link_hash_table +{ + int hash; +}; + +struct foo_link_hash_table +{ + struct bfd_link_hash_table root; + int *dynobj; + int *sgot; +}; + +struct foo_link_info +{ + struct foo_link_hash_table *hash; +}; + +extern void abort (void); + +int __attribute__((noinline)) +foo_create_got_section (int *abfd, struct foo_link_info *info) +{ + info->hash->sgot = abfd; + return 1; +} + +static int * +get_got (int *abfd, struct foo_link_info *info, + struct foo_link_hash_table *hash) +{ + int *got; + int *dynobj; + + got = hash->sgot; + if (!got) + { + dynobj = hash->dynobj; + if (!dynobj) + hash->dynobj = dynobj = abfd; + if (!foo_create_got_section (dynobj, info)) + return 0; + got = hash->sgot; + } + return got; +} + +int * __attribute__((noinline,noclone)) +elf64_ia64_check_relocs (int *abfd, struct foo_link_info *info) +{ + return get_got (abfd, info, info->hash); +} + +struct foo_link_info link_info; +struct foo_link_hash_table hash; +int abfd; + +int +main () +{ + link_info.hash = &hash; + if (elf64_ia64_check_relocs (&abfd, &link_info) != &abfd) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr41917.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr41917.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr41917.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr41917.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,21 @@ +/* PR rtl-optimization/41917 */ + +extern void abort (void); +unsigned int a = 1; + +int +main (void) +{ + unsigned int b, c, d; + + if (sizeof (int) != 4 || (int) 0xc7d24b5e > 0) + return 0; + + c = 0xc7d24b5e; + d = a | -2; + b = (d == 0) ? c : (c % d); + if (b != c) + abort (); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr41919.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr41919.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr41919.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr41919.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,38 @@ +extern void abort (void); + +#define assert(x) if(!(x)) abort() + +struct S1 +{ + signed char f0; +}; + +int g_23 = 0; + +static struct S1 +foo (void) +{ + int *l_100 = &g_23; + int **l_110 = &l_100; + struct S1 l_128 = { 1 }; + assert (l_100 == &g_23); + assert (l_100 == &g_23); + assert (l_100 == &g_23); + assert (l_100 == &g_23); + assert (l_100 == &g_23); + assert (l_100 == &g_23); + assert (l_100 == &g_23); + return l_128; +} + +static signed char bar(signed char si1, signed char si2) +{ + return (si1 <= 0) ? 
si1 : (si2 * 2); +} +int main (void) +{ + struct S1 s = foo(); + if (bar(0x99 ^ (s.f0 && 1), 1) != -104) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr41935.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr41935.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr41935.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr41935.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,25 @@ +/* PR middle-end/41935 */ + +extern void abort (void); + +long int +foo (int n, int i, int j) +{ + typedef int T[n]; + struct S { int a; T b[n]; }; + return __builtin_offsetof (struct S, b[i][j]); +} + +int +main (void) +{ + typedef int T[5]; + struct S { int a; T b[5]; }; + if (foo (5, 2, 3) + != __builtin_offsetof (struct S, b) + (5 * 2 + 3) * sizeof (int)) + abort (); + if (foo (5, 5, 5) + != __builtin_offsetof (struct S, b) + (5 * 5 + 5) * sizeof (int)) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr42006.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr42006.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr42006.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr42006.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,33 @@ +extern void abort (void); + +static unsigned int +my_add(unsigned int si1, unsigned int si2) +{ + return (si1 > (50-si2)) ? si1 : (si1 + si2); +} + +static unsigned int +my_shift(unsigned int left, unsigned int right) +{ + return (right > 100) ? 
left : (left >> right); +} + +static int func_4(unsigned int p_6) +{ + int count = 0; + for (p_6 = 1; p_6 < 3; p_6 = my_add(p_6, 1)) + { + if (count++ > 1) + abort (); + + if (my_shift(p_6, p_6)) + return 0; + } + return 0; +} + +int main(void) +{ + func_4(0); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr42142.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr42142.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr42142.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr42142.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,25 @@ +int __attribute__((noinline,noclone)) +sort(int L) +{ + int end[2] = { 10, 10, }, i=0, R; + while (i<2) + { + R = end[i]; + if (L max) + max = i; +} + +static int CallFunctionRec(int (*fun)(int depth), int depth) { + if (!fun(depth)) { + return 0; + } + if (depth < 10) { + CallFunctionRec(fun, depth + 1); + } + return 1; +} + +static int CallFunction(int (*fun)(int depth)) { + return CallFunctionRec(fun, 1) && !fun(0); +} + +static int callback(int depth) { + storemax (depth); + return depth != 0; +} + +int main() { + CallFunction(callback); + if (max != 10) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr42248.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr42248.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr42248.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr42248.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,27 @@ +typedef struct { + _Complex double a; + _Complex double b; +} Scf10; + +Scf10 g1s; + +void +check (Scf10 x, _Complex 
double y) +{ + if (x.a != y) __builtin_abort (); +} + +void +init (Scf10 *p, _Complex double y) +{ + p->a = y; +} + +int +main () +{ + init (&g1s, (_Complex double)1); + check (g1s, (_Complex double)1); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr42269-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr42269-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr42269-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr42269-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,15 @@ +/* Make sure that language + abi extensions in passing S interoperate. */ + +static long long __attribute__((noinline)) +foo (unsigned short s) +{ + return (short) s; +} + +unsigned short s = 0xFFFF; + +int +main (void) +{ + return foo (s) + 1 != 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr42512.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr42512.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr42512.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr42512.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,13 @@ +extern void abort (void); + +short g_3; + +int main (void) +{ + int l_2; + for (l_2 = -1; l_2 != 0; l_2 = (unsigned char)(l_2 - 1)) + g_3 |= l_2; + if (g_3 != -1) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr42544.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr42544.c?rev=374156&view=auto ============================================================================== --- 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr42544.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr42544.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,14 @@ +/* PR c/42544 */ + +extern void abort (void); + +int +main () +{ + signed short s = -1; + if (sizeof (long long) == sizeof (unsigned int)) + return 0; + if ((unsigned int) s >= 0x100000000ULL) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr42570.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr42570.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr42570.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr42570.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,9 @@ +typedef unsigned char uint8_t; +uint8_t foo[1][0]; +extern void abort (void); +int main() +{ + if (sizeof (foo) != 0) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr42614.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr42614.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr42614.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr42614.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,67 @@ +extern void *malloc(__SIZE_TYPE__); +extern void abort(void); +extern void free(void *); + +typedef struct SEntry +{ + unsigned char num; +} TEntry; + +typedef struct STable +{ + TEntry data[2]; +} TTable; + +TTable *init () +{ + return malloc(sizeof(TTable)); +} + +void +expect_func (int a, unsigned char *b) __attribute__ ((noinline)); + +static inline void +inlined_wrong (TEntry *entry_p, int flag); + +void 
+inlined_wrong (TEntry *entry_p, int flag) +{ + unsigned char index; + entry_p->num = 0; + + if (flag == 0) + abort(); + + for (index = 0; index < 1; index++) + entry_p->num++; + + if (!entry_p->num) + { + abort(); + } +} + +void +expect_func (int a, unsigned char *b) +{ + if (abs ((a == 0))) + abort (); + if (abs ((b == 0))) + abort (); +} + +int +main () +{ + unsigned char index = 0; + TTable *table_p = init(); + TEntry work; + + inlined_wrong (&(table_p->data[1]), 1); + expect_func (1, &index); + inlined_wrong (&work, 1); + + free (table_p); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr42691.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr42691.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr42691.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr42691.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,41 @@ +extern void abort (void); + +union _D_rep +{ + unsigned short rep[4]; + double val; +}; + +int add(double* key, double* table) +{ + unsigned i = 0; + double* deletedEntry = 0; + while (1) { + double* entry = table + i; + + if (*entry == *key) + break; + + union _D_rep _D_inf = {{ 0, 0, 0, 0x7ff0 }}; + if (*entry != _D_inf.val) + abort (); + + union _D_rep _D_inf2 = {{ 0, 0, 0, 0x7ff0 }}; + if (!_D_inf2.val) + deletedEntry = entry; + + i++; + } + if (deletedEntry) + *deletedEntry = 0.0; + return 0; +} + +int main () +{ + union _D_rep infinit = {{ 0, 0, 0, 0x7ff0 }}; + double table[2] = { infinit.val, 23 }; + double key = 23; + int ret = add (&key, table); + return ret; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr42721.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr42721.c?rev=374156&view=auto 
============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr42721.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr42721.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,21 @@ +/* PR c/42721 */ + +extern void abort (void); + +static unsigned long long +foo (unsigned long long x, unsigned long long y) +{ + return x / y; +} + +static int a, b; + +int +main (void) +{ + unsigned long long c = 1; + b ^= c && (foo (a, -1ULL) != 1L); + if (b != 1) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr42833.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr42833.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr42833.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr42833.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,171 @@ +typedef __INT_LEAST8_TYPE__ int8_t; +typedef __UINT_LEAST32_TYPE__ uint32_t; +typedef int ssize_t; +typedef struct { int8_t v1; int8_t v2; int8_t v3; int8_t v4; } neon_s8; + +uint32_t helper_neon_rshl_s8 (uint32_t arg1, uint32_t arg2); + +uint32_t +helper_neon_rshl_s8 (uint32_t arg1, uint32_t arg2) +{ + uint32_t res; + neon_s8 vsrc1; + neon_s8 vsrc2; + neon_s8 vdest; + do + { + union + { + neon_s8 v; + uint32_t i; + } conv_u; + conv_u.i = (arg1); + vsrc1 = conv_u.v; + } + while (0); + do + { + union + { + neon_s8 v; + uint32_t i; + } conv_u; + conv_u.i = (arg2); + vsrc2 = conv_u.v; + } + while (0); + do + { + int8_t tmp; + tmp = (int8_t) vsrc2.v1; + if (tmp >= (ssize_t) sizeof (vsrc1.v1) * 8) + { + vdest.v1 = 0; + } + else if (tmp < -(ssize_t) sizeof (vsrc1.v1) * 8) + { + vdest.v1 = vsrc1.v1 >> (sizeof (vsrc1.v1) * 8 - 1); + } + else if (tmp == -(ssize_t) sizeof (vsrc1.v1) * 8) + { + vdest.v1 = 
vsrc1.v1 >> (tmp - 1); + vdest.v1++; + vdest.v1 >>= 1; + } + else if (tmp < 0) + { + vdest.v1 = (vsrc1.v1 + (1 << (-1 - tmp))) >> -tmp; + } + else + { + vdest.v1 = vsrc1.v1 << tmp; + } + } + while (0); + do + { + int8_t tmp; + tmp = (int8_t) vsrc2.v2; + if (tmp >= (ssize_t) sizeof (vsrc1.v2) * 8) + { + vdest.v2 = 0; + } + else if (tmp < -(ssize_t) sizeof (vsrc1.v2) * 8) + { + vdest.v2 = vsrc1.v2 >> (sizeof (vsrc1.v2) * 8 - 1); + } + else if (tmp == -(ssize_t) sizeof (vsrc1.v2) * 8) + { + vdest.v2 = vsrc1.v2 >> (tmp - 1); + vdest.v2++; + vdest.v2 >>= 1; + } + else if (tmp < 0) + { + vdest.v2 = (vsrc1.v2 + (1 << (-1 - tmp))) >> -tmp; + } + else + { + vdest.v2 = vsrc1.v2 << tmp; + } + } + while (0); + do + { + int8_t tmp; + tmp = (int8_t) vsrc2.v3; + if (tmp >= (ssize_t) sizeof (vsrc1.v3) * 8) + { + vdest.v3 = 0; + } + else if (tmp < -(ssize_t) sizeof (vsrc1.v3) * 8) + { + vdest.v3 = vsrc1.v3 >> (sizeof (vsrc1.v3) * 8 - 1); + } + else if (tmp == -(ssize_t) sizeof (vsrc1.v3) * 8) + { + vdest.v3 = vsrc1.v3 >> (tmp - 1); + vdest.v3++; + vdest.v3 >>= 1; + } + else if (tmp < 0) + { + vdest.v3 = (vsrc1.v3 + (1 << (-1 - tmp))) >> -tmp; + } + else + { + vdest.v3 = vsrc1.v3 << tmp; + } + } + while (0); + do + { + int8_t tmp; + tmp = (int8_t) vsrc2.v4; + if (tmp >= (ssize_t) sizeof (vsrc1.v4) * 8) + { + vdest.v4 = 0; + } + else if (tmp < -(ssize_t) sizeof (vsrc1.v4) * 8) + { + vdest.v4 = vsrc1.v4 >> (sizeof (vsrc1.v4) * 8 - 1); + } + else if (tmp == -(ssize_t) sizeof (vsrc1.v4) * 8) + { + vdest.v4 = vsrc1.v4 >> (tmp - 1); + vdest.v4++; + vdest.v4 >>= 1; + } + else if (tmp < 0) + { + vdest.v4 = (vsrc1.v4 + (1 << (-1 - tmp))) >> -tmp; + } + else + { + vdest.v4 = vsrc1.v4 << tmp; + } + } + while (0);; + do + { + union + { + neon_s8 v; + uint32_t i; + } conv_u; + conv_u.v = (vdest); + res = conv_u.i; + } + while (0); + return res; +} + +extern void abort(void); + +int main() +{ + uint32_t r = helper_neon_rshl_s8 (0x05050505, 0x01010101); + if (r != 0x0a0a0a0a) + abort (); + return 
0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr43008.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr43008.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr43008.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr43008.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,23 @@ +int i; +struct X { + int *p; +}; +struct X * __attribute__((malloc)) +my_alloc (void) +{ + struct X *p = __builtin_malloc (sizeof (struct X)); + p->p = &i; + return p; +} +extern void abort (void); +int main() +{ + struct X *p, *q; + p = my_alloc (); + q = my_alloc (); + *(p->p) = 1; + *(q->p) = 0; + if (*(p->p) != 0) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr43220.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr43220.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr43220.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr43220.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,31 @@ +/* { dg-require-effective-target int32plus } */ +/* { dg-require-effective-target alloca } */ + +void *volatile p; + +int +main (void) +{ + int n = 0; +lab:; + { + int x[n % 1000 + 1]; + x[0] = 1; + x[n % 1000] = 2; + p = x; + n++; + } + + { + int x[n % 1000 + 1]; + x[0] = 1; + x[n % 1000] = 2; + p = x; + n++; + } + + if (n < 1000000) + goto lab; + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr43236.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr43236.c?rev=374156&view=auto 
============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr43236.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr43236.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,32 @@ +/* { dg-options "-ftree-loop-distribution" } */ +extern void abort(void); +extern void *memset(void *s, int c, __SIZE_TYPE__ n); +extern int memcmp(const void *s1, const void *s2, __SIZE_TYPE__ n); +/*extern int printf(const char *format, ...);*/ + +int main() +{ + char A[30], B[30], C[30]; + int i; + + /* prepare arrays */ + memset(A, 1, 30); + memset(B, 1, 30); + + for (i = 20; i-- > 10;) { + A[i] = 0; + B[i] = 0; + } + + /* expected result */ + memset(C, 1, 30); + memset(C + 10, 0, 10); + + /* show result */ +/* for (i = 0; i < 30; i++) + printf("%d %d %d\n", A[i], B[i], C[i]); */ + + /* compare results */ + if (memcmp(A, C, 30) || memcmp(B, C, 30)) abort(); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr43269.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr43269.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr43269.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr43269.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,31 @@ +int g_21; +int g_211; +int g_261; + +static void __attribute__((noinline,noclone)) +func_32 (int b) +{ + if (b) { +lbl_370: + g_21 = 1; + } + + for (g_261 = -1; g_261 > -2; g_261--) { + if (g_211 + 1) { + return; + } else { + g_21 = 1; + goto lbl_370; + } + } +} + +extern void abort (void); + +int main(void) +{ + func_32(0); + if (g_261 != -1) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr43385.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr43385.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr43385.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr43385.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,59 @@ +/* PR c/43385 */ + +extern void abort (void); + +int e; + +__attribute__((noinline)) void +foo (int x, int y) +{ + if (__builtin_expect (x, 0) && y != 0) + e++; +} + +__attribute__((noinline)) int +bar (int x, int y) +{ + if (__builtin_expect (x, 0) && y != 0) + return 1; + else + return 0; +} + +int +main (void) +{ + int z = 0; + asm ("" : "+r" (z)); + foo (z + 2, z + 1); + if (e != 1) + abort (); + foo (z + 2, z); + if (e != 1) + abort (); + foo (z + 1, z + 1); + if (e != 2) + abort (); + foo (z + 1, z); + if (e != 2) + abort (); + foo (z, z + 1); + if (e != 2) + abort (); + foo (z, z); + if (e != 2) + abort (); + if (bar (z + 2, z + 1) != 1) + abort (); + if (bar (z + 2, z) != 0) + abort (); + if (bar (z + 1, z + 1) != 1) + abort (); + if (bar (z + 1, z) != 0) + abort (); + if (bar (z, z + 1) != 0) + abort (); + if (bar (z, z) != 0) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr43438.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr43438.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr43438.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr43438.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,22 @@ +extern void abort (void); + +static unsigned char g_2 = 1; +static int g_9; +static int *l_8 = &g_9; + +static void func_12(int p_13) +{ + int * l_17 = &g_9; + *l_17 &= 0 < p_13; +} + +int main(void) 
+{ + unsigned char l_11 = 254; + *l_8 |= g_2; + l_11 |= *l_8; + func_12(l_11); + if (g_9 != 1) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr43560.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr43560.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr43560.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr43560.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,28 @@ +/* PR tree-optimization/43560 */ + +struct S +{ + int a, b; + char c[10]; +}; + +__attribute__ ((noinline)) void +test (struct S *x) +{ + while (x->b > 1 && x->c[x->b - 1] == '/') + { + x->b--; + x->c[x->b] = '\0'; + } +} + +const struct S s = { 0, 0, "" }; + +int +main () +{ + struct S *p; + asm ("" : "=r" (p) : "0" (&s)); + test (p); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr43629.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr43629.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr43629.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr43629.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,13 @@ +int flag; +extern void abort (void); +int main() +{ + int x; + if (flag) + x = -1; + else + x &= 0xff; + if (x & ~0xff) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr43783.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr43783.c?rev=374156&view=auto ============================================================================== --- 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr43783.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr43783.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,23 @@ +/* { dg-skip-if "small alignment" { pdp11-*-* } } */ + +typedef __attribute__((aligned(16))) +struct { + unsigned long long w[3]; +} UINT192; + +UINT192 bid_Kx192[32]; + +extern void abort (void); + +int main() +{ + int i = 0; + unsigned long x = 0; + for (i = 0; i < 32; ++i) + bid_Kx192[i].w[1] = i == 1; + for (i = 0; i < 32; ++i) + x += bid_Kx192[1].w[1]; + if (x != 32) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr43784.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr43784.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr43784.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr43784.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,33 @@ +struct s { + unsigned char a[256]; +}; +union u { + struct { struct s b; int c; } d; + struct { int c; struct s b; } e; +}; + +static union u v; +static struct s *p = &v.d.b; +static struct s *q = &v.e.b; + +static struct s __attribute__((noinline)) rp(void) +{ + return *p; +} + +static void qp(void) +{ + *q = rp(); +} + +int main() +{ + int i; + for (i = 0; i < 256; i++) + p->a[i] = i; + qp(); + for (i = 0; i < 256; i++) + if (q->a[i] != i) + __builtin_abort(); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr43835.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr43835.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr43835.c (added) +++ 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr43835.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,51 @@ +struct PMC { + unsigned flags; +}; + +typedef struct Pcc_cell +{ + struct PMC *p; + long bla; + long type; +} Pcc_cell; + +extern void abort (); +extern void Parrot_gc_mark_PMC_alive_fun(int * interp, struct PMC *pmc) + __attribute__((noinline)); + +void Parrot_gc_mark_PMC_alive_fun (int * interp, struct PMC *pmc) +{ + abort (); +} + +static void mark_cell(int * interp, Pcc_cell *c) + __attribute__((__nonnull__(1))) + __attribute__((__nonnull__(2))) + __attribute__((noinline)); + +static void +mark_cell(int * interp, Pcc_cell *c) +{ + if (c->type == 4 && c->p + && !(c->p->flags & (1<<18))) + Parrot_gc_mark_PMC_alive_fun(interp, c->p); +} + +void foo(int * interp, Pcc_cell *c); + +void +foo(int * interp, Pcc_cell *c) +{ + mark_cell(interp, c); +} + +int main() +{ + int i; + Pcc_cell c; + c.p = 0; + c.bla = 42; + c.type = 4; + foo (&i, &c); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr43987.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr43987.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr43987.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr43987.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,20 @@ +char B[256 * sizeof(void *)]; +typedef void *FILE; +typedef struct globals { + int c; + FILE *l; +} __attribute__((may_alias)) T; +void add_input_file(FILE *file) +{ + (*(T*)&B).l[0] = file; +} +extern void abort (void); +int main() +{ + FILE x; + (*(T*)&B).l = &x; + add_input_file ((void *)-1); + if ((*(T*)&B).l[0] != (void *)-1) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr44164.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr44164.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr44164.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr44164.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,24 @@ +struct X { + struct Y { + struct YY { + struct Z { + int i; + } c; + } bb; + } b; +} a; +int __attribute__((noinline, noclone)) +foo (struct Z *p) +{ + int i = p->i; + a.b = (struct Y){}; + return p->i + i; +} +extern void abort (void); +int main() +{ + a.b.bb.c.i = 1; + if (foo (&a.b.bb.c) != 1) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr44202-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr44202-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr44202-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr44202-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,30 @@ +extern __attribute__ ((__noreturn__)) void exit(int); +extern __attribute__ ((__noreturn__)) void abort(void); +__attribute__ ((__noinline__)) +int +add512(int a, int *b) +{ + int c = a + 512; + if (c != 0) + *b = a; + return c; +} + +__attribute__ ((__noinline__)) +int +add513(int a, int *b) +{ + int c = a + 513; + if (c == 0) + *b = a; + return c; +} + +int main(void) +{ + int b0 = -1; + int b1 = -1; + if (add512(-512, &b0) != 0 || b0 != -1 || add513(-513, &b1) != 0 || b1 != -513) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr44468.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr44468.c?rev=374156&view=auto 
============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr44468.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr44468.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,60 @@ +#include <stddef.h> + +struct S { + int i; + int j; +}; +struct R { + int k; + struct S a; +}; +struct Q { + float k; + struct S a; +}; +struct Q s; +int __attribute__((noinline,noclone)) +test1 (void *q) +{ + struct S *b = (struct S *)((char *)q + sizeof (int)); + s.a.i = 0; + b->i = 3; + return s.a.i; +} +int __attribute__((noinline,noclone)) +test2 (void *q) +{ + struct S *b = &((struct R *)q)->a; + s.a.i = 0; + b->i = 3; + return s.a.i; +} +int __attribute__((noinline,noclone)) +test3 (void *q) +{ + s.a.i = 0; + ((struct S *)((char *)q + sizeof (int)))->i = 3; + return s.a.i; +} +extern void abort (void); +int +main() +{ + if (sizeof (float) != sizeof (int) + || offsetof (struct R, a) != sizeof (int) + || offsetof (struct Q, a) != sizeof (int)) + return 0; + s.a.i = 1; + s.a.j = 2; + if (test1 ((void *)&s) != 3) + abort (); + s.a.i = 1; + s.a.j = 2; + if (test2 ((void *)&s) != 3) + abort (); + s.a.i = 1; + s.a.j = 2; + if (test3 ((void *)&s) != 3) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr44555.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr44555.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr44555.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr44555.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,16 @@ +struct a { + char b[100]; +}; +int foo(struct a *a) +{ + if (&a->b) + return 1; + return 0; +} +extern void abort (void); +int main() +{ + if (foo((struct a *)0) != 0) + abort (); + return 0; +} Added: 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr44575.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr44575.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr44575.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr44575.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,49 @@ +/* PR target/44575 */ + +#include <stdarg.h> + +int fails = 0; +struct S { float a[3]; }; +struct S a[5]; + +void +check (int z, ...) +{ + struct S arg, *p; + va_list ap; + int j = 0, k = 0; + int i; + va_start (ap, z); + for (i = 2; i < 4; ++i) + { + p = 0; + j++; + k += 2; + switch ((z << 4) | i) + { + case 0x12: + case 0x13: + p = &a[2]; + arg = va_arg (ap, struct S); + break; + default: + ++fails; + break; + } + if (p && p->a[2] != arg.a[2]) + ++fails; + if (fails) + break; + } + va_end (ap); +} + +int +main () +{ + a[2].a[2] = -49026; + check (1, a[2], a[2]); + if (fails) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr44683.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr44683.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr44683.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr44683.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,17 @@ +int __attribute__((noinline,noclone)) +copysign_bug (double x) +{ + if (x != 0.0 && (x * 0.5 == x)) + return 1; + if (__builtin_copysign(1.0, x) < 0.0) + return 2; + else + return 3; +} +int main(void) +{ + double x = -0.0; + if (copysign_bug (x) != 2) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr44828.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr44828.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr44828.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr44828.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,18 @@ +extern void abort (void); + +static signed char +foo (signed char si1, signed char si2) +{ + return si1 * si2; +} + +int a = 0x105F61CA; + +int +main (void) +{ + int b = 0x0332F5C8; + if (foo (b, a) > 0) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr44852.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr44852.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr44852.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr44852.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,22 @@ +__attribute__ ((__noinline__)) +char *sf(char *s, char *s0) +{ + asm (""); + while (*--s == '9') + if (s == s0) + { + *s = '0'; + break; + } + ++*s++; + return s; +} + +int main() +{ + char s[] = "999999"; + char *x = sf (s+2, s); + if (x != s+1 || __builtin_strcmp (s, "199999") != 0) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr44858.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr44858.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr44858.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr44858.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,28 @@ +/* PR 
rtl-optimization/44858 */ + +extern void abort (void); +int a = 3; +int b = 1; + +__attribute__((noinline)) long long +foo (int x, int y) +{ + return x / y; +} + +__attribute__((noinline)) int +bar (void) +{ + int c = 2; + c &= foo (1, b) > b; + b = (a != 0) | c; + return c; +} + +int +main (void) +{ + if (bar () != 0 || b != 1) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr44942.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr44942.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr44942.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr44942.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,70 @@ +/* PR target/44942 */ + +#include <stdarg.h> + +void +test1 (int a, int b, int c, int d, int e, int f, int g, long double h, ...) +{ + int i; + va_list ap; + + va_start (ap, h); + i = va_arg (ap, int); + if (i != 1234) + __builtin_abort (); + va_end (ap); +} + +void +test2 (int a, int b, int c, int d, int e, int f, int g, long double h, int i, + long double j, int k, long double l, int m, long double n, ...) +{ + int o; + va_list ap; + + va_start (ap, n); + o = va_arg (ap, int); + if (o != 1234) + __builtin_abort (); + va_end (ap); +} + +void +test3 (double a, double b, double c, double d, double e, double f, + double g, long double h, ...) +{ + double i; + va_list ap; + + va_start (ap, h); + i = va_arg (ap, double); + if (i != 1234.0) + __builtin_abort (); + va_end (ap); +} + +void +test4 (double a, double b, double c, double d, double e, double f, double g, + long double h, double i, long double j, double k, long double l, + double m, long double n, ...) 
+{ + double o; + va_list ap; + + va_start (ap, n); + o = va_arg (ap, double); + if (o != 1234.0) + __builtin_abort (); + va_end (ap); +} + +int +main () +{ + test1 (0, 0, 0, 0, 0, 0, 0, 0.0L, 1234); + test2 (0, 0, 0, 0, 0, 0, 0, 0.0L, 0, 0.0L, 0, 0.0L, 0, 0.0L, 1234); + test3 (0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0L, 1234.0); + test4 (0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0L, 0.0, 0.0L, + 0.0, 0.0L, 0.0, 0.0L, 1234.0); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr45034.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr45034.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr45034.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr45034.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,44 @@ +extern void abort (void); +static void fixnum_neg(signed char x, signed char *py, int *pv) +{ + unsigned char ux, uy; + + ux = (unsigned char)x; + uy = -ux; + *py = (uy <= 127) ? (signed char)uy : (-(signed char)(255 - uy) - 1); + *pv = (x == -128) ? 
1 : 0; +} + +void __attribute__((noinline)) foo(int x, int y, int v) +{ + if (y < -128 || y > 127) + abort(); +} + +int test_neg(void) +{ + signed char x, y; + int v, err; + + err = 0; + x = -128; + for (;;) { + fixnum_neg(x, &y, &v); + foo((int)x, (int)y, v); + if ((v && x != -128) || (!v && x == -128)) + ++err; + if (x == 127) + break; + ++x; + } + return err; +} + +int main(void) +{ + if (sizeof (char) != 1) + return 0; + if (test_neg() != 0) + abort(); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr45070.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr45070.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr45070.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr45070.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,52 @@ +/* PR45070 */ +extern void abort(void); + +struct packed_ushort { + unsigned short ucs; +} __attribute__((packed)); + +struct source { + int pos, length; + int flag; +}; + +static void __attribute__((noinline)) fetch(struct source *p) +{ + p->length = 128; +} + +static struct packed_ushort __attribute__((noinline)) next(struct source *p) +{ + struct packed_ushort rv; + + if (p->pos >= p->length) { + if (p->flag) { + p->flag = 0; + fetch(p); + return next(p); + } + p->flag = 1; + rv.ucs = 0xffff; + return rv; + } + rv.ucs = 0; + return rv; +} + +int main(void) +{ + struct source s; + int i; + + s.pos = 0; + s.length = 0; + s.flag = 0; + + for (i = 0; i < 16; i++) { + struct packed_ushort rv = next(&s); + if ((i == 0 && rv.ucs != 0xffff) + || (i > 0 && rv.ucs != 0)) + abort(); + } + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr45262.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr45262.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr45262.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr45262.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,35 @@ +/* PR middle-end/45262 */ + +/* { dg-require-effective-target int32plus } */ + +extern void abort (void); + +int +foo (unsigned int x) +{ + return ((int) x < 0) || ((int) (-x) < 0); +} + +int +bar (unsigned int x) +{ + return x >> 31 || (-x) >> 31; +} + +int +main (void) +{ + if (foo (1) != 1) + abort (); + if (foo (0) != 0) + abort (); + if (foo (-1) != 1) + abort (); + if (bar (1) != 1) + abort (); + if (bar (0) != 0) + abort (); + if (bar (-1) != 1) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr45695.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr45695.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr45695.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr45695.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,32 @@ +/* PR rtl-optimization/45695 */ + +extern void abort (void); + +__attribute__((noinline)) void +g (int x) +{ + asm volatile ("" : "+r" (x)); +} + +__attribute__((noinline)) int +f (int a, int b, int d) +{ + int r = -1; + b += d; + if (d == a) + r = b - d; + g (b); + return r; +} + +int +main (void) +{ + int l; + asm ("" : "=r" (l) : "0" (0)); + if (f (l + 0, l + 1, l + 4) != -1) + abort (); + if (f (l + 4, l + 1, l + 4) != 1) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr46019.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr46019.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr46019.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr46019.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,14 @@ +/* PR middle-end/46019 */ + +extern void abort (void); + +int +main (void) +{ + unsigned long long l = 0x40000000000ULL; + int n; + for (n = 0; n < 8; n++) + if (l / (0x200000000ULL << n) != (0x200 >> n)) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr46309.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr46309.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr46309.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr46309.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,31 @@ +/* PR tree-optimization/46309 */ + +extern void abort (void); + +unsigned int *q; + +__attribute__((noinline, noclone)) void +bar (unsigned int *p) +{ + if (*p != 2 && *p != 3) + (!(!(*q & 263) || *p != 1)) ? 
abort () : 0; +} + +int +main () +{ + unsigned int x, y; + asm volatile ("" : : : "memory"); + x = 2; + bar (&x); + x = 3; + bar (&x); + y = 1; + x = 0; + q = &y; + bar (&x); + y = 0; + x = 1; + bar (&x); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr46316.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr46316.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr46316.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr46316.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,17 @@ +extern void abort (void); + +long long __attribute__((noinline,noclone)) +foo (long long t) +{ + while (t > -4) + t -= 2; + + return t; +} + +int main(void) +{ + if (foo (0) != -4) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr46909-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr46909-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr46909-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr46909-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,22 @@ +/* PR tree-optimization/46909 */ + +extern void abort (); + +int +__attribute__ ((__noinline__)) +foo (unsigned int x) +{ + if (! 
(x == 4 || x == 6) || (x == 2 || x == 6)) + return 1; + return -1; +} + +int +main () +{ + int i; + for (i = -10; i < 10; i++) + if (foo (i) != 1 - 2 * (i == 4)) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr46909-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr46909-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr46909-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr46909-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,22 @@ +/* PR tree-optimization/46909 */ + +extern void abort (void); + +int +__attribute__((noinline)) +foo (int x) +{ + if ((x != 0 && x != 13) || x == 5 || x == 20) + return 1; + return -1; +} + +int +main (void) +{ + int i; + for (i = -10; i < 30; i++) + if (foo (i) != 1 - 2 * (i == 0) - 2 * (i == 13)) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr47148.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr47148.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr47148.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr47148.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,32 @@ +/* PR tree-optimization/47148 */ + +static inline unsigned +bar (unsigned x, unsigned y) +{ + if (y >= 32) + return x; + else + return x >> y; +} + +static unsigned a = 1, b = 1; + +static inline void +foo (unsigned char x, unsigned y) +{ + if (!y) + return; + unsigned c = (0x7000U / (x - 2)) ^ a; + unsigned d = bar (a, a); + b &= ((a - d) && (a - 1)) + c; +} + +int +main (void) +{ + foo (1, 1); + foo (-1, 1); + if (b && ((unsigned char) -1) == 255) + 
__builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr47155.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr47155.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr47155.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr47155.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,14 @@ +/* PR tree-optimization/47155 */ + +unsigned int a; +static signed char b = -127; +int c = 1; + +int +main (void) +{ + a = b <= (unsigned char) (-6 * c); + if (!a) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr47237.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr47237.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr47237.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr47237.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,24 @@ +/* { dg-xfail-if "can cause stack underflow" { nios2-*-* } } */ +/* { dg-require-effective-target untyped_assembly } */ +#define INTEGER_ARG 5 + +extern void abort(void); + +static void foo(int arg) +{ + if (arg != INTEGER_ARG) + abort(); +} + +static void bar(int arg) +{ + foo(arg); + __builtin_apply(foo, __builtin_apply_args(), 16); +} + +int main(void) +{ + bar(INTEGER_ARG); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr47299.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr47299.c?rev=374156&view=auto ============================================================================== --- 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr47299.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr47299.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,17 @@ +/* PR rtl-optimization/47299 */ + +extern void abort (void); + +__attribute__ ((noinline, noclone)) unsigned short +foo (unsigned char x) +{ + return x * 255; +} + +int +main () +{ + if (foo (0x40) != 0x3fc0) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr47337.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr47337.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr47337.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr47337.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,86 @@ +/* PR rtl-optimization/47337 */ + +static unsigned int a[256], b = 0; +static char c = 0; +static int d = 0, *f = &d; +static long long e = 0; + +static short +foo (long long x, long long y) +{ + return x / y; +} + +static char +bar (char x, char y) +{ + return x - y; +} + +static int +baz (int x, int y) +{ + *f = (y != (short) (y * 3)); + for (c = 0; c < 2; c++) + { + lab: + if (d) + { + if (e) + e = 1; + else + return x; + } + else + { + d = 1; + goto lab; + } + f = &d; + } + return x; +} + +static void +fnx (unsigned long long x, int y) +{ + if (!y) + { + b = a[b & 1]; + b = a[b & 1]; + b = a[(b ^ (x & 1)) & 1]; + b = a[(b ^ (x & 1)) & 1]; + } +} + +char *volatile w = "2"; + +int +main () +{ + int h = 0; + unsigned int k = 0; + int l[8]; + int i, j; + + if (__builtin_strcmp (w, "1") == 0) + h = 1; + + for (i = 0; i < 256; i++) + { + for (j = 8; j > 0; j--) + k = 1; + a[i] = k; + } + for (i = 0; i < 8; i++) + l[i] = 0; + + d = bar (c, c); + d = baz (c, 1 | foo (l[0], 10)); + fnx (d, h); + fnx (e, h); + + if (d != 0) + __builtin_abort 
(); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr47538.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr47538.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr47538.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr47538.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,73 @@ +/* PR tree-optimization/47538 */ + +struct S +{ + double a, b, *c; + unsigned long d; +}; + +__attribute__((noinline, noclone)) void +foo (struct S *x, const struct S *y) +{ + const unsigned long n = y->d + 1; + const double m = 0.25 * (y->b - y->a); + x->a = y->a; + x->b = y->b; + if (n == 1) + { + x->c[0] = 0.; + } + else if (n == 2) + { + x->c[1] = m * y->c[0]; + x->c[0] = 2.0 * x->c[1]; + } + else + { + double o = 0.0, p = 1.0; + unsigned long i; + + for (i = 1; i <= n - 2; i++) + { + x->c[i] = m * (y->c[i - 1] - y->c[i + 1]) / (double) i; + o += p * x->c[i]; + p = -p; + } + x->c[n - 1] = m * y->c[n - 2] / (n - 1.0); + o += p * x->c[n - 1]; + x->c[0] = 2.0 * o; + } +} + +int +main (void) +{ + struct S x, y; + double c[4] = { 10, 20, 30, 40 }, d[4], e[4] = { 118, 118, 118, 118 }; + + y.a = 10; + y.b = 6; + y.c = c; + x.c = d; + y.d = 3; + __builtin_memcpy (d, e, sizeof d); + foo (&x, &y); + if (d[0] != 0 || d[1] != 20 || d[2] != 10 || d[3] != -10) + __builtin_abort (); + y.d = 2; + __builtin_memcpy (d, e, sizeof d); + foo (&x, &y); + if (d[0] != 60 || d[1] != 20 || d[2] != -10 || d[3] != 118) + __builtin_abort (); + y.d = 1; + __builtin_memcpy (d, e, sizeof d); + foo (&x, &y); + if (d[0] != -20 || d[1] != -10 || d[2] != 118 || d[3] != 118) + __builtin_abort (); + y.d = 0; + __builtin_memcpy (d, e, sizeof d); + foo (&x, &y); + if (d[0] != 0 || d[1] != 118 || d[2] != 118 || d[3] != 118) + __builtin_abort (); + return 0; +} Added: 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr47925.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr47925.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr47925.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr47925.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,24 @@ +struct s { volatile struct s *next; }; + +void __attribute__((noinline)) +bar (int ignored, int n) +{ + asm volatile (""); +} + +int __attribute__((noinline)) +foo (volatile struct s *ptr, int n) +{ + int i; + + bar (0, n); + for (i = 0; i < n; i++) + ptr = ptr->next; +} + +int main (void) +{ + volatile struct s rec = { &rec }; + foo (&rec, 10); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr48197.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr48197.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr48197.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr48197.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,25 @@ +/* PR c/48197 */ + +extern void abort (void); +static int y = 0x8000; + +int +main () +{ + unsigned int x = (short)y; + if (sizeof (0LL) == sizeof (0U)) + return 0; + if (0LL > (0U ^ (short)-0x8000)) + abort (); + if (0LL > (0U ^ x)) + abort (); + if (0LL > (0U ^ (short)y)) + abort (); + if ((0U ^ (short)-0x8000) < 0LL) + abort (); + if ((0U ^ x) < 0LL) + abort (); + if ((0U ^ (short)y) < 0LL) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr48571-1.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr48571-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr48571-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr48571-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,30 @@ +#define S (sizeof (int)) + +unsigned int c[624]; +void __attribute__((noinline)) +bar (void) +{ + unsigned int i; + /* Obfuscated c[i] = c[i-1] * 2. */ + for (i = 1; i < 624; ++i) + *(unsigned int *)((void *)c + (__SIZE_TYPE__)i * S) + = 2 * *(unsigned int *)((void *)c + ((__SIZE_TYPE__)i + + ((__SIZE_TYPE__)-S)/S) * S); +} +extern void abort (void); +int +main() +{ + unsigned int i, j; + for (i = 0; i < 624; ++i) + c[i] = 1; + bar(); + j = 1; + for (i = 0; i < 624; ++i) + { + if (c[i] != j) + abort (); + j = j * 2; + } + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr48717.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr48717.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr48717.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr48717.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,26 @@ +/* PR tree-optimization/48717 */ + +extern void abort (void); + +int v = 1, w; + +unsigned short +foo (unsigned short x, unsigned short y) +{ + return x + y; +} + +void +bar (void) +{ + v = foo (~w, w); +} + +int +main () +{ + bar (); + if (v != (unsigned short) -1) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr48809.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr48809.c?rev=374156&view=auto 
============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr48809.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr48809.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,60 @@ +/* PR tree-optimization/48809 */ + +extern void abort (void); + +int +foo (signed char x) +{ + int y = 0; + switch (x) + { + case 0: y = 1; break; + case 1: y = 7; break; + case 2: y = 2; break; + case 3: y = 19; break; + case 4: y = 5; break; + case 5: y = 17; break; + case 6: y = 31; break; + case 7: y = 8; break; + case 8: y = 28; break; + case 9: y = 16; break; + case 10: y = 31; break; + case 11: y = 12; break; + case 12: y = 15; break; + case 13: y = 111; break; + case 14: y = 17; break; + case 15: y = 10; break; + case 16: y = 31; break; + case 17: y = 7; break; + case 18: y = 2; break; + case 19: y = 19; break; + case 20: y = 5; break; + case 21: y = 107; break; + case 22: y = 31; break; + case 23: y = 8; break; + case 24: y = 28; break; + case 25: y = 106; break; + case 26: y = 31; break; + case 27: y = 102; break; + case 28: y = 105; break; + case 29: y = 111; break; + case 30: y = 17; break; + case 31: y = 10; break; + case 32: y = 31; break; + case 98: y = 18; break; + case -62: y = 19; break; + } + return y; +} + +int +main () +{ + if (foo (98) != 18 || foo (97) != 0 || foo (99) != 0) + abort (); + if (foo (-62) != 19 || foo (-63) != 0 || foo (-61) != 0) + abort (); + if (foo (28) != 105 || foo (27) != 102 || foo (29) != 111) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr48814-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr48814-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr48814-1.c (added) +++ 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr48814-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,18 @@ +extern void abort (void); + +int arr[] = {1,2,3,4}; +int count = 0; + +int __attribute__((noinline)) +incr (void) +{ + return ++count; +} + +int main() +{ + arr[count++] = incr (); + if (count != 2 || arr[count] != 3) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr48814-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr48814-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr48814-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr48814-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,18 @@ +extern void abort (void); + +int arr[] = {1,2,3,4}; +int count = 0; + +int +incr (void) +{ + return ++count; +} + +int main() +{ + arr[count++] = incr (); + if (count != 2 || arr[count] != 3) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr48973-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr48973-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr48973-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr48973-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,20 @@ +/* PR middle-end/48973 */ + +extern void abort (void); +struct S { int f : 1; } s; +int v = -1; + +void +foo (unsigned int x) +{ + if (x != -1U) + abort (); +} + +int +main () +{ + s.f = (v & 1) > 0; + foo (s.f); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr48973-2.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr48973-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr48973-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr48973-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,14 @@ +/* PR middle-end/48973 */ + +extern void abort (void); +struct S { int f : 1; } s; +int v = -1; + +int +main () +{ + s.f = v < 0; + if ((unsigned int) s.f != -1U) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr49039.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr49039.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr49039.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr49039.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,26 @@ +/* PR tree-optimization/49039 */ +extern void abort (void); +int cnt; + +__attribute__((noinline, noclone)) void +foo (unsigned int x, unsigned int y) +{ + unsigned int minv, maxv; + if (x == 1 || y == -2U) + return; + minv = x < y ? x : y; + maxv = x > y ? 
x : y; + if (minv == 1) + ++cnt; + if (maxv == -2U) + ++cnt; +} + +int +main () +{ + foo (-2U, 1); + if (cnt != 2) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr49073.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr49073.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr49073.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr49073.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,26 @@ +/* PR tree-optimization/49073 */ + +extern void abort (void); +int a[] = { 1, 2, 3, 4, 5, 6, 7 }, c; + +int +main () +{ + int d = 1, i = 1; + _Bool f = 0; + do + { + d = a[i]; + if (f && d == 4) + { + ++c; + break; + } + i++; + f = (d == 3); + } + while (d < 7); + if (c != 1) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr49123.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr49123.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr49123.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr49123.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,14 @@ +/* PR lto/49123 */ + +extern void abort (void); +static struct S { int f : 1; } s; +static int v = -1; + +int +main () +{ + s.f = v < 0; + if ((unsigned int) s.f != -1U) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr49161.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr49161.c?rev=374156&view=auto ============================================================================== --- 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr49161.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr49161.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,46 @@ +/* PR tree-optimization/49161 */ + +extern void abort (void); + +int c; + +__attribute__((noinline, noclone)) void +bar (int x) +{ + if (x != c++) + abort (); +} + +__attribute__((noinline, noclone)) void +foo (int x) +{ + switch (x) + { + case 3: goto l1; + case 4: goto l2; + case 6: goto l3; + default: return; + } +l1: + goto l4; +l2: + goto l4; +l3: + bar (-1); +l4: + bar (0); + if (x != 4) + bar (1); + if (x != 3) + bar (-1); + bar (2); +} + +int +main () +{ + foo (3); + if (c != 3) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr49186.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr49186.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr49186.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr49186.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,15 @@ +/* PR target/49186 */ +extern void abort (void); + +int +main () +{ + int x; + unsigned long long uv = 0x1000000001ULL; + + x = (uv < 0x80) ? 1 : ((uv < 0x800) ? 
2 : 3); + if (x != 3) + abort (); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr49218.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr49218.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr49218.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr49218.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,20 @@ +#ifdef __SIZEOF_INT128__ +typedef __int128 L; +#else +typedef long long L; +#endif +float f; + +int +main () +{ + L i = f; + if (i <= 10) + do + { + ++i; + asm (""); + } + while (i != 11); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr49279.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr49279.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr49279.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr49279.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,35 @@ +/* PR tree-optimization/49279 */ +extern void abort (void); + +struct S { int a; int *__restrict p; }; + +__attribute__((noinline, noclone)) +struct S *bar (struct S *p) +{ + struct S *r; + asm volatile ("" : "=r" (r) : "0" (p) : "memory"); + return r; +} + +__attribute__((noinline, noclone)) +int +foo (int *p, int *q) +{ + struct S s, *t; + s.a = 1; + s.p = p; + t = bar (&s); + t->p = q; + s.p[0] = 0; + t->p[0] = 1; + return s.p[0]; +} + +int +main () +{ + int a, b; + if (foo (&a, &b) != 1) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr49281.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr49281.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr49281.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr49281.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,25 @@ +/* PR target/49281 */ + +extern void abort (void); + +__attribute__((noinline, noclone)) int +foo (int x) +{ + return (x << 2) | 4; +} + +__attribute__((noinline, noclone)) int +bar (int x) +{ + return (x << 2) | 3; +} + +int +main () +{ + if (foo (43) != 172 || foo (1) != 4 || foo (2) != 12) + abort (); + if (bar (43) != 175 || bar (1) != 7 || bar (2) != 11) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr49390.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr49390.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr49390.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr49390.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,88 @@ +/* PR rtl-optimization/49390 */ + +struct S { unsigned int s1; unsigned int s2; }; +struct T { unsigned int t1; struct S t2; }; +struct U { unsigned short u1; unsigned short u2; }; +struct V { struct U v1; struct T v2; }; +struct S a; +char *b; +union { char b[64]; struct V v; } u; +volatile int v; +extern void abort (void); + +__attribute__((noinline, noclone)) void +foo (int x, void *y, unsigned int z, unsigned int w) +{ + if (x != 4 || y != (void *) &u.v.v2) + abort (); + v = z + w; + v = 16384; +} + +__attribute__((noinline, noclone)) void +bar (struct S x) +{ + v = x.s1; + v = x.s2; +} + +__attribute__((noinline, noclone)) int +baz (struct S *x) +{ + v = x->s1; + v = 
x->s2; + v = 0; + return v + 1; +} + +__attribute__((noinline, noclone)) void +test (struct S *c) +{ + struct T *d; + struct S e = a; + unsigned int f, g; + if (c == 0) + c = &e; + else + { + if (c->s2 % 8192 <= 15 || (8192 - c->s2 % 8192) <= 31) + foo (1, 0, c->s1, c->s2); + } + if (!baz (c)) + return; + g = (((struct U *) b)->u2 & 2) ? 32 : __builtin_offsetof (struct V, v2); + f = c->s2 % 8192; + if (f == 0) + { + e.s2 += g; + f = g; + } + else if (f < g) + { + foo (2, 0, c->s1, c->s2); + return; + } + if ((((struct U *) b)->u2 & 1) && f == g) + { + bar (*c); + foo (3, 0, c->s1, c->s2); + return; + } + d = (struct T *) (b + c->s2 % 8192); + if (d->t2.s1 >= c->s1 && (d->t2.s1 != c->s1 || d->t2.s2 >= c->s2)) + foo (4, d, c->s1, c->s2); + return; +} + +int +main () +{ + struct S *c = 0; + asm ("" : "+r" (c) : "r" (&a)); + u.v.v2.t2.s1 = 8192; + b = u.b; + test (c); + if (v != 16384) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr49419.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr49419.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr49419.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr49419.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,38 @@ +/* PR tree-optimization/49419 */ + +extern void abort (void); + +struct S { int w, x, y; } *t; + +int +foo (int n, int f, int *s, int m) +{ + int x, i, a; + if (n == -1) + return 0; + for (x = n, i = 0; t[x].w == f && i < m; i++) + x = t[x].x; + if (i == m) + abort (); + a = i + 1; + for (x = n; i > 0; i--) + { + s[i] = t[x].y; + x = t[x].x; + } + s[0] = x; + return a; +} + +int +main (void) +{ + int s[3], i; + struct S buf[3] = { { 1, 1, 2 }, { 0, 0, 0 }, { 0, 0, 0 } }; + t = buf; + if (foo (0, 1, s, 3) != 2) + abort (); + if (s[0] != 1 || s[1] != 2) + abort (); + 
return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr49644.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr49644.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr49644.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr49644.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,16 @@ +/* PR c/49644 */ + +extern void abort (void); + +int +main () +{ + _Complex double a[12], *c = a, s = 3.0 + 1.0i; + double b[12] = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 }, *d = b; + int i; + for (i = 0; i < 6; i++) + *c++ = *d++ * s; + if (c != a + 6 || d != b + 6) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr49712.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr49712.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr49712.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr49712.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,28 @@ +/* PR tree-optimization/49712 */ + +int a[2], b, c, d, e; + +void +foo (int x, int y) +{ +} + +int +bar (void) +{ + int i; + for (; d <= 0; d = 1) + for (i = 0; i < 4; i++) + for (e = 0; e; e = 1) + ; + return 0; +} + +int +main () +{ + for (b = 0; b < 2; b++) + while (c) + foo (a[b] = 0, bar ()); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr49768.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr49768.c?rev=374156&view=auto ============================================================================== --- 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr49768.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr49768.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,12 @@ +/* PR tree-optimization/49768 */ + +extern void abort (void); + +int +main () +{ + static struct { unsigned int : 1; unsigned int s : 1; } s = { .s = 1 }; + if (s.s != 1) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr49886.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr49886.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr49886.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr49886.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,99 @@ +struct PMC { + unsigned flags; +}; + +typedef struct Pcc_cell +{ + struct PMC *p; + long bla; + long type; +} Pcc_cell; + +int gi; +int cond; + +extern void abort (); +extern void never_ever(int interp, struct PMC *pmc) + __attribute__((noinline,noclone)); + +void never_ever (int interp, struct PMC *pmc) +{ + abort (); +} + +static void mark_cell(int * interp, Pcc_cell *c) + __attribute__((__nonnull__(1))); + +static void +mark_cell(int * interp, Pcc_cell *c) +{ + if (!cond) + return; + + if (c && c->type == 4 && c->p + && !(c->p->flags & (1<<18))) + never_ever(gi + 1, c->p); + if (c && c->type == 4 && c->p + && !(c->p->flags & (1<<17))) + never_ever(gi + 2, c->p); + if (c && c->type == 4 && c->p + && !(c->p->flags & (1<<16))) + never_ever(gi + 3, c->p); + if (c && c->type == 4 && c->p + && !(c->p->flags & (1<<15))) + never_ever(gi + 4, c->p); + if (c && c->type == 4 && c->p + && !(c->p->flags & (1<<14))) + never_ever(gi + 5, c->p); + if (c && c->type == 4 && c->p + && !(c->p->flags & (1<<13))) + never_ever(gi + 6, c->p); + if (c && c->type == 4 && c->p + && !(c->p->flags & 
(1<<12))) + never_ever(gi + 7, c->p); + if (c && c->type == 4 && c->p + && !(c->p->flags & (1<<11))) + never_ever(gi + 8, c->p); + if (c && c->type == 4 && c->p + && !(c->p->flags & (1<<10))) + never_ever(gi + 9, c->p); +} + +static void +foo(int * interp, Pcc_cell *c) +{ + mark_cell(interp, c); +} + +static struct Pcc_cell * +__attribute__((noinline,noclone)) +getnull(void) +{ + return (struct Pcc_cell *) 0; +} + + +int main() +{ + int i; + + cond = 1; + for (i = 0; i < 100; i++) + foo (&gi, getnull ()); + return 0; +} + + +void +bar_1 (int * interp, Pcc_cell *c) +{ + c->bla += 1; + mark_cell(interp, c); +} + +void +bar_2 (int * interp, Pcc_cell *c) +{ + c->bla += 2; + mark_cell(interp, c); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr50865.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr50865.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr50865.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr50865.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,27 @@ +/* PR middle-end/50865 */ + +#define INT64_MIN (-__LONG_LONG_MAX__ - 1) + +int +main () +{ + volatile long long l1 = 1; + volatile long long l2 = -1; + volatile long long l3 = -1; + + if ((INT64_MIN % 1LL) != 0) + __builtin_abort (); + if ((INT64_MIN % l1) != 0) + __builtin_abort (); + if (l2 == -1) + { + if ((INT64_MIN % 1LL) != 0) + __builtin_abort (); + } + else if ((INT64_MIN % -l2) != 0) + __builtin_abort (); + if ((INT64_MIN % -l3) != 0) + __builtin_abort (); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr51023.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr51023.c?rev=374156&view=auto ============================================================================== --- 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr51023.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr51023.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,18 @@ +/* PR rtl-optimization/51023 */ + +extern void abort (void); + +short int +foo (long int x) +{ + return x; +} + +int +main () +{ + long int a = 0x4272AL; + if (foo (a) == a) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr51323.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr51323.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr51323.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr51323.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,35 @@ +/* PR middle-end/51323 */ + +extern void abort (void); +struct S { int a, b, c; }; +int v; + +__attribute__((noinline, noclone)) void +foo (int x, int y, int z) +{ + if (x != v || y != 0 || z != 9) + abort (); +} + +static inline int +baz (const struct S *p) +{ + return p->b; +} + +__attribute__((noinline, noclone)) void +bar (int x, struct S y) +{ + foo (baz (&y), 0, x); +} + +int +main () +{ + struct S s; + v = 3; s.a = v - 1; s.b = v; s.c = v + 1; + bar (9, s); + v = 17; s.a = v - 1; s.b = v; s.c = v + 1; + bar (9, s); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr51447.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr51447.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr51447.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr51447.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,29 @@ +/* PR rtl-optimization/51447 */ +/* { 
dg-require-effective-target label_values } */ +/* { dg-require-effective-target indirect_jumps } */ + +extern void abort (void); + +#ifdef __x86_64__ +register void *ptr asm ("rbx"); +#else +void *ptr; +#endif + +int +main (void) +{ + __label__ nonlocal_lab; + __attribute__((noinline, noclone)) void + bar (void *func) + { + ptr = func; + goto nonlocal_lab; + } + bar (&&nonlocal_lab); + return 1; +nonlocal_lab: + if (ptr != &&nonlocal_lab) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr51466.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr51466.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr51466.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr51466.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,43 @@ +/* PR tree-optimization/51466 */ + +extern void abort (void); + +__attribute__((noinline, noclone)) int +foo (int i) +{ + volatile int v[4]; + int *p; + v[i] = 6; + p = (int *) &v[i]; + return *p; +} + +__attribute__((noinline, noclone)) int +bar (int i) +{ + volatile int v[4]; + int *p; + v[i] = 6; + p = (int *) &v[i]; + *p = 8; + return v[i]; +} + +__attribute__((noinline, noclone)) int +baz (int i) +{ + volatile int v[4]; + int *p; + v[i] = 6; + p = (int *) &v[0]; + *p = 8; + return v[i]; +} + +int +main () +{ + if (foo (3) != 6 || bar (2) != 8 || baz (0) != 8 || baz (1) != 6) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr51581-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr51581-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr51581-1.c (added) +++ 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr51581-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,159 @@ +/* PR tree-optimization/51581 */ + +/* { dg-require-effective-target int32plus } */ + +extern void abort (void); + +#define N 4096 +int a[N], c[N]; +unsigned int b[N], d[N]; + +__attribute__((noinline, noclone)) void +f1 (void) +{ + int i; + for (i = 0; i < N; i++) + c[i] = a[i] / 3; +} + +__attribute__((noinline, noclone)) void +f2 (void) +{ + int i; + for (i = 0; i < N; i++) + d[i] = b[i] / 3; +} + +__attribute__((noinline, noclone)) void +f3 (void) +{ + int i; + for (i = 0; i < N; i++) + c[i] = a[i] / 18; +} + +__attribute__((noinline, noclone)) void +f4 (void) +{ + int i; + for (i = 0; i < N; i++) + d[i] = b[i] / 18; +} + +__attribute__((noinline, noclone)) void +f5 (void) +{ + int i; + for (i = 0; i < N; i++) + c[i] = a[i] / 19; +} + +__attribute__((noinline, noclone)) void +f6 (void) +{ + int i; + for (i = 0; i < N; i++) + d[i] = b[i] / 19; +} + +#if __SIZEOF_INT__ == 4 && __SIZEOF_LONG_LONG__ == 8 +__attribute__((noinline, noclone)) void +f7 (void) +{ + int i; + for (i = 0; i < N; i++) + c[i] = (int) ((unsigned long long) (a[i] * 0x55555556LL) >> 32) - (a[i] >> 31); +} + +__attribute__((noinline, noclone)) void +f8 (void) +{ + int i; + for (i = 0; i < N; i++) + d[i] = ((unsigned int) ((b[i] * 0xaaaaaaabULL) >> 32) >> 1); +} + +__attribute__((noinline, noclone)) void +f9 (void) +{ + int i; + for (i = 0; i < N; i++) + c[i] = (((int) ((unsigned long long) (a[i] * 0x38e38e39LL) >> 32)) >> 2) - (a[i] >> 31); +} + +__attribute__((noinline, noclone)) void +f10 (void) +{ + int i; + for (i = 0; i < N; i++) + d[i] = (unsigned int) ((b[i] * 0x38e38e39ULL) >> 32) >> 2; +} + +__attribute__((noinline, noclone)) void +f11 (void) +{ + int i; + for (i = 0; i < N; i++) + c[i] = (((int) ((unsigned long long) (a[i] * 0x6bca1af3LL) >> 32)) >> 3) - (a[i] >> 31); +} + +__attribute__((noinline, noclone)) void +f12 (void) +{ + int i; + for (i = 0; i < N; i++) + { + 
unsigned int tmp = (b[i] * 0xaf286bcbULL) >> 32; + d[i] = (((b[i] - tmp) >> 1) + tmp) >> 4; + } +} +#endif + +int +main () +{ + int i; + for (i = 0; i < N; i++) + { + asm (""); + a[i] = i - N / 2; + b[i] = i; + } + a[0] = -__INT_MAX__ - 1; + a[1] = -__INT_MAX__; + a[N - 1] = __INT_MAX__; + b[N - 1] = ~0; + f1 (); + f2 (); + for (i = 0; i < N; i++) + if (c[i] != a[i] / 3 || d[i] != b[i] / 3) + abort (); + f3 (); + f4 (); + for (i = 0; i < N; i++) + if (c[i] != a[i] / 18 || d[i] != b[i] / 18) + abort (); + f5 (); + f6 (); + for (i = 0; i < N; i++) + if (c[i] != a[i] / 19 || d[i] != b[i] / 19) + abort (); +#if __SIZEOF_INT__ == 4 && __SIZEOF_LONG_LONG__ == 8 + f7 (); + f8 (); + for (i = 0; i < N; i++) + if (c[i] != a[i] / 3 || d[i] != b[i] / 3) + abort (); + f9 (); + f10 (); + for (i = 0; i < N; i++) + if (c[i] != a[i] / 18 || d[i] != b[i] / 18) + abort (); + f11 (); + f12 (); + for (i = 0; i < N; i++) + if (c[i] != a[i] / 19 || d[i] != b[i] / 19) + abort (); +#endif + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr51581-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr51581-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr51581-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr51581-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,174 @@ +/* PR tree-optimization/51581 */ +/* { dg-require-effective-target int32plus } */ + +extern void abort (void); + +#define N 4096 +int a[N], c[N]; +unsigned int b[N], d[N]; + +__attribute__((noinline, noclone)) void +f1 (void) +{ + int i; + for (i = 0; i < N; i++) + c[i] = a[i] % 3; +} + +__attribute__((noinline, noclone)) void +f2 (void) +{ + int i; + for (i = 0; i < N; i++) + d[i] = b[i] % 3; +} + +__attribute__((noinline, noclone)) void +f3 (void) +{ + int i; + for (i = 0; i < N; 
i++) + c[i] = a[i] % 18; +} + +__attribute__((noinline, noclone)) void +f4 (void) +{ + int i; + for (i = 0; i < N; i++) + d[i] = b[i] % 18; +} + +__attribute__((noinline, noclone)) void +f5 (void) +{ + int i; + for (i = 0; i < N; i++) + c[i] = a[i] % 19; +} + +__attribute__((noinline, noclone)) void +f6 (void) +{ + int i; + for (i = 0; i < N; i++) + d[i] = b[i] % 19; +} + +#if __SIZEOF_INT__ == 4 && __SIZEOF_LONG_LONG__ == 8 +__attribute__((noinline, noclone)) void +f7 (void) +{ + int i; + for (i = 0; i < N; i++) + { + int x = (int) ((unsigned long long) (a[i] * 0x55555556LL) >> 32) - (a[i] >> 31); + c[i] = a[i] - x * 3; + } +} + +__attribute__((noinline, noclone)) void +f8 (void) +{ + int i; + for (i = 0; i < N; i++) + { + unsigned int x = ((unsigned int) ((b[i] * 0xaaaaaaabULL) >> 32) >> 1); + d[i] = b[i] - x * 3; + } +} + +__attribute__((noinline, noclone)) void +f9 (void) +{ + int i; + for (i = 0; i < N; i++) + { + int x = (((int) ((unsigned long long) (a[i] * 0x38e38e39LL) >> 32)) >> 2) - (a[i] >> 31); + c[i] = a[i] - x * 18; + } +} + +__attribute__((noinline, noclone)) void +f10 (void) +{ + int i; + for (i = 0; i < N; i++) + { + unsigned int x = (unsigned int) ((b[i] * 0x38e38e39ULL) >> 32) >> 2; + d[i] = b[i] - x * 18; + } +} + +__attribute__((noinline, noclone)) void +f11 (void) +{ + int i; + for (i = 0; i < N; i++) + { + int x = (((int) ((unsigned long long) (a[i] * 0x6bca1af3LL) >> 32)) >> 3) - (a[i] >> 31); + c[i] = a[i] - x * 19; + } +} + +__attribute__((noinline, noclone)) void +f12 (void) +{ + int i; + for (i = 0; i < N; i++) + { + unsigned int tmp = (b[i] * 0xaf286bcbULL) >> 32; + unsigned int x = (((b[i] - tmp) >> 1) + tmp) >> 4; + d[i] = b[i] - x * 19; + } +} +#endif + +int +main () +{ + int i; + for (i = 0; i < N; i++) + { + asm (""); + a[i] = i - N / 2; + b[i] = i; + } + a[0] = -__INT_MAX__ - 1; + a[1] = -__INT_MAX__; + a[N - 1] = __INT_MAX__; + b[N - 1] = ~0; + f1 (); + f2 (); + for (i = 0; i < N; i++) + if (c[i] != a[i] % 3 || d[i] != b[i] % 3) 
+ abort (); + f3 (); + f4 (); + for (i = 0; i < N; i++) + if (c[i] != a[i] % 18 || d[i] != b[i] % 18) + abort (); + f5 (); + f6 (); + for (i = 0; i < N; i++) + if (c[i] != a[i] % 19 || d[i] != b[i] % 19) + abort (); +#if __SIZEOF_INT__ == 4 && __SIZEOF_LONG_LONG__ == 8 + f7 (); + f8 (); + for (i = 0; i < N; i++) + if (c[i] != a[i] % 3 || d[i] != b[i] % 3) + abort (); + f9 (); + f10 (); + for (i = 0; i < N; i++) + if (c[i] != a[i] % 18 || d[i] != b[i] % 18) + abort (); + f11 (); + f12 (); + for (i = 0; i < N; i++) + if (c[i] != a[i] % 19 || d[i] != b[i] % 19) + abort (); +#endif + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr51877.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr51877.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr51877.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr51877.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,50 @@ +/* PR tree-optimization/51877 */ + +extern void abort (void); +struct A { int a; char b[32]; } a, b; + +__attribute__((noinline, noclone)) +struct A +bar (int x) +{ + struct A r; + static int n; + r.a = ++n; + __builtin_memset (r.b, 0, sizeof (r.b)); + r.b[0] = x; + return r; +} + +__attribute__((noinline, noclone)) +void +baz (void) +{ + asm volatile ("" : : : "memory"); +} + +__attribute__((noinline, noclone)) +void +foo (struct A *x, int y) +{ + if (y == 6) + a = bar (7); + else + *x = bar (7); + baz (); +} + +int +main () +{ + a = bar (3); + b = bar (4); + if (a.a != 1 || a.b[0] != 3 || b.a != 2 || b.b[0] != 4) + abort (); + foo (&b, 0); + if (a.a != 1 || a.b[0] != 3 || b.a != 3 || b.b[0] != 7) + abort (); + foo (&b, 6); + if (a.a != 4 || a.b[0] != 7 || b.a != 3 || b.b[0] != 7) + abort (); + return 0; +} Added: 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr51933.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr51933.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr51933.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr51933.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,59 @@ +/* PR rtl-optimization/51933 */ + +static signed char v1; +static unsigned char v2[256], v3[256]; + +__attribute__((noclone, noinline)) void +foo (void) +{ +#if defined(__s390__) && !defined(__zarch__) + /* S/390 31 bit cannot deal with more than one literal pool + reference per insn. */ + asm volatile ("" : : "g" (&v1) : "memory"); + asm volatile ("" : : "g" (&v2[0])); + asm volatile ("" : : "g" (&v3[0])); +#else + asm volatile ("" : : "g" (&v1), "g" (&v2[0]), "g" (&v3[0]) : "memory"); +#endif +} + +__attribute__((noclone, noinline)) int +bar (const int x, const unsigned short *y, char *z) +{ + int i; + unsigned short u; + if (!v1) + foo (); + for (i = 0; i < x; i++) + { + u = y[i]; + z[i] = u < 0x0100 ? 
v2[u] : v3[u & 0xff]; + } + z[x] = '\0'; + return x; +} + +int +main (void) +{ + char buf[18]; + unsigned short s[18]; + unsigned char c[18] = "abcdefghijklmnopq"; + int i; + for (i = 0; i < 256; i++) + { + v2[i] = i; + v3[i] = i + 1; + } + for (i = 0; i < 18; i++) + s[i] = c[i]; + s[5] |= 0x600; + s[6] |= 0x500; + s[11] |= 0x2000; + s[15] |= 0x500; + foo (); + if (bar (17, s, buf) != 17 + || __builtin_memcmp (buf, "abcdeghhijkmmnoqq", 18) != 0) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr52129.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr52129.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr52129.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr52129.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,29 @@ +/* PR target/52129 */ +/* { dg-xfail-if "ptxas crashes" { nvptx-*-* } { "-O1" } { "" } } */ + +extern void abort (void); +struct S { void *p; unsigned int q; }; +struct T { char a[64]; char b[64]; } t; + +__attribute__((noinline, noclone)) int +foo (void *x, struct S s, void *y, void *z) +{ + if (x != &t.a[2] || s.p != &t.b[5] || s.q != 27 || y != &t.a[17] || z != &t.b[17]) + abort (); + return 29; +} + +__attribute__((noinline, noclone)) int +bar (void *x, void *y, void *z, struct S s, int t, struct T *u) +{ + return foo (x, s, &u->a[t], &u->b[t]); +} + +int +main () +{ + struct S s = { &t.b[5], 27 }; + if (bar (&t.a[2], (void *) 0, (void *) 0, s, 17, &t) != 29) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr52209.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr52209.c?rev=374156&view=auto ============================================================================== --- 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr52209.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr52209.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,14 @@ +/* PR middle-end/52209 */ + +extern void abort (void); +struct S0 { int f2 : 1; } c; +int b; + +int +main () +{ + b = -1 ^ c.f2; + if (b != -1) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr52286.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr52286.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr52286.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr52286.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,20 @@ +/* PR tree-optimization/52286 */ + +extern void abort (void); + +int +main () +{ +#if __SIZEOF_INT__ > 2 + int a, b; + asm ("" : "=r" (a) : "0" (0)); + b = (~a | 1) & -2038094497; +#else + long a, b; + asm ("" : "=r" (a) : "0" (0)); + b = (~a | 1) & -2038094497L; +#endif + if (b >= 0) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr52760.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr52760.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr52760.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr52760.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,27 @@ +/* PR tree-optimization/52760 */ + +struct T { unsigned short a, b, c, d; }; + +__attribute__((noinline, noclone)) void +foo (int x, struct T *y) +{ + int i; + + for (i = 0; i < x; i++) + { + y[i].a = ((0x00ff & y[i].a >> 8) | (0xff00 & y[i].a << 8)); + y[i].b = ((0x00ff & y[i].b >> 8) | (0xff00 & 
y[i].b << 8)); + y[i].c = ((0x00ff & y[i].c >> 8) | (0xff00 & y[i].c << 8)); + y[i].d = ((0x00ff & y[i].d >> 8) | (0xff00 & y[i].d << 8)); + } +} + +int +main () +{ + struct T t = { 0x0001, 0x0203, 0x0405, 0x0607 }; + foo (1, &t); + if (t.a != 0x0100 || t.b != 0x0302 || t.c != 0x0504 || t.d != 0x0706) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr52979-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr52979-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr52979-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr52979-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,42 @@ +/* PR middle-end/52979 */ + +/* { dg-require-effective-target int32plus } */ + +extern void abort (void); +int c, d, e; + +void +foo (void) +{ +} + +struct __attribute__((packed)) S { int g : 31; int h : 6; }; +struct S a = { 1 }; +static struct S b = { 1 }; + +void +bar (void) +{ + a.h = 1; + struct S f = { }; + b = f; + e = 0; + if (d) + c = a.g; +} + +void +baz (void) +{ + bar (); + a = b; +} + +int +main () +{ + baz (); + if (a.g) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr52979-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr52979-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr52979-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr52979-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,41 @@ +/* PR middle-end/52979 */ +/* { dg-require-effective-target int32plus } */ + +extern void abort (void); +int c, d, e; + +void +foo (void) +{ +} + +struct 
__attribute__((packed)) S { int g : 31; int h : 6; }; +static struct S b = { 1 }; +struct S a = { 1 }; + +void +bar (void) +{ + a.h = 1; + struct S f = { }; + b = f; + e = 0; + if (d) + c = a.g; +} + +void +baz (void) +{ + bar (); + a = b; +} + +int +main () +{ + baz (); + if (a.g) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr53084.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr53084.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr53084.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr53084.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,18 @@ +/* PR middle-end/53084 */ + +extern void abort (void); + +__attribute__((noinline, noclone)) void +bar (const char *p) +{ + if (p[0] != 'o' || p[1] != 'o' || p[2]) + abort (); +} + +int +main () +{ + static const char *const foo[] = {"foo" + 1}; + bar (foo[0]); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr53160.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr53160.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr53160.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr53160.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,35 @@ +/* PR rtl-optimization/53160 */ + +extern void abort (void); + +int a, c = 1, d, e, g; +volatile int b; +volatile char f; +long h; +short i; + +void +foo (void) +{ + for (e = 0; e; ++e) + ; +} + +int +main () +{ + if (g) + (void) b; + foo (); + for (d = 0; d >= 0; d--) + { + short j = f; + int k = 0; + i = j ? j : j << k; + } + h = c == 0 ? 
0 : i; + a = h; + if (a != 0) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr53465.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr53465.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr53465.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr53465.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,30 @@ +/* PR tree-optimization/53465 */ + +extern void abort (); + +static const int a[] = { 1, 2 }; + +void +foo (const int *x, int y) +{ + int i; + int b = 0; + int c; + for (i = 0; i < y; i++) + { + int d = x[i]; + if (d == 0) + break; + if (b && d <= c) + abort (); + c = d; + b = 1; + } +} + +int +main () +{ + foo (a, 2); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr53645-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr53645-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr53645-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr53645-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,121 @@ +/* PR tree-optimization/53645 */ +/* { dg-options "-std=gnu89" } */ + +typedef unsigned short int UV __attribute__((vector_size (16))); +typedef short int SV __attribute__((vector_size (16))); +extern void abort (void); + +#define TEST(a, b, c, d, e, f, g, h) \ +__attribute__((noinline)) void \ +uq##a##b##c##d##e##f##g##h (UV *x, UV *y) \ +{ \ + *x = *y / ((UV) { a, b, c, d, e, f, g, h }); \ +} \ + \ +__attribute__((noinline)) void \ +ur##a##b##c##d##e##f##g##h (UV *x, UV *y) \ +{ \ + *x = *y % ((UV) { a, b, c, d, e, f, g, h }); \ +} \ + \ +__attribute__((noinline)) void \ 
+sq##a##b##c##d##e##f##g##h (SV *x, SV *y) \ +{ \ + *x = *y / ((SV) { a, b, c, d, e, f, g, h }); \ +} \ + \ +__attribute__((noinline)) void \ +sr##a##b##c##d##e##f##g##h (SV *x, SV *y) \ +{ \ + *x = *y % ((SV) { a, b, c, d, e, f, g, h }); \ +} + +#define TESTS \ +TEST (4, 4, 4, 4, 4, 4, 4, 4) \ +TEST (1, 4, 2, 8, 16, 64, 32, 128) \ +TEST (3, 3, 3, 3, 3, 3, 3, 3) \ +TEST (6, 5, 6, 5, 6, 5, 6, 5) \ +TEST (14, 14, 14, 6, 14, 6, 14, 14) \ +TEST (7, 7, 7, 7, 7, 7, 7, 7) \ + +TESTS + +UV u[] = + { ((UV) { 73U, 65531U, 0U, 174U, 921U, 65535U, 17U, 178U }), + ((UV) { 1U, 8173U, 65535U, 65472U, 12U, 29612U, 128U, 8912U }) }; +SV s[] = + { ((SV) { 73, -9123, 32761, 8191, 16371, 1201, 12701, 9999 }), + ((SV) { 9903, -1, -7323, 0, -7, -323, 9124, -9199 }) }; + +int +main () +{ + UV ur, ur2; + SV sr, sr2; + int i; +#undef TEST +#define TEST(a, b, c, d, e, f, g, h) \ + uq##a##b##c##d##e##f##g##h (&ur, u + i); \ + if (ur[0] != u[i][0] / a || ur[3] != u[i][3] / d) \ + abort (); \ + asm volatile ("" : : "r" (&ur) : "memory"); \ + if (ur[2] != u[i][2] / c || ur[1] != u[i][1] / b) \ + abort (); \ + asm volatile ("" : : "r" (&ur) : "memory"); \ + if (ur[4] != u[i][4] / e || ur[7] != u[i][7] / h) \ + abort (); \ + asm volatile ("" : : "r" (&ur) : "memory"); \ + if (ur[6] != u[i][6] / g || ur[5] != u[i][5] / f) \ + abort (); \ + asm volatile ("" : : "r" (&ur) : "memory"); \ + ur##a##b##c##d##e##f##g##h (&ur, u + i); \ + if (ur[0] != u[i][0] % a || ur[3] != u[i][3] % d) \ + abort (); \ + asm volatile ("" : : "r" (&ur) : "memory"); \ + if (ur[2] != u[i][2] % c || ur[1] != u[i][1] % b) \ + abort (); \ + asm volatile ("" : : "r" (&ur) : "memory"); \ + if (ur[4] != u[i][4] % e || ur[7] != u[i][7] % h) \ + abort (); \ + asm volatile ("" : : "r" (&ur) : "memory"); \ + if (ur[6] != u[i][6] % g || ur[5] != u[i][5] % f) \ + abort (); \ + asm volatile ("" : : "r" (&ur) : "memory"); + for (i = 0; i < sizeof (u) / sizeof (u[0]); i++) + { + TESTS + } +#undef TEST +#define TEST(a, b, c, d, e, f, g, h) 
\ + sq##a##b##c##d##e##f##g##h (&sr, s + i); \ + if (sr[0] != s[i][0] / a || sr[3] != s[i][3] / d) \ + abort (); \ + asm volatile ("" : : "r" (&sr) : "memory"); \ + if (sr[2] != s[i][2] / c || sr[1] != s[i][1] / b) \ + abort (); \ + asm volatile ("" : : "r" (&sr) : "memory"); \ + if (sr[4] != s[i][4] / e || sr[7] != s[i][7] / h) \ + abort (); \ + asm volatile ("" : : "r" (&sr) : "memory"); \ + if (sr[6] != s[i][6] / g || sr[5] != s[i][5] / f) \ + abort (); \ + asm volatile ("" : : "r" (&sr) : "memory"); \ + sr##a##b##c##d##e##f##g##h (&sr, s + i); \ + if (sr[0] != s[i][0] % a || sr[3] != s[i][3] % d) \ + abort (); \ + asm volatile ("" : : "r" (&sr) : "memory"); \ + if (sr[2] != s[i][2] % c || sr[1] != s[i][1] % b) \ + abort (); \ + asm volatile ("" : : "r" (&sr) : "memory"); \ + if (sr[4] != s[i][4] % e || sr[7] != s[i][7] % h) \ + abort (); \ + asm volatile ("" : : "r" (&sr) : "memory"); \ + if (sr[6] != s[i][6] % g || sr[5] != s[i][5] % f) \ + abort (); \ + asm volatile ("" : : "r" (&sr) : "memory"); + for (i = 0; i < sizeof (s) / sizeof (s[0]); i++) + { + TESTS + } + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr53645.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr53645.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr53645.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr53645.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,97 @@ +/* PR tree-optimization/53645 */ +/* { dg-options "-std=gnu89" } */ + +typedef unsigned int UV __attribute__((vector_size (16))); +typedef int SV __attribute__((vector_size (16))); +extern void abort (void); + +#define TEST(a, b, c, d) \ +__attribute__((noinline)) void \ +uq##a##b##c##d (UV *x, UV *y) \ +{ \ + *x = *y / ((UV) { a, b, c, d }); \ +} \ + \ +__attribute__((noinline)) void \ 
+ur##a##b##c##d (UV *x, UV *y) \ +{ \ + *x = *y % ((UV) { a, b, c, d }); \ +} \ + \ +__attribute__((noinline)) void \ +sq##a##b##c##d (SV *x, SV *y) \ +{ \ + *x = *y / ((SV) { a, b, c, d }); \ +} \ + \ +__attribute__((noinline)) void \ +sr##a##b##c##d (SV *x, SV *y) \ +{ \ + *x = *y % ((SV) { a, b, c, d }); \ +} + +#define TESTS \ +TEST (4, 4, 4, 4) \ +TEST (1, 4, 2, 8) \ +TEST (3, 3, 3, 3) \ +TEST (6, 5, 6, 5) \ +TEST (14, 14, 14, 6) \ +TEST (7, 7, 7, 7) \ + +TESTS + +UV u[] = + { ((UV) { 73U, 65531U, 0U, 174U }), + ((UV) { 1U, 8173U, ~0U, ~0U - 63 }) }; +SV s[] = + { ((SV) { 73, -9123, 32761, 8191 }), + ((SV) { 9903, -1, -7323, 0 }) }; + +int +main () +{ + UV ur, ur2; + SV sr, sr2; + int i; +#undef TEST +#define TEST(a, b, c, d) \ + uq##a##b##c##d (&ur, u + i); \ + if (ur[0] != u[i][0] / a || ur[3] != u[i][3] / d) \ + abort (); \ + asm volatile ("" : : "r" (&ur) : "memory"); \ + if (ur[2] != u[i][2] / c || ur[1] != u[i][1] / b) \ + abort (); \ + asm volatile ("" : : "r" (&ur) : "memory"); \ + ur##a##b##c##d (&ur, u + i); \ + if (ur[0] != u[i][0] % a || ur[3] != u[i][3] % d) \ + abort (); \ + asm volatile ("" : : "r" (&ur) : "memory"); \ + if (ur[2] != u[i][2] % c || ur[1] != u[i][1] % b) \ + abort (); \ + asm volatile ("" : : "r" (&ur) : "memory"); + for (i = 0; i < sizeof (u) / sizeof (u[0]); i++) + { + TESTS + } +#undef TEST +#define TEST(a, b, c, d) \ + sq##a##b##c##d (&sr, s + i); \ + if (sr[0] != s[i][0] / a || sr[3] != s[i][3] / d) \ + abort (); \ + asm volatile ("" : : "r" (&sr) : "memory"); \ + if (sr[2] != s[i][2] / c || sr[1] != s[i][1] / b) \ + abort (); \ + asm volatile ("" : : "r" (&sr) : "memory"); \ + sr##a##b##c##d (&sr, s + i); \ + if (sr[0] != s[i][0] % a || sr[3] != s[i][3] % d) \ + abort (); \ + asm volatile ("" : : "r" (&sr) : "memory"); \ + if (sr[2] != s[i][2] % c || sr[1] != s[i][1] % b) \ + abort (); \ + asm volatile ("" : : "r" (&sr) : "memory"); + for (i = 0; i < sizeof (s) / sizeof (s[0]); i++) + { + TESTS + } + return 0; +} Added: 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr53688.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr53688.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr53688.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr53688.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,32 @@ +char headline[256]; +struct hdr { + char part1[9]; + char part2[8]; +} p; + +void __attribute__((noinline,noclone)) +init() +{ + __builtin_memcpy (p.part1, "FOOBARFOO", sizeof (p.part1)); + __builtin_memcpy (p.part2, "SPEC CPU", sizeof (p.part2)); +} + +int main() +{ + char *x; + int c; + init(); + __builtin_memcpy (&headline[0], p.part1, 9); + c = 9; + x = &headline[0]; + x = x + c; + __builtin_memset (x, ' ', 245); + __builtin_memcpy (&headline[10], p.part2, 8); + c = 18; + x = &headline[0]; + x = x + c; + __builtin_memset (x, ' ', 238); + if (headline[10] != 'S') + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr54471.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr54471.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr54471.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr54471.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,36 @@ +/* PR tree-optimization/54471 */ + +#ifdef __SIZEOF_INT128__ +#define T __int128 +#else +#define T long long +#endif + +extern void abort (void); + +__attribute__ ((noinline)) +unsigned T +foo (T ixi, unsigned ctr) +{ + unsigned T irslt = 1; + T ix = ixi; + + for (; ctr; ctr--) + { + irslt *= ix; + ix *= ix; + } + + if (irslt != 14348907) + abort (); + return irslt; +} + +int +main 
() +{ + unsigned T res; + + res = foo (3, 4); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr54937.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr54937.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr54937.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr54937.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,22 @@ + +void exit (int); +void abort (void); +int a[1]; +void (*terminate_me)(int); + +__attribute__((noinline,noclone)) +t(int c) +{ int i; + for (i=0;ia; + int x; + + while (count--) + { + x = item->a; + if (first) + first = 0; + else if (x >= a) + return 1; + a = x; + item++; + } + return 0; +} + +extern void abort (void); + +int main () +{ + ST _1[2] = {{2}, {1}}; + if (foo(_1, 2) != 0) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr55137.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr55137.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr55137.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr55137.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,30 @@ +/* PR c++/55137 */ + +extern void abort (void); + +int +foo (unsigned int x) +{ + return ((int) (x + 1U) + 1) < (int) x; +} + +int +bar (unsigned int x) +{ + return (int) (x + 1U) + 1; +} + +int +baz (unsigned int x) +{ + return x + 1U; +} + +int +main () +{ + if (foo (__INT_MAX__) != (bar (__INT_MAX__) < __INT_MAX__) + || foo (__INT_MAX__) != ((int) baz (__INT_MAX__) + 1 < __INT_MAX__)) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr55750.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr55750.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr55750.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr55750.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,29 @@ +/* PR middle-end/55750 */ + +extern void abort (void); + +struct S +{ + int m : 1; + int n : 7; +} arr[2]; + +__attribute__((noinline, noclone)) void +foo (unsigned i) +{ + arr[i].n++; +} + +int +main () +{ + arr[0].m = -1; + arr[0].n = (1 << 6) - 1; + arr[1].m = 0; + arr[1].n = -1; + foo (0); + foo (1); + if (arr[0].m != -1 || arr[0].n != -(1 << 6) || arr[1].m != 0 || arr[1].n != 0) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr55875.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr55875.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr55875.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr55875.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,17 @@ +int a[251]; +__attribute__ ((noinline)) +t(int i) +{ + if (i==0) + exit(0); + if (i>255) + abort (); +} +main() +{ + unsigned int i; + for (i=0;;i++) + { + a[i]=t((unsigned char)(i+5)); + } +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr56051.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr56051.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr56051.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr56051.c Wed 
Oct 9 04:01:46 2019 @@ -0,0 +1,32 @@ +/* PR tree-optimization/56051 */ + +extern void abort (void); + +int +main () +{ + unsigned char x1[1] = { 0 }; + unsigned int s1 = __CHAR_BIT__; + int a1 = x1[0] < (unsigned char) (1 << s1); + unsigned char y1 = (unsigned char) (1 << s1); + int b1 = x1[0] < y1; + if (a1 != b1) + abort (); +#if __SIZEOF_LONG_LONG__ > __SIZEOF_INT__ + unsigned long long x2[1] = { 2ULL << (sizeof (int) * __CHAR_BIT__) }; + unsigned int s2 = sizeof (int) * __CHAR_BIT__ - 1; + int a2 = x2[0] >= (unsigned long long) (1 << s2); + unsigned long long y2 = 1 << s2; + int b2 = x2[0] >= y2; + if (a2 != b2) + abort (); + unsigned long long x3[1] = { 2ULL << (sizeof (int) * __CHAR_BIT__) }; + unsigned int s3 = sizeof (int) * __CHAR_BIT__ - 1; + int a3 = x3[0] >= (unsigned long long) (1U << s3); + unsigned long long y3 = 1U << s3; + int b3 = x3[0] >= y3; + if (a3 != b3) + abort (); +#endif + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr56205.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr56205.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr56205.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr56205.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,61 @@ +/* PR tree-optimization/56205 */ + +#include <stdarg.h> + +int a, b; +char c[128]; + +__attribute__((noinline, noclone)) static void +f1 (const char *fmt, ...) 
+{ + va_list ap; + asm volatile ("" : : : "memory"); + if (__builtin_strcmp (fmt, "%s %d %s") != 0) + __builtin_abort (); + va_start (ap, fmt); + if (__builtin_strcmp (va_arg (ap, const char *), "foo") != 0 + || va_arg (ap, int) != 1 + || __builtin_strcmp (va_arg (ap, const char *), "bar") != 0) + __builtin_abort (); + va_end (ap); +} + +__attribute__((noinline, noclone)) static void +f2 (const char *fmt, va_list ap) +{ + asm volatile ("" : : : "memory"); + if (__builtin_strcmp (fmt, "baz") != 0 + || __builtin_strcmp (va_arg (ap, const char *), "foo") != 0 + || va_arg (ap, double) != 12.0 + || va_arg (ap, int) != 26) + __builtin_abort (); +} + +static void +f3 (int x, char const *y, va_list z) +{ + f1 ("%s %d %s", x ? "" : "foo", ++a, (y && *y) ? "bar" : ""); + if (y && *y) + f2 (y, z); +} + +__attribute__((noinline, noclone)) void +f4 (int x, char const *y, ...) +{ + va_list z; + va_start (z, y); + if (!x && *c == '\0') + ++b; + f3 (x, y, z); + va_end (z); +} + +int +main () +{ + asm volatile ("" : : : "memory"); + f4 (0, "baz", "foo", 12.0, 26); + if (a != 1 || b != 1) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr56250.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr56250.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr56250.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr56250.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,13 @@ +/* PR tree-optimization/56250 */ + +extern void abort (void); + +int +main () +{ + unsigned int x = 2; + unsigned int y = (0U - x / 2) / 2; + if (-1U / x != y) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr56799.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr56799.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr56799.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr56799.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,42 @@ +/* { dg-require-effective-target int32plus } */ + +#include +typedef struct { int x; int y;} S; +extern int foo(S*); +int hi = 0, lo = 0; + +int main() +{ + S a; + int r; + a.x = (int) 0x00010000; + a.y = 1; + r = foo (&a); + if (r == 2 && lo==0 && hi==1) + { + exit (0); + } + abort (); +} + +typedef unsigned short u16; + +__attribute__ ((noinline)) int foo (S* ptr) +{ + int a = ptr->x; + int c = 0; + u16 b = (u16) a; + if (b != 0) + { + lo = 1; + c += ptr->y; + } + b = a >> 16; + if (b != 0) + { + hi = 1; + c+= ptr->y; + } + c += ptr->y; + return c; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr56837.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr56837.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr56837.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr56837.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,21 @@ +extern void abort (void); +_Complex int a[1024]; + +__attribute__((noinline, noclone)) void +foo (void) +{ + int i; + for (i = 0; i < 1024; i++) + a[i] = -1; +} + +int +main () +{ + int i; + foo (); + for (i = 0; i < 1024; i++) + if (a[i] != -1) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr56866.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr56866.c?rev=374156&view=auto 
============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr56866.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr56866.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,45 @@ +/* PR target/56866 */ + +int +main () +{ +#if __CHAR_BIT__ == 8 && __SIZEOF_LONG_LONG__ == 8 && __SIZEOF_INT__ == 4 && __SIZEOF_SHORT__ == 2 + unsigned long long wq[256], rq[256]; + unsigned int wi[256], ri[256]; + unsigned short ws[256], rs[256]; + unsigned char wc[256], rc[256]; + int t; + + __builtin_memset (wq, 0, sizeof wq); + __builtin_memset (wi, 0, sizeof wi); + __builtin_memset (ws, 0, sizeof ws); + __builtin_memset (wc, 0, sizeof wc); + wq[0] = 0x0123456789abcdefULL; + wi[0] = 0x01234567; + ws[0] = 0x4567; + wc[0] = 0x73; + + asm volatile ("" : : "g" (wq), "g" (wi), "g" (ws), "g" (wc) : "memory"); + + for (t = 0; t < 256; ++t) + rq[t] = (wq[t] >> 8) | (wq[t] << (sizeof (wq[0]) * __CHAR_BIT__ - 8)); + for (t = 0; t < 256; ++t) + ri[t] = (wi[t] >> 8) | (wi[t] << (sizeof (wi[0]) * __CHAR_BIT__ - 8)); + for (t = 0; t < 256; ++t) + rs[t] = (ws[t] >> 9) | (ws[t] << (sizeof (ws[0]) * __CHAR_BIT__ - 9)); + for (t = 0; t < 256; ++t) + rc[t] = (wc[t] >> 5) | (wc[t] << (sizeof (wc[0]) * __CHAR_BIT__ - 5)); + + asm volatile ("" : : "g" (rq), "g" (ri), "g" (rs), "g" (rc) : "memory"); + + if (rq[0] != 0xef0123456789abcdULL || rq[1]) + __builtin_abort (); + if (ri[0] != 0x67012345 || ri[1]) + __builtin_abort (); + if (rs[0] != 0xb3a2 || rs[1]) + __builtin_abort (); + if (rc[0] != 0x9b || rc[1]) + __builtin_abort (); +#endif + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr56899.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr56899.c?rev=374156&view=auto ============================================================================== --- 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr56899.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr56899.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,47 @@ +/* PR tree-optimization/56899 */ + +#if __SIZEOF_INT__ == 4 && __CHAR_BIT__ == 8 +__attribute__((noinline, noclone)) void +f1 (int v) +{ + int x = -214748365 * (v - 1); + if (x != -1932735285) + __builtin_abort (); +} + +__attribute__((noinline, noclone)) void +f2 (int v) +{ + int x = 214748365 * (v + 1); + if (x != -1932735285) + __builtin_abort (); +} + +__attribute__((noinline, noclone)) void +f3 (unsigned int v) +{ + unsigned int x = -214748365U * (v - 1); + if (x != -1932735285U) + __builtin_abort (); +} + +__attribute__((noinline, noclone)) void +f4 (unsigned int v) +{ + unsigned int x = 214748365U * (v + 1); + if (x != -1932735285U) + __builtin_abort (); +} +#endif + +int +main () +{ +#if __SIZEOF_INT__ == 4 && __CHAR_BIT__ == 8 + f1 (10); + f2 (-10); + f3 (10); + f4 (-10U); +#endif + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr56962.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr56962.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr56962.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr56962.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,30 @@ +/* PR tree-optimization/56962 */ + +extern void abort (void); +long long v[144]; + +__attribute__((noinline, noclone)) void +bar (long long *x) +{ + if (x != &v[29]) + abort (); +} + +__attribute__((noinline, noclone)) void +foo (long long *x, long y, long z) +{ + long long a, b, c; + a = x[z * 4 + y * 3]; + b = x[z * 5 + y * 3]; + c = x[z * 5 + y * 4]; + x[y * 4] = a; + bar (&x[z * 5 + y]); + x[z * 5 + y * 5] = b + c; +} + +int +main () +{ + foo (v, 24, 1); + return 
0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr56982.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr56982.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr56982.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr56982.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,46 @@ +/* { dg-require-effective-target indirect_jumps } */ +#include <setjmp.h> + +extern void abort (void); +extern void exit (int); + +static jmp_buf env; + +void baz (void) +{ + __asm__ volatile ("" : : : "memory"); +} + +static inline int g(int x) +{ + if (x) + { + baz(); + return 0; + } + else + { + baz(); + return 1; + } +} + +int f(int *e) +{ + if (*e) + return 1; + + int x = setjmp(env); + int n = g(x); + if (n == 0) + exit(0); + if (x) + abort(); + longjmp(env, 42); +} + +int main(int argc, char** argv) +{ + int v = 0; + return f(&v); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr57124.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr57124.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr57124.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr57124.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,33 @@ +/* { dg-require-effective-target int32plus } */ +/* { dg-options "-fno-strict-overflow" } */ + +extern void abort (void); +extern void exit (int); + +__attribute__ ((noinline)) void +foo(short unsigned int *p1, short unsigned int *p2) +{ + short unsigned int x1, x4; + int x2, x3, x5, x6; + unsigned int x7; + + x1 = *p1; + x2 = (int) x1; + x3 = x2 * 65536; + x4 = *p2; + x5 = (int) x4; + x6 = x3 + x4; + x7 = (unsigned int) x6; + if (x7 <= 
268435455U) + abort (); + exit (0); +} + +int +main() +{ + short unsigned int x, y; + x = -5; + y = -10; + foo (&x, &y); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr57130.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr57130.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr57130.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr57130.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,21 @@ +/* PR rtl-optimization/57130 */ + +struct S { int a, b, c, d; } s[2] = { { 6, 8, -8, -5 }, { 0, 2, -1, 2 } }; + +__attribute__((noinline, noclone)) void +foo (struct S r) +{ + static int cnt; + if (__builtin_memcmp (&r, &s[cnt++], sizeof r) != 0) + __builtin_abort (); +} + +int +main () +{ + struct S r = { 6, 8, -8, -5 }; + foo (r); + r = (struct S) { 0, 2, -1, 2 }; + foo (r); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr57131.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr57131.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr57131.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr57131.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,19 @@ +/* PR rtl-optimization/57131 */ + +extern void abort (void); + +int +main () +{ + volatile int x1 = 0; + volatile long long x2 = 0; + volatile int x3 = 0; + volatile int x4 = 1; + volatile int x5 = 1; + volatile long long x6 = 1; + long long t = ((x1 * (x2 << x3)) / (x4 * x5)) + x6; + + if (t != 1) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr57144.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr57144.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr57144.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr57144.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,19 @@ +extern void abort (void); +extern void exit (int); + +void __attribute__ ((noinline)) +foo(int a) +{ + int z = a > 0 ? a : -a; + long long x = z; + if (x > 0x100000000LL) + abort (); + else + exit (0); +} + +int +main() +{ + foo (1); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr57281.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr57281.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr57281.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr57281.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,25 @@ +/* PR rtl-optimization/57281 */ + +int a = 1, b, d, *e = &d; +long long c, *g = &c; +volatile long long f; + +int +foo (int h) +{ + int j = *g = b; + return h == 0 ? 
j : 0; +} + +int +main () +{ + int h = a; + for (; b != -20; b--) + { + (int) f; + *e = 0; + *e = foo (h); + } + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr57321.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr57321.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr57321.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr57321.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,24 @@ +/* PR tree-optimization/57321 */ + +int a = 1, *b, **c; + +static int +foo (int *p) +{ + if (*p == a) + { + int *i[7][5] = { { 0 } }; + int **j[1][1]; + j[0][0] = &i[0][0]; + *b = &p != c; + } + return 0; +} + +int +main () +{ + int i = 0; + foo (&i); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr57344-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr57344-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr57344-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr57344-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,32 @@ +/* PR middle-end/57344 */ + +struct __attribute__((packed)) S +{ + int a : 11; +#if __SIZEOF_INT__ * __CHAR_BIT__ >= 32 + int b : 22; +#else + int b : 13; +#endif + char c; + int : 0; +} s[2]; +int i; + +__attribute__((noinline, noclone)) void +foo (int x) +{ + if (x != -3161) + __builtin_abort (); + asm volatile ("" : : : "memory"); +} + +int +main () +{ + struct S t = { 0, -3161L }; + s[1] = t; + for (; i < 1; i++) + foo (s[1].b); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr57344-2.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr57344-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr57344-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr57344-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,33 @@ +/* PR middle-end/57344 */ +/* { dg-require-effective-target int32plus } */ + +struct __attribute__((packed)) S +{ + int a : 27; +#if __SIZEOF_INT__ * __CHAR_BIT__ >= 32 + int b : 22; +#else + int b : 13; +#endif + char c; + int : 0; +} s[2]; +int i; + +__attribute__((noinline, noclone)) void +foo (int x) +{ + if (x != -3161) + __builtin_abort (); + asm volatile ("" : : : "memory"); +} + +int +main () +{ + struct S t = { 0, -3161L }; + s[1] = t; + for (; i < 1; i++) + foo (s[1].b); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr57344-3.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr57344-3.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr57344-3.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr57344-3.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,28 @@ +/* PR middle-end/57344 */ + +struct __attribute__((packed)) S +{ + long long int a : 43; + long long int b : 22; + char c; + long long int : 0; +} s[2]; +int i; + +__attribute__((noinline, noclone)) void +foo (long long int x) +{ + if (x != -3161LL) + __builtin_abort (); + asm volatile ("" : : : "memory"); +} + +int +main () +{ + struct S t = { 0, -3161LL }; + s[1] = t; + for (; i < 1; i++) + foo (s[1].b); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr57344-4.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr57344-4.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr57344-4.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr57344-4.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,28 @@ +/* PR middle-end/57344 */ + +struct __attribute__((packed)) S +{ + long long int a : 59; + long long int b : 54; + char c; + long long int : 0; +} s[2]; +int i; + +__attribute__((noinline, noclone)) void +foo (long long int x) +{ + if (x != -1220975898975746LL) + __builtin_abort (); + asm volatile ("" : : : "memory"); +} + +int +main () +{ + struct S t = { 0, -1220975898975746LL }; + s[1] = t; + for (; i < 1; i++) + foo (s[1].b); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr57568.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr57568.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr57568.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr57568.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,12 @@ +/* PR target/57568 */ + +extern void abort (void); +int a[6][9] = { }, b = 1, *c = &a[3][5]; + +int +main () +{ + if (b && (*c = *c + *c)) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr57829.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr57829.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr57829.c (added) +++ 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr57829.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,31 @@ +/* PR rtl-optimization/57829 */ + +__attribute__((noinline, noclone)) +int +f1 (int k) +{ + return 2 | ((k - 1) >> ((int) sizeof (int) * __CHAR_BIT__ - 1)); +} + +__attribute__((noinline, noclone)) +long int +f2 (long int k) +{ + return 2L | ((k - 1L) >> ((int) sizeof (long int) * __CHAR_BIT__ - 1)); +} + +__attribute__((noinline, noclone)) +int +f3 (int k) +{ + k &= 63; + return 4 | ((k + 2) >> 5); +} + +int +main () +{ + if (f1 (1) != 2 || f2 (1L) != 2L || f3 (63) != 6 || f3 (1) != 4) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr57860.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr57860.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr57860.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr57860.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,25 @@ +/* PR rtl-optimization/57860 */ + +extern void abort (void); +int a, *b = &a, c, d, e, *f = &e, g, *h = &d, k[1] = { 1 }; + +int +foo (int p) +{ + for (;; g++) + { + for (; c; c--); + *f = *h = p > ((0x1FFFFFFFFLL ^ a) & *b); + if (k[g]) + return 0; + } +} + +int +main () +{ + foo (1); + if (d != 1) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr57861.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr57861.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr57861.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr57861.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,33 @@ +/* PR 
rtl-optimization/57861 */ + +extern void abort (void); +short a = 1, f; +int b, c, d, *g = &b, h, i, j; +unsigned int e; + +static int +foo (char p) +{ + int k; + for (c = 0; c < 2; c++) + { + i = (j = 0) || p; + k = i * p; + if (e < k) + { + short *l = &f; + a = d && h; + *l = 0; + } + } + return 0; +} + +int +main () +{ + *g = foo (a); + if (a != 0) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr57875.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr57875.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr57875.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr57875.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,21 @@ +/* PR rtl-optimization/57875 */ + +extern void abort (void); +int a[1], b, c, d, f, i; +char e[1]; + +int +main () +{ + for (; i < 1; i++) + if (!d) + { + if (!c) + f = 2; + e[0] &= f ^= 0; + } + b = a[e[0] >> 1 & 1]; + if (b != 0) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr57876.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr57876.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr57876.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr57876.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,27 @@ +/* PR rtl-optimization/57876 */ + +extern void abort (void); +int a, b = 1, c, *d = &c, f, *g, h, j; +static int e; + +int +main () +{ + int i; + for (i = 0; i < 2; i++) + { + long long k = b; + int l; + for (f = 0; f < 8; f++) + { + int *m = &e; + j = *d; + h = a * j - 1; + *m = (h == 0) < k; + g = &l; + } + } + if (e != 1) + abort (); + return 
0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr57877.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr57877.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr57877.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr57877.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,28 @@ +/* PR rtl-optimization/57877 */ + +extern void abort (void); +int a, b, *c = &b, e, f = 6, g, h; +short d; + +static unsigned char +foo (unsigned long long p1, int *p2) +{ + for (; g <= 0; g++) + { + short *i = &d; + int *j = &e; + h = *c; + *i = h; + *j = (*i == *p2) < p1; + } + return 0; +} + +int +main () +{ + foo (f, &a); + if (e != 1) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58209.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58209.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58209.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58209.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,32 @@ +/* PR tree-optimization/58209 */ + +extern void abort (void); +typedef __INTPTR_TYPE__ T; +T buf[1024]; + +T * +foo (T n) +{ + if (n == 0) + return (T *) buf; + T s = (T) foo (n - 1); + return (T *) (s + sizeof (T)); +} + +T * +bar (T n) +{ + if (n == 0) + return buf; + return foo (n - 1) + 1; +} + +int +main () +{ + int i; + for (i = 0; i < 27; i++) + if (foo (i) != buf + i || bar (i) != buf + i) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58277-1.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58277-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58277-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58277-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,102 @@ +/* PR tree-optimization/58277 */ + +extern void abort (void); +static int a[2]; +int b, c, d, *e, f, g, h, **i = &e, k, l = 1, n, o, p; +static int **volatile j = &e; +const int m; +char u; + +int +bar () +{ + u = 0; + return m; +} + +__attribute__((noinline, noclone)) void +baz () +{ + asm (""); +} + +static int +foo () +{ + int t1; + g = bar (); + if (l) + ; + else + for (;; h++) + { + *i = 0; + o = *e = 0; + if (p) + { + f = 0; + return 0; + } + for (;; k++) + { + int *t2 = 0; + int *const *t3[] = { + 0, 0, 0, 0, 0, 0, 0, 0, 0, &t2, 0, 0, &t2, &t2, &t2, + &t2, &t2, 0, 0, 0, 0, 0, 0, 0, &t2, 0, 0, 0, 0, 0, 0, + 0, 0, 0, 0, &t2, 0, 0, 0, 0, 0, 0, 0, &t2, &t2, + &t2, &t2, &t2, 0, 0, 0, 0, 0, 0, 0, &t2, 0, 0, 0, + &t2, 0, 0, 0, &t2, 0, &t2, 0, 0, &t2, 0, 0, 0, 0, + 0, &t2, 0, 0, 0, 0, &t2, &t2, 0, 0, 0, 0, &t2, 0, + 0, 0, 0, 0, 0, 0, &t2, 0, 0, 0, 0, 0, &t2, 0, 0, 0, + &t2, &t2 + }; + int *const **t4[] = {&t3[0]}; + **i = 0; + if (**j) + break; + u = 0; + } + *i = *j; + t1 = 0; + for (; t1 < 5; t1++) + *i = *j; + } + *j = 0; + return 1; +} + +int +main () +{ + int t5; + a[0] = 1; + { + int *t6[6] = {&d, &d}; + for (n = 1; n; n--) + if (foo()) + { + int *t7[] = {0}; + d = 0; + for (; u < 1; u++) + *i = *j; + *i = 0; + *i = 0; + int t8[5] = {0}; + *i = &t8[0]; + int *const *t9 = &t6[0]; + int *const **t10 = &t9; + *t10 = &t7[0]; + } + } + u = 0; + for (; b; b++) + for (t5 = 0; t5 < 10; t5++) + c = a[a[a[a[a[a[a[a[c]]]]]]]]; + + baz (); + + if (!a[a[a[a[a[a[a[a[a[a[a[a[a[a[a[u]]]]]]]]]]]]]]]) + abort (); + + return 0; +} From llvm-commits at 
lists.llvm.org Wed Oct 9 04:01:53 2019 From: llvm-commits at lists.llvm.org (Sam Elliott via llvm-commits) Date: Wed, 09 Oct 2019 11:01:53 -0000 Subject: [test-suite] r374156 - Add GCC Torture Suite Sources Message-ID: <20191009110200.57325907B8@lists.llvm.org> Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58277-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58277-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58277-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58277-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,98 @@ +/* PR tree-optimization/58277 */ + +extern void abort (void); +static int a[1], b, c, e, i, j, k, m, q[] = { 1, 1 }, t; +int volatile d; +int **r; +static int ***volatile s = &r; +int f, g, o, x; +static int *volatile h = &f, *p; +char n; + +static void +fn1 () +{ + b = a[a[a[a[a[a[a[a[b]]]]]]]]; + b = a[a[a[a[a[a[a[a[b]]]]]]]]; + b = a[a[b]]; + b = a[a[a[a[a[a[a[a[b]]]]]]]]; + b = a[a[a[a[a[a[a[a[b]]]]]]]]; +} + +static int +fn2 () +{ + n = 0; + for (; g; t++) + { + for (;; m++) + { + d; + int *u; + int **v[] = { + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, + 0, 0, 0, 0, 0, &u, 0, 0, 0, 0, &u, &u, &u, &u, &u, &u, &u, 0, + &u, 0, &u, &u, &u, 0, &u, &u, 0, &u, &u, &u, &u, 0, &u, &u, &u, + &u, &u, 0, &u, &u, 0, &u, 0, &u, &u, 0, &u, &u, &u, &u, &u, 0, + &u, 0, 0, 0, &u, &u, &u, 0, 0, &u, &u, &u, 0, &u, 0, &u, &u + }; + int ***w[] = { &v[0] }; + if (*p) + break; + return 0; + } + *h = 0; + } + return 1; +} + +static void +fn3 () 
+{ + int *y[] = { 0, 0, 0, 0, 0, 0, 0, 0 }; + for (; i; i++) + x = 0; + if (fn2 ()) + { + int *z[6] = { }; + for (; n < 1; n++) + *h = 0; + int t1[7]; + for (; c; c++) + o = t1[0]; + for (; e; e--) + { + int **t2 = &y[0]; + int ***t3 = &t2; + *t3 = &z[0]; + } + } + *s = 0; + for (n = 0;; n = 0) + { + int t4 = 0; + if (q[n]) + break; + *r = &t4; + } +} + +int +main () +{ + for (; j; j--) + a[0] = 0; + fn3 (); + for (; k; k++) + fn1 (); + fn1 (); + + if (n) + abort (); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58364.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58364.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58364.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58364.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,17 @@ +/* PR tree-optimization/58364 */ + +int a = 1, b, c; + +int +foo (int x) +{ + return x < 0 ? 
1 : x; +} + +int +main () +{ + if (foo (a > c == (b = 0))) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58365.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58365.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58365.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58365.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,35 @@ +/* PR rtl-optimization/58365 */ + +extern void abort (void); + +struct S +{ + volatile int a; + int b, c, d, e; +} f; +static struct S g, h; +int i = 1; + +char +foo (void) +{ + return i; +} + +static struct S +bar (void) +{ + if (foo ()) + return f; + return g; +} + +int +main () +{ + h = bar (); + f.b = 1; + if (h.b != 0) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58385.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58385.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58385.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58385.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,21 @@ +/* PR tree-optimization/58385 */ + +extern void abort (void); + +int a, b = 1; + +int +foo () +{ + b = 0; + return 0; +} + +int +main () +{ + ((0 || a) & foo () >= 0) <= 1 && 1; + if (b) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58387.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58387.c?rev=374156&view=auto ============================================================================== --- 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58387.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58387.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,11 @@ +extern void abort(void); + +int a = -1; + +int main () +{ + int b = a == 0 ? 0 : -a; + if (b < 1) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58419.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58419.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58419.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58419.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,41 @@ +__attribute__((__noinline__)) +void +dummy () +{ + asm volatile(""); +} + +int a, g, i, k, *p; +signed char b; +char e; +short c, h; +static short *d = &c; + +char +foo (int p1, int p2) +{ + return p1 - p2; +} + +int +bar () +{ + short *q = &c; + *q = 1; + *p = 0; + return 0; +} + +int +main () +{ + for (b = -22; b >= -29; b--) + { + short *l = &h; + char *m = &e; + *l = a; + g = foo (*m = k && *d, 1 > i) || bar (); + } + dummy(); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58431.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58431.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58431.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58431.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,33 @@ +char a, h; +int b, d, e, g, j, k; +volatile int c; +short i; + +int +main () +{ + int m; + + m = i ^= 1; + for (b = 0; b < 1; b++) + { + char o = m; + g = k; + j = j || c; + if (a != o) + for (; d < 1; d++) + 
; + else + { + char *p = &h; + *p = 1; + for (; e; e++) + ; + } + } + + if (h != 0) + __builtin_abort(); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58564.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58564.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58564.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58564.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,14 @@ +/* PR middle-end/58564 */ + +extern void abort (void); +int a, b; +short *c, **d = &c; + +int +main () +{ + b = (0, 0 > ((&c == d) & (1 && (a ^ 1)))) | 0U; + if (b != 0) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58570.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58570.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58570.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58570.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,23 @@ +/* { dg-require-effective-target int32plus } */ +#pragma pack(1) +struct S +{ + int f0:15; + int f1:29; +}; + +int e = 1, i; +static struct S d[6]; + +int +main (void) +{ + if (e) + { + d[i].f0 = 1; + d[i].f1 = 1; + } + if (d[0].f1 != 1) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58574.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58574.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58574.c (added) +++ 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58574.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,219 @@ +/* PR target/58574 */ + +__attribute__((noinline, noclone)) double +foo (double x) +{ + double t; + switch ((int) x) + { + case 0: + t = 2 * x - 1; + return 0.70878e-3 + (0.71234e-3 + (0.35779e-5 + (0.17403e-7 + (0.81710e-10 + (0.36885e-12 + 0.15917e-14 * t) * t) * t) * t) * t) * t; + case 1: + t = 2 * x - 3; + return 0.21479e-2 + (0.72686e-3 + (0.36843e-5 + (0.18071e-7 + (0.85496e-10 + (0.38852e-12 + 0.16868e-14 * t) * t) * t) * t) * t) * t; + case 2: + t = 2 * x - 5; + return 0.36165e-2 + (0.74182e-3 + (0.37948e-5 + (0.18771e-7 + (0.89484e-10 + (0.40935e-12 + 0.17872e-14 * t) * t) * t) * t) * t) * t; + case 3: + t = 2 * x - 7; + return 0.51154e-2 + (0.75722e-3 + (0.39096e-5 + (0.19504e-7 + (0.93687e-10 + (0.43143e-12 + 0.18939e-14 * t) * t) * t) * t) * t) * t; + case 4: + t = 2 * x - 9; + return 0.66457e-2 + (0.77310e-3 + (0.40289e-5 + (0.20271e-7 + (0.98117e-10 + (0.45484e-12 + 0.20076e-14 * t) * t) * t) * t) * t) * t; + case 5: + t = 2 * x - 11; + return 0.82082e-2 + (0.78946e-3 + (0.41529e-5 + (0.21074e-7 + (0.10278e-9 + (0.47965e-12 + 0.21285e-14 * t) * t) * t) * t) * t) * t; + case 6: + t = 2 * x - 13; + return 0.98039e-2 + (0.80633e-3 + (0.42819e-5 + (0.21916e-7 + (0.10771e-9 + (0.50595e-12 + 0.22573e-14 * t) * t) * t) * t) * t) * t; + case 7: + t = 2 * x - 15; + return 0.11433e-1 + (0.82372e-3 + (0.44160e-5 + (0.22798e-7 + (0.11291e-9 + (0.53386e-12 + 0.23944e-14 * t) * t) * t) * t) * t) * t; + case 8: + t = 2 * x - 17; + return 0.13099e-1 + (0.84167e-3 + (0.45555e-5 + (0.23723e-7 + (0.11839e-9 + (0.56346e-12 + 0.25403e-14 * t) * t) * t) * t) * t) * t; + case 9: + t = 2 * x - 19; + return 0.14800e-1 + (0.86018e-3 + (0.47008e-5 + (0.24694e-7 + (0.12418e-9 + (0.59486e-12 + 0.26957e-14 * t) * t) * t) * t) * t) * t; + case 10: + t = 2 * x - 21; + return 0.16540e-1 + (0.87928e-3 + (0.48520e-5 + (0.25711e-7 + (0.13030e-9 + (0.62820e-12 + 
0.28612e-14 * t) * t) * t) * t) * t) * t; + case 11: + t = 2 * x - 23; + return 0.18318e-1 + (0.89900e-3 + (0.50094e-5 + (0.26779e-7 + (0.13675e-9 + (0.66358e-12 + 0.30375e-14 * t) * t) * t) * t) * t) * t; + case 12: + t = 2 * x - 25; + return 0.20136e-1 + (0.91936e-3 + (0.51734e-5 + (0.27900e-7 + (0.14357e-9 + (0.70114e-12 + 0.32252e-14 * t) * t) * t) * t) * t) * t; + case 13: + t = 2 * x - 27; + return 0.21996e-1 + (0.94040e-3 + (0.53443e-5 + (0.29078e-7 + (0.15078e-9 + (0.74103e-12 + 0.34251e-14 * t) * t) * t) * t) * t) * t; + case 14: + t = 2 * x - 29; + return 0.23898e-1 + (0.96213e-3 + (0.55225e-5 + (0.30314e-7 + (0.15840e-9 + (0.78340e-12 + 0.36381e-14 * t) * t) * t) * t) * t) * t; + case 15: + t = 2 * x - 31; + return 0.25845e-1 + (0.98459e-3 + (0.57082e-5 + (0.31613e-7 + (0.16646e-9 + (0.82840e-12 + 0.38649e-14 * t) * t) * t) * t) * t) * t; + case 16: + t = 2 * x - 33; + return 0.27837e-1 + (0.10078e-2 + (0.59020e-5 + (0.32979e-7 + (0.17498e-9 + (0.87622e-12 + 0.41066e-14 * t) * t) * t) * t) * t) * t; + case 17: + t = 2 * x - 35; + return 0.29877e-1 + (0.10318e-2 + (0.61041e-5 + (0.34414e-7 + (0.18399e-9 + (0.92703e-12 + 0.43639e-14 * t) * t) * t) * t) * t) * t; + case 18: + t = 2 * x - 37; + return 0.31965e-1 + (0.10566e-2 + (0.63151e-5 + (0.35924e-7 + (0.19353e-9 + (0.98102e-12 + 0.46381e-14 * t) * t) * t) * t) * t) * t; + case 19: + t = 2 * x - 39; + return 0.34104e-1 + (0.10823e-2 + (0.65354e-5 + (0.37512e-7 + (0.20362e-9 + (0.10384e-11 + 0.49300e-14 * t) * t) * t) * t) * t) * t; + case 20: + t = 2 * x - 41; + return 0.36295e-1 + (0.11089e-2 + (0.67654e-5 + (0.39184e-7 + (0.21431e-9 + (0.10994e-11 + 0.52409e-14 * t) * t) * t) * t) * t) * t; + case 21: + t = 2 * x - 43; + return 0.38540e-1 + (0.11364e-2 + (0.70058e-5 + (0.40943e-7 + (0.22563e-9 + (0.11642e-11 + 0.55721e-14 * t) * t) * t) * t) * t) * t; + case 22: + t = 2 * x - 45; + return 0.40842e-1 + (0.11650e-2 + (0.72569e-5 + (0.42796e-7 + (0.23761e-9 + (0.12332e-11 + 0.59246e-14 * t) * t) * t) * t) 
* t) * t; + case 23: + t = 2 * x - 47; + return 0.43201e-1 + (0.11945e-2 + (0.75195e-5 + (0.44747e-7 + (0.25030e-9 + (0.13065e-11 + 0.63000e-14 * t) * t) * t) * t) * t) * t; + case 24: + t = 2 * x - 49; + return 0.45621e-1 + (0.12251e-2 + (0.77941e-5 + (0.46803e-7 + (0.26375e-9 + (0.13845e-11 + 0.66996e-14 * t) * t) * t) * t) * t) * t; + case 25: + t = 2 * x - 51; + return 0.48103e-1 + (0.12569e-2 + (0.80814e-5 + (0.48969e-7 + (0.27801e-9 + (0.14674e-11 + 0.71249e-14 * t) * t) * t) * t) * t) * t; + case 26: + t = 2 * x - 59; + return 0.58702e-1 + (0.13962e-2 + (0.93714e-5 + (0.58882e-7 + (0.34414e-9 + (0.18552e-11 + 0.91160e-14 * t) * t) * t) * t) * t) * t; + case 30: + t = 2 * x - 79; + return 0.90908e-1 + (0.18544e-2 + (0.13903e-4 + (0.95549e-7 + (0.59752e-9 + (0.33656e-11 + 0.16815e-13 * t) * t) * t) * t) * t) * t; + case 40: + t = 2 * x - 99; + return 0.13443e0 + (0.25474e-2 + (0.21385e-4 + (0.15996e-6 + (0.10585e-8 + (0.61258e-11 + 0.30412e-13 * t) * t) * t) * t) * t) * t; + case 50: + t = 2 * x - 119; + return 0.19540e0 + (0.36342e-2 + (0.34096e-4 + (0.27479e-6 + (0.18934e-8 + (0.11021e-10 + 0.52931e-13 * t) * t) * t) * t) * t) * t; + case 60: + t = 2 * x - 121; + return 0.20281e0 + (0.37739e-2 + (0.35791e-4 + (0.29038e-6 + (0.20068e-8 + (0.11673e-10 + 0.55790e-13 * t) * t) * t) * t) * t) * t; + case 61: + t = 2 * x - 123; + return 0.21050e0 + (0.39206e-2 + (0.37582e-4 + (0.30691e-6 + (0.21270e-8 + (0.12361e-10 + 0.58770e-13 * t) * t) * t) * t) * t) * t; + case 62: + t = 2 * x - 125; + return 0.21849e0 + (0.40747e-2 + (0.39476e-4 + (0.32443e-6 + (0.22542e-8 + (0.13084e-10 + 0.61873e-13 * t) * t) * t) * t) * t) * t; + case 63: + t = 2 * x - 127; + return 0.22680e0 + (0.42366e-2 + (0.41477e-4 + (0.34300e-6 + (0.23888e-8 + (0.13846e-10 + 0.65100e-13 * t) * t) * t) * t) * t) * t; + case 64: + t = 2 * x - 129; + return 0.23545e0 + (0.44067e-2 + (0.43594e-4 + (0.36268e-6 + (0.25312e-8 + (0.14647e-10 + 0.68453e-13 * t) * t) * t) * t) * t) * t; + case 65: + t = 2 * x 
- 131; + return 0.24444e0 + (0.45855e-2 + (0.45832e-4 + (0.38352e-6 + (0.26819e-8 + (0.15489e-10 + 0.71933e-13 * t) * t) * t) * t) * t) * t; + case 66: + t = 2 * x - 133; + return 0.25379e0 + (0.47735e-2 + (0.48199e-4 + (0.40561e-6 + (0.28411e-8 + (0.16374e-10 + 0.75541e-13 * t) * t) * t) * t) * t) * t; + case 67: + t = 2 * x - 135; + return 0.26354e0 + (0.49713e-2 + (0.50702e-4 + (0.42901e-6 + (0.30095e-8 + (0.17303e-10 + 0.79278e-13 * t) * t) * t) * t) * t) * t; + case 68: + t = 2 * x - 137; + return 0.27369e0 + (0.51793e-2 + (0.53350e-4 + (0.45379e-6 + (0.31874e-8 + (0.18277e-10 + 0.83144e-13 * t) * t) * t) * t) * t) * t; + case 69: + t = 2 * x - 139; + return 0.28426e0 + (0.53983e-2 + (0.56150e-4 + (0.48003e-6 + (0.33752e-8 + (0.19299e-10 + 0.87139e-13 * t) * t) * t) * t) * t) * t; + case 70: + t = 2 * x - 141; + return 0.29529e0 + (0.56288e-2 + (0.59113e-4 + (0.50782e-6 + (0.35735e-8 + (0.20369e-10 + 0.91262e-13 * t) * t) * t) * t) * t) * t; + case 71: + t = 2 * x - 143; + return 0.30679e0 + (0.58714e-2 + (0.62248e-4 + (0.53724e-6 + (0.37827e-8 + (0.21490e-10 + 0.95513e-13 * t) * t) * t) * t) * t) * t; + case 72: + t = 2 * x - 145; + return 0.31878e0 + (0.61270e-2 + (0.65564e-4 + (0.56837e-6 + (0.40035e-8 + (0.22662e-10 + 0.99891e-13 * t) * t) * t) * t) * t) * t; + case 73: + t = 2 * x - 147; + return 0.33130e0 + (0.63962e-2 + (0.69072e-4 + (0.60133e-6 + (0.42362e-8 + (0.23888e-10 + 0.10439e-12 * t) * t) * t) * t) * t) * t; + case 74: + t = 2 * x - 149; + return 0.34438e0 + (0.66798e-2 + (0.72783e-4 + (0.63619e-6 + (0.44814e-8 + (0.25168e-10 + 0.10901e-12 * t) * t) * t) * t) * t) * t; + case 75: + t = 2 * x - 151; + return 0.35803e0 + (0.69787e-2 + (0.76710e-4 + (0.67306e-6 + (0.47397e-8 + (0.26505e-10 + 0.11376e-12 * t) * t) * t) * t) * t) * t; + case 76: + t = 2 * x - 153; + return 0.37230e0 + (0.72938e-2 + (0.80864e-4 + (0.71206e-6 + (0.50117e-8 + (0.27899e-10 + 0.11862e-12 * t) * t) * t) * t) * t) * t; + case 77: + t = 2 * x - 155; + return 0.38722e0 + 
(0.76260e-2 + (0.85259e-4 + (0.75329e-6 + (0.52979e-8 + (0.29352e-10 + 0.12360e-12 * t) * t) * t) * t) * t) * t; + case 78: + t = 2 * x - 157; + return 0.40282e0 + (0.79762e-2 + (0.89909e-4 + (0.79687e-6 + (0.55989e-8 + (0.30866e-10 + 0.12868e-12 * t) * t) * t) * t) * t) * t; + case 79: + t = 2 * x - 159; + return 0.41914e0 + (0.83456e-2 + (0.94827e-4 + (0.84291e-6 + (0.59154e-8 + (0.32441e-10 + 0.13387e-12 * t) * t) * t) * t) * t) * t; + case 80: + t = 2 * x - 161; + return 0.43621e0 + (0.87352e-2 + (0.10002e-3 + (0.89156e-6 + (0.62480e-8 + (0.34079e-10 + 0.13917e-12 * t) * t) * t) * t) * t) * t; + case 81: + t = 2 * x - 163; + return 0.45409e0 + (0.91463e-2 + (0.10553e-3 + (0.94293e-6 + (0.65972e-8 + (0.35782e-10 + 0.14455e-12 * t) * t) * t) * t) * t) * t; + case 82: + t = 2 * x - 165; + return 0.47282e0 + (0.95799e-2 + (0.11135e-3 + (0.99716e-6 + (0.69638e-8 + (0.37549e-10 + 0.15003e-12 * t) * t) * t) * t) * t) * t; + case 83: + t = 2 * x - 167; + return 0.49243e0 + (0.10037e-1 + (0.11750e-3 + (0.10544e-5 + (0.73484e-8 + (0.39383e-10 + 0.15559e-12 * t) * t) * t) * t) * t) * t; + case 84: + t = 2 * x - 169; + return 0.51298e0 + (0.10520e-1 + (0.12400e-3 + (0.11147e-5 + (0.77517e-8 + (0.41283e-10 + 0.16122e-12 * t) * t) * t) * t) * t) * t; + case 85: + t = 2 * x - 171; + return 0.53453e0 + (0.11030e-1 + (0.13088e-3 + (0.11784e-5 + (0.81743e-8 + (0.43252e-10 + 0.16692e-12 * t) * t) * t) * t) * t) * t; + case 86: + t = 2 * x - 173; + return 0.55712e0 + (0.11568e-1 + (0.13815e-3 + (0.12456e-5 + (0.86169e-8 + (0.45290e-10 + 0.17268e-12 * t) * t) * t) * t) * t) * t; + case 87: + t = 2 * x - 175; + return 0.58082e0 + (0.12135e-1 + (0.14584e-3 + (0.13164e-5 + (0.90803e-8 + (0.47397e-10 + 0.17850e-12 * t) * t) * t) * t) * t) * t; + case 88: + t = 2 * x - 177; + return 0.60569e0 + (0.12735e-1 + (0.15396e-3 + (0.13909e-5 + (0.95651e-8 + (0.49574e-10 + 0.18435e-12 * t) * t) * t) * t) * t) * t; + case 89: + t = 2 * x - 179; + return 0.63178e0 + (0.13368e-1 + (0.16254e-3 + 
(0.14695e-5 + (0.10072e-7 + (0.51822e-10 + 0.19025e-12 * t) * t) * t) * t) * t) * t; + case 90: + t = 2 * x - 181; + return 0.65918e0 + (0.14036e-1 + (0.17160e-3 + (0.15521e-5 + (0.10601e-7 + (0.54140e-10 + 0.19616e-12 * t) * t) * t) * t) * t) * t; + case 91: + t = 2 * x - 183; + return 0.68795e0 + (0.14741e-1 + (0.18117e-3 + (0.16392e-5 + (0.11155e-7 + (0.56530e-10 + 0.20209e-12 * t) * t) * t) * t) * t) * t; + case 92: + t = 2 * x - 185; + return 0.71818e0 + (0.15486e-1 + (0.19128e-3 + (0.17307e-5 + (0.11732e-7 + (0.58991e-10 + 0.20803e-12 * t) * t) * t) * t) * t) * t; + case 93: + t = 2 * x - 187; + return 0.74993e0 + (0.16272e-1 + (0.20195e-3 + (0.18269e-5 + (0.12335e-7 + (0.61523e-10 + 0.21395e-12 * t) * t) * t) * t) * t) * t; + } + return 1.0; +} + +int +main () +{ +#ifdef __s390x__ + { + register unsigned long r5 __asm ("r5"); + r5 = 0xdeadbeefUL; + asm volatile ("":"+r" (r5)); + } +#endif + double d = foo (78.4); + if (d < 0.38 || d > 0.42) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58640-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58640-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58640-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58640-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,35 @@ +extern void abort (void); + +int a[20], b, c; + +int +fn1 () +{ + int d, e, f, g = 0; + + a[12] = 1; + for (e = 0; e < 3; e++) + for (d = 0; d < 2; d++) + { + for (f = 0; f < 2; f++) + { + g ^= a[12] > 1; + if (g) + return 0; + if (b) + break; + } + for (c = 0; c < 1; c++) + a[d] = a[e * 3 + 9]; + } + return 0; +} + +int +main () +{ + fn1 (); + if (a[0] != 0) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58640.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58640.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58640.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58640.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,32 @@ +int a, b, c, d = 1, e; + +static signed char +foo () +{ + int f, g = a; + + for (f = 1; f < 3; f++) + for (; b < 1; b++) + { + if (d) + for (c = 0; c < 4; c++) + for (f = 0; f < 3; f++) + { + for (e = 0; e < 1; e++) + a = g; + if (f) + break; + } + else if (f) + continue; + return 0; + } + return 0; +} + +int +main () +{ + foo (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58662.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58662.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58662.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58662.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,22 @@ +extern void abort (void); + +int a, c, d; +volatile int b; + +static int +foo (int p1, short p2) +{ + return p1 / p2; +} + +int +main () +{ + char e; + d = foo (a == 0, (0, 35536)); + e = d % 14; + b = e && c; + if (b != 0) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58726.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58726.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58726.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58726.c Wed Oct 9 
04:01:46 2019 @@ -0,0 +1,26 @@ +/* PR rtl-optimization/58726 */ + +int a, c; +union { int f1; int f2 : 1; } b; + +short +foo (short p) +{ + return p < 0 ? p : a; +} + +int +main () +{ + if (sizeof (short) * __CHAR_BIT__ != 16 + || sizeof (int) * __CHAR_BIT__ != 32) + return 0; + b.f1 = 56374; + unsigned short d; + int e = b.f2; + d = e == 0 ? b.f1 : 0; + c = foo (d); + if (c != (short) 56374) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58831.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58831.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58831.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58831.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,40 @@ +#include <assert.h> + +int a, *b, c, d, f, **i, p, q, *r; +short o, j; + +static int __attribute__((noinline, noclone)) +fn1 (int *p1, int **p2) +{ + int **e = &b; + for (; p; p++) + *p1 = 1; + *e = *p2 = &d; + + assert (r); + + return c; +} + +static int ** __attribute__((noinline, noclone)) +fn2 (void) +{ + for (f = 0; f != 42; f++) + { + int *g[3] = {0, 0, 0}; + for (o = 0; o; o--) + for (; a > 1;) + { + int **h[1] = { &g[2] }; + } + } + return &r; +} + +int +main (void) +{ + i = fn2 (); + fn1 (b, i); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58943.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58943.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58943.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58943.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,19 @@ +/* PR c/58943 */ + +unsigned int x[1] = 
{ 2 }; + +unsigned int +foo (void) +{ + x[0] |= 128; + return 1; +} + +int +main () +{ + x[0] |= foo (); + if (x[0] != 131) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58984.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58984.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58984.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr58984.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,57 @@ +/* PR tree-optimization/58984 */ + +struct S { int f0 : 8; int : 6; int f1 : 5; }; +struct T { char f0; int : 6; int f1 : 5; }; + +int a, *c = &a, e, n, b, m; + +static int +foo (struct S p) +{ + const unsigned short *f[36]; + for (; e < 2; e++) + { + const unsigned short **i = &f[0]; + *c ^= 1; + if (p.f1) + { + *i = 0; + return b; + } + } + return 0; +} + +static int +bar (struct T p) +{ + const unsigned short *f[36]; + for (; e < 2; e++) + { + const unsigned short **i = &f[0]; + *c ^= 1; + if (p.f1) + { + *i = 0; + return b; + } + } + return 0; +} + +int +main () +{ + struct S o = { 1, 1 }; + foo (o); + m = n || o.f0; + if (a != 1) + __builtin_abort (); + e = 0; + struct T p = { 1, 1 }; + bar (p); + m |= n || p.f0; + if (a != 0) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr59014-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr59014-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr59014-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr59014-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,23 @@ +/* PR tree-optimization/59014 */ + 
+__attribute__((noinline, noclone)) long long int +foo (long long int x, long long int y) +{ + if (((int) x | (int) y) != 0) + return 6; + return x + y; +} + +int +main () +{ + if (sizeof (long long) == sizeof (int)) + return 0; + int shift_half = sizeof (int) * __CHAR_BIT__ / 2; + long long int x = (3LL << shift_half) << shift_half; + long long int y = (5LL << shift_half) << shift_half; + long long int z = foo (x, y); + if (z != ((8LL << shift_half) << shift_half)) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr59014.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr59014.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr59014.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr59014.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,25 @@ +/* PR tree-optimization/59014 */ + +int a = 2, b, c, d; + +int +foo () +{ + for (;; c++) + if ((b > 0) | (a & 1)) + ; + else + { + d = a; + return 0; + } +} + +int +main () +{ + foo (); + if (d != 2) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr59101.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr59101.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr59101.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr59101.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,15 @@ +/* PR target/59101 */ + +__attribute__((noinline, noclone)) int +foo (int a) +{ + return (~a & 4102790424LL) > 0 | 6; +} + +int +main () +{ + if (foo (0) != 7) + __builtin_abort (); + return 0; +} Added: 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr59221.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr59221.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr59221.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr59221.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,20 @@ +/* { dg-xfail-if "ptxas crashes" { nvptx-*-* } { "*" } { "-O0" "-Os" } } */ + + +int a = 1, b, d; +short e; + +int +main () +{ + for (; b; b++) + ; + short f = a; + int g = 15; + e = f ? f : 1 << g; + int h = e; + d = h == 83647 ? 0 : h; + if (d != 1) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr59229.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr59229.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr59229.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr59229.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,29 @@ +int i; + +__attribute__((noinline, noclone)) void +bar (char *p) +{ + if (i < 1 || i > 6) + __builtin_abort (); + if (__builtin_memcmp (p, "abcdefg", i + 1) != 0) + __builtin_abort (); + __builtin_memset (p, ' ', 7); +} + +__attribute__((noinline, noclone)) void +foo (char *p, unsigned long l) +{ + if (l < 1 || l > 6) + return; + char buf[7]; + __builtin_memcpy (buf, p, l + 1); + bar (buf); +} + +int +main () +{ + for (i = 0; i < 16; i++) + foo ("abcdefghijklmnop", i); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr59358.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr59358.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr59358.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr59358.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,44 @@ +/* PR tree-optimization/59358 */ + +__attribute__((noinline, noclone)) int +foo (int *x, int y) +{ + int z = *x; + if (y > z && y <= 16) + while (y > z) + z *= 2; + return z; +} + +int +main () +{ + int i; + for (i = 1; i < 17; i++) + { + int j = foo (&i, 16); + int k; + if (i >= 8 && i <= 15) + k = 16 + (i - 8) * 2; + else if (i >= 4 && i <= 7) + k = 16 + (i - 4) * 4; + else if (i == 3) + k = 24; + else + k = 16; + if (j != k) + __builtin_abort (); + j = foo (&i, 7); + if (i >= 7) + k = i; + else if (i >= 4) + k = 8 + (i - 4) * 2; + else if (i == 3) + k = 12; + else + k = 8; + if (j != k) + __builtin_abort (); + } + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr59387.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr59387.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr59387.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr59387.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,19 @@ +/* PR tree-optimization/59387 */ + +int a, *d, **e = &d, f; +char c; +struct S { int f1; } b; + +int +main () +{ + for (a = -19; a; a++) + { + for (b.f1 = 0; b.f1 < 24; b.f1++) + c--; + *e = &f; + if (!d) + return 0; + } + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr59388.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr59388.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr59388.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr59388.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,11 @@ +/* PR tree-optimization/59388 */ + +int a; +struct S { unsigned int f:1; } b; + +int +main () +{ + a = (0 < b.f) | b.f; + return a; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr59413.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr59413.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr59413.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr59413.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,21 @@ +/* PR tree-optimization/59413 */ + +typedef unsigned int uint32_t; + +uint32_t a; +int b; + +int +main () +{ + uint32_t c; + for (a = 7; a <= 1; a++) + { + char d = a; + c = d; + b = a == c; + } + if (a != 7) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr59643.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr59643.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr59643.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr59643.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,39 @@ +/* PR tree-optimization/59643 */ + +#define N 32 + +__attribute__((noinline, noclone)) void +foo (double *a, double *b, double *c, double d, double e, int n) +{ + int i; + for (i = 1; i < 
n - 1; i++) + a[i] = d * (b[i] + c[i] + a[i - 1] + a[i + 1]) + e * a[i]; +} + +double expected[] = { + 0.0, 10.0, 44.0, 110.0, 232.0, 490.0, 1020.0, 2078.0, 4152.0, 8314.0, + 16652.0, 33326.0, 66664.0, 133354.0, 266748.0, 533534.0, 1067064.0, + 2134138.0, 4268300.0, 8536622.0, 17073256.0, 34146538.0, 68293116.0, + 136586270.0, 273172536.0, 546345082.0, 1092690188.0, 2185380398.0, + 4370760808.0, 8741521642.0, 17483043324.0, 6.0 +}; + +int +main () +{ + int i; + double a[N], b[N], c[N]; + if (__DBL_MANT_DIG__ <= 35) + return 0; + for (i = 0; i < N; i++) + { + a[i] = (i & 3) * 2.0; + b[i] = (i & 7) - 4; + c[i] = i & 7; + } + foo (a, b, c, 2.0, 3.0, N); + for (i = 0; i < N; i++) + if (a[i] != expected[i]) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr59747.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr59747.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr59747.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr59747.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,27 @@ +extern void abort (void); +extern void exit (int); + +int a[6], c = 1, d; +short e; + +int __attribute__ ((noinline)) +fn1 (int p) +{ + return a[p]; +} + +int +main () +{ + if (sizeof (long long) != 8) + exit (0); + + a[0] = 1; + if (c) + e--; + d = e; + long long f = e; + if (fn1 ((f >> 56) & 1) != 0) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr60003.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr60003.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr60003.c (added) +++ 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr60003.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,49 @@ +/* PR tree-optimization/60003 */ +/* { dg-require-effective-target indirect_jumps } */ + +extern void abort (void); + +unsigned long long jmp_buf[5]; + +__attribute__((noinline, noclone)) void +baz (void) +{ + __builtin_longjmp (&jmp_buf, 1); +} + +void +bar (void) +{ + baz (); +} + +__attribute__((noinline, noclone)) int +foo (int x) +{ + int a = 0; + + if (__builtin_setjmp (&jmp_buf) == 0) + { + while (1) + { + a = 1; + bar (); /* OK if baz () instead */ + } + } + else + { + if (a == 0) + return 0; + else + return x; + } +} + +int +main () +{ + if (foo (1) == 0) + abort (); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr60017.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr60017.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr60017.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr60017.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,33 @@ +/* PR target/60017 */ + +extern void abort (void); + +struct S0 +{ + short m0; + short m1; +}; + +struct S1 +{ + unsigned m0:1; + char m1[2][2]; + struct S0 m2[2]; +}; + +struct S1 x = { 1, {{2, 3}, {4, 5}}, {{6, 7}, {8, 9}} }; + +struct S1 func (void) +{ + return x; +} + +int main (void) +{ + struct S1 ret = func (); + + if (ret.m2[1].m1 != 9) + abort (); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr60062.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr60062.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr60062.c (added) +++ 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr60062.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,25 @@ +/* PR target/60062 */ + +int a; + +static void +foo (const char *p1, int p2) +{ + if (__builtin_strcmp (p1, "hello") != 0) + __builtin_abort (); +} + +static void +bar (const char *p1) +{ + if (__builtin_strcmp (p1, "hello") != 0) + __builtin_abort (); +} + +__attribute__((optimize (0))) int +main () +{ + foo ("hello", a); + bar ("hello"); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr60072.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr60072.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr60072.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr60072.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,16 @@ +/* PR target/60072 */ + +int c = 1; + +__attribute__ ((optimize (1))) +static int *foo (int *p) +{ + return p; +} + +int +main () +{ + *foo (&c) = 2; + return c - 2; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr60454.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr60454.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr60454.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr60454.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,31 @@ +#ifdef __UINT32_TYPE__ +typedef __UINT32_TYPE__ uint32_t; +#else +typedef unsigned uint32_t; +#endif + +#define __fake_const_swab32(x) ((uint32_t)( \ + (((uint32_t)(x) & (uint32_t)0x000000ffUL) << 24) | \ + (((uint32_t)(x) & (uint32_t)0x0000ff00UL) << 8) | \ + (((uint32_t)(x) & (uint32_t)0x000000ffUL) << 8) | \ + (((uint32_t)(x) & 
(uint32_t)0x0000ff00UL) ) | \ + (((uint32_t)(x) & (uint32_t)0xff000000UL) >> 24))) + +/* Previous version of bswap optimization would detect byte swap when none + happen. This test aims at catching such wrong detection to avoid + regressions. */ + +__attribute__ ((noinline, noclone)) uint32_t +fake_swap32 (uint32_t in) +{ + return __fake_const_swab32 (in); +} + +int main(void) +{ + if (sizeof (uint32_t) * __CHAR_BIT__ != 32) + return 0; + if (fake_swap32 (0x12345678UL) != 0x78567E12UL) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr60822.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr60822.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr60822.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr60822.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,24 @@ +/* { dg-require-effective-target int32plus } */ +struct X { + char fill0[800000]; + int a; + char fill1[900000]; + int b; +}; + +int __attribute__((noinline,noclone)) +Avg(struct X *p, int s) +{ + return (s * (long long)(p->a + p->b)) >> 17; +} + +struct X x; + +int main() +{ + x.a = 1 << 17; + x.b = 2 << 17; + if (Avg(&x, 1) != 3) + __builtin_abort(); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr60960.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr60960.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr60960.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr60960.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,38 @@ +/* PR tree-optimization/60960 */ + +typedef unsigned char v4qi __attribute__ 
((vector_size (4))); + +__attribute__((noinline, noclone)) v4qi +f1 (v4qi v) +{ + return v / 2; +} + +__attribute__((noinline, noclone)) v4qi +f2 (v4qi v) +{ + return v / (v4qi) { 2, 2, 2, 2 }; +} + +__attribute__((noinline, noclone)) v4qi +f3 (v4qi x, v4qi y) +{ + return x / y; +} + +int +main () +{ + v4qi x = { 5, 5, 5, 5 }; + v4qi y = { 2, 2, 2, 2 }; + v4qi z = f1 (x); + if (__builtin_memcmp (&y, &z, sizeof (y)) != 0) + __builtin_abort (); + z = f2 (x); + if (__builtin_memcmp (&y, &z, sizeof (y)) != 0) + __builtin_abort (); + z = f3 (x, y); + if (__builtin_memcmp (&y, &z, sizeof (y)) != 0) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr61306-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr61306-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr61306-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr61306-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,39 @@ +#ifdef __INT32_TYPE__ +typedef __INT32_TYPE__ int32_t; +#else +typedef int int32_t; +#endif + +#ifdef __UINT32_TYPE__ +typedef __UINT32_TYPE__ uint32_t; +#else +typedef unsigned uint32_t; +#endif + +#define __fake_const_swab32(x) ((uint32_t)( \ + (((uint32_t)(x) & (uint32_t)0x000000ffUL) << 24) | \ + (((uint32_t)(x) & (uint32_t)0x0000ff00UL) << 8) | \ + (((uint32_t)(x) & (uint32_t)0x00ff0000UL) >> 8) | \ + (( (int32_t)(x) & (int32_t)0xff000000UL) >> 24))) + +/* Previous version of bswap optimization failed to consider sign extension + and as a result would replace an expression *not* doing a bswap by a + bswap. 
*/ + +__attribute__ ((noinline, noclone)) uint32_t +fake_bswap32 (uint32_t in) +{ + return __fake_const_swab32 (in); +} + +int +main(void) +{ + if (sizeof (int32_t) * __CHAR_BIT__ != 32) + return 0; + if (sizeof (uint32_t) * __CHAR_BIT__ != 32) + return 0; + if (fake_bswap32 (0x87654321) != 0xffffff87) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr61306-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr61306-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr61306-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr61306-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,40 @@ +#ifdef __INT16_TYPE__ +typedef __INT16_TYPE__ int16_t; +#else +typedef short int16_t; +#endif + +#ifdef __UINT32_TYPE__ +typedef __UINT32_TYPE__ uint32_t; +#else +typedef unsigned uint32_t; +#endif + +#define __fake_const_swab32(x) ((uint32_t)( \ + (((uint32_t) (x) & (uint32_t)0x000000ffUL) << 24) | \ + (((uint32_t)(int16_t)(x) & (uint32_t)0x00ffff00UL) << 8) | \ + (((uint32_t) (x) & (uint32_t)0x00ff0000UL) >> 8) | \ + (((uint32_t) (x) & (uint32_t)0xff000000UL) >> 24))) + + +/* Previous version of bswap optimization failed to consider sign extension + and as a result would replace an expression *not* doing a bswap by a + bswap. 
*/ + +__attribute__ ((noinline, noclone)) uint32_t +fake_bswap32 (uint32_t in) +{ + return __fake_const_swab32 (in); +} + +int +main(void) +{ + if (sizeof (uint32_t) * __CHAR_BIT__ != 32) + return 0; + if (sizeof (int16_t) * __CHAR_BIT__ != 16) + return 0; + if (fake_bswap32 (0x81828384) != 0xff838281) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr61306-3.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr61306-3.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr61306-3.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr61306-3.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,13 @@ +short a = -1; +int b; +char c; + +int +main () +{ + c = a; + b = a | c; + if (b != -1) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr61375.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr61375.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr61375.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr61375.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,35 @@ +#ifdef __UINT64_TYPE__ +typedef __UINT64_TYPE__ uint64_t; +#else +typedef unsigned long long uint64_t; +#endif + +#ifndef __SIZEOF_INT128__ +#define __int128 long long +#endif + +/* Some version of bswap optimization would ICE when analyzing a mask constant + too big for an uint64_t variable (PR210931). 
*/ + +__attribute__ ((noinline, noclone)) uint64_t +uint128_central_bitsi_ior (unsigned __int128 in1, uint64_t in2) +{ + __int128 mask = (__int128)0xffff << 56; + return ((in1 & mask) >> 56) | in2; +} + +int +main(int argc, char **argv) +{ + __int128 in = 1; +#ifdef __SIZEOF_INT128__ + in <<= 64; +#endif + if (sizeof (uint64_t) * __CHAR_BIT__ != 64) + return 0; + if (sizeof (unsigned __int128) * __CHAR_BIT__ != 128) + return 0; + if (uint128_central_bitsi_ior (in, 2) != 0x102) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr61517.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr61517.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr61517.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr61517.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,19 @@ +int a, b, *c = &a; +unsigned short d; + +int +main () +{ + unsigned int e = a; + *c = 1; + if (!b) + { + d = e; + *c = d | e; + } + + if (a != 0) + __builtin_abort (); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr61673.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr61673.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr61673.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr61673.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,50 @@ +/* PR rtl-optimization/61673 */ + +char e; + +__attribute__((noinline, noclone)) void +bar (char x) +{ + if (x != 0x54 && x != (char) 0x87) + __builtin_abort (); +} + +__attribute__((noinline, noclone)) void +foo (const char *x) +{ + char d = x[0]; + int c = d; + if ((c >= 0 && c <= 
0x7f) == 0) + e = d; + bar (d); +} + +__attribute__((noinline, noclone)) void +baz (const char *x) +{ + char d = x[0]; + int c = d; + if ((c >= 0 && c <= 0x7f) == 0) + e = d; +} + +int +main () +{ + const char c[] = { 0x54, 0x87 }; + e = 0x21; + foo (c); + if (e != 0x21) + __builtin_abort (); + foo (c + 1); + if (e != (char) 0x87) + __builtin_abort (); + e = 0x21; + baz (c); + if (e != 0x21) + __builtin_abort (); + baz (c + 1); + if (e != (char) 0x87) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr61682.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr61682.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr61682.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr61682.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,17 @@ +/* PR tree-optimization/61682 */ + +int a, b; +static int *c = &b; + +int +main () +{ + int *d = &a; + for (a = 0; a < 12; a++) + *c |= *d / 9; + + if (b != 1) + __builtin_abort (); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr61725.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr61725.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr61725.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr61725.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,14 @@ +/* PR tree-optimization/61725 */ + +int +main () +{ + int x; + for (x = -128; x <= 128; x++) + { + int a = __builtin_ffs (x); + if (x == 0 && a != 0) + __builtin_abort (); + } + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr62151.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr62151.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr62151.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr62151.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,41 @@ +/* PR rtl-optimization/62151 */ + +int a, c, d, e, f, g, h, i; +short b; + +int +fn1 () +{ + b = 0; + for (;;) + { + int j[2]; + j[f] = 0; + if (h) + d = 0; + else + { + for (; f; f++) + ; + for (a = 0; a < 1; a++) + for (;;) + { + i = b & ((b ^ 1) & 83647) ? b : b - 1; + g = 1 ? i : 0; + e = j[0]; + if (c) + break; + return 0; + } + } + } +} + +int +main () +{ + fn1 (); + if (g != -1) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr63209.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr63209.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr63209.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr63209.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,24 @@ +static int Sub(int a, int b) { + return b -a; +} + +static unsigned Select(unsigned a, unsigned b, unsigned c) { + const int pa_minus_pb = + Sub((a >> 8) & 0xff, (b >> 8) & 0xff) + + Sub((a >> 0) & 0xff, (b >> 0) & 0xff); + return (pa_minus_pb <= 0) ? 
a : b; +} + +__attribute__((noinline)) unsigned Predictor(unsigned left, const unsigned* const top) { + const unsigned pred = Select(top[1], left, top[0]); + return pred; +} + +int main(void) { + const unsigned top[2] = {0xff7a7a7a, 0xff7a7a7a}; + const unsigned left = 0xff7b7b7b; + const unsigned pred = Predictor(left, top /*+ 1*/); + if (pred == left) + return 0; + return 1; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr63302.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr63302.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr63302.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr63302.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,60 @@ +/* PR tree-optimization/63302 */ + +#ifdef __SIZEOF_INT128__ +#if __SIZEOF_INT128__ * __CHAR_BIT__ == 128 +#define USE_INT128 +#endif +#endif +#if __SIZEOF_LONG_LONG__ * __CHAR_BIT__ == 64 +#define USE_LLONG +#endif + +#ifdef USE_INT128 +__attribute__((noinline, noclone)) int +foo (__int128 x) +{ + __int128 v = x & (((__int128) -1 << 63) | 0x7ff); + + return v == 0 || v == ((__int128) -1 << 63); +} +#endif + +#ifdef USE_LLONG +__attribute__((noinline, noclone)) int +bar (long long x) +{ + long long v = x & (((long long) -1 << 31) | 0x7ff); + + return v == 0 || v == ((long long) -1 << 31); +} +#endif + +int +main () +{ +#ifdef USE_INT128 + if (foo (0) != 1 + || foo (1) != 0 + || foo (0x800) != 1 + || foo (0x801) != 0 + || foo ((__int128) 1 << 63) != 0 + || foo ((__int128) -1 << 63) != 1 + || foo (((__int128) -1 << 63) | 1) != 0 + || foo (((__int128) -1 << 63) | 0x800) != 1 + || foo (((__int128) -1 << 63) | 0x801) != 0) + __builtin_abort (); +#endif +#ifdef USE_LLONG + if (bar (0) != 1 + || bar (1) != 0 + || bar (0x800) != 1 + || bar (0x801) != 0 + || bar (1LL << 31) != 0 + || bar (-1LL << 31) 
!= 1 + || bar ((-1LL << 31) | 1) != 0 + || bar ((-1LL << 31) | 0x800) != 1 + || bar ((-1LL << 31) | 0x801) != 0) + __builtin_abort (); +#endif + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr63641.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr63641.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr63641.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr63641.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,54 @@ +/* PR tree-optimization/63641 */ + +__attribute__ ((noinline, noclone)) int +foo (unsigned char b) +{ + if (0x0 <= b && b <= 0x8) + goto lab; + if (b == 0x0b) + goto lab; + if (0x0e <= b && b <= 0x1a) + goto lab; + if (0x1c <= b && b <= 0x1f) + goto lab; + return 0; +lab: + return 1; +} + +__attribute__ ((noinline, noclone)) int +bar (unsigned char b) +{ + if (0x0 <= b && b <= 0x8) + goto lab; + if (b == 0x0b) + goto lab; + if (0x0e <= b && b <= 0x1a) + goto lab; + if (0x3c <= b && b <= 0x3f) + goto lab; + return 0; +lab: + return 1; +} + +char tab1[] = { 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 0, 0, 1, 1, + 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1 }; +char tab2[] = { 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 0, 0, 1, 1, + 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1 }; + +int +main () +{ + int i; + asm volatile ("" : : : "memory"); + for (i = 0; i < 256; i++) + if (foo (i) != (i < 32 ? tab1[i] : 0)) + __builtin_abort (); + for (i = 0; i < 256; i++) + if (bar (i) != (i < 64 ? 
tab2[i] : 0)) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr63659.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr63659.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr63659.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr63659.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,29 @@ +/* PR rtl-optimization/63659 */ + +int a, b, c, *d = &b, g, h, i; +unsigned char e; +char f; + +int +main () +{ + while (a) + { + for (a = 0; a; a++) + for (; c; c++) + ; + if (i) + break; + } + + char j = c, k = -1, l; + l = g = j >> h; + f = l == 0 ? k : k % l; + e = 0 ? 0 : f; + *d = e; + + if (b != 255) + __builtin_abort (); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr63843.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr63843.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr63843.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr63843.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,31 @@ +/* PR rtl-optimization/63843 */ + +static inline __attribute__ ((always_inline)) +unsigned short foo (unsigned short v) +{ + return (v << 8) | (v >> 8); +} + +unsigned short __attribute__ ((noinline, noclone, hot)) +bar (unsigned char *x) +{ + unsigned int a; + unsigned short b; + __builtin_memcpy (&a, &x[0], sizeof (a)); + a ^= 0x80808080U; + __builtin_memcpy (&x[0], &a, sizeof (a)); + __builtin_memcpy (&b, &x[2], sizeof (b)); + return foo (b); +} + +int +main () +{ + unsigned char x[8] = { 0x01, 0x01, 0x01, 0x01 }; + if (__CHAR_BIT__ == 8 + && sizeof (short) == 2 + && sizeof 
(int) == 4 + && bar (x) != 0x8181U) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr64006.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr64006.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr64006.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr64006.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,26 @@ +/* PR tree-optimization/64006 */ + +int v; + +long __attribute__ ((noinline, noclone)) +test (long *x, int y) +{ + int i; + long s = 1; + for (i = 0; i < y; i++) + if (__builtin_mul_overflow (s, x[i], &s)) + v++; + return s; +} + +int +main () +{ + long d[7] = { 975, 975, 975, 975, 975, 975, 975 }; + long r = test (d, 7); + if (sizeof (long) * __CHAR_BIT__ == 64 && v != 1) + __builtin_abort (); + else if (sizeof (long) * __CHAR_BIT__ == 32 && v != 4) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr64242.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr64242.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr64242.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr64242.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,37 @@ +/* { dg-require-effective-target indirect_jumps } */ + +extern void abort (void); + +__attribute ((noinline)) void +broken_longjmp (void *p) +{ + void *buf[32]; + __builtin_memcpy (buf, p, 5 * sizeof (void*)); + __builtin_memset (p, 0, 5 * sizeof (void*)); + /* Corrupts stack pointer... 
*/ + __builtin_longjmp (buf, 1); +} + +volatile int x = 0; +char *volatile p; +char *volatile q; + +int +main () +{ + void *buf[5]; + p = __builtin_alloca (x); + q = __builtin_alloca (x); + if (!__builtin_setjmp (buf)) + broken_longjmp (buf); + + /* Compute expected next alloca offset - some targets don't align properly + and allocate too much. */ + p = q + (q - p); + + /* Fails if stack pointer corrupted. */ + if (p != __builtin_alloca (x)) + abort (); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr64255.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr64255.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr64255.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr64255.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,28 @@ +/* PR rtl-optimization/64255 */ + +__attribute__((noinline, noclone)) void +bar (long i, unsigned long j) +{ + if (i != 1 || j != 1) + __builtin_abort (); +} + +__attribute__((noinline, noclone)) void +foo (long i) +{ + unsigned long j; + + if (!i) + return; + j = i >= 0 ? (unsigned long) i : - (unsigned long) i; + if ((i >= 0 ? 
(unsigned long) i : - (unsigned long) i) != j) + __builtin_abort (); + bar (i, j); +} + +int +main () +{ + foo (1); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr64260.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr64260.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr64260.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr64260.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,25 @@ +/* PR rtl-optimization/64260 */ + +int a = 1, b; + +void +foo (char p) +{ + int t = 0; + for (; b < 1; b++) + { + int *s = &a; + if (--t) + *s &= p; + *s &= 1; + } +} + +int +main () +{ + foo (0); + if (a != 0) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr64682.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr64682.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr64682.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr64682.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,26 @@ +/* PR rtl-optimization/64682 */ + +int a, b = 1; + +__attribute__((noinline, noclone)) void +foo (int x) +{ + if (x != 5) + __builtin_abort (); +} + +int +main () +{ + int i; + for (i = 0; i < 56; i++) + for (; a; a--) + ; + int *c = &b; + if (*c) + *c = 1 % (unsigned int) *c | 5; + + foo (b); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr64718.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr64718.c?rev=374156&view=auto 
============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr64718.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr64718.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,18 @@ +static int __attribute__ ((noinline, noclone)) +swap (int x) +{ + return (unsigned short) ((unsigned short) x << 8 | (unsigned short) x >> 8); +} + +static int a = 0x1234; + +int +main (void) +{ + int b = 0x1234; + if (swap (a) != 0x3412) + __builtin_abort (); + if (swap (b) != 0x3412) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr64756.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr64756.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr64756.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr64756.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,30 @@ +/* PR rtl-optimization/64756 */ + +int a, *tmp, **c = &tmp; +volatile int d; +static int *volatile *e = &tmp; +unsigned int f; + +static void +fn1 (int *p) +{ + int g; + for (; f < 1; f++) + for (g = 1; g >= 0; g--) + { + d || d; + *c = p; + + if (tmp != &a) + __builtin_abort (); + + *e = 0; + } +} + +int +main () +{ + fn1 (&a); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr64957.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr64957.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr64957.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr64957.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,23 @@ +/* PR rtl-optimization/64957 
*/ + +__attribute__((noinline, noclone)) int +foo (int b) +{ + return (((b ^ 5) | 1) ^ 5) | 1; +} + +__attribute__((noinline, noclone)) int +bar (int b) +{ + return (((b ^ ~5) & ~1) ^ ~5) & ~1; +} + +int +main () +{ + int i; + for (i = 0; i < 16; i++) + if (foo (i) != (i | 1) || bar (i) != (i & ~1)) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr64979.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr64979.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr64979.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr64979.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,36 @@ +/* PR target/64979 */ + +#include <stdarg.h> + +void __attribute__((noinline, noclone)) +bar (int x, va_list *ap) +{ + if (ap) + { + int i; + for (i = 0; i < 10; i++) + if (i != va_arg (*ap, int)) + __builtin_abort (); + if (va_arg (*ap, double) != 0.5) + __builtin_abort (); + } +} + +void __attribute__((noinline, noclone)) +foo (int x, ...) +{ + va_list ap; + int n; + + va_start (ap, x); + n = va_arg (ap, int); + bar (x, (va_list *) ((n == 0) ? 
((void *) 0) : &ap)); + va_end (ap); +} + +int +main () +{ + foo (100, 1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0.5); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr65053-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr65053-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr65053-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr65053-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,32 @@ +/* PR tree-optimization/65053 */ + +int i; + +__attribute__ ((noinline, noclone)) +unsigned int foo (void) +{ + return 0; +} + +int +main () +{ + unsigned int u = -1; + if (u == -1) + { + unsigned int n = foo (); + if (n > 0) + u = n - 1; + } + + while (u != -1) + { + asm ("" : "+g" (u)); + u = -1; + i = 1; + } + + if (i) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr65053-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr65053-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr65053-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr65053-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,27 @@ +/* PR tree-optimization/65053 */ + +int i; +unsigned int x; + +int +main () +{ + asm volatile ("" : "+g" (x)); + unsigned int n = x; + unsigned int u = 32; + if (n >= 32) + __builtin_abort (); + if (n != 0) + u = n + 32; + + while (u != 32) + { + asm ("" : : "g" (u)); + u = 32; + i = 1; + } + + if (i) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr65170.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr65170.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr65170.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr65170.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,27 @@ +/* PR tree-optimization/65170 */ + +#ifdef __SIZEOF_INT128__ +typedef unsigned __int128 V; +typedef unsigned long long int H; +#else +typedef unsigned long long int V; +typedef unsigned int H; +#endif + +__attribute__((noinline, noclone)) void +foo (V b, V c) +{ + V a; + b &= (H) -1; + c &= (H) -1; + a = b * c; + if (a != 1) + __builtin_abort (); +} + +int +main () +{ + foo (1, 1); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr65215-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr65215-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr65215-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr65215-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,24 @@ +/* PR tree-optimization/65215 */ + +static inline unsigned int +foo (unsigned int x) +{ + return (x >> 24) | ((x >> 8) & 0xff00) | ((x << 8) & 0xff0000) | (x << 24); +} + +__attribute__((noinline, noclone)) unsigned int +bar (unsigned long long *x) +{ + return foo (*x); +} + +int +main () +{ + if (__CHAR_BIT__ != 8 || sizeof (unsigned int) != 4 || sizeof (unsigned long long) != 8) + return 0; + unsigned long long l = foo (0xdeadbeefU) | 0xfeedbea800000000ULL; + if (bar (&l) != 0xdeadbeefU) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr65215-2.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr65215-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr65215-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr65215-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,24 @@ +/* PR tree-optimization/65215 */ + +static inline unsigned int +foo (unsigned int x) +{ + return (x >> 24) | ((x >> 8) & 0xff00) | ((x << 8) & 0xff0000) | (x << 24); +} + +__attribute__((noinline, noclone)) unsigned long long +bar (unsigned long long *x) +{ + return ((unsigned long long) foo (*x) << 32) | foo (*x >> 32); +} + +int +main () +{ + if (__CHAR_BIT__ != 8 || sizeof (unsigned int) != 4 || sizeof (unsigned long long) != 8) + return 0; + unsigned long long l = foo (0xfeedbea8U) | ((unsigned long long) foo (0xdeadbeefU) << 32); + if (bar (&l) != 0xfeedbea8deadbeefULL) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr65215-3.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr65215-3.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr65215-3.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr65215-3.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,31 @@ +/* PR tree-optimization/65215 */ + +struct S { unsigned long long l1 : 24, l2 : 8, l3 : 32; }; + +static inline unsigned int +foo (unsigned int x) +{ + return (x >> 24) | ((x >> 8) & 0xff00) | ((x << 8) & 0xff0000) | (x << 24); +} + +__attribute__((noinline, noclone)) unsigned long long +bar (struct S *x) +{ + unsigned long long x1 = foo (((unsigned int) x->l1 << 8) | x->l2); + unsigned long long x2 = foo (x->l3); + return (x2 << 32) | 
x1; +} + +int +main () +{ + if (__CHAR_BIT__ != 8 || sizeof (unsigned int) != 4 || sizeof (unsigned long long) != 8) + return 0; + struct S s = { 0xdeadbeU, 0xefU, 0xfeedbea8U }; + unsigned long long l = bar (&s); + if (foo (l >> 32) != s.l3 + || (foo (l) >> 8) != s.l1 + || (foo (l) & 0xff) != s.l2) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr65215-4.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr65215-4.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr65215-4.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr65215-4.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,27 @@ +/* PR tree-optimization/65215 */ + +struct S { unsigned long long l1 : 48; }; + +static inline unsigned int +foo (unsigned int x) +{ + return (x >> 24) | ((x >> 8) & 0xff00) | ((x << 8) & 0xff0000) | (x << 24); +} + +__attribute__((noinline, noclone)) unsigned int +bar (struct S *x) +{ + return foo (x->l1); +} + +int +main () +{ + if (__CHAR_BIT__ != 8 || sizeof (unsigned int) != 4 || sizeof (unsigned long long) != 8) + return 0; + struct S s; + s.l1 = foo (0xdeadbeefU) | (0xfeedULL << 32); + if (bar (&s) != 0xdeadbeefU) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr65215-5.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr65215-5.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr65215-5.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr65215-5.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,27 @@ +/* PR tree-optimization/65215 */ + +__attribute__((noinline, 
noclone)) unsigned int +foo (unsigned char *p) +{ + return ((unsigned int) p[0] << 24) | (p[1] << 16) | (p[2] << 8) | p[3]; +} + +__attribute__((noinline, noclone)) unsigned int +bar (unsigned char *p) +{ + return ((unsigned int) p[3] << 24) | (p[2] << 16) | (p[1] << 8) | p[0]; +} + +struct S { unsigned int a; unsigned char b[5]; }; + +int +main () +{ + struct S s = { 1, { 2, 3, 4, 5, 6 } }; + if (__CHAR_BIT__ != 8 || sizeof (unsigned int) != 4) + return 0; + if (foo (&s.b[1]) != 0x03040506U + || bar (&s.b[1]) != 0x06050403U) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr65216.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr65216.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr65216.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr65216.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,20 @@ +/* PR tree-optimization/65216 */ + +int a, b = 62, e; +volatile int c, d; + +int +main () +{ + int f = 0; + for (a = 0; a < 2; a++) + { + b &= (8 ^ f) & 1; + for (e = 0; e < 6; e++) + if (c) + f = d; + } + if (b != 0) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr65369.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr65369.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr65369.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr65369.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,46 @@ +/* PR tree-optimization/65369 */ +#include <stdint.h> + +static const char data[] = + "12345678901234567890123456789012345678901234567890" + "123456789012345678901234567890"; 
+ +__attribute__ ((noinline)) +static void foo (const unsigned int *buf) +{ + if (__builtin_memcmp (buf, data, 64)) + __builtin_abort (); +} + +__attribute__ ((noinline)) +static void bar (const unsigned char *block) +{ + uint32_t buf[16]; + __builtin_memcpy (buf + 0, block + 0, 4); + __builtin_memcpy (buf + 1, block + 4, 4); + __builtin_memcpy (buf + 2, block + 8, 4); + __builtin_memcpy (buf + 3, block + 12, 4); + __builtin_memcpy (buf + 4, block + 16, 4); + __builtin_memcpy (buf + 5, block + 20, 4); + __builtin_memcpy (buf + 6, block + 24, 4); + __builtin_memcpy (buf + 7, block + 28, 4); + __builtin_memcpy (buf + 8, block + 32, 4); + __builtin_memcpy (buf + 9, block + 36, 4); + __builtin_memcpy (buf + 10, block + 40, 4); + __builtin_memcpy (buf + 11, block + 44, 4); + __builtin_memcpy (buf + 12, block + 48, 4); + __builtin_memcpy (buf + 13, block + 52, 4); + __builtin_memcpy (buf + 14, block + 56, 4); + __builtin_memcpy (buf + 15, block + 60, 4); + foo (buf); +} + +int +main () +{ + unsigned char input[sizeof data + 16] __attribute__((aligned (16))); + __builtin_memset (input, 0, sizeof input); + __builtin_memcpy (input + 1, data, sizeof data); + bar (input + 1); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr65401.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr65401.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr65401.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr65401.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,59 @@ +/* PR rtl-optimization/65401 */ + +struct S { unsigned short s[64]; }; + +__attribute__((noinline, noclone)) void +foo (struct S *x) +{ + unsigned int i; + unsigned char *s; + + s = (unsigned char *) x->s; + for (i = 0; i < 64; i++) + x->s[i] = s[i * 2] | (s[i * 2 + 1] << 8); +} + 
+__attribute__((noinline, noclone)) void +bar (struct S *x) +{ + unsigned int i; + unsigned char *s; + + s = (unsigned char *) x->s; + for (i = 0; i < 64; i++) + x->s[i] = (s[i * 2] << 8) | s[i * 2 + 1]; +} + +int +main () +{ + unsigned int i; + struct S s; + if (sizeof (unsigned short) != 2) + return 0; + for (i = 0; i < 64; i++) + s.s[i] = i + ((64 - i) << 8); + foo (&s); +#if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__ + for (i = 0; i < 64; i++) + if (s.s[i] != (64 - i) + (i << 8)) + __builtin_abort (); +#elif __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__ + for (i = 0; i < 64; i++) + if (s.s[i] != i + ((64 - i) << 8)) + __builtin_abort (); +#endif + for (i = 0; i < 64; i++) + s.s[i] = i + ((64 - i) << 8); + bar (&s); +#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__ + for (i = 0; i < 64; i++) + if (s.s[i] != (64 - i) + (i << 8)) + __builtin_abort (); +#elif __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__ + for (i = 0; i < 64; i++) + if (s.s[i] != i + ((64 - i) << 8)) + __builtin_abort (); +#endif + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr65418-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr65418-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr65418-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr65418-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,19 @@ +/* PR tree-optimization/65418 */ + +__attribute__((noinline, noclone)) int +foo (int x) +{ + if (x == -216 || x == -132 || x == -218 || x == -146) + return 1; + return 0; +} + +int +main () +{ + volatile int i; + for (i = -230; i < -120; i++) + if (foo (i) != (i == -216 || i == -132 || i == -218 || i == -146)) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr65418-2.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr65418-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr65418-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr65418-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,19 @@ +/* PR tree-optimization/65418 */ + +__attribute__((noinline, noclone)) int +foo (int x) +{ + if (x == -216 || x == -211 || x == -218 || x == -205 || x == -223) + return 1; + return 0; +} + +int +main () +{ + volatile int i; + for (i = -230; i < -200; i++) + if (foo (i) != (i == -216 || i == -211 || i == -218 || i == -205 || i == -223)) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr65427.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr65427.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr65427.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr65427.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,34 @@ +/* PR tree-optimization/65427 */ + +typedef int V __attribute__ ((vector_size (8 * sizeof (int)))); +V a, b, c, d, e, f; + +__attribute__((noinline, noclone)) void +foo (int x, int y) +{ + do + { + if (x) + d = a ^ c; + else + d = a ^ b; + } + while (y); +} + +int +main () +{ + a = (V) { 1, 2, 3, 4, 5, 6, 7, 8 }; + b = (V) { 0x40, 0x80, 0x40, 0x80, 0x40, 0x80, 0x40, 0x80 }; + e = (V) { 0x41, 0x82, 0x43, 0x84, 0x45, 0x86, 0x47, 0x88 }; + foo (0, 0); + if (__builtin_memcmp (&d, &e, sizeof (V)) != 0) + __builtin_abort (); + c = (V) { 0x80, 0x40, 0x80, 0x40, 0x80, 0x40, 0x80, 0x40 }; + f = (V) { 0x81, 0x42, 0x83, 0x44, 0x85, 0x46, 0x87, 0x48 }; + foo (1, 0); + if (__builtin_memcmp (&d, 
&f, sizeof (V)) != 0) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr65648.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr65648.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr65648.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr65648.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,34 @@ +/* PR target/65648 */ + +int a = 0, *b = 0, c = 0; +static int d = 0; +short e = 1; +static long long f = 0; +long long *i = &f; +unsigned char j = 0; + +__attribute__((noinline, noclone)) void +foo (int x, int *y) +{ + asm volatile ("" : : "r" (x), "r" (y) : "memory"); +} + +__attribute__((noinline, noclone)) void +bar (const char *x, long long y) +{ + asm volatile ("" : : "r" (x), "r" (&y) : "memory"); + if (y != 0) + __builtin_abort (); +} + +int +main () +{ + int k = 0; + b = &k; + j = (!a) - (c <= e); + *i = j; + foo (a, &k); + bar ("", f); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr65956.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr65956.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr65956.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr65956.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,67 @@ +/* PR target/65956 */ + +struct A { char *a; int b; long long c; }; +char v[3]; + +__attribute__((noinline, noclone)) void +fn1 (char *x, char *y) +{ + if (x != &v[1] || y != &v[2]) + __builtin_abort (); + v[1]++; +} + +__attribute__((noinline, noclone)) int +fn2 (char *x) +{ + asm volatile ("" : "+g" (x) : : "memory"); + return x == &v[0]; +} + 
+__attribute__((noinline, noclone)) void +fn3 (const char *x) +{ + if (x[0] != 0) + __builtin_abort (); +} + +static struct A +foo (const char *x, struct A y, struct A z) +{ + struct A r = { 0, 0, 0 }; + if (y.b && z.b) + { + if (fn2 (y.a) && fn2 (z.a)) + switch (x[0]) + { + case '|': + break; + default: + fn3 (x); + } + fn1 (y.a, z.a); + } + return r; +} + +__attribute__((noinline, noclone)) int +bar (int x, struct A *y) +{ + switch (x) + { + case 219: + foo ("+", y[-2], y[0]); + case 220: + foo ("-", y[-2], y[0]); + } +} + +int +main () +{ + struct A a[3] = { { &v[1], 1, 1LL }, { &v[0], 0, 0LL }, { &v[2], 2, 2LL } }; + bar (220, a + 2); + if (v[1] != 1) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr66187.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr66187.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr66187.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr66187.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,16 @@ +/* PR tree-optimization/66187 */ + +int a = 1, e = -1; +short b, f; + +int +main () +{ + f = e; + int g = b < 0 ? 
0 : f + b; + if ((g & -4) < 0) + a = 0; + if (a) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr66233.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr66233.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr66233.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr66233.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,22 @@ +/* PR tree-optimization/66233 */ + +unsigned int v[8]; + +__attribute__((noinline, noclone)) void +foo (void) +{ + int i; + for (i = 0; i < 8; i++) + v[i] = (float) i; +} + +int +main () +{ + unsigned int i; + foo (); + for (i = 0; i < 8; i++) + if (v[i] != i) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr66556.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr66556.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr66556.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr66556.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,53 @@ +/* { dg-do run } */ +/* { dg-require-effective-target int32plus } */ + +extern void abort (void); + +struct { + unsigned f2; + unsigned f3 : 15; + unsigned f5 : 3; + short f6; +} b = {0x7f8000, 6, 5, 0}, g = {8, 0, 5, 0}; + +short d, l; +int a, c, h = 8; +volatile char e[237] = {4}; +short *f = &d; +short i[5] = {3}; +char j; +int *k = &c; + +int +fn1 (unsigned p1) { return -p1; } + +void +fn2 (char p1) +{ + a = p1; + e[0]; +} + +short +fn3 () +{ + *k = 4; + return *f; +} + +int +main () +{ + + unsigned m; + short *n = &i[4]; + + m = fn1 ((h && j) <= b.f5); + l = m > g.f3; + *n = 3; + fn2 (b.f2 
>> 15); + if ((a & 0xff) != 0xff) + abort (); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr66757.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr66757.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr66757.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr66757.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,15 @@ +/* PR tree-optimization/66757 */ +/* Testcase by Zhendong Su */ + +int a, b; + +int +main (void) +{ + unsigned int t = (unsigned char) (~b); + + if ((t ^ 1) / 255) + __builtin_abort (); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr66940.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr66940.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr66940.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr66940.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,20 @@ +long long __attribute__ ((noinline, noclone)) +foo (long long ival) +{ + if (ival <= 0) + return -0x7fffffffffffffffL - 1; + + return 0x7fffffffffffffffL; +} + +int +main (void) +{ + if (foo (-1) != (-0x7fffffffffffffffL - 1)) + __builtin_abort (); + + if (foo (1) != 0x7fffffffffffffffL) + __builtin_abort (); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr67037.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr67037.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr67037.c (added) +++ 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr67037.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,49 @@ +long (*extfunc)(); + +static inline void lstrcpynW( short *d, const short *s, int n ) +{ + unsigned int count = n; + + while ((count > 1) && *s) + { + count--; + *d++ = *s++; + } + if (count) *d = 0; +} + +int __attribute__((noinline,noclone)) +badfunc(int u0, int u1, int u2, int u3, + short *fsname, unsigned int fsname_len) +{ + static const short ntfsW[] = {'N','T','F','S',0}; + char superblock[2048+3300]; + int ret = 0; + short *p; + + if (extfunc()) + return 0; + p = (void *)extfunc(); + if (p != 0) + goto done; + + extfunc(superblock); + + lstrcpynW(fsname, ntfsW, fsname_len); + + ret = 1; +done: + return ret; +} + +static long f() +{ + return 0; +} + +int main() +{ + short buf[6]; + extfunc = f; + return !badfunc(0, 0, 0, 0, buf, 6); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr67226.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr67226.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr67226.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr67226.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,42 @@ +struct assembly_operand +{ + int type, value, symtype, symflags, marker; +}; + +struct assembly_operand to_input, from_input; + +void __attribute__ ((__noinline__, __noclone__)) +assemblez_1 (int internal_number, struct assembly_operand o1) +{ + if (o1.type != from_input.type) + __builtin_abort (); +} + +void __attribute__ ((__noinline__, __noclone__)) +t0 (struct assembly_operand to, struct assembly_operand from) +{ + if (to.value == 0) + assemblez_1 (32, from); + else + __builtin_abort (); +} + +int +main (void) +{ + to_input.value = 0; + to_input.type = 1; + to_input.symtype = 2; + to_input.symflags = 3; + 
to_input.marker = 4; + + from_input.value = 5; + from_input.type = 6; + from_input.symtype = 7; + from_input.symflags = 8; + from_input.marker = 9; + + t0 (to_input, from_input); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr67714.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr67714.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr67714.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr67714.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,26 @@ +unsigned int b; +int c; + +signed char +fn1 () +{ + signed char d; + for (int i = 0; i < 1; i++) + d = -15; + return d; +} + +int +main (void) +{ + for (c = 0; c < 1; c++) + b = 0; + char e = fn1 (); + signed char f = e ^ b; + volatile int g = (int) f; + + if (g != -15) + __builtin_abort (); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr67781.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr67781.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr67781.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr67781.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,35 @@ +/* { dg-require-effective-target int32plus } */ +#ifdef __UINT32_TYPE__ +typedef __UINT32_TYPE__ uint32_t; +#else +typedef unsigned uint32_t; +#endif + +#ifdef __UINT8_TYPE__ +typedef __UINT8_TYPE__ uint8_t; +#else +typedef unsigned char uint8_t; +#endif + +struct +{ + uint32_t a; + uint8_t b; +} s = { 0x123456, 0x78 }; + +int pr67781() +{ + uint32_t c = (s.a << 8) | s.b; + return c; +} + +int +main () +{ + if (sizeof (uint32_t) * __CHAR_BIT__ != 32) + return 0; + + if 
(pr67781 () != 0x12345678) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr67929_1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr67929_1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr67929_1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr67929_1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,15 @@ +int __attribute__ ((noinline, noclone)) +foo (float a) +{ + return a * 4.9f; +} + + +int +main (void) +{ + if (foo (10.0f) != 49) + __builtin_abort (); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr68143_1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr68143_1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr68143_1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr68143_1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,31 @@ +#define NULL 0 + +struct stuff +{ + int a; + int b; + int c; + int d; + int e; + char *f; + int g; +}; + +void __attribute__ ((noinline)) +bar (struct stuff *x) +{ + if (x->g != 2) + __builtin_abort (); +} + +int +main (int argc, char** argv) +{ + struct stuff x = {0, 0, 0, 0, 0, NULL, 0}; + x.a = 100; + x.d = 100; + x.g = 2; + /* Struct should now look like {100, 0, 0, 100, 0, 0, 0, 2}. 
*/ + bar (&x); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr68185.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr68185.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr68185.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr68185.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,31 @@ +/* { dg-skip-if "ptxas crashes or executes incorrectly" { nvptx-*-* } { "-O0" "-Os" } { "" } } Reported 2015-11-20 */ + +int a, b, d = 1, e, f, o, u, w = 1, z; +short c, q, t; + +int +main () +{ + char g; + for (; d; d--) + { + while (o) + for (; e;) + { + c = b; + int h = o = z; + for (; u;) + for (; a;) + ; + } + if (t < 1) + g = w; + f = g; + g && (q = 1); + } + + if (q != 1) + __builtin_abort (); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr68249.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr68249.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr68249.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr68249.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,36 @@ +/* PR rtl-optimization/68249 */ + +int a, b, c, g, k, l, m, n; +char h; + +void +fn1 () +{ + for (; k; k++) + { + m = b || c < 0 || c > 1 ? : c; + g = l = n || m < 0 || (m > 1) > 1 >> m ? 
: 1 << m; + } + l = b + 1; + for (; b < 1; b++) + h = a + 1; +} + +int +main () +{ + char j; + for (; a < 1; a++) + { + fn1 (); + if (h) + j = h; + if (j > c) + g = 0; + } + + if (h != 1) + __builtin_abort (); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr68250.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr68250.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr68250.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr68250.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,40 @@ +/* PR rtl-optimization/68250 */ + +signed char a, b, h, k, l, m, o; +short c, d, n; +int e, f, g, j, q; + +void +fn1 (void) +{ + int p = b || a; + n = o > 0 || d > 1 >> o ? d : d << o; + for (; j; j++) + m = c < 0 || m || c << p; + l = f + 1; + for (; f < 1; f = 1) + k = h + 1; +} + +__attribute__((noinline, noclone)) void +fn2 (int k) +{ + if (k != 1) + __builtin_abort (); +} + +int +main () +{ + signed char i; + for (; e < 1; e++) + { + fn1 (); + if (k) + i = k; + if (i > q) + g = 0; + } + fn2 (k); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr68321.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr68321.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr68321.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr68321.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,38 @@ +/* PR rtl-optimization/68321 */ + +int e = 1, u = 5, t2, t5, i, k; +int a[1], b, m; +char n, t; + +int +fn1 (int p1) +{ + int g[1]; + for (;;) + { + if (p1 / 3) + for (; t5;) + u || n; + t2 = p1 & 4; + if (b + 1) + return 0; + u = g[0]; + 
} +} + +int +main () +{ + for (; e >= 0; e--) + { + char c; + if (!m) + c = t; + fn1 (c); + } + + if (a[t2] != 0) + __builtin_abort (); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr68328.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr68328.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr68328.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr68328.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,44 @@ +int a, b, c = 1, d = 1, e; + +__attribute__ ((noinline, noclone)) + int foo (void) +{ + asm volatile ("":::"memory"); + return 4195552; +} + +__attribute__ ((noinline, noclone)) + void bar (int x, int y) +{ + asm volatile (""::"g" (x), "g" (y):"memory"); + if (y == 0) + __builtin_abort (); +} + +int +baz (int x) +{ + char g, h; + int i, j; + + foo (); + for (;;) + { + if (c) + h = d; + g = h < x ? 
h : 0; + i = (signed char) ((unsigned char) (g - 120) ^ 1); + j = i > 97; + if (a - j) + bar (0x123456, 0); + if (!b) + return e; + } +} + +int +main () +{ + baz (2); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr68376-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr68376-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr68376-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr68376-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,24 @@ +/* PR rtl-optimization/68376 */ + +int a, b, c = 1; +signed char d; + +int +main () +{ + for (; a < 1; a++) + for (; b < 1; b++) + { + signed char e = ~d; + if (d < 1) + e = d; + d = e; + if (!c) + __builtin_abort (); + } + + if (d != 0) + __builtin_abort (); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr68376-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr68376-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr68376-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr68376-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,73 @@ +/* PR rtl-optimization/68376 */ + +extern void abort (void); + +__attribute__((noinline, noclone)) int +f1 (int x) +{ + return x < 0 ? ~x : x; +} + +__attribute__((noinline, noclone)) int +f2 (int x) +{ + return x < 0 ? x : ~x; +} + +__attribute__((noinline, noclone)) int +f3 (int x) +{ + return x <= 0 ? ~x : x; +} + +__attribute__((noinline, noclone)) int +f4 (int x) +{ + return x <= 0 ? x : ~x; +} + +__attribute__((noinline, noclone)) int +f5 (int x) +{ + return x >= 0 ? 
~x : x; +} + +__attribute__((noinline, noclone)) int +f6 (int x) +{ + return x >= 0 ? x : ~x; +} + +__attribute__((noinline, noclone)) int +f7 (int x) +{ + return x > 0 ? ~x : x; +} + +__attribute__((noinline, noclone)) int +f8 (int x) +{ + return x > 0 ? x : ~x; +} + +int +main () +{ + if (f1 (5) != 5 || f1 (-5) != 4 || f1 (0) != 0) + abort (); + if (f2 (5) != -6 || f2 (-5) != -5 || f2 (0) != -1) + abort (); + if (f3 (5) != 5 || f3 (-5) != 4 || f3 (0) != -1) + abort (); + if (f4 (5) != -6 || f4 (-5) != -5 || f4 (0) != 0) + abort (); + if (f5 (5) != -6 || f5 (-5) != -5 || f5 (0) != -1) + abort (); + if (f6 (5) != 5 || f6 (-5) != 4 || f6 (0) != 0) + abort (); + if (f7 (5) != -6 || f7 (-5) != -5 || f7 (0) != 0) + abort (); + if (f8 (5) != 5 || f8 (-5) != 4 || f8 (0) != -1) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr68381.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr68381.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr68381.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr68381.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,21 @@ +/* { dg-options "-O -fexpensive-optimizations -fno-tree-bit-ccp" } */ + +__attribute__ ((noinline, noclone)) +int +foo (unsigned short x, unsigned short y) +{ + int r; + if (__builtin_mul_overflow (x, y, &r)) + __builtin_abort (); + return r; +} + +int +main (void) +{ + int x = 1; + int y = 2; + if (foo (x, y) != x * y) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr68390.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr68390.c?rev=374156&view=auto ============================================================================== --- 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr68390.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr68390.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,26 @@ +/* { dg-do run } */ +/* { dg-options "-O2" } */ + +__attribute__ ((noinline)) +double direct(int x, ...) +{ + return x*x; +} + +__attribute__ ((noinline)) +double broken(double (*indirect)(int x, ...), int v) +{ + return indirect(v); +} + +int main () +{ + double d1, d2; + int i = 2; + d1 = broken (direct, i); + if (d1 != i*i) + { + __builtin_abort (); + } + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr68506.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr68506.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr68506.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr68506.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,63 @@ +/* { dg-options "-fno-builtin-abort" } */ + +int a, b, m, n, o, p, s, u, i; +char c, q, y; +short d; +unsigned char e; +static int f, h; +static short g, r, v; +unsigned t; + +extern void abort (); + +int +fn1 (int p1) +{ + return a ? p1 : p1 + a; +} + +unsigned char +fn2 (unsigned char p1, int p2) +{ + return p2 >= 2 ? 
p1 : p1 >> p2; +} + +static short +fn3 () +{ + int w, x = 0; + for (; p < 31; p++) + { + s = fn1 (c | ((1 && c) == c)); + t = fn2 (s, x); + c = (unsigned) c > -(unsigned) ((o = (m = d = t) == p) <= 4UL) && n; + v = -c; + y = 1; + for (; y; y++) + e = v == 1; + d = 0; + for (; h != 2;) + { + for (;;) + { + if (!m) + abort (); + r = 7 - f; + x = e = i | r; + q = u * g; + w = b == q; + if (w) + break; + } + break; + } + } + return x; +} + +int +main () +{ + fn3 (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr68532.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr68532.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr68532.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr68532.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,23 @@ +/* { dg-options "-O2 -ftree-vectorize -fno-vect-cost-model" } */ +/* { dg-additional-options "-fno-common" { target hppa*-*-hpux* } } */ + +#define SIZE 128 +unsigned short _Alignas (16) in[SIZE]; + +__attribute__ ((noinline)) int +test (unsigned short sum, unsigned short *in, int x) +{ + for (int j = 0; j < SIZE; j += 8) + sum += in[j] * x; + return sum; +} + +int +main () +{ + for (int i = 0; i < SIZE; i++) + in[i] = i; + if (test (0, in, 1) != 960) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr68624.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr68624.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr68624.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr68624.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,30 @@ +int b, c, d, 
e = 1, f, g, h, j; + +static int +fn1 () +{ + int a = c; + if (h) + return 9; + g = (c || b) % e; + if ((g || f) && b) + return 9; + e = d; + for (c = 0; c > -4; c--) + ; + if (d) + c--; + j = c; + return d; +} + +int +main () +{ + fn1 (); + + if (c != -4) + __builtin_abort (); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr68648.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr68648.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr68648.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr68648.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,21 @@ +/* { dg-require-effective-target int32plus } */ +int __attribute__ ((noinline)) +foo (void) +{ + return 123; +} + +int __attribute__ ((noinline)) +bar (void) +{ + int c = 1; + c |= 4294967295 ^ (foo () | 4073709551608); + return c; +} + +int +main () +{ + if (bar () != 0x83fd4005) + __builtin_abort (); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr68841.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr68841.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr68841.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr68841.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,31 @@ +static inline int +foo (int *x, int y) +{ + int z = *x; + while (y > z) + z *= 2; + return z; +} + +int +main () +{ + int i; + for (i = 1; i < 17; i++) + { + int j; + int k; + j = foo (&i, 7); + if (i >= 7) + k = i; + else if (i >= 4) + k = 8 + (i - 4) * 2; + else if (i == 3) + k = 12; + else + k = 8; + if (j != k) + __builtin_abort (); + } + return 0; +} Added: 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr68911.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr68911.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr68911.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr68911.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,27 @@ +extern void abort (void); + +char a; +int b, c; +short d; + +int main () +{ + unsigned e = 2; + unsigned timeout = 0; + + for (; c < 2; c++) + { + int f = ~e / 7; + if (f) + a = e = ~(b && d); + while (e < 94) + { + e++; + if (++timeout > 100) + goto die; + } + } + return 0; +die: + abort (); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr69097-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr69097-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr69097-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr69097-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,14 @@ +/* PR tree-optimization/69097 */ + +int a, b; +unsigned int c; + +int +main () +{ + int d = b; + b = ~(~a + (~d | b)); + a = ~(~c >> b); + c = a % b; + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr69097-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr69097-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr69097-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr69097-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,30 @@ +/* PR 
tree-optimization/69097 */ + +__attribute__((noinline, noclone)) int +f1 (int x, int y) +{ + return x % y; +} + +__attribute__((noinline, noclone)) int +f2 (int x, int y) +{ + return x % -y; +} + +__attribute__((noinline, noclone)) int +f3 (int x, int y) +{ + int z = -y; + return x % z; +} + +int +main () +{ + if (f1 (-__INT_MAX__ - 1, 1) != 0 + || f2 (-__INT_MAX__ - 1, -1) != 0 + || f3 (-__INT_MAX__ - 1, -1) != 0) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr69320-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr69320-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr69320-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr69320-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,19 @@ +#include <stdlib.h> +int a, b, d, f; +char c; +static int *e = &d; +int main() { + int g = -1L; + *e = g; + c = 4; + for (; c >= 14; c++) + *e = 1; + f = a == 0; + *e ^= f; + int h = ~d; + if (d) + b = h; + if (h) + exit (0); + abort (); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr69320-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr69320-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr69320-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr69320-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,35 @@ + +#include <stdlib.h> + +int a, *c, d, e, g, f; +short b; + +int +fn1 () +{ + int h = d != 10; + if (h > g) + asm volatile ("" : : : "memory"); + if (h == 10) + { + int *i = 0; + a = 0; + for (; a < 7; a++) + for (; *i;) + ; + } + else + { + b = e / h; + return f; + } + c = &h; + abort (); +} + 
+int +main () +{ + fn1 (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr69320-3.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr69320-3.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr69320-3.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr69320-3.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,16 @@ +#include <stdlib.h> + +static int a[40] = {7, 5, 3, 3, 0, 0, 3}; +short b; +int c = 5; +int main() { + b = 0; + for (; b <= 3; b++) + if (a[b + 6] ^ (0 || c)) + ; + else + break; + if (b != 4) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr69320-4.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr69320-4.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr69320-4.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr69320-4.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,18 @@ +#include <stdlib.h> + +int a; +char b, d; +short c; +short fn1(int p1, int p2) { return p2 >= 2 ? 
p1 : p1 > p2; } + +int main() { + int *e = &a, *f = &a; + b = 1; + for (; b <= 9; b++) { + c = *e != 5 || d; + *f = fn1(c || b, a); + } + if ((long long) a != 1) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr69403.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr69403.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr69403.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr69403.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,20 @@ +/* PR target/69403. */ + +int a, b, c; + +__attribute__ ((__noinline__)) int +fn1 () +{ + if ((b | (a != (a & c))) == 1) + __builtin_abort (); + return 0; +} + +int +main (void) +{ + a = 5; + c = 1; + b = 6; + return fn1 (); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr69447.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr69447.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr69447.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr69447.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,26 @@ +typedef unsigned char u8; +typedef unsigned short u16; +typedef unsigned int u32; +typedef unsigned long long u64; + +u64 __attribute__((noinline, noclone)) +foo(u8 u8_0, u16 u16_0, u64 u64_0, u8 u8_1, u16 u16_1, u64 u64_1, u64 u64_2, u8 u8_3, u64 u64_3) +{ + u64_1 *= 0x7730; + u64_3 *= u64_3; + u16_1 |= u64_3; + u64_3 -= 2; + u8_3 /= u64_2; + u8_0 |= 3; + u64_3 %= u8_0; + u8_0 -= 1; + return u8_0 + u16_0 + u64_0 + u8_1 + u16_1 + u64_1 + u8_3 + u64_3; +} + +int main() +{ + unsigned x = foo(1, 1, 1, 1, 1, 1, 1, 1, 1); + if (x != 0x7737) + __builtin_abort(); + 
return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr69691.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr69691.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr69691.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr69691.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,127 @@ +/* PR rtl-optimization/69691 */ + +char u[] = { 46, 97, 99, 104, 52, 0 }; +char *v[] = { u, 0 }; +struct S { char a[10]; struct S *b[31]; }; +struct S r[7], *r2 = r; +static struct S *w = 0; + +__attribute__((noinline, noclone)) int +fn (int x) +{ + if (__builtin_strchr (u, x) || x == 96) + return x; + __builtin_abort (); +} + +__attribute__((noinline, noclone)) int +foo (char x) +{ + if (x == 0) + __builtin_abort (); + if (fn (x) >= 96 && fn (x) <= 122) + return (fn (x) - 96); + else if (x == 46) + return 0; + else + { + __builtin_printf ("foo %d\n", x); + return -1; + } +} + +__attribute__((noinline, noclone)) void +bar (char **x) +{ + char **b, c, *d, e[500], *f, g[10]; + int z, l, h, i; + struct S *s; + + w = r2++; + for (b = x; *b; b++) + { + __builtin_strcpy (e, *b); + f = e; + do + { + d = __builtin_strchr (f, 32); + if (d) + *d = 0; + l = __builtin_strlen (f); + h = 0; + s = w; + __builtin_memset (g, 0, sizeof (g)); + for (z = 0; z < l; z++) + { + c = f[z]; + if (c >= 48 && c <= 57) + g[h] = c - 48; + else + { + i = foo (c); + if (!s->b[i]) + { + s->b[i] = r2++; + if (r2 == &r[7]) + __builtin_abort (); + } + s = s->b[i]; + h++; + } + } + __builtin_memcpy (s->a, g, 10); + if (d) + f = d + 1; + } + while (d); + } +} + +__attribute__((noinline, noclone)) void +baz (char *x) +{ + char a[300], b[300]; + int z, y, t, l; + struct S *s; + + l = __builtin_strlen (x); + *a = 96; + for (z = 0; z < l; z++) + { + a[z + 1] = fn ((unsigned int) x[z]); + if (foo 
(a[z + 1]) <= 0) + return; + } + a[l + 1] = 96; + l += 2; + __builtin_memset (b, 0, l + 2); + + if (!w) + return; + + for (z = 0; z < l; z++) + { + s = w; + for (y = z; y < l; y++) + { + s = s->b[foo (a[y])]; + if (!s) + break; + for (t = 0; t <= y - z + 2; t++) + if (s->a[t] > b[z + t]) + b[z + t] = s->a[t]; + } + } + for (z = 3; z < l - 2; z++) + if ((b[z] & 1) == 1) + asm (""); +} + +int +main () +{ + bar (v); + char c[] = { 97, 97, 97, 97, 97, 0 }; + baz (c); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr70005.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr70005.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr70005.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr70005.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,25 @@ + +unsigned char a = 6; +int b, c; + +static void +fn1 () +{ + int i = a > 1 ? 
1 : a, j = 6 & (c = a && (b = a)); + int d = 0, e = a, f = ~c, g = b || a; + unsigned char h = ~a; + if (a) + f = j; + if (h && g) + d = a; + i = -~(f * d * h) + c && (e || i) ^ f; + if (i != 1) + __builtin_abort (); +} + +int +main () +{ + fn1 (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr70127.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr70127.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr70127.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr70127.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,23 @@ +/* PR tree-optimization/70127 */ + +struct S { int f; signed int g : 2; } a[1], c = {5, 1}, d; +short b; + +__attribute__((noinline, noclone)) void +foo (int x) +{ + if (x != 1) + __builtin_abort (); +} + +int +main () +{ + while (b++ <= 0) + { + struct S e = {1, 1}; + d = e = a[0] = c; + } + foo (a[0].g); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr70222-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr70222-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr70222-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr70222-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,30 @@ +/* PR rtl-optimization/70222 */ + +int a = 1; +unsigned int b = 2; +int c = 0; +int d = 0; + +void +foo () +{ + int e = ((-(c >= c)) < b) > ((int) (-1ULL >> ((a / a) * 15))); + d = -e; +} + +__attribute__((noinline, noclone)) void +bar (int x) +{ + if (x != -1) + __builtin_abort (); +} + +int +main () +{ +#if __CHAR_BIT__ == 8 && __SIZEOF_INT__ == 4 && __SIZEOF_LONG_LONG__ == 8 + foo (); + 
bar (d); +#endif + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr70222-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr70222-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr70222-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr70222-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,20 @@ +/* PR rtl-optimization/70222 */ + +#if __CHAR_BIT__ == 8 && __SIZEOF_INT__ == 4 && __SIZEOF_LONG_LONG__ == 8 +__attribute__((noinline, noclone)) unsigned int +foo (int x) +{ + unsigned long long y = -1ULL >> x; + return (unsigned int) y >> 31; +} +#endif + +int +main () +{ +#if __CHAR_BIT__ == 8 && __SIZEOF_INT__ == 4 && __SIZEOF_LONG_LONG__ == 8 + if (foo (15) != 1 || foo (32) != 1 || foo (33) != 0) + __builtin_abort (); +#endif + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr70429.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr70429.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr70429.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr70429.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,17 @@ +/* PR rtl-optimization/70429 */ + +__attribute__((noinline, noclone)) int +foo (int a) +{ + return (int) (0x14ff6e2207db5d1fLL >> a) >> 4; +} + +int +main () +{ + if (sizeof (int) != 4 || sizeof (long long) != 8 || __CHAR_BIT__ != 8) + return 0; + if (foo (1) != 0x3edae8 || foo (2) != -132158092) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr70460.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr70460.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr70460.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr70460.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,33 @@ +/* { dg-require-effective-target indirect_jumps } */ +/* { dg-require-effective-target label_values } */ +/* { dg-skip-if "label differences not supported" { avr-*-* } } */ + +/* PR rtl-optimization/70460 */ + +int c; + +__attribute__((noinline, noclone)) void +foo (int x) +{ + static int b[] = { &&lab1 - &&lab0, &&lab2 - &&lab0 }; + void *a = &&lab0 + b[x]; + goto *a; +lab1: + c += 2; +lab2: + c++; +lab0: + ; +} + +int +main () +{ + foo (0); + if (c != 3) + __builtin_abort (); + foo (1); + if (c != 4) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr70566.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr70566.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr70566.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr70566.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,47 @@ +/* PR target/70566. 
*/ + +#define NULL 0 + +struct mystruct +{ + unsigned int f1 : 1; + unsigned int f2 : 1; + unsigned int f3 : 1; +}; + +__attribute__ ((noinline)) void +myfunc (int a, void *b) +{ +} +__attribute__ ((noinline)) int +myfunc2 (void *a) +{ + return 0; +} + +static void +set_f2 (struct mystruct *user, int f2) +{ + if (user->f2 != f2) + myfunc (myfunc2 (NULL), NULL); + else + __builtin_abort (); +} + +__attribute__ ((noinline)) void +foo (void *data) +{ + struct mystruct *user = data; + if (!user->f2) + set_f2 (user, 1); +} + +int +main (void) +{ + struct mystruct a; + a.f1 = 1; + a.f2 = 0; + foo (&a); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr70586.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr70586.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr70586.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr70586.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,30 @@ +/* PR tree-optimization/70586 */ + +int a, e, f; +short b, c, d; + +int +foo (int x, int y) +{ + return (y == 0 || (x && y == 1)) ? 
x : x % y; +} + +static short +bar (void) +{ + int i = foo (c, f); + f = foo (d, 2); + int g = foo (b, c); + int h = foo (g > 0, c); + c = (3 >= h ^ 7) <= foo (i, c); + if (foo (e, 1)) + return a; + return 0; +} + +int +main () +{ + bar (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr70602.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr70602.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr70602.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr70602.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,24 @@ +/* PR tree-optimization/70602 */ +/* { dg-require-effective-target int32plus } */ + +struct __attribute__((packed)) S +{ + int s : 1; + int t : 20; +}; + +int a, b, c; + +int +main () +{ + for (; a < 1; a++) + { + struct S e[] = { {0, 9}, {0, 9}, {0, 9}, {0, 0}, {0, 9}, {0, 9}, {0, 9}, + {0, 0}, {0, 9}, {0, 9}, {0, 9}, {0, 0}, {0, 9}, {0, 9}, + {0, 9}, {0, 0}, {0, 9}, {0, 9}, {0, 9}, {0, 0}, {0, 9} }; + b = b || e[0].s; + c = e[0].t; + } + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr70903.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr70903.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr70903.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr70903.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,19 @@ +typedef unsigned char V8 __attribute__ ((vector_size (32))); +typedef unsigned int V32 __attribute__ ((vector_size (32))); +typedef unsigned long long V64 __attribute__ ((vector_size (32))); + +static V32 __attribute__ ((noinline, noclone)) +foo (V64 x) +{ + V64 y = 
(V64)(V8){((V8)(V64){65535, x[0]})[1]}; + return (V32){y[0], 255}; +} + +int main () +{ + V32 x = foo ((V64){}); +// __builtin_printf ("%08x %08x %08x %08x %08x %08x %08x %08x\n", x[0], x[1], x[2], x[3], x[4], x[5], x[6], x[7]); + if (x[1] != 255) + __builtin_abort(); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr71083.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr71083.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr71083.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr71083.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,45 @@ +__extension__ typedef __UINT32_TYPE__ uint32_t; + +struct lock_chain { + uint32_t irq_context: 2, + depth: 6, + base: 24; +}; + +__attribute__((noinline, noclone)) +struct lock_chain * foo (struct lock_chain *chain) +{ + int i; + for (i = 0; i < 100; i++) + { + chain[i+1].base = chain[i].base; + } + return chain; +} + +struct lock_chain1 { + char x; + unsigned short base; +} __attribute__((packed)); + +__attribute__((noinline, noclone)) +struct lock_chain1 * bar (struct lock_chain1 *chain) +{ + int i; + for (i = 0; i < 100; i++) + { + chain[i+1].base = chain[i].base; + } + return chain; +} + +struct lock_chain test [101]; +struct lock_chain1 test1 [101]; + +int +main () +{ + foo (test); + bar (test1); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr71335.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr71335.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr71335.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr71335.c Wed Oct 9 04:01:46 
2019 @@ -0,0 +1,13 @@ +int a; +int +main () +{ + int b = 0; + while (a < 0 || b) + { + b = 0; + for (; b < 9; b++) + ; + } + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr71494.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr71494.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr71494.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr71494.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,23 @@ +/* PR middle-end/71494 */ +/* { dg-require-effective-target label_values } */ + +int +main () +{ + void *label = &&out; + int i = 0; + void test (void) + { + label = &&out2; + goto *label; + out2:; + i++; + } + goto *label; + out: + i += 2; + test (); + if (i != 3) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr71550.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr71550.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr71550.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr71550.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,26 @@ + +extern void exit (int); + +int a = 3, b, c, f, g, h; +unsigned d; +char *e; + +int +main () +{ + for (; a; a--) + { + int i; + if (h && i) + __builtin_printf ("%d%d", c, f); + i = 0; + for (; i < 2; i++) + if (g) + for (; d < 10; d++) + b = *e; + i = 0; + for (; i < 1; i++) + ; + } + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr71554.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr71554.c?rev=374156&view=auto 
==============================================================================
--- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr71554.c (added)
+++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr71554.c Wed Oct 9 04:01:46 2019
@@ -0,0 +1,28 @@
+/* PR target/71554 */
+
+int v;
+
+__attribute__ ((noinline, noclone)) void
+bar (void)
+{
+  v++;
+}
+
+__attribute__ ((noinline, noclone))
+void
+foo (unsigned int x)
+{
+  signed int y = ((-__INT_MAX__ - 1) / 2);
+  signed int r;
+  if (__builtin_mul_overflow (x, y, &r))
+    bar ();
+}
+
+int
+main ()
+{
+  foo (2);
+  if (v)
+    __builtin_abort ();
+  return 0;
+}

Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr71626-1.c
URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr71626-1.c?rev=374156&view=auto
==============================================================================
--- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr71626-1.c (added)
+++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr71626-1.c Wed Oct 9 04:01:46 2019
@@ -0,0 +1,19 @@
+/* PR middle-end/71626 */
+
+typedef __INTPTR_TYPE__ V __attribute__((__vector_size__(sizeof (__INTPTR_TYPE__))));
+
+__attribute__((noinline, noclone)) V
+foo ()
+{
+  V v = { (__INTPTR_TYPE__) foo };
+  return v;
+}
+
+int
+main ()
+{
+  V v = foo ();
+  if (v[0] != (__INTPTR_TYPE__) foo)
+    __builtin_abort ();
+  return 0;
+}

Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr71626-2.c
URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr71626-2.c?rev=374156&view=auto
==============================================================================
--- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr71626-2.c (added)
+++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr71626-2.c Wed Oct 9 04:01:46 2019
@@ -0,0 +1,4 @@
+/* PR middle-end/71626 */
+/* { dg-additional-options "-fpic" { target fpic } } */
+
+#include "pr71626-1.c"

Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr71631.c
URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr71631.c?rev=374156&view=auto
==============================================================================
--- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr71631.c (added)
+++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr71631.c Wed Oct 9 04:01:46 2019
@@ -0,0 +1,32 @@
+/* PR tree-optimization/71631 */
+
+volatile char v;
+int a = 1, b = 1, c = 1;
+
+void
+foo (const char *s)
+{
+  while (*s++)
+    v = *s;
+}
+
+int
+main ()
+{
+  volatile int d = 1;
+  volatile int e = 1;
+  int f = 1 / a;
+  int g = 1U < f;
+  int h = 2 + g;
+  int i = 3 % h;
+  int j = e && b;
+  int k = 1 == c;
+  int l = d != 0;
+  short m = (short) (-1 * i * l);
+  short x = j * (k * m);
+  if (i == 1)
+    foo ("AB");
+  if (x != -1)
+    __builtin_abort ();
+  return 0;
+}

Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr71700.c
URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr71700.c?rev=374156&view=auto
==============================================================================
--- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr71700.c (added)
+++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr71700.c Wed Oct 9 04:01:46 2019
@@ -0,0 +1,19 @@
+struct S
+{
+  signed f0 : 16;
+  unsigned f1 : 1;
+};
+
+int b;
+static struct S c[] = {{-1, 0}, {-1, 0}};
+struct S d;
+
+int
+main ()
+{
+  struct S e = c[0];
+  d = e;
+  if (d.f1 != 0)
+    __builtin_abort ();
+  return 0;
+}

Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr7284-1.c
URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr7284-1.c?rev=374156&view=auto
==============================================================================
--- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr7284-1.c (added)
+++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr7284-1.c Wed Oct 9 04:01:46 2019
@@ -0,0 +1,25 @@
+/* Signed left-shift is implementation-defined in C89 (and see
+   DR#081), not undefined. Bug 7284 from Al Grant (AlGrant at
+   myrealbox.com). */
+
+/* { dg-require-effective-target int32plus } */
+/* { dg-options "-std=c89" } */
+
+extern void abort (void);
+extern void exit (int);
+
+int
+f (int n)
+{
+  return (n << 24) / (1 << 23);
+}
+
+volatile int x = 128;
+
+int
+main (void)
+{
+  if (f(x) != -256)
+    abort ();
+  exit (0);
+}

Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr77718.c
URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr77718.c?rev=374156&view=auto
==============================================================================
--- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr77718.c (added)
+++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr77718.c Wed Oct 9 04:01:46 2019
@@ -0,0 +1,25 @@
+/* PR middle-end/77718 */
+
+char a[64] __attribute__((aligned (8)));
+
+__attribute__((noinline, noclone)) int
+foo (void)
+{
+  return __builtin_memcmp ("bbbbbb", a, 6);
+}
+
+__attribute__((noinline, noclone)) int
+bar (void)
+{
+  return __builtin_memcmp (a, "bbbbbb", 6);
+}
+
+int
+main ()
+{
+  __builtin_memset (a, 'a', sizeof (a));
+  if (((foo () < 0) ^ ('a' > 'b'))
+      || ((bar () < 0) ^ ('a' < 'b')))
+    __builtin_abort ();
+  return 0;
+}

Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr77766.c
URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr77766.c?rev=374156&view=auto
==============================================================================
--- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr77766.c (added)
+++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr77766.c Wed Oct 9 04:01:46 2019
@@ -0,0 +1,27 @@
+char a;
+short b, d = 5, h;
+char c[1];
+int e, f = 4, g, j;
+int main() {
+  int i;
+  for (; f; f = a) {
+    g = 0;
+    for (; g <= 32; ++g) {
+      i = 0;
+      for (; i < 3; i++)
+        while (1 > d)
+          if (c[b])
+            break;
+    L:
+      if (j)
+        break;
+    }
+  }
+  e = 0;
+  for (; e; e = 0) {
+    d++;
+    for (; h;)
+      goto L;
+  }
+  return 0;
+}

Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr77767.c
URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr77767.c?rev=374156&view=auto
==============================================================================
--- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr77767.c (added)
+++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr77767.c Wed Oct 9 04:01:46 2019
@@ -0,0 +1,16 @@
+/* PR c/77767 */
+
+void
+foo (int a, int b[a++], int c, int d[c++])
+{
+  if (a != 2 || c != 2)
+    __builtin_abort ();
+}
+
+int
+main ()
+{
+  int e[10];
+  foo (1, e, 1, e);
+  return 0;
+}

Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr78170.c
URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr78170.c?rev=374156&view=auto
==============================================================================
--- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr78170.c (added)
+++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr78170.c Wed Oct 9 04:01:46 2019
@@ -0,0 +1,39 @@
+/* { dg-require-effective-target int32plus } */
+
+/* PR tree-optimization/78170.
+   Check that sign-extended store to a bitfield
+   doesn't overwrite other fields. */
+
+int a, b, d;
+
+struct S0
+{
+  int f0;
+  int f1;
+  int f2;
+  int f3;
+  int f4;
+  int f5:15;
+  int f6:17;
+  int f7:2;
+  int f8:30;
+} c;
+
+void fn1 ()
+{
+  d = b = 1;
+  for (; b; b = a)
+    {
+      struct S0 e = { 0, 0, 0, 0, 0, 0, 1, 0, 1 };
+      c = e;
+      c.f6 = -1;
+    }
+}
+
+int main ()
+{
+  fn1 ();
+  if (c.f7 != 0)
+    __builtin_abort ();
+  return 0;
+}

Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr78378.c
URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr78378.c?rev=374156&view=auto
==============================================================================
--- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr78378.c (added)
+++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr78378.c Wed Oct 9 04:01:46 2019
@@ -0,0 +1,18 @@
+/* PR rtl-optimization/78378 */
+
+unsigned long long __attribute__ ((noinline, noclone))
+foo (unsigned long long x)
+{
+  x <<= 41;
+  x /= 232;
+  return 1 + (unsigned short) x;
+}
+
+int
+main ()
+{
+  unsigned long long x = foo (1);
+  if (x != 0x2c24)
+    __builtin_abort();
+  return 0;
+}

Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr78436.c
URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr78436.c?rev=374156&view=auto
==============================================================================
--- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr78436.c (added)
+++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr78436.c Wed Oct 9 04:01:46 2019
@@ -0,0 +1,23 @@
+/* PR tree-optimization/78436 */
+
+struct S
+{
+  long int a : 24;
+  signed char b : 8;
+} s;
+
+__attribute__((noinline, noclone)) void
+foo ()
+{
+  s.b = 0;
+  s.a = -1193165L;
+}
+
+int
+main ()
+{
+  foo ();
+  if (s.b != 0)
+    __builtin_abort ();
+  return 0;
+}

Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr78438.c
URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr78438.c?rev=374156&view=auto
==============================================================================
--- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr78438.c (added)
+++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr78438.c Wed Oct 9 04:01:46 2019
@@ -0,0 +1,22 @@
+/* PR target/78438 */
+
+char a = 0;
+int b = 197412621;
+
+__attribute__ ((noinline, noclone))
+void foo ()
+{
+  a = 0 > (short) (b >> 11);
+}
+
+int
+main ()
+{
+  asm volatile ("" : : : "memory");
+  if (__CHAR_BIT__ != 8 || sizeof (short) != 2 || sizeof (int) < 4)
+    return 0;
+  foo ();
+  if (a != 0)
+    __builtin_abort ();
+  return 0;
+}

Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr78477.c
URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr78477.c?rev=374156&view=auto
==============================================================================
--- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr78477.c (added)
+++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr78477.c Wed Oct 9 04:01:46 2019
@@ -0,0 +1,27 @@
+/* PR rtl-optimization/78477 */
+
+unsigned a;
+unsigned short b;
+
+unsigned
+foo (unsigned x)
+{
+  b = x;
+  a >>= (b & 1);
+  b = 1 | (b << 5);
+  b >>= 15;
+  x = (unsigned char) b > ((2 - (unsigned char) b) & 1);
+  b = 0;
+  return x;
+}
+
+int
+main ()
+{
+  if (__CHAR_BIT__ != 8 || sizeof (short) != 2 || sizeof (int) < 4)
+    return 0;
+  unsigned x = foo (12345);
+  if (x != 0)
+    __builtin_abort ();
+  return 0;
+}

Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr78559.c
URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr78559.c?rev=374156&view=auto
==============================================================================
--- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr78559.c (added)
+++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr78559.c Wed Oct 9 04:01:46 2019
@@ -0,0 +1,31 @@
+/* PR rtl-optimization/78559 */
+
+int g = 20;
+int d = 0;
+
+short
+fn2 (int p1, int p2)
+{
+  return p2 >= 2 || 5 >> p2 ? p1 : p1 << p2;
+}
+
+int
+main ()
+{
+  int result = 0;
+lbl_2582:
+  if (g)
+    {
+      for (int c = -3; c; c++)
+        result = fn2 (1, g);
+    }
+  else
+    {
+      for (int i = 0; i < 2; i += 2)
+        if (d)
+          goto lbl_2582;
+    }
+  if (result != 1)
+    __builtin_abort ();
+  return 0;
+}

Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr78586.c
URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr78586.c?rev=374156&view=auto
==============================================================================
--- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr78586.c (added)
+++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr78586.c Wed Oct 9 04:01:46 2019
@@ -0,0 +1,17 @@
+/* PR tree-optimization/78586 */
+
+void
+foo (unsigned long x)
+{
+  char a[30];
+  unsigned long b = __builtin_sprintf (a, "%lu", x);
+  if (b != 4)
+    __builtin_abort ();
+}
+
+int
+main ()
+{
+  foo (1000);
+  return 0;
+}

Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr78617.c
URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr78617.c?rev=374156&view=auto
==============================================================================
--- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr78617.c (added)
+++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr78617.c Wed Oct 9 04:01:46 2019
@@ -0,0 +1,25 @@
+int a = 0;
+int d = 1;
+int f = 1;
+
+int fn1() {
+  return a || 1 >> a;
+}
+
+int fn2(int p1, int p2) {
+  return p2 >= 2 ? p1 : p1 >> 1;
+}
+
+int fn3(int p1) {
+  return d ^ p1;
+}
+
+int fn4(int p1, int p2) {
+  return fn3(!d > fn2((f = fn1() - 1000) || p2, p1));
+}
+
+int main() {
+  if (fn4(0, 0) != 1)
+    __builtin_abort ();
+  return 0;
+}

Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr78622.c
URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr78622.c?rev=374156&view=auto
==============================================================================
--- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr78622.c (added)
+++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr78622.c Wed Oct 9 04:01:46 2019
@@ -0,0 +1,36 @@
+/* PR middle-end/78622 - [7 Regression] -Wformat-overflow/-fprintf-return-value
+   incorrect with overflow/wrapping
+   { dg-skip-if "Requires %hhd format" { hppa*-*-hpux* } }
+   { dg-require-effective-target c99_runtime }
+   { dg-additional-options "-Wformat-overflow=2" } */
+
+__attribute__((noinline, noclone)) int
+foo (int x)
+{
+  if (x < 4096 + 8 || x >= 4096 + 256 + 8)
+    return -1;
+
+  char buf[5];
+  int n = __builtin_snprintf (buf, sizeof buf, "%hhd", x + 1);
+  __builtin_printf ("\"%hhd\" => %i\n", x + 1, n);
+  return n;
+}
+
+int
+main (void)
+{
+  if (__SCHAR_MAX__ != 127 || __CHAR_BIT__ != 8 || __SIZEOF_INT__ != 4)
+    return 0;
+
+  if (foo (4095 + 9) != 1
+      || foo (4095 + 32) != 2
+      || foo (4095 + 127) != 3
+      || foo (4095 + 128) != 4
+      || foo (4095 + 240) != 3
+      || foo (4095 + 248) != 2
+      || foo (4095 + 255) != 2
+      || foo (4095 + 256) != 1)
+    __builtin_abort ();
+
+  return 0;
+}

Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr78675.c
URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr78675.c?rev=374156&view=auto
============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr78675.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr78675.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,38 @@ +/* PR tree-optimization/78675 */ + +long int a; + +__attribute__((noinline, noclone)) long int +foo (long int x) +{ + long int b; + while (a < 1) + { + b = a && x; + ++a; + } + return b; +} + +int +main () +{ + if (foo (0) != 0) + __builtin_abort (); + a = 0; + if (foo (1) != 0) + __builtin_abort (); + a = 0; + if (foo (25) != 0) + __builtin_abort (); + a = -64; + if (foo (0) != 0) + __builtin_abort (); + a = -64; + if (foo (1) != 0) + __builtin_abort (); + a = -64; + if (foo (25) != 0) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr78720.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr78720.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr78720.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr78720.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,29 @@ +/* PR tree-optimization/78720 */ + +__attribute__((noinline, noclone)) long int +foo (signed char x) +{ + return x < 0 ? 0x80000L : 0L; +} + +__attribute__((noinline, noclone)) long int +bar (signed char x) +{ + return x < 0 ? 0x80L : 0L; +} + +__attribute__((noinline, noclone)) long int +baz (signed char x) +{ + return x < 0 ? 
0x20L : 0L; +} + +int +main () +{ + if (foo (-1) != 0x80000L || bar (-1) != 0x80L || baz (-1) != 0x20L + || foo (0) != 0L || bar (0) != 0L || baz (0) != 0L + || foo (31) != 0L || bar (31) != 0L || baz (31) != 0L) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr78726.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr78726.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr78726.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr78726.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,23 @@ +/* PR tree-optimization/78726 */ + +unsigned char b = 36, c = 173; +unsigned int d; + +__attribute__((noinline, noclone)) void +foo (void) +{ + unsigned a = ~b; + d = a * c * c + 1023094746U * a; +} + +int +main () +{ + if (__SIZEOF_INT__ != 4 || __CHAR_BIT__ != 8) + return 0; + asm volatile ("" : : "g" (&b), "g" (&c) : "memory"); + foo (); + if (d != 799092689U) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr78791.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr78791.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr78791.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr78791.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,19 @@ +/* PR target/78791 */ + +__attribute__((used, noinline, noclone)) unsigned long long +foo (unsigned long long x, unsigned long long y, unsigned long long z) +{ + unsigned long long a = x / y; + unsigned long long b = x % y; + a |= z; + b ^= z; + return a + b; +} + +int +main () +{ + if (foo (64, 7, 0) != 10 || foo (28, 3, 2) != 14) 
+ __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr78856.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr78856.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr78856.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr78856.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,24 @@ +extern void exit (int); + +int a, b, c, d, e, f[3]; + +int main() +{ + while (d) + while (1) + ; + int g = 0, h, i = 0; + for (; g < 21; g += 9) + { + int j = 1; + for (h = 0; h < 3; h++) + f[h] = 1; + for (; j < 10; j++) { + d = i && (b ? 0 : c); + i = 1; + if (g) + a = e; + } + } + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr79043.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr79043.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr79043.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr79043.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,21 @@ +/* PR ipa/78791 */ + +int val; + +int *ptr = &val; +float *ptr2 = &val; + +static +__attribute__((always_inline, optimize ("-fno-strict-aliasing"))) +typepun () +{ + *ptr2=0; +} + +main() +{ + *ptr=1; + typepun (); + if (*ptr) + __builtin_abort (); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr79121.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr79121.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr79121.c (added) +++ 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr79121.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,42 @@ +#if __SIZEOF_INT__ < 4 + __extension__ typedef __UINT32_TYPE__ uint32_t; + __extension__ typedef __INT32_TYPE__ int32_t; +#else + typedef unsigned uint32_t; + typedef int int32_t; +#endif + +extern void abort (void); + +__attribute__ ((noinline, noclone)) unsigned long long f1 (int32_t x) +{ + return ((unsigned long long) x) << 4; +} + +__attribute__ ((noinline, noclone)) long long f2 (uint32_t x) +{ + return ((long long) x) << 4; +} + +__attribute__ ((noinline, noclone)) unsigned long long f3 (uint32_t x) +{ + return ((unsigned long long) x) << 4; +} + +__attribute__ ((noinline, noclone)) long long f4 (int32_t x) +{ + return ((long long) x) << 4; +} + +int main () +{ + if (f1 (0xf0000000) != 0xffffffff00000000) + abort (); + if (f2 (0xf0000000) != 0xf00000000) + abort (); + if (f3 (0xf0000000) != 0xf00000000) + abort (); + if (f4 (0xf0000000) != 0xffffffff00000000) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr79286.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr79286.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr79286.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr79286.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,15 @@ +int a = 0, c = 0; +static int d[][8] = {}; + +int main () +{ + int e; + for (int b = 0; b < 4; b++) + { + __builtin_printf ("%d\n", b, e); + while (a && c++) + e = d[300000000000000000][0]; + } + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr79327.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr79327.c?rev=374156&view=auto 
============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr79327.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr79327.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,26 @@ +/* PR tree-optimization/79327 */ +/* { dg-require-effective-target c99_runtime } */ + +volatile int a; + +int +main (void) +{ + int i; + char buf[64]; + if (__builtin_sprintf (buf, "%#hho", a) != 1) + __builtin_abort (); + if (__builtin_sprintf (buf, "%#hhx", a) != 1) + __builtin_abort (); + a = 1; + if (__builtin_sprintf (buf, "%#hho", a) != 2) + __builtin_abort (); + if (__builtin_sprintf (buf, "%#hhx", a) != 3) + __builtin_abort (); + a = 127; + if (__builtin_sprintf (buf, "%#hho", a) != 4) + __builtin_abort (); + if (__builtin_sprintf (buf, "%#hhx", a) != 4) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr79354.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr79354.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr79354.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr79354.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,30 @@ +/* PR target/79354 */ + +int b, f, g; +float e; +unsigned long d; + +__attribute__((noinline, noclone)) void +foo (int *a) +{ + for (g = 0; g < 32; g++) + if (f) + { + e = d; + __builtin_memcpy (&b, &e, sizeof (float)); + b = *a; + } +} + +int +main () +{ + int h = 5; + f = 1; + asm volatile ("" : : : "memory"); + foo (&h); + asm volatile ("" : : : "memory"); + foo (&b); + asm volatile ("" : : : "memory"); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr79388.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr79388.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr79388.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr79388.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,23 @@ +/* PR rtl-optimization/79388 */ +/* { dg-additional-options "-fno-tree-coalesce-vars" } */ + +unsigned int a, c; + +__attribute__ ((noinline, noclone)) unsigned int +foo (unsigned int p) +{ + p |= 1; + p &= 0xfffe; + p %= 0xffff; + c = p; + return a + p; +} + +int +main (void) +{ + int x = foo (6); + if (x != 6) + __builtin_abort(); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr79450.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr79450.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr79450.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr79450.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,22 @@ +/* PR rtl-optimization/79450 */ + +unsigned int +foo (unsigned char x, unsigned long long y) +{ + do + { + x &= !y; + x %= 24; + } + while (x < y); + return x + y; +} + +int +main (void) +{ + unsigned int x = foo (1, 0); + if (x != 1) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr79737-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr79737-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr79737-1.c (added) +++ 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr79737-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,43 @@ +/* PR tree-optimization/79737 */ + +#if __SIZEOF_INT__ < 4 + __extension__ typedef __INT32_TYPE__ int32_t; +#else + typedef int int32_t; +#endif + +#pragma pack(1) +struct S +{ + int32_t b:18; + int32_t c:1; + int32_t d:24; + int32_t e:15; + int32_t f:14; +} i; +int g, j, k; +static struct S h; + +void +foo () +{ + for (j = 0; j < 6; j++) + k = 0; + for (; k < 3; k++) + { + struct S m = { 5, 0, -5, 9, 5 }; + h = m; + if (g) + i = m; + h.e = 0; + } +} + +int +main () +{ + foo (); + if (h.e != 0) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr79737-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr79737-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr79737-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr79737-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,47 @@ +/* PR tree-optimization/79737 */ + +#if __SIZEOF_INT__ < 4 + __extension__ typedef __INT32_TYPE__ int32_t; +#else + typedef int int32_t; +#endif + +#pragma pack(1) +struct S +{ + int32_t b:18; + int32_t c:1; + int32_t d:24; + int32_t e:15; + int32_t f:14; +} i, j; + +void +foo () +{ + i.e = 0; + i.b = 5; + i.c = 0; + i.d = -5; + i.f = 5; +} + +void +bar () +{ + j.b = 5; + j.c = 0; + j.d = -5; + j.e = 0; + j.f = 5; +} + +int +main () +{ + foo (); + bar (); + asm volatile ("" : : : "memory"); + if (i.b != j.b || i.c != j.c || i.d != j.d || i.e != j.e || i.f != j.f) + __builtin_abort (); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr80153.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr80153.c?rev=374156&view=auto 
============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr80153.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr80153.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,48 @@ +/* PR tree-optimization/80153 */ + +void check (int, int, int) __attribute__((noinline)); +void check (int c, int c2, int val) +{ + if (!val) { + __builtin_abort(); + } +} + +static const char *buf; +static int l, i; + +void _fputs(const char *str) __attribute__((noinline)); +void _fputs(const char *str) +{ + buf = str; + i = 0; + l = __builtin_strlen(buf); +} + +char _fgetc() __attribute__((noinline)); +char _fgetc() +{ + char val = buf[i]; + i++; + if (i > l) + return -1; + else + return val; +} + +static const char *string = "oops!\n"; + +int main(void) +{ + int i; + int c; + + _fputs(string); + + for (i = 0; i < __builtin_strlen(string); i++) { + c = _fgetc(); + check(c, string[i], c == string[i]); + } + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr80421.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr80421.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr80421.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr80421.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,121 @@ +/* PR middle-end/80421 */ + +__attribute__ ((noinline, noclone)) void +baz (const char *t, ...) 
+{ + asm volatile (""::"r" (t):"memory"); + if (*t == 'T') + __builtin_abort (); +} + +unsigned int +foo (char x) +{ + baz ("x %c\n", x); + switch (x) + { + default: + baz ("case default\n"); + if (x == 'D' || x == 'I') + baz ("This should never be reached.\n"); + return 0; + case 'D': + baz ("case 'D'\n"); + return 0; + case 'I': + baz ("case 'I'\n"); + return 0; + } +} + +void +bar (void) +{ + int a = 2; + int b = 5; + char c[] = { + 2, 4, 1, 2, 5, 5, 2, 4, 4, 0, 0, 0, 0, 0, 0, 3, 4, 4, 2, 4, + 1, 2, 5, 5, 2, 4, 1, 0, 0, 0, 2, 4, 4, 3, 4, 3, 3, 5, 1, 3, + 5, 5, 2, 4, 4, 2, 4, 1, 3, 5, 3, 3, 5, 1, 3, 5, 1, 2, 4, 4, + 2, 4, 2, 3, 5, 1, 3, 5, 1, 3, 5, 5, 2, 4, 1, 2, 4, 2, 3, 5, + 3, 3, 5, 1, 3, 5, 5, 2, 4, 1, 2, 4, 1, 3, 5, 3, 3, 5, 1, 3, + 5, 5, 2, 4, 4, 2, 4, 1, 3, 5, 3, 3, 5, 1, 3, 5, 1, 2, 4, 1, + 2, 4, 2, 3, 5, 1, 3, 5, 1, 3, 5, 1, 2, 4, 1, 2, 4, 1, 3, 5, + 1, 3, 5, 1, 3, 5, 1, 2, 4, 4, 2, 4, 1, 3, 5, 1, 3, 5, 1, 3, + 5, 5, 2, 4, 4, 2, 4, 2, 3, 5, 3, 3, 5, 1, 3, 5, 5, 2, 4, 4, + 2, 4, 1, 3, 5, 3, 3, 5, 1, 3, 5, 1, 2, 5, 5, 2, 4, 2, 3, 5, + 1, 3, 4, 1, 3, 5, 1, 2, 5, 5, 2, 4, 1, 2, 5, 1, 3, 5, 3, 3, + 5, 1, 2, 5, 5, 2, 4, 2, 2, 5, 1, 3, 5, 3, 3, 5, 1, 2, 5, 1, + 2, 4, 1, 2, 5, 2, 3, 5, 1, 3, 5, 1, 2, 5, 1, 2, 4, 2, 2, 5, + 1, 3, 5, 1, 3, 5, 1, 2, 5, 5, 2, 4, 2, 2, 5, 2, 3, 5, 3, 3, + 5, 1, 2, 5, 5, 2, 4, 2, 2, 5, 2, 3, 5, 3, 3, 5, 1, 2, 5, 5, + 2, 4, 2, 2, 5, 1, 3, 5, 3, 3, 5, 1, 2, 5, 5, 2, 4, 2, 2, 5, + 1, 3, 5, 3, 3, 5, 1, 2, 5, 1, 2, 4, 1, 2, 5, 2, 3, 5, 1, 3, + 5, 1, 2, 5, 5, 2, 4, 2, 2, 5, 2, 3, 5, 3, 3, 5, 1, 2, 5, 5, + 2, 4, 1, 2, 5, 1, 3, 5, 3, 3, 5, 1, 2, 5, 5, 2, 4, 2, 2, 5, + 1, 3, 5, 3, 3, 5, 1, 2, 5, 5, 2, 4, 2, 2, 5, 1, 3, 5, 3, 3, + 5, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 + }; + char *f = c + 390; + int i, j, e, g, h; + char k, l; + i = 26; + j = 25; + k = l = 'M'; + h = 2; + while (i > 0) + { + int x = i - a; + x = x > 0 ? 
x : 0; + x = j - x; + g = x * 3 + h; + switch (f[g]) + { + case 1: + --i; + --j; + h = 2; + f -= b * 3; + k = 'M'; + break; + case 2: + --i; + h = 0; + f -= b * 3; + k = 'I'; + break; + case 3: + --i; + h = 2; + f -= b * 3; + k = 'I'; + break; + case 4: + --j; + h = 1; + k = 'D'; + break; + case 5: + --j; + h = 2; + k = 'D'; + break; + } + if (k == l) + ++e; + else + { + foo (l); + l = k; + } + } +} + +int +main () +{ + char l = 'D'; + foo (l); + bar (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr80501.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr80501.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr80501.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr80501.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,23 @@ +/* PR rtl-optimization/80501 */ + +signed char v = 0; + +static signed char +foo (int x, int y) +{ + return x << y; +} + +__attribute__((noinline, noclone)) int +bar (void) +{ + return foo (v >= 0, __CHAR_BIT__ - 1) >= 1; +} + +int +main () +{ + if (sizeof (int) > sizeof (char) && bar () != 0) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr80692.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr80692.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr80692.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr80692.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,13 @@ +/* { dg-require-effective-target dfp } */ + +int main () { + _Decimal64 d64 = -0.DD; + + if (d64 != 0.DD) + __builtin_abort (); + + if (d64 != -0.DD) + __builtin_abort (); + + 
return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr81281.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr81281.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr81281.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr81281.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,33 @@ +/* PR sanitizer/81281 */ + +void +foo (unsigned p, unsigned a, unsigned b) +{ + unsigned q = p + 7; + if (a - (1U + __INT_MAX__) >= 2) + __builtin_unreachable (); + int d = p + b; + int c = p + a; + if (c - d != __INT_MAX__) + __builtin_abort (); +} + +void +bar (unsigned p, unsigned a) +{ + unsigned q = p + 7; + if (a - (1U + __INT_MAX__) >= 2) + __builtin_unreachable (); + int c = p; + int d = p + a; + if (c - d != -__INT_MAX__ - 1) + __builtin_abort (); +} + +int +main () +{ + foo (-1U, 1U + __INT_MAX__, 1U); + bar (-1U, 1U + __INT_MAX__); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr81423.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr81423.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr81423.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr81423.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,39 @@ +/* PR rtl-optimization/81423 */ + +extern void abort (void); + +unsigned long long int ll = 0; +unsigned long long int ull1 = 1ULL; +unsigned long long int ull2 = 12008284144813806346ULL; +unsigned long long int ull3; + +unsigned long long int __attribute__ ((noinline)) +foo (void) +{ + ll = -5597998501375493990LL; + + ll = (unsigned int) (5677365550390624949LL - ll) - (ull1 > 0); + unsigned long long int 
ull3; + ull3 = (unsigned int) + (2067854353LL << + (((ll + -2129105131LL) ^ 10280750144413668236ULL) - + 10280750143997242009ULL)) >> ((2873442921854271231ULL | ull2) + - 12098357307243495419ULL); + + return ull3; +} + +int +main (void) +{ + /* We need a long long of exactly 64 bits and int of exactly 32 bits + for this test. */ + if (__SIZEOF_LONG_LONG__ * __CHAR_BIT__ != 64 + || __SIZEOF_INT__ * __CHAR_BIT__ != 32) + return 0; + + ull3 = foo (); + if (ull3 != 3998784) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr81503.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr81503.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr81503.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr81503.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,24 @@ +unsigned short a = 41461; +unsigned short b = 3419; +#if __SIZEOF_INT__ >= 4 +int c = 0; + +void foo() { + if (a + b * ~(0 != 5)) + c = -~(b * ~(0 != 5)) + 2147483647; +} +#else +__INT32_TYPE__ c = 0; + +void foo() { + if (a + b * ~((__INT32_TYPE__)(0 != 5))) + c = -~(b * ~((__INT32_TYPE__)(0 != 5))) + 2147483647; +} +#endif + +int main() { + foo(); + if (c != 2147476810) + return -1; + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr81555.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr81555.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr81555.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr81555.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,24 @@ +/* PR tree-optimization/81555 */ + +unsigned int a = 1, d = 0xfaeU, e = 0xe376U; 
+_Bool b = 0, f = 1; +unsigned char g = 1; + +void +foo (void) +{ + _Bool c = a != b; + if (c) + f = 0; + if (e & c & (unsigned char)d & c) + g = 0; +} + +int +main () +{ + foo (); + if (f || g != 1) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr81556.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr81556.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr81556.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr81556.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,23 @@ +/* PR tree-optimization/81556 */ + +unsigned long long int b = 0xb82ff73c5c020599ULL; +unsigned long long int c = 0xd4e8188733a29d8eULL; +unsigned long long int d = 2, f = 1, g = 0, h = 0; +unsigned long long int e = 0xf27771784749f32bULL; + +__attribute__((noinline, noclone)) void +foo (void) +{ + _Bool a = d > 1; + g = f % ((d > 1) << 9); + h = a & (e & (a & b & c)); +} + +int +main () +{ + foo (); + if (g != 1 || h != 0) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr81588.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr81588.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr81588.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr81588.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,45 @@ +/* PR tree-optimization/81588 */ + +__attribute__((noinline, noclone)) int +bar (int x) +{ + __asm volatile ("" : : "g" (x) : "memory"); +} + +__attribute__((noinline, noclone)) int +foo (unsigned x, long long y) +{ + if (y < 0) + return 0; + if (y < (long long) (4 * x)) + { + bar 
(y); + return 1; + } + return 0; +} + +int +main () +{ + volatile unsigned x = 10; + volatile long long y = -10000; + if (foo (x, y) != 0) + __builtin_abort (); + y = -1; + if (foo (x, y) != 0) + __builtin_abort (); + y = 0; + if (foo (x, y) != 1) + __builtin_abort (); + y = 39; + if (foo (x, y) != 1) + __builtin_abort (); + y = 40; + if (foo (x, y) != 0) + __builtin_abort (); + y = 10000; + if (foo (x, y) != 0) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr81913.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr81913.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr81913.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr81913.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,26 @@ +/* PR tree-optimization/81913 */ + +typedef __UINT8_TYPE__ u8; +typedef __UINT32_TYPE__ u32; + +static u32 +b (u8 d, u32 e, u32 g) +{ + do + { + e += g + 1; + d--; + } + while (d >= (u8) e); + + return e; +} + +int +main (void) +{ + u32 x = b (1, -0x378704, ~0xba64fc); + if (x != 0xd93190d0) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr82192.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr82192.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr82192.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr82192.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,22 @@ +/* PR rtl-optimization/82192 */ + +unsigned long long int a = 0x95dd3d896f7422e2ULL; +struct S { unsigned int m : 13; } b; + +__attribute__((noinline, noclone)) void +foo (void) +{ + b.m = ((unsigned) a) 
>> (0x644eee9667723bf7LL + | a & ~0xdee27af8U) - 0x644eee9667763bd8LL; +} + +int +main () +{ + if (__INT_MAX__ != 0x7fffffffULL) + return 0; + foo (); + if (b.m != 0) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr82210.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr82210.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr82210.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr82210.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,27 @@ +/* PR c/82210 */ +/* { dg-require-effective-target alloca } */ + +void +foo (int size) +{ + int i; + struct S { + __attribute__((aligned (16))) struct T { short c; } a[size]; + int b[size]; + } s; + + for (i = 0; i < size; i++) + s.a[i].c = 0x1234; + for (i = 0; i < size; i++) + s.b[i] = 0; + for (i = 0; i < size; i++) + if (s.a[i].c != 0x1234 || s.b[i] != 0) + __builtin_abort (); +} + +int +main () +{ + foo (15); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr82387.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr82387.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr82387.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr82387.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,27 @@ +/* PR tree-optimization/82387 */ + +struct A { int b; }; +int f = 1; + +struct A +foo (void) +{ + struct A h[] = { + {1}, {1}, {1}, {1}, {1}, {1}, {1}, {1}, {1}, {1}, + {1}, {1}, {1}, {1}, {1}, {1}, {1}, {1}, {1}, {1}, + {1}, {1}, {1}, {1}, {1}, {1}, {1}, {1}, {1}, {1}, + {1}, {1}, {1}, {1}, {1}, {1}, {1}, {1}, {1}, {1}, + {1}, {1}, {1}, {1}, {1}, 
{1}, {1}, {1}, {1}, {1}, + {1}, {1}, {1}, {1}, {1}, {1}, {1}, {1}, {1}, {1}, + {1}, {1}, {1}, {1}, {1}, {1}, {1}, {1}, {1}, {1}, + }; + return h[24]; +} + +int +main () +{ + struct A i = foo (), j = i; + j.b && (f = 0); + return f; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr82388.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr82388.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr82388.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr82388.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,17 @@ +/* PR tree-optimization/82388 */ + +struct A { int b; int c; int d; } e; + +struct A +foo (void) +{ + struct A h[30] = {{0,0,0}}; + return h[29]; +} + +int +main () +{ + e = foo (); + return e.b; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr82524.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr82524.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr82524.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr82524.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,37 @@ +/* PR target/82524 */ + +struct S { unsigned char b, g, r, a; }; +union U { struct S c; unsigned v; }; + +static inline unsigned char +foo (unsigned char a, unsigned char b) +{ + return ((a + 1) * b) >> 8; +} + +__attribute__((noinline, noclone)) unsigned +bar (union U *x, union U *y) +{ + union U z; + unsigned char v = x->c.a; + unsigned char w = foo (y->c.a, 255 - v); + z.c.r = foo (x->c.r, v) + foo (y->c.r, w); + z.c.g = foo (x->c.g, v) + foo (y->c.g, w); + z.c.b = foo (x->c.b, v) + foo (y->c.b, w); + z.c.a = 0; + return z.v; +} + +int +main () 
+{ + union U a, b, c; + if ((unsigned char) ~0 != 255 || sizeof (unsigned) != 4) + return 0; + a.c = (struct S) { 255, 255, 255, 0 }; + b.c = (struct S) { 255, 255, 255, 255 }; + c.v = bar (&a, &b); + if (c.c.b != 255 || c.c.g != 255 || c.c.r != 255 || c.c.a != 0) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr82954.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr82954.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr82954.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr82954.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,22 @@ +/* PR tree-optimization/82954 */ + +__attribute__((noipa)) void +foo (int *__restrict p, int *__restrict q) +{ + p[0] = p[0] ^ 1; + p[1] = p[1] ^ 2; + p[2] = p[2] ^ q[2]; + p[3] = p[3] ^ q[3]; +} + +int +main () +{ + int p[4] = { 16, 32, 64, 128 }; + int q[4] = { 8, 4, 2, 1 }; + asm volatile ("" : : "g" (p), "g" (q) : "memory"); + foo (p, q); + if (p[0] != 17 || p[1] != 34 || p[2] != 66 || p[3] != 129) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr83269.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr83269.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr83269.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr83269.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,14 @@ +/* PR tree-optimization/83269 */ + +int +main () +{ +#if __SIZEOF_INT__ == 4 && __SIZEOF_LONG_LONG__ > 4 && __CHAR_BIT__ == 8 + volatile unsigned char a = 1; + long long b = 0x80000000L; + int c = -((int)(-b) - (-0x7fffffff * a)); + if (c != 
1) + __builtin_abort (); +#endif + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr83298.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr83298.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr83298.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr83298.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,11 @@ + +int a, b, c = 1; + +int main () +{ + for (; b < 1; b++) + ; + if (!(c * (a < 1))) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr83362.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr83362.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr83362.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr83362.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,29 @@ +typedef __UINT8_TYPE__ u8; +typedef __UINT32_TYPE__ u32; + +u32 a, b, d, e; +u8 c; + +static u32 __attribute__ ((noinline, noclone)) +foo (u32 p) +{ + do + { + e /= 0xfff; + if (p > c) + d = 0; + e -= 3; + e *= b <= a; + } + while (e >= 88030); + return e; +} + +int +main (void) +{ + u32 x = foo (1164); + if (x != 0xfd) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr83383.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr83383.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr83383.c (added) +++ 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr83383.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,25 @@ +/* PR tree-optimization/83383 */ + +unsigned long long int a = 16ULL; +unsigned char b = 195; +unsigned long long int c = ~0ULL; +unsigned char d = 1; +unsigned long long int e[2] = { 3625445792498952486ULL, 0 }; +unsigned long long int f[2] = { 0, 8985037393681294663ULL }; +unsigned long long int g = 5052410635626804928ULL; + +void +foo () +{ + a = ((signed char) a) < b; + c = (d ? e[0] : 0) - (f[1] * a ? 1 : g); +} + +int +main() +{ + foo (); + if (a != 1 || c != 3625445792498952485ULL) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr83477.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr83477.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr83477.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr83477.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,20 @@ +int yf = 0; + +void +pl (int q5, int nd) +{ + unsigned int hp = q5; + int zx = (q5 == 0) ? hp : (hp / q5); + + yf = ((nd < 2) * zx != 0) ? 
nd : 0; +} + +int +main (void) +{ + pl (1, !yf); + if (yf != 1) + __builtin_abort (); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr84169.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr84169.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr84169.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr84169.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,25 @@ +/* PR rtl-optimization/84169 */ + +#ifdef __SIZEOF_INT128__ +typedef unsigned __int128 T; +#else +typedef unsigned long long T; +#endif + +T b; + +static __attribute__ ((noipa)) T +foo (T c, T d, T e, T f, T g, T h) +{ + __builtin_mul_overflow ((unsigned char) h, -16, &h); + return b + h; +} + +int +main () +{ + T x = foo (0, 0, 0, 0, 0, 4); + if (x != -64) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr84339.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr84339.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr84339.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr84339.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,30 @@ +/* PR tree-optimization/84339 */ + +struct S { int a; char b[1]; }; + +__attribute__((noipa)) int +foo (struct S *p) +{ + return __builtin_strlen (&p->b[0]); +} + +__attribute__((noipa)) int +bar (struct S *p) +{ + return __builtin_strlen (p->b); +} + +int +main () +{ + struct S *p = __builtin_malloc (sizeof (struct S) + 16); + if (p) + { + p->a = 1; + __builtin_strcpy (p->b, "abcdefg"); + if (foo (p) != 7 || bar (p) != 7) + __builtin_abort (); + __builtin_free (p); + } + return 0; 
+} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr84478.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr84478.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr84478.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr84478.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,49 @@ +/* PR tree-optimization/84478 */ + +long poolptr; +unsigned char *strpool; +static const char *poolfilearr[] = { + "mu", + "", +#define A "x", +#define B A "xx", A A "xxx", A A A A A +#define C B B B B B B B B B B +#define D C C C C C C C C C C + D C C C C C C C B B B + ((void *)0) +}; + +__attribute__((noipa)) long +makestring (void) +{ + return 1; +} + +__attribute__((noipa)) long +loadpoolstrings (long spare_size) +{ + const char *s; + long g = 0; + int i = 0, j = 0; + while ((s = poolfilearr[j++])) + { + int l = __builtin_strlen (s); + i += l; + if (i >= spare_size) return 0; + while (l-- > 0) strpool[poolptr++] = *s++; + g = makestring (); + } + return g; +} + +int +main () +{ + strpool = __builtin_malloc (4000); + if (!strpool) + return 0; + asm volatile ("" : : : "memory"); + volatile int r = loadpoolstrings (4000); + __builtin_free (strpool); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr84521.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr84521.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr84521.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr84521.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,53 @@ +/* { dg-require-effective-target indirect_jumps } */ +/* { dg-additional-options "-fomit-frame-pointer 
-fno-inline" } */ + +extern void abort (void); + +void +broken_longjmp (void *p) +{ + __builtin_longjmp (p, 1); +} + +volatile int x = 256; +void *volatile p = (void*)&x; +void *volatile p1; + +void +test (void) +{ + void *buf[5]; + void *volatile q = p; + + if (!__builtin_setjmp (buf)) + broken_longjmp (buf); + + /* Fails if stack pointer corrupted. */ + if (p != q) + abort (); +} + +void +test2 (void) +{ + void *volatile q = p; + p1 = __builtin_alloca (x); + test (); + + /* Fails if frame pointer corrupted. */ + if (p != q) + abort (); +} + +int +main (void) +{ + void *volatile q = p; + test (); + test2 (); + /* Fails if stack pointer corrupted. */ + if (p != q) + abort (); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr84524.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr84524.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr84524.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr84524.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,41 @@ +/* PR target/84524 */ + +__attribute__((noipa)) void +foo (unsigned short *x) +{ + unsigned short i, v; + unsigned char j; + for (i = 0; i < 256; i++) + { + v = i << 8; + for (j = 0; j < 8; j++) + if (v & 0x8000) + v = (v << 1) ^ 0x1021; + else + v = v << 1; + x[i] = v; + } +} + +int +main () +{ + unsigned short a[256]; + + foo (a); + for (int i = 0; i < 256; i++) + { + unsigned short v = i << 8; + for (int j = 0; j < 8; j++) + { + asm volatile ("" : "+r" (v)); + if (v & 0x8000) + v = (v << 1) ^ 0x1021; + else + v = v << 1; + } + if (a[i] != v) + __builtin_abort (); + } + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr84748.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr84748.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr84748.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr84748.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,34 @@ +/* { dg-require-effective-target int128 } */ + +typedef unsigned __int128 u128; + +int a, c, d; +u128 b; + +unsigned long long g0, g1; + +void +store (unsigned long long a0, unsigned long long a1) +{ + g0 = a0; + g1 = a1; +} + +void +foo (void) +{ + b += a; + c = d != 84347; + b /= c; + u128 x = b; + store (x >> 0, x >> 64); +} + +int +main (void) +{ + foo (); + if (g0 != 0 || g1 != 0) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr85095.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr85095.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr85095.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr85095.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,52 @@ +/* PR target/85095 */ + +__attribute__((noipa)) unsigned long +f1 (unsigned long a, unsigned long b) +{ + unsigned long i = __builtin_add_overflow (a, b, &a); + return a + i; +} + +__attribute__((noipa)) unsigned long +f2 (unsigned long a, unsigned long b) +{ + unsigned long i = __builtin_add_overflow (a, b, &a); + return a - i; +} + +__attribute__((noipa)) unsigned long +f3 (unsigned int a, unsigned int b) +{ + unsigned int i = __builtin_add_overflow (a, b, &a); + return a + i; +} + +__attribute__((noipa)) unsigned long +f4 (unsigned int a, unsigned int b) +{ + unsigned int i = __builtin_add_overflow (a, b, &a); + return a - i; +} + +int 
+main () +{ + if (f1 (16UL, -18UL) != -2UL + || f1 (16UL, -17UL) != -1UL + || f1 (16UL, -16UL) != 1UL + || f1 (16UL, -15UL) != 2UL + || f2 (24UL, -26UL) != -2UL + || f2 (24UL, -25UL) != -1UL + || f2 (24UL, -24UL) != -1UL + || f2 (24UL, -23UL) != 0UL + || f3 (32U, -34U) != -2U + || f3 (32U, -33U) != -1U + || f3 (32U, -32U) != 1U + || f3 (32U, -31U) != 2U + || f4 (35U, -37U) != -2U + || f4 (35U, -36U) != -1U + || f4 (35U, -35U) != -1U + || f4 (35U, -34U) != 0U) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr85156.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr85156.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr85156.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr85156.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,21 @@ +/* PR tree-optimization/85156 */ + +int x, y; + +__attribute__((noipa)) int +foo (int z) +{ + if (__builtin_expect (x ? 
y != 0 : 0, z++)) + return 7; + return z; +} + +int +main () +{ + x = 1; + asm volatile ("" : "+m" (x), "+m" (y)); + if (foo (10) != 11) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr85169.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr85169.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr85169.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr85169.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,22 @@ +/* PR target/85169 */ + +typedef char V __attribute__((vector_size (64))); + +static void __attribute__ ((noipa)) +foo (V *p) +{ + V v = *p; + v[63] = 1; + *p = v; +} + +int +main () +{ + V v = (V) { }; + foo (&v); + for (unsigned i = 0; i < 64; i++) + if (v[i] != (i == 63)) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr85331.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr85331.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr85331.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr85331.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,22 @@ +/* PR tree-optimization/85331 */ + +typedef double V __attribute__((vector_size (2 * sizeof (double)))); +typedef long long W __attribute__((vector_size (2 * sizeof (long long)))); + +__attribute__((noipa)) void +foo (V *r) +{ + V y = { 1.0, 2.0 }; + W m = { 10000000001LL, 0LL }; + *r = __builtin_shuffle (y, m); +} + +int +main () +{ + V r; + foo (&r); + if (r[0] != 2.0 || r[1] != 1.0) + __builtin_abort (); + return 0; +} Added: 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr85529-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr85529-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr85529-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr85529-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,28 @@ +/* PR tree-optimization/85529 */ + +struct S { int a; }; + +int b, c = 1, d, e, f; +static int g; +volatile struct S s; + +signed char +foo (signed char i, int j) +{ + return i < 0 ? i : i << j; +} + +int +main () +{ + signed char k = -83; + if (!d) + goto L; + k = e || f; +L: + for (; b < 1; b++) + s.a != (k < foo (k, 2) && (c = k = g)); + if (c != 1) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr85529-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr85529-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr85529-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr85529-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,25 @@ +/* PR tree-optimization/85529 */ + +__attribute__((noipa)) int +foo (int x) +{ + x &= 63; + x -= 50; + x |= 1; + if (x < 0) + return 1; + int y = x >> 2; + if (x >= y) + return 1; + return 0; +} + +int +main () +{ + int i; + for (i = 0; i < 63; i++) + if (foo (i) != 1) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr85582-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr85582-1.c?rev=374156&view=auto 
============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr85582-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr85582-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,21 @@ +/* PR target/85582 */ + +int a, b, d = 2, e; +long long c = 1; + +int +main () +{ + int g = 6; +L1: + e = d; + if (a) + goto L1; + g--; + int i = c >> ~(~e | ~g); +L2: + c = (b % c) * i; + if (!e) + goto L2; + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr85582-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr85582-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr85582-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr85582-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,51 @@ +/* PR target/85582 */ + +#ifdef __SIZEOF_INT128__ +typedef __int128 S; +typedef unsigned __int128 U; +#else +typedef long long S; +typedef unsigned long long U; +#endif + +__attribute__((noipa)) S +f1 (S x, int y) +{ + x = x << (y & 5); + x += y; + return x; +} + +__attribute__((noipa)) S +f2 (S x, int y) +{ + x = x >> (y & 5); + x += y; + return x; +} + +__attribute__((noipa)) U +f3 (U x, int y) +{ + x = x >> (y & 5); + x += y; + return x; +} + +int +main () +{ + S a = (S) 1 << (sizeof (S) * __CHAR_BIT__ - 7); + S b = f1 (a, 12); + if (b != ((S) 1 << (sizeof (S) * __CHAR_BIT__ - 3)) + 12) + __builtin_abort (); + S c = (U) 1 << (sizeof (S) * __CHAR_BIT__ - 1); + S d = f2 (c, 12); + if ((U) d != ((U) 0x1f << (sizeof (S) * __CHAR_BIT__ - 5)) + 12) + __builtin_abort (); + U e = (U) 1 << (sizeof (U) * __CHAR_BIT__ - 1); + U f = f3 (c, 12); + if (f != ((U) 1 << (sizeof (U) * __CHAR_BIT__ - 5)) + 12) + __builtin_abort (); + return 0; +} Added: 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr85582-3.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr85582-3.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr85582-3.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr85582-3.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,55 @@ +/* PR target/85582 */ + +#ifdef __SIZEOF_INT128__ +typedef __int128 S; +typedef unsigned __int128 U; +#else +typedef long long S; +typedef unsigned long long U; +#endif + +__attribute__((noipa)) U +f1 (U x, int y) +{ + return x << (y & -2); +} + +__attribute__((noipa)) S +f2 (S x, int y) +{ + return x >> (y & -2); +} + +__attribute__((noipa)) U +f3 (U x, int y) +{ + return x >> (y & -2); +} + +int +main () +{ + U a = (U) 1 << (sizeof (U) * __CHAR_BIT__ - 7); + if (f1 (a, 5) != ((U) 1 << (sizeof (S) * __CHAR_BIT__ - 3))) + __builtin_abort (); + S b = (U) 0x101 << (sizeof (S) * __CHAR_BIT__ / 2 - 7); + if (f1 (b, sizeof (S) * __CHAR_BIT__ / 2) != (U) 0x101 << (sizeof (S) * __CHAR_BIT__ - 7)) + __builtin_abort (); + if (f1 (b, sizeof (S) * __CHAR_BIT__ / 2 + 2) != (U) 0x101 << (sizeof (S) * __CHAR_BIT__ - 5)) + __builtin_abort (); + S c = (U) 1 << (sizeof (S) * __CHAR_BIT__ - 1); + if ((U) f2 (c, 5) != ((U) 0x1f << (sizeof (S) * __CHAR_BIT__ - 5))) + __builtin_abort (); + if ((U) f2 (c, sizeof (S) * __CHAR_BIT__ / 2) != ((U) -1 << (sizeof (S) * __CHAR_BIT__ / 2 - 1))) + __builtin_abort (); + if ((U) f2 (c, sizeof (S) * __CHAR_BIT__ / 2 + 2) != ((U) -1 << (sizeof (S) * __CHAR_BIT__ / 2 - 3))) + __builtin_abort (); + U d = (U) 1 << (sizeof (S) * __CHAR_BIT__ - 1); + if (f3 (c, 5) != ((U) 0x1 << (sizeof (S) * __CHAR_BIT__ - 5))) + __builtin_abort (); + if (f3 (c, sizeof (S) * __CHAR_BIT__ / 2) != ((U) 1 << (sizeof (S) * __CHAR_BIT__ / 2 - 1))) + __builtin_abort (); + if (f3 
(c, sizeof (S) * __CHAR_BIT__ / 2 + 2) != ((U) 1 << (sizeof (S) * __CHAR_BIT__ / 2 - 3))) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr85756.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr85756.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr85756.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr85756.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,50 @@ +/* PR target/85756 */ + +#if __CHAR_BIT__ == 8 && __SIZEOF_SHORT__ == 2 && __SIZEOF_INT__ == 4 +int a, c, *e, f, h = 10; +short b; +unsigned int p; + +__attribute__((noipa)) void +bar (int a) +{ + asm volatile ("" : : "r" (a) : "memory"); +} + +void +foo () +{ + unsigned j = 1, m = 430523; + int k, n = 1, *l = &h; +lab: + p = m; + m = -((~65535U | j) - n); + f = b << ~(n - 8); + n = (m || b) ^ f; + j = p; + if (p < m) + *l = k < 3; + if (!n) + l = &k; + if (c) + { + bar (a); + goto lab; + } + if (!*l) + *e = 1; +} + +int +main () +{ + foo (); + return 0; +} +#else +int +main () +{ + return 0; +} +#endif Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr86231.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr86231.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr86231.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr86231.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,30 @@ +/* PR tree-optimization/86231 */ + +#define ONE ((void *) 1) +#define TWO ((void *) 2) + +__attribute__((noipa)) int +foo (void *p, int x) +{ + if (p == ONE) return 0; + if (!p) + p = x ? TWO : ONE; + return p == ONE ? 
0 : 1; +} + +int v[8]; + +int +main () +{ + if (foo ((void *) 0, 0) != 0 + || foo ((void *) 0, 1) != 1 + || foo (ONE, 0) != 0 + || foo (ONE, 1) != 0 + || foo (TWO, 0) != 1 + || foo (TWO, 1) != 1 + || foo (&v[7], 0) != 1 + || foo (&v[7], 1) != 1) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr86492.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr86492.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr86492.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr86492.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,34 @@ +/* PR tree-optimization/86492 */ + +union U +{ + unsigned int r; + struct S + { + unsigned int a:12; + unsigned int b:4; + unsigned int c:16; + } f; +}; + +__attribute__((noipa)) unsigned int +foo (unsigned int x) +{ + union U u; + u.r = 0; + u.f.c = x; + u.f.b = 0xe; + return u.r; +} + +int +main () +{ + union U u; + if (__CHAR_BIT__ * __SIZEOF_INT__ != 32 || sizeof (u.r) != sizeof (u.f)) + return 0; + u.r = foo (0x72); + if (u.f.a != 0 || u.f.b != 0xe || u.f.c != 0x72) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr86528.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr86528.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr86528.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr86528.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,21 @@ +// { dg-require-effective-target alloca } +/* PR middle-end/86528 */ + +void __attribute__((noinline, noclone)) +test(char *data, __SIZE_TYPE__ len) +{ + static char const appended[] = "/./"; 
+ char *buf = __builtin_alloca (len + sizeof appended); + __builtin_memcpy (buf, data, len); + __builtin_strcpy (buf + len, &appended[data[len - 1] == '/']); + if (__builtin_strcmp(buf, "test1234/./")) + __builtin_abort(); +} + +int +main() +{ + char *arg = "test1234/"; + test(arg, __builtin_strlen(arg)); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr86714.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr86714.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr86714.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr86714.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,26 @@ +/* PR tree-optimization/86714 - tree-ssa-forwprop.c confused by too + long initializer + + The excessively long initializer for a[0] is undefined but this + test verifies that the excess elements are not considered a part + of the value of the array as a matter of QoI. 
*/ + +const char a[2][3] = { "1234", "xyz" }; +char b[6]; + +void *pb = b; + +int main () +{ + __builtin_memcpy (b, a, 4); + __builtin_memset (b + 4, 'a', 2); + + if (b[0] != '1' || b[1] != '2' || b[2] != '3' + || b[3] != 'x' || b[4] != 'a' || b[5] != 'a') + __builtin_abort (); + + if (__builtin_memcmp (pb, "123xaa", 6)) + __builtin_abort (); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr86844.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr86844.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr86844.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr86844.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,24 @@ +/* PR tree-optimization/86844 */ + +__attribute__((noipa)) void +foo (int *p) +{ + *p = 0; + *((char *)p + 3) = 1; + *((char *)p + 1) = 2; + *((char *)p + 2) = *((char *)p + 6); +} + +int +main () +{ + int a[2] = { -1, 0 }; + if (sizeof (int) != 4) + return 0; + ((char *)a)[6] = 3; + foo (a); + if (((char *)a)[0] != 0 || ((char *)a)[1] != 2 + || ((char *)a)[2] != 3 || ((char *)a)[3] != 1) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr87053.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr87053.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr87053.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr87053.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,17 @@ +/* PR middle-end/87053 */ + +const union +{ struct { + char x[4]; + char y[4]; + }; + struct { + char z[8]; + }; +} u = {{"1234", "567"}}; + +int main () +{ + if (__builtin_strlen (u.z) != 7) + 
__builtin_abort (); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr87290.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr87290.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr87290.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr87290.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,63 @@ +/* PR middle-end/87290 */ + +int c; + +__attribute__((noipa)) void +f0 (void) +{ + c++; +} + +__attribute__((noipa)) int +f1 (int x) +{ + return x % 16 == 13; +} + +__attribute__((noipa)) int +f2 (int x) +{ + return x % 16 == -13; +} + +__attribute__((noipa)) void +f3 (int x) +{ + if (x % 16 == 13) + f0 (); +} + +__attribute__((noipa)) void +f4 (int x) +{ + if (x % 16 == -13) + f0 (); +} + +int +main () +{ + int i, j; + for (i = -30; i < 30; i++) + { + if (f1 (13 + i * 16) != (i >= 0) || f2 (-13 + i * 16) != (i <= 0)) + __builtin_abort (); + f3 (13 + i * 16); + if (c != (i >= 0)) + __builtin_abort (); + f4 (-13 + i * 16); + if (c != 1 + (i == 0)) + __builtin_abort (); + for (j = 1; j < 16; j++) + { + if (f1 (13 + i * 16 + j) || f2 (-13 + i * 16 + j)) + __builtin_abort (); + f3 (13 + i * 16 + j); + f4 (-13 + i * 16 + j); + } + if (c != 1 + (i == 0)) + __builtin_abort (); + c = 0; + } + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr87623.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr87623.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr87623.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr87623.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,34 @@ +/* PR middle-end/87623 */ +/* Testcase by George 
Thopas */ + +struct be { + unsigned short pad[1]; + unsigned char a; + unsigned char b; +} __attribute__((scalar_storage_order("big-endian"))); + +typedef struct be t_be; + +struct le { + unsigned short pad[3]; + unsigned char a; + unsigned char b; +}; + +typedef struct le t_le; + +int a_or_b_different(t_be *x,t_le *y) +{ + return (x->a != y->a) || (x->b != y->b); +} + +int main (void) +{ + t_be x = { .a=1, .b=2 }; + t_le y = { .a=1, .b=2 }; + + if (a_or_b_different(&x,&y)) + __builtin_abort (); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr88693.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr88693.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr88693.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr88693.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,54 @@ +/* PR tree-optimization/88693 */ + +__attribute__((noipa)) void +foo (char *p) +{ + if (__builtin_strlen (p) != 9) + __builtin_abort (); +} + +__attribute__((noipa)) void +quux (char *p) +{ + int i; + for (i = 0; i < 100; i++) + if (p[i] != 'x') + __builtin_abort (); +} + +__attribute__((noipa)) void +qux (void) +{ + char b[100]; + __builtin_memset (b, 'x', sizeof (b)); + quux (b); +} + +__attribute__((noipa)) void +bar (void) +{ + static unsigned char u[9] = "abcdefghi"; + char b[100]; + __builtin_memcpy (b, u, sizeof (u)); + b[sizeof (u)] = 0; + foo (b); +} + +__attribute__((noipa)) void +baz (void) +{ + static unsigned char u[] = { 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r' }; + char b[100]; + __builtin_memcpy (b, u, sizeof (u)); + b[sizeof (u)] = 0; + foo (b); +} + +int +main () +{ + qux (); + bar (); + baz (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr88714.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr88714.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr88714.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr88714.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,43 @@ +/* PR bootstrap/88714 */ + +struct S { int a, b, c; int *d; }; +struct T { int *e, *f, *g; } *t = 0; +int *o = 0; + +__attribute__((noipa)) +void bar (int *x, int y, int z, int w) +{ + if (w == -1) + { + if (x != 0 || y != 0 || z != 0) + __builtin_abort (); + } + else if (w != 0 || x != t->g || y != 0 || z != 12) + __builtin_abort (); +} + +__attribute__((noipa)) void +foo (struct S *x, struct S *y, int *z, int w) +{ + *o = w; + if (w) + bar (0, 0, 0, -1); + x->d = z; + if (y->d) + y->c = y->c + y->d[0]; + bar (t->g, 0, y->c, 0); +} + +int +main () +{ + int a[4] = { 8, 9, 10, 11 }; + struct S s = { 1, 2, 3, &a[0] }; + struct T u = { 0, 0, &a[3] }; + o = &a[2]; + t = &u; + foo (&s, &s, &a[1], 5); + if (s.c != 12 || s.d != &a[1]) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr88739.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr88739.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr88739.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr88739.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,59 @@ +/* PR tree-optimization/88739 */ +#if __SIZEOF_SHORT__ == 2 && __SIZEOF_INT__ == 4 && __CHAR_BIT__ == 8 +struct A +{ + unsigned int a, b, c; + unsigned int d : 30; + unsigned int e : 2; +}; + +union U +{ + struct A f; + unsigned int g[4]; + unsigned short h[8]; + unsigned char i[16]; +}; +volatile 
union U v = { .f.d = 0x4089 }; + +__attribute__((noipa)) void +bar (int x) +{ + static int i; + switch (i++) + { + case 0: if (x != v.f.d) __builtin_abort (); break; + case 1: if (x != v.f.e) __builtin_abort (); break; + case 2: if (x != v.g[3]) __builtin_abort (); break; + case 3: if (x != v.h[6]) __builtin_abort (); break; + case 4: if (x != v.h[7]) __builtin_abort (); break; + default: __builtin_abort (); break; + } +} + +void +foo (unsigned int x) +{ + union U u; + u.f.d = x >> 2; + u.f.e = 0; + bar (u.f.d); + bar (u.f.e); + bar (u.g[3]); + bar (u.h[6]); + bar (u.h[7]); +} + +int +main () +{ + foo (0x10224); + return 0; +} +#else +int +main () +{ + return 0; +} +#endif Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr88904.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr88904.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr88904.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr88904.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,38 @@ +/* PR rtl-optimization/88904 */ + +volatile int v; + +__attribute__((noipa)) void +bar (const char *x, const char *y, int z) +{ + if (!v) + __builtin_abort (); + asm volatile ("" : "+g" (x)); + asm volatile ("" : "+g" (y)); + asm volatile ("" : "+g" (z)); +} + +#define my_assert(e) ((e) ? 
(void) 0 : bar (#e, __FILE__, __LINE__)) + +typedef struct { + unsigned M1; + unsigned M2 : 1; + int : 0; + unsigned M3 : 1; +} S; + +S +foo () +{ + S result = {0, 0, 1}; + return result; +} + +int +main () +{ + S ret = foo (); + my_assert (ret.M2 == 0); + my_assert (ret.M3 == 1); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr89195.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr89195.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr89195.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr89195.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,22 @@ +/* PR rtl-optimization/89195 */ +/* { dg-require-effective-target int32plus } */ + +struct S { unsigned i : 24; }; + +volatile unsigned char x; + +__attribute__((noipa)) int +foo (struct S d) +{ + return d.i & x; +} + +int +main () +{ + struct S d = { 0x123456 }; + x = 0x75; + if (foo (d) != (0x56 & 0x75)) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr89369.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr89369.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr89369.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr89369.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,69 @@ +/* PR target/89369 */ + +#if __SIZEOF_INT__ == 4 && __SIZEOF_LONG_LONG__ == 8 && __CHAR_BIT__ == 8 +struct S { unsigned int u[4]; }; + +static void +foo (struct S *out, struct S const *in, int shift) +{ + unsigned long long th, tl, oh, ol; + th = ((unsigned long long) in->u[3] << 32) | in->u[2]; + tl = ((unsigned long long) in->u[1] << 32) | 
in->u[0]; + oh = th >> (shift * 8); + ol = tl >> (shift * 8); + ol |= th << (64 - shift * 8); + out->u[1] = ol >> 32; + out->u[0] = ol; + out->u[3] = oh >> 32; + out->u[2] = oh; +} + +static void +bar (struct S *out, struct S const *in, int shift) +{ + unsigned long long th, tl, oh, ol; + th = ((unsigned long long) in->u[3] << 32) | in->u[2]; + tl = ((unsigned long long) in->u[1] << 32) | in->u[0]; + oh = th << (shift * 8); + ol = tl << (shift * 8); + oh |= tl >> (64 - shift * 8); + out->u[1] = ol >> 32; + out->u[0] = ol; + out->u[3] = oh >> 32; + out->u[2] = oh; +} + +__attribute__((noipa)) static void +baz (struct S *r, struct S *a, struct S *b, struct S *c, struct S *d) +{ + struct S x, y; + bar (&x, a, 1); + foo (&y, c, 1); + r->u[0] = a->u[0] ^ x.u[0] ^ ((b->u[0] >> 11) & 0xdfffffefU) ^ y.u[0] ^ (d->u[0] << 18); + r->u[1] = a->u[1] ^ x.u[1] ^ ((b->u[1] >> 11) & 0xddfecb7fU) ^ y.u[1] ^ (d->u[1] << 18); + r->u[2] = a->u[2] ^ x.u[2] ^ ((b->u[2] >> 11) & 0xbffaffffU) ^ y.u[2] ^ (d->u[2] << 18); + r->u[3] = a->u[3] ^ x.u[3] ^ ((b->u[3] >> 11) & 0xbffffff6U) ^ y.u[3] ^ (d->u[3] << 18); +} + +int +main () +{ + struct S a[] = { { 0x000004d3, 0xbc5448db, 0xf22bde9f, 0xebb44f8f }, + { 0x03a32799, 0x60be8246, 0xa2d266ed, 0x7aa18536 }, + { 0x15a38518, 0xcf655ce1, 0xf3e09994, 0x50ef69fe }, + { 0x88274b07, 0xe7c94866, 0xc0ea9f47, 0xb6a83c43 }, + { 0xcd0d0032, 0x5d47f5d7, 0x5a0afbf6, 0xaea87b24 }, + { 0, 0, 0, 0 } }; + baz (&a[5], &a[0], &a[1], &a[2], &a[3]); + if (a[4].u[0] != a[5].u[0] || a[4].u[1] != a[5].u[1] + || a[4].u[2] != a[5].u[2] || a[4].u[3] != a[5].u[3]) + __builtin_abort (); + return 0; +} +#else +int +main () +{ + return 0; +} +#endif Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr89434.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr89434.c?rev=374156&view=auto ============================================================================== --- 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr89434.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr89434.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,29 @@ +/* PR target/89434 */ + +#if __SIZEOF_INT__ == 4 && __SIZEOF_LONG_LONG__ == 8 && __CHAR_BIT__ == 8 +long g = 0; + +static inline unsigned long long +foo (unsigned long long u) +{ + unsigned x; + __builtin_mul_overflow (-1, g, &x); + u |= (unsigned) u < (unsigned short) x; + return x - u; +} + +int +main () +{ + unsigned long long x = foo (0x222222222ULL); + if (x != 0xfffffffddddddddeULL) + __builtin_abort (); + return 0; +} +#else +int +main () +{ + return 0; +} +#endif Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr89634.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr89634.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr89634.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr89634.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,40 @@ +/* PR rtl-optimization/89634 */ + +static unsigned long * +foo (unsigned long *x) +{ + return x + (1 + *x); +} + +__attribute__((noipa)) unsigned long +bar (unsigned long *x) +{ + unsigned long c, d = 1, e, *f, g, h = 0, i; + for (e = *x - 1; e > 0; e--) + { + f = foo (x + 1); + for (i = 1; i < e; i++) + f = foo (f); + c = *f; + if (c == 2) + d *= 2; + else + { + i = (c - 1) / 2 - 1; + g = (2 * i + 1) * (d + 1) + (2 * d + 1); + if (g > h) + h = g; + d *= c; + } + } + return h; +} + +int +main () +{ + unsigned long a[18] = { 4, 2, -200, 200, 2, -400, 400, 3, -600, 0, 600, 5, -100, -66, 0, 66, 100, __LONG_MAX__ / 8 + 1 }; + if (bar (a) != 17) + __builtin_abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr89826.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr89826.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr89826.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr89826.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,20 @@ +typedef unsigned int u32; +typedef unsigned long long u64; +u64 a; +u32 b; + +u64 +foo (u32 d) +{ + a -= d ? 0 : ~a; + return a + b; +} + +int +main (void) +{ + u64 x = foo (2); + if (x != 0) + __builtin_abort(); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr90025.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr90025.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr90025.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr90025.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,28 @@ +/* PR middle-end/90025 */ + +__attribute__((noipa)) void +bar (char *p) +{ + int i; + for (i = 0; i < 6; i++) + if (p[i] != "foobar"[i]) + __builtin_abort (); + for (; i < 32; i++) + if (p[i] != '\0') + __builtin_abort (); +} + +__attribute__((noipa)) void +foo (unsigned int x) +{ + char s[32] = { 'f', 'o', 'o', 'b', 'a', 'r', 0 }; + ((unsigned int *) s)[2] = __builtin_bswap32 (x); + bar (s); +} + +int +main () +{ + foo (0); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr90949.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr90949.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr90949.c (added) +++ 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr90949.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,42 @@ +void __attribute__ ((noipa, noinline)) my_puts (const char *str) { } + +void __attribute__ ((noipa, noinline)) my_free (void *p) { } + + +struct Node +{ + struct Node *child; +}; + +struct Node space[2] = { }; + +struct Node * __attribute__ ((noipa, noinline)) my_malloc (int bytes) +{ + return &space[0]; +} + +void +walk (struct Node *module, int cleanup) +{ + if (module == 0) + { + return; + } + if (!cleanup) + { + my_puts ("No cleanup"); + } + walk (module->child, cleanup); + if (cleanup) + { + my_free (module); + } +} + +int +main () +{ + struct Node *node = my_malloc (sizeof (struct Node)); + node->child = 0; + walk (node, 1); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr91137.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr91137.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr91137.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pr91137.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,33 @@ +long long a; +unsigned b; +int c[70]; +int d[70][70]; +int e; + +__attribute__ ((noinline)) void f(long long *g, int p2) { + *g = p2; +} + +__attribute__ ((noinline)) void fn2() { + for (int j = 0; j < 70; j++) { + for (int i = 0; i < 70; i++) { + if (b) + c[i] = 0; + for (int l = 0; l < 70; l++) + d[i][1] = d[l][i]; + } + for (int k = 0; k < 70; k++) + e = c[0]; + } +} + +int main() { + b = 5; + for (int j = 0; j < 70; ++j) + c[j] = 2075593088; + fn2(); + f(&a, e); + if (a) + __builtin_abort(); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/printf-1.c URL: 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/printf-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/printf-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/printf-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,25 @@ +/* { dg-skip-if "requires io" { freestanding } } */ + +#include +#include + +int +main (void) +{ +#define test(ret, args...) \ + printf (args); \ + if (printf (args) != ret) \ + abort (); + test (5, "hello"); + test (6, "hello\n"); + test (1, "a"); + test (0, ""); + test (5, "%s", "hello"); + test (6, "%s", "hello\n"); + test (1, "%s", "a"); + test (0, "%s", ""); + test (1, "%c", 'x'); + test (7, "%s\n", "hello\n"); + test (2, "%d\n", 0); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/printf-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/printf-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/printf-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/printf-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,61 @@ +/* Verify that calls to printf don't get eliminated even if their + result on success can be computed at compile time (they can fail). + The calls can still be transformed into those of other functions. 
+ { dg-require-effective-target unwrapped } + { dg-skip-if "requires io" { freestanding } } */ + +#include +#include +#include + +__attribute__ ((noipa)) void +write_file (void) +{ + printf ("1"); + printf ("%c", '2'); + printf ("%c%c", '3', '4'); + printf ("%s", "5"); + printf ("%s%s", "6", "7"); + printf ("%i", 8); + printf ("%.1s\n", "9x"); +} + + +int main (void) +{ + char *tmpfname = tmpnam (0); + FILE *f = freopen (tmpfname, "w", stdout); + if (!f) + { + perror ("fopen for writing"); + return 1; + } + + write_file (); + fclose (f); + + f = fopen (tmpfname, "r"); + if (!f) + { + perror ("fopen for reading"); + remove (tmpfname); + return 1; + } + + char buf[12] = ""; + if (1 != fscanf (f, "%s", buf)) + { + perror ("fscanf"); + fclose (f); + remove (tmpfname); + return 1; + } + + fclose (f); + remove (tmpfname); + + if (strcmp (buf, "123456789")) + abort (); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/printf-chk-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/printf-chk-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/printf-chk-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/printf-chk-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,51 @@ +/* { dg-skip-if "requires io" { freestanding } } */ + +#include +#include +#include + +volatile int should_optimize; + +int +__attribute__((noinline)) +__printf_chk (int flag, const char *fmt, ...) +{ + va_list ap; + int ret; +#ifdef __OPTIMIZE__ + if (should_optimize) + abort (); +#endif + should_optimize = 1; + va_start (ap, fmt); + ret = vprintf (fmt, ap); + va_end (ap); + return ret; +} + +int +main (void) +{ +#define test(ret, opt, args...) 
\ + should_optimize = opt; \ + __printf_chk (1, args); \ + if (!should_optimize) \ + abort (); \ + should_optimize = 0; \ + if (__printf_chk (1, args) != ret) \ + abort (); \ + if (!should_optimize) \ + abort (); + test (5, 0, "hello"); + test (6, 1, "hello\n"); + test (1, 1, "a"); + test (0, 1, ""); + test (5, 0, "%s", "hello"); + test (6, 1, "%s", "hello\n"); + test (1, 1, "%s", "a"); + test (0, 1, "%s", ""); + test (1, 1, "%c", 'x'); + test (7, 1, "%s\n", "hello\n"); + test (2, 0, "%d\n", 0); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pta-field-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pta-field-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pta-field-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pta-field-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,29 @@ +struct Foo { + int *p; + int *q; +}; + +void __attribute__((noinline)) +bar (int **x) +{ + struct Foo *f = (struct Foo *)x; + *(f->q) = 0; +} + +int foo(void) +{ + struct Foo f; + int i = 1, j = 2; + f.p = &i; + f.q = &j; + bar(&f.p); + return j; +} + +extern void abort (void); +int main() +{ + if (foo () != 0) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pta-field-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pta-field-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pta-field-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pta-field-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,29 @@ +struct Foo { + int *p; + int *q; +}; + +void __attribute__((noinline)) +bar (int **x) +{ + struct 
Foo *f = (struct Foo *)(x - 1); + *(f->p) = 0; +} + +int foo(void) +{ + struct Foo f; + int i = 1, j = 2; + f.p = &i; + f.q = &j; + bar(&f.q); + return i; +} + +extern void abort (void); +int main() +{ + if (foo () != 0) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ptr-arith-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ptr-arith-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ptr-arith-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ptr-arith-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,14 @@ +char * +f (char *s, unsigned int i) +{ + return &s[i + 3 - 1]; +} + +main () +{ + char *str = "abcdefghijkl"; + char *x2 = f (str, 12); + if (str + 14 != x2) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pure-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pure-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pure-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pure-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,91 @@ + +/* Origin: Kaveh Ghazi 2002-05-27. */ + +/* Use a different function for each test so the link failures + indicate which one is broken. 
*/ +extern void link_error0 (void); +extern void link_error1 (void); +extern void link_error2 (void); +extern void link_error3 (void); +extern void link_error4 (void); +extern void link_error5 (void); +extern void link_error6 (void); +extern void link_error7 (void); + +extern int i; + +extern int func0 (int) __attribute__ ((__pure__)); +extern int func1 (int) __attribute__ ((__const__)); + +/* GCC should automatically detect attributes for these functions. + At -O3 They'll be inlined, but that's ok. */ +static int func2 (int a) { return i + a; } /* pure */ +static int func3 (int a) { return a * 3; } /* const */ +static int func4 (int a) { return func0(a) + a; } /* pure */ +static int func5 (int a) { return a + func1(a); } /* const */ +static int func6 (int a) { return func2(a) + a; } /* pure */ +static int func7 (int a) { return a + func3(a); } /* const */ + +int main () +{ + int i[10], r; + + i[0] = 0; + r = func0(0); + if (i[0]) + link_error0(); + + i[1] = 0; + r = func1(0); + if (i[1]) + link_error1(); + + i[2] = 0; + r = func2(0); + if (i[2]) + link_error2(); + + i[3] = 0; + r = func3(0); + if (i[3]) + link_error3(); + + i[4] = 0; + r = func4(0); + if (i[4]) + link_error4(); + + i[5] = 0; + r = func5(0); + if (i[5]) + link_error5(); + + i[6] = 0; + r = func6(0); + if (i[6]) + link_error6(); + + i[7] = 0; + r = func7(0); + if (i[7]) + link_error7(); + + return r; +} + +int func0 (int a) { return a - i; } /* pure */ +int func1 (int a) { return a - a; } /* const */ + +int i = 2; + +#ifndef __OPTIMIZE__ +/* Avoid link failures when not optimizing. */ +void link_error0() {} +void link_error1() {} +void link_error2() {} +void link_error3() {} +void link_error4() {} +void link_error5() {} +void link_error6() {} +void link_error7() {} +#endif /* ! 
__OPTIMIZE__ */ Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pushpop_macro.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pushpop_macro.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pushpop_macro.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/pushpop_macro.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,14 @@ +extern void abort (); + +#define _ 2 +#pragma push_macro("_") +#undef _ +#define _ 1 +#pragma pop_macro("_") + +int main () +{ + if (_ != 2) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/regstack-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/regstack-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/regstack-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/regstack-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,26 @@ +long double C = 5; +long double U = 1; +long double Y2 = 11; +long double Y1 = 17; +long double X, Y, Z, T, R, S; +main () +{ + X = (C + U) * Y2; + Y = C - U - U; + Z = C + U + U; + T = (C - U) * Y1; + X = X - (Z + U); + R = Y * Y1; + S = Z * Y2; + T = T - Y; + Y = (U - Y) + R; + Z = S - (Z + U + U); + R = (Y2 + U) * Y1; + Y1 = Y2 * Y1; + R = R - Y2; + Y1 = Y1 - 0.5L; + if (Z != 68. || Y != 49. || X != 58. || Y1 != 186.5 || R != 193. || S != 77. + || T != 65. || Y2 != 11.) 
+ abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/restrict-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/restrict-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/restrict-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/restrict-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,31 @@ +/* PR rtl-optimization/16536 + Origin: Jeremy Denise + Reduced: Wolfgang Bangerth + Volker Reichelt */ +/* { dg-options "-fgnu89-inline" } */ + +extern void abort (); + +typedef struct +{ + int i, dummy; +} A; + +inline A foo (const A* p, const A* q) +{ + return (A){p->i+q->i}; +} + +void bar (A* __restrict__ p) +{ + *p=foo(p,p); + if (p->i!=2) + abort(); +} + +int main () +{ + A a={1}; + bar(&a); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/return-addr.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/return-addr.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/return-addr.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/return-addr.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,122 @@ +/* Test to verify that a function that returns either the address + of a local variable or a non-local via a MAX_EXPR or MIN_EXPR + doesn't return null when the result of the expression is + the latter. */ + +#define NOIPA __attribute__ ((noclone, noinline, noipa)) + +#define A(expr) \ + ((expr) \ + ? (void)0 \ + : (__builtin_printf ("assertion failed on line %i: %s\n", \ + __LINE__, #expr), \ + __builtin_abort ())) + + +typedef __UINTPTR_TYPE__ uintptr_t; + +/* Return a bigger value than P. 
The address still points (just + past) the local variable pointed to by P so the caller does + return the address of a local variable but that's hidden from + GCC by the attribute and the point of the test is to verify + that the address in the return statement in the caller isn't + replaced by null when GCC cannot prove the address doesn't + reference a non-local variable. */ + +NOIPA char* get_max_2 (char *p) +{ + return p + 1; +} + +NOIPA char* get_max_3 (char *p, char *q) +{ + return p < q ? q + 1 : p + 1; +} + +/* Analogous to the above. The expressions are undefined because + they form an address prior to the beginning of the object but + it's hidden from GCC by the attributes. */ + +NOIPA char* get_min_2 (char *p) +{ + return p - 1; +} + +NOIPA char* get_min_3 (char *p, char *q) +{ + return p < q ? p - 1 : q - 1; +} + + +NOIPA void* test_max_2 (void) +{ + char c; + + char *p = get_max_2 (&c); + + void *q = p > &c ? p : &c; /* MAX_EXPR */ + return q; +} + +NOIPA void* test_max_3 (void) +{ + char c; + char d; + + char *p = get_max_3 (&c, &d); + + void *q = p < &c ? &c < &d ? &d : &c : p; + return q; +} + +NOIPA void* test_min_2 (void) +{ + char c; + + char *p = get_min_2 (&c); + + void *q = p < &c ? p : &c; /* MIN_EXPR" */ + return q; +} + +NOIPA void* test_min_3 (void) +{ + char c; + char d; + + char *p = get_min_3 (&c, &d); + + void *q = p > &c ? &c > &d ? &d : &c : p; + return q; +} + +NOIPA void* test_min_3_phi (int i) +{ + char a, b; + + char *p0 = &a; + char *p1 = &b; + char *p2 = get_min_3 (&a, &b); + char *p3 = get_min_3 (&a, &b); + + char *p4 = p2 < p0 ? p2 : p0; + char *p5 = p3 < p1 ? 
p3 : p1; + + __builtin_printf ("%p %p %p %p\n", p2, p3, p4, p5); + + if (i == 1) + return p4; + else + return p5; +} + +int main () +{ + A (0 != test_max_2 ()); + A (0 != test_max_3 ()); + + A (0 != test_min_2 ()); + A (0 != test_min_3 ()); + + A (0 != test_min_3_phi (0)); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/scal-to-vec1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/scal-to-vec1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/scal-to-vec1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/scal-to-vec1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,86 @@ +#define vector(elcount, type) \ +__attribute__((vector_size((elcount)*sizeof(type)))) type + +#define vidx(type, vec, idx) (*((type *) &(vec) + idx)) + +#define operl(a, b, op) (a op b) +#define operr(a, b, op) (b op a) + +#define check(type, count, vec0, vec1, num, op, lr) \ +do {\ + int __i; \ + for (__i = 0; __i < count; __i++) {\ + if (vidx (type, vec1, __i) != oper##lr (num, vidx (type, vec0, __i), op)) \ + __builtin_abort (); \ + }\ +} while (0) + +#define veccompare(type, count, v0, v1) \ +do {\ + int __i; \ + for (__i = 0; __i < count; __i++) { \ + if (vidx (type, v0, __i) != vidx (type, v1, __i)) \ + __builtin_abort (); \ + } \ +} while (0) + +volatile int one = 1; + +int main (int argc, char *argv[]) { +#define fvec_2 (vector(4, float)){2., 2., 2., 2.} +#define dvec_2 (vector(2, double)){2., 2.} + + + vector(8, short) v0 = {one, 1, 2, 3, 4, 5, 6, 7}; + vector(8, short) v1; + + vector(4, float) f0 = {1., 2., 3., 4.}; + vector(4, float) f1, f2; + + vector(2, double) d0 = {1., 2.}; + vector(2, double) d1, d2; + + + + v1 = 2 + v0; check (short, 8, v0, v1, 2, +, l); + v1 = 2 - v0; check (short, 8, v0, v1, 2, -, l); + v1 = 2 * v0; check (short, 8, v0, v1, 2, *, l); + v1 = 2 
/ v0; check (short, 8, v0, v1, 2, /, l); + v1 = 2 % v0; check (short, 8, v0, v1, 2, %, l); + v1 = 2 ^ v0; check (short, 8, v0, v1, 2, ^, l); + v1 = 2 & v0; check (short, 8, v0, v1, 2, &, l); + v1 = 2 | v0; check (short, 8, v0, v1, 2, |, l); + v1 = 2 << v0; check (short, 8, v0, v1, 2, <<, l); + v1 = 2 >> v0; check (short, 8, v0, v1, 2, >>, l); + + v1 = v0 + 2; check (short, 8, v0, v1, 2, +, r); + v1 = v0 - 2; check (short, 8, v0, v1, 2, -, r); + v1 = v0 * 2; check (short, 8, v0, v1, 2, *, r); + v1 = v0 / 2; check (short, 8, v0, v1, 2, /, r); + v1 = v0 % 2; check (short, 8, v0, v1, 2, %, r); + v1 = v0 ^ 2; check (short, 8, v0, v1, 2, ^, r); + v1 = v0 & 2; check (short, 8, v0, v1, 2, &, r); + v1 = v0 | 2; check (short, 8, v0, v1, 2, |, r); + + f1 = 2. + f0; f2 = fvec_2 + f0; veccompare (float, 4, f1, f2); + f1 = 2. - f0; f2 = fvec_2 - f0; veccompare (float, 4, f1, f2); + f1 = 2. * f0; f2 = fvec_2 * f0; veccompare (float, 4, f1, f2); + f1 = 2. / f0; f2 = fvec_2 / f0; veccompare (float, 4, f1, f2); + + f1 = f0 + 2.; f2 = f0 + fvec_2; veccompare (float, 4, f1, f2); + f1 = f0 - 2.; f2 = f0 - fvec_2; veccompare (float, 4, f1, f2); + f1 = f0 * 2.; f2 = f0 * fvec_2; veccompare (float, 4, f1, f2); + f1 = f0 / 2.; f2 = f0 / fvec_2; veccompare (float, 4, f1, f2); + + d1 = 2. + d0; d2 = dvec_2 + d0; veccompare (double, 2, d1, d2); + d1 = 2. - d0; d2 = dvec_2 - d0; veccompare (double, 2, d1, d2); + d1 = 2. * d0; d2 = dvec_2 * d0; veccompare (double, 2, d1, d2); + d1 = 2. 
/ d0; d2 = dvec_2 / d0; veccompare (double, 2, d1, d2); + + d1 = d0 + 2.; d2 = d0 + dvec_2; veccompare (double, 2, d1, d2); + d1 = d0 - 2.; d2 = d0 - dvec_2; veccompare (double, 2, d1, d2); + d1 = d0 * 2.; d2 = d0 * dvec_2; veccompare (double, 2, d1, d2); + d1 = d0 / 2.; d2 = d0 / dvec_2; veccompare (double, 2, d1, d2); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/scal-to-vec2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/scal-to-vec2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/scal-to-vec2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/scal-to-vec2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,62 @@ +#define vector(elcount, type) \ +__attribute__((vector_size((elcount)*sizeof(type)))) type + +#define vidx(type, vec, idx) (*((type *) &(vec) + idx)) + +#define operl(a, b, op) (a op b) +#define operr(a, b, op) (b op a) + +#define check(type, count, vec0, vec1, num, op, lr) \ +do {\ + int __i; \ + for (__i = 0; __i < count; __i++) {\ + if (vidx (type, vec1, __i) != oper##lr (num, vidx (type, vec0, __i), op)) \ + __builtin_abort (); \ + }\ +} while (0) + +#define veccompare(type, count, v0, v1) \ +do {\ + int __i; \ + for (__i = 0; __i < count; __i++) { \ + if (vidx (type, v0, __i) != vidx (type, v1, __i)) \ + __builtin_abort (); \ + } \ +} while (0) + + +long __attribute__ ((noinline)) vlng () { return (long)42; } +int __attribute__ ((noinline)) vint () { return (int) 43; } +short __attribute__ ((noinline)) vsrt () { return (short)42; } +char __attribute__ ((noinline)) vchr () { return (char)42; } + + +int main (int argc, char *argv[]) { + vector(16, char) c0 = {argc, 1,2,3,4,5,6,7, argc, 1,2,3,4,5,6,7}; + vector(16, char) c1; + + vector(8, short) s0 = {argc, 1,2,3,4,5,6,7}; + vector(8, short) s1; + + vector(4, int) 
i0 = {argc, 1, 2, 3}; + vector(4, int) i1; + + vector(2, long) l0 = {argc, 1}; + vector(2, long) l1; + + c1 = vchr() + c0; check (char, 16, c0, c1, vchr(), +, l); + + s1 = vsrt() + s0; check (short, 8, s0, s1, vsrt(), +, l); + s1 = vchr() + s0; check (short, 8, s0, s1, vchr(), +, l); + + i1 = vint() * i0; check (int, 4, i0, i1, vint(), *, l); + i1 = vsrt() * i0; check (int, 4, i0, i1, vsrt(), *, l); + i1 = vchr() * i0; check (int, 4, i0, i1, vchr(), *, l); + + l1 = vlng() * l0; check (long, 2, l0, l1, vlng(), *, l); + l1 = vint() * l0; check (long, 2, l0, l1, vint(), *, l); + l1 = vsrt() * l0; check (long, 2, l0, l1, vsrt(), *, l); + l1 = vchr() * l0; check (long, 2, l0, l1, vchr(), *, l); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/scal-to-vec3.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/scal-to-vec3.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/scal-to-vec3.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/scal-to-vec3.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,48 @@ +#define vector(elcount, type) \ +__attribute__((vector_size((elcount)*sizeof(type)))) type + +#define vidx(type, vec, idx) (*((type *) &(vec) + idx)) + +#define veccompare(type, count, v0, v1) \ +do {\ + int __i; \ + for (__i = 0; __i < count; __i++) { \ + if (vidx (type, v0, __i) != vidx (type, v1, __i)) \ + __builtin_abort (); \ + } \ +} while (0) + + +int main (int argc, char *argv[]) { +#define fvec_2 (vector(4, float)){2., 2., 2., 2.} +#define dvec_2 (vector(2, double)){2., 2.} + + vector(4, float) f0 = {1., 2., 3., 4.}; + vector(4, float) f1, f2; + + vector(2, double) d0 = {1., 2.}; + vector(2, double) d1, d2; + + + f1 = 2 + f0; f2 = fvec_2 + f0; veccompare (float, 4, f1, f2); + f1 = 2 - f0; f2 = fvec_2 - f0; veccompare (float, 4, f1, f2); + 
f1 = 2 * f0; f2 = fvec_2 * f0; veccompare (float, 4, f1, f2); + f1 = 2 / f0; f2 = fvec_2 / f0; veccompare (float, 4, f1, f2); + + f1 = f0 + 2; f2 = f0 + fvec_2; veccompare (float, 4, f1, f2); + f1 = f0 - 2; f2 = f0 - fvec_2; veccompare (float, 4, f1, f2); + f1 = f0 * 2; f2 = f0 * fvec_2; veccompare (float, 4, f1, f2); + f1 = f0 / 2; f2 = f0 / fvec_2; veccompare (float, 4, f1, f2); + + d1 = 2 + d0; d2 = dvec_2 + d0; veccompare (double, 2, d1, d2); + d1 = 2 - d0; d2 = dvec_2 - d0; veccompare (double, 2, d1, d2); + d1 = 2 * d0; d2 = dvec_2 * d0; veccompare (double, 2, d1, d2); + d1 = 2 / d0; d2 = dvec_2 / d0; veccompare (double, 2, d1, d2); + + d1 = d0 + 2; d2 = d0 + dvec_2; veccompare (double, 2, d1, d2); + d1 = d0 - 2; d2 = d0 - dvec_2; veccompare (double, 2, d1, d2); + d1 = d0 * 2; d2 = d0 * dvec_2; veccompare (double, 2, d1, d2); + d1 = d0 / 2; d2 = d0 / dvec_2; veccompare (double, 2, d1, d2); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/scope-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/scope-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/scope-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/scope-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,17 @@ +int v = 3; + +f () +{ + int v = 4; + { + extern int v; + if (v != 3) + abort (); + } +} + +main () +{ + f (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/shiftdi-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/shiftdi-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/shiftdi-2.c (added) +++ 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/shiftdi-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,24 @@ +/* { dg-require-effective-target longlong64 } */ + +long long a = 568513516876543756; +long long b = -754324895235774564; +unsigned long long c = 156789543257562457; + +long long expected_a[64] = {568513516876543756, 1137027033753087512, 2274054067506175024, 4548108135012350048, 9096216270024700096, -254311533660151424, -508623067320302848, -1017246134640605696, -2034492269281211392, -4068984538562422784, -8137969077124845568, 2170805919459860480, 4341611838919720960, 8683223677839441920, -1080296718030667776, -2160593436061335552, -4321186872122671104, -8642373744245342208, 1161996585218867200, 2323993170437734400, 4647986340875468800, -9150771391958614016, 145201289792323584, 290402579584647168, 580805159169294336, 1161610318338588672, 2323220636677177344, 4646441273354354688, -9153861527000842240, 139021019707867136, 278042039415734272, 556084078831468544, 1112168157662937088, 2224336315325874176, 4448672630651748352, 8897345261303496704, -652053551102558208, -1304107102205116416, -2608214204410232832, -5216428408820465664, 8013887256068620288, -2418969561572311040, -4837939123144622080, 8770865827420307456, -905012418868936704, -1810024837737873408, -3620049675475746816, -7240099350951493632, 3966545371806564352, 7933090743613128704, -2580562586483294208, -5161125172966588416, 8124493727776374784, -2197756618156802048, -4395513236313604096, -8791026472627208192, 864691128455135232, 1729382256910270464, 3458764513820540928, 6917529027641081856, -4611686018427387904, -9223372036854775808ULL, 0, 0}; +long long expected_b[64] = {-754324895235774564, -377162447617887282, -188581223808943641, -94290611904471821, -47145305952235911, -23572652976117956, -11786326488058978, -5893163244029489, -2946581622014745, -1473290811007373, -736645405503687, -368322702751844, -184161351375922, -92080675687961, -46040337843981, -23020168921991, -11510084460996, 
-5755042230498, -2877521115249, -1438760557625, -719380278813, -359690139407, -179845069704, -89922534852, -44961267426, -22480633713, -11240316857, -5620158429, -2810079215, -1405039608, -702519804, -351259902, -175629951, -87814976, -43907488, -21953744, -10976872, -5488436, -2744218, -1372109, -686055, -343028, -171514, -85757, -42879, -21440, -10720, -5360, -2680, -1340, -670, -335, -168, -84, -42, -21, -11, -6, -3, -2, -1, -1, -1, -1}; +unsigned long long expected_c[64] = {156789543257562457, 78394771628781228, 39197385814390614, 19598692907195307, 9799346453597653, 4899673226798826, 2449836613399413, 1224918306699706, 612459153349853, 306229576674926, 153114788337463, 76557394168731, 38278697084365, 19139348542182, 9569674271091, 4784837135545, 2392418567772, 1196209283886, 598104641943, 299052320971, 149526160485, 74763080242, 37381540121, 18690770060, 9345385030, 4672692515, 2336346257, 1168173128, 584086564, 292043282, 146021641, 73010820, 36505410, 18252705, 9126352, 4563176, 2281588, 1140794, 570397, 285198, 142599, 71299, 35649, 17824, 8912, 4456, 2228, 1114, 557, 278, 139, 69, 34, 17, 8, 4, 2, 1, 0, 0, 0, 0, 0, 0}; + +int +main (void) +{ + int i; + + for (i = 0; i < 64; i++) + { + if ((a << i) != expected_a[i] + || (b >> i) != expected_b[i] + || (c >> i) != expected_c[i]) + __builtin_abort (); + } + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/shiftdi.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/shiftdi.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/shiftdi.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/shiftdi.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,19 @@ +/* { dg-require-effective-target int32plus } */ + +/* Failed on sparc with -mv8plus because sparc.c:set_extends() thought + erroneously that 
SImode ASHIFT chops the upper bits, it does not. */ + +typedef unsigned long long uint64; + +void g(uint64 x, int y, int z, uint64 *p) +{ + unsigned w = ((x >> y) & 0xffffffffULL) << (z & 0x1f); + *p |= (w & 0xffffffffULL) << z; +} + +int main(void) +{ + uint64 a = 0; + g(0xdeadbeef01234567ULL, 0, 0, &a); + return (a == 0x01234567) ? 0 : 1; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/shiftopt-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/shiftopt-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/shiftopt-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/shiftopt-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,66 @@ +/* Copyright (C) 2002 Free Software Foundation + + Check that constant folding of shift operations is working. + + Roger Sayle, 10th October 2002. 
*/ + +extern void abort (void); +extern void link_error (void); + +void +utest (unsigned int x) +{ + if (x >> 0 != x) + link_error (); + + if (x << 0 != x) + link_error (); + + if (0 << x != 0) + link_error (); + + if (0 >> x != 0) + link_error (); + + if (-1 >> x != -1) + link_error (); + + if (~0 >> x != ~0) + link_error (); +} + +void +stest (int x) +{ + if (x >> 0 != x) + link_error (); + + if (x << 0 != x) + link_error (); + + if (0 << x != 0) + link_error (); + + if (0 >> x != 0) + link_error (); +} + +int +main () +{ + utest(9); + utest(0); + + stest(9); + stest(0); + + return 0; +} + +#ifndef __OPTIMIZE__ +void +link_error () +{ + abort (); +} +#endif Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/simd-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/simd-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/simd-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/simd-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,74 @@ +/* Origin: Aldy Hernandez + + Purpose: Test generic SIMD support. This test should work + regardless of if the target has SIMD instructions. +*/ + +typedef int __attribute__((mode(SI))) __attribute__((vector_size (16))) vecint; +typedef int __attribute__((mode(SI))) siint; + +vecint i = { 150, 100, 150, 200 }; +vecint j = { 10, 13, 20, 30 }; +vecint k; + +union { + vecint v; + siint i[4]; +} res; + +/* This should go away once we can use == and != on vector types. 
*/ +void +verify (siint a1, siint a2, siint a3, siint a4, + siint b1, siint b2, siint b3, siint b4) +{ + if (a1 != b1 + || a2 != b2 + || a3 != b3 + || a4 != b4) + abort (); +} + +int +main () +{ + k = i + j; + res.v = k; + + verify (res.i[0], res.i[1], res.i[2], res.i[3], 160, 113, 170, 230); + + k = i * j; + res.v = k; + + verify (res.i[0], res.i[1], res.i[2], res.i[3], 1500, 1300, 3000, 6000); + + k = i / j; + res.v = k; + + verify (res.i[0], res.i[1], res.i[2], res.i[3], 15, 7, 7, 6); + + k = i & j; + res.v = k; + + verify (res.i[0], res.i[1], res.i[2], res.i[3], 2, 4, 20, 8); + + k = i | j; + res.v = k; + + verify (res.i[0], res.i[1], res.i[2], res.i[3], 158, 109, 150, 222); + + k = i ^ j; + res.v = k; + + verify (res.i[0], res.i[1], res.i[2], res.i[3], 156, 105, 130, 214); + + k = -i; + res.v = k; + verify (res.i[0], res.i[1], res.i[2], res.i[3], + -150, -100, -150, -200); + + k = ~i; + res.v = k; + verify (res.i[0], res.i[1], res.i[2], res.i[3], -151, -101, -151, -201); + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/simd-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/simd-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/simd-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/simd-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,72 @@ +/* + Purpose: Test generic SIMD support, V8HImode. This test should work + regardless of if the target has SIMD instructions. +*/ + +typedef short __attribute__((vector_size (16))) vecint; + +vecint i = { 150, 100, 150, 200, 0, 0, 0, 0 }; +vecint j = { 10, 13, 20, 30, 1, 1, 1, 1 }; +vecint k; + +union { + vecint v; + short i[8]; +} res; + +/* This should go away once we can use == and != on vector types. 
*/ +void +verify (int a1, int a2, int a3, int a4, + int b1, int b2, int b3, int b4) +{ + if (a1 != b1 + || a2 != b2 + || a3 != b3 + || a4 != b4) + abort (); +} + +int +main () +{ + k = i + j; + res.v = k; + + verify (res.i[0], res.i[1], res.i[2], res.i[3], 160, 113, 170, 230); + + k = i * j; + res.v = k; + + verify (res.i[0], res.i[1], res.i[2], res.i[3], 1500, 1300, 3000, 6000); + + k = i / j; + res.v = k; + + verify (res.i[0], res.i[1], res.i[2], res.i[3], 15, 7, 7, 6); + + k = i & j; + res.v = k; + + verify (res.i[0], res.i[1], res.i[2], res.i[3], 2, 4, 20, 8); + + k = i | j; + res.v = k; + + verify (res.i[0], res.i[1], res.i[2], res.i[3], 158, 109, 150, 222); + + k = i ^ j; + res.v = k; + + verify (res.i[0], res.i[1], res.i[2], res.i[3], 156, 105, 130, 214); + + k = -i; + res.v = k; + verify (res.i[0], res.i[1], res.i[2], res.i[3], + -150, -100, -150, -200); + + k = ~i; + res.v = k; + verify (res.i[0], res.i[1], res.i[2], res.i[3], -151, -101, -151, -201); + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/simd-4.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/simd-4.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/simd-4.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/simd-4.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,21 @@ +/* { dg-require-effective-target stdint_types } */ +#include +typedef int32_t __attribute__((vector_size(8))) v2si; +int64_t s64; + +static inline int64_t +__ev_convert_s64 (v2si a) +{ + return (int64_t) a; +} + +int main() +{ + union { int64_t ll; int32_t i[2]; } endianness_test; + endianness_test.ll = 1; + int32_t little_endian = endianness_test.i[0]; + s64 = __ev_convert_s64 ((v2si){1,0xffffffff}); + if (s64 != (little_endian ? 
0xffffffff00000001LL : 0x1ffffffffLL)) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/simd-5.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/simd-5.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/simd-5.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/simd-5.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,59 @@ +/* Test saving and restoring of SIMD registers. */ + +typedef short Q __attribute__((vector_size(8))); + +Q q1 = {1, 2}, q2 = {3, 4}, q3 = {5, 6}, q4 = {7, 8}; + +Q w1, w2, w3, w4; +Q z1, z2, z3, z4; + +volatile int dummy; + +void __attribute__((__noinline__)) +func0 (void) +{ + dummy = 1; +} + +void __attribute__((__noinline__)) +func1 (void) +{ + Q a, b; + a = q1 * q2; + b = q3 * q4; + w1 = a; + w2 = b; + func0 (); + w3 = a; + w4 = b; +} + +void __attribute__((__noinline__)) +func2 (void) +{ + Q a, b; + a = q1 + q2; + b = q3 - q4; + z1 = a; + z2 = b; + func1 (); + z3 = a; + z4 = b; +} + +int +main (void) +{ + func2 (); + + if (memcmp (&w1, &w3, sizeof (Q)) != 0) + abort (); + if (memcmp (&w2, &w4, sizeof (Q)) != 0) + abort (); + if (memcmp (&z1, &z3, sizeof (Q)) != 0) + abort (); + if (memcmp (&z2, &z4, sizeof (Q)) != 0) + abort (); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/simd-6.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/simd-6.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/simd-6.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/simd-6.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,22 @@ +extern void abort (void); +extern int memcmp (const void *, const 
void *, __SIZE_TYPE__); + +typedef unsigned char v8qi __attribute__((vector_size(8))); + +v8qi foo(v8qi x, v8qi y) +{ + return x * y; +} + +int main() +{ + v8qi a = { 1, 2, 3, 4, 5, 6, 7, 8 }; + v8qi b = { 3, 3, 3, 3, 3, 3, 3, 3 }; + v8qi c = { 3, 6, 9, 12, 15, 18, 21, 24 }; + v8qi r; + + r = foo (a, b); + if (memcmp (&r, &c, 8) != 0) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ssad-run.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ssad-run.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ssad-run.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/ssad-run.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,49 @@ +extern void abort (); +extern int abs (int __x) __attribute__ ((__nothrow__, __leaf__)) __attribute__ ((__const__)); + +static int +foo (signed char *w, int i, signed char *x, int j) +{ + int tot = 0; + for (int a = 0; a < 16; a++) + { + for (int b = 0; b < 16; b++) + tot += abs (w[b] - x[b]); + w += i; + x += j; + } + return tot; +} + +void +bar (signed char *w, signed char *x, int i, int *result) +{ + *result = foo (w, 16, x, i); +} + +int +main (void) +{ + signed char m[256]; + signed char n[256]; + int sum, i; + + for (i = 0; i < 256; ++i) + if (i % 2 == 0) + { + m[i] = (i % 8) * 2 + 1; + n[i] = -(i % 8); + } + else + { + m[i] = -((i % 8) * 2 + 2); + n[i] = -((i % 8) >> 1); + } + + bar (m, n, 16, &sum); + + if (sum != 2368) + abort (); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/stdarg-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/stdarg-1.c?rev=374156&view=auto ============================================================================== --- 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/stdarg-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/stdarg-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,156 @@ +#include + +extern void abort (void); + +int foo_arg, bar_arg; +long x; +double d; +va_list gap; +va_list *pap; + +void +foo (int v, va_list ap) +{ + switch (v) + { + case 5: foo_arg = va_arg (ap, int); break; + default: abort (); + } +} + +void +bar (int v) +{ + if (v == 0x4006) + { + if (va_arg (gap, double) != 17.0 + || va_arg (gap, long) != 129L) + abort (); + } + else if (v == 0x4008) + { + if (va_arg (*pap, long long) != 14LL + || va_arg (*pap, long double) != 131.0L + || va_arg (*pap, int) != 17) + abort (); + } + bar_arg = v; +} + +void +f0 (int i, ...) +{ +} + +void +f1 (int i, ...) +{ + va_list ap; + va_start (ap, i); + va_end (ap); +} + +void +f2 (int i, ...) +{ + va_list ap; + va_start (ap, i); + bar (d); + x = va_arg (ap, long); + bar (x); + va_end (ap); +} + +void +f3 (int i, ...) +{ + va_list ap; + va_start (ap, i); + d = va_arg (ap, double); + va_end (ap); +} + +void +f4 (int i, ...) +{ + va_list ap; + va_start (ap, i); + x = va_arg (ap, double); + foo (i, ap); + va_end (ap); +} + +void +f5 (int i, ...) +{ + va_list ap; + va_start (ap, i); + va_copy (gap, ap); + bar (i); + va_end (ap); + va_end (gap); +} + +void +f6 (int i, ...) +{ + va_list ap; + va_start (ap, i); + bar (d); + va_arg (ap, long); + va_arg (ap, long); + x = va_arg (ap, long); + bar (x); + va_end (ap); +} + +void +f7 (int i, ...) +{ + va_list ap; + va_start (ap, i); + pap = &ap; + bar (i); + va_end (ap); +} + +void +f8 (int i, ...) 
+{ + va_list ap; + va_start (ap, i); + pap = &ap; + bar (i); + d = va_arg (ap, double); + va_end (ap); +} + +int +main (void) +{ + f0 (1); + f1 (2); + d = 31.0; + f2 (3, 28L); + if (bar_arg != 28 || x != 28) + abort (); + f3 (4, 131.0); + if (d != 131.0) + abort (); + f4 (5, 16.0, 128); + if (x != 16 || foo_arg != 128) + abort (); + f5 (0x4006, 17.0, 129L); + if (bar_arg != 0x4006) + abort (); + f6 (7, 12L, 14L, -31L); + if (bar_arg != -31) + abort (); + f7 (0x4008, 14LL, 131.0L, 17, 26.0); + if (bar_arg != 0x4008) + abort (); + f8 (0x4008, 14LL, 131.0L, 17, 27.0); + if (bar_arg != 0x4008 || d != 27.0) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/stdarg-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/stdarg-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/stdarg-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/stdarg-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,180 @@ +#include + +extern void abort (void); + +int foo_arg, bar_arg; +long x; +double d; +va_list gap; + +void +foo (int v, va_list ap) +{ + switch (v) + { + case 5: + foo_arg = va_arg (ap, int); + foo_arg += va_arg (ap, double); + foo_arg += va_arg (ap, long long); + break; + case 8: + foo_arg = va_arg (ap, long long); + foo_arg += va_arg (ap, double); + break; + case 11: + foo_arg = va_arg (ap, int); + foo_arg += va_arg (ap, long double); + break; + default: + abort (); + } +} + +void +bar (int v) +{ + if (v == 0x4002) + { + if (va_arg (gap, int) != 13 || va_arg (gap, double) != -14.0) + abort (); + } + bar_arg = v; +} + +void +f1 (int i, ...) +{ + va_start (gap, i); + x = va_arg (gap, long); + va_end (gap); +} + +void +f2 (int i, ...) +{ + va_start (gap, i); + bar (i); + va_end (gap); +} + +void +f3 (int i, ...) 
+{ + va_list aps[10]; + va_start (aps[4], i); + x = va_arg (aps[4], long); + va_end (aps[4]); +} + +void +f4 (int i, ...) +{ + va_list aps[10]; + va_start (aps[4], i); + bar (i); + va_end (aps[4]); +} + +void +f5 (int i, ...) +{ + va_list aps[10]; + va_start (aps[4], i); + foo (i, aps[4]); + va_end (aps[4]); +} + +struct A { int i; va_list g; va_list h[2]; }; + +void +f6 (int i, ...) +{ + struct A a; + va_start (a.g, i); + x = va_arg (a.g, long); + va_end (a.g); +} + +void +f7 (int i, ...) +{ + struct A a; + va_start (a.g, i); + bar (i); + va_end (a.g); +} + +void +f8 (int i, ...) +{ + struct A a; + va_start (a.g, i); + foo (i, a.g); + va_end (a.g); +} + +void +f10 (int i, ...) +{ + struct A a; + va_start (a.h[1], i); + x = va_arg (a.h[1], long); + va_end (a.h[1]); +} + +void +f11 (int i, ...) +{ + struct A a; + va_start (a.h[1], i); + bar (i); + va_end (a.h[1]); +} + +void +f12 (int i, ...) +{ + struct A a; + va_start (a.h[1], i); + foo (i, a.h[1]); + va_end (a.h[1]); +} + +int +main (void) +{ + f1 (1, 79L); + if (x != 79L) + abort (); + f2 (0x4002, 13, -14.0); + if (bar_arg != 0x4002) + abort (); + f3 (3, 2031L); + if (x != 2031) + abort (); + f4 (4, 18); + if (bar_arg != 4) + abort (); + f5 (5, 1, 19.0, 18LL); + if (foo_arg != 38) + abort (); + f6 (6, 18L); + if (x != 18L) + abort (); + f7 (7); + if (bar_arg != 7) + abort (); + f8 (8, 2031LL, 13.0); + if (foo_arg != 2044) + abort (); + f10 (9, 180L); + if (x != 180L) + abort (); + f11 (10); + if (bar_arg != 10) + abort (); + f12 (11, 2030, 12.0L); + if (foo_arg != 2042) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/stdarg-3.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/stdarg-3.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/stdarg-3.c (added) +++ 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/stdarg-3.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,166 @@ +#include + +extern void abort (void); + +int foo_arg, bar_arg; +long x; +double d; +va_list gap; +struct S1 { int i; double d; int j; double e; } s1; +struct S2 { double d; long i; } s2; +int y; + +void +bar (int v) +{ + bar_arg = v; +} + +void +f1 (int i, ...) +{ + va_list ap; + va_start (ap, i); + while (i-- > 0) + x = va_arg (ap, long); + va_end (ap); +} + +void +f2 (int i, ...) +{ + va_list ap; + va_start (ap, i); + while (i-- > 0) + d = va_arg (ap, double); + va_end (ap); +} + +void +f3 (int i, ...) +{ + va_list ap; + int j = i; + while (j-- > 0) + { + va_start (ap, i); + x = va_arg (ap, long); + va_end (ap); + bar (x); + } +} + +void +f4 (int i, ...) +{ + va_list ap; + int j = i; + while (j-- > 0) + { + va_start (ap, i); + d = va_arg (ap, double); + va_end (ap); + bar (d + 4.0); + } +} + +void +f5 (int i, ...) +{ + va_list ap; + va_start (ap, i); + while (i-- > 0) + s1 = va_arg (ap, struct S1); + va_end (ap); +} + +void +f6 (int i, ...) +{ + va_list ap; + va_start (ap, i); + while (i-- > 0) + s2 = va_arg (ap, struct S2); + va_end (ap); +} + +void +f7 (int i, ...) +{ + va_list ap; + int j = i; + while (j-- > 0) + { + va_start (ap, i); + s1 = va_arg (ap, struct S1); + va_end (ap); + bar (s1.i); + } +} + +void +f8 (int i, ...) 
+{ + va_list ap; + int j = i; + while (j-- > 0) + { + va_start (ap, i); + s2 = va_arg (ap, struct S2); + y = va_arg (ap, int); + va_end (ap); + bar (s2.i); + } +} + +int +main (void) +{ + struct S1 a1, a3; + struct S2 a2, a4; + + f1 (7, 1L, 2L, 3L, 5L, 7L, 9L, 11L, 13L); + if (x != 11L) + abort (); + f2 (6, 1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 64.0); + if (d != 32.0) + abort (); + f3 (2, 1L, 3L); + if (bar_arg != 1L || x != 1L) + abort (); + f4 (2, 17.0, 19.0); + if (bar_arg != 21 || d != 17.0) + abort (); + a1.i = 131; + a1.j = 251; + a1.d = 15.0; + a1.e = 191.0; + a3 = a1; + a3.j = 254; + a3.e = 178.0; + f5 (2, a1, a3, a1); + if (s1.i != 131 || s1.j != 254 || s1.d != 15.0 || s1.e != 178.0) + abort (); + f5 (3, a1, a3, a1); + if (s1.i != 131 || s1.j != 251 || s1.d != 15.0 || s1.e != 191.0) + abort (); + a2.i = 138; + a2.d = 16.0; + a4.i = 257; + a4.d = 176.0; + f6 (2, a2, a4, a2); + if (s2.i != 257 || s2.d != 176.0) + abort (); + f6 (3, a2, a4, a2); + if (s2.i != 138 || s2.d != 16.0) + abort (); + f7 (2, a3, a1, a1); + if (s1.i != 131 || s1.j != 254 || s1.d != 15.0 || s1.e != 178.0) + abort (); + if (bar_arg != 131) + abort (); + f8 (3, a4, a2, a2); + if (s2.i != 257 || s2.d != 176.0) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/stdarg-4.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/stdarg-4.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/stdarg-4.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/stdarg-4.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,137 @@ +#include + +extern void abort (void); +long x, y; + +inline void __attribute__((always_inline)) +f1i (va_list ap) +{ + x = va_arg (ap, double); + x += va_arg (ap, long); + x += va_arg (ap, double); +} + +void +f1 (int i, ...) 
+{ + va_list ap; + va_start (ap, i); + f1i (ap); + va_end (ap); +} + +inline void __attribute__((always_inline)) +f2i (va_list ap) +{ + y = va_arg (ap, int); + y += va_arg (ap, long); + y += va_arg (ap, double); + f1i (ap); +} + +void +f2 (int i, ...) +{ + va_list ap; + va_start (ap, i); + f2i (ap); + va_end (ap); +} + +long +f3h (int i, long arg0, long arg1, long arg2, long arg3) +{ + return i + arg0 + arg1 + arg2 + arg3; +} + +long +f3 (int i, ...) +{ + long t, arg0, arg1, arg2, arg3; + va_list ap; + + va_start (ap, i); + switch (i) + { + case 0: + t = f3h (i, 0, 0, 0, 0); + break; + case 1: + arg0 = va_arg (ap, long); + t = f3h (i, arg0, 0, 0, 0); + break; + case 2: + arg0 = va_arg (ap, long); + arg1 = va_arg (ap, long); + t = f3h (i, arg0, arg1, 0, 0); + break; + case 3: + arg0 = va_arg (ap, long); + arg1 = va_arg (ap, long); + arg2 = va_arg (ap, long); + t = f3h (i, arg0, arg1, arg2, 0); + break; + case 4: + arg0 = va_arg (ap, long); + arg1 = va_arg (ap, long); + arg2 = va_arg (ap, long); + arg3 = va_arg (ap, long); + t = f3h (i, arg0, arg1, arg2, arg3); + break; + default: + abort (); + } + va_end (ap); + + return t; +} + +void +f4 (int i, ...) 
+{ + va_list ap; + + va_start (ap, i); + switch (i) + { + case 4: + y = va_arg (ap, double); + break; + case 5: + y = va_arg (ap, double); + y += va_arg (ap, double); + break; + default: + abort (); + } + f1i (ap); + va_end (ap); +} + +int +main (void) +{ + f1 (3, 16.0, 128L, 32.0); + if (x != 176L) + abort (); + f2 (6, 5, 7L, 18.0, 19.0, 17L, 64.0); + if (x != 100L || y != 30L) + abort (); + if (f3 (0) != 0) + abort (); + if (f3 (1, 18L) != 19L) + abort (); + if (f3 (2, 18L, 100L) != 120L) + abort (); + if (f3 (3, 18L, 100L, 300L) != 421L) + abort (); + if (f3 (4, 18L, 71L, 64L, 86L) != 243L) + abort (); + f4 (4, 6.0, 9.0, 16L, 18.0); + if (x != 43L || y != 6L) + abort (); + f4 (5, 7.0, 21.0, 1.0, 17L, 126.0); + if (x != 144L || y != 28L) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/stkalign.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/stkalign.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/stkalign.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/stkalign.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,44 @@ +/* { dg-options "-fno-inline" } */ +/* Check that stack alignment is not affected by variables not placed + on the stack. */ + +#include + +#define ALIGNMENT 64 + +unsigned test(unsigned n, unsigned p) +{ + static struct { char __attribute__((__aligned__(ALIGNMENT))) c; } s; + unsigned x; + + assert(__alignof__(s) == ALIGNMENT); + asm ("" : "=g" (x), "+m" (s) : "0" (&x)); + + return n ? test(n - 1, x) : (x ^ p); +} + +unsigned test2(unsigned n, unsigned p) +{ + static struct { char c; } s; + unsigned x; + + assert(__alignof__(s) != ALIGNMENT); + asm ("" : "=g" (x), "+m" (s) : "0" (&x)); + + return n ? 
test2(n - 1, x) : (x ^ p); +} + +int main (int argc, char *argv[] __attribute__((unused))) +{ + unsigned int x, y; + + x = test(argc, 0); + x |= test(argc + 1, 0); + x |= test(argc + 2, 0); + + y = test2(argc, 0); + y |= test2(argc + 1, 0); + y |= test2(argc + 2, 0); + + return (x & (ALIGNMENT - 1)) == 0 && (y & (ALIGNMENT - 1)) != 0 ? 1 : 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/strcmp-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/strcmp-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/strcmp-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/strcmp-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,131 @@ +/* Copyright (C) 2002 Free Software Foundation. + + Test strcmp with various combinations of pointer alignments and lengths to + make sure any optimizations in the library are correct. + + Written by Michael Meissner, March 9, 2002. 
*/ + +#include +#include + +#ifndef MAX_OFFSET +#define MAX_OFFSET (sizeof (long long)) +#endif + +#ifndef MAX_TEST +#define MAX_TEST (8 * sizeof (long long)) +#endif + +#ifndef MAX_EXTRA +#define MAX_EXTRA (sizeof (long long)) +#endif + +#define MAX_LENGTH (MAX_OFFSET + MAX_TEST + MAX_EXTRA + 2) + +static union { + unsigned char buf[MAX_LENGTH]; + long long align_int; + long double align_fp; +} u1, u2; + +void +test (const unsigned char *s1, const unsigned char *s2, int expected) +{ + int value = strcmp ((char *) s1, (char *) s2); + + if (expected < 0 && value >= 0) + abort (); + else if (expected == 0 && value != 0) + abort (); + else if (expected > 0 && value <= 0) + abort (); +} + +main () +{ + size_t off1, off2, len, i; + unsigned char *buf1, *buf2; + unsigned char *mod1, *mod2; + unsigned char *p1, *p2; + + for (off1 = 0; off1 < MAX_OFFSET; off1++) + for (off2 = 0; off2 < MAX_OFFSET; off2++) + for (len = 0; len < MAX_TEST; len++) + { + p1 = u1.buf; + for (i = 0; i < off1; i++) + *p1++ = '\0'; + + buf1 = p1; + for (i = 0; i < len; i++) + *p1++ = 'a'; + + mod1 = p1; + for (i = 0; i < MAX_EXTRA+2; i++) + *p1++ = 'x'; + + p2 = u2.buf; + for (i = 0; i < off2; i++) + *p2++ = '\0'; + + buf2 = p2; + for (i = 0; i < len; i++) + *p2++ = 'a'; + + mod2 = p2; + for (i = 0; i < MAX_EXTRA+2; i++) + *p2++ = 'x'; + + mod1[0] = '\0'; + mod2[0] = '\0'; + test (buf1, buf2, 0); + + mod1[0] = 'a'; + mod1[1] = '\0'; + mod2[0] = '\0'; + test (buf1, buf2, +1); + + mod1[0] = '\0'; + mod2[0] = 'a'; + mod2[1] = '\0'; + test (buf1, buf2, -1); + + mod1[0] = 'b'; + mod1[1] = '\0'; + mod2[0] = 'c'; + mod2[1] = '\0'; + test (buf1, buf2, -1); + + mod1[0] = 'c'; + mod1[1] = '\0'; + mod2[0] = 'b'; + mod2[1] = '\0'; + test (buf1, buf2, +1); + + mod1[0] = 'b'; + mod1[1] = '\0'; + mod2[0] = (unsigned char)'\251'; + mod2[1] = '\0'; + test (buf1, buf2, -1); + + mod1[0] = (unsigned char)'\251'; + mod1[1] = '\0'; + mod2[0] = 'b'; + mod2[1] = '\0'; + test (buf1, buf2, +1); + + mod1[0] = (unsigned 
char)'\251'; + mod1[1] = '\0'; + mod2[0] = (unsigned char)'\252'; + mod2[1] = '\0'; + test (buf1, buf2, -1); + + mod1[0] = (unsigned char)'\252'; + mod1[1] = '\0'; + mod2[0] = (unsigned char)'\251'; + mod2[1] = '\0'; + test (buf1, buf2, +1); + } + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/strcpy-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/strcpy-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/strcpy-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/strcpy-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,75 @@ +/* Copyright (C) 2002 Free Software Foundation. + + Test strcpy with various combinations of pointer alignments and lengths to + make sure any optimizations in the library are correct. */ + +#include + +#ifndef MAX_OFFSET +#define MAX_OFFSET (sizeof (long long)) +#endif + +#ifndef MAX_COPY +#define MAX_COPY (10 * sizeof (long long)) +#endif + +#ifndef MAX_EXTRA +#define MAX_EXTRA (sizeof (long long)) +#endif + +#define MAX_LENGTH (MAX_OFFSET + MAX_COPY + 1 + MAX_EXTRA) + +/* Use a sequence length that is not divisible by two, to make it more + likely to detect when words are mixed up. 
*/ +#define SEQUENCE_LENGTH 31 + +static union { + char buf[MAX_LENGTH]; + long long align_int; + long double align_fp; +} u1, u2; + +main () +{ + int off1, off2, len, i; + char *p, *q, c; + + for (off1 = 0; off1 < MAX_OFFSET; off1++) + for (off2 = 0; off2 < MAX_OFFSET; off2++) + for (len = 1; len < MAX_COPY; len++) + { + for (i = 0, c = 'A'; i < MAX_LENGTH; i++, c++) + { + u1.buf[i] = 'a'; + if (c >= 'A' + SEQUENCE_LENGTH) + c = 'A'; + u2.buf[i] = c; + } + u2.buf[off2 + len] = '\0'; + + p = strcpy (u1.buf + off1, u2.buf + off2); + if (p != u1.buf + off1) + abort (); + + q = u1.buf; + for (i = 0; i < off1; i++, q++) + if (*q != 'a') + abort (); + + for (i = 0, c = 'A' + off2; i < len; i++, q++, c++) + { + if (c >= 'A' + SEQUENCE_LENGTH) + c = 'A'; + if (*q != c) + abort (); + } + + if (*q++ != '\0') + abort (); + for (i = 0; i < MAX_EXTRA; i++, q++) + if (*q != 'a') + abort (); + } + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/strcpy-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/strcpy-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/strcpy-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/strcpy-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,24 @@ +/* Test to make sure strcpy works correctly. 
*/ +#define STRING "Hi!THE" + +const char a[] = STRING; + +void f(char *a) __attribute__((noinline)); +void f(char *a) +{ + __builtin_strcpy (a, STRING); +} + + +int main(void) +{ + int i; + char b[sizeof(a)] = {}; + f(b); + for(i = 0; i < sizeof(b); i++) + { + if (a[i] != b[i]) + __builtin_abort (); + } + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/strct-pack-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/strct-pack-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/strct-pack-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/strct-pack-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,25 @@ +typedef struct +{ + short s __attribute__ ((aligned(2), packed)); + double d __attribute__ ((aligned(2), packed)); +} TRIAL; + +int +check (TRIAL *t) +{ + if (t->s != 1 || t->d != 16.0) + return 1; + return 0; +} + +main () +{ + TRIAL trial; + + trial.s = 1; + trial.d = 16.0; + + if (check (&trial) != 0) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/strct-pack-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/strct-pack-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/strct-pack-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/strct-pack-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,13 @@ +typedef struct +{ + short a __attribute__ ((aligned (2),packed)); + short *ap[2] __attribute__ ((aligned (2),packed)); +} A; + +main () +{ + short i, j = 1; + A a, *ap = &a; + ap->ap[j] = &i; + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/strct-pack-3.c 
URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/strct-pack-3.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/strct-pack-3.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/strct-pack-3.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,28 @@ +typedef struct +{ + short i __attribute__ ((aligned (2),packed)); + int f[2] __attribute__ ((aligned (2),packed)); +} A; + +f (ap) + A *ap; +{ + short i, j = 1; + + i = ap->f[1]; + i += ap->f[j]; + for (j = 0; j < 2; j++) + i += ap->f[j]; + + return i; +} + +main () +{ + A a; + a.f[0] = 100; + a.f[1] = 13; + if (f (&a) != 139) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/strct-pack-4.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/strct-pack-4.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/strct-pack-4.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/strct-pack-4.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,28 @@ +typedef struct +{ + unsigned char a __attribute__((packed)); + unsigned short b __attribute__((packed)); +} three_char_t; + +unsigned char +my_set_a (void) +{ + return 0xab; +} + +unsigned short +my_set_b (void) +{ + return 0x1234; +} + +main () +{ + three_char_t three_char; + + three_char.a = my_set_a (); + three_char.b = my_set_b (); + if (three_char.a != 0xab || three_char.b != 0x1234) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/strct-stdarg-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/strct-stdarg-1.c?rev=374156&view=auto 
============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/strct-stdarg-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/strct-stdarg-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,61 @@ +#include + +struct tiny +{ + char c; + char d; + char e; + char f; + char g; +}; + +f (int n, ...) +{ + struct tiny x; + int i; + + va_list ap; + va_start (ap,n); + for (i = 0; i < n; i++) + { + x = va_arg (ap,struct tiny); + if (x.c != i + 10) + abort(); + if (x.d != i + 20) + abort(); + if (x.e != i + 30) + abort(); + if (x.f != i + 40) + abort(); + if (x.g != i + 50) + abort(); + } + { + long x = va_arg (ap, long); + if (x != 123) + abort(); + } + va_end (ap); +} + +main () +{ + struct tiny x[3]; + x[0].c = 10; + x[1].c = 11; + x[2].c = 12; + x[0].d = 20; + x[1].d = 21; + x[2].d = 22; + x[0].e = 30; + x[1].e = 31; + x[2].e = 32; + x[0].f = 40; + x[1].f = 41; + x[2].f = 42; + x[0].g = 50; + x[1].g = 51; + x[2].g = 52; + f (3, x[0], x[1], x[2], (long) 123); + exit(0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/strct-varg-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/strct-varg-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/strct-varg-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/strct-varg-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,42 @@ +#include + +struct s { int x, y; }; + +f (int attr, ...) 
+{ + struct s va_values; + va_list va; + int i; + + va_start (va, attr); + + if (attr != 2) + abort (); + + va_values = va_arg (va, struct s); + if (va_values.x != 0xaaaa || va_values.y != 0x5555) + abort (); + + attr = va_arg (va, int); + if (attr != 3) + abort (); + + va_values = va_arg (va, struct s); + if (va_values.x != 0xffff || va_values.y != 0x1111) + abort (); + + va_end (va); +} + +main () +{ + struct s a, b; + + a.x = 0xaaaa; + a.y = 0x5555; + b.x = 0xffff; + b.y = 0x1111; + + f (2, a, 3, b); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/string-opt-17.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/string-opt-17.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/string-opt-17.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/string-opt-17.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,45 @@ +/* Copyright (C) 2003 Free Software Foundation. + + Test strcpy optimizations don't evaluate side-effects twice. + + Written by Jakub Jelinek, June 23, 2003. 
*/ + +typedef __SIZE_TYPE__ size_t; +extern char *strcpy (char *, const char *); +extern int memcmp (const void *, const void *, size_t); +extern void abort (void); +extern void exit (int); + +size_t +test1 (char *s, size_t i) +{ + strcpy (s, "foobarbaz" + i++); + return i; +} + +size_t +check2 (void) +{ + static size_t r = 5; + if (r != 5) + abort (); + return ++r; +} + +void +test2 (char *s) +{ + strcpy (s, "foobarbaz" + check2 ()); +} + +int +main (void) +{ + char buf[10]; + if (test1 (buf, 7) != 8 || memcmp (buf, "az", 3)) + abort (); + test2 (buf); + if (memcmp (buf, "baz", 4)) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/string-opt-18.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/string-opt-18.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/string-opt-18.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/string-opt-18.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,75 @@ +/* Copyright (C) 2003 Free Software Foundation. + + Test equal pointer optimizations don't break anything. + + Written by Roger Sayle, July 14, 2003. 
*/ + +extern void abort (); +typedef __SIZE_TYPE__ size_t; + +extern void *memcpy(void*, const void*, size_t); +extern void *mempcpy(void*, const void*, size_t); +extern void *memmove(void*, const void*, size_t); +extern char *strcpy(char*, const char*); +extern int memcmp(const void*, const void*, size_t); +extern int strcmp(const char*, const char*); +extern int strncmp(const char*, const char*, size_t); + + +void test1 (void *ptr) +{ + if (memcpy(ptr,ptr,8) != ptr) + abort (); +} + +void test2 (char *ptr) +{ + if (mempcpy(ptr,ptr,8) != ptr+8) + abort (); +} + +void test3 (void *ptr) +{ + if (memmove(ptr,ptr,8) != ptr) + abort (); +} + +void test4 (char *ptr) +{ + if (strcpy(ptr,ptr) != ptr) + abort (); +} + +void test5 (void *ptr) +{ + if (memcmp(ptr,ptr,8) != 0) + abort (); +} + +void test6 (const char *ptr) +{ + if (strcmp(ptr,ptr) != 0) + abort (); +} + +void test7 (const char *ptr) +{ + if (strncmp(ptr,ptr,8) != 0) + abort (); +} + + +int main () +{ + char buf[10]; + + test1 (buf); + test2 (buf); + test3 (buf); + test4 (buf); + test5 (buf); + test6 (buf); + test7 (buf); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/string-opt-5.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/string-opt-5.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/string-opt-5.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/string-opt-5.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,107 @@ +/* Copyright (C) 2000 Free Software Foundation. + + Ensure builtin strlen, strcmp, strchr, strrchr and strncpy + perform correctly. + + Written by Jakub Jelinek, 11/7/2000. 
*/ + +extern void abort (void); +extern __SIZE_TYPE__ strlen (const char *); +extern int strcmp (const char *, const char *); +extern char *strchr (const char *, int); +extern char *strrchr (const char *, int); +extern char *strncpy (char *, const char *, __SIZE_TYPE__); +extern void *memset (void *, int, __SIZE_TYPE__); +extern int memcmp (const void *, const void *, __SIZE_TYPE__); + +int x = 6; +int y = 1; +char *bar = "hi world"; +char buf [64]; + +int main() +{ + const char *const foo = "hello world"; + char dst [64]; + + if (strlen (bar) != 8) + abort (); + if (strlen (bar + (++x & 2)) != 6) + abort (); + if (x != 7) + abort (); + if (strlen (foo + (x++, 6)) != 5) + abort (); + if (x != 8) + abort (); + if (strlen (foo + (++x & 1)) != 10) + abort (); + if (x != 9) + abort (); + if (strcmp (foo + (x -= 6), "lo world")) + abort (); + if (x != 3) + abort (); + if (strcmp (foo, bar) >= 0) + abort (); + if (strcmp (foo, bar + (x++ & 1)) >= 0) + abort (); + if (x != 4) + abort (); + if (strchr (foo + (x++ & 7), 'l') != foo + 9) + abort (); + if (x != 5) + abort (); + if (strchr (bar, 'o') != bar + 4) + abort (); + if (strchr (bar, '\0') != bar + 8) + abort (); + if (strrchr (bar, 'x')) + abort (); + if (strrchr (bar, 'o') != bar + 4) + abort (); + if (strcmp (foo + (x++ & 1), "ello world" + (--y & 1))) + abort (); + if (x != 6 || y != 0) + abort (); + dst[5] = ' '; + dst[6] = '\0'; + x = 5; + y = 1; + if (strncpy (dst + 1, foo + (x++ & 3), 4) != dst + 1 + || x != 6 + || strcmp (dst + 1, "ello ")) + abort (); + memset (dst, ' ', sizeof dst); + if (strncpy (dst + (++x & 1), (y++ & 3) + "foo", 10) != dst + 1 + || x != 7 + || y != 2 + || memcmp (dst, " oo\0\0\0\0\0\0\0\0 ", 12)) + abort (); + memset (dst, ' ', sizeof dst); + if (strncpy (dst, "hello", 8) != dst || memcmp (dst, "hello\0\0\0 ", 9)) + abort (); + x = '!'; + memset (buf, ' ', sizeof buf); + if (memset (buf, x++, ++y) != buf + || x != '!' 
+ 1 + || y != 3 + || memcmp (buf, "!!!", 3)) + abort (); + if (memset (buf + y++, '-', 8) != buf + 3 + || y != 4 + || memcmp (buf, "!!!--------", 11)) + abort (); + x = 10; + if (memset (buf + ++x, 0, y++) != buf + 11 + || x != 11 + || y != 5 + || memcmp (buf + 8, "---\0\0\0", 7)) + abort (); + if (memset (buf + (x += 4), 0, 6) != buf + 15 + || x != 15 + || memcmp (buf + 10, "-\0\0\0\0\0\0\0\0\0", 11)) + abort (); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/strlen-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/strlen-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/strlen-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/strlen-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,57 @@ +/* Copyright (C) 2002 Free Software Foundation. + + Test strlen with various combinations of pointer alignments and lengths to + make sure any optimizations in the library are correct. + + Written by Michael Meissner, March 9, 2002. 
*/ + +#include +#include + +#ifndef MAX_OFFSET +#define MAX_OFFSET (sizeof (long long)) +#endif + +#ifndef MAX_TEST +#define MAX_TEST (8 * sizeof (long long)) +#endif + +#ifndef MAX_EXTRA +#define MAX_EXTRA (sizeof (long long)) +#endif + +#define MAX_LENGTH (MAX_OFFSET + MAX_TEST + MAX_EXTRA + 1) + +static union { + char buf[MAX_LENGTH]; + long long align_int; + long double align_fp; +} u; + +main () +{ + size_t off, len, len2, i; + char *p; + + for (off = 0; off < MAX_OFFSET; off++) + for (len = 0; len < MAX_TEST; len++) + { + p = u.buf; + for (i = 0; i < off; i++) + *p++ = '\0'; + + for (i = 0; i < len; i++) + *p++ = 'a'; + + *p++ = '\0'; + for (i = 0; i < MAX_EXTRA; i++) + *p++ = 'b'; + + p = u.buf + off; + len2 = strlen (p); + if (len != len2) + abort (); + } + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/strlen-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/strlen-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/strlen-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/strlen-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,210 @@ +/* PR tree-optimization/86532 - Wrong code due to a wrong strlen folding */ + +extern __SIZE_TYPE__ strlen (const char*); + +static const char a[2][3] = { "1", "12" }; +static const char b[2][2][5] = { { "1", "12" }, { "123", "1234" } }; + +volatile int v0 = 0; +volatile int v1 = 1; +volatile int v2 = 2; + +#define A(expr) \ + ((expr) ? 
(void)0 : (__builtin_printf ("assertion on line %i: %s\n", \ + __LINE__, #expr), \ + __builtin_abort ())) + +void test_array_ref_2_3 (void) +{ + A (strlen (a[v0]) == 1); + A (strlen (&a[v0][v0]) == 1); + A (strlen (&a[0][v0]) == 1); + A (strlen (&a[v0][0]) == 1); + + A (strlen (a[v1]) == 2); + A (strlen (&a[v1][0]) == 2); + A (strlen (&a[1][v0]) == 2); + A (strlen (&a[v1][v0]) == 2); + + A (strlen (&a[v1][1]) == 1); + A (strlen (&a[v1][1]) == 1); + + A (strlen (&a[v1][2]) == 0); + A (strlen (&a[v1][v2]) == 0); + + int i0 = 0; + int i1 = i0 + 1; + int i2 = i1 + 1; + + A (strlen (a[v0]) == 1); + A (strlen (&a[v0][v0]) == 1); + A (strlen (&a[i0][v0]) == 1); + A (strlen (&a[v0][i0]) == 1); + + A (strlen (a[v1]) == 2); + A (strlen (&a[v1][i0]) == 2); + A (strlen (&a[i1][v0]) == 2); + A (strlen (&a[v1][v0]) == 2); + + A (strlen (&a[v1][i1]) == 1); + A (strlen (&a[v1][i1]) == 1); + + A (strlen (&a[v1][i2]) == 0); + A (strlen (&a[v1][v2]) == 0); +} + +void test_array_off_2_3 (void) +{ + A (strlen (a[0] + 0) == 1); + A (strlen (a[0] + v0) == 1); + A (strlen (a[v0] + 0) == 1); + A (strlen (a[v0] + v0) == 1); + + A (strlen (a[v1] + 0) == 2); + A (strlen (a[1] + v0) == 2); + A (strlen (a[v1] + 0) == 2); + A (strlen (a[v1] + v0) == 2); + + A (strlen (a[v1] + 1) == 1); + A (strlen (a[v1] + v1) == 1); + + A (strlen (a[v1] + 2) == 0); + A (strlen (a[v1] + v2) == 0); + + int i0 = 0; + int i1 = i0 + 1; + int i2 = i1 + 1; + + A (strlen (a[i0] + i0) == 1); + A (strlen (a[i0] + v0) == 1); + A (strlen (a[v0] + i0) == 1); + A (strlen (a[v0] + v0) == 1); + + A (strlen (a[v1] + i0) == 2); + A (strlen (a[i1] + v0) == 2); + A (strlen (a[v1] + i0) == 2); + A (strlen (a[v1] + v0) == 2); + + A (strlen (a[v1] + i1) == 1); + A (strlen (a[v1] + v1) == 1); + + A (strlen (a[v1] + i2) == 0); + A (strlen (a[v1] + v2) == 0); +} + +void test_array_ref_2_2_5 (void) +{ + A (strlen (b[0][v0]) == 1); + A (strlen (b[v0][0]) == 1); + + A (strlen (&b[0][0][v0]) == 1); + A (strlen (&b[0][v0][0]) == 1); + A 
(strlen (&b[v0][0][0]) == 1); + + A (strlen (&b[0][v0][v0]) == 1); + A (strlen (&b[v0][0][v0]) == 1); + A (strlen (&b[v0][v0][0]) == 1); + + A (strlen (b[0][v1]) == 2); + A (strlen (b[v1][0]) == 3); + + A (strlen (&b[0][0][v1]) == 0); + A (strlen (&b[0][v1][0]) == 2); + A (strlen (&b[v0][0][0]) == 1); + + A (strlen (&b[0][v0][v0]) == 1); + A (strlen (&b[v0][0][v0]) == 1); + A (strlen (&b[v0][v0][0]) == 1); + + A (strlen (&b[0][v1][v1]) == 1); + A (strlen (&b[v1][0][v1]) == 2); + A (strlen (&b[v1][v1][0]) == 4); + A (strlen (&b[v1][v1][1]) == 3); + A (strlen (&b[v1][v1][2]) == 2); + + int i0 = 0; + int i1 = i0 + 1; + int i2 = i1 + 1; + + A (strlen (b[i0][v0]) == 1); + A (strlen (b[v0][i0]) == 1); + + A (strlen (&b[i0][i0][v0]) == 1); + A (strlen (&b[i0][v0][i0]) == 1); + A (strlen (&b[v0][i0][i0]) == 1); + + A (strlen (&b[i0][v0][v0]) == 1); + A (strlen (&b[v0][i0][v0]) == 1); + A (strlen (&b[v0][v0][i0]) == 1); + + A (strlen (b[i0][v1]) == 2); + A (strlen (b[v1][i0]) == 3); + + A (strlen (&b[i0][i0][v1]) == 0); + A (strlen (&b[i0][v1][i0]) == 2); + A (strlen (&b[v0][i0][i0]) == 1); + + A (strlen (&b[i0][v0][v0]) == 1); + A (strlen (&b[v0][i0][v0]) == 1); + A (strlen (&b[v0][v0][i0]) == 1); + + A (strlen (&b[i0][v1][v1]) == 1); + A (strlen (&b[v1][i0][v1]) == 2); + A (strlen (&b[v1][v1][i0]) == 4); + A (strlen (&b[v1][v1][i1]) == 3); + A (strlen (&b[v1][v1][i2]) == 2); +} + +void test_array_off_2_2_5 (void) +{ + A (strlen (b[0][0] + v0) == 1); + A (strlen (b[0][v0] + v0) == 1); + A (strlen (b[v0][0] + v0) == 1); + A (strlen (b[v0][v0] + v0) == 1); + + A (strlen (b[0][0] + v1) == 0); + A (strlen (b[0][v1] + 0) == 2); + A (strlen (b[v0][0] + 0) == 1); + + A (strlen (b[0][v0] + v0) == 1); + A (strlen (b[v0][0] + v0) == 1); + A (strlen (b[v0][v0] + 0) == 1); + + A (strlen (b[0][v1] + v1) == 1); + A (strlen (b[v1][0] + v1) == 2); + A (strlen (b[v1][v1] + 0) == 4); + A (strlen (b[v1][v1] + 1) == 3); + A (strlen (b[v1][v1] + 2) == 2); + + int i0 = 0; + int i1 = i0 + 1; + 
int i2 = i1 + 1; + + A (strlen (b[i0][i0] + v0) == 1); + A (strlen (b[i0][v0] + v0) == 1); + A (strlen (b[v0][i0] + v0) == 1); + A (strlen (b[v0][v0] + v0) == 1); + + A (strlen (b[i0][i0] + v1) == 0); + A (strlen (b[i0][v1] + i0) == 2); + A (strlen (b[v0][i0] + i0) == 1); + + A (strlen (b[i0][v0] + v0) == 1); + A (strlen (b[v0][i0] + v0) == 1); + A (strlen (b[v0][v0] + i0) == 1); + + A (strlen (b[i0][v1] + v1) == 1); + A (strlen (b[v1][i0] + v1) == 2); + A (strlen (b[v1][v1] + i0) == 4); + A (strlen (b[v1][v1] + i1) == 3); + A (strlen (b[v1][v1] + i2) == 2); +} + +int main () +{ + test_array_ref_2_3 (); + test_array_off_2_3 (); + + test_array_ref_2_2_5 (); + test_array_off_2_2_5 (); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/strlen-3.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/strlen-3.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/strlen-3.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/strlen-3.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,132 @@ +/* PR tree-optimization/86532 - Wrong code due to a wrong strlen folding + starting with r262522 + Exercise strlen() with a multi-dimensional array of strings with + embedded nuls. */ + +extern __SIZE_TYPE__ strlen (const char*); + +static const char a[2][3][9] = { + { "1", "1\0002" }, + { "12\0003", "123\0004" } +}; + +volatile int v0 = 0; +volatile int v1 = 1; +volatile int v2 = 2; +volatile int v3 = 3; +volatile int v4 = 4; +volatile int v5 = 5; +volatile int v6 = 6; +volatile int v7 = 7; + +#define A(expr) \ + ((expr) ? 
(void)0 : (__builtin_printf ("assertion on line %i: %s\n", \ + __LINE__, #expr), \ + __builtin_abort ())) + +void test_array_ref (void) +{ + int i0 = 0; + int i1 = i0 + 1; + int i2 = i1 + 1; + int i3 = i2 + 1; + int i4 = i3 + 1; + int i5 = i4 + 1; + int i6 = i5 + 1; + int i7 = i6 + 1; + + A (strlen (a[0][0]) == 1); + A (strlen (a[0][1]) == 1); + + A (strlen (a[1][0]) == 2); + A (strlen (a[1][1]) == 3); + + A (strlen (&a[0][0][0]) == 1); + A (strlen (&a[0][1][0]) == 1); + + A (strlen (&a[1][0][0]) == 2); + A (strlen (&a[1][1][0]) == 3); + + A (strlen (&a[0][0][0] + 1) == 0); + A (strlen (&a[0][1][0] + 1) == 0); + A (strlen (&a[0][1][0] + 2) == 1); + A (strlen (&a[0][1][0] + 3) == 0); + A (strlen (&a[0][1][0] + 7) == 0); + + A (strlen (&a[1][0][0] + 1) == 1); + A (strlen (&a[1][1][0] + 1) == 2); + A (strlen (&a[1][1][0] + 2) == 1); + A (strlen (&a[1][1][0] + 7) == 0); + + + A (strlen (a[i0][i0]) == 1); + A (strlen (a[i0][i1]) == 1); + + A (strlen (a[i1][i0]) == 2); + A (strlen (a[i1][i1]) == 3); + + A (strlen (&a[i0][i0][i0]) == 1); + A (strlen (&a[i0][i1][i0]) == 1); + A (strlen (&a[i0][i1][i1]) == 0); + A (strlen (&a[i0][i1][i2]) == 1); + A (strlen (&a[i0][i1][i3]) == 0); + A (strlen (&a[i0][i1][i3]) == 0); + + A (strlen (&a[i1][i0][i0]) == 2); + A (strlen (&a[i1][i1][i0]) == 3); + A (strlen (&a[i1][i1][i1]) == 2); + A (strlen (&a[i1][i1][i2]) == 1); + A (strlen (&a[i1][i1][i3]) == 0); + A (strlen (&a[i1][i1][i4]) == 1); + A (strlen (&a[i1][i1][i5]) == 0); + A (strlen (&a[i1][i1][i6]) == 0); + A (strlen (&a[i1][i1][i7]) == 0); + + A (strlen (&a[i0][i0][i0] + i1) == 0); + A (strlen (&a[i0][i1][i0] + i1) == 0); + A (strlen (&a[i0][i1][i0] + i7) == 0); + + A (strlen (&a[i1][i0][i0] + i1) == 1); + A (strlen (&a[i1][i1][i0] + i1) == 2); + A (strlen (&a[i1][i1][i0] + i2) == 1); + A (strlen (&a[i1][i1][i0] + i3) == 0); + A (strlen (&a[i1][i1][i0] + i4) == 1); + A (strlen (&a[i1][i1][i0] + i5) == 0); + A (strlen (&a[i1][i1][i0] + i6) == 0); + A (strlen (&a[i1][i1][i0] + 
i7) == 0); + + + A (strlen (a[i0][i0]) == 1); + A (strlen (a[i0][i1]) == 1); + + A (strlen (a[i1][i0]) == 2); + A (strlen (a[i1][i1]) == 3); + + A (strlen (&a[i0][i0][i0]) == 1); + A (strlen (&a[i0][i1][i0]) == 1); + + A (strlen (&a[i1][i0][i0]) == 2); + A (strlen (&a[i1][i1][i0]) == 3); + + A (strlen (&a[i0][i0][i0] + v1) == 0); + A (strlen (&a[i0][i0][i0] + v2) == 0); + A (strlen (&a[i0][i0][i0] + v7) == 0); + + A (strlen (&a[i0][i1][i0] + v1) == 0); + A (strlen (&a[i0][i1][i0] + v2) == 1); + A (strlen (&a[i0][i1][i0] + v3) == 0); + + A (strlen (&a[i1][i0][i0] + v1) == 1); + A (strlen (&a[i1][i1][i0] + v1) == 2); + A (strlen (&a[i1][i1][i0] + v2) == 1); + A (strlen (&a[i1][i1][i0] + v3) == 0); + A (strlen (&a[i1][i1][i0] + v4) == 1); + A (strlen (&a[i1][i1][i0] + v5) == 0); + A (strlen (&a[i1][i1][i0] + v6) == 0); + A (strlen (&a[i1][i1][i0] + v7) == 0); +} + +int main (void) +{ + test_array_ref (); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/strlen-4.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/strlen-4.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/strlen-4.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/strlen-4.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,232 @@ +/* PR tree-optimization/86622 - incorrect strlen of array of array plus + variable offset + Exercise strlen() with a multi-dimensional array of strings with + offsets. 
*/ + +extern int printf (const char*, ...); +extern __SIZE_TYPE__ strlen (const char*); + +typedef char A28[28]; +typedef A28 A3_28[3]; +typedef A3_28 A2_3_28[2]; + +static const A2_3_28 a = { + /* [0][0] [0][1] [0][2] */ + { "1\00012", "123\0001234", "12345\000123456" }, + /* [1][0] [1][1] [1][2] */ + { "1234567\00012345678", "123456789\0001234567890", "12345678901\000123456789012" } +}; + +volatile int v0 = 0; +volatile int v1 = 1; +volatile int v2 = 2; +volatile int v3 = 3; +volatile int v4 = 4; +volatile int v5 = 5; +volatile int v6 = 6; +volatile int v7 = 7; + +#define A(expr, N) \ + ((strlen (expr) == N) \ + ? (void)0 : (printf ("line %i: strlen (%s = \"%s\") != %i\n", \ + __LINE__, #expr, expr, N), \ + __builtin_abort ())) + +/* Verify that strlen() involving pointer to array arguments computes + the correct result. */ + +void test_array_ptr (void) +{ + /* Compute the length of the string at the refeenced array. */ + A (*(&a[0][0] + 0), 1); + A (*(&a[0][0] + 1), 3); + A (*(&a[0][0] + 2), 5); + + A (*(&a[0][1] - 1), 1); + A (*(&a[0][1] + 0), 3); + A (*(&a[0][1] + 1), 5); + + A (*(&a[0][2] - 2), 1); + A (*(&a[0][2] - 1), 3); + A (*(&a[0][2] + 0), 5); + + A (*(&a[1][0] + 0), 7); + A (*(&a[1][0] + 1), 9); + A (*(&a[1][0] + 2), 11); + + A (*(&a[1][1] - 1), 7); + A (*(&a[1][1] + 0), 9); + A (*(&a[1][1] + 1), 11); + + A (*(&a[1][2] - 2), 7); + A (*(&a[1][2] - 1), 9); + A (*(&a[1][2] - 0), 11); + + /* Compute the length of the string past the first nul. */ + A (*(&a[0][0] + 0) + 2, 2); + A (*(&a[0][0] + 1) + 4, 4); + A (*(&a[0][0] + 2) + 6, 6); + + /* Compute the length of the string past the second nul. 
*/ + A (*(&a[0][0] + 0) + 5, 0); + A (*(&a[0][0] + 1) + 10, 0); + A (*(&a[0][0] + 2) + 14, 0); + + int i0 = 0; + int i1 = i0 + 1; + int i2 = i1 + 1; + int i3 = i2 + 1; + int i4 = i3 + 1; + int i5 = i4 + 1; + + A (*(&a[0][0] + i0), 1); + A (*(&a[0][0] + i1), 3); + A (*(&a[0][0] + i2), 5); + + A (*(&a[0][1] - i1), 1); + A (*(&a[0][1] + i0), 3); + A (*(&a[0][1] + i1), 5); + + A (*(&a[0][2] - i2), 1); + A (*(&a[0][2] - i1), 3); + A (*(&a[0][2] + i0), 5); + + A (*(&a[1][0] + i0), 7); + A (*(&a[1][0] + i1), 9); + A (*(&a[1][0] + i2), 11); + + A (*(&a[1][1] - i1), 7); + A (*(&a[1][1] + i0), 9); + A (*(&a[1][1] + i1), 11); + + A (*(&a[1][2] - i2), 7); + A (*(&a[1][2] - i1), 9); + A (*(&a[1][2] - i0), 11); + + + A (*(&a[i0][i0] + i0), 1); + A (*(&a[i0][i0] + i1), 3); + A (*(&a[i0][i0] + i2), 5); + + A (*(&a[i0][i1] - i1), 1); + A (*(&a[i0][i1] + i0), 3); + A (*(&a[i0][i1] + i1), 5); + + A (*(&a[i0][i2] - i2), 1); + A (*(&a[i0][i2] - i1), 3); + A (*(&a[i0][i2] + i0), 5); + + A (*(&a[i1][i0] + i0), 7); + A (*(&a[i1][i0] + i1), 9); + A (*(&a[i1][i0] + i2), 11); + + A (*(&a[i1][i1] - i1), 7); + A (*(&a[i1][i1] + i0), 9); + A (*(&a[i1][i1] + i1), 11); + + A (*(&a[i1][i2] - i2), 7); + A (*(&a[i1][i2] - i1), 9); + A (*(&a[i1][i2] - i0), 11); + + + A (*(&a[i0][i0] + v0), 1); + A (*(&a[i0][i0] + v1), 3); + A (*(&a[i0][i0] + v2), 5); + + A (*(&a[i0][i1] - v1), 1); + A (*(&a[i0][i1] + v0), 3); + A (*(&a[i0][i1] + v1), 5); + + A (*(&a[i0][i2] - v2), 1); + A (*(&a[i0][i2] - v1), 3); + A (*(&a[i0][i2] + v0), 5); + + A (*(&a[i1][i0] + v0), 7); + A (*(&a[i1][i0] + v1), 9); + A (*(&a[i1][i0] + v2), 11); + + A (*(&a[i1][i1] - v1), 7); + A (*(&a[i1][i1] + v0), 9); + A (*(&a[i1][i1] + v1), 11); + + A (*(&a[i1][i2] - v2), 7); + A (*(&a[i1][i2] - v1), 9); + A (*(&a[i1][i2] - v0), 11); + + + A (*(&a[i0][i0] + v0) + i1, 0); + A (*(&a[i0][i0] + v1) + i2, 1); + A (*(&a[i0][i0] + v2) + i3, 2); + + A (*(&a[i0][i1] - v1) + v1, 0); + A (*(&a[i0][i1] + v0) + v3, 0); + A (*(&a[i0][i1] + v1) + v5, 0); + + 
A (*(&a[i0][v1] - i1) + i1, 0); + A (*(&a[i0][v1] + i0) + i3, 0); + A (*(&a[i0][v1] + i1) + i5, 0); +} + +static const A3_28* const pa0 = &a[0]; +static const A3_28* const pa1 = &a[1]; + +static const A3_28* const paa[] = { &a[0], &a[1] }; + +/* Verify that strlen() involving pointers and arrays of pointers + to array arguments computes the correct result. */ + +void test_ptr_array (void) +{ + int i0 = 0; + int i1 = i0 + 1; + int i2 = i1 + 1; + int i3 = i2 + 1; + + A (*((*pa0) + i0), 1); + A (*((*pa0) + i1), 3); + A (*((*pa0) + i2), 5); + + A (*(pa0[0] + i0), 1); + A (*(pa0[0] + i1), 3); + A (*(pa0[0] + i2), 5); + + A ((*pa0)[i0] + i1, 0); + A ((*pa0)[i1] + i2, 1); + A ((*pa0)[i2] + i3, 2); + + + A (*((*pa1) + i0), 7); + A (*((*pa1) + i1), 9); + A (*((*pa1) + i2), 11); + + A (*(pa1[0] + i0), 7); + A (*(pa1[0] + i1), 9); + A (*(pa1[0] + i2), 11); + + A ((*pa1)[i0] + i1, 6); + A ((*pa1)[i1] + i2, 7); + A ((*pa1)[i2] + i3, 8); + + A (*(*(paa[0]) + i0), 1); + A (*(*(paa[0]) + i1), 3); + A (*(*(paa[0]) + i2), 5); + + A (*(*(paa[1]) + i0), 7); + A (*(*(paa[1]) + i1), 9); + A (*(*(paa[1]) + i2), 11); + + A (*(*(paa[1]) - i1), 5); + A (*(*(paa[1]) - i2), 3); + A (*(*(paa[1]) - i3), 1); + + A (*(*(paa[0]) + i0) + i1, 0); + A (*(*(paa[0]) + i1) + i2, 1); + A (*(*(paa[0]) + i2) + i3, 2); +} + +int main (void) +{ + test_array_ptr (); + + test_ptr_array (); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/strlen-5.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/strlen-5.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/strlen-5.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/strlen-5.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,653 @@ +/* Test to verify that even strictly undefined strlen() calls with + unterminated character arrays yield the 
"expected" results when + the terminating nul is present in a subsequent suobobject. */ + +extern __SIZE_TYPE__ strlen (const char *); + +unsigned nfails; + +#define A(expr, N) \ + do { \ + const char *s = (expr); \ + unsigned n = strlen (s); \ + ((n == N) \ + ? 0 \ + : (__builtin_printf ("line %i: strlen (%s = \"%s\")" \ + " == %u failed\n", \ + __LINE__, #expr, s, N), \ + ++nfails)); \ + } while (0) + + +int idx; + + +const char ca[][4] = { + { '1', '2', '3', '4' }, { '5' }, + { '1', '2', '3', '4' }, { '5', '6' }, + { '1', '2', '3', '4' }, { '5', '6', '7' }, + { '1', '2', '3', '4' }, { '5', '6', '7', '8' }, + { '9' } +}; + +static void test_const_global_arrays (void) +{ + A (ca[0], 5); + A (&ca[0][0], 5); + A (&ca[0][1], 4); + A (&ca[0][3], 2); + + int i = 0; + A (ca[i], 5); + A (&ca[i][0], 5); + A (&ca[i][1], 4); + A (&ca[i][3], 2); + + int j = i; + A (&ca[i][i], 5); + A (&ca[i][j + 1], 4); + A (&ca[i][j + 2], 3); + + A (&ca[idx][i], 5); + A (&ca[idx][j + 1], 4); + A (&ca[idx][j + 2], 3); + + A (&ca[idx][idx], 5); + A (&ca[idx][idx + 1], 4); + A (&ca[idx][idx + 2], 3); + + A (&ca[0][++j], 4); + A (&ca[0][++j], 3); + A (&ca[0][++j], 2); + + if (j != 3) + ++nfails; +} + + +static void test_const_local_arrays (void) +{ + const char a[][4] = { + { '1', '2', '3', '4' }, { '5' }, + { '1', '2', '3', '4' }, { '5', '6' }, + { '1', '2', '3', '4' }, { '5', '6', '7' }, + { '1', '2', '3', '4' }, { '5', '6', '7', '8' }, + { '9' } + }; + + A (a[0], 5); + A (&a[0][0], 5); + A (&a[0][1], 4); + A (&a[0][3], 2); + + int i = 0; + A (a[i], 5); + A (&a[i][0], 5); + A (&a[i][1], 4); + A (&a[i][3], 2); + + int j = i; + A (&a[i][i], 5); + A (&a[i][j + 1], 4); + A (&a[i][j + 2], 3); + + A (&a[idx][i], 5); + A (&a[idx][j + 1], 4); + A (&a[idx][j + 2], 3); + + A (&a[idx][idx], 5); + A (&a[idx][idx + 1], 4); + A (&a[idx][idx + 2], 3); + + A (&a[0][++j], 4); + A (&a[0][++j], 3); + A (&a[0][++j], 2); + + if (j != 3) + ++nfails; +} + + +char va[][4] = { + { '1', '2', '3', '4' }, { '5' }, + { 
'1', '2', '3', '4' }, { '5', '6' }, + { '1', '2', '3', '4' }, { '5', '6', '7' }, + { '1', '2', '3', '4' }, { '5', '6', '7', '8' }, + { '9' } +}; + +static void test_nonconst_global_arrays (void) +{ + { + A (va[0], 5); + A (&va[0][0], 5); + A (&va[0][1], 4); + A (&va[0][3], 2); + + int i = 0; + A (va[i], 5); + A (&va[i][0], 5); + A (&va[i][1], 4); + A (&va[i][3], 2); + + int j = i; + A (&va[i][i], 5); + A (&va[i][j + 1], 4); + A (&va[i][j + 2], 3); + + A (&va[idx][i], 5); + A (&va[idx][j + 1], 4); + A (&va[idx][j + 2], 3); + + A (&va[idx][idx], 5); + A (&va[idx][idx + 1], 4); + A (&va[idx][idx + 2], 3); + } + + { + A (va[2], 6); + A (&va[2][0], 6); + A (&va[2][1], 5); + A (&va[2][3], 3); + + int i = 2; + A (va[i], 6); + A (&va[i][0], 6); + A (&va[i][1], 5); + A (&va[i][3], 3); + + int j = i - 1; + A (&va[i][j - 1], 6); + A (&va[i][j], 5); + A (&va[i][j + 1], 4); + + A (&va[idx + 2][i - 1], 5); + A (&va[idx + 2][j], 5); + A (&va[idx + 2][j + 1], 4); + } + + int j = 0; + + A (&va[0][++j], 4); + A (&va[0][++j], 3); + A (&va[0][++j], 2); + + if (j != 3) + ++nfails; +} + + +static void test_nonconst_local_arrays (void) +{ + char a[][4] = { + { '1', '2', '3', '4' }, { '5' }, + { '1', '2', '3', '4' }, { '5', '6' }, + { '1', '2', '3', '4' }, { '5', '6', '7' }, + { '1', '2', '3', '4' }, { '5', '6', '7', '8' }, + { '9' } + }; + + A (a[0], 5); + A (&a[0][0], 5); + A (&a[0][1], 4); + A (&a[0][3], 2); + + int i = 0; + A (a[i], 5); + A (&a[i][0], 5); + A (&a[i][1], 4); + A (&a[i][3], 2); + + int j = i; + A (&a[i][i], 5); + A (&a[i][j + 1], 4); + A (&a[i][j + 2], 3); + + A (&a[idx][i], 5); + A (&a[idx][j + 1], 4); + A (&a[idx][j + 2], 3); + + A (&a[idx][idx], 5); + A (&a[idx][idx + 1], 4); + A (&a[idx][idx + 2], 3); + + A (&a[0][++j], 4); + A (&a[0][++j], 3); + A (&a[0][++j], 2); + + if (j != 3) + ++nfails; +} + + +struct MemArrays { char a[4], b[4]; }; + +const struct MemArrays cma[] = { + { { '1', '2', '3', '4' }, { '5' } }, + { { '1', '2', '3', '4' }, { '5', '6' } }, + { { '1', 
'2', '3', '4' }, { '5', '6' } }, + { { '1', '2', '3', '4' }, { '5', '6', '7' } }, + { { '1', '2', '3', '4' }, { '5', '6', '7', '8' } }, + { { '9' }, { '\0' } } +}; + +static void test_const_global_member_arrays (void) +{ + { + A (cma[0].a, 5); + A (&cma[0].a[0], 5); + A (&cma[0].a[1], 4); + A (&cma[0].a[2], 3); + + int i = 0; + A (cma[i].a, 5); + A (&cma[i].a[0], 5); + A (&cma[i].a[1], 4); + A (&cma[i].a[2], 3); + + int j = i; + A (&cma[i].a[j], 5); + A (&cma[i].a[j + 1], 4); + A (&cma[i].a[j + 2], 3); + + A (&cma[idx].a[i], 5); + A (&cma[idx].a[j + 1], 4); + A (&cma[idx].a[j + 2], 3); + + A (&cma[idx].a[idx], 5); + A (&cma[idx].a[idx + 1], 4); + A (&cma[idx].a[idx + 2], 3); + } + + { + A (cma[1].a, 6); + A (&cma[1].a[0], 6); + A (&cma[1].a[1], 5); + A (&cma[1].a[2], 4); + + int i = 1; + A (cma[i].a, 6); + A (&cma[i].a[0], 6); + A (&cma[i].a[1], 5); + A (&cma[i].a[2], 4); + + int j = i - 1; + A (&cma[i].a[j], 6); + A (&cma[i].a[j + 1], 5); + A (&cma[i].a[j + 2], 4); + + A (&cma[idx + 1].a[j], 6); + A (&cma[idx + 1].a[j + 1], 5); + A (&cma[idx + 1].a[j + 2], 4); + + A (&cma[idx + 1].a[idx], 6); + A (&cma[idx + 1].a[idx + 1], 5); + A (&cma[idx + 1].a[idx + 2], 4); + } + + { + A (cma[4].a, 9); + A (&cma[4].a[0], 9); + A (&cma[4].a[1], 8); + A (&cma[4].b[0], 5); + + int i = 4; + A (cma[i].a, 9); + A (&cma[i].a[0], 9); + A (&cma[i].a[1], 8); + A (&cma[i].b[0], 5); + + int j = i - 1; + A (&cma[i].a[j], 6); + A (&cma[i].a[j + 1], 5); + A (&cma[i].b[j - 2], 4); + + A (&cma[idx + 4].a[j], 6); + A (&cma[idx + 4].a[j + 1], 5); + A (&cma[idx + 4].b[j - 2], 4); + + A (&cma[idx + 4].a[idx], 9); + A (&cma[idx + 4].a[idx + 1], 8); + A (&cma[idx + 4].b[idx + 1], 4); + } +} + + +static void test_const_local_member_arrays (void) +{ + const struct MemArrays ma[] = { + { { '1', '2', '3', '4' }, { '5' } }, + { { '1', '2', '3', '4' }, { '5', '6' } }, + { { '1', '2', '3', '4' }, { '5', '6' } }, + { { '1', '2', '3', '4' }, { '5', '6', '7' } }, + { { '1', '2', '3', '4' }, { '5', '6', '7', 
'8' } }, + { { '9' }, { '\0' } } + }; + + { + A (ma[0].a, 5); + A (&ma[0].a[0], 5); + A (&ma[0].a[1], 4); + A (&ma[0].a[2], 3); + + int i = 0; + A (ma[i].a, 5); + A (&ma[i].a[0], 5); + A (&ma[i].a[1], 4); + A (&ma[i].a[2], 3); + + int j = i; + A (&ma[i].a[j], 5); + A (&ma[i].a[j + 1], 4); + A (&ma[i].a[j + 2], 3); + + A (&ma[idx].a[i], 5); + A (&ma[idx].a[j + 1], 4); + A (&ma[idx].a[j + 2], 3); + + A (&ma[idx].a[idx], 5); + A (&ma[idx].a[idx + 1], 4); + A (&ma[idx].a[idx + 2], 3); + } + + { + A (ma[1].a, 6); + A (&ma[1].a[0], 6); + A (&ma[1].a[1], 5); + A (&ma[1].a[2], 4); + + int i = 1; + A (ma[i].a, 6); + A (&ma[i].a[0], 6); + A (&ma[i].a[1], 5); + A (&ma[i].a[2], 4); + + int j = i - 1; + A (&ma[i].a[j], 6); + A (&ma[i].a[j + 1], 5); + A (&ma[i].a[j + 2], 4); + + A (&ma[idx + 1].a[j], 6); + A (&ma[idx + 1].a[j + 1], 5); + A (&ma[idx + 1].a[j + 2], 4); + + A (&ma[idx + 1].a[idx], 6); + A (&ma[idx + 1].a[idx + 1], 5); + A (&ma[idx + 1].a[idx + 2], 4); + } + + { + A (ma[4].a, 9); + A (&ma[4].a[0], 9); + A (&ma[4].a[1], 8); + A (&ma[4].b[0], 5); + + int i = 4; + A (ma[i].a, 9); + A (&ma[i].a[0], 9); + A (&ma[i].a[1], 8); + A (&ma[i].b[0], 5); + + int j = i - 1; + A (&ma[i].a[j], 6); + A (&ma[i].a[j + 1], 5); + A (&ma[i].b[j - 2], 4); + + A (&ma[idx + 4].a[j], 6); + A (&ma[idx + 4].a[j + 1], 5); + A (&ma[idx + 4].b[j - 2], 4); + + A (&ma[idx + 4].a[idx], 9); + A (&ma[idx + 4].a[idx + 1], 8); + A (&ma[idx + 4].b[idx + 1], 4); + } +} + +struct MemArrays vma[] = { + { { '1', '2', '3', '4' }, { '5' } }, + { { '1', '2', '3', '4' }, { '5', '6' } }, + { { '1', '2', '3', '4' }, { '5', '6' } }, + { { '1', '2', '3', '4' }, { '5', '6', '7' } }, + { { '1', '2', '3', '4' }, { '5', '6', '7', '8' } }, + { { '9' }, { '\0' } } +}; + +static void test_nonconst_global_member_arrays (void) +{ + { + A (vma[0].a, 5); + A (&vma[0].a[0], 5); + A (&vma[0].a[1], 4); + A (&vma[0].a[2], 3); + + int i = 0; + A (vma[i].a, 5); + A (&vma[i].a[0], 5); + A (&vma[i].a[1], 4); + A (&vma[i].a[2], 3); + + 
int j = i; + A (&vma[i].a[j], 5); + A (&vma[i].a[j + 1], 4); + A (&vma[i].a[j + 2], 3); + + A (&vma[idx].a[i], 5); + A (&vma[idx].a[j + 1], 4); + A (&vma[idx].a[j + 2], 3); + + A (&vma[idx].a[idx], 5); + A (&vma[idx].a[idx + 1], 4); + A (&vma[idx].a[idx + 2], 3); + } + + { + A (vma[1].a, 6); + A (&vma[1].a[0], 6); + A (&vma[1].a[1], 5); + A (&vma[1].a[2], 4); + + int i = 1; + A (vma[i].a, 6); + A (&vma[i].a[0], 6); + A (&vma[i].a[1], 5); + A (&vma[i].a[2], 4); + + int j = i - 1; + A (&vma[i].a[j], 6); + A (&vma[i].a[j + 1], 5); + A (&vma[i].a[j + 2], 4); + + A (&vma[idx + 1].a[j], 6); + A (&vma[idx + 1].a[j + 1], 5); + A (&vma[idx + 1].a[j + 2], 4); + + A (&vma[idx + 1].a[idx], 6); + A (&vma[idx + 1].a[idx + 1], 5); + A (&vma[idx + 1].a[idx + 2], 4); + } + + { + A (vma[4].a, 9); + A (&vma[4].a[0], 9); + A (&vma[4].a[1], 8); + A (&vma[4].b[0], 5); + + int i = 4; + A (vma[i].a, 9); + A (&vma[i].a[0], 9); + A (&vma[i].a[1], 8); + A (&vma[i].b[0], 5); + + int j = i - 1; + A (&vma[i].a[j], 6); + A (&vma[i].a[j + 1], 5); + A (&vma[i].b[j - 2], 4); + + A (&vma[idx + 4].a[j], 6); + A (&vma[idx + 4].a[j + 1], 5); + A (&vma[idx + 4].b[j - 2], 4); + + A (&vma[idx + 4].a[idx], 9); + A (&vma[idx + 4].a[idx + 1], 8); + A (&vma[idx + 4].b[idx + 1], 4); + } +} + + +static void test_nonconst_local_member_arrays (void) +{ + struct MemArrays ma[] = { + { { '1', '2', '3', '4' }, { '5' } }, + { { '1', '2', '3', '4' }, { '5', '6' } }, + { { '1', '2', '3', '4' }, { '5', '6' } }, + { { '1', '2', '3', '4' }, { '5', '6', '7' } }, + { { '1', '2', '3', '4' }, { '5', '6', '7', '8' } }, + { { '9' }, { '\0' } } + }; + + { + A (ma[0].a, 5); + A (&ma[0].a[0], 5); + A (&ma[0].a[1], 4); + A (&ma[0].a[2], 3); + + int i = 0; + A (ma[i].a, 5); + A (&ma[i].a[0], 5); + A (&ma[i].a[1], 4); + A (&ma[i].a[2], 3); + + int j = i; + A (&ma[i].a[j], 5); + A (&ma[i].a[j + 1], 4); + A (&ma[i].a[j + 2], 3); + + A (&ma[idx].a[i], 5); + A (&ma[idx].a[j + 1], 4); + A (&ma[idx].a[j + 2], 3); + + A (&ma[idx].a[idx], 
5); + A (&ma[idx].a[idx + 1], 4); + A (&ma[idx].a[idx + 2], 3); + } + + { + A (ma[1].a, 6); + A (&ma[1].a[0], 6); + A (&ma[1].a[1], 5); + A (&ma[1].a[2], 4); + + int i = 1; + A (ma[i].a, 6); + A (&ma[i].a[0], 6); + A (&ma[i].a[1], 5); + A (&ma[i].a[2], 4); + + int j = i - 1; + A (&ma[i].a[j], 6); + A (&ma[i].a[j + 1], 5); + A (&ma[i].a[j + 2], 4); + + A (&ma[idx + 1].a[j], 6); + A (&ma[idx + 1].a[j + 1], 5); + A (&ma[idx + 1].a[j + 2], 4); + + A (&ma[idx + 1].a[idx], 6); + A (&ma[idx + 1].a[idx + 1], 5); + A (&ma[idx + 1].a[idx + 2], 4); + } + + { + A (ma[4].a, 9); + A (&ma[4].a[0], 9); + A (&ma[4].a[1], 8); + A (&ma[4].b[0], 5); + + int i = 4; + A (ma[i].a, 9); + A (&ma[i].a[0], 9); + A (&ma[i].a[1], 8); + A (&ma[i].b[0], 5); + + int j = i - 1; + A (&ma[i].a[j], 6); + A (&ma[i].a[j + 1], 5); + A (&ma[i].b[j - 2], 4); + + A (&ma[idx + 4].a[j], 6); + A (&ma[idx + 4].a[j + 1], 5); + A (&ma[idx + 4].b[j - 2], 4); + + A (&ma[idx + 4].a[idx], 9); + A (&ma[idx + 4].a[idx + 1], 8); + A (&ma[idx + 4].b[idx + 1], 4); + } +} + + +union UnionMemberArrays +{ + struct { char a[4], b[4]; } a; + struct { char a[8]; } c; +}; + +const union UnionMemberArrays cu = { + { { '1', '2', '3', '4' }, { '5', } } +}; + +static void test_const_union_member_arrays (void) +{ + A (cu.a.a, 5); + A (cu.a.b, 1); + A (cu.c.a, 5); + + const union UnionMemberArrays clu = { + { { '1', '2', '3', '4' }, { '5', '6' } } + }; + + A (clu.a.a, 6); + A (clu.a.b, 2); + A (clu.c.a, 6); +} + + +union UnionMemberArrays vu = { + { { '1', '2', '3', '4' }, { '5', '6' } } +}; + +static void test_nonconst_union_member_arrays (void) +{ + A (vu.a.a, 6); + A (vu.a.b, 2); + A (vu.c.a, 6); + + union UnionMemberArrays lvu = { + { { '1', '2', '3', '4' }, { '5', '6', '7' } } + }; + + A (lvu.a.a, 7); + A (lvu.a.b, 3); + A (lvu.c.a, 7); +} + + +int main (void) +{ + test_const_global_arrays (); + test_const_local_arrays (); + + test_nonconst_global_arrays (); + test_nonconst_local_arrays (); + + test_const_global_member_arrays 
(); + test_const_local_member_arrays (); + + test_nonconst_global_member_arrays (); + test_nonconst_local_member_arrays (); + + test_const_union_member_arrays (); + test_nonconst_union_member_arrays (); + + if (nfails) + __builtin_abort (); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/strlen-6.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/strlen-6.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/strlen-6.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/strlen-6.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,113 @@ +/* Test to verify that strlen() calls with conditional expressions + and unterminated arrays or pointers to such things as arguments + are evaluated without making assumptions about array sizes. */ + +extern __SIZE_TYPE__ strlen (const char *); + +unsigned nfails; + +#define A(expr, N) \ + do { \ + const char *_s = (expr); \ + unsigned _n = strlen (_s); \ + ((_n == N) \ + ? 0 \ + : (__builtin_printf ("line %i: strlen ((%s) = (\"%s\"))" \ + " == %u failed\n", \ + __LINE__, #expr, _s, N), \ + ++nfails)); \ + } while (0) + + +volatile int i0 = 0; + +const char ca[2][3] = { "12" }; +const char cb[2][3] = { { '1', '2', '3', }, { '4' } }; + +char va[2][3] = { "123" }; +char vb[2][3] = { { '1', '2', '3', }, { '4', '5' } }; + +const char *s = "123456"; + + +static void test_binary_cond_expr_global (void) +{ + A (i0 ? "1" : ca[0], 2); + A (i0 ? ca[0] : "123", 3); + + /* The call to strlen (cb[0]) is strictly undefined because the array + isn't nul-terminated. This test verifies that the strlen range + optimization doesn't assume that the argument is necessarily nul + terminated. + Ditto for strlen (vb[0]). */ + A (i0 ? "1" : cb[0], 4); /* GCC 8.2 failure */ + A (i0 ? cb[0] : "12", 2); + + A (i0 ? 
"1" : va[0], 3); /* GCC 8.2 failure */ + A (i0 ? va[0] : "1234", 4); + + A (i0 ? "1" : vb[0], 5); /* GCC 8.2 failure */ + A (i0 ? vb[0] : "12", 2); +} + + +static void test_binary_cond_expr_local (void) +{ + const char lca[2][3] = { "12" }; + const char lcb[2][3] = { { '1', '2', '3', }, { '4' } }; + + char lva[2][3] = { "123" }; + char lvb[2][3] = { { '1', '2', '3', }, { '4', '5' } }; + + /* Also undefined as above. */ + A (i0 ? "1" : lca[0], 2); + A (i0 ? lca[0] : "123", 3); + + A (i0 ? "1" : lcb[0], 4); /* GCC 8.2 failure */ + A (i0 ? lcb[0] : "12", 2); + + A (i0 ? "1" : lva[0], 3); /* GCC 8.2 failure */ + A (i0 ? lva[0] : "1234", 4); + + A (i0 ? "1" : lvb[0], 5); /* GCC 8.2 failure */ + A (i0 ? lvb[0] : "12", 2); +} + + +static void test_ternary_cond_expr (void) +{ + /* Also undefined. */ + A (i0 == 0 ? s : i0 == 1 ? vb[0] : "123", 6); + A (i0 == 0 ? vb[0] : i0 == 1 ? s : "123", 5); + A (i0 == 0 ? "123" : i0 == 1 ? s : vb[0], 3); +} + + +const char (*pca)[3] = &ca[0]; +const char (*pcb)[3] = &cb[0]; + +char (*pva)[3] = &va[0]; +char (*pvb)[3] = &vb[0]; + +static void test_binary_cond_expr_arrayptr (void) +{ + /* Also undefined. */ + A (i0 ? *pca : *pcb, 4); /* GCC 8.2 failure */ + A (i0 ? *pcb : *pca, 2); + + A (i0 ? *pva : *pvb, 5); /* GCC 8.2 failure */ + A (i0 ? 
*pvb : *pva, 3); +} + + +int main (void) +{ + test_binary_cond_expr_global (); + test_binary_cond_expr_local (); + + test_ternary_cond_expr (); + test_binary_cond_expr_arrayptr (); + + if (nfails) + __builtin_abort (); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/strlen-7.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/strlen-7.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/strlen-7.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/strlen-7.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,37 @@ +/* Test to verify that a strlen() call with a pointer to a dynamic type + doesn't make assumptions based on the static type of the original + pointer. See g++.dg/init/strlen.C for the corresponding C++ test. */ + +struct A { int i; char a[1]; void (*p)(); }; +struct B { char a[sizeof (struct A) - __builtin_offsetof (struct A, a)]; }; + +__attribute__ ((noipa)) void +init (char *d, const char *s) +{ + __builtin_strcpy (d, s); +} + +struct B b; + +__attribute__ ((noipa)) void +test_dynamic_type (struct A *p) +{ + /* The following call is undefined because it writes past the end + of the p->a subobject, but the corresponding GIMPLE considers + it valid and there's apparently no way to distinguish invalid + cases from ones like it that might be valid. If/when GIMPLE + changes to make this possible this test can be removed. 
*/ + char *q = (char*)__builtin_memcpy (p->a, &b, sizeof b); + + init (q, "foobar"); + + if (6 != __builtin_strlen (q)) + __builtin_abort(); +} + +int main (void) +{ + struct A *p = (struct A*)__builtin_malloc (sizeof *p); + test_dynamic_type (p); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/strncmp-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/strncmp-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/strncmp-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/strncmp-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,141 @@ +/* { dg-xfail-if "kernel strncmp does not perform unsigned comparisons" { vxworks_kernel } } */ +/* Copyright (C) 2002 Free Software Foundation. + + Test strncmp with various combinations of pointer alignments and lengths to + make sure any optimizations in the library are correct. + + Written by Michael Meissner, March 9, 2002. 
*/ + +#include <string.h> +#include <stdlib.h> + +#ifndef MAX_OFFSET +#define MAX_OFFSET (sizeof (long long)) +#endif + +#ifndef MAX_TEST +#define MAX_TEST (8 * sizeof (long long)) +#endif + +#ifndef MAX_EXTRA +#define MAX_EXTRA (sizeof (long long)) +#endif + +#define MAX_LENGTH (MAX_OFFSET + MAX_TEST + MAX_EXTRA) + +static union { + unsigned char buf[MAX_LENGTH]; + long long align_int; + long double align_fp; +} u1, u2; + +void +test (const unsigned char *s1, const unsigned char *s2, size_t len, int expected) +{ + int value = strncmp ((char *) s1, (char *) s2, len); + + if (expected < 0 && value >= 0) + abort (); + else if (expected == 0 && value != 0) + abort (); + else if (expected > 0 && value <= 0) + abort (); +} + +main () +{ + size_t off1, off2, len, i; + unsigned char *buf1, *buf2; + unsigned char *mod1, *mod2; + unsigned char *p1, *p2; + + for (off1 = 0; off1 < MAX_OFFSET; off1++) + for (off2 = 0; off2 < MAX_OFFSET; off2++) + for (len = 0; len < MAX_TEST; len++) + { + p1 = u1.buf; + for (i = 0; i < off1; i++) + *p1++ = '\0'; + + buf1 = p1; + for (i = 0; i < len; i++) + *p1++ = 'a'; + + mod1 = p1; + for (i = 0; i < MAX_EXTRA; i++) + *p1++ = 'x'; + + p2 = u2.buf; + for (i = 0; i < off2; i++) + *p2++ = '\0'; + + buf2 = p2; + for (i = 0; i < len; i++) + *p2++ = 'a'; + + mod2 = p2; + for (i = 0; i < MAX_EXTRA; i++) + *p2++ = 'x'; + + mod1[0] = '\0'; + mod2[0] = '\0'; + test (buf1, buf2, MAX_LENGTH, 0); + test (buf1, buf2, len, 0); + + mod1[0] = 'a'; + mod1[1] = '\0'; + mod2[0] = '\0'; + test (buf1, buf2, MAX_LENGTH, +1); + test (buf1, buf2, len, 0); + + mod1[0] = '\0'; + mod2[0] = 'a'; + mod2[1] = '\0'; + test (buf1, buf2, MAX_LENGTH, -1); + test (buf1, buf2, len, 0); + + mod1[0] = 'b'; + mod1[1] = '\0'; + mod2[0] = 'c'; + mod2[1] = '\0'; + test (buf1, buf2, MAX_LENGTH, -1); + test (buf1, buf2, len, 0); + + mod1[0] = 'c'; + mod1[1] = '\0'; + mod2[0] = 'b'; + mod2[1] = '\0'; + test (buf1, buf2, MAX_LENGTH, +1); + test (buf1, buf2, len, 0); + + mod1[0] = 'b'; + mod1[1] = '\0'; + 
mod2[0] = (unsigned char)'\251'; + mod2[1] = '\0'; + test (buf1, buf2, MAX_LENGTH, -1); + test (buf1, buf2, len, 0); + + mod1[0] = (unsigned char)'\251'; + mod1[1] = '\0'; + mod2[0] = 'b'; + mod2[1] = '\0'; + test (buf1, buf2, MAX_LENGTH, +1); + test (buf1, buf2, len, 0); + + mod1[0] = (unsigned char)'\251'; + mod1[1] = '\0'; + mod2[0] = (unsigned char)'\252'; + mod2[1] = '\0'; + test (buf1, buf2, MAX_LENGTH, -1); + test (buf1, buf2, len, 0); + + mod1[0] = (unsigned char)'\252'; + mod1[1] = '\0'; + mod2[0] = (unsigned char)'\251'; + mod2[1] = '\0'; + test (buf1, buf2, MAX_LENGTH, +1); + test (buf1, buf2, len, 0); + } + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/struct-aliasing-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/struct-aliasing-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/struct-aliasing-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/struct-aliasing-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,17 @@ +struct S { float f; }; +int __attribute__((noinline)) +foo (int *r, struct S *p) +{ + int *q = (int *)&p->f; + int i = *q; + *r = 0; + return i + *q; +} +extern void abort (void); +int main() +{ + int i = 1; + if (foo (&i, (struct S *)&i) != 1) + abort (); + return (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/struct-cpy-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/struct-cpy-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/struct-cpy-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/struct-cpy-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,43 @@ +/* 
powerpc64-linux gcc miscompiled this due to rs6000.c:expand_block_move + not setting mem aliasing info correctly for the code implementing the + structure assignment. */ + +struct termios +{ + unsigned int a; + unsigned int b; + unsigned int c; + unsigned int d; + unsigned char pad[28]; +}; + +struct tty_driver +{ + unsigned char pad1[38]; + struct termios t __attribute__ ((aligned (8))); +}; + +static struct termios zero_t; +static struct tty_driver pty; + +void ini (void) +{ + pty.t = zero_t; + pty.t.a = 1; + pty.t.b = 2; + pty.t.c = 3; + pty.t.d = 4; +} + +int main (void) +{ + extern void abort (void); + + ini (); + if (pty.t.a != 1 + || pty.t.b != 2 + || pty.t.c != 3 + || pty.t.d != 4) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/struct-ini-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/struct-ini-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/struct-ini-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/struct-ini-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,14 @@ +struct S +{ + char f1; + int f2[2]; +}; + +struct S object = {'X', 8, 9}; + +main () +{ + if (object.f1 != 'X' || object.f2[0] != 8 || object.f2[1] != 9) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/struct-ini-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/struct-ini-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/struct-ini-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/struct-ini-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,17 @@ +struct { + int a:4; + int :4; + int b:4; 
+ int c:4; +} x = { 2,3,4 }; + +main () +{ + if (x.a != 2) + abort (); + if (x.b != 3) + abort (); + if (x.c != 4) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/struct-ini-3.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/struct-ini-3.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/struct-ini-3.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/struct-ini-3.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,11 @@ +struct +{ + unsigned int f1:1, f2:1, f3:3, f4:3, f5:2, f6:1, f7:1; +} result = {1, 1, 7, 7, 3, 1, 1}; + +main () +{ + if ((result.f3 & ~7) != 0 || (result.f4 & ~7) != 0) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/struct-ini-4.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/struct-ini-4.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/struct-ini-4.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/struct-ini-4.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,15 @@ +struct s { + int a[3]; + int c[3]; +}; + +struct s s = { + c: {1, 2, 3} +}; + +main() +{ + if (s.c[0] != 1) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/struct-ret-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/struct-ret-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/struct-ret-1.c (added) +++ 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/struct-ret-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,57 @@ +#include <stdio.h> +#include <string.h> + +char out[100]; + +typedef struct { double d; int i[3]; } B; +typedef struct { char c[33],c1; } X; + +char c1 = 'a'; +char c2 = 127; +char c3 = (char)128; +char c4 = (char)255; +char c5 = -1; + +double d1 = 0.1; +double d2 = 0.2; +double d3 = 0.3; +double d4 = 0.4; +double d5 = 0.5; +double d6 = 0.6; +double d7 = 0.7; +double d8 = 0.8; +double d9 = 0.9; + +B B1 = {0.1,{1,2,3}}; +B B2 = {0.2,{5,4,3}}; +X X1 = {"abcdefghijklmnopqrstuvwxyzABCDEF", 'G'}; +X X2 = {"123",'9'}; +X X3 = {"return-return-return",'R'}; + +X f (B a, char b, double c, B d) +{ + static X xr = {"return val", 'R'}; + X r; + r = xr; + r.c1 = b; + sprintf (out, "X f(B,char,double,B):({%g,{%d,%d,%d}},'%c',%g,{%g,{%d,%d,%d}})", + a.d, a.i[0], a.i[1], a.i[2], b, c, d.d, d.i[0], d.i[1], d.i[2]); + return r; +} + +X (*fp) (B, char, double, B) = &f; + +main () +{ + X Xr; + char tmp[100]; + + Xr = f (B1, c2, d3, B2); + strcpy (tmp, out); + Xr.c[0] = Xr.c1 = '\0'; + Xr = (*fp) (B1, c2, d3, B2); + if (strcmp (tmp, out)) + abort (); + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/struct-ret-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/struct-ret-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/struct-ret-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/struct-ret-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,28 @@ +typedef struct +{ + unsigned char a __attribute__ ((packed)); + unsigned short b __attribute__ ((packed)); +} three_byte_t; + +unsigned char +f (void) +{ + return 0xab; +} + +unsigned short +g (void) +{ + return 0x1234; +} + +main () +{ + three_byte_t three_byte; + + three_byte.a = f (); + three_byte.b = g 
(); + if (three_byte.a != 0xab || three_byte.b != 0x1234) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/switch-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/switch-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/switch-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/switch-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,56 @@ +/* Copyright (C) 2003 Free Software Foundation. + + Test that switch statements suitable using case bit tests are + implemented correctly. + + Written by Roger Sayle, 01/25/2001. */ + +extern void abort (void); + +int +foo (int x) +{ + switch (x) + { + case 4: + case 6: + case 9: + case 11: + return 30; + } + return 31; +} + +int +main () +{ + int i, r; + + for (i=-1; i<66; i++) + { + r = foo (i); + if (i == 4) + { + if (r != 30) + abort (); + } + else if (i == 6) + { + if (r != 30) + abort (); + } + else if (i == 9) + { + if (r != 30) + abort (); + } + else if (i == 11) + { + if (r != 30) + abort (); + } + else if (r != 31) + abort (); + } + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/tstdi-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/tstdi-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/tstdi-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/tstdi-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,139 @@ +#define FALSE 140 +#define TRUE 13 + +feq (x) + long long int x; +{ + if (x == 0) + return TRUE; + else + return FALSE; +} + +fne (x) + long long int x; +{ + if (x != 0) + return TRUE; + else + return FALSE; +} + +flt (x) + long long 
int x; +{ + if (x < 0) + return TRUE; + else + return FALSE; +} + +fge (x) + long long int x; +{ + if (x >= 0) + return TRUE; + else + return FALSE; +} + +fgt (x) + long long int x; +{ + if (x > 0) + return TRUE; + else + return FALSE; +} + +fle (x) + long long int x; +{ + if (x <= 0) + return TRUE; + else + return FALSE; +} + +main () +{ + if (feq (0LL) != TRUE) + abort (); + if (feq (-1LL) != FALSE) + abort (); + if (feq (0x8000000000000000LL) != FALSE) + abort (); + if (feq (0x8000000000000001LL) != FALSE) + abort (); + if (feq (1LL) != FALSE) + abort (); + if (feq (0x7fffffffffffffffLL) != FALSE) + abort (); + + if (fne (0LL) != FALSE) + abort (); + if (fne (-1LL) != TRUE) + abort (); + if (fne (0x8000000000000000LL) != TRUE) + abort (); + if (fne (0x8000000000000001LL) != TRUE) + abort (); + if (fne (1LL) != TRUE) + abort (); + if (fne (0x7fffffffffffffffLL) != TRUE) + abort (); + + if (flt (0LL) != FALSE) + abort (); + if (flt (-1LL) != TRUE) + abort (); + if (flt (0x8000000000000000LL) != TRUE) + abort (); + if (flt (0x8000000000000001LL) != TRUE) + abort (); + if (flt (1LL) != FALSE) + abort (); + if (flt (0x7fffffffffffffffLL) != FALSE) + abort (); + + if (fge (0LL) != TRUE) + abort (); + if (fge (-1LL) != FALSE) + abort (); + if (fge (0x8000000000000000LL) != FALSE) + abort (); + if (fge (0x8000000000000001LL) != FALSE) + abort (); + if (fge (1LL) != TRUE) + abort (); + if (fge (0x7fffffffffffffffLL) != TRUE) + abort (); + + if (fgt (0LL) != FALSE) + abort (); + if (fgt (-1LL) != FALSE) + abort (); + if (fgt (0x8000000000000000LL) != FALSE) + abort (); + if (fgt (0x8000000000000001LL) != FALSE) + abort (); + if (fgt (1LL) != TRUE) + abort (); + if (fgt (0x7fffffffffffffffLL) != TRUE) + abort (); + + if (fle (0LL) != TRUE) + abort (); + if (fle (-1LL) != TRUE) + abort (); + if (fle (0x8000000000000000LL) != TRUE) + abort (); + if (fle (0x8000000000000001LL) != TRUE) + abort (); + if (fle (1LL) != FALSE) + abort (); + if (fle (0x7fffffffffffffffLL) != 
FALSE) + abort (); + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/unroll-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/unroll-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/unroll-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/unroll-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,24 @@ +/* { dg-options "-fgnu89-inline" } */ + +extern void abort (void); +extern void exit (int); + +inline int +f (int x) +{ + return (x + 1); +} + +int +main (void) +{ + int a = 0 ; + + while ( (f(f(f(f(f(f(f(f(f(f(1))))))))))) + a < 12 ) + { + a++; + exit (0); + } + if (a != 1) + abort(); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/usad-run.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/usad-run.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/usad-run.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/usad-run.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,49 @@ +extern void abort (); +extern int abs (int __x) __attribute__ ((__nothrow__, __leaf__)) __attribute__ ((__const__)); + +static int +foo (unsigned char *w, int i, unsigned char *x, int j) +{ + int tot = 0; + for (int a = 0; a < 16; a++) + { + for (int b = 0; b < 16; b++) + tot += abs (w[b] - x[b]); + w += i; + x += j; + } + return tot; +} + +void +bar (unsigned char *w, unsigned char *x, int i, int *result) +{ + *result = foo (w, 16, x, i); +} + +int +main (void) +{ + unsigned char m[256]; + unsigned char n[256]; + int sum, i; + + for (i = 0; i < 256; ++i) + if (i % 2 == 0) + { + m[i] = (i % 8) * 2 + 1; + n[i] = -(i % 8); + } + else + { + m[i] = 
-((i % 8) * 2 + 2); + n[i] = -((i % 8) >> 1); + } + + bar (m, n, 16, &sum); + + if (sum != 32384) + abort (); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/user-printf.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/user-printf.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/user-printf.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/user-printf.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,65 @@ +/* Verify that calls to a function declared wiith attribute format (printf) + don't get eliminated even if their result on success can be computed at + compile time (they can fail). + { dg-require-effective-target unwrapped } + { dg-skip-if "requires io" { freestanding } } */ + +#include <stdarg.h> +#include <stdio.h> +#include <stdlib.h> +#include <string.h> + +void __attribute__ ((format (printf, 1, 2), noipa)) +user_print (const char *fmt, ...) 
+{ + va_list va; + va_start (va, fmt); + vfprintf (stdout, fmt, va); + va_end (va); +} + +int main (void) +{ + char *tmpfname = tmpnam (0); + FILE *f = freopen (tmpfname, "w", stdout); + if (!f) + { + perror ("fopen for writing"); + return 1; + } + + user_print ("1"); + user_print ("%c", '2'); + user_print ("%c%c", '3', '4'); + user_print ("%s", "5"); + user_print ("%s%s", "6", "7"); + user_print ("%i", 8); + user_print ("%.1s\n", "9x"); + + fclose (f); + + f = fopen (tmpfname, "r"); + if (!f) + { + perror ("fopen for reading"); + remove (tmpfname); + return 1; + } + + char buf[12] = ""; + if (1 != fscanf (f, "%s", buf)) + { + perror ("fscanf"); + fclose (f); + remove (tmpfname); + return 1; + } + + fclose (f); + remove (tmpfname); + + if (strcmp (buf, "123456789")) + abort (); + + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/usmul.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/usmul.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/usmul.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/usmul.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,33 @@ +/* { dg-require-effective-target int32plus } */ +int __attribute__ ((noinline)) foo (short x, unsigned short y) +{ + return x * y; +} + +int __attribute__ ((noinline)) bar (unsigned short x, short y) +{ + return x * y; +} + +int main () +{ + if (foo (-2, 0xffff) != -131070) + abort (); + if (foo (2, 0xffff) != 131070) + abort (); + if (foo (-32768, 0x8000) != -1073741824) + abort (); + if (foo (32767, 0x8000) != 1073709056) + abort (); + + if (bar (0xffff, -2) != -131070) + abort (); + if (bar (0xffff, 2) != 131070) + abort (); + if (bar (0x8000, -32768) != -1073741824) + abort (); + if (bar (0x8000, 32767) != 1073709056) + abort (); + + exit (0); +} Added: 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,24 @@ +#include <stdarg.h> + +typedef unsigned long L; +f (L p0, L p1, L p2, L p3, L p4, L p5, L p6, L p7, L p8, ...) +{ + va_list select; + + va_start (select, p8); + + if (va_arg (select, L) != 10) + abort (); + if (va_arg (select, L) != 11) + abort (); + if (va_arg (select, L) != 0) + abort (); + + va_end (select); +} + +main () +{ + f (1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 0L); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-10.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-10.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-10.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-10.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,248 @@ +/* This is a modfied version of va-arg-9.c to test va_copy. 
*/ + +#include + +#ifndef va_copy +#define va_copy __va_copy +#endif + +extern __SIZE_TYPE__ strlen (const char *); + +int +to_hex (unsigned int a) +{ + static char hex[] = "0123456789abcdef"; + + if (a > 15) + abort (); + return hex[a]; +} + +void +fap (int i, char* format, va_list ap) +{ + va_list apc; + char *formatc; + + va_copy (apc, ap); + formatc = format; + + if (strlen (format) != 16 - i) + abort (); + while (*format) + if (*format++ != to_hex (va_arg (ap, int))) + abort (); + while (*formatc) + if (*formatc++ != to_hex (va_arg (apc, int))) + abort (); +} + +void +f0 (char* format, ...) +{ + va_list ap; + + va_start (ap, format); + fap(0, format, ap); + va_end(ap); +} + +void +f1 (int a1, char* format, ...) +{ + va_list ap; + + va_start(ap, format); + fap(1, format, ap); + va_end(ap); +} + +void +f2 (int a1, int a2, char* format, ...) +{ + va_list ap; + + va_start(ap, format); + fap(2, format, ap); + va_end(ap); +} + +void +f3 (int a1, int a2, int a3, char* format, ...) +{ + va_list ap; + + va_start(ap, format); + fap(3, format, ap); + va_end(ap); +} + +void +f4 (int a1, int a2, int a3, int a4, char* format, ...) +{ + va_list ap; + + va_start(ap, format); + fap(4, format, ap); + va_end(ap); +} + +void +f5 (int a1, int a2, int a3, int a4, int a5, + char* format, ...) +{ + va_list ap; + + va_start(ap, format); + fap(5, format, ap); + va_end(ap); +} + +void +f6 (int a1, int a2, int a3, int a4, int a5, + int a6, + char* format, ...) +{ + va_list ap; + + va_start(ap, format); + fap(6, format, ap); + va_end(ap); +} + +void +f7 (int a1, int a2, int a3, int a4, int a5, + int a6, int a7, + char* format, ...) +{ + va_list ap; + + va_start(ap, format); + fap(7, format, ap); + va_end(ap); +} + +void +f8 (int a1, int a2, int a3, int a4, int a5, + int a6, int a7, int a8, + char* format, ...) 
+{ + va_list ap; + + va_start(ap, format); + fap(8, format, ap); + va_end(ap); +} + +void +f9 (int a1, int a2, int a3, int a4, int a5, + int a6, int a7, int a8, int a9, + char* format, ...) +{ + va_list ap; + + va_start(ap, format); + fap(9, format, ap); + va_end(ap); +} + +void +f10 (int a1, int a2, int a3, int a4, int a5, + int a6, int a7, int a8, int a9, int a10, + char* format, ...) +{ + va_list ap; + + va_start(ap, format); + fap(10, format, ap); + va_end(ap); +} + +void +f11 (int a1, int a2, int a3, int a4, int a5, + int a6, int a7, int a8, int a9, int a10, + int a11, + char* format, ...) +{ + va_list ap; + + va_start(ap, format); + fap(11, format, ap); + va_end(ap); +} + +void +f12 (int a1, int a2, int a3, int a4, int a5, + int a6, int a7, int a8, int a9, int a10, + int a11, int a12, + char* format, ...) +{ + va_list ap; + + va_start(ap, format); + fap(12, format, ap); + va_end(ap); +} + +void +f13 (int a1, int a2, int a3, int a4, int a5, + int a6, int a7, int a8, int a9, int a10, + int a11, int a12, int a13, + char* format, ...) +{ + va_list ap; + + va_start(ap, format); + fap(13, format, ap); + va_end(ap); +} + +void +f14 (int a1, int a2, int a3, int a4, int a5, + int a6, int a7, int a8, int a9, int a10, + int a11, int a12, int a13, int a14, + char* format, ...) +{ + va_list ap; + + va_start(ap, format); + fap(14, format, ap); + va_end(ap); +} + +void +f15 (int a1, int a2, int a3, int a4, int a5, + int a6, int a7, int a8, int a9, int a10, + int a11, int a12, int a13, int a14, int a15, + char* format, ...) 
+{ + va_list ap; + + va_start(ap, format); + fap(15, format, ap); + va_end(ap); +} + +main () +{ + char *f = "0123456789abcdef"; + + f0 (f+0, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15); + f1 (0, f+1, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15); + f2 (0, 1, f+2, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15); + f3 (0, 1, 2, f+3, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15); + f4 (0, 1, 2, 3, f+4, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15); + f5 (0, 1, 2, 3, 4, f+5, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15); + f6 (0, 1, 2, 3, 4, 5, f+6, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15); + f7 (0, 1, 2, 3, 4, 5, 6, f+7, 7, 8, 9, 10, 11, 12, 13, 14, 15); + f8 (0, 1, 2, 3, 4, 5, 6, 7, f+8, 8, 9, 10, 11, 12, 13, 14, 15); + f9 (0, 1, 2, 3, 4, 5, 6, 7, 8, f+9, 9, 10, 11, 12, 13, 14, 15); + f10 (0, 1, 2, 3, 4, 5, 6, 7, 8, 9, f+10, 10, 11, 12, 13, 14, 15); + f11 (0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, f+11, 11, 12, 13, 14, 15); + f12 (0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, f+12, 12, 13, 14, 15); + f13 (0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, f+13, 13, 14, 15); + f14 (0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, f+14, 14, 15); + f15 (0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, f+15, 15); + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-11.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-11.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-11.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-11.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,29 @@ +/* Test va_arg when the result is ignored and only the pointer increment + side effect is used. */ +#include + +static int +foo (int a, ...) 
+{ + va_list va; + int i, res; + + va_start (va, a); + + for (i = 0; i < 4; ++i) + (void) va_arg (va, int); + + res = va_arg (va, int); + + va_end (va); + + return res; +} + +int +main (void) +{ + if (foo (5, 4, 3, 2, 1, 0)) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-12.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-12.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-12.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-12.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,25 @@ +#include <stdarg.h> + +/*typedef unsigned long L;*/ +typedef double L; +void f (L p0, L p1, L p2, L p3, L p4, L p5, L p6, L p7, L p8, ...) +{ + va_list select; + + va_start (select, p8); + + if (va_arg (select, L) != 10.) + abort (); + if (va_arg (select, L) != 11.) + abort (); + if (va_arg (select, L) != 0.) + abort (); + + va_end (select); +} + +int main () +{ + f (1., 2., 3., 4., 5., 6., 7., 8., 9., 10., 11., 0.); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-13.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-13.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-13.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-13.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,38 @@ +/* derived from mozilla source code */ + +#include <stdarg.h> + +typedef struct { + void *stream; + va_list ap; + int nChar; +} ScanfState; + +void dummy (va_list vap) +{ + if (va_arg (vap, int) != 1234) abort(); + return; +} + +void test (int fmt, ...) 
+{ + ScanfState state, *statep; + + statep = &state; + + va_start (statep->ap, fmt); + dummy (statep->ap); + va_end (statep->ap); + + va_start (state.ap, fmt); + dummy (state.ap); + va_end (state.ap); + + return; +} + +int main (void) +{ + test (456, 1234); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-14.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-14.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-14.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-14.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,45 @@ +#include <stdarg.h> + +va_list global; + +void vat(va_list param, ...) +{ + va_list local; + + va_start (local, param); + va_copy (global, local); + va_copy (param, local); + if (va_arg (local, int) != 1) + abort(); + va_end (local); + if (va_arg (global, int) != 1) + abort(); + va_end (global); + if (va_arg (param, int) != 1) + abort(); + va_end (param); + + va_start (param, param); + va_start (global, param); + va_copy (local, param); + if (va_arg (local, int) != 1) + abort(); + va_end (local); + va_copy (local, global); + if (va_arg (local, int) != 1) + abort(); + va_end (local); + if (va_arg (global, int) != 1) + abort(); + va_end (global); + if (va_arg (param, int) != 1) + abort(); + va_end (param); +} + +int main(void) +{ + va_list t; + vat (t, 1); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-15.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-15.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-15.c (added) +++ 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-15.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,44 @@ +#include <stdarg.h> + +void vafunction (char *dummy, ...) +{ + double darg; + int iarg; + int flag = 0; + int i; + va_list ap; + + va_start(ap, dummy); + for (i = 1; i <= 18; i++, flag++) + { + if (flag & 1) + { + darg = va_arg (ap, double); + if (darg != (double)i) + abort(); + } + else + { + iarg = va_arg (ap, int); + if (iarg != i) + abort(); + } + } + va_end(ap); +} + +int main (void) +{ + vafunction( "", + 1, 2., + 3, 4., + 5, 6., + 7, 8., + 9, 10., + 11, 12., + 13, 14., + 15, 16., + 17, 18. ); + exit(0); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-16.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-16.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-16.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-16.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,41 @@ +#include <stdarg.h> + +typedef double TYPE; + +void vafunction (TYPE dummy1, TYPE dummy2, ...) +{ + va_list ap; + + va_start(ap, dummy2); + if (dummy1 != 888.) + abort(); + if (dummy2 != 999.) + abort(); + if (va_arg (ap, TYPE) != 1.) + abort(); + if (va_arg (ap, TYPE) != 2.) + abort(); + if (va_arg (ap, TYPE) != 3.) + abort(); + if (va_arg (ap, TYPE) != 4.) + abort(); + if (va_arg (ap, TYPE) != 5.) + abort(); + if (va_arg (ap, TYPE) != 6.) + abort(); + if (va_arg (ap, TYPE) != 7.) + abort(); + if (va_arg (ap, TYPE) != 8.) + abort(); + if (va_arg (ap, TYPE) != 9.) + abort(); + va_end(ap); +} + + +int main (void) +{ + vafunction( 888., 999., 1., 2., 3., 4., 5., 6., 7., 8., 9. 
); + exit(0); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-17.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-17.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-17.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-17.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,37 @@ +#include <stdarg.h> + +typedef double TYPE; + +void vafunction (char *dummy, ...) +{ + va_list ap; + + va_start(ap, dummy); + if (va_arg (ap, TYPE) != 1.) + abort(); + if (va_arg (ap, TYPE) != 2.) + abort(); + if (va_arg (ap, TYPE) != 3.) + abort(); + if (va_arg (ap, TYPE) != 4.) + abort(); + if (va_arg (ap, TYPE) != 5.) + abort(); + if (va_arg (ap, TYPE) != 6.) + abort(); + if (va_arg (ap, TYPE) != 7.) + abort(); + if (va_arg (ap, TYPE) != 8.) + abort(); + if (va_arg (ap, TYPE) != 9.) + abort(); + va_end(ap); +} + + +int main (void) +{ + vafunction( "", 1., 2., 3., 4., 5., 6., 7., 8., 9. ); + exit(0); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-18.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-18.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-18.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-18.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,24 @@ +#include <stdarg.h> + +typedef double L; +void f (L p0, L p1, L p2, L p3, L p4, L p5, L p6, L p7, L p8, ...) 
+{ + va_list select; + + va_start (select, p8); + + if (va_arg (select, int) != 10) + abort (); + if (va_arg (select, int) != 11) + abort (); + if (va_arg (select, int) != 12) + abort (); + + va_end (select); +} + +int main () +{ + f (1., 2., 3., 4., 5., 6., 7., 8., 9., 10, 11, 12); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-19.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-19.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-19.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-19.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,37 @@ +#include <stdarg.h> + +typedef int TYPE; + +void vafunction (char *dummy, ...) +{ + va_list ap; + + va_start(ap, dummy); + if (va_arg (ap, TYPE) != 1) + abort(); + if (va_arg (ap, TYPE) != 2) + abort(); + if (va_arg (ap, TYPE) != 3) + abort(); + if (va_arg (ap, TYPE) != 4) + abort(); + if (va_arg (ap, TYPE) != 5) + abort(); + if (va_arg (ap, TYPE) != 6) + abort(); + if (va_arg (ap, TYPE) != 7) + abort(); + if (va_arg (ap, TYPE) != 8) + abort(); + if (va_arg (ap, TYPE) != 9) + abort(); + va_end(ap); +} + + +int main (void) +{ + vafunction( "", 1, 2, 3, 4, 5, 6, 7, 8, 9 ); + exit(0); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,292 @@ +/* The purpose of this test is to catch edge cases when arguments are passed + in 
regs and on the stack. We test 16 cases, trying to catch multiple + targets (some use 3 regs for argument passing, some use 12, etc.). + We test both the arguments and the `lastarg' (the argument to va_start). */ + +#include + +extern __SIZE_TYPE__ strlen (); + +int +to_hex (unsigned int a) +{ + static char hex[] = "0123456789abcdef"; + + if (a > 15) + abort (); + return hex[a]; +} + +void +f0 (char* format, ...) +{ + va_list ap; + + va_start (ap, format); + if (strlen (format) != 16 - 0) + abort (); + while (*format) + if (*format++ != to_hex (va_arg (ap, int))) + abort (); + va_end(ap); +} + +void +f1 (int a1, char* format, ...) +{ + va_list ap; + + va_start(ap, format); + if (strlen (format) != 16 - 1) + abort (); + while (*format) + if (*format++ != to_hex (va_arg (ap, int))) + abort (); + va_end(ap); +} + +void +f2 (int a1, int a2, char* format, ...) +{ + va_list ap; + + va_start(ap, format); + if (strlen (format) != 16 - 2) + abort (); + while (*format) + if (*format++ != to_hex (va_arg (ap, int))) + abort (); + va_end(ap); +} + +void +f3 (int a1, int a2, int a3, char* format, ...) +{ + va_list ap; + + va_start(ap, format); + if (strlen (format) != 16 - 3) + abort (); + while (*format) + if (*format++ != to_hex (va_arg (ap, int))) + abort (); + va_end(ap); +} + +void +f4 (int a1, int a2, int a3, int a4, char* format, ...) +{ + va_list ap; + + va_start(ap, format); + if (strlen (format) != 16 - 4) + abort (); + while (*format) + if (*format++ != to_hex (va_arg (ap, int))) + abort (); + va_end(ap); +} + +void +f5 (int a1, int a2, int a3, int a4, int a5, + char* format, ...) +{ + va_list ap; + + va_start(ap, format); + if (strlen (format) != 16 - 5) + abort (); + while (*format) + if (*format++ != to_hex (va_arg (ap, int))) + abort (); + va_end(ap); +} + +void +f6 (int a1, int a2, int a3, int a4, int a5, + int a6, + char* format, ...) 
+{ + va_list ap; + + va_start(ap, format); + if (strlen (format) != 16 - 6) + abort (); + while (*format) + if (*format++ != to_hex (va_arg (ap, int))) + abort (); + va_end(ap); +} + +void +f7 (int a1, int a2, int a3, int a4, int a5, + int a6, int a7, + char* format, ...) +{ + va_list ap; + + va_start(ap, format); + if (strlen (format) != 16 - 7) + abort (); + while (*format) + if (*format++ != to_hex (va_arg (ap, int))) + abort (); + va_end(ap); +} + +void +f8 (int a1, int a2, int a3, int a4, int a5, + int a6, int a7, int a8, + char* format, ...) +{ + va_list ap; + + va_start(ap, format); + if (strlen (format) != 16 - 8) + abort (); + while (*format) + if (*format++ != to_hex (va_arg (ap, int))) + abort (); + va_end(ap); +} + +void +f9 (int a1, int a2, int a3, int a4, int a5, + int a6, int a7, int a8, int a9, + char* format, ...) +{ + va_list ap; + + va_start(ap, format); + if (strlen (format) != 16 - 9) + abort (); + while (*format) + if (*format++ != to_hex (va_arg (ap, int))) + abort (); + va_end(ap); +} + +void +f10 (int a1, int a2, int a3, int a4, int a5, + int a6, int a7, int a8, int a9, int a10, + char* format, ...) +{ + va_list ap; + + va_start(ap, format); + if (strlen (format) != 16 - 10) + abort (); + while (*format) + if (*format++ != to_hex (va_arg (ap, int))) + abort (); + va_end(ap); +} + +void +f11 (int a1, int a2, int a3, int a4, int a5, + int a6, int a7, int a8, int a9, int a10, + int a11, + char* format, ...) +{ + va_list ap; + + va_start(ap, format); + if (strlen (format) != 16 - 11) + abort (); + while (*format) + if (*format++ != to_hex (va_arg (ap, int))) + abort (); + va_end(ap); +} + +void +f12 (int a1, int a2, int a3, int a4, int a5, + int a6, int a7, int a8, int a9, int a10, + int a11, int a12, + char* format, ...) 
+{ + va_list ap; + + va_start(ap, format); + if (strlen (format) != 16 - 12) + abort (); + while (*format) + if (*format++ != to_hex (va_arg (ap, int))) + abort (); + va_end(ap); +} + +void +f13 (int a1, int a2, int a3, int a4, int a5, + int a6, int a7, int a8, int a9, int a10, + int a11, int a12, int a13, + char* format, ...) +{ + va_list ap; + + va_start(ap, format); + if (strlen (format) != 16 - 13) + abort (); + while (*format) + if (*format++ != to_hex (va_arg (ap, int))) + abort (); + va_end(ap); +} + +void +f14 (int a1, int a2, int a3, int a4, int a5, + int a6, int a7, int a8, int a9, int a10, + int a11, int a12, int a13, int a14, + char* format, ...) +{ + va_list ap; + + va_start(ap, format); + if (strlen (format) != 16 - 14) + abort (); + while (*format) + if (*format++ != to_hex (va_arg (ap, int))) + abort (); + va_end(ap); +} + +void +f15 (int a1, int a2, int a3, int a4, int a5, + int a6, int a7, int a8, int a9, int a10, + int a11, int a12, int a13, int a14, int a15, + char* format, ...) 
+{ + va_list ap; + + va_start(ap, format); + if (strlen (format) != 16 - 15) + abort (); + while (*format) + if (*format++ != to_hex (va_arg (ap, int))) + abort (); + va_end(ap); +} + +main () +{ + char *f = "0123456789abcdef"; + + f0 (f+0, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15); + f1 (0, f+1, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15); + f2 (0, 1, f+2, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15); + f3 (0, 1, 2, f+3, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15); + f4 (0, 1, 2, 3, f+4, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15); + f5 (0, 1, 2, 3, 4, f+5, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15); + f6 (0, 1, 2, 3, 4, 5, f+6, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15); + f7 (0, 1, 2, 3, 4, 5, 6, f+7, 7, 8, 9, 10, 11, 12, 13, 14, 15); + f8 (0, 1, 2, 3, 4, 5, 6, 7, f+8, 8, 9, 10, 11, 12, 13, 14, 15); + f9 (0, 1, 2, 3, 4, 5, 6, 7, 8, f+9, 9, 10, 11, 12, 13, 14, 15); + f10 (0, 1, 2, 3, 4, 5, 6, 7, 8, 9, f+10, 10, 11, 12, 13, 14, 15); + f11 (0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, f+11, 11, 12, 13, 14, 15); + f12 (0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, f+12, 12, 13, 14, 15); + f13 (0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, f+13, 13, 14, 15); + f14 (0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, f+14, 14, 15); + f15 (0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, f+15, 15); + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-20.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-20.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-20.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-20.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,22 @@ +#include + +void foo(va_list v) +{ + unsigned long long x = va_arg (v, unsigned long long); + if (x != 16LL) + abort(); +} + +void bar(char c, char d, ...) 
+{ + va_list v; + va_start(v, d); + foo(v); + va_end(v); +} + +int main(void) +{ + bar(0, 0, 16LL); + exit(0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-21.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-21.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-21.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-21.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,48 @@ +/* Copyright (C) 2000 Free Software Foundation. + + If the argument to va_end() has side effects, test whether side + effects from that argument are honored. + + Written by Kaveh R. Ghazi, 10/31/2000. */ + +#include +#include +#include + +#ifndef __GNUC__ +#define __attribute__(x) +#endif + +static void __attribute__ ((__format__ (__printf__, 1, 2))) +doit (const char *s, ...) +{ + va_list *ap_array[3], **ap_ptr = ap_array; + + ap_array[0] = malloc (sizeof(va_list)); + ap_array[1] = NULL; + ap_array[2] = malloc (sizeof(va_list)); + + va_start (*ap_array[0], s); + vprintf (s, **ap_ptr); + /* Increment the va_list pointer once. */ + va_end (**ap_ptr++); + + /* Increment the va_list pointer a second time. */ + ap_ptr++; + + va_start (*ap_array[2], s); + /* If we failed to increment ap_ptr twice, then the parameter passed + in here will dereference NULL and should cause a crash. */ + vprintf (s, **ap_ptr); + va_end (**ap_ptr); + + /* Just in case, If *ap_ptr is NULL abort anyway. 
*/ + if (*ap_ptr == 0) + abort(); +} + +int main() +{ + doit ("%s", "hello world\n"); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-22.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-22.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-22.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-22.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,78 @@ +#include <stdarg.h> + +extern void abort (void); +extern void exit (int); + +void bar (int n, int c) +{ + static int lastn = -1, lastc = -1; + + if (lastn != n) + { + if (lastc != lastn) + abort (); + lastc = 0; + lastn = n; + } + + if (c != (char) (lastc ^ (n << 3))) + abort (); + lastc++; +} + +#define D(N) typedef struct { char x[N]; } A##N; +D(0) D(1) D(2) D(3) D(4) D(5) D(6) D(7) +D(8) D(9) D(10) D(11) D(12) D(13) D(14) D(15) +D(16) D(31) D(32) D(35) D(72) +#undef D + +void foo (int size, ...) 
+{ +#define D(N) A##N a##N; +D(0) D(1) D(2) D(3) D(4) D(5) D(6) D(7) +D(8) D(9) D(10) D(11) D(12) D(13) D(14) D(15) +D(16) D(31) D(32) D(35) D(72) +#undef D + va_list ap; + int i; + + if (size != 21) + abort (); + va_start (ap, size); +#define D(N) \ + a##N = va_arg (ap, typeof (a##N)); \ + for (i = 0; i < N; i++) \ + bar (N, a##N.x[i]); +D(0) D(1) D(2) D(3) D(4) D(5) D(6) D(7) +D(8) D(9) D(10) D(11) D(12) D(13) D(14) D(15) +D(16) D(31) D(32) D(35) D(72) +#undef D + va_end (ap); +} + +int main (void) +{ +#define D(N) A##N a##N; +D(0) D(1) D(2) D(3) D(4) D(5) D(6) D(7) +D(8) D(9) D(10) D(11) D(12) D(13) D(14) D(15) +D(16) D(31) D(32) D(35) D(72) +#undef D + int i; + +#define D(N) \ + for (i = 0; i < N; i++) \ + a##N.x[i] = i ^ (N << 3); +D(0) D(1) D(2) D(3) D(4) D(5) D(6) D(7) +D(8) D(9) D(10) D(11) D(12) D(13) D(14) D(15) +D(16) D(31) D(32) D(35) D(72) +#undef D + + foo (21 +#define D(N) , a##N +D(0) D(1) D(2) D(3) D(4) D(5) D(6) D(7) +D(8) D(9) D(10) D(11) D(12) D(13) D(14) D(15) +D(16) D(31) D(32) D(35) D(72) +#undef D + ); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-23.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-23.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-23.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-23.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,26 @@ +/* PR 9700 */ +/* Alpha got the base address for the va_list incorrect when there was + a structure that was passed partially in registers and partially on + the stack. */ + +#include + +struct two { long x, y; }; + +void foo(int a, int b, int c, int d, int e, struct two f, int g, ...) 
+{ + va_list args; + int h; + + va_start(args, g); + h = va_arg(args, int); + if (g != 1 || h != 2) + abort (); +} + +int main() +{ + struct two t = { 0, 0 }; + foo(0, 0, 0, 0, 0, t, 1, 2); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-24.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-24.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-24.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-24.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,96 @@ +/* The purpose of this code is to test argument passing of a tuple of + 11 integers, with the break point between named and unnamed arguments + at every possible position. */ + +#include +#include +#include + +static int errors = 0; + +static void +verify (const char *tcase, int n[11]) +{ + int i; + for (i = 0; i <= 10; i++) + if (n[i] != i) + { + printf (" %s: n[%d] = %d expected %d\n", tcase, i, n[i], i); + errors++; + } +} + +#define STR(x) #x + +#define p(i) int q##i, +#define P(i) n[i] = q##i; + +#define p0 p(0) +#define p1 p(1) +#define p2 p(2) +#define p3 p(3) +#define p4 p(4) +#define p5 p(5) +#define p6 p(6) +#define p7 p(7) +#define p8 p(8) +#define p9 p(9) + +#define P0 P(0) +#define P1 P(1) +#define P2 P(2) +#define P3 P(3) +#define P4 P(4) +#define P5 P(5) +#define P6 P(6) +#define P7 P(7) +#define P8 P(8) +#define P9 P(9) + +#define TCASE(x, params, vecinit) \ +static void \ +varargs##x (params ...) 
\ +{ \ + va_list ap; \ + int n[11]; \ + int i; \ + \ + va_start (ap, q##x); \ + vecinit \ + for (i = x + 1; i <= 10; i++) \ + n[i] = va_arg (ap, int); \ + va_end (ap); \ + \ + verify (STR(varargs##x), n); \ +} + +#define TEST(x) varargs##x (0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10) + +TCASE(0, p0 , P0 ) +TCASE(1, p0 p1 , P0 P1 ) +TCASE(2, p0 p1 p2 , P0 P1 P2 ) +TCASE(3, p0 p1 p2 p3 , P0 P1 P2 P3 ) +TCASE(4, p0 p1 p2 p3 p4 , P0 P1 P2 P3 P4 ) +TCASE(5, p0 p1 p2 p3 p4 p5 , P0 P1 P2 P3 P4 P5 ) +TCASE(6, p0 p1 p2 p3 p4 p5 p6 , P0 P1 P2 P3 P4 P5 P6 ) +TCASE(7, p0 p1 p2 p3 p4 p5 p6 p7 , P0 P1 P2 P3 P4 P5 P6 P7 ) +TCASE(8, p0 p1 p2 p3 p4 p5 p6 p7 p8 , P0 P1 P2 P3 P4 P5 P6 P7 P8 ) +TCASE(9, p0 p1 p2 p3 p4 p5 p6 p7 p8 p9, P0 P1 P2 P3 P4 P5 P6 P7 P8 P9) + +int main(void) +{ + TEST(0); + TEST(1); + TEST(2); + TEST(3); + TEST(4); + TEST(5); + TEST(6); + TEST(7); + TEST(8); + TEST(9); + + if (errors) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-26.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-26.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-26.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-26.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,20 @@ +#include + +double f (float f1, float f2, float f3, float f4, + float f5, float f6, ...) 
+{ + va_list ap; + double d; + + va_start (ap, f6); + d = va_arg (ap, double); + va_end (ap); + return d; +} + +int main () +{ + if (f (1, 2, 3, 4, 5, 6, 7.0) != 7.0) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-4.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-4.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-4.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-4.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,33 @@ +/* On the i960 any arg bigger than 16 bytes causes all subsequent args + to be passed on the stack. We test this. */ + +#include + +typedef struct { + char a[32]; +} big; + +void +f (big x, char *s, ...) +{ + va_list ap; + + if (x.a[0] != 'a' || x.a[1] != 'b' || x.a[2] != 'c') + abort (); + va_start (ap, s); + if (va_arg (ap, int) != 42) + abort (); + if (va_arg (ap, int) != 'x') + abort (); + if (va_arg (ap, int) != 0) + abort (); + va_end (ap); +} + +main () +{ + static big x = { "abc" }; + + f (x, "", 42, 'x', 0); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-5.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-5.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-5.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-5.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,44 @@ +#include + +va_double (int n, ...) 
+{ + va_list args; + + va_start (args, n); + + if (va_arg (args, double) != 3.141592) + abort (); + if (va_arg (args, double) != 2.71827) + abort (); + if (va_arg (args, double) != 2.2360679) + abort (); + if (va_arg (args, double) != 2.1474836) + abort (); + + va_end (args); +} + +va_long_double (int n, ...) +{ + va_list args; + + va_start (args, n); + + if (va_arg (args, long double) != 3.141592L) + abort (); + if (va_arg (args, long double) != 2.71827L) + abort (); + if (va_arg (args, long double) != 2.2360679L) + abort (); + if (va_arg (args, long double) != 2.1474836L) + abort (); + + va_end (args); +} + +main () +{ + va_double (4, 3.141592, 2.71827, 2.2360679, 2.1474836); + va_long_double (4, 3.141592L, 2.71827L, 2.2360679L, 2.1474836L); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-6.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-6.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-6.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-6.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,35 @@ +#include + +f (int n, ...) 
+{ + va_list args; + + va_start (args, n); + + if (va_arg (args, int) != 10) + abort (); + if (va_arg (args, long long) != 10000000000LL) + abort (); + if (va_arg (args, int) != 11) + abort (); + if (va_arg (args, long double) != 3.14L) + abort (); + if (va_arg (args, int) != 12) + abort (); + if (va_arg (args, int) != 13) + abort (); + if (va_arg (args, long long) != 20000000000LL) + abort (); + if (va_arg (args, int) != 14) + abort (); + if (va_arg (args, double) != 2.72) + abort (); + + va_end(args); +} + +main () +{ + f (4, 10, 10000000000LL, 11, 3.14L, 12, 13, 20000000000LL, 14, 2.72); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-7.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-7.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-7.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-7.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,35 @@ +/* Origin: Franz Sirl */ +/* { dg-options "-fgnu89-inline" } */ + +extern void abort (void); +extern void exit (int); + +#include + +inline void +debug(int i1, int i2, int i3, int i4, int i5, int i6, int i7, + double f1, double f2, double f3, double f4, double f5, + double f6, double f7, double f8, double f9, ...) 
+{ + va_list ap; + + va_start (ap, f9); + + if (va_arg (ap,int) != 8) + abort (); + if (va_arg (ap,int) != 9) + abort (); + if (va_arg (ap,int) != 10) + abort (); + + va_end (ap); +} + +int +main(void) +{ + debug (1, 2, 3, 4, 5, 6, 7, + 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, + 8, 9, 10); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-8.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-8.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-8.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-8.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,45 @@ +/* Origin: Franz Sirl */ +/* { dg-options "-fgnu89-inline" } */ + +extern void abort (void); +extern void exit (int); + +#include +#include + +#if __LONG_LONG_MAX__ == 9223372036854775807LL + +typedef long long int INT64; + +inline void +debug(int i1, int i2, int i3, int i4, int i5, + int i6, int i7, int i8, int i9, ...) 
+{ + va_list ap; + + va_start (ap, i9); + + if (va_arg (ap,int) != 10) + abort (); + if (va_arg (ap,INT64) != 0x123400005678LL) + abort (); + + va_end (ap); +} + +int +main(void) +{ + debug(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 0x123400005678LL); + exit(0); +} + +#else + +int +main(void) +{ + exit(0); +} + +#endif /* long long 64 bits */ Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-9.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-9.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-9.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-9.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,236 @@ +/* This is a modfied version of va-arg-2.c to test passing a va_list as + a parameter to another function. */ + +#include + +extern __SIZE_TYPE__ strlen (const char *); + +int +to_hex (unsigned int a) +{ + static char hex[] = "0123456789abcdef"; + + if (a > 15) + abort (); + return hex[a]; +} + +void +fap (int i, char* format, va_list ap) +{ + if (strlen (format) != 16 - i) + abort (); + while (*format) + if (*format++ != to_hex (va_arg (ap, int))) + abort (); +} + +void +f0 (char* format, ...) +{ + va_list ap; + + va_start (ap, format); + fap(0, format, ap); + va_end(ap); +} + +void +f1 (int a1, char* format, ...) +{ + va_list ap; + + va_start(ap, format); + fap(1, format, ap); + va_end(ap); +} + +void +f2 (int a1, int a2, char* format, ...) +{ + va_list ap; + + va_start(ap, format); + fap(2, format, ap); + va_end(ap); +} + +void +f3 (int a1, int a2, int a3, char* format, ...) +{ + va_list ap; + + va_start(ap, format); + fap(3, format, ap); + va_end(ap); +} + +void +f4 (int a1, int a2, int a3, int a4, char* format, ...) 
+{ + va_list ap; + + va_start(ap, format); + fap(4, format, ap); + va_end(ap); +} + +void +f5 (int a1, int a2, int a3, int a4, int a5, + char* format, ...) +{ + va_list ap; + + va_start(ap, format); + fap(5, format, ap); + va_end(ap); +} + +void +f6 (int a1, int a2, int a3, int a4, int a5, + int a6, + char* format, ...) +{ + va_list ap; + + va_start(ap, format); + fap(6, format, ap); + va_end(ap); +} + +void +f7 (int a1, int a2, int a3, int a4, int a5, + int a6, int a7, + char* format, ...) +{ + va_list ap; + + va_start(ap, format); + fap(7, format, ap); + va_end(ap); +} + +void +f8 (int a1, int a2, int a3, int a4, int a5, + int a6, int a7, int a8, + char* format, ...) +{ + va_list ap; + + va_start(ap, format); + fap(8, format, ap); + va_end(ap); +} + +void +f9 (int a1, int a2, int a3, int a4, int a5, + int a6, int a7, int a8, int a9, + char* format, ...) +{ + va_list ap; + + va_start(ap, format); + fap(9, format, ap); + va_end(ap); +} + +void +f10 (int a1, int a2, int a3, int a4, int a5, + int a6, int a7, int a8, int a9, int a10, + char* format, ...) +{ + va_list ap; + + va_start(ap, format); + fap(10, format, ap); + va_end(ap); +} + +void +f11 (int a1, int a2, int a3, int a4, int a5, + int a6, int a7, int a8, int a9, int a10, + int a11, + char* format, ...) +{ + va_list ap; + + va_start(ap, format); + fap(11, format, ap); + va_end(ap); +} + +void +f12 (int a1, int a2, int a3, int a4, int a5, + int a6, int a7, int a8, int a9, int a10, + int a11, int a12, + char* format, ...) +{ + va_list ap; + + va_start(ap, format); + fap(12, format, ap); + va_end(ap); +} + +void +f13 (int a1, int a2, int a3, int a4, int a5, + int a6, int a7, int a8, int a9, int a10, + int a11, int a12, int a13, + char* format, ...) +{ + va_list ap; + + va_start(ap, format); + fap(13, format, ap); + va_end(ap); +} + +void +f14 (int a1, int a2, int a3, int a4, int a5, + int a6, int a7, int a8, int a9, int a10, + int a11, int a12, int a13, int a14, + char* format, ...) 
+{ + va_list ap; + + va_start(ap, format); + fap(14, format, ap); + va_end(ap); +} + +void +f15 (int a1, int a2, int a3, int a4, int a5, + int a6, int a7, int a8, int a9, int a10, + int a11, int a12, int a13, int a14, int a15, + char* format, ...) +{ + va_list ap; + + va_start(ap, format); + fap(15, format, ap); + va_end(ap); +} + +main () +{ + char *f = "0123456789abcdef"; + + f0 (f+0, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15); + f1 (0, f+1, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15); + f2 (0, 1, f+2, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15); + f3 (0, 1, 2, f+3, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15); + f4 (0, 1, 2, 3, f+4, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15); + f5 (0, 1, 2, 3, 4, f+5, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15); + f6 (0, 1, 2, 3, 4, 5, f+6, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15); + f7 (0, 1, 2, 3, 4, 5, 6, f+7, 7, 8, 9, 10, 11, 12, 13, 14, 15); + f8 (0, 1, 2, 3, 4, 5, 6, 7, f+8, 8, 9, 10, 11, 12, 13, 14, 15); + f9 (0, 1, 2, 3, 4, 5, 6, 7, 8, f+9, 9, 10, 11, 12, 13, 14, 15); + f10 (0, 1, 2, 3, 4, 5, 6, 7, 8, 9, f+10, 10, 11, 12, 13, 14, 15); + f11 (0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, f+11, 11, 12, 13, 14, 15); + f12 (0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, f+12, 12, 13, 14, 15); + f13 (0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, f+13, 13, 14, 15); + f14 (0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, f+14, 14, 15); + f15 (0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, f+15, 15); + + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-pack-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-pack-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-pack-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-pack-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,143 @@ +/* 
__builtin_va_arg_pack () builtin tests. */ + +#include + +extern void abort (void); + +int v1 = 8; +long int v2 = 3; +void *v3 = (void *) &v2; +struct A { char c[16]; } v4 = { "foo" }; +long double v5 = 40; +char seen[20]; +int cnt; + +__attribute__ ((noinline)) int +foo1 (int x, int y, ...) +{ + int i; + long int l; + void *v; + struct A a; + long double ld; + va_list ap; + + va_start (ap, y); + if (x < 0 || x >= 20 || seen[x]) + abort (); + seen[x] = ++cnt; + if (y != 6) + abort (); + i = va_arg (ap, int); + if (i != 5) + abort (); + switch (x) + { + case 0: + i = va_arg (ap, int); + if (i != 9 || v1 != 9) + abort (); + a = va_arg (ap, struct A); + if (__builtin_memcmp (a.c, v4.c, sizeof (a.c)) != 0) + abort (); + v = (void *) va_arg (ap, struct A *); + if (v != (void *) &v4) + abort (); + l = va_arg (ap, long int); + if (l != 3 || v2 != 4) + abort (); + break; + case 1: + ld = va_arg (ap, long double); + if (ld != 41 || v5 != ld) + abort (); + i = va_arg (ap, int); + if (i != 8) + abort (); + v = va_arg (ap, void *); + if (v != &v2) + abort (); + break; + case 2: + break; + default: + abort (); + } + va_end (ap); + return x; +} + +__attribute__ ((noinline)) int +foo2 (int x, int y, ...) 
+{ + long long int ll; + void *v; + struct A a, b; + long double ld; + va_list ap; + + va_start (ap, y); + if (x < 0 || x >= 20 || seen[x]) + abort (); + seen[x] = ++cnt | 64; + if (y != 10) + abort (); + switch (x) + { + case 11: + break; + case 12: + ld = va_arg (ap, long double); + if (ld != 41 || v5 != 40) + abort (); + a = va_arg (ap, struct A); + if (__builtin_memcmp (a.c, v4.c, sizeof (a.c)) != 0) + abort (); + b = va_arg (ap, struct A); + if (__builtin_memcmp (b.c, v4.c, sizeof (b.c)) != 0) + abort (); + v = va_arg (ap, void *); + if (v != &v2) + abort (); + ll = va_arg (ap, long long int); + if (ll != 16LL) + abort (); + break; + case 2: + break; + default: + abort (); + } + va_end (ap); + return x + 8; +} + +__attribute__ ((noinline)) int +foo3 (void) +{ + return 6; +} + +extern inline __attribute__ ((always_inline, gnu_inline)) int +bar (int x, ...) +{ + if (x < 10) + return foo1 (x, foo3 (), 5, __builtin_va_arg_pack ()); + return foo2 (x, foo3 () + 4, __builtin_va_arg_pack ()); +} + +int +main (void) +{ + if (bar (0, ++v1, v4, &v4, v2++) != 0) + abort (); + if (bar (1, ++v5, 8, v3) != 1) + abort (); + if (bar (2) != 2) + abort (); + if (bar (v1 + 2) != 19) + abort (); + if (bar (v1 + 3, v5--, v4, v4, v3, 16LL) != 20) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-trap-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-trap-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-trap-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/va-arg-trap-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,36 @@ +/* Undefined behavior from a call to va_arg with a type other than + that of the argument passed (in particular, with a type such as + "float" that can never be the type of an argument passed 
through + "...") does not appear until after the va_list expression is + evaluated. PR 38483. */ +/* Origin: Joseph Myers */ + +#include + +extern void exit (int); +extern void abort (void); + +va_list ap; +float f; + +va_list * +foo (void) +{ + exit (0); + return &ap; +} + +void +bar (int i, ...) +{ + va_start (ap, i); + f = va_arg (*foo (), float); + va_end (ap); +} + +int +main (void) +{ + bar (1, 0); + abort (); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/vfprintf-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/vfprintf-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/vfprintf-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/vfprintf-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,55 @@ +/* { dg-skip-if "requires io" { freestanding } } */ + +#ifndef test +#include +#include +#include + +void +inner (int x, ...) 
+{ + va_list ap, ap2; + va_start (ap, x); + va_start (ap2, x); + + switch (x) + { +#define test(n, ret, fmt, args) \ + case n: \ + vfprintf (stdout, fmt, ap); \ + if (vfprintf (stdout, fmt, ap2) != ret) \ + abort (); \ + break; +#include "vfprintf-1.c" +#undef test + default: + abort (); + } + + va_end (ap); + va_end (ap2); +} + +int +main (void) +{ +#define test(n, ret, fmt, args) \ + inner args; +#include "vfprintf-1.c" +#undef test + return 0; +} + +#else + test (0, 5, "hello", (0)); + test (1, 6, "hello\n", (1)); + test (2, 1, "a", (2)); + test (3, 0, "", (3)); + test (4, 5, "%s", (4, "hello")); + test (5, 6, "%s", (5, "hello\n")); + test (6, 1, "%s", (6, "a")); + test (7, 0, "%s", (7, "")); + test (8, 1, "%c", (8, 'x')); + test (9, 7, "%s\n", (9, "hello\n")); + test (10, 2, "%d\n", (10, 0)); +#endif Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/vfprintf-chk-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/vfprintf-chk-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/vfprintf-chk-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/vfprintf-chk-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,75 @@ +/* { dg-skip-if "requires io" { freestanding } } */ + +#ifndef test +#include +#include +#include + +volatile int should_optimize; + +int +__attribute__((noinline)) +__vfprintf_chk (FILE *f, int flag, const char *fmt, va_list ap) +{ +#ifdef __OPTIMIZE__ + if (should_optimize) + abort (); +#endif + should_optimize = 1; + return vfprintf (f, fmt, ap); +} + +void +inner (int x, ...) +{ + va_list ap, ap2; + va_start (ap, x); + va_start (ap2, x); + + switch (x) + { +#define test(n, ret, opt, fmt, args) \ + case n: \ + should_optimize = opt; \ + __vfprintf_chk (stdout, 1, fmt, ap); \ + if (! 
should_optimize) \ + abort (); \ + should_optimize = 0; \ + if (__vfprintf_chk (stdout, 1, fmt, ap2) != ret) \ + abort (); \ + if (! should_optimize) \ + abort (); \ + break; +#include "vfprintf-chk-1.c" +#undef test + default: + abort (); + } + + va_end (ap); + va_end (ap2); +} + +int +main (void) +{ +#define test(n, ret, opt, fmt, args) \ + inner args; +#include "vfprintf-chk-1.c" +#undef test + return 0; +} + +#else + test (0, 5, 1, "hello", (0)); + test (1, 6, 1, "hello\n", (1)); + test (2, 1, 1, "a", (2)); + test (3, 0, 1, "", (3)); + test (4, 5, 0, "%s", (4, "hello")); + test (5, 6, 0, "%s", (5, "hello\n")); + test (6, 1, 0, "%s", (6, "a")); + test (7, 0, 0, "%s", (7, "")); + test (8, 1, 0, "%c", (8, 'x')); + test (9, 7, 0, "%s\n", (9, "hello\n")); + test (10, 2, 0, "%d\n", (10, 0)); +#endif Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/vla-dealloc-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/vla-dealloc-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/vla-dealloc-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/vla-dealloc-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,29 @@ +/* VLAs should be deallocated on a jump to before their definition, + including a jump to a label in an inner scope. PR 19771. 
*/ +/* { dg-require-effective-target alloca } */ + +#if (__SIZEOF_INT__ <= 2) +#define LIMIT 10000 +#else +#define LIMIT 1000000 +#endif + +void *volatile p; + +int +main (void) +{ + int n = 0; + if (0) + { + lab:; + } + int x[n % 1000 + 1]; + x[0] = 1; + x[n % 1000] = 2; + p = x; + n++; + if (n < LIMIT) + goto lab; + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/vprintf-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/vprintf-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/vprintf-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/vprintf-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,55 @@ +/* { dg-skip-if "requires io" { freestanding } } */ + +#ifndef test +#include +#include +#include + +void +inner (int x, ...) +{ + va_list ap, ap2; + va_start (ap, x); + va_start (ap2, x); + + switch (x) + { +#define test(n, ret, fmt, args) \ + case n: \ + vprintf (fmt, ap); \ + if (vprintf (fmt, ap2) != ret) \ + abort (); \ + break; +#include "vprintf-1.c" +#undef test + default: + abort (); + } + + va_end (ap); + va_end (ap2); +} + +int +main (void) +{ +#define test(n, ret, fmt, args) \ + inner args; +#include "vprintf-1.c" +#undef test + return 0; +} + +#else + test (0, 5, "hello", (0)); + test (1, 6, "hello\n", (1)); + test (2, 1, "a", (2)); + test (3, 0, "", (3)); + test (4, 5, "%s", (4, "hello")); + test (5, 6, "%s", (5, "hello\n")); + test (6, 1, "%s", (6, "a")); + test (7, 0, "%s", (7, "")); + test (8, 1, "%c", (8, 'x')); + test (9, 7, "%s\n", (9, "hello\n")); + test (10, 2, "%d\n", (10, 0)); +#endif Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/vprintf-chk-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/vprintf-chk-1.c?rev=374156&view=auto 
============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/vprintf-chk-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/vprintf-chk-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,75 @@ +/* { dg-skip-if "requires io" { freestanding } } */ + +#ifndef test +#include +#include +#include + +volatile int should_optimize; + +int +__attribute__((noinline)) +__vprintf_chk (int flag, const char *fmt, va_list ap) +{ +#ifdef __OPTIMIZE__ + if (should_optimize) + abort (); +#endif + should_optimize = 1; + return vprintf (fmt, ap); +} + +void +inner (int x, ...) +{ + va_list ap, ap2; + va_start (ap, x); + va_start (ap2, x); + + switch (x) + { +#define test(n, ret, opt, fmt, args) \ + case n: \ + should_optimize = opt; \ + __vprintf_chk (1, fmt, ap); \ + if (! should_optimize) \ + abort (); \ + should_optimize = 0; \ + if (__vprintf_chk (1, fmt, ap2) != ret) \ + abort (); \ + if (! should_optimize) \ + abort (); \ + break; +#include "vprintf-chk-1.c" +#undef test + default: + abort (); + } + + va_end (ap); + va_end (ap2); +} + +int +main (void) +{ +#define test(n, ret, opt, fmt, args) \ + inner args; +#include "vprintf-chk-1.c" +#undef test + return 0; +} + +#else + test (0, 5, 0, "hello", (0)); + test (1, 6, 1, "hello\n", (1)); + test (2, 1, 1, "a", (2)); + test (3, 0, 1, "", (3)); + test (4, 5, 0, "%s", (4, "hello")); + test (5, 6, 0, "%s", (5, "hello\n")); + test (6, 1, 0, "%s", (6, "a")); + test (7, 0, 0, "%s", (7, "")); + test (8, 1, 0, "%c", (8, 'x')); + test (9, 7, 0, "%s\n", (9, "hello\n")); + test (10, 2, 0, "%d\n", (10, 0)); +#endif Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/vrp-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/vrp-1.c?rev=374156&view=auto ============================================================================== --- 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/vrp-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/vrp-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,19 @@ + +extern void abort (); +extern void exit (int); + +int f (int a) { + if (a != 2) { + a = -a; + if (a == 2) + return 0; + return 1; + } + return 1; +} + +int main (int argc, char *argv[]) { + if (f (-2)) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/vrp-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/vrp-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/vrp-2.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/vrp-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,18 @@ +extern void abort (); +extern void exit (int); + +int f (int a) { + if (a != 2) { + a = a > 0 ? a : -a; + if (a == 2) + return 0; + return 1; + } + return 1; +} + +int main (int argc, char *argv[]) { + if (f (-2)) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/vrp-3.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/vrp-3.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/vrp-3.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/vrp-3.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,20 @@ +extern void abort (); +extern void exit (int); + +int f (int a) { + if (a < 12) { + if (a > -15) { + a = a > 0 ? 
a : -a; + if (a == 2) + return 0; + return 1; + } + } + return 1; +} + +int main (int argc, char *argv[]) { + if (f (-2)) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/vrp-4.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/vrp-4.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/vrp-4.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/vrp-4.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,20 @@ +extern void exit (int); +extern void abort (); + +void test(int x, int y) +{ + int c; + + if (x == 1) abort(); + if (y == 1) abort(); + + c = x / y; + + if (c != 1) abort(); +} + +int main() +{ + test(2, 2); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/vrp-5.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/vrp-5.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/vrp-5.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/vrp-5.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,20 @@ +/* { dg-require-effective-target int32plus } */ +extern void exit (int); +extern void abort (); + +void test(unsigned int a, unsigned int b) +{ + if (a < 5) + abort(); + if (b < 5) + abort(); + if (a + b != 0U) + abort(); +} + +int main(int argc, char *argv[]) +{ + unsigned int x = 0x80000000; + test(x, x); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/vrp-6.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/vrp-6.c?rev=374156&view=auto ============================================================================== --- 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/vrp-6.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/vrp-6.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,31 @@ +/* { dg-require-effective-target int32plus } */ +#include + +extern void exit (int); +extern void abort (); + +void test01(unsigned int a, unsigned int b) +{ + if (a < 5) + abort(); + if (b < 5) + abort(); + if (a - b != 5) + abort(); +} + +void test02(unsigned int a, unsigned int b) +{ + if (a >= 12) + if (b > 15) + if (a - b < UINT_MAX - 15U) + abort (); +} + +int main(int argc, char *argv[]) +{ + unsigned x = 0x80000000; + test01(x + 5, x); + test02(14, 16); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/vrp-7.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/vrp-7.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/vrp-7.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/vrp-7.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,20 @@ + +void abort (void); + +struct T +{ + int b : 1; +} t; + +void __attribute__((noinline)) foo (int f) +{ + t.b = (f & 0x10) ? 
1 : 0; +} + +int main (void) +{ + foo (0x10); + if (!t.b) + abort (); + return 0; +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/wchar_t-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/wchar_t-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/wchar_t-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/wchar_t-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,17 @@ +/* { dg-options "-finput-charset=utf-8" } */ +typedef __WCHAR_TYPE__ wchar_t; +wchar_t x[] = L"Ä"; +wchar_t y = L'Ä'; +extern void abort (void); +extern void exit (int); + +int main (void) +{ + if (sizeof (x) / sizeof (wchar_t) != 2) + abort (); + if (x[0] != L'Ä' || x[1] != L'\0') + abort (); + if (y != L'Ä') + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/widechar-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/widechar-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/widechar-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/widechar-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,14 @@ +#define C L'\400' + +#if C +#define zero (!C) +#else +#define zero C +#endif + +main() +{ + if (zero != 0) + abort (); + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/widechar-2.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/widechar-2.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/widechar-2.c (added) +++ 
test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/widechar-2.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,11 @@ +#include + +const wchar_t ws[] = L"foo"; + +int +main (void) +{ + if (ws[0] != L'f' || ws[1] != L'o' || ws[2] != L'o' || ws[3] != L'\0') + abort(); + exit(0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/widechar-3.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/widechar-3.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/widechar-3.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/widechar-3.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,26 @@ +extern void abort (void); +extern void exit (int); + +static int f(char *x) +{ + return __builtin_strlen(x); +} + +int foo () +{ + return f((char*)&L"abcdef"[0]); +} + + +int +main() +{ +#if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__ + if (foo () != 0) + abort (); +#elif __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__ + if (foo () != 1) + abort (); +#endif + exit (0); +} Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/zero-struct-1.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/zero-struct-1.c?rev=374156&view=auto ============================================================================== --- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/zero-struct-1.c (added) +++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/zero-struct-1.c Wed Oct 9 04:01:46 2019 @@ -0,0 +1,23 @@ +struct g{}; +char y[3]; +char *f = &y[0]; +char *ff = &y[0]; +void h(void) +{ + struct g t; + *((struct g*)(f++)) = *((struct g*)(ff++)); + *((struct g*)(f++)) = (struct g){}; + t = *((struct g*)(ff++)); +} + +void abort (void); + +int main(void) +{ + h(); + if (f != &y[2]) + abort(); + if (ff != &y[2]) 
+    abort();
+  return 0;
+}

Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/zero-struct-2.c
URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/zero-struct-2.c?rev=374156&view=auto
==============================================================================
--- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/zero-struct-2.c (added)
+++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/zero-struct-2.c Wed Oct 9 04:01:46 2019
@@ -0,0 +1,19 @@
+void abort (void);
+int ii;
+typedef struct {} raw_spinlock_t;
+typedef struct {
+  raw_spinlock_t raw_lock;
+} spinlock_t;
+raw_spinlock_t one_raw_spinlock (void)
+{
+  raw_spinlock_t raw_lock;
+  ii++;
+  return raw_lock;
+}
+int main(void)
+{
+  spinlock_t lock = (spinlock_t) { .raw_lock = one_raw_spinlock() };
+  if (ii != 1)
+    abort ();
+  return 0;
+}

Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/zerolen-1.c
URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/zerolen-1.c?rev=374156&view=auto
==============================================================================
--- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/zerolen-1.c (added)
+++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/zerolen-1.c Wed Oct 9 04:01:46 2019
@@ -0,0 +1,32 @@
+extern void abort (void);
+extern void exit (int);
+
+union iso_directory_record {
+  char carr[4];
+  struct {
+    unsigned char name_len [1];
+    char name [0];
+  } u;
+} entry;
+
+void set(union iso_directory_record *);
+
+int main (void)
+{
+  union iso_directory_record *de;
+
+  de = &entry;
+  set(de);
+
+  if (de->u.name_len[0] == 1 && de->u.name[0] == 0)
+    exit (0);
+  else
+    abort ();
+}
+
+void set (union iso_directory_record *p)
+{
+  p->carr[0] = 1;
+  p->carr[1] = 0;
+  return;
+}

Added: test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/zerolen-2.c
URL:
http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/zerolen-2.c?rev=374156&view=auto
==============================================================================
--- test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/zerolen-2.c (added)
+++ test-suite/trunk/SingleSource/Regression/C/gcc-c-torture/execute/zerolen-2.c Wed Oct 9 04:01:46 2019
@@ -0,0 +1,19 @@
+/* { dg-skip-if "assumes absence of larger-than-word padding" { epiphany-*-* } } */
+extern void abort(void);
+
+typedef int word __attribute__((mode(word)));
+
+struct foo
+{
+  word x;
+  word y[0];
+};
+
+int main()
+{
+  if (sizeof(word) != sizeof(struct foo))
+    abort();
+  if (__alignof__(word) != __alignof__(struct foo))
+    abort();
+  return 0;
+}

From llvm-commits at lists.llvm.org Wed Oct 9 04:00:30 2019
From: llvm-commits at lists.llvm.org (Tim Northover via Phabricator via llvm-commits)
Date: Wed, 09 Oct 2019 11:00:30 +0000 (UTC)
Subject: [PATCH] D68675: [9.0 branch][ARM] VFPv2 only supports 16 D registers.
In-Reply-To:
References:
Message-ID:

t.p.northover accepted this revision.
t.p.northover added a comment.
This revision is now accepted and ready to land.

LGTM!

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68675/new/

https://reviews.llvm.org/D68675

From llvm-commits at lists.llvm.org Wed Oct 9 04:00:30 2019
From: llvm-commits at lists.llvm.org (Nicolai Hähnle via Phabricator via llvm-commits)
Date: Wed, 09 Oct 2019 11:00:30 +0000 (UTC)
Subject: [PATCH] D68690: AMDGPU/SILoadStoreOptimizer: fix a likely bug introduced recently
Message-ID:

nhaehnle created this revision.
nhaehnle added a reviewer: tstellar.
Herald added subscribers: hiraditya, t-tye, tpr, dstuttard, yaxunl, wdng, jvesely, kzhuravl, arsenm.
Herald added a project: LLVM.
nhaehnle added a parent revision: D65961: AMDGPU/SILoadStoreOptimizer: Optimize scanning for mergeable instructions.

We should check for the same instruction class before checking whether the instructions have the same base address; otherwise we might iterate out of bounds of a MachineInstr's operand list. The InstClass check is also cheaper.

This was introduced in SVN r373630.

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D68690

Files:
  llvm/lib/Target/AMDGPU/SILoadStoreOptimizer.cpp

Index: llvm/lib/Target/AMDGPU/SILoadStoreOptimizer.cpp
===================================================================
--- llvm/lib/Target/AMDGPU/SILoadStoreOptimizer.cpp
+++ llvm/lib/Target/AMDGPU/SILoadStoreOptimizer.cpp
@@ -1512,8 +1512,8 @@
 void SILoadStoreOptimizer::addInstToMergeableList(const CombineInfo &CI,
                  std::list<std::list<CombineInfo> > &MergeableInsts) const {
   for (std::list<CombineInfo> &AddrList : MergeableInsts) {
-    if (AddrList.front().hasSameBaseAddress(*CI.I) &&
-        AddrList.front().InstClass == CI.InstClass) {
+    if (AddrList.front().InstClass == CI.InstClass &&
+        AddrList.front().hasSameBaseAddress(*CI.I)) {
       AddrList.emplace_back(CI);
       return;
     }

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68690.224013.patch
Type: text/x-patch
Size: 738 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Wed Oct 9 04:00:31 2019
From: llvm-commits at lists.llvm.org (Sjoerd Meijer via Phabricator via llvm-commits)
Date: Wed, 09 Oct 2019 11:00:31 +0000 (UTC)
Subject: [PATCH] D68082: [LV] Emitting SCEV checks with OptForSize
In-Reply-To:
References:
Message-ID:

SjoerdMeijer updated this revision to Diff 224012.
SjoerdMeijer added a comment.

Cheers, I moved the test back to the `LoopVectorize` directory, where it once was, instead of `LoopVectorize\X86` (which was indeed a mistake; I then added the extra RUN line out of frustration, as the `skx` core was unknown to me).

My TODO list has grown to fixing up:

1. `InterleavedAccessInfo::collectConstStrideAccesses()`
2. `LoopAccessInfo::collectStridedAccess()`

Will follow up on this shortly.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68082/new/

https://reviews.llvm.org/D68082

Files:
  llvm/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp
  llvm/test/Transforms/LoopVectorize/optsize.ll

Index: llvm/test/Transforms/LoopVectorize/optsize.ll
===================================================================
--- llvm/test/Transforms/LoopVectorize/optsize.ll
+++ llvm/test/Transforms/LoopVectorize/optsize.ll
@@ -84,6 +84,42 @@
   ret i32 0
 }

+; PR43371: don't run into an assert due to emitting SCEV runtime checks
+; with OptForSize.
+;
+@cm_array = external global [2592 x i16], align 1
+
+define void @pr43371() optsize {
+;
+; CHECK-LABEL: @pr43371
+;
+; We do not want to generate SCEV predicates when optimising for size, because
+; that will lead to extra code generation such as the SCEV overflow runtime
+; checks. Not generating SCEV predicates can still result in vectorisation as
+; the non-consecutive loads/stores can be scalarized:
+;
+; CHECK: vector.body:
+; CHECK: store i16 0, i16* %{{.*}}, align 1
+; CHECK: store i16 0, i16* %{{.*}}, align 1
+; CHECK: br i1 {{.*}}, label %vector.body
+;
+entry:
+  br label %for.body29
+
+for.cond.cleanup28:
+  unreachable
+
+for.body29:
+  %i24.0170 = phi i16 [ 0, %entry], [ %inc37, %for.body29]
+  %add33 = add i16 undef, %i24.0170
+  %idxprom34 = zext i16 %add33 to i32
+  %arrayidx35 = getelementptr [2592 x i16], [2592 x i16] * @cm_array, i32 0, i32 %idxprom34
+  store i16 0, i16 * %arrayidx35, align 1
+  %inc37 = add i16 %i24.0170, 1
+  %cmp26 = icmp ult i16 %inc37, 756
+  br i1 %cmp26, label %for.body29, label %for.cond.cleanup28
+}
+
 !llvm.module.flags = !{!0}
 !0 = !{i32 1, !"ProfileSummary", !1}
 !1 = !{!2, !3, !4, !5, !6, !7, !8, !9}
Index: llvm/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp
===================================================================
--- llvm/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp
+++ llvm/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp
@@ -409,7 +409,8 @@
   const ValueToValueMap &Strides =
       getSymbolicStrides() ? *getSymbolicStrides() : ValueToValueMap();

-  int Stride = getPtrStride(PSE, Ptr, TheLoop, Strides, true, false);
+  bool CanAddPredicate = !TheLoop->getHeader()->getParent()->hasOptSize();
+  int Stride = getPtrStride(PSE, Ptr, TheLoop, Strides, CanAddPredicate, false);
   if (Stride == 1 || Stride == -1)
     return Stride;
   return 0;

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68082.224012.patch
Type: text/x-patch
Size: 2217 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Wed Oct 9 04:18:38 2019
From: llvm-commits at lists.llvm.org (Sanjay Patel via Phabricator via llvm-commits)
Date: Wed, 09 Oct 2019 11:18:38 +0000 (UTC)
Subject: [PATCH] D67199: [InstCombine] Expand the simplification of log()
In-Reply-To:
References:
Message-ID: <1da37d3bef157368cad2a77e0bb28b79@localhost.localdomain>

spatel added a comment.

This patch is blamed for a compiler crash in PR43617:
https://bugs.llvm.org/show_bug.cgi?id=43617

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D67199/new/

https://reviews.llvm.org/D67199

From llvm-commits at lists.llvm.org Wed Oct 9 04:18:39 2019
From: llvm-commits at lists.llvm.org (Nicolai Hähnle via Phabricator via llvm-commits)
Date: Wed, 09 Oct 2019 11:18:39 +0000 (UTC)
Subject: [PATCH] D65966: AMDGPU/SILoadStoreOptimizer: Improve merging of out of order offsets
In-Reply-To:
References:
Message-ID:

nhaehnle added a comment.

I think the code would benefit from the refactoring I've mentioned on the other patch, where the lists only hold a structure with information on a single instruction. Maybe call it CandidateInfo (information of one instruction, persistent in lists) vs. CombineInfo (information on a pair, only temporary).

================
Comment at: llvm/lib/Target/AMDGPU/SILoadStoreOptimizer.cpp:482-487
+ // We can't pair these if we can't determine the instruction
+ // class.
+ if (InstClass == UNKNOWN || getInstClass(MI->getOpcode(), TII) != InstClass) {
+ InstClass = UNKNOWN;
+ return;
+ }
----------------
Can that really happen? Only instructions with the same InstClass should be added to the same list.

================
Comment at: llvm/lib/Target/AMDGPU/SILoadStoreOptimizer.cpp:1595
+ MachineBasicBlock::const_iterator Second) const {
+ // FIXME: Is there a better way to do this.
+ const MachineBasicBlock *MBB = First->getParent();
----------------
arsenm wrote:
> Don't you know which is first from which was encountered first?

Wasn't there some talk about ordered basic blocks? They exist for IR apparently, but not for MIR unless we're tracking live ranges, which we don't do here, so...

This pass could perhaps number the CombineInfo instructions in order as they're collected at the start? It'd have to be kept up to date as instructions are merged.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D65966/new/

https://reviews.llvm.org/D65966

From llvm-commits at lists.llvm.org Wed Oct 9 04:18:39 2019
From: llvm-commits at lists.llvm.org (Sanjay Patel via Phabricator via llvm-commits)
Date: Wed, 09 Oct 2019 11:18:39 +0000 (UTC)
Subject: [PATCH] D68667: [SLP] respect target register width for GEP vectorization (PR43578)
In-Reply-To:
References:
Message-ID: <93cf3a4b13a74b51b9044688e2491fd3@localhost.localdomain>

spatel added a comment.

In D68667#1701060, @xbolva00 wrote:

> Generally, I think there are more bugs for -march=haswell. Only in rare cases the perf of binaries with -march=haswell is better than plain -O3.
> I tried this patch with zstd but nothing improved.
>
> Plain -O3
> ./zstd -b selesiafiles/* -f
>
> 3# 13 files : 251919670 -> 97724903 (2.578), 182.0 MB/s , 923.2 MB/s
>
>
> -O3 -march=haswell
> /zstd -b selesiafiles/* -f
>
> 3# 13 files : 251919670 -> 97724903 (2.578), 185.7 MB/s , 866.9 MB/s
>
>
> -O3 -march=haswell -mprefer-vector-width=128
> ./zstd -b bench/* -f
>
> 3# 13 files : 251919670 -> 97724903 (2.578), 188.5 MB/s , 806.8 MB/s
>
>
> for example gcc-10's results for -march=haswell
> ./zstd -b bench/* -f
>
> 3# 13 files : 251919670 -> 97724903 (2.578), 188.7 MB/s ,1032.8 MB/s

Thanks for testing! I suspect that this problem (ignoring the target-based register width) is more widespread than only the transform starting from phi, but I want to make sure we have proper tests in place if we change the behavior in other places. Can you file another bug for "zstd"?

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68667/new/

https://reviews.llvm.org/D68667

From llvm-commits at lists.llvm.org Wed Oct 9 04:18:45 2019
From: llvm-commits at lists.llvm.org (Sam Elliott via Phabricator via llvm-commits)
Date: Wed, 09 Oct 2019 11:18:45 +0000 (UTC)
Subject: [PATCH] D66887: [test-suite] Add GCC C Torture Suite
In-Reply-To:
References:
Message-ID: <4e618219e06d2b057691a1ed24e1a20c@localhost.localdomain>

This revision was automatically updated to reflect the committed changes.
Closed by commit rT374155: [test-suite] Add GCC C Torture Suite (authored by lenary, committed by ).

Repository:
  rT test-suite

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D66887/new/

https://reviews.llvm.org/D66887

Files:
  LICENSE.TXT
  SingleSource/Regression/C/CMakeLists.txt
  SingleSource/Regression/C/gcc-c-torture/CMakeLists.txt
  SingleSource/Regression/C/gcc-c-torture/README
  SingleSource/Regression/C/gcc-c-torture/execute/CMakeLists.txt
  SingleSource/Regression/C/gcc-c-torture/execute/COPYING
  SingleSource/Regression/C/gcc-c-torture/execute/COPYING3
  SingleSource/Regression/C/gcc-c-torture/execute/LICENSE.TXT
  SingleSource/Regression/C/gcc-c-torture/execute/ieee/CMakeLists.txt
  SingleSource/Regression/C/gcc-c-torture/lit.local.cfg

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D66887.224016.patch
Type: text/x-patch
Size: 68806 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Wed Oct 9 04:29:21 2019
From: llvm-commits at lists.llvm.org (Clement Courbet via llvm-commits)
Date: Wed, 09 Oct 2019 11:29:21 -0000
Subject: [llvm] r374157 - [llvm-exegesis][NFC] Remove extra `llvm::` qualifications.
Message-ID: <20191009112921.6C6CA90A78@lists.llvm.org>

Author: courbet
Date: Wed Oct 9 04:29:21 2019
New Revision: 374157

URL: http://llvm.org/viewvc/llvm-project?rev=374157&view=rev
Log:
[llvm-exegesis][NFC] Remove extra `llvm::` qualifications.

Summary: First patch: in unit tests.
Subscribers: nemanjai, tschuett, MaskRay, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68687 Modified: llvm/trunk/unittests/tools/llvm-exegesis/AArch64/TargetTest.cpp llvm/trunk/unittests/tools/llvm-exegesis/ARM/AssemblerTest.cpp llvm/trunk/unittests/tools/llvm-exegesis/Common/AssemblerUtils.h llvm/trunk/unittests/tools/llvm-exegesis/PerfHelperTest.cpp llvm/trunk/unittests/tools/llvm-exegesis/PowerPC/AnalysisTest.cpp llvm/trunk/unittests/tools/llvm-exegesis/PowerPC/TargetTest.cpp llvm/trunk/unittests/tools/llvm-exegesis/RegisterValueTest.cpp llvm/trunk/unittests/tools/llvm-exegesis/X86/AssemblerTest.cpp llvm/trunk/unittests/tools/llvm-exegesis/X86/BenchmarkResultTest.cpp llvm/trunk/unittests/tools/llvm-exegesis/X86/RegisterAliasingTest.cpp llvm/trunk/unittests/tools/llvm-exegesis/X86/SchedClassResolutionTest.cpp llvm/trunk/unittests/tools/llvm-exegesis/X86/SnippetGeneratorTest.cpp llvm/trunk/unittests/tools/llvm-exegesis/X86/TargetTest.cpp Modified: llvm/trunk/unittests/tools/llvm-exegesis/AArch64/TargetTest.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/unittests/tools/llvm-exegesis/AArch64/TargetTest.cpp?rev=374157&r1=374156&r2=374157&view=diff ============================================================================== --- llvm/trunk/unittests/tools/llvm-exegesis/AArch64/TargetTest.cpp (original) +++ llvm/trunk/unittests/tools/llvm-exegesis/AArch64/TargetTest.cpp Wed Oct 9 04:29:21 2019 @@ -34,10 +34,10 @@ constexpr const char kTriple[] = "aarch6 class AArch64TargetTest : public ::testing::Test { protected: AArch64TargetTest() - : ExegesisTarget_(ExegesisTarget::lookup(llvm::Triple(kTriple))) { + : ExegesisTarget_(ExegesisTarget::lookup(Triple(kTriple))) { EXPECT_THAT(ExegesisTarget_, NotNull()); std::string error; - Target_ = llvm::TargetRegistry::lookupTarget(kTriple, error); + Target_ = TargetRegistry::lookupTarget(kTriple, error); EXPECT_THAT(Target_, NotNull()); STI_.reset( 
Target_->createMCSubtargetInfo(kTriple, "generic", /*no features*/ "")); @@ -54,14 +54,14 @@ protected: return ExegesisTarget_->setRegTo(*STI_, Reg, Value); } - const llvm::Target *Target_; + const Target *Target_; const ExegesisTarget *const ExegesisTarget_; - std::unique_ptr STI_; + std::unique_ptr STI_; }; TEST_F(AArch64TargetTest, SetRegToConstant) { // The AArch64 target currently doesn't know how to set register values. - const auto Insts = setRegTo(llvm::AArch64::X0, llvm::APInt()); + const auto Insts = setRegTo(AArch64::X0, APInt()); EXPECT_THAT(Insts, Not(IsEmpty())); } Modified: llvm/trunk/unittests/tools/llvm-exegesis/ARM/AssemblerTest.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/unittests/tools/llvm-exegesis/ARM/AssemblerTest.cpp?rev=374157&r1=374156&r2=374157&view=diff ============================================================================== --- llvm/trunk/unittests/tools/llvm-exegesis/ARM/AssemblerTest.cpp (original) +++ llvm/trunk/unittests/tools/llvm-exegesis/ARM/AssemblerTest.cpp Wed Oct 9 04:29:21 2019 @@ -13,8 +13,6 @@ namespace llvm { namespace exegesis { namespace { -using llvm::MCInstBuilder; - class ARMMachineFunctionGeneratorTest : public MachineFunctionGeneratorBaseTest { protected: @@ -30,16 +28,16 @@ protected: }; TEST_F(ARMMachineFunctionGeneratorTest, DISABLED_JitFunction) { - Check({}, llvm::MCInst(), 0x1e, 0xff, 0x2f, 0xe1); + Check({}, MCInst(), 0x1e, 0xff, 0x2f, 0xe1); } TEST_F(ARMMachineFunctionGeneratorTest, DISABLED_JitFunctionADDrr) { - Check({{llvm::ARM::R0, llvm::APInt()}}, - MCInstBuilder(llvm::ARM::ADDrr) - .addReg(llvm::ARM::R0) - .addReg(llvm::ARM::R0) - .addReg(llvm::ARM::R0) - .addImm(llvm::ARMCC::AL) + Check({{ARM::R0, APInt()}}, + MCInstBuilder(ARM::ADDrr) + .addReg(ARM::R0) + .addReg(ARM::R0) + .addReg(ARM::R0) + .addImm(ARMCC::AL) .addReg(0) .addReg(0), 0x00, 0x00, 0x80, 0xe0, 0x1e, 0xff, 0x2f, 0xe1); Modified: llvm/trunk/unittests/tools/llvm-exegesis/Common/AssemblerUtils.h URL: 
http://llvm.org/viewvc/llvm-project/llvm/trunk/unittests/tools/llvm-exegesis/Common/AssemblerUtils.h?rev=374157&r1=374156&r2=374157&view=diff ============================================================================== --- llvm/trunk/unittests/tools/llvm-exegesis/Common/AssemblerUtils.h (original) +++ llvm/trunk/unittests/tools/llvm-exegesis/Common/AssemblerUtils.h Wed Oct 9 04:29:21 2019 @@ -31,14 +31,13 @@ protected: MachineFunctionGeneratorBaseTest(const std::string &TT, const std::string &CpuName) : TT(TT), CpuName(CpuName), - CanExecute(llvm::Triple(TT).getArch() == - llvm::Triple(llvm::sys::getProcessTriple()).getArch()), - ET(ExegesisTarget::lookup(llvm::Triple(TT))) { + CanExecute(Triple(TT).getArch() == + Triple(sys::getProcessTriple()).getArch()), + ET(ExegesisTarget::lookup(Triple(TT))) { assert(ET); if (!CanExecute) { - llvm::outs() << "Skipping execution, host:" - << llvm::sys::getProcessTriple() << ", target:" << TT - << "\n"; + outs() << "Skipping execution, host:" << sys::getProcessTriple() + << ", target:" << TT << "\n"; } } @@ -61,24 +60,23 @@ protected: } private: - std::unique_ptr createTargetMachine() { + std::unique_ptr createTargetMachine() { std::string Error; - const llvm::Target *TheTarget = - llvm::TargetRegistry::lookupTarget(TT, Error); + const Target *TheTarget = TargetRegistry::lookupTarget(TT, Error); EXPECT_TRUE(TheTarget) << Error << " " << TT; - const llvm::TargetOptions Options; - llvm::TargetMachine *TM = TheTarget->createTargetMachine( - TT, CpuName, "", Options, llvm::Reloc::Model::Static); + const TargetOptions Options; + TargetMachine *TM = TheTarget->createTargetMachine(TT, CpuName, "", Options, + Reloc::Model::Static); EXPECT_TRUE(TM) << TT << " " << CpuName; - return std::unique_ptr( - static_cast(TM)); + return std::unique_ptr( + static_cast(TM)); } ExecutableFunction - assembleToFunction(llvm::ArrayRef RegisterInitialValues, + assembleToFunction(ArrayRef RegisterInitialValues, FillFunction Fill) { - 
llvm::SmallString<256> Buffer; - llvm::raw_svector_ostream AsmStream(Buffer); + SmallString<256> Buffer; + raw_svector_ostream AsmStream(Buffer); assembleToStream(*ET, createTargetMachine(), /*LiveIns=*/{}, RegisterInitialValues, Fill, AsmStream); return ExecutableFunction(createTargetMachine(), Modified: llvm/trunk/unittests/tools/llvm-exegesis/PerfHelperTest.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/unittests/tools/llvm-exegesis/PerfHelperTest.cpp?rev=374157&r1=374156&r2=374157&view=diff ============================================================================== --- llvm/trunk/unittests/tools/llvm-exegesis/PerfHelperTest.cpp (original) +++ llvm/trunk/unittests/tools/llvm-exegesis/PerfHelperTest.cpp Wed Oct 9 04:29:21 2019 @@ -27,13 +27,14 @@ TEST(PerfHelperTest, FunctionalTest) { std::string CallbackEventName; std::string CallbackEventNameFullyQualifed; int64_t CallbackEventCycles; - Measure(llvm::makeArrayRef(SingleEvent), - [&](const PerfEvent &Event, int64_t Value) { - CallbackEventName = Event.name(); - CallbackEventNameFullyQualifed = Event.getPfmEventString(); - CallbackEventCycles = Value; - }, - EmptyFn); + Measure( + makeArrayRef(SingleEvent), + [&](const PerfEvent &Event, int64_t Value) { + CallbackEventName = Event.name(); + CallbackEventNameFullyQualifed = Event.getPfmEventString(); + CallbackEventCycles = Value; + }, + EmptyFn); EXPECT_EQ(CallbackEventName, "CYCLES:u"); EXPECT_THAT(CallbackEventNameFullyQualifed, Not(IsEmpty())); pfmTerminate(); Modified: llvm/trunk/unittests/tools/llvm-exegesis/PowerPC/AnalysisTest.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/unittests/tools/llvm-exegesis/PowerPC/AnalysisTest.cpp?rev=374157&r1=374156&r2=374157&view=diff ============================================================================== --- llvm/trunk/unittests/tools/llvm-exegesis/PowerPC/AnalysisTest.cpp (original) +++ llvm/trunk/unittests/tools/llvm-exegesis/PowerPC/AnalysisTest.cpp Wed Oct 9 04:29:21 2019 @@ -28,10 +28,9 @@ 
protected: AnalysisTest() { const std::string TT = "powerpc64le-unknown-linux"; std::string error; - const llvm::Target *const TheTarget = - llvm::TargetRegistry::lookupTarget(TT, error); + const Target *const TheTarget = TargetRegistry::lookupTarget(TT, error); if (!TheTarget) { - llvm::errs() << error << "\n"; + errs() << error << "\n"; return; } STI.reset(TheTarget->createMCSubtargetInfo(TT, "pwr9", "")); @@ -63,7 +62,7 @@ protected: } protected: - std::unique_ptr STI; + std::unique_ptr STI; uint16_t ALUIdx = 0; uint16_t ALUEIdx = 0; uint16_t ALUOIdx = 0; Modified: llvm/trunk/unittests/tools/llvm-exegesis/PowerPC/TargetTest.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/unittests/tools/llvm-exegesis/PowerPC/TargetTest.cpp?rev=374157&r1=374156&r2=374157&view=diff ============================================================================== --- llvm/trunk/unittests/tools/llvm-exegesis/PowerPC/TargetTest.cpp (original) +++ llvm/trunk/unittests/tools/llvm-exegesis/PowerPC/TargetTest.cpp Wed Oct 9 04:29:21 2019 @@ -33,10 +33,10 @@ constexpr const char kTriple[] = "powerp class PowerPCTargetTest : public ::testing::Test { protected: PowerPCTargetTest() - : ExegesisTarget_(ExegesisTarget::lookup(llvm::Triple(kTriple))) { + : ExegesisTarget_(ExegesisTarget::lookup(Triple(kTriple))) { EXPECT_THAT(ExegesisTarget_, NotNull()); std::string error; - Target_ = llvm::TargetRegistry::lookupTarget(kTriple, error); + Target_ = TargetRegistry::lookupTarget(kTriple, error); EXPECT_THAT(Target_, NotNull()); } static void SetUpTestCase() { @@ -46,15 +46,14 @@ protected: InitializePowerPCExegesisTarget(); } - const llvm::Target *Target_; + const Target *Target_; const ExegesisTarget *const ExegesisTarget_; }; TEST_F(PowerPCTargetTest, SetRegToConstant) { - const std::unique_ptr STI( + const std::unique_ptr STI( Target_->createMCSubtargetInfo(kTriple, "generic", "")); - const auto Insts = - ExegesisTarget_->setRegTo(*STI, llvm::PPC::X0, llvm::APInt()); + const auto Insts = 
ExegesisTarget_->setRegTo(*STI, PPC::X0, APInt()); EXPECT_THAT(Insts, Not(IsEmpty())); } Modified: llvm/trunk/unittests/tools/llvm-exegesis/RegisterValueTest.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/unittests/tools/llvm-exegesis/RegisterValueTest.cpp?rev=374157&r1=374156&r2=374157&view=diff ============================================================================== --- llvm/trunk/unittests/tools/llvm-exegesis/RegisterValueTest.cpp (original) +++ llvm/trunk/unittests/tools/llvm-exegesis/RegisterValueTest.cpp Wed Oct 9 04:29:21 2019 @@ -16,12 +16,12 @@ namespace exegesis { namespace { #define CHECK(EXPECTED, ACTUAL) \ - EXPECT_EQ(llvm::APInt(SizeInBits, EXPECTED, 16), \ + EXPECT_EQ(APInt(SizeInBits, EXPECTED, 16), \ bitcastFloatValue(Semantic, PredefinedValues::ACTUAL)) TEST(RegisterValueTest, Half) { const size_t SizeInBits = 16; - const auto &Semantic = llvm::APFloatBase::IEEEhalf(); + const auto &Semantic = APFloatBase::IEEEhalf(); CHECK("0000", POS_ZERO); CHECK("8000", NEG_ZERO); CHECK("3C00", ONE); @@ -37,7 +37,7 @@ TEST(RegisterValueTest, Half) { TEST(RegisterValueTest, Single) { const size_t SizeInBits = 32; - const auto &Semantic = llvm::APFloatBase::IEEEsingle(); + const auto &Semantic = APFloatBase::IEEEsingle(); CHECK("00000000", POS_ZERO); CHECK("80000000", NEG_ZERO); CHECK("3F800000", ONE); @@ -53,7 +53,7 @@ TEST(RegisterValueTest, Single) { TEST(RegisterValueTest, Double) { const size_t SizeInBits = 64; - const auto &Semantic = llvm::APFloatBase::IEEEdouble(); + const auto &Semantic = APFloatBase::IEEEdouble(); CHECK("0000000000000000", POS_ZERO); CHECK("8000000000000000", NEG_ZERO); CHECK("3FF0000000000000", ONE); Modified: llvm/trunk/unittests/tools/llvm-exegesis/X86/AssemblerTest.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/unittests/tools/llvm-exegesis/X86/AssemblerTest.cpp?rev=374157&r1=374156&r2=374157&view=diff ============================================================================== --- 
llvm/trunk/unittests/tools/llvm-exegesis/X86/AssemblerTest.cpp (original) +++ llvm/trunk/unittests/tools/llvm-exegesis/X86/AssemblerTest.cpp Wed Oct 9 04:29:21 2019 @@ -38,11 +38,11 @@ protected: }; TEST_F(X86MachineFunctionGeneratorTest, DISABLED_JitFunction) { - Check({}, llvm::MCInst(), 0xc3); + Check({}, MCInst(), 0xc3); } TEST_F(X86MachineFunctionGeneratorTest, DISABLED_JitFunctionXOR32rr_X86) { - Check({{EAX, llvm::APInt(32, 1)}}, + Check({{EAX, APInt(32, 1)}}, MCInstBuilder(XOR32rr).addReg(EAX).addReg(EAX).addReg(EAX), // mov eax, 1 0xb8, 0x01, 0x00, 0x00, 0x00, Modified: llvm/trunk/unittests/tools/llvm-exegesis/X86/BenchmarkResultTest.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/unittests/tools/llvm-exegesis/X86/BenchmarkResultTest.cpp?rev=374157&r1=374156&r2=374157&view=diff ============================================================================== --- llvm/trunk/unittests/tools/llvm-exegesis/X86/BenchmarkResultTest.cpp (original) +++ llvm/trunk/unittests/tools/llvm-exegesis/X86/BenchmarkResultTest.cpp Wed Oct 9 04:29:21 2019 @@ -32,9 +32,9 @@ bool operator==(const BenchmarkMeasure & std::tie(B.Key, B.PerInstructionValue, B.PerSnippetValue); } -static std::string Dump(const llvm::MCInst &McInst) { +static std::string Dump(const MCInst &McInst) { std::string Buffer; - llvm::raw_string_ostream OS(Buffer); + raw_string_ostream OS(Buffer); McInst.print(OS); return Buffer; } @@ -59,19 +59,19 @@ TEST(BenchmarkResultTest, WriteToAndRead // Read benchmarks. 
const LLVMState State("x86_64-unknown-linux", "haswell"); - llvm::ExitOnError ExitOnErr; + ExitOnError ExitOnErr; InstructionBenchmark ToDisk; - ToDisk.Key.Instructions.push_back(llvm::MCInstBuilder(llvm::X86::XOR32rr) - .addReg(llvm::X86::AL) - .addReg(llvm::X86::AH) + ToDisk.Key.Instructions.push_back(MCInstBuilder(X86::XOR32rr) + .addReg(X86::AL) + .addReg(X86::AH) .addImm(123) .addFPImm(0.5)); ToDisk.Key.Config = "config"; ToDisk.Key.RegisterInitialValues = { - RegisterValue{llvm::X86::AL, llvm::APInt(8, "-1", 10)}, - RegisterValue{llvm::X86::AH, llvm::APInt(8, "123", 10)}}; + RegisterValue{X86::AL, APInt(8, "-1", 10)}, + RegisterValue{X86::AH, APInt(8, "123", 10)}}; ToDisk.Mode = InstructionBenchmark::Latency; ToDisk.CpuName = "cpu_name"; ToDisk.LLVMTriple = "llvm_triple"; @@ -81,12 +81,12 @@ TEST(BenchmarkResultTest, WriteToAndRead ToDisk.Error = "error"; ToDisk.Info = "info"; - llvm::SmallString<64> Filename; + SmallString<64> Filename; std::error_code EC; - EC = llvm::sys::fs::createUniqueDirectory("BenchmarkResultTestDir", Filename); + EC = sys::fs::createUniqueDirectory("BenchmarkResultTestDir", Filename); ASSERT_FALSE(EC); - llvm::sys::path::append(Filename, "data.yaml"); - llvm::errs() << Filename << "-------\n"; + sys::path::append(Filename, "data.yaml"); + errs() << Filename << "-------\n"; ExitOnErr(ToDisk.writeYaml(State, Filename)); { Modified: llvm/trunk/unittests/tools/llvm-exegesis/X86/RegisterAliasingTest.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/unittests/tools/llvm-exegesis/X86/RegisterAliasingTest.cpp?rev=374157&r1=374156&r2=374157&view=diff ============================================================================== --- llvm/trunk/unittests/tools/llvm-exegesis/X86/RegisterAliasingTest.cpp (original) +++ llvm/trunk/unittests/tools/llvm-exegesis/X86/RegisterAliasingTest.cpp Wed Oct 9 04:29:21 2019 @@ -27,16 +27,15 @@ class RegisterAliasingTest : public X86T TEST_F(RegisterAliasingTest, TrackSimpleRegister) { const auto 
&RegInfo = State.getRegInfo(); - const RegisterAliasingTracker tracker(RegInfo, llvm::X86::EAX); - std::set ActualAliasedRegisters; + const RegisterAliasingTracker tracker(RegInfo, X86::EAX); + std::set ActualAliasedRegisters; for (unsigned I : tracker.aliasedBits().set_bits()) - ActualAliasedRegisters.insert(static_cast(I)); - const std::set ExpectedAliasedRegisters = { - llvm::X86::AL, llvm::X86::AH, llvm::X86::AX, - llvm::X86::EAX, llvm::X86::HAX, llvm::X86::RAX}; + ActualAliasedRegisters.insert(static_cast(I)); + const std::set ExpectedAliasedRegisters = { + X86::AL, X86::AH, X86::AX, X86::EAX, X86::HAX, X86::RAX}; ASSERT_THAT(ActualAliasedRegisters, ExpectedAliasedRegisters); - for (llvm::MCPhysReg aliased : ExpectedAliasedRegisters) { - ASSERT_THAT(tracker.getOrigin(aliased), llvm::X86::EAX); + for (MCPhysReg aliased : ExpectedAliasedRegisters) { + ASSERT_THAT(tracker.getOrigin(aliased), X86::EAX); } } @@ -44,17 +43,16 @@ TEST_F(RegisterAliasingTest, TrackRegist // The alias bits for GR8_ABCD_LRegClassID are the union of the alias bits for // AL, BL, CL and DL. 
const auto &RegInfo = State.getRegInfo(); - const llvm::BitVector NoReservedReg(RegInfo.getNumRegs()); + const BitVector NoReservedReg(RegInfo.getNumRegs()); const RegisterAliasingTracker RegClassTracker( - RegInfo, NoReservedReg, - RegInfo.getRegClass(llvm::X86::GR8_ABCD_LRegClassID)); + RegInfo, NoReservedReg, RegInfo.getRegClass(X86::GR8_ABCD_LRegClassID)); - llvm::BitVector sum(RegInfo.getNumRegs()); - sum |= RegisterAliasingTracker(RegInfo, llvm::X86::AL).aliasedBits(); - sum |= RegisterAliasingTracker(RegInfo, llvm::X86::BL).aliasedBits(); - sum |= RegisterAliasingTracker(RegInfo, llvm::X86::CL).aliasedBits(); - sum |= RegisterAliasingTracker(RegInfo, llvm::X86::DL).aliasedBits(); + BitVector sum(RegInfo.getNumRegs()); + sum |= RegisterAliasingTracker(RegInfo, X86::AL).aliasedBits(); + sum |= RegisterAliasingTracker(RegInfo, X86::BL).aliasedBits(); + sum |= RegisterAliasingTracker(RegInfo, X86::CL).aliasedBits(); + sum |= RegisterAliasingTracker(RegInfo, X86::DL).aliasedBits(); ASSERT_THAT(RegClassTracker.aliasedBits(), sum); } @@ -62,13 +60,12 @@ TEST_F(RegisterAliasingTest, TrackRegist TEST_F(RegisterAliasingTest, TrackRegisterClassCache) { // Fetching twice the same tracker yields the same pointers. 
const auto &RegInfo = State.getRegInfo(); - const llvm::BitVector NoReservedReg(RegInfo.getNumRegs()); + const BitVector NoReservedReg(RegInfo.getNumRegs()); RegisterAliasingTrackerCache Cache(RegInfo, NoReservedReg); - ASSERT_THAT(&Cache.getRegister(llvm::X86::AX), - &Cache.getRegister(llvm::X86::AX)); + ASSERT_THAT(&Cache.getRegister(X86::AX), &Cache.getRegister(X86::AX)); - ASSERT_THAT(&Cache.getRegisterClass(llvm::X86::GR8_ABCD_LRegClassID), - &Cache.getRegisterClass(llvm::X86::GR8_ABCD_LRegClassID)); + ASSERT_THAT(&Cache.getRegisterClass(X86::GR8_ABCD_LRegClassID), + &Cache.getRegisterClass(X86::GR8_ABCD_LRegClassID)); } } // namespace Modified: llvm/trunk/unittests/tools/llvm-exegesis/X86/SchedClassResolutionTest.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/unittests/tools/llvm-exegesis/X86/SchedClassResolutionTest.cpp?rev=374157&r1=374156&r2=374157&view=diff ============================================================================== --- llvm/trunk/unittests/tools/llvm-exegesis/X86/SchedClassResolutionTest.cpp (original) +++ llvm/trunk/unittests/tools/llvm-exegesis/X86/SchedClassResolutionTest.cpp Wed Oct 9 04:29:21 2019 @@ -54,7 +54,7 @@ protected: } protected: - const llvm::MCSubtargetInfo &STI; + const MCSubtargetInfo &STI; uint16_t P0Idx = 0; uint16_t P1Idx = 0; uint16_t P5Idx = 0; Modified: llvm/trunk/unittests/tools/llvm-exegesis/X86/SnippetGeneratorTest.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/unittests/tools/llvm-exegesis/X86/SnippetGeneratorTest.cpp?rev=374157&r1=374156&r2=374157&view=diff ============================================================================== --- llvm/trunk/unittests/tools/llvm-exegesis/X86/SnippetGeneratorTest.cpp (original) +++ llvm/trunk/unittests/tools/llvm-exegesis/X86/SnippetGeneratorTest.cpp Wed Oct 9 04:29:21 2019 @@ -39,7 +39,7 @@ class X86SnippetGeneratorTest : public X protected: X86SnippetGeneratorTest() : InstrInfo(State.getInstrInfo()) {} - const llvm::MCInstrInfo &InstrInfo; + const 
MCInstrInfo &InstrInfo; }; template @@ -74,11 +74,11 @@ TEST_F(LatencySnippetGeneratorTest, Impl // - Var0 [Op0] // - hasAliasingImplicitRegisters (execution is always serial) // - hasAliasingRegisters - const unsigned Opcode = llvm::X86::ADC16i16; - EXPECT_THAT(InstrInfo.get(Opcode).getImplicitDefs()[0], llvm::X86::AX); - EXPECT_THAT(InstrInfo.get(Opcode).getImplicitDefs()[1], llvm::X86::EFLAGS); - EXPECT_THAT(InstrInfo.get(Opcode).getImplicitUses()[0], llvm::X86::AX); - EXPECT_THAT(InstrInfo.get(Opcode).getImplicitUses()[1], llvm::X86::EFLAGS); + const unsigned Opcode = X86::ADC16i16; + EXPECT_THAT(InstrInfo.get(Opcode).getImplicitDefs()[0], X86::AX); + EXPECT_THAT(InstrInfo.get(Opcode).getImplicitDefs()[1], X86::EFLAGS); + EXPECT_THAT(InstrInfo.get(Opcode).getImplicitUses()[0], X86::AX); + EXPECT_THAT(InstrInfo.get(Opcode).getImplicitUses()[1], X86::EFLAGS); const auto CodeTemplates = checkAndGetCodeTemplates(Opcode); ASSERT_THAT(CodeTemplates, SizeIs(1)); const auto &CT = CodeTemplates[0]; @@ -100,8 +100,8 @@ TEST_F(LatencySnippetGeneratorTest, Impl // - Var1 [Op2] // - hasTiedRegisters (execution is always serial) // - hasAliasingRegisters - const unsigned Opcode = llvm::X86::ADD16ri; - EXPECT_THAT(InstrInfo.get(Opcode).getImplicitDefs()[0], llvm::X86::EFLAGS); + const unsigned Opcode = X86::ADD16ri; + EXPECT_THAT(InstrInfo.get(Opcode).getImplicitDefs()[0], X86::EFLAGS); const auto CodeTemplates = checkAndGetCodeTemplates(Opcode); ASSERT_THAT(CodeTemplates, SizeIs(1)); const auto &CT = CodeTemplates[0]; @@ -123,7 +123,7 @@ TEST_F(LatencySnippetGeneratorTest, Impl // - Var1 [Op1] // - Var2 [Op2] // - hasAliasingRegisters - const unsigned Opcode = llvm::X86::VXORPSrr; + const unsigned Opcode = X86::VXORPSrr; const auto CodeTemplates = checkAndGetCodeTemplates(Opcode); ASSERT_THAT(CodeTemplates, SizeIs(1)); const auto &CT = CodeTemplates[0]; @@ -148,14 +148,14 @@ TEST_F(LatencySnippetGeneratorTest, // - Var1 [Op1] // - Var2 [Op2] // - hasAliasingRegisters - const 
unsigned Opcode = llvm::X86::VXORPSrr; + const unsigned Opcode = X86::VXORPSrr; randomGenerator().seed(0); // Initialize seed. const Instruction &Instr = State.getIC().getInstr(Opcode); auto AllRegisters = State.getRATC().emptyRegisters(); AllRegisters.flip(); auto Error = Generator.generateCodeTemplates(Instr, AllRegisters).takeError(); EXPECT_TRUE((bool)Error); - llvm::consumeError(std::move(Error)); + consumeError(std::move(Error)); } TEST_F(LatencySnippetGeneratorTest, DependencyThroughOtherOpcode) { @@ -165,7 +165,7 @@ TEST_F(LatencySnippetGeneratorTest, Depe // - Op2 Implicit Def Reg(EFLAGS) // - Var0 [Op0] // - Var1 [Op1] - const unsigned Opcode = llvm::X86::CMP64rr; + const unsigned Opcode = X86::CMP64rr; const auto CodeTemplates = checkAndGetCodeTemplates(Opcode); ASSERT_THAT(CodeTemplates, SizeIs(Gt(1U))) << "Many templates are available"; for (const auto &CT : CodeTemplates) { @@ -185,7 +185,7 @@ TEST_F(LatencySnippetGeneratorTest, LAHF // - LAHF // - Op0 Implicit Def Reg(AH) // - Op1 Implicit Use Reg(EFLAGS) - const unsigned Opcode = llvm::X86::LAHF; + const unsigned Opcode = X86::LAHF; const auto CodeTemplates = checkAndGetCodeTemplates(Opcode); ASSERT_THAT(CodeTemplates, SizeIs(Gt(1U))) << "Many templates are available"; for (const auto &CT : CodeTemplates) { @@ -203,7 +203,7 @@ TEST_F(UopsSnippetGeneratorTest, Paralle // - Op1 Explicit Use RegClass(GR32) // - Var0 [Op0] // - Var1 [Op1] - const unsigned Opcode = llvm::X86::BNDCL32rr; + const unsigned Opcode = X86::BNDCL32rr; const auto CodeTemplates = checkAndGetCodeTemplates(Opcode); ASSERT_THAT(CodeTemplates, SizeIs(1)); const auto &CT = CodeTemplates[0]; @@ -224,7 +224,7 @@ TEST_F(UopsSnippetGeneratorTest, SerialI // - Op2 Implicit Use Reg(EAX) // - hasAliasingImplicitRegisters (execution is always serial) // - hasAliasingRegisters - const unsigned Opcode = llvm::X86::CDQ; + const unsigned Opcode = X86::CDQ; const auto CodeTemplates = checkAndGetCodeTemplates(Opcode); ASSERT_THAT(CodeTemplates, 
SizeIs(1)); const auto &CT = CodeTemplates[0]; @@ -250,7 +250,7 @@ TEST_F(UopsSnippetGeneratorTest, StaticR // - Var1 [Op2] // - hasTiedRegisters (execution is always serial) // - hasAliasingRegisters - const unsigned Opcode = llvm::X86::CMOV32rr; + const unsigned Opcode = X86::CMOV32rr; const auto CodeTemplates = checkAndGetCodeTemplates(Opcode); ASSERT_THAT(CodeTemplates, SizeIs(1)); const auto &CT = CodeTemplates[0]; @@ -282,7 +282,7 @@ TEST_F(UopsSnippetGeneratorTest, NoTiedV // - Var2 [Op2] // - Var3 [Op3] // - hasAliasingRegisters - const unsigned Opcode = llvm::X86::CMOV_GR32; + const unsigned Opcode = X86::CMOV_GR32; const auto CodeTemplates = checkAndGetCodeTemplates(Opcode); ASSERT_THAT(CodeTemplates, SizeIs(1)); const auto &CT = CodeTemplates[0]; @@ -316,7 +316,7 @@ TEST_F(UopsSnippetGeneratorTest, MemoryU // - Var5 [Op5] // - hasMemoryOperands // - hasAliasingRegisters - const unsigned Opcode = llvm::X86::MOV32rm; + const unsigned Opcode = X86::MOV32rm; const auto CodeTemplates = checkAndGetCodeTemplates(Opcode); ASSERT_THAT(CodeTemplates, SizeIs(1)); const auto &CT = CodeTemplates[0]; @@ -343,17 +343,16 @@ public: } private: - llvm::Expected> + Expected> generateCodeTemplates(const Instruction &, const BitVector &) const override { - return llvm::make_error("not implemented", - llvm::inconvertibleErrorCode()); + return make_error("not implemented", inconvertibleErrorCode()); } }; using FakeSnippetGeneratorTest = SnippetGeneratorTest; testing::Matcher IsRegisterValue(unsigned Reg, - llvm::APInt Value) { + APInt Value) { return testing::AllOf(testing::Field(&RegisterValue::Register, Reg), testing::Field(&RegisterValue::Value, Value)); } @@ -375,13 +374,13 @@ TEST_F(FakeSnippetGeneratorTest, MemoryU // - hasMemoryOperands // - hasAliasingImplicitRegisters (execution is always serial) // - hasAliasingRegisters - const unsigned Opcode = llvm::X86::MOVSB; + const unsigned Opcode = X86::MOVSB; const Instruction &Instr = State.getIC().getInstr(Opcode); auto 
Error = Generator.generateConfigurations(Instr, State.getRATC().emptyRegisters()) .takeError(); EXPECT_TRUE((bool)Error); - llvm::consumeError(std::move(Error)); + consumeError(std::move(Error)); } TEST_F(FakeSnippetGeneratorTest, ComputeRegisterInitialValuesAdd16ri) { @@ -390,13 +389,12 @@ TEST_F(FakeSnippetGeneratorTest, Compute // explicit use 1 : reg RegClass=GR16 | TIED_TO:0 // explicit use 2 : imm // implicit def : EFLAGS - InstructionTemplate IT(Generator.createInstruction(llvm::X86::ADD16ri)); - IT.getValueFor(IT.Instr.Variables[0]) = - llvm::MCOperand::createReg(llvm::X86::AX); + InstructionTemplate IT(Generator.createInstruction(X86::ADD16ri)); + IT.getValueFor(IT.Instr.Variables[0]) = MCOperand::createReg(X86::AX); std::vector Snippet; Snippet.push_back(std::move(IT)); const auto RIV = Generator.computeRegisterInitialValues(Snippet); - EXPECT_THAT(RIV, ElementsAre(IsRegisterValue(llvm::X86::AX, llvm::APInt()))); + EXPECT_THAT(RIV, ElementsAre(IsRegisterValue(X86::AX, APInt()))); } TEST_F(FakeSnippetGeneratorTest, ComputeRegisterInitialValuesAdd64rr) { @@ -406,23 +404,20 @@ TEST_F(FakeSnippetGeneratorTest, Compute // -> only rbx needs defining. 
std::vector Snippet; { - InstructionTemplate Mov(Generator.createInstruction(llvm::X86::MOV64ri)); - Mov.getValueFor(Mov.Instr.Variables[0]) = - llvm::MCOperand::createReg(llvm::X86::RAX); - Mov.getValueFor(Mov.Instr.Variables[1]) = llvm::MCOperand::createImm(42); + InstructionTemplate Mov(Generator.createInstruction(X86::MOV64ri)); + Mov.getValueFor(Mov.Instr.Variables[0]) = MCOperand::createReg(X86::RAX); + Mov.getValueFor(Mov.Instr.Variables[1]) = MCOperand::createImm(42); Snippet.push_back(std::move(Mov)); } { - InstructionTemplate Add(Generator.createInstruction(llvm::X86::ADD64rr)); - Add.getValueFor(Add.Instr.Variables[0]) = - llvm::MCOperand::createReg(llvm::X86::RAX); - Add.getValueFor(Add.Instr.Variables[1]) = - llvm::MCOperand::createReg(llvm::X86::RBX); + InstructionTemplate Add(Generator.createInstruction(X86::ADD64rr)); + Add.getValueFor(Add.Instr.Variables[0]) = MCOperand::createReg(X86::RAX); + Add.getValueFor(Add.Instr.Variables[1]) = MCOperand::createReg(X86::RBX); Snippet.push_back(std::move(Add)); } const auto RIV = Generator.computeRegisterInitialValues(Snippet); - EXPECT_THAT(RIV, ElementsAre(IsRegisterValue(llvm::X86::RBX, llvm::APInt()))); + EXPECT_THAT(RIV, ElementsAre(IsRegisterValue(X86::RBX, APInt()))); } } // namespace Modified: llvm/trunk/unittests/tools/llvm-exegesis/X86/TargetTest.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/unittests/tools/llvm-exegesis/X86/TargetTest.cpp?rev=374157&r1=374156&r2=374157&view=diff ============================================================================== --- llvm/trunk/unittests/tools/llvm-exegesis/X86/TargetTest.cpp (original) +++ llvm/trunk/unittests/tools/llvm-exegesis/X86/TargetTest.cpp Wed Oct 9 04:29:21 2019 @@ -81,26 +81,24 @@ Matcher IsMovImmediate(unsigned Matcher IsMovValueToStack(unsigned Opcode, int64_t Value, size_t Offset) { return AllOf(OpcodeIs(Opcode), - ElementsAre(IsReg(llvm::X86::RSP), IsImm(1), IsReg(0), - IsImm(Offset), IsReg(0), IsImm(Value))); + 
ElementsAre(IsReg(X86::RSP), IsImm(1), IsReg(0), IsImm(Offset), + IsReg(0), IsImm(Value))); } Matcher IsMovValueFromStack(unsigned Opcode, unsigned Reg) { return AllOf(OpcodeIs(Opcode), - ElementsAre(IsReg(Reg), IsReg(llvm::X86::RSP), IsImm(1), - IsReg(0), IsImm(0), IsReg(0))); + ElementsAre(IsReg(Reg), IsReg(X86::RSP), IsImm(1), IsReg(0), + IsImm(0), IsReg(0))); } Matcher IsStackAllocate(unsigned Size) { - return AllOf( - OpcodeIs(llvm::X86::SUB64ri8), - ElementsAre(IsReg(llvm::X86::RSP), IsReg(llvm::X86::RSP), IsImm(Size))); + return AllOf(OpcodeIs(X86::SUB64ri8), + ElementsAre(IsReg(X86::RSP), IsReg(X86::RSP), IsImm(Size))); } Matcher IsStackDeallocate(unsigned Size) { - return AllOf( - OpcodeIs(llvm::X86::ADD64ri8), - ElementsAre(IsReg(llvm::X86::RSP), IsReg(llvm::X86::RSP), IsImm(Size))); + return AllOf(OpcodeIs(X86::ADD64ri8), + ElementsAre(IsReg(X86::RSP), IsReg(X86::RSP), IsImm(Size))); } constexpr const char kTriple[] = "x86_64-unknown-linux"; @@ -144,128 +142,121 @@ TEST_F(Core2TargetTest, NoHighByteRegs) } TEST_F(Core2TargetTest, SetFlags) { - const unsigned Reg = llvm::X86::EFLAGS; - EXPECT_THAT( - setRegTo(Reg, APInt(64, 0x1111222233334444ULL)), - ElementsAre(IsStackAllocate(8), - IsMovValueToStack(llvm::X86::MOV32mi, 0x33334444UL, 0), - IsMovValueToStack(llvm::X86::MOV32mi, 0x11112222UL, 4), - OpcodeIs(llvm::X86::POPF64))); + const unsigned Reg = X86::EFLAGS; + EXPECT_THAT(setRegTo(Reg, APInt(64, 0x1111222233334444ULL)), + ElementsAre(IsStackAllocate(8), + IsMovValueToStack(X86::MOV32mi, 0x33334444UL, 0), + IsMovValueToStack(X86::MOV32mi, 0x11112222UL, 4), + OpcodeIs(X86::POPF64))); } TEST_F(Core2TargetTest, SetRegToGR8Value) { const uint8_t Value = 0xFFU; - const unsigned Reg = llvm::X86::AL; + const unsigned Reg = X86::AL; EXPECT_THAT(setRegTo(Reg, APInt(8, Value)), - ElementsAre(IsMovImmediate(llvm::X86::MOV8ri, Reg, Value))); + ElementsAre(IsMovImmediate(X86::MOV8ri, Reg, Value))); } TEST_F(Core2TargetTest, SetRegToGR16Value) { const uint16_t 
Value = 0xFFFFU; - const unsigned Reg = llvm::X86::BX; + const unsigned Reg = X86::BX; EXPECT_THAT(setRegTo(Reg, APInt(16, Value)), - ElementsAre(IsMovImmediate(llvm::X86::MOV16ri, Reg, Value))); + ElementsAre(IsMovImmediate(X86::MOV16ri, Reg, Value))); } TEST_F(Core2TargetTest, SetRegToGR32Value) { const uint32_t Value = 0x7FFFFU; - const unsigned Reg = llvm::X86::ECX; + const unsigned Reg = X86::ECX; EXPECT_THAT(setRegTo(Reg, APInt(32, Value)), - ElementsAre(IsMovImmediate(llvm::X86::MOV32ri, Reg, Value))); + ElementsAre(IsMovImmediate(X86::MOV32ri, Reg, Value))); } TEST_F(Core2TargetTest, SetRegToGR64Value) { const uint64_t Value = 0x7FFFFFFFFFFFFFFFULL; - const unsigned Reg = llvm::X86::RDX; + const unsigned Reg = X86::RDX; EXPECT_THAT(setRegTo(Reg, APInt(64, Value)), - ElementsAre(IsMovImmediate(llvm::X86::MOV64ri, Reg, Value))); + ElementsAre(IsMovImmediate(X86::MOV64ri, Reg, Value))); } TEST_F(Core2TargetTest, SetRegToVR64Value) { - EXPECT_THAT( - setRegTo(llvm::X86::MM0, APInt(64, 0x1111222233334444ULL)), - ElementsAre(IsStackAllocate(8), - IsMovValueToStack(llvm::X86::MOV32mi, 0x33334444UL, 0), - IsMovValueToStack(llvm::X86::MOV32mi, 0x11112222UL, 4), - IsMovValueFromStack(llvm::X86::MMX_MOVQ64rm, llvm::X86::MM0), - IsStackDeallocate(8))); + EXPECT_THAT(setRegTo(X86::MM0, APInt(64, 0x1111222233334444ULL)), + ElementsAre(IsStackAllocate(8), + IsMovValueToStack(X86::MOV32mi, 0x33334444UL, 0), + IsMovValueToStack(X86::MOV32mi, 0x11112222UL, 4), + IsMovValueFromStack(X86::MMX_MOVQ64rm, X86::MM0), + IsStackDeallocate(8))); } TEST_F(Core2TargetTest, SetRegToVR128Value_Use_MOVDQUrm) { EXPECT_THAT( - setRegTo(llvm::X86::XMM0, - APInt(128, "11112222333344445555666677778888", 16)), + setRegTo(X86::XMM0, APInt(128, "11112222333344445555666677778888", 16)), ElementsAre(IsStackAllocate(16), - IsMovValueToStack(llvm::X86::MOV32mi, 0x77778888UL, 0), - IsMovValueToStack(llvm::X86::MOV32mi, 0x55556666UL, 4), - IsMovValueToStack(llvm::X86::MOV32mi, 0x33334444UL, 8), - 
IsMovValueToStack(llvm::X86::MOV32mi, 0x11112222UL, 12), - IsMovValueFromStack(llvm::X86::MOVDQUrm, llvm::X86::XMM0), + IsMovValueToStack(X86::MOV32mi, 0x77778888UL, 0), + IsMovValueToStack(X86::MOV32mi, 0x55556666UL, 4), + IsMovValueToStack(X86::MOV32mi, 0x33334444UL, 8), + IsMovValueToStack(X86::MOV32mi, 0x11112222UL, 12), + IsMovValueFromStack(X86::MOVDQUrm, X86::XMM0), IsStackDeallocate(16))); } TEST_F(Core2AvxTargetTest, SetRegToVR128Value_Use_VMOVDQUrm) { EXPECT_THAT( - setRegTo(llvm::X86::XMM0, - APInt(128, "11112222333344445555666677778888", 16)), + setRegTo(X86::XMM0, APInt(128, "11112222333344445555666677778888", 16)), ElementsAre(IsStackAllocate(16), - IsMovValueToStack(llvm::X86::MOV32mi, 0x77778888UL, 0), - IsMovValueToStack(llvm::X86::MOV32mi, 0x55556666UL, 4), - IsMovValueToStack(llvm::X86::MOV32mi, 0x33334444UL, 8), - IsMovValueToStack(llvm::X86::MOV32mi, 0x11112222UL, 12), - IsMovValueFromStack(llvm::X86::VMOVDQUrm, llvm::X86::XMM0), + IsMovValueToStack(X86::MOV32mi, 0x77778888UL, 0), + IsMovValueToStack(X86::MOV32mi, 0x55556666UL, 4), + IsMovValueToStack(X86::MOV32mi, 0x33334444UL, 8), + IsMovValueToStack(X86::MOV32mi, 0x11112222UL, 12), + IsMovValueFromStack(X86::VMOVDQUrm, X86::XMM0), IsStackDeallocate(16))); } TEST_F(Core2Avx512TargetTest, SetRegToVR128Value_Use_VMOVDQU32Z128rm) { EXPECT_THAT( - setRegTo(llvm::X86::XMM0, - APInt(128, "11112222333344445555666677778888", 16)), - ElementsAre( - IsStackAllocate(16), - IsMovValueToStack(llvm::X86::MOV32mi, 0x77778888UL, 0), - IsMovValueToStack(llvm::X86::MOV32mi, 0x55556666UL, 4), - IsMovValueToStack(llvm::X86::MOV32mi, 0x33334444UL, 8), - IsMovValueToStack(llvm::X86::MOV32mi, 0x11112222UL, 12), - IsMovValueFromStack(llvm::X86::VMOVDQU32Z128rm, llvm::X86::XMM0), - IsStackDeallocate(16))); + setRegTo(X86::XMM0, APInt(128, "11112222333344445555666677778888", 16)), + ElementsAre(IsStackAllocate(16), + IsMovValueToStack(X86::MOV32mi, 0x77778888UL, 0), + IsMovValueToStack(X86::MOV32mi, 0x55556666UL, 4), 
+ IsMovValueToStack(X86::MOV32mi, 0x33334444UL, 8), + IsMovValueToStack(X86::MOV32mi, 0x11112222UL, 12), + IsMovValueFromStack(X86::VMOVDQU32Z128rm, X86::XMM0), + IsStackDeallocate(16))); } TEST_F(Core2AvxTargetTest, SetRegToVR256Value_Use_VMOVDQUYrm) { const char ValueStr[] = "1111111122222222333333334444444455555555666666667777777788888888"; - EXPECT_THAT(setRegTo(llvm::X86::YMM0, APInt(256, ValueStr, 16)), - ElementsAreArray( - {IsStackAllocate(32), - IsMovValueToStack(llvm::X86::MOV32mi, 0x88888888UL, 0), - IsMovValueToStack(llvm::X86::MOV32mi, 0x77777777UL, 4), - IsMovValueToStack(llvm::X86::MOV32mi, 0x66666666UL, 8), - IsMovValueToStack(llvm::X86::MOV32mi, 0x55555555UL, 12), - IsMovValueToStack(llvm::X86::MOV32mi, 0x44444444UL, 16), - IsMovValueToStack(llvm::X86::MOV32mi, 0x33333333UL, 20), - IsMovValueToStack(llvm::X86::MOV32mi, 0x22222222UL, 24), - IsMovValueToStack(llvm::X86::MOV32mi, 0x11111111UL, 28), - IsMovValueFromStack(llvm::X86::VMOVDQUYrm, llvm::X86::YMM0), - IsStackDeallocate(32)})); + EXPECT_THAT( + setRegTo(X86::YMM0, APInt(256, ValueStr, 16)), + ElementsAreArray({IsStackAllocate(32), + IsMovValueToStack(X86::MOV32mi, 0x88888888UL, 0), + IsMovValueToStack(X86::MOV32mi, 0x77777777UL, 4), + IsMovValueToStack(X86::MOV32mi, 0x66666666UL, 8), + IsMovValueToStack(X86::MOV32mi, 0x55555555UL, 12), + IsMovValueToStack(X86::MOV32mi, 0x44444444UL, 16), + IsMovValueToStack(X86::MOV32mi, 0x33333333UL, 20), + IsMovValueToStack(X86::MOV32mi, 0x22222222UL, 24), + IsMovValueToStack(X86::MOV32mi, 0x11111111UL, 28), + IsMovValueFromStack(X86::VMOVDQUYrm, X86::YMM0), + IsStackDeallocate(32)})); } TEST_F(Core2Avx512TargetTest, SetRegToVR256Value_Use_VMOVDQU32Z256rm) { const char ValueStr[] = "1111111122222222333333334444444455555555666666667777777788888888"; EXPECT_THAT( - setRegTo(llvm::X86::YMM0, APInt(256, ValueStr, 16)), - ElementsAreArray( - {IsStackAllocate(32), - IsMovValueToStack(llvm::X86::MOV32mi, 0x88888888UL, 0), - IsMovValueToStack(llvm::X86::MOV32mi, 
0x77777777UL, 4), - IsMovValueToStack(llvm::X86::MOV32mi, 0x66666666UL, 8), - IsMovValueToStack(llvm::X86::MOV32mi, 0x55555555UL, 12), - IsMovValueToStack(llvm::X86::MOV32mi, 0x44444444UL, 16), - IsMovValueToStack(llvm::X86::MOV32mi, 0x33333333UL, 20), - IsMovValueToStack(llvm::X86::MOV32mi, 0x22222222UL, 24), - IsMovValueToStack(llvm::X86::MOV32mi, 0x11111111UL, 28), - IsMovValueFromStack(llvm::X86::VMOVDQU32Z256rm, llvm::X86::YMM0), - IsStackDeallocate(32)})); + setRegTo(X86::YMM0, APInt(256, ValueStr, 16)), + ElementsAreArray({IsStackAllocate(32), + IsMovValueToStack(X86::MOV32mi, 0x88888888UL, 0), + IsMovValueToStack(X86::MOV32mi, 0x77777777UL, 4), + IsMovValueToStack(X86::MOV32mi, 0x66666666UL, 8), + IsMovValueToStack(X86::MOV32mi, 0x55555555UL, 12), + IsMovValueToStack(X86::MOV32mi, 0x44444444UL, 16), + IsMovValueToStack(X86::MOV32mi, 0x33333333UL, 20), + IsMovValueToStack(X86::MOV32mi, 0x22222222UL, 24), + IsMovValueToStack(X86::MOV32mi, 0x11111111UL, 28), + IsMovValueFromStack(X86::VMOVDQU32Z256rm, X86::YMM0), + IsStackDeallocate(32)})); } TEST_F(Core2Avx512TargetTest, SetRegToVR512Value) { @@ -273,103 +264,94 @@ TEST_F(Core2Avx512TargetTest, SetRegToVR "1111111122222222333333334444444455555555666666667777777788888888" "99999999AAAAAAAABBBBBBBBCCCCCCCCDDDDDDDDEEEEEEEEFFFFFFFF00000000"; EXPECT_THAT( - setRegTo(llvm::X86::ZMM0, APInt(512, ValueStr, 16)), - ElementsAreArray( - {IsStackAllocate(64), - IsMovValueToStack(llvm::X86::MOV32mi, 0x00000000UL, 0), - IsMovValueToStack(llvm::X86::MOV32mi, 0xFFFFFFFFUL, 4), - IsMovValueToStack(llvm::X86::MOV32mi, 0xEEEEEEEEUL, 8), - IsMovValueToStack(llvm::X86::MOV32mi, 0xDDDDDDDDUL, 12), - IsMovValueToStack(llvm::X86::MOV32mi, 0xCCCCCCCCUL, 16), - IsMovValueToStack(llvm::X86::MOV32mi, 0xBBBBBBBBUL, 20), - IsMovValueToStack(llvm::X86::MOV32mi, 0xAAAAAAAAUL, 24), - IsMovValueToStack(llvm::X86::MOV32mi, 0x99999999UL, 28), - IsMovValueToStack(llvm::X86::MOV32mi, 0x88888888UL, 32), - IsMovValueToStack(llvm::X86::MOV32mi, 
0x77777777UL, 36), - IsMovValueToStack(llvm::X86::MOV32mi, 0x66666666UL, 40), - IsMovValueToStack(llvm::X86::MOV32mi, 0x55555555UL, 44), - IsMovValueToStack(llvm::X86::MOV32mi, 0x44444444UL, 48), - IsMovValueToStack(llvm::X86::MOV32mi, 0x33333333UL, 52), - IsMovValueToStack(llvm::X86::MOV32mi, 0x22222222UL, 56), - IsMovValueToStack(llvm::X86::MOV32mi, 0x11111111UL, 60), - IsMovValueFromStack(llvm::X86::VMOVDQU32Zrm, llvm::X86::ZMM0), - IsStackDeallocate(64)})); + setRegTo(X86::ZMM0, APInt(512, ValueStr, 16)), + ElementsAreArray({IsStackAllocate(64), + IsMovValueToStack(X86::MOV32mi, 0x00000000UL, 0), + IsMovValueToStack(X86::MOV32mi, 0xFFFFFFFFUL, 4), + IsMovValueToStack(X86::MOV32mi, 0xEEEEEEEEUL, 8), + IsMovValueToStack(X86::MOV32mi, 0xDDDDDDDDUL, 12), + IsMovValueToStack(X86::MOV32mi, 0xCCCCCCCCUL, 16), + IsMovValueToStack(X86::MOV32mi, 0xBBBBBBBBUL, 20), + IsMovValueToStack(X86::MOV32mi, 0xAAAAAAAAUL, 24), + IsMovValueToStack(X86::MOV32mi, 0x99999999UL, 28), + IsMovValueToStack(X86::MOV32mi, 0x88888888UL, 32), + IsMovValueToStack(X86::MOV32mi, 0x77777777UL, 36), + IsMovValueToStack(X86::MOV32mi, 0x66666666UL, 40), + IsMovValueToStack(X86::MOV32mi, 0x55555555UL, 44), + IsMovValueToStack(X86::MOV32mi, 0x44444444UL, 48), + IsMovValueToStack(X86::MOV32mi, 0x33333333UL, 52), + IsMovValueToStack(X86::MOV32mi, 0x22222222UL, 56), + IsMovValueToStack(X86::MOV32mi, 0x11111111UL, 60), + IsMovValueFromStack(X86::VMOVDQU32Zrm, X86::ZMM0), + IsStackDeallocate(64)})); } // Note: We always put 80 bits on the stack independently of the size of the // value. This uses a bit more space but makes the code simpler. 
TEST_F(Core2TargetTest, SetRegToST0_32Bits) { - EXPECT_THAT( - setRegTo(llvm::X86::ST0, APInt(32, 0x11112222ULL)), - ElementsAre(IsStackAllocate(10), - IsMovValueToStack(llvm::X86::MOV32mi, 0x11112222UL, 0), - IsMovValueToStack(llvm::X86::MOV32mi, 0x00000000UL, 4), - IsMovValueToStack(llvm::X86::MOV16mi, 0x0000UL, 8), - OpcodeIs(llvm::X86::LD_F80m), IsStackDeallocate(10))); + EXPECT_THAT(setRegTo(X86::ST0, APInt(32, 0x11112222ULL)), + ElementsAre(IsStackAllocate(10), + IsMovValueToStack(X86::MOV32mi, 0x11112222UL, 0), + IsMovValueToStack(X86::MOV32mi, 0x00000000UL, 4), + IsMovValueToStack(X86::MOV16mi, 0x0000UL, 8), + OpcodeIs(X86::LD_F80m), IsStackDeallocate(10))); } TEST_F(Core2TargetTest, SetRegToST1_32Bits) { - const MCInst CopySt0ToSt1 = - llvm::MCInstBuilder(llvm::X86::ST_Frr).addReg(llvm::X86::ST1); - EXPECT_THAT( - setRegTo(llvm::X86::ST1, APInt(32, 0x11112222ULL)), - ElementsAre(IsStackAllocate(10), - IsMovValueToStack(llvm::X86::MOV32mi, 0x11112222UL, 0), - IsMovValueToStack(llvm::X86::MOV32mi, 0x00000000UL, 4), - IsMovValueToStack(llvm::X86::MOV16mi, 0x0000UL, 8), - OpcodeIs(llvm::X86::LD_F80m), CopySt0ToSt1, - IsStackDeallocate(10))); + const MCInst CopySt0ToSt1 = MCInstBuilder(X86::ST_Frr).addReg(X86::ST1); + EXPECT_THAT(setRegTo(X86::ST1, APInt(32, 0x11112222ULL)), + ElementsAre(IsStackAllocate(10), + IsMovValueToStack(X86::MOV32mi, 0x11112222UL, 0), + IsMovValueToStack(X86::MOV32mi, 0x00000000UL, 4), + IsMovValueToStack(X86::MOV16mi, 0x0000UL, 8), + OpcodeIs(X86::LD_F80m), CopySt0ToSt1, + IsStackDeallocate(10))); } TEST_F(Core2TargetTest, SetRegToST0_64Bits) { - EXPECT_THAT( - setRegTo(llvm::X86::ST0, APInt(64, 0x1111222233334444ULL)), - ElementsAre(IsStackAllocate(10), - IsMovValueToStack(llvm::X86::MOV32mi, 0x33334444UL, 0), - IsMovValueToStack(llvm::X86::MOV32mi, 0x11112222UL, 4), - IsMovValueToStack(llvm::X86::MOV16mi, 0x0000UL, 8), - OpcodeIs(llvm::X86::LD_F80m), IsStackDeallocate(10))); + EXPECT_THAT(setRegTo(X86::ST0, APInt(64, 
0x1111222233334444ULL)), + ElementsAre(IsStackAllocate(10), + IsMovValueToStack(X86::MOV32mi, 0x33334444UL, 0), + IsMovValueToStack(X86::MOV32mi, 0x11112222UL, 4), + IsMovValueToStack(X86::MOV16mi, 0x0000UL, 8), + OpcodeIs(X86::LD_F80m), IsStackDeallocate(10))); } TEST_F(Core2TargetTest, SetRegToST0_80Bits) { - EXPECT_THAT( - setRegTo(llvm::X86::ST0, APInt(80, "11112222333344445555", 16)), - ElementsAre(IsStackAllocate(10), - IsMovValueToStack(llvm::X86::MOV32mi, 0x44445555UL, 0), - IsMovValueToStack(llvm::X86::MOV32mi, 0x22223333UL, 4), - IsMovValueToStack(llvm::X86::MOV16mi, 0x1111UL, 8), - OpcodeIs(llvm::X86::LD_F80m), IsStackDeallocate(10))); + EXPECT_THAT(setRegTo(X86::ST0, APInt(80, "11112222333344445555", 16)), + ElementsAre(IsStackAllocate(10), + IsMovValueToStack(X86::MOV32mi, 0x44445555UL, 0), + IsMovValueToStack(X86::MOV32mi, 0x22223333UL, 4), + IsMovValueToStack(X86::MOV16mi, 0x1111UL, 8), + OpcodeIs(X86::LD_F80m), IsStackDeallocate(10))); } TEST_F(Core2TargetTest, SetRegToFP0_80Bits) { - EXPECT_THAT( - setRegTo(llvm::X86::FP0, APInt(80, "11112222333344445555", 16)), - ElementsAre(IsStackAllocate(10), - IsMovValueToStack(llvm::X86::MOV32mi, 0x44445555UL, 0), - IsMovValueToStack(llvm::X86::MOV32mi, 0x22223333UL, 4), - IsMovValueToStack(llvm::X86::MOV16mi, 0x1111UL, 8), - OpcodeIs(llvm::X86::LD_Fp80m), IsStackDeallocate(10))); + EXPECT_THAT(setRegTo(X86::FP0, APInt(80, "11112222333344445555", 16)), + ElementsAre(IsStackAllocate(10), + IsMovValueToStack(X86::MOV32mi, 0x44445555UL, 0), + IsMovValueToStack(X86::MOV32mi, 0x22223333UL, 4), + IsMovValueToStack(X86::MOV16mi, 0x1111UL, 8), + OpcodeIs(X86::LD_Fp80m), IsStackDeallocate(10))); } TEST_F(Core2TargetTest, SetRegToFP1_32Bits) { - EXPECT_THAT( - setRegTo(llvm::X86::FP1, APInt(32, 0x11112222ULL)), - ElementsAre(IsStackAllocate(10), - IsMovValueToStack(llvm::X86::MOV32mi, 0x11112222UL, 0), - IsMovValueToStack(llvm::X86::MOV32mi, 0x00000000UL, 4), - IsMovValueToStack(llvm::X86::MOV16mi, 0x0000UL, 8), - 
OpcodeIs(llvm::X86::LD_Fp80m), IsStackDeallocate(10))); + EXPECT_THAT(setRegTo(X86::FP1, APInt(32, 0x11112222ULL)), + ElementsAre(IsStackAllocate(10), + IsMovValueToStack(X86::MOV32mi, 0x11112222UL, 0), + IsMovValueToStack(X86::MOV32mi, 0x00000000UL, 4), + IsMovValueToStack(X86::MOV16mi, 0x0000UL, 8), + OpcodeIs(X86::LD_Fp80m), IsStackDeallocate(10))); } TEST_F(Core2TargetTest, SetRegToFP1_4Bits) { - EXPECT_THAT( - setRegTo(llvm::X86::FP1, APInt(4, 0x1ULL)), - ElementsAre(IsStackAllocate(10), - IsMovValueToStack(llvm::X86::MOV32mi, 0x00000001UL, 0), - IsMovValueToStack(llvm::X86::MOV32mi, 0x00000000UL, 4), - IsMovValueToStack(llvm::X86::MOV16mi, 0x0000UL, 8), - OpcodeIs(llvm::X86::LD_Fp80m), IsStackDeallocate(10))); + EXPECT_THAT(setRegTo(X86::FP1, APInt(4, 0x1ULL)), + ElementsAre(IsStackAllocate(10), + IsMovValueToStack(X86::MOV32mi, 0x00000001UL, 0), + IsMovValueToStack(X86::MOV32mi, 0x00000000UL, 4), + IsMovValueToStack(X86::MOV16mi, 0x0000UL, 8), + OpcodeIs(X86::LD_Fp80m), IsStackDeallocate(10))); } TEST_F(Core2Avx512TargetTest, FillMemoryOperands_ADD64rm) { From llvm-commits at lists.llvm.org Wed Oct 9 04:27:54 2019 From: llvm-commits at lists.llvm.org (wael yehia via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 11:27:54 +0000 (UTC) Subject: [PATCH] D66979: [InstrProf] Tighten a check for malformed data records in raw profiles In-Reply-To: References: Message-ID: <223844e5b4661a7b0fd0ce5fccc9915f@localhost.localdomain> w2yehia added a comment. Hi @vsk can you provide a description/script on how to recreate the `malformed-ptr-to-counter-array.profraw` file when someone is changing the profile layout (for example by adding new value profiling kinds). I'm thinking something like `llvm/test/tools/llvm-profdata/raw-two-profiles.test` would be nice Thanks. 
Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D66979/new/ https://reviews.llvm.org/D66979 From llvm-commits at lists.llvm.org Wed Oct 9 04:27:54 2019 From: llvm-commits at lists.llvm.org (Kai Nacke via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 11:27:54 +0000 (UTC) Subject: [PATCH] D68146: [FileCheck] Implement --ignore-case option. In-Reply-To: References: Message-ID: <98cbe160de6f8cb42e37f4baad67d2b3@localhost.localdomain> Kai updated this revision to Diff 224017. Kai retitled this revision from "[tests] Output of od can be lower or upper case (llvm-objcopy/yaml2obj)." to "[FileCheck] Implement --ignore-case option.". Kai edited the summary of this revision. Kai added a comment. Added a test case and documentation for the --ignore-case option. Removed the changed test cases. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68146/new/ https://reviews.llvm.org/D68146 Files: llvm/docs/CommandGuide/FileCheck.rst llvm/include/llvm/Support/FileCheck.h llvm/lib/Support/FileCheck.cpp llvm/lib/Support/FileCheckImpl.h llvm/test/FileCheck/check-ignore-case.txt llvm/utils/FileCheck/FileCheck.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D68146.224017.patch Type: text/x-patch Size: 4425 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 04:27:55 2019 From: llvm-commits at lists.llvm.org (Kai Nacke via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 11:27:55 +0000 (UTC) Subject: [PATCH] D68146: [FileCheck] Implement --ignore-case option. In-Reply-To: References: Message-ID: Kai added a comment. In D68146#1691246 , @MaskRay wrote: > To make it really `--ignore-case`, the pattern should also be changed to lowercase. I am not sure which pattern I have missed. The fixed string match uses the lowercase find and the regex match uses the Regex::IgnoreCase flag. 
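The two strategies described in the comment above can be sketched with the standard library alone. This is an illustration only, not the FileCheck code from D68146: the helper names `toLower`, `containsIgnoreCase`, and `regexSearchIgnoreCase` are hypothetical, and `std::regex::icase` stands in here for LLVM's `Regex::IgnoreCase` flag.

```cpp
#include <algorithm>
#include <cctype>
#include <regex>
#include <string>

// Hypothetical helper (not from the patch): return a lowercased copy.
static std::string toLower(std::string S) {
  std::transform(S.begin(), S.end(), S.begin(), [](unsigned char C) {
    return static_cast<char>(std::tolower(C));
  });
  return S;
}

// Fixed-string match: lowercase both the buffer and the pattern, then do
// an ordinary substring search.
bool containsIgnoreCase(const std::string &Buffer, const std::string &Pattern) {
  return toLower(Buffer).find(toLower(Pattern)) != std::string::npos;
}

// Regex match: leave both strings untouched and ask the regex engine to
// ignore case instead.
bool regexSearchIgnoreCase(const std::string &Buffer,
                           const std::string &Pattern) {
  const std::regex RE(Pattern, std::regex::ECMAScript | std::regex::icase);
  return std::regex_search(Buffer, RE);
}
```

In this sketch the fixed-string path folds the pattern as well as the input, so a pattern written in uppercase still matches lowercase `od` output and the other way round.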
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68146/new/ https://reviews.llvm.org/D68146 From llvm-commits at lists.llvm.org Wed Oct 9 04:29:23 2019 From: llvm-commits at lists.llvm.org (Clement Courbet via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 11:29:23 +0000 (UTC) Subject: [PATCH] D68687: [llvm-exegesis][NFC] Remove extra `llvm::` qualifications. In-Reply-To: References: Message-ID: <96386de67e8db8e0c2ed35d456b25111@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rGd422d3a755d2: [llvm-exegesis][NFC] Remove extra `llvm::` qualifications. (authored by courbet). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68687/new/ https://reviews.llvm.org/D68687 Files: llvm/unittests/tools/llvm-exegesis/AArch64/TargetTest.cpp llvm/unittests/tools/llvm-exegesis/ARM/AssemblerTest.cpp llvm/unittests/tools/llvm-exegesis/Common/AssemblerUtils.h llvm/unittests/tools/llvm-exegesis/PerfHelperTest.cpp llvm/unittests/tools/llvm-exegesis/PowerPC/AnalysisTest.cpp llvm/unittests/tools/llvm-exegesis/PowerPC/TargetTest.cpp llvm/unittests/tools/llvm-exegesis/RegisterValueTest.cpp llvm/unittests/tools/llvm-exegesis/X86/AssemblerTest.cpp llvm/unittests/tools/llvm-exegesis/X86/BenchmarkResultTest.cpp llvm/unittests/tools/llvm-exegesis/X86/RegisterAliasingTest.cpp llvm/unittests/tools/llvm-exegesis/X86/SchedClassResolutionTest.cpp llvm/unittests/tools/llvm-exegesis/X86/SnippetGeneratorTest.cpp llvm/unittests/tools/llvm-exegesis/X86/TargetTest.cpp -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68687.224019.patch Type: text/x-patch Size: 47722 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 04:29:44 2019 From: llvm-commits at lists.llvm.org (Clement Courbet via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 11:29:44 +0000 (UTC) Subject: [PATCH] D68692: [llvm-exegesis][NFC] Remove extra `llvm::` qualifications. Message-ID: courbet created this revision. courbet added a reviewer: gchatelet. Herald added subscribers: jsji, mgrang, MaskRay, tschuett, nemanjai. Herald added a project: LLVM. Second patch: in the lib. Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D68692 Files: llvm/tools/llvm-exegesis/lib/AArch64/Target.cpp llvm/tools/llvm-exegesis/lib/Analysis.cpp llvm/tools/llvm-exegesis/lib/Analysis.h llvm/tools/llvm-exegesis/lib/Assembler.cpp llvm/tools/llvm-exegesis/lib/Assembler.h llvm/tools/llvm-exegesis/lib/BenchmarkResult.cpp llvm/tools/llvm-exegesis/lib/BenchmarkResult.h llvm/tools/llvm-exegesis/lib/BenchmarkRunner.cpp llvm/tools/llvm-exegesis/lib/BenchmarkRunner.h llvm/tools/llvm-exegesis/lib/Clustering.cpp llvm/tools/llvm-exegesis/lib/Clustering.h llvm/tools/llvm-exegesis/lib/CodeTemplate.cpp llvm/tools/llvm-exegesis/lib/CodeTemplate.h llvm/tools/llvm-exegesis/lib/Latency.cpp llvm/tools/llvm-exegesis/lib/Latency.h llvm/tools/llvm-exegesis/lib/LlvmState.cpp llvm/tools/llvm-exegesis/lib/LlvmState.h llvm/tools/llvm-exegesis/lib/MCInstrDescView.cpp llvm/tools/llvm-exegesis/lib/MCInstrDescView.h llvm/tools/llvm-exegesis/lib/PerfHelper.cpp llvm/tools/llvm-exegesis/lib/PerfHelper.h llvm/tools/llvm-exegesis/lib/PowerPC/Target.cpp llvm/tools/llvm-exegesis/lib/RegisterAliasing.cpp llvm/tools/llvm-exegesis/lib/RegisterAliasing.h llvm/tools/llvm-exegesis/lib/RegisterValue.cpp llvm/tools/llvm-exegesis/lib/RegisterValue.h llvm/tools/llvm-exegesis/lib/SchedClassResolution.cpp llvm/tools/llvm-exegesis/lib/SchedClassResolution.h llvm/tools/llvm-exegesis/lib/SnippetGenerator.cpp 
llvm/tools/llvm-exegesis/lib/SnippetGenerator.h llvm/tools/llvm-exegesis/lib/SnippetRepetitor.cpp llvm/tools/llvm-exegesis/lib/Target.cpp llvm/tools/llvm-exegesis/lib/Target.h llvm/tools/llvm-exegesis/lib/Uops.cpp llvm/tools/llvm-exegesis/lib/Uops.h llvm/tools/llvm-exegesis/lib/X86/Target.cpp llvm/tools/llvm-exegesis/llvm-exegesis.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D68692.224020.patch Type: text/x-patch Size: 175075 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 04:29:57 2019 From: llvm-commits at lists.llvm.org (Kai Nacke via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 11:29:57 +0000 (UTC) Subject: [PATCH] D68693: [Tests] Output of od can be lower or upper case (llvm-objcopy/yaml2obj). Message-ID: Kai created this revision. Kai added reviewers: Bigcheese, jakehehrlich, rupprecht, espindola, alexshap, jhenderson. Herald added subscribers: llvm-commits, seiya, abrachet, dexonsmith, MaskRay, arichardson, emaste. Herald added a project: LLVM. The command `od -t x` is used to dump data in hex format. The LIT tests assume that the hex characters are in lowercase. However, there are also platforms which use uppercase letters. To solve this issue, the tests are updated to use the new `--ignore-case` option of FileCheck. 
Repository: rL LLVM https://reviews.llvm.org/D68693 Files: llvm/test/tools/llvm-objcopy/ELF/basic-binary-copy.test llvm/test/tools/llvm-objcopy/ELF/binary-no-paddr.test llvm/test/tools/llvm-objcopy/ELF/binary-paddr.test llvm/test/tools/llvm-objcopy/ELF/binary-segment-layout.test llvm/test/tools/llvm-objcopy/ELF/check-addr-offset-align-binary.test llvm/test/tools/llvm-objcopy/ELF/dump-section.test llvm/test/tools/llvm-objcopy/ELF/preserve-segment-contents.test llvm/test/tools/llvm-objcopy/ELF/strip-all-gnu.test llvm/test/tools/llvm-objcopy/ELF/strip-sections.test llvm/test/tools/yaml2obj/elf-override-shoffset.yaml llvm/test/tools/yaml2obj/elf-override-shsize.yaml -------------- next part -------------- A non-text attachment was scrubbed... Name: D68693.224018.patch Type: text/x-patch Size: 10267 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 04:58:42 2019 From: llvm-commits at lists.llvm.org (Clement Courbet via llvm-commits) Date: Wed, 09 Oct 2019 11:58:42 -0000 Subject: [llvm] r374158 - [llvm-exegesis][NFC] Remove extra `llvm::` qualifications. Message-ID: <20191009115842.DE1F98A83D@lists.llvm.org> Author: courbet Date: Wed Oct 9 04:58:42 2019 New Revision: 374158 URL: http://llvm.org/viewvc/llvm-project?rev=374158&view=rev Log: [llvm-exegesis][NFC] Remove extra `llvm::` qualifications. Summary: Second patch: in the lib. 
Reviewers: gchatelet Subscribers: nemanjai, tschuett, MaskRay, mgrang, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68692 Modified: llvm/trunk/tools/llvm-exegesis/lib/AArch64/Target.cpp llvm/trunk/tools/llvm-exegesis/lib/Analysis.cpp llvm/trunk/tools/llvm-exegesis/lib/Analysis.h llvm/trunk/tools/llvm-exegesis/lib/Assembler.cpp llvm/trunk/tools/llvm-exegesis/lib/Assembler.h llvm/trunk/tools/llvm-exegesis/lib/BenchmarkResult.cpp llvm/trunk/tools/llvm-exegesis/lib/BenchmarkResult.h llvm/trunk/tools/llvm-exegesis/lib/BenchmarkRunner.cpp llvm/trunk/tools/llvm-exegesis/lib/BenchmarkRunner.h llvm/trunk/tools/llvm-exegesis/lib/Clustering.cpp llvm/trunk/tools/llvm-exegesis/lib/Clustering.h llvm/trunk/tools/llvm-exegesis/lib/CodeTemplate.cpp llvm/trunk/tools/llvm-exegesis/lib/CodeTemplate.h llvm/trunk/tools/llvm-exegesis/lib/Latency.cpp llvm/trunk/tools/llvm-exegesis/lib/Latency.h llvm/trunk/tools/llvm-exegesis/lib/LlvmState.cpp llvm/trunk/tools/llvm-exegesis/lib/LlvmState.h llvm/trunk/tools/llvm-exegesis/lib/MCInstrDescView.cpp llvm/trunk/tools/llvm-exegesis/lib/MCInstrDescView.h llvm/trunk/tools/llvm-exegesis/lib/PerfHelper.cpp llvm/trunk/tools/llvm-exegesis/lib/PerfHelper.h llvm/trunk/tools/llvm-exegesis/lib/PowerPC/Target.cpp llvm/trunk/tools/llvm-exegesis/lib/RegisterAliasing.cpp llvm/trunk/tools/llvm-exegesis/lib/RegisterAliasing.h llvm/trunk/tools/llvm-exegesis/lib/RegisterValue.cpp llvm/trunk/tools/llvm-exegesis/lib/RegisterValue.h llvm/trunk/tools/llvm-exegesis/lib/SchedClassResolution.cpp llvm/trunk/tools/llvm-exegesis/lib/SchedClassResolution.h llvm/trunk/tools/llvm-exegesis/lib/SnippetGenerator.cpp llvm/trunk/tools/llvm-exegesis/lib/SnippetGenerator.h llvm/trunk/tools/llvm-exegesis/lib/SnippetRepetitor.cpp llvm/trunk/tools/llvm-exegesis/lib/Target.cpp llvm/trunk/tools/llvm-exegesis/lib/Target.h llvm/trunk/tools/llvm-exegesis/lib/Uops.cpp llvm/trunk/tools/llvm-exegesis/lib/Uops.h llvm/trunk/tools/llvm-exegesis/lib/X86/Target.cpp 
llvm/trunk/tools/llvm-exegesis/llvm-exegesis.cpp Modified: llvm/trunk/tools/llvm-exegesis/lib/AArch64/Target.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-exegesis/lib/AArch64/Target.cpp?rev=374158&r1=374157&r2=374158&view=diff ============================================================================== --- llvm/trunk/tools/llvm-exegesis/lib/AArch64/Target.cpp (original) +++ llvm/trunk/tools/llvm-exegesis/lib/AArch64/Target.cpp Wed Oct 9 04:58:42 2019 @@ -16,19 +16,19 @@ namespace exegesis { static unsigned getLoadImmediateOpcode(unsigned RegBitWidth) { switch (RegBitWidth) { case 32: - return llvm::AArch64::MOVi32imm; + return AArch64::MOVi32imm; case 64: - return llvm::AArch64::MOVi64imm; + return AArch64::MOVi64imm; } llvm_unreachable("Invalid Value Width"); } // Generates instruction to load an immediate value into a register. -static llvm::MCInst loadImmediate(unsigned Reg, unsigned RegBitWidth, - const llvm::APInt &Value) { +static MCInst loadImmediate(unsigned Reg, unsigned RegBitWidth, + const APInt &Value) { if (Value.getBitWidth() > RegBitWidth) llvm_unreachable("Value must fit in the Register"); - return llvm::MCInstBuilder(getLoadImmediateOpcode(RegBitWidth)) + return MCInstBuilder(getLoadImmediateOpcode(RegBitWidth)) .addReg(Reg) .addImm(Value.getZExtValue()); } @@ -42,24 +42,23 @@ public: ExegesisAArch64Target() : ExegesisTarget(AArch64CpuPfmCounters) {} private: - std::vector<llvm::MCInst> setRegTo(const llvm::MCSubtargetInfo &STI, - unsigned Reg, - const llvm::APInt &Value) const override { - if (llvm::AArch64::GPR32RegClass.contains(Reg)) + std::vector<MCInst> setRegTo(const MCSubtargetInfo &STI, unsigned Reg, + const APInt &Value) const override { + if (AArch64::GPR32RegClass.contains(Reg)) return {loadImmediate(Reg, 32, Value)}; - if (llvm::AArch64::GPR64RegClass.contains(Reg)) + if (AArch64::GPR64RegClass.contains(Reg)) return {loadImmediate(Reg, 64, Value)}; - llvm::errs() << "setRegTo is not implemented, results will be unreliable\n"; + 
errs() << "setRegTo is not implemented, results will be unreliable\n"; return {}; } - bool matchesArch(llvm::Triple::ArchType Arch) const override { - return Arch == llvm::Triple::aarch64 || Arch == llvm::Triple::aarch64_be; + bool matchesArch(Triple::ArchType Arch) const override { + return Arch == Triple::aarch64 || Arch == Triple::aarch64_be; } - void addTargetSpecificPasses(llvm::PassManagerBase &PM) const override { + void addTargetSpecificPasses(PassManagerBase &PM) const override { // Function return is a pseudo-instruction that needs to be expanded - PM.add(llvm::createAArch64ExpandPseudoPass()); + PM.add(createAArch64ExpandPseudoPass()); } }; Modified: llvm/trunk/tools/llvm-exegesis/lib/Analysis.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-exegesis/lib/Analysis.cpp?rev=374158&r1=374157&r2=374158&view=diff ============================================================================== --- llvm/trunk/tools/llvm-exegesis/lib/Analysis.cpp (original) +++ llvm/trunk/tools/llvm-exegesis/lib/Analysis.cpp Wed Oct 9 04:58:42 2019 @@ -24,11 +24,9 @@ namespace { enum EscapeTag { kEscapeCsv, kEscapeHtml, kEscapeHtmlString }; -template -void writeEscaped(llvm::raw_ostream &OS, const llvm::StringRef S); +template void writeEscaped(raw_ostream &OS, const StringRef S); -template <> -void writeEscaped(llvm::raw_ostream &OS, const llvm::StringRef S) { +template <> void writeEscaped(raw_ostream &OS, const StringRef S) { if (std::find(S.begin(), S.end(), kCsvSep) == S.end()) { OS << S; } else { @@ -44,8 +42,7 @@ void writeEscaped(llvm::raw_ } } -template <> -void writeEscaped(llvm::raw_ostream &OS, const llvm::StringRef S) { +template <> void writeEscaped(raw_ostream &OS, const StringRef S) { for (const char C : S) { if (C == '<') OS << "<"; @@ -59,8 +56,7 @@ void writeEscaped(llvm::raw } template <> -void writeEscaped(llvm::raw_ostream &OS, - const llvm::StringRef S) { +void writeEscaped(raw_ostream &OS, const StringRef S) { for (const char C : S) { if (C 
== '"') OS << "\\\""; @@ -73,7 +69,7 @@ void writeEscaped(llv template static void -writeClusterId(llvm::raw_ostream &OS, +writeClusterId(raw_ostream &OS, const InstructionBenchmarkClustering::ClusterId &CID) { if (CID.isNoise()) writeEscaped(OS, "[noise]"); @@ -84,7 +80,7 @@ writeClusterId(llvm::raw_ostream &OS, } template -static void writeMeasurementValue(llvm::raw_ostream &OS, const double Value) { +static void writeMeasurementValue(raw_ostream &OS, const double Value) { // Given Value, if we wanted to serialize it to a string, // how many base-10 digits will we need to store, max? static constexpr auto MaxDigitCount = @@ -98,39 +94,36 @@ static void writeMeasurementValue(llvm:: static constexpr StringLiteral SimpleFloatFormat = StringLiteral("{0:F}"); writeEscaped( - OS, - llvm::formatv(SimpleFloatFormat.data(), Value).sstr()); + OS, formatv(SimpleFloatFormat.data(), Value).sstr()); } template -void Analysis::writeSnippet(llvm::raw_ostream &OS, - llvm::ArrayRef Bytes, +void Analysis::writeSnippet(raw_ostream &OS, ArrayRef Bytes, const char *Separator) const { - llvm::SmallVector Lines; + SmallVector Lines; // Parse the asm snippet and print it. while (!Bytes.empty()) { - llvm::MCInst MI; + MCInst MI; uint64_t MISize = 0; - if (!Disasm_->getInstruction(MI, MISize, Bytes, 0, llvm::nulls(), - llvm::nulls())) { - writeEscaped(OS, llvm::join(Lines, Separator)); + if (!Disasm_->getInstruction(MI, MISize, Bytes, 0, nulls(), nulls())) { + writeEscaped(OS, join(Lines, Separator)); writeEscaped(OS, Separator); writeEscaped(OS, "[error decoding asm snippet]"); return; } - llvm::SmallString<128> InstPrinterStr; // FIXME: magic number. - llvm::raw_svector_ostream OSS(InstPrinterStr); + SmallString<128> InstPrinterStr; // FIXME: magic number. 
+ raw_svector_ostream OSS(InstPrinterStr); InstPrinter_->printInst(&MI, OSS, "", *SubtargetInfo_); Bytes = Bytes.drop_front(MISize); - Lines.emplace_back(llvm::StringRef(InstPrinterStr).trim()); + Lines.emplace_back(StringRef(InstPrinterStr).trim()); } - writeEscaped(OS, llvm::join(Lines, Separator)); + writeEscaped(OS, join(Lines, Separator)); } // Prints a row representing an instruction, along with scheduling info and // point coordinates (measurements). void Analysis::printInstructionRowCsv(const size_t PointId, - llvm::raw_ostream &OS) const { + raw_ostream &OS) const { const InstructionBenchmark &Point = Clustering_.getPoints()[PointId]; writeClusterId(OS, Clustering_.getClusterIdForPoint(PointId)); OS << kCsvSep; @@ -139,12 +132,12 @@ void Analysis::printInstructionRowCsv(co writeEscaped(OS, Point.Key.Config); OS << kCsvSep; assert(!Point.Key.Instructions.empty()); - const llvm::MCInst &MCI = Point.keyInstruction(); + const MCInst &MCI = Point.keyInstruction(); unsigned SchedClassId; std::tie(SchedClassId, std::ignore) = ResolvedSchedClass::resolveSchedClassId( *SubtargetInfo_, *InstrInfo_, MCI); #if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP) - const llvm::MCSchedClassDesc *const SCDesc = + const MCSchedClassDesc *const SCDesc = SubtargetInfo_->getSchedModel().getSchedClassDesc(SchedClassId); writeEscaped(OS, SCDesc->Name); #else @@ -157,8 +150,7 @@ void Analysis::printInstructionRowCsv(co OS << "\n"; } -Analysis::Analysis(const llvm::Target &Target, - std::unique_ptr InstrInfo, +Analysis::Analysis(const Target &Target, std::unique_ptr InstrInfo, const InstructionBenchmarkClustering &Clustering, double AnalysisInconsistencyEpsilon, bool AnalysisDisplayUnstableOpcodes) @@ -175,21 +167,20 @@ Analysis::Analysis(const llvm::Target &T SubtargetInfo_.reset(Target.createMCSubtargetInfo(FirstPoint.LLVMTriple, FirstPoint.CpuName, "")); InstPrinter_.reset(Target.createMCInstPrinter( - llvm::Triple(FirstPoint.LLVMTriple), 0 /*default variant*/, *AsmInfo_, + 
Triple(FirstPoint.LLVMTriple), 0 /*default variant*/, *AsmInfo_, *InstrInfo_, *RegInfo_)); - Context_ = std::make_unique(AsmInfo_.get(), RegInfo_.get(), - &ObjectFileInfo_); + Context_ = std::make_unique(AsmInfo_.get(), RegInfo_.get(), + &ObjectFileInfo_); Disasm_.reset(Target.createMCDisassembler(*SubtargetInfo_, *Context_)); assert(Disasm_ && "cannot create MCDisassembler. missing call to " "InitializeXXXTargetDisassembler ?"); } template <> -llvm::Error -Analysis::run(llvm::raw_ostream &OS) const { +Error Analysis::run(raw_ostream &OS) const { if (Clustering_.getPoints().empty()) - return llvm::Error::success(); + return Error::success(); // Write the header. OS << "cluster_id" << kCsvSep << "opcode_name" << kCsvSep << "config" @@ -208,7 +199,7 @@ Analysis::run(l } OS << "\n\n"; } - return llvm::Error::success(); + return Error::success(); } Analysis::ResolvedSchedClassAndPoints::ResolvedSchedClassAndPoints( @@ -228,7 +219,7 @@ Analysis::makePointsPerSchedClass() cons assert(!Point.Key.Instructions.empty()); // FIXME: we should be using the tuple of classes for instructions in the // snippet as key. - const llvm::MCInst &MCI = Point.keyInstruction(); + const MCInst &MCI = Point.keyInstruction(); unsigned SchedClassId; bool WasVariant; std::tie(SchedClassId, WasVariant) = @@ -252,9 +243,9 @@ Analysis::makePointsPerSchedClass() cons // Uops repeat the same opcode over again. Just show this opcode and show the // whole snippet only on hover. -static void writeUopsSnippetHtml(llvm::raw_ostream &OS, - const std::vector &Instructions, - const llvm::MCInstrInfo &InstrInfo) { +static void writeUopsSnippetHtml(raw_ostream &OS, + const std::vector &Instructions, + const MCInstrInfo &InstrInfo) { if (Instructions.empty()) return; writeEscaped(OS, InstrInfo.getName(Instructions[0].getOpcode())); @@ -264,12 +255,11 @@ static void writeUopsSnippetHtml(llvm::r // Latency tries to find a serial path. Just show the opcode path and show the // whole snippet only on hover. 
-static void -writeLatencySnippetHtml(llvm::raw_ostream &OS, - const std::vector &Instructions, - const llvm::MCInstrInfo &InstrInfo) { +static void writeLatencySnippetHtml(raw_ostream &OS, + const std::vector &Instructions, + const MCInstrInfo &InstrInfo) { bool First = true; - for (const llvm::MCInst &Instr : Instructions) { + for (const MCInst &Instr : Instructions) { if (First) First = false; else @@ -280,7 +270,7 @@ writeLatencySnippetHtml(llvm::raw_ostrea void Analysis::printSchedClassClustersHtml( const std::vector &Clusters, - const ResolvedSchedClass &RSC, llvm::raw_ostream &OS) const { + const ResolvedSchedClass &RSC, raw_ostream &OS) const { const auto &Points = Clustering_.getPoints(); OS << ""; OS << ""; @@ -349,7 +339,7 @@ void Analysis::SchedClassCluster::addPoi } bool Analysis::SchedClassCluster::measurementsMatch( - const llvm::MCSubtargetInfo &STI, const ResolvedSchedClass &RSC, + const MCSubtargetInfo &STI, const ResolvedSchedClass &RSC, const InstructionBenchmarkClustering &Clustering, const double AnalysisInconsistencyEpsilonSquared_) const { assert(!Clustering.getPoints().empty()); @@ -374,7 +364,7 @@ bool Analysis::SchedClassCluster::measur } void Analysis::printSchedClassDescHtml(const ResolvedSchedClass &RSC, - llvm::raw_ostream &OS) const { + raw_ostream &OS) const { OS << "
<tr><th>ClusterId</th><th>Opcode/Config</th>
    "; OS << ""; for (const auto &Stats : Cluster.getCentroid().getStats()) { @@ -422,6 +425,43 @@ void Analysis::printSchedClassDescHtml(c OS << "
    ValidVariantNumMicroOpsLatencyRThroughputWriteProcRes -llvm::Error Analysis::run( - llvm::raw_ostream &OS) const { +Error Analysis::run( + raw_ostream &OS) const { const auto &FirstPoint = Clustering_.getPoints()[0]; // Print the header. OS << "" << kHtmlHead << ""; @@ -536,12 +526,12 @@ llvm::Error Analysis::run

    Sched Class "; - return llvm::Error::success(); + return Error::success(); } } // namespace exegesis Modified: llvm/trunk/tools/llvm-exegesis/lib/Analysis.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-exegesis/lib/Analysis.h?rev=374158&r1=374157&r2=374158&view=diff ============================================================================== --- llvm/trunk/tools/llvm-exegesis/lib/Analysis.h (original) +++ llvm/trunk/tools/llvm-exegesis/lib/Analysis.h Wed Oct 9 04:58:42 2019 @@ -36,8 +36,7 @@ namespace exegesis { // A helper class to analyze benchmark results for a target. class Analysis { public: - Analysis(const llvm::Target &Target, - std::unique_ptr InstrInfo, + Analysis(const Target &Target, std::unique_ptr InstrInfo, const InstructionBenchmarkClustering &Clustering, double AnalysisInconsistencyEpsilon, bool AnalysisDisplayUnstableOpcodes); @@ -47,7 +46,7 @@ public: // Find potential errors in the scheduling information given measurements. struct PrintSchedClassInconsistencies {}; - template llvm::Error run(llvm::raw_ostream &OS) const; + template Error run(raw_ostream &OS) const; private: using ClusterId = InstructionBenchmarkClustering::ClusterId; @@ -69,8 +68,7 @@ private: // Returns true if the cluster representative measurements match that of SC. 
bool - measurementsMatch(const llvm::MCSubtargetInfo &STI, - const ResolvedSchedClass &SC, + measurementsMatch(const MCSubtargetInfo &STI, const ResolvedSchedClass &SC, const InstructionBenchmarkClustering &Clustering, const double AnalysisInconsistencyEpsilonSquared_) const; @@ -81,14 +79,14 @@ private: SchedClassClusterCentroid Centroid; }; - void printInstructionRowCsv(size_t PointId, llvm::raw_ostream &OS) const; + void printInstructionRowCsv(size_t PointId, raw_ostream &OS) const; void printSchedClassClustersHtml(const std::vector &Clusters, const ResolvedSchedClass &SC, - llvm::raw_ostream &OS) const; + raw_ostream &OS) const; void printSchedClassDescHtml(const ResolvedSchedClass &SC, - llvm::raw_ostream &OS) const; + raw_ostream &OS) const; // A pair of (Sched Class, indices of points that belong to the sched // class). @@ -103,18 +101,18 @@ private: std::vector makePointsPerSchedClass() const; template - void writeSnippet(llvm::raw_ostream &OS, llvm::ArrayRef Bytes, + void writeSnippet(raw_ostream &OS, ArrayRef Bytes, const char *Separator) const; const InstructionBenchmarkClustering &Clustering_; - llvm::MCObjectFileInfo ObjectFileInfo_; - std::unique_ptr Context_; - std::unique_ptr SubtargetInfo_; - std::unique_ptr InstrInfo_; - std::unique_ptr RegInfo_; - std::unique_ptr AsmInfo_; - std::unique_ptr InstPrinter_; - std::unique_ptr Disasm_; + MCObjectFileInfo ObjectFileInfo_; + std::unique_ptr Context_; + std::unique_ptr SubtargetInfo_; + std::unique_ptr InstrInfo_; + std::unique_ptr RegInfo_; + std::unique_ptr AsmInfo_; + std::unique_ptr InstPrinter_; + std::unique_ptr Disasm_; const double AnalysisInconsistencyEpsilonSquared_; const bool AnalysisDisplayUnstableOpcodes_; }; Modified: llvm/trunk/tools/llvm-exegesis/lib/Assembler.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-exegesis/lib/Assembler.cpp?rev=374158&r1=374157&r2=374158&view=diff ============================================================================== --- 
llvm/trunk/tools/llvm-exegesis/lib/Assembler.cpp (original) +++ llvm/trunk/tools/llvm-exegesis/lib/Assembler.cpp Wed Oct 9 04:58:42 2019 @@ -31,11 +31,9 @@ static constexpr const char FunctionID[] // Fills the given basic block with register setup code, and returns true if // all registers could be setup correctly. -static bool -generateSnippetSetupCode(const ExegesisTarget &ET, - const llvm::MCSubtargetInfo *const MSI, - llvm::ArrayRef RegisterInitialValues, - BasicBlockFiller &BBF) { +static bool generateSnippetSetupCode( + const ExegesisTarget &ET, const MCSubtargetInfo *const MSI, + ArrayRef RegisterInitialValues, BasicBlockFiller &BBF) { bool IsSnippetSetupComplete = true; for (const RegisterValue &RV : RegisterInitialValues) { // Load a constant in the register. @@ -48,20 +46,20 @@ generateSnippetSetupCode(const ExegesisT } // Small utility function to add named passes. -static bool addPass(llvm::PassManagerBase &PM, llvm::StringRef PassName, - llvm::TargetPassConfig &TPC) { - const llvm::PassRegistry *PR = llvm::PassRegistry::getPassRegistry(); - const llvm::PassInfo *PI = PR->getPassInfo(PassName); +static bool addPass(PassManagerBase &PM, StringRef PassName, + TargetPassConfig &TPC) { + const PassRegistry *PR = PassRegistry::getPassRegistry(); + const PassInfo *PI = PR->getPassInfo(PassName); if (!PI) { - llvm::errs() << " run-pass " << PassName << " is not registered.\n"; + errs() << " run-pass " << PassName << " is not registered.\n"; return true; } if (!PI->getNormalCtor()) { - llvm::errs() << " cannot create pass: " << PI->getPassName() << "\n"; + errs() << " cannot create pass: " << PI->getPassName() << "\n"; return true; } - llvm::Pass *P = PI->getNormalCtor()(); + Pass *P = PI->getNormalCtor()(); std::string Banner = std::string("After ") + std::string(P->getPassName()); PM.add(P); TPC.printAndVerify(Banner); @@ -69,42 +67,39 @@ static bool addPass(llvm::PassManagerBas return false; } -llvm::MachineFunction & 
-createVoidVoidPtrMachineFunction(llvm::StringRef FunctionID, - llvm::Module *Module, - llvm::MachineModuleInfo *MMI) { - llvm::Type *const ReturnType = llvm::Type::getInt32Ty(Module->getContext()); - llvm::Type *const MemParamType = llvm::PointerType::get( - llvm::Type::getInt8Ty(Module->getContext()), 0 /*default address space*/); - llvm::FunctionType *FunctionType = - llvm::FunctionType::get(ReturnType, {MemParamType}, false); - llvm::Function *const F = llvm::Function::Create( - FunctionType, llvm::GlobalValue::InternalLinkage, FunctionID, Module); +MachineFunction &createVoidVoidPtrMachineFunction(StringRef FunctionID, + Module *Module, + MachineModuleInfo *MMI) { + Type *const ReturnType = Type::getInt32Ty(Module->getContext()); + Type *const MemParamType = PointerType::get( + Type::getInt8Ty(Module->getContext()), 0 /*default address space*/); + FunctionType *FunctionType = + FunctionType::get(ReturnType, {MemParamType}, false); + Function *const F = Function::Create( + FunctionType, GlobalValue::InternalLinkage, FunctionID, Module); // Making sure we can create a MachineFunction out of this Function even if it // contains no IR. 
F->setIsMaterializable(true); return MMI->getOrCreateMachineFunction(*F); } -BasicBlockFiller::BasicBlockFiller(llvm::MachineFunction &MF, - llvm::MachineBasicBlock *MBB, - const llvm::MCInstrInfo *MCII) +BasicBlockFiller::BasicBlockFiller(MachineFunction &MF, MachineBasicBlock *MBB, + const MCInstrInfo *MCII) : MF(MF), MBB(MBB), MCII(MCII) {} -void BasicBlockFiller::addInstruction(const llvm::MCInst &Inst, - const llvm::DebugLoc &DL) { +void BasicBlockFiller::addInstruction(const MCInst &Inst, const DebugLoc &DL) { const unsigned Opcode = Inst.getOpcode(); - const llvm::MCInstrDesc &MCID = MCII->get(Opcode); - llvm::MachineInstrBuilder Builder = llvm::BuildMI(MBB, DL, MCID); + const MCInstrDesc &MCID = MCII->get(Opcode); + MachineInstrBuilder Builder = BuildMI(MBB, DL, MCID); for (unsigned OpIndex = 0, E = Inst.getNumOperands(); OpIndex < E; ++OpIndex) { - const llvm::MCOperand &Op = Inst.getOperand(OpIndex); + const MCOperand &Op = Inst.getOperand(OpIndex); if (Op.isReg()) { const bool IsDef = OpIndex < MCID.getNumDefs(); unsigned Flags = 0; - const llvm::MCOperandInfo &OpInfo = MCID.operands().begin()[OpIndex]; + const MCOperandInfo &OpInfo = MCID.operands().begin()[OpIndex]; if (IsDef && !OpInfo.isOptionalDef()) - Flags |= llvm::RegState::Define; + Flags |= RegState::Define; Builder.addReg(Op.getReg(), Flags); } else if (Op.isImm()) { Builder.addImm(Op.getImm()); @@ -116,31 +111,31 @@ void BasicBlockFiller::addInstruction(co } } -void BasicBlockFiller::addInstructions(ArrayRef Insts, - const llvm::DebugLoc &DL) { +void BasicBlockFiller::addInstructions(ArrayRef Insts, + const DebugLoc &DL) { for (const MCInst &Inst : Insts) addInstruction(Inst, DL); } -void BasicBlockFiller::addReturn(const llvm::DebugLoc &DL) { +void BasicBlockFiller::addReturn(const DebugLoc &DL) { // Insert the return code. 
- const llvm::TargetInstrInfo *TII = MF.getSubtarget().getInstrInfo(); + const TargetInstrInfo *TII = MF.getSubtarget().getInstrInfo(); if (TII->getReturnOpcode() < TII->getNumOpcodes()) { - llvm::BuildMI(MBB, DL, TII->get(TII->getReturnOpcode())); + BuildMI(MBB, DL, TII->get(TII->getReturnOpcode())); } else { - llvm::MachineIRBuilder MIB(MF); + MachineIRBuilder MIB(MF); MIB.setMBB(*MBB); MF.getSubtarget().getCallLowering()->lowerReturn(MIB, nullptr, {}); } } -FunctionFiller::FunctionFiller(llvm::MachineFunction &MF, +FunctionFiller::FunctionFiller(MachineFunction &MF, std::vector RegistersSetUp) : MF(MF), MCII(MF.getTarget().getMCInstrInfo()), Entry(addBasicBlock()), RegistersSetUp(std::move(RegistersSetUp)) {} BasicBlockFiller FunctionFiller::addBasicBlock() { - llvm::MachineBasicBlock *MBB = MF.CreateMachineBasicBlock(); + MachineBasicBlock *MBB = MF.CreateMachineBasicBlock(); MF.push_back(MBB); return BasicBlockFiller(MF, MBB, MCII); } @@ -149,50 +144,45 @@ ArrayRef FunctionFiller::getRe return RegistersSetUp; } -static std::unique_ptr -createModule(const std::unique_ptr &Context, - const llvm::DataLayout DL) { - auto Module = std::make_unique(ModuleID, *Context); - Module->setDataLayout(DL); - return Module; +static std::unique_ptr +createModule(const std::unique_ptr &Context, const DataLayout DL) { + auto Mod = std::make_unique(ModuleID, *Context); + Mod->setDataLayout(DL); + return Mod; } -llvm::BitVector getFunctionReservedRegs(const llvm::TargetMachine &TM) { - std::unique_ptr Context = - std::make_unique(); - std::unique_ptr Module = - createModule(Context, TM.createDataLayout()); +BitVector getFunctionReservedRegs(const TargetMachine &TM) { + std::unique_ptr Context = std::make_unique(); + std::unique_ptr Module = createModule(Context, TM.createDataLayout()); // TODO: This only works for targets implementing LLVMTargetMachine. 
const LLVMTargetMachine &LLVMTM = static_cast(TM); - std::unique_ptr MMIWP = - std::make_unique(&LLVMTM); - llvm::MachineFunction &MF = createVoidVoidPtrMachineFunction( + std::unique_ptr MMIWP = + std::make_unique(&LLVMTM); + MachineFunction &MF = createVoidVoidPtrMachineFunction( FunctionID, Module.get(), &MMIWP.get()->getMMI()); // Saving reserved registers for client. return MF.getSubtarget().getRegisterInfo()->getReservedRegs(MF); } void assembleToStream(const ExegesisTarget &ET, - std::unique_ptr TM, - llvm::ArrayRef LiveIns, - llvm::ArrayRef RegisterInitialValues, - const FillFunction &Fill, - llvm::raw_pwrite_stream &AsmStream) { - std::unique_ptr Context = - std::make_unique(); - std::unique_ptr Module = + std::unique_ptr TM, + ArrayRef LiveIns, + ArrayRef RegisterInitialValues, + const FillFunction &Fill, raw_pwrite_stream &AsmStream) { + std::unique_ptr Context = std::make_unique(); + std::unique_ptr Module = createModule(Context, TM->createDataLayout()); - std::unique_ptr MMIWP = - std::make_unique(TM.get()); - llvm::MachineFunction &MF = createVoidVoidPtrMachineFunction( + std::unique_ptr MMIWP = + std::make_unique(TM.get()); + MachineFunction &MF = createVoidVoidPtrMachineFunction( FunctionID, Module.get(), &MMIWP.get()->getMMI()); // We need to instruct the passes that we're done with SSA and virtual // registers. auto &Properties = MF.getProperties(); - Properties.set(llvm::MachineFunctionProperties::Property::NoVRegs); - Properties.reset(llvm::MachineFunctionProperties::Property::IsSSA); - Properties.set(llvm::MachineFunctionProperties::Property::NoPHIs); + Properties.set(MachineFunctionProperties::Property::NoVRegs); + Properties.reset(MachineFunctionProperties::Property::IsSSA); + Properties.set(MachineFunctionProperties::Property::NoPHIs); for (const unsigned Reg : LiveIns) MF.getRegInfo().addLiveIn(Reg); @@ -212,7 +202,7 @@ void assembleToStream(const ExegesisTarg // If the snippet setup is not complete, we disable liveliness tracking. 
This // means that we won't know what values are in the registers. if (!IsSnippetSetupComplete) - Properties.reset(llvm::MachineFunctionProperties::Property::TracksLiveness); + Properties.reset(MachineFunctionProperties::Property::TracksLiveness); Fill(Sink); @@ -221,13 +211,13 @@ void assembleToStream(const ExegesisTarg MF.getRegInfo().freezeReservedRegs(MF); // We create the pass manager, run the passes to populate AsmBuffer. - llvm::MCContext &MCContext = MMIWP->getMMI().getContext(); - llvm::legacy::PassManager PM; + MCContext &MCContext = MMIWP->getMMI().getContext(); + legacy::PassManager PM; - llvm::TargetLibraryInfoImpl TLII(llvm::Triple(Module->getTargetTriple())); - PM.add(new llvm::TargetLibraryInfoWrapperPass(TLII)); + TargetLibraryInfoImpl TLII(Triple(Module->getTargetTriple())); + PM.add(new TargetLibraryInfoWrapperPass(TLII)); - llvm::TargetPassConfig *TPC = TM->createPassConfig(PM); + TargetPassConfig *TPC = TM->createPassConfig(PM); PM.add(TPC); PM.add(MMIWP.release()); TPC->printAndVerify("MachineFunctionGenerator::assemble"); @@ -239,50 +229,49 @@ void assembleToStream(const ExegesisTarg // - prologepilog: saves and restore callee saved registers. for (const char *PassName : {"machineverifier", "prologepilog"}) if (addPass(PM, PassName, *TPC)) - llvm::report_fatal_error("Unable to add a mandatory pass"); + report_fatal_error("Unable to add a mandatory pass"); TPC->setInitialized(); // AsmPrinter is responsible for generating the assembly into AsmBuffer. 
- if (TM->addAsmPrinter(PM, AsmStream, nullptr, - llvm::TargetMachine::CGFT_ObjectFile, MCContext)) - llvm::report_fatal_error("Cannot add AsmPrinter passes"); + if (TM->addAsmPrinter(PM, AsmStream, nullptr, TargetMachine::CGFT_ObjectFile, + MCContext)) + report_fatal_error("Cannot add AsmPrinter passes"); PM.run(*Module); // Run all the passes } -llvm::object::OwningBinary -getObjectFromBuffer(llvm::StringRef InputData) { +object::OwningBinary +getObjectFromBuffer(StringRef InputData) { // Storing the generated assembly into a MemoryBuffer that owns the memory. - std::unique_ptr Buffer = - llvm::MemoryBuffer::getMemBufferCopy(InputData); + std::unique_ptr Buffer = + MemoryBuffer::getMemBufferCopy(InputData); // Create the ObjectFile from the MemoryBuffer. - std::unique_ptr Obj = llvm::cantFail( - llvm::object::ObjectFile::createObjectFile(Buffer->getMemBufferRef())); + std::unique_ptr Obj = + cantFail(object::ObjectFile::createObjectFile(Buffer->getMemBufferRef())); // Returning both the MemoryBuffer and the ObjectFile. - return llvm::object::OwningBinary( - std::move(Obj), std::move(Buffer)); + return object::OwningBinary(std::move(Obj), + std::move(Buffer)); } -llvm::object::OwningBinary -getObjectFromFile(llvm::StringRef Filename) { - return llvm::cantFail(llvm::object::ObjectFile::createObjectFile(Filename)); +object::OwningBinary getObjectFromFile(StringRef Filename) { + return cantFail(object::ObjectFile::createObjectFile(Filename)); } namespace { // Implementation of this class relies on the fact that a single object with a // single function will be loaded into memory. 
-class TrackingSectionMemoryManager : public llvm::SectionMemoryManager { +class TrackingSectionMemoryManager : public SectionMemoryManager { public: explicit TrackingSectionMemoryManager(uintptr_t *CodeSize) : CodeSize(CodeSize) {} uint8_t *allocateCodeSection(uintptr_t Size, unsigned Alignment, unsigned SectionID, - llvm::StringRef SectionName) override { + StringRef SectionName) override { *CodeSize = Size; - return llvm::SectionMemoryManager::allocateCodeSection( - Size, Alignment, SectionID, SectionName); + return SectionMemoryManager::allocateCodeSection(Size, Alignment, SectionID, + SectionName); } private: @@ -292,9 +281,9 @@ private: } // namespace ExecutableFunction::ExecutableFunction( - std::unique_ptr TM, - llvm::object::OwningBinary &&ObjectFileHolder) - : Context(std::make_unique()) { + std::unique_ptr TM, + object::OwningBinary &&ObjectFileHolder) + : Context(std::make_unique()) { assert(ObjectFileHolder.getBinary() && "cannot create object file"); // Initializing the execution engine. // We need to use the JIT EngineKind to be able to add an object file. @@ -302,24 +291,23 @@ ExecutableFunction::ExecutableFunction( uintptr_t CodeSize = 0; std::string Error; ExecEngine.reset( - llvm::EngineBuilder(createModule(Context, TM->createDataLayout())) + EngineBuilder(createModule(Context, TM->createDataLayout())) .setErrorStr(&Error) .setMCPU(TM->getTargetCPU()) - .setEngineKind(llvm::EngineKind::JIT) + .setEngineKind(EngineKind::JIT) .setMCJITMemoryManager( std::make_unique(&CodeSize)) .create(TM.release())); if (!ExecEngine) - llvm::report_fatal_error(Error); + report_fatal_error(Error); // Adding the generated object file containing the assembled function. // The ExecutionEngine makes sure the object file is copied into an // executable page. ExecEngine->addObjectFile(std::move(ObjectFileHolder)); // Fetching function bytes. 
- FunctionBytes = - llvm::StringRef(reinterpret_cast( - ExecEngine->getFunctionAddress(FunctionID)), - CodeSize); + FunctionBytes = StringRef(reinterpret_cast( + ExecEngine->getFunctionAddress(FunctionID)), + CodeSize); } } // namespace exegesis Modified: llvm/trunk/tools/llvm-exegesis/lib/Assembler.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-exegesis/lib/Assembler.h?rev=374158&r1=374157&r2=374158&view=diff ============================================================================== --- llvm/trunk/tools/llvm-exegesis/lib/Assembler.h (original) +++ llvm/trunk/tools/llvm-exegesis/lib/Assembler.h Wed Oct 9 04:58:42 2019 @@ -38,7 +38,7 @@ class ExegesisTarget; // Gather the set of reserved registers (depends on function's calling // convention and target machine). -llvm::BitVector getFunctionReservedRegs(const llvm::TargetMachine &TM); +BitVector getFunctionReservedRegs(const TargetMachine &TM); // Helper to fill in a basic block. class BasicBlockFiller { @@ -47,8 +47,7 @@ public: const MCInstrInfo *MCII); void addInstruction(const MCInst &Inst, const DebugLoc &DL = DebugLoc()); - void addInstructions(ArrayRef Insts, - const DebugLoc &DL = DebugLoc()); + void addInstructions(ArrayRef Insts, const DebugLoc &DL = DebugLoc()); void addReturn(const DebugLoc &DL = DebugLoc()); @@ -88,47 +87,43 @@ using FillFunction = std::function TM, - llvm::ArrayRef LiveIns, - llvm::ArrayRef RegisterInitialValues, - const FillFunction &Fill, - llvm::raw_pwrite_stream &AsmStream); + std::unique_ptr TM, + ArrayRef LiveIns, + ArrayRef RegisterInitialValues, + const FillFunction &Fill, raw_pwrite_stream &AsmStream); // Creates an ObjectFile in the format understood by the host. // Note: the resulting object keeps a copy of Buffer so it can be discarded once // this function returns. 
-llvm::object::OwningBinary -getObjectFromBuffer(llvm::StringRef Buffer); +object::OwningBinary getObjectFromBuffer(StringRef Buffer); // Loads the content of Filename as an ObjectFile and returns it. -llvm::object::OwningBinary -getObjectFromFile(llvm::StringRef Filename); +object::OwningBinary getObjectFromFile(StringRef Filename); // Consumes an ObjectFile containing a `void foo(char*)` function and makes it // executable. struct ExecutableFunction { explicit ExecutableFunction( - std::unique_ptr TM, - llvm::object::OwningBinary &&ObjectFileHolder); + std::unique_ptr TM, + object::OwningBinary &&ObjectFileHolder); // Retrieves the function as an array of bytes. - llvm::StringRef getFunctionBytes() const { return FunctionBytes; } + StringRef getFunctionBytes() const { return FunctionBytes; } // Executes the function. void operator()(char *Memory) const { ((void (*)(char *))(intptr_t)FunctionBytes.data())(Memory); } - std::unique_ptr Context; - std::unique_ptr ExecEngine; - llvm::StringRef FunctionBytes; + std::unique_ptr Context; + std::unique_ptr ExecEngine; + StringRef FunctionBytes; }; // Creates a void(int8*) MachineFunction.
-llvm::MachineFunction & -createVoidVoidPtrMachineFunction(llvm::StringRef FunctionID, - llvm::Module *Module, - llvm::MachineModuleInfo *MMI); +MachineFunction &createVoidVoidPtrMachineFunction(StringRef FunctionID, + Module *Module, + MachineModuleInfo *MMI); } // namespace exegesis } // namespace llvm Modified: llvm/trunk/tools/llvm-exegesis/lib/BenchmarkResult.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-exegesis/lib/BenchmarkResult.cpp?rev=374158&r1=374157&r2=374158&view=diff ============================================================================== --- llvm/trunk/tools/llvm-exegesis/lib/BenchmarkResult.cpp (original) +++ llvm/trunk/tools/llvm-exegesis/lib/BenchmarkResult.cpp Wed Oct 9 04:58:42 2019 @@ -38,18 +38,18 @@ struct YamlContext { generateOpcodeNameToOpcodeIdxMapping(State.getInstrInfo())), RegNameToRegNo(generateRegNameToRegNoMapping(State.getRegInfo())) {} - static llvm::StringMap - generateOpcodeNameToOpcodeIdxMapping(const llvm::MCInstrInfo &InstrInfo) { - llvm::StringMap Map(InstrInfo.getNumOpcodes()); + static StringMap + generateOpcodeNameToOpcodeIdxMapping(const MCInstrInfo &InstrInfo) { + StringMap Map(InstrInfo.getNumOpcodes()); for (unsigned I = 0, E = InstrInfo.getNumOpcodes(); I < E; ++I) Map[InstrInfo.getName(I)] = I; assert(Map.size() == InstrInfo.getNumOpcodes() && "Size prediction failed"); return Map; }; - llvm::StringMap - generateRegNameToRegNoMapping(const llvm::MCRegisterInfo &RegInfo) { - llvm::StringMap Map(RegInfo.getNumRegs()); + StringMap + generateRegNameToRegNoMapping(const MCRegisterInfo &RegInfo) { + StringMap Map(RegInfo.getNumRegs()); // Special-case RegNo 0, which would otherwise be spelled as ''. 
Map[kNoRegister] = 0; for (unsigned I = 1, E = RegInfo.getNumRegs(); I < E; ++I) @@ -58,7 +58,7 @@ struct YamlContext { return Map; }; - void serializeMCInst(const llvm::MCInst &MCInst, llvm::raw_ostream &OS) { + void serializeMCInst(const MCInst &MCInst, raw_ostream &OS) { OS << getInstrName(MCInst.getOpcode()); for (const auto &Op : MCInst) { OS << ' '; @@ -66,15 +66,15 @@ struct YamlContext { } } - void deserializeMCInst(llvm::StringRef String, llvm::MCInst &Value) { - llvm::SmallVector Pieces; + void deserializeMCInst(StringRef String, MCInst &Value) { + SmallVector Pieces; String.split(Pieces, " ", /* MaxSplit */ -1, /* KeepEmpty */ false); if (Pieces.empty()) { ErrorStream << "Unknown Instruction: '" << String << "'\n"; return; } bool ProcessOpcode = true; - for (llvm::StringRef Piece : Pieces) { + for (StringRef Piece : Pieces) { if (ProcessOpcode) Value.setOpcode(getInstrOpcode(Piece)); else @@ -85,43 +85,43 @@ struct YamlContext { std::string &getLastError() { return ErrorStream.str(); } - llvm::raw_string_ostream &getErrorStream() { return ErrorStream; } + raw_string_ostream &getErrorStream() { return ErrorStream; } - llvm::StringRef getRegName(unsigned RegNo) { + StringRef getRegName(unsigned RegNo) { // Special case: RegNo 0 is NoRegister. We have to deal with it explicitly. 
if (RegNo == 0) return kNoRegister; - const llvm::StringRef RegName = State->getRegInfo().getName(RegNo); + const StringRef RegName = State->getRegInfo().getName(RegNo); if (RegName.empty()) ErrorStream << "No register with enum value '" << RegNo << "'\n"; return RegName; } - llvm::Optional getRegNo(llvm::StringRef RegName) { + Optional getRegNo(StringRef RegName) { auto Iter = RegNameToRegNo.find(RegName); if (Iter != RegNameToRegNo.end()) return Iter->second; ErrorStream << "No register with name '" << RegName << "'\n"; - return llvm::None; + return None; } private: - void serializeIntegerOperand(llvm::raw_ostream &OS, int64_t Value) { + void serializeIntegerOperand(raw_ostream &OS, int64_t Value) { OS << kIntegerPrefix; - OS.write_hex(llvm::bit_cast(Value)); + OS.write_hex(bit_cast(Value)); } - bool tryDeserializeIntegerOperand(llvm::StringRef String, int64_t &Value) { + bool tryDeserializeIntegerOperand(StringRef String, int64_t &Value) { if (!String.consume_front(kIntegerPrefix)) return false; return !String.consumeInteger(16, Value); } - void serializeFPOperand(llvm::raw_ostream &OS, double Value) { - OS << kDoublePrefix << llvm::format("%la", Value); + void serializeFPOperand(raw_ostream &OS, double Value) { + OS << kDoublePrefix << format("%la", Value); } - bool tryDeserializeFPOperand(llvm::StringRef String, double &Value) { + bool tryDeserializeFPOperand(StringRef String, double &Value) { if (!String.consume_front(kDoublePrefix)) return false; char *EndPointer = nullptr; @@ -129,8 +129,7 @@ private: return EndPointer == String.end(); } - void serializeMCOperand(const llvm::MCOperand &MCOperand, - llvm::raw_ostream &OS) { + void serializeMCOperand(const MCOperand &MCOperand, raw_ostream &OS) { if (MCOperand.isReg()) { OS << getRegName(MCOperand.getReg()); } else if (MCOperand.isImm()) { @@ -142,29 +141,29 @@ private: } } - llvm::MCOperand deserializeMCOperand(llvm::StringRef String) { + MCOperand deserializeMCOperand(StringRef String) { 
assert(!String.empty()); int64_t IntValue = 0; double DoubleValue = 0; if (tryDeserializeIntegerOperand(String, IntValue)) - return llvm::MCOperand::createImm(IntValue); + return MCOperand::createImm(IntValue); if (tryDeserializeFPOperand(String, DoubleValue)) - return llvm::MCOperand::createFPImm(DoubleValue); + return MCOperand::createFPImm(DoubleValue); if (auto RegNo = getRegNo(String)) - return llvm::MCOperand::createReg(*RegNo); + return MCOperand::createReg(*RegNo); if (String != kInvalidOperand) ErrorStream << "Unknown Operand: '" << String << "'\n"; return {}; } - llvm::StringRef getInstrName(unsigned InstrNo) { - const llvm::StringRef InstrName = State->getInstrInfo().getName(InstrNo); + StringRef getInstrName(unsigned InstrNo) { + const StringRef InstrName = State->getInstrInfo().getName(InstrNo); if (InstrName.empty()) ErrorStream << "No opcode with enum value '" << InstrNo << "'\n"; return InstrName; } - unsigned getInstrOpcode(llvm::StringRef InstrName) { + unsigned getInstrOpcode(StringRef InstrName) { auto Iter = OpcodeNameToOpcodeIdx.find(InstrName); if (Iter != OpcodeNameToOpcodeIdx.end()) return Iter->second; @@ -172,11 +171,11 @@ private: return 0; } - const llvm::exegesis::LLVMState *State; + const exegesis::LLVMState *State; std::string LastError; - llvm::raw_string_ostream ErrorStream; - const llvm::StringMap OpcodeNameToOpcodeIdx; - const llvm::StringMap RegNameToRegNo; + raw_string_ostream ErrorStream; + const StringMap OpcodeNameToOpcodeIdx; + const StringMap RegNameToRegNo; }; } // namespace @@ -187,19 +186,18 @@ static YamlContext &getTypedContext(void return *reinterpret_cast(Ctx); } -// std::vector will be rendered as a list. -template <> struct SequenceElementTraits { +// std::vector will be rendered as a list. 
+template <> struct SequenceElementTraits { static const bool flow = false; }; -template <> struct ScalarTraits { +template <> struct ScalarTraits { - static void output(const llvm::MCInst &Value, void *Ctx, - llvm::raw_ostream &Out) { + static void output(const MCInst &Value, void *Ctx, raw_ostream &Out) { getTypedContext(Ctx).serializeMCInst(Value, Out); } - static StringRef input(StringRef Scalar, void *Ctx, llvm::MCInst &Value) { + static StringRef input(StringRef Scalar, void *Ctx, MCInst &Value) { YamlContext &Context = getTypedContext(Ctx); Context.deserializeMCInst(Scalar, Value); return Context.getLastError(); @@ -254,7 +252,7 @@ template <> struct ScalarTraits struct ScalarTraits Pieces; + SmallVector Pieces; String.split(Pieces, "=0x", /* MaxSplit */ -1, /* KeepEmpty */ false); YamlContext &Context = getTypedContext(Ctx); - llvm::Optional RegNo; + Optional RegNo; if (Pieces.size() == 2 && (RegNo = Context.getRegNo(Pieces[0]))) { RV.Register = *RegNo; - const unsigned BitsNeeded = llvm::APInt::getBitsNeeded(Pieces[1], kRadix); - RV.Value = llvm::APInt(BitsNeeded, Pieces[1], kRadix); + const unsigned BitsNeeded = APInt::getBitsNeeded(Pieces[1], kRadix); + RV.Value = APInt(BitsNeeded, Pieces[1], kRadix); } else { Context.getErrorStream() << "Unknown initial register value: '" << String << "'"; @@ -333,16 +331,15 @@ struct MappingContextTraits -InstructionBenchmark::readYaml(const LLVMState &State, - llvm::StringRef Filename) { +Expected +InstructionBenchmark::readYaml(const LLVMState &State, StringRef Filename) { if (auto ExpectedMemoryBuffer = - llvm::errorOrToExpected(llvm::MemoryBuffer::getFile(Filename))) { - llvm::yaml::Input Yin(*ExpectedMemoryBuffer.get()); + errorOrToExpected(MemoryBuffer::getFile(Filename))) { + yaml::Input Yin(*ExpectedMemoryBuffer.get()); YamlContext Context(State); InstructionBenchmark Benchmark; if (Yin.setCurrentDocument()) - llvm::yaml::yamlize(Yin, Benchmark, /*unused*/ true, Context); + yaml::yamlize(Yin, Benchmark, 
/*unused*/ true, Context); if (!Context.getLastError().empty()) return make_error(Context.getLastError()); return Benchmark; @@ -351,19 +348,18 @@ InstructionBenchmark::readYaml(const LLV } } -llvm::Expected> -InstructionBenchmark::readYamls(const LLVMState &State, - llvm::StringRef Filename) { +Expected> +InstructionBenchmark::readYamls(const LLVMState &State, StringRef Filename) { if (auto ExpectedMemoryBuffer = - llvm::errorOrToExpected(llvm::MemoryBuffer::getFile(Filename))) { - llvm::yaml::Input Yin(*ExpectedMemoryBuffer.get()); + errorOrToExpected(MemoryBuffer::getFile(Filename))) { + yaml::Input Yin(*ExpectedMemoryBuffer.get()); YamlContext Context(State); std::vector Benchmarks; while (Yin.setCurrentDocument()) { Benchmarks.emplace_back(); yamlize(Yin, Benchmarks.back(), /*unused*/ true, Context); if (Yin.error()) - return llvm::errorCodeToError(Yin.error()); + return errorCodeToError(Yin.error()); if (!Context.getLastError().empty()) return make_error(Context.getLastError()); Yin.nextDocument(); @@ -374,47 +370,46 @@ InstructionBenchmark::readYamls(const LL } } -llvm::Error InstructionBenchmark::writeYamlTo(const LLVMState &State, - llvm::raw_ostream &OS) { +Error InstructionBenchmark::writeYamlTo(const LLVMState &State, + raw_ostream &OS) { auto Cleanup = make_scope_exit([&] { OS.flush(); }); - llvm::yaml::Output Yout(OS, nullptr /*Ctx*/, 200 /*WrapColumn*/); + yaml::Output Yout(OS, nullptr /*Ctx*/, 200 /*WrapColumn*/); YamlContext Context(State); Yout.beginDocuments(); - llvm::yaml::yamlize(Yout, *this, /*unused*/ true, Context); + yaml::yamlize(Yout, *this, /*unused*/ true, Context); if (!Context.getLastError().empty()) return make_error(Context.getLastError()); Yout.endDocuments(); return Error::success(); } -llvm::Error InstructionBenchmark::readYamlFrom(const LLVMState &State, - llvm::StringRef InputContent) { - llvm::yaml::Input Yin(InputContent); +Error InstructionBenchmark::readYamlFrom(const LLVMState &State, + StringRef InputContent) { + 
yaml::Input Yin(InputContent); YamlContext Context(State); if (Yin.setCurrentDocument()) - llvm::yaml::yamlize(Yin, *this, /*unused*/ true, Context); + yaml::yamlize(Yin, *this, /*unused*/ true, Context); if (!Context.getLastError().empty()) return make_error(Context.getLastError()); return Error::success(); } -llvm::Error InstructionBenchmark::writeYaml(const LLVMState &State, - const llvm::StringRef Filename) { +Error InstructionBenchmark::writeYaml(const LLVMState &State, + const StringRef Filename) { if (Filename == "-") { - if (auto Err = writeYamlTo(State, llvm::outs())) + if (auto Err = writeYamlTo(State, outs())) return Err; } else { int ResultFD = 0; - if (auto E = llvm::errorCodeToError( - openFileForWrite(Filename, ResultFD, llvm::sys::fs::CD_CreateAlways, - llvm::sys::fs::OF_Text))) { + if (auto E = errorCodeToError(openFileForWrite( + Filename, ResultFD, sys::fs::CD_CreateAlways, sys::fs::OF_Text))) { return E; } - llvm::raw_fd_ostream Ostr(ResultFD, true /*shouldClose*/); + raw_fd_ostream Ostr(ResultFD, true /*shouldClose*/); if (auto Err = writeYamlTo(State, Ostr)) return Err; } - return llvm::Error::success(); + return Error::success(); } void PerInstructionStats::push(const BenchmarkMeasure &BM) { Modified: llvm/trunk/tools/llvm-exegesis/lib/BenchmarkResult.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-exegesis/lib/BenchmarkResult.h?rev=374158&r1=374157&r2=374158&view=diff ============================================================================== --- llvm/trunk/tools/llvm-exegesis/lib/BenchmarkResult.h (original) +++ llvm/trunk/tools/llvm-exegesis/lib/BenchmarkResult.h Wed Oct 9 04:58:42 2019 @@ -32,7 +32,7 @@ namespace exegesis { struct InstructionBenchmarkKey { // The LLVM opcode name. - std::vector Instructions; + std::vector Instructions; // The initial values of the registers. 
std::vector RegisterInitialValues; // An opaque configuration, that can be used to separate several benchmarks of @@ -62,7 +62,7 @@ struct InstructionBenchmark { std::string CpuName; std::string LLVMTriple; // Which instruction is being benchmarked here? - const llvm::MCInst &keyInstruction() const { return Key.Instructions[0]; } + const MCInst &keyInstruction() const { return Key.Instructions[0]; } // The number of instructions inside the repeated snippet. For example, if a // snippet of 3 instructions is repeated 4 times, this is 12. int NumRepetitions = 0; @@ -75,19 +75,18 @@ struct InstructionBenchmark { std::vector AssembledSnippet; // Read functions. - static llvm::Expected - readYaml(const LLVMState &State, llvm::StringRef Filename); + static Expected readYaml(const LLVMState &State, + StringRef Filename); - static llvm::Expected> - readYamls(const LLVMState &State, llvm::StringRef Filename); + static Expected> + readYamls(const LLVMState &State, StringRef Filename); - llvm::Error readYamlFrom(const LLVMState &State, - llvm::StringRef InputContent); + class Error readYamlFrom(const LLVMState &State, StringRef InputContent); // Write functions, non-const because of YAML traits. 
- llvm::Error writeYamlTo(const LLVMState &State, llvm::raw_ostream &S); + class Error writeYamlTo(const LLVMState &State, raw_ostream &S); - llvm::Error writeYaml(const LLVMState &State, const llvm::StringRef Filename); + class Error writeYaml(const LLVMState &State, const StringRef Filename); }; //------------------------------------------------------------------------------ Modified: llvm/trunk/tools/llvm-exegesis/lib/BenchmarkRunner.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-exegesis/lib/BenchmarkRunner.cpp?rev=374158&r1=374157&r2=374158&view=diff ============================================================================== --- llvm/trunk/tools/llvm-exegesis/lib/BenchmarkRunner.cpp (original) +++ llvm/trunk/tools/llvm-exegesis/lib/BenchmarkRunner.cpp Wed Oct 9 04:58:42 2019 @@ -35,37 +35,36 @@ namespace { class FunctionExecutorImpl : public BenchmarkRunner::FunctionExecutor { public: FunctionExecutorImpl(const LLVMState &State, - llvm::object::OwningBinary Obj, + object::OwningBinary Obj, BenchmarkRunner::ScratchSpace *Scratch) : Function(State.createTargetMachine(), std::move(Obj)), Scratch(Scratch) {} private: - llvm::Expected runAndMeasure(const char *Counters) const override { + Expected runAndMeasure(const char *Counters) const override { // We sum counts when there are several counters for a single ProcRes // (e.g. P23 on SandyBridge). 
int64_t CounterValue = 0; - llvm::SmallVector CounterNames; - llvm::StringRef(Counters).split(CounterNames, '+'); + SmallVector CounterNames; + StringRef(Counters).split(CounterNames, '+'); char *const ScratchPtr = Scratch->ptr(); for (auto &CounterName : CounterNames) { CounterName = CounterName.trim(); pfm::PerfEvent PerfEvent(CounterName); if (!PerfEvent.valid()) - llvm::report_fatal_error(llvm::Twine("invalid perf event '") - .concat(CounterName) - .concat("'")); + report_fatal_error( + Twine("invalid perf event '").concat(CounterName).concat("'")); pfm::Counter Counter(PerfEvent); Scratch->clear(); { - llvm::CrashRecoveryContext CRC; - llvm::CrashRecoveryContext::Enable(); + CrashRecoveryContext CRC; + CrashRecoveryContext::Enable(); const bool Crashed = !CRC.RunSafely([this, &Counter, ScratchPtr]() { Counter.start(); this->Function(ScratchPtr); Counter.stop(); }); - llvm::CrashRecoveryContext::Disable(); + CrashRecoveryContext::Disable(); // FIXME: Better diagnosis. if (Crashed) return make_error("snippet crashed while running"); @@ -91,7 +90,7 @@ InstructionBenchmark BenchmarkRunner::ru InstrBenchmark.NumRepetitions = NumRepetitions; InstrBenchmark.Info = BC.Info; - const std::vector &Instructions = BC.Key.Instructions; + const std::vector &Instructions = BC.Key.Instructions; InstrBenchmark.Key = BC.Key; @@ -100,8 +99,8 @@ InstructionBenchmark BenchmarkRunner::ru // that the inside instructions are repeated. 
constexpr const int kMinInstructionsForSnippet = 16; { - llvm::SmallString<0> Buffer; - llvm::raw_svector_ostream OS(Buffer); + SmallString<0> Buffer; + raw_svector_ostream OS(Buffer); assembleToStream(State.getExegesisTarget(), State.createTargetMachine(), BC.LiveIns, BC.Key.RegisterInitialValues, Repetitor.Repeat(Instructions, kMinInstructionsForSnippet), @@ -117,19 +116,19 @@ InstructionBenchmark BenchmarkRunner::ru const auto Filler = Repetitor.Repeat(Instructions, InstrBenchmark.NumRepetitions); - llvm::object::OwningBinary ObjectFile; + object::OwningBinary ObjectFile; if (DumpObjectToDisk) { auto ObjectFilePath = writeObjectFile(BC, Filler); - if (llvm::Error E = ObjectFilePath.takeError()) { - InstrBenchmark.Error = llvm::toString(std::move(E)); + if (Error E = ObjectFilePath.takeError()) { + InstrBenchmark.Error = toString(std::move(E)); return InstrBenchmark; } - llvm::outs() << "Check generated assembly with: /usr/bin/objdump -d " - << *ObjectFilePath << "\n"; + outs() << "Check generated assembly with: /usr/bin/objdump -d " + << *ObjectFilePath << "\n"; ObjectFile = getObjectFromFile(*ObjectFilePath); } else { - llvm::SmallString<0> Buffer; - llvm::raw_svector_ostream OS(Buffer); + SmallString<0> Buffer; + raw_svector_ostream OS(Buffer); assembleToStream(State.getExegesisTarget(), State.createTargetMachine(), BC.LiveIns, BC.Key.RegisterInitialValues, Filler, OS); ObjectFile = getObjectFromBuffer(OS.str()); @@ -138,8 +137,8 @@ InstructionBenchmark BenchmarkRunner::ru const FunctionExecutorImpl Executor(State, std::move(ObjectFile), Scratch.get()); auto Measurements = runMeasurements(Executor); - if (llvm::Error E = Measurements.takeError()) { - InstrBenchmark.Error = llvm::toString(std::move(E)); + if (Error E = Measurements.takeError()) { + InstrBenchmark.Error = toString(std::move(E)); return InstrBenchmark; } InstrBenchmark.Measurements = std::move(*Measurements); @@ -155,15 +154,15 @@ InstructionBenchmark BenchmarkRunner::ru return InstrBenchmark; } 
-llvm::Expected +Expected BenchmarkRunner::writeObjectFile(const BenchmarkCode &BC, const FillFunction &FillFunction) const { int ResultFD = 0; - llvm::SmallString<256> ResultPath; - if (llvm::Error E = llvm::errorCodeToError(llvm::sys::fs::createTemporaryFile( - "snippet", "o", ResultFD, ResultPath))) + SmallString<256> ResultPath; + if (Error E = errorCodeToError( + sys::fs::createTemporaryFile("snippet", "o", ResultFD, ResultPath))) return std::move(E); - llvm::raw_fd_ostream OFS(ResultFD, true /*ShouldClose*/); + raw_fd_ostream OFS(ResultFD, true /*ShouldClose*/); assembleToStream(State.getExegesisTarget(), State.createTargetMachine(), BC.LiveIns, BC.Key.RegisterInitialValues, FillFunction, OFS); return ResultPath.str(); Modified: llvm/trunk/tools/llvm-exegesis/lib/BenchmarkRunner.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-exegesis/lib/BenchmarkRunner.h?rev=374158&r1=374157&r2=374158&view=diff ============================================================================== --- llvm/trunk/tools/llvm-exegesis/lib/BenchmarkRunner.h (original) +++ llvm/trunk/tools/llvm-exegesis/lib/BenchmarkRunner.h Wed Oct 9 04:58:42 2019 @@ -65,8 +65,7 @@ public: class FunctionExecutor { public: virtual ~FunctionExecutor(); - virtual llvm::Expected - runAndMeasure(const char *Counters) const = 0; + virtual Expected runAndMeasure(const char *Counters) const = 0; }; protected: @@ -74,12 +73,11 @@ protected: const InstructionBenchmark::ModeE Mode; private: - virtual llvm::Expected> + virtual Expected> runMeasurements(const FunctionExecutor &Executor) const = 0; - llvm::Expected - writeObjectFile(const BenchmarkCode &Configuration, - const FillFunction &Fill) const; + Expected writeObjectFile(const BenchmarkCode &Configuration, + const FillFunction &Fill) const; const std::unique_ptr Scratch; }; Modified: llvm/trunk/tools/llvm-exegesis/lib/Clustering.cpp URL: 
http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-exegesis/lib/Clustering.cpp?rev=374158&r1=374157&r2=374158&view=diff ============================================================================== --- llvm/trunk/tools/llvm-exegesis/lib/Clustering.cpp (original) +++ llvm/trunk/tools/llvm-exegesis/lib/Clustering.cpp Wed Oct 9 04:58:42 2019 @@ -59,7 +59,7 @@ bool InstructionBenchmarkClustering::are ArrayRef Pts) const { // First, get the centroid of this group of points. This is O(N). SchedClassClusterCentroid G; - llvm::for_each(Pts, [this, &G](size_t P) { + for_each(Pts, [this, &G](size_t P) { assert(P < Points_.size()); ArrayRef Measurements = Points_[P].Measurements; if (Measurements.empty()) // Error point. @@ -73,7 +73,7 @@ bool InstructionBenchmarkClustering::are AnalysisClusteringEpsilonSquared_ / 4.0; // And now check that every point is a neighbour of the centroid. Also O(N). - return llvm::all_of( + return all_of( Pts, [this, &Centroid, AnalysisClusteringEpsilonHalvedSquared](size_t P) { assert(P < Points_.size()); const auto &PMeasurements = Points_[P].Measurements; @@ -91,7 +91,7 @@ InstructionBenchmarkClustering::Instruct AnalysisClusteringEpsilonSquared_(AnalysisClusteringEpsilonSquared), NoiseCluster_(ClusterId::noise()), ErrorCluster_(ClusterId::error()) {} -llvm::Error InstructionBenchmarkClustering::validateAndSetup() { +Error InstructionBenchmarkClustering::validateAndSetup() { ClusterIdForPoint_.resize(Points_.size()); // Mark erroneous measurements out. // All points must have the same number of dimensions, in the same order. 
@@ -106,15 +106,14 @@ llvm::Error InstructionBenchmarkClusteri const auto *CurMeasurement = &Point.Measurements; if (LastMeasurement) { if (LastMeasurement->size() != CurMeasurement->size()) { - return llvm::make_error( - "inconsistent measurement dimensions", - llvm::inconvertibleErrorCode()); + return make_error("inconsistent measurement dimensions", + inconvertibleErrorCode()); } for (size_t I = 0, E = LastMeasurement->size(); I < E; ++I) { if (LastMeasurement->at(I).Key != CurMeasurement->at(I).Key) { - return llvm::make_error( + return make_error( "inconsistent measurement dimensions keys", - llvm::inconvertibleErrorCode()); + inconvertibleErrorCode()); } } } @@ -123,7 +122,7 @@ llvm::Error InstructionBenchmarkClusteri if (LastMeasurement) { NumDimensions_ = LastMeasurement->size(); } - return llvm::Error::success(); + return Error::success(); } void InstructionBenchmarkClustering::clusterizeDbScan(const size_t MinPts) { @@ -146,7 +145,7 @@ void InstructionBenchmarkClustering::clu CurrentCluster.PointIndices.push_back(P); // Process P's neighbors. - llvm::SetVector> ToProcess; + SetVector> ToProcess; ToProcess.insert(Neighbors.begin(), Neighbors.end()); while (!ToProcess.empty()) { // Retrieve a point from the set. @@ -185,14 +184,14 @@ void InstructionBenchmarkClustering::clu void InstructionBenchmarkClustering::clusterizeNaive(unsigned NumOpcodes) { // Given an instruction Opcode, which are the benchmarks of this instruction? 
- std::vector> OpcodeToPoints; + std::vector> OpcodeToPoints; OpcodeToPoints.resize(NumOpcodes); size_t NumOpcodesSeen = 0; for (size_t P = 0, NumPoints = Points_.size(); P < NumPoints; ++P) { const InstructionBenchmark &Point = Points_[P]; const unsigned Opcode = Point.keyInstruction().getOpcode(); assert(Opcode < NumOpcodes && "NumOpcodes is incorrect (too small)"); - llvm::SmallVectorImpl &PointsOfOpcode = OpcodeToPoints[Opcode]; + SmallVectorImpl &PointsOfOpcode = OpcodeToPoints[Opcode]; if (PointsOfOpcode.empty()) // If we previously have not seen any points of ++NumOpcodesSeen; // this opcode, then naturally this is the new opcode. PointsOfOpcode.emplace_back(P); @@ -204,16 +203,16 @@ void InstructionBenchmarkClustering::clu "can't see more opcodes than there are total points"); Clusters_.reserve(NumOpcodesSeen); // One cluster per opcode. - for (ArrayRef PointsOfOpcode : llvm::make_filter_range( - OpcodeToPoints, [](ArrayRef PointsOfOpcode) { - return !PointsOfOpcode.empty(); // Ignore opcodes with no points. - })) { + for (ArrayRef PointsOfOpcode : + make_filter_range(OpcodeToPoints, [](ArrayRef PointsOfOpcode) { + return !PointsOfOpcode.empty(); // Ignore opcodes with no points. + })) { // Create a new cluster. Clusters_.emplace_back(ClusterId::makeValid( Clusters_.size(), /*IsUnstable=*/!areAllNeighbours(PointsOfOpcode))); Cluster &CurrentCluster = Clusters_.back(); // Mark points as belonging to the new cluster. - llvm::for_each(PointsOfOpcode, [this, &CurrentCluster](size_t P) { + for_each(PointsOfOpcode, [this, &CurrentCluster](size_t P) { ClusterIdForPoint_[P] = CurrentCluster.Id; }); // And add all the points of this opcode to the new cluster. 
@@ -250,8 +249,7 @@ void InstructionBenchmarkClustering::sta bool operator<(const OpcodeAndConfig &O) const { return Tie() < O.Tie(); } bool operator!=(const OpcodeAndConfig &O) const { return Tie() != O.Tie(); } }; - std::map> - OpcodeConfigToClusterIDs; + std::map> OpcodeConfigToClusterIDs; // Populate OpcodeConfigToClusterIDs and UnstableOpcodes data structures. assert(ClusterIdForPoint_.size() == Points_.size() && "size mismatch"); for (const auto &Point : zip(Points_, ClusterIdForPoint_)) { @@ -259,14 +257,12 @@ void InstructionBenchmarkClustering::sta if (!ClusterIdOfPoint.isValid()) continue; // Only process fully valid clusters. const OpcodeAndConfig Key(std::get<0>(Point)); - llvm::SmallSet &ClusterIDsOfOpcode = - OpcodeConfigToClusterIDs[Key]; + SmallSet &ClusterIDsOfOpcode = OpcodeConfigToClusterIDs[Key]; ClusterIDsOfOpcode.insert(ClusterIdOfPoint); } for (const auto &OpcodeConfigToClusterID : OpcodeConfigToClusterIDs) { - const llvm::SmallSet &ClusterIDs = - OpcodeConfigToClusterID.second; + const SmallSet &ClusterIDs = OpcodeConfigToClusterID.second; const OpcodeAndConfig &Key = OpcodeConfigToClusterID.first; // We only care about unstable instructions. 
     if (ClusterIDs.size() < 2)
@@ -317,11 +313,10 @@ void InstructionBenchmarkClustering::sta
   }
 }

-llvm::Expected
-InstructionBenchmarkClustering::create(
+Expected InstructionBenchmarkClustering::create(
     const std::vector &Points, const ModeE Mode,
     const size_t DbscanMinPts, const double AnalysisClusteringEpsilon,
-    llvm::Optional NumOpcodes) {
+    Optional NumOpcodes) {
   InstructionBenchmarkClustering Clustering(
       Points, AnalysisClusteringEpsilon * AnalysisClusteringEpsilon);
   if (auto Error = Clustering.validateAndSetup()) {
@@ -338,7 +333,7 @@ InstructionBenchmarkClustering::create(
     Clustering.stabilize(NumOpcodes.getValue());
   } else /*if(Mode == ModeE::Naive)*/ {
     if (!NumOpcodes.hasValue())
-      llvm::report_fatal_error(
+      report_fatal_error(
           "'naive' clustering mode requires opcode count to be specified");
     Clustering.clusterizeNaive(NumOpcodes.getValue());
   }
@@ -352,13 +347,13 @@ void SchedClassClusterCentroid::addPoint
   assert(Representative.size() == Point.size() &&
          "All points should have identical dimensions.");

-  for (const auto &I : llvm::zip(Representative, Point))
+  for (const auto &I : zip(Representative, Point))
     std::get<0>(I).push(std::get<1>(I));
 }

 std::vector SchedClassClusterCentroid::getAsPoint() const {
   std::vector ClusterCenterPoint(Representative.size());
-  for (const auto &I : llvm::zip(ClusterCenterPoint, Representative))
+  for (const auto &I : zip(ClusterCenterPoint, Representative))
     std::get<0>(I).PerInstructionValue = std::get<1>(I).avg();
   return ClusterCenterPoint;
 }
@@ -369,7 +364,7 @@ bool SchedClassClusterCentroid::validate
   switch (Mode) {
   case InstructionBenchmark::Latency:
     if (NumMeasurements != 1) {
-      llvm::errs()
+      errs()
           << "invalid number of measurements in latency mode: expected 1, got "
           << NumMeasurements << "\n";
       return false;
@@ -380,9 +375,9 @@ bool SchedClassClusterCentroid::validate
     break;
   case InstructionBenchmark::InverseThroughput:
     if (NumMeasurements != 1) {
-      llvm::errs() << "invalid number of measurements in inverse throughput "
-                      "mode: expected 1, got "
-                   << NumMeasurements << "\n";
+      errs() << "invalid number of measurements in inverse throughput "
+                "mode: expected 1, got "
+             << NumMeasurements << "\n";
       return false;
     }
     break;

Modified: llvm/trunk/tools/llvm-exegesis/lib/Clustering.h
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-exegesis/lib/Clustering.h?rev=374158&r1=374157&r2=374158&view=diff
==============================================================================
--- llvm/trunk/tools/llvm-exegesis/lib/Clustering.h (original)
+++ llvm/trunk/tools/llvm-exegesis/lib/Clustering.h Wed Oct 9 04:58:42 2019
@@ -29,10 +29,10 @@ public:

   // Clusters `Points` using DBSCAN with the given parameters. See the cc file
   // for more explanations on the algorithm.
-  static llvm::Expected
+  static Expected
   create(const std::vector &Points, ModeE Mode,
          size_t DbscanMinPts, double AnalysisClusteringEpsilon,
-         llvm::Optional NumOpcodes = llvm::None);
+         Optional NumOpcodes = None);

   class ClusterId {
   public:
@@ -123,7 +123,7 @@ private:
       const std::vector &Points,
       double AnalysisClusteringEpsilonSquared);

-  llvm::Error validateAndSetup();
+  Error validateAndSetup();

   void clusterizeDbScan(size_t MinPts);
   void clusterizeNaive(unsigned NumOpcodes);

Modified: llvm/trunk/tools/llvm-exegesis/lib/CodeTemplate.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-exegesis/lib/CodeTemplate.cpp?rev=374158&r1=374157&r2=374158&view=diff
==============================================================================
--- llvm/trunk/tools/llvm-exegesis/lib/CodeTemplate.cpp (original)
+++ llvm/trunk/tools/llvm-exegesis/lib/CodeTemplate.cpp Wed Oct 9 04:58:42 2019
@@ -32,32 +32,30 @@ unsigned InstructionTemplate::getOpcode(
   return Instr.Description->getOpcode();
 }

-llvm::MCOperand &InstructionTemplate::getValueFor(const Variable &Var) {
+MCOperand &InstructionTemplate::getValueFor(const Variable &Var) {
   return VariableValues[Var.getIndex()];
 }

-const llvm::MCOperand &
-InstructionTemplate::getValueFor(const Variable &Var) const {
+const MCOperand &InstructionTemplate::getValueFor(const Variable &Var) const {
   return VariableValues[Var.getIndex()];
 }

-llvm::MCOperand &InstructionTemplate::getValueFor(const Operand &Op) {
+MCOperand &InstructionTemplate::getValueFor(const Operand &Op) {
   return getValueFor(Instr.Variables[Op.getVariableIndex()]);
 }

-const llvm::MCOperand &
-InstructionTemplate::getValueFor(const Operand &Op) const {
+const MCOperand &InstructionTemplate::getValueFor(const Operand &Op) const {
   return getValueFor(Instr.Variables[Op.getVariableIndex()]);
 }

 bool InstructionTemplate::hasImmediateVariables() const {
-  return llvm::any_of(Instr.Variables, [this](const Variable &Var) {
+  return any_of(Instr.Variables, [this](const Variable &Var) {
     return Instr.getPrimaryOperand(Var).isImmediate();
   });
 }

-llvm::MCInst InstructionTemplate::build() const {
-  llvm::MCInst Result;
+MCInst InstructionTemplate::build() const {
+  MCInst Result;
   Result.setOpcode(Instr.Description->Opcode);
   for (const auto &Op : Instr.Operands)
     if (Op.isExplicit())
@@ -66,10 +64,10 @@ llvm::MCInst InstructionTemplate::build(
 }

 bool isEnumValue(ExecutionMode Execution) {
-  return llvm::isPowerOf2_32(static_cast(Execution));
+  return isPowerOf2_32(static_cast(Execution));
 }

-llvm::StringRef getName(ExecutionMode Bit) {
+StringRef getName(ExecutionMode Bit) {
   assert(isEnumValue(Bit) && "Bit must be a power of two");
   switch (Bit) {
   case ExecutionMode::UNKNOWN:
@@ -92,7 +90,7 @@ llvm::StringRef getName(ExecutionMode Bi
   llvm_unreachable("Missing enum case");
 }

-llvm::ArrayRef getAllExecutionBits() {
+ArrayRef getAllExecutionBits() {
   static const ExecutionMode kAllExecutionModeBits[] = {
       ExecutionMode::ALWAYS_SERIAL_IMPLICIT_REGS_ALIAS,
       ExecutionMode::ALWAYS_SERIAL_TIED_REGS_ALIAS,
@@ -102,12 +100,11 @@ llvm::ArrayRef getAllExec
       ExecutionMode::ALWAYS_PARALLEL_MISSING_USE_OR_DEF,
       ExecutionMode::PARALLEL_VIA_EXPLICIT_REGS,
   };
-  return llvm::makeArrayRef(kAllExecutionModeBits);
+  return makeArrayRef(kAllExecutionModeBits);
 }

-llvm::SmallVector
-getExecutionModeBits(ExecutionMode Execution) {
-  llvm::SmallVector Result;
+SmallVector getExecutionModeBits(ExecutionMode Execution) {
+  SmallVector Result;
   for (const auto Bit : getAllExecutionBits())
     if ((Execution & Bit) == Bit)
       Result.push_back(Bit);

Modified: llvm/trunk/tools/llvm-exegesis/lib/CodeTemplate.h
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-exegesis/lib/CodeTemplate.h?rev=374158&r1=374157&r2=374158&view=diff
==============================================================================
--- llvm/trunk/tools/llvm-exegesis/lib/CodeTemplate.h (original)
+++ llvm/trunk/tools/llvm-exegesis/lib/CodeTemplate.h Wed Oct 9 04:58:42 2019
@@ -31,19 +31,19 @@ struct InstructionTemplate {
   InstructionTemplate &operator=(InstructionTemplate &&); // default

   unsigned getOpcode() const;
-  llvm::MCOperand &getValueFor(const Variable &Var);
-  const llvm::MCOperand &getValueFor(const Variable &Var) const;
-  llvm::MCOperand &getValueFor(const Operand &Op);
-  const llvm::MCOperand &getValueFor(const Operand &Op) const;
+  MCOperand &getValueFor(const Variable &Var);
+  const MCOperand &getValueFor(const Variable &Var) const;
+  MCOperand &getValueFor(const Operand &Op);
+  const MCOperand &getValueFor(const Operand &Op) const;
   bool hasImmediateVariables() const;

-  // Builds an llvm::MCInst from this InstructionTemplate setting its operands
+  // Builds an MCInst from this InstructionTemplate setting its operands
   // to the corresponding variable values. Precondition: All VariableValues must
   // be set.
-  llvm::MCInst build() const;
+  MCInst build() const;

   Instruction Instr;
-  llvm::SmallVector VariableValues;
+  SmallVector VariableValues;
 };

 enum class ExecutionMode : uint8_t {
@@ -91,14 +91,14 @@ enum class ExecutionMode : uint8_t {
 bool isEnumValue(ExecutionMode Execution);

 // Returns a human readable string for the enum.
-llvm::StringRef getName(ExecutionMode Execution);
+StringRef getName(ExecutionMode Execution);

 // Returns a sequence of increasing powers of two corresponding to all the
 // Execution flags.
-llvm::ArrayRef getAllExecutionBits();
+ArrayRef getAllExecutionBits();

 // Decomposes Execution into individual set bits.
-llvm::SmallVector getExecutionModeBits(ExecutionMode);
+SmallVector getExecutionModeBits(ExecutionMode);

 LLVM_ENABLE_BITMASK_ENUMS_IN_NAMESPACE();


Modified: llvm/trunk/tools/llvm-exegesis/lib/Latency.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-exegesis/lib/Latency.cpp?rev=374158&r1=374157&r2=374158&view=diff
==============================================================================
--- llvm/trunk/tools/llvm-exegesis/lib/Latency.cpp (original)
+++ llvm/trunk/tools/llvm-exegesis/lib/Latency.cpp Wed Oct 9 04:58:42 2019
@@ -84,7 +84,7 @@ static void appendCodeTemplates(const LL
                                 const Instruction &Instr,
                                 const BitVector &ForbiddenRegisters,
                                 ExecutionMode ExecutionModeBit,
-                                llvm::StringRef ExecutionClassDescription,
+                                StringRef ExecutionClassDescription,
                                 std::vector &CodeTemplates) {
   assert(isEnumValue(ExecutionModeBit) && "Bit must be a power of two");
   switch (ExecutionModeBit) {
@@ -151,7 +151,7 @@ static void appendCodeTemplates(const LL

 LatencySnippetGenerator::~LatencySnippetGenerator() = default;

-llvm::Expected>
+Expected>
 LatencySnippetGenerator::generateCodeTemplates(
     const Instruction &Instr, const BitVector &ForbiddenRegisters) const {
   std::vector Results;
@@ -179,8 +179,7 @@ LatencyBenchmarkRunner::LatencyBenchmark

 LatencyBenchmarkRunner::~LatencyBenchmarkRunner() = default;

-llvm::Expected>
-LatencyBenchmarkRunner::runMeasurements(
+Expected> LatencyBenchmarkRunner::runMeasurements(
     const FunctionExecutor &Executor) const {
   // Cycle measurements include some overhead from the kernel. Repeat the
   // measure several times and take the minimum value.
@@ -188,7 +187,7 @@ LatencyBenchmarkRunner::runMeasurements(
   int64_t MinValue = std::numeric_limits::max();
   const char *CounterName = State.getPfmCounters().CycleCounter;
   if (!CounterName)
-    llvm::report_fatal_error("sched model does not define a cycle counter");
+    report_fatal_error("sched model does not define a cycle counter");
   for (size_t I = 0; I < NumMeasurements; ++I) {
     auto ExpectedCounterValue = Executor.runAndMeasure(CounterName);
     if (!ExpectedCounterValue)

Modified: llvm/trunk/tools/llvm-exegesis/lib/Latency.h
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-exegesis/lib/Latency.h?rev=374158&r1=374157&r2=374158&view=diff
==============================================================================
--- llvm/trunk/tools/llvm-exegesis/lib/Latency.h (original)
+++ llvm/trunk/tools/llvm-exegesis/lib/Latency.h Wed Oct 9 04:58:42 2019
@@ -27,7 +27,7 @@ public:
   using SnippetGenerator::SnippetGenerator;
   ~LatencySnippetGenerator() override;

-  llvm::Expected>
+  Expected>
   generateCodeTemplates(const Instruction &Instr,
                         const BitVector &ForbiddenRegisters) const override;
 };
@@ -39,7 +39,7 @@ public:
   ~LatencyBenchmarkRunner() override;

 private:
-  llvm::Expected>
+  Expected>
   runMeasurements(const FunctionExecutor &Executor) const override;
 };
 } // namespace exegesis

Modified: llvm/trunk/tools/llvm-exegesis/lib/LlvmState.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-exegesis/lib/LlvmState.cpp?rev=374158&r1=374157&r2=374158&view=diff
==============================================================================
--- llvm/trunk/tools/llvm-exegesis/lib/LlvmState.cpp (original)
+++ llvm/trunk/tools/llvm-exegesis/lib/LlvmState.cpp Wed Oct 9 04:58:42 2019
@@ -24,16 +24,15 @@ namespace exegesis {
 LLVMState::LLVMState(const std::string &Triple, const std::string &CpuName,
                      const std::string &Features) {
   std::string Error;
-  const llvm::Target *const TheTarget =
-      llvm::TargetRegistry::lookupTarget(Triple, Error);
+  const Target *const TheTarget = TargetRegistry::lookupTarget(Triple, Error);
   assert(TheTarget && "unknown target for host");
-  const llvm::TargetOptions Options;
+  const TargetOptions Options;
   TargetMachine.reset(
-      static_cast(TheTarget->createTargetMachine(
-          Triple, CpuName, Features, Options, llvm::Reloc::Model::Static)));
+      static_cast(TheTarget->createTargetMachine(
+          Triple, CpuName, Features, Options, Reloc::Model::Static)));
   TheExegesisTarget = ExegesisTarget::lookup(TargetMachine->getTargetTriple());
   if (!TheExegesisTarget) {
-    llvm::errs() << "no exegesis target for " << Triple << ", using default\n";
+    errs() << "no exegesis target for " << Triple << ", using default\n";
     TheExegesisTarget = &ExegesisTarget::getDefault();
   }
   PfmCounters = &TheExegesisTarget->getPfmCounters(CpuName);
@@ -47,32 +46,29 @@ LLVMState::LLVMState(const std::string &
 }

 LLVMState::LLVMState(const std::string &CpuName)
-    : LLVMState(llvm::sys::getProcessTriple(),
-                CpuName.empty() ? llvm::sys::getHostCPUName().str() : CpuName,
-                "") {}
+    : LLVMState(sys::getProcessTriple(),
+                CpuName.empty() ? sys::getHostCPUName().str() : CpuName, "") {}

-std::unique_ptr
-LLVMState::createTargetMachine() const {
-  return std::unique_ptr(
-      static_cast(
-          TargetMachine->getTarget().createTargetMachine(
-              TargetMachine->getTargetTriple().normalize(),
-              TargetMachine->getTargetCPU(),
-              TargetMachine->getTargetFeatureString(), TargetMachine->Options,
-              llvm::Reloc::Model::Static)));
+std::unique_ptr LLVMState::createTargetMachine() const {
+  return std::unique_ptr(static_cast(
+      TargetMachine->getTarget().createTargetMachine(
+          TargetMachine->getTargetTriple().normalize(),
+          TargetMachine->getTargetCPU(),
+          TargetMachine->getTargetFeatureString(), TargetMachine->Options,
+          Reloc::Model::Static)));
 }

-bool LLVMState::canAssemble(const llvm::MCInst &Inst) const {
-  llvm::MCObjectFileInfo ObjectFileInfo;
-  llvm::MCContext Context(TargetMachine->getMCAsmInfo(),
-                          TargetMachine->getMCRegisterInfo(), &ObjectFileInfo);
-  std::unique_ptr CodeEmitter(
+bool LLVMState::canAssemble(const MCInst &Inst) const {
+  MCObjectFileInfo ObjectFileInfo;
+  MCContext Context(TargetMachine->getMCAsmInfo(),
+                    TargetMachine->getMCRegisterInfo(), &ObjectFileInfo);
+  std::unique_ptr CodeEmitter(
       TargetMachine->getTarget().createMCCodeEmitter(
           *TargetMachine->getMCInstrInfo(), *TargetMachine->getMCRegisterInfo(),
           Context));
-  llvm::SmallVector Tmp;
-  llvm::raw_svector_ostream OS(Tmp);
-  llvm::SmallVector Fixups;
+  SmallVector Tmp;
+  raw_svector_ostream OS(Tmp);
+  SmallVector Fixups;
   CodeEmitter->encodeInstruction(Inst, OS, Fixups,
                                  *TargetMachine->getMCSubtargetInfo());
   return Tmp.size() > 0;

Modified: llvm/trunk/tools/llvm-exegesis/lib/LlvmState.h
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-exegesis/lib/LlvmState.h?rev=374158&r1=374157&r2=374158&view=diff
==============================================================================
--- llvm/trunk/tools/llvm-exegesis/lib/LlvmState.h (original)
+++ llvm/trunk/tools/llvm-exegesis/lib/LlvmState.h Wed Oct 9 04:58:42 2019
@@ -42,21 +42,21 @@ public:
             const std::string &CpuName,
             const std::string &Features = ""); // For tests.

-  const llvm::TargetMachine &getTargetMachine() const { return *TargetMachine; }
-  std::unique_ptr createTargetMachine() const;
+  const TargetMachine &getTargetMachine() const { return *TargetMachine; }
+  std::unique_ptr createTargetMachine() const;

   const ExegesisTarget &getExegesisTarget() const { return *TheExegesisTarget; }

-  bool canAssemble(const llvm::MCInst &mc_inst) const;
+  bool canAssemble(const MCInst &mc_inst) const;

   // For convenience:
-  const llvm::MCInstrInfo &getInstrInfo() const {
+  const MCInstrInfo &getInstrInfo() const {
     return *TargetMachine->getMCInstrInfo();
   }
-  const llvm::MCRegisterInfo &getRegInfo() const {
+  const MCRegisterInfo &getRegInfo() const {
     return *TargetMachine->getMCRegisterInfo();
   }
-  const llvm::MCSubtargetInfo &getSubtargetInfo() const {
+  const MCSubtargetInfo &getSubtargetInfo() const {
     return *TargetMachine->getMCSubtargetInfo();
   }

@@ -67,7 +67,7 @@ public:

 private:
   const ExegesisTarget *TheExegesisTarget;
-  std::unique_ptr TargetMachine;
+  std::unique_ptr TargetMachine;
   std::unique_ptr RATC;
   std::unique_ptr IC;
   const PfmCountersInfo *PfmCounters;

Modified: llvm/trunk/tools/llvm-exegesis/lib/MCInstrDescView.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-exegesis/lib/MCInstrDescView.cpp?rev=374158&r1=374157&r2=374158&view=diff
==============================================================================
--- llvm/trunk/tools/llvm-exegesis/lib/MCInstrDescView.cpp (original)
+++ llvm/trunk/tools/llvm-exegesis/lib/MCInstrDescView.cpp Wed Oct 9 04:58:42 2019
@@ -59,12 +59,12 @@ bool Operand::isVariable() const { retur

 bool Operand::isMemory() const {
   return isExplicit() &&
-         getExplicitOperandInfo().OperandType == llvm::MCOI::OPERAND_MEMORY;
+         getExplicitOperandInfo().OperandType == MCOI::OPERAND_MEMORY;
 }

 bool Operand::isImmediate() const {
   return isExplicit() &&
-         getExplicitOperandInfo().OperandType == llvm::MCOI::OPERAND_IMMEDIATE;
+         getExplicitOperandInfo().OperandType == MCOI::OPERAND_IMMEDIATE;
 }

 unsigned Operand::getTiedToIndex() const {
@@ -89,12 +89,12 @@ const RegisterAliasingTracker &Operand::
   return *Tracker;
 }

-const llvm::MCOperandInfo &Operand::getExplicitOperandInfo() const {
+const MCOperandInfo &Operand::getExplicitOperandInfo() const {
   assert(Info);
   return *Info;
 }

-Instruction::Instruction(const llvm::MCInstrInfo &InstrInfo,
+Instruction::Instruction(const MCInstrInfo &InstrInfo,
                          const RegisterAliasingTrackerCache &RATC,
                          unsigned Opcode)
     : Description(&InstrInfo.get(Opcode)), Name(InstrInfo.getName(Opcode)) {
@@ -108,11 +108,11 @@ Instruction::Instruction(const llvm::MCI
     if (OpInfo.RegClass >= 0)
       Operand.Tracker = &RATC.getRegisterClass(OpInfo.RegClass);
     Operand.TiedToIndex =
-        Description->getOperandConstraint(OpIndex, llvm::MCOI::TIED_TO);
+        Description->getOperandConstraint(OpIndex, MCOI::TIED_TO);
     Operand.Info = &OpInfo;
     Operands.push_back(Operand);
   }
-  for (const llvm::MCPhysReg *MCPhysReg = Description->getImplicitDefs();
+  for (const MCPhysReg *MCPhysReg = Description->getImplicitDefs();
        MCPhysReg && *MCPhysReg; ++MCPhysReg, ++OpIndex) {
     Operand Operand;
     Operand.Index = OpIndex;
@@ -121,7 +121,7 @@ Instruction::Instruction(const llvm::MCI
     Operand.ImplicitReg = MCPhysReg;
     Operands.push_back(Operand);
   }
-  for (const llvm::MCPhysReg *MCPhysReg = Description->getImplicitUses();
+  for (const MCPhysReg *MCPhysReg = Description->getImplicitUses();
        MCPhysReg && *MCPhysReg; ++MCPhysReg, ++OpIndex) {
     Operand Operand;
     Operand.Index = OpIndex;
@@ -209,8 +209,8 @@ bool Instruction::hasAliasingRegistersTh
 }

 bool Instruction::hasTiedRegisters() const {
-  return llvm::any_of(
-      Variables, [](const Variable &Var) { return Var.hasTiedOperands(); });
+  return any_of(Variables,
+                [](const Variable &Var) { return Var.hasTiedOperands(); });
 }

 bool Instruction::hasAliasingRegisters(
@@ -223,9 +223,9 @@ bool Instruction::hasOneUseOrOneDef() co
   return AllDefRegs.count() || AllUseRegs.count();
 }

-void Instruction::dump(const llvm::MCRegisterInfo &RegInfo,
+void Instruction::dump(const MCRegisterInfo &RegInfo,
                        const RegisterAliasingTrackerCache &RATC,
-                       llvm::raw_ostream &Stream) const {
+                       raw_ostream &Stream) const {
   Stream << "- " << Name << "\n";
   for (const auto &Op : Operands) {
     Stream << "- Op" << Op.getIndex();
@@ -277,7 +277,7 @@ void Instruction::dump(const llvm::MCReg
     Stream << "- hasAliasingRegisters\n";
 }

-InstructionsCache::InstructionsCache(const llvm::MCInstrInfo &InstrInfo,
+InstructionsCache::InstructionsCache(const MCInstrInfo &InstrInfo,
                                      const RegisterAliasingTrackerCache &RATC)
     : InstrInfo(InstrInfo), RATC(RATC) {}

@@ -298,9 +298,10 @@ operator==(const AliasingRegisterOperand
   return std::tie(Defs, Uses) == std::tie(Other.Defs, Other.Uses);
 }

-static void addOperandIfAlias(
-    const llvm::MCPhysReg Reg, bool SelectDef, llvm::ArrayRef Operands,
-    llvm::SmallVectorImpl &OperandValues) {
+static void
+addOperandIfAlias(const MCPhysReg Reg, bool SelectDef,
+                  ArrayRef Operands,
+                  SmallVectorImpl &OperandValues) {
   for (const auto &Op : Operands) {
     if (Op.isReg() && Op.isDef() == SelectDef) {
       const int SourceReg = Op.getRegisterAliasing().getOrigin(Reg);
@@ -314,13 +315,13 @@ bool AliasingRegisterOperands::hasImplic
   const auto HasImplicit = [](const RegisterOperandAssignment &ROV) {
     return ROV.Op->isImplicit();
   };
-  return llvm::any_of(Defs, HasImplicit) && llvm::any_of(Uses, HasImplicit);
+  return any_of(Defs, HasImplicit) && any_of(Uses, HasImplicit);
 }

 bool AliasingConfigurations::empty() const { return Configurations.empty(); }

 bool AliasingConfigurations::hasImplicitAliasing() const {
-  return llvm::any_of(Configurations, [](const AliasingRegisterOperands &ARO) {
+  return any_of(Configurations, [](const AliasingRegisterOperands &ARO) {
     return ARO.hasImplicitAliasing();
   });
 }
@@ -330,19 +331,19 @@ AliasingConfigurations::AliasingConfigur
   if (UseInstruction.AllUseRegs.anyCommon(DefInstruction.AllDefRegs)) {
     auto CommonRegisters = UseInstruction.AllUseRegs;
     CommonRegisters &= DefInstruction.AllDefRegs;
-    for (const llvm::MCPhysReg Reg : CommonRegisters.set_bits()) {
+    for (const MCPhysReg Reg : CommonRegisters.set_bits()) {
       AliasingRegisterOperands ARO;
       addOperandIfAlias(Reg, true, DefInstruction.Operands, ARO.Defs);
       addOperandIfAlias(Reg, false, UseInstruction.Operands, ARO.Uses);
       if (!ARO.Defs.empty() && !ARO.Uses.empty() &&
-          !llvm::is_contained(Configurations, ARO))
+          !is_contained(Configurations, ARO))
         Configurations.push_back(std::move(ARO));
     }
   }
 }

-void DumpMCOperand(const llvm::MCRegisterInfo &MCRegisterInfo,
-                   const llvm::MCOperand &Op, llvm::raw_ostream &OS) {
+void DumpMCOperand(const MCRegisterInfo &MCRegisterInfo, const MCOperand &Op,
+                   raw_ostream &OS) {
   if (!Op.isValid())
     OS << "Invalid";
   else if (Op.isReg())
@@ -357,9 +358,9 @@ void DumpMCOperand(const llvm::MCRegiste
     OS << "SubInst";
 }

-void DumpMCInst(const llvm::MCRegisterInfo &MCRegisterInfo,
-                const llvm::MCInstrInfo &MCInstrInfo,
-                const llvm::MCInst &MCInst, llvm::raw_ostream &OS) {
+void DumpMCInst(const MCRegisterInfo &MCRegisterInfo,
+                const MCInstrInfo &MCInstrInfo, const MCInst &MCInst,
+                raw_ostream &OS) {
   OS << MCInstrInfo.getName(MCInst.getOpcode());
   for (unsigned I = 0, E = MCInst.getNumOperands(); I < E; ++I) {
     if (I > 0)

Modified: llvm/trunk/tools/llvm-exegesis/lib/MCInstrDescView.h
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-exegesis/lib/MCInstrDescView.h?rev=374158&r1=374157&r2=374158&view=diff
==============================================================================
--- llvm/trunk/tools/llvm-exegesis/lib/MCInstrDescView.h (original)
+++ llvm/trunk/tools/llvm-exegesis/lib/MCInstrDescView.h Wed Oct 9 04:58:42 2019
@@ -44,7 +44,7 @@ struct Variable {
   bool hasTiedOperands() const;

   // The indices of the operands tied to this Variable.
-  llvm::SmallVector TiedOperands;
+  SmallVector TiedOperands;

   // The index of this Variable in Instruction.Variables and its associated
   // Value in InstructionBuilder.VariableValues.
@@ -78,22 +78,22 @@ struct Operand {
   unsigned getVariableIndex() const;
   unsigned getImplicitReg() const;
   const RegisterAliasingTracker &getRegisterAliasing() const;
-  const llvm::MCOperandInfo &getExplicitOperandInfo() const;
+  const MCOperandInfo &getExplicitOperandInfo() const;

   // Please use the accessors above and not the following fields.
   int Index = -1;
   bool IsDef = false;
   const RegisterAliasingTracker *Tracker = nullptr; // Set for Register Op.
-  const llvm::MCOperandInfo *Info = nullptr; // Set for Explicit Op.
+  const MCOperandInfo *Info = nullptr; // Set for Explicit Op.
   int TiedToIndex = -1; // Set for Reg&Explicit Op.
-  const llvm::MCPhysReg *ImplicitReg = nullptr; // Set for Implicit Op.
+  const MCPhysReg *ImplicitReg = nullptr; // Set for Implicit Op.
   int VariableIndex = -1; // Set for Explicit Op.
 };

 // A view over an MCInstrDesc offering a convenient interface to compute
 // Register aliasing.
 struct Instruction {
-  Instruction(const llvm::MCInstrInfo &InstrInfo,
+  Instruction(const MCInstrInfo &InstrInfo,
               const RegisterAliasingTrackerCache &RATC, unsigned Opcode);

   // Returns the Operand linked to this Variable.
@@ -129,31 +129,31 @@ struct Instruction {
   bool hasOneUseOrOneDef() const;

   // Convenient function to help with debugging.
-  void dump(const llvm::MCRegisterInfo &RegInfo,
+  void dump(const MCRegisterInfo &RegInfo,
             const RegisterAliasingTrackerCache &RATC,
-            llvm::raw_ostream &Stream) const;
+            raw_ostream &Stream) const;

-  const llvm::MCInstrDesc *Description; // Never nullptr.
-  llvm::StringRef Name; // The name of this instruction.
-  llvm::SmallVector Operands;
-  llvm::SmallVector Variables;
-  llvm::BitVector ImplDefRegs; // The set of aliased implicit def registers.
-  llvm::BitVector ImplUseRegs; // The set of aliased implicit use registers.
-  llvm::BitVector AllDefRegs; // The set of all aliased def registers.
-  llvm::BitVector AllUseRegs; // The set of all aliased use registers.
+  const MCInstrDesc *Description; // Never nullptr.
+  StringRef Name; // The name of this instruction.
+  SmallVector Operands;
+  SmallVector Variables;
+  BitVector ImplDefRegs; // The set of aliased implicit def registers.
+  BitVector ImplUseRegs; // The set of aliased implicit use registers.
+  BitVector AllDefRegs; // The set of all aliased def registers.
+  BitVector AllUseRegs; // The set of all aliased use registers.
 };

 // Instructions are expensive to instantiate. This class provides a cache of
 // Instructions with lazy construction.
 struct InstructionsCache {
-  InstructionsCache(const llvm::MCInstrInfo &InstrInfo,
+  InstructionsCache(const MCInstrInfo &InstrInfo,
                     const RegisterAliasingTrackerCache &RATC);

   // Returns the Instruction object corresponding to this Opcode.
   const Instruction &getInstr(unsigned Opcode) const;

 private:
-  const llvm::MCInstrInfo &InstrInfo;
+  const MCInstrInfo &InstrInfo;
   const RegisterAliasingTrackerCache &RATC;
   mutable std::unordered_map>
       Instructions;
@@ -161,11 +161,11 @@ private:

 // Represents the assignment of a Register to an Operand.
 struct RegisterOperandAssignment {
-  RegisterOperandAssignment(const Operand *Operand, llvm::MCPhysReg Reg)
+  RegisterOperandAssignment(const Operand *Operand, MCPhysReg Reg)
       : Op(Operand), Reg(Reg) {}

   const Operand *Op; // Pointer to an Explicit Register Operand.
-  llvm::MCPhysReg Reg;
+  MCPhysReg Reg;

   bool operator==(const RegisterOperandAssignment &other) const;
 };
@@ -177,8 +177,8 @@ struct RegisterOperandAssignment {
 // other (e.g. AX/AL)
 // - The operands are tied.
 struct AliasingRegisterOperands {
-  llvm::SmallVector Defs; // Unlikely size() > 1.
-  llvm::SmallVector Uses;
+  SmallVector Defs; // Unlikely size() > 1.
+  SmallVector Uses;

   // True is Defs and Use contain an Implicit Operand.
   bool hasImplicitAliasing() const;
@@ -195,15 +195,15 @@ struct AliasingConfigurations {
   bool empty() const; // True if no aliasing configuration is found.
   bool hasImplicitAliasing() const;

-  llvm::SmallVector Configurations;
+  SmallVector Configurations;
 };

 // Writes MCInst to OS.
 // This is not assembly but the internal LLVM's name for instructions and
 // registers.
-void DumpMCInst(const llvm::MCRegisterInfo &MCRegisterInfo,
-                const llvm::MCInstrInfo &MCInstrInfo,
-                const llvm::MCInst &MCInst, llvm::raw_ostream &OS);
+void DumpMCInst(const MCRegisterInfo &MCRegisterInfo,
+                const MCInstrInfo &MCInstrInfo, const MCInst &MCInst,
+                raw_ostream &OS);

 } // namespace exegesis
 } // namespace llvm

Modified: llvm/trunk/tools/llvm-exegesis/lib/PerfHelper.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-exegesis/lib/PerfHelper.cpp?rev=374158&r1=374157&r2=374158&view=diff
==============================================================================
--- llvm/trunk/tools/llvm-exegesis/lib/PerfHelper.cpp (original)
+++ llvm/trunk/tools/llvm-exegesis/lib/PerfHelper.cpp Wed Oct 9 04:58:42 2019
@@ -52,7 +52,7 @@ PerfEvent::PerfEvent(PerfEvent &&Other)
   Other.Attr = nullptr;
 }

-PerfEvent::PerfEvent(llvm::StringRef PfmEventString)
+PerfEvent::PerfEvent(StringRef PfmEventString)
     : EventString(PfmEventString.str()), Attr(nullptr) {
 #ifdef HAVE_LIBPFM
   char *Fstr = nullptr;
@@ -67,8 +67,8 @@ PerfEvent::PerfEvent(llvm::StringRef Pfm
     // We don't know beforehand which counters are available (e.g. 6 uops ports
     // on Sandybridge but 8 on Haswell) so we report the missing counter without
     // crashing.
-    llvm::errs() << pfm_strerror(Result) << " - cannot create event "
-                 << EventString << "\n";
+    errs() << pfm_strerror(Result) << " - cannot create event " << EventString
+           << "\n";
   }
   if (Fstr) {
     FullQualifiedEventString = Fstr;
@@ -77,13 +77,13 @@ PerfEvent::PerfEvent(llvm::StringRef Pfm
 #endif
 }

-llvm::StringRef PerfEvent::name() const { return EventString; }
+StringRef PerfEvent::name() const { return EventString; }

 bool PerfEvent::valid() const { return !FullQualifiedEventString.empty(); }

 const perf_event_attr *PerfEvent::attribute() const { return Attr; }

-llvm::StringRef PerfEvent::getPfmEventString() const {
+StringRef PerfEvent::getPfmEventString() const {
   return FullQualifiedEventString;
 }

@@ -97,9 +97,9 @@ Counter::Counter(const PerfEvent &Event)
   perf_event_attr AttrCopy = *Event.attribute();
   FileDescriptor = perf_event_open(&AttrCopy, Pid, Cpu, GroupFd, Flags);
   if (FileDescriptor == -1) {
-    llvm::errs() << "Unable to open event, make sure your kernel allows user "
-                    "space perf monitoring.\nYou may want to try:\n$ sudo sh "
-                    "-c 'echo -1 > /proc/sys/kernel/perf_event_paranoid'\n";
+    errs() << "Unable to open event, make sure your kernel allows user "
+              "space perf monitoring.\nYou may want to try:\n$ sudo sh "
+              "-c 'echo -1 > /proc/sys/kernel/perf_event_paranoid'\n";
   }
   assert(FileDescriptor != -1 && "Unable to open event");
 }
@@ -115,7 +115,7 @@ int64_t Counter::read() const {
   ssize_t ReadSize = ::read(FileDescriptor, &Count, sizeof(Count));
   if (ReadSize != sizeof(Count)) {
     Count = -1;
-    llvm::errs() << "Failed to read event counter\n";
+    errs() << "Failed to read event counter\n";
   }
   return Count;
 }

Modified: llvm/trunk/tools/llvm-exegesis/lib/PerfHelper.h
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-exegesis/lib/PerfHelper.h?rev=374158&r1=374157&r2=374158&view=diff
==============================================================================
--- llvm/trunk/tools/llvm-exegesis/lib/PerfHelper.h (original)
+++ llvm/trunk/tools/llvm-exegesis/lib/PerfHelper.h Wed Oct 9 04:58:42 2019
@@ -36,14 +36,14 @@ class PerfEvent {
 public:
   // http://perfmon2.sourceforge.net/manv4/libpfm.html
   // Events are expressed as strings. e.g. "INSTRUCTION_RETIRED"
-  explicit PerfEvent(llvm::StringRef pfm_event_string);
+  explicit PerfEvent(StringRef pfm_event_string);

   PerfEvent(const PerfEvent &) = delete;
   PerfEvent(PerfEvent &&other);
   ~PerfEvent();

   // The pfm_event_string passed at construction time.
-  llvm::StringRef name() const;
+  StringRef name() const;

   // Whether the event was successfully created.
   bool valid() const;
@@ -53,7 +53,7 @@ public:

   // The fully qualified name for the event.
   // e.g. "snb_ep::INSTRUCTION_RETIRED:e=0:i=0:c=0:t=0:u=1:k=0:mg=0:mh=1"
-  llvm::StringRef getPfmEventString() const;
+  StringRef getPfmEventString() const;

 private:
   const std::string EventString;
@@ -86,7 +86,7 @@ private:
 // callback is called for each successful measure (PerfEvent needs to be valid).
 template
 void Measure(
-    llvm::ArrayRef Events,
+    ArrayRef Events,
     const std::function &Callback,
     Function Fn) {
   for (const auto &Event : Events) {

Modified: llvm/trunk/tools/llvm-exegesis/lib/PowerPC/Target.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-exegesis/lib/PowerPC/Target.cpp?rev=374158&r1=374157&r2=374158&view=diff
==============================================================================
--- llvm/trunk/tools/llvm-exegesis/lib/PowerPC/Target.cpp (original)
+++ llvm/trunk/tools/llvm-exegesis/lib/PowerPC/Target.cpp Wed Oct 9 04:58:42 2019
@@ -22,11 +22,10 @@ public:
   ExegesisPowerPCTarget() : ExegesisTarget(PPCCpuPfmCounters) {}

 private:
-  std::vector setRegTo(const llvm::MCSubtargetInfo &STI,
-                       unsigned Reg,
-                       const llvm::APInt &Value) const override;
-  bool matchesArch(llvm::Triple::ArchType Arch) const override {
-    return Arch == llvm::Triple::ppc64le;
+  std::vector setRegTo(const MCSubtargetInfo &STI, unsigned Reg,
+                       const APInt &Value) const override;
+  bool matchesArch(Triple::ArchType Arch) const override {
+    return Arch == Triple::ppc64le;
   }
 };
 } // end anonymous namespace
@@ -34,31 +33,31 @@ private:
 static unsigned getLoadImmediateOpcode(unsigned RegBitWidth) {
   switch (RegBitWidth) {
   case 32:
-    return llvm::PPC::LI;
+    return PPC::LI;
   case 64:
-    return llvm::PPC::LI8;
+    return PPC::LI8;
   }
   llvm_unreachable("Invalid Value Width");
 }

 // Generates instruction to load an immediate value into a register.
-static llvm::MCInst loadImmediate(unsigned Reg, unsigned RegBitWidth,
-                                  const llvm::APInt &Value) {
+static MCInst loadImmediate(unsigned Reg, unsigned RegBitWidth,
+                            const APInt &Value) {
   if (Value.getBitWidth() > RegBitWidth)
     llvm_unreachable("Value must fit in the Register");
-  return llvm::MCInstBuilder(getLoadImmediateOpcode(RegBitWidth))
+  return MCInstBuilder(getLoadImmediateOpcode(RegBitWidth))
       .addReg(Reg)
       .addImm(Value.getZExtValue());
 }

-std::vector
-ExegesisPowerPCTarget::setRegTo(const llvm::MCSubtargetInfo &STI, unsigned Reg,
-                                const llvm::APInt &Value) const {
-  if (llvm::PPC::GPRCRegClass.contains(Reg))
+std::vector ExegesisPowerPCTarget::setRegTo(const MCSubtargetInfo &STI,
+                                            unsigned Reg,
+                                            const APInt &Value) const {
+  if (PPC::GPRCRegClass.contains(Reg))
     return {loadImmediate(Reg, 32, Value)};
-  if (llvm::PPC::G8RCRegClass.contains(Reg))
+  if (PPC::G8RCRegClass.contains(Reg))
     return {loadImmediate(Reg, 64, Value)};
-  llvm::errs() << "setRegTo is not implemented, results will be unreliable\n";
+  errs() << "setRegTo is not implemented, results will be unreliable\n";
   return {};
 }


Modified: llvm/trunk/tools/llvm-exegesis/lib/RegisterAliasing.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-exegesis/lib/RegisterAliasing.cpp?rev=374158&r1=374157&r2=374158&view=diff
==============================================================================
--- llvm/trunk/tools/llvm-exegesis/lib/RegisterAliasing.cpp (original)
+++ llvm/trunk/tools/llvm-exegesis/lib/RegisterAliasing.cpp Wed Oct 9 04:58:42 2019
@@ -11,11 +11,11 @@
 namespace llvm {
 namespace exegesis {

-llvm::BitVector getAliasedBits(const llvm::MCRegisterInfo &RegInfo,
-                               const llvm::BitVector &SourceBits) {
-  llvm::BitVector AliasedBits(RegInfo.getNumRegs());
+BitVector getAliasedBits(const MCRegisterInfo &RegInfo,
+                         const BitVector &SourceBits) {
+  BitVector AliasedBits(RegInfo.getNumRegs());
   for (const size_t PhysReg : SourceBits.set_bits()) {
-    using RegAliasItr = llvm::MCRegAliasIterator;
+    using RegAliasItr = MCRegAliasIterator;
     for (auto Itr = RegAliasItr(PhysReg, &RegInfo, true); Itr.isValid();
          ++Itr) {
       AliasedBits.set(*Itr);
@@ -24,31 +24,30 @@ llvm::BitVector getAliasedBits(const llv
   return AliasedBits;
 }

-RegisterAliasingTracker::RegisterAliasingTracker(
-    const llvm::MCRegisterInfo &RegInfo)
+RegisterAliasingTracker::RegisterAliasingTracker(const MCRegisterInfo &RegInfo)
     : SourceBits(RegInfo.getNumRegs()), AliasedBits(RegInfo.getNumRegs()),
       Origins(RegInfo.getNumRegs()) {}

 RegisterAliasingTracker::RegisterAliasingTracker(
-    const llvm::MCRegisterInfo &RegInfo, const llvm::BitVector &ReservedReg,
-    const llvm::MCRegisterClass &RegClass)
+    const MCRegisterInfo &RegInfo, const BitVector &ReservedReg,
+    const MCRegisterClass &RegClass)
     : RegisterAliasingTracker(RegInfo) {
-  for (llvm::MCPhysReg PhysReg : RegClass)
+  for (MCPhysReg PhysReg : RegClass)
     if (!ReservedReg[PhysReg]) // Removing reserved registers.
       SourceBits.set(PhysReg);
   FillOriginAndAliasedBits(RegInfo, SourceBits);
 }

-RegisterAliasingTracker::RegisterAliasingTracker(
-    const llvm::MCRegisterInfo &RegInfo, const llvm::MCPhysReg PhysReg)
+RegisterAliasingTracker::RegisterAliasingTracker(const MCRegisterInfo &RegInfo,
+                                                 const MCPhysReg PhysReg)
     : RegisterAliasingTracker(RegInfo) {
   SourceBits.set(PhysReg);
   FillOriginAndAliasedBits(RegInfo, SourceBits);
 }

 void RegisterAliasingTracker::FillOriginAndAliasedBits(
-    const llvm::MCRegisterInfo &RegInfo, const llvm::BitVector &SourceBits) {
-  using RegAliasItr = llvm::MCRegAliasIterator;
+    const MCRegisterInfo &RegInfo, const BitVector &SourceBits) {
+  using RegAliasItr = MCRegAliasIterator;
   for (const size_t PhysReg : SourceBits.set_bits()) {
     for (auto Itr = RegAliasItr(PhysReg, &RegInfo, true); Itr.isValid();
          ++Itr) {
@@ -59,12 +58,12 @@ void RegisterAliasingTracker::FillOrigin
 }

 RegisterAliasingTrackerCache::RegisterAliasingTrackerCache(
-    const llvm::MCRegisterInfo &RegInfo, const llvm::BitVector &ReservedReg)
+    const MCRegisterInfo &RegInfo, const BitVector &ReservedReg)
     : RegInfo(RegInfo), ReservedReg(ReservedReg),
       EmptyRegisters(RegInfo.getNumRegs()) {}

 const RegisterAliasingTracker &
-RegisterAliasingTrackerCache::getRegister(llvm::MCPhysReg PhysReg) const {
+RegisterAliasingTrackerCache::getRegister(MCPhysReg PhysReg) const {
   auto &Found = Registers[PhysReg];
   if (!Found)
     Found.reset(new RegisterAliasingTracker(RegInfo, PhysReg));

Modified: llvm/trunk/tools/llvm-exegesis/lib/RegisterAliasing.h
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-exegesis/lib/RegisterAliasing.h?rev=374158&r1=374157&r2=374158&view=diff
==============================================================================
--- llvm/trunk/tools/llvm-exegesis/lib/RegisterAliasing.h (original)
+++ llvm/trunk/tools/llvm-exegesis/lib/RegisterAliasing.h Wed Oct 9 04:58:42 2019
@@ -25,78 +25,78 @@ namespace llvm {
 namespace exegesis {

 // Returns the registers that are aliased by the
SourceBits.set(PhysReg); FillOriginAndAliasedBits(RegInfo, SourceBits); } -RegisterAliasingTracker::RegisterAliasingTracker( - const llvm::MCRegisterInfo &RegInfo, const llvm::MCPhysReg PhysReg) +RegisterAliasingTracker::RegisterAliasingTracker(const MCRegisterInfo &RegInfo, + const MCPhysReg PhysReg) : RegisterAliasingTracker(RegInfo) { SourceBits.set(PhysReg); FillOriginAndAliasedBits(RegInfo, SourceBits); } void RegisterAliasingTracker::FillOriginAndAliasedBits( - const llvm::MCRegisterInfo &RegInfo, const llvm::BitVector &SourceBits) { - using RegAliasItr = llvm::MCRegAliasIterator; + const MCRegisterInfo &RegInfo, const BitVector &SourceBits) { + using RegAliasItr = MCRegAliasIterator; for (const size_t PhysReg : SourceBits.set_bits()) { for (auto Itr = RegAliasItr(PhysReg, &RegInfo, true); Itr.isValid(); ++Itr) { @@ -59,12 +58,12 @@ void RegisterAliasingTracker::FillOrigin } RegisterAliasingTrackerCache::RegisterAliasingTrackerCache( - const llvm::MCRegisterInfo &RegInfo, const llvm::BitVector &ReservedReg) + const MCRegisterInfo &RegInfo, const BitVector &ReservedReg) : RegInfo(RegInfo), ReservedReg(ReservedReg), EmptyRegisters(RegInfo.getNumRegs()) {} const RegisterAliasingTracker & -RegisterAliasingTrackerCache::getRegister(llvm::MCPhysReg PhysReg) const { +RegisterAliasingTrackerCache::getRegister(MCPhysReg PhysReg) const { auto &Found = Registers[PhysReg]; if (!Found) Found.reset(new RegisterAliasingTracker(RegInfo, PhysReg)); Modified: llvm/trunk/tools/llvm-exegesis/lib/RegisterAliasing.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-exegesis/lib/RegisterAliasing.h?rev=374158&r1=374157&r2=374158&view=diff ============================================================================== --- llvm/trunk/tools/llvm-exegesis/lib/RegisterAliasing.h (original) +++ llvm/trunk/tools/llvm-exegesis/lib/RegisterAliasing.h Wed Oct 9 04:58:42 2019 @@ -25,78 +25,78 @@ namespace llvm { namespace exegesis { // Returns the registers that are aliased by the 
ones set in SourceBits. -llvm::BitVector getAliasedBits(const llvm::MCRegisterInfo &RegInfo, - const llvm::BitVector &SourceBits); +BitVector getAliasedBits(const MCRegisterInfo &RegInfo, + const BitVector &SourceBits); // Keeps track of a mapping from one register (or a register class) to its // aliased registers. // // e.g. -// RegisterAliasingTracker Tracker(RegInfo, llvm::X86::EAX); -// Tracker.sourceBits() == { llvm::X86::EAX } -// Tracker.aliasedBits() == { llvm::X86::AL, llvm::X86::AH, llvm::X86::AX, -// llvm::X86::EAX,llvm::X86::HAX, llvm::X86::RAX } -// Tracker.getOrigin(llvm::X86::AL) == llvm::X86::EAX; -// Tracker.getOrigin(llvm::X86::BX) == -1; +// RegisterAliasingTracker Tracker(RegInfo, X86::EAX); +// Tracker.sourceBits() == { X86::EAX } +// Tracker.aliasedBits() == { X86::AL, X86::AH, X86::AX, +// X86::EAX,X86::HAX, X86::RAX } +// Tracker.getOrigin(X86::AL) == X86::EAX; +// Tracker.getOrigin(X86::BX) == -1; struct RegisterAliasingTracker { // Construct a tracker from an MCRegisterClass. - RegisterAliasingTracker(const llvm::MCRegisterInfo &RegInfo, - const llvm::BitVector &ReservedReg, - const llvm::MCRegisterClass &RegClass); + RegisterAliasingTracker(const MCRegisterInfo &RegInfo, + const BitVector &ReservedReg, + const MCRegisterClass &RegClass); // Construct a tracker from an MCPhysReg. - RegisterAliasingTracker(const llvm::MCRegisterInfo &RegInfo, - const llvm::MCPhysReg Register); + RegisterAliasingTracker(const MCRegisterInfo &RegInfo, + const MCPhysReg Register); - const llvm::BitVector &sourceBits() const { return SourceBits; } + const BitVector &sourceBits() const { return SourceBits; } // Retrieves all the touched registers as a BitVector. - const llvm::BitVector &aliasedBits() const { return AliasedBits; } + const BitVector &aliasedBits() const { return AliasedBits; } // Returns the origin of this register or -1. 
- int getOrigin(llvm::MCPhysReg Aliased) const { + int getOrigin(MCPhysReg Aliased) const { if (!AliasedBits[Aliased]) return -1; return Origins[Aliased]; } private: - RegisterAliasingTracker(const llvm::MCRegisterInfo &RegInfo); + RegisterAliasingTracker(const MCRegisterInfo &RegInfo); RegisterAliasingTracker(const RegisterAliasingTracker &) = delete; - void FillOriginAndAliasedBits(const llvm::MCRegisterInfo &RegInfo, - const llvm::BitVector &OriginalBits); + void FillOriginAndAliasedBits(const MCRegisterInfo &RegInfo, + const BitVector &OriginalBits); - llvm::BitVector SourceBits; - llvm::BitVector AliasedBits; - llvm::PackedVector Origins; // Max 1024 physical registers. + BitVector SourceBits; + BitVector AliasedBits; + PackedVector Origins; // Max 1024 physical registers. }; // A cache of existing trackers. struct RegisterAliasingTrackerCache { // RegInfo must outlive the cache. - RegisterAliasingTrackerCache(const llvm::MCRegisterInfo &RegInfo, - const llvm::BitVector &ReservedReg); + RegisterAliasingTrackerCache(const MCRegisterInfo &RegInfo, + const BitVector &ReservedReg); // Convenient function to retrieve a BitVector of the right size. - const llvm::BitVector &emptyRegisters() const { return EmptyRegisters; } + const BitVector &emptyRegisters() const { return EmptyRegisters; } // Convenient function to retrieve the registers the function body can't use. - const llvm::BitVector &reservedRegisters() const { return ReservedReg; } + const BitVector &reservedRegisters() const { return ReservedReg; } // Convenient function to retrieve the underlying MCRegInfo. - const llvm::MCRegisterInfo &regInfo() const { return RegInfo; } + const MCRegisterInfo &regInfo() const { return RegInfo; } // Retrieves the RegisterAliasingTracker for this particular register. 
- const RegisterAliasingTracker &getRegister(llvm::MCPhysReg Reg) const; + const RegisterAliasingTracker &getRegister(MCPhysReg Reg) const; // Retrieves the RegisterAliasingTracker for this particular register class. const RegisterAliasingTracker &getRegisterClass(unsigned RegClassIndex) const; private: - const llvm::MCRegisterInfo &RegInfo; - const llvm::BitVector ReservedReg; - const llvm::BitVector EmptyRegisters; + const MCRegisterInfo &RegInfo; + const BitVector ReservedReg; + const BitVector EmptyRegisters; mutable std::unordered_map> Registers; mutable std::unordered_map> @@ -104,7 +104,7 @@ private: }; // `a = a & ~b`, optimized for few bit sets in B and no allocation. -inline void remove(llvm::BitVector &A, const llvm::BitVector &B) { +inline void remove(BitVector &A, const BitVector &B) { assert(A.size() == B.size()); for (auto I : B.set_bits()) A.reset(I); Modified: llvm/trunk/tools/llvm-exegesis/lib/RegisterValue.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-exegesis/lib/RegisterValue.cpp?rev=374158&r1=374157&r2=374158&view=diff ============================================================================== --- llvm/trunk/tools/llvm-exegesis/lib/RegisterValue.cpp (original) +++ llvm/trunk/tools/llvm-exegesis/lib/RegisterValue.cpp Wed Oct 9 04:58:42 2019 @@ -12,27 +12,27 @@ namespace llvm { namespace exegesis { -static llvm::APFloat getFloatValue(const llvm::fltSemantics &FltSemantics, - PredefinedValues Value) { +static APFloat getFloatValue(const fltSemantics &FltSemantics, + PredefinedValues Value) { switch (Value) { case PredefinedValues::POS_ZERO: - return llvm::APFloat::getZero(FltSemantics); + return APFloat::getZero(FltSemantics); case PredefinedValues::NEG_ZERO: - return llvm::APFloat::getZero(FltSemantics, true); + return APFloat::getZero(FltSemantics, true); case PredefinedValues::ONE: - return llvm::APFloat(FltSemantics, "1"); + return APFloat(FltSemantics, "1"); case PredefinedValues::TWO: - return 
llvm::APFloat(FltSemantics, "2"); + return APFloat(FltSemantics, "2"); case PredefinedValues::INF: - return llvm::APFloat::getInf(FltSemantics); + return APFloat::getInf(FltSemantics); case PredefinedValues::QNAN: - return llvm::APFloat::getQNaN(FltSemantics); + return APFloat::getQNaN(FltSemantics); case PredefinedValues::SMALLEST_NORM: - return llvm::APFloat::getSmallestNormalized(FltSemantics); + return APFloat::getSmallestNormalized(FltSemantics); case PredefinedValues::LARGEST: - return llvm::APFloat::getLargest(FltSemantics); + return APFloat::getLargest(FltSemantics); case PredefinedValues::ULP: - return llvm::APFloat::getSmallest(FltSemantics); + return APFloat::getSmallest(FltSemantics); case PredefinedValues::ONE_PLUS_ULP: auto Output = getFloatValue(FltSemantics, PredefinedValues::ONE); Output.next(false); @@ -41,8 +41,8 @@ static llvm::APFloat getFloatValue(const llvm_unreachable("Unhandled exegesis::PredefinedValues"); } -llvm::APInt bitcastFloatValue(const llvm::fltSemantics &FltSemantics, - PredefinedValues Value) { +APInt bitcastFloatValue(const fltSemantics &FltSemantics, + PredefinedValues Value) { return getFloatValue(FltSemantics, Value).bitcastToAPInt(); } Modified: llvm/trunk/tools/llvm-exegesis/lib/RegisterValue.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-exegesis/lib/RegisterValue.h?rev=374158&r1=374157&r2=374158&view=diff ============================================================================== --- llvm/trunk/tools/llvm-exegesis/lib/RegisterValue.h (original) +++ llvm/trunk/tools/llvm-exegesis/lib/RegisterValue.h Wed Oct 9 04:58:42 2019 @@ -24,9 +24,9 @@ namespace exegesis { // A simple object storing the value for a particular register. 
struct RegisterValue { - static RegisterValue zero(unsigned Reg) { return {Reg, llvm::APInt()}; } + static RegisterValue zero(unsigned Reg) { return {Reg, APInt()}; } unsigned Register; - llvm::APInt Value; + APInt Value; }; enum class PredefinedValues { @@ -43,8 +43,8 @@ enum class PredefinedValues { ONE_PLUS_ULP, // The value just after 1.0 }; -llvm::APInt bitcastFloatValue(const llvm::fltSemantics &FltSemantics, - PredefinedValues Value); +APInt bitcastFloatValue(const fltSemantics &FltSemantics, + PredefinedValues Value); } // namespace exegesis } // namespace llvm Modified: llvm/trunk/tools/llvm-exegesis/lib/SchedClassResolution.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-exegesis/lib/SchedClassResolution.cpp?rev=374158&r1=374157&r2=374158&view=diff ============================================================================== --- llvm/trunk/tools/llvm-exegesis/lib/SchedClassResolution.cpp (original) +++ llvm/trunk/tools/llvm-exegesis/lib/SchedClassResolution.cpp Wed Oct 9 04:58:42 2019 @@ -46,20 +46,20 @@ namespace exegesis { // Note that in this case, P016 does not contribute any cycles, so it would // be removed by this function. // FIXME: Move this to MCSubtargetInfo and use it in llvm-mca. -static llvm::SmallVector -getNonRedundantWriteProcRes(const llvm::MCSchedClassDesc &SCDesc, - const llvm::MCSubtargetInfo &STI) { - llvm::SmallVector Result; +static SmallVector +getNonRedundantWriteProcRes(const MCSchedClassDesc &SCDesc, + const MCSubtargetInfo &STI) { + SmallVector Result; const auto &SM = STI.getSchedModel(); const unsigned NumProcRes = SM.getNumProcResourceKinds(); // This assumes that the ProcResDescs are sorted in topological order, which // is guaranteed by the tablegen backend. 
- llvm::SmallVector ProcResUnitUsage(NumProcRes); + SmallVector ProcResUnitUsage(NumProcRes); for (const auto *WPR = STI.getWriteProcResBegin(&SCDesc), *const WPREnd = STI.getWriteProcResEnd(&SCDesc); WPR != WPREnd; ++WPR) { - const llvm::MCProcResourceDesc *const ProcResDesc = + const MCProcResourceDesc *const ProcResDesc = SM.getProcResource(WPR->ProcResourceIdx); if (ProcResDesc->SubUnitsIdxBegin == nullptr) { // This is a ProcResUnit. @@ -123,11 +123,11 @@ getNonRedundantWriteProcRes(const llvm:: // RemainingPressure = 0.0 // We stop as there is no remaining budget to distribute. static void distributePressure(float RemainingPressure, - llvm::SmallVector Subunits, - llvm::SmallVector &DensePressure) { + SmallVector Subunits, + SmallVector &DensePressure) { // Find the number of subunits with minimal pressure (they are at the // front). - llvm::sort(Subunits, [&DensePressure](const uint16_t A, const uint16_t B) { + sort(Subunits, [&DensePressure](const uint16_t A, const uint16_t B) { return DensePressure[A] < DensePressure[B]; }); const auto getPressureForSubunit = [&DensePressure, @@ -171,27 +171,26 @@ static void distributePressure(float Rem } } -std::vector> computeIdealizedProcResPressure( - const llvm::MCSchedModel &SM, - llvm::SmallVector WPRS) { +std::vector> +computeIdealizedProcResPressure(const MCSchedModel &SM, + SmallVector WPRS) { // DensePressure[I] is the port pressure for Proc Resource I. - llvm::SmallVector DensePressure(SM.getNumProcResourceKinds()); - llvm::sort(WPRS, [](const llvm::MCWriteProcResEntry &A, - const llvm::MCWriteProcResEntry &B) { + SmallVector DensePressure(SM.getNumProcResourceKinds()); + sort(WPRS, [](const MCWriteProcResEntry &A, const MCWriteProcResEntry &B) { return A.ProcResourceIdx < B.ProcResourceIdx; }); - for (const llvm::MCWriteProcResEntry &WPR : WPRS) { + for (const MCWriteProcResEntry &WPR : WPRS) { // Get units for the entry. 
- const llvm::MCProcResourceDesc *const ProcResDesc = + const MCProcResourceDesc *const ProcResDesc = SM.getProcResource(WPR.ProcResourceIdx); if (ProcResDesc->SubUnitsIdxBegin == nullptr) { // This is a ProcResUnit. DensePressure[WPR.ProcResourceIdx] += WPR.Cycles; } else { // This is a ProcResGroup. - llvm::SmallVector Subunits(ProcResDesc->SubUnitsIdxBegin, - ProcResDesc->SubUnitsIdxBegin + - ProcResDesc->NumUnits); + SmallVector Subunits(ProcResDesc->SubUnitsIdxBegin, + ProcResDesc->SubUnitsIdxBegin + + ProcResDesc->NumUnits); distributePressure(WPR.Cycles, Subunits, DensePressure); } } @@ -204,7 +203,7 @@ std::vector> return Pressure; } -ResolvedSchedClass::ResolvedSchedClass(const llvm::MCSubtargetInfo &STI, +ResolvedSchedClass::ResolvedSchedClass(const MCSubtargetInfo &STI, unsigned ResolvedSchedClassId, bool WasVariant) : SchedClassId(ResolvedSchedClassId), @@ -217,9 +216,9 @@ ResolvedSchedClass::ResolvedSchedClass(c "ResolvedSchedClass should never be variant"); } -static unsigned ResolveVariantSchedClassId(const llvm::MCSubtargetInfo &STI, +static unsigned ResolveVariantSchedClassId(const MCSubtargetInfo &STI, unsigned SchedClassId, - const llvm::MCInst &MCI) { + const MCInst &MCI) { const auto &SM = STI.getSchedModel(); while (SchedClassId && SM.getSchedClassDesc(SchedClassId)->isVariant()) SchedClassId = @@ -228,9 +227,9 @@ static unsigned ResolveVariantSchedClass } std::pair -ResolvedSchedClass::resolveSchedClassId( - const llvm::MCSubtargetInfo &SubtargetInfo, - const llvm::MCInstrInfo &InstrInfo, const llvm::MCInst &MCI) { +ResolvedSchedClass::resolveSchedClassId(const MCSubtargetInfo &SubtargetInfo, + const MCInstrInfo &InstrInfo, + const MCInst &MCI) { unsigned SchedClassId = InstrInfo.get(MCI.getOpcode()).getSchedClass(); const bool WasVariant = SchedClassId && SubtargetInfo.getSchedModel() .getSchedClassDesc(SchedClassId) @@ -240,11 +239,11 @@ ResolvedSchedClass::resolveSchedClassId( } // Returns a ProxResIdx by id or name. 
-static unsigned findProcResIdx(const llvm::MCSubtargetInfo &STI, - const llvm::StringRef NameOrId) { +static unsigned findProcResIdx(const MCSubtargetInfo &STI, + const StringRef NameOrId) { // Interpret the key as an ProcResIdx. unsigned ProcResIdx = 0; - if (llvm::to_integer(NameOrId, ProcResIdx, 10)) + if (to_integer(NameOrId, ProcResIdx, 10)) return ProcResIdx; // Interpret the key as a ProcRes name. const auto &SchedModel = STI.getSchedModel(); @@ -256,7 +255,7 @@ static unsigned findProcResIdx(const llv } std::vector ResolvedSchedClass::getAsPoint( - InstructionBenchmark::ModeE Mode, const llvm::MCSubtargetInfo &STI, + InstructionBenchmark::ModeE Mode, const MCSubtargetInfo &STI, ArrayRef Representative) const { const size_t NumMeasurements = Representative.size(); @@ -270,13 +269,13 @@ std::vector ResolvedSc LatencyMeasure.PerInstructionValue = 0.0; for (unsigned I = 0; I < SCDesc->NumWriteLatencyEntries; ++I) { - const llvm::MCWriteLatencyEntry *const WLE = + const MCWriteLatencyEntry *const WLE = STI.getWriteLatencyEntry(SCDesc, I); LatencyMeasure.PerInstructionValue = std::max(LatencyMeasure.PerInstructionValue, WLE->Cycles); } } else if (Mode == InstructionBenchmark::Uops) { - for (const auto &I : llvm::zip(SchedClassPoint, Representative)) { + for (const auto &I : zip(SchedClassPoint, Representative)) { BenchmarkMeasure &Measure = std::get<0>(I); const PerInstructionStats &Stats = std::get<1>(I); @@ -296,9 +295,9 @@ std::vector ResolvedSc } else if (Key == "NumMicroOps") { Measure.PerInstructionValue = SCDesc->NumMicroOps; } else { - llvm::errs() << "expected `key` to be either a ProcResIdx or a ProcRes " - "name, got " - << Key << "\n"; + errs() << "expected `key` to be either a ProcResIdx or a ProcRes " + "name, got " + << Key << "\n"; return {}; } } Modified: llvm/trunk/tools/llvm-exegesis/lib/SchedClassResolution.h URL: 
http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-exegesis/lib/SchedClassResolution.h?rev=374158&r1=374157&r2=374158&view=diff ============================================================================== --- llvm/trunk/tools/llvm-exegesis/lib/SchedClassResolution.h (original) +++ llvm/trunk/tools/llvm-exegesis/lib/SchedClassResolution.h Wed Oct 9 04:58:42 2019 @@ -31,29 +31,27 @@ namespace exegesis { // Computes the idealized ProcRes Unit pressure. This is the expected // distribution if the CPU scheduler can distribute the load as evenly as // possible. -std::vector> computeIdealizedProcResPressure( - const llvm::MCSchedModel &SM, - llvm::SmallVector WPRS); +std::vector> +computeIdealizedProcResPressure(const MCSchedModel &SM, + SmallVector WPRS); -// An llvm::MCSchedClassDesc augmented with some additional data. +// An MCSchedClassDesc augmented with some additional data. struct ResolvedSchedClass { - ResolvedSchedClass(const llvm::MCSubtargetInfo &STI, - unsigned ResolvedSchedClassId, bool WasVariant); + ResolvedSchedClass(const MCSubtargetInfo &STI, unsigned ResolvedSchedClassId, + bool WasVariant); static std::pair - resolveSchedClassId(const llvm::MCSubtargetInfo &SubtargetInfo, - const llvm::MCInstrInfo &InstrInfo, - const llvm::MCInst &MCI); + resolveSchedClassId(const MCSubtargetInfo &SubtargetInfo, + const MCInstrInfo &InstrInfo, const MCInst &MCI); std::vector - getAsPoint(InstructionBenchmark::ModeE Mode, const llvm::MCSubtargetInfo &STI, + getAsPoint(InstructionBenchmark::ModeE Mode, const MCSubtargetInfo &STI, ArrayRef Representative) const; const unsigned SchedClassId; - const llvm::MCSchedClassDesc *const SCDesc; + const MCSchedClassDesc *const SCDesc; const bool WasVariant; // Whether the original class was variant. 
- const llvm::SmallVector - NonRedundantWriteProcRes; + const SmallVector NonRedundantWriteProcRes; const std::vector> IdealizedProcResPressure; }; Modified: llvm/trunk/tools/llvm-exegesis/lib/SnippetGenerator.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-exegesis/lib/SnippetGenerator.cpp?rev=374158&r1=374157&r2=374158&view=diff ============================================================================== --- llvm/trunk/tools/llvm-exegesis/lib/SnippetGenerator.cpp (original) +++ llvm/trunk/tools/llvm-exegesis/lib/SnippetGenerator.cpp Wed Oct 9 04:58:42 2019 @@ -30,18 +30,17 @@ std::vector getSingleton(C return Result; } -SnippetGeneratorFailure::SnippetGeneratorFailure(const llvm::Twine &S) - : llvm::StringError(S, llvm::inconvertibleErrorCode()) {} +SnippetGeneratorFailure::SnippetGeneratorFailure(const Twine &S) + : StringError(S, inconvertibleErrorCode()) {} SnippetGenerator::SnippetGenerator(const LLVMState &State, const Options &Opts) : State(State), Opts(Opts) {} SnippetGenerator::~SnippetGenerator() = default; -llvm::Expected> -SnippetGenerator::generateConfigurations( - const Instruction &Instr, const llvm::BitVector &ExtraForbiddenRegs) const { - llvm::BitVector ForbiddenRegs = State.getRATC().reservedRegisters(); +Expected> SnippetGenerator::generateConfigurations( + const Instruction &Instr, const BitVector &ExtraForbiddenRegs) const { + BitVector ForbiddenRegs = State.getRATC().reservedRegisters(); ForbiddenRegs |= ExtraForbiddenRegs; // If the instruction has memory registers, prevent the generator from // using the scratch register and its aliasing registers. @@ -98,7 +97,7 @@ std::vector SnippetGenera // Ignore memory operands which are handled separately. // Loop invariant: DefinedRegs[i] is true iif it has been set at least once // before the current instruction. 
- llvm::BitVector DefinedRegs = State.getRATC().emptyRegisters(); + BitVector DefinedRegs = State.getRATC().emptyRegisters(); std::vector RIV; for (const InstructionTemplate &IT : Instructions) { // Returns the register that this Operand sets or uses, or 0 if this is not @@ -134,11 +133,11 @@ std::vector SnippetGenera return RIV; } -llvm::Expected> +Expected> generateSelfAliasingCodeTemplates(const Instruction &Instr) { const AliasingConfigurations SelfAliasing(Instr, Instr); if (SelfAliasing.empty()) - return llvm::make_error("empty self aliasing"); + return make_error("empty self aliasing"); std::vector Result; Result.emplace_back(); CodeTemplate &CT = Result.back(); @@ -155,13 +154,12 @@ generateSelfAliasingCodeTemplates(const return std::move(Result); } -llvm::Expected> -generateUnconstrainedCodeTemplates(const Instruction &Instr, - llvm::StringRef Msg) { +Expected> +generateUnconstrainedCodeTemplates(const Instruction &Instr, StringRef Msg) { std::vector Result; Result.emplace_back(); CodeTemplate &CT = Result.back(); - CT.Info = llvm::formatv("{0}, repeating an unconstrained assignment", Msg); + CT.Info = formatv("{0}, repeating an unconstrained assignment", Msg); CT.Instructions.emplace_back(Instr); return std::move(Result); } @@ -193,14 +191,14 @@ static void setRegisterOperandValue(cons assert(AssignedValue.isReg() && AssignedValue.getReg() == ROV.Reg); return; } - AssignedValue = llvm::MCOperand::createReg(ROV.Reg); + AssignedValue = MCOperand::createReg(ROV.Reg); } else { assert(ROV.Op->isImplicitReg()); assert(ROV.Reg == ROV.Op->getImplicitReg()); } } -size_t randomBit(const llvm::BitVector &Vector) { +size_t randomBit(const BitVector &Vector) { assert(Vector.any()); auto Itr = Vector.set_bits_begin(); for (size_t I = randomIndex(Vector.count() - 1); I != 0; --I) @@ -218,10 +216,10 @@ void setRandomAliasing(const AliasingCon } void randomizeUnsetVariables(const ExegesisTarget &Target, - const llvm::BitVector &ForbiddenRegs, + const BitVector 
&ForbiddenRegs, InstructionTemplate &IT) { for (const Variable &Var : IT.Instr.Variables) { - llvm::MCOperand &AssignedValue = IT.getValueFor(Var); + MCOperand &AssignedValue = IT.getValueFor(Var); if (!AssignedValue.isValid()) Target.randomizeMCOperand(IT.Instr, Var, AssignedValue, ForbiddenRegs); } Modified: llvm/trunk/tools/llvm-exegesis/lib/SnippetGenerator.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-exegesis/lib/SnippetGenerator.h?rev=374158&r1=374157&r2=374158&view=diff ============================================================================== --- llvm/trunk/tools/llvm-exegesis/lib/SnippetGenerator.h (original) +++ llvm/trunk/tools/llvm-exegesis/lib/SnippetGenerator.h Wed Oct 9 04:58:42 2019 @@ -33,19 +33,18 @@ namespace exegesis { std::vector getSingleton(CodeTemplate &&CT); // Generates code templates that has a self-dependency. -llvm::Expected> +Expected> generateSelfAliasingCodeTemplates(const Instruction &Instr); // Generates code templates without assignment constraints. -llvm::Expected> -generateUnconstrainedCodeTemplates(const Instruction &Instr, - llvm::StringRef Msg); +Expected> +generateUnconstrainedCodeTemplates(const Instruction &Instr, StringRef Msg); // A class representing failures that happened during Benchmark, they are used // to report informations to the user. -class SnippetGeneratorFailure : public llvm::StringError { +class SnippetGeneratorFailure : public StringError { public: - SnippetGeneratorFailure(const llvm::Twine &S); + SnippetGeneratorFailure(const Twine &S); }; // Common code for all benchmark modes. @@ -60,9 +59,9 @@ public: virtual ~SnippetGenerator(); // Calls generateCodeTemplate and expands it into one or more BenchmarkCode. - llvm::Expected> + Expected> generateConfigurations(const Instruction &Instr, - const llvm::BitVector &ExtraForbiddenRegs) const; + const BitVector &ExtraForbiddenRegs) const; // Given a snippet, computes which registers the setup code needs to define. 
std::vector computeRegisterInitialValues( @@ -74,7 +73,7 @@ protected: private: // API to be implemented by subclasses. - virtual llvm::Expected> + virtual Expected> generateCodeTemplates(const Instruction &Instr, const BitVector &ForbiddenRegisters) const = 0; }; @@ -89,7 +88,7 @@ size_t randomIndex(size_t Max); // Picks a random bit among the bits set in Vector and returns its index. // Precondition: Vector must have at least one bit set. -size_t randomBit(const llvm::BitVector &Vector); +size_t randomBit(const BitVector &Vector); // Picks a random configuration, then selects a random def and a random use from // it and finally set the selected values in the provided InstructionInstances. @@ -99,7 +98,7 @@ void setRandomAliasing(const AliasingCon // Assigns a Random Value to all Variables in IT that are still Invalid. // Do not use any of the registers in `ForbiddenRegs`. void randomizeUnsetVariables(const ExegesisTarget &Target, - const llvm::BitVector &ForbiddenRegs, + const BitVector &ForbiddenRegs, InstructionTemplate &IT); } // namespace exegesis Modified: llvm/trunk/tools/llvm-exegesis/lib/SnippetRepetitor.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-exegesis/lib/SnippetRepetitor.cpp?rev=374158&r1=374157&r2=374158&view=diff ============================================================================== --- llvm/trunk/tools/llvm-exegesis/lib/SnippetRepetitor.cpp (original) +++ llvm/trunk/tools/llvm-exegesis/lib/SnippetRepetitor.cpp Wed Oct 9 04:58:42 2019 @@ -69,8 +69,8 @@ public: Entry.addInstruction(Inst); // Set up the loop basic block. - Entry.MBB->addSuccessor(Loop.MBB, llvm::BranchProbability::getOne()); - Loop.MBB->addSuccessor(Loop.MBB, llvm::BranchProbability::getOne()); + Entry.MBB->addSuccessor(Loop.MBB, BranchProbability::getOne()); + Loop.MBB->addSuccessor(Loop.MBB, BranchProbability::getOne()); // The live ins are: the loop counter, the registers that were setup by // the entry block, and entry block live ins. 
Loop.MBB->addLiveIn(LoopCounter); @@ -83,7 +83,7 @@ public: State.getInstrInfo()); // Set up the exit basic block. - Loop.MBB->addSuccessor(Exit.MBB, llvm::BranchProbability::getZero()); + Loop.MBB->addSuccessor(Exit.MBB, BranchProbability::getZero()); Exit.addReturn(); }; } Modified: llvm/trunk/tools/llvm-exegesis/lib/Target.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-exegesis/lib/Target.cpp?rev=374158&r1=374157&r2=374158&view=diff ============================================================================== --- llvm/trunk/tools/llvm-exegesis/lib/Target.cpp (original) +++ llvm/trunk/tools/llvm-exegesis/lib/Target.cpp Wed Oct 9 04:58:42 2019 @@ -17,7 +17,7 @@ ExegesisTarget::~ExegesisTarget() {} // static ExegesisTarget *FirstTarget = nullptr; -const ExegesisTarget *ExegesisTarget::lookup(llvm::Triple TT) { +const ExegesisTarget *ExegesisTarget::lookup(Triple TT) { for (const ExegesisTarget *T = FirstTarget; T != nullptr; T = T->Next) { if (T->matchesArch(TT.getArch())) return T; @@ -86,23 +86,23 @@ ExegesisTarget::createUopsBenchmarkRunne return std::make_unique(State); } -void ExegesisTarget::randomizeMCOperand( - const Instruction &Instr, const Variable &Var, - llvm::MCOperand &AssignedValue, - const llvm::BitVector &ForbiddenRegs) const { +void ExegesisTarget::randomizeMCOperand(const Instruction &Instr, + const Variable &Var, + MCOperand &AssignedValue, + const BitVector &ForbiddenRegs) const { const Operand &Op = Instr.getPrimaryOperand(Var); switch (Op.getExplicitOperandInfo().OperandType) { - case llvm::MCOI::OperandType::OPERAND_IMMEDIATE: + case MCOI::OperandType::OPERAND_IMMEDIATE: // FIXME: explore immediate values too. 
- AssignedValue = llvm::MCOperand::createImm(1); + AssignedValue = MCOperand::createImm(1); break; - case llvm::MCOI::OperandType::OPERAND_REGISTER: { + case MCOI::OperandType::OPERAND_REGISTER: { assert(Op.isReg()); auto AllowedRegs = Op.getRegisterAliasing().sourceBits(); assert(AllowedRegs.size() == ForbiddenRegs.size()); for (auto I : ForbiddenRegs.set_bits()) AllowedRegs.reset(I); - AssignedValue = llvm::MCOperand::createReg(randomBit(AllowedRegs)); + AssignedValue = MCOperand::createReg(randomBit(AllowedRegs)); break; } default: @@ -115,8 +115,7 @@ static_assert(std::is_podCpuName) != CpuName) { + if (Found == CpuPfmCounters.end() || StringRef(Found->CpuName) != CpuName) { // Use the default. if (CpuPfmCounters.begin() != CpuPfmCounters.end() && CpuPfmCounters.begin()->CpuName[0] == '\0') { @@ -149,13 +147,12 @@ public: ExegesisDefaultTarget() : ExegesisTarget({}) {} private: - std::vector setRegTo(const llvm::MCSubtargetInfo &STI, - unsigned Reg, - const llvm::APInt &Value) const override { + std::vector setRegTo(const MCSubtargetInfo &STI, unsigned Reg, + const APInt &Value) const override { llvm_unreachable("Not yet implemented"); } - bool matchesArch(llvm::Triple::ArchType Arch) const override { + bool matchesArch(Triple::ArchType Arch) const override { llvm_unreachable("never called"); return false; } Modified: llvm/trunk/tools/llvm-exegesis/lib/Target.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-exegesis/lib/Target.h?rev=374158&r1=374157&r2=374158&view=diff ============================================================================== --- llvm/trunk/tools/llvm-exegesis/lib/Target.h (original) +++ llvm/trunk/tools/llvm-exegesis/lib/Target.h Wed Oct 9 04:58:42 2019 @@ -9,7 +9,7 @@ /// \file /// /// Classes that handle the creation of target-specific objects. This is -/// similar to llvm::Target/TargetRegistry. +/// similar to Target/TargetRegistry. 
/// //===----------------------------------------------------------------------===// @@ -56,31 +56,26 @@ struct PfmCountersInfo { struct CpuAndPfmCounters { const char *CpuName; const PfmCountersInfo *PCI; - bool operator<(llvm::StringRef S) const { - return llvm::StringRef(CpuName) < S; - } + bool operator<(StringRef S) const { return StringRef(CpuName) < S; } }; class ExegesisTarget { public: - explicit ExegesisTarget(llvm::ArrayRef CpuPfmCounters) + explicit ExegesisTarget(ArrayRef CpuPfmCounters) : CpuPfmCounters(CpuPfmCounters) {} // Targets can use this to add target-specific passes in assembleToStream(); - virtual void addTargetSpecificPasses(llvm::PassManagerBase &PM) const {} + virtual void addTargetSpecificPasses(PassManagerBase &PM) const {} // Generates code to move a constant into a the given register. // Precondition: Value must fit into Reg. - virtual std::vector - setRegTo(const llvm::MCSubtargetInfo &STI, unsigned Reg, - const llvm::APInt &Value) const = 0; + virtual std::vector setRegTo(const MCSubtargetInfo &STI, unsigned Reg, + const APInt &Value) const = 0; // Returns the register pointing to scratch memory, or 0 if this target // does not support memory operands. The benchmark function uses the // default calling convention. - virtual unsigned getScratchMemoryRegister(const llvm::Triple &) const { - return 0; - } + virtual unsigned getScratchMemoryRegister(const Triple &) const { return 0; } // Fills memory operands with references to the address at [Reg] + Offset. virtual void fillMemoryOperands(InstructionTemplate &IT, unsigned Reg, @@ -90,14 +85,12 @@ public: } // Returns a counter usable as a loop counter. 
-  virtual unsigned getLoopCounterRegister(const llvm::Triple &) const {
-    return 0;
-  }
+  virtual unsigned getLoopCounterRegister(const Triple &) const { return 0; }
   // Adds the code to decrement the loop counter and
   virtual void decrementLoopCounterAndJump(MachineBasicBlock &MBB,
                                            MachineBasicBlock &TargetMBB,
-                                           const llvm::MCInstrInfo &MII) const {
+                                           const MCInstrInfo &MII) const {
     llvm_unreachable("decrementLoopCounterAndBranch() requires "
                      "getLoopCounterRegister() > 0");
   }
@@ -119,8 +112,8 @@ public:
   // The target is responsible for handling any operand
   // starting from OPERAND_FIRST_TARGET.
   virtual void randomizeMCOperand(const Instruction &Instr, const Variable &Var,
-                                  llvm::MCOperand &AssignedValue,
-                                  const llvm::BitVector &ForbiddenRegs) const;
+                                  MCOperand &AssignedValue,
+                                  const BitVector &ForbiddenRegs) const;
   // Creates a snippet generator for the given mode.
   std::unique_ptr
@@ -134,7 +127,7 @@ public:
   // Returns the ExegesisTarget for the given triple or nullptr if the target
   // does not exist.
-  static const ExegesisTarget *lookup(llvm::Triple TT);
+  static const ExegesisTarget *lookup(Triple TT);
   // Returns the default (unspecialized) ExegesisTarget.
   static const ExegesisTarget &getDefault();
   // Registers a target. Not thread safe.
@@ -144,10 +137,10 @@ public:
   // Returns the Pfm counters for the given CPU (or the default if no pfm
   // counters are defined for this CPU).
-  const PfmCountersInfo &getPfmCounters(llvm::StringRef CpuName) const;
+  const PfmCountersInfo &getPfmCounters(StringRef CpuName) const;
 private:
-  virtual bool matchesArch(llvm::Triple::ArchType Arch) const = 0;
+  virtual bool matchesArch(Triple::ArchType Arch) const = 0;
   // Targets can implement their own snippet generators/benchmarks runners by
   // implementing these.
@@ -161,7 +154,7 @@ private:
       const LLVMState &State) const;
   const ExegesisTarget *Next = nullptr;
-  const llvm::ArrayRef CpuPfmCounters;
+  const ArrayRef CpuPfmCounters;
 };
 } // namespace exegesis

Modified: llvm/trunk/tools/llvm-exegesis/lib/Uops.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-exegesis/lib/Uops.cpp?rev=374158&r1=374157&r2=374158&view=diff
==============================================================================
--- llvm/trunk/tools/llvm-exegesis/lib/Uops.cpp (original)
+++ llvm/trunk/tools/llvm-exegesis/lib/Uops.cpp Wed Oct 9 04:58:42 2019
@@ -80,9 +80,9 @@ namespace llvm {
 namespace exegesis {
-static llvm::SmallVector
+static SmallVector
 getVariablesWithTiedOperands(const Instruction &Instr) {
-  llvm::SmallVector Result;
+  SmallVector Result;
   for (const auto &Var : Instr.Variables)
     if (Var.hasTiedOperands())
       Result.push_back(&Var);
@@ -145,7 +145,7 @@ static std::vector
       return Instructions;
     }
     TmpIT.getValueFor(*TiedVariables[VarId]) =
-        llvm::MCOperand::createReg(NextPossibleReg);
+        MCOperand::createReg(NextPossibleReg);
     // Bump iterator.
     Iterators[VarId] = NextPossibleReg;
     // Prevent other variables from using the register.
@@ -157,8 +157,7 @@ static std::vector
   }
 }
-llvm::Expected>
-UopsSnippetGenerator::generateCodeTemplates(
+Expected> UopsSnippetGenerator::generateCodeTemplates(
     const Instruction &Instr, const BitVector &ForbiddenRegisters) const {
   CodeTemplate CT;
   CT.ScratchSpacePointerInReg =
@@ -189,7 +188,7 @@ UopsSnippetGenerator::generateCodeTempla
     return getSingleton(std::move(CT));
   }
   // No tied variables, we pick random values for defs.
-  llvm::BitVector Defs(State.getRegInfo().getNumRegs());
+  BitVector Defs(State.getRegInfo().getNumRegs());
   for (const auto &Op : Instr.Operands) {
     if (Op.isReg() && Op.isExplicit() && Op.isDef() && !Op.isMemory()) {
       auto PossibleRegisters = Op.getRegisterAliasing().sourceBits();
@@ -198,7 +197,7 @@ UopsSnippetGenerator::generateCodeTempla
       assert(PossibleRegisters.any() && "No register left to choose from");
       const auto RandomReg = randomBit(PossibleRegisters);
       Defs.set(RandomReg);
-      IT.getValueFor(Op) = llvm::MCOperand::createReg(RandomReg);
+      IT.getValueFor(Op) = MCOperand::createReg(RandomReg);
     }
   }
   // And pick random use values that are not reserved and don't alias with defs.
@@ -210,7 +209,7 @@ UopsSnippetGenerator::generateCodeTempla
       remove(PossibleRegisters, DefAliases);
       assert(PossibleRegisters.any() && "No register left to choose from");
       const auto RandomReg = randomBit(PossibleRegisters);
-      IT.getValueFor(Op) = llvm::MCOperand::createReg(RandomReg);
+      IT.getValueFor(Op) = MCOperand::createReg(RandomReg);
     }
   }
   CT.Info =
@@ -220,7 +219,7 @@ UopsSnippetGenerator::generateCodeTempla
   return getSingleton(std::move(CT));
 }
-llvm::Expected>
+Expected>
 UopsBenchmarkRunner::runMeasurements(const FunctionExecutor &Executor) const {
   std::vector Result;
   const PfmCountersInfo &PCI = State.getPfmCounters();

Modified: llvm/trunk/tools/llvm-exegesis/lib/Uops.h
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-exegesis/lib/Uops.h?rev=374158&r1=374157&r2=374158&view=diff
==============================================================================
--- llvm/trunk/tools/llvm-exegesis/lib/Uops.h (original)
+++ llvm/trunk/tools/llvm-exegesis/lib/Uops.h Wed Oct 9 04:58:42 2019
@@ -25,7 +25,7 @@ public:
   using SnippetGenerator::SnippetGenerator;
   ~UopsSnippetGenerator() override;
-  llvm::Expected>
+  Expected>
   generateCodeTemplates(const Instruction &Instr,
                         const BitVector &ForbiddenRegisters) const override;
@@ -69,7 +69,7 @@ public:
   static constexpr const size_t
kMinNumDifferentAddresses = 6;
 private:
-  llvm::Expected>
+  Expected>
   runMeasurements(const FunctionExecutor &Executor) const override;
 };

Modified: llvm/trunk/tools/llvm-exegesis/lib/X86/Target.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-exegesis/lib/X86/Target.cpp?rev=374158&r1=374157&r2=374158&view=diff
==============================================================================
--- llvm/trunk/tools/llvm-exegesis/lib/X86/Target.cpp (original)
+++ llvm/trunk/tools/llvm-exegesis/lib/X86/Target.cpp Wed Oct 9 04:58:42 2019
@@ -148,34 +148,31 @@ static Error isInvalidMemoryInstr(const
   }
 }
-static llvm::Error IsInvalidOpcode(const Instruction &Instr) {
+static Error IsInvalidOpcode(const Instruction &Instr) {
   const auto OpcodeName = Instr.Name;
   if ((Instr.Description->TSFlags & X86II::FormMask) == X86II::Pseudo)
-    return llvm::make_error("unsupported opcode: pseudo instruction");
+    return make_error("unsupported opcode: pseudo instruction");
   if (OpcodeName.startswith("POPF") || OpcodeName.startswith("PUSHF") ||
       OpcodeName.startswith("ADJCALLSTACK"))
-    return llvm::make_error(
-        "unsupported opcode: Push/Pop/AdjCallStack");
-  if (llvm::Error Error = isInvalidMemoryInstr(Instr))
+    return make_error("unsupported opcode: Push/Pop/AdjCallStack");
+  if (Error Error = isInvalidMemoryInstr(Instr))
     return Error;
   // We do not handle instructions with OPERAND_PCREL.
   for (const Operand &Op : Instr.Operands)
     if (Op.isExplicit() &&
-        Op.getExplicitOperandInfo().OperandType == llvm::MCOI::OPERAND_PCREL)
-      return llvm::make_error(
-          "unsupported opcode: PC relative operand");
+        Op.getExplicitOperandInfo().OperandType == MCOI::OPERAND_PCREL)
+      return make_error("unsupported opcode: PC relative operand");
   // We do not handle second-form X87 instructions. We only handle first-form
   // ones (_Fp), see comment in X86InstrFPStack.td.
   for (const Operand &Op : Instr.Operands)
     if (Op.isReg() && Op.isExplicit() &&
-        Op.getExplicitOperandInfo().RegClass == llvm::X86::RSTRegClassID)
-      return llvm::make_error(
-          "unsupported second-form X87 instruction");
-  return llvm::Error::success();
+        Op.getExplicitOperandInfo().RegClass == X86::RSTRegClassID)
+      return make_error("unsupported second-form X87 instruction");
+  return Error::success();
 }
 static unsigned getX86FPFlags(const Instruction &Instr) {
-  return Instr.Description->TSFlags & llvm::X86II::FPTypeMask;
+  return Instr.Description->TSFlags & X86II::FPTypeMask;
 }
 // Helper to fill a memory operand with a value.
@@ -188,7 +185,7 @@ static void setMemOp(InstructionTemplate
 // Common (latency, uops) code for LEA templates. `GetDestReg` takes the
 // addressing base and index registers and returns the LEA destination register.
-static llvm::Expected> generateLEATemplatesCommon(
+static Expected> generateLEATemplatesCommon(
     const Instruction &Instr, const BitVector &ForbiddenRegisters,
     const LLVMState &State, const SnippetGenerator::Options &Opts,
     std::function GetDestReg) {
@@ -249,13 +246,13 @@ class X86LatencySnippetGenerator : publi
 public:
   using LatencySnippetGenerator::LatencySnippetGenerator;
-  llvm::Expected>
+  Expected>
   generateCodeTemplates(const Instruction &Instr,
                         const BitVector &ForbiddenRegisters) const override;
 };
 } // namespace
-llvm::Expected>
+Expected>
 X86LatencySnippetGenerator::generateCodeTemplates(
     const Instruction &Instr, const BitVector &ForbiddenRegisters) const {
   if (auto E = IsInvalidOpcode(Instr))
@@ -273,17 +270,17 @@ X86LatencySnippetGenerator::generateCode
   }
   switch (getX86FPFlags(Instr)) {
-  case llvm::X86II::NotFP:
+  case X86II::NotFP:
     return LatencySnippetGenerator::generateCodeTemplates(Instr,
                                                           ForbiddenRegisters);
-  case llvm::X86II::ZeroArgFP:
-  case llvm::X86II::OneArgFP:
-  case llvm::X86II::SpecialFP:
-  case llvm::X86II::CompareFP:
-  case llvm::X86II::CondMovFP:
-    return llvm::make_error("Unsupported x87
Instruction");
-  case llvm::X86II::OneArgFPRW:
-  case llvm::X86II::TwoArgFP:
+  case X86II::ZeroArgFP:
+  case X86II::OneArgFP:
+  case X86II::SpecialFP:
+  case X86II::CompareFP:
+  case X86II::CondMovFP:
+    return make_error("Unsupported x87 Instruction");
+  case X86II::OneArgFPRW:
+  case X86II::TwoArgFP:
     // These are instructions like
     // - `ST(0) = fsqrt(ST(0))` (OneArgFPRW)
     // - `ST(0) = ST(0) + ST(i)` (TwoArgFP)
@@ -299,14 +296,14 @@ class X86UopsSnippetGenerator : public U
 public:
   using UopsSnippetGenerator::UopsSnippetGenerator;
-  llvm::Expected>
+  Expected>
   generateCodeTemplates(const Instruction &Instr,
                         const BitVector &ForbiddenRegisters) const override;
 };
 } // namespace
-llvm::Expected>
+Expected>
 X86UopsSnippetGenerator::generateCodeTemplates(
     const Instruction &Instr, const BitVector &ForbiddenRegisters) const {
   if (auto E = IsInvalidOpcode(Instr))
@@ -335,23 +332,23 @@ X86UopsSnippetGenerator::generateCodeTem
   }
   switch (getX86FPFlags(Instr)) {
-  case llvm::X86II::NotFP:
+  case X86II::NotFP:
     return UopsSnippetGenerator::generateCodeTemplates(Instr,
                                                        ForbiddenRegisters);
-  case llvm::X86II::ZeroArgFP:
-  case llvm::X86II::OneArgFP:
-  case llvm::X86II::SpecialFP:
-    return llvm::make_error("Unsupported x87 Instruction");
-  case llvm::X86II::OneArgFPRW:
-  case llvm::X86II::TwoArgFP:
+  case X86II::ZeroArgFP:
+  case X86II::OneArgFP:
+  case X86II::SpecialFP:
+    return make_error("Unsupported x87 Instruction");
+  case X86II::OneArgFPRW:
+  case X86II::TwoArgFP:
     // These are instructions like
     // - `ST(0) = fsqrt(ST(0))` (OneArgFPRW)
     // - `ST(0) = ST(0) + ST(i)` (TwoArgFP)
     // They are intrinsically serial and do not modify the state of the stack.
     // We generate the same code for latency and uops.
     return generateSelfAliasingCodeTemplates(Instr);
-  case llvm::X86II::CompareFP:
-  case llvm::X86II::CondMovFP:
+  case X86II::CompareFP:
+  case X86II::CondMovFP:
     // We can compute uops for any FP instruction that does not grow or shrink
     // the stack (either do not touch the stack or push as much as they pop).
     return generateUnconstrainedCodeTemplates(
@@ -364,66 +361,66 @@ X86UopsSnippetGenerator::generateCodeTem
 static unsigned getLoadImmediateOpcode(unsigned RegBitWidth) {
   switch (RegBitWidth) {
   case 8:
-    return llvm::X86::MOV8ri;
+    return X86::MOV8ri;
   case 16:
-    return llvm::X86::MOV16ri;
+    return X86::MOV16ri;
   case 32:
-    return llvm::X86::MOV32ri;
+    return X86::MOV32ri;
   case 64:
-    return llvm::X86::MOV64ri;
+    return X86::MOV64ri;
   }
   llvm_unreachable("Invalid Value Width");
 }
 // Generates instruction to load an immediate value into a register.
-static llvm::MCInst loadImmediate(unsigned Reg, unsigned RegBitWidth,
-                                  const llvm::APInt &Value) {
+static MCInst loadImmediate(unsigned Reg, unsigned RegBitWidth,
+                            const APInt &Value) {
   if (Value.getBitWidth() > RegBitWidth)
     llvm_unreachable("Value must fit in the Register");
-  return llvm::MCInstBuilder(getLoadImmediateOpcode(RegBitWidth))
+  return MCInstBuilder(getLoadImmediateOpcode(RegBitWidth))
       .addReg(Reg)
       .addImm(Value.getZExtValue());
 }
 // Allocates scratch memory on the stack.
-static llvm::MCInst allocateStackSpace(unsigned Bytes) {
-  return llvm::MCInstBuilder(llvm::X86::SUB64ri8)
-      .addReg(llvm::X86::RSP)
-      .addReg(llvm::X86::RSP)
+static MCInst allocateStackSpace(unsigned Bytes) {
+  return MCInstBuilder(X86::SUB64ri8)
+      .addReg(X86::RSP)
+      .addReg(X86::RSP)
       .addImm(Bytes);
 }
 // Fills scratch memory at offset `OffsetBytes` with value `Imm`.
-static llvm::MCInst fillStackSpace(unsigned MovOpcode, unsigned OffsetBytes,
-                                   uint64_t Imm) {
-  return llvm::MCInstBuilder(MovOpcode)
+static MCInst fillStackSpace(unsigned MovOpcode, unsigned OffsetBytes,
+                             uint64_t Imm) {
+  return MCInstBuilder(MovOpcode)
       // Address = ESP
-      .addReg(llvm::X86::RSP) // BaseReg
-      .addImm(1) // ScaleAmt
-      .addReg(0) // IndexReg
-      .addImm(OffsetBytes) // Disp
-      .addReg(0) // Segment
+      .addReg(X86::RSP) // BaseReg
+      .addImm(1) // ScaleAmt
+      .addReg(0) // IndexReg
+      .addImm(OffsetBytes) // Disp
+      .addReg(0) // Segment
       // Immediate.
       .addImm(Imm);
 }
 // Loads scratch memory into register `Reg` using opcode `RMOpcode`.
-static llvm::MCInst loadToReg(unsigned Reg, unsigned RMOpcode) {
-  return llvm::MCInstBuilder(RMOpcode)
+static MCInst loadToReg(unsigned Reg, unsigned RMOpcode) {
+  return MCInstBuilder(RMOpcode)
       .addReg(Reg)
       // Address = ESP
-      .addReg(llvm::X86::RSP) // BaseReg
-      .addImm(1) // ScaleAmt
-      .addReg(0) // IndexReg
-      .addImm(0) // Disp
-      .addReg(0); // Segment
+      .addReg(X86::RSP) // BaseReg
+      .addImm(1) // ScaleAmt
+      .addReg(0) // IndexReg
+      .addImm(0) // Disp
+      .addReg(0); // Segment
 }
 // Releases scratch memory.
-static llvm::MCInst releaseStackSpace(unsigned Bytes) {
-  return llvm::MCInstBuilder(llvm::X86::ADD64ri8)
-      .addReg(llvm::X86::RSP)
-      .addReg(llvm::X86::RSP)
+static MCInst releaseStackSpace(unsigned Bytes) {
+  return MCInstBuilder(X86::ADD64ri8)
+      .addReg(X86::RSP)
+      .addReg(X86::RSP)
       .addImm(Bytes);
 }
@@ -431,19 +428,19 @@ static llvm::MCInst releaseStackSpace(un
 // constant and provide methods to load the stack value into a register.
 namespace {
 struct ConstantInliner {
-  explicit ConstantInliner(const llvm::APInt &Constant) : Constant_(Constant) {}
+  explicit ConstantInliner(const APInt &Constant) : Constant_(Constant) {}
-  std::vector loadAndFinalize(unsigned Reg, unsigned RegBitWidth,
-                              unsigned Opcode);
+  std::vector loadAndFinalize(unsigned Reg, unsigned RegBitWidth,
+                              unsigned Opcode);
-  std::vector loadX87STAndFinalize(unsigned Reg);
+  std::vector loadX87STAndFinalize(unsigned Reg);
-  std::vector loadX87FPAndFinalize(unsigned Reg);
+  std::vector loadX87FPAndFinalize(unsigned Reg);
-  std::vector popFlagAndFinalize();
+  std::vector popFlagAndFinalize();
 private:
-  ConstantInliner &add(const llvm::MCInst &Inst) {
+  ConstantInliner &add(const MCInst &Inst) {
     Instructions.push_back(Inst);
     return *this;
   }
@@ -452,14 +449,14 @@ private:
   static constexpr const unsigned kF80Bytes = 10; // 80 bits.
-  llvm::APInt Constant_;
-  std::vector Instructions;
+  APInt Constant_;
+  std::vector Instructions;
 };
 } // namespace
-std::vector ConstantInliner::loadAndFinalize(unsigned Reg,
-                                             unsigned RegBitWidth,
-                                             unsigned Opcode) {
+std::vector ConstantInliner::loadAndFinalize(unsigned Reg,
+                                             unsigned RegBitWidth,
+                                             unsigned Opcode) {
   assert((RegBitWidth & 7) == 0 && "RegBitWidth must be a multiple of 8 bits");
   initStack(RegBitWidth / 8);
   add(loadToReg(Reg, Opcode));
@@ -467,62 +464,62 @@ std::vector ConstantInline
   return std::move(Instructions);
 }
-std::vector ConstantInliner::loadX87STAndFinalize(unsigned Reg) {
+std::vector ConstantInliner::loadX87STAndFinalize(unsigned Reg) {
   initStack(kF80Bytes);
-  add(llvm::MCInstBuilder(llvm::X86::LD_F80m)
+  add(MCInstBuilder(X86::LD_F80m)
           // Address = ESP
-          .addReg(llvm::X86::RSP) // BaseReg
-          .addImm(1) // ScaleAmt
-          .addReg(0) // IndexReg
-          .addImm(0) // Disp
-          .addReg(0)); // Segment
-  if (Reg != llvm::X86::ST0)
-    add(llvm::MCInstBuilder(llvm::X86::ST_Frr).addReg(Reg));
+          .addReg(X86::RSP) // BaseReg
+          .addImm(1) // ScaleAmt
+          .addReg(0) // IndexReg
+          .addImm(0) //
Disp
+          .addReg(0)); // Segment
+  if (Reg != X86::ST0)
+    add(MCInstBuilder(X86::ST_Frr).addReg(Reg));
   add(releaseStackSpace(kF80Bytes));
   return std::move(Instructions);
 }
-std::vector ConstantInliner::loadX87FPAndFinalize(unsigned Reg) {
+std::vector ConstantInliner::loadX87FPAndFinalize(unsigned Reg) {
   initStack(kF80Bytes);
-  add(llvm::MCInstBuilder(llvm::X86::LD_Fp80m)
+  add(MCInstBuilder(X86::LD_Fp80m)
           .addReg(Reg)
           // Address = ESP
-          .addReg(llvm::X86::RSP) // BaseReg
-          .addImm(1) // ScaleAmt
-          .addReg(0) // IndexReg
-          .addImm(0) // Disp
-          .addReg(0)); // Segment
+          .addReg(X86::RSP) // BaseReg
+          .addImm(1) // ScaleAmt
+          .addReg(0) // IndexReg
+          .addImm(0) // Disp
+          .addReg(0)); // Segment
   add(releaseStackSpace(kF80Bytes));
   return std::move(Instructions);
 }
-std::vector ConstantInliner::popFlagAndFinalize() {
+std::vector ConstantInliner::popFlagAndFinalize() {
   initStack(8);
-  add(llvm::MCInstBuilder(llvm::X86::POPF64));
+  add(MCInstBuilder(X86::POPF64));
   return std::move(Instructions);
 }
 void ConstantInliner::initStack(unsigned Bytes) {
   assert(Constant_.getBitWidth() <= Bytes * 8 &&
          "Value does not have the correct size");
-  const llvm::APInt WideConstant = Constant_.getBitWidth() < Bytes * 8
-                                       ? Constant_.sext(Bytes * 8)
-                                       : Constant_;
+  const APInt WideConstant = Constant_.getBitWidth() < Bytes * 8
+                                 ?
Constant_.sext(Bytes * 8)
+                                 : Constant_;
   add(allocateStackSpace(Bytes));
   size_t ByteOffset = 0;
   for (; Bytes - ByteOffset >= 4; ByteOffset += 4)
     add(fillStackSpace(
-        llvm::X86::MOV32mi, ByteOffset,
+        X86::MOV32mi, ByteOffset,
         WideConstant.extractBits(32, ByteOffset * 8).getZExtValue()));
   if (Bytes - ByteOffset >= 2) {
     add(fillStackSpace(
-        llvm::X86::MOV16mi, ByteOffset,
+        X86::MOV16mi, ByteOffset,
         WideConstant.extractBits(16, ByteOffset * 8).getZExtValue()));
     ByteOffset += 2;
   }
   if (Bytes - ByteOffset >= 1)
     add(fillStackSpace(
-        llvm::X86::MOV8mi, ByteOffset,
+        X86::MOV8mi, ByteOffset,
         WideConstant.extractBits(8, ByteOffset * 8).getZExtValue()));
 }
@@ -534,28 +531,27 @@ public:
   ExegesisX86Target() : ExegesisTarget(X86CpuPfmCounters) {}
 private:
-  void addTargetSpecificPasses(llvm::PassManagerBase &PM) const override;
+  void addTargetSpecificPasses(PassManagerBase &PM) const override;
-  unsigned getScratchMemoryRegister(const llvm::Triple &TT) const override;
+  unsigned getScratchMemoryRegister(const Triple &TT) const override;
-  unsigned getLoopCounterRegister(const llvm::Triple &) const override;
+  unsigned getLoopCounterRegister(const Triple &) const override;
   unsigned getMaxMemoryAccessSize() const override { return 64; }
   void randomizeMCOperand(const Instruction &Instr, const Variable &Var,
-                          llvm::MCOperand &AssignedValue,
-                          const llvm::BitVector &ForbiddenRegs) const override;
+                          MCOperand &AssignedValue,
+                          const BitVector &ForbiddenRegs) const override;
   void fillMemoryOperands(InstructionTemplate &IT, unsigned Reg,
                           unsigned Offset) const override;
   void decrementLoopCounterAndJump(MachineBasicBlock &MBB,
                                    MachineBasicBlock &TargetMBB,
-                                   const llvm::MCInstrInfo &MII) const override;
+                                   const MCInstrInfo &MII) const override;
-  std::vector setRegTo(const llvm::MCSubtargetInfo &STI,
-                       unsigned Reg,
-                       const llvm::APInt &Value) const override;
+  std::vector setRegTo(const MCSubtargetInfo &STI, unsigned Reg,
+                       const APInt &Value) const override;
   ArrayRef
getUnavailableRegisters() const override {
     return makeArrayRef(kUnavailableRegisters,
@@ -575,8 +571,8 @@ private:
     return std::make_unique(State, Opts);
   }
-  bool matchesArch(llvm::Triple::ArchType Arch) const override {
-    return Arch == llvm::Triple::x86_64 || Arch == llvm::Triple::x86;
+  bool matchesArch(Triple::ArchType Arch) const override {
+    return Arch == Triple::x86_64 || Arch == Triple::x86;
   }
   static const unsigned kUnavailableRegisters[4];
@@ -594,24 +590,21 @@ constexpr const unsigned kLoopCounterReg
 } // namespace
-void ExegesisX86Target::addTargetSpecificPasses(
-    llvm::PassManagerBase &PM) const {
+void ExegesisX86Target::addTargetSpecificPasses(PassManagerBase &PM) const {
   // Lowers FP pseudo-instructions, e.g. ABS_Fp32 -> ABS_F.
-  PM.add(llvm::createX86FloatingPointStackifierPass());
+  PM.add(createX86FloatingPointStackifierPass());
 }
-unsigned
-ExegesisX86Target::getScratchMemoryRegister(const llvm::Triple &TT) const {
+unsigned ExegesisX86Target::getScratchMemoryRegister(const Triple &TT) const {
   if (!TT.isArch64Bit()) {
     // FIXME: This would require popping from the stack, so we would have to
     // add some additional setup code.
     return 0;
   }
-  return TT.isOSWindows() ? llvm::X86::RCX : llvm::X86::RDI;
+  return TT.isOSWindows() ?
X86::RCX : X86::RDI;
 }
-unsigned
-ExegesisX86Target::getLoopCounterRegister(const llvm::Triple &TT) const {
+unsigned ExegesisX86Target::getLoopCounterRegister(const Triple &TT) const {
   if (!TT.isArch64Bit()) {
     return 0;
   }
@@ -619,16 +612,15 @@ ExegesisX86Target::getLoopCounterRegiste
 }
 void ExegesisX86Target::randomizeMCOperand(
-    const Instruction &Instr, const Variable &Var,
-    llvm::MCOperand &AssignedValue,
-    const llvm::BitVector &ForbiddenRegs) const {
+    const Instruction &Instr, const Variable &Var, MCOperand &AssignedValue,
+    const BitVector &ForbiddenRegs) const {
   ExegesisTarget::randomizeMCOperand(Instr, Var, AssignedValue, ForbiddenRegs);
   const Operand &Op = Instr.getPrimaryOperand(Var);
   switch (Op.getExplicitOperandInfo().OperandType) {
-  case llvm::X86::OperandType::OPERAND_COND_CODE:
-    AssignedValue = llvm::MCOperand::createImm(
-        randomIndex(llvm::X86::CondCode::LAST_VALID_COND));
+  case X86::OperandType::OPERAND_COND_CODE:
+    AssignedValue =
+        MCOperand::createImm(randomIndex(X86::CondCode::LAST_VALID_COND));
     break;
   default:
     break;
@@ -658,7 +650,7 @@ void ExegesisX86Target::fillMemoryOperan
 void ExegesisX86Target::decrementLoopCounterAndJump(
     MachineBasicBlock &MBB, MachineBasicBlock &TargetMBB,
-    const llvm::MCInstrInfo &MII) const {
+    const MCInstrInfo &MII) const {
   BuildMI(&MBB, DebugLoc(), MII.get(X86::ADD64ri8))
       .addDef(kLoopCounterReg)
       .addUse(kLoopCounterReg)
@@ -668,45 +660,44 @@ void ExegesisX86Target::decrementLoopCou
       .addImm(X86::COND_NE);
 }
-std::vector
-ExegesisX86Target::setRegTo(const llvm::MCSubtargetInfo &STI, unsigned Reg,
-                            const llvm::APInt &Value) const {
-  if (llvm::X86::GR8RegClass.contains(Reg))
+std::vector ExegesisX86Target::setRegTo(const MCSubtargetInfo &STI,
+                                        unsigned Reg,
+                                        const APInt &Value) const {
+  if (X86::GR8RegClass.contains(Reg))
     return {loadImmediate(Reg, 8, Value)};
-  if (llvm::X86::GR16RegClass.contains(Reg))
+  if (X86::GR16RegClass.contains(Reg))
     return {loadImmediate(Reg, 16, Value)};
-  if
(llvm::X86::GR32RegClass.contains(Reg))
+  if (X86::GR32RegClass.contains(Reg))
     return {loadImmediate(Reg, 32, Value)};
-  if (llvm::X86::GR64RegClass.contains(Reg))
+  if (X86::GR64RegClass.contains(Reg))
     return {loadImmediate(Reg, 64, Value)};
   ConstantInliner CI(Value);
-  if (llvm::X86::VR64RegClass.contains(Reg))
-    return CI.loadAndFinalize(Reg, 64, llvm::X86::MMX_MOVQ64rm);
-  if (llvm::X86::VR128XRegClass.contains(Reg)) {
-    if (STI.getFeatureBits()[llvm::X86::FeatureAVX512])
-      return CI.loadAndFinalize(Reg, 128, llvm::X86::VMOVDQU32Z128rm);
-    if (STI.getFeatureBits()[llvm::X86::FeatureAVX])
-      return CI.loadAndFinalize(Reg, 128, llvm::X86::VMOVDQUrm);
-    return CI.loadAndFinalize(Reg, 128, llvm::X86::MOVDQUrm);
-  }
-  if (llvm::X86::VR256XRegClass.contains(Reg)) {
-    if (STI.getFeatureBits()[llvm::X86::FeatureAVX512])
-      return CI.loadAndFinalize(Reg, 256, llvm::X86::VMOVDQU32Z256rm);
-    if (STI.getFeatureBits()[llvm::X86::FeatureAVX])
-      return CI.loadAndFinalize(Reg, 256, llvm::X86::VMOVDQUYrm);
-  }
-  if (llvm::X86::VR512RegClass.contains(Reg))
-    if (STI.getFeatureBits()[llvm::X86::FeatureAVX512])
-      return CI.loadAndFinalize(Reg, 512, llvm::X86::VMOVDQU32Zrm);
-  if (llvm::X86::RSTRegClass.contains(Reg)) {
+  if (X86::VR64RegClass.contains(Reg))
+    return CI.loadAndFinalize(Reg, 64, X86::MMX_MOVQ64rm);
+  if (X86::VR128XRegClass.contains(Reg)) {
+    if (STI.getFeatureBits()[X86::FeatureAVX512])
+      return CI.loadAndFinalize(Reg, 128, X86::VMOVDQU32Z128rm);
+    if (STI.getFeatureBits()[X86::FeatureAVX])
+      return CI.loadAndFinalize(Reg, 128, X86::VMOVDQUrm);
+    return CI.loadAndFinalize(Reg, 128, X86::MOVDQUrm);
+  }
+  if (X86::VR256XRegClass.contains(Reg)) {
+    if (STI.getFeatureBits()[X86::FeatureAVX512])
+      return CI.loadAndFinalize(Reg, 256, X86::VMOVDQU32Z256rm);
+    if (STI.getFeatureBits()[X86::FeatureAVX])
+      return CI.loadAndFinalize(Reg, 256, X86::VMOVDQUYrm);
+  }
+  if (X86::VR512RegClass.contains(Reg))
+    if (STI.getFeatureBits()[X86::FeatureAVX512])
+      return
CI.loadAndFinalize(Reg, 512, X86::VMOVDQU32Zrm);
+  if (X86::RSTRegClass.contains(Reg)) {
     return CI.loadX87STAndFinalize(Reg);
   }
-  if (llvm::X86::RFP32RegClass.contains(Reg) ||
-      llvm::X86::RFP64RegClass.contains(Reg) ||
-      llvm::X86::RFP80RegClass.contains(Reg)) {
+  if (X86::RFP32RegClass.contains(Reg) || X86::RFP64RegClass.contains(Reg) ||
+      X86::RFP80RegClass.contains(Reg)) {
     return CI.loadX87FPAndFinalize(Reg);
   }
-  if (Reg == llvm::X86::EFLAGS)
+  if (Reg == X86::EFLAGS)
     return CI.popFlagAndFinalize();
   return {}; // Not yet implemented.
 }

Modified: llvm/trunk/tools/llvm-exegesis/llvm-exegesis.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-exegesis/llvm-exegesis.cpp?rev=374158&r1=374157&r2=374158&view=diff
==============================================================================
--- llvm/trunk/tools/llvm-exegesis/llvm-exegesis.cpp (original)
+++ llvm/trunk/tools/llvm-exegesis/llvm-exegesis.cpp Wed Oct 9 04:58:42 2019
@@ -165,13 +165,12 @@ static ExitOnError ExitOnErr;
 // Checks that only one of OpcodeNames, OpcodeIndex or SnippetsFile is provided,
 // and returns the opcode indices or {} if snippets should be read from
 // `SnippetsFile`.
-static std::vector
-getOpcodesOrDie(const llvm::MCInstrInfo &MCInstrInfo) {
+static std::vector getOpcodesOrDie(const MCInstrInfo &MCInstrInfo) {
   const size_t NumSetFlags = (OpcodeNames.empty() ? 0 : 1) +
                              (OpcodeIndex == 0 ? 0 : 1) +
                              (SnippetsFile.empty() ? 0 : 1);
   if (NumSetFlags != 1)
-    llvm::report_fatal_error(
+    report_fatal_error(
         "please provide one and only one of 'opcode-index', 'opcode-name' or "
         "'snippets-file'");
   if (!SnippetsFile.empty())
@@ -185,33 +184,31 @@ getOpcodesOrDie(const llvm::MCInstrInfo
     return Result;
   }
   // Resolve opcode name -> opcode.
-  const auto ResolveName =
-      [&MCInstrInfo](llvm::StringRef OpcodeName) -> unsigned {
+  const auto ResolveName = [&MCInstrInfo](StringRef OpcodeName) -> unsigned {
     for (unsigned I = 1, E = MCInstrInfo.getNumOpcodes(); I < E; ++I)
       if (MCInstrInfo.getName(I) == OpcodeName)
         return I;
     return 0u;
   };
-  llvm::SmallVector Pieces;
-  llvm::StringRef(OpcodeNames.getValue())
+  SmallVector Pieces;
+  StringRef(OpcodeNames.getValue())
       .split(Pieces, ",", /* MaxSplit */ -1, /* KeepEmpty */ false);
   std::vector Result;
-  for (const llvm::StringRef OpcodeName : Pieces) {
+  for (const StringRef OpcodeName : Pieces) {
     if (unsigned Opcode = ResolveName(OpcodeName))
       Result.push_back(Opcode);
     else
-      llvm::report_fatal_error(
-          llvm::Twine("unknown opcode ").concat(OpcodeName));
+      report_fatal_error(Twine("unknown opcode ").concat(OpcodeName));
   }
   return Result;
 }
 // Generates code snippets for opcode `Opcode`.
-static llvm::Expected>
+static Expected>
 generateSnippets(const LLVMState &State, unsigned Opcode,
-                 const llvm::BitVector &ForbiddenRegs) {
+                 const BitVector &ForbiddenRegs) {
   const Instruction &Instr = State.getIC().getInstr(Opcode);
-  const llvm::MCInstrDesc &InstrDesc = *Instr.Description;
+  const MCInstrDesc &InstrDesc = *Instr.Description;
   // Ignore instructions that we cannot run.
   if (InstrDesc.isPseudo())
     return make_error("Unsupported opcode: isPseudo");
@@ -226,22 +223,22 @@ generateSnippets(const LLVMState &State,
       State.getExegesisTarget().createSnippetGenerator(BenchmarkMode, State,
                                                        Options);
   if (!Generator)
-    llvm::report_fatal_error("cannot create snippet generator");
+    report_fatal_error("cannot create snippet generator");
   return Generator->generateConfigurations(Instr, ForbiddenRegs);
 }
 void benchmarkMain() {
 #ifndef HAVE_LIBPFM
-  llvm::report_fatal_error(
+  report_fatal_error(
       "benchmarking unavailable, LLVM was built without libpfm.");
 #endif
   if (exegesis::pfm::pfmInitialize())
-    llvm::report_fatal_error("cannot initialize libpfm");
+    report_fatal_error("cannot initialize libpfm");
-  llvm::InitializeNativeTarget();
-  llvm::InitializeNativeTargetAsmPrinter();
-  llvm::InitializeNativeTargetAsmParser();
+  InitializeNativeTarget();
+  InitializeNativeTargetAsmPrinter();
+  InitializeNativeTargetAsmParser();
   InitializeNativeExegesisTarget();
   const LLVMState State(CpuName);
@@ -256,16 +253,16 @@ void benchmarkMain() {
     // -ignore-invalid-sched-class is passed.
     if (IgnoreInvalidSchedClass &&
         State.getInstrInfo().get(Opcode).getSchedClass() == 0) {
-      llvm::errs() << State.getInstrInfo().getName(Opcode)
-                   << ": ignoring instruction without sched class\n";
+      errs() << State.getInstrInfo().getName(Opcode)
+             << ": ignoring instruction without sched class\n";
       continue;
     }
     auto ConfigsForInstr =
         generateSnippets(State, Opcode, Repetitor->getReservedRegs());
     if (!ConfigsForInstr) {
-      llvm::logAllUnhandledErrors(
-          ConfigsForInstr.takeError(), llvm::errs(),
-          llvm::Twine(State.getInstrInfo().getName(Opcode)).concat(": "));
+      logAllUnhandledErrors(
+          ConfigsForInstr.takeError(), errs(),
+          Twine(State.getInstrInfo().getName(Opcode)).concat(": "));
       continue;
     }
     std::move(ConfigsForInstr->begin(), ConfigsForInstr->end(),
@@ -278,11 +275,11 @@ void benchmarkMain() {
   const std::unique_ptr Runner =
       State.getExegesisTarget().createBenchmarkRunner(BenchmarkMode, State);
   if (!Runner) {
-    llvm::report_fatal_error("cannot create benchmark runner");
+    report_fatal_error("cannot create benchmark runner");
   }
   if (NumRepetitions == 0)
-    llvm::report_fatal_error("--num-repetitions must be greater than zero");
+    report_fatal_error("--num-repetitions must be greater than zero");
   // Write to standard output if file is not set.
   if (BenchmarkFile.empty())
@@ -304,40 +301,39 @@ static void maybeRunAnalysis(const Analy
   if (OutputFilename.empty())
     return;
   if (OutputFilename != "-") {
-    llvm::errs() << "Printing " << Name << " results to file '"
-                 << OutputFilename << "'\n";
+    errs() << "Printing " << Name << " results to file '" << OutputFilename
+           << "'\n";
   }
   std::error_code ErrorCode;
-  llvm::raw_fd_ostream ClustersOS(OutputFilename, ErrorCode,
-                                  llvm::sys::fs::FA_Read |
-                                      llvm::sys::fs::FA_Write);
+  raw_fd_ostream ClustersOS(OutputFilename, ErrorCode,
+                            sys::fs::FA_Read | sys::fs::FA_Write);
   if (ErrorCode)
-    llvm::report_fatal_error("cannot open out file: " + OutputFilename);
+    report_fatal_error("cannot open out file: " + OutputFilename);
   if (auto Err = Analyzer.run(ClustersOS))
-    llvm::report_fatal_error(std::move(Err));
+    report_fatal_error(std::move(Err));
 }
 static void analysisMain() {
   if (BenchmarkFile.empty())
-    llvm::report_fatal_error("--benchmarks-file must be set.");
+    report_fatal_error("--benchmarks-file must be set.");
   if (AnalysisClustersOutputFile.empty() &&
       AnalysisInconsistenciesOutputFile.empty()) {
-    llvm::report_fatal_error(
+    report_fatal_error(
         "At least one of --analysis-clusters-output-file and "
         "--analysis-inconsistencies-output-file must be specified.");
   }
-  llvm::InitializeNativeTarget();
-  llvm::InitializeNativeTargetAsmPrinter();
-  llvm::InitializeNativeTargetDisassembler();
+  InitializeNativeTarget();
+  InitializeNativeTargetAsmPrinter();
+  InitializeNativeTargetDisassembler();
   // Read benchmarks.
   const LLVMState State("");
   const std::vector Points =
       ExitOnErr(InstructionBenchmark::readYamls(State, BenchmarkFile));
-  llvm::outs() << "Parsed " << Points.size() << " benchmark points\n";
+  outs() << "Parsed " << Points.size() << " benchmark points\n";
   if (Points.empty()) {
-    llvm::errs() << "no benchmarks to analyze\n";
+    errs() << "no benchmarks to analyze\n";
     return;
   }
   // FIXME: Check that all points have the same triple/cpu.
@@ -345,13 +341,13 @@ static void analysisMain() { std::string Error; const auto *TheTarget = - llvm::TargetRegistry::lookupTarget(Points[0].LLVMTriple, Error); + TargetRegistry::lookupTarget(Points[0].LLVMTriple, Error); if (!TheTarget) { - llvm::errs() << "unknown target '" << Points[0].LLVMTriple << "'\n"; + errs() << "unknown target '" << Points[0].LLVMTriple << "'\n"; return; } - std::unique_ptr InstrInfo(TheTarget->createMCInstrInfo()); + std::unique_ptr InstrInfo(TheTarget->createMCInstrInfo()); const auto Clustering = ExitOnErr(InstructionBenchmarkClustering::create( Points, AnalysisClusteringAlgorithm, AnalysisDbscanNumPoints, @@ -375,8 +371,8 @@ int main(int Argc, char **Argv) { using namespace llvm; cl::ParseCommandLineOptions(Argc, Argv, ""); - exegesis::ExitOnErr.setExitCodeMapper([](const llvm::Error &Err) { - if (Err.isA()) + exegesis::ExitOnErr.setExitCodeMapper([](const Error &Err) { + if (Err.isA()) return EXIT_SUCCESS; return EXIT_FAILURE; }); From llvm-commits at lists.llvm.org Wed Oct 9 05:04:16 2019 From: llvm-commits at lists.llvm.org (Dave Green via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 12:04:16 +0000 (UTC) Subject: [PATCH] D68651: [InstCombine] Signed saturation patterns In-Reply-To: References: Message-ID: <082479566a1bc3b1613ef745292da13f@localhost.localdomain> dmgreen added a comment. Yes. I was going off the prior art for adding sadd_sat and ssub_sat to instcombine. And from the cases this patch is matching, where we are otherwise extending to a higher type, the intrinsic seems to produce equal or better code in most of the cases I've tried now. So this wasn't just for vectorisation, although it does make things a lot simpler there. (On an arm specific note, we have a scalar qadd instruction that can be used, if we can sort out the "q" flag otherwise being visible from C). If the canonical form for one of these signed saturating adds/subs wasn't an intrinsic, what would it be? 
Going into a higher type is awkward for us because the i64 add is not legal, and so doesn't look like the kind of instruction that should be vectorised, plus in ISel we'd have to catch the lowering fairly early and do something special. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68651/new/ https://reviews.llvm.org/D68651 From llvm-commits at lists.llvm.org Wed Oct 9 05:04:21 2019 From: llvm-commits at lists.llvm.org (Clement Courbet via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 12:04:21 +0000 (UTC) Subject: [PATCH] D68692: [llvm-exegesis][NFC] Remove extra `llvm::` qualifications. In-Reply-To: References: Message-ID: <5b1cf4c181e92c6c456a94585c23f6b5@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rG50cdd56beb8a: [llvm-exegesis][NFC] Remove extra `llvm::` qualifications. (authored by courbet). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68692/new/ https://reviews.llvm.org/D68692 Files: llvm/tools/llvm-exegesis/lib/AArch64/Target.cpp llvm/tools/llvm-exegesis/lib/Analysis.cpp llvm/tools/llvm-exegesis/lib/Analysis.h llvm/tools/llvm-exegesis/lib/Assembler.cpp llvm/tools/llvm-exegesis/lib/Assembler.h llvm/tools/llvm-exegesis/lib/BenchmarkResult.cpp llvm/tools/llvm-exegesis/lib/BenchmarkResult.h llvm/tools/llvm-exegesis/lib/BenchmarkRunner.cpp llvm/tools/llvm-exegesis/lib/BenchmarkRunner.h llvm/tools/llvm-exegesis/lib/Clustering.cpp llvm/tools/llvm-exegesis/lib/Clustering.h llvm/tools/llvm-exegesis/lib/CodeTemplate.cpp llvm/tools/llvm-exegesis/lib/CodeTemplate.h llvm/tools/llvm-exegesis/lib/Latency.cpp llvm/tools/llvm-exegesis/lib/Latency.h llvm/tools/llvm-exegesis/lib/LlvmState.cpp llvm/tools/llvm-exegesis/lib/LlvmState.h llvm/tools/llvm-exegesis/lib/MCInstrDescView.cpp llvm/tools/llvm-exegesis/lib/MCInstrDescView.h llvm/tools/llvm-exegesis/lib/PerfHelper.cpp llvm/tools/llvm-exegesis/lib/PerfHelper.h 
llvm/tools/llvm-exegesis/lib/PowerPC/Target.cpp llvm/tools/llvm-exegesis/lib/RegisterAliasing.cpp llvm/tools/llvm-exegesis/lib/RegisterAliasing.h llvm/tools/llvm-exegesis/lib/RegisterValue.cpp llvm/tools/llvm-exegesis/lib/RegisterValue.h llvm/tools/llvm-exegesis/lib/SchedClassResolution.cpp llvm/tools/llvm-exegesis/lib/SchedClassResolution.h llvm/tools/llvm-exegesis/lib/SnippetGenerator.cpp llvm/tools/llvm-exegesis/lib/SnippetGenerator.h llvm/tools/llvm-exegesis/lib/SnippetRepetitor.cpp llvm/tools/llvm-exegesis/lib/Target.cpp llvm/tools/llvm-exegesis/lib/Target.h llvm/tools/llvm-exegesis/lib/Uops.cpp llvm/tools/llvm-exegesis/lib/Uops.h llvm/tools/llvm-exegesis/lib/X86/Target.cpp llvm/tools/llvm-exegesis/llvm-exegesis.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D68692.224021.patch Type: text/x-patch Size: 175075 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 05:13:18 2019 From: llvm-commits at lists.llvm.org (Dave Green via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 12:13:18 +0000 (UTC) Subject: [PATCH] D68643: [Codegen] Alter the default promotion for add_sat and sub_sat In-Reply-To: References: Message-ID: <64b92c38946ec81b67701dbcb77cabd6@localhost.localdomain> dmgreen added a comment. Thanks. I'll put a patch together showing the differences. It was only really the odd types like "i4" that changed IIRC. Some of them were looking better in places, worse in others. Making it dependent on whether min/max are available sounds like a good idea. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68643/new/ https://reviews.llvm.org/D68643 From llvm-commits at lists.llvm.org Wed Oct 9 05:22:21 2019 From: llvm-commits at lists.llvm.org (Simon Pilgrim via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 12:22:21 +0000 (UTC) Subject: [PATCH] D68390: [Mips] Emit proper ABI for _mcount calls In-Reply-To: References: Message-ID: RKSimon added a comment. 
@mbrkusanin This is causing failures on EXPENSIVE_CHECKS builds, please can you take a look? http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/20071/steps/test-check-all/logs/stdio Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68390/new/ https://reviews.llvm.org/D68390 From llvm-commits at lists.llvm.org Wed Oct 9 05:22:23 2019 From: llvm-commits at lists.llvm.org (Sourabh Singh Tomar via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 12:22:23 +0000 (UTC) Subject: [PATCH] D68697: [DWARF5] Added support for DW_AT_noreturn attribute to be emitted for C++ class member functions. Message-ID: SouraVX created this revision. SouraVX added reviewers: aprantl, vleschuk. SouraVX added a project: debug-info. Herald added subscribers: llvm-commits, ormris. Herald added a project: LLVM. This patch adds support in the clang C++ frontend to emit DW_AT_noreturn for C++ class member functions. https://reviews.llvm.org/D68697 Files: clang/lib/CodeGen/CGDebugInfo.cpp llvm/test/DebugInfo/X86/noreturn_cpp11.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D68697.224024.patch Type: text/x-patch Size: 6988 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 05:25:38 2019 From: llvm-commits at lists.llvm.org (Dávid Bolvanský via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 12:25:38 +0000 (UTC) Subject: [PATCH] D68667: [SLP] respect target register width for GEP vectorization (PR43578) In-Reply-To: References: Message-ID: <2957c1b0067d1d6d6146b0e507810855@localhost.localdomain> xbolva00 added a comment. Yes, I will do.
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68667/new/ https://reviews.llvm.org/D68667 From llvm-commits at lists.llvm.org Wed Oct 9 05:25:54 2019 From: llvm-commits at lists.llvm.org (Thomas Preud'homme via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 12:25:54 +0000 (UTC) Subject: [PATCH] D68146: [FileCheck] Implement --ignore-case option. In-Reply-To: References: Message-ID: <454f1345a1f15749a15ef0b350598da5@localhost.localdomain> thopre added inline comments. ================ Comment at: llvm/test/FileCheck/check-ignore-case.txt:17 +loop 5 +LOOP 5 +BREAK ---------------- s/5/6/ ? CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68146/new/ https://reviews.llvm.org/D68146 From llvm-commits at lists.llvm.org Wed Oct 9 05:29:52 2019 From: llvm-commits at lists.llvm.org (David Green via llvm-commits) Date: Wed, 09 Oct 2019 12:29:52 -0000 Subject: [llvm] r374159 - [ARM] Add saturating arithmetic tests for MVE. NFC Message-ID: <20191009122952.C2E718F8EB@lists.llvm.org> Author: dmgreen Date: Wed Oct 9 05:29:51 2019 New Revision: 374159 URL: http://llvm.org/viewvc/llvm-project?rev=374159&view=rev Log: [ARM] Add saturating arithmetic tests for MVE. 
NFC Added: llvm/trunk/test/CodeGen/Thumb2/mve-saturating-arith.ll Added: llvm/trunk/test/CodeGen/Thumb2/mve-saturating-arith.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/Thumb2/mve-saturating-arith.ll?rev=374159&view=auto ============================================================================== --- llvm/trunk/test/CodeGen/Thumb2/mve-saturating-arith.ll (added) +++ llvm/trunk/test/CodeGen/Thumb2/mve-saturating-arith.ll Wed Oct 9 05:29:51 2019 @@ -0,0 +1,501 @@ +; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py +; RUN: llc -mtriple=thumbv8.1m.main-arm-none-eabi -mattr=+mve -verify-machineinstrs %s -o - | FileCheck %s + +define arm_aapcs_vfpcc <16 x i8> @sadd_int8_t(<16 x i8> %src1, <16 x i8> %src2) { +; CHECK-LABEL: sadd_int8_t: +; CHECK: @ %bb.0: @ %entry +; CHECK-NEXT: .vsave {d8, d9} +; CHECK-NEXT: vpush {d8, d9} +; CHECK-NEXT: vadd.i8 q2, q0, q1 +; CHECK-NEXT: vmov.i8 q3, #0x80 +; CHECK-NEXT: vcmp.s8 lt, q2, zr +; CHECK-NEXT: vmov.i8 q4, #0x7f +; CHECK-NEXT: vpsel q3, q4, q3 +; CHECK-NEXT: vcmp.s8 gt, q0, q2 +; CHECK-NEXT: vmrs r0, p0 +; CHECK-NEXT: vcmp.s8 lt, q1, zr +; CHECK-NEXT: vmrs r1, p0 +; CHECK-NEXT: eors r0, r1 +; CHECK-NEXT: vmsr p0, r0 +; CHECK-NEXT: vpsel q0, q3, q2 +; CHECK-NEXT: vpop {d8, d9} +; CHECK-NEXT: bx lr +entry: + %0 = call <16 x i8> @llvm.sadd.sat.v16i8(<16 x i8> %src1, <16 x i8> %src2) + ret <16 x i8> %0 +} + +define arm_aapcs_vfpcc <8 x i16> @sadd_int16_t(<8 x i16> %src1, <8 x i16> %src2) { +; CHECK-LABEL: sadd_int16_t: +; CHECK: @ %bb.0: @ %entry +; CHECK-NEXT: .vsave {d8, d9} +; CHECK-NEXT: vpush {d8, d9} +; CHECK-NEXT: vadd.i16 q2, q0, q1 +; CHECK-NEXT: vmov.i16 q3, #0x8000 +; CHECK-NEXT: vcmp.s16 lt, q2, zr +; CHECK-NEXT: vmvn.i16 q4, #0x8000 +; CHECK-NEXT: vpsel q3, q4, q3 +; CHECK-NEXT: vcmp.s16 gt, q0, q2 +; CHECK-NEXT: vmrs r0, p0 +; CHECK-NEXT: vcmp.s16 lt, q1, zr +; CHECK-NEXT: vmrs r1, p0 +; CHECK-NEXT: eors r0, r1 +; CHECK-NEXT: vmsr p0, r0 +; CHECK-NEXT: vpsel q0, q3, 
q2 +; CHECK-NEXT: vpop {d8, d9} +; CHECK-NEXT: bx lr +entry: + %0 = call <8 x i16> @llvm.sadd.sat.v8i16(<8 x i16> %src1, <8 x i16> %src2) + ret <8 x i16> %0 +} + +define arm_aapcs_vfpcc <4 x i32> @sadd_int32_t(<4 x i32> %src1, <4 x i32> %src2) { +; CHECK-LABEL: sadd_int32_t: +; CHECK: @ %bb.0: @ %entry +; CHECK-NEXT: .vsave {d8, d9} +; CHECK-NEXT: vpush {d8, d9} +; CHECK-NEXT: vadd.i32 q2, q0, q1 +; CHECK-NEXT: vmov.i32 q3, #0x80000000 +; CHECK-NEXT: vcmp.s32 lt, q2, zr +; CHECK-NEXT: vmvn.i32 q4, #0x80000000 +; CHECK-NEXT: vpsel q3, q4, q3 +; CHECK-NEXT: vcmp.s32 gt, q0, q2 +; CHECK-NEXT: vmrs r0, p0 +; CHECK-NEXT: vcmp.s32 lt, q1, zr +; CHECK-NEXT: vmrs r1, p0 +; CHECK-NEXT: eors r0, r1 +; CHECK-NEXT: vmsr p0, r0 +; CHECK-NEXT: vpsel q0, q3, q2 +; CHECK-NEXT: vpop {d8, d9} +; CHECK-NEXT: bx lr +entry: + %0 = call <4 x i32> @llvm.sadd.sat.v4i32(<4 x i32> %src1, <4 x i32> %src2) + ret <4 x i32> %0 +} + +define arm_aapcs_vfpcc <2 x i64> @sadd_int64_t(<2 x i64> %src1, <2 x i64> %src2) { +; CHECK-LABEL: sadd_int64_t: +; CHECK: @ %bb.0: @ %entry +; CHECK-NEXT: .save {r4, r5, r6, r7, r8, lr} +; CHECK-NEXT: push.w {r4, r5, r6, r7, r8, lr} +; CHECK-NEXT: vmov r0, s4 +; CHECK-NEXT: vmov r5, s0 +; CHECK-NEXT: vmov r8, s5 +; CHECK-NEXT: vmov r4, s1 +; CHECK-NEXT: vmov r7, s2 +; CHECK-NEXT: vmov r3, s7 +; CHECK-NEXT: vmov r6, s3 +; CHECK-NEXT: adds.w r12, r5, r0 +; CHECK-NEXT: adc.w r0, r4, r8 +; CHECK-NEXT: asrs r2, r0, #31 +; CHECK-NEXT: vmov.32 q2[0], r2 +; CHECK-NEXT: vmov.32 q2[1], r2 +; CHECK-NEXT: vmov r2, s6 +; CHECK-NEXT: adds.w lr, r7, r2 +; CHECK-NEXT: adc.w r2, r6, r3 +; CHECK-NEXT: subs.w r5, r12, r5 +; CHECK-NEXT: sbcs.w r4, r0, r4 +; CHECK-NEXT: asr.w r1, r2, #31 +; CHECK-NEXT: mov.w r4, #0 +; CHECK-NEXT: vmov.32 q2[2], r1 +; CHECK-NEXT: it lt +; CHECK-NEXT: movlt r4, #1 +; CHECK-NEXT: vmov.32 q2[3], r1 +; CHECK-NEXT: adr r1, .LCPI3_0 +; CHECK-NEXT: vldrw.u32 q0, [r1] +; CHECK-NEXT: adr r1, .LCPI3_1 +; CHECK-NEXT: vldrw.u32 q1, [r1] +; CHECK-NEXT: cmp r4, #0 +; 
CHECK-NEXT: vbic q0, q0, q2 +; CHECK-NEXT: csetm r4, ne +; CHECK-NEXT: vand q1, q1, q2 +; CHECK-NEXT: movs r1, #0 +; CHECK-NEXT: vorr q0, q1, q0 +; CHECK-NEXT: vmov.32 q1[0], r4 +; CHECK-NEXT: vmov.32 q1[1], r4 +; CHECK-NEXT: subs.w r4, lr, r7 +; CHECK-NEXT: sbcs.w r4, r2, r6 +; CHECK-NEXT: it lt +; CHECK-NEXT: movlt r1, #1 +; CHECK-NEXT: cmp r1, #0 +; CHECK-NEXT: csetm r1, ne +; CHECK-NEXT: vmov.32 q1[2], r1 +; CHECK-NEXT: vmov.32 q1[3], r1 +; CHECK-NEXT: asr.w r1, r8, #31 +; CHECK-NEXT: vmov.32 q2[0], r1 +; CHECK-NEXT: vmov.32 q2[1], r1 +; CHECK-NEXT: asrs r1, r3, #31 +; CHECK-NEXT: vmov.32 q2[2], r1 +; CHECK-NEXT: vmov.32 q2[3], r1 +; CHECK-NEXT: veor q1, q2, q1 +; CHECK-NEXT: vmov.32 q2[0], r12 +; CHECK-NEXT: vmov.32 q2[1], r0 +; CHECK-NEXT: vand q0, q0, q1 +; CHECK-NEXT: vmov.32 q2[2], lr +; CHECK-NEXT: vmov.32 q2[3], r2 +; CHECK-NEXT: vbic q1, q2, q1 +; CHECK-NEXT: vorr q0, q0, q1 +; CHECK-NEXT: pop.w {r4, r5, r6, r7, r8, pc} +; CHECK-NEXT: .p2align 4 +; CHECK-NEXT: @ %bb.1: +; CHECK-NEXT: .LCPI3_0: +; CHECK-NEXT: .long 0 @ 0x0 +; CHECK-NEXT: .long 2147483648 @ 0x80000000 +; CHECK-NEXT: .long 0 @ 0x0 +; CHECK-NEXT: .long 2147483648 @ 0x80000000 +; CHECK-NEXT: .LCPI3_1: +; CHECK-NEXT: .long 4294967295 @ 0xffffffff +; CHECK-NEXT: .long 2147483647 @ 0x7fffffff +; CHECK-NEXT: .long 4294967295 @ 0xffffffff +; CHECK-NEXT: .long 2147483647 @ 0x7fffffff +entry: + %0 = call <2 x i64> @llvm.sadd.sat.v2i64(<2 x i64> %src1, <2 x i64> %src2) + ret <2 x i64> %0 +} + +define arm_aapcs_vfpcc <16 x i8> @uadd_int8_t(<16 x i8> %src1, <16 x i8> %src2) { +; CHECK-LABEL: uadd_int8_t: +; CHECK: @ %bb.0: @ %entry +; CHECK-NEXT: vmvn q2, q1 +; CHECK-NEXT: vmin.u8 q0, q0, q2 +; CHECK-NEXT: vadd.i8 q0, q0, q1 +; CHECK-NEXT: bx lr +entry: + %0 = call <16 x i8> @llvm.uadd.sat.v16i8(<16 x i8> %src1, <16 x i8> %src2) + ret <16 x i8> %0 +} + +define arm_aapcs_vfpcc <8 x i16> @uadd_int16_t(<8 x i16> %src1, <8 x i16> %src2) { +; CHECK-LABEL: uadd_int16_t: +; CHECK: @ %bb.0: @ %entry +; 
CHECK-NEXT: vmvn q2, q1 +; CHECK-NEXT: vmin.u16 q0, q0, q2 +; CHECK-NEXT: vadd.i16 q0, q0, q1 +; CHECK-NEXT: bx lr +entry: + %0 = call <8 x i16> @llvm.uadd.sat.v8i16(<8 x i16> %src1, <8 x i16> %src2) + ret <8 x i16> %0 +} + +define arm_aapcs_vfpcc <4 x i32> @uadd_int32_t(<4 x i32> %src1, <4 x i32> %src2) { +; CHECK-LABEL: uadd_int32_t: +; CHECK: @ %bb.0: @ %entry +; CHECK-NEXT: vmvn q2, q1 +; CHECK-NEXT: vmin.u32 q0, q0, q2 +; CHECK-NEXT: vadd.i32 q0, q0, q1 +; CHECK-NEXT: bx lr +entry: + %0 = call <4 x i32> @llvm.uadd.sat.v4i32(<4 x i32> %src1, <4 x i32> %src2) + ret <4 x i32> %0 +} + +define arm_aapcs_vfpcc <2 x i64> @uadd_int64_t(<2 x i64> %src1, <2 x i64> %src2) { +; CHECK-LABEL: uadd_int64_t: +; CHECK: @ %bb.0: @ %entry +; CHECK-NEXT: .save {r4, lr} +; CHECK-NEXT: push {r4, lr} +; CHECK-NEXT: vmov r2, s4 +; CHECK-NEXT: vmov r3, s0 +; CHECK-NEXT: vmov r0, s5 +; CHECK-NEXT: vmov r1, s1 +; CHECK-NEXT: vmov r4, s2 +; CHECK-NEXT: adds.w lr, r3, r2 +; CHECK-NEXT: vmov r2, s6 +; CHECK-NEXT: adc.w r12, r1, r0 +; CHECK-NEXT: subs.w r3, lr, r3 +; CHECK-NEXT: sbcs.w r1, r12, r1 +; CHECK-NEXT: vmov r3, s3 +; CHECK-NEXT: mov.w r1, #0 +; CHECK-NEXT: mov.w r0, #0 +; CHECK-NEXT: it lo +; CHECK-NEXT: movlo r1, #1 +; CHECK-NEXT: cmp r1, #0 +; CHECK-NEXT: csetm r1, ne +; CHECK-NEXT: vmov.32 q0[0], lr +; CHECK-NEXT: vmov.32 q2[0], r1 +; CHECK-NEXT: vmov.32 q0[1], r12 +; CHECK-NEXT: vmov.32 q2[1], r1 +; CHECK-NEXT: vmov r1, s7 +; CHECK-NEXT: adds r2, r2, r4 +; CHECK-NEXT: vmov.32 q0[2], r2 +; CHECK-NEXT: adcs r1, r3 +; CHECK-NEXT: subs r4, r2, r4 +; CHECK-NEXT: sbcs.w r3, r1, r3 +; CHECK-NEXT: it lo +; CHECK-NEXT: movlo r0, #1 +; CHECK-NEXT: cmp r0, #0 +; CHECK-NEXT: vmov.32 q0[3], r1 +; CHECK-NEXT: csetm r0, ne +; CHECK-NEXT: vmov.32 q2[2], r0 +; CHECK-NEXT: vmov.32 q2[3], r0 +; CHECK-NEXT: vorr q0, q0, q2 +; CHECK-NEXT: pop {r4, pc} +entry: + %0 = call <2 x i64> @llvm.uadd.sat.v2i64(<2 x i64> %src1, <2 x i64> %src2) + ret <2 x i64> %0 +} + + +define arm_aapcs_vfpcc <16 x i8> 
@ssub_int8_t(<16 x i8> %src1, <16 x i8> %src2) { +; CHECK-LABEL: ssub_int8_t: +; CHECK: @ %bb.0: @ %entry +; CHECK-NEXT: .vsave {d8, d9} +; CHECK-NEXT: vpush {d8, d9} +; CHECK-NEXT: vsub.i8 q2, q0, q1 +; CHECK-NEXT: vmov.i8 q3, #0x80 +; CHECK-NEXT: vcmp.s8 lt, q2, zr +; CHECK-NEXT: vmov.i8 q4, #0x7f +; CHECK-NEXT: vpsel q3, q4, q3 +; CHECK-NEXT: vcmp.s8 gt, q0, q2 +; CHECK-NEXT: vmrs r0, p0 +; CHECK-NEXT: vcmp.s8 gt, q1, zr +; CHECK-NEXT: vmrs r1, p0 +; CHECK-NEXT: eors r0, r1 +; CHECK-NEXT: vmsr p0, r0 +; CHECK-NEXT: vpsel q0, q3, q2 +; CHECK-NEXT: vpop {d8, d9} +; CHECK-NEXT: bx lr +entry: + %0 = call <16 x i8> @llvm.ssub.sat.v16i8(<16 x i8> %src1, <16 x i8> %src2) + ret <16 x i8> %0 +} + +define arm_aapcs_vfpcc <8 x i16> @ssub_int16_t(<8 x i16> %src1, <8 x i16> %src2) { +; CHECK-LABEL: ssub_int16_t: +; CHECK: @ %bb.0: @ %entry +; CHECK-NEXT: .vsave {d8, d9} +; CHECK-NEXT: vpush {d8, d9} +; CHECK-NEXT: vsub.i16 q2, q0, q1 +; CHECK-NEXT: vmov.i16 q3, #0x8000 +; CHECK-NEXT: vcmp.s16 lt, q2, zr +; CHECK-NEXT: vmvn.i16 q4, #0x8000 +; CHECK-NEXT: vpsel q3, q4, q3 +; CHECK-NEXT: vcmp.s16 gt, q0, q2 +; CHECK-NEXT: vmrs r0, p0 +; CHECK-NEXT: vcmp.s16 gt, q1, zr +; CHECK-NEXT: vmrs r1, p0 +; CHECK-NEXT: eors r0, r1 +; CHECK-NEXT: vmsr p0, r0 +; CHECK-NEXT: vpsel q0, q3, q2 +; CHECK-NEXT: vpop {d8, d9} +; CHECK-NEXT: bx lr +entry: + %0 = call <8 x i16> @llvm.ssub.sat.v8i16(<8 x i16> %src1, <8 x i16> %src2) + ret <8 x i16> %0 +} + +define arm_aapcs_vfpcc <4 x i32> @ssub_int32_t(<4 x i32> %src1, <4 x i32> %src2) { +; CHECK-LABEL: ssub_int32_t: +; CHECK: @ %bb.0: @ %entry +; CHECK-NEXT: .vsave {d8, d9} +; CHECK-NEXT: vpush {d8, d9} +; CHECK-NEXT: vsub.i32 q2, q0, q1 +; CHECK-NEXT: vmov.i32 q3, #0x80000000 +; CHECK-NEXT: vcmp.s32 lt, q2, zr +; CHECK-NEXT: vmvn.i32 q4, #0x80000000 +; CHECK-NEXT: vpsel q3, q4, q3 +; CHECK-NEXT: vcmp.s32 gt, q0, q2 +; CHECK-NEXT: vmrs r0, p0 +; CHECK-NEXT: vcmp.s32 gt, q1, zr +; CHECK-NEXT: vmrs r1, p0 +; CHECK-NEXT: eors r0, r1 +; CHECK-NEXT: 
vmsr p0, r0 +; CHECK-NEXT: vpsel q0, q3, q2 +; CHECK-NEXT: vpop {d8, d9} +; CHECK-NEXT: bx lr +entry: + %0 = call <4 x i32> @llvm.ssub.sat.v4i32(<4 x i32> %src1, <4 x i32> %src2) + ret <4 x i32> %0 +} + +define arm_aapcs_vfpcc <2 x i64> @ssub_int64_t(<2 x i64> %src1, <2 x i64> %src2) { +; CHECK-LABEL: ssub_int64_t: +; CHECK: @ %bb.0: @ %entry +; CHECK-NEXT: .save {r4, r5, r6, lr} +; CHECK-NEXT: push {r4, r5, r6, lr} +; CHECK-NEXT: .vsave {d8, d9} +; CHECK-NEXT: vpush {d8, d9} +; CHECK-NEXT: vmov r2, s4 +; CHECK-NEXT: movs r0, #0 +; CHECK-NEXT: vmov lr, s5 +; CHECK-NEXT: vmov r12, s7 +; CHECK-NEXT: vmov r5, s0 +; CHECK-NEXT: vmov r4, s1 +; CHECK-NEXT: rsbs r3, r2, #0 +; CHECK-NEXT: sbcs.w r3, r0, lr +; CHECK-NEXT: mov.w r3, #0 +; CHECK-NEXT: it lt +; CHECK-NEXT: movlt r3, #1 +; CHECK-NEXT: cmp r3, #0 +; CHECK-NEXT: csetm r3, ne +; CHECK-NEXT: vmov.32 q2[0], r3 +; CHECK-NEXT: vmov.32 q2[1], r3 +; CHECK-NEXT: vmov r3, s6 +; CHECK-NEXT: rsbs r1, r3, #0 +; CHECK-NEXT: sbcs.w r1, r0, r12 +; CHECK-NEXT: mov.w r1, #0 +; CHECK-NEXT: it lt +; CHECK-NEXT: movlt r1, #1 +; CHECK-NEXT: cmp r1, #0 +; CHECK-NEXT: csetm r1, ne +; CHECK-NEXT: subs r6, r5, r2 +; CHECK-NEXT: vmov.32 q2[2], r1 +; CHECK-NEXT: vmov.32 q2[3], r1 +; CHECK-NEXT: sbc.w r1, r4, lr +; CHECK-NEXT: subs r5, r6, r5 +; CHECK-NEXT: sbcs.w r5, r1, r4 +; CHECK-NEXT: vmov r4, s2 +; CHECK-NEXT: mov.w r5, #0 +; CHECK-NEXT: it lt +; CHECK-NEXT: movlt r5, #1 +; CHECK-NEXT: cmp r5, #0 +; CHECK-NEXT: csetm r5, ne +; CHECK-NEXT: vmov.32 q1[0], r5 +; CHECK-NEXT: vmov.32 q1[1], r5 +; CHECK-NEXT: vmov r5, s3 +; CHECK-NEXT: subs r3, r4, r3 +; CHECK-NEXT: sbc.w r2, r5, r12 +; CHECK-NEXT: subs r4, r3, r4 +; CHECK-NEXT: sbcs.w r5, r2, r5 +; CHECK-NEXT: it lt +; CHECK-NEXT: movlt r0, #1 +; CHECK-NEXT: cmp r0, #0 +; CHECK-NEXT: csetm r0, ne +; CHECK-NEXT: vmov.32 q1[2], r0 +; CHECK-NEXT: vmov.32 q1[3], r0 +; CHECK-NEXT: asrs r0, r1, #31 +; CHECK-NEXT: veor q0, q2, q1 +; CHECK-NEXT: vmov.32 q2[0], r0 +; CHECK-NEXT: vmov.32 q2[1], r0 
+; CHECK-NEXT: asrs r0, r2, #31 +; CHECK-NEXT: vmov.32 q2[2], r0 +; CHECK-NEXT: vmov.32 q1[0], r6 +; CHECK-NEXT: vmov.32 q2[3], r0 +; CHECK-NEXT: adr r0, .LCPI11_0 +; CHECK-NEXT: vldrw.u32 q3, [r0] +; CHECK-NEXT: adr r0, .LCPI11_1 +; CHECK-NEXT: vldrw.u32 q4, [r0] +; CHECK-NEXT: vmov.32 q1[1], r1 +; CHECK-NEXT: vmov.32 q1[2], r3 +; CHECK-NEXT: vbic q3, q3, q2 +; CHECK-NEXT: vand q2, q4, q2 +; CHECK-NEXT: vmov.32 q1[3], r2 +; CHECK-NEXT: vorr q2, q2, q3 +; CHECK-NEXT: vbic q1, q1, q0 +; CHECK-NEXT: vand q0, q2, q0 +; CHECK-NEXT: vorr q0, q0, q1 +; CHECK-NEXT: vpop {d8, d9} +; CHECK-NEXT: pop {r4, r5, r6, pc} +; CHECK-NEXT: .p2align 4 +; CHECK-NEXT: @ %bb.1: +; CHECK-NEXT: .LCPI11_0: +; CHECK-NEXT: .long 0 @ 0x0 +; CHECK-NEXT: .long 2147483648 @ 0x80000000 +; CHECK-NEXT: .long 0 @ 0x0 +; CHECK-NEXT: .long 2147483648 @ 0x80000000 +; CHECK-NEXT: .LCPI11_1: +; CHECK-NEXT: .long 4294967295 @ 0xffffffff +; CHECK-NEXT: .long 2147483647 @ 0x7fffffff +; CHECK-NEXT: .long 4294967295 @ 0xffffffff +; CHECK-NEXT: .long 2147483647 @ 0x7fffffff +entry: + %0 = call <2 x i64> @llvm.ssub.sat.v2i64(<2 x i64> %src1, <2 x i64> %src2) + ret <2 x i64> %0 +} + +define arm_aapcs_vfpcc <16 x i8> @usub_int8_t(<16 x i8> %src1, <16 x i8> %src2) { +; CHECK-LABEL: usub_int8_t: +; CHECK: @ %bb.0: @ %entry +; CHECK-NEXT: vmax.u8 q0, q0, q1 +; CHECK-NEXT: vsub.i8 q0, q0, q1 +; CHECK-NEXT: bx lr +entry: + %0 = call <16 x i8> @llvm.usub.sat.v16i8(<16 x i8> %src1, <16 x i8> %src2) + ret <16 x i8> %0 +} + +define arm_aapcs_vfpcc <8 x i16> @usub_int16_t(<8 x i16> %src1, <8 x i16> %src2) { +; CHECK-LABEL: usub_int16_t: +; CHECK: @ %bb.0: @ %entry +; CHECK-NEXT: vmax.u16 q0, q0, q1 +; CHECK-NEXT: vsub.i16 q0, q0, q1 +; CHECK-NEXT: bx lr +entry: + %0 = call <8 x i16> @llvm.usub.sat.v8i16(<8 x i16> %src1, <8 x i16> %src2) + ret <8 x i16> %0 +} + +define arm_aapcs_vfpcc <4 x i32> @usub_int32_t(<4 x i32> %src1, <4 x i32> %src2) { +; CHECK-LABEL: usub_int32_t: +; CHECK: @ %bb.0: @ %entry +; CHECK-NEXT: vmax.u32 
q0, q0, q1 +; CHECK-NEXT: vsub.i32 q0, q0, q1 +; CHECK-NEXT: bx lr +entry: + %0 = call <4 x i32> @llvm.usub.sat.v4i32(<4 x i32> %src1, <4 x i32> %src2) + ret <4 x i32> %0 +} + +define arm_aapcs_vfpcc <2 x i64> @usub_int64_t(<2 x i64> %src1, <2 x i64> %src2) { +; CHECK-LABEL: usub_int64_t: +; CHECK: @ %bb.0: @ %entry +; CHECK-NEXT: .save {r4, lr} +; CHECK-NEXT: push {r4, lr} +; CHECK-NEXT: vmov r2, s4 +; CHECK-NEXT: vmov r3, s0 +; CHECK-NEXT: vmov r0, s5 +; CHECK-NEXT: vmov r1, s1 +; CHECK-NEXT: vmov r4, s2 +; CHECK-NEXT: subs.w lr, r3, r2 +; CHECK-NEXT: vmov r2, s6 +; CHECK-NEXT: sbc.w r12, r1, r0 +; CHECK-NEXT: subs.w r3, r3, lr +; CHECK-NEXT: sbcs.w r1, r1, r12 +; CHECK-NEXT: vmov r3, s3 +; CHECK-NEXT: mov.w r1, #0 +; CHECK-NEXT: mov.w r0, #0 +; CHECK-NEXT: it lo +; CHECK-NEXT: movlo r1, #1 +; CHECK-NEXT: cmp r1, #0 +; CHECK-NEXT: csetm r1, ne +; CHECK-NEXT: vmov.32 q0[0], lr +; CHECK-NEXT: vmov.32 q2[0], r1 +; CHECK-NEXT: vmov.32 q0[1], r12 +; CHECK-NEXT: vmov.32 q2[1], r1 +; CHECK-NEXT: vmov r1, s7 +; CHECK-NEXT: subs r2, r4, r2 +; CHECK-NEXT: vmov.32 q0[2], r2 +; CHECK-NEXT: sbc.w r1, r3, r1 +; CHECK-NEXT: subs r4, r4, r2 +; CHECK-NEXT: sbcs r3, r1 +; CHECK-NEXT: it lo +; CHECK-NEXT: movlo r0, #1 +; CHECK-NEXT: cmp r0, #0 +; CHECK-NEXT: vmov.32 q0[3], r1 +; CHECK-NEXT: csetm r0, ne +; CHECK-NEXT: vmov.32 q2[2], r0 +; CHECK-NEXT: vmov.32 q2[3], r0 +; CHECK-NEXT: vbic q0, q0, q2 +; CHECK-NEXT: pop {r4, pc} +entry: + %0 = call <2 x i64> @llvm.usub.sat.v2i64(<2 x i64> %src1, <2 x i64> %src2) + ret <2 x i64> %0 +} + + +declare <16 x i8> @llvm.sadd.sat.v16i8(<16 x i8> %src1, <16 x i8> %src2) +declare <8 x i16> @llvm.sadd.sat.v8i16(<8 x i16> %src1, <8 x i16> %src2) +declare <4 x i32> @llvm.sadd.sat.v4i32(<4 x i32> %src1, <4 x i32> %src2) +declare <2 x i64> @llvm.sadd.sat.v2i64(<2 x i64> %src1, <2 x i64> %src2) +declare <16 x i8> @llvm.uadd.sat.v16i8(<16 x i8> %src1, <16 x i8> %src2) +declare <8 x i16> @llvm.uadd.sat.v8i16(<8 x i16> %src1, <8 x i16> %src2) +declare <4 
x i32> @llvm.uadd.sat.v4i32(<4 x i32> %src1, <4 x i32> %src2) +declare <2 x i64> @llvm.uadd.sat.v2i64(<2 x i64> %src1, <2 x i64> %src2) +declare <16 x i8> @llvm.ssub.sat.v16i8(<16 x i8> %src1, <16 x i8> %src2) +declare <8 x i16> @llvm.ssub.sat.v8i16(<8 x i16> %src1, <8 x i16> %src2) +declare <4 x i32> @llvm.ssub.sat.v4i32(<4 x i32> %src1, <4 x i32> %src2) +declare <2 x i64> @llvm.ssub.sat.v2i64(<2 x i64> %src1, <2 x i64> %src2) +declare <16 x i8> @llvm.usub.sat.v16i8(<16 x i8> %src1, <16 x i8> %src2) +declare <8 x i16> @llvm.usub.sat.v8i16(<8 x i16> %src1, <8 x i16> %src2) +declare <4 x i32> @llvm.usub.sat.v4i32(<4 x i32> %src1, <4 x i32> %src2) +declare <2 x i64> @llvm.usub.sat.v2i64(<2 x i64> %src1, <2 x i64> %src2) From llvm-commits at lists.llvm.org Wed Oct 9 05:36:22 2019 From: llvm-commits at lists.llvm.org (Simon Pilgrim via llvm-commits) Date: Wed, 09 Oct 2019 12:36:22 -0000 Subject: [llvm] r374160 - [CostModel][X86] Add tests for extractelement from non-immediate vector element indices Message-ID: <20191009123623.023FE90968@lists.llvm.org> Author: rksimon Date: Wed Oct 9 05:36:22 2019 New Revision: 374160 URL: http://llvm.org/viewvc/llvm-project?rev=374160&view=rev Log: [CostModel][X86] Add tests for extractelement from non-immediate vector element indices Modified: llvm/trunk/test/Analysis/CostModel/X86/vector-extract.ll Modified: llvm/trunk/test/Analysis/CostModel/X86/vector-extract.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Analysis/CostModel/X86/vector-extract.ll?rev=374160&r1=374159&r2=374160&view=diff ============================================================================== --- llvm/trunk/test/Analysis/CostModel/X86/vector-extract.ll (original) +++ llvm/trunk/test/Analysis/CostModel/X86/vector-extract.ll Wed Oct 9 05:36:22 2019 @@ -15,10 +15,13 @@ define i32 @extract_double(i32 %arg) { ; SSE-LABEL: 'extract_double' +; SSE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v2f64_a = extractelement <2 x double> 
undef, i32 %arg ; SSE-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v2f64_0 = extractelement <2 x double> undef, i32 0 ; SSE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v2f64_1 = extractelement <2 x double> undef, i32 1 +; SSE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v4f64_a = extractelement <4 x double> undef, i32 %arg ; SSE-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v4f64_0 = extractelement <4 x double> undef, i32 0 ; SSE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v4f64_3 = extractelement <4 x double> undef, i32 3 +; SSE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8f64_a = extractelement <8 x double> undef, i32 %arg ; SSE-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v8f64_0 = extractelement <8 x double> undef, i32 0 ; SSE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8f64_3 = extractelement <8 x double> undef, i32 3 ; SSE-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v8f64_4 = extractelement <8 x double> undef, i32 4 @@ -26,10 +29,13 @@ define i32 @extract_double(i32 %arg) { ; SSE-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef ; ; AVX-LABEL: 'extract_double' +; AVX-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v2f64_a = extractelement <2 x double> undef, i32 %arg ; AVX-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v2f64_0 = extractelement <2 x double> undef, i32 0 ; AVX-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v2f64_1 = extractelement <2 x double> undef, i32 1 +; AVX-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v4f64_a = extractelement <4 x double> undef, i32 %arg ; AVX-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v4f64_0 = extractelement <4 x double> undef, i32 0 ; AVX-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v4f64_3 = 
extractelement <4 x double> undef, i32 3 +; AVX-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8f64_a = extractelement <8 x double> undef, i32 %arg ; AVX-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v8f64_0 = extractelement <8 x double> undef, i32 0 ; AVX-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8f64_3 = extractelement <8 x double> undef, i32 3 ; AVX-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v8f64_4 = extractelement <8 x double> undef, i32 4 @@ -37,10 +43,13 @@ define i32 @extract_double(i32 %arg) { ; AVX-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef ; ; AVX512-LABEL: 'extract_double' +; AVX512-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v2f64_a = extractelement <2 x double> undef, i32 %arg ; AVX512-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v2f64_0 = extractelement <2 x double> undef, i32 0 ; AVX512-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v2f64_1 = extractelement <2 x double> undef, i32 1 +; AVX512-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v4f64_a = extractelement <4 x double> undef, i32 %arg ; AVX512-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v4f64_0 = extractelement <4 x double> undef, i32 0 ; AVX512-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v4f64_3 = extractelement <4 x double> undef, i32 3 +; AVX512-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8f64_a = extractelement <8 x double> undef, i32 %arg ; AVX512-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v8f64_0 = extractelement <8 x double> undef, i32 0 ; AVX512-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8f64_3 = extractelement <8 x double> undef, i32 3 ; AVX512-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8f64_4 = extractelement <8 x double> undef, i32 4 @@ -48,22 +57,28 @@ define 
i32 @extract_double(i32 %arg) { ; AVX512-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef ; ; BTVER2-LABEL: 'extract_double' +; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v2f64_a = extractelement <2 x double> undef, i32 %arg ; BTVER2-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v2f64_0 = extractelement <2 x double> undef, i32 0 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v2f64_1 = extractelement <2 x double> undef, i32 1 +; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v4f64_a = extractelement <4 x double> undef, i32 %arg ; BTVER2-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v4f64_0 = extractelement <4 x double> undef, i32 0 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v4f64_3 = extractelement <4 x double> undef, i32 3 +; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8f64_a = extractelement <8 x double> undef, i32 %arg ; BTVER2-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v8f64_0 = extractelement <8 x double> undef, i32 0 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8f64_3 = extractelement <8 x double> undef, i32 3 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v8f64_4 = extractelement <8 x double> undef, i32 4 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8f64_7 = extractelement <8 x double> undef, i32 7 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef ; + %v2f64_a = extractelement <2 x double> undef, i32 %arg %v2f64_0 = extractelement <2 x double> undef, i32 0 %v2f64_1 = extractelement <2 x double> undef, i32 1 + %v4f64_a = extractelement <4 x double> undef, i32 %arg %v4f64_0 = extractelement <4 x double> undef, i32 0 %v4f64_3 = extractelement <4 x double> undef, i32 3 + %v8f64_a = extractelement <8 x double> undef, 
i32 %arg %v8f64_0 = extractelement <8 x double> undef, i32 0 %v8f64_3 = extractelement <8 x double> undef, i32 3 %v8f64_4 = extractelement <8 x double> undef, i32 4 @@ -74,14 +89,18 @@ define i32 @extract_double(i32 %arg) { define i32 @extract_float(i32 %arg) { ; SSE-LABEL: 'extract_float' +; SSE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v2f32_a = extractelement <2 x float> undef, i32 %arg ; SSE-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v2f32_0 = extractelement <2 x float> undef, i32 0 ; SSE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v2f32_1 = extractelement <2 x float> undef, i32 1 +; SSE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v4f32_a = extractelement <4 x float> undef, i32 %arg ; SSE-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v4f32_0 = extractelement <4 x float> undef, i32 0 ; SSE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v4f32_3 = extractelement <4 x float> undef, i32 3 +; SSE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8f32_a = extractelement <8 x float> undef, i32 %arg ; SSE-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v8f32_0 = extractelement <8 x float> undef, i32 0 ; SSE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8f32_3 = extractelement <8 x float> undef, i32 3 ; SSE-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v8f32_4 = extractelement <8 x float> undef, i32 4 ; SSE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8f32_7 = extractelement <8 x float> undef, i32 7 +; SSE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v16f32_a = extractelement <16 x float> undef, i32 %arg ; SSE-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v16f32_0 = extractelement <16 x float> undef, i32 0 ; SSE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v16f32_3 = extractelement <16 x float> 
undef, i32 3 ; SSE-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v16f32_8 = extractelement <16 x float> undef, i32 8 @@ -89,14 +108,18 @@ define i32 @extract_float(i32 %arg) { ; SSE-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef ; ; AVX-LABEL: 'extract_float' +; AVX-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v2f32_a = extractelement <2 x float> undef, i32 %arg ; AVX-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v2f32_0 = extractelement <2 x float> undef, i32 0 ; AVX-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v2f32_1 = extractelement <2 x float> undef, i32 1 +; AVX-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v4f32_a = extractelement <4 x float> undef, i32 %arg ; AVX-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v4f32_0 = extractelement <4 x float> undef, i32 0 ; AVX-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v4f32_3 = extractelement <4 x float> undef, i32 3 +; AVX-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8f32_a = extractelement <8 x float> undef, i32 %arg ; AVX-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v8f32_0 = extractelement <8 x float> undef, i32 0 ; AVX-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8f32_3 = extractelement <8 x float> undef, i32 3 ; AVX-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8f32_4 = extractelement <8 x float> undef, i32 4 ; AVX-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8f32_7 = extractelement <8 x float> undef, i32 7 +; AVX-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v16f32_a = extractelement <16 x float> undef, i32 %arg ; AVX-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v16f32_0 = extractelement <16 x float> undef, i32 0 ; AVX-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v16f32_3 = extractelement 
<16 x float> undef, i32 3 ; AVX-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v16f32_8 = extractelement <16 x float> undef, i32 8 @@ -104,14 +127,18 @@ define i32 @extract_float(i32 %arg) { ; AVX-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef ; ; AVX512-LABEL: 'extract_float' +; AVX512-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v2f32_a = extractelement <2 x float> undef, i32 %arg ; AVX512-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v2f32_0 = extractelement <2 x float> undef, i32 0 ; AVX512-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v2f32_1 = extractelement <2 x float> undef, i32 1 +; AVX512-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v4f32_a = extractelement <4 x float> undef, i32 %arg ; AVX512-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v4f32_0 = extractelement <4 x float> undef, i32 0 ; AVX512-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v4f32_3 = extractelement <4 x float> undef, i32 3 +; AVX512-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8f32_a = extractelement <8 x float> undef, i32 %arg ; AVX512-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v8f32_0 = extractelement <8 x float> undef, i32 0 ; AVX512-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8f32_3 = extractelement <8 x float> undef, i32 3 ; AVX512-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8f32_4 = extractelement <8 x float> undef, i32 4 ; AVX512-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8f32_7 = extractelement <8 x float> undef, i32 7 +; AVX512-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v16f32_a = extractelement <16 x float> undef, i32 %arg ; AVX512-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v16f32_0 = extractelement <16 x float> undef, i32 0 ; AVX512-NEXT: Cost Model: Found an 
estimated cost of 1 for instruction: %v16f32_3 = extractelement <16 x float> undef, i32 3 ; AVX512-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v16f32_8 = extractelement <16 x float> undef, i32 8 @@ -119,31 +146,39 @@ define i32 @extract_float(i32 %arg) { ; AVX512-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef ; ; BTVER2-LABEL: 'extract_float' +; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v2f32_a = extractelement <2 x float> undef, i32 %arg ; BTVER2-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v2f32_0 = extractelement <2 x float> undef, i32 0 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v2f32_1 = extractelement <2 x float> undef, i32 1 +; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v4f32_a = extractelement <4 x float> undef, i32 %arg ; BTVER2-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v4f32_0 = extractelement <4 x float> undef, i32 0 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v4f32_3 = extractelement <4 x float> undef, i32 3 +; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8f32_a = extractelement <8 x float> undef, i32 %arg ; BTVER2-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v8f32_0 = extractelement <8 x float> undef, i32 0 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8f32_3 = extractelement <8 x float> undef, i32 3 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8f32_4 = extractelement <8 x float> undef, i32 4 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8f32_7 = extractelement <8 x float> undef, i32 7 +; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v16f32_a = extractelement <16 x float> undef, i32 %arg ; BTVER2-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v16f32_0 = 
extractelement <16 x float> undef, i32 0 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v16f32_3 = extractelement <16 x float> undef, i32 3 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v16f32_8 = extractelement <16 x float> undef, i32 8 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v16f32_15 = extractelement <16 x float> undef, i32 15 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef ; + %v2f32_a = extractelement <2 x float> undef, i32 %arg %v2f32_0 = extractelement <2 x float> undef, i32 0 %v2f32_1 = extractelement <2 x float> undef, i32 1 + %v4f32_a = extractelement <4 x float> undef, i32 %arg %v4f32_0 = extractelement <4 x float> undef, i32 0 %v4f32_3 = extractelement <4 x float> undef, i32 3 + %v8f32_a = extractelement <8 x float> undef, i32 %arg %v8f32_0 = extractelement <8 x float> undef, i32 0 %v8f32_3 = extractelement <8 x float> undef, i32 3 %v8f32_4 = extractelement <8 x float> undef, i32 4 %v8f32_7 = extractelement <8 x float> undef, i32 7 + %v16f32_a = extractelement <16 x float> undef, i32 %arg %v16f32_0 = extractelement <16 x float> undef, i32 0 %v16f32_3 = extractelement <16 x float> undef, i32 3 %v16f32_8 = extractelement <16 x float> undef, i32 8 @@ -154,10 +189,13 @@ define i32 @extract_float(i32 %arg) { define i32 @extract_i64(i32 %arg) { ; CHECK-LABEL: 'extract_i64' +; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v2i64_a = extractelement <2 x i64> undef, i32 %arg ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v2i64_0 = extractelement <2 x i64> undef, i32 0 ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v2i64_1 = extractelement <2 x i64> undef, i32 1 +; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v4i64_a = extractelement <4 x i64> undef, i32 %arg ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: 
%v4i64_0 = extractelement <4 x i64> undef, i32 0 ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v4i64_3 = extractelement <4 x i64> undef, i32 3 +; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8i64_a = extractelement <8 x i64> undef, i32 %arg ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8i64_0 = extractelement <8 x i64> undef, i32 0 ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8i64_3 = extractelement <8 x i64> undef, i32 3 ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8i64_4 = extractelement <8 x i64> undef, i32 4 @@ -165,22 +203,28 @@ define i32 @extract_i64(i32 %arg) { ; CHECK-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef ; ; BTVER2-LABEL: 'extract_i64' +; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v2i64_a = extractelement <2 x i64> undef, i32 %arg ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v2i64_0 = extractelement <2 x i64> undef, i32 0 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v2i64_1 = extractelement <2 x i64> undef, i32 1 +; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v4i64_a = extractelement <4 x i64> undef, i32 %arg ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v4i64_0 = extractelement <4 x i64> undef, i32 0 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v4i64_3 = extractelement <4 x i64> undef, i32 3 +; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8i64_a = extractelement <8 x i64> undef, i32 %arg ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8i64_0 = extractelement <8 x i64> undef, i32 0 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8i64_3 = extractelement <8 x i64> undef, i32 3 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 
for instruction: %v8i64_4 = extractelement <8 x i64> undef, i32 4 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8i64_7 = extractelement <8 x i64> undef, i32 7 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef ; + %v2i64_a = extractelement <2 x i64> undef, i32 %arg %v2i64_0 = extractelement <2 x i64> undef, i32 0 %v2i64_1 = extractelement <2 x i64> undef, i32 1 + %v4i64_a = extractelement <4 x i64> undef, i32 %arg %v4i64_0 = extractelement <4 x i64> undef, i32 0 %v4i64_3 = extractelement <4 x i64> undef, i32 3 + %v8i64_a = extractelement <8 x i64> undef, i32 %arg %v8i64_0 = extractelement <8 x i64> undef, i32 0 %v8i64_3 = extractelement <8 x i64> undef, i32 3 %v8i64_4 = extractelement <8 x i64> undef, i32 4 @@ -191,14 +235,18 @@ define i32 @extract_i64(i32 %arg) { define i32 @extract_i32(i32 %arg) { ; CHECK-LABEL: 'extract_i32' +; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v2i32_a = extractelement <2 x i32> undef, i32 %arg ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v2i32_0 = extractelement <2 x i32> undef, i32 0 ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v2i32_1 = extractelement <2 x i32> undef, i32 1 +; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v4i32_a = extractelement <4 x i32> undef, i32 %arg ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v4i32_0 = extractelement <4 x i32> undef, i32 0 ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v4i32_3 = extractelement <4 x i32> undef, i32 3 +; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8i32_a = extractelement <8 x i32> undef, i32 %arg ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8i32_0 = extractelement <8 x i32> undef, i32 0 ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8i32_3 = extractelement <8 x i32> undef, 
i32 3 ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8i32_4 = extractelement <8 x i32> undef, i32 4 ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8i32_7 = extractelement <8 x i32> undef, i32 7 +; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v16i32_a = extractelement <16 x i32> undef, i32 %arg ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v16i32_0 = extractelement <16 x i32> undef, i32 0 ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v16i32_3 = extractelement <16 x i32> undef, i32 3 ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v16i32_8 = extractelement <16 x i32> undef, i32 8 @@ -206,31 +254,39 @@ define i32 @extract_i32(i32 %arg) { ; CHECK-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef ; ; BTVER2-LABEL: 'extract_i32' +; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v2i32_a = extractelement <2 x i32> undef, i32 %arg ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v2i32_0 = extractelement <2 x i32> undef, i32 0 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v2i32_1 = extractelement <2 x i32> undef, i32 1 +; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v4i32_a = extractelement <4 x i32> undef, i32 %arg ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v4i32_0 = extractelement <4 x i32> undef, i32 0 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v4i32_3 = extractelement <4 x i32> undef, i32 3 +; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8i32_a = extractelement <8 x i32> undef, i32 %arg ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8i32_0 = extractelement <8 x i32> undef, i32 0 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8i32_3 = 
extractelement <8 x i32> undef, i32 3 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8i32_4 = extractelement <8 x i32> undef, i32 4 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8i32_7 = extractelement <8 x i32> undef, i32 7 +; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v16i32_a = extractelement <16 x i32> undef, i32 %arg ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v16i32_0 = extractelement <16 x i32> undef, i32 0 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v16i32_3 = extractelement <16 x i32> undef, i32 3 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v16i32_8 = extractelement <16 x i32> undef, i32 8 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v16i32_15 = extractelement <16 x i32> undef, i32 15 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef ; + %v2i32_a = extractelement <2 x i32> undef, i32 %arg %v2i32_0 = extractelement <2 x i32> undef, i32 0 %v2i32_1 = extractelement <2 x i32> undef, i32 1 + %v4i32_a = extractelement <4 x i32> undef, i32 %arg %v4i32_0 = extractelement <4 x i32> undef, i32 0 %v4i32_3 = extractelement <4 x i32> undef, i32 3 + %v8i32_a = extractelement <8 x i32> undef, i32 %arg %v8i32_0 = extractelement <8 x i32> undef, i32 0 %v8i32_3 = extractelement <8 x i32> undef, i32 3 %v8i32_4 = extractelement <8 x i32> undef, i32 4 %v8i32_7 = extractelement <8 x i32> undef, i32 7 + %v16i32_a = extractelement <16 x i32> undef, i32 %arg %v16i32_0 = extractelement <16 x i32> undef, i32 0 %v16i32_3 = extractelement <16 x i32> undef, i32 3 %v16i32_8 = extractelement <16 x i32> undef, i32 8 @@ -241,12 +297,15 @@ define i32 @extract_i32(i32 %arg) { define i32 @extract_i16(i32 %arg) { ; CHECK-LABEL: 'extract_i16' +; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8i16_a = extractelement <8 x i16> undef, 
i32 %arg ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8i16_0 = extractelement <8 x i16> undef, i32 0 ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8i16_7 = extractelement <8 x i16> undef, i32 7 +; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v16i16_a = extractelement <16 x i16> undef, i32 %arg ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v16i16_0 = extractelement <16 x i16> undef, i32 0 ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v16i16_7 = extractelement <16 x i16> undef, i32 7 ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v16i16_8 = extractelement <16 x i16> undef, i32 8 ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v16i16_15 = extractelement <16 x i16> undef, i32 15 +; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i16_a = extractelement <32 x i16> undef, i32 %arg ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i16_0 = extractelement <32 x i16> undef, i32 0 ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i16_7 = extractelement <32 x i16> undef, i32 7 ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i16_8 = extractelement <32 x i16> undef, i32 8 @@ -257,12 +316,15 @@ define i32 @extract_i16(i32 %arg) { ; CHECK-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef ; ; BTVER2-LABEL: 'extract_i16' +; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8i16_a = extractelement <8 x i16> undef, i32 %arg ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8i16_0 = extractelement <8 x i16> undef, i32 0 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8i16_7 = extractelement <8 x i16> undef, i32 7 +; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v16i16_a = 
extractelement <16 x i16> undef, i32 %arg ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v16i16_0 = extractelement <16 x i16> undef, i32 0 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v16i16_7 = extractelement <16 x i16> undef, i32 7 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v16i16_8 = extractelement <16 x i16> undef, i32 8 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v16i16_15 = extractelement <16 x i16> undef, i32 15 +; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i16_a = extractelement <32 x i16> undef, i32 %arg ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i16_0 = extractelement <32 x i16> undef, i32 0 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i16_7 = extractelement <32 x i16> undef, i32 7 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i16_8 = extractelement <32 x i16> undef, i32 8 @@ -272,14 +334,17 @@ define i32 @extract_i16(i32 %arg) { ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i16_31 = extractelement <32 x i16> undef, i32 31 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef ; + %v8i16_a = extractelement <8 x i16> undef, i32 %arg %v8i16_0 = extractelement <8 x i16> undef, i32 0 %v8i16_7 = extractelement <8 x i16> undef, i32 7 + %v16i16_a = extractelement <16 x i16> undef, i32 %arg %v16i16_0 = extractelement <16 x i16> undef, i32 0 %v16i16_7 = extractelement <16 x i16> undef, i32 7 %v16i16_8 = extractelement <16 x i16> undef, i32 8 %v16i16_15 = extractelement <16 x i16> undef, i32 15 + %v32i16_a = extractelement <32 x i16> undef, i32 %arg %v32i16_0 = extractelement <32 x i16> undef, i32 0 %v32i16_7 = extractelement <32 x i16> undef, i32 7 %v32i16_8 = extractelement <32 x i16> undef, i32 8 @@ -293,15 +358,18 @@ define i32 @extract_i16(i32 %arg) { 
define i32 @extract_i8(i32 %arg) { ; CHECK-LABEL: 'extract_i8' +; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v16i8_a = extractelement <16 x i8> undef, i32 %arg ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v16i8_0 = extractelement <16 x i8> undef, i32 0 ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v16i8_8 = extractelement <16 x i8> undef, i32 8 ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v16i8_15 = extractelement <16 x i8> undef, i32 15 +; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i8_a = extractelement <32 x i8> undef, i32 %arg ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i8_0 = extractelement <32 x i8> undef, i32 0 ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i8_7 = extractelement <32 x i8> undef, i32 7 ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i8_8 = extractelement <32 x i8> undef, i32 8 ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i8_15 = extractelement <32 x i8> undef, i32 15 ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i8_24 = extractelement <32 x i8> undef, i32 24 ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i8_31 = extractelement <32 x i8> undef, i32 31 +; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v64i8_a = extractelement <64 x i8> undef, i32 %arg ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v64i8_0 = extractelement <64 x i8> undef, i32 0 ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v64i8_7 = extractelement <64 x i8> undef, i32 7 ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v64i8_8 = extractelement <64 x i8> undef, i32 8 @@ -314,15 +382,18 @@ define i32 @extract_i8(i32 %arg) { ; CHECK-NEXT: Cost Model: Found an estimated 
cost of 0 for instruction: ret i32 undef ; ; BTVER2-LABEL: 'extract_i8' +; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v16i8_a = extractelement <16 x i8> undef, i32 %arg ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v16i8_0 = extractelement <16 x i8> undef, i32 0 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v16i8_8 = extractelement <16 x i8> undef, i32 8 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v16i8_15 = extractelement <16 x i8> undef, i32 15 +; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i8_a = extractelement <32 x i8> undef, i32 %arg ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i8_0 = extractelement <32 x i8> undef, i32 0 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i8_7 = extractelement <32 x i8> undef, i32 7 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i8_8 = extractelement <32 x i8> undef, i32 8 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i8_15 = extractelement <32 x i8> undef, i32 15 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i8_24 = extractelement <32 x i8> undef, i32 24 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i8_31 = extractelement <32 x i8> undef, i32 31 +; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v64i8_a = extractelement <64 x i8> undef, i32 %arg ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v64i8_0 = extractelement <64 x i8> undef, i32 0 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v64i8_7 = extractelement <64 x i8> undef, i32 7 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v64i8_8 = extractelement <64 x i8> undef, i32 8 @@ -334,10 +405,12 @@ define i32 @extract_i8(i32 %arg) { ; BTVER2-NEXT: Cost 
Model: Found an estimated cost of 1 for instruction: %v64i8_63 = extractelement <64 x i8> undef, i32 63 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef ; + %v16i8_a = extractelement <16 x i8> undef, i32 %arg %v16i8_0 = extractelement <16 x i8> undef, i32 0 %v16i8_8 = extractelement <16 x i8> undef, i32 8 %v16i8_15 = extractelement <16 x i8> undef, i32 15 + %v32i8_a = extractelement <32 x i8> undef, i32 %arg %v32i8_0 = extractelement <32 x i8> undef, i32 0 %v32i8_7 = extractelement <32 x i8> undef, i32 7 %v32i8_8 = extractelement <32 x i8> undef, i32 8 @@ -345,6 +418,7 @@ define i32 @extract_i8(i32 %arg) { %v32i8_24 = extractelement <32 x i8> undef, i32 24 %v32i8_31 = extractelement <32 x i8> undef, i32 31 + %v64i8_a = extractelement <64 x i8> undef, i32 %arg %v64i8_0 = extractelement <64 x i8> undef, i32 0 %v64i8_7 = extractelement <64 x i8> undef, i32 7 %v64i8_8 = extractelement <64 x i8> undef, i32 8 From llvm-commits at lists.llvm.org Wed Oct 9 05:36:34 2019 From: llvm-commits at lists.llvm.org (Simon Pilgrim via llvm-commits) Date: Wed, 09 Oct 2019 12:36:34 -0000 Subject: [llvm] r374161 - [CostModel][X86] Add tests for insertelement to non-immediate vector element indices Message-ID: <20191009123634.3C4F090B87@lists.llvm.org> Author: rksimon Date: Wed Oct 9 05:36:34 2019 New Revision: 374161 URL: http://llvm.org/viewvc/llvm-project?rev=374161&view=rev Log: [CostModel][X86] Add tests for insertelement to non-immediate vector element indices Modified: llvm/trunk/test/Analysis/CostModel/X86/vector-insert.ll Modified: llvm/trunk/test/Analysis/CostModel/X86/vector-insert.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Analysis/CostModel/X86/vector-insert.ll?rev=374161&r1=374160&r2=374161&view=diff ============================================================================== --- llvm/trunk/test/Analysis/CostModel/X86/vector-insert.ll (original) +++ llvm/trunk/test/Analysis/CostModel/X86/vector-insert.ll Wed Oct 9 
05:36:34 2019 @@ -15,10 +15,13 @@ define i32 @insert_double(i32 %arg) { ; SSE-LABEL: 'insert_double' +; SSE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v2f64_a = insertelement <2 x double> undef, double undef, i32 %arg ; SSE-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v2f64_0 = insertelement <2 x double> undef, double undef, i32 0 ; SSE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v2f64_1 = insertelement <2 x double> undef, double undef, i32 1 +; SSE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v4f64_a = insertelement <4 x double> undef, double undef, i32 %arg ; SSE-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v4f64_0 = insertelement <4 x double> undef, double undef, i32 0 ; SSE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v4f64_3 = insertelement <4 x double> undef, double undef, i32 3 +; SSE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8f64_a = insertelement <8 x double> undef, double undef, i32 %arg ; SSE-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v8f64_0 = insertelement <8 x double> undef, double undef, i32 0 ; SSE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8f64_3 = insertelement <8 x double> undef, double undef, i32 3 ; SSE-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v8f64_4 = insertelement <8 x double> undef, double undef, i32 4 @@ -26,10 +29,13 @@ define i32 @insert_double(i32 %arg) { ; SSE-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef ; ; AVX-LABEL: 'insert_double' +; AVX-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v2f64_a = insertelement <2 x double> undef, double undef, i32 %arg ; AVX-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v2f64_0 = insertelement <2 x double> undef, double undef, i32 0 ; AVX-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v2f64_1 = 
insertelement <2 x double> undef, double undef, i32 1 +; AVX-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v4f64_a = insertelement <4 x double> undef, double undef, i32 %arg ; AVX-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v4f64_0 = insertelement <4 x double> undef, double undef, i32 0 ; AVX-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v4f64_3 = insertelement <4 x double> undef, double undef, i32 3 +; AVX-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8f64_a = insertelement <8 x double> undef, double undef, i32 %arg ; AVX-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v8f64_0 = insertelement <8 x double> undef, double undef, i32 0 ; AVX-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8f64_3 = insertelement <8 x double> undef, double undef, i32 3 ; AVX-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v8f64_4 = insertelement <8 x double> undef, double undef, i32 4 @@ -37,10 +43,13 @@ define i32 @insert_double(i32 %arg) { ; AVX-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef ; ; AVX512-LABEL: 'insert_double' +; AVX512-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v2f64_a = insertelement <2 x double> undef, double undef, i32 %arg ; AVX512-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v2f64_0 = insertelement <2 x double> undef, double undef, i32 0 ; AVX512-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v2f64_1 = insertelement <2 x double> undef, double undef, i32 1 +; AVX512-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v4f64_a = insertelement <4 x double> undef, double undef, i32 %arg ; AVX512-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v4f64_0 = insertelement <4 x double> undef, double undef, i32 0 ; AVX512-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v4f64_3 = insertelement <4 x double> undef, 
double undef, i32 3 +; AVX512-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8f64_a = insertelement <8 x double> undef, double undef, i32 %arg ; AVX512-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v8f64_0 = insertelement <8 x double> undef, double undef, i32 0 ; AVX512-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8f64_3 = insertelement <8 x double> undef, double undef, i32 3 ; AVX512-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8f64_4 = insertelement <8 x double> undef, double undef, i32 4 @@ -48,22 +57,28 @@ define i32 @insert_double(i32 %arg) { ; AVX512-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef ; ; BTVER2-LABEL: 'insert_double' +; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v2f64_a = insertelement <2 x double> undef, double undef, i32 %arg ; BTVER2-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v2f64_0 = insertelement <2 x double> undef, double undef, i32 0 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v2f64_1 = insertelement <2 x double> undef, double undef, i32 1 +; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v4f64_a = insertelement <4 x double> undef, double undef, i32 %arg ; BTVER2-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v4f64_0 = insertelement <4 x double> undef, double undef, i32 0 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v4f64_3 = insertelement <4 x double> undef, double undef, i32 3 +; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8f64_a = insertelement <8 x double> undef, double undef, i32 %arg ; BTVER2-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v8f64_0 = insertelement <8 x double> undef, double undef, i32 0 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8f64_3 = insertelement <8 x double> undef, double 
undef, i32 3 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v8f64_4 = insertelement <8 x double> undef, double undef, i32 4 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8f64_7 = insertelement <8 x double> undef, double undef, i32 7 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef ; + %v2f64_a = insertelement <2 x double> undef, double undef, i32 %arg %v2f64_0 = insertelement <2 x double> undef, double undef, i32 0 %v2f64_1 = insertelement <2 x double> undef, double undef, i32 1 + %v4f64_a = insertelement <4 x double> undef, double undef, i32 %arg %v4f64_0 = insertelement <4 x double> undef, double undef, i32 0 %v4f64_3 = insertelement <4 x double> undef, double undef, i32 3 + %v8f64_a = insertelement <8 x double> undef, double undef, i32 %arg %v8f64_0 = insertelement <8 x double> undef, double undef, i32 0 %v8f64_3 = insertelement <8 x double> undef, double undef, i32 3 %v8f64_4 = insertelement <8 x double> undef, double undef, i32 4 @@ -74,14 +89,18 @@ define i32 @insert_double(i32 %arg) { define i32 @insert_float(i32 %arg) { ; SSE-LABEL: 'insert_float' +; SSE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v2f32_a = insertelement <2 x float> undef, float undef, i32 %arg ; SSE-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v2f32_0 = insertelement <2 x float> undef, float undef, i32 0 ; SSE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v2f32_1 = insertelement <2 x float> undef, float undef, i32 1 +; SSE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v4f32_a = insertelement <4 x float> undef, float undef, i32 %arg ; SSE-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v4f32_0 = insertelement <4 x float> undef, float undef, i32 0 ; SSE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v4f32_3 = insertelement <4 x float> undef, float undef, i32 3 +; SSE-NEXT: Cost Model: 
Found an estimated cost of 1 for instruction: %v8f32_a = insertelement <8 x float> undef, float undef, i32 %arg ; SSE-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v8f32_0 = insertelement <8 x float> undef, float undef, i32 0 ; SSE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8f32_3 = insertelement <8 x float> undef, float undef, i32 3 ; SSE-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v8f32_4 = insertelement <8 x float> undef, float undef, i32 4 ; SSE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8f32_7 = insertelement <8 x float> undef, float undef, i32 7 +; SSE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v16f32_a = insertelement <16 x float> undef, float undef, i32 %arg ; SSE-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v16f32_0 = insertelement <16 x float> undef, float undef, i32 0 ; SSE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v16f32_3 = insertelement <16 x float> undef, float undef, i32 3 ; SSE-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v16f32_8 = insertelement <16 x float> undef, float undef, i32 8 @@ -89,14 +108,18 @@ define i32 @insert_float(i32 %arg) { ; SSE-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef ; ; AVX-LABEL: 'insert_float' +; AVX-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v2f32_a = insertelement <2 x float> undef, float undef, i32 %arg ; AVX-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v2f32_0 = insertelement <2 x float> undef, float undef, i32 0 ; AVX-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v2f32_1 = insertelement <2 x float> undef, float undef, i32 1 +; AVX-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v4f32_a = insertelement <4 x float> undef, float undef, i32 %arg ; AVX-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v4f32_0 = insertelement <4 x 
float> undef, float undef, i32 0 ; AVX-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v4f32_3 = insertelement <4 x float> undef, float undef, i32 3 +; AVX-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8f32_a = insertelement <8 x float> undef, float undef, i32 %arg ; AVX-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v8f32_0 = insertelement <8 x float> undef, float undef, i32 0 ; AVX-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8f32_3 = insertelement <8 x float> undef, float undef, i32 3 ; AVX-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8f32_4 = insertelement <8 x float> undef, float undef, i32 4 ; AVX-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8f32_7 = insertelement <8 x float> undef, float undef, i32 7 +; AVX-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v16f32_a = insertelement <16 x float> undef, float undef, i32 %arg ; AVX-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v16f32_0 = insertelement <16 x float> undef, float undef, i32 0 ; AVX-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v16f32_3 = insertelement <16 x float> undef, float undef, i32 3 ; AVX-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v16f32_8 = insertelement <16 x float> undef, float undef, i32 8 @@ -104,14 +127,18 @@ define i32 @insert_float(i32 %arg) { ; AVX-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef ; ; AVX512-LABEL: 'insert_float' +; AVX512-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v2f32_a = insertelement <2 x float> undef, float undef, i32 %arg ; AVX512-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v2f32_0 = insertelement <2 x float> undef, float undef, i32 0 ; AVX512-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v2f32_1 = insertelement <2 x float> undef, float undef, i32 1 +; AVX512-NEXT: Cost Model: 
Found an estimated cost of 1 for instruction: %v4f32_a = insertelement <4 x float> undef, float undef, i32 %arg ; AVX512-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v4f32_0 = insertelement <4 x float> undef, float undef, i32 0 ; AVX512-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v4f32_3 = insertelement <4 x float> undef, float undef, i32 3 +; AVX512-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8f32_a = insertelement <8 x float> undef, float undef, i32 %arg ; AVX512-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v8f32_0 = insertelement <8 x float> undef, float undef, i32 0 ; AVX512-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8f32_3 = insertelement <8 x float> undef, float undef, i32 3 ; AVX512-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8f32_4 = insertelement <8 x float> undef, float undef, i32 4 ; AVX512-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8f32_7 = insertelement <8 x float> undef, float undef, i32 7 +; AVX512-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v16f32_a = insertelement <16 x float> undef, float undef, i32 %arg ; AVX512-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v16f32_0 = insertelement <16 x float> undef, float undef, i32 0 ; AVX512-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v16f32_3 = insertelement <16 x float> undef, float undef, i32 3 ; AVX512-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v16f32_8 = insertelement <16 x float> undef, float undef, i32 8 @@ -119,31 +146,39 @@ define i32 @insert_float(i32 %arg) { ; AVX512-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef ; ; BTVER2-LABEL: 'insert_float' +; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v2f32_a = insertelement <2 x float> undef, float undef, i32 %arg ; BTVER2-NEXT: Cost Model: Found an estimated cost of 0 
for instruction: %v2f32_0 = insertelement <2 x float> undef, float undef, i32 0 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v2f32_1 = insertelement <2 x float> undef, float undef, i32 1 +; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v4f32_a = insertelement <4 x float> undef, float undef, i32 %arg ; BTVER2-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v4f32_0 = insertelement <4 x float> undef, float undef, i32 0 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v4f32_3 = insertelement <4 x float> undef, float undef, i32 3 +; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8f32_a = insertelement <8 x float> undef, float undef, i32 %arg ; BTVER2-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v8f32_0 = insertelement <8 x float> undef, float undef, i32 0 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8f32_3 = insertelement <8 x float> undef, float undef, i32 3 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8f32_4 = insertelement <8 x float> undef, float undef, i32 4 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8f32_7 = insertelement <8 x float> undef, float undef, i32 7 +; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v16f32_a = insertelement <16 x float> undef, float undef, i32 %arg ; BTVER2-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v16f32_0 = insertelement <16 x float> undef, float undef, i32 0 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v16f32_3 = insertelement <16 x float> undef, float undef, i32 3 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %v16f32_8 = insertelement <16 x float> undef, float undef, i32 8 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v16f32_15 = insertelement <16 x float> undef, float 
undef, i32 15 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef ; + %v2f32_a = insertelement <2 x float> undef, float undef, i32 %arg %v2f32_0 = insertelement <2 x float> undef, float undef, i32 0 %v2f32_1 = insertelement <2 x float> undef, float undef, i32 1 + %v4f32_a = insertelement <4 x float> undef, float undef, i32 %arg %v4f32_0 = insertelement <4 x float> undef, float undef, i32 0 %v4f32_3 = insertelement <4 x float> undef, float undef, i32 3 + %v8f32_a = insertelement <8 x float> undef, float undef, i32 %arg %v8f32_0 = insertelement <8 x float> undef, float undef, i32 0 %v8f32_3 = insertelement <8 x float> undef, float undef, i32 3 %v8f32_4 = insertelement <8 x float> undef, float undef, i32 4 %v8f32_7 = insertelement <8 x float> undef, float undef, i32 7 + %v16f32_a = insertelement <16 x float> undef, float undef, i32 %arg %v16f32_0 = insertelement <16 x float> undef, float undef, i32 0 %v16f32_3 = insertelement <16 x float> undef, float undef, i32 3 %v16f32_8 = insertelement <16 x float> undef, float undef, i32 8 @@ -154,10 +189,13 @@ define i32 @insert_float(i32 %arg) { define i32 @insert_i64(i32 %arg) { ; CHECK-LABEL: 'insert_i64' +; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v2i64_a = insertelement <2 x i64> undef, i64 undef, i32 %arg ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v2i64_0 = insertelement <2 x i64> undef, i64 undef, i32 0 ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v2i64_1 = insertelement <2 x i64> undef, i64 undef, i32 1 +; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v4i64_a = insertelement <4 x i64> undef, i64 undef, i32 %arg ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v4i64_0 = insertelement <4 x i64> undef, i64 undef, i32 0 ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v4i64_3 = insertelement <4 x i64> undef, i64 undef, i32 3 +; 
CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8i64_a = insertelement <8 x i64> undef, i64 undef, i32 %arg ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8i64_0 = insertelement <8 x i64> undef, i64 undef, i32 0 ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8i64_3 = insertelement <8 x i64> undef, i64 undef, i32 3 ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8i64_4 = insertelement <8 x i64> undef, i64 undef, i32 4 @@ -165,22 +203,28 @@ define i32 @insert_i64(i32 %arg) { ; CHECK-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef ; ; BTVER2-LABEL: 'insert_i64' +; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v2i64_a = insertelement <2 x i64> undef, i64 undef, i32 %arg ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v2i64_0 = insertelement <2 x i64> undef, i64 undef, i32 0 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v2i64_1 = insertelement <2 x i64> undef, i64 undef, i32 1 +; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v4i64_a = insertelement <4 x i64> undef, i64 undef, i32 %arg ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v4i64_0 = insertelement <4 x i64> undef, i64 undef, i32 0 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v4i64_3 = insertelement <4 x i64> undef, i64 undef, i32 3 +; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8i64_a = insertelement <8 x i64> undef, i64 undef, i32 %arg ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8i64_0 = insertelement <8 x i64> undef, i64 undef, i32 0 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8i64_3 = insertelement <8 x i64> undef, i64 undef, i32 3 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8i64_4 = insertelement <8 
x i64> undef, i64 undef, i32 4 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8i64_7 = insertelement <8 x i64> undef, i64 undef, i32 7 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef ; + %v2i64_a = insertelement <2 x i64> undef, i64 undef, i32 %arg %v2i64_0 = insertelement <2 x i64> undef, i64 undef, i32 0 %v2i64_1 = insertelement <2 x i64> undef, i64 undef, i32 1 + %v4i64_a = insertelement <4 x i64> undef, i64 undef, i32 %arg %v4i64_0 = insertelement <4 x i64> undef, i64 undef, i32 0 %v4i64_3 = insertelement <4 x i64> undef, i64 undef, i32 3 + %v8i64_a = insertelement <8 x i64> undef, i64 undef, i32 %arg %v8i64_0 = insertelement <8 x i64> undef, i64 undef, i32 0 %v8i64_3 = insertelement <8 x i64> undef, i64 undef, i32 3 %v8i64_4 = insertelement <8 x i64> undef, i64 undef, i32 4 @@ -191,14 +235,18 @@ define i32 @insert_i64(i32 %arg) { define i32 @insert_i32(i32 %arg) { ; CHECK-LABEL: 'insert_i32' +; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v2i32_a = insertelement <2 x i32> undef, i32 undef, i32 %arg ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v2i32_0 = insertelement <2 x i32> undef, i32 undef, i32 0 ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v2i32_1 = insertelement <2 x i32> undef, i32 undef, i32 1 +; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v4i32_a = insertelement <4 x i32> undef, i32 undef, i32 %arg ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v4i32_0 = insertelement <4 x i32> undef, i32 undef, i32 0 ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v4i32_3 = insertelement <4 x i32> undef, i32 undef, i32 3 +; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8i32_a = insertelement <8 x i32> undef, i32 undef, i32 %arg ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8i32_0 = 
insertelement <8 x i32> undef, i32 undef, i32 0 ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8i32_3 = insertelement <8 x i32> undef, i32 undef, i32 3 ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8i32_4 = insertelement <8 x i32> undef, i32 undef, i32 4 ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8i32_7 = insertelement <8 x i32> undef, i32 undef, i32 7 +; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v16i32_a = insertelement <16 x i32> undef, i32 undef, i32 %arg ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v16i32_0 = insertelement <16 x i32> undef, i32 undef, i32 0 ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v16i32_3 = insertelement <16 x i32> undef, i32 undef, i32 3 ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v16i32_8 = insertelement <16 x i32> undef, i32 undef, i32 8 @@ -206,31 +254,39 @@ define i32 @insert_i32(i32 %arg) { ; CHECK-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef ; ; BTVER2-LABEL: 'insert_i32' +; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v2i32_a = insertelement <2 x i32> undef, i32 undef, i32 %arg ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v2i32_0 = insertelement <2 x i32> undef, i32 undef, i32 0 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v2i32_1 = insertelement <2 x i32> undef, i32 undef, i32 1 +; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v4i32_a = insertelement <4 x i32> undef, i32 undef, i32 %arg ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v4i32_0 = insertelement <4 x i32> undef, i32 undef, i32 0 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v4i32_3 = insertelement <4 x i32> undef, i32 undef, i32 3 +; BTVER2-NEXT: Cost Model: Found an estimated 
cost of 1 for instruction: %v8i32_a = insertelement <8 x i32> undef, i32 undef, i32 %arg ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8i32_0 = insertelement <8 x i32> undef, i32 undef, i32 0 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8i32_3 = insertelement <8 x i32> undef, i32 undef, i32 3 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8i32_4 = insertelement <8 x i32> undef, i32 undef, i32 4 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8i32_7 = insertelement <8 x i32> undef, i32 undef, i32 7 +; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v16i32_a = insertelement <16 x i32> undef, i32 undef, i32 %arg ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v16i32_0 = insertelement <16 x i32> undef, i32 undef, i32 0 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v16i32_3 = insertelement <16 x i32> undef, i32 undef, i32 3 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v16i32_8 = insertelement <16 x i32> undef, i32 undef, i32 8 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v16i32_15 = insertelement <16 x i32> undef, i32 undef, i32 15 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef ; + %v2i32_a = insertelement <2 x i32> undef, i32 undef, i32 %arg %v2i32_0 = insertelement <2 x i32> undef, i32 undef, i32 0 %v2i32_1 = insertelement <2 x i32> undef, i32 undef, i32 1 + %v4i32_a = insertelement <4 x i32> undef, i32 undef, i32 %arg %v4i32_0 = insertelement <4 x i32> undef, i32 undef, i32 0 %v4i32_3 = insertelement <4 x i32> undef, i32 undef, i32 3 + %v8i32_a = insertelement <8 x i32> undef, i32 undef, i32 %arg %v8i32_0 = insertelement <8 x i32> undef, i32 undef, i32 0 %v8i32_3 = insertelement <8 x i32> undef, i32 undef, i32 3 %v8i32_4 = insertelement <8 x i32> undef, i32 undef, i32 4 %v8i32_7 = 
insertelement <8 x i32> undef, i32 undef, i32 7 + %v16i32_a = insertelement <16 x i32> undef, i32 undef, i32 %arg %v16i32_0 = insertelement <16 x i32> undef, i32 undef, i32 0 %v16i32_3 = insertelement <16 x i32> undef, i32 undef, i32 3 %v16i32_8 = insertelement <16 x i32> undef, i32 undef, i32 8 @@ -241,12 +297,15 @@ define i32 @insert_i32(i32 %arg) { define i32 @insert_i16(i32 %arg) { ; CHECK-LABEL: 'insert_i16' +; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8i16_a = insertelement <8 x i16> undef, i16 undef, i32 %arg ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8i16_0 = insertelement <8 x i16> undef, i16 undef, i32 0 ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8i16_7 = insertelement <8 x i16> undef, i16 undef, i32 7 +; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v16i16_a = insertelement <16 x i16> undef, i16 undef, i32 %arg ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v16i16_0 = insertelement <16 x i16> undef, i16 undef, i32 0 ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v16i16_7 = insertelement <16 x i16> undef, i16 undef, i32 7 ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v16i16_8 = insertelement <16 x i16> undef, i16 undef, i32 8 ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v16i16_15 = insertelement <16 x i16> undef, i16 undef, i32 15 +; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i16_a = insertelement <32 x i16> undef, i16 undef, i32 %arg ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i16_0 = insertelement <32 x i16> undef, i16 undef, i32 0 ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i16_7 = insertelement <32 x i16> undef, i16 undef, i32 7 ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i16_8 = insertelement <32 x i16> 
undef, i16 undef, i32 8 @@ -257,12 +316,15 @@ define i32 @insert_i16(i32 %arg) { ; CHECK-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef ; ; BTVER2-LABEL: 'insert_i16' +; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8i16_a = insertelement <8 x i16> undef, i16 undef, i32 %arg ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8i16_0 = insertelement <8 x i16> undef, i16 undef, i32 0 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v8i16_7 = insertelement <8 x i16> undef, i16 undef, i32 7 +; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v16i16_a = insertelement <16 x i16> undef, i16 undef, i32 %arg ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v16i16_0 = insertelement <16 x i16> undef, i16 undef, i32 0 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v16i16_7 = insertelement <16 x i16> undef, i16 undef, i32 7 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v16i16_8 = insertelement <16 x i16> undef, i16 undef, i32 8 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v16i16_15 = insertelement <16 x i16> undef, i16 undef, i32 15 +; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i16_a = insertelement <32 x i16> undef, i16 undef, i32 %arg ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i16_0 = insertelement <32 x i16> undef, i16 undef, i32 0 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i16_7 = insertelement <32 x i16> undef, i16 undef, i32 7 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i16_8 = insertelement <32 x i16> undef, i16 undef, i32 8 @@ -272,14 +334,17 @@ define i32 @insert_i16(i32 %arg) { ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i16_31 = insertelement <32 x i16> undef, i16 undef, 
i32 31 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef ; + %v8i16_a = insertelement <8 x i16> undef, i16 undef, i32 %arg %v8i16_0 = insertelement <8 x i16> undef, i16 undef, i32 0 %v8i16_7 = insertelement <8 x i16> undef, i16 undef, i32 7 + %v16i16_a = insertelement <16 x i16> undef, i16 undef, i32 %arg %v16i16_0 = insertelement <16 x i16> undef, i16 undef, i32 0 %v16i16_7 = insertelement <16 x i16> undef, i16 undef, i32 7 %v16i16_8 = insertelement <16 x i16> undef, i16 undef, i32 8 %v16i16_15 = insertelement <16 x i16> undef, i16 undef, i32 15 + %v32i16_a = insertelement <32 x i16> undef, i16 undef, i32 %arg %v32i16_0 = insertelement <32 x i16> undef, i16 undef, i32 0 %v32i16_7 = insertelement <32 x i16> undef, i16 undef, i32 7 %v32i16_8 = insertelement <32 x i16> undef, i16 undef, i32 8 @@ -293,15 +358,18 @@ define i32 @insert_i16(i32 %arg) { define i32 @insert_i8(i32 %arg) { ; CHECK-LABEL: 'insert_i8' +; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v16i8_a = insertelement <16 x i8> undef, i8 undef, i32 %arg ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v16i8_0 = insertelement <16 x i8> undef, i8 undef, i32 0 ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v16i8_8 = insertelement <16 x i8> undef, i8 undef, i32 8 ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v16i8_15 = insertelement <16 x i8> undef, i8 undef, i32 15 +; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i8_a = insertelement <32 x i8> undef, i8 undef, i32 %arg ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i8_0 = insertelement <32 x i8> undef, i8 undef, i32 0 ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i8_7 = insertelement <32 x i8> undef, i8 undef, i32 7 ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i8_8 = insertelement <32 x i8> undef, i8 
undef, i32 8 ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i8_15 = insertelement <32 x i8> undef, i8 undef, i32 15 ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i8_24 = insertelement <32 x i8> undef, i8 undef, i32 24 ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i8_31 = insertelement <32 x i8> undef, i8 undef, i32 31 +; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v64i8_a = insertelement <64 x i8> undef, i8 undef, i32 %arg ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v64i8_0 = insertelement <64 x i8> undef, i8 undef, i32 0 ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v64i8_7 = insertelement <64 x i8> undef, i8 undef, i32 7 ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v64i8_8 = insertelement <64 x i8> undef, i8 undef, i32 8 @@ -314,15 +382,18 @@ define i32 @insert_i8(i32 %arg) { ; CHECK-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef ; ; BTVER2-LABEL: 'insert_i8' +; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v16i8_a = insertelement <16 x i8> undef, i8 undef, i32 %arg ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v16i8_0 = insertelement <16 x i8> undef, i8 undef, i32 0 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v16i8_8 = insertelement <16 x i8> undef, i8 undef, i32 8 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v16i8_15 = insertelement <16 x i8> undef, i8 undef, i32 15 +; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i8_a = insertelement <32 x i8> undef, i8 undef, i32 %arg ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i8_0 = insertelement <32 x i8> undef, i8 undef, i32 0 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i8_7 = insertelement 
<32 x i8> undef, i8 undef, i32 7 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i8_8 = insertelement <32 x i8> undef, i8 undef, i32 8 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i8_15 = insertelement <32 x i8> undef, i8 undef, i32 15 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i8_24 = insertelement <32 x i8> undef, i8 undef, i32 24 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i8_31 = insertelement <32 x i8> undef, i8 undef, i32 31 +; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v64i8_a = insertelement <64 x i8> undef, i8 undef, i32 %arg ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v64i8_0 = insertelement <64 x i8> undef, i8 undef, i32 0 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v64i8_7 = insertelement <64 x i8> undef, i8 undef, i32 7 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v64i8_8 = insertelement <64 x i8> undef, i8 undef, i32 8 @@ -334,10 +405,12 @@ define i32 @insert_i8(i32 %arg) { ; BTVER2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v64i8_63 = insertelement <64 x i8> undef, i8 undef, i32 63 ; BTVER2-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef ; + %v16i8_a = insertelement <16 x i8> undef, i8 undef, i32 %arg %v16i8_0 = insertelement <16 x i8> undef, i8 undef, i32 0 %v16i8_8 = insertelement <16 x i8> undef, i8 undef, i32 8 %v16i8_15 = insertelement <16 x i8> undef, i8 undef, i32 15 + %v32i8_a = insertelement <32 x i8> undef, i8 undef, i32 %arg %v32i8_0 = insertelement <32 x i8> undef, i8 undef, i32 0 %v32i8_7 = insertelement <32 x i8> undef, i8 undef, i32 7 %v32i8_8 = insertelement <32 x i8> undef, i8 undef, i32 8 @@ -345,6 +418,7 @@ define i32 @insert_i8(i32 %arg) { %v32i8_24 = insertelement <32 x i8> undef, i8 undef, i32 24 %v32i8_31 = insertelement <32 x i8> 
undef, i8 undef, i32 31 + %v64i8_a = insertelement <64 x i8> undef, i8 undef, i32 %arg %v64i8_0 = insertelement <64 x i8> undef, i8 undef, i32 0 %v64i8_7 = insertelement <64 x i8> undef, i8 undef, i32 7 %v64i8_8 = insertelement <64 x i8> undef, i8 undef, i32 8 From llvm-commits at lists.llvm.org Wed Oct 9 05:37:56 2019 From: llvm-commits at lists.llvm.org (Clement Courbet via llvm-commits) Date: Wed, 09 Oct 2019 12:37:56 -0000 Subject: [llvm] r374162 - [llvm-exegesis] Fix r374158 Message-ID: <20191009123756.68B3390B9D@lists.llvm.org> Author: courbet Date: Wed Oct 9 05:37:56 2019 New Revision: 374162 URL: http://llvm.org/viewvc/llvm-project?rev=374162&view=rev Log: [llvm-exegesis] Fix r374158 Some bots complain about missing 'class': LlvmState.h:70:40: error: declaration of ‘std::unique_ptr llvm::exegesis::LLVMState::TargetMachine’ [-fpermissive] std::unique_ptr TargetMachine; Modified: llvm/trunk/tools/llvm-exegesis/lib/LlvmState.cpp llvm/trunk/tools/llvm-exegesis/lib/LlvmState.h Modified: llvm/trunk/tools/llvm-exegesis/lib/LlvmState.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-exegesis/lib/LlvmState.cpp?rev=374162&r1=374161&r2=374162&view=diff ============================================================================== --- llvm/trunk/tools/llvm-exegesis/lib/LlvmState.cpp (original) +++ llvm/trunk/tools/llvm-exegesis/lib/LlvmState.cpp Wed Oct 9 05:37:56 2019 @@ -27,10 +27,10 @@ LLVMState::LLVMState(const std::string & const Target *const TheTarget = TargetRegistry::lookupTarget(Triple, Error); assert(TheTarget && "unknown target for host"); const TargetOptions Options; - TargetMachine.reset( + TheTargetMachine.reset( static_cast(TheTarget->createTargetMachine( Triple, CpuName, Features, Options, Reloc::Model::Static))); - TheExegesisTarget = ExegesisTarget::lookup(TargetMachine->getTargetTriple()); + TheExegesisTarget = ExegesisTarget::lookup(TheTargetMachine->getTargetTriple()); if (!TheExegesisTarget) { errs() << "no exegesis target for " 
<< Triple << ", using default\n"; TheExegesisTarget = &ExegesisTarget::getDefault(); @@ -51,26 +51,26 @@ LLVMState::LLVMState(const std::string & std::unique_ptr LLVMState::createTargetMachine() const { return std::unique_ptr(static_cast( - TargetMachine->getTarget().createTargetMachine( - TargetMachine->getTargetTriple().normalize(), - TargetMachine->getTargetCPU(), - TargetMachine->getTargetFeatureString(), TargetMachine->Options, + TheTargetMachine->getTarget().createTargetMachine( + TheTargetMachine->getTargetTriple().normalize(), + TheTargetMachine->getTargetCPU(), + TheTargetMachine->getTargetFeatureString(), TheTargetMachine->Options, Reloc::Model::Static))); } bool LLVMState::canAssemble(const MCInst &Inst) const { MCObjectFileInfo ObjectFileInfo; - MCContext Context(TargetMachine->getMCAsmInfo(), - TargetMachine->getMCRegisterInfo(), &ObjectFileInfo); + MCContext Context(TheTargetMachine->getMCAsmInfo(), + TheTargetMachine->getMCRegisterInfo(), &ObjectFileInfo); std::unique_ptr CodeEmitter( - TargetMachine->getTarget().createMCCodeEmitter( - *TargetMachine->getMCInstrInfo(), *TargetMachine->getMCRegisterInfo(), + TheTargetMachine->getTarget().createMCCodeEmitter( + *TheTargetMachine->getMCInstrInfo(), *TheTargetMachine->getMCRegisterInfo(), Context)); SmallVector Tmp; raw_svector_ostream OS(Tmp); SmallVector Fixups; CodeEmitter->encodeInstruction(Inst, OS, Fixups, - *TargetMachine->getMCSubtargetInfo()); + *TheTargetMachine->getMCSubtargetInfo()); return Tmp.size() > 0; } Modified: llvm/trunk/tools/llvm-exegesis/lib/LlvmState.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-exegesis/lib/LlvmState.h?rev=374162&r1=374161&r2=374162&view=diff ============================================================================== --- llvm/trunk/tools/llvm-exegesis/lib/LlvmState.h (original) +++ llvm/trunk/tools/llvm-exegesis/lib/LlvmState.h Wed Oct 9 05:37:56 2019 @@ -42,7 +42,7 @@ public: const std::string &CpuName, const std::string &Features = ""); // 
For tests. - const TargetMachine &getTargetMachine() const { return *TargetMachine; } + const TargetMachine &getTargetMachine() const { return *TheTargetMachine; } std::unique_ptr createTargetMachine() const; const ExegesisTarget &getExegesisTarget() const { return *TheExegesisTarget; } @@ -51,13 +51,13 @@ public: // For convenience: const MCInstrInfo &getInstrInfo() const { - return *TargetMachine->getMCInstrInfo(); + return *TheTargetMachine->getMCInstrInfo(); } const MCRegisterInfo &getRegInfo() const { - return *TargetMachine->getMCRegisterInfo(); + return *TheTargetMachine->getMCRegisterInfo(); } const MCSubtargetInfo &getSubtargetInfo() const { - return *TargetMachine->getMCSubtargetInfo(); + return *TheTargetMachine->getMCSubtargetInfo(); } const RegisterAliasingTrackerCache &getRATC() const { return *RATC; } @@ -67,7 +67,7 @@ public: private: const ExegesisTarget *TheExegesisTarget; - std::unique_ptr TargetMachine; + std::unique_ptr TheTargetMachine; std::unique_ptr RATC; std::unique_ptr IC; const PfmCountersInfo *PfmCounters; From llvm-commits at lists.llvm.org Wed Oct 9 05:40:40 2019 From: llvm-commits at lists.llvm.org (Sam Elliott via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 12:40:40 +0000 (UTC) Subject: [PATCH] D68698: [test-suite] Add Architecture Detection for RISC-V Message-ID: lenary created this revision. lenary added reviewers: asb, luismarques. Herald added subscribers: llvm-commits, s.egerton, PkmX, rkruppe, rogfer01, shiva0217, kito-cheng, simoncook, mgorny. Herald added a project: LLVM. The LLVM test suite has its own way of detecting the system architecture. This adds support in that file for detecting RISC-V. We use "riscv64" to identify 64-bit RISC-V, and "riscv32" to identify 32-bit RISC-V, so that attempting to detect "riscv" matches any version of RISC-V. 
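For readers who want to try the naming scheme outside the test suite, here is a minimal C++ sketch of the same idea. It uses the `__riscv` and `__riscv_xlen` predefined macros that the patch relies on. The function names `detectedRiscvName` and `isAnyRiscv` are hypothetical, and the prefix match is only an assumption about how a consumer might test for "riscv"; the test suite's actual CMake matching is not shown in this message.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: returns the architecture name the patch would report
// on this target, or an empty string when not compiling for RISC-V.
inline const char *detectedRiscvName() {
#if defined(__riscv) && defined(__riscv_xlen) && __riscv_xlen == 64
  return "riscv64";
#elif defined(__riscv) && defined(__riscv_xlen) && __riscv_xlen == 32
  return "riscv32";
#else
  return "";
#endif
}

// Hypothetical consumer-side check (assumption: a simple prefix match).
// Both "riscv64" and "riscv32" start with "riscv", so one test covers both.
inline bool isAnyRiscv(const std::string &Arch) {
  return Arch.rfind("riscv", 0) == 0;
}
```

Because both names share the "riscv" prefix, code that only cares about the ISA family can use one check, while code that needs the register width can compare against the full name.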
Repository: rT test-suite https://reviews.llvm.org/D68698 Files: cmake/modules/DetectArchitecture.c Index: cmake/modules/DetectArchitecture.c =================================================================== --- cmake/modules/DetectArchitecture.c +++ cmake/modules/DetectArchitecture.c @@ -8,6 +8,12 @@ const char *str = "ARCHITECTURE IS Mips"; #elif defined(__powerpc__) || defined(__ppc__) || defined(__power__) const char *str = "ARCHITECTURE IS PowerPC"; +#elif defined(__riscv) +#if __riscv_xlen == 64 +const char *str = "ARCHITECTURE IS riscv64"; +#elif __riscv_xlen == 32 +const char *str = "ARCHITECTURE IS riscv32"; +#endif #elif defined(__s390__) const char *str = "ARCHITECTURE IS SystemZ"; #elif defined(__sparc__) -------------- next part -------------- A non-text attachment was scrubbed... Name: D68698.224025.patch Type: text/x-patch Size: 635 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 05:49:52 2019 From: llvm-commits at lists.llvm.org (Ayal Zaks via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 12:49:52 +0000 (UTC) Subject: [PATCH] D68082: [LV] Emitting SCEV checks with OptForSize In-Reply-To: References: Message-ID: <0326611a2b2b16dc24d2813fdc3a124c@localhost.localdomain> Ayal accepted this revision. Ayal added a comment. This revision is now accepted and ready to land. This LGTM, with additional CHECK-NOT to the test, thanks! ================ Comment at: llvm/test/Transforms/LoopVectorize/optsize.ll:100 +; the non-consecutive loads/stores can be scalarized: +; +; CHECK: vector.body: ---------------- Better also verify here that no SCEV predicates get generated, e.g., "CHECK-NOT: vector.scevcheck" as in pr39417-optsize-scevchecks.ll, or CHECK-NOT the predicates themselves as in pr34681.ll. 
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68082/new/ https://reviews.llvm.org/D68082 From llvm-commits at lists.llvm.org Wed Oct 9 05:49:52 2019 From: llvm-commits at lists.llvm.org (Gil Rapaport via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 12:49:52 +0000 (UTC) Subject: [PATCH] D68577: [LV] Apply sink-after & interleave-groups as VPlan transformations (NFC) In-Reply-To: References: Message-ID: <199219a3d89a2544c8ec5985c247f8ad@localhost.localdomain> gilr marked an inline comment as done. gilr added inline comments. ================ Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:6472 +bool LoopVectorizationPlanner::tryToInterleaveMemory( + const InterleaveGroup *IG, VFRange &Range) { ---------------- rengolin wrote: > Other try{something} functions return a recipe pointer, while this one returns a boolean. > > If you rename this to "check" or "can" (instead of try), then you shouldn't clamp the range. > > I'm not sure what's best here, but this way looks a bit odd. Agreed. Will inline this code at call site instead. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68577/new/ https://reviews.llvm.org/D68577 From llvm-commits at lists.llvm.org Wed Oct 9 05:49:53 2019 From: llvm-commits at lists.llvm.org (David Stenberg via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 12:49:53 +0000 (UTC) Subject: [PATCH] D68465: [DebugInfo] Trim call-clobbered location list entries when tuning for GDB In-Reply-To: References: Message-ID: <99d297b7bd9e383c706ed9a7f65f55c4@localhost.localdomain> dstenb added a comment. In D68465#1698682 , @dblaikie wrote: > In D68465#1697407 , @dstenb wrote: > > > In D68465#1695482 , @dblaikie wrote: > > > > > Thanks for bringing this up! > > > > > > A few thoughts from me: > > > > > > 1. Yeah, I tend to agree with the DWARF Committee folks & the fact that LLDB can do the right thing without this change sort of points to this being a "fix it in GDB" situation. 
Have you tried asking the GDB folks about it/submitting patches there rather than here? > > > > > > No, we have not done that yet. > > > I think it'd be worthwhile having at least a statement from GDB that they feel this should be the responsibility of the producer. Though even if that's the answer they provide - I think some amount of pushback (especially given the existence proof of LLDB's behavior, by the sounds of it/if I'm understanding you correctly) might be worthwhile. Yes, that's fair. I sent a mail to the GDB mailing list now: https://sourceware.org/ml/gdb/2019-10/msg00002.html. As mentioned in the mail, it appears that registers actually are described in that way for some targets, e.g. RS/6000 and S/390. Sorry for not noticing that before. (Although, I did not test any such targets in practice, so I might have misunderstood that code.) Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68465/new/ https://reviews.llvm.org/D68465 From llvm-commits at lists.llvm.org Wed Oct 9 05:53:49 2019 From: llvm-commits at lists.llvm.org (Nico Weber via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 12:53:49 +0000 (UTC) Subject: [PATCH] D68133: [Symbolize] Use the local MSVC C++ demangler instead of relying on dbghelp. NFC. In-Reply-To: References: Message-ID: <6bad3857d65a5c6e90ab9e35c4c59d13@localhost.localdomain> thakis added a comment. > for the fuzzer sanitizer's backtrace disambiguation log, there might be a risk that something is expecting to process the log, which might not be ready to handle extra unexpected keywords. I might be overly cautious though. > > In the LLDB case, a concern was voiced that some parts of LLDB might try to parse the demangled symbol names, and not expect those extra keywords there. I think LLDB/win and fuzzers/win might have few enough clients that we can just try changing it and see what breaks.
Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68133/new/ https://reviews.llvm.org/D68133 From llvm-commits at lists.llvm.org Wed Oct 9 05:59:11 2019 From: llvm-commits at lists.llvm.org (Sjoerd Meijer via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 12:59:11 +0000 (UTC) Subject: [PATCH] D68082: [LV] Emitting SCEV checks with OptForSize In-Reply-To: References: Message-ID: <322df92d226d0c108f2521e5c6e2f3e2@localhost.localdomain> SjoerdMeijer added a comment. Many thanks again for reviewing. I will add the CHECK-NOT before committing, and as I said, I will follow up soon to address the other improvement opportunities. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68082/new/ https://reviews.llvm.org/D68082 From llvm-commits at lists.llvm.org Wed Oct 9 05:59:11 2019 From: llvm-commits at lists.llvm.org (Gil Rapaport via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 12:59:11 +0000 (UTC) Subject: [PATCH] D68577: [LV] Apply sink-after & interleave-groups as VPlan transformations (NFC) In-Reply-To: References: Message-ID: gilr updated this revision to Diff 224026. gilr added a comment. - Applied review comment - Simplified predicate lambda function CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68577/new/ https://reviews.llvm.org/D68577 Files: llvm/lib/Transforms/Vectorize/LoopVectorizationPlanner.h llvm/lib/Transforms/Vectorize/LoopVectorize.cpp llvm/lib/Transforms/Vectorize/VPRecipeBuilder.h llvm/lib/Transforms/Vectorize/VPlan.cpp llvm/lib/Transforms/Vectorize/VPlan.h -------------- next part -------------- A non-text attachment was scrubbed... Name: D68577.224026.patch Type: text/x-patch Size: 19954 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 06:03:30 2019 From: llvm-commits at lists.llvm.org (Simon Tatham via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 13:03:30 +0000 (UTC) Subject: [PATCH] D67158: [ARM] Begin adding IR intrinsics for MVE instructions.
In-Reply-To: References: Message-ID: simon_tatham updated this revision to Diff 224027. simon_tatham added a comment. Split this patch into three as requested. This one now contains only the subset of the previous IR intrinsics that can be implemented by Tablegen patterns. The ones using C++ are moved out to two followup patches. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67158/new/ https://reviews.llvm.org/D67158 Files: llvm/include/llvm/IR/IntrinsicsARM.td llvm/lib/Target/ARM/ARMISelLowering.cpp llvm/lib/Target/ARM/ARMInstrMVE.td llvm/test/CodeGen/Thumb2/mve-intrinsics/vaddq.ll llvm/test/CodeGen/Thumb2/mve-intrinsics/vcvt.ll llvm/test/CodeGen/Thumb2/mve-intrinsics/vminvq.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D67158.224027.patch Type: text/x-patch Size: 21637 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 06:03:30 2019 From: llvm-commits at lists.llvm.org (Simon Tatham via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 13:03:30 +0000 (UTC) Subject: [PATCH] D68699: [ARM] Add some sample IR MVE intrinsics with C++ isel. Message-ID: simon_tatham created this revision. simon_tatham added reviewers: dmgreen, miyuki, ostannard. Herald added subscribers: llvm-commits, hiraditya, kristof.beyls. Herald added a project: LLVM. This adds some initial example IR intrinsics for MVE instructions that deliver multiple output values, and hence, have to be instruction-selected by custom C++ code instead of Tablegen patterns.
I've added the writeback gather load instructions (taking a vector of base addresses and a single common offset, returning a vector of loaded values and an updated vector of base addresses); one example from the long shift family (taking and returning a 64-bit value in two GPRs); and the VADC instruction (which propagates a carry bit from each vector-lane addition to the next, taking an input carry flag in FPSCR and outputting the final one in FPSCR as well). To support the VPT-predicated forms of these instructions, I've written some helper functions to add the cluster of MVE predicate operands to the end of a MachineInstr. `AddMVEPredicateToOps` is used when the instruction actually is predicated (so it takes a predicate mask argument), and `AddEmptyMVEPredicateToOps` is for when the instruction is unpredicated (so it fills in $noreg for the mask). Each one comes in a form suitable for `vpred_n`, and one for `vpred_r` which takes the extra 'inactive' parameter. For VADC, the representation of the carry flag in the IR intrinsic is a word intended to be moved directly to and from `FPSCR_nzcvqc`, i.e. with the carry flag in bit 29 of the word. (The user-facing ACLE intrinsic will want it to be in bit 0, but I'll do that on the clang side.) Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D68699 Files: llvm/include/llvm/IR/IntrinsicsARM.td llvm/lib/Target/ARM/ARMISelDAGToDAG.cpp llvm/test/CodeGen/Thumb2/mve-intrinsics/scalar-shifts.ll llvm/test/CodeGen/Thumb2/mve-intrinsics/vadc.ll llvm/test/CodeGen/Thumb2/mve-intrinsics/vldr.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D68699.224028.patch Type: text/x-patch Size: 16413 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 06:03:31 2019 From: llvm-commits at lists.llvm.org (Simon Tatham via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 13:03:31 +0000 (UTC) Subject: [PATCH] D68700: [ARM] Add IR intrinsics for MVE VLD[24] and VST[24]. 
Message-ID: simon_tatham created this revision. simon_tatham added reviewers: dmgreen, miyuki, ostannard. Herald added subscribers: llvm-commits, hiraditya, kristof.beyls. Herald added a project: LLVM. The VST2 and VST4 instructions take two or four vector registers as input, and store part of each register to memory in an interleaved pattern. They come in variants indicating which part of each register they store (VST20 and VST21; VST40 to VST43 inclusive); the intention is that issuing each of those variants in turn has the combined effect of loading or storing the whole set of registers to a memory block of equal size. The corresponding VLD2 and VLD4 instructions load from memory in the same interleaved format: each one overwrites only part of its output register set, and again, the idea is that if you use VLD4{0,1,2,3} or VLD2{0,1} together, you end up having written to the whole of each register. I've implemented the stores and loads quite differently. The loads were easiest to implement as a single intrinsic that expands to all four VLD4x instructions or both VLD2x, delivering four complete output registers. (Implementing each individual load as a separate instruction taking four input registers to partially overwrite is possible in theory, but pointless, and when I tried it, I found it would need extra work to get the register allocation not to be horrible.) Since that intrinsic delivers multiple outputs, it has to be instruction-selected in custom C++. But the store instructions are easier to model individually, because they don't overwrite any register at all and you can write a DAG Isel pattern in Tablegen for each one. Hence, my new intrinsic `int_arm_mve_vld4q` expands to four load instructions, delivers four full output vectors, and is handled by C++ code, whereas `int_arm_mve_vst4q` expands to just one store instruction, takes four input vectors and a constant indicating which lanes to store, and is handled entirely in Tablegen. 
(And similarly for vld2q/vst2q.) This is asymmetric, but it was the easiest way to do each one. Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D68700 Files: llvm/include/llvm/IR/IntrinsicsARM.td llvm/lib/Target/ARM/ARMISelDAGToDAG.cpp llvm/lib/Target/ARM/ARMInstrMVE.td llvm/test/CodeGen/Thumb2/mve-intrinsics/vld24.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D68700.224029.patch Type: text/x-patch Size: 11763 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 06:03:33 2019 From: llvm-commits at lists.llvm.org (Simon Tatham via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 13:03:33 +0000 (UTC) Subject: [PATCH] D67162: [InstCombine] Known-bits optimization for ARM MVE VADC. In-Reply-To: References: Message-ID: simon_tatham updated this revision to Diff 224033. simon_tatham added a comment. Rebased to current master. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67162/new/ https://reviews.llvm.org/D67162 Files: llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp llvm/test/CodeGen/Thumb2/mve-intrinsics/vadc-multiple.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D67162.224033.patch Type: text/x-patch Size: 5437 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 06:03:44 2019 From: llvm-commits at lists.llvm.org (Simon Tatham via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 13:03:44 +0000 (UTC) Subject: [PATCH] D67438: [InstCombine] Range metadata for ARM MVE VMIN/VMAX. In-Reply-To: References: Message-ID: simon_tatham updated this revision to Diff 224034. simon_tatham added a comment. Rebased to current master. 
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67438/new/ https://reviews.llvm.org/D67438 Files: llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp llvm/test/CodeGen/Thumb2/mve-intrinsics/vmin-instcombine.ll Index: llvm/test/CodeGen/Thumb2/mve-intrinsics/vmin-instcombine.ll =================================================================== --- /dev/null +++ llvm/test/CodeGen/Thumb2/mve-intrinsics/vmin-instcombine.ll @@ -0,0 +1,38 @@ +; RUN: opt -instcombine -S -o - %s | FileCheck --check-prefix=IR %s +; RUN: opt -instcombine %s | llc -mtriple=thumbv8.1m.main -mattr=+mve.fp -verify-machineinstrs -o - | FileCheck --check-prefix=ASM %s + +define arm_aapcs_vfpcc i32 @test_vmaxvq_u8(i32 %a, <16 x i8> %b) { +; ASM-LABEL: test_vmaxvq_u8: +; ASM: @ %bb.0: @ %entry +; ASM-NEXT: vmaxv.u8 r0, q0 +; ASM-NEXT: bx lr +entry: + %0 = tail call i32 @llvm.arm.mve.maxv.u.v16i8(i32 %a, <16 x i8> %b) + %1 = trunc i32 %0 to i8 + %2 = zext i8 %1 to i32 + ret i32 %2 +} + +define arm_aapcs_vfpcc i16 @test_vminvq_s16(i32 %a, <8 x i16> %b) { +; ASM-LABEL: test_vminvq_s16: +; ASM: @ %bb.0: @ %entry +; ASM-NEXT: vminv.s16 r0, q0 +; Ideally the next line should be ASM-NEXT, enforcing that there's +; no sxth instruction between the vminv and the return. But in fact +; there is, because signed range metadata doesn't generate an +; AssertSext. +; ASM: bx lr +entry: + %0 = tail call i32 @llvm.arm.mve.minv.s.v8i16(i32 %a, <8 x i16> %b) + %1 = trunc i32 %0 to i16 + %2 = sext i16 %1 to i32 + ret i16 %1 +} + +declare i32 @llvm.arm.mve.maxv.u.v16i8(i32, <16 x i8>) +declare i32 @llvm.arm.mve.minv.s.v8i16(i32, <8 x i16>) + +; IR: tail call i32 @llvm.arm.mve.maxv.u.v16i8(i32 %a, <16 x i8> %b), !range !0 +; IR: tail call i32 @llvm.arm.mve.minv.s.v8i16(i32 %a, <8 x i16> %b), !range ! 
+; IR: !0 = !{i32 0, i32 256} +; IR: !1 = !{i32 -32768, i32 32768} Index: llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp =================================================================== --- llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp +++ llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp @@ -3307,6 +3307,39 @@ } break; } + case Intrinsic::arm_mve_minv_u: + case Intrinsic::arm_mve_minv_s: + case Intrinsic::arm_mve_maxv_u: + case Intrinsic::arm_mve_maxv_s: { + unsigned ScalarWidth = II->getArgOperand(1) + ->getType() + ->getVectorElementType() + ->getScalarSizeInBits(); + + bool Modified = false; + + KnownBits ScalarKnown(32); + if (SimplifyDemandedBits(II, 0, APInt::getLowBitsSet(32, ScalarWidth), + ScalarKnown, 0)) + Modified = true; + if (ScalarWidth < 32 && !II->getMetadata(LLVMContext::MD_range)) { + uint32_t Lo = 0, Hi = (uint32_t)1 << ScalarWidth; + if (IID == Intrinsic::arm_mve_minv_s || + IID == Intrinsic::arm_mve_maxv_s) { + uint32_t Offset = Hi >> 1; + Lo -= Offset; + Hi -= Offset; + } + Type *IntTy32 = Type::getInt32Ty(II->getContext()); + Metadata *M[] = {ConstantAsMetadata::get(ConstantInt::get(IntTy32, Lo)), + ConstantAsMetadata::get(ConstantInt::get(IntTy32, Hi))}; + II->setMetadata(LLVMContext::MD_range, MDNode::get(II->getContext(), M)); + Modified = true; + } + if (Modified) + return II; + break; + } case Intrinsic::arm_mve_vadc: case Intrinsic::arm_mve_vadc_predicated: { unsigned CarryOp = -------------- next part -------------- A non-text attachment was scrubbed... Name: D67438.224034.patch Type: text/x-patch Size: 3315 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 06:08:19 2019 From: llvm-commits at lists.llvm.org (Simon Tatham via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 13:08:19 +0000 (UTC) Subject: [PATCH] D68406: [update_cc_test_checks] Support 'clang | opt | FileCheck' In-Reply-To: References: Message-ID: simon_tatham updated this revision to Diff 224035. 
simon_tatham added a comment. Added an option as requested to point at the `opt` binary. Unlike the existing `--clang`, I've made it only lazily give an error: in the common case where the test file doesn't try to call `opt` in the first place, it shouldn't be necessary to have it available. Also, I've added a fix to the `add_ir_checks` call, which stopped working recently due to r373912. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68406/new/ https://reviews.llvm.org/D68406 Files: llvm/utils/update_cc_test_checks.py -------------- next part -------------- A non-text attachment was scrubbed... Name: D68406.224035.patch Type: text/x-patch Size: 5006 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 06:12:21 2019 From: llvm-commits at lists.llvm.org (Simon Atanasyan via llvm-commits) Date: Wed, 09 Oct 2019 13:12:21 -0000 Subject: [llvm] r374164 - [mips] Split expandLoadImmReal into multiple methods. NFC Message-ID: <20191009131221.534C38EA9E@lists.llvm.org> Author: atanasyan Date: Wed Oct 9 06:12:21 2019 New Revision: 374164 URL: http://llvm.org/viewvc/llvm-project?rev=374164&view=rev Log: [mips] Split expandLoadImmReal into multiple methods. NFC The `expandLoadImmReal` handles four different and almost non-overlapping cases: loading a "single" float immediate into a GPR, loading a "single" float immediate into a FPR, and the same couple for a "double" float immediate. It's better to move each `else if` branch into separate methods. 
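The two static helpers this commit factors out can be illustrated with a small standalone C++ sketch. It is not the LLVM code: `hi32`, `doubleToBits`, `bitsToDouble` and `floatToBits` are illustrative stand-ins for LLVM's `Hi_32`, `BitsToDouble` and `FloatToBits`, and the `APFloat` integer construction in the patch is assumed to behave like a plain integer-to-double conversion (which is exact here, since any value with an all-zero exponent field is below 2^52).

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Illustrative stand-ins for LLVM's MathExtras helpers.
inline uint32_t hi32(uint64_t V) { return static_cast<uint32_t>(V >> 32); }

inline uint64_t doubleToBits(double D) {
  uint64_t Bits;
  std::memcpy(&Bits, &D, sizeof(Bits));
  return Bits;
}

inline double bitsToDouble(uint64_t Bits) {
  double D;
  std::memcpy(&D, &Bits, sizeof(D));
  return D;
}

inline uint32_t floatToBits(float F) {
  uint32_t Bits;
  std::memcpy(&Bits, &F, sizeof(Bits));
  return Bits;
}

// Sketch of convertIntToDoubleImm: if the IEEE-754 exponent field is all
// zero, treat the operand as a plain integer and convert it (e.g. 1 -> 1.0).
inline uint64_t intToDoubleImm(uint64_t Imm) {
  if ((hi32(Imm) & 0x7ff00000u) == 0)
    Imm = doubleToBits(static_cast<double>(Imm));
  return Imm;
}

// Sketch of the double-to-single helper: narrow the double to a float and
// return the float's 32-bit pattern.
inline uint32_t doubleImmToSingleImm(uint64_t Imm) {
  return floatToBits(static_cast<float>(bitsToDouble(Imm)));
}
```

With these definitions, `intToDoubleImm(1)` yields the bit pattern of 1.0 (0x3ff0000000000000), and passing that result to `doubleImmToSingleImm` yields the single-precision pattern 0x3f800000. An operand that already has a non-zero exponent field passes through `intToDoubleImm` unchanged.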
Modified: llvm/trunk/lib/Target/Mips/AsmParser/MipsAsmParser.cpp Modified: llvm/trunk/lib/Target/Mips/AsmParser/MipsAsmParser.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/Mips/AsmParser/MipsAsmParser.cpp?rev=374164&r1=374163&r2=374164&view=diff ============================================================================== --- llvm/trunk/lib/Target/Mips/AsmParser/MipsAsmParser.cpp (original) +++ llvm/trunk/lib/Target/Mips/AsmParser/MipsAsmParser.cpp Wed Oct 9 06:12:21 2019 @@ -234,9 +234,14 @@ class MipsAsmParser : public MCTargetAsm bool expandLoadImm(MCInst &Inst, bool Is32BitImm, SMLoc IDLoc, MCStreamer &Out, const MCSubtargetInfo *STI); - bool expandLoadImmReal(MCInst &Inst, bool IsSingle, bool IsGPR, bool Is64FPU, - SMLoc IDLoc, MCStreamer &Out, - const MCSubtargetInfo *STI); + bool expandLoadSingleImmToGPR(MCInst &Inst, SMLoc IDLoc, MCStreamer &Out, + const MCSubtargetInfo *STI); + bool expandLoadSingleImmToFPR(MCInst &Inst, SMLoc IDLoc, MCStreamer &Out, + const MCSubtargetInfo *STI); + bool expandLoadDoubleImmToGPR(MCInst &Inst, SMLoc IDLoc, MCStreamer &Out, + const MCSubtargetInfo *STI); + bool expandLoadDoubleImmToFPR(MCInst &Inst, bool Is64FPU, SMLoc IDLoc, + MCStreamer &Out, const MCSubtargetInfo *STI); bool expandLoadAddress(unsigned DstReg, unsigned BaseReg, const MCOperand &Offset, bool Is32BitAddress, @@ -2455,25 +2460,21 @@ MipsAsmParser::tryExpandInstruction(MCIn : MER_Success; case Mips::LoadImmSingleGPR: - return expandLoadImmReal(Inst, true, true, false, IDLoc, Out, STI) - ? MER_Fail - : MER_Success; + return expandLoadSingleImmToGPR(Inst, IDLoc, Out, STI) ? MER_Fail + : MER_Success; case Mips::LoadImmSingleFGR: - return expandLoadImmReal(Inst, true, false, false, IDLoc, Out, STI) - ? MER_Fail - : MER_Success; + return expandLoadSingleImmToFPR(Inst, IDLoc, Out, STI) ? MER_Fail + : MER_Success; case Mips::LoadImmDoubleGPR: - return expandLoadImmReal(Inst, false, true, false, IDLoc, Out, STI) - ? 
MER_Fail - : MER_Success; + return expandLoadDoubleImmToGPR(Inst, IDLoc, Out, STI) ? MER_Fail + : MER_Success; case Mips::LoadImmDoubleFGR: - return expandLoadImmReal(Inst, false, false, true, IDLoc, Out, STI) - ? MER_Fail - : MER_Success; + return expandLoadDoubleImmToFPR(Inst, true, IDLoc, Out, STI) ? MER_Fail + : MER_Success; case Mips::LoadImmDoubleFGR_32: - return expandLoadImmReal(Inst, false, false, false, IDLoc, Out, STI) - ? MER_Fail - : MER_Success; + return expandLoadDoubleImmToFPR(Inst, false, IDLoc, Out, STI) ? MER_Fail + : MER_Success; + case Mips::Ulh: return expandUlh(Inst, true, IDLoc, Out, STI) ? MER_Fail : MER_Success; case Mips::Ulhu: @@ -3293,11 +3294,27 @@ bool MipsAsmParser::emitPartialAddress(M return false; } -bool MipsAsmParser::expandLoadImmReal(MCInst &Inst, bool IsSingle, bool IsGPR, - bool Is64FPU, SMLoc IDLoc, - MCStreamer &Out, - const MCSubtargetInfo *STI) { - MipsTargetStreamer &TOut = getTargetStreamer(); +static uint64_t convertIntToDoubleImm(uint64_t ImmOp64) { + // If ImmOp64 is AsmToken::Integer type (all bits set to zero in the + // exponent field), convert it to double (e.g. 1 to 1.0) + if ((Hi_32(ImmOp64) & 0x7ff00000) == 0) { + APFloat RealVal(APFloat::IEEEdouble(), ImmOp64); + ImmOp64 = RealVal.bitcastToAPInt().getZExtValue(); + } + return ImmOp64; +} + +static uint32_t covertDoubleImmToSingleImm(uint64_t ImmOp64) { + // Conversion of a double in an uint64_t to a float in a uint32_t, + // retaining the bit pattern of a float. 
+ double DoubleImm = BitsToDouble(ImmOp64); + float TmpFloat = static_cast<float>(DoubleImm); + return FloatToBits(TmpFloat); +} + +bool MipsAsmParser::expandLoadSingleImmToGPR(MCInst &Inst, SMLoc IDLoc, + MCStreamer &Out, + const MCSubtargetInfo *STI) { assert(Inst.getNumOperands() == 2 && "Invalid operand count"); assert(Inst.getOperand(0).isReg() && Inst.getOperand(1).isImm() && "Invalid instruction operand."); @@ -3305,166 +3322,200 @@ bool MipsAsmParser::expandLoadImmReal(MC unsigned FirstReg = Inst.getOperand(0).getReg(); uint64_t ImmOp64 = Inst.getOperand(1).getImm(); - uint32_t HiImmOp64 = (ImmOp64 & 0xffffffff00000000) >> 32; - // If ImmOp64 is AsmToken::Integer type (all bits set to zero in the - // exponent field), convert it to double (e.g. 1 to 1.0) - if ((HiImmOp64 & 0x7ff00000) == 0) { - APFloat RealVal(APFloat::IEEEdouble(), ImmOp64); - ImmOp64 = RealVal.bitcastToAPInt().getZExtValue(); - } + ImmOp64 = convertIntToDoubleImm(ImmOp64); - uint32_t LoImmOp64 = ImmOp64 & 0xffffffff; - HiImmOp64 = (ImmOp64 & 0xffffffff00000000) >> 32; + uint32_t ImmOp32 = covertDoubleImmToSingleImm(ImmOp64); - if (IsSingle) { - // Conversion of a double in an uint64_t to a float in a uint32_t, - // retaining the bit pattern of a float.
- uint32_t ImmOp32; - double doubleImm = BitsToDouble(ImmOp64); - float tmp_float = static_cast<float>(doubleImm); - ImmOp32 = FloatToBits(tmp_float); - - if (IsGPR) { - if (loadImmediate(ImmOp32, FirstReg, Mips::NoRegister, true, true, IDLoc, - Out, STI)) - return true; - return false; - } else { - unsigned ATReg = getATReg(IDLoc); - if (!ATReg) - return true; - if (LoImmOp64 == 0) { - if (loadImmediate(ImmOp32, ATReg, Mips::NoRegister, true, true, IDLoc, - Out, STI)) - return true; - TOut.emitRR(Mips::MTC1, FirstReg, ATReg, IDLoc, STI); - return false; - } + return loadImmediate(ImmOp32, FirstReg, Mips::NoRegister, true, true, IDLoc, + Out, STI); +} - MCSection *CS = getStreamer().getCurrentSectionOnly(); - // FIXME: Enhance this expansion to use the .lit4 & .lit8 sections - // where appropriate. - MCSection *ReadOnlySection = getContext().getELFSection( - ".rodata", ELF::SHT_PROGBITS, ELF::SHF_ALLOC); +bool MipsAsmParser::expandLoadSingleImmToFPR(MCInst &Inst, SMLoc IDLoc, + MCStreamer &Out, + const MCSubtargetInfo *STI) { + MipsTargetStreamer &TOut = getTargetStreamer(); + assert(Inst.getNumOperands() == 2 && "Invalid operand count"); + assert(Inst.getOperand(0).isReg() && Inst.getOperand(1).isImm() && + "Invalid instruction operand."); - MCSymbol *Sym = getContext().createTempSymbol(); - const MCExpr *LoSym = - MCSymbolRefExpr::create(Sym, MCSymbolRefExpr::VK_None, getContext()); - const MipsMCExpr *LoExpr = - MipsMCExpr::create(MipsMCExpr::MEK_LO, LoSym, getContext()); + unsigned FirstReg = Inst.getOperand(0).getReg(); + uint64_t ImmOp64 = Inst.getOperand(1).getImm(); - getStreamer().SwitchSection(ReadOnlySection); - getStreamer().EmitLabel(Sym, IDLoc); - getStreamer().EmitIntValue(ImmOp32, 4); - getStreamer().SwitchSection(CS); + ImmOp64 = convertIntToDoubleImm(ImmOp64); - if(emitPartialAddress(TOut, IDLoc, Sym)) - return true; - TOut.emitRRX(Mips::LWC1, FirstReg, ATReg, - MCOperand::createExpr(LoExpr), IDLoc, STI); - } + uint32_t ImmOp32 = 
covertDoubleImmToSingleImm(ImmOp64); + + unsigned ATReg = getATReg(IDLoc); + if (!ATReg) + return true; + + if (Lo_32(ImmOp64) == 0) { + if (loadImmediate(ImmOp32, ATReg, Mips::NoRegister, true, true, IDLoc, Out, + STI)) + return true; + TOut.emitRR(Mips::MTC1, FirstReg, ATReg, IDLoc, STI); return false; } - // if(!IsSingle) + MCSection *CS = getStreamer().getCurrentSectionOnly(); + // FIXME: Enhance this expansion to use the .lit4 & .lit8 sections + // where appropriate. + MCSection *ReadOnlySection = + getContext().getELFSection(".rodata", ELF::SHT_PROGBITS, ELF::SHF_ALLOC); + + MCSymbol *Sym = getContext().createTempSymbol(); + const MCExpr *LoSym = + MCSymbolRefExpr::create(Sym, MCSymbolRefExpr::VK_None, getContext()); + const MipsMCExpr *LoExpr = + MipsMCExpr::create(MipsMCExpr::MEK_LO, LoSym, getContext()); + + getStreamer().SwitchSection(ReadOnlySection); + getStreamer().EmitLabel(Sym, IDLoc); + getStreamer().EmitIntValue(ImmOp32, 4); + getStreamer().SwitchSection(CS); + + if (emitPartialAddress(TOut, IDLoc, Sym)) + return true; + TOut.emitRRX(Mips::LWC1, FirstReg, ATReg, MCOperand::createExpr(LoExpr), + IDLoc, STI); + return false; +} + +bool MipsAsmParser::expandLoadDoubleImmToGPR(MCInst &Inst, SMLoc IDLoc, + MCStreamer &Out, + const MCSubtargetInfo *STI) { + MipsTargetStreamer &TOut = getTargetStreamer(); + assert(Inst.getNumOperands() == 2 && "Invalid operand count"); + assert(Inst.getOperand(0).isReg() && Inst.getOperand(1).isImm() && + "Invalid instruction operand."); + + unsigned FirstReg = Inst.getOperand(0).getReg(); + uint64_t ImmOp64 = Inst.getOperand(1).getImm(); + + ImmOp64 = convertIntToDoubleImm(ImmOp64); + + uint32_t LoImmOp64 = Lo_32(ImmOp64); + uint32_t HiImmOp64 = Hi_32(ImmOp64); + unsigned ATReg = getATReg(IDLoc); if (!ATReg) return true; - if (IsGPR) { - if (LoImmOp64 == 0) { - if(isABI_N32() || isABI_N64()) { - if (loadImmediate(HiImmOp64, FirstReg, Mips::NoRegister, false, true, - IDLoc, Out, STI)) - return true; - return false; - } 
else { - if (loadImmediate(HiImmOp64, FirstReg, Mips::NoRegister, true, true, + if (LoImmOp64 == 0) { + if (isABI_N32() || isABI_N64()) { + if (loadImmediate(HiImmOp64, FirstReg, Mips::NoRegister, false, true, IDLoc, Out, STI)) - return true; + return true; + } else { + if (loadImmediate(HiImmOp64, FirstReg, Mips::NoRegister, true, true, + IDLoc, Out, STI)) + return true; - if (loadImmediate(0, nextReg(FirstReg), Mips::NoRegister, true, true, + if (loadImmediate(0, nextReg(FirstReg), Mips::NoRegister, true, true, IDLoc, Out, STI)) - return true; - return false; - } + return true; } + return false; + } - MCSection *CS = getStreamer().getCurrentSectionOnly(); - MCSection *ReadOnlySection = getContext().getELFSection( - ".rodata", ELF::SHT_PROGBITS, ELF::SHF_ALLOC); + MCSection *CS = getStreamer().getCurrentSectionOnly(); + MCSection *ReadOnlySection = + getContext().getELFSection(".rodata", ELF::SHT_PROGBITS, ELF::SHF_ALLOC); + + MCSymbol *Sym = getContext().createTempSymbol(); + const MCExpr *LoSym = + MCSymbolRefExpr::create(Sym, MCSymbolRefExpr::VK_None, getContext()); + const MipsMCExpr *LoExpr = + MipsMCExpr::create(MipsMCExpr::MEK_LO, LoSym, getContext()); - MCSymbol *Sym = getContext().createTempSymbol(); - const MCExpr *LoSym = - MCSymbolRefExpr::create(Sym, MCSymbolRefExpr::VK_None, getContext()); - const MipsMCExpr *LoExpr = - MipsMCExpr::create(MipsMCExpr::MEK_LO, LoSym, getContext()); + getStreamer().SwitchSection(ReadOnlySection); + getStreamer().EmitLabel(Sym, IDLoc); + getStreamer().EmitIntValue(HiImmOp64, 4); + getStreamer().EmitIntValue(LoImmOp64, 4); + getStreamer().SwitchSection(CS); - getStreamer().SwitchSection(ReadOnlySection); - getStreamer().EmitLabel(Sym, IDLoc); - getStreamer().EmitIntValue(HiImmOp64, 4); - getStreamer().EmitIntValue(LoImmOp64, 4); - getStreamer().SwitchSection(CS); + if (emitPartialAddress(TOut, IDLoc, Sym)) + return true; - if(emitPartialAddress(TOut, IDLoc, Sym)) - return true; - if(isABI_N64()) - 
TOut.emitRRX(Mips::DADDiu, ATReg, ATReg, - MCOperand::createExpr(LoExpr), IDLoc, STI); - else - TOut.emitRRX(Mips::ADDiu, ATReg, ATReg, - MCOperand::createExpr(LoExpr), IDLoc, STI); + if (isABI_N64()) + TOut.emitRRX(Mips::DADDiu, ATReg, ATReg, MCOperand::createExpr(LoExpr), + IDLoc, STI); + else + TOut.emitRRX(Mips::ADDiu, ATReg, ATReg, MCOperand::createExpr(LoExpr), + IDLoc, STI); - if(isABI_N32() || isABI_N64()) - TOut.emitRRI(Mips::LD, FirstReg, ATReg, 0, IDLoc, STI); - else { - TOut.emitRRI(Mips::LW, FirstReg, ATReg, 0, IDLoc, STI); - TOut.emitRRI(Mips::LW, nextReg(FirstReg), ATReg, 4, IDLoc, STI); - } - return false; - } else { // if(!IsGPR && !IsSingle) - if ((LoImmOp64 == 0) && - !((HiImmOp64 & 0xffff0000) && (HiImmOp64 & 0x0000ffff))) { - // FIXME: In the case where the constant is zero, we can load the - // register directly from the zero register. - if (loadImmediate(HiImmOp64, ATReg, Mips::NoRegister, true, true, IDLoc, - Out, STI)) - return true; - if (isABI_N32() || isABI_N64()) - TOut.emitRR(Mips::DMTC1, FirstReg, ATReg, IDLoc, STI); - else if (hasMips32r2()) { - TOut.emitRR(Mips::MTC1, FirstReg, Mips::ZERO, IDLoc, STI); - TOut.emitRRR(Mips::MTHC1_D32, FirstReg, FirstReg, ATReg, IDLoc, STI); - } else { - TOut.emitRR(Mips::MTC1, nextReg(FirstReg), ATReg, IDLoc, STI); - TOut.emitRR(Mips::MTC1, FirstReg, Mips::ZERO, IDLoc, STI); - } - return false; - } + if (isABI_N32() || isABI_N64()) + TOut.emitRRI(Mips::LD, FirstReg, ATReg, 0, IDLoc, STI); + else { + TOut.emitRRI(Mips::LW, FirstReg, ATReg, 0, IDLoc, STI); + TOut.emitRRI(Mips::LW, nextReg(FirstReg), ATReg, 4, IDLoc, STI); + } + return false; +} - MCSection *CS = getStreamer().getCurrentSectionOnly(); - // FIXME: Enhance this expansion to use the .lit4 & .lit8 sections - // where appropriate. 
- MCSection *ReadOnlySection = getContext().getELFSection( - ".rodata", ELF::SHT_PROGBITS, ELF::SHF_ALLOC); +bool MipsAsmParser::expandLoadDoubleImmToFPR(MCInst &Inst, bool Is64FPU, + SMLoc IDLoc, MCStreamer &Out, + const MCSubtargetInfo *STI) { + MipsTargetStreamer &TOut = getTargetStreamer(); + assert(Inst.getNumOperands() == 2 && "Invalid operand count"); + assert(Inst.getOperand(0).isReg() && Inst.getOperand(1).isImm() && + "Invalid instruction operand."); - MCSymbol *Sym = getContext().createTempSymbol(); - const MCExpr *LoSym = - MCSymbolRefExpr::create(Sym, MCSymbolRefExpr::VK_None, getContext()); - const MipsMCExpr *LoExpr = - MipsMCExpr::create(MipsMCExpr::MEK_LO, LoSym, getContext()); + unsigned FirstReg = Inst.getOperand(0).getReg(); + uint64_t ImmOp64 = Inst.getOperand(1).getImm(); + + ImmOp64 = convertIntToDoubleImm(ImmOp64); - getStreamer().SwitchSection(ReadOnlySection); - getStreamer().EmitLabel(Sym, IDLoc); - getStreamer().EmitIntValue(HiImmOp64, 4); - getStreamer().EmitIntValue(LoImmOp64, 4); - getStreamer().SwitchSection(CS); + uint32_t LoImmOp64 = Lo_32(ImmOp64); + uint32_t HiImmOp64 = Hi_32(ImmOp64); + + unsigned ATReg = getATReg(IDLoc); + if (!ATReg) + return true; - if(emitPartialAddress(TOut, IDLoc, Sym)) + if ((LoImmOp64 == 0) && + !((HiImmOp64 & 0xffff0000) && (HiImmOp64 & 0x0000ffff))) { + // FIXME: In the case where the constant is zero, we can load the + // register directly from the zero register. + if (loadImmediate(HiImmOp64, ATReg, Mips::NoRegister, true, true, IDLoc, + Out, STI)) return true; - TOut.emitRRX(Is64FPU ? 
Mips::LDC164 : Mips::LDC1, FirstReg, ATReg, - MCOperand::createExpr(LoExpr), IDLoc, STI); + if (isABI_N32() || isABI_N64()) + TOut.emitRR(Mips::DMTC1, FirstReg, ATReg, IDLoc, STI); + else if (hasMips32r2()) { + TOut.emitRR(Mips::MTC1, FirstReg, Mips::ZERO, IDLoc, STI); + TOut.emitRRR(Mips::MTHC1_D32, FirstReg, FirstReg, ATReg, IDLoc, STI); + } else { + TOut.emitRR(Mips::MTC1, nextReg(FirstReg), ATReg, IDLoc, STI); + TOut.emitRR(Mips::MTC1, FirstReg, Mips::ZERO, IDLoc, STI); + } + return false; } + + MCSection *CS = getStreamer().getCurrentSectionOnly(); + // FIXME: Enhance this expansion to use the .lit4 & .lit8 sections + // where appropriate. + MCSection *ReadOnlySection = + getContext().getELFSection(".rodata", ELF::SHT_PROGBITS, ELF::SHF_ALLOC); + + MCSymbol *Sym = getContext().createTempSymbol(); + const MCExpr *LoSym = + MCSymbolRefExpr::create(Sym, MCSymbolRefExpr::VK_None, getContext()); + const MipsMCExpr *LoExpr = + MipsMCExpr::create(MipsMCExpr::MEK_LO, LoSym, getContext()); + + getStreamer().SwitchSection(ReadOnlySection); + getStreamer().EmitLabel(Sym, IDLoc); + getStreamer().EmitIntValue(HiImmOp64, 4); + getStreamer().EmitIntValue(LoImmOp64, 4); + getStreamer().SwitchSection(CS); + + if (emitPartialAddress(TOut, IDLoc, Sym)) + return true; + + TOut.emitRRX(Is64FPU ? Mips::LDC164 : Mips::LDC1, FirstReg, ATReg, + MCOperand::createExpr(LoExpr), IDLoc, STI); + return false; } From llvm-commits at lists.llvm.org Wed Oct 9 06:12:27 2019 From: llvm-commits at lists.llvm.org (Simon Atanasyan via llvm-commits) Date: Wed, 09 Oct 2019 13:12:27 -0000 Subject: [llvm] r374165 - [mips] Rename local variable. NFC Message-ID: <20191009131227.6C62090C14@lists.llvm.org> Author: atanasyan Date: Wed Oct 9 06:12:27 2019 New Revision: 374165 URL: http://llvm.org/viewvc/llvm-project?rev=374165&view=rev Log: [mips] Rename local variable. 
NFC Modified: llvm/trunk/lib/Target/Mips/AsmParser/MipsAsmParser.cpp Modified: llvm/trunk/lib/Target/Mips/AsmParser/MipsAsmParser.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/Mips/AsmParser/MipsAsmParser.cpp?rev=374165&r1=374164&r2=374165&view=diff ============================================================================== --- llvm/trunk/lib/Target/Mips/AsmParser/MipsAsmParser.cpp (original) +++ llvm/trunk/lib/Target/Mips/AsmParser/MipsAsmParser.cpp Wed Oct 9 06:12:27 2019 @@ -3345,15 +3345,15 @@ bool MipsAsmParser::expandLoadSingleImmT uint32_t ImmOp32 = covertDoubleImmToSingleImm(ImmOp64); - unsigned ATReg = getATReg(IDLoc); - if (!ATReg) + unsigned TmpReg = getATReg(IDLoc); + if (!TmpReg) return true; if (Lo_32(ImmOp64) == 0) { - if (loadImmediate(ImmOp32, ATReg, Mips::NoRegister, true, true, IDLoc, Out, + if (loadImmediate(ImmOp32, TmpReg, Mips::NoRegister, true, true, IDLoc, Out, STI)) return true; - TOut.emitRR(Mips::MTC1, FirstReg, ATReg, IDLoc, STI); + TOut.emitRR(Mips::MTC1, FirstReg, TmpReg, IDLoc, STI); return false; } @@ -3376,7 +3376,7 @@ bool MipsAsmParser::expandLoadSingleImmT if (emitPartialAddress(TOut, IDLoc, Sym)) return true; - TOut.emitRRX(Mips::LWC1, FirstReg, ATReg, MCOperand::createExpr(LoExpr), + TOut.emitRRX(Mips::LWC1, FirstReg, TmpReg, MCOperand::createExpr(LoExpr), IDLoc, STI); return false; } @@ -3397,8 +3397,8 @@ bool MipsAsmParser::expandLoadDoubleImmT uint32_t LoImmOp64 = Lo_32(ImmOp64); uint32_t HiImmOp64 = Hi_32(ImmOp64); - unsigned ATReg = getATReg(IDLoc); - if (!ATReg) + unsigned TmpReg = getATReg(IDLoc); + if (!TmpReg) return true; if (LoImmOp64 == 0) { @@ -3438,17 +3438,17 @@ bool MipsAsmParser::expandLoadDoubleImmT return true; if (isABI_N64()) - TOut.emitRRX(Mips::DADDiu, ATReg, ATReg, MCOperand::createExpr(LoExpr), + TOut.emitRRX(Mips::DADDiu, TmpReg, TmpReg, MCOperand::createExpr(LoExpr), IDLoc, STI); else - TOut.emitRRX(Mips::ADDiu, ATReg, ATReg, MCOperand::createExpr(LoExpr), + 
TOut.emitRRX(Mips::ADDiu, TmpReg, TmpReg, MCOperand::createExpr(LoExpr), IDLoc, STI); if (isABI_N32() || isABI_N64()) - TOut.emitRRI(Mips::LD, FirstReg, ATReg, 0, IDLoc, STI); + TOut.emitRRI(Mips::LD, FirstReg, TmpReg, 0, IDLoc, STI); else { - TOut.emitRRI(Mips::LW, FirstReg, ATReg, 0, IDLoc, STI); - TOut.emitRRI(Mips::LW, nextReg(FirstReg), ATReg, 4, IDLoc, STI); + TOut.emitRRI(Mips::LW, FirstReg, TmpReg, 0, IDLoc, STI); + TOut.emitRRI(Mips::LW, nextReg(FirstReg), TmpReg, 4, IDLoc, STI); } return false; } @@ -3469,24 +3469,24 @@ bool MipsAsmParser::expandLoadDoubleImmT uint32_t LoImmOp64 = Lo_32(ImmOp64); uint32_t HiImmOp64 = Hi_32(ImmOp64); - unsigned ATReg = getATReg(IDLoc); - if (!ATReg) + unsigned TmpReg = getATReg(IDLoc); + if (!TmpReg) return true; if ((LoImmOp64 == 0) && !((HiImmOp64 & 0xffff0000) && (HiImmOp64 & 0x0000ffff))) { // FIXME: In the case where the constant is zero, we can load the // register directly from the zero register. - if (loadImmediate(HiImmOp64, ATReg, Mips::NoRegister, true, true, IDLoc, + if (loadImmediate(HiImmOp64, TmpReg, Mips::NoRegister, true, true, IDLoc, Out, STI)) return true; if (isABI_N32() || isABI_N64()) - TOut.emitRR(Mips::DMTC1, FirstReg, ATReg, IDLoc, STI); + TOut.emitRR(Mips::DMTC1, FirstReg, TmpReg, IDLoc, STI); else if (hasMips32r2()) { TOut.emitRR(Mips::MTC1, FirstReg, Mips::ZERO, IDLoc, STI); - TOut.emitRRR(Mips::MTHC1_D32, FirstReg, FirstReg, ATReg, IDLoc, STI); + TOut.emitRRR(Mips::MTHC1_D32, FirstReg, FirstReg, TmpReg, IDLoc, STI); } else { - TOut.emitRR(Mips::MTC1, nextReg(FirstReg), ATReg, IDLoc, STI); + TOut.emitRR(Mips::MTC1, nextReg(FirstReg), TmpReg, IDLoc, STI); TOut.emitRR(Mips::MTC1, FirstReg, Mips::ZERO, IDLoc, STI); } return false; @@ -3513,7 +3513,7 @@ bool MipsAsmParser::expandLoadDoubleImmT if (emitPartialAddress(TOut, IDLoc, Sym)) return true; - TOut.emitRRX(Is64FPU ? Mips::LDC164 : Mips::LDC1, FirstReg, ATReg, + TOut.emitRRX(Is64FPU ? 
Mips::LDC164 : Mips::LDC1, FirstReg, TmpReg,
 MCOperand::createExpr(LoExpr), IDLoc, STI);
 return false;


From llvm-commits at lists.llvm.org Wed Oct 9 06:19:41 2019
From: llvm-commits at lists.llvm.org (Sjoerd Meijer via llvm-commits)
Date: Wed, 09 Oct 2019 13:19:41 -0000
Subject: [llvm] r374166 - [LV] Emitting SCEV checks with OptForSize
Message-ID: <20191009131941.41DA390C14@lists.llvm.org>

Author: sjoerdmeijer
Date: Wed Oct 9 06:19:41 2019
New Revision: 374166

URL: http://llvm.org/viewvc/llvm-project?rev=374166&view=rev
Log:
[LV] Emitting SCEV checks with OptForSize

When optimising for size and SCEV runtime checks need to be emitted to check
overflow behaviour, the loop vectorizer can run into this assert:

  LoopVectorize.cpp:2699: void llvm::InnerLoopVectorizer::emitSCEVChecks(
  llvm::Loop *, llvm::BasicBlock *): Assertion `!BB->getParent()->hasOptSize()
  && "Cannot SCEV check stride or overflow when opt

We should not generate predicates while optimising for size because code will
be generated for predicates such as these SCEV overflow runtime checks.

This should fix PR43371.

Differential Revision: https://reviews.llvm.org/D68082

Modified:
    llvm/trunk/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp
    llvm/trunk/test/Transforms/LoopVectorize/optsize.ll

Modified: llvm/trunk/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp?rev=374166&r1=374165&r2=374166&view=diff
==============================================================================
--- llvm/trunk/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp (original)
+++ llvm/trunk/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp Wed Oct 9 06:19:41 2019
@@ -409,7 +409,8 @@ int LoopVectorizationLegality::isConsecu
 const ValueToValueMap &Strides = getSymbolicStrides() ?
*getSymbolicStrides() : ValueToValueMap(); - int Stride = getPtrStride(PSE, Ptr, TheLoop, Strides, true, false); + bool CanAddPredicate = !TheLoop->getHeader()->getParent()->hasOptSize(); + int Stride = getPtrStride(PSE, Ptr, TheLoop, Strides, CanAddPredicate, false); if (Stride == 1 || Stride == -1) return Stride; return 0; Modified: llvm/trunk/test/Transforms/LoopVectorize/optsize.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/LoopVectorize/optsize.ll?rev=374166&r1=374165&r2=374166&view=diff ============================================================================== --- llvm/trunk/test/Transforms/LoopVectorize/optsize.ll (original) +++ llvm/trunk/test/Transforms/LoopVectorize/optsize.ll Wed Oct 9 06:19:41 2019 @@ -84,6 +84,43 @@ for.end: ret i32 0 } +; PR43371: don't run into an assert due to emitting SCEV runtime checks +; with OptForSize. +; + at cm_array = external global [2592 x i16], align 1 + +define void @pr43371() optsize { +; +; CHECK-LABEL: @pr43371 +; CHECK-NOT: vector.scevcheck +; +; We do not want to generate SCEV predicates when optimising for size, because +; that will lead to extra code generation such as the SCEV overflow runtime +; checks. 
Not generating SCEV predicates can still result in vectorisation as +; the non-consecutive loads/stores can be scalarized: +; +; CHECK: vector.body: +; CHECK: store i16 0, i16* %{{.*}}, align 1 +; CHECK: store i16 0, i16* %{{.*}}, align 1 +; CHECK: br i1 {{.*}}, label %vector.body +; +entry: + br label %for.body29 + +for.cond.cleanup28: + unreachable + +for.body29: + %i24.0170 = phi i16 [ 0, %entry], [ %inc37, %for.body29] + %add33 = add i16 undef, %i24.0170 + %idxprom34 = zext i16 %add33 to i32 + %arrayidx35 = getelementptr [2592 x i16], [2592 x i16] * @cm_array, i32 0, i32 %idxprom34 + store i16 0, i16 * %arrayidx35, align 1 + %inc37 = add i16 %i24.0170, 1 + %cmp26 = icmp ult i16 %inc37, 756 + br i1 %cmp26, label %for.body29, label %for.cond.cleanup28 +} + !llvm.module.flags = !{!0} !0 = !{i32 1, !"ProfileSummary", !1} !1 = !{!2, !3, !4, !5, !6, !7, !8, !9} From llvm-commits at lists.llvm.org Wed Oct 9 06:23:18 2019 From: llvm-commits at lists.llvm.org (Kai Nacke via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 13:23:18 +0000 (UTC) Subject: [PATCH] D68146: [FileCheck] Implement --ignore-case option. In-Reply-To: References: Message-ID: <5b596d2286c4e20bb01ae2e04d03359b@localhost.localdomain> Kai updated this revision to Diff 224037. Kai added a comment. Change single digit in test case to form a sequence. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68146/new/ https://reviews.llvm.org/D68146 Files: llvm/docs/CommandGuide/FileCheck.rst llvm/include/llvm/Support/FileCheck.h llvm/lib/Support/FileCheck.cpp llvm/lib/Support/FileCheckImpl.h llvm/test/FileCheck/check-ignore-case.txt llvm/utils/FileCheck/FileCheck.cpp -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68146.224037.patch Type: text/x-patch Size: 4425 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 06:23:35 2019 From: llvm-commits at lists.llvm.org (Sjoerd Meijer via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 13:23:35 +0000 (UTC) Subject: [PATCH] D68082: [LV] Emitting SCEV checks with OptForSize In-Reply-To: References: Message-ID: This revision was automatically updated to reflect the committed changes. Closed by commit rGd1170dbe5831: [LV] Emitting SCEV checks with OptForSize (authored by SjoerdMeijer). Changed prior to commit: https://reviews.llvm.org/D68082?vs=224012&id=224040#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68082/new/ https://reviews.llvm.org/D68082 Files: llvm/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp llvm/test/Transforms/LoopVectorize/optsize.ll Index: llvm/test/Transforms/LoopVectorize/optsize.ll =================================================================== --- llvm/test/Transforms/LoopVectorize/optsize.ll +++ llvm/test/Transforms/LoopVectorize/optsize.ll @@ -84,6 +84,43 @@ ret i32 0 } +; PR43371: don't run into an assert due to emitting SCEV runtime checks +; with OptForSize. +; + at cm_array = external global [2592 x i16], align 1 + +define void @pr43371() optsize { +; +; CHECK-LABEL: @pr43371 +; CHECK-NOT: vector.scevcheck +; +; We do not want to generate SCEV predicates when optimising for size, because +; that will lead to extra code generation such as the SCEV overflow runtime +; checks. 
Not generating SCEV predicates can still result in vectorisation as +; the non-consecutive loads/stores can be scalarized: +; +; CHECK: vector.body: +; CHECK: store i16 0, i16* %{{.*}}, align 1 +; CHECK: store i16 0, i16* %{{.*}}, align 1 +; CHECK: br i1 {{.*}}, label %vector.body +; +entry: + br label %for.body29 + +for.cond.cleanup28: + unreachable + +for.body29: + %i24.0170 = phi i16 [ 0, %entry], [ %inc37, %for.body29] + %add33 = add i16 undef, %i24.0170 + %idxprom34 = zext i16 %add33 to i32 + %arrayidx35 = getelementptr [2592 x i16], [2592 x i16] * @cm_array, i32 0, i32 %idxprom34 + store i16 0, i16 * %arrayidx35, align 1 + %inc37 = add i16 %i24.0170, 1 + %cmp26 = icmp ult i16 %inc37, 756 + br i1 %cmp26, label %for.body29, label %for.cond.cleanup28 +} + !llvm.module.flags = !{!0} !0 = !{i32 1, !"ProfileSummary", !1} !1 = !{!2, !3, !4, !5, !6, !7, !8, !9} Index: llvm/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp =================================================================== --- llvm/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp +++ llvm/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp @@ -409,7 +409,8 @@ const ValueToValueMap &Strides = getSymbolicStrides() ? *getSymbolicStrides() : ValueToValueMap(); - int Stride = getPtrStride(PSE, Ptr, TheLoop, Strides, true, false); + bool CanAddPredicate = !TheLoop->getHeader()->getParent()->hasOptSize(); + int Stride = getPtrStride(PSE, Ptr, TheLoop, Strides, CanAddPredicate, false); if (Stride == 1 || Stride == -1) return Stride; return 0; -------------- next part -------------- A non-text attachment was scrubbed... Name: D68082.224040.patch Type: text/x-patch Size: 2250 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 06:26:13 2019 From: llvm-commits at lists.llvm.org (Thomas Preud'homme via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 13:26:13 +0000 (UTC) Subject: [PATCH] D68146: [FileCheck] Implement --ignore-case option. 
In-Reply-To:
References:
Message-ID:

thopre added a comment.

LGTM but I'll let others comment.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68146/new/

https://reviews.llvm.org/D68146


From llvm-commits at lists.llvm.org Wed Oct 9 06:33:17 2019
From: llvm-commits at lists.llvm.org (Renato Golin via Phabricator via llvm-commits)
Date: Wed, 09 Oct 2019 13:33:17 +0000 (UTC)
Subject: [PATCH] D68577: [LV] Apply sink-after & interleave-groups as VPlan transformations (NFC)
In-Reply-To:
References:
Message-ID: <6b4fad0de9a3dddbe08320bf0a37b1d3@localhost.localdomain>

rengolin added a comment.

So, IIUC, this is changing tryCreateRecipe to move the interleave recipe creation to the caller, buildVPlanWithVPRecipes. The dependencies with the sink values are recorded initially, then the plans are created, then the sinks are applied and, if any, the interleave groups.

The refactoring of VPRecipeBase makes sense to me and the resulting code looks cleaner, but this doesn't look like an NFC change. Not that this is a bad thing, but I can't quite reach the conclusion that all the loops that would have been interleaved will continue to be, because the order of the plans may change (for better or worse) the conditions the plan starts with.

Regardless, I think this is a positive change and goes in the direction we want the VPlan infrastructure to go. It also looks semantically equivalent (with the caveat above), so the change looks good to me.

It would be good to wait for further reviews in the next few days, just in case I missed something. I haven't looked at this code for a while, so that's very likely. :)

Thanks!
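
The ordering described in the comment above (record the sink-after dependencies, build the recipes, apply the sinks, then apply any interleave groups) can be sketched roughly as follows. This is a toy model only: `Plan`, `buildPlan`, `applySinkAfter` and `applyInterleaveGroup` are hypothetical names invented for illustration, recipes are plain strings, and none of it is LLVM's actual VPlan API or the code in D68577.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a list of recipes; NOT LLVM's VPRecipeBase.
using Plan = std::vector<std::string>;

// Build one "recipe" per instruction, in program order. The sink-after
// pairs are assumed to have been recorded before this point.
Plan buildPlan(const std::vector<std::string> &Instrs) {
  return Plan(Instrs.begin(), Instrs.end());
}

// Apply the recorded sink-after pairs to the already-built plan: each
// Sink recipe is moved to directly after its Target recipe.
void applySinkAfter(Plan &P,
                    const std::map<std::string, std::string> &SinkAfter) {
  for (const auto &Entry : SinkAfter) {
    const std::string &Sink = Entry.first;
    const std::string &Target = Entry.second;
    auto SinkIt = std::find(P.begin(), P.end(), Sink);
    // Leave the plan untouched if the pair is degenerate or incomplete.
    if (Sink == Target || SinkIt == P.end() ||
        std::find(P.begin(), P.end(), Target) == P.end())
      continue;
    P.erase(SinkIt);
    auto TargetIt = std::find(P.begin(), P.end(), Target);
    P.insert(TargetIt + 1, Sink);
  }
}

// Replace the first member of a group with a single combined recipe and
// drop the recipes of the remaining members (an assumed simplification).
void applyInterleaveGroup(Plan &P, const std::vector<std::string> &Members) {
  if (Members.empty())
    return;
  auto First = std::find(P.begin(), P.end(), Members.front());
  if (First == P.end())
    return;
  std::string Combined = "interleave(";
  for (std::size_t I = 0; I < Members.size(); ++I) {
    if (I != 0)
      Combined += ",";
    Combined += Members[I];
  }
  Combined += ")";
  *First = Combined;
  for (std::size_t I = 1; I < Members.size(); ++I)
    P.erase(std::remove(P.begin(), P.end(), Members[I]), P.end());
}
```

In this model the interleave step sees the plan only after the sinks have moved recipes around, which is the ordering concern the comment raises about whether the change is strictly NFC.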
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68577/new/ https://reviews.llvm.org/D68577 From llvm-commits at lists.llvm.org Wed Oct 9 06:33:18 2019 From: llvm-commits at lists.llvm.org (Clement Courbet via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 13:33:18 +0000 (UTC) Subject: [PATCH] D68703: [llvm-exegesis] Ensure that ExecutableFunction are aligned. Message-ID: courbet created this revision. courbet added a reviewer: gchatelet. Herald added a subscriber: tschuett. Herald added a project: LLVM. Experiments show that this is the alignment we get (for ELF+Linux), but let's ensure that we have it. Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D68703 Files: llvm/tools/llvm-exegesis/lib/Assembler.cpp Index: llvm/tools/llvm-exegesis/lib/Assembler.cpp =================================================================== --- llvm/tools/llvm-exegesis/lib/Assembler.cpp +++ llvm/tools/llvm-exegesis/lib/Assembler.cpp @@ -21,6 +21,7 @@ #include "llvm/ExecutionEngine/SectionMemoryManager.h" #include "llvm/IR/LegacyPassManager.h" #include "llvm/MC/MCInstrInfo.h" +#include "llvm/Support/Alignment.h" #include "llvm/Support/MemoryBuffer.h" namespace llvm { @@ -28,6 +29,7 @@ static constexpr const char ModuleID[] = "ExegesisInfoTest"; static constexpr const char FunctionID[] = "foo"; +static const Align kFunctionAlignment(4096); // Fills the given basic block with register setup code, and returns true if // all registers could be setup correctly. 
@@ -169,13 +171,13 @@ ArrayRef LiveIns, ArrayRef RegisterInitialValues, const FillFunction &Fill, raw_pwrite_stream &AsmStream) { - std::unique_ptr Context = std::make_unique(); + auto Context = std::make_unique(); std::unique_ptr Module = createModule(Context, TM->createDataLayout()); - std::unique_ptr MMIWP = - std::make_unique(TM.get()); + auto MMIWP = std::make_unique(TM.get()); MachineFunction &MF = createVoidVoidPtrMachineFunction( FunctionID, Module.get(), &MMIWP.get()->getMMI()); + MF.ensureAlignment(kFunctionAlignment); // We need to instruct the passes that we're done with SSA and virtual // registers. @@ -305,9 +307,11 @@ // executable page. ExecEngine->addObjectFile(std::move(ObjectFileHolder)); // Fetching function bytes. - FunctionBytes = StringRef(reinterpret_cast( - ExecEngine->getFunctionAddress(FunctionID)), - CodeSize); + const auto FunctionAddress = ExecEngine->getFunctionAddress(FunctionID); + assert((FunctionAddress & (kFunctionAlignment.value() - 1)) == 0 && + "function is not properly aligned"); + FunctionBytes = + StringRef(reinterpret_cast(FunctionAddress), CodeSize); } } // namespace exegesis -------------- next part -------------- A non-text attachment was scrubbed... Name: D68703.224041.patch Type: text/x-patch Size: 2282 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 06:33:18 2019 From: llvm-commits at lists.llvm.org (Dave Green via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 13:33:18 +0000 (UTC) Subject: [PATCH] D67162: [InstCombine] Known-bits optimization for ARM MVE VADC. In-Reply-To: References: Message-ID: dmgreen accepted this revision. dmgreen added a comment. This revision is now accepted and ready to land. LGTM. 
Thanks


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D67162/new/

https://reviews.llvm.org/D67162


From llvm-commits at lists.llvm.org Wed Oct 9 06:42:37 2019
From: llvm-commits at lists.llvm.org (Alexander Richardson via Phabricator via llvm-commits)
Date: Wed, 09 Oct 2019 13:42:37 +0000 (UTC)
Subject: [PATCH] D68146: [FileCheck] Implement --ignore-case option.
In-Reply-To:
References:
Message-ID:

arichardson added a comment.

LGTM


================
Comment at: llvm/test/FileCheck/check-ignore-case.txt:22
+# CHECK-NOT: loop
+# CHECK-NEXT: break
----------------
newline?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68146/new/

https://reviews.llvm.org/D68146


From llvm-commits at lists.llvm.org Wed Oct 9 06:42:38 2019
From: llvm-commits at lists.llvm.org (Luís Marques via Phabricator via llvm-commits)
Date: Wed, 09 Oct 2019 13:42:38 +0000 (UTC)
Subject: [PATCH] D68698: [test-suite] Add Architecture Detection for RISC-V
In-Reply-To:
References:
Message-ID:

luismarques accepted this revision.
luismarques added a comment.
This revision is now accepted and ready to land.

LGTM. I suggest slightly tweaking the commit message to make it clear that it's the `str` message that is prefix matched for `riscv`.


Repository:
  rT test-suite

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68698/new/

https://reviews.llvm.org/D68698


From llvm-commits at lists.llvm.org Wed Oct 9 06:42:39 2019
From: llvm-commits at lists.llvm.org (Guillaume Chatelet via Phabricator via llvm-commits)
Date: Wed, 09 Oct 2019 13:42:39 +0000 (UTC)
Subject: [PATCH] D68703: [llvm-exegesis] Ensure that ExecutableFunction are aligned.
In-Reply-To:
References:
Message-ID:

gchatelet added inline comments.
================
Comment at: llvm/tools/llvm-exegesis/lib/Assembler.cpp:311
+ const auto FunctionAddress = ExecEngine->getFunctionAddress(FunctionID);
+ assert((FunctionAddress & (kFunctionAlignment.value() - 1)) == 0 &&
+ "function is not properly aligned");
----------------
`assert(isAligned(FunctionAddress, kFunctionAlignment) && "function is not properly aligned");`


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68703/new/

https://reviews.llvm.org/D68703


From llvm-commits at lists.llvm.org Wed Oct 9 06:51:53 2019
From: llvm-commits at lists.llvm.org (Kai Nacke via Phabricator via llvm-commits)
Date: Wed, 09 Oct 2019 13:51:53 +0000 (UTC)
Subject: [PATCH] D68146: [FileCheck] Implement --ignore-case option.
In-Reply-To:
References:
Message-ID:

Kai updated this revision to Diff 224042.
Kai added a comment.

Added newline at end of test case.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68146/new/

https://reviews.llvm.org/D68146

Files:
  llvm/docs/CommandGuide/FileCheck.rst
  llvm/include/llvm/Support/FileCheck.h
  llvm/lib/Support/FileCheck.cpp
  llvm/lib/Support/FileCheckImpl.h
  llvm/test/FileCheck/check-ignore-case.txt
  llvm/utils/FileCheck/FileCheck.cpp

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68146.224042.patch
Type: text/x-patch
Size: 4397 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Wed Oct 9 06:51:54 2019
From: llvm-commits at lists.llvm.org (Clement Courbet via Phabricator via llvm-commits)
Date: Wed, 09 Oct 2019 13:51:54 +0000 (UTC)
Subject: [PATCH] D68703: [llvm-exegesis] Ensure that ExecutableFunction are aligned.
In-Reply-To:
References:
Message-ID: <5e718dc4283987607a1056d5de8e8015@localhost.localdomain>

courbet updated this revision to Diff 224043.
courbet marked an inline comment as done.
courbet added a comment.
Address Comments Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68703/new/ https://reviews.llvm.org/D68703 Files: llvm/tools/llvm-exegesis/lib/Assembler.cpp Index: llvm/tools/llvm-exegesis/lib/Assembler.cpp =================================================================== --- llvm/tools/llvm-exegesis/lib/Assembler.cpp +++ llvm/tools/llvm-exegesis/lib/Assembler.cpp @@ -21,6 +21,7 @@ #include "llvm/ExecutionEngine/SectionMemoryManager.h" #include "llvm/IR/LegacyPassManager.h" #include "llvm/MC/MCInstrInfo.h" +#include "llvm/Support/Alignment.h" #include "llvm/Support/MemoryBuffer.h" namespace llvm { @@ -28,6 +29,7 @@ static constexpr const char ModuleID[] = "ExegesisInfoTest"; static constexpr const char FunctionID[] = "foo"; +static const Align kFunctionAlignment(4096); // Fills the given basic block with register setup code, and returns true if // all registers could be setup correctly. @@ -169,13 +171,13 @@ ArrayRef LiveIns, ArrayRef RegisterInitialValues, const FillFunction &Fill, raw_pwrite_stream &AsmStream) { - std::unique_ptr Context = std::make_unique(); + auto Context = std::make_unique(); std::unique_ptr Module = createModule(Context, TM->createDataLayout()); - std::unique_ptr MMIWP = - std::make_unique(TM.get()); + auto MMIWP = std::make_unique(TM.get()); MachineFunction &MF = createVoidVoidPtrMachineFunction( FunctionID, Module.get(), &MMIWP.get()->getMMI()); + MF.ensureAlignment(kFunctionAlignment); // We need to instruct the passes that we're done with SSA and virtual // registers. @@ -305,9 +307,11 @@ // executable page. ExecEngine->addObjectFile(std::move(ObjectFileHolder)); // Fetching function bytes. 
- FunctionBytes = StringRef(reinterpret_cast( - ExecEngine->getFunctionAddress(FunctionID)), - CodeSize); + const auto FunctionAddress = ExecEngine->getFunctionAddress(FunctionID); + assert(isAligned(kFunctionAlignment, FunctionAddress) && + "function is not properly aligned"); + FunctionBytes = + StringRef(reinterpret_cast(FunctionAddress), CodeSize); } } // namespace exegesis -------------- next part -------------- A non-text attachment was scrubbed... Name: D68703.224043.patch Type: text/x-patch Size: 2271 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 07:11:56 2019 From: llvm-commits at lists.llvm.org (Clement Courbet via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 14:11:56 +0000 (UTC) Subject: [PATCH] D68703: [llvm-exegesis] Ensure that ExecutableFunction are aligned. In-Reply-To: References: Message-ID: <7e5d97774558c4685b90d8b642c91c2d@localhost.localdomain> courbet updated this revision to Diff 224047. courbet added a comment. make type explicit Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68703/new/ https://reviews.llvm.org/D68703 Files: llvm/tools/llvm-exegesis/lib/Assembler.cpp Index: llvm/tools/llvm-exegesis/lib/Assembler.cpp =================================================================== --- llvm/tools/llvm-exegesis/lib/Assembler.cpp +++ llvm/tools/llvm-exegesis/lib/Assembler.cpp @@ -21,6 +21,7 @@ #include "llvm/ExecutionEngine/SectionMemoryManager.h" #include "llvm/IR/LegacyPassManager.h" #include "llvm/MC/MCInstrInfo.h" +#include "llvm/Support/Alignment.h" #include "llvm/Support/MemoryBuffer.h" namespace llvm { @@ -28,6 +29,7 @@ static constexpr const char ModuleID[] = "ExegesisInfoTest"; static constexpr const char FunctionID[] = "foo"; +static const Align kFunctionAlignment(4096); // Fills the given basic block with register setup code, and returns true if // all registers could be setup correctly. 
@@ -169,13 +171,13 @@ ArrayRef LiveIns, ArrayRef RegisterInitialValues, const FillFunction &Fill, raw_pwrite_stream &AsmStream) { - std::unique_ptr Context = std::make_unique(); + auto Context = std::make_unique(); std::unique_ptr Module = createModule(Context, TM->createDataLayout()); - std::unique_ptr MMIWP = - std::make_unique(TM.get()); + auto MMIWP = std::make_unique(TM.get()); MachineFunction &MF = createVoidVoidPtrMachineFunction( FunctionID, Module.get(), &MMIWP.get()->getMMI()); + MF.ensureAlignment(kFunctionAlignment); // We need to instruct the passes that we're done with SSA and virtual // registers. @@ -305,9 +307,11 @@ // executable page. ExecEngine->addObjectFile(std::move(ObjectFileHolder)); // Fetching function bytes. - FunctionBytes = StringRef(reinterpret_cast( - ExecEngine->getFunctionAddress(FunctionID)), - CodeSize); + const uint64_t FunctionAddress = ExecEngine->getFunctionAddress(FunctionID); + assert(isAligned(kFunctionAlignment, FunctionAddress) && + "function is not properly aligned"); + FunctionBytes = + StringRef(reinterpret_cast(FunctionAddress), CodeSize); } } // namespace exegesis -------------- next part -------------- A non-text attachment was scrubbed... Name: D68703.224047.patch Type: text/x-patch Size: 2275 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 07:11:57 2019 From: llvm-commits at lists.llvm.org (Yonggang Luo via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 14:11:57 +0000 (UTC) Subject: [PATCH] D67867: [libc] Add few docs and implementation of strcpy and strcat. In-Reply-To: References: Message-ID: <915e1393e3fa6f5c39080320a6ce17a4@localhost.localdomain> lygstate added inline comments. ================ Comment at: libc/trunk/include/ctype.h:18 + +int isalpha(int); + ---------------- There is no DLL export things here for MSVC, so only building as a static c lib? not considerating as a shared library? 
Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D67867/new/

https://reviews.llvm.org/D67867


From llvm-commits at lists.llvm.org Wed Oct 9 07:17:39 2019
From: llvm-commits at lists.llvm.org (David Green via llvm-commits)
Date: Wed, 09 Oct 2019 14:17:39 -0000
Subject: [llvm] r374169 - Add and adjust saturating tests. NFC
Message-ID: <20191009141739.3438687329@lists.llvm.org>

Author: dmgreen
Date: Wed Oct 9 07:17:38 2019
New Revision: 374169

URL: http://llvm.org/viewvc/llvm-project?rev=374169&view=rev
Log:
Add and adjust saturating tests. NFC

This adds some extra testing to the existing [su][add/sub]_sat X86 and AArch64
tests and adds equivalent tests for ARM.

Added:
    llvm/trunk/test/CodeGen/ARM/sadd_sat.ll
    llvm/trunk/test/CodeGen/ARM/ssub_sat.ll
    llvm/trunk/test/CodeGen/ARM/uadd_sat.ll
    llvm/trunk/test/CodeGen/ARM/usub_sat.ll
Modified:
    llvm/trunk/test/CodeGen/AArch64/sadd_sat.ll
    llvm/trunk/test/CodeGen/AArch64/ssub_sat.ll
    llvm/trunk/test/CodeGen/AArch64/uadd_sat.ll
    llvm/trunk/test/CodeGen/AArch64/usub_sat.ll
    llvm/trunk/test/CodeGen/X86/sadd_sat.ll
    llvm/trunk/test/CodeGen/X86/ssub_sat.ll
    llvm/trunk/test/CodeGen/X86/uadd_sat.ll
    llvm/trunk/test/CodeGen/X86/usub_sat.ll

Modified: llvm/trunk/test/CodeGen/AArch64/sadd_sat.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AArch64/sadd_sat.ll?rev=374169&r1=374168&r2=374169&view=diff
==============================================================================
--- llvm/trunk/test/CodeGen/AArch64/sadd_sat.ll (original)
+++ llvm/trunk/test/CodeGen/AArch64/sadd_sat.ll Wed Oct 9 07:17:38 2019
@@ -1,10 +1,12 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
 ; RUN: llc < %s -mtriple=aarch64-- | FileCheck %s
-declare i4 @llvm.sadd.sat.i4 (i4, i4)
-declare i32 @llvm.sadd.sat.i32 (i32, i32)
-declare i64 @llvm.sadd.sat.i64 (i64, i64)
-declare <4 x i32> @llvm.sadd.sat.v4i32(<4 x i32>, <4 x i32>)
+declare i4 @llvm.sadd.sat.i4(i4, i4)
+declare i8 @llvm.sadd.sat.i8(i8,
i8) +declare i16 @llvm.sadd.sat.i16(i16, i16) +declare i32 @llvm.sadd.sat.i32(i32, i32) +declare i64 @llvm.sadd.sat.i64(i64, i64) +declare <4 x i32> @llvm.sadd.sat.v4i32(<4 x i32>, <4 x i32>) define i32 @func(i32 %x, i32 %y) nounwind { ; CHECK-LABEL: func: @@ -34,6 +36,38 @@ define i64 @func2(i64 %x, i64 %y) nounwi ret i64 %tmp; } +define i16 @func16(i16 %x, i16 %y) nounwind { +; CHECK-LABEL: func16: +; CHECK: // %bb.0: +; CHECK-NEXT: lsl w8, w0, #16 +; CHECK-NEXT: adds w10, w8, w1, lsl #16 +; CHECK-NEXT: mov w9, #2147483647 +; CHECK-NEXT: cmp w10, #0 // =0 +; CHECK-NEXT: cinv w9, w9, ge +; CHECK-NEXT: adds w8, w8, w1, lsl #16 +; CHECK-NEXT: csel w8, w9, w8, vs +; CHECK-NEXT: asr w0, w8, #16 +; CHECK-NEXT: ret + %tmp = call i16 @llvm.sadd.sat.i16(i16 %x, i16 %y); + ret i16 %tmp; +} + +define i8 @func8(i8 %x, i8 %y) nounwind { +; CHECK-LABEL: func8: +; CHECK: // %bb.0: +; CHECK-NEXT: lsl w8, w0, #24 +; CHECK-NEXT: adds w10, w8, w1, lsl #24 +; CHECK-NEXT: mov w9, #2147483647 +; CHECK-NEXT: cmp w10, #0 // =0 +; CHECK-NEXT: cinv w9, w9, ge +; CHECK-NEXT: adds w8, w8, w1, lsl #24 +; CHECK-NEXT: csel w8, w9, w8, vs +; CHECK-NEXT: asr w0, w8, #24 +; CHECK-NEXT: ret + %tmp = call i8 @llvm.sadd.sat.i8(i8 %x, i8 %y); + ret i8 %tmp; +} + define i4 @func3(i4 %x, i4 %y) nounwind { ; CHECK-LABEL: func3: ; CHECK: // %bb.0: Modified: llvm/trunk/test/CodeGen/AArch64/ssub_sat.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AArch64/ssub_sat.ll?rev=374169&r1=374168&r2=374169&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/AArch64/ssub_sat.ll (original) +++ llvm/trunk/test/CodeGen/AArch64/ssub_sat.ll Wed Oct 9 07:17:38 2019 @@ -1,10 +1,12 @@ ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py ; RUN: llc < %s -mtriple=aarch64-- | FileCheck %s -declare i4 @llvm.ssub.sat.i4 (i4, i4) -declare i32 @llvm.ssub.sat.i32 (i32, i32) -declare i64 @llvm.ssub.sat.i64 (i64, i64) -declare 
<4 x i32> @llvm.ssub.sat.v4i32(<4 x i32>, <4 x i32>) +declare i4 @llvm.ssub.sat.i4(i4, i4) +declare i8 @llvm.ssub.sat.i8(i8, i8) +declare i16 @llvm.ssub.sat.i16(i16, i16) +declare i32 @llvm.ssub.sat.i32(i32, i32) +declare i64 @llvm.ssub.sat.i64(i64, i64) +declare <4 x i32> @llvm.ssub.sat.v4i32(<4 x i32>, <4 x i32>) define i32 @func(i32 %x, i32 %y) nounwind { ; CHECK-LABEL: func: @@ -34,6 +36,38 @@ define i64 @func2(i64 %x, i64 %y) nounwi ret i64 %tmp; } +define i16 @func16(i16 %x, i16 %y) nounwind { +; CHECK-LABEL: func16: +; CHECK: // %bb.0: +; CHECK-NEXT: lsl w8, w0, #16 +; CHECK-NEXT: subs w10, w8, w1, lsl #16 +; CHECK-NEXT: mov w9, #2147483647 +; CHECK-NEXT: cmp w10, #0 // =0 +; CHECK-NEXT: cinv w9, w9, ge +; CHECK-NEXT: subs w8, w8, w1, lsl #16 +; CHECK-NEXT: csel w8, w9, w8, vs +; CHECK-NEXT: asr w0, w8, #16 +; CHECK-NEXT: ret + %tmp = call i16 @llvm.ssub.sat.i16(i16 %x, i16 %y); + ret i16 %tmp; +} + +define i8 @func8(i8 %x, i8 %y) nounwind { +; CHECK-LABEL: func8: +; CHECK: // %bb.0: +; CHECK-NEXT: lsl w8, w0, #24 +; CHECK-NEXT: subs w10, w8, w1, lsl #24 +; CHECK-NEXT: mov w9, #2147483647 +; CHECK-NEXT: cmp w10, #0 // =0 +; CHECK-NEXT: cinv w9, w9, ge +; CHECK-NEXT: subs w8, w8, w1, lsl #24 +; CHECK-NEXT: csel w8, w9, w8, vs +; CHECK-NEXT: asr w0, w8, #24 +; CHECK-NEXT: ret + %tmp = call i8 @llvm.ssub.sat.i8(i8 %x, i8 %y); + ret i8 %tmp; +} + define i4 @func3(i4 %x, i4 %y) nounwind { ; CHECK-LABEL: func3: ; CHECK: // %bb.0: Modified: llvm/trunk/test/CodeGen/AArch64/uadd_sat.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AArch64/uadd_sat.ll?rev=374169&r1=374168&r2=374169&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/AArch64/uadd_sat.ll (original) +++ llvm/trunk/test/CodeGen/AArch64/uadd_sat.ll Wed Oct 9 07:17:38 2019 @@ -1,9 +1,11 @@ ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py ; RUN: llc < %s -mtriple=aarch64-- | FileCheck %s -declare 
i4 @llvm.uadd.sat.i4 (i4, i4) -declare i32 @llvm.uadd.sat.i32 (i32, i32) -declare i64 @llvm.uadd.sat.i64 (i64, i64) +declare i4 @llvm.uadd.sat.i4(i4, i4) +declare i8 @llvm.uadd.sat.i8(i8, i8) +declare i16 @llvm.uadd.sat.i16(i16, i16) +declare i32 @llvm.uadd.sat.i32(i32, i32) +declare i64 @llvm.uadd.sat.i64(i64, i64) define i32 @func(i32 %x, i32 %y) nounwind { ; CHECK-LABEL: func: @@ -25,6 +27,30 @@ define i64 @func2(i64 %x, i64 %y) nounwi ret i64 %tmp; } +define i16 @func16(i16 %x, i16 %y) nounwind { +; CHECK-LABEL: func16: +; CHECK: // %bb.0: +; CHECK-NEXT: lsl w8, w0, #16 +; CHECK-NEXT: adds w8, w8, w1, lsl #16 +; CHECK-NEXT: csinv w8, w8, wzr, lo +; CHECK-NEXT: lsr w0, w8, #16 +; CHECK-NEXT: ret + %tmp = call i16 @llvm.uadd.sat.i16(i16 %x, i16 %y); + ret i16 %tmp; +} + +define i8 @func8(i8 %x, i8 %y) nounwind { +; CHECK-LABEL: func8: +; CHECK: // %bb.0: +; CHECK-NEXT: lsl w8, w0, #24 +; CHECK-NEXT: adds w8, w8, w1, lsl #24 +; CHECK-NEXT: csinv w8, w8, wzr, lo +; CHECK-NEXT: lsr w0, w8, #24 +; CHECK-NEXT: ret + %tmp = call i8 @llvm.uadd.sat.i8(i8 %x, i8 %y); + ret i8 %tmp; +} + define i4 @func3(i4 %x, i4 %y) nounwind { ; CHECK-LABEL: func3: ; CHECK: // %bb.0: Modified: llvm/trunk/test/CodeGen/AArch64/usub_sat.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AArch64/usub_sat.ll?rev=374169&r1=374168&r2=374169&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/AArch64/usub_sat.ll (original) +++ llvm/trunk/test/CodeGen/AArch64/usub_sat.ll Wed Oct 9 07:17:38 2019 @@ -1,9 +1,11 @@ ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py ; RUN: llc < %s -mtriple=aarch64-- | FileCheck %s -declare i4 @llvm.usub.sat.i4 (i4, i4) -declare i32 @llvm.usub.sat.i32 (i32, i32) -declare i64 @llvm.usub.sat.i64 (i64, i64) +declare i4 @llvm.usub.sat.i4(i4, i4) +declare i8 @llvm.usub.sat.i8(i8, i8) +declare i16 @llvm.usub.sat.i16(i16, i16) +declare i32 @llvm.usub.sat.i32(i32, i32) 
+declare i64 @llvm.usub.sat.i64(i64, i64) define i32 @func(i32 %x, i32 %y) nounwind { ; CHECK-LABEL: func: @@ -25,6 +27,30 @@ define i64 @func2(i64 %x, i64 %y) nounwi ret i64 %tmp; } +define i16 @func16(i16 %x, i16 %y) nounwind { +; CHECK-LABEL: func16: +; CHECK: // %bb.0: +; CHECK-NEXT: lsl w8, w0, #16 +; CHECK-NEXT: subs w8, w8, w1, lsl #16 +; CHECK-NEXT: csel w8, wzr, w8, lo +; CHECK-NEXT: lsr w0, w8, #16 +; CHECK-NEXT: ret + %tmp = call i16 @llvm.usub.sat.i16(i16 %x, i16 %y); + ret i16 %tmp; +} + +define i8 @func8(i8 %x, i8 %y) nounwind { +; CHECK-LABEL: func8: +; CHECK: // %bb.0: +; CHECK-NEXT: lsl w8, w0, #24 +; CHECK-NEXT: subs w8, w8, w1, lsl #24 +; CHECK-NEXT: csel w8, wzr, w8, lo +; CHECK-NEXT: lsr w0, w8, #24 +; CHECK-NEXT: ret + %tmp = call i8 @llvm.usub.sat.i8(i8 %x, i8 %y); + ret i8 %tmp; +} + define i4 @func3(i4 %x, i4 %y) nounwind { ; CHECK-LABEL: func3: ; CHECK: // %bb.0: Added: llvm/trunk/test/CodeGen/ARM/sadd_sat.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/ARM/sadd_sat.ll?rev=374169&view=auto ============================================================================== --- llvm/trunk/test/CodeGen/ARM/sadd_sat.ll (added) +++ llvm/trunk/test/CodeGen/ARM/sadd_sat.ll Wed Oct 9 07:17:38 2019 @@ -0,0 +1,415 @@ +; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py +; RUN: llc < %s -mtriple=thumbv6m-none-eabi | FileCheck %s --check-prefix=CHECK-T1 +; RUN: llc < %s -mtriple=thumbv7m-none-eabi | FileCheck %s --check-prefix=CHECK-T2 --check-prefix=CHECK-T2NODSP +; RUN: llc < %s -mtriple=thumbv7em-none-eabi | FileCheck %s --check-prefix=CHECK-T2 --check-prefix=CHECK-T2DSP +; RUN: llc < %s -mtriple=armv8a-none-eabi | FileCheck %s --check-prefix=CHECK-ARM + +declare i4 @llvm.sadd.sat.i4(i4, i4) +declare i8 @llvm.sadd.sat.i8(i8, i8) +declare i16 @llvm.sadd.sat.i16(i16, i16) +declare i32 @llvm.sadd.sat.i32(i32, i32) +declare i64 @llvm.sadd.sat.i64(i64, i64) + +define i32 @func(i32 %x, i32 %y) nounwind { +; 
CHECK-T1-LABEL: func: +; CHECK-T1: @ %bb.0: +; CHECK-T1-NEXT: mov r2, r0 +; CHECK-T1-NEXT: movs r3, #1 +; CHECK-T1-NEXT: adds r0, r0, r1 +; CHECK-T1-NEXT: mov r1, r3 +; CHECK-T1-NEXT: bmi .LBB0_2 +; CHECK-T1-NEXT: @ %bb.1: +; CHECK-T1-NEXT: movs r1, #0 +; CHECK-T1-NEXT: .LBB0_2: +; CHECK-T1-NEXT: cmp r1, #0 +; CHECK-T1-NEXT: bne .LBB0_4 +; CHECK-T1-NEXT: @ %bb.3: +; CHECK-T1-NEXT: lsls r1, r3, #31 +; CHECK-T1-NEXT: cmp r0, r2 +; CHECK-T1-NEXT: bvs .LBB0_5 +; CHECK-T1-NEXT: b .LBB0_6 +; CHECK-T1-NEXT: .LBB0_4: +; CHECK-T1-NEXT: ldr r1, .LCPI0_0 +; CHECK-T1-NEXT: cmp r0, r2 +; CHECK-T1-NEXT: bvc .LBB0_6 +; CHECK-T1-NEXT: .LBB0_5: +; CHECK-T1-NEXT: mov r0, r1 +; CHECK-T1-NEXT: .LBB0_6: +; CHECK-T1-NEXT: bx lr +; CHECK-T1-NEXT: .p2align 2 +; CHECK-T1-NEXT: @ %bb.7: +; CHECK-T1-NEXT: .LCPI0_0: +; CHECK-T1-NEXT: .long 2147483647 @ 0x7fffffff +; +; CHECK-T2-LABEL: func: +; CHECK-T2: @ %bb.0: +; CHECK-T2-NEXT: adds r2, r0, r1 +; CHECK-T2-NEXT: mov.w r3, #0 +; CHECK-T2-NEXT: mov.w r1, #-2147483648 +; CHECK-T2-NEXT: it mi +; CHECK-T2-NEXT: movmi r3, #1 +; CHECK-T2-NEXT: cmp r3, #0 +; CHECK-T2-NEXT: it ne +; CHECK-T2-NEXT: mvnne r1, #-2147483648 +; CHECK-T2-NEXT: cmp r2, r0 +; CHECK-T2-NEXT: it vc +; CHECK-T2-NEXT: movvc r1, r2 +; CHECK-T2-NEXT: mov r0, r1 +; CHECK-T2-NEXT: bx lr +; +; CHECK-ARM-LABEL: func: +; CHECK-ARM: @ %bb.0: +; CHECK-ARM-NEXT: adds r2, r0, r1 +; CHECK-ARM-NEXT: mov r3, #0 +; CHECK-ARM-NEXT: movwmi r3, #1 +; CHECK-ARM-NEXT: mov r1, #-2147483648 +; CHECK-ARM-NEXT: cmp r3, #0 +; CHECK-ARM-NEXT: mvnne r1, #-2147483648 +; CHECK-ARM-NEXT: cmp r2, r0 +; CHECK-ARM-NEXT: movvc r1, r2 +; CHECK-ARM-NEXT: mov r0, r1 +; CHECK-ARM-NEXT: bx lr + %tmp = call i32 @llvm.sadd.sat.i32(i32 %x, i32 %y) + ret i32 %tmp +} + +define i64 @func2(i64 %x, i64 %y) nounwind { +; CHECK-T1-LABEL: func2: +; CHECK-T1: @ %bb.0: +; CHECK-T1-NEXT: .save {r4, r5, r6, r7, lr} +; CHECK-T1-NEXT: push {r4, r5, r6, r7, lr} +; CHECK-T1-NEXT: .pad #4 +; CHECK-T1-NEXT: sub sp, #4 +; CHECK-T1-NEXT: 
str r2, [sp] @ 4-byte Spill +; CHECK-T1-NEXT: mov r2, r0 +; CHECK-T1-NEXT: movs r4, #1 +; CHECK-T1-NEXT: movs r0, #0 +; CHECK-T1-NEXT: cmp r3, #0 +; CHECK-T1-NEXT: mov r5, r4 +; CHECK-T1-NEXT: bge .LBB1_2 +; CHECK-T1-NEXT: @ %bb.1: +; CHECK-T1-NEXT: mov r5, r0 +; CHECK-T1-NEXT: .LBB1_2: +; CHECK-T1-NEXT: cmp r1, #0 +; CHECK-T1-NEXT: mov r7, r4 +; CHECK-T1-NEXT: bge .LBB1_4 +; CHECK-T1-NEXT: @ %bb.3: +; CHECK-T1-NEXT: mov r7, r0 +; CHECK-T1-NEXT: .LBB1_4: +; CHECK-T1-NEXT: subs r6, r7, r5 +; CHECK-T1-NEXT: rsbs r5, r6, #0 +; CHECK-T1-NEXT: adcs r5, r6 +; CHECK-T1-NEXT: ldr r6, [sp] @ 4-byte Reload +; CHECK-T1-NEXT: adds r6, r2, r6 +; CHECK-T1-NEXT: adcs r1, r3 +; CHECK-T1-NEXT: cmp r1, #0 +; CHECK-T1-NEXT: mov r2, r4 +; CHECK-T1-NEXT: bge .LBB1_6 +; CHECK-T1-NEXT: @ %bb.5: +; CHECK-T1-NEXT: mov r2, r0 +; CHECK-T1-NEXT: .LBB1_6: +; CHECK-T1-NEXT: subs r0, r7, r2 +; CHECK-T1-NEXT: subs r2, r0, #1 +; CHECK-T1-NEXT: sbcs r0, r2 +; CHECK-T1-NEXT: ands r5, r0 +; CHECK-T1-NEXT: beq .LBB1_8 +; CHECK-T1-NEXT: @ %bb.7: +; CHECK-T1-NEXT: asrs r6, r1, #31 +; CHECK-T1-NEXT: .LBB1_8: +; CHECK-T1-NEXT: cmp r1, #0 +; CHECK-T1-NEXT: bmi .LBB1_10 +; CHECK-T1-NEXT: @ %bb.9: +; CHECK-T1-NEXT: lsls r2, r4, #31 +; CHECK-T1-NEXT: cmp r5, #0 +; CHECK-T1-NEXT: beq .LBB1_11 +; CHECK-T1-NEXT: b .LBB1_12 +; CHECK-T1-NEXT: .LBB1_10: +; CHECK-T1-NEXT: ldr r2, .LCPI1_0 +; CHECK-T1-NEXT: cmp r5, #0 +; CHECK-T1-NEXT: bne .LBB1_12 +; CHECK-T1-NEXT: .LBB1_11: +; CHECK-T1-NEXT: mov r2, r1 +; CHECK-T1-NEXT: .LBB1_12: +; CHECK-T1-NEXT: mov r0, r6 +; CHECK-T1-NEXT: mov r1, r2 +; CHECK-T1-NEXT: add sp, #4 +; CHECK-T1-NEXT: pop {r4, r5, r6, r7, pc} +; CHECK-T1-NEXT: .p2align 2 +; CHECK-T1-NEXT: @ %bb.13: +; CHECK-T1-NEXT: .LCPI1_0: +; CHECK-T1-NEXT: .long 2147483647 @ 0x7fffffff +; +; CHECK-T2-LABEL: func2: +; CHECK-T2: @ %bb.0: +; CHECK-T2-NEXT: .save {r7, lr} +; CHECK-T2-NEXT: push {r7, lr} +; CHECK-T2-NEXT: cmp.w r1, #-1 +; CHECK-T2-NEXT: mov.w lr, #0 +; CHECK-T2-NEXT: it gt +; CHECK-T2-NEXT: movgt.w 
lr, #1 +; CHECK-T2-NEXT: adds r0, r0, r2 +; CHECK-T2-NEXT: adc.w r2, r1, r3 +; CHECK-T2-NEXT: movs r1, #0 +; CHECK-T2-NEXT: cmp.w r2, #-1 +; CHECK-T2-NEXT: it gt +; CHECK-T2-NEXT: movgt r1, #1 +; CHECK-T2-NEXT: subs.w r1, lr, r1 +; CHECK-T2-NEXT: mov.w r12, #0 +; CHECK-T2-NEXT: it ne +; CHECK-T2-NEXT: movne r1, #1 +; CHECK-T2-NEXT: cmp.w r3, #-1 +; CHECK-T2-NEXT: it gt +; CHECK-T2-NEXT: movgt.w r12, #1 +; CHECK-T2-NEXT: sub.w r3, lr, r12 +; CHECK-T2-NEXT: clz r3, r3 +; CHECK-T2-NEXT: lsrs r3, r3, #5 +; CHECK-T2-NEXT: ands r3, r1 +; CHECK-T2-NEXT: mov.w r1, #-2147483648 +; CHECK-T2-NEXT: it ne +; CHECK-T2-NEXT: asrne r0, r2, #31 +; CHECK-T2-NEXT: cmp r2, #0 +; CHECK-T2-NEXT: it mi +; CHECK-T2-NEXT: mvnmi r1, #-2147483648 +; CHECK-T2-NEXT: cmp r3, #0 +; CHECK-T2-NEXT: it eq +; CHECK-T2-NEXT: moveq r1, r2 +; CHECK-T2-NEXT: pop {r7, pc} +; +; CHECK-ARM-LABEL: func2: +; CHECK-ARM: @ %bb.0: +; CHECK-ARM-NEXT: .save {r11, lr} +; CHECK-ARM-NEXT: push {r11, lr} +; CHECK-ARM-NEXT: adds r0, r0, r2 +; CHECK-ARM-NEXT: mov r2, #0 +; CHECK-ARM-NEXT: adc r12, r1, r3 +; CHECK-ARM-NEXT: cmn r1, #1 +; CHECK-ARM-NEXT: mov r1, #0 +; CHECK-ARM-NEXT: mov lr, #0 +; CHECK-ARM-NEXT: movwgt r1, #1 +; CHECK-ARM-NEXT: cmn r12, #1 +; CHECK-ARM-NEXT: movwgt r2, #1 +; CHECK-ARM-NEXT: subs r2, r1, r2 +; CHECK-ARM-NEXT: movwne r2, #1 +; CHECK-ARM-NEXT: cmn r3, #1 +; CHECK-ARM-NEXT: movwgt lr, #1 +; CHECK-ARM-NEXT: sub r1, r1, lr +; CHECK-ARM-NEXT: clz r1, r1 +; CHECK-ARM-NEXT: lsr r1, r1, #5 +; CHECK-ARM-NEXT: ands r2, r1, r2 +; CHECK-ARM-NEXT: asrne r0, r12, #31 +; CHECK-ARM-NEXT: mov r1, #-2147483648 +; CHECK-ARM-NEXT: cmp r12, #0 +; CHECK-ARM-NEXT: mvnmi r1, #-2147483648 +; CHECK-ARM-NEXT: cmp r2, #0 +; CHECK-ARM-NEXT: moveq r1, r12 +; CHECK-ARM-NEXT: pop {r11, pc} + %tmp = call i64 @llvm.sadd.sat.i64(i64 %x, i64 %y) + ret i64 %tmp +} + +define i16 @func16(i16 %x, i16 %y) nounwind { +; CHECK-T1-LABEL: func16: +; CHECK-T1: @ %bb.0: +; CHECK-T1-NEXT: lsls r3, r1, #16 +; CHECK-T1-NEXT: lsls r1, r0, 
#16 +; CHECK-T1-NEXT: movs r2, #1 +; CHECK-T1-NEXT: adds r0, r1, r3 +; CHECK-T1-NEXT: mov r3, r2 +; CHECK-T1-NEXT: bmi .LBB2_2 +; CHECK-T1-NEXT: @ %bb.1: +; CHECK-T1-NEXT: movs r3, #0 +; CHECK-T1-NEXT: .LBB2_2: +; CHECK-T1-NEXT: cmp r3, #0 +; CHECK-T1-NEXT: bne .LBB2_4 +; CHECK-T1-NEXT: @ %bb.3: +; CHECK-T1-NEXT: lsls r2, r2, #31 +; CHECK-T1-NEXT: cmp r0, r1 +; CHECK-T1-NEXT: bvs .LBB2_5 +; CHECK-T1-NEXT: b .LBB2_6 +; CHECK-T1-NEXT: .LBB2_4: +; CHECK-T1-NEXT: ldr r2, .LCPI2_0 +; CHECK-T1-NEXT: cmp r0, r1 +; CHECK-T1-NEXT: bvc .LBB2_6 +; CHECK-T1-NEXT: .LBB2_5: +; CHECK-T1-NEXT: mov r0, r2 +; CHECK-T1-NEXT: .LBB2_6: +; CHECK-T1-NEXT: asrs r0, r0, #16 +; CHECK-T1-NEXT: bx lr +; CHECK-T1-NEXT: .p2align 2 +; CHECK-T1-NEXT: @ %bb.7: +; CHECK-T1-NEXT: .LCPI2_0: +; CHECK-T1-NEXT: .long 2147483647 @ 0x7fffffff +; +; CHECK-T2-LABEL: func16: +; CHECK-T2: @ %bb.0: +; CHECK-T2-NEXT: lsls r2, r0, #16 +; CHECK-T2-NEXT: add.w r1, r2, r1, lsl #16 +; CHECK-T2-NEXT: movs r2, #0 +; CHECK-T2-NEXT: cmp r1, #0 +; CHECK-T2-NEXT: mov.w r3, #-2147483648 +; CHECK-T2-NEXT: it mi +; CHECK-T2-NEXT: movmi r2, #1 +; CHECK-T2-NEXT: cmp r2, #0 +; CHECK-T2-NEXT: it ne +; CHECK-T2-NEXT: mvnne r3, #-2147483648 +; CHECK-T2-NEXT: cmp.w r1, r0, lsl #16 +; CHECK-T2-NEXT: it vc +; CHECK-T2-NEXT: movvc r3, r1 +; CHECK-T2-NEXT: asrs r0, r3, #16 +; CHECK-T2-NEXT: bx lr +; +; CHECK-ARM-LABEL: func16: +; CHECK-ARM: @ %bb.0: +; CHECK-ARM-NEXT: lsl r2, r0, #16 +; CHECK-ARM-NEXT: add r1, r2, r1, lsl #16 +; CHECK-ARM-NEXT: mov r2, #0 +; CHECK-ARM-NEXT: cmp r1, #0 +; CHECK-ARM-NEXT: movwmi r2, #1 +; CHECK-ARM-NEXT: mov r3, #-2147483648 +; CHECK-ARM-NEXT: cmp r2, #0 +; CHECK-ARM-NEXT: mvnne r3, #-2147483648 +; CHECK-ARM-NEXT: cmp r1, r0, lsl #16 +; CHECK-ARM-NEXT: movvc r3, r1 +; CHECK-ARM-NEXT: asr r0, r3, #16 +; CHECK-ARM-NEXT: bx lr + %tmp = call i16 @llvm.sadd.sat.i16(i16 %x, i16 %y) + ret i16 %tmp +} + +define i8 @func8(i8 %x, i8 %y) nounwind { +; CHECK-T1-LABEL: func8: +; CHECK-T1: @ %bb.0: +; CHECK-T1-NEXT: 
lsls r3, r1, #24 +; CHECK-T1-NEXT: lsls r1, r0, #24 +; CHECK-T1-NEXT: movs r2, #1 +; CHECK-T1-NEXT: adds r0, r1, r3 +; CHECK-T1-NEXT: mov r3, r2 +; CHECK-T1-NEXT: bmi .LBB3_2 +; CHECK-T1-NEXT: @ %bb.1: +; CHECK-T1-NEXT: movs r3, #0 +; CHECK-T1-NEXT: .LBB3_2: +; CHECK-T1-NEXT: cmp r3, #0 +; CHECK-T1-NEXT: bne .LBB3_4 +; CHECK-T1-NEXT: @ %bb.3: +; CHECK-T1-NEXT: lsls r2, r2, #31 +; CHECK-T1-NEXT: cmp r0, r1 +; CHECK-T1-NEXT: bvs .LBB3_5 +; CHECK-T1-NEXT: b .LBB3_6 +; CHECK-T1-NEXT: .LBB3_4: +; CHECK-T1-NEXT: ldr r2, .LCPI3_0 +; CHECK-T1-NEXT: cmp r0, r1 +; CHECK-T1-NEXT: bvc .LBB3_6 +; CHECK-T1-NEXT: .LBB3_5: +; CHECK-T1-NEXT: mov r0, r2 +; CHECK-T1-NEXT: .LBB3_6: +; CHECK-T1-NEXT: asrs r0, r0, #24 +; CHECK-T1-NEXT: bx lr +; CHECK-T1-NEXT: .p2align 2 +; CHECK-T1-NEXT: @ %bb.7: +; CHECK-T1-NEXT: .LCPI3_0: +; CHECK-T1-NEXT: .long 2147483647 @ 0x7fffffff +; +; CHECK-T2-LABEL: func8: +; CHECK-T2: @ %bb.0: +; CHECK-T2-NEXT: lsls r2, r0, #24 +; CHECK-T2-NEXT: add.w r1, r2, r1, lsl #24 +; CHECK-T2-NEXT: movs r2, #0 +; CHECK-T2-NEXT: cmp r1, #0 +; CHECK-T2-NEXT: mov.w r3, #-2147483648 +; CHECK-T2-NEXT: it mi +; CHECK-T2-NEXT: movmi r2, #1 +; CHECK-T2-NEXT: cmp r2, #0 +; CHECK-T2-NEXT: it ne +; CHECK-T2-NEXT: mvnne r3, #-2147483648 +; CHECK-T2-NEXT: cmp.w r1, r0, lsl #24 +; CHECK-T2-NEXT: it vc +; CHECK-T2-NEXT: movvc r3, r1 +; CHECK-T2-NEXT: asrs r0, r3, #24 +; CHECK-T2-NEXT: bx lr +; +; CHECK-ARM-LABEL: func8: +; CHECK-ARM: @ %bb.0: +; CHECK-ARM-NEXT: lsl r2, r0, #24 +; CHECK-ARM-NEXT: add r1, r2, r1, lsl #24 +; CHECK-ARM-NEXT: mov r2, #0 +; CHECK-ARM-NEXT: cmp r1, #0 +; CHECK-ARM-NEXT: movwmi r2, #1 +; CHECK-ARM-NEXT: mov r3, #-2147483648 +; CHECK-ARM-NEXT: cmp r2, #0 +; CHECK-ARM-NEXT: mvnne r3, #-2147483648 +; CHECK-ARM-NEXT: cmp r1, r0, lsl #24 +; CHECK-ARM-NEXT: movvc r3, r1 +; CHECK-ARM-NEXT: asr r0, r3, #24 +; CHECK-ARM-NEXT: bx lr + %tmp = call i8 @llvm.sadd.sat.i8(i8 %x, i8 %y) + ret i8 %tmp +} + +define i4 @func3(i4 %x, i4 %y) nounwind { +; CHECK-T1-LABEL: func3: 
+; CHECK-T1: @ %bb.0: +; CHECK-T1-NEXT: lsls r3, r1, #28 +; CHECK-T1-NEXT: lsls r1, r0, #28 +; CHECK-T1-NEXT: movs r2, #1 +; CHECK-T1-NEXT: adds r0, r1, r3 +; CHECK-T1-NEXT: mov r3, r2 +; CHECK-T1-NEXT: bmi .LBB4_2 +; CHECK-T1-NEXT: @ %bb.1: +; CHECK-T1-NEXT: movs r3, #0 +; CHECK-T1-NEXT: .LBB4_2: +; CHECK-T1-NEXT: cmp r3, #0 +; CHECK-T1-NEXT: bne .LBB4_4 +; CHECK-T1-NEXT: @ %bb.3: +; CHECK-T1-NEXT: lsls r2, r2, #31 +; CHECK-T1-NEXT: cmp r0, r1 +; CHECK-T1-NEXT: bvs .LBB4_5 +; CHECK-T1-NEXT: b .LBB4_6 +; CHECK-T1-NEXT: .LBB4_4: +; CHECK-T1-NEXT: ldr r2, .LCPI4_0 +; CHECK-T1-NEXT: cmp r0, r1 +; CHECK-T1-NEXT: bvc .LBB4_6 +; CHECK-T1-NEXT: .LBB4_5: +; CHECK-T1-NEXT: mov r0, r2 +; CHECK-T1-NEXT: .LBB4_6: +; CHECK-T1-NEXT: asrs r0, r0, #28 +; CHECK-T1-NEXT: bx lr +; CHECK-T1-NEXT: .p2align 2 +; CHECK-T1-NEXT: @ %bb.7: +; CHECK-T1-NEXT: .LCPI4_0: +; CHECK-T1-NEXT: .long 2147483647 @ 0x7fffffff +; +; CHECK-T2-LABEL: func3: +; CHECK-T2: @ %bb.0: +; CHECK-T2-NEXT: lsls r2, r0, #28 +; CHECK-T2-NEXT: add.w r1, r2, r1, lsl #28 +; CHECK-T2-NEXT: movs r2, #0 +; CHECK-T2-NEXT: cmp r1, #0 +; CHECK-T2-NEXT: mov.w r3, #-2147483648 +; CHECK-T2-NEXT: it mi +; CHECK-T2-NEXT: movmi r2, #1 +; CHECK-T2-NEXT: cmp r2, #0 +; CHECK-T2-NEXT: it ne +; CHECK-T2-NEXT: mvnne r3, #-2147483648 +; CHECK-T2-NEXT: cmp.w r1, r0, lsl #28 +; CHECK-T2-NEXT: it vc +; CHECK-T2-NEXT: movvc r3, r1 +; CHECK-T2-NEXT: asrs r0, r3, #28 +; CHECK-T2-NEXT: bx lr +; +; CHECK-ARM-LABEL: func3: +; CHECK-ARM: @ %bb.0: +; CHECK-ARM-NEXT: lsl r2, r0, #28 +; CHECK-ARM-NEXT: add r1, r2, r1, lsl #28 +; CHECK-ARM-NEXT: mov r2, #0 +; CHECK-ARM-NEXT: cmp r1, #0 +; CHECK-ARM-NEXT: movwmi r2, #1 +; CHECK-ARM-NEXT: mov r3, #-2147483648 +; CHECK-ARM-NEXT: cmp r2, #0 +; CHECK-ARM-NEXT: mvnne r3, #-2147483648 +; CHECK-ARM-NEXT: cmp r1, r0, lsl #28 +; CHECK-ARM-NEXT: movvc r3, r1 +; CHECK-ARM-NEXT: asr r0, r3, #28 +; CHECK-ARM-NEXT: bx lr + %tmp = call i4 @llvm.sadd.sat.i4(i4 %x, i4 %y) + ret i4 %tmp +} Added: 
llvm/trunk/test/CodeGen/ARM/ssub_sat.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/ARM/ssub_sat.ll?rev=374169&view=auto ============================================================================== --- llvm/trunk/test/CodeGen/ARM/ssub_sat.ll (added) +++ llvm/trunk/test/CodeGen/ARM/ssub_sat.ll Wed Oct 9 07:17:38 2019 @@ -0,0 +1,608 @@ +; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py +; RUN: llc < %s -mtriple=thumbv6m-none-eabi | FileCheck %s --check-prefix=CHECK-T1 +; RUN: llc < %s -mtriple=thumbv7m-none-eabi | FileCheck %s --check-prefix=CHECK-T2 --check-prefix=CHECK-T2NODSP +; RUN: llc < %s -mtriple=thumbv7em-none-eabi | FileCheck %s --check-prefix=CHECK-T2 --check-prefix=CHECK-T2DSP +; RUN: llc < %s -mtriple=armv8a-none-eabi | FileCheck %s --check-prefix=CHECK-ARM + +declare i4 @llvm.ssub.sat.i4(i4, i4) +declare i8 @llvm.ssub.sat.i8(i8, i8) +declare i16 @llvm.ssub.sat.i16(i16, i16) +declare i32 @llvm.ssub.sat.i32(i32, i32) +declare i64 @llvm.ssub.sat.i64(i64, i64) +declare <4 x i32> @llvm.ssub.sat.v4i32(<4 x i32>, <4 x i32>) + +define i32 @func(i32 %x, i32 %y) nounwind { +; CHECK-T1-LABEL: func: +; CHECK-T1: @ %bb.0: +; CHECK-T1-NEXT: .save {r4, lr} +; CHECK-T1-NEXT: push {r4, lr} +; CHECK-T1-NEXT: mov r2, r0 +; CHECK-T1-NEXT: movs r3, #1 +; CHECK-T1-NEXT: subs r0, r0, r1 +; CHECK-T1-NEXT: mov r4, r3 +; CHECK-T1-NEXT: bmi .LBB0_2 +; CHECK-T1-NEXT: @ %bb.1: +; CHECK-T1-NEXT: movs r4, #0 +; CHECK-T1-NEXT: .LBB0_2: +; CHECK-T1-NEXT: cmp r4, #0 +; CHECK-T1-NEXT: bne .LBB0_4 +; CHECK-T1-NEXT: @ %bb.3: +; CHECK-T1-NEXT: lsls r3, r3, #31 +; CHECK-T1-NEXT: cmp r2, r1 +; CHECK-T1-NEXT: bvs .LBB0_5 +; CHECK-T1-NEXT: b .LBB0_6 +; CHECK-T1-NEXT: .LBB0_4: +; CHECK-T1-NEXT: ldr r3, .LCPI0_0 +; CHECK-T1-NEXT: cmp r2, r1 +; CHECK-T1-NEXT: bvc .LBB0_6 +; CHECK-T1-NEXT: .LBB0_5: +; CHECK-T1-NEXT: mov r0, r3 +; CHECK-T1-NEXT: .LBB0_6: +; CHECK-T1-NEXT: pop {r4, pc} +; CHECK-T1-NEXT: .p2align 2 +; CHECK-T1-NEXT: @ %bb.7: +; 
CHECK-T1-NEXT: .LCPI0_0: +; CHECK-T1-NEXT: .long 2147483647 @ 0x7fffffff +; +; CHECK-T2-LABEL: func: +; CHECK-T2: @ %bb.0: +; CHECK-T2-NEXT: subs.w r12, r0, r1 +; CHECK-T2-NEXT: mov.w r3, #0 +; CHECK-T2-NEXT: mov.w r2, #-2147483648 +; CHECK-T2-NEXT: it mi +; CHECK-T2-NEXT: movmi r3, #1 +; CHECK-T2-NEXT: cmp r3, #0 +; CHECK-T2-NEXT: it ne +; CHECK-T2-NEXT: mvnne r2, #-2147483648 +; CHECK-T2-NEXT: cmp r0, r1 +; CHECK-T2-NEXT: it vc +; CHECK-T2-NEXT: movvc r2, r12 +; CHECK-T2-NEXT: mov r0, r2 +; CHECK-T2-NEXT: bx lr +; +; CHECK-ARM-LABEL: func: +; CHECK-ARM: @ %bb.0: +; CHECK-ARM-NEXT: subs r12, r0, r1 +; CHECK-ARM-NEXT: mov r3, #0 +; CHECK-ARM-NEXT: movwmi r3, #1 +; CHECK-ARM-NEXT: mov r2, #-2147483648 +; CHECK-ARM-NEXT: cmp r3, #0 +; CHECK-ARM-NEXT: mvnne r2, #-2147483648 +; CHECK-ARM-NEXT: cmp r0, r1 +; CHECK-ARM-NEXT: movvc r2, r12 +; CHECK-ARM-NEXT: mov r0, r2 +; CHECK-ARM-NEXT: bx lr + %tmp = call i32 @llvm.ssub.sat.i32(i32 %x, i32 %y) + ret i32 %tmp +} + +define i64 @func2(i64 %x, i64 %y) nounwind { +; CHECK-T1-LABEL: func2: +; CHECK-T1: @ %bb.0: +; CHECK-T1-NEXT: .save {r4, r5, r6, r7, lr} +; CHECK-T1-NEXT: push {r4, r5, r6, r7, lr} +; CHECK-T1-NEXT: .pad #4 +; CHECK-T1-NEXT: sub sp, #4 +; CHECK-T1-NEXT: str r2, [sp] @ 4-byte Spill +; CHECK-T1-NEXT: mov r2, r0 +; CHECK-T1-NEXT: movs r4, #1 +; CHECK-T1-NEXT: movs r0, #0 +; CHECK-T1-NEXT: cmp r3, #0 +; CHECK-T1-NEXT: mov r5, r4 +; CHECK-T1-NEXT: bge .LBB1_2 +; CHECK-T1-NEXT: @ %bb.1: +; CHECK-T1-NEXT: mov r5, r0 +; CHECK-T1-NEXT: .LBB1_2: +; CHECK-T1-NEXT: cmp r1, #0 +; CHECK-T1-NEXT: mov r7, r4 +; CHECK-T1-NEXT: bge .LBB1_4 +; CHECK-T1-NEXT: @ %bb.3: +; CHECK-T1-NEXT: mov r7, r0 +; CHECK-T1-NEXT: .LBB1_4: +; CHECK-T1-NEXT: subs r5, r7, r5 +; CHECK-T1-NEXT: subs r6, r5, #1 +; CHECK-T1-NEXT: sbcs r5, r6 +; CHECK-T1-NEXT: ldr r6, [sp] @ 4-byte Reload +; CHECK-T1-NEXT: subs r6, r2, r6 +; CHECK-T1-NEXT: sbcs r1, r3 +; CHECK-T1-NEXT: cmp r1, #0 +; CHECK-T1-NEXT: mov r2, r4 +; CHECK-T1-NEXT: bge .LBB1_6 +; 
CHECK-T1-NEXT: @ %bb.5: +; CHECK-T1-NEXT: mov r2, r0 +; CHECK-T1-NEXT: .LBB1_6: +; CHECK-T1-NEXT: subs r0, r7, r2 +; CHECK-T1-NEXT: subs r2, r0, #1 +; CHECK-T1-NEXT: sbcs r0, r2 +; CHECK-T1-NEXT: ands r5, r0 +; CHECK-T1-NEXT: beq .LBB1_8 +; CHECK-T1-NEXT: @ %bb.7: +; CHECK-T1-NEXT: asrs r6, r1, #31 +; CHECK-T1-NEXT: .LBB1_8: +; CHECK-T1-NEXT: cmp r1, #0 +; CHECK-T1-NEXT: bmi .LBB1_10 +; CHECK-T1-NEXT: @ %bb.9: +; CHECK-T1-NEXT: lsls r2, r4, #31 +; CHECK-T1-NEXT: cmp r5, #0 +; CHECK-T1-NEXT: beq .LBB1_11 +; CHECK-T1-NEXT: b .LBB1_12 +; CHECK-T1-NEXT: .LBB1_10: +; CHECK-T1-NEXT: ldr r2, .LCPI1_0 +; CHECK-T1-NEXT: cmp r5, #0 +; CHECK-T1-NEXT: bne .LBB1_12 +; CHECK-T1-NEXT: .LBB1_11: +; CHECK-T1-NEXT: mov r2, r1 +; CHECK-T1-NEXT: .LBB1_12: +; CHECK-T1-NEXT: mov r0, r6 +; CHECK-T1-NEXT: mov r1, r2 +; CHECK-T1-NEXT: add sp, #4 +; CHECK-T1-NEXT: pop {r4, r5, r6, r7, pc} +; CHECK-T1-NEXT: .p2align 2 +; CHECK-T1-NEXT: @ %bb.13: +; CHECK-T1-NEXT: .LCPI1_0: +; CHECK-T1-NEXT: .long 2147483647 @ 0x7fffffff +; +; CHECK-T2-LABEL: func2: +; CHECK-T2: @ %bb.0: +; CHECK-T2-NEXT: .save {r4, lr} +; CHECK-T2-NEXT: push {r4, lr} +; CHECK-T2-NEXT: cmp.w r3, #-1 +; CHECK-T2-NEXT: mov.w lr, #0 +; CHECK-T2-NEXT: it gt +; CHECK-T2-NEXT: movgt.w lr, #1 +; CHECK-T2-NEXT: cmp.w r1, #-1 +; CHECK-T2-NEXT: mov.w r4, #0 +; CHECK-T2-NEXT: mov.w r12, #0 +; CHECK-T2-NEXT: it gt +; CHECK-T2-NEXT: movgt r4, #1 +; CHECK-T2-NEXT: subs.w lr, r4, lr +; CHECK-T2-NEXT: it ne +; CHECK-T2-NEXT: movne.w lr, #1 +; CHECK-T2-NEXT: subs r0, r0, r2 +; CHECK-T2-NEXT: sbc.w r2, r1, r3 +; CHECK-T2-NEXT: cmp.w r2, #-1 +; CHECK-T2-NEXT: it gt +; CHECK-T2-NEXT: movgt.w r12, #1 +; CHECK-T2-NEXT: subs.w r1, r4, r12 +; CHECK-T2-NEXT: it ne +; CHECK-T2-NEXT: movne r1, #1 +; CHECK-T2-NEXT: ands.w r3, lr, r1 +; CHECK-T2-NEXT: mov.w r1, #-2147483648 +; CHECK-T2-NEXT: it ne +; CHECK-T2-NEXT: asrne r0, r2, #31 +; CHECK-T2-NEXT: cmp r2, #0 +; CHECK-T2-NEXT: it mi +; CHECK-T2-NEXT: mvnmi r1, #-2147483648 +; CHECK-T2-NEXT: cmp r3, #0 
+; CHECK-T2-NEXT: it eq +; CHECK-T2-NEXT: moveq r1, r2 +; CHECK-T2-NEXT: pop {r4, pc} +; +; CHECK-ARM-LABEL: func2: +; CHECK-ARM: @ %bb.0: +; CHECK-ARM-NEXT: .save {r4, lr} +; CHECK-ARM-NEXT: push {r4, lr} +; CHECK-ARM-NEXT: cmn r3, #1 +; CHECK-ARM-NEXT: mov lr, #0 +; CHECK-ARM-NEXT: movwgt lr, #1 +; CHECK-ARM-NEXT: cmn r1, #1 +; CHECK-ARM-NEXT: mov r4, #0 +; CHECK-ARM-NEXT: mov r12, #0 +; CHECK-ARM-NEXT: movwgt r4, #1 +; CHECK-ARM-NEXT: subs lr, r4, lr +; CHECK-ARM-NEXT: movwne lr, #1 +; CHECK-ARM-NEXT: subs r0, r0, r2 +; CHECK-ARM-NEXT: sbc r2, r1, r3 +; CHECK-ARM-NEXT: cmn r2, #1 +; CHECK-ARM-NEXT: movwgt r12, #1 +; CHECK-ARM-NEXT: subs r1, r4, r12 +; CHECK-ARM-NEXT: movwne r1, #1 +; CHECK-ARM-NEXT: ands r3, lr, r1 +; CHECK-ARM-NEXT: asrne r0, r2, #31 +; CHECK-ARM-NEXT: mov r1, #-2147483648 +; CHECK-ARM-NEXT: cmp r2, #0 +; CHECK-ARM-NEXT: mvnmi r1, #-2147483648 +; CHECK-ARM-NEXT: cmp r3, #0 +; CHECK-ARM-NEXT: moveq r1, r2 +; CHECK-ARM-NEXT: pop {r4, pc} + %tmp = call i64 @llvm.ssub.sat.i64(i64 %x, i64 %y) + ret i64 %tmp +} + +define i16 @func16(i16 %x, i16 %y) nounwind { +; CHECK-T1-LABEL: func16: +; CHECK-T1: @ %bb.0: +; CHECK-T1-NEXT: .save {r4, lr} +; CHECK-T1-NEXT: push {r4, lr} +; CHECK-T1-NEXT: lsls r1, r1, #16 +; CHECK-T1-NEXT: lsls r2, r0, #16 +; CHECK-T1-NEXT: movs r3, #1 +; CHECK-T1-NEXT: subs r0, r2, r1 +; CHECK-T1-NEXT: mov r4, r3 +; CHECK-T1-NEXT: bmi .LBB2_2 +; CHECK-T1-NEXT: @ %bb.1: +; CHECK-T1-NEXT: movs r4, #0 +; CHECK-T1-NEXT: .LBB2_2: +; CHECK-T1-NEXT: cmp r4, #0 +; CHECK-T1-NEXT: bne .LBB2_4 +; CHECK-T1-NEXT: @ %bb.3: +; CHECK-T1-NEXT: lsls r3, r3, #31 +; CHECK-T1-NEXT: cmp r2, r1 +; CHECK-T1-NEXT: bvs .LBB2_5 +; CHECK-T1-NEXT: b .LBB2_6 +; CHECK-T1-NEXT: .LBB2_4: +; CHECK-T1-NEXT: ldr r3, .LCPI2_0 +; CHECK-T1-NEXT: cmp r2, r1 +; CHECK-T1-NEXT: bvc .LBB2_6 +; CHECK-T1-NEXT: .LBB2_5: +; CHECK-T1-NEXT: mov r0, r3 +; CHECK-T1-NEXT: .LBB2_6: +; CHECK-T1-NEXT: asrs r0, r0, #16 +; CHECK-T1-NEXT: pop {r4, pc} +; CHECK-T1-NEXT: .p2align 2 +; 
CHECK-T1-NEXT: @ %bb.7: +; CHECK-T1-NEXT: .LCPI2_0: +; CHECK-T1-NEXT: .long 2147483647 @ 0x7fffffff +; +; CHECK-T2-LABEL: func16: +; CHECK-T2: @ %bb.0: +; CHECK-T2-NEXT: lsls r0, r0, #16 +; CHECK-T2-NEXT: sub.w r12, r0, r1, lsl #16 +; CHECK-T2-NEXT: movs r3, #0 +; CHECK-T2-NEXT: cmp.w r12, #0 +; CHECK-T2-NEXT: mov.w r2, #-2147483648 +; CHECK-T2-NEXT: it mi +; CHECK-T2-NEXT: movmi r3, #1 +; CHECK-T2-NEXT: cmp r3, #0 +; CHECK-T2-NEXT: it ne +; CHECK-T2-NEXT: mvnne r2, #-2147483648 +; CHECK-T2-NEXT: cmp.w r0, r1, lsl #16 +; CHECK-T2-NEXT: it vc +; CHECK-T2-NEXT: movvc r2, r12 +; CHECK-T2-NEXT: asrs r0, r2, #16 +; CHECK-T2-NEXT: bx lr +; +; CHECK-ARM-LABEL: func16: +; CHECK-ARM: @ %bb.0: +; CHECK-ARM-NEXT: lsl r0, r0, #16 +; CHECK-ARM-NEXT: sub r12, r0, r1, lsl #16 +; CHECK-ARM-NEXT: mov r3, #0 +; CHECK-ARM-NEXT: cmp r12, #0 +; CHECK-ARM-NEXT: movwmi r3, #1 +; CHECK-ARM-NEXT: mov r2, #-2147483648 +; CHECK-ARM-NEXT: cmp r3, #0 +; CHECK-ARM-NEXT: mvnne r2, #-2147483648 +; CHECK-ARM-NEXT: cmp r0, r1, lsl #16 +; CHECK-ARM-NEXT: movvc r2, r12 +; CHECK-ARM-NEXT: asr r0, r2, #16 +; CHECK-ARM-NEXT: bx lr + %tmp = call i16 @llvm.ssub.sat.i16(i16 %x, i16 %y) + ret i16 %tmp +} + +define i8 @func8(i8 %x, i8 %y) nounwind { +; CHECK-T1-LABEL: func8: +; CHECK-T1: @ %bb.0: +; CHECK-T1-NEXT: .save {r4, lr} +; CHECK-T1-NEXT: push {r4, lr} +; CHECK-T1-NEXT: lsls r1, r1, #24 +; CHECK-T1-NEXT: lsls r2, r0, #24 +; CHECK-T1-NEXT: movs r3, #1 +; CHECK-T1-NEXT: subs r0, r2, r1 +; CHECK-T1-NEXT: mov r4, r3 +; CHECK-T1-NEXT: bmi .LBB3_2 +; CHECK-T1-NEXT: @ %bb.1: +; CHECK-T1-NEXT: movs r4, #0 +; CHECK-T1-NEXT: .LBB3_2: +; CHECK-T1-NEXT: cmp r4, #0 +; CHECK-T1-NEXT: bne .LBB3_4 +; CHECK-T1-NEXT: @ %bb.3: +; CHECK-T1-NEXT: lsls r3, r3, #31 +; CHECK-T1-NEXT: cmp r2, r1 +; CHECK-T1-NEXT: bvs .LBB3_5 +; CHECK-T1-NEXT: b .LBB3_6 +; CHECK-T1-NEXT: .LBB3_4: +; CHECK-T1-NEXT: ldr r3, .LCPI3_0 +; CHECK-T1-NEXT: cmp r2, r1 +; CHECK-T1-NEXT: bvc .LBB3_6 +; CHECK-T1-NEXT: .LBB3_5: +; CHECK-T1-NEXT: mov r0, 
r3 +; CHECK-T1-NEXT: .LBB3_6: +; CHECK-T1-NEXT: asrs r0, r0, #24 +; CHECK-T1-NEXT: pop {r4, pc} +; CHECK-T1-NEXT: .p2align 2 +; CHECK-T1-NEXT: @ %bb.7: +; CHECK-T1-NEXT: .LCPI3_0: +; CHECK-T1-NEXT: .long 2147483647 @ 0x7fffffff +; +; CHECK-T2-LABEL: func8: +; CHECK-T2: @ %bb.0: +; CHECK-T2-NEXT: lsls r0, r0, #24 +; CHECK-T2-NEXT: sub.w r12, r0, r1, lsl #24 +; CHECK-T2-NEXT: movs r3, #0 +; CHECK-T2-NEXT: cmp.w r12, #0 +; CHECK-T2-NEXT: mov.w r2, #-2147483648 +; CHECK-T2-NEXT: it mi +; CHECK-T2-NEXT: movmi r3, #1 +; CHECK-T2-NEXT: cmp r3, #0 +; CHECK-T2-NEXT: it ne +; CHECK-T2-NEXT: mvnne r2, #-2147483648 +; CHECK-T2-NEXT: cmp.w r0, r1, lsl #24 +; CHECK-T2-NEXT: it vc +; CHECK-T2-NEXT: movvc r2, r12 +; CHECK-T2-NEXT: asrs r0, r2, #24 +; CHECK-T2-NEXT: bx lr +; +; CHECK-ARM-LABEL: func8: +; CHECK-ARM: @ %bb.0: +; CHECK-ARM-NEXT: lsl r0, r0, #24 +; CHECK-ARM-NEXT: sub r12, r0, r1, lsl #24 +; CHECK-ARM-NEXT: mov r3, #0 +; CHECK-ARM-NEXT: cmp r12, #0 +; CHECK-ARM-NEXT: movwmi r3, #1 +; CHECK-ARM-NEXT: mov r2, #-2147483648 +; CHECK-ARM-NEXT: cmp r3, #0 +; CHECK-ARM-NEXT: mvnne r2, #-2147483648 +; CHECK-ARM-NEXT: cmp r0, r1, lsl #24 +; CHECK-ARM-NEXT: movvc r2, r12 +; CHECK-ARM-NEXT: asr r0, r2, #24 +; CHECK-ARM-NEXT: bx lr + %tmp = call i8 @llvm.ssub.sat.i8(i8 %x, i8 %y) + ret i8 %tmp +} + +define i4 @func3(i4 %x, i4 %y) nounwind { +; CHECK-T1-LABEL: func3: +; CHECK-T1: @ %bb.0: +; CHECK-T1-NEXT: .save {r4, lr} +; CHECK-T1-NEXT: push {r4, lr} +; CHECK-T1-NEXT: lsls r1, r1, #28 +; CHECK-T1-NEXT: lsls r2, r0, #28 +; CHECK-T1-NEXT: movs r3, #1 +; CHECK-T1-NEXT: subs r0, r2, r1 +; CHECK-T1-NEXT: mov r4, r3 +; CHECK-T1-NEXT: bmi .LBB4_2 +; CHECK-T1-NEXT: @ %bb.1: +; CHECK-T1-NEXT: movs r4, #0 +; CHECK-T1-NEXT: .LBB4_2: +; CHECK-T1-NEXT: cmp r4, #0 +; CHECK-T1-NEXT: bne .LBB4_4 +; CHECK-T1-NEXT: @ %bb.3: +; CHECK-T1-NEXT: lsls r3, r3, #31 +; CHECK-T1-NEXT: cmp r2, r1 +; CHECK-T1-NEXT: bvs .LBB4_5 +; CHECK-T1-NEXT: b .LBB4_6 +; CHECK-T1-NEXT: .LBB4_4: +; CHECK-T1-NEXT: ldr r3, 
.LCPI4_0 +; CHECK-T1-NEXT: cmp r2, r1 +; CHECK-T1-NEXT: bvc .LBB4_6 +; CHECK-T1-NEXT: .LBB4_5: +; CHECK-T1-NEXT: mov r0, r3 +; CHECK-T1-NEXT: .LBB4_6: +; CHECK-T1-NEXT: asrs r0, r0, #28 +; CHECK-T1-NEXT: pop {r4, pc} +; CHECK-T1-NEXT: .p2align 2 +; CHECK-T1-NEXT: @ %bb.7: +; CHECK-T1-NEXT: .LCPI4_0: +; CHECK-T1-NEXT: .long 2147483647 @ 0x7fffffff +; +; CHECK-T2-LABEL: func3: +; CHECK-T2: @ %bb.0: +; CHECK-T2-NEXT: lsls r0, r0, #28 +; CHECK-T2-NEXT: sub.w r12, r0, r1, lsl #28 +; CHECK-T2-NEXT: movs r3, #0 +; CHECK-T2-NEXT: cmp.w r12, #0 +; CHECK-T2-NEXT: mov.w r2, #-2147483648 +; CHECK-T2-NEXT: it mi +; CHECK-T2-NEXT: movmi r3, #1 +; CHECK-T2-NEXT: cmp r3, #0 +; CHECK-T2-NEXT: it ne +; CHECK-T2-NEXT: mvnne r2, #-2147483648 +; CHECK-T2-NEXT: cmp.w r0, r1, lsl #28 +; CHECK-T2-NEXT: it vc +; CHECK-T2-NEXT: movvc r2, r12 +; CHECK-T2-NEXT: asrs r0, r2, #28 +; CHECK-T2-NEXT: bx lr +; +; CHECK-ARM-LABEL: func3: +; CHECK-ARM: @ %bb.0: +; CHECK-ARM-NEXT: lsl r0, r0, #28 +; CHECK-ARM-NEXT: sub r12, r0, r1, lsl #28 +; CHECK-ARM-NEXT: mov r3, #0 +; CHECK-ARM-NEXT: cmp r12, #0 +; CHECK-ARM-NEXT: movwmi r3, #1 +; CHECK-ARM-NEXT: mov r2, #-2147483648 +; CHECK-ARM-NEXT: cmp r3, #0 +; CHECK-ARM-NEXT: mvnne r2, #-2147483648 +; CHECK-ARM-NEXT: cmp r0, r1, lsl #28 +; CHECK-ARM-NEXT: movvc r2, r12 +; CHECK-ARM-NEXT: asr r0, r2, #28 +; CHECK-ARM-NEXT: bx lr + %tmp = call i4 @llvm.ssub.sat.i4(i4 %x, i4 %y) + ret i4 %tmp +} + +define <4 x i32> @vec(<4 x i32> %x, <4 x i32> %y) nounwind { +; CHECK-T1-LABEL: vec: +; CHECK-T1: @ %bb.0: +; CHECK-T1-NEXT: .save {r4, r5, r6, r7, lr} +; CHECK-T1-NEXT: push {r4, r5, r6, r7, lr} +; CHECK-T1-NEXT: .pad #12 +; CHECK-T1-NEXT: sub sp, #12 +; CHECK-T1-NEXT: str r3, [sp] @ 4-byte Spill +; CHECK-T1-NEXT: mov r4, r1 +; CHECK-T1-NEXT: mov r1, r0 +; CHECK-T1-NEXT: ldr r5, [sp, #32] +; CHECK-T1-NEXT: movs r7, #1 +; CHECK-T1-NEXT: movs r0, #0 +; CHECK-T1-NEXT: str r0, [sp, #8] @ 4-byte Spill +; CHECK-T1-NEXT: subs r0, r1, r5 +; CHECK-T1-NEXT: str r0, [sp, #4] @ 
4-byte Spill +; CHECK-T1-NEXT: mov r6, r7 +; CHECK-T1-NEXT: bmi .LBB5_2 +; CHECK-T1-NEXT: @ %bb.1: +; CHECK-T1-NEXT: ldr r6, [sp, #8] @ 4-byte Reload +; CHECK-T1-NEXT: .LBB5_2: +; CHECK-T1-NEXT: lsls r3, r7, #31 +; CHECK-T1-NEXT: ldr r0, .LCPI5_0 +; CHECK-T1-NEXT: cmp r6, #0 +; CHECK-T1-NEXT: mov r6, r0 +; CHECK-T1-NEXT: bne .LBB5_4 +; CHECK-T1-NEXT: @ %bb.3: +; CHECK-T1-NEXT: mov r6, r3 +; CHECK-T1-NEXT: .LBB5_4: +; CHECK-T1-NEXT: cmp r1, r5 +; CHECK-T1-NEXT: bvc .LBB5_6 +; CHECK-T1-NEXT: @ %bb.5: +; CHECK-T1-NEXT: str r6, [sp, #4] @ 4-byte Spill +; CHECK-T1-NEXT: .LBB5_6: +; CHECK-T1-NEXT: ldr r5, [sp, #36] +; CHECK-T1-NEXT: subs r1, r4, r5 +; CHECK-T1-NEXT: mov r6, r7 +; CHECK-T1-NEXT: bmi .LBB5_8 +; CHECK-T1-NEXT: @ %bb.7: +; CHECK-T1-NEXT: ldr r6, [sp, #8] @ 4-byte Reload +; CHECK-T1-NEXT: .LBB5_8: +; CHECK-T1-NEXT: cmp r6, #0 +; CHECK-T1-NEXT: mov r6, r0 +; CHECK-T1-NEXT: bne .LBB5_10 +; CHECK-T1-NEXT: @ %bb.9: +; CHECK-T1-NEXT: mov r6, r3 +; CHECK-T1-NEXT: .LBB5_10: +; CHECK-T1-NEXT: cmp r4, r5 +; CHECK-T1-NEXT: bvc .LBB5_12 +; CHECK-T1-NEXT: @ %bb.11: +; CHECK-T1-NEXT: mov r1, r6 +; CHECK-T1-NEXT: .LBB5_12: +; CHECK-T1-NEXT: ldr r5, [sp, #40] +; CHECK-T1-NEXT: subs r4, r2, r5 +; CHECK-T1-NEXT: mov r6, r7 +; CHECK-T1-NEXT: bmi .LBB5_14 +; CHECK-T1-NEXT: @ %bb.13: +; CHECK-T1-NEXT: ldr r6, [sp, #8] @ 4-byte Reload +; CHECK-T1-NEXT: .LBB5_14: +; CHECK-T1-NEXT: cmp r6, #0 +; CHECK-T1-NEXT: mov r6, r0 +; CHECK-T1-NEXT: bne .LBB5_16 +; CHECK-T1-NEXT: @ %bb.15: +; CHECK-T1-NEXT: mov r6, r3 +; CHECK-T1-NEXT: .LBB5_16: +; CHECK-T1-NEXT: cmp r2, r5 +; CHECK-T1-NEXT: bvc .LBB5_18 +; CHECK-T1-NEXT: @ %bb.17: +; CHECK-T1-NEXT: mov r4, r6 +; CHECK-T1-NEXT: .LBB5_18: +; CHECK-T1-NEXT: ldr r2, [sp, #44] +; CHECK-T1-NEXT: ldr r6, [sp] @ 4-byte Reload +; CHECK-T1-NEXT: subs r5, r6, r2 +; CHECK-T1-NEXT: bpl .LBB5_23 +; CHECK-T1-NEXT: @ %bb.19: +; CHECK-T1-NEXT: cmp r7, #0 +; CHECK-T1-NEXT: beq .LBB5_24 +; CHECK-T1-NEXT: .LBB5_20: +; CHECK-T1-NEXT: cmp r6, r2 +; CHECK-T1-NEXT: 
bvc .LBB5_22 +; CHECK-T1-NEXT: .LBB5_21: +; CHECK-T1-NEXT: mov r5, r0 +; CHECK-T1-NEXT: .LBB5_22: +; CHECK-T1-NEXT: ldr r0, [sp, #4] @ 4-byte Reload +; CHECK-T1-NEXT: mov r2, r4 +; CHECK-T1-NEXT: mov r3, r5 +; CHECK-T1-NEXT: add sp, #12 +; CHECK-T1-NEXT: pop {r4, r5, r6, r7, pc} +; CHECK-T1-NEXT: .LBB5_23: +; CHECK-T1-NEXT: ldr r7, [sp, #8] @ 4-byte Reload +; CHECK-T1-NEXT: cmp r7, #0 +; CHECK-T1-NEXT: bne .LBB5_20 +; CHECK-T1-NEXT: .LBB5_24: +; CHECK-T1-NEXT: mov r0, r3 +; CHECK-T1-NEXT: cmp r6, r2 +; CHECK-T1-NEXT: bvs .LBB5_21 +; CHECK-T1-NEXT: b .LBB5_22 +; CHECK-T1-NEXT: .p2align 2 +; CHECK-T1-NEXT: @ %bb.25: +; CHECK-T1-NEXT: .LCPI5_0: +; CHECK-T1-NEXT: .long 2147483647 @ 0x7fffffff +; +; CHECK-T2-LABEL: vec: +; CHECK-T2: @ %bb.0: +; CHECK-T2-NEXT: .save {r4, r5, r6, r7, lr} +; CHECK-T2-NEXT: push {r4, r5, r6, r7, lr} +; CHECK-T2-NEXT: .pad #4 +; CHECK-T2-NEXT: sub sp, #4 +; CHECK-T2-NEXT: ldr r4, [sp, #24] +; CHECK-T2-NEXT: mov lr, r0 +; CHECK-T2-NEXT: ldr r7, [sp, #28] +; CHECK-T2-NEXT: movs r5, #0 +; CHECK-T2-NEXT: subs r6, r0, r4 +; CHECK-T2-NEXT: mov.w r0, #0 +; CHECK-T2-NEXT: it mi +; CHECK-T2-NEXT: movmi r0, #1 +; CHECK-T2-NEXT: cmp r0, #0 +; CHECK-T2-NEXT: mov.w r0, #-2147483648 +; CHECK-T2-NEXT: mov.w r12, #-2147483648 +; CHECK-T2-NEXT: it ne +; CHECK-T2-NEXT: mvnne r0, #-2147483648 +; CHECK-T2-NEXT: cmp lr, r4 +; CHECK-T2-NEXT: it vc +; CHECK-T2-NEXT: movvc r0, r6 +; CHECK-T2-NEXT: subs r6, r1, r7 +; CHECK-T2-NEXT: mov.w r4, #0 +; CHECK-T2-NEXT: mov.w lr, #-2147483648 +; CHECK-T2-NEXT: it mi +; CHECK-T2-NEXT: movmi r4, #1 +; CHECK-T2-NEXT: cmp r4, #0 +; CHECK-T2-NEXT: it ne +; CHECK-T2-NEXT: mvnne lr, #-2147483648 +; CHECK-T2-NEXT: cmp r1, r7 +; CHECK-T2-NEXT: ldr r1, [sp, #32] +; CHECK-T2-NEXT: mov.w r4, #0 +; CHECK-T2-NEXT: it vc +; CHECK-T2-NEXT: movvc lr, r6 +; CHECK-T2-NEXT: subs r6, r2, r1 +; CHECK-T2-NEXT: it mi +; CHECK-T2-NEXT: movmi r4, #1 +; CHECK-T2-NEXT: cmp r4, #0 +; CHECK-T2-NEXT: mov.w r4, #-2147483648 +; CHECK-T2-NEXT: it ne +; 
CHECK-T2-NEXT: mvnne r4, #-2147483648 +; CHECK-T2-NEXT: cmp r2, r1 +; CHECK-T2-NEXT: ldr r1, [sp, #36] +; CHECK-T2-NEXT: it vc +; CHECK-T2-NEXT: movvc r4, r6 +; CHECK-T2-NEXT: subs r2, r3, r1 +; CHECK-T2-NEXT: it mi +; CHECK-T2-NEXT: movmi r5, #1 +; CHECK-T2-NEXT: cmp r5, #0 +; CHECK-T2-NEXT: it ne +; CHECK-T2-NEXT: mvnne r12, #-2147483648 +; CHECK-T2-NEXT: cmp r3, r1 +; CHECK-T2-NEXT: it vc +; CHECK-T2-NEXT: movvc r12, r2 +; CHECK-T2-NEXT: mov r1, lr +; CHECK-T2-NEXT: mov r2, r4 +; CHECK-T2-NEXT: mov r3, r12 +; CHECK-T2-NEXT: add sp, #4 +; CHECK-T2-NEXT: pop {r4, r5, r6, r7, pc} +; +; CHECK-ARM-LABEL: vec: +; CHECK-ARM: @ %bb.0: +; CHECK-ARM-NEXT: vmov d17, r2, r3 +; CHECK-ARM-NEXT: mov r12, sp +; CHECK-ARM-NEXT: vld1.64 {d18, d19}, [r12] +; CHECK-ARM-NEXT: vmov d16, r0, r1 +; CHECK-ARM-NEXT: vmvn.i32 q11, #0x80000000 +; CHECK-ARM-NEXT: vsub.i32 q10, q8, q9 +; CHECK-ARM-NEXT: vcgt.s32 q9, q9, #0 +; CHECK-ARM-NEXT: vclt.s32 q12, q10, #0 +; CHECK-ARM-NEXT: vmvn q13, q12 +; CHECK-ARM-NEXT: vcgt.s32 q8, q8, q10 +; CHECK-ARM-NEXT: vbsl q11, q12, q13 +; CHECK-ARM-NEXT: veor q8, q9, q8 +; CHECK-ARM-NEXT: vbsl q8, q11, q10 +; CHECK-ARM-NEXT: vmov r0, r1, d16 +; CHECK-ARM-NEXT: vmov r2, r3, d17 +; CHECK-ARM-NEXT: bx lr + %tmp = call <4 x i32> @llvm.ssub.sat.v4i32(<4 x i32> %x, <4 x i32> %y) + ret <4 x i32> %tmp +} Added: llvm/trunk/test/CodeGen/ARM/uadd_sat.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/ARM/uadd_sat.ll?rev=374169&view=auto ============================================================================== --- llvm/trunk/test/CodeGen/ARM/uadd_sat.ll (added) +++ llvm/trunk/test/CodeGen/ARM/uadd_sat.ll Wed Oct 9 07:17:38 2019 @@ -0,0 +1,199 @@ +; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py +; RUN: llc < %s -mtriple=thumbv6m-none-eabi | FileCheck %s --check-prefix=CHECK-T1 +; RUN: llc < %s -mtriple=thumbv7m-none-eabi | FileCheck %s --check-prefix=CHECK-T2 --check-prefix=CHECK-T2NODSP +; RUN: llc < %s 
-mtriple=thumbv7em-none-eabi | FileCheck %s --check-prefix=CHECK-T2 --check-prefix=CHECK-T2DSP +; RUN: llc < %s -mtriple=armv8a-none-eabi | FileCheck %s --check-prefix=CHECK-ARM + +declare i4 @llvm.uadd.sat.i4(i4, i4) +declare i8 @llvm.uadd.sat.i8(i8, i8) +declare i16 @llvm.uadd.sat.i16(i16, i16) +declare i32 @llvm.uadd.sat.i32(i32, i32) +declare i64 @llvm.uadd.sat.i64(i64, i64) + +define i32 @func(i32 %x, i32 %y) nounwind { +; CHECK-T1-LABEL: func: +; CHECK-T1: @ %bb.0: +; CHECK-T1-NEXT: adds r0, r0, r1 +; CHECK-T1-NEXT: blo .LBB0_2 +; CHECK-T1-NEXT: @ %bb.1: +; CHECK-T1-NEXT: movs r0, #0 +; CHECK-T1-NEXT: mvns r0, r0 +; CHECK-T1-NEXT: .LBB0_2: +; CHECK-T1-NEXT: bx lr +; +; CHECK-T2-LABEL: func: +; CHECK-T2: @ %bb.0: +; CHECK-T2-NEXT: adds r0, r0, r1 +; CHECK-T2-NEXT: it hs +; CHECK-T2-NEXT: movhs.w r0, #-1 +; CHECK-T2-NEXT: bx lr +; +; CHECK-ARM-LABEL: func: +; CHECK-ARM: @ %bb.0: +; CHECK-ARM-NEXT: adds r0, r0, r1 +; CHECK-ARM-NEXT: mvnhs r0, #0 +; CHECK-ARM-NEXT: bx lr + %tmp = call i32 @llvm.uadd.sat.i32(i32 %x, i32 %y) + ret i32 %tmp +} + +define i64 @func2(i64 %x, i64 %y) nounwind { +; CHECK-T1-LABEL: func2: +; CHECK-T1: @ %bb.0: +; CHECK-T1-NEXT: .save {r4, r5, r7, lr} +; CHECK-T1-NEXT: push {r4, r5, r7, lr} +; CHECK-T1-NEXT: movs r5, #0 +; CHECK-T1-NEXT: adds r4, r0, r2 +; CHECK-T1-NEXT: adcs r1, r3 +; CHECK-T1-NEXT: mov r3, r5 +; CHECK-T1-NEXT: adcs r3, r5 +; CHECK-T1-NEXT: mvns r2, r5 +; CHECK-T1-NEXT: cmp r3, #0 +; CHECK-T1-NEXT: mov r0, r2 +; CHECK-T1-NEXT: beq .LBB1_3 +; CHECK-T1-NEXT: @ %bb.1: +; CHECK-T1-NEXT: cmp r3, #0 +; CHECK-T1-NEXT: beq .LBB1_4 +; CHECK-T1-NEXT: .LBB1_2: +; CHECK-T1-NEXT: mov r1, r2 +; CHECK-T1-NEXT: pop {r4, r5, r7, pc} +; CHECK-T1-NEXT: .LBB1_3: +; CHECK-T1-NEXT: mov r0, r4 +; CHECK-T1-NEXT: cmp r3, #0 +; CHECK-T1-NEXT: bne .LBB1_2 +; CHECK-T1-NEXT: .LBB1_4: +; CHECK-T1-NEXT: mov r2, r1 +; CHECK-T1-NEXT: mov r1, r2 +; CHECK-T1-NEXT: pop {r4, r5, r7, pc} +; +; CHECK-T2-LABEL: func2: +; CHECK-T2: @ %bb.0: +; CHECK-T2-NEXT: 
adds r0, r0, r2 +; CHECK-T2-NEXT: mov.w r12, #0 +; CHECK-T2-NEXT: adcs r1, r3 +; CHECK-T2-NEXT: adcs r2, r12, #0 +; CHECK-T2-NEXT: itt ne +; CHECK-T2-NEXT: movne.w r0, #-1 +; CHECK-T2-NEXT: movne.w r1, #-1 +; CHECK-T2-NEXT: bx lr +; +; CHECK-ARM-LABEL: func2: +; CHECK-ARM: @ %bb.0: +; CHECK-ARM-NEXT: adds r0, r0, r2 +; CHECK-ARM-NEXT: mov r12, #0 +; CHECK-ARM-NEXT: adcs r1, r1, r3 +; CHECK-ARM-NEXT: adcs r2, r12, #0 +; CHECK-ARM-NEXT: mvnne r0, #0 +; CHECK-ARM-NEXT: mvnne r1, #0 +; CHECK-ARM-NEXT: bx lr + %tmp = call i64 @llvm.uadd.sat.i64(i64 %x, i64 %y) + ret i64 %tmp +} + +define i16 @func16(i16 %x, i16 %y) nounwind { +; CHECK-T1-LABEL: func16: +; CHECK-T1: @ %bb.0: +; CHECK-T1-NEXT: lsls r1, r1, #16 +; CHECK-T1-NEXT: lsls r0, r0, #16 +; CHECK-T1-NEXT: adds r0, r0, r1 +; CHECK-T1-NEXT: blo .LBB2_2 +; CHECK-T1-NEXT: @ %bb.1: +; CHECK-T1-NEXT: movs r0, #0 +; CHECK-T1-NEXT: mvns r0, r0 +; CHECK-T1-NEXT: .LBB2_2: +; CHECK-T1-NEXT: lsrs r0, r0, #16 +; CHECK-T1-NEXT: bx lr +; +; CHECK-T2-LABEL: func16: +; CHECK-T2: @ %bb.0: +; CHECK-T2-NEXT: lsls r2, r0, #16 +; CHECK-T2-NEXT: add.w r1, r2, r1, lsl #16 +; CHECK-T2-NEXT: cmp.w r1, r0, lsl #16 +; CHECK-T2-NEXT: it lo +; CHECK-T2-NEXT: movlo.w r1, #-1 +; CHECK-T2-NEXT: lsrs r0, r1, #16 +; CHECK-T2-NEXT: bx lr +; +; CHECK-ARM-LABEL: func16: +; CHECK-ARM: @ %bb.0: +; CHECK-ARM-NEXT: lsl r2, r0, #16 +; CHECK-ARM-NEXT: add r1, r2, r1, lsl #16 +; CHECK-ARM-NEXT: cmp r1, r0, lsl #16 +; CHECK-ARM-NEXT: mvnlo r1, #0 +; CHECK-ARM-NEXT: lsr r0, r1, #16 +; CHECK-ARM-NEXT: bx lr + %tmp = call i16 @llvm.uadd.sat.i16(i16 %x, i16 %y) + ret i16 %tmp +} + +define i8 @func8(i8 %x, i8 %y) nounwind { +; CHECK-T1-LABEL: func8: +; CHECK-T1: @ %bb.0: +; CHECK-T1-NEXT: lsls r1, r1, #24 +; CHECK-T1-NEXT: lsls r0, r0, #24 +; CHECK-T1-NEXT: adds r0, r0, r1 +; CHECK-T1-NEXT: blo .LBB3_2 +; CHECK-T1-NEXT: @ %bb.1: +; CHECK-T1-NEXT: movs r0, #0 +; CHECK-T1-NEXT: mvns r0, r0 +; CHECK-T1-NEXT: .LBB3_2: +; CHECK-T1-NEXT: lsrs r0, r0, #24 +; 
CHECK-T1-NEXT: bx lr +; +; CHECK-T2-LABEL: func8: +; CHECK-T2: @ %bb.0: +; CHECK-T2-NEXT: lsls r2, r0, #24 +; CHECK-T2-NEXT: add.w r1, r2, r1, lsl #24 +; CHECK-T2-NEXT: cmp.w r1, r0, lsl #24 +; CHECK-T2-NEXT: it lo +; CHECK-T2-NEXT: movlo.w r1, #-1 +; CHECK-T2-NEXT: lsrs r0, r1, #24 +; CHECK-T2-NEXT: bx lr +; +; CHECK-ARM-LABEL: func8: +; CHECK-ARM: @ %bb.0: +; CHECK-ARM-NEXT: lsl r2, r0, #24 +; CHECK-ARM-NEXT: add r1, r2, r1, lsl #24 +; CHECK-ARM-NEXT: cmp r1, r0, lsl #24 +; CHECK-ARM-NEXT: mvnlo r1, #0 +; CHECK-ARM-NEXT: lsr r0, r1, #24 +; CHECK-ARM-NEXT: bx lr + %tmp = call i8 @llvm.uadd.sat.i8(i8 %x, i8 %y) + ret i8 %tmp +} + +define i4 @func3(i4 %x, i4 %y) nounwind { +; CHECK-T1-LABEL: func3: +; CHECK-T1: @ %bb.0: +; CHECK-T1-NEXT: lsls r1, r1, #28 +; CHECK-T1-NEXT: lsls r0, r0, #28 +; CHECK-T1-NEXT: adds r0, r0, r1 +; CHECK-T1-NEXT: blo .LBB4_2 +; CHECK-T1-NEXT: @ %bb.1: +; CHECK-T1-NEXT: movs r0, #0 +; CHECK-T1-NEXT: mvns r0, r0 +; CHECK-T1-NEXT: .LBB4_2: +; CHECK-T1-NEXT: lsrs r0, r0, #28 +; CHECK-T1-NEXT: bx lr +; +; CHECK-T2-LABEL: func3: +; CHECK-T2: @ %bb.0: +; CHECK-T2-NEXT: lsls r2, r0, #28 +; CHECK-T2-NEXT: add.w r1, r2, r1, lsl #28 +; CHECK-T2-NEXT: cmp.w r1, r0, lsl #28 +; CHECK-T2-NEXT: it lo +; CHECK-T2-NEXT: movlo.w r1, #-1 +; CHECK-T2-NEXT: lsrs r0, r1, #28 +; CHECK-T2-NEXT: bx lr +; +; CHECK-ARM-LABEL: func3: +; CHECK-ARM: @ %bb.0: +; CHECK-ARM-NEXT: lsl r2, r0, #28 +; CHECK-ARM-NEXT: add r1, r2, r1, lsl #28 +; CHECK-ARM-NEXT: cmp r1, r0, lsl #28 +; CHECK-ARM-NEXT: mvnlo r1, #0 +; CHECK-ARM-NEXT: lsr r0, r1, #28 +; CHECK-ARM-NEXT: bx lr + %tmp = call i4 @llvm.uadd.sat.i4(i4 %x, i4 %y) + ret i4 %tmp +} Added: llvm/trunk/test/CodeGen/ARM/usub_sat.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/ARM/usub_sat.ll?rev=374169&view=auto ============================================================================== --- llvm/trunk/test/CodeGen/ARM/usub_sat.ll (added) +++ llvm/trunk/test/CodeGen/ARM/usub_sat.ll Wed Oct 9 07:17:38 2019 
@@ -0,0 +1,196 @@ +; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py +; RUN: llc < %s -mtriple=thumbv6m-none-eabi | FileCheck %s --check-prefix=CHECK-T1 +; RUN: llc < %s -mtriple=thumbv7m-none-eabi | FileCheck %s --check-prefix=CHECK-T2 --check-prefix=CHECK-T2NODSP +; RUN: llc < %s -mtriple=thumbv7em-none-eabi | FileCheck %s --check-prefix=CHECK-T2 --check-prefix=CHECK-T2DSP +; RUN: llc < %s -mtriple=armv8a-none-eabi | FileCheck %s --check-prefix=CHECK-ARM + +declare i4 @llvm.usub.sat.i4(i4, i4) +declare i8 @llvm.usub.sat.i8(i8, i8) +declare i16 @llvm.usub.sat.i16(i16, i16) +declare i32 @llvm.usub.sat.i32(i32, i32) +declare i64 @llvm.usub.sat.i64(i64, i64) + +define i32 @func(i32 %x, i32 %y) nounwind { +; CHECK-T1-LABEL: func: +; CHECK-T1: @ %bb.0: +; CHECK-T1-NEXT: subs r0, r0, r1 +; CHECK-T1-NEXT: bhs .LBB0_2 +; CHECK-T1-NEXT: @ %bb.1: +; CHECK-T1-NEXT: movs r0, #0 +; CHECK-T1-NEXT: .LBB0_2: +; CHECK-T1-NEXT: bx lr +; +; CHECK-T2-LABEL: func: +; CHECK-T2: @ %bb.0: +; CHECK-T2-NEXT: subs r0, r0, r1 +; CHECK-T2-NEXT: it lo +; CHECK-T2-NEXT: movlo r0, #0 +; CHECK-T2-NEXT: bx lr +; +; CHECK-ARM-LABEL: func: +; CHECK-ARM: @ %bb.0: +; CHECK-ARM-NEXT: subs r0, r0, r1 +; CHECK-ARM-NEXT: movlo r0, #0 +; CHECK-ARM-NEXT: bx lr + %tmp = call i32 @llvm.usub.sat.i32(i32 %x, i32 %y) + ret i32 %tmp +} + +define i64 @func2(i64 %x, i64 %y) nounwind { +; CHECK-T1-LABEL: func2: +; CHECK-T1: @ %bb.0: +; CHECK-T1-NEXT: .save {r4, lr} +; CHECK-T1-NEXT: push {r4, lr} +; CHECK-T1-NEXT: mov r4, r1 +; CHECK-T1-NEXT: movs r1, #0 +; CHECK-T1-NEXT: subs r2, r0, r2 +; CHECK-T1-NEXT: sbcs r4, r3 +; CHECK-T1-NEXT: mov r0, r1 +; CHECK-T1-NEXT: adcs r0, r1 +; CHECK-T1-NEXT: movs r3, #1 +; CHECK-T1-NEXT: subs r3, r3, r0 +; CHECK-T1-NEXT: mov r0, r1 +; CHECK-T1-NEXT: beq .LBB1_3 +; CHECK-T1-NEXT: @ %bb.1: +; CHECK-T1-NEXT: cmp r3, #0 +; CHECK-T1-NEXT: beq .LBB1_4 +; CHECK-T1-NEXT: .LBB1_2: +; CHECK-T1-NEXT: pop {r4, pc} +; CHECK-T1-NEXT: .LBB1_3: +; CHECK-T1-NEXT: mov r0, 
r2 +; CHECK-T1-NEXT: cmp r3, #0 +; CHECK-T1-NEXT: bne .LBB1_2 +; CHECK-T1-NEXT: .LBB1_4: +; CHECK-T1-NEXT: mov r1, r4 +; CHECK-T1-NEXT: pop {r4, pc} +; +; CHECK-T2-LABEL: func2: +; CHECK-T2: @ %bb.0: +; CHECK-T2-NEXT: subs r0, r0, r2 +; CHECK-T2-NEXT: mov.w r12, #0 +; CHECK-T2-NEXT: sbcs r1, r3 +; CHECK-T2-NEXT: adc r2, r12, #0 +; CHECK-T2-NEXT: rsbs.w r2, r2, #1 +; CHECK-T2-NEXT: itt ne +; CHECK-T2-NEXT: movne r0, #0 +; CHECK-T2-NEXT: movne r1, #0 +; CHECK-T2-NEXT: bx lr +; +; CHECK-ARM-LABEL: func2: +; CHECK-ARM: @ %bb.0: +; CHECK-ARM-NEXT: subs r0, r0, r2 +; CHECK-ARM-NEXT: mov r12, #0 +; CHECK-ARM-NEXT: sbcs r1, r1, r3 +; CHECK-ARM-NEXT: adc r2, r12, #0 +; CHECK-ARM-NEXT: rsbs r2, r2, #1 +; CHECK-ARM-NEXT: movwne r0, #0 +; CHECK-ARM-NEXT: movwne r1, #0 +; CHECK-ARM-NEXT: bx lr + %tmp = call i64 @llvm.usub.sat.i64(i64 %x, i64 %y) + ret i64 %tmp +} + +define i16 @func16(i16 %x, i16 %y) nounwind { +; CHECK-T1-LABEL: func16: +; CHECK-T1: @ %bb.0: +; CHECK-T1-NEXT: lsls r1, r1, #16 +; CHECK-T1-NEXT: lsls r0, r0, #16 +; CHECK-T1-NEXT: subs r0, r0, r1 +; CHECK-T1-NEXT: bhs .LBB2_2 +; CHECK-T1-NEXT: @ %bb.1: +; CHECK-T1-NEXT: movs r0, #0 +; CHECK-T1-NEXT: .LBB2_2: +; CHECK-T1-NEXT: lsrs r0, r0, #16 +; CHECK-T1-NEXT: bx lr +; +; CHECK-T2-LABEL: func16: +; CHECK-T2: @ %bb.0: +; CHECK-T2-NEXT: lsls r0, r0, #16 +; CHECK-T2-NEXT: sub.w r2, r0, r1, lsl #16 +; CHECK-T2-NEXT: cmp.w r0, r1, lsl #16 +; CHECK-T2-NEXT: it lo +; CHECK-T2-NEXT: movlo r2, #0 +; CHECK-T2-NEXT: lsrs r0, r2, #16 +; CHECK-T2-NEXT: bx lr +; +; CHECK-ARM-LABEL: func16: +; CHECK-ARM: @ %bb.0: +; CHECK-ARM-NEXT: lsl r0, r0, #16 +; CHECK-ARM-NEXT: sub r2, r0, r1, lsl #16 +; CHECK-ARM-NEXT: cmp r0, r1, lsl #16 +; CHECK-ARM-NEXT: movlo r2, #0 +; CHECK-ARM-NEXT: lsr r0, r2, #16 +; CHECK-ARM-NEXT: bx lr + %tmp = call i16 @llvm.usub.sat.i16(i16 %x, i16 %y) + ret i16 %tmp +} + +define i8 @func8(i8 %x, i8 %y) nounwind { +; CHECK-T1-LABEL: func8: +; CHECK-T1: @ %bb.0: +; CHECK-T1-NEXT: lsls r1, r1, #24 +; 
CHECK-T1-NEXT: lsls r0, r0, #24 +; CHECK-T1-NEXT: subs r0, r0, r1 +; CHECK-T1-NEXT: bhs .LBB3_2 +; CHECK-T1-NEXT: @ %bb.1: +; CHECK-T1-NEXT: movs r0, #0 +; CHECK-T1-NEXT: .LBB3_2: +; CHECK-T1-NEXT: lsrs r0, r0, #24 +; CHECK-T1-NEXT: bx lr +; +; CHECK-T2-LABEL: func8: +; CHECK-T2: @ %bb.0: +; CHECK-T2-NEXT: lsls r0, r0, #24 +; CHECK-T2-NEXT: sub.w r2, r0, r1, lsl #24 +; CHECK-T2-NEXT: cmp.w r0, r1, lsl #24 +; CHECK-T2-NEXT: it lo +; CHECK-T2-NEXT: movlo r2, #0 +; CHECK-T2-NEXT: lsrs r0, r2, #24 +; CHECK-T2-NEXT: bx lr +; +; CHECK-ARM-LABEL: func8: +; CHECK-ARM: @ %bb.0: +; CHECK-ARM-NEXT: lsl r0, r0, #24 +; CHECK-ARM-NEXT: sub r2, r0, r1, lsl #24 +; CHECK-ARM-NEXT: cmp r0, r1, lsl #24 +; CHECK-ARM-NEXT: movlo r2, #0 +; CHECK-ARM-NEXT: lsr r0, r2, #24 +; CHECK-ARM-NEXT: bx lr + %tmp = call i8 @llvm.usub.sat.i8(i8 %x, i8 %y) + ret i8 %tmp +} + +define i4 @func3(i4 %x, i4 %y) nounwind { +; CHECK-T1-LABEL: func3: +; CHECK-T1: @ %bb.0: +; CHECK-T1-NEXT: lsls r1, r1, #28 +; CHECK-T1-NEXT: lsls r0, r0, #28 +; CHECK-T1-NEXT: subs r0, r0, r1 +; CHECK-T1-NEXT: bhs .LBB4_2 +; CHECK-T1-NEXT: @ %bb.1: +; CHECK-T1-NEXT: movs r0, #0 +; CHECK-T1-NEXT: .LBB4_2: +; CHECK-T1-NEXT: lsrs r0, r0, #28 +; CHECK-T1-NEXT: bx lr +; +; CHECK-T2-LABEL: func3: +; CHECK-T2: @ %bb.0: +; CHECK-T2-NEXT: lsls r0, r0, #28 +; CHECK-T2-NEXT: sub.w r2, r0, r1, lsl #28 +; CHECK-T2-NEXT: cmp.w r0, r1, lsl #28 +; CHECK-T2-NEXT: it lo +; CHECK-T2-NEXT: movlo r2, #0 +; CHECK-T2-NEXT: lsrs r0, r2, #28 +; CHECK-T2-NEXT: bx lr +; +; CHECK-ARM-LABEL: func3: +; CHECK-ARM: @ %bb.0: +; CHECK-ARM-NEXT: lsl r0, r0, #28 +; CHECK-ARM-NEXT: sub r2, r0, r1, lsl #28 +; CHECK-ARM-NEXT: cmp r0, r1, lsl #28 +; CHECK-ARM-NEXT: movlo r2, #0 +; CHECK-ARM-NEXT: lsr r0, r2, #28 +; CHECK-ARM-NEXT: bx lr + %tmp = call i4 @llvm.usub.sat.i4(i4 %x, i4 %y) + ret i4 %tmp +} Modified: llvm/trunk/test/CodeGen/X86/sadd_sat.ll URL: 
http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/sadd_sat.ll?rev=374169&r1=374168&r2=374169&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/sadd_sat.ll (original) +++ llvm/trunk/test/CodeGen/X86/sadd_sat.ll Wed Oct 9 07:17:38 2019 @@ -2,10 +2,12 @@ ; RUN: llc < %s -mtriple=i686 -mattr=cmov | FileCheck %s --check-prefixes=CHECK,X86 ; RUN: llc < %s -mtriple=x86_64-linux | FileCheck %s --check-prefixes=CHECK,X64 -declare i4 @llvm.sadd.sat.i4 (i4, i4) -declare i32 @llvm.sadd.sat.i32 (i32, i32) -declare i64 @llvm.sadd.sat.i64 (i64, i64) -declare <4 x i32> @llvm.sadd.sat.v4i32(<4 x i32>, <4 x i32>) +declare i4 @llvm.sadd.sat.i4(i4, i4) +declare i8 @llvm.sadd.sat.i8(i8, i8) +declare i16 @llvm.sadd.sat.i16(i16, i16) +declare i32 @llvm.sadd.sat.i32(i32, i32) +declare i64 @llvm.sadd.sat.i64(i64, i64) +declare <4 x i32> @llvm.sadd.sat.v4i32(<4 x i32>, <4 x i32>) define i32 @func(i32 %x, i32 %y) nounwind { ; X86-LABEL: func: @@ -89,6 +91,70 @@ define i64 @func2(i64 %x, i64 %y) nounwi ret i64 %tmp; } +define i16 @func16(i16 %x, i16 %y) nounwind { +; X86-LABEL: func16: +; X86: # %bb.0: +; X86-NEXT: pushl %esi +; X86-NEXT: movzwl {{[0-9]+}}(%esp), %eax +; X86-NEXT: movzwl {{[0-9]+}}(%esp), %edx +; X86-NEXT: xorl %ecx, %ecx +; X86-NEXT: movl %eax, %esi +; X86-NEXT: addw %dx, %si +; X86-NEXT: setns %cl +; X86-NEXT: addl $32767, %ecx # imm = 0x7FFF +; X86-NEXT: addw %dx, %ax +; X86-NEXT: cmovol %ecx, %eax +; X86-NEXT: # kill: def $ax killed $ax killed $eax +; X86-NEXT: popl %esi +; X86-NEXT: retl +; +; X64-LABEL: func16: +; X64: # %bb.0: +; X64-NEXT: xorl %eax, %eax +; X64-NEXT: movl %edi, %ecx +; X64-NEXT: addw %si, %cx +; X64-NEXT: setns %al +; X64-NEXT: addl $32767, %eax # imm = 0x7FFF +; X64-NEXT: addw %si, %di +; X64-NEXT: cmovnol %edi, %eax +; X64-NEXT: # kill: def $ax killed $ax killed $eax +; X64-NEXT: retq + %tmp = call i16 @llvm.sadd.sat.i16(i16 %x, i16 %y) + ret i16 %tmp +} + +define 
i8 @func8(i8 %x, i8 %y) nounwind { +; X86-LABEL: func8: +; X86: # %bb.0: +; X86-NEXT: movb {{[0-9]+}}(%esp), %al +; X86-NEXT: movb {{[0-9]+}}(%esp), %dl +; X86-NEXT: xorl %ecx, %ecx +; X86-NEXT: movb %al, %ah +; X86-NEXT: addb %dl, %ah +; X86-NEXT: setns %cl +; X86-NEXT: addl $127, %ecx +; X86-NEXT: addb %dl, %al +; X86-NEXT: movzbl %al, %eax +; X86-NEXT: cmovol %ecx, %eax +; X86-NEXT: # kill: def $al killed $al killed $eax +; X86-NEXT: retl +; +; X64-LABEL: func8: +; X64: # %bb.0: +; X64-NEXT: xorl %ecx, %ecx +; X64-NEXT: movl %edi, %eax +; X64-NEXT: addb %sil, %al +; X64-NEXT: setns %cl +; X64-NEXT: addl $127, %ecx +; X64-NEXT: addb %sil, %dil +; X64-NEXT: movzbl %dil, %eax +; X64-NEXT: cmovol %ecx, %eax +; X64-NEXT: # kill: def $al killed $al killed $eax +; X64-NEXT: retq + %tmp = call i8 @llvm.sadd.sat.i8(i8 %x, i8 %y) + ret i8 %tmp +} + define i4 @func3(i4 %x, i4 %y) nounwind { ; X86-LABEL: func3: ; X86: # %bb.0: Modified: llvm/trunk/test/CodeGen/X86/ssub_sat.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/ssub_sat.ll?rev=374169&r1=374168&r2=374169&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/ssub_sat.ll (original) +++ llvm/trunk/test/CodeGen/X86/ssub_sat.ll Wed Oct 9 07:17:38 2019 @@ -2,10 +2,12 @@ ; RUN: llc < %s -mtriple=i686 -mattr=cmov | FileCheck %s --check-prefixes=CHECK,X86 ; RUN: llc < %s -mtriple=x86_64-linux | FileCheck %s --check-prefixes=CHECK,X64 -declare i4 @llvm.ssub.sat.i4 (i4, i4) -declare i32 @llvm.ssub.sat.i32 (i32, i32) -declare i64 @llvm.ssub.sat.i64 (i64, i64) -declare <4 x i32> @llvm.ssub.sat.v4i32(<4 x i32>, <4 x i32>) +declare i4 @llvm.ssub.sat.i4(i4, i4) +declare i8 @llvm.ssub.sat.i8(i8, i8) +declare i16 @llvm.ssub.sat.i16(i16, i16) +declare i32 @llvm.ssub.sat.i32(i32, i32) +declare i64 @llvm.ssub.sat.i64(i64, i64) +declare <4 x i32> @llvm.ssub.sat.v4i32(<4 x i32>, <4 x i32>) define i32 @func(i32 %x, i32 %y) nounwind { ; X86-LABEL: 
func: @@ -89,6 +91,70 @@ define i64 @func2(i64 %x, i64 %y) nounwi ret i64 %tmp } +define i16 @func16(i16 %x, i16 %y) nounwind { +; X86-LABEL: func16: +; X86: # %bb.0: +; X86-NEXT: pushl %esi +; X86-NEXT: movzwl {{[0-9]+}}(%esp), %eax +; X86-NEXT: movzwl {{[0-9]+}}(%esp), %edx +; X86-NEXT: xorl %ecx, %ecx +; X86-NEXT: movl %eax, %esi +; X86-NEXT: subw %dx, %si +; X86-NEXT: setns %cl +; X86-NEXT: addl $32767, %ecx # imm = 0x7FFF +; X86-NEXT: subw %dx, %ax +; X86-NEXT: cmovol %ecx, %eax +; X86-NEXT: # kill: def $ax killed $ax killed $eax +; X86-NEXT: popl %esi +; X86-NEXT: retl +; +; X64-LABEL: func16: +; X64: # %bb.0: +; X64-NEXT: xorl %eax, %eax +; X64-NEXT: movl %edi, %ecx +; X64-NEXT: subw %si, %cx +; X64-NEXT: setns %al +; X64-NEXT: addl $32767, %eax # imm = 0x7FFF +; X64-NEXT: subw %si, %di +; X64-NEXT: cmovnol %edi, %eax +; X64-NEXT: # kill: def $ax killed $ax killed $eax +; X64-NEXT: retq + %tmp = call i16 @llvm.ssub.sat.i16(i16 %x, i16 %y) + ret i16 %tmp +} + +define i8 @func8(i8 %x, i8 %y) nounwind { +; X86-LABEL: func8: +; X86: # %bb.0: +; X86-NEXT: movb {{[0-9]+}}(%esp), %al +; X86-NEXT: movb {{[0-9]+}}(%esp), %dl +; X86-NEXT: xorl %ecx, %ecx +; X86-NEXT: movb %al, %ah +; X86-NEXT: subb %dl, %ah +; X86-NEXT: setns %cl +; X86-NEXT: addl $127, %ecx +; X86-NEXT: subb %dl, %al +; X86-NEXT: movzbl %al, %eax +; X86-NEXT: cmovol %ecx, %eax +; X86-NEXT: # kill: def $al killed $al killed $eax +; X86-NEXT: retl +; +; X64-LABEL: func8: +; X64: # %bb.0: +; X64-NEXT: xorl %ecx, %ecx +; X64-NEXT: movl %edi, %eax +; X64-NEXT: subb %sil, %al +; X64-NEXT: setns %cl +; X64-NEXT: addl $127, %ecx +; X64-NEXT: subb %sil, %dil +; X64-NEXT: movzbl %dil, %eax +; X64-NEXT: cmovol %ecx, %eax +; X64-NEXT: # kill: def $al killed $al killed $eax +; X64-NEXT: retq + %tmp = call i8 @llvm.ssub.sat.i8(i8 %x, i8 %y) + ret i8 %tmp +} + define i4 @func3(i4 %x, i4 %y) nounwind { ; X86-LABEL: func3: ; X86: # %bb.0: Modified: llvm/trunk/test/CodeGen/X86/uadd_sat.ll URL: 
http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/uadd_sat.ll?rev=374169&r1=374168&r2=374169&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/uadd_sat.ll (original) +++ llvm/trunk/test/CodeGen/X86/uadd_sat.ll Wed Oct 9 07:17:38 2019 @@ -2,10 +2,12 @@ ; RUN: llc < %s -mtriple=i686 -mattr=cmov | FileCheck %s --check-prefixes=CHECK,X86 ; RUN: llc < %s -mtriple=x86_64-linux | FileCheck %s --check-prefixes=CHECK,X64 -declare i4 @llvm.uadd.sat.i4 (i4, i4) -declare i32 @llvm.uadd.sat.i32 (i32, i32) -declare i64 @llvm.uadd.sat.i64 (i64, i64) -declare <4 x i32> @llvm.uadd.sat.v4i32(<4 x i32>, <4 x i32>) +declare i4 @llvm.uadd.sat.i4(i4, i4) +declare i8 @llvm.uadd.sat.i8(i8, i8) +declare i16 @llvm.uadd.sat.i16(i16, i16) +declare i32 @llvm.uadd.sat.i32(i32, i32) +declare i64 @llvm.uadd.sat.i64(i64, i64) +declare <4 x i32> @llvm.uadd.sat.v4i32(<4 x i32>, <4 x i32>) define i32 @func(i32 %x, i32 %y) nounwind { ; X86-LABEL: func: @@ -48,6 +50,50 @@ define i64 @func2(i64 %x, i64 %y) nounwi ret i64 %tmp } +define i16 @func16(i16 %x, i16 %y) nounwind { +; X86-LABEL: func16: +; X86: # %bb.0: +; X86-NEXT: movzwl {{[0-9]+}}(%esp), %ecx +; X86-NEXT: addw {{[0-9]+}}(%esp), %cx +; X86-NEXT: movl $65535, %eax # imm = 0xFFFF +; X86-NEXT: cmovael %ecx, %eax +; X86-NEXT: # kill: def $ax killed $ax killed $eax +; X86-NEXT: retl +; +; X64-LABEL: func16: +; X64: # %bb.0: +; X64-NEXT: addw %si, %di +; X64-NEXT: movl $65535, %eax # imm = 0xFFFF +; X64-NEXT: cmovael %edi, %eax +; X64-NEXT: # kill: def $ax killed $ax killed $eax +; X64-NEXT: retq + %tmp = call i16 @llvm.uadd.sat.i16(i16 %x, i16 %y) + ret i16 %tmp +} + +define i8 @func8(i8 %x, i8 %y) nounwind { +; X86-LABEL: func8: +; X86: # %bb.0: +; X86-NEXT: movb {{[0-9]+}}(%esp), %al +; X86-NEXT: addb {{[0-9]+}}(%esp), %al +; X86-NEXT: movzbl %al, %ecx +; X86-NEXT: movl $255, %eax +; X86-NEXT: cmovael %ecx, %eax +; X86-NEXT: # kill: def $al killed $al killed 
$eax +; X86-NEXT: retl +; +; X64-LABEL: func8: +; X64: # %bb.0: +; X64-NEXT: addb %sil, %dil +; X64-NEXT: movzbl %dil, %ecx +; X64-NEXT: movl $255, %eax +; X64-NEXT: cmovael %ecx, %eax +; X64-NEXT: # kill: def $al killed $al killed $eax +; X64-NEXT: retq + %tmp = call i8 @llvm.uadd.sat.i8(i8 %x, i8 %y) + ret i8 %tmp +} + define i4 @func3(i4 %x, i4 %y) nounwind { ; X86-LABEL: func3: ; X86: # %bb.0: Modified: llvm/trunk/test/CodeGen/X86/usub_sat.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/usub_sat.ll?rev=374169&r1=374168&r2=374169&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/usub_sat.ll (original) +++ llvm/trunk/test/CodeGen/X86/usub_sat.ll Wed Oct 9 07:17:38 2019 @@ -2,10 +2,12 @@ ; RUN: llc < %s -mtriple=i686 -mattr=cmov | FileCheck %s --check-prefixes=CHECK,X86 ; RUN: llc < %s -mtriple=x86_64-linux | FileCheck %s --check-prefixes=CHECK,X64 -declare i4 @llvm.usub.sat.i4 (i4, i4) -declare i32 @llvm.usub.sat.i32 (i32, i32) -declare i64 @llvm.usub.sat.i64 (i64, i64) -declare <4 x i32> @llvm.usub.sat.v4i32(<4 x i32>, <4 x i32>) +declare i4 @llvm.usub.sat.i4(i4, i4) +declare i8 @llvm.usub.sat.i8(i8, i8) +declare i16 @llvm.usub.sat.i16(i16, i16) +declare i32 @llvm.usub.sat.i32(i32, i32) +declare i64 @llvm.usub.sat.i64(i64, i64) +declare <4 x i32> @llvm.usub.sat.v4i32(<4 x i32>, <4 x i32>) define i32 @func(i32 %x, i32 %y) nounwind { ; X86-LABEL: func: @@ -48,6 +50,50 @@ define i64 @func2(i64 %x, i64 %y) nounwi ret i64 %tmp } +define i16 @func16(i16 %x, i16 %y) nounwind { +; X86-LABEL: func16: +; X86: # %bb.0: +; X86-NEXT: movzwl {{[0-9]+}}(%esp), %eax +; X86-NEXT: xorl %ecx, %ecx +; X86-NEXT: subw {{[0-9]+}}(%esp), %ax +; X86-NEXT: cmovbl %ecx, %eax +; X86-NEXT: # kill: def $ax killed $ax killed $eax +; X86-NEXT: retl +; +; X64-LABEL: func16: +; X64: # %bb.0: +; X64-NEXT: xorl %eax, %eax +; X64-NEXT: subw %si, %di +; X64-NEXT: cmovael %edi, %eax +; X64-NEXT: # kill: def 
$ax killed $ax killed $eax +; X64-NEXT: retq + %tmp = call i16 @llvm.usub.sat.i16(i16 %x, i16 %y) + ret i16 %tmp +} + +define i8 @func8(i8 %x, i8 %y) nounwind { +; X86-LABEL: func8: +; X86: # %bb.0: +; X86-NEXT: movb {{[0-9]+}}(%esp), %al +; X86-NEXT: xorl %ecx, %ecx +; X86-NEXT: subb {{[0-9]+}}(%esp), %al +; X86-NEXT: movzbl %al, %eax +; X86-NEXT: cmovbl %ecx, %eax +; X86-NEXT: # kill: def $al killed $al killed $eax +; X86-NEXT: retl +; +; X64-LABEL: func8: +; X64: # %bb.0: +; X64-NEXT: xorl %ecx, %ecx +; X64-NEXT: subb %sil, %dil +; X64-NEXT: movzbl %dil, %eax +; X64-NEXT: cmovbl %ecx, %eax +; X64-NEXT: # kill: def $al killed $al killed $eax +; X64-NEXT: retq + %tmp = call i8 @llvm.usub.sat.i8(i8 %x, i8 %y) + ret i8 %tmp +} + define i4 @func3(i4 %x, i4 %y) nounwind { ; X86-LABEL: func3: ; X86: # %bb.0: From llvm-commits at lists.llvm.org Wed Oct 9 07:21:10 2019 From: llvm-commits at lists.llvm.org (George Rimar via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 14:21:10 +0000 (UTC) Subject: [PATCH] D68704: [llvm-readobj] - Cleanup "Version symbols" dumping. Message-ID: grimar created this revision. grimar added reviewers: jhenderson, MaskRay. Herald added subscribers: seiya, rupprecht. This changes "Version symbols {" -> "SHT_GNU_versym [" to be consistent with the other two versioning sections. It also removes a few fields that are not useful: "Section Name", "Address", "Offset" and "Link" (they duplicated the information available under the "Sections [" tag). https://reviews.llvm.org/D68704 Files: test/tools/llvm-readobj/all.test test/tools/llvm-readobj/elf-versioninfo.test test/tools/yaml2obj/versym-section.yaml tools/llvm-readobj/ELFDumper.cpp -------------- next part -------------- A non-text attachment was scrubbed...
Name: D68704.224049.patch Type: text/x-patch Size: 4903 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 07:21:13 2019 From: llvm-commits at lists.llvm.org (Dave Green via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 14:21:13 +0000 (UTC) Subject: [PATCH] D67158: [ARM] Begin adding IR intrinsics for MVE instructions. In-Reply-To: References: Message-ID: dmgreen added a comment. Thanks for splitting this up. ================ Comment at: llvm/lib/Target/ARM/ARMInstrMVE.td:710 +defm MVE_VMINV : MVE_VMINMAXV_ty< + "vminv", 0b1, int_arm_mve_minv_s, int_arm_mve_minv_u>; +defm MVE_VMAXV : MVE_VMINMAXV_ty< ---------------- I feel like we should come up with a style and try to stick with it. The adds/subs below add in VT to the existing instructions and use it in the new Patterns, mixed in with the old patterns. These ones add the intrinsics to the multiclass so the top level can include the pattern (but there are also patterns outside, below). I think there's value in making this somewhat structured, if we can. ================ Comment at: llvm/lib/Target/ARM/ARMInstrMVE.td:2794 + foreach ptype = [mkpred.p] in { + def : Pat<(vtype (fadd (vtype MQPR:$Qm), (vtype MQPR:$Qn))), + (vtype (instr (vtype MQPR:$Qm), (vtype MQPR:$Qn)))>; ---------------- Little bit of indenting, please. ================ Comment at: llvm/test/CodeGen/Thumb2/mve-intrinsics/vaddq.ll:4 + +define arm_aapcs_vfpcc <4 x i32> @test_vaddq_u32(<4 x i32> %a, <4 x i32> %b) { +; CHECK-LABEL: test_vaddq_u32: ---------------- We probably don't _need_ tests for simple instructions like this; they should be covered elsewhere (fine to leave them if you wish).
================ Comment at: llvm/test/CodeGen/Thumb2/mve-intrinsics/vaddq.ll:24 + +define arm_aapcs_vfpcc <16 x i8> @test_vaddq_m_s8(<16 x i8> %inactive, <16 x i8> %a, <16 x i8> %b, i16 zeroext %p) { +; CHECK-LABEL: test_vaddq_m_s8: ---------------- For the rest of the tests, at least for codegen we have tried to fill in all the combinations for type and operations (at least the legal types). It can be useful for making sure nothing is missed (here or in the future when some refactoring happens). Whether you want to do the same thing here is up to you, or whether you think that having interesting combinations is enough (adds with v16i8, subs with v4f32 for example). ================ Comment at: llvm/test/CodeGen/Thumb2/mve-intrinsics/vaddq.ll:52 + %1 = tail call <4 x i1> @llvm.arm.mve.pred.i2v.v4i1(i32 %0) + %2 = tail call <4 x float> @llvm.arm.mve.sub.predicated.v4f32.v4i1(<4 x float> %a, <4 x float> %b, <4 x i1> %1, <4 x float> %inactive) + ret <4 x float> %2 ---------------- Do we care about what happens when there is a fp intrinsic but we don't have mve.fp? I presume this will be a fail to select or some sort of legalisation error, which is probably fine considering what is happening. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67158/new/ https://reviews.llvm.org/D67158 From llvm-commits at lists.llvm.org Wed Oct 9 07:25:08 2019 From: llvm-commits at lists.llvm.org (Clement Courbet via llvm-commits) Date: Wed, 09 Oct 2019 14:25:08 -0000 Subject: [llvm] r374170 - [llvm-exegesis] Ensure that ExecutableFunction are aligned. Message-ID: <20191009142508.82C3F85C95@lists.llvm.org> Author: courbet Date: Wed Oct 9 07:25:08 2019 New Revision: 374170 URL: http://llvm.org/viewvc/llvm-project?rev=374170&view=rev Log: [llvm-exegesis] Ensure that ExecutableFunction are aligned. Summary: Experiments show that this is the alignment we get (for ELF+Linux), but let's ensure that we have it. 
Reviewers: gchatelet Subscribers: tschuett, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68703 Modified: llvm/trunk/tools/llvm-exegesis/lib/Assembler.cpp Modified: llvm/trunk/tools/llvm-exegesis/lib/Assembler.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-exegesis/lib/Assembler.cpp?rev=374170&r1=374169&r2=374170&view=diff ============================================================================== --- llvm/trunk/tools/llvm-exegesis/lib/Assembler.cpp (original) +++ llvm/trunk/tools/llvm-exegesis/lib/Assembler.cpp Wed Oct 9 07:25:08 2019 @@ -21,6 +21,7 @@ #include "llvm/ExecutionEngine/SectionMemoryManager.h" #include "llvm/IR/LegacyPassManager.h" #include "llvm/MC/MCInstrInfo.h" +#include "llvm/Support/Alignment.h" #include "llvm/Support/MemoryBuffer.h" namespace llvm { @@ -28,6 +29,7 @@ namespace exegesis { static constexpr const char ModuleID[] = "ExegesisInfoTest"; static constexpr const char FunctionID[] = "foo"; +static const Align kFunctionAlignment(4096); // Fills the given basic block with register setup code, and returns true if // all registers could be setup correctly. @@ -169,13 +171,13 @@ void assembleToStream(const ExegesisTarg ArrayRef LiveIns, ArrayRef RegisterInitialValues, const FillFunction &Fill, raw_pwrite_stream &AsmStream) { - std::unique_ptr Context = std::make_unique(); + auto Context = std::make_unique(); std::unique_ptr Module = createModule(Context, TM->createDataLayout()); - std::unique_ptr MMIWP = - std::make_unique(TM.get()); + auto MMIWP = std::make_unique(TM.get()); MachineFunction &MF = createVoidVoidPtrMachineFunction( FunctionID, Module.get(), &MMIWP.get()->getMMI()); + MF.ensureAlignment(kFunctionAlignment); // We need to instruct the passes that we're done with SSA and virtual // registers. @@ -305,9 +307,11 @@ ExecutableFunction::ExecutableFunction( // executable page. ExecEngine->addObjectFile(std::move(ObjectFileHolder)); // Fetching function bytes. 
- FunctionBytes = StringRef(reinterpret_cast( - ExecEngine->getFunctionAddress(FunctionID)), - CodeSize); + const uint64_t FunctionAddress = ExecEngine->getFunctionAddress(FunctionID); + assert(isAligned(kFunctionAlignment, FunctionAddress) && + "function is not properly aligned"); + FunctionBytes = + StringRef(reinterpret_cast(FunctionAddress), CodeSize); } } // namespace exegesis From llvm-commits at lists.llvm.org Wed Oct 9 07:26:09 2019 From: llvm-commits at lists.llvm.org (Simon Pilgrim via llvm-commits) Date: Wed, 09 Oct 2019 14:26:09 -0000 Subject: [llvm] r374171 - Fix Wdocumentation unknown parameter warning. NFCI. Message-ID: <20191009142609.4E1A790DD2@lists.llvm.org> Author: rksimon Date: Wed Oct 9 07:26:09 2019 New Revision: 374171 URL: http://llvm.org/viewvc/llvm-project?rev=374171&view=rev Log: Fix Wdocumentation unknown parameter warning. NFCI. Modified: llvm/trunk/include/llvm-c/DebugInfo.h Modified: llvm/trunk/include/llvm-c/DebugInfo.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm-c/DebugInfo.h?rev=374171&r1=374170&r2=374171&view=diff ============================================================================== --- llvm/trunk/include/llvm-c/DebugInfo.h (original) +++ llvm/trunk/include/llvm-c/DebugInfo.h Wed Oct 9 07:26:09 2019 @@ -539,7 +539,7 @@ LLVMDIBuilderCreateSubroutineType(LLVMDI * @param Builder The DIBuilder. * @param ParentMacroFile Macro parent (could be NULL). * @param Line Source line number where the macro is defined. - * @param MacroType DW_MACINFO_define or DW_MACINFO_undef. + * @param RecordType DW_MACINFO_define or DW_MACINFO_undef. * @param Name Macro name. * @param NameLen Macro name length. * @param Value Macro value. From llvm-commits at lists.llvm.org Wed Oct 9 07:30:34 2019 From: llvm-commits at lists.llvm.org (Digger via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 14:30:34 +0000 (UTC) Subject: [PATCH] D68575: [llvm-readobj][xcoff] implement parsing overflow section header. 
In-Reply-To:
References:
Message-ID: <51e973924c647ee1e162d6d668406cc6@localhost.localdomain>

DiggerLin marked 3 inline comments as done.
DiggerLin added inline comments.

================
Comment at: llvm/tools/llvm-readobj/XCOFFDumper.cpp:58
+  static constexpr unsigned SectionFlagsTypeMask = 0xffffu;
   const XCOFFObjectFile &Obj;
 };
----------------
hubert.reinterpretcast wrote:
> Add a blank line here. Also, I am wondering if this should be part of `llvm/BinaryFormat/XCOFF.h` (perhaps in `SectionHeader32`, or in a base class thereof when 64-bit support lands).
For consistency with SectionFlagsReservedMask, I put the definition of SectionFlagsTypeMask here too. I think we may need to create an NFC patch to move SectionFlagsReservedMask and SectionFlagsTypeMask into XCOFF.h.

================
Comment at: llvm/tools/llvm-readobj/XCOFFDumper.cpp:455
+    case XCOFF::STYP_TYPCHK:
+      // TODO : The interpretation of loader, exception, type check section
+      // headers are different from that of generic section header. We will
----------------
hubert.reinterpretcast wrote:
> The "TODO" still has a colon surrounded by spaces on both sides after it. I do not think that we have been using colons after "TODO".
>
> Still missing "and" before "type check section headers".
>
> Still missing "s" after "generic section header".
>
> Typo "seciton" is still present.
Changed as suggested.

================
Comment at: llvm/tools/llvm-readobj/XCOFFDumper.cpp:463
+    }
+    // For now we just dump the section type flags.
+    if (SectionType & SectionFlagsReservedMask)
----------------
hubert.reinterpretcast wrote:
> Suggestion: "For now we just dump the section type portion of the flags."
Changed as suggested.
Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68575/new/ https://reviews.llvm.org/D68575 From llvm-commits at lists.llvm.org Wed Oct 9 07:30:34 2019 From: llvm-commits at lists.llvm.org (George Rimar via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 14:30:34 +0000 (UTC) Subject: [PATCH] D68705: [llvm-readelf/llvm-readobj] - Improve dumping of broken versioning sections. Message-ID: grimar created this revision. grimar added reviewers: jhenderson, MaskRay. Herald added subscribers: seiya, rupprecht. grimar retitled this revision from "[llvm-readelf/llvm-readobj] - Improve dumping of the broken versioning sections." to "[llvm-readelf/llvm-readobj] - Improve dumping of broken versioning sections.". grimar added a parent revision: D68704: [llvm-readobj] - Cleanup "Version symbols" dumping.. This updates the `elf-invalid-versioning.test` test case: makes a cleanup, adds llvm-readobj calls and fixes 2 crash/assert issues I've found (test cases are provided). Depends on: https://reviews.llvm.org/D68704 https://reviews.llvm.org/D68705 Files: test/tools/llvm-readobj/elf-invalid-versioning.test tools/llvm-readobj/ELFDumper.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D68705.224050.patch Type: text/x-patch Size: 11516 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 07:30:35 2019 From: llvm-commits at lists.llvm.org (Dave Green via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 14:30:35 +0000 (UTC) Subject: [PATCH] D68700: [ARM] Add IR intrinsics for MVE VLD[24] and VST[24]. In-Reply-To: References: Message-ID: dmgreen added a comment. Looks good. I think we can use this for autovec codegen too (there is a pre-isel pass that allows us to convert load+shuffle combos that the vectorizer produces into these intrinsics). As that is the case it would probably be worth making sure we have lots of test coverage of the different types. 
(I'm happy enough to do that later if you wish, but adding them here sounds more sensible, as this is where they are being introduced).

================
Comment at: llvm/lib/Target/ARM/ARMInstrMVE.td:4291
+defm : MVE_vst24_patterns<16, v8i16>;
+defm : MVE_vst24_patterns<32, v4i32>;
+
----------------
Do we need floating point types?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68700/new/

https://reviews.llvm.org/D68700

From llvm-commits at lists.llvm.org Wed Oct 9 07:30:36 2019
From: llvm-commits at lists.llvm.org (Alexey Bataev via Phabricator via llvm-commits)
Date: Wed, 09 Oct 2019 14:30:36 +0000 (UTC)
Subject: [PATCH] D43582: [SLP] Generalization of stores vectorization.
In-Reply-To:
References:
Message-ID: <980b336232e083a63ac3f4ac480471c9@localhost.localdomain>

ABataev updated this revision to Diff 224053.
ABataev added a comment.

Rebase + added analysis of target register width.

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D43582/new/

https://reviews.llvm.org/D43582

Files:
  include/llvm/Transforms/Vectorize/SLPVectorizer.h
  lib/Transforms/Vectorize/SLPVectorizer.cpp
  test/Transforms/SLPVectorizer/X86/arith-add-ssat.ll
  test/Transforms/SLPVectorizer/X86/arith-add-usat.ll
  test/Transforms/SLPVectorizer/X86/arith-add.ll
  test/Transforms/SLPVectorizer/X86/arith-fix.ll
  test/Transforms/SLPVectorizer/X86/arith-mul.ll
  test/Transforms/SLPVectorizer/X86/arith-sub-ssat.ll
  test/Transforms/SLPVectorizer/X86/arith-sub-usat.ll
  test/Transforms/SLPVectorizer/X86/arith-sub.ll
  test/Transforms/SLPVectorizer/X86/bitreverse.ll
  test/Transforms/SLPVectorizer/X86/ctlz.ll
  test/Transforms/SLPVectorizer/X86/ctpop.ll
  test/Transforms/SLPVectorizer/X86/cttz.ll
  test/Transforms/SLPVectorizer/X86/different-vec-widths.ll
  test/Transforms/SLPVectorizer/X86/pr35497.ll
  test/Transforms/SLPVectorizer/X86/shift-ashr.ll
  test/Transforms/SLPVectorizer/X86/shift-lshr.ll
  test/Transforms/SLPVectorizer/X86/shift-shl.ll
test/Transforms/SLPVectorizer/X86/stores_vectorize.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D43582.224053.patch Type: text/x-patch Size: 146474 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 07:30:40 2019 From: llvm-commits at lists.llvm.org (Clement Courbet via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 14:30:40 +0000 (UTC) Subject: [PATCH] D68703: [llvm-exegesis] Ensure that ExecutableFunction are aligned. In-Reply-To: References: Message-ID: This revision was automatically updated to reflect the committed changes. Closed by commit rG04a9a0eb0dd4: [llvm-exegesis] Ensure that ExecutableFunction are aligned. (authored by courbet). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68703/new/ https://reviews.llvm.org/D68703 Files: llvm/tools/llvm-exegesis/lib/Assembler.cpp Index: llvm/tools/llvm-exegesis/lib/Assembler.cpp =================================================================== --- llvm/tools/llvm-exegesis/lib/Assembler.cpp +++ llvm/tools/llvm-exegesis/lib/Assembler.cpp @@ -21,6 +21,7 @@ #include "llvm/ExecutionEngine/SectionMemoryManager.h" #include "llvm/IR/LegacyPassManager.h" #include "llvm/MC/MCInstrInfo.h" +#include "llvm/Support/Alignment.h" #include "llvm/Support/MemoryBuffer.h" namespace llvm { @@ -28,6 +29,7 @@ static constexpr const char ModuleID[] = "ExegesisInfoTest"; static constexpr const char FunctionID[] = "foo"; +static const Align kFunctionAlignment(4096); // Fills the given basic block with register setup code, and returns true if // all registers could be setup correctly. 
@@ -169,13 +171,13 @@ ArrayRef LiveIns, ArrayRef RegisterInitialValues, const FillFunction &Fill, raw_pwrite_stream &AsmStream) { - std::unique_ptr Context = std::make_unique(); + auto Context = std::make_unique(); std::unique_ptr Module = createModule(Context, TM->createDataLayout()); - std::unique_ptr MMIWP = - std::make_unique(TM.get()); + auto MMIWP = std::make_unique(TM.get()); MachineFunction &MF = createVoidVoidPtrMachineFunction( FunctionID, Module.get(), &MMIWP.get()->getMMI()); + MF.ensureAlignment(kFunctionAlignment); // We need to instruct the passes that we're done with SSA and virtual // registers. @@ -305,9 +307,11 @@ // executable page. ExecEngine->addObjectFile(std::move(ObjectFileHolder)); // Fetching function bytes. - FunctionBytes = StringRef(reinterpret_cast( - ExecEngine->getFunctionAddress(FunctionID)), - CodeSize); + const uint64_t FunctionAddress = ExecEngine->getFunctionAddress(FunctionID); + assert(isAligned(kFunctionAlignment, FunctionAddress) && + "function is not properly aligned"); + FunctionBytes = + StringRef(reinterpret_cast(FunctionAddress), CodeSize); } } // namespace exegesis -------------- next part -------------- A non-text attachment was scrubbed... Name: D68703.224054.patch Type: text/x-patch Size: 2275 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 07:39:53 2019 From: llvm-commits at lists.llvm.org (Digger via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 14:39:53 +0000 (UTC) Subject: [PATCH] D68575: [llvm-readobj][xcoff] implement parsing overflow section header. In-Reply-To: References: Message-ID: <1b9b4511b1bdee25101e3b54025b83d0@localhost.localdomain> DiggerLin updated this revision to Diff 224055. 
Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68575/new/ https://reviews.llvm.org/D68575 Files: llvm/test/tools/llvm-readobj/Inputs/xcoff-reloc-overflow.o llvm/test/tools/llvm-readobj/xcoff-overflow-section.test llvm/tools/llvm-readobj/XCOFFDumper.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D68575.224055.patch Type: text/x-patch Size: 6853 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 07:39:54 2019 From: llvm-commits at lists.llvm.org (Xiangling Liao via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 14:39:54 +0000 (UTC) Subject: [PATCH] D68341: [AIX] TOC pseudo expansion for 64bit large + 64bit small + 32bit large modes In-Reply-To: References: Message-ID: <2ab67ceb562e9fbc69381ef9e3572756@localhost.localdomain> Xiangling_L marked 10 inline comments as done. Xiangling_L added inline comments. ================ Comment at: llvm/lib/Target/PowerPC/PPCInstrInfo.td:3171 (PPCtoc_entry tglobaladdr:$disp, i32:$reg))]>; -def ADDIStocHA : PPCEmitTimePseudo<(outs gprc:$rD), (ins gprc_nor0:$reg, tocentry32:$disp), - "#ADDIStocHA", - [(set i32:$rD, - (PPCtoc_entry i32:$reg, tglobaladdr:$disp))]>; +let hasSideEffects = 0, isReMaterializable = 1 in { +def ADDIStocHA: PPCEmitTimePseudo<(outs gprc:$rD), (ins gprc_nor0:$reg, tocentry32:$disp), ---------------- jasonliu wrote: > Curious about what the effect is for adding hasSideEffects and isReMaterializable. > We already have ADDIStocHA before for the other targets, but they did not require hasSideEffects = 0, and isReMaterializable = 1. > So what is special about the AIX target that we need to add them? Or is it simply an omission before and it's actually needed for the other targets as well? > I try to remove this line here and no test case would fail. If we need those, should we add test case for it? Some of my rational of adding these two properties are as follows, and please feel free to raise your further concerns: 1. 
For the instruction property `hasSideEffects`: it indicates "does the instruction have side effects that are not captured by any operands of the instruction or other flags". It is unset by default, and TableGen will infer its value from the instruction pattern when possible. One rationale for adding this property is that, since `ADDIStocHA` and `ADDIStocHA8` [used in 64-bit mode] have the same instruction pattern, it should be safe to explicitly specify `hasSideEffects = 0` for `ADDIStocHA`. Also, considering `ADDIStocHA` itself, what it does is load a value from the TOC; it has no side effects such as an implicit function call or a volatile variable read, so I set this property to 0.

2. For the instruction property `isReMaterializable = 1`: this means the instruction has no side effects and requires no operands that aren't always available. The only allowed uses are constants and unallocatable physical registers, so that the instruction's result is independent of its place in the function. Since it loads a value from the TOC, and the TOC register is always reserved on the PPC target, I think it's safe to set it to 1. Plus, for the same reason as above, I would prefer to keep the behavior of `ADDIStocHA` and `ADDIStocHA8` in sync.

================
Comment at: llvm/test/CodeGen/PowerPC/lower-globaladdr32-aix-asm.ll:2
 ; RUN: llc -mtriple powerpc-ibm-aix-xcoff \
-; RUN: -code-model=small < %s | FileCheck %s
+; RUN: -code-model=small < %s | FileCheck %s --check-prefix=SMALL
+
----------------
jasonliu wrote:
> Do we want to add -verify-machineinstrs to every llc invocation?
Yes, I will add it, and the CPU level, for the test cases as well.
================
Comment at: llvm/test/CodeGen/PowerPC/lower-globaladdr32-aix-asm.ll:2
 ; RUN: llc -mtriple powerpc-ibm-aix-xcoff \
-; RUN: -code-model=small < %s | FileCheck %s
+; RUN: -code-model=small < %s | FileCheck %s --check-prefix=SMALL
+
----------------
sfertile wrote:
> Please add `-verify-machineinstrs` and an mcpu option to each llc invocation.
Thank you for reminding me of this, will do.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68341/new/

https://reviews.llvm.org/D68341

From llvm-commits at lists.llvm.org Wed Oct 9 07:39:54 2019
From: llvm-commits at lists.llvm.org (Jay Foad via Phabricator via llvm-commits)
Date: Wed, 09 Oct 2019 14:39:54 +0000 (UTC)
Subject: [PATCH] D64353: [AMDGPU] Run '' after isel to simplify PHIs.
In-Reply-To:
References:
Message-ID: <84a54cec14f4122f6d86e016a027ec51@localhost.localdomain>

foad added a comment.

Can we revert this now that D67101 has landed? I have tried locally reverting your change to addInstSelector and your lcssa-optnone.ll test case still passes.

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D64353/new/

https://reviews.llvm.org/D64353

From llvm-commits at lists.llvm.org Wed Oct 9 07:49:01 2019
From: llvm-commits at lists.llvm.org (Roman Lebedev via Phabricator via llvm-commits)
Date: Wed, 09 Oct 2019 14:49:01 +0000 (UTC)
Subject: [PATCH] D68672: [APInt] Rounding right-shifts
In-Reply-To:
References:
Message-ID: <5260b9b0fe2a504b010f1da702892e0a@localhost.localdomain>

lebedev.ri updated this revision to Diff 224056.
lebedev.ri added a comment.

Ok, last update - detect existence of quotient by counting trailing zeros, improve tests.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68672/new/

https://reviews.llvm.org/D68672

Files:
  llvm/include/llvm/ADT/APInt.h
  llvm/lib/Support/APInt.cpp
  llvm/unittests/ADT/APIntTest.cpp

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68672.224056.patch Type: text/x-patch Size: 8214 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 07:49:01 2019 From: llvm-commits at lists.llvm.org (George Rimar via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 14:49:01 +0000 (UTC) Subject: [PATCH] D68645: MinidumpYAML: Add support for the memory info list stream In-Reply-To: References: Message-ID: grimar added inline comments. ================ Comment at: include/llvm/ObjectYAML/MinidumpYAML.h:111 + + explicit MemoryInfoListStream(std::vector Infos) + : Stream(StreamKind::MemoryInfoList, ---------------- Maybe be more explicit here, i.e. ``` std::vector &&Infos ``` ? ================ Comment at: lib/ObjectYAML/MinidumpEmitter.cpp:166 + Header.SizeOfEntry = sizeof(minidump::MemoryInfo); + Header.NumberOfEntries = InfoList.Infos.size(); + File.allocateNewObject(Header); ---------------- Probably just ``` minidump::MemoryInfoListHeader Header = { (support::ulittle32_t)sizeof(minidump::MemoryInfoListHeader), (support::ulittle32_t)sizeof(minidump::MemoryInfo), (support::ulittle64_t)InfoList.Infos.size()}; ``` ? Or perhaps it could have a constructor. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68645/new/ https://reviews.llvm.org/D68645 From llvm-commits at lists.llvm.org Wed Oct 9 07:49:01 2019 From: llvm-commits at lists.llvm.org (Dave Green via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 14:49:01 +0000 (UTC) Subject: [PATCH] D68699: [ARM] Add some sample IR MVE intrinsics with C++ isel. In-Reply-To: References: Message-ID: <30e8fa7db0a773b943a7ea1a35e66e32@localhost.localdomain> dmgreen added inline comments. ================ Comment at: llvm/lib/Target/ARM/ARMISelDAGToDAG.cpp:2381 + int32_t ImmValue = cast(N->getOperand(3))->getZExtValue(); + Ops.push_back(getI32Imm(ImmValue, Loc)); // immediate offset + ---------------- The immediate is in the range +-128? Do we need to diagnose here when that is out of range? 
Is it already diagnosed by the front end? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68699/new/ https://reviews.llvm.org/D68699 From llvm-commits at lists.llvm.org Wed Oct 9 07:49:02 2019 From: llvm-commits at lists.llvm.org (Dave Green via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 14:49:02 +0000 (UTC) Subject: [PATCH] D68406: [update_cc_test_checks] Support 'clang | opt | FileCheck' In-Reply-To: References: Message-ID: dmgreen added a comment. This looks very useful to me (I've often wanted something similar with opt -> llc). It would make all the testing MVE intrinsics a lot more readable (to the extent that I'm not sure what we would do without it!) Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68406/new/ https://reviews.llvm.org/D68406 From llvm-commits at lists.llvm.org Wed Oct 9 07:58:36 2019 From: llvm-commits at lists.llvm.org (Joel E. Denny via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 14:58:36 +0000 (UTC) Subject: [PATCH] D68664: [lit] Clean up internal diff's encoding handling In-Reply-To: References: Message-ID: jdenny updated this revision to Diff 224051. jdenny added a comment. Removed commented code pointed out during review. Changed `diff.decode(errors="replace")` to `diff.decode(errors="backslashreplace")` so the test suite doesn't fail at `sys.stdout.write(diff)` when running with python3. The test's commands worked fine when running interactively from a shell prompt, apparently because stdout is then different. Sorry, I must have forgotten to re-run the actual test suite with python3 after making some change here. 
In any case, `backslashreplace` results in output like the following, where `\xff` represents the bytes that cannot be rendered:

  foo
  -bar
  +bar\xff\xff
  baz

That seems better than `ignore`, which just drops them altogether:

  foo
  -bar
  +bar
  baz

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68664/new/

https://reviews.llvm.org/D68664

Files:
  llvm/utils/lit/lit/builtin_commands/diff.py
  llvm/utils/lit/tests/Inputs/shtest-shell/diff-encodings.txt
  llvm/utils/lit/tests/Inputs/shtest-shell/diff-in.bin
  llvm/utils/lit/tests/Inputs/shtest-shell/diff-in.utf16
  llvm/utils/lit/tests/Inputs/shtest-shell/diff-in.utf8
  llvm/utils/lit/tests/max-failures.py
  llvm/utils/lit/tests/shtest-shell.py

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68664.224051.patch
Type: text/x-patch
Size: 6059 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Wed Oct 9 07:58:36 2019
From: llvm-commits at lists.llvm.org (Cameron McInally via Phabricator via llvm-commits)
Date: Wed, 09 Oct 2019 14:58:36 +0000 (UTC)
Subject: [PATCH] D68686: [X86] Add strict fp support for instructions fadd/fsub/fmul/fdiv
In-Reply-To:
References:
Message-ID: <8304a0b4babf05a1e4ca9f329ba657a7@localhost.localdomain>

cameron.mcinally added a comment.

This looks good to me, but other users should review it as well...

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68686/new/

https://reviews.llvm.org/D68686

From llvm-commits at lists.llvm.org Wed Oct 9 07:58:36 2019
From: llvm-commits at lists.llvm.org (Alexey Bataev via Phabricator via llvm-commits)
Date: Wed, 09 Oct 2019 14:58:36 +0000 (UTC)
Subject: [PATCH] D68667: [SLP] respect target register width for GEP vectorization (PR43578)
In-Reply-To:
References:
Message-ID: <3b188e70df5c0edf79845fe8515b23c7@localhost.localdomain>

ABataev accepted this revision.
ABataev added a comment.
This revision is now accepted and ready to land.

Looks good.
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68667/new/ https://reviews.llvm.org/D68667 From llvm-commits at lists.llvm.org Wed Oct 9 07:58:36 2019 From: llvm-commits at lists.llvm.org (Kostya Kortchinsky via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 14:58:36 +0000 (UTC) Subject: [PATCH] D68653: [scudo][standalone] Get statistics in a char buffer In-Reply-To: References: Message-ID: <20a5a86f2127b872515ac203d8a91b64@localhost.localdomain> cryptoad updated this revision to Diff 224059. cryptoad marked an inline comment as done. cryptoad added a comment. As pointed out by Matt, use a `std::vector` in the new test. Repository: rCRT Compiler Runtime CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68653/new/ https://reviews.llvm.org/D68653 Files: lib/scudo/standalone/combined.h lib/scudo/standalone/crc32_hw.cpp lib/scudo/standalone/primary32.h lib/scudo/standalone/primary64.h lib/scudo/standalone/quarantine.h lib/scudo/standalone/secondary.cpp lib/scudo/standalone/secondary.h lib/scudo/standalone/size_class_map.h lib/scudo/standalone/string_utils.cpp lib/scudo/standalone/string_utils.h lib/scudo/standalone/tests/combined_test.cpp lib/scudo/standalone/tests/primary_test.cpp lib/scudo/standalone/tests/quarantine_test.cpp lib/scudo/standalone/tests/secondary_test.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D68653.224059.patch Type: text/x-patch Size: 17851 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 02:56:29 2019 From: llvm-commits at lists.llvm.org (Stefan O'Rear via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 09:56:29 +0000 (UTC) Subject: [PATCH] D66210: [RISCV] Enable the machine outliner for RISC-V In-Reply-To: References: Message-ID: <67b713788812cdd5692a3b21230a1dc0@localhost.localdomain> sorear added inline comments. 
================ Comment at: llvm/lib/Target/RISCV/RISCVInstrInfo.cpp:481 + RS.enterBasicBlock(MBB); + return !RS.isRegUsed(RISCV::X5); +} ---------------- luismarques wrote: > lewis-revill wrote: > > luismarques wrote: > > > If we are only going to support one possible register for now, shouldn't it be the one least likely to already be in use? Wouldn't that be t6 (x31)? > > That's a sensible suggestion, I was using t0 to match how the save/restore libcalls behave so I presumed there was a good reason for using t0. Don't we also need to think about RV32E here though? > > > > > Good point about RV32E. I guess you can either always use `t2` or check the target and use `t6` when available, falling back to `t2` otherwise. If that's not trivial it might be worth checking how hard it would be to dynamically choose the register, like AArch64 does IIRC. `ra` and `t0` (`x1` and `x5`) have [special functionality in implementations with a return-address stack](https://content.riscv.org/wp-content/uploads/2019/06/riscv-spec.pdf#page=38); `jr t6` will be treated as a general indirect branch, not a return, and is much more likely to mispredict. So this should probably prefer `t0` whenever possible. ================ Comment at: llvm/lib/Target/RISCV/RISCVInstrInfo.cpp:548 + // RISCV::PseudoCALL = 8 bytes. + unsigned CallOverhead = 8; + for (auto &C : RepeatedSequenceLocs) ---------------- Tangentially related, but in many of the cases where outlining makes sense, it would also make sense to generate 4-byte `jal t0, label` instructions; lld and ld.bfd would need to be taught to generate thunks for out of range `jal` (the Go linker already supports them). ================ Comment at: llvm/lib/Target/RISCV/RISCVInstrInfo.cpp:552 + + // RISCV::PseudoRET = 4 bytes. + unsigned FrameOverhead = 4; ---------------- Note that `jr ` is a compressible instruction regardless of the register used. 
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D66210/new/ https://reviews.llvm.org/D66210 From llvm-commits at lists.llvm.org Wed Oct 9 03:15:02 2019 From: llvm-commits at lists.llvm.org (Nikola Prica via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 10:15:02 +0000 (UTC) Subject: [PATCH] D67004: [DebugInfo] Enable call site parameter debug info for ARM and AArch64 In-Reply-To: References: Message-ID: <17c602523ad1c5af1470e819f890596d@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rGf71bac6f4351: [DebugInfo] Enable call site debug info for ARM and AArch64 (authored by NikolaPrica). Herald added a project: clang. Herald added a subscriber: cfe-commits. Changed prior to commit: https://reviews.llvm.org/D67004?vs=218306&id=224006#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67004/new/ https://reviews.llvm.org/D67004 Files: clang/lib/CodeGen/CGDebugInfo.cpp clang/lib/Frontend/CompilerInvocation.cpp clang/test/CodeGen/debug-info-param-modification.c Index: clang/test/CodeGen/debug-info-param-modification.c =================================================================== --- clang/test/CodeGen/debug-info-param-modification.c +++ clang/test/CodeGen/debug-info-param-modification.c @@ -1,4 +1,8 @@ // RUN: %clang -Xclang -femit-debug-entry-values -g -O2 -Xclang -disable-llvm-passes -S -target x86_64-none-linux-gnu -emit-llvm %s -o - | FileCheck %s -check-prefix=CHECK-ENTRY-VAL-OPT +// RUN: %clang -Xclang -femit-debug-entry-values -g -O2 -Xclang -disable-llvm-passes -S -target arm-none-linux-gnu -emit-llvm %s -o - | FileCheck %s -check-prefix=CHECK-ENTRY-VAL-OPT +// RUN: %clang -Xclang -femit-debug-entry-values -g -O2 -Xclang -disable-llvm-passes -S -target aarch64-none-linux-gnu -emit-llvm %s -o - | FileCheck %s -check-prefix=CHECK-ENTRY-VAL-OPT +// RUN: %clang -Xclang -femit-debug-entry-values -g -O2 -Xclang 
-disable-llvm-passes -S -target armeb-none-linux-gnu -emit-llvm %s -o - | FileCheck %s -check-prefix=CHECK-ENTRY-VAL-OPT + // CHECK-ENTRY-VAL-OPT: !DILocalVariable(name: "a", arg: 1, scope: {{.*}}, file: {{.*}}, line: {{.*}}, type: {{.*}}) // CHECK-ENTRY-VAL-OPT: !DILocalVariable(name: "b", arg: 2, scope: {{.*}}, file: {{.*}}, line: {{.*}}, type: {{.*}}, flags: DIFlagArgumentNotModified) // Index: clang/lib/Frontend/CompilerInvocation.cpp =================================================================== --- clang/lib/Frontend/CompilerInvocation.cpp +++ clang/lib/Frontend/CompilerInvocation.cpp @@ -777,10 +777,14 @@ Opts.DisableLLVMPasses = Args.hasArg(OPT_disable_llvm_passes); Opts.DisableLifetimeMarkers = Args.hasArg(OPT_disable_lifetimemarkers); + const llvm::Triple::ArchType DebugEntryValueArchs[] = { + llvm::Triple::x86, llvm::Triple::x86_64, llvm::Triple::aarch64, + llvm::Triple::arm, llvm::Triple::armeb}; + llvm::Triple T(TargetOpts.Triple); - llvm::Triple::ArchType Arch = T.getArch(); if (Opts.OptimizationLevel > 0 && - (Arch == llvm::Triple::x86 || Arch == llvm::Triple::x86_64)) + Opts.getDebugInfo() >= codegenoptions::LimitedDebugInfo && + llvm::is_contained(DebugEntryValueArchs, T.getArch())) Opts.EnableDebugEntryValues = Args.hasArg(OPT_femit_debug_entry_values); Opts.DisableO0ImplyOptNone = Args.hasArg(OPT_disable_O0_optnone); Index: clang/lib/CodeGen/CGDebugInfo.cpp =================================================================== --- clang/lib/CodeGen/CGDebugInfo.cpp +++ clang/lib/CodeGen/CGDebugInfo.cpp @@ -3706,8 +3706,7 @@ const FunctionDecl *CalleeDecl) { auto &CGOpts = CGM.getCodeGenOpts(); if (!CGOpts.EnableDebugEntryValues || !CGM.getLangOpts().Optimize || - !CallOrInvoke || - CGM.getCodeGenOpts().getDebugInfo() < codegenoptions::LimitedDebugInfo) + !CallOrInvoke) return; auto *Func = CallOrInvoke->getCalledFunction(); -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D67004.224006.patch Type: text/x-patch Size: 2880 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 03:42:15 2019 From: llvm-commits at lists.llvm.org (Roman Lebedev via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 10:42:15 +0000 (UTC) Subject: [PATCH] D67122: [UBSan][clang][compiler-rt] Applying non-zero offset to nullptr is undefined behaviour In-Reply-To: References: Message-ID: <037932b61cc84930ae35727d2c0e799d@localhost.localdomain> lebedev.ri updated this revision to Diff 224007. lebedev.ri marked 5 inline comments as done. lebedev.ri added a comment. @rsmith thank you for the review! Rebased, addressed documentation nits. Anything else here? If not, care to stamp? :) Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67122/new/ https://reviews.llvm.org/D67122 Files: clang/docs/ReleaseNotes.rst clang/docs/UndefinedBehaviorSanitizer.rst clang/lib/CodeGen/CGExprScalar.cpp clang/test/CodeGen/catch-nullptr-and-nonzero-offset-blacklist.c clang/test/CodeGen/catch-nullptr-and-nonzero-offset-in-offsetof-idiom.c clang/test/CodeGen/catch-nullptr-and-nonzero-offset-when-nullptr-is-defined.c clang/test/CodeGen/catch-nullptr-and-nonzero-offset.c clang/test/CodeGen/catch-pointer-overflow-volatile.c clang/test/CodeGen/catch-pointer-overflow.c clang/test/CodeGen/ubsan-pointer-overflow.c clang/test/CodeGen/ubsan-pointer-overflow.m clang/test/CodeGenCXX/catch-nullptr-and-nonzero-offset-in-offsetof-idiom.cpp compiler-rt/lib/sanitizer_common/sanitizer_suppressions.h compiler-rt/lib/ubsan/ubsan_checks.inc compiler-rt/lib/ubsan/ubsan_handlers.cpp compiler-rt/test/ubsan/TestCases/Pointer/index-overflow.cpp compiler-rt/test/ubsan/TestCases/Pointer/nullptr-and-nonzero-offset-constants.cpp compiler-rt/test/ubsan/TestCases/Pointer/nullptr-and-nonzero-offset-summary.cpp compiler-rt/test/ubsan/TestCases/Pointer/nullptr-and-nonzero-offset-variable.cpp 
  compiler-rt/test/ubsan/TestCases/Pointer/unsigned-index-expression.cpp
  compiler-rt/test/ubsan_minimal/TestCases/nullptr-and-nonzero-offset.c
  llvm/docs/ReleaseNotes.rst

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D67122.224007.patch
Type: text/x-patch
Size: 131625 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Wed Oct 9 03:51:11 2019
From: llvm-commits at lists.llvm.org (Roman Lebedev via Phabricator via llvm-commits)
Date: Wed, 09 Oct 2019 10:51:11 +0000 (UTC)
Subject: [PATCH] D67122: [UBSan][clang][compiler-rt] Applying non-zero offset to nullptr is undefined behaviour
In-Reply-To:
References:
Message-ID:

lebedev.ri added inline comments.

================
Comment at: clang/lib/CodeGen/CGExprScalar.cpp:4657
+      Builder.GetInsertBlock()->getParent(), PtrTy->getPointerAddressSpace());
+  // Check for overflows unless the GEP got constant-folded,
+  // and only in the default address space
----------------
rsmith wrote:
> If we want to split out the "constant folded" case to avoid issuing too many sanitizer traps on bogus but common patterns, we should have another sanitizer group to re-enable those diagnostics for the constant-folded cases. (I'm fine with not doing that in this patch, though.)

I'm not sure about this point, i think i'm gonna leave this as-is for now..

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D67122/new/

https://reviews.llvm.org/D67122

From llvm-commits at lists.llvm.org Wed Oct 9 06:13:58 2019
From: llvm-commits at lists.llvm.org (Tim Gymnich via Phabricator via llvm-commits)
Date: Wed, 09 Oct 2019 13:13:58 +0000 (UTC)
Subject: [PATCH] D68360: PR41162 Implement LKK remainder and divisibility algorithms [urem]
In-Reply-To:
References:
Message-ID: <5a38027242cf9cb2ef071409e2f65333@localhost.localdomain>

TG908 added a comment.

I tested loops containing a rem operation on AArch64. With LKK the loop body contains 3 fewer instructions.
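For readers unfamiliar with the LKK (Lemire, Kaser, Kurz) scheme mentioned in the subject line, here is a minimal scalar sketch of the published 32-bit unsigned remainder and divisibility formulas. It is not the code D68360 emits; the function names are hypothetical, and it assumes a compiler that provides `unsigned __int128` (a GCC/Clang extension).

```cpp
#include <cassert>
#include <cstdint>

// Assumption: unsigned __int128 is available (GCC/Clang extension).
using U128 = unsigned __int128;

// Precompute c = ceil(2^64 / d) for a fixed divisor d != 0.
// For d == 1 the addition wraps to 0, which still gives remainder 0 below.
static inline uint64_t lkkMagic(uint32_t d) {
  return UINT64_MAX / d + 1;
}

// n % d using two multiplications and a shift instead of a division.
static inline uint32_t lkkRemainder(uint32_t n, uint64_t c, uint32_t d) {
  const uint64_t LowBits = c * n; // wraps modulo 2^64 on purpose
  return static_cast<uint32_t>((static_cast<U128>(LowBits) * d) >> 64);
}

// True when d divides n, using a single multiplication and a compare.
static inline bool lkkIsDivisible(uint32_t n, uint64_t c) {
  return n * c <= c - 1;
}
```

The magic constant depends only on the divisor, so in a loop with a loop-invariant divisor it can be hoisted out and each iteration pays only for the multiplications.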
CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68360/new/

https://reviews.llvm.org/D68360

From llvm-commits at lists.llvm.org Wed Oct 9 08:09:29 2019
From: llvm-commits at lists.llvm.org (Kostya Kortchinsky via llvm-commits)
Date: Wed, 09 Oct 2019 15:09:29 -0000
Subject: [compiler-rt] r374173 - [scudo][standalone] Get statistics in a char buffer
Message-ID: <20191009150929.1F63A87C14@lists.llvm.org>

Author: cryptoad
Date: Wed Oct 9 08:09:28 2019
New Revision: 374173

URL: http://llvm.org/viewvc/llvm-project?rev=374173&view=rev
Log:
[scudo][standalone] Get statistics in a char buffer

Summary:
Following up on D68471, this CL introduces some `getStats` APIs to
gather statistics in char buffers (`ScopedString` really) instead of
printing them out right away. Ultimately `printStats` will just
output the buffer, but that allows us to potentially do some work
on the intermediate buffer, and can be used for a `mallocz` type
of functionality. This allows us to pretty much get rid of all the
`Printf` calls around, but I am keeping the function in for
debugging purposes.

This changes the existing tests to use the new APIs when required.

I will add new tests as suggested in D68471 in another CL.
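As an illustration of the buffer-sizing contract this commit describes for `getStats(char *Buffer, uptr Size)` (it returns the number of bytes required, which may exceed the size provided, and it accepts a null buffer or zero size), here is a self-contained sketch of how a caller might use such an API. `fakeGetStats` and `collectStats` are hypothetical stand-ins, not Scudo functions; the report is passed in as a string only to keep the example standalone.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for the allocator side: copies at most Size bytes of
// the report (always NUL-terminated) and returns the bytes the full report
// needs, terminator included.
static size_t fakeGetStats(const std::string &Report, char *Buffer,
                           size_t Size) {
  const size_t Length = Report.size() + 1;
  if (Length < Size)
    Size = Length;
  if (Buffer && Size) {
    std::memcpy(Buffer, Report.c_str(), Size);
    Buffer[Size - 1] = '\0';
  }
  return Length;
}

// Caller side: probe for the required size with a null buffer, then retry
// while the report turns out to be larger than the buffer.
static std::string collectStats(const std::string &Report) {
  size_t Needed = fakeGetStats(Report, nullptr, 0);
  std::vector<char> Buffer;
  do {
    Buffer.resize(Needed);
    Needed = fakeGetStats(Report, Buffer.data(), Buffer.size());
  } while (Needed > Buffer.size());
  return std::string(Buffer.data());
}
```

The retry loop matters because, per the comment in the patch, the statistics are not necessarily constant between calls, so a size obtained from one call can be stale by the next.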
Reviewers: morehouse, hctim, vitalybuka, eugenis, cferris Reviewed By: morehouse Subscribers: delcypher, #sanitizers, llvm-commits Tags: #llvm, #sanitizers Differential Revision: https://reviews.llvm.org/D68653 Modified: compiler-rt/trunk/lib/scudo/standalone/combined.h compiler-rt/trunk/lib/scudo/standalone/crc32_hw.cpp compiler-rt/trunk/lib/scudo/standalone/primary32.h compiler-rt/trunk/lib/scudo/standalone/primary64.h compiler-rt/trunk/lib/scudo/standalone/quarantine.h compiler-rt/trunk/lib/scudo/standalone/secondary.cpp compiler-rt/trunk/lib/scudo/standalone/secondary.h compiler-rt/trunk/lib/scudo/standalone/size_class_map.h compiler-rt/trunk/lib/scudo/standalone/string_utils.cpp compiler-rt/trunk/lib/scudo/standalone/string_utils.h compiler-rt/trunk/lib/scudo/standalone/tests/combined_test.cpp compiler-rt/trunk/lib/scudo/standalone/tests/primary_test.cpp compiler-rt/trunk/lib/scudo/standalone/tests/quarantine_test.cpp compiler-rt/trunk/lib/scudo/standalone/tests/secondary_test.cpp Modified: compiler-rt/trunk/lib/scudo/standalone/combined.h URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/scudo/standalone/combined.h?rev=374173&r1=374172&r2=374173&view=diff ============================================================================== --- compiler-rt/trunk/lib/scudo/standalone/combined.h (original) +++ compiler-rt/trunk/lib/scudo/standalone/combined.h Wed Oct 9 08:09:28 2019 @@ -369,12 +369,31 @@ public: Primary.enable(); } + // The function returns the amount of bytes required to store the statistics, + // which might be larger than the amount of bytes provided. Note that the + // statistics buffer is not necessarily constant between calls to this + // function. This can be called with a null buffer or zero size for buffer + // sizing purposes. 
+ uptr getStats(char *Buffer, uptr Size) { + ScopedString Str(1024); + disable(); + const uptr Length = getStats(&Str) + 1; + enable(); + if (Length < Size) + Size = Length; + if (Buffer && Size) { + memcpy(Buffer, Str.data(), Size); + Buffer[Size - 1] = '\0'; + } + return Length; + } + void printStats() { + ScopedString Str(1024); disable(); - Primary.printStats(); - Secondary.printStats(); - Quarantine.printStats(); + getStats(&Str); enable(); + Str.output(); } void releaseToOS() { Primary.releaseToOS(); } @@ -563,6 +582,13 @@ private: *Size = getSize(Ptr, &Header); return P; } + + uptr getStats(ScopedString *Str) { + Primary.getStats(Str); + Secondary.getStats(Str); + Quarantine.getStats(Str); + return Str->length(); + } }; } // namespace scudo Modified: compiler-rt/trunk/lib/scudo/standalone/crc32_hw.cpp URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/scudo/standalone/crc32_hw.cpp?rev=374173&r1=374172&r2=374173&view=diff ============================================================================== --- compiler-rt/trunk/lib/scudo/standalone/crc32_hw.cpp (original) +++ compiler-rt/trunk/lib/scudo/standalone/crc32_hw.cpp Wed Oct 9 08:09:28 2019 @@ -1,4 +1,4 @@ -//===-- crc32_hw.h ----------------------------------------------*- C++ -*-===// +//===-- crc32_hw.cpp --------------------------------------------*- C++ -*-===// // // Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. // See https://llvm.org/LICENSE.txt for license information. 
Modified: compiler-rt/trunk/lib/scudo/standalone/primary32.h URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/scudo/standalone/primary32.h?rev=374173&r1=374172&r2=374173&view=diff ============================================================================== --- compiler-rt/trunk/lib/scudo/standalone/primary32.h (original) +++ compiler-rt/trunk/lib/scudo/standalone/primary32.h Wed Oct 9 08:09:28 2019 @@ -143,7 +143,7 @@ public: } } - void printStats() { + void getStats(ScopedString *Str) { // TODO(kostyak): get the RSS per region. uptr TotalMapped = 0; uptr PoppedBlocks = 0; @@ -154,11 +154,11 @@ public: PoppedBlocks += Sci->Stats.PoppedBlocks; PushedBlocks += Sci->Stats.PushedBlocks; } - Printf("Stats: SizeClassAllocator32: %zuM mapped in %zu allocations; " - "remains %zu\n", - TotalMapped >> 20, PoppedBlocks, PoppedBlocks - PushedBlocks); + Str->append("Stats: SizeClassAllocator32: %zuM mapped in %zu allocations; " + "remains %zu\n", + TotalMapped >> 20, PoppedBlocks, PoppedBlocks - PushedBlocks); for (uptr I = 0; I < NumClasses; I++) - printStats(I, 0); + getStats(Str, I, 0); } uptr releaseToOS() { @@ -328,17 +328,17 @@ private: return B; } - void printStats(uptr ClassId, uptr Rss) { + void getStats(ScopedString *Str, uptr ClassId, uptr Rss) { SizeClassInfo *Sci = getSizeClassInfo(ClassId); if (Sci->AllocatedUser == 0) return; const uptr InUse = Sci->Stats.PoppedBlocks - Sci->Stats.PushedBlocks; const uptr AvailableChunks = Sci->AllocatedUser / getSizeByClassId(ClassId); - Printf(" %02zu (%6zu): mapped: %6zuK popped: %7zu pushed: %7zu inuse: %6zu" - " avail: %6zu rss: %6zuK\n", - ClassId, getSizeByClassId(ClassId), Sci->AllocatedUser >> 10, - Sci->Stats.PoppedBlocks, Sci->Stats.PushedBlocks, InUse, - AvailableChunks, Rss >> 10); + Str->append(" %02zu (%6zu): mapped: %6zuK popped: %7zu pushed: %7zu " + "inuse: %6zu avail: %6zu rss: %6zuK\n", + ClassId, getSizeByClassId(ClassId), Sci->AllocatedUser >> 10, + Sci->Stats.PoppedBlocks, 
Sci->Stats.PushedBlocks, InUse, + AvailableChunks, Rss >> 10); } NOINLINE uptr releaseToOSMaybe(SizeClassInfo *Sci, uptr ClassId, Modified: compiler-rt/trunk/lib/scudo/standalone/primary64.h URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/scudo/standalone/primary64.h?rev=374173&r1=374172&r2=374173&view=diff ============================================================================== --- compiler-rt/trunk/lib/scudo/standalone/primary64.h (original) +++ compiler-rt/trunk/lib/scudo/standalone/primary64.h Wed Oct 9 08:09:28 2019 @@ -147,7 +147,7 @@ public: } } - void printStats() const { + void getStats(ScopedString *Str) const { // TODO(kostyak): get the RSS per region. uptr TotalMapped = 0; uptr PoppedBlocks = 0; @@ -159,12 +159,13 @@ public: PoppedBlocks += Region->Stats.PoppedBlocks; PushedBlocks += Region->Stats.PushedBlocks; } - Printf("Stats: Primary64: %zuM mapped (%zuM rss) in %zu allocations; " - "remains %zu\n", - TotalMapped >> 20, 0, PoppedBlocks, PoppedBlocks - PushedBlocks); + Str->append("Stats: SizeClassAllocator64: %zuM mapped (%zuM rss) in %zu " + "allocations; remains %zu\n", + TotalMapped >> 20, 0, PoppedBlocks, + PoppedBlocks - PushedBlocks); for (uptr I = 0; I < NumClasses; I++) - printStats(I, 0); + getStats(Str, I, 0); } uptr releaseToOS() { @@ -269,10 +270,12 @@ private: if (UNLIKELY(RegionBase + MappedUser + UserMapSize > RegionSize)) { if (!Region->Exhausted) { Region->Exhausted = true; - printStats(); - Printf( + ScopedString Str(1024); + getStats(&Str); + Str.append( "Scudo OOM: The process has Exhausted %zuM for size class %zu.\n", RegionSize >> 20, Size); + Str.output(); } return nullptr; } @@ -322,21 +325,21 @@ private: return B; } - void printStats(uptr ClassId, uptr Rss) const { + void getStats(ScopedString *Str, uptr ClassId, uptr Rss) const { RegionInfo *Region = getRegionInfo(ClassId); if (Region->MappedUser == 0) return; const uptr InUse = Region->Stats.PoppedBlocks - Region->Stats.PushedBlocks; const uptr 
TotalChunks = Region->AllocatedUser / getSizeByClassId(ClassId); - Printf("%s %02zu (%6zu): mapped: %6zuK popped: %7zu pushed: %7zu inuse: " - "%6zu total: %6zu rss: %6zuK releases: %6zu last released: %6zuK " - "region: 0x%zx (0x%zx)\n", - Region->Exhausted ? "F" : " ", ClassId, getSizeByClassId(ClassId), - Region->MappedUser >> 10, Region->Stats.PoppedBlocks, - Region->Stats.PushedBlocks, InUse, TotalChunks, Rss >> 10, - Region->ReleaseInfo.RangesReleased, - Region->ReleaseInfo.LastReleasedBytes >> 10, Region->RegionBeg, - getRegionBaseByClassId(ClassId)); + Str->append("%s %02zu (%6zu): mapped: %6zuK popped: %7zu pushed: %7zu " + "inuse: %6zu total: %6zu rss: %6zuK releases: %6zu last " + "released: %6zuK region: 0x%zx (0x%zx)\n", + Region->Exhausted ? "F" : " ", ClassId, + getSizeByClassId(ClassId), Region->MappedUser >> 10, + Region->Stats.PoppedBlocks, Region->Stats.PushedBlocks, InUse, + TotalChunks, Rss >> 10, Region->ReleaseInfo.RangesReleased, + Region->ReleaseInfo.LastReleasedBytes >> 10, Region->RegionBeg, + getRegionBaseByClassId(ClassId)); } NOINLINE uptr releaseToOSMaybe(RegionInfo *Region, uptr ClassId, Modified: compiler-rt/trunk/lib/scudo/standalone/quarantine.h URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/scudo/standalone/quarantine.h?rev=374173&r1=374172&r2=374173&view=diff ============================================================================== --- compiler-rt/trunk/lib/scudo/standalone/quarantine.h (original) +++ compiler-rt/trunk/lib/scudo/standalone/quarantine.h Wed Oct 9 08:09:28 2019 @@ -130,7 +130,7 @@ public: subFromSize(ExtractedSize); } - void printStats() const { + void getStats(ScopedString *Str) const { uptr BatchCount = 0; uptr TotalOverheadBytes = 0; uptr TotalBytes = 0; @@ -152,11 +152,11 @@ public: (TotalQuarantinedBytes == 0) ? 
0 : TotalOverheadBytes * 100 / TotalQuarantinedBytes; - Printf("Global quarantine stats: batches: %zu; bytes: %zu (user: %zu); " - "chunks: %zu (capacity: %zu); %zu%% chunks used; %zu%% memory " - "overhead\n", - BatchCount, TotalBytes, TotalQuarantinedBytes, TotalQuarantineChunks, - QuarantineChunksCapacity, ChunksUsagePercent, MemoryOverheadPercent); + Str->append( + "Stats: Quarantine: batches: %zu; bytes: %zu (user: %zu); chunks: %zu " + "(capacity: %zu); %zu%% chunks used; %zu%% memory overhead\n", + BatchCount, TotalBytes, TotalQuarantinedBytes, TotalQuarantineChunks, + QuarantineChunksCapacity, ChunksUsagePercent, MemoryOverheadPercent); } private: @@ -218,11 +218,11 @@ public: recycle(0, Cb); } - void printStats() const { + void getStats(ScopedString *Str) const { // It assumes that the world is stopped, just as the allocator's printStats. - Printf("Quarantine limits: global: %zuM; thread local: %zuK\n", - getMaxSize() >> 20, getCacheSize() >> 10); - Cache.printStats(); + Cache.getStats(Str); + Str->append("Quarantine limits: global: %zuK; thread local: %zuK\n", + getMaxSize() >> 10, getCacheSize() >> 10); } private: Modified: compiler-rt/trunk/lib/scudo/standalone/secondary.cpp URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/scudo/standalone/secondary.cpp?rev=374173&r1=374172&r2=374173&view=diff ============================================================================== --- compiler-rt/trunk/lib/scudo/standalone/secondary.cpp (original) +++ compiler-rt/trunk/lib/scudo/standalone/secondary.cpp Wed Oct 9 08:09:28 2019 @@ -123,12 +123,13 @@ void MapAllocator::deallocate(void *Ptr) unmap(Addr, Size, UNMAP_ALL, &Data); } -void MapAllocator::printStats() const { - Printf("Stats: MapAllocator: allocated %zu times (%zuK), freed %zu times " - "(%zuK), remains %zu (%zuK) max %zuM\n", - NumberOfAllocs, AllocatedBytes >> 10, NumberOfFrees, FreedBytes >> 10, - NumberOfAllocs - NumberOfFrees, (AllocatedBytes - FreedBytes) >> 10, - LargestSize >> 20); 
+void MapAllocator::getStats(ScopedString *Str) const { + Str->append( + "Stats: MapAllocator: allocated %zu times (%zuK), freed %zu times " + "(%zuK), remains %zu (%zuK) max %zuM\n", + NumberOfAllocs, AllocatedBytes >> 10, NumberOfFrees, FreedBytes >> 10, + NumberOfAllocs - NumberOfFrees, (AllocatedBytes - FreedBytes) >> 10, + LargestSize >> 20); } } // namespace scudo Modified: compiler-rt/trunk/lib/scudo/standalone/secondary.h URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/scudo/standalone/secondary.h?rev=374173&r1=374172&r2=374173&view=diff ============================================================================== --- compiler-rt/trunk/lib/scudo/standalone/secondary.h (original) +++ compiler-rt/trunk/lib/scudo/standalone/secondary.h Wed Oct 9 08:09:28 2019 @@ -12,6 +12,7 @@ #include "common.h" #include "mutex.h" #include "stats.h" +#include "string_utils.h" namespace scudo { @@ -70,7 +71,7 @@ public: return getBlockEnd(Ptr) - reinterpret_cast(Ptr); } - void printStats() const; + void getStats(ScopedString *Str) const; void disable() { Mutex.lock(); } Modified: compiler-rt/trunk/lib/scudo/standalone/size_class_map.h URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/scudo/standalone/size_class_map.h?rev=374173&r1=374172&r2=374173&view=diff ============================================================================== --- compiler-rt/trunk/lib/scudo/standalone/size_class_map.h (original) +++ compiler-rt/trunk/lib/scudo/standalone/size_class_map.h Wed Oct 9 08:09:28 2019 @@ -86,6 +86,7 @@ public: } static void print() { + ScopedString Buffer(1024); uptr PrevS = 0; uptr TotalCached = 0; for (uptr I = 0; I < NumClasses; I++) { @@ -93,19 +94,20 @@ public: continue; const uptr S = getSizeByClassId(I); if (S >= MidSize / 2 && (S & (S - 1)) == 0) - Printf("\n"); + Buffer.append("\n"); const uptr D = S - PrevS; const uptr P = PrevS ? (D * 100 / PrevS) : 0; const uptr L = S ? 
getMostSignificantSetBitIndex(S) : 0;
       const uptr Cached = getMaxCachedHint(S) * S;
-      Printf(
+      Buffer.append(
           "C%02zu => S: %zu diff: +%zu %02zu%% L %zu Cached: %zu %zu; id %zu\n",
           I, getSizeByClassId(I), D, P, L, getMaxCachedHint(S), Cached,
           getClassIdBySize(S));
       TotalCached += Cached;
       PrevS = S;
     }
-    Printf("Total Cached: %zu\n", TotalCached);
+    Buffer.append("Total Cached: %zu\n", TotalCached);
+    Buffer.output();
   }

   static void validate() {

Modified: compiler-rt/trunk/lib/scudo/standalone/string_utils.cpp
URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/scudo/standalone/string_utils.cpp?rev=374173&r1=374172&r2=374173&view=diff
==============================================================================
--- compiler-rt/trunk/lib/scudo/standalone/string_utils.cpp (original)
+++ compiler-rt/trunk/lib/scudo/standalone/string_utils.cpp Wed Oct 9 08:09:28 2019
@@ -208,9 +208,18 @@ int formatString(char *Buffer, uptr Buff
 }

 void ScopedString::append(const char *Format, va_list Args) {
-  CHECK_LT(Length, String.size());
-  formatString(String.data() + Length, String.size() - Length, Format, Args);
-  Length += strlen(String.data() + Length);
+  DCHECK_LT(Length, String.size());
+  va_list ArgsCopy;
+  va_copy(ArgsCopy, Args);
+  // formatString doesn't currently support a null buffer or zero buffer length,
+  // so in order to get the resulting formatted string length, we use a one-char
+  // buffer.
+  char C[1];
+  const uptr AdditionalLength =
+      static_cast<uptr>(formatString(C, sizeof(C), Format, Args)) + 1;
+  String.resize(Length + AdditionalLength);
+  formatString(String.data() + Length, AdditionalLength, Format, ArgsCopy);
+  Length = strlen(String.data());
   CHECK_LT(Length, String.size());
 }

@@ -226,7 +235,7 @@ FORMAT(1, 2)
 void Printf(const char *Format, ...) {
   va_list Args;
   va_start(Args, Format);
-  ScopedString Msg(512);
+  ScopedString Msg(1024);
   Msg.append(Format, Args);
   outputRaw(Msg.data());
   va_end(Args);

Modified: compiler-rt/trunk/lib/scudo/standalone/string_utils.h
URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/scudo/standalone/string_utils.h?rev=374173&r1=374172&r2=374173&view=diff
==============================================================================
--- compiler-rt/trunk/lib/scudo/standalone/string_utils.h (original)
+++ compiler-rt/trunk/lib/scudo/standalone/string_utils.h Wed Oct 9 08:09:28 2019
@@ -29,6 +29,7 @@ public:
   }
   void append(const char *Format, va_list Args);
   void append(const char *Format, ...);
+  void output() const { outputRaw(String.data()); }

 private:
   Vector<char> String;

Modified: compiler-rt/trunk/lib/scudo/standalone/tests/combined_test.cpp
URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/scudo/standalone/tests/combined_test.cpp?rev=374173&r1=374172&r2=374173&view=diff
==============================================================================
--- compiler-rt/trunk/lib/scudo/standalone/tests/combined_test.cpp (original)
+++ compiler-rt/trunk/lib/scudo/standalone/tests/combined_test.cpp Wed Oct 9 08:09:28 2019
@@ -136,7 +136,21 @@ template static void test
   }

   Allocator->releaseToOS();
-  Allocator->printStats();
+
+  scudo::uptr BufferSize = 8192;
+  std::vector<char> Buffer(BufferSize);
+  scudo::uptr ActualSize = Allocator->getStats(Buffer.data(), BufferSize);
+  while (ActualSize > BufferSize) {
+    BufferSize = ActualSize + 1024;
+    Buffer.resize(BufferSize);
+    ActualSize = Allocator->getStats(Buffer.data(), BufferSize);
+  }
+  std::string Stats(Buffer.begin(), Buffer.end());
+  // Basic checks on the contents of the statistics output, which also allows us
+  // to verify that we got it all.
+ EXPECT_NE(Stats.find("Stats: SizeClassAllocator"), std::string::npos); + EXPECT_NE(Stats.find("Stats: MapAllocator"), std::string::npos); + EXPECT_NE(Stats.find("Stats: Quarantine"), std::string::npos); } TEST(ScudoCombinedTest, BasicCombined) { Modified: compiler-rt/trunk/lib/scudo/standalone/tests/primary_test.cpp URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/scudo/standalone/tests/primary_test.cpp?rev=374173&r1=374172&r2=374173&view=diff ============================================================================== --- compiler-rt/trunk/lib/scudo/standalone/tests/primary_test.cpp (original) +++ compiler-rt/trunk/lib/scudo/standalone/tests/primary_test.cpp Wed Oct 9 08:09:28 2019 @@ -46,7 +46,9 @@ template static void } Cache.destroy(nullptr); Allocator->releaseToOS(); - Allocator->printStats(); + scudo::ScopedString Str(1024); + Allocator->getStats(&Str); + Str.output(); } TEST(ScudoPrimaryTest, BasicPrimary) { @@ -86,7 +88,9 @@ TEST(ScudoPrimaryTest, Primary64OOM) { } Cache.destroy(nullptr); Allocator.releaseToOS(); - Allocator.printStats(); + scudo::ScopedString Str(1024); + Allocator.getStats(&Str); + Str.output(); EXPECT_EQ(AllocationFailed, true); Allocator.unmapTestOnly(); } @@ -125,7 +129,9 @@ template static void } Cache.destroy(nullptr); Allocator->releaseToOS(); - Allocator->printStats(); + scudo::ScopedString Str(1024); + Allocator->getStats(&Str); + Str.output(); } TEST(ScudoPrimaryTest, PrimaryIterate) { @@ -180,7 +186,9 @@ template static void for (auto &T : Threads) T.join(); Allocator->releaseToOS(); - Allocator->printStats(); + scudo::ScopedString Str(1024); + Allocator->getStats(&Str); + Str.output(); } TEST(ScudoPrimaryTest, PrimaryThreaded) { @@ -203,8 +211,7 @@ template static void Cache.init(nullptr, Allocator.get()); const scudo::uptr Size = scudo::getPageSizeCached() * 2; EXPECT_TRUE(Primary::canAllocate(Size)); - const scudo::uptr ClassId = - Primary::SizeClassMap::getClassIdBySize(Size); + const scudo::uptr ClassId = 
Primary::SizeClassMap::getClassIdBySize(Size); void *P = Cache.allocate(ClassId); EXPECT_NE(P, nullptr); Cache.deallocate(ClassId, P); Modified: compiler-rt/trunk/lib/scudo/standalone/tests/quarantine_test.cpp URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/scudo/standalone/tests/quarantine_test.cpp?rev=374173&r1=374172&r2=374173&view=diff ============================================================================== --- compiler-rt/trunk/lib/scudo/standalone/tests/quarantine_test.cpp (original) +++ compiler-rt/trunk/lib/scudo/standalone/tests/quarantine_test.cpp Wed Oct 9 08:09:28 2019 @@ -213,7 +213,9 @@ TEST(ScudoQuarantineTest, GlobalQuaranti Quarantine.drainAndRecycle(&Cache, Cb); EXPECT_EQ(Cache.getSize(), 0UL); - Quarantine.printStats(); + scudo::ScopedString Str(1024); + Quarantine.getStats(&Str); + Str.output(); } void *populateQuarantine(void *Param) { @@ -236,5 +238,7 @@ TEST(ScudoQuarantineTest, ThreadedGlobal for (scudo::uptr I = 0; I < NumberOfThreads; I++) pthread_join(T[I], 0); - Quarantine.printStats(); + scudo::ScopedString Str(1024); + Quarantine.getStats(&Str); + Str.output(); } Modified: compiler-rt/trunk/lib/scudo/standalone/tests/secondary_test.cpp URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/scudo/standalone/tests/secondary_test.cpp?rev=374173&r1=374172&r2=374173&view=diff ============================================================================== --- compiler-rt/trunk/lib/scudo/standalone/tests/secondary_test.cpp (original) +++ compiler-rt/trunk/lib/scudo/standalone/tests/secondary_test.cpp Wed Oct 9 08:09:28 2019 @@ -45,7 +45,9 @@ TEST(ScudoSecondaryTest, SecondaryBasic) L->deallocate(V.back()); V.pop_back(); } - L->printStats(); + scudo::ScopedString Str(1024); + L->getStats(&Str); + Str.output(); } // This exercises a variety of combinations of size and alignment for the @@ -76,7 +78,9 @@ TEST(ScudoSecondaryTest, SecondaryCombin } } } - L->printStats(); + scudo::ScopedString Str(1024); + 
L->getStats(&Str); + Str.output(); } TEST(ScudoSecondaryTest, SecondaryIterate) { @@ -97,7 +101,9 @@ TEST(ScudoSecondaryTest, SecondaryIterat L->deallocate(V.back()); V.pop_back(); } - L->printStats(); + scudo::ScopedString Str(1024); + L->getStats(&Str); + Str.output(); } static std::mutex Mutex; @@ -133,5 +139,7 @@ TEST(ScudoSecondaryTest, SecondaryThread } for (auto &T : Threads) T.join(); - L->printStats(); + scudo::ScopedString Str(1024); + L->getStats(&Str); + Str.output(); } From llvm-commits at lists.llvm.org Wed Oct 9 08:07:53 2019 From: llvm-commits at lists.llvm.org (Sanjay Patel via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 15:07:53 +0000 (UTC) Subject: [PATCH] D68651: [InstCombine] Signed saturation patterns In-Reply-To: References: Message-ID: spatel added subscribers: xbolva00, craig.topper. spatel added a comment. In D68651#1701189 , @dmgreen wrote: > Yes. I was going off the prior art for adding sadd_sat and ssub_sat to instcombine. And from the cases this patch is matching, where we are otherwise extending to a higher type, the intrinsic seems to produce equal or better code in most of the cases I've tried now. So this wasn't just for vectorisation, although it does make things a lot simpler there. (On an arm specific note, we have a scalar qadd instruction that can be used, if we can sort out the "q" flag otherwise being visible from C). > > If the canonical form for one of these signed saturating adds/subs wasn't an intrinsic, what would it be? Going into a higher type is awkward for us because the i64 add is not legal, and so doesn't look like the kind of instruction that should be vectorised, plus in ISel we'd have to catch the lowering fairly early and do something special. I haven't looked at the patch in detail, but as author of at least part of the prior art cited here, I agree with the direction*. I also participated in some of the vector idioms discussions from a few years ago. 
There's overlap with the vector idiom problems, but as noted, these are generic (scalar too) math ops, so it's not exactly the same. We invested significantly in IR analysis and codegen for the math intrinsics, so that may have changed the thinking. I don't remember the sequence of events or if there was a dedicated llvm-dev thread for this, but the general idea is that if we have a generic intrinsic for the math and can easily invert the transform in the backend for targets/types that are not supported, try to canonicalize to the intrinsic.

https://bugs.llvm.org/show_bug.cgi?id=43580 may show another example where transforming to an intrinsic would help. cc @craig.topper @xbolva00

*We should have codegen tests in place for multiple targets/types, and it appears that is added with D68643.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68651/new/

https://reviews.llvm.org/D68651

From llvm-commits at lists.llvm.org Wed Oct 9 08:17:04 2019
From: llvm-commits at lists.llvm.org (Sanjay Patel via Phabricator via llvm-commits)
Date: Wed, 09 Oct 2019 15:17:04 +0000 (UTC)
Subject: [PATCH] D68706: [InstCombine] don't assume 'inbounds' for bitcast deref or null pointer in non-default address space
Message-ID:

spatel created this revision.
spatel added reviewers: lebedev.ri, nlopes, jdoerfert.
Herald added subscribers: hiraditya, mcrosier.
Herald added a project: LLVM.

Follow-up to D68244 to account for a corner case discussed in:
https://bugs.llvm.org/show_bug.cgi?id=43501

Add one more restriction: if the pointer is deref-or-null and in a non-default (non-zero) address space, we can't assume inbounds.
https://reviews.llvm.org/D68706 Files: llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp llvm/test/Transforms/InstCombine/load-bitcast-vec.ll Index: llvm/test/Transforms/InstCombine/load-bitcast-vec.ll =================================================================== --- llvm/test/Transforms/InstCombine/load-bitcast-vec.ll +++ llvm/test/Transforms/InstCombine/load-bitcast-vec.ll @@ -89,11 +89,11 @@ ret float %r } -; TODO: Is a null pointer inbounds in any address space? +; A null pointer can't be assumed inbounds in a non-default address space. define float @matching_scalar_smallest_deref_or_null_addrspace(<4 x float> addrspace(4)* dereferenceable_or_null(1) %p) { ; CHECK-LABEL: @matching_scalar_smallest_deref_or_null_addrspace( -; CHECK-NEXT: [[BC:%.*]] = getelementptr inbounds <4 x float>, <4 x float> addrspace(4)* [[P:%.*]], i64 0, i64 0 +; CHECK-NEXT: [[BC:%.*]] = getelementptr <4 x float>, <4 x float> addrspace(4)* [[P:%.*]], i64 0, i64 0 ; CHECK-NEXT: [[R:%.*]] = load float, float addrspace(4)* [[BC]], align 16 ; CHECK-NEXT: ret float [[R]] ; Index: llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp =================================================================== --- llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp +++ llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp @@ -2344,8 +2344,12 @@ // If the source pointer is dereferenceable, then assume it points to an // allocated object and apply "inbounds" to the GEP. bool CanBeNull; - if (Src->getPointerDereferenceableBytes(DL, CanBeNull)) - GEP->setIsInBounds(); + if (Src->getPointerDereferenceableBytes(DL, CanBeNull)) { + // In a non-default address space (not 0), a null pointer can not be + // assumed inbounds, so ignore that case (dereferenceable_or_null). + if (SrcPTy->getAddressSpace() == 0 || !CanBeNull) + GEP->setIsInBounds(); + } return GEP; } } -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68706.224060.patch Type: text/x-patch Size: 1824 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 08:17:04 2019 From: llvm-commits at lists.llvm.org (Steven Wan via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 15:17:04 +0000 (UTC) Subject: [PATCH] D68603: [sanitizer] Print SIGTRAP for corresponding signal In-Reply-To: References: Message-ID: <85dc54cfb4c11b23b1a7ce883b92c644@localhost.localdomain> stevewan added a comment. Hi @vitalybuka, This is causing LIT failures in `clang-s390x-linux`. Can you please take a look? Thanks! Steven Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68603/new/ https://reviews.llvm.org/D68603 From llvm-commits at lists.llvm.org Wed Oct 9 08:17:04 2019 From: llvm-commits at lists.llvm.org (Steven Wan via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 15:17:04 +0000 (UTC) Subject: [PATCH] D68604: [tsan] Don't delay SIGTRAP handler In-Reply-To: References: Message-ID: stevewan added a comment. Hi @vitalybuka, This is causing LIT failures in `clang-s390x-linux`. Can you please take a look? Thanks! Steven Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68604/new/ https://reviews.llvm.org/D68604 From llvm-commits at lists.llvm.org Wed Oct 9 08:17:08 2019 From: llvm-commits at lists.llvm.org (Kostya Kortchinsky via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 15:17:08 +0000 (UTC) Subject: [PATCH] D68653: [scudo][standalone] Get statistics in a char buffer In-Reply-To: References: Message-ID: <4d683745eb72dde42d206be2d3d9a61c@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rGf7b1489ffc51: [scudo][standalone] Get statistics in a char buffer (authored by cryptoad). 
Changed prior to commit: https://reviews.llvm.org/D68653?vs=224059&id=224061#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68653/new/ https://reviews.llvm.org/D68653 Files: compiler-rt/lib/scudo/standalone/combined.h compiler-rt/lib/scudo/standalone/crc32_hw.cpp compiler-rt/lib/scudo/standalone/primary32.h compiler-rt/lib/scudo/standalone/primary64.h compiler-rt/lib/scudo/standalone/quarantine.h compiler-rt/lib/scudo/standalone/secondary.cpp compiler-rt/lib/scudo/standalone/secondary.h compiler-rt/lib/scudo/standalone/size_class_map.h compiler-rt/lib/scudo/standalone/string_utils.cpp compiler-rt/lib/scudo/standalone/string_utils.h compiler-rt/lib/scudo/standalone/tests/combined_test.cpp compiler-rt/lib/scudo/standalone/tests/primary_test.cpp compiler-rt/lib/scudo/standalone/tests/quarantine_test.cpp compiler-rt/lib/scudo/standalone/tests/secondary_test.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D68653.224061.patch Type: text/x-patch Size: 18355 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 08:26:32 2019 From: llvm-commits at lists.llvm.org (=?utf-8?q?D=C3=A1vid_Bolvansk=C3=BD_via_Phabricator?= via llvm-commits) Date: Wed, 09 Oct 2019 15:26:32 +0000 (UTC) Subject: [PATCH] D65402: [Attributor][MustExec] Deduce dereferenceable and nonnull attribute using MustBeExecutedContextExplorer In-Reply-To: References: Message-ID: xbolva00 added a comment. dereferenceable attribute is not added to the arguments? 
define dso_local void @_Z3fooPaS_S_ii(i8* noalias nocapture writeonly %0, i8* noalias nocapture readnone %1, i8* noalias nocapture readonly %2, i32 %3, i32 %4) local_unnamed_addr #0 { tail call void @llvm.memcpy.p0i8.p0i8.i64(i8* noalias nocapture nonnull writeonly align 1 dereferenceable(16) %0, i8* noalias nocapture nonnull readonly align 1 dereferenceable(16) %2, i64 16, i1 false) #2 ret void } I would expect define dso_local void @_Z3fooPaS_S_ii(i8* noalias nocapture writeonly dereferenceable(16) %0, i8* noalias nocapture readnone %1, i8* noalias nocapture readonly dereferenceable(16) %2, i32 %3, i32 %4) local_unnamed_addr #0 { tail call void @llvm.memcpy.p0i8.p0i8.i64(i8* noalias nocapture nonnull writeonly align 1 dereferenceable(16) %0, i8* noalias nocapture nonnull readonly align 1 dereferenceable(16) %2, i64 16, i1 false) #2 ret void } https://godbolt.org/z/if9rle Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D65402/new/ https://reviews.llvm.org/D65402 From llvm-commits at lists.llvm.org Wed Oct 9 08:26:32 2019 From: llvm-commits at lists.llvm.org (Roman Lebedev via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 15:26:32 +0000 (UTC) Subject: [PATCH] D68706: [InstCombine] don't assume 'inbounds' for bitcast deref or null pointer in non-default address space In-Reply-To: References: Message-ID: <98adaac90df1e9753e4f7bab192adea5@localhost.localdomain> lebedev.ri added a comment. You want `llvm::NullPointerIsDefined()`, which also checks for `"null-pointer-is-valid"` attribute. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68706/new/ https://reviews.llvm.org/D68706 From llvm-commits at lists.llvm.org Wed Oct 9 08:26:32 2019 From: llvm-commits at lists.llvm.org (Ehsan Amiri via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 15:26:32 +0000 (UTC) Subject: [PATCH] D68476: [SVE][AArch64] Adding pattern matching for some SVE instructions. 
In-Reply-To: References: Message-ID: <41ccce08964bcb7a3fbb5d91ab92edb6@localhost.localdomain> amehsan accepted this revision. amehsan added a comment. This revision is now accepted and ready to land. This looks straightforward. So LGTM. Since this is @mgudim's first patch, I will commit it for him. But I will wait a bit more (before committing) just in case @huntergr has any comments. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68476/new/ https://reviews.llvm.org/D68476 From llvm-commits at lists.llvm.org Wed Oct 9 08:26:33 2019 From: llvm-commits at lists.llvm.org (Jeremy Morse via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 15:26:33 +0000 (UTC) Subject: [PATCH] D68708: [RFC] Adopt Dexter and use it to run debuginfo-tests Message-ID: jmorse created this revision. Herald added a project: LLVM. Herald added a subscriber: llvm-commits. This is a patch demonstrating the changes we'd like to make for an RFC I'm about to send to llvm-dev@, to use the Dexter tool for running debuginfo tests. Broadly, this patch: - Imports the Dexter codebase (written in Python) from https://github.com/snsystems/dexter into the debuginfo-tests directory - Converts a variety of old {ll,g}db tests to be runnable by Dexter and drops them in the debuginfo-tests/dexter-tests directory - Does similar for a bunch of CDB / dbgeng tests - Glues these things in so that all (the supported) tests run when one runs 'ninja check-debuginfo' or check-all. More context in the email, which I'll link here once it's sent.
You can also browse the tree at https://github.com/jmorse/llvm-project/tree/dexter-rfc Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D68708 Files: debuginfo-tests/README.txt debuginfo-tests/aggregate-indirect-arg.cpp debuginfo-tests/apple-accel.cpp debuginfo-tests/asan-blocks.c debuginfo-tests/asan-deque.cpp debuginfo-tests/asan.c debuginfo-tests/block_var.m debuginfo-tests/blocks.m debuginfo-tests/ctor.cpp debuginfo-tests/dbg-arg.c debuginfo-tests/dexter-tests/aggregate-indirect-arg.cpp debuginfo-tests/dexter-tests/asan-deque.cpp debuginfo-tests/dexter-tests/asan.c debuginfo-tests/dexter-tests/ctor.cpp debuginfo-tests/dexter-tests/dbg-arg.c debuginfo-tests/dexter-tests/global-constant.cpp debuginfo-tests/dexter-tests/hello.c debuginfo-tests/dexter-tests/inline-line-gap.cpp debuginfo-tests/dexter-tests/lit.local.cfg debuginfo-tests/dexter-tests/nrvo-string.cpp debuginfo-tests/dexter-tests/nrvo.cpp debuginfo-tests/dexter-tests/realigned-frame.cpp debuginfo-tests/dexter-tests/stack-var.c debuginfo-tests/dexter-tests/vla.c debuginfo-tests/dexter/.gitignore debuginfo-tests/dexter/Commands.md debuginfo-tests/dexter/LICENSE.txt debuginfo-tests/dexter/README.md debuginfo-tests/dexter/dex/__init__.py debuginfo-tests/dexter/dex/builder/Builder.py debuginfo-tests/dexter/dex/builder/ParserOptions.py debuginfo-tests/dexter/dex/builder/__init__.py debuginfo-tests/dexter/dex/builder/scripts/posix/clang-c.sh debuginfo-tests/dexter/dex/builder/scripts/posix/clang.sh debuginfo-tests/dexter/dex/builder/scripts/windows/clang-cl_vs2015.bat debuginfo-tests/dexter/dex/builder/scripts/windows/clang.bat debuginfo-tests/dexter/dex/command/CommandBase.py debuginfo-tests/dexter/dex/command/ParseCommand.py debuginfo-tests/dexter/dex/command/StepValueInfo.py debuginfo-tests/dexter/dex/command/__init__.py debuginfo-tests/dexter/dex/command/commands/DexExpectProgramState.py debuginfo-tests/dexter/dex/command/commands/DexExpectStepKind.py 
debuginfo-tests/dexter/dex/command/commands/DexExpectStepOrder.py debuginfo-tests/dexter/dex/command/commands/DexExpectWatchBase.py debuginfo-tests/dexter/dex/command/commands/DexExpectWatchType.py debuginfo-tests/dexter/dex/command/commands/DexExpectWatchValue.py debuginfo-tests/dexter/dex/command/commands/DexLabel.py debuginfo-tests/dexter/dex/command/commands/DexUnreachable.py debuginfo-tests/dexter/dex/command/commands/DexWatch.py debuginfo-tests/dexter/dex/debugger/DebuggerBase.py debuginfo-tests/dexter/dex/debugger/Debuggers.py debuginfo-tests/dexter/dex/debugger/__init__.py debuginfo-tests/dexter/dex/debugger/dbgeng/README.md debuginfo-tests/dexter/dex/debugger/dbgeng/__init__.py debuginfo-tests/dexter/dex/debugger/dbgeng/breakpoint.py debuginfo-tests/dexter/dex/debugger/dbgeng/client.py debuginfo-tests/dexter/dex/debugger/dbgeng/control.py debuginfo-tests/dexter/dex/debugger/dbgeng/dbgeng.py debuginfo-tests/dexter/dex/debugger/dbgeng/probe_process.py debuginfo-tests/dexter/dex/debugger/dbgeng/setup.py debuginfo-tests/dexter/dex/debugger/dbgeng/symbols.py debuginfo-tests/dexter/dex/debugger/dbgeng/symgroup.py debuginfo-tests/dexter/dex/debugger/dbgeng/sysobjs.py debuginfo-tests/dexter/dex/debugger/dbgeng/utils.py debuginfo-tests/dexter/dex/debugger/lldb/LLDB.py debuginfo-tests/dexter/dex/debugger/lldb/__init__.py debuginfo-tests/dexter/dex/debugger/visualstudio/VisualStudio.py debuginfo-tests/dexter/dex/debugger/visualstudio/VisualStudio2015.py debuginfo-tests/dexter/dex/debugger/visualstudio/VisualStudio2017.py debuginfo-tests/dexter/dex/debugger/visualstudio/__init__.py debuginfo-tests/dexter/dex/debugger/visualstudio/windows/ComInterface.py debuginfo-tests/dexter/dex/debugger/visualstudio/windows/__init__.py debuginfo-tests/dexter/dex/dextIR/BuilderIR.py debuginfo-tests/dexter/dex/dextIR/DebuggerIR.py debuginfo-tests/dexter/dex/dextIR/DextIR.py debuginfo-tests/dexter/dex/dextIR/FrameIR.py debuginfo-tests/dexter/dex/dextIR/LocIR.py 
debuginfo-tests/dexter/dex/dextIR/ProgramState.py debuginfo-tests/dexter/dex/dextIR/StepIR.py debuginfo-tests/dexter/dex/dextIR/ValueIR.py debuginfo-tests/dexter/dex/dextIR/__init__.py debuginfo-tests/dexter/dex/heuristic/Heuristic.py debuginfo-tests/dexter/dex/heuristic/__init__.py debuginfo-tests/dexter/dex/tools/Main.py debuginfo-tests/dexter/dex/tools/TestToolBase.py debuginfo-tests/dexter/dex/tools/ToolBase.py debuginfo-tests/dexter/dex/tools/__init__.py debuginfo-tests/dexter/dex/tools/clang_opt_bisect/Tool.py debuginfo-tests/dexter/dex/tools/clang_opt_bisect/__init__.py debuginfo-tests/dexter/dex/tools/help/Tool.py debuginfo-tests/dexter/dex/tools/help/__init__.py debuginfo-tests/dexter/dex/tools/list_debuggers/Tool.py debuginfo-tests/dexter/dex/tools/list_debuggers/__init__.py debuginfo-tests/dexter/dex/tools/no_tool_/Tool.py debuginfo-tests/dexter/dex/tools/no_tool_/__init__.py debuginfo-tests/dexter/dex/tools/run_debugger_internal_/Tool.py debuginfo-tests/dexter/dex/tools/run_debugger_internal_/__init__.py debuginfo-tests/dexter/dex/tools/test/Tool.py debuginfo-tests/dexter/dex/tools/test/__init__.py debuginfo-tests/dexter/dex/tools/view/Tool.py debuginfo-tests/dexter/dex/tools/view/__init__.py debuginfo-tests/dexter/dex/utils/Environment.py debuginfo-tests/dexter/dex/utils/Exceptions.py debuginfo-tests/dexter/dex/utils/ExtArgParse.py debuginfo-tests/dexter/dex/utils/PrettyOutputBase.py debuginfo-tests/dexter/dex/utils/ReturnCode.py debuginfo-tests/dexter/dex/utils/RootDirectory.py debuginfo-tests/dexter/dex/utils/Timer.py debuginfo-tests/dexter/dex/utils/UnitTests.py debuginfo-tests/dexter/dex/utils/Version.py debuginfo-tests/dexter/dex/utils/Warning.py debuginfo-tests/dexter/dex/utils/WorkingDirectory.py debuginfo-tests/dexter/dex/utils/__init__.py debuginfo-tests/dexter/dex/utils/posix/PrettyOutput.py debuginfo-tests/dexter/dex/utils/posix/__init__.py debuginfo-tests/dexter/dex/utils/windows/PrettyOutput.py 
debuginfo-tests/dexter/dex/utils/windows/__init__.py debuginfo-tests/dexter/dexter.py debuginfo-tests/dexter/feature_tests/Readme.md debuginfo-tests/dexter/feature_tests/commands/penalty/expect_program_state.cpp debuginfo-tests/dexter/feature_tests/commands/penalty/expect_step_kinds.cpp debuginfo-tests/dexter/feature_tests/commands/penalty/expect_step_order.cpp debuginfo-tests/dexter/feature_tests/commands/penalty/expect_watch_type.cpp debuginfo-tests/dexter/feature_tests/commands/penalty/expect_watch_value.cpp debuginfo-tests/dexter/feature_tests/commands/penalty/unreachable.cpp debuginfo-tests/dexter/feature_tests/commands/perfect/expect_program_state.cpp debuginfo-tests/dexter/feature_tests/commands/perfect/expect_step_kind/direction.cpp debuginfo-tests/dexter/feature_tests/commands/perfect/expect_step_kind/func.cpp debuginfo-tests/dexter/feature_tests/commands/perfect/expect_step_kind/func_external.cpp debuginfo-tests/dexter/feature_tests/commands/perfect/expect_step_kind/recursive.cpp debuginfo-tests/dexter/feature_tests/commands/perfect/expect_step_kind/small_loop.cpp debuginfo-tests/dexter/feature_tests/commands/perfect/expect_step_order.cpp debuginfo-tests/dexter/feature_tests/commands/perfect/expect_watch_type.cpp debuginfo-tests/dexter/feature_tests/commands/perfect/expect_watch_value.cpp debuginfo-tests/dexter/feature_tests/commands/perfect/unreachable.cpp debuginfo-tests/dexter/feature_tests/lit.local.cfg debuginfo-tests/dexter/feature_tests/subtools/clang-opt-bisect/clang-opt-bisect.cpp debuginfo-tests/dexter/feature_tests/subtools/help/help.test debuginfo-tests/dexter/feature_tests/subtools/list-debuggers/list-debuggers.test debuginfo-tests/dexter/feature_tests/subtools/test/err_paren.cpp debuginfo-tests/dexter/feature_tests/subtools/test/err_paren_mline.cpp debuginfo-tests/dexter/feature_tests/subtools/test/err_syntax.cpp debuginfo-tests/dexter/feature_tests/subtools/test/err_syntax_mline.cpp 
debuginfo-tests/dexter/feature_tests/subtools/test/err_type.cpp debuginfo-tests/dexter/feature_tests/subtools/test/err_type_mline.cpp debuginfo-tests/dexter/feature_tests/subtools/view.cpp debuginfo-tests/dexter/feature_tests/unittests/run.test debuginfo-tests/foreach.m debuginfo-tests/forward-declare-class.cpp debuginfo-tests/lit.cfg.py debuginfo-tests/lit.local.cfg debuginfo-tests/lit.site.cfg.py.in debuginfo-tests/llgdb-tests/apple-accel.cpp debuginfo-tests/llgdb-tests/asan-blocks.c debuginfo-tests/llgdb-tests/asan-deque.cpp debuginfo-tests/llgdb-tests/asan.c debuginfo-tests/llgdb-tests/block_var.m debuginfo-tests/llgdb-tests/blocks.m debuginfo-tests/llgdb-tests/foreach.m debuginfo-tests/llgdb-tests/forward-declare-class.cpp debuginfo-tests/llgdb-tests/lit.local.cfg debuginfo-tests/llgdb-tests/llgdb.py debuginfo-tests/llgdb-tests/nested-struct.cpp debuginfo-tests/llgdb-tests/nrvo-string.cpp debuginfo-tests/llgdb-tests/safestack.c debuginfo-tests/llgdb-tests/static-member-2.cpp debuginfo-tests/llgdb-tests/static-member.cpp debuginfo-tests/llgdb-tests/test_debuginfo.pl debuginfo-tests/llgdb.py debuginfo-tests/nested-struct.cpp debuginfo-tests/nrvo-string.cpp debuginfo-tests/safestack.c debuginfo-tests/sret.cpp debuginfo-tests/stack-var.c debuginfo-tests/static-member-2.cpp debuginfo-tests/static-member.cpp debuginfo-tests/test_debuginfo.pl debuginfo-tests/vla.c debuginfo-tests/win_cdb-tests/README.txt debuginfo-tests/win_cdb-tests/lit.local.cfg.py debuginfo-tests/win_cdb/README.txt debuginfo-tests/win_cdb/global-constant.cpp debuginfo-tests/win_cdb/hello.c debuginfo-tests/win_cdb/inline-line-gap.cpp debuginfo-tests/win_cdb/lit.local.cfg.py debuginfo-tests/win_cdb/nrvo.cpp debuginfo-tests/win_cdb/realigned-frame.cpp -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68708.224052.patch Type: text/x-patch Size: 406052 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 08:33:29 2019 From: llvm-commits at lists.llvm.org (Sam Elliott via llvm-commits) Date: Wed, 09 Oct 2019 15:33:29 -0000 Subject: [test-suite] r374176 - [test-suite] Add Architecture Detection for RISC-V Message-ID: <20191009153329.5EB4A8192C@lists.llvm.org> Author: lenary Date: Wed Oct 9 08:33:29 2019 New Revision: 374176 URL: http://llvm.org/viewvc/llvm-project?rev=374176&view=rev Log: [test-suite] Add Architecture Detection for RISC-V Summary: The LLVM test suite has its own way of detecting the system architecture. This adds support in that file for detecting RISC-V. This will eventually cause the ARCH variable to be populated. We use ARCH="riscv64" to identify 64-bit RISC-V, and ARCH="riscv32" to identify 32-bit RISC-V, so that attempting to detect "riscv" in the ARCH variable will match any version of RISC-V. Reviewers: asb, luismarques Reviewed By: luismarques Subscribers: mgorny, simoncook, kito-cheng, shiva0217, rogfer01, rkruppe, PkmX, s.egerton, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68698 Modified: test-suite/trunk/cmake/modules/DetectArchitecture.c Modified: test-suite/trunk/cmake/modules/DetectArchitecture.c URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/cmake/modules/DetectArchitecture.c?rev=374176&r1=374175&r2=374176&view=diff ============================================================================== --- test-suite/trunk/cmake/modules/DetectArchitecture.c (original) +++ test-suite/trunk/cmake/modules/DetectArchitecture.c Wed Oct 9 08:33:29 2019 @@ -8,6 +8,12 @@ const char *str = "ARCHITECTURE IS Alpha const char *str = "ARCHITECTURE IS Mips"; #elif defined(__powerpc__) || defined(__ppc__) || defined(__power__) const char *str = "ARCHITECTURE IS PowerPC"; +#elif defined(__riscv) +#if __riscv_xlen == 64 +const char *str = "ARCHITECTURE IS riscv64"; +#elif __riscv_xlen == 
32 +const char *str = "ARCHITECTURE IS riscv32"; +#endif #elif defined(__s390__) const char *str = "ARCHITECTURE IS SystemZ"; #elif defined(__sparc__) From llvm-commits at lists.llvm.org Wed Oct 9 08:35:39 2019 From: llvm-commits at lists.llvm.org (Nico Weber via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 15:35:39 +0000 (UTC) Subject: [PATCH] D68709: (not yet for review) win: Use cross-platform code in Parallel.h/.cpp Message-ID: thakis created this revision. Herald added subscribers: llvm-commits, hiraditya. Herald added a project: LLVM. XXX r287140 perf tests r246219 hints at issues (fixed in r296906 due to 2015+ req) https://reviews.llvm.org/D8348 added linux impl mar 2015 r179397 added the code in 2013, doesn't say why PR41198. https://reviews.llvm.org/D68709 Files: llvm/include/llvm/Support/Parallel.h llvm/lib/Support/Parallel.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D68709.224063.patch Type: text/x-patch Size: 2533 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 08:35:45 2019 From: llvm-commits at lists.llvm.org (Sam Elliott via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 15:35:45 +0000 (UTC) Subject: [PATCH] D68698: [test-suite] Add Architecture Detection for RISC-V In-Reply-To: References: Message-ID: This revision was automatically updated to reflect the committed changes. Closed by commit rT374176: [test-suite] Add Architecture Detection for RISC-V (authored by lenary, committed by ). 
Repository: rT test-suite CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68698/new/ https://reviews.llvm.org/D68698 Files: cmake/modules/DetectArchitecture.c Index: cmake/modules/DetectArchitecture.c =================================================================== --- cmake/modules/DetectArchitecture.c +++ cmake/modules/DetectArchitecture.c @@ -8,6 +8,12 @@ const char *str = "ARCHITECTURE IS Mips"; #elif defined(__powerpc__) || defined(__ppc__) || defined(__power__) const char *str = "ARCHITECTURE IS PowerPC"; +#elif defined(__riscv) +#if __riscv_xlen == 64 +const char *str = "ARCHITECTURE IS riscv64"; +#elif __riscv_xlen == 32 +const char *str = "ARCHITECTURE IS riscv32"; +#endif #elif defined(__s390__) const char *str = "ARCHITECTURE IS SystemZ"; #elif defined(__sparc__) -------------- next part -------------- A non-text attachment was scrubbed... Name: D68698.224064.patch Type: text/x-patch Size: 635 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 08:44:48 2019 From: llvm-commits at lists.llvm.org (Pavel Labath via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 15:44:48 +0000 (UTC) Subject: [PATCH] D68645: MinidumpYAML: Add support for the memory info list stream In-Reply-To: References: Message-ID: <238f5b331a983f6fda83113157a9c65c@localhost.localdomain> labath added inline comments. ================ Comment at: lib/ObjectYAML/MinidumpEmitter.cpp:166 + Header.SizeOfEntry = sizeof(minidump::MemoryInfo); + Header.NumberOfEntries = InfoList.Infos.size(); + File.allocateNewObject(Header); ---------------- grimar wrote: > Probably just > > ``` > minidump::MemoryInfoListHeader Header = { > (support::ulittle32_t)sizeof(minidump::MemoryInfoListHeader), > (support::ulittle32_t)sizeof(minidump::MemoryInfo), > (support::ulittle64_t)InfoList.Infos.size()}; > ``` > > ? > > Or perhaps it could have a constructor. The thought of a constructor has crossed my mind too. I've now added it. 
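A minimal sketch of the constructor idea discussed above, not the code that was committed. It assumes plain fixed-width integers in place of `support::ulittle32_t` / `support::ulittle64_t`, and the struct names `MemoryInfoSketch` and `MemoryInfoListHeaderSketch` are hypothetical stand-ins for the real `minidump` types; the `SizeOfHeader` field name is inferred from the first initializer in the quoted suggestion:

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for one memory-info entry; the real layout is
// defined in include/llvm/BinaryFormat/Minidump.h and is not reproduced here.
struct MemoryInfoSketch {
  uint64_t BaseAddress;
  uint64_t RegionSize;
};

// Hypothetical stand-in for the list header. The constructor fills in the
// two size fields itself, so callers only supply the entry count.
struct MemoryInfoListHeaderSketch {
  uint32_t SizeOfHeader;
  uint32_t SizeOfEntry;
  uint64_t NumberOfEntries;

  MemoryInfoListHeaderSketch() = default;
  explicit MemoryInfoListHeaderSketch(uint64_t NumEntries)
      : SizeOfHeader(sizeof(MemoryInfoListHeaderSketch)),
        SizeOfEntry(sizeof(MemoryInfoSketch)),
        NumberOfEntries(NumEntries) {}
};
```

With something along these lines, the emitter could presumably construct the header in one expression from `InfoList.Infos.size()` rather than assigning each field separately.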
Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68645/new/ https://reviews.llvm.org/D68645 From llvm-commits at lists.llvm.org Wed Oct 9 08:44:48 2019 From: llvm-commits at lists.llvm.org (Pavel Labath via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 15:44:48 +0000 (UTC) Subject: [PATCH] D68645: MinidumpYAML: Add support for the memory info list stream In-Reply-To: References: Message-ID: labath updated this revision to Diff 224065. labath marked 2 inline comments as done. labath added a comment. Address review comments Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68645/new/ https://reviews.llvm.org/D68645 Files: include/llvm/BinaryFormat/Minidump.h include/llvm/ObjectYAML/MinidumpYAML.h lib/ObjectYAML/MinidumpEmitter.cpp lib/ObjectYAML/MinidumpYAML.cpp test/tools/obj2yaml/basic-minidump.yaml -------------- next part -------------- A non-text attachment was scrubbed... Name: D68645.224065.patch Type: text/x-patch Size: 11783 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 08:44:50 2019 From: llvm-commits at lists.llvm.org (David Blaikie via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 15:44:50 +0000 (UTC) Subject: [PATCH] D68270: DWARFDebugLoc: Add a function to get the address range of an entry In-Reply-To: References: Message-ID: dblaikie added a comment. In D68270#1700108 , @probinson wrote: > Do we care whether llvm-dwarfdump's output bears any similarities to the output from GNU readelf or objdump? There has been a push lately to get the LLVM "binutils" to behave more like GNU's, although AFAIK it hasn't gotten to the DWARF dumping part. 
Generally I hope not to deal with that until there's a user with a need for it who wants to do the work & has a specific use-case that can help motivate which similarities are desirable and which ones don't matter (& perhaps if there's enough that they start to tradeoff usability - maybe the "compatibility mode" is a separate tool or separate flag to the existing tool). My broader hope is probably that llvm-dwarfdump is more for interactive uses than other dumpers, so fewer people might try to build automated things on top of it & thus expect specific output (this gives us both the freedom not to match the GNU tools, and the freedom not to match previous llvm-dwarfdump behavior (which we've done a fair bit in the past - which seems to support the theory that people don't seem to be building much on top of this)) ================ Comment at: lib/DebugInfo/DWARF/DWARFDebugLoc.cpp:291-295 + EntryIterator Absolute = + getAbsoluteLocations( + SectionedAddress{BaseAddr, SectionedAddress::UndefSection}, + LookupPooledAddress) + .begin(); ---------------- labath wrote: > dblaikie wrote: > > labath wrote: > > > dblaikie wrote: > > > > labath wrote: > > > > > This parallel iteration is not completely nice, but I think it's worth being able to reuse the absolute range computation code. I'm open to ideas for improvement though. > > > > Ah, I see - this is what you meant about "In particular it makes it possible to reuse this stuff in the dumping code, which would have been pretty hard with callbacks.". > > > > > > > > I'm wondering if that might be worth revisiting somewhat. A full iterator abstraction for one user here (well, two once you include lldb - but I assume it's likely going to build its own data structure from the iteration anyway, right? 
(it's not going to keep the iterator around, do anything interesting like partial iterations, re-iterate/etc - such that a callback would suffice)) > > > > > > > > I could imagine two callback APIs for this - one that gets entries and locations and one that only gets locations by filtering on the entry version. > > > > > > > > eg: > > > > > > > > // for non-verbose output: > > > > LL.forEachEntry([&](const Entry &E, Expected L) { > > > > if (Verbose && actually dumping debug_loc) > > > > print(E) // print any LLE_*, raw parameters, etc > > > > if (L) > > > > print(*L) // print the resulting address range, section name (if verbose), > > > > else > > > > print(error stuff) > > > > }); > > > > > > > > One question would be "when/where do we print the DWARF expression" - if there's an error computing the address range, we can still print the expression, so maybe that happens unconditionally at the end of the callback, using the expression in the Entry? (then, arguably, the expression doesn't need to be in the DWARFLocation - and I'd say make the DWARFLocation a sectioned range, exactly the same type as for ranges so that part of the dumping code, etc, can be maximally reused) > > > Actually, what lldb currently does is that it does not build any data structures at all (except storing the pointer to the right place in the debug_loc section. Then, whenever it wants to do something to the loclist, it parses it afresh. I don't know why it does this exactly, but I assume it has something to do with most locations never being used, or being only a couple of times, and the actual parsing being fairly fast. What this means is that lldb is not really a single "user", but there are like four or five places where it iterates through the list, depending on what does it actually want to do with it. It also does partial iteration where it stops as soon as it find the entry it was interested in. 
> > > Now, all of that is possible with a callback (though I am generally trying to avoid them), but it does resurface the issue of what should be the value of the second argument for DW_LLE_base_address entries (the thing which I originally used a error type for). > > > Maybe this should be actually one callback API, taking two callback functions, with one of them being invoked for base_address entries, and one for others? However, if we stick to the current approaches in both LLE and RLE of making the address pool resolution function a parameter (which I'd like to keep, as it makes my job in lldb easier), then this would actually be three callbacks, which starts to get unwieldy. Though one of those callbacks could be removed with the "DWARFUnit implementing a AddrOffsetResolver interface" idea, which I really like. :) > > Ah, thanks for the details on LLDB's location parsing logic. That's interesting indeed! > > > > I can appreciate an iterator-based API if that's the sort of usage we've got, though I expect it doesn't have any interest in the low-level encoding & just wants the fully processed address ranges/locations - it doesn't want base_address or end_of_list entries? & I think the dual-iteration is a fairly awkward API design, trying to iterate them in lock-step, etc. I'd rather avoid that if reasonably possible. 
> > > > Either having an iterator API that gives only the fully processed data/semantic view & a completely different API if you want to access the low level primitives (LLE, etc) (this is how ranges works - there's an API that gives a collection of ranges & abstracts over v4/v5/rnglists/etc - though that's partly motivated by a strong multi-client need for that functionality for symbolizing, etc - but I think it's a good abstraction/model anyway (& one of the reasons the inline range list printing doesn't include encoding information, the API it uses is too high level to even have access to it)) > > > > > Now, all of that is possible with a callback (though I am generally trying to avoid them), but it does resurface the issue of what should be the value of the second argument for DW_LLE_base_address entries (the thing which I originally used a error type for). > > > > Sorry, my intent in the above API was for the second argument to be Optional's "None" state when... oh, I see, I did use Expected there, rather than Optional, because there are legit error cases. > > > > I know it's sort of awkward, but I might be inclined to use Optional> there. I realize two layers of wrapping is a bit weird, but I think it'd be nicer than having an error state for what, I think, isn't erroneous. > > > > > Maybe this should be actually one callback API, taking two callback functions, with one of them being invoked for base_address entries, and one for others? However, if we stick to the current approaches in both LLE and RLE of making the address pool resolution function a parameter (which I'd like to keep, as it makes my job in lldb easier), then this would actually be three callbacks, which starts to get unwieldy. > > > > Don't mind three callbacks too much. > > > > > Though one of those callbacks could be removed with the "DWARFUnit implementing a AddrOffsetResolver interface" idea, which I really like. 
:) > > > > Sorry, I haven't really looked at where the address resolver callback is registered and alternative designs being discussed - but yeah, going off just the one-sentence, it seems reasonable to have the DWARFUnit own an address resolver/be the thing you consult when you want to resolve an address (just through a normal function call in DWARFUnit, perhaps - which might, internally, use a callback registered when it was constructed). > > I know it's sort of awkward, but I might be inclined to use Optional> there. I realize two layers of wrapping is a bit weird, but I think it'd be nicer than having an error state for what, I think, isn't erroneous. > Actually, my very first attempt at this patch used an `Expected>`, but then I scrapped it because I didn't think you'd like it. It's not the friendliest of APIs, but I think we can go with that. > > > Sorry, I haven't really looked at where the address resolver callback is registered and alternative designs being discussed - but yeah, going off just the one-sentence, it seems reasonable to have the DWARFUnit own an address resolver/be the thing you consult when you want to resolve an address (just through a normal function call in DWARFUnit, perhaps - which might, internally, use a callback registered when it was constructed). > > I think you got that backwards. I don't want the DWARFUnit to be the source of truth for address pool resolutions, as that would make it hard to use from lldb (it's far from ready to start using the llvm version right now). What I wanted was to replace the lambda/function_ref with a single-method interface. Then both DWARFUnits could implement that interface so that passing a DWARFUnit& would "just work" (but you wouldn't be limited to DWARFUnits as anyone could implement that interface, just like anyone can write a lambda). As for Expected> (or Optional>) - yeah, I think this is a non-obvious API (both the general problem and this specific solution). 
I think it's probably worth discussing this design a bit more to save you time writing/rewriting things a bit. I guess there are a few layers of failure here. There's the possibility that the iteration itself could fail - even for debug_loc style lists (if we reached the end of the section before encountering a terminating {0,0}). That would suggest a fallible iterator idiom: http://llvm.org/docs/ProgrammersManual.html#building-fallible-iterators-and-iterator-ranges But then, yes, when looking at the "processed"/semantic view, that could fail too in the case of an invalid address index, etc. The generic/processed/abstracted-over-ranges-and-rnglists API for ranges produces a fully computed vector (& then returns Expected of that range) - is that reasonable? (this does mean manifesting a whole location in memory, which may not be needed so I could understand avoiding that even without fully implementing & demonstrating the vector solution is inadequate). But I /think/ maybe we could/should have two APIs - one generic API that abstracts over loc/loclists and only provides the fully processed view, and another that is type specific for dumping the underlying representation (only used in dumping debug_loclists). Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68270/new/ https://reviews.llvm.org/D68270 From llvm-commits at lists.llvm.org Wed Oct 9 08:54:24 2019 From: llvm-commits at lists.llvm.org (Alina Sbirlea via llvm-commits) Date: Wed, 09 Oct 2019 15:54:24 -0000 Subject: [llvm] r374177 - [MemorySSA] Make the use of moveAllAfterMergeBlocks consistent. Message-ID: <20191009155424.D9EF48A76F@lists.llvm.org> Author: asbirlea Date: Wed Oct 9 08:54:24 2019 New Revision: 374177 URL: http://llvm.org/viewvc/llvm-project?rev=374177&view=rev Log: [MemorySSA] Make the use of moveAllAfterMergeBlocks consistent.
Summary: The rule for the moveAllAfterMergeBlocks API is for all instructions from `From` to have been moved to `To`, while keeping the CFG edges (and block terminators) unchanged. Update all the callsites for moveAllAfterMergeBlocks to follow this. Pending follow-up: since the same behavior is needed every time, merge all callsites into one. The common denominator may be the call to `MergeBlockIntoPredecessor`. Resolves PR43569. Reviewers: george.burgess.iv Subscribers: Prazek, sanjoy.google, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68659 Added: llvm/trunk/test/Analysis/MemorySSA/pr43569.ll Modified: llvm/trunk/lib/Analysis/MemorySSAUpdater.cpp llvm/trunk/lib/Transforms/Scalar/LoopUnswitch.cpp llvm/trunk/lib/Transforms/Utils/BasicBlockUtils.cpp llvm/trunk/lib/Transforms/Utils/LoopRotationUtils.cpp Modified: llvm/trunk/lib/Analysis/MemorySSAUpdater.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/MemorySSAUpdater.cpp?rev=374177&r1=374176&r2=374177&view=diff ============================================================================== --- llvm/trunk/lib/Analysis/MemorySSAUpdater.cpp (original) +++ llvm/trunk/lib/Analysis/MemorySSAUpdater.cpp Wed Oct 9 08:54:24 2019 @@ -1159,25 +1159,32 @@ void MemorySSAUpdater::moveAllAccesses(B if (!Accs) return; + assert(Start->getParent() == To && "Incorrect Start instruction"); MemoryAccess *FirstInNew = nullptr; for (Instruction &I : make_range(Start->getIterator(), To->end())) if ((FirstInNew = MSSA->getMemoryAccess(&I))) break; - if (!FirstInNew) - return; + if (FirstInNew) { + auto *MUD = cast<MemoryUseOrDef>(FirstInNew); + do { + auto NextIt = ++MUD->getIterator(); + MemoryUseOrDef *NextMUD = (!Accs || NextIt == Accs->end()) + ? nullptr + : cast<MemoryUseOrDef>(&*NextIt); + MSSA->moveTo(MUD, To, MemorySSA::End); + // Moving MUD from Accs in the moveTo above, may delete Accs, so we need + // to retrieve it again.
+ Accs = MSSA->getWritableBlockAccesses(From); + MUD = NextMUD; + } while (MUD); + } - auto *MUD = cast<MemoryUseOrDef>(FirstInNew); - do { - auto NextIt = ++MUD->getIterator(); - MemoryUseOrDef *NextMUD = (!Accs || NextIt == Accs->end()) - ? nullptr - : cast<MemoryUseOrDef>(&*NextIt); - MSSA->moveTo(MUD, To, MemorySSA::End); - // Moving MUD from Accs in the moveTo above, may delete Accs, so we need to - // retrieve it again. - Accs = MSSA->getWritableBlockAccesses(From); - MUD = NextMUD; - } while (MUD); + // If all accesses were moved and only a trivial Phi remains, we try to remove + // that Phi. This is needed when From is going to be deleted. + auto *Defs = MSSA->getWritableBlockDefs(From); + if (Defs && !Defs->empty()) + if (auto *Phi = dyn_cast<MemoryPhi>(&*Defs->begin())) + tryRemoveTrivialPhi(Phi); } void MemorySSAUpdater::moveAllAfterSpliceBlocks(BasicBlock *From, @@ -1193,7 +1200,7 @@ void MemorySSAUpdater::moveAllAfterSplic void MemorySSAUpdater::moveAllAfterMergeBlocks(BasicBlock *From, BasicBlock *To, Instruction *Start) { - assert(From->getSinglePredecessor() == To && + assert(From->getUniquePredecessor() == To && "From block is expected to have a single predecessor (To)."); moveAllAccesses(From, To, Start); for (BasicBlock *Succ : successors(From)) Modified: llvm/trunk/lib/Transforms/Scalar/LoopUnswitch.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/LoopUnswitch.cpp?rev=374177&r1=374176&r2=374177&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Scalar/LoopUnswitch.cpp (original) +++ llvm/trunk/lib/Transforms/Scalar/LoopUnswitch.cpp Wed Oct 9 08:54:24 2019 @@ -1629,15 +1629,27 @@ void LoopUnswitch::SimplifyCode(std::vec ReplaceUsesOfWith(PN, PN->getIncomingValue(0), Worklist, L, LPM, MSSAU.get()); - // If Succ has any successors with PHI nodes, update them to have - // entries coming from Pred instead of Succ. 
- Succ->replaceAllUsesWith(Pred); + Instruction *STI = Succ->getTerminator(); + Instruction *Start = &*Succ->begin(); + // If there's nothing to move, mark the starting instruction as the last + // instruction in the block. + if (Start == STI) + Start = BI; // Move all of the successor contents from Succ to Pred. Pred->getInstList().splice(BI->getIterator(), Succ->getInstList(), - Succ->begin(), Succ->end()); + Succ->begin(), STI->getIterator()); if (MSSAU) - MSSAU->moveAllAfterMergeBlocks(Succ, Pred, BI); + MSSAU->moveAllAfterMergeBlocks(Succ, Pred, Start); + + // Move terminator instruction from Succ now, we're deleting BI below. + // FIXME: remove BI first might be more intuitive. + Pred->getInstList().splice(Pred->end(), Succ->getInstList()); + + // If Succ has any successors with PHI nodes, update them to have + // entries coming from Pred instead of Succ. + Succ->replaceAllUsesWith(Pred); + LPM->deleteSimpleAnalysisValue(BI, L); RemoveFromWorklist(BI, Worklist); BI->eraseFromParent(); Modified: llvm/trunk/lib/Transforms/Utils/BasicBlockUtils.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Utils/BasicBlockUtils.cpp?rev=374177&r1=374176&r2=374177&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Utils/BasicBlockUtils.cpp (original) +++ llvm/trunk/lib/Transforms/Utils/BasicBlockUtils.cpp Wed Oct 9 08:54:24 2019 @@ -227,17 +227,29 @@ bool llvm::MergeBlockIntoPredecessor(Bas Updates.push_back({DominatorTree::Delete, PredBB, BB}); } - if (MSSAU) - MSSAU->moveAllAfterMergeBlocks(BB, PredBB, &*(BB->begin())); + Instruction *PTI = PredBB->getTerminator(); + Instruction *STI = BB->getTerminator(); + Instruction *Start = &*BB->begin(); + // If there's nothing to move, mark the starting instruction as the last + // instruction in the block. + if (Start == STI) + Start = PTI; - // Delete the unconditional branch from the predecessor... 
- PredBB->getInstList().pop_back(); + // Move all definitions in the successor to the predecessor... + PredBB->getInstList().splice(PTI->getIterator(), BB->getInstList(), + BB->begin(), STI->getIterator()); + + if (MSSAU) + MSSAU->moveAllAfterMergeBlocks(BB, PredBB, Start); // Make all PHI nodes that referred to BB now refer to Pred as their // source... BB->replaceAllUsesWith(PredBB); - // Move all definitions in the successor to the predecessor... + // Delete the unconditional branch from the predecessor... + PredBB->getInstList().pop_back(); + + // Move terminator instruction and add unreachable to now empty BB. PredBB->getInstList().splice(PredBB->end(), BB->getInstList()); new UnreachableInst(BB->getContext(), BB); @@ -274,11 +286,10 @@ bool llvm::MergeBlockIntoPredecessor(Bas "applying corresponding DTU updates."); DTU->applyUpdatesPermissive(Updates); DTU->deleteBB(BB); - } - - else { + } else { BB->eraseFromParent(); // Nuke BB if DTU is nullptr. } + return true; } Modified: llvm/trunk/lib/Transforms/Utils/LoopRotationUtils.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Utils/LoopRotationUtils.cpp?rev=374177&r1=374176&r2=374177&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Utils/LoopRotationUtils.cpp (original) +++ llvm/trunk/lib/Transforms/Utils/LoopRotationUtils.cpp Wed Oct 9 08:54:24 2019 @@ -615,8 +615,13 @@ bool LoopRotate::simplifyLoopLatch(Loop LLVM_DEBUG(dbgs() << "Folding loop latch " << Latch->getName() << " into " << LastExit->getName() << "\n"); + Instruction *FirstLatchInst = &*Latch->begin(); + // If there's nothing to move, mark the starting instruction as the last + // instruction in the block. + if (FirstLatchInst == Jmp) + FirstLatchInst = BI; + // Hoist the instructions from Latch into LastExit. 
- Instruction *FirstLatchInst = &*(Latch->begin()); LastExit->getInstList().splice(BI->getIterator(), Latch->getInstList(), Latch->begin(), Jmp->getIterator()); Added: llvm/trunk/test/Analysis/MemorySSA/pr43569.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Analysis/MemorySSA/pr43569.ll?rev=374177&view=auto ============================================================================== --- llvm/trunk/test/Analysis/MemorySSA/pr43569.ll (added) +++ llvm/trunk/test/Analysis/MemorySSA/pr43569.ll Wed Oct 9 08:54:24 2019 @@ -0,0 +1,49 @@ +; RUN: opt -pgo-kind=pgo-instr-gen-pipeline -aa-pipeline=default -passes="default" -enable-nontrivial-unswitch -S < %s | FileCheck %s +; REQUIRES: asserts + +target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128" +target triple = "x86_64-unknown-linux-gnu" + + at __profn_c = private constant [1 x i8] c"c" + at b = common dso_local global i32 0, align 4 + at a = common dso_local global i16 0, align 2 + +; CHECK-LABEL: @c() +; Function Attrs: nounwind uwtable +define dso_local void @c() #0 { +entry: + call void @llvm.instrprof.increment(i8* getelementptr inbounds ([1 x i8], [1 x i8]* @__profn_c, i32 0, i32 0), i64 68269137, i32 3, i32 0) + br label %for.cond + +for.cond: ; preds = %for.end, %entry + call void @llvm.instrprof.increment(i8* getelementptr inbounds ([1 x i8], [1 x i8]* @__profn_c, i32 0, i32 0), i64 68269137, i32 3, i32 1) + store i32 0, i32* @b, align 4 + br label %for.cond1 + +for.cond1: ; preds = %for.inc, %for.cond + %0 = load i32, i32* @b, align 4 + %1 = load i16, i16* @a, align 2 + %conv = sext i16 %1 to i32 + %cmp = icmp slt i32 %0, %conv + br i1 %cmp, label %for.body, label %for.end + +for.body: ; preds = %for.cond1 + call void @llvm.instrprof.increment(i8* getelementptr inbounds ([1 x i8], [1 x i8]* @__profn_c, i32 0, i32 0), i64 68269137, i32 3, i32 2) + br label %for.inc + +for.inc: ; preds = %for.body + %2 = load i32, i32* @b, align 4 + %inc = add nsw i32 %2, 1 + 
store i32 %inc, i32* @b, align 4 + br label %for.cond1 + +for.end: ; preds = %for.cond1 + br label %for.cond +} + +; Function Attrs: nounwind +declare void @llvm.instrprof.increment(i8*, i64, i32, i32) #1 + +attributes #0 = { nounwind uwtable "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "frame-pointer"="none" "less-precise-fpmad"="false" "min-legal-vector-width"="0" "no-infs-fp-math"="false" "no-jump-tables"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+cx8,+fxsr,+mmx,+sse,+sse2,+x87" "unsafe-fp-math"="false" "use-soft-float"="false" } +attributes #1 = { nounwind } + From llvm-commits at lists.llvm.org Wed Oct 9 08:53:57 2019 From: llvm-commits at lists.llvm.org (Alina Sbirlea via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 15:53:57 +0000 (UTC) Subject: [PATCH] D68659: [MemorySSA] Make the use of moveAllAfterMergeBlocks consistent. In-Reply-To: References: Message-ID: <430593da7786fbe451908dc5c07dca8b@localhost.localdomain> asbirlea updated this revision to Diff 224068. asbirlea marked 4 inline comments as done. asbirlea added a comment. Address comments. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68659/new/ https://reviews.llvm.org/D68659 Files: lib/Analysis/MemorySSAUpdater.cpp lib/Transforms/Scalar/LoopUnswitch.cpp lib/Transforms/Utils/BasicBlockUtils.cpp lib/Transforms/Utils/LoopRotationUtils.cpp test/Analysis/MemorySSA/pr43569.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D68659.224068.patch Type: text/x-patch Size: 9308 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 08:53:57 2019 From: llvm-commits at lists.llvm.org (Alina Sbirlea via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 15:53:57 +0000 (UTC) Subject: [PATCH] D68659: [MemorySSA] Make the use of moveAllAfterMergeBlocks consistent. 
In-Reply-To: References: Message-ID: <1143a539f17292c7b446bc4af0ff1dd6@localhost.localdomain> asbirlea added inline comments. ================ Comment at: lib/Analysis/MemorySSAUpdater.cpp:1186 + if (Defs && Defs->begin() != Defs->end()) + if (auto *Phi = dyn_cast(&*Defs->begin())) + tryRemoveTrivialPhi(Phi); ---------------- george.burgess.iv wrote: > If I'm reading the description properly, this function assumes that everything in `From` is now in `To`, no? > > If so, `cast()` seems more appropriate here, since the only thing that can remain is a single Phi This method (`moveAllAccesses`) is also called from `moveAllAfterSpliceBlocks`, when `From` is not necessarily empty. I'll update the comment above. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68659/new/ https://reviews.llvm.org/D68659 From llvm-commits at lists.llvm.org Wed Oct 9 08:54:05 2019 From: llvm-commits at lists.llvm.org (Alina Sbirlea via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 15:54:05 +0000 (UTC) Subject: [PATCH] D68659: [MemorySSA] Make the use of moveAllAfterMergeBlocks consistent. In-Reply-To: References: Message-ID: <2bc2cefa3cbb8c47dcdf5e47cf92e324@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rG7faa14a98bdc: [MemorySSA] Make the use of moveAllAfterMergeBlocks consistent. (authored by asbirlea). Herald added a subscriber: hiraditya. Changed prior to commit: https://reviews.llvm.org/D68659?vs=224068&id=224071#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68659/new/ https://reviews.llvm.org/D68659 Files: llvm/lib/Analysis/MemorySSAUpdater.cpp llvm/lib/Transforms/Scalar/LoopUnswitch.cpp llvm/lib/Transforms/Utils/BasicBlockUtils.cpp llvm/lib/Transforms/Utils/LoopRotationUtils.cpp llvm/test/Analysis/MemorySSA/pr43569.ll -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68659.224071.patch Type: text/x-patch Size: 9378 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 09:03:18 2019 From: llvm-commits at lists.llvm.org (Digger via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 16:03:18 +0000 (UTC) Subject: [PATCH] D66969: Output XCOFF object text section header and symbol entry for program code In-Reply-To: References: Message-ID: <9cbe5443f37a47ea1bbdb5c98f2ebec6@localhost.localdomain> DiggerLin marked 5 inline comments as done. DiggerLin added inline comments. ================ Comment at: llvm/lib/MC/XCOFFObjectWriter.cpp:89 uint32_t Size; + uint32_t PaddingSize; uint32_t FileOffsetToData; ---------------- hubert.reinterpretcast wrote: > Remove this field (see comments on later lines). After discussing with Sean, we decided to keep the PaddingSize variable here. The additional variable makes the code more readable and the padding logic easier to understand. ================ Comment at: llvm/lib/MC/XCOFFObjectWriter.cpp:103 Size = 0; + PaddingSize = 0; FileOffsetToData = 0; ---------------- hubert.reinterpretcast wrote: > Remove this field (see comments on later lines). After discussing with Sean, we decided to keep the PaddingSize variable here. The additional variable makes the code more readable and the padding logic easier to understand. ================ Comment at: llvm/lib/MC/XCOFFObjectWriter.cpp:511 + const MCSectionXCOFF *MCSec = Csect.MCCsect; + Csect.PaddingSize = alignTo(Address, MCSec->getAlignment()) - Address; + Address += Csect.PaddingSize; ---------------- hubert.reinterpretcast wrote: > The inter-csect padding is not really a property of the csect requiring alignment. We do not need to store this value here. The amount of padding to write can be determined by tracking the virtual address of the raw section data being written during the serialization into the object file. The next virtual address following the padding should be `Csect.Address`. 
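The two options debated in this comment thread can be compared with a small standalone sketch. It is not code from D66969: `CsectSketch`, `alignToSketch`, `paddingAtLayout`, and `paddingAtWrite` are hypothetical names, and the struct carries only the two fields the arithmetic needs. The first function computes padding at layout time, as the patch does when it stores `PaddingSize`; the second derives the same padding at write time by tracking the running virtual address, as the reviewer suggests.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a csect; only the fields this sketch needs.
struct CsectSketch {
  uint32_t Address; // virtual address assigned during layout
  uint32_t Size;    // size of the csect's raw data in bytes
};

// Round Value up to the next multiple of Align (Align must be non-zero).
// Written out here so the sketch does not depend on LLVM headers.
inline uint32_t alignToSketch(uint32_t Value, uint32_t Align) {
  assert(Align != 0 && "alignment must be non-zero");
  return (Value + Align - 1) / Align * Align;
}

// Layout-time option: padding needed before a csect that must start at an
// Align-aligned address, given the current (unaligned) address.
inline uint32_t paddingAtLayout(uint32_t Address, uint32_t Align) {
  return alignToSketch(Address, Align) - Address;
}

// Write-time option: derive the padding before each csect from the running
// virtual address instead of storing it. Assumes Csects is sorted by Address
// and that csects do not overlap.
inline std::vector<uint32_t>
paddingAtWrite(const std::vector<CsectSketch> &Csects, uint32_t StartAddress) {
  std::vector<uint32_t> Padding;
  uint32_t Current = StartAddress;
  for (const CsectSketch &C : Csects) {
    assert(C.Address >= Current && "csects must be sorted and disjoint");
    Padding.push_back(C.Address - Current); // zero bytes to emit before C
    Current = C.Address + C.Size;           // address just past C's data
  }
  return Padding;
}
```

Under these assumptions both options yield the same byte counts; the difference is only whether the value is stored on the csect or recomputed while serializing.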
I understand your point, but if we do not calculate the padding size here, we have to calculate it in XCOFFObjectWriter::writeSections, which would complicate the logic of that function. For example: // Write the program code control sections one at a time. uint32_t PaddingSize = 0; // Additional variable here. for (auto It = ProgramCodeCsects.begin(); It != ProgramCodeCsects.end(); ++It) { if (PaddingSize) W.OS.write_zeros(PaddingSize); // I think I would also need a comment to explain the following code. if (std::next(It) != ProgramCodeCsects.end()) PaddingSize = std::next(It)->Address - It->Address - It->Size; Asm.writeSectionData(W.OS, It->MCCsect, Layout); } ================ Comment at: llvm/lib/MC/XCOFFObjectWriter.cpp:525 + } + Text.PaddingSize = alignTo(Address, DefaultSectionAlign) - Address; + Address += Text.PaddingSize; ---------------- hubert.reinterpretcast wrote: > The `Size` field accounts for the padding. We do not need to store this value here. The amount of padding to write can be determined by tracking the virtual address of the raw section data being written during the serialization into the object file. The next virtual address following the padding should be `Text.Address + Text.Size`. Same reason as above. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D66969/new/ https://reviews.llvm.org/D66969 From llvm-commits at lists.llvm.org Wed Oct 9 09:03:19 2019 From: llvm-commits at lists.llvm.org (Hideto Ueno via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 16:03:19 +0000 (UTC) Subject: [PATCH] D65402: [Attributor][MustExec] Deduce dereferenceable and nonnull attribute using MustBeExecutedContextExplorer In-Reply-To: References: Message-ID: uenoku added a comment. In D65402#1701550 , @xbolva00 wrote: > dereferenceable attribute is not added to the arguments? 
> > define dso_local void @_Z3fooPaS_S_ii(i8* noalias nocapture writeonly %0, i8* noalias nocapture readnone %1, i8* noalias nocapture readonly %2, i32 %3, i32 %4) local_unnamed_addr #0 { > > tail call void @llvm.memcpy.p0i8.p0i8.i64(i8* noalias nocapture nonnull writeonly align 1 dereferenceable(16) %0, i8* noalias nocapture nonnull readonly align 1 dereferenceable(16) %2, i64 16, i1 false) #2 > ret void > > } > > I would expect > > define dso_local void @_Z3fooPaS_S_ii(i8* noalias nocapture writeonly dereferenceable(16) %0, i8* noalias nocapture readnone %1, i8* noalias nocapture readonly dereferenceable(16) %2, i32 %3, i32 %4) local_unnamed_addr #0 { > > tail call void @llvm.memcpy.p0i8.p0i8.i64(i8* noalias nocapture nonnull writeonly align 1 dereferenceable(16) %0, i8* noalias nocapture nonnull readonly align 1 dereferenceable(16) %2, i64 16, i1 false) #2 > ret void > > } > > https://godbolt.org/z/if9rle Actually, dereferenceable is added to the arguments if opt is executed for the above IR. https://godbolt.org/z/CItEJG As far as I see the log, the problem is that InstCombine is located after Attributor Pass in -O3. Therefore, dereferenceable is not propagated. Please take a look at the stderr output in https://godbolt.org/z/l_6aVE. Solutions are to change the order or to run the Attributor several times. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D65402/new/ https://reviews.llvm.org/D65402 From llvm-commits at lists.llvm.org Wed Oct 9 09:12:39 2019 From: llvm-commits at lists.llvm.org (Adrian Prantl via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 16:12:39 +0000 (UTC) Subject: [PATCH] D68680: [dsymutil] Fix handling of common symbols in multiple object files. In-Reply-To: References: Message-ID: aprantl added inline comments. ================ Comment at: llvm/tools/dsymutil/MachODebugMapParser.cpp:144 + // The symbol is already present. + continue; + } ---------------- This looks like no-op. 
I guess we could just ignore the return value? ================ Comment at: llvm/tools/dsymutil/MachODebugMapParser.cpp:498 CurrentObjectAddresses[*Name] = None; - else + } else if (Flags & SymbolRef::SF_Common) { + CurrentObjectAddresses[*Name] = None; ---------------- is it expected that a symbol with `SymbolRef::SF_Absolute | SymbolRef::SF_Common` goes into the first if only? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68680/new/ https://reviews.llvm.org/D68680 From llvm-commits at lists.llvm.org Wed Oct 9 09:12:41 2019 From: llvm-commits at lists.llvm.org (Julian Lettner via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 16:12:41 +0000 (UTC) Subject: [PATCH] D68525: [lit] Refactor ProgressDisplay In-Reply-To: References: Message-ID: <8aabbbce0bacd8be1755658592be6280@localhost.localdomain> yln added a comment. In D68525#1701001 , @serge-sans-paille wrote: > LGTM, thanks for the refactoring. Hi Serge, do you feel comfortable officially accepting this revision or should we wait for a second LGTM? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68525/new/ https://reviews.llvm.org/D68525 From llvm-commits at lists.llvm.org Wed Oct 9 09:12:40 2019 From: llvm-commits at lists.llvm.org (Hideto Ueno via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 16:12:40 +0000 (UTC) Subject: [PATCH] D68624: [Attributor] Handle `null` differently in capture and alias logic In-Reply-To: References: Message-ID: <6641f619e9e8077709da355a7e1cdfc5@localhost.localdomain> uenoku accepted this revision. uenoku added a comment. This revision is now accepted and ready to land. 
LGTM Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68624/new/ https://reviews.llvm.org/D68624 From llvm-commits at lists.llvm.org Wed Oct 9 09:12:41 2019 From: llvm-commits at lists.llvm.org (Pavel Labath via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 16:12:41 +0000 (UTC) Subject: [PATCH] D68656: Add ExceptionStream to llvm::Object::minidump In-Reply-To: References: Message-ID: labath added a comment. Thanks for taking your time to do this. I have one question: It looks like you're not using the exception code enum in the follow-up patch. I think that's completely reasonable given that the enum values are overloaded and system-dependent. But given this fact, and the fact that I am not convinced the enum values are completely right (e.g. the linux signal numbers depend also on the architecture -- though this may not manifest itself on the architectures that breakpad supports right now), what would you say to just dropping that enumeration? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68656/new/ https://reviews.llvm.org/D68656 From llvm-commits at lists.llvm.org Wed Oct 9 09:12:41 2019 From: llvm-commits at lists.llvm.org (Dineshkumar Bhaskaran via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 16:12:41 +0000 (UTC) Subject: [PATCH] D68712: Avoid PT_LOAD to have overlapping p_offset ranges on EM_AMDGPU Message-ID: dineshkb-amd created this revision. dineshkb-amd added a reviewer: MaskRay. dineshkb-amd added a project: AMDGPU. Herald added subscribers: llvm-commits, fedor.sergeev, arichardson, t-tye, tpr, dstuttard, yaxunl, nhaehnle, wdng, jvesely, kzhuravl, emaste, jyknight. Herald added a reviewer: espindola. Herald added a project: LLVM. Changes introduced in commit https://llvm.org/svn/llvm-project/lld/trunk at 370180 allows PT_LOAD to have overlapping p_offset ranges on EM_AMDGPU and EM_SPARCV9. However this is introducing crashes in an AMD internal test cases. 
Thus it is selectively disabled for AMDGPU. The test case is not included as it requires elaborate setup and has several application layers. Repository: rLLD LLVM Linker https://reviews.llvm.org/D68712 Files: ELF/Writer.cpp test/ELF/amdgpu-relocs.s Index: test/ELF/amdgpu-relocs.s =================================================================== --- test/ELF/amdgpu-relocs.s +++ test/ELF/amdgpu-relocs.s @@ -94,7 +94,7 @@ # linker. # CHECK: Relocations [ # CHECK: .rela.dyn { -# CHECK-NEXT: R_AMDGPU_RELATIVE64 - 0x3928 +# CHECK-NEXT: R_AMDGPU_RELATIVE64 - 0x3008 # CHECK-NEXT: R_AMDGPU_ABS64 common_var0 0x0 # CHECK-NEXT: R_AMDGPU_ABS64 common_var1 0x0 # CHECK-NEXT: R_AMDGPU_ABS64 common_var2 0x0 @@ -114,16 +114,16 @@ # CHECK-NEXT: } # CHECK-NEXT: ] -# NM: 0000000000003930 B common_var0 -# NM: 0000000000003d30 B common_var1 -# NM: 0000000000004130 B common_var2 -# NM: 0000000000003928 d temp2 +# NM: 0000000000003010 B common_var0 +# NM: 0000000000003410 B common_var1 +# NM: 0000000000003810 B common_var2 +# NM: 0000000000003008 d temp2 -# temp2 - foo = 0x3928-0x768 = 0x31c0 +# temp2 - foo = 0x3008-0x768 = 0x28a0 # HEX: section '.rodata': -# HEX-NEXT: 0x00000768 c0310000 00000000 +# HEX-NEXT: 0x00000768 a0280000 00000000 # common_var2+4, common_var1+8, and common_var0+12. # HEX: section 'nonalloc': -# HEX-NEXT: 0x00000000 00000000 34410000 00000000 383d0000 -# HEX-NEXT: 0x00000010 00000000 3c390000 +# HEX-NEXT: 0x00000000 00000000 14380000 00000000 18340000 +# HEX-NEXT: 0x00000010 00000000 1c300000 Index: ELF/Writer.cpp =================================================================== --- ELF/Writer.cpp +++ ELF/Writer.cpp @@ -2241,7 +2241,9 @@ // maximum page size boundary so that we can find the ELF header at the // start. We cannot benefit from overlapping p_offset ranges with the // previous segment anyway. 
- if (config->zSeparate == SeparateSegmentKind::Loadable || + bool enable = config->emachine != EM_AMDGPU; + + if (!enable || config->zSeparate == SeparateSegmentKind::Loadable || (config->zSeparate == SeparateSegmentKind::Code && prev && (prev->p_flags & PF_X) != (p->p_flags & PF_X)) || cmd->type == SHT_LLVM_PART_EHDR) -------------- next part -------------- A non-text attachment was scrubbed... Name: D68712.224070.patch Type: text/x-patch Size: 2009 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 09:12:43 2019 From: llvm-commits at lists.llvm.org (Adrian Prantl via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 16:12:43 +0000 (UTC) Subject: [PATCH] D68633: fix debug info affects output when opt inline In-Reply-To: References: Message-ID: <5d5cec1d0e09ca0f7a4a20748b3a1e9f@localhost.localdomain> aprantl added a comment. In D68633#1700966 , @bjope wrote: > Besides, inlining decisions should not be impacted by if we move the dbg intrinsics or not, right? Correct. `clang -g | strip` and `clang` should produce identical output. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68633/new/ https://reviews.llvm.org/D68633 From llvm-commits at lists.llvm.org Wed Oct 9 09:12:53 2019 From: llvm-commits at lists.llvm.org (Xiangling Liao via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 16:12:53 +0000 (UTC) Subject: [PATCH] D68341: [AIX] TOC pseudo expansion for 64bit large + 64bit small + 32bit large modes In-Reply-To: References: Message-ID: <75fe82f5043340631eba2e91563517ce@localhost.localdomain> Xiangling_L updated this revision to Diff 224076. Xiangling_L marked 3 inline comments as done. Xiangling_L added a comment. 
Address 2nd round comments; split out variable name NFC patch; update testcases; Notes: An NFC patch for getMCSymbolForTOCPseudoMO will be split out from this patch; after that patch lands, I will rebase this lowering patch onto it and then post the follow-up variable-name NFC patch and the 'hasSideEffects & isRematerializable' NFC patch. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68341/new/ https://reviews.llvm.org/D68341 Files: llvm/include/llvm/MC/MCExpr.h llvm/lib/MC/MCExpr.cpp llvm/lib/Target/PowerPC/MCTargetDesc/PPCInstPrinter.cpp llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp llvm/lib/Target/PowerPC/PPCInstrInfo.cpp llvm/lib/Target/PowerPC/PPCInstrInfo.td llvm/test/CodeGen/PowerPC/lower-globaladdr32-aix-asm.ll llvm/test/CodeGen/PowerPC/lower-globaladdr64-aix-asm.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D68341.224076.patch Type: text/x-patch Size: 18496 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 09:19:14 2019 From: llvm-commits at lists.llvm.org (Jonas Devlieghere via llvm-commits) Date: Wed, 09 Oct 2019 16:19:14 -0000 Subject: [llvm] r374178 - Re-land "[dsymutil] Fix handling of common symbols in multiple object files." Message-ID: <20191009161914.2007890979@lists.llvm.org> Author: jdevlieghere Date: Wed Oct 9 09:19:13 2019 New Revision: 374178 URL: http://llvm.org/viewvc/llvm-project?rev=374178&view=rev Log: Re-land "[dsymutil] Fix handling of common symbols in multiple object files." The original patch got reverted because it hit a long-standing legacy issue on Windows that prevents files from being named `com`. Thanks Kristina & Jeremy for pointing this out. 
Added: llvm/trunk/test/tools/dsymutil/Inputs/private/ llvm/trunk/test/tools/dsymutil/Inputs/private/tmp/ llvm/trunk/test/tools/dsymutil/Inputs/private/tmp/common/ llvm/trunk/test/tools/dsymutil/Inputs/private/tmp/common/common.x86_64 (with props) llvm/trunk/test/tools/dsymutil/Inputs/private/tmp/common/common1.o (with props) llvm/trunk/test/tools/dsymutil/Inputs/private/tmp/common/common2.o (with props) llvm/trunk/test/tools/dsymutil/X86/common-sym-multi.test Modified: llvm/trunk/tools/dsymutil/MachODebugMapParser.cpp Added: llvm/trunk/test/tools/dsymutil/Inputs/private/tmp/common/common.x86_64 URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/dsymutil/Inputs/private/tmp/common/common.x86_64?rev=374178&view=auto ============================================================================== Binary file - no diff available. Propchange: llvm/trunk/test/tools/dsymutil/Inputs/private/tmp/common/common.x86_64 ------------------------------------------------------------------------------ svn:executable = * Propchange: llvm/trunk/test/tools/dsymutil/Inputs/private/tmp/common/common.x86_64 ------------------------------------------------------------------------------ svn:mime-type = application/octet-stream Added: llvm/trunk/test/tools/dsymutil/Inputs/private/tmp/common/common1.o URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/dsymutil/Inputs/private/tmp/common/common1.o?rev=374178&view=auto ============================================================================== Binary file - no diff available. 
Propchange: llvm/trunk/test/tools/dsymutil/Inputs/private/tmp/common/common1.o ------------------------------------------------------------------------------ svn:mime-type = application/octet-stream Added: llvm/trunk/test/tools/dsymutil/Inputs/private/tmp/common/common2.o URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/dsymutil/Inputs/private/tmp/common/common2.o?rev=374178&view=auto ============================================================================== Binary file - no diff available. Propchange: llvm/trunk/test/tools/dsymutil/Inputs/private/tmp/common/common2.o ------------------------------------------------------------------------------ svn:mime-type = application/octet-stream Added: llvm/trunk/test/tools/dsymutil/X86/common-sym-multi.test URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/dsymutil/X86/common-sym-multi.test?rev=374178&view=auto ============================================================================== --- llvm/trunk/test/tools/dsymutil/X86/common-sym-multi.test (added) +++ llvm/trunk/test/tools/dsymutil/X86/common-sym-multi.test Wed Oct 9 09:19:13 2019 @@ -0,0 +1,39 @@ +RUN: dsymutil -oso-prepend-path %p/../Inputs %p/../Inputs/private/tmp/common/common.x86_64 -f -o - | llvm-dwarfdump -debug-info - | FileCheck %s +RUN: dsymutil -oso-prepend-path %p/../Inputs %p/../Inputs/private/tmp/common/common.x86_64 -dump-debug-map | FileCheck %s --check-prefix DEBUGMAP + +The test was compiled from two source files: +$ cd /private/tmp/common +$ cat common1.c +int i[1000]; +int main() { + return i[1]; +} +$ cat common2.c +extern int i[1000]; +int bar() { + return i[0]; +} +$ clang -fcommon -g -c common1.c -o common1.o +$ clang -fcommon -g -c common2.c -o common2.o +$ clang -fcommon -g common1.o common2.o -o common.x86_64 + +CHECK: DW_TAG_compile_unit +CHECK: DW_TAG_variable +CHECK-NOT: {{NULL|DW_TAG}} +CHECK: DW_AT_name{{.*}}"i" +CHECK-NOT: {{NULL|DW_TAG}} +CHECK: DW_AT_location{{.*}}DW_OP_addr 0x100001000) + +CHECK: 
DW_TAG_compile_unit +CHECK: DW_TAG_variable +CHECK-NOT: {{NULL|DW_TAG}} +CHECK: DW_AT_name{{.*}}"i" +CHECK-NOT: {{NULL|DW_TAG}} +CHECK: DW_AT_location{{.*}}DW_OP_addr 0x100001000) + +DEBUGMAP: filename:{{.*}}common1.o +DEBUGMAP: symbols: +DEBUGMAP: sym: _i, binAddr: 0x0000000100001000, size: 0x00000000 +DEBUGMAP: filename:{{.*}}common2.o +DEBUGMAP: symbols: +DEBUGMAP: sym: _i, binAddr: 0x0000000100001000, size: 0x00000000 Modified: llvm/trunk/tools/dsymutil/MachODebugMapParser.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/dsymutil/MachODebugMapParser.cpp?rev=374178&r1=374177&r2=374178&view=diff ============================================================================== --- llvm/trunk/tools/dsymutil/MachODebugMapParser.cpp (original) +++ llvm/trunk/tools/dsymutil/MachODebugMapParser.cpp Wed Oct 9 09:19:13 2019 @@ -14,6 +14,7 @@ #include "llvm/Support/Path.h" #include "llvm/Support/WithColor.h" #include "llvm/Support/raw_ostream.h" +#include namespace { using namespace llvm; @@ -51,6 +52,8 @@ private: StringRef MainBinaryStrings; /// The constructed DebugMap. std::unique_ptr Result; + /// List of common symbols that need to be added to the debug map. + std::vector CommonSymbols; /// Map of the currently processed object file symbol addresses. StringMap> CurrentObjectAddresses; @@ -81,6 +84,8 @@ private: STE.n_value); } + void addCommonSymbols(); + /// Dump the symbol table output header. void dumpSymTabHeader(raw_ostream &OS, StringRef Arch); @@ -122,11 +127,32 @@ void MachODebugMapParser::resetParserSta CurrentDebugMapObject = nullptr; } +/// Commons symbols won't show up in the symbol map but might need to be +/// relocated. We can add them to the symbol table ourselves by combining the +/// information in the object file (the symbol name) and the main binary (the +/// address). 
+void MachODebugMapParser::addCommonSymbols() { + for (auto &CommonSymbol : CommonSymbols) { + uint64_t CommonAddr = getMainBinarySymbolAddress(CommonSymbol); + if (CommonAddr == 0) { + // The main binary doesn't have an address for the given symbol. + continue; + } + if (!CurrentDebugMapObject->addSymbol(CommonSymbol, None /*ObjectAddress*/, + CommonAddr, 0 /*size*/)) { + // The symbol is already present. + continue; + } + } + CommonSymbols.clear(); +} + /// Create a new DebugMapObject. This function resets the state of the /// parser that was referring to the last object file and sets /// everything up to add symbols to the new one. void MachODebugMapParser::switchToNewDebugMapObject( StringRef Filename, sys::TimePoint Timestamp) { + addCommonSymbols(); resetParserState(); SmallString<80> Path(PathPrefix); @@ -466,10 +492,15 @@ void MachODebugMapParser::loadCurrentObj // relocations will use the symbol itself, and won't need an // object file address. The object file address field is optional // in the DebugMap, leave it unassigned for these symbols. - if (Sym.getFlags() & (SymbolRef::SF_Absolute | SymbolRef::SF_Common)) + uint32_t Flags = Sym.getFlags(); + if (Flags & SymbolRef::SF_Absolute) { CurrentObjectAddresses[*Name] = None; - else + } else if (Flags & SymbolRef::SF_Common) { + CurrentObjectAddresses[*Name] = None; + CommonSymbols.push_back(*Name); + } else { CurrentObjectAddresses[*Name] = Addr; + } } } From llvm-commits at lists.llvm.org Wed Oct 9 09:19:39 2019 From: llvm-commits at lists.llvm.org (Jason Liu via llvm-commits) Date: Wed, 09 Oct 2019 16:19:39 -0000 Subject: [llvm] r374179 - [AIX][XCOFF][NFC] Change the SectionLen field name of CSect Auxiliary entry to SectionOrLength. 
Message-ID: <20191009161939.6384A90F78@lists.llvm.org> Author: jasonliu Date: Wed Oct 9 09:19:39 2019 New Revision: 374179 URL: http://llvm.org/viewvc/llvm-project?rev=374179&view=rev Log: [AIX][XCOFF][NFC] Change the SectionLen field name of CSect Auxiliary entry to SectionOrLength. Summary: According to the XCOFF documentation, the meaning of x_scnlen depends on the symbol type: for XTY_SD, x_scnlen contains the csect length; for XTY_LD, x_scnlen contains the symbol table index of the containing csect; for XTY_CM, x_scnlen contains the csect length; for XTY_ER, x_scnlen contains 0. Renaming the SectionLen member to SectionOrLength is therefore more reasonable. Authored By: DiggerLin Reviewed By: hubert.reinterpretcast Differential Revision: https://reviews.llvm.org/D68650 Added: llvm/trunk/D68650.diff Modified: llvm/trunk/include/llvm/Object/XCOFFObjectFile.h llvm/trunk/tools/llvm-readobj/XCOFFDumper.cpp Added: llvm/trunk/D68650.diff URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/D68650.diff?rev=374179&view=auto ============================================================================== --- llvm/trunk/D68650.diff (added) +++ llvm/trunk/D68650.diff Wed Oct 9 09:19:39 2019 @@ -0,0 +1,34 @@ +Index: llvm/include/llvm/Object/XCOFFObjectFile.h +=================================================================== +--- llvm/include/llvm/Object/XCOFFObjectFile.h ++++ llvm/include/llvm/Object/XCOFFObjectFile.h +@@ -113,7 +113,12 @@ + }; + + struct XCOFFCsectAuxEnt32 { +- support::ubig32_t SectionLen; ++ support::ubig32_t ++ SectionOrLength; // If the symbol type is XTY_SD or XTY_CM, the csect ++ // length. ++ // If the symbol type is XTY_LD, the symbol table ++ // index of the containing csect. ++ // If the symbol type is XTY_ER, 0.
+ support::ubig32_t ParameterHashIndex; + support::ubig16_t TypeChkSectNum; + uint8_t SymbolAlignmentAndType; +Index: llvm/tools/llvm-readobj/XCOFFDumper.cpp +=================================================================== +--- llvm/tools/llvm-readobj/XCOFFDumper.cpp ++++ llvm/tools/llvm-readobj/XCOFFDumper.cpp +@@ -213,9 +213,9 @@ + W.printNumber("Index", + Obj.getSymbolIndex(reinterpret_cast(AuxEntPtr))); + if ((AuxEntPtr->SymbolAlignmentAndType & SymbolTypeMask) == XCOFF::XTY_LD) +- W.printNumber("ContainingCsectSymbolIndex", AuxEntPtr->SectionLen); ++ W.printNumber("ContainingCsectSymbolIndex", AuxEntPtr->SectionOrLength); + else +- W.printNumber("SectionLen", AuxEntPtr->SectionLen); ++ W.printNumber("SectionLen", AuxEntPtr->SectionOrLength); + W.printHex("ParameterHashIndex", AuxEntPtr->ParameterHashIndex); + W.printHex("TypeChkSectNum", AuxEntPtr->TypeChkSectNum); + // Print out symbol alignment and type. Modified: llvm/trunk/include/llvm/Object/XCOFFObjectFile.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Object/XCOFFObjectFile.h?rev=374179&r1=374178&r2=374179&view=diff ============================================================================== --- llvm/trunk/include/llvm/Object/XCOFFObjectFile.h (original) +++ llvm/trunk/include/llvm/Object/XCOFFObjectFile.h Wed Oct 9 09:19:39 2019 @@ -113,7 +113,12 @@ struct XCOFFStringTable { }; struct XCOFFCsectAuxEnt32 { - support::ubig32_t SectionLen; + support::ubig32_t + SectionOrLength; // If the symbol type is XTY_SD or XTY_CM, the csect + // length. + // If the symbol type is XTY_LD, the symbol table + // index of the containing csect. + // If the symbol type is XTY_ER, 0. 
support::ubig32_t ParameterHashIndex; support::ubig16_t TypeChkSectNum; uint8_t SymbolAlignmentAndType; Modified: llvm/trunk/tools/llvm-readobj/XCOFFDumper.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-readobj/XCOFFDumper.cpp?rev=374179&r1=374178&r2=374179&view=diff ============================================================================== --- llvm/trunk/tools/llvm-readobj/XCOFFDumper.cpp (original) +++ llvm/trunk/tools/llvm-readobj/XCOFFDumper.cpp Wed Oct 9 09:19:39 2019 @@ -161,9 +161,9 @@ void XCOFFDumper::printCsectAuxEnt32(con W.printNumber("Index", Obj.getSymbolIndex(reinterpret_cast(AuxEntPtr))); if ((AuxEntPtr->SymbolAlignmentAndType & SymbolTypeMask) == XCOFF::XTY_LD) - W.printNumber("ContainingCsectSymbolIndex", AuxEntPtr->SectionLen); + W.printNumber("ContainingCsectSymbolIndex", AuxEntPtr->SectionOrLength); else - W.printNumber("SectionLen", AuxEntPtr->SectionLen); + W.printNumber("SectionLen", AuxEntPtr->SectionOrLength); W.printHex("ParameterHashIndex", AuxEntPtr->ParameterHashIndex); W.printHex("TypeChkSectNum", AuxEntPtr->TypeChkSectNum); // Print out symbol alignment and type. From llvm-commits at lists.llvm.org Wed Oct 9 09:17:37 2019 From: llvm-commits at lists.llvm.org (Jonas Devlieghere via llvm-commits) Date: Wed, 9 Oct 2019 09:17:37 -0700 Subject: [llvm] r374139 - [dsymutil] Fix handling of common symbols in multiple object files. In-Reply-To: References: <20191009041619.307A6906D2@lists.llvm.org> Message-ID: Thanks Kristina & Jeremy! I've re-landed this with updated file names in r374178. On Wed, Oct 9, 2019 at 1:32 AM Jeremy Morse via llvm-commits wrote: > > Hi Jonas, > > FYI I reverted this in r374144, the reserved-names issue Kristina > points out causes "git checkout" to fail on Windows.
> > -- > Thanks, > Jeremy > _______________________________________________ > llvm-commits mailing list > llvm-commits at lists.llvm.org > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits From llvm-commits at lists.llvm.org Wed Oct 9 09:21:50 2019 From: llvm-commits at lists.llvm.org (Ayal Zaks via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 16:21:50 +0000 (UTC) Subject: [PATCH] D68577: [LV] Apply sink-after & interleave-groups as VPlan transformations (NFC) In-Reply-To: References: Message-ID: Ayal added inline comments. ================ Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:6903 + + if (!Recipe) + Recipe = tryToBlend(Instr, Plan); ---------------- This if (!Recipe) case and the next should be nested? ================ Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:6917 // having first checked for specific widening recipes that deal with // Interleave Groups, Inductions and Phi nodes. if (tryToWiden(Instr, VPBB, Range)) ---------------- Update above comment: we no longer check for Interleave Groups widening recipe here, only Inductions and Phi nodes. ================ Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:7054 - // I is a member of an InterleaveGroup for Range.Start. If it's an adjunct - // member of the IG, do not construct any Recipe for it. - const InterleaveGroup *IG = - CM.getInterleavedAccessGroup(Instr); - if (IG && Instr != IG->getInsertPos() && - Range.Start >= 2 && // Query is illegal for VF == 1 - CM.getWideningDecision(Instr, Range.Start) == - LoopVectorizationCostModel::CM_Interleave) { - auto SinkCandidate = SinkAfterInverse.find(Instr); - if (SinkCandidate != SinkAfterInverse.end()) - Ingredients.push_back(SinkCandidate->second); - continue; - } - - // Move instructions to handle first-order recurrences, step 1: avoid - // handling this instruction until after we've handled the instruction it - // should follow. 
- auto SAIt = SinkAfter.find(Instr); - if (SAIt != SinkAfter.end()) { - LLVM_DEBUG(dbgs() << "Sinking" << *SAIt->first << " after" - << *SAIt->second - << " to vectorize a 1st order recurrence.\n"); - SinkAfterInverse[SAIt->second] = Instr; - continue; - } - - Ingredients.push_back(Instr); - - // Move instructions to handle first-order recurrences, step 2: push the - // instruction to be sunk at its insertion point. - auto SAInvIt = SinkAfterInverse.find(Instr); - if (SAInvIt != SinkAfterInverse.end()) - Ingredients.push_back(SAInvIt->second); - } - - // Introduce each ingredient into VPlan. - for (Instruction *Instr : Ingredients) { - if (RecipeBuilder.tryToCreateRecipe(Instr, Range, Plan, VPBB)) - continue; + bool Widened = RecipeBuilder.tryToCreateRecipe(Instr, Range, Plan, VPBB); ---------------- Can retain the early-exiting "if (tryToCreateRecipe()) continue"? ================ Comment at: llvm/lib/Transforms/Vectorize/VPRecipeBuilder.h:53 + // those ingredients get a VPWidenRecipe, also avoid compressing other + // ingredients into it to avoid having to split such receipes later. + DenseMap Ingredient2Recipe; ---------------- receipes >> recipes ================ Comment at: llvm/lib/Transforms/Vectorize/VPlan.cpp:117 +} + BasicBlock * ---------------- Better place the implementation of removeFromParent() next to that of eraseFromParent() below. ================ Comment at: llvm/lib/Transforms/Vectorize/VPlan.cpp:288 +/// Insert an unlinked instruction into a basic block immediately after the +/// specified instruction. +void VPRecipeBase::insertAfter(VPRecipeBase *InsertPos) { ---------------- Above comment suffices at header file only (and strictly speaking it's about Recipes rather than instructions). Update Parent as done in insertBefore() above. Should (existing) insertBefore() also assert !Parent before setting it? 
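The point of the review comment above (keep the parent pointer in sync on insertion, and possibly assert that the recipe is unlinked first) can be illustrated with a small standalone sketch. This is not code from D68577: `ToyBlock`, `ToyRecipe` and `appendTo()` are hypothetical stand-ins for VPBasicBlock and VPRecipeBase, and a `std::list` of pointers is assumed in place of LLVM's intrusive list.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <list>

// Hypothetical toy types, not LLVM's VPBasicBlock / VPRecipeBase.
struct ToyRecipe;

struct ToyBlock {
  std::list<ToyRecipe *> Recipes; // recipes in program order
};

struct ToyRecipe {
  ToyBlock *Parent = nullptr; // null while the recipe is unlinked

  // Link this (unlinked) recipe at the end of Block.
  void appendTo(ToyBlock *Block) {
    assert(!Parent && "recipe is already linked into a block");
    Parent = Block;
    Block->Recipes.push_back(this);
  }

  // Link this (unlinked) recipe immediately after InsertPos.
  void insertAfter(ToyRecipe *InsertPos) {
    assert(!Parent && "recipe is already linked into a block");
    assert(InsertPos->Parent && "insertion point is not in any block");
    Parent = InsertPos->Parent; // keep the parent pointer in sync
    auto &List = Parent->Recipes;
    auto It = std::find(List.begin(), List.end(), InsertPos);
    assert(It != List.end() && "insertion point missing from its parent");
    List.insert(std::next(It), this);
  }

  // Unlink without deleting, in the spirit of removeFromParent().
  void removeFromParent() {
    assert(Parent && "recipe is not linked into a block");
    Parent->Recipes.remove(this);
    Parent = nullptr;
  }
};
```

With the `!Parent` assertion in place, moving a recipe has to be spelled as `removeFromParent()` followed by `insertAfter()`, which is one way to make a stale parent pointer impossible; whether the real classes should enforce this is the open question in the review.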
================ Comment at: llvm/lib/Transforms/Vectorize/VPlan.h:611 + void removeFromParent(); + /// The method which generates the output IR instructions that correspond to ---------------- Better place removeFromParent() right before eraseFromParent() below, to emphasize that the latter (only) also deletes it. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68577/new/ https://reviews.llvm.org/D68577 From llvm-commits at lists.llvm.org Wed Oct 9 09:21:50 2019 From: llvm-commits at lists.llvm.org (Nikola Prica via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 16:21:50 +0000 (UTC) Subject: [PATCH] D67556: [ARM][AArch64][DebugInfo] Improve call site instruction interpretation In-Reply-To: References: Message-ID: NikolaPrica updated this revision to Diff 224077. NikolaPrica added a comment. -Update `addImmediate()` comment. -Add source code producer for test case. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67556/new/ https://reviews.llvm.org/D67556 Files: include/llvm/CodeGen/TargetInstrInfo.h lib/CodeGen/AsmPrinter/DwarfDebug.cpp lib/CodeGen/TargetInstrInfo.cpp lib/Target/AArch64/AArch64InstrInfo.cpp lib/Target/AArch64/AArch64InstrInfo.h lib/Target/ARM/ARMBaseInstrInfo.cpp lib/Target/ARM/ARMBaseInstrInfo.h test/DebugInfo/MIR/AArch64/dbgcall-site-interpretation.mir test/DebugInfo/MIR/ARM/dbgcall-site-interpretation.mir -------------- next part -------------- A non-text attachment was scrubbed... Name: D67556.224077.patch Type: text/x-patch Size: 27719 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 09:21:51 2019 From: llvm-commits at lists.llvm.org (James Henderson via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 16:21:51 +0000 (UTC) Subject: [PATCH] D68146: [FileCheck] Implement --ignore-case option. In-Reply-To: References: Message-ID: jhenderson added inline comments.
================ Comment at: llvm/test/FileCheck/check-ignore-case.txt:23 +# CHECK-NEXT: break \ No newline at end of file ---------------- Nit: no new line at EOF. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68146/new/ https://reviews.llvm.org/D68146 From llvm-commits at lists.llvm.org Wed Oct 9 09:24:25 2019 From: llvm-commits at lists.llvm.org (Jason Liu via llvm-commits) Date: Wed, 09 Oct 2019 16:24:25 -0000 Subject: [llvm] r374181 - [NFC] Remove files got accidentally upload in llvm-svn 374179 Message-ID: <20191009162426.0142C90FCE@lists.llvm.org> Author: jasonliu Date: Wed Oct 9 09:24:25 2019 New Revision: 374181 URL: http://llvm.org/viewvc/llvm-project?rev=374181&view=rev Log: [NFC] Remove files got accidentally upload in llvm-svn 374179 Removed: llvm/trunk/D68650.diff Removed: llvm/trunk/D68650.diff URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/D68650.diff?rev=374180&view=auto ============================================================================== --- llvm/trunk/D68650.diff (original) +++ llvm/trunk/D68650.diff (removed) @@ -1,34 +0,0 @@ -Index: llvm/include/llvm/Object/XCOFFObjectFile.h -=================================================================== ---- llvm/include/llvm/Object/XCOFFObjectFile.h -+++ llvm/include/llvm/Object/XCOFFObjectFile.h -@@ -113,7 +113,12 @@ - }; - - struct XCOFFCsectAuxEnt32 { -- support::ubig32_t SectionLen; -+ support::ubig32_t -+ SectionOrLength; // If the symbol type is XTY_SD or XTY_CM, the csect -+ // length. -+ // If the symbol type is XTY_LD, the symbol table -+ // index of the containing csect. -+ // If the symbol type is XTY_ER, 0. 
- support::ubig32_t ParameterHashIndex; - support::ubig16_t TypeChkSectNum; - uint8_t SymbolAlignmentAndType; -Index: llvm/tools/llvm-readobj/XCOFFDumper.cpp -=================================================================== ---- llvm/tools/llvm-readobj/XCOFFDumper.cpp -+++ llvm/tools/llvm-readobj/XCOFFDumper.cpp -@@ -213,9 +213,9 @@ - W.printNumber("Index", - Obj.getSymbolIndex(reinterpret_cast(AuxEntPtr))); - if ((AuxEntPtr->SymbolAlignmentAndType & SymbolTypeMask) == XCOFF::XTY_LD) -- W.printNumber("ContainingCsectSymbolIndex", AuxEntPtr->SectionLen); -+ W.printNumber("ContainingCsectSymbolIndex", AuxEntPtr->SectionOrLength); - else -- W.printNumber("SectionLen", AuxEntPtr->SectionLen); -+ W.printNumber("SectionLen", AuxEntPtr->SectionOrLength); - W.printHex("ParameterHashIndex", AuxEntPtr->ParameterHashIndex); - W.printHex("TypeChkSectNum", AuxEntPtr->TypeChkSectNum); - // Print out symbol alignment and type. From llvm-commits at lists.llvm.org Wed Oct 9 09:22:01 2019 From: llvm-commits at lists.llvm.org (Jason Liu via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 16:22:01 +0000 (UTC) Subject: [PATCH] D68650: [AIX][XCOFF][NFC] Change the SectionLen field name of CSect Auxiliary entry to SectionOrLength. In-Reply-To: References: Message-ID: <5163bfe2dd1ea8b6ed8bd2acc7a486f9@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rG6453f700f29a: [AIX][XCOFF][NFC] Change the SectionLen field name of CSect Auxiliary entry to… (authored by jasonliu). 
Changed prior to commit: https://reviews.llvm.org/D68650?vs=223939&id=224078#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68650/new/ https://reviews.llvm.org/D68650 Files: llvm/D68650.diff llvm/include/llvm/Object/XCOFFObjectFile.h llvm/tools/llvm-readobj/XCOFFDumper.cpp Index: llvm/tools/llvm-readobj/XCOFFDumper.cpp =================================================================== --- llvm/tools/llvm-readobj/XCOFFDumper.cpp +++ llvm/tools/llvm-readobj/XCOFFDumper.cpp @@ -161,9 +161,9 @@ W.printNumber("Index", Obj.getSymbolIndex(reinterpret_cast(AuxEntPtr))); if ((AuxEntPtr->SymbolAlignmentAndType & SymbolTypeMask) == XCOFF::XTY_LD) - W.printNumber("ContainingCsectSymbolIndex", AuxEntPtr->SectionLen); + W.printNumber("ContainingCsectSymbolIndex", AuxEntPtr->SectionOrLength); else - W.printNumber("SectionLen", AuxEntPtr->SectionLen); + W.printNumber("SectionLen", AuxEntPtr->SectionOrLength); W.printHex("ParameterHashIndex", AuxEntPtr->ParameterHashIndex); W.printHex("TypeChkSectNum", AuxEntPtr->TypeChkSectNum); // Print out symbol alignment and type. Index: llvm/include/llvm/Object/XCOFFObjectFile.h =================================================================== --- llvm/include/llvm/Object/XCOFFObjectFile.h +++ llvm/include/llvm/Object/XCOFFObjectFile.h @@ -113,7 +113,12 @@ }; struct XCOFFCsectAuxEnt32 { - support::ubig32_t SectionLen; + support::ubig32_t + SectionOrLength; // If the symbol type is XTY_SD or XTY_CM, the csect + // length. + // If the symbol type is XTY_LD, the symbol table + // index of the containing csect. + // If the symbol type is XTY_ER, 0. 
support::ubig32_t ParameterHashIndex; support::ubig16_t TypeChkSectNum; uint8_t SymbolAlignmentAndType; Index: llvm/D68650.diff =================================================================== --- /dev/null +++ llvm/D68650.diff @@ -0,0 +1,34 @@ +Index: llvm/include/llvm/Object/XCOFFObjectFile.h +=================================================================== +--- llvm/include/llvm/Object/XCOFFObjectFile.h ++++ llvm/include/llvm/Object/XCOFFObjectFile.h +@@ -113,7 +113,12 @@ + }; + + struct XCOFFCsectAuxEnt32 { +- support::ubig32_t SectionLen; ++ support::ubig32_t ++ SectionOrLength; // If the symbol type is XTY_SD or XTY_CM, the csect ++ // length. ++ // If the symbol type is XTY_LD, the symbol table ++ // index of the containing csect. ++ // If the symbol type is XTY_ER, 0. + support::ubig32_t ParameterHashIndex; + support::ubig16_t TypeChkSectNum; + uint8_t SymbolAlignmentAndType; +Index: llvm/tools/llvm-readobj/XCOFFDumper.cpp +=================================================================== +--- llvm/tools/llvm-readobj/XCOFFDumper.cpp ++++ llvm/tools/llvm-readobj/XCOFFDumper.cpp +@@ -213,9 +213,9 @@ + W.printNumber("Index", + Obj.getSymbolIndex(reinterpret_cast(AuxEntPtr))); + if ((AuxEntPtr->SymbolAlignmentAndType & SymbolTypeMask) == XCOFF::XTY_LD) +- W.printNumber("ContainingCsectSymbolIndex", AuxEntPtr->SectionLen); ++ W.printNumber("ContainingCsectSymbolIndex", AuxEntPtr->SectionOrLength); + else +- W.printNumber("SectionLen", AuxEntPtr->SectionLen); ++ W.printNumber("SectionLen", AuxEntPtr->SectionOrLength); + W.printHex("ParameterHashIndex", AuxEntPtr->ParameterHashIndex); + W.printHex("TypeChkSectNum", AuxEntPtr->TypeChkSectNum); + // Print out symbol alignment and type. -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68650.224078.patch Type: text/x-patch Size: 3365 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 09:31:50 2019 From: llvm-commits at lists.llvm.org (Momchil Velikov via llvm-commits) Date: Wed, 09 Oct 2019 16:31:50 -0000 Subject: [llvm] r374182 - [AArch64] Ensure no tagged memory is left in the unallocated portion of the Message-ID: <20191009163150.BCBE790B9C@lists.llvm.org> Author: chill Date: Wed Oct 9 09:31:50 2019 New Revision: 374182 URL: http://llvm.org/viewvc/llvm-project?rev=374182&view=rev Log: [AArch64] Ensure no tagged memory is left in the unallocated portion of the stack This patch makes sure that if we tag some memory, we untag that memory before the function returns/throws via any exit, reachable from the tag operation. For that we place the untag operation either at: a) the lifetime end call for the alloca, if that call post-dominates the lifetime start call (where the tag operation is placed), or it (the lifetime end call) dominates all reachable exits, otherwise b) at the reachable exits Differential Revision: https://reviews.llvm.org/D68469 Added: llvm/trunk/test/CodeGen/AArch64/stack-tagging-ex-1.ll llvm/trunk/test/CodeGen/AArch64/stack-tagging-ex-2.ll llvm/trunk/test/CodeGen/AArch64/stack-tagging-untag-placement.ll Modified: llvm/trunk/lib/Target/AArch64/AArch64StackTagging.cpp Modified: llvm/trunk/lib/Target/AArch64/AArch64StackTagging.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AArch64/AArch64StackTagging.cpp?rev=374182&r1=374181&r2=374182&view=diff ============================================================================== --- llvm/trunk/lib/Target/AArch64/AArch64StackTagging.cpp (original) +++ llvm/trunk/lib/Target/AArch64/AArch64StackTagging.cpp Wed Oct 9 09:31:50 2019 @@ -19,6 +19,7 @@ #include "llvm/ADT/Optional.h" #include "llvm/ADT/SmallVector.h" #include "llvm/ADT/Statistic.h" +#include "llvm/Analysis/CFG.h" #include "llvm/Analysis/LoopInfo.h" #include 
"llvm/Analysis/ScalarEvolution.h" #include "llvm/Analysis/ScalarEvolutionExpressions.h" @@ -491,6 +492,24 @@ void AArch64StackTagging::alignAndPadAll Info.AI = NewAI; } +// Helper function to check for post-dominance. +static bool postDominates(const PostDominatorTree *PDT, const IntrinsicInst *A, + const IntrinsicInst *B) { + const BasicBlock *ABB = A->getParent(); + const BasicBlock *BBB = B->getParent(); + + if (ABB != BBB) + return PDT->dominates(ABB, BBB); + + for (const Instruction &I : *ABB) { + if (&I == B) + return true; + if (&I == A) + return false; + } + llvm_unreachable("Corrupt instruction list"); +} + // FIXME: check for MTE extension bool AArch64StackTagging::runOnFunction(Function &Fn) { if (!Fn.hasFnAttribute(Attribute::SanitizeMemTag)) @@ -565,23 +584,31 @@ bool AArch64StackTagging::runOnFunction( if (NumInterestingAllocas == 0) return true; + std::unique_ptr DeleteDT; + DominatorTree *DT = nullptr; + if (auto *P = getAnalysisIfAvailable()) + DT = &P->getDomTree(); + + if (DT == nullptr && (NumInterestingAllocas > 1 || + !F->hasFnAttribute(Attribute::OptimizeNone))) { + DeleteDT = std::make_unique(*F); + DT = DeleteDT.get(); + } + + std::unique_ptr DeletePDT; + PostDominatorTree *PDT = nullptr; + if (auto *P = getAnalysisIfAvailable()) + PDT = &P->getPostDomTree(); + + if (PDT == nullptr && !F->hasFnAttribute(Attribute::OptimizeNone)) { + DeletePDT = std::make_unique(*F); + PDT = DeletePDT.get(); + } + SetTagFunc = Intrinsic::getDeclaration(F->getParent(), Intrinsic::aarch64_settag); - // Compute DT only if the function has the attribute, there are more than 1 - // interesting allocas, and it is not available for free. 
- Instruction *Base; - if (NumInterestingAllocas > 1) { - auto *DTWP = getAnalysisIfAvailable(); - if (DTWP) { - Base = insertBaseTaggedPointer(Allocas, &DTWP->getDomTree()); - } else { - DominatorTree DT(*F); - Base = insertBaseTaggedPointer(Allocas, &DT); - } - } else { - Base = insertBaseTaggedPointer(Allocas, nullptr); - } + Instruction *Base = insertBaseTaggedPointer(Allocas, DT); for (auto &I : Allocas) { const AllocaInfo &Info = I.second; @@ -604,11 +631,37 @@ bool AArch64StackTagging::runOnFunction( if (UnrecognizedLifetimes.empty() && Info.LifetimeStart.size() == 1 && Info.LifetimeEnd.size() == 1) { IntrinsicInst *Start = Info.LifetimeStart[0]; + IntrinsicInst *End = Info.LifetimeEnd[0]; uint64_t Size = dyn_cast(Start->getArgOperand(0))->getZExtValue(); Size = alignTo(Size, kTagGranuleSize); tagAlloca(AI, Start->getNextNode(), Start->getArgOperand(1), Size); - untagAlloca(AI, Info.LifetimeEnd[0], Size); + // We need to ensure that if we tag some object, we certainly untag it + // before the function exits. + if (PDT != nullptr && postDominates(PDT, End, Start)) { + untagAlloca(AI, End, Size); + } else { + SmallVector ReachableRetVec; + unsigned NumCoveredExits = 0; + for (auto &RI : RetVec) { + if (!isPotentiallyReachable(Start, RI, nullptr, DT)) + continue; + ReachableRetVec.push_back(RI); + if (DT != nullptr && DT->dominates(End, RI)) + ++NumCoveredExits; + } + // If there's a mix of covered and non-covered exits, just put the untag + // on exits, so we avoid the redundancy of untagging twice. + if (NumCoveredExits == ReachableRetVec.size()) { + untagAlloca(AI, End, Size); + } else { + for (auto &RI : ReachableRetVec) + untagAlloca(AI, RI, Size); + // We may have inserted untag outside of the lifetime interval. + // Remove the lifetime end call for this alloca. 
+ End->eraseFromParent(); + } + } } else { uint64_t Size = Info.AI->getAllocationSizeInBits(*DL).getValue() / 8; Value *Ptr = IRB.CreatePointerCast(TagPCall, IRB.getInt8PtrTy()); Added: llvm/trunk/test/CodeGen/AArch64/stack-tagging-ex-1.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AArch64/stack-tagging-ex-1.ll?rev=374182&view=auto ============================================================================== --- llvm/trunk/test/CodeGen/AArch64/stack-tagging-ex-1.ll (added) +++ llvm/trunk/test/CodeGen/AArch64/stack-tagging-ex-1.ll Wed Oct 9 09:31:50 2019 @@ -0,0 +1,69 @@ +; RUN: opt -S -stack-tagging %s -o - | FileCheck %s + +target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128" +target triple = "aarch64-arm-unknown-eabi" + +define void @f() local_unnamed_addr #0 personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*) { +start: +; CHECK-LABEL: start: + %a = alloca i8, i32 48, align 8 + call void @llvm.lifetime.start.p0i8(i64 48, i8* nonnull %a) #2 +; CHECK: call void @llvm.aarch64.settag(i8* %a.tag, i64 48) + %b = alloca i8, i32 48, align 8 + call void @llvm.lifetime.start.p0i8(i64 48, i8* nonnull %b) #2 +; CHECK: call void @llvm.aarch64.settag(i8* %b.tag, i64 48) + invoke void @g (i8 * nonnull %a, i8 * nonnull %b) to label %next0 unwind label %lpad0 +; CHECK-NOT: settag + +next0: +; CHECK-LABEL: next0: + call void @llvm.lifetime.end.p0i8(i64 40, i8* nonnull %a) + call void @llvm.lifetime.end.p0i8(i64 40, i8* nonnull %b) + br label %exit +; CHECK-NOT: settag + +lpad0: +; CHECK-LABEL: lpad0: + %pad0v = landingpad { i8*, i32 } catch i8* null + %v = extractvalue { i8*, i32 } %pad0v, 0 + %x = call i8* @__cxa_begin_catch(i8* %v) #2 + invoke void @__cxa_end_catch() to label %next1 unwind label %lpad1 +; CHECK-NOT: settag + +next1: +; CHECK-LABEL: next1: + br label %exit +; CHECK-NOT: settag + +lpad1: +; CHECK-LABEL: lpad1: +; CHECK-DAG: call void @llvm.aarch64.settag(i8* %a, i64 48) +; CHECK-DAG: call void 
@llvm.aarch64.settag(i8* %b, i64 48) + %pad1v = landingpad { i8*, i32 } cleanup + resume { i8*, i32 } %pad1v + +exit: +; CHECK-LABEL: exit: +; CHECK-DAG: call void @llvm.aarch64.settag(i8* %a, i64 48) +; CHECK-DAG: call void @llvm.aarch64.settag(i8* %b, i64 48) + ret void +; CHECK: ret void +} + +declare void @g(i8 *, i8 *) #0 + +declare dso_local i32 @__gxx_personality_v0(...) + +declare dso_local i8* @__cxa_begin_catch(i8*) local_unnamed_addr + +declare dso_local void @__cxa_end_catch() local_unnamed_addr + +; Function Attrs: argmemonly nounwind willreturn +declare void @llvm.lifetime.start.p0i8(i64 immarg, i8* nocapture) #1 + +; Function Attrs: argmemonly nounwind willreturn +declare void @llvm.lifetime.end.p0i8(i64 immarg, i8* nocapture) #1 + +attributes #0 = { sanitize_memtag "correctly-rounded-divide-sqrt-fp-math"="false" "denormal-fp-math"="preserve-sign" "disable-tail-calls"="false" "frame-pointer"="none" "less-precise-fpmad"="false" "min-legal-vector-width"="0" "no-infs-fp-math"="true" "no-jump-tables"="false" "no-nans-fp-math"="true" "no-signed-zeros-fp-math"="true" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="generic" "target-features"="+mte,+neon,+v8.5a" "unsafe-fp-math"="false" "use-soft-float"="false" } +attributes #1 = { argmemonly nounwind willreturn } +attributes #2 = { nounwind } Added: llvm/trunk/test/CodeGen/AArch64/stack-tagging-ex-2.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AArch64/stack-tagging-ex-2.ll?rev=374182&view=auto ============================================================================== --- llvm/trunk/test/CodeGen/AArch64/stack-tagging-ex-2.ll (added) +++ llvm/trunk/test/CodeGen/AArch64/stack-tagging-ex-2.ll Wed Oct 9 09:31:50 2019 @@ -0,0 +1,183 @@ +; clang -target aarch64-eabi -O2 -march=armv8.5-a+memtag -fsanitize=memtag -S -emit-llvm test.cc +; void bar() { +; throw 42; +; } + +; void foo() { +; int A0; +; __asm volatile("" : : "r"(&A0)); + +; try { +; bar(); +; } catch 
(int exc) { +; } + +; throw 15532; +; } + +; int main() { +; try { +; foo(); +; } catch (int exc) { +; } + +; return 0; +; } + +; RUN: opt -S -stack-tagging %s -o - | FileCheck %s + +target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128" +target triple = "aarch64-unknown-unknown-eabi" + + at _ZTIi = external dso_local constant i8* + +; Function Attrs: noreturn sanitize_memtag +define dso_local void @_Z3barv() local_unnamed_addr #0 { +entry: + %exception = tail call i8* @__cxa_allocate_exception(i64 4) #4 + %0 = bitcast i8* %exception to i32* + store i32 42, i32* %0, align 16, !tbaa !2 + tail call void @__cxa_throw(i8* %exception, i8* bitcast (i8** @_ZTIi to i8*), i8* null) #5 + unreachable +} + +declare dso_local i8* @__cxa_allocate_exception(i64) local_unnamed_addr + +declare dso_local void @__cxa_throw(i8*, i8*, i8*) local_unnamed_addr + +; Function Attrs: noreturn sanitize_memtag +define dso_local void @_Z3foov() local_unnamed_addr #0 personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*) { +entry: + %A0 = alloca i32, align 4 + %0 = bitcast i32* %A0 to i8* + call void @llvm.lifetime.start.p0i8(i64 4, i8* nonnull %0) #4 + call void asm sideeffect "", "r"(i32* nonnull %A0) #4, !srcloc !6 + invoke void @_Z3barv() + to label %try.cont unwind label %lpad + +lpad: ; preds = %entry + %1 = landingpad { i8*, i32 } + cleanup + catch i8* bitcast (i8** @_ZTIi to i8*) + %2 = extractvalue { i8*, i32 } %1, 1 + %3 = call i32 @llvm.eh.typeid.for(i8* bitcast (i8** @_ZTIi to i8*)) #4 + %matches = icmp eq i32 %2, %3 + br i1 %matches, label %catch, label %ehcleanup + +catch: ; preds = %lpad + %4 = extractvalue { i8*, i32 } %1, 0 + %5 = call i8* @__cxa_begin_catch(i8* %4) #4 + call void @__cxa_end_catch() #4 + br label %try.cont + +try.cont: ; preds = %entry, %catch + %exception = call i8* @__cxa_allocate_exception(i64 4) #4 + %6 = bitcast i8* %exception to i32* + store i32 15532, i32* %6, align 16, !tbaa !2 + call void @__cxa_throw(i8* %exception, i8* 
bitcast (i8** @_ZTIi to i8*), i8* null) #5 + unreachable + +ehcleanup: ; preds = %lpad + call void @llvm.lifetime.end.p0i8(i64 4, i8* nonnull %0) #4 + resume { i8*, i32 } %1 +} + +; Function Attrs: argmemonly nounwind willreturn +declare void @llvm.lifetime.start.p0i8(i64 immarg, i8* nocapture) #1 + +declare dso_local i32 @__gxx_personality_v0(...) + +; Function Attrs: nounwind readnone +declare i32 @llvm.eh.typeid.for(i8*) #2 + +declare dso_local i8* @__cxa_begin_catch(i8*) local_unnamed_addr + +declare dso_local void @__cxa_end_catch() local_unnamed_addr + +; Function Attrs: argmemonly nounwind willreturn +declare void @llvm.lifetime.end.p0i8(i64 immarg, i8* nocapture) #1 + +; Function Attrs: norecurse sanitize_memtag +define dso_local i32 @main() local_unnamed_addr #3 personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*) { +entry: +; CHECK-LABEL: entry: + %A0.i = alloca i32, align 4 + %0 = bitcast i32* %A0.i to i8* + call void @llvm.lifetime.start.p0i8(i64 4, i8* nonnull %0) #4 + call void asm sideeffect "", "r"(i32* nonnull %A0.i) #4, !srcloc !6 +; CHECK: call void @llvm.aarch64.settag(i8* %1, i64 16) +; CHECK-NEXT: call void asm sideeffect + %exception.i6 = call i8* @__cxa_allocate_exception(i64 4) #4 + %1 = bitcast i8* %exception.i6 to i32* + store i32 42, i32* %1, align 16, !tbaa !2 + invoke void @__cxa_throw(i8* %exception.i6, i8* bitcast (i8** @_ZTIi to i8*), i8* null) #5 + to label %.noexc7 unwind label %lpad.i + +.noexc7: ; preds = %entry + unreachable + +lpad.i: ; preds = %entry + %2 = landingpad { i8*, i32 } + cleanup + catch i8* bitcast (i8** @_ZTIi to i8*) + %3 = extractvalue { i8*, i32 } %2, 1 + %4 = call i32 @llvm.eh.typeid.for(i8* bitcast (i8** @_ZTIi to i8*)) #4 + %matches.i = icmp eq i32 %3, %4 + br i1 %matches.i, label %catch.i, label %ehcleanup.i + +catch.i: ; preds = %lpad.i + %5 = extractvalue { i8*, i32 } %2, 0 + %6 = call i8* @__cxa_begin_catch(i8* %5) #4 + call void @__cxa_end_catch() #4 + %exception.i = call i8* 
@__cxa_allocate_exception(i64 4) #4 + %7 = bitcast i8* %exception.i to i32* + store i32 15532, i32* %7, align 16, !tbaa !2 + invoke void @__cxa_throw(i8* %exception.i, i8* bitcast (i8** @_ZTIi to i8*), i8* null) #5 + to label %.noexc unwind label %lpad + +.noexc: ; preds = %catch.i + unreachable + +ehcleanup.i: ; preds = %lpad.i + call void @llvm.lifetime.end.p0i8(i64 4, i8* nonnull %0) #4 + br label %lpad.body + +lpad: ; preds = %catch.i + %8 = landingpad { i8*, i32 } + catch i8* bitcast (i8** @_ZTIi to i8*) + %.pre = extractvalue { i8*, i32 } %8, 1 + br label %lpad.body + +lpad.body: ; preds = %ehcleanup.i, %lpad + %.pre-phi = phi i32 [ %3, %ehcleanup.i ], [ %.pre, %lpad ] + %eh.lpad-body = phi { i8*, i32 } [ %2, %ehcleanup.i ], [ %8, %lpad ] + %matches = icmp eq i32 %.pre-phi, %4 + br i1 %matches, label %catch, label %eh.resume + +catch: ; preds = %lpad.body + %9 = extractvalue { i8*, i32 } %eh.lpad-body, 0 + %10 = call i8* @__cxa_begin_catch(i8* %9) #4 + call void @__cxa_end_catch() #4 + ret i32 0 + +eh.resume: ; preds = %lpad.body + resume { i8*, i32 } %eh.lpad-body +} + +attributes #0 = { noreturn sanitize_memtag "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "frame-pointer"="all" "less-precise-fpmad"="false" "min-legal-vector-width"="0" "no-infs-fp-math"="false" "no-jump-tables"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="generic" "target-features"="+mte,+neon,+v8.5a" "unsafe-fp-math"="false" "use-soft-float"="false" } +attributes #1 = { argmemonly nounwind willreturn } +attributes #2 = { nounwind readnone } +attributes #3 = { norecurse sanitize_memtag "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "frame-pointer"="all" "less-precise-fpmad"="false" "min-legal-vector-width"="0" "no-infs-fp-math"="false" "no-jump-tables"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" 
"no-trapping-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="generic" "target-features"="+mte,+neon,+v8.5a" "unsafe-fp-math"="false" "use-soft-float"="false" } +attributes #4 = { nounwind } +attributes #5 = { noreturn } + +!llvm.module.flags = !{!0} +!llvm.ident = !{!1} + +!0 = !{i32 1, !"wchar_size", i32 4} +!1 = !{!"clang version 10.0.0 (https://github.com/llvm/llvm-project.git c38188c5fe41751fda095edde1a878b2a051ae58)"} +!2 = !{!3, !3, i64 0} +!3 = !{!"int", !4, i64 0} +!4 = !{!"omnipotent char", !5, i64 0} +!5 = !{!"Simple C++ TBAA"} +!6 = !{i32 70} Added: llvm/trunk/test/CodeGen/AArch64/stack-tagging-untag-placement.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AArch64/stack-tagging-untag-placement.ll?rev=374182&view=auto ============================================================================== --- llvm/trunk/test/CodeGen/AArch64/stack-tagging-untag-placement.ll (added) +++ llvm/trunk/test/CodeGen/AArch64/stack-tagging-untag-placement.ll Wed Oct 9 09:31:50 2019 @@ -0,0 +1,82 @@ +;; RUN: opt -S -stack-tagging %s -o - | FileCheck %s +target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128" +target triple = "aarch64-arm-unknown-eabi" + +define void @f() local_unnamed_addr #0 { +S0: +; CHECK-LABEL: S0: +; CHECK: %basetag = call i8* @llvm.aarch64.irg.sp(i64 0) + %v = alloca i8, i32 48, align 8 +; CHECK: %v.tag = call i8* @llvm.aarch64.tagp.p0i8(i8* %v, i8* %basetag, i64 0) + %w = alloca i8, i32 48, align 16 +; CHECK: %w.tag = call i8* @llvm.aarch64.tagp.p0i8(i8* %w, i8* %basetag, i64 1) + + %t0 = call i32 @g0() #1 + %b0 = icmp eq i32 %t0, 0 + br i1 %b0, label %S1, label %exit3 + +S1: +; CHECK-LABEL: S1: + call void @llvm.lifetime.start.p0i8(i64 48, i8 * nonnull %v) #1 +; CHECK: call void @llvm.aarch64.settag(i8* %v.tag, i64 48) + call void @llvm.lifetime.start.p0i8(i64 48, i8 * nonnull %w) #1 +; CHECK: call void @llvm.aarch64.settag(i8* %w.tag, i64 48) + %t1 = call i32 @g1(i8 * nonnull %v, i8 * nonnull %w) 
#1 +; CHECK: call i32 @g1 +; CHECK-NOT: settag{{.*}}%v +; CHECK: call void @llvm.aarch64.settag(i8* %w, i64 48) +; CHECK-NOT: settag{{.*}}%v + call void @llvm.lifetime.end.p0i8(i64 48, i8 * nonnull %w) #1 +; CHECK: call void @llvm.lifetime.end.p0i8(i64 48, i8* nonnull %w.tag) + %b1 = icmp eq i32 %t1, 0 + br i1 %b1, label %S2, label %S3 +; CHECK-NOT: settag + +S2: +; CHECK-LABEL: S2: + call void @z0() #1 + br label %exit1 +; CHECK-NOT: settag + +S3: +; CHECK-LABEL: S3: + call void @llvm.lifetime.end.p0i8(i64 48, i8 * nonnull %v) #1 + tail call void @z1() #1 + br label %exit2 +; CHECK-NOT: settag + +exit1: +; CHECK-LABEL: exit1: +; CHECK: call void @llvm.aarch64.settag(i8* %v, i64 48) + ret void + +exit2: +; CHECK-LABEL: exit2: +; CHECK: call void @llvm.aarch64.settag(i8* %v, i64 48) + ret void + +exit3: +; CHECK-LABEL: exit3: + call void @z2() #1 +; CHECK-NOT: settag + ret void +; CHECK: ret void +} + +declare i32 @g0() #0 + +declare i32 @g1(i8 *, i8 *) #0 + +declare void @z0() #0 + +declare void @z1() #0 + +declare void @z2() #0 + +declare void @llvm.lifetime.start.p0i8(i64 immarg, i8 * nocapture) #1 + +declare void @llvm.lifetime.end.p0i8(i64 immarg, i8 * nocapture) #1 + +attributes #0 = { sanitize_memtag "correctly-rounded-divide-sqrt-fp-math"="false" "denormal-fp-math"="preserve-sign" "disable-tail-calls"="false" "frame-pointer"="none" "less-precise-fpmad"="false" "min-legal-vector-width"="0" "no-infs-fp-math"="true" "no-jump-tables"="false" "no-nans-fp-math"="true" "no-signed-zeros-fp-math"="true" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="generic" "target-features"="+mte,+neon,+v8.5a" "unsafe-fp-math"="false" "use-soft-float"="false" } +attributes #1 = { nounwind } + From llvm-commits at lists.llvm.org Wed Oct 9 09:32:49 2019 From: llvm-commits at lists.llvm.org (Sanjay Patel via llvm-commits) Date: Wed, 09 Oct 2019 16:32:49 -0000 Subject: [llvm] r374183 - [SLP] respect target register width for GEP vectorization (PR43578) 
Message-ID: <20191009163249.F275E90FD1@lists.llvm.org> Author: spatel Date: Wed Oct 9 09:32:49 2019 New Revision: 374183 URL: http://llvm.org/viewvc/llvm-project?rev=374183&view=rev Log: [SLP] respect target register width for GEP vectorization (PR43578) We failed to account for the target register width (max vector factor) when vectorizing starting from GEPs. This causes vectorization to proceed to obviously illegal widths as in: https://bugs.llvm.org/show_bug.cgi?id=43578 For x86, this also means that SLP can produce rogue AVX or AVX512 code even when the user specifies a narrower vector width. The AArch64 test in ext-trunc.ll appears to be better using the narrower width. I'm not exactly sure what getelementptr.ll is trying to do, but it's testing with "-slp-threshold=-18", so I'm not worried about those diffs. The x86 test is an over-reduction from SPEC h264; this patch appears to restore the perf loss caused by SLP when using -march=haswell. Differential Revision: https://reviews.llvm.org/D68667 Modified: llvm/trunk/lib/Transforms/Vectorize/SLPVectorizer.cpp llvm/trunk/test/Transforms/SLPVectorizer/AArch64/ext-trunc.ll llvm/trunk/test/Transforms/SLPVectorizer/AArch64/getelementptr.ll llvm/trunk/test/Transforms/SLPVectorizer/X86/load-merge.ll Modified: llvm/trunk/lib/Transforms/Vectorize/SLPVectorizer.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Vectorize/SLPVectorizer.cpp?rev=374183&r1=374182&r2=374183&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Vectorize/SLPVectorizer.cpp (original) +++ llvm/trunk/lib/Transforms/Vectorize/SLPVectorizer.cpp Wed Oct 9 09:32:49 2019 @@ -6981,10 +6981,16 @@ bool SLPVectorizerPass::vectorizeGEPIndi LLVM_DEBUG(dbgs() << "SLP: Analyzing a getelementptr list of length " << Entry.second.size() << ".\n"); - // We process the getelementptr list in chunks of 16 (like we do for - // stores) to minimize compile-time. 
- for (unsigned BI = 0, BE = Entry.second.size(); BI < BE; BI += 16) { - auto Len = std::min(BE - BI, 16); + // Process the GEP list in chunks suitable for the target's supported + // vector size. If a vector register can't hold 1 element, we are done. + unsigned MaxVecRegSize = R.getMaxVecRegSize(); + unsigned EltSize = R.getVectorElementSize(Entry.second[0]); + if (MaxVecRegSize < EltSize) + continue; + + unsigned MaxElts = MaxVecRegSize / EltSize; + for (unsigned BI = 0, BE = Entry.second.size(); BI < BE; BI += MaxElts) { + auto Len = std::min(BE - BI, MaxElts); auto GEPList = makeArrayRef(&Entry.second[BI], Len); // Initialize a set a candidate getelementptrs. Note that we use a Modified: llvm/trunk/test/Transforms/SLPVectorizer/AArch64/ext-trunc.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/SLPVectorizer/AArch64/ext-trunc.ll?rev=374183&r1=374182&r2=374183&view=diff ============================================================================== --- llvm/trunk/test/Transforms/SLPVectorizer/AArch64/ext-trunc.ll (original) +++ llvm/trunk/test/Transforms/SLPVectorizer/AArch64/ext-trunc.ll Wed Oct 9 09:32:49 2019 @@ -61,23 +61,25 @@ define void @test2(<4 x i16> %a, <4 x i1 ; CHECK-NEXT: [[Z0:%.*]] = zext <4 x i16> [[A:%.*]] to <4 x i32> ; CHECK-NEXT: [[Z1:%.*]] = zext <4 x i16> [[B:%.*]] to <4 x i32> ; CHECK-NEXT: [[SUB0:%.*]] = sub <4 x i32> [[Z0]], [[Z1]] -; CHECK-NEXT: [[TMP0:%.*]] = sext <4 x i32> [[SUB0]] to <4 x i64> -; CHECK-NEXT: [[TMP1:%.*]] = insertelement <4 x i64> undef, i64 [[C0:%.*]], i32 0 -; CHECK-NEXT: [[TMP2:%.*]] = insertelement <4 x i64> [[TMP1]], i64 [[C1:%.*]], i32 1 -; CHECK-NEXT: [[TMP3:%.*]] = insertelement <4 x i64> [[TMP2]], i64 [[C2:%.*]], i32 2 -; CHECK-NEXT: [[TMP4:%.*]] = insertelement <4 x i64> [[TMP3]], i64 [[C3:%.*]], i32 3 -; CHECK-NEXT: [[TMP5:%.*]] = add <4 x i64> [[TMP0]], [[TMP4]] -; CHECK-NEXT: [[TMP6:%.*]] = extractelement <4 x i64> [[TMP5]], i32 0 -; CHECK-NEXT: [[GEP0:%.*]] = getelementptr inbounds 
i64, i64* [[P:%.*]], i64 [[TMP6]] +; CHECK-NEXT: [[E0:%.*]] = extractelement <4 x i32> [[SUB0]], i32 0 +; CHECK-NEXT: [[S0:%.*]] = sext i32 [[E0]] to i64 +; CHECK-NEXT: [[A0:%.*]] = add i64 [[S0]], [[C0:%.*]] +; CHECK-NEXT: [[GEP0:%.*]] = getelementptr inbounds i64, i64* [[P:%.*]], i64 [[A0]] ; CHECK-NEXT: [[LOAD0:%.*]] = load i64, i64* [[GEP0]] -; CHECK-NEXT: [[TMP7:%.*]] = extractelement <4 x i64> [[TMP5]], i32 1 -; CHECK-NEXT: [[GEP1:%.*]] = getelementptr inbounds i64, i64* [[P]], i64 [[TMP7]] +; CHECK-NEXT: [[E1:%.*]] = extractelement <4 x i32> [[SUB0]], i32 1 +; CHECK-NEXT: [[S1:%.*]] = sext i32 [[E1]] to i64 +; CHECK-NEXT: [[A1:%.*]] = add i64 [[S1]], [[C1:%.*]] +; CHECK-NEXT: [[GEP1:%.*]] = getelementptr inbounds i64, i64* [[P]], i64 [[A1]] ; CHECK-NEXT: [[LOAD1:%.*]] = load i64, i64* [[GEP1]] -; CHECK-NEXT: [[TMP8:%.*]] = extractelement <4 x i64> [[TMP5]], i32 2 -; CHECK-NEXT: [[GEP2:%.*]] = getelementptr inbounds i64, i64* [[P]], i64 [[TMP8]] +; CHECK-NEXT: [[E2:%.*]] = extractelement <4 x i32> [[SUB0]], i32 2 +; CHECK-NEXT: [[S2:%.*]] = sext i32 [[E2]] to i64 +; CHECK-NEXT: [[A2:%.*]] = add i64 [[S2]], [[C2:%.*]] +; CHECK-NEXT: [[GEP2:%.*]] = getelementptr inbounds i64, i64* [[P]], i64 [[A2]] ; CHECK-NEXT: [[LOAD2:%.*]] = load i64, i64* [[GEP2]] -; CHECK-NEXT: [[TMP9:%.*]] = extractelement <4 x i64> [[TMP5]], i32 3 -; CHECK-NEXT: [[GEP3:%.*]] = getelementptr inbounds i64, i64* [[P]], i64 [[TMP9]] +; CHECK-NEXT: [[E3:%.*]] = extractelement <4 x i32> [[SUB0]], i32 3 +; CHECK-NEXT: [[S3:%.*]] = sext i32 [[E3]] to i64 +; CHECK-NEXT: [[A3:%.*]] = add i64 [[S3]], [[C3:%.*]] +; CHECK-NEXT: [[GEP3:%.*]] = getelementptr inbounds i64, i64* [[P]], i64 [[A3]] ; CHECK-NEXT: [[LOAD3:%.*]] = load i64, i64* [[GEP3]] ; CHECK-NEXT: call void @foo(i64 [[LOAD0]], i64 [[LOAD1]], i64 [[LOAD2]], i64 [[LOAD3]]) ; CHECK-NEXT: ret void Modified: llvm/trunk/test/Transforms/SLPVectorizer/AArch64/getelementptr.ll URL: 
http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/SLPVectorizer/AArch64/getelementptr.ll?rev=374183&r1=374182&r2=374183&view=diff ============================================================================== --- llvm/trunk/test/Transforms/SLPVectorizer/AArch64/getelementptr.ll (original) +++ llvm/trunk/test/Transforms/SLPVectorizer/AArch64/getelementptr.ll Wed Oct 9 09:32:49 2019 @@ -23,10 +23,7 @@ target triple = "aarch64--linux-gnu" ; } ; -; YAML: --- !Passed -; YAML-NEXT: Pass: slp-vectorizer -; YAML-NEXT: Name: VectorizedList -; YAML-NEXT: Function: getelementptr_4x32 +; YAML-LABEL: Function: getelementptr_4x32 ; YAML-NEXT: Args: ; YAML-NEXT: - String: 'SLP vectorized with cost ' ; YAML-NEXT: - Cost: '11' @@ -39,7 +36,7 @@ target triple = "aarch64--linux-gnu" ; YAML-NEXT: Function: getelementptr_4x32 ; YAML-NEXT: Args: ; YAML-NEXT: - String: 'SLP vectorized with cost ' -; YAML-NEXT: - Cost: '16' +; YAML-NEXT: - Cost: '6' ; YAML-NEXT: - String: ' and with tree size ' ; YAML-NEXT: - TreeSize: '3' @@ -49,49 +46,50 @@ define i32 @getelementptr_4x32(i32* noca ; CHECK-NEXT: [[CMP31:%.*]] = icmp sgt i32 [[N:%.*]], 0 ; CHECK-NEXT: br i1 [[CMP31]], label [[FOR_BODY_PREHEADER:%.*]], label [[FOR_COND_CLEANUP:%.*]] ; CHECK: for.body.preheader: -; CHECK-NEXT: [[TMP0:%.*]] = insertelement <4 x i32> , i32 [[X:%.*]], i32 1 -; CHECK-NEXT: [[TMP1:%.*]] = insertelement <4 x i32> [[TMP0]], i32 [[Y:%.*]], i32 2 -; CHECK-NEXT: [[TMP2:%.*]] = insertelement <4 x i32> [[TMP1]], i32 [[Z:%.*]], i32 3 +; CHECK-NEXT: [[TMP0:%.*]] = insertelement <2 x i32> , i32 [[X:%.*]], i32 1 +; CHECK-NEXT: [[TMP1:%.*]] = insertelement <2 x i32> undef, i32 [[Y:%.*]], i32 0 +; CHECK-NEXT: [[TMP2:%.*]] = insertelement <2 x i32> [[TMP1]], i32 [[Z:%.*]], i32 1 ; CHECK-NEXT: br label [[FOR_BODY:%.*]] ; CHECK: for.cond.cleanup.loopexit: -; CHECK-NEXT: [[TMP3:%.*]] = extractelement <2 x i32> [[TMP21:%.*]], i32 1 +; CHECK-NEXT: [[TMP3:%.*]] = extractelement <2 x i32> [[TMP22:%.*]], i32 1 ; 
CHECK-NEXT: br label [[FOR_COND_CLEANUP]] ; CHECK: for.cond.cleanup: ; CHECK-NEXT: [[SUM_0_LCSSA:%.*]] = phi i32 [ 0, [[ENTRY:%.*]] ], [ [[TMP3]], [[FOR_COND_CLEANUP_LOOPEXIT:%.*]] ] ; CHECK-NEXT: ret i32 [[SUM_0_LCSSA]] ; CHECK: for.body: -; CHECK-NEXT: [[TMP4:%.*]] = phi <2 x i32> [ zeroinitializer, [[FOR_BODY_PREHEADER]] ], [ [[TMP21]], [[FOR_BODY]] ] +; CHECK-NEXT: [[TMP4:%.*]] = phi <2 x i32> [ zeroinitializer, [[FOR_BODY_PREHEADER]] ], [ [[TMP22]], [[FOR_BODY]] ] ; CHECK-NEXT: [[TMP5:%.*]] = extractelement <2 x i32> [[TMP4]], i32 0 ; CHECK-NEXT: [[T4:%.*]] = shl nsw i32 [[TMP5]], 1 -; CHECK-NEXT: [[TMP6:%.*]] = insertelement <4 x i32> undef, i32 [[T4]], i32 0 -; CHECK-NEXT: [[TMP7:%.*]] = shufflevector <4 x i32> [[TMP6]], <4 x i32> undef, <4 x i32> zeroinitializer -; CHECK-NEXT: [[TMP8:%.*]] = add nsw <4 x i32> [[TMP7]], [[TMP2]] -; CHECK-NEXT: [[TMP9:%.*]] = extractelement <4 x i32> [[TMP8]], i32 0 +; CHECK-NEXT: [[TMP6:%.*]] = insertelement <2 x i32> undef, i32 [[T4]], i32 0 +; CHECK-NEXT: [[TMP7:%.*]] = shufflevector <2 x i32> [[TMP6]], <2 x i32> undef, <2 x i32> zeroinitializer +; CHECK-NEXT: [[TMP8:%.*]] = add nsw <2 x i32> [[TMP7]], [[TMP0]] +; CHECK-NEXT: [[TMP9:%.*]] = extractelement <2 x i32> [[TMP8]], i32 0 ; CHECK-NEXT: [[TMP10:%.*]] = sext i32 [[TMP9]] to i64 ; CHECK-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i32, i32* [[G:%.*]], i64 [[TMP10]] ; CHECK-NEXT: [[T6:%.*]] = load i32, i32* [[ARRAYIDX]], align 4 ; CHECK-NEXT: [[TMP11:%.*]] = extractelement <2 x i32> [[TMP4]], i32 1 ; CHECK-NEXT: [[ADD1:%.*]] = add nsw i32 [[T6]], [[TMP11]] -; CHECK-NEXT: [[TMP12:%.*]] = extractelement <4 x i32> [[TMP8]], i32 1 +; CHECK-NEXT: [[TMP12:%.*]] = extractelement <2 x i32> [[TMP8]], i32 1 ; CHECK-NEXT: [[TMP13:%.*]] = sext i32 [[TMP12]] to i64 ; CHECK-NEXT: [[ARRAYIDX5:%.*]] = getelementptr inbounds i32, i32* [[G]], i64 [[TMP13]] ; CHECK-NEXT: [[T8:%.*]] = load i32, i32* [[ARRAYIDX5]], align 4 ; CHECK-NEXT: [[ADD6:%.*]] = add nsw i32 [[ADD1]], [[T8]] -; 
CHECK-NEXT: [[TMP14:%.*]] = extractelement <4 x i32> [[TMP8]], i32 2 -; CHECK-NEXT: [[TMP15:%.*]] = sext i32 [[TMP14]] to i64 -; CHECK-NEXT: [[ARRAYIDX10:%.*]] = getelementptr inbounds i32, i32* [[G]], i64 [[TMP15]] +; CHECK-NEXT: [[TMP14:%.*]] = add nsw <2 x i32> [[TMP7]], [[TMP2]] +; CHECK-NEXT: [[TMP15:%.*]] = extractelement <2 x i32> [[TMP14]], i32 0 +; CHECK-NEXT: [[TMP16:%.*]] = sext i32 [[TMP15]] to i64 +; CHECK-NEXT: [[ARRAYIDX10:%.*]] = getelementptr inbounds i32, i32* [[G]], i64 [[TMP16]] ; CHECK-NEXT: [[T10:%.*]] = load i32, i32* [[ARRAYIDX10]], align 4 ; CHECK-NEXT: [[ADD11:%.*]] = add nsw i32 [[ADD6]], [[T10]] -; CHECK-NEXT: [[TMP16:%.*]] = extractelement <4 x i32> [[TMP8]], i32 3 -; CHECK-NEXT: [[TMP17:%.*]] = sext i32 [[TMP16]] to i64 -; CHECK-NEXT: [[ARRAYIDX15:%.*]] = getelementptr inbounds i32, i32* [[G]], i64 [[TMP17]] +; CHECK-NEXT: [[TMP17:%.*]] = extractelement <2 x i32> [[TMP14]], i32 1 +; CHECK-NEXT: [[TMP18:%.*]] = sext i32 [[TMP17]] to i64 +; CHECK-NEXT: [[ARRAYIDX15:%.*]] = getelementptr inbounds i32, i32* [[G]], i64 [[TMP18]] ; CHECK-NEXT: [[T12:%.*]] = load i32, i32* [[ARRAYIDX15]], align 4 -; CHECK-NEXT: [[TMP18:%.*]] = insertelement <2 x i32> undef, i32 [[TMP5]], i32 0 -; CHECK-NEXT: [[TMP19:%.*]] = insertelement <2 x i32> [[TMP18]], i32 [[ADD11]], i32 1 -; CHECK-NEXT: [[TMP20:%.*]] = insertelement <2 x i32> , i32 [[T12]], i32 1 -; CHECK-NEXT: [[TMP21]] = add nsw <2 x i32> [[TMP19]], [[TMP20]] -; CHECK-NEXT: [[TMP22:%.*]] = extractelement <2 x i32> [[TMP21]], i32 0 -; CHECK-NEXT: [[EXITCOND:%.*]] = icmp eq i32 [[TMP22]], [[N]] +; CHECK-NEXT: [[TMP19:%.*]] = insertelement <2 x i32> undef, i32 [[TMP5]], i32 0 +; CHECK-NEXT: [[TMP20:%.*]] = insertelement <2 x i32> [[TMP19]], i32 [[ADD11]], i32 1 +; CHECK-NEXT: [[TMP21:%.*]] = insertelement <2 x i32> , i32 [[T12]], i32 1 +; CHECK-NEXT: [[TMP22]] = add nsw <2 x i32> [[TMP20]], [[TMP21]] +; CHECK-NEXT: [[TMP23:%.*]] = extractelement <2 x i32> [[TMP22]], i32 0 +; CHECK-NEXT: [[EXITCOND:%.*]] 
= icmp eq i32 [[TMP23]], [[N]] ; CHECK-NEXT: br i1 [[EXITCOND]], label [[FOR_COND_CLEANUP_LOOPEXIT]], label [[FOR_BODY]] ; entry: @@ -133,10 +131,7 @@ for.body: br i1 %exitcond, label %for.cond.cleanup.loopexit, label %for.body } -; YAML: --- !Passed -; YAML-NEXT: Pass: slp-vectorizer -; YAML-NEXT: Name: VectorizedList -; YAML-NEXT: Function: getelementptr_2x32 +; YAML-LABEL: Function: getelementptr_2x32 ; YAML-NEXT: Args: ; YAML-NEXT: - String: 'SLP vectorized with cost ' ; YAML-NEXT: - Cost: '11' Modified: llvm/trunk/test/Transforms/SLPVectorizer/X86/load-merge.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/SLPVectorizer/X86/load-merge.ll?rev=374183&r1=374182&r2=374183&view=diff ============================================================================== --- llvm/trunk/test/Transforms/SLPVectorizer/X86/load-merge.ll (original) +++ llvm/trunk/test/Transforms/SLPVectorizer/X86/load-merge.ll Wed Oct 9 09:32:49 2019 @@ -153,19 +153,24 @@ define void @PR43578_prefer128(i32* %r, ; CHECK-NEXT: [[Q1:%.*]] = getelementptr inbounds i64, i64* [[Q]], i64 1 ; CHECK-NEXT: [[Q2:%.*]] = getelementptr inbounds i64, i64* [[Q]], i64 2 ; CHECK-NEXT: [[Q3:%.*]] = getelementptr inbounds i64, i64* [[Q]], i64 3 -; CHECK-NEXT: [[TMP1:%.*]] = bitcast i64* [[P0]] to <4 x i64>* -; CHECK-NEXT: [[TMP2:%.*]] = load <4 x i64>, <4 x i64>* [[TMP1]], align 2 -; CHECK-NEXT: [[TMP3:%.*]] = bitcast i64* [[Q0]] to <4 x i64>* -; CHECK-NEXT: [[TMP4:%.*]] = load <4 x i64>, <4 x i64>* [[TMP3]], align 2 -; CHECK-NEXT: [[TMP5:%.*]] = sub nsw <4 x i64> [[TMP2]], [[TMP4]] -; CHECK-NEXT: [[TMP6:%.*]] = extractelement <4 x i64> [[TMP5]], i32 0 -; CHECK-NEXT: [[G0:%.*]] = getelementptr inbounds i32, i32* [[R:%.*]], i64 [[TMP6]] -; CHECK-NEXT: [[TMP7:%.*]] = extractelement <4 x i64> [[TMP5]], i32 1 -; CHECK-NEXT: [[G1:%.*]] = getelementptr inbounds i32, i32* [[R]], i64 [[TMP7]] -; CHECK-NEXT: [[TMP8:%.*]] = extractelement <4 x i64> [[TMP5]], i32 2 -; CHECK-NEXT: [[G2:%.*]] = 
getelementptr inbounds i32, i32* [[R]], i64 [[TMP8]] -; CHECK-NEXT: [[TMP9:%.*]] = extractelement <4 x i64> [[TMP5]], i32 3 -; CHECK-NEXT: [[G3:%.*]] = getelementptr inbounds i32, i32* [[R]], i64 [[TMP9]] +; CHECK-NEXT: [[TMP1:%.*]] = bitcast i64* [[P0]] to <2 x i64>* +; CHECK-NEXT: [[TMP2:%.*]] = load <2 x i64>, <2 x i64>* [[TMP1]], align 2 +; CHECK-NEXT: [[TMP3:%.*]] = bitcast i64* [[P2]] to <2 x i64>* +; CHECK-NEXT: [[TMP4:%.*]] = load <2 x i64>, <2 x i64>* [[TMP3]], align 2 +; CHECK-NEXT: [[TMP5:%.*]] = bitcast i64* [[Q0]] to <2 x i64>* +; CHECK-NEXT: [[TMP6:%.*]] = load <2 x i64>, <2 x i64>* [[TMP5]], align 2 +; CHECK-NEXT: [[TMP7:%.*]] = bitcast i64* [[Q2]] to <2 x i64>* +; CHECK-NEXT: [[TMP8:%.*]] = load <2 x i64>, <2 x i64>* [[TMP7]], align 2 +; CHECK-NEXT: [[TMP9:%.*]] = sub nsw <2 x i64> [[TMP2]], [[TMP6]] +; CHECK-NEXT: [[TMP10:%.*]] = sub nsw <2 x i64> [[TMP4]], [[TMP8]] +; CHECK-NEXT: [[TMP11:%.*]] = extractelement <2 x i64> [[TMP9]], i32 0 +; CHECK-NEXT: [[G0:%.*]] = getelementptr inbounds i32, i32* [[R:%.*]], i64 [[TMP11]] +; CHECK-NEXT: [[TMP12:%.*]] = extractelement <2 x i64> [[TMP9]], i32 1 +; CHECK-NEXT: [[G1:%.*]] = getelementptr inbounds i32, i32* [[R]], i64 [[TMP12]] +; CHECK-NEXT: [[TMP13:%.*]] = extractelement <2 x i64> [[TMP10]], i32 0 +; CHECK-NEXT: [[G2:%.*]] = getelementptr inbounds i32, i32* [[R]], i64 [[TMP13]] +; CHECK-NEXT: [[TMP14:%.*]] = extractelement <2 x i64> [[TMP10]], i32 1 +; CHECK-NEXT: [[G3:%.*]] = getelementptr inbounds i32, i32* [[R]], i64 [[TMP14]] ; CHECK-NEXT: ret void ; %p0 = getelementptr inbounds i64, i64* %p, i64 0 From llvm-commits at lists.llvm.org Wed Oct 9 09:31:25 2019 From: llvm-commits at lists.llvm.org (Jason Liu via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 16:31:25 +0000 (UTC) Subject: [PATCH] D68650: [AIX][XCOFF][NFC] Change the SectionLen field name of CSect Auxiliary entry to SectionOrLength. 
In-Reply-To:
References:
Message-ID: <9efedc29f9022c89aef2e119efbd51af@localhost.localdomain>

jasonliu added a comment.

Sorry, I accidentally committed llvm/D68650 .diff. I already removed it with another commit llvm-svn: 374181.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68650/new/

https://reviews.llvm.org/D68650

From llvm-commits at lists.llvm.org Wed Oct 9 09:31:26 2019
From: llvm-commits at lists.llvm.org (Mirko Brkusanin via Phabricator via llvm-commits)
Date: Wed, 09 Oct 2019 16:31:26 +0000 (UTC)
Subject: [PATCH] D68390: [Mips] Emit proper ABI for _mcount calls
In-Reply-To:
References:
Message-ID: <56a71d34985b71c3d8ec431074e518cd@localhost.localdomain>

mbrkusanin added a comment.

In D68390#1701209, @RKSimon wrote:

> This is causing failures on EXPENSIVE_CHECKS builds, please can you take a look? Thanks.

I will look into it as soon as possible.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68390/new/

https://reviews.llvm.org/D68390

From llvm-commits at lists.llvm.org Wed Oct 9 09:31:26 2019
From: llvm-commits at lists.llvm.org (Matt Arsenault via Phabricator via llvm-commits)
Date: Wed, 09 Oct 2019 16:31:26 +0000 (UTC)
Subject: [PATCH] D68690: AMDGPU/SILoadStoreOptimizer: fix a likely bug introduced recently
In-Reply-To:
References:
Message-ID: <6a3785c4d68fbaac122d42fa571a36e9@localhost.localdomain>

arsenm added a comment.

Testcase?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68690/new/

https://reviews.llvm.org/D68690

From llvm-commits at lists.llvm.org Wed Oct 9 09:31:41 2019
From: llvm-commits at lists.llvm.org (Momchil Velikov via Phabricator via llvm-commits)
Date: Wed, 09 Oct 2019 16:31:41 +0000 (UTC)
Subject: [PATCH] D68469: [AArch64] Ensure no tagged memory is left in the unallocated portion of the stack
In-Reply-To:
References:
Message-ID: <209737f34eef59e67b9c8e481be9a60c@localhost.localdomain>

This revision was automatically updated to reflect the committed changes.
Closed by commit rGd037a5f06538: [AArch64] Ensure no tagged memory is left in the unallocated portion of the… (authored by chill).

Changed prior to commit:
  https://reviews.llvm.org/D68469?vs=223821&id=224081#toc

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68469/new/

https://reviews.llvm.org/D68469

Files:
  llvm/lib/Target/AArch64/AArch64StackTagging.cpp
  llvm/test/CodeGen/AArch64/stack-tagging-ex-1.ll
  llvm/test/CodeGen/AArch64/stack-tagging-ex-2.ll
  llvm/test/CodeGen/AArch64/stack-tagging-untag-placement.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68469.224081.patch
Type: text/x-patch
Size: 18165 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Wed Oct 9 09:31:51 2019
From: llvm-commits at lists.llvm.org (Kevin P. Neal via Phabricator via llvm-commits)
Date: Wed, 09 Oct 2019 16:31:51 +0000 (UTC)
Subject: [PATCH] D68713: [FPEnv] Change test to conform to strictfp attribute rules
Message-ID:

kpn created this revision.
kpn added a reviewer: spatel.
Herald added a project: LLVM.
Herald added a subscriber: llvm-commits.

This test does not conform to the new strictfp attribute rules. In particular, the function definition is not marked strictfp despite containing a function call marked strictfp.
Also, if any function call is marked strictfp then all function calls in that function must be marked. Moving the one strictfp call to a new, properly marked function meets all the new rules.

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D68713

Files:
  llvm/test/Bitcode/compatibility.ll

Index: llvm/test/Bitcode/compatibility.ll
===================================================================
--- llvm/test/Bitcode/compatibility.ll
+++ llvm/test/Bitcode/compatibility.ll
@@ -1374,9 +1374,6 @@
   call void @f.nobuiltin() builtin
   ; CHECK: call void @f.nobuiltin() #43
 
-  call void @f.strictfp() strictfp
-  ; CHECK: call void @f.strictfp() #44
-
   call fastcc noalias i32* @f.noalias() noinline
   ; CHECK: call fastcc noalias i32* @f.noalias() #12
   tail call ghccc nonnull i32* @f.nonnull() minsize
@@ -1392,6 +1389,13 @@
   ret void
 }
 
+define void @instructions.strictfp() #44 {
+  call void @f.strictfp() strictfp
+  ; CHECK: call void @f.strictfp() #44
+
+  ret void
+}
+
 define void @instructions.call_notail() {
   notail call void @f1()
   ; CHECK: notail call void @f1()

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68713.224080.patch
Type: text/x-patch
Size: 807 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Wed Oct 9 09:40:56 2019
From: llvm-commits at lists.llvm.org (Kai Nacke via Phabricator via llvm-commits)
Date: Wed, 09 Oct 2019 16:40:56 +0000 (UTC)
Subject: [PATCH] D68146: [FileCheck] Implement --ignore-case option.
In-Reply-To:
References:
Message-ID: <51ca6cdaabb2f1b10c602148bbeacf84@localhost.localdomain>

Kai marked an inline comment as done.
Kai added inline comments.

================
Comment at: llvm/test/FileCheck/check-ignore-case.txt:23
+# CHECK-NEXT: break
\ No newline at end of file
----------------
jhenderson wrote:
> Nit: no new line at EOF.
When I download the raw file and look at it in hex mode, then the last byte is 0x0A. That's a new line at EOF, isn't it?

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68146/new/

https://reviews.llvm.org/D68146

From llvm-commits at lists.llvm.org Wed Oct 9 09:40:58 2019
From: llvm-commits at lists.llvm.org (Roman Lebedev via Phabricator via llvm-commits)
Date: Wed, 09 Oct 2019 16:40:58 +0000 (UTC)
Subject: [PATCH] D68714: [MCA] Show aggregate over Average Wait times for the whole snippet (PR43219)
Message-ID:

lebedev.ri created this revision.
lebedev.ri added reviewers: andreadb, mattd, RKSimon.
lebedev.ri added a project: LLVM.
Herald added subscribers: arphaman, gbedwell.

As discussed in https://bugs.llvm.org/show_bug.cgi?id=43219, I believe it may be somewhat useful to show //some// aggregates over all the sea of statistics provided.

Example:

  Average Wait times (based on the timeline view):
  [0]: Executions
  [1]: Average time spent waiting in a scheduler's queue
  [2]: Average time spent waiting in a scheduler's queue while ready
  [3]: Average time elapsed from WB until retire stage

        [0]    [1]    [2]    [3]
  0.     3     1.0    1.0    4.7       vmulps   %xmm0, %xmm1, %xmm2
  1.     3     2.7    0.0    2.3       vhaddps  %xmm2, %xmm2, %xmm3
  2.     3     6.0    0.0    0.0       vhaddps  %xmm3, %xmm3, %xmm4
         3     3.2    0.3    2.3

I.e. we average the averages.

FIXME: coloring for that row is wrong, and I'm not sure how to fix it.

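The "average the averages" row can be illustrated with a small standalone sketch. This is not the code from the patch: the `WaitTimes` struct and the `averageOfAverages` and `roundToTenth` helpers are hypothetical names made up for this illustration, and the sketch assumes a plain unweighted mean over rows, which is all the example needs since every instruction has the same number of executions.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical record for one row of the "Average Wait times" table.
struct WaitTimes {
  double InQueue;      // column [1]: time waiting in a scheduler's queue
  double ReadyInQueue; // column [2]: time waiting in the queue while ready
  double WBToRetire;   // column [3]: time from WB until the retire stage
};

// Unweighted arithmetic mean of each column over all rows
// (an assumption of this sketch, not necessarily what the patch does).
WaitTimes averageOfAverages(const std::vector<WaitTimes> &Rows) {
  assert(!Rows.empty() && "need at least one row to average");
  WaitTimes Sum{0.0, 0.0, 0.0};
  for (const WaitTimes &R : Rows) {
    Sum.InQueue += R.InQueue;
    Sum.ReadyInQueue += R.ReadyInQueue;
    Sum.WBToRetire += R.WBToRetire;
  }
  const double N = static_cast<double>(Rows.size());
  return {Sum.InQueue / N, Sum.ReadyInQueue / N, Sum.WBToRetire / N};
}

// Round to one decimal place, matching the precision shown in the table.
double roundToTenth(double V) { return std::round(V * 10.0) / 10.0; }
```

Feeding it the three per-instruction rows from the example, {1.0, 1.0, 4.7}, {2.7, 0.0, 2.3} and {6.0, 0.0, 0.0}, gives 3.2, 0.3 and 2.3 after rounding to one decimal, which matches the summary row.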
Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D68714 Files: llvm/docs/CommandGuide/llvm-mca.rst llvm/test/tools/llvm-mca/AArch64/Cortex/direct-branch.s llvm/test/tools/llvm-mca/ARM/memcpy-ldm-stm.s llvm/test/tools/llvm-mca/ARM/vld1-index-update.s llvm/test/tools/llvm-mca/SystemZ/stm-lm.s llvm/test/tools/llvm-mca/X86/Barcelona/clear-super-register-1.s llvm/test/tools/llvm-mca/X86/Barcelona/clear-super-register-2.s llvm/test/tools/llvm-mca/X86/Barcelona/dependency-breaking-cmp.s llvm/test/tools/llvm-mca/X86/Barcelona/dependency-breaking-pcmpeq.s llvm/test/tools/llvm-mca/X86/Barcelona/dependency-breaking-pcmpgt.s llvm/test/tools/llvm-mca/X86/Barcelona/dependency-breaking-sbb-1.s llvm/test/tools/llvm-mca/X86/Barcelona/dependency-breaking-sbb-2.s llvm/test/tools/llvm-mca/X86/Barcelona/int-to-fpu-forwarding-3.s llvm/test/tools/llvm-mca/X86/Barcelona/load-store-throughput.s llvm/test/tools/llvm-mca/X86/Barcelona/load-throughput.s llvm/test/tools/llvm-mca/X86/Barcelona/one-idioms.s llvm/test/tools/llvm-mca/X86/Barcelona/partial-reg-update-2.s llvm/test/tools/llvm-mca/X86/Barcelona/partial-reg-update-3.s llvm/test/tools/llvm-mca/X86/Barcelona/partial-reg-update-4.s llvm/test/tools/llvm-mca/X86/Barcelona/partial-reg-update-5.s llvm/test/tools/llvm-mca/X86/Barcelona/partial-reg-update-6.s llvm/test/tools/llvm-mca/X86/Barcelona/partial-reg-update-7.s llvm/test/tools/llvm-mca/X86/Barcelona/partial-reg-update.s llvm/test/tools/llvm-mca/X86/Barcelona/read-advance-1.s llvm/test/tools/llvm-mca/X86/Barcelona/read-advance-2.s llvm/test/tools/llvm-mca/X86/Barcelona/read-advance-3.s llvm/test/tools/llvm-mca/X86/Barcelona/reg-move-elimination-1.s llvm/test/tools/llvm-mca/X86/Barcelona/reg-move-elimination-2.s llvm/test/tools/llvm-mca/X86/Barcelona/reg-move-elimination-3.s llvm/test/tools/llvm-mca/X86/Barcelona/reg-move-elimination-4.s llvm/test/tools/llvm-mca/X86/Barcelona/reg-move-elimination-5.s llvm/test/tools/llvm-mca/X86/Barcelona/reg-move-elimination-6.s 
llvm/test/tools/llvm-mca/X86/Barcelona/store-throughput.s llvm/test/tools/llvm-mca/X86/Barcelona/zero-idioms.s llvm/test/tools/llvm-mca/X86/BdVer2/add-sequence.s llvm/test/tools/llvm-mca/X86/BdVer2/clear-super-register-1.s llvm/test/tools/llvm-mca/X86/BdVer2/clear-super-register-2.s llvm/test/tools/llvm-mca/X86/BdVer2/clear-super-register-3.s llvm/test/tools/llvm-mca/X86/BdVer2/dependency-breaking-cmp.s llvm/test/tools/llvm-mca/X86/BdVer2/dependency-breaking-pcmpeq.s llvm/test/tools/llvm-mca/X86/BdVer2/dependency-breaking-pcmpgt.s llvm/test/tools/llvm-mca/X86/BdVer2/dependency-breaking-sbb-1.s llvm/test/tools/llvm-mca/X86/BdVer2/dependency-breaking-sbb-2.s llvm/test/tools/llvm-mca/X86/BdVer2/dependent-pmuld-paddd.s llvm/test/tools/llvm-mca/X86/BdVer2/dot-product.s llvm/test/tools/llvm-mca/X86/BdVer2/hadd-read-after-ld-1.s llvm/test/tools/llvm-mca/X86/BdVer2/hadd-read-after-ld-2.s llvm/test/tools/llvm-mca/X86/BdVer2/int-to-fpu-forwarding-3.s llvm/test/tools/llvm-mca/X86/BdVer2/load-store-alias.s llvm/test/tools/llvm-mca/X86/BdVer2/load-store-throughput.s llvm/test/tools/llvm-mca/X86/BdVer2/load-throughput.s llvm/test/tools/llvm-mca/X86/BdVer2/memcpy-like-test.s llvm/test/tools/llvm-mca/X86/BdVer2/one-idioms.s llvm/test/tools/llvm-mca/X86/BdVer2/partial-reg-update-2.s llvm/test/tools/llvm-mca/X86/BdVer2/partial-reg-update-3.s llvm/test/tools/llvm-mca/X86/BdVer2/partial-reg-update-4.s llvm/test/tools/llvm-mca/X86/BdVer2/partial-reg-update-5.s llvm/test/tools/llvm-mca/X86/BdVer2/partial-reg-update-6.s llvm/test/tools/llvm-mca/X86/BdVer2/partial-reg-update.s llvm/test/tools/llvm-mca/X86/BdVer2/pipes-fpu.s llvm/test/tools/llvm-mca/X86/BdVer2/pr37790.s llvm/test/tools/llvm-mca/X86/BdVer2/rank.s llvm/test/tools/llvm-mca/X86/BdVer2/read-advance-1.s llvm/test/tools/llvm-mca/X86/BdVer2/read-advance-2.s llvm/test/tools/llvm-mca/X86/BdVer2/read-advance-3.s llvm/test/tools/llvm-mca/X86/BdVer2/reg-move-elimination-1.s llvm/test/tools/llvm-mca/X86/BdVer2/reg-move-elimination-2.s 
llvm/test/tools/llvm-mca/X86/BdVer2/reg-move-elimination-3.s llvm/test/tools/llvm-mca/X86/BdVer2/reg-move-elimination-4.s llvm/test/tools/llvm-mca/X86/BdVer2/reg-move-elimination-5.s llvm/test/tools/llvm-mca/X86/BdVer2/register-files-1.s llvm/test/tools/llvm-mca/X86/BdVer2/register-files-2.s llvm/test/tools/llvm-mca/X86/BdVer2/register-files-3.s llvm/test/tools/llvm-mca/X86/BdVer2/register-files-4.s llvm/test/tools/llvm-mca/X86/BdVer2/register-files-5.s llvm/test/tools/llvm-mca/X86/BdVer2/store-throughput.s llvm/test/tools/llvm-mca/X86/BdVer2/vbroadcast-operand-latency.s llvm/test/tools/llvm-mca/X86/BdVer2/vec-logic-read-after-ld-1.s llvm/test/tools/llvm-mca/X86/BdVer2/vec-logic-read-after-ld-2.s llvm/test/tools/llvm-mca/X86/BdVer2/xop-super-registers-1.s llvm/test/tools/llvm-mca/X86/BdVer2/xop-super-registers-2.s llvm/test/tools/llvm-mca/X86/BdVer2/zero-idioms-avx-256.s llvm/test/tools/llvm-mca/X86/BdVer2/zero-idioms.s llvm/test/tools/llvm-mca/X86/Broadwell/zero-idioms.s llvm/test/tools/llvm-mca/X86/BtVer2/add-sequence.s llvm/test/tools/llvm-mca/X86/BtVer2/bottleneck-hints-1.s llvm/test/tools/llvm-mca/X86/BtVer2/bottleneck-hints-2.s llvm/test/tools/llvm-mca/X86/BtVer2/bottleneck-hints-3.s llvm/test/tools/llvm-mca/X86/BtVer2/clear-super-register-1.s llvm/test/tools/llvm-mca/X86/BtVer2/clear-super-register-2.s llvm/test/tools/llvm-mca/X86/BtVer2/cmpxchg-read-advance.s llvm/test/tools/llvm-mca/X86/BtVer2/dependency-breaking-cmp.s llvm/test/tools/llvm-mca/X86/BtVer2/dependency-breaking-pcmpeq.s llvm/test/tools/llvm-mca/X86/BtVer2/dependency-breaking-pcmpgt.s llvm/test/tools/llvm-mca/X86/BtVer2/dependency-breaking-sbb-1.s llvm/test/tools/llvm-mca/X86/BtVer2/dependency-breaking-sbb-2.s llvm/test/tools/llvm-mca/X86/BtVer2/dependent-pmuld-paddd.s llvm/test/tools/llvm-mca/X86/BtVer2/dot-product.s llvm/test/tools/llvm-mca/X86/BtVer2/hadd-read-after-ld-1.s llvm/test/tools/llvm-mca/X86/BtVer2/hadd-read-after-ld-2.s llvm/test/tools/llvm-mca/X86/BtVer2/int-to-fpu-forwarding-3.s 
llvm/test/tools/llvm-mca/X86/BtVer2/load-store-alias.s llvm/test/tools/llvm-mca/X86/BtVer2/memcpy-like-test.s llvm/test/tools/llvm-mca/X86/BtVer2/one-idioms.s llvm/test/tools/llvm-mca/X86/BtVer2/partial-reg-update-2.s llvm/test/tools/llvm-mca/X86/BtVer2/partial-reg-update-3.s llvm/test/tools/llvm-mca/X86/BtVer2/partial-reg-update-4.s llvm/test/tools/llvm-mca/X86/BtVer2/partial-reg-update-5.s llvm/test/tools/llvm-mca/X86/BtVer2/partial-reg-update-6.s llvm/test/tools/llvm-mca/X86/BtVer2/partial-reg-update-7.s llvm/test/tools/llvm-mca/X86/BtVer2/partial-reg-update.s llvm/test/tools/llvm-mca/X86/BtVer2/pipes-fpu.s llvm/test/tools/llvm-mca/X86/BtVer2/pr37790.s llvm/test/tools/llvm-mca/X86/BtVer2/rank.s llvm/test/tools/llvm-mca/X86/BtVer2/read-advance-1.s llvm/test/tools/llvm-mca/X86/BtVer2/read-advance-2.s llvm/test/tools/llvm-mca/X86/BtVer2/read-advance-3.s llvm/test/tools/llvm-mca/X86/BtVer2/reg-move-elimination-1.s llvm/test/tools/llvm-mca/X86/BtVer2/reg-move-elimination-2.s llvm/test/tools/llvm-mca/X86/BtVer2/reg-move-elimination-3.s llvm/test/tools/llvm-mca/X86/BtVer2/reg-move-elimination-4.s llvm/test/tools/llvm-mca/X86/BtVer2/reg-move-elimination-5.s llvm/test/tools/llvm-mca/X86/BtVer2/reg-move-elimination-6.s llvm/test/tools/llvm-mca/X86/BtVer2/register-files-1.s llvm/test/tools/llvm-mca/X86/BtVer2/register-files-2.s llvm/test/tools/llvm-mca/X86/BtVer2/register-files-3.s llvm/test/tools/llvm-mca/X86/BtVer2/register-files-4.s llvm/test/tools/llvm-mca/X86/BtVer2/register-files-5.s llvm/test/tools/llvm-mca/X86/BtVer2/vbroadcast-operand-latency.s llvm/test/tools/llvm-mca/X86/BtVer2/vec-logic-read-after-ld-1.s llvm/test/tools/llvm-mca/X86/BtVer2/vec-logic-read-after-ld-2.s llvm/test/tools/llvm-mca/X86/BtVer2/xadd.s llvm/test/tools/llvm-mca/X86/BtVer2/xchg.s llvm/test/tools/llvm-mca/X86/BtVer2/zero-idioms-avx-256.s llvm/test/tools/llvm-mca/X86/BtVer2/zero-idioms.s llvm/test/tools/llvm-mca/X86/Generic/avx512-super-registers-1.s 
llvm/test/tools/llvm-mca/X86/Generic/avx512-super-registers-2.s llvm/test/tools/llvm-mca/X86/Generic/avx512-super-registers-3.s llvm/test/tools/llvm-mca/X86/Generic/xop-super-registers-1.s llvm/test/tools/llvm-mca/X86/Generic/xop-super-registers-2.s llvm/test/tools/llvm-mca/X86/Haswell/cmpxchg16b.s llvm/test/tools/llvm-mca/X86/Haswell/zero-idioms.s llvm/test/tools/llvm-mca/X86/SandyBridge/zero-idioms.s llvm/test/tools/llvm-mca/X86/SkylakeClient/zero-idioms.s llvm/test/tools/llvm-mca/X86/SkylakeServer/zero-idioms.s llvm/test/tools/llvm-mca/X86/Znver1/partial-reg-update-2.s llvm/test/tools/llvm-mca/X86/Znver1/partial-reg-update-3.s llvm/test/tools/llvm-mca/X86/Znver1/partial-reg-update-4.s llvm/test/tools/llvm-mca/X86/Znver1/partial-reg-update-5.s llvm/test/tools/llvm-mca/X86/Znver1/partial-reg-update-6.s llvm/test/tools/llvm-mca/X86/Znver1/partial-reg-update-7.s llvm/test/tools/llvm-mca/X86/Znver1/partial-reg-update.s llvm/test/tools/llvm-mca/X86/bextr-read-after-ld.s llvm/test/tools/llvm-mca/X86/bzhi-read-after-ld.s llvm/test/tools/llvm-mca/X86/fma3-read-after-ld-1.s llvm/test/tools/llvm-mca/X86/fma3-read-after-ld-2.s llvm/test/tools/llvm-mca/X86/option-all-views-1.s llvm/test/tools/llvm-mca/X86/option-all-views-2.s llvm/test/tools/llvm-mca/X86/option-no-stats-1.s llvm/test/tools/llvm-mca/X86/read-after-ld-1.s llvm/test/tools/llvm-mca/X86/read-after-ld-2.s llvm/test/tools/llvm-mca/X86/read-after-ld-3.s llvm/test/tools/llvm-mca/X86/sqrt-rsqrt-rcp-memop.s llvm/test/tools/llvm-mca/X86/variable-blend-read-after-ld-1.s llvm/test/tools/llvm-mca/X86/variable-blend-read-after-ld-2.s llvm/tools/llvm-mca/Views/TimelineView.cpp llvm/tools/llvm-mca/Views/TimelineView.h -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68714.224084.patch Type: text/x-patch Size: 118635 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 09:41:06 2019 From: llvm-commits at lists.llvm.org (Sanjay Patel via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 16:41:06 +0000 (UTC) Subject: [PATCH] D68667: [SLP] respect target register width for GEP vectorization (PR43578) In-Reply-To: References: Message-ID: <7adfef831d3b4fada146677fdea8c485@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rGdf14bd315db9: [SLP] respect target register width for GEP vectorization (PR43578) (authored by spatel). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68667/new/ https://reviews.llvm.org/D68667 Files: llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp llvm/test/Transforms/SLPVectorizer/AArch64/ext-trunc.ll llvm/test/Transforms/SLPVectorizer/AArch64/getelementptr.ll llvm/test/Transforms/SLPVectorizer/X86/load-merge.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D68667.224086.patch Type: text/x-patch Size: 14033 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 09:50:31 2019 From: llvm-commits at lists.llvm.org (Johannes Doerfert via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 16:50:31 +0000 (UTC) Subject: [PATCH] D68706: [InstCombine] don't assume 'inbounds' for bitcast deref or null pointer in non-default address space In-Reply-To: References: Message-ID: <3e66c3f64ba9fda2ab19931fa48b3434@localhost.localdomain> jdoerfert added a comment. In D68706#1701551 , @lebedev.ri wrote: > You want `llvm::NullPointerIsDefined()`, which also checks for `"null-pointer-is-valid"` attribute. `getPointerDereferenceableBytes` should do the above. 
The tests in the file show it works except there is one missing: define float @matching_scalar_smallest_deref_addrspace(<4 x float> addrspace(4)* dereferenceable(1) %p) { ; CHECK-LABEL: @matching_scalar_smallest_deref_addrspace( ; CHECK-NEXT: [[BC:%.*]] = getelementptr inbounds <4 x float>, <4 x float> addrspace(4)* [[P:%.*]], i64 0, i64 0 ; CHECK-NEXT: [[R:%.*]] = load float, float addrspace(4)* [[BC]], align 16 ; CHECK-NEXT: ret float [[R]] ; %bc = bitcast <4 x float> addrspace(4)* %p to float addrspace(4)* %r = load float, float addrspace(4)* %bc, align 16 ret float %r } I think this is fine but I want to hear if people agree. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68706/new/ https://reviews.llvm.org/D68706 From llvm-commits at lists.llvm.org Wed Oct 9 09:59:50 2019 From: llvm-commits at lists.llvm.org (George Rimar via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 16:59:50 +0000 (UTC) Subject: [PATCH] D68645: MinidumpYAML: Add support for the memory info list stream In-Reply-To: References: Message-ID: <643ad0a4083bcb73d5ea7cf7c49d059a@localhost.localdomain> grimar accepted this revision. grimar added a comment. This revision is now accepted and ready to land. LGTM Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68645/new/ https://reviews.llvm.org/D68645 From llvm-commits at lists.llvm.org Wed Oct 9 10:09:25 2019 From: llvm-commits at lists.llvm.org (serge via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 17:09:25 +0000 (UTC) Subject: [PATCH] D68525: [lit] Refactor ProgressDisplay In-Reply-To: References: Message-ID: <39571fdf1f545a64bd5b1718b6242e95@localhost.localdomain> serge-sans-paille accepted this revision. serge-sans-paille added a comment. This revision is now accepted and ready to land. Yeah, I'm all in for that one. 
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68525/new/ https://reviews.llvm.org/D68525 From llvm-commits at lists.llvm.org Wed Oct 9 10:09:27 2019 From: llvm-commits at lists.llvm.org (Sanjay Patel via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 17:09:27 +0000 (UTC) Subject: [PATCH] D68713: [FPEnv] Change test to conform to strictfp attribute rules In-Reply-To: References: Message-ID: <74878295e3f33e560d57553e5c2d014d@localhost.localdomain> spatel accepted this revision. spatel added a comment. This revision is now accepted and ready to land. LGTM - you may want to annotate the title with 'NFC' when committing to indicate this patch doesn't actually change code. As discussed off-list, we have other release-versioned binary test files for compatibility, and I'm not sure yet how that will be handled. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68713/new/ https://reviews.llvm.org/D68713 From llvm-commits at lists.llvm.org Wed Oct 9 10:09:28 2019 From: llvm-commits at lists.llvm.org (Aaron Puchert via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 17:09:28 +0000 (UTC) Subject: [PATCH] D51741: [coro]Pass rvalue reference for named local variable to return_value In-Reply-To: References: Message-ID: <67c697afcfc13737a5d14533cc97a86a@localhost.localdomain> aaronpuchert added a comment. Herald added a project: LLVM. This change breaks the following code that worked before: task f(MoveOnly &value) { co_return value; } The error message is: clang/test/SemaCXX/coroutine-rvo.cpp:60:13: error: call to deleted constructor of 'MoveOnly' co_return value; ^~~~~ clang/test/SemaCXX/coroutine-rvo.cpp:43:3: note: 'MoveOnly' has been explicitly marked deleted here MoveOnly(const MoveOnly&) = delete; ^ Is that maybe intentional, and is the code not intended to compile? 
================ Comment at: cfe/trunk/lib/Sema/SemaCoroutine.cpp:846 + if (E) { + auto NRVOCandidate = this->getCopyElisionCandidate(E->getType(), E, CES_AsIfByStdMove); + if (NRVOCandidate) { ---------------- Why not `CES_Strict` like in `Sema::BuildReturnStmt`? With `CES_Strict` the test still works, and we can also return references. ================ Comment at: cfe/trunk/lib/Sema/SemaCoroutine.cpp:849 + InitializedEntity Entity = + InitializedEntity::InitializeResult(Loc, E->getType(), NRVOCandidate); + ExprResult MoveResult = this->PerformMoveOrCopyInitialization( ---------------- The last parameter has type `bool`, and because we're in `if (NRVOCandidate)`, that will always be true. Wouldn't it be more straightforward to just pass `true` into the function? Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D51741/new/ https://reviews.llvm.org/D51741 From llvm-commits at lists.llvm.org Wed Oct 9 10:24:56 2019 From: llvm-commits at lists.llvm.org (Kevin P. Neal via llvm-commits) Date: Wed, 09 Oct 2019 17:24:56 -0000 Subject: [llvm] r374186 - [FPEnv][NFC] Change test to conform to strictfp attribute rules. Message-ID: <20191009172456.890828867E@lists.llvm.org> Author: kpn Date: Wed Oct 9 10:24:56 2019 New Revision: 374186 URL: http://llvm.org/viewvc/llvm-project?rev=374186&view=rev Log: [FPEnv][NFC] Change test to conform to strictfp attribute rules. In particular, the function definition is not marked strictfp despite containing a function marked strictfp. Also, if any function call is marked strictfp then all function calls in that function must be marked. This change to move the one strictfp call to a new properly marked function meets all the new rules. Tested with a stricter version of D68233. 
Reviewed by: spatel Approved by: spatel Differential Revision: https://reviews.llvm.org/D68713 Modified: llvm/trunk/test/Bitcode/compatibility.ll Modified: llvm/trunk/test/Bitcode/compatibility.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Bitcode/compatibility.ll?rev=374186&r1=374185&r2=374186&view=diff ============================================================================== --- llvm/trunk/test/Bitcode/compatibility.ll (original) +++ llvm/trunk/test/Bitcode/compatibility.ll Wed Oct 9 10:24:56 2019 @@ -1374,9 +1374,6 @@ exit: call void @f.nobuiltin() builtin ; CHECK: call void @f.nobuiltin() #43 - call void @f.strictfp() strictfp - ; CHECK: call void @f.strictfp() #44 - call fastcc noalias i32* @f.noalias() noinline ; CHECK: call fastcc noalias i32* @f.noalias() #12 tail call ghccc nonnull i32* @f.nonnull() minsize @@ -1391,6 +1388,13 @@ define void @instructions.call_musttail( ret void } + +define void @instructions.strictfp() #44 { + call void @f.strictfp() strictfp + ; CHECK: call void @f.strictfp() #44 + + ret void +} define void @instructions.call_notail() { notail call void @f1() From llvm-commits at lists.llvm.org Wed Oct 9 10:27:49 2019 From: llvm-commits at lists.llvm.org (Thomas Lively via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 17:27:49 +0000 (UTC) Subject: [PATCH] D68527: [WebAssembly] v8x16.swizzle and rewrite BUILD_VECTOR lowering In-Reply-To: References: Message-ID: <2bd3e7b10b7167c19c78e9c2afd5992d@localhost.localdomain> tlively added a comment. In D68527#1700939 , @aheejin wrote: > Wouldn't minimizing the number of instruction be the same thing as minimizing the number of bytes, only more inaccurate? It's true that minimizing instructions approximates minimizing bytes, but it also stands on its own as a reasonable metric. In this case minimizing instructions makes more sense than minimizing bytes. 
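To illustrate what counting instructions rather than bytes looks like, here is a minimal standalone sketch of the lane-counting heuristic. It is only a simplified model of the idea in the D68527 diff, not the LLVM code itself. The names `Lane`, `LaneKind`, `InitOp`, `Plan` and `planBuildVector` are hypothetical, and unlike the patch, swizzled lanes are not also counted as splat candidates here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical lane model: each lane is undef, a constant, an arbitrary
// dynamic value, or an element picked out of a (source, indices) swizzle pair.
enum class LaneKind { Undef, Constant, Dynamic, Swizzled };

struct Lane {
  LaneKind Kind;
  int Id; // constant value, dynamic value id, or swizzle pair id
};

enum class InitOp { Swizzle, VecConst, Splat };

struct Plan {
  InitOp Init;                 // instruction that builds the initial vector
  std::size_t NumReplaceLanes; // replace_lane instructions still required
};

using Key = std::pair<LaneKind, int>;
using Entry = std::pair<Key, std::size_t>;
using Counts = std::vector<Entry>;

// Increment the count for K, keeping first-occurrence order.
static void addCount(Counts &C, const Key &K) {
  auto It = std::find_if(C.begin(), C.end(),
                         [&K](const Entry &E) { return E.first == K; });
  if (It == C.end())
    C.emplace_back(K, 1);
  else
    ++It->second;
}

// Return the first entry with the highest count, or a zero count if empty.
static Entry mostCommon(const Counts &C) {
  if (C.empty())
    return Entry(Key(LaneKind::Undef, 0), 0);
  return *std::max_element(
      C.begin(), C.end(),
      [](const Entry &A, const Entry &B) { return A.second < B.second; });
}

Plan planBuildVector(const std::vector<Lane> &Lanes) {
  Counts SplatCounts, SwizzleCounts;
  std::size_t NumConstantLanes = 0;
  for (const Lane &L : Lanes) {
    if (L.Kind == LaneKind::Undef)
      continue;
    if (L.Kind == LaneKind::Swizzled) {
      addCount(SwizzleCounts, Key(L.Kind, L.Id));
      continue;
    }
    if (L.Kind == LaneKind::Constant)
      ++NumConstantLanes;
    addCount(SplatCounts, Key(L.Kind, L.Id));
  }
  assert((!SplatCounts.empty() || !SwizzleCounts.empty()) &&
         "Unexpected all-undef build_vector");

  const Entry Splat = mostCommon(SplatCounts);
  const Entry Swizzle = mostCommon(SwizzleCounts);

  // Prefer swizzles over vector consts over splats, as in the patch.
  InitOp Init = InitOp::Splat;
  if (Swizzle.second >= Splat.second && Swizzle.second >= NumConstantLanes)
    Init = InitOp::Swizzle;
  else if (NumConstantLanes >= Splat.second)
    Init = InitOp::VecConst;

  // Every non-undef lane the initial instruction did not construct costs
  // one replace_lane.
  std::size_t NumReplace = 0;
  for (const Lane &L : Lanes) {
    if (L.Kind == LaneKind::Undef)
      continue;
    bool Constructed = false;
    if (Init == InitOp::Swizzle)
      Constructed = Key(L.Kind, L.Id) == Swizzle.first;
    else if (Init == InitOp::VecConst)
      Constructed = L.Kind == LaneKind::Constant;
    else
      Constructed = Key(L.Kind, L.Id) == Splat.first;
    if (!Constructed)
      ++NumReplace;
  }
  return Plan{Init, NumReplace};
}
```

Under this model, a 16-lane vector with two swizzled lanes, two lanes holding the same dynamic value and two lanes holding the same constant picks the swizzle and needs four replace_lane instructions, which matches the shape of the `mashup_swizzle_i8x16` test in the diff.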
> If swizzles are a lot more complicated than `v128.const` in execution, doesn't that mean swizzles will likely take longer to execute in wasm? Why the opposite? Swizzles lower directly to hardware instructions so they are fast for engines to execute. But doing the same operation without a swizzle instruction would require a long sequence of other wasm instructions and therefore be slow to execute. Because this difference is large for swizzles, it is a good idea to prefer them when possible. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68527/new/ https://reviews.llvm.org/D68527 From llvm-commits at lists.llvm.org Wed Oct 9 10:27:53 2019 From: llvm-commits at lists.llvm.org (Kevin P. Neal via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 17:27:53 +0000 (UTC) Subject: [PATCH] D68713: [FPEnv] Change test to conform to strictfp attribute rules In-Reply-To: References: Message-ID: This revision was automatically updated to reflect the committed changes. Closed by commit rG44e988ab14cb: [FPEnv][NFC] Change test to conform to strictfp attribute rules. (authored by kpn). 
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68713/new/ https://reviews.llvm.org/D68713 Files: llvm/test/Bitcode/compatibility.ll Index: llvm/test/Bitcode/compatibility.ll =================================================================== --- llvm/test/Bitcode/compatibility.ll +++ llvm/test/Bitcode/compatibility.ll @@ -1374,9 +1374,6 @@ call void @f.nobuiltin() builtin ; CHECK: call void @f.nobuiltin() #43 - call void @f.strictfp() strictfp - ; CHECK: call void @f.strictfp() #44 - call fastcc noalias i32* @f.noalias() noinline ; CHECK: call fastcc noalias i32* @f.noalias() #12 tail call ghccc nonnull i32* @f.nonnull() minsize @@ -1392,6 +1389,13 @@ ret void } +define void @instructions.strictfp() #44 { + call void @f.strictfp() strictfp + ; CHECK: call void @f.strictfp() #44 + + ret void +} + define void @instructions.call_notail() { notail call void @f1() ; CHECK: notail call void @f1() -------------- next part -------------- A non-text attachment was scrubbed... Name: D68713.224096.patch Type: text/x-patch Size: 807 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 10:39:20 2019 From: llvm-commits at lists.llvm.org (Thomas Lively via llvm-commits) Date: Wed, 09 Oct 2019 17:39:20 -0000 Subject: [llvm] r374188 - [WebAssembly] v8x16.swizzle and rewrite BUILD_VECTOR lowering Message-ID: <20191009173920.19DA28D8E0@lists.llvm.org> Author: tlively Date: Wed Oct 9 10:39:19 2019 New Revision: 374188 URL: http://llvm.org/viewvc/llvm-project?rev=374188&view=rev Log: [WebAssembly] v8x16.swizzle and rewrite BUILD_VECTOR lowering Summary: Adds the new v8x16.swizzle SIMD instruction as specified at https://github.com/WebAssembly/simd/blob/master/proposals/simd/SIMD.md#swizzling-using-variable-indices. In addition to adding swizzles as a candidate lowering in LowerBUILD_VECTOR, also rewrites and simplifies the lowering to minimize the number of replace_lanes necessary rather than trying to minimize code size. 
This leads to more uses of v128.const instead of splats, which is expected to increase performance. The new code will be easier to tune once V8 implements all the vector construction operations, and it will also be easier to add new candidate instructions in the future if necessary. Reviewers: aheejin, dschuff Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68527 Modified: llvm/trunk/lib/Target/WebAssembly/WebAssemblyISD.def llvm/trunk/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp llvm/trunk/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td llvm/trunk/test/CodeGen/WebAssembly/simd-build-vector.ll llvm/trunk/test/MC/WebAssembly/simd-encodings.s Modified: llvm/trunk/lib/Target/WebAssembly/WebAssemblyISD.def URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/WebAssembly/WebAssemblyISD.def?rev=374188&r1=374187&r2=374188&view=diff ============================================================================== --- llvm/trunk/lib/Target/WebAssembly/WebAssemblyISD.def (original) +++ llvm/trunk/lib/Target/WebAssembly/WebAssemblyISD.def Wed Oct 9 10:39:19 2019 @@ -26,6 +26,7 @@ HANDLE_NODETYPE(WrapperPIC) HANDLE_NODETYPE(BR_IF) HANDLE_NODETYPE(BR_TABLE) HANDLE_NODETYPE(SHUFFLE) +HANDLE_NODETYPE(SWIZZLE) HANDLE_NODETYPE(VEC_SHL) HANDLE_NODETYPE(VEC_SHR_S) HANDLE_NODETYPE(VEC_SHR_U) Modified: llvm/trunk/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp?rev=374188&r1=374187&r2=374188&view=diff ============================================================================== --- llvm/trunk/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp (original) +++ llvm/trunk/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp Wed Oct 9 10:39:19 2019 @@ -1292,68 +1292,116 @@ SDValue WebAssemblyTargetLowering::Lower const EVT VecT = Op.getValueType(); const EVT LaneT = 
Op.getOperand(0).getValueType(); const size_t Lanes = Op.getNumOperands(); + bool CanSwizzle = Subtarget->hasUnimplementedSIMD128() && VecT == MVT::v16i8; + + // BUILD_VECTORs are lowered to the instruction that initializes the highest + // possible number of lanes at once followed by a sequence of replace_lane + // instructions to individually initialize any remaining lanes. + + // TODO: Tune this. For example, lanewise swizzling is very expensive, so + // swizzled lanes should be given greater weight. + + // TODO: Investigate building vectors by shuffling together vectors built by + // separately specialized means. + auto IsConstant = [](const SDValue &V) { return V.getOpcode() == ISD::Constant || V.getOpcode() == ISD::ConstantFP; }; - // Find the most common operand, which is approximately the best to splat - using Entry = std::pair; - SmallVector ValueCounts; - size_t NumConst = 0, NumDynamic = 0; - for (const SDValue &Lane : Op->op_values()) { - if (Lane.isUndef()) { - continue; - } else if (IsConstant(Lane)) { - NumConst++; - } else { - NumDynamic++; - } - auto CountIt = std::find_if(ValueCounts.begin(), ValueCounts.end(), - [&Lane](Entry A) { return A.first == Lane; }); - if (CountIt == ValueCounts.end()) { - ValueCounts.emplace_back(Lane, 1); + // Returns the source vector and index vector pair if they exist. 
Checks for: + // (extract_vector_elt + // $src, + // (sign_extend_inreg (extract_vector_elt $indices, $i)) + // ) + auto GetSwizzleSrcs = [](size_t I, const SDValue &Lane) { + auto Bail = std::make_pair(SDValue(), SDValue()); + if (Lane->getOpcode() != ISD::EXTRACT_VECTOR_ELT) + return Bail; + const SDValue &SwizzleSrc = Lane->getOperand(0); + const SDValue &IndexExt = Lane->getOperand(1); + if (IndexExt->getOpcode() != ISD::SIGN_EXTEND_INREG) + return Bail; + const SDValue &Index = IndexExt->getOperand(0); + if (Index->getOpcode() != ISD::EXTRACT_VECTOR_ELT) + return Bail; + const SDValue &SwizzleIndices = Index->getOperand(0); + if (SwizzleSrc.getValueType() != MVT::v16i8 || + SwizzleIndices.getValueType() != MVT::v16i8 || + Index->getOperand(1)->getOpcode() != ISD::Constant || + Index->getConstantOperandVal(1) != I) + return Bail; + return std::make_pair(SwizzleSrc, SwizzleIndices); + }; + + using ValueEntry = std::pair; + SmallVector SplatValueCounts; + + using SwizzleEntry = std::pair, size_t>; + SmallVector SwizzleCounts; + + auto AddCount = [](auto &Counts, const auto &Val) { + auto CountIt = std::find_if(Counts.begin(), Counts.end(), + [&Val](auto E) { return E.first == Val; }); + if (CountIt == Counts.end()) { + Counts.emplace_back(Val, 1); } else { CountIt->second++; } + }; + + auto GetMostCommon = [](auto &Counts) { + auto CommonIt = + std::max_element(Counts.begin(), Counts.end(), + [](auto A, auto B) { return A.second < B.second; }); + assert(CommonIt != Counts.end() && "Unexpected all-undef build_vector"); + return *CommonIt; + }; + + size_t NumConstantLanes = 0; + + // Count eligible lanes for each type of vector creation op + for (size_t I = 0; I < Lanes; ++I) { + const SDValue &Lane = Op->getOperand(I); + if (Lane.isUndef()) + continue; + + AddCount(SplatValueCounts, Lane); + + if (IsConstant(Lane)) { + NumConstantLanes++; + } else if (CanSwizzle) { + auto SwizzleSrcs = GetSwizzleSrcs(I, Lane); + if (SwizzleSrcs.first) + AddCount(SwizzleCounts, 
SwizzleSrcs); + } } - auto CommonIt = - std::max_element(ValueCounts.begin(), ValueCounts.end(), - [](Entry A, Entry B) { return A.second < B.second; }); - assert(CommonIt != ValueCounts.end() && "Unexpected all-undef build_vector"); - SDValue SplatValue = CommonIt->first; - size_t NumCommon = CommonIt->second; - // If v128.const is available, consider using it instead of a splat + SDValue SplatValue; + size_t NumSplatLanes; + std::tie(SplatValue, NumSplatLanes) = GetMostCommon(SplatValueCounts); + + SDValue SwizzleSrc; + SDValue SwizzleIndices; + size_t NumSwizzleLanes = 0; + if (SwizzleCounts.size()) + std::forward_as_tuple(std::tie(SwizzleSrc, SwizzleIndices), + NumSwizzleLanes) = GetMostCommon(SwizzleCounts); + + // Predicate returning true if the lane is properly initialized by the + // original instruction + std::function IsLaneConstructed; + SDValue Result; if (Subtarget->hasUnimplementedSIMD128()) { - // {i32,i64,f32,f64}.const opcode, and value - const size_t ConstBytes = 1 + std::max(size_t(4), 16 / Lanes); - // SIMD prefix and opcode - const size_t SplatBytes = 2; - const size_t SplatConstBytes = SplatBytes + ConstBytes; - // SIMD prefix, opcode, and lane index - const size_t ReplaceBytes = 3; - const size_t ReplaceConstBytes = ReplaceBytes + ConstBytes; - // SIMD prefix, v128.const opcode, and 128-bit value - const size_t VecConstBytes = 18; - // Initial v128.const and a replace_lane for each non-const operand - const size_t ConstInitBytes = VecConstBytes + NumDynamic * ReplaceBytes; - // Initial splat and all necessary replace_lanes - const size_t SplatInitBytes = - IsConstant(SplatValue) - // Initial constant splat - ? 
(SplatConstBytes + - // Constant replace_lanes - (NumConst - NumCommon) * ReplaceConstBytes + - // Dynamic replace_lanes - (NumDynamic * ReplaceBytes)) - // Initial dynamic splat - : (SplatBytes + - // Constant replace_lanes - (NumConst * ReplaceConstBytes) + - // Dynamic replace_lanes - (NumDynamic - NumCommon) * ReplaceBytes); - if (ConstInitBytes < SplatInitBytes) { - // Create build_vector that will lower to initial v128.const + // Prefer swizzles over vector consts over splats + if (NumSwizzleLanes >= NumSplatLanes && + NumSwizzleLanes >= NumConstantLanes) { + Result = DAG.getNode(WebAssemblyISD::SWIZZLE, DL, VecT, SwizzleSrc, + SwizzleIndices); + auto Swizzled = std::make_pair(SwizzleSrc, SwizzleIndices); + IsLaneConstructed = [&, Swizzled](size_t I, const SDValue &Lane) { + return Swizzled == GetSwizzleSrcs(I, Lane); + }; + } else if (NumConstantLanes >= NumSplatLanes) { SmallVector ConstLanes; for (const SDValue &Lane : Op->op_values()) { if (IsConstant(Lane)) { @@ -1364,35 +1412,35 @@ SDValue WebAssemblyTargetLowering::Lower ConstLanes.push_back(DAG.getConstant(0, DL, LaneT)); } } - SDValue Result = DAG.getBuildVector(VecT, DL, ConstLanes); - // Add replace_lane instructions for non-const lanes - for (size_t I = 0; I < Lanes; ++I) { - const SDValue &Lane = Op->getOperand(I); - if (!Lane.isUndef() && !IsConstant(Lane)) - Result = DAG.getNode(ISD::INSERT_VECTOR_ELT, DL, VecT, Result, Lane, - DAG.getConstant(I, DL, MVT::i32)); - } - return Result; + Result = DAG.getBuildVector(VecT, DL, ConstLanes); + IsLaneConstructed = [&](size_t _, const SDValue &Lane) { + return IsConstant(Lane); + }; } } - // Use a splat for the initial vector - SDValue Result; - // Possibly a load_splat - LoadSDNode *SplattedLoad; - if (Subtarget->hasUnimplementedSIMD128() && - (SplattedLoad = dyn_cast(SplatValue)) && - SplattedLoad->getMemoryVT() == VecT.getVectorElementType()) { - Result = DAG.getNode(WebAssemblyISD::LOAD_SPLAT, DL, VecT, SplatValue); - } else { - Result = 
DAG.getSplatBuildVector(VecT, DL, SplatValue); + if (!Result) { + // Use a splat, but possibly a load_splat + LoadSDNode *SplattedLoad; + if (Subtarget->hasUnimplementedSIMD128() && + (SplattedLoad = dyn_cast(SplatValue)) && + SplattedLoad->getMemoryVT() == VecT.getVectorElementType()) { + Result = DAG.getNode(WebAssemblyISD::LOAD_SPLAT, DL, VecT, SplatValue); + } else { + Result = DAG.getSplatBuildVector(VecT, DL, SplatValue); + } + IsLaneConstructed = [&](size_t _, const SDValue &Lane) { + return Lane == SplatValue; + }; } - // Add replace_lane instructions for other values + + // Add replace_lane instructions for any unhandled values for (size_t I = 0; I < Lanes; ++I) { const SDValue &Lane = Op->getOperand(I); - if (Lane != SplatValue) + if (!Lane.isUndef() && !IsLaneConstructed(I, Lane)) Result = DAG.getNode(ISD::INSERT_VECTOR_ELT, DL, VecT, Result, Lane, DAG.getConstant(I, DL, MVT::i32)); } + return Result; } Modified: llvm/trunk/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td?rev=374188&r1=374187&r2=374188&view=diff ============================================================================== --- llvm/trunk/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td (original) +++ llvm/trunk/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td Wed Oct 9 10:39:19 2019 @@ -275,6 +275,15 @@ def : Pat<(vec_t (wasm_shuffle (vec_t V1 (i32 LaneIdx32:$mE), (i32 LaneIdx32:$mF)))>; } +// Swizzle lanes: v8x16.swizzle +def wasm_swizzle_t : SDTypeProfile<1, 2, []>; +def wasm_swizzle : SDNode<"WebAssemblyISD::SWIZZLE", wasm_swizzle_t>; +defm SWIZZLE : + SIMD_I<(outs V128:$dst), (ins V128:$src, V128:$mask), (outs), (ins), + [(set (v16i8 V128:$dst), + (wasm_swizzle (v16i8 V128:$src), (v16i8 V128:$mask)))], + "v8x16.swizzle\t$dst, $src, $mask", "v8x16.swizzle", 192>; + // Create vector with identical lanes: splat def splat2 : PatFrag<(ops node:$x), (build_vector node:$x, node:$x)>; def splat4 : 
PatFrag<(ops node:$x), (build_vector Modified: llvm/trunk/test/CodeGen/WebAssembly/simd-build-vector.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/WebAssembly/simd-build-vector.ll?rev=374188&r1=374187&r2=374188&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/WebAssembly/simd-build-vector.ll (original) +++ llvm/trunk/test/CodeGen/WebAssembly/simd-build-vector.ll Wed Oct 9 10:39:19 2019 @@ -7,13 +7,12 @@ target datalayout = "e-m:e-p:32:32-i64:64-n32:64-S128" target triple = "wasm32-unknown-unknown" -; CHECK-LABEL: same_const_one_replaced_i8x16: -; CHECK-NEXT: .functype same_const_one_replaced_i8x16 (i32) -> (v128) -; CHECK-NEXT: i32.const $push[[L0:[0-9]+]]=, 42 -; CHECK-NEXT: i16x8.splat $push[[L1:[0-9]+]]=, $pop[[L0]] -; CHECK-NEXT: i16x8.replace_lane $push[[L2:[0-9]+]]=, $pop[[L1]], 5, $0 -; CHECK-NEXT: return $pop[[L2]] -define <8 x i16> @same_const_one_replaced_i8x16(i16 %x) { +; CHECK-LABEL: same_const_one_replaced_i16x8: +; CHECK-NEXT: .functype same_const_one_replaced_i16x8 (i32) -> (v128) +; CHECK-NEXT: v128.const $push[[L0:[0-9]+]]=, 42, 42, 42, 42, 42, 0, 42, 42 +; CHECK-NEXT: i16x8.replace_lane $push[[L1:[0-9]+]]=, $pop[[L0]], 5, $0 +; CHECK-NEXT: return $pop[[L1]] +define <8 x i16> @same_const_one_replaced_i16x8(i16 %x) { %v = insertelement <8 x i16> , i16 %x, @@ -21,12 +20,12 @@ define <8 x i16> @same_const_one_replace ret <8 x i16> %v } -; CHECK-LABEL: different_const_one_replaced_i8x16: -; CHECK-NEXT: .functype different_const_one_replaced_i8x16 (i32) -> (v128) +; CHECK-LABEL: different_const_one_replaced_i16x8: +; CHECK-NEXT: .functype different_const_one_replaced_i16x8 (i32) -> (v128) ; CHECK-NEXT: v128.const $push[[L0:[0-9]+]]=, 1, -2, 3, -4, 5, 0, 7, -8 ; CHECK-NEXT: i16x8.replace_lane $push[[L1:[0-9]+]]=, $pop[[L0]], 5, $0 ; CHECK-NEXT: return $pop[[L1]] -define <8 x i16> @different_const_one_replaced_i8x16(i16 %x) { +define <8 x i16> 
@different_const_one_replaced_i16x8(i16 %x) { %v = insertelement <8 x i16> , i16 %x, @@ -36,10 +35,9 @@ define <8 x i16> @different_const_one_re ; CHECK-LABEL: same_const_one_replaced_f32x4: ; CHECK-NEXT: .functype same_const_one_replaced_f32x4 (f32) -> (v128) -; CHECK-NEXT: f32.const $push[[L0:[0-9]+]]=, 0x1.5p5 -; CHECK-NEXT: f32x4.splat $push[[L1:[0-9]+]]=, $pop[[L0]] -; CHECK-NEXT: f32x4.replace_lane $push[[L2:[0-9]+]]=, $pop[[L1]], 2, $0 -; CHECK-NEXT: return $pop[[L2]] +; CHECK-NEXT: v128.const $push[[L0:[0-9]+]]=, 0x1.5p5, 0x1.5p5, 0x0p0, 0x1.5p5 +; CHECK-NEXT: f32x4.replace_lane $push[[L1:[0-9]+]]=, $pop[[L0]], 2, $0 +; CHECK-NEXT: return $pop[[L1]] define <4 x float> @same_const_one_replaced_f32x4(float %x) { %v = insertelement <4 x float> , @@ -63,11 +61,8 @@ define <4 x float> @different_const_one_ ; CHECK-LABEL: splat_common_const_i32x4: ; CHECK-NEXT: .functype splat_common_const_i32x4 () -> (v128) -; CHECK-NEXT: i32.const $push[[L0:[0-9]+]]=, 3 -; CHECK-NEXT: i32x4.splat $push[[L1:[0-9]+]]=, $pop[[L0]] -; CHECK-NEXT: i32.const $push[[L2:[0-9]+]]=, 1 -; CHECK-NEXT: i32x4.replace_lane $push[[L3:[0-9]+]]=, $pop[[L1]], 3, $pop[[L2]] -; CHECK-NEXT: return $pop[[L3]] +; CHECK-NEXT: v128.const $push[[L0:[0-9]+]]=, 0, 3, 3, 1 +; CHECK-NEXT: return $pop[[L0]] define <4 x i32> @splat_common_const_i32x4() { ret <4 x i32> } @@ -92,11 +87,159 @@ define <8 x i16> @splat_common_arg_i16x8 ret <8 x i16> %v7 } +; CHECK-LABEL: swizzle_one_i8x16: +; CHECK-NEXT: .functype swizzle_one_i8x16 (v128, v128) -> (v128) +; CHECK-NEXT: v8x16.swizzle $push[[L0:[0-9]+]]=, $0, $1 +; CHECK-NEXT: return $pop[[L0]] +define <16 x i8> @swizzle_one_i8x16(<16 x i8> %src, <16 x i8> %mask) { + %m0 = extractelement <16 x i8> %mask, i32 0 + %s0 = extractelement <16 x i8> %src, i8 %m0 + %v0 = insertelement <16 x i8> undef, i8 %s0, i32 0 + ret <16 x i8> %v0 +} + +; CHECK-LABEL: swizzle_all_i8x16: +; CHECK-NEXT: .functype swizzle_all_i8x16 (v128, v128) -> (v128) +; CHECK-NEXT: v8x16.swizzle 
$push[[L0:[0-9]+]]=, $0, $1 +; CHECK-NEXT: return $pop[[L0]] +define <16 x i8> @swizzle_all_i8x16(<16 x i8> %src, <16 x i8> %mask) { + %m0 = extractelement <16 x i8> %mask, i32 0 + %s0 = extractelement <16 x i8> %src, i8 %m0 + %v0 = insertelement <16 x i8> undef, i8 %s0, i32 0 + %m1 = extractelement <16 x i8> %mask, i32 1 + %s1 = extractelement <16 x i8> %src, i8 %m1 + %v1 = insertelement <16 x i8> %v0, i8 %s1, i32 1 + %m2 = extractelement <16 x i8> %mask, i32 2 + %s2 = extractelement <16 x i8> %src, i8 %m2 + %v2 = insertelement <16 x i8> %v1, i8 %s2, i32 2 + %m3 = extractelement <16 x i8> %mask, i32 3 + %s3 = extractelement <16 x i8> %src, i8 %m3 + %v3 = insertelement <16 x i8> %v2, i8 %s3, i32 3 + %m4 = extractelement <16 x i8> %mask, i32 4 + %s4 = extractelement <16 x i8> %src, i8 %m4 + %v4 = insertelement <16 x i8> %v3, i8 %s4, i32 4 + %m5 = extractelement <16 x i8> %mask, i32 5 + %s5 = extractelement <16 x i8> %src, i8 %m5 + %v5 = insertelement <16 x i8> %v4, i8 %s5, i32 5 + %m6 = extractelement <16 x i8> %mask, i32 6 + %s6 = extractelement <16 x i8> %src, i8 %m6 + %v6 = insertelement <16 x i8> %v5, i8 %s6, i32 6 + %m7 = extractelement <16 x i8> %mask, i32 7 + %s7 = extractelement <16 x i8> %src, i8 %m7 + %v7 = insertelement <16 x i8> %v6, i8 %s7, i32 7 + %m8 = extractelement <16 x i8> %mask, i32 8 + %s8 = extractelement <16 x i8> %src, i8 %m8 + %v8 = insertelement <16 x i8> %v7, i8 %s8, i32 8 + %m9 = extractelement <16 x i8> %mask, i32 9 + %s9 = extractelement <16 x i8> %src, i8 %m9 + %v9 = insertelement <16 x i8> %v8, i8 %s9, i32 9 + %m10 = extractelement <16 x i8> %mask, i32 10 + %s10 = extractelement <16 x i8> %src, i8 %m10 + %v10 = insertelement <16 x i8> %v9, i8 %s10, i32 10 + %m11 = extractelement <16 x i8> %mask, i32 11 + %s11 = extractelement <16 x i8> %src, i8 %m11 + %v11 = insertelement <16 x i8> %v10, i8 %s11, i32 11 + %m12 = extractelement <16 x i8> %mask, i32 12 + %s12 = extractelement <16 x i8> %src, i8 %m12 + %v12 = insertelement <16 x i8> 
%v11, i8 %s12, i32 12 + %m13 = extractelement <16 x i8> %mask, i32 13 + %s13 = extractelement <16 x i8> %src, i8 %m13 + %v13 = insertelement <16 x i8> %v12, i8 %s13, i32 13 + %m14 = extractelement <16 x i8> %mask, i32 14 + %s14 = extractelement <16 x i8> %src, i8 %m14 + %v14 = insertelement <16 x i8> %v13, i8 %s14, i32 14 + %m15 = extractelement <16 x i8> %mask, i32 15 + %s15 = extractelement <16 x i8> %src, i8 %m15 + %v15 = insertelement <16 x i8> %v14, i8 %s15, i32 15 + ret <16 x i8> %v15 +} + +; CHECK-LABEL: swizzle_one_i16x8: +; CHECK-NEXT: .functype swizzle_one_i16x8 (v128, v128) -> (v128) +; CHECK-NOT: swizzle +; CHECK: return +define <8 x i16> @swizzle_one_i16x8(<8 x i16> %src, <8 x i16> %mask) { + %m0 = extractelement <8 x i16> %mask, i32 0 + %s0 = extractelement <8 x i16> %src, i16 %m0 + %v0 = insertelement <8 x i16> undef, i16 %s0, i32 0 + ret <8 x i16> %v0 +} + +; CHECK-LABEL: mashup_swizzle_i8x16: +; CHECK-NEXT: .functype mashup_swizzle_i8x16 (v128, v128, i32) -> (v128) +; CHECK-NEXT: v8x16.swizzle $push[[L0:[0-9]+]]=, $0, $1 +; CHECK: i8x16.replace_lane +; CHECK: i8x16.replace_lane +; CHECK: i8x16.replace_lane +; CHECK: i8x16.replace_lane +; CHECK: return +define <16 x i8> @mashup_swizzle_i8x16(<16 x i8> %src, <16 x i8> %mask, i8 %splatted) { + ; swizzle 0 + %m0 = extractelement <16 x i8> %mask, i32 0 + %s0 = extractelement <16 x i8> %src, i8 %m0 + %v0 = insertelement <16 x i8> undef, i8 %s0, i32 0 + ; swizzle 7 + %m1 = extractelement <16 x i8> %mask, i32 7 + %s1 = extractelement <16 x i8> %src, i8 %m1 + %v1 = insertelement <16 x i8> %v0, i8 %s1, i32 7 + ; splat 3 + %v2 = insertelement <16 x i8> %v1, i8 %splatted, i32 3 + ; splat 12 + %v3 = insertelement <16 x i8> %v2, i8 %splatted, i32 12 + ; const 4 + %v4 = insertelement <16 x i8> %v3, i8 42, i32 4 + ; const 14 + %v5 = insertelement <16 x i8> %v4, i8 42, i32 14 + ret <16 x i8> %v5 +} + +; CHECK-LABEL: mashup_const_i8x16: +; CHECK-NEXT: .functype mashup_const_i8x16 (v128, v128, i32) -> (v128) +; 
CHECK: v128.const $push[[L0:[0-9]+]]=, 0, 0, 0, 0, 42, 0, 0, 0, 0, 0, 0, 0, 0, 0, 42, 0 +; CHECK: i8x16.replace_lane +; CHECK: i8x16.replace_lane +; CHECK: i8x16.replace_lane +; CHECK: return +define <16 x i8> @mashup_const_i8x16(<16 x i8> %src, <16 x i8> %mask, i8 %splatted) { + ; swizzle 0 + %m0 = extractelement <16 x i8> %mask, i32 0 + %s0 = extractelement <16 x i8> %src, i8 %m0 + %v0 = insertelement <16 x i8> undef, i8 %s0, i32 0 + ; splat 3 + %v1 = insertelement <16 x i8> %v0, i8 %splatted, i32 3 + ; splat 12 + %v2 = insertelement <16 x i8> %v1, i8 %splatted, i32 12 + ; const 4 + %v3 = insertelement <16 x i8> %v2, i8 42, i32 4 + ; const 14 + %v4 = insertelement <16 x i8> %v3, i8 42, i32 14 + ret <16 x i8> %v4 +} + +; CHECK-LABEL: mashup_splat_i8x16: +; CHECK-NEXT: .functype mashup_splat_i8x16 (v128, v128, i32) -> (v128) +; CHECK: i8x16.splat $push[[L0:[0-9]+]]=, $2 +; CHECK: i8x16.replace_lane +; CHECK: i8x16.replace_lane +; CHECK: return +define <16 x i8> @mashup_splat_i8x16(<16 x i8> %src, <16 x i8> %mask, i8 %splatted) { + ; swizzle 0 + %m0 = extractelement <16 x i8> %mask, i32 0 + %s0 = extractelement <16 x i8> %src, i8 %m0 + %v0 = insertelement <16 x i8> undef, i8 %s0, i32 0 + ; splat 3 + %v1 = insertelement <16 x i8> %v0, i8 %splatted, i32 3 + ; splat 12 + %v2 = insertelement <16 x i8> %v1, i8 %splatted, i32 12 + ; const 4 + %v3 = insertelement <16 x i8> %v2, i8 42, i32 4 + ret <16 x i8> %v3 +} + ; CHECK-LABEL: undef_const_insert_f32x4: ; CHECK-NEXT: .functype undef_const_insert_f32x4 () -> (v128) -; CHECK-NEXT: f32.const $push[[L0:[0-9]+]]=, 0x1.5p5 -; CHECK-NEXT: f32x4.splat $push[[L1:[0-9]+]]=, $pop[[L0]] -; CHECK-NEXT: return $pop[[L1]] +; CHECK-NEXT: v128.const $push[[L0:[0-9]+]]=, 0x0p0, 0x1.5p5, 0x0p0, 0x0p0 +; CHECK-NEXT: return $pop[[L0]] define <4 x float> @undef_const_insert_f32x4() { %v = insertelement <4 x float> undef, float 42., i32 1 ret <4 x float> %v Modified: llvm/trunk/test/MC/WebAssembly/simd-encodings.s URL: 
http://llvm.org/viewvc/llvm-project/llvm/trunk/test/MC/WebAssembly/simd-encodings.s?rev=374188&r1=374187&r2=374188&view=diff ============================================================================== --- llvm/trunk/test/MC/WebAssembly/simd-encodings.s (original) +++ llvm/trunk/test/MC/WebAssembly/simd-encodings.s Wed Oct 9 10:39:19 2019 @@ -463,6 +463,9 @@ main: # CHECK: f64x2.convert_i64x2_u # encoding: [0xfd,0xb2,0x01] f64x2.convert_i64x2_u + # CHECK: v8x16.swizzle # encoding: [0xfd,0xc0,0x01] + v8x16.swizzle + # CHECK: v8x16.load_splat 48 # encoding: [0xfd,0xc2,0x01,0x00,0x30] v8x16.load_splat 48 From llvm-commits at lists.llvm.org Wed Oct 9 10:37:44 2019 From: llvm-commits at lists.llvm.org (Thomas Lively via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 17:37:44 +0000 (UTC) Subject: [PATCH] D68527: [WebAssembly] v8x16.swizzle and rewrite BUILD_VECTOR lowering In-Reply-To: References: Message-ID: <1f30bcb6c67633b137d5f11c376d9a32@localhost.localdomain> tlively marked an inline comment as done. tlively added inline comments. ================ Comment at: llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp:1384 + SDValue SwizzleSrc; + SDValue SwizzleIndices; + size_t NumSwizzleLanes = 0; ---------------- aheejin wrote: > aheejin wrote: > > Nit: Variable names for the same things in `GetSwizzleSrcs` are `SrcVec` and `IndexVec`. Making the variable names same in the two places might make reading easier. > In `GetSwizzleSrcs`, `IndexVec` is still `IndexVec`, while `SrcVec` was changed to `SwizzleSrc. Was that intentional? Not intentional! Thanks. 
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68527/new/ https://reviews.llvm.org/D68527 From llvm-commits at lists.llvm.org Wed Oct 9 10:38:39 2019 From: llvm-commits at lists.llvm.org (Thomas Lively via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 17:38:39 +0000 (UTC) Subject: [PATCH] D68527: [WebAssembly] v8x16.swizzle and rewrite BUILD_VECTOR lowering In-Reply-To: References: Message-ID: <3e0060cbb8c989058e81182bb314640b@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rGd5b7a4e2e8dc: [WebAssembly] v8x16.swizzle and rewrite BUILD_VECTOR lowering (authored by tlively). Changed prior to commit: https://reviews.llvm.org/D68527?vs=223922&id=224098#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68527/new/ https://reviews.llvm.org/D68527 Files: llvm/lib/Target/WebAssembly/WebAssemblyISD.def llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td llvm/test/CodeGen/WebAssembly/simd-build-vector.ll llvm/test/MC/WebAssembly/simd-encodings.s -------------- next part -------------- A non-text attachment was scrubbed... Name: D68527.224098.patch Type: text/x-patch Size: 21486 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 10:45:47 2019 From: llvm-commits at lists.llvm.org (Thomas Lively via llvm-commits) Date: Wed, 09 Oct 2019 17:45:47 -0000 Subject: [llvm] r374189 - [WebAssembly] Add builtin and intrinsic for v8x16.swizzle Message-ID: <20191009174547.500E991011@lists.llvm.org> Author: tlively Date: Wed Oct 9 10:45:47 2019 New Revision: 374189 URL: http://llvm.org/viewvc/llvm-project?rev=374189&view=rev Log: [WebAssembly] Add builtin and intrinsic for v8x16.swizzle Summary: This clang builtin and corresponding LLVM intrinsic are necessary to expose the exact semantics of the underlying WebAssembly instruction to users. 
LLVM produces a poison value if the dynamic swizzle indices are greater than the vector size, but the WebAssembly instruction sets the corresponding output lane to zero. Users who depend on this behavior can safely use this builtin. Depends on D68527. Reviewers: aheejin, dschuff Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D68531 Modified: llvm/trunk/include/llvm/IR/IntrinsicsWebAssembly.td llvm/trunk/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td llvm/trunk/test/CodeGen/WebAssembly/simd-intrinsics.ll Modified: llvm/trunk/include/llvm/IR/IntrinsicsWebAssembly.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/IR/IntrinsicsWebAssembly.td?rev=374189&r1=374188&r2=374189&view=diff ============================================================================== --- llvm/trunk/include/llvm/IR/IntrinsicsWebAssembly.td (original) +++ llvm/trunk/include/llvm/IR/IntrinsicsWebAssembly.td Wed Oct 9 10:45:47 2019 @@ -89,6 +89,10 @@ def int_wasm_atomic_notify: // SIMD intrinsics //===----------------------------------------------------------------------===// +def int_wasm_swizzle : + Intrinsic<[llvm_v16i8_ty], + [llvm_v16i8_ty, llvm_v16i8_ty], + [IntrNoMem, IntrSpeculatable]>; def int_wasm_sub_saturate_signed : Intrinsic<[llvm_anyvector_ty], [LLVMMatchType<0>, LLVMMatchType<0>], Modified: llvm/trunk/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td?rev=374189&r1=374188&r2=374189&view=diff ============================================================================== --- llvm/trunk/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td (original) +++ llvm/trunk/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td Wed Oct 9 10:45:47 2019 @@ -278,12 +278,16 @@ def : Pat<(vec_t (wasm_shuffle (vec_t V1 // Swizzle lanes: v8x16.swizzle def wasm_swizzle_t : SDTypeProfile<1, 2, []>; def 
wasm_swizzle : SDNode<"WebAssemblyISD::SWIZZLE", wasm_swizzle_t>; +let Predicates = [HasUnimplementedSIMD128] in defm SWIZZLE : SIMD_I<(outs V128:$dst), (ins V128:$src, V128:$mask), (outs), (ins), [(set (v16i8 V128:$dst), (wasm_swizzle (v16i8 V128:$src), (v16i8 V128:$mask)))], "v8x16.swizzle\t$dst, $src, $mask", "v8x16.swizzle", 192>; +def : Pat<(int_wasm_swizzle (v16i8 V128:$src), (v16i8 V128:$mask)), + (SWIZZLE V128:$src, V128:$mask)>; + // Create vector with identical lanes: splat def splat2 : PatFrag<(ops node:$x), (build_vector node:$x, node:$x)>; def splat4 : PatFrag<(ops node:$x), (build_vector Modified: llvm/trunk/test/CodeGen/WebAssembly/simd-intrinsics.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/WebAssembly/simd-intrinsics.ll?rev=374189&r1=374188&r2=374189&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/WebAssembly/simd-intrinsics.ll (original) +++ llvm/trunk/test/CodeGen/WebAssembly/simd-intrinsics.ll Wed Oct 9 10:45:47 2019 @@ -11,6 +11,16 @@ target triple = "wasm32-unknown-unknown" ; ============================================================================== ; 16 x i8 ; ============================================================================== +; CHECK-LABEL: swizzle_v16i8: +; SIMD128-NEXT: .functype swizzle_v16i8 (v128, v128) -> (v128){{$}} +; SIMD128-NEXT: v8x16.swizzle $push[[R:[0-9]+]]=, $0, $1{{$}} +; SIMD128-NEXT: return $pop[[R]]{{$}} +declare <16 x i8> @llvm.wasm.swizzle(<16 x i8>, <16 x i8>) +define <16 x i8> @swizzle_v16i8(<16 x i8> %x, <16 x i8> %y) { + %a = call <16 x i8> @llvm.wasm.swizzle(<16 x i8> %x, <16 x i8> %y) + ret <16 x i8> %a +} + ; CHECK-LABEL: add_sat_s_v16i8: ; SIMD128-NEXT: .functype add_sat_s_v16i8 (v128, v128) -> (v128){{$}} ; SIMD128-NEXT: i8x16.add_saturate_s $push[[R:[0-9]+]]=, $0, $1{{$}} From llvm-commits at lists.llvm.org Wed Oct 9 10:46:50 2019 From: llvm-commits at lists.llvm.org (Alexander via Phabricator via 
llvm-commits)
Date: Wed, 09 Oct 2019 17:46:50 +0000 (UTC)
Subject: [PATCH] D68635: [AMDGPU] Come back patch for the 'Assign register class for cross block values according to the divergence.'
In-Reply-To:
References:
Message-ID:

alex-t updated this revision to Diff 224099.
alex-t added a comment.

Changed according to the reviewers' requests.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68635/new/

https://reviews.llvm.org/D68635

Files:
  llvm/lib/Target/AMDGPU/SIFixSGPRCopies.cpp
  llvm/lib/Target/AMDGPU/SIISelLowering.cpp
  llvm/lib/Target/AMDGPU/SIISelLowering.h
  llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
  llvm/test/CodeGen/AMDGPU/atomicrmw-nand.ll
  llvm/test/CodeGen/AMDGPU/branch-relaxation.ll
  llvm/test/CodeGen/AMDGPU/branch-uniformity.ll
  llvm/test/CodeGen/AMDGPU/commute-shifts.ll
  llvm/test/CodeGen/AMDGPU/control-flow-fastregalloc.ll
  llvm/test/CodeGen/AMDGPU/copy-illegal-type.ll
  llvm/test/CodeGen/AMDGPU/cse-phi-incoming-val.ll
  llvm/test/CodeGen/AMDGPU/divergent-branch-uniform-condition.ll
  llvm/test/CodeGen/AMDGPU/extract_subvector_vec4_vec3.ll
  llvm/test/CodeGen/AMDGPU/fabs.ll
  llvm/test/CodeGen/AMDGPU/fdiv32-to-rcp-folding.ll
  llvm/test/CodeGen/AMDGPU/fmin_legacy.ll
  llvm/test/CodeGen/AMDGPU/fmul-2-combine-multi-use.ll
  llvm/test/CodeGen/AMDGPU/fneg-fabs.ll
  llvm/test/CodeGen/AMDGPU/fneg.ll
  llvm/test/CodeGen/AMDGPU/fsub.ll
  llvm/test/CodeGen/AMDGPU/i1-copy-from-loop.ll
  llvm/test/CodeGen/AMDGPU/i1-copy-phi-uniform-branch.ll
  llvm/test/CodeGen/AMDGPU/implicit-def-muse.ll
  llvm/test/CodeGen/AMDGPU/insert_vector_elt.ll
  llvm/test/CodeGen/AMDGPU/llvm.amdgcn.div.scale.ll
  llvm/test/CodeGen/AMDGPU/llvm.amdgcn.fmed3.ll
  llvm/test/CodeGen/AMDGPU/llvm.amdgcn.mov.dpp.ll
  llvm/test/CodeGen/AMDGPU/llvm.amdgcn.mqsad.pk.u16.u8.ll
  llvm/test/CodeGen/AMDGPU/llvm.amdgcn.qsad.pk.u16.u8.ll
  llvm/test/CodeGen/AMDGPU/loop_break.ll
  llvm/test/CodeGen/AMDGPU/madak.ll
  llvm/test/CodeGen/AMDGPU/multilevel-break.ll
  llvm/test/CodeGen/AMDGPU/select-opt.ll
  llvm/test/CodeGen/AMDGPU/sgpr-control-flow.ll
llvm/test/CodeGen/AMDGPU/sgpr-copy.ll llvm/test/CodeGen/AMDGPU/si-annotate-cf.ll llvm/test/CodeGen/AMDGPU/si-fix-sgpr-copies.mir llvm/test/CodeGen/AMDGPU/smrd.ll llvm/test/CodeGen/AMDGPU/subreg-coalescer-undef-use.ll llvm/test/CodeGen/AMDGPU/uniform-loop-inside-nonuniform.ll llvm/test/CodeGen/AMDGPU/use-sgpr-multiple-times.ll llvm/test/CodeGen/AMDGPU/valu-i1.ll llvm/test/CodeGen/AMDGPU/vgpr-spill-emergency-stack-slot-compute.ll llvm/test/CodeGen/AMDGPU/wave32.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D68635.224099.patch Type: text/x-patch Size: 87289 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 10:47:25 2019 From: llvm-commits at lists.llvm.org (Thomas Lively via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 17:47:25 +0000 (UTC) Subject: [PATCH] D68531: [WebAssembly] Add builtin and intrinsic for v8x16.swizzle In-Reply-To: References: Message-ID: This revision was automatically updated to reflect the committed changes. Closed by commit rG3419e90dc1a2: [WebAssembly] Add builtin and intrinsic for v8x16.swizzle (authored by tlively). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68531/new/ https://reviews.llvm.org/D68531 Files: clang/include/clang/Basic/BuiltinsWebAssembly.def clang/lib/CodeGen/CGBuiltin.cpp clang/test/CodeGen/builtins-wasm.c llvm/include/llvm/IR/IntrinsicsWebAssembly.td llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td llvm/test/CodeGen/WebAssembly/simd-intrinsics.ll -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68531.224100.patch Type: text/x-patch Size: 4904 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 10:52:26 2019 From: llvm-commits at lists.llvm.org (Sanjay Patel via llvm-commits) Date: Wed, 09 Oct 2019 17:52:26 -0000 Subject: [llvm] r374190 - [InstCombine] add another test for gep inbounds; NFC Message-ID: <20191009175226.AFB9B9082A@lists.llvm.org> Author: spatel Date: Wed Oct 9 10:52:26 2019 New Revision: 374190 URL: http://llvm.org/viewvc/llvm-project?rev=374190&view=rev Log: [InstCombine] add another test for gep inbounds; NFC Modified: llvm/trunk/test/Transforms/InstCombine/load-bitcast-vec.ll Modified: llvm/trunk/test/Transforms/InstCombine/load-bitcast-vec.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/InstCombine/load-bitcast-vec.ll?rev=374190&r1=374189&r2=374190&view=diff ============================================================================== --- llvm/trunk/test/Transforms/InstCombine/load-bitcast-vec.ll (original) +++ llvm/trunk/test/Transforms/InstCombine/load-bitcast-vec.ll Wed Oct 9 10:52:26 2019 @@ -89,6 +89,17 @@ define float @matching_scalar_smallest_d ret float %r } +define float @matching_scalar_smallest_deref_addrspace(<4 x float> addrspace(4)* dereferenceable(1) %p) { +; CHECK-LABEL: @matching_scalar_smallest_deref_addrspace( +; CHECK-NEXT: [[BC:%.*]] = getelementptr inbounds <4 x float>, <4 x float> addrspace(4)* [[P:%.*]], i64 0, i64 0 +; CHECK-NEXT: [[R:%.*]] = load float, float addrspace(4)* [[BC]], align 16 +; CHECK-NEXT: ret float [[R]] +; + %bc = bitcast <4 x float> addrspace(4)* %p to float addrspace(4)* + %r = load float, float addrspace(4)* %bc, align 16 + ret float %r +} + ; TODO: Is a null pointer inbounds in any address space? 
define float @matching_scalar_smallest_deref_or_null_addrspace(<4 x float> addrspace(4)* dereferenceable_or_null(1) %p) {

From llvm-commits at lists.llvm.org Wed Oct 9 10:56:59 2019
From: llvm-commits at lists.llvm.org (Andrea Di Biagio via Phabricator via llvm-commits)
Date: Wed, 09 Oct 2019 17:56:59 +0000 (UTC)
Subject: [PATCH] D68714: [MCA] Show aggregate over Average Wait times for the whole snippet (PR43219)
In-Reply-To:
References:
Message-ID: <3ed10fed3efd616a8bb7f577d7cb06cb@localhost.localdomain>

andreadb added a comment.

Thanks Roman. It is a shame that so many tests are affected by this change. But that is obviously not your fault. See my comments below.

================
Comment at: llvm/tools/llvm-mca/Views/TimelineView.cpp:132-140
 void TimelineView::printWaitTimeEntry(formatted_raw_ostream &OS,
                                       const WaitTimeEntry &Entry,
                                       unsigned SourceIndex,
-                                      unsigned Executions) const {
-  OS << SourceIndex << '.';
+                                      unsigned CumulativeExecutions,
+                                      unsigned Executions,
+                                      bool ShouldNumber) const {
+  if (ShouldNumber)
----------------
You can still use the old signature for this method (see the explanation below):

We know that we are printing the special entry if `SourceIndex == Source.size()`. You can use that knowledge in two places:

1) You can automatically infer the flag `ShouldNumber`.

```
bool ShouldNumber = SourceIndex != Source.size();
if (ShouldNumber)
  OS << SourceIndex << '.';
```

2) Before printing the average times, you can check whether the numbers are for the special entry and, if so, replace the value of `Executions` with `Timeline.size() / Source.size()`. You can do it where you currently added the FIXME comment.

```
if (!ShouldNumber) {
  // override Executions for the purpose of changing colors.
  Executions = Timeline.size() / Source.size();
}
```

Basically you don't need `CumulativeExecutions` as it can be inferred from the context. That should also fix the issue with the coloring of the output.

================
Comment at: llvm/tools/llvm-mca/Views/TimelineView.cpp:204-215
+
+  WaitTimeEntry TotalWaitTime = std::accumulate(
+      WaitTime.begin(), WaitTime.end(), WaitTimeEntry{0, 0, 0},
+      [](const WaitTimeEntry &A, const WaitTimeEntry &B) {
+        return WaitTimeEntry{
+            A.CyclesSpentInSchedulerQueue + B.CyclesSpentInSchedulerQueue,
+            A.CyclesSpentInSQWhileReady + B.CyclesSpentInSQWhileReady,
----------------
We should not print the special entry if Source.size() == 1. If the input assembly only contains a single instruction, then we know that the entry is redundant. It should also (hopefully) simplify the diff a bit.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68714/new/

https://reviews.llvm.org/D68714

From llvm-commits at lists.llvm.org Wed Oct 9 10:56:59 2019
From: llvm-commits at lists.llvm.org (Paul Robinson via Phabricator via llvm-commits)
Date: Wed, 09 Oct 2019 17:56:59 +0000 (UTC)
Subject: [PATCH] D68620: DebugInfo: Use base address selection entries for debug_loc
In-Reply-To:
References:
Message-ID: <2b2ebaeeefd999f257f376fbf64ab676@localhost.localdomain>

probinson added a comment.

> I'll ask re Sony debugger. I have no direct visibility to that code.

My debugger guys say they have code to handle it and some hand-coded tests, so they are cautiously optimistic that nothing bad will happen.

================
Comment at: lib/CodeGen/AsmPrinter/DwarfDebug.cpp:2328
+        BaseIsSet = true;
+        if (UseDwarf5) {
+          Asm->OutStreamer->AddComment(StringifyEnum(BaseAddressx));
----------------
Would it be more readable this way?

```
if (!UseDwarf5) {
  Base = NewBase;
  BaseIsSet = true;
  Asm->OutStreamer->EmitIntValue(-1, Size);
  // etc
} else if (NewBase != Begin || P.second.size() > 1) {
  Base = NewBase;
  BaseIsSet = true;
  Asm->OutStreamer->AddComment(StringifyEnum(BaseAddressx));
  // etc
}
```

As there are only 2 lines in common. (My eye caught `if (!UseDwarf5)` and two lines later `if (UseDwarf5)` and did a double-take.)

Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68620/new/ https://reviews.llvm.org/D68620 From llvm-commits at lists.llvm.org Wed Oct 9 10:57:00 2019 From: llvm-commits at lists.llvm.org (Wei Mi via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 17:57:00 +0000 (UTC) Subject: [PATCH] D68601: [SampleFDO] Add indexing for function profiles so they can be loaded on demand in ExtBinary format In-Reply-To: References: Message-ID: <04ff99deed4b8293b57e28c4ca4e43ae@localhost.localdomain> wmi added a comment. In D68601#1700595 , @wenlei wrote: > LGTM. Thanks! > > > Symbol list can be provided to llvm-profdata in a plain text file. > > I thought it's more convenient to have PSL auto-populated by the tool that generates AutoFDO profile, or is there any reason for not using auto-generated PSL, and instead providing a plain text file as side input? Putting the list in a plain text file gives more flexibility so user can order the list and strip some of them. You are right it is convenient for create_llvm_prof to get the list from binary directly. I just wanted to point out there is an existing alternative support before we port the support of create_llvm_prof to github. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68601/new/ https://reviews.llvm.org/D68601 From llvm-commits at lists.llvm.org Wed Oct 9 10:57:01 2019 From: llvm-commits at lists.llvm.org (Wei Mi via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 17:57:01 +0000 (UTC) Subject: [PATCH] D68601: [SampleFDO] Add indexing for function profiles so they can be loaded on demand in ExtBinary format In-Reply-To: References: Message-ID: <97b5aa5d2294e50f2c0c606aa8da6e26@localhost.localdomain> wmi marked 4 inline comments as done. wmi added inline comments. ================ Comment at: llvm/include/llvm/ProfileData/SampleProfReader.h:554 + /// Collect functions to be used when compiling Module \p M. 
+  void collectFuncsToUse(const Module &M) override;
 };
----------------
davidxl wrote:
> Nit: collectFuncsFrom(const Module &M)
Fixed

================
Comment at: llvm/lib/ProfileData/SampleProfReader.cpp:507
+  for (auto &F : M) {
+    StringRef CanonName = FunctionSamples::getCanonicalFnName(F);
+    FuncsToUse.insert(CanonName);
----------------
davidxl wrote:
> Skip declarations?
Fixed

================
Comment at: llvm/lib/ProfileData/SampleProfReader.cpp:533
+std::error_code SampleProfileReaderExtBinary::readFuncProfiles(uint64_t Size) {
+  const uint8_t *Start = Data;
+  if (UseAllFuncs) {
----------------
davidxl wrote:
> End = Data + Size
It is set in the parent function: readOneSection.

================
Comment at: llvm/lib/ProfileData/SampleProfReader.cpp:536
+    while (Data < Start + Size) {
+      if (std::error_code EC = readFuncProfile())
+        return EC;
----------------
davidxl wrote:
> It is more readable if readFuncProfile is taking the pointer to the data address:
> readFuncProfile(&Data);
Fixed.

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68601/new/

https://reviews.llvm.org/D68601

From llvm-commits at lists.llvm.org Wed Oct 9 10:57:05 2019
From: llvm-commits at lists.llvm.org (Dave Green via Phabricator via llvm-commits)
Date: Wed, 09 Oct 2019 17:57:05 +0000 (UTC)
Subject: [PATCH] D68717: [Codegen] More add_sat and sub_sat promotion
Message-ID:

dmgreen created this revision.
dmgreen added reviewers: nikic, craig.topper, RKSimon, leonardchan.
Herald added a subscriber: hiraditya.
Herald added a project: LLVM.

As a continuation of D68643, the default promotion for saturation arithmetic can be further refined when MIN/MAX are known to be legal. All the test changes here are in uncommon types like an i4.

This uses isOperationLegal, as opposed to legal or custom, because the X86 MINs/MAXs are custom lowered.

Some of the MVE tests look larger because they are materialising constants (I imagine they might be pulled out of a loop).

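To make the promotion described above concrete, here is a minimal sketch (not the code from the patch) of how a signed saturating add on a narrow type such as i4 can be emulated in a wider type with a min/max clamp. The helper name `saddSatI4` and the use of `int8_t` as the promoted type are assumptions made purely for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper: emulate a signed 4-bit saturating add in 8-bit
// arithmetic. Both inputs are assumed to be sign-extended i4 values,
// i.e. already in the range [-8, 7].
int8_t saddSatI4(int8_t A, int8_t B) {
  // The plain sum lies in [-16, 14], so it cannot overflow the wider type.
  int8_t Sum = static_cast<int8_t>(A + B);
  // Clamp back into the i4 range with a signed min followed by a signed max.
  return std::max<int8_t>(std::min<int8_t>(Sum, 7), -8);
}
```

The clamp bounds 7 and -8 correspond to the constants 7 and 248 (that is, -8 read as an unsigned byte) that show up in the SSE2 output quoted in the follow-up comment on this review.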
https://reviews.llvm.org/D68717

Files:
  llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp
  llvm/test/CodeGen/Thumb2/mve-saturating-arith.ll
  llvm/test/CodeGen/X86/sadd_sat_vec.ll
  llvm/test/CodeGen/X86/ssub_sat_vec.ll
  llvm/test/CodeGen/X86/uadd_sat_vec.ll
  llvm/test/CodeGen/X86/usub_sat_vec.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68717.224093.patch
Type: text/x-patch
Size: 24607 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Wed Oct 9 11:08:06 2019
From: llvm-commits at lists.llvm.org (Dave Green via Phabricator via llvm-commits)
Date: Wed, 09 Oct 2019 18:08:06 +0000 (UTC)
Subject: [PATCH] D68717: [Codegen] More add_sat and sub_sat promotion
In-Reply-To:
References:
Message-ID:

dmgreen marked an inline comment as done.
dmgreen added inline comments.

================
Comment at: llvm/test/CodeGen/X86/sadd_sat_vec.ll:496
+; SSE2-NEXT: psubb %xmm1, %xmm0
+; SSE2-NEXT: retq
+;
----------------
If we always went through the min/max pair, this would look like:

```
+; SSE2-NEXT: paddb %xmm1, %xmm0
+; SSE2-NEXT: movdqa {{.*#+}} xmm1 = [7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7]
+; SSE2-NEXT: movdqa %xmm1, %xmm2
+; SSE2-NEXT: pcmpgtb %xmm0, %xmm2
+; SSE2-NEXT: pand %xmm2, %xmm0
+; SSE2-NEXT: pandn %xmm1, %xmm2
+; SSE2-NEXT: por %xmm0, %xmm2
+; SSE2-NEXT: movdqa {{.*#+}} xmm1 = [248,248,248,248,248,248,248,248,248,248,248,248,248,
+; SSE2-NEXT: movdqa %xmm2, %xmm0
+; SSE2-NEXT: pcmpgtb %xmm1, %xmm0
+; SSE2-NEXT: pand %xmm0, %xmm2
+; SSE2-NEXT: pandn %xmm1, %xmm0
+; SSE2-NEXT: por %xmm2, %xmm0
+; SSE2-NEXT: retq
```

Which is a little larger.

CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68717/new/ https://reviews.llvm.org/D68717 From llvm-commits at lists.llvm.org Wed Oct 9 11:08:06 2019 From: llvm-commits at lists.llvm.org (Hideki Saito via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 18:08:06 +0000 (UTC) Subject: [PATCH] D68651: [InstCombine] Signed saturation patterns In-Reply-To: References: Message-ID: <6d7674f14e0010e922ce6df240a65435@localhost.localdomain> hsaito added a comment. > I haven't looked at the patch in detail, but as author of at least part of the prior art cited here, I agree with the direction*. I also participated in some of the vector idioms discussions from a few years ago. There's overlap with the vector idiom problems, but as noted, these are generic (scalar too) math ops, so it's not exactly the same. We invested significantly in IR analysis and codegen for the math intrinsics, so that may have changed the thinking. I don't remember the sequence of events or if there was a dedicated llvm-dev thread for this, but the general idea is that if we have a generic intrinsic for the math and can easily invert the transform in the backend for targets/types that are not supported, try to canonicalize to the intrinsic. Don't get me wrong. I'm not against the direction. I just want to make sure we have a general agreement with the rest of the community on where and how to draw the line ---- and get it documented (and the document to be kept up-to-date). Then, we can go back to the idiom list we created and determine which ones should be canonicalized, and which ones should go to better pattern matchers --- reflect that into the document. That'll make it easier for other interested people to actually work on those, w/o having to make a point on individual cases. That's the main purpose of raising this question. 
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68651/new/ https://reviews.llvm.org/D68651 From llvm-commits at lists.llvm.org Wed Oct 9 11:08:07 2019 From: llvm-commits at lists.llvm.org (Wei Mi via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 18:08:07 +0000 (UTC) Subject: [PATCH] D68601: [SampleFDO] Add indexing for function profiles so they can be loaded on demand in ExtBinary format In-Reply-To: References: Message-ID: <7f2da7dce2d40900ea3b1eeee8be3c05@localhost.localdomain> wmi updated this revision to Diff 224104. wmi added a comment. Address David's comment. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68601/new/ https://reviews.llvm.org/D68601 Files: llvm/include/llvm/ProfileData/SampleProf.h llvm/include/llvm/ProfileData/SampleProfReader.h llvm/include/llvm/ProfileData/SampleProfWriter.h llvm/lib/ProfileData/SampleProfReader.cpp llvm/lib/ProfileData/SampleProfWriter.cpp llvm/lib/Transforms/IPO/SampleProfile.cpp llvm/test/Transforms/SampleProfile/Inputs/inline.extbinary.afdo llvm/test/Transforms/SampleProfile/Inputs/profsampleacc.extbinary.afdo llvm/unittests/ProfileData/SampleProfTest.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D68601.224104.patch Type: text/x-patch Size: 19652 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 11:08:07 2019 From: llvm-commits at lists.llvm.org (Gil Rapaport via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 18:08:07 +0000 (UTC) Subject: [PATCH] D68577: [LV] Apply sink-after & interleave-groups as VPlan transformations (NFC) In-Reply-To: References: Message-ID: gilr added a comment. In D68577#1701381 , @rengolin wrote: > So, IIUC, this is changing tryCreateRecipe to move the interleave recipe creation to the caller, buildVPlanWithVPRecipes. The dependencies with the sink values is recorded initially, then the plans are created, then the sinks are applied and, if any, the interleave groups. Correct. 
Motivation is to express these dependencies as VPlan transformation phase ordering. Ayal discussed this in more detail in his 2017 VPlan talk.

> ... but this doesn't look like an NFC change. Not that this is a bad thing, but I can't quite reach the conclusion that all the loops that would have been interleaved will continue to do so, because the order of the plans may change (for better or worse) the conditions in which the plan starts with.
> Regardless, I think this is a positive change and goes in the direction we want the VPlan infrastructure to be. It also looks semantically equivalent (with the caveat above), so the change looks good to me.
> It would be good to wait for further reviews on the next few days, just in case I missed something.

Excellent. The intention is indeed to only change the way Planner executes these already-taken decisions (SA by Legal, IG by CostModel). Thanks!

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68577/new/

https://reviews.llvm.org/D68577

From llvm-commits at lists.llvm.org Wed Oct 9 11:08:07 2019
From: llvm-commits at lists.llvm.org (David Li via Phabricator via llvm-commits)
Date: Wed, 09 Oct 2019 18:08:07 +0000 (UTC)
Subject: [PATCH] D68601: [SampleFDO] Add indexing for function profiles so they can be loaded on demand in ExtBinary format
In-Reply-To:
References:
Message-ID:

davidxl added inline comments.

================
Comment at: llvm/lib/ProfileData/SampleProfReader.cpp:533
+std::error_code SampleProfileReaderExtBinary::readFuncProfiles(uint64_t Size) {
+  const uint8_t *Start = Data;
+  if (UseAllFuncs) {
----------------
wmi wrote:
> davidxl wrote:
> > End = Data + Size
> It is set in the parent function: readOneSection.
What I meant is to define a local variable 'End' and use it instead of Start. It makes the code slightly more readable.

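The suggestion above can be sketched roughly as follows. This is a toy stand-in, not the real `SampleProfileReaderExtBinary`: the `ToyReader` type, its one-byte records, and the `readOne`/`readAll` names are all hypothetical and exist only to show a loop bounded by a local `End` pointer rather than by `Start + Size`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a profile reader that walks a byte buffer.
struct ToyReader {
  const uint8_t *Data; // Cursor into the buffer, advanced as records are read.

  // Pretend every record is exactly one byte long.
  void readOne() { ++Data; }

  // Read every record in the next Size bytes and return how many were read.
  std::size_t readAll(std::size_t Size) {
    // Compute the end of the section once, up front, instead of comparing
    // the cursor against Start + Size on each iteration.
    const uint8_t *End = Data + Size;
    std::size_t Count = 0;
    while (Data < End) {
      readOne();
      ++Count;
    }
    return Count;
  }
};
```

Naming the bound `End` keeps the loop condition free of arithmetic, which is the readability gain the comment is asking for.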
Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68601/new/ https://reviews.llvm.org/D68601 From llvm-commits at lists.llvm.org Wed Oct 9 11:08:07 2019 From: llvm-commits at lists.llvm.org (=?utf-8?q?Milo=C5=A1_Stojanovi=C4=87_via_Phabricator?= via llvm-commits) Date: Wed, 09 Oct 2019 18:08:07 +0000 (UTC) Subject: [PATCH] D68649: [Mips][llvm-exegesis] Add a Mips target In-Reply-To: References: Message-ID: <10b6c70269da5550d8e68ed31c50bd92@localhost.localdomain> mstojanovic updated this revision to Diff 224106. mstojanovic added a comment. Removed obsolete `llvm::` and includes, ran clang-format. Added direct testing of the instruction. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68649/new/ https://reviews.llvm.org/D68649 Files: lib/Target/Mips/CMakeLists.txt lib/Target/Mips/Mips.td lib/Target/Mips/MipsPfmCounters.td tools/llvm-exegesis/lib/Assembler.cpp tools/llvm-exegesis/lib/CMakeLists.txt tools/llvm-exegesis/lib/Mips/CMakeLists.txt tools/llvm-exegesis/lib/Mips/LLVMBuild.txt tools/llvm-exegesis/lib/Mips/Target.cpp unittests/tools/llvm-exegesis/CMakeLists.txt unittests/tools/llvm-exegesis/Mips/CMakeLists.txt unittests/tools/llvm-exegesis/Mips/TargetTest.cpp -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68649.224106.patch Type: text/x-patch Size: 11167 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 11:14:33 2019 From: llvm-commits at lists.llvm.org (Vitaly Buka via llvm-commits) Date: Wed, 9 Oct 2019 11:14:33 -0700 Subject: [llvm] r374122 - DebugInfo: Move LLE enum handling to .def to match RLE handling In-Reply-To: <20191008214847.5B73D89069@lists.llvm.org> References: <20191008214847.5B73D89069@lists.llvm.org> Message-ID: UBSAN error after the patch /b/sanitizer-x86_64-linux-bootstrap-ubsan/build/llvm-project/llvm/lib/DebugInfo/DWARF/DWARFDebugLoc.cpp:146:15: runtime error: load of value 71, which is not a valid value for type 'llvm::dwarf::LoclistEntries' http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-bootstrap-ubsan/builds/15298/steps/check-llvm%20ubsan/logs/stdio On Tue, Oct 8, 2019 at 2:46 PM David Blaikie via llvm-commits < llvm-commits at lists.llvm.org> wrote: > Author: dblaikie > Date: Tue Oct 8 14:48:46 2019 > New Revision: 374122 > > URL: http://llvm.org/viewvc/llvm-project?rev=374122&view=rev > Log: > DebugInfo: Move LLE enum handling to .def to match RLE handling > > Modified: > llvm/trunk/include/llvm/BinaryFormat/Dwarf.def > llvm/trunk/include/llvm/BinaryFormat/Dwarf.h > llvm/trunk/lib/BinaryFormat/Dwarf.cpp > llvm/trunk/lib/DebugInfo/DWARF/DWARFDebugLoc.cpp > > Modified: llvm/trunk/include/llvm/BinaryFormat/Dwarf.def > URL: > http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/BinaryFormat/Dwarf.def?rev=374122&r1=374121&r2=374122&view=diff > > ============================================================================== > --- llvm/trunk/include/llvm/BinaryFormat/Dwarf.def (original) > +++ llvm/trunk/include/llvm/BinaryFormat/Dwarf.def Tue Oct 8 14:48:46 2019 > @@ -17,7 +17,7 @@ > defined HANDLE_DW_VIRTUALITY || defined HANDLE_DW_DEFAULTED || > \ > defined HANDLE_DW_CC || defined HANDLE_DW_LNS || defined > HANDLE_DW_LNE || \ > defined HANDLE_DW_LNCT || defined HANDLE_DW_MACRO || > \ > - 
defined HANDLE_DW_RLE || > \ > + defined HANDLE_DW_RLE || defined HANDLE_DW_LLE || > \ > (defined HANDLE_DW_CFA && defined HANDLE_DW_CFA_PRED) || > \ > defined HANDLE_DW_APPLE_PROPERTY || defined HANDLE_DW_UT || > \ > defined HANDLE_DWARF_SECTION || defined HANDLE_DW_IDX || > \ > @@ -91,6 +91,10 @@ > #define HANDLE_DW_RLE(ID, NAME) > #endif > > +#ifndef HANDLE_DW_LLE > +#define HANDLE_DW_LLE(ID, NAME) > +#endif > + > #ifndef HANDLE_DW_CFA > #define HANDLE_DW_CFA(ID, NAME) > #endif > @@ -825,6 +829,17 @@ HANDLE_DW_RLE(0x05, base_address) > HANDLE_DW_RLE(0x06, start_end) > HANDLE_DW_RLE(0x07, start_length) > > +// DWARF v5 Loc List Entry encoding values. > +HANDLE_DW_LLE(0x00, end_of_list) > +HANDLE_DW_LLE(0x01, base_addressx) > +HANDLE_DW_LLE(0x02, startx_endx) > +HANDLE_DW_LLE(0x03, startx_length) > +HANDLE_DW_LLE(0x04, offset_pair) > +HANDLE_DW_LLE(0x05, default_location) > +HANDLE_DW_LLE(0x06, base_address) > +HANDLE_DW_LLE(0x07, start_end) > +HANDLE_DW_LLE(0x08, start_length) > + > // Call frame instruction encodings. > HANDLE_DW_CFA(0x00, nop) > HANDLE_DW_CFA(0x40, advance_loc) > @@ -939,6 +954,7 @@ HANDLE_DW_IDX(0x05, type_hash) > #undef HANDLE_DW_LNCT > #undef HANDLE_DW_MACRO > #undef HANDLE_DW_RLE > +#undef HANDLE_DW_LLE > #undef HANDLE_DW_CFA > #undef HANDLE_DW_CFA_PRED > #undef HANDLE_DW_APPLE_PROPERTY > > Modified: llvm/trunk/include/llvm/BinaryFormat/Dwarf.h > URL: > http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/BinaryFormat/Dwarf.h?rev=374122&r1=374121&r2=374122&view=diff > > ============================================================================== > --- llvm/trunk/include/llvm/BinaryFormat/Dwarf.h (original) > +++ llvm/trunk/include/llvm/BinaryFormat/Dwarf.h Tue Oct 8 14:48:46 2019 > @@ -308,11 +308,17 @@ enum MacroEntryType { > }; > > /// DWARF v5 range list entry encoding values. 
> -enum RangeListEntries { > +enum RnglistEntries { > #define HANDLE_DW_RLE(ID, NAME) DW_RLE_##NAME = ID, > #include "llvm/BinaryFormat/Dwarf.def" > }; > > +/// DWARF v5 loc list entry encoding values. > +enum LoclistEntries { > +#define HANDLE_DW_LLE(ID, NAME) DW_LLE_##NAME = ID, > +#include "llvm/BinaryFormat/Dwarf.def" > +}; > + > /// Call frame instruction encodings. > enum CallFrameInfo { > #define HANDLE_DW_CFA(ID, NAME) DW_CFA_##NAME = ID, > @@ -348,19 +354,6 @@ enum Constants { > DW_EH_PE_indirect = 0x80 > }; > > -/// Constants for location lists in DWARF v5. > -enum LocationListEntry : unsigned char { > - DW_LLE_end_of_list = 0x00, > - DW_LLE_base_addressx = 0x01, > - DW_LLE_startx_endx = 0x02, > - DW_LLE_startx_length = 0x03, > - DW_LLE_offset_pair = 0x04, > - DW_LLE_default_location = 0x05, > - DW_LLE_base_address = 0x06, > - DW_LLE_start_end = 0x07, > - DW_LLE_start_length = 0x08 > -}; > - > /// Constants for the DW_APPLE_PROPERTY_attributes attribute. > /// Keep this list in sync with clang's DeclSpec.h > ObjCPropertyAttributeKind! 
> enum ApplePropertyAttributes { > @@ -475,6 +468,7 @@ StringRef LNStandardString(unsigned Stan > StringRef LNExtendedString(unsigned Encoding); > StringRef MacinfoString(unsigned Encoding); > StringRef RangeListEncodingString(unsigned Encoding); > +StringRef LocListEncodingString(unsigned Encoding); > StringRef CallFrameString(unsigned Encoding, Triple::ArchType Arch); > StringRef ApplePropertyString(unsigned); > StringRef UnitTypeString(unsigned); > > Modified: llvm/trunk/lib/BinaryFormat/Dwarf.cpp > URL: > http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/BinaryFormat/Dwarf.cpp?rev=374122&r1=374121&r2=374122&view=diff > > ============================================================================== > --- llvm/trunk/lib/BinaryFormat/Dwarf.cpp (original) > +++ llvm/trunk/lib/BinaryFormat/Dwarf.cpp Tue Oct 8 14:48:46 2019 > @@ -472,6 +472,17 @@ StringRef llvm::dwarf::RangeListEncoding > } > } > > +StringRef llvm::dwarf::LocListEncodingString(unsigned Encoding) { > + switch (Encoding) { > + default: > + return StringRef(); > +#define HANDLE_DW_LLE(ID, NAME) > \ > + case DW_LLE_##NAME: > \ > + return "DW_LLE_" #NAME; > +#include "llvm/BinaryFormat/Dwarf.def" > + } > +} > + > StringRef llvm::dwarf::CallFrameString(unsigned Encoding, > Triple::ArchType Arch) { > assert(Arch != llvm::Triple::ArchType::UnknownArch); > > Modified: llvm/trunk/lib/DebugInfo/DWARF/DWARFDebugLoc.cpp > URL: > http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/DebugInfo/DWARF/DWARFDebugLoc.cpp?rev=374122&r1=374121&r2=374122&view=diff > > ============================================================================== > --- llvm/trunk/lib/DebugInfo/DWARF/DWARFDebugLoc.cpp (original) > +++ llvm/trunk/lib/DebugInfo/DWARF/DWARFDebugLoc.cpp Tue Oct 8 14:48:46 > 2019 > @@ -143,7 +143,7 @@ DWARFDebugLoclists::parseOneLocationList > DataExtractor::Cursor C(*Offset); > > // dwarf::DW_LLE_end_of_list_entry is 0 and indicates the end of the > list. 
> - while (auto Kind =
> static_cast<dwarf::LocationListEntry>(Data.getU8(C))) {
> + while (auto Kind = static_cast<dwarf::LoclistEntries>(Data.getU8(C))) {
> Entry E;
> E.Kind = Kind;
> switch (Kind) {
>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From llvm-commits at lists.llvm.org Wed Oct 9 11:17:17 2019
From: llvm-commits at lists.llvm.org (Ali Tamur via Phabricator via llvm-commits)
Date: Wed, 09 Oct 2019 18:17:17 +0000 (UTC)
Subject: [PATCH] D63978: Clang Interface Stubs merger plumbing for Driver
In-Reply-To:
References:
Message-ID: <306496ca8fd83ec90a3bc7befd7a6096@localhost.localdomain>

tamur added a comment.

It seems that with this patch, llvm-ifs starts to depend on yaml2obj, which as far as I know, was only used for testing purposes until now. Is this intended?

Repository:
rL LLVM

CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D63978/new/

https://reviews.llvm.org/D63978

From llvm-commits at lists.llvm.org Wed Oct 9 11:17:17 2019
From: llvm-commits at lists.llvm.org (Chris Matthews via Phabricator via llvm-commits)
Date: Wed, 09 Oct 2019 18:17:17 +0000 (UTC)
Subject: [PATCH] D68220: [LNT] Python 3 support: stable showtests output
In-Reply-To:
References:
Message-ID: <17af20f9c06d6d77f7277dd239e432f0@localhost.localdomain>

cmatthews accepted this revision.
cmatthews added a comment.
This revision is now accepted and ready to land.

Yeah, that seems fine.
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68220/new/ https://reviews.llvm.org/D68220 From llvm-commits at lists.llvm.org Wed Oct 9 11:23:31 2019 From: llvm-commits at lists.llvm.org (Julian Lettner via llvm-commits) Date: Wed, 09 Oct 2019 18:23:31 -0000 Subject: [llvm] r374194 - [lit] Refactor ProgressDisplay Message-ID: <20191009182331.0ED2A8FEA0@lists.llvm.org> Author: yln Date: Wed Oct 9 11:23:30 2019 New Revision: 374194 URL: http://llvm.org/viewvc/llvm-project?rev=374194&view=rev Log: [lit] Refactor ProgressDisplay Move progress display to separate file. Simplify some code paths. Decouple from other components via progress callback. Remove unused `_Display` class. Reviewed By: serge-sans-paille Differential Revision: https://reviews.llvm.org/D68525 Added: llvm/trunk/utils/lit/lit/display.py Modified: llvm/trunk/utils/lit/lit/ProgressBar.py llvm/trunk/utils/lit/lit/main.py llvm/trunk/utils/lit/lit/run.py llvm/trunk/utils/lit/tests/progress-bar.py Modified: llvm/trunk/utils/lit/lit/ProgressBar.py URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/lit/ProgressBar.py?rev=374194&r1=374193&r2=374194&view=diff ============================================================================== --- llvm/trunk/utils/lit/lit/ProgressBar.py (original) +++ llvm/trunk/utils/lit/lit/ProgressBar.py Wed Oct 9 11:23:30 2019 @@ -172,7 +172,7 @@ class SimpleProgressBar: A simple progress bar which doesn't need any terminal support. This prints out a progress bar like: - 'Header: 0 .. 10.. 20.. ...' + 'Header: 0.. 10.. 20.. ...' 
""" def __init__(self, header): @@ -191,7 +191,7 @@ class SimpleProgressBar: for i in range(self.atIndex, next): idx = i % 5 if idx == 0: - sys.stdout.write('%-2d' % (i*2)) + sys.stdout.write('%2d' % (i*2)) elif idx == 1: pass # Skip second char elif idx < 4: Added: llvm/trunk/utils/lit/lit/display.py URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/lit/display.py?rev=374194&view=auto ============================================================================== --- llvm/trunk/utils/lit/lit/display.py (added) +++ llvm/trunk/utils/lit/lit/display.py Wed Oct 9 11:23:30 2019 @@ -0,0 +1,98 @@ +import sys + +import lit.ProgressBar + +def create_display(opts, tests, total_tests, workers): + if opts.quiet: + return NopProgressDisplay() + + of_total = (' of %d' % total_tests) if (tests != total_tests) else '' + header = '-- Testing: %d%s tests, %d workers --' % (tests, of_total, workers) + + progress_bar = None + if opts.succinct and opts.useProgressBar: + try: + tc = lit.ProgressBar.TerminalController() + progress_bar = lit.ProgressBar.ProgressBar(tc, header) + except ValueError: + print(header) + progress_bar = lit.ProgressBar.SimpleProgressBar('Testing: ') + else: + print(header) + + if progress_bar: + progress_bar.update(0, '') + + return ProgressDisplay(opts, tests, progress_bar) + +class NopProgressDisplay(object): + def update(self, test): pass + def finish(self): pass + +class ProgressDisplay(object): + def __init__(self, opts, numTests, progressBar): + self.opts = opts + self.numTests = numTests + self.progressBar = progressBar + self.completed = 0 + + def finish(self): + if self.progressBar: + self.progressBar.clear() + elif self.opts.succinct: + sys.stdout.write('\n') + + def update(self, test): + self.completed += 1 + + show_result = test.result.code.isFailure or \ + self.opts.showAllOutput or \ + (not self.opts.quiet and not self.opts.succinct) + if show_result: + self.print_result(test) + + if self.progressBar: + percent = float(self.completed) / 
self.numTests + self.progressBar.update(percent, test.getFullName()) + + def print_result(self, test): + if self.progressBar: + self.progressBar.clear() + + # Show the test result line. + test_name = test.getFullName() + print('%s: %s (%d of %d)' % (test.result.code.name, test_name, + self.completed, self.numTests)) + + # Show the test failure output, if requested. + if (test.result.code.isFailure and self.opts.showOutput) or \ + self.opts.showAllOutput: + if test.result.code.isFailure: + print("%s TEST '%s' FAILED %s" % ('*'*20, test.getFullName(), + '*'*20)) + print(test.result.output) + print("*" * 20) + + # Report test metrics, if present. + if test.result.metrics: + print("%s TEST '%s' RESULTS %s" % ('*'*10, test.getFullName(), + '*'*10)) + items = sorted(test.result.metrics.items()) + for metric_name, value in items: + print('%s: %s ' % (metric_name, value.format())) + print("*" * 10) + + # Report micro-tests, if present + if test.result.microResults: + items = sorted(test.result.microResults.items()) + for micro_test_name, micro_test in items: + print("%s MICRO-TEST: %s" % + ('*'*3, micro_test_name)) + + if micro_test.metrics: + sorted_metrics = sorted(micro_test.metrics.items()) + for metric_name, value in sorted_metrics: + print(' %s: %s ' % (metric_name, value.format())) + + # Ensure the output is flushed. 
+ sys.stdout.flush() Modified: llvm/trunk/utils/lit/lit/main.py URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/lit/main.py?rev=374194&r1=374193&r2=374194&view=diff ============================================================================== --- llvm/trunk/utils/lit/lit/main.py (original) +++ llvm/trunk/utils/lit/lit/main.py Wed Oct 9 11:23:30 2019 @@ -19,84 +19,12 @@ import tempfile import shutil from xml.sax.saxutils import quoteattr -import lit.ProgressBar +import lit.discovery +import lit.display import lit.LitConfig -import lit.Test import lit.run +import lit.Test import lit.util -import lit.discovery - -class TestingProgressDisplay(object): - def __init__(self, opts, numTests, progressBar=None): - self.opts = opts - self.numTests = numTests - self.progressBar = progressBar - self.completed = 0 - - def finish(self): - if self.progressBar: - self.progressBar.clear() - elif self.opts.quiet: - pass - elif self.opts.succinct: - sys.stdout.write('\n') - - def update(self, test): - self.completed += 1 - - if self.opts.incremental: - update_incremental_cache(test) - - if self.progressBar: - self.progressBar.update(float(self.completed)/self.numTests, - test.getFullName()) - - shouldShow = test.result.code.isFailure or \ - self.opts.showAllOutput or \ - (not self.opts.quiet and not self.opts.succinct) - if not shouldShow: - return - - if self.progressBar: - self.progressBar.clear() - - # Show the test result line. - test_name = test.getFullName() - print('%s: %s (%d of %d)' % (test.result.code.name, test_name, - self.completed, self.numTests)) - - # Show the test failure output, if requested. - if (test.result.code.isFailure and self.opts.showOutput) or \ - self.opts.showAllOutput: - if test.result.code.isFailure: - print("%s TEST '%s' FAILED %s" % ('*'*20, test.getFullName(), - '*'*20)) - print(test.result.output) - print("*" * 20) - - # Report test metrics, if present. 
- if test.result.metrics: - print("%s TEST '%s' RESULTS %s" % ('*'*10, test.getFullName(), - '*'*10)) - items = sorted(test.result.metrics.items()) - for metric_name, value in items: - print('%s: %s ' % (metric_name, value.format())) - print("*" * 10) - - # Report micro-tests, if present - if test.result.microResults: - items = sorted(test.result.microResults.items()) - for micro_test_name, micro_test in items: - print("%s MICRO-TEST: %s" % - ('*'*3, micro_test_name)) - - if micro_test.metrics: - sorted_metrics = sorted(micro_test.metrics.items()) - for metric_name, value in sorted_metrics: - print(' %s: %s ' % (metric_name, value.format())) - - # Ensure the output is flushed. - sys.stdout.flush() def write_test_results(run, lit_config, testing_time, output_path): try: @@ -505,29 +433,22 @@ def main_with_tmp(builtinParameters): except: pass - extra = (' of %d' % numTotalTests) if (len(run.tests) != numTotalTests) else '' - header = '-- Testing: %d%s tests, %d workers --' % (len(run.tests), extra, opts.numWorkers) - progressBar = None - if not opts.quiet: - if opts.succinct and opts.useProgressBar: - try: - tc = lit.ProgressBar.TerminalController() - progressBar = lit.ProgressBar.ProgressBar(tc, header) - except ValueError: - print(header) - progressBar = lit.ProgressBar.SimpleProgressBar('Testing: ') - else: - print(header) + display = lit.display.create_display(opts, len(run.tests), + numTotalTests, opts.numWorkers) + def progress_callback(test): + display.update(test) + if opts.incremental: + update_incremental_cache(test) startTime = time.time() - display = TestingProgressDisplay(opts, len(run.tests), progressBar) try: - run.execute_tests(display, opts.numWorkers, opts.maxTime) + run.execute_tests(progress_callback, opts.numWorkers, opts.maxTime) except KeyboardInterrupt: sys.exit(2) + testing_time = time.time() - startTime + display.finish() - testing_time = time.time() - startTime if not opts.quiet: print('Testing Time: %.2fs' % (testing_time,)) Modified: 
llvm/trunk/utils/lit/lit/run.py URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/lit/run.py?rev=374194&r1=374193&r2=374194&view=diff ============================================================================== --- llvm/trunk/utils/lit/lit/run.py (original) +++ llvm/trunk/utils/lit/lit/run.py Wed Oct 9 11:23:30 2019 @@ -5,18 +5,6 @@ import lit.Test import lit.util import lit.worker -class _Display(object): - def __init__(self, display, provider, maxFailures): - self.display = display - self.provider = provider - self.maxFailures = maxFailures or object() - self.failedCount = 0 - def update(self, test): - self.display.update(test) - self.failedCount += (test.result.code == lit.Test.FAIL) - if self.failedCount == self.maxFailures: - self.provider.cancel() - # No-operation semaphore for supporting `None` for parallelism_groups. # lit_config.parallelism_groups['my_group'] = None class NopSemaphore(object): @@ -93,21 +81,20 @@ class Run(object): finally: pool.join() - def execute_tests(self, display, workers, max_time=None): + def execute_tests(self, progress_callback, workers, max_time): """ - execute_tests(display, workers, [max_time]) + execute_tests(progress_callback, workers, max_time) Execute the tests in the run using up to the specified number of - parallel tasks, and inform the display of each individual result. The + parallel tasks, and inform the caller of each individual result. The provided tests should be a subset of the tests available in this run object. + The progress_callback will be invoked for each completed test. + If max_time is non-None, it should be a time in seconds after which to stop executing tests. - The display object will have its update method called for each completed - test. - Upon completion, each test in the run will have its result computed. Tests which were not actually executed (for any reason) will be given an UNRESOLVED result. 
@@ -116,9 +103,7 @@ class Run(object): if not self.tests: return - # Save the display object on the runner so that we can update it from - # our task completion callback. - self.display = display + self.progress_callback = progress_callback self.failure_count = 0 self.hit_max_failures = False @@ -156,7 +141,7 @@ class Run(object): assert self.tests[test_index].file_path == test_with_result.file_path, \ "parent and child disagree on test path" self.tests[test_index] = test_with_result - self.display.update(test_with_result) + self.progress_callback(test_with_result) # If we've finished all the tests or too many tests have failed, notify # the main thread that we've stopped testing. Modified: llvm/trunk/utils/lit/tests/progress-bar.py URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/progress-bar.py?rev=374194&r1=374193&r2=374194&view=diff ============================================================================== --- llvm/trunk/utils/lit/tests/progress-bar.py (original) +++ llvm/trunk/utils/lit/tests/progress-bar.py Wed Oct 9 11:23:30 2019 @@ -3,11 +3,12 @@ # RUN: not %{lit} -j 1 -s %{inputs}/progress-bar > %t.out # RUN: FileCheck < %t.out %s # -# CHECK: Testing: 0 .. 10.. 20 +# CHECK: Testing: # CHECK: FAIL: progress-bar :: test-1.txt (1 of 4) -# CHECK: Testing: 0 .. 10.. 20.. 30.. 40.. +# CHECK: Testing: 0.. 10.. 20 # CHECK: FAIL: progress-bar :: test-2.txt (2 of 4) -# CHECK: Testing: 0 .. 10.. 20.. 30.. 40.. 50.. 60.. 70 +# CHECK: Testing: 0.. 10.. 20.. 30.. 40.. # CHECK: FAIL: progress-bar :: test-3.txt (3 of 4) -# CHECK: Testing: 0 .. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 80.. 90.. +# CHECK: Testing: 0.. 10.. 20.. 30.. 40.. 50.. 60.. 70 # CHECK: FAIL: progress-bar :: test-4.txt (4 of 4) +# CHECK: Testing: 0.. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 80.. 90.. 
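The decoupling described in the r374194 log ("Decouple from other components via progress callback") can be sketched in isolation. This is only a minimal illustration, not lit's actual code: `Display`, `execute_tests`, `run_all`, and the `touched` list are hypothetical stand-ins for `lit.display.ProgressDisplay`, `Run.execute_tests`, the body of `main_with_tmp`, and `update_incremental_cache`, and tests are reduced to plain names.

```python
class Display(object):
    """Hypothetical stand-in for a progress display: it records result lines."""

    def __init__(self, num_tests):
        self.num_tests = num_tests
        self.completed = 0
        self.lines = []

    def update(self, test_name):
        self.completed += 1
        self.lines.append('%s (%d of %d)' %
                          (test_name, self.completed, self.num_tests))


def execute_tests(tests, progress_callback):
    """Hypothetical runner: it only knows it must call back once per test."""
    for test_name in tests:
        progress_callback(test_name)


def run_all(tests, incremental=False):
    """Wire display and bookkeeping together through a single closure."""
    display = Display(len(tests))
    touched = []  # assumption: stands in for the incremental-cache update

    def progress_callback(test_name):
        display.update(test_name)
        if incremental:
            touched.append(test_name)

    execute_tests(tests, progress_callback)
    return display, touched
```

With this shape the runner no longer needs a display object at all, so extra per-test bookkeeping (such as the incremental cache update in the diff) can live in the caller's closure rather than in the display class.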
From llvm-commits at lists.llvm.org Wed Oct 9 11:26:42 2019
From: llvm-commits at lists.llvm.org (Michael Kruse via Phabricator via llvm-commits)
Date: Wed, 09 Oct 2019 18:26:42 +0000 (UTC)
Subject: [PATCH] D68551: [clang-format] [NFC] Ensure clang-format is itself clang-formatted.
In-Reply-To:
References:
Message-ID:

Meinersbur added a comment.

In D68551#1697842, @mitchell-stellar wrote:

> I agree that a system in place that either enforces clang-formatting on commit or after the fact would be ideal. Otherwise, I don't see a need to have to approve these NFC commits.

The current coding policy contains "Our long term goal is for the entire codebase to follow the convention, but we explicitly do not want patches that do large-scale reformatting of existing code.", which was added after someone removed all trailing whitespace in all LLVM files. Reformatting the code you are going to work in is fine, but not the entire code base.

Ideally we'd also run the regression tests in a pre-commit hook.

Btw, I am the author of the CMakeLists snippet quoted by @MyDeveloperDay. Before that, it was a shell script that didn't run on Windows. Making it part of the regression test basically eliminated all discussion about code formatting, but we had to run large-scale reformatting whenever clang-format changed in some way. It is also run by the polly-* buildbots, which I personally do not like, since I don't see code formatting as a reason why a build should fail.

Repository:
rL LLVM

CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D68551/new/

https://reviews.llvm.org/D68551

From llvm-commits at lists.llvm.org Wed Oct 9 11:26:42 2019
From: llvm-commits at lists.llvm.org (Stanislav Mekhanoshin via Phabricator via llvm-commits)
Date: Wed, 09 Oct 2019 18:26:42 +0000 (UTC)
Subject: [PATCH] D68635: [AMDGPU] Come back patch for the 'Assign register class for cross block values according to the divergence.'
In-Reply-To:
References:
Message-ID: <52efd0617878ba45b717fbc7912a7e46@localhost.localdomain>

rampitec accepted this revision.
rampitec added a comment.
This revision is now accepted and ready to land.

LGTM

CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D68635/new/

https://reviews.llvm.org/D68635

From llvm-commits at lists.llvm.org Wed Oct 9 11:26:43 2019
From: llvm-commits at lists.llvm.org (Vedant Kumar via Phabricator via llvm-commits)
Date: Wed, 09 Oct 2019 18:26:43 +0000 (UTC)
Subject: [PATCH] D66979: [InstrProf] Tighten a check for malformed data records in raw profiles
In-Reply-To:
References:
Message-ID:

vsk added a comment.

In D66979#1701139, @w2yehia wrote:

> Hi @vsk can you provide a description/script on how to recreate the `malformed-ptr-to-counter-array.profraw` file when someone is changing the profile layout (for example by adding new value profiling kinds).
> I'm thinking something like `llvm/test/tools/llvm-profdata/raw-two-profiles.test` would be nice
> Thanks.

Hi @w2yehia, I think the test needs to be rewritten. PTAL at D68718.

Repository:
rL LLVM

CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D66979/new/

https://reviews.llvm.org/D66979

From llvm-commits at lists.llvm.org Wed Oct 9 11:26:42 2019
From: llvm-commits at lists.llvm.org (Vedant Kumar via Phabricator via llvm-commits)
Date: Wed, 09 Oct 2019 18:26:42 +0000 (UTC)
Subject: [PATCH] D68718: [llvm-profdata] Make "malformed-ptr-to-counter-array.test" textual
Message-ID:

vsk created this revision.
vsk added a reviewer: w2yehia.
Herald added a project: LLVM.

As pointed out in https://reviews.llvm.org/D66979 post-commit, making this test textual would make it more maintainable.
https://reviews.llvm.org/D68718 Files: llvm/test/tools/llvm-profdata/Inputs/malformed-ptr-to-counter-array.profraw llvm/test/tools/llvm-profdata/malformed-ptr-to-counter-array.test Index: llvm/test/tools/llvm-profdata/malformed-ptr-to-counter-array.test =================================================================== --- llvm/test/tools/llvm-profdata/malformed-ptr-to-counter-array.test +++ llvm/test/tools/llvm-profdata/malformed-ptr-to-counter-array.test @@ -1,5 +1,53 @@ -REQUIRES: zlib +// Header +// +// INSTR_PROF_RAW_HEADER(uint64_t, Magic, __llvm_profile_get_magic()) +// INSTR_PROF_RAW_HEADER(uint64_t, Version, __llvm_profile_get_version()) +// INSTR_PROF_RAW_HEADER(uint64_t, DataSize, DataSize) +// INSTR_PROF_RAW_HEADER(uint64_t, CountersSize, CountersSize) +// INSTR_PROF_RAW_HEADER(uint64_t, NamesSize, NamesSize) +// INSTR_PROF_RAW_HEADER(uint64_t, CountersDelta, (uintptr_t)CountersBegin) +// INSTR_PROF_RAW_HEADER(uint64_t, NamesDelta, (uintptr_t)NamesBegin) +// INSTR_PROF_RAW_HEADER(uint64_t, ValueKindLast, IPVK_Last) -RUN: not llvm-profdata merge -o /dev/null %p/Inputs/malformed-ptr-to-counter-array.profraw 2>&1 | FileCheck %s +RUN: printf '\201rforpl\377' > %t.profraw +RUN: printf '\4\0\0\0\0\0\0\0' >> %t.profraw +RUN: printf '\1\0\0\0\0\0\0\0' >> %t.profraw +RUN: printf '\2\0\0\0\0\0\0\0' >> %t.profraw +RUN: printf '\10\0\0\0\0\0\0\0' >> %t.profraw +RUN: printf '\0\0\6\0\1\0\0\0' >> %t.profraw +RUN: printf '\0\0\6\0\2\0\0\0' >> %t.profraw +RUN: printf '\0\0\0\0\0\0\0\0' >> %t.profraw + +// Data Section +// +// INSTR_PROF_DATA(const uint64_t, llvm::Type::getInt64Ty(Ctx), NameRef, \ +// ConstantInt::get(llvm::Type::getInt64Ty(Ctx), \ +// IndexedInstrProf::ComputeHash(getPGOFuncNameVarInitializer(Inc->getName())))) +// INSTR_PROF_DATA(const uint64_t, llvm::Type::getInt64Ty(Ctx), FuncHash, \ +// ConstantInt::get(llvm::Type::getInt64Ty(Ctx), \ +// Inc->getHash()->getZExtValue())) +// INSTR_PROF_DATA(const IntPtrT, llvm::Type::getInt64PtrTy(Ctx), CounterPtr, 
\ +// ConstantExpr::getBitCast(CounterPtr, \ +// llvm::Type::getInt64PtrTy(Ctx))) + +RUN: printf '\067\265\035\031\112\165\023\344' >> %t.profraw +RUN: printf '\02\0\0\0\0\0\0\0' >> %t.profraw + +// Note: The CounterPtr here is off-by-one. This should trigger a malformed profile error. +RUN: printf '\0\0\6\0\1\0\0\1' >> %t.profraw + +// Counter Section + +RUN: printf '\0\0\0\0\0\0\0\0' >> %t.profraw +RUN: printf '\0\0\0\0\0\0\0\0' >> %t.profraw + +// Name Section + +RUN: printf '\02\0\0\0\0\0\0\0' >> %t.profraw +RUN: printf '\067\0\0\0\0\0\0\0' >> %t.profraw +RUN: printf '\101\0\0\0\0\0\0\0' >> %t.profraw +RUN: printf '\3\0bar\0\0\0' >> %t.profraw + +RUN: not llvm-profdata merge -o /dev/null %t.profraw 2>&1 | FileCheck %s CHECK: Malformed instrumentation profile data -------------- next part -------------- A non-text attachment was scrubbed... Name: D68718.224109.patch Type: text/x-patch Size: 2699 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 11:26:43 2019 From: llvm-commits at lists.llvm.org (Sanjay Patel via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 18:26:43 +0000 (UTC) Subject: [PATCH] D68706: [InstCombine] don't assume 'inbounds' for bitcast deref or null pointer in non-default address space In-Reply-To: References: Message-ID: <274d757ce7558d15e959689efc365deb@localhost.localdomain> spatel updated this revision to Diff 224108. spatel added a comment. Patch updated: No diffs in this patch itself, but rebased after adding test at rL374190 . I don't have much experience with addrspaces or inbounds, so let me know if I should change anything else. 
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68706/new/ https://reviews.llvm.org/D68706 Files: llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp llvm/test/Transforms/InstCombine/load-bitcast-vec.ll Index: llvm/test/Transforms/InstCombine/load-bitcast-vec.ll =================================================================== --- llvm/test/Transforms/InstCombine/load-bitcast-vec.ll +++ llvm/test/Transforms/InstCombine/load-bitcast-vec.ll @@ -100,11 +100,11 @@ ret float %r } -; TODO: Is a null pointer inbounds in any address space? +; A null pointer can't be assumed inbounds in a non-default address space. define float @matching_scalar_smallest_deref_or_null_addrspace(<4 x float> addrspace(4)* dereferenceable_or_null(1) %p) { ; CHECK-LABEL: @matching_scalar_smallest_deref_or_null_addrspace( -; CHECK-NEXT: [[BC:%.*]] = getelementptr inbounds <4 x float>, <4 x float> addrspace(4)* [[P:%.*]], i64 0, i64 0 +; CHECK-NEXT: [[BC:%.*]] = getelementptr <4 x float>, <4 x float> addrspace(4)* [[P:%.*]], i64 0, i64 0 ; CHECK-NEXT: [[R:%.*]] = load float, float addrspace(4)* [[BC]], align 16 ; CHECK-NEXT: ret float [[R]] ; Index: llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp =================================================================== --- llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp +++ llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp @@ -2344,8 +2344,12 @@ // If the source pointer is dereferenceable, then assume it points to an // allocated object and apply "inbounds" to the GEP. bool CanBeNull; - if (Src->getPointerDereferenceableBytes(DL, CanBeNull)) - GEP->setIsInBounds(); + if (Src->getPointerDereferenceableBytes(DL, CanBeNull)) { + // In a non-default address space (not 0), a null pointer can not be + // assumed inbounds, so ignore that case (dereferenceable_or_null). + if (SrcPTy->getAddressSpace() == 0 || !CanBeNull) + GEP->setIsInBounds(); + } return GEP; } } -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68706.224108.patch
Type: text/x-patch
Size: 1826 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Wed Oct 9 11:26:44 2019
From: llvm-commits at lists.llvm.org (Roman Lebedev via Phabricator via llvm-commits)
Date: Wed, 09 Oct 2019 18:26:44 +0000 (UTC)
Subject: [PATCH] D68714: [MCA] Show aggregate over Average Wait times for the whole snippet (PR43219)
In-Reply-To:
References:
Message-ID: <7af072864d932cde9ea05412cd5f8473@localhost.localdomain>

lebedev.ri added inline comments.

================
Comment at: llvm/tools/llvm-mca/Views/TimelineView.cpp:132-140
 void TimelineView::printWaitTimeEntry(formatted_raw_ostream &OS,
                                       const WaitTimeEntry &Entry,
                                       unsigned SourceIndex,
-                                      unsigned Executions) const {
-  OS << SourceIndex << '.';
+                                      unsigned CumulativeExecutions,
+                                      unsigned Executions,
+                                      bool ShouldNumber) const {
+  if (ShouldNumber)
----------------
andreadb wrote:
> You can still use the old signature for this method (see the explanation below):
>
> We know that we are printing the special entry if `SourceIndex == Source.size()`.
>
> You can use that knowledge in two places:
>
> 1) You can automatically infer flag `ShouldNumber`.
>
> ```
> bool ShouldNumber = SourceIndex != Source.size();
> if (ShouldNumber)
>   OS << SourceIndex << '.';
> ```
>
> 2) Before printing the average times, you can check if the numbers are for the special entry and modify the value of `Executions` with `Timeline.size() / Source.size()`.
>
> You can do it where you currently added the FIXME comment.
>
> ```
> if (!ShouldNumber) {
>   // override Executions for the purpose of changing colors.
>   Executions = Timeline.size() / Source.size();
> }
> ```
>
> Basically you don't need `CumulativeExecutions` as it can be inferred from the context. That should also fix the issue with the coloring of the output.

To be honest I do not understand this comment.
Is this better or worse? This does not help with coloring.
================ Comment at: llvm/tools/llvm-mca/Views/TimelineView.cpp:204-215 + + WaitTimeEntry TotalWaitTime = std::accumulate( + WaitTime.begin(), WaitTime.end(), WaitTimeEntry{0, 0, 0}, + [](const WaitTimeEntry &A, const WaitTimeEntry &B) { + return WaitTimeEntry{ + A.CyclesSpentInSchedulerQueue + B.CyclesSpentInSchedulerQueue, + A.CyclesSpentInSQWhileReady + B.CyclesSpentInSQWhileReady, ---------------- andreadb wrote: > We should not print the special entry if Source.size() == 1. > > If the input assembly only contains a single instruction, then we know that the entry is redundant. > > It should also (hopefully) simplify the diff a bit. Right. Not by much though. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68714/new/ https://reviews.llvm.org/D68714 From llvm-commits at lists.llvm.org Wed Oct 9 11:26:44 2019 From: llvm-commits at lists.llvm.org (Roman Lebedev via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 18:26:44 +0000 (UTC) Subject: [PATCH] D68714: [MCA] Show aggregate over Average Wait times for the whole snippet (PR43219) In-Reply-To: References: Message-ID: lebedev.ri updated this revision to Diff 224110. lebedev.ri marked 3 inline comments as done. lebedev.ri added a comment. Attempt to address nits. 
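A minimal standalone sketch of the aggregation discussed in this review, for readers without the patch at hand: the per-instruction wait-time entries are summed field by field with std::accumulate, and the summary row is skipped when the snippet has a single instruction. Only the first two field names appear in the quoted diff; the third field name (ThirdCounter) and the helpers totalWaitTime and shouldPrintTotal are placeholders for illustration, not llvm-mca's actual code.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Stand-in for llvm-mca's WaitTimeEntry. The quoted diff is cut off after
// the second field, so the third field name below is a placeholder.
struct WaitTimeEntry {
  unsigned CyclesSpentInSchedulerQueue;
  unsigned CyclesSpentInSQWhileReady;
  unsigned ThirdCounter; // placeholder name, assumption
};

// Sum all entries field by field, starting from an all-zero entry.
WaitTimeEntry totalWaitTime(const std::vector<WaitTimeEntry> &WaitTime) {
  return std::accumulate(
      WaitTime.begin(), WaitTime.end(), WaitTimeEntry{0, 0, 0},
      [](const WaitTimeEntry &A, const WaitTimeEntry &B) {
        return WaitTimeEntry{
            A.CyclesSpentInSchedulerQueue + B.CyclesSpentInSchedulerQueue,
            A.CyclesSpentInSQWhileReady + B.CyclesSpentInSQWhileReady,
            A.ThirdCounter + B.ThirdCounter};
      });
}

// With a single source instruction the summary row would just repeat the
// only entry, so it is not worth printing.
bool shouldPrintTotal(std::size_t NumSourceInstructions) {
  return NumSourceInstructions > 1;
}
```

The zero-initialized accumulator keeps the lambda symmetric in both arguments, which is why no special case is needed for the first element.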
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68714/new/ https://reviews.llvm.org/D68714 Files: llvm/docs/CommandGuide/llvm-mca.rst llvm/test/tools/llvm-mca/ARM/memcpy-ldm-stm.s llvm/test/tools/llvm-mca/ARM/vld1-index-update.s llvm/test/tools/llvm-mca/SystemZ/stm-lm.s llvm/test/tools/llvm-mca/X86/Barcelona/clear-super-register-1.s llvm/test/tools/llvm-mca/X86/Barcelona/clear-super-register-2.s llvm/test/tools/llvm-mca/X86/Barcelona/dependency-breaking-cmp.s llvm/test/tools/llvm-mca/X86/Barcelona/dependency-breaking-pcmpeq.s llvm/test/tools/llvm-mca/X86/Barcelona/dependency-breaking-pcmpgt.s llvm/test/tools/llvm-mca/X86/Barcelona/dependency-breaking-sbb-1.s llvm/test/tools/llvm-mca/X86/Barcelona/dependency-breaking-sbb-2.s llvm/test/tools/llvm-mca/X86/Barcelona/int-to-fpu-forwarding-3.s llvm/test/tools/llvm-mca/X86/Barcelona/load-store-throughput.s llvm/test/tools/llvm-mca/X86/Barcelona/load-throughput.s llvm/test/tools/llvm-mca/X86/Barcelona/one-idioms.s llvm/test/tools/llvm-mca/X86/Barcelona/partial-reg-update-2.s llvm/test/tools/llvm-mca/X86/Barcelona/partial-reg-update-3.s llvm/test/tools/llvm-mca/X86/Barcelona/partial-reg-update-4.s llvm/test/tools/llvm-mca/X86/Barcelona/partial-reg-update-6.s llvm/test/tools/llvm-mca/X86/Barcelona/partial-reg-update-7.s llvm/test/tools/llvm-mca/X86/Barcelona/partial-reg-update.s llvm/test/tools/llvm-mca/X86/Barcelona/read-advance-1.s llvm/test/tools/llvm-mca/X86/Barcelona/read-advance-2.s llvm/test/tools/llvm-mca/X86/Barcelona/read-advance-3.s llvm/test/tools/llvm-mca/X86/Barcelona/reg-move-elimination-1.s llvm/test/tools/llvm-mca/X86/Barcelona/reg-move-elimination-2.s llvm/test/tools/llvm-mca/X86/Barcelona/reg-move-elimination-3.s llvm/test/tools/llvm-mca/X86/Barcelona/reg-move-elimination-4.s llvm/test/tools/llvm-mca/X86/Barcelona/reg-move-elimination-5.s llvm/test/tools/llvm-mca/X86/Barcelona/reg-move-elimination-6.s llvm/test/tools/llvm-mca/X86/Barcelona/store-throughput.s 
llvm/test/tools/llvm-mca/X86/Barcelona/zero-idioms.s llvm/test/tools/llvm-mca/X86/BdVer2/add-sequence.s llvm/test/tools/llvm-mca/X86/BdVer2/clear-super-register-1.s llvm/test/tools/llvm-mca/X86/BdVer2/clear-super-register-2.s llvm/test/tools/llvm-mca/X86/BdVer2/clear-super-register-3.s llvm/test/tools/llvm-mca/X86/BdVer2/dependency-breaking-cmp.s llvm/test/tools/llvm-mca/X86/BdVer2/dependency-breaking-pcmpeq.s llvm/test/tools/llvm-mca/X86/BdVer2/dependency-breaking-pcmpgt.s llvm/test/tools/llvm-mca/X86/BdVer2/dependency-breaking-sbb-1.s llvm/test/tools/llvm-mca/X86/BdVer2/dependency-breaking-sbb-2.s llvm/test/tools/llvm-mca/X86/BdVer2/dependent-pmuld-paddd.s llvm/test/tools/llvm-mca/X86/BdVer2/dot-product.s llvm/test/tools/llvm-mca/X86/BdVer2/hadd-read-after-ld-1.s llvm/test/tools/llvm-mca/X86/BdVer2/hadd-read-after-ld-2.s llvm/test/tools/llvm-mca/X86/BdVer2/int-to-fpu-forwarding-3.s llvm/test/tools/llvm-mca/X86/BdVer2/load-store-alias.s llvm/test/tools/llvm-mca/X86/BdVer2/load-store-throughput.s llvm/test/tools/llvm-mca/X86/BdVer2/load-throughput.s llvm/test/tools/llvm-mca/X86/BdVer2/memcpy-like-test.s llvm/test/tools/llvm-mca/X86/BdVer2/one-idioms.s llvm/test/tools/llvm-mca/X86/BdVer2/partial-reg-update-2.s llvm/test/tools/llvm-mca/X86/BdVer2/partial-reg-update-3.s llvm/test/tools/llvm-mca/X86/BdVer2/partial-reg-update-4.s llvm/test/tools/llvm-mca/X86/BdVer2/partial-reg-update-6.s llvm/test/tools/llvm-mca/X86/BdVer2/partial-reg-update.s llvm/test/tools/llvm-mca/X86/BdVer2/pipes-fpu.s llvm/test/tools/llvm-mca/X86/BdVer2/pr37790.s llvm/test/tools/llvm-mca/X86/BdVer2/rank.s llvm/test/tools/llvm-mca/X86/BdVer2/read-advance-1.s llvm/test/tools/llvm-mca/X86/BdVer2/read-advance-2.s llvm/test/tools/llvm-mca/X86/BdVer2/read-advance-3.s llvm/test/tools/llvm-mca/X86/BdVer2/reg-move-elimination-1.s llvm/test/tools/llvm-mca/X86/BdVer2/reg-move-elimination-2.s llvm/test/tools/llvm-mca/X86/BdVer2/reg-move-elimination-3.s 
llvm/test/tools/llvm-mca/X86/BdVer2/reg-move-elimination-4.s llvm/test/tools/llvm-mca/X86/BdVer2/reg-move-elimination-5.s llvm/test/tools/llvm-mca/X86/BdVer2/register-files-1.s llvm/test/tools/llvm-mca/X86/BdVer2/register-files-2.s llvm/test/tools/llvm-mca/X86/BdVer2/register-files-5.s llvm/test/tools/llvm-mca/X86/BdVer2/store-throughput.s llvm/test/tools/llvm-mca/X86/BdVer2/vbroadcast-operand-latency.s llvm/test/tools/llvm-mca/X86/BdVer2/vec-logic-read-after-ld-1.s llvm/test/tools/llvm-mca/X86/BdVer2/vec-logic-read-after-ld-2.s llvm/test/tools/llvm-mca/X86/BdVer2/xop-super-registers-1.s llvm/test/tools/llvm-mca/X86/BdVer2/xop-super-registers-2.s llvm/test/tools/llvm-mca/X86/BdVer2/zero-idioms-avx-256.s llvm/test/tools/llvm-mca/X86/BdVer2/zero-idioms.s llvm/test/tools/llvm-mca/X86/Broadwell/zero-idioms.s llvm/test/tools/llvm-mca/X86/BtVer2/add-sequence.s llvm/test/tools/llvm-mca/X86/BtVer2/bottleneck-hints-1.s llvm/test/tools/llvm-mca/X86/BtVer2/bottleneck-hints-3.s llvm/test/tools/llvm-mca/X86/BtVer2/clear-super-register-1.s llvm/test/tools/llvm-mca/X86/BtVer2/clear-super-register-2.s llvm/test/tools/llvm-mca/X86/BtVer2/cmpxchg-read-advance.s llvm/test/tools/llvm-mca/X86/BtVer2/dependency-breaking-cmp.s llvm/test/tools/llvm-mca/X86/BtVer2/dependency-breaking-pcmpeq.s llvm/test/tools/llvm-mca/X86/BtVer2/dependency-breaking-pcmpgt.s llvm/test/tools/llvm-mca/X86/BtVer2/dependency-breaking-sbb-1.s llvm/test/tools/llvm-mca/X86/BtVer2/dependency-breaking-sbb-2.s llvm/test/tools/llvm-mca/X86/BtVer2/dependent-pmuld-paddd.s llvm/test/tools/llvm-mca/X86/BtVer2/dot-product.s llvm/test/tools/llvm-mca/X86/BtVer2/hadd-read-after-ld-1.s llvm/test/tools/llvm-mca/X86/BtVer2/hadd-read-after-ld-2.s llvm/test/tools/llvm-mca/X86/BtVer2/int-to-fpu-forwarding-3.s llvm/test/tools/llvm-mca/X86/BtVer2/load-store-alias.s llvm/test/tools/llvm-mca/X86/BtVer2/memcpy-like-test.s llvm/test/tools/llvm-mca/X86/BtVer2/one-idioms.s llvm/test/tools/llvm-mca/X86/BtVer2/partial-reg-update-2.s 
llvm/test/tools/llvm-mca/X86/BtVer2/partial-reg-update-3.s llvm/test/tools/llvm-mca/X86/BtVer2/partial-reg-update-4.s llvm/test/tools/llvm-mca/X86/BtVer2/partial-reg-update-6.s llvm/test/tools/llvm-mca/X86/BtVer2/partial-reg-update-7.s llvm/test/tools/llvm-mca/X86/BtVer2/partial-reg-update.s llvm/test/tools/llvm-mca/X86/BtVer2/pipes-fpu.s llvm/test/tools/llvm-mca/X86/BtVer2/pr37790.s llvm/test/tools/llvm-mca/X86/BtVer2/rank.s llvm/test/tools/llvm-mca/X86/BtVer2/read-advance-1.s llvm/test/tools/llvm-mca/X86/BtVer2/read-advance-2.s llvm/test/tools/llvm-mca/X86/BtVer2/read-advance-3.s llvm/test/tools/llvm-mca/X86/BtVer2/reg-move-elimination-1.s llvm/test/tools/llvm-mca/X86/BtVer2/reg-move-elimination-2.s llvm/test/tools/llvm-mca/X86/BtVer2/reg-move-elimination-3.s llvm/test/tools/llvm-mca/X86/BtVer2/reg-move-elimination-4.s llvm/test/tools/llvm-mca/X86/BtVer2/reg-move-elimination-5.s llvm/test/tools/llvm-mca/X86/BtVer2/reg-move-elimination-6.s llvm/test/tools/llvm-mca/X86/BtVer2/register-files-1.s llvm/test/tools/llvm-mca/X86/BtVer2/register-files-2.s llvm/test/tools/llvm-mca/X86/BtVer2/register-files-5.s llvm/test/tools/llvm-mca/X86/BtVer2/vbroadcast-operand-latency.s llvm/test/tools/llvm-mca/X86/BtVer2/vec-logic-read-after-ld-1.s llvm/test/tools/llvm-mca/X86/BtVer2/vec-logic-read-after-ld-2.s llvm/test/tools/llvm-mca/X86/BtVer2/xadd.s llvm/test/tools/llvm-mca/X86/BtVer2/xchg.s llvm/test/tools/llvm-mca/X86/BtVer2/zero-idioms-avx-256.s llvm/test/tools/llvm-mca/X86/BtVer2/zero-idioms.s llvm/test/tools/llvm-mca/X86/Generic/avx512-super-registers-1.s llvm/test/tools/llvm-mca/X86/Generic/avx512-super-registers-2.s llvm/test/tools/llvm-mca/X86/Generic/avx512-super-registers-3.s llvm/test/tools/llvm-mca/X86/Generic/xop-super-registers-1.s llvm/test/tools/llvm-mca/X86/Generic/xop-super-registers-2.s llvm/test/tools/llvm-mca/X86/Haswell/zero-idioms.s llvm/test/tools/llvm-mca/X86/SandyBridge/zero-idioms.s llvm/test/tools/llvm-mca/X86/SkylakeClient/zero-idioms.s 
llvm/test/tools/llvm-mca/X86/SkylakeServer/zero-idioms.s llvm/test/tools/llvm-mca/X86/Znver1/partial-reg-update-2.s llvm/test/tools/llvm-mca/X86/Znver1/partial-reg-update-3.s llvm/test/tools/llvm-mca/X86/Znver1/partial-reg-update-4.s llvm/test/tools/llvm-mca/X86/Znver1/partial-reg-update-6.s llvm/test/tools/llvm-mca/X86/Znver1/partial-reg-update-7.s llvm/test/tools/llvm-mca/X86/Znver1/partial-reg-update.s llvm/test/tools/llvm-mca/X86/bextr-read-after-ld.s llvm/test/tools/llvm-mca/X86/bzhi-read-after-ld.s llvm/test/tools/llvm-mca/X86/fma3-read-after-ld-1.s llvm/test/tools/llvm-mca/X86/fma3-read-after-ld-2.s llvm/test/tools/llvm-mca/X86/read-after-ld-1.s llvm/test/tools/llvm-mca/X86/read-after-ld-2.s llvm/test/tools/llvm-mca/X86/read-after-ld-3.s llvm/test/tools/llvm-mca/X86/sqrt-rsqrt-rcp-memop.s llvm/test/tools/llvm-mca/X86/variable-blend-read-after-ld-1.s llvm/test/tools/llvm-mca/X86/variable-blend-read-after-ld-2.s llvm/tools/llvm-mca/Views/TimelineView.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D68714.224110.patch Type: text/x-patch Size: 111134 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 11:26:49 2019 From: llvm-commits at lists.llvm.org (Julian Lettner via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 18:26:49 +0000 (UTC) Subject: [PATCH] D68525: [lit] Refactor ProgressDisplay In-Reply-To: References: Message-ID: <4cd9f52206d8575ea4013f3fbcc76f60@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rG72c7c21dda99: [lit] Refactor ProgressDisplay (authored by yln). 
Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68525/new/

https://reviews.llvm.org/D68525

Files:
  llvm/utils/lit/lit/ProgressBar.py
  llvm/utils/lit/lit/display.py
  llvm/utils/lit/lit/main.py
  llvm/utils/lit/lit/run.py
  llvm/utils/lit/tests/progress-bar.py
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68525.224111.patch
Type: text/x-patch
Size: 12740 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Wed Oct 9 11:27:16 2019
From: llvm-commits at lists.llvm.org (Wei Mi via Phabricator via llvm-commits)
Date: Wed, 09 Oct 2019 18:27:16 +0000 (UTC)
Subject: [PATCH] D68601: [SampleFDO] Add indexing for function profiles so they can be loaded on demand in ExtBinary format
In-Reply-To:
References:
Message-ID: <6a52d5ce0f9270bf317cecb9f19f06ae@localhost.localdomain>

wmi marked an inline comment as done.
wmi added inline comments.

================
Comment at: llvm/lib/ProfileData/SampleProfReader.cpp:533
+std::error_code SampleProfileReaderExtBinary::readFuncProfiles(uint64_t Size) {
+  const uint8_t *Start = Data;
+  if (UseAllFuncs) {
----------------
davidxl wrote:
> wmi wrote:
> > davidxl wrote:
> > > End = Data + Size
> > It is set in the parent function: readOneSection.
> What I meant is to define a local variable 'End' and use it instead of Start. It makes the code slightly more readable.

I see. Actually, the class member "End" is also set to the end of the section in readOneSection, and it is a little confusing to have two "End" variables, so I cleaned it up further by removing the param "Size" and using the class member "End" instead.
Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68601/new/ https://reviews.llvm.org/D68601 From llvm-commits at lists.llvm.org Wed Oct 9 11:37:13 2019 From: llvm-commits at lists.llvm.org (David Blaikie via llvm-commits) Date: Wed, 09 Oct 2019 18:37:13 -0000 Subject: [llvm] r374196 - DebugInfo: Shot in the dark attempt to fix ubsan error from r374122 Message-ID: <20191009183713.892C79085A@lists.llvm.org> Author: dblaikie Date: Wed Oct 9 11:37:13 2019 New Revision: 374196 URL: http://llvm.org/viewvc/llvm-project?rev=374196&view=rev Log: DebugInfo: Shot in the dark attempt to fix ubsan error from r374122 (specifying an underlying type for the enum might also be suitable - but this seems better/as good, since there's a clear expectation this can contain values other than the actual enumerators of this enum) Modified: llvm/trunk/lib/DebugInfo/DWARF/DWARFDebugLoc.cpp Modified: llvm/trunk/lib/DebugInfo/DWARF/DWARFDebugLoc.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/DebugInfo/DWARF/DWARFDebugLoc.cpp?rev=374196&r1=374195&r2=374196&view=diff ============================================================================== --- llvm/trunk/lib/DebugInfo/DWARF/DWARFDebugLoc.cpp (original) +++ llvm/trunk/lib/DebugInfo/DWARF/DWARFDebugLoc.cpp Wed Oct 9 11:37:13 2019 @@ -143,7 +143,7 @@ DWARFDebugLoclists::parseOneLocationList DataExtractor::Cursor C(*Offset); // dwarf::DW_LLE_end_of_list_entry is 0 and indicates the end of the list. - while (auto Kind = static_cast(Data.getU8(C))) { + while (auto Kind = Data.getU8(C)) { Entry E; E.Kind = Kind; switch (Kind) { From llvm-commits at lists.llvm.org Wed Oct 9 11:35:07 2019 From: llvm-commits at lists.llvm.org (David Blaikie via llvm-commits) Date: Wed, 9 Oct 2019 11:35:07 -0700 Subject: [llvm] r374122 - DebugInfo: Move LLE enum handling to .def to match RLE handling In-Reply-To: References: <20191008214847.5B73D89069@lists.llvm.org> Message-ID: Thanks - sorry for the noise. 
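A minimal standalone sketch of the idea behind r374196, under stated assumptions: the byte read from the section is kept as a plain uint8_t and only compared against the enumerators, so an unexpected value is never stored in the enum type. As I understand it, an enum without a fixed underlying type is only guaranteed to represent the range implied by its enumerators, which is why UBSan flags a load of an out-of-range value. The enum below lists just three of the DW_LLE_* values, and ParseStats and classifyKinds are hypothetical names, not LLVM code.

```cpp
#include <cstdint>
#include <vector>

// Same shape as the enum generated from Dwarf.def: no fixed underlying
// type. Only a subset of the enumerators is repeated here.
enum LoclistEntries {
  DW_LLE_end_of_list = 0x00,
  DW_LLE_offset_pair = 0x04,
  DW_LLE_start_length = 0x08
};

// Hypothetical result type for this sketch.
struct ParseStats {
  unsigned Known = 0;
  unsigned Unknown = 0;
};

// Walk raw kind bytes until the 0 terminator. Kind stays a uint8_t, so a
// byte that matches no enumerator simply falls into the default case.
ParseStats classifyKinds(const std::vector<std::uint8_t> &Bytes) {
  ParseStats S;
  for (std::uint8_t Kind : Bytes) {
    if (Kind == DW_LLE_end_of_list)
      break;
    switch (Kind) {
    case DW_LLE_offset_pair:
    case DW_LLE_start_length:
      ++S.Known;
      break;
    default:
      ++S.Unknown;
      break;
    }
  }
  return S;
}
```

The alternative mentioned in the commit log, giving the enum a fixed underlying type, would also make every byte value representable; the sketch only shows the route the commit took.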
I've committed an attempted fix in r374196 & will keep an eye on the buildbot. On Wed, Oct 9, 2019 at 11:15 AM Vitaly Buka wrote: > UBSAN error after the patch > > /b/sanitizer-x86_64-linux-bootstrap-ubsan/build/llvm-project/llvm/lib/DebugInfo/DWARF/DWARFDebugLoc.cpp:146:15: runtime error: load of value 71, which is not a valid value for type 'llvm::dwarf::LoclistEntries' > > > http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-bootstrap-ubsan/builds/15298/steps/check-llvm%20ubsan/logs/stdio > > On Tue, Oct 8, 2019 at 2:46 PM David Blaikie via llvm-commits < > llvm-commits at lists.llvm.org> wrote: > >> Author: dblaikie >> Date: Tue Oct 8 14:48:46 2019 >> New Revision: 374122 >> >> URL: http://llvm.org/viewvc/llvm-project?rev=374122&view=rev >> Log: >> DebugInfo: Move LLE enum handling to .def to match RLE handling >> >> Modified: >> llvm/trunk/include/llvm/BinaryFormat/Dwarf.def >> llvm/trunk/include/llvm/BinaryFormat/Dwarf.h >> llvm/trunk/lib/BinaryFormat/Dwarf.cpp >> llvm/trunk/lib/DebugInfo/DWARF/DWARFDebugLoc.cpp >> >> Modified: llvm/trunk/include/llvm/BinaryFormat/Dwarf.def >> URL: >> http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/BinaryFormat/Dwarf.def?rev=374122&r1=374121&r2=374122&view=diff >> >> ============================================================================== >> --- llvm/trunk/include/llvm/BinaryFormat/Dwarf.def (original) >> +++ llvm/trunk/include/llvm/BinaryFormat/Dwarf.def Tue Oct 8 14:48:46 >> 2019 >> @@ -17,7 +17,7 @@ >> defined HANDLE_DW_VIRTUALITY || defined HANDLE_DW_DEFAULTED || >> \ >> defined HANDLE_DW_CC || defined HANDLE_DW_LNS || defined >> HANDLE_DW_LNE || \ >> defined HANDLE_DW_LNCT || defined HANDLE_DW_MACRO || >> \ >> - defined HANDLE_DW_RLE || >> \ >> + defined HANDLE_DW_RLE || defined HANDLE_DW_LLE || >> \ >> (defined HANDLE_DW_CFA && defined HANDLE_DW_CFA_PRED) || >> \ >> defined HANDLE_DW_APPLE_PROPERTY || defined HANDLE_DW_UT || >> \ >> defined HANDLE_DWARF_SECTION || defined HANDLE_DW_IDX || >> \ >> 
@@ -91,6 +91,10 @@ >> #define HANDLE_DW_RLE(ID, NAME) >> #endif >> >> +#ifndef HANDLE_DW_LLE >> +#define HANDLE_DW_LLE(ID, NAME) >> +#endif >> + >> #ifndef HANDLE_DW_CFA >> #define HANDLE_DW_CFA(ID, NAME) >> #endif >> @@ -825,6 +829,17 @@ HANDLE_DW_RLE(0x05, base_address) >> HANDLE_DW_RLE(0x06, start_end) >> HANDLE_DW_RLE(0x07, start_length) >> >> +// DWARF v5 Loc List Entry encoding values. >> +HANDLE_DW_LLE(0x00, end_of_list) >> +HANDLE_DW_LLE(0x01, base_addressx) >> +HANDLE_DW_LLE(0x02, startx_endx) >> +HANDLE_DW_LLE(0x03, startx_length) >> +HANDLE_DW_LLE(0x04, offset_pair) >> +HANDLE_DW_LLE(0x05, default_location) >> +HANDLE_DW_LLE(0x06, base_address) >> +HANDLE_DW_LLE(0x07, start_end) >> +HANDLE_DW_LLE(0x08, start_length) >> + >> // Call frame instruction encodings. >> HANDLE_DW_CFA(0x00, nop) >> HANDLE_DW_CFA(0x40, advance_loc) >> @@ -939,6 +954,7 @@ HANDLE_DW_IDX(0x05, type_hash) >> #undef HANDLE_DW_LNCT >> #undef HANDLE_DW_MACRO >> #undef HANDLE_DW_RLE >> +#undef HANDLE_DW_LLE >> #undef HANDLE_DW_CFA >> #undef HANDLE_DW_CFA_PRED >> #undef HANDLE_DW_APPLE_PROPERTY >> >> Modified: llvm/trunk/include/llvm/BinaryFormat/Dwarf.h >> URL: >> http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/BinaryFormat/Dwarf.h?rev=374122&r1=374121&r2=374122&view=diff >> >> ============================================================================== >> --- llvm/trunk/include/llvm/BinaryFormat/Dwarf.h (original) >> +++ llvm/trunk/include/llvm/BinaryFormat/Dwarf.h Tue Oct 8 14:48:46 2019 >> @@ -308,11 +308,17 @@ enum MacroEntryType { >> }; >> >> /// DWARF v5 range list entry encoding values. >> -enum RangeListEntries { >> +enum RnglistEntries { >> #define HANDLE_DW_RLE(ID, NAME) DW_RLE_##NAME = ID, >> #include "llvm/BinaryFormat/Dwarf.def" >> }; >> >> +/// DWARF v5 loc list entry encoding values. >> +enum LoclistEntries { >> +#define HANDLE_DW_LLE(ID, NAME) DW_LLE_##NAME = ID, >> +#include "llvm/BinaryFormat/Dwarf.def" >> +}; >> + >> /// Call frame instruction encodings. 
>> enum CallFrameInfo { >> #define HANDLE_DW_CFA(ID, NAME) DW_CFA_##NAME = ID, >> @@ -348,19 +354,6 @@ enum Constants { >> DW_EH_PE_indirect = 0x80 >> }; >> >> -/// Constants for location lists in DWARF v5. >> -enum LocationListEntry : unsigned char { >> - DW_LLE_end_of_list = 0x00, >> - DW_LLE_base_addressx = 0x01, >> - DW_LLE_startx_endx = 0x02, >> - DW_LLE_startx_length = 0x03, >> - DW_LLE_offset_pair = 0x04, >> - DW_LLE_default_location = 0x05, >> - DW_LLE_base_address = 0x06, >> - DW_LLE_start_end = 0x07, >> - DW_LLE_start_length = 0x08 >> -}; >> - >> /// Constants for the DW_APPLE_PROPERTY_attributes attribute. >> /// Keep this list in sync with clang's DeclSpec.h >> ObjCPropertyAttributeKind! >> enum ApplePropertyAttributes { >> @@ -475,6 +468,7 @@ StringRef LNStandardString(unsigned Stan >> StringRef LNExtendedString(unsigned Encoding); >> StringRef MacinfoString(unsigned Encoding); >> StringRef RangeListEncodingString(unsigned Encoding); >> +StringRef LocListEncodingString(unsigned Encoding); >> StringRef CallFrameString(unsigned Encoding, Triple::ArchType Arch); >> StringRef ApplePropertyString(unsigned); >> StringRef UnitTypeString(unsigned); >> >> Modified: llvm/trunk/lib/BinaryFormat/Dwarf.cpp >> URL: >> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/BinaryFormat/Dwarf.cpp?rev=374122&r1=374121&r2=374122&view=diff >> >> ============================================================================== >> --- llvm/trunk/lib/BinaryFormat/Dwarf.cpp (original) >> +++ llvm/trunk/lib/BinaryFormat/Dwarf.cpp Tue Oct 8 14:48:46 2019 >> @@ -472,6 +472,17 @@ StringRef llvm::dwarf::RangeListEncoding >> } >> } >> >> +StringRef llvm::dwarf::LocListEncodingString(unsigned Encoding) { >> + switch (Encoding) { >> + default: >> + return StringRef(); >> +#define HANDLE_DW_LLE(ID, NAME) >> \ >> + case DW_LLE_##NAME: >> \ >> + return "DW_LLE_" #NAME; >> +#include "llvm/BinaryFormat/Dwarf.def" >> + } >> +} >> + >> StringRef llvm::dwarf::CallFrameString(unsigned Encoding, >> 
Triple::ArchType Arch) { >> assert(Arch != llvm::Triple::ArchType::UnknownArch); >> >> Modified: llvm/trunk/lib/DebugInfo/DWARF/DWARFDebugLoc.cpp >> URL: >> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/DebugInfo/DWARF/DWARFDebugLoc.cpp?rev=374122&r1=374121&r2=374122&view=diff >> >> ============================================================================== >> --- llvm/trunk/lib/DebugInfo/DWARF/DWARFDebugLoc.cpp (original) >> +++ llvm/trunk/lib/DebugInfo/DWARF/DWARFDebugLoc.cpp Tue Oct 8 14:48:46 >> 2019 >> @@ -143,7 +143,7 @@ DWARFDebugLoclists::parseOneLocationList >> DataExtractor::Cursor C(*Offset); >> >> // dwarf::DW_LLE_end_of_list_entry is 0 and indicates the end of the >> list. >> - while (auto Kind = >> static_cast(Data.getU8(C))) { >> + while (auto Kind = static_cast(Data.getU8(C))) { >> Entry E; >> E.Kind = Kind; >> switch (Kind) { >> >> >> _______________________________________________ >> llvm-commits mailing list >> llvm-commits at lists.llvm.org >> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From llvm-commits at lists.llvm.org Wed Oct 9 11:36:17 2019 From: llvm-commits at lists.llvm.org (Johannes Doerfert via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 18:36:17 +0000 (UTC) Subject: [PATCH] D68706: [InstCombine] don't assume 'inbounds' for bitcast deref or null pointer in non-default address space In-Reply-To: References: Message-ID: <0283f1b0d2218c3e832f2c796f44d600@localhost.localdomain> jdoerfert added a comment. Can you add the test I provided as well? 
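For reference, the condition D68706 adds can be modelled outside of LLVM as a small predicate. This is a sketch only: mayMarkInBounds is a hypothetical helper, and its parameters stand in for the result of the dereferenceable-bytes query, the CanBeNull out-parameter, and the address space of the source pointer type.

```cpp
// Hypothetical model of the check in the D68706 diff.
// DerefBytes: number of bytes known dereferenceable (0 means unknown).
// CanBeNull:  true when the fact comes from dereferenceable_or_null.
// AddrSpace:  address space of the source pointer type.
bool mayMarkInBounds(unsigned long long DerefBytes, bool CanBeNull,
                     unsigned AddrSpace) {
  // Without any dereferenceable bytes nothing can be assumed.
  if (DerefBytes == 0)
    return false;
  // In address space 0 the or-null case is still accepted; elsewhere the
  // pointer must be known non-null.
  return AddrSpace == 0 || !CanBeNull;
}
```

Under this model, dereferenceable(1) in addrspace(4) still yields inbounds, while dereferenceable_or_null(1) in addrspace(4) does not, which matches the CHECK lines in the patch's test file.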
CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68706/new/

https://reviews.llvm.org/D68706

From llvm-commits at lists.llvm.org Wed Oct 9 11:36:18 2019
From: llvm-commits at lists.llvm.org (Wei Mi via Phabricator via llvm-commits)
Date: Wed, 09 Oct 2019 18:36:18 +0000 (UTC)
Subject: [PATCH] D68601: [SampleFDO] Add indexing for function profiles so they can be loaded on demand in ExtBinary format
In-Reply-To:
References:
Message-ID:

wmi updated this revision to Diff 224113.
wmi added a comment.

Address David's comment.

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68601/new/

https://reviews.llvm.org/D68601

Files:
  llvm/include/llvm/ProfileData/SampleProf.h
  llvm/include/llvm/ProfileData/SampleProfReader.h
  llvm/include/llvm/ProfileData/SampleProfWriter.h
  llvm/lib/ProfileData/SampleProfReader.cpp
  llvm/lib/ProfileData/SampleProfWriter.cpp
  llvm/lib/Transforms/IPO/SampleProfile.cpp
  llvm/test/Transforms/SampleProfile/Inputs/inline.extbinary.afdo
  llvm/test/Transforms/SampleProfile/Inputs/profsampleacc.extbinary.afdo
  llvm/unittests/ProfileData/SampleProfTest.cpp
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68601.224113.patch
Type: text/x-patch
Size: 20178 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Wed Oct 9 11:36:18 2019
From: llvm-commits at lists.llvm.org (serge via Phabricator via llvm-commits)
Date: Wed, 09 Oct 2019 18:36:18 +0000 (UTC)
Subject: [PATCH] D68720: Support -fstack-clash-protection for x86
Message-ID:

serge-sans-paille created this revision.
Herald added subscribers: llvm-commits, cfe-commits, hiraditya, dschuff.
Herald added projects: clang, LLVM.

Implement protection against the stack clash attack [0]. Probe stack allocation every PAGE_SIZE during frame lowering or dynamic allocation to make sure the page guard, if any, is touched when touching the stack, in a similar manner to GCC[1].
If possible, use MOV already present in the entry block instead of generating new ones. Only implemented for x86. [0] https://www.qualys.com/2017/06/19/stack-clash/stack-clash.txt [1] https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00556.html Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D68720 Files: clang/include/clang/Basic/CodeGenOptions.def clang/include/clang/Basic/DiagnosticFrontendKinds.td clang/include/clang/Basic/TargetInfo.h clang/include/clang/Driver/CC1Options.td clang/include/clang/Driver/Options.td clang/lib/Basic/Targets/X86.h clang/lib/CodeGen/BackendUtil.cpp clang/lib/CodeGen/CGStmt.cpp clang/lib/Driver/ToolChains/Clang.cpp clang/lib/Frontend/CompilerInvocation.cpp clang/test/CodeGen/stack-clash-protection.c clang/test/Driver/stack-clash-protection.c llvm/include/llvm/CodeGen/CommandFlags.inc llvm/include/llvm/Target/TargetOptions.h llvm/lib/Target/X86/X86FrameLowering.cpp llvm/lib/Target/X86/X86FrameLowering.h llvm/lib/Target/X86/X86ISelLowering.cpp llvm/lib/Target/X86/X86ISelLowering.h llvm/lib/Target/X86/X86InstrCompiler.td llvm/lib/Target/X86/X86InstrInfo.td llvm/test/CodeGen/X86/stack-clash-dynamic-alloca.ll llvm/test/CodeGen/X86/stack-clash-medium-natural-probes.ll llvm/test/CodeGen/X86/stack-clash-medium.ll llvm/test/CodeGen/X86/stack-clash-small.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D68720.224102.patch Type: text/x-patch Size: 28569 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 11:36:18 2019 From: llvm-commits at lists.llvm.org (Puyan Lotfi via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 18:36:18 +0000 (UTC) Subject: [PATCH] D63978: Clang Interface Stubs merger plumbing for Driver In-Reply-To: References: Message-ID: plotfi marked 3 inline comments as done. plotfi added a comment. 
In D63978#1701864, @tamur wrote:

> It seems that with this patch, llvm-ifs starts to depend on yaml2obj, which as far as I know, was only used for testing purposes until now. Is this intended?

No, not with this patch; it was an earlier patch that made llvm-ifs use yaml to generate ELF. The library was available and elfabi wasn’t. I can move llvm-ifs to elfabi when it is ready.

================
Comment at: clang/lib/Driver/Driver.cpp:3372
+    if (Phase == phases::IfsMerge) {
+      assert(Phase == PL.back() && "merging must be final compilation step.");
+      MergerInputs.push_back(Current);
----------------
compnerd wrote:
> plotfi wrote:
> > compnerd wrote:
> > > Does the interface merging have to be the last step? I could see interface merging preceding linking just fine.
> > For now I think that's the expedient thing to do. Do you want to change that?
> Add a TODO perhaps?

I agree with that.

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D63978/new/

https://reviews.llvm.org/D63978

From llvm-commits at lists.llvm.org Wed Oct 9 11:36:18 2019
From: llvm-commits at lists.llvm.org (Sam Clegg via Phabricator via llvm-commits)
Date: Wed, 09 Oct 2019 18:36:18 +0000 (UTC)
Subject: [PATCH] D68684: [WebAssembly] Make returns variadic
In-Reply-To:
References:
Message-ID: <0bc902e83b7a9038bcb9ae3662918dda@localhost.localdomain>

sbc100 added inline comments.

================
Comment at: llvm/lib/Target/WebAssembly/WebAssemblyCFGStackify.cpp:1269
       if (MI.getOpcode() == WebAssembly::END_BLOCK) {
+        assert(MFI.getResults().size() <= 1 &&
+               "Multivalue block signatures not implemented yet");
----------------
Should this be report_fatal_error, so end users see it too?
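A minimal sketch of the difference this comment points at, assuming the usual behaviour of the standard assert macro: it expands to nothing when NDEBUG is defined, so a release build would silently continue, whereas an explicit check stays active in every build. Both helper names are hypothetical, and throwing an exception is only a stand-in chosen so the sketch can run anywhere; the review suggests report_fatal_error, not an exception.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Debug-only check: with NDEBUG defined the assert expands to nothing, so
// a release build does not diagnose the unsupported case at all.
void checkWithAssert(std::size_t NumResults) {
  assert(NumResults <= 1 && "Multivalue block signatures not implemented yet");
  (void)NumResults; // avoid an unused-parameter warning under NDEBUG
}

// Always-on check. The exception is a stand-in for this sketch only.
void checkAlways(std::size_t NumResults) {
  if (NumResults > 1)
    throw std::runtime_error(
        "Multivalue block signatures not implemented yet");
}
```

The trade-off is the usual one: the always-on form costs a branch in release builds but gives end users a diagnostic instead of undefined downstream behaviour.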
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68684/new/ https://reviews.llvm.org/D68684 From llvm-commits at lists.llvm.org Wed Oct 9 11:36:18 2019 From: llvm-commits at lists.llvm.org (David Li via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 18:36:18 +0000 (UTC) Subject: [PATCH] D68601: [SampleFDO] Add indexing for function profiles so they can be loaded on demand in ExtBinary format In-Reply-To: References: Message-ID: <144f44294be0d29e2e9d310590d684cf@localhost.localdomain> davidxl accepted this revision. davidxl added a comment. lgtm Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68601/new/ https://reviews.llvm.org/D68601 From llvm-commits at lists.llvm.org Wed Oct 9 11:45:23 2019 From: llvm-commits at lists.llvm.org (Sanjay Patel via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 18:45:23 +0000 (UTC) Subject: [PATCH] D68706: [InstCombine] don't assume 'inbounds' for bitcast deref or null pointer in non-default address space In-Reply-To: References: Message-ID: <7816d11aceaf5ac006b67ac314fe4a36@localhost.localdomain> spatel added a comment. In D68706#1701915 , @jdoerfert wrote: > Can you add the test I provided as well? Did I miss a message? 
I copy/pasted at line 92 of the test file (no diff from the code change): define float @matching_scalar_smallest_deref_addrspace(<4 x float> addrspace(4)* dereferenceable(1) %p) { ; CHECK-LABEL: @matching_scalar_smallest_deref_addrspace( ; CHECK-NEXT: [[BC:%.*]] = getelementptr inbounds <4 x float>, <4 x float> addrspace(4)* [[P:%.*]], i64 0, i64 0 ; CHECK-NEXT: [[R:%.*]] = load float, float addrspace(4)* [[BC]], align 16 ; CHECK-NEXT: ret float [[R]] ; %bc = bitcast <4 x float> addrspace(4)* %p to float addrspace(4)* %r = load float, float addrspace(4)* %bc, align 16 ret float %r } CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68706/new/ https://reviews.llvm.org/D68706 From llvm-commits at lists.llvm.org Wed Oct 9 11:54:56 2019 From: llvm-commits at lists.llvm.org (Nathan Chancellor via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 18:54:56 +0000 (UTC) Subject: [PATCH] D67867: [libc] Add few docs and implementation of strcpy and strcat. In-Reply-To: References: Message-ID: nathanchance added a comment. Just as an FYI, this patch breaks `LLVM_INCLUDE_TESTS=OFF` for me: $ cmake -GNinja -DPYTHON_EXECUTABLE=$(command -v python3) -DLLVM_ENABLE_PROJECTS=all ../llvm ... -- Configuring done -- Generating done -- Build files have been written to: /home/nathan/src/llvm-project/build $ cd .. && rm -rf build && mkdir -p build && cd build $ cmake -GNinja -DPYTHON_EXECUTABLE=$(command -v python3) -DLLVM_ENABLE_PROJECTS=all -DLLVM_INCLUDE_TESTS=OFF ../llvm ... -- Configuring done CMake Error at /home/nathan/src/llvm-project/libc/cmake/modules/LLVMLibCRules.cmake:264 (add_dependencies): The dependency target "gtest" of target "strcpy_test" does not exist. Call Stack (most recent call first): /home/nathan/src/llvm-project/libc/src/string/strcpy/CMakeLists.txt:11 (add_libc_unittest) CMake Error at /home/nathan/src/llvm-project/libc/cmake/modules/LLVMLibCRules.cmake:264 (add_dependencies): The dependency target "gtest" of target "strcat_test" does not exist. 
Call Stack (most recent call first): /home/nathan/src/llvm-project/libc/src/string/strcat/CMakeLists.txt:12 (add_libc_unittest) -- Generating done -- Build files have been written to: /home/nathan/src/llvm-project/build $ git revert -n 4380647e79bd80af1ebf6191c2d6629855ccf556 $ cd .. && rm -rf build && mkdir -p build && cd build $ cmake -GNinja -DPYTHON_EXECUTABLE=$(command -v python3) -DLLVM_ENABLE_PROJECTS=all -DLLVM_INCLUDE_TESTS=OFF ../llvm ... -- Configuring done -- Generating done -- Build files have been written to: /home/nathan/src/llvm-project/build This is as of r374191 . Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67867/new/ https://reviews.llvm.org/D67867 From llvm-commits at lists.llvm.org Wed Oct 9 11:54:56 2019 From: llvm-commits at lists.llvm.org (Eli Friedman via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 18:54:56 +0000 (UTC) Subject: [PATCH] D67350: [IfCvt][ARM] Optimise diamond if-conversion for code size In-Reply-To: References: Message-ID: <796207a6bb32cb306bb955827320e63b@localhost.localdomain> efriedma accepted this revision. efriedma added a comment. This revision is now accepted and ready to land. LGTM Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67350/new/ https://reviews.llvm.org/D67350 From llvm-commits at lists.llvm.org Wed Oct 9 11:54:57 2019 From: llvm-commits at lists.llvm.org (Hubert Tong via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 18:54:57 +0000 (UTC) Subject: [PATCH] D68575: [llvm-readobj][xcoff] implement parsing overflow section header. In-Reply-To: References: Message-ID: hubert.reinterpretcast marked 2 inline comments as done. hubert.reinterpretcast added a comment. LGTM to land as-is. Not sure if other people have an opinion about the `const`. @DiggerLin, I believe you have had a number of patches committed into the project. I think you can request commit access and land this yourself. Thanks. 
================ Comment at: llvm/tools/llvm-readobj/XCOFFDumper.cpp:58 + static constexpr unsigned SectionFlagsTypeMask = 0xffffu; const XCOFFObjectFile &Obj; }; ---------------- DiggerLin wrote: > hubert.reinterpretcast wrote: > > Add a blank line here. Also, I am wondering if this should be part of `llvm/BinaryFormat/XCOFF.h` (perhaps in `SectionHeader32`, or in a base class thereof when 64-bit support lands). > for consistent with SectionFlagsReservedMask, puting define SectionFlagsTypeMask here too, I think we maybe need to create a NFC patch to put SectionFlagsReservedMask and SectionFlagsTypeMask in the xcoff.h Okay, I agree. Would you mind posting such an NFC patch after this patch lands? ================ Comment at: llvm/tools/llvm-readobj/XCOFFDumper.cpp:455 + case XCOFF::STYP_TYPCHK: + // TODO : The interpretation of loader, exception, type check section + // headers are different from that of generic section header. We will ---------------- DiggerLin wrote: > hubert.reinterpretcast wrote: > > The "TODO" still has a colon surrounded by spaces on both sides after it. I do not think that we have been using colons after "TODO". > > > > Still missing "and" before "type check section headers". > > > > Still missing "s" after "generic section header". > > > > Typo "seciton" is still present. > changed as suggestion For future reference, I believe we have been using "Oxford commas". That is, a comma before the "and" before (in this case) the third list item, would be appropriate. ================ Comment at: llvm/tools/llvm-readobj/XCOFFDumper.cpp:46 template void printSectionHeaders(ArrayRef Sections); + template void printGenericSectionHeader(T &Sec) const; + template void printOverflowSectionHeader(T &Sec) const; ---------------- I am not sure that I see a meaningful difference between the functions here that are `const` and the ones that are not. 
Given that there are already precedent cases of printing methods of `ObjDumper` subclasses that are `const`, I am okay with adding new ones that are `const` if we have reason to believe they will remain `const`. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68575/new/ https://reviews.llvm.org/D68575 From llvm-commits at lists.llvm.org Wed Oct 9 11:54:58 2019 From: llvm-commits at lists.llvm.org (Brian Gesiak via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 18:54:58 +0000 (UTC) Subject: [PATCH] D51741: [coro]Pass rvalue reference for named local variable to return_value In-Reply-To: References: Message-ID: <8d2987324105bb5b5a2f035aca46214a@localhost.localdomain> modocache added a subscriber: lewissbaker. modocache added a comment. > Is that maybe intentional, and is the code not intended to compile? It looks like it should work to me, but maybe @lewissbaker or @GorNishanov can answer definitively. ================ Comment at: cfe/trunk/lib/Sema/SemaCoroutine.cpp:846 + if (E) { + auto NRVOCandidate = this->getCopyElisionCandidate(E->getType(), E, CES_AsIfByStdMove); + if (NRVOCandidate) { ---------------- aaronpuchert wrote: > Why not `CES_Strict` like in `Sema::BuildReturnStmt`? With `CES_Strict` the test still works, and we can also return references. So this fixes your test case? If so it sounds good to me. I'll make this change or you can feel free to if you get around to it first. ================ Comment at: cfe/trunk/lib/Sema/SemaCoroutine.cpp:849 + InitializedEntity Entity = + InitializedEntity::InitializeResult(Loc, E->getType(), NRVOCandidate); + ExprResult MoveResult = this->PerformMoveOrCopyInitialization( ---------------- aaronpuchert wrote: > The last parameter has type `bool`, and because we're in `if (NRVOCandidate)`, that will always be true. Wouldn't it be more straightforward to just pass `true` into the function? Makes sense! 
I can send a patch to do this, or feel free to commit one yourself if you get to it first. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D51741/new/ https://reviews.llvm.org/D51741 From llvm-commits at lists.llvm.org Wed Oct 9 11:54:58 2019 From: llvm-commits at lists.llvm.org (Simon Atanasyan via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 18:54:58 +0000 (UTC) Subject: [PATCH] D68649: [Mips][llvm-exegesis] Add a Mips target In-Reply-To: References: Message-ID: <5de8951145ea3c6b5c6b79f8d247bbda@localhost.localdomain> atanasyan accepted this revision. atanasyan added a comment. This revision is now accepted and ready to land. LGTM Do you have commit access? CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68649/new/ https://reviews.llvm.org/D68649 From llvm-commits at lists.llvm.org Wed Oct 9 11:55:31 2019 From: llvm-commits at lists.llvm.org (Sanjay Patel via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 18:55:31 +0000 (UTC) Subject: [PATCH] D68250: [DAGCombine] Match more patterns for half word bswap In-Reply-To: References: Message-ID: spatel accepted this revision. spatel added a comment. This revision is now accepted and ready to land. LGTM - I applied the patch locally and ran 'make check' and don't see failures now. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68250/new/ https://reviews.llvm.org/D68250 From llvm-commits at lists.llvm.org Wed Oct 9 12:03:19 2019 From: llvm-commits at lists.llvm.org (Galina Kistanova via llvm-commits) Date: Wed, 9 Oct 2019 12:03:19 -0700 Subject: r374055 Message-ID: Hello Mirko, It looks like your commit r374055 broke tests on the builder llvm-clang-x86_64-expensive-checks-win: http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/20071 . . . ******************** Failing Tests (2): LLVM :: CodeGen/Mips/long-call-mcount.ll LLVM :: CodeGen/Mips/mcount.ll The builder was already red and did not send notifications on this.
Please have a look ASAP? Thanks Galina From llvm-commits at lists.llvm.org Wed Oct 9 12:13:30 2019 From: llvm-commits at lists.llvm.org (Eli Friedman via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 19:13:30 +0000 (UTC) Subject: [PATCH] D67199: [InstCombine] Expand the simplification of log() In-Reply-To: References: Message-ID: <8aa72b27855190e8ee4ae1978f38c82b@localhost.localdomain> efriedma added inline comments. ================ Comment at: llvm/trunk/lib/Transforms/Utils/SimplifyLibCalls.cpp:1923 + LibFunc ArgLb = NotLibFunc; + TLI->getLibFunc(ArgNm, ArgLb); + ---------------- This should be using the overload of getLibFunc that takes a CallSite, instead of expanding it out by hand. This formulation skips checks that should happen otherwise (specifically, that it's not an indirect call, that the call isn't marked nobuiltin, and the function has an appropriate signature). Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67199/new/ https://reviews.llvm.org/D67199 From llvm-commits at lists.llvm.org Wed Oct 9 12:23:22 2019 From: llvm-commits at lists.llvm.org (Eli Friedman via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 19:23:22 +0000 (UTC) Subject: [PATCH] D68720: Support -fstack-clash-protection for x86 In-Reply-To: References: Message-ID: efriedma added a comment. Is there some reason this isn't using the existing stack-probe-size attribute? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68720/new/ https://reviews.llvm.org/D68720 From llvm-commits at lists.llvm.org Wed Oct 9 12:23:23 2019 From: llvm-commits at lists.llvm.org (Eli Friedman via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 19:23:23 +0000 (UTC) Subject: [PATCH] D68720: Support -fstack-clash-protection for x86 In-Reply-To: References: Message-ID: efriedma added a comment. Sorry, I meant the "probe-stack" attribute.
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68720/new/ https://reviews.llvm.org/D68720 From llvm-commits at lists.llvm.org Wed Oct 9 12:23:23 2019 From: llvm-commits at lists.llvm.org (wael yehia via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 19:23:23 +0000 (UTC) Subject: [PATCH] D66979: [InstrProf] Tighten a check for malformed data records in raw profiles In-Reply-To: References: Message-ID: <78013b20977dae95695cae7a51e8dcfe@localhost.localdomain> w2yehia added a comment. @vsk thanks Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D66979/new/ https://reviews.llvm.org/D66979 From llvm-commits at lists.llvm.org Wed Oct 9 12:23:24 2019 From: llvm-commits at lists.llvm.org (Xiangling Liao via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 19:23:24 +0000 (UTC) Subject: [PATCH] D68721: [NFC][PowerPC]Clean up PPCAsmPrinter for TOC related pseudo opcode Message-ID: Xiangling_L created this revision. Xiangling_L added reviewers: stefanp, jsji, sfertile, hubert.reinterpretcast. Xiangling_L added a project: LLVM. Herald added subscribers: llvm-commits, shchenz, MaskRay, kbarton, hiraditya, nemanjai. Add a helper function `getMCSymbolForTOCPseudoMO` to clean up PPCAsmPrinter a little bit. Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D68721 Files: llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D68721.224124.patch Type: text/x-patch Size: 8269 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 12:23:40 2019 From: llvm-commits at lists.llvm.org (Arthur O'Dwyer via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 19:23:40 +0000 (UTC) Subject: [PATCH] D51741: [coro]Pass rvalue reference for named local variable to return_value In-Reply-To: References: Message-ID: Quuxplusone added a comment. 
In D51741#1701757 , @aaronpuchert wrote: > This change breaks the following code that worked before: > > task f(MoveOnly &value) { > co_return value; > } > This patch is heavily heavily merge-conflicted by P1825 . Aaron's example code should not be affected by P1825 . It should do overload resolution on `task::return_value` with one parameter of type `MoveOnly&`. However, task g(MoveOnly &&value) { co_return value; } task h(MoveOnly value) { co_return value; } Both of these should first do overload resolution for one parameter of type `MoveOnly&&`, and then, only if that overload resolution fails, should they fall back to overload resolution for one parameter of type `MoveOnly&`. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D51741/new/ https://reviews.llvm.org/D51741 From llvm-commits at lists.llvm.org Wed Oct 9 12:24:00 2019 From: llvm-commits at lists.llvm.org (Evandro Menezes via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 19:24:00 +0000 (UTC) Subject: [PATCH] D68257: [Support] Add mathematical constants In-Reply-To: References: Message-ID: evandro marked 2 inline comments as done. evandro added inline comments. ================ Comment at: llvm/include/llvm/Support/MathExtras.h:66 + inv_sqrtpi = 0.5641895835477563, // https://oeis.org/A087197 + sqrt2 = 1.414213562373095, // https://oeis.org/A002193 + inv_sqrt2 = 0.7071067811865475, ---------------- efriedma wrote: > evandro wrote: > > efriedma wrote: > > > evandro wrote: > > > > efriedma wrote: > > > > > The correct value of sqrt(2) in double-precision is 1.4142135623730951. > > > > > > > > > > And now I don't trust any of the other values... > > > > `double` has a precision of 15 or 16 significant digits. I don't understand why are you suggesting 17 significant digits when you asked to trim the precision down. > > > > > > > > Besides, the reference I provided states that this value is 1.41421356237309505. 
Whether it's rounded to 1.4142135623730950 or 1.4142135623730951 is a bit moot, IMO. > > > I asked for "the smallest number of digits required to produce the correct double-precision result". This is what you get if, for example, you ask Python 2.7 or later to convert the value to a string with `repr()` (`printf "import math\nprint(repr(math.sqrt(2)))" | python`). `1.414213562373095` produces a value that's different by one ulp. > > > > > > Yes, a one ulp difference is unlikely to matter for most uses, but if we're going to take the time to define these, we should define them correctly. > > You're assuming that Python is correct. `bc` says 1.41421356237309504880. glibc's `math.h` says 1.41421356237309504880 as well. And none of these is the same as your 1.4142135623730951. > > > > As I said, the precision of `double` is 15 to 16 digits and of `float`, 6 to 7 digits. `math.h` defines them with 20 digits, which is probably an agreeable precision, yes? But I believe that we call all live with a difference of ±1ulp. > 1.4142135623730951 is the shortest decimal representation that produces the same double-precision number as 0x1.6a09e667f3bcdP+0. It isn't "correct" in any other sense, sure. > > A few extra digits is okay, I guess. Indeed, for some numbers more digits were necessary than for others. For uniformity's sake, I used the maximum from the number of digits required. Thank you. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68257/new/ https://reviews.llvm.org/D68257 From llvm-commits at lists.llvm.org Wed Oct 9 12:25:49 2019 From: llvm-commits at lists.llvm.org (Evandro Menezes via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 19:25:49 +0000 (UTC) Subject: [PATCH] D68285: [AMDGPU] Use math constants defined in MathExtras (NFC) In-Reply-To: References: Message-ID: <4c82ac9f0567cb2053d7e96f6280e5b8@localhost.localdomain> evandro added a comment. Ping! 
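The sqrt(2) rounding question from the D68257 exchange above can be checked mechanically. What follows is only a minimal sketch and not code from the patch: it assumes `double` is IEEE-754 binary64 and that the compiler rounds decimal literals to nearest, and the names `bitsOf`, `ulpDistance`, `checkSqrt2Literals`, and the three constants are hypothetical, introduced purely for illustration. It compares the bit patterns produced by the decimal literals quoted in the thread.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: raw bit pattern of a double (assumes IEEE-754 binary64).
inline std::uint64_t bitsOf(double X) {
  static_assert(sizeof(double) == sizeof(std::uint64_t),
                "sketch assumes a 64-bit double");
  std::uint64_t Bits;
  std::memcpy(&Bits, &X, sizeof(Bits));
  return Bits;
}

// Hypothetical helper: distance in ulps between two finite doubles of the
// same sign. Adjacent doubles of the same sign have adjacent bit patterns.
inline std::uint64_t ulpDistance(double A, double B) {
  const std::uint64_t BA = bitsOf(A), BB = bitsOf(B);
  return BA > BB ? BA - BB : BB - BA;
}

// Decimal literals quoted in the thread.
constexpr double Sqrt2Short = 1.414213562373095;      // value first proposed
constexpr double Sqrt2RoundTrip = 1.4142135623730951; // shortest round-trip form
constexpr double Sqrt2Long = 1.41421356237309504880;  // value quoted from math.h

// Aborts via assert if the claims made in the thread do not hold.
inline void checkSqrt2Literals() {
  // 0x1.6a09e667f3bcdP+0 has the bit pattern 0x3FF6A09E667F3BCD.
  assert(bitsOf(Sqrt2RoundTrip) == 0x3FF6A09E667F3BCDULL);
  // The longer literal rounds to the same double.
  assert(bitsOf(Sqrt2Long) == bitsOf(Sqrt2RoundTrip));
  // The shorter literal lands one ulp away.
  assert(ulpDistance(Sqrt2Short, Sqrt2RoundTrip) == 1);
}
```

Calling `checkSqrt2Literals()` from any translation unit exercises the three claims: the 17-digit and 21-digit literals denote the same double, and the 16-digit literal differs from it by one ulp. Extra digits beyond the shortest round-trip form are harmless, whereas dropping the final digit is not.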
🔔 CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68285/new/ https://reviews.llvm.org/D68285 From llvm-commits at lists.llvm.org Wed Oct 9 12:26:03 2019 From: llvm-commits at lists.llvm.org (Evandro Menezes via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 19:26:03 +0000 (UTC) Subject: [PATCH] D68353: [AArch64] Remove overlapping definitions (NFC) In-Reply-To: References: Message-ID: evandro added a comment. Ping! 🔔 CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68353/new/ https://reviews.llvm.org/D68353 From llvm-commits at lists.llvm.org Wed Oct 9 12:32:50 2019 From: llvm-commits at lists.llvm.org (MyDeveloperDay via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 19:32:50 +0000 (UTC) Subject: [PATCH] D68551: [clang-format] [NFC] Ensure clang-format is itself clang-formatted. In-Reply-To: References: Message-ID: MyDeveloperDay added a comment. > Btw, I am the author of the CMakeLists snippet quoted by @MyDeveloperDay. Before that, it was a shell script that didn't run on Windows. Making it part of the regression test basically eliminated all discussion about code formatting, but we had to run large-scale reformatting whenever clang-format changed in some way. It also runs by the polly-* buildbots which I personally do not like since I don't see code formatting as a reason why a build should fail. Thank you for your comment, do we have CMake infrastructure (I'm not a CMake expert) to be able to parameterize that snippet and put it somewhere centrally so that others could simply inherit this in their CMakeList.txt like: file( GLOB files ../lib/Format/*.h ../lib/Format/*.cpp ../unittests/*.cpp ../include/clang/Format/*.h) add_clang_format_target(XXX,files) so that they'd get your XXX-check-format and XXX-update-format rules? It might help the proliferation of clang-formatted areas? 
(and keep them clean) Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68551/new/ https://reviews.llvm.org/D68551 From llvm-commits at lists.llvm.org Wed Oct 9 12:44:28 2019 From: llvm-commits at lists.llvm.org (Andrea Di Biagio via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 19:44:28 +0000 (UTC) Subject: [PATCH] D68714: [MCA] Show aggregate over Average Wait times for the whole snippet (PR43219) In-Reply-To: References: Message-ID: <31fd40758def615a31c35adf4e6b2934@localhost.localdomain> andreadb added inline comments. ================ Comment at: llvm/tools/llvm-mca/Views/TimelineView.cpp:132-140 void TimelineView::printWaitTimeEntry(formatted_raw_ostream &OS, const WaitTimeEntry &Entry, unsigned SourceIndex, - unsigned Executions) const { - OS << SourceIndex << '.'; + unsigned CumulativeExecutions, + unsigned Executions, + bool ShouldNumber) const { + if (ShouldNumber) ---------------- lebedev.ri wrote: > andreadb wrote: > > You can still use the old signature for this method (see the explanation below): > > > > We know that we are printing the special entry if `SourceIndex == Source.size()`. > > > > You can use that knowledge in two places: > > > > 1) You can automatically infer flag `ShouldNumber`. > > > > ``` > > bool ShouldNumber = SourceIndex != Source.size(); > > if (ShouldNumber) > > OS << SourceIndex << '.'; > > ``` > > > > 2) Before printing the average times, you can check if numbers are for the special entry and modify the value of `Executions` with `Timeline.size() / Source.size()`. > > > > You can do it where you currently added the FIXME comment. > > > > ``` > > if (!ShouldNumber) { > > // override Executions for the purpose of changing colors. > > Executions = Timeline.size() / Source.size(); > > } > > ``` > > > > Basically you don't need `CumulativeExecutions` as it can be inferred from the context. That should also fix the issue with the coloring of the output. > To be honest i do not understand this comment.
> Is this better or worse? > This does not help with coloring. Okay. I see the problem now. At line 155 we have: ``` int BufferSize = UsedBuffer[SourceIndex].second; ``` However, `SourceIndex` is not a valid index if method `printWaitTimeEntry()` is called to print the special entry. The reason is that `SourceIndex` is set to `Source.size()` for that entry, and there are only `Source.size()` elements in vector `UsedBuffer`. So that access is invalid if we are printing the special entry. There is no easy fix for it. I suggest that, for now, we avoid changing the colors if we know that we are printing the special entry. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68714/new/ https://reviews.llvm.org/D68714 From llvm-commits at lists.llvm.org Wed Oct 9 12:51:49 2019 From: llvm-commits at lists.llvm.org (David Greene via llvm-commits) Date: Wed, 09 Oct 2019 19:51:49 -0000 Subject: [llvm] r374205 - [System Model] [TTI] Update cache and prefetch TTI interfaces Message-ID: <20191009195149.26FF98AC1C@lists.llvm.org> Author: greened Date: Wed Oct 9 12:51:48 2019 New Revision: 374205 URL: http://llvm.org/viewvc/llvm-project?rev=374205&view=rev Log: [System Model] [TTI] Update cache and prefetch TTI interfaces Re-apply 9fdfb045ae8b/r365676 with fixes for PPC and Hexagon. This involved moving defaults from TargetTransformInfoImplBase to MCSubtargetInfo. Rework the TTI cache and software prefetching APIs to prepare for the introduction of a general system model. Changes include: - Marking existing interfaces const and/or override as appropriate - Adding comments - Adding BasicTTIImpl interfaces that delegate to a subtarget implementation - Moving the default TargetTransformInfoImplBase implementation to a default MCSubtarget implementation Only a handful of targets use these interfaces currently: AArch64, Hexagon, PPC and SystemZ.
AArch64 already has a custom subtarget implementation, so its custom TTI implementation is migrated to use the new facilities in BasicTTIImpl to invoke its custom subtarget implementation. The custom TTI implementations continue to exist for the other targets with this change. They are not moved over to subtarget-based implementations. The end goal is to have the default subtarget implementation defer to the system model defined by the target. With this change, the default MCSubtargetInfo implementation essentially returns the defaults TargetTransformInfoImplBase used to return. Existing users of TTI defaults will hit the defaults now in MCSubtargetInfo. Targets that define their own custom TTI implementations won't use the BasicTTIImpl implementations that route to the subtarget. Once system models are in place for the targets that use these interfaces, their custom TTI implementations can be removed. Differential Revision: https://reviews.llvm.org/D63614 Modified: llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h llvm/trunk/include/llvm/Analysis/TargetTransformInfoImpl.h llvm/trunk/include/llvm/CodeGen/BasicTTIImpl.h llvm/trunk/include/llvm/MC/MCSubtargetInfo.h llvm/trunk/lib/Analysis/TargetTransformInfo.cpp llvm/trunk/lib/MC/MCSubtargetInfo.cpp llvm/trunk/lib/Target/AArch64/AArch64Subtarget.h llvm/trunk/lib/Target/AArch64/AArch64TargetTransformInfo.cpp llvm/trunk/lib/Target/AArch64/AArch64TargetTransformInfo.h llvm/trunk/lib/Target/Hexagon/HexagonTargetTransformInfo.h llvm/trunk/lib/Target/PowerPC/PPCTargetTransformInfo.cpp llvm/trunk/lib/Target/PowerPC/PPCTargetTransformInfo.h llvm/trunk/lib/Target/SystemZ/SystemZTargetTransformInfo.h Modified: llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h?rev=374205&r1=374204&r2=374205&view=diff ============================================================================== --- 
llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h (original) +++ llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h Wed Oct 9 12:51:48 2019 @@ -837,18 +837,20 @@ public: /// \return The associativity of the cache level, if available. llvm::Optional<unsigned> getCacheAssociativity(CacheLevel Level) const; - /// \return How much before a load we should place the prefetch instruction. - /// This is currently measured in number of instructions. + /// \return How much before a load we should place the prefetch + /// instruction. This is currently measured in number of + /// instructions. unsigned getPrefetchDistance() const; - /// \return Some HW prefetchers can handle accesses up to a certain constant - /// stride. This is the minimum stride in bytes where it makes sense to start - /// adding SW prefetches. The default is 1, i.e. prefetch with any stride. + /// \return Some HW prefetchers can handle accesses up to a certain + /// constant stride. This is the minimum stride in bytes where it + /// makes sense to start adding SW prefetches. The default is 1, + /// i.e. prefetch with any stride. unsigned getMinPrefetchStride() const; - /// \return The maximum number of iterations to prefetch ahead. If the - /// required number of iterations is more than this number, no prefetching is - /// performed. + /// \return The maximum number of iterations to prefetch ahead. If + /// the required number of iterations is more than this number, no + /// prefetching is performed.
unsigned getMaxPrefetchIterationsAhead() const; /// \return The maximum interleave factor that any transform should try to @@ -1250,12 +1252,26 @@ public: virtual unsigned getMinimumVF(unsigned ElemWidth) const = 0; virtual bool shouldConsiderAddressTypePromotion( const Instruction &I, bool &AllowPromotionWithoutCommonHeader) = 0; - virtual unsigned getCacheLineSize() = 0; - virtual llvm::Optional<unsigned> getCacheSize(CacheLevel Level) = 0; - virtual llvm::Optional<unsigned> getCacheAssociativity(CacheLevel Level) = 0; - virtual unsigned getPrefetchDistance() = 0; - virtual unsigned getMinPrefetchStride() = 0; - virtual unsigned getMaxPrefetchIterationsAhead() = 0; + virtual unsigned getCacheLineSize() const = 0; + virtual llvm::Optional<unsigned> getCacheSize(CacheLevel Level) const = 0; + virtual llvm::Optional<unsigned> getCacheAssociativity(CacheLevel Level) const = 0; + + /// \return How much before a load we should place the prefetch + /// instruction. This is currently measured in number of + /// instructions. + virtual unsigned getPrefetchDistance() const = 0; + + /// \return Some HW prefetchers can handle accesses up to a certain + /// constant stride. This is the minimum stride in bytes where it + /// makes sense to start adding SW prefetches. The default is 1, + /// i.e. prefetch with any stride. + virtual unsigned getMinPrefetchStride() const = 0; + + /// \return The maximum number of iterations to prefetch ahead. If + /// the required number of iterations is more than this number, no + /// prefetching is performed.
+ virtual unsigned getMaxPrefetchIterationsAhead() const = 0; + virtual unsigned getMaxInterleaveFactor(unsigned VF) = 0; virtual unsigned getArithmeticInstrCost(unsigned Opcode, Type *Ty, OperandValueKind Opd1Info, @@ -1606,22 +1622,36 @@ public: return Impl.shouldConsiderAddressTypePromotion( I, AllowPromotionWithoutCommonHeader); } - unsigned getCacheLineSize() override { + unsigned getCacheLineSize() const override { return Impl.getCacheLineSize(); } - llvm::Optional<unsigned> getCacheSize(CacheLevel Level) override { + llvm::Optional<unsigned> getCacheSize(CacheLevel Level) const override { return Impl.getCacheSize(Level); } - llvm::Optional<unsigned> getCacheAssociativity(CacheLevel Level) override { + llvm::Optional<unsigned> getCacheAssociativity(CacheLevel Level) const override { return Impl.getCacheAssociativity(Level); } - unsigned getPrefetchDistance() override { return Impl.getPrefetchDistance(); } - unsigned getMinPrefetchStride() override { + + /// Return the preferred prefetch distance in terms of instructions. + /// + unsigned getPrefetchDistance() const override { + return Impl.getPrefetchDistance(); + } + + /// Return the minimum stride necessary to trigger software + /// prefetching. + /// + unsigned getMinPrefetchStride() const override { return Impl.getMinPrefetchStride(); } - unsigned getMaxPrefetchIterationsAhead() override { + + /// Return the maximum prefetch distance in terms of loop + /// iterations.
+ /// + unsigned getMaxPrefetchIterationsAhead() const override { return Impl.getMaxPrefetchIterationsAhead(); } + unsigned getMaxInterleaveFactor(unsigned VF) override { return Impl.getMaxInterleaveFactor(VF); } Modified: llvm/trunk/include/llvm/Analysis/TargetTransformInfoImpl.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Analysis/TargetTransformInfoImpl.h?rev=374205&r1=374204&r2=374205&view=diff ============================================================================== --- llvm/trunk/include/llvm/Analysis/TargetTransformInfoImpl.h (original) +++ llvm/trunk/include/llvm/Analysis/TargetTransformInfoImpl.h Wed Oct 9 12:51:48 2019 @@ -371,37 +371,6 @@ public: return false; } - unsigned getCacheLineSize() { return 0; } - - llvm::Optional<unsigned> getCacheSize(TargetTransformInfo::CacheLevel Level) { - switch (Level) { - case TargetTransformInfo::CacheLevel::L1D: - LLVM_FALLTHROUGH; - case TargetTransformInfo::CacheLevel::L2D: - return llvm::Optional<unsigned>(); - } - - llvm_unreachable("Unknown TargetTransformInfo::CacheLevel"); - } - - llvm::Optional<unsigned> getCacheAssociativity( - TargetTransformInfo::CacheLevel Level) { - switch (Level) { - case TargetTransformInfo::CacheLevel::L1D: - LLVM_FALLTHROUGH; - case TargetTransformInfo::CacheLevel::L2D: - return llvm::Optional<unsigned>(); - } - - llvm_unreachable("Unknown TargetTransformInfo::CacheLevel"); - } - - unsigned getPrefetchDistance() { return 0; } - - unsigned getMinPrefetchStride() { return 1; } - - unsigned getMaxPrefetchIterationsAhead() { return UINT_MAX; } - unsigned getMaxInterleaveFactor(unsigned VF) { return 1; } unsigned getArithmeticInstrCost(unsigned Opcode, Type *Ty, Modified: llvm/trunk/include/llvm/CodeGen/BasicTTIImpl.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/CodeGen/BasicTTIImpl.h?rev=374205&r1=374204&r2=374205&view=diff ============================================================================== --- 
llvm/trunk/include/llvm/CodeGen/BasicTTIImpl.h (original) +++
llvm/trunk/include/llvm/CodeGen/BasicTTIImpl.h Wed Oct 9 12:51:48 2019 @@ -514,6 +514,34 @@ public: return BaseT::getInstructionLatency(I); } + virtual Optional<unsigned> + getCacheSize(TargetTransformInfo::CacheLevel Level) const { + return Optional<unsigned>( + getST()->getCacheSize(static_cast<unsigned>(Level))); + } + + virtual Optional<unsigned> + getCacheAssociativity(TargetTransformInfo::CacheLevel Level) const { + return Optional<unsigned>( + getST()->getCacheAssociativity(static_cast<unsigned>(Level))); + } + + virtual unsigned getCacheLineSize() const { + return getST()->getCacheLineSize(); + } + + virtual unsigned getPrefetchDistance() const { + return getST()->getPrefetchDistance(); + } + + virtual unsigned getMinPrefetchStride() const { + return getST()->getMinPrefetchStride(); + } + + virtual unsigned getMaxPrefetchIterationsAhead() const { + return getST()->getMaxPrefetchIterationsAhead(); + } + /// @} /// \name Vector TTI Implementations Modified: llvm/trunk/include/llvm/MC/MCSubtargetInfo.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/MC/MCSubtargetInfo.h?rev=374205&r1=374204&r2=374205&view=diff ============================================================================== --- llvm/trunk/include/llvm/MC/MCSubtargetInfo.h (original) +++ llvm/trunk/include/llvm/MC/MCSubtargetInfo.h Wed Oct 9 12:51:48 2019 @@ -223,6 +223,50 @@ public: } virtual unsigned getHwMode() const { return 0; } + + /// Return the cache size in bytes for the given level of cache. + /// Level is zero-based, so a value of zero means the first level of + /// cache. + /// + virtual Optional<unsigned> getCacheSize(unsigned Level) const; + + /// Return the cache associatvity for the given level of cache. + /// Level is zero-based, so a value of zero means the first level of + /// cache. + /// + virtual Optional<unsigned> getCacheAssociativity(unsigned Level) const; + + /// Return the target cache line size in bytes at a given level.
+ /// + virtual Optional<unsigned> getCacheLineSize(unsigned Level) const; + + /// Return the target cache line size in bytes. By default, return + /// the line size for the bottom-most level of cache. This provides + /// a more convenient interface for the common case where all cache + /// levels have the same line size. Return zero if there is no + /// cache model. + /// + virtual unsigned getCacheLineSize() const { + Optional<unsigned> Size = getCacheLineSize(0); + if (Size) + return *Size; + + return 0; + } + + /// Return the preferred prefetch distance in terms of instructions. + /// + virtual unsigned getPrefetchDistance() const; + + /// Return the maximum prefetch distance in terms of loop + /// iterations. + /// + virtual unsigned getMaxPrefetchIterationsAhead() const; + + /// Return the minimum stride necessary to trigger software + /// prefetching. + /// + virtual unsigned getMinPrefetchStride() const; }; } // end namespace llvm Modified: llvm/trunk/lib/Analysis/TargetTransformInfo.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/TargetTransformInfo.cpp?rev=374205&r1=374204&r2=374205&view=diff ============================================================================== --- llvm/trunk/lib/Analysis/TargetTransformInfo.cpp (original) +++ llvm/trunk/lib/Analysis/TargetTransformInfo.cpp Wed Oct 9 12:51:48 2019 @@ -40,6 +40,34 @@ namespace { struct NoTTIImpl : TargetTransformInfoImplCRTPBase<NoTTIImpl> { explicit NoTTIImpl(const DataLayout &DL) : TargetTransformInfoImplCRTPBase<NoTTIImpl>(DL) {} + + unsigned getCacheLineSize() const { return 0; } + + llvm::Optional<unsigned> getCacheSize(TargetTransformInfo::CacheLevel Level) const { + switch (Level) { + case TargetTransformInfo::CacheLevel::L1D: + LLVM_FALLTHROUGH; + case TargetTransformInfo::CacheLevel::L2D: + return llvm::Optional<unsigned>(); + } + llvm_unreachable("Unknown TargetTransformInfo::CacheLevel"); + } + + llvm::Optional<unsigned> getCacheAssociativity( + 
TargetTransformInfo::CacheLevel Level) const { + switch (Level) { + case
TargetTransformInfo::CacheLevel::L1D: + LLVM_FALLTHROUGH; + case TargetTransformInfo::CacheLevel::L2D: + return llvm::Optional<unsigned>(); + } + + llvm_unreachable("Unknown TargetTransformInfo::CacheLevel"); + } + + unsigned getPrefetchDistance() const { return 0; } + unsigned getMinPrefetchStride() const { return 1; } + unsigned getMaxPrefetchIterationsAhead() const { return UINT_MAX; } }; } Modified: llvm/trunk/lib/MC/MCSubtargetInfo.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/MC/MCSubtargetInfo.cpp?rev=374205&r1=374204&r2=374205&view=diff ============================================================================== --- llvm/trunk/lib/MC/MCSubtargetInfo.cpp (original) +++ llvm/trunk/lib/MC/MCSubtargetInfo.cpp Wed Oct 9 12:51:48 2019 @@ -315,3 +315,28 @@ void MCSubtargetInfo::initInstrItins(Ins InstrItins = InstrItineraryData(getSchedModel(), Stages, OperandCycles, ForwardingPaths); } + +Optional<unsigned> MCSubtargetInfo::getCacheSize(unsigned Level) const { + return Optional<unsigned>(); +} + +Optional<unsigned> +MCSubtargetInfo::getCacheAssociativity(unsigned Level) const { + return Optional<unsigned>(); +} + +Optional<unsigned> MCSubtargetInfo::getCacheLineSize(unsigned Level) const { + return Optional<unsigned>(); +} + +unsigned MCSubtargetInfo::getPrefetchDistance() const { + return 0; +} + +unsigned MCSubtargetInfo::getMaxPrefetchIterationsAhead() const { + return UINT_MAX; +} + +unsigned MCSubtargetInfo::getMinPrefetchStride() const { + return 1; +} Modified: llvm/trunk/lib/Target/AArch64/AArch64Subtarget.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AArch64/AArch64Subtarget.h?rev=374205&r1=374204&r2=374205&view=diff ============================================================================== --- llvm/trunk/lib/Target/AArch64/AArch64Subtarget.h (original) +++ llvm/trunk/lib/Target/AArch64/AArch64Subtarget.h Wed Oct 9 12:51:48 2019 @@ -353,10 +353,10 @@ public: unsigned getVectorInsertExtractBaseCost() const { return 
VectorInsertExtractBaseCost; } - unsigned getCacheLineSize() const {
return CacheLineSize; } - unsigned getPrefetchDistance() const { return PrefetchDistance; } - unsigned getMinPrefetchStride() const { return MinPrefetchStride; } - unsigned getMaxPrefetchIterationsAhead() const { + unsigned getCacheLineSize() const override { return CacheLineSize; } + unsigned getPrefetchDistance() const override { return PrefetchDistance; } + unsigned getMinPrefetchStride() const override { return MinPrefetchStride; } + unsigned getMaxPrefetchIterationsAhead() const override { return MaxPrefetchIterationsAhead; } unsigned getPrefFunctionLogAlignment() const { Modified: llvm/trunk/lib/Target/AArch64/AArch64TargetTransformInfo.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AArch64/AArch64TargetTransformInfo.cpp?rev=374205&r1=374204&r2=374205&view=diff ============================================================================== --- llvm/trunk/lib/Target/AArch64/AArch64TargetTransformInfo.cpp (original) +++ llvm/trunk/lib/Target/AArch64/AArch64TargetTransformInfo.cpp Wed Oct 9 12:51:48 2019 @@ -892,22 +892,6 @@ bool AArch64TTIImpl::shouldConsiderAddre return Considerable; } -unsigned AArch64TTIImpl::getCacheLineSize() { - return ST->getCacheLineSize(); -} - -unsigned AArch64TTIImpl::getPrefetchDistance() { - return ST->getPrefetchDistance(); -} - -unsigned AArch64TTIImpl::getMinPrefetchStride() { - return ST->getMinPrefetchStride(); -} - -unsigned AArch64TTIImpl::getMaxPrefetchIterationsAhead() { - return ST->getMaxPrefetchIterationsAhead(); -} - bool AArch64TTIImpl::useReductionIntrinsic(unsigned Opcode, Type *Ty, TTI::ReductionFlags Flags) const { assert(isa(Ty) && "Expected Ty to be a vector type"); Modified: llvm/trunk/lib/Target/AArch64/AArch64TargetTransformInfo.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AArch64/AArch64TargetTransformInfo.h?rev=374205&r1=374204&r2=374205&view=diff ============================================================================== --- 
llvm/trunk/lib/Target/AArch64/AArch64TargetTransformInfo.h (original) +++ llvm/trunk/lib/Target/AArch64/AArch64TargetTransformInfo.h Wed Oct 9 12:51:48 2019 @@ -156,14 +156,6 @@ public: shouldConsiderAddressTypePromotion(const Instruction &I, bool &AllowPromotionWithoutCommonHeader); - unsigned getCacheLineSize(); - - unsigned getPrefetchDistance(); - - unsigned getMinPrefetchStride(); - - unsigned getMaxPrefetchIterationsAhead(); - bool shouldExpandReduction(const IntrinsicInst *II) const { return false; } Modified: llvm/trunk/lib/Target/Hexagon/HexagonTargetTransformInfo.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/Hexagon/HexagonTargetTransformInfo.h?rev=374205&r1=374204&r2=374205&view=diff ============================================================================== --- llvm/trunk/lib/Target/Hexagon/HexagonTargetTransformInfo.h (original) +++ llvm/trunk/lib/Target/Hexagon/HexagonTargetTransformInfo.h Wed Oct 9 12:51:48 2019 @@ -68,8 +68,8 @@ public: bool shouldFavorPostInc() const; // L1 cache prefetch. - unsigned getPrefetchDistance() const; - unsigned getCacheLineSize() const; + unsigned getPrefetchDistance() const override; + unsigned getCacheLineSize() const override; /// @} Modified: llvm/trunk/lib/Target/PowerPC/PPCTargetTransformInfo.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/PowerPC/PPCTargetTransformInfo.cpp?rev=374205&r1=374204&r2=374205&view=diff ============================================================================== --- llvm/trunk/lib/Target/PowerPC/PPCTargetTransformInfo.cpp (original) +++ llvm/trunk/lib/Target/PowerPC/PPCTargetTransformInfo.cpp Wed Oct 9 12:51:48 2019 @@ -613,7 +613,7 @@ unsigned PPCTTIImpl::getRegisterBitWidth } -unsigned PPCTTIImpl::getCacheLineSize() { +unsigned PPCTTIImpl::getCacheLineSize() const { // Check first if the user specified a custom line size. 
if (CacheLineSize.getNumOccurrences() > 0) return CacheLineSize; @@ -628,7 +628,7 @@ unsigned PPCTTIImpl::getCacheLineSize() return 64; } -unsigned PPCTTIImpl::getPrefetchDistance() { +unsigned PPCTTIImpl::getPrefetchDistance() const { // This seems like a reasonable default for the BG/Q (this pass is enabled, by // default, only on the BG/Q). return 300; Modified: llvm/trunk/lib/Target/PowerPC/PPCTargetTransformInfo.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/PowerPC/PPCTargetTransformInfo.h?rev=374205&r1=374204&r2=374205&view=diff ============================================================================== --- llvm/trunk/lib/Target/PowerPC/PPCTargetTransformInfo.h (original) +++ llvm/trunk/lib/Target/PowerPC/PPCTargetTransformInfo.h Wed Oct 9 12:51:48 2019 @@ -74,8 +74,8 @@ public: bool enableInterleavedAccessVectorization(); unsigned getNumberOfRegisters(bool Vector); unsigned getRegisterBitWidth(bool Vector) const; - unsigned getCacheLineSize(); - unsigned getPrefetchDistance(); + unsigned getCacheLineSize() const override; + unsigned getPrefetchDistance() const override; unsigned getMaxInterleaveFactor(unsigned VF); int vectorCostAdjustment(int Cost, unsigned Opcode, Type *Ty1, Type *Ty2); int getArithmeticInstrCost( Modified: llvm/trunk/lib/Target/SystemZ/SystemZTargetTransformInfo.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/SystemZ/SystemZTargetTransformInfo.h?rev=374205&r1=374204&r2=374205&view=diff ============================================================================== --- llvm/trunk/lib/Target/SystemZ/SystemZTargetTransformInfo.h (original) +++ llvm/trunk/lib/Target/SystemZ/SystemZTargetTransformInfo.h Wed Oct 9 12:51:48 2019 @@ -59,9 +59,9 @@ public: unsigned getNumberOfRegisters(bool Vector); unsigned getRegisterBitWidth(bool Vector) const; - unsigned getCacheLineSize() { return 256; } - unsigned getPrefetchDistance() { return 2000; } - unsigned getMinPrefetchStride() { return 2048; } + unsigned 
getCacheLineSize() const override { return 256; } + unsigned getPrefetchDistance() const override { return 2000; } + unsigned getMinPrefetchStride() const override { return 2048; } bool hasDivRemOp(Type *DataType, bool IsSigned); bool prefersVectorizedAddressing() { return false; } From llvm-commits at lists.llvm.org Wed Oct 9 12:51:09 2019 From: llvm-commits at lists.llvm.org (Michael Kruse via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 19:51:09 +0000 (UTC) Subject: [PATCH] D68551: [clang-format] [NFC] Ensure clang-format is itself clang-formatted. In-Reply-To: References: Message-ID: Meinersbur added a comment. We should seek having only one formatting policy for the entire project, it's too confusing otherwise (e.g. LLVMSupport/ADT is also used by clang-format, are they force-formatted as well? What about the tools/utils subdirectories?). If you would like to gradually move into that direction, it should be discussed on llvm-dev. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68551/new/ https://reviews.llvm.org/D68551 From llvm-commits at lists.llvm.org Wed Oct 9 12:51:09 2019 From: llvm-commits at lists.llvm.org (Digger via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 19:51:09 +0000 (UTC) Subject: [PATCH] D66969: Output XCOFF object text section header and symbol entry for program code In-Reply-To: References: Message-ID: <49784e94b6c1158dc99af5528781a269@localhost.localdomain> DiggerLin updated this revision to Diff 224128. DiggerLin added a comment. change padding method as hubert 's suggestion. 
Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D66969/new/ https://reviews.llvm.org/D66969 Files: llvm/include/llvm/MC/MCSectionXCOFF.h llvm/lib/MC/MCXCOFFStreamer.cpp llvm/lib/MC/XCOFFObjectWriter.cpp llvm/test/CodeGen/PowerPC/aix-return55.ll llvm/test/CodeGen/PowerPC/aix-xcoff-common.ll llvm/test/CodeGen/PowerPC/aix-xcoff-lcomm.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D66969.224128.patch Type: text/x-patch Size: 20665 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 12:58:02 2019 From: llvm-commits at lists.llvm.org (Evandro Menezes via llvm-commits) Date: Wed, 09 Oct 2019 19:58:02 -0000 Subject: [llvm] r374207 - [Support] Add mathematical constants Message-ID: <20191009195802.19C7B84B3E@lists.llvm.org> Author: evandro Date: Wed Oct 9 12:58:01 2019 New Revision: 374207 URL: http://llvm.org/viewvc/llvm-project?rev=374207&view=rev Log: [Support] Add mathematical constants Add own version of the mathematical constants from the upcoming C++20 `std::numbers`. Differential revision: https://reviews.llvm.org/D68257 Modified: llvm/trunk/include/llvm/Support/MathExtras.h llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp llvm/trunk/lib/Transforms/Utils/SimplifyLibCalls.cpp Modified: llvm/trunk/include/llvm/Support/MathExtras.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Support/MathExtras.h?rev=374207&r1=374206&r2=374207&view=diff ============================================================================== --- llvm/trunk/include/llvm/Support/MathExtras.h (original) +++ llvm/trunk/include/llvm/Support/MathExtras.h Wed Oct 9 12:58:01 2019 @@ -39,6 +39,7 @@ unsigned char _BitScanReverse64(unsigned #endif namespace llvm { + /// The behavior an operation has on an input of 0. enum ZeroBehavior { /// The returned value is undefined. @@ -49,6 +50,42 @@ enum ZeroBehavior { ZB_Width }; +/// Mathematical constants. 
+namespace numbers {
+// TODO: Track C++20 std::numbers.
+// TODO: Favor using the hexadecimal FP constants (requires C++17).
+constexpr double e          = 2.7182818284590452354, // (0x1.5bf0a8b145769P+1) https://oeis.org/A001113
+                 egamma     = .57721566490153286061, // (0x1.2788cfc6fb619P-1) https://oeis.org/A001620
+                 ln2        = .69314718055994530942, // (0x1.62e42fefa39efP-1) https://oeis.org/A002162
+                 ln10       = 2.3025850929940456840, // (0x1.26bb1bbb55516P+1) https://oeis.org/A002392
+                 log2e      = 1.4426950408889634074, // (0x1.71547652b82feP+0)
+                 log10e     = .43429448190325182765, // (0x1.bcb7b1526e50eP-2)
+                 pi         = 3.1415926535897932385, // (0x1.921fb54442d18P+1) https://oeis.org/A000796
+                 inv_pi     = .31830988618379067154, // (0x1.45f306dc9c883P-2) https://oeis.org/A049541
+                 sqrtpi     = 1.7724538509055160273, // (0x1.c5bf891b4ef6bP+0) https://oeis.org/A002161
+                 inv_sqrtpi = .56418958354775628695, // (0x1.20dd750429b6dP-1) https://oeis.org/A087197
+                 sqrt2      = 1.4142135623730950488, // (0x1.6a09e667f3bcdP+0) https://oeis.org/A002193
+                 inv_sqrt2  = .70710678118654752440, // (0x1.6a09e667f3bcdP-1)
+                 sqrt3      = 1.7320508075688772935, // (0x1.bb67ae8584caaP+0) https://oeis.org/A002194
+                 inv_sqrt3  = .57735026918962576451, // (0x1.279a74590331cP-1)
+                 phi        = 1.6180339887498948482; // (0x1.9e3779b97f4a8P+0) https://oeis.org/A001622
+constexpr float ef          = 2.71828183F, // (0x1.5bf0a8P+1) https://oeis.org/A001113
+                egammaf     = .577215665F, // (0x1.2788d0P-1) https://oeis.org/A001620
+                ln2f        = .693147181F, // (0x1.62e430P-1) https://oeis.org/A002162
+                ln10f       = 2.30258509F, // (0x1.26bb1cP+1) https://oeis.org/A002392
+                log2ef      = 1.44269504F, // (0x1.715476P+0)
+                log10ef     = .434294482F, // (0x1.bcb7b2P-2)
+                pif         = 3.14159265F, // (0x1.921fb6P+1) https://oeis.org/A000796
+                inv_pif     = .318309886F, // (0x1.45f306P-2) https://oeis.org/A049541
+                sqrtpif     = 1.77245385F, // (0x1.c5bf8aP+0) https://oeis.org/A002161
+                inv_sqrtpif = .564189584F, // (0x1.20dd76P-1) https://oeis.org/A087197
+                sqrt2f      = 1.41421356F, // (0x1.6a09e6P+0)
https://oeis.org/A002193 + inv_sqrt2f = .707106781F, // (0x1.6a09e6P-1) + sqrt3f = 1.73205081F, // (0x1.bb67aeP+0) https://oeis.org/A002194 + inv_sqrt3f = .577350269F, // (0x1.279a74P-1) + phif = 1.61803399F; // (0x1.9e377aP+0) https://oeis.org/A001622 +} // namespace numbers + namespace detail { template struct TrailingZerosCounter { static unsigned count(T Val, ZeroBehavior) { Modified: llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp?rev=374207&r1=374206&r2=374207&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp (original) +++ llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp Wed Oct 9 12:58:01 2019 @@ -4974,12 +4974,11 @@ static SDValue expandExp(const SDLoc &dl // Put the exponent in the right bit position for later addition to the // final result: // - // #define LOG2OFe 1.4426950f - // t0 = Op * LOG2OFe + // t0 = Op * log2(e) // TODO: What fast-math-flags should be set here? SDValue t0 = DAG.getNode(ISD::FMUL, dl, MVT::f32, Op, - getF32Constant(DAG, 0x3fb8aa3b, dl)); + DAG.getConstantFP(numbers::log2ef, dl, MVT::f32)); return getLimitedPrecisionExp2(t0, dl, DAG); } @@ -4997,10 +4996,11 @@ static SDValue expandLog(const SDLoc &dl LimitFloatPrecision > 0 && LimitFloatPrecision <= 18) { SDValue Op1 = DAG.getNode(ISD::BITCAST, dl, MVT::i32, Op); - // Scale the exponent by log(2) [0.69314718f]. + // Scale the exponent by log(2). SDValue Exp = GetExponent(DAG, Op1, TLI, dl); - SDValue LogOfExponent = DAG.getNode(ISD::FMUL, dl, MVT::f32, Exp, - getF32Constant(DAG, 0x3f317218, dl)); + SDValue LogOfExponent = + DAG.getNode(ISD::FMUL, dl, MVT::f32, Exp, + DAG.getConstantFP(numbers::ln2f, dl, MVT::f32)); // Get the significand and build it into a floating-point number with // exponent of 1. 
Modified: llvm/trunk/lib/Transforms/Utils/SimplifyLibCalls.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Utils/SimplifyLibCalls.cpp?rev=374207&r1=374206&r2=374207&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Utils/SimplifyLibCalls.cpp (original) +++ llvm/trunk/lib/Transforms/Utils/SimplifyLibCalls.cpp Wed Oct 9 12:58:01 2019 @@ -48,7 +48,6 @@ static cl::opt cl::desc("Enable unsafe double to float " "shrinking for math lib calls")); - //===----------------------------------------------------------------------===// // Helper Functions //===----------------------------------------------------------------------===// @@ -1941,9 +1940,8 @@ Value *LibCallSimplifier::optimizeLog(Ca ArgID == Intrinsic::exp || ArgID == Intrinsic::exp2) { Constant *Eul; if (ArgLb == ExpLb || ArgID == Intrinsic::exp) - // FIXME: The Euler number should be M_E, but it's place of definition - // is not quite standard. - Eul = ConstantFP::get(Log->getType(), 2.7182818284590452354); + // FIXME: Add more precise value of e for long double. + Eul = ConstantFP::get(Log->getType(), numbers::e); else if (ArgLb == Exp2Lb || ArgID == Intrinsic::exp2) Eul = ConstantFP::get(Log->getType(), 2.0); else From llvm-commits at lists.llvm.org Wed Oct 9 13:00:44 2019 From: llvm-commits at lists.llvm.org (Evandro Menezes via llvm-commits) Date: Wed, 09 Oct 2019 20:00:44 -0000 Subject: [llvm] r374208 - [AMDGPU] Use math constants defined in MathExtras (NFC) Message-ID: <20191009200044.2BDEE81CEF@lists.llvm.org> Author: evandro Date: Wed Oct 9 13:00:43 2019 New Revision: 374208 URL: http://llvm.org/viewvc/llvm-project?rev=374208&view=rev Log: [AMDGPU] Use math constants defined in MathExtras (NFC) Use the new math constants in `MathExtras.h`.
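As a rough sanity check on this kind of substitution, here is a standalone sketch that is not part of either commit. It copies three float constants from the `numbers` namespace added in r374207 into a hypothetical `sketch` namespace, so no LLVM headers are needed. It then checks that they match the raw encodings they replaced and the scale factors AMDGPU derives from them. The helpers `bitsOf`, `nearlyEqual` and `checkConstants` are illustrative names only, and the bit-pattern checks assume IEEE-754 binary32 `float`.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

namespace sketch {
// Values copied from the `numbers` namespace added in r374207.
constexpr float log2ef = 1.44269504F;
constexpr float ln2f = .693147181F;
constexpr float ln10f = 2.30258509F;

// Bit pattern of a float; assumes IEEE-754 binary32.
inline std::uint32_t bitsOf(float F) {
  static_assert(sizeof(float) == sizeof(std::uint32_t), "expects 32-bit float");
  std::uint32_t Bits;
  std::memcpy(&Bits, &F, sizeof(Bits));
  return Bits;
}

// Absolute-tolerance comparison, adequate for values of magnitude around 1.
inline bool nearlyEqual(float A, float B, float Tol = 1e-6F) {
  return std::fabs(A - B) <= Tol;
}

inline void checkConstants() {
  // Raw encodings that r374207 replaced in SelectionDAGBuilder.cpp.
  assert(bitsOf(log2ef) == 0x3fb8aa3bU);
  assert(bitsOf(ln2f) == 0x3f317218U);
  // 1 / log2(e) is ln(2): the factor used for ISD::FLOG.
  assert(nearlyEqual(1.0F / log2ef, ln2f));
  // ln(2) / ln(10) is log10(2): the factor used for ISD::FLOG10.
  assert(nearlyEqual(ln2f / ln10f, std::log10(2.0F)));
}
} // namespace sketch
```

Calling `sketch::checkConstants()` from a test driver exercises all four checks. The tolerance of 1e-6 is an arbitrary choice for single precision, not something taken from the patch.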
Differential revision: https://reviews.llvm.org/D68285 Modified: llvm/trunk/lib/Target/AMDGPU/AMDGPUISelLowering.cpp llvm/trunk/lib/Target/AMDGPU/AMDGPULibCalls.cpp llvm/trunk/lib/Target/AMDGPU/R600ISelLowering.cpp Modified: llvm/trunk/lib/Target/AMDGPU/AMDGPUISelLowering.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/AMDGPUISelLowering.cpp?rev=374208&r1=374207&r2=374208&view=diff ============================================================================== --- llvm/trunk/lib/Target/AMDGPU/AMDGPUISelLowering.cpp (original) +++ llvm/trunk/lib/Target/AMDGPU/AMDGPUISelLowering.cpp Wed Oct 9 13:00:43 2019 @@ -12,10 +12,6 @@ // //===----------------------------------------------------------------------===// -#define AMDGPU_LOG2E_F 1.44269504088896340735992468100189214f -#define AMDGPU_LN2_F 0.693147180559945309417232121458176568f -#define AMDGPU_LN10_F 2.30258509299404568401799145468436421f - #include "AMDGPUISelLowering.h" #include "AMDGPU.h" #include "AMDGPUCallLowering.h" @@ -37,6 +33,7 @@ #include "llvm/IR/DataLayout.h" #include "llvm/IR/DiagnosticInfo.h" #include "llvm/Support/KnownBits.h" +#include "llvm/Support/MathExtras.h" using namespace llvm; #include "AMDGPUGenCallingConv.inc" @@ -1135,9 +1132,9 @@ SDValue AMDGPUTargetLowering::LowerOpera case ISD::FROUND: return LowerFROUND(Op, DAG); case ISD::FFLOOR: return LowerFFLOOR(Op, DAG); case ISD::FLOG: - return LowerFLOG(Op, DAG, 1 / AMDGPU_LOG2E_F); + return LowerFLOG(Op, DAG, 1.0F / numbers::log2ef); case ISD::FLOG10: - return LowerFLOG(Op, DAG, AMDGPU_LN2_F / AMDGPU_LN10_F); + return LowerFLOG(Op, DAG, numbers::ln2f / numbers::ln10f); case ISD::FEXP: return lowerFEXP(Op, DAG); case ISD::SINT_TO_FP: return LowerSINT_TO_FP(Op, DAG); @@ -2285,30 +2282,13 @@ SDValue AMDGPUTargetLowering::LowerFLOG( return DAG.getNode(ISD::FMUL, SL, VT, Log2Operand, Log2BaseInvertedOperand); } -// Return M_LOG2E of appropriate type -static SDValue getLog2EVal(SelectionDAG &DAG, const SDLoc &SL, EVT VT) { 
- switch (VT.getScalarType().getSimpleVT().SimpleTy) { - case MVT::f32: - return DAG.getConstantFP(1.44269504088896340735992468100189214f, SL, VT); - case MVT::f16: - return DAG.getConstantFP( - APFloat(APFloat::IEEEhalf(), "1.44269504088896340735992468100189214"), - SL, VT); - case MVT::f64: - return DAG.getConstantFP( - APFloat(APFloat::IEEEdouble(), "0x1.71547652b82fep+0"), SL, VT); - default: - llvm_unreachable("unsupported fp type"); - } -} - // exp2(M_LOG2E_F * f); SDValue AMDGPUTargetLowering::lowerFEXP(SDValue Op, SelectionDAG &DAG) const { EVT VT = Op.getValueType(); SDLoc SL(Op); SDValue Src = Op.getOperand(0); - const SDValue K = getLog2EVal(DAG, SL, VT); + const SDValue K = DAG.getConstantFP(numbers::log2e, SL, VT); SDValue Mul = DAG.getNode(ISD::FMUL, SL, VT, Src, K, Op->getFlags()); return DAG.getNode(ISD::FEXP2, SL, VT, Mul, Op->getFlags()); } Modified: llvm/trunk/lib/Target/AMDGPU/AMDGPULibCalls.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/AMDGPULibCalls.cpp?rev=374208&r1=374207&r2=374208&view=diff ============================================================================== --- llvm/trunk/lib/Target/AMDGPU/AMDGPULibCalls.cpp (original) +++ llvm/trunk/lib/Target/AMDGPU/AMDGPULibCalls.cpp Wed Oct 9 13:00:43 2019 @@ -30,6 +30,7 @@ #include "llvm/IR/Module.h" #include "llvm/IR/ValueSymbolTable.h" #include "llvm/Support/Debug.h" +#include "llvm/Support/MathExtras.h" #include "llvm/Support/raw_ostream.h" #include "llvm/Target/TargetMachine.h" #include "llvm/Target/TargetOptions.h" @@ -48,18 +49,10 @@ static cl::list UseNative(" cl::CommaSeparated, cl::ValueOptional, cl::Hidden); -#define MATH_PI 3.14159265358979323846264338327950288419716939937511 -#define MATH_E 2.71828182845904523536028747135266249775724709369996 -#define MATH_SQRT2 1.41421356237309504880168872420969807856967187537695 - -#define MATH_LOG2E 1.4426950408889634073599246810018921374266459541529859 -#define MATH_LOG10E 
0.4342944819032518276511289189166050822943970058036665 -// Value of log2(10) -#define MATH_LOG2_10 3.3219280948873623478703194294893901758648313930245806 -// Value of 1 / log2(10) -#define MATH_RLOG2_10 0.3010299956639811952137388947244930267681898814621085 -// Value of 1 / M_LOG2E_F = 1 / log2(e) -#define MATH_RLOG2_E 0.6931471805599453094172321214581765680755001343602552 +#define MATH_PI numbers::pi +#define MATH_E numbers::e +#define MATH_SQRT2 numbers::sqrt2 +#define MATH_SQRT1_2 numbers::inv_sqrt2 namespace llvm { @@ -254,8 +247,8 @@ struct TableEntry { /* a list of {result, input} */ static const TableEntry tbl_acos[] = { - {MATH_PI/2.0, 0.0}, - {MATH_PI/2.0, -0.0}, + {MATH_PI / 2.0, 0.0}, + {MATH_PI / 2.0, -0.0}, {0.0, 1.0}, {MATH_PI, -1.0} }; @@ -271,8 +264,8 @@ static const TableEntry tbl_acospi[] = { static const TableEntry tbl_asin[] = { {0.0, 0.0}, {-0.0, -0.0}, - {MATH_PI/2.0, 1.0}, - {-MATH_PI/2.0, -1.0} + {MATH_PI / 2.0, 1.0}, + {-MATH_PI / 2.0, -1.0} }; static const TableEntry tbl_asinh[] = { {0.0, 0.0}, @@ -287,8 +280,8 @@ static const TableEntry tbl_asinpi[] = { static const TableEntry tbl_atan[] = { {0.0, 0.0}, {-0.0, -0.0}, - {MATH_PI/4.0, 1.0}, - {-MATH_PI/4.0, -1.0} + {MATH_PI / 4.0, 1.0}, + {-MATH_PI / 4.0, -1.0} }; static const TableEntry tbl_atanh[] = { {0.0, 0.0}, @@ -359,7 +352,7 @@ static const TableEntry tbl_log10[] = { }; static const TableEntry tbl_rsqrt[] = { {1.0, 1.0}, - {1.0/MATH_SQRT2, 2.0} + {MATH_SQRT1_2, 2.0} }; static const TableEntry tbl_sin[] = { {0.0, 0.0}, @@ -868,7 +861,7 @@ static double log2(double V) { #if _XOPEN_SOURCE >= 600 || defined(_ISOC99_SOURCE) || _POSIX_C_SOURCE >= 200112L return ::log2(V); #else - return log(V) / 0.693147180559945309417; + return log(V) / numbers::ln2; #endif } } Modified: llvm/trunk/lib/Target/AMDGPU/R600ISelLowering.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/R600ISelLowering.cpp?rev=374208&r1=374207&r2=374208&view=diff 
============================================================================== --- llvm/trunk/lib/Target/AMDGPU/R600ISelLowering.cpp (original) +++ llvm/trunk/lib/Target/AMDGPU/R600ISelLowering.cpp Wed Oct 9 13:00:43 2019 @@ -41,6 +41,7 @@ #include "llvm/Support/Compiler.h" #include "llvm/Support/ErrorHandling.h" #include "llvm/Support/MachineValueType.h" +#include "llvm/Support/MathExtras.h" #include #include #include @@ -782,7 +783,7 @@ SDValue R600TargetLowering::LowerTrig(SD return TrigVal; // On R600 hw, COS/SIN input must be between -Pi and Pi. return DAG.getNode(ISD::FMUL, DL, VT, TrigVal, - DAG.getConstantFP(3.14159265359, DL, MVT::f32)); + DAG.getConstantFP(numbers::pif, DL, MVT::f32)); } SDValue R600TargetLowering::LowerSHLParts(SDValue Op, SelectionDAG &DAG) const { From llvm-commits at lists.llvm.org Wed Oct 9 13:00:53 2019 From: llvm-commits at lists.llvm.org (Jinsong Ji via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 20:00:53 +0000 (UTC) Subject: [PATCH] D68721: [NFC][PowerPC]Clean up PPCAsmPrinter for TOC related pseudo opcode In-Reply-To: References: Message-ID: <7618fbf6ef13ee54a22d766ac1ffa332@localhost.localdomain> jsji accepted this revision. jsji added a comment. This revision is now accepted and ready to land. Herald added a subscriber: wuzish. LGTM. Some nit comments. ================ Comment at: llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp:741 + // Map the machine operand to its corresponding MCSymbol. + const MCSymbol *MOSymbol = getMCSymbolForTOCPseudoMO(MO, *this); + ---------------- `MOSymbol` used only once here, maybe inline the call directly? ================ Comment at: llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp:761 + // an external symbol, is a jump table address, or is a block address; or if + // the large code model is enabled then generate a TOC entry and reference + // that. Otherwise, reference the symbol directly. ---------------- large code model only for CPI? 
================
Comment at: llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp:774
     if (GlobalToc || MO.isJTI() || MO.isBlockAddress() ||
-        TM.getCodeModel() == CodeModel::Large)
+        (MO.isCPI() && TM.getCodeModel() == CodeModel::Large))
       MOSymbol = lookUpOrCreateTOCEntry(MOSymbol);
----------------
Are these clang formatted? Formatting looks weird to me.

================
Comment at: llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp:797
+    // an external symbol, is a jump table address, is a block address; or if
+    // large code model is enabled then generate a TOC entry and reference that.
+    // Otherwise reference the symbol directly.
----------------
large code model only for CPI?

================
Comment at: llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp:835
+
+    const MCSymbol *MOSymbol = getMCSymbolForTOCPseudoMO(MO, *this);
+
----------------
MOSymbol used only once here, maybe inline the call directly?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68721/new/

https://reviews.llvm.org/D68721

From llvm-commits at lists.llvm.org Wed Oct 9 13:00:53 2019
From: llvm-commits at lists.llvm.org (Roman Lebedev via Phabricator via llvm-commits)
Date: Wed, 09 Oct 2019 20:00:53 +0000 (UTC)
Subject: [PATCH] D68714: [MCA] Show aggregate over Average Wait times for the whole snippet (PR43219)
In-Reply-To:
References:
Message-ID:

lebedev.ri updated this revision to Diff 224131.
lebedev.ri marked 2 inline comments as done.
lebedev.ri added a comment.

Fix coloring by fixing out-of-bounds read.
Thanks to @andreadb for noticing!
:) Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68714/new/ https://reviews.llvm.org/D68714 Files: llvm/docs/CommandGuide/llvm-mca.rst llvm/test/tools/llvm-mca/ARM/memcpy-ldm-stm.s llvm/test/tools/llvm-mca/ARM/vld1-index-update.s llvm/test/tools/llvm-mca/SystemZ/stm-lm.s llvm/test/tools/llvm-mca/X86/Barcelona/clear-super-register-1.s llvm/test/tools/llvm-mca/X86/Barcelona/clear-super-register-2.s llvm/test/tools/llvm-mca/X86/Barcelona/dependency-breaking-cmp.s llvm/test/tools/llvm-mca/X86/Barcelona/dependency-breaking-pcmpeq.s llvm/test/tools/llvm-mca/X86/Barcelona/dependency-breaking-pcmpgt.s llvm/test/tools/llvm-mca/X86/Barcelona/dependency-breaking-sbb-1.s llvm/test/tools/llvm-mca/X86/Barcelona/dependency-breaking-sbb-2.s llvm/test/tools/llvm-mca/X86/Barcelona/int-to-fpu-forwarding-3.s llvm/test/tools/llvm-mca/X86/Barcelona/load-store-throughput.s llvm/test/tools/llvm-mca/X86/Barcelona/load-throughput.s llvm/test/tools/llvm-mca/X86/Barcelona/one-idioms.s llvm/test/tools/llvm-mca/X86/Barcelona/partial-reg-update-2.s llvm/test/tools/llvm-mca/X86/Barcelona/partial-reg-update-3.s llvm/test/tools/llvm-mca/X86/Barcelona/partial-reg-update-4.s llvm/test/tools/llvm-mca/X86/Barcelona/partial-reg-update-6.s llvm/test/tools/llvm-mca/X86/Barcelona/partial-reg-update-7.s llvm/test/tools/llvm-mca/X86/Barcelona/partial-reg-update.s llvm/test/tools/llvm-mca/X86/Barcelona/read-advance-1.s llvm/test/tools/llvm-mca/X86/Barcelona/read-advance-2.s llvm/test/tools/llvm-mca/X86/Barcelona/read-advance-3.s llvm/test/tools/llvm-mca/X86/Barcelona/reg-move-elimination-1.s llvm/test/tools/llvm-mca/X86/Barcelona/reg-move-elimination-2.s llvm/test/tools/llvm-mca/X86/Barcelona/reg-move-elimination-3.s llvm/test/tools/llvm-mca/X86/Barcelona/reg-move-elimination-4.s llvm/test/tools/llvm-mca/X86/Barcelona/reg-move-elimination-5.s llvm/test/tools/llvm-mca/X86/Barcelona/reg-move-elimination-6.s llvm/test/tools/llvm-mca/X86/Barcelona/store-throughput.s 
llvm/test/tools/llvm-mca/X86/Barcelona/zero-idioms.s llvm/test/tools/llvm-mca/X86/BdVer2/add-sequence.s llvm/test/tools/llvm-mca/X86/BdVer2/clear-super-register-1.s llvm/test/tools/llvm-mca/X86/BdVer2/clear-super-register-2.s llvm/test/tools/llvm-mca/X86/BdVer2/clear-super-register-3.s llvm/test/tools/llvm-mca/X86/BdVer2/dependency-breaking-cmp.s llvm/test/tools/llvm-mca/X86/BdVer2/dependency-breaking-pcmpeq.s llvm/test/tools/llvm-mca/X86/BdVer2/dependency-breaking-pcmpgt.s llvm/test/tools/llvm-mca/X86/BdVer2/dependency-breaking-sbb-1.s llvm/test/tools/llvm-mca/X86/BdVer2/dependency-breaking-sbb-2.s llvm/test/tools/llvm-mca/X86/BdVer2/dependent-pmuld-paddd.s llvm/test/tools/llvm-mca/X86/BdVer2/dot-product.s llvm/test/tools/llvm-mca/X86/BdVer2/hadd-read-after-ld-1.s llvm/test/tools/llvm-mca/X86/BdVer2/hadd-read-after-ld-2.s llvm/test/tools/llvm-mca/X86/BdVer2/int-to-fpu-forwarding-3.s llvm/test/tools/llvm-mca/X86/BdVer2/load-store-alias.s llvm/test/tools/llvm-mca/X86/BdVer2/load-store-throughput.s llvm/test/tools/llvm-mca/X86/BdVer2/load-throughput.s llvm/test/tools/llvm-mca/X86/BdVer2/memcpy-like-test.s llvm/test/tools/llvm-mca/X86/BdVer2/one-idioms.s llvm/test/tools/llvm-mca/X86/BdVer2/partial-reg-update-2.s llvm/test/tools/llvm-mca/X86/BdVer2/partial-reg-update-3.s llvm/test/tools/llvm-mca/X86/BdVer2/partial-reg-update-4.s llvm/test/tools/llvm-mca/X86/BdVer2/partial-reg-update-6.s llvm/test/tools/llvm-mca/X86/BdVer2/partial-reg-update.s llvm/test/tools/llvm-mca/X86/BdVer2/pipes-fpu.s llvm/test/tools/llvm-mca/X86/BdVer2/pr37790.s llvm/test/tools/llvm-mca/X86/BdVer2/rank.s llvm/test/tools/llvm-mca/X86/BdVer2/read-advance-1.s llvm/test/tools/llvm-mca/X86/BdVer2/read-advance-2.s llvm/test/tools/llvm-mca/X86/BdVer2/read-advance-3.s llvm/test/tools/llvm-mca/X86/BdVer2/reg-move-elimination-1.s llvm/test/tools/llvm-mca/X86/BdVer2/reg-move-elimination-2.s llvm/test/tools/llvm-mca/X86/BdVer2/reg-move-elimination-3.s 
llvm/test/tools/llvm-mca/X86/BdVer2/reg-move-elimination-4.s llvm/test/tools/llvm-mca/X86/BdVer2/reg-move-elimination-5.s llvm/test/tools/llvm-mca/X86/BdVer2/register-files-1.s llvm/test/tools/llvm-mca/X86/BdVer2/register-files-2.s llvm/test/tools/llvm-mca/X86/BdVer2/register-files-5.s llvm/test/tools/llvm-mca/X86/BdVer2/store-throughput.s llvm/test/tools/llvm-mca/X86/BdVer2/vbroadcast-operand-latency.s llvm/test/tools/llvm-mca/X86/BdVer2/vec-logic-read-after-ld-1.s llvm/test/tools/llvm-mca/X86/BdVer2/vec-logic-read-after-ld-2.s llvm/test/tools/llvm-mca/X86/BdVer2/xop-super-registers-1.s llvm/test/tools/llvm-mca/X86/BdVer2/xop-super-registers-2.s llvm/test/tools/llvm-mca/X86/BdVer2/zero-idioms-avx-256.s llvm/test/tools/llvm-mca/X86/BdVer2/zero-idioms.s llvm/test/tools/llvm-mca/X86/Broadwell/zero-idioms.s llvm/test/tools/llvm-mca/X86/BtVer2/add-sequence.s llvm/test/tools/llvm-mca/X86/BtVer2/bottleneck-hints-1.s llvm/test/tools/llvm-mca/X86/BtVer2/bottleneck-hints-3.s llvm/test/tools/llvm-mca/X86/BtVer2/clear-super-register-1.s llvm/test/tools/llvm-mca/X86/BtVer2/clear-super-register-2.s llvm/test/tools/llvm-mca/X86/BtVer2/cmpxchg-read-advance.s llvm/test/tools/llvm-mca/X86/BtVer2/dependency-breaking-cmp.s llvm/test/tools/llvm-mca/X86/BtVer2/dependency-breaking-pcmpeq.s llvm/test/tools/llvm-mca/X86/BtVer2/dependency-breaking-pcmpgt.s llvm/test/tools/llvm-mca/X86/BtVer2/dependency-breaking-sbb-1.s llvm/test/tools/llvm-mca/X86/BtVer2/dependency-breaking-sbb-2.s llvm/test/tools/llvm-mca/X86/BtVer2/dependent-pmuld-paddd.s llvm/test/tools/llvm-mca/X86/BtVer2/dot-product.s llvm/test/tools/llvm-mca/X86/BtVer2/hadd-read-after-ld-1.s llvm/test/tools/llvm-mca/X86/BtVer2/hadd-read-after-ld-2.s llvm/test/tools/llvm-mca/X86/BtVer2/int-to-fpu-forwarding-3.s llvm/test/tools/llvm-mca/X86/BtVer2/load-store-alias.s llvm/test/tools/llvm-mca/X86/BtVer2/memcpy-like-test.s llvm/test/tools/llvm-mca/X86/BtVer2/one-idioms.s llvm/test/tools/llvm-mca/X86/BtVer2/partial-reg-update-2.s 
llvm/test/tools/llvm-mca/X86/BtVer2/partial-reg-update-3.s llvm/test/tools/llvm-mca/X86/BtVer2/partial-reg-update-4.s llvm/test/tools/llvm-mca/X86/BtVer2/partial-reg-update-6.s llvm/test/tools/llvm-mca/X86/BtVer2/partial-reg-update-7.s llvm/test/tools/llvm-mca/X86/BtVer2/partial-reg-update.s llvm/test/tools/llvm-mca/X86/BtVer2/pipes-fpu.s llvm/test/tools/llvm-mca/X86/BtVer2/pr37790.s llvm/test/tools/llvm-mca/X86/BtVer2/rank.s llvm/test/tools/llvm-mca/X86/BtVer2/read-advance-1.s llvm/test/tools/llvm-mca/X86/BtVer2/read-advance-2.s llvm/test/tools/llvm-mca/X86/BtVer2/read-advance-3.s llvm/test/tools/llvm-mca/X86/BtVer2/reg-move-elimination-1.s llvm/test/tools/llvm-mca/X86/BtVer2/reg-move-elimination-2.s llvm/test/tools/llvm-mca/X86/BtVer2/reg-move-elimination-3.s llvm/test/tools/llvm-mca/X86/BtVer2/reg-move-elimination-4.s llvm/test/tools/llvm-mca/X86/BtVer2/reg-move-elimination-5.s llvm/test/tools/llvm-mca/X86/BtVer2/reg-move-elimination-6.s llvm/test/tools/llvm-mca/X86/BtVer2/register-files-1.s llvm/test/tools/llvm-mca/X86/BtVer2/register-files-2.s llvm/test/tools/llvm-mca/X86/BtVer2/register-files-5.s llvm/test/tools/llvm-mca/X86/BtVer2/vbroadcast-operand-latency.s llvm/test/tools/llvm-mca/X86/BtVer2/vec-logic-read-after-ld-1.s llvm/test/tools/llvm-mca/X86/BtVer2/vec-logic-read-after-ld-2.s llvm/test/tools/llvm-mca/X86/BtVer2/xadd.s llvm/test/tools/llvm-mca/X86/BtVer2/xchg.s llvm/test/tools/llvm-mca/X86/BtVer2/zero-idioms-avx-256.s llvm/test/tools/llvm-mca/X86/BtVer2/zero-idioms.s llvm/test/tools/llvm-mca/X86/Generic/avx512-super-registers-1.s llvm/test/tools/llvm-mca/X86/Generic/avx512-super-registers-2.s llvm/test/tools/llvm-mca/X86/Generic/avx512-super-registers-3.s llvm/test/tools/llvm-mca/X86/Generic/xop-super-registers-1.s llvm/test/tools/llvm-mca/X86/Generic/xop-super-registers-2.s llvm/test/tools/llvm-mca/X86/Haswell/zero-idioms.s llvm/test/tools/llvm-mca/X86/SandyBridge/zero-idioms.s llvm/test/tools/llvm-mca/X86/SkylakeClient/zero-idioms.s 
llvm/test/tools/llvm-mca/X86/SkylakeServer/zero-idioms.s llvm/test/tools/llvm-mca/X86/Znver1/partial-reg-update-2.s llvm/test/tools/llvm-mca/X86/Znver1/partial-reg-update-3.s llvm/test/tools/llvm-mca/X86/Znver1/partial-reg-update-4.s llvm/test/tools/llvm-mca/X86/Znver1/partial-reg-update-6.s llvm/test/tools/llvm-mca/X86/Znver1/partial-reg-update-7.s llvm/test/tools/llvm-mca/X86/Znver1/partial-reg-update.s llvm/test/tools/llvm-mca/X86/bextr-read-after-ld.s llvm/test/tools/llvm-mca/X86/bzhi-read-after-ld.s llvm/test/tools/llvm-mca/X86/fma3-read-after-ld-1.s llvm/test/tools/llvm-mca/X86/fma3-read-after-ld-2.s llvm/test/tools/llvm-mca/X86/read-after-ld-1.s llvm/test/tools/llvm-mca/X86/read-after-ld-2.s llvm/test/tools/llvm-mca/X86/read-after-ld-3.s llvm/test/tools/llvm-mca/X86/sqrt-rsqrt-rcp-memop.s llvm/test/tools/llvm-mca/X86/variable-blend-read-after-ld-1.s llvm/test/tools/llvm-mca/X86/variable-blend-read-after-ld-2.s llvm/tools/llvm-mca/Views/TimelineView.cpp llvm/tools/llvm-mca/Views/TimelineView.h -------------- next part -------------- A non-text attachment was scrubbed... Name: D68714.224131.patch Type: text/x-patch Size: 111759 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 13:00:53 2019 From: llvm-commits at lists.llvm.org (Roman Lebedev via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 20:00:53 +0000 (UTC) Subject: [PATCH] D68714: [MCA] Show aggregate over Average Wait times for the whole snippet (PR43219) In-Reply-To: References: Message-ID: lebedev.ri added inline comments. 
================
Comment at: llvm/tools/llvm-mca/Views/TimelineView.cpp:132-140
 void TimelineView::printWaitTimeEntry(formatted_raw_ostream &OS,
                                       const WaitTimeEntry &Entry,
                                       unsigned SourceIndex,
-                                      unsigned Executions) const {
-  OS << SourceIndex << '.';
+                                      unsigned CumulativeExecutions,
+                                      unsigned Executions,
+                                      bool ShouldNumber) const {
+  if (ShouldNumber)
----------------
andreadb wrote:
> lebedev.ri wrote:
> > andreadb wrote:
> > > You can still use the old signature for this method (see the explanation below):
> > >
> > > We know that we are printing the special entry if `SourceIndex == Source.size()`.
> > >
> > > You can use that knowledge in two places:
> > >
> > > 1) You can automatically infer flag `ShouldNumber`.
> > >
> > > ```
> > > bool ShouldNumber = SourceIndex != Source.size();
> > > if (ShouldNumber)
> > >   OS << SourceIndex << '.';
> > > ```
> > >
> > > 2) Before printing the average times, you can check if the numbers are for the special entry and modify the value of `Executions` with `Timeline.size() / Source.size()`.
> > >
> > > You can do it where you currently added the FIXME comment.
> > >
> > > ```
> > > if (!ShouldNumber) {
> > >   // override Executions for the purpose of changing colors.
> > >   Executions = Timeline.size() / Source.size();
> > > }
> > > ```
> > >
> > > Basically you don't need `CumulativeExecutions` as it can be inferred from the context. That should also fix the issue with the coloring of the output.
> > To be honest, I do not understand this comment.
> > Is this better or worse?
> > This does not help with coloring.
> Okay. I see the problem now.
>
> At line 155 we have:
> ```
> int BufferSize = UsedBuffer[SourceIndex].second;
> ```
>
> However, SourceIndex is not a valid index if method `printWaitTimeEntry()` is called to print the . The reason is that `SourceIndex` is set to `Source.size()` for entry , and there are only `Source.size()` elements in vector `UsedBuffer`. So that access is invalid if we are printing the .
>
> There is not an easy fix for it.
> I suggest for now that we avoid changing the colors if we know that we are printing the entry.

Oh, that would explain it. I should have seen it :(


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68714/new/

https://reviews.llvm.org/D68714



From llvm-commits at lists.llvm.org  Wed Oct  9 13:00:54 2019
From: llvm-commits at lists.llvm.org (Tony Jiang via Phabricator via llvm-commits)
Date: Wed, 09 Oct 2019 20:00:54 +0000 (UTC)
Subject: [PATCH] D66840: docs/DeveloperPolicy: Add instructions for requesting GitHub commit access
In-Reply-To:
References:
Message-ID:

jtony added a comment.

I am not able to run the last step successfully. Initially, I thought it was because I had used the wrong password, so I sent a new password hash to Chris Lattner, and he updated it for me. When I used the new password to run `svn commit -m "Request commit access for jtony"`, it still failed. Does anyone know why? Thanks!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D66840/new/

https://reviews.llvm.org/D66840



From llvm-commits at lists.llvm.org  Wed Oct  9 13:00:54 2019
From: llvm-commits at lists.llvm.org (Johannes Doerfert via Phabricator via llvm-commits)
Date: Wed, 09 Oct 2019 20:00:54 +0000 (UTC)
Subject: [PATCH] D68706: [InstCombine] don't assume 'inbounds' for bitcast deref or null pointer in non-default address space
In-Reply-To:
References:
Message-ID: <42ce47de43780121cc11c27ff87117b6@localhost.localdomain>

jdoerfert accepted this revision.
jdoerfert added a comment.
This revision is now accepted and ready to land.

In D68706#1701937, @spatel wrote:

> In D68706#1701915, @jdoerfert wrote:
>
> > Can you add the test I provided as well?
>
>
> Did I miss a message?
I copy/pasted at line 92 of the test file (no diff from the code change):
>
>   define float @matching_scalar_smallest_deref_addrspace(<4 x float> addrspace(4)* dereferenceable(1) %p) {
>   ; CHECK-LABEL: @matching_scalar_smallest_deref_addrspace(
>   ; CHECK-NEXT:    [[BC:%.*]] = getelementptr inbounds <4 x float>, <4 x float> addrspace(4)* [[P:%.*]], i64 0, i64 0
>   ; CHECK-NEXT:    [[R:%.*]] = load float, float addrspace(4)* [[BC]], align 16
>   ; CHECK-NEXT:    ret float [[R]]
>   ;
>     %bc = bitcast <4 x float> addrspace(4)* %p to float addrspace(4)*
>     %r = load float, float addrspace(4)* %bc, align 16
>     ret float %r
>   }
>

I did not realize you committed the test separately, so I was expecting to see a new test show up in the diff. Sorry.

LGTM with a request for an additional comment (see below).


================
Comment at: llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp:2349
+    // In a non-default address space (not 0), a null pointer can not be
+    // assumed inbounds, so ignore that case (dereferenceable_or_null).
+    if (SrcPTy->getAddressSpace() == 0 || !CanBeNull)
----------------
Could you add one more sentence here, please, to complement your reasoning:
```
// The reason is that `null` is not treated differently in these address spaces
// and we consequently ignore the `gep inbounds` special case for `null` which
// allows `inbounds` on `null` if the indices are zeros.
```


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68706/new/

https://reviews.llvm.org/D68706



From llvm-commits at lists.llvm.org  Wed Oct  9 13:00:58 2019
From: llvm-commits at lists.llvm.org (Digger via Phabricator via llvm-commits)
Date: Wed, 09 Oct 2019 20:00:58 +0000 (UTC)
Subject: [PATCH] D66969: Output XCOFF object text section header and symbol entry for program code
In-Reply-To:
References:
Message-ID: <3b56d80a21336000ad7199fd5d92ffc9@localhost.localdomain>

DiggerLin updated this revision to Diff 224132.
DiggerLin added a comment.

Add an `if (PaddingSize)` guard.
Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D66969/new/ https://reviews.llvm.org/D66969 Files: llvm/include/llvm/MC/MCSectionXCOFF.h llvm/lib/MC/MCXCOFFStreamer.cpp llvm/lib/MC/XCOFFObjectWriter.cpp llvm/test/CodeGen/PowerPC/aix-return55.ll llvm/test/CodeGen/PowerPC/aix-xcoff-common.ll llvm/test/CodeGen/PowerPC/aix-xcoff-lcomm.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D66969.224132.patch Type: text/x-patch Size: 20689 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 13:01:00 2019 From: llvm-commits at lists.llvm.org (Evandro Menezes via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 20:01:00 +0000 (UTC) Subject: [PATCH] D68257: [Support] Add mathematical constants In-Reply-To: References: Message-ID: This revision was automatically updated to reflect the committed changes. evandro marked an inline comment as done. Closed by commit rGe60415a0db2b: [Support] Add mathematical constants (authored by evandro). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68257/new/ https://reviews.llvm.org/D68257 Files: llvm/include/llvm/Support/MathExtras.h llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D68257.224133.patch Type: text/x-patch Size: 6048 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 13:01:07 2019 From: llvm-commits at lists.llvm.org (Evandro Menezes via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 20:01:07 +0000 (UTC) Subject: [PATCH] D68285: [AMDGPU] Use math constants defined in MathExtras (NFC) In-Reply-To: References: Message-ID: This revision was automatically updated to reflect the committed changes. Closed by commit rGc57a9dc487e3: [AMDGPU] Use math constants defined in MathExtras (NFC) (authored by evandro). 
Changed prior to commit: https://reviews.llvm.org/D68285?vs=223374&id=224134#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68285/new/ https://reviews.llvm.org/D68285 Files: llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp llvm/lib/Target/AMDGPU/AMDGPULibCalls.cpp llvm/lib/Target/AMDGPU/R600ISelLowering.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D68285.224134.patch Type: text/x-patch Size: 5878 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 13:05:55 2019 From: llvm-commits at lists.llvm.org (Vitaly Buka via llvm-commits) Date: Wed, 9 Oct 2019 13:05:55 -0700 Subject: [PATCH] D68604: [tsan] Don't delay SIGTRAP handler In-Reply-To: References: Message-ID: Thanks. Looking. On Wed, Oct 9, 2019 at 8:17 AM Steven Wan via Phabricator < reviews at reviews.llvm.org> wrote: > stevewan added a comment. > > Hi @vitalybuka, > > This is causing LIT failures in `clang-s390x-linux`. Can you please take a > look? Thanks! > > Steven > > > Repository: > rL LLVM > > CHANGES SINCE LAST ACTION > https://reviews.llvm.org/D68604/new/ > > https://reviews.llvm.org/D68604 > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From llvm-commits at lists.llvm.org Wed Oct 9 13:10:19 2019 From: llvm-commits at lists.llvm.org (Max Moroz via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 20:10:19 +0000 (UTC) Subject: [PATCH] D51018: [sancov] Accommodate sancov and coverage report server for use under Windows In-Reply-To: References: Message-ID: <589cf606cb6de533e21358df2fa6a05b@localhost.localdomain> Dor1s added a comment. > @Dor1s - any chance you know more folks actively working on sancov who have the bandwidth to review? Added Matt and Vitaly from "sanitizers" team, + Jonathan who has Windows expertise. 
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D51018/new/ https://reviews.llvm.org/D51018 From llvm-commits at lists.llvm.org Wed Oct 9 13:14:17 2019 From: llvm-commits at lists.llvm.org (Sanjay Patel via llvm-commits) Date: Wed, 09 Oct 2019 20:14:17 -0000 Subject: [llvm] r374210 - [ConstProp] add tests for extractelement with undef index; NFC Message-ID: <20191009201417.8E3F8811BE@lists.llvm.org> Author: spatel Date: Wed Oct 9 13:14:17 2019 New Revision: 374210 URL: http://llvm.org/viewvc/llvm-project?rev=374210&view=rev Log: [ConstProp] add tests for extractelement with undef index; NFC Modified: llvm/trunk/test/Transforms/ConstProp/InsertElement.ll Modified: llvm/trunk/test/Transforms/ConstProp/InsertElement.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/ConstProp/InsertElement.ll?rev=374210&r1=374209&r2=374210&view=diff ============================================================================== --- llvm/trunk/test/Transforms/ConstProp/InsertElement.ll (original) +++ llvm/trunk/test/Transforms/ConstProp/InsertElement.ll Wed Oct 9 13:14:17 2019 @@ -1,32 +1,53 @@ +; NOTE: Assertions have been autogenerated by utils/update_test_checks.py ; RUN: opt < %s -constprop -S | FileCheck %s -; CHECK-LABEL: @test1 define i32 @test1() { +; CHECK-LABEL: @test1( +; CHECK-NEXT: ret i32 2139171423 +; %A = bitcast i32 2139171423 to float %B = insertelement <1 x float> undef, float %A, i32 0 %C = extractelement <1 x float> %B, i32 0 %D = bitcast float %C to i32 ret i32 %D -; CHECK: ret i32 2139171423 } -; CHECK-LABEL: @insertelement define <4 x i64> @insertelement() { +; CHECK-LABEL: @insertelement( +; CHECK-NEXT: ret <4 x i64> +; %vec1 = insertelement <4 x i64> undef, i64 -1, i32 0 %vec2 = insertelement <4 x i64> %vec1, i64 -2, i32 1 %vec3 = insertelement <4 x i64> %vec2, i64 -3, i32 2 %vec4 = insertelement <4 x i64> %vec3, i64 -4, i32 3 - ; CHECK: ret <4 x i64> ret <4 x i64> %vec4 } -; CHECK-LABEL: @insertelement_undef define <4 x i64> 
@insertelement_undef() { +; CHECK-LABEL: @insertelement_undef( +; CHECK-NEXT: [[VEC4:%.*]] = insertelement <4 x i64> , i64 -4, i32 3 +; CHECK-NEXT: ret <4 x i64> undef +; %vec1 = insertelement <4 x i64> undef, i64 -1, i32 0 %vec2 = insertelement <4 x i64> %vec1, i64 -2, i32 1 %vec3 = insertelement <4 x i64> %vec2, i64 -3, i32 2 %vec4 = insertelement <4 x i64> %vec3, i64 -4, i32 3 %vec5 = insertelement <4 x i64> %vec3, i64 -5, i32 4 - ; CHECK: ret <4 x i64> undef ret <4 x i64> %vec5 } + +define i64 @extract_undef_index_from_zero_vec() { +; CHECK-LABEL: @extract_undef_index_from_zero_vec( +; CHECK-NEXT: ret i64 0 +; + %E = extractelement <2 x i64> zeroinitializer, i64 undef + ret i64 %E +} + +define i64 @extract_undef_index_from_nonzero_vec() { +; CHECK-LABEL: @extract_undef_index_from_nonzero_vec( +; CHECK-NEXT: ret i64 undef +; + %E = extractelement <2 x i64> , i64 undef + ret i64 %E +} From llvm-commits at lists.llvm.org Wed Oct 9 13:12:47 2019 From: llvm-commits at lists.llvm.org (Melanie Blower via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 20:12:47 +0000 (UTC) Subject: [PATCH] D62731: Add support for options -frounding-math, ftrapping-math, -fp-model=, and -fp-exception-behavior=, : Specify floating point behavior In-Reply-To: References: Message-ID: <500e8e0f48d998ce0712b1a4099e4062@localhost.localdomain> mibintc updated this revision to Diff 224136. mibintc retitled this revision from "[RFC] Add support for options -frounding-math, -fp-model=, and -fp-exception-behavior=, : Specify floating point behavior" to "Add support for options -frounding-math, ftrapping-math, -fp-model=, and -fp-exception-behavior=, : Specify floating point behavior". mibintc added a comment. I added a new test case fp-model.c to test RenderFloatingPointOptions, I also fixed a few issues that I spotted while working through this test case. 
I responded to couple documentation comments from @rjmccall I still owe a more deluxe version of the test fp-constrained.c to be sure all the option values come through as expected Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D62731/new/ https://reviews.llvm.org/D62731 Files: clang/docs/UsersManual.rst clang/include/clang/Basic/CodeGenOptions.def clang/include/clang/Basic/LangOptions.h clang/include/clang/Driver/Options.td clang/lib/CodeGen/BackendUtil.cpp clang/lib/CodeGen/CodeGenFunction.cpp clang/lib/CodeGen/CodeGenFunction.h clang/lib/Driver/ToolChains/Clang.cpp clang/lib/Frontend/CompilerInvocation.cpp clang/test/CodeGen/fpconstrained.c clang/test/Driver/clang_f_opts.c clang/test/Driver/fast-math.c clang/test/Driver/fp-model.c llvm/include/llvm/Target/TargetOptions.h -------------- next part -------------- A non-text attachment was scrubbed... Name: D62731.224136.patch Type: text/x-patch Size: 37204 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 13:18:28 2019 From: llvm-commits at lists.llvm.org (Vitaly Buka via llvm-commits) Date: Wed, 09 Oct 2019 20:18:28 -0000 Subject: [compiler-rt] r374211 - [sanitizer] Use raise() in test and cover more signals Message-ID: <20191009201828.145B6811BE@lists.llvm.org> Author: vitalybuka Date: Wed Oct 9 13:18:27 2019 New Revision: 374211 URL: http://llvm.org/viewvc/llvm-project?rev=374211&view=rev Log: [sanitizer] Use raise() in test and cover more signals Added: compiler-rt/trunk/test/sanitizer_common/TestCases/Linux/signal_name.cpp Removed: compiler-rt/trunk/test/sanitizer_common/TestCases/Linux/signal_trap.cpp Added: compiler-rt/trunk/test/sanitizer_common/TestCases/Linux/signal_name.cpp URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/test/sanitizer_common/TestCases/Linux/signal_name.cpp?rev=374211&view=auto ============================================================================== --- 
compiler-rt/trunk/test/sanitizer_common/TestCases/Linux/signal_name.cpp (added) +++ compiler-rt/trunk/test/sanitizer_common/TestCases/Linux/signal_name.cpp Wed Oct 9 13:18:27 2019 @@ -0,0 +1,19 @@ +// RUN: %clangxx -O1 %s -o %t +// RUN: %env_tool_opts=handle_sigfpe=2 not %run %t 0 2>&1 | FileCheck %s -DSIGNAME=FPE +// RUN: %env_tool_opts=handle_sigill=2 not %run %t 1 2>&1 | FileCheck %s -DSIGNAME=ILL +// RUN: %env_tool_opts=handle_abort=2 not %run %t 2 2>&1 | FileCheck %s -DSIGNAME=ABRT +// RUN: %env_tool_opts=handle_segv=2 not %run %t 3 2>&1 | FileCheck %s -DSIGNAME=SEGV +// RUN: %env_tool_opts=handle_sigbus=2 not %run %t 4 2>&1 | FileCheck %s -DSIGNAME=BUS +// RUN: %env_tool_opts=handle_sigtrap=2 not %run %t 5 2>&1 | FileCheck %s -DSIGNAME=TRAP + +#include +#include + +int main(int argc, char **argv) { + if (argc != 2) return 0; + int signals[] = {SIGFPE, SIGILL, SIGABRT, SIGSEGV, SIGBUS, SIGTRAP}; + raise(signals[atoi(argv[1])]); +} + +// CHECK: Sanitizer:DEADLYSIGNAL +// CHECK: Sanitizer: [[SIGNAME]] on unknown address Removed: compiler-rt/trunk/test/sanitizer_common/TestCases/Linux/signal_trap.cpp URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/test/sanitizer_common/TestCases/Linux/signal_trap.cpp?rev=374210&view=auto ============================================================================== --- compiler-rt/trunk/test/sanitizer_common/TestCases/Linux/signal_trap.cpp (original) +++ compiler-rt/trunk/test/sanitizer_common/TestCases/Linux/signal_trap.cpp (removed) @@ -1,8 +0,0 @@ -// RUN: %clangxx -O1 %s -o %t && %env_tool_opts=handle_sigtrap=2 not %run %t 2>&1 | FileCheck %s - -int main() { - __builtin_debugtrap(); -} - -// CHECK: Sanitizer:DEADLYSIGNAL -// CHECK: Sanitizer: TRAP on unknown address From llvm-commits at lists.llvm.org Wed Oct 9 13:19:34 2019 From: llvm-commits at lists.llvm.org (Xiangling Liao via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 20:19:34 +0000 (UTC) Subject: [PATCH] D68721: [NFC][PowerPC]Clean up PPCAsmPrinter 
for TOC related pseudo opcode In-Reply-To: References: Message-ID: <2eff14dd8cadcf000e8d27e3d9bb177a@localhost.localdomain> Xiangling_L marked 6 inline comments as done. Xiangling_L added inline comments. ================ Comment at: llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp:774 if (GlobalToc || MO.isJTI() || MO.isBlockAddress() || - TM.getCodeModel() == CodeModel::Large) + (MO.isCPI() && TM.getCodeModel() == CodeModel::Large)) MOSymbol = lookUpOrCreateTOCEntry(MOSymbol); ---------------- jsji wrote: > Are these clang formatted? Formatting looks weird to me. Yes, they are. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68721/new/ https://reviews.llvm.org/D68721 From llvm-commits at lists.llvm.org Wed Oct 9 13:19:35 2019 From: llvm-commits at lists.llvm.org (Jake Ehrlich via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 20:19:35 +0000 (UTC) Subject: [PATCH] D68724: [libFuzzer] Fix Alarm callback in fuchsia. In-Reply-To: References: Message-ID: <4e3c7960be6762353ee87db72d960573@localhost.localdomain> jakehehrlich accepted this revision. jakehehrlich added a comment. This revision is now accepted and ready to land. LGTM except a nit ================ Comment at: compiler-rt/lib/fuzzer/FuzzerLoop.cpp:276-277 assert(Options.UnitTimeoutSec > 0); // In Windows Alarm callback is executed by a different thread. // NetBSD's current behavior needs this change too. +#if !LIBFUZZER_WINDOWS && !LIBFUZZER_NETBSD && !LIBFUZZER_FUCHSIA ---------------- Can you mention that Fuchsia does the same thing as windows here? 
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68724/new/ https://reviews.llvm.org/D68724 From llvm-commits at lists.llvm.org Wed Oct 9 13:19:35 2019 From: llvm-commits at lists.llvm.org (Melanie Blower via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 20:19:35 +0000 (UTC) Subject: [PATCH] D62731: Add support for options -frounding-math, ftrapping-math, -fp-model=, and -fp-exception-behavior=, : Specify floating point behavior In-Reply-To: References: Message-ID: mibintc marked 5 inline comments as done. mibintc added inline comments. ================ Comment at: clang/docs/UsersManual.rst:1330 + and ``fast``. + Details: + ---------------- rjmccall wrote: > rjmccall wrote: > > "provided by other, single-purpose floating point options." > I don't know why you keep including "clang" as a modifier here; this is the clang documentation, and all of these options are clang options no matter where they might have been borrowed from. thanks for explicitly pointing out use of 'clang', i fixed it ================ Comment at: clang/docs/UsersManual.rst:1341 + has been selected, then the compiler will issue a diagnostic warning + that the override has occurred. + ---------------- rjmccall wrote: > mibintc wrote: > > rjmccall wrote: > > > That's not typical driver behavior; why this choice? > > The rationale for the warnings is that the floating point options are sufficiently complicated that it makes sense to warn the uses that one of the later options supplied on the command line is undoing a choice made earlier. It's not obvious that e.g. the setting for fassociative-math is also controlled by -fp-model=strict > Okay. Well, it's a new option, so new behavior is alright, but if you're worried about the collisions having arbitrary effects that you'll have to maintain compatibility with, you should consider making it an error instead, because a warning still means it's permitted. 
@andrew.w.kaylor What do you think about making the diagnostics error vs. warning? ================ Comment at: clang/include/clang/Basic/LangOptions.h:187 + enum FPRoundingModeKind { + // Round to the nearest integer - IEEE rounding mode ---------------- Currently there's no way to get at any of these values besides ToNearest and Dynamic, but I put all the supported values here to support future work ================ Comment at: clang/include/clang/Basic/LangOptions.h:203 + // Floating point exceptions are not handled: fp exceptions are masked. + FPEB_Ignore, // This is the default + // Optimizer will avoid transformations that may raise exceptions that would ---------------- -fno-trapping-math implemented by selecting -ffp-exception-behavior=ignore and -ftrapping-math is implemented by selecting -ffp-exception-behavior=strict. What do you think about making ftrapping-math a Driver only option, so that Driver converts the values like this. Otherwise let's make fp-exception-behavior take precedence, in llvm, over ftrapping-math (trapping math is t/f but exception behavior, in the llvm Constrained Floating Point Intrinsics, can take 3 values) Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D62731/new/ https://reviews.llvm.org/D62731 From llvm-commits at lists.llvm.org Wed Oct 9 13:19:35 2019 From: llvm-commits at lists.llvm.org (Marco Vanotti via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 20:19:35 +0000 (UTC) Subject: [PATCH] D68724: [libFuzzer] Fix Alarm callback in fuchsia. Message-ID: charco created this revision. charco added reviewers: mcgrathr, jakehehrlich, phosek, kcc. Herald added subscribers: llvm-commits, Sanitizers, kristof.beyls, krytarowski. Herald added projects: Sanitizers, LLVM. jakehehrlich accepted this revision. jakehehrlich added a comment. This revision is now accepted and ready to land. 
LGTM except a nit ================ Comment at: compiler-rt/lib/fuzzer/FuzzerLoop.cpp:276-277 assert(Options.UnitTimeoutSec > 0); // In Windows Alarm callback is executed by a different thread. // NetBSD's current behavior needs this change too. +#if !LIBFUZZER_WINDOWS && !LIBFUZZER_NETBSD && !LIBFUZZER_FUCHSIA ---------------- Can you mention that Fuchsia does the same thing as windows here? This patch adds an #if macro to skip the `InFuzzingThread()` comparison for fuchsia, similar to what it is done for Windows and NetBSD. In fuchsia, the alarm callback runs in a separate thread[0], making it fail the comparison `InFuzzingThread()`, breaking the `-timeout` flag. [0]: https://github.com/llvm/llvm-project/blob/master/compiler-rt/lib/fuzzer/FuzzerUtilFuchsia.cpp#L323 Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D68724 Files: compiler-rt/lib/fuzzer/FuzzerLoop.cpp Index: compiler-rt/lib/fuzzer/FuzzerLoop.cpp =================================================================== --- compiler-rt/lib/fuzzer/FuzzerLoop.cpp +++ compiler-rt/lib/fuzzer/FuzzerLoop.cpp @@ -275,7 +275,7 @@ assert(Options.UnitTimeoutSec > 0); // In Windows Alarm callback is executed by a different thread. // NetBSD's current behavior needs this change too. -#if !LIBFUZZER_WINDOWS && !LIBFUZZER_NETBSD +#if !LIBFUZZER_WINDOWS && !LIBFUZZER_NETBSD && !LIBFUZZER_FUCHSIA if (!InFuzzingThread()) return; #endif -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68724.224139.patch Type: text/x-patch Size: 538 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 13:22:14 2019 From: llvm-commits at lists.llvm.org (Vitaly Buka via llvm-commits) Date: Wed, 09 Oct 2019 20:22:14 -0000 Subject: [compiler-rt] r374213 - [sanitizer] Make signal_name a C test Message-ID: <20191009202214.5A66188B9C@lists.llvm.org> Author: vitalybuka Date: Wed Oct 9 13:22:14 2019 New Revision: 374213 URL: http://llvm.org/viewvc/llvm-project?rev=374213&view=rev Log: [sanitizer] Make signal_name a C test Added: compiler-rt/trunk/test/sanitizer_common/TestCases/Linux/signal_name.c Removed: compiler-rt/trunk/test/sanitizer_common/TestCases/Linux/signal_name.cpp Added: compiler-rt/trunk/test/sanitizer_common/TestCases/Linux/signal_name.c URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/test/sanitizer_common/TestCases/Linux/signal_name.c?rev=374213&view=auto ============================================================================== --- compiler-rt/trunk/test/sanitizer_common/TestCases/Linux/signal_name.c (added) +++ compiler-rt/trunk/test/sanitizer_common/TestCases/Linux/signal_name.c Wed Oct 9 13:22:14 2019 @@ -0,0 +1,20 @@ +// RUN: %clang -O1 %s -o %t +// RUN: %env_tool_opts=handle_sigfpe=2 not %run %t 0 2>&1 | FileCheck %s -DSIGNAME=FPE +// RUN: %env_tool_opts=handle_sigill=2 not %run %t 1 2>&1 | FileCheck %s -DSIGNAME=ILL +// RUN: %env_tool_opts=handle_abort=2 not %run %t 2 2>&1 | FileCheck %s -DSIGNAME=ABRT +// RUN: %env_tool_opts=handle_segv=2 not %run %t 3 2>&1 | FileCheck %s -DSIGNAME=SEGV +// RUN: %env_tool_opts=handle_sigbus=2 not %run %t 4 2>&1 | FileCheck %s -DSIGNAME=BUS +// RUN: %env_tool_opts=handle_sigtrap=2 not %run %t 5 2>&1 | FileCheck %s -DSIGNAME=TRAP + +#include +#include + +int main(int argc, char **argv) { + if (argc != 2) + return 0; + int signals[] = {SIGFPE, SIGILL, SIGABRT, SIGSEGV, SIGBUS, SIGTRAP}; + raise(signals[atoi(argv[1])]); +} + +// CHECK: Sanitizer:DEADLYSIGNAL +// CHECK: 
Sanitizer: [[SIGNAME]] on unknown address Removed: compiler-rt/trunk/test/sanitizer_common/TestCases/Linux/signal_name.cpp URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/test/sanitizer_common/TestCases/Linux/signal_name.cpp?rev=374212&view=auto ============================================================================== --- compiler-rt/trunk/test/sanitizer_common/TestCases/Linux/signal_name.cpp (original) +++ compiler-rt/trunk/test/sanitizer_common/TestCases/Linux/signal_name.cpp (removed) @@ -1,19 +0,0 @@ -// RUN: %clangxx -O1 %s -o %t -// RUN: %env_tool_opts=handle_sigfpe=2 not %run %t 0 2>&1 | FileCheck %s -DSIGNAME=FPE -// RUN: %env_tool_opts=handle_sigill=2 not %run %t 1 2>&1 | FileCheck %s -DSIGNAME=ILL -// RUN: %env_tool_opts=handle_abort=2 not %run %t 2 2>&1 | FileCheck %s -DSIGNAME=ABRT -// RUN: %env_tool_opts=handle_segv=2 not %run %t 3 2>&1 | FileCheck %s -DSIGNAME=SEGV -// RUN: %env_tool_opts=handle_sigbus=2 not %run %t 4 2>&1 | FileCheck %s -DSIGNAME=BUS -// RUN: %env_tool_opts=handle_sigtrap=2 not %run %t 5 2>&1 | FileCheck %s -DSIGNAME=TRAP - -#include -#include - -int main(int argc, char **argv) { - if (argc != 2) return 0; - int signals[] = {SIGFPE, SIGILL, SIGABRT, SIGSEGV, SIGBUS, SIGTRAP}; - raise(signals[atoi(argv[1])]); -} - -// CHECK: Sanitizer:DEADLYSIGNAL -// CHECK: Sanitizer: [[SIGNAME]] on unknown address From llvm-commits at lists.llvm.org Wed Oct 9 13:28:42 2019 From: llvm-commits at lists.llvm.org (=?utf-8?q?D=C3=A1vid_Bolvansk=C3=BD_via_Phabricator?= via llvm-commits) Date: Wed, 09 Oct 2019 20:28:42 +0000 (UTC) Subject: [PATCH] D65402: [Attributor][MustExec] Deduce dereferenceable and nonnull attribute using MustBeExecutedContextExplorer In-Reply-To: References: Message-ID: xbolva00 added a comment. Yeah, you probably want to run it multiple times. 
@jeoerfert Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D65402/new/ https://reviews.llvm.org/D65402 From llvm-commits at lists.llvm.org Wed Oct 9 13:28:43 2019 From: llvm-commits at lists.llvm.org (Derek Schuff via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 20:28:43 +0000 (UTC) Subject: [PATCH] D68684: [WebAssembly] Make returns variadic In-Reply-To: References: Message-ID: dschuff added inline comments. ================ Comment at: llvm/lib/Target/WebAssembly/WebAssemblyCFGStackify.cpp:1269 if (MI.getOpcode() == WebAssembly::END_BLOCK) { + assert(MFI.getResults().size() <= 1 && + "Multivalue block signatures not implemented yet"); ---------------- sbc100 wrote: > report_fatal_error so end users see this too? By this time we should have legalized everything, so this can be an assert because it should never happen? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68684/new/ https://reviews.llvm.org/D68684 From llvm-commits at lists.llvm.org Wed Oct 9 13:28:44 2019 From: llvm-commits at lists.llvm.org (Derek Schuff via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 20:28:44 +0000 (UTC) Subject: [PATCH] D68684: [WebAssembly] Make returns variadic In-Reply-To: References: Message-ID: <3e9c8272e3b286ff33bd3b44d987cfa0@localhost.localdomain> dschuff added a comment. Should there be a limit on how many returns we are willing to return in a multi? or should we have no fallback to sret at all? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68684/new/ https://reviews.llvm.org/D68684 From llvm-commits at lists.llvm.org Wed Oct 9 13:29:41 2019 From: llvm-commits at lists.llvm.org (Marco Vanotti via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 20:29:41 +0000 (UTC) Subject: [PATCH] D68724: [libFuzzer] Fix Alarm callback in fuchsia. In-Reply-To: References: Message-ID: charco updated this revision to Diff 224142. charco added a comment. 
Update comments to keep in sync with code. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68724/new/ https://reviews.llvm.org/D68724 Files: compiler-rt/lib/fuzzer/FuzzerLoop.cpp Index: compiler-rt/lib/fuzzer/FuzzerLoop.cpp =================================================================== --- compiler-rt/lib/fuzzer/FuzzerLoop.cpp +++ compiler-rt/lib/fuzzer/FuzzerLoop.cpp @@ -273,9 +273,9 @@ NO_SANITIZE_MEMORY void Fuzzer::AlarmCallback() { assert(Options.UnitTimeoutSec > 0); - // In Windows Alarm callback is executed by a different thread. + // In Windows and Fuchsia, Alarm callback is executed by a different thread. // NetBSD's current behavior needs this change too. -#if !LIBFUZZER_WINDOWS && !LIBFUZZER_NETBSD +#if !LIBFUZZER_WINDOWS && !LIBFUZZER_NETBSD && !LIBFUZZER_FUCHSIA if (!InFuzzingThread()) return; #endif -------------- next part -------------- A non-text attachment was scrubbed... Name: D68724.224142.patch Type: text/x-patch Size: 670 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 13:29:44 2019 From: llvm-commits at lists.llvm.org (Evandro Menezes via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 20:29:44 +0000 (UTC) Subject: [PATCH] D67199: [InstCombine] Expand the simplification of log() In-Reply-To: References: Message-ID: <631129f2dbb5bf8de97ffe3b3d0b49eb@localhost.localdomain> evandro marked an inline comment as done. evandro added a subscriber: craig.topper. evandro added inline comments. ================ Comment at: llvm/trunk/lib/Transforms/Utils/SimplifyLibCalls.cpp:1923 + LibFunc ArgLb = NotLibFunc; + TLI->getLibFunc(ArgNm, ArgLb); + ---------------- efriedma wrote: > This should be using the overload of getLibFunc that takes a CallSite, instead of expanding it out by hand. This formulation skips checks that should happen otherwise (specifically, that it's not an indirect call, that the call isn't marked nobuiltin, and the function has an appropriate signature). 
True, but, as @craig.topper said, the cast at line 1919 above might be returning `nullptr`. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67199/new/ https://reviews.llvm.org/D67199 From llvm-commits at lists.llvm.org Wed Oct 9 13:32:04 2019 From: llvm-commits at lists.llvm.org (Marco Vanotti via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 20:32:04 +0000 (UTC) Subject: [PATCH] D68724: [libFuzzer] Fix Alarm callback in fuchsia. In-Reply-To: References: Message-ID: <7d5d5d463a13e4ad491086e4880420b8@localhost.localdomain> charco marked an inline comment as done. charco added a comment. Thanks for the review! Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68724/new/ https://reviews.llvm.org/D68724 From llvm-commits at lists.llvm.org Wed Oct 9 13:32:05 2019 From: llvm-commits at lists.llvm.org (Aaron Puchert via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 20:32:05 +0000 (UTC) Subject: [PATCH] D51741: [coro]Pass rvalue reference for named local variable to return_value In-Reply-To: References: Message-ID: aaronpuchert added a comment. In D51741#1702038 , @Quuxplusone wrote: > This patch is heavily heavily merge-conflicted by P1825 . Just to be clear, this change is pretty old: it's already contained in Clang 8. I was just adding comments. But since you're here: what is the right `CopyElisionSemanticsKind`, is it `CES_AsIfByStdMove` like this change does it, or `CES_Strict` like `BuildReturnStmt` does it? ================ Comment at: cfe/trunk/lib/Sema/SemaCoroutine.cpp:846 + if (E) { + auto NRVOCandidate = this->getCopyElisionCandidate(E->getType(), E, CES_AsIfByStdMove); + if (NRVOCandidate) { ---------------- modocache wrote: > aaronpuchert wrote: > > Why not `CES_Strict` like in `Sema::BuildReturnStmt`? With `CES_Strict` the test still works, and we can also return references. > So this fixes your test case? If so it sounds good to me. 
I'll make this change or you can feel free to if you get around to it first. I can post a change, I'm just not sure if it's correct. The (Clang) tests run fine, will test with libc++ later today or tomorrow. ================ Comment at: cfe/trunk/lib/Sema/SemaCoroutine.cpp:857-859 // FIXME: If the operand is a reference to a variable that's about to go out // of scope, we should treat the operand as an xvalue for this overload // resolution. ---------------- @Quuxplusone Am I right that your paper basically addresses this comment? Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D51741/new/ https://reviews.llvm.org/D51741 From llvm-commits at lists.llvm.org Wed Oct 9 13:38:10 2019 From: llvm-commits at lists.llvm.org (Craig Topper via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 20:38:10 +0000 (UTC) Subject: [PATCH] D67199: [InstCombine] Expand the simplification of log() In-Reply-To: References: Message-ID: <03f674c291382d39e8a827affcbae14d@localhost.localdomain> craig.topper added inline comments. ================ Comment at: llvm/trunk/lib/Transforms/Utils/SimplifyLibCalls.cpp:1923 + LibFunc ArgLb = NotLibFunc; + TLI->getLibFunc(ArgNm, ArgLb); + ---------------- evandro wrote: > efriedma wrote: > > This should be using the overload of getLibFunc that takes a CallSite, instead of expanding it out by hand. This formulation skips checks that should happen otherwise (specifically, that it's not an indirect call, that the call isn't marked nobuiltin, and the function has an appropriate signature). > True, but, as @craig.topper said, the cast at line 1919 above might be returning `nullptr`. The CallSite version specifically checks that getCalledFunction doesn't return null. Because the case where it returns null is for an indirect call. 
Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67199/new/ https://reviews.llvm.org/D67199 From llvm-commits at lists.llvm.org Wed Oct 9 13:41:39 2019 From: llvm-commits at lists.llvm.org (=?utf-8?q?D=C3=A1vid_Bolvansk=C3=BD_via_Phabricator?= via llvm-commits) Date: Wed, 09 Oct 2019 20:41:39 +0000 (UTC) Subject: [PATCH] D68189: [InstCombine] recognize popcount implemented in hacker's delight. In-Reply-To: References: Message-ID: xbolva00 accepted this revision. xbolva00 added a comment. Ok for me CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68189/new/ https://reviews.llvm.org/D68189 From llvm-commits at lists.llvm.org Wed Oct 9 13:48:50 2019 From: llvm-commits at lists.llvm.org (Vitaly Buka via llvm-commits) Date: Wed, 09 Oct 2019 20:48:50 -0000 Subject: [compiler-rt] r374220 - [sanitizer] Disable signal_trap_handler on s390 Message-ID: <20191009204850.CF7C08091F@lists.llvm.org> Author: vitalybuka Date: Wed Oct 9 13:48:50 2019 New Revision: 374220 URL: http://llvm.org/viewvc/llvm-project?rev=374220&view=rev Log: [sanitizer] Disable signal_trap_handler on s390 Modified: compiler-rt/trunk/test/sanitizer_common/TestCases/Linux/signal_trap_handler.cpp Modified: compiler-rt/trunk/test/sanitizer_common/TestCases/Linux/signal_trap_handler.cpp URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/test/sanitizer_common/TestCases/Linux/signal_trap_handler.cpp?rev=374220&r1=374219&r2=374220&view=diff ============================================================================== --- compiler-rt/trunk/test/sanitizer_common/TestCases/Linux/signal_trap_handler.cpp (original) +++ compiler-rt/trunk/test/sanitizer_common/TestCases/Linux/signal_trap_handler.cpp Wed Oct 9 13:48:50 2019 @@ -1,5 +1,8 @@ // RUN: %clangxx -O1 %s -o %t && %env_tool_opts=handle_sigtrap=1 %run %t 2>&1 | FileCheck %s +// __builtin_debugtrap() does not raise SIGTRAP these platforms. 
+// UNSUPPORTED: s390 + #include #include #include @@ -26,6 +29,8 @@ int main() { assert(a.sa_flags & SA_SIGINFO); in_handler = 1; + // Check that signal handler is not postponed by sanitizer. + // Don't use raise here as it calls any signal handler immediately. __builtin_debugtrap(); in_handler = 0; From llvm-commits at lists.llvm.org Wed Oct 9 13:48:52 2019 From: llvm-commits at lists.llvm.org (Vitaly Buka via llvm-commits) Date: Wed, 09 Oct 2019 20:48:52 -0000 Subject: [llvm] r374221 - [System Model] [TTI] Fix virtual destructor warning Message-ID: <20191009204852.51DA380BD9@lists.llvm.org> Author: vitalybuka Date: Wed Oct 9 13:48:52 2019 New Revision: 374221 URL: http://llvm.org/viewvc/llvm-project?rev=374221&view=rev Log: [System Model] [TTI] Fix virtual destructor warning Modified: llvm/trunk/include/llvm/CodeGen/BasicTTIImpl.h Modified: llvm/trunk/include/llvm/CodeGen/BasicTTIImpl.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/CodeGen/BasicTTIImpl.h?rev=374221&r1=374220&r2=374221&view=diff ============================================================================== --- llvm/trunk/include/llvm/CodeGen/BasicTTIImpl.h (original) +++ llvm/trunk/include/llvm/CodeGen/BasicTTIImpl.h Wed Oct 9 13:48:52 2019 @@ -190,6 +190,7 @@ private: protected: explicit BasicTTIImplBase(const TargetMachine *TM, const DataLayout &DL) : BaseT(DL) {} + virtual ~BasicTTIImplBase() = default; using TargetTransformInfoImplBase::DL; From llvm-commits at lists.llvm.org Wed Oct 9 13:48:54 2019 From: llvm-commits at lists.llvm.org (Vitaly Buka via llvm-commits) Date: Wed, 09 Oct 2019 20:48:54 -0000 Subject: [llvm] r374222 - [System Model] [TTI] Define AMDGPUTTIImpl::getST and AMDGPUTTIImpl::getTLI Message-ID: <20191009204854.3AE3D84CFB@lists.llvm.org> Author: vitalybuka Date: Wed Oct 9 13:48:54 2019 New Revision: 374222 URL: http://llvm.org/viewvc/llvm-project?rev=374222&view=rev Log: [System Model] [TTI] Define AMDGPUTTIImpl::getST and AMDGPUTTIImpl::getTLI To fix 
"infinite recursion" warning. Modified: llvm/trunk/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.h Modified: llvm/trunk/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.h?rev=374222&r1=374221&r2=374222&view=diff ============================================================================== --- llvm/trunk/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.h (original) +++ llvm/trunk/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.h Wed Oct 9 13:48:54 2019 @@ -46,10 +46,18 @@ class AMDGPUTTIImpl final : public Basic Triple TargetTriple; + const TargetSubtargetInfo *ST; + const TargetLoweringBase *TLI; + + const TargetSubtargetInfo *getST() const { return ST; } + const TargetLoweringBase *getTLI() const { return TLI; } + public: explicit AMDGPUTTIImpl(const AMDGPUTargetMachine *TM, const Function &F) - : BaseT(TM, F.getParent()->getDataLayout()), - TargetTriple(TM->getTargetTriple()) {} + : BaseT(TM, F.getParent()->getDataLayout()), + TargetTriple(TM->getTargetTriple()), + ST(static_cast(TM->getSubtargetImpl(F))), + TLI(ST->getTargetLowering()) {} void getUnrollingPreferences(Loop *L, ScalarEvolution &SE, TTI::UnrollingPreferences &UP); From llvm-commits at lists.llvm.org Wed Oct 9 13:47:19 2019 From: llvm-commits at lists.llvm.org (Thomas Lively via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 20:47:19 +0000 (UTC) Subject: [PATCH] D68684: [WebAssembly] Make returns variadic In-Reply-To: References: Message-ID: <98f23f4fe2aaac516c437c81194d4bd1@localhost.localdomain> tlively marked an inline comment as done. tlively added a comment. In D68684#1702183 , @dschuff wrote: > Should there be a limit on how many returns we are willing to return in a multi? or should we have no fallback to sret at all? 
My current thinking is that such a limit should be left up to the ABI logic in the frontend and that in the backend we should turn all aggregate returns by value into multivalue returns with no limit. WDYT? ================ Comment at: llvm/lib/Target/WebAssembly/WebAssemblyCFGStackify.cpp:1269 if (MI.getOpcode() == WebAssembly::END_BLOCK) { + assert(MFI.getResults().size() <= 1 && + "Multivalue block signatures not implemented yet"); ---------------- dschuff wrote: > sbc100 wrote: > > report_fatal_error so end users see this too? > By this time we should have legalized everything, so this can be an assert because it should never happen? @sbc100 is right that users should be able to see these errors. These asserts will be triggered when trying to emit nontrivial code with multivalue enabled. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68684/new/ https://reviews.llvm.org/D68684 From llvm-commits at lists.llvm.org Wed Oct 9 13:47:20 2019 From: llvm-commits at lists.llvm.org (Heejin Ahn via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 20:47:20 +0000 (UTC) Subject: [PATCH] D68527: [WebAssembly] v8x16.swizzle and rewrite BUILD_VECTOR lowering In-Reply-To: References: Message-ID: <46cf314a9bbe25e961060d86de6fd49a@localhost.localdomain> aheejin added a comment. In D68527#1701774 , @tlively wrote: > In D68527#1700939 , @aheejin wrote: > > > If swizzles are a lot more complicated that `v128.const` in execution, doesn't that mean swizzles will likely to take longer to execute in wasm? Why the opposite? > > > Swizzles lower directly to hardware instructions so they are fast for engines to execute. But doing the same operation without a swizzle instruction would require a long sequence of other wasm instructions and therefore be slow to execute. Because this difference is large for swizzles it is a good idea to prefer to use them when possible. 
We are deciding which one among const/swizzle/splat to use based on the number of lanes hit by the instruction. The rest is the number of `replace_lane`s, so I don't think swizzles are more expensive to emulate than others, because after a single const/swizzle/splat, all emulation cost is down to the number of `replace_lane`s...? Anyway, not really related to the CL itself. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68527/new/ https://reviews.llvm.org/D68527 From llvm-commits at lists.llvm.org Wed Oct 9 13:47:21 2019 From: llvm-commits at lists.llvm.org (Xiangling Liao via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 20:47:21 +0000 (UTC) Subject: [PATCH] D68721: [NFC][PowerPC]Clean up PPCAsmPrinter for TOC related pseudo opcode In-Reply-To: References: Message-ID: <046d8a04be35055f3959e728d0b6eacf@localhost.localdomain> Xiangling_L updated this revision to Diff 224149. Xiangling_L marked an inline comment as done. Xiangling_L added a comment. Address comments & add `const` for `lookUpOrCreateTOCEntry` & `TOC` MapVector Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68721/new/ https://reviews.llvm.org/D68721 Files: llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D68721.224149.patch Type: text/x-patch Size: 10164 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 13:47:35 2019 From: llvm-commits at lists.llvm.org (Vitaly Buka via llvm-commits) Date: Wed, 9 Oct 2019 13:47:35 -0700 Subject: [PATCH] D68604: [tsan] Don't delay SIGTRAP handler In-Reply-To: References: Message-ID: Should be fixed after r374220 On Wed, Oct 9, 2019 at 8:17 AM Steven Wan via Phabricator < reviews at reviews.llvm.org> wrote: > stevewan added a comment. > > Hi @vitalybuka, > > This is causing LIT failures in `clang-s390x-linux`. Can you please take a > look? Thanks!
> > Steven > > > Repository: > rL LLVM > > CHANGES SINCE LAST ACTION > https://reviews.llvm.org/D68604/new/ > > https://reviews.llvm.org/D68604 > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From llvm-commits at lists.llvm.org Wed Oct 9 13:52:39 2019 From: llvm-commits at lists.llvm.org (Vitaly Buka via llvm-commits) Date: Wed, 09 Oct 2019 20:52:39 -0000 Subject: [compiler-rt] r374223 - [sanitizer, NFC] Fix grammar in comment Message-ID: <20191009205239.2EC3283C81@lists.llvm.org> Author: vitalybuka Date: Wed Oct 9 13:52:39 2019 New Revision: 374223 URL: http://llvm.org/viewvc/llvm-project?rev=374223&view=rev Log: [sanitizer, NFC] Fix grammar in comment Modified: compiler-rt/trunk/test/sanitizer_common/TestCases/Linux/signal_trap_handler.cpp Modified: compiler-rt/trunk/test/sanitizer_common/TestCases/Linux/signal_trap_handler.cpp URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/test/sanitizer_common/TestCases/Linux/signal_trap_handler.cpp?rev=374223&r1=374222&r2=374223&view=diff ============================================================================== --- compiler-rt/trunk/test/sanitizer_common/TestCases/Linux/signal_trap_handler.cpp (original) +++ compiler-rt/trunk/test/sanitizer_common/TestCases/Linux/signal_trap_handler.cpp Wed Oct 9 13:52:39 2019 @@ -1,6 +1,6 @@ // RUN: %clangxx -O1 %s -o %t && %env_tool_opts=handle_sigtrap=1 %run %t 2>&1 | FileCheck %s -// __builtin_debugtrap() does not raise SIGTRAP these platforms. +// __builtin_debugtrap() does not raise SIGTRAP on these platforms. // UNSUPPORTED: s390 #include From llvm-commits at lists.llvm.org Wed Oct 9 13:52:41 2019 From: llvm-commits at lists.llvm.org (Craig Topper via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 20:52:41 +0000 (UTC) Subject: [PATCH] D68189: [InstCombine] recognize popcount implemented in hacker's delight. In-Reply-To: References: Message-ID: craig.topper added inline comments. 
================ Comment at: llvm/include/llvm/IR/PatternMatch.h:664 /// the value. -inline specific_intval m_SpecificInt(uint64_t V) { return specific_intval(V); } +inline specific_intval m_SpecificInt(APInt V) { return specific_intval(V); } + ---------------- Can we std::move this into the specific_intval constructor and then std::move it again in the class? Otherwise we're making multiple heap allocations whenever the value is more than 64 bits. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68189/new/ https://reviews.llvm.org/D68189 From llvm-commits at lists.llvm.org Wed Oct 9 13:56:54 2019 From: llvm-commits at lists.llvm.org (Reid Kleckner via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 20:56:54 +0000 (UTC) Subject: [PATCH] D68689: [LLD] [MinGW] Look for other library patterns with -l In-Reply-To: References: Message-ID: <8082d84ac673585aca9ef759509f58e2@localhost.localdomain> rnk accepted this revision. rnk added a comment. This revision is now accepted and ready to land. lgtm Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68689/new/ https://reviews.llvm.org/D68689 From llvm-commits at lists.llvm.org Wed Oct 9 14:01:50 2019 From: llvm-commits at lists.llvm.org (Jake Ehrlich via llvm-commits) Date: Wed, 09 Oct 2019 21:01:50 -0000 Subject: [compiler-rt] r374228 - [libFuzzer] Fix Alarm callback in fuchsia. Message-ID: <20191009210150.8FC9282D26@lists.llvm.org> Author: jakehehrlich Date: Wed Oct 9 14:01:50 2019 New Revision: 374228 URL: http://llvm.org/viewvc/llvm-project?rev=374228&view=rev Log: [libFuzzer] Fix Alarm callback in fuchsia. This patch adds an #if macro to skip the InFuzzingThread() comparison for fuchsia, similar to what is done for Windows and NetBSD. In fuchsia, the alarm callback runs in a separate thread[0], making it fail the comparison InFuzzingThread(), breaking the -timeout flag.
[0]: https://github.com/llvm/llvm-project/blob/master/compiler-rt/lib/fuzzer/FuzzerUtilFuchsia.cpp#L323 Author: charco (aka Marco Vanotti) Differential Revision: https://reviews.llvm.org/D68166 Modified: compiler-rt/trunk/lib/fuzzer/FuzzerLoop.cpp Modified: compiler-rt/trunk/lib/fuzzer/FuzzerLoop.cpp URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/fuzzer/FuzzerLoop.cpp?rev=374228&r1=374227&r2=374228&view=diff ============================================================================== --- compiler-rt/trunk/lib/fuzzer/FuzzerLoop.cpp (original) +++ compiler-rt/trunk/lib/fuzzer/FuzzerLoop.cpp Wed Oct 9 14:01:50 2019 @@ -273,9 +273,9 @@ void Fuzzer::InterruptCallback() { NO_SANITIZE_MEMORY void Fuzzer::AlarmCallback() { assert(Options.UnitTimeoutSec > 0); - // In Windows Alarm callback is executed by a different thread. + // In Windows and Fuchsia, Alarm callback is executed by a different thread. // NetBSD's current behavior needs this change too. -#if !LIBFUZZER_WINDOWS && !LIBFUZZER_NETBSD +#if !LIBFUZZER_WINDOWS && !LIBFUZZER_NETBSD && !LIBFUZZER_FUCHSIA if (!InFuzzingThread()) return; #endif From llvm-commits at lists.llvm.org Wed Oct 9 14:02:53 2019 From: llvm-commits at lists.llvm.org (Heejin Ahn via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 21:02:53 +0000 (UTC) Subject: [PATCH] D68684: [WebAssembly] Make returns variadic In-Reply-To: References: Message-ID: <12d5e6174e25ff3287769653a19cc64f@localhost.localdomain> aheejin accepted this revision. aheejin added inline comments. This revision is now accepted and ready to land. ================ Comment at: llvm/lib/Target/WebAssembly/WebAssemblyFastISel.cpp:1309 + // TODO: should probably be <= 2? But be conservative to start... + assert(Ret->getNumOperands() < 2 && "Multivalue return not supported yet"); ---------------- Why <=2? If multivalues are supported in fastisel later, do we still have that limit? 
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68684/new/ https://reviews.llvm.org/D68684 From llvm-commits at lists.llvm.org Wed Oct 9 14:07:05 2019 From: llvm-commits at lists.llvm.org (Arthur O'Dwyer via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 21:07:05 +0000 (UTC) Subject: [PATCH] D51741: [coro]Pass rvalue reference for named local variable to return_value In-Reply-To: References: Message-ID: <07b8717a930722d304088d3a07da8fb1@localhost.localdomain> Quuxplusone added inline comments. ================ Comment at: cfe/trunk/lib/Sema/SemaCoroutine.cpp:857-859 // FIXME: If the operand is a reference to a variable that's about to go out // of scope, we should treat the operand as an xvalue for this overload // resolution. ---------------- aaronpuchert wrote: > @Quuxplusone Am I right that your paper basically addresses this comment? I'm not yet motivated enough to look closely enough to give an //informed// opinion... but my kneejerk impression is that this FIXME comment is saying "we should do implicit move [in the [P1155](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1155r2.html) sense] here." In which case, this patch (D51741) itself fixed this FIXME at least partly, and maybe completely. Maybe this patch should have removed or amended the FIXME, rather than just adding code above it. > But since you're here: what is the right `CopyElisionSemanticsKind`, is it `CES_AsIfByStdMove` like this change does it, or `CES_Strict` like `BuildReturnStmt` does it? Oh geez, asking me about code I introduced... ;) My impression is that the correct thing here is `CES_Strict`. Notice that we have a `CES_FormerDefault` (basically "what were the rules in C++11 before the first DR"), and then a `CES_Default` (basically "what are the rules in C++17, before [P1825](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1825r0.html)"). 
My `-Wreturn-std-move` patch added `CES_AsIfByStdMove` as an **implementation detail** of the diagnostic codepath. I didn't expect anyone to use it in real code. However, now that [P1825](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1825r0.html) has been accepted into C++2a, `CES_AsIfByStdMove` is the actual (draft-)standard behavior, and should rightly be named something like `CES_FutureDefault`! So I would suggest that this code should use `CES_Strict` for now, just like we do in `BuildReturnStmt`. Then, someone should "implement [P1825](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1825r0.html)," which means figuring out what the heck to do about `-Wreturn-std-move`... but whatever they figure out, it'll apply equally cleanly to this code as to `BuildReturnStmt`. I'm confused because both here and `BuildReturnStmt` have calls to `getCopyElisionCandidate` //outside// the call to `PerformMoveOrCopyInitialization`, even though `PerformMoveOrCopyInitialization` will happily accept a null `NRVOCandidate` as a signal to make its own call to `getCopyElisionCandidate`. (And notice that //that// call will use `CES_Default`, which seems right, as opposed to `CES_Strict`, which seems wrong to me, but clearly I've forgotten how this stuff works.) Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D51741/new/ https://reviews.llvm.org/D51741 From llvm-commits at lists.llvm.org Wed Oct 9 14:22:08 2019 From: llvm-commits at lists.llvm.org (Jordan Rupprecht via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 21:22:08 +0000 (UTC) Subject: [PATCH] D68730: [llvm-objdump] Adjust spacing and field width for --section-headers Message-ID: rupprecht created this revision. rupprecht added reviewers: grimar, jhenderson. Herald added subscribers: llvm-commits, seiya, aheejin, arichardson, sbc100, emaste. Herald added a reviewer: espindola. Herald added a project: LLVM. 
- Expand the "Name" column past 13 characters when any of the section names are longer. The current behavior is a staggered output instead of a nice table if a single name is longer. - Only print the required number of hex chars for addresses (i.e. 8 characters for 32-bit, 16 characters for 64-bit) - Fix trailing spaces Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D68730 Files: lld/test/ELF/got32-i386.s lld/test/ELF/got32x-i386.s llvm/test/tools/llvm-objdump/section-headers-address-width.test llvm/test/tools/llvm-objdump/section-headers-name-width.test llvm/test/tools/llvm-objdump/section-headers-spacing.test llvm/test/tools/llvm-objdump/wasm.txt llvm/test/tools/llvm-objdump/xcoff-section-headers.test llvm/tools/llvm-objdump/llvm-objdump.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D68730.224163.patch Type: text/x-patch Size: 13722 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 14:22:09 2019 From: llvm-commits at lists.llvm.org (serge via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 21:22:09 +0000 (UTC) Subject: [PATCH] D68720: Support -fstack-clash-protection for x86 In-Reply-To: References: Message-ID: serge-sans-paille added a comment. @efriedma : there's indeed an intersection with the `probe-stack` attribute. The `probe-stack` attribute (a) forces a function call, and (b) this function call only happens **before** the stack gets expanded.
(a) is probably a performance issue in several cases, plus it requires an extra register (that's mentioned in https://reviews.llvm.org/D9653) (b) is an issue, as pointed out in https://lwn.net/Articles/726587/ (grep for valgrind) : from valgrind point of view, accessing un-allocated stack memory triggers error, and we probably want to please valgrind Doing the call *after* the stack allocation is also not an option, as a signal could be raised between the stack allocation and the stack probing, escaping the stack probe if a custom signal handler is executed. That being said, I do think it would be a good thing to have a special value for `probe-stack`, say `probe-stack=inline-asm`, that would trigger generation of inlined assembly as I do. That way we have all the pieces in one place, with different strategies. And we would have clang set the attribute for each function when `-fstack-clash-protection` is given. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68720/new/ https://reviews.llvm.org/D68720 From llvm-commits at lists.llvm.org Wed Oct 9 14:22:11 2019 From: llvm-commits at lists.llvm.org (Hubert Tong via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 21:22:11 +0000 (UTC) Subject: [PATCH] D66969: Output XCOFF object text section header and symbol entry for program code In-Reply-To: References: Message-ID: hubert.reinterpretcast added inline comments. ================ Comment at: llvm/lib/MC/XCOFFObjectWriter.cpp:518 + Csect.Size = Layout.getSectionAddressSize(MCSec); + Address = Csect.Address + Csect.Size; + Csect.SymbolTableIndex = SymbolTableIndex; ---------------- There's two spaces after the `+`. ================ Comment at: llvm/lib/MC/XCOFFObjectWriter.cpp:296 + // Write the program code control sections one at a time. 
+ uint32_t PreCSectEndAddress = Text.Address; + uint32_t PaddingSize; ---------------- Suggestion: `CurrentAddressLocation` ================ Comment at: llvm/lib/MC/XCOFFObjectWriter.cpp:299 + for (const auto &Csect : ProgramCodeCsects) { + // PaddingSize = Virtual address of current CSect - Virtual end address of + // previous CSect. ---------------- I think the code can be made sufficiently self-explanatory that we don't need a comment here. ================ Comment at: llvm/lib/MC/XCOFFObjectWriter.cpp:302 + PaddingSize = Csect.Address - PreCSectEndAddress; + if (PaddingSize) + W.OS.write_zeros(PaddingSize); ---------------- The above write to `PaddingSize` is only read by the `if` and the use inside the `if`. ``` if (uint32_t PaddingSize = Csect.Address - CurrentAddressLocation) ``` ================ Comment at: llvm/lib/MC/XCOFFObjectWriter.cpp:308 + + // Padding Size of Tail Section = + // Virtual end address of current Section - Virtual end address of last CSect. ---------------- Suggestion: The size of the tail padding in a section is the end virtual address of the current section minus the end virtual address of the last csect in that section. ================ Comment at: llvm/lib/MC/XCOFFObjectWriter.cpp:310 + // Virtual end address of current Section - Virtual end address of last CSect. + if (ProgramCodeCsects.size()) { + PaddingSize = Text.Address + Text.Size - PreCSectEndAddress; ---------------- ``` if (!ProgramCodeCsects.empty()) ``` however, I suggest checking the section and not the group of csects (they aren't the same thing): ``` if (Text.Index != -1) ``` ================ Comment at: llvm/lib/MC/XCOFFObjectWriter.cpp:312 + PaddingSize = Text.Address + Text.Size - PreCSectEndAddress; + if (PaddingSize) + W.OS.write_zeros(PaddingSize); ---------------- Same comment about the write to `PaddingSize`. ================ Comment at: llvm/lib/MC/XCOFFObjectWriter.cpp:531 + + // First Csect of each section do not need padding zero.
We need to + // adjust section virtual address to first Csect's address. ---------------- Use "csect" instead of "Csect" when using the term in an English context where the word would not be capitalized. Suggestion: The first csect of a section can be aligned by adjusting the virtual address of its containing section instead of writing zeroes into the object file. ================ Comment at: llvm/lib/MC/XCOFFObjectWriter.cpp:544 BSS.Index = SectionIndex++; - assert(alignTo(Address, DefaultSectionAlign) == Address && - "Improperly aligned address for section."); - uint32_t StartAddress = Address; + // We use alignment address of previous section as BSS start address. + BSS.Address = Address; ---------------- The difference in the calculation for the virtual address of the `.bss` section and that of the `.text` section might complicate efforts to common up the handling. Note that a change in how the virtual address of `.bss` is calculated is within the scope of this patch because it changes the value from being always zero. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D66969/new/ https://reviews.llvm.org/D66969 From llvm-commits at lists.llvm.org Wed Oct 9 14:22:15 2019 From: llvm-commits at lists.llvm.org (Vitaly Buka via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 21:22:15 +0000 (UTC) Subject: [PATCH] D51018: [sancov] Accommodate sancov and coverage report server for use under Windows In-Reply-To: References: Message-ID: <572b346219167583a8a377de679422bf@localhost.localdomain> vitalybuka added a comment. In D51018#1700811 , @dgg5503 wrote: > @vsk thanks for the review! It looks like the JSON support library implements what `JSONWriter` does in this tool. To reduce maintenance, I've updated sancov to use the JSON support library implementation instead. The only downside to this change is that the JSON text format differs compared to the original implementation. 
I'm open to reverting this diff and simply adding your suggested change which also worked. Let me know what you think. > > EDIT: > I've also updated the title and description to better describe the changes in this diff. I like the change. Could you move the JSONWriter -> JSON refactoring into a separate patch and rebase the Windows changes on top? If you don't have committer access, someone will need to commit it for you, so I don't mind splitting the patch and committing it myself. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D51018/new/ https://reviews.llvm.org/D51018 From llvm-commits at lists.llvm.org Wed Oct 9 14:25:28 2019 From: llvm-commits at lists.llvm.org (David Blaikie via llvm-commits) Date: Wed, 09 Oct 2019 21:25:28 -0000 Subject: [llvm] r374232 - llvm-dwarfdump: Support multiple debug_loclists contributions Message-ID: <20191009212528.80E6A85D2A@lists.llvm.org> Author: dblaikie Date: Wed Oct 9 14:25:28 2019 New Revision: 374232 URL: http://llvm.org/viewvc/llvm-project?rev=374232&view=rev Log: llvm-dwarfdump: Support multiple debug_loclists contributions Also fixing the incorrect "offset" field being computed/printed for each location list.
Added: llvm/trunk/test/tools/llvm-dwarfdump/X86/debug_loclists_multiple.s Modified: llvm/trunk/include/llvm/DebugInfo/DWARF/DWARFDebugLoc.h llvm/trunk/lib/DebugInfo/DWARF/DWARFContext.cpp llvm/trunk/lib/DebugInfo/DWARF/DWARFDebugLoc.cpp llvm/trunk/test/CodeGen/X86/debug-loclists.ll llvm/trunk/test/DebugInfo/X86/dwarfdump-debug-loclists.test llvm/trunk/test/tools/llvm-dwarfdump/X86/debug_loclists_startx_length.s Modified: llvm/trunk/include/llvm/DebugInfo/DWARF/DWARFDebugLoc.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/DebugInfo/DWARF/DWARFDebugLoc.h?rev=374232&r1=374231&r2=374232&view=diff ============================================================================== --- llvm/trunk/include/llvm/DebugInfo/DWARF/DWARFDebugLoc.h (original) +++ llvm/trunk/include/llvm/DebugInfo/DWARF/DWARFDebugLoc.h Wed Oct 9 14:25:28 2019 @@ -99,7 +99,7 @@ private: bool IsLittleEndian; public: - void parse(DataExtractor data, unsigned Version); + void parse(DataExtractor data, uint64_t Offset, uint64_t EndOffset, uint16_t Version); void dump(raw_ostream &OS, uint64_t BaseAddr, const MCRegisterInfo *RegInfo, Optional Offset) const; Modified: llvm/trunk/lib/DebugInfo/DWARF/DWARFContext.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/DebugInfo/DWARF/DWARFContext.cpp?rev=374232&r1=374231&r2=374232&view=diff ============================================================================== --- llvm/trunk/lib/DebugInfo/DWARF/DWARFContext.cpp (original) +++ llvm/trunk/lib/DebugInfo/DWARF/DWARFContext.cpp Wed Oct 9 14:25:28 2019 @@ -290,20 +290,24 @@ static void dumpLoclistsSection(raw_ostr const MCRegisterInfo *MRI, Optional DumpOffset) { uint64_t Offset = 0; - DWARFDebugLoclists Loclists; - DWARFListTableHeader Header(".debug_loclists", "locations"); - if (Error E = Header.extract(Data, &Offset)) { - WithColor::error() << toString(std::move(E)) << '\n'; - return; - } - - Header.dump(OS, DumpOpts); - DataExtractor LocData(Data.getData().drop_front(Offset), - 
Data.isLittleEndian(), Header.getAddrSize()); + while (Data.isValidOffset(Offset)) { + DWARFListTableHeader Header(".debug_loclists", "locations"); + if (Error E = Header.extract(Data, &Offset)) { + WithColor::error() << toString(std::move(E)) << '\n'; + return; + } - Loclists.parse(LocData, Header.getVersion()); - Loclists.dump(OS, 0, MRI, DumpOffset); + Header.dump(OS, DumpOpts); + DataExtractor LocData(Data.getData(), + Data.isLittleEndian(), Header.getAddrSize()); + + DWARFDebugLoclists Loclists; + uint64_t EndOffset = Header.length() + Header.getHeaderOffset(); + Loclists.parse(LocData, Offset, EndOffset, Header.getVersion()); + Loclists.dump(OS, 0, MRI, DumpOffset); + Offset = EndOffset; + } } void DWARFContext::dump( @@ -733,7 +737,7 @@ const DWARFDebugLoclists *DWARFContext:: // Use version 4. DWO does not support the DWARF v5 .debug_loclists yet and // that means we are parsing the new style .debug_loc (pre-standatized version // of the .debug_loclists). - LocDWO->parse(LocData, 4 /* Version */); + LocDWO->parse(LocData, 0, LocData.getData().size(), 4 /* Version */); return LocDWO.get(); } Modified: llvm/trunk/lib/DebugInfo/DWARF/DWARFDebugLoc.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/DebugInfo/DWARF/DWARFDebugLoc.cpp?rev=374232&r1=374231&r2=374232&view=diff ============================================================================== --- llvm/trunk/lib/DebugInfo/DWARF/DWARFDebugLoc.cpp (original) +++ llvm/trunk/lib/DebugInfo/DWARF/DWARFDebugLoc.cpp Wed Oct 9 14:25:28 2019 @@ -187,12 +187,11 @@ DWARFDebugLoclists::parseOneLocationList return LL; } -void DWARFDebugLoclists::parse(DataExtractor data, unsigned Version) { +void DWARFDebugLoclists::parse(DataExtractor data, uint64_t Offset, uint64_t EndOffset, uint16_t Version) { IsLittleEndian = data.isLittleEndian(); AddressSize = data.getAddressSize(); - uint64_t Offset = 0; - while (Offset < data.getData().size()) { + while (Offset < EndOffset) { if (auto LL = parseOneLocationList(data, 
&Offset, Version)) Locations.push_back(std::move(*LL)); else { Modified: llvm/trunk/test/CodeGen/X86/debug-loclists.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/debug-loclists.ll?rev=374232&r1=374231&r2=374232&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/debug-loclists.ll (original) +++ llvm/trunk/test/CodeGen/X86/debug-loclists.ll Wed Oct 9 14:25:28 2019 @@ -12,7 +12,7 @@ ; CHECK: .debug_loclists contents: ; CHECK-NEXT: 0x00000000: locations list header: length = 0x00000015, version = 0x0005, addr_size = 0x08, seg_size = 0x00, offset_entry_count = 0x00000000 -; CHECK-NEXT: 0x00000000: +; CHECK-NEXT: 0x0000000c: ; CHECK-NEXT: [0x0000000000000000, 0x0000000000000004): DW_OP_breg5 RDI+0 ; CHECK-NEXT: [0x0000000000000004, 0x0000000000000012): DW_OP_breg3 RBX+0 Modified: llvm/trunk/test/DebugInfo/X86/dwarfdump-debug-loclists.test URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/DebugInfo/X86/dwarfdump-debug-loclists.test?rev=374232&r1=374231&r2=374232&view=diff ============================================================================== --- llvm/trunk/test/DebugInfo/X86/dwarfdump-debug-loclists.test (original) +++ llvm/trunk/test/DebugInfo/X86/dwarfdump-debug-loclists.test Wed Oct 9 14:25:28 2019 @@ -10,7 +10,7 @@ # CHECK: .debug_loclists contents: # CHECK-NEXT: 0x00000000: locations list header: length = 0x0000002c, version = 0x0005, addr_size = 0x08, seg_size = 0x00, offset_entry_count = 0x00000000 -# CHECK-NEXT: 0x00000000: +# CHECK-NEXT: 0x0000000c: # CHECK-NEXT: [0x0000000000000000, 0x0000000000000010): DW_OP_breg5 RDI+0 # CHECK-NEXT: [0x0000000000000530, 0x0000000000000540): DW_OP_breg6 RBP-8, DW_OP_deref # CHECK-NEXT: [0x0000000000000700, 0x0000000000000710): DW_OP_breg5 RDI+0 Added: llvm/trunk/test/tools/llvm-dwarfdump/X86/debug_loclists_multiple.s URL: 
http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/llvm-dwarfdump/X86/debug_loclists_multiple.s?rev=374232&view=auto ============================================================================== --- llvm/trunk/test/tools/llvm-dwarfdump/X86/debug_loclists_multiple.s (added) +++ llvm/trunk/test/tools/llvm-dwarfdump/X86/debug_loclists_multiple.s Wed Oct 9 14:25:28 2019 @@ -0,0 +1,44 @@ +# RUN: llvm-mc %s -filetype obj -triple x86_64-pc-linux -o %t.o +# RUN: llvm-dwarfdump -v %t.o | FileCheck %s + +# Test dumping of multiple separate debug_loclist contributions +# CHECK: .debug_loclists contents: +# CHECK: 0x00000000: locations list header: +# CHECK: 0x0000000c: +# CHECK: [0x0000000000000001, 0x0000000000000002): DW_OP_consts +7, DW_OP_stack_value +# CHECK: 0x00000014: locations list header: +# CHECK: [0x0000000000000005, 0x0000000000000007): DW_OP_consts +12, DW_OP_stack_value + + .section .debug_loclists,"",@progbits + .long .Ldebug_loclist_table_end0-.Ldebug_loclist_table_start0 # Length +.Ldebug_loclist_table_start0: + .short 5 # Version + .byte 8 # Address size + .byte 0 # Segment selector size + .long 0 # Offset entry count + + .byte 4 # DW_LLE_offset_pair + .uleb128 1 # starting offset + .uleb128 2 # ending offset + .byte 3 # Loc expr size + .byte 17 # DW_OP_consts + .byte 7 # 7 + .byte 159 # DW_OP_stack_value + .byte 0 # DW_LLE_end_of_list +.Ldebug_loclist_table_end0: + .long .Ldebug_loclist_table_end1-.Ldebug_loclist_table_start1 # Length +.Ldebug_loclist_table_start1: + .short 5 # Version + .byte 8 # Address size + .byte 0 # Segment selector size + .long 0 # Offset entry count + + .byte 4 # DW_LLE_offset_pair + .uleb128 5 # starting offset + .uleb128 7 # ending offset + .byte 3 # Loc expr size + .byte 17 # DW_OP_consts + .byte 12 # 12 + .byte 159 # DW_OP_stack_value + .byte 0 # DW_LLE_end_of_list +.Ldebug_loclist_table_end1: Modified: llvm/trunk/test/tools/llvm-dwarfdump/X86/debug_loclists_startx_length.s URL:
http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/llvm-dwarfdump/X86/debug_loclists_startx_length.s?rev=374232&r1=374231&r2=374232&view=diff ============================================================================== --- llvm/trunk/test/tools/llvm-dwarfdump/X86/debug_loclists_startx_length.s (original) +++ llvm/trunk/test/tools/llvm-dwarfdump/X86/debug_loclists_startx_length.s Wed Oct 9 14:25:28 2019 @@ -7,7 +7,7 @@ # CHECK: .debug_loclists contents: # CHECK-NEXT: 0x00000000: locations list header: length = 0x0000000e, version = 0x0005, addr_size = 0x08, seg_size = 0x00, offset_entry_count = 0x00000000 -# CHECK-NEXT: 0x00000000: +# CHECK-NEXT: 0x0000000c: # CHECK-NEXT: Addr idx 1 (w/ length 16): DW_OP_reg5 RDI .section .debug_loclists,"",@progbits From llvm-commits at lists.llvm.org Wed Oct 9 14:25:32 2019 From: llvm-commits at lists.llvm.org (Zachary Turner via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 21:25:32 +0000 (UTC) Subject: [PATCH] D66431: [PDB] Fix bug when using multiple PCH header objects with the same name. In-Reply-To: References: Message-ID: <249d596adfbf3009d4d1aa0fc68c0176@localhost.localdomain> zturner updated this revision to Diff 224164. zturner added a comment. Herald added a subscriber: hiraditya. Rebased this onto tip of trunk. I'm *finally* going to submit this. I had to make a few slight changes because it broke one existing test. I don't remember it breaking this test before, but anyway... The behavioral change is that now we don't distinguish between "mismatched signature" and "missing pch object". We also don't use the relative path comparison logic that I first implemented, because that breaks absolute paths. Instead, we use the previous fileNameOnly logic, but we couple that with a signature check. In other words, we continue searching until we have a match on *both* the signature and the file name. If nothing matches, we just return a generic "nothing matched" error.
This changes some error text, but otherwise it should be more robust and handle every possible case with relative and absolute paths. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D66431/new/ https://reviews.llvm.org/D66431 Files: lld/COFF/PDB.cpp lld/test/COFF/Inputs/precompa/precomp.obj lld/test/COFF/Inputs/precompa/useprecomp.obj lld/test/COFF/Inputs/precompb/precomp.obj lld/test/COFF/Inputs/precompb/useprecomp.obj lld/test/COFF/precomp-link-samename.test lld/test/COFF/precomp-link.test llvm/include/llvm/DebugInfo/PDB/GenericError.h llvm/lib/DebugInfo/PDB/GenericError.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D66431.224164.patch Type: text/x-patch Size: 5510 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 14:31:59 2019 From: llvm-commits at lists.llvm.org (Hubert Tong via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 21:31:59 +0000 (UTC) Subject: [PATCH] D66969: Output XCOFF object text section header and symbol entry for program code In-Reply-To: References: Message-ID: hubert.reinterpretcast added inline comments. ================ Comment at: llvm/test/CodeGen/PowerPC/aix-xcoff-lcomm.ll:56 ; SYMS-NEXT: Symbol { -; SYMS-NEXT: Index: [[#Index:]] -; SYMS-NEXT: Name: a +; SYMS: Index: [[#Index:]]{{[[:space:]]*}}Name: a ; SYMS-NEXT: Value (RelocatableAddress): 0x0 ---------------- It seems this file was changed accidentally by today's updates. The ` ` (space) character before the `*` is correct. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D66969/new/ https://reviews.llvm.org/D66969 From llvm-commits at lists.llvm.org Wed Oct 9 14:32:00 2019 From: llvm-commits at lists.llvm.org (Chris Bieneman via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 21:32:00 +0000 (UTC) Subject: [PATCH] D68732: Break out OrcError and RPC Message-ID: beanz created this revision. beanz added a reviewer: lhames. Herald added subscribers: hiraditya, mgorny. 
Herald added a project: LLVM. When creating an ORC remote JIT target, the current library split forces the target process to link large portions of LLVM (Core, Execution Engine, JITLink, Object, MC, Passes, RuntimeDyld, Support, Target, and TransformUtils). This occurs because the ORC RPC interfaces rely on the static globals the ORC Error types require, which starts a cycle of pulling in more and more. This patch breaks the ORC RPC Error implementations out into an "OrcError" library which only depends on LLVM Support. It also pulls the ORC RPC headers into their own subdirectory. With this patch, code can include the Orc/RPC/*.h headers and will only incur link dependencies on LLVMOrcError and LLVMSupport. Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D68732 Files: llvm/examples/Kaleidoscope/BuildingAJIT/Chapter5/RemoteJITUtils.h llvm/include/llvm/ExecutionEngine/Orc/OrcRemoteTargetRPCAPI.h llvm/include/llvm/ExecutionEngine/Orc/RPC/RPCSerialization.h llvm/include/llvm/ExecutionEngine/Orc/RPC/RPCUtils.h llvm/include/llvm/ExecutionEngine/Orc/RPC/RawByteChannel.h llvm/include/llvm/ExecutionEngine/Orc/RPCSerialization.h llvm/include/llvm/ExecutionEngine/Orc/RPCUtils.h llvm/include/llvm/ExecutionEngine/Orc/RawByteChannel.h llvm/lib/ExecutionEngine/CMakeLists.txt llvm/lib/ExecutionEngine/LLVMBuild.txt llvm/lib/ExecutionEngine/Orc/CMakeLists.txt llvm/lib/ExecutionEngine/Orc/LLVMBuild.txt llvm/lib/ExecutionEngine/Orc/OrcError.cpp llvm/lib/ExecutionEngine/Orc/RPCUtils.cpp llvm/lib/ExecutionEngine/OrcError/CMakeLists.txt llvm/lib/ExecutionEngine/OrcError/LLVMBuild.txt llvm/lib/ExecutionEngine/OrcError/OrcError.cpp llvm/lib/ExecutionEngine/OrcError/RPCError.cpp llvm/tools/lli/RemoteJITUtils.h llvm/unittests/ExecutionEngine/Orc/QueueChannel.h llvm/unittests/ExecutionEngine/Orc/RPCUtilsTest.cpp -------------- next part -------------- A non-text attachment was scrubbed...
Name: D68732.224165.patch Type: text/x-patch Size: 13318 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 14:32:03 2019 From: llvm-commits at lists.llvm.org (Erich Keane via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 21:32:03 +0000 (UTC) Subject: [PATCH] D68521: [PATCH 36/38] [noalias] Clang CodeGen for restrict-qualified pointers In-Reply-To: References: Message-ID: <14b5ef2964442834308130881ffd6165@localhost.localdomain> erichkeane added inline comments. ================ Comment at: clang/lib/AST/Type.cpp:115 + } + } else if (const auto *ArrayTy = dyn_cast(CannonTy)) { + return ArrayTy->getElementType().isRestrictOrContainsRestrictMembers(); ---------------- Rather than this recursion, could you just unpack it from the CannonTy? So replace 107 with: Type *CannonTy = getCanonicalType()->getBaseElementTypeUnsafe(); That way this is all unpacked and saves you a recursion. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68521/new/ https://reviews.llvm.org/D68521 From llvm-commits at lists.llvm.org Wed Oct 9 14:32:04 2019 From: llvm-commits at lists.llvm.org (Thomas Lively via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 21:32:04 +0000 (UTC) Subject: [PATCH] D68684: [WebAssembly] Make returns variadic In-Reply-To: References: Message-ID: tlively updated this revision to Diff 224167. tlively added a comment. 
- Use `report_fatal_error` to report unimplemented functionality Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68684/new/ https://reviews.llvm.org/D68684 Files: llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyInstPrinter.cpp llvm/lib/Target/WebAssembly/WebAssemblyAsmPrinter.cpp llvm/lib/Target/WebAssembly/WebAssemblyCFGStackify.cpp llvm/lib/Target/WebAssembly/WebAssemblyFastISel.cpp llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp llvm/lib/Target/WebAssembly/WebAssemblyInstrControl.td llvm/lib/Target/WebAssembly/WebAssemblyInstrInfo.td llvm/lib/Target/WebAssembly/WebAssemblyMachineFunctionInfo.cpp llvm/lib/Target/WebAssembly/WebAssemblyPeephole.cpp llvm/test/CodeGen/MIR/WebAssembly/int-type-register-class-name.mir llvm/test/CodeGen/WebAssembly/atomic-fence.mir llvm/test/CodeGen/WebAssembly/eh-labels.mir llvm/test/CodeGen/WebAssembly/explicit-locals.mir llvm/test/CodeGen/WebAssembly/function-info.mir llvm/test/CodeGen/WebAssembly/llround-conv-i32.ll llvm/test/CodeGen/WebAssembly/multivalue.ll llvm/test/CodeGen/WebAssembly/reg-argument.mir llvm/test/CodeGen/WebAssembly/reg-copy.mir llvm/test/DebugInfo/WebAssembly/dbg-value-move-clone.mir llvm/test/DebugInfo/WebAssembly/dbg-value-move-reg-stackify.mir -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68684.224167.patch Type: text/x-patch Size: 26258 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 14:36:03 2019 From: llvm-commits at lists.llvm.org (Wei Mi via llvm-commits) Date: Wed, 09 Oct 2019 21:36:03 -0000 Subject: [llvm] r374233 - [SampleFDO] Add indexing for function profiles so they can be loaded on demand Message-ID: <20191009213603.4F6688374E@lists.llvm.org> Author: wmi Date: Wed Oct 9 14:36:03 2019 New Revision: 374233 URL: http://llvm.org/viewvc/llvm-project?rev=374233&view=rev Log: [SampleFDO] Add indexing for function profiles so they can be loaded on demand in ExtBinary format Currently for Text, Binary and ExtBinary format profiles, when we compile a module with samplefdo, even if there is no function showing up in the profile, we have to load all the function profiles from the profile input. That is a waste of compile time. The CompactBinary format already supports loading function profiles on demand. In this patch, we add support for loading profiles on demand in the ExtBinary format. It works whether or not the sections in the ExtBinary format profile are compressed. An experiment shows it reduces the time to compile a server benchmark by 30%. When profile remapping and loading function profiles on demand are both used, extra work needs to be done so that the on-demand loading process takes the name remapping into consideration. It will be addressed in a follow-up patch.
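As a rough illustration of the indexing idea described in the log above (not the actual reader or writer code from this commit), the sketch below records each function's offset while serializing and later reads only the entries a module asks for. The record layout, a single 8-byte little-endian sample count, and the names `ProfileBlob`, `writeProfile`, `readProfileAt` and `loadOnDemand` are assumptions made up for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical container: serialized profiles laid out back to back, plus a
// table mapping each function name to the offset of its record.
struct ProfileBlob {
  std::vector<uint8_t> Section;
  std::map<std::string, uint64_t> OffsetTable;
};

// Writer side: note the record's offset before appending it, so the offset
// table is populated while the profile section is being written.
void writeProfile(ProfileBlob &Blob, const std::string &Name,
                  uint64_t Samples) {
  Blob.OffsetTable[Name] = Blob.Section.size();
  for (int I = 0; I < 8; ++I)
    Blob.Section.push_back(uint8_t(Samples >> (8 * I)));
}

// Decode the 8-byte little-endian record that starts at Offset.
uint64_t readProfileAt(const ProfileBlob &Blob, uint64_t Offset) {
  assert(Offset + 8 <= Blob.Section.size() && "record out of bounds");
  uint64_t Value = 0;
  for (int I = 0; I < 8; ++I)
    Value |= uint64_t(Blob.Section[Offset + I]) << (8 * I);
  return Value;
}

// Reader side: decode only the functions the current module uses and skip
// names that have no entry in the offset table.
std::map<std::string, uint64_t>
loadOnDemand(const ProfileBlob &Blob, const std::set<std::string> &FuncsToUse) {
  std::map<std::string, uint64_t> Loaded;
  for (const std::string &Name : FuncsToUse) {
    auto It = Blob.OffsetTable.find(Name);
    if (It == Blob.OffsetTable.end())
      continue; // no profile recorded for this function
    Loaded[Name] = readProfileAt(Blob, It->second);
  }
  return Loaded;
}
```

The point of the design is that the cost of reading scales with the number of functions in the module being compiled rather than with the size of the whole profile; the offset table itself still has to be read in full.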
Differential Revision: https://reviews.llvm.org/D68601 Modified: llvm/trunk/include/llvm/ProfileData/SampleProf.h llvm/trunk/include/llvm/ProfileData/SampleProfReader.h llvm/trunk/include/llvm/ProfileData/SampleProfWriter.h llvm/trunk/lib/ProfileData/SampleProfReader.cpp llvm/trunk/lib/ProfileData/SampleProfWriter.cpp llvm/trunk/lib/Transforms/IPO/SampleProfile.cpp llvm/trunk/test/Transforms/SampleProfile/Inputs/inline.extbinary.afdo llvm/trunk/test/Transforms/SampleProfile/Inputs/profsampleacc.extbinary.afdo llvm/trunk/unittests/ProfileData/SampleProfTest.cpp Modified: llvm/trunk/include/llvm/ProfileData/SampleProf.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/ProfileData/SampleProf.h?rev=374233&r1=374232&r2=374233&view=diff ============================================================================== --- llvm/trunk/include/llvm/ProfileData/SampleProf.h (original) +++ llvm/trunk/include/llvm/ProfileData/SampleProf.h Wed Oct 9 14:36:03 2019 @@ -120,6 +120,7 @@ enum SecType { SecProfSummary = 1, SecNameTable = 2, SecProfileSymbolList = 3, + SecFuncOffsetTable = 4, // marker for the first type of profile. SecFuncProfileFirst = 32, SecLBRProfile = SecFuncProfileFirst @@ -135,6 +136,8 @@ static inline std::string getSecName(Sec return "NameTableSection"; case SecProfileSymbolList: return "ProfileSymbolListSection"; + case SecFuncOffsetTable: + return "FuncOffsetTableSection"; case SecLBRProfile: return "LBRProfileSection"; } Modified: llvm/trunk/include/llvm/ProfileData/SampleProfReader.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/ProfileData/SampleProfReader.h?rev=374233&r1=374232&r2=374233&view=diff ============================================================================== --- llvm/trunk/include/llvm/ProfileData/SampleProfReader.h (original) +++ llvm/trunk/include/llvm/ProfileData/SampleProfReader.h Wed Oct 9 14:36:03 2019 @@ -279,7 +279,7 @@ public: /// Print the profile for \p FName on stream \p OS. 
void dumpFunctionProfile(StringRef FName, raw_ostream &OS = dbgs()); - virtual void collectFuncsToUse(const Module &M) {} + virtual void collectFuncsFrom(const Module &M) {} /// Print all the profiles on stream \p OS. void dump(raw_ostream &OS = dbgs()); @@ -424,7 +424,7 @@ protected: bool at_eof() const { return Data >= End; } /// Read the next function profile instance. - std::error_code readFuncProfile(); + std::error_code readFuncProfile(const uint8_t *Start); /// Read the contents of the given profile instance. std::error_code readProfile(FunctionSamples &FProfile); @@ -526,7 +526,17 @@ private: virtual std::error_code verifySPMagic(uint64_t Magic) override; virtual std::error_code readOneSection(const uint8_t *Start, uint64_t Size, SecType Type) override; - std::error_code readProfileSymbolList(uint64_t Size); + std::error_code readProfileSymbolList(); + std::error_code readFuncOffsetTable(); + std::error_code readFuncProfiles(); + + /// The table mapping from function name to the offset of its FunctionSample + /// towards file start. + DenseMap FuncOffsetTable; + /// The set containing the functions to use when compiling a module. + DenseSet FuncsToUse; + /// Use all functions from the input profile. + bool UseAllFuncs = true; public: SampleProfileReaderExtBinary(std::unique_ptr B, LLVMContext &C, @@ -539,6 +549,9 @@ public: virtual std::unique_ptr getProfileSymbolList() override { return std::move(ProfSymList); }; + + /// Collect functions with definitions in Module \p M. + void collectFuncsFrom(const Module &M) override; }; class SampleProfileReaderCompactBinary : public SampleProfileReaderBinary { @@ -571,7 +584,7 @@ public: std::error_code read() override; /// Collect functions to be used when compiling Module \p M. 
- void collectFuncsToUse(const Module &M) override; + void collectFuncsFrom(const Module &M) override; }; using InlineCallStack = SmallVector; Modified: llvm/trunk/include/llvm/ProfileData/SampleProfWriter.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/ProfileData/SampleProfWriter.h?rev=374233&r1=374232&r2=374233&view=diff ============================================================================== --- llvm/trunk/include/llvm/ProfileData/SampleProfWriter.h (original) +++ llvm/trunk/include/llvm/ProfileData/SampleProfWriter.h Wed Oct 9 14:36:03 2019 @@ -153,14 +153,15 @@ public: protected: uint64_t markSectionStart(SecType Type); std::error_code addNewSection(SecType Sec, uint64_t SectionStart); - virtual void initSectionLayout() = 0; + virtual void initSectionHdrLayout() = 0; virtual std::error_code writeSections(const StringMap &ProfileMap) = 0; - // Specifiy the section layout in the profile. Note that the order in - // SecHdrTable (order to collect sections) may be different from the - // order in SectionLayout (order to write out sections into profile). - SmallVector SectionLayout; + // Specifiy the order of sections in section header table. Note + // the order of sections in the profile may be different that the + // order in SectionHdrLayout. sample Reader will follow the order + // in SectionHdrLayout to read each section. 
+ SmallVector SectionHdrLayout; private: void allocSecHdrTable(); @@ -193,23 +194,44 @@ class SampleProfileWriterExtBinary : pub public: SampleProfileWriterExtBinary(std::unique_ptr &OS) : SampleProfileWriterExtBinaryBase(OS) { - initSectionLayout(); + initSectionHdrLayout(); } + virtual std::error_code writeSample(const FunctionSamples &S) override; virtual void setProfileSymbolList(ProfileSymbolList *PSL) override { ProfSymList = PSL; }; private: - virtual void initSectionLayout() override { - SectionLayout = {{SecProfSummary, 0, 0, 0}, - {SecNameTable, 0, 0, 0}, - {SecLBRProfile, 0, 0, 0}, - {SecProfileSymbolList, 0, 0, 0}}; + virtual void initSectionHdrLayout() override { + // Note that SecFuncOffsetTable section is written after SecLBRProfile + // in the profile, but is put before SecLBRProfile in SectionHdrLayout. + // + // This is because sample reader follows the order of SectionHdrLayout to + // read each section, to read function profiles on demand sample reader + // need to get the offset of each function profile first. + // + // SecFuncOffsetTable section is written after SecLBRProfile in the + // profile because FuncOffsetTable needs to be populated while section + // SecLBRProfile is written. + SectionHdrLayout = {{SecProfSummary, 0, 0, 0}, + {SecNameTable, 0, 0, 0}, + {SecFuncOffsetTable, 0, 0, 0}, + {SecLBRProfile, 0, 0, 0}, + {SecProfileSymbolList, 0, 0, 0}}; }; virtual std::error_code writeSections(const StringMap &ProfileMap) override; ProfileSymbolList *ProfSymList = nullptr; + + // Save the start of SecLBRProfile so we can compute the offset to the + // start of SecLBRProfile for each Function's Profile and will keep it + // in FuncOffsetTable. + uint64_t SecLBRProfileStart; + // FuncOffsetTable maps function name to its profile offset in SecLBRProfile + // section. It is used to load function profile on demand. 
+ MapVector FuncOffsetTable; + std::error_code writeFuncOffsetTable(); }; // CompactBinary is a compact format of binary profile which both reduces Modified: llvm/trunk/lib/ProfileData/SampleProfReader.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/ProfileData/SampleProfReader.cpp?rev=374233&r1=374232&r2=374233&view=diff ============================================================================== --- llvm/trunk/lib/ProfileData/SampleProfReader.cpp (original) +++ llvm/trunk/lib/ProfileData/SampleProfReader.cpp Wed Oct 9 14:36:03 2019 @@ -439,7 +439,9 @@ SampleProfileReaderBinary::readProfile(F return sampleprof_error::success; } -std::error_code SampleProfileReaderBinary::readFuncProfile() { +std::error_code +SampleProfileReaderBinary::readFuncProfile(const uint8_t *Start) { + Data = Start; auto NumHeadSamples = readNumber(); if (std::error_code EC = NumHeadSamples.getError()) return EC; @@ -461,7 +463,7 @@ std::error_code SampleProfileReaderBinar std::error_code SampleProfileReaderBinary::read() { while (!at_eof()) { - if (std::error_code EC = readFuncProfile()) + if (std::error_code EC = readFuncProfile(Data)) return EC; } @@ -483,13 +485,15 @@ SampleProfileReaderExtBinary::readOneSec return EC; break; case SecLBRProfile: - while (Data < Start + Size) { - if (std::error_code EC = readFuncProfile()) - return EC; - } + if (std::error_code EC = readFuncProfiles()) + return EC; break; case SecProfileSymbolList: - if (std::error_code EC = readProfileSymbolList(Size)) + if (std::error_code EC = readProfileSymbolList()) + return EC; + break; + case SecFuncOffsetTable: + if (std::error_code EC = readFuncOffsetTable()) return EC; break; default: @@ -498,15 +502,65 @@ SampleProfileReaderExtBinary::readOneSec return sampleprof_error::success; } -std::error_code -SampleProfileReaderExtBinary::readProfileSymbolList(uint64_t Size) { +void SampleProfileReaderExtBinary::collectFuncsFrom(const Module &M) { + UseAllFuncs = false; + FuncsToUse.clear(); + for (auto &F 
: M) + FuncsToUse.insert(FunctionSamples::getCanonicalFnName(F)); +} + +std::error_code SampleProfileReaderExtBinary::readFuncOffsetTable() { + auto Size = readNumber(); + if (std::error_code EC = Size.getError()) + return EC; + + FuncOffsetTable.reserve(*Size); + for (uint32_t I = 0; I < *Size; ++I) { + auto FName(readStringFromTable()); + if (std::error_code EC = FName.getError()) + return EC; + + auto Offset = readNumber(); + if (std::error_code EC = Offset.getError()) + return EC; + + FuncOffsetTable[*FName] = *Offset; + } + return sampleprof_error::success; +} + +std::error_code SampleProfileReaderExtBinary::readFuncProfiles() { + const uint8_t *Start = Data; + if (UseAllFuncs) { + while (Data < End) { + if (std::error_code EC = readFuncProfile(Data)) + return EC; + } + assert(Data == End && "More data is read than expected"); + return sampleprof_error::success; + } + + for (auto Name : FuncsToUse) { + auto iter = FuncOffsetTable.find(Name); + if (iter == FuncOffsetTable.end()) + continue; + const uint8_t *FuncProfileAddr = Start + iter->second; + assert(FuncProfileAddr < End && "out of LBRProfile section"); + if (std::error_code EC = readFuncProfile(FuncProfileAddr)) + return EC; + } + Data = End; + return sampleprof_error::success; +} + +std::error_code SampleProfileReaderExtBinary::readProfileSymbolList() { if (!ProfSymList) ProfSymList = std::make_unique(); - if (std::error_code EC = ProfSymList->read(Data, Size)) + if (std::error_code EC = ProfSymList->read(Data, End - Data)) return EC; - Data = Data + Size; + Data = End; return sampleprof_error::success; } @@ -600,9 +654,9 @@ std::error_code SampleProfileReaderCompa for (auto Offset : OffsetsToUse) { const uint8_t *SavedData = Data; - Data = reinterpret_cast(Buffer->getBufferStart()) + - Offset; - if (std::error_code EC = readFuncProfile()) + if (std::error_code EC = readFuncProfile( + reinterpret_cast(Buffer->getBufferStart()) + + Offset)) return EC; Data = SavedData; } @@ -719,8 +773,16 @@ uint64_t 
SampleProfileReaderExtBinaryBas } uint64_t SampleProfileReaderExtBinaryBase::getFileSize() { - auto &LastEntry = SecHdrTable.back(); - return LastEntry.Offset + LastEntry.Size; + // Sections in SecHdrTable is not necessarily in the same order as + // sections in the profile because section like FuncOffsetTable needs + // to be written after section LBRProfile but needs to be read before + // section LBRProfile, so we cannot simply use the last entry in + // SecHdrTable to calculate the file size. + uint64_t FileSize = 0; + for (auto &Entry : SecHdrTable) { + FileSize = std::max(Entry.Offset + Entry.Size, FileSize); + } + return FileSize; } bool SampleProfileReaderExtBinaryBase::dumpSectionInfo(raw_ostream &OS) { @@ -812,13 +874,11 @@ std::error_code SampleProfileReaderCompa return sampleprof_error::success; } -void SampleProfileReaderCompactBinary::collectFuncsToUse(const Module &M) { +void SampleProfileReaderCompactBinary::collectFuncsFrom(const Module &M) { UseAllFuncs = false; FuncsToUse.clear(); - for (auto &F : M) { - StringRef CanonName = FunctionSamples::getCanonicalFnName(F); - FuncsToUse.insert(CanonName); - } + for (auto &F : M) + FuncsToUse.insert(FunctionSamples::getCanonicalFnName(F)); } std::error_code SampleProfileReaderBinary::readSummaryEntry( Modified: llvm/trunk/lib/ProfileData/SampleProfWriter.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/ProfileData/SampleProfWriter.cpp?rev=374233&r1=374232&r2=374233&view=diff ============================================================================== --- llvm/trunk/lib/ProfileData/SampleProfWriter.cpp (original) +++ llvm/trunk/lib/ProfileData/SampleProfWriter.cpp Wed Oct 9 14:36:03 2019 @@ -76,7 +76,7 @@ SampleProfileWriter::write(const StringM SecHdrTableEntry & SampleProfileWriterExtBinaryBase::getEntryInLayout(SecType Type) { auto SecIt = std::find_if( - SectionLayout.begin(), SectionLayout.end(), + SectionHdrLayout.begin(), SectionHdrLayout.end(), [=](const auto &Entry) -> bool { return 
Entry.Type == Type; }); return *SecIt; } @@ -143,6 +143,29 @@ std::error_code SampleProfileWriterExtBi return sampleprof_error::success; } +std::error_code +SampleProfileWriterExtBinary::writeSample(const FunctionSamples &S) { + uint64_t Offset = OutputStream->tell(); + StringRef Name = S.getName(); + FuncOffsetTable[Name] = Offset - SecLBRProfileStart; + encodeULEB128(S.getHeadSamples(), *OutputStream); + return writeBody(S); +} + +std::error_code SampleProfileWriterExtBinary::writeFuncOffsetTable() { + auto &OS = *OutputStream; + + // Write out the table size. + encodeULEB128(FuncOffsetTable.size(), OS); + + // Write out FuncOffsetTable. + for (auto entry : FuncOffsetTable) { + writeNameIdx(entry.first); + encodeULEB128(entry.second, OS); + } + return sampleprof_error::success; +} + std::error_code SampleProfileWriterExtBinary::writeSections( const StringMap &ProfileMap) { uint64_t SectionStart = markSectionStart(SecProfSummary); @@ -163,6 +186,7 @@ std::error_code SampleProfileWriterExtBi return EC; SectionStart = markSectionStart(SecLBRProfile); + SecLBRProfileStart = OutputStream->tell(); if (std::error_code EC = writeFuncProfiles(ProfileMap)) return EC; if (std::error_code EC = addNewSection(SecLBRProfile, SectionStart)) @@ -178,6 +202,12 @@ std::error_code SampleProfileWriterExtBi if (std::error_code EC = addNewSection(SecProfileSymbolList, SectionStart)) return EC; + SectionStart = markSectionStart(SecFuncOffsetTable); + if (std::error_code EC = writeFuncOffsetTable()) + return EC; + if (std::error_code EC = addNewSection(SecFuncOffsetTable, SectionStart)) + return EC; + return sampleprof_error::success; } @@ -359,7 +389,7 @@ std::error_code SampleProfileWriterBinar } void SampleProfileWriterExtBinaryBase::setToCompressAllSections() { - for (auto &Entry : SectionLayout) + for (auto &Entry : SectionHdrLayout) addSecFlags(Entry, SecFlagCompress); } @@ -369,7 +399,7 @@ void SampleProfileWriterExtBinaryBase::s void 
SampleProfileWriterExtBinaryBase::addSectionFlags(SecType Type, SecFlags Flags) { - for (auto &Entry : SectionLayout) { + for (auto &Entry : SectionHdrLayout) { if (Entry.Type == Type) addSecFlags(Entry, Flags); } @@ -378,9 +408,9 @@ void SampleProfileWriterExtBinaryBase::a void SampleProfileWriterExtBinaryBase::allocSecHdrTable() { support::endian::Writer Writer(*OutputStream, support::little); - Writer.write(static_cast(SectionLayout.size())); + Writer.write(static_cast(SectionHdrLayout.size())); SecHdrTableOffset = OutputStream->tell(); - for (uint32_t i = 0; i < SectionLayout.size(); i++) { + for (uint32_t i = 0; i < SectionHdrLayout.size(); i++) { Writer.write(static_cast(-1)); Writer.write(static_cast(-1)); Writer.write(static_cast(-1)); @@ -402,14 +432,15 @@ std::error_code SampleProfileWriterExtBi IndexMap.insert({static_cast(SecHdrTable[i].Type), i}); } - // Write the sections in the order specified in SectionLayout. - // That is the sections order Reader will see. Note that the - // sections order in which Reader expects to read may be different - // from the order in which Writer is able to write, so we need - // to adjust the order in SecHdrTable to be consistent with - // SectionLayout when we write SecHdrTable to the memory. - for (uint32_t i = 0; i < SectionLayout.size(); i++) { - uint32_t idx = IndexMap[static_cast(SectionLayout[i].Type)]; + // Write the section header table in the order specified in + // SectionHdrLayout. That is the sections order Reader will see. + // Note that the sections order in which Reader expects to read + // may be different from the order in which Writer is able to + // write, so we need to adjust the order in SecHdrTable to be + // consistent with SectionHdrLayout when we write SecHdrTable + // to the memory. 
+ for (uint32_t i = 0; i < SectionHdrLayout.size(); i++) { + uint32_t idx = IndexMap[static_cast(SectionHdrLayout[i].Type)]; Writer.write(static_cast(SecHdrTable[idx].Type)); Writer.write(static_cast(SecHdrTable[idx].Flags)); Writer.write(static_cast(SecHdrTable[idx].Offset)); Modified: llvm/trunk/lib/Transforms/IPO/SampleProfile.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/IPO/SampleProfile.cpp?rev=374233&r1=374232&r2=374233&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/IPO/SampleProfile.cpp (original) +++ llvm/trunk/lib/Transforms/IPO/SampleProfile.cpp Wed Oct 9 14:36:03 2019 @@ -1682,7 +1682,7 @@ bool SampleProfileLoader::doInitializati return false; } Reader = std::move(ReaderOrErr.get()); - Reader->collectFuncsToUse(M); + Reader->collectFuncsFrom(M); ProfileIsValid = (Reader->read() == sampleprof_error::success); PSL = Reader->getProfileSymbolList(); Modified: llvm/trunk/test/Transforms/SampleProfile/Inputs/inline.extbinary.afdo URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/SampleProfile/Inputs/inline.extbinary.afdo?rev=374233&r1=374232&r2=374233&view=diff ============================================================================== Binary files - no diff available. Modified: llvm/trunk/test/Transforms/SampleProfile/Inputs/profsampleacc.extbinary.afdo URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/SampleProfile/Inputs/profsampleacc.extbinary.afdo?rev=374233&r1=374232&r2=374233&view=diff ============================================================================== Binary files - no diff available. 
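The indexing idea in the SampleProfWriter.cpp change above (record each function's offset relative to the start of SecLBRProfile, then emit the offsets as a separate table) can be pictured with a small standalone sketch. This is not the LLVM implementation: ToyProfile, writeSample and readHeadSamples here are hypothetical stand-ins, the ULEB128 helpers are re-implemented locally rather than taken from llvm/Support/LEB128.h, the table is kept in memory instead of being serialized through writeNameIdx, and the sample counts used for _Z3fooi are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Append V to Out as unsigned LEB128: 7 payload bits per byte, with the
// high bit set on every byte except the last one.
void encodeULEB128(uint64_t V, std::vector<uint8_t> &Out) {
  do {
    uint8_t Byte = V & 0x7f;
    V >>= 7;
    if (V != 0)
      Byte |= 0x80;
    Out.push_back(Byte);
  } while (V != 0);
}

// Decode one ULEB128 value starting at Pos and advance Pos past it.
uint64_t decodeULEB128(const std::vector<uint8_t> &In, size_t &Pos) {
  uint64_t V = 0;
  unsigned Shift = 0;
  uint8_t Byte;
  do {
    Byte = In[Pos++];
    V |= uint64_t(Byte & 0x7f) << Shift;
    Shift += 7;
  } while (Byte & 0x80);
  return V;
}

// Hypothetical container: Body plays the role of the SecLBRProfile bytes,
// Offsets plays the role of FuncOffsetTable.
struct ToyProfile {
  std::vector<uint8_t> Body;
  std::map<std::string, uint64_t> Offsets;
};

// Remember where this function's record starts (relative to the start of
// the body), then write a toy record: head samples, then total samples.
void writeSample(ToyProfile &P, const std::string &Name, uint64_t HeadSamples,
                 uint64_t TotalSamples) {
  P.Offsets[Name] = P.Body.size();
  encodeULEB128(HeadSamples, P.Body);
  encodeULEB128(TotalSamples, P.Body);
}

// On-demand read: seek straight to the requested function's record without
// decoding the records of any other function.
bool readHeadSamples(const ToyProfile &P, const std::string &Name,
                     uint64_t &Head) {
  auto It = P.Offsets.find(Name);
  if (It == P.Offsets.end())
    return false;
  size_t Pos = It->second;
  Head = decodeULEB128(P.Body, Pos);
  return true;
}
```

With such a table a reader that only cares about the functions present in a module can skip every other record, which is the behaviour the updated unit test checks for _Z3bazi.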
Modified: llvm/trunk/unittests/ProfileData/SampleProfTest.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/unittests/ProfileData/SampleProfTest.cpp?rev=374233&r1=374232&r2=374233&view=diff ============================================================================== --- llvm/trunk/unittests/ProfileData/SampleProfTest.cpp (original) +++ llvm/trunk/unittests/ProfileData/SampleProfTest.cpp Wed Oct 9 14:36:03 2019 @@ -54,7 +54,7 @@ struct SampleProfTest : ::testing::Test auto ReaderOrErr = SampleProfileReader::create(Profile, Context); ASSERT_TRUE(NoError(ReaderOrErr.getError())); Reader = std::move(ReaderOrErr.get()); - Reader->collectFuncsToUse(M); + Reader->collectFuncsFrom(M); } void testRoundTrip(SampleProfileFormat Format, bool Remap) { @@ -86,6 +86,13 @@ struct SampleProfTest : ::testing::Test BarSamples.addCalledTargetSamples(1, 0, MconstructName, 1000); BarSamples.addCalledTargetSamples(1, 0, StringviewName, 437); + StringRef BazName("_Z3bazi"); + FunctionSamples BazSamples; + BazSamples.setName(BazName); + BazSamples.addTotalSamples(12557); + BazSamples.addHeadSamples(1257); + BazSamples.addBodySamples(1, 0, 12557); + Module M("my_module", Context); FunctionType *fn_type = FunctionType::get(Type::getVoidTy(Context), {}, false); @@ -95,6 +102,7 @@ struct SampleProfTest : ::testing::Test StringMap Profiles; Profiles[FooName] = std::move(FooSamples); Profiles[BarName] = std::move(BarSamples); + Profiles[BazName] = std::move(BazSamples); ProfileSymbolList List; if (Format == SampleProfileFormat::SPF_Ext_Binary) { @@ -137,8 +145,6 @@ struct SampleProfTest : ::testing::Test ASSERT_TRUE(NoError(EC)); } - ASSERT_EQ(2u, Reader->getProfiles().size()); - FunctionSamples *ReadFooSamples = Reader->getSamplesFor(FooName); ASSERT_TRUE(ReadFooSamples != nullptr); if (Format != SampleProfileFormat::SPF_Compact_Binary) { @@ -158,6 +164,20 @@ struct SampleProfTest : ::testing::Test ReadBarSamples->findCallTargetMapAt(1, 0); ASSERT_FALSE(CTMap.getError()); + // Because 
_Z3bazi is not defined in module M, expect _Z3bazi's profile + // is not loaded when the profile is ExtBinary or Compact format because + // these formats support loading function profiles on demand. + FunctionSamples *ReadBazSamples = Reader->getSamplesFor(BazName); + if (Format == SampleProfileFormat::SPF_Ext_Binary || + Format == SampleProfileFormat::SPF_Compact_Binary) { + ASSERT_TRUE(ReadBazSamples == nullptr); + ASSERT_EQ(2u, Reader->getProfiles().size()); + } else { + ASSERT_TRUE(ReadBazSamples != nullptr); + ASSERT_EQ(12557u, ReadBazSamples->getTotalSamples()); + ASSERT_EQ(3u, Reader->getProfiles().size()); + } + std::string MconstructGUID; StringRef MconstructRep = getRepInFormat(MconstructName, Format, MconstructGUID); @@ -169,9 +189,9 @@ struct SampleProfTest : ::testing::Test auto VerifySummary = [](ProfileSummary &Summary) mutable { ASSERT_EQ(ProfileSummary::PSK_Sample, Summary.getKind()); - ASSERT_EQ(123603u, Summary.getTotalCount()); - ASSERT_EQ(6u, Summary.getNumCounts()); - ASSERT_EQ(2u, Summary.getNumFunctions()); + ASSERT_EQ(136160u, Summary.getTotalCount()); + ASSERT_EQ(7u, Summary.getNumCounts()); + ASSERT_EQ(3u, Summary.getNumFunctions()); ASSERT_EQ(1437u, Summary.getMaxFunctionCount()); ASSERT_EQ(60351u, Summary.getMaxCount()); @@ -188,8 +208,8 @@ struct SampleProfTest : ::testing::Test Cutoff = 990000; auto NinetyNinePerc = find_if(Details, Predicate); ASSERT_EQ(60000u, EightyPerc->MinCount); - ASSERT_EQ(60000u, NinetyPerc->MinCount); - ASSERT_EQ(60000u, NinetyFivePerc->MinCount); + ASSERT_EQ(12557u, NinetyPerc->MinCount); + ASSERT_EQ(12557u, NinetyFivePerc->MinCount); ASSERT_EQ(610u, NinetyNinePerc->MinCount); }; From llvm-commits at lists.llvm.org Wed Oct 9 14:35:20 2019 From: llvm-commits at lists.llvm.org (Wei Mi via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 21:35:20 +0000 (UTC) Subject: [PATCH] D68601: [SampleFDO] Add indexing for function profiles so they can be loaded on demand in ExtBinary format In-Reply-To: References: 
Message-ID: <74c7c9c452677a3c7ec83ae9e8f68d41@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rG09dcfe680570: [SampleFDO] Add indexing for function profiles so they can be loaded on demand… (authored by wmi). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68601/new/ https://reviews.llvm.org/D68601 Files: llvm/include/llvm/ProfileData/SampleProf.h llvm/include/llvm/ProfileData/SampleProfReader.h llvm/include/llvm/ProfileData/SampleProfWriter.h llvm/lib/ProfileData/SampleProfReader.cpp llvm/lib/ProfileData/SampleProfWriter.cpp llvm/lib/Transforms/IPO/SampleProfile.cpp llvm/test/Transforms/SampleProfile/Inputs/inline.extbinary.afdo llvm/test/Transforms/SampleProfile/Inputs/profsampleacc.extbinary.afdo llvm/unittests/ProfileData/SampleProfTest.cpp From llvm-commits at lists.llvm.org Wed Oct 9 14:42:08 2019 From: llvm-commits at lists.llvm.org (Thomas Lively via llvm-commits) Date: Wed, 09 Oct 2019 21:42:08 -0000 Subject: [llvm] r374235 - [WebAssembly] Make returns variadic Message-ID: <20191009214208.A3B5B8304F@lists.llvm.org> Author: tlively Date: Wed Oct 9 14:42:08 2019 New Revision: 374235 URL: http://llvm.org/viewvc/llvm-project?rev=374235&view=rev Log: [WebAssembly] Make returns variadic Summary: This is necessary and sufficient to get simple cases of multiple return working with multivalue enabled. More complex cases will require block and loop signatures to be generalized to potentially be type indices as well.
Reviewers: aheejin, dschuff Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68684 Modified: llvm/trunk/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyInstPrinter.cpp llvm/trunk/lib/Target/WebAssembly/WebAssemblyAsmPrinter.cpp llvm/trunk/lib/Target/WebAssembly/WebAssemblyCFGStackify.cpp llvm/trunk/lib/Target/WebAssembly/WebAssemblyFastISel.cpp llvm/trunk/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp llvm/trunk/lib/Target/WebAssembly/WebAssemblyInstrControl.td llvm/trunk/lib/Target/WebAssembly/WebAssemblyInstrInfo.td llvm/trunk/lib/Target/WebAssembly/WebAssemblyMachineFunctionInfo.cpp llvm/trunk/lib/Target/WebAssembly/WebAssemblyPeephole.cpp llvm/trunk/test/CodeGen/MIR/WebAssembly/int-type-register-class-name.mir llvm/trunk/test/CodeGen/WebAssembly/atomic-fence.mir llvm/trunk/test/CodeGen/WebAssembly/eh-labels.mir llvm/trunk/test/CodeGen/WebAssembly/explicit-locals.mir llvm/trunk/test/CodeGen/WebAssembly/function-info.mir llvm/trunk/test/CodeGen/WebAssembly/llround-conv-i32.ll llvm/trunk/test/CodeGen/WebAssembly/multivalue.ll llvm/trunk/test/CodeGen/WebAssembly/reg-argument.mir llvm/trunk/test/CodeGen/WebAssembly/reg-copy.mir llvm/trunk/test/DebugInfo/WebAssembly/dbg-value-move-clone.mir llvm/trunk/test/DebugInfo/WebAssembly/dbg-value-move-reg-stackify.mir Modified: llvm/trunk/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyInstPrinter.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyInstPrinter.cpp?rev=374235&r1=374234&r2=374235&view=diff ============================================================================== --- llvm/trunk/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyInstPrinter.cpp (original) +++ llvm/trunk/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyInstPrinter.cpp Wed Oct 9 14:42:08 2019 @@ -52,7 +52,9 @@ void WebAssemblyInstPrinter::printInst(c // Print any additional variadic operands. 
const MCInstrDesc &Desc = MII.get(MI->getOpcode()); - if (Desc.isVariadic()) + if (Desc.isVariadic()) { + if (Desc.getNumOperands() == 0 && MI->getNumOperands() > 0) + OS << "\t"; for (auto I = Desc.getNumOperands(), E = MI->getNumOperands(); I < E; ++I) { // FIXME: For CALL_INDIRECT_VOID, don't print a leading comma, because // we have an extra flags operand which is not currently printed, for @@ -63,6 +65,7 @@ void WebAssemblyInstPrinter::printInst(c OS << ", "; printOperand(MI, I, OS); } + } // Print any added annotation. printAnnotation(OS, Annot); Modified: llvm/trunk/lib/Target/WebAssembly/WebAssemblyAsmPrinter.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/WebAssembly/WebAssemblyAsmPrinter.cpp?rev=374235&r1=374234&r2=374235&view=diff ============================================================================== --- llvm/trunk/lib/Target/WebAssembly/WebAssemblyAsmPrinter.cpp (original) +++ llvm/trunk/lib/Target/WebAssembly/WebAssemblyAsmPrinter.cpp Wed Oct 9 14:42:08 2019 @@ -332,43 +332,15 @@ void WebAssemblyAsmPrinter::EmitInstruct // These represent values which are live into the function entry, so there's // no instruction to emit. 
break; - case WebAssembly::FALLTHROUGH_RETURN_I32: - case WebAssembly::FALLTHROUGH_RETURN_I32_S: - case WebAssembly::FALLTHROUGH_RETURN_I64: - case WebAssembly::FALLTHROUGH_RETURN_I64_S: - case WebAssembly::FALLTHROUGH_RETURN_F32: - case WebAssembly::FALLTHROUGH_RETURN_F32_S: - case WebAssembly::FALLTHROUGH_RETURN_F64: - case WebAssembly::FALLTHROUGH_RETURN_F64_S: - case WebAssembly::FALLTHROUGH_RETURN_v16i8: - case WebAssembly::FALLTHROUGH_RETURN_v16i8_S: - case WebAssembly::FALLTHROUGH_RETURN_v8i16: - case WebAssembly::FALLTHROUGH_RETURN_v8i16_S: - case WebAssembly::FALLTHROUGH_RETURN_v4i32: - case WebAssembly::FALLTHROUGH_RETURN_v4i32_S: - case WebAssembly::FALLTHROUGH_RETURN_v2i64: - case WebAssembly::FALLTHROUGH_RETURN_v2i64_S: - case WebAssembly::FALLTHROUGH_RETURN_v4f32: - case WebAssembly::FALLTHROUGH_RETURN_v4f32_S: - case WebAssembly::FALLTHROUGH_RETURN_v2f64: - case WebAssembly::FALLTHROUGH_RETURN_v2f64_S: { + case WebAssembly::FALLTHROUGH_RETURN: { // These instructions represent the implicit return at the end of a - // function body. Always pops one value off the stack. + // function body. if (isVerbose()) { - OutStreamer->AddComment("fallthrough-return-value"); + OutStreamer->AddComment("fallthrough-return"); OutStreamer->AddBlankLine(); } break; } - case WebAssembly::FALLTHROUGH_RETURN_VOID: - case WebAssembly::FALLTHROUGH_RETURN_VOID_S: - // This instruction represents the implicit return at the end of a - // function body with no return value. - if (isVerbose()) { - OutStreamer->AddComment("fallthrough-return-void"); - OutStreamer->AddBlankLine(); - } - break; case WebAssembly::COMPILER_FENCE: // This is a compiler barrier that prevents instruction reordering during // backend compilation, and should not be emitted. 
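The WebAssemblyInstPrinter.cpp hunk earlier in this commit emits a tab before the variadic operands when the instruction description has no fixed operands, which is what lets the new variadic RETURN print as "return" followed by a tab and "$0, $1". A rough standalone model of that rule is sketched below. It is only an illustration: printInst and its parameters are hypothetical, NumFixedOperands stands in for Desc.getNumOperands(), the fixed operands are assumed to be printed tab-then-comma-separated by the generated asm string, and the CALL_INDIRECT_VOID special case mentioned in the FIXME is deliberately left out.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Model of printing "mnemonic<TAB>op0, op1, ...".
// NumFixedOperands: operands covered by the instruction description.
// Operands: every operand actually present on the instruction.
std::string printInst(const std::string &Mnemonic, unsigned NumFixedOperands,
                      const std::vector<std::string> &Operands) {
  std::ostringstream OS;
  OS << Mnemonic;

  // Fixed operands (assumed to come from the generated asm string).
  for (size_t I = 0; I < NumFixedOperands && I < Operands.size(); ++I)
    OS << (I == 0 ? "\t" : ", ") << Operands[I];

  // Variadic tail. With no fixed operands nothing has emitted the tab yet,
  // so emit it here; this mirrors the check added by the patch.
  if (NumFixedOperands == 0 && !Operands.empty())
    OS << "\t";
  for (size_t I = NumFixedOperands; I < Operands.size(); ++I) {
    if (I != 0)
      OS << ", ";
    OS << Operands[I];
  }
  return OS.str();
}
```

A RETURN with no operands still prints as a bare "return", so the old RETURN_VOID output is unchanged under this model.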
Modified: llvm/trunk/lib/Target/WebAssembly/WebAssemblyCFGStackify.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/WebAssembly/WebAssemblyCFGStackify.cpp?rev=374235&r1=374234&r2=374235&view=diff ============================================================================== --- llvm/trunk/lib/Target/WebAssembly/WebAssemblyCFGStackify.cpp (original) +++ llvm/trunk/lib/Target/WebAssembly/WebAssemblyCFGStackify.cpp Wed Oct 9 14:42:08 2019 @@ -1227,11 +1227,11 @@ getDepth(const SmallVectorImpl(); - assert(MFI.getResults().size() <= 1); if (MFI.getResults().empty()) return; + // TODO: Generalize from value types to function types for multivalue WebAssembly::ExprType RetType; switch (MFI.getResults().front().SimpleTy) { case MVT::i32: @@ -1266,10 +1266,14 @@ void WebAssemblyCFGStackify::fixEndsAtEn if (MI.isPosition() || MI.isDebugInstr()) continue; if (MI.getOpcode() == WebAssembly::END_BLOCK) { + if (MFI.getResults().size() > 1) + report_fatal_error("Multivalue block signatures not implemented yet"); EndToBegin[&MI]->getOperand(0).setImm(int32_t(RetType)); continue; } if (MI.getOpcode() == WebAssembly::END_LOOP) { + if (MFI.getResults().size() > 1) + report_fatal_error("Multivalue loop signatures not implemented yet"); EndToBegin[&MI]->getOperand(0).setImm(int32_t(RetType)); continue; } Modified: llvm/trunk/lib/Target/WebAssembly/WebAssemblyFastISel.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/WebAssembly/WebAssemblyFastISel.cpp?rev=374235&r1=374234&r2=374235&view=diff ============================================================================== --- llvm/trunk/lib/Target/WebAssembly/WebAssemblyFastISel.cpp (original) +++ llvm/trunk/lib/Target/WebAssembly/WebAssemblyFastISel.cpp Wed Oct 9 14:42:08 2019 @@ -1302,51 +1302,33 @@ bool WebAssemblyFastISel::selectRet(cons if (Ret->getNumOperands() == 0) { BuildMI(*FuncInfo.MBB, FuncInfo.InsertPt, DbgLoc, - TII.get(WebAssembly::RETURN_VOID)); + TII.get(WebAssembly::RETURN)); return 
true; } + // TODO: support multiple return in FastISel + if (Ret->getNumOperands() > 1) + return false; + Value *RV = Ret->getOperand(0); if (!Subtarget->hasSIMD128() && RV->getType()->isVectorTy()) return false; - unsigned Opc; switch (getSimpleType(RV->getType())) { case MVT::i1: case MVT::i8: case MVT::i16: case MVT::i32: - Opc = WebAssembly::RETURN_I32; - break; case MVT::i64: - Opc = WebAssembly::RETURN_I64; - break; case MVT::f32: - Opc = WebAssembly::RETURN_F32; - break; case MVT::f64: - Opc = WebAssembly::RETURN_F64; - break; case MVT::v16i8: - Opc = WebAssembly::RETURN_v16i8; - break; case MVT::v8i16: - Opc = WebAssembly::RETURN_v8i16; - break; case MVT::v4i32: - Opc = WebAssembly::RETURN_v4i32; - break; case MVT::v2i64: - Opc = WebAssembly::RETURN_v2i64; - break; case MVT::v4f32: - Opc = WebAssembly::RETURN_v4f32; - break; case MVT::v2f64: - Opc = WebAssembly::RETURN_v2f64; - break; case MVT::exnref: - Opc = WebAssembly::RETURN_EXNREF; break; default: return false; @@ -1363,7 +1345,9 @@ bool WebAssemblyFastISel::selectRet(cons if (Reg == 0) return false; - BuildMI(*FuncInfo.MBB, FuncInfo.InsertPt, DbgLoc, TII.get(Opc)).addReg(Reg); + BuildMI(*FuncInfo.MBB, FuncInfo.InsertPt, DbgLoc, + TII.get(WebAssembly::RETURN)) + .addReg(Reg); return true; } Modified: llvm/trunk/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp?rev=374235&r1=374234&r2=374235&view=diff ============================================================================== --- llvm/trunk/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp (original) +++ llvm/trunk/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp Wed Oct 9 14:42:08 2019 @@ -852,8 +852,8 @@ bool WebAssemblyTargetLowering::CanLower CallingConv::ID /*CallConv*/, MachineFunction & /*MF*/, bool /*IsVarArg*/, const SmallVectorImpl &Outs, LLVMContext & /*Context*/) const { - // WebAssembly can't currently handle returning tuples. 
- return Outs.size() <= 1; + // WebAssembly can only handle returning tuples with multivalue enabled + return Subtarget->hasMultivalue() || Outs.size() <= 1; } SDValue WebAssemblyTargetLowering::LowerReturn( @@ -861,7 +861,8 @@ SDValue WebAssemblyTargetLowering::Lower const SmallVectorImpl &Outs, const SmallVectorImpl &OutVals, const SDLoc &DL, SelectionDAG &DAG) const { - assert(Outs.size() <= 1 && "WebAssembly can only return up to one value"); + assert(Subtarget->hasMultivalue() || + Outs.size() <= 1 && "MVP WebAssembly can only return up to one value"); if (!callingConvSupported(CallConv)) fail(DL, DAG, "WebAssembly doesn't support non-C calling conventions"); Modified: llvm/trunk/lib/Target/WebAssembly/WebAssemblyInstrControl.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/WebAssembly/WebAssemblyInstrControl.td?rev=374235&r1=374234&r2=374235&view=diff ============================================================================== --- llvm/trunk/lib/Target/WebAssembly/WebAssemblyInstrControl.td (original) +++ llvm/trunk/lib/Target/WebAssembly/WebAssemblyInstrControl.td Wed Oct 9 14:42:08 2019 @@ -84,49 +84,19 @@ let isTerminator = 1, isBarrier = 1 in defm END_FUNCTION : NRI<(outs), (ins), [], "end_function", 0x0b>; } // Uses = [VALUE_STACK], Defs = [VALUE_STACK] -multiclass RETURN { - defm RETURN_#vt : I<(outs), (ins vt:$val), (outs), (ins), - [(WebAssemblyreturn vt:$val)], - "return \t$val", "return", 0x0f>; - // Equivalent to RETURN_#vt, for use at the end of a function when wasm - // semantics return by falling off the end of the block. 
- let isCodeGenOnly = 1 in - defm FALLTHROUGH_RETURN_#vt : I<(outs), (ins vt:$val), (outs), (ins), []>; -} - -multiclass SIMD_RETURN { - defm RETURN_#vt : I<(outs), (ins V128:$val), (outs), (ins), - [(WebAssemblyreturn (vt V128:$val))], - "return \t$val", "return", 0x0f>, - Requires<[HasSIMD128]>; - // Equivalent to RETURN_#vt, for use at the end of a function when wasm - // semantics return by falling off the end of the block. - let isCodeGenOnly = 1 in - defm FALLTHROUGH_RETURN_#vt : I<(outs), (ins V128:$val), (outs), (ins), - []>, - Requires<[HasSIMD128]>; -} let isTerminator = 1, hasCtrlDep = 1, isBarrier = 1 in { let isReturn = 1 in { - defm "": RETURN; - defm "": RETURN; - defm "": RETURN; - defm "": RETURN; - defm "": RETURN; - defm "": SIMD_RETURN; - defm "": SIMD_RETURN; - defm "": SIMD_RETURN; - defm "": SIMD_RETURN; - defm "": SIMD_RETURN; - defm "": SIMD_RETURN; - - defm RETURN_VOID : NRI<(outs), (ins), [(WebAssemblyreturn)], "return", 0x0f>; - - // This is to RETURN_VOID what FALLTHROUGH_RETURN_#vt is to RETURN_#vt. - let isCodeGenOnly = 1 in - defm FALLTHROUGH_RETURN_VOID : NRI<(outs), (ins), []>; + +defm RETURN : I<(outs), (ins variable_ops), (outs), (ins), + [(WebAssemblyreturn)], + "return", "return", 0x0f>; +// Equivalent to RETURN, for use at the end of a function when wasm +// semantics return by falling off the end of the block. 
+let isCodeGenOnly = 1 in +defm FALLTHROUGH_RETURN : I<(outs), (ins variable_ops), (outs), (ins), []>; + } // isReturn = 1 defm UNREACHABLE : NRI<(outs), (ins), [(trap)], "unreachable", 0x00>; Modified: llvm/trunk/lib/Target/WebAssembly/WebAssemblyInstrInfo.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/WebAssembly/WebAssemblyInstrInfo.td?rev=374235&r1=374234&r2=374235&view=diff ============================================================================== --- llvm/trunk/lib/Target/WebAssembly/WebAssemblyInstrInfo.td (original) +++ llvm/trunk/lib/Target/WebAssembly/WebAssemblyInstrInfo.td Wed Oct 9 14:42:08 2019 @@ -106,7 +106,8 @@ def WebAssemblybr_table : SDNode<"WebAss def WebAssemblyargument : SDNode<"WebAssemblyISD::ARGUMENT", SDT_WebAssemblyArgument>; def WebAssemblyreturn : SDNode<"WebAssemblyISD::RETURN", - SDT_WebAssemblyReturn, [SDNPHasChain]>; + SDT_WebAssemblyReturn, + [SDNPHasChain, SDNPVariadic]>; def WebAssemblywrapper : SDNode<"WebAssemblyISD::Wrapper", SDT_WebAssemblyWrapper>; def WebAssemblywrapperPIC : SDNode<"WebAssemblyISD::WrapperPIC", Modified: llvm/trunk/lib/Target/WebAssembly/WebAssemblyMachineFunctionInfo.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/WebAssembly/WebAssemblyMachineFunctionInfo.cpp?rev=374235&r1=374234&r2=374235&view=diff ============================================================================== --- llvm/trunk/lib/Target/WebAssembly/WebAssemblyMachineFunctionInfo.cpp (original) +++ llvm/trunk/lib/Target/WebAssembly/WebAssemblyMachineFunctionInfo.cpp Wed Oct 9 14:42:08 2019 @@ -49,10 +49,12 @@ void llvm::computeSignatureVTs(const Fun computeLegalValueVTs(F, TM, Ty->getReturnType(), Results); MVT PtrVT = MVT::getIntegerVT(TM.createDataLayout().getPointerSizeInBits()); - if (Results.size() > 1) { - // WebAssembly currently can't lower returns of multiple values without - // demoting to sret (see WebAssemblyTargetLowering::CanLowerReturn). 
So - // replace multiple return values with a pointer parameter. + if (Results.size() > 1 && + !TM.getSubtarget(F).hasMultivalue()) { + // WebAssembly can't lower returns of multiple values without demoting to + // sret unless multivalue is enabled (see + // WebAssemblyTargetLowering::CanLowerReturn). So replace multiple return + // values with a poitner parameter. Results.clear(); Params.push_back(PtrVT); } Modified: llvm/trunk/lib/Target/WebAssembly/WebAssemblyPeephole.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/WebAssembly/WebAssemblyPeephole.cpp?rev=374235&r1=374234&r2=374235&view=diff ============================================================================== --- llvm/trunk/lib/Target/WebAssembly/WebAssemblyPeephole.cpp (original) +++ llvm/trunk/lib/Target/WebAssembly/WebAssemblyPeephole.cpp Wed Oct 9 14:42:08 2019 @@ -75,9 +75,7 @@ static bool maybeRewriteToFallthrough(Ma const MachineFunction &MF, WebAssemblyFunctionInfo &MFI, MachineRegisterInfo &MRI, - const WebAssemblyInstrInfo &TII, - unsigned FallthroughOpc, - unsigned CopyLocalOpc) { + const WebAssemblyInstrInfo &TII) { if (DisableWebAssemblyFallthroughReturnOpt) return false; if (&MBB != &MF.back()) @@ -90,13 +88,36 @@ static bool maybeRewriteToFallthrough(Ma if (&MI != &*End) return false; - if (FallthroughOpc != WebAssembly::FALLTHROUGH_RETURN_VOID) { - // If the operand isn't stackified, insert a COPY to read the operand and - // stackify it. - MachineOperand &MO = MI.getOperand(0); + for (auto &MO : MI.explicit_operands()) { + // If the operand isn't stackified, insert a COPY to read the operands and + // stackify them. 
Register Reg = MO.getReg(); if (!MFI.isVRegStackified(Reg)) { - Register NewReg = MRI.createVirtualRegister(MRI.getRegClass(Reg)); + unsigned CopyLocalOpc; + const TargetRegisterClass *RegClass = MRI.getRegClass(Reg); + switch (RegClass->getID()) { + case WebAssembly::I32RegClassID: + CopyLocalOpc = WebAssembly::COPY_I32; + break; + case WebAssembly::I64RegClassID: + CopyLocalOpc = WebAssembly::COPY_I64; + break; + case WebAssembly::F32RegClassID: + CopyLocalOpc = WebAssembly::COPY_F32; + break; + case WebAssembly::F64RegClassID: + CopyLocalOpc = WebAssembly::COPY_F64; + break; + case WebAssembly::V128RegClassID: + CopyLocalOpc = WebAssembly::COPY_V128; + break; + case WebAssembly::EXNREFRegClassID: + CopyLocalOpc = WebAssembly::COPY_EXNREF; + break; + default: + llvm_unreachable("Unexpected register class for return operand"); + } + Register NewReg = MRI.createVirtualRegister(RegClass); BuildMI(MBB, MI, MI.getDebugLoc(), TII.get(CopyLocalOpc), NewReg) .addReg(Reg); MO.setReg(NewReg); @@ -104,8 +125,7 @@ static bool maybeRewriteToFallthrough(Ma } } - // Rewrite the return. - MI.setDesc(TII.get(FallthroughOpc)); + MI.setDesc(TII.get(WebAssembly::FALLTHROUGH_RETURN)); return true; } @@ -157,60 +177,8 @@ bool WebAssemblyPeephole::runOnMachineFu break; } // Optimize away an explicit void return at the end of the function. 
- case WebAssembly::RETURN_I32: - Changed |= maybeRewriteToFallthrough( - MI, MBB, MF, MFI, MRI, TII, WebAssembly::FALLTHROUGH_RETURN_I32, - WebAssembly::COPY_I32); - break; - case WebAssembly::RETURN_I64: - Changed |= maybeRewriteToFallthrough( - MI, MBB, MF, MFI, MRI, TII, WebAssembly::FALLTHROUGH_RETURN_I64, - WebAssembly::COPY_I64); - break; - case WebAssembly::RETURN_F32: - Changed |= maybeRewriteToFallthrough( - MI, MBB, MF, MFI, MRI, TII, WebAssembly::FALLTHROUGH_RETURN_F32, - WebAssembly::COPY_F32); - break; - case WebAssembly::RETURN_F64: - Changed |= maybeRewriteToFallthrough( - MI, MBB, MF, MFI, MRI, TII, WebAssembly::FALLTHROUGH_RETURN_F64, - WebAssembly::COPY_F64); - break; - case WebAssembly::RETURN_v16i8: - Changed |= maybeRewriteToFallthrough( - MI, MBB, MF, MFI, MRI, TII, WebAssembly::FALLTHROUGH_RETURN_v16i8, - WebAssembly::COPY_V128); - break; - case WebAssembly::RETURN_v8i16: - Changed |= maybeRewriteToFallthrough( - MI, MBB, MF, MFI, MRI, TII, WebAssembly::FALLTHROUGH_RETURN_v8i16, - WebAssembly::COPY_V128); - break; - case WebAssembly::RETURN_v4i32: - Changed |= maybeRewriteToFallthrough( - MI, MBB, MF, MFI, MRI, TII, WebAssembly::FALLTHROUGH_RETURN_v4i32, - WebAssembly::COPY_V128); - break; - case WebAssembly::RETURN_v2i64: - Changed |= maybeRewriteToFallthrough( - MI, MBB, MF, MFI, MRI, TII, WebAssembly::FALLTHROUGH_RETURN_v2i64, - WebAssembly::COPY_V128); - break; - case WebAssembly::RETURN_v4f32: - Changed |= maybeRewriteToFallthrough( - MI, MBB, MF, MFI, MRI, TII, WebAssembly::FALLTHROUGH_RETURN_v4f32, - WebAssembly::COPY_V128); - break; - case WebAssembly::RETURN_v2f64: - Changed |= maybeRewriteToFallthrough( - MI, MBB, MF, MFI, MRI, TII, WebAssembly::FALLTHROUGH_RETURN_v2f64, - WebAssembly::COPY_V128); - break; - case WebAssembly::RETURN_VOID: - Changed |= maybeRewriteToFallthrough( - MI, MBB, MF, MFI, MRI, TII, WebAssembly::FALLTHROUGH_RETURN_VOID, - WebAssembly::INSTRUCTION_LIST_END); + case WebAssembly::RETURN: + Changed |= 
maybeRewriteToFallthrough(MI, MBB, MF, MFI, MRI, TII); break; } Modified: llvm/trunk/test/CodeGen/MIR/WebAssembly/int-type-register-class-name.mir URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/MIR/WebAssembly/int-type-register-class-name.mir?rev=374235&r1=374234&r2=374235&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/MIR/WebAssembly/int-type-register-class-name.mir (original) +++ llvm/trunk/test/CodeGen/MIR/WebAssembly/int-type-register-class-name.mir Wed Oct 9 14:42:08 2019 @@ -9,5 +9,5 @@ body: | liveins: $arguments %0:i32 = CONST_I32 0, implicit-def dead $arguments ; CHECK: %0:i32 = CONST_I32 0, implicit-def dead $arguments - RETURN_VOID implicit-def dead $arguments + RETURN implicit-def dead $arguments ... Modified: llvm/trunk/test/CodeGen/WebAssembly/atomic-fence.mir URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/WebAssembly/atomic-fence.mir?rev=374235&r1=374234&r2=374235&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/WebAssembly/atomic-fence.mir (original) +++ llvm/trunk/test/CodeGen/WebAssembly/atomic-fence.mir Wed Oct 9 14:42:08 2019 @@ -39,7 +39,7 @@ body: | COMPILER_FENCE implicit-def $arguments %2:i32 = ADD_I32 %0:i32, %0:i32, implicit-def $arguments CALL_VOID @foo, %2:i32, %1:i32, implicit-def $arguments - RETURN_VOID implicit-def $arguments + RETURN implicit-def $arguments ... --- @@ -63,6 +63,5 @@ body: | ATOMIC_FENCE 0, implicit-def $arguments %2:i32 = ADD_I32 %0:i32, %0:i32, implicit-def $arguments CALL_VOID @foo, %2:i32, %1:i32, implicit-def $arguments - RETURN_VOID implicit-def $arguments + RETURN implicit-def $arguments ... 
- Modified: llvm/trunk/test/CodeGen/WebAssembly/eh-labels.mir URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/WebAssembly/eh-labels.mir?rev=374235&r1=374234&r2=374235&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/WebAssembly/eh-labels.mir (original) +++ llvm/trunk/test/CodeGen/WebAssembly/eh-labels.mir Wed Oct 9 14:42:08 2019 @@ -42,5 +42,5 @@ body: | bb.2: ; predecessors: %bb.0, %bb.1 - RETURN_VOID implicit-def dead $arguments + RETURN implicit-def dead $arguments ... Modified: llvm/trunk/test/CodeGen/WebAssembly/explicit-locals.mir URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/WebAssembly/explicit-locals.mir?rev=374235&r1=374234&r2=374235&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/WebAssembly/explicit-locals.mir (original) +++ llvm/trunk/test/CodeGen/WebAssembly/explicit-locals.mir Wed Oct 9 14:42:08 2019 @@ -19,5 +19,5 @@ body: | ; CHECK-NOT: dead %{{[0-9]+}} ; CHECK: DROP_I32 killed %{{[0-9]+}} dead %0:i32 = CONST_I32 0, implicit-def dead $arguments, implicit $sp32, implicit $sp64 - RETURN_VOID implicit-def dead $arguments + RETURN implicit-def dead $arguments ... Modified: llvm/trunk/test/CodeGen/WebAssembly/function-info.mir URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/WebAssembly/function-info.mir?rev=374235&r1=374234&r2=374235&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/WebAssembly/function-info.mir (original) +++ llvm/trunk/test/CodeGen/WebAssembly/function-info.mir Wed Oct 9 14:42:08 2019 @@ -8,5 +8,5 @@ liveins: - { reg: '$arguments' } body: | bb.0: - RETURN_VOID implicit-def dead $arguments + RETURN implicit-def dead $arguments ... 
Modified: llvm/trunk/test/CodeGen/WebAssembly/llround-conv-i32.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/WebAssembly/llround-conv-i32.ll?rev=374235&r1=374234&r2=374235&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/WebAssembly/llround-conv-i32.ll (original) +++ llvm/trunk/test/CodeGen/WebAssembly/llround-conv-i32.ll Wed Oct 9 14:42:08 2019 @@ -7,7 +7,7 @@ define i64 @testmsxs_builtin(float %x) { ; CHECK-NEXT: # %bb.0: # %entry ; CHECK-NEXT: local.get 0 ; CHECK-NEXT: i64.call llroundf -; CHECK-NEXT: # fallthrough-return-value +; CHECK-NEXT: # fallthrough-return ; CHECK-NEXT: end_function entry: %0 = tail call i64 @llvm.llround.f32(float %x) @@ -20,7 +20,7 @@ define i64 @testmsxd_builtin(double %x) ; CHECK-NEXT: # %bb.0: # %entry ; CHECK-NEXT: local.get 0 ; CHECK-NEXT: i64.call llround -; CHECK-NEXT: # fallthrough-return-value +; CHECK-NEXT: # fallthrough-return ; CHECK-NEXT: end_function entry: %0 = tail call i64 @llvm.llround.f64(double %x) Modified: llvm/trunk/test/CodeGen/WebAssembly/multivalue.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/WebAssembly/multivalue.ll?rev=374235&r1=374234&r2=374235&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/WebAssembly/multivalue.ll (original) +++ llvm/trunk/test/CodeGen/WebAssembly/multivalue.ll Wed Oct 9 14:42:08 2019 @@ -9,15 +9,17 @@ target triple = "wasm32-unknown-unknown" %pair = type { i32, i32 } %packed_pair = type <{ i32, i32 }> -; CHECK-LABEL: sret: -; CHECK-NEXT: sret (i32, i32, i32) -> () -define %pair @sret(%pair %p) { +; CHECK-LABEL: pair_ident: +; CHECK-NEXT: pair_ident (i32, i32) -> (i32, i32) +; CHECK-NEXT: return $0, $1{{$}} +define %pair @pair_ident(%pair %p) { ret %pair %p } -; CHECK-LABEL: packed_sret: -; CHECK-NEXT: packed_sret (i32, i32, i32) -> () -define %packed_pair @packed_sret(%packed_pair %p) { +; CHECK-LABEL: 
packed_pair_ident: +; CHECK-NEXT: packed_pair_ident (i32, i32) -> (i32, i32) +; CHECK-nEXT: return $0, $1{{$}} +define %packed_pair @packed_pair_ident(%packed_pair %p) { ret %packed_pair %p } Modified: llvm/trunk/test/CodeGen/WebAssembly/reg-argument.mir URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/WebAssembly/reg-argument.mir?rev=374235&r1=374234&r2=374235&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/WebAssembly/reg-argument.mir (original) +++ llvm/trunk/test/CodeGen/WebAssembly/reg-argument.mir Wed Oct 9 14:42:08 2019 @@ -11,7 +11,7 @@ body: | bb.0: %0:i32 = CONST_I32 0, implicit-def $arguments %1:i32 = ARGUMENT_i32 0, implicit $arguments - RETURN_VOID implicit-def $arguments + RETURN implicit-def $arguments ... --- name: argument_i64 @@ -22,7 +22,7 @@ body: | bb.0: %0:i32 = CONST_I32 0, implicit-def $arguments %1:i64 = ARGUMENT_i64 0, implicit $arguments - RETURN_VOID implicit-def $arguments + RETURN implicit-def $arguments ... --- name: argument_f32 @@ -33,7 +33,7 @@ body: | bb.0: %0:i32 = CONST_I32 0, implicit-def $arguments %1:f32 = ARGUMENT_f32 0, implicit $arguments - RETURN_VOID implicit-def $arguments + RETURN implicit-def $arguments ... --- name: argument_f64 @@ -44,7 +44,7 @@ body: | bb.0: %0:i32 = CONST_I32 0, implicit-def $arguments %1:f64 = ARGUMENT_f64 0, implicit $arguments - RETURN_VOID implicit-def $arguments + RETURN implicit-def $arguments ... --- name: argument_exnref @@ -55,5 +55,5 @@ body: | bb.0: %0:i32 = CONST_I32 0, implicit-def $arguments %1:exnref = ARGUMENT_exnref 0, implicit $arguments - RETURN_VOID implicit-def $arguments + RETURN implicit-def $arguments ... 
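The change in the multivalue.ll expectations above, from "sret (i32, i32, i32) -> ()" to "pair_ident (i32, i32) -> (i32, i32)", follows from the computeSignatureVTs and CanLowerReturn hunks in this commit: multiple results are demoted to a pointer parameter only when multivalue is not enabled. The sketch below models just that decision. It is an assumption-laden toy, not the LLVM code: VT, Signature and computeSignature are hypothetical names, the pointer type is assumed to be i32 (wasm32), and the sret pointer is assumed to come before the ordinary parameters.

```cpp
#include <cassert>
#include <utility>
#include <vector>

enum class VT { I32, I64, F32, F64 };

struct Signature {
  std::vector<VT> Params;
  std::vector<VT> Results;
};

// Toy version of the signature computation.
// Params/Results: the legalized value types of the function.
// HasMultivalue: whether the multivalue feature is enabled.
Signature computeSignature(const std::vector<VT> &Params,
                           std::vector<VT> Results, bool HasMultivalue) {
  Signature Sig;
  if (Results.size() > 1 && !HasMultivalue) {
    // Without multivalue, several results cannot be returned directly, so
    // they are written through a pointer passed as an extra parameter
    // (assumed i32 and assumed to come first) and no results remain.
    Sig.Params.push_back(VT::I32);
  } else {
    // Zero or one result, or multivalue enabled: keep the results as-is.
    Sig.Results = std::move(Results);
  }
  Sig.Params.insert(Sig.Params.end(), Params.begin(), Params.end());
  return Sig;
}
```

For the %pair identity function in the test, the model gives three i32 parameters and no results without multivalue, and two parameters plus two results with it.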
Modified: llvm/trunk/test/CodeGen/WebAssembly/reg-copy.mir URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/WebAssembly/reg-copy.mir?rev=374235&r1=374234&r2=374235&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/WebAssembly/reg-copy.mir (original) +++ llvm/trunk/test/CodeGen/WebAssembly/reg-copy.mir Wed Oct 9 14:42:08 2019 @@ -6,10 +6,10 @@ name: copy_i32 body: | ; CHECK-LABEL: bb.0: ; CHECK-NEXT: %0:i32 = COPY_I32 %1:i32 - ; CHECK-NEXT: RETURN_VOID + ; CHECK-NEXT: RETURN bb.0: %0:i32 = COPY %1:i32 - RETURN_VOID implicit-def $arguments + RETURN implicit-def $arguments ... --- name: copy_i64 @@ -17,10 +17,10 @@ name: copy_i64 body: | ; CHECK-LABEL: bb.0: ; CHECK-NEXT: %0:i64 = COPY_I64 %1:i64 - ; CHECK-NEXT: RETURN_VOID + ; CHECK-NEXT: RETURN bb.0: %0:i64 = COPY %1:i64 - RETURN_VOID implicit-def $arguments + RETURN implicit-def $arguments ... --- name: copy_f32 @@ -28,10 +28,10 @@ name: copy_f32 body: | ; CHECK-LABEL: bb.0: ; CHECK-NEXT: %0:f32 = COPY_F32 %1:f32 - ; CHECK-NEXT: RETURN_VOID + ; CHECK-NEXT: RETURN bb.0: %0:f32 = COPY %1:f32 - RETURN_VOID implicit-def $arguments + RETURN implicit-def $arguments ... --- name: copy_f64 @@ -39,10 +39,10 @@ name: copy_f64 body: | ; CHECK-LABEL: bb.0: ; CHECK-NEXT: %0:f64 = COPY_F64 %1:f64 - ; CHECK-NEXT: RETURN_VOID + ; CHECK-NEXT: RETURN bb.0: %0:f64 = COPY %1:f64 - RETURN_VOID implicit-def $arguments + RETURN implicit-def $arguments ... --- name: copy_v128 @@ -50,10 +50,10 @@ name: copy_v128 body: | ; CHECK-LABEL: bb.0: ; CHECK-NEXT: %0:v128 = COPY_V128 %1:v128 - ; CHECK-NEXT: RETURN_VOID + ; CHECK-NEXT: RETURN bb.0: %0:v128 = COPY %1:v128 - RETURN_VOID implicit-def $arguments + RETURN implicit-def $arguments ... 
--- name: copy_exnref @@ -61,8 +61,8 @@ name: copy_exnref body: | ; CHECK-LABEL: bb.0: ; CHECK-NEXT: %0:exnref = COPY_EXNREF %1:exnref - ; CHECK-NEXT: RETURN_VOID + ; CHECK-NEXT: RETURN bb.0: %0:exnref = COPY %1:exnref - RETURN_VOID implicit-def $arguments + RETURN implicit-def $arguments ... Modified: llvm/trunk/test/DebugInfo/WebAssembly/dbg-value-move-clone.mir URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/DebugInfo/WebAssembly/dbg-value-move-clone.mir?rev=374235&r1=374234&r2=374235&view=diff ============================================================================== --- llvm/trunk/test/DebugInfo/WebAssembly/dbg-value-move-clone.mir (original) +++ llvm/trunk/test/DebugInfo/WebAssembly/dbg-value-move-clone.mir Wed Oct 9 14:42:08 2019 @@ -60,6 +60,6 @@ body: | bb.1: CALL_VOID @foo, %1:i32, implicit-def dead $arguments, implicit $sp32, implicit $sp64 CALL_VOID @foo, %1:i32, implicit-def dead $arguments, implicit $sp32, implicit $sp64 - RETURN_VOID implicit-def dead $arguments + RETURN implicit-def dead $arguments ... Modified: llvm/trunk/test/DebugInfo/WebAssembly/dbg-value-move-reg-stackify.mir URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/DebugInfo/WebAssembly/dbg-value-move-reg-stackify.mir?rev=374235&r1=374234&r2=374235&view=diff ============================================================================== --- llvm/trunk/test/DebugInfo/WebAssembly/dbg-value-move-reg-stackify.mir (original) +++ llvm/trunk/test/DebugInfo/WebAssembly/dbg-value-move-reg-stackify.mir Wed Oct 9 14:42:08 2019 @@ -55,6 +55,6 @@ body: | %1:i32 = CALL_i32 @bar, implicit-def dead $arguments, implicit $sp32, implicit $sp64 DBG_VALUE %1:i32, $noreg, !12, !DIExpression(), debug-location !15; :357:12 line no:357 CALL_VOID @foo, %1:i32, implicit-def dead $arguments, implicit $sp32, implicit $sp64 - RETURN_VOID implicit-def dead $arguments + RETURN implicit-def dead $arguments ... 
From llvm-commits at lists.llvm.org Wed Oct 9 14:41:36 2019 From: llvm-commits at lists.llvm.org (Vitaly Buka via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 21:41:36 +0000 (UTC) Subject: [PATCH] D68451: [Sanitizers] Porting getrandom/getentropy interceptors to FreeBSD In-Reply-To: References: Message-ID: <4a61ce2421782b5b59d504e3ca77230d@localhost.localdomain> vitalybuka accepted this revision. vitalybuka added inline comments. This revision is now accepted and ready to land. ================ Comment at: compiler-rt/test/sanitizer_common/TestCases/Posix/getrandom.c:11 +#if defined(__linux__) #if __GLIBC_PREREQ(2, 25) ---------------- #if (defined(__linux__) && __GLIBC_PREREQ(2, 25)) || defined(__FreeBSD__) #define HAS_GETRANDOM #endif Repository: rCRT Compiler Runtime CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68451/new/ https://reviews.llvm.org/D68451 From llvm-commits at lists.llvm.org Wed Oct 9 14:41:38 2019 From: llvm-commits at lists.llvm.org (Thomas Lively via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 21:41:38 +0000 (UTC) Subject: [PATCH] D68684: [WebAssembly] Make returns variadic In-Reply-To: References: Message-ID: <20e49194a58c7c26b2421ac1dee17672@localhost.localdomain> tlively updated this revision to Diff 224170. tlively marked an inline comment as done. tlively added a comment. 
- Bail instead of asserting in FastISel Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68684/new/ https://reviews.llvm.org/D68684 Files: llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyInstPrinter.cpp llvm/lib/Target/WebAssembly/WebAssemblyAsmPrinter.cpp llvm/lib/Target/WebAssembly/WebAssemblyCFGStackify.cpp llvm/lib/Target/WebAssembly/WebAssemblyFastISel.cpp llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp llvm/lib/Target/WebAssembly/WebAssemblyInstrControl.td llvm/lib/Target/WebAssembly/WebAssemblyInstrInfo.td llvm/lib/Target/WebAssembly/WebAssemblyMachineFunctionInfo.cpp llvm/lib/Target/WebAssembly/WebAssemblyPeephole.cpp llvm/test/CodeGen/MIR/WebAssembly/int-type-register-class-name.mir llvm/test/CodeGen/WebAssembly/atomic-fence.mir llvm/test/CodeGen/WebAssembly/eh-labels.mir llvm/test/CodeGen/WebAssembly/explicit-locals.mir llvm/test/CodeGen/WebAssembly/function-info.mir llvm/test/CodeGen/WebAssembly/llround-conv-i32.ll llvm/test/CodeGen/WebAssembly/multivalue.ll llvm/test/CodeGen/WebAssembly/reg-argument.mir llvm/test/CodeGen/WebAssembly/reg-copy.mir llvm/test/DebugInfo/WebAssembly/dbg-value-move-clone.mir llvm/test/DebugInfo/WebAssembly/dbg-value-move-reg-stackify.mir -------------- next part -------------- A non-text attachment was scrubbed... Name: D68684.224170.patch Type: text/x-patch Size: 26211 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 14:41:40 2019 From: llvm-commits at lists.llvm.org (Thomas Lively via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 21:41:40 +0000 (UTC) Subject: [PATCH] D68684: [WebAssembly] Make returns variadic In-Reply-To: References: Message-ID: <4e9eb7e77ae33ddf252459f8fa84502f@localhost.localdomain> tlively added a comment. I'll land this since it's just a WIP and not ready for users yet. I'm happy to discuss ABI limits and other considerations further and make changes in followups! 
================ Comment at: llvm/lib/Target/WebAssembly/WebAssemblyFastISel.cpp:1309 + // TODO: should probably be <= 2? But be conservative to start... + assert(Ret->getNumOperands() < 2 && "Multivalue return not supported yet"); ---------------- aheejin wrote: > Why <=2? If multivalues are supported in fastisel later, do we still have that limit? Oops, this comment should have been removed. I'm updating this to just bail out of FastISel in the case of multiple return for now. The proper condition to check is `Ret->getNumOperands() > 1`. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68684/new/ https://reviews.llvm.org/D68684 From llvm-commits at lists.llvm.org Wed Oct 9 14:42:39 2019 From: llvm-commits at lists.llvm.org (Thomas Lively via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 21:42:39 +0000 (UTC) Subject: [PATCH] D68684: [WebAssembly] Make returns variadic In-Reply-To: References: Message-ID: <17a0a2fad780837354d379981a4f88f6@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rG00f9e5aa76f4: [WebAssembly] Make returns variadic (authored by tlively). 
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68684/new/ https://reviews.llvm.org/D68684 Files: llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyInstPrinter.cpp llvm/lib/Target/WebAssembly/WebAssemblyAsmPrinter.cpp llvm/lib/Target/WebAssembly/WebAssemblyCFGStackify.cpp llvm/lib/Target/WebAssembly/WebAssemblyFastISel.cpp llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp llvm/lib/Target/WebAssembly/WebAssemblyInstrControl.td llvm/lib/Target/WebAssembly/WebAssemblyInstrInfo.td llvm/lib/Target/WebAssembly/WebAssemblyMachineFunctionInfo.cpp llvm/lib/Target/WebAssembly/WebAssemblyPeephole.cpp llvm/test/CodeGen/MIR/WebAssembly/int-type-register-class-name.mir llvm/test/CodeGen/WebAssembly/atomic-fence.mir llvm/test/CodeGen/WebAssembly/eh-labels.mir llvm/test/CodeGen/WebAssembly/explicit-locals.mir llvm/test/CodeGen/WebAssembly/function-info.mir llvm/test/CodeGen/WebAssembly/llround-conv-i32.ll llvm/test/CodeGen/WebAssembly/multivalue.ll llvm/test/CodeGen/WebAssembly/reg-argument.mir llvm/test/CodeGen/WebAssembly/reg-copy.mir llvm/test/DebugInfo/WebAssembly/dbg-value-move-clone.mir llvm/test/DebugInfo/WebAssembly/dbg-value-move-reg-stackify.mir -------------- next part -------------- A non-text attachment was scrubbed... Name: D68684.224171.patch Type: text/x-patch Size: 26211 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 14:52:15 2019 From: llvm-commits at lists.llvm.org (Cameron McInally via llvm-commits) Date: Wed, 09 Oct 2019 21:52:15 -0000 Subject: [llvm] r374240 - [IRBuilder] Update IRBuilder::CreateFNeg(...) to return a UnaryOperator Message-ID: <20191009215215.E507185FBE@lists.llvm.org> Author: mcinally Date: Wed Oct 9 14:52:15 2019 New Revision: 374240 URL: http://llvm.org/viewvc/llvm-project?rev=374240&view=rev Log: [IRBuilder] Update IRBuilder::CreateFNeg(...) to return a UnaryOperator Also update Clang to call Builder.CreateFNeg(...) for UnaryMinus. 
Differential Revision: https://reviews.llvm.org/D61675 Modified: llvm/trunk/include/llvm/IR/IRBuilder.h llvm/trunk/test/CodeGen/AMDGPU/amdgpu-codegenprepare-idiv.ll llvm/trunk/test/CodeGen/AMDGPU/divrem24-assume.ll llvm/trunk/test/Transforms/InstCombine/cos-1.ll llvm/trunk/test/Transforms/InstCombine/fast-math.ll llvm/trunk/test/Transforms/InstCombine/fmul.ll llvm/trunk/test/Transforms/InstCombine/select-crash.ll llvm/trunk/unittests/IR/InstructionsTest.cpp Modified: llvm/trunk/include/llvm/IR/IRBuilder.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/IR/IRBuilder.h?rev=374240&r1=374239&r2=374240&view=diff ============================================================================== --- llvm/trunk/include/llvm/IR/IRBuilder.h (original) +++ llvm/trunk/include/llvm/IR/IRBuilder.h Wed Oct 9 14:52:15 2019 @@ -1504,7 +1504,7 @@ public: MDNode *FPMathTag = nullptr) { if (auto *VC = dyn_cast<Constant>(V)) return Insert(Folder.CreateFNeg(VC), Name); - return Insert(setFPAttrs(BinaryOperator::CreateFNeg(V), FPMathTag, FMF), + return Insert(setFPAttrs(UnaryOperator::CreateFNeg(V), FPMathTag, FMF), Name); } @@ -1514,9 +1514,7 @@ public: const Twine &Name = "") { if (auto *VC = dyn_cast<Constant>(V)) return Insert(Folder.CreateFNeg(VC), Name); - // TODO: This should return UnaryOperator::CreateFNeg(...) once we are - // confident that they are optimized sufficiently. 
- return Insert(setFPAttrs(BinaryOperator::CreateFNeg(V), nullptr, + return Insert(setFPAttrs(UnaryOperator::CreateFNeg(V), nullptr, FMFSource->getFastMathFlags()), Name); } Modified: llvm/trunk/test/CodeGen/AMDGPU/amdgpu-codegenprepare-idiv.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/amdgpu-codegenprepare-idiv.ll?rev=374240&r1=374239&r2=374240&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/AMDGPU/amdgpu-codegenprepare-idiv.ll (original) +++ llvm/trunk/test/CodeGen/AMDGPU/amdgpu-codegenprepare-idiv.ll Wed Oct 9 14:52:15 2019 @@ -227,7 +227,7 @@ define amdgpu_kernel void @udiv_i16(i16 ; CHECK-NEXT: [[TMP5:%.*]] = fdiv fast float 1.000000e+00, [[TMP4]] ; CHECK-NEXT: [[TMP6:%.*]] = fmul fast float [[TMP3]], [[TMP5]] ; CHECK-NEXT: [[TMP7:%.*]] = call fast float @llvm.trunc.f32(float [[TMP6]]) -; CHECK-NEXT: [[TMP8:%.*]] = fsub fast float -0.000000e+00, [[TMP7]] +; CHECK-NEXT: [[TMP8:%.*]] = fneg fast float [[TMP7]] ; CHECK-NEXT: [[TMP9:%.*]] = call fast float @llvm.amdgcn.fmad.ftz.f32(float [[TMP8]], float [[TMP4]], float [[TMP3]]) ; CHECK-NEXT: [[TMP10:%.*]] = fptoui float [[TMP7]] to i32 ; CHECK-NEXT: [[TMP11:%.*]] = call fast float @llvm.fabs.f32(float [[TMP9]]) @@ -254,7 +254,7 @@ define amdgpu_kernel void @urem_i16(i16 ; CHECK-NEXT: [[TMP5:%.*]] = fdiv fast float 1.000000e+00, [[TMP4]] ; CHECK-NEXT: [[TMP6:%.*]] = fmul fast float [[TMP3]], [[TMP5]] ; CHECK-NEXT: [[TMP7:%.*]] = call fast float @llvm.trunc.f32(float [[TMP6]]) -; CHECK-NEXT: [[TMP8:%.*]] = fsub fast float -0.000000e+00, [[TMP7]] +; CHECK-NEXT: [[TMP8:%.*]] = fneg fast float [[TMP7]] ; CHECK-NEXT: [[TMP9:%.*]] = call fast float @llvm.amdgcn.fmad.ftz.f32(float [[TMP8]], float [[TMP4]], float [[TMP3]]) ; CHECK-NEXT: [[TMP10:%.*]] = fptoui float [[TMP7]] to i32 ; CHECK-NEXT: [[TMP11:%.*]] = call fast float @llvm.fabs.f32(float [[TMP9]]) @@ -286,7 +286,7 @@ define amdgpu_kernel void @sdiv_i16(i16 ; 
CHECK-NEXT: [[TMP8:%.*]] = fdiv fast float 1.000000e+00, [[TMP7]] ; CHECK-NEXT: [[TMP9:%.*]] = fmul fast float [[TMP6]], [[TMP8]] ; CHECK-NEXT: [[TMP10:%.*]] = call fast float @llvm.trunc.f32(float [[TMP9]]) -; CHECK-NEXT: [[TMP11:%.*]] = fsub fast float -0.000000e+00, [[TMP10]] +; CHECK-NEXT: [[TMP11:%.*]] = fneg fast float [[TMP10]] ; CHECK-NEXT: [[TMP12:%.*]] = call fast float @llvm.amdgcn.fmad.ftz.f32(float [[TMP11]], float [[TMP7]], float [[TMP6]]) ; CHECK-NEXT: [[TMP13:%.*]] = fptosi float [[TMP10]] to i32 ; CHECK-NEXT: [[TMP14:%.*]] = call fast float @llvm.fabs.f32(float [[TMP12]]) @@ -317,7 +317,7 @@ define amdgpu_kernel void @srem_i16(i16 ; CHECK-NEXT: [[TMP8:%.*]] = fdiv fast float 1.000000e+00, [[TMP7]] ; CHECK-NEXT: [[TMP9:%.*]] = fmul fast float [[TMP6]], [[TMP8]] ; CHECK-NEXT: [[TMP10:%.*]] = call fast float @llvm.trunc.f32(float [[TMP9]]) -; CHECK-NEXT: [[TMP11:%.*]] = fsub fast float -0.000000e+00, [[TMP10]] +; CHECK-NEXT: [[TMP11:%.*]] = fneg fast float [[TMP10]] ; CHECK-NEXT: [[TMP12:%.*]] = call fast float @llvm.amdgcn.fmad.ftz.f32(float [[TMP11]], float [[TMP7]], float [[TMP6]]) ; CHECK-NEXT: [[TMP13:%.*]] = fptosi float [[TMP10]] to i32 ; CHECK-NEXT: [[TMP14:%.*]] = call fast float @llvm.fabs.f32(float [[TMP12]]) @@ -347,7 +347,7 @@ define amdgpu_kernel void @udiv_i8(i8 ad ; CHECK-NEXT: [[TMP5:%.*]] = fdiv fast float 1.000000e+00, [[TMP4]] ; CHECK-NEXT: [[TMP6:%.*]] = fmul fast float [[TMP3]], [[TMP5]] ; CHECK-NEXT: [[TMP7:%.*]] = call fast float @llvm.trunc.f32(float [[TMP6]]) -; CHECK-NEXT: [[TMP8:%.*]] = fsub fast float -0.000000e+00, [[TMP7]] +; CHECK-NEXT: [[TMP8:%.*]] = fneg fast float [[TMP7]] ; CHECK-NEXT: [[TMP9:%.*]] = call fast float @llvm.amdgcn.fmad.ftz.f32(float [[TMP8]], float [[TMP4]], float [[TMP3]]) ; CHECK-NEXT: [[TMP10:%.*]] = fptoui float [[TMP7]] to i32 ; CHECK-NEXT: [[TMP11:%.*]] = call fast float @llvm.fabs.f32(float [[TMP9]]) @@ -374,7 +374,7 @@ define amdgpu_kernel void @urem_i8(i8 ad ; CHECK-NEXT: [[TMP5:%.*]] = fdiv 
fast float 1.000000e+00, [[TMP4]] ; CHECK-NEXT: [[TMP6:%.*]] = fmul fast float [[TMP3]], [[TMP5]] ; CHECK-NEXT: [[TMP7:%.*]] = call fast float @llvm.trunc.f32(float [[TMP6]]) -; CHECK-NEXT: [[TMP8:%.*]] = fsub fast float -0.000000e+00, [[TMP7]] +; CHECK-NEXT: [[TMP8:%.*]] = fneg fast float [[TMP7]] ; CHECK-NEXT: [[TMP9:%.*]] = call fast float @llvm.amdgcn.fmad.ftz.f32(float [[TMP8]], float [[TMP4]], float [[TMP3]]) ; CHECK-NEXT: [[TMP10:%.*]] = fptoui float [[TMP7]] to i32 ; CHECK-NEXT: [[TMP11:%.*]] = call fast float @llvm.fabs.f32(float [[TMP9]]) @@ -406,7 +406,7 @@ define amdgpu_kernel void @sdiv_i8(i8 ad ; CHECK-NEXT: [[TMP8:%.*]] = fdiv fast float 1.000000e+00, [[TMP7]] ; CHECK-NEXT: [[TMP9:%.*]] = fmul fast float [[TMP6]], [[TMP8]] ; CHECK-NEXT: [[TMP10:%.*]] = call fast float @llvm.trunc.f32(float [[TMP9]]) -; CHECK-NEXT: [[TMP11:%.*]] = fsub fast float -0.000000e+00, [[TMP10]] +; CHECK-NEXT: [[TMP11:%.*]] = fneg fast float [[TMP10]] ; CHECK-NEXT: [[TMP12:%.*]] = call fast float @llvm.amdgcn.fmad.ftz.f32(float [[TMP11]], float [[TMP7]], float [[TMP6]]) ; CHECK-NEXT: [[TMP13:%.*]] = fptosi float [[TMP10]] to i32 ; CHECK-NEXT: [[TMP14:%.*]] = call fast float @llvm.fabs.f32(float [[TMP12]]) @@ -437,7 +437,7 @@ define amdgpu_kernel void @srem_i8(i8 ad ; CHECK-NEXT: [[TMP8:%.*]] = fdiv fast float 1.000000e+00, [[TMP7]] ; CHECK-NEXT: [[TMP9:%.*]] = fmul fast float [[TMP6]], [[TMP8]] ; CHECK-NEXT: [[TMP10:%.*]] = call fast float @llvm.trunc.f32(float [[TMP9]]) -; CHECK-NEXT: [[TMP11:%.*]] = fsub fast float -0.000000e+00, [[TMP10]] +; CHECK-NEXT: [[TMP11:%.*]] = fneg fast float [[TMP10]] ; CHECK-NEXT: [[TMP12:%.*]] = call fast float @llvm.amdgcn.fmad.ftz.f32(float [[TMP11]], float [[TMP7]], float [[TMP6]]) ; CHECK-NEXT: [[TMP13:%.*]] = fptosi float [[TMP10]] to i32 ; CHECK-NEXT: [[TMP14:%.*]] = call fast float @llvm.fabs.f32(float [[TMP12]]) @@ -1265,7 +1265,7 @@ define amdgpu_kernel void @udiv_v4i16(<4 ; CHECK-NEXT: [[TMP7:%.*]] = fdiv fast float 1.000000e+00, 
[[TMP6]] ; CHECK-NEXT: [[TMP8:%.*]] = fmul fast float [[TMP5]], [[TMP7]] ; CHECK-NEXT: [[TMP9:%.*]] = call fast float @llvm.trunc.f32(float [[TMP8]]) -; CHECK-NEXT: [[TMP10:%.*]] = fsub fast float -0.000000e+00, [[TMP9]] +; CHECK-NEXT: [[TMP10:%.*]] = fneg fast float [[TMP9]] ; CHECK-NEXT: [[TMP11:%.*]] = call fast float @llvm.amdgcn.fmad.ftz.f32(float [[TMP10]], float [[TMP6]], float [[TMP5]]) ; CHECK-NEXT: [[TMP12:%.*]] = fptoui float [[TMP9]] to i32 ; CHECK-NEXT: [[TMP13:%.*]] = call fast float @llvm.fabs.f32(float [[TMP11]]) @@ -1285,7 +1285,7 @@ define amdgpu_kernel void @udiv_v4i16(<4 ; CHECK-NEXT: [[TMP27:%.*]] = fdiv fast float 1.000000e+00, [[TMP26]] ; CHECK-NEXT: [[TMP28:%.*]] = fmul fast float [[TMP25]], [[TMP27]] ; CHECK-NEXT: [[TMP29:%.*]] = call fast float @llvm.trunc.f32(float [[TMP28]]) -; CHECK-NEXT: [[TMP30:%.*]] = fsub fast float -0.000000e+00, [[TMP29]] +; CHECK-NEXT: [[TMP30:%.*]] = fneg fast float [[TMP29]] ; CHECK-NEXT: [[TMP31:%.*]] = call fast float @llvm.amdgcn.fmad.ftz.f32(float [[TMP30]], float [[TMP26]], float [[TMP25]]) ; CHECK-NEXT: [[TMP32:%.*]] = fptoui float [[TMP29]] to i32 ; CHECK-NEXT: [[TMP33:%.*]] = call fast float @llvm.fabs.f32(float [[TMP31]]) @@ -1305,7 +1305,7 @@ define amdgpu_kernel void @udiv_v4i16(<4 ; CHECK-NEXT: [[TMP47:%.*]] = fdiv fast float 1.000000e+00, [[TMP46]] ; CHECK-NEXT: [[TMP48:%.*]] = fmul fast float [[TMP45]], [[TMP47]] ; CHECK-NEXT: [[TMP49:%.*]] = call fast float @llvm.trunc.f32(float [[TMP48]]) -; CHECK-NEXT: [[TMP50:%.*]] = fsub fast float -0.000000e+00, [[TMP49]] +; CHECK-NEXT: [[TMP50:%.*]] = fneg fast float [[TMP49]] ; CHECK-NEXT: [[TMP51:%.*]] = call fast float @llvm.amdgcn.fmad.ftz.f32(float [[TMP50]], float [[TMP46]], float [[TMP45]]) ; CHECK-NEXT: [[TMP52:%.*]] = fptoui float [[TMP49]] to i32 ; CHECK-NEXT: [[TMP53:%.*]] = call fast float @llvm.fabs.f32(float [[TMP51]]) @@ -1325,7 +1325,7 @@ define amdgpu_kernel void @udiv_v4i16(<4 ; CHECK-NEXT: [[TMP67:%.*]] = fdiv fast float 1.000000e+00, 
[[TMP66]] ; CHECK-NEXT: [[TMP68:%.*]] = fmul fast float [[TMP65]], [[TMP67]] ; CHECK-NEXT: [[TMP69:%.*]] = call fast float @llvm.trunc.f32(float [[TMP68]]) -; CHECK-NEXT: [[TMP70:%.*]] = fsub fast float -0.000000e+00, [[TMP69]] +; CHECK-NEXT: [[TMP70:%.*]] = fneg fast float [[TMP69]] ; CHECK-NEXT: [[TMP71:%.*]] = call fast float @llvm.amdgcn.fmad.ftz.f32(float [[TMP70]], float [[TMP66]], float [[TMP65]]) ; CHECK-NEXT: [[TMP72:%.*]] = fptoui float [[TMP69]] to i32 ; CHECK-NEXT: [[TMP73:%.*]] = call fast float @llvm.fabs.f32(float [[TMP71]]) @@ -1355,7 +1355,7 @@ define amdgpu_kernel void @urem_v4i16(<4 ; CHECK-NEXT: [[TMP7:%.*]] = fdiv fast float 1.000000e+00, [[TMP6]] ; CHECK-NEXT: [[TMP8:%.*]] = fmul fast float [[TMP5]], [[TMP7]] ; CHECK-NEXT: [[TMP9:%.*]] = call fast float @llvm.trunc.f32(float [[TMP8]]) -; CHECK-NEXT: [[TMP10:%.*]] = fsub fast float -0.000000e+00, [[TMP9]] +; CHECK-NEXT: [[TMP10:%.*]] = fneg fast float [[TMP9]] ; CHECK-NEXT: [[TMP11:%.*]] = call fast float @llvm.amdgcn.fmad.ftz.f32(float [[TMP10]], float [[TMP6]], float [[TMP5]]) ; CHECK-NEXT: [[TMP12:%.*]] = fptoui float [[TMP9]] to i32 ; CHECK-NEXT: [[TMP13:%.*]] = call fast float @llvm.fabs.f32(float [[TMP11]]) @@ -1377,7 +1377,7 @@ define amdgpu_kernel void @urem_v4i16(<4 ; CHECK-NEXT: [[TMP29:%.*]] = fdiv fast float 1.000000e+00, [[TMP28]] ; CHECK-NEXT: [[TMP30:%.*]] = fmul fast float [[TMP27]], [[TMP29]] ; CHECK-NEXT: [[TMP31:%.*]] = call fast float @llvm.trunc.f32(float [[TMP30]]) -; CHECK-NEXT: [[TMP32:%.*]] = fsub fast float -0.000000e+00, [[TMP31]] +; CHECK-NEXT: [[TMP32:%.*]] = fneg fast float [[TMP31]] ; CHECK-NEXT: [[TMP33:%.*]] = call fast float @llvm.amdgcn.fmad.ftz.f32(float [[TMP32]], float [[TMP28]], float [[TMP27]]) ; CHECK-NEXT: [[TMP34:%.*]] = fptoui float [[TMP31]] to i32 ; CHECK-NEXT: [[TMP35:%.*]] = call fast float @llvm.fabs.f32(float [[TMP33]]) @@ -1399,7 +1399,7 @@ define amdgpu_kernel void @urem_v4i16(<4 ; CHECK-NEXT: [[TMP51:%.*]] = fdiv fast float 1.000000e+00, 
[[TMP50]] ; CHECK-NEXT: [[TMP52:%.*]] = fmul fast float [[TMP49]], [[TMP51]] ; CHECK-NEXT: [[TMP53:%.*]] = call fast float @llvm.trunc.f32(float [[TMP52]]) -; CHECK-NEXT: [[TMP54:%.*]] = fsub fast float -0.000000e+00, [[TMP53]] +; CHECK-NEXT: [[TMP54:%.*]] = fneg fast float [[TMP53]] ; CHECK-NEXT: [[TMP55:%.*]] = call fast float @llvm.amdgcn.fmad.ftz.f32(float [[TMP54]], float [[TMP50]], float [[TMP49]]) ; CHECK-NEXT: [[TMP56:%.*]] = fptoui float [[TMP53]] to i32 ; CHECK-NEXT: [[TMP57:%.*]] = call fast float @llvm.fabs.f32(float [[TMP55]]) @@ -1421,7 +1421,7 @@ define amdgpu_kernel void @urem_v4i16(<4 ; CHECK-NEXT: [[TMP73:%.*]] = fdiv fast float 1.000000e+00, [[TMP72]] ; CHECK-NEXT: [[TMP74:%.*]] = fmul fast float [[TMP71]], [[TMP73]] ; CHECK-NEXT: [[TMP75:%.*]] = call fast float @llvm.trunc.f32(float [[TMP74]]) -; CHECK-NEXT: [[TMP76:%.*]] = fsub fast float -0.000000e+00, [[TMP75]] +; CHECK-NEXT: [[TMP76:%.*]] = fneg fast float [[TMP75]] ; CHECK-NEXT: [[TMP77:%.*]] = call fast float @llvm.amdgcn.fmad.ftz.f32(float [[TMP76]], float [[TMP72]], float [[TMP71]]) ; CHECK-NEXT: [[TMP78:%.*]] = fptoui float [[TMP75]] to i32 ; CHECK-NEXT: [[TMP79:%.*]] = call fast float @llvm.fabs.f32(float [[TMP77]]) @@ -1456,7 +1456,7 @@ define amdgpu_kernel void @sdiv_v4i16(<4 ; CHECK-NEXT: [[TMP10:%.*]] = fdiv fast float 1.000000e+00, [[TMP9]] ; CHECK-NEXT: [[TMP11:%.*]] = fmul fast float [[TMP8]], [[TMP10]] ; CHECK-NEXT: [[TMP12:%.*]] = call fast float @llvm.trunc.f32(float [[TMP11]]) -; CHECK-NEXT: [[TMP13:%.*]] = fsub fast float -0.000000e+00, [[TMP12]] +; CHECK-NEXT: [[TMP13:%.*]] = fneg fast float [[TMP12]] ; CHECK-NEXT: [[TMP14:%.*]] = call fast float @llvm.amdgcn.fmad.ftz.f32(float [[TMP13]], float [[TMP9]], float [[TMP8]]) ; CHECK-NEXT: [[TMP15:%.*]] = fptosi float [[TMP12]] to i32 ; CHECK-NEXT: [[TMP16:%.*]] = call fast float @llvm.fabs.f32(float [[TMP14]]) @@ -1480,7 +1480,7 @@ define amdgpu_kernel void @sdiv_v4i16(<4 ; CHECK-NEXT: [[TMP34:%.*]] = fdiv fast float 
1.000000e+00, [[TMP33]] ; CHECK-NEXT: [[TMP35:%.*]] = fmul fast float [[TMP32]], [[TMP34]] ; CHECK-NEXT: [[TMP36:%.*]] = call fast float @llvm.trunc.f32(float [[TMP35]]) -; CHECK-NEXT: [[TMP37:%.*]] = fsub fast float -0.000000e+00, [[TMP36]] +; CHECK-NEXT: [[TMP37:%.*]] = fneg fast float [[TMP36]] ; CHECK-NEXT: [[TMP38:%.*]] = call fast float @llvm.amdgcn.fmad.ftz.f32(float [[TMP37]], float [[TMP33]], float [[TMP32]]) ; CHECK-NEXT: [[TMP39:%.*]] = fptosi float [[TMP36]] to i32 ; CHECK-NEXT: [[TMP40:%.*]] = call fast float @llvm.fabs.f32(float [[TMP38]]) @@ -1504,7 +1504,7 @@ define amdgpu_kernel void @sdiv_v4i16(<4 ; CHECK-NEXT: [[TMP58:%.*]] = fdiv fast float 1.000000e+00, [[TMP57]] ; CHECK-NEXT: [[TMP59:%.*]] = fmul fast float [[TMP56]], [[TMP58]] ; CHECK-NEXT: [[TMP60:%.*]] = call fast float @llvm.trunc.f32(float [[TMP59]]) -; CHECK-NEXT: [[TMP61:%.*]] = fsub fast float -0.000000e+00, [[TMP60]] +; CHECK-NEXT: [[TMP61:%.*]] = fneg fast float [[TMP60]] ; CHECK-NEXT: [[TMP62:%.*]] = call fast float @llvm.amdgcn.fmad.ftz.f32(float [[TMP61]], float [[TMP57]], float [[TMP56]]) ; CHECK-NEXT: [[TMP63:%.*]] = fptosi float [[TMP60]] to i32 ; CHECK-NEXT: [[TMP64:%.*]] = call fast float @llvm.fabs.f32(float [[TMP62]]) @@ -1528,7 +1528,7 @@ define amdgpu_kernel void @sdiv_v4i16(<4 ; CHECK-NEXT: [[TMP82:%.*]] = fdiv fast float 1.000000e+00, [[TMP81]] ; CHECK-NEXT: [[TMP83:%.*]] = fmul fast float [[TMP80]], [[TMP82]] ; CHECK-NEXT: [[TMP84:%.*]] = call fast float @llvm.trunc.f32(float [[TMP83]]) -; CHECK-NEXT: [[TMP85:%.*]] = fsub fast float -0.000000e+00, [[TMP84]] +; CHECK-NEXT: [[TMP85:%.*]] = fneg fast float [[TMP84]] ; CHECK-NEXT: [[TMP86:%.*]] = call fast float @llvm.amdgcn.fmad.ftz.f32(float [[TMP85]], float [[TMP81]], float [[TMP80]]) ; CHECK-NEXT: [[TMP87:%.*]] = fptosi float [[TMP84]] to i32 ; CHECK-NEXT: [[TMP88:%.*]] = call fast float @llvm.fabs.f32(float [[TMP86]]) @@ -1562,7 +1562,7 @@ define amdgpu_kernel void @srem_v4i16(<4 ; CHECK-NEXT: [[TMP10:%.*]] = fdiv 
fast float 1.000000e+00, [[TMP9]] ; CHECK-NEXT: [[TMP11:%.*]] = fmul fast float [[TMP8]], [[TMP10]] ; CHECK-NEXT: [[TMP12:%.*]] = call fast float @llvm.trunc.f32(float [[TMP11]]) -; CHECK-NEXT: [[TMP13:%.*]] = fsub fast float -0.000000e+00, [[TMP12]] +; CHECK-NEXT: [[TMP13:%.*]] = fneg fast float [[TMP12]] ; CHECK-NEXT: [[TMP14:%.*]] = call fast float @llvm.amdgcn.fmad.ftz.f32(float [[TMP13]], float [[TMP9]], float [[TMP8]]) ; CHECK-NEXT: [[TMP15:%.*]] = fptosi float [[TMP12]] to i32 ; CHECK-NEXT: [[TMP16:%.*]] = call fast float @llvm.fabs.f32(float [[TMP14]]) @@ -1588,7 +1588,7 @@ define amdgpu_kernel void @srem_v4i16(<4 ; CHECK-NEXT: [[TMP36:%.*]] = fdiv fast float 1.000000e+00, [[TMP35]] ; CHECK-NEXT: [[TMP37:%.*]] = fmul fast float [[TMP34]], [[TMP36]] ; CHECK-NEXT: [[TMP38:%.*]] = call fast float @llvm.trunc.f32(float [[TMP37]]) -; CHECK-NEXT: [[TMP39:%.*]] = fsub fast float -0.000000e+00, [[TMP38]] +; CHECK-NEXT: [[TMP39:%.*]] = fneg fast float [[TMP38]] ; CHECK-NEXT: [[TMP40:%.*]] = call fast float @llvm.amdgcn.fmad.ftz.f32(float [[TMP39]], float [[TMP35]], float [[TMP34]]) ; CHECK-NEXT: [[TMP41:%.*]] = fptosi float [[TMP38]] to i32 ; CHECK-NEXT: [[TMP42:%.*]] = call fast float @llvm.fabs.f32(float [[TMP40]]) @@ -1614,7 +1614,7 @@ define amdgpu_kernel void @srem_v4i16(<4 ; CHECK-NEXT: [[TMP62:%.*]] = fdiv fast float 1.000000e+00, [[TMP61]] ; CHECK-NEXT: [[TMP63:%.*]] = fmul fast float [[TMP60]], [[TMP62]] ; CHECK-NEXT: [[TMP64:%.*]] = call fast float @llvm.trunc.f32(float [[TMP63]]) -; CHECK-NEXT: [[TMP65:%.*]] = fsub fast float -0.000000e+00, [[TMP64]] +; CHECK-NEXT: [[TMP65:%.*]] = fneg fast float [[TMP64]] ; CHECK-NEXT: [[TMP66:%.*]] = call fast float @llvm.amdgcn.fmad.ftz.f32(float [[TMP65]], float [[TMP61]], float [[TMP60]]) ; CHECK-NEXT: [[TMP67:%.*]] = fptosi float [[TMP64]] to i32 ; CHECK-NEXT: [[TMP68:%.*]] = call fast float @llvm.fabs.f32(float [[TMP66]]) @@ -1640,7 +1640,7 @@ define amdgpu_kernel void @srem_v4i16(<4 ; CHECK-NEXT: [[TMP88:%.*]] = 
fdiv fast float 1.000000e+00, [[TMP87]] ; CHECK-NEXT: [[TMP89:%.*]] = fmul fast float [[TMP86]], [[TMP88]] ; CHECK-NEXT: [[TMP90:%.*]] = call fast float @llvm.trunc.f32(float [[TMP89]]) -; CHECK-NEXT: [[TMP91:%.*]] = fsub fast float -0.000000e+00, [[TMP90]] +; CHECK-NEXT: [[TMP91:%.*]] = fneg fast float [[TMP90]] ; CHECK-NEXT: [[TMP92:%.*]] = call fast float @llvm.amdgcn.fmad.ftz.f32(float [[TMP91]], float [[TMP87]], float [[TMP86]]) ; CHECK-NEXT: [[TMP93:%.*]] = fptosi float [[TMP90]] to i32 ; CHECK-NEXT: [[TMP94:%.*]] = call fast float @llvm.fabs.f32(float [[TMP92]]) @@ -1671,7 +1671,7 @@ define amdgpu_kernel void @udiv_i3(i3 ad ; CHECK-NEXT: [[TMP5:%.*]] = fdiv fast float 1.000000e+00, [[TMP4]] ; CHECK-NEXT: [[TMP6:%.*]] = fmul fast float [[TMP3]], [[TMP5]] ; CHECK-NEXT: [[TMP7:%.*]] = call fast float @llvm.trunc.f32(float [[TMP6]]) -; CHECK-NEXT: [[TMP8:%.*]] = fsub fast float -0.000000e+00, [[TMP7]] +; CHECK-NEXT: [[TMP8:%.*]] = fneg fast float [[TMP7]] ; CHECK-NEXT: [[TMP9:%.*]] = call fast float @llvm.amdgcn.fmad.ftz.f32(float [[TMP8]], float [[TMP4]], float [[TMP3]]) ; CHECK-NEXT: [[TMP10:%.*]] = fptoui float [[TMP7]] to i32 ; CHECK-NEXT: [[TMP11:%.*]] = call fast float @llvm.fabs.f32(float [[TMP9]]) @@ -1698,7 +1698,7 @@ define amdgpu_kernel void @urem_i3(i3 ad ; CHECK-NEXT: [[TMP5:%.*]] = fdiv fast float 1.000000e+00, [[TMP4]] ; CHECK-NEXT: [[TMP6:%.*]] = fmul fast float [[TMP3]], [[TMP5]] ; CHECK-NEXT: [[TMP7:%.*]] = call fast float @llvm.trunc.f32(float [[TMP6]]) -; CHECK-NEXT: [[TMP8:%.*]] = fsub fast float -0.000000e+00, [[TMP7]] +; CHECK-NEXT: [[TMP8:%.*]] = fneg fast float [[TMP7]] ; CHECK-NEXT: [[TMP9:%.*]] = call fast float @llvm.amdgcn.fmad.ftz.f32(float [[TMP8]], float [[TMP4]], float [[TMP3]]) ; CHECK-NEXT: [[TMP10:%.*]] = fptoui float [[TMP7]] to i32 ; CHECK-NEXT: [[TMP11:%.*]] = call fast float @llvm.fabs.f32(float [[TMP9]]) @@ -1730,7 +1730,7 @@ define amdgpu_kernel void @sdiv_i3(i3 ad ; CHECK-NEXT: [[TMP8:%.*]] = fdiv fast float 
1.000000e+00, [[TMP7]] ; CHECK-NEXT: [[TMP9:%.*]] = fmul fast float [[TMP6]], [[TMP8]] ; CHECK-NEXT: [[TMP10:%.*]] = call fast float @llvm.trunc.f32(float [[TMP9]]) -; CHECK-NEXT: [[TMP11:%.*]] = fsub fast float -0.000000e+00, [[TMP10]] +; CHECK-NEXT: [[TMP11:%.*]] = fneg fast float [[TMP10]] ; CHECK-NEXT: [[TMP12:%.*]] = call fast float @llvm.amdgcn.fmad.ftz.f32(float [[TMP11]], float [[TMP7]], float [[TMP6]]) ; CHECK-NEXT: [[TMP13:%.*]] = fptosi float [[TMP10]] to i32 ; CHECK-NEXT: [[TMP14:%.*]] = call fast float @llvm.fabs.f32(float [[TMP12]]) @@ -1761,7 +1761,7 @@ define amdgpu_kernel void @srem_i3(i3 ad ; CHECK-NEXT: [[TMP8:%.*]] = fdiv fast float 1.000000e+00, [[TMP7]] ; CHECK-NEXT: [[TMP9:%.*]] = fmul fast float [[TMP6]], [[TMP8]] ; CHECK-NEXT: [[TMP10:%.*]] = call fast float @llvm.trunc.f32(float [[TMP9]]) -; CHECK-NEXT: [[TMP11:%.*]] = fsub fast float -0.000000e+00, [[TMP10]] +; CHECK-NEXT: [[TMP11:%.*]] = fneg fast float [[TMP10]] ; CHECK-NEXT: [[TMP12:%.*]] = call fast float @llvm.amdgcn.fmad.ftz.f32(float [[TMP11]], float [[TMP7]], float [[TMP6]]) ; CHECK-NEXT: [[TMP13:%.*]] = fptosi float [[TMP10]] to i32 ; CHECK-NEXT: [[TMP14:%.*]] = call fast float @llvm.fabs.f32(float [[TMP12]]) @@ -1793,7 +1793,7 @@ define amdgpu_kernel void @udiv_v3i16(<3 ; CHECK-NEXT: [[TMP7:%.*]] = fdiv fast float 1.000000e+00, [[TMP6]] ; CHECK-NEXT: [[TMP8:%.*]] = fmul fast float [[TMP5]], [[TMP7]] ; CHECK-NEXT: [[TMP9:%.*]] = call fast float @llvm.trunc.f32(float [[TMP8]]) -; CHECK-NEXT: [[TMP10:%.*]] = fsub fast float -0.000000e+00, [[TMP9]] +; CHECK-NEXT: [[TMP10:%.*]] = fneg fast float [[TMP9]] ; CHECK-NEXT: [[TMP11:%.*]] = call fast float @llvm.amdgcn.fmad.ftz.f32(float [[TMP10]], float [[TMP6]], float [[TMP5]]) ; CHECK-NEXT: [[TMP12:%.*]] = fptoui float [[TMP9]] to i32 ; CHECK-NEXT: [[TMP13:%.*]] = call fast float @llvm.fabs.f32(float [[TMP11]]) @@ -1813,7 +1813,7 @@ define amdgpu_kernel void @udiv_v3i16(<3 ; CHECK-NEXT: [[TMP27:%.*]] = fdiv fast float 1.000000e+00, 
[[TMP26]] ; CHECK-NEXT: [[TMP28:%.*]] = fmul fast float [[TMP25]], [[TMP27]] ; CHECK-NEXT: [[TMP29:%.*]] = call fast float @llvm.trunc.f32(float [[TMP28]]) -; CHECK-NEXT: [[TMP30:%.*]] = fsub fast float -0.000000e+00, [[TMP29]] +; CHECK-NEXT: [[TMP30:%.*]] = fneg fast float [[TMP29]] ; CHECK-NEXT: [[TMP31:%.*]] = call fast float @llvm.amdgcn.fmad.ftz.f32(float [[TMP30]], float [[TMP26]], float [[TMP25]]) ; CHECK-NEXT: [[TMP32:%.*]] = fptoui float [[TMP29]] to i32 ; CHECK-NEXT: [[TMP33:%.*]] = call fast float @llvm.fabs.f32(float [[TMP31]]) @@ -1833,7 +1833,7 @@ define amdgpu_kernel void @udiv_v3i16(<3 ; CHECK-NEXT: [[TMP47:%.*]] = fdiv fast float 1.000000e+00, [[TMP46]] ; CHECK-NEXT: [[TMP48:%.*]] = fmul fast float [[TMP45]], [[TMP47]] ; CHECK-NEXT: [[TMP49:%.*]] = call fast float @llvm.trunc.f32(float [[TMP48]]) -; CHECK-NEXT: [[TMP50:%.*]] = fsub fast float -0.000000e+00, [[TMP49]] +; CHECK-NEXT: [[TMP50:%.*]] = fneg fast float [[TMP49]] ; CHECK-NEXT: [[TMP51:%.*]] = call fast float @llvm.amdgcn.fmad.ftz.f32(float [[TMP50]], float [[TMP46]], float [[TMP45]]) ; CHECK-NEXT: [[TMP52:%.*]] = fptoui float [[TMP49]] to i32 ; CHECK-NEXT: [[TMP53:%.*]] = call fast float @llvm.fabs.f32(float [[TMP51]]) @@ -1863,7 +1863,7 @@ define amdgpu_kernel void @urem_v3i16(<3 ; CHECK-NEXT: [[TMP7:%.*]] = fdiv fast float 1.000000e+00, [[TMP6]] ; CHECK-NEXT: [[TMP8:%.*]] = fmul fast float [[TMP5]], [[TMP7]] ; CHECK-NEXT: [[TMP9:%.*]] = call fast float @llvm.trunc.f32(float [[TMP8]]) -; CHECK-NEXT: [[TMP10:%.*]] = fsub fast float -0.000000e+00, [[TMP9]] +; CHECK-NEXT: [[TMP10:%.*]] = fneg fast float [[TMP9]] ; CHECK-NEXT: [[TMP11:%.*]] = call fast float @llvm.amdgcn.fmad.ftz.f32(float [[TMP10]], float [[TMP6]], float [[TMP5]]) ; CHECK-NEXT: [[TMP12:%.*]] = fptoui float [[TMP9]] to i32 ; CHECK-NEXT: [[TMP13:%.*]] = call fast float @llvm.fabs.f32(float [[TMP11]]) @@ -1885,7 +1885,7 @@ define amdgpu_kernel void @urem_v3i16(<3 ; CHECK-NEXT: [[TMP29:%.*]] = fdiv fast float 1.000000e+00, 
[[TMP28]] ; CHECK-NEXT: [[TMP30:%.*]] = fmul fast float [[TMP27]], [[TMP29]] ; CHECK-NEXT: [[TMP31:%.*]] = call fast float @llvm.trunc.f32(float [[TMP30]]) -; CHECK-NEXT: [[TMP32:%.*]] = fsub fast float -0.000000e+00, [[TMP31]] +; CHECK-NEXT: [[TMP32:%.*]] = fneg fast float [[TMP31]] ; CHECK-NEXT: [[TMP33:%.*]] = call fast float @llvm.amdgcn.fmad.ftz.f32(float [[TMP32]], float [[TMP28]], float [[TMP27]]) ; CHECK-NEXT: [[TMP34:%.*]] = fptoui float [[TMP31]] to i32 ; CHECK-NEXT: [[TMP35:%.*]] = call fast float @llvm.fabs.f32(float [[TMP33]]) @@ -1907,7 +1907,7 @@ define amdgpu_kernel void @urem_v3i16(<3 ; CHECK-NEXT: [[TMP51:%.*]] = fdiv fast float 1.000000e+00, [[TMP50]] ; CHECK-NEXT: [[TMP52:%.*]] = fmul fast float [[TMP49]], [[TMP51]] ; CHECK-NEXT: [[TMP53:%.*]] = call fast float @llvm.trunc.f32(float [[TMP52]]) -; CHECK-NEXT: [[TMP54:%.*]] = fsub fast float -0.000000e+00, [[TMP53]] +; CHECK-NEXT: [[TMP54:%.*]] = fneg fast float [[TMP53]] ; CHECK-NEXT: [[TMP55:%.*]] = call fast float @llvm.amdgcn.fmad.ftz.f32(float [[TMP54]], float [[TMP50]], float [[TMP49]]) ; CHECK-NEXT: [[TMP56:%.*]] = fptoui float [[TMP53]] to i32 ; CHECK-NEXT: [[TMP57:%.*]] = call fast float @llvm.fabs.f32(float [[TMP55]]) @@ -1942,7 +1942,7 @@ define amdgpu_kernel void @sdiv_v3i16(<3 ; CHECK-NEXT: [[TMP10:%.*]] = fdiv fast float 1.000000e+00, [[TMP9]] ; CHECK-NEXT: [[TMP11:%.*]] = fmul fast float [[TMP8]], [[TMP10]] ; CHECK-NEXT: [[TMP12:%.*]] = call fast float @llvm.trunc.f32(float [[TMP11]]) -; CHECK-NEXT: [[TMP13:%.*]] = fsub fast float -0.000000e+00, [[TMP12]] +; CHECK-NEXT: [[TMP13:%.*]] = fneg fast float [[TMP12]] ; CHECK-NEXT: [[TMP14:%.*]] = call fast float @llvm.amdgcn.fmad.ftz.f32(float [[TMP13]], float [[TMP9]], float [[TMP8]]) ; CHECK-NEXT: [[TMP15:%.*]] = fptosi float [[TMP12]] to i32 ; CHECK-NEXT: [[TMP16:%.*]] = call fast float @llvm.fabs.f32(float [[TMP14]]) @@ -1966,7 +1966,7 @@ define amdgpu_kernel void @sdiv_v3i16(<3 ; CHECK-NEXT: [[TMP34:%.*]] = fdiv fast float 
1.000000e+00, [[TMP33]] ; CHECK-NEXT: [[TMP35:%.*]] = fmul fast float [[TMP32]], [[TMP34]] ; CHECK-NEXT: [[TMP36:%.*]] = call fast float @llvm.trunc.f32(float [[TMP35]]) -; CHECK-NEXT: [[TMP37:%.*]] = fsub fast float -0.000000e+00, [[TMP36]] +; CHECK-NEXT: [[TMP37:%.*]] = fneg fast float [[TMP36]] ; CHECK-NEXT: [[TMP38:%.*]] = call fast float @llvm.amdgcn.fmad.ftz.f32(float [[TMP37]], float [[TMP33]], float [[TMP32]]) ; CHECK-NEXT: [[TMP39:%.*]] = fptosi float [[TMP36]] to i32 ; CHECK-NEXT: [[TMP40:%.*]] = call fast float @llvm.fabs.f32(float [[TMP38]]) @@ -1990,7 +1990,7 @@ define amdgpu_kernel void @sdiv_v3i16(<3 ; CHECK-NEXT: [[TMP58:%.*]] = fdiv fast float 1.000000e+00, [[TMP57]] ; CHECK-NEXT: [[TMP59:%.*]] = fmul fast float [[TMP56]], [[TMP58]] ; CHECK-NEXT: [[TMP60:%.*]] = call fast float @llvm.trunc.f32(float [[TMP59]]) -; CHECK-NEXT: [[TMP61:%.*]] = fsub fast float -0.000000e+00, [[TMP60]] +; CHECK-NEXT: [[TMP61:%.*]] = fneg fast float [[TMP60]] ; CHECK-NEXT: [[TMP62:%.*]] = call fast float @llvm.amdgcn.fmad.ftz.f32(float [[TMP61]], float [[TMP57]], float [[TMP56]]) ; CHECK-NEXT: [[TMP63:%.*]] = fptosi float [[TMP60]] to i32 ; CHECK-NEXT: [[TMP64:%.*]] = call fast float @llvm.fabs.f32(float [[TMP62]]) @@ -2024,7 +2024,7 @@ define amdgpu_kernel void @srem_v3i16(<3 ; CHECK-NEXT: [[TMP10:%.*]] = fdiv fast float 1.000000e+00, [[TMP9]] ; CHECK-NEXT: [[TMP11:%.*]] = fmul fast float [[TMP8]], [[TMP10]] ; CHECK-NEXT: [[TMP12:%.*]] = call fast float @llvm.trunc.f32(float [[TMP11]]) -; CHECK-NEXT: [[TMP13:%.*]] = fsub fast float -0.000000e+00, [[TMP12]] +; CHECK-NEXT: [[TMP13:%.*]] = fneg fast float [[TMP12]] ; CHECK-NEXT: [[TMP14:%.*]] = call fast float @llvm.amdgcn.fmad.ftz.f32(float [[TMP13]], float [[TMP9]], float [[TMP8]]) ; CHECK-NEXT: [[TMP15:%.*]] = fptosi float [[TMP12]] to i32 ; CHECK-NEXT: [[TMP16:%.*]] = call fast float @llvm.fabs.f32(float [[TMP14]]) @@ -2050,7 +2050,7 @@ define amdgpu_kernel void @srem_v3i16(<3 ; CHECK-NEXT: [[TMP36:%.*]] = fdiv fast 
float 1.000000e+00, [[TMP35]] ; CHECK-NEXT: [[TMP37:%.*]] = fmul fast float [[TMP34]], [[TMP36]] ; CHECK-NEXT: [[TMP38:%.*]] = call fast float @llvm.trunc.f32(float [[TMP37]]) -; CHECK-NEXT: [[TMP39:%.*]] = fsub fast float -0.000000e+00, [[TMP38]] +; CHECK-NEXT: [[TMP39:%.*]] = fneg fast float [[TMP38]] ; CHECK-NEXT: [[TMP40:%.*]] = call fast float @llvm.amdgcn.fmad.ftz.f32(float [[TMP39]], float [[TMP35]], float [[TMP34]]) ; CHECK-NEXT: [[TMP41:%.*]] = fptosi float [[TMP38]] to i32 ; CHECK-NEXT: [[TMP42:%.*]] = call fast float @llvm.fabs.f32(float [[TMP40]]) @@ -2076,7 +2076,7 @@ define amdgpu_kernel void @srem_v3i16(<3 ; CHECK-NEXT: [[TMP62:%.*]] = fdiv fast float 1.000000e+00, [[TMP61]] ; CHECK-NEXT: [[TMP63:%.*]] = fmul fast float [[TMP60]], [[TMP62]] ; CHECK-NEXT: [[TMP64:%.*]] = call fast float @llvm.trunc.f32(float [[TMP63]]) -; CHECK-NEXT: [[TMP65:%.*]] = fsub fast float -0.000000e+00, [[TMP64]] +; CHECK-NEXT: [[TMP65:%.*]] = fneg fast float [[TMP64]] ; CHECK-NEXT: [[TMP66:%.*]] = call fast float @llvm.amdgcn.fmad.ftz.f32(float [[TMP65]], float [[TMP61]], float [[TMP60]]) ; CHECK-NEXT: [[TMP67:%.*]] = fptosi float [[TMP64]] to i32 ; CHECK-NEXT: [[TMP68:%.*]] = call fast float @llvm.fabs.f32(float [[TMP66]]) @@ -2109,7 +2109,7 @@ define amdgpu_kernel void @udiv_v3i15(<3 ; CHECK-NEXT: [[TMP7:%.*]] = fdiv fast float 1.000000e+00, [[TMP6]] ; CHECK-NEXT: [[TMP8:%.*]] = fmul fast float [[TMP5]], [[TMP7]] ; CHECK-NEXT: [[TMP9:%.*]] = call fast float @llvm.trunc.f32(float [[TMP8]]) -; CHECK-NEXT: [[TMP10:%.*]] = fsub fast float -0.000000e+00, [[TMP9]] +; CHECK-NEXT: [[TMP10:%.*]] = fneg fast float [[TMP9]] ; CHECK-NEXT: [[TMP11:%.*]] = call fast float @llvm.amdgcn.fmad.ftz.f32(float [[TMP10]], float [[TMP6]], float [[TMP5]]) ; CHECK-NEXT: [[TMP12:%.*]] = fptoui float [[TMP9]] to i32 ; CHECK-NEXT: [[TMP13:%.*]] = call fast float @llvm.fabs.f32(float [[TMP11]]) @@ -2129,7 +2129,7 @@ define amdgpu_kernel void @udiv_v3i15(<3 ; CHECK-NEXT: [[TMP27:%.*]] = fdiv fast 
float 1.000000e+00, [[TMP26]] ; CHECK-NEXT: [[TMP28:%.*]] = fmul fast float [[TMP25]], [[TMP27]] ; CHECK-NEXT: [[TMP29:%.*]] = call fast float @llvm.trunc.f32(float [[TMP28]]) -; CHECK-NEXT: [[TMP30:%.*]] = fsub fast float -0.000000e+00, [[TMP29]] +; CHECK-NEXT: [[TMP30:%.*]] = fneg fast float [[TMP29]] ; CHECK-NEXT: [[TMP31:%.*]] = call fast float @llvm.amdgcn.fmad.ftz.f32(float [[TMP30]], float [[TMP26]], float [[TMP25]]) ; CHECK-NEXT: [[TMP32:%.*]] = fptoui float [[TMP29]] to i32 ; CHECK-NEXT: [[TMP33:%.*]] = call fast float @llvm.fabs.f32(float [[TMP31]]) @@ -2149,7 +2149,7 @@ define amdgpu_kernel void @udiv_v3i15(<3 ; CHECK-NEXT: [[TMP47:%.*]] = fdiv fast float 1.000000e+00, [[TMP46]] ; CHECK-NEXT: [[TMP48:%.*]] = fmul fast float [[TMP45]], [[TMP47]] ; CHECK-NEXT: [[TMP49:%.*]] = call fast float @llvm.trunc.f32(float [[TMP48]]) -; CHECK-NEXT: [[TMP50:%.*]] = fsub fast float -0.000000e+00, [[TMP49]] +; CHECK-NEXT: [[TMP50:%.*]] = fneg fast float [[TMP49]] ; CHECK-NEXT: [[TMP51:%.*]] = call fast float @llvm.amdgcn.fmad.ftz.f32(float [[TMP50]], float [[TMP46]], float [[TMP45]]) ; CHECK-NEXT: [[TMP52:%.*]] = fptoui float [[TMP49]] to i32 ; CHECK-NEXT: [[TMP53:%.*]] = call fast float @llvm.fabs.f32(float [[TMP51]]) @@ -2179,7 +2179,7 @@ define amdgpu_kernel void @urem_v3i15(<3 ; CHECK-NEXT: [[TMP7:%.*]] = fdiv fast float 1.000000e+00, [[TMP6]] ; CHECK-NEXT: [[TMP8:%.*]] = fmul fast float [[TMP5]], [[TMP7]] ; CHECK-NEXT: [[TMP9:%.*]] = call fast float @llvm.trunc.f32(float [[TMP8]]) -; CHECK-NEXT: [[TMP10:%.*]] = fsub fast float -0.000000e+00, [[TMP9]] +; CHECK-NEXT: [[TMP10:%.*]] = fneg fast float [[TMP9]] ; CHECK-NEXT: [[TMP11:%.*]] = call fast float @llvm.amdgcn.fmad.ftz.f32(float [[TMP10]], float [[TMP6]], float [[TMP5]]) ; CHECK-NEXT: [[TMP12:%.*]] = fptoui float [[TMP9]] to i32 ; CHECK-NEXT: [[TMP13:%.*]] = call fast float @llvm.fabs.f32(float [[TMP11]]) @@ -2201,7 +2201,7 @@ define amdgpu_kernel void @urem_v3i15(<3 ; CHECK-NEXT: [[TMP29:%.*]] = fdiv fast 
float 1.000000e+00, [[TMP28]] ; CHECK-NEXT: [[TMP30:%.*]] = fmul fast float [[TMP27]], [[TMP29]] ; CHECK-NEXT: [[TMP31:%.*]] = call fast float @llvm.trunc.f32(float [[TMP30]]) -; CHECK-NEXT: [[TMP32:%.*]] = fsub fast float -0.000000e+00, [[TMP31]] +; CHECK-NEXT: [[TMP32:%.*]] = fneg fast float [[TMP31]] ; CHECK-NEXT: [[TMP33:%.*]] = call fast float @llvm.amdgcn.fmad.ftz.f32(float [[TMP32]], float [[TMP28]], float [[TMP27]]) ; CHECK-NEXT: [[TMP34:%.*]] = fptoui float [[TMP31]] to i32 ; CHECK-NEXT: [[TMP35:%.*]] = call fast float @llvm.fabs.f32(float [[TMP33]]) @@ -2223,7 +2223,7 @@ define amdgpu_kernel void @urem_v3i15(<3 ; CHECK-NEXT: [[TMP51:%.*]] = fdiv fast float 1.000000e+00, [[TMP50]] ; CHECK-NEXT: [[TMP52:%.*]] = fmul fast float [[TMP49]], [[TMP51]] ; CHECK-NEXT: [[TMP53:%.*]] = call fast float @llvm.trunc.f32(float [[TMP52]]) -; CHECK-NEXT: [[TMP54:%.*]] = fsub fast float -0.000000e+00, [[TMP53]] +; CHECK-NEXT: [[TMP54:%.*]] = fneg fast float [[TMP53]] ; CHECK-NEXT: [[TMP55:%.*]] = call fast float @llvm.amdgcn.fmad.ftz.f32(float [[TMP54]], float [[TMP50]], float [[TMP49]]) ; CHECK-NEXT: [[TMP56:%.*]] = fptoui float [[TMP53]] to i32 ; CHECK-NEXT: [[TMP57:%.*]] = call fast float @llvm.fabs.f32(float [[TMP55]]) @@ -2258,7 +2258,7 @@ define amdgpu_kernel void @sdiv_v3i15(<3 ; CHECK-NEXT: [[TMP10:%.*]] = fdiv fast float 1.000000e+00, [[TMP9]] ; CHECK-NEXT: [[TMP11:%.*]] = fmul fast float [[TMP8]], [[TMP10]] ; CHECK-NEXT: [[TMP12:%.*]] = call fast float @llvm.trunc.f32(float [[TMP11]]) -; CHECK-NEXT: [[TMP13:%.*]] = fsub fast float -0.000000e+00, [[TMP12]] +; CHECK-NEXT: [[TMP13:%.*]] = fneg fast float [[TMP12]] ; CHECK-NEXT: [[TMP14:%.*]] = call fast float @llvm.amdgcn.fmad.ftz.f32(float [[TMP13]], float [[TMP9]], float [[TMP8]]) ; CHECK-NEXT: [[TMP15:%.*]] = fptosi float [[TMP12]] to i32 ; CHECK-NEXT: [[TMP16:%.*]] = call fast float @llvm.fabs.f32(float [[TMP14]]) @@ -2282,7 +2282,7 @@ define amdgpu_kernel void @sdiv_v3i15(<3 ; CHECK-NEXT: [[TMP34:%.*]] = fdiv 
fast float 1.000000e+00, [[TMP33]] ; CHECK-NEXT: [[TMP35:%.*]] = fmul fast float [[TMP32]], [[TMP34]] ; CHECK-NEXT: [[TMP36:%.*]] = call fast float @llvm.trunc.f32(float [[TMP35]]) -; CHECK-NEXT: [[TMP37:%.*]] = fsub fast float -0.000000e+00, [[TMP36]] +; CHECK-NEXT: [[TMP37:%.*]] = fneg fast float [[TMP36]] ; CHECK-NEXT: [[TMP38:%.*]] = call fast float @llvm.amdgcn.fmad.ftz.f32(float [[TMP37]], float [[TMP33]], float [[TMP32]]) ; CHECK-NEXT: [[TMP39:%.*]] = fptosi float [[TMP36]] to i32 ; CHECK-NEXT: [[TMP40:%.*]] = call fast float @llvm.fabs.f32(float [[TMP38]]) @@ -2306,7 +2306,7 @@ define amdgpu_kernel void @sdiv_v3i15(<3 ; CHECK-NEXT: [[TMP58:%.*]] = fdiv fast float 1.000000e+00, [[TMP57]] ; CHECK-NEXT: [[TMP59:%.*]] = fmul fast float [[TMP56]], [[TMP58]] ; CHECK-NEXT: [[TMP60:%.*]] = call fast float @llvm.trunc.f32(float [[TMP59]]) -; CHECK-NEXT: [[TMP61:%.*]] = fsub fast float -0.000000e+00, [[TMP60]] +; CHECK-NEXT: [[TMP61:%.*]] = fneg fast float [[TMP60]] ; CHECK-NEXT: [[TMP62:%.*]] = call fast float @llvm.amdgcn.fmad.ftz.f32(float [[TMP61]], float [[TMP57]], float [[TMP56]]) ; CHECK-NEXT: [[TMP63:%.*]] = fptosi float [[TMP60]] to i32 ; CHECK-NEXT: [[TMP64:%.*]] = call fast float @llvm.fabs.f32(float [[TMP62]]) @@ -2340,7 +2340,7 @@ define amdgpu_kernel void @srem_v3i15(<3 ; CHECK-NEXT: [[TMP10:%.*]] = fdiv fast float 1.000000e+00, [[TMP9]] ; CHECK-NEXT: [[TMP11:%.*]] = fmul fast float [[TMP8]], [[TMP10]] ; CHECK-NEXT: [[TMP12:%.*]] = call fast float @llvm.trunc.f32(float [[TMP11]]) -; CHECK-NEXT: [[TMP13:%.*]] = fsub fast float -0.000000e+00, [[TMP12]] +; CHECK-NEXT: [[TMP13:%.*]] = fneg fast float [[TMP12]] ; CHECK-NEXT: [[TMP14:%.*]] = call fast float @llvm.amdgcn.fmad.ftz.f32(float [[TMP13]], float [[TMP9]], float [[TMP8]]) ; CHECK-NEXT: [[TMP15:%.*]] = fptosi float [[TMP12]] to i32 ; CHECK-NEXT: [[TMP16:%.*]] = call fast float @llvm.fabs.f32(float [[TMP14]]) @@ -2366,7 +2366,7 @@ define amdgpu_kernel void @srem_v3i15(<3 ; CHECK-NEXT: [[TMP36:%.*]] = 
fdiv fast float 1.000000e+00, [[TMP35]] ; CHECK-NEXT: [[TMP37:%.*]] = fmul fast float [[TMP34]], [[TMP36]] ; CHECK-NEXT: [[TMP38:%.*]] = call fast float @llvm.trunc.f32(float [[TMP37]]) -; CHECK-NEXT: [[TMP39:%.*]] = fsub fast float -0.000000e+00, [[TMP38]] +; CHECK-NEXT: [[TMP39:%.*]] = fneg fast float [[TMP38]] ; CHECK-NEXT: [[TMP40:%.*]] = call fast float @llvm.amdgcn.fmad.ftz.f32(float [[TMP39]], float [[TMP35]], float [[TMP34]]) ; CHECK-NEXT: [[TMP41:%.*]] = fptosi float [[TMP38]] to i32 ; CHECK-NEXT: [[TMP42:%.*]] = call fast float @llvm.fabs.f32(float [[TMP40]]) @@ -2392,7 +2392,7 @@ define amdgpu_kernel void @srem_v3i15(<3 ; CHECK-NEXT: [[TMP62:%.*]] = fdiv fast float 1.000000e+00, [[TMP61]] ; CHECK-NEXT: [[TMP63:%.*]] = fmul fast float [[TMP60]], [[TMP62]] ; CHECK-NEXT: [[TMP64:%.*]] = call fast float @llvm.trunc.f32(float [[TMP63]]) -; CHECK-NEXT: [[TMP65:%.*]] = fsub fast float -0.000000e+00, [[TMP64]] +; CHECK-NEXT: [[TMP65:%.*]] = fneg fast float [[TMP64]] ; CHECK-NEXT: [[TMP66:%.*]] = call fast float @llvm.amdgcn.fmad.ftz.f32(float [[TMP65]], float [[TMP61]], float [[TMP60]]) ; CHECK-NEXT: [[TMP67:%.*]] = fptosi float [[TMP64]] to i32 ; CHECK-NEXT: [[TMP68:%.*]] = call fast float @llvm.fabs.f32(float [[TMP66]]) Modified: llvm/trunk/test/CodeGen/AMDGPU/divrem24-assume.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/divrem24-assume.ll?rev=374240&r1=374239&r2=374240&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/AMDGPU/divrem24-assume.ll (original) +++ llvm/trunk/test/CodeGen/AMDGPU/divrem24-assume.ll Wed Oct 9 14:52:15 2019 @@ -12,7 +12,7 @@ define amdgpu_kernel void @divrem24_assu ; CHECK-NEXT: [[TMP2:%.*]] = fdiv fast float 1.000000e+00, [[TMP1]] ; CHECK-NEXT: [[TMP3:%.*]] = fmul fast float [[TMP0]], [[TMP2]] ; CHECK-NEXT: [[TMP4:%.*]] = call fast float @llvm.trunc.f32(float [[TMP3]]) -; CHECK-NEXT: [[TMP5:%.*]] = fsub fast float -0.000000e+00, [[TMP4]] +; 
CHECK-NEXT: [[TMP5:%.*]] = fneg fast float [[TMP4]] ; CHECK-NEXT: [[TMP6:%.*]] = call fast float @llvm.amdgcn.fmad.ftz.f32(float [[TMP5]], float [[TMP1]], float [[TMP0]]) ; CHECK-NEXT: [[TMP7:%.*]] = fptoui float [[TMP4]] to i32 ; CHECK-NEXT: [[TMP8:%.*]] = call fast float @llvm.fabs.f32(float [[TMP6]]) Modified: llvm/trunk/test/Transforms/InstCombine/cos-1.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/InstCombine/cos-1.ll?rev=374240&r1=374239&r2=374240&view=diff ============================================================================== --- llvm/trunk/test/Transforms/InstCombine/cos-1.ll (original) +++ llvm/trunk/test/Transforms/InstCombine/cos-1.ll Wed Oct 9 14:52:15 2019 @@ -84,7 +84,7 @@ define float @cosf_unary_negated_arg_FMF define double @sin_negated_arg(double %x) { ; ANY-LABEL: @sin_negated_arg( ; ANY-NEXT: [[TMP1:%.*]] = call double @sin(double [[X:%.*]]) -; ANY-NEXT: [[TMP2:%.*]] = fsub double -0.000000e+00, [[TMP1]] +; ANY-NEXT: [[TMP2:%.*]] = fneg double [[TMP1]] ; ANY-NEXT: ret double [[TMP2]] ; %neg = fsub double -0.0, %x @@ -95,7 +95,7 @@ define double @sin_negated_arg(double %x define double @sin_unary_negated_arg(double %x) { ; ANY-LABEL: @sin_unary_negated_arg( ; ANY-NEXT: [[TMP1:%.*]] = call double @sin(double [[X:%.*]]) -; ANY-NEXT: [[TMP2:%.*]] = fsub double -0.000000e+00, [[TMP1]] +; ANY-NEXT: [[TMP2:%.*]] = fneg double [[TMP1]] ; ANY-NEXT: ret double [[TMP2]] ; %neg = fneg double %x @@ -106,7 +106,7 @@ define double @sin_unary_negated_arg(dou define float @sinf_negated_arg(float %x) { ; ANY-LABEL: @sinf_negated_arg( ; ANY-NEXT: [[TMP1:%.*]] = call float @sinf(float [[X:%.*]]) -; ANY-NEXT: [[TMP2:%.*]] = fsub float -0.000000e+00, [[TMP1]] +; ANY-NEXT: [[TMP2:%.*]] = fneg float [[TMP1]] ; ANY-NEXT: ret float [[TMP2]] ; %neg = fsub float -0.0, %x @@ -117,7 +117,7 @@ define float @sinf_negated_arg(float %x) define float @sinf_unary_negated_arg(float %x) { ; ANY-LABEL: @sinf_unary_negated_arg( ; ANY-NEXT: [[TMP1:%.*]] 
= call float @sinf(float [[X:%.*]]) -; ANY-NEXT: [[TMP2:%.*]] = fsub float -0.000000e+00, [[TMP1]] +; ANY-NEXT: [[TMP2:%.*]] = fneg float [[TMP1]] ; ANY-NEXT: ret float [[TMP2]] ; %neg = fneg float %x @@ -128,7 +128,7 @@ define float @sinf_unary_negated_arg(flo define float @sinf_negated_arg_FMF(float %x) { ; ANY-LABEL: @sinf_negated_arg_FMF( ; ANY-NEXT: [[TMP1:%.*]] = call nnan afn float @sinf(float [[X:%.*]]) -; ANY-NEXT: [[TMP2:%.*]] = fsub nnan afn float -0.000000e+00, [[TMP1]] +; ANY-NEXT: [[TMP2:%.*]] = fneg nnan afn float [[TMP1]] ; ANY-NEXT: ret float [[TMP2]] ; %neg = fsub ninf float -0.0, %x @@ -139,7 +139,7 @@ define float @sinf_negated_arg_FMF(float define float @sinf_unary_negated_arg_FMF(float %x) { ; ANY-LABEL: @sinf_unary_negated_arg_FMF( ; ANY-NEXT: [[TMP1:%.*]] = call nnan afn float @sinf(float [[X:%.*]]) -; ANY-NEXT: [[TMP2:%.*]] = fsub nnan afn float -0.000000e+00, [[TMP1]] +; ANY-NEXT: [[TMP2:%.*]] = fneg nnan afn float [[TMP1]] ; ANY-NEXT: ret float [[TMP2]] ; %neg = fneg ninf float %x @@ -227,7 +227,7 @@ define double @unary_neg_sin_negated_arg define double @tan_negated_arg(double %x) { ; ANY-LABEL: @tan_negated_arg( ; ANY-NEXT: [[TMP1:%.*]] = call double @tan(double [[X:%.*]]) -; ANY-NEXT: [[TMP2:%.*]] = fsub double -0.000000e+00, [[TMP1]] +; ANY-NEXT: [[TMP2:%.*]] = fneg double [[TMP1]] ; ANY-NEXT: ret double [[TMP2]] ; %neg = fsub double -0.0, %x @@ -238,7 +238,7 @@ define double @tan_negated_arg(double %x define double @tan_unary_negated_arg(double %x) { ; ANY-LABEL: @tan_unary_negated_arg( ; ANY-NEXT: [[TMP1:%.*]] = call double @tan(double [[X:%.*]]) -; ANY-NEXT: [[TMP2:%.*]] = fsub double -0.000000e+00, [[TMP1]] +; ANY-NEXT: [[TMP2:%.*]] = fneg double [[TMP1]] ; ANY-NEXT: ret double [[TMP2]] ; %neg = fneg double %x @@ -251,7 +251,7 @@ define double @tan_unary_negated_arg(dou define fp128 @tanl_negated_arg(fp128 %x) { ; ANY-LABEL: @tanl_negated_arg( ; ANY-NEXT: [[TMP1:%.*]] = call fp128 @tanl(fp128 [[X:%.*]]) -; ANY-NEXT: [[TMP2:%.*]] = 
fsub fp128 0xL00000000000000008000000000000000, [[TMP1]] +; ANY-NEXT: [[TMP2:%.*]] = fneg fp128 [[TMP1]] ; ANY-NEXT: ret fp128 [[TMP2]] ; %neg = fsub fp128 0xL00000000000000008000000000000000, %x @@ -262,7 +262,7 @@ define fp128 @tanl_negated_arg(fp128 %x) define fp128 @tanl_unary_negated_arg(fp128 %x) { ; ANY-LABEL: @tanl_unary_negated_arg( ; ANY-NEXT: [[TMP1:%.*]] = call fp128 @tanl(fp128 [[X:%.*]]) -; ANY-NEXT: [[TMP2:%.*]] = fsub fp128 0xL00000000000000008000000000000000, [[TMP1]] +; ANY-NEXT: [[TMP2:%.*]] = fneg fp128 [[TMP1]] ; ANY-NEXT: ret fp128 [[TMP2]] ; %neg = fneg fp128 %x Modified: llvm/trunk/test/Transforms/InstCombine/fast-math.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/InstCombine/fast-math.ll?rev=374240&r1=374239&r2=374240&view=diff ============================================================================== --- llvm/trunk/test/Transforms/InstCombine/fast-math.ll (original) +++ llvm/trunk/test/Transforms/InstCombine/fast-math.ll Wed Oct 9 14:52:15 2019 @@ -504,7 +504,7 @@ define float @fsub_op0_fmul_const_wrong_ define float @fold16(float %x, float %y) { ; CHECK-LABEL: @fold16( ; CHECK-NEXT: [[CMP:%.*]] = fcmp ogt float [[X:%.*]], [[Y:%.*]] -; CHECK-NEXT: [[TMP1:%.*]] = fsub float -0.000000e+00, [[Y]] +; CHECK-NEXT: [[TMP1:%.*]] = fneg float [[Y]] ; CHECK-NEXT: [[R_P:%.*]] = select i1 [[CMP]], float [[Y]], float [[TMP1]] ; CHECK-NEXT: [[R:%.*]] = fadd float [[R_P]], [[X]] ; CHECK-NEXT: ret float [[R]] Modified: llvm/trunk/test/Transforms/InstCombine/fmul.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/InstCombine/fmul.ll?rev=374240&r1=374239&r2=374240&view=diff ============================================================================== --- llvm/trunk/test/Transforms/InstCombine/fmul.ll (original) +++ llvm/trunk/test/Transforms/InstCombine/fmul.ll Wed Oct 9 14:52:15 2019 @@ -994,7 +994,7 @@ define double @fmul_negated_constant_exp define float @negate_if_true(float %x, i1 %cond) { ; 
CHECK-LABEL: @negate_if_true( -; CHECK-NEXT: [[TMP1:%.*]] = fsub float -0.000000e+00, [[X:%.*]] +; CHECK-NEXT: [[TMP1:%.*]] = fneg float [[X:%.*]] ; CHECK-NEXT: [[TMP2:%.*]] = select i1 [[COND:%.*]], float [[TMP1]], float [[X]] ; CHECK-NEXT: ret float [[TMP2]] ; @@ -1005,7 +1005,7 @@ define float @negate_if_true(float %x, i define float @negate_if_false(float %x, i1 %cond) { ; CHECK-LABEL: @negate_if_false( -; CHECK-NEXT: [[TMP1:%.*]] = fsub arcp float -0.000000e+00, [[X:%.*]] +; CHECK-NEXT: [[TMP1:%.*]] = fneg arcp float [[X:%.*]] ; CHECK-NEXT: [[TMP2:%.*]] = select arcp i1 [[COND:%.*]], float [[X]], float [[TMP1]] ; CHECK-NEXT: ret float [[TMP2]] ; @@ -1017,7 +1017,7 @@ define float @negate_if_false(float %x, define <2 x double> @negate_if_true_commute(<2 x double> %px, i1 %cond) { ; CHECK-LABEL: @negate_if_true_commute( ; CHECK-NEXT: [[X:%.*]] = fdiv <2 x double> , [[PX:%.*]] -; CHECK-NEXT: [[TMP1:%.*]] = fsub ninf <2 x double> , [[X]] +; CHECK-NEXT: [[TMP1:%.*]] = fneg ninf <2 x double> [[X]] ; CHECK-NEXT: [[TMP2:%.*]] = select ninf i1 [[COND:%.*]], <2 x double> [[TMP1]], <2 x double> [[X]] ; CHECK-NEXT: ret <2 x double> [[TMP2]] ; @@ -1030,7 +1030,7 @@ define <2 x double> @negate_if_true_comm define <2 x double> @negate_if_false_commute(<2 x double> %px, <2 x i1> %cond) { ; CHECK-LABEL: @negate_if_false_commute( ; CHECK-NEXT: [[X:%.*]] = fdiv <2 x double> , [[PX:%.*]] -; CHECK-NEXT: [[TMP1:%.*]] = fsub <2 x double> , [[X]] +; CHECK-NEXT: [[TMP1:%.*]] = fneg <2 x double> [[X]] ; CHECK-NEXT: [[TMP2:%.*]] = select <2 x i1> [[COND:%.*]], <2 x double> [[X]], <2 x double> [[TMP1]] ; CHECK-NEXT: ret <2 x double> [[TMP2]] ; Modified: llvm/trunk/test/Transforms/InstCombine/select-crash.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/InstCombine/select-crash.ll?rev=374240&r1=374239&r2=374240&view=diff ============================================================================== --- llvm/trunk/test/Transforms/InstCombine/select-crash.ll (original) 
+++ llvm/trunk/test/Transforms/InstCombine/select-crash.ll Wed Oct 9 14:52:15 2019 @@ -4,7 +4,7 @@ define fastcc double @gimp_operation_color_balance_map(float %value, double %highlights) nounwind readnone inlinehint { entry: ; CHECK: gimp_operation_color_balance_map -; CHECK: fsub double -0.000000 +; CHECK: fneg double %conv = fpext float %value to double %div = fdiv double %conv, 1.600000e+01 %add = fadd double %div, 1.000000e+00 @@ -22,7 +22,7 @@ entry: ; PR10180: same crash, but with vectors define <4 x float> @foo(i1 %b, <4 x float> %x, <4 x float> %y, <4 x float> %z) { ; CHECK-LABEL: @foo( -; CHECK: fsub <4 x float> +; CHECK: fneg <4 x float> ; CHECK: select ; CHECK: fadd <4 x float> %a = fadd <4 x float> %x, %y Modified: llvm/trunk/unittests/IR/InstructionsTest.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/unittests/IR/InstructionsTest.cpp?rev=374240&r1=374239&r2=374240&view=diff ============================================================================== --- llvm/trunk/unittests/IR/InstructionsTest.cpp (original) +++ llvm/trunk/unittests/IR/InstructionsTest.cpp Wed Oct 9 14:52:15 2019 @@ -1115,5 +1115,20 @@ if.end: EXPECT_EQ(ArgBA->getBasicBlock(), &IfThen); } +TEST(InstructionsTest, UnaryOperator) { + LLVMContext Context; + IRBuilder<> Builder(Context); + Instruction *I = Builder.CreatePHI(Builder.getDoubleTy(), 0); + Value *F = Builder.CreateFNeg(I); + + EXPECT_TRUE(isa(F)); + EXPECT_TRUE(isa(F)); + EXPECT_TRUE(isa(F)); + EXPECT_TRUE(isa(F)); + EXPECT_FALSE(isa(F)); + + F->deleteValue(); +} + } // end anonymous namespace } // end namespace llvm From llvm-commits at lists.llvm.org Wed Oct 9 14:52:18 2019 From: llvm-commits at lists.llvm.org (Matt Arsenault via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 21:52:18 +0000 (UTC) Subject: [PATCH] D68735: AMDGPU: Don't fold copies to physregs Message-ID: arsenm created this revision. arsenm added reviewers: rampitec, kerbowa. 
Herald added subscribers: arphaman, t-tye, tpr, dstuttard, yaxunl, nhaehnle, wdng, jvesely, kzhuravl.

In a future patch, this will help clean up m0 handling.

The register coalescer handles copies from a register that materializes an immediate, but doesn't handle move immediates itself. The virtual register uses will often be allocated to the same register, so there ends up being no real copy.

https://reviews.llvm.org/D68735

Files:
  lib/Target/AMDGPU/SIFoldOperands.cpp
  test/CodeGen/AMDGPU/stack-pointer-offset-relative-frameindex.ll

Index: test/CodeGen/AMDGPU/stack-pointer-offset-relative-frameindex.ll
===================================================================
--- test/CodeGen/AMDGPU/stack-pointer-offset-relative-frameindex.ll
+++ test/CodeGen/AMDGPU/stack-pointer-offset-relative-frameindex.ll
@@ -14,8 +14,8 @@
 ; GCN-NEXT: s_mov_b64 s[0:1], s[36:37]
 ; GCN-NEXT: v_mov_b32_e32 v1, 0x2000
 ; GCN-NEXT: v_mov_b32_e32 v2, 0x4000
-; GCN-NEXT: s_mov_b64 s[2:3], s[38:39]
 ; GCN-NEXT: v_mov_b32_e32 v3, 0
+; GCN-NEXT: s_mov_b64 s[2:3], s[38:39]
 ; GCN-NEXT: v_mov_b32_e32 v4, 0x400000
 ; GCN-NEXT: s_add_u32 s32, s33, 0xc0000
 ; GCN-NEXT: v_add_nc_u32_e64 v32, 4, 0x4000
Index: lib/Target/AMDGPU/SIFoldOperands.cpp
===================================================================
--- lib/Target/AMDGPU/SIFoldOperands.cpp
+++ lib/Target/AMDGPU/SIFoldOperands.cpp
@@ -581,13 +581,17 @@

   if (FoldingImmLike && UseMI->isCopy()) {
     Register DestReg = UseMI->getOperand(0).getReg();
-    const TargetRegisterClass *DestRC = Register::isVirtualRegister(DestReg)
-                                            ? MRI->getRegClass(DestReg)
-                                            : TRI->getPhysRegClass(DestReg);
+
+    // Don't fold into a copy to a physical register. Doing so would interfere
+    // with the register coalescer's logic which would avoid redundant
+    // initalizations.
+    if (DestReg.isPhysical())
+      return;
+
+    const TargetRegisterClass *DestRC = MRI->getRegClass(DestReg);

     Register SrcReg = UseMI->getOperand(1).getReg();
-    if (Register::isVirtualRegister(DestReg) &&
-        Register::isVirtualRegister(SrcReg)) {
+    if (SrcReg.isVirtual()) { // XXX - This can be an assert?
       const TargetRegisterClass * SrcRC = MRI->getRegClass(SrcReg);
       if (TRI->isSGPRClass(SrcRC) && TRI->hasVectorRegisters(DestRC)) {
         MachineRegisterInfo::use_iterator NextUse;

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68735.224176.patch
Type: text/x-patch
Size: 1926 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Wed Oct 9 14:52:19 2019
From: llvm-commits at lists.llvm.org (Andy Kaylor via Phabricator via llvm-commits)
Date: Wed, 09 Oct 2019 21:52:19 +0000 (UTC)
Subject: [PATCH] D68686: [X86] Add strict fp support for instructions fadd/fsub/fmul/fdiv
In-Reply-To:
References:
Message-ID: <6ab108533401cbcf59dffc6dc5cb8bc3@localhost.localdomain>

andrew.w.kaylor added a comment.

This patch seems to be doing at least two different things. Can you separate the changes related to strict node handling into a separate patch?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68686/new/

https://reviews.llvm.org/D68686

From llvm-commits at lists.llvm.org Wed Oct 9 09:03:21 2019
From: llvm-commits at lists.llvm.org (Tim Gymnich via Phabricator via llvm-commits)
Date: Wed, 09 Oct 2019 16:03:21 +0000 (UTC)
Subject: [PATCH] D68360: PR41162 Implement LKK remainder and divisibility algorithms [urem]
In-Reply-To:
References:
Message-ID: <7b3d34f4492cf25aacf83105e9d354f0@localhost.localdomain>

TG908 marked an inline comment as done.
TG908 added inline comments.

================
Comment at: llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp:4922
+  // Check to see if we can do this.
+  if (!isTypeLegal(VT) || !isTypeLegal(FVT))
+    return SDValue();
----------------
This right here seems to fail on riscv64 +m with:

```
(lldb) p FVT
(llvm::EVT) $1 = {
  V = (SimpleTy = i64)
  LLVMTy = 0x0000000000000000
}
```
```
(lldb) p VT
(llvm::EVT) $2 = {
  V = (SimpleTy = i32)
  LLVMTy = 0x0000000000000000
}
```
Those types should be legal right? What am I missing?

```
(lldb) expr isTypeLegal(VT)
(bool) $5 = false
(lldb) expr isTypeLegal(FVT)
(bool) $6 = true
```

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68360/new/

https://reviews.llvm.org/D68360

From llvm-commits at lists.llvm.org Wed Oct 9 10:58:37 2019
From: llvm-commits at lists.llvm.org (Jordan Rupprecht via Phabricator via llvm-commits)
Date: Wed, 09 Oct 2019 17:58:37 +0000 (UTC)
Subject: [PATCH] D68636: [llvm-readobj] - Refine the LLVM-style output to be consistent.
In-Reply-To:
References:
Message-ID: <1648761f221e6cec457ef4c693d7e45f@localhost.localdomain>

rupprecht added inline comments.

================
Comment at: tools/llvm-readobj/ELFDumper.cpp:5598
                                                const Elf_Shdr *Sec) {
-  DictScope SS(W, "Version symbols");
+  ListScope SS(W, "VersionSymbols");
   if (!Sec)
----------------
MaskRay wrote:
> grimar wrote:
> > MaskRay wrote:
> > > jhenderson wrote:
> > > > grimar wrote:
> > > > > jhenderson wrote:
> > > > > > Ditto, though I'm wondering here why the VersionSymbols data includes stuff to do with its section header? If it didn't have that stuff, it would be a list.
> > > > > > though I'm wondering here why the VersionSymbols data includes stuff to do with its section header?
> > > > >
> > > > > I do not know. The same information is printed under "Sections [" tag anyways, so it is not useful probably:
> > > > >
> > > > > ```
> > > > > Section {
> > > > >   Index: 3
> > > > >   Name: .gnu.version (30)
> > > > >   Type: SHT_GNU_versym (0x6FFFFFFF)
> > > > >   Flags [ (0x0)
> > > > >   ]
> > > > >   Address: 0x0
> > > > >   Offset: 0xB4
> > > > >   Size: 2
> > > > >   Link: 0
> > > > >   Info: 0
> > > > >   AddressAlignment: 0
> > > > >   EntrySize: 2
> > > > > }
> > > > > ```
> > > > >
> > > > > Should we remove "Section Name"/"Address"/"Offset"/"Link" and make it to be a list?
> > > > I'd be inclined to do that personally, but it should be a separate change.
> > > The Linux Standard Base calls this "Symbol Version Table" but this is named "VersionSymbols" here... What do you think if we just use the regular section type name "SHT_GNU_versym"? It may improve discoverability as well.
> > > What do you think if we just use the regular section type name "SHT_GNU_versym"? It may improve discoverability as well.
> >
> > I.e. this is an opposite direction to what this patch does:
> >
> > ```
> > SHT_GNU_verdef { -> VersionDefinitions [
> > SHT_GNU_verneed { -> VersionRequirements [
> > ```
> >
> > It will be only sections for which we use type names. Should we?
> > I.e. this is an opposite direction to what this patch does:
>
> Yes. .gnu.version is currently not consistent with .gnu.version_r and .gnu.version_d, and I know this patch tries to make them consistent.
>
> I am not clear which direction we should go. I have a very weak preference for SHT_GNU_versym.
>
> The naming does not seem very consistent here. While LSB names .gnu.version_r "version requirements", binutils-gdb elf.h names it "version needs section".

Count me in the camp that (slightly) prefers seeing keys like "VersionDefinitions" rather than "SHT_GNU_verdef", though mostly on style grounds (not having a mix of CamelCase and SNAKE_CASE). However, I admit this is a bikeshed -- I think this patch (not D68704) will make things look nicer, but I have zero technical objections to it.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68636/new/

https://reviews.llvm.org/D68636

From llvm-commits at lists.llvm.org Wed Oct 9 12:42:10 2019
From: llvm-commits at lists.llvm.org (Roger Ferrer Ibanez via Phabricator via llvm-commits)
Date: Wed, 09 Oct 2019 19:42:10 +0000 (UTC)
Subject: [PATCH] D68360: PR41162 Implement LKK remainder and divisibility algorithms [urem]
In-Reply-To:
References:
Message-ID:

rogfer01 added inline comments.

================
Comment at: llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp:4922
+  // Check to see if we can do this.
+  if (!isTypeLegal(VT) || !isTypeLegal(FVT))
+    return SDValue();
----------------
TG908 wrote:
> This right here seems to fail on riscv64 +m with:
>
>
> ```
> (lldb) p FVT
>
> (llvm::EVT) $1 = {
>   V = (SimpleTy = i64)
>   LLVMTy = 0x0000000000000000
> }
> ```
> ```
> (lldb) p VT
>
> (llvm::EVT) $2 = {
>   V = (SimpleTy = i32)
>   LLVMTy = 0x0000000000000000
> }
> ```
> Those types should be legal right? What am I missing?
>
>
> ```
> (lldb) expr isTypeLegal(VT)
> (bool) $5 = false
> (lldb) expr isTypeLegal(FVT)
> (bool) $6 = true
> ```
In riscv64 only `i64` is legal because registers are 64-bit and instructions operate with all the bits of the GPRs. In other words, the current assortment of instructions that would be useable to implement 32-bit operations (there are just a few of them) is not broad enough to warrant making `i32` legal in RISC-V.
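
As a rough illustration of the legality check discussed above, here is a toy C++ sketch. It is not the LLVM `TargetLowering` API: `ToyTarget`, `canUseWideFold`, `makeRV64Like`, and `makeHypothetical32And64` are hypothetical names, and integer types are modeled only by their bit width. It assumes, as in the debugger session (`VT = i32`, `FVT = i64`), that the wide type is twice the width of the value type.

```cpp
#include <set>

// Toy stand-in for a target's legal-type table. This is an illustrative
// model only, NOT LLVM's TargetLowering interface.
struct ToyTarget {
  std::set<unsigned> LegalIntWidths; // integer widths the target treats as legal

  bool isTypeLegal(unsigned Bits) const {
    return LegalIntWidths.count(Bits) != 0;
  }
};

// Mirrors the shape of the quoted guard: both the value type (VT) and the
// double-width type (FVT) must be legal, otherwise the fold is skipped.
bool canUseWideFold(const ToyTarget &T, unsigned VTBits) {
  unsigned FVTBits = VTBits * 2; // assumption: FVT is twice as wide as VT
  return T.isTypeLegal(VTBits) && T.isTypeLegal(FVTBits);
}

// A riscv64-like model: only the 64-bit integer width is legal.
ToyTarget makeRV64Like() { return ToyTarget{{64}}; }

// A hypothetical target where both 32-bit and 64-bit integers are legal.
ToyTarget makeHypothetical32And64() { return ToyTarget{{32, 64}}; }
```

Under this model, `canUseWideFold(makeRV64Like(), 32)` is false because the 32-bit width is missing from the legal set, which corresponds to `isTypeLegal(VT)` returning false in the session above even though `isTypeLegal(FVT)` is true.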
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68360/new/ https://reviews.llvm.org/D68360 From llvm-commits at lists.llvm.org Wed Oct 9 13:22:22 2019 From: llvm-commits at lists.llvm.org (Chris Bieneman via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 20:22:22 +0000 (UTC) Subject: [PATCH] D63614: [System Model] [TTI] Update cache and prefetch TTI interfaces In-Reply-To: References: Message-ID: <8e407ef560e6b3af3e49223881b5d52a@localhost.localdomain> beanz added a comment. This patch causes lots of warning spews because it adds virtual methods to a class that doesn't have a virtual destructor. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D63614/new/ https://reviews.llvm.org/D63614 From llvm-commits at lists.llvm.org Wed Oct 9 14:12:38 2019 From: llvm-commits at lists.llvm.org (Stanislav Mekhanoshin via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 21:12:38 +0000 (UTC) Subject: [PATCH] D68729: [AMDGPU] Fixed dpp combine of VOP1 Message-ID: rampitec created this revision. rampitec added reviewers: vpykhtin, kzhuravl, arsenm. Herald added subscribers: MaskRay, kbarton, hiraditya, t-tye, tpr, dstuttard, yaxunl, nhaehnle, wdng, jvesely, nemanjai. Herald added a project: LLVM. If original instruction did not have source modifiers they were not added to the new DPP instruction as well, even if needed. https://reviews.llvm.org/D68729 Files: llvm/lib/Target/AMDGPU/GCNDPPCombine.cpp llvm/test/CodeGen/AMDGPU/dpp_combine.mir Index: llvm/test/CodeGen/AMDGPU/dpp_combine.mir =================================================================== --- llvm/test/CodeGen/AMDGPU/dpp_combine.mir +++ llvm/test/CodeGen/AMDGPU/dpp_combine.mir @@ -526,3 +526,14 @@ %3:vreg_64 = REG_SEQUENCE %2, %subreg.sub0 ; %3.sub1 is undef %4:vgpr_32 = V_MOV_B32_dpp %3.sub1, %1, 1, 15, 15, 1, implicit $exec %5:vgpr_32 = V_ADD_U32_e32 %4, %0.sub1, implicit $exec +... 
+ +# CHECK-LABEL: name: dpp_vop1 +# CHECK: %3:vgpr_32 = V_CEIL_F32_dpp %1:vgpr_32, 0, undef %2:vgpr_32, 1, 15, 15, 1, implicit $exec +name: dpp_vop1 +tracksRegLiveness: true +body: | + bb.0: + %2:vgpr_32 = V_MOV_B32_dpp undef %1:vgpr_32, undef %0:vgpr_32, 1, 15, 15, 1, implicit $exec + %3:vgpr_32 = V_CEIL_F32_e32 %2, implicit $exec +... Index: llvm/lib/Target/AMDGPU/GCNDPPCombine.cpp =================================================================== --- llvm/lib/Target/AMDGPU/GCNDPPCombine.cpp +++ llvm/lib/Target/AMDGPU/GCNDPPCombine.cpp @@ -195,6 +195,10 @@ assert(0LL == (Mod0->getImm() & ~(SISrcMods::ABS | SISrcMods::NEG))); DPPInst.addImm(Mod0->getImm()); ++NumOperands; + } else if (AMDGPU::getNamedOperandIdx(DPPOp, + AMDGPU::OpName::src0_modifiers) != -1) { + DPPInst.addImm(0); + ++NumOperands; } auto *Src0 = TII->getNamedOperand(MovMI, AMDGPU::OpName::src0); assert(Src0); @@ -214,6 +218,10 @@ assert(0LL == (Mod1->getImm() & ~(SISrcMods::ABS | SISrcMods::NEG))); DPPInst.addImm(Mod1->getImm()); ++NumOperands; + } else if (AMDGPU::getNamedOperandIdx(DPPOp, + AMDGPU::OpName::src1_modifiers) != -1) { + DPPInst.addImm(0); + ++NumOperands; } if (auto *Src1 = TII->getNamedOperand(OrigMI, AMDGPU::OpName::src1)) { if (!TII->isOperandLegal(*DPPInst.getInstr(), NumOperands, Src1)) { -------------- next part -------------- A non-text attachment was scrubbed... Name: D68729.224158.patch Type: text/x-patch Size: 1874 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 14:22:19 2019 From: llvm-commits at lists.llvm.org (Matt Arsenault via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 21:22:19 +0000 (UTC) Subject: [PATCH] D68729: [AMDGPU] Fixed dpp combine of VOP1 In-Reply-To: References: Message-ID: <0d6ece3737cafd4caa4c66ff695fdfdf@localhost.localdomain> arsenm added inline comments. Herald added a subscriber: wuzish. 
================
Comment at: llvm/lib/Target/AMDGPU/GCNDPPCombine.cpp:221-222
     ++NumOperands;
+  } else if (AMDGPU::getNamedOperandIdx(DPPOp,
+                 AMDGPU::OpName::src1_modifiers) != -1) {
+    DPPInst.addImm(0);
----------------
This case isn't tested

================
Comment at: llvm/test/CodeGen/AMDGPU/dpp_combine.mir:530
+...
+
+# CHECK-LABEL: name: dpp_vop1
----------------
Add a comment explaining what this tests

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68729/new/

https://reviews.llvm.org/D68729

From llvm-commits at lists.llvm.org Wed Oct 9 14:25:35 2019
From: llvm-commits at lists.llvm.org (Stanislav Mekhanoshin via Phabricator via llvm-commits)
Date: Wed, 09 Oct 2019 21:25:35 +0000 (UTC)
Subject: [PATCH] D68729: [AMDGPU] Fixed dpp combine of VOP1
In-Reply-To:
References:
Message-ID: <1ae3a30024874d6ed5c77cab5d117554@localhost.localdomain>

rampitec marked an inline comment as done.
rampitec added inline comments.

================
Comment at: llvm/lib/Target/AMDGPU/GCNDPPCombine.cpp:221-222
     ++NumOperands;
+  } else if (AMDGPU::getNamedOperandIdx(DPPOp,
+                 AMDGPU::OpName::src1_modifiers) != -1) {
+    DPPInst.addImm(0);
----------------
arsenm wrote:
> This case isn't tested

I do not think such instructions currently exists. VOP2 are tested by the original test.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68729/new/

https://reviews.llvm.org/D68729

From llvm-commits at lists.llvm.org Wed Oct 9 14:34:56 2019
From: llvm-commits at lists.llvm.org (David Greene via Phabricator via llvm-commits)
Date: Wed, 09 Oct 2019 21:34:56 +0000 (UTC)
Subject: [PATCH] D63614: [System Model] [TTI] Update cache and prefetch TTI interfaces
In-Reply-To:
References:
Message-ID: <4b18c7969478e4d80cecbfea78f6547d@localhost.localdomain>

greened added a comment.

In D63614#1702176, @beanz wrote:

> This patch causes lots of warning spews because it adds virtual methods to a class that doesn't have a virtual destructor.

Vitaly fixed it before I could get to it.
Thanks for the heads-up!

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D63614/new/

https://reviews.llvm.org/D63614

From llvm-commits at lists.llvm.org Wed Oct 9 14:34:56 2019
From: llvm-commits at lists.llvm.org (Matt Arsenault via Phabricator via llvm-commits)
Date: Wed, 09 Oct 2019 21:34:56 +0000 (UTC)
Subject: [PATCH] D68729: [AMDGPU] Fixed dpp combine of VOP1
In-Reply-To:
References:
Message-ID: <143f336897c23409fd8aa5a87eac844b@localhost.localdomain>

arsenm added inline comments.

================
Comment at: llvm/lib/Target/AMDGPU/GCNDPPCombine.cpp:221-222
     ++NumOperands;
+  } else if (AMDGPU::getNamedOperandIdx(DPPOp,
+                 AMDGPU::OpName::src1_modifiers) != -1) {
+    DPPInst.addImm(0);
----------------
rampitec wrote:
> arsenm wrote:
> > This case isn't tested
> I do not think such instructions currently exists. VOP2 are tested by the original test.

Won't this happen for any VOP2 form of an FP instruction? V_MIN_F32_e32 has no modifiers, but V_MIN_F32_dpp does

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68729/new/

https://reviews.llvm.org/D68729

From llvm-commits at lists.llvm.org Wed Oct 9 14:52:17 2019
From: llvm-commits at lists.llvm.org (Stanislav Mekhanoshin via Phabricator via llvm-commits)
Date: Wed, 09 Oct 2019 21:52:17 +0000 (UTC)
Subject: [PATCH] D68729: [AMDGPU] Fixed dpp combine of VOP1
In-Reply-To:
References:
Message-ID: <567d5aa33dc806218dccd2e6f2d25f2a@localhost.localdomain>

rampitec marked 4 inline comments as done.
rampitec added inline comments.

================
Comment at: llvm/lib/Target/AMDGPU/GCNDPPCombine.cpp:221-222
     ++NumOperands;
+  } else if (AMDGPU::getNamedOperandIdx(DPPOp,
+                 AMDGPU::OpName::src1_modifiers) != -1) {
+    DPPInst.addImm(0);
----------------
arsenm wrote:
> rampitec wrote:
> > arsenm wrote:
> > > This case isn't tested
> > I do not think such instructions currently exists. VOP2 are tested by the original test.
> Won't this happen for any VOP2 form of an FP instruction?
V_MIN_F32_e32 has no modifiers, but V_MIN_F32_dpp does Good point, thanks! CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68729/new/ https://reviews.llvm.org/D68729 From llvm-commits at lists.llvm.org Wed Oct 9 14:52:18 2019 From: llvm-commits at lists.llvm.org (Stanislav Mekhanoshin via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 21:52:18 +0000 (UTC) Subject: [PATCH] D68729: [AMDGPU] Fixed dpp combine of VOP1 In-Reply-To: References: Message-ID: <0237fcf1c55f16da72e4f10f21e87267@localhost.localdomain> rampitec updated this revision to Diff 224175. rampitec marked an inline comment as done. rampitec added a comment. Updated test. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68729/new/ https://reviews.llvm.org/D68729 Files: llvm/lib/Target/AMDGPU/GCNDPPCombine.cpp llvm/test/CodeGen/AMDGPU/dpp_combine.mir Index: llvm/test/CodeGen/AMDGPU/dpp_combine.mir =================================================================== --- llvm/test/CodeGen/AMDGPU/dpp_combine.mir +++ llvm/test/CodeGen/AMDGPU/dpp_combine.mir @@ -526,3 +526,26 @@ %3:vreg_64 = REG_SEQUENCE %2, %subreg.sub0 ; %3.sub1 is undef %4:vgpr_32 = V_MOV_B32_dpp %3.sub1, %1, 1, 15, 15, 1, implicit $exec %5:vgpr_32 = V_ADD_U32_e32 %4, %0.sub1, implicit $exec +... + +# Test instruction which does not have modifiers in VOP1 form but does in DPP form. +# CHECK-LABEL: name: dpp_vop1 +# CHECK: %3:vgpr_32 = V_CEIL_F32_dpp %1:vgpr_32, 0, undef %2:vgpr_32, 1, 15, 15, 1, implicit $exec +name: dpp_vop1 +tracksRegLiveness: true +body: | + bb.0: + %2:vgpr_32 = V_MOV_B32_dpp undef %1:vgpr_32, undef %0:vgpr_32, 1, 15, 15, 1, implicit $exec + %3:vgpr_32 = V_CEIL_F32_e32 %2, implicit $exec +... + +# Test instruction which does not have modifiers in VOP2 form but does in DPP form. 
+# CHECK-LABEL: name: dpp_min +# CHECK: %3:vgpr_32 = V_MIN_F32_dpp %1:vgpr_32, 0, undef %2:vgpr_32, 0, undef %4:vgpr_32, 1, 15, 15, 1, implicit $exec +name: dpp_min +tracksRegLiveness: true +body: | + bb.0: + %2:vgpr_32 = V_MOV_B32_dpp undef %1:vgpr_32, undef %0:vgpr_32, 1, 15, 15, 1, implicit $exec + %4:vgpr_32 = V_MIN_F32_e32 %2, undef %3:vgpr_32, implicit $exec +... Index: llvm/lib/Target/AMDGPU/GCNDPPCombine.cpp =================================================================== --- llvm/lib/Target/AMDGPU/GCNDPPCombine.cpp +++ llvm/lib/Target/AMDGPU/GCNDPPCombine.cpp @@ -195,6 +195,10 @@ assert(0LL == (Mod0->getImm() & ~(SISrcMods::ABS | SISrcMods::NEG))); DPPInst.addImm(Mod0->getImm()); ++NumOperands; + } else if (AMDGPU::getNamedOperandIdx(DPPOp, + AMDGPU::OpName::src0_modifiers) != -1) { + DPPInst.addImm(0); + ++NumOperands; } auto *Src0 = TII->getNamedOperand(MovMI, AMDGPU::OpName::src0); assert(Src0); @@ -214,6 +218,10 @@ assert(0LL == (Mod1->getImm() & ~(SISrcMods::ABS | SISrcMods::NEG))); DPPInst.addImm(Mod1->getImm()); ++NumOperands; + } else if (AMDGPU::getNamedOperandIdx(DPPOp, + AMDGPU::OpName::src1_modifiers) != -1) { + DPPInst.addImm(0); + ++NumOperands; } if (auto *Src1 = TII->getNamedOperand(OrigMI, AMDGPU::OpName::src1)) { if (!TII->isOperandLegal(*DPPInst.getInstr(), NumOperands, Src1)) { -------------- next part -------------- A non-text attachment was scrubbed... Name: D68729.224175.patch Type: text/x-patch Size: 2425 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 14:54:08 2019 From: llvm-commits at lists.llvm.org (Matt Arsenault via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 21:54:08 +0000 (UTC) Subject: [PATCH] D68600: AMDGPU/GlobalISel: Fix crash on wide constant load with VGPR pointer In-Reply-To: References: Message-ID: arsenm marked an inline comment as done. arsenm added inline comments. 
================
Comment at: lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp:326-327
+// FIXME: Returns uniform if there's no source value information. This is
+// probably wrong.
 static bool isInstrUniformNonExtLoadAlign4(const MachineInstr &MI) {
----------------
nhaehnle wrote:
> You mean because `isUniformMMO` returns true if the MMO doesn't have a pointer? There's a comment in that function which justifies that (though I'm not sure whether that comment is correct).

The comment there isn't entirely wrong, but it also isn't entirely correct. There can also be null without a PSV. I don't think this would ever happen in a real program; it is more of a MIR semantics question.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68600/new/

https://reviews.llvm.org/D68600

From llvm-commits at lists.llvm.org Wed Oct 9 14:55:27 2019
From: llvm-commits at lists.llvm.org (Cameron McInally via Phabricator via llvm-commits)
Date: Wed, 09 Oct 2019 21:55:27 +0000 (UTC)
Subject: [PATCH] D61675: [WIP] Update IRBuilder::CreateFNeg(...) to return a UnaryOperator
In-Reply-To:
References:
Message-ID: <7ca561b09075177ca27fd69b973d62f3@localhost.localdomain>

This revision was automatically updated to reflect the committed changes.
Closed by commit rG47363a148f1d: [IRBuilder] Update IRBuilder::CreateFNeg(...) to return a UnaryOperator (authored by cameron.mcinally).
Herald added a project: clang.
Herald added a subscriber: cfe-commits.
Changed prior to commit: https://reviews.llvm.org/D61675?vs=220389&id=224181#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D61675/new/ https://reviews.llvm.org/D61675 Files: clang/lib/CodeGen/CGExprScalar.cpp clang/test/CodeGen/aarch64-neon-2velem.c clang/test/CodeGen/aarch64-neon-fma.c clang/test/CodeGen/aarch64-neon-intrinsics.c clang/test/CodeGen/aarch64-neon-misc.c clang/test/CodeGen/aarch64-neon-scalar-x-indexed-elem.c clang/test/CodeGen/aarch64-v8.2a-fp16-intrinsics.c clang/test/CodeGen/aarch64-v8.2a-neon-intrinsics.c clang/test/CodeGen/arm-v8.2a-neon-intrinsics.c clang/test/CodeGen/arm_neon_intrinsics.c clang/test/CodeGen/avx512f-builtins.c clang/test/CodeGen/avx512vl-builtins.c clang/test/CodeGen/builtins-ppc-vsx.c clang/test/CodeGen/complex-math.c clang/test/CodeGen/exprs.c clang/test/CodeGen/fma-builtins.c clang/test/CodeGen/fma4-builtins.c clang/test/CodeGen/fp16-ops.c clang/test/CodeGen/zvector.c clang/test/CodeGen/zvector2.c llvm/include/llvm/IR/IRBuilder.h llvm/test/CodeGen/AMDGPU/amdgpu-codegenprepare-idiv.ll llvm/test/CodeGen/AMDGPU/divrem24-assume.ll llvm/test/Transforms/InstCombine/cos-1.ll llvm/test/Transforms/InstCombine/fast-math.ll llvm/test/Transforms/InstCombine/fmul.ll llvm/test/Transforms/InstCombine/select-crash.ll llvm/unittests/IR/InstructionsTest.cpp -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D61675.224181.patch Type: text/x-patch Size: 228046 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 15:02:58 2019 From: llvm-commits at lists.llvm.org (Stanislav Mekhanoshin via llvm-commits) Date: Wed, 09 Oct 2019 22:02:58 -0000 Subject: [llvm] r374241 - [AMDGPU] Fixed dpp combine of VOP1 Message-ID: <20191009220258.E188A84D22@lists.llvm.org> Author: rampitec Date: Wed Oct 9 15:02:58 2019 New Revision: 374241 URL: http://llvm.org/viewvc/llvm-project?rev=374241&view=rev Log: [AMDGPU] Fixed dpp combine of VOP1 If original instruction did not have source modifiers they were not added to the new DPP instruction as well, even if needed. Differential Revision: https://reviews.llvm.org/D68729 Modified: llvm/trunk/lib/Target/AMDGPU/GCNDPPCombine.cpp llvm/trunk/test/CodeGen/AMDGPU/dpp_combine.mir Modified: llvm/trunk/lib/Target/AMDGPU/GCNDPPCombine.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/GCNDPPCombine.cpp?rev=374241&r1=374240&r2=374241&view=diff ============================================================================== --- llvm/trunk/lib/Target/AMDGPU/GCNDPPCombine.cpp (original) +++ llvm/trunk/lib/Target/AMDGPU/GCNDPPCombine.cpp Wed Oct 9 15:02:58 2019 @@ -195,6 +195,10 @@ MachineInstr *GCNDPPCombine::createDPPIn assert(0LL == (Mod0->getImm() & ~(SISrcMods::ABS | SISrcMods::NEG))); DPPInst.addImm(Mod0->getImm()); ++NumOperands; + } else if (AMDGPU::getNamedOperandIdx(DPPOp, + AMDGPU::OpName::src0_modifiers) != -1) { + DPPInst.addImm(0); + ++NumOperands; } auto *Src0 = TII->getNamedOperand(MovMI, AMDGPU::OpName::src0); assert(Src0); @@ -214,6 +218,10 @@ MachineInstr *GCNDPPCombine::createDPPIn assert(0LL == (Mod1->getImm() & ~(SISrcMods::ABS | SISrcMods::NEG))); DPPInst.addImm(Mod1->getImm()); ++NumOperands; + } else if (AMDGPU::getNamedOperandIdx(DPPOp, + AMDGPU::OpName::src1_modifiers) != -1) { + DPPInst.addImm(0); + ++NumOperands; } if (auto *Src1 = TII->getNamedOperand(OrigMI, 
AMDGPU::OpName::src1)) { if (!TII->isOperandLegal(*DPPInst.getInstr(), NumOperands, Src1)) { Modified: llvm/trunk/test/CodeGen/AMDGPU/dpp_combine.mir URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/dpp_combine.mir?rev=374241&r1=374240&r2=374241&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/AMDGPU/dpp_combine.mir (original) +++ llvm/trunk/test/CodeGen/AMDGPU/dpp_combine.mir Wed Oct 9 15:02:58 2019 @@ -526,3 +526,26 @@ body: | %3:vreg_64 = REG_SEQUENCE %2, %subreg.sub0 ; %3.sub1 is undef %4:vgpr_32 = V_MOV_B32_dpp %3.sub1, %1, 1, 15, 15, 1, implicit $exec %5:vgpr_32 = V_ADD_U32_e32 %4, %0.sub1, implicit $exec +... + +# Test instruction which does not have modifiers in VOP1 form but does in DPP form. +# CHECK-LABEL: name: dpp_vop1 +# CHECK: %3:vgpr_32 = V_CEIL_F32_dpp %1:vgpr_32, 0, undef %2:vgpr_32, 1, 15, 15, 1, implicit $exec +name: dpp_vop1 +tracksRegLiveness: true +body: | + bb.0: + %2:vgpr_32 = V_MOV_B32_dpp undef %1:vgpr_32, undef %0:vgpr_32, 1, 15, 15, 1, implicit $exec + %3:vgpr_32 = V_CEIL_F32_e32 %2, implicit $exec +... + +# Test instruction which does not have modifiers in VOP2 form but does in DPP form. +# CHECK-LABEL: name: dpp_min +# CHECK: %3:vgpr_32 = V_MIN_F32_dpp %1:vgpr_32, 0, undef %2:vgpr_32, 0, undef %4:vgpr_32, 1, 15, 15, 1, implicit $exec +name: dpp_min +tracksRegLiveness: true +body: | + bb.0: + %2:vgpr_32 = V_MOV_B32_dpp undef %1:vgpr_32, undef %0:vgpr_32, 1, 15, 15, 1, implicit $exec + %4:vgpr_32 = V_MIN_F32_e32 %2, undef %3:vgpr_32, implicit $exec +... 
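
The fix in this commit can be sketched in isolation. The following is a toy model under stated assumptions, not the code from `GCNDPPCombine.cpp`: `OperandList`, `addSrcModifiers`, `kNeg`, and `kAbs` are hypothetical names, a plain pointer stands in for the result of `getNamedOperand`, and a boolean stands in for the `getNamedOperandIdx(...) != -1` check on the DPP opcode.

```cpp
#include <cassert>
#include <vector>

// Toy modifier bits; the values are placeholders chosen for this sketch.
constexpr long long kNeg = 1;
constexpr long long kAbs = 2;

// Hypothetical stand-in for the immediates appended to the new DPP instruction.
using OperandList = std::vector<long long>;

// OrigMod: pointer to the original instruction's source-modifier immediate,
//          or nullptr if the original form has no such operand.
// DPPHasModifierSlot: whether the DPP form declares a source-modifier operand.
void addSrcModifiers(OperandList &Ops, const long long *OrigMod,
                     bool DPPHasModifierSlot) {
  if (OrigMod) {
    // Copy the existing modifiers, allowing only the ABS and NEG bits.
    assert((*OrigMod & ~(kAbs | kNeg)) == 0 && "unexpected modifier bits");
    Ops.push_back(*OrigMod);
  } else if (DPPHasModifierSlot) {
    // The original had no modifiers, but the DPP form still expects the
    // operand, so fill the slot with "no modifiers".
    Ops.push_back(0);
  }
}
```

Before the change, the second branch was missing, so an original instruction without a modifier operand produced a DPP instruction that was one operand short whenever the DPP form expected that slot. The `dpp_vop1` and `dpp_min` tests above check for the added `0` immediates.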
From llvm-commits at lists.llvm.org Wed Oct 9 15:03:24 2019 From: llvm-commits at lists.llvm.org (Evandro Menezes via llvm-commits) Date: Wed, 09 Oct 2019 22:03:24 -0000 Subject: [llvm] r374243 - [InstCombine] Fix PR43617 Message-ID: <20191009220324.1570A80AFB@lists.llvm.org> Author: evandro Date: Wed Oct 9 15:03:23 2019 New Revision: 374243 URL: http://llvm.org/viewvc/llvm-project?rev=374243&view=rev Log: [InstCombine] Fix PR43617 Check for `nullptr` before inspecting composite function. Modified: llvm/trunk/lib/Transforms/Utils/SimplifyLibCalls.cpp Modified: llvm/trunk/lib/Transforms/Utils/SimplifyLibCalls.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Utils/SimplifyLibCalls.cpp?rev=374243&r1=374242&r2=374243&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Utils/SimplifyLibCalls.cpp (original) +++ llvm/trunk/lib/Transforms/Utils/SimplifyLibCalls.cpp Wed Oct 9 15:03:23 2019 @@ -1916,10 +1916,10 @@ Value *LibCallSimplifier::optimizeLog(Ca B.setFastMathFlags(FastMathFlags::getFast()); Function *ArgFn = Arg->getCalledFunction(); - StringRef ArgNm = ArgFn->getName(); - Intrinsic::ID ArgID = ArgFn->getIntrinsicID(); + Intrinsic::ID ArgID = + ArgFn ? ArgFn->getIntrinsicID() : Intrinsic::not_intrinsic; LibFunc ArgLb = NotLibFunc; - TLI->getLibFunc(ArgNm, ArgLb); + TLI->getLibFunc(Arg, ArgLb); // log(pow(x,y)) -> y*log(x) if (ArgLb == PowLb || ArgID == Intrinsic::pow) { @@ -1934,9 +1934,10 @@ Value *LibCallSimplifier::optimizeLog(Ca substituteInParent(Arg, MulY); return MulY; } + // log(exp{,2,10}(y)) -> y*log({e,2,10}) // TODO: There is no exp10() intrinsic yet. 
- else if (ArgLb == ExpLb || ArgLb == Exp2Lb || ArgLb == Exp10Lb || + if (ArgLb == ExpLb || ArgLb == Exp2Lb || ArgLb == Exp10Lb || ArgID == Intrinsic::exp || ArgID == Intrinsic::exp2) { Constant *Eul; if (ArgLb == ExpLb || ArgID == Intrinsic::exp) From llvm-commits at lists.llvm.org Wed Oct 9 15:04:37 2019 From: llvm-commits at lists.llvm.org (Stanislav Mekhanoshin via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 22:04:37 +0000 (UTC) Subject: [PATCH] D68735: AMDGPU: Don't fold copies to physregs In-Reply-To: References: Message-ID: <605315070ad6de94fd816a8ff0f37570@localhost.localdomain> rampitec accepted this revision. rampitec added a comment. This revision is now accepted and ready to land. LGTM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68735/new/ https://reviews.llvm.org/D68735 From llvm-commits at lists.llvm.org Wed Oct 9 15:10:10 2019 From: llvm-commits at lists.llvm.org (Marcello Maggioni via llvm-commits) Date: Wed, 09 Oct 2019 22:10:10 -0000 Subject: [llvm] r374245 - [GISel] Refactor and split PatternMatchTest. NFC Message-ID: <20191009221010.4F42685BC6@lists.llvm.org> Author: mggm Date: Wed Oct 9 15:10:10 2019 New Revision: 374245 URL: http://llvm.org/viewvc/llvm-project?rev=374245&view=rev Log: [GISel] Refactor and split PatternMatchTest. NFC Split the ConstantFold part into a separate file and make it use the fixture GISelMITest. 
Added: llvm/trunk/unittests/CodeGen/GlobalISel/ConstantFoldingTest.cpp Modified: llvm/trunk/unittests/CodeGen/GlobalISel/CMakeLists.txt llvm/trunk/unittests/CodeGen/GlobalISel/PatternMatchTest.cpp Modified: llvm/trunk/unittests/CodeGen/GlobalISel/CMakeLists.txt URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/unittests/CodeGen/GlobalISel/CMakeLists.txt?rev=374245&r1=374244&r2=374245&view=diff ============================================================================== --- llvm/trunk/unittests/CodeGen/GlobalISel/CMakeLists.txt (original) +++ llvm/trunk/unittests/CodeGen/GlobalISel/CMakeLists.txt Wed Oct 9 15:10:10 2019 @@ -10,6 +10,7 @@ set(LLVM_LINK_COMPONENTS ) add_llvm_unittest(GlobalISelTests + ConstantFoldingTest.cpp CSETest.cpp LegalizerHelperTest.cpp LegalizerInfoTest.cpp Added: llvm/trunk/unittests/CodeGen/GlobalISel/ConstantFoldingTest.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/unittests/CodeGen/GlobalISel/ConstantFoldingTest.cpp?rev=374245&view=auto ============================================================================== --- llvm/trunk/unittests/CodeGen/GlobalISel/ConstantFoldingTest.cpp (added) +++ llvm/trunk/unittests/CodeGen/GlobalISel/ConstantFoldingTest.cpp Wed Oct 9 15:10:10 2019 @@ -0,0 +1,71 @@ +//===- ConstantFoldingTest.cpp -------------------------------------------===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. 
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// + +#include "GISelMITest.h" +#include "llvm/CodeGen/GlobalISel/ConstantFoldingMIRBuilder.h" +#include "llvm/CodeGen/GlobalISel/MachineIRBuilder.h" +#include "llvm/CodeGen/GlobalISel/Utils.h" +#include "llvm/CodeGen/MachineFunction.h" +#include "gtest/gtest.h" + +using namespace llvm; + +namespace { + +TEST_F(GISelMITest, FoldWithBuilder) { + setUp(); + if (!TM) + return; + // Try to use the FoldableInstructionsBuilder to build binary ops. + ConstantFoldingMIRBuilder CFB(B.getState()); + LLT s32 = LLT::scalar(32); + int64_t Cst; + auto MIBCAdd = + CFB.buildAdd(s32, CFB.buildConstant(s32, 0), CFB.buildConstant(s32, 1)); + // This should be a constant now. + bool match = mi_match(MIBCAdd->getOperand(0).getReg(), *MRI, m_ICst(Cst)); + EXPECT_TRUE(match); + EXPECT_EQ(Cst, 1); + auto MIBCAdd1 = + CFB.buildInstr(TargetOpcode::G_ADD, {s32}, + {CFB.buildConstant(s32, 0), CFB.buildConstant(s32, 1)}); + // This should be a constant now. + match = mi_match(MIBCAdd1->getOperand(0).getReg(), *MRI, m_ICst(Cst)); + EXPECT_TRUE(match); + EXPECT_EQ(Cst, 1); + + // Try one of the other constructors of MachineIRBuilder to make sure it's + // compatible. + ConstantFoldingMIRBuilder CFB1(*MF); + CFB1.setInsertPt(*EntryMBB, EntryMBB->end()); + auto MIBCSub = + CFB1.buildInstr(TargetOpcode::G_SUB, {s32}, + {CFB1.buildConstant(s32, 1), CFB1.buildConstant(s32, 1)}); + // This should be a constant now. + match = mi_match(MIBCSub->getOperand(0).getReg(), *MRI, m_ICst(Cst)); + EXPECT_TRUE(match); + EXPECT_EQ(Cst, 0); + + auto MIBCSext1 = + CFB1.buildInstr(TargetOpcode::G_SEXT_INREG, {s32}, + {CFB1.buildConstant(s32, 0x01), uint64_t(8)}); + // This should be a constant now. 
+ match = mi_match(MIBCSext1->getOperand(0).getReg(), *MRI, m_ICst(Cst)); + EXPECT_TRUE(match); + EXPECT_EQ(1, Cst); + + auto MIBCSext2 = + CFB1.buildInstr(TargetOpcode::G_SEXT_INREG, {s32}, + {CFB1.buildConstant(s32, 0x80), uint64_t(8)}); + // This should be a constant now. + match = mi_match(MIBCSext2->getOperand(0).getReg(), *MRI, m_ICst(Cst)); + EXPECT_TRUE(match); + EXPECT_EQ(-0x80, Cst); +} + +} // namespace \ No newline at end of file Modified: llvm/trunk/unittests/CodeGen/GlobalISel/PatternMatchTest.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/unittests/CodeGen/GlobalISel/PatternMatchTest.cpp?rev=374245&r1=374244&r2=374245&view=diff ============================================================================== --- llvm/trunk/unittests/CodeGen/GlobalISel/PatternMatchTest.cpp (original) +++ llvm/trunk/unittests/CodeGen/GlobalISel/PatternMatchTest.cpp Wed Oct 9 15:10:10 2019 @@ -6,6 +6,7 @@ // //===----------------------------------------------------------------------===// +#include "GISelMITest.h" #include "llvm/CodeGen/GlobalISel/ConstantFoldingMIRBuilder.h" #include "llvm/CodeGen/GlobalISel/MIPatternMatch.h" #include "llvm/CodeGen/GlobalISel/MachineIRBuilder.h" @@ -29,140 +30,29 @@ using namespace MIPatternMatch; namespace { -void initLLVM() { - InitializeAllTargets(); - InitializeAllTargetMCs(); - InitializeAllAsmPrinters(); - InitializeAllAsmParsers(); - - PassRegistry *Registry = PassRegistry::getPassRegistry(); - initializeCore(*Registry); - initializeCodeGen(*Registry); -} - -/// Create a TargetMachine. As we lack a dedicated always available target for -/// unittests, we go for "AArch64". 
-std::unique_ptr createTargetMachine() { - Triple TargetTriple("aarch64--"); - std::string Error; - const Target *T = TargetRegistry::lookupTarget("", TargetTriple, Error); - if (!T) - return nullptr; - - TargetOptions Options; - return std::unique_ptr(static_cast( - T->createTargetMachine("AArch64", "", "", Options, None, None, - CodeGenOpt::Aggressive))); -} - -std::unique_ptr parseMIR(LLVMContext &Context, - std::unique_ptr &MIR, - const TargetMachine &TM, StringRef MIRCode, - const char *FuncName, MachineModuleInfo &MMI) { - SMDiagnostic Diagnostic; - std::unique_ptr MBuffer = MemoryBuffer::getMemBuffer(MIRCode); - MIR = createMIRParser(std::move(MBuffer), Context); - if (!MIR) - return nullptr; - - std::unique_ptr M = MIR->parseIRModule(); - if (!M) - return nullptr; - - M->setDataLayout(TM.createDataLayout()); - - if (MIR->parseMachineFunctions(*M, MMI)) - return nullptr; - - return M; -} - -std::pair, std::unique_ptr> -createDummyModule(LLVMContext &Context, const LLVMTargetMachine &TM, - StringRef MIRFunc) { - SmallString<512> S; - StringRef MIRString = (Twine(R"MIR( ---- -... 
-name: func -registers: - - { id: 0, class: _ } - - { id: 1, class: _ } - - { id: 2, class: _ } - - { id: 3, class: _ } -body: | - bb.1: - %0(s64) = COPY $x0 - %1(s64) = COPY $x1 - %2(s64) = COPY $x2 -)MIR") + Twine(MIRFunc) + Twine("...\n")) - .toNullTerminatedStringRef(S); - std::unique_ptr MIR; - auto MMI = std::make_unique(&TM); - std::unique_ptr M = - parseMIR(Context, MIR, TM, MIRString, "func", *MMI); - return make_pair(std::move(M), std::move(MMI)); -} - -static MachineFunction *getMFFromMMI(const Module *M, - const MachineModuleInfo *MMI) { - Function *F = M->getFunction("func"); - auto *MF = MMI->getMachineFunction(*F); - return MF; -} - -static void collectCopies(SmallVectorImpl &Copies, - MachineFunction *MF) { - for (auto &MBB : *MF) - for (MachineInstr &MI : MBB) { - if (MI.getOpcode() == TargetOpcode::COPY) - Copies.push_back(MI.getOperand(0).getReg()); - } -} - -TEST(PatternMatchInstr, MatchIntConstant) { - LLVMContext Context; - std::unique_ptr TM = createTargetMachine(); +TEST_F(GISelMITest, MatchIntConstant) { + setUp(); if (!TM) return; - auto ModuleMMIPair = createDummyModule(Context, *TM, ""); - MachineFunction *MF = - getMFFromMMI(ModuleMMIPair.first.get(), ModuleMMIPair.second.get()); - SmallVector Copies; - collectCopies(Copies, MF); - MachineBasicBlock *EntryMBB = &*MF->begin(); - MachineIRBuilder B(*MF); - MachineRegisterInfo &MRI = MF->getRegInfo(); - B.setInsertPt(*EntryMBB, EntryMBB->end()); auto MIBCst = B.buildConstant(LLT::scalar(64), 42); int64_t Cst; - bool match = mi_match(MIBCst->getOperand(0).getReg(), MRI, m_ICst(Cst)); + bool match = mi_match(MIBCst->getOperand(0).getReg(), *MRI, m_ICst(Cst)); EXPECT_TRUE(match); EXPECT_EQ(Cst, 42); } -TEST(PatternMatchInstr, MatchBinaryOp) { - LLVMContext Context; - std::unique_ptr TM = createTargetMachine(); +TEST_F(GISelMITest, MatchBinaryOp) { + setUp(); if (!TM) return; - auto ModuleMMIPair = createDummyModule(Context, *TM, ""); - MachineFunction *MF = - 
getMFFromMMI(ModuleMMIPair.first.get(), ModuleMMIPair.second.get());
-  SmallVector Copies;
-  collectCopies(Copies, MF);
-  MachineBasicBlock *EntryMBB = &*MF->begin();
-  MachineIRBuilder B(*MF);
-  MachineRegisterInfo &MRI = MF->getRegInfo();
-  B.setInsertPt(*EntryMBB, EntryMBB->end());
   LLT s64 = LLT::scalar(64);
   auto MIBAdd = B.buildAdd(s64, Copies[0], Copies[1]);
   // Test case for no bind.
   bool match =
-      mi_match(MIBAdd->getOperand(0).getReg(), MRI, m_GAdd(m_Reg(), m_Reg()));
+      mi_match(MIBAdd->getOperand(0).getReg(), *MRI, m_GAdd(m_Reg(), m_Reg()));
   EXPECT_TRUE(match);
   Register Src0, Src1, Src2;
-  match = mi_match(MIBAdd->getOperand(0).getReg(), MRI,
+  match = mi_match(MIBAdd->getOperand(0).getReg(), *MRI,
                    m_GAdd(m_Reg(Src0), m_Reg(Src1)));
   EXPECT_TRUE(match);
   EXPECT_EQ(Src0, Copies[0]);
@@ -172,14 +62,14 @@ TEST(PatternMatchInstr, MatchBinaryOp) {
   auto MIBMul = B.buildMul(s64, MIBAdd, Copies[2]);
   // Try to match MUL.
-  match = mi_match(MIBMul->getOperand(0).getReg(), MRI,
+  match = mi_match(MIBMul->getOperand(0).getReg(), *MRI,
                    m_GMul(m_Reg(Src0), m_Reg(Src1)));
   EXPECT_TRUE(match);
   EXPECT_EQ(Src0, MIBAdd->getOperand(0).getReg());
   EXPECT_EQ(Src1, Copies[2]);
   // Try to match MUL(ADD)
-  match = mi_match(MIBMul->getOperand(0).getReg(), MRI,
+  match = mi_match(MIBMul->getOperand(0).getReg(), *MRI,
                    m_GMul(m_GAdd(m_Reg(Src0), m_Reg(Src1)), m_Reg(Src2)));
   EXPECT_TRUE(match);
   EXPECT_EQ(Src0, Copies[0]);
@@ -191,7 +81,7 @@ TEST(PatternMatchInstr, MatchBinaryOp) {
   // Try to match MUL(Cst, Reg) on src of MUL(Reg, Cst) to validate
   // commutativity.
   int64_t Cst;
-  match = mi_match(MIBMul2->getOperand(0).getReg(), MRI,
+  match = mi_match(MIBMul2->getOperand(0).getReg(), *MRI,
                    m_GMul(m_ICst(Cst), m_Reg(Src0)));
   EXPECT_TRUE(match);
   EXPECT_EQ(Cst, 42);
@@ -199,14 +89,14 @@ TEST(PatternMatchInstr, MatchBinaryOp) {
   // Make sure commutative doesn't work with something like SUB.
   auto MIBSub = B.buildSub(s64, Copies[0], B.buildConstant(s64, 42));
-  match = mi_match(MIBSub->getOperand(0).getReg(), MRI,
+  match = mi_match(MIBSub->getOperand(0).getReg(), *MRI,
                    m_GSub(m_ICst(Cst), m_Reg(Src0)));
   EXPECT_FALSE(match);
   auto MIBFMul = B.buildInstr(TargetOpcode::G_FMUL, {s64},
                               {Copies[0], B.buildConstant(s64, 42)});
   // Match and test commutativity for FMUL.
-  match = mi_match(MIBFMul->getOperand(0).getReg(), MRI,
+  match = mi_match(MIBFMul->getOperand(0).getReg(), *MRI,
                    m_GFMul(m_ICst(Cst), m_Reg(Src0)));
   EXPECT_TRUE(match);
   EXPECT_EQ(Cst, 42);
@@ -215,7 +105,7 @@ TEST(PatternMatchInstr, MatchBinaryOp) {
   // FSUB
   auto MIBFSub = B.buildInstr(TargetOpcode::G_FSUB, {s64},
                               {Copies[0], B.buildConstant(s64, 42)});
-  match = mi_match(MIBFSub->getOperand(0).getReg(), MRI,
+  match = mi_match(MIBFSub->getOperand(0).getReg(), *MRI,
                    m_GFSub(m_Reg(Src0), m_Reg()));
   EXPECT_TRUE(match);
   EXPECT_EQ(Src0, Copies[0]);
@@ -223,7 +113,7 @@ TEST(PatternMatchInstr, MatchBinaryOp) {
   // Build AND %0, %1
   auto MIBAnd = B.buildAnd(s64, Copies[0], Copies[1]);
   // Try to match AND.
-  match = mi_match(MIBAnd->getOperand(0).getReg(), MRI,
+  match = mi_match(MIBAnd->getOperand(0).getReg(), *MRI,
                    m_GAnd(m_Reg(Src0), m_Reg(Src1)));
   EXPECT_TRUE(match);
   EXPECT_EQ(Src0, Copies[0]);
@@ -232,72 +122,17 @@ TEST(PatternMatchInstr, MatchBinaryOp) {
   // Build OR %0, %1
   auto MIBOr = B.buildOr(s64, Copies[0], Copies[1]);
   // Try to match OR.
-  match = mi_match(MIBOr->getOperand(0).getReg(), MRI,
+  match = mi_match(MIBOr->getOperand(0).getReg(), *MRI,
                    m_GOr(m_Reg(Src0), m_Reg(Src1)));
   EXPECT_TRUE(match);
   EXPECT_EQ(Src0, Copies[0]);
   EXPECT_EQ(Src1, Copies[1]);
-
-  // Try to use the FoldableInstructionsBuilder to build binary ops.
-  ConstantFoldingMIRBuilder CFB(B.getState());
-  LLT s32 = LLT::scalar(32);
-  auto MIBCAdd =
-      CFB.buildAdd(s32, CFB.buildConstant(s32, 0), CFB.buildConstant(s32, 1));
-  // This should be a constant now.
-  match = mi_match(MIBCAdd->getOperand(0).getReg(), MRI, m_ICst(Cst));
-  EXPECT_TRUE(match);
-  EXPECT_EQ(Cst, 1);
-  auto MIBCAdd1 =
-      CFB.buildInstr(TargetOpcode::G_ADD, {s32},
-                     {CFB.buildConstant(s32, 0), CFB.buildConstant(s32, 1)});
-  // This should be a constant now.
-  match = mi_match(MIBCAdd1->getOperand(0).getReg(), MRI, m_ICst(Cst));
-  EXPECT_TRUE(match);
-  EXPECT_EQ(Cst, 1);
-
-  // Try one of the other constructors of MachineIRBuilder to make sure it's
-  // compatible.
-  ConstantFoldingMIRBuilder CFB1(*MF);
-  CFB1.setInsertPt(*EntryMBB, EntryMBB->end());
-  auto MIBCSub =
-      CFB1.buildInstr(TargetOpcode::G_SUB, {s32},
-                      {CFB1.buildConstant(s32, 1), CFB1.buildConstant(s32, 1)});
-  // This should be a constant now.
-  match = mi_match(MIBCSub->getOperand(0).getReg(), MRI, m_ICst(Cst));
-  EXPECT_TRUE(match);
-  EXPECT_EQ(Cst, 0);
-
-  auto MIBCSext1 =
-      CFB1.buildInstr(TargetOpcode::G_SEXT_INREG, {s32},
-                      {CFB1.buildConstant(s32, 0x01), uint64_t(8)});
-  // This should be a constant now.
-  match = mi_match(MIBCSext1->getOperand(0).getReg(), MRI, m_ICst(Cst));
-  EXPECT_TRUE(match);
-  EXPECT_EQ(1, Cst);
-
-  auto MIBCSext2 =
-      CFB1.buildInstr(TargetOpcode::G_SEXT_INREG, {s32},
-                      {CFB1.buildConstant(s32, 0x80), uint64_t(8)});
-  // This should be a constant now.
-  match = mi_match(MIBCSext2->getOperand(0).getReg(), MRI, m_ICst(Cst));
-  EXPECT_TRUE(match);
-  EXPECT_EQ(-0x80, Cst);
 }
-TEST(PatternMatchInstr, MatchFPUnaryOp) {
-  LLVMContext Context;
-  std::unique_ptr TM = createTargetMachine();
+TEST_F(GISelMITest, MatchFPUnaryOp) {
+  setUp();
   if (!TM)
     return;
-  auto ModuleMMIPair = createDummyModule(Context, *TM, "");
-  MachineFunction *MF =
-      getMFFromMMI(ModuleMMIPair.first.get(), ModuleMMIPair.second.get());
-  SmallVector Copies;
-  collectCopies(Copies, MF);
-  MachineBasicBlock *EntryMBB = &*MF->begin();
-  MachineIRBuilder B(*MF);
-  MachineRegisterInfo &MRI = MF->getRegInfo();
-  B.setInsertPt(*EntryMBB, EntryMBB->end());
   // Truncate s64 to s32.
   LLT s32 = LLT::scalar(32);
@@ -305,23 +140,24 @@ TEST(PatternMatchInstr, MatchFPUnaryOp)
   // Match G_FABS.
   auto MIBFabs = B.buildInstr(TargetOpcode::G_FABS, {s32}, {Copy0s32});
-  bool match = mi_match(MIBFabs->getOperand(0).getReg(), MRI, m_GFabs(m_Reg()));
+  bool match =
+      mi_match(MIBFabs->getOperand(0).getReg(), *MRI, m_GFabs(m_Reg()));
   EXPECT_TRUE(match);
   Register Src;
   auto MIBFNeg = B.buildInstr(TargetOpcode::G_FNEG, {s32}, {Copy0s32});
-  match = mi_match(MIBFNeg->getOperand(0).getReg(), MRI, m_GFNeg(m_Reg(Src)));
+  match = mi_match(MIBFNeg->getOperand(0).getReg(), *MRI, m_GFNeg(m_Reg(Src)));
   EXPECT_TRUE(match);
   EXPECT_EQ(Src, Copy0s32->getOperand(0).getReg());
-  match = mi_match(MIBFabs->getOperand(0).getReg(), MRI, m_GFabs(m_Reg(Src)));
+  match = mi_match(MIBFabs->getOperand(0).getReg(), *MRI, m_GFabs(m_Reg(Src)));
   EXPECT_TRUE(match);
   EXPECT_EQ(Src, Copy0s32->getOperand(0).getReg());
   // Build and match FConstant.
   auto MIBFCst = B.buildFConstant(s32, .5);
   const ConstantFP *TmpFP{};
-  match = mi_match(MIBFCst->getOperand(0).getReg(), MRI, m_GFCst(TmpFP));
+  match = mi_match(MIBFCst->getOperand(0).getReg(), *MRI, m_GFCst(TmpFP));
   EXPECT_TRUE(match);
   EXPECT_TRUE(TmpFP);
   APFloat APF((float).5);
@@ -332,7 +168,7 @@ TEST(PatternMatchInstr, MatchFPUnaryOp)
   LLT s64 = LLT::scalar(64);
   auto MIBFCst64 = B.buildFConstant(s64, .5);
   const ConstantFP *TmpFP64{};
-  match = mi_match(MIBFCst64->getOperand(0).getReg(), MRI, m_GFCst(TmpFP64));
+  match = mi_match(MIBFCst64->getOperand(0).getReg(), *MRI, m_GFCst(TmpFP64));
   EXPECT_TRUE(match);
   EXPECT_TRUE(TmpFP64);
   APFloat APF64(.5);
@@ -344,7 +180,7 @@ TEST(PatternMatchInstr, MatchFPUnaryOp)
   LLT s16 = LLT::scalar(16);
   auto MIBFCst16 = B.buildFConstant(s16, .5);
   const ConstantFP *TmpFP16{};
-  match = mi_match(MIBFCst16->getOperand(0).getReg(), MRI, m_GFCst(TmpFP16));
+  match = mi_match(MIBFCst16->getOperand(0).getReg(), *MRI, m_GFCst(TmpFP16));
   EXPECT_TRUE(match);
   EXPECT_TRUE(TmpFP16);
   bool Ignored;
@@ -355,20 +191,11 @@ TEST(PatternMatchInstr, MatchFPUnaryOp)
   EXPECT_NE(TmpFP16, TmpFP);
 }
-TEST(PatternMatchInstr, MatchExtendsTrunc) {
-  LLVMContext Context;
-  std::unique_ptr TM = createTargetMachine();
+TEST_F(GISelMITest, MatchExtendsTrunc) {
+  setUp();
   if (!TM)
     return;
-  auto ModuleMMIPair = createDummyModule(Context, *TM, "");
-  MachineFunction *MF =
-      getMFFromMMI(ModuleMMIPair.first.get(), ModuleMMIPair.second.get());
-  SmallVector Copies;
-  collectCopies(Copies, MF);
-  MachineBasicBlock *EntryMBB = &*MF->begin();
-  MachineIRBuilder B(*MF);
-  MachineRegisterInfo &MRI = MF->getRegInfo();
-  B.setInsertPt(*EntryMBB, EntryMBB->end());
+
   LLT s64 = LLT::scalar(64);
   LLT s32 = LLT::scalar(32);
@@ -378,72 +205,62 @@ TEST(PatternMatchInstr, MatchExtendsTrun
   auto MIBSExt = B.buildSExt(s64, MIBTrunc);
   Register Src0;
   bool match =
-      mi_match(MIBTrunc->getOperand(0).getReg(), MRI, m_GTrunc(m_Reg(Src0)));
+      mi_match(MIBTrunc->getOperand(0).getReg(), *MRI, m_GTrunc(m_Reg(Src0)));
   EXPECT_TRUE(match);
   EXPECT_EQ(Src0, Copies[0]);
   match =
-      mi_match(MIBAExt->getOperand(0).getReg(), MRI, m_GAnyExt(m_Reg(Src0)));
+      mi_match(MIBAExt->getOperand(0).getReg(), *MRI, m_GAnyExt(m_Reg(Src0)));
   EXPECT_TRUE(match);
   EXPECT_EQ(Src0, MIBTrunc->getOperand(0).getReg());
-  match = mi_match(MIBSExt->getOperand(0).getReg(), MRI, m_GSExt(m_Reg(Src0)));
+  match = mi_match(MIBSExt->getOperand(0).getReg(), *MRI, m_GSExt(m_Reg(Src0)));
   EXPECT_TRUE(match);
   EXPECT_EQ(Src0, MIBTrunc->getOperand(0).getReg());
-  match = mi_match(MIBZExt->getOperand(0).getReg(), MRI, m_GZExt(m_Reg(Src0)));
+  match = mi_match(MIBZExt->getOperand(0).getReg(), *MRI, m_GZExt(m_Reg(Src0)));
   EXPECT_TRUE(match);
   EXPECT_EQ(Src0, MIBTrunc->getOperand(0).getReg());
   // Match ext(trunc src)
-  match = mi_match(MIBAExt->getOperand(0).getReg(), MRI,
+  match = mi_match(MIBAExt->getOperand(0).getReg(), *MRI,
                    m_GAnyExt(m_GTrunc(m_Reg(Src0))));
   EXPECT_TRUE(match);
   EXPECT_EQ(Src0, Copies[0]);
-  match = mi_match(MIBSExt->getOperand(0).getReg(), MRI,
+  match = mi_match(MIBSExt->getOperand(0).getReg(), *MRI,
                    m_GSExt(m_GTrunc(m_Reg(Src0))));
   EXPECT_TRUE(match);
   EXPECT_EQ(Src0, Copies[0]);
-  match = mi_match(MIBZExt->getOperand(0).getReg(), MRI,
+  match = mi_match(MIBZExt->getOperand(0).getReg(), *MRI,
                    m_GZExt(m_GTrunc(m_Reg(Src0))));
   EXPECT_TRUE(match);
   EXPECT_EQ(Src0, Copies[0]);
 }
-TEST(PatternMatchInstr, MatchSpecificType) {
-  LLVMContext Context;
-  std::unique_ptr TM = createTargetMachine();
+TEST_F(GISelMITest, MatchSpecificType) {
+  setUp();
   if (!TM)
     return;
-  auto ModuleMMIPair = createDummyModule(Context, *TM, "");
-  MachineFunction *MF =
-      getMFFromMMI(ModuleMMIPair.first.get(), ModuleMMIPair.second.get());
-  SmallVector Copies;
-  collectCopies(Copies, MF);
-  MachineBasicBlock *EntryMBB = &*MF->begin();
-  MachineIRBuilder B(*MF);
-  MachineRegisterInfo &MRI = MF->getRegInfo();
-  B.setInsertPt(*EntryMBB, EntryMBB->end());
   // Try to match a 64bit add.
   LLT s64 = LLT::scalar(64);
   LLT s32 = LLT::scalar(32);
   auto MIBAdd = B.buildAdd(s64, Copies[0], Copies[1]);
-  EXPECT_FALSE(mi_match(MIBAdd->getOperand(0).getReg(), MRI,
+  EXPECT_FALSE(mi_match(MIBAdd->getOperand(0).getReg(), *MRI,
                         m_GAdd(m_SpecificType(s32), m_Reg())));
-  EXPECT_TRUE(mi_match(MIBAdd->getOperand(0).getReg(), MRI,
+  EXPECT_TRUE(mi_match(MIBAdd->getOperand(0).getReg(), *MRI,
                        m_GAdd(m_SpecificType(s64), m_Reg())));
   // Try to match the destination type of a bitcast.
   LLT v2s32 = LLT::vector(2, 32);
   auto MIBCast = B.buildCast(v2s32, Copies[0]);
   EXPECT_TRUE(
-      mi_match(MIBCast->getOperand(0).getReg(), MRI, m_GBitcast(m_Reg())));
+      mi_match(MIBCast->getOperand(0).getReg(), *MRI, m_GBitcast(m_Reg())));
   EXPECT_TRUE(
-      mi_match(MIBCast->getOperand(0).getReg(), MRI, m_SpecificType(v2s32)));
+      mi_match(MIBCast->getOperand(0).getReg(), *MRI, m_SpecificType(v2s32)));
   EXPECT_TRUE(
-      mi_match(MIBCast->getOperand(1).getReg(), MRI, m_SpecificType(s64)));
+      mi_match(MIBCast->getOperand(1).getReg(), *MRI, m_SpecificType(s64)));
   // Build a PTRToInt and INTTOPTR and match and test them.
   LLT PtrTy = LLT::pointer(0, 64);
@@ -452,43 +269,34 @@ TEST(PatternMatchInstr, MatchSpecificTyp
   Register Src0;
   // match the ptrtoint(inttoptr reg)
-  bool match = mi_match(MIBPtrToInt->getOperand(0).getReg(), MRI,
+  bool match = mi_match(MIBPtrToInt->getOperand(0).getReg(), *MRI,
                         m_GPtrToInt(m_GIntToPtr(m_Reg(Src0))));
   EXPECT_TRUE(match);
   EXPECT_EQ(Src0, Copies[0]);
 }
-TEST(PatternMatchInstr, MatchCombinators) {
-  LLVMContext Context;
-  std::unique_ptr TM = createTargetMachine();
+TEST_F(GISelMITest, MatchCombinators) {
+  setUp();
   if (!TM)
     return;
-  auto ModuleMMIPair = createDummyModule(Context, *TM, "");
-  MachineFunction *MF =
-      getMFFromMMI(ModuleMMIPair.first.get(), ModuleMMIPair.second.get());
-  SmallVector Copies;
-  collectCopies(Copies, MF);
-  MachineBasicBlock *EntryMBB = &*MF->begin();
-  MachineIRBuilder B(*MF);
-  MachineRegisterInfo &MRI = MF->getRegInfo();
-  B.setInsertPt(*EntryMBB, EntryMBB->end());
+
   LLT s64 = LLT::scalar(64);
   LLT s32 = LLT::scalar(32);
   auto MIBAdd = B.buildAdd(s64, Copies[0], Copies[1]);
   Register Src0, Src1;
   bool match =
-      mi_match(MIBAdd->getOperand(0).getReg(), MRI,
+      mi_match(MIBAdd->getOperand(0).getReg(), *MRI,
                m_all_of(m_SpecificType(s64), m_GAdd(m_Reg(Src0), m_Reg(Src1))));
   EXPECT_TRUE(match);
   EXPECT_EQ(Src0, Copies[0]);
   EXPECT_EQ(Src1, Copies[1]);
   // Check for s32 (which should fail).
   match =
-      mi_match(MIBAdd->getOperand(0).getReg(), MRI,
+      mi_match(MIBAdd->getOperand(0).getReg(), *MRI,
                m_all_of(m_SpecificType(s32), m_GAdd(m_Reg(Src0), m_Reg(Src1))));
   EXPECT_FALSE(match);
   match =
-      mi_match(MIBAdd->getOperand(0).getReg(), MRI,
+      mi_match(MIBAdd->getOperand(0).getReg(), *MRI,
                m_any_of(m_SpecificType(s32), m_GAdd(m_Reg(Src0), m_Reg(Src1))));
   EXPECT_TRUE(match);
   EXPECT_EQ(Src0, Copies[0]);
@@ -496,33 +304,24 @@ TEST(PatternMatchInstr, MatchCombinators
   // Match a case where none of the predicates hold true.
   match = mi_match(
-      MIBAdd->getOperand(0).getReg(), MRI,
+      MIBAdd->getOperand(0).getReg(), *MRI,
       m_any_of(m_SpecificType(LLT::scalar(16)), m_GSub(m_Reg(), m_Reg())));
   EXPECT_FALSE(match);
 }
-TEST(PatternMatchInstr, MatchMiscellaneous) {
-  LLVMContext Context;
-  std::unique_ptr TM = createTargetMachine();
+TEST_F(GISelMITest, MatchMiscellaneous) {
+  setUp();
   if (!TM)
     return;
-  auto ModuleMMIPair = createDummyModule(Context, *TM, "");
-  MachineFunction *MF =
-      getMFFromMMI(ModuleMMIPair.first.get(), ModuleMMIPair.second.get());
-  SmallVector Copies;
-  collectCopies(Copies, MF);
-  MachineBasicBlock *EntryMBB = &*MF->begin();
-  MachineIRBuilder B(*MF);
-  MachineRegisterInfo &MRI = MF->getRegInfo();
-  B.setInsertPt(*EntryMBB, EntryMBB->end());
+
   LLT s64 = LLT::scalar(64);
   auto MIBAdd = B.buildAdd(s64, Copies[0], Copies[1]);
   // Make multiple uses of this add.
   B.buildCast(LLT::pointer(0, 32), MIBAdd);
   B.buildCast(LLT::pointer(1, 32), MIBAdd);
-  bool match = mi_match(MIBAdd.getReg(0), MRI, m_GAdd(m_Reg(), m_Reg()));
+  bool match = mi_match(MIBAdd.getReg(0), *MRI, m_GAdd(m_Reg(), m_Reg()));
   EXPECT_TRUE(match);
-  match = mi_match(MIBAdd.getReg(0), MRI, m_OneUse(m_GAdd(m_Reg(), m_Reg())));
+  match = mi_match(MIBAdd.getReg(0), *MRI, m_OneUse(m_GAdd(m_Reg(), m_Reg())));
   EXPECT_FALSE(match);
 }
 } // namespace

From llvm-commits at lists.llvm.org Wed Oct 9 15:14:08 2019
From: llvm-commits at lists.llvm.org (Douglas Gliner via Phabricator via llvm-commits)
Date: Wed, 09 Oct 2019 22:14:08 +0000 (UTC)
Subject: [PATCH] D51018: [sancov] Accommodate sancov and coverage report server for use under Windows
In-Reply-To:
References:
Message-ID:

dgg5503 added a comment.

In D51018#1702330, @vitalybuka wrote:

> In D51018#1700811, @dgg5503 wrote:
>
> > @vsk thanks for the review! It looks like the JSON support library implements what `JSONWriter` does in this tool. To reduce maintenance, I've updated sancov to use the JSON support library implementation instead. The only downside to this change is that the JSON text format differs compared to the original implementation. I'm open to reverting this diff and simply adding your suggested change which also worked. Let me know what you think.
> >
> > EDIT:
> > I've also updated the title and description to better describe the changes in this diff.
>
>
> I like the change.
>
> Could you move JSONWriter -> JSON refactoring into separate patch and rebase win stuff on top?
> If you don't have committer access, someone will need to commit it for you
> So I don't mind to split the patch and commit myself.

Sure thing! Would I be creating a separate diff for review or would this be taken from the history? Also, I do not have committer access so I will need someone to commit it for me.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D51018/new/

https://reviews.llvm.org/D51018


From llvm-commits at lists.llvm.org Wed Oct 9 15:22:36 2019
From: llvm-commits at lists.llvm.org (Nico Weber via llvm-commits)
Date: Wed, 09 Oct 2019 22:22:36 -0000
Subject: [llvm] r374249 - gn build: (manually) merge r374219
Message-ID: <20191009222236.3BB1E842BE@lists.llvm.org>

Author: nico
Date: Wed Oct 9 15:22:36 2019
New Revision: 374249

URL: http://llvm.org/viewvc/llvm-project?rev=374249&view=rev
Log:
gn build: (manually) merge r374219

Added:
    llvm/trunk/utils/gn/secondary/clang/tools/clang-offload-wrapper/
    llvm/trunk/utils/gn/secondary/clang/tools/clang-offload-wrapper/BUILD.gn
Modified:
    llvm/trunk/utils/gn/secondary/clang/tools/driver/BUILD.gn

Added: llvm/trunk/utils/gn/secondary/clang/tools/clang-offload-wrapper/BUILD.gn
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/gn/secondary/clang/tools/clang-offload-wrapper/BUILD.gn?rev=374249&view=auto
==============================================================================
--- llvm/trunk/utils/gn/secondary/clang/tools/clang-offload-wrapper/BUILD.gn (added)
+++ llvm/trunk/utils/gn/secondary/clang/tools/clang-offload-wrapper/BUILD.gn Wed Oct 9 15:22:36 2019
@@ -0,0 +1,13 @@
+executable("clang-offload-wrapper") {
+  configs += [ "//llvm/utils/gn/build:clang_code" ]
+  deps = [
+    "//clang/lib/Basic",
+    "//llvm/lib/Bitcode/Writer",
+    "//llvm/lib/IR",
+    "//llvm/lib/Support",
+    "//llvm/lib/Transforms/Utils",
+  ]
+  sources = [
+    "ClangOffloadWrapper.cpp",
+  ]
+}

Modified: llvm/trunk/utils/gn/secondary/clang/tools/driver/BUILD.gn
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/gn/secondary/clang/tools/driver/BUILD.gn?rev=374249&r1=374248&r2=374249&view=diff
==============================================================================
--- llvm/trunk/utils/gn/secondary/clang/tools/driver/BUILD.gn (original)
+++ llvm/trunk/utils/gn/secondary/clang/tools/driver/BUILD.gn Wed Oct 9 15:22:36 2019
@@ -59,6 +59,7 @@ executable("clang") {
     "//clang/lib/FrontendTool",
     "//clang/lib/Headers",
     "//clang/tools/clang-offload-bundler",
+    "//clang/tools/clang-offload-wrapper",
     "//llvm/include/llvm/Config:llvm-config",
     "//llvm/lib/Analysis",
     "//llvm/lib/CodeGen",


From llvm-commits at lists.llvm.org Wed Oct 9 15:23:50 2019
From: llvm-commits at lists.llvm.org (Julian Lettner via Phabricator via llvm-commits)
Date: Wed, 09 Oct 2019 22:23:50 +0000 (UTC)
Subject: [PATCH] D68676: [ASan] Do not misrepresent high value address dereferences as null dereferences
In-Reply-To:
References:
Message-ID:

yln updated this revision to Diff 224188.
yln added a comment.

Address Vitaly's comments.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68676/new/

https://reviews.llvm.org/D68676

Files:
  compiler-rt/lib/asan/asan_errors.h
  compiler-rt/lib/sanitizer_common/sanitizer_common.h
  compiler-rt/lib/sanitizer_common/sanitizer_linux.cpp
  compiler-rt/lib/sanitizer_common/sanitizer_mac.cpp
  compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_report.cpp
  compiler-rt/lib/sanitizer_common/sanitizer_win.cpp
  compiler-rt/test/asan/TestCases/Darwin/high-address-dereference.c

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68676.224188.patch
Type: text/x-patch
Size: 7683 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Wed Oct 9 15:23:51 2019
From: llvm-commits at lists.llvm.org (Marcello Maggioni via Phabricator via llvm-commits)
Date: Wed, 09 Oct 2019 22:23:51 +0000 (UTC)
Subject: [PATCH] D68739: [GISel] Allow ConstantFoldBinOp to consider G_FCONSTANT binary representation for combines
Message-ID:

kariddi created this revision.
kariddi added reviewers: aditya_nandakumar, qcolombet, volkan.
Herald added subscribers: llvm-commits, hiraditya.
Herald added a project: LLVM.

In GlobalISel there is no distinct type representing floats or ints, just bag-of-bits types (s16, s32, s64, ...).

When IRTranslator generates code like this:

  %v = bitcast float 1.0 to i32
  %v2 = and i32 %x, %v

It will translate the code to

  %0_(s32) = G_FCONSTANT float 1.0
  %1_(s32) = G_AND %X(s32), %0(s32)

Because a G_BITCAST from s32 to s32 doesn't make sense in this case ...

So from this behavior it's clear that IRTranslator considers the output of "G_FCONSTANT" just a bag of bits, the origin of which was a float constant.

Currently, though, in the constant folder we don't consider constants coming from G_FCONSTANTs for folding purposes, as the folder considers them "float data".

In reality, though, we should consider them like the "untyped bit representation" of a float, and they should participate in any constant folding involving "integer" data.


Repository:
  rL LLVM

https://reviews.llvm.org/D68739

Files:
  llvm/include/llvm/CodeGen/GlobalISel/Utils.h
  llvm/lib/CodeGen/GlobalISel/Utils.cpp
  llvm/unittests/CodeGen/GlobalISel/ConstantFoldingTest.cpp

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68739.224187.patch
Type: text/x-patch
Size: 12450 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Wed Oct 9 15:34:16 2019
From: llvm-commits at lists.llvm.org (Jordan Rupprecht via Phabricator via llvm-commits)
Date: Wed, 09 Oct 2019 22:34:16 +0000 (UTC)
Subject: [PATCH] D68146: [FileCheck] Implement --ignore-case option.
In-Reply-To:
References:
Message-ID: <2214894ddabfa4c96e28829cf255f7fe@localhost.localdomain>

rupprecht added a comment.

Mostly LGTM, just a couple test nits. Thanks for fixing FileCheck!

In D68146#1701152, @Kai wrote:

> In D68146#1691246, @MaskRay wrote:
>
> > To make it really `--ignore-case`, the pattern should also be changed to lowercase.
>
>
> I am not sure which pattern I have missed. The fixed string match uses the lowercase find and the regex match uses the Regex::IgnoreCase flag.

`StringRef::find_lower()` is a bad name for something that should really be called `StringRef::find_case_insensitive()`, which I think is the confusion here.


================
Comment at: llvm/test/FileCheck/check-ignore-case.txt:1
+# RUN: FileCheck -ignore-case -match-full-lines -check-prefix=FULL -input-file %s %s
+# RUN: FileCheck -ignore-case -check-prefix=REGEX -input-file %s %s
----------------
Can you split up these lines with descriptions of what each one does? e.g.

```
## Check that ...
# RUN: FileCheck ...

## Check that other thing ...
# RUN: FileCheck ...
```


================
Comment at: llvm/test/FileCheck/check-ignore-case.txt:5
+# RUN: FileCheck -ignore-case -input-file %s %s
+
+this is the STRING to be matched
----------------
Can you add a check that case is ignored for `-implicit-check-not`? e.g. `not FileCheck -ignore-case -implicit-check-not=sTrinG`


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68146/new/

https://reviews.llvm.org/D68146


From llvm-commits at lists.llvm.org Wed Oct 9 15:34:16 2019
From: llvm-commits at lists.llvm.org (Jordan Rupprecht via Phabricator via llvm-commits)
Date: Wed, 09 Oct 2019 22:34:16 +0000 (UTC)
Subject: [PATCH] D68693: [Tests] Output of od can be lower or upper case (llvm-objcopy/yaml2obj).
In-Reply-To:
References:
Message-ID: <4cc6d1e545802fea803d1294ebc66dc6@localhost.localdomain>

rupprecht accepted this revision.
rupprecht added a comment.
This revision is now accepted and ready to land.

Much better :)


Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68693/new/

https://reviews.llvm.org/D68693


From llvm-commits at lists.llvm.org Wed Oct 9 15:34:17 2019
From: llvm-commits at lists.llvm.org (Vitaly Buka via Phabricator via llvm-commits)
Date: Wed, 09 Oct 2019 22:34:17 +0000 (UTC)
Subject: [PATCH] D51018: [sancov] Accommodate sancov and coverage report server for use under Windows
In-Reply-To:
References:
Message-ID:

vitalybuka added a comment.

In D51018#1702476, @dgg5503 wrote:

> In D51018#1702330, @vitalybuka wrote:
>
> > In D51018#1700811, @dgg5503 wrote:
> >
> > > @vsk thanks for the review! It looks like the JSON support library implements what `JSONWriter` does in this tool. To reduce maintenance, I've updated sancov to use the JSON support library implementation instead. The only downside to this change is that the JSON text format differs compared to the original implementation. I'm open to reverting this diff and simply adding your suggested change which also worked. Let me know what you think.
> > >
> > > EDIT:
> > > I've also updated the title and description to better describe the changes in this diff.
> >
> >
> > I like the change.
> >
> > Could you move JSONWriter -> JSON refactoring into separate patch and rebase win stuff on top?
> > If you don't have committer access, someone will need to commit it for you
> > So I don't mind to split the patch and commit myself.
>
>
> Sure thing! Would I be creating a separate diff for review or would this be taken from the history? Also, I do not have committer access so I will need someone to commit it for me.

Yes, you need to upload a new patch with "arc diff";
then you can chain them with "Edit Related Revisions..." on this site.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D51018/new/

https://reviews.llvm.org/D51018


From llvm-commits at lists.llvm.org Wed Oct 9 15:35:27 2019
From: llvm-commits at lists.llvm.org (Hubert Tong via Phabricator via llvm-commits)
Date: Wed, 09 Oct 2019 22:35:27 +0000 (UTC)
Subject: [PATCH] D68721: [NFC][PowerPC]Clean up PPCAsmPrinter for TOC related pseudo opcode
In-Reply-To:
References:
Message-ID: <87d1a2fb57555990d709929b6891459c@localhost.localdomain>

hubert.reinterpretcast added inline comments.

================
Comment at: llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp:517
+static MCSymbol *getMCSymbolForTOCPseudoMO(const MachineOperand &MO, AsmPrinter &AP) {
+  if (MO.isGlobal())
+    return AP.getSymbol(MO.getGlobal());
----------------
Has a switch on `getType` been considered?


================
Comment at: llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp:685
   // Map the operand to its corresponding MCSymbol.
-  MCSymbol *MOSymbol = nullptr;
-  if (MO.isGlobal())
-    MOSymbol = getSymbol(MO.getGlobal());
-  else if (MO.isCPI())
-    MOSymbol = GetCPISymbol(MO.getIndex());
-  else if (MO.isJTI())
-    MOSymbol = GetJTISymbol(MO.getIndex());
-  else if (MO.isBlockAddress())
-    MOSymbol = GetBlockAddressSymbol(MO.getBlockAddress());
-
+  const MCSymbol *MOSymbol = getMCSymbolForTOCPseudoMO(MO, *this);
   const bool IsAIX = TM.getTargetTriple().isOSAIX();
----------------
Is it also possible to use `* const`?


================
Comment at: llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp:739
+  assert((MO.isGlobal() || MO.isCPI() || MO.isJTI() || MO.isBlockAddress()) &&
+          "Invalid operand!");
----------------
There's an extra space in the indentation here.


================
Comment at: llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp:760
+  // Change the opcode to ADDIS8. If the global address is the address of
+  // an external symbol, is a jump table address, is a block address or is a
+  // constant pool index with large code model enabled, then generate a TOC
----------------
Please don't remove the Oxford comma before the "or".


================
Comment at: llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp:771
+
+  bool GlobalToc =
+      MO.isGlobal() && Subtarget->isGVIndirectSymbol(MO.getGlobal());
----------------
Can this be `const`?


================
Comment at: llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp:774
   if (GlobalToc || MO.isJTI() || MO.isBlockAddress() ||
-      TM.getCodeModel() == CodeModel::Large)
+      (MO.isCPI() && TM.getCodeModel() == CodeModel::Large))
     MOSymbol = lookUpOrCreateTOCEntry(MOSymbol);
----------------
This does not seem to be an NFC change.


================
Comment at: llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp:797
+  // an external symbol, is a jump table address, is a block address, or is
+  // a constant pool index with large code model enabled then generate a
+  // TOC entry and reference that. Otherwise reference the symbol directly.
----------------
Comma before "then".


================
Comment at: llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp:798
+  // a constant pool index with large code model enabled then generate a
+  // TOC entry and reference that. Otherwise reference the symbol directly.
   TmpInst.setOpcode(PPC::LD);
----------------
Comma after "Otherwise".


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68721/new/

https://reviews.llvm.org/D68721


From llvm-commits at lists.llvm.org Wed Oct 9 15:35:38 2019
From: llvm-commits at lists.llvm.org (Roman Lebedev via Phabricator via llvm-commits)
Date: Wed, 09 Oct 2019 22:35:38 +0000 (UTC)
Subject: [PATCH] D68740: [NFC][CVP] Count all the no-wraps we prooved
Message-ID:

lebedev.ri created this revision.
lebedev.ri added reviewers: nikic, spatel, dberlin.
lebedev.ri added a project: LLVM.
Herald added a subscriber: hiraditya.

I'm not sure if I'm going overboard with this.
It looks like this is the only missing statistic in the CVP pass.
Since we prove NSW and NUW separately, I'd think we should count them separately too.

test-suite:

  | correlated-value-propagation.NumAddNSW    | 4381  |
  | correlated-value-propagation.NumAddNUW    | 6532  |
  | correlated-value-propagation.NumMulNUW    | 4     |
  | correlated-value-propagation.NumNSW       | 5099  |
  | correlated-value-propagation.NumNUW       | 8570  |
  | correlated-value-propagation.NumNW        | 13669 |
  | correlated-value-propagation.NumOverflows | 4     |
  | correlated-value-propagation.NumSubNSW    | 718   |
  | correlated-value-propagation.NumSubNUW    | 2034  |


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D68740

Files:
  llvm/lib/Transforms/Scalar/CorrelatedValuePropagation.cpp

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68740.224191.patch
Type: text/x-patch
Size: 4793 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Wed Oct 9 15:44:43 2019
From: llvm-commits at lists.llvm.org (Matt Arsenault via llvm-commits)
Date: Wed, 09 Oct 2019 22:44:43 -0000
Subject: [llvm] r374252 - GlobalISel: Implement fewerElementsVector for G_BUILD_VECTOR
Message-ID: <20191009224443.BE0C585088@lists.llvm.org>

Author: arsenm
Date: Wed Oct 9 15:44:43 2019
New Revision: 374252

URL: http://llvm.org/viewvc/llvm-project?rev=374252&view=rev
Log:
GlobalISel: Implement fewerElementsVector for G_BUILD_VECTOR

Turn it into a G_CONCAT_VECTORS of G_BUILD_VECTOR.

Modified: llvm/trunk/include/llvm/CodeGen/GlobalISel/LegalizerHelper.h llvm/trunk/lib/CodeGen/GlobalISel/LegalizerHelper.cpp llvm/trunk/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-ashr.mir llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-build-vector.mir llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-extract.mir llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-fadd.mir llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-fcanonicalize.mir llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-fcos.mir llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-ffloor.mir llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-fma.mir llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-fmad.s16.mir llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-fmaxnum.mir llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-fminnum.mir llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-fmul.mir llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-fsin.mir llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-fsqrt.mir llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-fsub.mir llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-load-constant.mir llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-load-flat.mir llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-load-global.mir llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-load-local.mir llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-load-private.mir llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-lshr.mir llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-phi.mir llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-shl.mir llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-shuffle-vector.mir llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-smax.mir llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-smin.mir llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-umax.mir llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-umin.mir Modified: 
llvm/trunk/include/llvm/CodeGen/GlobalISel/LegalizerHelper.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/CodeGen/GlobalISel/LegalizerHelper.h?rev=374252&r1=374251&r2=374252&view=diff ============================================================================== --- llvm/trunk/include/llvm/CodeGen/GlobalISel/LegalizerHelper.h (original) +++ llvm/trunk/include/llvm/CodeGen/GlobalISel/LegalizerHelper.h Wed Oct 9 15:44:43 2019 @@ -203,6 +203,10 @@ public: LegalizeResult fewerElementsVectorUnmergeValues(MachineInstr &MI, unsigned TypeIdx, LLT NarrowTy); + LegalizeResult fewerElementsVectorBuildVector(MachineInstr &MI, + unsigned TypeIdx, + LLT NarrowTy); + LegalizeResult reduceLoadStoreWidth(MachineInstr &MI, unsigned TypeIdx, LLT NarrowTy); Modified: llvm/trunk/lib/CodeGen/GlobalISel/LegalizerHelper.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/GlobalISel/LegalizerHelper.cpp?rev=374252&r1=374251&r2=374252&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/GlobalISel/LegalizerHelper.cpp (original) +++ llvm/trunk/lib/CodeGen/GlobalISel/LegalizerHelper.cpp Wed Oct 9 15:44:43 2019 @@ -2767,6 +2767,65 @@ LegalizerHelper::fewerElementsVectorUnme } LegalizerHelper::LegalizeResult +LegalizerHelper::fewerElementsVectorBuildVector(MachineInstr &MI, + unsigned TypeIdx, + LLT NarrowTy) { + assert(TypeIdx == 0 && "not a vector type index"); + Register DstReg = MI.getOperand(0).getReg(); + LLT DstTy = MRI.getType(DstReg); + LLT SrcTy = DstTy.getElementType(); + + int DstNumElts = DstTy.getNumElements(); + int NarrowNumElts = NarrowTy.getNumElements(); + int NumConcat = (DstNumElts + NarrowNumElts - 1) / NarrowNumElts; + LLT WidenedDstTy = LLT::vector(NarrowNumElts * NumConcat, SrcTy); + + SmallVector ConcatOps; + SmallVector SubBuildVector; + + Register UndefReg; + if (WidenedDstTy != DstTy) + UndefReg = MIRBuilder.buildUndef(SrcTy).getReg(0); + + // Create a G_CONCAT_VECTORS 
of NarrowTy pieces, padding with undef as + // necessary. + // + // %3:_(<3 x s16>) = G_BUILD_VECTOR %0, %1, %2 + // -> <2 x s16> + // + // %4:_(s16) = G_IMPLICIT_DEF + // %5:_(<2 x s16>) = G_BUILD_VECTOR %0, %1 + // %6:_(<2 x s16>) = G_BUILD_VECTOR %2, %4 + // %7:_(<4 x s16>) = G_CONCAT_VECTORS %5, %6 + // %3:_(<3 x s16>) = G_EXTRACT %7, 0 + for (int I = 0; I != NumConcat; ++I) { + for (int J = 0; J != NarrowNumElts; ++J) { + int SrcIdx = NarrowNumElts * I + J; + + if (SrcIdx < DstNumElts) { + Register SrcReg = MI.getOperand(SrcIdx + 1).getReg(); + SubBuildVector.push_back(SrcReg); + } else + SubBuildVector.push_back(UndefReg); + } + + auto BuildVec = MIRBuilder.buildBuildVector(NarrowTy, SubBuildVector); + ConcatOps.push_back(BuildVec.getReg(0)); + SubBuildVector.clear(); + } + + if (DstTy == WidenedDstTy) + MIRBuilder.buildConcatVectors(DstReg, ConcatOps); + else { + auto Concat = MIRBuilder.buildConcatVectors(WidenedDstTy, ConcatOps); + MIRBuilder.buildExtract(DstReg, Concat, 0); + } + + MI.eraseFromParent(); + return Legalized; +} + +LegalizerHelper::LegalizeResult LegalizerHelper::reduceLoadStoreWidth(MachineInstr &MI, unsigned TypeIdx, LLT NarrowTy) { // FIXME: Don't know how to handle secondary types yet. 
@@ -2941,6 +3000,8 @@ LegalizerHelper::fewerElementsVector(Mac
     return fewerElementsVectorPhi(MI, TypeIdx, NarrowTy);
   case G_UNMERGE_VALUES:
     return fewerElementsVectorUnmergeValues(MI, TypeIdx, NarrowTy);
+  case G_BUILD_VECTOR:
+    return fewerElementsVectorBuildVector(MI, TypeIdx, NarrowTy);
   case G_LOAD:
   case G_STORE:
     return reduceLoadStoreWidth(MI, TypeIdx, NarrowTy);

Modified: llvm/trunk/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp?rev=374252&r1=374251&r2=374252&view=diff
==============================================================================
--- llvm/trunk/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp (original)
+++ llvm/trunk/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp Wed Oct 9 15:44:43 2019
@@ -64,6 +64,14 @@ static LegalityPredicate isSmallOddVecto
   };
 }
 
+static LegalityPredicate isWideVec16(unsigned TypeIdx) {
+  return [=](const LegalityQuery &Query) {
+    const LLT Ty = Query.Types[TypeIdx];
+    const LLT EltTy = Ty.getScalarType();
+    return EltTy.getSizeInBits() == 16 && Ty.getNumElements() > 2;
+  };
+}
+
 static LegalizeMutation oneMoreElement(unsigned TypeIdx) {
   return [=](const LegalityQuery &Query) {
     const LLT Ty = Query.Types[TypeIdx];
@@ -945,7 +953,8 @@ AMDGPULegalizerInfo::AMDGPULegalizerInfo
     .legalForCartesianProduct(AllS32Vectors, {S32})
     .legalForCartesianProduct(AllS64Vectors, {S64})
     .clampNumElements(0, V16S32, V32S32)
-    .clampNumElements(0, V2S64, V16S64);
+    .clampNumElements(0, V2S64, V16S64)
+    .fewerElementsIf(isWideVec16(0), changeTo(0, V2S16));
 
   if (ST.hasScalarPackInsts())
     BuildVector.legalFor({V2S16, S32});

Modified: llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-ashr.mir
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-ashr.mir?rev=374252&r1=374251&r2=374252&view=diff
==============================================================================
--- llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-ashr.mir
(original) +++ llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-ashr.mir Wed Oct 9 15:44:43 2019 @@ -623,25 +623,28 @@ body: | ; SI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[COPY3]], [[C]](s32) ; SI: [[ASHR:%[0-9]+]]:_(s32) = G_ASHR [[SHL]], [[C]](s32) ; SI: [[ASHR1:%[0-9]+]]:_(s32) = G_ASHR [[ASHR]], [[AND]](s32) + ; SI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[ASHR1]](s32) ; SI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32) ; SI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C1]] ; SI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32) ; SI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[COPY5]], [[C]](s32) ; SI: [[ASHR2:%[0-9]+]]:_(s32) = G_ASHR [[SHL1]], [[C]](s32) ; SI: [[ASHR3:%[0-9]+]]:_(s32) = G_ASHR [[ASHR2]], [[AND1]](s32) + ; SI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[ASHR3]](s32) ; SI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[BITCAST3]](s32) ; SI: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C1]] ; SI: [[COPY7:%[0-9]+]]:_(s32) = COPY [[BITCAST1]](s32) ; SI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[COPY7]], [[C]](s32) ; SI: [[ASHR4:%[0-9]+]]:_(s32) = G_ASHR [[SHL2]], [[C]](s32) ; SI: [[ASHR5:%[0-9]+]]:_(s32) = G_ASHR [[ASHR4]], [[AND2]](s32) - ; SI: [[COPY8:%[0-9]+]]:_(s32) = COPY [[ASHR1]](s32) - ; SI: [[COPY9:%[0-9]+]]:_(s32) = COPY [[ASHR3]](s32) - ; SI: [[COPY10:%[0-9]+]]:_(s32) = COPY [[ASHR5]](s32) - ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[COPY8]](s32), [[COPY9]](s32), [[COPY10]](s32) - ; SI: [[TRUNC:%[0-9]+]]:_(<3 x s16>) = G_TRUNC [[BUILD_VECTOR]](<3 x s32>) - ; SI: [[DEF2:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF - ; SI: [[INSERT2:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF2]], [[TRUNC]](<3 x s16>), 0 + ; SI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[ASHR5]](s32) + ; SI: [[DEF2:%[0-9]+]]:_(s16) = G_IMPLICIT_DEF + ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16) + ; SI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC2]](s16), [[DEF2]](s16) + ; SI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS 
[[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; SI: [[EXTRACT2:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[CONCAT_VECTORS]](<4 x s16>), 0 + ; SI: [[DEF3:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF + ; SI: [[INSERT2:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF3]], [[EXTRACT2]](<3 x s16>), 0 ; SI: $vgpr0_vgpr1 = COPY [[INSERT2]](<4 x s16>) ; VI-LABEL: name: test_ashr_v3s16_v3s16 ; VI: [[COPY:%[0-9]+]]:_(<4 x s16>) = COPY $vgpr0_vgpr1 @@ -672,13 +675,13 @@ body: | ; VI: [[ASHR:%[0-9]+]]:_(s16) = G_ASHR [[TRUNC]], [[TRUNC3]](s16) ; VI: [[ASHR1:%[0-9]+]]:_(s16) = G_ASHR [[TRUNC1]], [[TRUNC4]](s16) ; VI: [[ASHR2:%[0-9]+]]:_(s16) = G_ASHR [[TRUNC2]], [[TRUNC5]](s16) - ; VI: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[ASHR]](s16) - ; VI: [[ANYEXT1:%[0-9]+]]:_(s32) = G_ANYEXT [[ASHR1]](s16) - ; VI: [[ANYEXT2:%[0-9]+]]:_(s32) = G_ANYEXT [[ASHR2]](s16) - ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[ANYEXT]](s32), [[ANYEXT1]](s32), [[ANYEXT2]](s32) - ; VI: [[TRUNC6:%[0-9]+]]:_(<3 x s16>) = G_TRUNC [[BUILD_VECTOR]](<3 x s32>) - ; VI: [[DEF2:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF - ; VI: [[INSERT2:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF2]], [[TRUNC6]](<3 x s16>), 0 + ; VI: [[DEF2:%[0-9]+]]:_(s16) = G_IMPLICIT_DEF + ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[ASHR]](s16), [[ASHR1]](s16) + ; VI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[ASHR2]](s16), [[DEF2]](s16) + ; VI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; VI: [[EXTRACT2:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[CONCAT_VECTORS]](<4 x s16>), 0 + ; VI: [[DEF3:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF + ; VI: [[INSERT2:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF3]], [[EXTRACT2]](<3 x s16>), 0 ; VI: $vgpr0_vgpr1 = COPY [[INSERT2]](<4 x s16>) ; GFX9-LABEL: name: test_ashr_v3s16_v3s16 ; GFX9: [[COPY:%[0-9]+]]:_(<4 x s16>) = COPY $vgpr0_vgpr1 @@ -771,8 +774,10 @@ body: | ; SI: [[ASHR6:%[0-9]+]]:_(s32) = G_ASHR 
[[SHL3]], [[C]](s32) ; SI: [[ASHR7:%[0-9]+]]:_(s32) = G_ASHR [[ASHR6]], [[AND3]](s32) ; SI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[ASHR7]](s32) - ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) - ; SI: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<4 x s16>) + ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16) + ; SI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC2]](s16), [[TRUNC3]](s16) + ; SI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; SI: $vgpr0_vgpr1 = COPY [[CONCAT_VECTORS]](<4 x s16>) ; VI-LABEL: name: test_ashr_v4s16_v4s16 ; VI: [[COPY:%[0-9]+]]:_(<4 x s16>) = COPY $vgpr0_vgpr1 ; VI: [[COPY1:%[0-9]+]]:_(<4 x s16>) = COPY $vgpr2_vgpr3 @@ -799,8 +804,10 @@ body: | ; VI: [[ASHR1:%[0-9]+]]:_(s16) = G_ASHR [[TRUNC1]], [[TRUNC5]](s16) ; VI: [[ASHR2:%[0-9]+]]:_(s16) = G_ASHR [[TRUNC2]], [[TRUNC6]](s16) ; VI: [[ASHR3:%[0-9]+]]:_(s16) = G_ASHR [[TRUNC3]], [[TRUNC7]](s16) - ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s16>) = G_BUILD_VECTOR [[ASHR]](s16), [[ASHR1]](s16), [[ASHR2]](s16), [[ASHR3]](s16) - ; VI: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<4 x s16>) + ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[ASHR]](s16), [[ASHR1]](s16) + ; VI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[ASHR2]](s16), [[ASHR3]](s16) + ; VI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; VI: $vgpr0_vgpr1 = COPY [[CONCAT_VECTORS]](<4 x s16>) ; GFX9-LABEL: name: test_ashr_v4s16_v4s16 ; GFX9: [[COPY:%[0-9]+]]:_(<4 x s16>) = COPY $vgpr0_vgpr1 ; GFX9: [[COPY1:%[0-9]+]]:_(<4 x s16>) = COPY $vgpr2_vgpr3 Modified: llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-build-vector.mir URL: 
http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-build-vector.mir?rev=374252&r1=374251&r2=374252&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-build-vector.mir (original) +++ llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-build-vector.mir Wed Oct 9 15:44:43 2019 @@ -822,3 +822,303 @@ body: | %4:_(<4 x s128>) = G_BUILD_VECTOR %0, %1, %2, %3 S_NOP 0, implicit %4 ... + +--- +name: build_vector_v2s16 +body: | + bb.0: + liveins: $vgpr0, $vgpr1 + + ; CHECK-LABEL: name: build_vector_v2s16 + ; CHECK: [[COPY:%[0-9]+]]:_(s32) = COPY $vgpr0 + ; CHECK: [[COPY1:%[0-9]+]]:_(s32) = COPY $vgpr1 + ; CHECK: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[COPY]](s32) + ; CHECK: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[COPY1]](s32) + ; CHECK: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16) + ; CHECK: S_NOP 0, implicit [[BUILD_VECTOR]](<2 x s16>) + %0:_(s32) = COPY $vgpr0 + %1:_(s32) = COPY $vgpr1 + %2:_(s16) = G_TRUNC %0 + %3:_(s16) = G_TRUNC %1 + %4:_(<2 x s16>) = G_BUILD_VECTOR %2, %3 + S_NOP 0, implicit %4 +... 
+ +--- +name: build_vector_v3s16 +body: | + bb.0: + liveins: $vgpr0, $vgpr1, $vgpr2 + + ; CHECK-LABEL: name: build_vector_v3s16 + ; CHECK: [[COPY:%[0-9]+]]:_(s32) = COPY $vgpr0 + ; CHECK: [[COPY1:%[0-9]+]]:_(s32) = COPY $vgpr1 + ; CHECK: [[COPY2:%[0-9]+]]:_(s32) = COPY $vgpr2 + ; CHECK: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[COPY]](s32) + ; CHECK: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[COPY1]](s32) + ; CHECK: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[COPY2]](s32) + ; CHECK: [[DEF:%[0-9]+]]:_(s16) = G_IMPLICIT_DEF + ; CHECK: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16) + ; CHECK: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC2]](s16), [[DEF]](s16) + ; CHECK: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; CHECK: [[EXTRACT:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[CONCAT_VECTORS]](<4 x s16>), 0 + ; CHECK: S_NOP 0, implicit [[EXTRACT]](<3 x s16>) + %0:_(s32) = COPY $vgpr0 + %1:_(s32) = COPY $vgpr1 + %2:_(s32) = COPY $vgpr2 + %3:_(s16) = G_TRUNC %0 + %4:_(s16) = G_TRUNC %1 + %5:_(s16) = G_TRUNC %2 + %6:_(<3 x s16>) = G_BUILD_VECTOR %3, %4, %5 + S_NOP 0, implicit %6 +... 
+ +--- +name: build_vector_v4s16 +body: | + bb.0: + liveins: $vgpr0, $vgpr1, $vgpr2, $vgpr3 + + ; CHECK-LABEL: name: build_vector_v4s16 + ; CHECK: [[COPY:%[0-9]+]]:_(s32) = COPY $vgpr0 + ; CHECK: [[COPY1:%[0-9]+]]:_(s32) = COPY $vgpr1 + ; CHECK: [[COPY2:%[0-9]+]]:_(s32) = COPY $vgpr2 + ; CHECK: [[COPY3:%[0-9]+]]:_(s32) = COPY $vgpr3 + ; CHECK: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[COPY]](s32) + ; CHECK: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[COPY1]](s32) + ; CHECK: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[COPY2]](s32) + ; CHECK: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[COPY3]](s32) + ; CHECK: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16) + ; CHECK: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC2]](s16), [[TRUNC3]](s16) + ; CHECK: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; CHECK: S_NOP 0, implicit [[CONCAT_VECTORS]](<4 x s16>) + %0:_(s32) = COPY $vgpr0 + %1:_(s32) = COPY $vgpr1 + %2:_(s32) = COPY $vgpr2 + %3:_(s32) = COPY $vgpr3 + %4:_(s16) = G_TRUNC %0 + %5:_(s16) = G_TRUNC %1 + %6:_(s16) = G_TRUNC %2 + %7:_(s16) = G_TRUNC %3 + %8:_(<4 x s16>) = G_BUILD_VECTOR %4, %5, %6, %7 + S_NOP 0, implicit %8 +... 
+ +--- +name: build_vector_v5s16 +body: | + bb.0: + liveins: $vgpr0, $vgpr1, $vgpr2, $vgpr3, $vgpr4 + + ; CHECK-LABEL: name: build_vector_v5s16 + ; CHECK: [[COPY:%[0-9]+]]:_(s32) = COPY $vgpr0 + ; CHECK: [[COPY1:%[0-9]+]]:_(s32) = COPY $vgpr1 + ; CHECK: [[COPY2:%[0-9]+]]:_(s32) = COPY $vgpr2 + ; CHECK: [[COPY3:%[0-9]+]]:_(s32) = COPY $vgpr3 + ; CHECK: [[COPY4:%[0-9]+]]:_(s32) = COPY $vgpr4 + ; CHECK: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[COPY]](s32) + ; CHECK: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[COPY1]](s32) + ; CHECK: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[COPY2]](s32) + ; CHECK: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[COPY3]](s32) + ; CHECK: [[TRUNC4:%[0-9]+]]:_(s16) = G_TRUNC [[COPY4]](s32) + ; CHECK: [[DEF:%[0-9]+]]:_(s16) = G_IMPLICIT_DEF + ; CHECK: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16) + ; CHECK: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC2]](s16), [[TRUNC3]](s16) + ; CHECK: [[BUILD_VECTOR2:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC4]](s16), [[DEF]](s16) + ; CHECK: [[CONCAT_VECTORS:%[0-9]+]]:_(<6 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>), [[BUILD_VECTOR2]](<2 x s16>) + ; CHECK: [[EXTRACT:%[0-9]+]]:_(<5 x s16>) = G_EXTRACT [[CONCAT_VECTORS]](<6 x s16>), 0 + ; CHECK: S_NOP 0, implicit [[EXTRACT]](<5 x s16>) + %0:_(s32) = COPY $vgpr0 + %1:_(s32) = COPY $vgpr1 + %2:_(s32) = COPY $vgpr2 + %3:_(s32) = COPY $vgpr3 + %4:_(s32) = COPY $vgpr4 + %5:_(s16) = G_TRUNC %0 + %6:_(s16) = G_TRUNC %1 + %7:_(s16) = G_TRUNC %2 + %8:_(s16) = G_TRUNC %3 + %9:_(s16) = G_TRUNC %4 + %10:_(<5 x s16>) = G_BUILD_VECTOR %5, %6, %7, %8, %9 + S_NOP 0, implicit %10 +... 
+ +--- +name: build_vector_v7s16 +body: | + bb.0: + liveins: $vgpr0, $vgpr1, $vgpr2, $vgpr3, $vgpr4, $vgpr5, $vgpr6 + + ; CHECK-LABEL: name: build_vector_v7s16 + ; CHECK: [[COPY:%[0-9]+]]:_(s32) = COPY $vgpr0 + ; CHECK: [[COPY1:%[0-9]+]]:_(s32) = COPY $vgpr1 + ; CHECK: [[COPY2:%[0-9]+]]:_(s32) = COPY $vgpr2 + ; CHECK: [[COPY3:%[0-9]+]]:_(s32) = COPY $vgpr3 + ; CHECK: [[COPY4:%[0-9]+]]:_(s32) = COPY $vgpr4 + ; CHECK: [[COPY5:%[0-9]+]]:_(s32) = COPY $vgpr5 + ; CHECK: [[COPY6:%[0-9]+]]:_(s32) = COPY $vgpr6 + ; CHECK: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[COPY]](s32) + ; CHECK: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[COPY1]](s32) + ; CHECK: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[COPY2]](s32) + ; CHECK: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[COPY3]](s32) + ; CHECK: [[TRUNC4:%[0-9]+]]:_(s16) = G_TRUNC [[COPY4]](s32) + ; CHECK: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[COPY5]](s32) + ; CHECK: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[COPY6]](s32) + ; CHECK: [[DEF:%[0-9]+]]:_(s16) = G_IMPLICIT_DEF + ; CHECK: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16) + ; CHECK: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC2]](s16), [[TRUNC3]](s16) + ; CHECK: [[BUILD_VECTOR2:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC4]](s16), [[TRUNC5]](s16) + ; CHECK: [[BUILD_VECTOR3:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC6]](s16), [[DEF]](s16) + ; CHECK: [[CONCAT_VECTORS:%[0-9]+]]:_(<8 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>), [[BUILD_VECTOR2]](<2 x s16>), [[BUILD_VECTOR3]](<2 x s16>) + ; CHECK: [[EXTRACT:%[0-9]+]]:_(<7 x s16>) = G_EXTRACT [[CONCAT_VECTORS]](<8 x s16>), 0 + ; CHECK: S_NOP 0, implicit [[EXTRACT]](<7 x s16>) + %0:_(s32) = COPY $vgpr0 + %1:_(s32) = COPY $vgpr1 + %2:_(s32) = COPY $vgpr2 + %3:_(s32) = COPY $vgpr3 + %4:_(s32) = COPY $vgpr4 + %5:_(s32) = COPY $vgpr5 + %6:_(s32) = COPY $vgpr6 + %7:_(s16) = G_TRUNC %0 + %8:_(s16) = G_TRUNC %1 + %9:_(s16) = G_TRUNC %2 + %10:_(s16) = G_TRUNC %3 
+ %11:_(s16) = G_TRUNC %4 + %12:_(s16) = G_TRUNC %5 + %13:_(s16) = G_TRUNC %6 + %14:_(<7 x s16>) = G_BUILD_VECTOR %7, %8, %9, %10, %11, %12, %13 + S_NOP 0, implicit %14 +... + +--- +name: build_vector_v8s16 +body: | + bb.0: + liveins: $vgpr0, $vgpr1, $vgpr2, $vgpr3, $vgpr4, $vgpr5, $vgpr6, $vgpr7 + + ; CHECK-LABEL: name: build_vector_v8s16 + ; CHECK: [[COPY:%[0-9]+]]:_(s32) = COPY $vgpr0 + ; CHECK: [[COPY1:%[0-9]+]]:_(s32) = COPY $vgpr1 + ; CHECK: [[COPY2:%[0-9]+]]:_(s32) = COPY $vgpr2 + ; CHECK: [[COPY3:%[0-9]+]]:_(s32) = COPY $vgpr3 + ; CHECK: [[COPY4:%[0-9]+]]:_(s32) = COPY $vgpr4 + ; CHECK: [[COPY5:%[0-9]+]]:_(s32) = COPY $vgpr5 + ; CHECK: [[COPY6:%[0-9]+]]:_(s32) = COPY $vgpr6 + ; CHECK: [[COPY7:%[0-9]+]]:_(s32) = COPY $vgpr7 + ; CHECK: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[COPY]](s32) + ; CHECK: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[COPY1]](s32) + ; CHECK: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[COPY2]](s32) + ; CHECK: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[COPY3]](s32) + ; CHECK: [[TRUNC4:%[0-9]+]]:_(s16) = G_TRUNC [[COPY4]](s32) + ; CHECK: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[COPY5]](s32) + ; CHECK: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[COPY6]](s32) + ; CHECK: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[COPY7]](s32) + ; CHECK: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16) + ; CHECK: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC2]](s16), [[TRUNC3]](s16) + ; CHECK: [[BUILD_VECTOR2:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC4]](s16), [[TRUNC5]](s16) + ; CHECK: [[BUILD_VECTOR3:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC6]](s16), [[TRUNC7]](s16) + ; CHECK: [[CONCAT_VECTORS:%[0-9]+]]:_(<8 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>), [[BUILD_VECTOR2]](<2 x s16>), [[BUILD_VECTOR3]](<2 x s16>) + ; CHECK: S_NOP 0, implicit [[CONCAT_VECTORS]](<8 x s16>) + %0:_(s32) = COPY $vgpr0 + %1:_(s32) = COPY $vgpr1 + %2:_(s32) = COPY $vgpr2 + %3:_(s32) = COPY $vgpr3 + %4:_(s32) = 
COPY $vgpr4 + %5:_(s32) = COPY $vgpr5 + %6:_(s32) = COPY $vgpr6 + %7:_(s32) = COPY $vgpr7 + %8:_(s16) = G_TRUNC %0 + %9:_(s16) = G_TRUNC %1 + %10:_(s16) = G_TRUNC %2 + %11:_(s16) = G_TRUNC %3 + %12:_(s16) = G_TRUNC %4 + %13:_(s16) = G_TRUNC %5 + %14:_(s16) = G_TRUNC %6 + %15:_(s16) = G_TRUNC %7 + %16:_(<8 x s16>) = G_BUILD_VECTOR %8, %9, %10, %11, %12, %13, %14, %15 + S_NOP 0, implicit %16 +... + +--- +name: build_vector_v16s16 +body: | + bb.0: + liveins: $vgpr0, $vgpr1, $vgpr2, $vgpr3, $vgpr4, $vgpr5, $vgpr6, $vgpr7, $vgpr8, $vgpr9, $vgpr10, $vgpr11, $vgpr12, $vgpr13, $vgpr14, $vgpr15 + + ; CHECK-LABEL: name: build_vector_v16s16 + ; CHECK: [[COPY:%[0-9]+]]:_(s32) = COPY $vgpr0 + ; CHECK: [[COPY1:%[0-9]+]]:_(s32) = COPY $vgpr1 + ; CHECK: [[COPY2:%[0-9]+]]:_(s32) = COPY $vgpr2 + ; CHECK: [[COPY3:%[0-9]+]]:_(s32) = COPY $vgpr3 + ; CHECK: [[COPY4:%[0-9]+]]:_(s32) = COPY $vgpr4 + ; CHECK: [[COPY5:%[0-9]+]]:_(s32) = COPY $vgpr5 + ; CHECK: [[COPY6:%[0-9]+]]:_(s32) = COPY $vgpr6 + ; CHECK: [[COPY7:%[0-9]+]]:_(s32) = COPY $vgpr7 + ; CHECK: [[COPY8:%[0-9]+]]:_(s32) = COPY $vgpr8 + ; CHECK: [[COPY9:%[0-9]+]]:_(s32) = COPY $vgpr9 + ; CHECK: [[COPY10:%[0-9]+]]:_(s32) = COPY $vgpr10 + ; CHECK: [[COPY11:%[0-9]+]]:_(s32) = COPY $vgpr11 + ; CHECK: [[COPY12:%[0-9]+]]:_(s32) = COPY $vgpr12 + ; CHECK: [[COPY13:%[0-9]+]]:_(s32) = COPY $vgpr13 + ; CHECK: [[COPY14:%[0-9]+]]:_(s32) = COPY $vgpr14 + ; CHECK: [[COPY15:%[0-9]+]]:_(s32) = COPY $vgpr15 + ; CHECK: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[COPY]](s32) + ; CHECK: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[COPY1]](s32) + ; CHECK: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[COPY2]](s32) + ; CHECK: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[COPY3]](s32) + ; CHECK: [[TRUNC4:%[0-9]+]]:_(s16) = G_TRUNC [[COPY4]](s32) + ; CHECK: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[COPY5]](s32) + ; CHECK: [[TRUNC6:%[0-9]+]]:_(s16) = G_TRUNC [[COPY6]](s32) + ; CHECK: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[COPY7]](s32) + ; CHECK: [[TRUNC8:%[0-9]+]]:_(s16) = G_TRUNC 
[[COPY8]](s32) + ; CHECK: [[TRUNC9:%[0-9]+]]:_(s16) = G_TRUNC [[COPY9]](s32) + ; CHECK: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[COPY10]](s32) + ; CHECK: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[COPY11]](s32) + ; CHECK: [[TRUNC12:%[0-9]+]]:_(s16) = G_TRUNC [[COPY12]](s32) + ; CHECK: [[TRUNC13:%[0-9]+]]:_(s16) = G_TRUNC [[COPY13]](s32) + ; CHECK: [[TRUNC14:%[0-9]+]]:_(s16) = G_TRUNC [[COPY14]](s32) + ; CHECK: [[TRUNC15:%[0-9]+]]:_(s16) = G_TRUNC [[COPY15]](s32) + ; CHECK: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16) + ; CHECK: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC2]](s16), [[TRUNC3]](s16) + ; CHECK: [[BUILD_VECTOR2:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC4]](s16), [[TRUNC5]](s16) + ; CHECK: [[BUILD_VECTOR3:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC6]](s16), [[TRUNC7]](s16) + ; CHECK: [[BUILD_VECTOR4:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC8]](s16), [[TRUNC9]](s16) + ; CHECK: [[BUILD_VECTOR5:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC10]](s16), [[TRUNC11]](s16) + ; CHECK: [[BUILD_VECTOR6:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC12]](s16), [[TRUNC13]](s16) + ; CHECK: [[BUILD_VECTOR7:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC14]](s16), [[TRUNC15]](s16) + ; CHECK: [[CONCAT_VECTORS:%[0-9]+]]:_(<16 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>), [[BUILD_VECTOR2]](<2 x s16>), [[BUILD_VECTOR3]](<2 x s16>), [[BUILD_VECTOR4]](<2 x s16>), [[BUILD_VECTOR5]](<2 x s16>), [[BUILD_VECTOR6]](<2 x s16>), [[BUILD_VECTOR7]](<2 x s16>) + ; CHECK: S_NOP 0, implicit [[CONCAT_VECTORS]](<16 x s16>) + %0:_(s32) = COPY $vgpr0 + %1:_(s32) = COPY $vgpr1 + %2:_(s32) = COPY $vgpr2 + %3:_(s32) = COPY $vgpr3 + %4:_(s32) = COPY $vgpr4 + %5:_(s32) = COPY $vgpr5 + %6:_(s32) = COPY $vgpr6 + %7:_(s32) = COPY $vgpr7 + %8:_(s32) = COPY $vgpr8 + %9:_(s32) = COPY $vgpr9 + %10:_(s32) = COPY $vgpr10 + %11:_(s32) = COPY $vgpr11 + %12:_(s32) = COPY $vgpr12 + %13:_(s32) = COPY 
$vgpr13 + %14:_(s32) = COPY $vgpr14 + %15:_(s32) = COPY $vgpr15 + %16:_(s16) = G_TRUNC %0 + %17:_(s16) = G_TRUNC %1 + %18:_(s16) = G_TRUNC %2 + %19:_(s16) = G_TRUNC %3 + %20:_(s16) = G_TRUNC %4 + %21:_(s16) = G_TRUNC %5 + %22:_(s16) = G_TRUNC %6 + %23:_(s16) = G_TRUNC %7 + %24:_(s16) = G_TRUNC %8 + %25:_(s16) = G_TRUNC %9 + %26:_(s16) = G_TRUNC %10 + %27:_(s16) = G_TRUNC %11 + %28:_(s16) = G_TRUNC %12 + %29:_(s16) = G_TRUNC %13 + %30:_(s16) = G_TRUNC %14 + %31:_(s16) = G_TRUNC %15 + %32:_(<16 x s16>) = G_BUILD_VECTOR %16, %17, %18, %19, %20, %21, %22, %23, %24, %25, %26, %27, %28, %29, %30, %31 + S_NOP 0, implicit %32 +... Modified: llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-extract.mir URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-extract.mir?rev=374252&r1=374251&r2=374252&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-extract.mir (original) +++ llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-extract.mir Wed Oct 9 15:44:43 2019 @@ -513,15 +513,18 @@ body: | ; CHECK: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[DEF]](<4 x s32>) ; CHECK: [[EXTRACT:%[0-9]+]]:_(<3 x s8>) = G_EXTRACT [[TRUNC]](<4 x s8>), 0 ; CHECK: [[UV:%[0-9]+]]:_(s8), [[UV1:%[0-9]+]]:_(s8), [[UV2:%[0-9]+]]:_(s8) = G_UNMERGE_VALUES [[EXTRACT]](<3 x s8>) - ; CHECK: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[UV]](s8) - ; CHECK: [[ANYEXT1:%[0-9]+]]:_(s32) = G_ANYEXT [[UV1]](s8) - ; CHECK: [[ANYEXT2:%[0-9]+]]:_(s32) = G_ANYEXT [[UV2]](s8) - ; CHECK: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[ANYEXT]](s32), [[ANYEXT1]](s32), [[ANYEXT2]](s32) - ; CHECK: [[TRUNC1:%[0-9]+]]:_(<3 x s16>) = G_TRUNC [[BUILD_VECTOR]](<3 x s32>) - ; CHECK: [[DEF1:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF - ; CHECK: [[INSERT:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF1]], [[TRUNC1]](<3 x s16>), 0 - ; CHECK: [[EXTRACT1:%[0-9]+]]:_(s16) = G_EXTRACT [[INSERT]](<4 x s16>), 32 - ; 
CHECK: [[ANYEXT3:%[0-9]+]]:_(s32) = G_ANYEXT [[EXTRACT1]](s16) + ; CHECK: [[ANYEXT:%[0-9]+]]:_(s16) = G_ANYEXT [[UV]](s8) + ; CHECK: [[ANYEXT1:%[0-9]+]]:_(s16) = G_ANYEXT [[UV1]](s8) + ; CHECK: [[ANYEXT2:%[0-9]+]]:_(s16) = G_ANYEXT [[UV2]](s8) + ; CHECK: [[DEF1:%[0-9]+]]:_(s16) = G_IMPLICIT_DEF + ; CHECK: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[ANYEXT]](s16), [[ANYEXT1]](s16) + ; CHECK: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[ANYEXT2]](s16), [[DEF1]](s16) + ; CHECK: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; CHECK: [[EXTRACT1:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[CONCAT_VECTORS]](<4 x s16>), 0 + ; CHECK: [[DEF2:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF + ; CHECK: [[INSERT:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF2]], [[EXTRACT1]](<3 x s16>), 0 + ; CHECK: [[EXTRACT2:%[0-9]+]]:_(s16) = G_EXTRACT [[INSERT]](<4 x s16>), 32 + ; CHECK: [[ANYEXT3:%[0-9]+]]:_(s32) = G_ANYEXT [[EXTRACT2]](s16) ; CHECK: $vgpr0 = COPY [[ANYEXT3]](s32) %0:_(<3 x s8>) = G_IMPLICIT_DEF %1:_(s8) = G_EXTRACT %0, 16 @@ -538,18 +541,22 @@ body: | ; CHECK: [[TRUNC:%[0-9]+]]:_(<6 x s1>) = G_TRUNC [[DEF]](<6 x s32>) ; CHECK: [[EXTRACT:%[0-9]+]]:_(<5 x s1>) = G_EXTRACT [[TRUNC]](<6 x s1>), 0 ; CHECK: [[UV:%[0-9]+]]:_(s1), [[UV1:%[0-9]+]]:_(s1), [[UV2:%[0-9]+]]:_(s1), [[UV3:%[0-9]+]]:_(s1), [[UV4:%[0-9]+]]:_(s1) = G_UNMERGE_VALUES [[EXTRACT]](<5 x s1>) - ; CHECK: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[UV]](s1) - ; CHECK: [[ANYEXT1:%[0-9]+]]:_(s32) = G_ANYEXT [[UV1]](s1) - ; CHECK: [[ANYEXT2:%[0-9]+]]:_(s32) = G_ANYEXT [[UV2]](s1) - ; CHECK: [[ANYEXT3:%[0-9]+]]:_(s32) = G_ANYEXT [[UV3]](s1) - ; CHECK: [[ANYEXT4:%[0-9]+]]:_(s32) = G_ANYEXT [[UV4]](s1) - ; CHECK: [[BUILD_VECTOR:%[0-9]+]]:_(<5 x s32>) = G_BUILD_VECTOR [[ANYEXT]](s32), [[ANYEXT1]](s32), [[ANYEXT2]](s32), [[ANYEXT3]](s32), [[ANYEXT4]](s32) - ; CHECK: [[TRUNC1:%[0-9]+]]:_(<5 x s16>) = G_TRUNC [[BUILD_VECTOR]](<5 x s32>) - ; CHECK: 
[[DEF1:%[0-9]+]]:_(<6 x s32>) = G_IMPLICIT_DEF - ; CHECK: [[TRUNC2:%[0-9]+]]:_(<6 x s16>) = G_TRUNC [[DEF1]](<6 x s32>) - ; CHECK: [[INSERT:%[0-9]+]]:_(<6 x s16>) = G_INSERT [[TRUNC2]], [[TRUNC1]](<5 x s16>), 0 - ; CHECK: [[EXTRACT1:%[0-9]+]]:_(s16) = G_EXTRACT [[INSERT]](<6 x s16>), 64 - ; CHECK: [[ANYEXT5:%[0-9]+]]:_(s32) = G_ANYEXT [[EXTRACT1]](s16) + ; CHECK: [[ANYEXT:%[0-9]+]]:_(s16) = G_ANYEXT [[UV]](s1) + ; CHECK: [[ANYEXT1:%[0-9]+]]:_(s16) = G_ANYEXT [[UV1]](s1) + ; CHECK: [[ANYEXT2:%[0-9]+]]:_(s16) = G_ANYEXT [[UV2]](s1) + ; CHECK: [[ANYEXT3:%[0-9]+]]:_(s16) = G_ANYEXT [[UV3]](s1) + ; CHECK: [[ANYEXT4:%[0-9]+]]:_(s16) = G_ANYEXT [[UV4]](s1) + ; CHECK: [[DEF1:%[0-9]+]]:_(s16) = G_IMPLICIT_DEF + ; CHECK: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[ANYEXT]](s16), [[ANYEXT1]](s16) + ; CHECK: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[ANYEXT2]](s16), [[ANYEXT3]](s16) + ; CHECK: [[BUILD_VECTOR2:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[ANYEXT4]](s16), [[DEF1]](s16) + ; CHECK: [[CONCAT_VECTORS:%[0-9]+]]:_(<6 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>), [[BUILD_VECTOR2]](<2 x s16>) + ; CHECK: [[EXTRACT1:%[0-9]+]]:_(<5 x s16>) = G_EXTRACT [[CONCAT_VECTORS]](<6 x s16>), 0 + ; CHECK: [[DEF2:%[0-9]+]]:_(<6 x s32>) = G_IMPLICIT_DEF + ; CHECK: [[TRUNC1:%[0-9]+]]:_(<6 x s16>) = G_TRUNC [[DEF2]](<6 x s32>) + ; CHECK: [[INSERT:%[0-9]+]]:_(<6 x s16>) = G_INSERT [[TRUNC1]], [[EXTRACT1]](<5 x s16>), 0 + ; CHECK: [[EXTRACT2:%[0-9]+]]:_(s16) = G_EXTRACT [[INSERT]](<6 x s16>), 64 + ; CHECK: [[ANYEXT5:%[0-9]+]]:_(s32) = G_ANYEXT [[EXTRACT2]](s16) ; CHECK: $vgpr0 = COPY [[ANYEXT5]](s32) %0:_(<5 x s1>) = G_IMPLICIT_DEF %1:_(s1) = G_EXTRACT %0, 4 Modified: llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-fadd.mir URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-fadd.mir?rev=374252&r1=374251&r2=374252&view=diff 
============================================================================== --- llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-fadd.mir (original) +++ llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-fadd.mir Wed Oct 9 15:44:43 2019 @@ -358,12 +358,12 @@ body: | ; SI: [[FPEXT5:%[0-9]+]]:_(s32) = G_FPEXT [[TRUNC5]](s16) ; SI: [[FADD2:%[0-9]+]]:_(s32) = G_FADD [[FPEXT4]], [[FPEXT5]] ; SI: [[FPTRUNC2:%[0-9]+]]:_(s16) = G_FPTRUNC [[FADD2]](s32) - ; SI: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[FPTRUNC]](s16) - ; SI: [[ANYEXT1:%[0-9]+]]:_(s32) = G_ANYEXT [[FPTRUNC1]](s16) - ; SI: [[ANYEXT2:%[0-9]+]]:_(s32) = G_ANYEXT [[FPTRUNC2]](s16) - ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[ANYEXT]](s32), [[ANYEXT1]](s32), [[ANYEXT2]](s32) - ; SI: [[TRUNC6:%[0-9]+]]:_(<3 x s16>) = G_TRUNC [[BUILD_VECTOR]](<3 x s32>) - ; SI: S_NOP 0, implicit [[TRUNC6]](<3 x s16>) + ; SI: [[DEF4:%[0-9]+]]:_(s16) = G_IMPLICIT_DEF + ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FPTRUNC]](s16), [[FPTRUNC1]](s16) + ; SI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FPTRUNC2]](s16), [[DEF4]](s16) + ; SI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; SI: [[EXTRACT2:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[CONCAT_VECTORS]](<4 x s16>), 0 + ; SI: S_NOP 0, implicit [[EXTRACT2]](<3 x s16>) ; VI-LABEL: name: test_fadd_v3s16 ; VI: [[DEF:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF ; VI: [[EXTRACT:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[DEF]](<4 x s16>), 0 @@ -393,12 +393,12 @@ body: | ; VI: [[FADD:%[0-9]+]]:_(s16) = G_FADD [[TRUNC]], [[TRUNC3]] ; VI: [[FADD1:%[0-9]+]]:_(s16) = G_FADD [[TRUNC1]], [[TRUNC4]] ; VI: [[FADD2:%[0-9]+]]:_(s16) = G_FADD [[TRUNC2]], [[TRUNC5]] - ; VI: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[FADD]](s16) - ; VI: [[ANYEXT1:%[0-9]+]]:_(s32) = G_ANYEXT [[FADD1]](s16) - ; VI: [[ANYEXT2:%[0-9]+]]:_(s32) = G_ANYEXT [[FADD2]](s16) - ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = 
G_BUILD_VECTOR [[ANYEXT]](s32), [[ANYEXT1]](s32), [[ANYEXT2]](s32) - ; VI: [[TRUNC6:%[0-9]+]]:_(<3 x s16>) = G_TRUNC [[BUILD_VECTOR]](<3 x s32>) - ; VI: S_NOP 0, implicit [[TRUNC6]](<3 x s16>) + ; VI: [[DEF4:%[0-9]+]]:_(s16) = G_IMPLICIT_DEF + ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FADD]](s16), [[FADD1]](s16) + ; VI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FADD2]](s16), [[DEF4]](s16) + ; VI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; VI: [[EXTRACT2:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[CONCAT_VECTORS]](<4 x s16>), 0 + ; VI: S_NOP 0, implicit [[EXTRACT2]](<3 x s16>) ; GFX9-LABEL: name: test_fadd_v3s16 ; GFX9: [[DEF:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF ; GFX9: [[EXTRACT:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[DEF]](<4 x s16>), 0 @@ -479,8 +479,10 @@ body: | ; SI: [[FPEXT7:%[0-9]+]]:_(s32) = G_FPEXT [[TRUNC7]](s16) ; SI: [[FADD3:%[0-9]+]]:_(s32) = G_FADD [[FPEXT6]], [[FPEXT7]] ; SI: [[FPTRUNC3:%[0-9]+]]:_(s16) = G_FPTRUNC [[FADD3]](s32) - ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s16>) = G_BUILD_VECTOR [[FPTRUNC]](s16), [[FPTRUNC1]](s16), [[FPTRUNC2]](s16), [[FPTRUNC3]](s16) - ; SI: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<4 x s16>) + ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FPTRUNC]](s16), [[FPTRUNC1]](s16) + ; SI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FPTRUNC2]](s16), [[FPTRUNC3]](s16) + ; SI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; SI: $vgpr0_vgpr1 = COPY [[CONCAT_VECTORS]](<4 x s16>) ; VI-LABEL: name: test_fadd_v4s16 ; VI: [[COPY:%[0-9]+]]:_(<4 x s16>) = COPY $vgpr0_vgpr1 ; VI: [[COPY1:%[0-9]+]]:_(<4 x s16>) = COPY $vgpr2_vgpr3 @@ -507,8 +509,10 @@ body: | ; VI: [[FADD1:%[0-9]+]]:_(s16) = G_FADD [[TRUNC1]], [[TRUNC5]] ; VI: [[FADD2:%[0-9]+]]:_(s16) = G_FADD [[TRUNC2]], [[TRUNC6]] ; VI: [[FADD3:%[0-9]+]]:_(s16) = G_FADD [[TRUNC3]], 
[[TRUNC7]] - ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s16>) = G_BUILD_VECTOR [[FADD]](s16), [[FADD1]](s16), [[FADD2]](s16), [[FADD3]](s16) - ; VI: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<4 x s16>) + ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FADD]](s16), [[FADD1]](s16) + ; VI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FADD2]](s16), [[FADD3]](s16) + ; VI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; VI: $vgpr0_vgpr1 = COPY [[CONCAT_VECTORS]](<4 x s16>) ; GFX9-LABEL: name: test_fadd_v4s16 ; GFX9: [[COPY:%[0-9]+]]:_(<4 x s16>) = COPY $vgpr0_vgpr1 ; GFX9: [[COPY1:%[0-9]+]]:_(<4 x s16>) = COPY $vgpr2_vgpr3 Modified: llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-fcanonicalize.mir URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-fcanonicalize.mir?rev=374252&r1=374251&r2=374252&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-fcanonicalize.mir (original) +++ llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-fcanonicalize.mir Wed Oct 9 15:44:43 2019 @@ -235,12 +235,12 @@ body: | ; SI: [[FPEXT2:%[0-9]+]]:_(s32) = G_FPEXT [[TRUNC2]](s16) ; SI: [[FCANONICALIZE2:%[0-9]+]]:_(s32) = G_FCANONICALIZE [[FPEXT2]] ; SI: [[FPTRUNC2:%[0-9]+]]:_(s16) = G_FPTRUNC [[FCANONICALIZE2]](s32) - ; SI: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[FPTRUNC]](s16) - ; SI: [[ANYEXT1:%[0-9]+]]:_(s32) = G_ANYEXT [[FPTRUNC1]](s16) - ; SI: [[ANYEXT2:%[0-9]+]]:_(s32) = G_ANYEXT [[FPTRUNC2]](s16) - ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[ANYEXT]](s32), [[ANYEXT1]](s32), [[ANYEXT2]](s32) - ; SI: [[TRUNC3:%[0-9]+]]:_(<3 x s16>) = G_TRUNC [[BUILD_VECTOR]](<3 x s32>) - ; SI: S_NOP 0, implicit [[TRUNC3]](<3 x s16>) + ; SI: [[DEF2:%[0-9]+]]:_(s16) = G_IMPLICIT_DEF + ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FPTRUNC]](s16), 
[[FPTRUNC1]](s16) + ; SI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FPTRUNC2]](s16), [[DEF2]](s16) + ; SI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; SI: [[EXTRACT1:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[CONCAT_VECTORS]](<4 x s16>), 0 + ; SI: S_NOP 0, implicit [[EXTRACT1]](<3 x s16>) ; VI-LABEL: name: test_fcanonicalize_v3s16 ; VI: [[DEF:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF ; VI: [[EXTRACT:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[DEF]](<4 x s16>), 0 @@ -258,12 +258,12 @@ body: | ; VI: [[FCANONICALIZE:%[0-9]+]]:_(s16) = G_FCANONICALIZE [[TRUNC]] ; VI: [[FCANONICALIZE1:%[0-9]+]]:_(s16) = G_FCANONICALIZE [[TRUNC1]] ; VI: [[FCANONICALIZE2:%[0-9]+]]:_(s16) = G_FCANONICALIZE [[TRUNC2]] - ; VI: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[FCANONICALIZE]](s16) - ; VI: [[ANYEXT1:%[0-9]+]]:_(s32) = G_ANYEXT [[FCANONICALIZE1]](s16) - ; VI: [[ANYEXT2:%[0-9]+]]:_(s32) = G_ANYEXT [[FCANONICALIZE2]](s16) - ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[ANYEXT]](s32), [[ANYEXT1]](s32), [[ANYEXT2]](s32) - ; VI: [[TRUNC3:%[0-9]+]]:_(<3 x s16>) = G_TRUNC [[BUILD_VECTOR]](<3 x s32>) - ; VI: S_NOP 0, implicit [[TRUNC3]](<3 x s16>) + ; VI: [[DEF2:%[0-9]+]]:_(s16) = G_IMPLICIT_DEF + ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FCANONICALIZE]](s16), [[FCANONICALIZE1]](s16) + ; VI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FCANONICALIZE2]](s16), [[DEF2]](s16) + ; VI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; VI: [[EXTRACT1:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[CONCAT_VECTORS]](<4 x s16>), 0 + ; VI: S_NOP 0, implicit [[EXTRACT1]](<3 x s16>) ; GFX9-LABEL: name: test_fcanonicalize_v3s16 ; GFX9: [[DEF:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF ; GFX9: [[EXTRACT:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[DEF]](<4 x s16>), 0 @@ -321,8 +321,10 @@ body: | ; SI: [[FPEXT3:%[0-9]+]]:_(s32) = G_FPEXT 
[[TRUNC3]](s16) ; SI: [[FCANONICALIZE3:%[0-9]+]]:_(s32) = G_FCANONICALIZE [[FPEXT3]] ; SI: [[FPTRUNC3:%[0-9]+]]:_(s16) = G_FPTRUNC [[FCANONICALIZE3]](s32) - ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s16>) = G_BUILD_VECTOR [[FPTRUNC]](s16), [[FPTRUNC1]](s16), [[FPTRUNC2]](s16), [[FPTRUNC3]](s16) - ; SI: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<4 x s16>) + ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FPTRUNC]](s16), [[FPTRUNC1]](s16) + ; SI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FPTRUNC2]](s16), [[FPTRUNC3]](s16) + ; SI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; SI: $vgpr0_vgpr1 = COPY [[CONCAT_VECTORS]](<4 x s16>) ; VI-LABEL: name: test_fcanonicalize_v4s16 ; VI: [[COPY:%[0-9]+]]:_(<4 x s16>) = COPY $vgpr0_vgpr1 ; VI: [[UV:%[0-9]+]]:_(<2 x s16>), [[UV1:%[0-9]+]]:_(<2 x s16>) = G_UNMERGE_VALUES [[COPY]](<4 x s16>) @@ -339,8 +341,10 @@ body: | ; VI: [[FCANONICALIZE1:%[0-9]+]]:_(s16) = G_FCANONICALIZE [[TRUNC1]] ; VI: [[FCANONICALIZE2:%[0-9]+]]:_(s16) = G_FCANONICALIZE [[TRUNC2]] ; VI: [[FCANONICALIZE3:%[0-9]+]]:_(s16) = G_FCANONICALIZE [[TRUNC3]] - ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s16>) = G_BUILD_VECTOR [[FCANONICALIZE]](s16), [[FCANONICALIZE1]](s16), [[FCANONICALIZE2]](s16), [[FCANONICALIZE3]](s16) - ; VI: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<4 x s16>) + ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FCANONICALIZE]](s16), [[FCANONICALIZE1]](s16) + ; VI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FCANONICALIZE2]](s16), [[FCANONICALIZE3]](s16) + ; VI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; VI: $vgpr0_vgpr1 = COPY [[CONCAT_VECTORS]](<4 x s16>) ; GFX9-LABEL: name: test_fcanonicalize_v4s16 ; GFX9: [[COPY:%[0-9]+]]:_(<4 x s16>) = COPY $vgpr0_vgpr1 ; GFX9: [[UV:%[0-9]+]]:_(<2 x s16>), [[UV1:%[0-9]+]]:_(<2 x s16>) = G_UNMERGE_VALUES [[COPY]](<4 x 
s16>) Modified: llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-fcos.mir URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-fcos.mir?rev=374252&r1=374251&r2=374252&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-fcos.mir (original) +++ llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-fcos.mir Wed Oct 9 15:44:43 2019 @@ -344,12 +344,12 @@ body: | ; SI: [[INT4:%[0-9]+]]:_(s32) = G_INTRINSIC intrinsic(@llvm.amdgcn.fract), [[FMUL2]](s32) ; SI: [[INT5:%[0-9]+]]:_(s32) = G_INTRINSIC intrinsic(@llvm.amdgcn.cos), [[INT4]](s32) ; SI: [[FPTRUNC2:%[0-9]+]]:_(s16) = G_FPTRUNC [[INT5]](s32) - ; SI: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[FPTRUNC]](s16) - ; SI: [[ANYEXT1:%[0-9]+]]:_(s32) = G_ANYEXT [[FPTRUNC1]](s16) - ; SI: [[ANYEXT2:%[0-9]+]]:_(s32) = G_ANYEXT [[FPTRUNC2]](s16) - ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[ANYEXT]](s32), [[ANYEXT1]](s32), [[ANYEXT2]](s32) - ; SI: [[TRUNC3:%[0-9]+]]:_(<3 x s16>) = G_TRUNC [[BUILD_VECTOR]](<3 x s32>) - ; SI: S_NOP 0, implicit [[TRUNC3]](<3 x s16>) + ; SI: [[DEF2:%[0-9]+]]:_(s16) = G_IMPLICIT_DEF + ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FPTRUNC]](s16), [[FPTRUNC1]](s16) + ; SI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FPTRUNC2]](s16), [[DEF2]](s16) + ; SI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; SI: [[EXTRACT1:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[CONCAT_VECTORS]](<4 x s16>), 0 + ; SI: S_NOP 0, implicit [[EXTRACT1]](<3 x s16>) ; VI-LABEL: name: test_fcos_v3s16 ; VI: [[DEF:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF ; VI: [[EXTRACT:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[DEF]](<4 x s16>), 0 @@ -374,12 +374,12 @@ body: | ; VI: [[FMUL2:%[0-9]+]]:_(s16) = G_FMUL [[TRUNC2]], [[C1]] ; VI: [[INT4:%[0-9]+]]:_(s16) = G_INTRINSIC intrinsic(@llvm.amdgcn.fract), 
[[FMUL2]](s16) ; VI: [[INT5:%[0-9]+]]:_(s16) = G_INTRINSIC intrinsic(@llvm.amdgcn.cos), [[INT4]](s16) - ; VI: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[INT1]](s16) - ; VI: [[ANYEXT1:%[0-9]+]]:_(s32) = G_ANYEXT [[INT3]](s16) - ; VI: [[ANYEXT2:%[0-9]+]]:_(s32) = G_ANYEXT [[INT5]](s16) - ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[ANYEXT]](s32), [[ANYEXT1]](s32), [[ANYEXT2]](s32) - ; VI: [[TRUNC3:%[0-9]+]]:_(<3 x s16>) = G_TRUNC [[BUILD_VECTOR]](<3 x s32>) - ; VI: S_NOP 0, implicit [[TRUNC3]](<3 x s16>) + ; VI: [[DEF2:%[0-9]+]]:_(s16) = G_IMPLICIT_DEF + ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[INT1]](s16), [[INT3]](s16) + ; VI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[INT5]](s16), [[DEF2]](s16) + ; VI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; VI: [[EXTRACT1:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[CONCAT_VECTORS]](<4 x s16>), 0 + ; VI: S_NOP 0, implicit [[EXTRACT1]](<3 x s16>) ; GFX9-LABEL: name: test_fcos_v3s16 ; GFX9: [[DEF:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF ; GFX9: [[EXTRACT:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[DEF]](<4 x s16>), 0 @@ -401,12 +401,12 @@ body: | ; GFX9: [[INT1:%[0-9]+]]:_(s16) = G_INTRINSIC intrinsic(@llvm.amdgcn.cos), [[FMUL1]](s16) ; GFX9: [[FMUL2:%[0-9]+]]:_(s16) = G_FMUL [[TRUNC2]], [[C1]] ; GFX9: [[INT2:%[0-9]+]]:_(s16) = G_INTRINSIC intrinsic(@llvm.amdgcn.cos), [[FMUL2]](s16) - ; GFX9: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[INT]](s16) - ; GFX9: [[ANYEXT1:%[0-9]+]]:_(s32) = G_ANYEXT [[INT1]](s16) - ; GFX9: [[ANYEXT2:%[0-9]+]]:_(s32) = G_ANYEXT [[INT2]](s16) - ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[ANYEXT]](s32), [[ANYEXT1]](s32), [[ANYEXT2]](s32) - ; GFX9: [[TRUNC3:%[0-9]+]]:_(<3 x s16>) = G_TRUNC [[BUILD_VECTOR]](<3 x s32>) - ; GFX9: S_NOP 0, implicit [[TRUNC3]](<3 x s16>) + ; GFX9: [[DEF2:%[0-9]+]]:_(s16) = G_IMPLICIT_DEF + ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = 
G_BUILD_VECTOR [[INT]](s16), [[INT1]](s16) + ; GFX9: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[INT2]](s16), [[DEF2]](s16) + ; GFX9: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; GFX9: [[EXTRACT1:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[CONCAT_VECTORS]](<4 x s16>), 0 + ; GFX9: S_NOP 0, implicit [[EXTRACT1]](<3 x s16>) %0:_(<3 x s16>) = G_IMPLICIT_DEF %1:_(<3 x s16>) = G_FCOS %0 S_NOP 0, implicit %1 @@ -451,8 +451,10 @@ body: | ; SI: [[INT6:%[0-9]+]]:_(s32) = G_INTRINSIC intrinsic(@llvm.amdgcn.fract), [[FMUL3]](s32) ; SI: [[INT7:%[0-9]+]]:_(s32) = G_INTRINSIC intrinsic(@llvm.amdgcn.cos), [[INT6]](s32) ; SI: [[FPTRUNC3:%[0-9]+]]:_(s16) = G_FPTRUNC [[INT7]](s32) - ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s16>) = G_BUILD_VECTOR [[FPTRUNC]](s16), [[FPTRUNC1]](s16), [[FPTRUNC2]](s16), [[FPTRUNC3]](s16) - ; SI: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<4 x s16>) + ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FPTRUNC]](s16), [[FPTRUNC1]](s16) + ; SI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FPTRUNC2]](s16), [[FPTRUNC3]](s16) + ; SI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; SI: $vgpr0_vgpr1 = COPY [[CONCAT_VECTORS]](<4 x s16>) ; VI-LABEL: name: test_fcos_v4s16 ; VI: [[COPY:%[0-9]+]]:_(<4 x s16>) = COPY $vgpr0_vgpr1 ; VI: [[UV:%[0-9]+]]:_(<2 x s16>), [[UV1:%[0-9]+]]:_(<2 x s16>) = G_UNMERGE_VALUES [[COPY]](<4 x s16>) @@ -478,8 +480,10 @@ body: | ; VI: [[FMUL3:%[0-9]+]]:_(s16) = G_FMUL [[TRUNC3]], [[C1]] ; VI: [[INT6:%[0-9]+]]:_(s16) = G_INTRINSIC intrinsic(@llvm.amdgcn.fract), [[FMUL3]](s16) ; VI: [[INT7:%[0-9]+]]:_(s16) = G_INTRINSIC intrinsic(@llvm.amdgcn.cos), [[INT6]](s16) - ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s16>) = G_BUILD_VECTOR [[INT1]](s16), [[INT3]](s16), [[INT5]](s16), [[INT7]](s16) - ; VI: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<4 x s16>) + ; VI: 
[[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[INT1]](s16), [[INT3]](s16) + ; VI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[INT5]](s16), [[INT7]](s16) + ; VI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; VI: $vgpr0_vgpr1 = COPY [[CONCAT_VECTORS]](<4 x s16>) ; GFX9-LABEL: name: test_fcos_v4s16 ; GFX9: [[COPY:%[0-9]+]]:_(<4 x s16>) = COPY $vgpr0_vgpr1 ; GFX9: [[UV:%[0-9]+]]:_(<2 x s16>), [[UV1:%[0-9]+]]:_(<2 x s16>) = G_UNMERGE_VALUES [[COPY]](<4 x s16>) @@ -501,8 +505,10 @@ body: | ; GFX9: [[INT2:%[0-9]+]]:_(s16) = G_INTRINSIC intrinsic(@llvm.amdgcn.cos), [[FMUL2]](s16) ; GFX9: [[FMUL3:%[0-9]+]]:_(s16) = G_FMUL [[TRUNC3]], [[C1]] ; GFX9: [[INT3:%[0-9]+]]:_(s16) = G_INTRINSIC intrinsic(@llvm.amdgcn.cos), [[FMUL3]](s16) - ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s16>) = G_BUILD_VECTOR [[INT]](s16), [[INT1]](s16), [[INT2]](s16), [[INT3]](s16) - ; GFX9: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<4 x s16>) + ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[INT]](s16), [[INT1]](s16) + ; GFX9: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[INT2]](s16), [[INT3]](s16) + ; GFX9: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; GFX9: $vgpr0_vgpr1 = COPY [[CONCAT_VECTORS]](<4 x s16>) %0:_(<4 x s16>) = COPY $vgpr0_vgpr1 %1:_(<4 x s16>) = G_FCOS %0 $vgpr0_vgpr1 = COPY %1 Modified: llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-ffloor.mir URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-ffloor.mir?rev=374252&r1=374251&r2=374252&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-ffloor.mir (original) +++ llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-ffloor.mir Wed Oct 9 15:44:43 2019 @@ -257,12 +257,12 @@ body: | ; SI: [[FPEXT2:%[0-9]+]]:_(s32) = 
G_FPEXT [[TRUNC2]](s16) ; SI: [[FFLOOR2:%[0-9]+]]:_(s32) = G_FFLOOR [[FPEXT2]] ; SI: [[FPTRUNC2:%[0-9]+]]:_(s16) = G_FPTRUNC [[FFLOOR2]](s32) - ; SI: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[FPTRUNC]](s16) - ; SI: [[ANYEXT1:%[0-9]+]]:_(s32) = G_ANYEXT [[FPTRUNC1]](s16) - ; SI: [[ANYEXT2:%[0-9]+]]:_(s32) = G_ANYEXT [[FPTRUNC2]](s16) - ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[ANYEXT]](s32), [[ANYEXT1]](s32), [[ANYEXT2]](s32) - ; SI: [[TRUNC3:%[0-9]+]]:_(<3 x s16>) = G_TRUNC [[BUILD_VECTOR]](<3 x s32>) - ; SI: S_NOP 0, implicit [[TRUNC3]](<3 x s16>) + ; SI: [[DEF2:%[0-9]+]]:_(s16) = G_IMPLICIT_DEF + ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FPTRUNC]](s16), [[FPTRUNC1]](s16) + ; SI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FPTRUNC2]](s16), [[DEF2]](s16) + ; SI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; SI: [[EXTRACT1:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[CONCAT_VECTORS]](<4 x s16>), 0 + ; SI: S_NOP 0, implicit [[EXTRACT1]](<3 x s16>) ; VI-LABEL: name: test_ffloor_v3s16 ; VI: [[DEF:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF ; VI: [[EXTRACT:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[DEF]](<4 x s16>), 0 @@ -280,12 +280,12 @@ body: | ; VI: [[FFLOOR:%[0-9]+]]:_(s16) = G_FFLOOR [[TRUNC]] ; VI: [[FFLOOR1:%[0-9]+]]:_(s16) = G_FFLOOR [[TRUNC1]] ; VI: [[FFLOOR2:%[0-9]+]]:_(s16) = G_FFLOOR [[TRUNC2]] - ; VI: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[FFLOOR]](s16) - ; VI: [[ANYEXT1:%[0-9]+]]:_(s32) = G_ANYEXT [[FFLOOR1]](s16) - ; VI: [[ANYEXT2:%[0-9]+]]:_(s32) = G_ANYEXT [[FFLOOR2]](s16) - ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[ANYEXT]](s32), [[ANYEXT1]](s32), [[ANYEXT2]](s32) - ; VI: [[TRUNC3:%[0-9]+]]:_(<3 x s16>) = G_TRUNC [[BUILD_VECTOR]](<3 x s32>) - ; VI: S_NOP 0, implicit [[TRUNC3]](<3 x s16>) + ; VI: [[DEF2:%[0-9]+]]:_(s16) = G_IMPLICIT_DEF + ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FFLOOR]](s16), 
[[FFLOOR1]](s16) + ; VI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FFLOOR2]](s16), [[DEF2]](s16) + ; VI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; VI: [[EXTRACT1:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[CONCAT_VECTORS]](<4 x s16>), 0 + ; VI: S_NOP 0, implicit [[EXTRACT1]](<3 x s16>) ; GFX9-LABEL: name: test_ffloor_v3s16 ; GFX9: [[DEF:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF ; GFX9: [[EXTRACT:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[DEF]](<4 x s16>), 0 @@ -303,12 +303,12 @@ body: | ; GFX9: [[FFLOOR:%[0-9]+]]:_(s16) = G_FFLOOR [[TRUNC]] ; GFX9: [[FFLOOR1:%[0-9]+]]:_(s16) = G_FFLOOR [[TRUNC1]] ; GFX9: [[FFLOOR2:%[0-9]+]]:_(s16) = G_FFLOOR [[TRUNC2]] - ; GFX9: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[FFLOOR]](s16) - ; GFX9: [[ANYEXT1:%[0-9]+]]:_(s32) = G_ANYEXT [[FFLOOR1]](s16) - ; GFX9: [[ANYEXT2:%[0-9]+]]:_(s32) = G_ANYEXT [[FFLOOR2]](s16) - ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[ANYEXT]](s32), [[ANYEXT1]](s32), [[ANYEXT2]](s32) - ; GFX9: [[TRUNC3:%[0-9]+]]:_(<3 x s16>) = G_TRUNC [[BUILD_VECTOR]](<3 x s32>) - ; GFX9: S_NOP 0, implicit [[TRUNC3]](<3 x s16>) + ; GFX9: [[DEF2:%[0-9]+]]:_(s16) = G_IMPLICIT_DEF + ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FFLOOR]](s16), [[FFLOOR1]](s16) + ; GFX9: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FFLOOR2]](s16), [[DEF2]](s16) + ; GFX9: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; GFX9: [[EXTRACT1:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[CONCAT_VECTORS]](<4 x s16>), 0 + ; GFX9: S_NOP 0, implicit [[EXTRACT1]](<3 x s16>) %0:_(<3 x s16>) = G_IMPLICIT_DEF %1:_(<3 x s16>) = G_FFLOOR %0 S_NOP 0, implicit %1 @@ -344,8 +344,10 @@ body: | ; SI: [[FPEXT3:%[0-9]+]]:_(s32) = G_FPEXT [[TRUNC3]](s16) ; SI: [[FFLOOR3:%[0-9]+]]:_(s32) = G_FFLOOR [[FPEXT3]] ; SI: [[FPTRUNC3:%[0-9]+]]:_(s16) = G_FPTRUNC [[FFLOOR3]](s32) - ; SI: 
[[BUILD_VECTOR:%[0-9]+]]:_(<4 x s16>) = G_BUILD_VECTOR [[FPTRUNC]](s16), [[FPTRUNC1]](s16), [[FPTRUNC2]](s16), [[FPTRUNC3]](s16) - ; SI: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<4 x s16>) + ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FPTRUNC]](s16), [[FPTRUNC1]](s16) + ; SI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FPTRUNC2]](s16), [[FPTRUNC3]](s16) + ; SI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; SI: $vgpr0_vgpr1 = COPY [[CONCAT_VECTORS]](<4 x s16>) ; VI-LABEL: name: test_ffloor_v4s16 ; VI: [[COPY:%[0-9]+]]:_(<4 x s16>) = COPY $vgpr0_vgpr1 ; VI: [[UV:%[0-9]+]]:_(<2 x s16>), [[UV1:%[0-9]+]]:_(<2 x s16>) = G_UNMERGE_VALUES [[COPY]](<4 x s16>) @@ -362,8 +364,10 @@ body: | ; VI: [[FFLOOR1:%[0-9]+]]:_(s16) = G_FFLOOR [[TRUNC1]] ; VI: [[FFLOOR2:%[0-9]+]]:_(s16) = G_FFLOOR [[TRUNC2]] ; VI: [[FFLOOR3:%[0-9]+]]:_(s16) = G_FFLOOR [[TRUNC3]] - ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s16>) = G_BUILD_VECTOR [[FFLOOR]](s16), [[FFLOOR1]](s16), [[FFLOOR2]](s16), [[FFLOOR3]](s16) - ; VI: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<4 x s16>) + ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FFLOOR]](s16), [[FFLOOR1]](s16) + ; VI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FFLOOR2]](s16), [[FFLOOR3]](s16) + ; VI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; VI: $vgpr0_vgpr1 = COPY [[CONCAT_VECTORS]](<4 x s16>) ; GFX9-LABEL: name: test_ffloor_v4s16 ; GFX9: [[COPY:%[0-9]+]]:_(<4 x s16>) = COPY $vgpr0_vgpr1 ; GFX9: [[UV:%[0-9]+]]:_(<2 x s16>), [[UV1:%[0-9]+]]:_(<2 x s16>) = G_UNMERGE_VALUES [[COPY]](<4 x s16>) @@ -380,8 +384,10 @@ body: | ; GFX9: [[FFLOOR1:%[0-9]+]]:_(s16) = G_FFLOOR [[TRUNC1]] ; GFX9: [[FFLOOR2:%[0-9]+]]:_(s16) = G_FFLOOR [[TRUNC2]] ; GFX9: [[FFLOOR3:%[0-9]+]]:_(s16) = G_FFLOOR [[TRUNC3]] - ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s16>) = G_BUILD_VECTOR 
[[FFLOOR]](s16), [[FFLOOR1]](s16), [[FFLOOR2]](s16), [[FFLOOR3]](s16) - ; GFX9: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<4 x s16>) + ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FFLOOR]](s16), [[FFLOOR1]](s16) + ; GFX9: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FFLOOR2]](s16), [[FFLOOR3]](s16) + ; GFX9: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; GFX9: $vgpr0_vgpr1 = COPY [[CONCAT_VECTORS]](<4 x s16>) %0:_(<4 x s16>) = COPY $vgpr0_vgpr1 %1:_(<4 x s16>) = G_FFLOOR %0 $vgpr0_vgpr1 = COPY %1 Modified: llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-fma.mir URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-fma.mir?rev=374252&r1=374251&r2=374252&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-fma.mir (original) +++ llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-fma.mir Wed Oct 9 15:44:43 2019 @@ -436,12 +436,12 @@ body: | ; SI: [[FPEXT8:%[0-9]+]]:_(s32) = G_FPEXT [[TRUNC8]](s16) ; SI: [[FMA2:%[0-9]+]]:_(s32) = G_FMA [[FPEXT6]], [[FPEXT7]], [[FPEXT8]] ; SI: [[FPTRUNC2:%[0-9]+]]:_(s16) = G_FPTRUNC [[FMA2]](s32) - ; SI: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[FPTRUNC]](s16) - ; SI: [[ANYEXT1:%[0-9]+]]:_(s32) = G_ANYEXT [[FPTRUNC1]](s16) - ; SI: [[ANYEXT2:%[0-9]+]]:_(s32) = G_ANYEXT [[FPTRUNC2]](s16) - ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[ANYEXT]](s32), [[ANYEXT1]](s32), [[ANYEXT2]](s32) - ; SI: [[TRUNC9:%[0-9]+]]:_(<3 x s16>) = G_TRUNC [[BUILD_VECTOR]](<3 x s32>) - ; SI: S_NOP 0, implicit [[TRUNC9]](<3 x s16>) + ; SI: [[DEF6:%[0-9]+]]:_(s16) = G_IMPLICIT_DEF + ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FPTRUNC]](s16), [[FPTRUNC1]](s16) + ; SI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FPTRUNC2]](s16), [[DEF6]](s16) + ; SI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) 
= G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; SI: [[EXTRACT3:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[CONCAT_VECTORS]](<4 x s16>), 0 + ; SI: S_NOP 0, implicit [[EXTRACT3]](<3 x s16>) ; VI-LABEL: name: test_fma_v3s16 ; VI: [[DEF:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF ; VI: [[EXTRACT:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[DEF]](<4 x s16>), 0 @@ -483,12 +483,12 @@ body: | ; VI: [[FMA:%[0-9]+]]:_(s16) = G_FMA [[TRUNC]], [[TRUNC3]], [[TRUNC6]] ; VI: [[FMA1:%[0-9]+]]:_(s16) = G_FMA [[TRUNC1]], [[TRUNC4]], [[TRUNC7]] ; VI: [[FMA2:%[0-9]+]]:_(s16) = G_FMA [[TRUNC2]], [[TRUNC5]], [[TRUNC8]] - ; VI: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[FMA]](s16) - ; VI: [[ANYEXT1:%[0-9]+]]:_(s32) = G_ANYEXT [[FMA1]](s16) - ; VI: [[ANYEXT2:%[0-9]+]]:_(s32) = G_ANYEXT [[FMA2]](s16) - ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[ANYEXT]](s32), [[ANYEXT1]](s32), [[ANYEXT2]](s32) - ; VI: [[TRUNC9:%[0-9]+]]:_(<3 x s16>) = G_TRUNC [[BUILD_VECTOR]](<3 x s32>) - ; VI: S_NOP 0, implicit [[TRUNC9]](<3 x s16>) + ; VI: [[DEF6:%[0-9]+]]:_(s16) = G_IMPLICIT_DEF + ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FMA]](s16), [[FMA1]](s16) + ; VI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FMA2]](s16), [[DEF6]](s16) + ; VI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; VI: [[EXTRACT3:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[CONCAT_VECTORS]](<4 x s16>), 0 + ; VI: S_NOP 0, implicit [[EXTRACT3]](<3 x s16>) ; GFX9-LABEL: name: test_fma_v3s16 ; GFX9: [[DEF:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF ; GFX9: [[EXTRACT:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[DEF]](<4 x s16>), 0 @@ -592,8 +592,10 @@ body: | ; SI: [[FPEXT11:%[0-9]+]]:_(s32) = G_FPEXT [[TRUNC11]](s16) ; SI: [[FMA3:%[0-9]+]]:_(s32) = G_FMA [[FPEXT9]], [[FPEXT10]], [[FPEXT11]] ; SI: [[FPTRUNC3:%[0-9]+]]:_(s16) = G_FPTRUNC [[FMA3]](s32) - ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s16>) = G_BUILD_VECTOR 
[[FPTRUNC]](s16), [[FPTRUNC1]](s16), [[FPTRUNC2]](s16), [[FPTRUNC3]](s16) - ; SI: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<4 x s16>) + ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FPTRUNC]](s16), [[FPTRUNC1]](s16) + ; SI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FPTRUNC2]](s16), [[FPTRUNC3]](s16) + ; SI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; SI: $vgpr0_vgpr1 = COPY [[CONCAT_VECTORS]](<4 x s16>) ; VI-LABEL: name: test_fma_v4s16 ; VI: [[COPY:%[0-9]+]]:_(<4 x s16>) = COPY $vgpr0_vgpr1 ; VI: [[COPY1:%[0-9]+]]:_(<4 x s16>) = COPY $vgpr2_vgpr3 @@ -630,8 +632,10 @@ body: | ; VI: [[FMA1:%[0-9]+]]:_(s16) = G_FMA [[TRUNC1]], [[TRUNC5]], [[TRUNC9]] ; VI: [[FMA2:%[0-9]+]]:_(s16) = G_FMA [[TRUNC2]], [[TRUNC6]], [[TRUNC10]] ; VI: [[FMA3:%[0-9]+]]:_(s16) = G_FMA [[TRUNC3]], [[TRUNC7]], [[TRUNC11]] - ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s16>) = G_BUILD_VECTOR [[FMA]](s16), [[FMA1]](s16), [[FMA2]](s16), [[FMA3]](s16) - ; VI: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<4 x s16>) + ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FMA]](s16), [[FMA1]](s16) + ; VI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FMA2]](s16), [[FMA3]](s16) + ; VI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; VI: $vgpr0_vgpr1 = COPY [[CONCAT_VECTORS]](<4 x s16>) ; GFX9-LABEL: name: test_fma_v4s16 ; GFX9: [[COPY:%[0-9]+]]:_(<4 x s16>) = COPY $vgpr0_vgpr1 ; GFX9: [[COPY1:%[0-9]+]]:_(<4 x s16>) = COPY $vgpr2_vgpr3 Modified: llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-fmad.s16.mir URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-fmad.s16.mir?rev=374252&r1=374251&r2=374252&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-fmad.s16.mir (original) +++ 
llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-fmad.s16.mir Wed Oct 9 15:44:43 2019 @@ -312,8 +312,10 @@ body: | ; SI-F16DENORM: [[FPEXT15:%[0-9]+]]:_(s32) = G_FPEXT [[TRUNC11]](s16) ; SI-F16DENORM: [[FADD3:%[0-9]+]]:_(s32) = G_FADD [[FPEXT14]], [[FPEXT15]] ; SI-F16DENORM: [[FPTRUNC7:%[0-9]+]]:_(s16) = G_FPTRUNC [[FADD3]](s32) - ; SI-F16DENORM: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s16>) = G_BUILD_VECTOR [[FPTRUNC1]](s16), [[FPTRUNC3]](s16), [[FPTRUNC5]](s16), [[FPTRUNC7]](s16) - ; SI-F16DENORM: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<4 x s16>) + ; SI-F16DENORM: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FPTRUNC1]](s16), [[FPTRUNC3]](s16) + ; SI-F16DENORM: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FPTRUNC5]](s16), [[FPTRUNC7]](s16) + ; SI-F16DENORM: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; SI-F16DENORM: $vgpr0_vgpr1 = COPY [[CONCAT_VECTORS]](<4 x s16>) ; SI-F16FLUSH-LABEL: name: test_fmad_v4s16 ; SI-F16FLUSH: [[COPY:%[0-9]+]]:_(<4 x s16>) = COPY $vgpr0_vgpr1 ; SI-F16FLUSH: [[COPY1:%[0-9]+]]:_(<4 x s16>) = COPY $vgpr2_vgpr3 @@ -378,8 +380,10 @@ body: | ; SI-F16FLUSH: [[FPEXT15:%[0-9]+]]:_(s32) = G_FPEXT [[TRUNC11]](s16) ; SI-F16FLUSH: [[FADD3:%[0-9]+]]:_(s32) = G_FADD [[FPEXT14]], [[FPEXT15]] ; SI-F16FLUSH: [[FPTRUNC7:%[0-9]+]]:_(s16) = G_FPTRUNC [[FADD3]](s32) - ; SI-F16FLUSH: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s16>) = G_BUILD_VECTOR [[FPTRUNC1]](s16), [[FPTRUNC3]](s16), [[FPTRUNC5]](s16), [[FPTRUNC7]](s16) - ; SI-F16FLUSH: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<4 x s16>) + ; SI-F16FLUSH: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FPTRUNC1]](s16), [[FPTRUNC3]](s16) + ; SI-F16FLUSH: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FPTRUNC5]](s16), [[FPTRUNC7]](s16) + ; SI-F16FLUSH: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; SI-F16FLUSH: $vgpr0_vgpr1 = COPY 
[[CONCAT_VECTORS]](<4 x s16>) ; VI-F16DENORM-LABEL: name: test_fmad_v4s16 ; VI-F16DENORM: [[COPY:%[0-9]+]]:_(<4 x s16>) = COPY $vgpr0_vgpr1 ; VI-F16DENORM: [[COPY1:%[0-9]+]]:_(<4 x s16>) = COPY $vgpr2_vgpr3 @@ -412,8 +416,10 @@ body: | ; VI-F16DENORM: [[TRUNC10:%[0-9]+]]:_(s16) = G_TRUNC [[BITCAST5]](s32) ; VI-F16DENORM: [[LSHR5:%[0-9]+]]:_(s32) = G_LSHR [[BITCAST5]], [[C]](s32) ; VI-F16DENORM: [[TRUNC11:%[0-9]+]]:_(s16) = G_TRUNC [[LSHR5]](s32) - ; VI-F16DENORM: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s16>) = G_BUILD_VECTOR %16(s16), %17(s16), %18(s16), %19(s16) - ; VI-F16DENORM: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<4 x s16>) + ; VI-F16DENORM: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR %16(s16), %17(s16) + ; VI-F16DENORM: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR %18(s16), %19(s16) + ; VI-F16DENORM: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; VI-F16DENORM: $vgpr0_vgpr1 = COPY [[CONCAT_VECTORS]](<4 x s16>) ; VI-F16DENORM: [[FMUL:%[0-9]+]]:_(s16) = G_FMUL [[TRUNC3]], [[TRUNC7]] ; VI-F16DENORM: [[FADD:%[0-9]+]]:_(s16) = G_FADD [[FMUL]], [[TRUNC11]] ; VI-F16DENORM: [[FMUL1:%[0-9]+]]:_(s16) = G_FMUL [[TRUNC2]], [[TRUNC6]] @@ -458,8 +464,10 @@ body: | ; VI-F16FLUSH: [[FMAD1:%[0-9]+]]:_(s16) = G_FMAD [[TRUNC1]], [[TRUNC5]], [[TRUNC9]] ; VI-F16FLUSH: [[FMAD2:%[0-9]+]]:_(s16) = G_FMAD [[TRUNC2]], [[TRUNC6]], [[TRUNC10]] ; VI-F16FLUSH: [[FMAD3:%[0-9]+]]:_(s16) = G_FMAD [[TRUNC3]], [[TRUNC7]], [[TRUNC11]] - ; VI-F16FLUSH: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s16>) = G_BUILD_VECTOR [[FMAD]](s16), [[FMAD1]](s16), [[FMAD2]](s16), [[FMAD3]](s16) - ; VI-F16FLUSH: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<4 x s16>) + ; VI-F16FLUSH: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FMAD]](s16), [[FMAD1]](s16) + ; VI-F16FLUSH: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FMAD2]](s16), [[FMAD3]](s16) + ; VI-F16FLUSH: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS 
[[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; VI-F16FLUSH: $vgpr0_vgpr1 = COPY [[CONCAT_VECTORS]](<4 x s16>) ; GFX10-LABEL: name: test_fmad_v4s16 ; GFX10: [[COPY:%[0-9]+]]:_(<4 x s16>) = COPY $vgpr0_vgpr1 ; GFX10: [[COPY1:%[0-9]+]]:_(<4 x s16>) = COPY $vgpr2_vgpr3 @@ -500,8 +508,10 @@ body: | ; GFX10: [[FADD2:%[0-9]+]]:_(s16) = G_FADD [[FMUL2]], [[TRUNC10]] ; GFX10: [[FMUL3:%[0-9]+]]:_(s16) = G_FMUL [[TRUNC3]], [[TRUNC7]] ; GFX10: [[FADD3:%[0-9]+]]:_(s16) = G_FADD [[FMUL3]], [[TRUNC11]] - ; GFX10: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s16>) = G_BUILD_VECTOR [[FADD]](s16), [[FADD1]](s16), [[FADD2]](s16), [[FADD3]](s16) - ; GFX10: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<4 x s16>) + ; GFX10: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FADD]](s16), [[FADD1]](s16) + ; GFX10: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FADD2]](s16), [[FADD3]](s16) + ; GFX10: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; GFX10: $vgpr0_vgpr1 = COPY [[CONCAT_VECTORS]](<4 x s16>) %0:_(<4 x s16>) = COPY $vgpr0_vgpr1 %1:_(<4 x s16>) = COPY $vgpr2_vgpr3 %2:_(<4 x s16>) = COPY $vgpr4_vgpr5 Modified: llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-fmaxnum.mir URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-fmaxnum.mir?rev=374252&r1=374251&r2=374252&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-fmaxnum.mir (original) +++ llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-fmaxnum.mir Wed Oct 9 15:44:43 2019 @@ -420,13 +420,13 @@ body: | ; SI: [[FPEXT5:%[0-9]+]]:_(s32) = G_FPEXT [[TRUNC5]](s16) ; SI: [[FMINNUM_IEEE2:%[0-9]+]]:_(s32) = G_FMINNUM_IEEE [[FPEXT4]], [[FPEXT5]] ; SI: [[FPTRUNC2:%[0-9]+]]:_(s16) = G_FPTRUNC [[FMINNUM_IEEE2]](s32) - ; SI: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[FPTRUNC]](s16) - ; SI: [[ANYEXT1:%[0-9]+]]:_(s32) = G_ANYEXT 
[[FPTRUNC1]](s16) - ; SI: [[ANYEXT2:%[0-9]+]]:_(s32) = G_ANYEXT [[FPTRUNC2]](s16) - ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[ANYEXT]](s32), [[ANYEXT1]](s32), [[ANYEXT2]](s32) - ; SI: [[TRUNC6:%[0-9]+]]:_(<3 x s16>) = G_TRUNC [[BUILD_VECTOR]](<3 x s32>) - ; SI: [[DEF2:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF - ; SI: [[INSERT2:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF2]], [[TRUNC6]](<3 x s16>), 0 + ; SI: [[DEF2:%[0-9]+]]:_(s16) = G_IMPLICIT_DEF + ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FPTRUNC]](s16), [[FPTRUNC1]](s16) + ; SI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FPTRUNC2]](s16), [[DEF2]](s16) + ; SI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; SI: [[EXTRACT2:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[CONCAT_VECTORS]](<4 x s16>), 0 + ; SI: [[DEF3:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF + ; SI: [[INSERT2:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF3]], [[EXTRACT2]](<3 x s16>), 0 ; SI: $vgpr0_vgpr1 = COPY [[INSERT2]](<4 x s16>) ; VI-LABEL: name: test_fminnum_v3s16 ; VI: [[COPY:%[0-9]+]]:_(<4 x s16>) = COPY $vgpr0_vgpr1 @@ -463,13 +463,13 @@ body: | ; VI: [[FCANONICALIZE4:%[0-9]+]]:_(s16) = G_FCANONICALIZE [[TRUNC2]] ; VI: [[FCANONICALIZE5:%[0-9]+]]:_(s16) = G_FCANONICALIZE [[TRUNC5]] ; VI: [[FMINNUM_IEEE2:%[0-9]+]]:_(s16) = G_FMINNUM_IEEE [[FCANONICALIZE4]], [[FCANONICALIZE5]] - ; VI: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[FMINNUM_IEEE]](s16) - ; VI: [[ANYEXT1:%[0-9]+]]:_(s32) = G_ANYEXT [[FMINNUM_IEEE1]](s16) - ; VI: [[ANYEXT2:%[0-9]+]]:_(s32) = G_ANYEXT [[FMINNUM_IEEE2]](s16) - ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[ANYEXT]](s32), [[ANYEXT1]](s32), [[ANYEXT2]](s32) - ; VI: [[TRUNC6:%[0-9]+]]:_(<3 x s16>) = G_TRUNC [[BUILD_VECTOR]](<3 x s32>) - ; VI: [[DEF2:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF - ; VI: [[INSERT2:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF2]], [[TRUNC6]](<3 x s16>), 0 + ; VI: [[DEF2:%[0-9]+]]:_(s16) = 
G_IMPLICIT_DEF + ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FMINNUM_IEEE]](s16), [[FMINNUM_IEEE1]](s16) + ; VI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FMINNUM_IEEE2]](s16), [[DEF2]](s16) + ; VI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; VI: [[EXTRACT2:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[CONCAT_VECTORS]](<4 x s16>), 0 + ; VI: [[DEF3:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF + ; VI: [[INSERT2:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF3]], [[EXTRACT2]](<3 x s16>), 0 ; VI: $vgpr0_vgpr1 = COPY [[INSERT2]](<4 x s16>) ; GFX9-LABEL: name: test_fminnum_v3s16 ; GFX9: [[COPY:%[0-9]+]]:_(<4 x s16>) = COPY $vgpr0_vgpr1 @@ -561,8 +561,10 @@ body: | ; SI: [[FPEXT7:%[0-9]+]]:_(s32) = G_FPEXT [[TRUNC7]](s16) ; SI: [[FMINNUM_IEEE3:%[0-9]+]]:_(s32) = G_FMINNUM_IEEE [[FPEXT6]], [[FPEXT7]] ; SI: [[FPTRUNC3:%[0-9]+]]:_(s16) = G_FPTRUNC [[FMINNUM_IEEE3]](s32) - ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s16>) = G_BUILD_VECTOR [[FPTRUNC]](s16), [[FPTRUNC1]](s16), [[FPTRUNC2]](s16), [[FPTRUNC3]](s16) - ; SI: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<4 x s16>) + ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FPTRUNC]](s16), [[FPTRUNC1]](s16) + ; SI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FPTRUNC2]](s16), [[FPTRUNC3]](s16) + ; SI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; SI: $vgpr0_vgpr1 = COPY [[CONCAT_VECTORS]](<4 x s16>) ; VI-LABEL: name: test_fminnum_v4s16 ; VI: [[COPY:%[0-9]+]]:_(<4 x s16>) = COPY $vgpr0_vgpr1 ; VI: [[COPY1:%[0-9]+]]:_(<4 x s16>) = COPY $vgpr2_vgpr3 @@ -597,8 +599,10 @@ body: | ; VI: [[FCANONICALIZE6:%[0-9]+]]:_(s16) = G_FCANONICALIZE [[TRUNC3]] ; VI: [[FCANONICALIZE7:%[0-9]+]]:_(s16) = G_FCANONICALIZE [[TRUNC7]] ; VI: [[FMINNUM_IEEE3:%[0-9]+]]:_(s16) = G_FMINNUM_IEEE [[FCANONICALIZE6]], [[FCANONICALIZE7]] - ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s16>) = 
G_BUILD_VECTOR [[FMINNUM_IEEE]](s16), [[FMINNUM_IEEE1]](s16), [[FMINNUM_IEEE2]](s16), [[FMINNUM_IEEE3]](s16) - ; VI: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<4 x s16>) + ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FMINNUM_IEEE]](s16), [[FMINNUM_IEEE1]](s16) + ; VI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FMINNUM_IEEE2]](s16), [[FMINNUM_IEEE3]](s16) + ; VI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; VI: $vgpr0_vgpr1 = COPY [[CONCAT_VECTORS]](<4 x s16>) ; GFX9-LABEL: name: test_fminnum_v4s16 ; GFX9: [[COPY:%[0-9]+]]:_(<4 x s16>) = COPY $vgpr0_vgpr1 ; GFX9: [[COPY1:%[0-9]+]]:_(<4 x s16>) = COPY $vgpr2_vgpr3 Modified: llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-fminnum.mir URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-fminnum.mir?rev=374252&r1=374251&r2=374252&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-fminnum.mir (original) +++ llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-fminnum.mir Wed Oct 9 15:44:43 2019 @@ -420,13 +420,13 @@ body: | ; SI: [[FPEXT5:%[0-9]+]]:_(s32) = G_FPEXT [[TRUNC5]](s16) ; SI: [[FMINNUM_IEEE2:%[0-9]+]]:_(s32) = G_FMINNUM_IEEE [[FPEXT4]], [[FPEXT5]] ; SI: [[FPTRUNC2:%[0-9]+]]:_(s16) = G_FPTRUNC [[FMINNUM_IEEE2]](s32) - ; SI: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[FPTRUNC]](s16) - ; SI: [[ANYEXT1:%[0-9]+]]:_(s32) = G_ANYEXT [[FPTRUNC1]](s16) - ; SI: [[ANYEXT2:%[0-9]+]]:_(s32) = G_ANYEXT [[FPTRUNC2]](s16) - ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[ANYEXT]](s32), [[ANYEXT1]](s32), [[ANYEXT2]](s32) - ; SI: [[TRUNC6:%[0-9]+]]:_(<3 x s16>) = G_TRUNC [[BUILD_VECTOR]](<3 x s32>) - ; SI: [[DEF2:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF - ; SI: [[INSERT2:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF2]], [[TRUNC6]](<3 x s16>), 0 + ; SI: [[DEF2:%[0-9]+]]:_(s16) = G_IMPLICIT_DEF 
+ ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FPTRUNC]](s16), [[FPTRUNC1]](s16) + ; SI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FPTRUNC2]](s16), [[DEF2]](s16) + ; SI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; SI: [[EXTRACT2:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[CONCAT_VECTORS]](<4 x s16>), 0 + ; SI: [[DEF3:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF + ; SI: [[INSERT2:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF3]], [[EXTRACT2]](<3 x s16>), 0 ; SI: $vgpr0_vgpr1 = COPY [[INSERT2]](<4 x s16>) ; VI-LABEL: name: test_fminnum_v3s16 ; VI: [[COPY:%[0-9]+]]:_(<4 x s16>) = COPY $vgpr0_vgpr1 @@ -463,13 +463,13 @@ body: | ; VI: [[FCANONICALIZE4:%[0-9]+]]:_(s16) = G_FCANONICALIZE [[TRUNC2]] ; VI: [[FCANONICALIZE5:%[0-9]+]]:_(s16) = G_FCANONICALIZE [[TRUNC5]] ; VI: [[FMINNUM_IEEE2:%[0-9]+]]:_(s16) = G_FMINNUM_IEEE [[FCANONICALIZE4]], [[FCANONICALIZE5]] - ; VI: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[FMINNUM_IEEE]](s16) - ; VI: [[ANYEXT1:%[0-9]+]]:_(s32) = G_ANYEXT [[FMINNUM_IEEE1]](s16) - ; VI: [[ANYEXT2:%[0-9]+]]:_(s32) = G_ANYEXT [[FMINNUM_IEEE2]](s16) - ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[ANYEXT]](s32), [[ANYEXT1]](s32), [[ANYEXT2]](s32) - ; VI: [[TRUNC6:%[0-9]+]]:_(<3 x s16>) = G_TRUNC [[BUILD_VECTOR]](<3 x s32>) - ; VI: [[DEF2:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF - ; VI: [[INSERT2:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF2]], [[TRUNC6]](<3 x s16>), 0 + ; VI: [[DEF2:%[0-9]+]]:_(s16) = G_IMPLICIT_DEF + ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FMINNUM_IEEE]](s16), [[FMINNUM_IEEE1]](s16) + ; VI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FMINNUM_IEEE2]](s16), [[DEF2]](s16) + ; VI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; VI: [[EXTRACT2:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[CONCAT_VECTORS]](<4 x s16>), 0 + ; VI: [[DEF3:%[0-9]+]]:_(<4 
x s16>) = G_IMPLICIT_DEF + ; VI: [[INSERT2:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF3]], [[EXTRACT2]](<3 x s16>), 0 ; VI: $vgpr0_vgpr1 = COPY [[INSERT2]](<4 x s16>) ; GFX9-LABEL: name: test_fminnum_v3s16 ; GFX9: [[COPY:%[0-9]+]]:_(<4 x s16>) = COPY $vgpr0_vgpr1 @@ -561,8 +561,10 @@ body: | ; SI: [[FPEXT7:%[0-9]+]]:_(s32) = G_FPEXT [[TRUNC7]](s16) ; SI: [[FMINNUM_IEEE3:%[0-9]+]]:_(s32) = G_FMINNUM_IEEE [[FPEXT6]], [[FPEXT7]] ; SI: [[FPTRUNC3:%[0-9]+]]:_(s16) = G_FPTRUNC [[FMINNUM_IEEE3]](s32) - ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s16>) = G_BUILD_VECTOR [[FPTRUNC]](s16), [[FPTRUNC1]](s16), [[FPTRUNC2]](s16), [[FPTRUNC3]](s16) - ; SI: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<4 x s16>) + ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FPTRUNC]](s16), [[FPTRUNC1]](s16) + ; SI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FPTRUNC2]](s16), [[FPTRUNC3]](s16) + ; SI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; SI: $vgpr0_vgpr1 = COPY [[CONCAT_VECTORS]](<4 x s16>) ; VI-LABEL: name: test_fminnum_v4s16 ; VI: [[COPY:%[0-9]+]]:_(<4 x s16>) = COPY $vgpr0_vgpr1 ; VI: [[COPY1:%[0-9]+]]:_(<4 x s16>) = COPY $vgpr2_vgpr3 @@ -597,8 +599,10 @@ body: | ; VI: [[FCANONICALIZE6:%[0-9]+]]:_(s16) = G_FCANONICALIZE [[TRUNC3]] ; VI: [[FCANONICALIZE7:%[0-9]+]]:_(s16) = G_FCANONICALIZE [[TRUNC7]] ; VI: [[FMINNUM_IEEE3:%[0-9]+]]:_(s16) = G_FMINNUM_IEEE [[FCANONICALIZE6]], [[FCANONICALIZE7]] - ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s16>) = G_BUILD_VECTOR [[FMINNUM_IEEE]](s16), [[FMINNUM_IEEE1]](s16), [[FMINNUM_IEEE2]](s16), [[FMINNUM_IEEE3]](s16) - ; VI: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<4 x s16>) + ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FMINNUM_IEEE]](s16), [[FMINNUM_IEEE1]](s16) + ; VI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FMINNUM_IEEE2]](s16), [[FMINNUM_IEEE3]](s16) + ; VI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS 
[[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; VI: $vgpr0_vgpr1 = COPY [[CONCAT_VECTORS]](<4 x s16>) ; GFX9-LABEL: name: test_fminnum_v4s16 ; GFX9: [[COPY:%[0-9]+]]:_(<4 x s16>) = COPY $vgpr0_vgpr1 ; GFX9: [[COPY1:%[0-9]+]]:_(<4 x s16>) = COPY $vgpr2_vgpr3 Modified: llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-fmul.mir URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-fmul.mir?rev=374252&r1=374251&r2=374252&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-fmul.mir (original) +++ llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-fmul.mir Wed Oct 9 15:44:43 2019 @@ -357,12 +357,12 @@ body: | ; SI: [[FPEXT5:%[0-9]+]]:_(s32) = G_FPEXT [[TRUNC5]](s16) ; SI: [[FMUL2:%[0-9]+]]:_(s32) = G_FMUL [[FPEXT4]], [[FPEXT5]] ; SI: [[FPTRUNC2:%[0-9]+]]:_(s16) = G_FPTRUNC [[FMUL2]](s32) - ; SI: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[FPTRUNC]](s16) - ; SI: [[ANYEXT1:%[0-9]+]]:_(s32) = G_ANYEXT [[FPTRUNC1]](s16) - ; SI: [[ANYEXT2:%[0-9]+]]:_(s32) = G_ANYEXT [[FPTRUNC2]](s16) - ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[ANYEXT]](s32), [[ANYEXT1]](s32), [[ANYEXT2]](s32) - ; SI: [[TRUNC6:%[0-9]+]]:_(<3 x s16>) = G_TRUNC [[BUILD_VECTOR]](<3 x s32>) - ; SI: S_NOP 0, implicit [[TRUNC6]](<3 x s16>) + ; SI: [[DEF4:%[0-9]+]]:_(s16) = G_IMPLICIT_DEF + ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FPTRUNC]](s16), [[FPTRUNC1]](s16) + ; SI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FPTRUNC2]](s16), [[DEF4]](s16) + ; SI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; SI: [[EXTRACT2:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[CONCAT_VECTORS]](<4 x s16>), 0 + ; SI: S_NOP 0, implicit [[EXTRACT2]](<3 x s16>) ; VI-LABEL: name: test_fmul_v3s16 ; VI: [[DEF:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF ; VI: [[EXTRACT:%[0-9]+]]:_(<3 x 
s16>) = G_EXTRACT [[DEF]](<4 x s16>), 0 @@ -392,12 +392,12 @@ body: | ; VI: [[FMUL:%[0-9]+]]:_(s16) = G_FMUL [[TRUNC]], [[TRUNC3]] ; VI: [[FMUL1:%[0-9]+]]:_(s16) = G_FMUL [[TRUNC1]], [[TRUNC4]] ; VI: [[FMUL2:%[0-9]+]]:_(s16) = G_FMUL [[TRUNC2]], [[TRUNC5]] - ; VI: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[FMUL]](s16) - ; VI: [[ANYEXT1:%[0-9]+]]:_(s32) = G_ANYEXT [[FMUL1]](s16) - ; VI: [[ANYEXT2:%[0-9]+]]:_(s32) = G_ANYEXT [[FMUL2]](s16) - ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[ANYEXT]](s32), [[ANYEXT1]](s32), [[ANYEXT2]](s32) - ; VI: [[TRUNC6:%[0-9]+]]:_(<3 x s16>) = G_TRUNC [[BUILD_VECTOR]](<3 x s32>) - ; VI: S_NOP 0, implicit [[TRUNC6]](<3 x s16>) + ; VI: [[DEF4:%[0-9]+]]:_(s16) = G_IMPLICIT_DEF + ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FMUL]](s16), [[FMUL1]](s16) + ; VI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FMUL2]](s16), [[DEF4]](s16) + ; VI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; VI: [[EXTRACT2:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[CONCAT_VECTORS]](<4 x s16>), 0 + ; VI: S_NOP 0, implicit [[EXTRACT2]](<3 x s16>) ; GFX9-LABEL: name: test_fmul_v3s16 ; GFX9: [[DEF:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF ; GFX9: [[EXTRACT:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[DEF]](<4 x s16>), 0 @@ -478,8 +478,10 @@ body: | ; SI: [[FPEXT7:%[0-9]+]]:_(s32) = G_FPEXT [[TRUNC7]](s16) ; SI: [[FMUL3:%[0-9]+]]:_(s32) = G_FMUL [[FPEXT6]], [[FPEXT7]] ; SI: [[FPTRUNC3:%[0-9]+]]:_(s16) = G_FPTRUNC [[FMUL3]](s32) - ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s16>) = G_BUILD_VECTOR [[FPTRUNC]](s16), [[FPTRUNC1]](s16), [[FPTRUNC2]](s16), [[FPTRUNC3]](s16) - ; SI: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<4 x s16>) + ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FPTRUNC]](s16), [[FPTRUNC1]](s16) + ; SI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FPTRUNC2]](s16), [[FPTRUNC3]](s16) + ; SI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = 
G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; SI: $vgpr0_vgpr1 = COPY [[CONCAT_VECTORS]](<4 x s16>) ; VI-LABEL: name: test_fmul_v4s16 ; VI: [[COPY:%[0-9]+]]:_(<4 x s16>) = COPY $vgpr0_vgpr1 ; VI: [[COPY1:%[0-9]+]]:_(<4 x s16>) = COPY $vgpr2_vgpr3 @@ -506,8 +508,10 @@ body: | ; VI: [[FMUL1:%[0-9]+]]:_(s16) = G_FMUL [[TRUNC1]], [[TRUNC5]] ; VI: [[FMUL2:%[0-9]+]]:_(s16) = G_FMUL [[TRUNC2]], [[TRUNC6]] ; VI: [[FMUL3:%[0-9]+]]:_(s16) = G_FMUL [[TRUNC3]], [[TRUNC7]] - ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s16>) = G_BUILD_VECTOR [[FMUL]](s16), [[FMUL1]](s16), [[FMUL2]](s16), [[FMUL3]](s16) - ; VI: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<4 x s16>) + ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FMUL]](s16), [[FMUL1]](s16) + ; VI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FMUL2]](s16), [[FMUL3]](s16) + ; VI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; VI: $vgpr0_vgpr1 = COPY [[CONCAT_VECTORS]](<4 x s16>) ; GFX9-LABEL: name: test_fmul_v4s16 ; GFX9: [[COPY:%[0-9]+]]:_(<4 x s16>) = COPY $vgpr0_vgpr1 ; GFX9: [[COPY1:%[0-9]+]]:_(<4 x s16>) = COPY $vgpr2_vgpr3 Modified: llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-fsin.mir URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-fsin.mir?rev=374252&r1=374251&r2=374252&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-fsin.mir (original) +++ llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-fsin.mir Wed Oct 9 15:44:43 2019 @@ -344,12 +344,12 @@ body: | ; SI: [[INT4:%[0-9]+]]:_(s32) = G_INTRINSIC intrinsic(@llvm.amdgcn.fract), [[FMUL2]](s32) ; SI: [[INT5:%[0-9]+]]:_(s32) = G_INTRINSIC intrinsic(@llvm.amdgcn.sin), [[INT4]](s32) ; SI: [[FPTRUNC2:%[0-9]+]]:_(s16) = G_FPTRUNC [[INT5]](s32) - ; SI: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[FPTRUNC]](s16) - ; SI: 
[[ANYEXT1:%[0-9]+]]:_(s32) = G_ANYEXT [[FPTRUNC1]](s16) - ; SI: [[ANYEXT2:%[0-9]+]]:_(s32) = G_ANYEXT [[FPTRUNC2]](s16) - ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[ANYEXT]](s32), [[ANYEXT1]](s32), [[ANYEXT2]](s32) - ; SI: [[TRUNC3:%[0-9]+]]:_(<3 x s16>) = G_TRUNC [[BUILD_VECTOR]](<3 x s32>) - ; SI: S_NOP 0, implicit [[TRUNC3]](<3 x s16>) + ; SI: [[DEF2:%[0-9]+]]:_(s16) = G_IMPLICIT_DEF + ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FPTRUNC]](s16), [[FPTRUNC1]](s16) + ; SI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FPTRUNC2]](s16), [[DEF2]](s16) + ; SI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; SI: [[EXTRACT1:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[CONCAT_VECTORS]](<4 x s16>), 0 + ; SI: S_NOP 0, implicit [[EXTRACT1]](<3 x s16>) ; VI-LABEL: name: test_fsin_v3s16 ; VI: [[DEF:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF ; VI: [[EXTRACT:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[DEF]](<4 x s16>), 0 @@ -374,12 +374,12 @@ body: | ; VI: [[FMUL2:%[0-9]+]]:_(s16) = G_FMUL [[TRUNC2]], [[C1]] ; VI: [[INT4:%[0-9]+]]:_(s16) = G_INTRINSIC intrinsic(@llvm.amdgcn.fract), [[FMUL2]](s16) ; VI: [[INT5:%[0-9]+]]:_(s16) = G_INTRINSIC intrinsic(@llvm.amdgcn.sin), [[INT4]](s16) - ; VI: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[INT1]](s16) - ; VI: [[ANYEXT1:%[0-9]+]]:_(s32) = G_ANYEXT [[INT3]](s16) - ; VI: [[ANYEXT2:%[0-9]+]]:_(s32) = G_ANYEXT [[INT5]](s16) - ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[ANYEXT]](s32), [[ANYEXT1]](s32), [[ANYEXT2]](s32) - ; VI: [[TRUNC3:%[0-9]+]]:_(<3 x s16>) = G_TRUNC [[BUILD_VECTOR]](<3 x s32>) - ; VI: S_NOP 0, implicit [[TRUNC3]](<3 x s16>) + ; VI: [[DEF2:%[0-9]+]]:_(s16) = G_IMPLICIT_DEF + ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[INT1]](s16), [[INT3]](s16) + ; VI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[INT5]](s16), [[DEF2]](s16) + ; VI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = 
G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; VI: [[EXTRACT1:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[CONCAT_VECTORS]](<4 x s16>), 0 + ; VI: S_NOP 0, implicit [[EXTRACT1]](<3 x s16>) ; GFX9-LABEL: name: test_fsin_v3s16 ; GFX9: [[DEF:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF ; GFX9: [[EXTRACT:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[DEF]](<4 x s16>), 0 @@ -401,12 +401,12 @@ body: | ; GFX9: [[INT1:%[0-9]+]]:_(s16) = G_INTRINSIC intrinsic(@llvm.amdgcn.sin), [[FMUL1]](s16) ; GFX9: [[FMUL2:%[0-9]+]]:_(s16) = G_FMUL [[TRUNC2]], [[C1]] ; GFX9: [[INT2:%[0-9]+]]:_(s16) = G_INTRINSIC intrinsic(@llvm.amdgcn.sin), [[FMUL2]](s16) - ; GFX9: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[INT]](s16) - ; GFX9: [[ANYEXT1:%[0-9]+]]:_(s32) = G_ANYEXT [[INT1]](s16) - ; GFX9: [[ANYEXT2:%[0-9]+]]:_(s32) = G_ANYEXT [[INT2]](s16) - ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[ANYEXT]](s32), [[ANYEXT1]](s32), [[ANYEXT2]](s32) - ; GFX9: [[TRUNC3:%[0-9]+]]:_(<3 x s16>) = G_TRUNC [[BUILD_VECTOR]](<3 x s32>) - ; GFX9: S_NOP 0, implicit [[TRUNC3]](<3 x s16>) + ; GFX9: [[DEF2:%[0-9]+]]:_(s16) = G_IMPLICIT_DEF + ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[INT]](s16), [[INT1]](s16) + ; GFX9: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[INT2]](s16), [[DEF2]](s16) + ; GFX9: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; GFX9: [[EXTRACT1:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[CONCAT_VECTORS]](<4 x s16>), 0 + ; GFX9: S_NOP 0, implicit [[EXTRACT1]](<3 x s16>) %0:_(<3 x s16>) = G_IMPLICIT_DEF %1:_(<3 x s16>) = G_FSIN %0 S_NOP 0, implicit %1 @@ -451,8 +451,10 @@ body: | ; SI: [[INT6:%[0-9]+]]:_(s32) = G_INTRINSIC intrinsic(@llvm.amdgcn.fract), [[FMUL3]](s32) ; SI: [[INT7:%[0-9]+]]:_(s32) = G_INTRINSIC intrinsic(@llvm.amdgcn.sin), [[INT6]](s32) ; SI: [[FPTRUNC3:%[0-9]+]]:_(s16) = G_FPTRUNC [[INT7]](s32) - ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s16>) = G_BUILD_VECTOR 
[[FPTRUNC]](s16), [[FPTRUNC1]](s16), [[FPTRUNC2]](s16), [[FPTRUNC3]](s16) - ; SI: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<4 x s16>) + ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FPTRUNC]](s16), [[FPTRUNC1]](s16) + ; SI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FPTRUNC2]](s16), [[FPTRUNC3]](s16) + ; SI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; SI: $vgpr0_vgpr1 = COPY [[CONCAT_VECTORS]](<4 x s16>) ; VI-LABEL: name: test_fsin_v4s16 ; VI: [[COPY:%[0-9]+]]:_(<4 x s16>) = COPY $vgpr0_vgpr1 ; VI: [[UV:%[0-9]+]]:_(<2 x s16>), [[UV1:%[0-9]+]]:_(<2 x s16>) = G_UNMERGE_VALUES [[COPY]](<4 x s16>) @@ -478,8 +480,10 @@ body: | ; VI: [[FMUL3:%[0-9]+]]:_(s16) = G_FMUL [[TRUNC3]], [[C1]] ; VI: [[INT6:%[0-9]+]]:_(s16) = G_INTRINSIC intrinsic(@llvm.amdgcn.fract), [[FMUL3]](s16) ; VI: [[INT7:%[0-9]+]]:_(s16) = G_INTRINSIC intrinsic(@llvm.amdgcn.sin), [[INT6]](s16) - ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s16>) = G_BUILD_VECTOR [[INT1]](s16), [[INT3]](s16), [[INT5]](s16), [[INT7]](s16) - ; VI: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<4 x s16>) + ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[INT1]](s16), [[INT3]](s16) + ; VI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[INT5]](s16), [[INT7]](s16) + ; VI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; VI: $vgpr0_vgpr1 = COPY [[CONCAT_VECTORS]](<4 x s16>) ; GFX9-LABEL: name: test_fsin_v4s16 ; GFX9: [[COPY:%[0-9]+]]:_(<4 x s16>) = COPY $vgpr0_vgpr1 ; GFX9: [[UV:%[0-9]+]]:_(<2 x s16>), [[UV1:%[0-9]+]]:_(<2 x s16>) = G_UNMERGE_VALUES [[COPY]](<4 x s16>) @@ -501,8 +505,10 @@ body: | ; GFX9: [[INT2:%[0-9]+]]:_(s16) = G_INTRINSIC intrinsic(@llvm.amdgcn.sin), [[FMUL2]](s16) ; GFX9: [[FMUL3:%[0-9]+]]:_(s16) = G_FMUL [[TRUNC3]], [[C1]] ; GFX9: [[INT3:%[0-9]+]]:_(s16) = G_INTRINSIC intrinsic(@llvm.amdgcn.sin), [[FMUL3]](s16) - ; GFX9: 
[[BUILD_VECTOR:%[0-9]+]]:_(<4 x s16>) = G_BUILD_VECTOR [[INT]](s16), [[INT1]](s16), [[INT2]](s16), [[INT3]](s16) - ; GFX9: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<4 x s16>) + ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[INT]](s16), [[INT1]](s16) + ; GFX9: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[INT2]](s16), [[INT3]](s16) + ; GFX9: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; GFX9: $vgpr0_vgpr1 = COPY [[CONCAT_VECTORS]](<4 x s16>) %0:_(<4 x s16>) = COPY $vgpr0_vgpr1 %1:_(<4 x s16>) = G_FSIN %0 $vgpr0_vgpr1 = COPY %1 Modified: llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-fsqrt.mir URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-fsqrt.mir?rev=374252&r1=374251&r2=374252&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-fsqrt.mir (original) +++ llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-fsqrt.mir Wed Oct 9 15:44:43 2019 @@ -257,12 +257,12 @@ body: | ; SI: [[FPEXT2:%[0-9]+]]:_(s32) = G_FPEXT [[TRUNC2]](s16) ; SI: [[FSQRT2:%[0-9]+]]:_(s32) = G_FSQRT [[FPEXT2]] ; SI: [[FPTRUNC2:%[0-9]+]]:_(s16) = G_FPTRUNC [[FSQRT2]](s32) - ; SI: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[FPTRUNC]](s16) - ; SI: [[ANYEXT1:%[0-9]+]]:_(s32) = G_ANYEXT [[FPTRUNC1]](s16) - ; SI: [[ANYEXT2:%[0-9]+]]:_(s32) = G_ANYEXT [[FPTRUNC2]](s16) - ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[ANYEXT]](s32), [[ANYEXT1]](s32), [[ANYEXT2]](s32) - ; SI: [[TRUNC3:%[0-9]+]]:_(<3 x s16>) = G_TRUNC [[BUILD_VECTOR]](<3 x s32>) - ; SI: S_NOP 0, implicit [[TRUNC3]](<3 x s16>) + ; SI: [[DEF2:%[0-9]+]]:_(s16) = G_IMPLICIT_DEF + ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FPTRUNC]](s16), [[FPTRUNC1]](s16) + ; SI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FPTRUNC2]](s16), [[DEF2]](s16) + ; SI: 
[[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; SI: [[EXTRACT1:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[CONCAT_VECTORS]](<4 x s16>), 0 + ; SI: S_NOP 0, implicit [[EXTRACT1]](<3 x s16>) ; VI-LABEL: name: test_fsqrt_v3s16 ; VI: [[DEF:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF ; VI: [[EXTRACT:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[DEF]](<4 x s16>), 0 @@ -280,12 +280,12 @@ body: | ; VI: [[FSQRT:%[0-9]+]]:_(s16) = G_FSQRT [[TRUNC]] ; VI: [[FSQRT1:%[0-9]+]]:_(s16) = G_FSQRT [[TRUNC1]] ; VI: [[FSQRT2:%[0-9]+]]:_(s16) = G_FSQRT [[TRUNC2]] - ; VI: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[FSQRT]](s16) - ; VI: [[ANYEXT1:%[0-9]+]]:_(s32) = G_ANYEXT [[FSQRT1]](s16) - ; VI: [[ANYEXT2:%[0-9]+]]:_(s32) = G_ANYEXT [[FSQRT2]](s16) - ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[ANYEXT]](s32), [[ANYEXT1]](s32), [[ANYEXT2]](s32) - ; VI: [[TRUNC3:%[0-9]+]]:_(<3 x s16>) = G_TRUNC [[BUILD_VECTOR]](<3 x s32>) - ; VI: S_NOP 0, implicit [[TRUNC3]](<3 x s16>) + ; VI: [[DEF2:%[0-9]+]]:_(s16) = G_IMPLICIT_DEF + ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FSQRT]](s16), [[FSQRT1]](s16) + ; VI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FSQRT2]](s16), [[DEF2]](s16) + ; VI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; VI: [[EXTRACT1:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[CONCAT_VECTORS]](<4 x s16>), 0 + ; VI: S_NOP 0, implicit [[EXTRACT1]](<3 x s16>) ; GFX9-LABEL: name: test_fsqrt_v3s16 ; GFX9: [[DEF:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF ; GFX9: [[EXTRACT:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[DEF]](<4 x s16>), 0 @@ -303,12 +303,12 @@ body: | ; GFX9: [[FSQRT:%[0-9]+]]:_(s16) = G_FSQRT [[TRUNC]] ; GFX9: [[FSQRT1:%[0-9]+]]:_(s16) = G_FSQRT [[TRUNC1]] ; GFX9: [[FSQRT2:%[0-9]+]]:_(s16) = G_FSQRT [[TRUNC2]] - ; GFX9: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[FSQRT]](s16) - ; GFX9: [[ANYEXT1:%[0-9]+]]:_(s32) = G_ANYEXT 
[[FSQRT1]](s16) - ; GFX9: [[ANYEXT2:%[0-9]+]]:_(s32) = G_ANYEXT [[FSQRT2]](s16) - ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[ANYEXT]](s32), [[ANYEXT1]](s32), [[ANYEXT2]](s32) - ; GFX9: [[TRUNC3:%[0-9]+]]:_(<3 x s16>) = G_TRUNC [[BUILD_VECTOR]](<3 x s32>) - ; GFX9: S_NOP 0, implicit [[TRUNC3]](<3 x s16>) + ; GFX9: [[DEF2:%[0-9]+]]:_(s16) = G_IMPLICIT_DEF + ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FSQRT]](s16), [[FSQRT1]](s16) + ; GFX9: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FSQRT2]](s16), [[DEF2]](s16) + ; GFX9: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; GFX9: [[EXTRACT1:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[CONCAT_VECTORS]](<4 x s16>), 0 + ; GFX9: S_NOP 0, implicit [[EXTRACT1]](<3 x s16>) %0:_(<3 x s16>) = G_IMPLICIT_DEF %1:_(<3 x s16>) = G_FSQRT %0 S_NOP 0, implicit %1 @@ -344,8 +344,10 @@ body: | ; SI: [[FPEXT3:%[0-9]+]]:_(s32) = G_FPEXT [[TRUNC3]](s16) ; SI: [[FSQRT3:%[0-9]+]]:_(s32) = G_FSQRT [[FPEXT3]] ; SI: [[FPTRUNC3:%[0-9]+]]:_(s16) = G_FPTRUNC [[FSQRT3]](s32) - ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s16>) = G_BUILD_VECTOR [[FPTRUNC]](s16), [[FPTRUNC1]](s16), [[FPTRUNC2]](s16), [[FPTRUNC3]](s16) - ; SI: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<4 x s16>) + ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FPTRUNC]](s16), [[FPTRUNC1]](s16) + ; SI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FPTRUNC2]](s16), [[FPTRUNC3]](s16) + ; SI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; SI: $vgpr0_vgpr1 = COPY [[CONCAT_VECTORS]](<4 x s16>) ; VI-LABEL: name: test_fsqrt_v4s16 ; VI: [[COPY:%[0-9]+]]:_(<4 x s16>) = COPY $vgpr0_vgpr1 ; VI: [[UV:%[0-9]+]]:_(<2 x s16>), [[UV1:%[0-9]+]]:_(<2 x s16>) = G_UNMERGE_VALUES [[COPY]](<4 x s16>) @@ -362,8 +364,10 @@ body: | ; VI: [[FSQRT1:%[0-9]+]]:_(s16) = G_FSQRT [[TRUNC1]] ; VI: 
[[FSQRT2:%[0-9]+]]:_(s16) = G_FSQRT [[TRUNC2]] ; VI: [[FSQRT3:%[0-9]+]]:_(s16) = G_FSQRT [[TRUNC3]] - ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s16>) = G_BUILD_VECTOR [[FSQRT]](s16), [[FSQRT1]](s16), [[FSQRT2]](s16), [[FSQRT3]](s16) - ; VI: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<4 x s16>) + ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FSQRT]](s16), [[FSQRT1]](s16) + ; VI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FSQRT2]](s16), [[FSQRT3]](s16) + ; VI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; VI: $vgpr0_vgpr1 = COPY [[CONCAT_VECTORS]](<4 x s16>) ; GFX9-LABEL: name: test_fsqrt_v4s16 ; GFX9: [[COPY:%[0-9]+]]:_(<4 x s16>) = COPY $vgpr0_vgpr1 ; GFX9: [[UV:%[0-9]+]]:_(<2 x s16>), [[UV1:%[0-9]+]]:_(<2 x s16>) = G_UNMERGE_VALUES [[COPY]](<4 x s16>) @@ -380,8 +384,10 @@ body: | ; GFX9: [[FSQRT1:%[0-9]+]]:_(s16) = G_FSQRT [[TRUNC1]] ; GFX9: [[FSQRT2:%[0-9]+]]:_(s16) = G_FSQRT [[TRUNC2]] ; GFX9: [[FSQRT3:%[0-9]+]]:_(s16) = G_FSQRT [[TRUNC3]] - ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s16>) = G_BUILD_VECTOR [[FSQRT]](s16), [[FSQRT1]](s16), [[FSQRT2]](s16), [[FSQRT3]](s16) - ; GFX9: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<4 x s16>) + ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FSQRT]](s16), [[FSQRT1]](s16) + ; GFX9: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FSQRT2]](s16), [[FSQRT3]](s16) + ; GFX9: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; GFX9: $vgpr0_vgpr1 = COPY [[CONCAT_VECTORS]](<4 x s16>) %0:_(<4 x s16>) = COPY $vgpr0_vgpr1 %1:_(<4 x s16>) = G_FSQRT %0 $vgpr0_vgpr1 = COPY %1 Modified: llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-fsub.mir URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-fsub.mir?rev=374252&r1=374251&r2=374252&view=diff 
============================================================================== --- llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-fsub.mir (original) +++ llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-fsub.mir Wed Oct 9 15:44:43 2019 @@ -405,12 +405,12 @@ body: | ; SI: [[FPEXT5:%[0-9]+]]:_(s32) = G_FPEXT [[FNEG2]](s16) ; SI: [[FADD2:%[0-9]+]]:_(s32) = G_FADD [[FPEXT4]], [[FPEXT5]] ; SI: [[FPTRUNC2:%[0-9]+]]:_(s16) = G_FPTRUNC [[FADD2]](s32) - ; SI: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[FPTRUNC]](s16) - ; SI: [[ANYEXT1:%[0-9]+]]:_(s32) = G_ANYEXT [[FPTRUNC1]](s16) - ; SI: [[ANYEXT2:%[0-9]+]]:_(s32) = G_ANYEXT [[FPTRUNC2]](s16) - ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[ANYEXT]](s32), [[ANYEXT1]](s32), [[ANYEXT2]](s32) - ; SI: [[TRUNC6:%[0-9]+]]:_(<3 x s16>) = G_TRUNC [[BUILD_VECTOR]](<3 x s32>) - ; SI: S_NOP 0, implicit [[TRUNC6]](<3 x s16>) + ; SI: [[DEF4:%[0-9]+]]:_(s16) = G_IMPLICIT_DEF + ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FPTRUNC]](s16), [[FPTRUNC1]](s16) + ; SI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FPTRUNC2]](s16), [[DEF4]](s16) + ; SI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; SI: [[EXTRACT2:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[CONCAT_VECTORS]](<4 x s16>), 0 + ; SI: S_NOP 0, implicit [[EXTRACT2]](<3 x s16>) ; VI-LABEL: name: test_fsub_v3s16 ; VI: [[DEF:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF ; VI: [[EXTRACT:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[DEF]](<4 x s16>), 0 @@ -443,12 +443,12 @@ body: | ; VI: [[FADD1:%[0-9]+]]:_(s16) = G_FADD [[TRUNC1]], [[FNEG1]] ; VI: [[FNEG2:%[0-9]+]]:_(s16) = G_FNEG [[TRUNC5]] ; VI: [[FADD2:%[0-9]+]]:_(s16) = G_FADD [[TRUNC2]], [[FNEG2]] - ; VI: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[FADD]](s16) - ; VI: [[ANYEXT1:%[0-9]+]]:_(s32) = G_ANYEXT [[FADD1]](s16) - ; VI: [[ANYEXT2:%[0-9]+]]:_(s32) = G_ANYEXT [[FADD2]](s16) - ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR 
[[ANYEXT]](s32), [[ANYEXT1]](s32), [[ANYEXT2]](s32) - ; VI: [[TRUNC6:%[0-9]+]]:_(<3 x s16>) = G_TRUNC [[BUILD_VECTOR]](<3 x s32>) - ; VI: S_NOP 0, implicit [[TRUNC6]](<3 x s16>) + ; VI: [[DEF4:%[0-9]+]]:_(s16) = G_IMPLICIT_DEF + ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FADD]](s16), [[FADD1]](s16) + ; VI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FADD2]](s16), [[DEF4]](s16) + ; VI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; VI: [[EXTRACT2:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[CONCAT_VECTORS]](<4 x s16>), 0 + ; VI: S_NOP 0, implicit [[EXTRACT2]](<3 x s16>) ; GFX9-LABEL: name: test_fsub_v3s16 ; GFX9: [[DEF:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF ; GFX9: [[EXTRACT:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[DEF]](<4 x s16>), 0 @@ -481,12 +481,12 @@ body: | ; GFX9: [[FADD1:%[0-9]+]]:_(s16) = G_FADD [[TRUNC1]], [[FNEG1]] ; GFX9: [[FNEG2:%[0-9]+]]:_(s16) = G_FNEG [[TRUNC5]] ; GFX9: [[FADD2:%[0-9]+]]:_(s16) = G_FADD [[TRUNC2]], [[FNEG2]] - ; GFX9: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[FADD]](s16) - ; GFX9: [[ANYEXT1:%[0-9]+]]:_(s32) = G_ANYEXT [[FADD1]](s16) - ; GFX9: [[ANYEXT2:%[0-9]+]]:_(s32) = G_ANYEXT [[FADD2]](s16) - ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[ANYEXT]](s32), [[ANYEXT1]](s32), [[ANYEXT2]](s32) - ; GFX9: [[TRUNC6:%[0-9]+]]:_(<3 x s16>) = G_TRUNC [[BUILD_VECTOR]](<3 x s32>) - ; GFX9: S_NOP 0, implicit [[TRUNC6]](<3 x s16>) + ; GFX9: [[DEF4:%[0-9]+]]:_(s16) = G_IMPLICIT_DEF + ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FADD]](s16), [[FADD1]](s16) + ; GFX9: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FADD2]](s16), [[DEF4]](s16) + ; GFX9: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; GFX9: [[EXTRACT2:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[CONCAT_VECTORS]](<4 x s16>), 0 + ; GFX9: S_NOP 0, implicit [[EXTRACT2]](<3 x s16>) 
%0:_(<3 x s16>) = G_IMPLICIT_DEF %1:_(<3 x s16>) = G_IMPLICIT_DEF %2:_(<3 x s16>) = G_FSUB %0, %1 @@ -541,8 +541,10 @@ body: | ; SI: [[FPEXT7:%[0-9]+]]:_(s32) = G_FPEXT [[FNEG3]](s16) ; SI: [[FADD3:%[0-9]+]]:_(s32) = G_FADD [[FPEXT6]], [[FPEXT7]] ; SI: [[FPTRUNC3:%[0-9]+]]:_(s16) = G_FPTRUNC [[FADD3]](s32) - ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s16>) = G_BUILD_VECTOR [[FPTRUNC]](s16), [[FPTRUNC1]](s16), [[FPTRUNC2]](s16), [[FPTRUNC3]](s16) - ; SI: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<4 x s16>) + ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FPTRUNC]](s16), [[FPTRUNC1]](s16) + ; SI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FPTRUNC2]](s16), [[FPTRUNC3]](s16) + ; SI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; SI: $vgpr0_vgpr1 = COPY [[CONCAT_VECTORS]](<4 x s16>) ; VI-LABEL: name: test_fsub_v4s16 ; VI: [[COPY:%[0-9]+]]:_(<4 x s16>) = COPY $vgpr0_vgpr1 ; VI: [[COPY1:%[0-9]+]]:_(<4 x s16>) = COPY $vgpr2_vgpr3 @@ -573,8 +575,10 @@ body: | ; VI: [[FADD2:%[0-9]+]]:_(s16) = G_FADD [[TRUNC2]], [[FNEG2]] ; VI: [[FNEG3:%[0-9]+]]:_(s16) = G_FNEG [[TRUNC7]] ; VI: [[FADD3:%[0-9]+]]:_(s16) = G_FADD [[TRUNC3]], [[FNEG3]] - ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s16>) = G_BUILD_VECTOR [[FADD]](s16), [[FADD1]](s16), [[FADD2]](s16), [[FADD3]](s16) - ; VI: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<4 x s16>) + ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FADD]](s16), [[FADD1]](s16) + ; VI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FADD2]](s16), [[FADD3]](s16) + ; VI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; VI: $vgpr0_vgpr1 = COPY [[CONCAT_VECTORS]](<4 x s16>) ; GFX9-LABEL: name: test_fsub_v4s16 ; GFX9: [[COPY:%[0-9]+]]:_(<4 x s16>) = COPY $vgpr0_vgpr1 ; GFX9: [[COPY1:%[0-9]+]]:_(<4 x s16>) = COPY $vgpr2_vgpr3 @@ -605,8 +609,10 @@ body: | ; GFX9: 
[[FADD2:%[0-9]+]]:_(s16) = G_FADD [[TRUNC2]], [[FNEG2]] ; GFX9: [[FNEG3:%[0-9]+]]:_(s16) = G_FNEG [[TRUNC7]] ; GFX9: [[FADD3:%[0-9]+]]:_(s16) = G_FADD [[TRUNC3]], [[FNEG3]] - ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s16>) = G_BUILD_VECTOR [[FADD]](s16), [[FADD1]](s16), [[FADD2]](s16), [[FADD3]](s16) - ; GFX9: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<4 x s16>) + ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FADD]](s16), [[FADD1]](s16) + ; GFX9: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[FADD2]](s16), [[FADD3]](s16) + ; GFX9: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; GFX9: $vgpr0_vgpr1 = COPY [[CONCAT_VECTORS]](<4 x s16>) %0:_(<4 x s16>) = COPY $vgpr0_vgpr1 %1:_(<4 x s16>) = COPY $vgpr2_vgpr3 %2:_(<4 x s16>) = G_FSUB %0, %1 Modified: llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-load-constant.mir URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-load-constant.mir?rev=374252&r1=374251&r2=374252&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-load-constant.mir (original) +++ llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-load-constant.mir Wed Oct 9 15:44:43 2019 @@ -5205,87 +5205,102 @@ body: | ; CI-LABEL: name: test_load_constant_v3s16_align2 ; CI: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 ; CI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p4) :: (load 2, addrspace 4) + ; CI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; CI: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; CI: [[GEP:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C]](s64) ; CI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p4) :: (load 2, addrspace 4) + ; CI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; CI: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 ; CI: [[GEP1:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C1]](s64) ; CI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p4) :: 
(load 2, addrspace 4) - ; CI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) - ; CI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) - ; CI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) - ; CI: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32) - ; CI: [[TRUNC:%[0-9]+]]:_(<3 x s16>) = G_TRUNC [[BUILD_VECTOR]](<3 x s32>) - ; CI: [[DEF:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF - ; CI: [[INSERT:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF]], [[TRUNC]](<3 x s16>), 0 + ; CI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) + ; CI: [[DEF:%[0-9]+]]:_(s16) = G_IMPLICIT_DEF + ; CI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16) + ; CI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC2]](s16), [[DEF]](s16) + ; CI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; CI: [[EXTRACT:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[CONCAT_VECTORS]](<4 x s16>), 0 + ; CI: [[DEF1:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF + ; CI: [[INSERT:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF1]], [[EXTRACT]](<3 x s16>), 0 ; CI: $vgpr0_vgpr1 = COPY [[INSERT]](<4 x s16>) ; VI-LABEL: name: test_load_constant_v3s16_align2 ; VI: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p4) :: (load 2, addrspace 4) + ; VI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; VI: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; VI: [[GEP:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C]](s64) ; VI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p4) :: (load 2, addrspace 4) + ; VI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; VI: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 ; VI: [[GEP1:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C1]](s64) ; VI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p4) :: (load 2, addrspace 4) - ; VI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) - ; VI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) - ; VI: 
[[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) - ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32) - ; VI: [[TRUNC:%[0-9]+]]:_(<3 x s16>) = G_TRUNC [[BUILD_VECTOR]](<3 x s32>) - ; VI: [[DEF:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF - ; VI: [[INSERT:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF]], [[TRUNC]](<3 x s16>), 0 + ; VI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) + ; VI: [[DEF:%[0-9]+]]:_(s16) = G_IMPLICIT_DEF + ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16) + ; VI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC2]](s16), [[DEF]](s16) + ; VI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; VI: [[EXTRACT:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[CONCAT_VECTORS]](<4 x s16>), 0 + ; VI: [[DEF1:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF + ; VI: [[INSERT:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF1]], [[EXTRACT]](<3 x s16>), 0 ; VI: $vgpr0_vgpr1 = COPY [[INSERT]](<4 x s16>) ; GFX9-LABEL: name: test_load_constant_v3s16_align2 ; GFX9: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 ; GFX9: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p4) :: (load 2, addrspace 4) + ; GFX9: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; GFX9: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; GFX9: [[GEP:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C]](s64) ; GFX9: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p4) :: (load 2, addrspace 4) + ; GFX9: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; GFX9: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 ; GFX9: [[GEP1:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C1]](s64) ; GFX9: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p4) :: (load 2, addrspace 4) - ; GFX9: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) - ; GFX9: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) - ; GFX9: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) - ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), 
[[COPY2]](s32), [[COPY3]](s32) - ; GFX9: [[TRUNC:%[0-9]+]]:_(<3 x s16>) = G_TRUNC [[BUILD_VECTOR]](<3 x s32>) - ; GFX9: [[DEF:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF - ; GFX9: [[INSERT:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF]], [[TRUNC]](<3 x s16>), 0 + ; GFX9: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) + ; GFX9: [[DEF:%[0-9]+]]:_(s16) = G_IMPLICIT_DEF + ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16) + ; GFX9: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC2]](s16), [[DEF]](s16) + ; GFX9: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; GFX9: [[EXTRACT:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[CONCAT_VECTORS]](<4 x s16>), 0 + ; GFX9: [[DEF1:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF + ; GFX9: [[INSERT:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF1]], [[EXTRACT]](<3 x s16>), 0 ; GFX9: $vgpr0_vgpr1 = COPY [[INSERT]](<4 x s16>) ; CI-MESA-LABEL: name: test_load_constant_v3s16_align2 ; CI-MESA: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 ; CI-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p4) :: (load 2, addrspace 4) + ; CI-MESA: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; CI-MESA: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; CI-MESA: [[GEP:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C]](s64) ; CI-MESA: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p4) :: (load 2, addrspace 4) + ; CI-MESA: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; CI-MESA: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 ; CI-MESA: [[GEP1:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C1]](s64) ; CI-MESA: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p4) :: (load 2, addrspace 4) - ; CI-MESA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) - ; CI-MESA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) - ; CI-MESA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) - ; CI-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32) - ; CI-MESA: 
[[TRUNC:%[0-9]+]]:_(<3 x s16>) = G_TRUNC [[BUILD_VECTOR]](<3 x s32>) - ; CI-MESA: [[DEF:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF - ; CI-MESA: [[INSERT:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF]], [[TRUNC]](<3 x s16>), 0 + ; CI-MESA: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) + ; CI-MESA: [[DEF:%[0-9]+]]:_(s16) = G_IMPLICIT_DEF + ; CI-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16) + ; CI-MESA: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC2]](s16), [[DEF]](s16) + ; CI-MESA: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; CI-MESA: [[EXTRACT:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[CONCAT_VECTORS]](<4 x s16>), 0 + ; CI-MESA: [[DEF1:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF + ; CI-MESA: [[INSERT:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF1]], [[EXTRACT]](<3 x s16>), 0 ; CI-MESA: $vgpr0_vgpr1 = COPY [[INSERT]](<4 x s16>) ; GFX9-MESA-LABEL: name: test_load_constant_v3s16_align2 ; GFX9-MESA: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 ; GFX9-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p4) :: (load 2, addrspace 4) + ; GFX9-MESA: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; GFX9-MESA: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; GFX9-MESA: [[GEP:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C]](s64) ; GFX9-MESA: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p4) :: (load 2, addrspace 4) + ; GFX9-MESA: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; GFX9-MESA: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 ; GFX9-MESA: [[GEP1:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C1]](s64) ; GFX9-MESA: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p4) :: (load 2, addrspace 4) - ; GFX9-MESA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) - ; GFX9-MESA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) - ; GFX9-MESA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) - ; GFX9-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32) - ; 
GFX9-MESA: [[TRUNC:%[0-9]+]]:_(<3 x s16>) = G_TRUNC [[BUILD_VECTOR]](<3 x s32>) - ; GFX9-MESA: [[DEF:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF - ; GFX9-MESA: [[INSERT:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF]], [[TRUNC]](<3 x s16>), 0 + ; GFX9-MESA: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) + ; GFX9-MESA: [[DEF:%[0-9]+]]:_(s16) = G_IMPLICIT_DEF + ; GFX9-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16) + ; GFX9-MESA: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC2]](s16), [[DEF]](s16) + ; GFX9-MESA: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; GFX9-MESA: [[EXTRACT:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[CONCAT_VECTORS]](<4 x s16>), 0 + ; GFX9-MESA: [[DEF1:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF + ; GFX9-MESA: [[INSERT:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF1]], [[EXTRACT]](<3 x s16>), 0 ; GFX9-MESA: $vgpr0_vgpr1 = COPY [[INSERT]](<4 x s16>) %0:_(p4) = COPY $vgpr0_vgpr1 %1:_(<3 x s16>) = G_LOAD %0 :: (load 6, align 2, addrspace 4) @@ -5653,8 +5668,10 @@ body: | ; CI: [[GEP2:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C2]](s64) ; CI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p4) :: (load 2, addrspace 4) ; CI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; CI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) - ; CI: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<4 x s16>) + ; CI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16) + ; CI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC2]](s16), [[TRUNC3]](s16) + ; CI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; CI: $vgpr0_vgpr1 = COPY [[CONCAT_VECTORS]](<4 x s16>) ; VI-LABEL: name: test_load_constant_v4s16_align2 ; VI: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 ; VI: [[LOAD:%[0-9]+]]:_(s32) = 
G_LOAD [[COPY]](p4) :: (load 2, addrspace 4) @@ -5671,8 +5688,10 @@ body: | ; VI: [[GEP2:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C2]](s64) ; VI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p4) :: (load 2, addrspace 4) ; VI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) - ; VI: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<4 x s16>) + ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16) + ; VI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC2]](s16), [[TRUNC3]](s16) + ; VI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; VI: $vgpr0_vgpr1 = COPY [[CONCAT_VECTORS]](<4 x s16>) ; GFX9-LABEL: name: test_load_constant_v4s16_align2 ; GFX9: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 ; GFX9: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p4) :: (load 2, addrspace 4) @@ -5689,8 +5708,10 @@ body: | ; GFX9: [[GEP2:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C2]](s64) ; GFX9: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p4) :: (load 2, addrspace 4) ; GFX9: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) - ; GFX9: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<4 x s16>) + ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16) + ; GFX9: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC2]](s16), [[TRUNC3]](s16) + ; GFX9: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; GFX9: $vgpr0_vgpr1 = COPY [[CONCAT_VECTORS]](<4 x s16>) ; CI-MESA-LABEL: name: test_load_constant_v4s16_align2 ; CI-MESA: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 ; CI-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p4) :: (load 2, 
addrspace 4) @@ -5707,8 +5728,10 @@ body: | ; CI-MESA: [[GEP2:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C2]](s64) ; CI-MESA: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p4) :: (load 2, addrspace 4) ; CI-MESA: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; CI-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) - ; CI-MESA: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<4 x s16>) + ; CI-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16) + ; CI-MESA: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC2]](s16), [[TRUNC3]](s16) + ; CI-MESA: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; CI-MESA: $vgpr0_vgpr1 = COPY [[CONCAT_VECTORS]](<4 x s16>) ; GFX9-MESA-LABEL: name: test_load_constant_v4s16_align2 ; GFX9-MESA: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 ; GFX9-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p4) :: (load 2, addrspace 4) @@ -5725,8 +5748,10 @@ body: | ; GFX9-MESA: [[GEP2:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C2]](s64) ; GFX9-MESA: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p4) :: (load 2, addrspace 4) ; GFX9-MESA: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; GFX9-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) - ; GFX9-MESA: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<4 x s16>) + ; GFX9-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16) + ; GFX9-MESA: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC2]](s16), [[TRUNC3]](s16) + ; GFX9-MESA: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; GFX9-MESA: $vgpr0_vgpr1 = COPY [[CONCAT_VECTORS]](<4 x s16>) %0:_(p4) = COPY $vgpr0_vgpr1 %1:_(<4 x s16>) = G_LOAD %0 :: (load 8, align 2, addrspace 4) $vgpr0_vgpr1 = COPY 
%1 @@ -6047,8 +6072,12 @@ body: | ; CI: [[GEP6:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C6]](s64) ; CI: [[LOAD7:%[0-9]+]]:_(s32) = G_LOAD [[GEP6]](p4) :: (load 2, addrspace 4) ; CI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) - ; CI: [[BUILD_VECTOR:%[0-9]+]]:_(<8 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16), [[TRUNC4]](s16), [[TRUNC5]](s16), [[TRUNC6]](s16), [[TRUNC7]](s16) - ; CI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BUILD_VECTOR]](<8 x s16>) + ; CI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16) + ; CI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC2]](s16), [[TRUNC3]](s16) + ; CI: [[BUILD_VECTOR2:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC4]](s16), [[TRUNC5]](s16) + ; CI: [[BUILD_VECTOR3:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC6]](s16), [[TRUNC7]](s16) + ; CI: [[CONCAT_VECTORS:%[0-9]+]]:_(<8 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>), [[BUILD_VECTOR2]](<2 x s16>), [[BUILD_VECTOR3]](<2 x s16>) + ; CI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[CONCAT_VECTORS]](<8 x s16>) ; VI-LABEL: name: test_load_constant_v8s16_align8 ; VI: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p4) :: (load 2, align 8, addrspace 4) @@ -6081,8 +6110,12 @@ body: | ; VI: [[GEP6:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C6]](s64) ; VI: [[LOAD7:%[0-9]+]]:_(s32) = G_LOAD [[GEP6]](p4) :: (load 2, addrspace 4) ; VI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) - ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<8 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16), [[TRUNC4]](s16), [[TRUNC5]](s16), [[TRUNC6]](s16), [[TRUNC7]](s16) - ; VI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BUILD_VECTOR]](<8 x s16>) + ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16) + ; VI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC2]](s16), 
[[TRUNC3]](s16) + ; VI: [[BUILD_VECTOR2:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC4]](s16), [[TRUNC5]](s16) + ; VI: [[BUILD_VECTOR3:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC6]](s16), [[TRUNC7]](s16) + ; VI: [[CONCAT_VECTORS:%[0-9]+]]:_(<8 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>), [[BUILD_VECTOR2]](<2 x s16>), [[BUILD_VECTOR3]](<2 x s16>) + ; VI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[CONCAT_VECTORS]](<8 x s16>) ; GFX9-LABEL: name: test_load_constant_v8s16_align8 ; GFX9: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 ; GFX9: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p4) :: (load 2, align 8, addrspace 4) @@ -6115,8 +6148,12 @@ body: | ; GFX9: [[GEP6:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C6]](s64) ; GFX9: [[LOAD7:%[0-9]+]]:_(s32) = G_LOAD [[GEP6]](p4) :: (load 2, addrspace 4) ; GFX9: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) - ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<8 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16), [[TRUNC4]](s16), [[TRUNC5]](s16), [[TRUNC6]](s16), [[TRUNC7]](s16) - ; GFX9: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BUILD_VECTOR]](<8 x s16>) + ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16) + ; GFX9: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC2]](s16), [[TRUNC3]](s16) + ; GFX9: [[BUILD_VECTOR2:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC4]](s16), [[TRUNC5]](s16) + ; GFX9: [[BUILD_VECTOR3:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC6]](s16), [[TRUNC7]](s16) + ; GFX9: [[CONCAT_VECTORS:%[0-9]+]]:_(<8 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>), [[BUILD_VECTOR2]](<2 x s16>), [[BUILD_VECTOR3]](<2 x s16>) + ; GFX9: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[CONCAT_VECTORS]](<8 x s16>) ; CI-MESA-LABEL: name: test_load_constant_v8s16_align8 ; CI-MESA: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 ; CI-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p4) :: (load 2, align 8, 
addrspace 4) @@ -6149,8 +6186,12 @@ body: | ; CI-MESA: [[GEP6:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C6]](s64) ; CI-MESA: [[LOAD7:%[0-9]+]]:_(s32) = G_LOAD [[GEP6]](p4) :: (load 2, addrspace 4) ; CI-MESA: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) - ; CI-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<8 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16), [[TRUNC4]](s16), [[TRUNC5]](s16), [[TRUNC6]](s16), [[TRUNC7]](s16) - ; CI-MESA: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BUILD_VECTOR]](<8 x s16>) + ; CI-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16) + ; CI-MESA: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC2]](s16), [[TRUNC3]](s16) + ; CI-MESA: [[BUILD_VECTOR2:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC4]](s16), [[TRUNC5]](s16) + ; CI-MESA: [[BUILD_VECTOR3:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC6]](s16), [[TRUNC7]](s16) + ; CI-MESA: [[CONCAT_VECTORS:%[0-9]+]]:_(<8 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>), [[BUILD_VECTOR2]](<2 x s16>), [[BUILD_VECTOR3]](<2 x s16>) + ; CI-MESA: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[CONCAT_VECTORS]](<8 x s16>) ; GFX9-MESA-LABEL: name: test_load_constant_v8s16_align8 ; GFX9-MESA: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1 ; GFX9-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p4) :: (load 2, align 8, addrspace 4) @@ -6183,8 +6224,12 @@ body: | ; GFX9-MESA: [[GEP6:%[0-9]+]]:_(p4) = G_GEP [[COPY]], [[C6]](s64) ; GFX9-MESA: [[LOAD7:%[0-9]+]]:_(s32) = G_LOAD [[GEP6]](p4) :: (load 2, addrspace 4) ; GFX9-MESA: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) - ; GFX9-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<8 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16), [[TRUNC4]](s16), [[TRUNC5]](s16), [[TRUNC6]](s16), [[TRUNC7]](s16) - ; GFX9-MESA: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BUILD_VECTOR]](<8 x s16>) + ; GFX9-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR 
[[TRUNC]](s16), [[TRUNC1]](s16) + ; GFX9-MESA: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC2]](s16), [[TRUNC3]](s16) + ; GFX9-MESA: [[BUILD_VECTOR2:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC4]](s16), [[TRUNC5]](s16) + ; GFX9-MESA: [[BUILD_VECTOR3:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC6]](s16), [[TRUNC7]](s16) + ; GFX9-MESA: [[CONCAT_VECTORS:%[0-9]+]]:_(<8 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>), [[BUILD_VECTOR2]](<2 x s16>), [[BUILD_VECTOR3]](<2 x s16>) + ; GFX9-MESA: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[CONCAT_VECTORS]](<8 x s16>) %0:_(p4) = COPY $vgpr0_vgpr1 %1:_(<8 x s16>) = G_LOAD %0 :: (load 8, align 8, addrspace 4) $vgpr0_vgpr1_vgpr2_vgpr3 = COPY %1 Modified: llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-load-flat.mir URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-load-flat.mir?rev=374252&r1=374251&r2=374252&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-load-flat.mir (original) +++ llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-load-flat.mir Wed Oct 9 15:44:43 2019 @@ -5295,87 +5295,102 @@ body: | ; CI-LABEL: name: test_load_flat_v3s16_align2 ; CI: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1 ; CI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p0) :: (load 2) + ; CI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; CI: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; CI: [[GEP:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C]](s64) ; CI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p0) :: (load 2) + ; CI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; CI: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 ; CI: [[GEP1:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C1]](s64) ; CI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p0) :: (load 2) - ; CI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) - ; CI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) - ; CI: 
[[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) - ; CI: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32) - ; CI: [[TRUNC:%[0-9]+]]:_(<3 x s16>) = G_TRUNC [[BUILD_VECTOR]](<3 x s32>) - ; CI: [[DEF:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF - ; CI: [[INSERT:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF]], [[TRUNC]](<3 x s16>), 0 + ; CI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) + ; CI: [[DEF:%[0-9]+]]:_(s16) = G_IMPLICIT_DEF + ; CI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16) + ; CI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC2]](s16), [[DEF]](s16) + ; CI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; CI: [[EXTRACT:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[CONCAT_VECTORS]](<4 x s16>), 0 + ; CI: [[DEF1:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF + ; CI: [[INSERT:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF1]], [[EXTRACT]](<3 x s16>), 0 ; CI: $vgpr0_vgpr1 = COPY [[INSERT]](<4 x s16>) ; VI-LABEL: name: test_load_flat_v3s16_align2 ; VI: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1 ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p0) :: (load 2) + ; VI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; VI: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; VI: [[GEP:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C]](s64) ; VI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p0) :: (load 2) + ; VI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; VI: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 ; VI: [[GEP1:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C1]](s64) ; VI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p0) :: (load 2) - ; VI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) - ; VI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) - ; VI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) - ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32) - ; VI: [[TRUNC:%[0-9]+]]:_(<3 x s16>) = 
G_TRUNC [[BUILD_VECTOR]](<3 x s32>) - ; VI: [[DEF:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF - ; VI: [[INSERT:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF]], [[TRUNC]](<3 x s16>), 0 + ; VI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) + ; VI: [[DEF:%[0-9]+]]:_(s16) = G_IMPLICIT_DEF + ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16) + ; VI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC2]](s16), [[DEF]](s16) + ; VI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; VI: [[EXTRACT:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[CONCAT_VECTORS]](<4 x s16>), 0 + ; VI: [[DEF1:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF + ; VI: [[INSERT:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF1]], [[EXTRACT]](<3 x s16>), 0 ; VI: $vgpr0_vgpr1 = COPY [[INSERT]](<4 x s16>) ; GFX9-LABEL: name: test_load_flat_v3s16_align2 ; GFX9: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1 ; GFX9: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p0) :: (load 2) + ; GFX9: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; GFX9: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; GFX9: [[GEP:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C]](s64) ; GFX9: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p0) :: (load 2) + ; GFX9: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; GFX9: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 ; GFX9: [[GEP1:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C1]](s64) ; GFX9: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p0) :: (load 2) - ; GFX9: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) - ; GFX9: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) - ; GFX9: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) - ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32) - ; GFX9: [[TRUNC:%[0-9]+]]:_(<3 x s16>) = G_TRUNC [[BUILD_VECTOR]](<3 x s32>) - ; GFX9: [[DEF:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF - ; GFX9: [[INSERT:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF]], [[TRUNC]](<3 x 
s16>), 0 + ; GFX9: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) + ; GFX9: [[DEF:%[0-9]+]]:_(s16) = G_IMPLICIT_DEF + ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16) + ; GFX9: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC2]](s16), [[DEF]](s16) + ; GFX9: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; GFX9: [[EXTRACT:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[CONCAT_VECTORS]](<4 x s16>), 0 + ; GFX9: [[DEF1:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF + ; GFX9: [[INSERT:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF1]], [[EXTRACT]](<3 x s16>), 0 ; GFX9: $vgpr0_vgpr1 = COPY [[INSERT]](<4 x s16>) ; CI-MESA-LABEL: name: test_load_flat_v3s16_align2 ; CI-MESA: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1 ; CI-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p0) :: (load 2) + ; CI-MESA: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; CI-MESA: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; CI-MESA: [[GEP:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C]](s64) ; CI-MESA: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p0) :: (load 2) + ; CI-MESA: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; CI-MESA: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 ; CI-MESA: [[GEP1:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C1]](s64) ; CI-MESA: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p0) :: (load 2) - ; CI-MESA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) - ; CI-MESA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) - ; CI-MESA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) - ; CI-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32) - ; CI-MESA: [[TRUNC:%[0-9]+]]:_(<3 x s16>) = G_TRUNC [[BUILD_VECTOR]](<3 x s32>) - ; CI-MESA: [[DEF:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF - ; CI-MESA: [[INSERT:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF]], [[TRUNC]](<3 x s16>), 0 + ; CI-MESA: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) + ; CI-MESA: 
[[DEF:%[0-9]+]]:_(s16) = G_IMPLICIT_DEF + ; CI-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16) + ; CI-MESA: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC2]](s16), [[DEF]](s16) + ; CI-MESA: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; CI-MESA: [[EXTRACT:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[CONCAT_VECTORS]](<4 x s16>), 0 + ; CI-MESA: [[DEF1:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF + ; CI-MESA: [[INSERT:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF1]], [[EXTRACT]](<3 x s16>), 0 ; CI-MESA: $vgpr0_vgpr1 = COPY [[INSERT]](<4 x s16>) ; GFX9-MESA-LABEL: name: test_load_flat_v3s16_align2 ; GFX9-MESA: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1 ; GFX9-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p0) :: (load 2) + ; GFX9-MESA: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; GFX9-MESA: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; GFX9-MESA: [[GEP:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C]](s64) ; GFX9-MESA: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p0) :: (load 2) + ; GFX9-MESA: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; GFX9-MESA: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 ; GFX9-MESA: [[GEP1:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C1]](s64) ; GFX9-MESA: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p0) :: (load 2) - ; GFX9-MESA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) - ; GFX9-MESA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) - ; GFX9-MESA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) - ; GFX9-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32) - ; GFX9-MESA: [[TRUNC:%[0-9]+]]:_(<3 x s16>) = G_TRUNC [[BUILD_VECTOR]](<3 x s32>) - ; GFX9-MESA: [[DEF:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF - ; GFX9-MESA: [[INSERT:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF]], [[TRUNC]](<3 x s16>), 0 + ; GFX9-MESA: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) + ; GFX9-MESA: [[DEF:%[0-9]+]]:_(s16) = 
G_IMPLICIT_DEF + ; GFX9-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16) + ; GFX9-MESA: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC2]](s16), [[DEF]](s16) + ; GFX9-MESA: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; GFX9-MESA: [[EXTRACT:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[CONCAT_VECTORS]](<4 x s16>), 0 + ; GFX9-MESA: [[DEF1:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF + ; GFX9-MESA: [[INSERT:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF1]], [[EXTRACT]](<3 x s16>), 0 ; GFX9-MESA: $vgpr0_vgpr1 = COPY [[INSERT]](<4 x s16>) %0:_(p0) = COPY $vgpr0_vgpr1 %1:_(<3 x s16>) = G_LOAD %0 :: (load 6, align 2, addrspace 0) @@ -5743,8 +5758,10 @@ body: | ; CI: [[GEP2:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C2]](s64) ; CI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p0) :: (load 2) ; CI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; CI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) - ; CI: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<4 x s16>) + ; CI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16) + ; CI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC2]](s16), [[TRUNC3]](s16) + ; CI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; CI: $vgpr0_vgpr1 = COPY [[CONCAT_VECTORS]](<4 x s16>) ; VI-LABEL: name: test_load_flat_v4s16_align2 ; VI: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1 ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p0) :: (load 2) @@ -5761,8 +5778,10 @@ body: | ; VI: [[GEP2:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C2]](s64) ; VI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p0) :: (load 2) ; VI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16), 
[[TRUNC2]](s16), [[TRUNC3]](s16) - ; VI: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<4 x s16>) + ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16) + ; VI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC2]](s16), [[TRUNC3]](s16) + ; VI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; VI: $vgpr0_vgpr1 = COPY [[CONCAT_VECTORS]](<4 x s16>) ; GFX9-LABEL: name: test_load_flat_v4s16_align2 ; GFX9: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1 ; GFX9: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p0) :: (load 2) @@ -5779,8 +5798,10 @@ body: | ; GFX9: [[GEP2:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C2]](s64) ; GFX9: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p0) :: (load 2) ; GFX9: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) - ; GFX9: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<4 x s16>) + ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16) + ; GFX9: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC2]](s16), [[TRUNC3]](s16) + ; GFX9: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; GFX9: $vgpr0_vgpr1 = COPY [[CONCAT_VECTORS]](<4 x s16>) ; CI-MESA-LABEL: name: test_load_flat_v4s16_align2 ; CI-MESA: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1 ; CI-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p0) :: (load 2) @@ -5797,8 +5818,10 @@ body: | ; CI-MESA: [[GEP2:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C2]](s64) ; CI-MESA: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p0) :: (load 2) ; CI-MESA: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; CI-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) - ; CI-MESA: $vgpr0_vgpr1 = COPY 
[[BUILD_VECTOR]](<4 x s16>) + ; CI-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16) + ; CI-MESA: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC2]](s16), [[TRUNC3]](s16) + ; CI-MESA: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; CI-MESA: $vgpr0_vgpr1 = COPY [[CONCAT_VECTORS]](<4 x s16>) ; GFX9-MESA-LABEL: name: test_load_flat_v4s16_align2 ; GFX9-MESA: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1 ; GFX9-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p0) :: (load 2) @@ -5815,8 +5838,10 @@ body: | ; GFX9-MESA: [[GEP2:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C2]](s64) ; GFX9-MESA: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p0) :: (load 2) ; GFX9-MESA: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; GFX9-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) - ; GFX9-MESA: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<4 x s16>) + ; GFX9-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16) + ; GFX9-MESA: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC2]](s16), [[TRUNC3]](s16) + ; GFX9-MESA: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; GFX9-MESA: $vgpr0_vgpr1 = COPY [[CONCAT_VECTORS]](<4 x s16>) %0:_(p0) = COPY $vgpr0_vgpr1 %1:_(<4 x s16>) = G_LOAD %0 :: (load 8, align 2, addrspace 0) $vgpr0_vgpr1 = COPY %1 @@ -6137,8 +6162,12 @@ body: | ; CI: [[GEP6:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C6]](s64) ; CI: [[LOAD7:%[0-9]+]]:_(s32) = G_LOAD [[GEP6]](p0) :: (load 2) ; CI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) - ; CI: [[BUILD_VECTOR:%[0-9]+]]:_(<8 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16), [[TRUNC4]](s16), [[TRUNC5]](s16), [[TRUNC6]](s16), [[TRUNC7]](s16) - ; CI: $vgpr0_vgpr1_vgpr2_vgpr3 = 
COPY [[BUILD_VECTOR]](<8 x s16>) + ; CI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16) + ; CI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC2]](s16), [[TRUNC3]](s16) + ; CI: [[BUILD_VECTOR2:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC4]](s16), [[TRUNC5]](s16) + ; CI: [[BUILD_VECTOR3:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC6]](s16), [[TRUNC7]](s16) + ; CI: [[CONCAT_VECTORS:%[0-9]+]]:_(<8 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>), [[BUILD_VECTOR2]](<2 x s16>), [[BUILD_VECTOR3]](<2 x s16>) + ; CI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[CONCAT_VECTORS]](<8 x s16>) ; VI-LABEL: name: test_load_flat_v8s16_align8 ; VI: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1 ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p0) :: (load 2, align 8) @@ -6171,8 +6200,12 @@ body: | ; VI: [[GEP6:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C6]](s64) ; VI: [[LOAD7:%[0-9]+]]:_(s32) = G_LOAD [[GEP6]](p0) :: (load 2) ; VI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) - ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<8 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16), [[TRUNC4]](s16), [[TRUNC5]](s16), [[TRUNC6]](s16), [[TRUNC7]](s16) - ; VI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BUILD_VECTOR]](<8 x s16>) + ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16) + ; VI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC2]](s16), [[TRUNC3]](s16) + ; VI: [[BUILD_VECTOR2:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC4]](s16), [[TRUNC5]](s16) + ; VI: [[BUILD_VECTOR3:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC6]](s16), [[TRUNC7]](s16) + ; VI: [[CONCAT_VECTORS:%[0-9]+]]:_(<8 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>), [[BUILD_VECTOR2]](<2 x s16>), [[BUILD_VECTOR3]](<2 x s16>) + ; VI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[CONCAT_VECTORS]](<8 x s16>) ; GFX9-LABEL: name: 
test_load_flat_v8s16_align8 ; GFX9: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1 ; GFX9: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p0) :: (load 2, align 8) @@ -6205,8 +6238,12 @@ body: | ; GFX9: [[GEP6:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C6]](s64) ; GFX9: [[LOAD7:%[0-9]+]]:_(s32) = G_LOAD [[GEP6]](p0) :: (load 2) ; GFX9: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) - ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<8 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16), [[TRUNC4]](s16), [[TRUNC5]](s16), [[TRUNC6]](s16), [[TRUNC7]](s16) - ; GFX9: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BUILD_VECTOR]](<8 x s16>) + ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16) + ; GFX9: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC2]](s16), [[TRUNC3]](s16) + ; GFX9: [[BUILD_VECTOR2:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC4]](s16), [[TRUNC5]](s16) + ; GFX9: [[BUILD_VECTOR3:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC6]](s16), [[TRUNC7]](s16) + ; GFX9: [[CONCAT_VECTORS:%[0-9]+]]:_(<8 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>), [[BUILD_VECTOR2]](<2 x s16>), [[BUILD_VECTOR3]](<2 x s16>) + ; GFX9: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[CONCAT_VECTORS]](<8 x s16>) ; CI-MESA-LABEL: name: test_load_flat_v8s16_align8 ; CI-MESA: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1 ; CI-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p0) :: (load 2, align 8) @@ -6239,8 +6276,12 @@ body: | ; CI-MESA: [[GEP6:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C6]](s64) ; CI-MESA: [[LOAD7:%[0-9]+]]:_(s32) = G_LOAD [[GEP6]](p0) :: (load 2) ; CI-MESA: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) - ; CI-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<8 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16), [[TRUNC4]](s16), [[TRUNC5]](s16), [[TRUNC6]](s16), [[TRUNC7]](s16) - ; CI-MESA: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BUILD_VECTOR]](<8 x s16>) + ; CI-MESA: 
[[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16) + ; CI-MESA: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC2]](s16), [[TRUNC3]](s16) + ; CI-MESA: [[BUILD_VECTOR2:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC4]](s16), [[TRUNC5]](s16) + ; CI-MESA: [[BUILD_VECTOR3:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC6]](s16), [[TRUNC7]](s16) + ; CI-MESA: [[CONCAT_VECTORS:%[0-9]+]]:_(<8 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>), [[BUILD_VECTOR2]](<2 x s16>), [[BUILD_VECTOR3]](<2 x s16>) + ; CI-MESA: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[CONCAT_VECTORS]](<8 x s16>) ; GFX9-MESA-LABEL: name: test_load_flat_v8s16_align8 ; GFX9-MESA: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1 ; GFX9-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p0) :: (load 2, align 8) @@ -6273,8 +6314,12 @@ body: | ; GFX9-MESA: [[GEP6:%[0-9]+]]:_(p0) = G_GEP [[COPY]], [[C6]](s64) ; GFX9-MESA: [[LOAD7:%[0-9]+]]:_(s32) = G_LOAD [[GEP6]](p0) :: (load 2) ; GFX9-MESA: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) - ; GFX9-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<8 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16), [[TRUNC4]](s16), [[TRUNC5]](s16), [[TRUNC6]](s16), [[TRUNC7]](s16) - ; GFX9-MESA: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BUILD_VECTOR]](<8 x s16>) + ; GFX9-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16) + ; GFX9-MESA: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC2]](s16), [[TRUNC3]](s16) + ; GFX9-MESA: [[BUILD_VECTOR2:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC4]](s16), [[TRUNC5]](s16) + ; GFX9-MESA: [[BUILD_VECTOR3:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC6]](s16), [[TRUNC7]](s16) + ; GFX9-MESA: [[CONCAT_VECTORS:%[0-9]+]]:_(<8 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>), [[BUILD_VECTOR2]](<2 x s16>), [[BUILD_VECTOR3]](<2 x s16>) + ; GFX9-MESA: 
$vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[CONCAT_VECTORS]](<8 x s16>) %0:_(p0) = COPY $vgpr0_vgpr1 %1:_(<8 x s16>) = G_LOAD %0 :: (load 8, align 8, addrspace 0) $vgpr0_vgpr1_vgpr2_vgpr3 = COPY %1 Modified: llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-load-global.mir URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-load-global.mir?rev=374252&r1=374251&r2=374252&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-load-global.mir (original) +++ llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-load-global.mir Wed Oct 9 15:44:43 2019 @@ -5021,19 +5021,22 @@ body: | ; SI-LABEL: name: test_load_global_v3s16_align2 ; SI: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 ; SI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 2, addrspace 1) + ; SI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; SI: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; SI: [[GEP:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C]](s64) ; SI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p1) :: (load 2, addrspace 1) + ; SI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; SI: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 ; SI: [[GEP1:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C1]](s64) ; SI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p1) :: (load 2, addrspace 1) - ; SI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) - ; SI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) - ; SI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) - ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32) - ; SI: [[TRUNC:%[0-9]+]]:_(<3 x s16>) = G_TRUNC [[BUILD_VECTOR]](<3 x s32>) - ; SI: [[DEF:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF - ; SI: [[INSERT:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF]], [[TRUNC]](<3 x s16>), 0 + ; SI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) + ; SI: [[DEF:%[0-9]+]]:_(s16) = G_IMPLICIT_DEF + ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = 
G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16) + ; SI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC2]](s16), [[DEF]](s16) + ; SI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; SI: [[EXTRACT:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[CONCAT_VECTORS]](<4 x s16>), 0 + ; SI: [[DEF1:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF + ; SI: [[INSERT:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF1]], [[EXTRACT]](<3 x s16>), 0 ; SI: $vgpr0_vgpr1 = COPY [[INSERT]](<4 x s16>) ; CI-HSA-LABEL: name: test_load_global_v3s16_align2 ; CI-HSA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 @@ -5044,36 +5047,42 @@ body: | ; CI-MESA-LABEL: name: test_load_global_v3s16_align2 ; CI-MESA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 ; CI-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 2, addrspace 1) + ; CI-MESA: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; CI-MESA: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; CI-MESA: [[GEP:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C]](s64) ; CI-MESA: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p1) :: (load 2, addrspace 1) + ; CI-MESA: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; CI-MESA: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 ; CI-MESA: [[GEP1:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C1]](s64) ; CI-MESA: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p1) :: (load 2, addrspace 1) - ; CI-MESA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) - ; CI-MESA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) - ; CI-MESA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) - ; CI-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32) - ; CI-MESA: [[TRUNC:%[0-9]+]]:_(<3 x s16>) = G_TRUNC [[BUILD_VECTOR]](<3 x s32>) - ; CI-MESA: [[DEF:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF - ; CI-MESA: [[INSERT:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF]], [[TRUNC]](<3 x s16>), 0 + ; CI-MESA: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) + ; CI-MESA: 
[[DEF:%[0-9]+]]:_(s16) = G_IMPLICIT_DEF + ; CI-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16) + ; CI-MESA: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC2]](s16), [[DEF]](s16) + ; CI-MESA: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; CI-MESA: [[EXTRACT:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[CONCAT_VECTORS]](<4 x s16>), 0 + ; CI-MESA: [[DEF1:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF + ; CI-MESA: [[INSERT:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF1]], [[EXTRACT]](<3 x s16>), 0 ; CI-MESA: $vgpr0_vgpr1 = COPY [[INSERT]](<4 x s16>) ; VI-LABEL: name: test_load_global_v3s16_align2 ; VI: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 2, addrspace 1) + ; VI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; VI: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; VI: [[GEP:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C]](s64) ; VI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p1) :: (load 2, addrspace 1) + ; VI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; VI: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 ; VI: [[GEP1:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C1]](s64) ; VI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p1) :: (load 2, addrspace 1) - ; VI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) - ; VI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) - ; VI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) - ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32) - ; VI: [[TRUNC:%[0-9]+]]:_(<3 x s16>) = G_TRUNC [[BUILD_VECTOR]](<3 x s32>) - ; VI: [[DEF:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF - ; VI: [[INSERT:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF]], [[TRUNC]](<3 x s16>), 0 + ; VI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) + ; VI: [[DEF:%[0-9]+]]:_(s16) = G_IMPLICIT_DEF + ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), 
[[TRUNC1]](s16) + ; VI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC2]](s16), [[DEF]](s16) + ; VI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; VI: [[EXTRACT:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[CONCAT_VECTORS]](<4 x s16>), 0 + ; VI: [[DEF1:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF + ; VI: [[INSERT:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF1]], [[EXTRACT]](<3 x s16>), 0 ; VI: $vgpr0_vgpr1 = COPY [[INSERT]](<4 x s16>) ; GFX9-HSA-LABEL: name: test_load_global_v3s16_align2 ; GFX9-HSA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 @@ -5084,19 +5093,22 @@ body: | ; GFX9-MESA-LABEL: name: test_load_global_v3s16_align2 ; GFX9-MESA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 ; GFX9-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 2, addrspace 1) + ; GFX9-MESA: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; GFX9-MESA: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 ; GFX9-MESA: [[GEP:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C]](s64) ; GFX9-MESA: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p1) :: (load 2, addrspace 1) + ; GFX9-MESA: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; GFX9-MESA: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 4 ; GFX9-MESA: [[GEP1:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C1]](s64) ; GFX9-MESA: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p1) :: (load 2, addrspace 1) - ; GFX9-MESA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) - ; GFX9-MESA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) - ; GFX9-MESA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) - ; GFX9-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32) - ; GFX9-MESA: [[TRUNC:%[0-9]+]]:_(<3 x s16>) = G_TRUNC [[BUILD_VECTOR]](<3 x s32>) - ; GFX9-MESA: [[DEF:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF - ; GFX9-MESA: [[INSERT:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF]], [[TRUNC]](<3 x s16>), 0 + ; GFX9-MESA: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) + ; 
GFX9-MESA: [[DEF:%[0-9]+]]:_(s16) = G_IMPLICIT_DEF + ; GFX9-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16) + ; GFX9-MESA: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC2]](s16), [[DEF]](s16) + ; GFX9-MESA: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; GFX9-MESA: [[EXTRACT:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[CONCAT_VECTORS]](<4 x s16>), 0 + ; GFX9-MESA: [[DEF1:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF + ; GFX9-MESA: [[INSERT:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF1]], [[EXTRACT]](<3 x s16>), 0 ; GFX9-MESA: $vgpr0_vgpr1 = COPY [[INSERT]](<4 x s16>) %0:_(p1) = COPY $vgpr0_vgpr1 %1:_(<3 x s16>) = G_LOAD %0 :: (load 6, align 2, addrspace 1) @@ -5434,8 +5446,10 @@ body: | ; SI: [[GEP2:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C2]](s64) ; SI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p1) :: (load 2, addrspace 1) ; SI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) - ; SI: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<4 x s16>) + ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16) + ; SI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC2]](s16), [[TRUNC3]](s16) + ; SI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; SI: $vgpr0_vgpr1 = COPY [[CONCAT_VECTORS]](<4 x s16>) ; CI-HSA-LABEL: name: test_load_global_v4s16_align2 ; CI-HSA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 ; CI-HSA: [[LOAD:%[0-9]+]]:_(<4 x s16>) = G_LOAD [[COPY]](p1) :: (load 8, align 2, addrspace 1) @@ -5456,8 +5470,10 @@ body: | ; CI-MESA: [[GEP2:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C2]](s64) ; CI-MESA: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p1) :: (load 2, addrspace 1) ; CI-MESA: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC 
[[LOAD3]](s32) - ; CI-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) - ; CI-MESA: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<4 x s16>) + ; CI-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16) + ; CI-MESA: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC2]](s16), [[TRUNC3]](s16) + ; CI-MESA: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; CI-MESA: $vgpr0_vgpr1 = COPY [[CONCAT_VECTORS]](<4 x s16>) ; VI-LABEL: name: test_load_global_v4s16_align2 ; VI: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 2, addrspace 1) @@ -5474,8 +5490,10 @@ body: | ; VI: [[GEP2:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C2]](s64) ; VI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p1) :: (load 2, addrspace 1) ; VI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) - ; VI: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<4 x s16>) + ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16) + ; VI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC2]](s16), [[TRUNC3]](s16) + ; VI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; VI: $vgpr0_vgpr1 = COPY [[CONCAT_VECTORS]](<4 x s16>) ; GFX9-HSA-LABEL: name: test_load_global_v4s16_align2 ; GFX9-HSA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 ; GFX9-HSA: [[LOAD:%[0-9]+]]:_(<4 x s16>) = G_LOAD [[COPY]](p1) :: (load 8, align 2, addrspace 1) @@ -5496,8 +5514,10 @@ body: | ; GFX9-MESA: [[GEP2:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C2]](s64) ; GFX9-MESA: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p1) :: (load 2, addrspace 1) ; GFX9-MESA: [[TRUNC3:%[0-9]+]]:_(s16) = 
G_TRUNC [[LOAD3]](s32) - ; GFX9-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) - ; GFX9-MESA: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<4 x s16>) + ; GFX9-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16) + ; GFX9-MESA: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC2]](s16), [[TRUNC3]](s16) + ; GFX9-MESA: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; GFX9-MESA: $vgpr0_vgpr1 = COPY [[CONCAT_VECTORS]](<4 x s16>) %0:_(p1) = COPY $vgpr0_vgpr1 %1:_(<4 x s16>) = G_LOAD %0 :: (load 8, align 2, addrspace 1) $vgpr0_vgpr1 = COPY %1 @@ -5776,8 +5796,12 @@ body: | ; SI: [[GEP6:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C6]](s64) ; SI: [[LOAD7:%[0-9]+]]:_(s32) = G_LOAD [[GEP6]](p1) :: (load 2, addrspace 1) ; SI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) - ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<8 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16), [[TRUNC4]](s16), [[TRUNC5]](s16), [[TRUNC6]](s16), [[TRUNC7]](s16) - ; SI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BUILD_VECTOR]](<8 x s16>) + ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16) + ; SI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC2]](s16), [[TRUNC3]](s16) + ; SI: [[BUILD_VECTOR2:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC4]](s16), [[TRUNC5]](s16) + ; SI: [[BUILD_VECTOR3:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC6]](s16), [[TRUNC7]](s16) + ; SI: [[CONCAT_VECTORS:%[0-9]+]]:_(<8 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>), [[BUILD_VECTOR2]](<2 x s16>), [[BUILD_VECTOR3]](<2 x s16>) + ; SI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[CONCAT_VECTORS]](<8 x s16>) ; CI-HSA-LABEL: name: test_load_global_v8s16_align8 ; CI-HSA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 ; CI-HSA: 
[[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 2, align 8, addrspace 1) @@ -5810,8 +5834,12 @@ body: | ; CI-HSA: [[GEP6:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C6]](s64) ; CI-HSA: [[LOAD7:%[0-9]+]]:_(s32) = G_LOAD [[GEP6]](p1) :: (load 2, addrspace 1) ; CI-HSA: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) - ; CI-HSA: [[BUILD_VECTOR:%[0-9]+]]:_(<8 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16), [[TRUNC4]](s16), [[TRUNC5]](s16), [[TRUNC6]](s16), [[TRUNC7]](s16) - ; CI-HSA: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BUILD_VECTOR]](<8 x s16>) + ; CI-HSA: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16) + ; CI-HSA: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC2]](s16), [[TRUNC3]](s16) + ; CI-HSA: [[BUILD_VECTOR2:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC4]](s16), [[TRUNC5]](s16) + ; CI-HSA: [[BUILD_VECTOR3:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC6]](s16), [[TRUNC7]](s16) + ; CI-HSA: [[CONCAT_VECTORS:%[0-9]+]]:_(<8 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>), [[BUILD_VECTOR2]](<2 x s16>), [[BUILD_VECTOR3]](<2 x s16>) + ; CI-HSA: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[CONCAT_VECTORS]](<8 x s16>) ; CI-MESA-LABEL: name: test_load_global_v8s16_align8 ; CI-MESA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 ; CI-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 2, align 8, addrspace 1) @@ -5844,8 +5872,12 @@ body: | ; CI-MESA: [[GEP6:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C6]](s64) ; CI-MESA: [[LOAD7:%[0-9]+]]:_(s32) = G_LOAD [[GEP6]](p1) :: (load 2, addrspace 1) ; CI-MESA: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) - ; CI-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<8 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16), [[TRUNC4]](s16), [[TRUNC5]](s16), [[TRUNC6]](s16), [[TRUNC7]](s16) - ; CI-MESA: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BUILD_VECTOR]](<8 x s16>) + ; CI-MESA: 
[[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16) + ; CI-MESA: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC2]](s16), [[TRUNC3]](s16) + ; CI-MESA: [[BUILD_VECTOR2:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC4]](s16), [[TRUNC5]](s16) + ; CI-MESA: [[BUILD_VECTOR3:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC6]](s16), [[TRUNC7]](s16) + ; CI-MESA: [[CONCAT_VECTORS:%[0-9]+]]:_(<8 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>), [[BUILD_VECTOR2]](<2 x s16>), [[BUILD_VECTOR3]](<2 x s16>) + ; CI-MESA: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[CONCAT_VECTORS]](<8 x s16>) ; VI-LABEL: name: test_load_global_v8s16_align8 ; VI: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 2, align 8, addrspace 1) @@ -5878,8 +5910,12 @@ body: | ; VI: [[GEP6:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C6]](s64) ; VI: [[LOAD7:%[0-9]+]]:_(s32) = G_LOAD [[GEP6]](p1) :: (load 2, addrspace 1) ; VI: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) - ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<8 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16), [[TRUNC4]](s16), [[TRUNC5]](s16), [[TRUNC6]](s16), [[TRUNC7]](s16) - ; VI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BUILD_VECTOR]](<8 x s16>) + ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16) + ; VI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC2]](s16), [[TRUNC3]](s16) + ; VI: [[BUILD_VECTOR2:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC4]](s16), [[TRUNC5]](s16) + ; VI: [[BUILD_VECTOR3:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC6]](s16), [[TRUNC7]](s16) + ; VI: [[CONCAT_VECTORS:%[0-9]+]]:_(<8 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>), [[BUILD_VECTOR2]](<2 x s16>), [[BUILD_VECTOR3]](<2 x s16>) + ; VI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[CONCAT_VECTORS]](<8 x s16>) ; GFX9-HSA-LABEL: name: 
test_load_global_v8s16_align8 ; GFX9-HSA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 ; GFX9-HSA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 2, align 8, addrspace 1) @@ -5912,8 +5948,12 @@ body: | ; GFX9-HSA: [[GEP6:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C6]](s64) ; GFX9-HSA: [[LOAD7:%[0-9]+]]:_(s32) = G_LOAD [[GEP6]](p1) :: (load 2, addrspace 1) ; GFX9-HSA: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) - ; GFX9-HSA: [[BUILD_VECTOR:%[0-9]+]]:_(<8 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16), [[TRUNC4]](s16), [[TRUNC5]](s16), [[TRUNC6]](s16), [[TRUNC7]](s16) - ; GFX9-HSA: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BUILD_VECTOR]](<8 x s16>) + ; GFX9-HSA: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16) + ; GFX9-HSA: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC2]](s16), [[TRUNC3]](s16) + ; GFX9-HSA: [[BUILD_VECTOR2:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC4]](s16), [[TRUNC5]](s16) + ; GFX9-HSA: [[BUILD_VECTOR3:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC6]](s16), [[TRUNC7]](s16) + ; GFX9-HSA: [[CONCAT_VECTORS:%[0-9]+]]:_(<8 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>), [[BUILD_VECTOR2]](<2 x s16>), [[BUILD_VECTOR3]](<2 x s16>) + ; GFX9-HSA: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[CONCAT_VECTORS]](<8 x s16>) ; GFX9-MESA-LABEL: name: test_load_global_v8s16_align8 ; GFX9-MESA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1 ; GFX9-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 2, align 8, addrspace 1) @@ -5946,8 +5986,12 @@ body: | ; GFX9-MESA: [[GEP6:%[0-9]+]]:_(p1) = G_GEP [[COPY]], [[C6]](s64) ; GFX9-MESA: [[LOAD7:%[0-9]+]]:_(s32) = G_LOAD [[GEP6]](p1) :: (load 2, addrspace 1) ; GFX9-MESA: [[TRUNC7:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD7]](s32) - ; GFX9-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<8 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16), [[TRUNC4]](s16), [[TRUNC5]](s16), 
[[TRUNC6]](s16), [[TRUNC7]](s16) - ; GFX9-MESA: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BUILD_VECTOR]](<8 x s16>) + ; GFX9-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16) + ; GFX9-MESA: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC2]](s16), [[TRUNC3]](s16) + ; GFX9-MESA: [[BUILD_VECTOR2:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC4]](s16), [[TRUNC5]](s16) + ; GFX9-MESA: [[BUILD_VECTOR3:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC6]](s16), [[TRUNC7]](s16) + ; GFX9-MESA: [[CONCAT_VECTORS:%[0-9]+]]:_(<8 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>), [[BUILD_VECTOR2]](<2 x s16>), [[BUILD_VECTOR3]](<2 x s16>) + ; GFX9-MESA: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[CONCAT_VECTORS]](<8 x s16>) %0:_(p1) = COPY $vgpr0_vgpr1 %1:_(<8 x s16>) = G_LOAD %0 :: (load 8, align 8, addrspace 1) $vgpr0_vgpr1_vgpr2_vgpr3 = COPY %1 Modified: llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-load-local.mir URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-load-local.mir?rev=374252&r1=374251&r2=374252&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-load-local.mir (original) +++ llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-load-local.mir Wed Oct 9 15:44:43 2019 @@ -6325,87 +6325,102 @@ body: | ; SI-LABEL: name: test_load_local_v3s16_align2 ; SI: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 ; SI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 2, addrspace 3) + ; SI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; SI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; SI: [[GEP:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C]](s32) ; SI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p3) :: (load 2, addrspace 3) + ; SI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; SI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 ; SI: [[GEP1:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C1]](s32) ; 
SI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p3) :: (load 2, addrspace 3) - ; SI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) - ; SI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) - ; SI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) - ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32) - ; SI: [[TRUNC:%[0-9]+]]:_(<3 x s16>) = G_TRUNC [[BUILD_VECTOR]](<3 x s32>) - ; SI: [[DEF:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF - ; SI: [[INSERT:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF]], [[TRUNC]](<3 x s16>), 0 + ; SI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) + ; SI: [[DEF:%[0-9]+]]:_(s16) = G_IMPLICIT_DEF + ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16) + ; SI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC2]](s16), [[DEF]](s16) + ; SI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; SI: [[EXTRACT:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[CONCAT_VECTORS]](<4 x s16>), 0 + ; SI: [[DEF1:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF + ; SI: [[INSERT:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF1]], [[EXTRACT]](<3 x s16>), 0 ; SI: $vgpr0_vgpr1 = COPY [[INSERT]](<4 x s16>) ; CI-LABEL: name: test_load_local_v3s16_align2 ; CI: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 ; CI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 2, addrspace 3) + ; CI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; CI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; CI: [[GEP:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C]](s32) ; CI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p3) :: (load 2, addrspace 3) + ; CI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; CI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 ; CI: [[GEP1:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C1]](s32) ; CI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p3) :: (load 2, addrspace 3) - ; CI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) - ; CI: [[COPY2:%[0-9]+]]:_(s32) = 
COPY [[LOAD1]](s32) - ; CI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) - ; CI: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32) - ; CI: [[TRUNC:%[0-9]+]]:_(<3 x s16>) = G_TRUNC [[BUILD_VECTOR]](<3 x s32>) - ; CI: [[DEF:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF - ; CI: [[INSERT:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF]], [[TRUNC]](<3 x s16>), 0 + ; CI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) + ; CI: [[DEF:%[0-9]+]]:_(s16) = G_IMPLICIT_DEF + ; CI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16) + ; CI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC2]](s16), [[DEF]](s16) + ; CI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; CI: [[EXTRACT:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[CONCAT_VECTORS]](<4 x s16>), 0 + ; CI: [[DEF1:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF + ; CI: [[INSERT:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF1]], [[EXTRACT]](<3 x s16>), 0 ; CI: $vgpr0_vgpr1 = COPY [[INSERT]](<4 x s16>) ; CI-DS128-LABEL: name: test_load_local_v3s16_align2 ; CI-DS128: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 ; CI-DS128: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 2, addrspace 3) + ; CI-DS128: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; CI-DS128: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; CI-DS128: [[GEP:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C]](s32) ; CI-DS128: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p3) :: (load 2, addrspace 3) + ; CI-DS128: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; CI-DS128: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 ; CI-DS128: [[GEP1:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C1]](s32) ; CI-DS128: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p3) :: (load 2, addrspace 3) - ; CI-DS128: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) - ; CI-DS128: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) - ; CI-DS128: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) - ; 
CI-DS128: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32) - ; CI-DS128: [[TRUNC:%[0-9]+]]:_(<3 x s16>) = G_TRUNC [[BUILD_VECTOR]](<3 x s32>) - ; CI-DS128: [[DEF:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF - ; CI-DS128: [[INSERT:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF]], [[TRUNC]](<3 x s16>), 0 + ; CI-DS128: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) + ; CI-DS128: [[DEF:%[0-9]+]]:_(s16) = G_IMPLICIT_DEF + ; CI-DS128: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16) + ; CI-DS128: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC2]](s16), [[DEF]](s16) + ; CI-DS128: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; CI-DS128: [[EXTRACT:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[CONCAT_VECTORS]](<4 x s16>), 0 + ; CI-DS128: [[DEF1:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF + ; CI-DS128: [[INSERT:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF1]], [[EXTRACT]](<3 x s16>), 0 ; CI-DS128: $vgpr0_vgpr1 = COPY [[INSERT]](<4 x s16>) ; VI-LABEL: name: test_load_local_v3s16_align2 ; VI: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 2, addrspace 3) + ; VI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; VI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; VI: [[GEP:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C]](s32) ; VI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p3) :: (load 2, addrspace 3) + ; VI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; VI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 ; VI: [[GEP1:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C1]](s32) ; VI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p3) :: (load 2, addrspace 3) - ; VI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) - ; VI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) - ; VI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) - ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), 
[[COPY2]](s32), [[COPY3]](s32) - ; VI: [[TRUNC:%[0-9]+]]:_(<3 x s16>) = G_TRUNC [[BUILD_VECTOR]](<3 x s32>) - ; VI: [[DEF:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF - ; VI: [[INSERT:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF]], [[TRUNC]](<3 x s16>), 0 + ; VI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) + ; VI: [[DEF:%[0-9]+]]:_(s16) = G_IMPLICIT_DEF + ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16) + ; VI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC2]](s16), [[DEF]](s16) + ; VI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; VI: [[EXTRACT:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[CONCAT_VECTORS]](<4 x s16>), 0 + ; VI: [[DEF1:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF + ; VI: [[INSERT:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF1]], [[EXTRACT]](<3 x s16>), 0 ; VI: $vgpr0_vgpr1 = COPY [[INSERT]](<4 x s16>) ; GFX9-LABEL: name: test_load_local_v3s16_align2 ; GFX9: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 ; GFX9: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 2, addrspace 3) + ; GFX9: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; GFX9: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; GFX9: [[GEP:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C]](s32) ; GFX9: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p3) :: (load 2, addrspace 3) + ; GFX9: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; GFX9: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 ; GFX9: [[GEP1:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C1]](s32) ; GFX9: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p3) :: (load 2, addrspace 3) - ; GFX9: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) - ; GFX9: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) - ; GFX9: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) - ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32) - ; GFX9: [[TRUNC:%[0-9]+]]:_(<3 x s16>) = G_TRUNC [[BUILD_VECTOR]](<3 x s32>) - ; GFX9: 
[[DEF:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF - ; GFX9: [[INSERT:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF]], [[TRUNC]](<3 x s16>), 0 + ; GFX9: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) + ; GFX9: [[DEF:%[0-9]+]]:_(s16) = G_IMPLICIT_DEF + ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16) + ; GFX9: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC2]](s16), [[DEF]](s16) + ; GFX9: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; GFX9: [[EXTRACT:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[CONCAT_VECTORS]](<4 x s16>), 0 + ; GFX9: [[DEF1:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF + ; GFX9: [[INSERT:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF1]], [[EXTRACT]](<3 x s16>), 0 ; GFX9: $vgpr0_vgpr1 = COPY [[INSERT]](<4 x s16>) %0:_(p3) = COPY $vgpr0 %1:_(<3 x s16>) = G_LOAD %0 :: (load 6, align 2, addrspace 3) @@ -6777,8 +6792,10 @@ body: | ; SI: [[GEP2:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C2]](s32) ; SI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p3) :: (load 2, addrspace 3) ; SI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) - ; SI: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<4 x s16>) + ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16) + ; SI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC2]](s16), [[TRUNC3]](s16) + ; SI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; SI: $vgpr0_vgpr1 = COPY [[CONCAT_VECTORS]](<4 x s16>) ; CI-LABEL: name: test_load_local_v4s16_align2 ; CI: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 ; CI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 2, addrspace 3) @@ -6795,8 +6812,10 @@ body: | ; CI: [[GEP2:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C2]](s32) ; CI: 
[[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p3) :: (load 2, addrspace 3) ; CI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; CI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) - ; CI: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<4 x s16>) + ; CI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16) + ; CI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC2]](s16), [[TRUNC3]](s16) + ; CI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; CI: $vgpr0_vgpr1 = COPY [[CONCAT_VECTORS]](<4 x s16>) ; CI-DS128-LABEL: name: test_load_local_v4s16_align2 ; CI-DS128: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 ; CI-DS128: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 2, addrspace 3) @@ -6813,8 +6832,10 @@ body: | ; CI-DS128: [[GEP2:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C2]](s32) ; CI-DS128: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p3) :: (load 2, addrspace 3) ; CI-DS128: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; CI-DS128: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) - ; CI-DS128: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<4 x s16>) + ; CI-DS128: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16) + ; CI-DS128: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC2]](s16), [[TRUNC3]](s16) + ; CI-DS128: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; CI-DS128: $vgpr0_vgpr1 = COPY [[CONCAT_VECTORS]](<4 x s16>) ; VI-LABEL: name: test_load_local_v4s16_align2 ; VI: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 2, addrspace 3) @@ -6831,8 +6852,10 @@ body: | ; VI: [[GEP2:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C2]](s32) ; VI: 
[[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p3) :: (load 2, addrspace 3) ; VI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) - ; VI: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<4 x s16>) + ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16) + ; VI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC2]](s16), [[TRUNC3]](s16) + ; VI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; VI: $vgpr0_vgpr1 = COPY [[CONCAT_VECTORS]](<4 x s16>) ; GFX9-LABEL: name: test_load_local_v4s16_align2 ; GFX9: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0 ; GFX9: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 2, addrspace 3) @@ -6849,8 +6872,10 @@ body: | ; GFX9: [[GEP2:%[0-9]+]]:_(p3) = G_GEP [[COPY]], [[C2]](s32) ; GFX9: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[GEP2]](p3) :: (load 2, addrspace 3) ; GFX9: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD3]](s32) - ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) - ; GFX9: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<4 x s16>) + ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16) + ; GFX9: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC2]](s16), [[TRUNC3]](s16) + ; GFX9: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; GFX9: $vgpr0_vgpr1 = COPY [[CONCAT_VECTORS]](<4 x s16>) %0:_(p3) = COPY $vgpr0 %1:_(<4 x s16>) = G_LOAD %0 :: (load 8, align 2, addrspace 3) $vgpr0_vgpr1 = COPY %1 Modified: llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-load-private.mir URL: 
http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-load-private.mir?rev=374252&r1=374251&r2=374252&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-load-private.mir (original) +++ llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-load-private.mir Wed Oct 9 15:44:43 2019 @@ -5143,70 +5143,82 @@ body: | ; SI-LABEL: name: test_load_private_v3s16_align8 ; SI: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 ; SI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 2, align 8, addrspace 5) + ; SI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; SI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; SI: [[GEP:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C]](s32) ; SI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p5) :: (load 2, addrspace 5) + ; SI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; SI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 ; SI: [[GEP1:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C1]](s32) ; SI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p5) :: (load 2, align 4, addrspace 5) - ; SI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) - ; SI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) - ; SI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) - ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32) - ; SI: [[TRUNC:%[0-9]+]]:_(<3 x s16>) = G_TRUNC [[BUILD_VECTOR]](<3 x s32>) - ; SI: [[DEF:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF - ; SI: [[INSERT:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF]], [[TRUNC]](<3 x s16>), 0 + ; SI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) + ; SI: [[DEF:%[0-9]+]]:_(s16) = G_IMPLICIT_DEF + ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16) + ; SI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC2]](s16), [[DEF]](s16) + ; SI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), 
[[BUILD_VECTOR1]](<2 x s16>) + ; SI: [[EXTRACT:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[CONCAT_VECTORS]](<4 x s16>), 0 + ; SI: [[DEF1:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF + ; SI: [[INSERT:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF1]], [[EXTRACT]](<3 x s16>), 0 ; SI: $vgpr0_vgpr1 = COPY [[INSERT]](<4 x s16>) ; CI-LABEL: name: test_load_private_v3s16_align8 ; CI: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 ; CI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 2, align 8, addrspace 5) + ; CI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; CI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; CI: [[GEP:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C]](s32) ; CI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p5) :: (load 2, addrspace 5) + ; CI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; CI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 ; CI: [[GEP1:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C1]](s32) ; CI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p5) :: (load 2, align 4, addrspace 5) - ; CI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) - ; CI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) - ; CI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) - ; CI: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32) - ; CI: [[TRUNC:%[0-9]+]]:_(<3 x s16>) = G_TRUNC [[BUILD_VECTOR]](<3 x s32>) - ; CI: [[DEF:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF - ; CI: [[INSERT:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF]], [[TRUNC]](<3 x s16>), 0 + ; CI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) + ; CI: [[DEF:%[0-9]+]]:_(s16) = G_IMPLICIT_DEF + ; CI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16) + ; CI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC2]](s16), [[DEF]](s16) + ; CI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; CI: [[EXTRACT:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[CONCAT_VECTORS]](<4 x s16>), 0 + ; CI: [[DEF1:%[0-9]+]]:_(<4 x 
s16>) = G_IMPLICIT_DEF + ; CI: [[INSERT:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF1]], [[EXTRACT]](<3 x s16>), 0 ; CI: $vgpr0_vgpr1 = COPY [[INSERT]](<4 x s16>) ; VI-LABEL: name: test_load_private_v3s16_align8 ; VI: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 2, align 8, addrspace 5) + ; VI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; VI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; VI: [[GEP:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C]](s32) ; VI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p5) :: (load 2, addrspace 5) + ; VI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; VI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 ; VI: [[GEP1:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C1]](s32) ; VI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p5) :: (load 2, align 4, addrspace 5) - ; VI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) - ; VI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) - ; VI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) - ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32) - ; VI: [[TRUNC:%[0-9]+]]:_(<3 x s16>) = G_TRUNC [[BUILD_VECTOR]](<3 x s32>) - ; VI: [[DEF:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF - ; VI: [[INSERT:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF]], [[TRUNC]](<3 x s16>), 0 + ; VI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) + ; VI: [[DEF:%[0-9]+]]:_(s16) = G_IMPLICIT_DEF + ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16) + ; VI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC2]](s16), [[DEF]](s16) + ; VI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; VI: [[EXTRACT:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[CONCAT_VECTORS]](<4 x s16>), 0 + ; VI: [[DEF1:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF + ; VI: [[INSERT:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF1]], [[EXTRACT]](<3 x s16>), 0 ; VI: $vgpr0_vgpr1 = COPY 
[[INSERT]](<4 x s16>) ; GFX9-LABEL: name: test_load_private_v3s16_align8 ; GFX9: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 ; GFX9: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 2, align 8, addrspace 5) + ; GFX9: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; GFX9: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; GFX9: [[GEP:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C]](s32) ; GFX9: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p5) :: (load 2, addrspace 5) + ; GFX9: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; GFX9: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 ; GFX9: [[GEP1:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C1]](s32) ; GFX9: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p5) :: (load 2, align 4, addrspace 5) - ; GFX9: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) - ; GFX9: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) - ; GFX9: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) - ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32) - ; GFX9: [[TRUNC:%[0-9]+]]:_(<3 x s16>) = G_TRUNC [[BUILD_VECTOR]](<3 x s32>) - ; GFX9: [[DEF:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF - ; GFX9: [[INSERT:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF]], [[TRUNC]](<3 x s16>), 0 + ; GFX9: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) + ; GFX9: [[DEF:%[0-9]+]]:_(s16) = G_IMPLICIT_DEF + ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16) + ; GFX9: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC2]](s16), [[DEF]](s16) + ; GFX9: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; GFX9: [[EXTRACT:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[CONCAT_VECTORS]](<4 x s16>), 0 + ; GFX9: [[DEF1:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF + ; GFX9: [[INSERT:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF1]], [[EXTRACT]](<3 x s16>), 0 ; GFX9: $vgpr0_vgpr1 = COPY [[INSERT]](<4 x s16>) %0:_(p5) = COPY $vgpr0 %1:_(<3 x s16>) = G_LOAD %0 :: (load 6, align 
8, addrspace 5) @@ -5224,70 +5236,82 @@ body: | ; SI-LABEL: name: test_load_private_v3s16_align2 ; SI: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 ; SI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 2, addrspace 5) + ; SI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; SI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; SI: [[GEP:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C]](s32) ; SI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p5) :: (load 2, addrspace 5) + ; SI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; SI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 ; SI: [[GEP1:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C1]](s32) ; SI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p5) :: (load 2, addrspace 5) - ; SI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) - ; SI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) - ; SI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) - ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32) - ; SI: [[TRUNC:%[0-9]+]]:_(<3 x s16>) = G_TRUNC [[BUILD_VECTOR]](<3 x s32>) - ; SI: [[DEF:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF - ; SI: [[INSERT:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF]], [[TRUNC]](<3 x s16>), 0 + ; SI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) + ; SI: [[DEF:%[0-9]+]]:_(s16) = G_IMPLICIT_DEF + ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16) + ; SI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC2]](s16), [[DEF]](s16) + ; SI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; SI: [[EXTRACT:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[CONCAT_VECTORS]](<4 x s16>), 0 + ; SI: [[DEF1:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF + ; SI: [[INSERT:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF1]], [[EXTRACT]](<3 x s16>), 0 ; SI: $vgpr0_vgpr1 = COPY [[INSERT]](<4 x s16>) ; CI-LABEL: name: test_load_private_v3s16_align2 ; CI: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 ; CI: 
[[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 2, addrspace 5) + ; CI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; CI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; CI: [[GEP:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C]](s32) ; CI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p5) :: (load 2, addrspace 5) + ; CI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; CI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 ; CI: [[GEP1:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C1]](s32) ; CI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p5) :: (load 2, addrspace 5) - ; CI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) - ; CI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) - ; CI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) - ; CI: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32) - ; CI: [[TRUNC:%[0-9]+]]:_(<3 x s16>) = G_TRUNC [[BUILD_VECTOR]](<3 x s32>) - ; CI: [[DEF:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF - ; CI: [[INSERT:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF]], [[TRUNC]](<3 x s16>), 0 + ; CI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) + ; CI: [[DEF:%[0-9]+]]:_(s16) = G_IMPLICIT_DEF + ; CI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16) + ; CI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC2]](s16), [[DEF]](s16) + ; CI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; CI: [[EXTRACT:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[CONCAT_VECTORS]](<4 x s16>), 0 + ; CI: [[DEF1:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF + ; CI: [[INSERT:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF1]], [[EXTRACT]](<3 x s16>), 0 ; CI: $vgpr0_vgpr1 = COPY [[INSERT]](<4 x s16>) ; VI-LABEL: name: test_load_private_v3s16_align2 ; VI: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 2, addrspace 5) + ; VI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; VI: [[C:%[0-9]+]]:_(s32) = 
G_CONSTANT i32 2 ; VI: [[GEP:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C]](s32) ; VI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p5) :: (load 2, addrspace 5) + ; VI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; VI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 ; VI: [[GEP1:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C1]](s32) ; VI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p5) :: (load 2, addrspace 5) - ; VI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) - ; VI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) - ; VI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) - ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32) - ; VI: [[TRUNC:%[0-9]+]]:_(<3 x s16>) = G_TRUNC [[BUILD_VECTOR]](<3 x s32>) - ; VI: [[DEF:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF - ; VI: [[INSERT:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF]], [[TRUNC]](<3 x s16>), 0 + ; VI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) + ; VI: [[DEF:%[0-9]+]]:_(s16) = G_IMPLICIT_DEF + ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16) + ; VI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC2]](s16), [[DEF]](s16) + ; VI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; VI: [[EXTRACT:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[CONCAT_VECTORS]](<4 x s16>), 0 + ; VI: [[DEF1:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF + ; VI: [[INSERT:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF1]], [[EXTRACT]](<3 x s16>), 0 ; VI: $vgpr0_vgpr1 = COPY [[INSERT]](<4 x s16>) ; GFX9-LABEL: name: test_load_private_v3s16_align2 ; GFX9: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 ; GFX9: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 2, addrspace 5) + ; GFX9: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32) ; GFX9: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 ; GFX9: [[GEP:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C]](s32) ; GFX9: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[GEP]](p5) :: (load 2, addrspace 5) 
+ ; GFX9: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32) ; GFX9: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 4 ; GFX9: [[GEP1:%[0-9]+]]:_(p5) = G_GEP [[COPY]], [[C1]](s32) ; GFX9: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[GEP1]](p5) :: (load 2, addrspace 5) - ; GFX9: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32) - ; GFX9: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32) - ; GFX9: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32) - ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32) - ; GFX9: [[TRUNC:%[0-9]+]]:_(<3 x s16>) = G_TRUNC [[BUILD_VECTOR]](<3 x s32>) - ; GFX9: [[DEF:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF - ; GFX9: [[INSERT:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF]], [[TRUNC]](<3 x s16>), 0 + ; GFX9: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD2]](s32) + ; GFX9: [[DEF:%[0-9]+]]:_(s16) = G_IMPLICIT_DEF + ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16) + ; GFX9: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC2]](s16), [[DEF]](s16) + ; GFX9: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; GFX9: [[EXTRACT:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[CONCAT_VECTORS]](<4 x s16>), 0 + ; GFX9: [[DEF1:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF + ; GFX9: [[INSERT:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF1]], [[EXTRACT]](<3 x s16>), 0 ; GFX9: $vgpr0_vgpr1 = COPY [[INSERT]](<4 x s16>) %0:_(p5) = COPY $vgpr0 %1:_(<3 x s16>) = G_LOAD %0 :: (load 6, align 2, addrspace 5) @@ -5344,13 +5368,13 @@ body: | ; SI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY4]](s32) ; SI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL2]](s32) ; SI: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] - ; SI: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[OR]](s16) - ; SI: [[ANYEXT1:%[0-9]+]]:_(s32) = G_ANYEXT [[OR1]](s16) - ; SI: [[ANYEXT2:%[0-9]+]]:_(s32) = G_ANYEXT [[OR2]](s16) - ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR 
[[ANYEXT]](s32), [[ANYEXT1]](s32), [[ANYEXT2]](s32) - ; SI: [[TRUNC6:%[0-9]+]]:_(<3 x s16>) = G_TRUNC [[BUILD_VECTOR]](<3 x s32>) - ; SI: [[DEF:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF - ; SI: [[INSERT:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF]], [[TRUNC6]](<3 x s16>), 0 + ; SI: [[DEF:%[0-9]+]]:_(s16) = G_IMPLICIT_DEF + ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[OR]](s16), [[OR1]](s16) + ; SI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[OR2]](s16), [[DEF]](s16) + ; SI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; SI: [[EXTRACT:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[CONCAT_VECTORS]](<4 x s16>), 0 + ; SI: [[DEF1:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF + ; SI: [[INSERT:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF1]], [[EXTRACT]](<3 x s16>), 0 ; SI: $vgpr0_vgpr1 = COPY [[INSERT]](<4 x s16>) ; CI-LABEL: name: test_load_private_v3s16_align1 ; CI: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 @@ -5394,13 +5418,13 @@ body: | ; CI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[AND5]], [[COPY4]](s32) ; CI: [[TRUNC5:%[0-9]+]]:_(s16) = G_TRUNC [[SHL2]](s32) ; CI: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[TRUNC5]] - ; CI: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[OR]](s16) - ; CI: [[ANYEXT1:%[0-9]+]]:_(s32) = G_ANYEXT [[OR1]](s16) - ; CI: [[ANYEXT2:%[0-9]+]]:_(s32) = G_ANYEXT [[OR2]](s16) - ; CI: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[ANYEXT]](s32), [[ANYEXT1]](s32), [[ANYEXT2]](s32) - ; CI: [[TRUNC6:%[0-9]+]]:_(<3 x s16>) = G_TRUNC [[BUILD_VECTOR]](<3 x s32>) - ; CI: [[DEF:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF - ; CI: [[INSERT:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF]], [[TRUNC6]](<3 x s16>), 0 + ; CI: [[DEF:%[0-9]+]]:_(s16) = G_IMPLICIT_DEF + ; CI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[OR]](s16), [[OR1]](s16) + ; CI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[OR2]](s16), [[DEF]](s16) + ; CI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS 
[[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; CI: [[EXTRACT:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[CONCAT_VECTORS]](<4 x s16>), 0 + ; CI: [[DEF1:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF + ; CI: [[INSERT:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF1]], [[EXTRACT]](<3 x s16>), 0 ; CI: $vgpr0_vgpr1 = COPY [[INSERT]](<4 x s16>) ; VI-LABEL: name: test_load_private_v3s16_align1 ; VI: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 @@ -5438,13 +5462,13 @@ body: | ; VI: [[AND5:%[0-9]+]]:_(s16) = G_AND [[TRUNC5]], [[C1]] ; VI: [[SHL2:%[0-9]+]]:_(s16) = G_SHL [[AND5]], [[C2]](s16) ; VI: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL2]] - ; VI: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[OR]](s16) - ; VI: [[ANYEXT1:%[0-9]+]]:_(s32) = G_ANYEXT [[OR1]](s16) - ; VI: [[ANYEXT2:%[0-9]+]]:_(s32) = G_ANYEXT [[OR2]](s16) - ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[ANYEXT]](s32), [[ANYEXT1]](s32), [[ANYEXT2]](s32) - ; VI: [[TRUNC6:%[0-9]+]]:_(<3 x s16>) = G_TRUNC [[BUILD_VECTOR]](<3 x s32>) - ; VI: [[DEF:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF - ; VI: [[INSERT:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF]], [[TRUNC6]](<3 x s16>), 0 + ; VI: [[DEF:%[0-9]+]]:_(s16) = G_IMPLICIT_DEF + ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[OR]](s16), [[OR1]](s16) + ; VI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[OR2]](s16), [[DEF]](s16) + ; VI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; VI: [[EXTRACT:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[CONCAT_VECTORS]](<4 x s16>), 0 + ; VI: [[DEF1:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF + ; VI: [[INSERT:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF1]], [[EXTRACT]](<3 x s16>), 0 ; VI: $vgpr0_vgpr1 = COPY [[INSERT]](<4 x s16>) ; GFX9-LABEL: name: test_load_private_v3s16_align1 ; GFX9: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0 @@ -5482,13 +5506,13 @@ body: | ; GFX9: [[AND5:%[0-9]+]]:_(s16) = G_AND [[TRUNC5]], [[C1]] ; GFX9: [[SHL2:%[0-9]+]]:_(s16) = G_SHL 
[[AND5]], [[C2]](s16) ; GFX9: [[OR2:%[0-9]+]]:_(s16) = G_OR [[AND4]], [[SHL2]] - ; GFX9: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[OR]](s16) - ; GFX9: [[ANYEXT1:%[0-9]+]]:_(s32) = G_ANYEXT [[OR1]](s16) - ; GFX9: [[ANYEXT2:%[0-9]+]]:_(s32) = G_ANYEXT [[OR2]](s16) - ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[ANYEXT]](s32), [[ANYEXT1]](s32), [[ANYEXT2]](s32) - ; GFX9: [[TRUNC6:%[0-9]+]]:_(<3 x s16>) = G_TRUNC [[BUILD_VECTOR]](<3 x s32>) - ; GFX9: [[DEF:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF - ; GFX9: [[INSERT:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF]], [[TRUNC6]](<3 x s16>), 0 + ; GFX9: [[DEF:%[0-9]+]]:_(s16) = G_IMPLICIT_DEF + ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[OR]](s16), [[OR1]](s16) + ; GFX9: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[OR2]](s16), [[DEF]](s16) + ; GFX9: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; GFX9: [[EXTRACT:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[CONCAT_VECTORS]](<4 x s16>), 0 + ; GFX9: [[DEF1:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF + ; GFX9: [[INSERT:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF1]], [[EXTRACT]](<3 x s16>), 0 ; GFX9: $vgpr0_vgpr1 = COPY [[INSERT]](<4 x s16>) %0:_(p5) = COPY $vgpr0 %1:_(<3 x s16>) = G_LOAD %0 :: (load 6, align 1, addrspace 5) Modified: llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-lshr.mir URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-lshr.mir?rev=374252&r1=374251&r2=374252&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-lshr.mir (original) +++ llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-lshr.mir Wed Oct 9 15:44:43 2019 @@ -609,23 +609,26 @@ body: | ; SI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[BITCAST]](s32) ; SI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C1]] ; SI: [[LSHR4:%[0-9]+]]:_(s32) = G_LSHR [[AND1]], [[AND]](s32) + ; SI: 
[[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LSHR4]](s32) ; SI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32) ; SI: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C1]] ; SI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32) ; SI: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C1]] ; SI: [[LSHR5:%[0-9]+]]:_(s32) = G_LSHR [[AND3]], [[AND2]](s32) + ; SI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LSHR5]](s32) ; SI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[BITCAST3]](s32) ; SI: [[AND4:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C1]] ; SI: [[COPY7:%[0-9]+]]:_(s32) = COPY [[BITCAST1]](s32) ; SI: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY7]], [[C1]] ; SI: [[LSHR6:%[0-9]+]]:_(s32) = G_LSHR [[AND5]], [[AND4]](s32) - ; SI: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LSHR4]](s32) - ; SI: [[COPY9:%[0-9]+]]:_(s32) = COPY [[LSHR5]](s32) - ; SI: [[COPY10:%[0-9]+]]:_(s32) = COPY [[LSHR6]](s32) - ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[COPY8]](s32), [[COPY9]](s32), [[COPY10]](s32) - ; SI: [[TRUNC:%[0-9]+]]:_(<3 x s16>) = G_TRUNC [[BUILD_VECTOR]](<3 x s32>) - ; SI: [[DEF2:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF - ; SI: [[INSERT2:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF2]], [[TRUNC]](<3 x s16>), 0 + ; SI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[LSHR6]](s32) + ; SI: [[DEF2:%[0-9]+]]:_(s16) = G_IMPLICIT_DEF + ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16) + ; SI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC2]](s16), [[DEF2]](s16) + ; SI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; SI: [[EXTRACT2:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[CONCAT_VECTORS]](<4 x s16>), 0 + ; SI: [[DEF3:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF + ; SI: [[INSERT2:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF3]], [[EXTRACT2]](<3 x s16>), 0 ; SI: $vgpr0_vgpr1 = COPY [[INSERT2]](<4 x s16>) ; VI-LABEL: name: test_lshr_v3s16_v3s16 ; VI: [[COPY:%[0-9]+]]:_(<4 x s16>) = COPY $vgpr0_vgpr1 @@ -656,13 +659,13 @@ body: 
| ; VI: [[LSHR4:%[0-9]+]]:_(s16) = G_LSHR [[TRUNC]], [[TRUNC3]](s16) ; VI: [[LSHR5:%[0-9]+]]:_(s16) = G_LSHR [[TRUNC1]], [[TRUNC4]](s16) ; VI: [[LSHR6:%[0-9]+]]:_(s16) = G_LSHR [[TRUNC2]], [[TRUNC5]](s16) - ; VI: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[LSHR4]](s16) - ; VI: [[ANYEXT1:%[0-9]+]]:_(s32) = G_ANYEXT [[LSHR5]](s16) - ; VI: [[ANYEXT2:%[0-9]+]]:_(s32) = G_ANYEXT [[LSHR6]](s16) - ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[ANYEXT]](s32), [[ANYEXT1]](s32), [[ANYEXT2]](s32) - ; VI: [[TRUNC6:%[0-9]+]]:_(<3 x s16>) = G_TRUNC [[BUILD_VECTOR]](<3 x s32>) - ; VI: [[DEF2:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF - ; VI: [[INSERT2:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF2]], [[TRUNC6]](<3 x s16>), 0 + ; VI: [[DEF2:%[0-9]+]]:_(s16) = G_IMPLICIT_DEF + ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[LSHR4]](s16), [[LSHR5]](s16) + ; VI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[LSHR6]](s16), [[DEF2]](s16) + ; VI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; VI: [[EXTRACT2:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[CONCAT_VECTORS]](<4 x s16>), 0 + ; VI: [[DEF3:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF + ; VI: [[INSERT2:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF3]], [[EXTRACT2]](<3 x s16>), 0 ; VI: $vgpr0_vgpr1 = COPY [[INSERT2]](<4 x s16>) ; GFX9-LABEL: name: test_lshr_v3s16_v3s16 ; GFX9: [[COPY:%[0-9]+]]:_(<4 x s16>) = COPY $vgpr0_vgpr1 @@ -751,8 +754,10 @@ body: | ; SI: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY9]], [[C1]] ; SI: [[LSHR7:%[0-9]+]]:_(s32) = G_LSHR [[AND7]], [[AND6]](s32) ; SI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[LSHR7]](s32) - ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) - ; SI: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<4 x s16>) + ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16) + ; SI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = 
G_BUILD_VECTOR [[TRUNC2]](s16), [[TRUNC3]](s16) + ; SI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; SI: $vgpr0_vgpr1 = COPY [[CONCAT_VECTORS]](<4 x s16>) ; VI-LABEL: name: test_lshr_v4s16_v4s16 ; VI: [[COPY:%[0-9]+]]:_(<4 x s16>) = COPY $vgpr0_vgpr1 ; VI: [[COPY1:%[0-9]+]]:_(<4 x s16>) = COPY $vgpr2_vgpr3 @@ -779,8 +784,10 @@ body: | ; VI: [[LSHR5:%[0-9]+]]:_(s16) = G_LSHR [[TRUNC1]], [[TRUNC5]](s16) ; VI: [[LSHR6:%[0-9]+]]:_(s16) = G_LSHR [[TRUNC2]], [[TRUNC6]](s16) ; VI: [[LSHR7:%[0-9]+]]:_(s16) = G_LSHR [[TRUNC3]], [[TRUNC7]](s16) - ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s16>) = G_BUILD_VECTOR [[LSHR4]](s16), [[LSHR5]](s16), [[LSHR6]](s16), [[LSHR7]](s16) - ; VI: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<4 x s16>) + ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[LSHR4]](s16), [[LSHR5]](s16) + ; VI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[LSHR6]](s16), [[LSHR7]](s16) + ; VI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; VI: $vgpr0_vgpr1 = COPY [[CONCAT_VECTORS]](<4 x s16>) ; GFX9-LABEL: name: test_lshr_v4s16_v4s16 ; GFX9: [[COPY:%[0-9]+]]:_(<4 x s16>) = COPY $vgpr0_vgpr1 ; GFX9: [[COPY1:%[0-9]+]]:_(<4 x s16>) = COPY $vgpr2_vgpr3 Modified: llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-phi.mir URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-phi.mir?rev=374252&r1=374251&r2=374252&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-phi.mir (original) +++ llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-phi.mir Wed Oct 9 15:44:43 2019 @@ -144,25 +144,28 @@ body: | ; CHECK: [[COPY2:%[0-9]+]]:_(s32) = COPY [[BITCAST]](s32) ; CHECK: [[COPY3:%[0-9]+]]:_(s32) = COPY [[BITCAST2]](s32) ; CHECK: [[ADD:%[0-9]+]]:_(s32) = G_ADD [[COPY2]], [[COPY3]] + ; CHECK: 
[[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[ADD]](s32) ; CHECK: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32) ; CHECK: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32) ; CHECK: [[ADD1:%[0-9]+]]:_(s32) = G_ADD [[COPY4]], [[COPY5]] + ; CHECK: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[ADD1]](s32) ; CHECK: [[COPY6:%[0-9]+]]:_(s32) = COPY [[BITCAST1]](s32) ; CHECK: [[COPY7:%[0-9]+]]:_(s32) = COPY [[BITCAST3]](s32) ; CHECK: [[ADD2:%[0-9]+]]:_(s32) = G_ADD [[COPY6]], [[COPY7]] - ; CHECK: [[COPY8:%[0-9]+]]:_(s32) = COPY [[ADD]](s32) - ; CHECK: [[COPY9:%[0-9]+]]:_(s32) = COPY [[ADD1]](s32) - ; CHECK: [[COPY10:%[0-9]+]]:_(s32) = COPY [[ADD2]](s32) - ; CHECK: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[COPY8]](s32), [[COPY9]](s32), [[COPY10]](s32) - ; CHECK: [[TRUNC:%[0-9]+]]:_(<3 x s16>) = G_TRUNC [[BUILD_VECTOR]](<3 x s32>) - ; CHECK: [[DEF3:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF - ; CHECK: [[INSERT3:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF3]], [[TRUNC]](<3 x s16>), 0 + ; CHECK: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[ADD2]](s32) + ; CHECK: [[DEF3:%[0-9]+]]:_(s16) = G_IMPLICIT_DEF + ; CHECK: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16) + ; CHECK: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC2]](s16), [[DEF3]](s16) + ; CHECK: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; CHECK: [[EXTRACT1:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[CONCAT_VECTORS]](<4 x s16>), 0 + ; CHECK: [[DEF4:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF + ; CHECK: [[INSERT3:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF4]], [[EXTRACT1]](<3 x s16>), 0 ; CHECK: G_BR %bb.2 ; CHECK: bb.2: ; CHECK: [[PHI:%[0-9]+]]:_(<4 x s16>) = G_PHI [[INSERT]](<4 x s16>), %bb.0, [[INSERT3]](<4 x s16>), %bb.1 - ; CHECK: [[EXTRACT1:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[PHI]](<4 x s16>), 0 - ; CHECK: [[DEF4:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF - ; CHECK: [[INSERT4:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF4]], 
[[EXTRACT1]](<3 x s16>), 0 + ; CHECK: [[EXTRACT2:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[PHI]](<4 x s16>), 0 + ; CHECK: [[DEF5:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF + ; CHECK: [[INSERT4:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF5]], [[EXTRACT2]](<3 x s16>), 0 ; CHECK: $vgpr0_vgpr1 = COPY [[INSERT4]](<4 x s16>) ; CHECK: S_SETPC_B64 undef $sgpr30_sgpr31 bb.0: @@ -236,10 +239,12 @@ body: | ; CHECK: [[COPY9:%[0-9]+]]:_(s32) = COPY [[LSHR3]](s32) ; CHECK: [[ADD3:%[0-9]+]]:_(s32) = G_ADD [[COPY8]], [[COPY9]] ; CHECK: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[ADD3]](s32) - ; CHECK: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) + ; CHECK: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16) + ; CHECK: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC2]](s16), [[TRUNC3]](s16) + ; CHECK: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) ; CHECK: G_BR %bb.2 ; CHECK: bb.2: - ; CHECK: [[PHI:%[0-9]+]]:_(<4 x s16>) = G_PHI [[COPY]](<4 x s16>), %bb.0, [[BUILD_VECTOR]](<4 x s16>), %bb.1 + ; CHECK: [[PHI:%[0-9]+]]:_(<4 x s16>) = G_PHI [[COPY]](<4 x s16>), %bb.0, [[CONCAT_VECTORS]](<4 x s16>), %bb.1 ; CHECK: $vgpr0_vgpr1 = COPY [[PHI]](<4 x s16>) ; CHECK: S_SETPC_B64 undef $sgpr30_sgpr31 bb.0: @@ -708,8 +713,10 @@ body: | ; CHECK: [[ADD61:%[0-9]+]]:_(s32) = G_ADD [[UV61]], [[UV125]] ; CHECK: [[ADD62:%[0-9]+]]:_(s32) = G_ADD [[UV62]], [[UV126]] ; CHECK: [[ADD63:%[0-9]+]]:_(s32) = G_ADD [[UV63]], [[UV127]] - ; CHECK: [[BUILD_VECTOR:%[0-9]+]]:_(<64 x s32>) = G_BUILD_VECTOR [[ADD]](s32), [[ADD1]](s32), [[ADD2]](s32), [[ADD3]](s32), [[ADD4]](s32), [[ADD5]](s32), [[ADD6]](s32), [[ADD7]](s32), [[ADD8]](s32), [[ADD9]](s32), [[ADD10]](s32), [[ADD11]](s32), [[ADD12]](s32), [[ADD13]](s32), [[ADD14]](s32), [[ADD15]](s32), [[ADD16]](s32), [[ADD17]](s32), [[ADD18]](s32), [[ADD19]](s32), [[ADD20]](s32), [[ADD21]](s32), 
[[ADD22]](s32), [[ADD23]](s32), [[ADD24]](s32), [[ADD25]](s32), [[ADD26]](s32), [[ADD27]](s32), [[ADD28]](s32), [[ADD29]](s32), [[ADD30]](s32), [[ADD31]](s32), [[ADD32]](s32), [[ADD33]](s32), [[ADD34]](s32), [[ADD35]](s32), [[ADD36]](s32), [[ADD37]](s32), [[ADD38]](s32), [[ADD39]](s32), [[ADD40]](s32), [[ADD41]](s32), [[ADD42]](s32), [[ADD43]](s32), [[ADD44]](s32), [[ADD45]](s32), [[ADD46]](s32), [[ADD47]](s32), [[ADD48]](s32), [[ADD49]](s32), [[ADD50]](s32), [[ADD51]](s32), [[ADD52]](s32), [[ADD53]](s32), [[ADD54]](s32), [[ADD55]](s32), [[ADD56]](s32), [[ADD57]](s32), [[ADD58]](s32), [[ADD59]](s32), [[ADD60]](s32), [[ADD61]](s32), [[ADD62]](s32), [[ADD63]](s32) - ; CHECK: [[UV128:%[0-9]+]]:_(<16 x s32>), [[UV129:%[0-9]+]]:_(<16 x s32>), [[UV130:%[0-9]+]]:_(<16 x s32>), [[UV131:%[0-9]+]]:_(<16 x s32>) = G_UNMERGE_VALUES [[BUILD_VECTOR]](<64 x s32>) + ; CHECK: [[BUILD_VECTOR:%[0-9]+]]:_(<32 x s32>) = G_BUILD_VECTOR [[ADD]](s32), [[ADD1]](s32), [[ADD2]](s32), [[ADD3]](s32), [[ADD4]](s32), [[ADD5]](s32), [[ADD6]](s32), [[ADD7]](s32), [[ADD8]](s32), [[ADD9]](s32), [[ADD10]](s32), [[ADD11]](s32), [[ADD12]](s32), [[ADD13]](s32), [[ADD14]](s32), [[ADD15]](s32), [[ADD16]](s32), [[ADD17]](s32), [[ADD18]](s32), [[ADD19]](s32), [[ADD20]](s32), [[ADD21]](s32), [[ADD22]](s32), [[ADD23]](s32), [[ADD24]](s32), [[ADD25]](s32), [[ADD26]](s32), [[ADD27]](s32), [[ADD28]](s32), [[ADD29]](s32), [[ADD30]](s32), [[ADD31]](s32) + ; CHECK: [[BUILD_VECTOR1:%[0-9]+]]:_(<32 x s32>) = G_BUILD_VECTOR [[ADD32]](s32), [[ADD33]](s32), [[ADD34]](s32), [[ADD35]](s32), [[ADD36]](s32), [[ADD37]](s32), [[ADD38]](s32), [[ADD39]](s32), [[ADD40]](s32), [[ADD41]](s32), [[ADD42]](s32), [[ADD43]](s32), [[ADD44]](s32), [[ADD45]](s32), [[ADD46]](s32), [[ADD47]](s32), [[ADD48]](s32), [[ADD49]](s32), [[ADD50]](s32), [[ADD51]](s32), [[ADD52]](s32), [[ADD53]](s32), [[ADD54]](s32), [[ADD55]](s32), [[ADD56]](s32), [[ADD57]](s32), [[ADD58]](s32), [[ADD59]](s32), [[ADD60]](s32), [[ADD61]](s32), [[ADD62]](s32), 
[[ADD63]](s32) + ; CHECK: [[UV128:%[0-9]+]]:_(<16 x s32>), [[UV129:%[0-9]+]]:_(<16 x s32>) = G_UNMERGE_VALUES [[BUILD_VECTOR]](<32 x s32>) + ; CHECK: [[UV130:%[0-9]+]]:_(<16 x s32>), [[UV131:%[0-9]+]]:_(<16 x s32>) = G_UNMERGE_VALUES [[BUILD_VECTOR1]](<32 x s32>) ; CHECK: G_BR %bb.2 ; CHECK: bb.2: ; CHECK: [[PHI:%[0-9]+]]:_(<16 x s32>) = G_PHI [[DEF]](<16 x s32>), %bb.0, [[UV128]](<16 x s32>), %bb.1 Modified: llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-shl.mir URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-shl.mir?rev=374252&r1=374251&r2=374252&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-shl.mir (original) +++ llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-shl.mir Wed Oct 9 15:44:43 2019 @@ -596,21 +596,24 @@ body: | ; SI: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] ; SI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[BITCAST]](s32) ; SI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[COPY3]], [[AND]](s32) + ; SI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[SHL]](s32) ; SI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32) ; SI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C1]] ; SI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32) ; SI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[COPY5]], [[AND1]](s32) + ; SI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[SHL1]](s32) ; SI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[BITCAST3]](s32) ; SI: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY6]], [[C1]] ; SI: [[COPY7:%[0-9]+]]:_(s32) = COPY [[BITCAST1]](s32) ; SI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[COPY7]], [[AND2]](s32) - ; SI: [[COPY8:%[0-9]+]]:_(s32) = COPY [[SHL]](s32) - ; SI: [[COPY9:%[0-9]+]]:_(s32) = COPY [[SHL1]](s32) - ; SI: [[COPY10:%[0-9]+]]:_(s32) = COPY [[SHL2]](s32) - ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[COPY8]](s32), [[COPY9]](s32), [[COPY10]](s32) - ; SI: [[TRUNC:%[0-9]+]]:_(<3 x s16>) = G_TRUNC [[BUILD_VECTOR]](<3 x s32>) - ; SI: 
[[DEF2:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF - ; SI: [[INSERT2:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF2]], [[TRUNC]](<3 x s16>), 0 + ; SI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[SHL2]](s32) + ; SI: [[DEF2:%[0-9]+]]:_(s16) = G_IMPLICIT_DEF + ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16) + ; SI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC2]](s16), [[DEF2]](s16) + ; SI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; SI: [[EXTRACT2:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[CONCAT_VECTORS]](<4 x s16>), 0 + ; SI: [[DEF3:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF + ; SI: [[INSERT2:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF3]], [[EXTRACT2]](<3 x s16>), 0 ; SI: $vgpr0_vgpr1 = COPY [[INSERT2]](<4 x s16>) ; VI-LABEL: name: test_shl_v3s16_v3s16 ; VI: [[COPY:%[0-9]+]]:_(<4 x s16>) = COPY $vgpr0_vgpr1 @@ -641,13 +644,13 @@ body: | ; VI: [[SHL:%[0-9]+]]:_(s16) = G_SHL [[TRUNC]], [[TRUNC3]](s16) ; VI: [[SHL1:%[0-9]+]]:_(s16) = G_SHL [[TRUNC1]], [[TRUNC4]](s16) ; VI: [[SHL2:%[0-9]+]]:_(s16) = G_SHL [[TRUNC2]], [[TRUNC5]](s16) - ; VI: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[SHL]](s16) - ; VI: [[ANYEXT1:%[0-9]+]]:_(s32) = G_ANYEXT [[SHL1]](s16) - ; VI: [[ANYEXT2:%[0-9]+]]:_(s32) = G_ANYEXT [[SHL2]](s16) - ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[ANYEXT]](s32), [[ANYEXT1]](s32), [[ANYEXT2]](s32) - ; VI: [[TRUNC6:%[0-9]+]]:_(<3 x s16>) = G_TRUNC [[BUILD_VECTOR]](<3 x s32>) - ; VI: [[DEF2:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF - ; VI: [[INSERT2:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF2]], [[TRUNC6]](<3 x s16>), 0 + ; VI: [[DEF2:%[0-9]+]]:_(s16) = G_IMPLICIT_DEF + ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[SHL]](s16), [[SHL1]](s16) + ; VI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[SHL2]](s16), [[DEF2]](s16) + ; VI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), 
[[BUILD_VECTOR1]](<2 x s16>) + ; VI: [[EXTRACT2:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[CONCAT_VECTORS]](<4 x s16>), 0 + ; VI: [[DEF3:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF + ; VI: [[INSERT2:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[DEF3]], [[EXTRACT2]](<3 x s16>), 0 ; VI: $vgpr0_vgpr1 = COPY [[INSERT2]](<4 x s16>) ; GFX9-LABEL: name: test_shl_v3s16_v3s16 ; GFX9: [[COPY:%[0-9]+]]:_(<4 x s16>) = COPY $vgpr0_vgpr1 @@ -732,8 +735,10 @@ body: | ; SI: [[COPY9:%[0-9]+]]:_(s32) = COPY [[LSHR1]](s32) ; SI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[COPY9]], [[AND3]](s32) ; SI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[SHL3]](s32) - ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) - ; SI: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<4 x s16>) + ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16) + ; SI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC2]](s16), [[TRUNC3]](s16) + ; SI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; SI: $vgpr0_vgpr1 = COPY [[CONCAT_VECTORS]](<4 x s16>) ; VI-LABEL: name: test_shl_v4s16_v4s16 ; VI: [[COPY:%[0-9]+]]:_(<4 x s16>) = COPY $vgpr0_vgpr1 ; VI: [[COPY1:%[0-9]+]]:_(<4 x s16>) = COPY $vgpr2_vgpr3 @@ -760,8 +765,10 @@ body: | ; VI: [[SHL1:%[0-9]+]]:_(s16) = G_SHL [[TRUNC1]], [[TRUNC5]](s16) ; VI: [[SHL2:%[0-9]+]]:_(s16) = G_SHL [[TRUNC2]], [[TRUNC6]](s16) ; VI: [[SHL3:%[0-9]+]]:_(s16) = G_SHL [[TRUNC3]], [[TRUNC7]](s16) - ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s16>) = G_BUILD_VECTOR [[SHL]](s16), [[SHL1]](s16), [[SHL2]](s16), [[SHL3]](s16) - ; VI: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<4 x s16>) + ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[SHL]](s16), [[SHL1]](s16) + ; VI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[SHL2]](s16), [[SHL3]](s16) + ; VI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x 
s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; VI: $vgpr0_vgpr1 = COPY [[CONCAT_VECTORS]](<4 x s16>) ; GFX9-LABEL: name: test_shl_v4s16_v4s16 ; GFX9: [[COPY:%[0-9]+]]:_(<4 x s16>) = COPY $vgpr0_vgpr1 ; GFX9: [[COPY1:%[0-9]+]]:_(<4 x s16>) = COPY $vgpr2_vgpr3 Modified: llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-shuffle-vector.mir URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-shuffle-vector.mir?rev=374252&r1=374251&r2=374252&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-shuffle-vector.mir (original) +++ llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-shuffle-vector.mir Wed Oct 9 15:44:43 2019 @@ -349,8 +349,10 @@ body: | ; CHECK: [[BUILD_VECTOR3:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[ASHR9]](s32), [[ASHR10]](s32), [[ASHR11]](s32) ; CHECK: [[EXTRACT5:%[0-9]+]]:_(s32) = G_EXTRACT [[BUILD_VECTOR3]](<3 x s32>), 0 ; CHECK: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[EXTRACT5]](s32) - ; CHECK: [[BUILD_VECTOR4:%[0-9]+]]:_(<4 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) - ; CHECK: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR4]](<4 x s16>) + ; CHECK: [[BUILD_VECTOR4:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16) + ; CHECK: [[BUILD_VECTOR5:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC2]](s16), [[TRUNC3]](s16) + ; CHECK: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR4]](<2 x s16>), [[BUILD_VECTOR5]](<2 x s16>) + ; CHECK: $vgpr0_vgpr1 = COPY [[CONCAT_VECTORS]](<4 x s16>) %0:_(<4 x s16>) = COPY $vgpr0_vgpr1 %1:_(<4 x s16>) = COPY $vgpr2_vgpr3 %2:_(<3 x s16>) = G_EXTRACT %0, 0 Modified: llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-smax.mir URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-smax.mir?rev=374252&r1=374251&r2=374252&view=diff 
============================================================================== --- llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-smax.mir (original) +++ llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-smax.mir Wed Oct 9 15:44:43 2019 @@ -355,6 +355,7 @@ body: | ; SI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[COPY1]], [[C]](s32) ; SI: [[ASHR1:%[0-9]+]]:_(s32) = G_ASHR [[SHL1]], [[C]](s32) ; SI: [[SMAX:%[0-9]+]]:_(s32) = G_SMAX [[ASHR]], [[ASHR1]] + ; SI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[SMAX]](s32) ; SI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32) ; SI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[COPY2]], [[C]](s32) ; SI: [[ASHR2:%[0-9]+]]:_(s32) = G_ASHR [[SHL2]], [[C]](s32) @@ -362,6 +363,7 @@ body: | ; SI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[COPY3]], [[C]](s32) ; SI: [[ASHR3:%[0-9]+]]:_(s32) = G_ASHR [[SHL3]], [[C]](s32) ; SI: [[SMAX1:%[0-9]+]]:_(s32) = G_SMAX [[ASHR2]], [[ASHR3]] + ; SI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[SMAX1]](s32) ; SI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[BITCAST1]](s32) ; SI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[COPY4]], [[C]](s32) ; SI: [[ASHR4:%[0-9]+]]:_(s32) = G_ASHR [[SHL4]], [[C]](s32) @@ -369,12 +371,13 @@ body: | ; SI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[COPY5]], [[C]](s32) ; SI: [[ASHR5:%[0-9]+]]:_(s32) = G_ASHR [[SHL5]], [[C]](s32) ; SI: [[SMAX2:%[0-9]+]]:_(s32) = G_SMAX [[ASHR4]], [[ASHR5]] - ; SI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[SMAX]](s32) - ; SI: [[COPY7:%[0-9]+]]:_(s32) = COPY [[SMAX1]](s32) - ; SI: [[COPY8:%[0-9]+]]:_(s32) = COPY [[SMAX2]](s32) - ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[COPY6]](s32), [[COPY7]](s32), [[COPY8]](s32) - ; SI: [[TRUNC:%[0-9]+]]:_(<3 x s16>) = G_TRUNC [[BUILD_VECTOR]](<3 x s32>) - ; SI: S_NOP 0, implicit [[TRUNC]](<3 x s16>) + ; SI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[SMAX2]](s32) + ; SI: [[DEF4:%[0-9]+]]:_(s16) = G_IMPLICIT_DEF + ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16) + ; SI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = 
G_BUILD_VECTOR [[TRUNC2]](s16), [[DEF4]](s16) + ; SI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; SI: [[EXTRACT2:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[CONCAT_VECTORS]](<4 x s16>), 0 + ; SI: S_NOP 0, implicit [[EXTRACT2]](<3 x s16>) ; VI-LABEL: name: test_smax_v3s16 ; VI: [[DEF:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF ; VI: [[EXTRACT:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[DEF]](<4 x s16>), 0 @@ -404,12 +407,12 @@ body: | ; VI: [[SMAX:%[0-9]+]]:_(s16) = G_SMAX [[TRUNC]], [[TRUNC3]] ; VI: [[SMAX1:%[0-9]+]]:_(s16) = G_SMAX [[TRUNC1]], [[TRUNC4]] ; VI: [[SMAX2:%[0-9]+]]:_(s16) = G_SMAX [[TRUNC2]], [[TRUNC5]] - ; VI: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[SMAX]](s16) - ; VI: [[ANYEXT1:%[0-9]+]]:_(s32) = G_ANYEXT [[SMAX1]](s16) - ; VI: [[ANYEXT2:%[0-9]+]]:_(s32) = G_ANYEXT [[SMAX2]](s16) - ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[ANYEXT]](s32), [[ANYEXT1]](s32), [[ANYEXT2]](s32) - ; VI: [[TRUNC6:%[0-9]+]]:_(<3 x s16>) = G_TRUNC [[BUILD_VECTOR]](<3 x s32>) - ; VI: S_NOP 0, implicit [[TRUNC6]](<3 x s16>) + ; VI: [[DEF4:%[0-9]+]]:_(s16) = G_IMPLICIT_DEF + ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[SMAX]](s16), [[SMAX1]](s16) + ; VI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[SMAX2]](s16), [[DEF4]](s16) + ; VI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; VI: [[EXTRACT2:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[CONCAT_VECTORS]](<4 x s16>), 0 + ; VI: S_NOP 0, implicit [[EXTRACT2]](<3 x s16>) ; GFX9-LABEL: name: test_smax_v3s16 ; GFX9: [[DEF:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF ; GFX9: [[EXTRACT:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[DEF]](<4 x s16>), 0 @@ -484,8 +487,10 @@ body: | ; SI: [[ASHR7:%[0-9]+]]:_(s32) = G_ASHR [[SHL7]], [[C]](s32) ; SI: [[SMAX3:%[0-9]+]]:_(s32) = G_SMAX [[ASHR6]], [[ASHR7]] ; SI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[SMAX3]](s32) - ; SI: 
[[BUILD_VECTOR:%[0-9]+]]:_(<4 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) - ; SI: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<4 x s16>) + ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16) + ; SI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC2]](s16), [[TRUNC3]](s16) + ; SI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; SI: $vgpr0_vgpr1 = COPY [[CONCAT_VECTORS]](<4 x s16>) ; VI-LABEL: name: test_smax_v4s16 ; VI: [[COPY:%[0-9]+]]:_(<4 x s16>) = COPY $vgpr0_vgpr1 ; VI: [[COPY1:%[0-9]+]]:_(<4 x s16>) = COPY $vgpr2_vgpr3 @@ -512,8 +517,10 @@ body: | ; VI: [[SMAX1:%[0-9]+]]:_(s16) = G_SMAX [[TRUNC1]], [[TRUNC5]] ; VI: [[SMAX2:%[0-9]+]]:_(s16) = G_SMAX [[TRUNC2]], [[TRUNC6]] ; VI: [[SMAX3:%[0-9]+]]:_(s16) = G_SMAX [[TRUNC3]], [[TRUNC7]] - ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s16>) = G_BUILD_VECTOR [[SMAX]](s16), [[SMAX1]](s16), [[SMAX2]](s16), [[SMAX3]](s16) - ; VI: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<4 x s16>) + ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[SMAX]](s16), [[SMAX1]](s16) + ; VI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[SMAX2]](s16), [[SMAX3]](s16) + ; VI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; VI: $vgpr0_vgpr1 = COPY [[CONCAT_VECTORS]](<4 x s16>) ; GFX9-LABEL: name: test_smax_v4s16 ; GFX9: [[COPY:%[0-9]+]]:_(<4 x s16>) = COPY $vgpr0_vgpr1 ; GFX9: [[COPY1:%[0-9]+]]:_(<4 x s16>) = COPY $vgpr2_vgpr3 Modified: llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-smin.mir URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-smin.mir?rev=374252&r1=374251&r2=374252&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-smin.mir (original) +++ 
llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-smin.mir Wed Oct 9 15:44:43 2019 @@ -355,6 +355,7 @@ body: | ; SI: [[SHL1:%[0-9]+]]:_(s32) = G_SHL [[COPY1]], [[C]](s32) ; SI: [[ASHR1:%[0-9]+]]:_(s32) = G_ASHR [[SHL1]], [[C]](s32) ; SI: [[SMIN:%[0-9]+]]:_(s32) = G_SMIN [[ASHR]], [[ASHR1]] + ; SI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[SMIN]](s32) ; SI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32) ; SI: [[SHL2:%[0-9]+]]:_(s32) = G_SHL [[COPY2]], [[C]](s32) ; SI: [[ASHR2:%[0-9]+]]:_(s32) = G_ASHR [[SHL2]], [[C]](s32) @@ -362,6 +363,7 @@ body: | ; SI: [[SHL3:%[0-9]+]]:_(s32) = G_SHL [[COPY3]], [[C]](s32) ; SI: [[ASHR3:%[0-9]+]]:_(s32) = G_ASHR [[SHL3]], [[C]](s32) ; SI: [[SMIN1:%[0-9]+]]:_(s32) = G_SMIN [[ASHR2]], [[ASHR3]] + ; SI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[SMIN1]](s32) ; SI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[BITCAST1]](s32) ; SI: [[SHL4:%[0-9]+]]:_(s32) = G_SHL [[COPY4]], [[C]](s32) ; SI: [[ASHR4:%[0-9]+]]:_(s32) = G_ASHR [[SHL4]], [[C]](s32) @@ -369,12 +371,13 @@ body: | ; SI: [[SHL5:%[0-9]+]]:_(s32) = G_SHL [[COPY5]], [[C]](s32) ; SI: [[ASHR5:%[0-9]+]]:_(s32) = G_ASHR [[SHL5]], [[C]](s32) ; SI: [[SMIN2:%[0-9]+]]:_(s32) = G_SMIN [[ASHR4]], [[ASHR5]] - ; SI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[SMIN]](s32) - ; SI: [[COPY7:%[0-9]+]]:_(s32) = COPY [[SMIN1]](s32) - ; SI: [[COPY8:%[0-9]+]]:_(s32) = COPY [[SMIN2]](s32) - ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[COPY6]](s32), [[COPY7]](s32), [[COPY8]](s32) - ; SI: [[TRUNC:%[0-9]+]]:_(<3 x s16>) = G_TRUNC [[BUILD_VECTOR]](<3 x s32>) - ; SI: S_NOP 0, implicit [[TRUNC]](<3 x s16>) + ; SI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[SMIN2]](s32) + ; SI: [[DEF4:%[0-9]+]]:_(s16) = G_IMPLICIT_DEF + ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16) + ; SI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC2]](s16), [[DEF4]](s16) + ; SI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x 
s16>) + ; SI: [[EXTRACT2:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[CONCAT_VECTORS]](<4 x s16>), 0 + ; SI: S_NOP 0, implicit [[EXTRACT2]](<3 x s16>) ; VI-LABEL: name: test_smin_v3s16 ; VI: [[DEF:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF ; VI: [[EXTRACT:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[DEF]](<4 x s16>), 0 @@ -404,12 +407,12 @@ body: | ; VI: [[SMIN:%[0-9]+]]:_(s16) = G_SMIN [[TRUNC]], [[TRUNC3]] ; VI: [[SMIN1:%[0-9]+]]:_(s16) = G_SMIN [[TRUNC1]], [[TRUNC4]] ; VI: [[SMIN2:%[0-9]+]]:_(s16) = G_SMIN [[TRUNC2]], [[TRUNC5]] - ; VI: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[SMIN]](s16) - ; VI: [[ANYEXT1:%[0-9]+]]:_(s32) = G_ANYEXT [[SMIN1]](s16) - ; VI: [[ANYEXT2:%[0-9]+]]:_(s32) = G_ANYEXT [[SMIN2]](s16) - ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[ANYEXT]](s32), [[ANYEXT1]](s32), [[ANYEXT2]](s32) - ; VI: [[TRUNC6:%[0-9]+]]:_(<3 x s16>) = G_TRUNC [[BUILD_VECTOR]](<3 x s32>) - ; VI: S_NOP 0, implicit [[TRUNC6]](<3 x s16>) + ; VI: [[DEF4:%[0-9]+]]:_(s16) = G_IMPLICIT_DEF + ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[SMIN]](s16), [[SMIN1]](s16) + ; VI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[SMIN2]](s16), [[DEF4]](s16) + ; VI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; VI: [[EXTRACT2:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[CONCAT_VECTORS]](<4 x s16>), 0 + ; VI: S_NOP 0, implicit [[EXTRACT2]](<3 x s16>) ; GFX9-LABEL: name: test_smin_v3s16 ; GFX9: [[DEF:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF ; GFX9: [[EXTRACT:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[DEF]](<4 x s16>), 0 @@ -484,8 +487,10 @@ body: | ; SI: [[ASHR7:%[0-9]+]]:_(s32) = G_ASHR [[SHL7]], [[C]](s32) ; SI: [[SMIN3:%[0-9]+]]:_(s32) = G_SMIN [[ASHR6]], [[ASHR7]] ; SI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[SMIN3]](s32) - ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) - ; SI: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<4 x s16>) + 
; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16) + ; SI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC2]](s16), [[TRUNC3]](s16) + ; SI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; SI: $vgpr0_vgpr1 = COPY [[CONCAT_VECTORS]](<4 x s16>) ; VI-LABEL: name: test_smin_v4s16 ; VI: [[COPY:%[0-9]+]]:_(<4 x s16>) = COPY $vgpr0_vgpr1 ; VI: [[COPY1:%[0-9]+]]:_(<4 x s16>) = COPY $vgpr2_vgpr3 @@ -512,8 +517,10 @@ body: | ; VI: [[SMIN1:%[0-9]+]]:_(s16) = G_SMIN [[TRUNC1]], [[TRUNC5]] ; VI: [[SMIN2:%[0-9]+]]:_(s16) = G_SMIN [[TRUNC2]], [[TRUNC6]] ; VI: [[SMIN3:%[0-9]+]]:_(s16) = G_SMIN [[TRUNC3]], [[TRUNC7]] - ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s16>) = G_BUILD_VECTOR [[SMIN]](s16), [[SMIN1]](s16), [[SMIN2]](s16), [[SMIN3]](s16) - ; VI: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<4 x s16>) + ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[SMIN]](s16), [[SMIN1]](s16) + ; VI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[SMIN2]](s16), [[SMIN3]](s16) + ; VI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; VI: $vgpr0_vgpr1 = COPY [[CONCAT_VECTORS]](<4 x s16>) ; GFX9-LABEL: name: test_smin_v4s16 ; GFX9: [[COPY:%[0-9]+]]:_(<4 x s16>) = COPY $vgpr0_vgpr1 ; GFX9: [[COPY1:%[0-9]+]]:_(<4 x s16>) = COPY $vgpr2_vgpr3 Modified: llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-umax.mir URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-umax.mir?rev=374252&r1=374251&r2=374252&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-umax.mir (original) +++ llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-umax.mir Wed Oct 9 15:44:43 2019 @@ -337,22 +337,25 @@ body: | ; SI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[BITCAST2]](s32) ; SI: 
[[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] ; SI: [[UMAX:%[0-9]+]]:_(s32) = G_UMAX [[AND]], [[AND1]] + ; SI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[UMAX]](s32) ; SI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32) ; SI: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] ; SI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32) ; SI: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C1]] ; SI: [[UMAX1:%[0-9]+]]:_(s32) = G_UMAX [[AND2]], [[AND3]] + ; SI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[UMAX1]](s32) ; SI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[BITCAST1]](s32) ; SI: [[AND4:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C1]] ; SI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[BITCAST3]](s32) ; SI: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C1]] ; SI: [[UMAX2:%[0-9]+]]:_(s32) = G_UMAX [[AND4]], [[AND5]] - ; SI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[UMAX]](s32) - ; SI: [[COPY7:%[0-9]+]]:_(s32) = COPY [[UMAX1]](s32) - ; SI: [[COPY8:%[0-9]+]]:_(s32) = COPY [[UMAX2]](s32) - ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[COPY6]](s32), [[COPY7]](s32), [[COPY8]](s32) - ; SI: [[TRUNC:%[0-9]+]]:_(<3 x s16>) = G_TRUNC [[BUILD_VECTOR]](<3 x s32>) - ; SI: S_NOP 0, implicit [[TRUNC]](<3 x s16>) + ; SI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[UMAX2]](s32) + ; SI: [[DEF4:%[0-9]+]]:_(s16) = G_IMPLICIT_DEF + ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16) + ; SI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC2]](s16), [[DEF4]](s16) + ; SI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; SI: [[EXTRACT2:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[CONCAT_VECTORS]](<4 x s16>), 0 + ; SI: S_NOP 0, implicit [[EXTRACT2]](<3 x s16>) ; VI-LABEL: name: test_umax_v3s16 ; VI: [[DEF:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF ; VI: [[EXTRACT:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[DEF]](<4 x s16>), 0 @@ -382,12 +385,12 @@ body: | ; VI: [[UMAX:%[0-9]+]]:_(s16) = G_UMAX [[TRUNC]], [[TRUNC3]] ; VI: 
[[UMAX1:%[0-9]+]]:_(s16) = G_UMAX [[TRUNC1]], [[TRUNC4]] ; VI: [[UMAX2:%[0-9]+]]:_(s16) = G_UMAX [[TRUNC2]], [[TRUNC5]] - ; VI: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[UMAX]](s16) - ; VI: [[ANYEXT1:%[0-9]+]]:_(s32) = G_ANYEXT [[UMAX1]](s16) - ; VI: [[ANYEXT2:%[0-9]+]]:_(s32) = G_ANYEXT [[UMAX2]](s16) - ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[ANYEXT]](s32), [[ANYEXT1]](s32), [[ANYEXT2]](s32) - ; VI: [[TRUNC6:%[0-9]+]]:_(<3 x s16>) = G_TRUNC [[BUILD_VECTOR]](<3 x s32>) - ; VI: S_NOP 0, implicit [[TRUNC6]](<3 x s16>) + ; VI: [[DEF4:%[0-9]+]]:_(s16) = G_IMPLICIT_DEF + ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[UMAX]](s16), [[UMAX1]](s16) + ; VI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[UMAX2]](s16), [[DEF4]](s16) + ; VI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; VI: [[EXTRACT2:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[CONCAT_VECTORS]](<4 x s16>), 0 + ; VI: S_NOP 0, implicit [[EXTRACT2]](<3 x s16>) ; GFX9-LABEL: name: test_umax_v3s16 ; GFX9: [[DEF:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF ; GFX9: [[EXTRACT:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[DEF]](<4 x s16>), 0 @@ -455,8 +458,10 @@ body: | ; SI: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY9]], [[C1]] ; SI: [[UMAX3:%[0-9]+]]:_(s32) = G_UMAX [[AND6]], [[AND7]] ; SI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[UMAX3]](s32) - ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) - ; SI: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<4 x s16>) + ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16) + ; SI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC2]](s16), [[TRUNC3]](s16) + ; SI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; SI: $vgpr0_vgpr1 = COPY [[CONCAT_VECTORS]](<4 x s16>) ; VI-LABEL: name: test_umax_v4s16 
; VI: [[COPY:%[0-9]+]]:_(<4 x s16>) = COPY $vgpr0_vgpr1 ; VI: [[COPY1:%[0-9]+]]:_(<4 x s16>) = COPY $vgpr2_vgpr3 @@ -483,8 +488,10 @@ body: | ; VI: [[UMAX1:%[0-9]+]]:_(s16) = G_UMAX [[TRUNC1]], [[TRUNC5]] ; VI: [[UMAX2:%[0-9]+]]:_(s16) = G_UMAX [[TRUNC2]], [[TRUNC6]] ; VI: [[UMAX3:%[0-9]+]]:_(s16) = G_UMAX [[TRUNC3]], [[TRUNC7]] - ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s16>) = G_BUILD_VECTOR [[UMAX]](s16), [[UMAX1]](s16), [[UMAX2]](s16), [[UMAX3]](s16) - ; VI: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<4 x s16>) + ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[UMAX]](s16), [[UMAX1]](s16) + ; VI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[UMAX2]](s16), [[UMAX3]](s16) + ; VI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; VI: $vgpr0_vgpr1 = COPY [[CONCAT_VECTORS]](<4 x s16>) ; GFX9-LABEL: name: test_umax_v4s16 ; GFX9: [[COPY:%[0-9]+]]:_(<4 x s16>) = COPY $vgpr0_vgpr1 ; GFX9: [[COPY1:%[0-9]+]]:_(<4 x s16>) = COPY $vgpr2_vgpr3 Modified: llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-umin.mir URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-umin.mir?rev=374252&r1=374251&r2=374252&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-umin.mir (original) +++ llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-umin.mir Wed Oct 9 15:44:43 2019 @@ -337,22 +337,25 @@ body: | ; SI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[BITCAST2]](s32) ; SI: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] ; SI: [[UMIN:%[0-9]+]]:_(s32) = G_UMIN [[AND]], [[AND1]] + ; SI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[UMIN]](s32) ; SI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32) ; SI: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[C1]] ; SI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32) ; SI: [[AND3:%[0-9]+]]:_(s32) = G_AND [[COPY3]], [[C1]] ; SI: [[UMIN1:%[0-9]+]]:_(s32) = 
G_UMIN [[AND2]], [[AND3]] + ; SI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[UMIN1]](s32) ; SI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[BITCAST1]](s32) ; SI: [[AND4:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[C1]] ; SI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[BITCAST3]](s32) ; SI: [[AND5:%[0-9]+]]:_(s32) = G_AND [[COPY5]], [[C1]] ; SI: [[UMIN2:%[0-9]+]]:_(s32) = G_UMIN [[AND4]], [[AND5]] - ; SI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[UMIN]](s32) - ; SI: [[COPY7:%[0-9]+]]:_(s32) = COPY [[UMIN1]](s32) - ; SI: [[COPY8:%[0-9]+]]:_(s32) = COPY [[UMIN2]](s32) - ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[COPY6]](s32), [[COPY7]](s32), [[COPY8]](s32) - ; SI: [[TRUNC:%[0-9]+]]:_(<3 x s16>) = G_TRUNC [[BUILD_VECTOR]](<3 x s32>) - ; SI: S_NOP 0, implicit [[TRUNC]](<3 x s16>) + ; SI: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[UMIN2]](s32) + ; SI: [[DEF4:%[0-9]+]]:_(s16) = G_IMPLICIT_DEF + ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16) + ; SI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC2]](s16), [[DEF4]](s16) + ; SI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; SI: [[EXTRACT2:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[CONCAT_VECTORS]](<4 x s16>), 0 + ; SI: S_NOP 0, implicit [[EXTRACT2]](<3 x s16>) ; VI-LABEL: name: test_umin_v3s16 ; VI: [[DEF:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF ; VI: [[EXTRACT:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[DEF]](<4 x s16>), 0 @@ -382,12 +385,12 @@ body: | ; VI: [[UMIN:%[0-9]+]]:_(s16) = G_UMIN [[TRUNC]], [[TRUNC3]] ; VI: [[UMIN1:%[0-9]+]]:_(s16) = G_UMIN [[TRUNC1]], [[TRUNC4]] ; VI: [[UMIN2:%[0-9]+]]:_(s16) = G_UMIN [[TRUNC2]], [[TRUNC5]] - ; VI: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[UMIN]](s16) - ; VI: [[ANYEXT1:%[0-9]+]]:_(s32) = G_ANYEXT [[UMIN1]](s16) - ; VI: [[ANYEXT2:%[0-9]+]]:_(s32) = G_ANYEXT [[UMIN2]](s16) - ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<3 x s32>) = G_BUILD_VECTOR [[ANYEXT]](s32), [[ANYEXT1]](s32), [[ANYEXT2]](s32) - 
; VI: [[TRUNC6:%[0-9]+]]:_(<3 x s16>) = G_TRUNC [[BUILD_VECTOR]](<3 x s32>) - ; VI: S_NOP 0, implicit [[TRUNC6]](<3 x s16>) + ; VI: [[DEF4:%[0-9]+]]:_(s16) = G_IMPLICIT_DEF + ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[UMIN]](s16), [[UMIN1]](s16) + ; VI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[UMIN2]](s16), [[DEF4]](s16) + ; VI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; VI: [[EXTRACT2:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[CONCAT_VECTORS]](<4 x s16>), 0 + ; VI: S_NOP 0, implicit [[EXTRACT2]](<3 x s16>) ; GFX9-LABEL: name: test_umin_v3s16 ; GFX9: [[DEF:%[0-9]+]]:_(<4 x s16>) = G_IMPLICIT_DEF ; GFX9: [[EXTRACT:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[DEF]](<4 x s16>), 0 @@ -455,8 +458,10 @@ body: | ; SI: [[AND7:%[0-9]+]]:_(s32) = G_AND [[COPY9]], [[C1]] ; SI: [[UMIN3:%[0-9]+]]:_(s32) = G_UMIN [[AND6]], [[AND7]] ; SI: [[TRUNC3:%[0-9]+]]:_(s16) = G_TRUNC [[UMIN3]](s32) - ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16), [[TRUNC2]](s16), [[TRUNC3]](s16) - ; SI: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<4 x s16>) + ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC]](s16), [[TRUNC1]](s16) + ; SI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[TRUNC2]](s16), [[TRUNC3]](s16) + ; SI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; SI: $vgpr0_vgpr1 = COPY [[CONCAT_VECTORS]](<4 x s16>) ; VI-LABEL: name: test_umin_v4s16 ; VI: [[COPY:%[0-9]+]]:_(<4 x s16>) = COPY $vgpr0_vgpr1 ; VI: [[COPY1:%[0-9]+]]:_(<4 x s16>) = COPY $vgpr2_vgpr3 @@ -483,8 +488,10 @@ body: | ; VI: [[UMIN1:%[0-9]+]]:_(s16) = G_UMIN [[TRUNC1]], [[TRUNC5]] ; VI: [[UMIN2:%[0-9]+]]:_(s16) = G_UMIN [[TRUNC2]], [[TRUNC6]] ; VI: [[UMIN3:%[0-9]+]]:_(s16) = G_UMIN [[TRUNC3]], [[TRUNC7]] - ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s16>) = G_BUILD_VECTOR [[UMIN]](s16), 
[[UMIN1]](s16), [[UMIN2]](s16), [[UMIN3]](s16) - ; VI: $vgpr0_vgpr1 = COPY [[BUILD_VECTOR]](<4 x s16>) + ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[UMIN]](s16), [[UMIN1]](s16) + ; VI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[UMIN2]](s16), [[UMIN3]](s16) + ; VI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s16>), [[BUILD_VECTOR1]](<2 x s16>) + ; VI: $vgpr0_vgpr1 = COPY [[CONCAT_VECTORS]](<4 x s16>) ; GFX9-LABEL: name: test_umin_v4s16 ; GFX9: [[COPY:%[0-9]+]]:_(<4 x s16>) = COPY $vgpr0_vgpr1 ; GFX9: [[COPY1:%[0-9]+]]:_(<4 x s16>) = COPY $vgpr2_vgpr3 From llvm-commits at lists.llvm.org Wed Oct 9 15:44:47 2019 From: llvm-commits at lists.llvm.org (Matt Arsenault via llvm-commits) Date: Wed, 09 Oct 2019 22:44:47 -0000 Subject: [llvm] r374253 - AMDGPU: Fix typos Message-ID: <20191009224447.3B1D186400@lists.llvm.org> Author: arsenm Date: Wed Oct 9 15:44:47 2019 New Revision: 374253 URL: http://llvm.org/viewvc/llvm-project?rev=374253&view=rev Log: AMDGPU: Fix typos Modified: llvm/trunk/lib/Target/AMDGPU/SIFixSGPRCopies.cpp Modified: llvm/trunk/lib/Target/AMDGPU/SIFixSGPRCopies.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/SIFixSGPRCopies.cpp?rev=374253&r1=374252&r2=374253&view=diff ============================================================================== --- llvm/trunk/lib/Target/AMDGPU/SIFixSGPRCopies.cpp (original) +++ llvm/trunk/lib/Target/AMDGPU/SIFixSGPRCopies.cpp Wed Oct 9 15:44:47 2019 @@ -606,14 +606,14 @@ static bool hoistAndMergeSGPRInits(unsig auto MBB = MI->getParent(); MachineInstr &BoundaryMI = *getFirstNonPrologue(MBB, TII); MachineBasicBlock::reverse_iterator B(BoundaryMI); - // Check if B should actually be a bondary. If not set the previous + // Check if B should actually be a boundary. If not set the previous // instruction as the boundary instead. 
if (!TII->isBasicBlockPrologue(*B)) B++; auto R = std::next(MI->getReverseIterator()); const unsigned Threshold = 50; - // Search until B or Threashold for a place to insert the initialization. + // Search until B or Threshold for a place to insert the initialization. for (unsigned I = 0; R != B && I < Threshold; ++R, ++I) if (R->readsRegister(Reg, TRI) || R->definesRegister(Reg, TRI) || TII->isSchedulingBoundary(*R, MBB, *MBB->getParent())) From llvm-commits at lists.llvm.org Wed Oct 9 15:44:48 2019 From: llvm-commits at lists.llvm.org (Matt Arsenault via llvm-commits) Date: Wed, 09 Oct 2019 22:44:48 -0000 Subject: [llvm] r374254 - AMDGPU: Relax register classes used Message-ID: <20191009224448.40528864D7@lists.llvm.org> Author: arsenm Date: Wed Oct 9 15:44:48 2019 New Revision: 374254 URL: http://llvm.org/viewvc/llvm-project?rev=374254&view=rev Log: AMDGPU: Relax register classes used Modified: llvm/trunk/lib/Target/AMDGPU/SILoadStoreOptimizer.cpp Modified: llvm/trunk/lib/Target/AMDGPU/SILoadStoreOptimizer.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/SILoadStoreOptimizer.cpp?rev=374254&r1=374253&r2=374254&view=diff ============================================================================== --- llvm/trunk/lib/Target/AMDGPU/SILoadStoreOptimizer.cpp (original) +++ llvm/trunk/lib/Target/AMDGPU/SILoadStoreOptimizer.cpp Wed Oct 9 15:44:48 2019 @@ -819,7 +819,7 @@ SILoadStoreOptimizer::mergeRead2Pair(Com unsigned BaseSubReg = AddrReg->getSubReg(); unsigned BaseRegFlags = 0; if (CI.BaseOff) { - Register ImmReg = MRI->createVirtualRegister(&AMDGPU::SGPR_32RegClass); + Register ImmReg = MRI->createVirtualRegister(&AMDGPU::SReg_32RegClass); BuildMI(*MBB, CI.Paired, DL, TII->get(AMDGPU::S_MOV_B32), ImmReg) .addImm(CI.BaseOff); @@ -912,7 +912,7 @@ SILoadStoreOptimizer::mergeWrite2Pair(Co unsigned BaseSubReg = AddrReg->getSubReg(); unsigned BaseRegFlags = 0; if (CI.BaseOff) { - Register ImmReg = 
MRI->createVirtualRegister(&AMDGPU::SGPR_32RegClass); + Register ImmReg = MRI->createVirtualRegister(&AMDGPU::SReg_32RegClass); BuildMI(*MBB, CI.Paired, DL, TII->get(AMDGPU::S_MOV_B32), ImmReg) .addImm(CI.BaseOff); From llvm-commits at lists.llvm.org Wed Oct 9 15:44:49 2019 From: llvm-commits at lists.llvm.org (Matt Arsenault via llvm-commits) Date: Wed, 09 Oct 2019 22:44:49 -0000 Subject: [llvm] r374255 - AMDGPU/GlobalISel: Fix crash on wide constant load with VGPR pointer Message-ID: <20191009224449.7B8538637F@lists.llvm.org> Author: arsenm Date: Wed Oct 9 15:44:49 2019 New Revision: 374255 URL: http://llvm.org/viewvc/llvm-project?rev=374255&view=rev Log: AMDGPU/GlobalISel: Fix crash on wide constant load with VGPR pointer This was ignoring the register bank of the input pointer, and isUniformMMO seems overly aggressive. This will now conservatively assume a VGPR in cases where the incoming bank hasn't been determined yet (i.e. is from a loop phi). Modified: llvm/trunk/lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/regbankselect-load.mir Modified: llvm/trunk/lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp?rev=374255&r1=374254&r2=374255&view=diff ============================================================================== --- llvm/trunk/lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp (original) +++ llvm/trunk/lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp Wed Oct 9 15:44:49 2019 @@ -323,6 +323,8 @@ AMDGPURegisterBankInfo::getInstrAlternat } } +// FIXME: Returns uniform if there's no source value information. This is +// probably wrong. static bool isInstrUniformNonExtLoadAlign4(const MachineInstr &MI) { if (!MI.hasOneMemOperand()) return false; @@ -1047,8 +1049,13 @@ bool AMDGPURegisterBankInfo::applyMappin SmallVector SrcRegs(OpdMapper.getVRegs(1)); // If the pointer is an SGPR, we have nothing to do. 
- if (SrcRegs.empty()) - return false; + if (SrcRegs.empty()) { + Register PtrReg = MI.getOperand(1).getReg(); + const RegisterBank *PtrBank = getRegBank(PtrReg, MRI, *TRI); + if (PtrBank == &AMDGPU::SGPRRegBank) + return false; + SrcRegs.push_back(PtrReg); + } assert(LoadSize % MaxNonSmrdLoadSize == 0); @@ -2025,7 +2032,7 @@ AMDGPURegisterBankInfo::getInstrMappingF const MachineFunction &MF = *MI.getParent()->getParent(); const MachineRegisterInfo &MRI = MF.getRegInfo(); - SmallVector OpdsMapping(MI.getNumOperands()); + SmallVector OpdsMapping(2); unsigned Size = getSizeInBits(MI.getOperand(0).getReg(), MRI, *TRI); LLT LoadTy = MRI.getType(MI.getOperand(0).getReg()); Register PtrReg = MI.getOperand(1).getReg(); @@ -2036,7 +2043,10 @@ AMDGPURegisterBankInfo::getInstrMappingF const ValueMapping *ValMapping; const ValueMapping *PtrMapping; - if ((AS != AMDGPUAS::LOCAL_ADDRESS && AS != AMDGPUAS::REGION_ADDRESS && + const RegisterBank *PtrBank = getRegBank(PtrReg, MRI, *TRI); + + if (PtrBank == &AMDGPU::SGPRRegBank && + (AS != AMDGPUAS::LOCAL_ADDRESS && AS != AMDGPUAS::REGION_ADDRESS && AS != AMDGPUAS::PRIVATE_ADDRESS) && isInstrUniformNonExtLoadAlign4(MI)) { // We have a uniform instruction so we want to use an SMRD load Modified: llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/regbankselect-load.mir URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/regbankselect-load.mir?rev=374255&r1=374254&r2=374255&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/regbankselect-load.mir (original) +++ llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/regbankselect-load.mir Wed Oct 9 15:44:49 2019 @@ -69,6 +69,8 @@ define amdgpu_kernel void @load_constant_i32_uniform_align2() {ret void} define amdgpu_kernel void @load_constant_i32_uniform_align1() {ret void} define amdgpu_kernel void @load_private_uniform_sgpr_i32() {ret void} + define amdgpu_kernel void 
@load_constant_v8i32_vgpr_crash() { ret void } + define amdgpu_kernel void @load_constant_v8i32_vgpr_crash_loop_phi() { ret void } declare i32 @llvm.amdgcn.workitem.id.x() #0 attributes #0 = { nounwind readnone } @@ -652,3 +654,47 @@ body: | %0:_(p5) = COPY $sgpr0 %1:_(s32) = G_LOAD %0 :: (load 4, addrspace 5, align 4) ... + +--- +name: load_constant_v8i32_vgpr_crash +legalized: true +tracksRegLiveness: true + +body: | + bb.0: + liveins: $vgpr0_vgpr1 + + ; CHECK-LABEL: name: load_constant_v8i32_vgpr_crash + ; CHECK: %0:vgpr(p4) = COPY $vgpr0_vgpr1 + ; CHECK: vgpr(<4 x s32>) = G_LOAD %0(p4) + ; CHECK: vgpr(<4 x s32>) = G_LOAD + ; CHECK: G_CONCAT_VECTORS + %0:_(p4) = COPY $vgpr0_vgpr1 + %1:_(<8 x s32>) = G_LOAD %0 :: (load 32, addrspace 4) +... + +--- +name: load_constant_v8i32_vgpr_crash_loop_phi +legalized: true +tracksRegLiveness: true + +body: | + bb.0: + liveins: $sgpr0_sgpr1, $sgpr2_sgpr3 + + ; CHECK-LABEL: name: load_constant_v8i32_vgpr_crash_loop_phi + ; CHECK: G_PHI + ; CHECK: vgpr(<4 x s32>) = G_LOAD + ; CHECK: vgpr(<4 x s32>) = G_LOAD + ; CHECK: G_CONCAT_VECTORS + + %0:_(p4) = COPY $sgpr0_sgpr1 + %1:_(p4) = COPY $sgpr2_sgpr3 + G_BR %bb.1 + + bb.1: + %2:_(p4) = G_PHI %0, %bb.0, %4, %bb.1 + %3:_(<8 x s32>) = G_LOAD %2 :: (load 32, addrspace 4) + %4:_(p4) = COPY %1 + G_BR %bb.1 +... From llvm-commits at lists.llvm.org Wed Oct 9 15:43:40 2019 From: llvm-commits at lists.llvm.org (Matt Arsenault via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 22:43:40 +0000 (UTC) Subject: [PATCH] D68739: [GISel] Allow ConstantFoldBinOp to consider G_FCONSTANT binary representation for combines In-Reply-To: References: Message-ID: <6e27c3f3628bd8b0577a5becd4e0e781@localhost.localdomain> arsenm added inline comments. 
================ Comment at: llvm/lib/CodeGen/GlobalISel/Utils.cpp:323-325 + (&FPVal->getValueAPF().getSemantics() == &APFloat::IEEEdouble() || + &FPVal->getValueAPF().getSemantics() == &APFloat::IEEEsingle() || + &FPVal->getValueAPF().getSemantics() == &APFloat::IEEEhalf())) { ---------------- Why do you need to whitelist these types? Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68739/new/ https://reviews.llvm.org/D68739 From llvm-commits at lists.llvm.org Wed Oct 9 15:43:41 2019 From: llvm-commits at lists.llvm.org (Matt Arsenault via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 22:43:41 +0000 (UTC) Subject: [PATCH] D68479: GlobalISel: Implement fewerElementsVector for G_BUILD_VECTOR In-Reply-To: References: Message-ID: <19a4417d26a3f298e30af2acc9ab0a37@localhost.localdomain> arsenm closed this revision. arsenm added a comment. r374252 CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68479/new/ https://reviews.llvm.org/D68479 From llvm-commits at lists.llvm.org Wed Oct 9 15:43:41 2019 From: llvm-commits at lists.llvm.org (Matt Arsenault via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 22:43:41 +0000 (UTC) Subject: [PATCH] D68600: AMDGPU/GlobalISel: Fix crash on wide constant load with VGPR pointer In-Reply-To: References: Message-ID: <4ee18e904cace31adcb6d829853e9e23@localhost.localdomain> arsenm closed this revision. arsenm added a comment. r374255 CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68600/new/ https://reviews.llvm.org/D68600 From llvm-commits at lists.llvm.org Wed Oct 9 15:43:45 2019 From: llvm-commits at lists.llvm.org (wael yehia via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 22:43:45 +0000 (UTC) Subject: [PATCH] D68718: [llvm-profdata] Make "malformed-ptr-to-counter-array.test" textual In-Reply-To: References: Message-ID: w2yehia added a comment. Hi Vedant, thanks for the quick fix. The test file works, and I'm able to patch it when I add more value profiles. 
I think the comments should be rearranged like so (basically I fixed the placement of the Counter and Name sections): // Header // // INSTR_PROF_RAW_HEADER(uint64_t, Magic, __llvm_profile_get_magic()) // INSTR_PROF_RAW_HEADER(uint64_t, Version, __llvm_profile_get_version()) // INSTR_PROF_RAW_HEADER(uint64_t, DataSize, DataSize) // INSTR_PROF_RAW_HEADER(uint64_t, CountersSize, CountersSize) // INSTR_PROF_RAW_HEADER(uint64_t, NamesSize, NamesSize) // INSTR_PROF_RAW_HEADER(uint64_t, CountersDelta, (uintptr_t)CountersBegin) // INSTR_PROF_RAW_HEADER(uint64_t, NamesDelta, (uintptr_t)NamesBegin) // INSTR_PROF_RAW_HEADER(uint64_t, ValueKindLast, IPVK_Last) RUN: printf '\201rforpl\377' > %t.profraw RUN: printf '\4\0\0\0\0\0\0\0' >> %t.profraw RUN: printf '\1\0\0\0\0\0\0\0' >> %t.profraw RUN: printf '\2\0\0\0\0\0\0\0' >> %t.profraw RUN: printf '\10\0\0\0\0\0\0\0' >> %t.profraw RUN: printf '\0\0\6\0\1\0\0\0' >> %t.profraw RUN: printf '\0\0\6\0\2\0\0\0' >> %t.profraw RUN: printf '\0\0\0\0\0\0\0\0' >> %t.profraw // Data Section // // INSTR_PROF_DATA(const uint64_t, llvm::Type::getInt64Ty(Ctx), NameRef, \ // ConstantInt::get(llvm::Type::getInt64Ty(Ctx), \ // IndexedInstrProf::ComputeHash(getPGOFuncNameVarInitializer(Inc->getName())))) // INSTR_PROF_DATA(const uint64_t, llvm::Type::getInt64Ty(Ctx), FuncHash, \ // ConstantInt::get(llvm::Type::getInt64Ty(Ctx), \ // Inc->getHash()->getZExtValue())) // INSTR_PROF_DATA(const IntPtrT, llvm::Type::getInt64PtrTy(Ctx), CounterPtr, \ // ConstantExpr::getBitCast(CounterPtr, \ // llvm::Type::getInt64PtrTy(Ctx))) // INSTR_PROF_DATA(const IntPtrT, llvm::Type::getInt8PtrTy(Ctx), FunctionPointer, \ // FunctionAddr) // INSTR_PROF_DATA(IntPtrT, llvm::Type::getInt8PtrTy(Ctx), Values, \ // ValuesPtrExpr) // INSTR_PROF_DATA(const uint32_t, llvm::Type::getInt32Ty(Ctx), NumCounters, \ // ConstantInt::get(llvm::Type::getInt32Ty(Ctx), NumCounters)) // INSTR_PROF_DATA(const uint16_t, Int16ArrayTy, NumValueSites[IPVK_Last+1], \ // 
ConstantArray::get(Int16ArrayTy, Int16ArrayVals) RUN: printf '\067\265\035\031\112\165\023\344' >> %t.profraw RUN: printf '\02\0\0\0\0\0\0\0' >> %t.profraw // Note: The CounterPtr here is off-by-one. This should trigger a malformed profile error. RUN: printf '\0\0\6\0\1\0\0\1' >> %t.profraw RUN: printf '\0\0\0\0\0\0\0\0' >> %t.profraw RUN: printf '\0\0\0\0\0\0\0\0' >> %t.profraw RUN: printf '\02\0\0\0\0\0\0\0' >> %t.profraw // Counter Section RUN: printf '\067\0\0\0\0\0\0\0' >> %t.profraw RUN: printf '\101\0\0\0\0\0\0\0' >> %t.profraw // Name Section RUN: printf '\3\0bar\0\0\0' >> %t.profraw RUN: not llvm-profdata merge -o /dev/null %t.profraw 2>&1 | FileCheck %s CHECK: Malformed instrumentation profile data Also, an alternative to listing the fields of the `ProfData` struct is to use this comment, or something similar, to indicate that an array of `ProfData` (or `__llvm_profile_data`) objects follows: struct ProfData { #define INSTR_PROF_DATA(Type, LLVMType, Name, Initializer) \ Type Name; #include "llvm/ProfileData/InstrProfData.inc" }; That way the comment doesn't have to be updated every time we update ProfData. Either way is fine. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68718/new/ https://reviews.llvm.org/D68718 From llvm-commits at lists.llvm.org Wed Oct 9 15:51:42 2019 From: llvm-commits at lists.llvm.org (Matt Arsenault via llvm-commits) Date: Wed, 09 Oct 2019 22:51:42 -0000 Subject: [llvm] r374257 - AMDGPU: Don't fold copies to physregs Message-ID: <20191009225142.7FD2C862B7@lists.llvm.org> Author: arsenm Date: Wed Oct 9 15:51:42 2019 New Revision: 374257 URL: http://llvm.org/viewvc/llvm-project?rev=374257&view=rev Log: AMDGPU: Don't fold copies to physregs In a future patch, this will help cleanup m0 handling. The register coalescer handles copies from a register that materializes an immediate, but doesn't handle move immediates itself. The virtual register uses will often be allocated to the same register, so there end up being no real copy. 
Modified: llvm/trunk/lib/Target/AMDGPU/SIFoldOperands.cpp llvm/trunk/test/CodeGen/AMDGPU/stack-pointer-offset-relative-frameindex.ll Modified: llvm/trunk/lib/Target/AMDGPU/SIFoldOperands.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/SIFoldOperands.cpp?rev=374257&r1=374256&r2=374257&view=diff ============================================================================== --- llvm/trunk/lib/Target/AMDGPU/SIFoldOperands.cpp (original) +++ llvm/trunk/lib/Target/AMDGPU/SIFoldOperands.cpp Wed Oct 9 15:51:42 2019 @@ -581,13 +581,17 @@ void SIFoldOperands::foldOperand( if (FoldingImmLike && UseMI->isCopy()) { Register DestReg = UseMI->getOperand(0).getReg(); - const TargetRegisterClass *DestRC = Register::isVirtualRegister(DestReg) - ? MRI->getRegClass(DestReg) - : TRI->getPhysRegClass(DestReg); + + // Don't fold into a copy to a physical register. Doing so would interfere + // with the register coalescer's logic which would avoid redundant + // initalizations. + if (DestReg.isPhysical()) + return; + + const TargetRegisterClass *DestRC = MRI->getRegClass(DestReg); Register SrcReg = UseMI->getOperand(1).getReg(); - if (Register::isVirtualRegister(DestReg) && - Register::isVirtualRegister(SrcReg)) { + if (SrcReg.isVirtual()) { // XXX - This can be an assert? 
const TargetRegisterClass * SrcRC = MRI->getRegClass(SrcReg); if (TRI->isSGPRClass(SrcRC) && TRI->hasVectorRegisters(DestRC)) { MachineRegisterInfo::use_iterator NextUse; Modified: llvm/trunk/test/CodeGen/AMDGPU/stack-pointer-offset-relative-frameindex.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/stack-pointer-offset-relative-frameindex.ll?rev=374257&r1=374256&r2=374257&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/AMDGPU/stack-pointer-offset-relative-frameindex.ll (original) +++ llvm/trunk/test/CodeGen/AMDGPU/stack-pointer-offset-relative-frameindex.ll Wed Oct 9 15:51:42 2019 @@ -14,8 +14,8 @@ define amdgpu_kernel void @kernel_backgr ; GCN-NEXT: s_mov_b64 s[0:1], s[36:37] ; GCN-NEXT: v_mov_b32_e32 v1, 0x2000 ; GCN-NEXT: v_mov_b32_e32 v2, 0x4000 -; GCN-NEXT: s_mov_b64 s[2:3], s[38:39] ; GCN-NEXT: v_mov_b32_e32 v3, 0 +; GCN-NEXT: s_mov_b64 s[2:3], s[38:39] ; GCN-NEXT: v_mov_b32_e32 v4, 0x400000 ; GCN-NEXT: s_add_u32 s32, s33, 0xc0000 ; GCN-NEXT: v_add_nc_u32_e64 v32, 4, 0x4000 From llvm-commits at lists.llvm.org Wed Oct 9 15:54:05 2019 From: llvm-commits at lists.llvm.org (Marcello Maggioni via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 22:54:05 +0000 (UTC) Subject: [PATCH] D68739: [GISel] Allow ConstantFoldBinOp to consider G_FCONSTANT binary representation for combines In-Reply-To: References: Message-ID: <0ec7d5ebaf0d207ed917bc28d5a2b6ba@localhost.localdomain> kariddi marked an inline comment as done. kariddi added inline comments. ================ Comment at: llvm/lib/CodeGen/GlobalISel/Utils.cpp:323-325 + (&FPVal->getValueAPF().getSemantics() == &APFloat::IEEEdouble() || + &FPVal->getValueAPF().getSemantics() == &APFloat::IEEEsingle() || + &FPVal->getValueAPF().getSemantics() == &APFloat::IEEEhalf())) { ---------------- arsenm wrote: > Why do you need to whitelist these types? 
Those are the types we know of that fit in a uint64, but I guess that could be checked from "the only user" of this function instead. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68739/new/ https://reviews.llvm.org/D68739 From llvm-commits at lists.llvm.org Wed Oct 9 15:54:05 2019 From: llvm-commits at lists.llvm.org (Hubert Tong via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 22:54:05 +0000 (UTC) Subject: [PATCH] D68721: [NFC][PowerPC]Clean up PPCAsmPrinter for TOC related pseudo opcode In-Reply-To: References: Message-ID: <9958c0fc1eed83f1789b70e32f1988dd@localhost.localdomain> hubert.reinterpretcast added inline comments. ================ Comment at: llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp:836 + LLVM_DEBUG( + !(MO.isGlobal() && Subtarget->isGVIndirectSymbol(MO.getGlobal())) && + "Interposable definitions must use indirect access."); ---------------- Missing the invocation of `assert` here? ================ Comment at: llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp:1377 - for (MapVector::iterator I = TOC.begin(), + for (MapVector::iterator I = TOC.begin(), E = TOC.end(); I != E; ++I) { ---------------- Has a range-based for loop been considered? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68721/new/ https://reviews.llvm.org/D68721 From llvm-commits at lists.llvm.org Wed Oct 9 15:54:06 2019 From: llvm-commits at lists.llvm.org (Matt Arsenault via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 22:54:06 +0000 (UTC) Subject: [PATCH] D68735: AMDGPU: Don't fold copies to physregs In-Reply-To: References: Message-ID: arsenm closed this revision. arsenm added a comment. 
r374257 CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68735/new/ https://reviews.llvm.org/D68735 From llvm-commits at lists.llvm.org Wed Oct 9 15:54:06 2019 From: llvm-commits at lists.llvm.org (Marcello Maggioni via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 22:54:06 +0000 (UTC) Subject: [PATCH] D68739: [GISel] Allow ConstantFoldBinOp to consider G_FCONSTANT binary representation for combines In-Reply-To: References: Message-ID: <1f2230c97c5473d7eb9d2a348fdc383c@localhost.localdomain> kariddi marked an inline comment as done. kariddi added inline comments. ================ Comment at: llvm/lib/CodeGen/GlobalISel/Utils.cpp:323-325 + (&FPVal->getValueAPF().getSemantics() == &APFloat::IEEEdouble() || + &FPVal->getValueAPF().getSemantics() == &APFloat::IEEEsingle() || + &FPVal->getValueAPF().getSemantics() == &APFloat::IEEEhalf())) { ---------------- kariddi wrote: > arsenm wrote: > > Why do you need to whitelist these types? > Are the types that fit a uint64 that we know of, but I guess that could be checked from "the only user" of this function instead Actually, that's not necessarily true ... ConstantFoldBinOp used to use a function that only handled Optional . I have now replaced that with a function that returns APInt (for simplicity), but I wanted to keep the functionality the same as getConstantVRegVal(), I guess ... I guess I can remove that limitation for the floats, but keep the limitation for the integers and then check in ConstantFoldBinOp. I'm just scared that allowing ConstantFoldBinOp to digest things it never saw before would cause some "unexpected consequence" :-P Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68739/new/ https://reviews.llvm.org/D68739 From llvm-commits at lists.llvm.org Wed Oct 9 15:57:07 2019 From: llvm-commits at lists.llvm.org (Matt Morehouse via llvm-commits) Date: Wed, 09 Oct 2019 22:57:07 -0000 Subject: [compiler-rt] r374258 - [sanitizer_common] Remove OnPrint from Go build. 
Message-ID: <20191009225707.6156686077@lists.llvm.org> Author: morehouse Date: Wed Oct 9 15:57:07 2019 New Revision: 374258 URL: http://llvm.org/viewvc/llvm-project?rev=374258&view=rev Log: [sanitizer_common] Remove OnPrint from Go build. Summary: Go now uses __sanitizer_on_print instead. Reviewers: vitalybuka, dvyukov Reviewed By: vitalybuka Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68621 Modified: compiler-rt/trunk/lib/sanitizer_common/sanitizer_printf.cpp Modified: compiler-rt/trunk/lib/sanitizer_common/sanitizer_printf.cpp URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/sanitizer_common/sanitizer_printf.cpp?rev=374258&r1=374257&r2=374258&view=diff ============================================================================== --- compiler-rt/trunk/lib/sanitizer_common/sanitizer_printf.cpp (original) +++ compiler-rt/trunk/lib/sanitizer_common/sanitizer_printf.cpp Wed Oct 9 15:57:07 2019 @@ -229,8 +229,6 @@ void SetPrintfAndReportCallback(void (*c // Can be overriden in frontend. #if SANITIZER_GO && defined(TSAN_EXTERNAL_HOOKS) // Implementation must be defined in frontend. -// TODO(morehouse): Remove OnPrint after migrating Go to __sanitizer_on_print. -extern "C" void OnPrint(const char *str); extern "C" void __sanitizer_on_print(const char *str); #else SANITIZER_INTERFACE_WEAK_DEF(void, __sanitizer_on_print, const char *str) { @@ -239,10 +237,6 @@ SANITIZER_INTERFACE_WEAK_DEF(void, __san #endif static void CallPrintfAndReportCallback(const char *str) { -#if SANITIZER_GO && defined(TSAN_EXTERNAL_HOOKS) - // TODO(morehouse): Remove OnPrint after migrating Go to __sanitizer_on_print. 
- OnPrint(str); -#endif __sanitizer_on_print(str); if (PrintfAndReportCallback) PrintfAndReportCallback(str); From llvm-commits at lists.llvm.org Wed Oct 9 16:03:28 2019 From: llvm-commits at lists.llvm.org (Matt Arsenault via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 23:03:28 +0000 (UTC) Subject: [PATCH] D68739: [GISel] Allow ConstantFoldBinOp to consider G_FCONSTANT binary representation for combines In-Reply-To: References: Message-ID: arsenm added inline comments. ================ Comment at: llvm/lib/CodeGen/GlobalISel/Utils.cpp:323-325 + (&FPVal->getValueAPF().getSemantics() == &APFloat::IEEEdouble() || + &FPVal->getValueAPF().getSemantics() == &APFloat::IEEEsingle() || + &FPVal->getValueAPF().getSemantics() == &APFloat::IEEEhalf())) { ---------------- kariddi wrote: > kariddi wrote: > > arsenm wrote: > > > Why do you need to whitelist these types? > > Are the types that fit a uint64 that we know of, but I guess that could be checked from "the only user" of this function instead > Actually, that's not necessarily true ... ConstantFoldBinOp used to use a function that only handled Optional . Now I substituted that with a function that returns APInt (for simplicity), but I wanted to keep the functionality the same as getConstantVRegVal() I guess ... > > I guess I can remove that limitation for the Floats, but maintain the limitation fo the integers and then check in ConstantFoldBinOp. > > I'm just scared that allowing ConstantFoldBinOp to digest things it never saw before would cause some "unexpected consequence" :-P I'm worried about somebody adding bfloat16 or something and then never updating this list. 
I think it would be fine to just return anything bitcastToAPInt will handle Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68739/new/ https://reviews.llvm.org/D68739 From llvm-commits at lists.llvm.org Wed Oct 9 16:03:30 2019 From: llvm-commits at lists.llvm.org (Julian Lettner via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 23:03:30 +0000 (UTC) Subject: [PATCH] D68292: [CMake] Disable building all Darwin libraries (except builtins) for macOS i386 when the SDK is >= 10.15. In-Reply-To: References: Message-ID: <4cf8b6bad70527da5bc65e0feafacb47@localhost.localdomain> yln accepted this revision. yln added a comment. This revision is now accepted and ready to land. I confirmed that `ninja check-asan` works with this patch on my macOS 10.15 machine. Repository: rCRT Compiler Runtime CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68292/new/ https://reviews.llvm.org/D68292 From llvm-commits at lists.llvm.org Wed Oct 9 16:03:34 2019 From: llvm-commits at lists.llvm.org (Matt Morehouse via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 23:03:34 +0000 (UTC) Subject: [PATCH] D68621: [sanitizer_common] Remove OnPrint from Go build. In-Reply-To: References: Message-ID: This revision was automatically updated to reflect the committed changes. Closed by commit rGda6cb7ba4c73: [sanitizer_common] Remove OnPrint from Go build. (authored by morehouse). Herald added a project: Sanitizers. Herald added a subscriber: Sanitizers. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68621/new/ https://reviews.llvm.org/D68621 Files: compiler-rt/lib/sanitizer_common/sanitizer_printf.cpp Index: compiler-rt/lib/sanitizer_common/sanitizer_printf.cpp =================================================================== --- compiler-rt/lib/sanitizer_common/sanitizer_printf.cpp +++ compiler-rt/lib/sanitizer_common/sanitizer_printf.cpp @@ -229,8 +229,6 @@ // Can be overriden in frontend. 
#if SANITIZER_GO && defined(TSAN_EXTERNAL_HOOKS) // Implementation must be defined in frontend. -// TODO(morehouse): Remove OnPrint after migrating Go to __sanitizer_on_print. -extern "C" void OnPrint(const char *str); extern "C" void __sanitizer_on_print(const char *str); #else SANITIZER_INTERFACE_WEAK_DEF(void, __sanitizer_on_print, const char *str) { @@ -239,10 +237,6 @@ #endif static void CallPrintfAndReportCallback(const char *str) { -#if SANITIZER_GO && defined(TSAN_EXTERNAL_HOOKS) - // TODO(morehouse): Remove OnPrint after migrating Go to __sanitizer_on_print. - OnPrint(str); -#endif __sanitizer_on_print(str); if (PrintfAndReportCallback) PrintfAndReportCallback(str); -------------- next part -------------- A non-text attachment was scrubbed... Name: D68621.224194.patch Type: text/x-patch Size: 1005 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 16:06:38 2019 From: llvm-commits at lists.llvm.org (Thomas Lively via llvm-commits) Date: Wed, 09 Oct 2019 23:06:38 -0000 Subject: [llvm] r374259 - [WebAssembly] Fix tests missed in rL374235 Message-ID: <20191009230638.6F2D38690C@lists.llvm.org> Author: tlively Date: Wed Oct 9 16:06:38 2019 New Revision: 374259 URL: http://llvm.org/viewvc/llvm-project?rev=374259&view=rev Log: [WebAssembly] Fix tests missed in rL374235 Modified: llvm/trunk/test/CodeGen/MIR/WebAssembly/typed-immediate-operand-invalid0.mir llvm/trunk/test/CodeGen/MIR/WebAssembly/typed-immediate-operand-invalid1.mir llvm/trunk/unittests/Target/WebAssembly/WebAssemblyExceptionInfoTest.cpp Modified: llvm/trunk/test/CodeGen/MIR/WebAssembly/typed-immediate-operand-invalid0.mir URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/MIR/WebAssembly/typed-immediate-operand-invalid0.mir?rev=374259&r1=374258&r2=374259&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/MIR/WebAssembly/typed-immediate-operand-invalid0.mir (original) +++ 
llvm/trunk/test/CodeGen/MIR/WebAssembly/typed-immediate-operand-invalid0.mir Wed Oct 9 16:06:38 2019 @@ -9,5 +9,5 @@ body: | liveins: $arguments ; CHECK: [[@LINE+1]]:24: expected integers after 'i'/'s'/'p' type character %0:i32 = CONST_I32 i 0, implicit-def dead $arguments - RETURN_VOID implicit-def dead $arguments + RETURN implicit-def dead $arguments ... Modified: llvm/trunk/test/CodeGen/MIR/WebAssembly/typed-immediate-operand-invalid1.mir URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/MIR/WebAssembly/typed-immediate-operand-invalid1.mir?rev=374259&r1=374258&r2=374259&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/MIR/WebAssembly/typed-immediate-operand-invalid1.mir (original) +++ llvm/trunk/test/CodeGen/MIR/WebAssembly/typed-immediate-operand-invalid1.mir Wed Oct 9 16:06:38 2019 @@ -9,5 +9,5 @@ body: | liveins: $arguments ; CHECK: [[@LINE+1]]:24: a typed immediate operand should start with one of 'i', 's', or 'p' %0:i32 = CONST_I32 abc 0, implicit-def dead $arguments - RETURN_VOID implicit-def dead $arguments + RETURN implicit-def dead $arguments ... 
Modified: llvm/trunk/unittests/Target/WebAssembly/WebAssemblyExceptionInfoTest.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/unittests/Target/WebAssembly/WebAssemblyExceptionInfoTest.cpp?rev=374259&r1=374258&r2=374259&view=diff ============================================================================== --- llvm/trunk/unittests/Target/WebAssembly/WebAssemblyExceptionInfoTest.cpp (original) +++ llvm/trunk/unittests/Target/WebAssembly/WebAssemblyExceptionInfoTest.cpp Wed Oct 9 16:06:38 2019 @@ -132,7 +132,7 @@ body: | bb.7: ; predecessors: %bb.5, %bb.1 liveins: $value_stack - RETURN_VOID implicit-def $arguments + RETURN implicit-def $arguments bb.8 (landing-pad): ; predecessors: %bb.4 @@ -307,7 +307,7 @@ body: | bb.9: ; predecessors: %bb.0, %bb.7 liveins: $value_stack - RETURN_VOID implicit-def $arguments + RETURN implicit-def $arguments bb.10 (landing-pad): ; predecessors: %bb.4 From llvm-commits at lists.llvm.org Wed Oct 9 16:07:58 2019 From: llvm-commits at lists.llvm.org (Craig Topper via llvm-commits) Date: Wed, 9 Oct 2019 16:07:58 -0700 Subject: [llvm] r374243 - [InstCombine] Fix PR43617 In-Reply-To: <20191009220324.1570A80AFB@lists.llvm.org> References: <20191009220324.1570A80AFB@lists.llvm.org> Message-ID: You could just use the CallInst version of getIntrinsicID() which already does the right thing ~Craig On Wed, Oct 9, 2019 at 3:01 PM Evandro Menezes via llvm-commits < llvm-commits at lists.llvm.org> wrote: > Author: evandro > Date: Wed Oct 9 15:03:23 2019 > New Revision: 374243 > > URL: http://llvm.org/viewvc/llvm-project?rev=374243&view=rev > Log: > [InstCombine] Fix PR43617 > > Check for `nullptr` before inspecting composite function. 
> > Modified: > llvm/trunk/lib/Transforms/Utils/SimplifyLibCalls.cpp > > Modified: llvm/trunk/lib/Transforms/Utils/SimplifyLibCalls.cpp > URL: > http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Utils/SimplifyLibCalls.cpp?rev=374243&r1=374242&r2=374243&view=diff > > ============================================================================== > --- llvm/trunk/lib/Transforms/Utils/SimplifyLibCalls.cpp (original) > +++ llvm/trunk/lib/Transforms/Utils/SimplifyLibCalls.cpp Wed Oct 9 > 15:03:23 2019 > @@ -1916,10 +1916,10 @@ Value *LibCallSimplifier::optimizeLog(Ca > B.setFastMathFlags(FastMathFlags::getFast()); > > Function *ArgFn = Arg->getCalledFunction(); > - StringRef ArgNm = ArgFn->getName(); > - Intrinsic::ID ArgID = ArgFn->getIntrinsicID(); > + Intrinsic::ID ArgID = > + ArgFn ? ArgFn->getIntrinsicID() : Intrinsic::not_intrinsic; > LibFunc ArgLb = NotLibFunc; > - TLI->getLibFunc(ArgNm, ArgLb); > + TLI->getLibFunc(Arg, ArgLb); > > // log(pow(x,y)) -> y*log(x) > if (ArgLb == PowLb || ArgID == Intrinsic::pow) { > @@ -1934,9 +1934,10 @@ Value *LibCallSimplifier::optimizeLog(Ca > substituteInParent(Arg, MulY); > return MulY; > } > + > // log(exp{,2,10}(y)) -> y*log({e,2,10}) > // TODO: There is no exp10() intrinsic yet. > - else if (ArgLb == ExpLb || ArgLb == Exp2Lb || ArgLb == Exp10Lb || > + if (ArgLb == ExpLb || ArgLb == Exp2Lb || ArgLb == Exp10Lb || > ArgID == Intrinsic::exp || ArgID == Intrinsic::exp2) { > Constant *Eul; > if (ArgLb == ExpLb || ArgID == Intrinsic::exp) > > > _______________________________________________ > llvm-commits mailing list > llvm-commits at lists.llvm.org > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From llvm-commits at lists.llvm.org Wed Oct 9 16:10:49 2019 From: llvm-commits at lists.llvm.org (GN Sync Bot via llvm-commits) Date: Wed, 09 Oct 2019 23:10:49 -0000 Subject: [llvm] r374260 - gn build: Merge r374245 Message-ID: <20191009231049.30BC986AB5@lists.llvm.org> Author: gnsyncbot Date: Wed Oct 9 16:10:49 2019 New Revision: 374260 URL: http://llvm.org/viewvc/llvm-project?rev=374260&view=rev Log: gn build: Merge r374245 Modified: llvm/trunk/utils/gn/secondary/llvm/unittests/CodeGen/GlobalISel/BUILD.gn Modified: llvm/trunk/utils/gn/secondary/llvm/unittests/CodeGen/GlobalISel/BUILD.gn URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/gn/secondary/llvm/unittests/CodeGen/GlobalISel/BUILD.gn?rev=374260&r1=374259&r2=374260&view=diff ============================================================================== --- llvm/trunk/utils/gn/secondary/llvm/unittests/CodeGen/GlobalISel/BUILD.gn (original) +++ llvm/trunk/utils/gn/secondary/llvm/unittests/CodeGen/GlobalISel/BUILD.gn Wed Oct 9 16:10:49 2019 @@ -13,6 +13,7 @@ unittest("GlobalISelTests") { ] sources = [ "CSETest.cpp", + "ConstantFoldingTest.cpp", "GISelMITest.cpp", "KnownBitsTest.cpp", "LegalizerHelperTest.cpp", From llvm-commits at lists.llvm.org Wed Oct 9 16:12:45 2019 From: llvm-commits at lists.llvm.org (Dávid Bolvanský via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 23:12:45 +0000 (UTC) Subject: [PATCH] D67986: [InstCombine] snprintf (d, size, "%s", s) -> memccpy (d, s, '\0', size - 1), d[size - 1] = 0 In-Reply-To: References: Message-ID: <84620a9ecf5f0c38e8c6db151a2f59b0@localhost.localdomain> xbolva00 added a comment. In D67986#1687323, @xbolva00 wrote: > Oh, I think the proposed transformation in that paper is incorrect. > > It should rather be: > memccpy(d,s,0, n-1) > d[n-1] = 0 > > Since "A terminating null character is automatically appended after the content written." (snprintf) @efriedma, what do you think about this sequence?
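For illustration, the sequence proposed above can be sketched at the source level as follows. This is only a sketch under stated assumptions, not the patch itself: `copy_like_snprintf` is a hypothetical helper name, `size` is assumed to be greater than zero (with `size == 0`, `snprintf` writes nothing and `size - 1` would wrap around), and the buffers are assumed not to overlap. `memccpy` is a POSIX function rather than ISO C++.

```cpp
#include <cstddef>
#include <string.h>  // memccpy is POSIX; glibc and most libcs declare it here

// Hypothetical helper (not from the patch): mirrors the visible effect of
// snprintf(d, size, "%s", s) on the string stored in d, assuming size > 0.
static void copy_like_snprintf(char *d, const char *s, std::size_t size) {
  // Copy at most size - 1 bytes, stopping right after the first '\0' copied.
  memccpy(d, s, '\0', size - 1);
  // snprintf always terminates its output, so terminate unconditionally.
  d[size - 1] = '\0';
}
```

When `s` is shorter than `size - 1`, the terminator is already copied by `memccpy` and the final store is redundant but harmless; when `s` is longer, the final store is what provides the terminator.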
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67986/new/ https://reviews.llvm.org/D67986 From llvm-commits at lists.llvm.org Wed Oct 9 16:12:45 2019 From: llvm-commits at lists.llvm.org (Dávid Bolvanský via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 23:12:45 +0000 (UTC) Subject: [PATCH] D68089: [InstCombine] Optimize some memccpy calls to memcpy/null In-Reply-To: References: Message-ID: <1e7fc5a9a73401a3756ded605e9a6684@localhost.localdomain> xbolva00 added a comment. ping CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68089/new/ https://reviews.llvm.org/D68089 From llvm-commits at lists.llvm.org Wed Oct 9 16:12:45 2019 From: llvm-commits at lists.llvm.org (Dávid Bolvanský via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 23:12:45 +0000 (UTC) Subject: [PATCH] D66604: [GVN] AnalyzeLoadAvailability: Replace a load after lifetime.end with undef (PR20811) In-Reply-To: References: Message-ID: <2f7ae3bc7f7208bd017406c7e583f4df@localhost.localdomain> xbolva00 added a comment. @fhahn, is this fine with you? @BK1603, should somebody land this patch for you? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D66604/new/ https://reviews.llvm.org/D66604 From llvm-commits at lists.llvm.org Wed Oct 9 16:22:00 2019 From: llvm-commits at lists.llvm.org (serge via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 23:22:00 +0000 (UTC) Subject: [PATCH] D68720: Support -fstack-clash-protection for x86 In-Reply-To: References: Message-ID: serge-sans-paille updated this revision to Diff 224196. serge-sans-paille added a comment. Move to `stack-probe` compatibility, using a dedicated name to trigger inline assembly. It looks better to me because 1. it leverages existing mechanisms 2.
it has a finer grain Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68720/new/ https://reviews.llvm.org/D68720 Files: clang/include/clang/Basic/CodeGenOptions.def clang/include/clang/Basic/DiagnosticFrontendKinds.td clang/include/clang/Basic/TargetInfo.h clang/include/clang/Driver/CC1Options.td clang/include/clang/Driver/Options.td clang/lib/Basic/Targets/X86.h clang/lib/CodeGen/CGStmt.cpp clang/lib/CodeGen/CodeGenModule.cpp clang/lib/Driver/ToolChains/Clang.cpp clang/lib/Frontend/CompilerInvocation.cpp clang/test/CodeGen/stack-clash-protection.c clang/test/Driver/stack-clash-protection.c llvm/lib/Target/X86/X86FrameLowering.cpp llvm/lib/Target/X86/X86FrameLowering.h llvm/lib/Target/X86/X86ISelLowering.cpp llvm/lib/Target/X86/X86ISelLowering.h llvm/lib/Target/X86/X86InstrCompiler.td llvm/lib/Target/X86/X86InstrInfo.td llvm/test/CodeGen/X86/stack-clash-dynamic-alloca.ll llvm/test/CodeGen/X86/stack-clash-medium-natural-probes.ll llvm/test/CodeGen/X86/stack-clash-medium.ll llvm/test/CodeGen/X86/stack-clash-small.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D68720.224196.patch Type: text/x-patch Size: 27891 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 16:22:01 2019 From: llvm-commits at lists.llvm.org (Paul Robinson via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 23:22:01 +0000 (UTC) Subject: [PATCH] D68117: [DWARF-5] Support for C++11 defaulted, deleted member functions. In-Reply-To: References: Message-ID: <79f655a4883b7324adac68d2d853a9ab@localhost.localdomain> probinson added a comment. We really do want to pack the four mutually exclusive cases into two bits. I have tried to give more explicit comments inline to explain how you would do this. It really should work fine, recognizing that the "not defaulted" case is not explicitly represented in the textual IR because it uses a zero value in the defaulted/deleted subfield of SPFlags. 
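The packing described in this comment can be sketched as follows. This is a hedged, standalone illustration rather than the actual LLVM definitions: the enum and helper names are hypothetical stand-ins, and the bit position (9) is taken from the inline comment on DebugInfoFlags.def in this same review.

```cpp
#include <cstdint>

// Hypothetical stand-ins for the DISPFlags values under discussion.
// Four mutually exclusive states share one 2-bit field at bits 9..10.
enum SPFlagsSketch : std::uint32_t {
  NotDefaulted        = 0u,       // zero value: nothing to print in textual IR
  Deleted             = 1u << 9,
  DefaultedInClass    = 2u << 9,
  DefaultedOutOfClass = 3u << 9,
  DeletedOrDefaulted  = 3u << 9,  // mask covering the whole 2-bit field
};

// Each query compares the masked field against one value. A plain bitwise
// AND would misreport DefaultedOutOfClass (0b11) as Deleted (0b01).
constexpr bool isDeleted(std::uint32_t Flags) {
  return (Flags & DeletedOrDefaulted) == Deleted;
}
constexpr bool isDefaultedInClass(std::uint32_t Flags) {
  return (Flags & DeletedOrDefaulted) == DefaultedInClass;
}
constexpr bool isDefaultedOutOfClass(std::uint32_t Flags) {
  return (Flags & DeletedOrDefaulted) == DefaultedOutOfClass;
}
constexpr bool isNotDefaulted(std::uint32_t Flags) {
  return (Flags & DeletedOrDefaulted) == NotDefaulted;
}
```

Because the not-defaulted state is the zero value, a test for it can only check that none of the other three values appears in spFlags, which is the point made about the failing tests.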
================ Comment at: clang/lib/CodeGen/CGDebugInfo.cpp:1619 + else { + SPFlags |= llvm::DISubprogram::SPFlagNotDefaulted; + } ---------------- SouraVX wrote: > Previously SPFlagNotDefaulted is setted to SPFlagZero as it's normal value is, to save a bit. Hence in generated IR this flag is not getting set. instead 0 is getting emitted. > As a result, test cases checking DISPFlagNotDefaulted in IR are failing. Given that DISPFlagNotDefaulted is represented by the absence of the other related flags, that makes sense. Those tests would verify the DISPFlagNotDefaulted case by showing none of those flags are present. ================ Comment at: clang/test/CodeGenCXX/dbg-info-all-calls-described.cpp:60 // HAS-ATTR-DAG: DISubprogram(name: "declaration2", {{.*}}, flags: DIFlagPrototyped | DIFlagAllCallsDescribed, spFlags: DISPFlagDefinition -// HAS-ATTR-DAG: DISubprogram(name: "struct1", {{.*}}, flags: DIFlagPrototyped, spFlags: DISPFlagOptimized) +// HAS-ATTR-DAG: DISubprogram(name: "struct1", {{.*}}, flags: DIFlagPrototyped, spFlags: DISPFlagOptimized | DISPFlagNotDefaulted) // HAS-ATTR-DAG: DISubprogram(name: "struct1", {{.*}}, flags: DIFlagPrototyped | DIFlagAllCallsDescribed, spFlags: DISPFlagDefinition ---------------- Because DISPFlagNotDefaulted has a zero value, the unmodified test correctly verifies that no other defaulted/deleted flags are present. ================ Comment at: clang/test/CodeGenCXX/debug-info-not-defaulted.cpp:9 + +// ATTR: DISubprogram(name: "not_defaulted", {{.*}}, flags: DIFlagPublic | DIFlagPrototyped, spFlags: DISPFlagNotDefaulted +// ATTR: DISubprogram(name: "not_defaulted", {{.*}}, flags: DIFlagPublic | DIFlagPrototyped, spFlags: DISPFlagNotDefaulted ---------------- SouraVX wrote: > SouraVX wrote: > > This test case is failing, checking DISPFlagNotDefaulted. > Please note here that, backend and llvm-dwarfdump is fine without this. 
> Since it's value is '0' , we are able to query this using isNotDefaulted() -- hence attribute > DW_AT_defaulted having value DW_DEFAULTED_no is getting set and emitted and dumped fine by llvm-dwarfdump. DISPFlagNotDefaulted is not explicitly represented in the textual IR; it is implied by the absence of any of the other deleted/defaulted values. The test needs to verify that spFlags is omitted from these DISubprogram entries; or if there are other spFlags present, it must verify that the other deleted/defaulted values are not present. ================ Comment at: llvm/include/llvm/IR/DebugInfoFlags.def:93 +HANDLE_DISP_FLAG((1u << 10), DefaultedInClass) +HANDLE_DISP_FLAG((1u << 11), DefaultedOutOfClass) ---------------- There are 4 mutually exclusive cases, which can be handled using 4 values in a 2-bit field. We will give NotDefaulted the zero value, so it is not explicitly defined here. So we would have: ``` HANDLE_DISP_FLAG((1u << 9), Deleted) HANDLE_DISP_FLAG((2u << 9), DefaultedInClass) HANDLE_DISP_FLAG((3u << 9), DefaultedOutOfClass) ``` ================ Comment at: llvm/include/llvm/IR/DebugInfoFlags.def:98 // NOTE: Always must be equal to largest flag, check this when adding new flags. -HANDLE_DISP_FLAG((1 << 8), Largest) +HANDLE_DISP_FLAG((1 << 11), Largest) #undef DISP_FLAG_LARGEST_NEEDED ---------------- This can be 10, because we used only 2 bits for deleted/defaulted. ================ Comment at: llvm/include/llvm/IR/DebugInfoMetadata.h:1615 SPFlagVirtuality = SPFlagVirtual | SPFlagPureVirtual, + SPFlagDefaultedInOrOutOfClass = + SPFlagDefaultedInClass | SPFlagDefaultedOutOfClass, ---------------- I would call this SPFlagDeletedOrDefaulted. ================ Comment at: llvm/include/llvm/IR/DebugInfoMetadata.h:1632 + static DISPFlags + toSPFlags(bool IsLocalToUnit, bool IsDefinition, bool IsOptimized, + unsigned Virtuality = SPFlagNonvirtual, ---------------- No, you don't want to modify this function. 
It is for converting from older bitcode formats that did not have a DISPFlags field. ================ Comment at: llvm/include/llvm/IR/DebugInfoMetadata.h:1777 + } + bool isDeleted() const { return getSPFlags() & SPFlagDeleted; } ---------------- With all 4 values encoded in one field, isDeleted would become ``` return (getSPFlags() & SPFlagDefaultedOrDeleted) == SPFlagDeleted; ``` and of course the others would use the new mask name as well. ================ Comment at: llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp:1309 + dwarf::DW_DEFAULTED_no); + if (SP->isDeleted()) + addFlag(SPDie, dwarf::DW_AT_deleted); ---------------- `else if` here. It cannot be both defaulted and deleted. ================ Comment at: llvm/lib/IR/DebugInfoMetadata.cpp:603 case SPFlagVirtuality: + case SPFlagDefaultedInOrOutOfClass: return ""; ---------------- This would go away if we pack 4 values into 2 bits. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68117/new/ https://reviews.llvm.org/D68117 From llvm-commits at lists.llvm.org Wed Oct 9 16:22:02 2019 From: llvm-commits at lists.llvm.org (serge via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 23:22:02 +0000 (UTC) Subject: [PATCH] D68720: Support -fstack-clash-protection for x86 In-Reply-To: References: Message-ID: serge-sans-paille added a comment. @efriedma also, compared to `probe-stack` with a function, this version has the ability to use existing MOV operations to avoid generating probes, which looks like a big plus to me. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68720/new/ https://reviews.llvm.org/D68720 From llvm-commits at lists.llvm.org Wed Oct 9 16:40:16 2019 From: llvm-commits at lists.llvm.org (Brian Cain via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 23:40:16 +0000 (UTC) Subject: [PATCH] D68741: test-release.sh s/http/https/ Message-ID: bcain created this revision. bcain added reviewers: hans, rovka, ro, dim. Herald added a subscriber: dmgreen.
Herald added a project: LLVM. This will be more effective at preserving release integrity but as a practical matter will also circumvent hiccups when http proxies reject some content. Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D68741 Files: llvm/utils/release/test-release.sh Index: llvm/utils/release/test-release.sh =================================================================== --- llvm/utils/release/test-release.sh +++ llvm/utils/release/test-release.sh @@ -20,7 +20,7 @@ generator="Unix Makefiles" # Base SVN URL for the sources. -Base_url="http://llvm.org/svn/llvm-project" +Base_url="https://llvm.org/svn/llvm-project" Release="" Release_no_dot="" -------------- next part -------------- A non-text attachment was scrubbed... Name: D68741.224199.patch Type: text/x-patch Size: 393 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 16:40:17 2019 From: llvm-commits at lists.llvm.org (Andrei Elovikov via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 23:40:17 +0000 (UTC) Subject: [PATCH] D68484: [PATCH 01/38] [noalias] LangRef: noalias intrinsics and noalias_sidechannel documentation. In-Reply-To: References: Message-ID: <9d85168fd5551d5b7ab71609fca5d140@localhost.localdomain> a.elovikov added inline comments. ================ Comment at: llvm/docs/LangRef.rst:16241 + i32 , metadata !p.scope) !noalias !VisibleScopes + %side.p = i8* @llvm.side.noalias.XXX(i8* %side.p, i8* %p.decl, + i8** p.addr, i8** %side.p.addr, ---------------- I find it strange to see %side.p on both left and right sides. Is it a typo or does it have some special meaning? After reading till the intrinsics' description I believe it should be just "%p" on the right side. ================ Comment at: llvm/docs/LangRef.rst:16575 +It will be transformed into a ``llvm.side.noalias`` intrinsic and moved onto +the ``noalias_sidechannel`` path, so that pointer optimizations can still be +done and the restrict information is not lost. 
---------------- > the ``noalias_sidechannel`` path Not sure about terminology, but are `@llvm.noalias.arg.guard`/`@llvm.noalias.copy.guard` considered as `noalias_sidechannel`? I'd suggest not to use the spelling from the load/store instructions and have a more general `moved onto the "side" path` (if my understanding is correct here). ================ Comment at: llvm/docs/LangRef.rst:16615 + +The third argument ``%p.addr`` is the address in memory of this pointer. + ---------------- No explicit "or null" here. Is that intentional? CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68484/new/ https://reviews.llvm.org/D68484 From llvm-commits at lists.llvm.org Wed Oct 9 16:40:17 2019 From: llvm-commits at lists.llvm.org (Vedant Kumar via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 23:40:17 +0000 (UTC) Subject: [PATCH] D68718: [llvm-profdata] Make "malformed-ptr-to-counter-array.test" textual In-Reply-To: References: Message-ID: vsk updated this revision to Diff 224205. vsk added a comment. Thanks for the correction -- the revised version (hopefully) makes more sense. 
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68718/new/ https://reviews.llvm.org/D68718 Files: llvm/test/tools/llvm-profdata/Inputs/malformed-ptr-to-counter-array.profraw llvm/test/tools/llvm-profdata/malformed-ptr-to-counter-array.test Index: llvm/test/tools/llvm-profdata/malformed-ptr-to-counter-array.test =================================================================== --- llvm/test/tools/llvm-profdata/malformed-ptr-to-counter-array.test +++ llvm/test/tools/llvm-profdata/malformed-ptr-to-counter-array.test @@ -1,5 +1,50 @@ -REQUIRES: zlib +// Header +// +// INSTR_PROF_RAW_HEADER(uint64_t, Magic, __llvm_profile_get_magic()) +// INSTR_PROF_RAW_HEADER(uint64_t, Version, __llvm_profile_get_version()) +// INSTR_PROF_RAW_HEADER(uint64_t, DataSize, DataSize) +// INSTR_PROF_RAW_HEADER(uint64_t, CountersSize, CountersSize) +// INSTR_PROF_RAW_HEADER(uint64_t, NamesSize, NamesSize) +// INSTR_PROF_RAW_HEADER(uint64_t, CountersDelta, (uintptr_t)CountersBegin) +// INSTR_PROF_RAW_HEADER(uint64_t, NamesDelta, (uintptr_t)NamesBegin) +// INSTR_PROF_RAW_HEADER(uint64_t, ValueKindLast, IPVK_Last) -RUN: not llvm-profdata merge -o /dev/null %p/Inputs/malformed-ptr-to-counter-array.profraw 2>&1 | FileCheck %s +RUN: printf '\201rforpl\377' > %t.profraw +RUN: printf '\4\0\0\0\0\0\0\0' >> %t.profraw +RUN: printf '\1\0\0\0\0\0\0\0' >> %t.profraw +RUN: printf '\2\0\0\0\0\0\0\0' >> %t.profraw +RUN: printf '\10\0\0\0\0\0\0\0' >> %t.profraw +RUN: printf '\0\0\6\0\1\0\0\0' >> %t.profraw +RUN: printf '\0\0\6\0\2\0\0\0' >> %t.profraw +RUN: printf '\0\0\0\0\0\0\0\0' >> %t.profraw + +// Data Section +// +// struct ProfData { +// #define INSTR_PROF_DATA(Type, LLVMType, Name, Initializer) \ +// Type Name; +// #include "llvm/ProfileData/InstrProfData.inc" +// }; + +RUN: printf '\067\265\035\031\112\165\023\344' >> %t.profraw +RUN: printf '\02\0\0\0\0\0\0\0' >> %t.profraw + +// Note: The CounterPtr here is off-by-one. This should trigger a malformed profile error. 
+RUN: printf '\0\0\6\0\1\0\0\1' >> %t.profraw + +RUN: printf '\0\0\0\0\0\0\0\0' >> %t.profraw +RUN: printf '\0\0\0\0\0\0\0\0' >> %t.profraw +RUN: printf '\02\0\0\0\0\0\0\0' >> %t.profraw + +// Counter Section + +RUN: printf '\067\0\0\0\0\0\0\0' >> %t.profraw +RUN: printf '\101\0\0\0\0\0\0\0' >> %t.profraw + +// Name Section + +RUN: printf '\3\0bar\0\0\0' >> %t.profraw + +RUN: not llvm-profdata merge -o /dev/null %t.profraw 2>&1 | FileCheck %s CHECK: Malformed instrumentation profile data -------------- next part -------------- A non-text attachment was scrubbed... Name: D68718.224205.patch Type: text/x-patch Size: 2230 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 16:40:17 2019 From: llvm-commits at lists.llvm.org (Matt Arsenault via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 23:40:17 +0000 (UTC) Subject: [PATCH] D68742: AMDGPU: Use SGPR_128 instead of SReg_128 for vregs Message-ID: arsenm created this revision. arsenm added a reviewer: rampitec. Herald added subscribers: t-tye, tpr, dstuttard, yaxunl, nhaehnle, wdng, jvesely, kzhuravl, qcolombet. SGPR_128 only includes the real allocatable SGPRs, and SReg_128 adds the additional non-allocatable TTMP registers. There's no point in allocating SReg_128 vregs. This shrinks the size of the classes regalloc needs to consider, which is usually good. 
https://reviews.llvm.org/D68742 Files: lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp lib/Target/AMDGPU/AMDGPUTargetMachine.cpp lib/Target/AMDGPU/SIISelLowering.cpp lib/Target/AMDGPU/SIInstrInfo.cpp lib/Target/AMDGPU/SILoadStoreOptimizer.cpp lib/Target/AMDGPU/SIMachineFunctionInfo.cpp lib/Target/AMDGPU/SIRegisterInfo.cpp lib/Target/AMDGPU/SIRegisterInfo.td test/CodeGen/AMDGPU/GlobalISel/inst-select-build-vector.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-concat-vectors.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-insert.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-load-constant.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-merge-values.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-trunc.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-unmerge-values.mir test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.raw.buffer.store.format.f16.ll test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.raw.buffer.store.format.f32.ll test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.raw.buffer.store.ll test/CodeGen/AMDGPU/buffer-intrinsics-mmo-offsets.ll test/CodeGen/AMDGPU/coalescer-extend-pruned-subrange.mir test/CodeGen/AMDGPU/coalescer-identical-values-undef.mir test/CodeGen/AMDGPU/coalescer-subranges-another-copymi-not-live.mir test/CodeGen/AMDGPU/coalescer-subranges-another-prune-error.mir test/CodeGen/AMDGPU/coalescer-subreg-join.mir test/CodeGen/AMDGPU/coalescer-subregjoin-fullcopy.mir test/CodeGen/AMDGPU/coalescer-with-subregs-bad-identical.mir test/CodeGen/AMDGPU/constant-fold-imm-immreg.mir test/CodeGen/AMDGPU/couldnt-join-subrange-3.mir test/CodeGen/AMDGPU/dce-disjoint-intervals.mir test/CodeGen/AMDGPU/detect-dead-lanes.mir test/CodeGen/AMDGPU/extract_subvector_vec4_vec3.ll test/CodeGen/AMDGPU/fold-imm-copy.mir test/CodeGen/AMDGPU/fold-imm-f16-f32.mir test/CodeGen/AMDGPU/fold-multiple.mir test/CodeGen/AMDGPU/global-load-store-atomics.mir test/CodeGen/AMDGPU/memory_clause.mir test/CodeGen/AMDGPU/merge-load-store.mir test/CodeGen/AMDGPU/mubuf-legalize-operands.mir 
test/CodeGen/AMDGPU/optimize-negated-cond-exec-masking.mir test/CodeGen/AMDGPU/phi-elimination-end-cf.mir test/CodeGen/AMDGPU/promote-constOffset-to-imm.mir test/CodeGen/AMDGPU/regbank-reassign.mir test/CodeGen/AMDGPU/regcoal-subrange-join-seg.mir test/CodeGen/AMDGPU/regcoal-subrange-join.mir test/CodeGen/AMDGPU/regcoalescing-remove-partial-redundancy-assert.mir test/CodeGen/AMDGPU/rename-independent-subregs.mir test/CodeGen/AMDGPU/schedule-regpressure.mir test/CodeGen/AMDGPU/spill-before-exec.mir test/CodeGen/AMDGPU/splitkit.mir test/CodeGen/AMDGPU/subreg-split-live-in-error.mir test/CodeGen/AMDGPU/subreg_interference.mir test/CodeGen/AMDGPU/subvector-test.mir -------------- next part -------------- A non-text attachment was scrubbed... Name: D68742.224202.patch Type: text/x-patch Size: 160708 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 16:40:33 2019 From: llvm-commits at lists.llvm.org (Azhar Mohammed via llvm-commits) Date: Wed, 09 Oct 2019 16:40:33 -0700 Subject: [test-suite] r374156 - Add GCC Torture Suite Sources In-Reply-To: <20191009110200.47EB990780@lists.llvm.org> References: <20191009110200.47EB990780@lists.llvm.org> Message-ID: <13FA746A-62F8-4818-A1D3-4C967689B0E7@apple.com> Hi Looks like this change is causing a CMake error while trying to build the test suite. Can you please take a look? CMake Error at SingleSource/Regression/C/gcc-c-torture/CMakeLists.txt:34 (list): list sub-command REMOVE_ITEM requires list to be present. 
Call Stack (most recent call first): SingleSource/Regression/C/gcc-c-torture/CMakeLists.txt:48 (gcc_torture_dg_options_cflags) SingleSource/Regression/C/gcc-c-torture/execute/ieee/CMakeLists.txt:69 (gcc_torture_execute_test) > On Oct 9, 2019, at 4:01 AM, Sam Elliott via llvm-commits wrote: > > Author: lenary > Date: Wed Oct 9 04:01:46 2019 > New Revision: 374156 > > URL: http://llvm.org/viewvc/llvm-project?rev=374156&view=rev > Log: > Add GCC Torture Suite Sources > -------------- next part -------------- An HTML attachment was scrubbed... URL: From llvm-commits at lists.llvm.org Wed Oct 9 16:43:33 2019 From: llvm-commits at lists.llvm.org (Philip Reames via llvm-commits) Date: Wed, 09 Oct 2019 23:43:33 -0000 Subject: [llvm] r374261 - Conservatively add volatility and atomic checks in a few places Message-ID: <20191009234333.B367E85ADD@lists.llvm.org> Author: reames Date: Wed Oct 9 16:43:33 2019 New Revision: 374261 URL: http://llvm.org/viewvc/llvm-project?rev=374261&view=rev Log: Conservatively add volatility and atomic checks in a few places As background, starting in D66309, I'm working on supporting unordered atomics analogous to volatile flags on normal LoadSDNode/StoreSDNodes for X86. As part of that, I spent some time going through usages of LoadSDNode and StoreSDNode looking for cases where we might have missed a volatility check or need an atomic check. I couldn't find any cases that clearly miscompile - i.e. no test cases - but a couple of pieces of code look suspicious, though I can't figure out how to exercise them. This patch adds defensive checks and asserts in the places my manual audit found. If anyone has any ideas on how to either a) disprove any of the checks, or b) hit the bug they might be fixing, I welcome suggestions.
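The shape of the defensive checks described in this log can be sketched in isolation. This is a simplified model under stated assumptions, not the LLVM code: `MemAccessSketch` and `mayNarrow` are hypothetical names, and `isSimple()` is modeled here as "neither volatile nor atomic", which matches the assertion text "Cannot merge volatile or atomic loads" in the diff.

```cpp
// Hypothetical, simplified model of a memory access; not LLVM's node classes.
struct MemAccessSketch {
  bool Volatile = false;
  bool Atomic = false;
  unsigned WidthInBits = 32;

  // Modeled after the idea behind isSimple(): neither volatile nor atomic.
  bool isSimple() const { return !Volatile && !Atomic; }
};

// Returns true if a width-reducing combine may proceed on this access.
bool mayNarrow(const MemAccessSketch &Access, unsigned NewWidthInBits) {
  // Conservative guard: never change the width of a volatile or atomic access.
  if (!Access.isSimple())
    return false;
  return NewWidthInBits < Access.WidthInBits;
}
```

The guard runs before any legality or profitability logic, so a combine that was written with only plain loads and stores in mind simply declines to fire on anything else.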
Differential Revision: https://reviews.llvm.org/D68419 Modified: llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Modified: llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp?rev=374261&r1=374260&r2=374261&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp (original) +++ llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp Wed Oct 9 16:43:33 2019 @@ -10152,7 +10152,10 @@ SDValue DAGCombiner::ReduceLoadWidth(SDN return SDValue(); LoadSDNode *LN0 = cast(N0); - if (!isLegalNarrowLdSt(LN0, ExtType, ExtVT, ShAmt)) + // Reducing the width of a volatile load is illegal. For atomics, we may be + // able to reduce the width provided we never widen again. (see D66309) + if (!LN0->isSimple() || + !isLegalNarrowLdSt(LN0, ExtType, ExtVT, ShAmt)) return SDValue(); auto AdjustBigEndianShift = [&](unsigned ShAmt) { @@ -16276,6 +16279,11 @@ SDValue DAGCombiner::splitMergedValStore if (OptLevel == CodeGenOpt::None) return SDValue(); + // Can't change the number of memory accesses for a volatile store or break + // atomicity for an atomic one. 
+ if (!ST->isSimple()) + return SDValue(); + SDValue Val = ST->getValue(); SDLoc DL(ST); Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=374261&r1=374260&r2=374261&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp (original) +++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Wed Oct 9 16:43:33 2019 @@ -4859,6 +4859,8 @@ bool X86TargetLowering::isFPImmLegal(con bool X86TargetLowering::shouldReduceLoadWidth(SDNode *Load, ISD::LoadExtType ExtTy, EVT NewVT) const { + assert(cast(Load)->isSimple() && "illegal to narrow"); + // "ELF Handling for Thread-Local Storage" specifies that R_X86_64_GOTTPOFF // relocation target a movq or addq instruction: don't let the load shrink. SDValue BasePtr = cast(Load)->getBasePtr(); @@ -7724,7 +7726,7 @@ static SDValue LowerAsSplatVectorLoad(SD static bool findEltLoadSrc(SDValue Elt, LoadSDNode *&Ld, int64_t &ByteOffset) { if (ISD::isNON_EXTLoad(Elt.getNode())) { auto *BaseLd = cast(Elt); - if (BaseLd->getMemOperand()->getFlags() & MachineMemOperand::MOVolatile) + if (!BaseLd->isSimple()) return false; Ld = BaseLd; ByteOffset = 0; @@ -7878,8 +7880,8 @@ static SDValue EltsFromConsecutiveLoads( auto CreateLoad = [&DAG, &DL, &Loads](EVT VT, LoadSDNode *LDBase) { auto MMOFlags = LDBase->getMemOperand()->getFlags(); - assert(!(MMOFlags & MachineMemOperand::MOVolatile) && - "Cannot merge volatile loads."); + assert(LDBase->isSimple() && + "Cannot merge volatile or atomic loads."); SDValue NewLd = DAG.getLoad(VT, DL, LDBase->getChain(), LDBase->getBasePtr(), LDBase->getPointerInfo(), LDBase->getAlignment(), MMOFlags); From llvm-commits at lists.llvm.org Wed Oct 9 16:49:23 2019 From: llvm-commits at lists.llvm.org (Greg Clayton via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 23:49:23 +0000 (UTC) Subject: [PATCH] D68744: [GSYM] Add GsymCreator and 
GsymReader. Message-ID: clayborg created this revision. clayborg added reviewers: aprantl, JDevlieghere, vsk, MaskRay, lemo, phosek, echristo, jakehehrlich. Herald added subscribers: mgrang, mgorny. This patch adds the ability to create GSYM files with GsymCreator, and read them with GsymReader. Full testing has been added for both new classes. This patch differs from the original patch https://reviews.llvm.org/D53379 in that it uses a StringTableBuilder class from llvm instead of a custom version. Support for big and little endian files has been added. If the endianness matches the current host, we use efficient extraction for the header, address table and address info offset tables. https://reviews.llvm.org/D68744 Files: include/llvm/DebugInfo/GSYM/FileWriter.h include/llvm/DebugInfo/GSYM/GsymCreator.h include/llvm/DebugInfo/GSYM/GsymReader.h include/llvm/DebugInfo/GSYM/Header.h lib/DebugInfo/GSYM/CMakeLists.txt lib/DebugInfo/GSYM/GsymCreator.cpp lib/DebugInfo/GSYM/GsymReader.cpp lib/DebugInfo/GSYM/Header.cpp unittests/DebugInfo/GSYM/CMakeLists.txt unittests/DebugInfo/GSYM/GSYMTest.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D68744.224204.patch Type: text/x-patch Size: 50954 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 16:49:24 2019 From: llvm-commits at lists.llvm.org (Stanislav Mekhanoshin via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 23:49:24 +0000 (UTC) Subject: [PATCH] D68742: AMDGPU: Use SGPR_128 instead of SReg_128 for vregs In-Reply-To: References: Message-ID: <93da275497f53e36723ab3191f6c937a@localhost.localdomain> rampitec accepted this revision. rampitec added a comment. This revision is now accepted and ready to land. 
LGTM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68742/new/ https://reviews.llvm.org/D68742 From llvm-commits at lists.llvm.org Wed Oct 9 16:50:27 2019 From: llvm-commits at lists.llvm.org (Philip Reames via Phabricator via llvm-commits) Date: Wed, 09 Oct 2019 23:50:27 +0000 (UTC) Subject: [PATCH] D68419: Conservatively add volatility and atomic checks in a few places In-Reply-To: References: Message-ID: <6c7a38ddfe576aadfaa711bd0a9beaa5@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rG931120846e5f: Conservatively add volatility and atomic checks in a few places (authored by reames). Herald added a subscriber: hiraditya. Changed prior to commit: https://reviews.llvm.org/D68419?vs=223082&id=224207#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68419/new/ https://reviews.llvm.org/D68419 Files: llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp llvm/lib/Target/X86/X86ISelLowering.cpp Index: llvm/lib/Target/X86/X86ISelLowering.cpp =================================================================== --- llvm/lib/Target/X86/X86ISelLowering.cpp +++ llvm/lib/Target/X86/X86ISelLowering.cpp @@ -4859,6 +4859,8 @@ bool X86TargetLowering::shouldReduceLoadWidth(SDNode *Load, ISD::LoadExtType ExtTy, EVT NewVT) const { + assert(cast<LoadSDNode>(Load)->isSimple() && "illegal to narrow"); + // "ELF Handling for Thread-Local Storage" specifies that R_X86_64_GOTTPOFF // relocation target a movq or addq instruction: don't let the load shrink. 
SDValue BasePtr = cast<LoadSDNode>(Load)->getBasePtr(); @@ -7724,7 +7726,7 @@ static bool findEltLoadSrc(SDValue Elt, LoadSDNode *&Ld, int64_t &ByteOffset) { if (ISD::isNON_EXTLoad(Elt.getNode())) { auto *BaseLd = cast<LoadSDNode>(Elt); - if (BaseLd->getMemOperand()->getFlags() & MachineMemOperand::MOVolatile) + if (!BaseLd->isSimple()) return false; Ld = BaseLd; ByteOffset = 0; @@ -7878,8 +7880,8 @@ auto CreateLoad = [&DAG, &DL, &Loads](EVT VT, LoadSDNode *LDBase) { auto MMOFlags = LDBase->getMemOperand()->getFlags(); - assert(!(MMOFlags & MachineMemOperand::MOVolatile) && - "Cannot merge volatile loads."); + assert(LDBase->isSimple() && + "Cannot merge volatile or atomic loads."); SDValue NewLd = DAG.getLoad(VT, DL, LDBase->getChain(), LDBase->getBasePtr(), LDBase->getPointerInfo(), LDBase->getAlignment(), MMOFlags); Index: llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp =================================================================== --- llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp +++ llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp @@ -10152,7 +10152,10 @@ return SDValue(); LoadSDNode *LN0 = cast<LoadSDNode>(N0); - if (!isLegalNarrowLdSt(LN0, ExtType, ExtVT, ShAmt)) + // Reducing the width of a volatile load is illegal. For atomics, we may be + // able to reduce the width provided we never widen again. (see D66309) + if (!LN0->isSimple() || + !isLegalNarrowLdSt(LN0, ExtType, ExtVT, ShAmt)) return SDValue(); auto AdjustBigEndianShift = [&](unsigned ShAmt) { @@ -16276,6 +16279,11 @@ if (OptLevel == CodeGenOpt::None) return SDValue(); + // Can't change the number of memory accesses for a volatile store or break + // atomicity for an atomic one. + if (!ST->isSimple()) + return SDValue(); + SDValue Val = ST->getValue(); SDLoc DL(ST); -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68419.224207.patch Type: text/x-patch Size: 2580 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 17:08:18 2019 From: llvm-commits at lists.llvm.org (Adrian Prantl via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 00:08:18 +0000 (UTC) Subject: [PATCH] D68744: [GSYM] Add GsymCreator and GsymReader. In-Reply-To: References: Message-ID: aprantl added a comment. Mechanically, this looks mostly fine. ================ Comment at: include/llvm/DebugInfo/GSYM/GsymCreator.h:140 + std::vector Files; + std::vector UUID; + bool Finalized = false; ---------------- In other places we use `uint8_t UUID[16]` for this. ================ Comment at: include/llvm/DebugInfo/GSYM/GsymCreator.h:168 + uint32_t insertString(StringRef S) { + std::lock_guard Guard(Mutex); + if (S.empty()) ---------------- Just personal opinion, but anything with locks I would put into the cpp file. ================ Comment at: include/llvm/DebugInfo/GSYM/GsymCreator.h:213 + /// \param UUIDBytes The new UUID bytes. + void setUUID(llvm::ArrayRef UUIDBytes) { + UUID.assign(UUIDBytes.begin(), UUIDBytes.end()); ---------------- same here ================ Comment at: include/llvm/DebugInfo/GSYM/GsymReader.h:51 + // local storage and set point the ArrayRef objects above to these swapped + // copies. + struct SwappedData { ---------------- /// ================ Comment at: include/llvm/DebugInfo/GSYM/GsymReader.h:68 + // Accessor functions that allow iteration across all addresses in the GSYM + // file. + size_t getNumAddresses() const; ---------------- /// ================ Comment at: include/llvm/DebugInfo/GSYM/GsymReader.h:71 + Optional getAddress(size_t Index) const; + Optional getFile(uint32_t Index) const { + if (Index < Files.size()) ---------------- why is this interface useful? Wouldn't an iterator of a ForEach function be cleaner? 
================ Comment at: lib/DebugInfo/GSYM/GsymCreator.cpp:1 +//===- GsymCreator.cpp ------------------------------------------*- C++ -*-===// +// ---------------- a `-*- C++ -*-` marker only makes sense in a .h file where the language is ambiguous. ================ Comment at: lib/DebugInfo/GSYM/GsymCreator.cpp:196 + if (Prev != Funcs.end()) { + if (Prev->Range.intersects(Curr->Range)) { + // Overlapping address ranges. ---------------- This is confusing to read because of all the nested-ness. Would it be possible to convert this into a ``` if (error) { OS << warning continue; } ``` form? ================ Comment at: lib/DebugInfo/GSYM/GsymReader.cpp:1 +//===- GsymReader.cpp -------------------------------------------*- C++ -*-===// +// ---------------- same here ================ Comment at: unittests/DebugInfo/GSYM/GSYMTest.cpp:1300 + VerifyFunctionInfo(GR, Func2Addr+FuncSize-1, Func2); + VerifyFunctionInfoError(GR, Func2Addr+FuncSize, + "address 0x1030 not in GSYM"); ---------------- clang-format CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68744/new/ https://reviews.llvm.org/D68744 From llvm-commits at lists.llvm.org Wed Oct 9 17:08:18 2019 From: llvm-commits at lists.llvm.org (Eli Friedman via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 00:08:18 +0000 (UTC) Subject: [PATCH] D68720: Support -fstack-clash-protection for x86 In-Reply-To: References: Message-ID: efriedma added a comment. > (b) is an issue, as pointed out in https://lwn.net/Articles/726587/ (grep for valgrind) : from valgrind point of view, accessing un-allocated stack memory triggers error, and we probably want to please valgrind > > Doing the call *after* the stack allocation is also not an option, as a signal could be raised between the stack allocation and the stack probing, escaping the stack probe if a custom signal handler is executed. I'm not sure I follow. How are you solving this problem in your patch? By limiting the amount you adjust the stack at a time? 
What limit is sufficient to avoid this issue? ----- Can you give a complete assembly listing for small examples of static and dynamic stack probing? ================ Comment at: llvm/lib/Target/X86/X86FrameLowering.cpp:400 + !(STI.isOSWindows() && !STI.isTargetMachO()); + if (InlineStackClashProtector && !InEpilogue) { + const uint64_t PageSize = TLI.getStackProbeSize(MF); ---------------- Why is this code in a different location from the stack probing code that generates a call? ================ Comment at: llvm/lib/Target/X86/X86FrameLowering.cpp:408 + CurrentAbsOffset += ChunkSize; + MI->getOperand(3).setIsDead(); // The EFLAGS implicit def is dead. + ---------------- This algorithm needs some documentation; it isn't at all obvious what it's doing. Particularly the interaction with "free" stack probes. Should we generate a loop if the stack frame is large? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68720/new/ https://reviews.llvm.org/D68720 From llvm-commits at lists.llvm.org Wed Oct 9 17:08:18 2019 From: llvm-commits at lists.llvm.org (Marcello Maggioni via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 00:08:18 +0000 (UTC) Subject: [PATCH] D68739: [GISel] Allow ConstantFoldBinOp to consider G_FCONSTANT binary representation for combines In-Reply-To: References: Message-ID: kariddi marked 2 inline comments as done. kariddi added inline comments. ================ Comment at: llvm/lib/CodeGen/GlobalISel/Utils.cpp:323-325 + (&FPVal->getValueAPF().getSemantics() == &APFloat::IEEEdouble() || + &FPVal->getValueAPF().getSemantics() == &APFloat::IEEEsingle() || + &FPVal->getValueAPF().getSemantics() == &APFloat::IEEEhalf())) { ---------------- arsenm wrote: > kariddi wrote: > > kariddi wrote: > > > arsenm wrote: > > > > Why do you need to whitelist these types? 
> > > Are the types that fit a uint64 that we know of, but I guess that could be checked from "the only user" of this function instead > > Actually, that's not necessarily true ... ConstantFoldBinOp used to use a function that only handled Optional . Now I substituted that with a function that returns APInt (for simplicity), but I wanted to keep the functionality the same as getConstantVRegVal() I guess ... > > > > I guess I can remove that limitation for the Floats, but maintain the limitation fo the integers and then check in ConstantFoldBinOp. > > > > I'm just scared that allowing ConstantFoldBinOp to digest things it never saw before would cause some "unexpected consequence" :-P > I'm worried about somebody adding bfloat16 or something and then never updating this list. I think it would be fine to just return anything bitcastToAPInt will handle I tried removing the limitation. It seems not not cause problems , so I'll update the patch Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68739/new/ https://reviews.llvm.org/D68739 From llvm-commits at lists.llvm.org Wed Oct 9 17:08:19 2019 From: llvm-commits at lists.llvm.org (Marcello Maggioni via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 00:08:19 +0000 (UTC) Subject: [PATCH] D68739: [GISel] Allow ConstantFoldBinOp to consider G_FCONSTANT binary representation for combines In-Reply-To: References: Message-ID: <4f39f24c97dfbe2581ce03137b67a44b@localhost.localdomain> kariddi updated this revision to Diff 224211. kariddi marked an inline comment as done. kariddi added a comment. 
Clang-formatted test and corrected "MRI" to "*MRI". Removed the limitation pointed out by arsenm. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68739/new/ https://reviews.llvm.org/D68739 Files: llvm/include/llvm/CodeGen/GlobalISel/Utils.h llvm/lib/CodeGen/GlobalISel/Utils.cpp llvm/unittests/CodeGen/GlobalISel/ConstantFoldingTest.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D68739.224211.patch Type: text/x-patch Size: 10946 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 17:17:27 2019 From: llvm-commits at lists.llvm.org (Reid Kleckner via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 00:17:27 +0000 (UTC) Subject: [PATCH] D68747: [codeview] Try to avoid emitting .cv_loc with line zero Message-ID: rnk created this revision. rnk added a reviewer: akhuang. Herald added a subscriber: hiraditya. Herald added a project: LLVM. Visual Studio doesn't like it while stepping. It kicks you out of the source view of the file being stepped through and tries to fall back to the disassembly view. Fixes PR43530. The fix is incomplete, because it's possible to have a basic block with no source locations at all. In this case, we don't emit a .cv_loc, but that will result in wrong stepping behavior in the debugger if the layout predecessor of the location-less BB has an unrelated source location. We could try harder to find a valid location that dominates or post-dominates the current BB, but in general it's a dataflow problem, and one still might not exist. I left a FIXME about this. As an alternative, we might want to consider having the middle-end check if it's emitting codeview and get it to stop using line zero. 
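As a rough illustration of the "borrow a neighbouring source location instead of line zero" idea in the summary above, here is a minimal sketch. It is not the code in CodeViewDebug.cpp: `FillLineZero` and its vector-of-line-numbers input are hypothetical names made up for this note, and the real patch works on MachineInstr debug locations rather than plain integers.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of LLVM): `lines` holds one source line per
// instruction of a single basic block, with 0 meaning "no usable location".
// Each zero is replaced by the nearest previous non-zero line; leading zeros
// take the next non-zero line instead. If the block has no locations at all,
// every entry stays 0, which is the incomplete case the summary mentions.
std::vector<unsigned> FillLineZero(std::vector<unsigned> lines) {
  // Forward pass: carry the previous valid line into zero entries.
  unsigned prev = 0;
  for (unsigned &line : lines) {
    if (line == 0)
      line = prev;
    else
      prev = line;
  }
  // Backward pass: any zeros left are at the start of the block, so use the
  // next valid line for them.
  unsigned next = 0;
  for (std::size_t i = lines.size(); i-- > 0;) {
    if (lines[i] == 0)
      lines[i] = next;
    else
      next = lines[i];
  }
  return lines;
}
```

The all-zero block is deliberately left untouched here: choosing a location for it would need information from other blocks, which is the dataflow problem the summary leaves as a FIXME.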
Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D68747 Files: llvm/lib/CodeGen/AsmPrinter/CodeViewDebug.cpp llvm/test/DebugInfo/COFF/line-zero.ll llvm/test/DebugInfo/COFF/local-variables.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D68747.224212.patch Type: text/x-patch Size: 5421 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 17:17:28 2019 From: llvm-commits at lists.llvm.org (=?utf-8?q?D=C3=A1vid_Bolvansk=C3=BD_via_Phabricator?= via llvm-commits) Date: Thu, 10 Oct 2019 00:17:28 +0000 (UTC) Subject: [PATCH] D68720: Support -fstack-clash-protection for x86 In-Reply-To: References: Message-ID: <56a4b5c9f6578b0fb36aef044ee8ffba@localhost.localdomain> xbolva00 added a comment. Please add info about this new feature to release notes Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68720/new/ https://reviews.llvm.org/D68720 From llvm-commits at lists.llvm.org Wed Oct 9 17:17:28 2019 From: llvm-commits at lists.llvm.org (Julian Lettner via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 00:17:28 +0000 (UTC) Subject: [PATCH] D68676: [ASan] Do not misrepresent high value address dereferences as null dereferences In-Reply-To: References: Message-ID: yln updated this revision to Diff 224213. yln added a comment. Move test to 'asan/Posix' and provide Linux implementation for 'SignalContext::IsTrueFaultingAddress'. 
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68676/new/ https://reviews.llvm.org/D68676 Files: compiler-rt/lib/asan/asan_errors.h compiler-rt/lib/sanitizer_common/sanitizer_common.h compiler-rt/lib/sanitizer_common/sanitizer_linux.cpp compiler-rt/lib/sanitizer_common/sanitizer_mac.cpp compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_report.cpp compiler-rt/lib/sanitizer_common/sanitizer_win.cpp compiler-rt/test/asan/TestCases/Posix/high-address-dereference.c -------------- next part -------------- A non-text attachment was scrubbed... Name: D68676.224213.patch Type: text/x-patch Size: 7978 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 17:33:04 2019 From: llvm-commits at lists.llvm.org (Julian Lettner via llvm-commits) Date: Thu, 10 Oct 2019 00:33:04 -0000 Subject: [compiler-rt] r374265 - [ASan] Do not misrepresent high value address dereferences as null dereferences Message-ID: <20191010003304.78FBF86525@lists.llvm.org> Author: yln Date: Wed Oct 9 17:33:04 2019 New Revision: 374265 URL: http://llvm.org/viewvc/llvm-project?rev=374265&view=rev Log: [ASan] Do not misrepresent high value address dereferences as null dereferences Dereferences with addresses above the 48-bit hardware addressable range produce "invalid instruction" (instead of "invalid access") hardware exceptions (there is no hardware address decoding logic for those bits), and the address provided by this exception is the address of the instruction (not the faulting address). The kernel maps the "invalid instruction" to SEGV, but fails to provide the real fault address. Because of this ASan lies and says that those cases are null dereferences. This downgrades the severity of a found bug in terms of security. In the ASan signal handler, we can not provide the real faulting address, but at least we can try not to lie. 
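To make the intended classification concrete, here is a minimal sketch of the rule described in the log message. `FaultInfo`, `Classify`, and the fixed `kPageSize` are hypothetical stand-ins invented for this note; the real logic lives in ErrorDeadlySignal and SignalContext, queries the page size at run time, and distinguishes more cases (stack overflow, wild jumps, reads versus writes).

```cpp
#include <cstdint>
#include <string>

// Hypothetical, simplified stand-in for the SignalContext fields that matter
// for this decision.
struct FaultInfo {
  std::uint64_t addr;          // address reported by the kernel (may be 0)
  bool is_memory_access;       // the signal was caused by a memory access
  bool is_true_faulting_addr;  // false if the kernel could not report addr
};

// Assumed page size, for illustration only.
constexpr std::uint64_t kPageSize = 4096;

// A fault is labelled "null-deref" only when the kernel vouches for the
// address and that address lies in the zero page. An address of 0 that the
// kernel could not actually determine is not treated as a null dereference.
std::string Classify(const FaultInfo &f) {
  if (!f.is_memory_access)
    return "signal";
  if (f.is_true_faulting_addr && f.addr < kPageSize)
    return "null-deref";
  return "wild-addr";
}
```

With this rule, a dereference of 0x4141414141414141 that arrives with `addr == 0` and `is_true_faulting_addr == false` is reported as a wild address rather than a null dereference.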
rdar://50366151 Reviewed By: vitalybuka Differential Revision: https://reviews.llvm.org/D68676 Added: compiler-rt/trunk/test/asan/TestCases/Posix/high-address-dereference.c Modified: compiler-rt/trunk/lib/asan/asan_errors.h compiler-rt/trunk/lib/sanitizer_common/sanitizer_common.h compiler-rt/trunk/lib/sanitizer_common/sanitizer_linux.cpp compiler-rt/trunk/lib/sanitizer_common/sanitizer_mac.cpp compiler-rt/trunk/lib/sanitizer_common/sanitizer_symbolizer_report.cpp compiler-rt/trunk/lib/sanitizer_common/sanitizer_win.cpp Modified: compiler-rt/trunk/lib/asan/asan_errors.h URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/asan/asan_errors.h?rev=374265&r1=374264&r2=374265&view=diff ============================================================================== --- compiler-rt/trunk/lib/asan/asan_errors.h (original) +++ compiler-rt/trunk/lib/asan/asan_errors.h Wed Oct 9 17:33:04 2019 @@ -48,7 +48,8 @@ struct ErrorDeadlySignal : ErrorBase { scariness.Scare(10, "stack-overflow"); } else if (!signal.is_memory_access) { scariness.Scare(10, "signal"); - } else if (signal.addr < GetPageSizeCached()) { + } else if (signal.is_true_faulting_addr && + signal.addr < GetPageSizeCached()) { scariness.Scare(10, "null-deref"); } else if (signal.addr == signal.pc) { scariness.Scare(60, "wild-jump"); Modified: compiler-rt/trunk/lib/sanitizer_common/sanitizer_common.h URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/sanitizer_common/sanitizer_common.h?rev=374265&r1=374264&r2=374265&view=diff ============================================================================== --- compiler-rt/trunk/lib/sanitizer_common/sanitizer_common.h (original) +++ compiler-rt/trunk/lib/sanitizer_common/sanitizer_common.h Wed Oct 9 17:33:04 2019 @@ -881,6 +881,11 @@ struct SignalContext { bool is_memory_access; enum WriteFlag { UNKNOWN, READ, WRITE } write_flag; + // In some cases the kernel cannot provide the true faulting address; `addr` + // will be zero then. 
This field allows to distinguish between these cases + // and dereferences of null. + bool is_true_faulting_addr; + // VS2013 doesn't implement unrestricted unions, so we need a trivial default // constructor SignalContext() = default; @@ -893,7 +898,8 @@ struct SignalContext { context(context), addr(GetAddress()), is_memory_access(IsMemoryAccess()), - write_flag(GetWriteFlag()) { + write_flag(GetWriteFlag()), + is_true_faulting_addr(IsTrueFaultingAddress()) { InitPcSpBp(); } @@ -914,6 +920,7 @@ struct SignalContext { uptr GetAddress() const; WriteFlag GetWriteFlag() const; bool IsMemoryAccess() const; + bool IsTrueFaultingAddress() const; }; void InitializePlatformEarly(); Modified: compiler-rt/trunk/lib/sanitizer_common/sanitizer_linux.cpp URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/sanitizer_common/sanitizer_linux.cpp?rev=374265&r1=374264&r2=374265&view=diff ============================================================================== --- compiler-rt/trunk/lib/sanitizer_common/sanitizer_linux.cpp (original) +++ compiler-rt/trunk/lib/sanitizer_common/sanitizer_linux.cpp Wed Oct 9 17:33:04 2019 @@ -1849,6 +1849,12 @@ SignalContext::WriteFlag SignalContext:: #endif } +bool SignalContext::IsTrueFaultingAddress() const { + auto si = static_cast(siginfo); + // SIGSEGV signals without a true fault address have si_code set to 128. + return si->si_signo == SIGSEGV && si->si_code != 128; +} + void SignalContext::DumpAllRegisters(void *context) { // FIXME: Implement this. 
} Modified: compiler-rt/trunk/lib/sanitizer_common/sanitizer_mac.cpp URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/sanitizer_common/sanitizer_mac.cpp?rev=374265&r1=374264&r2=374265&view=diff ============================================================================== --- compiler-rt/trunk/lib/sanitizer_common/sanitizer_mac.cpp (original) +++ compiler-rt/trunk/lib/sanitizer_common/sanitizer_mac.cpp Wed Oct 9 17:33:04 2019 @@ -754,6 +754,12 @@ SignalContext::WriteFlag SignalContext:: #endif } +bool SignalContext::IsTrueFaultingAddress() const { + auto si = static_cast(siginfo); + // "Real" SIGSEGV codes (e.g., SEGV_MAPERR, SEGV_MAPERR) are non-zero. + return si->si_signo == SIGSEGV && si->si_code != 0; +} + static void GetPcSpBp(void *context, uptr *pc, uptr *sp, uptr *bp) { ucontext_t *ucontext = (ucontext_t*)context; # if defined(__aarch64__) Modified: compiler-rt/trunk/lib/sanitizer_common/sanitizer_symbolizer_report.cpp URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/sanitizer_common/sanitizer_symbolizer_report.cpp?rev=374265&r1=374264&r2=374265&view=diff ============================================================================== --- compiler-rt/trunk/lib/sanitizer_common/sanitizer_symbolizer_report.cpp (original) +++ compiler-rt/trunk/lib/sanitizer_common/sanitizer_symbolizer_report.cpp Wed Oct 9 17:33:04 2019 @@ -191,9 +191,14 @@ static void ReportDeadlySignalImpl(const SanitizerCommonDecorator d; Printf("%s", d.Warning()); const char *description = sig.Describe(); - Report("ERROR: %s: %s on unknown address %p (pc %p bp %p sp %p T%d)\n", - SanitizerToolName, description, (void *)sig.addr, (void *)sig.pc, - (void *)sig.bp, (void *)sig.sp, tid); + if (sig.is_memory_access && !sig.is_true_faulting_addr) + Report("ERROR: %s: %s on unknown address (pc %p bp %p sp %p T%d)\n", + SanitizerToolName, description, (void *)sig.pc, (void *)sig.bp, + (void *)sig.sp, tid); + else + Report("ERROR: %s: %s on unknown address %p (pc %p bp %p 
sp %p T%d)\n", + SanitizerToolName, description, (void *)sig.addr, (void *)sig.pc, + (void *)sig.bp, (void *)sig.sp, tid); Printf("%s", d.Default()); if (sig.pc < GetPageSizeCached()) Report("Hint: pc points to the zero page.\n"); @@ -203,7 +208,11 @@ static void ReportDeadlySignalImpl(const ? "WRITE" : (sig.write_flag == SignalContext::READ ? "READ" : "UNKNOWN"); Report("The signal is caused by a %s memory access.\n", access_type); - if (sig.addr < GetPageSizeCached()) + if (!sig.is_true_faulting_addr) + Report("Hint: this fault was caused by a dereference of a high value " + "address (see registers below). Dissassemble the provided pc " + "to learn which register value was used.\n"); + else if (sig.addr < GetPageSizeCached()) Report("Hint: address points to the zero page.\n"); } MaybeReportNonExecRegion(sig.pc); Modified: compiler-rt/trunk/lib/sanitizer_common/sanitizer_win.cpp URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/sanitizer_common/sanitizer_win.cpp?rev=374265&r1=374264&r2=374265&view=diff ============================================================================== --- compiler-rt/trunk/lib/sanitizer_common/sanitizer_win.cpp (original) +++ compiler-rt/trunk/lib/sanitizer_common/sanitizer_win.cpp Wed Oct 9 17:33:04 2019 @@ -945,6 +945,11 @@ bool SignalContext::IsMemoryAccess() con return GetWriteFlag() != SignalContext::UNKNOWN; } +bool SignalContext::IsTrueFaultingAddress() const { + // TODO: Provide real implementation for this. See Linux and Mac variants. 
+ return IsMemoryAccess(); +} + SignalContext::WriteFlag SignalContext::GetWriteFlag() const { EXCEPTION_RECORD *exception_record = (EXCEPTION_RECORD *)siginfo; // The contents of this array are documented at Added: compiler-rt/trunk/test/asan/TestCases/Posix/high-address-dereference.c URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/test/asan/TestCases/Posix/high-address-dereference.c?rev=374265&view=auto ============================================================================== --- compiler-rt/trunk/test/asan/TestCases/Posix/high-address-dereference.c (added) +++ compiler-rt/trunk/test/asan/TestCases/Posix/high-address-dereference.c Wed Oct 9 17:33:04 2019 @@ -0,0 +1,50 @@ +// On x86_64, the kernel does not provide the faulting address for dereferences +// of addresses greater than the 48-bit hardware addressable range, i.e., +// `siginfo.si_addr` is zero in ASan's SEGV signal handler. This test checks +// that ASan does not misrepresent such cases as "NULL dereferences". 
+ +// REQUIRES: x86_64-target-arch +// RUN: %clang_asan %s -o %t +// RUN: export %env_asan_opts=print_scariness=1 +// RUN: not %run %t 0x0000000000000000 2>&1 | FileCheck %s --check-prefixes=ZERO,HINT-PAGE0 +// RUN: not %run %t 0x0000000000000FFF 2>&1 | FileCheck %s --check-prefixes=LOW1,HINT-PAGE0 +// RUN: not %run %t 0x0000000000001000 2>&1 | FileCheck %s --check-prefixes=LOW2,HINT-NONE +// RUN: not %run %t 0x4141414141414141 2>&1 | FileCheck %s --check-prefixes=HIGH,HINT-HIGHADDR +// RUN: not %run %t 0xFFFFFFFFFFFFFFFF 2>&1 | FileCheck %s --check-prefixes=MAX,HINT-HIGHADDR + +#include +#include + +int main(int argc, const char *argv[]) { + const char *hex = argv[1]; + uint64_t *addr = (uint64_t *)strtoull(hex, NULL, 16); + uint64_t x = *addr; // segmentation fault + return x; +} + +// ZERO: SEGV on unknown address 0x000000000000 (pc +// LOW1: SEGV on unknown address 0x000000000fff (pc +// LOW2: SEGV on unknown address 0x000000001000 (pc +// HIGH: SEGV on unknown address (pc +// MAX: SEGV on unknown address (pc + +// HINT-PAGE0-NOT: Hint: this fault was caused by a dereference of a high value address +// HINT-PAGE0: Hint: address points to the zero page. + +// HINT-NONE-NOT: Hint: this fault was caused by a dereference of a high value address +// HINT-NONE-NOT: Hint: address points to the zero page. + +// HINT-HIGHADDR: Hint: this fault was caused by a dereference of a high value address +// HINT-HIGHADDR-NOT: Hint: address points to the zero page. + +// ZERO: SCARINESS: 10 (null-deref) +// LOW1: SCARINESS: 10 (null-deref) +// LOW2: SCARINESS: 20 (wild-addr-read) +// HIGH: SCARINESS: 20 (wild-addr-read) +// MAX: SCARINESS: 20 (wild-addr-read) + +// TODO: Currently, register values are only printed on Mac. Once this changes, +// remove the 'TODO_' prefix in the following lines. 
+// TODO_HIGH,TODO_MAX: Register values: +// TODO_HIGH: = 0x4141414141414141 +// TODO_MAX: = 0xffffffffffffffff From llvm-commits at lists.llvm.org Wed Oct 9 17:35:53 2019 From: llvm-commits at lists.llvm.org (Sam Clegg via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 00:35:53 +0000 (UTC) Subject: [PATCH] D68749: [lld][WebAssembly] Refactor markLive.cpp Message-ID: sbc100 created this revision. Herald added subscribers: llvm-commits, sunfish, aheejin, jgravelle-google, dschuff. Herald added a project: LLVM. sbc100 added a reviewer: ruiu. This pattern matches the ELF implementation and is also useful as part of a planned change where running `mark` more than once is needed. Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D68749 Files: lld/wasm/MarkLive.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D68749.224215.patch Type: text/x-patch Size: 3461 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 17:36:16 2019 From: llvm-commits at lists.llvm.org (Julian Lettner via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 00:36:16 +0000 (UTC) Subject: [PATCH] D68676: [ASan] Do not misrepresent high value address dereferences as null dereferences In-Reply-To: References: Message-ID: This revision was automatically updated to reflect the committed changes. Closed by commit rGb577efe4567f: [ASan] Do not misrepresent high value address dereferences as null dereferences (authored by yln). 
Changed prior to commit: https://reviews.llvm.org/D68676?vs=224213&id=224217#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68676/new/ https://reviews.llvm.org/D68676 Files: compiler-rt/lib/asan/asan_errors.h compiler-rt/lib/sanitizer_common/sanitizer_common.h compiler-rt/lib/sanitizer_common/sanitizer_linux.cpp compiler-rt/lib/sanitizer_common/sanitizer_mac.cpp compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_report.cpp compiler-rt/lib/sanitizer_common/sanitizer_win.cpp compiler-rt/test/asan/TestCases/Posix/high-address-dereference.c -------------- next part -------------- A non-text attachment was scrubbed... Name: D68676.224217.patch Type: text/x-patch Size: 8052 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 17:45:13 2019 From: llvm-commits at lists.llvm.org (Julian Lettner via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 00:45:13 +0000 (UTC) Subject: [PATCH] D68529: [lit] Move argument parsing/validation to separate file In-Reply-To: References: Message-ID: <7fd2d2559c7feadfcef79226afbe3707@localhost.localdomain> yln added a comment. @serge-sans-paille: I am assuming that you are again fine with this, but I would be more comfortable landing this with the official green check mark. ;) Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68529/new/ https://reviews.llvm.org/D68529 From llvm-commits at lists.llvm.org Wed Oct 9 17:46:04 2019 From: llvm-commits at lists.llvm.org (Matt Arsenault via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 00:46:04 +0000 (UTC) Subject: [PATCH] D68739: [GISel] Allow ConstantFoldBinOp to consider G_FCONSTANT binary representation for combines In-Reply-To: References: Message-ID: <9415dd66d928bd5ff492db29c77e54d1@localhost.localdomain> arsenm added inline comments. 
================ Comment at: llvm/lib/CodeGen/GlobalISel/Utils.cpp:321-324 + const ConstantFP *FPVal = getConstantFPVRegVal(VReg, MRI); + if (FPVal) + return FPVal->getValueAPF().bitcastToAPInt(); + Optional IntVal = getConstantVRegVal(VReg, MRI); ---------------- I kind of don't like potentially repeating the search through copy/extensions twice. Can you maybe add some kind of parameter to getConstantVRegValWithLookThrough, so it handles both? Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68739/new/ https://reviews.llvm.org/D68739 From llvm-commits at lists.llvm.org Wed Oct 9 17:54:17 2019 From: llvm-commits at lists.llvm.org (Amy Huang via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 00:54:17 +0000 (UTC) Subject: [PATCH] D68747: [codeview] Try to avoid emitting .cv_loc with line zero In-Reply-To: References: Message-ID: akhuang accepted this revision. akhuang added a comment. This revision is now accepted and ready to land. lgtm- ================ Comment at: llvm/lib/CodeGen/AsmPrinter/CodeViewDebug.cpp:2863 +// corresponds to optimized code that doesn't have a distinct source location. +// In this case, we try try to use the previous or next source location +// depending on the context. ---------------- typo Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68747/new/ https://reviews.llvm.org/D68747 From llvm-commits at lists.llvm.org Wed Oct 9 17:54:19 2019 From: llvm-commits at lists.llvm.org (Sam Clegg via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 00:54:19 +0000 (UTC) Subject: [PATCH] D68751: [lld][WebAssembly] Where possible handle signature mismatches via an adaptor function Message-ID: sbc100 created this revision. Herald added subscribers: llvm-commits, dexonsmith, steven_wu, sunfish, aheejin, hiraditya, jgravelle-google, mehdi_amini, dschuff. Herald added a project: LLVM. 
This is similar to what we do at the bitcode level with the WebAssemblyFixFunctionBitcasts pass but implemented at the object file level. Previously, when the caller and callee disagreed about the function signature, we generated a warning and replaced the call with a call to a dummy function that contained only unreachable. After this change we still generate the warning but will also then try to generate an adapter function that will add or remove arguments such that the call can still possibly succeed. Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D68751 Files: lld/test/wasm/lto/signature-mismatch.ll lld/wasm/InputChunks.h lld/wasm/MarkLive.cpp lld/wasm/SymbolTable.cpp lld/wasm/SymbolTable.h lld/wasm/Writer.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D68751.224220.patch Type: text/x-patch Size: 9299 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 17:55:05 2019 From: llvm-commits at lists.llvm.org (Sam Clegg via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 00:55:05 +0000 (UTC) Subject: [PATCH] D68751: [lld][WebAssembly] Where possible handle signature mismatches via an adaptor function In-Reply-To: References: Message-ID: sbc100 updated this revision to Diff 224222. sbc100 added a comment. . Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68751/new/ https://reviews.llvm.org/D68751 Files: lld/test/wasm/lto/signature-mismatch.ll lld/wasm/InputChunks.h lld/wasm/MarkLive.cpp lld/wasm/SymbolTable.cpp lld/wasm/SymbolTable.h lld/wasm/Writer.cpp -------------- next part -------------- A non-text attachment was scrubbed...
Name: D68751.224222.patch Type: text/x-patch Size: 9297 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 17:55:29 2019 From: llvm-commits at lists.llvm.org (Eli Friedman via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 00:55:29 +0000 (UTC) Subject: [PATCH] D67986: [InstCombine] snprintf (d, size, "%s", s) -> memccpy (d, s, '\0', size - 1), d[size - 1] = 0 In-Reply-To: References: Message-ID: <3189822efe14800bb0c9a7a78972d655@localhost.localdomain> efriedma added a comment. The sequence from https://reviews.llvm.org/D67986#1692153 works, I guess. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67986/new/ https://reviews.llvm.org/D67986 From llvm-commits at lists.llvm.org Wed Oct 9 18:06:02 2019 From: llvm-commits at lists.llvm.org (Reid Kleckner via llvm-commits) Date: Thu, 10 Oct 2019 01:06:02 -0000 Subject: [llvm] r374267 - [codeview] Try to avoid emitting .cv_loc with line zero Message-ID: <20191010010602.3953F877F7@lists.llvm.org> Author: rnk Date: Wed Oct 9 18:06:01 2019 New Revision: 374267 URL: http://llvm.org/viewvc/llvm-project?rev=374267&view=rev Log: [codeview] Try to avoid emitting .cv_loc with line zero Summary: Visual Studio doesn't like it while stepping. It kicks you out of the source view of the file being stepped through and tries to fall back to the disassembly view. Fixes PR43530 The fix is incomplete, because it's possible to have a basic block with no source locations at all. In this case, we don't emit a .cv_loc, but that will result in wrong stepping behavior in the debugger if the layout predecessor of the location-less BB has an unrelated source location. We could try harder to find a valid location that dominates or post-dominates the current BB, but in general it's a dataflow problem, and one still might not exist. I left a FIXME about this. As an alternative, we might want to consider having the middle-end check if it's emitting codeview and get it to stop using line zero.
Reviewers: akhuang Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68747 Added: llvm/trunk/test/DebugInfo/COFF/line-zero.ll Modified: llvm/trunk/lib/CodeGen/AsmPrinter/CodeViewDebug.cpp llvm/trunk/test/DebugInfo/COFF/local-variables.ll Modified: llvm/trunk/lib/CodeGen/AsmPrinter/CodeViewDebug.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/AsmPrinter/CodeViewDebug.cpp?rev=374267&r1=374266&r2=374267&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/AsmPrinter/CodeViewDebug.cpp (original) +++ llvm/trunk/lib/CodeGen/AsmPrinter/CodeViewDebug.cpp Wed Oct 9 18:06:01 2019 @@ -2858,6 +2858,14 @@ void CodeViewDebug::endFunctionImpl(cons CurFn = nullptr; } +// Usable locations are valid with non-zero line numbers. A line number of zero +// corresponds to optimized code that doesn't have a distinct source location. +// In this case, we try to use the previous or next source location depending on +// the context. +static bool isUsableDebugLoc(DebugLoc DL) { + return DL && DL.getLine() != 0; +} + void CodeViewDebug::beginInstruction(const MachineInstr *MI) { DebugHandlerBase::beginInstruction(MI); @@ -2869,19 +2877,21 @@ void CodeViewDebug::beginInstruction(con // If the first instruction of a new MBB has no location, find the first // instruction with a location and use that. DebugLoc DL = MI->getDebugLoc(); - if (!DL && MI->getParent() != PrevInstBB) { + if (!isUsableDebugLoc(DL) && MI->getParent() != PrevInstBB) { for (const auto &NextMI : *MI->getParent()) { if (NextMI.isDebugInstr()) continue; DL = NextMI.getDebugLoc(); - if (DL) + if (isUsableDebugLoc(DL)) break; } + // FIXME: Handle the case where the BB has no valid locations. This would + // probably require doing a real dataflow analysis. } PrevInstBB = MI->getParent(); // If we still don't have a debug location, don't record a location. 
- if (!DL) + if (!isUsableDebugLoc(DL)) return; maybeRecordLocation(DL, Asm->MF); Added: llvm/trunk/test/DebugInfo/COFF/line-zero.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/DebugInfo/COFF/line-zero.ll?rev=374267&view=auto ============================================================================== --- llvm/trunk/test/DebugInfo/COFF/line-zero.ll (added) +++ llvm/trunk/test/DebugInfo/COFF/line-zero.ll Wed Oct 9 18:06:01 2019 @@ -0,0 +1,77 @@ +; RUN: llc < %s | FileCheck %s + +; C++ source to regenerate: +; int main() { +; volatile int x; +; x = 1; +; #line 0 +; x = 2; +; #line 7 +; x = 3; +; } + + +; CHECK-LABEL: main: # @main +; CHECK: .cv_loc 0 1 1 0 # t.cpp:1:0 +; CHECK: .cv_loc 0 1 3 0 # t.cpp:3:0 +; CHECK: movl $1, 4(%rsp) +; CHECK-NOT: .cv_loc {{.*}} t.cpp:0:0 +; CHECK: movl $2, 4(%rsp) +; CHECK: .cv_loc 0 1 7 0 # t.cpp:7:0 +; CHECK: movl $3, 4(%rsp) +; CHECK: .cv_loc 0 1 8 0 # t.cpp:8:0 +; CHECK: xorl %eax, %eax +; CHECK: retq + +; ModuleID = 't.cpp' +source_filename = "t.cpp" +target datalayout = "e-m:w-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128" +target triple = "x86_64-pc-windows-msvc19.22.27905" + +; Function Attrs: norecurse nounwind uwtable +define dso_local i32 @main() local_unnamed_addr #0 !dbg !8 { +entry: + %x = alloca i32, align 4 + %x.0.x.0..sroa_cast = bitcast i32* %x to i8*, !dbg !15 + call void @llvm.dbg.declare(metadata i32* %x, metadata !13, metadata !DIExpression()), !dbg !15 + store volatile i32 1, i32* %x, align 4, !dbg !16, !tbaa !17 + store volatile i32 2, i32* %x, align 4, !dbg !21, !tbaa !17 + store volatile i32 3, i32* %x, align 4, !dbg !22, !tbaa !17 + ret i32 0, !dbg !23 +} + +; Function Attrs: nounwind readnone speculatable willreturn +declare void @llvm.dbg.declare(metadata, metadata, metadata) #2 + +attributes #0 = { norecurse nounwind uwtable } +attributes #1 = { argmemonly nounwind willreturn } +attributes #2 = { nounwind readnone speculatable willreturn } + +!llvm.dbg.cu = !{!0} 
+!llvm.module.flags = !{!3, !4, !5, !6} +!llvm.ident = !{!7} + +!0 = distinct !DICompileUnit(language: DW_LANG_C_plus_plus_14, file: !1, isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, enums: !2, nameTableKind: None) +!1 = !DIFile(filename: "t.cpp", directory: "C:\5Csrc\5Cllvm-project\5Cbuild", checksumkind: CSK_MD5, checksum: "8b6d53b166e6fa660f115eff7beedf3b") +!2 = !{} +!3 = !{i32 2, !"CodeView", i32 1} +!4 = !{i32 2, !"Debug Info Version", i32 3} +!5 = !{i32 1, !"wchar_size", i32 2} +!6 = !{i32 7, !"PIC Level", i32 2} +!7 = !{!"clang version 10.0.0"} +!8 = distinct !DISubprogram(name: "main", scope: !1, file: !1, line: 1, type: !9, scopeLine: 1, flags: DIFlagPrototyped, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !0, retainedNodes: !12) +!9 = !DISubroutineType(types: !10) +!10 = !{!11} +!11 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed) +!12 = !{!13} +!13 = !DILocalVariable(name: "x", scope: !8, file: !1, line: 2, type: !14) +!14 = !DIDerivedType(tag: DW_TAG_volatile_type, baseType: !11) +!15 = !DILocation(line: 2, scope: !8) +!16 = !DILocation(line: 3, scope: !8) +!17 = !{!18, !18, i64 0} +!18 = !{!"int", !19, i64 0} +!19 = !{!"omnipotent char", !20, i64 0} +!20 = !{!"Simple C++ TBAA"} +!21 = !DILocation(line: 0, scope: !8) +!22 = !DILocation(line: 7, scope: !8) +!23 = !DILocation(line: 8, scope: !8) Modified: llvm/trunk/test/DebugInfo/COFF/local-variables.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/DebugInfo/COFF/local-variables.ll?rev=374267&r1=374266&r2=374267&view=diff ============================================================================== --- llvm/trunk/test/DebugInfo/COFF/local-variables.ll (original) +++ llvm/trunk/test/DebugInfo/COFF/local-variables.ll Wed Oct 9 18:06:01 2019 @@ -60,7 +60,7 @@ ; ASM: leaq 36(%rsp), %rcx ; ASM: [[else_end:\.Ltmp.*]]: ; ASM: .LBB0_3: # %if.end -; ASM: .cv_loc 0 1 0 0 # t.cpp:0:0 +; ASM: .cv_loc 0 1 17 1 # t.cpp:17:1 ; ASM: callq capture ; ASM: nop ; ASM: 
addq $56, %rsp From llvm-commits at lists.llvm.org Wed Oct 9 18:04:00 2019 From: llvm-commits at lists.llvm.org (Craig Topper via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 01:04:00 +0000 (UTC) Subject: [PATCH] D68686: [X86] Add strict fp support for instructions fadd/fsub/fmul/fdiv In-Reply-To: References: Message-ID: <645661f3a0d7480e2fed49f9e014117e@localhost.localdomain> craig.topper added inline comments. ================ Comment at: llvm/test/CodeGen/X86/fp-strict-avx.ll:1 +; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py +; RUN: llc < %s -mtriple=i686-unknown-unknown -mattr=+avx2 -O3 | FileCheck %s --check-prefixes=CHECK,X86 ---------------- We tend to prefer to split tests by vector width rather than features. So we should have 128-bit test with sse, avx, and avx512 command lines. A 256-bit test with avx and avx512 command lines. And a 512-bit test with avx512 command line. This way we can make sure a given function is generated in a similar way for all isas. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68686/new/ https://reviews.llvm.org/D68686 From llvm-commits at lists.llvm.org Wed Oct 9 18:04:26 2019 From: llvm-commits at lists.llvm.org (Douglas Gliner via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 01:04:26 +0000 (UTC) Subject: [PATCH] D68752: [sancov] Use LLVM Support library JSON writer in favor of individual implementation Message-ID: dgg5503 created this revision. dgg5503 added reviewers: kcc, filcab, phosek, morehouse, vitalybuka, metzman. dgg5503 added projects: Sanitizers, LLVM. Herald added subscribers: dexonsmith, mehdi_amini. In this diff, I've replaced the individual implementation of `JSONWriter` with `json::OStream` provided by `llvm/Support/JSON.h`. Important Note: The output format of the JSON is considerably different compared to the original implementation. 
Important differences include: - New line for each entry in an array (should make diffs cleaner) - No space between keys and colon in attributed object entries. - Attributes with empty strings will now print the attribute name and a quote pair rather than excluding the attribute altogether Examples of these differences can be seen in the changes to the sancov tests which compare the JSON output. Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D68752 Files: llvm/test/tools/sancov/merge.test llvm/test/tools/sancov/symbolize.test llvm/test/tools/sancov/symbolize_noskip_dead_files.test llvm/tools/sancov/sancov.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D68752.224214.patch Type: text/x-patch Size: 14977 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 18:13:13 2019 From: llvm-commits at lists.llvm.org (Douglas Gliner via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 01:13:13 +0000 (UTC) Subject: [PATCH] D51018: [sancov] Accommodate sancov and coverage report server for use under Windows In-Reply-To: References: Message-ID: <45143a9b4d881fd9a52e8550e682a580@localhost.localdomain> dgg5503 updated this revision to Diff 224223. dgg5503 edited the summary of this revision. dgg5503 added a comment. Split out JSON changes to D68752 . CHANGES SINCE LAST ACTION https://reviews.llvm.org/D51018/new/ https://reviews.llvm.org/D51018 Files: llvm/test/tools/sancov/blacklist.test llvm/test/tools/sancov/covered_functions.test llvm/test/tools/sancov/merge.test llvm/test/tools/sancov/not_covered_functions.test llvm/test/tools/sancov/print.test llvm/test/tools/sancov/stats.test llvm/test/tools/sancov/symbolize.test llvm/test/tools/sancov/symbolize_noskip_dead_files.test llvm/test/tools/sancov/validation.test llvm/tools/sancov/coverage-report-server.py llvm/tools/sancov/sancov.cpp -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D51018.224223.patch Type: text/x-patch Size: 7104 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 18:13:14 2019 From: llvm-commits at lists.llvm.org (Douglas Gliner via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 01:13:14 +0000 (UTC) Subject: [PATCH] D51018: [sancov] Accommodate sancov and coverage report server for use under Windows In-Reply-To: References: Message-ID: dgg5503 added a comment. @vitalybuka thanks for the explanation. I believe I did it correctly, please let me know otherwise. It is my first time submitting a change to the LLVM project! In other news, I've slightly modified the test `symbolize.test` to actually test the case I present in the initial description. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D51018/new/ https://reviews.llvm.org/D51018 From llvm-commits at lists.llvm.org Wed Oct 9 18:13:18 2019 From: llvm-commits at lists.llvm.org (Reid Kleckner via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 01:13:18 +0000 (UTC) Subject: [PATCH] D68747: [codeview] Try to avoid emitting .cv_loc with line zero In-Reply-To: References: Message-ID: This revision was automatically updated to reflect the committed changes. Closed by commit rG9d8f0b3519c4: [codeview] Try to avoid emitting .cv_loc with line zero (authored by rnk). Changed prior to commit: https://reviews.llvm.org/D68747?vs=224212&id=224226#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68747/new/ https://reviews.llvm.org/D68747 Files: llvm/lib/CodeGen/AsmPrinter/CodeViewDebug.cpp llvm/test/DebugInfo/COFF/line-zero.ll llvm/test/DebugInfo/COFF/local-variables.ll -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68747.224226.patch Type: text/x-patch Size: 5417 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 18:13:51 2019 From: llvm-commits at lists.llvm.org (Mitch Phillips via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 01:13:51 +0000 (UTC) Subject: [PATCH] D68754: [sanitizers] Update linker scripts to avoid emutls issues. Message-ID: hctim created this revision. Herald added projects: Sanitizers, LLVM. Herald added subscribers: llvm-commits, Sanitizers. Looks like the linker script for ASan contains an entry for "__sancov_*". Android TLS symbols are emulated, and so we end up with "__sancov_lowest_stack" actually ending up with a symbol name of "__emutls_v.__sancov_lowest_stack". This symbol is then discarded, as the linker script doesn't consider prefixes. This patch fixes the above issue, and adds explicit exports of __sancov for other sanitizer libraries. Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D68754 Files: compiler-rt/lib/asan/asan.syms.extra compiler-rt/lib/hwasan/hwasan.syms.extra compiler-rt/lib/msan/msan.syms.extra compiler-rt/lib/ubsan/ubsan.syms.extra compiler-rt/lib/ubsan_minimal/ubsan.syms.extra Index: compiler-rt/lib/ubsan_minimal/ubsan.syms.extra =================================================================== --- compiler-rt/lib/ubsan_minimal/ubsan.syms.extra +++ compiler-rt/lib/ubsan_minimal/ubsan.syms.extra @@ -1 +1 @@ -__ubsan_* +*__ubsan_* Index: compiler-rt/lib/ubsan/ubsan.syms.extra =================================================================== --- compiler-rt/lib/ubsan/ubsan.syms.extra +++ compiler-rt/lib/ubsan/ubsan.syms.extra @@ -1 +1 @@ -__ubsan_* +*__ubsan_* Index: compiler-rt/lib/msan/msan.syms.extra =================================================================== --- compiler-rt/lib/msan/msan.syms.extra +++ compiler-rt/lib/msan/msan.syms.extra @@ -1,2 +1,3 @@ -__msan_* -__ubsan_* +*__msan_* +*__ubsan_* +*__sancov_* Index: compiler-rt/lib/hwasan/hwasan.syms.extra 
=================================================================== --- compiler-rt/lib/hwasan/hwasan.syms.extra +++ compiler-rt/lib/hwasan/hwasan.syms.extra @@ -1,2 +1,3 @@ -__hwasan_* -__ubsan_* +*__hwasan_* +*__ubsan_* +*__sancov_* Index: compiler-rt/lib/asan/asan.syms.extra =================================================================== --- compiler-rt/lib/asan/asan.syms.extra +++ compiler-rt/lib/asan/asan.syms.extra @@ -1,4 +1,4 @@ -__asan_* -__lsan_* -__ubsan_* -__sancov_* +*__asan_* +*__lsan_* +*__ubsan_* +*__sancov_* -------------- next part -------------- A non-text attachment was scrubbed... Name: D68754.224227.patch Type: text/x-patch Size: 1344 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 18:22:56 2019 From: llvm-commits at lists.llvm.org (Fangrui Song via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 01:22:56 +0000 (UTC) Subject: [PATCH] D68712: Avoid PT_LOAD to have overlapping p_offset ranges on EM_AMDGPU In-Reply-To: References: Message-ID: <2989519db28494bd750b80d3aa77d44b@localhost.localdomain> MaskRay requested changes to this revision. MaskRay added a comment. This revision now requires changes to proceed. You can work around your internal tests with -z separate-code, instead of disabling the feature in the code. Repository: rLLD LLVM Linker CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68712/new/ https://reviews.llvm.org/D68712 From llvm-commits at lists.llvm.org Wed Oct 9 18:22:56 2019 From: llvm-commits at lists.llvm.org (Hubert Tong via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 01:22:56 +0000 (UTC) Subject: [PATCH] D68105: [LNT] Python 3 support: fix report version literal In-Reply-To: References: Message-ID: <9fa8e65d929ab43e0c5d4450c21e8854@localhost.localdomain> hubert.reinterpretcast accepted this revision. hubert.reinterpretcast added a comment. This revision is now accepted and ready to land. LGTM. 
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68105/new/ https://reviews.llvm.org/D68105 From llvm-commits at lists.llvm.org Wed Oct 9 18:22:57 2019 From: llvm-commits at lists.llvm.org (Hubert Tong via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 01:22:57 +0000 (UTC) Subject: [PATCH] D67882: [LNT] Python 3 support: remove useless var-setting getter In-Reply-To: References: Message-ID: hubert.reinterpretcast accepted this revision. hubert.reinterpretcast added a comment. This revision is now accepted and ready to land. LGTM. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67882/new/ https://reviews.llvm.org/D67882 From llvm-commits at lists.llvm.org Wed Oct 9 18:22:57 2019 From: llvm-commits at lists.llvm.org (Hubert Tong via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 01:22:57 +0000 (UTC) Subject: [PATCH] D68104: [LNT] Python 3 support: adapt secret computation In-Reply-To: References: Message-ID: <62163d38c3645a85440556f275740751@localhost.localdomain> hubert.reinterpretcast accepted this revision. hubert.reinterpretcast added a comment. This revision is now accepted and ready to land. LGTM. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68104/new/ https://reviews.llvm.org/D68104 From llvm-commits at lists.llvm.org Wed Oct 9 18:32:01 2019 From: llvm-commits at lists.llvm.org (Anthony Eden via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 01:32:01 +0000 (UTC) Subject: [PATCH] D64962: appendToGlobalCtors: allow for llvm.global_ctors functions of varying type In-Reply-To: References: Message-ID: <6dd1582900eebf9b6041e054004a2cad@localhost.localdomain> aeden added a comment. In the meantime I've "worked around this" by casting my ConstantExpr * into a Function * and passing it off as one when calling appendToGlobalCtors. This happens to work only because the pointer is only ever used as a Constant * (possible Q: Why should appendToGlobalCtors take a Function * in the first place?) 
Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D64962/new/ https://reviews.llvm.org/D64962 From llvm-commits at lists.llvm.org Wed Oct 9 18:41:09 2019 From: llvm-commits at lists.llvm.org (Mitch Phillips via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 01:41:09 +0000 (UTC) Subject: [PATCH] D68754: [sanitizers] Update linker scripts to avoid emutls issues. In-Reply-To: References: Message-ID: hctim updated this revision to Diff 224230. hctim added a comment. Herald added a subscriber: srhines. - Add test for emutls. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68754/new/ https://reviews.llvm.org/D68754 Files: compiler-rt/lib/asan/asan.syms.extra compiler-rt/lib/hwasan/hwasan.syms.extra compiler-rt/lib/msan/msan.syms.extra compiler-rt/lib/ubsan/ubsan.syms.extra compiler-rt/lib/ubsan_minimal/ubsan.syms.extra compiler-rt/test/asan/TestCases/coverage-emutls.cpp Index: compiler-rt/test/asan/TestCases/coverage-emutls.cpp =================================================================== --- /dev/null +++ compiler-rt/test/asan/TestCases/coverage-emutls.cpp @@ -0,0 +1,7 @@ +// Test that sanitizer coverage instrumentation is being exported correctly. +// This caused a bug on Android where the TLS is emulated, and the linker script +// rules didn't capture the change from "__sancov_lowest_stack" to +// "__emutls_v.__sancov_lowest_stack", and didn't export the symbol. 
+ +// RUN: llvm-nm %shared_libasan | FileCheck %s +// CHECK: __sancov_lowest_stack Index: compiler-rt/lib/ubsan_minimal/ubsan.syms.extra =================================================================== --- compiler-rt/lib/ubsan_minimal/ubsan.syms.extra +++ compiler-rt/lib/ubsan_minimal/ubsan.syms.extra @@ -1 +1 @@ -__ubsan_* +*__ubsan_* Index: compiler-rt/lib/ubsan/ubsan.syms.extra =================================================================== --- compiler-rt/lib/ubsan/ubsan.syms.extra +++ compiler-rt/lib/ubsan/ubsan.syms.extra @@ -1 +1 @@ -__ubsan_* +*__ubsan_* Index: compiler-rt/lib/msan/msan.syms.extra =================================================================== --- compiler-rt/lib/msan/msan.syms.extra +++ compiler-rt/lib/msan/msan.syms.extra @@ -1,2 +1,3 @@ -__msan_* -__ubsan_* +*__msan_* +*__ubsan_* +*__sancov_* Index: compiler-rt/lib/hwasan/hwasan.syms.extra =================================================================== --- compiler-rt/lib/hwasan/hwasan.syms.extra +++ compiler-rt/lib/hwasan/hwasan.syms.extra @@ -1,2 +1,3 @@ -__hwasan_* -__ubsan_* +*__hwasan_* +*__ubsan_* +*__sancov_* Index: compiler-rt/lib/asan/asan.syms.extra =================================================================== --- compiler-rt/lib/asan/asan.syms.extra +++ compiler-rt/lib/asan/asan.syms.extra @@ -1,4 +1,4 @@ -__asan_* -__lsan_* -__ubsan_* -__sancov_* +*__asan_* +*__lsan_* +*__ubsan_* +*__sancov_* -------------- next part -------------- A non-text attachment was scrubbed... Name: D68754.224230.patch Type: text/x-patch Size: 1938 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 18:50:16 2019 From: llvm-commits at lists.llvm.org (ChenZheng via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 01:50:16 +0000 (UTC) Subject: [PATCH] D68189: [InstCombine] recognize popcount implemented in hacker's delight. In-Reply-To: References: Message-ID: shchenz updated this revision to Diff 224234. shchenz marked an inline comment as done. 
shchenz added a comment. avoid unnecessary heap allocations in APInt copy constructor. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68189/new/ https://reviews.llvm.org/D68189 Files: llvm/include/llvm/IR/PatternMatch.h llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp llvm/test/Transforms/AggressiveInstCombine/popcount.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D68189.224234.patch Type: text/x-patch Size: 13254 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 18:50:20 2019 From: llvm-commits at lists.llvm.org (ChenZheng via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 01:50:20 +0000 (UTC) Subject: [PATCH] D68189: [InstCombine] recognize popcount implemented in hacker's delight. In-Reply-To: References: Message-ID: <354b46e77afec236d3f378069f0b9ebf@localhost.localdomain> shchenz added inline comments. ================ Comment at: llvm/include/llvm/IR/PatternMatch.h:664 /// the value. -inline specific_intval m_SpecificInt(uint64_t V) { return specific_intval(V); } +inline specific_intval m_SpecificInt(APInt V) { return specific_intval(V); } + ---------------- craig.topper wrote: > Can we std::move this into specific_intval constructor and then std::move it again in the class. Otherwise we're making multiple heap allocations whenever the value is more than 64 bits. Right. Thanks for pointing it out. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68189/new/ https://reviews.llvm.org/D68189 From llvm-commits at lists.llvm.org Wed Oct 9 18:57:58 2019 From: llvm-commits at lists.llvm.org (Igor Kudrin via llvm-commits) Date: Thu, 10 Oct 2019 01:57:58 +0000 Subject: [llvm] r371510 - Reland [DWARF] Add a unit test for DWARFUnit::getLength(). In-Reply-To: References: <20190910115432.418288B698@lists.llvm.org> Message-ID: Hi David, ASan complained when `MemoryBuffer` checked for the null terminator.
The fix was to pass to `MemoryBuffer::getMemBuffer` the actual size of the section data, not including the termination byte of the string. Sorry for the late reply. Best Regards, Igor Kudrin C++ Developer, Access Softek, Inc. From: David Blaikie Sent: Tuesday, September 17, 2019 5:08 To: Igor Kudrin Cc: llvm-commits Subject: Re: [llvm] r371510 - Reland [DWARF] Add a unit test for DWARFUnit::getLength(). What was the asan bot failure? What was the fix? On Tue, Sep 10, 2019 at 4:52 AM Igor Kudrin via llvm-commits wrote: Author: ikudrin Date: Tue Sep 10 04:54:32 2019 New Revision: 371510 URL: http://llvm.org/viewvc/llvm-project?rev=371510&view=rev Log: Reland [DWARF] Add a unit test for DWARFUnit::getLength(). This is a follow-up of rL369529, where the return value of DWARFUnit::getLength() was changed from uint32_t to uint64_t. The test checks that a unit header with Length > 4G can be successfully parsed and the value of the Length field is not truncated.
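Editorial sketch for the ASan fix discussed in this thread: a minimal standalone C++ illustration, not the LLVM code itself, of why the size passed for a buffer backed by a string literal is `sizeof(...) - 1`. The names `ByteView`, `SectionRaw`, `payloadView`, `oversizedView`, `hasTerminatorAfter` and `checkPayloadView` are hypothetical, and `ByteView` only stands in for a `StringRef`-like pointer/length pair. The assumption, taken from the description in this thread, is that a consumer requiring a null terminator reads the byte at `Data[Size]`; that byte is in bounds only when `Size` excludes the literal's implicit terminator.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a StringRef-like (pointer, length) pair.
struct ByteView {
  const char *Data;
  std::size_t Size;
};

// 24 payload bytes written as a string literal. The compiler appends one
// extra '\0', so the array holds 25 bytes in total.
static const char SectionRaw[] =
    "\xff\xff\xff\xff"                 // 4 bytes
    "\x88\x77\x66\x55\x44\x33\x22\x11" // 8 bytes
    "\x05\x00"                         // 2 bytes
    "\x01"                             // 1 byte
    "\x04"                             // 1 byte
    "\0\0\0\0\0\0\0\0";                // 8 bytes

// Payload only: Data[Size] is the literal's own terminator, still in bounds.
inline ByteView payloadView() {
  return {SectionRaw, sizeof(SectionRaw) - 1};
}

// Includes the terminator: Data[Size] would be one past the end of the array.
inline ByteView oversizedView() { return {SectionRaw, sizeof(SectionRaw)}; }

// Models a consumer that requires a terminator right after the data.
// Only call this when V.Data[V.Size] is known to be in bounds.
inline bool hasTerminatorAfter(ByteView V) { return V.Data[V.Size] == '\0'; }

// Self-check that is safe to run: it never reads past the array.
inline void checkPayloadView() {
  ByteView V = payloadView();
  assert(V.Size == 24);
  assert(hasTerminatorAfter(V));
}
```

Calling `hasTerminatorAfter(oversizedView())` is deliberately left out: it would read one byte past `SectionRaw`, which is the kind of access a sanitizer is expected to flag.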
Differential Revision: https://reviews.llvm.org/D67276 Modified: llvm/trunk/unittests/DebugInfo/DWARF/DWARFDebugInfoTest.cpp Modified: llvm/trunk/unittests/DebugInfo/DWARF/DWARFDebugInfoTest.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/unittests/DebugInfo/DWARF/DWARFDebugInfoTest.cpp?rev=371510&r1=371509&r2=371510&view=diff ============================================================================== --- llvm/trunk/unittests/DebugInfo/DWARF/DWARFDebugInfoTest.cpp (original) +++ llvm/trunk/unittests/DebugInfo/DWARF/DWARFDebugInfoTest.cpp Tue Sep 10 04:54:32 2019 @@ -3158,4 +3158,46 @@ TEST(DWARFDebugInfo, TestDWARFDieRangeIn AssertRangesIntersect(Ranges, {{0x20, 0x21}, {0x2f, 0x31}}); } +TEST(DWARFDebugInfo, TestDWARF64UnitLength) { + static const char DebugInfoSecRaw[] = + "\xff\xff\xff\xff" // DWARF64 mark + "\x88\x77\x66\x55\x44\x33\x22\x11" // Length + "\x05\x00" // Version + "\x01" // DW_UT_compile + "\x04" // Address size + "\0\0\0\0\0\0\0\0"; // Offset Into Abbrev. Sec. + StringMap> Sections; + Sections.insert(std::make_pair( + "debug_info", MemoryBuffer::getMemBuffer(StringRef( + DebugInfoSecRaw, sizeof(DebugInfoSecRaw) - 1)))); + auto Context = DWARFContext::create(Sections, /* AddrSize = */ 4, + /* isLittleEndian = */ true); + const auto &Obj = Context->getDWARFObj(); + Obj.forEachInfoSections([&](const DWARFSection &Sec) { + DWARFUnitHeader Header; + DWARFDataExtractor Data(Obj, Sec, /* IsLittleEndian = */ true, + /* AddressSize = */ 4); + uint64_t Offset = 0; + EXPECT_FALSE(Header.extract(*Context, Data, &Offset)); + // Header.extract() returns false because there is not enough space + // in the section for the declared length. Anyway, we can check that + // the properties are read correctly. 
+ ASSERT_EQ(DwarfFormat::DWARF64, Header.getFormat()); + ASSERT_EQ(0x1122334455667788ULL, Header.getLength()); + ASSERT_EQ(5, Header.getVersion()); + ASSERT_EQ(DW_UT_compile, Header.getUnitType()); + ASSERT_EQ(4, Header.getAddressByteSize()); + + // Check that the length can be correctly read in the unit class. + DWARFUnitVector DummyUnitVector; + DWARFSection DummySec; + DWARFCompileUnit CU(*Context, Sec, Header, /* DA = */ 0, /* RS = */ 0, + /* LocSection = */ 0, /* SS = */ StringRef(), + /* SOS = */ DummySec, /* AOS = */ 0, + /* LS = */ DummySec, /* LE = */ true, + /* isDWO= */ false, DummyUnitVector); + ASSERT_EQ(0x1122334455667788ULL, CU.getLength()); + }); +} + } // end anonymous namespace _______________________________________________ llvm-commits mailing list llvm-commits at lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits -------------- next part -------------- An HTML attachment was scrubbed... URL: From llvm-commits at lists.llvm.org Wed Oct 9 18:59:19 2019 From: llvm-commits at lists.llvm.org (Volodymyr Sapsai via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 01:59:19 +0000 (UTC) Subject: [PATCH] D68252: [Stats] Add ALWAYS_ENABLED_STATISTIC enabled regardless of LLVM_ENABLE_STATS. In-Reply-To: References: Message-ID: <5655d6baced0429111d0019c2a120170@localhost.localdomain> vsapsai updated this revision to Diff 224235. vsapsai marked an inline comment as done. vsapsai added a comment. Herald added a subscriber: jfb. - Address review comments. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68252/new/ https://reviews.llvm.org/D68252 Files: clang/include/clang/Basic/FileManager.h clang/include/clang/Lex/HeaderSearch.h clang/lib/Basic/FileManager.cpp clang/lib/Lex/HeaderSearch.cpp llvm/include/llvm/ADT/Statistic.h llvm/lib/Support/Statistic.cpp llvm/unittests/ADT/StatisticTest.cpp -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68252.224235.patch Type: text/x-patch Size: 14835 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 19:04:56 2019 From: llvm-commits at lists.llvm.org (Nico Weber via llvm-commits) Date: Thu, 10 Oct 2019 02:04:56 -0000 Subject: [lld] r374270 - dummy comment typo fix commit to cycle the bots Message-ID: <20191010020456.8630787740@lists.llvm.org> Author: nico Date: Wed Oct 9 19:04:56 2019 New Revision: 374270 URL: http://llvm.org/viewvc/llvm-project?rev=374270&view=rev Log: dummy comment typo fix commit to cycle the bots Modified: lld/trunk/COFF/DLL.cpp lld/trunk/COFF/Driver.cpp lld/trunk/COFF/ICF.cpp lld/trunk/COFF/InputFiles.cpp lld/trunk/COFF/MinGW.cpp Modified: lld/trunk/COFF/DLL.cpp URL: http://llvm.org/viewvc/llvm-project/lld/trunk/COFF/DLL.cpp?rev=374270&r1=374269&r2=374270&view=diff ============================================================================== --- lld/trunk/COFF/DLL.cpp (original) +++ lld/trunk/COFF/DLL.cpp Wed Oct 9 19:04:56 2019 @@ -135,7 +135,7 @@ private: static std::vector> binImports(const std::vector &imports) { // Group DLL-imported symbols by DLL name because that's how - // symbols are layed out in the import descriptor table. + // symbols are laid out in the import descriptor table. 
auto less = [](const std::string &a, const std::string &b) { return config->dllOrder[a] < config->dllOrder[b]; }; Modified: lld/trunk/COFF/Driver.cpp URL: http://llvm.org/viewvc/llvm-project/lld/trunk/COFF/Driver.cpp?rev=374270&r1=374269&r2=374270&view=diff ============================================================================== --- lld/trunk/COFF/Driver.cpp (original) +++ lld/trunk/COFF/Driver.cpp Wed Oct 9 19:04:56 2019 @@ -718,8 +718,7 @@ static std::string getImplibPath() { return out.str(); } -// -// The import name is caculated as the following: +// The import name is calculated as follows: // // | LIBRARY w/ ext | LIBRARY w/o ext | no LIBRARY // -----+----------------+---------------------+------------------ Modified: lld/trunk/COFF/ICF.cpp URL: http://llvm.org/viewvc/llvm-project/lld/trunk/COFF/ICF.cpp?rev=374270&r1=374269&r2=374270&view=diff ============================================================================== --- lld/trunk/COFF/ICF.cpp (original) +++ lld/trunk/COFF/ICF.cpp Wed Oct 9 19:04:56 2019 @@ -77,7 +77,7 @@ private: // section is insignificant to the user program and the behaviour matches that // of the Visual C++ linker. bool ICF::isEligible(SectionChunk *c) { - // Non-comdat chunks, dead chunks, and writable chunks are not elegible. + // Non-comdat chunks, dead chunks, and writable chunks are not eligible. bool writable = c->getOutputCharacteristics() & llvm::COFF::IMAGE_SCN_MEM_WRITE; if (!c->isCOMDAT() || !c->live || writable) return false; @@ -274,7 +274,7 @@ void ICF::run(ArrayRef vec) { for (Symbol *b : sc->symbols()) if (auto *sym = dyn_cast_or_null(b)) hash += sym->getChunk()->eqClass[cnt % 2]; - // Set MSB to 1 to avoid collisions with non-hash classs. + // Set MSB to 1 to avoid collisions with non-hash classes. sc->eqClass[(cnt + 1) % 2] = hash | (1U << 31); }); } @@ -297,7 +297,7 @@ void ICF::run(ArrayRef vec) { log("ICF needed " + Twine(cnt) + " iterations"); - // Merge sections in the same classs. 
+ // Merge sections in the same classes. forEachClass([&](size_t begin, size_t end) { if (end - begin == 1) return; Modified: lld/trunk/COFF/InputFiles.cpp URL: http://llvm.org/viewvc/llvm-project/lld/trunk/COFF/InputFiles.cpp?rev=374270&r1=374269&r2=374270&view=diff ============================================================================== --- lld/trunk/COFF/InputFiles.cpp (original) +++ lld/trunk/COFF/InputFiles.cpp Wed Oct 9 19:04:56 2019 @@ -599,7 +599,7 @@ Optional ObjFile::createDefine // Comdat handling. // A comdat symbol consists of two symbol table entries. // The first symbol entry has the name of the section (e.g. .text), fixed - // values for the other fields, and one auxilliary record. + // values for the other fields, and one auxiliary record. // The second symbol entry has the name of the comdat symbol, called the // "comdat leader". // When this function is called for the first symbol entry of a comdat, @@ -669,7 +669,7 @@ ArrayRef ObjFile::getDebugSecti return {}; } -// OBJ files systematically store critical informations in a .debug$S stream, +// OBJ files systematically store critical information in a .debug$S stream, // even if the TU was compiled with no debug info. At least two records are // always there. S_OBJNAME stores a 32-bit signature, which is loaded into the // PCHSignature member. S_COMPILE3 stores compile-time cmd-line flags. 
This is Modified: lld/trunk/COFF/MinGW.cpp URL: http://llvm.org/viewvc/llvm-project/lld/trunk/COFF/MinGW.cpp?rev=374270&r1=374269&r2=374270&view=diff ============================================================================== --- lld/trunk/COFF/MinGW.cpp (original) +++ lld/trunk/COFF/MinGW.cpp Wed Oct 9 19:04:56 2019 @@ -55,7 +55,7 @@ AutoExporter::AutoExporter() { // C++ symbols "__rtti_", "__builtin_", - // Artifical symbols such as .refptr + // Artificial symbols such as .refptr ".", }; From llvm-commits at lists.llvm.org Wed Oct 9 19:08:33 2019 From: llvm-commits at lists.llvm.org (Volodymyr Sapsai via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 02:08:33 +0000 (UTC) Subject: [PATCH] D68252: [Stats] Add ALWAYS_ENABLED_STATISTIC enabled regardless of LLVM_ENABLE_STATS. In-Reply-To: References: Message-ID: <5762c38811c951f1883544aedd308566@localhost.localdomain> vsapsai marked 2 inline comments as done. vsapsai added a comment. Thanks for the review. ================ Comment at: llvm/include/llvm/ADT/Statistic.h:47 -class Statistic { +class StatisticBase { public: ---------------- dsanders wrote: > Do we actually need the common base class? I'm thinking that since NoopStatistic never registers (and can't because the interfaces to do so changed to TrackingStatistic*), then there's a good chance that nothing reads DebugType, Name, Desc, Value, Initialized and we might be able to save a little memory by eliminating them. Good point. I've tried to remove the common base class but we are reading Name and Desc at least in [`FusionCandidate::reportInvalidCandidate`](https://github.com/llvm/llvm-project/blob/d6e9e99cec95c83293c68d3b30534e34f53a1923/llvm/lib/Transforms/Scalar/LoopFuse.cpp#L339-L342) and [`reportLoopFusion`](https://github.com/llvm/llvm-project/blob/d6e9e99cec95c83293c68d3b30534e34f53a1923/llvm/lib/Transforms/Scalar/LoopFuse.cpp#L1326-L1331). 
So I've made `StatisticBase` store only DebugType, Name, and Desc, and moved Value and Initialized to `TrackingStatistic`. This saves a little memory and makes `NoopStatistic::getValue()` cheaper, as we don't touch an atomic. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68252/new/ https://reviews.llvm.org/D68252 From llvm-commits at lists.llvm.org Wed Oct 9 19:17:50 2019 From: llvm-commits at lists.llvm.org (Pengfei Wang via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 02:17:50 +0000 (UTC) Subject: [PATCH] D68686: [X86] Add strict fp support for instructions fadd/fsub/fmul/fdiv In-Reply-To: References: Message-ID: pengfei updated this revision to Diff 224236. pengfei added a comment. Separate strict node handling from former patch. This patch only models MXCSR. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68686/new/ https://reviews.llvm.org/D68686 Files: llvm/lib/Target/X86/X86InstrAVX512.td llvm/lib/Target/X86/X86InstrFormats.td llvm/lib/Target/X86/X86InstrInfo.cpp llvm/lib/Target/X86/X86InstrSSE.td llvm/lib/Target/X86/X86RegisterInfo.cpp llvm/lib/Target/X86/X86RegisterInfo.td llvm/test/CodeGen/MIR/X86/constant-pool.mir llvm/test/CodeGen/MIR/X86/fastmath.mir llvm/test/CodeGen/MIR/X86/memory-operands.mir llvm/test/CodeGen/X86/evex-to-vex-compress.mir llvm/test/CodeGen/X86/ipra-reg-usage.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D68686.224236.patch Type: text/x-patch Size: 71351 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 19:36:11 2019 From: llvm-commits at lists.llvm.org (Fangrui Song via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 02:36:11 +0000 (UTC) Subject: [PATCH] D67986: [InstCombine] snprintf (d, size, "%s", s) -> memccpy (d, s, '\0', size - 1), d[size - 1] = 0 In-Reply-To: References: Message-ID: <9b0949e292eac0a247fa223b186c6361@localhost.localdomain> MaskRay added a comment. 
This transformation seems to increase code size significantly. Is the snprintf "%s" pattern common enough? I suspect most projects have already used memccpy, stpncpy, strscpy, or strlcpy. For the few that don't, the performance probably does not matter. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67986/new/ https://reviews.llvm.org/D67986 From llvm-commits at lists.llvm.org Wed Oct 9 19:45:25 2019 From: llvm-commits at lists.llvm.org (Rui Ueyama via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 02:45:25 +0000 (UTC) Subject: [PATCH] D68689: [LLD] [MinGW] Look for other library patterns with -l In-Reply-To: References: Message-ID: <4ed2e1dec9f751a121eea179d4a2b90b@localhost.localdomain> ruiu accepted this revision. ruiu added a comment. LGTM with this change. ================ Comment at: lld/MinGW/Driver.cpp:144-145 + if (Optional s = findFile(dir, "lib" + name + ".dll")) + fatal("lld doesn't support linking directly against " + *s + + ", use an import library"); + if (Optional s = findFile(dir, name + ".dll")) ---------------- It should be error instead of fatal, as our promise is that we use fatal only to report corrupted files. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68689/new/ https://reviews.llvm.org/D68689 From llvm-commits at lists.llvm.org Wed Oct 9 19:45:26 2019 From: llvm-commits at lists.llvm.org (Rui Ueyama via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 02:45:26 +0000 (UTC) Subject: [PATCH] D68749: [lld][WebAssembly] Refactor markLive.cpp. NFC In-Reply-To: References: Message-ID: ruiu accepted this revision. ruiu added a comment. This revision is now accepted and ready to land. 
LGTM Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68749/new/ https://reviews.llvm.org/D68749 From llvm-commits at lists.llvm.org Wed Oct 9 19:45:28 2019 From: llvm-commits at lists.llvm.org (Nico Weber via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 02:45:28 +0000 (UTC) Subject: [PATCH] D68676: [ASan] Do not misrepresent high value address dereferences as null dereferences In-Reply-To: References: Message-ID: thakis added inline comments. ================ Comment at: compiler-rt/lib/sanitizer_common/sanitizer_win.cpp:949 +bool SignalContext::IsTrueFaultingAddress() const { + // TODO: Provide real implementation for this. See Linux and Mac variants. + return IsMemoryAccess(); ---------------- This breaks a bunch of bots with `/home/buildbots/ppc64be-clang-lnt-test/clang-ppc64be-lnt/llvm/projects/compiler-rt/lib/sanitizer_common/sanitizer_win.cpp:949: Missing username in TODO; it should look like "// TODO(my_username): Stuff." [readability/todo] [2]` Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68676/new/ https://reviews.llvm.org/D68676 From llvm-commits at lists.llvm.org Wed Oct 9 19:48:47 2019 From: llvm-commits at lists.llvm.org (Nico Weber via llvm-commits) Date: Thu, 10 Oct 2019 02:48:47 -0000 Subject: [llvm] r374272 - gn build: (manually) merge r374271 Message-ID: <20191010024847.4CFA987961@lists.llvm.org> Author: nico Date: Wed Oct 9 19:48:47 2019 New Revision: 374272 URL: http://llvm.org/viewvc/llvm-project?rev=374272&view=rev Log: gn build: (manually) merge r374271 Added: llvm/trunk/utils/gn/secondary/clang/lib/Tooling/Transformer/ llvm/trunk/utils/gn/secondary/clang/lib/Tooling/Transformer/BUILD.gn Modified: llvm/trunk/utils/gn/secondary/clang-tools-extra/clang-tidy/utils/BUILD.gn llvm/trunk/utils/gn/secondary/clang-tools-extra/unittests/clang-tidy/BUILD.gn llvm/trunk/utils/gn/secondary/clang/lib/Tooling/Refactoring/BUILD.gn 
llvm/trunk/utils/gn/secondary/clang/unittests/Tooling/BUILD.gn Modified: llvm/trunk/utils/gn/secondary/clang-tools-extra/clang-tidy/utils/BUILD.gn URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/gn/secondary/clang-tools-extra/clang-tidy/utils/BUILD.gn?rev=374272&r1=374271&r2=374272&view=diff ============================================================================== --- llvm/trunk/utils/gn/secondary/clang-tools-extra/clang-tidy/utils/BUILD.gn (original) +++ llvm/trunk/utils/gn/secondary/clang-tools-extra/clang-tidy/utils/BUILD.gn Wed Oct 9 19:48:47 2019 @@ -7,7 +7,7 @@ static_library("utils") { "//clang/lib/ASTMatchers", "//clang/lib/Basic", "//clang/lib/Lex", - "//clang/lib/Tooling/Refactoring", + "//clang/lib/Tooling/Transformer", "//llvm/lib/Support", ] sources = [ Modified: llvm/trunk/utils/gn/secondary/clang-tools-extra/unittests/clang-tidy/BUILD.gn URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/gn/secondary/clang-tools-extra/unittests/clang-tidy/BUILD.gn?rev=374272&r1=374271&r2=374272&view=diff ============================================================================== --- llvm/trunk/utils/gn/secondary/clang-tools-extra/unittests/clang-tidy/BUILD.gn (original) +++ llvm/trunk/utils/gn/secondary/clang-tools-extra/unittests/clang-tidy/BUILD.gn Wed Oct 9 19:48:47 2019 @@ -18,7 +18,7 @@ unittest("ClangTidyTests") { "//clang/lib/Serialization", "//clang/lib/Tooling", "//clang/lib/Tooling/Core", - "//clang/lib/Tooling/Refactoring", + "//clang/lib/Tooling/Transformer", "//llvm/lib/Support", ] include_dirs = [ "//clang-tools-extra/clang-tidy" ] Modified: llvm/trunk/utils/gn/secondary/clang/lib/Tooling/Refactoring/BUILD.gn URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/gn/secondary/clang/lib/Tooling/Refactoring/BUILD.gn?rev=374272&r1=374271&r2=374272&view=diff ============================================================================== --- llvm/trunk/utils/gn/secondary/clang/lib/Tooling/Refactoring/BUILD.gn (original) +++ 
llvm/trunk/utils/gn/secondary/clang/lib/Tooling/Refactoring/BUILD.gn Wed Oct 9 19:48:47 2019 @@ -19,16 +19,11 @@ static_library("Refactoring") { "AtomicChange.cpp", "Extract/Extract.cpp", "Extract/SourceExtraction.cpp", - "RangeSelector.cpp", "RefactoringActions.cpp", "Rename/RenamingAction.cpp", "Rename/SymbolOccurrences.cpp", "Rename/USRFinder.cpp", "Rename/USRFindingAction.cpp", "Rename/USRLocFinder.cpp", - "SourceCode.cpp", - "SourceCodeBuilders.cpp", - "Stencil.cpp", - "Transformer.cpp", ] } Added: llvm/trunk/utils/gn/secondary/clang/lib/Tooling/Transformer/BUILD.gn URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/gn/secondary/clang/lib/Tooling/Transformer/BUILD.gn?rev=374272&view=auto ============================================================================== --- llvm/trunk/utils/gn/secondary/clang/lib/Tooling/Transformer/BUILD.gn (added) +++ llvm/trunk/utils/gn/secondary/clang/lib/Tooling/Transformer/BUILD.gn Wed Oct 9 19:48:47 2019 @@ -0,0 +1,20 @@ +static_library("Transformer") { + output_name = "clangToolingTransformer" + configs += [ "//llvm/utils/gn/build:clang_code" ] + deps = [ + "//clang/lib/AST", + "//clang/lib/ASTMatchers", + "//clang/lib/Basic", + "//clang/lib/Lex", + "//clang/lib/Tooling/Core", + "//clang/lib/Tooling/Refactoring", + "//llvm/lib/Support", + ] + sources = [ + "RangeSelector.cpp", + "SourceCode.cpp", + "SourceCodeBuilders.cpp", + "Stencil.cpp", + "Transformer.cpp", + ] +} Modified: llvm/trunk/utils/gn/secondary/clang/unittests/Tooling/BUILD.gn URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/gn/secondary/clang/unittests/Tooling/BUILD.gn?rev=374272&r1=374271&r2=374272&view=diff ============================================================================== --- llvm/trunk/utils/gn/secondary/clang/unittests/Tooling/BUILD.gn (original) +++ llvm/trunk/utils/gn/secondary/clang/unittests/Tooling/BUILD.gn Wed Oct 9 19:48:47 2019 @@ -14,6 +14,7 @@ unittest("ToolingTests") { "//clang/lib/Tooling", 
"//clang/lib/Tooling/Core", "//clang/lib/Tooling/Refactoring", + "//clang/lib/Tooling/Transformer", "//llvm/lib/Support", "//llvm/lib/Target:TargetsToBuild", "//llvm/lib/Testing/Support", From llvm-commits at lists.llvm.org Wed Oct 9 20:00:16 2019 From: llvm-commits at lists.llvm.org (Chen Zheng via llvm-commits) Date: Thu, 10 Oct 2019 03:00:16 -0000 Subject: [llvm] r374273 - [PowerPC] add testcase for ppc loop instr form prep - NFC Message-ID: <20191010030016.9918D8A915@lists.llvm.org> Author: shchenz Date: Wed Oct 9 20:00:15 2019 New Revision: 374273 URL: http://llvm.org/viewvc/llvm-project?rev=374273&view=rev Log: [PowerPC] add testcase for ppc loop instr form prep - NFC Added: llvm/trunk/test/CodeGen/PowerPC/loop-instr-form-prepare.ll Added: llvm/trunk/test/CodeGen/PowerPC/loop-instr-form-prepare.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/PowerPC/loop-instr-form-prepare.ll?rev=374273&view=auto ============================================================================== --- llvm/trunk/test/CodeGen/PowerPC/loop-instr-form-prepare.ll (added) +++ llvm/trunk/test/CodeGen/PowerPC/loop-instr-form-prepare.ll Wed Oct 9 20:00:15 2019 @@ -0,0 +1,753 @@ +; RUN: llc -ppc-asm-full-reg-names -verify-machineinstrs -mtriple=powerpc64le-unknown-linux-gnu -mcpu=pwr9 < %s | FileCheck %s + +; test_no_prep: +; unsigned long test_no_prep(char *p, int count) { +; unsigned long i=0, res=0; +; int DISP1 = 4001; +; int DISP2 = 4002; +; int DISP3 = 4003; +; int DISP4 = 4004; +; for (; i < count ; i++) { +; unsigned long x1 = *(unsigned long *)(p + i + DISP1); +; unsigned long x2 = *(unsigned long *)(p + i + DISP2); +; unsigned long x3 = *(unsigned long *)(p + i + DISP3); +; unsigned long x4 = *(unsigned long *)(p + i + DISP4); +; res += x1*x2*x3*x4; +; } +; return res + count; +; } + +define i64 @test_no_prep(i8* %0, i32 signext %1) { +; CHECK-LABEL: test_no_prep: +; CHECK: addi r3, r3, 4004 +; CHECK: .LBB0_2: # +; CHECK-NEXT: ldx r9, r3, r6 +; CHECK-NEXT: ldx 
r10, r3, r7 +; CHECK-NEXT: mulld r9, r10, r9 +; CHECK-NEXT: ldx r11, r3, r8 +; CHECK-NEXT: mulld r9, r9, r11 +; CHECK-NEXT: ld r12, 0(r3) +; CHECK-NEXT: addi r3, r3, 1 +; CHECK-NEXT: maddld r5, r9, r12, r5 +; CHECK-NEXT: bdnz .LBB0_2 + %3 = sext i32 %1 to i64 + %4 = icmp eq i32 %1, 0 + br i1 %4, label %27, label %5 + +5: ; preds = %2, %5 + %6 = phi i64 [ %25, %5 ], [ 0, %2 ] + %7 = phi i64 [ %24, %5 ], [ 0, %2 ] + %8 = getelementptr inbounds i8, i8* %0, i64 %6 + %9 = getelementptr inbounds i8, i8* %8, i64 4001 + %10 = bitcast i8* %9 to i64* + %11 = load i64, i64* %10, align 8 + %12 = getelementptr inbounds i8, i8* %8, i64 4002 + %13 = bitcast i8* %12 to i64* + %14 = load i64, i64* %13, align 8 + %15 = getelementptr inbounds i8, i8* %8, i64 4003 + %16 = bitcast i8* %15 to i64* + %17 = load i64, i64* %16, align 8 + %18 = getelementptr inbounds i8, i8* %8, i64 4004 + %19 = bitcast i8* %18 to i64* + %20 = load i64, i64* %19, align 8 + %21 = mul i64 %14, %11 + %22 = mul i64 %21, %17 + %23 = mul i64 %22, %20 + %24 = add i64 %23, %7 + %25 = add nuw i64 %6, 1 + %26 = icmp ult i64 %25, %3 + br i1 %26, label %5, label %27 + +27: ; preds = %5, %2 + %28 = phi i64 [ 0, %2 ], [ %24, %5 ] + %29 = add i64 %28, %3 + ret i64 %29 +} + +; test_ds_prep: +; unsigned long test_ds_prep(char *p, int count) { +; unsigned long i=0, res=0; +; int DISP1 = 4001; +; int DISP2 = 4002; +; int DISP3 = 4003; +; int DISP4 = 4006; +; for (; i < count ; i++) { +; unsigned long x1 = *(unsigned long *)(p + i + DISP1); +; unsigned long x2 = *(unsigned long *)(p + i + DISP2); +; unsigned long x3 = *(unsigned long *)(p + i + DISP3); +; unsigned long x4 = *(unsigned long *)(p + i + DISP4); +; res += x1*x2*x3*x4; +; } +; return res + count; +; } + +define i64 @test_ds_prep(i8* %0, i32 signext %1) { +; CHECK-LABEL: test_ds_prep: +; CHECK: addi r6, r3, 4001 +; CHECK: .LBB1_2: # +; CHECK-NEXT: ld r10, 0(r6) +; CHECK-NEXT: ldx r11, r6, r5 +; CHECK-NEXT: mulld r10, r11, r10 +; CHECK-NEXT: ldx r12, r6, r7 +; 
CHECK-NEXT: mulld r10, r10, r12 +; CHECK-NEXT: addi r9, r6, 1 +; CHECK-NEXT: ldx r6, r6, r8 +; CHECK-NEXT: maddld r3, r10, r6, r3 +; CHECK-NEXT: mr r6, r9 +; CHECK-NEXT: bdnz .LBB1_2 + %3 = sext i32 %1 to i64 + %4 = icmp eq i32 %1, 0 + br i1 %4, label %27, label %5 + +5: ; preds = %2, %5 + %6 = phi i64 [ %25, %5 ], [ 0, %2 ] + %7 = phi i64 [ %24, %5 ], [ 0, %2 ] + %8 = getelementptr inbounds i8, i8* %0, i64 %6 + %9 = getelementptr inbounds i8, i8* %8, i64 4001 + %10 = bitcast i8* %9 to i64* + %11 = load i64, i64* %10, align 8 + %12 = getelementptr inbounds i8, i8* %8, i64 4002 + %13 = bitcast i8* %12 to i64* + %14 = load i64, i64* %13, align 8 + %15 = getelementptr inbounds i8, i8* %8, i64 4003 + %16 = bitcast i8* %15 to i64* + %17 = load i64, i64* %16, align 8 + %18 = getelementptr inbounds i8, i8* %8, i64 4006 + %19 = bitcast i8* %18 to i64* + %20 = load i64, i64* %19, align 8 + %21 = mul i64 %14, %11 + %22 = mul i64 %21, %17 + %23 = mul i64 %22, %20 + %24 = add i64 %23, %7 + %25 = add nuw i64 %6, 1 + %26 = icmp ult i64 %25, %3 + br i1 %26, label %5, label %27 + +27: ; preds = %5, %2 + %28 = phi i64 [ 0, %2 ], [ %24, %5 ] + %29 = add i64 %28, %3 + ret i64 %29 +} + +; test_max_number_reminder: +; unsigned long test_max_number_reminder(char *p, int count) { +; unsigned long i=0, res=0; +; int DISP1 = 4001; +; int DISP2 = 4002; +; int DISP3 = 4003; +; int DISP4 = 4005; +; int DISP5 = 4006; +; int DISP6 = 4007; +; int DISP7 = 4014; +; int DISP8 = 4010; +; int DISP9 = 4011; +; for (; i < count ; i++) { +; unsigned long x1 = *(unsigned long *)(p + i + DISP1); +; unsigned long x2 = *(unsigned long *)(p + i + DISP2); +; unsigned long x3 = *(unsigned long *)(p + i + DISP3); +; unsigned long x4 = *(unsigned long *)(p + i + DISP4); +; unsigned long x5 = *(unsigned long *)(p + i + DISP5); +; unsigned long x6 = *(unsigned long *)(p + i + DISP6); +; unsigned long x7 = *(unsigned long *)(p + i + DISP7); +; unsigned long x8 = *(unsigned long *)(p + i + DISP8); +; unsigned long 
x9 = *(unsigned long *)(p + i + DISP9); +; res += x1*x2*x3*x4*x5*x6*x7*x8*x9; +; } +; return res + count; +;} + +define i64 @test_max_number_reminder(i8* %0, i32 signext %1) { +; CHECK-LABEL: test_max_number_reminder: +; CHECK: addi r8, r3, 4001 +; CHECK: .LBB2_2: # +; CHECK-NEXT: ld r30, 0(r8) +; CHECK-NEXT: ldx r29, r8, r5 +; CHECK-NEXT: mulld r30, r29, r30 +; CHECK-NEXT: addi r0, r8, 1 +; CHECK-NEXT: ld r28, 4(r8) +; CHECK-NEXT: ldx r27, r8, r7 +; CHECK-NEXT: ldx r26, r8, r9 +; CHECK-NEXT: ldx r25, r8, r10 +; CHECK-NEXT: ldx r24, r8, r11 +; CHECK-NEXT: ldx r23, r8, r12 +; CHECK-NEXT: ldx r8, r8, r6 +; CHECK-NEXT: mulld r8, r30, r8 +; CHECK-NEXT: mulld r8, r8, r28 +; CHECK-NEXT: mulld r8, r8, r27 +; CHECK-NEXT: mulld r8, r8, r26 +; CHECK-NEXT: mulld r8, r8, r25 +; CHECK-NEXT: mulld r8, r8, r24 +; CHECK-NEXT: maddld r3, r8, r23, r3 +; CHECK-NEXT: mr r8, r0 +; CHECK-NEXT: bdnz .LBB2_2 + %3 = sext i32 %1 to i64 + %4 = icmp eq i32 %1, 0 + br i1 %4, label %47, label %5 + +5: ; preds = %2, %5 + %6 = phi i64 [ %45, %5 ], [ 0, %2 ] + %7 = phi i64 [ %44, %5 ], [ 0, %2 ] + %8 = getelementptr inbounds i8, i8* %0, i64 %6 + %9 = getelementptr inbounds i8, i8* %8, i64 4001 + %10 = bitcast i8* %9 to i64* + %11 = load i64, i64* %10, align 8 + %12 = getelementptr inbounds i8, i8* %8, i64 4002 + %13 = bitcast i8* %12 to i64* + %14 = load i64, i64* %13, align 8 + %15 = getelementptr inbounds i8, i8* %8, i64 4003 + %16 = bitcast i8* %15 to i64* + %17 = load i64, i64* %16, align 8 + %18 = getelementptr inbounds i8, i8* %8, i64 4005 + %19 = bitcast i8* %18 to i64* + %20 = load i64, i64* %19, align 8 + %21 = getelementptr inbounds i8, i8* %8, i64 4006 + %22 = bitcast i8* %21 to i64* + %23 = load i64, i64* %22, align 8 + %24 = getelementptr inbounds i8, i8* %8, i64 4007 + %25 = bitcast i8* %24 to i64* + %26 = load i64, i64* %25, align 8 + %27 = getelementptr inbounds i8, i8* %8, i64 4014 + %28 = bitcast i8* %27 to i64* + %29 = load i64, i64* %28, align 8 + %30 = getelementptr inbounds 
i8, i8* %8, i64 4010 + %31 = bitcast i8* %30 to i64* + %32 = load i64, i64* %31, align 8 + %33 = getelementptr inbounds i8, i8* %8, i64 4011 + %34 = bitcast i8* %33 to i64* + %35 = load i64, i64* %34, align 8 + %36 = mul i64 %14, %11 + %37 = mul i64 %36, %17 + %38 = mul i64 %37, %20 + %39 = mul i64 %38, %23 + %40 = mul i64 %39, %26 + %41 = mul i64 %40, %29 + %42 = mul i64 %41, %32 + %43 = mul i64 %42, %35 + %44 = add i64 %43, %7 + %45 = add nuw i64 %6, 1 + %46 = icmp ult i64 %45, %3 + br i1 %46, label %5, label %47 + +47: ; preds = %5, %2 + %48 = phi i64 [ 0, %2 ], [ %44, %5 ] + %49 = add i64 %48, %3 + ret i64 %49 +} + +; test_update_ds_prep_interact: +; unsigned long test_update_ds_prep_interact(char *p, int count) { +; unsigned long i=0, res=0; +; int DISP1 = 4001; +; int DISP2 = 4002; +; int DISP3 = 4003; +; int DISP4 = 4006; +; for (; i < count ; i++) { +; unsigned long x1 = *(unsigned long *)(p + 4 * i + DISP1); +; unsigned long x2 = *(unsigned long *)(p + 4 * i + DISP2); +; unsigned long x3 = *(unsigned long *)(p + 4 * i + DISP3); +; unsigned long x4 = *(unsigned long *)(p + 4 * i + DISP4); +; res += x1*x2*x3*x4; +; } +; return res + count; +; } + +define dso_local i64 @test_update_ds_prep_interact(i8* %0, i32 signext %1) { +; CHECK-LABEL: test_update_ds_prep_interact: +; CHECK: addi r3, r3, 3997 +; CHECK: .LBB3_2: # +; CHECK-NEXT: ldu r9, 4(r3) +; CHECK-NEXT: ldx r10, r3, r6 +; CHECK-NEXT: mulld r9, r10, r9 +; CHECK-NEXT: ldx r11, r3, r7 +; CHECK-NEXT: mulld r9, r9, r11 +; CHECK-NEXT: ldx r12, r3, r8 +; CHECK-NEXT: maddld r5, r9, r12, r5 +; CHECK-NEXT: bdnz .LBB3_2 + %3 = sext i32 %1 to i64 + %4 = icmp eq i32 %1, 0 + br i1 %4, label %28, label %5 + +5: ; preds = %2, %5 + %6 = phi i64 [ %26, %5 ], [ 0, %2 ] + %7 = phi i64 [ %25, %5 ], [ 0, %2 ] + %8 = shl i64 %6, 2 + %9 = getelementptr inbounds i8, i8* %0, i64 %8 + %10 = getelementptr inbounds i8, i8* %9, i64 4001 + %11 = bitcast i8* %10 to i64* + %12 = load i64, i64* %11, align 8 + %13 = getelementptr 
inbounds i8, i8* %9, i64 4002 + %14 = bitcast i8* %13 to i64* + %15 = load i64, i64* %14, align 8 + %16 = getelementptr inbounds i8, i8* %9, i64 4003 + %17 = bitcast i8* %16 to i64* + %18 = load i64, i64* %17, align 8 + %19 = getelementptr inbounds i8, i8* %9, i64 4006 + %20 = bitcast i8* %19 to i64* + %21 = load i64, i64* %20, align 8 + %22 = mul i64 %15, %12 + %23 = mul i64 %22, %18 + %24 = mul i64 %23, %21 + %25 = add i64 %24, %7 + %26 = add nuw i64 %6, 1 + %27 = icmp ult i64 %26, %3 + br i1 %27, label %5, label %28 + +28: ; preds = %5, %2 + %29 = phi i64 [ 0, %2 ], [ %25, %5 ] + %30 = add i64 %29, %3 + ret i64 %30 +} + +; test_update_ds_prep_nointeract: +; unsigned long test_update_ds_prep_nointeract(char *p, int count) { +; unsigned long i=0, res=0; +; int DISP1 = 4001; +; int DISP2 = 4002; +; int DISP3 = 4003; +; int DISP4 = 4007; +; for (; i < count ; i++) { +; char x1 = *(p + i + DISP1); +; unsigned long x2 = *(unsigned long *)(p + i + DISP2); +; unsigned long x3 = *(unsigned long *)(p + i + DISP3); +; unsigned long x4 = *(unsigned long *)(p + i + DISP4); +; res += (unsigned long)x1*x2*x3*x4; +; } +; return res + count; +; } + +define i64 @test_update_ds_prep_nointeract(i8* %0, i32 signext %1) { +; CHECK-LABEL: test_update_ds_prep_nointeract: +; CHECK: addi r3, r3, 4000 +; CHECK: .LBB4_2: # +; CHECK-NEXT: lbzu r9, 1(r3) +; CHECK-NEXT: ldx r10, r3, r6 +; CHECK-NEXT: mulld r9, r10, r9 +; CHECK-NEXT: ldx r11, r3, r7 +; CHECK-NEXT: mulld r9, r9, r11 +; CHECK-NEXT: ldx r12, r3, r8 +; CHECK-NEXT: maddld r5, r9, r12, r5 +; CHECK-NEXT: bdnz .LBB4_2 + %3 = sext i32 %1 to i64 + %4 = icmp eq i32 %1, 0 + br i1 %4, label %27, label %5 + +5: ; preds = %2, %5 + %6 = phi i64 [ %25, %5 ], [ 0, %2 ] + %7 = phi i64 [ %24, %5 ], [ 0, %2 ] + %8 = getelementptr inbounds i8, i8* %0, i64 %6 + %9 = getelementptr inbounds i8, i8* %8, i64 4001 + %10 = load i8, i8* %9, align 1 + %11 = getelementptr inbounds i8, i8* %8, i64 4002 + %12 = bitcast i8* %11 to i64* + %13 = load i64, i64* 
%12, align 8 + %14 = getelementptr inbounds i8, i8* %8, i64 4003 + %15 = bitcast i8* %14 to i64* + %16 = load i64, i64* %15, align 8 + %17 = getelementptr inbounds i8, i8* %8, i64 4007 + %18 = bitcast i8* %17 to i64* + %19 = load i64, i64* %18, align 8 + %20 = zext i8 %10 to i64 + %21 = mul i64 %13, %20 + %22 = mul i64 %21, %16 + %23 = mul i64 %22, %19 + %24 = add i64 %23, %7 + %25 = add nuw i64 %6, 1 + %26 = icmp ult i64 %25, %3 + br i1 %26, label %5, label %27 + +27: ; preds = %5, %2 + %28 = phi i64 [ 0, %2 ], [ %24, %5 ] + %29 = add i64 %28, %3 + ret i64 %29 +} + +; test_ds_multiple_chains: +; unsigned long test_ds_multiple_chains(char *p, char *q, int count) { +; unsigned long i=0, res=0; +; int DISP1 = 4001; +; int DISP2 = 4010; +; int DISP3 = 4005; +; int DISP4 = 4009; +; for (; i < count ; i++) { +; unsigned long x1 = *(unsigned long *)(p + i + DISP1); +; unsigned long x2 = *(unsigned long *)(p + i + DISP2); +; unsigned long x3 = *(unsigned long *)(p + i + DISP3); +; unsigned long x4 = *(unsigned long *)(p + i + DISP4); +; unsigned long x5 = *(unsigned long *)(q + i + DISP1); +; unsigned long x6 = *(unsigned long *)(q + i + DISP2); +; unsigned long x7 = *(unsigned long *)(q + i + DISP3); +; unsigned long x8 = *(unsigned long *)(q + i + DISP4); +; res += x1*x2*x3*x4*x5*x6*x7*x8; +; } +; return res + count; +; } + +define dso_local i64 @test_ds_multiple_chains(i8* %0, i8* %1, i32 signext %2) { +; CHECK-LABEL: test_ds_multiple_chains: +; CHECK: addi r3, r3, 4010 +; CHECK: addi r4, r4, 4010 +; CHECK: .LBB5_2: # +; CHECK-NEXT: ldx r10, r3, r7 +; CHECK-NEXT: ld r11, 0(r3) +; CHECK-NEXT: mulld r10, r11, r10 +; CHECK-NEXT: ldx r11, r3, r8 +; CHECK-NEXT: mulld r10, r10, r11 +; CHECK-NEXT: ldx r12, r3, r9 +; CHECK-NEXT: addi r3, r3, 1 +; CHECK-NEXT: mulld r10, r10, r12 +; CHECK-NEXT: ldx r0, r4, r7 +; CHECK-NEXT: mulld r10, r10, r0 +; CHECK-NEXT: ld r30, 0(r4) +; CHECK-NEXT: mulld r10, r10, r30 +; CHECK-NEXT: ldx r29, r4, r8 +; CHECK-NEXT: mulld r10, r10, r29 +; 
CHECK-NEXT: ldx r28, r4, r9 +; CHECK-NEXT: addi r4, r4, 1 +; CHECK-NEXT: maddld r6, r10, r28, r6 +; CHECK-NEXT: bdnz .LBB5_2 + %4 = sext i32 %2 to i64 + %5 = icmp eq i32 %2, 0 + br i1 %5, label %45, label %6 + +6: ; preds = %3, %6 + %7 = phi i64 [ %43, %6 ], [ 0, %3 ] + %8 = phi i64 [ %42, %6 ], [ 0, %3 ] + %9 = getelementptr inbounds i8, i8* %0, i64 %7 + %10 = getelementptr inbounds i8, i8* %9, i64 4001 + %11 = bitcast i8* %10 to i64* + %12 = load i64, i64* %11, align 8 + %13 = getelementptr inbounds i8, i8* %9, i64 4010 + %14 = bitcast i8* %13 to i64* + %15 = load i64, i64* %14, align 8 + %16 = getelementptr inbounds i8, i8* %9, i64 4005 + %17 = bitcast i8* %16 to i64* + %18 = load i64, i64* %17, align 8 + %19 = getelementptr inbounds i8, i8* %9, i64 4009 + %20 = bitcast i8* %19 to i64* + %21 = load i64, i64* %20, align 8 + %22 = getelementptr inbounds i8, i8* %1, i64 %7 + %23 = getelementptr inbounds i8, i8* %22, i64 4001 + %24 = bitcast i8* %23 to i64* + %25 = load i64, i64* %24, align 8 + %26 = getelementptr inbounds i8, i8* %22, i64 4010 + %27 = bitcast i8* %26 to i64* + %28 = load i64, i64* %27, align 8 + %29 = getelementptr inbounds i8, i8* %22, i64 4005 + %30 = bitcast i8* %29 to i64* + %31 = load i64, i64* %30, align 8 + %32 = getelementptr inbounds i8, i8* %22, i64 4009 + %33 = bitcast i8* %32 to i64* + %34 = load i64, i64* %33, align 8 + %35 = mul i64 %15, %12 + %36 = mul i64 %35, %18 + %37 = mul i64 %36, %21 + %38 = mul i64 %37, %25 + %39 = mul i64 %38, %28 + %40 = mul i64 %39, %31 + %41 = mul i64 %40, %34 + %42 = add i64 %41, %8 + %43 = add nuw i64 %7, 1 + %44 = icmp ult i64 %43, %4 + br i1 %44, label %6, label %45 + +45: ; preds = %6, %3 + %46 = phi i64 [ 0, %3 ], [ %42, %6 ] + %47 = add i64 %46, %4 + ret i64 %47 +} + +; test_ds_cross_basic_blocks: +;extern char *arr; +;unsigned long foo(char *p, int count) +;{ +; unsigned long i=0, res=0; +; int DISP1 = 4000; +; int DISP2 = 4001; +; int DISP3 = 4002; +; int DISP4 = 4003; +; int DISP5 = 4005; +; int 
DISP6 = 4009; +; unsigned long x1, x2, x3, x4, x5, x6; +; x1=x2=x3=x4=x5=x6=1; +; for (; i < count ; i++) { +; if (arr[i] % 3 == 1) { +; x1 += *(unsigned long *)(p + i + DISP1); +; x2 += *(unsigned long *)(p + i + DISP2); +; } +; else if (arr[i] % 3 == 2) { +; x3 += *(unsigned long *)(p + i + DISP3); +; x4 += *(unsigned long *)(p + i + DISP5); +; } +; else { +; x5 += *(unsigned long *)(p + i + DISP4); +; x6 += *(unsigned long *)(p + i + DISP6); +; } +; res += x1*x2*x3*x4*x5*x6; +; } +; return res; +;} + +@arr = external local_unnamed_addr global i8*, align 8 + +define i64 @test_ds_cross_basic_blocks(i8* %0, i32 signext %1) { +; CHECK-LABEL: test_ds_cross_basic_blocks: +; CHECK: addi r5, r3, 4000 +; CHECK: .LBB6_2: # +; CHECK-NEXT: ld r0, 0(r5) +; CHECK-NEXT: add r26, r0, r26 +; CHECK-NEXT: ldx r0, r5, r7 +; CHECK-NEXT: add r27, r0, r27 +; CHECK-NEXT: .LBB6_3: # +; CHECK-NEXT: mulld r0, r27, r26 +; CHECK-NEXT: mulld r0, r0, r28 +; CHECK-NEXT: mulld r0, r0, r29 +; CHECK-NEXT: mulld r0, r0, r30 +; CHECK-NEXT: maddld r3, r0, r12, r3 +; CHECK-NEXT: addi r5, r5, 1 +; CHECK-NEXT: bdz .LBB6_9 +; CHECK-NEXT: .LBB6_4: # +; CHECK-NEXT: lbzu r0, 1(r6) +; CHECK-NEXT: clrldi r25, r0, 32 +; CHECK-NEXT: mulld r25, r25, r4 +; CHECK-NEXT: rldicl r25, r25, 31, 33 +; CHECK-NEXT: slwi r24, r25, 1 +; CHECK-NEXT: add r25, r25, r24 +; CHECK-NEXT: subf r0, r25, r0 +; CHECK-NEXT: cmplwi r0, 1 +; CHECK-NEXT: beq cr0, .LBB6_2 +; CHECK-NEXT: # %bb.5: # +; CHECK-NEXT: clrlwi r0, r0, 24 +; CHECK-NEXT: cmplwi r0, 2 +; CHECK-NEXT: bne cr0, .LBB6_7 +; CHECK-NEXT: # %bb.6: # +; CHECK-NEXT: ldx r0, r5, r8 +; CHECK-NEXT: add r28, r0, r28 +; CHECK-NEXT: ldx r0, r5, r9 +; CHECK-NEXT: add r29, r0, r29 +; CHECK-NEXT: b .LBB6_3 +; CHECK-NEXT: .p2align 4 +; CHECK-NEXT: .LBB6_7: # +; CHECK-NEXT: ldx r0, r5, r10 +; CHECK-NEXT: add r30, r0, r30 +; CHECK-NEXT: ldx r0, r5, r11 +; CHECK-NEXT: add r12, r0, r12 + %3 = sext i32 %1 to i64 + %4 = icmp eq i32 %1, 0 + br i1 %4, label %66, label %5 + +5: ; preds = %2 
+ %6 = load i8*, i8** @arr, align 8 + br label %7 + +7: ; preds = %5, %51 + %8 = phi i64 [ 1, %5 ], [ %57, %51 ] + %9 = phi i64 [ 1, %5 ], [ %56, %51 ] + %10 = phi i64 [ 1, %5 ], [ %55, %51 ] + %11 = phi i64 [ 1, %5 ], [ %54, %51 ] + %12 = phi i64 [ 1, %5 ], [ %53, %51 ] + %13 = phi i64 [ 1, %5 ], [ %52, %51 ] + %14 = phi i64 [ 0, %5 ], [ %64, %51 ] + %15 = phi i64 [ 0, %5 ], [ %63, %51 ] + %16 = getelementptr inbounds i8, i8* %6, i64 %14 + %17 = load i8, i8* %16, align 1 + %18 = urem i8 %17, 3 + %19 = icmp eq i8 %18, 1 + br i1 %19, label %20, label %30 + +20: ; preds = %7 + %21 = getelementptr inbounds i8, i8* %0, i64 %14 + %22 = getelementptr inbounds i8, i8* %21, i64 4000 + %23 = bitcast i8* %22 to i64* + %24 = load i64, i64* %23, align 8 + %25 = add i64 %24, %13 + %26 = getelementptr inbounds i8, i8* %21, i64 4001 + %27 = bitcast i8* %26 to i64* + %28 = load i64, i64* %27, align 8 + %29 = add i64 %28, %12 + br label %51 + +30: ; preds = %7 + %31 = icmp eq i8 %18, 2 + %32 = getelementptr inbounds i8, i8* %0, i64 %14 + br i1 %31, label %33, label %42 + +33: ; preds = %30 + %34 = getelementptr inbounds i8, i8* %32, i64 4002 + %35 = bitcast i8* %34 to i64* + %36 = load i64, i64* %35, align 8 + %37 = add i64 %36, %11 + %38 = getelementptr inbounds i8, i8* %32, i64 4005 + %39 = bitcast i8* %38 to i64* + %40 = load i64, i64* %39, align 8 + %41 = add i64 %40, %10 + br label %51 + +42: ; preds = %30 + %43 = getelementptr inbounds i8, i8* %32, i64 4003 + %44 = bitcast i8* %43 to i64* + %45 = load i64, i64* %44, align 8 + %46 = add i64 %45, %9 + %47 = getelementptr inbounds i8, i8* %32, i64 4009 + %48 = bitcast i8* %47 to i64* + %49 = load i64, i64* %48, align 8 + %50 = add i64 %49, %8 + br label %51 + +51: ; preds = %33, %42, %20 + %52 = phi i64 [ %25, %20 ], [ %13, %33 ], [ %13, %42 ] + %53 = phi i64 [ %29, %20 ], [ %12, %33 ], [ %12, %42 ] + %54 = phi i64 [ %11, %20 ], [ %37, %33 ], [ %11, %42 ] + %55 = phi i64 [ %10, %20 ], [ %41, %33 ], [ %10, %42 ] + %56 = phi i64 [ 
%9, %20 ], [ %9, %33 ], [ %46, %42 ] + %57 = phi i64 [ %8, %20 ], [ %8, %33 ], [ %50, %42 ] + %58 = mul i64 %53, %52 + %59 = mul i64 %58, %54 + %60 = mul i64 %59, %55 + %61 = mul i64 %60, %56 + %62 = mul i64 %61, %57 + %63 = add i64 %62, %15 + %64 = add nuw i64 %14, 1 + %65 = icmp ult i64 %64, %3 + br i1 %65, label %7, label %66 + +66: ; preds = %51, %2 + %67 = phi i64 [ 0, %2 ], [ %63, %51 ] + ret i64 %67 +} + +; test_ds_float: +;float test_ds_float(char *p, int count) { +; int i=0 ; +; float res=0; +; int DISP1 = 4001; +; int DISP2 = 4002; +; int DISP3 = 4022; +; int DISP4 = 4062; +; for (; i < count ; i++) { +; float x1 = *(float *)(p + i + DISP1); +; float x2 = *(float *)(p + i + DISP2); +; float x3 = *(float *)(p + i + DISP3); +; float x4 = *(float *)(p + i + DISP4); +; res += x1*x2*x3*x4; +; } +; return res; +;} + +define float @test_ds_float(i8* %0, i32 signext %1) { +; CHECK-LABEL: test_ds_float: +; CHECK: addi r3, r3, 4000 +; CHECK: .LBB7_2: # +; CHECK-NEXT: lfsu f0, 1(r3) +; CHECK-NEXT: lfsx f2, r3, r4 +; CHECK-NEXT: lfsx f3, r3, r5 +; CHECK-NEXT: xsmulsp f0, f0, f2 +; CHECK-NEXT: lfsx f4, r3, r6 +; CHECK-NEXT: xsmulsp f0, f0, f3 +; CHECK-NEXT: xsmulsp f0, f0, f4 +; CHECK-NEXT: xsaddsp f1, f1, f0 +; CHECK-NEXT: bdnz .LBB7_2 + %3 = icmp sgt i32 %1, 0 + br i1 %3, label %4, label %28 + +4: ; preds = %2 + %5 = zext i32 %1 to i64 + br label %6 + +6: ; preds = %6, %4 + %7 = phi i64 [ 0, %4 ], [ %26, %6 ] + %8 = phi float [ 0.000000e+00, %4 ], [ %25, %6 ] + %9 = getelementptr inbounds i8, i8* %0, i64 %7 + %10 = getelementptr inbounds i8, i8* %9, i64 4001 + %11 = bitcast i8* %10 to float* + %12 = load float, float* %11, align 4 + %13 = getelementptr inbounds i8, i8* %9, i64 4002 + %14 = bitcast i8* %13 to float* + %15 = load float, float* %14, align 4 + %16 = getelementptr inbounds i8, i8* %9, i64 4022 + %17 = bitcast i8* %16 to float* + %18 = load float, float* %17, align 4 + %19 = getelementptr inbounds i8, i8* %9, i64 4062 + %20 = bitcast i8* %19 to float* + 
%21 = load float, float* %20, align 4 + %22 = fmul float %12, %15 + %23 = fmul float %22, %18 + %24 = fmul float %23, %21 + %25 = fadd float %8, %24 + %26 = add nuw nsw i64 %7, 1 + %27 = icmp eq i64 %26, %5 + br i1 %27, label %28, label %6 + +28: ; preds = %6, %2 + %29 = phi float [ 0.000000e+00, %2 ], [ %25, %6 ] + ret float %29 +} + +; test_ds_combine_float_int: +;float test_ds_combine_float_int(char *p, int count) { +; int i=0 ; +; float res=0; +; int DISP1 = 4001; +; int DISP2 = 4002; +; int DISP3 = 4022; +; int DISP4 = 4062; +; for (; i < count ; i++) { +; float x1 = *(float *)(p + i + DISP1); +; unsigned long x2 = *(unsigned long*)(p + i + DISP2); +; float x3 = *(float *)(p + i + DISP3); +; float x4 = *(float *)(p + i + DISP4); +; res += x1*x2*x3*x4; +; } +; return res; +;} + +define float @test_ds_combine_float_int(i8* %0, i32 signext %1) { +; CHECK-LABEL: test_ds_combine_float_int: +; CHECK: addi r4, r3, 4001 +; CHECK: addi r3, r3, 4000 +; CHECK: .LBB8_2: # +; CHECK-NEXT: lfdu f4, 1(r4) +; CHECK-NEXT: lfsu f0, 1(r3) +; CHECK-NEXT: xscvuxdsp f4, f4 +; CHECK-NEXT: lfsx f2, r3, r5 +; CHECK-NEXT: lfsx f3, r3, r6 +; CHECK-NEXT: xsmulsp f0, f0, f4 +; CHECK-NEXT: xsmulsp f0, f2, f0 +; CHECK-NEXT: xsmulsp f0, f3, f0 +; CHECK-NEXT: xsaddsp f1, f1, f0 +; CHECK-NEXT: bdnz .LBB8_2 + %3 = icmp sgt i32 %1, 0 + br i1 %3, label %4, label %29 + +4: ; preds = %2 + %5 = zext i32 %1 to i64 + br label %6 + +6: ; preds = %6, %4 + %7 = phi i64 [ 0, %4 ], [ %27, %6 ] + %8 = phi float [ 0.000000e+00, %4 ], [ %26, %6 ] + %9 = getelementptr inbounds i8, i8* %0, i64 %7 + %10 = getelementptr inbounds i8, i8* %9, i64 4001 + %11 = bitcast i8* %10 to float* + %12 = load float, float* %11, align 4 + %13 = getelementptr inbounds i8, i8* %9, i64 4002 + %14 = bitcast i8* %13 to i64* + %15 = load i64, i64* %14, align 8 + %16 = getelementptr inbounds i8, i8* %9, i64 4022 + %17 = bitcast i8* %16 to float* + %18 = load float, float* %17, align 4 + %19 = getelementptr inbounds i8, i8* %9, i64 4062 
+ %20 = bitcast i8* %19 to float* + %21 = load float, float* %20, align 4 + %22 = uitofp i64 %15 to float + %23 = fmul float %12, %22 + %24 = fmul float %18, %23 + %25 = fmul float %21, %24 + %26 = fadd float %8, %25 + %27 = add nuw nsw i64 %7, 1 + %28 = icmp eq i64 %27, %5 + br i1 %28, label %29, label %6 + +29: ; preds = %6, %2 + %30 = phi float [ 0.000000e+00, %2 ], [ %26, %6 ] + ret float %30 +} From llvm-commits at lists.llvm.org Wed Oct 9 20:03:52 2019 From: llvm-commits at lists.llvm.org (Rui Ueyama via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 03:03:52 +0000 (UTC) Subject: [PATCH] D68751: [lld][WebAssembly] Where possible handle signature mismatches via an adaptor function In-Reply-To: References: Message-ID: ruiu added a comment. I'm not sure if there's a standard for the wasm object files and how the linker works, but if there's any, could you add a pointer to the document to the comment? ================ Comment at: lld/wasm/SymbolTable.cpp:592 +// types don't match the adaptor creation will tail. +bool SymbolTable::replaceWithAdaptorFunction(FunctionSymbol *sym, + FunctionSymbol *target) { ---------------- This function seems to enables to call an external function with a less number of arguments. I wonder what is the use case of this -- as long as you have a correct header file for functions, compilers can tell users that the number of parameters doesn't match. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68751/new/ https://reviews.llvm.org/D68751 From llvm-commits at lists.llvm.org Wed Oct 9 20:12:58 2019 From: llvm-commits at lists.llvm.org (Pengfei Wang via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 03:12:58 +0000 (UTC) Subject: [PATCH] D68757: [X86] Add strict fp support for instructions fadd/fsub/fmul/fdiv Message-ID: pengfei created this revision. pengfei added reviewers: craig.topper, RKSimon, andrew.w.kaylor, uweigand, kpn, spatel, cameron.mcinally. Herald added a project: LLVM. 
This is a follow-up to D68686. It adds strict node support for the fadd/fsub/fmul/fdiv instructions. Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D68757 Files: llvm/lib/Target/X86/X86ISelLowering.cpp llvm/lib/Target/X86/X86InstrAVX512.td llvm/lib/Target/X86/X86InstrSSE.td llvm/test/CodeGen/X86/vec-strict-128.ll llvm/test/CodeGen/X86/vec-strict-256.ll llvm/test/CodeGen/X86/vec-strict-512.ll llvm/test/CodeGen/X86/vector-constrained-fp-intrinsics.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D68757.224238.patch Type: text/x-patch Size: 35052 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 20:23:06 2019 From: llvm-commits at lists.llvm.org (Sam Clegg via llvm-commits) Date: Thu, 10 Oct 2019 03:23:06 -0000 Subject: [lld] r374275 - [lld][WebAssembly] Refactor markLive.cpp. NFC Message-ID: <20191010032306.7D8338DF6A@lists.llvm.org> Author: sbc Date: Wed Oct 9 20:23:06 2019 New Revision: 374275 URL: http://llvm.org/viewvc/llvm-project?rev=374275&view=rev Log: [lld][WebAssembly] Refactor markLive.cpp. NFC This pattern matches the ELF implementation and is also useful as part of a planned change where running `mark` more than once is needed. 
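[Editor's sketch, not part of the commit] The log above describes moving the liveness marking into a class so that `mark` can be run more than once. A minimal, self-contained illustration of that worklist pattern might look like the following. `Sym`, `Chunk`, and `Marker` are hypothetical stand-ins, not lld's `Symbol`, `InputChunk`, or `MarkLive`, and the real code follows wasm relocations rather than a plain pointer list.

```cpp
#include <string>
#include <utility>
#include <vector>

struct Chunk;

// Hypothetical stand-in for a linker symbol.
struct Sym {
  explicit Sym(std::string n) : name(std::move(n)) {}
  std::string name;
  bool live = false;
  Chunk *chunk = nullptr; // chunk that defines this symbol, if any
};

// Hypothetical stand-in for an input chunk.
struct Chunk {
  std::vector<Sym *> relocTargets; // symbols referenced by this chunk
};

class Marker {
public:
  // Mark a symbol live and queue its chunk for a later visit.
  // Null and already-live symbols are ignored, so cycles terminate.
  void enqueue(Sym *sym) {
    if (!sym || sym->live)
      return;
    sym->live = true;
    if (sym->chunk)
      queue.push_back(sym->chunk);
  }

  // Drain the worklist. Because the queue is a member, mark() can be
  // called again after more roots have been enqueued.
  void mark() {
    while (!queue.empty()) {
      Chunk *c = queue.back();
      queue.pop_back();
      for (Sym *target : c->relocTargets)
        enqueue(target);
    }
  }

private:
  std::vector<Chunk *> queue; // chunks still to visit
};
```

Keeping the queue as class state, rather than as a local captured by a lambda, is what lets a caller add roots and call `mark()` a second time without repeating the setup.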
Differential Revision: https://reviews.llvm.org/D68749 Modified: lld/trunk/wasm/MarkLive.cpp Modified: lld/trunk/wasm/MarkLive.cpp URL: http://llvm.org/viewvc/llvm-project/lld/trunk/wasm/MarkLive.cpp?rev=374275&r1=374274&r2=374275&view=diff ============================================================================== --- lld/trunk/wasm/MarkLive.cpp (original) +++ lld/trunk/wasm/MarkLive.cpp Wed Oct 9 20:23:06 2019 @@ -31,38 +31,52 @@ using namespace llvm; using namespace llvm::wasm; -void lld::wasm::markLive() { - if (!config->gcSections) - return; +namespace lld { +namespace wasm { - LLVM_DEBUG(dbgs() << "markLive\n"); - SmallVector q; +namespace { + +class MarkLive { +public: + void run(); + +private: + void enqueue(Symbol *sym); + void markSymbol(Symbol *sym); + void mark(); + + // A list of chunks to visit. + SmallVector queue; +}; - std::function enqueue = [&](Symbol *sym) { - if (!sym || sym->isLive()) - return; - LLVM_DEBUG(dbgs() << "markLive: " << sym->getName() << "\n"); - sym->markLive(); - if (InputChunk *chunk = sym->getChunk()) - q.push_back(chunk); - - // The ctor functions are all referenced by the synthetic callCtors - // function. However, this function does not contain relocations so we - // have to manually mark the ctors as live if callCtors itself is live. - if (sym == WasmSym::callCtors) { - if (config->isPic) - enqueue(WasmSym::applyRelocs); - for (const ObjFile *obj : symtab->objectFiles) { - const WasmLinkingData &l = obj->getWasmObj()->linkingData(); - for (const WasmInitFunc &f : l.InitFunctions) { - auto* initSym = obj->getFunctionSymbol(f.Symbol); - if (!initSym->isDiscarded()) - enqueue(initSym); - } +} // namespace + +void MarkLive::enqueue(Symbol *sym) { + if (!sym || sym->isLive()) + return; + LLVM_DEBUG(dbgs() << "markLive: " << sym->getName() << "\n"); + sym->markLive(); + if (InputChunk *chunk = sym->getChunk()) + queue.push_back(chunk); + + // The ctor functions are all referenced by the synthetic callCtors + // function. 
However, this function does not contain relocations so we + // have to manually mark the ctors as live if callCtors itself is live. + if (sym == WasmSym::callCtors) { + if (config->isPic) + enqueue(WasmSym::applyRelocs); + for (const ObjFile *obj : symtab->objectFiles) { + const WasmLinkingData &l = obj->getWasmObj()->linkingData(); + for (const WasmInitFunc &f : l.InitFunctions) { + auto* initSym = obj->getFunctionSymbol(f.Symbol); + if (!initSym->isDiscarded()) + enqueue(initSym); } } - }; + } +} +void MarkLive::run() { // Add GC root symbols. if (!config->entry.empty()) enqueue(symtab->find(config->entry)); @@ -87,9 +101,13 @@ void lld::wasm::markLive() { if (config->sharedMemory && !config->shared) enqueue(WasmSym::initMemory); + mark(); +} + +void MarkLive::mark() { // Follow relocations to mark all reachable chunks. - while (!q.empty()) { - InputChunk *c = q.pop_back_val(); + while (!queue.empty()) { + InputChunk *c = queue.pop_back_val(); for (const WasmRelocation reloc : c->getRelocations()) { if (reloc.Type == R_WASM_TYPE_INDEX_LEB) @@ -113,6 +131,16 @@ void lld::wasm::markLive() { enqueue(sym); } } +} + +void markLive() { + if (!config->gcSections) + return; + + LLVM_DEBUG(dbgs() << "markLive\n"); + + MarkLive marker; + marker.run(); // Report garbage-collected sections. if (config->printGcSections) { @@ -138,3 +166,6 @@ void lld::wasm::markLive() { message("removing unused section " + toString(g)); } } + +} // namespace wasm +} // namespace lld From llvm-commits at lists.llvm.org Wed Oct 9 20:22:10 2019 From: llvm-commits at lists.llvm.org (Sam Clegg via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 03:22:10 +0000 (UTC) Subject: [PATCH] D68751: [lld][WebAssembly] Where possible handle signature mismatches via an adaptor function In-Reply-To: References: Message-ID: sbc100 marked an inline comment as done. sbc100 added a comment. 
The primary documentation we have is at: https://github.com/WebAssembly/tool-conventions/blob/master/Linking.md ================ Comment at: lld/wasm/SymbolTable.cpp:592 +// types don't match the adaptor creation will tail. +bool SymbolTable::replaceWithAdaptorFunction(FunctionSymbol *sym, + FunctionSymbol *target) { ---------------- ruiu wrote: > This function seems to enables to call an external function with a less number of arguments. I wonder what is the use case of this -- as long as you have a correct header file for functions, compilers can tell users that the number of parameters doesn't match. Yes, normally this shouldn't happen, but sadly there are some cases in C when it does. The motivating case here is crt1.c calling the main function from _start. We want to be able to support both the 3-argument and 2-argument forms. With native linking you can simply link these two together and it will kind of "just work". With wasm, before this change we end up generating a linker warning and _start ends up calling a stub function, so the program doesn't work. This change fixes that use case. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68751/new/ https://reviews.llvm.org/D68751 From llvm-commits at lists.llvm.org Wed Oct 9 20:22:12 2019 From: llvm-commits at lists.llvm.org (Craig Topper via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 03:22:12 +0000 (UTC) Subject: [PATCH] D68757: [X86] Add strict fp support for instructions fadd/fsub/fmul/fdiv In-Reply-To: References: Message-ID: craig.topper added inline comments. 
================ Comment at: llvm/lib/Target/X86/X86InstrAVX512.td:5394 } -defm VADD : avx512_binop_s_round<0x58, "vadd", fadd, X86fadds, X86faddRnds, +defm VADD : avx512_binop_s_round<0x58, "vadd", any_fadd, X86fadds, X86faddRnds, SchedWriteFAddSizes, 1>, SIMD_EXC; ---------------- These are scalar instructions, but you didn't make f32/f64 Legal in X86ISelLowering.cpp Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68757/new/ https://reviews.llvm.org/D68757 From llvm-commits at lists.llvm.org Wed Oct 9 20:22:12 2019 From: llvm-commits at lists.llvm.org (ChenZheng via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 03:22:12 +0000 (UTC) Subject: [PATCH] D67088: [PowerPC] extend PPCPreIncPrep Pass for ds/dq form In-Reply-To: References: Message-ID: shchenz updated this revision to Diff 224239. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67088/new/ https://reviews.llvm.org/D67088 Files: llvm/lib/Target/PowerPC/PPCLoopPreIncPrep.cpp llvm/test/CodeGen/PowerPC/loop-instr-form-prepare.ll llvm/test/CodeGen/PowerPC/swaps-le-1.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D67088.224239.patch Type: text/x-patch Size: 32433 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 20:22:23 2019 From: llvm-commits at lists.llvm.org (Sam Clegg via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 03:22:23 +0000 (UTC) Subject: [PATCH] D68749: [lld][WebAssembly] Refactor markLive.cpp. NFC In-Reply-To: References: Message-ID: <146b7caf5d932928aeafe61650429f52@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rGad2e12a3d996: [lld][WebAssembly] Refactor markLive.cpp. NFC (authored by sbc100). 
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68749/new/ https://reviews.llvm.org/D68749 Files: lld/wasm/MarkLive.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D68749.224240.patch Type: text/x-patch Size: 3461 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 20:49:31 2019 From: llvm-commits at lists.llvm.org (Rui Ueyama via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 03:49:31 +0000 (UTC) Subject: [PATCH] D68751: [lld][WebAssembly] Where possible handle signature mismatches via an adaptor function In-Reply-To: References: Message-ID: <1b618889b1957bbb20184a6ff27b8dda@localhost.localdomain> ruiu added inline comments. ================ Comment at: lld/wasm/SymbolTable.cpp:592 +// types don't match the adaptor creation will tail. +bool SymbolTable::replaceWithAdaptorFunction(FunctionSymbol *sym, + FunctionSymbol *target) { ---------------- sbc100 wrote: > ruiu wrote: > > This function seems to enables to call an external function with a less number of arguments. I wonder what is the use case of this -- as long as you have a correct header file for functions, compilers can tell users that the number of parameters doesn't match. > Yes, normally this shouldn't happen, but sadly there are some cases in C when it does. > > The motivating case here is crt1.c calling the main function from _start. We want to be able to support both the 3-argument and 2-argument forms. With native linking you can simply link these two together and it will kind of "just work". With wasm, before this change we end up generating a linker warning and _start ends up calling a stub function, so the program doesn't work. > > This change fixes that use case. I'm curious if it makes sense to limit this functionality only to "main" if there are no other use cases, so that we don't accidentally link a wrong program as if it were a correct one. What do you think? 
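[Editor's sketch, not part of the thread] The motivating case quoted above, a startup routine calling `main` with a different number of arguments than the program defines, can be pictured at source level as follows. This is only an analogy. It assumes the caller passes three arguments while the program defines the two-argument form, and every name here (`user_main`, `main_adaptor`, `start_like`) is hypothetical. The adaptor discussed in D68751 would be synthesized by the linker for wasm functions, and its details are not shown in this thread.

```cpp
// Hypothetical program entry point using the two-argument form of main.
int user_main(int argc, char **argv) {
  (void)argv;
  return argc;
}

// Hypothetical adaptor: exposes the three-argument signature the caller
// expects, drops the extra argument, and forwards to the real function.
int main_adaptor(int argc, char **argv, char **envp) {
  (void)envp;
  return user_main(argc, argv);
}

// Hypothetical stand-in for a crt1-style _start that always calls the
// entry point with three arguments.
int start_like(int (*entry)(int, char **, char **)) {
  static char arg0[] = "prog";
  char *argv[] = {arg0, nullptr};
  char *envp[] = {nullptr};
  return entry(1, argv, envp);
}
```

An adaptor like this only makes sense when dropping or defaulting the extra arguments is harmless, which appears to be the concern behind the question above about restricting the behaviour to "main".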
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68751/new/ https://reviews.llvm.org/D68751 From llvm-commits at lists.llvm.org Wed Oct 9 20:49:31 2019 From: llvm-commits at lists.llvm.org (Joseph Tremoulet via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 03:49:31 +0000 (UTC) Subject: [PATCH] D68656: Add ExceptionStream to llvm::Object::minidump In-Reply-To: References: Message-ID: <1241ee8f917c4ea3617b17afb0631129@localhost.localdomain> JosephTremoulet added a comment. In D68656#1701658 , @labath wrote: > Thanks for taking your time to do this. I have one question: It looks like you're not using the exception code enum in the follow-up patch. I think that's completely reasonable given that the enum values are overloaded and system-dependent. But given this fact, and the fact that I am not convinced the enum values are completely right (e.g. the linux signal numbers depend also on the architecture -- though this may not manifest itself on the architectures that breakpad supports right now), what would you say to just dropping that enumeration? Yeah, I went back and forth myself on whether to include those. The thing is that I need to replace the definition of `MinidumpException::DumpRequested` with something in these types. When I did some research, I found that the breakpad constant in question is actually Linux-specific, and that there are similar constants defined for Mac and Windows. So I went the route of pulling in the big list of constants for each OS group. It's easy to miss, but in D68658 I actually do reference `Linux_DumpRequested` in code and in a comment make a reference `Mac_Simulated` and `Windows_Simulated` though not by name. Perhaps a better approach would be to define just those three, with a comment about how the field gets used more generally and where external constants come from? I'll push an update that goes that route, please let me know what you think. 
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68656/new/ https://reviews.llvm.org/D68656 From llvm-commits at lists.llvm.org Wed Oct 9 20:49:31 2019 From: llvm-commits at lists.llvm.org (Joseph Tremoulet via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 03:49:31 +0000 (UTC) Subject: [PATCH] D68656: Add ExceptionStream to llvm::Object::minidump In-Reply-To: References: Message-ID: JosephTremoulet updated this revision to Diff 224242. JosephTremoulet added a comment. - Remove the os-defined exception code enum values Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68656/new/ https://reviews.llvm.org/D68656 Files: llvm/include/llvm/BinaryFormat/Minidump.h llvm/include/llvm/Object/Minidump.h llvm/unittests/Object/MinidumpTest.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D68656.224242.patch Type: text/x-patch Size: 6268 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 21:24:44 2019 From: llvm-commits at lists.llvm.org (Cyndy Ishida via llvm-commits) Date: Thu, 10 Oct 2019 04:24:44 -0000 Subject: [llvm] r374277 - Reland "[TextAPI] Introduce TBDv4" Message-ID: <20191010042444.BF70F85E81@lists.llvm.org> Author: cishida Date: Wed Oct 9 21:24:44 2019 New Revision: 374277 URL: http://llvm.org/viewvc/llvm-project?rev=374277&view=rev Log: Reland "[TextAPI] Introduce TBDv4" Original Patch broke for compilations w/ gcc and exposed asan fail. This reland repairs those bugs. 
Differential Revision: https://reviews.llvm.org/D67529 Added: llvm/trunk/unittests/TextAPI/TextStubV4Tests.cpp Modified: llvm/trunk/include/llvm/TextAPI/MachO/InterfaceFile.h llvm/trunk/include/llvm/TextAPI/MachO/Symbol.h llvm/trunk/include/llvm/TextAPI/MachO/Target.h llvm/trunk/lib/TextAPI/MachO/Target.cpp llvm/trunk/lib/TextAPI/MachO/TextStub.cpp llvm/trunk/lib/TextAPI/MachO/TextStubCommon.cpp llvm/trunk/unittests/TextAPI/CMakeLists.txt Modified: llvm/trunk/include/llvm/TextAPI/MachO/InterfaceFile.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/TextAPI/MachO/InterfaceFile.h?rev=374277&r1=374276&r2=374277&view=diff ============================================================================== --- llvm/trunk/include/llvm/TextAPI/MachO/InterfaceFile.h (original) +++ llvm/trunk/include/llvm/TextAPI/MachO/InterfaceFile.h Wed Oct 9 21:24:44 2019 @@ -67,6 +67,9 @@ enum FileType : unsigned { /// Text-based stub file (.tbd) version 3.0 TBD_V3 = 1U << 2, + /// Text-based stub file (.tbd) version 4.0 + TBD_V4 = 1U << 3, + All = ~0U, LLVM_MARK_AS_BITMASK_ENUM(/*LargestValue=*/All), Modified: llvm/trunk/include/llvm/TextAPI/MachO/Symbol.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/TextAPI/MachO/Symbol.h?rev=374277&r1=374276&r2=374277&view=diff ============================================================================== --- llvm/trunk/include/llvm/TextAPI/MachO/Symbol.h (original) +++ llvm/trunk/include/llvm/TextAPI/MachO/Symbol.h Wed Oct 9 21:24:44 2019 @@ -38,7 +38,10 @@ enum class SymbolFlags : uint8_t { /// Undefined Undefined = 1U << 3, - LLVM_MARK_AS_BITMASK_ENUM(/*LargestValue=*/Undefined), + /// Rexported + Rexported = 1U << 4, + + LLVM_MARK_AS_BITMASK_ENUM(/*LargestValue=*/Rexported), }; // clang-format on @@ -50,7 +53,7 @@ enum class SymbolKind : uint8_t { ObjectiveCInstanceVariable, }; -using TargetList = SmallVector; +using TargetList = SmallVector; class Symbol { public: Symbol(SymbolKind Kind, StringRef Name, TargetList 
Targets, SymbolFlags Flags) @@ -81,6 +84,10 @@ public: return (Flags & SymbolFlags::Undefined) == SymbolFlags::Undefined; } + bool isReexported() const { + return (Flags & SymbolFlags::Rexported) == SymbolFlags::Rexported; + } + using const_target_iterator = TargetList::const_iterator; using const_target_range = llvm::iterator_range; const_target_range targets() const { return {Targets}; } Modified: llvm/trunk/include/llvm/TextAPI/MachO/Target.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/TextAPI/MachO/Target.h?rev=374277&r1=374276&r2=374277&view=diff ============================================================================== --- llvm/trunk/include/llvm/TextAPI/MachO/Target.h (original) +++ llvm/trunk/include/llvm/TextAPI/MachO/Target.h Wed Oct 9 21:24:44 2019 @@ -29,6 +29,8 @@ public: explicit Target(const llvm::Triple &Triple) : Arch(mapToArchitecture(Triple)), Platform(mapToPlatformKind(Triple)) {} + static llvm::Expected create(StringRef Target); + operator std::string() const; Architecture Arch; Modified: llvm/trunk/lib/TextAPI/MachO/Target.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/TextAPI/MachO/Target.cpp?rev=374277&r1=374276&r2=374277&view=diff ============================================================================== --- llvm/trunk/lib/TextAPI/MachO/Target.cpp (original) +++ llvm/trunk/lib/TextAPI/MachO/Target.cpp Wed Oct 9 21:24:44 2019 @@ -17,6 +17,36 @@ namespace llvm { namespace MachO { +Expected Target::create(StringRef TargetValue) { + auto Result = TargetValue.split('-'); + auto ArchitectureStr = Result.first; + auto Architecture = getArchitectureFromName(ArchitectureStr); + auto PlatformStr = Result.second; + PlatformKind Platform; + Platform = StringSwitch(PlatformStr) + .Case("macos", PlatformKind::macOS) + .Case("ios", PlatformKind::iOS) + .Case("tvos", PlatformKind::tvOS) + .Case("watchos", PlatformKind::watchOS) + .Case("bridgeos", PlatformKind::bridgeOS) + .Case("maccatalyst", 
PlatformKind::macCatalyst) + .Case("ios-simulator", PlatformKind::iOSSimulator) + .Case("tvos-simulator", PlatformKind::tvOSSimulator) + .Case("watchos-simulator", PlatformKind::watchOSSimulator) + .Default(PlatformKind::unknown); + + if (Platform == PlatformKind::unknown) { + if (PlatformStr.startswith("<") && PlatformStr.endswith(">")) { + PlatformStr = PlatformStr.drop_front().drop_back(); + unsigned long long RawValue; + if (!PlatformStr.getAsInteger(10, RawValue)) + Platform = (PlatformKind)RawValue; + } + } + + return Target{Architecture, Platform}; +} + Target::operator std::string() const { return (getArchitectureName(Arch) + " (" + getPlatformName(Platform) + ")") .str(); @@ -42,4 +72,4 @@ ArchitectureSet mapToArchitectureSet(Arr } } // end namespace MachO. -} // end namespace llvm. \ No newline at end of file +} // end namespace llvm. Modified: llvm/trunk/lib/TextAPI/MachO/TextStub.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/TextAPI/MachO/TextStub.cpp?rev=374277&r1=374276&r2=374277&view=diff ============================================================================== --- llvm/trunk/lib/TextAPI/MachO/TextStub.cpp (original) +++ llvm/trunk/lib/TextAPI/MachO/TextStub.cpp Wed Oct 9 21:24:44 2019 @@ -147,6 +147,58 @@ Each undefineds section is defined as fo objc-ivars: [] # Optional: List of Objective C Instance Variables weak-ref-symbols: [] # Optional: List of weak defined symbols */ + +/* + + YAML Format specification. + +--- !tapi-tbd +tbd-version: 4 # The tbd version for format +targets: [ armv7-ios, x86_64-maccatalyst ] # The list of applicable tapi supported target triples +uuids: # Optional: List of target and UUID pairs. + - target: armv7-ios + value: ... + - target: x86_64-maccatalyst + value: ... 
+flags: [] # Optional: +install-name: /u/l/libfoo.dylib # +current-version: 1.2.3 # Optional: defaults to 1.0 +compatibility-version: 1.0 # Optional: defaults to 1.0 +swift-abi-version: 0 # Optional: defaults to 0 +parent-umbrella: # Optional: +allowable-clients: + - targets: [ armv7-ios ] # Optional: + clients: [ clientA ] +exports: # List of export sections +... +re-exports: # List of reexport sections +... +undefineds: # List of undefineds sections +... + +Each export and reexport section is defined as following: + +- targets: [ arm64-macos ] # The list of target triples associated with symbols + symbols: [ _symA ] # Optional: List of symbols + objc-classes: [] # Optional: List of Objective-C classes + objc-eh-types: [] # Optional: List of Objective-C classes + # with EH + objc-ivars: [] # Optional: List of Objective C Instance + # Variables + weak-symbols: [] # Optional: List of weak defined symbols + thread-local-symbols: [] # Optional: List of thread local symbols +- targets: [ arm64-macos, x86_64-maccatalyst ] # Optional: Targets for applicable additional symbols + symbols: [ _symB ] # Optional: List of symbols + +Each undefineds section is defined as following: +- targets: [ arm64-macos ] # The list of target triples associated with symbols + symbols: [ _symC ] # Optional: List of symbols + objc-classes: [] # Optional: List of Objective-C classes + objc-eh-types: [] # Optional: List of Objective-C classes + # with EH + objc-ivars: [] # Optional: List of Objective C Instance Variables + weak-symbols: [] # Optional: List of weak defined symbols +*/ // clang-format on using namespace llvm; @@ -175,6 +227,38 @@ struct UndefinedSection { std::vector WeakRefSymbols; }; +// Sections for direct target mapping in TBDv4 +struct SymbolSection { + TargetList Targets; + std::vector Symbols; + std::vector Classes; + std::vector ClassEHs; + std::vector Ivars; + std::vector WeakSymbols; + std::vector TlvSymbols; +}; + +struct MetadataSection { + enum Option { Clients, 
Libraries }; + std::vector Targets; + std::vector Values; +}; + +struct UmbrellaSection { + std::vector Targets; + std::string Umbrella; +}; + +// UUID's for TBDv4 are mapped to target not arch +struct UUIDv4 { + Target TargetID; + std::string Value; + + UUIDv4() = default; + UUIDv4(const Target &TargetID, const std::string &Value) + : TargetID(TargetID), Value(Value) {} +}; + // clang-format off enum TBDFlags : unsigned { None = 0U, @@ -189,6 +273,12 @@ enum TBDFlags : unsigned { LLVM_YAML_IS_FLOW_SEQUENCE_VECTOR(Architecture) LLVM_YAML_IS_SEQUENCE_VECTOR(ExportSection) LLVM_YAML_IS_SEQUENCE_VECTOR(UndefinedSection) +// Specific to TBDv4 +LLVM_YAML_IS_SEQUENCE_VECTOR(SymbolSection) +LLVM_YAML_IS_SEQUENCE_VECTOR(MetadataSection) +LLVM_YAML_IS_SEQUENCE_VECTOR(UmbrellaSection) +LLVM_YAML_IS_FLOW_SEQUENCE_VECTOR(Target) +LLVM_YAML_IS_SEQUENCE_VECTOR(UUIDv4) namespace llvm { namespace yaml { @@ -231,6 +321,49 @@ template <> struct MappingTraits struct MappingTraits { + static void mapping(IO &IO, SymbolSection &Section) { + IO.mapRequired("targets", Section.Targets); + IO.mapOptional("symbols", Section.Symbols); + IO.mapOptional("objc-classes", Section.Classes); + IO.mapOptional("objc-eh-types", Section.ClassEHs); + IO.mapOptional("objc-ivars", Section.Ivars); + IO.mapOptional("weak-symbols", Section.WeakSymbols); + IO.mapOptional("thread-local-symbols", Section.TlvSymbols); + } +}; + +template <> struct MappingTraits { + static void mapping(IO &IO, UmbrellaSection &Section) { + IO.mapRequired("targets", Section.Targets); + IO.mapRequired("umbrella", Section.Umbrella); + } +}; + +template <> struct MappingTraits { + static void mapping(IO &IO, UUIDv4 &UUID) { + IO.mapRequired("target", UUID.TargetID); + IO.mapRequired("value", UUID.Value); + } +}; + +template <> +struct MappingContextTraits { + static void mapping(IO &IO, MetadataSection &Section, + MetadataSection::Option &OptionKind) { + IO.mapRequired("targets", Section.Targets); + switch (OptionKind) { + case 
MetadataSection::Option::Clients: + IO.mapRequired("clients", Section.Values); + return; + case MetadataSection::Option::Libraries: + IO.mapRequired("libraries", Section.Values); + return; + } + llvm_unreachable("unexpected option for metadata"); + } +}; + template <> struct ScalarBitSetTraits { static void bitset(IO &IO, TBDFlags &Flags) { IO.bitSetCase(Flags, "flat_namespace", TBDFlags::FlatNamespace); @@ -240,6 +373,60 @@ template <> struct ScalarBitSetTraits struct ScalarTraits { + static void output(const Target &Value, void *, raw_ostream &OS) { + OS << Value.Arch << "-"; + switch (Value.Platform) { + default: + OS << "unknown"; + break; + case PlatformKind::macOS: + OS << "macos"; + break; + case PlatformKind::iOS: + OS << "ios"; + break; + case PlatformKind::tvOS: + OS << "tvos"; + break; + case PlatformKind::watchOS: + OS << "watchos"; + break; + case PlatformKind::bridgeOS: + OS << "bridgeos"; + break; + case PlatformKind::macCatalyst: + OS << "maccatalyst"; + break; + case PlatformKind::iOSSimulator: + OS << "ios-simulator"; + break; + case PlatformKind::tvOSSimulator: + OS << "tvos-simulator"; + break; + case PlatformKind::watchOSSimulator: + OS << "watchos-simulator"; + break; + } + } + + static StringRef input(StringRef Scalar, void *, Target &Value) { + auto Result = Target::create(Scalar); + if (!Result) + return toString(Result.takeError()); + + Value = *Result; + if (Value.Arch == AK_unknown) + return "unknown architecture"; + if (Value.Platform == PlatformKind::unknown) + return "unknown platform"; + + return {}; + } + + static QuotingType mustQuote(StringRef) { return QuotingType::None; } +}; + template <> struct MappingTraits { struct NormalizedTBD { explicit NormalizedTBD(IO &IO) {} @@ -555,71 +742,336 @@ template <> struct MappingTraits Undefineds; }; + static void setFileTypeForInput(TextAPIContext *Ctx, IO &IO) { + if (IO.mapTag("!tapi-tbd", false)) + Ctx->FileKind = FileType::TBD_V4; + else if (IO.mapTag("!tapi-tbd-v3", false)) + 
Ctx->FileKind = FileType::TBD_V3; + else if (IO.mapTag("!tapi-tbd-v2", false)) + Ctx->FileKind = FileType::TBD_V2; + else if (IO.mapTag("!tapi-tbd-v1", false) || + IO.mapTag("tag:yaml.org,2002:map", false)) + Ctx->FileKind = FileType::TBD_V1; + else { + Ctx->FileKind = FileType::Invalid; + return; + } + } + static void mapping(IO &IO, const InterfaceFile *&File) { auto *Ctx = reinterpret_cast(IO.getContext()); assert((!Ctx || !IO.outputting() || (Ctx && Ctx->FileKind != FileType::Invalid)) && "File type is not set in YAML context"); - MappingNormalization Keys(IO, File); - // prope file type when reading. if (!IO.outputting()) { - if (IO.mapTag("!tapi-tbd-v3", false)) - Ctx->FileKind = FileType::TBD_V3; - else if (IO.mapTag("!tapi-tbd-v2", false)) - Ctx->FileKind = FileType::TBD_V2; - else if (IO.mapTag("!tapi-tbd-v1", false) || - IO.mapTag("tag:yaml.org,2002:map", false)) - Ctx->FileKind = FileType::TBD_V1; - else { + setFileTypeForInput(Ctx, IO); + switch (Ctx->FileKind) { + default: + break; + case FileType::TBD_V4: + mapKeysToValuesV4(IO, File); + return; + case FileType::Invalid: IO.setError("unsupported file type"); return; } - } - - // Set file type when writing. - if (IO.outputting()) { + } else { + // Set file type when writing. switch (Ctx->FileKind) { default: llvm_unreachable("unexpected file type"); - case FileType::TBD_V1: - // Don't write the tag into the .tbd file for TBD v1. 
+ case FileType::TBD_V4: + mapKeysToValuesV4(IO, File); + return; + case FileType::TBD_V3: + IO.mapTag("!tapi-tbd-v3", true); break; case FileType::TBD_V2: IO.mapTag("!tapi-tbd-v2", true); break; - case FileType::TBD_V3: - IO.mapTag("!tapi-tbd-v3", true); + case FileType::TBD_V1: + // Don't write the tag into the .tbd file for TBD v1 break; } } + mapKeysToValues(Ctx->FileKind, IO, File); + } + using SectionList = std::vector; + struct NormalizedTBD_V4 { + explicit NormalizedTBD_V4(IO &IO) {} + NormalizedTBD_V4(IO &IO, const InterfaceFile *&File) { + auto Ctx = reinterpret_cast(IO.getContext()); + assert(Ctx); + TBDVersion = Ctx->FileKind >> 1; + Targets.insert(Targets.begin(), File->targets().begin(), + File->targets().end()); + for (const auto &IT : File->uuids()) + UUIDs.emplace_back(IT.first, IT.second); + InstallName = File->getInstallName(); + CurrentVersion = File->getCurrentVersion(); + CompatibilityVersion = File->getCompatibilityVersion(); + SwiftABIVersion = File->getSwiftABIVersion(); + + Flags = TBDFlags::None; + if (!File->isApplicationExtensionSafe()) + Flags |= TBDFlags::NotApplicationExtensionSafe; + + if (!File->isTwoLevelNamespace()) + Flags |= TBDFlags::FlatNamespace; + + if (File->isInstallAPI()) + Flags |= TBDFlags::InstallAPI; + + { + std::map valueToTargetList; + for (const auto &it : File->umbrellas()) + valueToTargetList[it.second].emplace_back(it.first); + + for (const auto &it : valueToTargetList) { + UmbrellaSection CurrentSection; + CurrentSection.Targets.insert(CurrentSection.Targets.begin(), + it.second.begin(), it.second.end()); + CurrentSection.Umbrella = it.first; + ParentUmbrellas.emplace_back(std::move(CurrentSection)); + } + } + + assignTargetsToLibrary(File->allowableClients(), AllowableClients); + assignTargetsToLibrary(File->reexportedLibraries(), ReexportedLibraries); + + auto handleSymbols = + [](SectionList &CurrentSections, + InterfaceFile::const_filtered_symbol_range Symbols, + std::function Pred) { + std::set TargetSet; 
+ std::map SymbolToTargetList; + for (const auto *Symbol : Symbols) { + if (!Pred(Symbol)) + continue; + TargetList Targets(Symbol->targets()); + SymbolToTargetList[Symbol] = Targets; + TargetSet.emplace(std::move(Targets)); + } + for (const auto &TargetIDs : TargetSet) { + SymbolSection CurrentSection; + CurrentSection.Targets.insert(CurrentSection.Targets.begin(), + TargetIDs.begin(), TargetIDs.end()); + + for (const auto &IT : SymbolToTargetList) { + if (IT.second != TargetIDs) + continue; + + const auto *Symbol = IT.first; + switch (Symbol->getKind()) { + case SymbolKind::GlobalSymbol: + if (Symbol->isWeakDefined()) + CurrentSection.WeakSymbols.emplace_back(Symbol->getName()); + else if (Symbol->isThreadLocalValue()) + CurrentSection.TlvSymbols.emplace_back(Symbol->getName()); + else + CurrentSection.Symbols.emplace_back(Symbol->getName()); + break; + case SymbolKind::ObjectiveCClass: + CurrentSection.Classes.emplace_back(Symbol->getName()); + break; + case SymbolKind::ObjectiveCClassEHType: + CurrentSection.ClassEHs.emplace_back(Symbol->getName()); + break; + case SymbolKind::ObjectiveCInstanceVariable: + CurrentSection.Ivars.emplace_back(Symbol->getName()); + break; + } + } + sort(CurrentSection.Symbols); + sort(CurrentSection.Classes); + sort(CurrentSection.ClassEHs); + sort(CurrentSection.Ivars); + sort(CurrentSection.WeakSymbols); + sort(CurrentSection.TlvSymbols); + CurrentSections.emplace_back(std::move(CurrentSection)); + } + }; + + handleSymbols(Exports, File->exports(), [](const Symbol *Symbol) { + return !Symbol->isReexported(); + }); + handleSymbols(Reexports, File->exports(), [](const Symbol *Symbol) { + return Symbol->isReexported(); + }); + handleSymbols(Undefineds, File->undefineds(), + [](const Symbol *Symbol) { return true; }); + } + + const InterfaceFile *denormalize(IO &IO) { + auto Ctx = reinterpret_cast(IO.getContext()); + assert(Ctx); + + auto *File = new InterfaceFile; + File->setPath(Ctx->Path); + File->setFileType(Ctx->FileKind); + for 
(auto &id : UUIDs) + File->addUUID(id.TargetID, id.Value); + File->addTargets(Targets); + File->setInstallName(InstallName); + File->setCurrentVersion(CurrentVersion); + File->setCompatibilityVersion(CompatibilityVersion); + File->setSwiftABIVersion(SwiftABIVersion); + for (const auto &CurrentSection : ParentUmbrellas) + for (const auto &target : CurrentSection.Targets) + File->addParentUmbrella(target, CurrentSection.Umbrella); + File->setTwoLevelNamespace(!(Flags & TBDFlags::FlatNamespace)); + File->setApplicationExtensionSafe( + !(Flags & TBDFlags::NotApplicationExtensionSafe)); + File->setInstallAPI(Flags & TBDFlags::InstallAPI); + + for (const auto &CurrentSection : AllowableClients) { + for (const auto &lib : CurrentSection.Values) + for (const auto &Target : CurrentSection.Targets) + File->addAllowableClient(lib, Target); + } + + for (const auto &CurrentSection : ReexportedLibraries) { + for (const auto &Lib : CurrentSection.Values) + for (const auto &Target : CurrentSection.Targets) + File->addReexportedLibrary(Lib, Target); + } + + auto handleSymbols = [File](const SectionList &CurrentSections, + SymbolFlags Flag = SymbolFlags::None) { + for (const auto &CurrentSection : CurrentSections) { + for (auto &sym : CurrentSection.Symbols) + File->addSymbol(SymbolKind::GlobalSymbol, sym, + CurrentSection.Targets, Flag); + + for (auto &sym : CurrentSection.Classes) + File->addSymbol(SymbolKind::ObjectiveCClass, sym, + CurrentSection.Targets); + + for (auto &sym : CurrentSection.ClassEHs) + File->addSymbol(SymbolKind::ObjectiveCClassEHType, sym, + CurrentSection.Targets); + + for (auto &sym : CurrentSection.Ivars) + File->addSymbol(SymbolKind::ObjectiveCInstanceVariable, sym, + CurrentSection.Targets); + + for (auto &sym : CurrentSection.WeakSymbols) + File->addSymbol(SymbolKind::GlobalSymbol, sym, + CurrentSection.Targets); + for (auto &sym : CurrentSection.TlvSymbols) + File->addSymbol(SymbolKind::GlobalSymbol, sym, + CurrentSection.Targets, + 
SymbolFlags::ThreadLocalValue); + } + }; + + handleSymbols(Exports); + handleSymbols(Reexports, SymbolFlags::Rexported); + handleSymbols(Undefineds, SymbolFlags::Undefined); + + return File; + } + + unsigned TBDVersion; + std::vector UUIDs; + TargetList Targets; + StringRef InstallName; + PackedVersion CurrentVersion; + PackedVersion CompatibilityVersion; + SwiftVersion SwiftABIVersion{0}; + std::vector AllowableClients; + std::vector ReexportedLibraries; + TBDFlags Flags{TBDFlags::None}; + std::vector ParentUmbrellas; + SectionList Exports; + SectionList Reexports; + SectionList Undefineds; + + private: + void assignTargetsToLibrary(const std::vector &Libraries, + std::vector &Section) { + std::set targetSet; + std::map valueToTargetList; + for (const auto &library : Libraries) { + TargetList targets(library.targets()); + valueToTargetList[&library] = targets; + targetSet.emplace(std::move(targets)); + } + + for (const auto &targets : targetSet) { + MetadataSection CurrentSection; + CurrentSection.Targets.insert(CurrentSection.Targets.begin(), + targets.begin(), targets.end()); + + for (const auto &it : valueToTargetList) { + if (it.second != targets) + continue; + + CurrentSection.Values.emplace_back(it.first->getInstallName()); + } + llvm::sort(CurrentSection.Values); + Section.emplace_back(std::move(CurrentSection)); + } + } + }; + + static void mapKeysToValues(FileType FileKind, IO &IO, + const InterfaceFile *&File) { + MappingNormalization Keys(IO, File); IO.mapRequired("archs", Keys->Architectures); - if (Ctx->FileKind != FileType::TBD_V1) + if (FileKind != FileType::TBD_V1) IO.mapOptional("uuids", Keys->UUIDs); IO.mapRequired("platform", Keys->Platforms); - if (Ctx->FileKind != FileType::TBD_V1) + if (FileKind != FileType::TBD_V1) IO.mapOptional("flags", Keys->Flags, TBDFlags::None); IO.mapRequired("install-name", Keys->InstallName); IO.mapOptional("current-version", Keys->CurrentVersion, PackedVersion(1, 0, 0)); IO.mapOptional("compatibility-version", 
Keys->CompatibilityVersion, PackedVersion(1, 0, 0)); - if (Ctx->FileKind != FileType::TBD_V3) + if (FileKind != FileType::TBD_V3) IO.mapOptional("swift-version", Keys->SwiftABIVersion, SwiftVersion(0)); else IO.mapOptional("swift-abi-version", Keys->SwiftABIVersion, SwiftVersion(0)); IO.mapOptional("objc-constraint", Keys->ObjCConstraint, - (Ctx->FileKind == FileType::TBD_V1) + (FileKind == FileType::TBD_V1) ? ObjCConstraintType::None : ObjCConstraintType::Retain_Release); - if (Ctx->FileKind != FileType::TBD_V1) + if (FileKind != FileType::TBD_V1) IO.mapOptional("parent-umbrella", Keys->ParentUmbrella, StringRef()); IO.mapOptional("exports", Keys->Exports); - if (Ctx->FileKind != FileType::TBD_V1) + if (FileKind != FileType::TBD_V1) IO.mapOptional("undefineds", Keys->Undefineds); } + + static void mapKeysToValuesV4(IO &IO, const InterfaceFile *&File) { + MappingNormalization Keys(IO, + File); + IO.mapTag("!tapi-tbd", true); + IO.mapRequired("tbd-version", Keys->TBDVersion); + IO.mapRequired("targets", Keys->Targets); + IO.mapOptional("uuids", Keys->UUIDs); + IO.mapOptional("flags", Keys->Flags, TBDFlags::None); + IO.mapRequired("install-name", Keys->InstallName); + IO.mapOptional("current-version", Keys->CurrentVersion, + PackedVersion(1, 0, 0)); + IO.mapOptional("compatibility-version", Keys->CompatibilityVersion, + PackedVersion(1, 0, 0)); + IO.mapOptional("swift-abi-version", Keys->SwiftABIVersion, SwiftVersion(0)); + IO.mapOptional("parent-umbrella", Keys->ParentUmbrellas); + auto OptionKind = MetadataSection::Option::Clients; + IO.mapOptionalWithContext("allowable-clients", Keys->AllowableClients, + OptionKind); + OptionKind = MetadataSection::Option::Libraries; + IO.mapOptionalWithContext("reexported-libraries", Keys->ReexportedLibraries, + OptionKind); + IO.mapOptional("exports", Keys->Exports); + IO.mapOptional("reexports", Keys->Reexports); + IO.mapOptional("undefineds", Keys->Undefineds); + } }; template <> Modified: 
llvm/trunk/lib/TextAPI/MachO/TextStubCommon.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/TextAPI/MachO/TextStubCommon.cpp?rev=374277&r1=374276&r2=374277&view=diff ============================================================================== --- llvm/trunk/lib/TextAPI/MachO/TextStubCommon.cpp (original) +++ llvm/trunk/lib/TextAPI/MachO/TextStubCommon.cpp Wed Oct 9 21:24:44 2019 @@ -172,14 +172,25 @@ void ScalarTraits::output( break; } } -StringRef ScalarTraits::input(StringRef Scalar, void *, +StringRef ScalarTraits::input(StringRef Scalar, void *IO, SwiftVersion &Value) { - Value = StringSwitch(Scalar) - .Case("1.0", 1) - .Case("1.1", 2) - .Case("2.0", 3) - .Case("3.0", 4) - .Default(0); + const auto *Ctx = reinterpret_cast(IO); + assert((!Ctx || Ctx->FileKind != FileType::Invalid) && + "File type is not set in context"); + + if (Ctx->FileKind == FileType::TBD_V4) { + if (Scalar.getAsInteger(10, Value)) + return "invalid Swift ABI version."; + return {}; + } else { + Value = StringSwitch(Scalar) + .Case("1.0", 1) + .Case("1.1", 2) + .Case("2.0", 3) + .Case("3.0", 4) + .Default(0); + } + if (Value != SwiftVersion(0)) return {}; Modified: llvm/trunk/unittests/TextAPI/CMakeLists.txt URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/unittests/TextAPI/CMakeLists.txt?rev=374277&r1=374276&r2=374277&view=diff ============================================================================== --- llvm/trunk/unittests/TextAPI/CMakeLists.txt (original) +++ llvm/trunk/unittests/TextAPI/CMakeLists.txt Wed Oct 9 21:24:44 2019 @@ -7,6 +7,7 @@ add_llvm_unittest(TextAPITests TextStubV1Tests.cpp TextStubV2Tests.cpp TextStubV3Tests.cpp + TextStubV4Tests.cpp ) target_link_libraries(TextAPITests PRIVATE LLVMTestingSupport) Added: llvm/trunk/unittests/TextAPI/TextStubV4Tests.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/unittests/TextAPI/TextStubV4Tests.cpp?rev=374277&view=auto ============================================================================== --- 
llvm/trunk/unittests/TextAPI/TextStubV4Tests.cpp (added) +++ llvm/trunk/unittests/TextAPI/TextStubV4Tests.cpp Wed Oct 9 21:24:44 2019 @@ -0,0 +1,564 @@ +//===-- TextStubV4Tests.cpp - TBD V4 File Test ----------------------------===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===-----------------------------------------------------------------------===/ +#include "llvm/TextAPI/MachO/InterfaceFile.h" +#include "llvm/TextAPI/MachO/TextAPIReader.h" +#include "llvm/TextAPI/MachO/TextAPIWriter.h" +#include "gtest/gtest.h" +#include +#include + +using namespace llvm; +using namespace llvm::MachO; + +struct ExampleSymbol { + SymbolKind Kind; + std::string Name; + bool WeakDefined; + bool ThreadLocalValue; +}; +using ExampleSymbolSeq = std::vector; +using UUIDs = std::vector>; + +inline bool operator<(const ExampleSymbol &LHS, const ExampleSymbol &RHS) { + return std::tie(LHS.Kind, LHS.Name) < std::tie(RHS.Kind, RHS.Name); +} + +inline bool operator==(const ExampleSymbol &LHS, const ExampleSymbol &RHS) { + return std::tie(LHS.Kind, LHS.Name, LHS.WeakDefined, LHS.ThreadLocalValue) == + std::tie(RHS.Kind, RHS.Name, RHS.WeakDefined, RHS.ThreadLocalValue); +} + +static ExampleSymbol TBDv4ExportedSymbols[] = { + {SymbolKind::GlobalSymbol, "_symA", false, false}, + {SymbolKind::GlobalSymbol, "_symAB", false, false}, + {SymbolKind::GlobalSymbol, "_symB", false, false}, +}; + +static ExampleSymbol TBDv4ReexportedSymbols[] = { + {SymbolKind::GlobalSymbol, "_symC", false, false}, +}; + +static ExampleSymbol TBDv4UndefinedSymbols[] = { + {SymbolKind::GlobalSymbol, "_symD", false, false}, +}; + +namespace TBDv4 { + +TEST(TBDv4, ReadFile) { + static const char tbd_v4_file[] = + "--- !tapi-tbd\n" + "tbd-version: 4\n" + "targets: [ i386-macos, x86_64-macos, x86_64-ios ]\n" + "uuids:\n" + " - target: i386-macos\n" + " 
value: 00000000-0000-0000-0000-000000000000\n" + " - target: x86_64-macos\n" + " value: 11111111-1111-1111-1111-111111111111\n" + " - target: x86_64-ios\n" + " value: 11111111-1111-1111-1111-111111111111\n" + "flags: [ flat_namespace, installapi ]\n" + "install-name: Umbrella.framework/Umbrella\n" + "current-version: 1.2.3\n" + "compatibility-version: 1.2\n" + "swift-abi-version: 5\n" + "parent-umbrella:\n" + " - targets: [ i386-macos, x86_64-macos, x86_64-ios ]\n" + " umbrella: System\n" + "allowable-clients:\n" + " - targets: [ i386-macos, x86_64-macos, x86_64-ios ]\n" + " clients: [ ClientA ]\n" + "reexported-libraries:\n" + " - targets: [ i386-macos ]\n" + " libraries: [ /System/Library/Frameworks/A.framework/A ]\n" + "exports:\n" + " - targets: [ i386-macos ]\n" + " symbols: [ _symA ]\n" + " objc-classes: []\n" + " objc-eh-types: []\n" + " objc-ivars: []\n" + " weak-symbols: []\n" + " thread-local-symbols: []\n" + " - targets: [ x86_64-ios ]\n" + " symbols: [_symB]\n" + " - targets: [ x86_64-macos, x86_64-ios ]\n" + " symbols: [_symAB]\n" + "reexports:\n" + " - targets: [ i386-macos ]\n" + " symbols: [_symC]\n" + " objc-classes: []\n" + " objc-eh-types: []\n" + " objc-ivars: []\n" + " weak-symbols: []\n" + " thread-local-symbols: []\n" + "undefineds:\n" + " - targets: [ i386-macos ]\n" + " symbols: [ _symD ]\n" + " objc-classes: []\n" + " objc-eh-types: []\n" + " objc-ivars: []\n" + " weak-symbols: []\n" + " thread-local-symbols: []\n" + "...\n"; + + auto Result = TextAPIReader::get(MemoryBufferRef(tbd_v4_file, "Test.tbd")); + EXPECT_TRUE(!!Result); + auto File = std::move(Result.get()); + EXPECT_EQ(FileType::TBD_V4, File->getFileType()); + PlatformSet Platforms; + Platforms.insert(PlatformKind::macOS); + Platforms.insert(PlatformKind::iOS); + auto Archs = AK_i386 | AK_x86_64; + TargetList Targets = { + Target(AK_i386, PlatformKind::macOS), + Target(AK_x86_64, PlatformKind::macOS), + Target(AK_x86_64, PlatformKind::iOS), + }; + UUIDs uuids = {{Targets[0], 
"00000000-0000-0000-0000-000000000000"}, + {Targets[1], "11111111-1111-1111-1111-111111111111"}, + {Targets[2], "11111111-1111-1111-1111-111111111111"}}; + EXPECT_EQ(Archs, File->getArchitectures()); + EXPECT_EQ(uuids, File->uuids()); + EXPECT_EQ(Platforms.size(), File->getPlatforms().size()); + for (auto Platform : File->getPlatforms()) + EXPECT_EQ(Platforms.count(Platform), 1U); + EXPECT_EQ(std::string("Umbrella.framework/Umbrella"), File->getInstallName()); + EXPECT_EQ(PackedVersion(1, 2, 3), File->getCurrentVersion()); + EXPECT_EQ(PackedVersion(1, 2, 0), File->getCompatibilityVersion()); + EXPECT_EQ(5U, File->getSwiftABIVersion()); + EXPECT_FALSE(File->isTwoLevelNamespace()); + EXPECT_TRUE(File->isApplicationExtensionSafe()); + EXPECT_TRUE(File->isInstallAPI()); + InterfaceFileRef client("ClientA", Targets); + InterfaceFileRef reexport("/System/Library/Frameworks/A.framework/A", + {Targets[0]}); + EXPECT_EQ(1U, File->allowableClients().size()); + EXPECT_EQ(client, File->allowableClients().front()); + EXPECT_EQ(1U, File->reexportedLibraries().size()); + EXPECT_EQ(reexport, File->reexportedLibraries().front()); + + ExampleSymbolSeq Exports, Reexports, Undefineds; + ExampleSymbol temp; + for (const auto *Sym : File->symbols()) { + temp = ExampleSymbol{Sym->getKind(), Sym->getName(), Sym->isWeakDefined(), + Sym->isThreadLocalValue()}; + EXPECT_FALSE(Sym->isWeakReferenced()); + if (Sym->isUndefined()) + Undefineds.emplace_back(std::move(temp)); + else + Sym->isReexported() ? 
Reexports.emplace_back(std::move(temp)) + : Exports.emplace_back(std::move(temp)); + } + llvm::sort(Exports.begin(), Exports.end()); + llvm::sort(Reexports.begin(), Reexports.end()); + llvm::sort(Undefineds.begin(), Undefineds.end()); + + EXPECT_EQ(sizeof(TBDv4ExportedSymbols) / sizeof(ExampleSymbol), + Exports.size()); + EXPECT_EQ(sizeof(TBDv4ReexportedSymbols) / sizeof(ExampleSymbol), + Reexports.size()); + EXPECT_EQ(sizeof(TBDv4UndefinedSymbols) / sizeof(ExampleSymbol), + Undefineds.size()); + EXPECT_TRUE(std::equal(Exports.begin(), Exports.end(), + std::begin(TBDv4ExportedSymbols))); + EXPECT_TRUE(std::equal(Reexports.begin(), Reexports.end(), + std::begin(TBDv4ReexportedSymbols))); + EXPECT_TRUE(std::equal(Undefineds.begin(), Undefineds.end(), + std::begin(TBDv4UndefinedSymbols))); +} + +TEST(TBDv4, WriteFile) { + static const char tbd_v4_file[] = + "--- !tapi-tbd\n" + "tbd-version: 4\n" + "targets: [ i386-macos, x86_64-ios-simulator ]\n" + "uuids:\n" + " - target: i386-macos\n" + " value: 00000000-0000-0000-0000-000000000000\n" + " - target: x86_64-ios-simulator\n" + " value: 11111111-1111-1111-1111-111111111111\n" + "flags: [ installapi ]\n" + "install-name: 'Umbrella.framework/Umbrella'\n" + "current-version: 1.2.3\n" + "compatibility-version: 0\n" + "swift-abi-version: 5\n" + "parent-umbrella:\n" + " - targets: [ i386-macos, x86_64-ios-simulator ]\n" + " umbrella: System\n" + "allowable-clients:\n" + " - targets: [ i386-macos ]\n" + " clients: [ ClientA ]\n" + "exports:\n" + " - targets: [ i386-macos ]\n" + " symbols: [ _symA ]\n" + " objc-classes: [ Class1 ]\n" + " weak-symbols: [ _symC ]\n" + " - targets: [ x86_64-ios-simulator ]\n" + " symbols: [ _symB ]\n" + "...\n"; + + InterfaceFile File; + TargetList Targets = { + Target(AK_i386, PlatformKind::macOS), + Target(AK_x86_64, PlatformKind::iOSSimulator), + }; + UUIDs uuids = {{Targets[0], "00000000-0000-0000-0000-000000000000"}, + {Targets[1], "11111111-1111-1111-1111-111111111111"}}; + 
File.setInstallName("Umbrella.framework/Umbrella"); + File.setFileType(FileType::TBD_V4); + File.addTargets(Targets); + File.addUUID(uuids[0].first, uuids[0].second); + File.addUUID(uuids[1].first, uuids[1].second); + File.setCurrentVersion(PackedVersion(1, 2, 3)); + File.setTwoLevelNamespace(); + File.setInstallAPI(true); + File.setApplicationExtensionSafe(true); + File.setSwiftABIVersion(5); + File.addAllowableClient("ClientA", Targets[0]); + File.addParentUmbrella(Targets[0], "System"); + File.addParentUmbrella(Targets[1], "System"); + File.addSymbol(SymbolKind::GlobalSymbol, "_symA", {Targets[0]}); + File.addSymbol(SymbolKind::GlobalSymbol, "_symB", {Targets[1]}); + File.addSymbol(SymbolKind::GlobalSymbol, "_symC", {Targets[0]}, + SymbolFlags::WeakDefined); + File.addSymbol(SymbolKind::ObjectiveCClass, "Class1", {Targets[0]}); + + SmallString<4096> Buffer; + raw_svector_ostream OS(Buffer); + auto Result = TextAPIWriter::writeToStream(OS, File); + EXPECT_FALSE(Result); + EXPECT_STREQ(tbd_v4_file, Buffer.c_str()); +} + +TEST(TBDv4, MultipleTargets) { + static const char tbd_multiple_targets[] = + "--- !tapi-tbd\n" + "tbd-version: 4\n" + "targets: [ i386-maccatalyst, x86_64-tvos, arm64-ios ]\n" + "install-name: Test.dylib\n" + "...\n"; + + auto Result = + TextAPIReader::get(MemoryBufferRef(tbd_multiple_targets, "Test.tbd")); + EXPECT_TRUE(!!Result); + PlatformSet Platforms; + Platforms.insert(PlatformKind::macCatalyst); + Platforms.insert(PlatformKind::tvOS); + Platforms.insert(PlatformKind::iOS); + auto File = std::move(Result.get()); + EXPECT_EQ(FileType::TBD_V4, File->getFileType()); + EXPECT_EQ(AK_x86_64 | AK_arm64 | AK_i386, File->getArchitectures()); + EXPECT_EQ(Platforms.size(), File->getPlatforms().size()); + for (auto Platform : File->getPlatforms()) + EXPECT_EQ(Platforms.count(Platform), 1U); +} + +TEST(TBDv4, MultipleTargetsSameArch) { + static const char tbd_targets_same_arch[] = + "--- !tapi-tbd\n" + "tbd-version: 4\n" + "targets: [ 
x86_64-maccatalyst, x86_64-tvos ]\n" + "install-name: Test.dylib\n" + "...\n"; + + auto Result = + TextAPIReader::get(MemoryBufferRef(tbd_targets_same_arch, "Test.tbd")); + EXPECT_TRUE(!!Result); + PlatformSet Platforms; + Platforms.insert(PlatformKind::tvOS); + Platforms.insert(PlatformKind::macCatalyst); + auto File = std::move(Result.get()); + EXPECT_EQ(FileType::TBD_V4, File->getFileType()); + EXPECT_EQ(ArchitectureSet(AK_x86_64), File->getArchitectures()); + EXPECT_EQ(Platforms.size(), File->getPlatforms().size()); + for (auto Platform : File->getPlatforms()) + EXPECT_EQ(Platforms.count(Platform), 1U); +} + +TEST(TBDv4, MultipleTargetsSamePlatform) { + static const char tbd_multiple_targets_same_platform[] = + "--- !tapi-tbd\n" + "tbd-version: 4\n" + "targets: [ arm64-ios, armv7k-ios ]\n" + "install-name: Test.dylib\n" + "...\n"; + + auto Result = TextAPIReader::get( + MemoryBufferRef(tbd_multiple_targets_same_platform, "Test.tbd")); + EXPECT_TRUE(!!Result); + auto File = std::move(Result.get()); + EXPECT_EQ(FileType::TBD_V4, File->getFileType()); + EXPECT_EQ(AK_arm64 | AK_armv7k, File->getArchitectures()); + EXPECT_EQ(File->getPlatforms().size(), 1U); + EXPECT_EQ(PlatformKind::iOS, *File->getPlatforms().begin()); +} + +TEST(TBDv4, Target_maccatalyst) { + static const char tbd_target_maccatalyst[] = + "--- !tapi-tbd\n" + "tbd-version: 4\n" + "targets: [ x86_64-maccatalyst ]\n" + "install-name: Test.dylib\n" + "...\n"; + + auto Result = + TextAPIReader::get(MemoryBufferRef(tbd_target_maccatalyst, "Test.tbd")); + EXPECT_TRUE(!!Result); + auto File = std::move(Result.get()); + EXPECT_EQ(FileType::TBD_V4, File->getFileType()); + EXPECT_EQ(ArchitectureSet(AK_x86_64), File->getArchitectures()); + EXPECT_EQ(File->getPlatforms().size(), 1U); + EXPECT_EQ(PlatformKind::macCatalyst, *File->getPlatforms().begin()); +} + +TEST(TBDv4, Target_x86_ios) { + static const char tbd_target_x86_ios[] = "--- !tapi-tbd\n" + "tbd-version: 4\n" + "targets: [ x86_64-ios ]\n" + 
"install-name: Test.dylib\n" + "...\n"; + + auto Result = + TextAPIReader::get(MemoryBufferRef(tbd_target_x86_ios, "Test.tbd")); + EXPECT_TRUE(!!Result); + auto File = std::move(Result.get()); + EXPECT_EQ(FileType::TBD_V4, File->getFileType()); + EXPECT_EQ(ArchitectureSet(AK_x86_64), File->getArchitectures()); + EXPECT_EQ(File->getPlatforms().size(), 1U); + EXPECT_EQ(PlatformKind::iOS, *File->getPlatforms().begin()); +} + +TEST(TBDv4, Target_arm_bridgeOS) { + static const char tbd_platform_bridgeos[] = "--- !tapi-tbd\n" + "tbd-version: 4\n" + "targets: [ armv7k-bridgeos ]\n" + "install-name: Test.dylib\n" + "...\n"; + + auto Result = + TextAPIReader::get(MemoryBufferRef(tbd_platform_bridgeos, "Test.tbd")); + EXPECT_TRUE(!!Result); + auto File = std::move(Result.get()); + EXPECT_EQ(FileType::TBD_V4, File->getFileType()); + EXPECT_EQ(File->getPlatforms().size(), 1U); + EXPECT_EQ(PlatformKind::bridgeOS, *File->getPlatforms().begin()); + EXPECT_EQ(ArchitectureSet(AK_armv7k), File->getArchitectures()); +} + +TEST(TBDv4, Target_x86_macos) { + static const char tbd_x86_macos[] = "--- !tapi-tbd\n" + "tbd-version: 4\n" + "targets: [ x86_64-macos ]\n" + "install-name: Test.dylib\n" + "...\n"; + + auto Result = TextAPIReader::get(MemoryBufferRef(tbd_x86_macos, "Test.tbd")); + EXPECT_TRUE(!!Result); + auto File = std::move(Result.get()); + EXPECT_EQ(FileType::TBD_V4, File->getFileType()); + EXPECT_EQ(ArchitectureSet(AK_x86_64), File->getArchitectures()); + EXPECT_EQ(File->getPlatforms().size(), 1U); + EXPECT_EQ(PlatformKind::macOS, *File->getPlatforms().begin()); +} + +TEST(TBDv4, Target_x86_ios_simulator) { + static const char tbd_x86_ios_sim[] = "--- !tapi-tbd\n" + "tbd-version: 4\n" + "targets: [ x86_64-ios-simulator ]\n" + "install-name: Test.dylib\n" + "...\n"; + + auto Result = + TextAPIReader::get(MemoryBufferRef(tbd_x86_ios_sim, "Test.tbd")); + EXPECT_TRUE(!!Result); + auto File = std::move(Result.get()); + EXPECT_EQ(FileType::TBD_V4, File->getFileType()); + 
EXPECT_EQ(ArchitectureSet(AK_x86_64), File->getArchitectures()); + EXPECT_EQ(File->getPlatforms().size(), 1U); + EXPECT_EQ(PlatformKind::iOSSimulator, *File->getPlatforms().begin()); +} + +TEST(TBDv4, Target_x86_tvos_simulator) { + static const char tbd_x86_tvos_sim[] = + "--- !tapi-tbd\n" + "tbd-version: 4\n" + "targets: [ x86_64-tvos-simulator ]\n" + "install-name: Test.dylib\n" + "...\n"; + + auto Result = + TextAPIReader::get(MemoryBufferRef(tbd_x86_tvos_sim, "Test.tbd")); + EXPECT_TRUE(!!Result); + auto File = std::move(Result.get()); + EXPECT_EQ(FileType::TBD_V4, File->getFileType()); + EXPECT_EQ(ArchitectureSet(AK_x86_64), File->getArchitectures()); + EXPECT_EQ(File->getPlatforms().size(), 1U); + EXPECT_EQ(PlatformKind::tvOSSimulator, *File->getPlatforms().begin()); +} + +TEST(TBDv4, Target_i386_watchos_simulator) { + static const char tbd_i386_watchos_sim[] = + "--- !tapi-tbd\n" + "tbd-version: 4\n" + "targets: [ i386-watchos-simulator ]\n" + "install-name: Test.dylib\n" + "...\n"; + + auto Result = + TextAPIReader::get(MemoryBufferRef(tbd_i386_watchos_sim, "Test.tbd")); + EXPECT_TRUE(!!Result); + auto File = std::move(Result.get()); + EXPECT_EQ(FileType::TBD_V4, File->getFileType()); + EXPECT_EQ(ArchitectureSet(AK_i386), File->getArchitectures()); + EXPECT_EQ(File->getPlatforms().size(), 1U); + EXPECT_EQ(PlatformKind::watchOSSimulator, *File->getPlatforms().begin()); +} + +TEST(TBDv4, Swift_1) { + static const char tbd_swift_1[] = "--- !tapi-tbd\n" + "tbd-version: 4\n" + "targets: [ x86_64-macos ]\n" + "install-name: Test.dylib\n" + "swift-abi-version: 1\n" + "...\n"; + + auto Result = TextAPIReader::get(MemoryBufferRef(tbd_swift_1, "Test.tbd")); + EXPECT_TRUE(!!Result); + auto File = std::move(Result.get()); + EXPECT_EQ(FileType::TBD_V4, File->getFileType()); + EXPECT_EQ(1U, File->getSwiftABIVersion()); +} + +TEST(TBDv4, Swift_2) { + static const char tbd_v1_swift_2[] = "--- !tapi-tbd\n" + "tbd-version: 4\n" + "targets: [ x86_64-macos ]\n" + 
"install-name: Test.dylib\n" + "swift-abi-version: 2\n" + "...\n"; + + auto Result = TextAPIReader::get(MemoryBufferRef(tbd_v1_swift_2, "Test.tbd")); + EXPECT_TRUE(!!Result); + auto File = std::move(Result.get()); + EXPECT_EQ(FileType::TBD_V4, File->getFileType()); + EXPECT_EQ(2U, File->getSwiftABIVersion()); +} + +TEST(TBDv4, Swift_5) { + static const char tbd_swift_5[] = "--- !tapi-tbd\n" + "tbd-version: 4\n" + "targets: [ x86_64-macos ]\n" + "install-name: Test.dylib\n" + "swift-abi-version: 5\n" + "...\n"; + + auto Result = TextAPIReader::get(MemoryBufferRef(tbd_swift_5, "Test.tbd")); + EXPECT_TRUE(!!Result); + auto File = std::move(Result.get()); + EXPECT_EQ(FileType::TBD_V4, File->getFileType()); + EXPECT_EQ(5U, File->getSwiftABIVersion()); +} + +TEST(TBDv4, Swift_99) { + static const char tbd_swift_99[] = "--- !tapi-tbd\n" + "tbd-version: 4\n" + "targets: [ x86_64-macos ]\n" + "install-name: Test.dylib\n" + "swift-abi-version: 99\n" + "...\n"; + + auto Result = TextAPIReader::get(MemoryBufferRef(tbd_swift_99, "Test.tbd")); + EXPECT_TRUE(!!Result); + auto File = std::move(Result.get()); + EXPECT_EQ(FileType::TBD_V4, File->getFileType()); + EXPECT_EQ(99U, File->getSwiftABIVersion()); +} + +TEST(TBDv4, InvalidArchitecture) { + static const char tbd_file_unknown_architecture[] = + "--- !tapi-tbd\n" + "tbd-version: 4\n" + "targets: [ foo-macos ]\n" + "install-name: Test.dylib\n" + "...\n"; + + auto Result = TextAPIReader::get( + MemoryBufferRef(tbd_file_unknown_architecture, "Test.tbd")); + EXPECT_FALSE(!!Result); + auto errorMessage = toString(Result.takeError()); + EXPECT_EQ("malformed file\nTest.tbd:3:12: error: unknown " + "architecture\ntargets: [ foo-macos ]\n" + " ^~~~~~~~~~\n", + errorMessage); +} + +TEST(TBDv4, InvalidPlatform) { + static const char tbd_file_invalid_platform[] = "--- !tapi-tbd\n" + "tbd-version: 4\n" + "targets: [ x86_64-maos ]\n" + "install-name: Test.dylib\n" + "...\n"; + + auto Result = TextAPIReader::get( + 
MemoryBufferRef(tbd_file_invalid_platform, "Test.tbd")); + EXPECT_FALSE(!!Result); + auto errorMessage = toString(Result.takeError()); + EXPECT_EQ("malformed file\nTest.tbd:3:12: error: unknown platform\ntargets: " + "[ x86_64-maos ]\n" + " ^~~~~~~~~~~~\n", + errorMessage); +} + +TEST(TBDv4, MalformedFile1) { + static const char malformed_file1[] = "--- !tapi-tbd\n" + "tbd-version: 4\n" + "...\n"; + + auto Result = + TextAPIReader::get(MemoryBufferRef(malformed_file1, "Test.tbd")); + EXPECT_FALSE(!!Result); + auto errorMessage = toString(Result.takeError()); + ASSERT_EQ("malformed file\nTest.tbd:2:1: error: missing required key " + "'targets'\ntbd-version: 4\n^\n", + errorMessage); +} + +TEST(TBDv4, MalformedFile2) { + static const char malformed_file2[] = "--- !tapi-tbd\n" + "tbd-version: 4\n" + "targets: [ x86_64-macos ]\n" + "install-name: Test.dylib\n" + "foobar: \"unsupported key\"\n"; + + auto Result = + TextAPIReader::get(MemoryBufferRef(malformed_file2, "Test.tbd")); + EXPECT_FALSE(!!Result); + auto errorMessage = toString(Result.takeError()); + ASSERT_EQ( + "malformed file\nTest.tbd:5:9: error: unknown key 'foobar'\nfoobar: " + "\"unsupported key\"\n ^~~~~~~~~~~~~~~~~\n", + errorMessage); +} + +TEST(TBDv4, MalformedFile3) { + static const char tbd_v1_swift_1_1[] = "--- !tapi-tbd\n" + "tbd-version: 4\n" + "targets: [ x86_64-macos ]\n" + "install-name: Test.dylib\n" + "swift-abi-version: 1.1\n" + "...\n"; + + auto Result = + TextAPIReader::get(MemoryBufferRef(tbd_v1_swift_1_1, "Test.tbd")); + EXPECT_FALSE(!!Result); + auto errorMessage = toString(Result.takeError()); + EXPECT_EQ("malformed file\nTest.tbd:5:20: error: invalid Swift ABI " + "version.\nswift-abi-version: 1.1\n ^~~\n", + errorMessage); +} + +} // end namespace TBDv4 From llvm-commits at lists.llvm.org Wed Oct 9 21:25:53 2019 From: llvm-commits at lists.llvm.org (Rui Ueyama via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 04:25:53 +0000 (UTC) Subject: [PATCH] D68758: Improve error message 
for bad SHF_MERGE sections Message-ID: ruiu created this revision. ruiu added reviewers: MaskRay, grimar. Herald added subscribers: arichardson, emaste. Herald added a reviewer: espindola. Herald added a project: LLVM. This patch adds a section name to error messages. Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D68758 Files: lld/ELF/InputFiles.cpp lld/ELF/InputFiles.h lld/test/ELF/merge-bad-input1.s lld/test/ELF/merge-bad-input2.s Index: lld/test/ELF/merge-bad-input2.s =================================================================== --- /dev/null +++ lld/test/ELF/merge-bad-input2.s @@ -0,0 +1,8 @@ +# REQUIRES: x86 +# RUN: llvm-mc -filetype=obj -triple=x86_64-pc-linux %s -o %t.o +# RUN: not ld.lld %t.o -o /dev/null 2>&1 | FileCheck %s + +# CHECK: merge-bad-input2.s.tmp.o:(.foo): writable SHF_MERGE section is not supported + +.section .foo,"awM", at progbits,1 +.zero 16 Index: lld/test/ELF/merge-bad-input1.s =================================================================== --- /dev/null +++ lld/test/ELF/merge-bad-input1.s @@ -0,0 +1,9 @@ +# REQUIRES: x86 +# RUN: llvm-mc -filetype=obj -triple=x86_64-pc-linux %s -o %t.o +# RUN: not ld.lld %t.o -o /dev/null 2>&1 | FileCheck %s + +# CHECK: merge-bad-input1.s.tmp.o:(.foo): SHF_MERGE section size must be a multiple of sh_entsize + +.section .foo,"aM", at progbits,16 +.align 16 +.zero 24 Index: lld/ELF/InputFiles.h =================================================================== --- lld/ELF/InputFiles.h +++ lld/ELF/InputFiles.h @@ -259,7 +259,7 @@ InputSectionBase *createInputSection(const Elf_Shdr &sec); StringRef getSectionName(const Elf_Shdr &sec); - bool shouldMerge(const Elf_Shdr &sec); + bool shouldMerge(const Elf_Shdr &sec, StringRef name); // Each ELF symbol contains a section index which the symbol belongs to. 
// However, because the number of bits dedicated for that is limited, a Index: lld/ELF/InputFiles.cpp =================================================================== --- lld/ELF/InputFiles.cpp +++ lld/ELF/InputFiles.cpp @@ -483,7 +483,8 @@ return signature; } -template bool ObjFile::shouldMerge(const Elf_Shdr &sec) { +template +bool ObjFile::shouldMerge(const Elf_Shdr &sec, StringRef name) { // On a regular link we don't merge sections if -O0 (default is -O1). This // sometimes makes the linker significantly faster, although the output will // be bigger. @@ -515,14 +516,15 @@ if (entSize == 0) return false; if (sec.sh_size % entSize) - fatal(toString(this) + - ": SHF_MERGE section size must be a multiple of sh_entsize"); + fatal(toString(this) + ":(" + name + + "): SHF_MERGE section size must be a multiple of sh_entsize"); uint64_t flags = sec.sh_flags; if (!(flags & SHF_MERGE)) return false; if (flags & SHF_WRITE) - fatal(toString(this) + ": writable SHF_MERGE section is not supported"); + fatal(toString(this) + ":(" + name + + "): writable SHF_MERGE section is not supported"); return true; } @@ -1033,7 +1035,7 @@ if (name == ".eh_frame" && !config->relocatable) return make(*this, sec, name); - if (shouldMerge(sec)) + if (shouldMerge(sec, name)) return make(*this, sec, name); return make(*this, sec, name); } -------------- next part -------------- A non-text attachment was scrubbed... Name: D68758.224243.patch Type: text/x-patch Size: 2906 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 21:25:53 2019 From: llvm-commits at lists.llvm.org (Sam Clegg via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 04:25:53 +0000 (UTC) Subject: [PATCH] D68751: [lld][WebAssembly] Where possible handle signature mismatches via an adaptor function In-Reply-To: References: Message-ID: <5636f4a3166c2cdd3b4d5fcdae18edd7@localhost.localdomain> sbc100 marked an inline comment as done. sbc100 added inline comments. 
================ Comment at: lld/wasm/SymbolTable.cpp:592 +// types don't match the adaptor creation will tail. +bool SymbolTable::replaceWithAdaptorFunction(FunctionSymbol *sym, + FunctionSymbol *target) { ---------------- ruiu wrote: > sbc100 wrote: > > ruiu wrote: > > > This function seems to enable calling an external function with fewer arguments. I wonder what the use case of this is -- as long as you have a correct header file for functions, compilers can tell users that the number of parameters doesn't match. > > Yes, normally this shouldn't happen, but sadly there are some cases in C when it does. > > > > The motivating case here is crt1.c calling the main function from _start. We want to be able to support both the 3-argument and 2-argument forms. With native linking you can simply link these two together and it will kind of "just work". With wasm, before this change we end up generating a linker warning and _start ends up calling a stub function, so the program doesn't work. > > > > This change fixes that use case. > I'm curious if it makes sense to limit this functionality only to "main" if there are no other use cases, so that we don't accidentally link a wrong program as if it were a correct one. What do you think? We have seen other examples of code that does this. I remember another case in the musl C library, although I think most of them are actual bugs that should be fixed in the code. One example is cmake, which tests for the presence of a function by compiling a simple test file containing a call to the function, but with the function always declared with no args. E.g. it tests for printf by compiling and linking a call to a no-arg version of printf. We already successfully link such broken programs, which is why cmake works today (via replaceWithUnreachable). This change just takes it one step further and allows some such programs to also execute.
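As a rough source-level picture of the main/_start case described above, the snippet below shows what an adaptor does in spirit: it has the signature the caller expects and forwards to a target that takes fewer arguments. This is only a sketch under stated assumptions, not the code the patch generates: the names userMain and mainAdaptor are hypothetical, and the real adaptor is synthesized by the linker at the wasm level rather than written in C++.

```cpp
#include <cassert>

// Hypothetical two-argument entry point, standing in for a user's
// main(argc, argv). It simply returns argc so the call is observable.
int userMain(int argc, char **argv) {
  (void)argv; // unused in this sketch
  return argc;
}

// Hypothetical adaptor with the three-argument signature that the
// caller (e.g. _start in crt1.c) expects. It drops the extra argument
// and forwards the rest to the two-argument target.
int mainAdaptor(int argc, char **argv, char **envp) {
  (void)envp; // the target has no parameter for this, so it is ignored
  return userMain(argc, argv);
}
```

Dropping a trailing argument is safe at the source level here because the target never looks at it; cases where the types themselves differ are a separate matter.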
With this change we still give a warning (and that can be turned into an error) so we are still encouraging developers to fix their code. But we want to be permissive here to allow as much legacy code as possible to be compiled and run. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68751/new/ https://reviews.llvm.org/D68751 From llvm-commits at lists.llvm.org Wed Oct 9 21:29:50 2019 From: llvm-commits at lists.llvm.org (GN Sync Bot via llvm-commits) Date: Thu, 10 Oct 2019 04:29:50 -0000 Subject: [llvm] r374278 - gn build: Merge r374277 Message-ID: <20191010042950.221CE81F1B@lists.llvm.org> Author: gnsyncbot Date: Wed Oct 9 21:29:49 2019 New Revision: 374278 URL: http://llvm.org/viewvc/llvm-project?rev=374278&view=rev Log: gn build: Merge r374277 Modified: llvm/trunk/utils/gn/secondary/llvm/unittests/TextAPI/BUILD.gn Modified: llvm/trunk/utils/gn/secondary/llvm/unittests/TextAPI/BUILD.gn URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/gn/secondary/llvm/unittests/TextAPI/BUILD.gn?rev=374278&r1=374277&r2=374278&view=diff ============================================================================== --- llvm/trunk/utils/gn/secondary/llvm/unittests/TextAPI/BUILD.gn (original) +++ llvm/trunk/utils/gn/secondary/llvm/unittests/TextAPI/BUILD.gn Wed Oct 9 21:29:49 2019 @@ -10,5 +10,6 @@ unittest("TextAPITests") { "TextStubV1Tests.cpp", "TextStubV2Tests.cpp", "TextStubV3Tests.cpp", + "TextStubV4Tests.cpp", ] } From llvm-commits at lists.llvm.org Wed Oct 9 21:34:57 2019 From: llvm-commits at lists.llvm.org (Rui Ueyama via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 04:34:57 +0000 (UTC) Subject: [PATCH] D68751: [lld][WebAssembly] Where possible handle signature mismatches via an adaptor function In-Reply-To: References: Message-ID: <4df2b2362918bf4d3cc157468c011209@localhost.localdomain> ruiu added inline comments.
================ Comment at: lld/wasm/SymbolTable.cpp:592 +// types don't match the adaptor creation will tail. +bool SymbolTable::replaceWithAdaptorFunction(FunctionSymbol *sym, + FunctionSymbol *target) { ---------------- sbc100 wrote: > ruiu wrote: > > sbc100 wrote: > > > ruiu wrote: > > > > This function seems to enable calling an external function with fewer arguments. I wonder what the use case of this is -- as long as you have a correct header file for functions, compilers can tell users that the number of parameters doesn't match. > > > Yes, normally this shouldn't happen, but sadly there are some cases in C when it does. > > > > > > The motivating case here is crt1.c calling the main function from _start. We want to be able to support both the 3-argument and 2-argument forms. With native linking you can simply link these two together and it will kind of "just work". With wasm, before this change we end up generating a linker warning and _start ends up calling a stub function, so the program doesn't work. > > > > > > This change fixes that use case. > > I'm curious if it makes sense to limit this functionality only to "main" if there are no other use cases, so that we don't accidentally link a wrong program as if it were a correct one. What do you think? > We have seen other examples of code that does this. I remember another case in the musl C library, although I think most of them are actual bugs that should be fixed in the code. One example is cmake, which tests for the presence of a function by compiling a simple test file containing a call to the function, but with the function always declared with no args. E.g. it tests for printf by compiling and linking a call to a no-arg version of printf. > > We already successfully link such broken programs, which is why cmake works today (via replaceWithUnreachable). This change just takes it one step further and allows some such programs to also execute.
> > With this change we still give a warning (and that can be turned into an error) so we are still encouraging developers to fix their code. But we want to be permissive here to allow as much legacy code as possible to be compiled and run. Thank you for the detailed explanation. That helped me a lot. Could you add that here as a comment? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68751/new/ https://reviews.llvm.org/D68751 From llvm-commits at lists.llvm.org Wed Oct 9 21:34:58 2019 From: llvm-commits at lists.llvm.org (Fangrui Song via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 04:34:58 +0000 (UTC) Subject: [PATCH] D68759: [WebAssembly] Wrap definitions in namespace lld { namespace wasm {. NFC Message-ID: MaskRay created this revision. Herald added subscribers: llvm-commits, dexonsmith, steven_wu, sunfish, aheejin, hiraditya, jgravelle-google, sbc100, mehdi_amini, dschuff. Herald added a project: LLVM. Similar to D68323 , but for wasm. Repository: rLLD LLVM Linker https://reviews.llvm.org/D68759 Files: wasm/Driver.cpp wasm/InputChunks.cpp wasm/InputFiles.cpp wasm/LTO.cpp wasm/OutputSections.cpp wasm/Relocations.cpp wasm/SymbolTable.cpp wasm/Symbols.cpp wasm/SyntheticSections.cpp wasm/Writer.cpp wasm/WriterUtils.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D68759.224244.patch Type: text/x-patch Size: 19004 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 21:44:02 2019 From: llvm-commits at lists.llvm.org (Rui Ueyama via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 04:44:02 +0000 (UTC) Subject: [PATCH] D68759: [WebAssembly] Wrap definitions in namespace lld { namespace wasm {. NFC In-Reply-To: References: Message-ID: <584f42386933ab9f2dde2d7faed643d0@localhost.localdomain> ruiu accepted this revision. ruiu added a comment. This revision is now accepted and ready to land.
LGTM Repository: rLLD LLVM Linker CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68759/new/ https://reviews.llvm.org/D68759 From llvm-commits at lists.llvm.org Wed Oct 9 22:11:13 2019 From: llvm-commits at lists.llvm.org (Pengfei Wang via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 05:11:13 +0000 (UTC) Subject: [PATCH] D68757: [X86] Add strict fp support for instructions fadd/fsub/fmul/fdiv In-Reply-To: References: Message-ID: <64bd42e5f5ab4ec6a244c593aa8e7250@localhost.localdomain> pengfei updated this revision to Diff 224245. pengfei added a comment. Add test case for scalar instructions. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68757/new/ https://reviews.llvm.org/D68757 Files: llvm/lib/Target/X86/X86ISelLowering.cpp llvm/lib/Target/X86/X86InstrAVX512.td llvm/lib/Target/X86/X86InstrSSE.td llvm/test/CodeGen/X86/vec-strict-128.ll llvm/test/CodeGen/X86/vec-strict-256.ll llvm/test/CodeGen/X86/vec-strict-512.ll llvm/test/CodeGen/X86/vec-strict-scalar.ll llvm/test/CodeGen/X86/vector-constrained-fp-intrinsics.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D68757.224245.patch Type: text/x-patch Size: 41365 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 22:20:12 2019 From: llvm-commits at lists.llvm.org (Pengfei Wang via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 05:20:12 +0000 (UTC) Subject: [PATCH] D68757: [X86] Add strict fp support for instructions fadd/fsub/fmul/fdiv In-Reply-To: References: Message-ID: <4fc0a0f7fae047582d455d39ee14d774@localhost.localdomain> pengfei updated this revision to Diff 224248. pengfei added a comment. Rename scalar test file to fp-strict-scalar. 
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68757/new/ https://reviews.llvm.org/D68757 Files: llvm/lib/Target/X86/X86ISelLowering.cpp llvm/lib/Target/X86/X86InstrAVX512.td llvm/lib/Target/X86/X86InstrSSE.td llvm/test/CodeGen/X86/fp-strict-scalar.ll llvm/test/CodeGen/X86/vec-strict-128.ll llvm/test/CodeGen/X86/vec-strict-256.ll llvm/test/CodeGen/X86/vec-strict-512.ll llvm/test/CodeGen/X86/vector-constrained-fp-intrinsics.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D68757.224248.patch Type: text/x-patch Size: 41324 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 22:25:40 2019 From: llvm-commits at lists.llvm.org (Fangrui Song via llvm-commits) Date: Thu, 10 Oct 2019 05:25:40 -0000 Subject: [lld] r374279 - [WebAssembly] Wrap definitions in namespace lld { namespace wasm {. NFC Message-ID: <20191010052540.4D9F588400@lists.llvm.org> Author: maskray Date: Wed Oct 9 22:25:39 2019 New Revision: 374279 URL: http://llvm.org/viewvc/llvm-project?rev=374279&view=rev Log: [WebAssembly] Wrap definitions in namespace lld { namespace wasm {. NFC Similar to D68323, but for wasm. 
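For readers skimming the log, the shape of the change named in the subject line can be sketched as follows. This is an illustration only: the namespace demo and every name inside it are hypothetical stand-ins, not lld's actual declarations. The first part plays the role of a header; the second part defines the same entities inside reopened namespaces, which is the style the commit moves to. The previous style is shown in a comment for comparison.

```cpp
#include <cassert>
#include <string>

// Header-style declarations (hypothetical names, standing in for a
// project's headers).
namespace demo {
namespace wasm {
struct InputFile {
  std::string name;
};
extern int config;  // declared here, defined below
int link(int argc); // declared here, defined below
} // namespace wasm

std::string toString(const wasm::InputFile *file);
} // namespace demo

// Previous style (shown as a comment only): pull the namespaces in
// with using-directives and qualify each definition.
//
//   using namespace demo;
//   using namespace demo::wasm;
//   int demo::wasm::config;
//   int demo::wasm::link(int argc) { ... }
//   std::string demo::toString(const wasm::InputFile *file) { ... }

// Style after the change: reopen the namespaces and define the
// entities inside them without qualification.
namespace demo {

std::string toString(const wasm::InputFile *file) {
  return file ? file->name : "<internal>";
}

namespace wasm {
int config = 0;

int link(int argc) { return argc + config; }
} // namespace wasm
} // namespace demo
```

One trade-off to keep in mind: a qualified definition whose signature does not match any declaration is rejected by the compiler, whereas inside a reopened namespace the same mistake quietly declares a new overload.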
Reviewed By: ruiu Differential Revision: https://reviews.llvm.org/D68759 Modified: lld/trunk/wasm/Driver.cpp lld/trunk/wasm/InputChunks.cpp lld/trunk/wasm/InputFiles.cpp lld/trunk/wasm/LTO.cpp lld/trunk/wasm/OutputSections.cpp lld/trunk/wasm/Relocations.cpp lld/trunk/wasm/SymbolTable.cpp lld/trunk/wasm/Symbols.cpp lld/trunk/wasm/SyntheticSections.cpp lld/trunk/wasm/Writer.cpp lld/trunk/wasm/WriterUtils.cpp Modified: lld/trunk/wasm/Driver.cpp URL: http://llvm.org/viewvc/llvm-project/lld/trunk/wasm/Driver.cpp?rev=374279&r1=374278&r2=374279&view=diff ============================================================================== --- lld/trunk/wasm/Driver.cpp (original) +++ lld/trunk/wasm/Driver.cpp Wed Oct 9 22:25:39 2019 @@ -37,10 +37,9 @@ using namespace llvm::object; using namespace llvm::sys; using namespace llvm::wasm; -using namespace lld; -using namespace lld::wasm; - -Configuration *lld::wasm::config; +namespace lld { +namespace wasm { +Configuration *config; namespace { @@ -79,8 +78,7 @@ private: }; } // anonymous namespace -bool lld::wasm::link(ArrayRef args, bool canExitEarly, - raw_ostream &error) { +bool link(ArrayRef args, bool canExitEarly, raw_ostream &error) { errorHandler().logName = args::getFilenameWithoutExe(args[0]); errorHandler().errorOS = &error; errorHandler().errorLimitExceededMsg = @@ -787,3 +785,6 @@ void LinkerDriver::link(ArrayReffile) + ":(" + c->getName() + ")").str(); } +namespace wasm { StringRef InputChunk::getComdatName() const { uint32_t index = getComdat(); if (index == UINT32_MAX) @@ -346,3 +346,6 @@ void InputSegment::generateRelocationCod writeUleb128(os, 0, "offset"); } } + +} // namespace wasm +} // namespace lld Modified: lld/trunk/wasm/InputFiles.cpp URL: http://llvm.org/viewvc/llvm-project/lld/trunk/wasm/InputFiles.cpp?rev=374279&r1=374278&r2=374279&view=diff ============================================================================== --- lld/trunk/wasm/InputFiles.cpp (original) +++ lld/trunk/wasm/InputFiles.cpp Wed Oct 
9 22:25:39 2019 @@ -22,16 +22,27 @@ #define DEBUG_TYPE "lld" -using namespace lld; -using namespace lld::wasm; - using namespace llvm; using namespace llvm::object; using namespace llvm::wasm; -std::unique_ptr lld::wasm::tar; +namespace lld { + +// Returns a string in the format of "foo.o" or "foo.a(bar.o)". +std::string toString(const wasm::InputFile *file) { + if (!file) + return ""; + + if (file->archiveName.empty()) + return file->getName(); + + return (file->archiveName + "(" + file->getName() + ")").str(); +} -Optional lld::wasm::readFile(StringRef path) { +namespace wasm { +std::unique_ptr tar; + +Optional readFile(StringRef path) { log("Loading: " + path); auto mbOrErr = MemoryBuffer::getFile(path); @@ -48,7 +59,7 @@ Optional lld::wasm::rea return mbref; } -InputFile *lld::wasm::createObjectFile(MemoryBufferRef mb, +InputFile *createObjectFile(MemoryBufferRef mb, StringRef archiveName) { file_magic magic = identify_magic(mb.getBuffer()); if (magic == file_magic::wasm_object) { @@ -542,13 +553,5 @@ void BitcodeFile::parse() { symbols.push_back(createBitcodeSymbol(keptComdats, objSym, *this)); } -// Returns a string in the format of "foo.o" or "foo.a(bar.o)". 
-std::string lld::toString(const wasm::InputFile *file) { - if (!file) - return ""; - - if (file->archiveName.empty()) - return file->getName(); - - return (file->archiveName + "(" + file->getName() + ")").str(); -} +} // namespace wasm +} // namespace lld Modified: lld/trunk/wasm/LTO.cpp URL: http://llvm.org/viewvc/llvm-project/lld/trunk/wasm/LTO.cpp?rev=374279&r1=374278&r2=374279&view=diff ============================================================================== --- lld/trunk/wasm/LTO.cpp (original) +++ lld/trunk/wasm/LTO.cpp Wed Oct 9 22:25:39 2019 @@ -36,9 +36,9 @@ #include using namespace llvm; -using namespace lld; -using namespace lld::wasm; +namespace lld { +namespace wasm { static std::unique_ptr createLTO() { lto::Config c; c.Options = initTargetOptionsFromCodeGenFlags(); @@ -165,3 +165,6 @@ std::vector BitcodeCompiler:: return ret; } + +} // namespace wasm +} // namespace lld Modified: lld/trunk/wasm/OutputSections.cpp URL: http://llvm.org/viewvc/llvm-project/lld/trunk/wasm/OutputSections.cpp?rev=374279&r1=374278&r2=374279&view=diff ============================================================================== --- lld/trunk/wasm/OutputSections.cpp (original) +++ lld/trunk/wasm/OutputSections.cpp Wed Oct 9 22:25:39 2019 @@ -20,9 +20,17 @@ using namespace llvm; using namespace llvm::wasm; -using namespace lld; -using namespace lld::wasm; +namespace lld { + +// Returns a string, e.g. "FUNCTION(.text)". +std::string toString(const wasm::OutputSection &sec) { + if (!sec.name.empty()) + return (sec.getSectionName() + "(" + sec.name + ")").str(); + return sec.getSectionName(); +} + +namespace wasm { static StringRef sectionTypeToString(uint32_t sectionType) { switch (sectionType) { case WASM_SEC_CUSTOM: @@ -58,13 +66,6 @@ static StringRef sectionTypeToString(uin } } -// Returns a string, e.g. "FUNCTION(.text)". 
-std::string lld::toString(const OutputSection &sec) { - if (!sec.name.empty()) - return (sec.getSectionName() + "(" + sec.name + ")").str(); - return sec.getSectionName(); -} - StringRef OutputSection::getSectionName() const { return sectionTypeToString(type); } @@ -248,3 +249,6 @@ void CustomSection::writeRelocations(raw for (const InputSection *s : inputSections) s->writeRelocations(os); } + +} // namespace wasm +} // namespace lld Modified: lld/trunk/wasm/Relocations.cpp URL: http://llvm.org/viewvc/llvm-project/lld/trunk/wasm/Relocations.cpp?rev=374279&r1=374278&r2=374279&view=diff ============================================================================== --- lld/trunk/wasm/Relocations.cpp (original) +++ lld/trunk/wasm/Relocations.cpp Wed Oct 9 22:25:39 2019 @@ -14,9 +14,8 @@ using namespace llvm; using namespace llvm::wasm; -using namespace lld; -using namespace lld::wasm; - +namespace lld { +namespace wasm { static bool requiresGOTAccess(const Symbol *sym) { return config->isPic && !sym->isHidden() && !sym->isLocal(); } @@ -54,7 +53,7 @@ static void addGOTEntry(Symbol *sym) { out.globalSec->addStaticGOTEntry(sym); } -void lld::wasm::scanRelocations(InputChunk *chunk) { +void scanRelocations(InputChunk *chunk) { if (!chunk->live) return; ObjFile *file = chunk->file; @@ -113,3 +112,6 @@ void lld::wasm::scanRelocations(InputChu } } + +} // namespace wasm +} // namespace lld Modified: lld/trunk/wasm/SymbolTable.cpp URL: http://llvm.org/viewvc/llvm-project/lld/trunk/wasm/SymbolTable.cpp?rev=374279&r1=374278&r2=374279&view=diff ============================================================================== --- lld/trunk/wasm/SymbolTable.cpp (original) +++ lld/trunk/wasm/SymbolTable.cpp Wed Oct 9 22:25:39 2019 @@ -21,10 +21,10 @@ using namespace llvm; using namespace llvm::wasm; using namespace llvm::object; -using namespace lld; -using namespace lld::wasm; -SymbolTable *lld::wasm::symtab; +namespace lld { +namespace wasm { +SymbolTable *symtab; void 
SymbolTable::addFile(InputFile *file) { log("Processing: " + toString(file)); @@ -692,3 +692,6 @@ void SymbolTable::handleSymbolVariants() } } } + +} // namespace wasm +} // namespace lld Modified: lld/trunk/wasm/Symbols.cpp URL: http://llvm.org/viewvc/llvm-project/lld/trunk/wasm/Symbols.cpp?rev=374279&r1=374278&r2=374279&view=diff ============================================================================== --- lld/trunk/wasm/Symbols.cpp (original) +++ lld/trunk/wasm/Symbols.cpp Wed Oct 9 22:25:39 2019 @@ -21,9 +21,45 @@ using namespace llvm; using namespace llvm::wasm; -using namespace lld; -using namespace lld::wasm; +namespace lld { +std::string toString(const wasm::Symbol &sym) { + return maybeDemangleSymbol(sym.getName()); +} + +std::string maybeDemangleSymbol(StringRef name) { + if (wasm::config->demangle) + return demangleItanium(name); + return name; +} + +std::string toString(wasm::Symbol::Kind kind) { + switch (kind) { + case wasm::Symbol::DefinedFunctionKind: + return "DefinedFunction"; + case wasm::Symbol::DefinedDataKind: + return "DefinedData"; + case wasm::Symbol::DefinedGlobalKind: + return "DefinedGlobal"; + case wasm::Symbol::DefinedEventKind: + return "DefinedEvent"; + case wasm::Symbol::UndefinedFunctionKind: + return "UndefinedFunction"; + case wasm::Symbol::UndefinedDataKind: + return "UndefinedData"; + case wasm::Symbol::UndefinedGlobalKind: + return "UndefinedGlobal"; + case wasm::Symbol::LazyKind: + return "LazyKind"; + case wasm::Symbol::SectionKind: + return "SectionKind"; + case wasm::Symbol::OutputSectionKind: + return "OutputSectionKind"; + } + llvm_unreachable("invalid symbol kind"); +} + +namespace wasm { DefinedFunction *WasmSym::callCtors; DefinedFunction *WasmSym::initMemory; DefinedFunction *WasmSym::applyRelocs; @@ -298,49 +334,12 @@ const OutputSectionSymbol *SectionSymbol void LazySymbol::fetch() { cast(file)->addMember(&archiveSymbol); } -std::string lld::toString(const wasm::Symbol &sym) { - return 
lld::maybeDemangleSymbol(sym.getName()); -} - -std::string lld::maybeDemangleSymbol(StringRef name) { - if (config->demangle) - return demangleItanium(name); - return name; -} - -std::string lld::toString(wasm::Symbol::Kind kind) { - switch (kind) { - case wasm::Symbol::DefinedFunctionKind: - return "DefinedFunction"; - case wasm::Symbol::DefinedDataKind: - return "DefinedData"; - case wasm::Symbol::DefinedGlobalKind: - return "DefinedGlobal"; - case wasm::Symbol::DefinedEventKind: - return "DefinedEvent"; - case wasm::Symbol::UndefinedFunctionKind: - return "UndefinedFunction"; - case wasm::Symbol::UndefinedDataKind: - return "UndefinedData"; - case wasm::Symbol::UndefinedGlobalKind: - return "UndefinedGlobal"; - case wasm::Symbol::LazyKind: - return "LazyKind"; - case wasm::Symbol::SectionKind: - return "SectionKind"; - case wasm::Symbol::OutputSectionKind: - return "OutputSectionKind"; - } - llvm_unreachable("invalid symbol kind"); -} - - -void lld::wasm::printTraceSymbolUndefined(StringRef name, const InputFile* file) { +void printTraceSymbolUndefined(StringRef name, const InputFile* file) { message(toString(file) + ": reference to " + name); } // Print out a log message for --trace-symbol. 
-void lld::wasm::printTraceSymbol(Symbol *sym) { +void printTraceSymbol(Symbol *sym) { // Undefined symbols are traced via printTraceSymbolUndefined if (sym->isUndefined()) return; @@ -354,5 +353,8 @@ void lld::wasm::printTraceSymbol(Symbol message(toString(sym->getFile()) + s + sym->getName()); } -const char *lld::wasm::defaultModule = "env"; -const char *lld::wasm::functionTableName = "__indirect_function_table"; +const char *defaultModule = "env"; +const char *functionTableName = "__indirect_function_table"; + +} // namespace wasm +} // namespace lld Modified: lld/trunk/wasm/SyntheticSections.cpp URL: http://llvm.org/viewvc/llvm-project/lld/trunk/wasm/SyntheticSections.cpp?rev=374279&r1=374278&r2=374279&view=diff ============================================================================== --- lld/trunk/wasm/SyntheticSections.cpp (original) +++ lld/trunk/wasm/SyntheticSections.cpp Wed Oct 9 22:25:39 2019 @@ -22,10 +22,10 @@ using namespace llvm; using namespace llvm::wasm; -using namespace lld; -using namespace lld::wasm; +namespace lld { +namespace wasm { -OutStruct lld::wasm::out; +OutStruct out; namespace { @@ -567,3 +567,6 @@ void RelocSection::writeBody() { writeUleb128(bodyOutputStream, count, "reloc count"); sec->writeRelocations(bodyOutputStream); } + +} // namespace wasm +} // namespace lld Modified: lld/trunk/wasm/Writer.cpp URL: http://llvm.org/viewvc/llvm-project/lld/trunk/wasm/Writer.cpp?rev=374279&r1=374278&r2=374279&view=diff ============================================================================== --- lld/trunk/wasm/Writer.cpp (original) +++ lld/trunk/wasm/Writer.cpp Wed Oct 9 22:25:39 2019 @@ -39,9 +39,9 @@ using namespace llvm; using namespace llvm::wasm; -using namespace lld; -using namespace lld::wasm; +namespace lld { +namespace wasm { static constexpr int stackAlignment = 16; namespace { @@ -1088,4 +1088,7 @@ void Writer::createHeader() { fileSize += header.size(); } -void lld::wasm::writeResult() { Writer().run(); } +void 
writeResult() { Writer().run(); } + +} // namespace wasm +} // namespace lld Modified: lld/trunk/wasm/WriterUtils.cpp URL: http://llvm.org/viewvc/llvm-project/lld/trunk/wasm/WriterUtils.cpp?rev=374279&r1=374278&r2=374279&view=diff ============================================================================== --- lld/trunk/wasm/WriterUtils.cpp (original) +++ lld/trunk/wasm/WriterUtils.cpp Wed Oct 9 22:25:39 2019 @@ -18,50 +18,94 @@ using namespace llvm; using namespace llvm::wasm; namespace lld { +std::string toString(ValType type) { + switch (type) { + case ValType::I32: + return "i32"; + case ValType::I64: + return "i64"; + case ValType::F32: + return "f32"; + case ValType::F64: + return "f64"; + case ValType::V128: + return "v128"; + case ValType::EXNREF: + return "exnref"; + } + llvm_unreachable("Invalid wasm::ValType"); +} + +std::string toString(const WasmSignature &sig) { + SmallString<128> s("("); + for (ValType type : sig.Params) { + if (s.size() != 1) + s += ", "; + s += toString(type); + } + s += ") -> "; + if (sig.Returns.empty()) + s += "void"; + else + s += toString(sig.Returns[0]); + return s.str(); +} -void wasm::debugWrite(uint64_t offset, const Twine &msg) { +std::string toString(const WasmGlobalType &type) { + return (type.Mutable ? 
"var " : "const ") + + toString(static_cast(type.Type)); +} + +std::string toString(const WasmEventType &type) { + if (type.Attribute == WASM_EVENT_ATTRIBUTE_EXCEPTION) + return "exception"; + return "unknown"; +} + +namespace wasm { +void debugWrite(uint64_t offset, const Twine &msg) { LLVM_DEBUG(dbgs() << format(" | %08lld: ", offset) << msg << "\n"); } -void wasm::writeUleb128(raw_ostream &os, uint32_t number, const Twine &msg) { +void writeUleb128(raw_ostream &os, uint32_t number, const Twine &msg) { debugWrite(os.tell(), msg + "[" + utohexstr(number) + "]"); encodeULEB128(number, os); } -void wasm::writeSleb128(raw_ostream &os, int32_t number, const Twine &msg) { +void writeSleb128(raw_ostream &os, int32_t number, const Twine &msg) { debugWrite(os.tell(), msg + "[" + utohexstr(number) + "]"); encodeSLEB128(number, os); } -void wasm::writeBytes(raw_ostream &os, const char *bytes, size_t count, +void writeBytes(raw_ostream &os, const char *bytes, size_t count, const Twine &msg) { debugWrite(os.tell(), msg + " [data[" + Twine(count) + "]]"); os.write(bytes, count); } -void wasm::writeStr(raw_ostream &os, StringRef string, const Twine &msg) { +void writeStr(raw_ostream &os, StringRef string, const Twine &msg) { debugWrite(os.tell(), msg + " [str[" + Twine(string.size()) + "]: " + string + "]"); encodeULEB128(string.size(), os); os.write(string.data(), string.size()); } -void wasm::writeU8(raw_ostream &os, uint8_t byte, const Twine &msg) { +void writeU8(raw_ostream &os, uint8_t byte, const Twine &msg) { debugWrite(os.tell(), msg + " [0x" + utohexstr(byte) + "]"); os << byte; } -void wasm::writeU32(raw_ostream &os, uint32_t number, const Twine &msg) { +void writeU32(raw_ostream &os, uint32_t number, const Twine &msg) { debugWrite(os.tell(), msg + "[0x" + utohexstr(number) + "]"); support::endian::write(os, number, support::little); } -void wasm::writeValueType(raw_ostream &os, ValType type, const Twine &msg) { +void writeValueType(raw_ostream &os, ValType type, 
const Twine &msg) { writeU8(os, static_cast(type), msg + "[type: " + toString(type) + "]"); } -void wasm::writeSig(raw_ostream &os, const WasmSignature &sig) { +void writeSig(raw_ostream &os, const WasmSignature &sig) { writeU8(os, WASM_TYPE_FUNC, "signature type"); writeUleb128(os, sig.Params.size(), "param Count"); for (ValType paramType : sig.Params) { @@ -73,22 +117,22 @@ void wasm::writeSig(raw_ostream &os, con } } -void wasm::writeI32Const(raw_ostream &os, int32_t number, const Twine &msg) { +void writeI32Const(raw_ostream &os, int32_t number, const Twine &msg) { writeU8(os, WASM_OPCODE_I32_CONST, "i32.const"); writeSleb128(os, number, msg); } -void wasm::writeI64Const(raw_ostream &os, int32_t number, const Twine &msg) { +void writeI64Const(raw_ostream &os, int32_t number, const Twine &msg) { writeU8(os, WASM_OPCODE_I64_CONST, "i64.const"); writeSleb128(os, number, msg); } -void wasm::writeMemArg(raw_ostream &os, uint32_t alignment, uint32_t offset) { +void writeMemArg(raw_ostream &os, uint32_t alignment, uint32_t offset) { writeUleb128(os, alignment, "alignment"); writeUleb128(os, offset, "offset"); } -void wasm::writeInitExpr(raw_ostream &os, const WasmInitExpr &initExpr) { +void writeInitExpr(raw_ostream &os, const WasmInitExpr &initExpr) { writeU8(os, initExpr.Opcode, "opcode"); switch (initExpr.Opcode) { case WASM_OPCODE_I32_CONST: @@ -106,39 +150,39 @@ void wasm::writeInitExpr(raw_ostream &os writeU8(os, WASM_OPCODE_END, "opcode:end"); } -void wasm::writeLimits(raw_ostream &os, const WasmLimits &limits) { +void writeLimits(raw_ostream &os, const WasmLimits &limits) { writeU8(os, limits.Flags, "limits flags"); writeUleb128(os, limits.Initial, "limits initial"); if (limits.Flags & WASM_LIMITS_FLAG_HAS_MAX) writeUleb128(os, limits.Maximum, "limits max"); } -void wasm::writeGlobalType(raw_ostream &os, const WasmGlobalType &type) { +void writeGlobalType(raw_ostream &os, const WasmGlobalType &type) { // TODO: Update WasmGlobalType to use ValType and remove 
this cast. writeValueType(os, ValType(type.Type), "global type"); writeU8(os, type.Mutable, "global mutable"); } -void wasm::writeGlobal(raw_ostream &os, const WasmGlobal &global) { +void writeGlobal(raw_ostream &os, const WasmGlobal &global) { writeGlobalType(os, global.Type); writeInitExpr(os, global.InitExpr); } -void wasm::writeEventType(raw_ostream &os, const WasmEventType &type) { +void writeEventType(raw_ostream &os, const WasmEventType &type) { writeUleb128(os, type.Attribute, "event attribute"); writeUleb128(os, type.SigIndex, "sig index"); } -void wasm::writeEvent(raw_ostream &os, const WasmEvent &event) { +void writeEvent(raw_ostream &os, const WasmEvent &event) { writeEventType(os, event.Type); } -void wasm::writeTableType(raw_ostream &os, const llvm::wasm::WasmTable &type) { +void writeTableType(raw_ostream &os, const llvm::wasm::WasmTable &type) { writeU8(os, WASM_TYPE_FUNCREF, "table type"); writeLimits(os, type.Limits); } -void wasm::writeImport(raw_ostream &os, const WasmImport &import) { +void writeImport(raw_ostream &os, const WasmImport &import) { writeStr(os, import.Module, "import module name"); writeStr(os, import.Field, "import field name"); writeU8(os, import.Kind, "import kind"); @@ -163,7 +207,7 @@ void wasm::writeImport(raw_ostream &os, } } -void wasm::writeExport(raw_ostream &os, const WasmExport &export_) { +void writeExport(raw_ostream &os, const WasmExport &export_) { writeStr(os, export_.Name, "export name"); writeU8(os, export_.Kind, "export kind"); switch (export_.Kind) { @@ -183,48 +227,6 @@ void wasm::writeExport(raw_ostream &os, fatal("unsupported export type: " + Twine(export_.Kind)); } } -} // namespace lld - -std::string lld::toString(ValType type) { - switch (type) { - case ValType::I32: - return "i32"; - case ValType::I64: - return "i64"; - case ValType::F32: - return "f32"; - case ValType::F64: - return "f64"; - case ValType::V128: - return "v128"; - case ValType::EXNREF: - return "exnref"; - } - llvm_unreachable("Invalid 
wasm::ValType"); -} - -std::string lld::toString(const WasmSignature &sig) { - SmallString<128> s("("); - for (ValType type : sig.Params) { - if (s.size() != 1) - s += ", "; - s += toString(type); - } - s += ") -> "; - if (sig.Returns.empty()) - s += "void"; - else - s += toString(sig.Returns[0]); - return s.str(); -} - -std::string lld::toString(const WasmGlobalType &type) { - return (type.Mutable ? "var " : "const ") + - toString(static_cast(type.Type)); -} -std::string lld::toString(const WasmEventType &type) { - if (type.Attribute == WASM_EVENT_ATTRIBUTE_EXCEPTION) - return "exception"; - return "unknown"; -} +} // namespace wasm +} // namespace lld From llvm-commits at lists.llvm.org Wed Oct 9 22:29:14 2019 From: llvm-commits at lists.llvm.org (Fangrui Song via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 05:29:14 +0000 (UTC) Subject: [PATCH] D68759: [WebAssembly] Wrap definitions in namespace lld { namespace wasm {. NFC In-Reply-To: References: Message-ID: This revision was automatically updated to reflect the committed changes. Closed by commit rG33c59abf5c69: [WebAssembly] Wrap definitions in namespace lld { namespace wasm {. NFC (authored by MaskRay). Changed prior to commit: https://reviews.llvm.org/D68759?vs=224244&id=224249#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68759/new/ https://reviews.llvm.org/D68759 Files: lld/wasm/Driver.cpp lld/wasm/InputChunks.cpp lld/wasm/InputFiles.cpp lld/wasm/LTO.cpp lld/wasm/OutputSections.cpp lld/wasm/Relocations.cpp lld/wasm/SymbolTable.cpp lld/wasm/Symbols.cpp lld/wasm/SyntheticSections.cpp lld/wasm/Writer.cpp lld/wasm/WriterUtils.cpp -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68759.224249.patch Type: text/x-patch Size: 19154 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 22:33:21 2019 From: llvm-commits at lists.llvm.org (Johannes Doerfert via llvm-commits) Date: Thu, 10 Oct 2019 05:33:21 -0000 Subject: [llvm] r374280 - [Attributor] Handle `null` differently in capture and alias logic Message-ID: <20191010053321.4906285E7A@lists.llvm.org> Author: jdoerfert Date: Wed Oct 9 22:33:21 2019 New Revision: 374280 URL: http://llvm.org/viewvc/llvm-project?rev=374280&view=rev Log: [Attributor] Handle `null` differently in capture and alias logic Summary: `null` in the default address space (=AS 0) cannot be captured nor can it alias anything. We make this clear now as it can be important for callbacks and other cases later on. In addition, this patch improves the debug output for noalias deduction. Reviewers: sstefan1, uenoku Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68624 Modified: llvm/trunk/lib/Transforms/IPO/Attributor.cpp llvm/trunk/test/Transforms/FunctionAttrs/callbacks.ll llvm/trunk/test/Transforms/FunctionAttrs/nocapture.ll Modified: llvm/trunk/lib/Transforms/IPO/Attributor.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/IPO/Attributor.cpp?rev=374280&r1=374279&r2=374280&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/IPO/Attributor.cpp (original) +++ llvm/trunk/lib/Transforms/IPO/Attributor.cpp Wed Oct 9 22:33:21 2019 @@ -1907,7 +1907,11 @@ struct AANoAliasFloating final : AANoAli /// See AbstractAttribute::initialize(...). 
void initialize(Attributor &A) override { AANoAliasImpl::initialize(A); - if (isa(getAnchorValue())) + Value &Val = getAssociatedValue(); + if (isa(Val)) + indicateOptimisticFixpoint(); + if (isa(Val) && + Val.getType()->getPointerAddressSpace() == 0) indicateOptimisticFixpoint(); } @@ -1971,8 +1975,12 @@ struct AANoAliasCallSiteArgument final : // check only uses possibly executed before this callsite. auto &NoCaptureAA = A.getAAFor(*this, IRP); - if (!NoCaptureAA.isAssumedNoCaptureMaybeReturned()) + if (!NoCaptureAA.isAssumedNoCaptureMaybeReturned()) { + LLVM_DEBUG( + dbgs() << "[Attributor][AANoAliasCSArg] " << V + << " cannot be noalias as it is potentially captured\n"); return indicatePessimisticFixpoint(); + } // (iii) Check there is no other pointer argument which could alias with the // value. @@ -1986,13 +1994,15 @@ struct AANoAliasCallSiteArgument final : if (const Function *F = getAnchorScope()) { if (AAResults *AAR = A.getInfoCache().getAAResultsForFunction(*F)) { + bool IsAliasing = AAR->isNoAlias(&getAssociatedValue(), ArgOp); LLVM_DEBUG(dbgs() << "[Attributor][NoAliasCSArg] Check alias between " "callsite arguments " << AAR->isNoAlias(&getAssociatedValue(), ArgOp) << " " - << getAssociatedValue() << " " << *ArgOp << "\n"); + << getAssociatedValue() << " " << *ArgOp << " => " + << (IsAliasing ? "" : "no-") << "alias \n"); - if (AAR->isNoAlias(&getAssociatedValue(), ArgOp)) + if (IsAliasing) continue; } } @@ -2881,6 +2891,13 @@ struct AANoCaptureImpl : public AANoCapt void initialize(Attributor &A) override { AANoCapture::initialize(A); + // You cannot "capture" null in the default address space. + if (isa(getAssociatedValue()) && + getAssociatedValue().getType()->getPointerAddressSpace() == 0) { + indicateOptimisticFixpoint(); + return; + } + const IRPosition &IRP = getIRPosition(); const Function *F = getArgNo() >= 0 ? 
IRP.getAssociatedFunction() : IRP.getAnchorScope(); Modified: llvm/trunk/test/Transforms/FunctionAttrs/callbacks.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/FunctionAttrs/callbacks.ll?rev=374280&r1=374279&r2=374280&view=diff ============================================================================== --- llvm/trunk/test/Transforms/FunctionAttrs/callbacks.ll (original) +++ llvm/trunk/test/Transforms/FunctionAttrs/callbacks.ll Wed Oct 9 22:33:21 2019 @@ -24,7 +24,7 @@ define void @t0_caller(i32* %a) { ; CHECK-NEXT: [[TMP0:%.*]] = bitcast i32* [[B]] to i8* ; CHECK-NEXT: store i32 42, i32* [[B]], align 32 ; CHECK-NEXT: store i32* [[B]], i32** [[C]], align 64 -; CHECK-NEXT: call void (i32*, i32*, void (i32*, i32*, ...)*, ...) @t0_callback_broker(i32* null, i32* nonnull align 128 dereferenceable(4) [[PTR]], void (i32*, i32*, ...)* nonnull bitcast (void (i32*, i32*, i32*, i64, i32**)* @t0_callback_callee to void (i32*, i32*, ...)*), i32* [[A:%.*]], i64 99, i32** nonnull align 64 dereferenceable(8) [[C]]) +; CHECK-NEXT: call void (i32*, i32*, void (i32*, i32*, ...)*, ...) 
@t0_callback_broker(i32* noalias null, i32* nonnull align 128 dereferenceable(4) [[PTR]], void (i32*, i32*, ...)* nonnull bitcast (void (i32*, i32*, i32*, i64, i32**)* @t0_callback_callee to void (i32*, i32*, ...)*), i32* [[A:%.*]], i64 99, i32** nonnull align 64 dereferenceable(8) [[C]]) ; CHECK-NEXT: ret void ; entry: Modified: llvm/trunk/test/Transforms/FunctionAttrs/nocapture.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/FunctionAttrs/nocapture.ll?rev=374280&r1=374279&r2=374280&view=diff ============================================================================== --- llvm/trunk/test/Transforms/FunctionAttrs/nocapture.ll (original) +++ llvm/trunk/test/Transforms/FunctionAttrs/nocapture.ll Wed Oct 9 22:33:21 2019 @@ -320,5 +320,14 @@ define i1 @captureDereferenceableOrNullI ret i1 %2 } +declare void @unknown(i8*) +define void @test_callsite() { +entry: +; We know that 'null' in AS 0 does not alias anything and cannot be captured +; CHECK: call void @unknown(i8* noalias nocapture null) + call void @unknown(i8* null) + ret void +} + declare i8* @llvm.launder.invariant.group.p0i8(i8*) declare i8* @llvm.strip.invariant.group.p0i8(i8*) From llvm-commits at lists.llvm.org Wed Oct 9 22:34:21 2019 From: llvm-commits at lists.llvm.org (Johannes Doerfert via llvm-commits) Date: Thu, 10 Oct 2019 05:34:21 -0000 Subject: [llvm] r374281 - [Attributor][NFC] clang format Message-ID: <20191010053421.AEFB98FB9B@lists.llvm.org> Author: jdoerfert Date: Wed Oct 9 22:34:21 2019 New Revision: 374281 URL: http://llvm.org/viewvc/llvm-project?rev=374281&view=rev Log: [Attributor][NFC] clang format Modified: llvm/trunk/include/llvm/Transforms/IPO/Attributor.h llvm/trunk/lib/Transforms/IPO/Attributor.cpp Modified: llvm/trunk/include/llvm/Transforms/IPO/Attributor.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Transforms/IPO/Attributor.h?rev=374281&r1=374280&r2=374281&view=diff 
============================================================================== --- llvm/trunk/include/llvm/Transforms/IPO/Attributor.h (original) +++ llvm/trunk/include/llvm/Transforms/IPO/Attributor.h Wed Oct 9 22:34:21 2019 @@ -899,13 +899,12 @@ struct Attributor { const DataLayout &getDataLayout() const { return InfoCache.DL; } private: - /// The private version of getAAFor that allows to omit a querying abstract /// attribute. See also the public getAAFor method. template const AAType &getOrCreateAAFor(const IRPosition &IRP, - const AbstractAttribute *QueryingAA = nullptr, - bool TrackDependence = false) { + const AbstractAttribute *QueryingAA = nullptr, + bool TrackDependence = false) { if (const AAType *AAPtr = lookupAAFor(IRP, QueryingAA, TrackDependence)) return *AAPtr; @@ -1417,7 +1416,8 @@ struct AAReturnedValues const function_ref &)> &Pred) const = 0; - using iterator = MapVector>::iterator; + using iterator = + MapVector>::iterator; using const_iterator = MapVector>::const_iterator; virtual llvm::iterator_range returned_values() = 0; Modified: llvm/trunk/lib/Transforms/IPO/Attributor.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/IPO/Attributor.cpp?rev=374281&r1=374280&r2=374281&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/IPO/Attributor.cpp (original) +++ llvm/trunk/lib/Transforms/IPO/Attributor.cpp Wed Oct 9 22:34:21 2019 @@ -139,8 +139,8 @@ static cl::opt DepRecInterval( static cl::opt EnableHeapToStack("enable-heap-to-stack-conversion", cl::init(true), cl::Hidden); -static cl::opt MaxHeapToStackSize("max-heap-to-stack-size", - cl::init(128), cl::Hidden); +static cl::opt MaxHeapToStackSize("max-heap-to-stack-size", cl::init(128), + cl::Hidden); /// Logic operators for the change status enum class. 
/// From llvm-commits at lists.llvm.org Wed Oct 9 22:38:27 2019 From: llvm-commits at lists.llvm.org (Johannes Doerfert via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 05:38:27 +0000 (UTC) Subject: [PATCH] D68624: [Attributor] Handle `null` differently in capture and alias logic In-Reply-To: References: Message-ID: This revision was automatically updated to reflect the committed changes. Closed by commit rG72adda1740ca: [Attributor] Handle `null` differently in capture and alias logic (authored by jdoerfert). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68624/new/ https://reviews.llvm.org/D68624 Files: llvm/lib/Transforms/IPO/Attributor.cpp llvm/test/Transforms/FunctionAttrs/callbacks.ll llvm/test/Transforms/FunctionAttrs/nocapture.ll Index: llvm/test/Transforms/FunctionAttrs/nocapture.ll =================================================================== --- llvm/test/Transforms/FunctionAttrs/nocapture.ll +++ llvm/test/Transforms/FunctionAttrs/nocapture.ll @@ -320,5 +320,14 @@ ret i1 %2 } +declare void @unknown(i8*) +define void @test_callsite() { +entry: +; We know that 'null' in AS 0 does not alias anything and cannot be captured +; CHECK: call void @unknown(i8* noalias nocapture null) + call void @unknown(i8* null) + ret void +} + declare i8* @llvm.launder.invariant.group.p0i8(i8*) declare i8* @llvm.strip.invariant.group.p0i8(i8*) Index: llvm/test/Transforms/FunctionAttrs/callbacks.ll =================================================================== --- llvm/test/Transforms/FunctionAttrs/callbacks.ll +++ llvm/test/Transforms/FunctionAttrs/callbacks.ll @@ -24,7 +24,7 @@ ; CHECK-NEXT: [[TMP0:%.*]] = bitcast i32* [[B]] to i8* ; CHECK-NEXT: store i32 42, i32* [[B]], align 32 ; CHECK-NEXT: store i32* [[B]], i32** [[C]], align 64 -; CHECK-NEXT: call void (i32*, i32*, void (i32*, i32*, ...)*, ...) 
@t0_callback_broker(i32* null, i32* nonnull align 128 dereferenceable(4) [[PTR]], void (i32*, i32*, ...)* nonnull bitcast (void (i32*, i32*, i32*, i64, i32**)* @t0_callback_callee to void (i32*, i32*, ...)*), i32* [[A:%.*]], i64 99, i32** nonnull align 64 dereferenceable(8) [[C]]) +; CHECK-NEXT: call void (i32*, i32*, void (i32*, i32*, ...)*, ...) @t0_callback_broker(i32* noalias null, i32* nonnull align 128 dereferenceable(4) [[PTR]], void (i32*, i32*, ...)* nonnull bitcast (void (i32*, i32*, i32*, i64, i32**)* @t0_callback_callee to void (i32*, i32*, ...)*), i32* [[A:%.*]], i64 99, i32** nonnull align 64 dereferenceable(8) [[C]]) ; CHECK-NEXT: ret void ; entry: Index: llvm/lib/Transforms/IPO/Attributor.cpp =================================================================== --- llvm/lib/Transforms/IPO/Attributor.cpp +++ llvm/lib/Transforms/IPO/Attributor.cpp @@ -1907,7 +1907,11 @@ /// See AbstractAttribute::initialize(...). void initialize(Attributor &A) override { AANoAliasImpl::initialize(A); - if (isa(getAnchorValue())) + Value &Val = getAssociatedValue(); + if (isa(Val)) + indicateOptimisticFixpoint(); + if (isa(Val) && + Val.getType()->getPointerAddressSpace() == 0) indicateOptimisticFixpoint(); } @@ -1971,8 +1975,12 @@ // check only uses possibly executed before this callsite. auto &NoCaptureAA = A.getAAFor(*this, IRP); - if (!NoCaptureAA.isAssumedNoCaptureMaybeReturned()) + if (!NoCaptureAA.isAssumedNoCaptureMaybeReturned()) { + LLVM_DEBUG( + dbgs() << "[Attributor][AANoAliasCSArg] " << V + << " cannot be noalias as it is potentially captured\n"); return indicatePessimisticFixpoint(); + } // (iii) Check there is no other pointer argument which could alias with the // value. 
@@ -1986,13 +1994,15 @@ if (const Function *F = getAnchorScope()) { if (AAResults *AAR = A.getInfoCache().getAAResultsForFunction(*F)) { + bool IsAliasing = AAR->isNoAlias(&getAssociatedValue(), ArgOp); LLVM_DEBUG(dbgs() << "[Attributor][NoAliasCSArg] Check alias between " "callsite arguments " << AAR->isNoAlias(&getAssociatedValue(), ArgOp) << " " - << getAssociatedValue() << " " << *ArgOp << "\n"); + << getAssociatedValue() << " " << *ArgOp << " => " + << (IsAliasing ? "" : "no-") << "alias \n"); - if (AAR->isNoAlias(&getAssociatedValue(), ArgOp)) + if (IsAliasing) continue; } } @@ -2881,6 +2891,13 @@ void initialize(Attributor &A) override { AANoCapture::initialize(A); + // You cannot "capture" null in the default address space. + if (isa(getAssociatedValue()) && + getAssociatedValue().getType()->getPointerAddressSpace() == 0) { + indicateOptimisticFixpoint(); + return; + } + const IRPosition &IRP = getIRPosition(); const Function *F = getArgNo() >= 0 ? IRP.getAssociatedFunction() : IRP.getAnchorScope(); -------------- next part -------------- A non-text attachment was scrubbed... Name: D68624.224250.patch Type: text/x-patch Size: 4330 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 22:47:29 2019 From: llvm-commits at lists.llvm.org (Yonghong Song via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 05:47:29 +0000 (UTC) Subject: [PATCH] D68760: [BPF] Remove relocation for patchable externs Message-ID: yonghong-song created this revision. yonghong-song added reviewers: ast, anakryiko. Herald added subscribers: llvm-commits, hiraditya. Herald added a project: LLVM. Previously, patchable extern relocations are introduced to patch external variables used for multi versioning in compile once, run everywhere use case. The load instruction will be converted into a move with an patchable immediate which can be changed by bpf loader on the host. 
The kernel verifier has evolved and is able to load and propagate constant values, so compiler relocation becomes unnecessary. This patch removes the related code. Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D68760 Files: llvm/lib/Target/BPF/BPFAbstractMemberAccess.cpp llvm/lib/Target/BPF/BPFCORE.h llvm/lib/Target/BPF/BPFMISimplifyPatchable.cpp llvm/lib/Target/BPF/BTF.h llvm/lib/Target/BPF/BTFDebug.cpp llvm/lib/Target/BPF/BTFDebug.h llvm/test/CodeGen/BPF/BTF/binary-format.ll llvm/test/CodeGen/BPF/BTF/filename.ll llvm/test/CodeGen/BPF/BTF/func-func-ptr.ll llvm/test/CodeGen/BPF/BTF/func-non-void.ll llvm/test/CodeGen/BPF/BTF/func-source.ll llvm/test/CodeGen/BPF/BTF/func-typedef.ll llvm/test/CodeGen/BPF/BTF/func-unused-arg.ll llvm/test/CodeGen/BPF/BTF/func-void.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-basic.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-multilevel.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-struct-anonymous.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-struct-array.ll llvm/test/CodeGen/BPF/CORE/offset-reloc-union.ll llvm/test/CodeGen/BPF/CORE/patchable-extern-char.ll llvm/test/CodeGen/BPF/CORE/patchable-extern-uint.ll llvm/test/CodeGen/BPF/CORE/patchable-extern-ulonglong.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D68760.224251.patch Type: text/x-patch Size: 40041 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 23:05:44 2019 From: llvm-commits at lists.llvm.org (Rainer Orth via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 06:05:44 +0000 (UTC) Subject: [PATCH] D68741: test-release.sh s/http/https/ In-Reply-To: References: Message-ID: ro added a comment. Is there any point doing this now when the switch to git is imminent? 
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68741/new/ https://reviews.llvm.org/D68741 From llvm-commits at lists.llvm.org Wed Oct 9 23:05:45 2019 From: llvm-commits at lists.llvm.org (Mehdi AMINI via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 06:05:45 +0000 (UTC) Subject: [PATCH] D66840: docs/DeveloperPolicy: Add instructions for requesting GitHub commit access In-Reply-To: References: Message-ID: mehdi_amini added a comment. I think this would be useful to have in? ================ Comment at: llvm/docs/DeveloperPolicy.rst:415 + echo "$SVN_USERNAME:$GITHUB_USERNAME" >> trunk/github-usernames.txt + svn commit -m "Request commit access for $SVN_USERNAME" + ---------------- xbolva00 wrote: > Needs "cd trunk/" before svn commit, otherwise > > ~/TMP/tmp-llvm-svn$ svn commit -m " ... " > svn: E155007: '/home/xbolva00/TMP/tmp-llvm-svn' is not a working copy > ping? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D66840/new/ https://reviews.llvm.org/D66840 From llvm-commits at lists.llvm.org Wed Oct 9 23:07:01 2019 From: llvm-commits at lists.llvm.org (Rui Ueyama via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 06:07:01 +0000 (UTC) Subject: [PATCH] D68688: [LLD] [MinGW] Add a testcase for -l:name style library options. NFC. In-Reply-To: References: Message-ID: ruiu accepted this revision. ruiu added a comment. This revision is now accepted and ready to land. 
LGTM Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68688/new/ https://reviews.llvm.org/D68688 From llvm-commits at lists.llvm.org Wed Oct 9 23:14:49 2019 From: llvm-commits at lists.llvm.org (Dávid Bolvanský via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 06:14:49 +0000 (UTC) Subject: [PATCH] D66840: docs/DeveloperPolicy: Add instructions for requesting GitHub commit access In-Reply-To: References: Message-ID: <3077f7158d2ede2f73289357ef354846@localhost.localdomain> xbolva00 added inline comments. ================ Comment at: llvm/docs/DeveloperPolicy.rst:415 + echo "$SVN_USERNAME:$GITHUB_USERNAME" >> trunk/github-usernames.txt + svn commit -m "Request commit access for $SVN_USERNAME" + ---------------- mehdi_amini wrote: > xbolva00 wrote: > > Needs "cd trunk/" before svn commit, otherwise > > > > ~/TMP/tmp-llvm-svn$ svn commit -m " ... " > > svn: E155007: '/home/xbolva00/TMP/tmp-llvm-svn' is not a working copy > > > ping? ping what? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D66840/new/ https://reviews.llvm.org/D66840 From llvm-commits at lists.llvm.org Wed Oct 9 23:14:49 2019 From: llvm-commits at lists.llvm.org (Dávid Bolvanský via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 06:14:49 +0000 (UTC) Subject: [PATCH] D67986: [InstCombine] snprintf (d, size, "%s", s) -> memccpy (d, s, '\0', size - 1), d[size - 1] = 0 In-Reply-To: References: Message-ID: <061b9ce2c18912f7814b16a1fe828f19@localhost.localdomain> xbolva00 abandoned this revision. xbolva00 added a comment. In D67986#1702901, @MaskRay wrote: > This transformation seems to increase code size significantly. Is the snprintf "%s" pattern common enough? I suspect most projects have already used memccpy, stpncpy, strscpy, or strlcpy. For the few that don't, the performance probably does not matter. Yes, quite common. 
But okay, if you don't want it, let's just abandon it. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67986/new/ https://reviews.llvm.org/D67986 From llvm-commits at lists.llvm.org Wed Oct 9 23:14:49 2019 From: llvm-commits at lists.llvm.org (serge via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 06:14:49 +0000 (UTC) Subject: [PATCH] D68720: Support -fstack-clash-protection for x86 In-Reply-To: References: Message-ID: serge-sans-paille updated this revision to Diff 224254. serge-sans-paille added a comment. Added documentation + release note entry Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68720/new/ https://reviews.llvm.org/D68720 Files: clang/docs/ReleaseNotes.rst clang/include/clang/Basic/CodeGenOptions.def clang/include/clang/Basic/DiagnosticFrontendKinds.td clang/include/clang/Basic/TargetInfo.h clang/include/clang/Driver/CC1Options.td clang/include/clang/Driver/Options.td clang/lib/Basic/Targets/X86.h clang/lib/CodeGen/CGStmt.cpp clang/lib/CodeGen/CodeGenModule.cpp clang/lib/Driver/ToolChains/Clang.cpp clang/lib/Frontend/CompilerInvocation.cpp clang/test/CodeGen/stack-clash-protection.c clang/test/Driver/stack-clash-protection.c llvm/docs/ReleaseNotes.rst llvm/lib/Target/X86/X86FrameLowering.cpp llvm/lib/Target/X86/X86FrameLowering.h llvm/lib/Target/X86/X86ISelLowering.cpp llvm/lib/Target/X86/X86ISelLowering.h llvm/lib/Target/X86/X86InstrCompiler.td llvm/lib/Target/X86/X86InstrInfo.td llvm/test/CodeGen/X86/stack-clash-dynamic-alloca.ll llvm/test/CodeGen/X86/stack-clash-medium-natural-probes.ll llvm/test/CodeGen/X86/stack-clash-medium.ll llvm/test/CodeGen/X86/stack-clash-small.ll -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68720.224254.patch Type: text/x-patch Size: 29882 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 23:14:49 2019 From: llvm-commits at lists.llvm.org (Johannes Doerfert via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 06:14:49 +0000 (UTC) Subject: [PATCH] D68008: [Attributor] Use abstract call sites to determine associated arguments In-Reply-To: References: Message-ID: jdoerfert updated this revision to Diff 224255. jdoerfert added a comment. Add no-alias restriction and test Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68008/new/ https://reviews.llvm.org/D68008 Files: llvm/include/llvm/IR/CallSite.h llvm/include/llvm/Transforms/IPO/Attributor.h llvm/lib/IR/AbstractCallSite.cpp llvm/lib/Transforms/IPO/Attributor.cpp llvm/test/Transforms/FunctionAttrs/callbacks.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D68008.224255.patch Type: text/x-patch Size: 17354 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 23:25:01 2019 From: llvm-commits at lists.llvm.org (Craig Topper via llvm-commits) Date: Thu, 10 Oct 2019 06:25:01 -0000 Subject: [llvm] r374283 - [X86] Add test case for trunc_packus_v16i32_v16i8 with avx512vl+avx512bw and prefer-vector-width=256 and min-legal-vector-width=256. NFC Message-ID: <20191010062501.2E71987CAF@lists.llvm.org> Author: ctopper Date: Wed Oct 9 23:25:00 2019 New Revision: 374283 URL: http://llvm.org/viewvc/llvm-project?rev=374283&view=rev Log: [X86] Add test case for trunc_packus_v16i32_v16i8 with avx512vl+avx512bw and prefer-vector-width=256 and min-legal-vector-width=256. 
NFC Modified: llvm/trunk/test/CodeGen/X86/min-legal-vector-width.ll Modified: llvm/trunk/test/CodeGen/X86/min-legal-vector-width.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/min-legal-vector-width.ll?rev=374283&r1=374282&r2=374283&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/min-legal-vector-width.ll (original) +++ llvm/trunk/test/CodeGen/X86/min-legal-vector-width.ll Wed Oct 9 23:25:00 2019 @@ -1079,3 +1079,23 @@ define void @vselect_split_v16i16_setcc( store <16 x i32> %b, <16 x i32>* %r ret void } + +define <16 x i8> @trunc_packus_v16i32_v16i8(<16 x i32>* %p, <16 x i8>* %q) "min-legal-vector-width"="256" { +; CHECK-LABEL: trunc_packus_v16i32_v16i8: +; CHECK: # %bb.0: +; CHECK-NEXT: vpxor %xmm0, %xmm0, %xmm0 +; CHECK-NEXT: vpmaxsd 32(%rdi), %ymm0, %ymm1 +; CHECK-NEXT: vpmovusdb %ymm1, %xmm1 +; CHECK-NEXT: vpmaxsd (%rdi), %ymm0, %ymm0 +; CHECK-NEXT: vpmovusdb %ymm0, %xmm0 +; CHECK-NEXT: vpunpcklqdq {{.*#+}} xmm0 = xmm0[0],xmm1[0] +; CHECK-NEXT: vzeroupper +; CHECK-NEXT: retq + %a = load <16 x i32>, <16 x i32>* %p + %b = icmp slt <16 x i32> %a, + %c = select <16 x i1> %b, <16 x i32> %a, <16 x i32> + %d = icmp sgt <16 x i32> %c, zeroinitializer + %e = select <16 x i1> %d, <16 x i32> %c, <16 x i32> zeroinitializer + %f = trunc <16 x i32> %e to <16 x i8> + ret <16 x i8> %f +} From llvm-commits at lists.llvm.org Wed Oct 9 23:43:47 2019 From: llvm-commits at lists.llvm.org (Martin Storsjö via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 06:43:47 +0000 (UTC) Subject: [PATCH] D68689: [LLD] [MinGW] Look for other library patterns with -l In-Reply-To: References: Message-ID: <5bb08d3e878f884793621bcdd6898a20@localhost.localdomain> mstorsjo updated this revision to Diff 224260. mstorsjo edited the summary of this revision. mstorsjo added a comment. 
Changed to use error() instead of fatal() in the added lines of code, and in another place in the same function. Added an `if (errorCount()) return false;` at the end of the MinGW driver, to avoid actually trying to start the linking if errors were reported at this stage. All other existing error reporting in the MinGW driver is still using fatal(), but changing that is a different matter. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68689/new/ https://reviews.llvm.org/D68689 Files: lld/MinGW/Driver.cpp lld/test/MinGW/lib.test Index: lld/test/MinGW/lib.test =================================================================== --- lld/test/MinGW/lib.test +++ lld/test/MinGW/lib.test @@ -26,3 +26,16 @@ RUN: ld.lld -### -m i386pep -Bstatic -lfoo -Bdynamic -lbar -L%t/lib | FileCheck -check-prefix=LIB5 %s LIB5: libfoo.a LIB5-SAME: libbar.dll.a + +RUN: echo > %t/lib/noprefix.dll.a +RUN: echo > %t/lib/msvcstyle.lib +RUN: ld.lld -### -m i386pep -L%t/lib -lnoprefix -lmsvcstyle | FileCheck -check-prefix=OTHERSTYLES %s +OTHERSTYLES: noprefix.dll.a +OTHERSTYLES-SAME: msvcstyle.lib + +RUN: echo > %t/lib/libnoimplib.dll +RUN: echo > %t/lib/noprefix_noimplib.dll +RUN: not ld.lld -### -m i386pep -L%t/lib -lnoimplib 2>&1 | FileCheck -check-prefix=UNSUPPORTED-DLL1 %s +RUN: not ld.lld -### -m i386pep -L%t/lib -lnoprefix_noimplib 2>&1 | FileCheck -check-prefix=UNSUPPORTED-DLL2 %s +UNSUPPORTED-DLL1: lld doesn't support linking directly against {{.*}}libnoimplib.dll, use an import library +UNSUPPORTED-DLL2: lld doesn't support linking directly against {{.*}}noprefix_noimplib.dll, use an import library Index: lld/MinGW/Driver.cpp =================================================================== --- lld/MinGW/Driver.cpp +++ lld/MinGW/Driver.cpp @@ -125,17 +125,36 @@ for (StringRef dir : searchPaths) if (Optional s = findFile(dir, name.substr(1))) return *s; - fatal("unable to find library -l" + name); + error("unable to find library -l" + name); + return ""; } for (StringRef dir : 
searchPaths) { - if (!bStatic) + if (!bStatic) { if (Optional s = findFile(dir, "lib" + name + ".dll.a")) return *s; + if (Optional s = findFile(dir, name + ".dll.a")) + return *s; + } if (Optional s = findFile(dir, "lib" + name + ".a")) return *s; + if (!bStatic) { + if (Optional s = findFile(dir, name + ".lib")) + return *s; + if (Optional s = findFile(dir, "lib" + name + ".dll")) { + error("lld doesn't support linking directly against " + *s + + ", use an import library"); + return ""; + } + if (Optional s = findFile(dir, name + ".dll")) { + error("lld doesn't support linking directly against " + *s + + ", use an import library"); + return ""; + } + } } - fatal("unable to find library -l" + name); + error("unable to find library -l" + name); + return ""; } // Convert Unix-ish command line arguments to Windows-ish ones and @@ -352,5 +371,7 @@ std::vector vec; for (const std::string &s : linkArgs) vec.push_back(s.c_str()); + if (errorCount()) + return false; return coff::link(vec, true); } -------------- next part -------------- A non-text attachment was scrubbed... Name: D68689.224260.patch Type: text/x-patch Size: 2785 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 23:43:58 2019 From: llvm-commits at lists.llvm.org (Craig Topper via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 06:43:58 +0000 (UTC) Subject: [PATCH] D68763: [X86] Use packusdw+vpmovuswb to implement v16i32->V16i8 that clamps signed inputs to be between 0 and 255 when zmm registers are disabled on SKX. Message-ID: craig.topper created this revision. craig.topper added reviewers: RKSimon, spatel. Herald added a subscriber: hiraditya. Herald added a project: LLVM. If we've disabled zmm registers, the v16i32 will need to be split. This split will propagate through the min/max and the truncate. This creates two sequences that need to be concatenated back to v16i8. 
We can instead use packusdw to do part of the clamping, truncating, and concatenating all at once. 
Then we can use a vpmovuswb to finish off the clamp. https://reviews.llvm.org/D68763 Files: llvm/lib/Target/X86/X86ISelLowering.cpp llvm/test/CodeGen/X86/min-legal-vector-width.ll Index: llvm/test/CodeGen/X86/min-legal-vector-width.ll =================================================================== --- llvm/test/CodeGen/X86/min-legal-vector-width.ll +++ llvm/test/CodeGen/X86/min-legal-vector-width.ll @@ -1083,12 +1083,10 @@ define <16 x i8> @trunc_packus_v16i32_v16i8(<16 x i32>* %p, <16 x i8>* %q) "min-legal-vector-width"="256" { ; CHECK-LABEL: trunc_packus_v16i32_v16i8: ; CHECK: # %bb.0: -; CHECK-NEXT: vpxor %xmm0, %xmm0, %xmm0 -; CHECK-NEXT: vpmaxsd 32(%rdi), %ymm0, %ymm1 -; CHECK-NEXT: vpmovusdb %ymm1, %xmm1 -; CHECK-NEXT: vpmaxsd (%rdi), %ymm0, %ymm0 -; CHECK-NEXT: vpmovusdb %ymm0, %xmm0 -; CHECK-NEXT: vpunpcklqdq {{.*#+}} xmm0 = xmm0[0],xmm1[0] +; CHECK-NEXT: vmovdqa (%rdi), %ymm0 +; CHECK-NEXT: vpackusdw 32(%rdi), %ymm0, %ymm0 +; CHECK-NEXT: vpermq {{.*#+}} ymm0 = ymm0[0,2,1,3] +; CHECK-NEXT: vpmovuswb %ymm0, %xmm0 ; CHECK-NEXT: vzeroupper ; CHECK-NEXT: retq %a = load <16 x i32>, <16 x i32>* %p Index: llvm/lib/Target/X86/X86ISelLowering.cpp =================================================================== --- llvm/lib/Target/X86/X86ISelLowering.cpp +++ llvm/lib/Target/X86/X86ISelLowering.cpp @@ -39841,6 +39841,21 @@ if (auto USatVal = detectUSatPattern(In, VT, DAG, DL)) return DAG.getNode(X86ISD::VTRUNCUS, DL, VT, USatVal); } + + // If we're clamping a signed 32-bit vector to 0-255 and the 32-bit vector is + // split across two registers. We can use a packusdw+perm to clamp to 0-65535 + // and concatenate at the same time. Then we can use a final vpmovuswb to + // clip to 0-255. + if (Subtarget.hasBWI() && !Subtarget.useAVX512Regs() && + InVT == MVT::v16i32 && VT == MVT::v16i8) { + if (auto USatVal = detectSSatPattern(In, VT, true)) { + // Emit a VPACKUSDW+VPERMQ followed by a VPMOVUSWB. 
+ SDValue Mid = truncateVectorWithPACK(X86ISD::PACKUS, MVT::v16i16, USatVal, + DL, DAG, Subtarget); + return DAG.getNode(X86ISD::VTRUNCUS, DL, VT, Mid); + } + } + if (VT.isVector() && isPowerOf2_32(VT.getVectorNumElements()) && !(Subtarget.hasAVX512() && InSVT == MVT::i32) && !(Subtarget.hasBWI() && InSVT == MVT::i16) && -------------- next part -------------- A non-text attachment was scrubbed... Name: D68763.224258.patch Type: text/x-patch Size: 2231 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 23:52:58 2019 From: llvm-commits at lists.llvm.org (Fangrui Song via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 06:52:58 +0000 (UTC) Subject: [PATCH] D68758: Improve error message for bad SHF_MERGE sections In-Reply-To: References: Message-ID: MaskRay added inline comments. ================ Comment at: lld/ELF/InputFiles.cpp:520 + fatal(toString(this) + ":(" + name + + "): SHF_MERGE section size must be a multiple of sh_entsize"); ---------------- Make it a bit more detailed, e.g. section size (1337) must be a multiple of sh_entsize (4) Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68758/new/ https://reviews.llvm.org/D68758 From llvm-commits at lists.llvm.org Wed Oct 9 23:52:58 2019 From: llvm-commits at lists.llvm.org (Johannes Doerfert via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 06:52:58 +0000 (UTC) Subject: [PATCH] D68626: [Attributor] Use undef for calls with unused arguments. In-Reply-To: References: Message-ID: jdoerfert updated this revision to Diff 224262. jdoerfert added a comment. 
Update tests, keep it in ValueSimplify for now Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68626/new/ https://reviews.llvm.org/D68626 Files: llvm/lib/Transforms/IPO/Attributor.cpp llvm/test/Transforms/FunctionAttrs/align.ll llvm/test/Transforms/FunctionAttrs/callbacks.ll llvm/test/Transforms/FunctionAttrs/internal-noalias.ll llvm/test/Transforms/FunctionAttrs/liveness.ll llvm/test/Transforms/FunctionAttrs/noalias_returned.ll llvm/test/Transforms/FunctionAttrs/nonnull.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D68626.224262.patch Type: text/x-patch Size: 14929 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Wed Oct 9 23:52:59 2019 From: llvm-commits at lists.llvm.org (Jian Cai via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 06:52:59 +0000 (UTC) Subject: [PATCH] D68764: [ARM][AsmParser] handles offset expression in parentheses Message-ID: jcai19 created this revision. Herald added subscribers: llvm-commits, hiraditya, kristof.beyls. Herald added a project: LLVM. jcai19 added a reviewer: nickdesaulniers. jcai19 added subscribers: manojgupta, llozano. The integrated assembler does not accept offset expressions surrounded by parentheses. Handle this case for GAS compatibility.
https://bugs.llvm.org/show_bug.cgi?id=43631 Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D68764 Files: llvm/lib/Target/ARM/AsmParser/ARMAsmParser.cpp llvm/test/MC/ARM/gas-compl.s Index: llvm/test/MC/ARM/gas-compl.s =================================================================== --- /dev/null +++ llvm/test/MC/ARM/gas-compl.s @@ -0,0 +1,8 @@ +@ RUN: llvm-mc -triple=arm < %s | FileCheck %s + +@ CHECK: ldr r12, [sp, #15] +.syntax unified + ldr r12, [sp, #(15)] +@ CHECK: ldr r12, [sp, #40] +.syntax unified + ldr r12, [sp, #(15+5*5)] Index: llvm/lib/Target/ARM/AsmParser/ARMAsmParser.cpp =================================================================== --- llvm/lib/Target/ARM/AsmParser/ARMAsmParser.cpp +++ llvm/lib/Target/ARM/AsmParser/ARMAsmParser.cpp @@ -5734,19 +5734,24 @@ } // If we have a '#', it's an immediate offset, else assume it's a register - // offset. Be friendly and also accept a plain integer (without a leading - // hash) for gas compatibility. + // offset. Be friendly and also accept a plain integer or expression (without + // a leading hash) for gas compatibility. if (Parser.getTok().is(AsmToken::Hash) || Parser.getTok().is(AsmToken::Dollar) || + Parser.getTok().is(AsmToken::LParen) || Parser.getTok().is(AsmToken::Integer)) { - if (Parser.getTok().isNot(AsmToken::Integer)) - Parser.Lex(); // Eat '#' or '$'. + bool StartsWithParen = false; + if (Parser.getTok().isNot(AsmToken::Integer)) { + StartsWithParen = Parser.getTok().is(AsmToken::LParen); + Parser.Lex(); // Eat '#' or '$' or '(' + } E = Parser.getTok().getLoc(); bool isNegative = getParser().getTok().is(AsmToken::Minus); const MCExpr *Offset; - if (getParser().parseExpression(Offset)) + if (getParser().parseExpression(Offset)) { return true; + } // The expression has to be a constant. Memory references with relocations // don't come through here, as they use the

    Header::decode(Da H.StrtabOffset = Data.getU32(&Offset); H.StrtabSize = Data.getU32(&Offset); Data.getU8(&Offset, H.UUID, GSYM_MAX_UUID_SIZE); - llvm::Error Err = getHeaderError(H); - if (Err) + if (llvm::Error Err = H.checkForError()) return std::move(Err); return H; } llvm::Error Header::encode(FileWriter &O) const { // Users must verify the Header is valid prior to calling this funtion. - llvm::Error Err = getHeaderError(*this); - if (Err) + if (llvm::Error Err = checkForError()) return Err; O.writeU32(Magic); O.writeU16(Version); Modified: llvm/trunk/unittests/DebugInfo/GSYM/CMakeLists.txt URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/unittests/DebugInfo/GSYM/CMakeLists.txt?rev=374381&r1=374380&r2=374381&view=diff ============================================================================== --- llvm/trunk/unittests/DebugInfo/GSYM/CMakeLists.txt (original) +++ llvm/trunk/unittests/DebugInfo/GSYM/CMakeLists.txt Thu Oct 10 10:10:11 2019 @@ -1,5 +1,6 @@ set(LLVM_LINK_COMPONENTS DebugInfoGSYM + MC Support ) Modified: llvm/trunk/unittests/DebugInfo/GSYM/GSYMTest.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/unittests/DebugInfo/GSYM/GSYMTest.cpp?rev=374381&r1=374380&r2=374381&view=diff ============================================================================== --- llvm/trunk/unittests/DebugInfo/GSYM/GSYMTest.cpp (original) +++ llvm/trunk/unittests/DebugInfo/GSYM/GSYMTest.cpp Thu Oct 10 10:10:11 2019 @@ -13,6 +13,8 @@ #include "llvm/DebugInfo/GSYM/FileEntry.h" #include "llvm/DebugInfo/GSYM/FileWriter.h" #include "llvm/DebugInfo/GSYM/FunctionInfo.h" +#include "llvm/DebugInfo/GSYM/GsymCreator.h" +#include "llvm/DebugInfo/GSYM/GsymReader.h" #include "llvm/DebugInfo/GSYM/InlineInfo.h" #include "llvm/DebugInfo/GSYM/Range.h" #include "llvm/DebugInfo/GSYM/StringTable.h" @@ -1046,3 +1048,255 @@ TEST(GSYMTest, TestHeaderEncodeDecode) { TestHeaderEncodeDecode(H, llvm::support::little); TestHeaderEncodeDecode(H, llvm::support::big); } + +static void 
TestGsymCreatorEncodeError(llvm::support::endianness ByteOrder, + const GsymCreator &GC, + std::string ExpectedErrorMsg) { + SmallString<512> Str; + raw_svector_ostream OutStrm(Str); + FileWriter FW(OutStrm, ByteOrder); + llvm::Error Err = GC.encode(FW); + ASSERT_TRUE(bool(Err)); + checkError(ExpectedErrorMsg, std::move(Err)); +} + +TEST(GSYMTest, TestGsymCreatorEncodeErrors) { + const uint8_t ValidUUID[] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, + 14, 15, 16}; + const uint8_t InvalidUUID[] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, + 14, 15, 16, 17, 18, 19, 20, 21}; + // Verify we get an error when trying to encode an GsymCreator with no + // function infos. We shouldn't be saving a GSYM file in this case since + // there is nothing inside of it. + GsymCreator GC; + TestGsymCreatorEncodeError(llvm::support::little, GC, + "no functions to encode"); + const uint64_t FuncAddr = 0x1000; + const uint64_t FuncSize = 0x100; + const uint32_t FuncName = GC.insertString("foo"); + // Verify we get an error trying to encode a GsymCreator that isn't + // finalized. + GC.addFunctionInfo(FunctionInfo(FuncAddr, FuncSize, FuncName)); + TestGsymCreatorEncodeError(llvm::support::little, GC, + "GsymCreator wasn't finalized prior to encoding"); + std::string finalizeIssues; + raw_string_ostream OS(finalizeIssues); + llvm::Error finalizeErr = GC.finalize(OS); + ASSERT_FALSE(bool(finalizeErr)); + finalizeErr = GC.finalize(OS); + ASSERT_TRUE(bool(finalizeErr)); + checkError("already finalized", std::move(finalizeErr)); + // Verify we get an error trying to encode a GsymCreator with a UUID that is + // too long. + GC.setUUID(InvalidUUID); + TestGsymCreatorEncodeError(llvm::support::little, GC, + "invalid UUID size 21"); + GC.setUUID(ValidUUID); + // Verify errors are propagated when we try to encoding an invalid line + // table. + GC.forEachFunctionInfo([](FunctionInfo &FI) -> bool { + FI.OptLineTable = LineTable(); // Invalid line table. 
+ return false; // Stop iterating + }); + TestGsymCreatorEncodeError(llvm::support::little, GC, + "attempted to encode invalid LineTable object"); + // Verify errors are propagated when we try to encoding an invalid inline + // info. + GC.forEachFunctionInfo([](FunctionInfo &FI) -> bool { + FI.OptLineTable = llvm::None; + FI.Inline = InlineInfo(); // Invalid InlineInfo. + return false; // Stop iterating + }); + TestGsymCreatorEncodeError(llvm::support::little, GC, + "attempted to encode invalid InlineInfo object"); +} + +static void Compare(const GsymCreator &GC, const GsymReader &GR) { + // Verify that all of the data in a GsymCreator is correctly decoded from + // a GsymReader. To do this, we iterator over + GC.forEachFunctionInfo([&](const FunctionInfo &FI) -> bool { + auto DecodedFI = GR.getFunctionInfo(FI.Range.Start); + EXPECT_TRUE(bool(DecodedFI)); + EXPECT_EQ(FI, *DecodedFI); + return true; // Keep iterating over all FunctionInfo objects. + }); +} + +static void TestEncodeDecode(const GsymCreator &GC, + support::endianness ByteOrder, uint16_t Version, + uint8_t AddrOffSize, uint64_t BaseAddress, + uint32_t NumAddresses, ArrayRef UUID) { + SmallString<512> Str; + raw_svector_ostream OutStrm(Str); + FileWriter FW(OutStrm, ByteOrder); + llvm::Error Err = GC.encode(FW); + ASSERT_FALSE((bool)Err); + Expected GR = GsymReader::copyBuffer(OutStrm.str()); + ASSERT_TRUE(bool(GR)); + const Header &Hdr = GR->getHeader(); + EXPECT_EQ(Hdr.Version, Version); + EXPECT_EQ(Hdr.AddrOffSize, AddrOffSize); + EXPECT_EQ(Hdr.UUIDSize, UUID.size()); + EXPECT_EQ(Hdr.BaseAddress, BaseAddress); + EXPECT_EQ(Hdr.NumAddresses, NumAddresses); + EXPECT_EQ(ArrayRef(Hdr.UUID, Hdr.UUIDSize), UUID); + Compare(GC, GR.get()); +} + +TEST(GSYMTest, TestGsymCreator1ByteAddrOffsets) { + uint8_t UUID[] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16}; + GsymCreator GC; + GC.setUUID(UUID); + constexpr uint64_t BaseAddr = 0x1000; + constexpr uint8_t AddrOffSize = 1; + const uint32_t Func1Name 
= GC.insertString("foo"); + const uint32_t Func2Name = GC.insertString("bar"); + GC.addFunctionInfo(FunctionInfo(BaseAddr+0x00, 0x10, Func1Name)); + GC.addFunctionInfo(FunctionInfo(BaseAddr+0x20, 0x10, Func2Name)); + Error Err = GC.finalize(llvm::nulls()); + ASSERT_FALSE(Err); + TestEncodeDecode(GC, llvm::support::little, + GSYM_VERSION, + AddrOffSize, + BaseAddr, + 2, // NumAddresses + ArrayRef(UUID)); + TestEncodeDecode(GC, llvm::support::big, + GSYM_VERSION, + AddrOffSize, + BaseAddr, + 2, // NumAddresses + ArrayRef(UUID)); +} + +TEST(GSYMTest, TestGsymCreator2ByteAddrOffsets) { + uint8_t UUID[] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16}; + GsymCreator GC; + GC.setUUID(UUID); + constexpr uint64_t BaseAddr = 0x1000; + constexpr uint8_t AddrOffSize = 2; + const uint32_t Func1Name = GC.insertString("foo"); + const uint32_t Func2Name = GC.insertString("bar"); + GC.addFunctionInfo(FunctionInfo(BaseAddr+0x000, 0x100, Func1Name)); + GC.addFunctionInfo(FunctionInfo(BaseAddr+0x200, 0x100, Func2Name)); + Error Err = GC.finalize(llvm::nulls()); + ASSERT_FALSE(Err); + TestEncodeDecode(GC, llvm::support::little, + GSYM_VERSION, + AddrOffSize, + BaseAddr, + 2, // NumAddresses + ArrayRef(UUID)); + TestEncodeDecode(GC, llvm::support::big, + GSYM_VERSION, + AddrOffSize, + BaseAddr, + 2, // NumAddresses + ArrayRef(UUID)); +} + +TEST(GSYMTest, TestGsymCreator4ByteAddrOffsets) { + uint8_t UUID[] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16}; + GsymCreator GC; + GC.setUUID(UUID); + constexpr uint64_t BaseAddr = 0x1000; + constexpr uint8_t AddrOffSize = 4; + const uint32_t Func1Name = GC.insertString("foo"); + const uint32_t Func2Name = GC.insertString("bar"); + GC.addFunctionInfo(FunctionInfo(BaseAddr+0x000, 0x100, Func1Name)); + GC.addFunctionInfo(FunctionInfo(BaseAddr+0x20000, 0x100, Func2Name)); + Error Err = GC.finalize(llvm::nulls()); + ASSERT_FALSE(Err); + TestEncodeDecode(GC, llvm::support::little, + GSYM_VERSION, + AddrOffSize, + BaseAddr, + 2, // 
NumAddresses + ArrayRef(UUID)); + TestEncodeDecode(GC, llvm::support::big, + GSYM_VERSION, + AddrOffSize, + BaseAddr, + 2, // NumAddresses + ArrayRef(UUID)); +} + +TEST(GSYMTest, TestGsymCreator8ByteAddrOffsets) { + uint8_t UUID[] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16}; + GsymCreator GC; + GC.setUUID(UUID); + constexpr uint64_t BaseAddr = 0x1000; + constexpr uint8_t AddrOffSize = 8; + const uint32_t Func1Name = GC.insertString("foo"); + const uint32_t Func2Name = GC.insertString("bar"); + GC.addFunctionInfo(FunctionInfo(BaseAddr+0x000, 0x100, Func1Name)); + GC.addFunctionInfo(FunctionInfo(BaseAddr+0x100000000, 0x100, Func2Name)); + Error Err = GC.finalize(llvm::nulls()); + ASSERT_FALSE(Err); + TestEncodeDecode(GC, llvm::support::little, + GSYM_VERSION, + AddrOffSize, + BaseAddr, + 2, // NumAddresses + ArrayRef(UUID)); + TestEncodeDecode(GC, llvm::support::big, + GSYM_VERSION, + AddrOffSize, + BaseAddr, + 2, // NumAddresses + ArrayRef(UUID)); +} + +static void VerifyFunctionInfo(const GsymReader &GR, uint64_t Addr, + const FunctionInfo &FI) { + auto ExpFI = GR.getFunctionInfo(Addr); + ASSERT_TRUE(bool(ExpFI)); + ASSERT_EQ(FI, ExpFI.get()); +} + +static void VerifyFunctionInfoError(const GsymReader &GR, uint64_t Addr, + std::string ErrMessage) { + auto ExpFI = GR.getFunctionInfo(Addr); + ASSERT_FALSE(bool(ExpFI)); + checkError(ErrMessage, ExpFI.takeError()); +} + +TEST(GSYMTest, TestGsymReader) { + uint8_t UUID[] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16}; + GsymCreator GC; + GC.setUUID(UUID); + constexpr uint64_t BaseAddr = 0x1000; + constexpr uint64_t Func1Addr = BaseAddr; + constexpr uint64_t Func2Addr = BaseAddr+0x20; + constexpr uint64_t FuncSize = 0x10; + const uint32_t Func1Name = GC.insertString("foo"); + const uint32_t Func2Name = GC.insertString("bar"); + const auto ByteOrder = support::endian::system_endianness(); + GC.addFunctionInfo(FunctionInfo(Func1Addr, FuncSize, Func1Name)); + 
GC.addFunctionInfo(FunctionInfo(Func2Addr, FuncSize, Func2Name)); + Error FinalizeErr = GC.finalize(llvm::nulls()); + ASSERT_FALSE(FinalizeErr); + SmallString<512> Str; + raw_svector_ostream OutStrm(Str); + FileWriter FW(OutStrm, ByteOrder); + llvm::Error Err = GC.encode(FW); + ASSERT_FALSE((bool)Err); + if (auto ExpectedGR = GsymReader::copyBuffer(OutStrm.str())) { + const GsymReader &GR = ExpectedGR.get(); + VerifyFunctionInfoError(GR, Func1Addr-1, "address 0xfff not in GSYM"); + + FunctionInfo Func1(Func1Addr, FuncSize, Func1Name); + VerifyFunctionInfo(GR, Func1Addr, Func1); + VerifyFunctionInfo(GR, Func1Addr+1, Func1); + VerifyFunctionInfo(GR, Func1Addr+FuncSize-1, Func1); + VerifyFunctionInfoError(GR, Func1Addr+FuncSize, + "address 0x1010 not in GSYM"); + VerifyFunctionInfoError(GR, Func2Addr-1, "address 0x101f not in GSYM"); + FunctionInfo Func2(Func2Addr, FuncSize, Func2Name); + VerifyFunctionInfo(GR, Func2Addr, Func2); + VerifyFunctionInfo(GR, Func2Addr+1, Func2); + VerifyFunctionInfo(GR, Func2Addr+FuncSize-1, Func2); + VerifyFunctionInfoError(GR, Func2Addr+FuncSize, + "address 0x1030 not in GSYM"); + } +} From llvm-commits at lists.llvm.org Thu Oct 10 10:14:20 2019 From: llvm-commits at lists.llvm.org (GN Sync Bot via llvm-commits) Date: Thu, 10 Oct 2019 17:14:20 -0000 Subject: [llvm] r374383 - gn build: Merge r374381 Message-ID: <20191010171420.E6846858CA@lists.llvm.org> Author: gnsyncbot Date: Thu Oct 10 10:14:20 2019 New Revision: 374383 URL: http://llvm.org/viewvc/llvm-project?rev=374383&view=rev Log: gn build: Merge r374381 Modified: llvm/trunk/utils/gn/secondary/llvm/lib/DebugInfo/GSYM/BUILD.gn Modified: llvm/trunk/utils/gn/secondary/llvm/lib/DebugInfo/GSYM/BUILD.gn URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/gn/secondary/llvm/lib/DebugInfo/GSYM/BUILD.gn?rev=374383&r1=374382&r2=374383&view=diff ============================================================================== --- 
llvm/trunk/utils/gn/secondary/llvm/lib/DebugInfo/GSYM/BUILD.gn (original) +++ llvm/trunk/utils/gn/secondary/llvm/lib/DebugInfo/GSYM/BUILD.gn Thu Oct 10 10:14:20 2019 @@ -6,6 +6,8 @@ static_library("GSYM") { sources = [ "FileWriter.cpp", "FunctionInfo.cpp", + "GsymCreator.cpp", + "GsymReader.cpp", "Header.cpp", "InlineInfo.cpp", "LineTable.cpp", From llvm-commits at lists.llvm.org Thu Oct 10 10:13:34 2019 From: llvm-commits at lists.llvm.org (David Tenty via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 17:13:34 +0000 (UTC) Subject: [PATCH] D68815: [AIX] Use .space instead of .zero in assembly Message-ID: daltenty created this revision. Herald added subscribers: llvm-commits, jsji, MaskRay, kbarton, hiraditya, nemanjai. Herald added a project: LLVM. daltenty added reviewers: Xiangling_L, jasonliu, sfertile, DiggerLin. Herald added a subscriber: wuzish. The AIX system assembler does not understand .zero, so we should prefer emitting .space. Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D68815 Files: llvm/lib/Target/PowerPC/MCTargetDesc/PPCMCAsmInfo.cpp llvm/test/CodeGen/PowerPC/aix-space.ll Index: llvm/test/CodeGen/PowerPC/aix-space.ll =================================================================== --- /dev/null +++ llvm/test/CodeGen/PowerPC/aix-space.ll @@ -0,0 +1,17 @@ +; RUN: llc -verify-machineinstrs -O0 -mcpu=pwr7 -mtriple powerpc-ibm-aix-xcoff < %s | FileCheck %s + + at a = common global double 0.000000e+00, align 8 + +; Get some constants into the constant pool that need spacing for alignment +define void @e() { +entry: + %0 = load double, double* @a, align 8 + %mul = fmul double 1.500000e+00, %0 + store double %mul, double* @a, align 8 + %mul1 = fmul double 0x3F9C71C71C71C71C, %0 + store double %mul1, double* @a, align 8 + ret void +} + +; CHECK: .space 4 +; CHECK-NOT: .zero Index: llvm/lib/Target/PowerPC/MCTargetDesc/PPCMCAsmInfo.cpp =================================================================== --- 
llvm/lib/Target/PowerPC/MCTargetDesc/PPCMCAsmInfo.cpp +++ llvm/lib/Target/PowerPC/MCTargetDesc/PPCMCAsmInfo.cpp @@ -86,4 +86,5 @@ PPCXCOFFMCAsmInfo::PPCXCOFFMCAsmInfo(bool Is64Bit, const Triple &T) { assert(!IsLittleEndian && "Little-endian XCOFF not supported."); CodePointerSize = CalleeSaveStackSlotSize = Is64Bit ? 8 : 4; + ZeroDirective = "\t.space\t"; } -------------- next part -------------- A non-text attachment was scrubbed... Name: D68815.224397.patch Type: text/x-patch Size: 1221 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Thu Oct 10 10:13:35 2019 From: llvm-commits at lists.llvm.org (Jordan Rupprecht via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 17:13:35 +0000 (UTC) Subject: [PATCH] D68146: [FileCheck] Implement --ignore-case option. In-Reply-To: References: Message-ID: <649723e6adf94cfd02c5072de20fcec2@localhost.localdomain> rupprecht added a comment. In D68146#1703693 , @thakis wrote: > The test fails on Linux: http://lab.llvm.org:8011/builders/clang-cmake-x86_64-sde-avx512-linux/builds/28537/steps/ninja%20check%201/logs/FAIL%3A%20LLVM%3A%3Acheck-ignore-case.txt If it takes a while to fix please revert while you investigate. > > Also, https://github.com/llvm/llvm-project/commit/dfd2b6f07fc40a190335f580d8a965bbebfe94df looks like you touched ~all lines in docs/CommandGuide/FileCheck.rst and llvm/include/llvm/Support/FileCheck.h Maybe you converted them to windows line endings? If so, please undo that. (Maybe revert and reland with fixed line endings so that the diff for the actual change is readable.) As described in http://llvm.org/docs/GettingStarted.html#checkout-llvm-from-git, the right way to checkout the repository on windows is: % git clone --config core.autocrlf=false https://github.com/llvm/llvm-project.git This is the second time I've reviewed a change like this without realizing, would appreciate tips if this could be more visible in Phab somehow... 
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68146/new/ https://reviews.llvm.org/D68146 From llvm-commits at lists.llvm.org Thu Oct 10 10:13:35 2019 From: llvm-commits at lists.llvm.org (Greg Clayton via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 17:13:35 +0000 (UTC) Subject: [PATCH] D68744: [GSYM] Add GsymCreator and GsymReader. In-Reply-To: References: Message-ID: <815d8183fe19b3ab7edf3a9cdf9a7c8c@localhost.localdomain> clayborg added a comment. $ svn commit Sending include/llvm/DebugInfo/GSYM/FileWriter.h Adding include/llvm/DebugInfo/GSYM/GsymCreator.h Adding include/llvm/DebugInfo/GSYM/GsymReader.h Sending include/llvm/DebugInfo/GSYM/Header.h Sending lib/DebugInfo/GSYM/CMakeLists.txt Sending lib/DebugInfo/GSYM/FunctionInfo.cpp Adding lib/DebugInfo/GSYM/GsymCreator.cpp Adding lib/DebugInfo/GSYM/GsymReader.cpp Sending lib/DebugInfo/GSYM/Header.cpp Sending unittests/DebugInfo/GSYM/CMakeLists.txt Sending unittests/DebugInfo/GSYM/GSYMTest.cpp Transmitting file data ...........done Committing transaction... Committed revision 374381. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68744/new/ https://reviews.llvm.org/D68744 From llvm-commits at lists.llvm.org Thu Oct 10 10:13:36 2019 From: llvm-commits at lists.llvm.org (Hal Finkel via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 17:13:36 +0000 (UTC) Subject: [PATCH] D68793: [System Model] [TTI] Add TTI interfaces for write-combining buffers In-Reply-To: References: Message-ID: <69544db1484297cd6dbc2e9905cd91a8@localhost.localdomain> hfinkel added a comment. How do you imagine that we'd use this? Do we need some kind of size to go along with this? 
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68793/new/ https://reviews.llvm.org/D68793 From llvm-commits at lists.llvm.org Thu Oct 10 10:13:37 2019 From: llvm-commits at lists.llvm.org (Orlando Cazalet-Hyams via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 17:13:37 +0000 (UTC) Subject: [PATCH] D68816: [NFC] Replace a linked list in LiveDebugVariables pass with a DenseMap Message-ID: Orlando created this revision. Orlando added reviewers: aprantl, probinson, vsk, dblaikie. Orlando added a project: debug-info. Herald added subscribers: llvm-commits, hiraditya. Herald added a project: LLVM. Orlando added a comment. The numbers are much better than I expected but I haven't been able to prove myself wrong after a bunch of testing. So, I've put this patch up for the community's expert eyes. Before going into details of the patch, I think it's important to point out that it seems to reduce self-host build times of LLVM on my Linux VM by up to 12%:
------------------------------------------------------------------
Self-host build times with and without this patch
------------------------------------------------------------------
With this patch applied | Build type           | CPU time (mins)
------------------------------------------------------------------
No                      | RelWithDebInfo, asan | 226
Yes                     | RelWithDebInfo, asan | 218 (-3.5%)
No                      | RelWithDebInfo       | 183
Yes                     | RelWithDebInfo       | 161 (-12%)
In LiveDebugVariables.cpp: Prior to this patch, UserValues were grouped into linked list chains. Each chain was the union of two sets: { A: Matching Source variable } and { B: Matching virtual register }. Pointers to the heads (or 'leaders') of these chains were kept in two maps: one with the { Source variable } used as the key (set A predicate) and another with the { Virtual register } as the key (set B predicate).
There was a search through the chains in the function getUserValue looking for UserValues with matching { Source variable, Complex expression, Inlined-at location }. Essentially searching for a subset of A through two interleaved linked lists of set A and B. Importantly, by design, the subset will only contain one or zero elements here. That is to say a UserValue can be uniquely identified by the tuple { Source variable, Complex expression, Inlined-at location } if it exists. This patch removes the linked list and instead uses a DenseMap to map the tuple { Source variable, Complex expression, Inlined-at location } to UserValue ptrs so that the getUserValue search predicate is this map key. The virtual register map now maps a vreg to a SmallVector so that set B is still available for quick searches. Repository: rL LLVM https://reviews.llvm.org/D68816 Files: llvm/lib/CodeGen/LiveDebugVariables.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D68816.224396.patch Type: text/x-patch Size: 9760 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Thu Oct 10 10:13:37 2019 From: llvm-commits at lists.llvm.org (Orlando Cazalet-Hyams via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 17:13:37 +0000 (UTC) Subject: [PATCH] D68816: [NFC] Replace a linked list in LiveDebugVariables pass with a DenseMap In-Reply-To: References: Message-ID: Orlando added a comment. The numbers are much better than I expected but I haven't been able to prove myself wrong after a bunch of testing. So, I've put this patch up for the community's expert eyes. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68816/new/ https://reviews.llvm.org/D68816 From llvm-commits at lists.llvm.org Thu Oct 10 10:13:42 2019 From: llvm-commits at lists.llvm.org (Greg Clayton via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 17:13:42 +0000 (UTC) Subject: [PATCH] D68744: [GSYM] Add GsymCreator and GsymReader. 
In-Reply-To: References: Message-ID: This revision was automatically updated to reflect the committed changes. Closed by commit rG4b6c9de868cd: Add GsymCreator and GsymReader. (authored by clayborg). Herald added a subscriber: hiraditya. Herald added a project: LLVM. Changed prior to commit: https://reviews.llvm.org/D68744?vs=224372&id=224398#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68744/new/ https://reviews.llvm.org/D68744 Files: llvm/include/llvm/DebugInfo/GSYM/FileWriter.h llvm/include/llvm/DebugInfo/GSYM/GsymCreator.h llvm/include/llvm/DebugInfo/GSYM/GsymReader.h llvm/include/llvm/DebugInfo/GSYM/Header.h llvm/lib/DebugInfo/GSYM/CMakeLists.txt llvm/lib/DebugInfo/GSYM/FunctionInfo.cpp llvm/lib/DebugInfo/GSYM/GsymCreator.cpp llvm/lib/DebugInfo/GSYM/GsymReader.cpp llvm/lib/DebugInfo/GSYM/Header.cpp llvm/unittests/DebugInfo/GSYM/CMakeLists.txt llvm/unittests/DebugInfo/GSYM/GSYMTest.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D68744.224398.patch Type: text/x-patch Size: 57192 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Thu Oct 10 10:19:58 2019 From: llvm-commits at lists.llvm.org (Julian Lettner via llvm-commits) Date: Thu, 10 Oct 2019 17:19:58 -0000 Subject: [compiler-rt] r374384 - Reland "[ASan] Do not misrepresent high value address dereferences as null dereferences" Message-ID: <20191010171958.612CA889F2@lists.llvm.org> Author: yln Date: Thu Oct 10 10:19:58 2019 New Revision: 374384 URL: http://llvm.org/viewvc/llvm-project?rev=374384&view=rev Log: Reland "[ASan] Do not misrepresent high value address dereferences as null dereferences" Updated: Removed offending TODO comment. 
Dereferences with addresses above the 48-bit hardware addressable range produce "invalid instruction" (instead of "invalid access") hardware exceptions (there is no hardware address decoding logic for those bits), and the address provided by this exception is the address of the instruction (not the faulting address). The kernel maps the "invalid instruction" to SEGV, but fails to provide the real fault address. Because of this ASan lies and says that those cases are null dereferences. This downgrades the severity of a found bug in terms of security. In the ASan signal handler, we can not provide the real faulting address, but at least we can try not to lie. rdar://50366151 Reviewed By: vitalybuka Differential Revision: https://reviews.llvm.org/D68676 llvm-svn: 374265 Added: compiler-rt/trunk/test/asan/TestCases/Posix/high-address-dereference.c Modified: compiler-rt/trunk/lib/asan/asan_errors.h compiler-rt/trunk/lib/sanitizer_common/sanitizer_common.h compiler-rt/trunk/lib/sanitizer_common/sanitizer_linux.cpp compiler-rt/trunk/lib/sanitizer_common/sanitizer_mac.cpp compiler-rt/trunk/lib/sanitizer_common/sanitizer_symbolizer_report.cpp compiler-rt/trunk/lib/sanitizer_common/sanitizer_win.cpp Modified: compiler-rt/trunk/lib/asan/asan_errors.h URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/asan/asan_errors.h?rev=374384&r1=374383&r2=374384&view=diff ============================================================================== --- compiler-rt/trunk/lib/asan/asan_errors.h (original) +++ compiler-rt/trunk/lib/asan/asan_errors.h Thu Oct 10 10:19:58 2019 @@ -48,7 +48,8 @@ struct ErrorDeadlySignal : ErrorBase { scariness.Scare(10, "stack-overflow"); } else if (!signal.is_memory_access) { scariness.Scare(10, "signal"); - } else if (signal.addr < GetPageSizeCached()) { + } else if (signal.is_true_faulting_addr && + signal.addr < GetPageSizeCached()) { scariness.Scare(10, "null-deref"); } else if (signal.addr == signal.pc) { scariness.Scare(60, "wild-jump"); 
Modified: compiler-rt/trunk/lib/sanitizer_common/sanitizer_common.h URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/sanitizer_common/sanitizer_common.h?rev=374384&r1=374383&r2=374384&view=diff ============================================================================== --- compiler-rt/trunk/lib/sanitizer_common/sanitizer_common.h (original) +++ compiler-rt/trunk/lib/sanitizer_common/sanitizer_common.h Thu Oct 10 10:19:58 2019 @@ -881,6 +881,11 @@ struct SignalContext { bool is_memory_access; enum WriteFlag { UNKNOWN, READ, WRITE } write_flag; + // In some cases the kernel cannot provide the true faulting address; `addr` + // will be zero then. This field allows to distinguish between these cases + // and dereferences of null. + bool is_true_faulting_addr; + // VS2013 doesn't implement unrestricted unions, so we need a trivial default // constructor SignalContext() = default; @@ -893,7 +898,8 @@ struct SignalContext { context(context), addr(GetAddress()), is_memory_access(IsMemoryAccess()), - write_flag(GetWriteFlag()) { + write_flag(GetWriteFlag()), + is_true_faulting_addr(IsTrueFaultingAddress()) { InitPcSpBp(); } @@ -914,6 +920,7 @@ struct SignalContext { uptr GetAddress() const; WriteFlag GetWriteFlag() const; bool IsMemoryAccess() const; + bool IsTrueFaultingAddress() const; }; void InitializePlatformEarly(); Modified: compiler-rt/trunk/lib/sanitizer_common/sanitizer_linux.cpp URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/sanitizer_common/sanitizer_linux.cpp?rev=374384&r1=374383&r2=374384&view=diff ============================================================================== --- compiler-rt/trunk/lib/sanitizer_common/sanitizer_linux.cpp (original) +++ compiler-rt/trunk/lib/sanitizer_common/sanitizer_linux.cpp Thu Oct 10 10:19:58 2019 @@ -1849,6 +1849,12 @@ SignalContext::WriteFlag SignalContext:: #endif } +bool SignalContext::IsTrueFaultingAddress() const { + auto si = static_cast(siginfo); + // SIGSEGV signals without a true 
fault address have si_code set to 128. + return si->si_signo == SIGSEGV && si->si_code != 128; +} + void SignalContext::DumpAllRegisters(void *context) { // FIXME: Implement this. } Modified: compiler-rt/trunk/lib/sanitizer_common/sanitizer_mac.cpp URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/sanitizer_common/sanitizer_mac.cpp?rev=374384&r1=374383&r2=374384&view=diff ============================================================================== --- compiler-rt/trunk/lib/sanitizer_common/sanitizer_mac.cpp (original) +++ compiler-rt/trunk/lib/sanitizer_common/sanitizer_mac.cpp Thu Oct 10 10:19:58 2019 @@ -754,6 +754,12 @@ SignalContext::WriteFlag SignalContext:: #endif } +bool SignalContext::IsTrueFaultingAddress() const { + auto si = static_cast(siginfo); + // "Real" SIGSEGV codes (e.g., SEGV_MAPERR, SEGV_MAPERR) are non-zero. + return si->si_signo == SIGSEGV && si->si_code != 0; +} + static void GetPcSpBp(void *context, uptr *pc, uptr *sp, uptr *bp) { ucontext_t *ucontext = (ucontext_t*)context; # if defined(__aarch64__) Modified: compiler-rt/trunk/lib/sanitizer_common/sanitizer_symbolizer_report.cpp URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/sanitizer_common/sanitizer_symbolizer_report.cpp?rev=374384&r1=374383&r2=374384&view=diff ============================================================================== --- compiler-rt/trunk/lib/sanitizer_common/sanitizer_symbolizer_report.cpp (original) +++ compiler-rt/trunk/lib/sanitizer_common/sanitizer_symbolizer_report.cpp Thu Oct 10 10:19:58 2019 @@ -191,9 +191,14 @@ static void ReportDeadlySignalImpl(const SanitizerCommonDecorator d; Printf("%s", d.Warning()); const char *description = sig.Describe(); - Report("ERROR: %s: %s on unknown address %p (pc %p bp %p sp %p T%d)\n", - SanitizerToolName, description, (void *)sig.addr, (void *)sig.pc, - (void *)sig.bp, (void *)sig.sp, tid); + if (sig.is_memory_access && !sig.is_true_faulting_addr) + Report("ERROR: %s: %s on unknown address (pc 
%p bp %p sp %p T%d)\n", + SanitizerToolName, description, (void *)sig.pc, (void *)sig.bp, + (void *)sig.sp, tid); + else + Report("ERROR: %s: %s on unknown address %p (pc %p bp %p sp %p T%d)\n", + SanitizerToolName, description, (void *)sig.addr, (void *)sig.pc, + (void *)sig.bp, (void *)sig.sp, tid); Printf("%s", d.Default()); if (sig.pc < GetPageSizeCached()) Report("Hint: pc points to the zero page.\n"); @@ -203,7 +208,11 @@ static void ReportDeadlySignalImpl(const ? "WRITE" : (sig.write_flag == SignalContext::READ ? "READ" : "UNKNOWN"); Report("The signal is caused by a %s memory access.\n", access_type); - if (sig.addr < GetPageSizeCached()) + if (!sig.is_true_faulting_addr) + Report("Hint: this fault was caused by a dereference of a high value " + "address (see registers below). Disassemble the provided pc " + "to learn which register value was used.\n"); + else if (sig.addr < GetPageSizeCached()) Report("Hint: address points to the zero page.\n"); } MaybeReportNonExecRegion(sig.pc); Modified: compiler-rt/trunk/lib/sanitizer_common/sanitizer_win.cpp URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/sanitizer_common/sanitizer_win.cpp?rev=374384&r1=374383&r2=374384&view=diff ============================================================================== --- compiler-rt/trunk/lib/sanitizer_common/sanitizer_win.cpp (original) +++ compiler-rt/trunk/lib/sanitizer_common/sanitizer_win.cpp Thu Oct 10 10:19:58 2019 @@ -945,6 +945,11 @@ bool SignalContext::IsMemoryAccess() con return GetWriteFlag() != SignalContext::UNKNOWN; } +bool SignalContext::IsTrueFaultingAddress() const { + // FIXME: Provide real implementation for this. See Linux and Mac variants. 
+ return IsMemoryAccess(); +} + SignalContext::WriteFlag SignalContext::GetWriteFlag() const { EXCEPTION_RECORD *exception_record = (EXCEPTION_RECORD *)siginfo; // The contents of this array are documented at Added: compiler-rt/trunk/test/asan/TestCases/Posix/high-address-dereference.c URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/test/asan/TestCases/Posix/high-address-dereference.c?rev=374384&view=auto ============================================================================== --- compiler-rt/trunk/test/asan/TestCases/Posix/high-address-dereference.c (added) +++ compiler-rt/trunk/test/asan/TestCases/Posix/high-address-dereference.c Thu Oct 10 10:19:58 2019 @@ -0,0 +1,50 @@ +// On x86_64, the kernel does not provide the faulting address for dereferences +// of addresses greater than the 48-bit hardware addressable range, i.e., +// `siginfo.si_addr` is zero in ASan's SEGV signal handler. This test checks +// that ASan does not misrepresent such cases as "NULL dereferences". 
+ +// REQUIRES: x86_64-target-arch +// RUN: %clang_asan %s -o %t +// RUN: export %env_asan_opts=print_scariness=1 +// RUN: not %run %t 0x0000000000000000 2>&1 | FileCheck %s --check-prefixes=ZERO,HINT-PAGE0 +// RUN: not %run %t 0x0000000000000FFF 2>&1 | FileCheck %s --check-prefixes=LOW1,HINT-PAGE0 +// RUN: not %run %t 0x0000000000001000 2>&1 | FileCheck %s --check-prefixes=LOW2,HINT-NONE +// RUN: not %run %t 0x4141414141414141 2>&1 | FileCheck %s --check-prefixes=HIGH,HINT-HIGHADDR +// RUN: not %run %t 0xFFFFFFFFFFFFFFFF 2>&1 | FileCheck %s --check-prefixes=MAX,HINT-HIGHADDR + +#include <stdint.h> +#include <stdlib.h> + +int main(int argc, const char *argv[]) { + const char *hex = argv[1]; + uint64_t *addr = (uint64_t *)strtoull(hex, NULL, 16); + uint64_t x = *addr; // segmentation fault + return x; +} + +// ZERO: SEGV on unknown address 0x000000000000 (pc +// LOW1: SEGV on unknown address 0x000000000fff (pc +// LOW2: SEGV on unknown address 0x000000001000 (pc +// HIGH: SEGV on unknown address (pc +// MAX: SEGV on unknown address (pc + +// HINT-PAGE0-NOT: Hint: this fault was caused by a dereference of a high value address +// HINT-PAGE0: Hint: address points to the zero page. + +// HINT-NONE-NOT: Hint: this fault was caused by a dereference of a high value address +// HINT-NONE-NOT: Hint: address points to the zero page. + +// HINT-HIGHADDR: Hint: this fault was caused by a dereference of a high value address +// HINT-HIGHADDR-NOT: Hint: address points to the zero page. + +// ZERO: SCARINESS: 10 (null-deref) +// LOW1: SCARINESS: 10 (null-deref) +// LOW2: SCARINESS: 20 (wild-addr-read) +// HIGH: SCARINESS: 20 (wild-addr-read) +// MAX: SCARINESS: 20 (wild-addr-read) + +// TODO: Currently, register values are only printed on Mac. Once this changes, +// remove the 'TODO_' prefix in the following lines. 
+// TODO_HIGH,TODO_MAX: Register values: +// TODO_HIGH: = 0x4141414141414141 +// TODO_MAX: = 0xffffffffffffffff From llvm-commits at lists.llvm.org Thu Oct 10 10:23:07 2019 From: llvm-commits at lists.llvm.org (Eugene Leviant via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 17:23:07 +0000 (UTC) Subject: [PATCH] D67322: [LLD][ThinLTO] Handle GUID collision in import global processing In-Reply-To: References: Message-ID: <02c2984101604b4491428c277154d6ef@localhost.localdomain> evgeny777 added a comment. Is this the only place where bad things can happen? Maybe simply raise an error in `addGlobalValueSummary` when the new summary type is different from that of `SummaryList[0]`? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67322/new/ https://reviews.llvm.org/D67322 From llvm-commits at lists.llvm.org Thu Oct 10 10:23:07 2019 From: llvm-commits at lists.llvm.org (Julian Lettner via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 17:23:07 +0000 (UTC) Subject: [PATCH] D68676: [ASan] Do not misrepresent high value address dereferences as null dereferences In-Reply-To: References: Message-ID: <25d339a4118f78f0f6a3283b538f2829@localhost.localdomain> yln added a comment. Apologies for the failing bots. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68676/new/ https://reviews.llvm.org/D68676 From llvm-commits at lists.llvm.org Thu Oct 10 10:23:08 2019 From: llvm-commits at lists.llvm.org (Jinsong Ji via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 17:23:08 +0000 (UTC) Subject: [PATCH] D68817: [PowerPC][docs] Update IBM official docs in Compiler Writers Info page Message-ID: jsji created this revision. jsji added reviewers: PowerPC, hfinkel, nemanjai. Herald added subscribers: llvm-commits, shchenz. Herald added a project: LLVM. Just realized that most of the links in this page are deprecated. 
So update some important reference here: - adding PowerISA 3.0B/2.7B - adding P8 /P9 User Manual - ELFv2 ABI and errata Move deprecated ones into "Other documents..". Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D68817 Files: llvm/docs/CompilerWriterInfo.rst Index: llvm/docs/CompilerWriterInfo.rst =================================================================== --- llvm/docs/CompilerWriterInfo.rst +++ llvm/docs/CompilerWriterInfo.rst @@ -58,21 +58,27 @@ IBM - Official manuals and docs ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ -* `Power Instruction Set Architecture, Versions 2.03 through 2.06 (authentication required, free sign-up) `_ +* `Power Instruction Set Architecture, Version 3.0B `_ -* `PowerPC Compiler Writer's Guide `_ +* `POWER9 Processor User's Manual `_ -* `Intro to PowerPC Architecture `_ +* `Power Instruction Set Architecture, Version 2.07B `_ -* `PowerPC Processor Manuals (embedded) `_ +* `POWER8 Processor User's Manual `_ -* `Various IBM specifications and white papers `_ +* `Power Instruction Set Architecture, Versions 2.03 through 2.06 (Internet Archive) `_ + +* `IBM AIX 7.2 POWER Assembly Reference `_ * `IBM AIX/5L for POWER Assembly Reference `_ Other documents, collections, notes ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +* `PowerPC Compiler Writer's Guide `_ +* `Intro to PowerPC Architecture `_ +* `PowerPC Processor Manuals (embedded) `_ +* `Various IBM specifications and white papers `_ * `PowerPC ABI documents `_ * `PowerPC64 alignment of long doubles (from GCC) `_ * `Long branch stubs for powerpc64-linux (from binutils) `_ @@ -133,6 +139,9 @@ ----- * `Linux extensions to gabi `_ +* `64-Bit ELF V2 ABI Specification: Power Architecture `_ + +* `OpenPOWER ELFv2 Errata: ELFv2 ABI Version 1.4 `_ * `PowerPC 64-bit ELF ABI Supplement `_ * `Procedure Call Standard for the AArch64 Architecture `_ * `Procedure Call Standard for the ARM Architecture `_ -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68817.224399.patch Type: text/x-patch Size: 3591 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Thu Oct 10 10:23:08 2019 From: llvm-commits at lists.llvm.org (Andrei Elovikov via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 17:23:08 +0000 (UTC) Subject: [PATCH] D68492: [PATCH 09/38] [noalias] D9376: llvm.noalias - handling of dead intrinsics In-Reply-To: References: Message-ID: <5f91e23dfcd4f6fae8595cb8e0a53a32@localhost.localdomain> a.elovikov added inline comments. ================ Comment at: llvm/lib/Analysis/InstructionSimplify.cpp:5156 + if (isa(Arg0) || + (isa(Arg0) && + Arg0->getType()->getPointerAddressSpace() == 0)) ---------------- What if we have %i = ptr2int %p %null = sub %i, %i %nullptr = int2ptr %null %scope = call @llvm.noalias(%nullptr) ; introduce the scope %null2 = ptr2int %scope %i2 = add %null2, %i %same.as.orig.p = int2ptr %i2 Why don't we want `%same.as.orig.p` to have the scope? ================ Comment at: llvm/test/Transforms/InstSimplify/noalias.ll:8 +; CHECK-LABEL: @test1 +; CHECK-NOT: llvm.noalias.p0i8 +; CHECK: ret void ---------------- Simplify to `CHECK-NEXT: ret void`? CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68492/new/ https://reviews.llvm.org/D68492 From llvm-commits at lists.llvm.org Thu Oct 10 10:23:19 2019 From: llvm-commits at lists.llvm.org (Craig Topper via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 17:23:19 +0000 (UTC) Subject: [PATCH] D68632: [X86] Make memcmp() use PTEST if possible and also enable AVX1 In-Reply-To: References: Message-ID: craig.topper added inline comments. ================ Comment at: lib/Target/X86/X86ISelLowering.cpp:42430 + auto BCCmp = DAG.getBitcast(OpSize == 256 ? MVT::v4i64 : MVT::v2i64, Cmp); + auto PT = DAG.getNode(X86ISD::PTEST, DL, MVT::i32, BCCmp, BCCmp); + auto SetCC = getSETCC(CC == ISD::SETEQ ? X86::COND_E : X86::COND_NE, PT, DL, DAG); ---------------- You already used PT as a variable name earlier. 
The earlier one should probably be UsePTEST Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68632/new/ https://reviews.llvm.org/D68632 From llvm-commits at lists.llvm.org Thu Oct 10 10:25:45 2019 From: llvm-commits at lists.llvm.org (Adrian Prantl via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 17:25:45 +0000 (UTC) Subject: [PATCH] D68816: [NFC] Replace a linked list in LiveDebugVariables pass with a DenseMap In-Reply-To: References: Message-ID: <3be4a0e93cb7f893158dd67774a52900@localhost.localdomain> aprantl added a comment. This is a really nice performance win and the code looks nicer, too. ================ Comment at: llvm/lib/CodeGen/LiveDebugVariables.cpp:154 +class UserValueIdentity { +private: + const DILocalVariable *Variable;// The debug info variable we are part of. ---------------- nit: ``` /// The debug info variable we are part of. const DILocalVariable *Variable; ... ``` ================ Comment at: llvm/lib/CodeGen/LiveDebugVariables.cpp:333 +namespace llvm { +template <> struct DenseMapInfo { + static inline UserValueIdentity getEmptyKey() { ---------------- FYI. If you don't want to implement all this, you can also inherit from std::pair>. (With a single pair this is a more obvious win.) Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68816/new/ https://reviews.llvm.org/D68816 From llvm-commits at lists.llvm.org Thu Oct 10 10:32:35 2019 From: llvm-commits at lists.llvm.org (Jordan Rupprecht via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 17:32:35 +0000 (UTC) Subject: [PATCH] D68033: [llvm-ar] Make paths case insensitive when on windows In-Reply-To: References: Message-ID: rupprecht added a comment. In D68033#1703605 , @ruiu wrote: > As jhenderson suggested, could you add a description about this behavior? Looks like llvm-ar doesn't have a manual page, so the second best thing to add a description is probably the help message for `--help`. 
We have a command guide entry: http://llvm.org/docs/CommandGuide/llvm-ar.html And it looks like that is also available (at least on debian-based systems) via `man llvm-ar-8` +1 to documenting it there. > > > In D68033#1703583 , @MaskRay wrote: > >> Case insensitivity and platform differences do make me sad, but if people think it is the right thing to do on Windows I'll not insist. > > > Well I'm not happy about this change, but this is probably an unavoidable consequence of the decision that Microsoft made in the early 80s... :) ================ Comment at: llvm/tools/llvm-ar/llvm-ar.cpp:509 +#else + return normalizePath(Path1) == normalizePath(Path2); +#endif ---------------- I'm not quite sure about this change... a few of the callsites before were `Name == normalizePath(Path)`, not `normalizePath(Name) == normalizePath(Path)`. My past experiences of compatibility testing llvm-ar vs GNU ar has largely been paged out, but I think this may have been one of the differences. It may actually be something we want, but we should test it. e.g. to test the `performReadOperation` can you see if extracting "foo/file.txt" will end up extracting "bar/file.txt" (in a situation where `CompareFullPath` is false)? CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68033/new/ https://reviews.llvm.org/D68033 From llvm-commits at lists.llvm.org Thu Oct 10 10:32:35 2019 From: llvm-commits at lists.llvm.org (Johannes Doerfert via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 17:32:35 +0000 (UTC) Subject: [PATCH] D67986: [InstCombine] snprintf (d, size, "%s", s) -> memccpy (d, s, '\0', size - 1), d[size - 1] = 0 In-Reply-To: References: Message-ID: <7fb2f898a519313dc45d45bebd6cb50b@localhost.localdomain> jdoerfert added a comment. In D67986#1703031 , @xbolva00 wrote: > In D67986#1702901 , @MaskRay wrote: > > > This transformation seems to increase code size significantly. Is the snprintf "%s" pattern common enough? 
I suspect most projects have already used memccpy, stpncpy, strscpy, or strlcpy. For the few that don't, the performance probably does not matter. > > > Yes, quite common. But okay, if you don't want it, let's just abandon it. I wouldn't have quit on this so easily, but OK. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67986/new/ https://reviews.llvm.org/D67986 From llvm-commits at lists.llvm.org Thu Oct 10 10:32:35 2019 From: llvm-commits at lists.llvm.org (Adrian Prantl via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 17:32:35 +0000 (UTC) Subject: [PATCH] D68816: [NFC] Replace a linked list in LiveDebugVariables pass with a DenseMap In-Reply-To: References: Message-ID: <5e66df5f943e96a3b59f0321e6688ce6@localhost.localdomain> aprantl added inline comments. ================ Comment at: llvm/lib/CodeGen/LiveDebugVariables.cpp:167 + // FIXME: The fragment should be part of the identity, but not + // other things in the expression like stack values. + return Var == Variable && Expr == Expression && IA == InlinedAt; ---------------- Not your code, but: add a FragmentInfo element? ================ Comment at: llvm/lib/CodeGen/LiveDebugVariables.cpp:409 + /// Map unique UserValue identity to UserValue. + using UVMap = DenseMap; + UVMap UserVarMap; ---------------- How many elements does this have on average? Is a SmallDenseMap a win? Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68816/new/ https://reviews.llvm.org/D68816 From llvm-commits at lists.llvm.org Thu Oct 10 10:32:36 2019 From: llvm-commits at lists.llvm.org (Johannes Doerfert via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 17:32:36 +0000 (UTC) Subject: [PATCH] D68819: [Utils] Allow update_test_checks to check function arguments Message-ID: jdoerfert created this revision. jdoerfert added reviewers: lebedev.ri, greened, spatel, xbolva00, RKSimon, mehdi_amini. Herald added a subscriber: bollu. Herald added a project: LLVM. 
This adds a switch to the update_test_checks that triggers arguments to be present in the check line. If not set, the behavior should be the same as before. Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D68819 Files: llvm/utils/UpdateTestChecks/common.py llvm/utils/update_analyze_test_checks.py llvm/utils/update_mir_test_checks.py llvm/utils/update_test_checks.py -------------- next part -------------- A non-text attachment was scrubbed... Name: D68819.224404.patch Type: text/x-patch Size: 5581 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Thu Oct 10 10:33:00 2019 From: llvm-commits at lists.llvm.org (Craig Topper via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 17:33:00 +0000 (UTC) Subject: [PATCH] D68632: [X86] Make memcmp() use PTEST if possible and also enable AVX1 In-Reply-To: References: Message-ID: <180c6298c096a9e6e60d19abfadc5d36@localhost.localdomain> craig.topper added inline comments. ================ Comment at: test/CodeGen/X86/setcc-wide-types.ll:2 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py ; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=sse2 | FileCheck %s --check-prefix=ANY --check-prefix=NO512 --check-prefix=SSE2 ; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=avx | FileCheck %s --check-prefix=ANY --check-prefix=NO512 --check-prefix=AVXANY --check-prefix=AVX1 ---------------- Need an sse4.1 command line since we changed that behavior too. ================ Comment at: test/CodeGen/X86/setcc-wide-types.ll:101 +; AVX2-NEXT: retq %bcx = bitcast <4 x i64> %x to i256 %bcy = bitcast <4 x i64> %y to i256 ---------------- The AVX512 checks are missing here? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68632/new/ https://reviews.llvm.org/D68632 From llvm-commits at lists.llvm.org Thu Oct 10 10:39:24 2019 From: llvm-commits at lists.llvm.org (Joel E. 
Denny via llvm-commits) Date: Thu, 10 Oct 2019 17:39:24 -0000 Subject: [llvm] r374388 - [lit] Make internal diff work in pipelines Message-ID: <20191010173924.E207591ED5@lists.llvm.org> Author: jdenny Date: Thu Oct 10 10:39:24 2019 New Revision: 374388 URL: http://llvm.org/viewvc/llvm-project?rev=374388&view=rev Log: [lit] Make internal diff work in pipelines When using lit's internal shell, RUN lines like the following accidentally execute an external `diff` instead of lit's internal `diff`: ``` # RUN: program | diff file - # RUN: not diff file1 file2 | FileCheck %s ``` Such cases exist now, in `clang/test/Analysis` for example. We are preparing patches to ensure lit's internal `diff` is called in such cases, which will then fail because lit's internal `diff` cannot currently be used in pipelines and doesn't recognize `-` as a command-line option. To enable pipelines, this patch moves lit's `diff` implementation into an out-of-process script, similar to lit's `cat` implementation. A follow-up patch will implement `-` to mean stdin. 
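For illustration only (this is not the code from the patch): a minimal Python sketch of why moving a builtin into its own script lets it take part in a pipeline. Once the builtin is a separate process, its stdin and stdout can be connected to neighbouring commands like any other program. The names `resolve_builtin`, `run_pipeline`, `demo`, and the `upper` builtin are hypothetical and exist only for this example; the real dispatch lives in lit's TestRunner.py and may differ in detail.

```python
import os
import subprocess
import sys
import tempfile

# Hypothetical stand-in for a script such as builtin_commands/diff.py.
# It only upper-cases stdin so the example stays self-contained.
UPPER_SCRIPT = "import sys\nsys.stdout.write(sys.stdin.read().upper())\n"


def resolve_builtin(args, builtin_commands, builtin_dir):
    """Rewrite ['name', ...] to [python, builtin_dir/name.py, ...] for builtins."""
    if args[0] in builtin_commands:
        script = os.path.join(builtin_dir, args[0] + ".py")
        return [sys.executable, script] + args[1:]
    return args


def run_pipeline(commands, builtin_commands, builtin_dir):
    """Run commands as a pipeline; return (stdout text, exit code of last command)."""
    procs = []
    prev_stdout = None
    for args in commands:
        proc = subprocess.Popen(
            resolve_builtin(args, builtin_commands, builtin_dir),
            stdin=prev_stdout,
            stdout=subprocess.PIPE)
        # The child now owns the read end; drop our copy so EOF propagates.
        if prev_stdout is not None:
            prev_stdout.close()
        prev_stdout = proc.stdout
        procs.append(proc)
    out, _ = procs[-1].communicate()
    for proc in procs[:-1]:
        proc.wait()
    return out.decode(), procs[-1].returncode


def demo():
    """Pipe an external command into the hypothetical 'upper' builtin."""
    with tempfile.TemporaryDirectory() as builtin_dir:
        with open(os.path.join(builtin_dir, "upper.py"), "w") as f:
            f.write(UPPER_SCRIPT)
        producer = [sys.executable, "-c", "print('hello')"]
        return run_pipeline([producer, ["upper"]], {"upper"}, builtin_dir)
```

An in-process builtin, by contrast, returns its output as a string after it finishes, which is why the previous implementation had to reject pipelines outright.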
Reviewed By: probinson, stella.stamenova Differential Revision: https://reviews.llvm.org/D66574 Added: llvm/trunk/utils/lit/lit/builtin_commands/diff.py llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-pipes.txt Removed: llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-error-0.txt Modified: llvm/trunk/utils/lit/lit/TestRunner.py llvm/trunk/utils/lit/tests/shtest-shell.py Modified: llvm/trunk/utils/lit/lit/TestRunner.py URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/lit/TestRunner.py?rev=374388&r1=374387&r2=374388&view=diff ============================================================================== --- llvm/trunk/utils/lit/lit/TestRunner.py (original) +++ llvm/trunk/utils/lit/lit/TestRunner.py Thu Oct 10 10:39:24 2019 @@ -1,7 +1,5 @@ from __future__ import absolute_import -import difflib import errno -import functools import io import itertools import getopt @@ -361,218 +359,6 @@ def executeBuiltinMkdir(cmd, cmd_shenv): exitCode = 1 return ShellCommandResult(cmd, "", stderr.getvalue(), exitCode, False) -def executeBuiltinDiff(cmd, cmd_shenv): - """executeBuiltinDiff - Compare files line by line.""" - args = expand_glob_expressions(cmd.args, cmd_shenv.cwd)[1:] - try: - opts, args = getopt.gnu_getopt(args, "wbur", ["strip-trailing-cr"]) - except getopt.GetoptError as err: - raise InternalShellError(cmd, "Unsupported: 'diff': %s" % str(err)) - - filelines, filepaths, dir_trees = ([] for i in range(3)) - ignore_all_space = False - ignore_space_change = False - unified_diff = False - recursive_diff = False - strip_trailing_cr = False - for o, a in opts: - if o == "-w": - ignore_all_space = True - elif o == "-b": - ignore_space_change = True - elif o == "-u": - unified_diff = True - elif o == "-r": - recursive_diff = True - elif o == "--strip-trailing-cr": - strip_trailing_cr = True - else: - assert False, "unhandled option" - - if len(args) != 2: - raise InternalShellError(cmd, "Error: missing or extra operand") - - def getDirTree(path, 
basedir=""): - # Tree is a tuple of form (dirname, child_trees). - # An empty dir has child_trees = [], a file has child_trees = None. - child_trees = [] - for dirname, child_dirs, files in os.walk(os.path.join(basedir, path)): - for child_dir in child_dirs: - child_trees.append(getDirTree(child_dir, dirname)) - for filename in files: - child_trees.append((filename, None)) - return path, sorted(child_trees) - - def compareTwoFiles(filepaths): - compare_bytes = False - encoding = None - filelines = [] - for file in filepaths: - try: - with open(file, 'r') as f: - filelines.append(f.readlines()) - except UnicodeDecodeError: - try: - with io.open(file, 'r', encoding="utf-8") as f: - filelines.append(f.readlines()) - encoding = "utf-8" - except: - compare_bytes = True - - if compare_bytes: - return compareTwoBinaryFiles(filepaths) - else: - return compareTwoTextFiles(filepaths, encoding) - - def compareTwoBinaryFiles(filepaths): - filelines = [] - for file in filepaths: - with open(file, 'rb') as f: - filelines.append(f.readlines()) - - exitCode = 0 - if hasattr(difflib, 'diff_bytes'): - # python 3.5 or newer - diffs = difflib.diff_bytes(difflib.unified_diff, filelines[0], filelines[1], filepaths[0].encode(), filepaths[1].encode()) - diffs = [diff.decode() for diff in diffs] - else: - # python 2.7 - func = difflib.unified_diff if unified_diff else difflib.context_diff - diffs = func(filelines[0], filelines[1], filepaths[0], filepaths[1]) - - for diff in diffs: - stdout.write(diff) - exitCode = 1 - return exitCode - - def compareTwoTextFiles(filepaths, encoding): - filelines = [] - for file in filepaths: - if encoding is None: - with open(file, 'r') as f: - filelines.append(f.readlines()) - else: - with io.open(file, 'r', encoding=encoding) as f: - filelines.append(f.readlines()) - - exitCode = 0 - def compose2(f, g): - return lambda x: f(g(x)) - - f = lambda x: x - if strip_trailing_cr: - f = compose2(lambda line: line.rstrip('\r'), f) - if ignore_all_space or 
ignore_space_change: - ignoreSpace = lambda line, separator: separator.join(line.split()) - ignoreAllSpaceOrSpaceChange = functools.partial(ignoreSpace, separator='' if ignore_all_space else ' ') - f = compose2(ignoreAllSpaceOrSpaceChange, f) - - for idx, lines in enumerate(filelines): - filelines[idx]= [f(line) for line in lines] - - func = difflib.unified_diff if unified_diff else difflib.context_diff - for diff in func(filelines[0], filelines[1], filepaths[0], filepaths[1]): - stdout.write(diff) - exitCode = 1 - return exitCode - - def printDirVsFile(dir_path, file_path): - if os.path.getsize(file_path): - msg = "File %s is a directory while file %s is a regular file" - else: - msg = "File %s is a directory while file %s is a regular empty file" - stdout.write(msg % (dir_path, file_path) + "\n") - - def printFileVsDir(file_path, dir_path): - if os.path.getsize(file_path): - msg = "File %s is a regular file while file %s is a directory" - else: - msg = "File %s is a regular empty file while file %s is a directory" - stdout.write(msg % (file_path, dir_path) + "\n") - - def printOnlyIn(basedir, path, name): - stdout.write("Only in %s: %s\n" % (os.path.join(basedir, path), name)) - - def compareDirTrees(dir_trees, base_paths=["", ""]): - # Dirnames of the trees are not checked, it's caller's responsibility, - # as top-level dirnames are always different. Base paths are important - # for doing os.walk, but we don't put it into tree's dirname in order - # to speed up string comparison below and while sorting in getDirTree. - left_tree, right_tree = dir_trees[0], dir_trees[1] - left_base, right_base = base_paths[0], base_paths[1] - - # Compare two files or report file vs. directory mismatch. 
- if left_tree[1] is None and right_tree[1] is None: - return compareTwoFiles([os.path.join(left_base, left_tree[0]), - os.path.join(right_base, right_tree[0])]) - - if left_tree[1] is None and right_tree[1] is not None: - printFileVsDir(os.path.join(left_base, left_tree[0]), - os.path.join(right_base, right_tree[0])) - return 1 - - if left_tree[1] is not None and right_tree[1] is None: - printDirVsFile(os.path.join(left_base, left_tree[0]), - os.path.join(right_base, right_tree[0])) - return 1 - - # Compare two directories via recursive use of compareDirTrees. - exitCode = 0 - left_names = [node[0] for node in left_tree[1]] - right_names = [node[0] for node in right_tree[1]] - l, r = 0, 0 - while l < len(left_names) and r < len(right_names): - # Names are sorted in getDirTree, rely on that order. - if left_names[l] < right_names[r]: - exitCode = 1 - printOnlyIn(left_base, left_tree[0], left_names[l]) - l += 1 - elif left_names[l] > right_names[r]: - exitCode = 1 - printOnlyIn(right_base, right_tree[0], right_names[r]) - r += 1 - else: - exitCode |= compareDirTrees([left_tree[1][l], right_tree[1][r]], - [os.path.join(left_base, left_tree[0]), - os.path.join(right_base, right_tree[0])]) - l += 1 - r += 1 - - # At least one of the trees has ended. Report names from the other tree. 
- while l < len(left_names): - exitCode = 1 - printOnlyIn(left_base, left_tree[0], left_names[l]) - l += 1 - while r < len(right_names): - exitCode = 1 - printOnlyIn(right_base, right_tree[0], right_names[r]) - r += 1 - return exitCode - - stderr = StringIO() - stdout = StringIO() - exitCode = 0 - try: - for file in args: - if not os.path.isabs(file): - file = os.path.realpath(os.path.join(cmd_shenv.cwd, file)) - - if recursive_diff: - dir_trees.append(getDirTree(file)) - else: - filepaths.append(file) - - if not recursive_diff: - exitCode = compareTwoFiles(filepaths) - else: - exitCode = compareDirTrees(dir_trees) - - except IOError as err: - stderr.write("Error: 'diff' command failed, %s\n" % str(err)) - exitCode = 1 - - return ShellCommandResult(cmd, stdout.getvalue(), stderr.getvalue(), exitCode, False) - def executeBuiltinRm(cmd, cmd_shenv): """executeBuiltinRm - Removes (deletes) files or directories.""" args = expand_glob_expressions(cmd.args, cmd_shenv.cwd)[1:] @@ -838,14 +624,6 @@ def _executeShCmd(cmd, shenv, results, t results.append(cmdResult) return cmdResult.exitCode - if cmd.commands[0].args[0] == 'diff': - if len(cmd.commands) != 1: - raise InternalShellError(cmd.commands[0], "Unsupported: 'diff' " - "cannot be part of a pipeline") - cmdResult = executeBuiltinDiff(cmd.commands[0], shenv) - results.append(cmdResult) - return cmdResult.exitCode - if cmd.commands[0].args[0] == 'rm': if len(cmd.commands) != 1: raise InternalShellError(cmd.commands[0], "Unsupported: 'rm' " @@ -866,7 +644,7 @@ def _executeShCmd(cmd, shenv, results, t stderrTempFiles = [] opened_files = [] named_temp_files = [] - builtin_commands = set(['cat']) + builtin_commands = set(['cat', 'diff']) builtin_commands_dir = os.path.join(os.path.dirname(os.path.abspath(__file__)), "builtin_commands") # To avoid deadlock, we use a single stderr stream for piped # output. 
This is null until we have seen some output using Added: llvm/trunk/utils/lit/lit/builtin_commands/diff.py URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/lit/builtin_commands/diff.py?rev=374388&view=auto ============================================================================== --- llvm/trunk/utils/lit/lit/builtin_commands/diff.py (added) +++ llvm/trunk/utils/lit/lit/builtin_commands/diff.py Thu Oct 10 10:39:24 2019 @@ -0,0 +1,228 @@ +import difflib +import functools +import getopt +import os +import sys + +class DiffFlags(): + def __init__(self): + self.ignore_all_space = False + self.ignore_space_change = False + self.unified_diff = False + self.recursive_diff = False + self.strip_trailing_cr = False + +def getDirTree(path, basedir=""): + # Tree is a tuple of form (dirname, child_trees). + # An empty dir has child_trees = [], a file has child_trees = None. + child_trees = [] + for dirname, child_dirs, files in os.walk(os.path.join(basedir, path)): + for child_dir in child_dirs: + child_trees.append(getDirTree(child_dir, dirname)) + for filename in files: + child_trees.append((filename, None)) + return path, sorted(child_trees) + +def compareTwoFiles(flags, filepaths): + compare_bytes = False + encoding = None + filelines = [] + for file in filepaths: + try: + with open(file, 'r') as f: + filelines.append(f.readlines()) + except UnicodeDecodeError: + try: + with io.open(file, 'r', encoding="utf-8") as f: + filelines.append(f.readlines()) + encoding = "utf-8" + except: + compare_bytes = True + + if compare_bytes: + return compareTwoBinaryFiles(flags, filepaths) + else: + return compareTwoTextFiles(flags, filepaths, encoding) + +def compareTwoBinaryFiles(flags, filepaths): + filelines = [] + for file in filepaths: + with open(file, 'rb') as f: + filelines.append(f.readlines()) + + exitCode = 0 + if hasattr(difflib, 'diff_bytes'): + # python 3.5 or newer + diffs = difflib.diff_bytes(difflib.unified_diff, filelines[0], filelines[1], 
filepaths[0].encode(), filepaths[1].encode()) + diffs = [diff.decode() for diff in diffs] + else: + # python 2.7 + if flags.unified_diff: + func = difflib.unified_diff + else: + func = difflib.context_diff + diffs = func(filelines[0], filelines[1], filepaths[0], filepaths[1]) + + for diff in diffs: + sys.stdout.write(diff) + exitCode = 1 + return exitCode + +def compareTwoTextFiles(flags, filepaths, encoding): + filelines = [] + for file in filepaths: + if encoding is None: + with open(file, 'r') as f: + filelines.append(f.readlines()) + else: + with io.open(file, 'r', encoding=encoding) as f: + filelines.append(f.readlines()) + + exitCode = 0 + def compose2(f, g): + return lambda x: f(g(x)) + + f = lambda x: x + if flags.strip_trailing_cr: + f = compose2(lambda line: line.rstrip('\r'), f) + if flags.ignore_all_space or flags.ignore_space_change: + ignoreSpace = lambda line, separator: separator.join(line.split()) + ignoreAllSpaceOrSpaceChange = functools.partial(ignoreSpace, separator='' if flags.ignore_all_space else ' ') + f = compose2(ignoreAllSpaceOrSpaceChange, f) + + for idx, lines in enumerate(filelines): + filelines[idx]= [f(line) for line in lines] + + func = difflib.unified_diff if flags.unified_diff else difflib.context_diff + for diff in func(filelines[0], filelines[1], filepaths[0], filepaths[1]): + sys.stdout.write(diff) + exitCode = 1 + return exitCode + +def printDirVsFile(dir_path, file_path): + if os.path.getsize(file_path): + msg = "File %s is a directory while file %s is a regular file" + else: + msg = "File %s is a directory while file %s is a regular empty file" + sys.stdout.write(msg % (dir_path, file_path) + "\n") + +def printFileVsDir(file_path, dir_path): + if os.path.getsize(file_path): + msg = "File %s is a regular file while file %s is a directory" + else: + msg = "File %s is a regular empty file while file %s is a directory" + sys.stdout.write(msg % (file_path, dir_path) + "\n") + +def printOnlyIn(basedir, path, name): + 
sys.stdout.write("Only in %s: %s\n" % (os.path.join(basedir, path), name)) + +def compareDirTrees(flags, dir_trees, base_paths=["", ""]): + # Dirnames of the trees are not checked, it's caller's responsibility, + # as top-level dirnames are always different. Base paths are important + # for doing os.walk, but we don't put it into tree's dirname in order + # to speed up string comparison below and while sorting in getDirTree. + left_tree, right_tree = dir_trees[0], dir_trees[1] + left_base, right_base = base_paths[0], base_paths[1] + + # Compare two files or report file vs. directory mismatch. + if left_tree[1] is None and right_tree[1] is None: + return compareTwoFiles(flags, + [os.path.join(left_base, left_tree[0]), + os.path.join(right_base, right_tree[0])]) + + if left_tree[1] is None and right_tree[1] is not None: + printFileVsDir(os.path.join(left_base, left_tree[0]), + os.path.join(right_base, right_tree[0])) + return 1 + + if left_tree[1] is not None and right_tree[1] is None: + printDirVsFile(os.path.join(left_base, left_tree[0]), + os.path.join(right_base, right_tree[0])) + return 1 + + # Compare two directories via recursive use of compareDirTrees. + exitCode = 0 + left_names = [node[0] for node in left_tree[1]] + right_names = [node[0] for node in right_tree[1]] + l, r = 0, 0 + while l < len(left_names) and r < len(right_names): + # Names are sorted in getDirTree, rely on that order. + if left_names[l] < right_names[r]: + exitCode = 1 + printOnlyIn(left_base, left_tree[0], left_names[l]) + l += 1 + elif left_names[l] > right_names[r]: + exitCode = 1 + printOnlyIn(right_base, right_tree[0], right_names[r]) + r += 1 + else: + exitCode |= compareDirTrees(flags, + [left_tree[1][l], right_tree[1][r]], + [os.path.join(left_base, left_tree[0]), + os.path.join(right_base, right_tree[0])]) + l += 1 + r += 1 + + # At least one of the trees has ended. Report names from the other tree. 
+ while l < len(left_names): + exitCode = 1 + printOnlyIn(left_base, left_tree[0], left_names[l]) + l += 1 + while r < len(right_names): + exitCode = 1 + printOnlyIn(right_base, right_tree[0], right_names[r]) + r += 1 + return exitCode + +def main(argv): + args = argv[1:] + try: + opts, args = getopt.gnu_getopt(args, "wbur", ["strip-trailing-cr"]) + except getopt.GetoptError as err: + sys.stderr.write("Unsupported: 'diff': %s\n" % str(err)) + sys.exit(1) + + flags = DiffFlags() + filelines, filepaths, dir_trees = ([] for i in range(3)) + for o, a in opts: + if o == "-w": + flags.ignore_all_space = True + elif o == "-b": + flags.ignore_space_change = True + elif o == "-u": + flags.unified_diff = True + elif o == "-r": + flags.recursive_diff = True + elif o == "--strip-trailing-cr": + flags.strip_trailing_cr = True + else: + assert False, "unhandled option" + + if len(args) != 2: + sys.stderr.write("Error: missing or extra operand\n") + sys.exit(1) + + exitCode = 0 + try: + for file in args: + if not os.path.isabs(file): + file = os.path.realpath(os.path.join(os.getcwd(), file)) + + if flags.recursive_diff: + dir_trees.append(getDirTree(file)) + else: + filepaths.append(file) + + if not flags.recursive_diff: + exitCode = compareTwoFiles(flags, filepaths) + else: + exitCode = compareDirTrees(flags, dir_trees) + + except IOError as err: + sys.stderr.write("Error: 'diff' command failed, %s\n" % str(err)) + exitCode = 1 + + sys.exit(exitCode) + +if __name__ == "__main__": + main(sys.argv) Removed: llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-error-0.txt URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-error-0.txt?rev=374387&view=auto ============================================================================== --- llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-error-0.txt (original) +++ llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-error-0.txt (removed) @@ -1,3 +0,0 @@ -# Check error on a unsupported diff 
(cannot be part of a pipeline). -# -# RUN: diff diff-error-0.txt diff-error-0.txt | echo Output Added: llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-pipes.txt URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-pipes.txt?rev=374388&view=auto ============================================================================== --- llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-pipes.txt (added) +++ llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-pipes.txt Thu Oct 10 10:39:24 2019 @@ -0,0 +1,15 @@ +# RUN: echo foo > %t.foo +# RUN: echo bar > %t.bar + +# Check output pipe. +# RUN: diff %t.foo %t.foo | FileCheck -allow-empty -check-prefix=EMPTY %s +# RUN: diff -u %t.foo %t.bar | FileCheck %s && false || true + +# Fail so lit will print output. +# RUN: false + +# CHECK: @@ +# CHECK-NEXT: -foo +# CHECK-NEXT: +bar + +# EMPTY-NOT: {{.}} Modified: llvm/trunk/utils/lit/tests/shtest-shell.py URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/shtest-shell.py?rev=374388&r1=374387&r2=374388&view=diff ============================================================================== --- llvm/trunk/utils/lit/tests/shtest-shell.py (original) +++ llvm/trunk/utils/lit/tests/shtest-shell.py Thu Oct 10 10:39:24 2019 @@ -34,28 +34,20 @@ # CHECK: error: command failed with exit status: 127 # CHECK: *** -# CHECK: FAIL: shtest-shell :: diff-error-0.txt -# CHECK: *** TEST 'shtest-shell :: diff-error-0.txt' FAILED *** -# CHECK: $ "diff" "diff-error-0.txt" "diff-error-0.txt" -# CHECK: # command stderr: -# CHECK: Unsupported: 'diff' cannot be part of a pipeline -# CHECK: error: command failed with exit status: 127 -# CHECK: *** - # CHECK: FAIL: shtest-shell :: diff-error-1.txt # CHECK: *** TEST 'shtest-shell :: diff-error-1.txt' FAILED *** # CHECK: $ "diff" "-B" "temp1.txt" "temp2.txt" # CHECK: # command stderr: # CHECK: Unsupported: 'diff': option -B not recognized -# CHECK: error: command failed with exit status: 127 +# CHECK: 
error: command failed with exit status: 1 # CHECK: *** # CHECK: FAIL: shtest-shell :: diff-error-2.txt # CHECK: *** TEST 'shtest-shell :: diff-error-2.txt' FAILED *** # CHECK: $ "diff" "temp.txt" # CHECK: # command stderr: -# CHECK: Error: missing or extra operand -# CHECK: error: command failed with exit status: 127 +# CHECK: Error: missing or extra operand +# CHECK: error: command failed with exit status: 1 # CHECK: *** # CHECK: FAIL: shtest-shell :: diff-error-3.txt @@ -82,18 +74,43 @@ # CHECK: *** TEST 'shtest-shell :: diff-error-5.txt' FAILED *** # CHECK: $ "diff" # CHECK: # command stderr: -# CHECK: Error: missing or extra operand -# CHECK: error: command failed with exit status: 127 +# CHECK: Error: missing or extra operand +# CHECK: error: command failed with exit status: 1 # CHECK: *** # CHECK: FAIL: shtest-shell :: diff-error-6.txt # CHECK: *** TEST 'shtest-shell :: diff-error-6.txt' FAILED *** # CHECK: $ "diff" # CHECK: # command stderr: -# CHECK: Error: missing or extra operand -# CHECK: error: command failed with exit status: 127 +# CHECK: Error: missing or extra operand +# CHECK: error: command failed with exit status: 1 # CHECK: *** + +# CHECK: FAIL: shtest-shell :: diff-pipes.txt + +# CHECK: *** TEST 'shtest-shell :: diff-pipes.txt' FAILED *** + +# CHECK: $ "diff" "{{[^"]*}}.foo" "{{[^"]*}}.foo" +# CHECK-NOT: note +# CHECK-NOT: error +# CHECK: $ "FileCheck" +# CHECK-NOT: note +# CHECK-NOT: error + +# CHECK: $ "diff" "-u" "{{[^"]*}}.foo" "{{[^"]*}}.bar" +# CHECK: note: command had no output on stdout or stderr +# CHECK: error: command failed with exit status: 1 +# CHECK: $ "FileCheck" +# CHECK-NOT: note +# CHECK-NOT: error +# CHECK: $ "true" + +# CHECK: $ "false" + +# CHECK: *** + + # CHECK: FAIL: shtest-shell :: diff-r-error-0.txt # CHECK: *** TEST 'shtest-shell :: diff-r-error-0.txt' FAILED *** # CHECK: $ "diff" "-r" From llvm-commits at lists.llvm.org Thu Oct 10 10:39:41 2019 From: llvm-commits at lists.llvm.org (Joel E. 
Denny via llvm-commits) Date: Thu, 10 Oct 2019 17:39:41 -0000 Subject: [llvm] r374389 - [lit] Clean up internal diff's encoding handling Message-ID: <20191010173941.3BAEC92474@lists.llvm.org> Author: jdenny Date: Thu Oct 10 10:39:41 2019 New Revision: 374389 URL: http://llvm.org/viewvc/llvm-project?rev=374389&view=rev Log: [lit] Clean up internal diff's encoding handling As suggested by rnk at D67643#1673043, instead of reading files multiple times until an appropriate encoding is found, read them once as binary, and then try to decode what was read. For python >= 3.5, don't fail when attempting to decode the `diff_bytes` output in order to print it. Finally, add some tests for encoding handling. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D68664 Added: llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-encodings.txt llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-in.bin (with props) llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-in.utf16 (with props) llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-in.utf8 Modified: llvm/trunk/utils/lit/lit/builtin_commands/diff.py llvm/trunk/utils/lit/tests/max-failures.py llvm/trunk/utils/lit/tests/shtest-shell.py Modified: llvm/trunk/utils/lit/lit/builtin_commands/diff.py URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/lit/builtin_commands/diff.py?rev=374389&r1=374388&r2=374389&view=diff ============================================================================== --- llvm/trunk/utils/lit/lit/builtin_commands/diff.py (original) +++ llvm/trunk/utils/lit/lit/builtin_commands/diff.py Thu Oct 10 10:39:41 2019 @@ -1,6 +1,7 @@ import difflib import functools import getopt +import locale import os import sys @@ -24,37 +25,26 @@ def getDirTree(path, basedir=""): return path, sorted(child_trees) def compareTwoFiles(flags, filepaths): - compare_bytes = False - encoding = None filelines = [] for file in filepaths: - try: - with open(file, 'r') as f: - filelines.append(f.readlines()) - 
except UnicodeDecodeError: - try: - with io.open(file, 'r', encoding="utf-8") as f: - filelines.append(f.readlines()) - encoding = "utf-8" - except: - compare_bytes = True - - if compare_bytes: - return compareTwoBinaryFiles(flags, filepaths) - else: - return compareTwoTextFiles(flags, filepaths, encoding) + with open(file, 'rb') as file_bin: + filelines.append(file_bin.readlines()) -def compareTwoBinaryFiles(flags, filepaths): - filelines = [] - for file in filepaths: - with open(file, 'rb') as f: - filelines.append(f.readlines()) + try: + return compareTwoTextFiles(flags, filepaths, filelines, + locale.getpreferredencoding(False)) + except UnicodeDecodeError: + try: + return compareTwoTextFiles(flags, filepaths, filelines, "utf-8") + except: + return compareTwoBinaryFiles(flags, filepaths, filelines) +def compareTwoBinaryFiles(flags, filepaths, filelines): exitCode = 0 if hasattr(difflib, 'diff_bytes'): # python 3.5 or newer diffs = difflib.diff_bytes(difflib.unified_diff, filelines[0], filelines[1], filepaths[0].encode(), filepaths[1].encode()) - diffs = [diff.decode() for diff in diffs] + diffs = [diff.decode(errors="backslashreplace") for diff in diffs] else: # python 2.7 if flags.unified_diff: @@ -68,15 +58,14 @@ def compareTwoBinaryFiles(flags, filepat exitCode = 1 return exitCode -def compareTwoTextFiles(flags, filepaths, encoding): +def compareTwoTextFiles(flags, filepaths, filelines_bin, encoding): filelines = [] - for file in filepaths: - if encoding is None: - with open(file, 'r') as f: - filelines.append(f.readlines()) - else: - with io.open(file, 'r', encoding=encoding) as f: - filelines.append(f.readlines()) + for lines_bin in filelines_bin: + lines = [] + for line_bin in lines_bin: + line = line_bin.decode(encoding=encoding) + lines.append(line) + filelines.append(lines) exitCode = 0 def compose2(f, g): Added: llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-encodings.txt URL: 
http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-encodings.txt?rev=374389&view=auto
==============================================================================
--- llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-encodings.txt (added)
+++ llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-encodings.txt Thu Oct 10 10:39:41 2019
@@ -0,0 +1,9 @@
+# Check that diff falls back to binary mode if it cannot decode a file.
+
+# RUN: diff -u diff-in.bin diff-in.bin
+# RUN: diff -u diff-in.utf16 diff-in.bin && false || true
+# RUN: diff -u diff-in.utf8 diff-in.bin && false || true
+# RUN: diff -u diff-in.bin diff-in.utf8 && false || true
+
+# Fail so lit will print output.
+# RUN: false

Added: llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-in.bin
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-in.bin?rev=374389&view=auto
==============================================================================
Binary file - no diff available.

Propchange: llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-in.bin
------------------------------------------------------------------------------
    svn:mime-type = application/octet-stream

Added: llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-in.utf16
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-in.utf16?rev=374389&view=auto
==============================================================================
Binary file - no diff available.
Propchange: llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-in.utf16
------------------------------------------------------------------------------
    svn:mime-type = application/octet-stream

Added: llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-in.utf8
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-in.utf8?rev=374389&view=auto
==============================================================================
--- llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-in.utf8 (added)
+++ llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-in.utf8 Thu Oct 10 10:39:41 2019
@@ -0,0 +1,3 @@
+foo
+bar
+baz

Modified: llvm/trunk/utils/lit/tests/max-failures.py
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/max-failures.py?rev=374389&r1=374388&r2=374389&view=diff
==============================================================================
--- llvm/trunk/utils/lit/tests/max-failures.py (original)
+++ llvm/trunk/utils/lit/tests/max-failures.py Thu Oct 10 10:39:41 2019
@@ -8,7 +8,7 @@
 #
 # END.
-# CHECK: Failing Tests (27) +# CHECK: Failing Tests (28) # CHECK: Failing Tests (1) # CHECK: Failing Tests (2) # CHECK: error: Option '--max-failures' requires positive integer Modified: llvm/trunk/utils/lit/tests/shtest-shell.py URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/shtest-shell.py?rev=374389&r1=374388&r2=374389&view=diff ============================================================================== --- llvm/trunk/utils/lit/tests/shtest-shell.py (original) +++ llvm/trunk/utils/lit/tests/shtest-shell.py Thu Oct 10 10:39:41 2019 @@ -34,6 +34,58 @@ # CHECK: error: command failed with exit status: 127 # CHECK: *** + +# CHECK: FAIL: shtest-shell :: diff-encodings.txt +# CHECK: *** TEST 'shtest-shell :: diff-encodings.txt' FAILED *** + +# CHECK: $ "diff" "-u" "diff-in.bin" "diff-in.bin" +# CHECK-NOT: error + +# CHECK: $ "diff" "-u" "diff-in.utf16" "diff-in.bin" +# CHECK: # command output: +# CHECK-NEXT: --- +# CHECK-NEXT: +++ +# CHECK-NEXT: @@ +# CHECK-NEXT: {{^ .f.o.o.$}} +# CHECK-NEXT: {{^-.b.a.r.$}} +# CHECK-NEXT: {{^\+.b.a.r..}} +# CHECK-NEXT: {{^ .b.a.z.$}} +# CHECK: error: command failed with exit status: 1 +# CHECK: $ "true" + +# CHECK: $ "diff" "-u" "diff-in.utf8" "diff-in.bin" +# CHECK: # command output: +# CHECK-NEXT: --- +# CHECK-NEXT: +++ +# CHECK-NEXT: @@ +# CHECK-NEXT: -foo +# CHECK-NEXT: -bar +# CHECK-NEXT: -baz +# CHECK-NEXT: {{^\+.f.o.o.$}} +# CHECK-NEXT: {{^\+.b.a.r..}} +# CHECK-NEXT: {{^\+.b.a.z.$}} +# CHECK: error: command failed with exit status: 1 +# CHECK: $ "true" + +# CHECK: $ "diff" "-u" "diff-in.bin" "diff-in.utf8" +# CHECK: # command output: +# CHECK-NEXT: --- +# CHECK-NEXT: +++ +# CHECK-NEXT: @@ +# CHECK-NEXT: {{^\-.f.o.o.$}} +# CHECK-NEXT: {{^\-.b.a.r..}} +# CHECK-NEXT: {{^\-.b.a.z.$}} +# CHECK-NEXT: +foo +# CHECK-NEXT: +bar +# CHECK-NEXT: +baz +# CHECK: error: command failed with exit status: 1 +# CHECK: $ "true" + +# CHECK: $ "false" + +# CHECK: *** + + # CHECK: FAIL: shtest-shell :: diff-error-1.txt # 
CHECK: *** TEST 'shtest-shell :: diff-error-1.txt' FAILED *** # CHECK: $ "diff" "-B" "temp1.txt" "temp2.txt" @@ -245,4 +297,4 @@ # CHECK: PASS: shtest-shell :: sequencing-0.txt # CHECK: XFAIL: shtest-shell :: sequencing-1.txt # CHECK: PASS: shtest-shell :: valid-shell.txt -# CHECK: Failing Tests (27) +# CHECK: Failing Tests (28) From llvm-commits at lists.llvm.org Thu Oct 10 10:39:57 2019 From: llvm-commits at lists.llvm.org (Joel E. Denny via llvm-commits) Date: Thu, 10 Oct 2019 17:39:57 -0000 Subject: [llvm] r374390 - [lit] Extend internal diff to support `-` argument Message-ID: <20191010173957.9BB219249F@lists.llvm.org> Author: jdenny Date: Thu Oct 10 10:39:57 2019 New Revision: 374390 URL: http://llvm.org/viewvc/llvm-project?rev=374390&view=rev Log: [lit] Extend internal diff to support `-` argument When using lit's internal shell, RUN lines like the following accidentally execute an external `diff` instead of lit's internal `diff`: ``` # RUN: program | diff file - ``` Such cases exist now, in `clang/test/Analysis` for example. We are preparing patches to ensure lit's internal `diff` is called in such cases, which will then fail because lit's internal `diff` doesn't recognize `-` as a command-line option. This patch adds support for `-` to mean stdin. 
Reviewed By: probinson, rnk Differential Revision: https://reviews.llvm.org/D67643 Added: llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-r-error-7.txt llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-r-error-8.txt Modified: llvm/trunk/utils/lit/lit/builtin_commands/diff.py llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-encodings.txt llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-pipes.txt llvm/trunk/utils/lit/tests/max-failures.py llvm/trunk/utils/lit/tests/shtest-shell.py Modified: llvm/trunk/utils/lit/lit/builtin_commands/diff.py URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/lit/builtin_commands/diff.py?rev=374390&r1=374389&r2=374390&view=diff ============================================================================== --- llvm/trunk/utils/lit/lit/builtin_commands/diff.py (original) +++ llvm/trunk/utils/lit/lit/builtin_commands/diff.py Thu Oct 10 10:39:57 2019 @@ -27,8 +27,13 @@ def getDirTree(path, basedir=""): def compareTwoFiles(flags, filepaths): filelines = [] for file in filepaths: - with open(file, 'rb') as file_bin: - filelines.append(file_bin.readlines()) + if file == "-": + stdin_fileno = sys.stdin.fileno() + with os.fdopen(os.dup(stdin_fileno), 'rb') as stdin_bin: + filelines.append(stdin_bin.readlines()) + else: + with open(file, 'rb') as file_bin: + filelines.append(file_bin.readlines()) try: return compareTwoTextFiles(flags, filepaths, filelines, @@ -194,10 +199,13 @@ def main(argv): exitCode = 0 try: for file in args: - if not os.path.isabs(file): + if file != "-" and not os.path.isabs(file): file = os.path.realpath(os.path.join(os.getcwd(), file)) if flags.recursive_diff: + if file == "-": + sys.stderr.write("Error: cannot recursively compare '-'\n") + sys.exit(1) dir_trees.append(getDirTree(file)) else: filepaths.append(file) Modified: llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-encodings.txt URL: 
http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-encodings.txt?rev=374390&r1=374389&r2=374390&view=diff
==============================================================================
--- llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-encodings.txt (original)
+++ llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-encodings.txt Thu Oct 10 10:39:57 2019
@@ -5,5 +5,11 @@
 # RUN: diff -u diff-in.utf8 diff-in.bin && false || true
 # RUN: diff -u diff-in.bin diff-in.utf8 && false || true

+# RUN: cat diff-in.bin | diff -u - diff-in.bin
+# RUN: cat diff-in.bin | diff -u diff-in.bin -
+# RUN: cat diff-in.bin | diff -u diff-in.utf16 - && false || true
+# RUN: cat diff-in.bin | diff -u diff-in.utf8 - && false || true
+# RUN: cat diff-in.bin | diff -u - diff-in.utf8 && false || true
+
 # Fail so lit will print output.
 # RUN: false

Modified: llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-pipes.txt
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-pipes.txt?rev=374390&r1=374389&r2=374390&view=diff
==============================================================================
--- llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-pipes.txt (original)
+++ llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-pipes.txt Thu Oct 10 10:39:57 2019
@@ -5,6 +5,16 @@
 # RUN: diff %t.foo %t.foo | FileCheck -allow-empty -check-prefix=EMPTY %s
 # RUN: diff -u %t.foo %t.bar | FileCheck %s && false || true

+# Check input pipe.
+# RUN: echo foo | diff -u - %t.foo
+# RUN: echo foo | diff -u %t.foo -
+# RUN: echo bar | diff -u %t.foo - && false || true
+# RUN: echo bar | diff -u - %t.foo && false || true
+
+# Check output and input pipes at the same time.
+# RUN: echo foo | diff - %t.foo | FileCheck -allow-empty -check-prefix=EMPTY %s
+# RUN: echo bar | diff -u %t.foo - | FileCheck %s && false || true
+
 # Fail so lit will print output.
# RUN: false

Added: llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-r-error-7.txt
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-r-error-7.txt?rev=374390&view=auto
==============================================================================
--- llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-r-error-7.txt (added)
+++ llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-r-error-7.txt Thu Oct 10 10:39:57 2019
@@ -0,0 +1,2 @@
+# diff -r currently cannot handle stdin.
+# RUN: diff -r - %t

Added: llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-r-error-8.txt
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-r-error-8.txt?rev=374390&view=auto
==============================================================================
--- llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-r-error-8.txt (added)
+++ llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-r-error-8.txt Thu Oct 10 10:39:57 2019
@@ -0,0 +1,2 @@
+# diff -r currently cannot handle stdin.
+# RUN: diff -r %t -

Modified: llvm/trunk/utils/lit/tests/max-failures.py
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/max-failures.py?rev=374390&r1=374389&r2=374390&view=diff
==============================================================================
--- llvm/trunk/utils/lit/tests/max-failures.py (original)
+++ llvm/trunk/utils/lit/tests/max-failures.py Thu Oct 10 10:39:57 2019
@@ -8,7 +8,7 @@
 #
 # END.
-# CHECK: Failing Tests (28) +# CHECK: Failing Tests (30) # CHECK: Failing Tests (1) # CHECK: Failing Tests (2) # CHECK: error: Option '--max-failures' requires positive integer Modified: llvm/trunk/utils/lit/tests/shtest-shell.py URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/shtest-shell.py?rev=374390&r1=374389&r2=374390&view=diff ============================================================================== --- llvm/trunk/utils/lit/tests/shtest-shell.py (original) +++ llvm/trunk/utils/lit/tests/shtest-shell.py Thu Oct 10 10:39:57 2019 @@ -81,6 +81,60 @@ # CHECK: error: command failed with exit status: 1 # CHECK: $ "true" +# CHECK: $ "cat" "diff-in.bin" +# CHECK-NOT: error +# CHECK: $ "diff" "-u" "-" "diff-in.bin" +# CHECK-NOT: error + +# CHECK: $ "cat" "diff-in.bin" +# CHECK-NOT: error +# CHECK: $ "diff" "-u" "diff-in.bin" "-" +# CHECK-NOT: error + +# CHECK: $ "cat" "diff-in.bin" +# CHECK-NOT: error +# CHECK: $ "diff" "-u" "diff-in.utf16" "-" +# CHECK: # command output: +# CHECK-NEXT: --- +# CHECK-NEXT: +++ +# CHECK-NEXT: @@ +# CHECK-NEXT: {{^ .f.o.o.$}} +# CHECK-NEXT: {{^-.b.a.r.$}} +# CHECK-NEXT: {{^\+.b.a.r..}} +# CHECK-NEXT: {{^ .b.a.z.$}} +# CHECK: error: command failed with exit status: 1 +# CHECK: $ "true" + +# CHECK: $ "cat" "diff-in.bin" +# CHECK-NOT: error +# CHECK: $ "diff" "-u" "diff-in.utf8" "-" +# CHECK: # command output: +# CHECK-NEXT: --- +# CHECK-NEXT: +++ +# CHECK-NEXT: @@ +# CHECK-NEXT: -foo +# CHECK-NEXT: -bar +# CHECK-NEXT: -baz +# CHECK-NEXT: {{^\+.f.o.o.$}} +# CHECK-NEXT: {{^\+.b.a.r..}} +# CHECK-NEXT: {{^\+.b.a.z.$}} +# CHECK: error: command failed with exit status: 1 +# CHECK: $ "true" + +# CHECK: $ "diff" "-u" "-" "diff-in.utf8" +# CHECK: # command output: +# CHECK-NEXT: --- +# CHECK-NEXT: +++ +# CHECK-NEXT: @@ +# CHECK-NEXT: {{^\-.f.o.o.$}} +# CHECK-NEXT: {{^\-.b.a.r..}} +# CHECK-NEXT: {{^\-.b.a.z.$}} +# CHECK-NEXT: +foo +# CHECK-NEXT: +bar +# CHECK-NEXT: +baz +# CHECK: error: command failed with exit status: 1 +# 
CHECK: $ "true" + # CHECK: $ "false" # CHECK: *** @@ -158,6 +212,51 @@ # CHECK-NOT: error # CHECK: $ "true" +# CHECK: $ "echo" "foo" +# CHECK: $ "diff" "-u" "-" "{{[^"]*}}.foo" +# CHECK-NOT: note +# CHECK-NOT: error + +# CHECK: $ "echo" "foo" +# CHECK: $ "diff" "-u" "{{[^"]*}}.foo" "-" +# CHECK-NOT: note +# CHECK-NOT: error + +# CHECK: $ "echo" "bar" +# CHECK: $ "diff" "-u" "{{[^"]*}}.foo" "-" +# CHECK: # command output: +# CHECK: @@ +# CHECK-NEXT: -foo +# CHECK-NEXT: +bar +# CHECK: error: command failed with exit status: 1 +# CHECK: $ "true" + +# CHECK: $ "echo" "bar" +# CHECK: $ "diff" "-u" "-" "{{[^"]*}}.foo" +# CHECK: # command output: +# CHECK: @@ +# CHECK-NEXT: -bar +# CHECK-NEXT: +foo +# CHECK: error: command failed with exit status: 1 +# CHECK: $ "true" + +# CHECK: $ "echo" "foo" +# CHECK: $ "diff" "-" "{{[^"]*}}.foo" +# CHECK-NOT: note +# CHECK-NOT: error +# CHECK: $ "FileCheck" +# CHECK-NOT: note +# CHECK-NOT: error + +# CHECK: $ "echo" "bar" +# CHECK: $ "diff" "-u" "{{[^"]*}}.foo" "-" +# CHECK: note: command had no output on stdout or stderr +# CHECK: error: command failed with exit status: 1 +# CHECK: $ "FileCheck" +# CHECK-NOT: note +# CHECK-NOT: error +# CHECK: $ "true" + # CHECK: $ "false" # CHECK: *** @@ -216,6 +315,20 @@ # CHECK: File {{.*}}dir1{{.*}}extra_file is a regular empty file while file {{.*}}dir2{{.*}}extra_file is a directory # CHECK: error: command failed with exit status: 1 +# CHECK: FAIL: shtest-shell :: diff-r-error-7.txt +# CHECK: *** TEST 'shtest-shell :: diff-r-error-7.txt' FAILED *** +# CHECK: $ "diff" "-r" "-" "{{[^"]*}}" +# CHECK: # command stderr: +# CHECK: Error: cannot recursively compare '-' +# CHECK: error: command failed with exit status: 1 + +# CHECK: FAIL: shtest-shell :: diff-r-error-8.txt +# CHECK: *** TEST 'shtest-shell :: diff-r-error-8.txt' FAILED *** +# CHECK: $ "diff" "-r" "{{[^"]*}}" "-" +# CHECK: # command stderr: +# CHECK: Error: cannot recursively compare '-' +# CHECK: error: command failed with exit status: 
1
+
 # CHECK: PASS: shtest-shell :: diff-r.txt

 # CHECK: FAIL: shtest-shell :: error-0.txt
@@ -297,4 +410,4 @@
 # CHECK: PASS: shtest-shell :: sequencing-0.txt
 # CHECK: XFAIL: shtest-shell :: sequencing-1.txt
 # CHECK: PASS: shtest-shell :: valid-shell.txt
-# CHECK: Failing Tests (28)
+# CHECK: Failing Tests (30)

From llvm-commits at lists.llvm.org Thu Oct 10 10:40:01 2019
From: llvm-commits at lists.llvm.org (Nico Weber via llvm-commits)
Date: Thu, 10 Oct 2019 17:40:01 -0000
Subject: [llvm] r374391 - gn build: merge r374381 more (effectively a no-op)
Message-ID: <20191010174001.17984924C1@lists.llvm.org>

Author: nico
Date: Thu Oct 10 10:40:00 2019
New Revision: 374391

URL: http://llvm.org/viewvc/llvm-project?rev=374391&view=rev
Log:
gn build: merge r374381 more (effectively a no-op)

Modified:
    llvm/trunk/utils/gn/secondary/llvm/lib/DebugInfo/GSYM/BUILD.gn

Modified: llvm/trunk/utils/gn/secondary/llvm/lib/DebugInfo/GSYM/BUILD.gn
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/gn/secondary/llvm/lib/DebugInfo/GSYM/BUILD.gn?rev=374391&r1=374390&r2=374391&view=diff
==============================================================================
--- llvm/trunk/utils/gn/secondary/llvm/lib/DebugInfo/GSYM/BUILD.gn (original)
+++ llvm/trunk/utils/gn/secondary/llvm/lib/DebugInfo/GSYM/BUILD.gn Thu Oct 10 10:40:00 2019
@@ -1,6 +1,7 @@
 static_library("GSYM") {
   output_name = "LLVMDebugInfoGSYM"
   deps = [
+    "//llvm/lib/MC",
     "//llvm/lib/Support",
   ]
   sources = [

From llvm-commits at lists.llvm.org Thu Oct 10 10:40:12 2019
From: llvm-commits at lists.llvm.org (Joel E.
Denny via llvm-commits) Date: Thu, 10 Oct 2019 17:40:12 -0000 Subject: [llvm] r374392 - [lit] Extend internal diff to support -U Message-ID: <20191010174012.9E7F7924CF@lists.llvm.org> Author: jdenny Date: Thu Oct 10 10:40:12 2019 New Revision: 374392 URL: http://llvm.org/viewvc/llvm-project?rev=374392&view=rev Log: [lit] Extend internal diff to support -U When using lit's internal shell, RUN lines like the following accidentally execute an external `diff` instead of lit's internal `diff`: ``` # RUN: program | diff -U1 file - ``` Such cases exist now, in `clang/test/Analysis` for example. We are preparing patches to ensure lit's internal `diff` is called in such cases, which will then fail because lit's internal `diff` doesn't recognize `-U` as a command-line option. This patch adds `-U` support. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D68668 Added: llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-unified.txt Modified: llvm/trunk/utils/lit/lit/builtin_commands/diff.py llvm/trunk/utils/lit/tests/max-failures.py llvm/trunk/utils/lit/tests/shtest-shell.py Modified: llvm/trunk/utils/lit/lit/builtin_commands/diff.py URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/lit/builtin_commands/diff.py?rev=374392&r1=374391&r2=374392&view=diff ============================================================================== --- llvm/trunk/utils/lit/lit/builtin_commands/diff.py (original) +++ llvm/trunk/utils/lit/lit/builtin_commands/diff.py Thu Oct 10 10:40:12 2019 @@ -10,6 +10,7 @@ class DiffFlags(): self.ignore_all_space = False self.ignore_space_change = False self.unified_diff = False + self.num_context_lines = 3 self.recursive_diff = False self.strip_trailing_cr = False @@ -48,7 +49,10 @@ def compareTwoBinaryFiles(flags, filepat exitCode = 0 if hasattr(difflib, 'diff_bytes'): # python 3.5 or newer - diffs = difflib.diff_bytes(difflib.unified_diff, filelines[0], filelines[1], filepaths[0].encode(), filepaths[1].encode()) + diffs = 
difflib.diff_bytes(difflib.unified_diff, filelines[0], + filelines[1], filepaths[0].encode(), + filepaths[1].encode(), + n = flags.num_context_lines) diffs = [diff.decode(errors="backslashreplace") for diff in diffs] else: # python 2.7 @@ -56,7 +60,8 @@ def compareTwoBinaryFiles(flags, filepat func = difflib.unified_diff else: func = difflib.context_diff - diffs = func(filelines[0], filelines[1], filepaths[0], filepaths[1]) + diffs = func(filelines[0], filelines[1], filepaths[0], filepaths[1], + n = flags.num_context_lines) for diff in diffs: sys.stdout.write(diff) @@ -88,7 +93,8 @@ def compareTwoTextFiles(flags, filepaths filelines[idx]= [f(line) for line in lines] func = difflib.unified_diff if flags.unified_diff else difflib.context_diff - for diff in func(filelines[0], filelines[1], filepaths[0], filepaths[1]): + for diff in func(filelines[0], filelines[1], filepaths[0], filepaths[1], + n = flags.num_context_lines): sys.stdout.write(diff) exitCode = 1 return exitCode @@ -171,7 +177,7 @@ def compareDirTrees(flags, dir_trees, ba def main(argv): args = argv[1:] try: - opts, args = getopt.gnu_getopt(args, "wbur", ["strip-trailing-cr"]) + opts, args = getopt.gnu_getopt(args, "wbuU:r", ["strip-trailing-cr"]) except getopt.GetoptError as err: sys.stderr.write("Unsupported: 'diff': %s\n" % str(err)) sys.exit(1) @@ -185,6 +191,16 @@ def main(argv): flags.ignore_space_change = True elif o == "-u": flags.unified_diff = True + elif o.startswith("-U"): + flags.unified_diff = True + try: + flags.num_context_lines = int(a) + if flags.num_context_lines < 0: + raise ValueException + except: + sys.stderr.write("Error: invalid '-U' argument: {}\n" + .format(a)) + sys.exit(1) elif o == "-r": flags.recursive_diff = True elif o == "--strip-trailing-cr": Added: llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-unified.txt URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-unified.txt?rev=374392&view=auto 
============================================================================== --- llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-unified.txt (added) +++ llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-unified.txt Thu Oct 10 10:40:12 2019 @@ -0,0 +1,38 @@ +# RUN: echo 1 > %t.foo +# RUN: echo 2 >> %t.foo +# RUN: echo 3 >> %t.foo +# RUN: echo 4 >> %t.foo +# RUN: echo 5 >> %t.foo +# RUN: echo 6 foo >> %t.foo +# RUN: echo 7 >> %t.foo +# RUN: echo 8 >> %t.foo +# RUN: echo 9 >> %t.foo +# RUN: echo 10 >> %t.foo +# RUN: echo 11 >> %t.foo + +# RUN: echo 1 > %t.bar +# RUN: echo 2 >> %t.bar +# RUN: echo 3 >> %t.bar +# RUN: echo 4 >> %t.bar +# RUN: echo 5 >> %t.bar +# RUN: echo 6 bar >> %t.bar +# RUN: echo 7 >> %t.bar +# RUN: echo 8 >> %t.bar +# RUN: echo 9 >> %t.bar +# RUN: echo 10 >> %t.bar +# RUN: echo 11 >> %t.bar + +# Default is 3 lines of context. +# RUN: diff -u %t.foo %t.bar && false || true + +# Override default of 3 lines of context. +# RUN: diff -U 2 %t.foo %t.bar && false || true +# RUN: diff -U4 %t.foo %t.bar && false || true +# RUN: diff -U0 %t.foo %t.bar && false || true + +# Check bad -U argument. +# RUN: diff -U 30.1 %t.foo %t.foo && false || true +# RUN: diff -U-1 %t.foo %t.foo && false || true + +# Fail so lit will print output. +# RUN: false Modified: llvm/trunk/utils/lit/tests/max-failures.py URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/max-failures.py?rev=374392&r1=374391&r2=374392&view=diff ============================================================================== --- llvm/trunk/utils/lit/tests/max-failures.py (original) +++ llvm/trunk/utils/lit/tests/max-failures.py Thu Oct 10 10:40:12 2019 @@ -8,7 +8,7 @@ # # END. 
-# CHECK: Failing Tests (30) +# CHECK: Failing Tests (31) # CHECK: Failing Tests (1) # CHECK: Failing Tests (2) # CHECK: error: Option '--max-failures' requires positive integer Modified: llvm/trunk/utils/lit/tests/shtest-shell.py URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/shtest-shell.py?rev=374392&r1=374391&r2=374392&view=diff ============================================================================== --- llvm/trunk/utils/lit/tests/shtest-shell.py (original) +++ llvm/trunk/utils/lit/tests/shtest-shell.py Thu Oct 10 10:40:12 2019 @@ -331,6 +331,82 @@ # CHECK: PASS: shtest-shell :: diff-r.txt + +# CHECK: FAIL: shtest-shell :: diff-unified.txt + +# CHECK: *** TEST 'shtest-shell :: diff-unified.txt' FAILED *** + +# CHECK: $ "diff" "-u" "{{[^"]*}}.foo" "{{[^"]*}}.bar" +# CHECK: # command output: +# CHECK: @@ {{.*}} @@ +# CHECK-NEXT: 3 +# CHECK-NEXT: 4 +# CHECK-NEXT: 5 +# CHECK-NEXT: -6 foo +# CHECK-NEXT: +6 bar +# CHECK-NEXT: 7 +# CHECK-NEXT: 8 +# CHECK-NEXT: 9 +# CHECK-EMPTY: +# CHECK-NEXT: error: command failed with exit status: 1 +# CHECK-NEXT: $ "true" + +# CHECK: $ "diff" "-U" "2" "{{[^"]*}}.foo" "{{[^"]*}}.bar" +# CHECK: # command output: +# CHECK: @@ {{.*}} @@ +# CHECK-NEXT: 4 +# CHECK-NEXT: 5 +# CHECK-NEXT: -6 foo +# CHECK-NEXT: +6 bar +# CHECK-NEXT: 7 +# CHECK-NEXT: 8 +# CHECK-EMPTY: +# CHECK-NEXT: error: command failed with exit status: 1 +# CHECK-NEXT: $ "true" + +# CHECK: $ "diff" "-U4" "{{[^"]*}}.foo" "{{[^"]*}}.bar" +# CHECK: # command output: +# CHECK: @@ {{.*}} @@ +# CHECK-NEXT: 2 +# CHECK-NEXT: 3 +# CHECK-NEXT: 4 +# CHECK-NEXT: 5 +# CHECK-NEXT: -6 foo +# CHECK-NEXT: +6 bar +# CHECK-NEXT: 7 +# CHECK-NEXT: 8 +# CHECK-NEXT: 9 +# CHECK-NEXT: 10 +# CHECK-EMPTY: +# CHECK-NEXT: error: command failed with exit status: 1 +# CHECK-NEXT: $ "true" + +# CHECK: $ "diff" "-U0" "{{[^"]*}}.foo" "{{[^"]*}}.bar" +# CHECK: # command output: +# CHECK: @@ {{.*}} @@ +# CHECK-NEXT: -6 foo +# CHECK-NEXT: +6 bar +# CHECK-EMPTY: +# CHECK-NEXT: error: 
command failed with exit status: 1 +# CHECK-NEXT: $ "true" + +# CHECK: $ "diff" "-U" "30.1" "{{[^"]*}}" "{{[^"]*}}" +# CHECK: # command stderr: +# CHECK: Error: invalid '-U' argument: 30.1 +# CHECK: error: command failed with exit status: 1 +# CHECK: $ "true" + +# CHECK: $ "diff" "-U-1" "{{[^"]*}}" "{{[^"]*}}" +# CHECK: # command stderr: +# CHECK: Error: invalid '-U' argument: -1 +# CHECK: error: command failed with exit status: 1 +# CHECK: $ "true" + +# CHECK: $ "false" + +# CHECK: *** + + # CHECK: FAIL: shtest-shell :: error-0.txt # CHECK: *** TEST 'shtest-shell :: error-0.txt' FAILED *** # CHECK: $ "not-a-real-command" @@ -410,4 +486,4 @@ # CHECK: PASS: shtest-shell :: sequencing-0.txt # CHECK: XFAIL: shtest-shell :: sequencing-1.txt # CHECK: PASS: shtest-shell :: valid-shell.txt -# CHECK: Failing Tests (30) +# CHECK: Failing Tests (31) From llvm-commits at lists.llvm.org Thu Oct 10 10:43:13 2019 From: llvm-commits at lists.llvm.org (Digger via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 17:43:13 +0000 (UTC) Subject: [PATCH] D66969: Output XCOFF object text section header and symbol entry for program code In-Reply-To: References: Message-ID: DiggerLin marked 9 inline comments as done. DiggerLin added inline comments. ================ Comment at: llvm/lib/MC/XCOFFObjectWriter.cpp:518 + Csect.Size = Layout.getSectionAddressSize(MCSec); + Address = Csect.Address + Csect.Size; + Csect.SymbolTableIndex = SymbolTableIndex; ---------------- hubert.reinterpretcast wrote: > There's two spaces after the `+`. deleted, thanks ================ Comment at: llvm/lib/MC/XCOFFObjectWriter.cpp:296 + // Write the program code control sections one at a time. 
+ uint32_t PreCSectEndAddress = Text.Address; + uint32_t PaddingSize; ---------------- hubert.reinterpretcast wrote: > Suggestion: `CurrentAddressLocation` changed as suggestion ================ Comment at: llvm/lib/MC/XCOFFObjectWriter.cpp:299 + for (const auto &Csect : ProgramCodeCsects) { + // PaddingSize = Virtual address of current CSect - Virtual end address of + // previous CSect. ---------------- hubert.reinterpretcast wrote: > I think the code can be made sufficiently self-explanatory that we don't need a comment here. deleted the comment. ================ Comment at: llvm/lib/MC/XCOFFObjectWriter.cpp:308 + + // Padding Size of Tail Section = + // Virtual end address of current Section - Virtual end address of last CSect. ---------------- hubert.reinterpretcast wrote: > Suggestion: > The size of the tail padding in a section is the end virtual address of the current section minus the the end virtual address of the last csect in that section. changed comment as suggestion ================ Comment at: llvm/lib/MC/XCOFFObjectWriter.cpp:310 + // Virtual end address of current Section - Virtual end address of last CSect. + if (ProgramCodeCsects.size()) { + PaddingSize = Text.Address + Text.Size - PreCSectEndAddress; ---------------- hubert.reinterpretcast wrote: > ``` > if (!ProgramCodeCsects.empty()) > ``` > however, I suggest checking the section and not the group of csects (they aren't the same thing): > ``` > if (Text.Index != -1) > ``` changed as suggestion. ================ Comment at: llvm/lib/MC/XCOFFObjectWriter.cpp:312 + PaddingSize = Text.Address + Text.Size - PreCSectEndAddress; + if (PaddingSize) + W.OS.write_zeros(PaddingSize); ---------------- hubert.reinterpretcast wrote: > Same comment about the write to `PaddingSize`. changed as suggestion. ================ Comment at: llvm/lib/MC/XCOFFObjectWriter.cpp:531 + + // First Csect of each section do not need padding zero. We need to + // adjust section virtual address to first Csect's address. 
---------------- hubert.reinterpretcast wrote: > Use "csect" instead of "Csect" when using the term in an English context where the word would not be capitalized. > > Suggestion: > The first csect of a section can be aligned by adjusting the virtual address of its containing section instead of writing zeroes into the object file. changed as suggestion ================ Comment at: llvm/lib/MC/XCOFFObjectWriter.cpp:544 BSS.Index = SectionIndex++; - assert(alignTo(Address, DefaultSectionAlign) == Address && - "Improperly aligned address for section."); - uint32_t StartAddress = Address; + // We use alignment address of previous section as BSS start address. + BSS.Address = Address; ---------------- hubert.reinterpretcast wrote: > The difference in the calculation for the virtual address of the `.bss` section and that of the `.text` section might complicate efforts to common up the handling. Note that a change in how the virtual address of `.bss` is calculated is within the scope of this patch because it changes the value from being always zero. changed , thanks ================ Comment at: llvm/test/CodeGen/PowerPC/aix-xcoff-lcomm.ll:56 ; SYMS-NEXT: Symbol { -; SYMS-NEXT: Index: [[#Index:]] -; SYMS-NEXT: Name: a +; SYMS: Index: [[#Index:]]{{[[:space:]]*}}Name: a ; SYMS-NEXT: Value (RelocatableAddress): 0x0 ---------------- hubert.reinterpretcast wrote: > It seems this file was changed accidentally by today's updates. The ` ` (space) character before the `*` is correct. 
changed , thanks Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D66969/new/ https://reviews.llvm.org/D66969 From llvm-commits at lists.llvm.org Thu Oct 10 10:43:14 2019 From: llvm-commits at lists.llvm.org (Digger via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 17:43:14 +0000 (UTC) Subject: [PATCH] D66969: Output XCOFF object text section header and symbol entry for program code In-Reply-To: References: Message-ID: <681fe42b41c24ff55742cc63bc577285@localhost.localdomain> DiggerLin updated this revision to Diff 224408. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D66969/new/ https://reviews.llvm.org/D66969 Files: llvm/include/llvm/MC/MCSectionXCOFF.h llvm/lib/MC/MCXCOFFStreamer.cpp llvm/lib/MC/XCOFFObjectWriter.cpp llvm/test/CodeGen/PowerPC/aix-return55.ll llvm/test/CodeGen/PowerPC/aix-xcoff-common.ll llvm/test/CodeGen/PowerPC/aix-xcoff-lcomm.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D66969.224408.patch Type: text/x-patch Size: 20609 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Thu Oct 10 10:43:32 2019 From: llvm-commits at lists.llvm.org (Joel E. Denny via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 17:43:32 +0000 (UTC) Subject: [PATCH] D68664: [lit] Clean up internal diff's encoding handling In-Reply-To: References: Message-ID: This revision was automatically updated to reflect the committed changes. Closed by commit rG19e6bb25f05f: [lit] Clean up internal diff's encoding handling (authored by jdenny). 
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68664/new/ https://reviews.llvm.org/D68664 Files: llvm/utils/lit/lit/builtin_commands/diff.py llvm/utils/lit/tests/Inputs/shtest-shell/diff-encodings.txt llvm/utils/lit/tests/Inputs/shtest-shell/diff-in.bin llvm/utils/lit/tests/Inputs/shtest-shell/diff-in.utf16 llvm/utils/lit/tests/Inputs/shtest-shell/diff-in.utf8 llvm/utils/lit/tests/max-failures.py llvm/utils/lit/tests/shtest-shell.py -------------- next part -------------- A non-text attachment was scrubbed... Name: D68664.224410.patch Type: text/x-patch Size: 6059 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Thu Oct 10 10:43:34 2019 From: llvm-commits at lists.llvm.org (Joel E. Denny via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 17:43:34 +0000 (UTC) Subject: [PATCH] D67643: [lit] Extend internal diff to support `-` argument In-Reply-To: References: Message-ID: This revision was automatically updated to reflect the committed changes. Closed by commit rGf4edce12fffe: [lit] Extend internal diff to support `-` argument (authored by jdenny). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67643/new/ https://reviews.llvm.org/D67643 Files: llvm/utils/lit/lit/builtin_commands/diff.py llvm/utils/lit/tests/Inputs/shtest-shell/diff-encodings.txt llvm/utils/lit/tests/Inputs/shtest-shell/diff-pipes.txt llvm/utils/lit/tests/Inputs/shtest-shell/diff-r-error-7.txt llvm/utils/lit/tests/Inputs/shtest-shell/diff-r-error-8.txt llvm/utils/lit/tests/max-failures.py llvm/utils/lit/tests/shtest-shell.py -------------- next part -------------- A non-text attachment was scrubbed... Name: D67643.224411.patch Type: text/x-patch Size: 7915 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Thu Oct 10 10:43:49 2019 From: llvm-commits at lists.llvm.org (Joel E. 
Denny via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 17:43:49 +0000 (UTC) Subject: [PATCH] D68668: [lit] Extend internal diff to support -U In-Reply-To: References: Message-ID: <330b7875f4057886ad78f739fe832f59@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rG61d7ecbf84c7: [lit] Extend internal diff to support -U (authored by jdenny). Changed prior to commit: https://reviews.llvm.org/D68668?vs=223948&id=224412#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68668/new/ https://reviews.llvm.org/D68668 Files: llvm/utils/lit/lit/builtin_commands/diff.py llvm/utils/lit/tests/Inputs/shtest-shell/diff-unified.txt llvm/utils/lit/tests/max-failures.py llvm/utils/lit/tests/shtest-shell.py -------------- next part -------------- A non-text attachment was scrubbed... Name: D68668.224412.patch Type: text/x-patch Size: 7024 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Thu Oct 10 10:47:18 2019 From: llvm-commits at lists.llvm.org (Nico Weber via llvm-commits) Date: Thu, 10 Oct 2019 17:47:18 -0000 Subject: [llvm] r374395 - gn build: restore tablegen restat optimization after r373664 Message-ID: <20191010174718.AEC3F92479@lists.llvm.org> Author: nico Date: Thu Oct 10 10:47:18 2019 New Revision: 374395 URL: http://llvm.org/viewvc/llvm-project?rev=374395&view=rev Log: gn build: restore tablegen restat optimization after r373664 Modified: llvm/trunk/utils/gn/secondary/llvm/utils/TableGen/tablegen.gni Modified: llvm/trunk/utils/gn/secondary/llvm/utils/TableGen/tablegen.gni URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/gn/secondary/llvm/utils/TableGen/tablegen.gni?rev=374395&r1=374394&r2=374395&view=diff ============================================================================== --- llvm/trunk/utils/gn/secondary/llvm/utils/TableGen/tablegen.gni (original) +++ llvm/trunk/utils/gn/secondary/llvm/utils/TableGen/tablegen.gni Thu Oct 10 
10:47:18 2019 @@ -66,6 +66,8 @@ template("tablegen") { args = [ rebase_path(tblgen_executable, root_build_dir), + "--write-if-changed", + "-I", rebase_path("//llvm/include", root_build_dir), From llvm-commits at lists.llvm.org Thu Oct 10 10:49:33 2019 From: llvm-commits at lists.llvm.org (Greg Clayton via llvm-commits) Date: Thu, 10 Oct 2019 17:49:33 -0000 Subject: [llvm] r374396 - Unbreak windows buildbots. Message-ID: <20191010174933.89A5B924BD@lists.llvm.org> Author: gclayton Date: Thu Oct 10 10:49:33 2019 New Revision: 374396 URL: http://llvm.org/viewvc/llvm-project?rev=374396&view=rev Log: Unbreak windows buildbots. Modified: llvm/trunk/lib/DebugInfo/GSYM/GsymReader.cpp Modified: llvm/trunk/lib/DebugInfo/GSYM/GsymReader.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/DebugInfo/GSYM/GsymReader.cpp?rev=374396&r1=374395&r2=374396&view=diff ============================================================================== --- llvm/trunk/lib/DebugInfo/GSYM/GsymReader.cpp (original) +++ llvm/trunk/lib/DebugInfo/GSYM/GsymReader.cpp Thu Oct 10 10:49:33 2019 @@ -14,7 +14,6 @@ #include #include #include -#include #include #include #include From llvm-commits at lists.llvm.org Thu Oct 10 10:52:03 2019 From: llvm-commits at lists.llvm.org (Sanjay Patel via llvm-commits) Date: Thu, 10 Oct 2019 17:52:03 -0000 Subject: [llvm] r374397 - [DAGCombiner] fold select-of-constants to shift Message-ID: <20191010175203.337AA924A9@lists.llvm.org> Author: spatel Date: Thu Oct 10 10:52:02 2019 New Revision: 374397 URL: http://llvm.org/viewvc/llvm-project?rev=374397&view=rev Log: [DAGCombiner] fold select-of-constants to shift This reverses the scalar canonicalization proposed in D63382. Pre: isPowerOf2(C1) %r = select i1 %cond, i32 C1, i32 0 => %z = zext i1 %cond to i32 %r = shl i32 %z, log2(C1) https://rise4fun.com/Alive/Z50 x86 already tries to fold this pattern, but it isn't done uniformly, so we still see a diff. 
AArch64 probably should enable the TLI hook to benefit too, but that's a follow-on. Modified: llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp llvm/trunk/test/CodeGen/X86/selectcc-to-shiftand.ll Modified: llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp?rev=374397&r1=374396&r2=374397&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp (original) +++ llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp Thu Oct 10 10:52:02 2019 @@ -8221,10 +8221,11 @@ SDValue DAGCombiner::foldSelectOfConstan return Cond; } - // For any constants that differ by 1, we can transform the select into an - // extend and add. Use a target hook because some targets may prefer to - // transform in the other direction. + // Use a target hook because some targets may prefer to transform in the + // other direction. if (TLI.convertSelectOfConstantsToMath(VT)) { + // For any constants that differ by 1, we can transform the select into an + // extend and add. 
const APInt &C1Val = C1->getAPIntValue(); const APInt &C2Val = C2->getAPIntValue(); if (C1Val - 1 == C2Val) { @@ -8239,6 +8240,14 @@ SDValue DAGCombiner::foldSelectOfConstan Cond = DAG.getNode(ISD::SIGN_EXTEND, DL, VT, Cond); return DAG.getNode(ISD::ADD, DL, VT, Cond, N2); } + + // select Cond, Pow2, 0 --> (zext Cond) << log2(Pow2) + if (C1Val.isPowerOf2() && C2Val.isNullValue()) { + if (VT != MVT::i1) + Cond = DAG.getNode(ISD::ZERO_EXTEND, DL, VT, Cond); + SDValue ShAmtC = DAG.getConstant(C1Val.exactLogBase2(), DL, VT); + return DAG.getNode(ISD::SHL, DL, VT, Cond, ShAmtC); + } } return SDValue(); Modified: llvm/trunk/test/CodeGen/X86/selectcc-to-shiftand.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/selectcc-to-shiftand.ll?rev=374397&r1=374396&r2=374397&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/selectcc-to-shiftand.ll (original) +++ llvm/trunk/test/CodeGen/X86/selectcc-to-shiftand.ll Thu Oct 10 10:52:02 2019 @@ -202,19 +202,15 @@ define i32 @PR31175(i32 %x, i32 %y) { define i8 @sel_shift_bool_i8(i1 %t) { ; CHECK-NOBMI-LABEL: sel_shift_bool_i8: ; CHECK-NOBMI: # %bb.0: -; CHECK-NOBMI-NEXT: # kill: def $edi killed $edi def $rdi -; CHECK-NOBMI-NEXT: notb %dil -; CHECK-NOBMI-NEXT: shlb $7, %dil -; CHECK-NOBMI-NEXT: leal -128(%rdi), %eax +; CHECK-NOBMI-NEXT: movl %edi, %eax +; CHECK-NOBMI-NEXT: shlb $7, %al ; CHECK-NOBMI-NEXT: # kill: def $al killed $al killed $eax ; CHECK-NOBMI-NEXT: retq ; ; CHECK-BMI-LABEL: sel_shift_bool_i8: ; CHECK-BMI: # %bb.0: -; CHECK-BMI-NEXT: # kill: def $edi killed $edi def $rdi -; CHECK-BMI-NEXT: notb %dil -; CHECK-BMI-NEXT: shlb $7, %dil -; CHECK-BMI-NEXT: leal -128(%rdi), %eax +; CHECK-BMI-NEXT: movl %edi, %eax +; CHECK-BMI-NEXT: shlb $7, %al ; CHECK-BMI-NEXT: # kill: def $al killed $al killed $eax ; CHECK-BMI-NEXT: retq %shl = select i1 %t, i8 128, i8 0 From llvm-commits at lists.llvm.org Thu Oct 10 10:52:34 2019 From: llvm-commits at 
lists.llvm.org (Greg Clayton via llvm-commits) Date: Thu, 10 Oct 2019 17:52:34 -0000 Subject: [llvm] r374398 - Unbreak llvm-clang-lld-x86_64-scei-ps4-windows10pro-fast buildbot. Message-ID: <20191010175234.108D5924F4@lists.llvm.org> Author: gclayton Date: Thu Oct 10 10:52:33 2019 New Revision: 374398 URL: http://llvm.org/viewvc/llvm-project?rev=374398&view=rev Log: Unbreak llvm-clang-lld-x86_64-scei-ps4-windows10pro-fast buildbot. Modified: llvm/trunk/lib/DebugInfo/GSYM/GsymCreator.cpp Modified: llvm/trunk/lib/DebugInfo/GSYM/GsymCreator.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/DebugInfo/GSYM/GsymCreator.cpp?rev=374398&r1=374397&r2=374398&view=diff ============================================================================== --- llvm/trunk/lib/DebugInfo/GSYM/GsymCreator.cpp (original) +++ llvm/trunk/lib/DebugInfo/GSYM/GsymCreator.cpp Thu Oct 10 10:52:33 2019 @@ -14,7 +14,7 @@ #include #include - +#include using namespace llvm; using namespace gsym; From llvm-commits at lists.llvm.org Thu Oct 10 10:53:08 2019 From: llvm-commits at lists.llvm.org (Krzysztof Parzyszek via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 17:53:08 +0000 (UTC) Subject: [PATCH] D68651: [InstCombine] Signed saturation patterns In-Reply-To: References: Message-ID: <5073d2ca668501e19a0100e20b2c7d8e@localhost.localdomain> kparzysz added a comment. I'm in favor of treating signed saturation as canonical. The issue in delaying detection of such cases to instruction selection is the volatility of the IR: there is no guarantee that the IR will remain in the same form (expected by isel) from one day to the next. For example, some optimization may decide to just promote the operations to the wider type and only do the extension/truncate once, depending on how many saturating operations may be near one another. Handling this variability in isel is just not feasible. 
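
For readers following the thread, the signed saturation under discussion can be sketched as below. This is an illustrative Python model and not code from D68651; the helper name `sadd_sat` and the default 8-bit width are assumptions chosen for the example. It shows the widen, add, clamp sequence that a signed saturating add is equivalent to.

```python
def sadd_sat(a, b, bits=8):
    """Signed saturating add (hypothetical helper, for illustration only)."""
    lo = -(1 << (bits - 1))      # smallest representable value, e.g. -128
    hi = (1 << (bits - 1)) - 1   # largest representable value, e.g. 127
    # Python ints are arbitrary precision, so a + b plays the role of the
    # sum computed in a wider type; the clamp plays the role of the truncate.
    return max(lo, min(hi, a + b))
```

In this model `sadd_sat(100, 100)` clamps to the 8-bit maximum and `sadd_sat(-100, -100)` clamps to the 8-bit minimum, while sums that fit are returned unchanged.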
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68651/new/ https://reviews.llvm.org/D68651 From llvm-commits at lists.llvm.org Thu Oct 10 10:53:08 2019 From: llvm-commits at lists.llvm.org (Nico Weber via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 17:53:08 +0000 (UTC) Subject: [PATCH] D68744: [GSYM] Add GsymCreator and GsymReader. In-Reply-To: References: Message-ID: <5266b8ec7dbffc58c00ed6b2f5b93777@localhost.localdomain> thakis added a comment. This doesn't build on Windows: http://lab.llvm.org:8011/builders/clang-x64-windows-msvc/builds/11397 C:\b\slave\clang-x64-windows-msvc\build\llvm.src\lib\DebugInfo\GSYM\GsymReader.cpp(17): fatal error C1083: Cannot open include file: 'sys/mman.h': No such file or directory Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68744/new/ https://reviews.llvm.org/D68744 From llvm-commits at lists.llvm.org Thu Oct 10 10:53:09 2019 From: llvm-commits at lists.llvm.org (Nico Weber via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 17:53:09 +0000 (UTC) Subject: [PATCH] D68744: [GSYM] Add GsymCreator and GsymReader. In-Reply-To: References: Message-ID: <823c61c1e89930b94b2a8a4ef56afe32@localhost.localdomain> thakis added a comment. The fix attempt didn't work: ECURE_NO_WARNINGS -D_CRT_NONSTDC_NO_DEPRECATE -D_CRT_NONSTDC_NO_WARNINGS -D_SCL_SECURE_NO_DEPRECATE -D_SCL_SECURE_NO_WARNINGS -D_HAS_EXCEPTIONS=0 -D_UNICODE -DUNICODE -I../../llvm/include -Igen/llvm/include /O2 /Zc:inline /EHs-c- /W4 -Wno-unused-parameter -Wdelete-non-virtual-dtor -Wstring-conversion -Wno-nonportable-include-path -Wcovered-switch-default /GR- ../../llvm/lib/DebugInfo/GSYM/GsymCreator.cpp(76,3): error: use of undeclared identifier 'bzero' bzero(Hdr.UUID, sizeof(Hdr.UUID)); ^ 1 error generated. 
[16/106] CXX obj/llvm/lib/DebugInfo/GSYM/GSYM.GsymReader.obj FAILED: obj/llvm/lib/DebugInfo/GSYM/GSYM.GsymReader.obj c:\src\goma\goma-win64/gomacc c:/src/chrome/src/third_party/llvm-build/Release+Asserts/bin/clang-cl /nologo /showIncludes /Foobj/llvm/lib/DebugInfo/GSYM/GSYM.GsymReader.obj /c ../../llvm/lib/DebugInfo/GSYM/GsymReader.cpp -D_CRT_SECURE_NO_DEPRECATE -D_CRT_SECURE_NO_WARNINGS -D_CRT_NONSTDC_NO_DEPRECATE -D_CRT_NONSTDC_NO_WARNINGS -D_SCL_SECURE_NO_DEPRECATE -D_SCL_SECURE_NO_WARNINGS -D_HAS_EXCEPTIONS=0 -D_UNICODE -DUNICODE -I../../llvm/include -Igen/llvm/include /O2 /Zc:inline /EHs-c- /W4 -Wno-unused-parameter -Wdelete-non-virtual-dtor -Wstring-conversion -Wno-nonportable-include-path -Wcovered-switch-default /GR- ../../llvm/lib/DebugInfo/GSYM/GsymReader.cpp(19,10): fatal error: 'unistd.h' file not found #include Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68744/new/ https://reviews.llvm.org/D68744 From llvm-commits at lists.llvm.org Thu Oct 10 10:53:43 2019 From: llvm-commits at lists.llvm.org (=?utf-8?q?D=C3=A1vid_Bolvansk=C3=BD_via_Phabricator?= via llvm-commits) Date: Thu, 10 Oct 2019 17:53:43 +0000 (UTC) Subject: [PATCH] D67986: [InstCombine] snprintf (d, size, "%s", s) -> memccpy (d, s, '\0', size - 1), d[size - 1] = 0 In-Reply-To: References: Message-ID: xbolva00 added a comment. BTW, we transform snprintf(d, s, "%s" , ...) into two calls - memcpy + strlen - and nobody is concerned about code size increase anyway. So I dont think code size is problem here. Various InstCombine transformations produces two calls. (here it is just call + select..). I will continue with this patch. 
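
As a rough illustration of the rewrite named in the subject line, the two forms can be modelled on byte buffers as below. This is a sketch under stated assumptions, not the InstCombine code from D67986: the helper names (`memccpy`, `snprintf_s`, `memccpy_form`, `c_string`) are hypothetical Python stand-ins, `size` is assumed to be greater than zero, and only the resulting string in the destination is modelled, not the return value of `snprintf`.

```python
def memccpy(dst, src, c, n):
    """Model of C memccpy: copy up to n bytes, stopping after the first c."""
    for i in range(min(n, len(src))):
        dst[i] = src[i]
        if src[i] == c:
            return i + 1  # index just past the copied c
    return None  # c was not found within the first n bytes


def c_string(buf):
    """Bytes of buf up to, but not including, the first NUL."""
    return bytes(buf[:buf.index(0)])


def snprintf_s(dst, size, s):
    """Model of snprintf(dst, size, "%s", s); s holds no NUL, size > 0."""
    n = min(len(s), size - 1)
    dst[:n] = s[:n]
    dst[n] = 0


def memccpy_form(dst, size, s):
    """Model of the rewritten form: memccpy(d, s, 0, size - 1); d[size - 1] = 0."""
    memccpy(dst, s + b"\0", 0, size - 1)  # the C source string ends in NUL
    dst[size - 1] = 0
```

In this model both forms leave the same C string in the destination, whether the source is truncated or fits with room to spare. The bytes after the terminator may differ, which a C caller reading the string would not observe.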
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67986/new/ https://reviews.llvm.org/D67986 From llvm-commits at lists.llvm.org Thu Oct 10 10:58:38 2019 From: llvm-commits at lists.llvm.org (Julian Lettner via llvm-commits) Date: Thu, 10 Oct 2019 17:58:38 -0000 Subject: [llvm] r374400 - [lit] Move argument parsing/validation to separate file Message-ID: <20191010175838.3F54A924A2@lists.llvm.org> Author: yln Date: Thu Oct 10 10:58:38 2019 New Revision: 374400 URL: http://llvm.org/viewvc/llvm-project?rev=374400&view=rev Log: [lit] Move argument parsing/validation to separate file Reviewed By: serge-sans-paille Differential Revision: https://reviews.llvm.org/D68529 Added: llvm/trunk/utils/lit/lit/cl_arguments.py Modified: llvm/trunk/utils/lit/lit/main.py Added: llvm/trunk/utils/lit/lit/cl_arguments.py URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/lit/cl_arguments.py?rev=374400&view=auto ============================================================================== --- llvm/trunk/utils/lit/lit/cl_arguments.py (added) +++ llvm/trunk/utils/lit/lit/cl_arguments.py Thu Oct 10 10:58:38 2019 @@ -0,0 +1,218 @@ +import argparse +import os +import shlex +import sys + +import lit.util + +def parse_args(): + parser = argparse.ArgumentParser() + parser.add_argument('test_paths', + nargs='*', + help='Files or paths to include in the test suite') + + parser.add_argument("--version", + dest="show_version", + help="Show version and exit", + action="store_true", + default=False) + parser.add_argument("-j", "--workers", + dest="numWorkers", + metavar="N", + help="Number of workers used for testing", + type=int, + default=None) + parser.add_argument("--config-prefix", + dest="configPrefix", + metavar="NAME", + help="Prefix for 'lit' config files", + action="store", + default=None) + parser.add_argument("-D", "--param", + dest="userParameters", + metavar="NAME=VAL", + help="Add 'NAME' = 'VAL' to the user defined parameters", + type=str, + action="append", + default=[]) 
+ + format_group = parser.add_argument_group("Output Format") + # FIXME: I find these names very confusing, although I like the + # functionality. + format_group.add_argument("-q", "--quiet", + help="Suppress no error output", + action="store_true", + default=False) + format_group.add_argument("-s", "--succinct", + help="Reduce amount of output", + action="store_true", + default=False) + format_group.add_argument("-v", "--verbose", + dest="showOutput", + help="Show test output for failures", + action="store_true", + default=False) + format_group.add_argument("-vv", "--echo-all-commands", + dest="echoAllCommands", + action="store_true", + default=False, + help="Echo all commands as they are executed to stdout. In case of " + "failure, last command shown will be the failing one.") + format_group.add_argument("-a", "--show-all", + dest="showAllOutput", + help="Display all commandlines and output", + action="store_true", + default=False) + format_group.add_argument("-o", "--output", + dest="output_path", + help="Write test results to the provided path", + action="store", + metavar="PATH") + format_group.add_argument("--no-progress-bar", + dest="useProgressBar", + help="Do not use curses based progress bar", + action="store_false", + default=True) + format_group.add_argument("--show-unsupported", + help="Show unsupported tests", + action="store_true", + default=False) + format_group.add_argument("--show-xfail", + help="Show tests that were expected to fail", + action="store_true", + default=False) + + execution_group = parser.add_argument_group("Test Execution") + execution_group.add_argument("--path", + help="Additional paths to add to testing environment", + action="append", + type=str, + default=[]) + execution_group.add_argument("--vg", + dest="useValgrind", + help="Run tests under valgrind", + action="store_true", + default=False) + execution_group.add_argument("--vg-leak", + dest="valgrindLeakCheck", + help="Check for memory leaks under valgrind", + 
action="store_true", + default=False) + execution_group.add_argument("--vg-arg", + dest="valgrindArgs", + metavar="ARG", + help="Specify an extra argument for valgrind", + type=str, + action="append", + default=[]) + execution_group.add_argument("--time-tests", + dest="timeTests", + help="Track elapsed wall time for each test", + action="store_true", + default=False) + execution_group.add_argument("--no-execute", + dest="noExecute", + help="Don't execute any tests (assume PASS)", + action="store_true", + default=False) + execution_group.add_argument("--xunit-xml-output", + dest="xunit_output_file", + help="Write XUnit-compatible XML test reports to the specified file", + default=None) + execution_group.add_argument("--timeout", + dest="maxIndividualTestTime", + help="Maximum time to spend running a single test (in seconds). " + "0 means no time limit. [Default: 0]", + type=int, + default=None) + execution_group.add_argument("--max-failures", + dest="maxFailures", + help="Stop execution after the given number of failures.", + action="store", + type=int, + default=None) + + selection_group = parser.add_argument_group("Test Selection") + selection_group.add_argument("--max-tests", + dest="maxTests", + metavar="N", + help="Maximum number of tests to run", + action="store", + type=int, + default=None) + selection_group.add_argument("--max-time", + dest="maxTime", + metavar="N", + help="Maximum time to spend testing (in seconds)", + action="store", + type=float, + default=None) + selection_group.add_argument("--shuffle", + help="Run tests in random order", + action="store_true", + default=False) + selection_group.add_argument("-i", "--incremental", + help="Run modified and failing tests first (updates mtimes)", + action="store_true", + default=False) + selection_group.add_argument("--filter", + metavar="REGEX", + help="Only run tests with paths matching the given regular expression", + action="store", + default=os.environ.get("LIT_FILTER")) + 
selection_group.add_argument("--num-shards", dest="numShards", metavar="M", + help="Split testsuite into M pieces and only run one", + action="store", + type=int, + default=os.environ.get("LIT_NUM_SHARDS")) + selection_group.add_argument("--run-shard", + dest="runShard", + metavar="N", + help="Run shard #N of the testsuite", + action="store", + type=int, + default=os.environ.get("LIT_RUN_SHARD")) + + debug_group = parser.add_argument_group("Debug and Experimental Options") + debug_group.add_argument("--debug", + help="Enable debugging (for 'lit' development)", + action="store_true", + default=False) + debug_group.add_argument("--show-suites", + dest="showSuites", + help="Show discovered test suites", + action="store_true", + default=False) + debug_group.add_argument("--show-tests", + dest="showTests", + help="Show all discovered tests", + action="store_true", + default=False) + + opts = parser.parse_args(sys.argv[1:] + + shlex.split(os.environ.get("LIT_OPTS", ""))) + + # Validate options + if not opts.test_paths: + parser.error('No inputs specified') + + if opts.numWorkers is None: + opts.numWorkers = lit.util.detectCPUs() + elif opts.numWorkers <= 0: + parser.error("Option '--workers' or '-j' requires positive integer") + + if opts.maxFailures is not None and opts.maxFailures <= 0: + parser.error("Option '--max-failures' requires positive integer") + + if opts.echoAllCommands: + opts.showOutput = True + + if (opts.numShards is not None) or (opts.runShard is not None): + if (opts.numShards is None) or (opts.runShard is None): + parser.error("--num-shards and --run-shard must be used together") + if opts.numShards <= 0: + parser.error("--num-shards must be positive") + if (opts.runShard < 1) or (opts.runShard > opts.numShards): + parser.error("--run-shard must be between 1 and --num-shards (inclusive)") + + return opts Modified: llvm/trunk/utils/lit/lit/main.py URL: 
http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/lit/main.py?rev=374400&r1=374399&r2=374400&view=diff ============================================================================== --- llvm/trunk/utils/lit/lit/main.py (original) +++ llvm/trunk/utils/lit/lit/main.py Thu Oct 10 10:58:38 2019 @@ -11,14 +11,13 @@ import os import platform import random import re -import shlex import sys import time -import argparse import tempfile import shutil from xml.sax.saxutils import quoteattr +import lit.cl_arguments import lit.discovery import lit.display import lit.LitConfig @@ -90,14 +89,12 @@ def update_incremental_cache(test): fname = test.getFilePath() os.utime(fname, None) -def sort_by_incremental_cache(run): - def sortIndex(test): - fname = test.getFilePath() - try: - return -os.path.getmtime(fname) - except: - return 0 - run.tests.sort(key = lambda t: sortIndex(t)) +def by_mtime(test): + fname = test.getFilePath() + try: + return os.path.getmtime(fname) + except: + return 0 def main(builtinParameters = {}): # Create a temp directory inside the normal temp directory so that we can @@ -129,152 +126,12 @@ def main(builtinParameters = {}): pass def main_with_tmp(builtinParameters): - parser = argparse.ArgumentParser() - parser.add_argument('test_paths', - nargs='*', - help='Files or paths to include in the test suite') - - parser.add_argument("--version", dest="show_version", - help="Show version and exit", - action="store_true", default=False) - parser.add_argument("-j", "--threads", "--workers", dest="numWorkers", metavar="N", - help="Number of workers used for testing", - type=int, default=None) - parser.add_argument("--config-prefix", dest="configPrefix", - metavar="NAME", help="Prefix for 'lit' config files", - action="store", default=None) - parser.add_argument("-D", "--param", dest="userParameters", - metavar="NAME=VAL", - help="Add 'NAME' = 'VAL' to the user defined parameters", - type=str, action="append", default=[]) - - format_group = 
parser.add_argument_group("Output Format") - # FIXME: I find these names very confusing, although I like the - # functionality. - format_group.add_argument("-q", "--quiet", - help="Suppress no error output", - action="store_true", default=False) - format_group.add_argument("-s", "--succinct", - help="Reduce amount of output", - action="store_true", default=False) - format_group.add_argument("-v", "--verbose", dest="showOutput", - help="Show test output for failures", - action="store_true", default=False) - format_group.add_argument("-vv", "--echo-all-commands", - dest="echoAllCommands", - action="store_true", default=False, - help="Echo all commands as they are executed to stdout.\ - In case of failure, last command shown will be the\ - failing one.") - format_group.add_argument("-a", "--show-all", dest="showAllOutput", - help="Display all commandlines and output", - action="store_true", default=False) - format_group.add_argument("-o", "--output", dest="output_path", - help="Write test results to the provided path", - action="store", metavar="PATH") - format_group.add_argument("--no-progress-bar", dest="useProgressBar", - help="Do not use curses based progress bar", - action="store_false", default=True) - format_group.add_argument("--show-unsupported", - help="Show unsupported tests", - action="store_true", default=False) - format_group.add_argument("--show-xfail", - help="Show tests that were expected to fail", - action="store_true", default=False) - - execution_group = parser.add_argument_group("Test Execution") - execution_group.add_argument("--path", - help="Additional paths to add to testing environment", - action="append", type=str, default=[]) - execution_group.add_argument("--vg", dest="useValgrind", - help="Run tests under valgrind", - action="store_true", default=False) - execution_group.add_argument("--vg-leak", dest="valgrindLeakCheck", - help="Check for memory leaks under valgrind", - action="store_true", default=False) - 
execution_group.add_argument("--vg-arg", dest="valgrindArgs", metavar="ARG", - help="Specify an extra argument for valgrind", - type=str, action="append", default=[]) - execution_group.add_argument("--time-tests", dest="timeTests", - help="Track elapsed wall time for each test", - action="store_true", default=False) - execution_group.add_argument("--no-execute", dest="noExecute", - help="Don't execute any tests (assume PASS)", - action="store_true", default=False) - execution_group.add_argument("--xunit-xml-output", dest="xunit_output_file", - help=("Write XUnit-compatible XML test reports to the" - " specified file"), default=None) - execution_group.add_argument("--timeout", dest="maxIndividualTestTime", - help="Maximum time to spend running a single test (in seconds)." - "0 means no time limit. [Default: 0]", - type=int, default=None) - execution_group.add_argument("--max-failures", dest="maxFailures", - help="Stop execution after the given number of failures.", - action="store", type=int, default=None) - - selection_group = parser.add_argument_group("Test Selection") - selection_group.add_argument("--max-tests", dest="maxTests", metavar="N", - help="Maximum number of tests to run", - action="store", type=int, default=None) - selection_group.add_argument("--max-time", dest="maxTime", metavar="N", - help="Maximum time to spend testing (in seconds)", - action="store", type=float, default=None) - selection_group.add_argument("--shuffle", - help="Run tests in random order", - action="store_true", default=False) - selection_group.add_argument("-i", "--incremental", - help="Run modified and failing tests first (updates " - "mtimes)", - action="store_true", default=False) - selection_group.add_argument("--filter", metavar="REGEX", - help=("Only run tests with paths matching the given " - "regular expression"), - action="store", - default=os.environ.get("LIT_FILTER")) - selection_group.add_argument("--num-shards", dest="numShards", metavar="M", - help="Split testsuite 
into M pieces and only run one", - action="store", type=int, - default=os.environ.get("LIT_NUM_SHARDS")) - selection_group.add_argument("--run-shard", dest="runShard", metavar="N", - help="Run shard #N of the testsuite", - action="store", type=int, - default=os.environ.get("LIT_RUN_SHARD")) - - debug_group = parser.add_argument_group("Debug and Experimental Options") - debug_group.add_argument("--debug", - help="Enable debugging (for 'lit' development)", - action="store_true", default=False) - debug_group.add_argument("--show-suites", dest="showSuites", - help="Show discovered test suites", - action="store_true", default=False) - debug_group.add_argument("--show-tests", dest="showTests", - help="Show all discovered tests", - action="store_true", default=False) - - opts = parser.parse_args(sys.argv[1:] + - shlex.split(os.environ.get("LIT_OPTS", ""))) - args = opts.test_paths + opts = lit.cl_arguments.parse_args() if opts.show_version: print("lit %s" % (lit.__version__,)) return - if not args: - parser.error('No inputs specified') - - if opts.numWorkers is None: - opts.numWorkers = lit.util.detectCPUs() - elif opts.numWorkers <= 0: - parser.error("Option '--workers' or '-j' requires positive integer") - - if opts.maxFailures is not None and opts.maxFailures <= 0: - parser.error("Option '--max-failures' requires positive integer") - - if opts.echoAllCommands: - opts.showOutput = True - - inputs = args - # Create the user defined parameters. userParams = dict(builtinParameters) for entry in opts.userParameters: @@ -313,7 +170,7 @@ def main_with_tmp(builtinParameters): # Perform test discovery. run = lit.run.Run(litConfig, - lit.discovery.find_tests_for_inputs(litConfig, inputs)) + lit.discovery.find_tests_for_inputs(litConfig, opts.test_paths)) # After test discovery the configuration might have changed # the maxIndividualTestTime. 
If we explicitly set this on the @@ -377,18 +234,12 @@ def main_with_tmp(builtinParameters): if opts.shuffle: random.shuffle(run.tests) elif opts.incremental: - sort_by_incremental_cache(run) + run.tests.sort(key=by_mtime, reverse=True) else: run.tests.sort(key = lambda t: (not t.isEarlyTest(), t.getFullName())) # Then optionally restrict our attention to a shard of the tests. if (opts.numShards is not None) or (opts.runShard is not None): - if (opts.numShards is None) or (opts.runShard is None): - parser.error("--num-shards and --run-shard must be used together") - if opts.numShards <= 0: - parser.error("--num-shards must be positive") - if (opts.runShard < 1) or (opts.runShard > opts.numShards): - parser.error("--run-shard must be between 1 and --num-shards (inclusive)") num_tests = len(run.tests) # Note: user views tests and shard numbers counting from 1. test_ixs = range(opts.runShard - 1, num_tests, opts.numShards) From llvm-commits at lists.llvm.org Thu Oct 10 11:01:27 2019 From: llvm-commits at lists.llvm.org (Reid Kleckner via llvm-commits) Date: Thu, 10 Oct 2019 18:01:27 -0000 Subject: [llvm] r374404 - Fix test to avoid check-not matching the temp file absolute path Message-ID: <20191010180127.E1FD692546@lists.llvm.org> Author: rnk Date: Thu Oct 10 11:01:27 2019 New Revision: 374404 URL: http://llvm.org/viewvc/llvm-project?rev=374404&view=rev Log: Fix test to avoid check-not matching the temp file absolute path Fix for PR43636 Modified: llvm/trunk/test/tools/llvm-objdump/X86/elf-disassemble-symbol-labels-exec.test Modified: llvm/trunk/test/tools/llvm-objdump/X86/elf-disassemble-symbol-labels-exec.test URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/llvm-objdump/X86/elf-disassemble-symbol-labels-exec.test?rev=374404&r1=374403&r2=374404&view=diff ============================================================================== --- llvm/trunk/test/tools/llvm-objdump/X86/elf-disassemble-symbol-labels-exec.test (original) +++ 
llvm/trunk/test/tools/llvm-objdump/X86/elf-disassemble-symbol-labels-exec.test Thu Oct 10 11:01:27 2019 @@ -6,6 +6,9 @@ # RUN: --implicit-check-not=absolute \ # RUN: --implicit-check-not=other +# Match this line so the implicit check-nots don't match the path. +# CHECK: {{^.*}}file format ELF64-x86-64 + # CHECK: 0000000000004000 first: # CHECK: 0000000000004001 second: # CHECK: 0000000000004002 third: From llvm-commits at lists.llvm.org Thu Oct 10 11:03:37 2019 From: llvm-commits at lists.llvm.org (Julian Lettner via llvm-commits) Date: Thu, 10 Oct 2019 18:03:37 -0000 Subject: [llvm] r374405 - [lit] Leverage argparse features to remove some code Message-ID: <20191010180337.4805B91CFF@lists.llvm.org> Author: yln Date: Thu Oct 10 11:03:37 2019 New Revision: 374405 URL: http://llvm.org/viewvc/llvm-project?rev=374405&view=rev Log: [lit] Leverage argparse features to remove some code Reviewed By: rnk, serge-sans-paille Differential Revision: https://reviews.llvm.org/D68589 Modified: llvm/trunk/utils/lit/lit/cl_arguments.py llvm/trunk/utils/lit/tests/max-failures.py llvm/trunk/utils/lit/tests/selecting.py Modified: llvm/trunk/utils/lit/lit/cl_arguments.py URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/lit/cl_arguments.py?rev=374405&r1=374404&r2=374405&view=diff ============================================================================== --- llvm/trunk/utils/lit/lit/cl_arguments.py (original) +++ llvm/trunk/utils/lit/lit/cl_arguments.py Thu Oct 10 11:03:37 2019 @@ -8,7 +8,7 @@ import lit.util def parse_args(): parser = argparse.ArgumentParser() parser.add_argument('test_paths', - nargs='*', + nargs='+', help='Files or paths to include in the test suite') parser.add_argument("--version", @@ -20,13 +20,12 @@ def parse_args(): dest="numWorkers", metavar="N", help="Number of workers used for testing", - type=int, - default=None) + type=_positive_int, + default=lit.util.detectCPUs()) parser.add_argument("--config-prefix", dest="configPrefix", metavar="NAME", 
help="Prefix for 'lit' config files", - action="store", default=None) parser.add_argument("-D", "--param", dest="userParameters", @@ -66,7 +65,6 @@ def parse_args(): format_group.add_argument("-o", "--output", dest="output_path", help="Write test results to the provided path", - action="store", metavar="PATH") format_group.add_argument("--no-progress-bar", dest="useProgressBar", @@ -128,8 +126,7 @@ def parse_args(): execution_group.add_argument("--max-failures", dest="maxFailures", help="Stop execution after the given number of failures.", - action="store", - type=int, + type=_positive_int, default=None) selection_group = parser.add_argument_group("Test Selection") @@ -137,14 +134,12 @@ def parse_args(): dest="maxTests", metavar="N", help="Maximum number of tests to run", - action="store", type=int, default=None) selection_group.add_argument("--max-time", dest="maxTime", metavar="N", help="Maximum time to spend testing (in seconds)", - action="store", type=float, default=None) selection_group.add_argument("--shuffle", @@ -158,19 +153,18 @@ def parse_args(): selection_group.add_argument("--filter", metavar="REGEX", help="Only run tests with paths matching the given regular expression", - action="store", default=os.environ.get("LIT_FILTER")) - selection_group.add_argument("--num-shards", dest="numShards", metavar="M", + selection_group.add_argument("--num-shards", + dest="numShards", + metavar="M", help="Split testsuite into M pieces and only run one", - action="store", - type=int, + type=_positive_int, default=os.environ.get("LIT_NUM_SHARDS")) selection_group.add_argument("--run-shard", dest="runShard", metavar="N", help="Run shard #N of the testsuite", - action="store", - type=int, + type=_positive_int, default=os.environ.get("LIT_RUN_SHARD")) debug_group = parser.add_argument_group("Debug and Experimental Options") @@ -192,27 +186,27 @@ def parse_args(): opts = parser.parse_args(sys.argv[1:] + shlex.split(os.environ.get("LIT_OPTS", ""))) - # Validate options - if 
not opts.test_paths: - parser.error('No inputs specified') - - if opts.numWorkers is None: - opts.numWorkers = lit.util.detectCPUs() - elif opts.numWorkers <= 0: - parser.error("Option '--workers' or '-j' requires positive integer") - - if opts.maxFailures is not None and opts.maxFailures <= 0: - parser.error("Option '--max-failures' requires positive integer") - + # Validate command line options if opts.echoAllCommands: opts.showOutput = True - if (opts.numShards is not None) or (opts.runShard is not None): - if (opts.numShards is None) or (opts.runShard is None): + if opts.numShards or opts.runShard: + if not opts.numShards or not opts.runShard: parser.error("--num-shards and --run-shard must be used together") - if opts.numShards <= 0: - parser.error("--num-shards must be positive") - if (opts.runShard < 1) or (opts.runShard > opts.numShards): + if opts.runShard > opts.numShards: parser.error("--run-shard must be between 1 and --num-shards (inclusive)") return opts + +def _positive_int(arg): + try: + n = int(arg) + except ValueError: + raise _arg_error('positive integer', arg) + if n <= 0: + raise _arg_error('positive integer', arg) + return n + +def _arg_error(desc, arg): + msg = "requires %s, but found '%s'" % (desc, arg) + return argparse.ArgumentTypeError(msg) Modified: llvm/trunk/utils/lit/tests/max-failures.py URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/max-failures.py?rev=374405&r1=374404&r2=374405&view=diff ============================================================================== --- llvm/trunk/utils/lit/tests/max-failures.py (original) +++ llvm/trunk/utils/lit/tests/max-failures.py Thu Oct 10 11:03:37 2019 @@ -11,4 +11,4 @@ # CHECK: Failing Tests (31) # CHECK: Failing Tests (1) # CHECK: Failing Tests (2) -# CHECK: error: Option '--max-failures' requires positive integer +# CHECK: error: argument --max-failures: requires positive integer, but found '0' Modified: llvm/trunk/utils/lit/tests/selecting.py URL: 
http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/selecting.py?rev=374405&r1=374404&r2=374405&view=diff ============================================================================== --- llvm/trunk/utils/lit/tests/selecting.py (original) +++ llvm/trunk/utils/lit/tests/selecting.py Thu Oct 10 11:03:37 2019 @@ -87,7 +87,7 @@ # # RUN: not %{lit} --num-shards 0 --run-shard 2 %{inputs}/discovery >%t.out 2>%t.err # RUN: FileCheck --check-prefix=CHECK-SHARD-ERR < %t.err %s -# CHECK-SHARD-ERR: error: --num-shards must be positive +# CHECK-SHARD-ERR: error: argument --num-shards: requires positive integer, but found '0' # # RUN: not %{lit} --num-shards 3 --run-shard 4 %{inputs}/discovery >%t.out 2>%t.err # RUN: FileCheck --check-prefix=CHECK-SHARD-ERR2 < %t.err %s From llvm-commits at lists.llvm.org Thu Oct 10 11:04:52 2019 From: llvm-commits at lists.llvm.org (Michael Liao via llvm-commits) Date: Thu, 10 Oct 2019 18:04:52 -0000 Subject: [llvm] r374406 - Fix build by adding the missing dependency. Message-ID: <20191010180452.6EB7B91CFF@lists.llvm.org> Author: hliao Date: Thu Oct 10 11:04:52 2019 New Revision: 374406 URL: http://llvm.org/viewvc/llvm-project?rev=374406&view=rev Log: Fix build by adding the missing dependency. 
Modified: llvm/trunk/lib/DebugInfo/GSYM/LLVMBuild.txt Modified: llvm/trunk/lib/DebugInfo/GSYM/LLVMBuild.txt URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/DebugInfo/GSYM/LLVMBuild.txt?rev=374406&r1=374405&r2=374406&view=diff ============================================================================== --- llvm/trunk/lib/DebugInfo/GSYM/LLVMBuild.txt (original) +++ llvm/trunk/lib/DebugInfo/GSYM/LLVMBuild.txt Thu Oct 10 11:04:52 2019 @@ -18,4 +18,4 @@ type = Library name = DebugInfoGSYM parent = DebugInfo -required_libraries = Support +required_libraries = MC Support From llvm-commits at lists.llvm.org Thu Oct 10 11:03:00 2019 From: llvm-commits at lists.llvm.org (Digger via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 18:03:00 +0000 (UTC) Subject: [PATCH] D68575: [llvm-readobj][xcoff] implement parsing overflow section header. In-Reply-To: References: Message-ID: DiggerLin marked 3 inline comments as done. DiggerLin added inline comments. ================ Comment at: llvm/tools/llvm-readobj/XCOFFDumper.cpp:58 + static constexpr unsigned SectionFlagsTypeMask = 0xffffu; const XCOFFObjectFile &Obj; }; ---------------- hubert.reinterpretcast wrote: > DiggerLin wrote: > > hubert.reinterpretcast wrote: > > > Add a blank line here. Also, I am wondering if this should be part of `llvm/BinaryFormat/XCOFF.h` (perhaps in `SectionHeader32`, or in a base class thereof when 64-bit support lands). > > For consistency with SectionFlagsReservedMask, I put the definition of SectionFlagsTypeMask here too. I think we may need to create an NFC patch to move SectionFlagsReservedMask and SectionFlagsTypeMask into XCOFF.h. > Okay, I agree. Would you mind posting such an NFC patch after this patch lands? OK, I will do it. ================ Comment at: llvm/tools/llvm-readobj/XCOFFDumper.cpp:455 + case XCOFF::STYP_TYPCHK: + // TODO : The interpretation of loader, exception, type check section + // headers are different from that of generic section header. 
We will ---------------- hubert.reinterpretcast wrote: > DiggerLin wrote: > > hubert.reinterpretcast wrote: > > > The "TODO" still has a colon surrounded by spaces on both sides after it. I do not think that we have been using colons after "TODO". > > > > > > Still missing "and" before "type check section headers". > > > > > > Still missing "s" after "generic section header". > > > > > > Typo "seciton" is still present. > > Changed as suggested. > For future reference, I believe we have been using "Oxford commas". That is, a comma before the "and" before (in this case) the third list item, would be appropriate. OK, got it, thanks. ================ Comment at: llvm/tools/llvm-readobj/XCOFFDumper.cpp:46 template void printSectionHeaders(ArrayRef Sections); + template void printGenericSectionHeader(T &Sec) const; + template void printOverflowSectionHeader(T &Sec) const; ---------------- hubert.reinterpretcast wrote: > I am not sure that I see a meaningful difference between the functions here that are `const` and the ones that are not. Given that there are already precedent cases of printing methods of `ObjDumper` subclasses that are `const`, I am okay with adding new ones that are `const` if we have reason to believe they will remain `const`. When I create new functions, I make as many of them `const` as possible so that accidental changes to objects are avoided. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68575/new/ https://reviews.llvm.org/D68575 From llvm-commits at lists.llvm.org Thu Oct 10 11:03:14 2019 From: llvm-commits at lists.llvm.org (Julian Lettner via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 18:03:14 +0000 (UTC) Subject: [PATCH] D68529: [lit] Move argument parsing/validation to separate file In-Reply-To: References: Message-ID: This revision was automatically updated to reflect the committed changes. Closed by commit rG715bfa4ef800: [lit] Move argument parsing/validation to separate file (authored by yln). 
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68529/new/ https://reviews.llvm.org/D68529 Files: llvm/utils/lit/lit/cl_arguments.py llvm/utils/lit/lit/main.py -------------- next part -------------- A non-text attachment was scrubbed... Name: D68529.224416.patch Type: text/x-patch Size: 19216 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Thu Oct 10 11:03:50 2019 From: llvm-commits at lists.llvm.org (Julian Lettner via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 18:03:50 +0000 (UTC) Subject: [PATCH] D68589: [lit] Leverage argparse features to remove some code In-Reply-To: References: Message-ID: <7be138bdc5ba83274bea1d436e8bfbf3@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rG822946ceaabb: [lit] Leverage argparse features to remove some code (authored by yln). Changed prior to commit: https://reviews.llvm.org/D68589?vs=223891&id=224419#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68589/new/ https://reviews.llvm.org/D68589 Files: llvm/utils/lit/lit/cl_arguments.py llvm/utils/lit/tests/max-failures.py llvm/utils/lit/tests/selecting.py -------------- next part -------------- A non-text attachment was scrubbed... Name: D68589.224419.patch Type: text/x-patch Size: 5953 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Thu Oct 10 11:04:09 2019 From: llvm-commits at lists.llvm.org (Andrei Elovikov via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 18:04:09 +0000 (UTC) Subject: [PATCH] D68498: [PATCH 15/38] [noalias] D9382: llvm.noalias - don't prevent loop vectorization In-Reply-To: References: Message-ID: <475f965a08daf08f6ccec5bebc155ecc@localhost.localdomain> a.elovikov added inline comments. 
================ Comment at: llvm/test/Transforms/LoopVectorize/noalias.ll:7-9 +; CHECK-LABEL: @test( +; CHECK: @llvm.noalias.p0i32 +; CHECK: store <2 x i32> ---------------- lebedev.ri wrote: > It might be good to check a bit more context - *how* did it get vectorized? And, in addition to this, it might be good to see how it's vectorized for interleaved accesses (especially what the intrinsics/shuffles "interaction" looks like), e.g. (similar to interleaved-accesses-1.ll): struct S { int * restrict i; float * restrict f; }; S *p1, *p2; for (int j = 0; j < N; ++j) { S t = p1[j]; p2[j] = t; } CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68498/new/ https://reviews.llvm.org/D68498 From llvm-commits at lists.llvm.org Thu Oct 10 11:11:49 2019 From: llvm-commits at lists.llvm.org (Greg Clayton via llvm-commits) Date: Thu, 10 Oct 2019 18:11:49 -0000 Subject: [llvm] r374409 - Fix buildbots by using memset instead of bzero. Message-ID: <20191010181150.0269492561@lists.llvm.org> Author: gclayton Date: Thu Oct 10 11:11:49 2019 New Revision: 374409 URL: http://llvm.org/viewvc/llvm-project?rev=374409&view=rev Log: Fix buildbots by using memset instead of bzero. Modified: llvm/trunk/lib/DebugInfo/GSYM/GsymCreator.cpp Modified: llvm/trunk/lib/DebugInfo/GSYM/GsymCreator.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/DebugInfo/GSYM/GsymCreator.cpp?rev=374409&r1=374408&r2=374409&view=diff ============================================================================== --- llvm/trunk/lib/DebugInfo/GSYM/GsymCreator.cpp (original) +++ llvm/trunk/lib/DebugInfo/GSYM/GsymCreator.cpp Thu Oct 10 11:11:49 2019 @@ -73,7 +73,7 @@ llvm::Error GsymCreator::encode(FileWrit Hdr.NumAddresses = static_cast(Funcs.size()); Hdr.StrtabOffset = 0; // We will fix this up later. Hdr.StrtabOffset = 0; // We will fix this up later. 
- bzero(Hdr.UUID, sizeof(Hdr.UUID)); + memset(Hdr.UUID, 0, sizeof(Hdr.UUID)); if (UUID.size() > sizeof(Hdr.UUID)) return createStringError(std::errc::invalid_argument, "invalid UUID size %u", (uint32_t)UUID.size()); From llvm-commits at lists.llvm.org Thu Oct 10 11:13:13 2019 From: llvm-commits at lists.llvm.org (Greg Clayton via llvm-commits) Date: Thu, 10 Oct 2019 18:13:13 -0000 Subject: [llvm] r374410 - Unbreak buildbots. Message-ID: <20191010181313.478A592555@lists.llvm.org> Author: gclayton Date: Thu Oct 10 11:13:13 2019 New Revision: 374410 URL: http://llvm.org/viewvc/llvm-project?rev=374410&view=rev Log: Unbreak buildbots. Modified: llvm/trunk/lib/DebugInfo/GSYM/GsymReader.cpp Modified: llvm/trunk/lib/DebugInfo/GSYM/GsymReader.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/DebugInfo/GSYM/GsymReader.cpp?rev=374410&r1=374409&r2=374410&view=diff ============================================================================== --- llvm/trunk/lib/DebugInfo/GSYM/GsymReader.cpp (original) +++ llvm/trunk/lib/DebugInfo/GSYM/GsymReader.cpp Thu Oct 10 11:13:13 2019 @@ -16,7 +16,6 @@ #include #include #include -#include #include #include From llvm-commits at lists.llvm.org Thu Oct 10 11:12:32 2019 From: llvm-commits at lists.llvm.org (Joseph Tremoulet via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 18:12:32 +0000 (UTC) Subject: [PATCH] D68657: Update MinidumpYAML to use minidump::Exception for exception stream In-Reply-To: References: Message-ID: <3d4934b6800239f719d12e75b3e7d450@localhost.localdomain> JosephTremoulet updated this revision to Diff 224420. JosephTremoulet added a comment. 
- Change Exception Information format per feedback Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68657/new/ https://reviews.llvm.org/D68657 Files: llvm/include/llvm/ObjectYAML/MinidumpYAML.h llvm/lib/ObjectYAML/MinidumpEmitter.cpp llvm/lib/ObjectYAML/MinidumpYAML.cpp llvm/unittests/ObjectYAML/MinidumpYAMLTest.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D68657.224420.patch Type: text/x-patch Size: 13532 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Thu Oct 10 11:12:33 2019 From: llvm-commits at lists.llvm.org (Greg Clayton via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 18:12:33 +0000 (UTC) Subject: [PATCH] D68744: [GSYM] Add GsymCreator and GsymReader. In-Reply-To: References: Message-ID: <253c0f516f9cab3e98b63649d2377d91@localhost.localdomain> clayborg added a comment. In D68744#1704377 , @thakis wrote: > The fix attempt didn't work: > > ECURE_NO_WARNINGS -D_CRT_NONSTDC_NO_DEPRECATE -D_CRT_NONSTDC_NO_WARNINGS -D_SCL_SECURE_NO_DEPRECATE -D_SCL_SECURE_NO_WARNINGS -D_HAS_EXCEPTIONS=0 -D_UNICODE -DUNICODE -I../../llvm/include -Igen/llvm/include /O2 /Zc:inline /EHs-c- /W4 -Wno-unused-parameter -Wdelete-non-virtual-dtor -Wstring-conversion -Wno-nonportable-include-path -Wcovered-switch-default /GR- > ../../llvm/lib/DebugInfo/GSYM/GsymCreator.cpp(76,3): error: use of undeclared identifier 'bzero' > bzero(Hdr.UUID, sizeof(Hdr.UUID)); > ^ > 1 error generated. 
> [16/106] CXX obj/llvm/lib/DebugInfo/GSYM/GSYM.GsymReader.obj > FAILED: obj/llvm/lib/DebugInfo/GSYM/GSYM.GsymReader.obj > c:\src\goma\goma-win64/gomacc c:/src/chrome/src/third_party/llvm-build/Release+Asserts/bin/clang-cl /nologo /showIncludes /Foobj/llvm/lib/DebugInfo/GSYM/GSYM.GsymReader.obj /c ../../llvm/lib/DebugInfo/GSYM/GsymReader.cpp -D_CRT_SECURE_NO_DEPRECATE -D_CRT_SECURE_NO_WARNINGS -D_CRT_NONSTDC_NO_DEPRECATE -D_CRT_NONSTDC_NO_WARNINGS -D_SCL_SECURE_NO_DEPRECATE -D_SCL_SECURE_NO_WARNINGS -D_HAS_EXCEPTIONS=0 -D_UNICODE -DUNICODE -I../../llvm/include -Igen/llvm/include /O2 /Zc:inline /EHs-c- /W4 -Wno-unused-parameter -Wdelete-non-virtual-dtor -Wstring-conversion -Wno-nonportable-include-path -Wcovered-switch-default /GR- > ../../llvm/lib/DebugInfo/GSYM/GsymReader.cpp(19,10): fatal error: 'unistd.h' file not found > #include > $ svn commit Sending lib/DebugInfo/GSYM/GsymCreator.cpp Transmitting file data .done Committing transaction... Committed revision 374409. $ svn commit Sending lib/DebugInfo/GSYM/GsymReader.cpp Transmitting file data .done Committing transaction... Committed revision 374410. Switched to memset and removed include of unistd.h Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68744/new/ https://reviews.llvm.org/D68744 From llvm-commits at lists.llvm.org Thu Oct 10 11:17:24 2019 From: llvm-commits at lists.llvm.org (Reid Kleckner via llvm-commits) Date: Thu, 10 Oct 2019 18:17:24 -0000 Subject: [llvm] r374411 - Remove strings.h include to fix GSYM Windows build Message-ID: <20191010181724.64DFD9256E@lists.llvm.org> Author: rnk Date: Thu Oct 10 11:17:24 2019 New Revision: 374411 URL: http://llvm.org/viewvc/llvm-project?rev=374411&view=rev Log: Remove strings.h include to fix GSYM Windows build Fifth time's the charm. 
Modified: llvm/trunk/lib/DebugInfo/GSYM/GsymCreator.cpp Modified: llvm/trunk/lib/DebugInfo/GSYM/GsymCreator.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/DebugInfo/GSYM/GsymCreator.cpp?rev=374411&r1=374410&r2=374411&view=diff ============================================================================== --- llvm/trunk/lib/DebugInfo/GSYM/GsymCreator.cpp (original) +++ llvm/trunk/lib/DebugInfo/GSYM/GsymCreator.cpp Thu Oct 10 11:17:24 2019 @@ -14,7 +14,6 @@ #include #include -#include using namespace llvm; using namespace gsym; From llvm-commits at lists.llvm.org Thu Oct 10 11:20:16 2019 From: llvm-commits at lists.llvm.org (Nico Weber via llvm-commits) Date: Thu, 10 Oct 2019 18:20:16 -0000 Subject: [llvm] r374413 - Fix Windows build after r374381 Message-ID: <20191010182016.D72C89256E@lists.llvm.org> Author: nico Date: Thu Oct 10 11:20:16 2019 New Revision: 374413 URL: http://llvm.org/viewvc/llvm-project?rev=374413&view=rev Log: Fix Windows build after r374381 Modified: llvm/trunk/lib/DebugInfo/GSYM/GsymCreator.cpp llvm/trunk/lib/DebugInfo/GSYM/GsymReader.cpp Modified: llvm/trunk/lib/DebugInfo/GSYM/GsymCreator.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/DebugInfo/GSYM/GsymCreator.cpp?rev=374413&r1=374412&r2=374413&view=diff ============================================================================== --- llvm/trunk/lib/DebugInfo/GSYM/GsymCreator.cpp (original) +++ llvm/trunk/lib/DebugInfo/GSYM/GsymCreator.cpp Thu Oct 10 11:20:16 2019 @@ -14,6 +14,8 @@ #include #include +#include +#include using namespace llvm; using namespace gsym; Modified: llvm/trunk/lib/DebugInfo/GSYM/GsymReader.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/DebugInfo/GSYM/GsymReader.cpp?rev=374413&r1=374412&r2=374413&view=diff ============================================================================== --- llvm/trunk/lib/DebugInfo/GSYM/GsymReader.cpp (original) +++ llvm/trunk/lib/DebugInfo/GSYM/GsymReader.cpp Thu Oct 10 11:20:16 2019 @@ -10,16 +10,9 
@@ #include "llvm/DebugInfo/GSYM/GsymReader.h" #include -#include #include #include #include -#include -#include - -#include -#include -#include #include "llvm/DebugInfo/GSYM/GsymCreator.h" #include "llvm/DebugInfo/GSYM/InlineInfo.h" From llvm-commits at lists.llvm.org Thu Oct 10 11:22:16 2019 From: llvm-commits at lists.llvm.org (Nico Weber via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 18:22:16 +0000 (UTC) Subject: [PATCH] D68744: [GSYM] Add GsymCreator and GsymReader. In-Reply-To: References: Message-ID: thakis added a comment. That's still not enough, you need to remove a bunch of (unused) includes. With this, my windows box is happy. (I don't have svn setup on my win box, so I can't land this easily): diff --git a/llvm/lib/DebugInfo/GSYM/GsymCreator.cpp b/llvm/lib/DebugInfo/GSYM/GsymCreator.cpp index 9dc632dfcfb..f371426f201 100644 --- a/llvm/lib/DebugInfo/GSYM/GsymCreator.cpp +++ b/llvm/lib/DebugInfo/GSYM/GsymCreator.cpp @@ -14,7 +14,8 @@ #include #include -#include +#include +#include using namespace llvm; using namespace gsym; @@ -73,7 +74,7 @@ llvm::Error GsymCreator::encode(FileWriter &O) const { Hdr.NumAddresses = static_cast(Funcs.size()); Hdr.StrtabOffset = 0; // We will fix this up later. Hdr.StrtabOffset = 0; // We will fix this up later. 
- bzero(Hdr.UUID, sizeof(Hdr.UUID)); + memset(Hdr.UUID, 0, sizeof(Hdr.UUID)); if (UUID.size() > sizeof(Hdr.UUID)) return createStringError(std::errc::invalid_argument, "invalid UUID size %u", (uint32_t)UUID.size()); diff --git a/llvm/lib/DebugInfo/GSYM/GsymReader.cpp b/llvm/lib/DebugInfo/GSYM/GsymReader.cpp index f7bbd700713..dfb585b87c1 100644 --- a/llvm/lib/DebugInfo/GSYM/GsymReader.cpp +++ b/llvm/lib/DebugInfo/GSYM/GsymReader.cpp @@ -10,17 +10,8 @@ #include "llvm/DebugInfo/GSYM/GsymReader.h" #include -#include #include -#include #include -#include -#include -#include - -#include -#include -#include #include "llvm/DebugInfo/GSYM/GsymCreator.h" #include "llvm/DebugInfo/GSYM/InlineInfo.h" Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68744/new/ https://reviews.llvm.org/D68744 From llvm-commits at lists.llvm.org Thu Oct 10 11:22:16 2019 From: llvm-commits at lists.llvm.org (Craig Topper via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 18:22:16 +0000 (UTC) Subject: [PATCH] D68632: [X86] Make memcmp() use PTEST if possible and also enable AVX1 In-Reply-To: References: Message-ID: craig.topper added a comment. This should fix the scalarization issue seen with ISD::XOR and ISD::OR diff --git a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp index 377c608..92d1166 100644 --- a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp +++ b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp @@ -4344,7 +4344,9 @@ SDValue DAGCombiner::hoistLogicOpWithSameOpcodeHands(SDNode *N) { if ((HandOpcode == ISD::BITCAST || HandOpcode == ISD::SCALAR_TO_VECTOR) && Level <= AfterLegalizeTypes) { // Input types must be integer and the same.
- if (XVT.isInteger() && XVT == Y.getValueType()) { + if (XVT.isInteger() && XVT == Y.getValueType() && + !(VT.isVector() && TLI.isTypeLegal(VT) && + !XVT.isVector() && !TLI.isTypeLegal(XVT))) { SDValue Logic = DAG.getNode(LogicOpcode, DL, XVT, X, Y); return DAG.getNode(HandOpcode, DL, VT, Logic); } Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68632/new/ https://reviews.llvm.org/D68632 From llvm-commits at lists.llvm.org Thu Oct 10 11:22:17 2019 From: llvm-commits at lists.llvm.org (Nico Weber via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 18:22:17 +0000 (UTC) Subject: [PATCH] D68744: [GSYM] Add GsymCreator and GsymReader. In-Reply-To: References: Message-ID: thakis added a comment. I found a computer with commit set up, so fixed in r374413. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68744/new/ https://reviews.llvm.org/D68744 From llvm-commits at lists.llvm.org Thu Oct 10 11:22:16 2019 From: llvm-commits at lists.llvm.org (Simon Pilgrim via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 18:22:16 +0000 (UTC) Subject: [PATCH] D68763: [X86] Use packusdw+vpmovuswb to implement v16i32->V16i8 that clamps signed inputs to be between 0 and 255 when zmm registers are disabled on SKX. In-Reply-To: References: Message-ID: RKSimon accepted this revision. RKSimon added a comment. This revision is now accepted and ready to land. LGTM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68763/new/ https://reviews.llvm.org/D68763 From llvm-commits at lists.llvm.org Thu Oct 10 11:22:18 2019 From: llvm-commits at lists.llvm.org (Amaury SECHET via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 18:22:18 +0000 (UTC) Subject: [PATCH] D68195: [DAGCombiner] Peek through vector concats when trying to combine shuffles. In-Reply-To: References: Message-ID: <0c55a6fe128c9c96d1baa998036aa995@localhost.localdomain> deadalnix updated this revision to Diff 224424. 
deadalnix added a comment. Rebase and ping. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68195/new/ https://reviews.llvm.org/D68195 Files: lib/CodeGen/SelectionDAG/DAGCombiner.cpp test/CodeGen/X86/vector-shuffle-combining.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D68195.224424.patch Type: text/x-patch Size: 6203 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Thu Oct 10 11:23:02 2019 From: llvm-commits at lists.llvm.org (Joseph Tremoulet via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 18:23:02 +0000 (UTC) Subject: [PATCH] D68657: Update MinidumpYAML to use minidump::Exception for exception stream In-Reply-To: References: Message-ID: <5df50068d6342d3d91503e218b41b6ba@localhost.localdomain> JosephTremoulet updated this revision to Diff 224426. JosephTremoulet added a comment. - Add test with extraneous parameter Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68657/new/ https://reviews.llvm.org/D68657 Files: llvm/include/llvm/ObjectYAML/MinidumpYAML.h llvm/lib/ObjectYAML/MinidumpEmitter.cpp llvm/lib/ObjectYAML/MinidumpYAML.cpp llvm/unittests/ObjectYAML/MinidumpYAMLTest.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D68657.224426.patch Type: text/x-patch Size: 15284 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Thu Oct 10 11:31:57 2019 From: llvm-commits at lists.llvm.org (Reid Kleckner via llvm-commits) Date: Thu, 10 Oct 2019 18:31:57 -0000 Subject: [llvm] r374415 - Print quoted backslashes in LLVM IR as \\ instead of \5C Message-ID: <20191010183157.B442E85F01@lists.llvm.org> Author: rnk Date: Thu Oct 10 11:31:57 2019 New Revision: 374415 URL: http://llvm.org/viewvc/llvm-project?rev=374415&view=rev Log: Print quoted backslashes in LLVM IR as \\ instead of \5C This improves readability of Windows path string literals in LLVM IR. 
The LLVM assembler has supported \\ in IR strings for a long time, but the lexer doesn't tolerate escaped quotes, so they have to be printed as \22 for now. Modified: llvm/trunk/lib/Support/StringExtras.cpp llvm/trunk/test/Assembler/asm-path-writer.ll llvm/trunk/test/Assembler/source-filename-backslash.ll llvm/trunk/test/CodeGen/MIR/X86/global-value-operands.mir llvm/trunk/unittests/ADT/StringExtrasTest.cpp llvm/trunk/unittests/IR/MetadataTest.cpp Modified: llvm/trunk/lib/Support/StringExtras.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Support/StringExtras.cpp?rev=374415&r1=374414&r2=374415&view=diff ============================================================================== --- llvm/trunk/lib/Support/StringExtras.cpp (original) +++ llvm/trunk/lib/Support/StringExtras.cpp Thu Oct 10 11:31:57 2019 @@ -60,7 +60,9 @@ void llvm::SplitString(StringRef Source, void llvm::printEscapedString(StringRef Name, raw_ostream &Out) { for (unsigned i = 0, e = Name.size(); i != e; ++i) { unsigned char C = Name[i]; - if (isPrint(C) && C != '\\' && C != '"') + if (C == '\\') + Out << '\\' << C; + else if (isPrint(C) && C != '"') Out << C; else Out << '\\' << hexdigit(C >> 4) << hexdigit(C & 0x0F); Modified: llvm/trunk/test/Assembler/asm-path-writer.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Assembler/asm-path-writer.ll?rev=374415&r1=374414&r2=374415&view=diff ============================================================================== --- llvm/trunk/test/Assembler/asm-path-writer.ll (original) +++ llvm/trunk/test/Assembler/asm-path-writer.ll Thu Oct 10 11:31:57 2019 @@ -1,6 +1,6 @@ ; RUN: llvm-as < %s | llvm-dis | FileCheck %s -; CHECK: ^0 = module: (path: ".\5Cf4folder\5Cabc.o", hash: (0, 0, 0, 0, 0)) +; CHECK: ^0 = module: (path: ".\\f4folder\\abc.o", hash: (0, 0, 0, 0, 0)) -^0 = module: (path: ".\5Cf4folder\5Cabc.o", hash: (0, 0, 0, 0, 0)) +^0 = module: (path: ".\5Cf4folder\\abc.o", hash: (0, 0, 0, 0, 0)) ^1 = gv: (guid: 15822663052811949562, 
summaries: (function: (module: ^0, flags: (linkage: external, notEligibleToImport: 0, live: 0, dsoLocal: 0), insts: 2))) Modified: llvm/trunk/test/Assembler/source-filename-backslash.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Assembler/source-filename-backslash.ll?rev=374415&r1=374414&r2=374415&view=diff ============================================================================== --- llvm/trunk/test/Assembler/source-filename-backslash.ll (original) +++ llvm/trunk/test/Assembler/source-filename-backslash.ll Thu Oct 10 11:31:57 2019 @@ -1,8 +1,7 @@ - ; Make sure that llvm-as/llvm-dis properly assemble/disassemble the ; source_filename. ; RUN: llvm-as < %s | llvm-dis | FileCheck %s -; CHECK: source_filename = "C:\5Cpath\5Cwith\5Cbackslashes\5Ctest.cc" -source_filename = "C:\5Cpath\5Cwith\5Cbackslashes\5Ctest.cc" +; CHECK: source_filename = "C:\\path\\with\\backslashes\\test.cc" +source_filename = "C:\\path\\with\5Cbackslashes\\test.cc" Modified: llvm/trunk/test/CodeGen/MIR/X86/global-value-operands.mir URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/MIR/X86/global-value-operands.mir?rev=374415&r1=374414&r2=374415&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/MIR/X86/global-value-operands.mir (original) +++ llvm/trunk/test/CodeGen/MIR/X86/global-value-operands.mir Thu Oct 10 11:31:57 2019 @@ -103,7 +103,7 @@ body: | name: test2 body: | bb.0.entry: - ; CHECK: , @"\01Hello@$%09 \5C World,", + ; CHECK: , @"\01Hello@$%09 \\ World,", $rax = MOV64rm $rip, 1, _, @"\01Hello@$%09 \\ World,", _ $eax = MOV32rm killed $rax, 1, _, 0, _ RETQ $eax Modified: llvm/trunk/unittests/ADT/StringExtrasTest.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/unittests/ADT/StringExtrasTest.cpp?rev=374415&r1=374414&r2=374415&view=diff ============================================================================== --- llvm/trunk/unittests/ADT/StringExtrasTest.cpp (original) +++ 
llvm/trunk/unittests/ADT/StringExtrasTest.cpp Thu Oct 10 11:31:57 2019 @@ -109,7 +109,7 @@ TEST(StringExtrasTest, printEscapedStrin std::string str; raw_string_ostream OS(str); printEscapedString("ABCdef123&<>\\\"'\t", OS); - EXPECT_EQ("ABCdef123&<>\\5C\\22'\\09", OS.str()); + EXPECT_EQ("ABCdef123&<>\\\\\\22'\\09", OS.str()); } TEST(StringExtrasTest, printHTMLEscaped) { Modified: llvm/trunk/unittests/IR/MetadataTest.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/unittests/IR/MetadataTest.cpp?rev=374415&r1=374414&r2=374415&view=diff ============================================================================== --- llvm/trunk/unittests/IR/MetadataTest.cpp (original) +++ llvm/trunk/unittests/IR/MetadataTest.cpp Thu Oct 10 11:31:57 2019 @@ -164,7 +164,7 @@ TEST_F(MDStringTest, PrintingComplex) { std::string Str; raw_string_ostream oss(Str); s->print(oss); - EXPECT_STREQ("!\"\\00\\0A\\22\\5C\\FF\"", oss.str().c_str()); + EXPECT_STREQ("!\"\\00\\0A\\22\\\\\\FF\"", oss.str().c_str()); } typedef MetadataTest MDNodeTest; From llvm-commits at lists.llvm.org Thu Oct 10 11:31:24 2019 From: llvm-commits at lists.llvm.org (Joseph Tremoulet via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 18:31:24 +0000 (UTC) Subject: [PATCH] D68657: Update MinidumpYAML to use minidump::Exception for exception stream In-Reply-To: References: Message-ID: JosephTremoulet added a comment. In D68657#1703752 , @labath wrote: > Therefore I think it would make sense to just spell out each member of that array as a separate member in the yaml representation, which should be a much simpler endeavour. We can use the "actual parameter count" field to suppress the fields that don't contain any value, if they really are zero, which should make the yaml output concise in the usual cases. I think something like this should be sufficient: > ... 
> I think that would strike a good balance between code complexity, output brevity, and being able to generate interesting and potentially invalid inputs for other tools (which is one of the main goals of yaml2obj, and so interpreting the input too strictly is not desired/helpful). Ok. Updated. I agree the code is simpler this way and the YAML just as readable, plus it's nice not having to worry about buffer overrun. Added testcases with missing and extraneous parameters. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68657/new/ https://reviews.llvm.org/D68657 From llvm-commits at lists.llvm.org Thu Oct 10 11:31:24 2019 From: llvm-commits at lists.llvm.org (Greg Clayton via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 18:31:24 +0000 (UTC) Subject: [PATCH] D68744: [GSYM] Add GsymCreator and GsymReader. In-Reply-To: References: Message-ID: <0dacf5322d542636210e9b9546968f3d@localhost.localdomain> clayborg added a comment. In D68744#1704432 , @thakis wrote: > I found a computer with commit set up, so fixed in r374413. Thank you!!! Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68744/new/ https://reviews.llvm.org/D68744 From llvm-commits at lists.llvm.org Thu Oct 10 11:31:24 2019 From: llvm-commits at lists.llvm.org (Rong Xu via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 18:31:24 +0000 (UTC) Subject: [PATCH] D67989: [ValueTracking] Improve pointer offset computation for cases of same base In-Reply-To: References: Message-ID: <421c93de5cf6cae70cdb50e05dada33d@localhost.localdomain> xur added a comment. I totally agree with what eugenis said. I added his comments to a TODO comment. I will commit this version and may address the TODO later. 
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67989/new/ https://reviews.llvm.org/D67989 From llvm-commits at lists.llvm.org Thu Oct 10 11:31:25 2019 From: llvm-commits at lists.llvm.org (Hal Finkel via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 18:31:25 +0000 (UTC) Subject: [PATCH] D68817: [PowerPC][docs] Update IBM official docs in Compiler Writers Info page In-Reply-To: References: Message-ID: hfinkel accepted this revision as: hfinkel. hfinkel added a comment. This revision is now accepted and ready to land. LGTM Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68817/new/ https://reviews.llvm.org/D68817 From llvm-commits at lists.llvm.org Thu Oct 10 11:31:25 2019 From: llvm-commits at lists.llvm.org (Nico Weber via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 18:31:25 +0000 (UTC) Subject: [PATCH] D68820: win: Move Parallel.h off concrt to cross-platform code Message-ID: thakis created this revision. thakis added a reviewer: rnk. Herald added a subscriber: hiraditya. Herald added a project: LLVM. thakis added subscribers: BillyONeal, aganea. r179397 added Parallel.h and implemented it in terms of concrt in 2013. In 2015, a cross-platform implementation of the functions appeared and is in use everywhere but on Windows (r232419). r246219 hints that it had issues in MSVC2013, but r296906 suggests they've been fixed now that we require 2015+. So remove the concrt code. It's less code, and it sounds like concrt has conceptual and performance issues, see PR41198. I built blink_core.dll in a debug component build with full symbols and in a release component build without any symbols. I couldn't measure a performance difference for linking blink_core.dll before and after this patch.
(Raw data: https://gist.github.com/nico/d4b02c7dd835bb96ed67e919f3558e6f) https://reviews.llvm.org/D68820 Files: llvm/include/llvm/Support/Parallel.h llvm/lib/Support/Parallel.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D68820.224425.patch Type: text/x-patch Size: 2533 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Thu Oct 10 11:31:26 2019 From: llvm-commits at lists.llvm.org (Matt Arsenault via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 18:31:26 +0000 (UTC) Subject: [PATCH] D68821: AMDGPU: Relax 32-bit SGPR register class Message-ID: arsenm created this revision. arsenm added reviewers: rampitec, kerbowa, nhaehnle. Herald added subscribers: arphaman, t-tye, tpr, dstuttard, yaxunl, wdng, jvesely, kzhuravl. Mostly use SReg_32 instead of SReg_32_XM0 for arbitrary values. This will allow the register coalescer to do a better job eliminating copies to m0. For GlobalISel, as a terrible hack, use SGPR_32 for things that should use SCC until booleans are solved. 
https://reviews.llvm.org/D68821 Files: lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp lib/Target/AMDGPU/SIISelLowering.cpp lib/Target/AMDGPU/SIInstrInfo.cpp lib/Target/AMDGPU/SIRegisterInfo.cpp lib/Target/AMDGPU/SIRegisterInfo.h test/CodeGen/AMDGPU/GlobalISel/inst-select-add.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-amdgcn.class.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-amdgcn.class.s16.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-amdgcn.cos.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-amdgcn.cos.s16.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-amdgcn.cvt.pk.i16.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-amdgcn.cvt.pk.u16.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-amdgcn.cvt.pknorm.i16.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-amdgcn.cvt.pknorm.u16.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-amdgcn.cvt.pkrtz.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-amdgcn.fmed3.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-amdgcn.fmed3.s16.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-amdgcn.fract.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-amdgcn.fract.s16.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-amdgcn.ldexp.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-amdgcn.ldexp.s16.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-amdgcn.mbcnt.lo.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-amdgcn.rcp.legacy.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-amdgcn.rcp.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-amdgcn.rcp.s16.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-amdgcn.rsq.clamp.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-amdgcn.rsq.legacy.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-amdgcn.rsq.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-amdgcn.rsq.s16.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-amdgcn.s.sendmsg.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-amdgcn.sffbh.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-amdgcn.sin.mir 
test/CodeGen/AMDGPU/GlobalISel/inst-select-amdgcn.sin.s16.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-amdgpu-ffbh-u32.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-and.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-anyext.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-ashr.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-ashr.s16.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-bitreverse.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-brcond.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-build-vector.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-concat-vectors.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-constant.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-copy.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-ctpop.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-extract.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-fabs.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-fcmp.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-fcmp.s16.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-ffloor.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-ffloor.s16.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-fmaxnum-ieee.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-fmaxnum.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-fminnum-ieee.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-fminnum.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-fmul.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-fneg.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-fptosi.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-fptoui.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-frame-index.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-gep.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-icmp.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-implicit-def.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-insert.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-load-constant.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-load-smrd.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-lshr.mir 
test/CodeGen/AMDGPU/GlobalISel/inst-select-lshr.s16.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-merge-values.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-mul.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-or.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-phi.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-ptr-mask.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-ptrtoint.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-select.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-sext.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-shl.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-shl.s16.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-sitofp.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-smax.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-smin.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-smulh.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-sub.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-trunc.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-uaddo.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-uitofp.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-umax.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-umin.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-umulh.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-unmerge-values.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-usubo.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-xor.mir test/CodeGen/AMDGPU/GlobalISel/inst-select-zext.mir test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.raw.buffer.store.format.f16.ll test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.raw.buffer.store.format.f32.ll test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.raw.buffer.store.ll test/CodeGen/AMDGPU/buffer-intrinsics-mmo-offsets.ll test/CodeGen/AMDGPU/extract_subvector_vec4_vec3.ll test/CodeGen/AMDGPU/inline-constraints.ll test/CodeGen/AMDGPU/llvm.amdgcn.readfirstlane.ll test/CodeGen/AMDGPU/llvm.amdgcn.readlane.ll test/CodeGen/AMDGPU/llvm.amdgcn.writelane.ll test/CodeGen/AMDGPU/read_register.ll -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68821.224427.patch Type: text/x-patch Size: 340118 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Thu Oct 10 11:31:26 2019 From: llvm-commits at lists.llvm.org (Nico Weber via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 18:31:26 +0000 (UTC) Subject: [PATCH] D68709: (not yet for review) win: Use cross-platform code in Parallel.h/.cpp In-Reply-To: References: Message-ID: thakis abandoned this revision. thakis added a comment. real patch at D68820 CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68709/new/ https://reviews.llvm.org/D68709 From llvm-commits at lists.llvm.org Thu Oct 10 11:31:26 2019 From: llvm-commits at lists.llvm.org (Nico Weber via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 18:31:26 +0000 (UTC) Subject: [PATCH] D59676: Make Parallel.h build with libc++ on Windows. In-Reply-To: References: Message-ID: thakis added a comment. Herald added a subscriber: ldionne. Moving us off concrt in D68820 . CHANGES SINCE LAST ACTION https://reviews.llvm.org/D59676/new/ https://reviews.llvm.org/D59676 From llvm-commits at lists.llvm.org Thu Oct 10 11:41:05 2019 From: llvm-commits at lists.llvm.org (Hal Finkel via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 18:41:05 +0000 (UTC) Subject: [PATCH] D68814: [LV] Allow assume calls in predicated blocks. In-Reply-To: References: Message-ID: hfinkel added inline comments. ================ Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:6904 } + LLVM_DEBUG(dbgs() << "LV: Scalarizing and predicating:" << *I << "\n"); ---------------- Unintentional whitespace change? 
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68814/new/ https://reviews.llvm.org/D68814 From llvm-commits at lists.llvm.org Thu Oct 10 11:41:05 2019 From: llvm-commits at lists.llvm.org (Stanislav Mekhanoshin via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 18:41:05 +0000 (UTC) Subject: [PATCH] D68821: AMDGPU: Relax 32-bit SGPR register class In-Reply-To: References: Message-ID: rampitec accepted this revision. rampitec added a comment. This revision is now accepted and ready to land. LGTM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68821/new/ https://reviews.llvm.org/D68821 From llvm-commits at lists.llvm.org Thu Oct 10 11:41:06 2019 From: llvm-commits at lists.llvm.org (Reid Kleckner via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 18:41:06 +0000 (UTC) Subject: [PATCH] D68820: win: Move Parallel.h off concrt to cross-platform code In-Reply-To: References: Message-ID: <8f0f5e222b27090a2f5ba13a00ef3376@localhost.localdomain> rnk accepted this revision. rnk added a comment. This revision is now accepted and ready to land. Looks very good. :) CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68820/new/ https://reviews.llvm.org/D68820 From llvm-commits at lists.llvm.org Thu Oct 10 11:41:07 2019 From: llvm-commits at lists.llvm.org (Hal Finkel via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 18:41:07 +0000 (UTC) Subject: [PATCH] D68804: [System Model] [TTI] Move default cache/prefetch implementations In-Reply-To: References: Message-ID: <2a0ce00dac3d170a2f1d43637a2e7a3e@localhost.localdomain> hfinkel accepted this revision. hfinkel added a comment. This revision is now accepted and ready to land. 
LGTM Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68804/new/ https://reviews.llvm.org/D68804 From llvm-commits at lists.llvm.org Thu Oct 10 11:50:37 2019 From: llvm-commits at lists.llvm.org (Nico Weber via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 18:50:37 +0000 (UTC) Subject: [PATCH] D68529: [lit] Move argument parsing/validation to separate file In-Reply-To: References: Message-ID: <25ab5c93a4138f7600609e5d941a92bf@localhost.localdomain> thakis added a comment. Looks like the `--threads` option disappeared in the move, which angers the libc++ testsuite. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68529/new/ https://reviews.llvm.org/D68529 From llvm-commits at lists.llvm.org Thu Oct 10 11:50:37 2019 From: llvm-commits at lists.llvm.org (Roman Lebedev via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 18:50:37 +0000 (UTC) Subject: [PATCH] D68819: [Utils] Allow update_test_checks to check function arguments In-Reply-To: References: Message-ID: <547db93fa1cc34c0a08a9b132cd10787@localhost.localdomain> lebedev.ri added a comment. This doesn't change the default, right? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68819/new/ https://reviews.llvm.org/D68819 From llvm-commits at lists.llvm.org Thu Oct 10 11:50:38 2019 From: llvm-commits at lists.llvm.org (Jian Cai via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 18:50:38 +0000 (UTC) Subject: [PATCH] D68764: [ARM][AsmParser] handles offset expression in parentheses In-Reply-To: References: Message-ID: <0a04cb4c5cb5386706f8f5d6a59b60e0@localhost.localdomain> jcai19 added a comment. In D68764#1703095 , @MaskRay wrote: > > acceet offset > > typo Done. Thanks for catching it. 
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68764/new/ https://reviews.llvm.org/D68764 From llvm-commits at lists.llvm.org Thu Oct 10 11:50:38 2019 From: llvm-commits at lists.llvm.org (David Blaikie via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 18:50:38 +0000 (UTC) Subject: [PATCH] D68816: [NFC] Replace a linked list in LiveDebugVariables pass with a DenseMap In-Reply-To: References: Message-ID: <4e3f5da794b7a6bed390c87fd93d8b2d@localhost.localdomain> dblaikie added inline comments. ================ Comment at: llvm/lib/CodeGen/LiveDebugVariables.cpp:338 + } + static inline UserValueIdentity getTombstoneKey() { + auto Key = DenseMapInfo::getTombstoneKey(); ---------------- Drop the inline keyword here and above - the linkage is implied by the definition being provided inside a class definition. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68816/new/ https://reviews.llvm.org/D68816 From llvm-commits at lists.llvm.org Thu Oct 10 11:50:39 2019 From: llvm-commits at lists.llvm.org (Yonghong Song via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 18:50:39 +0000 (UTC) Subject: [PATCH] D68822: [WIP][BPF] Support external globals Message-ID: yonghong-song created this revision. yonghong-song added reviewers: ast, anakryiko. Herald added subscribers: llvm-commits, hiraditya. Herald added a project: LLVM. Emit types for all globals. Previously, the types for external variables and globals with section names are omitted. For external variables, also provide additional information about the purpose of the extern's through section name. 
For example, for the following example, -bash-4.4$ cat t.c extern int a __attribute__((section("bpf_linux"))); extern int b __attribute__((section("bpf_curr_module"))); extern int c __attribute__((section("bpf_cross_module"))); int test() { return a + b + c; } -bash-4.4$ clang -target bpf -g -O2 -S t.c Generated BTF_KIND_VARs: .long 66 # BTF_KIND_VAR(id = 4) .long 234881024 # 0xe000000 .long 0 .long 18 <=== 1 << 4 | 2 extern linux symbols .long 68 # BTF_KIND_VAR(id = 5) .long 234881024 # 0xe000000 .long 0 .long 34 <=== 2 << 4 | 2 current module to-be-patched symbols .long 70 # BTF_KIND_VAR(id = 6) .long 234881024 # 0xe000000 .long 0 .long 2 <=== 0 << 4 | 2 extern cross bpf module symbols The current u32 is defined as below in uapi/linux/btf.h: enum { BTF_VAR_STATIC = 0, BTF_VAR_GLOBAL_ALLOCATED, }; struct btf_var { __u32 linkage; }; We may need to add macros to access linkage and other information. We need to carve bits with some reserved. Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D68822 Files: llvm/lib/Target/BPF/BTF.h llvm/lib/Target/BPF/BTFDebug.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D68822.224429.patch Type: text/x-patch Size: 5862 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Thu Oct 10 11:50:39 2019 From: llvm-commits at lists.llvm.org (David Greene via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 18:50:39 +0000 (UTC) Subject: [PATCH] D68793: [System Model] [TTI] Add TTI interfaces for write-combining buffers In-Reply-To: References: Message-ID: greened added a comment. In D68793#1704266 , @hfinkel wrote: > How do you imagine that we'd use this? Do we need some kind of size to go along with this? See the Intel optimization guide, section 3.6.9. https://software.intel.com/sites/default/files/managed/9e/bc/64-ia-32-architectures-optimization-manual.pdf Basically, this information can be used to inform loop transformations as well as use of non-temporal instructions. 
A write-combining buffer is not the same as a store buffer. A write-combining buffer is always one cache line in size, so I don't think we need size information. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68793/new/ https://reviews.llvm.org/D68793 From llvm-commits at lists.llvm.org Thu Oct 10 11:50:39 2019 From: llvm-commits at lists.llvm.org (Jian Cai via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 18:50:39 +0000 (UTC) Subject: [PATCH] D68764: [ARM][AsmParser] handles offset expression in parentheses In-Reply-To: References: Message-ID: jcai19 updated this revision to Diff 224430. jcai19 added a comment. Fix a typo. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68764/new/ https://reviews.llvm.org/D68764 Files: llvm/lib/Target/ARM/AsmParser/ARMAsmParser.cpp llvm/test/MC/ARM/gas-compl.s Index: llvm/test/MC/ARM/gas-compl.s =================================================================== --- /dev/null +++ llvm/test/MC/ARM/gas-compl.s @@ -0,0 +1,26 @@ +@ RUN: llvm-mc -triple=arm < %s | FileCheck %s + +@ CHECK: ldr r12, [sp, #15] +.syntax unified + ldr r12, [sp, (15)] + +@ CHECK: ldr r12, [sp, #15] +.syntax unified + ldr r12, [sp, #(15)] + +@ CHECK: ldr r12, [sp, #15] +.syntax unified + ldr r12, [sp, $(15)] + +@ CHECK: ldr r12, [sp, #40] +.syntax unified + ldr r12, [sp, (15+5*5)] + +@ CHECK: ldr r12, [sp, #40] +.syntax unified + ldr r12, [sp, #(15+5*5)] + + +@ CHECK: ldr r12, [sp, #40] +.syntax unified + ldr r12, [sp, $(15+5*5)] Index: llvm/lib/Target/ARM/AsmParser/ARMAsmParser.cpp =================================================================== --- llvm/lib/Target/ARM/AsmParser/ARMAsmParser.cpp +++ llvm/lib/Target/ARM/AsmParser/ARMAsmParser.cpp @@ -5734,13 +5734,15 @@ } // If we have a '#', it's an immediate offset, else assume it's a register - // offset. Be friendly and also accept a plain integer (without a leading - // hash) for gas compatibility. + // offset. 
Be friendly and also accept a plain integer or expression (without + // a leading hash) for gas compatibility. if (Parser.getTok().is(AsmToken::Hash) || Parser.getTok().is(AsmToken::Dollar) || + Parser.getTok().is(AsmToken::LParen) || Parser.getTok().is(AsmToken::Integer)) { if (Parser.getTok().isNot(AsmToken::Integer)) - Parser.Lex(); // Eat '#' or '$'. + if (!Parser.getTok().is(AsmToken::LParen)) + Parser.Lex(); // Eat '#' or '$' E = Parser.getTok().getLoc(); bool isNegative = getParser().getTok().is(AsmToken::Minus); -------------- next part -------------- A non-text attachment was scrubbed... Name: D68764.224430.patch Type: text/x-patch Size: 1702 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Thu Oct 10 11:51:32 2019 From: llvm-commits at lists.llvm.org (Jordan Rupprecht via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 18:51:32 +0000 (UTC) Subject: [PATCH] D68730: [llvm-objdump] Adjust spacing and field width for --section-headers In-Reply-To: References: Message-ID: <3065373f9dd5716bb778c3c9273cf628@localhost.localdomain> rupprecht marked 9 inline comments as done. rupprecht added inline comments. ================ Comment at: llvm/test/tools/llvm-objdump/section-headers-address-width.test:10 +# 32: {{^}}Idx Name Size VMA LMA Type{{$}} +# 32: {{^}} 1 .foo 00000000 00000400 00000400 TEXT{{$}} + ---------------- grimar wrote: > I wonder, should we be able to shrink it too? > > Something like: > Len = max length among all sections. > > Instead of a current (if I understand it right) > Len = max(max length ..., 13) 13 is chosen to make the output match GNU objdump. Having a cutoff (not necessarily 13, but any number around there) also makes the formatting stable for objects with regular-length section names. 
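The width rule discussed in the reply above (the longest section name, with a floor of 13 so that the output lines up with GNU objdump) could be sketched roughly as below. This is only an illustration, not the code from D68730: the function name `nameColumnWidth` and the use of a plain vector of strings are assumptions made so the example stands alone.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not taken from the patch): compute the width of the
// "Name" column. The floor of 13 is the cutoff mentioned in the review;
// any section name longer than that widens the column.
std::size_t nameColumnWidth(const std::vector<std::string> &SectionNames) {
  std::size_t Width = 13;
  for (const std::string &Name : SectionNames)
    Width = std::max(Width, Name.size());
  return Width;
}
```

With this shape, objects whose section names are all short get the same column width, which is the stability argument made above.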
================ Comment at: llvm/test/tools/llvm-objdump/section-headers-name-width.test:4 +# RUN: yaml2obj %s --docnum=1 -o %t-name13chars.o +# RUN: llvm-objdump -h --show-lma %t-name13chars.o \ +# RUN: | FileCheck %s --check-prefix=NAME-13 --strict-whitespace ---------------- grimar wrote: > You are not using `--show-lma` it seems? I.e. I am not sure why do you need this test. Added a comment -- the reason is just that it goes down a different code path, e.g. would catch a bad implementation like: ``` if (ShowLMA) outs() << "Name " << ... // Oops, didn't use left_justify else outs() << left_justify("Name", NameWidth) << ... ``` ================ Comment at: llvm/test/tools/llvm-objdump/section-headers-spacing.test:1 +## Check leading and trailing whitespace for full lines. +# RUN: yaml2obj %s -o %t-whitespace.o ---------------- grimar wrote: > What do you think about combining these tests you have here into one that > could use `yaml2obj --docnum=X` and check spacing, formatting etc in one place? > (I am not sure it if it is usefull to have 3 different test files?) I started out with one test file, but found it to be a collection of somewhat unrelated things -- e.g. name column width and 32 vs 64 bit column widths are different features. So I think it's better to have more focused test files. It's a slightly personal preference though. ================ Comment at: llvm/tools/llvm-objdump/llvm-objdump.cpp:1694 + SectionTypes.push_back("BSS"); + std::string Type = llvm::join(SectionTypes, " "); ---------------- grimar wrote: > May be I'd try to avoid using an additional vector and algorithm here and just: > > ``` > std::string Type = Section.isText() ? "TEXT" : ""; > if (Section.isData()) > Type += Type.empty() ? "DATA" : " DATA"; > if (Section.isBSS()) > Type += Type.empty() ? "BSS" : " BSS"; > ``` I think the lack of using an algorithm is why there was odd trailing whitespace before, although I agree it would be great if this could be more succinct... 
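For readers following the trailing-whitespace point above, here is a rough standalone sketch of the "collect, then join" approach. It assumes nothing about the real patch beyond what is quoted: the booleans stand in for `Section.isText()`, `Section.isData()` and `Section.isBSS()`, the function name `sectionTypeString` is made up, and the join is hand-rolled here instead of using `llvm::join` so that the example builds without LLVM headers.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the logic under review: gather the applicable
// type names first, then join them with single spaces. Because the separator
// is only emitted between elements, the result never has leading or
// trailing whitespace.
std::string sectionTypeString(bool IsText, bool IsData, bool IsBSS) {
  std::vector<std::string> Types;
  if (IsText)
    Types.push_back("TEXT");
  if (IsData)
    Types.push_back("DATA");
  if (IsBSS)
    Types.push_back("BSS");

  std::string Result;
  for (const std::string &T : Types) {
    if (!Result.empty())
      Result += ' ';
    Result += T;
  }
  return Result;
}
```

The same property holds for any number of flags, which is the main argument for keeping the vector even though it is more verbose.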
Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68730/new/

https://reviews.llvm.org/D68730

From llvm-commits at lists.llvm.org Thu Oct 10 11:51:48 2019
From: llvm-commits at lists.llvm.org (Jordan Rupprecht via Phabricator via llvm-commits)
Date: Thu, 10 Oct 2019 18:51:48 +0000 (UTC)
Subject: [PATCH] D68730: [llvm-objdump] Adjust spacing and field width for --section-headers
In-Reply-To:
References:
Message-ID: <25af508f155d896e3db4f43a8c2bb0be@localhost.localdomain>

rupprecht updated this revision to Diff 224433.
rupprecht marked an inline comment as done.
rupprecht added a comment.

- Remove yaml `...` separators
- Whitespace changes
- Add more test comments
- Rename to getMaxSectionNameWidth
- Fix yaml spacing
- Remove some unnecessary yaml fields

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68730/new/

https://reviews.llvm.org/D68730

Files:
  lld/test/ELF/got32-i386.s
  lld/test/ELF/got32x-i386.s
  llvm/test/tools/llvm-objdump/section-headers-address-width.test
  llvm/test/tools/llvm-objdump/section-headers-name-width.test
  llvm/test/tools/llvm-objdump/section-headers-spacing.test
  llvm/test/tools/llvm-objdump/wasm.txt
  llvm/test/tools/llvm-objdump/xcoff-section-headers.test
  llvm/tools/llvm-objdump/llvm-objdump.cpp

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68730.224433.patch Type: text/x-patch Size: 13418 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Thu Oct 10 11:56:42 2019 From: llvm-commits at lists.llvm.org (Xiangling Liao via llvm-commits) Date: Thu, 10 Oct 2019 18:56:42 -0000 Subject: [llvm] r374420 - [NFC][PowerPC]Clean up PPCAsmPrinter for TOC related pseudo opcode Message-ID: <20191010185642.8E7C181F57@lists.llvm.org> Author: xiangling_liao Date: Thu Oct 10 11:56:42 2019 New Revision: 374420 URL: http://llvm.org/viewvc/llvm-project?rev=374420&view=rev Log: [NFC][PowerPC]Clean up PPCAsmPrinter for TOC related pseudo opcode Add a helper function getMCSymbolForTOCPseudoMO to clean up PPCAsmPrinter a little bit. Differential Revision: https://reviews.llvm.org/D68721 Modified: llvm/trunk/lib/Target/PowerPC/PPCAsmPrinter.cpp Modified: llvm/trunk/lib/Target/PowerPC/PPCAsmPrinter.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/PowerPC/PPCAsmPrinter.cpp?rev=374420&r1=374419&r2=374420&view=diff ============================================================================== --- llvm/trunk/lib/Target/PowerPC/PPCAsmPrinter.cpp (original) +++ llvm/trunk/lib/Target/PowerPC/PPCAsmPrinter.cpp Thu Oct 10 11:56:42 2019 @@ -78,7 +78,7 @@ namespace { class PPCAsmPrinter : public AsmPrinter { protected: - MapVector TOC; + MapVector TOC; const PPCSubtarget *Subtarget; StackMaps SM; @@ -89,7 +89,7 @@ public: StringRef getPassName() const override { return "PowerPC Assembly Printer"; } - MCSymbol *lookUpOrCreateTOCEntry(MCSymbol *Sym); + MCSymbol *lookUpOrCreateTOCEntry(const MCSymbol *Sym); bool doInitialization(Module &M) override { if (!TOC.empty()) @@ -338,7 +338,7 @@ bool PPCAsmPrinter::PrintAsmMemoryOperan /// lookUpOrCreateTOCEntry -- Given a symbol, look up whether a TOC entry /// exists for it. If not, create one. Then return a symbol that references /// the TOC entry. 
-MCSymbol *PPCAsmPrinter::lookUpOrCreateTOCEntry(MCSymbol *Sym) { +MCSymbol *PPCAsmPrinter::lookUpOrCreateTOCEntry(const MCSymbol *Sym) { MCSymbol *&TOCEntry = TOC[Sym]; if (!TOCEntry) TOCEntry = createTempSymbol("C"); @@ -512,6 +512,22 @@ void PPCAsmPrinter::EmitTlsCall(const Ma .addExpr(SymVar)); } +/// Map the machine operand to its corresponding MCSymbol. +static MCSymbol *getMCSymbolForTOCPseudoMO(const MachineOperand &MO, AsmPrinter &AP) { + switch(MO.getType()) { + case MachineOperand::MO_GlobalAddress: + return AP.getSymbol(MO.getGlobal()); + case MachineOperand::MO_ConstantPoolIndex: + return AP.GetCPISymbol(MO.getIndex()); + case MachineOperand::MO_JumpTableIndex: + return AP.GetJTISymbol(MO.getIndex()); + case MachineOperand::MO_BlockAddress: + return AP.GetBlockAddressSymbol(MO.getBlockAddress()); + default: + llvm_unreachable("Unexpected operand type to get symbol."); + } +} + /// EmitInstruction -- Print out a single PowerPC MI in Darwin syntax to /// the current output stream. /// @@ -668,16 +684,7 @@ void PPCAsmPrinter::EmitInstruction(cons "Unexpected operand type for LWZtoc pseudo."); // Map the operand to its corresponding MCSymbol. - MCSymbol *MOSymbol = nullptr; - if (MO.isGlobal()) - MOSymbol = getSymbol(MO.getGlobal()); - else if (MO.isCPI()) - MOSymbol = GetCPISymbol(MO.getIndex()); - else if (MO.isJTI()) - MOSymbol = GetJTISymbol(MO.getIndex()); - else if (MO.isBlockAddress()) - MOSymbol = GetBlockAddressSymbol(MO.getBlockAddress()); - + const MCSymbol *const MOSymbol = getMCSymbolForTOCPseudoMO(MO, *this); const bool IsAIX = TM.getTargetTriple().isOSAIX(); // Create a reference to the GOT entry for the symbol. The GOT entry will be @@ -726,24 +733,18 @@ void PPCAsmPrinter::EmitInstruction(cons // Transform %x3 = LDtoc @min1, %x2 LowerPPCMachineInstrToMCInst(MI, TmpInst, *this, IsDarwin); - // Change the opcode to LD, and the global address operand to be a - // reference to the TOC entry we will synthesize later. 
+ // Change the opcode to LD. TmpInst.setOpcode(PPC::LD); - const MachineOperand &MO = MI->getOperand(1); - // Map symbol -> label of TOC entry - assert(MO.isGlobal() || MO.isCPI() || MO.isJTI() || MO.isBlockAddress()); - MCSymbol *MOSymbol = nullptr; - if (MO.isGlobal()) - MOSymbol = getSymbol(MO.getGlobal()); - else if (MO.isCPI()) - MOSymbol = GetCPISymbol(MO.getIndex()); - else if (MO.isJTI()) - MOSymbol = GetJTISymbol(MO.getIndex()); - else if (MO.isBlockAddress()) - MOSymbol = GetBlockAddressSymbol(MO.getBlockAddress()); + const MachineOperand &MO = MI->getOperand(1); + assert((MO.isGlobal() || MO.isCPI() || MO.isJTI() || MO.isBlockAddress()) && + "Invalid operand!"); - MCSymbol *TOCEntry = lookUpOrCreateTOCEntry(MOSymbol); + // Map the machine operand to its corresponding MCSymbol, then map the + // global address operand to be a reference to the TOC entry we will + // synthesize later. + MCSymbol *TOCEntry = + lookUpOrCreateTOCEntry(getMCSymbolForTOCPseudoMO(MO, *this)); const MCExpr *Exp = MCSymbolRefExpr::create(TOCEntry, MCSymbolRefExpr::VK_PPC_TOC, @@ -757,32 +758,22 @@ void PPCAsmPrinter::EmitInstruction(cons // Transform %xd = ADDIStocHA8 %x2, @sym LowerPPCMachineInstrToMCInst(MI, TmpInst, *this, IsDarwin); - // Change the opcode to ADDIS8. If the global address is external, has - // common linkage, is a non-local function address, or is a jump table - // address, then generate a TOC entry and reference that. Otherwise - // reference the symbol directly. + // Change the opcode to ADDIS8. If the global address is the address of + // an external symbol, is a jump table address, is a block address, or is a + // constant pool index with large code model enabled, then generate a TOC + // entry and reference that. Otherwise, reference the symbol directly. 
TmpInst.setOpcode(PPC::ADDIS8); + const MachineOperand &MO = MI->getOperand(2); - assert((MO.isGlobal() || MO.isCPI() || MO.isJTI() || - MO.isBlockAddress()) && + assert((MO.isGlobal() || MO.isCPI() || MO.isJTI() || MO.isBlockAddress()) && "Invalid operand for ADDIStocHA8!"); - MCSymbol *MOSymbol = nullptr; - bool GlobalToc = false; - if (MO.isGlobal()) { - const GlobalValue *GV = MO.getGlobal(); - MOSymbol = getSymbol(GV); - GlobalToc = Subtarget->isGVIndirectSymbol(GV); - } else if (MO.isCPI()) { - MOSymbol = GetCPISymbol(MO.getIndex()); - } else if (MO.isJTI()) { - MOSymbol = GetJTISymbol(MO.getIndex()); - } else if (MO.isBlockAddress()) { - MOSymbol = GetBlockAddressSymbol(MO.getBlockAddress()); - } + const MCSymbol *MOSymbol = getMCSymbolForTOCPseudoMO(MO, *this); + const bool GlobalToc = + MO.isGlobal() && Subtarget->isGVIndirectSymbol(MO.getGlobal()); if (GlobalToc || MO.isJTI() || MO.isBlockAddress() || - TM.getCodeModel() == CodeModel::Large) + (MO.isCPI() && TM.getCodeModel() == CodeModel::Large)) MOSymbol = lookUpOrCreateTOCEntry(MOSymbol); const MCExpr *Exp = @@ -803,36 +794,26 @@ void PPCAsmPrinter::EmitInstruction(cons // Transform %xd = LDtocL @sym, %xs LowerPPCMachineInstrToMCInst(MI, TmpInst, *this, IsDarwin); - // Change the opcode to LD. If the global address is external, has - // common linkage, or is a jump table address, then reference the - // associated TOC entry. Otherwise reference the symbol directly. + // Change the opcode to LD. If the global address is the address of + // an external symbol, is a jump table address, is a block address, or is + // a constant pool index with large code model enabled, then generate a + // TOC entry and reference that. Otherwise, reference the symbol directly. 
TmpInst.setOpcode(PPC::LD); + const MachineOperand &MO = MI->getOperand(1); assert((MO.isGlobal() || MO.isCPI() || MO.isJTI() || MO.isBlockAddress()) && "Invalid operand for LDtocL!"); - MCSymbol *MOSymbol = nullptr; - if (MO.isJTI()) - MOSymbol = lookUpOrCreateTOCEntry(GetJTISymbol(MO.getIndex())); - else if (MO.isBlockAddress()) { - MOSymbol = GetBlockAddressSymbol(MO.getBlockAddress()); - MOSymbol = lookUpOrCreateTOCEntry(MOSymbol); - } - else if (MO.isCPI()) { - MOSymbol = GetCPISymbol(MO.getIndex()); - if (TM.getCodeModel() == CodeModel::Large) - MOSymbol = lookUpOrCreateTOCEntry(MOSymbol); - } - else if (MO.isGlobal()) { - const GlobalValue *GV = MO.getGlobal(); - MOSymbol = getSymbol(GV); - LLVM_DEBUG( - assert((Subtarget->isGVIndirectSymbol(GV)) && - "LDtocL used on symbol that could be accessed directly is " - "invalid. Must match ADDIStocHA8.")); + LLVM_DEBUG(assert( + (!MO.isGlobal() || Subtarget->isGVIndirectSymbol(MO.getGlobal())) && + "LDtocL used on symbol that could be accessed directly is " + "invalid. Must match ADDIStocHA8.")); + + const MCSymbol *MOSymbol = getMCSymbolForTOCPseudoMO(MO, *this); + + if (!MO.isCPI() || TM.getCodeModel() == CodeModel::Large) MOSymbol = lookUpOrCreateTOCEntry(MOSymbol); - } const MCExpr *Exp = MCSymbolRefExpr::create(MOSymbol, MCSymbolRefExpr::VK_PPC_TOC_LO, @@ -845,26 +826,21 @@ void PPCAsmPrinter::EmitInstruction(cons // Transform %xd = ADDItocL %xs, @sym LowerPPCMachineInstrToMCInst(MI, TmpInst, *this, IsDarwin); - // Change the opcode to ADDI8. If the global address is external, then - // generate a TOC entry and reference that. Otherwise reference the + // Change the opcode to ADDI8. If the global address is external, then + // generate a TOC entry and reference that. Otherwise, reference the // symbol directly. 
TmpInst.setOpcode(PPC::ADDI8); + const MachineOperand &MO = MI->getOperand(2); - assert((MO.isGlobal() || MO.isCPI()) && "Invalid operand for ADDItocL"); - MCSymbol *MOSymbol = nullptr; + assert((MO.isGlobal() || MO.isCPI()) && "Invalid operand for ADDItocL."); - if (MO.isGlobal()) { - const GlobalValue *GV = MO.getGlobal(); - LLVM_DEBUG(assert(!(Subtarget->isGVIndirectSymbol(GV)) && - "Interposable definitions must use indirect access.")); - MOSymbol = getSymbol(GV); - } else if (MO.isCPI()) { - MOSymbol = GetCPISymbol(MO.getIndex()); - } + LLVM_DEBUG( + assert(!(MO.isGlobal() && Subtarget->isGVIndirectSymbol(MO.getGlobal())) && + "Interposable definitions must use indirect access.")); const MCExpr *Exp = - MCSymbolRefExpr::create(MOSymbol, MCSymbolRefExpr::VK_PPC_TOC_LO, - OutContext); + MCSymbolRefExpr::create(getMCSymbolForTOCPseudoMO(MO, *this), + MCSymbolRefExpr::VK_PPC_TOC_LO, OutContext); TmpInst.getOperand(2) = MCOperand::createExpr(Exp); EmitToStreamer(*OutStreamer, TmpInst); return; @@ -1400,15 +1376,16 @@ bool PPCLinuxAsmPrinter::doFinalization( ".got2", ELF::SHT_PROGBITS, ELF::SHF_WRITE | ELF::SHF_ALLOC); OutStreamer->SwitchSection(Section); - for (MapVector::iterator I = TOC.begin(), - E = TOC.end(); I != E; ++I) { - OutStreamer->EmitLabel(I->second); - MCSymbol *S = I->first; + for (const auto &TOCMapPair: TOC) { + const MCSymbol *const TOCEntryTarget = TOCMapPair.first; + MCSymbol *const TOCEntryLabel = TOCMapPair.second; + + OutStreamer->EmitLabel(TOCEntryLabel); if (isPPC64) { - TS.emitTCEntry(*S); + TS.emitTCEntry(*TOCEntryTarget); } else { OutStreamer->EmitValueToAlignment(4); - OutStreamer->EmitSymbolValue(S, 4); + OutStreamer->EmitSymbolValue(TOCEntryTarget, 4); } } } From llvm-commits at lists.llvm.org Thu Oct 10 11:57:23 2019 From: llvm-commits at lists.llvm.org (Nico Weber via llvm-commits) Date: Thu, 10 Oct 2019 18:57:23 -0000 Subject: [llvm] r374421 - win: Move Parallel.h off concrt to cross-platform code Message-ID: 
<20191010185723.C2B6C83345@lists.llvm.org>

Author: nico
Date: Thu Oct 10 11:57:23 2019
New Revision: 374421

URL: http://llvm.org/viewvc/llvm-project?rev=374421&view=rev
Log:
win: Move Parallel.h off concrt to cross-platform code

r179397 added Parallel.h and implemented it in terms of concrt in 2013.

In 2015, a cross-platform implementation of the functions has appeared
and is in use everywhere but on Windows (r232419). r246219 hints that
had issues in MSVC2013, but r296906 suggests they've been fixed now
that we require 2015+.

So remove the concrt code. It's less code, and it sounds like concrt
has conceptual and performance issues, see PR41198.

I built blink_core.dll in a debug component build with full symbols and
in a release component build without any symbols. I couldn't measure a
performance difference for linking blink_core.dll before and after this
patch.

Differential Revision: https://reviews.llvm.org/D68820

Modified:
    llvm/trunk/include/llvm/Support/Parallel.h
    llvm/trunk/lib/Support/Parallel.cpp

Modified: llvm/trunk/include/llvm/Support/Parallel.h
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Support/Parallel.h?rev=374421&r1=374420&r2=374421&view=diff
==============================================================================
--- llvm/trunk/include/llvm/Support/Parallel.h (original)
+++ llvm/trunk/include/llvm/Support/Parallel.h Thu Oct 10 11:57:23 2019
@@ -18,14 +18,6 @@
 #include
 #include
-#if defined(_MSC_VER) && LLVM_ENABLE_THREADS
-#pragma warning(push)
-#pragma warning(disable : 4530)
-#include
-#include
-#pragma warning(pop)
-#endif
-
 namespace llvm {
 namespace parallel {
@@ -84,23 +76,6 @@ public:
   void sync() const { L.sync(); }
 };
-#if defined(_MSC_VER)
-template
-void parallel_sort(RandomAccessIterator Start, RandomAccessIterator End,
-                   const Comparator &Comp) {
-  concurrency::parallel_sort(Start, End, Comp);
-}
-template
-void parallel_for_each(IterTy Begin, IterTy End, FuncTy Fn) {
-  concurrency::parallel_for_each(Begin, End,
Fn); -} - -template -void parallel_for_each_n(IndexTy Begin, IndexTy End, FuncTy Fn) { - concurrency::parallel_for(Begin, End, Fn); -} - -#else const ptrdiff_t MinParallelSize = 1024; /// Inclusive median. @@ -188,8 +163,6 @@ void parallel_for_each_n(IndexTy Begin, #endif -#endif - template using DefComparator = std::less::value_type>; Modified: llvm/trunk/lib/Support/Parallel.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Support/Parallel.cpp?rev=374421&r1=374420&r2=374421&view=diff ============================================================================== --- llvm/trunk/lib/Support/Parallel.cpp (original) +++ llvm/trunk/lib/Support/Parallel.cpp Thu Oct 10 11:57:23 2019 @@ -32,34 +32,6 @@ public: static Executor *getDefaultExecutor(); }; -#if defined(_MSC_VER) -/// An Executor that runs tasks via ConcRT. -class ConcRTExecutor : public Executor { - struct Taskish { - Taskish(std::function Task) : Task(Task) {} - - std::function Task; - - static void run(void *P) { - Taskish *Self = static_cast(P); - Self->Task(); - concurrency::Free(Self); - } - }; - -public: - virtual void add(std::function F) { - Concurrency::CurrentScheduler::ScheduleTask( - Taskish::run, new (concurrency::Alloc(sizeof(Taskish))) Taskish(F)); - } -}; - -Executor *Executor::getDefaultExecutor() { - static ConcRTExecutor exec; - return &exec; -} - -#else /// An implementation of an Executor that runs closures on a thread pool /// in filo order. 
class ThreadPoolExecutor : public Executor { @@ -117,8 +89,7 @@ Executor *Executor::getDefaultExecutor() static ThreadPoolExecutor exec; return &exec; } -#endif -} +} // namespace static std::atomic TaskGroupInstances; From llvm-commits at lists.llvm.org Thu Oct 10 11:59:59 2019 From: llvm-commits at lists.llvm.org (=?utf-8?q?D=C3=A1vid_Bolvansk=C3=BD_via_Phabricator?= via llvm-commits) Date: Thu, 10 Oct 2019 18:59:59 +0000 (UTC) Subject: [PATCH] D68819: [Utils] Allow update_test_checks to check function arguments In-Reply-To: References: Message-ID: xbolva00 added inline comments. ================ Comment at: llvm/utils/UpdateTestChecks/common.py:121 + args = m.group('args').strip() + elif 'args' in m.groupdict(): + args = '(' ---------------- ``` if 'args' in m.groupdict(): if record_args: args = m.group('args').strip() else: args = '(' ``` Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68819/new/ https://reviews.llvm.org/D68819 From llvm-commits at lists.llvm.org Thu Oct 10 11:59:59 2019 From: llvm-commits at lists.llvm.org (David Blaikie via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 18:59:59 +0000 (UTC) Subject: [PATCH] D68465: [DebugInfo] Trim call-clobbered location list entries when tuning for GDB In-Reply-To: References: Message-ID: <325c9d9674945c66ee29ebed09c068dc@localhost.localdomain> dblaikie added a comment. In D68465#1703341 , @dstenb wrote: > In D68465#1698682 , @dblaikie wrote: > > > So your changes to the address pool don't actually cause the address pool to contain entries with offsets - it stores only the base address in the actual debug_addr pool output, but then uses the offset from there in the place that refers to the address pool. > > > The offset will make it into the address pool output. In the parent patch (D68466 ) the offset is added to the output in `AddressPool::emit()`. > > Such a case is tested in the attached call-clobbered-split.mir test: > > # CHECK: .Ldebug_loc0: > [...] 
> # CHECK-NEXT: .byte 3 > # CHECK-NEXT: .byte 2 <--------- > # CHECK-NEXT: .long .Ltmp2-(.Ltmp2-1) > # CHECK-NEXT: .byte 4 # Loc expr size > # CHECK-NEXT: .byte 48 # DW_OP_lit0 > # CHECK-NEXT: .byte 159 # DW_OP_stack_value > # CHECK-NEXT: .byte 147 # DW_OP_piece > # CHECK-NEXT: .byte 8 # 8 > [...] > # CHECK: .Laddr_table_base0: > # CHECK-NEXT: .quad .Lfunc_begin0 > # CHECK-NEXT: .quad .Ltmp1 > # CHECK-NEXT: .quad .Ltmp2-1 <----- > # CHECK-NEXT: .quad .Ltmp2 > Ah, OK. I haven't applied/tested the patch myself - I looked at GCC's behavior & figured you were probably aiming for the same as it, and it doesn't look like GCC does this - and I'd certainly like to avoid that if at all possible. (though all this is pending discussion/conclusions you get from discussing things with GDB) >> So I think that would mean there would end up with duplicate entries in debug_addr, which would be a waste of space/relocations/etc. >> >> So only the address should go in the pool - the pool shouldn't be aware of the offset. (this would mean the semantics of the in-memory data structures would match more closely to the output) > > I think it's necessary to emit the offsets in the address pool output since `DW_LLE_offset_pair` takes unsigned operands. Unsigned operands don't /seem/ to me to be problematic here.. 
GCC's output in -gdwarf-5 with the example you provided previously doesn't use debug_addr, instead using offset_pair (if there's a base address) or start_length (if I add another function to the example, and use -ffunction-sections, so the base address of the CU is constant zero): .byte 0x8 .quad .LVL0 .uleb128 .LVL1-1-.LVL0 .uleb128 0x1 .byte 0x50 .byte 0 With -gsplit-dwarf GCC has to use debug_addr, and we don't see any label arithmetic in debug_addr: .section .debug_addr,"", at progbits .quad .LVL1 .quad .LFB0 .quad .LFB1 .quad call .quad value .quad .LVL0 & we do see it in debug_loclists.dwo: .byte 0x3 .uleb128 0x5 .uleb128 .LVL1-1-.LVL0 .uleb128 0x1 .byte 0x50 .byte 0 So if we are going to do this, I'd certainly want to match that sort of behavior - and not make changes to/add extra addresses to debug_addr if we don't have to. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68465/new/ https://reviews.llvm.org/D68465 From llvm-commits at lists.llvm.org Thu Oct 10 12:00:00 2019 From: llvm-commits at lists.llvm.org (serge via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 19:00:00 +0000 (UTC) Subject: [PATCH] D68720: Support -fstack-clash-protection for x86 In-Reply-To: References: Message-ID: <890ad471f37c9296a94805406a1d2f29@localhost.localdomain> serge-sans-paille updated this revision to Diff 224431. serge-sans-paille edited the summary of this revision. serge-sans-paille added a comment. Added test case, statistics and refactor interactions with existing stack probing mechanism. 
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68720/new/ https://reviews.llvm.org/D68720 Files: clang/docs/ReleaseNotes.rst clang/include/clang/Basic/CodeGenOptions.def clang/include/clang/Basic/DiagnosticFrontendKinds.td clang/include/clang/Basic/TargetInfo.h clang/include/clang/Driver/CC1Options.td clang/include/clang/Driver/Options.td clang/lib/Basic/Targets/X86.h clang/lib/CodeGen/CGStmt.cpp clang/lib/CodeGen/CodeGenModule.cpp clang/lib/Driver/ToolChains/Clang.cpp clang/lib/Frontend/CompilerInvocation.cpp clang/test/CodeGen/stack-clash-protection.c clang/test/Driver/stack-clash-protection.c llvm/docs/ReleaseNotes.rst llvm/include/llvm/CodeGen/TargetLowering.h llvm/lib/Target/X86/X86CallFrameOptimization.cpp llvm/lib/Target/X86/X86FrameLowering.cpp llvm/lib/Target/X86/X86FrameLowering.h llvm/lib/Target/X86/X86ISelLowering.cpp llvm/lib/Target/X86/X86ISelLowering.h llvm/lib/Target/X86/X86InstrCompiler.td llvm/lib/Target/X86/X86InstrInfo.td llvm/test/CodeGen/X86/stack-clash-dynamic-alloca.ll llvm/test/CodeGen/X86/stack-clash-medium-natural-probes.ll llvm/test/CodeGen/X86/stack-clash-medium.ll llvm/test/CodeGen/X86/stack-clash-no-free-probe.ll llvm/test/CodeGen/X86/stack-clash-small.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D68720.224431.patch Type: text/x-patch Size: 35376 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Thu Oct 10 12:00:01 2019 From: llvm-commits at lists.llvm.org (Cameron McInally via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 19:00:01 +0000 (UTC) Subject: [PATCH] D61675: [WIP] Update IRBuilder::CreateFNeg(...) to return a UnaryOperator In-Reply-To: References: Message-ID: <2ebea640f05b60b4640839934c06c8a7@localhost.localdomain> cameron.mcinally added a subscriber: RKSimon. cameron.mcinally added a comment. 
@gribozavr I see that you also reverted @RKSimon's commit for the OCaml/core.ml failure: Author: gribozavr Date: Thu Oct 10 07:16:58 2019 New Revision: 374357 URL: http://llvm.org/viewvc/llvm-project?rev=374357&view=rev Log: Revert "Fix OCaml/core.ml fneg check" This reverts commit r374346. It attempted to fix OCaml tests, but is does not actually fix them. Modified: llvm/trunk/test/Bindings/OCaml/core.ml That appears to be the proper fix. Do you see something wrong with it that I'm missing? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D61675/new/ https://reviews.llvm.org/D61675 From llvm-commits at lists.llvm.org Thu Oct 10 12:00:01 2019 From: llvm-commits at lists.llvm.org (David Blaikie via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 19:00:01 +0000 (UTC) Subject: [PATCH] D68586: Save a word in every StringSet entry In-Reply-To: References: Message-ID: <18ef2b216e6412325c5a64b6afaafbd2@localhost.localdomain> dblaikie accepted this revision. dblaikie added a comment. This revision is now accepted and ready to land. Looks good to me - thanks! If you like, maybe leave a comment/hint that this could be generalized to a full EBO if someone has a need in the future. Any idea why MDString is friending an implementation detail like this? Should it be? Could we make it an actual private implementation detail so people can't do this? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68586/new/ https://reviews.llvm.org/D68586 From llvm-commits at lists.llvm.org Thu Oct 10 12:00:17 2019 From: llvm-commits at lists.llvm.org (Nico Weber via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 19:00:17 +0000 (UTC) Subject: [PATCH] D68820: win: Move Parallel.h off concrt to cross-platform code In-Reply-To: References: Message-ID: This revision was automatically updated to reflect the committed changes. 
Closed by commit rGd49600320598: win: Move Parallel.h off concrt to cross-platform code (authored by thakis).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68820/new/

https://reviews.llvm.org/D68820

Files:
  llvm/include/llvm/Support/Parallel.h
  llvm/lib/Support/Parallel.cpp

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68820.224434.patch
Type: text/x-patch
Size: 2533 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Thu Oct 10 12:01:01 2019
From: llvm-commits at lists.llvm.org (serge via Phabricator via llvm-commits)
Date: Thu, 10 Oct 2019 19:01:01 +0000 (UTC)
Subject: [PATCH] D68720: Support -fstack-clash-protection for x86
In-Reply-To:
References:
Message-ID:

serge-sans-paille added a comment.

Some early stats: on the sqlite amalgamation [0], the free probe reuse allows skipping 123 of the 474 probes needed during frame lowering.

[0] https://www.sqlite.org/download.html

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68720/new/

https://reviews.llvm.org/D68720

From llvm-commits at lists.llvm.org Thu Oct 10 12:09:44 2019
From: llvm-commits at lists.llvm.org (Dávid Bolvanský via Phabricator via llvm-commits)
Date: Thu, 10 Oct 2019 19:09:44 +0000 (UTC)
Subject: [PATCH] D68720: Support -fstack-clash-protection for x86
In-Reply-To:
References:
Message-ID:

xbolva00 added inline comments.
================
Comment at: llvm/lib/Target/X86/X86FrameLowering.cpp:479
+    }
+    if (std::any_of(MI.operands_begin(), MI.operands_end(),
+                    [](MachineOperand &MO) { return MO.isFI(); })) {
----------------
nit: llvm::any_of

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68720/new/

https://reviews.llvm.org/D68720

From llvm-commits at lists.llvm.org Thu Oct 10 12:18:52 2019
From: llvm-commits at lists.llvm.org (Johannes Doerfert via Phabricator via llvm-commits)
Date: Thu, 10 Oct 2019 19:18:52 +0000 (UTC)
Subject: [PATCH] D68819: [Utils] Allow update_test_checks to check function arguments
In-Reply-To:
References:
Message-ID:

jdoerfert updated this revision to Diff 224438.
jdoerfert added a comment.

Include personality, fix versioning based on different args, tested on D68766

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68819/new/

https://reviews.llvm.org/D68819

Files:
  llvm/utils/UpdateTestChecks/common.py
  llvm/utils/update_analyze_test_checks.py
  llvm/utils/update_mir_test_checks.py
  llvm/utils/update_test_checks.py

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68819.224438.patch
Type: text/x-patch
Size: 6399 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Thu Oct 10 12:18:52 2019
From: llvm-commits at lists.llvm.org (Johannes Doerfert via Phabricator via llvm-commits)
Date: Thu, 10 Oct 2019 19:18:52 +0000 (UTC)
Subject: [PATCH] D68819: [Utils] Allow update_test_checks to check function arguments
In-Reply-To:
References:
Message-ID:

jdoerfert added a comment.

In D68819#1704491, @lebedev.ri wrote:

> This doesn't change the default, right?

It shouldn't, no. I tested it on some formatted files but I'm unsure if there are other interactions I don't know about.
@xbolva00 I don't understand but I updated the version, please take a look Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68819/new/ https://reviews.llvm.org/D68819 From llvm-commits at lists.llvm.org Thu Oct 10 12:18:54 2019 From: llvm-commits at lists.llvm.org (Johannes Doerfert via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 19:18:54 +0000 (UTC) Subject: [PATCH] D68766: [NFC][ArgPromo][Tests] Run update_test_checks on all ArgumentPromotion tests In-Reply-To: References: Message-ID: <353becd7e711f510005c72075da75317@localhost.localdomain> jdoerfert updated this revision to Diff 224439. jdoerfert added a comment. Rerun update with newest version of D68819 Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68766/new/ https://reviews.llvm.org/D68766 Files: llvm/test/Transforms/ArgumentPromotion/2008-02-01-ReturnAttrs.ll llvm/test/Transforms/ArgumentPromotion/2008-07-02-array-indexing.ll llvm/test/Transforms/ArgumentPromotion/2008-09-07-CGUpdate.ll llvm/test/Transforms/ArgumentPromotion/2008-09-08-CGUpdateSelfEdge.ll llvm/test/Transforms/ArgumentPromotion/X86/attributes.ll llvm/test/Transforms/ArgumentPromotion/X86/min-legal-vector-width.ll llvm/test/Transforms/ArgumentPromotion/X86/thiscall.ll llvm/test/Transforms/ArgumentPromotion/aggregate-promote.ll llvm/test/Transforms/ArgumentPromotion/attrs.ll llvm/test/Transforms/ArgumentPromotion/basictest.ll llvm/test/Transforms/ArgumentPromotion/byval-2.ll llvm/test/Transforms/ArgumentPromotion/byval.ll llvm/test/Transforms/ArgumentPromotion/chained.ll llvm/test/Transforms/ArgumentPromotion/control-flow.ll llvm/test/Transforms/ArgumentPromotion/control-flow2.ll llvm/test/Transforms/ArgumentPromotion/crash.ll llvm/test/Transforms/ArgumentPromotion/dbg.ll llvm/test/Transforms/ArgumentPromotion/fp80.ll llvm/test/Transforms/ArgumentPromotion/inalloca.ll llvm/test/Transforms/ArgumentPromotion/invalidation.ll 
llvm/test/Transforms/ArgumentPromotion/musttail.ll llvm/test/Transforms/ArgumentPromotion/naked_functions.ll llvm/test/Transforms/ArgumentPromotion/nonzero-address-spaces.ll llvm/test/Transforms/ArgumentPromotion/pr27568.ll llvm/test/Transforms/ArgumentPromotion/pr3085.ll llvm/test/Transforms/ArgumentPromotion/pr32917.ll llvm/test/Transforms/ArgumentPromotion/pr33641_remove_arg_dbgvalue.ll llvm/test/Transforms/ArgumentPromotion/profile.ll llvm/test/Transforms/ArgumentPromotion/reserve-tbaa.ll llvm/test/Transforms/ArgumentPromotion/sret.ll llvm/test/Transforms/ArgumentPromotion/tail.ll llvm/test/Transforms/ArgumentPromotion/variadic.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D68766.224439.patch Type: text/x-patch Size: 167229 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Thu Oct 10 12:20:03 2019 From: llvm-commits at lists.llvm.org (Xiangling Liao via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 19:20:03 +0000 (UTC) Subject: [PATCH] D68341: [AIX] TOC pseudo expansion for 64bit large + 64bit small + 32bit large modes In-Reply-To: References: Message-ID: Xiangling_L updated this revision to Diff 224442. Xiangling_L added a comment. Rebase on latest master after the PPCAsmPrinter cleanup NFC patch landed. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68341/new/ https://reviews.llvm.org/D68341 Files: llvm/include/llvm/MC/MCExpr.h llvm/lib/MC/MCExpr.cpp llvm/lib/Target/PowerPC/MCTargetDesc/PPCInstPrinter.cpp llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp llvm/lib/Target/PowerPC/PPCInstrInfo.cpp llvm/test/CodeGen/PowerPC/lower-globaladdr32-aix-asm.ll llvm/test/CodeGen/PowerPC/lower-globaladdr64-aix-asm.ll -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68341.224442.patch Type: text/x-patch Size: 12448 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Thu Oct 10 12:20:04 2019 From: llvm-commits at lists.llvm.org (Marcello Maggioni via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 19:20:04 +0000 (UTC) Subject: [PATCH] D68739: [GISel] Allow ConstantFoldBinOp to consider G_FCONSTANT binary representation for combines In-Reply-To: References: Message-ID: kariddi updated this revision to Diff 224441. kariddi added a comment. Herald added subscribers: nhaehnle, jvesely. So, I changed the getConstantVRegValWithLookThrough function to take an extra parameter. Honestly, I couldn't see any reason why the new parameter should default to "false", so I set it to true, because that seems to be what the design of GlobalISel suggests, given that the float/int distinction is "not there". This allowed me to remove the getAnyConstantVRegVal() function and use the normal getConstantVRegVal() instead. A test in AMDGPU started producing different output, from what I understand because of the new folding, so I updated it. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68739/new/ https://reviews.llvm.org/D68739 Files: llvm/include/llvm/CodeGen/GlobalISel/Utils.h llvm/lib/CodeGen/GlobalISel/Utils.cpp llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-frint.mir llvm/unittests/CodeGen/GlobalISel/ConstantFoldingTest.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D68739.224441.patch Type: text/x-patch Size: 16823 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Thu Oct 10 12:21:08 2019 From: llvm-commits at lists.llvm.org (Stanislav Mekhanoshin via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 19:21:08 +0000 (UTC) Subject: [PATCH] D68338: [AMDGPU] Remove dubious logic in bidirectional list scheduler In-Reply-To: References: Message-ID: <5e11601a466003bbdabb005d38879348@localhost.localdomain> rampitec added a comment. 
Given the numbers I tend to agree with the change. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68338/new/ https://reviews.llvm.org/D68338 From llvm-commits at lists.llvm.org Thu Oct 10 12:24:57 2019 From: llvm-commits at lists.llvm.org (Joel E. Denny via llvm-commits) Date: Thu, 10 Oct 2019 19:24:57 -0000 Subject: [llvm] r374425 - Revert r374392: "[lit] Extend internal diff to support -U" Message-ID: <20191010192457.5626392798@lists.llvm.org> Author: jdenny Date: Thu Oct 10 12:24:57 2019 New Revision: 374425 URL: http://llvm.org/viewvc/llvm-project?rev=374425&view=rev Log: Revert r374392: "[lit] Extend internal diff to support -U" This breaks a Windows bot. Removed: llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-unified.txt Modified: llvm/trunk/utils/lit/lit/builtin_commands/diff.py llvm/trunk/utils/lit/tests/max-failures.py llvm/trunk/utils/lit/tests/shtest-shell.py Modified: llvm/trunk/utils/lit/lit/builtin_commands/diff.py URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/lit/builtin_commands/diff.py?rev=374425&r1=374424&r2=374425&view=diff ============================================================================== --- llvm/trunk/utils/lit/lit/builtin_commands/diff.py (original) +++ llvm/trunk/utils/lit/lit/builtin_commands/diff.py Thu Oct 10 12:24:57 2019 @@ -10,7 +10,6 @@ class DiffFlags(): self.ignore_all_space = False self.ignore_space_change = False self.unified_diff = False - self.num_context_lines = 3 self.recursive_diff = False self.strip_trailing_cr = False @@ -49,10 +48,7 @@ def compareTwoBinaryFiles(flags, filepat exitCode = 0 if hasattr(difflib, 'diff_bytes'): # python 3.5 or newer - diffs = difflib.diff_bytes(difflib.unified_diff, filelines[0], - filelines[1], filepaths[0].encode(), - filepaths[1].encode(), - n = flags.num_context_lines) + diffs = difflib.diff_bytes(difflib.unified_diff, filelines[0], filelines[1], filepaths[0].encode(), filepaths[1].encode()) diffs = 
[diff.decode(errors="backslashreplace") for diff in diffs] else: # python 2.7 @@ -60,8 +56,7 @@ def compareTwoBinaryFiles(flags, filepat func = difflib.unified_diff else: func = difflib.context_diff - diffs = func(filelines[0], filelines[1], filepaths[0], filepaths[1], - n = flags.num_context_lines) + diffs = func(filelines[0], filelines[1], filepaths[0], filepaths[1]) for diff in diffs: sys.stdout.write(diff) @@ -93,8 +88,7 @@ def compareTwoTextFiles(flags, filepaths filelines[idx]= [f(line) for line in lines] func = difflib.unified_diff if flags.unified_diff else difflib.context_diff - for diff in func(filelines[0], filelines[1], filepaths[0], filepaths[1], - n = flags.num_context_lines): + for diff in func(filelines[0], filelines[1], filepaths[0], filepaths[1]): sys.stdout.write(diff) exitCode = 1 return exitCode @@ -177,7 +171,7 @@ def compareDirTrees(flags, dir_trees, ba def main(argv): args = argv[1:] try: - opts, args = getopt.gnu_getopt(args, "wbuU:r", ["strip-trailing-cr"]) + opts, args = getopt.gnu_getopt(args, "wbur", ["strip-trailing-cr"]) except getopt.GetoptError as err: sys.stderr.write("Unsupported: 'diff': %s\n" % str(err)) sys.exit(1) @@ -191,16 +185,6 @@ def main(argv): flags.ignore_space_change = True elif o == "-u": flags.unified_diff = True - elif o.startswith("-U"): - flags.unified_diff = True - try: - flags.num_context_lines = int(a) - if flags.num_context_lines < 0: - raise ValueException - except: - sys.stderr.write("Error: invalid '-U' argument: {}\n" - .format(a)) - sys.exit(1) elif o == "-r": flags.recursive_diff = True elif o == "--strip-trailing-cr": Removed: llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-unified.txt URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-unified.txt?rev=374424&view=auto ============================================================================== --- llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-unified.txt (original) +++ 
llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-unified.txt (removed) @@ -1,38 +0,0 @@ -# RUN: echo 1 > %t.foo -# RUN: echo 2 >> %t.foo -# RUN: echo 3 >> %t.foo -# RUN: echo 4 >> %t.foo -# RUN: echo 5 >> %t.foo -# RUN: echo 6 foo >> %t.foo -# RUN: echo 7 >> %t.foo -# RUN: echo 8 >> %t.foo -# RUN: echo 9 >> %t.foo -# RUN: echo 10 >> %t.foo -# RUN: echo 11 >> %t.foo - -# RUN: echo 1 > %t.bar -# RUN: echo 2 >> %t.bar -# RUN: echo 3 >> %t.bar -# RUN: echo 4 >> %t.bar -# RUN: echo 5 >> %t.bar -# RUN: echo 6 bar >> %t.bar -# RUN: echo 7 >> %t.bar -# RUN: echo 8 >> %t.bar -# RUN: echo 9 >> %t.bar -# RUN: echo 10 >> %t.bar -# RUN: echo 11 >> %t.bar - -# Default is 3 lines of context. -# RUN: diff -u %t.foo %t.bar && false || true - -# Override default of 3 lines of context. -# RUN: diff -U 2 %t.foo %t.bar && false || true -# RUN: diff -U4 %t.foo %t.bar && false || true -# RUN: diff -U0 %t.foo %t.bar && false || true - -# Check bad -U argument. -# RUN: diff -U 30.1 %t.foo %t.foo && false || true -# RUN: diff -U-1 %t.foo %t.foo && false || true - -# Fail so lit will print output. -# RUN: false Modified: llvm/trunk/utils/lit/tests/max-failures.py URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/max-failures.py?rev=374425&r1=374424&r2=374425&view=diff ============================================================================== --- llvm/trunk/utils/lit/tests/max-failures.py (original) +++ llvm/trunk/utils/lit/tests/max-failures.py Thu Oct 10 12:24:57 2019 @@ -8,7 +8,7 @@ # # END. 
-# CHECK: Failing Tests (31) +# CHECK: Failing Tests (30) # CHECK: Failing Tests (1) # CHECK: Failing Tests (2) # CHECK: error: argument --max-failures: requires positive integer, but found '0' Modified: llvm/trunk/utils/lit/tests/shtest-shell.py URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/shtest-shell.py?rev=374425&r1=374424&r2=374425&view=diff ============================================================================== --- llvm/trunk/utils/lit/tests/shtest-shell.py (original) +++ llvm/trunk/utils/lit/tests/shtest-shell.py Thu Oct 10 12:24:57 2019 @@ -331,82 +331,6 @@ # CHECK: PASS: shtest-shell :: diff-r.txt - -# CHECK: FAIL: shtest-shell :: diff-unified.txt - -# CHECK: *** TEST 'shtest-shell :: diff-unified.txt' FAILED *** - -# CHECK: $ "diff" "-u" "{{[^"]*}}.foo" "{{[^"]*}}.bar" -# CHECK: # command output: -# CHECK: @@ {{.*}} @@ -# CHECK-NEXT: 3 -# CHECK-NEXT: 4 -# CHECK-NEXT: 5 -# CHECK-NEXT: -6 foo -# CHECK-NEXT: +6 bar -# CHECK-NEXT: 7 -# CHECK-NEXT: 8 -# CHECK-NEXT: 9 -# CHECK-EMPTY: -# CHECK-NEXT: error: command failed with exit status: 1 -# CHECK-NEXT: $ "true" - -# CHECK: $ "diff" "-U" "2" "{{[^"]*}}.foo" "{{[^"]*}}.bar" -# CHECK: # command output: -# CHECK: @@ {{.*}} @@ -# CHECK-NEXT: 4 -# CHECK-NEXT: 5 -# CHECK-NEXT: -6 foo -# CHECK-NEXT: +6 bar -# CHECK-NEXT: 7 -# CHECK-NEXT: 8 -# CHECK-EMPTY: -# CHECK-NEXT: error: command failed with exit status: 1 -# CHECK-NEXT: $ "true" - -# CHECK: $ "diff" "-U4" "{{[^"]*}}.foo" "{{[^"]*}}.bar" -# CHECK: # command output: -# CHECK: @@ {{.*}} @@ -# CHECK-NEXT: 2 -# CHECK-NEXT: 3 -# CHECK-NEXT: 4 -# CHECK-NEXT: 5 -# CHECK-NEXT: -6 foo -# CHECK-NEXT: +6 bar -# CHECK-NEXT: 7 -# CHECK-NEXT: 8 -# CHECK-NEXT: 9 -# CHECK-NEXT: 10 -# CHECK-EMPTY: -# CHECK-NEXT: error: command failed with exit status: 1 -# CHECK-NEXT: $ "true" - -# CHECK: $ "diff" "-U0" "{{[^"]*}}.foo" "{{[^"]*}}.bar" -# CHECK: # command output: -# CHECK: @@ {{.*}} @@ -# CHECK-NEXT: -6 foo -# CHECK-NEXT: +6 bar -# CHECK-EMPTY: -# 
CHECK-NEXT: error: command failed with exit status: 1 -# CHECK-NEXT: $ "true" - -# CHECK: $ "diff" "-U" "30.1" "{{[^"]*}}" "{{[^"]*}}" -# CHECK: # command stderr: -# CHECK: Error: invalid '-U' argument: 30.1 -# CHECK: error: command failed with exit status: 1 -# CHECK: $ "true" - -# CHECK: $ "diff" "-U-1" "{{[^"]*}}" "{{[^"]*}}" -# CHECK: # command stderr: -# CHECK: Error: invalid '-U' argument: -1 -# CHECK: error: command failed with exit status: 1 -# CHECK: $ "true" - -# CHECK: $ "false" - -# CHECK: *** - - # CHECK: FAIL: shtest-shell :: error-0.txt # CHECK: *** TEST 'shtest-shell :: error-0.txt' FAILED *** # CHECK: $ "not-a-real-command" @@ -486,4 +410,4 @@ # CHECK: PASS: shtest-shell :: sequencing-0.txt # CHECK: XFAIL: shtest-shell :: sequencing-1.txt # CHECK: PASS: shtest-shell :: valid-shell.txt -# CHECK: Failing Tests (31) +# CHECK: Failing Tests (30) From llvm-commits at lists.llvm.org Thu Oct 10 12:25:11 2019 From: llvm-commits at lists.llvm.org (Joel E. Denny via llvm-commits) Date: Thu, 10 Oct 2019 19:25:11 -0000 Subject: [llvm] r374426 - Revert r374390: "[lit] Extend internal diff to support `-` argument" Message-ID: <20191010192511.9AECC9279A@lists.llvm.org> Author: jdenny Date: Thu Oct 10 12:25:11 2019 New Revision: 374426 URL: http://llvm.org/viewvc/llvm-project?rev=374426&view=rev Log: Revert r374390: "[lit] Extend internal diff to support `-` argument" This breaks a Windows bot. 
Removed: llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-r-error-7.txt llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-r-error-8.txt Modified: llvm/trunk/utils/lit/lit/builtin_commands/diff.py llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-encodings.txt llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-pipes.txt llvm/trunk/utils/lit/tests/max-failures.py llvm/trunk/utils/lit/tests/shtest-shell.py Modified: llvm/trunk/utils/lit/lit/builtin_commands/diff.py URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/lit/builtin_commands/diff.py?rev=374426&r1=374425&r2=374426&view=diff ============================================================================== --- llvm/trunk/utils/lit/lit/builtin_commands/diff.py (original) +++ llvm/trunk/utils/lit/lit/builtin_commands/diff.py Thu Oct 10 12:25:11 2019 @@ -27,13 +27,8 @@ def getDirTree(path, basedir=""): def compareTwoFiles(flags, filepaths): filelines = [] for file in filepaths: - if file == "-": - stdin_fileno = sys.stdin.fileno() - with os.fdopen(os.dup(stdin_fileno), 'rb') as stdin_bin: - filelines.append(stdin_bin.readlines()) - else: - with open(file, 'rb') as file_bin: - filelines.append(file_bin.readlines()) + with open(file, 'rb') as file_bin: + filelines.append(file_bin.readlines()) try: return compareTwoTextFiles(flags, filepaths, filelines, @@ -199,13 +194,10 @@ def main(argv): exitCode = 0 try: for file in args: - if file != "-" and not os.path.isabs(file): + if not os.path.isabs(file): file = os.path.realpath(os.path.join(os.getcwd(), file)) if flags.recursive_diff: - if file == "-": - sys.stderr.write("Error: cannot recursively compare '-'\n") - sys.exit(1) dir_trees.append(getDirTree(file)) else: filepaths.append(file) Modified: llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-encodings.txt URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-encodings.txt?rev=374426&r1=374425&r2=374426&view=diff 
============================================================================== --- llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-encodings.txt (original) +++ llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-encodings.txt Thu Oct 10 12:25:11 2019 @@ -5,11 +5,5 @@ # RUN: diff -u diff-in.utf8 diff-in.bin && false || true # RUN: diff -u diff-in.bin diff-in.utf8 && false || true -# RUN: cat diff-in.bin | diff -u - diff-in.bin -# RUN: cat diff-in.bin | diff -u diff-in.bin - -# RUN: cat diff-in.bin | diff -u diff-in.utf16 - && false || true -# RUN: cat diff-in.bin | diff -u diff-in.utf8 - && false || true -# RUN: cat diff-in.bin | diff -u - diff-in.utf8 && false || true - # Fail so lit will print output. # RUN: false Modified: llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-pipes.txt URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-pipes.txt?rev=374426&r1=374425&r2=374426&view=diff ============================================================================== --- llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-pipes.txt (original) +++ llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-pipes.txt Thu Oct 10 12:25:11 2019 @@ -5,16 +5,6 @@ # RUN: diff %t.foo %t.foo | FileCheck -allow-empty -check-prefix=EMPTY %s # RUN: diff -u %t.foo %t.bar | FileCheck %s && false || true -# Check input pipe. -# RUN: echo foo | diff -u - %t.foo -# RUN: echo foo | diff -u %t.foo - -# RUN: echo bar | diff -u %t.foo - && false || true -# RUN: echo bar | diff -u - %t.foo && false || true - -# Check output and input pipes at the same time. -# RUN: echo foo | diff - %t.foo | FileCheck -allow-empty -check-prefix=EMPTY %s -# RUN: echo bar | diff -u %t.foo - | FileCheck %s && false || true - # Fail so lit will print output. 
# RUN: false Removed: llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-r-error-7.txt URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-r-error-7.txt?rev=374425&view=auto ============================================================================== --- llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-r-error-7.txt (original) +++ llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-r-error-7.txt (removed) @@ -1,2 +0,0 @@ -# diff -r currently cannot handle stdin. -# RUN: diff -r - %t Removed: llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-r-error-8.txt URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-r-error-8.txt?rev=374425&view=auto ============================================================================== --- llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-r-error-8.txt (original) +++ llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-r-error-8.txt (removed) @@ -1,2 +0,0 @@ -# diff -r currently cannot handle stdin. -# RUN: diff -r %t - Modified: llvm/trunk/utils/lit/tests/max-failures.py URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/max-failures.py?rev=374426&r1=374425&r2=374426&view=diff ============================================================================== --- llvm/trunk/utils/lit/tests/max-failures.py (original) +++ llvm/trunk/utils/lit/tests/max-failures.py Thu Oct 10 12:25:11 2019 @@ -8,7 +8,7 @@ # # END. 
-# CHECK: Failing Tests (30) +# CHECK: Failing Tests (28) # CHECK: Failing Tests (1) # CHECK: Failing Tests (2) # CHECK: error: argument --max-failures: requires positive integer, but found '0' Modified: llvm/trunk/utils/lit/tests/shtest-shell.py URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/shtest-shell.py?rev=374426&r1=374425&r2=374426&view=diff ============================================================================== --- llvm/trunk/utils/lit/tests/shtest-shell.py (original) +++ llvm/trunk/utils/lit/tests/shtest-shell.py Thu Oct 10 12:25:11 2019 @@ -81,60 +81,6 @@ # CHECK: error: command failed with exit status: 1 # CHECK: $ "true" -# CHECK: $ "cat" "diff-in.bin" -# CHECK-NOT: error -# CHECK: $ "diff" "-u" "-" "diff-in.bin" -# CHECK-NOT: error - -# CHECK: $ "cat" "diff-in.bin" -# CHECK-NOT: error -# CHECK: $ "diff" "-u" "diff-in.bin" "-" -# CHECK-NOT: error - -# CHECK: $ "cat" "diff-in.bin" -# CHECK-NOT: error -# CHECK: $ "diff" "-u" "diff-in.utf16" "-" -# CHECK: # command output: -# CHECK-NEXT: --- -# CHECK-NEXT: +++ -# CHECK-NEXT: @@ -# CHECK-NEXT: {{^ .f.o.o.$}} -# CHECK-NEXT: {{^-.b.a.r.$}} -# CHECK-NEXT: {{^\+.b.a.r..}} -# CHECK-NEXT: {{^ .b.a.z.$}} -# CHECK: error: command failed with exit status: 1 -# CHECK: $ "true" - -# CHECK: $ "cat" "diff-in.bin" -# CHECK-NOT: error -# CHECK: $ "diff" "-u" "diff-in.utf8" "-" -# CHECK: # command output: -# CHECK-NEXT: --- -# CHECK-NEXT: +++ -# CHECK-NEXT: @@ -# CHECK-NEXT: -foo -# CHECK-NEXT: -bar -# CHECK-NEXT: -baz -# CHECK-NEXT: {{^\+.f.o.o.$}} -# CHECK-NEXT: {{^\+.b.a.r..}} -# CHECK-NEXT: {{^\+.b.a.z.$}} -# CHECK: error: command failed with exit status: 1 -# CHECK: $ "true" - -# CHECK: $ "diff" "-u" "-" "diff-in.utf8" -# CHECK: # command output: -# CHECK-NEXT: --- -# CHECK-NEXT: +++ -# CHECK-NEXT: @@ -# CHECK-NEXT: {{^\-.f.o.o.$}} -# CHECK-NEXT: {{^\-.b.a.r..}} -# CHECK-NEXT: {{^\-.b.a.z.$}} -# CHECK-NEXT: +foo -# CHECK-NEXT: +bar -# CHECK-NEXT: +baz -# CHECK: error: command failed with 
exit status: 1 -# CHECK: $ "true" - # CHECK: $ "false" # CHECK: *** @@ -212,51 +158,6 @@ # CHECK-NOT: error # CHECK: $ "true" -# CHECK: $ "echo" "foo" -# CHECK: $ "diff" "-u" "-" "{{[^"]*}}.foo" -# CHECK-NOT: note -# CHECK-NOT: error - -# CHECK: $ "echo" "foo" -# CHECK: $ "diff" "-u" "{{[^"]*}}.foo" "-" -# CHECK-NOT: note -# CHECK-NOT: error - -# CHECK: $ "echo" "bar" -# CHECK: $ "diff" "-u" "{{[^"]*}}.foo" "-" -# CHECK: # command output: -# CHECK: @@ -# CHECK-NEXT: -foo -# CHECK-NEXT: +bar -# CHECK: error: command failed with exit status: 1 -# CHECK: $ "true" - -# CHECK: $ "echo" "bar" -# CHECK: $ "diff" "-u" "-" "{{[^"]*}}.foo" -# CHECK: # command output: -# CHECK: @@ -# CHECK-NEXT: -bar -# CHECK-NEXT: +foo -# CHECK: error: command failed with exit status: 1 -# CHECK: $ "true" - -# CHECK: $ "echo" "foo" -# CHECK: $ "diff" "-" "{{[^"]*}}.foo" -# CHECK-NOT: note -# CHECK-NOT: error -# CHECK: $ "FileCheck" -# CHECK-NOT: note -# CHECK-NOT: error - -# CHECK: $ "echo" "bar" -# CHECK: $ "diff" "-u" "{{[^"]*}}.foo" "-" -# CHECK: note: command had no output on stdout or stderr -# CHECK: error: command failed with exit status: 1 -# CHECK: $ "FileCheck" -# CHECK-NOT: note -# CHECK-NOT: error -# CHECK: $ "true" - # CHECK: $ "false" # CHECK: *** @@ -315,20 +216,6 @@ # CHECK: File {{.*}}dir1{{.*}}extra_file is a regular empty file while file {{.*}}dir2{{.*}}extra_file is a directory # CHECK: error: command failed with exit status: 1 -# CHECK: FAIL: shtest-shell :: diff-r-error-7.txt -# CHECK: *** TEST 'shtest-shell :: diff-r-error-7.txt' FAILED *** -# CHECK: $ "diff" "-r" "-" "{{[^"]*}}" -# CHECK: # command stderr: -# CHECK: Error: cannot recursively compare '-' -# CHECK: error: command failed with exit status: 1 - -# CHECK: FAIL: shtest-shell :: diff-r-error-8.txt -# CHECK: *** TEST 'shtest-shell :: diff-r-error-8.txt' FAILED *** -# CHECK: $ "diff" "-r" "{{[^"]*}}" "-" -# CHECK: # command stderr: -# CHECK: Error: cannot recursively compare '-' -# CHECK: error: command failed 
with exit status: 1 - # CHECK: PASS: shtest-shell :: diff-r.txt # CHECK: FAIL: shtest-shell :: error-0.txt @@ -410,4 +297,4 @@ # CHECK: PASS: shtest-shell :: sequencing-0.txt # CHECK: XFAIL: shtest-shell :: sequencing-1.txt # CHECK: PASS: shtest-shell :: valid-shell.txt -# CHECK: Failing Tests (30) +# CHECK: Failing Tests (28) From llvm-commits at lists.llvm.org Thu Oct 10 12:25:24 2019 From: llvm-commits at lists.llvm.org (Joel E. Denny via llvm-commits) Date: Thu, 10 Oct 2019 19:25:24 -0000 Subject: [llvm] r374427 - Revert r374389: "[lit] Clean up internal diff's encoding handling" Message-ID: <20191010192524.DD1DB927A2@lists.llvm.org> Author: jdenny Date: Thu Oct 10 12:25:24 2019 New Revision: 374427 URL: http://llvm.org/viewvc/llvm-project?rev=374427&view=rev Log: Revert r374389: "[lit] Clean up internal diff's encoding handling" This breaks a Windows bot. Removed: llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-encodings.txt llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-in.bin llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-in.utf16 llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-in.utf8 Modified: llvm/trunk/utils/lit/lit/builtin_commands/diff.py llvm/trunk/utils/lit/tests/max-failures.py llvm/trunk/utils/lit/tests/shtest-shell.py Modified: llvm/trunk/utils/lit/lit/builtin_commands/diff.py URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/lit/builtin_commands/diff.py?rev=374427&r1=374426&r2=374427&view=diff ============================================================================== --- llvm/trunk/utils/lit/lit/builtin_commands/diff.py (original) +++ llvm/trunk/utils/lit/lit/builtin_commands/diff.py Thu Oct 10 12:25:24 2019 @@ -1,7 +1,6 @@ import difflib import functools import getopt -import locale import os import sys @@ -25,26 +24,37 @@ def getDirTree(path, basedir=""): return path, sorted(child_trees) def compareTwoFiles(flags, filepaths): + compare_bytes = False + encoding = None filelines = [] for file in filepaths: - with 
open(file, 'rb') as file_bin: - filelines.append(file_bin.readlines()) - - try: - return compareTwoTextFiles(flags, filepaths, filelines, - locale.getpreferredencoding(False)) - except UnicodeDecodeError: try: - return compareTwoTextFiles(flags, filepaths, filelines, "utf-8") - except: - return compareTwoBinaryFiles(flags, filepaths, filelines) + with open(file, 'r') as f: + filelines.append(f.readlines()) + except UnicodeDecodeError: + try: + with io.open(file, 'r', encoding="utf-8") as f: + filelines.append(f.readlines()) + encoding = "utf-8" + except: + compare_bytes = True + + if compare_bytes: + return compareTwoBinaryFiles(flags, filepaths) + else: + return compareTwoTextFiles(flags, filepaths, encoding) + +def compareTwoBinaryFiles(flags, filepaths): + filelines = [] + for file in filepaths: + with open(file, 'rb') as f: + filelines.append(f.readlines()) -def compareTwoBinaryFiles(flags, filepaths, filelines): exitCode = 0 if hasattr(difflib, 'diff_bytes'): # python 3.5 or newer diffs = difflib.diff_bytes(difflib.unified_diff, filelines[0], filelines[1], filepaths[0].encode(), filepaths[1].encode()) - diffs = [diff.decode(errors="backslashreplace") for diff in diffs] + diffs = [diff.decode() for diff in diffs] else: # python 2.7 if flags.unified_diff: @@ -58,14 +68,15 @@ def compareTwoBinaryFiles(flags, filepat exitCode = 1 return exitCode -def compareTwoTextFiles(flags, filepaths, filelines_bin, encoding): +def compareTwoTextFiles(flags, filepaths, encoding): filelines = [] - for lines_bin in filelines_bin: - lines = [] - for line_bin in lines_bin: - line = line_bin.decode(encoding=encoding) - lines.append(line) - filelines.append(lines) + for file in filepaths: + if encoding is None: + with open(file, 'r') as f: + filelines.append(f.readlines()) + else: + with io.open(file, 'r', encoding=encoding) as f: + filelines.append(f.readlines()) exitCode = 0 def compose2(f, g): Removed: llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-encodings.txt URL: 
http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-encodings.txt?rev=374426&view=auto ============================================================================== --- llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-encodings.txt (original) +++ llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-encodings.txt (removed) @@ -1,9 +0,0 @@ -# Check that diff falls back to binary mode if it cannot decode a file. - -# RUN: diff -u diff-in.bin diff-in.bin -# RUN: diff -u diff-in.utf16 diff-in.bin && false || true -# RUN: diff -u diff-in.utf8 diff-in.bin && false || true -# RUN: diff -u diff-in.bin diff-in.utf8 && false || true - -# Fail so lit will print output. -# RUN: false Removed: llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-in.bin URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-in.bin?rev=374426&view=auto ============================================================================== Binary file - no diff available. Removed: llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-in.utf16 URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-in.utf16?rev=374426&view=auto ============================================================================== Binary file - no diff available. 
Removed: llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-in.utf8 URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-in.utf8?rev=374426&view=auto ============================================================================== --- llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-in.utf8 (original) +++ llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-in.utf8 (removed) @@ -1,3 +0,0 @@ -foo -bar -baz Modified: llvm/trunk/utils/lit/tests/max-failures.py URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/max-failures.py?rev=374427&r1=374426&r2=374427&view=diff ============================================================================== --- llvm/trunk/utils/lit/tests/max-failures.py (original) +++ llvm/trunk/utils/lit/tests/max-failures.py Thu Oct 10 12:25:24 2019 @@ -8,7 +8,7 @@ # # END. -# CHECK: Failing Tests (28) +# CHECK: Failing Tests (27) # CHECK: Failing Tests (1) # CHECK: Failing Tests (2) # CHECK: error: argument --max-failures: requires positive integer, but found '0' Modified: llvm/trunk/utils/lit/tests/shtest-shell.py URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/shtest-shell.py?rev=374427&r1=374426&r2=374427&view=diff ============================================================================== --- llvm/trunk/utils/lit/tests/shtest-shell.py (original) +++ llvm/trunk/utils/lit/tests/shtest-shell.py Thu Oct 10 12:25:24 2019 @@ -34,58 +34,6 @@ # CHECK: error: command failed with exit status: 127 # CHECK: *** - -# CHECK: FAIL: shtest-shell :: diff-encodings.txt -# CHECK: *** TEST 'shtest-shell :: diff-encodings.txt' FAILED *** - -# CHECK: $ "diff" "-u" "diff-in.bin" "diff-in.bin" -# CHECK-NOT: error - -# CHECK: $ "diff" "-u" "diff-in.utf16" "diff-in.bin" -# CHECK: # command output: -# CHECK-NEXT: --- -# CHECK-NEXT: +++ -# CHECK-NEXT: @@ -# CHECK-NEXT: {{^ .f.o.o.$}} -# CHECK-NEXT: {{^-.b.a.r.$}} -# CHECK-NEXT: {{^\+.b.a.r..}} -# CHECK-NEXT: {{^ .b.a.z.$}} -# CHECK: error: 
command failed with exit status: 1 -# CHECK: $ "true" - -# CHECK: $ "diff" "-u" "diff-in.utf8" "diff-in.bin" -# CHECK: # command output: -# CHECK-NEXT: --- -# CHECK-NEXT: +++ -# CHECK-NEXT: @@ -# CHECK-NEXT: -foo -# CHECK-NEXT: -bar -# CHECK-NEXT: -baz -# CHECK-NEXT: {{^\+.f.o.o.$}} -# CHECK-NEXT: {{^\+.b.a.r..}} -# CHECK-NEXT: {{^\+.b.a.z.$}} -# CHECK: error: command failed with exit status: 1 -# CHECK: $ "true" - -# CHECK: $ "diff" "-u" "diff-in.bin" "diff-in.utf8" -# CHECK: # command output: -# CHECK-NEXT: --- -# CHECK-NEXT: +++ -# CHECK-NEXT: @@ -# CHECK-NEXT: {{^\-.f.o.o.$}} -# CHECK-NEXT: {{^\-.b.a.r..}} -# CHECK-NEXT: {{^\-.b.a.z.$}} -# CHECK-NEXT: +foo -# CHECK-NEXT: +bar -# CHECK-NEXT: +baz -# CHECK: error: command failed with exit status: 1 -# CHECK: $ "true" - -# CHECK: $ "false" - -# CHECK: *** - - # CHECK: FAIL: shtest-shell :: diff-error-1.txt # CHECK: *** TEST 'shtest-shell :: diff-error-1.txt' FAILED *** # CHECK: $ "diff" "-B" "temp1.txt" "temp2.txt" @@ -297,4 +245,4 @@ # CHECK: PASS: shtest-shell :: sequencing-0.txt # CHECK: XFAIL: shtest-shell :: sequencing-1.txt # CHECK: PASS: shtest-shell :: valid-shell.txt -# CHECK: Failing Tests (28) +# CHECK: Failing Tests (27) From llvm-commits at lists.llvm.org Thu Oct 10 12:25:30 2019 From: llvm-commits at lists.llvm.org (Jinsong Ji via llvm-commits) Date: Thu, 10 Oct 2019 19:25:30 -0000 Subject: [llvm] r374428 - [PowerPC][docs] Update IBM official docs in Compiler Writers Info page Message-ID: <20191010192530.52CD7927B4@lists.llvm.org> Author: jsji Date: Thu Oct 10 12:25:30 2019 New Revision: 374428 URL: http://llvm.org/viewvc/llvm-project?rev=374428&view=rev Log: [PowerPC][docs] Update IBM official docs in Compiler Writers Info page Summary: Just realized that most of the links in this page are deprecated. So update some important reference here: * adding PowerISA 3.0B/2.7B * adding P8/P9 User Manual * ELFv2 ABI and errata Move deprecated ones into "Other documents..". 
Reviewers: #powerpc, hfinkel, nemanjai Reviewed By: hfinkel Subscribers: shchenz, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68817 Modified: llvm/trunk/docs/CompilerWriterInfo.rst Modified: llvm/trunk/docs/CompilerWriterInfo.rst URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/docs/CompilerWriterInfo.rst?rev=374428&r1=374427&r2=374428&view=diff ============================================================================== --- llvm/trunk/docs/CompilerWriterInfo.rst (original) +++ llvm/trunk/docs/CompilerWriterInfo.rst Thu Oct 10 12:25:30 2019 @@ -58,21 +58,27 @@ PowerPC IBM - Official manuals and docs ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ -* `Power Instruction Set Architecture, Versions 2.03 through 2.06 (authentication required, free sign-up) `_ +* `Power Instruction Set Architecture, Version 3.0B `_ -* `PowerPC Compiler Writer's Guide `_ +* `POWER9 Processor User's Manual `_ -* `Intro to PowerPC Architecture `_ +* `Power Instruction Set Architecture, Version 2.07B `_ -* `PowerPC Processor Manuals (embedded) `_ +* `POWER8 Processor User's Manual `_ -* `Various IBM specifications and white papers `_ +* `Power Instruction Set Architecture, Versions 2.03 through 2.06 (Internet Archive) `_ + +* `IBM AIX 7.2 POWER Assembly Reference `_ * `IBM AIX/5L for POWER Assembly Reference `_ Other documents, collections, notes ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +* `PowerPC Compiler Writer's Guide `_ +* `Intro to PowerPC Architecture `_ +* `PowerPC Processor Manuals (embedded) `_ +* `Various IBM specifications and white papers `_ * `PowerPC ABI documents `_ * `PowerPC64 alignment of long doubles (from GCC) `_ * `Long branch stubs for powerpc64-linux (from binutils) `_ @@ -133,6 +139,9 @@ Linux ----- * `Linux extensions to gabi `_ +* `64-Bit ELF V2 ABI Specification: Power Architecture `_ + +* `OpenPOWER ELFv2 Errata: ELFv2 ABI Version 1.4 `_ * `PowerPC 64-bit ELF ABI Supplement `_ * `Procedure Call Standard for the AArch64 Architecture `_ * `Procedure 
Call Standard for the ARM Architecture `_ From llvm-commits at lists.llvm.org Thu Oct 10 12:25:39 2019 From: llvm-commits at lists.llvm.org (Joel E. Denny via llvm-commits) Date: Thu, 10 Oct 2019 19:25:39 -0000 Subject: [llvm] r374429 - Revert r374388: "[lit] Make internal diff work in pipelines" Message-ID: <20191010192539.DEA9B927C0@lists.llvm.org> Author: jdenny Date: Thu Oct 10 12:25:39 2019 New Revision: 374429 URL: http://llvm.org/viewvc/llvm-project?rev=374429&view=rev Log: Revert r374388: "[lit] Make internal diff work in pipelines" This breaks a Windows bot. Added: llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-error-0.txt Removed: llvm/trunk/utils/lit/lit/builtin_commands/diff.py llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-pipes.txt Modified: llvm/trunk/utils/lit/lit/TestRunner.py llvm/trunk/utils/lit/tests/shtest-shell.py Modified: llvm/trunk/utils/lit/lit/TestRunner.py URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/lit/TestRunner.py?rev=374429&r1=374428&r2=374429&view=diff ============================================================================== --- llvm/trunk/utils/lit/lit/TestRunner.py (original) +++ llvm/trunk/utils/lit/lit/TestRunner.py Thu Oct 10 12:25:39 2019 @@ -1,5 +1,7 @@ from __future__ import absolute_import +import difflib import errno +import functools import io import itertools import getopt @@ -359,6 +361,218 @@ def executeBuiltinMkdir(cmd, cmd_shenv): exitCode = 1 return ShellCommandResult(cmd, "", stderr.getvalue(), exitCode, False) +def executeBuiltinDiff(cmd, cmd_shenv): + """executeBuiltinDiff - Compare files line by line.""" + args = expand_glob_expressions(cmd.args, cmd_shenv.cwd)[1:] + try: + opts, args = getopt.gnu_getopt(args, "wbur", ["strip-trailing-cr"]) + except getopt.GetoptError as err: + raise InternalShellError(cmd, "Unsupported: 'diff': %s" % str(err)) + + filelines, filepaths, dir_trees = ([] for i in range(3)) + ignore_all_space = False + ignore_space_change = False + unified_diff = 
False + recursive_diff = False + strip_trailing_cr = False + for o, a in opts: + if o == "-w": + ignore_all_space = True + elif o == "-b": + ignore_space_change = True + elif o == "-u": + unified_diff = True + elif o == "-r": + recursive_diff = True + elif o == "--strip-trailing-cr": + strip_trailing_cr = True + else: + assert False, "unhandled option" + + if len(args) != 2: + raise InternalShellError(cmd, "Error: missing or extra operand") + + def getDirTree(path, basedir=""): + # Tree is a tuple of form (dirname, child_trees). + # An empty dir has child_trees = [], a file has child_trees = None. + child_trees = [] + for dirname, child_dirs, files in os.walk(os.path.join(basedir, path)): + for child_dir in child_dirs: + child_trees.append(getDirTree(child_dir, dirname)) + for filename in files: + child_trees.append((filename, None)) + return path, sorted(child_trees) + + def compareTwoFiles(filepaths): + compare_bytes = False + encoding = None + filelines = [] + for file in filepaths: + try: + with open(file, 'r') as f: + filelines.append(f.readlines()) + except UnicodeDecodeError: + try: + with io.open(file, 'r', encoding="utf-8") as f: + filelines.append(f.readlines()) + encoding = "utf-8" + except: + compare_bytes = True + + if compare_bytes: + return compareTwoBinaryFiles(filepaths) + else: + return compareTwoTextFiles(filepaths, encoding) + + def compareTwoBinaryFiles(filepaths): + filelines = [] + for file in filepaths: + with open(file, 'rb') as f: + filelines.append(f.readlines()) + + exitCode = 0 + if hasattr(difflib, 'diff_bytes'): + # python 3.5 or newer + diffs = difflib.diff_bytes(difflib.unified_diff, filelines[0], filelines[1], filepaths[0].encode(), filepaths[1].encode()) + diffs = [diff.decode() for diff in diffs] + else: + # python 2.7 + func = difflib.unified_diff if unified_diff else difflib.context_diff + diffs = func(filelines[0], filelines[1], filepaths[0], filepaths[1]) + + for diff in diffs: + stdout.write(diff) + exitCode = 1 + return 
exitCode + + def compareTwoTextFiles(filepaths, encoding): + filelines = [] + for file in filepaths: + if encoding is None: + with open(file, 'r') as f: + filelines.append(f.readlines()) + else: + with io.open(file, 'r', encoding=encoding) as f: + filelines.append(f.readlines()) + + exitCode = 0 + def compose2(f, g): + return lambda x: f(g(x)) + + f = lambda x: x + if strip_trailing_cr: + f = compose2(lambda line: line.rstrip('\r'), f) + if ignore_all_space or ignore_space_change: + ignoreSpace = lambda line, separator: separator.join(line.split()) + ignoreAllSpaceOrSpaceChange = functools.partial(ignoreSpace, separator='' if ignore_all_space else ' ') + f = compose2(ignoreAllSpaceOrSpaceChange, f) + + for idx, lines in enumerate(filelines): + filelines[idx]= [f(line) for line in lines] + + func = difflib.unified_diff if unified_diff else difflib.context_diff + for diff in func(filelines[0], filelines[1], filepaths[0], filepaths[1]): + stdout.write(diff) + exitCode = 1 + return exitCode + + def printDirVsFile(dir_path, file_path): + if os.path.getsize(file_path): + msg = "File %s is a directory while file %s is a regular file" + else: + msg = "File %s is a directory while file %s is a regular empty file" + stdout.write(msg % (dir_path, file_path) + "\n") + + def printFileVsDir(file_path, dir_path): + if os.path.getsize(file_path): + msg = "File %s is a regular file while file %s is a directory" + else: + msg = "File %s is a regular empty file while file %s is a directory" + stdout.write(msg % (file_path, dir_path) + "\n") + + def printOnlyIn(basedir, path, name): + stdout.write("Only in %s: %s\n" % (os.path.join(basedir, path), name)) + + def compareDirTrees(dir_trees, base_paths=["", ""]): + # Dirnames of the trees are not checked, it's caller's responsibility, + # as top-level dirnames are always different. 
Base paths are important + # for doing os.walk, but we don't put it into tree's dirname in order + # to speed up string comparison below and while sorting in getDirTree. + left_tree, right_tree = dir_trees[0], dir_trees[1] + left_base, right_base = base_paths[0], base_paths[1] + + # Compare two files or report file vs. directory mismatch. + if left_tree[1] is None and right_tree[1] is None: + return compareTwoFiles([os.path.join(left_base, left_tree[0]), + os.path.join(right_base, right_tree[0])]) + + if left_tree[1] is None and right_tree[1] is not None: + printFileVsDir(os.path.join(left_base, left_tree[0]), + os.path.join(right_base, right_tree[0])) + return 1 + + if left_tree[1] is not None and right_tree[1] is None: + printDirVsFile(os.path.join(left_base, left_tree[0]), + os.path.join(right_base, right_tree[0])) + return 1 + + # Compare two directories via recursive use of compareDirTrees. + exitCode = 0 + left_names = [node[0] for node in left_tree[1]] + right_names = [node[0] for node in right_tree[1]] + l, r = 0, 0 + while l < len(left_names) and r < len(right_names): + # Names are sorted in getDirTree, rely on that order. + if left_names[l] < right_names[r]: + exitCode = 1 + printOnlyIn(left_base, left_tree[0], left_names[l]) + l += 1 + elif left_names[l] > right_names[r]: + exitCode = 1 + printOnlyIn(right_base, right_tree[0], right_names[r]) + r += 1 + else: + exitCode |= compareDirTrees([left_tree[1][l], right_tree[1][r]], + [os.path.join(left_base, left_tree[0]), + os.path.join(right_base, right_tree[0])]) + l += 1 + r += 1 + + # At least one of the trees has ended. Report names from the other tree. 
+ while l < len(left_names): + exitCode = 1 + printOnlyIn(left_base, left_tree[0], left_names[l]) + l += 1 + while r < len(right_names): + exitCode = 1 + printOnlyIn(right_base, right_tree[0], right_names[r]) + r += 1 + return exitCode + + stderr = StringIO() + stdout = StringIO() + exitCode = 0 + try: + for file in args: + if not os.path.isabs(file): + file = os.path.realpath(os.path.join(cmd_shenv.cwd, file)) + + if recursive_diff: + dir_trees.append(getDirTree(file)) + else: + filepaths.append(file) + + if not recursive_diff: + exitCode = compareTwoFiles(filepaths) + else: + exitCode = compareDirTrees(dir_trees) + + except IOError as err: + stderr.write("Error: 'diff' command failed, %s\n" % str(err)) + exitCode = 1 + + return ShellCommandResult(cmd, stdout.getvalue(), stderr.getvalue(), exitCode, False) + def executeBuiltinRm(cmd, cmd_shenv): """executeBuiltinRm - Removes (deletes) files or directories.""" args = expand_glob_expressions(cmd.args, cmd_shenv.cwd)[1:] @@ -624,6 +838,14 @@ def _executeShCmd(cmd, shenv, results, t results.append(cmdResult) return cmdResult.exitCode + if cmd.commands[0].args[0] == 'diff': + if len(cmd.commands) != 1: + raise InternalShellError(cmd.commands[0], "Unsupported: 'diff' " + "cannot be part of a pipeline") + cmdResult = executeBuiltinDiff(cmd.commands[0], shenv) + results.append(cmdResult) + return cmdResult.exitCode + if cmd.commands[0].args[0] == 'rm': if len(cmd.commands) != 1: raise InternalShellError(cmd.commands[0], "Unsupported: 'rm' " @@ -644,7 +866,7 @@ def _executeShCmd(cmd, shenv, results, t stderrTempFiles = [] opened_files = [] named_temp_files = [] - builtin_commands = set(['cat', 'diff']) + builtin_commands = set(['cat']) builtin_commands_dir = os.path.join(os.path.dirname(os.path.abspath(__file__)), "builtin_commands") # To avoid deadlock, we use a single stderr stream for piped # output. 
This is null until we have seen some output using Removed: llvm/trunk/utils/lit/lit/builtin_commands/diff.py URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/lit/builtin_commands/diff.py?rev=374428&view=auto ============================================================================== --- llvm/trunk/utils/lit/lit/builtin_commands/diff.py (original) +++ llvm/trunk/utils/lit/lit/builtin_commands/diff.py (removed) @@ -1,228 +0,0 @@ -import difflib -import functools -import getopt -import os -import sys - -class DiffFlags(): - def __init__(self): - self.ignore_all_space = False - self.ignore_space_change = False - self.unified_diff = False - self.recursive_diff = False - self.strip_trailing_cr = False - -def getDirTree(path, basedir=""): - # Tree is a tuple of form (dirname, child_trees). - # An empty dir has child_trees = [], a file has child_trees = None. - child_trees = [] - for dirname, child_dirs, files in os.walk(os.path.join(basedir, path)): - for child_dir in child_dirs: - child_trees.append(getDirTree(child_dir, dirname)) - for filename in files: - child_trees.append((filename, None)) - return path, sorted(child_trees) - -def compareTwoFiles(flags, filepaths): - compare_bytes = False - encoding = None - filelines = [] - for file in filepaths: - try: - with open(file, 'r') as f: - filelines.append(f.readlines()) - except UnicodeDecodeError: - try: - with io.open(file, 'r', encoding="utf-8") as f: - filelines.append(f.readlines()) - encoding = "utf-8" - except: - compare_bytes = True - - if compare_bytes: - return compareTwoBinaryFiles(flags, filepaths) - else: - return compareTwoTextFiles(flags, filepaths, encoding) - -def compareTwoBinaryFiles(flags, filepaths): - filelines = [] - for file in filepaths: - with open(file, 'rb') as f: - filelines.append(f.readlines()) - - exitCode = 0 - if hasattr(difflib, 'diff_bytes'): - # python 3.5 or newer - diffs = difflib.diff_bytes(difflib.unified_diff, filelines[0], filelines[1], filepaths[0].encode(), 
filepaths[1].encode()) - diffs = [diff.decode() for diff in diffs] - else: - # python 2.7 - if flags.unified_diff: - func = difflib.unified_diff - else: - func = difflib.context_diff - diffs = func(filelines[0], filelines[1], filepaths[0], filepaths[1]) - - for diff in diffs: - sys.stdout.write(diff) - exitCode = 1 - return exitCode - -def compareTwoTextFiles(flags, filepaths, encoding): - filelines = [] - for file in filepaths: - if encoding is None: - with open(file, 'r') as f: - filelines.append(f.readlines()) - else: - with io.open(file, 'r', encoding=encoding) as f: - filelines.append(f.readlines()) - - exitCode = 0 - def compose2(f, g): - return lambda x: f(g(x)) - - f = lambda x: x - if flags.strip_trailing_cr: - f = compose2(lambda line: line.rstrip('\r'), f) - if flags.ignore_all_space or flags.ignore_space_change: - ignoreSpace = lambda line, separator: separator.join(line.split()) - ignoreAllSpaceOrSpaceChange = functools.partial(ignoreSpace, separator='' if flags.ignore_all_space else ' ') - f = compose2(ignoreAllSpaceOrSpaceChange, f) - - for idx, lines in enumerate(filelines): - filelines[idx]= [f(line) for line in lines] - - func = difflib.unified_diff if flags.unified_diff else difflib.context_diff - for diff in func(filelines[0], filelines[1], filepaths[0], filepaths[1]): - sys.stdout.write(diff) - exitCode = 1 - return exitCode - -def printDirVsFile(dir_path, file_path): - if os.path.getsize(file_path): - msg = "File %s is a directory while file %s is a regular file" - else: - msg = "File %s is a directory while file %s is a regular empty file" - sys.stdout.write(msg % (dir_path, file_path) + "\n") - -def printFileVsDir(file_path, dir_path): - if os.path.getsize(file_path): - msg = "File %s is a regular file while file %s is a directory" - else: - msg = "File %s is a regular empty file while file %s is a directory" - sys.stdout.write(msg % (file_path, dir_path) + "\n") - -def printOnlyIn(basedir, path, name): - sys.stdout.write("Only in %s: %s\n" 
% (os.path.join(basedir, path), name)) - -def compareDirTrees(flags, dir_trees, base_paths=["", ""]): - # Dirnames of the trees are not checked, it's caller's responsibility, - # as top-level dirnames are always different. Base paths are important - # for doing os.walk, but we don't put it into tree's dirname in order - # to speed up string comparison below and while sorting in getDirTree. - left_tree, right_tree = dir_trees[0], dir_trees[1] - left_base, right_base = base_paths[0], base_paths[1] - - # Compare two files or report file vs. directory mismatch. - if left_tree[1] is None and right_tree[1] is None: - return compareTwoFiles(flags, - [os.path.join(left_base, left_tree[0]), - os.path.join(right_base, right_tree[0])]) - - if left_tree[1] is None and right_tree[1] is not None: - printFileVsDir(os.path.join(left_base, left_tree[0]), - os.path.join(right_base, right_tree[0])) - return 1 - - if left_tree[1] is not None and right_tree[1] is None: - printDirVsFile(os.path.join(left_base, left_tree[0]), - os.path.join(right_base, right_tree[0])) - return 1 - - # Compare two directories via recursive use of compareDirTrees. - exitCode = 0 - left_names = [node[0] for node in left_tree[1]] - right_names = [node[0] for node in right_tree[1]] - l, r = 0, 0 - while l < len(left_names) and r < len(right_names): - # Names are sorted in getDirTree, rely on that order. - if left_names[l] < right_names[r]: - exitCode = 1 - printOnlyIn(left_base, left_tree[0], left_names[l]) - l += 1 - elif left_names[l] > right_names[r]: - exitCode = 1 - printOnlyIn(right_base, right_tree[0], right_names[r]) - r += 1 - else: - exitCode |= compareDirTrees(flags, - [left_tree[1][l], right_tree[1][r]], - [os.path.join(left_base, left_tree[0]), - os.path.join(right_base, right_tree[0])]) - l += 1 - r += 1 - - # At least one of the trees has ended. Report names from the other tree. 
- while l < len(left_names): - exitCode = 1 - printOnlyIn(left_base, left_tree[0], left_names[l]) - l += 1 - while r < len(right_names): - exitCode = 1 - printOnlyIn(right_base, right_tree[0], right_names[r]) - r += 1 - return exitCode - -def main(argv): - args = argv[1:] - try: - opts, args = getopt.gnu_getopt(args, "wbur", ["strip-trailing-cr"]) - except getopt.GetoptError as err: - sys.stderr.write("Unsupported: 'diff': %s\n" % str(err)) - sys.exit(1) - - flags = DiffFlags() - filelines, filepaths, dir_trees = ([] for i in range(3)) - for o, a in opts: - if o == "-w": - flags.ignore_all_space = True - elif o == "-b": - flags.ignore_space_change = True - elif o == "-u": - flags.unified_diff = True - elif o == "-r": - flags.recursive_diff = True - elif o == "--strip-trailing-cr": - flags.strip_trailing_cr = True - else: - assert False, "unhandled option" - - if len(args) != 2: - sys.stderr.write("Error: missing or extra operand\n") - sys.exit(1) - - exitCode = 0 - try: - for file in args: - if not os.path.isabs(file): - file = os.path.realpath(os.path.join(os.getcwd(), file)) - - if flags.recursive_diff: - dir_trees.append(getDirTree(file)) - else: - filepaths.append(file) - - if not flags.recursive_diff: - exitCode = compareTwoFiles(flags, filepaths) - else: - exitCode = compareDirTrees(flags, dir_trees) - - except IOError as err: - sys.stderr.write("Error: 'diff' command failed, %s\n" % str(err)) - exitCode = 1 - - sys.exit(exitCode) - -if __name__ == "__main__": - main(sys.argv) Added: llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-error-0.txt URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-error-0.txt?rev=374429&view=auto ============================================================================== --- llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-error-0.txt (added) +++ llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-error-0.txt Thu Oct 10 12:25:39 2019 @@ -0,0 +1,3 @@ +# Check error on a 
unsupported diff (cannot be part of a pipeline). +# +# RUN: diff diff-error-0.txt diff-error-0.txt | echo Output Removed: llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-pipes.txt URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-pipes.txt?rev=374428&view=auto ============================================================================== --- llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-pipes.txt (original) +++ llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-pipes.txt (removed) @@ -1,15 +0,0 @@ -# RUN: echo foo > %t.foo -# RUN: echo bar > %t.bar - -# Check output pipe. -# RUN: diff %t.foo %t.foo | FileCheck -allow-empty -check-prefix=EMPTY %s -# RUN: diff -u %t.foo %t.bar | FileCheck %s && false || true - -# Fail so lit will print output. -# RUN: false - -# CHECK: @@ -# CHECK-NEXT: -foo -# CHECK-NEXT: +bar - -# EMPTY-NOT: {{.}} Modified: llvm/trunk/utils/lit/tests/shtest-shell.py URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/shtest-shell.py?rev=374429&r1=374428&r2=374429&view=diff ============================================================================== --- llvm/trunk/utils/lit/tests/shtest-shell.py (original) +++ llvm/trunk/utils/lit/tests/shtest-shell.py Thu Oct 10 12:25:39 2019 @@ -34,20 +34,28 @@ # CHECK: error: command failed with exit status: 127 # CHECK: *** +# CHECK: FAIL: shtest-shell :: diff-error-0.txt +# CHECK: *** TEST 'shtest-shell :: diff-error-0.txt' FAILED *** +# CHECK: $ "diff" "diff-error-0.txt" "diff-error-0.txt" +# CHECK: # command stderr: +# CHECK: Unsupported: 'diff' cannot be part of a pipeline +# CHECK: error: command failed with exit status: 127 +# CHECK: *** + # CHECK: FAIL: shtest-shell :: diff-error-1.txt # CHECK: *** TEST 'shtest-shell :: diff-error-1.txt' FAILED *** # CHECK: $ "diff" "-B" "temp1.txt" "temp2.txt" # CHECK: # command stderr: # CHECK: Unsupported: 'diff': option -B not recognized -# CHECK: error: command failed with exit status: 1 +# 
CHECK: error: command failed with exit status: 127 # CHECK: *** # CHECK: FAIL: shtest-shell :: diff-error-2.txt # CHECK: *** TEST 'shtest-shell :: diff-error-2.txt' FAILED *** # CHECK: $ "diff" "temp.txt" # CHECK: # command stderr: -# CHECK: Error: missing or extra operand -# CHECK: error: command failed with exit status: 1 +# CHECK: Error: missing or extra operand +# CHECK: error: command failed with exit status: 127 # CHECK: *** # CHECK: FAIL: shtest-shell :: diff-error-3.txt @@ -74,43 +82,18 @@ # CHECK: *** TEST 'shtest-shell :: diff-error-5.txt' FAILED *** # CHECK: $ "diff" # CHECK: # command stderr: -# CHECK: Error: missing or extra operand -# CHECK: error: command failed with exit status: 1 +# CHECK: Error: missing or extra operand +# CHECK: error: command failed with exit status: 127 # CHECK: *** # CHECK: FAIL: shtest-shell :: diff-error-6.txt # CHECK: *** TEST 'shtest-shell :: diff-error-6.txt' FAILED *** # CHECK: $ "diff" # CHECK: # command stderr: -# CHECK: Error: missing or extra operand -# CHECK: error: command failed with exit status: 1 -# CHECK: *** - - -# CHECK: FAIL: shtest-shell :: diff-pipes.txt - -# CHECK: *** TEST 'shtest-shell :: diff-pipes.txt' FAILED *** - -# CHECK: $ "diff" "{{[^"]*}}.foo" "{{[^"]*}}.foo" -# CHECK-NOT: note -# CHECK-NOT: error -# CHECK: $ "FileCheck" -# CHECK-NOT: note -# CHECK-NOT: error - -# CHECK: $ "diff" "-u" "{{[^"]*}}.foo" "{{[^"]*}}.bar" -# CHECK: note: command had no output on stdout or stderr -# CHECK: error: command failed with exit status: 1 -# CHECK: $ "FileCheck" -# CHECK-NOT: note -# CHECK-NOT: error -# CHECK: $ "true" - -# CHECK: $ "false" - +# CHECK: Error: missing or extra operand +# CHECK: error: command failed with exit status: 127 # CHECK: *** - # CHECK: FAIL: shtest-shell :: diff-r-error-0.txt # CHECK: *** TEST 'shtest-shell :: diff-r-error-0.txt' FAILED *** # CHECK: $ "diff" "-r" From llvm-commits at lists.llvm.org Thu Oct 10 12:28:08 2019 From: llvm-commits at lists.llvm.org (Roman Lebedev via 
Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 19:28:08 +0000 (UTC) Subject: [PATCH] D68766: [NFC][ArgPromo][Tests] Run update_test_checks on all ArgumentPromotion tests In-Reply-To: References: Message-ID: <380383bf7db8d42d68a3bd6d029790d2@localhost.localdomain> lebedev.ri added a comment. If you want to add attributor runlines, i really insist on following the example i showed, it will result in cleaner diff overall. That being said i like that the attributor runlines are in a separate diff. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68766/new/ https://reviews.llvm.org/D68766 From llvm-commits at lists.llvm.org Thu Oct 10 12:28:09 2019 From: llvm-commits at lists.llvm.org (Hal Finkel via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 19:28:09 +0000 (UTC) Subject: [PATCH] D68793: [System Model] [TTI] Add TTI interfaces for write-combining buffers In-Reply-To: References: Message-ID: hfinkel added a comment. In D68793#1704496 , @greened wrote: > In D68793#1704266 , @hfinkel wrote: > > > How do you imagine that we'd use this? Do we need some kind of size to go along with this? > > > See the Intel optimization guide, section 3.6.9. > > https://software.intel.com/sites/default/files/managed/9e/bc/64-ia-32-architectures-optimization-manual.pdf > > Basically, this information can be used to inform loop transformations as well as use of non-temporal instructions. A write-combining buffer is not the same as a store buffer. A write-combining buffer is always one cache line in size, so I don't think we need size information. Alright, thanks. First, we should document this in the interface. Instead of just saying: \return the number of write-combining buffers. we might say something like: \return the number of write-combining buffers. A write-combining buffer is a per-core resource used for collecting writes to a particular cache line before further processing those writes using other parts of the memory subsystem. 
we already have getCacheLineSize(), so we know how big that is, but we don't currently have a way to account for how many hardware threads per core, right? Don't we need that to estimate how many write-combining buffers we get for the current hardware thread? (Presumably, we'd want the same thing to use the total-cache-size functions too, because we need to generate code assuming a working-set size per thread?) The Intel optimization guide talks about using this number to drive loop distribution, where we don't update more arrays (cache lines) at a time than can fit into the thread's WC buffers. Is this what you had in mind? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68793/new/ https://reviews.llvm.org/D68793 From llvm-commits at lists.llvm.org Thu Oct 10 12:28:09 2019 From: llvm-commits at lists.llvm.org (Steven Wan via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 19:28:09 +0000 (UTC) Subject: [PATCH] D68529: [lit] Move argument parsing/validation to separate file In-Reply-To: References: Message-ID: <16a3d004368d2508030d8a8ac6693280@localhost.localdomain> stevewan added a comment. Hi @yln, The disappeared `--threads` is causing LIT failures in our internal build bots. Can you please take a look? Thanks! Steven Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68529/new/ https://reviews.llvm.org/D68529 From llvm-commits at lists.llvm.org Thu Oct 10 12:29:49 2019 From: llvm-commits at lists.llvm.org (Jinsong Ji via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 19:29:49 +0000 (UTC) Subject: [PATCH] D68817: [PowerPC][docs] Update IBM official docs in Compiler Writers Info page In-Reply-To: References: Message-ID: <7e2328ca5df88079231f9526ba0d674e@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rG26cd5c93705c: [PowerPC][docs] Update IBM official docs in Compiler Writers Info page (authored by jsji). 
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68817/new/ https://reviews.llvm.org/D68817 Files: llvm/docs/CompilerWriterInfo.rst Index: llvm/docs/CompilerWriterInfo.rst =================================================================== --- llvm/docs/CompilerWriterInfo.rst +++ llvm/docs/CompilerWriterInfo.rst @@ -58,21 +58,27 @@ IBM - Official manuals and docs ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ -* `Power Instruction Set Architecture, Versions 2.03 through 2.06 (authentication required, free sign-up) `_ +* `Power Instruction Set Architecture, Version 3.0B `_ -* `PowerPC Compiler Writer's Guide `_ +* `POWER9 Processor User's Manual `_ -* `Intro to PowerPC Architecture `_ +* `Power Instruction Set Architecture, Version 2.07B `_ -* `PowerPC Processor Manuals (embedded) `_ +* `POWER8 Processor User's Manual `_ -* `Various IBM specifications and white papers `_ +* `Power Instruction Set Architecture, Versions 2.03 through 2.06 (Internet Archive) `_ + +* `IBM AIX 7.2 POWER Assembly Reference `_ * `IBM AIX/5L for POWER Assembly Reference `_ Other documents, collections, notes ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +* `PowerPC Compiler Writer's Guide `_ +* `Intro to PowerPC Architecture `_ +* `PowerPC Processor Manuals (embedded) `_ +* `Various IBM specifications and white papers `_ * `PowerPC ABI documents `_ * `PowerPC64 alignment of long doubles (from GCC) `_ * `Long branch stubs for powerpc64-linux (from binutils) `_ @@ -133,6 +139,9 @@ ----- * `Linux extensions to gabi `_ +* `64-Bit ELF V2 ABI Specification: Power Architecture `_ + +* `OpenPOWER ELFv2 Errata: ELFv2 ABI Version 1.4 `_ * `PowerPC 64-bit ELF ABI Supplement `_ * `Procedure Call Standard for the AArch64 Architecture `_ * `Procedure Call Standard for the ARM Architecture `_ -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68817.224446.patch Type: text/x-patch Size: 3591 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Thu Oct 10 12:30:25 2019 From: llvm-commits at lists.llvm.org (Philip Reames via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 19:30:25 +0000 (UTC) Subject: [PATCH] D68811: [CVP] Remove a masking operation if range information implies it's a noop In-Reply-To: References: Message-ID: reames updated this revision to Diff 224423. reames added a comment. Add statistic, and update a couple other tests which should have been in initial patch. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68811/new/ https://reviews.llvm.org/D68811 Files: lib/Transforms/Scalar/CorrelatedValuePropagation.cpp test/Transforms/CorrelatedValuePropagation/and.ll test/Transforms/CorrelatedValuePropagation/overflows.ll test/Transforms/CorrelatedValuePropagation/range.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D68811.224423.patch Type: text/x-patch Size: 6529 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Thu Oct 10 12:37:16 2019 From: llvm-commits at lists.llvm.org (Teresa Johnson via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 19:37:16 +0000 (UTC) Subject: [PATCH] D67322: [LLD][ThinLTO] Handle GUID collision in import global processing In-Reply-To: References: Message-ID: <52b9647218c1eee9e0793e63c9fd2b31@localhost.localdomain> tejohnson added a comment. In D67322#1704278 , @evgeny777 wrote: > Is this the only place where bad things can happen? May be simply raise an error in `addGlobalValueSummary` when new summary type is different from that of `SummaryList[0]`? We should probably have more testing of this situation. But if we can make it work we should wherever possible. 
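As a purely illustrative sketch of the check evgeny777 proposes above (and not LLVM's actual `ModuleSummaryIndex` code), the toy Python model below keeps a list of summaries per GUID and rejects a new summary whose kind differs from the first one recorded. The `ToyIndex` class, the `kind` strings, the module names and the GUID value are all hypothetical stand-ins chosen for the example.

```python
class SummaryKindMismatch(Exception):
    """Raised when two summaries with the same GUID have different kinds."""


class ToyIndex:
    """Hypothetical stand-in for a summary index keyed by GUID."""

    def __init__(self):
        # Maps a GUID (int) to the list of (kind, module) summaries seen so far.
        self.summary_lists = {}

    def add_global_value_summary(self, guid, kind, module):
        summaries = self.summary_lists.setdefault(guid, [])
        # The suggested check: compare against the first summary recorded.
        if summaries and summaries[0][0] != kind:
            raise SummaryKindMismatch(
                "GUID %#x: %s in %s collides with %s in %s"
                % (guid, kind, module, summaries[0][0], summaries[0][1]))
        summaries.append((kind, module))


index = ToyIndex()
index.add_global_value_summary(0x1234, "function", "a.o")
index.add_global_value_summary(0x1234, "function", "b.o")  # same kind: accepted
try:
    # A different kind under the same GUID models the collision case.
    index.add_global_value_summary(0x1234, "variable", "c.o")
    collided = False
except SummaryKindMismatch:
    collided = True
```

Whether a hard error is the right response is exactly what the thread is weighing; the comment above argues for handling the collision wherever that can be made to work.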
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67322/new/ https://reviews.llvm.org/D67322 From llvm-commits at lists.llvm.org Thu Oct 10 12:37:17 2019 From: llvm-commits at lists.llvm.org (Philip Reames via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 19:37:17 +0000 (UTC) Subject: [PATCH] D68811: [CVP] Remove a masking operation if range information implies it's a noop In-Reply-To: References: Message-ID: <8efcc0b9075e1e93302c8ac41b01e7a4@localhost.localdomain> reames marked 3 inline comments as done. reames added inline comments. ================ Comment at: lib/Transforms/Scalar/CorrelatedValuePropagation.cpp:715-716 + + ConstantRange LRange = LVI->getConstantRange(LHS, BB, BinOp); + if (!LRange.getUnsignedMax().ule(RHS->getValue())) + return false; ---------------- lebedev.ri wrote: > Do we want to query constanrange, or use `getPredicateAt()`? > Different folds here take different routes. > Should this be: > ``` > if (LVI->getPredicateAt(ICmpInst::ICMP_ULE, LHS, RHS, SDI) != > LazyValueInfo::True) > return false; > ``` > ? (i don't know) I think they're basically equivalent in practice for this case. The getPredicateAt form is slightly more powerful, as it does predicate pushback through one layer of predecessors, but skipping that is slightly faster in compile time. Hm, having written that, I think I'll switch to the predicate form just for future-proofing and consistency. Update coming. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68811/new/ https://reviews.llvm.org/D68811 From llvm-commits at lists.llvm.org Thu Oct 10 12:40:44 2019 From: llvm-commits at lists.llvm.org (Craig Topper via llvm-commits) Date: Thu, 10 Oct 2019 19:40:44 -0000 Subject: [llvm] r374431 - [X86] Use packusdw+vpmovuswb to implement v16i32->V16i8 that clamps signed inputs to be between 0 and 255 when zmm registers are disabled on SKX.
Message-ID: <20191010194045.098C684CF3@lists.llvm.org> Author: ctopper Date: Thu Oct 10 12:40:44 2019 New Revision: 374431 URL: http://llvm.org/viewvc/llvm-project?rev=374431&view=rev Log: [X86] Use packusdw+vpmovuswb to implement v16i32->V16i8 that clamps signed inputs to be between 0 and 255 when zmm registers are disabled on SKX. If we've disabled zmm registers, the v16i32 will need to be split. This split will propagate through the min/max and the truncate. This creates two sequences that need to be concatenated back to v16i8. We can instead use packusdw to do part of the clamping, truncating, and concatenating all at once. Then we can use a vpmovuswb to finish off the clamp. Differential Revision: https://reviews.llvm.org/D68763 Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp llvm/trunk/test/CodeGen/X86/min-legal-vector-width.ll Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=374431&r1=374430&r2=374431&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp (original) +++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Thu Oct 10 12:40:44 2019 @@ -39841,6 +39841,21 @@ static SDValue combineTruncateWithSat(SD if (auto USatVal = detectUSatPattern(In, VT, DAG, DL)) return DAG.getNode(X86ISD::VTRUNCUS, DL, VT, USatVal); } + + // If we're clamping a signed 32-bit vector to 0-255 and the 32-bit vector is + // split across two registers. We can use a packusdw+perm to clamp to 0-65535 + // and concatenate at the same time. Then we can use a final vpmovuswb to + // clip to 0-255. + if (Subtarget.hasBWI() && !Subtarget.useAVX512Regs() && + InVT == MVT::v16i32 && VT == MVT::v16i8) { + if (auto USatVal = detectSSatPattern(In, VT, true)) { + // Emit a VPACKUSDW+VPERMQ followed by a VPMOVUSWB.
+ SDValue Mid = truncateVectorWithPACK(X86ISD::PACKUS, MVT::v16i16, USatVal, + DL, DAG, Subtarget); + return DAG.getNode(X86ISD::VTRUNCUS, DL, VT, Mid); + } + } + if (VT.isVector() && isPowerOf2_32(VT.getVectorNumElements()) && !(Subtarget.hasAVX512() && InSVT == MVT::i32) && !(Subtarget.hasBWI() && InSVT == MVT::i16) && Modified: llvm/trunk/test/CodeGen/X86/min-legal-vector-width.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/min-legal-vector-width.ll?rev=374431&r1=374430&r2=374431&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/min-legal-vector-width.ll (original) +++ llvm/trunk/test/CodeGen/X86/min-legal-vector-width.ll Thu Oct 10 12:40:44 2019 @@ -1083,12 +1083,10 @@ define void @vselect_split_v16i16_setcc( define <16 x i8> @trunc_packus_v16i32_v16i8(<16 x i32>* %p, <16 x i8>* %q) "min-legal-vector-width"="256" { ; CHECK-LABEL: trunc_packus_v16i32_v16i8: ; CHECK: # %bb.0: -; CHECK-NEXT: vpxor %xmm0, %xmm0, %xmm0 -; CHECK-NEXT: vpmaxsd 32(%rdi), %ymm0, %ymm1 -; CHECK-NEXT: vpmovusdb %ymm1, %xmm1 -; CHECK-NEXT: vpmaxsd (%rdi), %ymm0, %ymm0 -; CHECK-NEXT: vpmovusdb %ymm0, %xmm0 -; CHECK-NEXT: vpunpcklqdq {{.*#+}} xmm0 = xmm0[0],xmm1[0] +; CHECK-NEXT: vmovdqa (%rdi), %ymm0 +; CHECK-NEXT: vpackusdw 32(%rdi), %ymm0, %ymm0 +; CHECK-NEXT: vpermq {{.*#+}} ymm0 = ymm0[0,2,1,3] +; CHECK-NEXT: vpmovuswb %ymm0, %xmm0 ; CHECK-NEXT: vzeroupper ; CHECK-NEXT: retq %a = load <16 x i32>, <16 x i32>* %p From llvm-commits at lists.llvm.org Thu Oct 10 12:43:57 2019 From: llvm-commits at lists.llvm.org (Julian Lettner via llvm-commits) Date: Thu, 10 Oct 2019 19:43:57 -0000 Subject: [llvm] r374432 - [lit] Bring back `--threads` option alias Message-ID: <20191010194357.58F0E925F2@lists.llvm.org> Author: yln Date: Thu Oct 10 12:43:57 2019 New Revision: 374432 URL: http://llvm.org/viewvc/llvm-project?rev=374432&view=rev Log: [lit] Bring back `--threads` option alias Bring back `--threads` 
option which was lost in the move of the command line argument parsing code to cl_arguments.py. Update docs since `--workers` is preferred. Modified: llvm/trunk/docs/CommandGuide/lit.rst llvm/trunk/utils/lit/lit/cl_arguments.py Modified: llvm/trunk/docs/CommandGuide/lit.rst URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/docs/CommandGuide/lit.rst?rev=374432&r1=374431&r2=374432&view=diff ============================================================================== --- llvm/trunk/docs/CommandGuide/lit.rst (original) +++ llvm/trunk/docs/CommandGuide/lit.rst Thu Oct 10 12:43:57 2019 @@ -53,7 +53,7 @@ GENERAL OPTIONS Show the :program:`lit` help message. -.. option:: -j N, --threads=N +.. option:: -j N, --workers=N Run ``N`` tests in parallel. By default, this is automatically chosen to match the number of detected available CPUs. Modified: llvm/trunk/utils/lit/lit/cl_arguments.py URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/lit/cl_arguments.py?rev=374432&r1=374431&r2=374432&view=diff ============================================================================== --- llvm/trunk/utils/lit/lit/cl_arguments.py (original) +++ llvm/trunk/utils/lit/lit/cl_arguments.py Thu Oct 10 12:43:57 2019 @@ -16,7 +16,7 @@ def parse_args(): help="Show version and exit", action="store_true", default=False) - parser.add_argument("-j", "--workers", + parser.add_argument("-j", "--threads", "--workers", dest="numWorkers", metavar="N", help="Number of workers used for testing", From llvm-commits at lists.llvm.org Thu Oct 10 12:47:01 2019 From: llvm-commits at lists.llvm.org (Philip Reames via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 19:47:01 +0000 (UTC) Subject: [PATCH] D68811: [CVP] Remove a masking operation if range information implies it's a noop In-Reply-To: References: Message-ID: <8b29ea2713d6ec103cdf8861723bfcf0@localhost.localdomain> reames marked an inline comment as done. reames added inline comments. 
================ Comment at: lib/Transforms/Scalar/CorrelatedValuePropagation.cpp:715-716 + + ConstantRange LRange = LVI->getConstantRange(LHS, BB, BinOp); + if (!LRange.getUnsignedMax().ule(RHS->getValue())) + return false; ---------------- reames wrote: > lebedev.ri wrote: > > Do we want to query constanrange, or use `getPredicateAt()`? > > Different folds here take different routes. > > Should this be: > > ``` > > if (LVI->getPredicateAt(ICmpInst::ICMP_ULE, LHS, RHS, SDI) != > > LazyValueInfo::True) > > return false; > > ``` > > ? (i don't know) > In this case, I think they're basically equivalent in practice for this case. The getPredicateAt form is slightly more powerful as it does predicate pushback through one layer of predecessors, but skipping that is slightly faster compile time wise. > > Hm, having written that, I think I'll switch to the predicate form just for future proofing ad consistency. Update coming. Correction: the getPredicateAt form does predicate pushback, but is weaker on conditions which are implied by edge local facts. (i.e. I tried to switch and the results are strictly worse.) This is arguably an LVI bug - one I remember from past occurrences now - but not something I'm going to fix just for this issue. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68811/new/ https://reviews.llvm.org/D68811 From llvm-commits at lists.llvm.org Thu Oct 10 12:47:01 2019 From: llvm-commits at lists.llvm.org (Digger via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 19:47:01 +0000 (UTC) Subject: [PATCH] D67008: [llvm-readobj][XCOFF]implement parsing relocation information for 32-bit xcoff objectfile In-Reply-To: References: Message-ID: <51f9b95468d276fb4b0f2352a6948a86@localhost.localdomain> DiggerLin marked 3 inline comments as done. DiggerLin added inline comments. 
================ Comment at: llvm/include/llvm/Object/XCOFFObjectFile.h:168 + +bool isRelocationSigned(XCOFFRelocation32 &Reloc); + ---------------- hubert.reinterpretcast wrote: > sfertile wrote: > > hubert.reinterpretcast wrote: > > > DiggerLin wrote: > > > > hubert.reinterpretcast wrote: > > > > > Do these need to be declared in the header? Are they called only in one `.cpp` file? If so, they can be made `static` in the `.cpp` file. Otherwise, it seems odd that these aren't `const` member functions of `XCOFFRelocation32`. > > > > the llvm-readobj is using those function and obj2yaml will use them too. > > > It is still odd to me that these aren't `const` non-static member functions of `XCOFFRelocation32`. > > I think were these originally templated to work with both 32-bit and 64-bit relocations, which explains why they aren't member functions. > Would using CRTP with a base class template work for that case? As Sean commented, we only implement 32-bit relocations for now, so the relocation implementation does not use any templates at this moment. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67008/new/ https://reviews.llvm.org/D67008 From llvm-commits at lists.llvm.org Thu Oct 10 12:47:02 2019 From: llvm-commits at lists.llvm.org (Matt Arsenault via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 19:47:02 +0000 (UTC) Subject: [PATCH] D68826: AMDGPU: Fix redundant setting of m0 for atomic load/store Message-ID: arsenm created this revision. arsenm added a reviewer: rampitec. Herald added subscribers: jfb, t-tye, tpr, dstuttard, yaxunl, nhaehnle, wdng, jvesely, kzhuravl. Atomic load/store would have their setting of m0 handled twice, which happened to be optimized out later.
https://reviews.llvm.org/D68826 Files: lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp Index: lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp =================================================================== --- lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp +++ lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp @@ -715,12 +715,17 @@ return; // Already selected. } - if (isa(N) || + // isa almost works but is slightly too permisssive for some DS + // intrinsics. + if (Opc == ISD::LOAD || Opc == ISD::STORE || isa(N) || (Opc == AMDGPUISD::ATOMIC_INC || Opc == AMDGPUISD::ATOMIC_DEC || Opc == ISD::ATOMIC_LOAD_FADD || Opc == AMDGPUISD::ATOMIC_LOAD_FMIN || - Opc == AMDGPUISD::ATOMIC_LOAD_FMAX)) + Opc == AMDGPUISD::ATOMIC_LOAD_FMAX)) { N = glueCopyToM0LDSInit(N); + SelectCode(N); + return; + } switch (Opc) { default: @@ -817,14 +822,6 @@ ReplaceNode(N, buildSMovImm64(DL, Imm, N->getValueType(0))); return; } - case ISD::LOAD: - case ISD::STORE: - case ISD::ATOMIC_LOAD: - case ISD::ATOMIC_STORE: { - N = glueCopyToM0LDSInit(N); - break; - } - case AMDGPUISD::BFE_I32: case AMDGPUISD::BFE_U32: { // There is a scalar version available, but unlike the vector version which -------------- next part -------------- A non-text attachment was scrubbed... Name: D68826.224450.patch Type: text/x-patch Size: 1213 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Thu Oct 10 12:47:02 2019 From: llvm-commits at lists.llvm.org (Julian Lettner via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 19:47:02 +0000 (UTC) Subject: [PATCH] D68529: [lit] Move argument parsing/validation to separate file In-Reply-To: References: Message-ID: <3c8253858e81bf90a06260c4e4aa3172@localhost.localdomain> yln added a comment. Added back the `--threads` option alias in b858895c859712e7c0f4e1c5e0436bd20cbe0c34 . Apologies for the churn. 
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68529/new/ https://reviews.llvm.org/D68529 From llvm-commits at lists.llvm.org Thu Oct 10 12:47:02 2019 From: llvm-commits at lists.llvm.org (Digger via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 19:47:02 +0000 (UTC) Subject: [PATCH] D67008: [llvm-readobj][XCOFF]implement parsing relocation information for 32-bit xcoff objectfile In-Reply-To: References: Message-ID: <8644befa0b4c358bbe03216ba845785b@localhost.localdomain> DiggerLin updated this revision to Diff 224451. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67008/new/ https://reviews.llvm.org/D67008 Files: llvm/include/llvm/BinaryFormat/XCOFF.h llvm/include/llvm/Object/XCOFFObjectFile.h llvm/lib/Object/XCOFFObjectFile.cpp llvm/test/tools/llvm-readobj/reloc_overflow.test llvm/test/tools/llvm-readobj/xcoff-basic.test llvm/tools/llvm-readobj/XCOFFDumper.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D67008.224451.patch Type: text/x-patch Size: 19676 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Thu Oct 10 12:47:56 2019 From: llvm-commits at lists.llvm.org (Craig Topper via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 19:47:56 +0000 (UTC) Subject: [PATCH] D68763: [X86] Use packusdw+vpmovuswb to implement v16i32->V16i8 that clamps signed inputs to be between 0 and 255 when zmm registers are disabled on SKX. In-Reply-To: References: Message-ID: <9c9291b857d4c7ad1d6e8aef60e7ccd2@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rG0e561437c587: [X86] Use packusdw+vpmovuswb to implement v16i32->V16i8 that clamps signed… (authored by craig.topper). 
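For readers following the committed change, here is a per-element scalar sketch of why the two-step lowering gives the same result as a direct clamp to 0-255. It only models the saturation behaviour of one vector element: the first step saturates a signed 32-bit value to 0-65535 (as PACKUSDW does), the second saturates an unsigned 16-bit value to 0-255 (as VPMOVUSWB does). The function names are hypothetical and not part of LLVM, and the lane reordering done by VPERMQ is deliberately left out.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helper: one element of PACKUSDW. The input is treated as a
// signed 32-bit value and saturated to the unsigned 16-bit range [0, 65535].
inline std::uint16_t packus_i32_to_u16(std::int32_t x) {
  std::int32_t lo = std::max<std::int32_t>(x, 0);
  return static_cast<std::uint16_t>(std::min<std::int32_t>(lo, 65535));
}

// Hypothetical helper: one element of VPMOVUSWB. The input is an unsigned
// 16-bit value and is saturated to the unsigned 8-bit range [0, 255].
inline std::uint8_t truncus_u16_to_u8(std::uint16_t x) {
  return static_cast<std::uint8_t>(x > 255 ? 255 : x);
}

// The two-step sequence: saturate to 16 bits, then saturate to 8 bits.
inline std::uint8_t two_stage_clamp(std::int32_t x) {
  return truncus_u16_to_u8(packus_i32_to_u16(x));
}

// Reference: the signed clamp to [0, 255] followed by a truncate, which is
// the source pattern the combine is matching.
inline std::uint8_t reference_clamp(std::int32_t x) {
  std::int32_t lo = std::max<std::int32_t>(x, 0);
  return static_cast<std::uint8_t>(std::min<std::int32_t>(lo, 255));
}
```

Negative inputs are already forced to 0 by the first step, and anything above 255 that survives the first step is at most 65535, which the second step then saturates to 255, so `two_stage_clamp` and `reference_clamp` agree for every 32-bit input.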
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68763/new/ https://reviews.llvm.org/D68763 Files: llvm/lib/Target/X86/X86ISelLowering.cpp llvm/test/CodeGen/X86/min-legal-vector-width.ll Index: llvm/test/CodeGen/X86/min-legal-vector-width.ll =================================================================== --- llvm/test/CodeGen/X86/min-legal-vector-width.ll +++ llvm/test/CodeGen/X86/min-legal-vector-width.ll @@ -1083,12 +1083,10 @@ define <16 x i8> @trunc_packus_v16i32_v16i8(<16 x i32>* %p, <16 x i8>* %q) "min-legal-vector-width"="256" { ; CHECK-LABEL: trunc_packus_v16i32_v16i8: ; CHECK: # %bb.0: -; CHECK-NEXT: vpxor %xmm0, %xmm0, %xmm0 -; CHECK-NEXT: vpmaxsd 32(%rdi), %ymm0, %ymm1 -; CHECK-NEXT: vpmovusdb %ymm1, %xmm1 -; CHECK-NEXT: vpmaxsd (%rdi), %ymm0, %ymm0 -; CHECK-NEXT: vpmovusdb %ymm0, %xmm0 -; CHECK-NEXT: vpunpcklqdq {{.*#+}} xmm0 = xmm0[0],xmm1[0] +; CHECK-NEXT: vmovdqa (%rdi), %ymm0 +; CHECK-NEXT: vpackusdw 32(%rdi), %ymm0, %ymm0 +; CHECK-NEXT: vpermq {{.*#+}} ymm0 = ymm0[0,2,1,3] +; CHECK-NEXT: vpmovuswb %ymm0, %xmm0 ; CHECK-NEXT: vzeroupper ; CHECK-NEXT: retq %a = load <16 x i32>, <16 x i32>* %p Index: llvm/lib/Target/X86/X86ISelLowering.cpp =================================================================== --- llvm/lib/Target/X86/X86ISelLowering.cpp +++ llvm/lib/Target/X86/X86ISelLowering.cpp @@ -39841,6 +39841,21 @@ if (auto USatVal = detectUSatPattern(In, VT, DAG, DL)) return DAG.getNode(X86ISD::VTRUNCUS, DL, VT, USatVal); } + + // If we're clamping a signed 32-bit vector to 0-255 and the 32-bit vector is + // split across two registers. We can use a packusdw+perm to clamp to 0-65535 + // and concatenate at the same time. Then we can use a final vpmovuswb to + // clip to 0-255. + if (Subtarget.hasBWI() && !Subtarget.useAVX512Regs() && + InVT == MVT::v16i32 && VT == MVT::v16i8) { + if (auto USatVal = detectSSatPattern(In, VT, true)) { + // Emit a VPACKUSDW+VPERMQ followed by a VPMOVUSWB. 
+ SDValue Mid = truncateVectorWithPACK(X86ISD::PACKUS, MVT::v16i16, USatVal, + DL, DAG, Subtarget); + return DAG.getNode(X86ISD::VTRUNCUS, DL, VT, Mid); + } + } + if (VT.isVector() && isPowerOf2_32(VT.getVectorNumElements()) && !(Subtarget.hasAVX512() && InSVT == MVT::i32) && !(Subtarget.hasBWI() && InSVT == MVT::i16) && -------------- next part -------------- A non-text attachment was scrubbed... Name: D68763.224452.patch Type: text/x-patch Size: 2231 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Thu Oct 10 12:52:27 2019 From: llvm-commits at lists.llvm.org (Sanjay Patel via llvm-commits) Date: Thu, 10 Oct 2019 19:52:27 -0000 Subject: [llvm] r374436 - [x86] reduce duplicate test assertions; NFC Message-ID: <20191010195227.6EB2D92889@lists.llvm.org> Author: spatel Date: Thu Oct 10 12:52:27 2019 New Revision: 374436 URL: http://llvm.org/viewvc/llvm-project?rev=374436&view=rev Log: [x86] reduce duplicate test assertions; NFC Modified: llvm/trunk/test/CodeGen/X86/selectcc-to-shiftand.ll Modified: llvm/trunk/test/CodeGen/X86/selectcc-to-shiftand.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/selectcc-to-shiftand.ll?rev=374436&r1=374435&r2=374436&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/selectcc-to-shiftand.ll (original) +++ llvm/trunk/test/CodeGen/X86/selectcc-to-shiftand.ll Thu Oct 10 12:52:27 2019 @@ -1,23 +1,16 @@ ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py -; RUN: llc -mtriple=x86_64-unknown-linux-gnu -mattr=-bmi < %s | FileCheck %s --check-prefix=CHECK-NOBMI -; RUN: llc -mtriple=x86_64-unknown-linux-gnu -mattr=+bmi < %s | FileCheck %s --check-prefix=CHECK-BMI +; RUN: llc -mtriple=x86_64-unknown-linux-gnu -mattr=-bmi < %s | FileCheck %s --check-prefixes=ANY,CHECK-NOBMI +; RUN: llc -mtriple=x86_64-unknown-linux-gnu -mattr=+bmi < %s | FileCheck %s --check-prefixes=ANY,CHECK-BMI ; Compare if negative and select 
of constants where one constant is zero. define i32 @neg_sel_constants(i32 %a) { -; CHECK-NOBMI-LABEL: neg_sel_constants: -; CHECK-NOBMI: # %bb.0: -; CHECK-NOBMI-NEXT: movl %edi, %eax -; CHECK-NOBMI-NEXT: sarl $31, %eax -; CHECK-NOBMI-NEXT: andl $5, %eax -; CHECK-NOBMI-NEXT: retq -; -; CHECK-BMI-LABEL: neg_sel_constants: -; CHECK-BMI: # %bb.0: -; CHECK-BMI-NEXT: movl %edi, %eax -; CHECK-BMI-NEXT: sarl $31, %eax -; CHECK-BMI-NEXT: andl $5, %eax -; CHECK-BMI-NEXT: retq +; ANY-LABEL: neg_sel_constants: +; ANY: # %bb.0: +; ANY-NEXT: movl %edi, %eax +; ANY-NEXT: sarl $31, %eax +; ANY-NEXT: andl $5, %eax +; ANY-NEXT: retq %tmp.1 = icmp slt i32 %a, 0 %retval = select i1 %tmp.1, i32 5, i32 0 ret i32 %retval @@ -26,19 +19,12 @@ define i32 @neg_sel_constants(i32 %a) { ; Compare if negative and select of constants where one constant is zero and the other is a single bit. define i32 @neg_sel_special_constant(i32 %a) { -; CHECK-NOBMI-LABEL: neg_sel_special_constant: -; CHECK-NOBMI: # %bb.0: -; CHECK-NOBMI-NEXT: movl %edi, %eax -; CHECK-NOBMI-NEXT: shrl $22, %eax -; CHECK-NOBMI-NEXT: andl $512, %eax # imm = 0x200 -; CHECK-NOBMI-NEXT: retq -; -; CHECK-BMI-LABEL: neg_sel_special_constant: -; CHECK-BMI: # %bb.0: -; CHECK-BMI-NEXT: movl %edi, %eax -; CHECK-BMI-NEXT: shrl $22, %eax -; CHECK-BMI-NEXT: andl $512, %eax # imm = 0x200 -; CHECK-BMI-NEXT: retq +; ANY-LABEL: neg_sel_special_constant: +; ANY: # %bb.0: +; ANY-NEXT: movl %edi, %eax +; ANY-NEXT: shrl $22, %eax +; ANY-NEXT: andl $512, %eax # imm = 0x200 +; ANY-NEXT: retq %tmp.1 = icmp slt i32 %a, 0 %retval = select i1 %tmp.1, i32 512, i32 0 ret i32 %retval @@ -47,19 +33,12 @@ define i32 @neg_sel_special_constant(i32 ; Compare if negative and select variable or zero. 
define i32 @neg_sel_variable_and_zero(i32 %a, i32 %b) { -; CHECK-NOBMI-LABEL: neg_sel_variable_and_zero: -; CHECK-NOBMI: # %bb.0: -; CHECK-NOBMI-NEXT: movl %edi, %eax -; CHECK-NOBMI-NEXT: sarl $31, %eax -; CHECK-NOBMI-NEXT: andl %esi, %eax -; CHECK-NOBMI-NEXT: retq -; -; CHECK-BMI-LABEL: neg_sel_variable_and_zero: -; CHECK-BMI: # %bb.0: -; CHECK-BMI-NEXT: movl %edi, %eax -; CHECK-BMI-NEXT: sarl $31, %eax -; CHECK-BMI-NEXT: andl %esi, %eax -; CHECK-BMI-NEXT: retq +; ANY-LABEL: neg_sel_variable_and_zero: +; ANY: # %bb.0: +; ANY-NEXT: movl %edi, %eax +; ANY-NEXT: sarl $31, %eax +; ANY-NEXT: andl %esi, %eax +; ANY-NEXT: retq %tmp.1 = icmp slt i32 %a, 0 %retval = select i1 %tmp.1, i32 %b, i32 0 ret i32 %retval @@ -68,19 +47,12 @@ define i32 @neg_sel_variable_and_zero(i3 ; Compare if not positive and select the same variable as being compared: smin(a, 0). define i32 @not_pos_sel_same_variable(i32 %a) { -; CHECK-NOBMI-LABEL: not_pos_sel_same_variable: -; CHECK-NOBMI: # %bb.0: -; CHECK-NOBMI-NEXT: movl %edi, %eax -; CHECK-NOBMI-NEXT: sarl $31, %eax -; CHECK-NOBMI-NEXT: andl %edi, %eax -; CHECK-NOBMI-NEXT: retq -; -; CHECK-BMI-LABEL: not_pos_sel_same_variable: -; CHECK-BMI: # %bb.0: -; CHECK-BMI-NEXT: movl %edi, %eax -; CHECK-BMI-NEXT: sarl $31, %eax -; CHECK-BMI-NEXT: andl %edi, %eax -; CHECK-BMI-NEXT: retq +; ANY-LABEL: not_pos_sel_same_variable: +; ANY: # %bb.0: +; ANY-NEXT: movl %edi, %eax +; ANY-NEXT: sarl $31, %eax +; ANY-NEXT: andl %edi, %eax +; ANY-NEXT: retq %tmp = icmp slt i32 %a, 1 %min = select i1 %tmp, i32 %a, i32 0 ret i32 %min @@ -91,21 +63,13 @@ define i32 @not_pos_sel_same_variable(i3 ; Compare if positive and select of constants where one constant is zero. 
define i32 @pos_sel_constants(i32 %a) { -; CHECK-NOBMI-LABEL: pos_sel_constants: -; CHECK-NOBMI: # %bb.0: -; CHECK-NOBMI-NEXT: # kill: def $edi killed $edi def $rdi -; CHECK-NOBMI-NEXT: notl %edi -; CHECK-NOBMI-NEXT: shrl $31, %edi -; CHECK-NOBMI-NEXT: leal (%rdi,%rdi,4), %eax -; CHECK-NOBMI-NEXT: retq -; -; CHECK-BMI-LABEL: pos_sel_constants: -; CHECK-BMI: # %bb.0: -; CHECK-BMI-NEXT: # kill: def $edi killed $edi def $rdi -; CHECK-BMI-NEXT: notl %edi -; CHECK-BMI-NEXT: shrl $31, %edi -; CHECK-BMI-NEXT: leal (%rdi,%rdi,4), %eax -; CHECK-BMI-NEXT: retq +; ANY-LABEL: pos_sel_constants: +; ANY: # %bb.0: +; ANY-NEXT: # kill: def $edi killed $edi def $rdi +; ANY-NEXT: notl %edi +; ANY-NEXT: shrl $31, %edi +; ANY-NEXT: leal (%rdi,%rdi,4), %eax +; ANY-NEXT: retq %tmp.1 = icmp sgt i32 %a, -1 %retval = select i1 %tmp.1, i32 5, i32 0 ret i32 %retval @@ -114,21 +78,13 @@ define i32 @pos_sel_constants(i32 %a) { ; Compare if positive and select of constants where one constant is zero and the other is a single bit. 
define i32 @pos_sel_special_constant(i32 %a) { -; CHECK-NOBMI-LABEL: pos_sel_special_constant: -; CHECK-NOBMI: # %bb.0: -; CHECK-NOBMI-NEXT: movl %edi, %eax -; CHECK-NOBMI-NEXT: notl %eax -; CHECK-NOBMI-NEXT: shrl $22, %eax -; CHECK-NOBMI-NEXT: andl $512, %eax # imm = 0x200 -; CHECK-NOBMI-NEXT: retq -; -; CHECK-BMI-LABEL: pos_sel_special_constant: -; CHECK-BMI: # %bb.0: -; CHECK-BMI-NEXT: movl %edi, %eax -; CHECK-BMI-NEXT: notl %eax -; CHECK-BMI-NEXT: shrl $22, %eax -; CHECK-BMI-NEXT: andl $512, %eax # imm = 0x200 -; CHECK-BMI-NEXT: retq +; ANY-LABEL: pos_sel_special_constant: +; ANY: # %bb.0: +; ANY-NEXT: movl %edi, %eax +; ANY-NEXT: notl %eax +; ANY-NEXT: shrl $22, %eax +; ANY-NEXT: andl $512, %eax # imm = 0x200 +; ANY-NEXT: retq %tmp.1 = icmp sgt i32 %a, -1 %retval = select i1 %tmp.1, i32 512, i32 0 ret i32 %retval @@ -200,147 +156,90 @@ define i32 @PR31175(i32 %x, i32 %y) { } define i8 @sel_shift_bool_i8(i1 %t) { -; CHECK-NOBMI-LABEL: sel_shift_bool_i8: -; CHECK-NOBMI: # %bb.0: -; CHECK-NOBMI-NEXT: movl %edi, %eax -; CHECK-NOBMI-NEXT: shlb $7, %al -; CHECK-NOBMI-NEXT: # kill: def $al killed $al killed $eax -; CHECK-NOBMI-NEXT: retq -; -; CHECK-BMI-LABEL: sel_shift_bool_i8: -; CHECK-BMI: # %bb.0: -; CHECK-BMI-NEXT: movl %edi, %eax -; CHECK-BMI-NEXT: shlb $7, %al -; CHECK-BMI-NEXT: # kill: def $al killed $al killed $eax -; CHECK-BMI-NEXT: retq +; ANY-LABEL: sel_shift_bool_i8: +; ANY: # %bb.0: +; ANY-NEXT: movl %edi, %eax +; ANY-NEXT: shlb $7, %al +; ANY-NEXT: # kill: def $al killed $al killed $eax +; ANY-NEXT: retq %shl = select i1 %t, i8 128, i8 0 ret i8 %shl } define i16 @sel_shift_bool_i16(i1 %t) { -; CHECK-NOBMI-LABEL: sel_shift_bool_i16: -; CHECK-NOBMI: # %bb.0: -; CHECK-NOBMI-NEXT: movl %edi, %eax -; CHECK-NOBMI-NEXT: andl $1, %eax -; CHECK-NOBMI-NEXT: shll $7, %eax -; CHECK-NOBMI-NEXT: # kill: def $ax killed $ax killed $eax -; CHECK-NOBMI-NEXT: retq -; -; CHECK-BMI-LABEL: sel_shift_bool_i16: -; CHECK-BMI: # %bb.0: -; CHECK-BMI-NEXT: movl %edi, %eax -; 
CHECK-BMI-NEXT: andl $1, %eax -; CHECK-BMI-NEXT: shll $7, %eax -; CHECK-BMI-NEXT: # kill: def $ax killed $ax killed $eax -; CHECK-BMI-NEXT: retq +; ANY-LABEL: sel_shift_bool_i16: +; ANY: # %bb.0: +; ANY-NEXT: movl %edi, %eax +; ANY-NEXT: andl $1, %eax +; ANY-NEXT: shll $7, %eax +; ANY-NEXT: # kill: def $ax killed $ax killed $eax +; ANY-NEXT: retq %shl = select i1 %t, i16 128, i16 0 ret i16 %shl } define i32 @sel_shift_bool_i32(i1 %t) { -; CHECK-NOBMI-LABEL: sel_shift_bool_i32: -; CHECK-NOBMI: # %bb.0: -; CHECK-NOBMI-NEXT: movl %edi, %eax -; CHECK-NOBMI-NEXT: andl $1, %eax -; CHECK-NOBMI-NEXT: shll $6, %eax -; CHECK-NOBMI-NEXT: retq -; -; CHECK-BMI-LABEL: sel_shift_bool_i32: -; CHECK-BMI: # %bb.0: -; CHECK-BMI-NEXT: movl %edi, %eax -; CHECK-BMI-NEXT: andl $1, %eax -; CHECK-BMI-NEXT: shll $6, %eax -; CHECK-BMI-NEXT: retq +; ANY-LABEL: sel_shift_bool_i32: +; ANY: # %bb.0: +; ANY-NEXT: movl %edi, %eax +; ANY-NEXT: andl $1, %eax +; ANY-NEXT: shll $6, %eax +; ANY-NEXT: retq %shl = select i1 %t, i32 64, i32 0 ret i32 %shl } define i64 @sel_shift_bool_i64(i1 %t) { -; CHECK-NOBMI-LABEL: sel_shift_bool_i64: -; CHECK-NOBMI: # %bb.0: -; CHECK-NOBMI-NEXT: movl %edi, %eax -; CHECK-NOBMI-NEXT: andl $1, %eax -; CHECK-NOBMI-NEXT: shlq $16, %rax -; CHECK-NOBMI-NEXT: retq -; -; CHECK-BMI-LABEL: sel_shift_bool_i64: -; CHECK-BMI: # %bb.0: -; CHECK-BMI-NEXT: movl %edi, %eax -; CHECK-BMI-NEXT: andl $1, %eax -; CHECK-BMI-NEXT: shlq $16, %rax -; CHECK-BMI-NEXT: retq +; ANY-LABEL: sel_shift_bool_i64: +; ANY: # %bb.0: +; ANY-NEXT: movl %edi, %eax +; ANY-NEXT: andl $1, %eax +; ANY-NEXT: shlq $16, %rax +; ANY-NEXT: retq %shl = select i1 %t, i64 65536, i64 0 ret i64 %shl } define <16 x i8> @sel_shift_bool_v16i8(<16 x i1> %t) { -; CHECK-NOBMI-LABEL: sel_shift_bool_v16i8: -; CHECK-NOBMI: # %bb.0: -; CHECK-NOBMI-NEXT: psllw $7, %xmm0 -; CHECK-NOBMI-NEXT: pand {{.*}}(%rip), %xmm0 -; CHECK-NOBMI-NEXT: retq -; -; CHECK-BMI-LABEL: sel_shift_bool_v16i8: -; CHECK-BMI: # %bb.0: -; CHECK-BMI-NEXT: psllw 
$7, %xmm0 -; CHECK-BMI-NEXT: pand {{.*}}(%rip), %xmm0 -; CHECK-BMI-NEXT: retq +; ANY-LABEL: sel_shift_bool_v16i8: +; ANY: # %bb.0: +; ANY-NEXT: psllw $7, %xmm0 +; ANY-NEXT: pand {{.*}}(%rip), %xmm0 +; ANY-NEXT: retq %shl = select <16 x i1> %t, <16 x i8> , <16 x i8> zeroinitializer ret <16 x i8> %shl } define <8 x i16> @sel_shift_bool_v8i16(<8 x i1> %t) { -; CHECK-NOBMI-LABEL: sel_shift_bool_v8i16: -; CHECK-NOBMI: # %bb.0: -; CHECK-NOBMI-NEXT: psllw $15, %xmm0 -; CHECK-NOBMI-NEXT: psraw $15, %xmm0 -; CHECK-NOBMI-NEXT: pand {{.*}}(%rip), %xmm0 -; CHECK-NOBMI-NEXT: retq -; -; CHECK-BMI-LABEL: sel_shift_bool_v8i16: -; CHECK-BMI: # %bb.0: -; CHECK-BMI-NEXT: psllw $15, %xmm0 -; CHECK-BMI-NEXT: psraw $15, %xmm0 -; CHECK-BMI-NEXT: pand {{.*}}(%rip), %xmm0 -; CHECK-BMI-NEXT: retq +; ANY-LABEL: sel_shift_bool_v8i16: +; ANY: # %bb.0: +; ANY-NEXT: psllw $15, %xmm0 +; ANY-NEXT: psraw $15, %xmm0 +; ANY-NEXT: pand {{.*}}(%rip), %xmm0 +; ANY-NEXT: retq %shl= select <8 x i1> %t, <8 x i16> , <8 x i16> zeroinitializer ret <8 x i16> %shl } define <4 x i32> @sel_shift_bool_v4i32(<4 x i1> %t) { -; CHECK-NOBMI-LABEL: sel_shift_bool_v4i32: -; CHECK-NOBMI: # %bb.0: -; CHECK-NOBMI-NEXT: pslld $31, %xmm0 -; CHECK-NOBMI-NEXT: psrad $31, %xmm0 -; CHECK-NOBMI-NEXT: pand {{.*}}(%rip), %xmm0 -; CHECK-NOBMI-NEXT: retq -; -; CHECK-BMI-LABEL: sel_shift_bool_v4i32: -; CHECK-BMI: # %bb.0: -; CHECK-BMI-NEXT: pslld $31, %xmm0 -; CHECK-BMI-NEXT: psrad $31, %xmm0 -; CHECK-BMI-NEXT: pand {{.*}}(%rip), %xmm0 -; CHECK-BMI-NEXT: retq +; ANY-LABEL: sel_shift_bool_v4i32: +; ANY: # %bb.0: +; ANY-NEXT: pslld $31, %xmm0 +; ANY-NEXT: psrad $31, %xmm0 +; ANY-NEXT: pand {{.*}}(%rip), %xmm0 +; ANY-NEXT: retq %shl = select <4 x i1> %t, <4 x i32> , <4 x i32> zeroinitializer ret <4 x i32> %shl } define <2 x i64> @sel_shift_bool_v2i64(<2 x i1> %t) { -; CHECK-NOBMI-LABEL: sel_shift_bool_v2i64: -; CHECK-NOBMI: # %bb.0: -; CHECK-NOBMI-NEXT: psllq $63, %xmm0 -; CHECK-NOBMI-NEXT: psrad $31, %xmm0 -; CHECK-NOBMI-NEXT: pshufd 
{{.*#+}} xmm0 = xmm0[1,1,3,3] -; CHECK-NOBMI-NEXT: pand {{.*}}(%rip), %xmm0 -; CHECK-NOBMI-NEXT: retq -; -; CHECK-BMI-LABEL: sel_shift_bool_v2i64: -; CHECK-BMI: # %bb.0: -; CHECK-BMI-NEXT: psllq $63, %xmm0 -; CHECK-BMI-NEXT: psrad $31, %xmm0 -; CHECK-BMI-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] -; CHECK-BMI-NEXT: pand {{.*}}(%rip), %xmm0 -; CHECK-BMI-NEXT: retq +; ANY-LABEL: sel_shift_bool_v2i64: +; ANY: # %bb.0: +; ANY-NEXT: psllq $63, %xmm0 +; ANY-NEXT: psrad $31, %xmm0 +; ANY-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] +; ANY-NEXT: pand {{.*}}(%rip), %xmm0 +; ANY-NEXT: retq %shl = select <2 x i1> %t, <2 x i64> , <2 x i64> zeroinitializer ret <2 x i64> %shl } From llvm-commits at lists.llvm.org Thu Oct 10 12:57:31 2019 From: llvm-commits at lists.llvm.org (Nikita Popov via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 19:57:31 +0000 (UTC) Subject: [PATCH] D68811: [CVP] Remove a masking operation if range information implies it's a noop In-Reply-To: References: Message-ID: nikic added inline comments. ================ Comment at: lib/Transforms/Scalar/CorrelatedValuePropagation.cpp:715-716 + + ConstantRange LRange = LVI->getConstantRange(LHS, BB, BinOp); + if (!LRange.getUnsignedMax().ule(RHS->getValue())) + return false; ---------------- reames wrote: > reames wrote: > > lebedev.ri wrote: > > > Do we want to query constanrange, or use `getPredicateAt()`? > > > Different folds here take different routes. > > > Should this be: > > > ``` > > > if (LVI->getPredicateAt(ICmpInst::ICMP_ULE, LHS, RHS, SDI) != > > > LazyValueInfo::True) > > > return false; > > > ``` > > > ? (i don't know) > > In this case, I think they're basically equivalent in practice for this case. The getPredicateAt form is slightly more powerful as it does predicate pushback through one layer of predecessors, but skipping that is slightly faster compile time wise. > > > > Hm, having written that, I think I'll switch to the predicate form just for future proofing ad consistency. 
Update coming. > Correction: the getPredicateAt form does predicate pushback, but is weaker on conditions which are implied by edge local facts. (i.e. I tried to switch and the results are strictly worse.) This is arguably an LVI bug - one I remember from past occurrences now - but not something I'm going to fix just for this issue. Right, this is a longstanding LVI limitation. ================ Comment at: lib/Transforms/Scalar/CorrelatedValuePropagation.cpp:713 + ConstantInt *RHS = dyn_cast<ConstantInt>(BinOp->getOperand(1)); + if (!RHS || !RHS->getValue().isMask()) + return false; ---------------- The limit to masks seems a bit stronger than strictly necessary: the range only needs to be <= the trailing ones in RHS. That is, for `0b11001111`, a range `<= 0b00001111` is sufficient. Not sure if this is worth handling. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68811/new/ https://reviews.llvm.org/D68811 From llvm-commits at lists.llvm.org Thu Oct 10 12:59:07 2019 From: llvm-commits at lists.llvm.org (Nick Desaulniers via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 19:59:07 +0000 (UTC) Subject: [PATCH] D67986: [InstCombine] snprintf (d, size, "%s", s) -> memccpy (d, s, '\0', size - 1), d[size - 1] = 0 In-Reply-To: References: Message-ID: nickdesaulniers added a comment. In D67986#1702901, @MaskRay wrote: > This transformation seems to increase code size significantly. Is the snprintf "%s" pattern common enough? I suspect most projects have already used memccpy, stpncpy, strscpy, or strlcpy. For the few that don't, the performance probably does not matter. Sounds like then maybe this optimization should conditionally occur based on optimization level/goals? For instance, maybe it's not appropriate at `-Os`, but is at `-O2`?
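For context on the code-size question, the rewrite named in the D67986 title can be written out at the source level roughly as below. This is only a sketch of the equivalence, not the InstCombine implementation: the helper names are hypothetical, it assumes `size > 0`, it ignores the `snprintf` return value, and it relies on `memccpy`, which is POSIX rather than ISO C++.

```cpp
#include <cstddef>
#include <cstdio>
#include <string.h>  // memccpy is declared here on POSIX systems

// Hypothetical helper: what the original program wrote.
inline void copy_with_snprintf(char *d, std::size_t size, const char *s) {
  std::snprintf(d, size, "%s", s);
}

// Hypothetical helper: the replacement form from the patch title.
// Assumes size > 0, otherwise size - 1 would wrap around.
inline void copy_with_memccpy(char *d, std::size_t size, const char *s) {
  // Copy at most size - 1 bytes, stopping after the terminating NUL if it
  // is reached first.
  memccpy(d, s, '\0', size - 1);
  // Terminate explicitly; this covers the truncated case.
  d[size - 1] = '\0';
}

// Hypothetical checker: both forms leave the same C string in the buffer.
inline bool same_result(const char *s, std::size_t size) {
  char a[64];
  char b[64];
  if (size == 0 || size > sizeof a)
    return false;
  copy_with_snprintf(a, size, s);
  copy_with_memccpy(b, size, s);
  return strcmp(a, b) == 0;
}
```

The replacement is a library call plus a store where the original was a single call, which is presumably the kind of growth the code-size concern refers to.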
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67986/new/ https://reviews.llvm.org/D67986 From llvm-commits at lists.llvm.org Thu Oct 10 13:02:08 2019 From: llvm-commits at lists.llvm.org (Evandro Menezes via llvm-commits) Date: Thu, 10 Oct 2019 15:02:08 -0500 Subject: [llvm] r374243 - [InstCombine] Fix PR43617 In-Reply-To: References: <20191009220324.1570A80AFB@lists.llvm.org> Message-ID: <9f9614af-18ca-37b2-1a50-07e5dfa71557@samsung.com> Neat! -- Evandro Menezes On 10/9/19 6:07 PM, Craig Topper wrote: > You could just use the CallInst version of getIntrinsicID() which > already does the right thing > > ~Craig > > > On Wed, Oct 9, 2019 at 3:01 PM Evandro Menezes via llvm-commits > > wrote: > > Author: evandro > Date: Wed Oct  9 15:03:23 2019 > New Revision: 374243 > > URL: http://llvm.org/viewvc/llvm-project?rev=374243&view=rev > Log: > [InstCombine] Fix PR43617 > > Check for `nullptr` before inspecting composite function. > > Modified: >     llvm/trunk/lib/Transforms/Utils/SimplifyLibCalls.cpp > > Modified: llvm/trunk/lib/Transforms/Utils/SimplifyLibCalls.cpp > URL: > http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Utils/SimplifyLibCalls.cpp?rev=374243&r1=374242&r2=374243&view=diff > ============================================================================== > --- llvm/trunk/lib/Transforms/Utils/SimplifyLibCalls.cpp (original) > +++ llvm/trunk/lib/Transforms/Utils/SimplifyLibCalls.cpp Wed Oct  > 9 15:03:23 2019 > @@ -1916,10 +1916,10 @@ Value *LibCallSimplifier::optimizeLog(Ca >    B.setFastMathFlags(FastMathFlags::getFast()); > >    Function *ArgFn = Arg->getCalledFunction(); > -  StringRef ArgNm = ArgFn->getName(); > -  Intrinsic::ID ArgID = ArgFn->getIntrinsicID(); > +  Intrinsic::ID ArgID = > +      ArgFn ? 
ArgFn->getIntrinsicID() : Intrinsic::not_intrinsic; >    LibFunc ArgLb = NotLibFunc; > -  TLI->getLibFunc(ArgNm, ArgLb); > +  TLI->getLibFunc(Arg, ArgLb); > >    // log(pow(x,y)) -> y*log(x) >    if (ArgLb == PowLb || ArgID == Intrinsic::pow) { > @@ -1934,9 +1934,10 @@ Value *LibCallSimplifier::optimizeLog(Ca >      substituteInParent(Arg, MulY); >      return MulY; >    } > + >    // log(exp{,2,10}(y)) -> y*log({e,2,10}) >    // TODO: There is no exp10() intrinsic yet. > -  else if (ArgLb == ExpLb || ArgLb == Exp2Lb || ArgLb == Exp10Lb || > +  if (ArgLb == ExpLb || ArgLb == Exp2Lb || ArgLb == Exp10Lb || >             ArgID == Intrinsic::exp || ArgID == Intrinsic::exp2) { >      Constant *Eul; >      if (ArgLb == ExpLb || ArgID == Intrinsic::exp) > > > _______________________________________________ > llvm-commits mailing list > llvm-commits at lists.llvm.org > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits > -------------- next part -------------- An HTML attachment was scrubbed... URL: From llvm-commits at lists.llvm.org Thu Oct 10 13:07:09 2019 From: llvm-commits at lists.llvm.org (George Burgess IV via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 20:07:09 +0000 (UTC) Subject: [PATCH] D68809: [MemorySSA] Additional handling of unreachable blocks. In-Reply-To: References: Message-ID: <824cb5983e6525391383ebd03ddaf4ee@localhost.localdomain> george.burgess.iv accepted this revision. george.burgess.iv added a comment. This revision is now accepted and ready to land. Thanks! ================ Comment at: test/Analysis/MemorySSA/pr43426.ll:1 +; RUN: opt -licm -enable-mssa-loop-dependency -S %s | FileCheck %s +target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128" ---------------- `; REQUIRES: asserts`? 
Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68809/new/ https://reviews.llvm.org/D68809 From llvm-commits at lists.llvm.org Thu Oct 10 13:07:09 2019 From: llvm-commits at lists.llvm.org (Stanislav Mekhanoshin via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 20:07:09 +0000 (UTC) Subject: [PATCH] D68826: AMDGPU: Fix redundant setting of m0 for atomic load/store In-Reply-To: References: Message-ID: <3ff2de9f62cfc56a32adfac0a9687a04@localhost.localdomain> rampitec accepted this revision. rampitec added a comment. This revision is now accepted and ready to land. LGTM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68826/new/ https://reviews.llvm.org/D68826 From llvm-commits at lists.llvm.org Thu Oct 10 13:07:10 2019 From: llvm-commits at lists.llvm.org (Nick Desaulniers via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 20:07:10 +0000 (UTC) Subject: [PATCH] D68764: [ARM][AsmParser] handles offset expression in parentheses In-Reply-To: References: Message-ID: nickdesaulniers added a comment. Great test cases. Thanks for the patch! ================ Comment at: llvm/lib/Target/ARM/AsmParser/ARMAsmParser.cpp:5743-5744 Parser.getTok().is(AsmToken::Integer)) { if (Parser.getTok().isNot(AsmToken::Integer)) - Parser.Lex(); // Eat '#' or '$'. + if (!Parser.getTok().is(AsmToken::LParen)) + Parser.Lex(); // Eat '#' or '$' ---------------- Prefer: ``` if (foo && bar) baz(); ``` to: ``` if (foo) if (bar) baz(); ``` ================ Comment at: llvm/lib/Target/ARM/AsmParser/ARMAsmParser.cpp:5744 if (Parser.getTok().isNot(AsmToken::Integer)) - Parser.Lex(); // Eat '#' or '$'. + if (!Parser.getTok().is(AsmToken::LParen)) + Parser.Lex(); // Eat '#' or '$' ---------------- Looks like `Parser.getTok().isNot` is more readable than `!Parser.getTok().is()`. ================ Comment at: llvm/test/MC/ARM/gas-compl.s:4 +@ CHECK: ldr r12, [sp, #15] +.syntax unified + ldr r12, [sp, (15)] ---------------- An assembler directive should be set once. 
Resetting the syntax to unified has no effect. You can set it once in this test, then never again. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68764/new/ https://reviews.llvm.org/D68764 From llvm-commits at lists.llvm.org Thu Oct 10 13:16:17 2019 From: llvm-commits at lists.llvm.org (Reid Kleckner via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 20:16:17 +0000 (UTC) Subject: [PATCH] D66431: [PDB] Fix bug when using multiple PCH header objects with the same name. In-Reply-To: References: Message-ID: rnk accepted this revision. rnk added a comment. Go for it. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D66431/new/ https://reviews.llvm.org/D66431 From llvm-commits at lists.llvm.org Thu Oct 10 13:16:17 2019 From: llvm-commits at lists.llvm.org (David Greene via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 20:16:17 +0000 (UTC) Subject: [PATCH] D68793: [System Model] [TTI] Add TTI interfaces for write-combining buffers In-Reply-To: References: Message-ID: greened added a comment. > we might say something like: > > \return the number of write-combining buffers. A write-combining buffer is a per-core resource used for collecting writes to a particular cache line before further processing those writes using other parts of the memory subsystem. Will do. > we already have getCacheLineSize(), so we know how big that is, but we don't currently have a way to account for how many hardware threads per core, right? Don't we need that to estimate how many write-combining buffers we get for the current hardware thread? (Presumably, we'd want the same thing to use the total-cache-size functions too, because we need to generate code assuming a working-set size per thread?) This is something that will become available once more bits of the system model are implemented. The model can specify things like number of cores and threads per core. 
The subtarget will be able to examine its execution resource configuration and return an appropriate number. After this patch makes it through I will be in a place where I can start posting the TableGen changes to generate models and then post the TableGen model classes after that. At that point targets can define their own models and away we go. > The Intel optimization guide talks about using this number to drive loop distribution, where we don't update more arrays (cache lines) at a time than can fit into the thread's WC buffers. Is this what you had in mind? Yes. It's useful for anything that cares about performance of writes to memory. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68793/new/ https://reviews.llvm.org/D68793 From llvm-commits at lists.llvm.org Thu Oct 10 13:16:17 2019 From: llvm-commits at lists.llvm.org (Bardia Mahjour via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 20:16:17 +0000 (UTC) Subject: [PATCH] D68789: [LoopNext]: Analysis to discover properties of a loop nest. In-Reply-To: References: Message-ID: <910330133728e3e2edc867f62c386b82@localhost.localdomain> bmahjour added a comment. typo in title: [LoopNext] -> [LoopNest] Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68789/new/ https://reviews.llvm.org/D68789 From llvm-commits at lists.llvm.org Thu Oct 10 13:16:17 2019 From: llvm-commits at lists.llvm.org (Nick Desaulniers via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 20:16:17 +0000 (UTC) Subject: [PATCH] D68764: [ARM][AsmParser] handles offset expression in parentheses In-Reply-To: References: Message-ID: <258bd6f386c56fefbd0b995ed4832dcf@localhost.localdomain> nickdesaulniers added inline comments. ================ Comment at: llvm/test/MC/ARM/gas-compl.s:17 +.syntax unified + ldr r12, [sp, (15+5*5)] + ---------------- Does gas support multiple parens, i.e. `((15+5))`? Do we? 
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68764/new/ https://reviews.llvm.org/D68764 From llvm-commits at lists.llvm.org Thu Oct 10 13:16:17 2019 From: llvm-commits at lists.llvm.org (Manoj Gupta via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 20:16:17 +0000 (UTC) Subject: [PATCH] D68764: [ARM][AsmParser] handles offset expression in parentheses In-Reply-To: References: Message-ID: manojgupta added inline comments. ================ Comment at: llvm/test/MC/ARM/gas-compl.s:1 +@ RUN: llvm-mc -triple=arm < %s | FileCheck %s + ---------------- MaskRay wrote: > Change to a better test name? +1 for a better test name. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68764/new/ https://reviews.llvm.org/D68764 From llvm-commits at lists.llvm.org Thu Oct 10 13:16:18 2019 From: llvm-commits at lists.llvm.org (Bardia Mahjour via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 20:16:18 +0000 (UTC) Subject: [PATCH] D68827: [DDG] Data Dependence Graph - Pi Block Message-ID: bmahjour created this revision. bmahjour added reviewers: Meinersbur, fhahn, myhsu, xtian, dmgreen, kbarton, jdoerfert. bmahjour added a project: LLVM. This patch adds Pi Blocks to the DDG. A pi-block represents a group of DDG nodes that are part of a strongly-connected component of the graph. Replacing all the SCCs with pi-blocks results in an acyclic representation of the DDG. For example if we have: {a -> b}, {b -> c, d}, {c -> a} the cycle `a -> b -> c -> a` is abstracted into a pi-block "p" as follows: {p -> d} with "p" containing: {a -> b}, {b -> c}, {c -> a} In this implementation the edges between nodes that are part of the pi-block are preserved. The crossing edges (edges where one end of the edge is in the set of nodes belonging to an SCC and the other end is outside that set) are replaced with corresponding edges to/from the pi-block node instead. 
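Editor's sketch of the pi-block idea described above. It is not the code from D68827: the `Graph` alias, the `Condensed` struct and the `condense` function are hypothetical names, and Tarjan's algorithm is only one of several ways to find the strongly-connected components. The sketch groups nodes into components and keeps only the crossing edges, which is enough to show that the condensed graph is acyclic.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <set>
#include <utility>
#include <vector>

// Hypothetical illustration types; these are not the classes in DDG.h.
using Graph = std::vector<std::vector<int>>; // adjacency list, nodes 0..N-1

struct Condensed {
  std::vector<int> ComponentOf;             // node -> component id
  std::vector<std::vector<int>> Members;    // component id -> member nodes
  std::set<std::pair<int, int>> CrossEdges; // edges between distinct components
};

// Group nodes into strongly-connected components (Tarjan) and collect the
// edges that cross component boundaries. A component with more than one
// member plays the role of a pi-block.
Condensed condense(const Graph &G) {
  const int N = static_cast<int>(G.size());
  std::vector<int> Index(N, -1), Low(N, 0);
  std::vector<bool> OnStack(N, false);
  std::vector<int> Stack;
  Condensed C;
  C.ComponentOf.assign(N, -1);
  int Next = 0;

  std::function<void(int)> Visit = [&](int V) {
    Index[V] = Low[V] = Next++;
    Stack.push_back(V);
    OnStack[V] = true;
    for (int W : G[V]) {
      if (Index[W] == -1) {
        Visit(W);
        Low[V] = std::min(Low[V], Low[W]);
      } else if (OnStack[W]) {
        Low[V] = std::min(Low[V], Index[W]);
      }
    }
    if (Low[V] == Index[V]) { // V is the root of a component
      C.Members.emplace_back();
      int W;
      do {
        W = Stack.back();
        Stack.pop_back();
        OnStack[W] = false;
        C.ComponentOf[W] = static_cast<int>(C.Members.size()) - 1;
        C.Members.back().push_back(W);
      } while (W != V);
    }
  };

  for (int V = 0; V < N; ++V)
    if (Index[V] == -1)
      Visit(V);

  // Edges inside a component stay with its members; only edges whose two
  // ends lie in different components are redirected to the component nodes.
  for (int V = 0; V < N; ++V)
    for (int W : G[V])
      if (C.ComponentOf[V] != C.ComponentOf[W])
        C.CrossEdges.insert(
            std::make_pair(C.ComponentOf[V], C.ComponentOf[W]));
  return C;
}
```

With a=0, b=1, c=2, d=3 and the edges from the example (`{{1}, {2, 3}, {0}, {}}`), nodes a, b and c end up in one component and the only crossing edge runs from that component to d, matching `{p -> d}` in the description.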
Repository: rL LLVM https://reviews.llvm.org/D68827 Files: llvm/include/llvm/Analysis/DDG.h llvm/include/llvm/Analysis/DependenceGraphBuilder.h llvm/lib/Analysis/DDG.cpp llvm/lib/Analysis/DependenceGraphBuilder.cpp llvm/test/Analysis/DDG/basic-a.ll llvm/test/Analysis/DDG/basic-b.ll llvm/test/Analysis/DDG/basic-loopnest.ll llvm/test/Analysis/DDG/root-node.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D68827.224454.patch Type: text/x-patch Size: 59853 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Thu Oct 10 13:17:34 2019 From: llvm-commits at lists.llvm.org (Roman Lebedev via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 20:17:34 +0000 (UTC) Subject: [PATCH] D68811: [CVP] Remove a masking operation if range information implies it's a noop In-Reply-To: References: Message-ID: <4b6e52924a4aff5e5a3f880d162089e7@localhost.localdomain> lebedev.ri added a comment. Thanks, this looks ok to me, but i'll leave final review to others. ================ Comment at: lib/Transforms/Scalar/CorrelatedValuePropagation.cpp:713 + ConstantInt *RHS = dyn_cast(BinOp->getOperand(1)); + if (!RHS || !RHS->getValue().isMask()) + return false; ---------------- nikic wrote: > The limit to masks seems a bit stronger than strictly necessary: The range needs to be <= than the trailing ones in RHS. That is for `0b11001111`, if the range is `<= 0b00001111` that is sufficient. Not sure if this is worth handling. I'd say this is intentional to limit the number of `and`s we handle. 
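Editor's sketch of the two conditions discussed in this comment, written on plain `uint64_t` values rather than `APInt` and `ConstantRange`. All helper names here are hypothetical and `MaxLHS` stands for the unsigned maximum of the left operand's known range; this is an illustration of the reasoning, not the code in the patch.

```cpp
#include <cassert>
#include <cstdint>

// True if V is a contiguous run of low one bits, e.g. 0b00001111.
bool isLowBitMask(uint64_t V) { return V != 0 && ((V + 1) & V) == 0; }

// The low-order run of ones in V: 0b11001111 -> 0b00001111.
uint64_t trailingOnes(uint64_t V) { return V & ~(V + 1); }

// Strict form, as in the reviewed check: only constants that are pure
// low-bit masks are considered, and the `and` is a no-op when every
// possible left-hand value fits under the mask.
bool andIsNoopStrict(uint64_t MaxLHS, uint64_t RHS) {
  return isLowBitMask(RHS) && MaxLHS <= RHS;
}

// Relaxed form suggested by nikic: it suffices that every possible
// left-hand value fits under the trailing ones of RHS.
bool andIsNoopRelaxed(uint64_t MaxLHS, uint64_t RHS) {
  return MaxLHS <= trailingOnes(RHS);
}
```

For `RHS = 0b11001111` and a range whose maximum is `0b00001111`, the strict form rejects the constant because it is not a mask, while the relaxed form accepts it.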
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68811/new/ https://reviews.llvm.org/D68811 From llvm-commits at lists.llvm.org Thu Oct 10 13:22:53 2019 From: llvm-commits at lists.llvm.org (Jordan Rose via llvm-commits) Date: Thu, 10 Oct 2019 20:22:53 -0000 Subject: [llvm] r374440 - ADT: Save a word in every StringSet entry Message-ID: <20191010202253.7C957928C1@lists.llvm.org> Author: jrose Date: Thu Oct 10 13:22:53 2019 New Revision: 374440 URL: http://llvm.org/viewvc/llvm-project?rev=374440&view=rev Log: ADT: Save a word in every StringSet entry Add a specialization to StringMap (actually StringMapEntry) for a value type of NoneType (the type of llvm::None), and use it for StringSet. This'll save us a word from every entry in a StringSet, used for alignment with the size_t that stores the string length. I could have gone all the way to some kind of empty base class optimization, but that seemed like overkill. Someone can consider adding that in the future, though. https://reviews.llvm.org/D68586 Modified: llvm/trunk/include/llvm/ADT/StringMap.h llvm/trunk/include/llvm/ADT/StringSet.h llvm/trunk/include/llvm/IR/Metadata.h llvm/trunk/include/llvm/LTO/legacy/LTOCodeGenerator.h llvm/trunk/lib/LTO/LTOCodeGenerator.cpp Modified: llvm/trunk/include/llvm/ADT/StringMap.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/ADT/StringMap.h?rev=374440&r1=374439&r2=374440&view=diff ============================================================================== --- llvm/trunk/include/llvm/ADT/StringMap.h (original) +++ llvm/trunk/include/llvm/ADT/StringMap.h Thu Oct 10 13:22:53 2019 @@ -118,36 +118,59 @@ public: } }; -/// StringMapEntry - This is used to represent one value that is inserted into -/// a StringMap. It contains the Value itself and the key: the string length -/// and data. +/// StringMapEntryStorage - Holds the value in a StringMapEntry. +/// +/// Factored out into a separate base class to make it easier to specialize. 
+/// This is primarily intended to support StringSet, which doesn't need a value +/// stored at all. template -class StringMapEntry : public StringMapEntryBase { +class StringMapEntryStorage : public StringMapEntryBase { public: ValueTy second; - explicit StringMapEntry(size_t strLen) + explicit StringMapEntryStorage(size_t strLen) : StringMapEntryBase(strLen), second() {} template - StringMapEntry(size_t strLen, InitTy &&... InitVals) + StringMapEntryStorage(size_t strLen, InitTy &&... InitVals) : StringMapEntryBase(strLen), second(std::forward(InitVals)...) {} - StringMapEntry(StringMapEntry &E) = delete; - - StringRef getKey() const { - return StringRef(getKeyData(), getKeyLength()); - } + StringMapEntryStorage(StringMapEntryStorage &E) = delete; const ValueTy &getValue() const { return second; } ValueTy &getValue() { return second; } void setValue(const ValueTy &V) { second = V; } +}; + +template<> +class StringMapEntryStorage : public StringMapEntryBase { +public: + explicit StringMapEntryStorage(size_t strLen, NoneType none = None) + : StringMapEntryBase(strLen) {} + StringMapEntryStorage(StringMapEntryStorage &E) = delete; + + NoneType getValue() const { return None; } +}; + +/// StringMapEntry - This is used to represent one value that is inserted into +/// a StringMap. It contains the Value itself and the key: the string length +/// and data. +template +class StringMapEntry final : public StringMapEntryStorage { +public: + using StringMapEntryStorage::StringMapEntryStorage; + + StringRef getKey() const { + return StringRef(getKeyData(), this->getKeyLength()); + } /// getKeyData - Return the start of the string data that is the key for this /// value. The string data is always stored immediately after the /// StringMapEntry object. 
const char *getKeyData() const {return reinterpret_cast(this+1);} - StringRef first() const { return StringRef(getKeyData(), getKeyLength()); } + StringRef first() const { + return StringRef(getKeyData(), this->getKeyLength()); + } /// Create a StringMapEntry for the specified key construct the value using /// \p InitiVals. @@ -199,7 +222,7 @@ public: template void Destroy(AllocatorTy &Allocator) { // Free memory referenced by the item. - size_t AllocSize = sizeof(StringMapEntry) + getKeyLength() + 1; + size_t AllocSize = sizeof(StringMapEntry) + this->getKeyLength() + 1; this->~StringMapEntry(); Allocator.Deallocate(static_cast(this), AllocSize); } Modified: llvm/trunk/include/llvm/ADT/StringSet.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/ADT/StringSet.h?rev=374440&r1=374439&r2=374440&view=diff ============================================================================== --- llvm/trunk/include/llvm/ADT/StringSet.h (original) +++ llvm/trunk/include/llvm/ADT/StringSet.h Thu Oct 10 13:22:53 2019 @@ -24,8 +24,8 @@ namespace llvm { /// StringSet - A wrapper for StringMap that provides set-like functionality. 
template - class StringSet : public StringMap { - using base = StringMap; + class StringSet : public StringMap { + using base = StringMap; public: StringSet() = default; @@ -37,13 +37,13 @@ namespace llvm { std::pair insert(StringRef Key) { assert(!Key.empty()); - return base::insert(std::make_pair(Key, '\0')); + return base::insert(std::make_pair(Key, None)); } template void insert(const InputIt &Begin, const InputIt &End) { for (auto It = Begin; It != End; ++It) - base::insert(std::make_pair(*It, '\0')); + base::insert(std::make_pair(*It, None)); } template Modified: llvm/trunk/include/llvm/IR/Metadata.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/IR/Metadata.h?rev=374440&r1=374439&r2=374440&view=diff ============================================================================== --- llvm/trunk/include/llvm/IR/Metadata.h (original) +++ llvm/trunk/include/llvm/IR/Metadata.h Thu Oct 10 13:22:53 2019 @@ -601,7 +601,7 @@ dyn_extract_or_null(Y &&MD) { /// These are used to efficiently contain a byte sequence for metadata. /// MDString is always unnamed. class MDString : public Metadata { - friend class StringMapEntry; + friend class StringMapEntryStorage; StringMapEntry *Entry = nullptr; Modified: llvm/trunk/include/llvm/LTO/legacy/LTOCodeGenerator.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/LTO/legacy/LTOCodeGenerator.h?rev=374440&r1=374439&r2=374440&view=diff ============================================================================== --- llvm/trunk/include/llvm/LTO/legacy/LTOCodeGenerator.h (original) +++ llvm/trunk/include/llvm/LTO/legacy/LTOCodeGenerator.h Thu Oct 10 13:22:53 2019 @@ -113,7 +113,7 @@ struct LTOCodeGenerator { ShouldRestoreGlobalsLinkage = Value; } - void addMustPreserveSymbol(StringRef Sym) { MustPreserveSymbols[Sym] = 1; } + void addMustPreserveSymbol(StringRef Sym) { MustPreserveSymbols.insert(Sym); } /// Pass options to the driver and optimization passes. 
/// Modified: llvm/trunk/lib/LTO/LTOCodeGenerator.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/LTO/LTOCodeGenerator.cpp?rev=374440&r1=374439&r2=374440&view=diff ============================================================================== --- llvm/trunk/lib/LTO/LTOCodeGenerator.cpp (original) +++ llvm/trunk/lib/LTO/LTOCodeGenerator.cpp Thu Oct 10 13:22:53 2019 @@ -151,7 +151,7 @@ void LTOCodeGenerator::initializeLTOPass void LTOCodeGenerator::setAsmUndefinedRefs(LTOModule *Mod) { const std::vector &undefs = Mod->getAsmUndefinedRefs(); for (int i = 0, e = undefs.size(); i != e; ++i) - AsmUndefinedRefs[undefs[i]] = 1; + AsmUndefinedRefs.insert(undefs[i]); } bool LTOCodeGenerator::addModule(LTOModule *Mod) { From llvm-commits at lists.llvm.org Thu Oct 10 13:23:28 2019 From: llvm-commits at lists.llvm.org (Julian Lettner via llvm-commits) Date: Thu, 10 Oct 2019 20:23:28 -0000 Subject: [llvm] r374441 - [lit] Add comment explaining the LIT_OPTS env var overrides command line options Message-ID: <20191010202328.5ABB291D11@lists.llvm.org> Author: yln Date: Thu Oct 10 13:23:28 2019 New Revision: 374441 URL: http://llvm.org/viewvc/llvm-project?rev=374441&view=rev Log: [lit] Add comment explaining the LIT_OPTS env var overrides command line options Normally, command line options override environment variables. Add comment to state that we are doing the reverse on purpose. 
Modified: llvm/trunk/utils/lit/lit/cl_arguments.py Modified: llvm/trunk/utils/lit/lit/cl_arguments.py URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/lit/cl_arguments.py?rev=374441&r1=374440&r2=374441&view=diff ============================================================================== --- llvm/trunk/utils/lit/lit/cl_arguments.py (original) +++ llvm/trunk/utils/lit/lit/cl_arguments.py Thu Oct 10 13:23:28 2019 @@ -183,8 +183,10 @@ def parse_args(): action="store_true", default=False) - opts = parser.parse_args(sys.argv[1:] + - shlex.split(os.environ.get("LIT_OPTS", ""))) + # LIT is special: environment variables override command line arguments. + env_args = shlex.split(os.environ.get("LIT_OPTS", "")) + args = sys.argv[1:] + env_args + opts = parser.parse_args(args) # Validate command line options if opts.echoAllCommands: From llvm-commits at lists.llvm.org Thu Oct 10 13:25:51 2019 From: llvm-commits at lists.llvm.org (Zachary Turner via llvm-commits) Date: Thu, 10 Oct 2019 20:25:51 -0000 Subject: [lld] r374442 - [PDB] Fix bug when using multiple PCH header objects with the same name. Message-ID: <20191010202551.43980928C1@lists.llvm.org> Author: zturner Date: Thu Oct 10 13:25:51 2019 New Revision: 374442 URL: http://llvm.org/viewvc/llvm-project?rev=374442&view=rev Log: [PDB] Fix bug when using multiple PCH header objects with the same name. A common pattern in Windows is to have all your precompiled headers use an object named stdafx.obj. If you've got a project with many different static libs, you might use a separate PCH for each one of these. During the final link step, a file from A might reference the PCH object from A, but it will have the same name (stdafx.obj) as any other PCH from another project. The only difference will be the path. For example, A might be A/stdafx.obj while B is B/stdafx.obj. 
The existing algorithm checks only the filename that was passed on the command line (or stored in archive), but this is insufficient in the case where relative paths are used, because depending on the command line object file / library order, it might find the wrong PCH object first resulting in a signature mismatch. The fix here is to simply check whether the absolute path of the PCH object (which is stored in the input obj file for the file that references the PCH) *ends with* the full relative path of whatever is specified on the command line (or is in the archive). Differential Revision: https://reviews.llvm.org/D66431 Added: lld/trunk/test/COFF/Inputs/precompa/ lld/trunk/test/COFF/Inputs/precompa/precomp.obj (with props) lld/trunk/test/COFF/Inputs/precompa/useprecomp.obj (with props) lld/trunk/test/COFF/Inputs/precompb/ lld/trunk/test/COFF/Inputs/precompb/precomp.obj (with props) lld/trunk/test/COFF/Inputs/precompb/useprecomp.obj (with props) lld/trunk/test/COFF/precomp-link-samename.test Modified: lld/trunk/COFF/PDB.cpp lld/trunk/test/COFF/precomp-link.test Modified: lld/trunk/COFF/PDB.cpp URL: http://llvm.org/viewvc/llvm-project/lld/trunk/COFF/PDB.cpp?rev=374442&r1=374441&r2=374442&view=diff ============================================================================== --- lld/trunk/COFF/PDB.cpp (original) +++ lld/trunk/COFF/PDB.cpp Thu Oct 10 13:25:51 2019 @@ -514,16 +514,15 @@ static bool equals_path(StringRef path1, return path1.equals(path2); #endif } - // Find by name an OBJ provided on the command line -static ObjFile *findObjByName(StringRef fileNameOnly) { - SmallString<128> currentPath; - +static ObjFile *findObjWithPrecompSignature(StringRef fileNameOnly, + uint32_t precompSignature) { for (ObjFile *f : ObjFile::instances) { StringRef currentFileName = sys::path::filename(f->getName()); - // Compare based solely on the file name (link.exe behavior) - if (equals_path(currentFileName, fileNameOnly)) + if (f->pchSignature.hasValue() && + 
f->pchSignature.getValue() == precompSignature && + equals_path(fileNameOnly, currentFileName)) return f; } return nullptr; @@ -560,22 +559,15 @@ Expected PDBLinker:: // link.exe requires that a precompiled headers object must always be provided // on the command-line, even if that's not necessary. - auto precompFile = findObjByName(precompFileName); + auto precompFile = + findObjWithPrecompSignature(precompFileName, precomp.Signature); if (!precompFile) return createFileError( - precompFileName.str(), - make_error(pdb::pdb_error_code::external_cmdline_ref)); + precomp.getPrecompFilePath().str(), + make_error(pdb::pdb_error_code::no_matching_pch)); addObjFile(precompFile, &indexMap); - if (!precompFile->pchSignature) - fatal(precompFile->getName() + " is not a precompiled headers object"); - - if (precomp.getSignature() != precompFile->pchSignature.getValueOr(0)) - return createFileError( - precomp.getPrecompFilePath().str(), - make_error(pdb::pdb_error_code::signature_out_of_date)); - return indexMap; } Added: lld/trunk/test/COFF/Inputs/precompa/precomp.obj URL: http://llvm.org/viewvc/llvm-project/lld/trunk/test/COFF/Inputs/precompa/precomp.obj?rev=374442&view=auto ============================================================================== Binary file - no diff available. Propchange: lld/trunk/test/COFF/Inputs/precompa/precomp.obj ------------------------------------------------------------------------------ svn:mime-type = application/octet-stream Added: lld/trunk/test/COFF/Inputs/precompa/useprecomp.obj URL: http://llvm.org/viewvc/llvm-project/lld/trunk/test/COFF/Inputs/precompa/useprecomp.obj?rev=374442&view=auto ============================================================================== Binary file - no diff available. 
Propchange: lld/trunk/test/COFF/Inputs/precompa/useprecomp.obj ------------------------------------------------------------------------------ svn:mime-type = application/octet-stream Added: lld/trunk/test/COFF/Inputs/precompb/precomp.obj URL: http://llvm.org/viewvc/llvm-project/lld/trunk/test/COFF/Inputs/precompb/precomp.obj?rev=374442&view=auto ============================================================================== Binary file - no diff available. Propchange: lld/trunk/test/COFF/Inputs/precompb/precomp.obj ------------------------------------------------------------------------------ svn:mime-type = application/octet-stream Added: lld/trunk/test/COFF/Inputs/precompb/useprecomp.obj URL: http://llvm.org/viewvc/llvm-project/lld/trunk/test/COFF/Inputs/precompb/useprecomp.obj?rev=374442&view=auto ============================================================================== Binary file - no diff available. Propchange: lld/trunk/test/COFF/Inputs/precompb/useprecomp.obj ------------------------------------------------------------------------------ svn:mime-type = application/octet-stream Added: lld/trunk/test/COFF/precomp-link-samename.test URL: http://llvm.org/viewvc/llvm-project/lld/trunk/test/COFF/precomp-link-samename.test?rev=374442&view=auto ============================================================================== --- lld/trunk/test/COFF/precomp-link-samename.test (added) +++ lld/trunk/test/COFF/precomp-link-samename.test Thu Oct 10 13:25:51 2019 @@ -0,0 +1,36 @@ +RUN: lld-link %S/Inputs/precompb/useprecomp.obj %S/Inputs/precompa/precomp.obj %S/Inputs/precompb/precomp.obj \ +RUN: %S/Inputs/precompa/useprecomp.obj /nodefaultlib /entry:main /debug /pdb:%t.pdb /out:%t.exe \ +RUN: /summary | FileCheck %s -check-prefix SUMMARY + +RUN: llvm-pdbutil dump -types %t.pdb | FileCheck %s + + +CHECK: Types (TPI Stream) +CHECK-NOT: LF_PRECOMP +CHECK-NOT: LF_ENDPRECOMP + + +SUMMARY: Summary +SUMMARY-NEXT: 
-------------------------------------------------------------------------------- +SUMMARY-NEXT: 4 Input OBJ files (expanded from all cmd-line inputs) +SUMMARY-NEXT: 0 PDB type server dependencies +SUMMARY-NEXT: 2 Precomp OBJ dependencies + +// precompa/precomp.cpp +#include "precomp.h" + +// precompa/useprecomp.cpp +#include "precomp.h" +int main(int argc, char **argv) { return 0; } + +// precompa/precomp.h +int precompa_symbol = 42; + +// precompb/precomp.cpp +#include "precomp.h" + +// precompb/useprecomp.cpp +#include "precomp.h" + +// precompb/precomp.h +int precompb_symbol = 142; \ No newline at end of file Modified: lld/trunk/test/COFF/precomp-link.test URL: http://llvm.org/viewvc/llvm-project/lld/trunk/test/COFF/precomp-link.test?rev=374442&r1=374441&r2=374442&view=diff ============================================================================== --- lld/trunk/test/COFF/precomp-link.test (original) +++ lld/trunk/test/COFF/precomp-link.test Thu Oct 10 13:25:51 2019 @@ -9,10 +9,10 @@ RUN: lld-link %S/Inputs/precomp-a.obj %S RUN: not lld-link %S/Inputs/precomp-a.obj %S/Inputs/precomp-b.obj /nodefaultlib /entry:main /debug /pdb:%t.pdb /out:%t.exe /opt:ref /opt:icf 2>&1 | FileCheck %s -check-prefix FAILURE-MISSING-PRECOMPOBJ FAILURE: warning: Cannot use debug info for '{{.*}}precomp-invalid.obj' [LNK4099] -FAILURE-NEXT: failed to load reference '{{.*}}precomp.obj': The signature does not match; the file(s) might be out of date. +FAILURE-NEXT: failed to load reference '{{.*}}precomp.obj': No matching precompiled header could be located. FAILURE-MISSING-PRECOMPOBJ: warning: Cannot use debug info for '{{.*}}precomp-a.obj' [LNK4099] -FAILURE-MISSING-PRECOMPOBJ-NEXT: failed to load reference '{{.*}}precomp.obj': The path to this file must be provided on the command-line +FAILURE-MISSING-PRECOMPOBJ-NEXT: failed to load reference '{{.*}}precomp.obj': No matching precompiled header could be located. 
CHECK: Types (TPI Stream) CHECK-NOT: LF_PRECOMP From llvm-commits at lists.llvm.org Thu Oct 10 13:25:51 2019 From: llvm-commits at lists.llvm.org (Zachary Turner via llvm-commits) Date: Thu, 10 Oct 2019 20:25:51 -0000 Subject: [llvm] r374442 - [PDB] Fix bug when using multiple PCH header objects with the same name. Message-ID: <20191010202551.48CAB928D9@lists.llvm.org> Author: zturner Date: Thu Oct 10 13:25:51 2019 New Revision: 374442 URL: http://llvm.org/viewvc/llvm-project?rev=374442&view=rev Log: [PDB] Fix bug when using multiple PCH header objects with the same name. A common pattern in Windows is to have all your precompiled headers use an object named stdafx.obj. If you've got a project with many different static libs, you might use a separate PCH for each one of these. During the final link step, a file from A might reference the PCH object from A, but it will have the same name (stdafx.obj) as any other PCH from another project. The only difference will be the path. For example, A might be A/stdafx.obj while B is B/stdafx.obj. The existing algorithm checks only the filename that was passed on the command line (or stored in archive), but this is insufficient in the case where relative paths are used, because depending on the command line object file / library order, it might find the wrong PCH object first resulting in a signature mismatch. The fix here is to simply check whether the absolute path of the PCH object (which is stored in the input obj file for the file that references the PCH) *ends with* the full relative path of whatever is specified on the command line (or is in the archive). 
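Editor's sketch of the "ends with" comparison the log message describes. Note that the PDB.cpp diff in the lld half of this commit matches on the PCH signature plus the file name, so the code below only illustrates the idea in the log text. `normalizePath` and `pathEndsWith` are hypothetical helpers, and the case-insensitive, separator-agnostic comparison is an assumption modelled on Windows path handling.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper: lower-case the path and use '/' as the only separator.
std::string normalizePath(std::string P) {
  std::transform(P.begin(), P.end(), P.begin(), [](unsigned char C) {
    return C == '\\' ? '/' : static_cast<char>(std::tolower(C));
  });
  return P;
}

// True if the absolute path Full ends with the relative path Suffix,
// where the match must start at a path-component boundary.
bool pathEndsWith(const std::string &Full, const std::string &Suffix) {
  std::string F = normalizePath(Full), S = normalizePath(Suffix);
  if (S.empty() || S.size() > F.size())
    return false;
  if (F.compare(F.size() - S.size(), S.size(), S) != 0)
    return false;
  // Reject partial component matches such as "AA/stdafx.obj" vs "A/stdafx.obj".
  return S.size() == F.size() || F[F.size() - S.size() - 1] == '/';
}
```

Under these assumptions `C:\src\A\stdafx.obj` matches the command-line path `A/stdafx.obj` but not `B/stdafx.obj`, which is the distinction that a file-name-only comparison cannot make.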
Differential Revision: https://reviews.llvm.org/D66431 Modified: llvm/trunk/include/llvm/DebugInfo/PDB/GenericError.h llvm/trunk/lib/DebugInfo/PDB/GenericError.cpp Modified: llvm/trunk/include/llvm/DebugInfo/PDB/GenericError.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/DebugInfo/PDB/GenericError.h?rev=374442&r1=374441&r2=374442&view=diff ============================================================================== --- llvm/trunk/include/llvm/DebugInfo/PDB/GenericError.h (original) +++ llvm/trunk/include/llvm/DebugInfo/PDB/GenericError.h Thu Oct 10 13:25:51 2019 @@ -20,7 +20,7 @@ enum class pdb_error_code { dia_sdk_not_present, dia_failed_loading, signature_out_of_date, - external_cmdline_ref, + no_matching_pch, unspecified, }; } // namespace pdb Modified: llvm/trunk/lib/DebugInfo/PDB/GenericError.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/DebugInfo/PDB/GenericError.cpp?rev=374442&r1=374441&r2=374442&view=diff ============================================================================== --- llvm/trunk/lib/DebugInfo/PDB/GenericError.cpp (original) +++ llvm/trunk/lib/DebugInfo/PDB/GenericError.cpp Thu Oct 10 13:25:51 2019 @@ -34,8 +34,8 @@ public: return "The PDB file path is an invalid UTF8 sequence."; case pdb_error_code::signature_out_of_date: return "The signature does not match; the file(s) might be out of date."; - case pdb_error_code::external_cmdline_ref: - return "The path to this file must be provided on the command-line."; + case pdb_error_code::no_matching_pch: + return "No matching precompiled header could be located."; } llvm_unreachable("Unrecognized generic_error_code"); } From llvm-commits at lists.llvm.org Thu Oct 10 13:25:52 2019 From: llvm-commits at lists.llvm.org (Jordan Rose via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 20:25:52 +0000 (UTC) Subject: [PATCH] D68586: Save a word in every StringSet entry In-Reply-To: References: Message-ID: <0025d4a2a8a70da673fbab9a68e691ac@localhost.localdomain> 
jordan_rose added a comment. In D68586#1704530 , @dblaikie wrote: > Any idea why MDString is friending an implementation detail like this? Should it be? Could we make it an actual private implementation detail so people can't do this? It's funky but I think reasonable: MDString is a move-none type since there will be direct pointers to it, and the canonical instance lives in the StringMap. Disallowing this friending would mean making MDString move-only and just assuming it'll never be misused. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68586/new/ https://reviews.llvm.org/D68586 From llvm-commits at lists.llvm.org Thu Oct 10 13:25:53 2019 From: llvm-commits at lists.llvm.org (Julian Lettner via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 20:25:53 +0000 (UTC) Subject: [PATCH] D64135: [lit] Parse command-line options from LIT_OPTS In-Reply-To: References: Message-ID: yln added a comment. Herald added a reviewer: jdoerfert. Hi Joel @jdenny, I would like to ask you to reconsider whether or not it is a good idea that the env var overrides the command line options. I understand the desire for this so one can reinvoke `env LIT_OPTS=... ninja check-...` and be certain that those options are going to be used. My arguments against it are mostly for consistency and to avoid surprising non-standard behavior: - It is not "standard". The default is that CL options override ENV vars [1]. - It is inconsistent with the other env vars that configure options, e.g., `LIT_FILTER` and `LIT_*_SHARD`. - Most of the time we want to supplement---not override---existing options. One more ask independent of what we decide: the `lit-opts.py` test does not go red when we change the override order. Can you adapt the test to give a signal in this case? env_args = shlex.split(os.environ.get("LIT_OPTS", "")) args = sys.argv[1:] + env_args # lit-opts.py does not go red when we flip the order of concatenation here. Is this expected? 
opts = parser.parse_args(args) Also: thank you for adding and documenting this feature and even providing a test. I am already using it and find it useful! [1] https://stackoverflow.com/a/11077282/271968 Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D64135/new/ https://reviews.llvm.org/D64135 From llvm-commits at lists.llvm.org Thu Oct 10 13:25:53 2019 From: llvm-commits at lists.llvm.org (Reid Kleckner via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 20:25:53 +0000 (UTC) Subject: [PATCH] D68772: [COFF] Wrap things in namespace lld { namespace coff { In-Reply-To: References: Message-ID: <88979373abc763a9ba578e3ed884ccef@localhost.localdomain> rnk added inline comments. ================ Comment at: lld/COFF/DebugTypes.cpp:213-214 // moved here. -Expected -lld::coff::findTypeServerSource(const ObjFile *f) { Expected ts = TypeServerSource::findFromFile(f); ---------------- I prefer this style for free functions because it makes it a hard error if there's a mismatch between the header and the cpp file. It's a pretty simple style rule: every function implemented in a .cpp file should either be qualified with a class or namespace name, or it should be marked static. Then you never have to worry about what the active namespace is outside of headers. That's just my personal preference and it's not in CodingStandards, but given how much we use free functions in LLD and LLVM, it's kind of nice. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68772/new/ https://reviews.llvm.org/D68772 From llvm-commits at lists.llvm.org Thu Oct 10 13:25:54 2019 From: llvm-commits at lists.llvm.org (Jordan Rose via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 20:25:54 +0000 (UTC) Subject: [PATCH] D68586: Save a word in every StringSet entry In-Reply-To: References: Message-ID: <742cf1cb2a41cab9a7ce3ea0506d68a2@localhost.localdomain> jordan_rose closed this revision. jordan_rose added a comment. 
Committed in rL374440 . I split the difference and put the EBO comment in the commit message. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68586/new/ https://reviews.llvm.org/D68586 From llvm-commits at lists.llvm.org Thu Oct 10 13:25:54 2019 From: llvm-commits at lists.llvm.org (David Greene via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 20:25:54 +0000 (UTC) Subject: [PATCH] D68819: [Utils] Allow update_test_checks to check function arguments In-Reply-To: References: Message-ID: greened added a comment. Does this subsume the goal of D68153 ? If so I am happy to abandon that revision. D68153 attempts to solve the problem of a `CHECK-LABEL` matching a function call instead of the start of a function definition. It looks like with `--function-signature` the `CHECK-LABEL` will include the arguments in the label pattern which should be enough to disambiguate it from a call to the function. Do I have that right? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68819/new/ https://reviews.llvm.org/D68819 From llvm-commits at lists.llvm.org Thu Oct 10 13:29:12 2019 From: llvm-commits at lists.llvm.org (Greg Clayton via llvm-commits) Date: Thu, 10 Oct 2019 20:29:12 -0000 Subject: [llvm] r374445 - Fix a documentation warning from GSYM commit. Message-ID: <20191010202912.2BFD59293E@lists.llvm.org> Author: gclayton Date: Thu Oct 10 13:29:11 2019 New Revision: 374445 URL: http://llvm.org/viewvc/llvm-project?rev=374445&view=rev Log: Fix a documentation warning from GSYM commit. 
Modified: llvm/trunk/include/llvm/DebugInfo/GSYM/GsymCreator.h Modified: llvm/trunk/include/llvm/DebugInfo/GSYM/GsymCreator.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/DebugInfo/GSYM/GsymCreator.h?rev=374445&r1=374444&r2=374445&view=diff ============================================================================== --- llvm/trunk/include/llvm/DebugInfo/GSYM/GsymCreator.h (original) +++ llvm/trunk/include/llvm/DebugInfo/GSYM/GsymCreator.h Thu Oct 10 13:29:11 2019 @@ -178,7 +178,7 @@ public: /// \param Style The path style for the "Path" parameter. /// \returns The unique file index for the inserted file. uint32_t insertFile(StringRef Path, - llvm::sys::path::Style = llvm::sys::path::Style::native); + sys::path::Style Style = sys::path::Style::native); /// Add a function info to this GSYM creator. /// From llvm-commits at lists.llvm.org Thu Oct 10 13:27:16 2019 From: llvm-commits at lists.llvm.org (Zachary Turner via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 20:27:16 +0000 (UTC) Subject: [PATCH] D66431: [PDB] Fix bug when using multiple PCH header objects with the same name. In-Reply-To: References: Message-ID: This revision was automatically updated to reflect the committed changes. Closed by commit rG02c53868116d: [PDB] Fix bug when using multiple PCH header objects with the same name. (authored by zturner). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D66431/new/ https://reviews.llvm.org/D66431 Files: lld/COFF/PDB.cpp lld/test/COFF/Inputs/precompa/precomp.obj lld/test/COFF/Inputs/precompa/useprecomp.obj lld/test/COFF/Inputs/precompb/precomp.obj lld/test/COFF/Inputs/precompb/useprecomp.obj lld/test/COFF/precomp-link-samename.test lld/test/COFF/precomp-link.test llvm/include/llvm/DebugInfo/PDB/GenericError.h llvm/lib/DebugInfo/PDB/GenericError.cpp -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D66431.224458.patch Type: text/x-patch Size: 5510 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Thu Oct 10 13:39:27 2019 From: llvm-commits at lists.llvm.org (David Greene via llvm-commits) Date: Thu, 10 Oct 2019 20:39:27 -0000 Subject: [llvm] r374446 - [System Model] [TTI] Move default cache/prefetch implementations Message-ID: <20191010203927.2FAD792903@lists.llvm.org> Author: greened Date: Thu Oct 10 13:39:27 2019 New Revision: 374446 URL: http://llvm.org/viewvc/llvm-project?rev=374446&view=rev Log: [System Model] [TTI] Move default cache/prefetch implementations Move the default implementations of cache and prefetch queries to TargetTransformInfoImplBase and delete them from NoTIIImpl. This brings these interfaces in line with how other TTI interfaces work. Differential Revision: https://reviews.llvm.org/D68804 Modified: llvm/trunk/include/llvm/Analysis/TargetTransformInfoImpl.h llvm/trunk/include/llvm/CodeGen/BasicTTIImpl.h llvm/trunk/lib/Analysis/TargetTransformInfo.cpp Modified: llvm/trunk/include/llvm/Analysis/TargetTransformInfoImpl.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Analysis/TargetTransformInfoImpl.h?rev=374446&r1=374445&r2=374446&view=diff ============================================================================== --- llvm/trunk/include/llvm/Analysis/TargetTransformInfoImpl.h (original) +++ llvm/trunk/include/llvm/Analysis/TargetTransformInfoImpl.h Thu Oct 10 13:39:27 2019 @@ -371,6 +371,34 @@ public: return false; } + unsigned getCacheLineSize() const { return 0; } + + llvm::Optional getCacheSize(TargetTransformInfo::CacheLevel Level) const { + switch (Level) { + case TargetTransformInfo::CacheLevel::L1D: + LLVM_FALLTHROUGH; + case TargetTransformInfo::CacheLevel::L2D: + return llvm::Optional(); + } + llvm_unreachable("Unknown TargetTransformInfo::CacheLevel"); + } + + llvm::Optional getCacheAssociativity( + TargetTransformInfo::CacheLevel Level) const { + switch (Level) { + case 
TargetTransformInfo::CacheLevel::L1D: + LLVM_FALLTHROUGH; + case TargetTransformInfo::CacheLevel::L2D: + return llvm::Optional(); + } + + llvm_unreachable("Unknown TargetTransformInfo::CacheLevel"); + } + + unsigned getPrefetchDistance() const { return 0; } + unsigned getMinPrefetchStride() const { return 1; } + unsigned getMaxPrefetchIterationsAhead() const { return UINT_MAX; } + unsigned getMaxInterleaveFactor(unsigned VF) { return 1; } unsigned getArithmeticInstrCost(unsigned Opcode, Type *Ty, Modified: llvm/trunk/include/llvm/CodeGen/BasicTTIImpl.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/CodeGen/BasicTTIImpl.h?rev=374446&r1=374445&r2=374446&view=diff ============================================================================== --- llvm/trunk/include/llvm/CodeGen/BasicTTIImpl.h (original) +++ llvm/trunk/include/llvm/CodeGen/BasicTTIImpl.h Thu Oct 10 13:39:27 2019 @@ -523,8 +523,13 @@ public: virtual Optional getCacheAssociativity(TargetTransformInfo::CacheLevel Level) const { - return Optional( - getST()->getCacheAssociativity(static_cast(Level))); + Optional TargetResult = + getST()->getCacheAssociativity(static_cast(Level)); + + if (TargetResult) + return TargetResult; + + return BaseT::getCacheAssociativity(Level); } virtual unsigned getCacheLineSize() const { Modified: llvm/trunk/lib/Analysis/TargetTransformInfo.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/TargetTransformInfo.cpp?rev=374446&r1=374445&r2=374446&view=diff ============================================================================== --- llvm/trunk/lib/Analysis/TargetTransformInfo.cpp (original) +++ llvm/trunk/lib/Analysis/TargetTransformInfo.cpp Thu Oct 10 13:39:27 2019 @@ -40,34 +40,6 @@ namespace { struct NoTTIImpl : TargetTransformInfoImplCRTPBase { explicit NoTTIImpl(const DataLayout &DL) : TargetTransformInfoImplCRTPBase(DL) {} - - unsigned getCacheLineSize() const { return 0; } - - llvm::Optional 
getCacheSize(TargetTransformInfo::CacheLevel Level) const { - switch (Level) { - case TargetTransformInfo::CacheLevel::L1D: - LLVM_FALLTHROUGH; - case TargetTransformInfo::CacheLevel::L2D: - return llvm::Optional(); - } - llvm_unreachable("Unknown TargetTransformInfo::CacheLevel"); - } - - llvm::Optional getCacheAssociativity( - TargetTransformInfo::CacheLevel Level) const { - switch (Level) { - case TargetTransformInfo::CacheLevel::L1D: - LLVM_FALLTHROUGH; - case TargetTransformInfo::CacheLevel::L2D: - return llvm::Optional(); - } - - llvm_unreachable("Unknown TargetTransformInfo::CacheLevel"); - } - - unsigned getPrefetchDistance() const { return 0; } - unsigned getMinPrefetchStride() const { return 1; } - unsigned getMaxPrefetchIterationsAhead() const { return UINT_MAX; } }; } From llvm-commits at lists.llvm.org Thu Oct 10 13:37:04 2019 From: llvm-commits at lists.llvm.org (Alina Sbirlea via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 20:37:04 +0000 (UTC) Subject: [PATCH] D68809: [MemorySSA] Additional handling of unreachable blocks. In-Reply-To: References: Message-ID: <45214ccdfc1fa330d90802a4dc2079a1@localhost.localdomain> asbirlea marked 2 inline comments as done. asbirlea added a comment. Thank you for the review! ================ Comment at: test/Analysis/MemorySSA/pr43426.ll:1 +; RUN: opt -licm -enable-mssa-loop-dependency -S %s | FileCheck %s +target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128" ---------------- george.burgess.iv wrote: > `; REQUIRES: asserts`? This occurs without asserts as well. It's an infinite loop. 
Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68809/new/ https://reviews.llvm.org/D68809 From llvm-commits at lists.llvm.org Thu Oct 10 13:37:04 2019 From: llvm-commits at lists.llvm.org (David Blaikie via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 20:37:04 +0000 (UTC) Subject: [PATCH] D68586: Save a word in every StringSet entry In-Reply-To: References: Message-ID: dblaikie added a comment. In D68586#1704684 , @jordan_rose wrote: > In D68586#1704530 , @dblaikie wrote: > > > Any idea why MDString is friending an implementation detail like this? Should it be? Could we make it an actual private implementation detail so people can't do this? > > > It's funky but I think reasonable: MDString is a move-none type since there will be direct pointers to it, and the canonical instance lives in the StringMap. Disallowing this friending would mean making MDString move-only and just assuming it'll never be misused. Figured something like that. I think for standard containers, at least, that can now be achieved through a custom allocator ( https://en.cppreference.com/w/cpp/memory/allocator/construct - not sure why the non-default version was deprecated... well, one possible reason (below)) The other way to do it is with a private tagged parameter - have a private type in MDString, have a public ctor that takes an instance of that type as a parameter, and then no public user can call it because they can't name the type, but MDString's implementation can call into other code including templates that can handle that. 
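The private tagged parameter approach described above can be sketched roughly as follows. This is only an illustration, not LLVM code: `Interned`, `Key`, and `get` are hypothetical names standing in for MDString and its owning container, `std::map` is used in place of StringMap, and C++17 is assumed for `try_emplace`.

```cpp
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for a type like MDString; not the LLVM class.
class Interned {
  // Private tag type: code outside Interned cannot name it, so it cannot
  // call the public constructor below. The explicit default constructor
  // also rejects a bare `{}` argument written by outside code.
  struct Key {
    explicit Key() = default;
  };

public:
  // Public so that container code (here std::map) can construct the object
  // in place, but only code that was handed a Key can supply the first
  // argument.
  Interned(Key, std::string Text) : Text(std::move(Text)) {}

  // Neither copyable nor movable: direct pointers to the canonical instance
  // stay valid.
  Interned(const Interned &) = delete;
  Interned &operator=(const Interned &) = delete;

  const std::string &text() const { return Text; }

  // The canonical instance lives inside the map; callers only ever get a
  // reference to it. try_emplace forwards Key{} into the constructor above.
  static const Interned &get(std::map<std::string, Interned> &Pool,
                             const std::string &Text) {
    return Pool.try_emplace(Text, Key{}, Text).first->second;
  }

private:
  std::string Text;
};
```

With this shape no friend declaration is needed: the container template never has to name `Key`, it only forwards the argument it was given.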
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68586/new/ https://reviews.llvm.org/D68586 From llvm-commits at lists.llvm.org Thu Oct 10 13:37:05 2019 From: llvm-commits at lists.llvm.org (Evgenii Stepanov via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 20:37:05 +0000 (UTC) Subject: [PATCH] D68794: libhwasan initialisation include kernel syscall ABI relaxation In-Reply-To: References: Message-ID: <4f379940be452a21950cee5cf72bffb2@localhost.localdomain> eugenis added a comment. Thank you! The patches have been merged to the android common kernel just a few days ago, 4.14 and 4.19: https://android.googlesource.com/kernel/common/+/690c4ca8a5715644370384672f24d95b042db74a Pixel kernels in Q have an early version of the patch set without the prctl (i.e. the feature is always enabled). Future releases will require a prctl. So please do a prctl on Android anyway, but ignore EINVAL. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68794/new/ https://reviews.llvm.org/D68794 From llvm-commits at lists.llvm.org Thu Oct 10 13:37:04 2019 From: llvm-commits at lists.llvm.org (Thomas Preud'homme via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 20:37:04 +0000 (UTC) Subject: [PATCH] D68829: [LNT] Python 3 support: Parse HTML as text Message-ID: thopre created this revision. thopre added reviewers: cmatthews, hubert.reinterpretcast, kristof.beyls. thopre added a parent revision: D68803: [LNT] Python 3 support: sort benchmark regressing. sanity_check_instance() method in the server/db/Migrations.py manipulates data from a werkzeug.wrappers.Response instance. However, data is a property which returns binary data on Python 3 but the code relies on the data being text, such as the call to the index method with a text argument. This commit uses the get_data getter of the data property setting its as_text parameter to True to request data as text instead. 
https://reviews.llvm.org/D68829 Files: tests/server/db/Migrations.py Index: tests/server/db/Migrations.py =================================================================== --- tests/server/db/Migrations.py +++ tests/server/db/Migrations.py @@ -28,9 +28,10 @@ # Visit all the test suites. test_suite_link_rex = re.compile(""" (.*)
    """) - test_suite_list_start = index.data.index("

    Test Suites

    ") - test_suite_list_end = index.data.index("", test_suite_list_start) - for ln in index.data[test_suite_list_start:test_suite_list_end].split("\n"): + data = index.get_data(as_text=True) + test_suite_list_start = data.index("

    Test Suites

    ") + test_suite_list_end = data.index("", test_suite_list_start) + for ln in data[test_suite_list_start:test_suite_list_end].split("\n"): # Ignore non-matching lines. print(ln, file=sys.stderr) m = test_suite_link_rex.match(ln) -------------- next part -------------- A non-text attachment was scrubbed... Name: D68829.224461.patch Type: text/x-patch Size: 901 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Thu Oct 10 13:37:05 2019 From: llvm-commits at lists.llvm.org (Jeroen Dobbelaere via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 20:37:05 +0000 (UTC) Subject: [PATCH] D68484: [PATCH 01/38] [noalias] LangRef: noalias intrinsics and noalias_sidechannel documentation. In-Reply-To: References: Message-ID: <0bbba6db59d0e59ed7d93f4a60ef3ebe@localhost.localdomain> jeroen.dobbelaere marked 3 inline comments as done. jeroen.dobbelaere added inline comments. ================ Comment at: llvm/docs/LangRef.rst:16241 + i32 , metadata !p.scope) !noalias !VisibleScopes + %side.p = i8* @llvm.side.noalias.XXX(i8* %side.p, i8* %p.decl, + i8** p.addr, i8** %side.p.addr, ---------------- a.elovikov wrote: > I find it strange to see %side.p on both left and right sides. Is it a typo or does it have some special meaning? > > After reading till the intrinsics' description I believe it should be just "%p" on the right side. yes, that's a typo. the second %side.p should be %p: %side.p = i8* @llvm.side.noalias.XXX(i8* %p, ...) ================ Comment at: llvm/docs/LangRef.rst:16575 +It will be transformed into a ``llvm.side.noalias`` intrinsic and moved onto +the ``noalias_sidechannel`` path, so that pointer optimizations can still be +done and the restrict information is not lost. ---------------- a.elovikov wrote: > > the ``noalias_sidechannel`` path > > Not sure about terminology, but are `@llvm.noalias.arg.guard`/`@llvm.noalias.copy.guard` considered as `noalias_sidechannel`? 
I'd suggest not to use the spelling from the load/store instructions and have a more general `moved onto the "side" path` (if my understanding is correct here). The @llvm.noalias.arg.guard combines the normal path with the noalias_sidechannel path. The @llvm.noalias.copy.guard resides on the normal path and adds extra information to a copy operation (memcpy, load/store). I tried to be consistent in terminology when referring to the 'noalias_sidechannel' path. (but I could also use the 'noalias side channel' or something similar). ================ Comment at: llvm/docs/LangRef.rst:16615 + +The third argument ``%p.addr`` is the address in memory of this pointer. + ---------------- a.elovikov wrote: > No explicit "or null" here. Is that intentional? It can be 'null' CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68484/new/ https://reviews.llvm.org/D68484 From llvm-commits at lists.llvm.org Thu Oct 10 13:43:06 2019 From: llvm-commits at lists.llvm.org (Alina Sbirlea via llvm-commits) Date: Thu, 10 Oct 2019 20:43:06 -0000 Subject: [llvm] r374447 - [MemorySSA] Additional handling of unreachable blocks. Message-ID: <20191010204306.6B0B592903@lists.llvm.org> Author: asbirlea Date: Thu Oct 10 13:43:06 2019 New Revision: 374447 URL: http://llvm.org/viewvc/llvm-project?rev=374447&view=rev Log: [MemorySSA] Additional handling of unreachable blocks. Summary: Whenever we get the previous definition, the assumption is that the recursion starts in a reachable block. If the recursion starts in an unreachable block, we may recurse indefinitely. Handle this case by returning LoE if the block is unreachable. Resolves PR43426. 
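As a rough illustration of the summary above, the sketch below models the single-predecessor recursion on a toy CFG. It is not the MemorySSAUpdater code: `ToyCFG`, `computeReachable`, `computePreds`, and `previousDefFromEnd` are hypothetical names, definitions are plain integers, and the multiple-predecessor case is reduced to returning the live-on-entry value.

```cpp
#include <cstddef>
#include <vector>

constexpr int LiveOnEntry = 0; // stand-in for the live-on-entry definition
constexpr int NoDef = -1;      // block contains no definition

// Toy CFG: Succs[B] lists the successors of block B; block 0 is the entry.
struct ToyCFG {
  std::vector<std::vector<int>> Succs;
  std::vector<int> LastDef; // last definition in each block, or NoDef
};

// Worklist walk from the entry block; a stand-in for a dominator-tree
// reachability query.
std::vector<bool> computeReachable(const ToyCFG &G) {
  std::vector<bool> Seen(G.Succs.size(), false);
  std::vector<int> Work{0};
  Seen[0] = true;
  while (!Work.empty()) {
    int B = Work.back();
    Work.pop_back();
    for (int S : G.Succs[B])
      if (!Seen[S]) {
        Seen[S] = true;
        Work.push_back(S);
      }
  }
  return Seen;
}

// Invert the successor lists into predecessor lists.
std::vector<std::vector<int>> computePreds(const ToyCFG &G) {
  std::vector<std::vector<int>> Preds(G.Succs.size());
  for (std::size_t B = 0; B < G.Succs.size(); ++B)
    for (int S : G.Succs[B])
      Preds[S].push_back(static_cast<int>(B));
  return Preds;
}

int previousDefFromEnd(const ToyCFG &G, const std::vector<bool> &Reachable,
                       const std::vector<std::vector<int>> &Preds, int BB) {
  // The guard: an unreachable block can be its own only predecessor, in
  // which case the recursion below would never terminate.
  if (!Reachable[BB])
    return LiveOnEntry;
  if (G.LastDef[BB] != NoDef)
    return G.LastDef[BB];
  // Single predecessor: just recurse.
  if (Preds[BB].size() == 1)
    return previousDefFromEnd(G, Reachable, Preds, Preds[BB][0]);
  // Simplification for this sketch: no phi handling.
  return LiveOnEntry;
}
```

A block whose only predecessor is itself can never be reachable from the entry, which is why the reachability check alone is enough to stop the single-predecessor recursion in this toy model.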
Reviewers: george.burgess.iv Subscribers: Prazek, sanjoy.google, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68809 Added: llvm/trunk/test/Analysis/MemorySSA/pr43426.ll Modified: llvm/trunk/lib/Analysis/MemorySSAUpdater.cpp Modified: llvm/trunk/lib/Analysis/MemorySSAUpdater.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/MemorySSAUpdater.cpp?rev=374447&r1=374446&r2=374447&view=diff ============================================================================== --- llvm/trunk/lib/Analysis/MemorySSAUpdater.cpp (original) +++ llvm/trunk/lib/Analysis/MemorySSAUpdater.cpp Thu Oct 10 13:43:06 2019 @@ -48,6 +48,10 @@ MemoryAccess *MemorySSAUpdater::getPrevi return Cached->second; } + // If this method is called from an unreachable block, return LoE. + if (!MSSA->DT->isReachableFromEntry(BB)) + return MSSA->getLiveOnEntryDef(); + if (BasicBlock *Pred = BB->getSinglePredecessor()) { // Single predecessor case, just recurse, we can only have one definition. 
MemoryAccess *Result = getPreviousDefFromEnd(Pred, CachedPreviousDef); Added: llvm/trunk/test/Analysis/MemorySSA/pr43426.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Analysis/MemorySSA/pr43426.ll?rev=374447&view=auto ============================================================================== --- llvm/trunk/test/Analysis/MemorySSA/pr43426.ll (added) +++ llvm/trunk/test/Analysis/MemorySSA/pr43426.ll Thu Oct 10 13:43:06 2019 @@ -0,0 +1,40 @@ +; RUN: opt -licm -enable-mssa-loop-dependency -S %s | FileCheck %s +target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128" +target triple = "x86_64-unknown-linux-gnu" + +; CHECK-LABEL: @d() +define dso_local void @d() { +entry: + br label %header + +header: + store i32 1, i32* null, align 4 + br i1 true, label %cleanup53, label %body + +body: + br i1 undef, label %cleanup31, label %for.cond11 + +for.cond11: ; Needs branch as is + br i1 undef, label %unreachable, label %latch + +cleanup31: + br label %unreachable + +deadblock: + br i1 undef, label %unreachable, label %deadblock + +cleanup53: + %val = load i32, i32* null, align 4 + %cmpv = icmp eq i32 %val, 0 + br i1 %cmpv, label %cleanup63, label %latch + +latch: + br label %header + +cleanup63: + ret void + +unreachable: + unreachable +} + From llvm-commits at lists.llvm.org Thu Oct 10 13:47:22 2019 From: llvm-commits at lists.llvm.org (Evgeniy Stepanov via llvm-commits) Date: Thu, 10 Oct 2019 20:47:22 -0000 Subject: [compiler-rt] r374448 - Add a missing include in test. Message-ID: <20191010204722.6EB299276E@lists.llvm.org> Author: eugenis Date: Thu Oct 10 13:47:22 2019 New Revision: 374448 URL: http://llvm.org/viewvc/llvm-project?rev=374448&view=rev Log: Add a missing include in test. A fix for r373993. 
Modified: compiler-rt/trunk/test/sanitizer_common/TestCases/Posix/crypt.cpp Modified: compiler-rt/trunk/test/sanitizer_common/TestCases/Posix/crypt.cpp URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/test/sanitizer_common/TestCases/Posix/crypt.cpp?rev=374448&r1=374447&r2=374448&view=diff ============================================================================== --- compiler-rt/trunk/test/sanitizer_common/TestCases/Posix/crypt.cpp (original) +++ compiler-rt/trunk/test/sanitizer_common/TestCases/Posix/crypt.cpp Thu Oct 10 13:47:22 2019 @@ -6,6 +6,7 @@ #include #include #include +#include int main (int argc, char** argv) From llvm-commits at lists.llvm.org Thu Oct 10 13:46:14 2019 From: llvm-commits at lists.llvm.org (Hubert Tong via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 20:46:14 +0000 (UTC) Subject: [PATCH] D66969: Output XCOFF object text section header and symbol entry for program code In-Reply-To: References: Message-ID: hubert.reinterpretcast added a comment. Just some minor comments. I think this is almost ready. ================ Comment at: llvm/lib/MC/XCOFFObjectWriter.cpp:150 + bool nameShouldBeInStringTable(const StringRef &); + void writeSymbolName(const StringRef &); ---------------- This should be a static member function or a non-member function. ================ Comment at: llvm/lib/MC/XCOFFObjectWriter.cpp:361 + + W.write(CSectionRef.Address + SymbolOffset); + W.write(SectionIndex); ---------------- Maybe check for overflow here. ================ Comment at: llvm/lib/MC/XCOFFObjectWriter.cpp:481 + writeSymbolTableEntryForCsectMemberLabel( + Sym, Csect, Text.Index, Layout.getSymbolOffset(*(Sym.MCSym))); + } ---------------- Please remove the excess parentheses. 
================ Comment at: llvm/test/CodeGen/PowerPC/aix-xcoff-common.ll:84 ; SYMS-NEXT: Symbol { -; SYMS-NEXT: Index: [[#Index:]] -; SYMS-NEXT: Name: a +; SYMS: Index: [[#Index:]]{{[[:space:]] *}}Name: a ; SYMS-NEXT: Value (RelocatableAddress): 0x0 ---------------- Can this be merged with the previous line? ``` SYMS: Symbol {{[{][[:space:]] *}}Index: [[#Index:]]{{[[:space:]] *}}Name: a{{$}} ``` ================ Comment at: llvm/test/CodeGen/PowerPC/aix-xcoff-lcomm.ll:36 +; OBJ: Section { +; OBJ: Index: 2 ; OBJ-NEXT: Name: .bss ---------------- Same comment re: merging with the previous line. ================ Comment at: llvm/test/CodeGen/PowerPC/aix-xcoff-lcomm.ll:56 ; SYMS-NEXT: Symbol { -; SYMS-NEXT: Index: [[#Index:]] -; SYMS-NEXT: Name: a +; SYMS: Index: [[#Index:]]{{[[:space:]] *}}Name: a ; SYMS-NEXT: Value (RelocatableAddress): 0x0 ---------------- Same comment re: merging with the previous line. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D66969/new/ https://reviews.llvm.org/D66969 From llvm-commits at lists.llvm.org Thu Oct 10 13:46:15 2019 From: llvm-commits at lists.llvm.org (Julian Lettner via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 20:46:15 +0000 (UTC) Subject: [PATCH] D68830: [lit] Break main into smaller functions Message-ID: yln created this revision. yln added reviewers: rnk, ddunbar, serge-sans-paille, probinson, jdenny, cishida, nate_chandler, jordan_rose. Herald added subscribers: llvm-commits, delcypher. Herald added a project: LLVM. This change is purely mechanical. I will do further cleanups of parameter usages. Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D68830 Files: llvm/utils/lit/lit/main.py -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68830.224462.patch Type: text/x-patch Size: 16610 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Thu Oct 10 13:46:15 2019 From: llvm-commits at lists.llvm.org (Hubert Tong via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 20:46:15 +0000 (UTC) Subject: [PATCH] D67008: [llvm-readobj][XCOFF]implement parsing relocation information for 32-bit xcoff objectfile In-Reply-To: References: Message-ID: <9d3fd96f25e04352958b322b68b5767f@localhost.localdomain> hubert.reinterpretcast added inline comments. ================ Comment at: llvm/include/llvm/Object/XCOFFObjectFile.h:168 + +bool isRelocationSigned(XCOFFRelocation32 &Reloc); + ---------------- DiggerLin wrote: > hubert.reinterpretcast wrote: > > sfertile wrote: > > > hubert.reinterpretcast wrote: > > > > DiggerLin wrote: > > > > > hubert.reinterpretcast wrote: > > > > > > Do these need to be declared in the header? Are they called only in one `.cpp` file? If so, they can be made `static` in the `.cpp` file. Otherwise, it seems odd that these aren't `const` member functions of `XCOFFRelocation32`. > > > > > llvm-readobj is using those functions and obj2yaml will use them too. > > > > It is still odd to me that these aren't `const` non-static member functions of `XCOFFRelocation32`. > > > I think these were originally templated to work with both 32-bit and 64-bit relocations, which explains why they aren't member functions. > > Would using CRTP with a base class template work for that case? > As Sean commented, since we only implement 32-bit relocations, we do not use any template for the relocation implementation at the moment. All the more reason why these should be non-static member functions in the context of this patch. 
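For readers following the thread, the `const` non-static member function shape being requested might look roughly like this. It is a sketch under assumptions, not the patch's code: `Reloc32` is a hypothetical simplified record, `relocatedLength` is a made-up helper name, and the bit layout of `Info` (0x80 sign bit, low six bits holding the bit length minus one) is assumed from the XCOFF relocation format rather than taken from the review.

```cpp
#include <cstdint>

// Assumed masks for the relocation info byte; illustrative only.
constexpr uint8_t SignBitMask = 0x80;
constexpr uint8_t LengthMask = 0x3F;

// Hypothetical, simplified 32-bit relocation record.
struct Reloc32 {
  uint32_t VirtualAddress;
  uint32_t SymbolIndex;
  uint8_t Info;
  uint8_t Type;

  // const member functions: the queries travel with the type and can be
  // called on const objects, unlike a free function that takes a non-const
  // reference.
  bool isRelocationSigned() const { return (Info & SignBitMask) != 0; }

  // Number of bits being relocated, stored in Info as length minus one.
  unsigned relocatedLength() const { return (Info & LengthMask) + 1u; }
};
```

If 64-bit support arrived later, the same members could be supplied by a shared base class template, along the lines of the CRTP question raised in the thread.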
Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67008/new/ https://reviews.llvm.org/D67008 From llvm-commits at lists.llvm.org Thu Oct 10 13:46:16 2019 From: llvm-commits at lists.llvm.org (Dávid Bolvanský via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 20:46:16 +0000 (UTC) Subject: [PATCH] D67986: [InstCombine] snprintf (d, size, "%s", s) -> memccpy (d, s, '\0', size - 1), d[size - 1] = 0 In-Reply-To: References: Message-ID: <163603179c6d16c766b0125e0b0e939b@localhost.localdomain> xbolva00 added a comment. I am fine with your suggestion to restrict it like you said. (Generally, I think more transforms in instcombine should be restricted this way) CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67986/new/ https://reviews.llvm.org/D67986 From llvm-commits at lists.llvm.org Thu Oct 10 13:46:17 2019 From: llvm-commits at lists.llvm.org (Florian Hahn via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 20:46:17 +0000 (UTC) Subject: [PATCH] D68831: [LV] Mark instructions with loop invariant arguments as uniform. (WIP) Message-ID: fhahn created this revision. fhahn added reviewers: hsaito, rengolin, dcaballe, Ayal. Herald added subscribers: rkruppe, hiraditya. Herald added a project: LLVM. As suggested by Ayal in D59995, we can mark instructions with loop invariant arguments as uniform. They will always produce the same result. Now that we can have more uniform instructions, there were some assertions that needed relaxing a bit. Also, there still seems to be an issue with constant folding in LV not being able to simplify some uniform values compared to their replicated equivalents. I still have to look into that, but I wanted to make sure the overall approach aligns well. The overall impact of the change is probably quite low, but at least in the test-suite, there are around 4 benchmarks where we ended up vectorizing a few more loops. 
Currently we still miss some uniform instructions, that only have uniform operands, but that can be addressed as follow-up. Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D68831 Files: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp llvm/test/Transforms/LoopVectorize/AArch64/extractvalue-no-scalarization-required.ll llvm/test/Transforms/LoopVectorize/X86/assume.ll llvm/test/Transforms/LoopVectorize/X86/constant-fold.ll llvm/test/Transforms/LoopVectorize/X86/cost-model-assert.ll llvm/test/Transforms/LoopVectorize/X86/funclet.ll llvm/test/Transforms/LoopVectorize/X86/invariant-load-gather.ll llvm/test/Transforms/LoopVectorize/X86/invariant-store-vectorization.ll llvm/test/Transforms/LoopVectorize/X86/load-deref-pred.ll llvm/test/Transforms/LoopVectorize/first-order-recurrence.ll llvm/test/Transforms/LoopVectorize/no_outside_user.ll llvm/test/Transforms/LoopVectorize/pr32859.ll llvm/test/Transforms/LoopVectorize/vector-intrinsic-call-cost.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D68831.224463.patch Type: text/x-patch Size: 69350 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Thu Oct 10 13:46:17 2019 From: llvm-commits at lists.llvm.org (Florian Hahn via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 20:46:17 +0000 (UTC) Subject: [PATCH] D68814: [LV] Allow assume calls in predicated blocks. In-Reply-To: References: Message-ID: <6c891881a2f406ed4f96b07115053090@localhost.localdomain> fhahn updated this revision to Diff 224464. fhahn added a comment. Address comments, thanks! Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68814/new/ https://reviews.llvm.org/D68814 Files: llvm/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp llvm/lib/Transforms/Vectorize/LoopVectorize.cpp llvm/lib/Transforms/Vectorize/VPlan.cpp llvm/test/Transforms/LoopVectorize/predicate-assume.ll -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68814.224464.patch Type: text/x-patch Size: 8220 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Thu Oct 10 13:46:18 2019 From: llvm-commits at lists.llvm.org (Vitaly Buka via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 20:46:18 +0000 (UTC) Subject: [PATCH] D68832: [tsan,msan] Insert module constructors in a module pass Message-ID: vitalybuka created this revision. vitalybuka added reviewers: eugenis, leonardchan. Herald added subscribers: llvm-commits, cfe-commits, hiraditya. Herald added projects: clang, LLVM. If we insert them from function pass some analysis may be missing or invalid. Fixes PR42877. Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D68832 Files: clang/lib/CodeGen/BackendUtil.cpp clang/test/CodeGen/sanitizer-module-constructor.c llvm/include/llvm/Transforms/Instrumentation/MemorySanitizer.h llvm/include/llvm/Transforms/Instrumentation/ThreadSanitizer.h llvm/lib/Passes/PassRegistry.def llvm/lib/Transforms/Instrumentation/MemorySanitizer.cpp llvm/lib/Transforms/Instrumentation/ThreadSanitizer.cpp llvm/test/Instrumentation/MemorySanitizer/msan_basic.ll llvm/test/Instrumentation/ThreadSanitizer/tsan_basic.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D68832.224465.patch Type: text/x-patch Size: 15271 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Thu Oct 10 13:46:18 2019 From: llvm-commits at lists.llvm.org (Florian Hahn via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 20:46:18 +0000 (UTC) Subject: [PATCH] D68814: [LV] Allow assume calls in predicated blocks. In-Reply-To: References: Message-ID: <7cbf35f414720a83d9883927fb37aaf9@localhost.localdomain> fhahn marked 2 inline comments as done. fhahn added inline comments. ================ Comment at: llvm/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp:892 + // We can predicate blocks with calls to assume, as long as we drop them in + // case we flatten the CFG via predication. 
+ if (match(&I, m_Intrinsic(m_Value()))) ---------------- xbolva00 wrote: > You can drop m_Value() > > if (match(&I, m_Intrinsic())) { > > Will work too Excellent, thanks! Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68814/new/ https://reviews.llvm.org/D68814 From llvm-commits at lists.llvm.org Thu Oct 10 13:46:20 2019 From: llvm-commits at lists.llvm.org (Andrei Elovikov via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 20:46:20 +0000 (UTC) Subject: [PATCH] D68484: [PATCH 01/38] [noalias] LangRef: noalias intrinsics and noalias_sidechannel documentation. In-Reply-To: References: Message-ID: <23257730b6c4454ec228e9c2b9ca7d71@localhost.localdomain> a.elovikov added inline comments. ================ Comment at: llvm/docs/LangRef.rst:16575 +It will be transformed into a ``llvm.side.noalias`` intrinsic and moved onto +the ``noalias_sidechannel`` path, so that pointer optimizations can still be +done and the restrict information is not lost. ---------------- jeroen.dobbelaere wrote: > a.elovikov wrote: > > > the ``noalias_sidechannel`` path > > > > Not sure about terminology, but are `@llvm.noalias.arg.guard`/`@llvm.noalias.copy.guard` considered as `noalias_sidechannel`? I'd suggest not to use the spelling from the load/store instructions and have a more general `moved onto the "side" path` (if my understanding is correct here). > The @llvm.noalias.arg.guard combines the normal path with the noalias_sidechannel path. The @llvm.noalias.copy.guard resides on the normal path and adds extra information to a copy operation (memcpy, load/store). > I tried to be consistent in terminology when referring to the 'noalias_sidechannel' path. (but I could also use the 'noalias side channel' or something similar).
How about this: It will be transformed into a ``llvm.side.noalias`` intrinsic and moved onto the ``noalias_sidechannel`` path for loads/stores and fed into the @llvm.noalias.arg.guard/@llvm.noalias.copy.guard intrinsics for function boundaries/copies respectively. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68484/new/ https://reviews.llvm.org/D68484 From llvm-commits at lists.llvm.org Thu Oct 10 13:46:31 2019 From: llvm-commits at lists.llvm.org (David Greene via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 20:46:31 +0000 (UTC) Subject: [PATCH] D68804: [System Model] [TTI] Move default cache/prefetch implementations In-Reply-To: References: Message-ID: <2ce6d70ab271210e662472b152b48b04@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rG7c562f12869f: [System Model] [TTI] Move default cache/prefetch implementations (authored by greened). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68804/new/ https://reviews.llvm.org/D68804 Files: llvm/include/llvm/Analysis/TargetTransformInfoImpl.h llvm/include/llvm/CodeGen/BasicTTIImpl.h llvm/lib/Analysis/TargetTransformInfo.cpp Index: llvm/lib/Analysis/TargetTransformInfo.cpp =================================================================== --- llvm/lib/Analysis/TargetTransformInfo.cpp +++ llvm/lib/Analysis/TargetTransformInfo.cpp @@ -40,34 +40,6 @@ struct NoTTIImpl : TargetTransformInfoImplCRTPBase { explicit NoTTIImpl(const DataLayout &DL) : TargetTransformInfoImplCRTPBase(DL) {} - - unsigned getCacheLineSize() const { return 0; } - - llvm::Optional getCacheSize(TargetTransformInfo::CacheLevel Level) const { - switch (Level) { - case TargetTransformInfo::CacheLevel::L1D: - LLVM_FALLTHROUGH; - case TargetTransformInfo::CacheLevel::L2D: - return llvm::Optional(); - } - llvm_unreachable("Unknown TargetTransformInfo::CacheLevel"); - } - - llvm::Optional getCacheAssociativity( - TargetTransformInfo::CacheLevel 
Level) const { - switch (Level) { - case TargetTransformInfo::CacheLevel::L1D: - LLVM_FALLTHROUGH; - case TargetTransformInfo::CacheLevel::L2D: - return llvm::Optional(); - } - - llvm_unreachable("Unknown TargetTransformInfo::CacheLevel"); - } - - unsigned getPrefetchDistance() const { return 0; } - unsigned getMinPrefetchStride() const { return 1; } - unsigned getMaxPrefetchIterationsAhead() const { return UINT_MAX; } }; } Index: llvm/include/llvm/CodeGen/BasicTTIImpl.h =================================================================== --- llvm/include/llvm/CodeGen/BasicTTIImpl.h +++ llvm/include/llvm/CodeGen/BasicTTIImpl.h @@ -523,8 +523,13 @@ virtual Optional getCacheAssociativity(TargetTransformInfo::CacheLevel Level) const { - return Optional( - getST()->getCacheAssociativity(static_cast(Level))); + Optional TargetResult = + getST()->getCacheAssociativity(static_cast(Level)); + + if (TargetResult) + return TargetResult; + + return BaseT::getCacheAssociativity(Level); } virtual unsigned getCacheLineSize() const { Index: llvm/include/llvm/Analysis/TargetTransformInfoImpl.h =================================================================== --- llvm/include/llvm/Analysis/TargetTransformInfoImpl.h +++ llvm/include/llvm/Analysis/TargetTransformInfoImpl.h @@ -371,6 +371,34 @@ return false; } + unsigned getCacheLineSize() const { return 0; } + + llvm::Optional getCacheSize(TargetTransformInfo::CacheLevel Level) const { + switch (Level) { + case TargetTransformInfo::CacheLevel::L1D: + LLVM_FALLTHROUGH; + case TargetTransformInfo::CacheLevel::L2D: + return llvm::Optional(); + } + llvm_unreachable("Unknown TargetTransformInfo::CacheLevel"); + } + + llvm::Optional getCacheAssociativity( + TargetTransformInfo::CacheLevel Level) const { + switch (Level) { + case TargetTransformInfo::CacheLevel::L1D: + LLVM_FALLTHROUGH; + case TargetTransformInfo::CacheLevel::L2D: + return llvm::Optional(); + } + + llvm_unreachable("Unknown TargetTransformInfo::CacheLevel"); + } + + 
unsigned getPrefetchDistance() const { return 0; } + unsigned getMinPrefetchStride() const { return 1; } + unsigned getMaxPrefetchIterationsAhead() const { return UINT_MAX; } + unsigned getMaxInterleaveFactor(unsigned VF) { return 1; } unsigned getArithmeticInstrCost(unsigned Opcode, Type *Ty, -------------- next part -------------- A non-text attachment was scrubbed... Name: D68804.224466.patch Type: text/x-patch Size: 3437 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Thu Oct 10 13:46:31 2019 From: llvm-commits at lists.llvm.org (Alina Sbirlea via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 20:46:31 +0000 (UTC) Subject: [PATCH] D68809: [MemorySSA] Additional handling of unreachable blocks. In-Reply-To: References: Message-ID: This revision was automatically updated to reflect the committed changes. asbirlea marked an inline comment as done. Closed by commit rG67f0c5c08578: [MemorySSA] Additional handling of unreachable blocks. (authored by asbirlea). Herald added a subscriber: hiraditya. 
Changed prior to commit: https://reviews.llvm.org/D68809?vs=224388&id=224467#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68809/new/ https://reviews.llvm.org/D68809 Files: llvm/lib/Analysis/MemorySSAUpdater.cpp llvm/test/Analysis/MemorySSA/pr43426.ll Index: llvm/test/Analysis/MemorySSA/pr43426.ll =================================================================== --- /dev/null +++ llvm/test/Analysis/MemorySSA/pr43426.ll @@ -0,0 +1,40 @@ +; RUN: opt -licm -enable-mssa-loop-dependency -S %s | FileCheck %s +target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128" +target triple = "x86_64-unknown-linux-gnu" + +; CHECK-LABEL: @d() +define dso_local void @d() { +entry: + br label %header + +header: + store i32 1, i32* null, align 4 + br i1 true, label %cleanup53, label %body + +body: + br i1 undef, label %cleanup31, label %for.cond11 + +for.cond11: ; Needs branch as is + br i1 undef, label %unreachable, label %latch + +cleanup31: + br label %unreachable + +deadblock: + br i1 undef, label %unreachable, label %deadblock + +cleanup53: + %val = load i32, i32* null, align 4 + %cmpv = icmp eq i32 %val, 0 + br i1 %cmpv, label %cleanup63, label %latch + +latch: + br label %header + +cleanup63: + ret void + +unreachable: + unreachable +} + Index: llvm/lib/Analysis/MemorySSAUpdater.cpp =================================================================== --- llvm/lib/Analysis/MemorySSAUpdater.cpp +++ llvm/lib/Analysis/MemorySSAUpdater.cpp @@ -48,6 +48,10 @@ return Cached->second; } + // If this method is called from an unreachable block, return LoE. + if (!MSSA->DT->isReachableFromEntry(BB)) + return MSSA->getLiveOnEntryDef(); + if (BasicBlock *Pred = BB->getSinglePredecessor()) { // Single predecessor case, just recurse, we can only have one definition. MemoryAccess *Result = getPreviousDefFromEnd(Pred, CachedPreviousDef); -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68809.224467.patch Type: text/x-patch Size: 1664 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Thu Oct 10 13:46:32 2019 From: llvm-commits at lists.llvm.org (Evgenii Stepanov via llvm-commits) Date: Thu, 10 Oct 2019 13:46:32 -0700 Subject: [PATCH] D68431: [msan] Add interceptors: crypt, crypt_r. In-Reply-To: References: Message-ID: Sure, done in r374448, let me know if it did not help. My man page says that only this is necessary: #define _XOPEN_SOURCE /* See feature_test_macros(7) */ #include I don't define _XOPEN_SOURCE, maybe that's the real problem? On Thu, Oct 10, 2019 at 10:06 AM Ulrich Weigand via Phabricator wrote: > > uweigand added a comment. > > The Posix/crypt.cpp test case fails on my system with: > > /home/uweigand/llvm/llvm-head/projects/compiler-rt/test/sanitizer_common/TestCases/Posix/crypt.cpp:18:15: error: use of undeclared identifier 'crypt' > char *p = crypt("abcdef", "$1$"); > > I believe the test case is missing a > > #include > > > Repository: > rL LLVM > > CHANGES SINCE LAST ACTION > https://reviews.llvm.org/D68431/new/ > > https://reviews.llvm.org/D68431 > > > From llvm-commits at lists.llvm.org Thu Oct 10 13:46:53 2019 From: llvm-commits at lists.llvm.org (Florian Hahn via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 20:46:53 +0000 (UTC) Subject: [PATCH] D68814: [LV] Allow assume calls in predicated blocks. In-Reply-To: References: Message-ID: <46eec5ba62ba6f656020e443dc75da02@localhost.localdomain> fhahn marked an inline comment as done. fhahn added inline comments. ================ Comment at: llvm/test/Transforms/LoopVectorize/predicate-assume.ll:2 +; REQUIRES: asserts +; RUN: opt -loop-vectorize -force-vector-width=4 -debug -S %s 2>&1 | FileCheck %s + ---------------- I've not added a test with calls to assume in the loop header here, because we already have them in llvm/test/Transforms/LoopVectorize/X86/assume.ll. 
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68814/new/ https://reviews.llvm.org/D68814 From llvm-commits at lists.llvm.org Thu Oct 10 13:56:21 2019 From: llvm-commits at lists.llvm.org (Reid Kleckner via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 20:56:21 +0000 (UTC) Subject: [PATCH] D68830: [lit] Break main into smaller functions In-Reply-To: References: Message-ID: <8445b477218e1fb5b599cd7a0289a67a@localhost.localdomain> rnk accepted this revision. rnk added a comment. This revision is now accepted and ready to land. lgtm Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68830/new/ https://reviews.llvm.org/D68830 From llvm-commits at lists.llvm.org Thu Oct 10 13:56:21 2019 From: llvm-commits at lists.llvm.org (Jeroen Dobbelaere via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 20:56:21 +0000 (UTC) Subject: [PATCH] D68488: [PATCH 05/38] [noalias] [IR] Introduce noalias_sidechannel for LoadInst/StoreInst In-Reply-To: References: Message-ID: jeroen.dobbelaere marked an inline comment as done. jeroen.dobbelaere added inline comments. ================ Comment at: llvm/lib/IR/Instructions.cpp:4149 + // that don't know how to handle it (Like MergeLoadStoreMotion shows) + // - safe alternative: keep the argument, but map it to undef. + if (hasNoaliasSideChannelOperand()) ---------------- a.elovikov wrote: > Why is it safe? Consider we have aliasing load and store, we clone them so load.clone and load.store now have undef in the sidechannels, so optimizations are free to choose the values for them that would mean noalias. - `Undef` is maybe the wrong value and could indeed trigger problems in future. - `null` is not safe, as that is used to indicate a converted load/store instruction, when it does not depend on a restrict pointer (still in combination with the !noalias metadata, which must be present and indicate what scopes are visible).
- Copying the noalias_sidechannel from the original is also not safe. It might not be valid in the new context. - Removing the noalias_sidechannel (as was done in some earlier version) also does not work: some passes rightfully expect the clone to have the same number of arguments as the original. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68488/new/ https://reviews.llvm.org/D68488 From llvm-commits at lists.llvm.org Thu Oct 10 13:56:22 2019 From: llvm-commits at lists.llvm.org (Jeroen Dobbelaere via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 20:56:22 +0000 (UTC) Subject: [PATCH] D68484: [PATCH 01/38] [noalias] LangRef: noalias intrinsics and noalias_sidechannel documentation. In-Reply-To: References: Message-ID: <32c3951da989744105fcd595c24bedc5@localhost.localdomain> jeroen.dobbelaere marked an inline comment as done. jeroen.dobbelaere added inline comments. ================ Comment at: llvm/docs/LangRef.rst:16575 +It will be transformed into a ``llvm.side.noalias`` intrinsic and moved onto +the ``noalias_sidechannel`` path, so that pointer optimizations can still be +done and the restrict information is not lost. ---------------- a.elovikov wrote: > jeroen.dobbelaere wrote: > > a.elovikov wrote: > > > > the ``noalias_sidechannel`` path > > > > > > Not sure about terminology, but are `@llvm.noalias.arg.guard`/`@llvm.noalias.copy.guard` considered as `noalias_sidechannel`? I'd suggest not to use the spelling from the load/store instructions and have a more general `moved onto the "side" path` (if my understanding is correct here). > > The @llvm.noalias.arg.guard combines the normal path with the noalias_sidechannel path. The @llvm.noalias.copy.guard resides on the normal path and adds extra information to a copy operation (memcpy, load/store). > > I tried to be consistent in terminology when referring to the 'noalias_sidechannel' path. (but I could also use the 'noalias side channel' or something similar). 
> How about this: > > It will be transformed into a ``llvm.side.noalias`` intrinsic and moved onto > the ``noalias_sidechannel`` path for loads/stores and fed into the @llvm.noalias.arg.guard/@llvm.noalias.copy.guard intrinsics for function boundaries/copies respectively. ... and fed into the @llvm.noalias.arg.guard intrinsics for function boundaries. (The @llvm.noalias.copy.guard is generated by the clang frontend) CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68484/new/ https://reviews.llvm.org/D68484 From llvm-commits at lists.llvm.org Thu Oct 10 14:05:49 2019 From: llvm-commits at lists.llvm.org (Petr Hosek via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 21:05:49 +0000 (UTC) Subject: [PATCH] D68833: [CMake] Re-order runtimes in the order of dependencies In-Reply-To: References: Message-ID: <8a03c3c2d66d5469a0cfc06301f95c4e@localhost.localdomain> phosek added a comment. This is an alternative to D68791 which is trying to address the issue introduced in r374116. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68833/new/ https://reviews.llvm.org/D68833 From llvm-commits at lists.llvm.org Thu Oct 10 14:05:50 2019 From: llvm-commits at lists.llvm.org (Petr Hosek via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 21:05:50 +0000 (UTC) Subject: [PATCH] D68833: [CMake] Re-order runtimes in the order of dependencies Message-ID: phosek created this revision. phosek added reviewers: ldionne, beanz. Herald added subscribers: llvm-commits, dexonsmith, mgorny. Herald added a project: LLVM. phosek added a comment. This is an alternative to D68791 which is trying to address the issue introduced in r374116. This allows runtimes to have checks like `if(TARGET ${runtime})`. 
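For illustration only, the reordering described in the D68833 summary can be sketched in Python. This is not code from the patch: the function name `reorder_runtimes` and the example paths are hypothetical, and the sketch matches on the exact directory name, whereas the patch's CMake helper uses a regex `MATCHES` check. It shows why iterating over `compiler-rt libcxx libcxxabi libunwind` and moving each hit to index 0 leaves the list in dependency order (libunwind first, compiler-rt last among the four).

```python
import os


def reorder_runtimes(runtimes,
                     order=("compiler-rt", "libcxx", "libcxxabi", "libunwind")):
    """Return a copy of `runtimes` with the known runtimes moved to the front.

    Each name in `order` is removed and re-inserted at index 0, so the last
    name processed ends up first in the result.
    """
    result = list(runtimes)
    for name in order:
        # Assumption: exact match on the last path component (hypothetical
        # simplification; the CMake patch uses a regex match instead).
        path = next((p for p in result if os.path.basename(p) == name), None)
        if path is not None:
            result.remove(path)
            result.insert(0, path)
    return result
```

With an input such as `["/src/libcxx", "/src/compiler-rt", "/src/openmp", "/src/libunwind", "/src/libcxxabi"]`, the four known runtimes come out as libunwind, libcxxabi, libcxx, compiler-rt, followed by any remaining entries in their original relative order.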
Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D68833 Files: llvm/runtimes/CMakeLists.txt Index: llvm/runtimes/CMakeLists.txt =================================================================== --- llvm/runtimes/CMakeLists.txt +++ llvm/runtimes/CMakeLists.txt @@ -38,10 +38,10 @@ set(LLVM_EXTERNAL_${canon_name}_SOURCE_DIR "${CMAKE_CURRENT_SOURCE_DIR}/../../${proj}") endforeach() -function(get_compiler_rt_path path) +function(get_runtime_path runtime path) foreach(entry ${runtimes}) get_filename_component(projName ${entry} NAME) - if("${projName}" MATCHES "compiler-rt") + if("${projName}" MATCHES "${runtime}") set(${path} ${entry} PARENT_SCOPE) return() endif() @@ -68,18 +68,6 @@ "${CMAKE_CURRENT_SOURCE_DIR}/../cmake/modules" ) - # Some of the runtimes will conditionally use the compiler-rt sanitizers - # to make this work smoothly we ensure that compiler-rt is added first in - # the list of sub-projects. This allows other sub-projects to have checks - # like `if(TARGET asan)` to enable building with asan. - get_compiler_rt_path(compiler_rt_path) - if(compiler_rt_path) - list(REMOVE_ITEM runtimes ${compiler_rt_path}) - if(NOT DEFINED LLVM_BUILD_COMPILER_RT OR LLVM_BUILD_COMPILER_RT) - list(INSERT runtimes 0 ${compiler_rt_path}) - endif() - endif() - # Setting these variables will allow the sub-build to put their outputs into # the library and bin directories of the top-level build. set(LLVM_LIBRARY_OUTPUT_INTDIR ${LLVM_LIBRARY_DIR}) @@ -122,6 +110,17 @@ include(UseLibtool) endif() + # Re-order runtimes in the order of dependencies: libcxxabi depend on libunwind, + # libcxx depends on libcxxabi, some compiler-rt runtimes depend on libcxx. This + # allows these runtimes to have checks like `if(TARGET ${runtime})`. 
+ foreach(runtime compiler-rt libcxx libcxxabi libunwind) + get_runtime_path(${runtime} runtime_path) + if(runtime_path) + list(REMOVE_ITEM runtimes ${runtime_path}) + list(INSERT runtimes 0 ${runtime_path}) + endif() + endforeach() + # This can be used to detect whether we're in the runtimes build. set(RUNTIMES_BUILD ON) -------------- next part -------------- A non-text attachment was scrubbed... Name: D68833.224468.patch Type: text/x-patch Size: 2095 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Thu Oct 10 14:05:50 2019 From: llvm-commits at lists.llvm.org (Jeroen Dobbelaere via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 21:05:50 +0000 (UTC) Subject: [PATCH] D68492: [PATCH 09/38] [noalias] D9376: llvm.noalias - handling of dead intrinsics In-Reply-To: References: Message-ID: <2dc729bb2a9f4886ce580adbb55a1fe3@localhost.localdomain> jeroen.dobbelaere marked an inline comment as done. jeroen.dobbelaere added inline comments. ================ Comment at: llvm/lib/Analysis/InstructionSimplify.cpp:5156 + if (isa(Arg0) || + (isa(Arg0) && + Arg0->getType()->getPointerAddressSpace() == 0)) ---------------- a.elovikov wrote: > What if we have > > %i = ptr2int %p > %null = sub %i, %i > %nullptr = int2ptr %null > %scope = call @llvm.noalias(%nullptr) ; introduce the scope > %null2 = ptr2int %scope > %i2 = add %null2, %i > %same.as.orig.p = int2ptr %i2 > > Why don't we want `%same.as.orig.p` to have the scope? A ptr2int will normally block the noalias propagation. Not sure if a %same.as.orig.p = getelementptr %scope, %i is valid when %scope is null. If that is valid, the noalias dependency should not be removed, and we should not short-circuit it here. 
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68492/new/ https://reviews.llvm.org/D68492 From llvm-commits at lists.llvm.org Thu Oct 10 14:05:51 2019 From: llvm-commits at lists.llvm.org (Joseph Tremoulet via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 21:05:51 +0000 (UTC) Subject: [PATCH] D68657: Update MinidumpYAML to use minidump::Exception for exception stream In-Reply-To: References: Message-ID: <97b4f4716cf04a76265828077caca800@localhost.localdomain> JosephTremoulet updated this revision to Diff 224469. JosephTremoulet added a comment. Herald added a project: LLDB. Herald added a subscriber: lldb-commits. - Update test input yaml Exception stream Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68657/new/ https://reviews.llvm.org/D68657 Files: lldb/packages/Python/lldbsuite/test/functionalities/postmortem/minidump-new/linux-x86_64.yaml llvm/include/llvm/ObjectYAML/MinidumpYAML.h llvm/lib/ObjectYAML/MinidumpEmitter.cpp llvm/lib/ObjectYAML/MinidumpYAML.cpp llvm/unittests/ObjectYAML/MinidumpYAMLTest.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D68657.224469.patch Type: text/x-patch Size: 16392 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Thu Oct 10 14:15:07 2019 From: llvm-commits at lists.llvm.org (Roman Lebedev via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 21:15:07 +0000 (UTC) Subject: [PATCH] D68492: [PATCH 09/38] [noalias] D9376: llvm.noalias - handling of dead intrinsics In-Reply-To: References: Message-ID: <86b8170ecd9d2b00bd3b15043718c85a@localhost.localdomain> lebedev.ri added inline comments. 
================ Comment at: llvm/lib/Analysis/InstructionSimplify.cpp:5156 + if (isa(Arg0) || + (isa(Arg0) && + Arg0->getType()->getPointerAddressSpace() == 0)) ---------------- jeroen.dobbelaere wrote: > a.elovikov wrote: > > What if we have > > > > %i = ptr2int %p > > %null = sub %i, %i > > %nullptr = int2ptr %null > > %scope = call @llvm.noalias(%nullptr) ; introduce the scope > > %null2 = ptr2int %scope > > %i2 = add %null2, %i > > %same.as.orig.p = int2ptr %i2 > > > > Why don't we want `%same.as.orig.p` to have the scope? > A ptr2int will normally block the noalias propagation. Not sure if a > %same.as.orig.p = getelementptr %scope, %i > is valid when %scope is null. If that is valid, the noalias dependency should not be removed, and we should not short-circuit it here. > `getelementptr` is valid for null pointer. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68492/new/ https://reviews.llvm.org/D68492 From llvm-commits at lists.llvm.org Thu Oct 10 14:24:41 2019 From: llvm-commits at lists.llvm.org (Julian Lettner via llvm-commits) Date: Thu, 10 Oct 2019 21:24:41 -0000 Subject: [llvm] r374452 - [lit] Break main into smaller functions Message-ID: <20191010212441.76FA192A2A@lists.llvm.org> Author: yln Date: Thu Oct 10 14:24:41 2019 New Revision: 374452 URL: http://llvm.org/viewvc/llvm-project?rev=374452&view=rev Log: [lit] Break main into smaller functions This change is purely mechanical. I will do further cleanups of parameter usages. 
Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D68830 Modified: llvm/trunk/utils/lit/lit/main.py Modified: llvm/trunk/utils/lit/lit/main.py URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/lit/main.py?rev=374452&r1=374451&r2=374452&view=diff ============================================================================== --- llvm/trunk/utils/lit/lit/main.py (original) +++ llvm/trunk/utils/lit/lit/main.py Thu Oct 10 14:24:41 2019 @@ -25,77 +25,6 @@ import lit.run import lit.Test import lit.util -def write_test_results(run, lit_config, testing_time, output_path): - try: - import json - except ImportError: - lit_config.fatal('test output unsupported with Python 2.5') - - # Construct the data we will write. - data = {} - # Encode the current lit version as a schema version. - data['__version__'] = lit.__versioninfo__ - data['elapsed'] = testing_time - # FIXME: Record some information on the lit configuration used? - # FIXME: Record information from the individual test suites? - - # Encode the tests. - data['tests'] = tests_data = [] - for test in run.tests: - test_data = { - 'name' : test.getFullName(), - 'code' : test.result.code.name, - 'output' : test.result.output, - 'elapsed' : test.result.elapsed } - - # Add test metrics, if present. 
- if test.result.metrics: - test_data['metrics'] = metrics_data = {} - for key, value in test.result.metrics.items(): - metrics_data[key] = value.todata() - - # Report micro-tests separately, if present - if test.result.microResults: - for key, micro_test in test.result.microResults.items(): - # Expand parent test name with micro test name - parent_name = test.getFullName() - micro_full_name = parent_name + ':' + key - - micro_test_data = { - 'name' : micro_full_name, - 'code' : micro_test.code.name, - 'output' : micro_test.output, - 'elapsed' : micro_test.elapsed } - if micro_test.metrics: - micro_test_data['metrics'] = micro_metrics_data = {} - for key, value in micro_test.metrics.items(): - micro_metrics_data[key] = value.todata() - - tests_data.append(micro_test_data) - - tests_data.append(test_data) - - # Write the output. - f = open(output_path, 'w') - try: - json.dump(data, f, indent=2, sort_keys=True) - f.write('\n') - finally: - f.close() - -def update_incremental_cache(test): - if not test.result.code.isFailure: - return - fname = test.getFilePath() - os.utime(fname, None) - -def by_mtime(test): - fname = test.getFilePath() - try: - return os.path.getmtime(fname) - except: - return 0 - def main(builtinParameters = {}): # Create a temp directory inside the normal temp directory so that we can # try to avoid temporary test file leaks. The user can avoid this behavior @@ -132,14 +61,7 @@ def main_with_tmp(builtinParameters): print("lit %s" % (lit.__version__,)) return - # Create the user defined parameters. 
- userParams = dict(builtinParameters) - for entry in opts.userParameters: - if '=' not in entry: - name,val = entry,'' - else: - name,val = entry.split('=', 1) - userParams[name] = val + userParams = create_user_parameters(builtinParameters, opts) # Decide what the requested maximum indvidual test time should be if opts.maxIndividualTestTime is not None: @@ -186,57 +108,16 @@ def main_with_tmp(builtinParameters): litConfig.maxIndividualTestTime = opts.maxIndividualTestTime if opts.showSuites or opts.showTests: - # Aggregate the tests by suite. - suitesAndTests = {} - for result_test in run.tests: - if result_test.suite not in suitesAndTests: - suitesAndTests[result_test.suite] = [] - suitesAndTests[result_test.suite].append(result_test) - suitesAndTests = list(suitesAndTests.items()) - suitesAndTests.sort(key = lambda item: item[0].name) - - # Show the suites, if requested. - if opts.showSuites: - print('-- Test Suites --') - for ts,ts_tests in suitesAndTests: - print(' %s - %d tests' %(ts.name, len(ts_tests))) - print(' Source Root: %s' % ts.source_root) - print(' Exec Root : %s' % ts.exec_root) - if ts.config.available_features: - print(' Available Features : %s' % ' '.join( - sorted(ts.config.available_features))) - - # Show the tests, if requested. - if opts.showTests: - print('-- Available Tests --') - for ts,ts_tests in suitesAndTests: - ts_tests.sort(key = lambda test: test.path_in_suite) - for test in ts_tests: - print(' %s' % (test.getFullName(),)) - - # Exit. - sys.exit(0) + print_suites_or_tests(run, opts) + return # Select and order the tests. numTotalTests = len(run.tests) - # First, select based on the filter expression if given. if opts.filter: - try: - rex = re.compile(opts.filter) - except: - parser.error("invalid regular expression for --filter: %r" % ( - opts.filter)) - run.tests = [result_test for result_test in run.tests - if rex.search(result_test.getFullName())] + filter_tests(run, opts) - # Then select the order. 
- if opts.shuffle: - random.shuffle(run.tests) - elif opts.incremental: - run.tests.sort(key=by_mtime, reverse=True) - else: - run.tests.sort(key = lambda t: (not t.isEarlyTest(), t.getFullName())) + order_tests(run, opts) # Then optionally restrict our attention to a shard of the tests. if (opts.numShards is not None) or (opts.runShard is not None): @@ -262,27 +143,7 @@ def main_with_tmp(builtinParameters): # Don't create more workers than tests. opts.numWorkers = min(len(run.tests), opts.numWorkers) - # Because some tests use threads internally, and at least on Linux each - # of these threads counts toward the current process limit, try to - # raise the (soft) process limit so that tests don't fail due to - # resource exhaustion. - try: - cpus = lit.util.detectCPUs() - desired_limit = opts.numWorkers * cpus * 2 # the 2 is a safety factor - - # Import the resource module here inside this try block because it - # will likely fail on Windows. - import resource - - max_procs_soft, max_procs_hard = resource.getrlimit(resource.RLIMIT_NPROC) - desired_limit = min(desired_limit, max_procs_hard) - - if max_procs_soft < desired_limit: - resource.setrlimit(resource.RLIMIT_NPROC, (desired_limit, max_procs_hard)) - litConfig.note('raised the process limit from %d to %d' % \ - (max_procs_soft, desired_limit)) - except: - pass + increase_process_limit(litConfig, opts) display = lit.display.create_display(opts, len(run.tests), numTotalTests, opts.numWorkers) @@ -358,41 +219,7 @@ def main_with_tmp(builtinParameters): print(' %s: %d' % (name,N)) if opts.xunit_output_file: - # Collect the tests, indexed by test suite - by_suite = {} - for result_test in run.tests: - suite = result_test.suite.config.name - if suite not in by_suite: - by_suite[suite] = { - 'passes' : 0, - 'failures' : 0, - 'skipped': 0, - 'tests' : [] } - by_suite[suite]['tests'].append(result_test) - if result_test.result.code.isFailure: - by_suite[suite]['failures'] += 1 - elif result_test.result.code == 
lit.Test.UNSUPPORTED: - by_suite[suite]['skipped'] += 1 - else: - by_suite[suite]['passes'] += 1 - xunit_output_file = open(opts.xunit_output_file, "w") - xunit_output_file.write("\n") - xunit_output_file.write("\n") - for suite_name, suite in by_suite.items(): - safe_suite_name = quoteattr(suite_name.replace(".", "-")) - xunit_output_file.write("\n") - - for result_test in suite['tests']: - result_test.writeJUnitXML(xunit_output_file) - xunit_output_file.write("\n") - xunit_output_file.write("\n") - xunit_output_file.write("") - xunit_output_file.close() + write_test_results_xunit(run, opts) # If we encountered any additional errors, exit abnormally. if litConfig.numErrors: @@ -407,5 +234,196 @@ def main_with_tmp(builtinParameters): sys.exit(1) sys.exit(0) + +def create_user_parameters(builtinParameters, opts): + userParams = dict(builtinParameters) + for entry in opts.userParameters: + if '=' not in entry: + name,val = entry,'' + else: + name,val = entry.split('=', 1) + userParams[name] = val + return userParams + +def print_suites_or_tests(run, opts): + # Aggregate the tests by suite. + suitesAndTests = {} + for result_test in run.tests: + if result_test.suite not in suitesAndTests: + suitesAndTests[result_test.suite] = [] + suitesAndTests[result_test.suite].append(result_test) + suitesAndTests = list(suitesAndTests.items()) + suitesAndTests.sort(key = lambda item: item[0].name) + + # Show the suites, if requested. + if opts.showSuites: + print('-- Test Suites --') + for ts,ts_tests in suitesAndTests: + print(' %s - %d tests' %(ts.name, len(ts_tests))) + print(' Source Root: %s' % ts.source_root) + print(' Exec Root : %s' % ts.exec_root) + if ts.config.available_features: + print(' Available Features : %s' % ' '.join( + sorted(ts.config.available_features))) + + # Show the tests, if requested. 
+ if opts.showTests: + print('-- Available Tests --') + for ts,ts_tests in suitesAndTests: + ts_tests.sort(key = lambda test: test.path_in_suite) + for test in ts_tests: + print(' %s' % (test.getFullName(),)) + + # Exit. + sys.exit(0) + +def filter_tests(run, opts): + try: + rex = re.compile(opts.filter) + except: + parser.error("invalid regular expression for --filter: %r" % ( + opts.filter)) + run.tests = [result_test for result_test in run.tests + if rex.search(result_test.getFullName())] + +def order_tests(run, opts): + if opts.shuffle: + random.shuffle(run.tests) + elif opts.incremental: + run.tests.sort(key = by_mtime, reverse = True) + else: + run.tests.sort(key = lambda t: (not t.isEarlyTest(), t.getFullName())) + +def by_mtime(test): + fname = test.getFilePath() + try: + return os.path.getmtime(fname) + except: + return 0 + +def update_incremental_cache(test): + if not test.result.code.isFailure: + return + fname = test.getFilePath() + os.utime(fname, None) + +def increase_process_limit(litConfig, opts): + # Because some tests use threads internally, and at least on Linux each + # of these threads counts toward the current process limit, try to + # raise the (soft) process limit so that tests don't fail due to + # resource exhaustion. + try: + cpus = lit.util.detectCPUs() + desired_limit = opts.numWorkers * cpus * 2 # the 2 is a safety factor + + # Import the resource module here inside this try block because it + # will likely fail on Windows. 
+ import resource + + max_procs_soft, max_procs_hard = resource.getrlimit(resource.RLIMIT_NPROC) + desired_limit = min(desired_limit, max_procs_hard) + + if max_procs_soft < desired_limit: + resource.setrlimit(resource.RLIMIT_NPROC, (desired_limit, max_procs_hard)) + litConfig.note('raised the process limit from %d to %d' % \ + (max_procs_soft, desired_limit)) + except: + pass + +def write_test_results(run, lit_config, testing_time, output_path): + try: + import json + except ImportError: + lit_config.fatal('test output unsupported with Python 2.5') + + # Construct the data we will write. + data = {} + # Encode the current lit version as a schema version. + data['__version__'] = lit.__versioninfo__ + data['elapsed'] = testing_time + # FIXME: Record some information on the lit configuration used? + # FIXME: Record information from the individual test suites? + + # Encode the tests. + data['tests'] = tests_data = [] + for test in run.tests: + test_data = { + 'name' : test.getFullName(), + 'code' : test.result.code.name, + 'output' : test.result.output, + 'elapsed' : test.result.elapsed } + + # Add test metrics, if present. + if test.result.metrics: + test_data['metrics'] = metrics_data = {} + for key, value in test.result.metrics.items(): + metrics_data[key] = value.todata() + + # Report micro-tests separately, if present + if test.result.microResults: + for key, micro_test in test.result.microResults.items(): + # Expand parent test name with micro test name + parent_name = test.getFullName() + micro_full_name = parent_name + ':' + key + + micro_test_data = { + 'name' : micro_full_name, + 'code' : micro_test.code.name, + 'output' : micro_test.output, + 'elapsed' : micro_test.elapsed } + if micro_test.metrics: + micro_test_data['metrics'] = micro_metrics_data = {} + for key, value in micro_test.metrics.items(): + micro_metrics_data[key] = value.todata() + + tests_data.append(micro_test_data) + + tests_data.append(test_data) + + # Write the output. 
+ f = open(output_path, 'w') + try: + json.dump(data, f, indent=2, sort_keys=True) + f.write('\n') + finally: + f.close() + +def write_test_results_xunit(run, opts): + # Collect the tests, indexed by test suite + by_suite = {} + for result_test in run.tests: + suite = result_test.suite.config.name + if suite not in by_suite: + by_suite[suite] = { + 'passes' : 0, + 'failures' : 0, + 'skipped': 0, + 'tests' : [] } + by_suite[suite]['tests'].append(result_test) + if result_test.result.code.isFailure: + by_suite[suite]['failures'] += 1 + elif result_test.result.code == lit.Test.UNSUPPORTED: + by_suite[suite]['skipped'] += 1 + else: + by_suite[suite]['passes'] += 1 + xunit_output_file = open(opts.xunit_output_file, "w") + xunit_output_file.write("\n") + xunit_output_file.write("\n") + for suite_name, suite in by_suite.items(): + safe_suite_name = quoteattr(suite_name.replace(".", "-")) + xunit_output_file.write("\n") + + for result_test in suite['tests']: + result_test.writeJUnitXML(xunit_output_file) + xunit_output_file.write("\n") + xunit_output_file.write("\n") + xunit_output_file.write("") + xunit_output_file.close() + if __name__=='__main__': main() From llvm-commits at lists.llvm.org Thu Oct 10 14:24:33 2019 From: llvm-commits at lists.llvm.org (Julian Lettner via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 21:24:33 +0000 (UTC) Subject: [PATCH] D68834: [lit] Change regex filter to ignore case Message-ID: yln created this revision. yln added reviewers: rnk, ddunbar, serge-sans-paille, probinson, jdenny, cishida, nate_chandler, jordan_rose. Herald added subscribers: llvm-commits, delcypher. Herald added a project: LLVM. Make regex filter `--filter=REGEX` option more lenient via `re.IGNORECASE`. Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D68834 Files: llvm/utils/lit/lit/cl_arguments.py llvm/utils/lit/lit/main.py llvm/utils/lit/tests/selecting.py -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68834.224475.patch Type: text/x-patch Size: 3416 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Thu Oct 10 14:24:34 2019 From: llvm-commits at lists.llvm.org (Matt Arsenault via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 21:24:34 +0000 (UTC) Subject: [PATCH] D68739: [GISel] Allow ConstantFoldBinOp to consider G_FCONSTANT binary representation for combines In-Reply-To: References: Message-ID: <414b3a82f2030bbd9e74d848b1221338@localhost.localdomain> arsenm accepted this revision. arsenm added a comment. This revision is now accepted and ready to land. LGTM. I'm not sure we even really need G_FCONSTANT as an instruction ================ Comment at: llvm/lib/CodeGen/GlobalISel/Utils.cpp:240-241 + return Val; + } else { + return CstVal.getFPImm()->getValueAPF().bitcastToAPInt(); + } ---------------- No else after return Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68739/new/ https://reviews.llvm.org/D68739 From llvm-commits at lists.llvm.org Thu Oct 10 14:24:50 2019 From: llvm-commits at lists.llvm.org (Julian Lettner via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 21:24:50 +0000 (UTC) Subject: [PATCH] D68830: [lit] Break main into smaller functions In-Reply-To: References: Message-ID: <37093c7fe3218072681d0d6830265f10@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rG8d0744a8b57d: [lit] Break main into smaller functions (authored by yln). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68830/new/ https://reviews.llvm.org/D68830 Files: llvm/utils/lit/lit/main.py -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68830.224478.patch Type: text/x-patch Size: 16610 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Thu Oct 10 14:29:10 2019 From: llvm-commits at lists.llvm.org (Evandro Menezes via llvm-commits) Date: Thu, 10 Oct 2019 21:29:10 -0000 Subject: [llvm] r374453 - [InstCombine] Add test case for PR43617 (NFC) Message-ID: <20191010212910.64B4392A6D@lists.llvm.org> Author: evandro Date: Thu Oct 10 14:29:10 2019 New Revision: 374453 URL: http://llvm.org/viewvc/llvm-project?rev=374453&view=rev Log: [InstCombine] Add test case for PR43617 (NFC) Also, refactor check in `LibCallSimplifier::optimizeLog()`. Modified: llvm/trunk/lib/Transforms/Utils/SimplifyLibCalls.cpp llvm/trunk/test/Transforms/InstCombine/log-pow.ll Modified: llvm/trunk/lib/Transforms/Utils/SimplifyLibCalls.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Utils/SimplifyLibCalls.cpp?rev=374453&r1=374452&r2=374453&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Utils/SimplifyLibCalls.cpp (original) +++ llvm/trunk/lib/Transforms/Utils/SimplifyLibCalls.cpp Thu Oct 10 14:29:10 2019 @@ -1915,9 +1915,7 @@ Value *LibCallSimplifier::optimizeLog(Ca IRBuilder<>::FastMathFlagGuard Guard(B); B.setFastMathFlags(FastMathFlags::getFast()); - Function *ArgFn = Arg->getCalledFunction(); - Intrinsic::ID ArgID = - ArgFn ? 
ArgFn->getIntrinsicID() : Intrinsic::not_intrinsic; + Intrinsic::ID ArgID = Arg->getIntrinsicID(); LibFunc ArgLb = NotLibFunc; TLI->getLibFunc(Arg, ArgLb); Modified: llvm/trunk/test/Transforms/InstCombine/log-pow.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/InstCombine/log-pow.ll?rev=374453&r1=374452&r2=374453&view=diff ============================================================================== --- llvm/trunk/test/Transforms/InstCombine/log-pow.ll (original) +++ llvm/trunk/test/Transforms/InstCombine/log-pow.ll Thu Oct 10 14:29:10 2019 @@ -97,8 +97,18 @@ define double @log_exp2_not_fast(double ret double %log } +define double @pr43617(double %d, i32 %i, double (i32)* %f) { +entry: + %sub = fsub double -0.000000e+00, %d + %icall = tail call fast double %f(i32 %i) + %log = tail call fast double @llvm.log.f64(double %icall) + %mul = fmul double %log, %sub + ret double %mul +} + declare double @log(double) #0 declare float @logf(float) #0 +declare double @llvm.log.f64(double) #0 declare <2 x float> @llvm.log.v2f32(<2 x float>) declare float @log2f(float) #0 declare <2 x double> @llvm.log2.v2f64(<2 x double>) From llvm-commits at lists.llvm.org Thu Oct 10 14:30:43 2019 From: llvm-commits at lists.llvm.org (Rong Xu via llvm-commits) Date: Thu, 10 Oct 2019 21:30:43 -0000 Subject: [llvm] r374454 - [ValueTracking] Improve pointer offset computation for cases of same base Message-ID: <20191010213043.BA50892A74@lists.llvm.org> Author: xur Date: Thu Oct 10 14:30:43 2019 New Revision: 374454 URL: http://llvm.org/viewvc/llvm-project?rev=374454&view=rev Log: [ValueTracking] Improve pointer offset computation for cases of same base This patch improves the handling of pointer offset in GEP expressions where one argument is the base pointer. isPointerOffset() is being used by memcpyopt where current code synthesizes consecutive 32 bytes stores to one store and two memset intrinsic calls. With this patch, we convert the stores to one memset intrinsic. 
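Editorial sketch (not part of the commit): the chained-GEP walk described above can be modelled in a few lines of Python. The `Gep` class and both function names are hypothetical stand-ins for illustration only, not LLVM APIs, and the sketch assumes each GEP carries a single constant byte offset (or `None` when the index is not constant). It returns `ptr2 - ptr1` when one pointer is reachable from the other through constant-offset GEPs, and does not model the later case in the real function where both pointers are GEPs off an identical base.

```python
from typing import Optional


class Gep:
    """Toy GEP: a base pointer plus one byte offset (None = non-constant)."""

    def __init__(self, base, offset: Optional[int]):
        self.base = base
        self.offset = offset


def offset_from_base(gep, ptr) -> Optional[int]:
    """Sum constant offsets while walking gep's base chain until ptr is hit."""
    total = 0
    cur = gep
    while isinstance(cur, Gep):
        if cur.offset is None:
            return None  # a non-constant index stops the walk
        total += cur.offset
        if cur.base is ptr:
            return total  # reached the requested base
        cur = cur.base
    return None  # chain ended without meeting ptr


def pointer_offset(ptr1, ptr2) -> Optional[int]:
    """Return ptr2 - ptr1 if it is a known constant, else None."""
    if isinstance(ptr1, Gep):
        off = offset_from_base(ptr1, ptr2)
        if off is not None:
            return -off
    if isinstance(ptr2, Gep):
        off = offset_from_base(ptr2, ptr1)
        if off is not None:
            return off
    return None


# Shape of the new store-to-memset.ll test: %4 sits behind a variable index,
# yet %21 = gep (gep %4, 16), 1 is still a constant 17 bytes past %4.
p0 = object()
p3 = Gep(p0, None)
p4 = Gep(p3, -32)
p20 = Gep(p4, 16)
p21 = Gep(p20, 1)
distance = pointer_offset(p4, p21)
```

With the chain walk, the 32 single-byte stores in the test all resolve to constant offsets from `%4`, which is what lets memcpyopt merge them into one memset.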
Differential Revision: https://reviews.llvm.org/D67989 Added: llvm/trunk/test/Transforms/MemCpyOpt/store-to-memset.ll Modified: llvm/trunk/lib/Analysis/ValueTracking.cpp Modified: llvm/trunk/lib/Analysis/ValueTracking.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/ValueTracking.cpp?rev=374454&r1=374453&r2=374454&view=diff ============================================================================== --- llvm/trunk/lib/Analysis/ValueTracking.cpp (original) +++ llvm/trunk/lib/Analysis/ValueTracking.cpp Thu Oct 10 14:30:43 2019 @@ -5755,17 +5755,47 @@ Optional llvm::isPointerOffset( const GEPOperator *GEP1 = dyn_cast(Ptr1); const GEPOperator *GEP2 = dyn_cast(Ptr2); - // If one pointer is a GEP and the other isn't, then see if the GEP is a - // constant offset from the base, as in "P" and "gep P, 1". - if (GEP1 && !GEP2 && GEP1->getOperand(0)->stripPointerCasts() == Ptr2) { - auto Offset = getOffsetFromIndex(GEP1, 1, DL); - if (!Offset) + // If one pointer is a GEP see if the GEP is a constant offset from the base, + // as in "P" and "gep P, 1". + // Also do this iteratively to handle the the following case: + // Ptr_t1 = GEP Ptr1, c1 + // Ptr_t2 = GEP Ptr_t1, c2 + // Ptr2 = GEP Ptr_t2, c3 + // where we will return c1+c2+c3. + // TODO: Handle the case when both Ptr1 and Ptr2 are GEPs of some common base + // -- replace getOffsetFromBase with getOffsetAndBase, check that the bases + // are the same, and return the difference between offsets. 
+ auto getOffsetFromBase = [&DL](const GEPOperator *GEP, + const Value *Ptr) -> Optional { + const GEPOperator *GEP_T = GEP; + int64_t OffsetVal = 0; + bool HasSameBase = false; + while (GEP_T) { + auto Offset = getOffsetFromIndex(GEP_T, 1, DL); + if (!Offset) + return None; + OffsetVal += *Offset; + auto Op0 = GEP_T->getOperand(0)->stripPointerCasts(); + if (Op0 == Ptr) { + HasSameBase = true; + break; + } + GEP_T = dyn_cast(Op0); + } + if (!HasSameBase) return None; - return -*Offset; - } + return OffsetVal; + }; - if (GEP2 && !GEP1 && GEP2->getOperand(0)->stripPointerCasts() == Ptr1) { - return getOffsetFromIndex(GEP2, 1, DL); + if (GEP1) { + auto Offset = getOffsetFromBase(GEP1, Ptr2); + if (Offset) + return -*Offset; + } + if (GEP2) { + auto Offset = getOffsetFromBase(GEP2, Ptr1); + if (Offset) + return Offset; } // Right now we handle the case when Ptr1/Ptr2 are both GEPs with an identical Added: llvm/trunk/test/Transforms/MemCpyOpt/store-to-memset.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/MemCpyOpt/store-to-memset.ll?rev=374454&view=auto ============================================================================== --- llvm/trunk/test/Transforms/MemCpyOpt/store-to-memset.ll (added) +++ llvm/trunk/test/Transforms/MemCpyOpt/store-to-memset.ll Thu Oct 10 14:30:43 2019 @@ -0,0 +1,77 @@ +; RUN: opt < %s -memcpyopt -S | FileCheck %s +target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128" +target triple = "x86_64-grtev4-linux-gnu" + +define i8* @foo(i8* returned %0, i32 %1, i64 %2) { +entry: + %3 = getelementptr inbounds i8, i8* %0, i64 %2 + %4 = getelementptr inbounds i8, i8* %3, i64 -32 + %vv = trunc i32 %1 to i8 + store i8 %vv, i8* %4, align 1 + %5 = getelementptr inbounds i8, i8* %4, i64 1 + store i8 %vv, i8* %5, align 1 + %6= getelementptr inbounds i8, i8* %4, i64 2 + store i8 %vv, i8* %6, align 1 + %7= getelementptr inbounds i8, i8* %4, i64 3 + store i8 %vv, i8* %7, align 1 + %8= getelementptr inbounds i8, i8* %4, i64 4 + 
store i8 %vv, i8* %8, align 1 + %9= getelementptr inbounds i8, i8* %4, i64 5 + store i8 %vv, i8* %9, align 1 + %10= getelementptr inbounds i8, i8* %4, i64 6 + store i8 %vv, i8* %10, align 1 + %11= getelementptr inbounds i8, i8* %4, i64 7 + store i8 %vv, i8* %11, align 1 + %12= getelementptr inbounds i8, i8* %4, i64 8 + store i8 %vv, i8* %12, align 1 + %13= getelementptr inbounds i8, i8* %4, i64 9 + store i8 %vv, i8* %13, align 1 + %14= getelementptr inbounds i8, i8* %4, i64 10 + store i8 %vv, i8* %14, align 1 + %15= getelementptr inbounds i8, i8* %4, i64 11 + store i8 %vv, i8* %15, align 1 + %16= getelementptr inbounds i8, i8* %4, i64 12 + store i8 %vv, i8* %16, align 1 + %17= getelementptr inbounds i8, i8* %4, i64 13 + store i8 %vv, i8* %17, align 1 + %18= getelementptr inbounds i8, i8* %4, i64 14 + store i8 %vv, i8* %18, align 1 + %19= getelementptr inbounds i8, i8* %4, i64 15 + store i8 %vv, i8* %19, align 1 + %20= getelementptr inbounds i8, i8* %4, i64 16 + store i8 %vv, i8* %20, align 1 + %21= getelementptr inbounds i8, i8* %20, i64 1 + store i8 %vv, i8* %21, align 1 + %22= getelementptr inbounds i8, i8* %20, i64 2 + store i8 %vv, i8* %22, align 1 + %23= getelementptr inbounds i8, i8* %20, i64 3 + store i8 %vv, i8* %23, align 1 + %24= getelementptr inbounds i8, i8* %20, i64 4 + store i8 %vv, i8* %24, align 1 + %25= getelementptr inbounds i8, i8* %20, i64 5 + store i8 %vv, i8* %25, align 1 + %26= getelementptr inbounds i8, i8* %20, i64 6 + store i8 %vv, i8* %26, align 1 + %27= getelementptr inbounds i8, i8* %20, i64 7 + store i8 %vv, i8* %27, align 1 + %28= getelementptr inbounds i8, i8* %20, i64 8 + store i8 %vv, i8* %28, align 1 + %29= getelementptr inbounds i8, i8* %20, i64 9 + store i8 %vv, i8* %29, align 1 + %30= getelementptr inbounds i8, i8* %20, i64 10 + store i8 %vv, i8* %30, align 1 + %31 = getelementptr inbounds i8, i8* %20, i64 11 + store i8 %vv, i8* %31, align 1 + %32 = getelementptr inbounds i8, i8* %20, i64 12 + store i8 %vv, i8* %32, align 1 + 
%33 = getelementptr inbounds i8, i8* %20, i64 13 + store i8 %vv, i8* %33, align 1 + %34 = getelementptr inbounds i8, i8* %20, i64 14 + store i8 %vv, i8* %34, align 1 + %35 = getelementptr inbounds i8, i8* %20, i64 15 + store i8 %vv, i8* %35, align 1 + ret i8* %0 +; CHECK-LABEL: @foo +; CHECK: call void @llvm.memset.p0i8.i64(i8* align 1 %4, i8 %vv, i64 32, i1 false) +} + From llvm-commits at lists.llvm.org Thu Oct 10 14:32:41 2019 From: llvm-commits at lists.llvm.org (Stanislav Mekhanoshin via llvm-commits) Date: Thu, 10 Oct 2019 21:32:41 -0000 Subject: [llvm] r374455 - [AMDGPU] Handle undef old operand in DPP combine Message-ID: <20191010213241.6D49B91F8D@lists.llvm.org> Author: rampitec Date: Thu Oct 10 14:32:41 2019 New Revision: 374455 URL: http://llvm.org/viewvc/llvm-project?rev=374455&view=rev Log: [AMDGPU] Handle undef old operand in DPP combine It was missing an undef flag. Differential Revision: https://reviews.llvm.org/D68813 Modified: llvm/trunk/lib/Target/AMDGPU/GCNDPPCombine.cpp llvm/trunk/test/CodeGen/AMDGPU/dpp_combine.mir Modified: llvm/trunk/lib/Target/AMDGPU/GCNDPPCombine.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/GCNDPPCombine.cpp?rev=374455&r1=374454&r2=374455&view=diff ============================================================================== --- llvm/trunk/lib/Target/AMDGPU/GCNDPPCombine.cpp (original) +++ llvm/trunk/lib/Target/AMDGPU/GCNDPPCombine.cpp Thu Oct 10 14:32:41 2019 @@ -178,7 +178,9 @@ MachineInstr *GCNDPPCombine::createDPPIn if (OldIdx != -1) { assert(OldIdx == NumOperands); assert(isOfRegClass(CombOldVGPR, AMDGPU::VGPR_32RegClass, *MRI)); - DPPInst.addReg(CombOldVGPR.Reg, 0, CombOldVGPR.SubReg); + auto *Def = getVRegSubRegDef(CombOldVGPR, *MRI); + DPPInst.addReg(CombOldVGPR.Reg, Def ? 
0 : RegState::Undef, + CombOldVGPR.SubReg); ++NumOperands; } else { // TODO: this discards MAC/FMA instructions for now, let's add it later Modified: llvm/trunk/test/CodeGen/AMDGPU/dpp_combine.mir URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/dpp_combine.mir?rev=374455&r1=374454&r2=374455&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/AMDGPU/dpp_combine.mir (original) +++ llvm/trunk/test/CodeGen/AMDGPU/dpp_combine.mir Thu Oct 10 14:32:41 2019 @@ -512,7 +512,7 @@ body: | ... # CHECK-LABEL: name: add_old_subreg_undef -# CHECK: %5:vgpr_32 = V_ADD_U32_dpp %3.sub1, %1, %0.sub1, 1, 15, 15, 1, implicit $exec +# CHECK: %5:vgpr_32 = V_ADD_U32_dpp undef %3.sub1, %1, %0.sub1, 1, 15, 15, 1, implicit $exec name: add_old_subreg_undef tracksRegLiveness: true @@ -551,3 +551,14 @@ body: | %2:vgpr_32 = V_MOV_B32_dpp %1:vgpr_32, undef %0:vgpr_32, 1, 15, 15, 1, implicit $exec %4:vgpr_32 = V_MIN_F32_e32 %2, undef %3:vgpr_32, implicit $exec ... + +# Test an undef old operand +# CHECK-LABEL: name: dpp_undef_old +# CHECK: %3:vgpr_32 = V_CEIL_F32_dpp undef %1:vgpr_32, 0, undef %2:vgpr_32, 1, 15, 15, 1, implicit $exec +name: dpp_undef_old +tracksRegLiveness: true +body: | + bb.0: + %2:vgpr_32 = V_MOV_B32_dpp undef %1:vgpr_32, undef %0:vgpr_32, 1, 15, 15, 1, implicit $exec + %3:vgpr_32 = V_CEIL_F32_e32 %2, implicit $exec +... From llvm-commits at lists.llvm.org Thu Oct 10 14:34:10 2019 From: llvm-commits at lists.llvm.org (Aaron Puchert via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 21:34:10 +0000 (UTC) Subject: [PATCH] D51741: [coro]Pass rvalue reference for named local variable to return_value In-Reply-To: References: Message-ID: aaronpuchert added a comment. @Quuxplusone Thanks for your very helpful comments! > In which case, this patch (D51741 ) itself fixed this FIXME at least partly, and maybe completely. 
Maybe this patch should have removed or amended the FIXME, rather than just adding code above it. It seems you're right, depending on the `CopyElisionSemanticsKind` it might even be fixed completely. > However, now that P1825 has been accepted into C++2a, `CES_AsIfByStdMove` is the actual (draft-)standard behavior, and should rightly be named something like `CES_FutureDefault`! Ok, since coroutines are actually a C++20 feature, using `CES_AsIfByStdMove` is not really wrong. But it would be inconsistent with our current treatment of `return` statements, and maybe we should switch both at the same time. (And perhaps depending on the value of `-std=`, since enabling coroutines manually is also possible in older standards. Maybe we can have a function that returns that correct value for a given standard?) So changing the `CopyElisionSemanticsKind` is not the right fix for my example. Which is not surprising, because the problem isn't that we're eliding a copy that we shouldn't elide, it's that we introduce a local copy which shouldn't be there. (Which might actually turn this code into UB, if the copy constructor was available.) We avoid this with `CES_Strict` more or less accidentally: - Without `CES_AllowParameters` we don't get a candidate because the VarDecl is a parameter. - Without `CES_AllowDifferentTypes`, `Context.hasSameUnqualifiedType(ReturnType, VDType)` returns false and so we return false from `Sema::isCopyElisionCandidate`, because `E->getType()` (the type of the DeclRefExpr) is `MoveOnly`, but the VarDecl is a `MoveOnly&`. The actual problem is the call to `PerformMoveOrCopyInitialization`. For normal return statements, the caller provides us with storage for the return value when that is a struct/class. We then copy (or move) the return value there (RVO) or directly construct it there (NRVO). Returning from a coroutine however just calls some `return_value` member function.
If we look at the existing test, we produce CoreturnStmt 0x2462f28 |-CXXConstructExpr 0x2462e50 'MoveOnly' 'void (MoveOnly &&) noexcept' elidable | `-ImplicitCastExpr 0x2462e38 'MoveOnly' xvalue | `-DeclRefExpr 0x245e390 'MoveOnly' lvalue Var 0x245e2e8 'value' 'MoveOnly' `-ExprWithCleanups 0x2462f10 'void' `-CXXMemberCallExpr 0x2462ed0 'void' |-MemberExpr 0x2462ea0 '' .return_value 0x245fde8 | `-DeclRefExpr 0x2462e80 'std::experimental::traits_sfinae_base, void>::promise_type':'task::promise_type' lvalue Var 0x245fec8 '__promise' 'std::experimental::traits_sfinae_base, void>::promise_type':'task::promise_type' `-MaterializeTemporaryExpr 0x2462ef8 'MoveOnly' xvalue `-CXXConstructExpr 0x2462e50 'MoveOnly' 'void (MoveOnly &&) noexcept' elidable `-ImplicitCastExpr 0x2462e38 'MoveOnly' xvalue `-DeclRefExpr 0x245e390 'MoveOnly' lvalue Var 0x245e2e8 'value' 'MoveOnly' There is an actual move constructor call. So instead of transforming `__promise.return_value(value)` into `__promise.return_value(std::move(value))`, we have transformed it into `__promise.return_value(decltype(std::move(value)))`. This constructor is marked elidable, but since it's not an actual return value, there is nothing we can elide. Indeed, the unoptimized IR (where I've demangled the function names) is: %ref.tmp9.reload.addr39 = getelementptr inbounds %_Z1fv.Frame, %_Z1fv.Frame* %FramePtr, i32 0, i32 7 %ref.tmp9.reload.addr = getelementptr inbounds %_Z1fv.Frame, %_Z1fv.Frame* %FramePtr, i32 0, i32 7 %value.reload.addr37 = getelementptr inbounds %_Z1fv.Frame, %_Z1fv.Frame* %FramePtr, i32 0, i32 6 call void @"MoveOnly::MoveOnly(MoveOnly&&)"(%struct.MoveOnly* %ref.tmp9.reload.addr39, %struct.MoveOnly* dereferenceable(1) %value.reload.addr37) #2 call void @"task::promise_type::return_value(MoveOnly&&)"(%"struct.task::promise_type"* %__promise, %struct.MoveOnly* dereferenceable(1) %ref.tmp9.reload.addr) That move constructor call shouldn't be there. 
We shouldn't construct anything, if constructor calls are needed, building the call statement will do that. Instead of doing a move or copy initialization, we should do an rvalue reference initialization or implicit cast if we're returning a DeclRefExpr referencing a variable that we can consume. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D51741/new/ https://reviews.llvm.org/D51741 From llvm-commits at lists.llvm.org Thu Oct 10 14:34:33 2019 From: llvm-commits at lists.llvm.org (Rong Xu via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 21:34:33 +0000 (UTC) Subject: [PATCH] D67989: [ValueTracking] Improve pointer offset computation for cases of same base In-Reply-To: References: Message-ID: This revision was automatically updated to reflect the committed changes. Closed by commit rG686fa4bbfbce: [ValueTracking] Improve pointer offset computation for cases of same base (authored by xur). Changed prior to commit: https://reviews.llvm.org/D67989?vs=221796&id=224481#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67989/new/ https://reviews.llvm.org/D67989 Files: llvm/lib/Analysis/ValueTracking.cpp llvm/test/Transforms/MemCpyOpt/store-to-memset.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D67989.224481.patch Type: text/x-patch Size: 5545 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Thu Oct 10 14:43:18 2019 From: llvm-commits at lists.llvm.org (Jeroen Dobbelaere via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 21:43:18 +0000 (UTC) Subject: [PATCH] D68521: [PATCH 36/38] [noalias] Clang CodeGen for restrict-qualified pointers In-Reply-To: References: Message-ID: <60561af11265037a24a1834756e12b9d@localhost.localdomain> jeroen.dobbelaere marked an inline comment as done. jeroen.dobbelaere added inline comments. 
================ Comment at: clang/lib/AST/Type.cpp:115 + } + } else if (const auto *ArrayTy = dyn_cast(CannonTy)) { + return ArrayTy->getElementType().isRestrictOrContainsRestrictMembers(); ---------------- erichkeane wrote: > Rather than this recursion, could you just unpack it from the CannonTy? So replace 107 with: > > Type *CannonTy = getCanonicalType()->getBaseElementTypeUnsafe(); > > That way this is all unpacked and saves you a recursion. > Yes, that seems to work. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68521/new/ https://reviews.llvm.org/D68521 From llvm-commits at lists.llvm.org Thu Oct 10 14:46:26 2019 From: llvm-commits at lists.llvm.org (Marcello Maggioni via llvm-commits) Date: Thu, 10 Oct 2019 21:46:26 -0000 Subject: [llvm] r374458 - [GISel] Allow getConstantVRegVal() to return G_FCONSTANT values. Message-ID: <20191010214626.B977892AE1@lists.llvm.org> Author: mggm Date: Thu Oct 10 14:46:26 2019 New Revision: 374458 URL: http://llvm.org/viewvc/llvm-project?rev=374458&view=rev Log: [GISel] Allow getConstantVRegVal() to return G_FCONSTANT values. In GISel we have both G_CONSTANT and G_FCONSTANT, but because in GISel we don't really have a concept of Float vs Int value the only difference between the two is where the data originates from. What both G_CONSTANT and G_FCONSTANT return is just a bag of bits with the constant representation in it. By making getConstantVRegVal() return G_FCONSTANTs bit representation as well we allow ConstantFold and other things to operate with G_FCONSTANT. Adding tests that show ConstantFolding to work on mixed G_CONSTANT and G_FCONSTANT sources. 
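Editorial sketch (not part of the commit): the "bag of bits" view can be checked outside LLVM with Python's `struct` module. The helper names `f32_bits` and `f64_bits` are hypothetical and exist only for this illustration; the constants are the ones that appear in this commit's ConstantFoldingTest.cpp and legalize-frint.mir changes, assuming IEEE-754 single and double encodings.

```python
import struct


def f32_bits(x: float) -> int:
    """Bit pattern of x as an IEEE-754 single, read back as an unsigned int."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def f64_bits(x: float) -> int:
    """Bit pattern of x as an IEEE-754 double, read back as an unsigned int."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


# G_ADD of the integer 16 and the s32 G_FCONSTANT 2.0 (bits 0x40000000).
add_mix = 16 + f32_bits(2.0)

# G_AND of the integer 9 and the s32 G_FCONSTANT 1.0000001 (bits 0x3F800001).
and_mix = 9 & f32_bits(1.0000001)

# G_ASHR of the s32 G_FCONSTANT 2.0 by 9; the sign bit is clear, so a plain
# right shift gives the same result as an arithmetic shift.
ashr_mix = f32_bits(2.0) >> 9

# The s64 G_FCONSTANT double 0x4330000000000000 is 2**52 viewed as an integer.
frint_magic = f64_bits(2.0 ** 52)
```

Under those assumptions the four values come out as 1073741840, 1, 2097152 and 4841369599423283200, which are the numbers the new unit-test expectations and the updated `G_CONSTANT i64` check line use.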
Differential Revision: https://reviews.llvm.org/D68739 Modified: llvm/trunk/include/llvm/CodeGen/GlobalISel/Utils.h llvm/trunk/lib/CodeGen/GlobalISel/Utils.cpp llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-frint.mir llvm/trunk/unittests/CodeGen/GlobalISel/ConstantFoldingTest.cpp Modified: llvm/trunk/include/llvm/CodeGen/GlobalISel/Utils.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/CodeGen/GlobalISel/Utils.h?rev=374458&r1=374457&r2=374458&view=diff ============================================================================== --- llvm/trunk/include/llvm/CodeGen/GlobalISel/Utils.h (original) +++ llvm/trunk/include/llvm/CodeGen/GlobalISel/Utils.h Thu Oct 10 14:46:26 2019 @@ -119,14 +119,16 @@ struct ValueAndVReg { unsigned VReg; }; /// If \p VReg is defined by a statically evaluable chain of -/// instructions rooted on a G_CONSTANT (\p LookThroughInstrs == true) -/// and that constant fits in int64_t, returns its value as well as -/// the virtual register defined by this G_CONSTANT. -/// When \p LookThroughInstrs == false, this function behaves like +/// instructions rooted on a G_F/CONSTANT (\p LookThroughInstrs == true) +/// and that constant fits in int64_t, returns its value as well as the +/// virtual register defined by this G_F/CONSTANT. +/// When \p LookThroughInstrs == false this function behaves like /// getConstantVRegVal. +/// When \p HandleFConstants == false the function bails on G_FCONSTANTs. 
Optional getConstantVRegValWithLookThrough(unsigned VReg, const MachineRegisterInfo &MRI, - bool LookThroughInstrs = true); + bool LookThroughInstrs = true, + bool HandleFConstants = true); const ConstantFP* getConstantFPVRegVal(unsigned VReg, const MachineRegisterInfo &MRI); Modified: llvm/trunk/lib/CodeGen/GlobalISel/Utils.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/GlobalISel/Utils.cpp?rev=374458&r1=374457&r2=374458&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/GlobalISel/Utils.cpp (original) +++ llvm/trunk/lib/CodeGen/GlobalISel/Utils.cpp Thu Oct 10 14:46:26 2019 @@ -216,11 +216,34 @@ Optional llvm::getConstantVRegV } Optional llvm::getConstantVRegValWithLookThrough( - unsigned VReg, const MachineRegisterInfo &MRI, bool LookThroughInstrs) { + unsigned VReg, const MachineRegisterInfo &MRI, bool LookThroughInstrs, + bool HandleFConstant) { SmallVector, 4> SeenOpcodes; MachineInstr *MI; - while ((MI = MRI.getVRegDef(VReg)) && - MI->getOpcode() != TargetOpcode::G_CONSTANT && LookThroughInstrs) { + auto IsConstantOpcode = [HandleFConstant](unsigned Opcode) { + return Opcode == TargetOpcode::G_CONSTANT || + (HandleFConstant && Opcode == TargetOpcode::G_FCONSTANT); + }; + auto GetImmediateValue = [HandleFConstant, + &MRI](const MachineInstr &MI) -> Optional { + const MachineOperand &CstVal = MI.getOperand(1); + if (!CstVal.isImm() && !CstVal.isCImm() && + (!HandleFConstant || !CstVal.isFPImm())) + return None; + if (!CstVal.isFPImm()) { + unsigned BitWidth = + MRI.getType(MI.getOperand(0).getReg()).getSizeInBits(); + APInt Val = CstVal.isImm() ? 
APInt(BitWidth, CstVal.getImm()) + : CstVal.getCImm()->getValue(); + assert(Val.getBitWidth() == BitWidth && + "Value bitwidth doesn't match definition type"); + return Val; + } else { + return CstVal.getFPImm()->getValueAPF().bitcastToAPInt(); + } + }; + while ((MI = MRI.getVRegDef(VReg)) && !IsConstantOpcode(MI->getOpcode()) && + LookThroughInstrs) { switch (MI->getOpcode()) { case TargetOpcode::G_TRUNC: case TargetOpcode::G_SEXT: @@ -242,16 +265,13 @@ Optional llvm::getConstant return None; } } - if (!MI || MI->getOpcode() != TargetOpcode::G_CONSTANT || - (!MI->getOperand(1).isImm() && !MI->getOperand(1).isCImm())) + if (!MI || !IsConstantOpcode(MI->getOpcode())) return None; - const MachineOperand &CstVal = MI->getOperand(1); - unsigned BitWidth = MRI.getType(MI->getOperand(0).getReg()).getSizeInBits(); - APInt Val = CstVal.isImm() ? APInt(BitWidth, CstVal.getImm()) - : CstVal.getCImm()->getValue(); - assert(Val.getBitWidth() == BitWidth && - "Value bitwidth doesn't match definition type"); + Optional MaybeVal = GetImmediateValue(*MI); + if (!MaybeVal) + return None; + APInt &Val = *MaybeVal; while (!SeenOpcodes.empty()) { std::pair OpcodeAndSize = SeenOpcodes.pop_back_val(); switch (OpcodeAndSize.first) { Modified: llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-frint.mir URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-frint.mir?rev=374458&r1=374457&r2=374458&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-frint.mir (original) +++ llvm/trunk/test/CodeGen/AMDGPU/GlobalISel/legalize-frint.mir Thu Oct 10 14:46:26 2019 @@ -56,18 +56,14 @@ body: | ; SI-LABEL: name: test_frint_s64 ; SI: [[COPY:%[0-9]+]]:_(s64) = COPY $vgpr0_vgpr1 - ; SI: [[C:%[0-9]+]]:_(s64) = G_FCONSTANT double 0x4330000000000000 - ; SI: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 -9223372036854775808 - ; SI: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 
9223372036854775807 - ; SI: [[AND:%[0-9]+]]:_(s64) = G_AND [[C]], [[C2]] - ; SI: [[AND1:%[0-9]+]]:_(s64) = G_AND [[C]], [[C1]] - ; SI: [[OR:%[0-9]+]]:_(s64) = G_OR [[AND]], [[AND1]] - ; SI: [[FADD:%[0-9]+]]:_(s64) = G_FADD [[COPY]], [[OR]] - ; SI: [[FNEG:%[0-9]+]]:_(s64) = G_FNEG [[OR]] + ; SI: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 4841369599423283200 + ; SI: [[COPY1:%[0-9]+]]:_(s64) = COPY [[C]](s64) + ; SI: [[FADD:%[0-9]+]]:_(s64) = G_FADD [[COPY]], [[COPY1]] + ; SI: [[FNEG:%[0-9]+]]:_(s64) = G_FNEG [[COPY1]] ; SI: [[FADD1:%[0-9]+]]:_(s64) = G_FADD [[FADD]], [[FNEG]] - ; SI: [[C3:%[0-9]+]]:_(s64) = G_FCONSTANT double 0x432FFFFFFFFFFFFF + ; SI: [[C1:%[0-9]+]]:_(s64) = G_FCONSTANT double 0x432FFFFFFFFFFFFF ; SI: [[FABS:%[0-9]+]]:_(s64) = G_FABS [[COPY]] - ; SI: [[FCMP:%[0-9]+]]:_(s1) = G_FCMP floatpred(ogt), [[FABS]](s64), [[C3]] + ; SI: [[FCMP:%[0-9]+]]:_(s1) = G_FCMP floatpred(ogt), [[FABS]](s64), [[C1]] ; SI: [[SELECT:%[0-9]+]]:_(s64) = G_SELECT [[FCMP]](s1), [[COPY]], [[FADD1]] ; SI: [[FRINT:%[0-9]+]]:_(s64) = G_FRINT [[COPY]] ; SI: $vgpr0_vgpr1 = COPY [[FRINT]](s64) @@ -131,26 +127,22 @@ body: | ; SI-LABEL: name: test_frint_v2s64 ; SI: [[COPY:%[0-9]+]]:_(<2 x s64>) = COPY $vgpr0_vgpr1_vgpr2_vgpr3 ; SI: [[UV:%[0-9]+]]:_(s64), [[UV1:%[0-9]+]]:_(s64) = G_UNMERGE_VALUES [[COPY]](<2 x s64>) - ; SI: [[C:%[0-9]+]]:_(s64) = G_FCONSTANT double 0x4330000000000000 - ; SI: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 -9223372036854775808 - ; SI: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 9223372036854775807 - ; SI: [[AND:%[0-9]+]]:_(s64) = G_AND [[C]], [[C2]] - ; SI: [[AND1:%[0-9]+]]:_(s64) = G_AND [[C]], [[C1]] - ; SI: [[OR:%[0-9]+]]:_(s64) = G_OR [[AND]], [[AND1]] - ; SI: [[COPY1:%[0-9]+]]:_(s64) = COPY [[OR]](s64) + ; SI: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 4841369599423283200 + ; SI: [[COPY1:%[0-9]+]]:_(s64) = COPY [[C]](s64) ; SI: [[FADD:%[0-9]+]]:_(s64) = G_FADD [[UV]], [[COPY1]] ; SI: [[FNEG:%[0-9]+]]:_(s64) = G_FNEG [[COPY1]] ; SI: [[FADD1:%[0-9]+]]:_(s64) = G_FADD 
[[FADD]], [[FNEG]] - ; SI: [[C3:%[0-9]+]]:_(s64) = G_FCONSTANT double 0x432FFFFFFFFFFFFF + ; SI: [[C1:%[0-9]+]]:_(s64) = G_FCONSTANT double 0x432FFFFFFFFFFFFF ; SI: [[FABS:%[0-9]+]]:_(s64) = G_FABS [[UV]] - ; SI: [[FCMP:%[0-9]+]]:_(s1) = G_FCMP floatpred(ogt), [[FABS]](s64), [[C3]] + ; SI: [[FCMP:%[0-9]+]]:_(s1) = G_FCMP floatpred(ogt), [[FABS]](s64), [[C1]] ; SI: [[SELECT:%[0-9]+]]:_(s64) = G_SELECT [[FCMP]](s1), [[UV]], [[FADD1]] ; SI: [[FRINT:%[0-9]+]]:_(s64) = G_FRINT [[UV]] - ; SI: [[FADD2:%[0-9]+]]:_(s64) = G_FADD [[UV1]], [[OR]] - ; SI: [[FNEG1:%[0-9]+]]:_(s64) = G_FNEG [[OR]] + ; SI: [[COPY2:%[0-9]+]]:_(s64) = COPY [[C]](s64) + ; SI: [[FADD2:%[0-9]+]]:_(s64) = G_FADD [[UV1]], [[COPY2]] + ; SI: [[FNEG1:%[0-9]+]]:_(s64) = G_FNEG [[COPY2]] ; SI: [[FADD3:%[0-9]+]]:_(s64) = G_FADD [[FADD2]], [[FNEG1]] ; SI: [[FABS1:%[0-9]+]]:_(s64) = G_FABS [[UV1]] - ; SI: [[FCMP1:%[0-9]+]]:_(s1) = G_FCMP floatpred(ogt), [[FABS1]](s64), [[C3]] + ; SI: [[FCMP1:%[0-9]+]]:_(s1) = G_FCMP floatpred(ogt), [[FABS1]](s64), [[C1]] ; SI: [[SELECT:%[0-9]+]]:_(s64) = G_SELECT [[FCMP1]](s1), [[UV1]], [[FADD3]] ; SI: [[FRINT1:%[0-9]+]]:_(s64) = G_FRINT [[UV1]] ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s64>) = G_BUILD_VECTOR [[FRINT]](s64), [[FRINT1]](s64) Modified: llvm/trunk/unittests/CodeGen/GlobalISel/ConstantFoldingTest.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/unittests/CodeGen/GlobalISel/ConstantFoldingTest.cpp?rev=374458&r1=374457&r2=374458&view=diff ============================================================================== --- llvm/trunk/unittests/CodeGen/GlobalISel/ConstantFoldingTest.cpp (original) +++ llvm/trunk/unittests/CodeGen/GlobalISel/ConstantFoldingTest.cpp Thu Oct 10 14:46:26 2019 @@ -68,4 +68,172 @@ TEST_F(GISelMITest, FoldWithBuilder) { EXPECT_EQ(-0x80, Cst); } +TEST_F(GISelMITest, FoldBinOp) { + setUp(); + if (!TM) + return; + + LLT s32{LLT::scalar(32)}; + auto MIBCst1 = B.buildConstant(s32, 16); + auto MIBCst2 = B.buildConstant(s32, 9); + auto MIBFCst1 = 
B.buildFConstant(s32, 1.0000001); + auto MIBFCst2 = B.buildFConstant(s32, 2.0); + + // Test G_ADD folding Integer + Mixed Int-Float cases + Optional FoldGAddInt = + ConstantFoldBinOp(TargetOpcode::G_ADD, MIBCst1->getOperand(0).getReg(), + MIBCst2->getOperand(0).getReg(), *MRI); + EXPECT_TRUE(FoldGAddInt.hasValue()); + EXPECT_EQ(25ULL, FoldGAddInt.getValue().getLimitedValue()); + Optional FoldGAddMix = + ConstantFoldBinOp(TargetOpcode::G_ADD, MIBCst1->getOperand(0).getReg(), + MIBFCst2->getOperand(0).getReg(), *MRI); + EXPECT_TRUE(FoldGAddMix.hasValue()); + EXPECT_EQ(1073741840ULL, FoldGAddMix.getValue().getLimitedValue()); + + // Test G_AND folding Integer + Mixed Int-Float cases + Optional FoldGAndInt = + ConstantFoldBinOp(TargetOpcode::G_AND, MIBCst1->getOperand(0).getReg(), + MIBCst2->getOperand(0).getReg(), *MRI); + EXPECT_TRUE(FoldGAndInt.hasValue()); + EXPECT_EQ(0ULL, FoldGAndInt.getValue().getLimitedValue()); + Optional FoldGAndMix = + ConstantFoldBinOp(TargetOpcode::G_AND, MIBCst2->getOperand(0).getReg(), + MIBFCst1->getOperand(0).getReg(), *MRI); + EXPECT_TRUE(FoldGAndMix.hasValue()); + EXPECT_EQ(1ULL, FoldGAndMix.getValue().getLimitedValue()); + + // Test G_ASHR folding Integer + Mixed cases + Optional FoldGAShrInt = + ConstantFoldBinOp(TargetOpcode::G_ASHR, MIBCst1->getOperand(0).getReg(), + MIBCst2->getOperand(0).getReg(), *MRI); + EXPECT_TRUE(FoldGAShrInt.hasValue()); + EXPECT_EQ(0ULL, FoldGAShrInt.getValue().getLimitedValue()); + Optional FoldGAShrMix = + ConstantFoldBinOp(TargetOpcode::G_ASHR, MIBFCst2->getOperand(0).getReg(), + MIBCst2->getOperand(0).getReg(), *MRI); + EXPECT_TRUE(FoldGAShrMix.hasValue()); + EXPECT_EQ(2097152ULL, FoldGAShrMix.getValue().getLimitedValue()); + + // Test G_LSHR folding Integer + Mixed Int-Float cases + Optional FoldGLShrInt = + ConstantFoldBinOp(TargetOpcode::G_LSHR, MIBCst1->getOperand(0).getReg(), + MIBCst2->getOperand(0).getReg(), *MRI); + EXPECT_TRUE(FoldGLShrInt.hasValue()); + EXPECT_EQ(0ULL, 
FoldGLShrInt.getValue().getLimitedValue()); + Optional FoldGLShrMix = + ConstantFoldBinOp(TargetOpcode::G_LSHR, MIBFCst1->getOperand(0).getReg(), + MIBCst2->getOperand(0).getReg(), *MRI); + EXPECT_TRUE(FoldGLShrMix.hasValue()); + EXPECT_EQ(2080768ULL, FoldGLShrMix.getValue().getLimitedValue()); + + // Test G_MUL folding Integer + Mixed Int-Float cases + Optional FoldGMulInt = + ConstantFoldBinOp(TargetOpcode::G_MUL, MIBCst1->getOperand(0).getReg(), + MIBCst2->getOperand(0).getReg(), *MRI); + EXPECT_TRUE(FoldGMulInt.hasValue()); + EXPECT_EQ(144ULL, FoldGMulInt.getValue().getLimitedValue()); + Optional FoldGMulMix = + ConstantFoldBinOp(TargetOpcode::G_MUL, MIBCst1->getOperand(0).getReg(), + MIBFCst2->getOperand(0).getReg(), *MRI); + EXPECT_TRUE(FoldGMulMix.hasValue()); + EXPECT_EQ(0ULL, FoldGMulMix.getValue().getLimitedValue()); + + // Test G_OR folding Integer + Mixed Int-Float cases + Optional FoldGOrInt = + ConstantFoldBinOp(TargetOpcode::G_OR, MIBCst1->getOperand(0).getReg(), + MIBCst2->getOperand(0).getReg(), *MRI); + EXPECT_TRUE(FoldGOrInt.hasValue()); + EXPECT_EQ(25ULL, FoldGOrInt.getValue().getLimitedValue()); + Optional FoldGOrMix = + ConstantFoldBinOp(TargetOpcode::G_OR, MIBCst1->getOperand(0).getReg(), + MIBFCst2->getOperand(0).getReg(), *MRI); + EXPECT_TRUE(FoldGOrMix.hasValue()); + EXPECT_EQ(1073741840ULL, FoldGOrMix.getValue().getLimitedValue()); + + // Test G_SHL folding Integer + Mixed Int-Float cases + Optional FoldGShlInt = + ConstantFoldBinOp(TargetOpcode::G_SHL, MIBCst1->getOperand(0).getReg(), + MIBCst2->getOperand(0).getReg(), *MRI); + EXPECT_TRUE(FoldGShlInt.hasValue()); + EXPECT_EQ(8192ULL, FoldGShlInt.getValue().getLimitedValue()); + Optional FoldGShlMix = + ConstantFoldBinOp(TargetOpcode::G_SHL, MIBCst1->getOperand(0).getReg(), + MIBFCst2->getOperand(0).getReg(), *MRI); + EXPECT_TRUE(FoldGShlMix.hasValue()); + EXPECT_EQ(0ULL, FoldGShlMix.getValue().getLimitedValue()); + + // Test G_SUB folding Integer + Mixed Int-Float cases + Optional 
FoldGSubInt = + ConstantFoldBinOp(TargetOpcode::G_SUB, MIBCst1->getOperand(0).getReg(), + MIBCst2->getOperand(0).getReg(), *MRI); + EXPECT_TRUE(FoldGSubInt.hasValue()); + EXPECT_EQ(7ULL, FoldGSubInt.getValue().getLimitedValue()); + Optional FoldGSubMix = + ConstantFoldBinOp(TargetOpcode::G_SUB, MIBCst1->getOperand(0).getReg(), + MIBFCst2->getOperand(0).getReg(), *MRI); + EXPECT_TRUE(FoldGSubMix.hasValue()); + EXPECT_EQ(3221225488ULL, FoldGSubMix.getValue().getLimitedValue()); + + // Test G_XOR folding Integer + Mixed Int-Float cases + Optional FoldGXorInt = + ConstantFoldBinOp(TargetOpcode::G_XOR, MIBCst1->getOperand(0).getReg(), + MIBCst2->getOperand(0).getReg(), *MRI); + EXPECT_TRUE(FoldGXorInt.hasValue()); + EXPECT_EQ(25ULL, FoldGXorInt.getValue().getLimitedValue()); + Optional FoldGXorMix = + ConstantFoldBinOp(TargetOpcode::G_XOR, MIBCst1->getOperand(0).getReg(), + MIBFCst2->getOperand(0).getReg(), *MRI); + EXPECT_TRUE(FoldGXorMix.hasValue()); + EXPECT_EQ(1073741840ULL, FoldGXorMix.getValue().getLimitedValue()); + + // Test G_UDIV folding Integer + Mixed Int-Float cases + Optional FoldGUdivInt = + ConstantFoldBinOp(TargetOpcode::G_UDIV, MIBCst1->getOperand(0).getReg(), + MIBCst2->getOperand(0).getReg(), *MRI); + EXPECT_TRUE(FoldGUdivInt.hasValue()); + EXPECT_EQ(1ULL, FoldGUdivInt.getValue().getLimitedValue()); + Optional FoldGUdivMix = + ConstantFoldBinOp(TargetOpcode::G_UDIV, MIBCst1->getOperand(0).getReg(), + MIBFCst2->getOperand(0).getReg(), *MRI); + EXPECT_TRUE(FoldGUdivMix.hasValue()); + EXPECT_EQ(0ULL, FoldGUdivMix.getValue().getLimitedValue()); + + // Test G_SDIV folding Integer + Mixed Int-Float cases + Optional FoldGSdivInt = + ConstantFoldBinOp(TargetOpcode::G_SDIV, MIBCst1->getOperand(0).getReg(), + MIBCst2->getOperand(0).getReg(), *MRI); + EXPECT_TRUE(FoldGSdivInt.hasValue()); + EXPECT_EQ(1ULL, FoldGSdivInt.getValue().getLimitedValue()); + Optional FoldGSdivMix = + ConstantFoldBinOp(TargetOpcode::G_SDIV, MIBCst1->getOperand(0).getReg(), + 
MIBFCst2->getOperand(0).getReg(), *MRI); + EXPECT_TRUE(FoldGSdivMix.hasValue()); + EXPECT_EQ(0ULL, FoldGSdivMix.getValue().getLimitedValue()); + + // Test G_UREM folding Integer + Mixed Int-Float cases + Optional FoldGUremInt = + ConstantFoldBinOp(TargetOpcode::G_UDIV, MIBCst1->getOperand(0).getReg(), + MIBCst2->getOperand(0).getReg(), *MRI); + EXPECT_TRUE(FoldGUremInt.hasValue()); + EXPECT_EQ(1ULL, FoldGUremInt.getValue().getLimitedValue()); + Optional FoldGUremMix = + ConstantFoldBinOp(TargetOpcode::G_UDIV, MIBCst1->getOperand(0).getReg(), + MIBFCst2->getOperand(0).getReg(), *MRI); + EXPECT_TRUE(FoldGUremMix.hasValue()); + EXPECT_EQ(0ULL, FoldGUremMix.getValue().getLimitedValue()); + + // Test G_SREM folding Integer + Mixed Int-Float cases + Optional FoldGSremInt = + ConstantFoldBinOp(TargetOpcode::G_SREM, MIBCst1->getOperand(0).getReg(), + MIBCst2->getOperand(0).getReg(), *MRI); + EXPECT_TRUE(FoldGSremInt.hasValue()); + EXPECT_EQ(7ULL, FoldGSremInt.getValue().getLimitedValue()); + Optional FoldGSremMix = + ConstantFoldBinOp(TargetOpcode::G_SREM, MIBCst1->getOperand(0).getReg(), + MIBFCst2->getOperand(0).getReg(), *MRI); + EXPECT_TRUE(FoldGSremMix.hasValue()); + EXPECT_EQ(16ULL, FoldGSremMix.getValue().getLimitedValue()); +} + } // namespace \ No newline at end of file From llvm-commits at lists.llvm.org Thu Oct 10 14:46:44 2019 From: llvm-commits at lists.llvm.org (Craig Topper via llvm-commits) Date: Thu, 10 Oct 2019 21:46:44 -0000 Subject: [llvm] r374459 - [X86] Add test cases for packus/ssat/usat 32i32->v32i8 test cases. NFC Message-ID: <20191010214644.D196792B1D@lists.llvm.org> Author: ctopper Date: Thu Oct 10 14:46:44 2019 New Revision: 374459 URL: http://llvm.org/viewvc/llvm-project?rev=374459&view=rev Log: [X86] Add test cases for packus/ssat/usat 32i32->v32i8 test cases. 
NFC Modified: llvm/trunk/test/CodeGen/X86/min-legal-vector-width.ll llvm/trunk/test/CodeGen/X86/vector-trunc-packus.ll llvm/trunk/test/CodeGen/X86/vector-trunc-ssat.ll llvm/trunk/test/CodeGen/X86/vector-trunc-usat.ll Modified: llvm/trunk/test/CodeGen/X86/min-legal-vector-width.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/min-legal-vector-width.ll?rev=374459&r1=374458&r2=374459&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/min-legal-vector-width.ll (original) +++ llvm/trunk/test/CodeGen/X86/min-legal-vector-width.ll Thu Oct 10 14:46:44 2019 @@ -1097,3 +1097,22 @@ define <16 x i8> @trunc_packus_v16i32_v1 %f = trunc <16 x i32> %e to <16 x i8> ret <16 x i8> %f } + +define <32 x i8> @trunc_packus_v32i32_v32i8(<32 x i32> %a0) { +; CHECK-LABEL: trunc_packus_v32i32_v32i8: +; CHECK: # %bb.0: +; CHECK-NEXT: vpxor %xmm2, %xmm2, %xmm2 +; CHECK-NEXT: vpmaxsd %zmm2, %zmm0, %zmm0 +; CHECK-NEXT: vpmovusdb %zmm0, %xmm0 +; CHECK-NEXT: vpmaxsd %zmm2, %zmm1, %zmm1 +; CHECK-NEXT: vpmovusdb %zmm1, %xmm1 +; CHECK-NEXT: vinserti128 $1, %xmm1, %ymm0, %ymm0 +; CHECK-NEXT: retq + %1 = icmp slt <32 x i32> %a0, + %2 = select <32 x i1> %1, <32 x i32> %a0, <32 x i32> + %3 = icmp sgt <32 x i32> %2, zeroinitializer + %4 = select <32 x i1> %3, <32 x i32> %2, <32 x i32> zeroinitializer + %5 = trunc <32 x i32> %4 to <32 x i8> + ret <32 x i8> %5 +} + Modified: llvm/trunk/test/CodeGen/X86/vector-trunc-packus.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/vector-trunc-packus.ll?rev=374459&r1=374458&r2=374459&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/vector-trunc-packus.ll (original) +++ llvm/trunk/test/CodeGen/X86/vector-trunc-packus.ll Thu Oct 10 14:46:44 2019 @@ -3087,3 +3087,57 @@ define <32 x i8> @trunc_packus_v32i16_v3 %5 = trunc <32 x i16> %4 to <32 x i8> ret <32 x i8> %5 } + +define <32 x i8> 
@trunc_packus_v32i32_v32i8(<32 x i32> %a0) { +; SSE-LABEL: trunc_packus_v32i32_v32i8: +; SSE: # %bb.0: +; SSE-NEXT: packssdw %xmm3, %xmm2 +; SSE-NEXT: packssdw %xmm1, %xmm0 +; SSE-NEXT: packuswb %xmm2, %xmm0 +; SSE-NEXT: packssdw %xmm7, %xmm6 +; SSE-NEXT: packssdw %xmm5, %xmm4 +; SSE-NEXT: packuswb %xmm6, %xmm4 +; SSE-NEXT: movdqa %xmm4, %xmm1 +; SSE-NEXT: retq +; +; AVX1-LABEL: trunc_packus_v32i32_v32i8: +; AVX1: # %bb.0: +; AVX1-NEXT: vextractf128 $1, %ymm3, %xmm4 +; AVX1-NEXT: vpackssdw %xmm4, %xmm3, %xmm3 +; AVX1-NEXT: vextractf128 $1, %ymm2, %xmm4 +; AVX1-NEXT: vpackssdw %xmm4, %xmm2, %xmm2 +; AVX1-NEXT: vpackuswb %xmm3, %xmm2, %xmm2 +; AVX1-NEXT: vextractf128 $1, %ymm1, %xmm3 +; AVX1-NEXT: vpackssdw %xmm3, %xmm1, %xmm1 +; AVX1-NEXT: vextractf128 $1, %ymm0, %xmm3 +; AVX1-NEXT: vpackssdw %xmm3, %xmm0, %xmm0 +; AVX1-NEXT: vpackuswb %xmm1, %xmm0, %xmm0 +; AVX1-NEXT: vinsertf128 $1, %xmm2, %ymm0, %ymm0 +; AVX1-NEXT: retq +; +; AVX2-LABEL: trunc_packus_v32i32_v32i8: +; AVX2: # %bb.0: +; AVX2-NEXT: vpackssdw %ymm3, %ymm2, %ymm2 +; AVX2-NEXT: vpermq {{.*#+}} ymm2 = ymm2[0,2,1,3] +; AVX2-NEXT: vpackssdw %ymm1, %ymm0, %ymm0 +; AVX2-NEXT: vpermq {{.*#+}} ymm0 = ymm0[0,2,1,3] +; AVX2-NEXT: vpackuswb %ymm2, %ymm0, %ymm0 +; AVX2-NEXT: vpermq {{.*#+}} ymm0 = ymm0[0,2,1,3] +; AVX2-NEXT: retq +; +; AVX512-LABEL: trunc_packus_v32i32_v32i8: +; AVX512: # %bb.0: +; AVX512-NEXT: vpxor %xmm2, %xmm2, %xmm2 +; AVX512-NEXT: vpmaxsd %zmm2, %zmm0, %zmm0 +; AVX512-NEXT: vpmovusdb %zmm0, %xmm0 +; AVX512-NEXT: vpmaxsd %zmm2, %zmm1, %zmm1 +; AVX512-NEXT: vpmovusdb %zmm1, %xmm1 +; AVX512-NEXT: vinserti128 $1, %xmm1, %ymm0, %ymm0 +; AVX512-NEXT: retq + %1 = icmp slt <32 x i32> %a0, + %2 = select <32 x i1> %1, <32 x i32> %a0, <32 x i32> + %3 = icmp sgt <32 x i32> %2, zeroinitializer + %4 = select <32 x i1> %3, <32 x i32> %2, <32 x i32> zeroinitializer + %5 = trunc <32 x i32> %4 to <32 x i8> + ret <32 x i8> %5 +} Modified: llvm/trunk/test/CodeGen/X86/vector-trunc-ssat.ll URL: 
http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/vector-trunc-ssat.ll?rev=374459&r1=374458&r2=374459&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/vector-trunc-ssat.ll (original) +++ llvm/trunk/test/CodeGen/X86/vector-trunc-ssat.ll Thu Oct 10 14:46:44 2019 @@ -3049,3 +3049,54 @@ define <32 x i8> @trunc_ssat_v32i16_v32i %5 = trunc <32 x i16> %4 to <32 x i8> ret <32 x i8> %5 } + +define <32 x i8> @trunc_ssat_v32i32_v32i8(<32 x i32> %a0) { +; SSE-LABEL: trunc_ssat_v32i32_v32i8: +; SSE: # %bb.0: +; SSE-NEXT: packssdw %xmm3, %xmm2 +; SSE-NEXT: packssdw %xmm1, %xmm0 +; SSE-NEXT: packsswb %xmm2, %xmm0 +; SSE-NEXT: packssdw %xmm7, %xmm6 +; SSE-NEXT: packssdw %xmm5, %xmm4 +; SSE-NEXT: packsswb %xmm6, %xmm4 +; SSE-NEXT: movdqa %xmm4, %xmm1 +; SSE-NEXT: retq +; +; AVX1-LABEL: trunc_ssat_v32i32_v32i8: +; AVX1: # %bb.0: +; AVX1-NEXT: vextractf128 $1, %ymm3, %xmm4 +; AVX1-NEXT: vpackssdw %xmm4, %xmm3, %xmm3 +; AVX1-NEXT: vextractf128 $1, %ymm2, %xmm4 +; AVX1-NEXT: vpackssdw %xmm4, %xmm2, %xmm2 +; AVX1-NEXT: vpacksswb %xmm3, %xmm2, %xmm2 +; AVX1-NEXT: vextractf128 $1, %ymm1, %xmm3 +; AVX1-NEXT: vpackssdw %xmm3, %xmm1, %xmm1 +; AVX1-NEXT: vextractf128 $1, %ymm0, %xmm3 +; AVX1-NEXT: vpackssdw %xmm3, %xmm0, %xmm0 +; AVX1-NEXT: vpacksswb %xmm1, %xmm0, %xmm0 +; AVX1-NEXT: vinsertf128 $1, %xmm2, %ymm0, %ymm0 +; AVX1-NEXT: retq +; +; AVX2-LABEL: trunc_ssat_v32i32_v32i8: +; AVX2: # %bb.0: +; AVX2-NEXT: vpackssdw %ymm3, %ymm2, %ymm2 +; AVX2-NEXT: vpermq {{.*#+}} ymm2 = ymm2[0,2,1,3] +; AVX2-NEXT: vpackssdw %ymm1, %ymm0, %ymm0 +; AVX2-NEXT: vpermq {{.*#+}} ymm0 = ymm0[0,2,1,3] +; AVX2-NEXT: vpacksswb %ymm2, %ymm0, %ymm0 +; AVX2-NEXT: vpermq {{.*#+}} ymm0 = ymm0[0,2,1,3] +; AVX2-NEXT: retq +; +; AVX512-LABEL: trunc_ssat_v32i32_v32i8: +; AVX512: # %bb.0: +; AVX512-NEXT: vpmovsdb %zmm0, %xmm0 +; AVX512-NEXT: vpmovsdb %zmm1, %xmm1 +; AVX512-NEXT: vinserti128 $1, %xmm1, %ymm0, %ymm0 +; AVX512-NEXT: retq + 
%1 = icmp slt <32 x i32> %a0, + %2 = select <32 x i1> %1, <32 x i32> %a0, <32 x i32> + %3 = icmp sgt <32 x i32> %2, + %4 = select <32 x i1> %3, <32 x i32> %2, <32 x i32> + %5 = trunc <32 x i32> %4 to <32 x i8> + ret <32 x i8> %5 +} Modified: llvm/trunk/test/CodeGen/X86/vector-trunc-usat.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/vector-trunc-usat.ll?rev=374459&r1=374458&r2=374459&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/vector-trunc-usat.ll (original) +++ llvm/trunk/test/CodeGen/X86/vector-trunc-usat.ll Thu Oct 10 14:46:44 2019 @@ -2453,3 +2453,211 @@ define <32 x i8> @trunc_usat_v32i16_v32i %3 = trunc <32 x i16> %2 to <32 x i8> ret <32 x i8> %3 } + +define <32 x i8> @trunc_usat_v32i32_v32i8(<32 x i32> %a0) { +; SSE2-LABEL: trunc_usat_v32i32_v32i8: +; SSE2: # %bb.0: +; SSE2-NEXT: movdqa %xmm1, %xmm8 +; SSE2-NEXT: movdqa {{.*#+}} xmm10 = [255,255,255,255] +; SSE2-NEXT: movdqa {{.*#+}} xmm11 = [2147483648,2147483648,2147483648,2147483648] +; SSE2-NEXT: movdqa %xmm5, %xmm1 +; SSE2-NEXT: pxor %xmm11, %xmm1 +; SSE2-NEXT: movdqa {{.*#+}} xmm9 = [2147483903,2147483903,2147483903,2147483903] +; SSE2-NEXT: movdqa %xmm9, %xmm12 +; SSE2-NEXT: pcmpgtd %xmm1, %xmm12 +; SSE2-NEXT: pand %xmm12, %xmm5 +; SSE2-NEXT: pandn %xmm10, %xmm12 +; SSE2-NEXT: por %xmm5, %xmm12 +; SSE2-NEXT: movdqa %xmm4, %xmm5 +; SSE2-NEXT: pxor %xmm11, %xmm5 +; SSE2-NEXT: movdqa %xmm9, %xmm1 +; SSE2-NEXT: pcmpgtd %xmm5, %xmm1 +; SSE2-NEXT: pand %xmm1, %xmm4 +; SSE2-NEXT: pandn %xmm10, %xmm1 +; SSE2-NEXT: por %xmm4, %xmm1 +; SSE2-NEXT: packuswb %xmm12, %xmm1 +; SSE2-NEXT: movdqa %xmm7, %xmm4 +; SSE2-NEXT: pxor %xmm11, %xmm4 +; SSE2-NEXT: movdqa %xmm9, %xmm5 +; SSE2-NEXT: pcmpgtd %xmm4, %xmm5 +; SSE2-NEXT: pand %xmm5, %xmm7 +; SSE2-NEXT: pandn %xmm10, %xmm5 +; SSE2-NEXT: por %xmm7, %xmm5 +; SSE2-NEXT: movdqa %xmm6, %xmm4 +; SSE2-NEXT: pxor %xmm11, %xmm4 +; SSE2-NEXT: movdqa %xmm9, %xmm7 +; SSE2-NEXT: 
pcmpgtd %xmm4, %xmm7 +; SSE2-NEXT: pand %xmm7, %xmm6 +; SSE2-NEXT: pandn %xmm10, %xmm7 +; SSE2-NEXT: por %xmm6, %xmm7 +; SSE2-NEXT: packuswb %xmm5, %xmm7 +; SSE2-NEXT: packuswb %xmm7, %xmm1 +; SSE2-NEXT: movdqa %xmm8, %xmm4 +; SSE2-NEXT: pxor %xmm11, %xmm4 +; SSE2-NEXT: movdqa %xmm9, %xmm5 +; SSE2-NEXT: pcmpgtd %xmm4, %xmm5 +; SSE2-NEXT: pand %xmm5, %xmm8 +; SSE2-NEXT: pandn %xmm10, %xmm5 +; SSE2-NEXT: por %xmm8, %xmm5 +; SSE2-NEXT: movdqa %xmm0, %xmm4 +; SSE2-NEXT: pxor %xmm11, %xmm4 +; SSE2-NEXT: movdqa %xmm9, %xmm6 +; SSE2-NEXT: pcmpgtd %xmm4, %xmm6 +; SSE2-NEXT: pand %xmm6, %xmm0 +; SSE2-NEXT: pandn %xmm10, %xmm6 +; SSE2-NEXT: por %xmm6, %xmm0 +; SSE2-NEXT: packuswb %xmm5, %xmm0 +; SSE2-NEXT: movdqa %xmm3, %xmm4 +; SSE2-NEXT: pxor %xmm11, %xmm4 +; SSE2-NEXT: movdqa %xmm9, %xmm5 +; SSE2-NEXT: pcmpgtd %xmm4, %xmm5 +; SSE2-NEXT: pand %xmm5, %xmm3 +; SSE2-NEXT: pandn %xmm10, %xmm5 +; SSE2-NEXT: por %xmm3, %xmm5 +; SSE2-NEXT: pxor %xmm2, %xmm11 +; SSE2-NEXT: pcmpgtd %xmm11, %xmm9 +; SSE2-NEXT: pand %xmm9, %xmm2 +; SSE2-NEXT: pandn %xmm10, %xmm9 +; SSE2-NEXT: por %xmm2, %xmm9 +; SSE2-NEXT: packuswb %xmm5, %xmm9 +; SSE2-NEXT: packuswb %xmm9, %xmm0 +; SSE2-NEXT: retq +; +; SSSE3-LABEL: trunc_usat_v32i32_v32i8: +; SSSE3: # %bb.0: +; SSSE3-NEXT: movdqa %xmm1, %xmm8 +; SSSE3-NEXT: movdqa {{.*#+}} xmm10 = [255,255,255,255] +; SSSE3-NEXT: movdqa {{.*#+}} xmm11 = [2147483648,2147483648,2147483648,2147483648] +; SSSE3-NEXT: movdqa %xmm5, %xmm1 +; SSSE3-NEXT: pxor %xmm11, %xmm1 +; SSSE3-NEXT: movdqa {{.*#+}} xmm9 = [2147483903,2147483903,2147483903,2147483903] +; SSSE3-NEXT: movdqa %xmm9, %xmm12 +; SSSE3-NEXT: pcmpgtd %xmm1, %xmm12 +; SSSE3-NEXT: pand %xmm12, %xmm5 +; SSSE3-NEXT: pandn %xmm10, %xmm12 +; SSSE3-NEXT: por %xmm5, %xmm12 +; SSSE3-NEXT: movdqa %xmm4, %xmm5 +; SSSE3-NEXT: pxor %xmm11, %xmm5 +; SSSE3-NEXT: movdqa %xmm9, %xmm1 +; SSSE3-NEXT: pcmpgtd %xmm5, %xmm1 +; SSSE3-NEXT: pand %xmm1, %xmm4 +; SSSE3-NEXT: pandn %xmm10, %xmm1 +; SSSE3-NEXT: por %xmm4, %xmm1 +; 
SSSE3-NEXT: packuswb %xmm12, %xmm1 +; SSSE3-NEXT: movdqa %xmm7, %xmm4 +; SSSE3-NEXT: pxor %xmm11, %xmm4 +; SSSE3-NEXT: movdqa %xmm9, %xmm5 +; SSSE3-NEXT: pcmpgtd %xmm4, %xmm5 +; SSSE3-NEXT: pand %xmm5, %xmm7 +; SSSE3-NEXT: pandn %xmm10, %xmm5 +; SSSE3-NEXT: por %xmm7, %xmm5 +; SSSE3-NEXT: movdqa %xmm6, %xmm4 +; SSSE3-NEXT: pxor %xmm11, %xmm4 +; SSSE3-NEXT: movdqa %xmm9, %xmm7 +; SSSE3-NEXT: pcmpgtd %xmm4, %xmm7 +; SSSE3-NEXT: pand %xmm7, %xmm6 +; SSSE3-NEXT: pandn %xmm10, %xmm7 +; SSSE3-NEXT: por %xmm6, %xmm7 +; SSSE3-NEXT: packuswb %xmm5, %xmm7 +; SSSE3-NEXT: packuswb %xmm7, %xmm1 +; SSSE3-NEXT: movdqa %xmm8, %xmm4 +; SSSE3-NEXT: pxor %xmm11, %xmm4 +; SSSE3-NEXT: movdqa %xmm9, %xmm5 +; SSSE3-NEXT: pcmpgtd %xmm4, %xmm5 +; SSSE3-NEXT: pand %xmm5, %xmm8 +; SSSE3-NEXT: pandn %xmm10, %xmm5 +; SSSE3-NEXT: por %xmm8, %xmm5 +; SSSE3-NEXT: movdqa %xmm0, %xmm4 +; SSSE3-NEXT: pxor %xmm11, %xmm4 +; SSSE3-NEXT: movdqa %xmm9, %xmm6 +; SSSE3-NEXT: pcmpgtd %xmm4, %xmm6 +; SSSE3-NEXT: pand %xmm6, %xmm0 +; SSSE3-NEXT: pandn %xmm10, %xmm6 +; SSSE3-NEXT: por %xmm6, %xmm0 +; SSSE3-NEXT: packuswb %xmm5, %xmm0 +; SSSE3-NEXT: movdqa %xmm3, %xmm4 +; SSSE3-NEXT: pxor %xmm11, %xmm4 +; SSSE3-NEXT: movdqa %xmm9, %xmm5 +; SSSE3-NEXT: pcmpgtd %xmm4, %xmm5 +; SSSE3-NEXT: pand %xmm5, %xmm3 +; SSSE3-NEXT: pandn %xmm10, %xmm5 +; SSSE3-NEXT: por %xmm3, %xmm5 +; SSSE3-NEXT: pxor %xmm2, %xmm11 +; SSSE3-NEXT: pcmpgtd %xmm11, %xmm9 +; SSSE3-NEXT: pand %xmm9, %xmm2 +; SSSE3-NEXT: pandn %xmm10, %xmm9 +; SSSE3-NEXT: por %xmm2, %xmm9 +; SSSE3-NEXT: packuswb %xmm5, %xmm9 +; SSSE3-NEXT: packuswb %xmm9, %xmm0 +; SSSE3-NEXT: retq +; +; SSE41-LABEL: trunc_usat_v32i32_v32i8: +; SSE41: # %bb.0: +; SSE41-NEXT: movdqa {{.*#+}} xmm8 = [255,255,255,255] +; SSE41-NEXT: pminud %xmm8, %xmm5 +; SSE41-NEXT: pminud %xmm8, %xmm4 +; SSE41-NEXT: packusdw %xmm5, %xmm4 +; SSE41-NEXT: pminud %xmm8, %xmm7 +; SSE41-NEXT: pminud %xmm8, %xmm6 +; SSE41-NEXT: packusdw %xmm7, %xmm6 +; SSE41-NEXT: packuswb %xmm6, %xmm4 +; SSE41-NEXT: 
pminud %xmm8, %xmm1 +; SSE41-NEXT: pminud %xmm8, %xmm0 +; SSE41-NEXT: packusdw %xmm1, %xmm0 +; SSE41-NEXT: pminud %xmm8, %xmm3 +; SSE41-NEXT: pminud %xmm8, %xmm2 +; SSE41-NEXT: packusdw %xmm3, %xmm2 +; SSE41-NEXT: packuswb %xmm2, %xmm0 +; SSE41-NEXT: movdqa %xmm4, %xmm1 +; SSE41-NEXT: retq +; +; AVX1-LABEL: trunc_usat_v32i32_v32i8: +; AVX1: # %bb.0: +; AVX1-NEXT: vextractf128 $1, %ymm0, %xmm4 +; AVX1-NEXT: vmovdqa {{.*#+}} xmm5 = [255,255,255,255] +; AVX1-NEXT: vpminud %xmm5, %xmm4, %xmm4 +; AVX1-NEXT: vpminud %xmm5, %xmm0, %xmm0 +; AVX1-NEXT: vpackusdw %xmm4, %xmm0, %xmm0 +; AVX1-NEXT: vextractf128 $1, %ymm1, %xmm4 +; AVX1-NEXT: vpminud %xmm5, %xmm4, %xmm4 +; AVX1-NEXT: vpminud %xmm5, %xmm1, %xmm1 +; AVX1-NEXT: vpackusdw %xmm4, %xmm1, %xmm1 +; AVX1-NEXT: vpackuswb %xmm1, %xmm0, %xmm0 +; AVX1-NEXT: vextractf128 $1, %ymm2, %xmm1 +; AVX1-NEXT: vpminud %xmm5, %xmm1, %xmm1 +; AVX1-NEXT: vpminud %xmm5, %xmm2, %xmm2 +; AVX1-NEXT: vpackusdw %xmm1, %xmm2, %xmm1 +; AVX1-NEXT: vextractf128 $1, %ymm3, %xmm2 +; AVX1-NEXT: vpminud %xmm5, %xmm2, %xmm2 +; AVX1-NEXT: vpminud %xmm5, %xmm3, %xmm3 +; AVX1-NEXT: vpackusdw %xmm2, %xmm3, %xmm2 +; AVX1-NEXT: vpackuswb %xmm2, %xmm1, %xmm1 +; AVX1-NEXT: vinsertf128 $1, %xmm1, %ymm0, %ymm0 +; AVX1-NEXT: retq +; +; AVX2-LABEL: trunc_usat_v32i32_v32i8: +; AVX2: # %bb.0: +; AVX2-NEXT: vpbroadcastd {{.*#+}} ymm4 = [255,255,255,255,255,255,255,255] +; AVX2-NEXT: vpminud %ymm4, %ymm1, %ymm1 +; AVX2-NEXT: vpminud %ymm4, %ymm0, %ymm0 +; AVX2-NEXT: vpackusdw %ymm1, %ymm0, %ymm0 +; AVX2-NEXT: vpminud %ymm4, %ymm3, %ymm1 +; AVX2-NEXT: vpminud %ymm4, %ymm2, %ymm2 +; AVX2-NEXT: vpackusdw %ymm1, %ymm2, %ymm1 +; AVX2-NEXT: vpermq {{.*#+}} ymm1 = ymm1[0,2,1,3] +; AVX2-NEXT: vpermq {{.*#+}} ymm0 = ymm0[0,2,1,3] +; AVX2-NEXT: vpackuswb %ymm1, %ymm0, %ymm0 +; AVX2-NEXT: vpermq {{.*#+}} ymm0 = ymm0[0,2,1,3] +; AVX2-NEXT: retq +; +; AVX512-LABEL: trunc_usat_v32i32_v32i8: +; AVX512: # %bb.0: +; AVX512-NEXT: vpmovusdb %zmm0, %xmm0 +; AVX512-NEXT: vpmovusdb %zmm1, 
%xmm1 +; AVX512-NEXT: vinserti128 $1, %xmm1, %ymm0, %ymm0 +; AVX512-NEXT: retq + %1 = icmp ult <32 x i32> %a0, + %2 = select <32 x i1> %1, <32 x i32> %a0, <32 x i32> + %3 = trunc <32 x i32> %2 to <32 x i8> + ret <32 x i8> %3 +} From llvm-commits at lists.llvm.org Thu Oct 10 14:46:52 2019 From: llvm-commits at lists.llvm.org (Craig Topper via llvm-commits) Date: Thu, 10 Oct 2019 21:46:52 -0000 Subject: [llvm] r374460 - [X86] Guard against leaving a dangling node in combineTruncateWithSat. Message-ID: <20191010214652.CB21092B34@lists.llvm.org> Author: ctopper Date: Thu Oct 10 14:46:52 2019 New Revision: 374460 URL: http://llvm.org/viewvc/llvm-project?rev=374460&view=rev Log: [X86] Guard against leaving a dangling node in combineTruncateWithSat. When handling the packus pattern for i32->i8 we do a two step process using a packss to i16 followed by a packus to i8. If the final i8 step is a type with less than 64-bits the packus step will return SDValue(), but the i32->i16 step might have succeeded. This leaves the nodes from the middle step dangling. Guard against this by pre-checking that the number of elements is at least 8 before doing the middle step. With that check in place this should mean the only other case the middle step itself can fail is when SSE2 is disabled. So add an early SSE2 check then just assert that neither the middle or final step ever fail. 
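The two-step lowering described in the log above can be modelled with plain integer arithmetic. This is only a sketch of the saturation semantics, not the SelectionDAG code: the function names are hypothetical, lanes are ordinary Python integers, and returning `None` merely stands for "this two-step path is not taken" (the real combine goes on to try other strategies).

```python
def packss_i32_to_i16(lanes):
    # Signed saturation of each i32 lane to the i16 range (models PACKSSDW).
    return [max(-32768, min(32767, v)) for v in lanes]


def packus_i16_to_i8(lanes):
    # Unsigned saturation of each signed i16 lane to 0..255 (models PACKUSWB).
    return [max(0, min(255, v)) for v in lanes]


def trunc_packus_i32_to_i8(lanes):
    # Hypothetical mirror of the pre-check from the log: only start the
    # middle step when the i8 result has at least 8 elements (64 bits),
    # so no intermediate value is built and then abandoned.
    if len(lanes) < 8:
        return None
    mid = packss_i32_to_i16(lanes)
    return packus_i16_to_i8(mid)


print(trunc_packus_i32_to_i8([-5, 0, 100, 255, 256, 70000, -70000, 32768]))
print(trunc_packus_i32_to_i8([1, 2, 3, 4]))
```

Checking the element count before the first step, rather than after the second, is what keeps the model from producing a `mid` value that nothing consumes.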
Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=374460&r1=374459&r2=374460&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp (original) +++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Thu Oct 10 14:46:52 2019 @@ -39830,6 +39830,9 @@ static SDValue detectSSatPattern(SDValue static SDValue combineTruncateWithSat(SDValue In, EVT VT, const SDLoc &DL, SelectionDAG &DAG, const X86Subtarget &Subtarget) { + if (!Subtarget.hasSSE2()) + return SDValue(); + EVT SVT = VT.getScalarType(); EVT InVT = In.getValueType(); EVT InSVT = InVT.getScalarType(); @@ -39852,6 +39855,7 @@ static SDValue combineTruncateWithSat(SD // Emit a VPACKUSDW+VPERMQ followed by a VPMOVUSWB. SDValue Mid = truncateVectorWithPACK(X86ISD::PACKUS, MVT::v16i16, USatVal, DL, DAG, Subtarget); + assert(Mid && "Failed to pack!"); return DAG.getNode(X86ISD::VTRUNCUS, DL, VT, Mid); } } @@ -39863,14 +39867,19 @@ static SDValue combineTruncateWithSat(SD (InSVT == MVT::i16 || InSVT == MVT::i32)) { if (auto USatVal = detectSSatPattern(In, VT, true)) { // vXi32 -> vXi8 must be performed as PACKUSWB(PACKSSDW,PACKSSDW). - if (SVT == MVT::i8 && InSVT == MVT::i32) { + // Only do this when the result is at least 64 bits or we'll leaving + // dangling PACKSSDW nodes. 
+ if (SVT == MVT::i8 && InSVT == MVT::i32 && + VT.getVectorNumElements() >= 8) { EVT MidVT = EVT::getVectorVT(*DAG.getContext(), MVT::i16, VT.getVectorNumElements()); SDValue Mid = truncateVectorWithPACK(X86ISD::PACKSS, MidVT, USatVal, DL, DAG, Subtarget); - if (Mid) - return truncateVectorWithPACK(X86ISD::PACKUS, VT, Mid, DL, DAG, - Subtarget); + assert(Mid && "Failed to pack!"); + SDValue V = truncateVectorWithPACK(X86ISD::PACKUS, VT, Mid, DL, DAG, + Subtarget); + assert(V && "Failed to pack!"); + return V; } else if (SVT == MVT::i8 || Subtarget.hasSSE41()) return truncateVectorWithPACK(X86ISD::PACKUS, VT, USatVal, DL, DAG, Subtarget); From llvm-commits at lists.llvm.org Thu Oct 10 14:51:30 2019 From: llvm-commits at lists.llvm.org (Marcello Maggioni via llvm-commits) Date: Thu, 10 Oct 2019 21:51:30 -0000 Subject: [llvm] r374463 - [GISel] Simplifying return from else in function. NFC Message-ID: <20191010215130.4FDCB92AD1@lists.llvm.org> Author: mggm Date: Thu Oct 10 14:51:30 2019 New Revision: 374463 URL: http://llvm.org/viewvc/llvm-project?rev=374463&view=rev Log: [GISel] Simplifying return from else in function. 
NFC Forgot to integrate this little change in previous commit Modified: llvm/trunk/lib/CodeGen/GlobalISel/Utils.cpp Modified: llvm/trunk/lib/CodeGen/GlobalISel/Utils.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/GlobalISel/Utils.cpp?rev=374463&r1=374462&r2=374463&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/GlobalISel/Utils.cpp (original) +++ llvm/trunk/lib/CodeGen/GlobalISel/Utils.cpp Thu Oct 10 14:51:30 2019 @@ -238,9 +238,8 @@ Optional llvm::getConstant assert(Val.getBitWidth() == BitWidth && "Value bitwidth doesn't match definition type"); return Val; - } else { - return CstVal.getFPImm()->getValueAPF().bitcastToAPInt(); } + return CstVal.getFPImm()->getValueAPF().bitcastToAPInt(); }; while ((MI = MRI.getVRegDef(VReg)) && !IsConstantOpcode(MI->getOpcode()) && LookThroughInstrs) { From llvm-commits at lists.llvm.org Thu Oct 10 14:53:03 2019 From: llvm-commits at lists.llvm.org (Julian Lettner via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 21:53:03 +0000 (UTC) Subject: [PATCH] D68836: [lit] Small cleanups in main.py Message-ID: yln created this revision. yln added reviewers: rnk, ddunbar, serge-sans-paille, probinson, jdenny, cishida, nate_chandler, jordan_rose. Herald added subscribers: llvm-commits, delcypher. Herald added a project: LLVM. - Extract separate function for running tests from main - Push single-usage imports to point of usage - Remove unnecessary sys.exit(0) calls Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D68836 Files: llvm/utils/lit/lit/main.py -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68836.224485.patch
Type: text/x-patch
Size: 3224 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Thu Oct 10 14:53:10 2019
From: llvm-commits at lists.llvm.org (Johannes Doerfert via Phabricator via llvm-commits)
Date: Thu, 10 Oct 2019 21:53:10 +0000 (UTC)
Subject: [PATCH] D68819: [Utils] Allow update_test_checks to check function arguments
In-Reply-To:
References:
Message-ID:

jdoerfert added a comment.

In D68819#1704693, @greened wrote:

> Does this subsume the goal of D68153? If so I am happy to abandon that revision. D68153 attempts to solve the problem of a `CHECK-LABEL` matching a function call instead of the start of a function definition. It looks like with `--function-signature` the `CHECK-LABEL` will include the arguments in the label pattern which should be enough to disambiguate it from a call to the function. Do I have that right?
>
> EDIT: Actually, I don't think it will completely work if there is a recursive call to, say, `foo` that passes the same arguments through. In that case the call will look exactly like the function signature. The same is true if an unrelated function makes a call to `foo` with values that just happen to be named the same as `foo`'s arguments.

We can, or should, combine D68153 and this, either in one or two patches.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68819/new/

https://reviews.llvm.org/D68819

From llvm-commits at lists.llvm.org Thu Oct 10 14:53:17 2019
From: llvm-commits at lists.llvm.org (Marcello Maggioni via Phabricator via llvm-commits)
Date: Thu, 10 Oct 2019 21:53:17 +0000 (UTC)
Subject: [PATCH] D68739: [GISel] Allow ConstantFoldBinOp to consider G_FCONSTANT binary representation for combines
In-Reply-To:
References:
Message-ID: <31f24dbf31986b7829501fccd473f108@localhost.localdomain>

This revision was automatically updated to reflect the committed changes.
Closed by commit rG0112123eea5f: [GISel] Allow getConstantVRegVal() to return G_FCONSTANT values. (authored by kariddi).

Changed prior to commit:
  https://reviews.llvm.org/D68739?vs=224441&id=224486#toc

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68739/new/

https://reviews.llvm.org/D68739

Files:
  llvm/include/llvm/CodeGen/GlobalISel/Utils.h
  llvm/lib/CodeGen/GlobalISel/Utils.cpp
  llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-frint.mir
  llvm/unittests/CodeGen/GlobalISel/ConstantFoldingTest.cpp

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68739.224486.patch
Type: text/x-patch
Size: 16869 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Thu Oct 10 15:03:16 2019
From: llvm-commits at lists.llvm.org (Florian Hahn via Phabricator via llvm-commits)
Date: Thu, 10 Oct 2019 22:03:16 +0000 (UTC)
Subject: [PATCH] D68194: [LCSSA] Forget values we create LCSSA phis for
In-Reply-To:
References:
Message-ID:

fhahn updated this revision to Diff 224488.
fhahn marked an inline comment as done.
fhahn added a comment.

remove undef branch conditions from tests.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68194/new/

https://reviews.llvm.org/D68194

Files:
  llvm/include/llvm/Transforms/Utils/LoopUtils.h
  llvm/lib/Transforms/Utils/LCSSA.cpp
  llvm/test/Transforms/LoopUnroll/unroll-preserve-scev-lcssa.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68194.224488.patch
Type: text/x-patch
Size: 5041 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Thu Oct 10 15:03:17 2019
From: llvm-commits at lists.llvm.org (Florian Hahn via Phabricator via llvm-commits)
Date: Thu, 10 Oct 2019 22:03:17 +0000 (UTC)
Subject: [PATCH] D68194: [LCSSA] Forget values we create LCSSA phis for
In-Reply-To:
References:
Message-ID: <290d80ade02b3e858578ea4376f56c36@localhost.localdomain>

fhahn marked an inline comment as done.
fhahn added inline comments.

================
Comment at: llvm/test/Transforms/LoopUnroll/unroll-preserve-scev-lcssa.ll:76
+bb3: ; preds = %bb9, %bb
+  br i1 undef, label %bb9, label %bb5
+
----------------
sanjoy.google wrote:
> Minor thing: I'd avoid adding branches on `undef` (unless you need them to reproduce the test) because there is no guarantee on how the optimizers will optimize these.
>
> (It is also unclear whether this is UB.)
Excellent point, thanks! This was originally reduced by bugpoint, which likes undef, but I've added proper conditions now.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68194/new/

https://reviews.llvm.org/D68194

From llvm-commits at lists.llvm.org Thu Oct 10 15:03:17 2019
From: llvm-commits at lists.llvm.org (Leonard Chan via Phabricator via llvm-commits)
Date: Thu, 10 Oct 2019 22:03:17 +0000 (UTC)
Subject: [PATCH] D68832: [tsan,msan] Insert module constructors in a module pass
In-Reply-To: 
References: 
Message-ID: <97a62d67235ca1dff3be32a03c319bf3@localhost.localdomain>

leonardchan accepted this revision.
leonardchan added a comment.
This revision is now accepted and ready to land.

Thanks for finding the root cause of this!

================
Comment at: llvm/lib/Transforms/Instrumentation/ThreadSanitizer.cpp:143
+
+static void insertModuleCtor(Module &M) {
+  getOrCreateSanitizerCtorAndInitFunctions(
----------------
nit: static unneeded here

================
Comment at: llvm/lib/Transforms/Instrumentation/ThreadSanitizer.cpp:163
+PreservedAnalyses ThreadSanitizerPass::run(Module &M,
+                                           AnalysisManager &AM) {
+  insertModuleCtor(M);
----------------
nit: `ModuleAnalysisManager &MAM`

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68832/new/

https://reviews.llvm.org/D68832

From llvm-commits at lists.llvm.org Thu Oct 10 15:21:30 2019
From: llvm-commits at lists.llvm.org (Joel E.
Denny via Phabricator via llvm-commits)
Date: Thu, 10 Oct 2019 22:21:30 +0000 (UTC)
Subject: [PATCH] D68839: [lit] Fix internal diff's --strip-trailing-cr and use it
Message-ID: 

jdenny created this revision.
jdenny added reviewers: probinson, stella.stamenova, bd1976llvm, jlpeyton, rnk, mgorny.
Herald added a subscriber: delcypher.
Herald added a project: LLVM.
jdenny added a child revision: D66506: [lit] Fix internal env calling other internal commands.
jdenny added a parent revision: D68668: [lit] Extend internal diff to support -U.

Using GNU diff, `--strip-trailing-cr` removes a `\r` appearing before a `\n` at the end of a line. Without this patch, lit's internal diff only removes `\r` if it appears as the last character. That seems useless. This patch fixes that.

This patch also adds `--strip-trailing-cr` to some tests that fail on Windows bots when D68664 is applied. Based on what I see in the bot logs, I think the following is happening. In each test there, lit diff is comparing a file with `\r\n` line endings to a file with `\n` line endings. Without D68664, lit diff reads those files with Python's universal newlines support activated, causing `\r` to be dropped. However, with D68664, lit diff reads the files in binary mode instead and thus reports that every line is different, just as GNU diff does (at least under Ubuntu). Adding `--strip-trailing-cr` to those tests restores the previous behavior while permitting the behavior of lit diff to be more like GNU diff.
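[Editorial sketch, not part of the original message or of the patch.] The difference between the two behaviours described above can be illustrated roughly as follows. All function names here are hypothetical and this is not lit's actual `diff.py` code; it only assumes lines are read in binary mode and split on `\n`, as the message describes.

```python
import io
from itertools import zip_longest


def strip_trailing_cr(line: bytes) -> bytes:
    """Remove a single CR that sits directly before the line's final LF."""
    if line.endswith(b"\r\n"):
        return line[:-2] + b"\n"
    return line


def strip_cr_only_if_last(line: bytes) -> bytes:
    """Hypothetical stand-in for the behaviour described as broken: a CR is
    removed only when it is the very last byte of the line, so a line that
    ends in CR LF is left untouched."""
    if line.endswith(b"\r"):
        return line[:-1]
    return line


def differing_lines(a: bytes, b: bytes, normalize=lambda line: line):
    """Return the 1-based numbers of the lines at which a and b differ."""
    # BytesIO.readlines() splits on LF only and keeps the terminator,
    # which mirrors reading a file opened in binary mode.
    a_lines = [normalize(l) for l in io.BytesIO(a).readlines()]
    b_lines = [normalize(l) for l in io.BytesIO(b).readlines()]
    pairs = enumerate(zip_longest(a_lines, b_lines), start=1)
    return [n for n, (x, y) in pairs if x != y]


dos_text = b"one\r\ntwo\r\n"
unix_text = b"one\ntwo\n"

# Binary comparison without normalization: every line differs.
print(differing_lines(dos_text, unix_text))
# Stripping CR only when it is the last byte does not help.
print(differing_lines(dos_text, unix_text, strip_cr_only_if_last))
# Stripping CR before the final LF makes the two inputs compare equal.
print(differing_lines(dos_text, unix_text, strip_trailing_cr))
```

Under these assumptions, only the last call reports no differences, which matches the motivation given for adding `--strip-trailing-cr` to the affected tests.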

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D68839

Files:
  llvm/test/MC/AsmParser/preserve-comments.s
  llvm/test/tools/llvm-cxxmap/remap.test
  llvm/test/tools/llvm-profdata/profile-symbol-list.test
  llvm/test/tools/llvm-profdata/roundtrip.test
  llvm/test/tools/llvm-profdata/sample-remap.test
  llvm/utils/lit/lit/builtin_commands/diff.py
  llvm/utils/lit/tests/Inputs/shtest-shell/diff-in.dos
  llvm/utils/lit/tests/Inputs/shtest-shell/diff-in.unix
  llvm/utils/lit/tests/Inputs/shtest-shell/diff-strip-trailing-cr.txt
  llvm/utils/lit/tests/max-failures.py
  llvm/utils/lit/tests/shtest-shell.py
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68839.224490.patch
Type: text/x-patch
Size: 8677 bytes
Desc: not available
URL: 

From llvm-commits at lists.llvm.org Thu Oct 10 15:39:56 2019
From: llvm-commits at lists.llvm.org (Galina Kistanova via llvm-commits)
Date: Thu, 10 Oct 2019 22:39:56 -0000
Subject: [zorg] r374466 - Moved a few builders to use UnifiedTreeBuilder.
Message-ID: <20191010223956.18E1492A2A@lists.llvm.org>

Author: gkistanova
Date: Thu Oct 10 15:39:55 2019
New Revision: 374466

URL: http://llvm.org/viewvc/llvm-project?rev=374466&view=rev
Log:
Moved a few builders to use UnifiedTreeBuilder.

Modified: zorg/trunk/buildbot/osuosl/master/config/builders.py Modified: zorg/trunk/buildbot/osuosl/master/config/builders.py URL: http://llvm.org/viewvc/llvm-project/zorg/trunk/buildbot/osuosl/master/config/builders.py?rev=374466&r1=374465&r2=374466&view=diff ============================================================================== --- zorg/trunk/buildbot/osuosl/master/config/builders.py (original) +++ zorg/trunk/buildbot/osuosl/master/config/builders.py Thu Oct 10 15:39:55 2019 @@ -110,38 +110,39 @@ def _get_clang_fast_builders(): 'mergeRequests': False, 'slavenames': ["ps4-buildslave4"], 'builddir': "llvm-clang-lld-x86_64-scei-ps4-ubuntu-fast", - 'factory': ClangAndLLDBuilder.getClangAndLLDBuildFactory( - extraCmakeOptions=["-DCMAKE_C_COMPILER=clang", - "-DCMAKE_CXX_COMPILER=clang++", - "-DCOMPILER_RT_BUILD_BUILTINS:BOOL=OFF", - "-DCOMPILER_RT_BUILD_SANITIZERS:BOOL=OFF", - "-DCOMPILER_RT_CAN_EXECUTE_TESTS:BOOL=OFF", - "-DCOMPILER_RT_INCLUDE_TESTS:BOOL=OFF", - "-DLLVM_TOOL_COMPILER_RT_BUILD:BOOL=OFF", - "-DLLVM_BUILD_TESTS:BOOL=ON", - "-DLLVM_BUILD_EXAMPLES:BOOL=ON", - "-DCLANG_BUILD_EXAMPLES:BOOL=ON", - "-DLLVM_TARGETS_TO_BUILD=X86"], - extraLitArgs=['-v', '-j36'], - triple="x86_64-scei-ps4", - prefixCommand=None, # This is a designated builder, so no need to be nice. - env={'PATH':'/opt/llvm_37/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin'})}, + 'factory': UnifiedTreeBuilder.getCmakeWithNinjaBuildFactory( + depends_on_projects=['llvm','clang','clang-tools-extra','compiler-rt','lld'], + extraCmakeOptions=["-DCMAKE_C_COMPILER=clang", + "-DCMAKE_CXX_COMPILER=clang++", + "-DCOMPILER_RT_BUILD_BUILTINS=OFF", + "-DCOMPILER_RT_BUILD_SANITIZERS=OFF", + "-DCOMPILER_RT_CAN_EXECUTE_TESTS=OFF", + "-DCOMPILER_RT_INCLUDE_TESTS=OFF", + "-DLLVM_TOOL_COMPILER_RT_BUILD=OFF", # TODO: Check why we depend on compiler-rt then? 
+ "-DLLVM_BUILD_TESTS=ON", + "-DLLVM_BUILD_EXAMPLES=ON", + "-DCLANG_BUILD_EXAMPLES=ON", + "-DLLVM_TARGETS_TO_BUILD=X86", + "-DLLVM_DEFAULT_TARGET_TRIPLE=x86_64-scei-ps4", + "-DCMAKE_C_FLAGS='-Wdocumentation -Wno-documentation-deprecated-sync'", + "-DCMAKE_CXX_FLAGS='-std=c++11 -Wdocumentation -Wno-documentation-deprecated-sync'", + "-DLLVM_LIT_ARGS='-v -j36'"], + env={'PATH':'/opt/llvm_37/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin'})}, {'name': "llvm-clang-lld-x86_64-scei-ps4-windows10pro-fast", 'mergeRequests': True, 'slavenames': ["ps4-buildslave2"], 'builddir': "llvm-clang-lld-x86_64-scei-ps4-windows10pro-fast", - 'factory': ClangAndLLDBuilder.getClangAndLLDBuildFactory( - extraCmakeOptions=["-DLLVM_TOOL_COMPILER_RT_BUILD:BOOL=OFF", - "-DLLVM_BUILD_TESTS:BOOL=ON", - "-DLLVM_BUILD_EXAMPLES:BOOL=ON", - "-DCLANG_BUILD_EXAMPLES:BOOL=ON", - "-DLLVM_TARGETS_TO_BUILD=X86"], - triple="x86_64-scei-ps4", - isMSVC=True, - vs="autodetect", - prefixCommand=None, # This is a designated builder, so no need to be nice. - extraLitArgs=["-j80"])}, + 'factory': UnifiedTreeBuilder.getCmakeWithNinjaWithMSVCBuildFactory( + vs="autodetect", + depends_on_projects=['llvm','clang','clang-tools-extra','compiler-rt','lld'], + extraCmakeOptions=["-DLLVM_TOOL_COMPILER_RT_BUILD=OFF", # TODO: Check why we depend on compiler-rt then? 
+ "-DLLVM_BUILD_TESTS=ON", + "-DLLVM_BUILD_EXAMPLES=ON", + "-DCLANG_BUILD_EXAMPLES=ON", + "-DLLVM_TARGETS_TO_BUILD=X86", + "-DLLVM_DEFAULT_TARGET_TRIPLE=x86_64-scei-ps4", + "-DLLVM_LIT_ARGS='-v -j80'"])}, {'name': "llvm-clang-x86_64-expensive-checks-win", 'slavenames':["ps4-buildslave2"], @@ -900,26 +901,35 @@ def _get_lld_builders(): {'name': "lld-x86_64-darwin13", 'slavenames' :["as-bldslv9"], 'builddir':"lld-x86_64-darwin13", - 'factory': LLDBuilder.getLLDBuildFactory(), + 'factory': UnifiedTreeBuilder.getCmakeWithNinjaBuildFactory( + clean=True, + depends_on_projects=['llvm', 'lld'], + extra_configure_args=[ + '-DLLVM_ENABLE_WERROR=OFF', + ]), 'category' : 'lld'}, {'name': "lld-x86_64-win7", 'slavenames' :["ps4-buildslave2"], 'builddir':"lld-x86_64-win7", - 'factory': LLDBuilder.getLLDWinBuildFactory( + 'factory': UnifiedTreeBuilder.getCmakeWithNinjaWithMSVCBuildFactory( + depends_on_projects=['llvm', 'lld'], vs="autodetect", extra_configure_args = [ - '-DLLVM_ENABLE_WERROR=OFF' + '-DLLVM_ENABLE_WERROR=OFF', ]), 'category' : 'lld'}, {'name': "lld-x86_64-freebsd", 'slavenames' :["as-bldslv5"], 'builddir':"lld-x86_64-freebsd", - 'factory': LLDBuilder.getLLDBuildFactory(extra_configure_args=[ - '-DCMAKE_EXE_LINKER_FLAGS=-lcxxrt', - '-DLLVM_ENABLE_WERROR=OFF'], - env={'CXXFLAGS' : "-std=c++11 -stdlib=libc++"}), + 'factory': UnifiedTreeBuilder.getCmakeWithNinjaBuildFactory( + depends_on_projects=['llvm', 'lld'], + extra_configure_args=[ + '-DCMAKE_EXE_LINKER_FLAGS=-lcxxrt', + '-DLLVM_ENABLE_WERROR=OFF', + ], + env={'CXXFLAGS' : "-std=c++11 -stdlib=libc++"}), 'category' : 'lld'}, {'name' : "clang-with-lto-ubuntu", From llvm-commits at lists.llvm.org Thu Oct 10 15:39:49 2019 From: llvm-commits at lists.llvm.org (Nemanja Ivanovic via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 22:39:49 +0000 (UTC) Subject: [PATCH] D68841: [PowerPC] Do not convert loop to HW loop if the body contains calls to lrint/lround Message-ID: nemanjai created this revision. 
nemanjai added reviewers: hfinkel, echristo, PowerPC.
Herald added subscribers: shchenz, jsji, MaskRay, kbarton.
Herald added a project: LLVM.
nemanjai marked an inline comment as done.
nemanjai added inline comments.
Herald added a subscriber: wuzish.

================
Comment at: lib/Target/PowerPC/PPCTargetTransformInfo.cpp:281
+ // a call for safety.
+ default: return true;
 // If we have a call to ppc_is_decremented_ctr_nonzero, or ppc_mtctr
----------------
@hfinkel What do you think about this conservative approach to prevent similar issues in the future? This is, I think, the third time I have addressed a similar bug.

These two intrinsics are lowered to calls, so they should prevent the formation of CTR loops.

This patch rather aggressively disables CTR loops if there are calls to unknown intrinsics so that we don't keep hitting similar issues in the future. I will certainly collect some data about the number of CTR loops we emit before and after the patch before committing such an "aggressively pessimistic" patch. Posting this early just to gauge what others think about this conservative behaviour.

Repository:
  rL LLVM

https://reviews.llvm.org/D68841

Files:
  lib/Target/PowerPC/PPCTargetTransformInfo.cpp
  test/CodeGen/PowerPC/pr43527.ll
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68841.224493.patch
Type: text/x-patch
Size: 4614 bytes
Desc: not available
URL: 

From llvm-commits at lists.llvm.org Thu Oct 10 15:39:49 2019
From: llvm-commits at lists.llvm.org (Eli Friedman via Phabricator via llvm-commits)
Date: Thu, 10 Oct 2019 22:39:49 +0000 (UTC)
Subject: [PATCH] D68720: Support -fstack-clash-protection for x86
In-Reply-To: 
References: 
Message-ID: <8589e5c1ee7469fe760cd29a04c43e49@localhost.localdomain>

efriedma added inline comments.

================
Comment at: llvm/lib/Target/X86/X86FrameLowering.cpp:423
+ AbsOffset - CurrentAbsOffset + PageSize);
+ if (FreeProbeIterator != MBB.end()) {
+ NumFrameFreeProbe++;
----------------
Each probe has to have an offset of at most PageSize bytes from the previous probe. If each probe is exactly PageSize bytes away from the previous probe, that's fine. But it looks like you don't enforce the distance between free probes correctly?

================
Comment at: llvm/lib/Target/X86/X86FrameLowering.cpp:481
+ [](MachineOperand &MO) { return MO.isFI(); })) {
+ break; // effect on stack pointer not modelled, stopping
+ }
----------------
There are instructions that don't refer to any FI, but are still relevant. For example, calls.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68720/new/

https://reviews.llvm.org/D68720

From llvm-commits at lists.llvm.org Thu Oct 10 15:39:49 2019
From: llvm-commits at lists.llvm.org (Nemanja Ivanovic via Phabricator via llvm-commits)
Date: Thu, 10 Oct 2019 22:39:49 +0000 (UTC)
Subject: [PATCH] D68841: [PowerPC] Do not convert loop to HW loop if the body contains calls to lrint/lround
In-Reply-To: 
References: 
Message-ID: <104ec12d6996ab8e393d2276ad9b8f42@localhost.localdomain>

nemanjai marked an inline comment as done.
nemanjai added inline comments.
Herald added a subscriber: wuzish.

================
Comment at: lib/Target/PowerPC/PPCTargetTransformInfo.cpp:281
+ // a call for safety.
+ default: return true;
 // If we have a call to ppc_is_decremented_ctr_nonzero, or ppc_mtctr
----------------
@hfinkel What do you think about this conservative approach to prevent similar issues in the future? This is, I think, the third time I have addressed a similar bug.

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68841/new/

https://reviews.llvm.org/D68841

From llvm-commits at lists.llvm.org Thu Oct 10 16:07:53 2019
From: llvm-commits at lists.llvm.org (Tom Stellard via Phabricator via llvm-commits)
Date: Thu, 10 Oct 2019 23:07:53 +0000 (UTC)
Subject: [PATCH] D68720: Support -fstack-clash-protection for x86
In-Reply-To: 
References: 
Message-ID: <3a012bc7a06791998dd971660aceb5a4@localhost.localdomain>

tstellar added inline comments.

================
Comment at: clang/test/CodeGen/stack-clash-protection.c:3
+// RUN: %clang -target x86_64 -o %t.out %s -fstack-clash-protection && %t.out
+
+#include 
----------------
There were concerns[1] raised recently about adding clang tests that were codegen dependent. Is something being tested here that can't be tested with an IR test? If you only need to test that the frontend option works, I think checking the IR for the necessary function attributes might be enough.

[1] http://lists.llvm.org/pipermail/cfe-dev/2019-September/063309.html

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68720/new/

https://reviews.llvm.org/D68720

From llvm-commits at lists.llvm.org Thu Oct 10 16:07:54 2019
From: llvm-commits at lists.llvm.org (Peter Collingbourne via Phabricator via llvm-commits)
Date: Thu, 10 Oct 2019 23:07:54 +0000 (UTC)
Subject: [PATCH] D63932: [GlobalDCE] Dead Virtual Function Elimination
In-Reply-To: 
References: 
Message-ID: <7b34af9aeebdb0a62c9d80ff84650f38@localhost.localdomain>

pcc accepted this revision.
pcc added a comment.
This revision is now accepted and ready to land.

LGTM Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D63932/new/ https://reviews.llvm.org/D63932 From llvm-commits at lists.llvm.org Thu Oct 10 16:27:21 2019 From: llvm-commits at lists.llvm.org (Alina Sbirlea via llvm-commits) Date: Thu, 10 Oct 2019 23:27:21 -0000 Subject: [llvm] r374471 - [MemorySSA] Update Phi simplification. Message-ID: <20191010232721.7CD8B92C33@lists.llvm.org> Author: asbirlea Date: Thu Oct 10 16:27:21 2019 New Revision: 374471 URL: http://llvm.org/viewvc/llvm-project?rev=374471&view=rev Log: [MemorySSA] Update Phi simplification. When simplifying a Phi to the unique value found incoming, check that there wasn't a Phi already created to break a cycle. If so, remove it. Resolves PR43541. Some additional nits included. Added: llvm/trunk/test/Analysis/MemorySSA/pr43541.ll Modified: llvm/trunk/lib/Analysis/MemorySSAUpdater.cpp Modified: llvm/trunk/lib/Analysis/MemorySSAUpdater.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/MemorySSAUpdater.cpp?rev=374471&r1=374470&r2=374471&view=diff ============================================================================== --- llvm/trunk/lib/Analysis/MemorySSAUpdater.cpp (original) +++ llvm/trunk/lib/Analysis/MemorySSAUpdater.cpp Thu Oct 10 16:27:21 2019 @@ -44,15 +44,15 @@ MemoryAccess *MemorySSAUpdater::getPrevi // First, do a cache lookup. Without this cache, certain CFG structures // (like a series of if statements) take exponential time to visit. auto Cached = CachedPreviousDef.find(BB); - if (Cached != CachedPreviousDef.end()) { + if (Cached != CachedPreviousDef.end()) return Cached->second; - } // If this method is called from an unreachable block, return LoE. 
if (!MSSA->DT->isReachableFromEntry(BB)) return MSSA->getLiveOnEntryDef(); - if (BasicBlock *Pred = BB->getSinglePredecessor()) { + if (BasicBlock *Pred = BB->getUniquePredecessor()) { + VisitedBlocks.insert(BB); // Single predecessor case, just recurse, we can only have one definition. MemoryAccess *Result = getPreviousDefFromEnd(Pred, CachedPreviousDef); CachedPreviousDef.insert({BB, Result}); @@ -96,9 +96,15 @@ MemoryAccess *MemorySSAUpdater::getPrevi // See if we can avoid the phi by simplifying it. auto *Result = tryRemoveTrivialPhi(Phi, PhiOps); // If we couldn't simplify, we may have to create a phi - if (Result == Phi && UniqueIncomingAccess && SingleAccess) + if (Result == Phi && UniqueIncomingAccess && SingleAccess) { + // A concrete Phi only exists if we created an empty one to break a cycle. + if (Phi) { + assert(Phi->operands().empty() && "Expected empty Phi"); + Phi->replaceAllUsesWith(SingleAccess); + removeMemoryAccess(Phi); + } Result = SingleAccess; - else if (Result == Phi && !(UniqueIncomingAccess && SingleAccess)) { + } else if (Result == Phi && !(UniqueIncomingAccess && SingleAccess)) { if (!Phi) Phi = MSSA->createMemoryPhi(BB); @@ -237,6 +243,7 @@ MemoryAccess *MemorySSAUpdater::tryRemov void MemorySSAUpdater::insertUse(MemoryUse *MU, bool RenameUses) { InsertedPHIs.clear(); MU->setDefiningAccess(getPreviousDef(MU)); + // In cases without unreachable blocks, because uses do not create new // may-defs, there are only two cases: // 1. 
There was a def already below us, and therefore, we should not have Added: llvm/trunk/test/Analysis/MemorySSA/pr43541.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Analysis/MemorySSA/pr43541.ll?rev=374471&view=auto ============================================================================== --- llvm/trunk/test/Analysis/MemorySSA/pr43541.ll (added) +++ llvm/trunk/test/Analysis/MemorySSA/pr43541.ll Thu Oct 10 16:27:21 2019 @@ -0,0 +1,50 @@ +; RUN: opt -gvn-hoist -enable-mssa-loop-dependency -S < %s | FileCheck %s +; REQUIRES: asserts +%struct.job_pool.6.7 = type { i32 } + +; CHECK-LABEL: @f() +define dso_local void @f() { +entry: + br label %for.cond + +for.cond: ; preds = %for.end, %entry + br label %for.body + +for.body: ; preds = %for.cond + br label %if.end + +if.then: ; No predecessors! + br label %if.end + +if.end: ; preds = %if.then, %for.body + br i1 false, label %for.body12.lr.ph, label %for.end + +for.body12.lr.ph: ; preds = %if.end + br label %for.body12 + +for.body12: ; preds = %if.end40, %for.body12.lr.ph + br label %if.then23 + +if.then23: ; preds = %for.body12 + br i1 undef, label %if.then24, label %if.else + +if.then24: ; preds = %if.then23 + %0 = load %struct.job_pool.6.7*, %struct.job_pool.6.7** undef, align 8 + br label %if.end40 + +if.else: ; preds = %if.then23 + %1 = load %struct.job_pool.6.7*, %struct.job_pool.6.7** undef, align 8 + br label %if.end40 + +if.end40: ; preds = %if.else, %if.then24 + br i1 false, label %for.body12, label %for.cond9.for.end_crit_edge + +for.cond9.for.end_crit_edge: ; preds = %if.end40 + br label %for.end + +for.end: ; preds = %for.cond9.for.end_crit_edge, %if.end + br i1 true, label %if.then45, label %for.cond + +if.then45: ; preds = %for.end + ret void +} From llvm-commits at lists.llvm.org Thu Oct 10 16:30:54 2019 From: llvm-commits at lists.llvm.org (Reid Kleckner via llvm-commits) Date: Thu, 10 Oct 2019 23:30:54 -0000 Subject: [compiler-rt] r374472 - Fix check-interception link error in 
compiler-rt debug mode Message-ID: <20191010233054.CF2BD92945@lists.llvm.org> Author: rnk Date: Thu Oct 10 16:30:54 2019 New Revision: 374472 URL: http://llvm.org/viewvc/llvm-project?rev=374472&view=rev Log: Fix check-interception link error in compiler-rt debug mode Modified: compiler-rt/trunk/lib/interception/tests/CMakeLists.txt Modified: compiler-rt/trunk/lib/interception/tests/CMakeLists.txt URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/interception/tests/CMakeLists.txt?rev=374472&r1=374471&r2=374472&view=diff ============================================================================== --- compiler-rt/trunk/lib/interception/tests/CMakeLists.txt (original) +++ compiler-rt/trunk/lib/interception/tests/CMakeLists.txt Thu Oct 10 16:30:54 2019 @@ -32,7 +32,10 @@ else() endif() if(MSVC) list(APPEND INTERCEPTION_TEST_CFLAGS_COMMON -gcodeview) - list(APPEND INTERCEPTION_TEST_LINK_FLAGS_COMMON -Wl,-largeaddressaware) + list(APPEND INTERCEPTION_TEST_LINK_FLAGS_COMMON + -Wl,-largeaddressaware + -Wl,-nodefaultlib:libcmt,-defaultlib:msvcrt,-defaultlib:oldnames + ) endif() list(APPEND INTERCEPTION_TEST_LINK_FLAGS_COMMON -g) From llvm-commits at lists.llvm.org Thu Oct 10 16:35:53 2019 From: llvm-commits at lists.llvm.org (Amy Huang via llvm-commits) Date: Thu, 10 Oct 2019 23:35:53 -0000 Subject: [lld] r374473 - Change test case so that it accepts backslashes in file path, in the case that the test runs on Windows Message-ID: <20191010233553.DADC892BAC@lists.llvm.org> Author: akhuang Date: Thu Oct 10 16:35:53 2019 New Revision: 374473 URL: http://llvm.org/viewvc/llvm-project?rev=374473&view=rev Log: Change test case so that it accepts backslashes in file path, in the case that the test runs on Windows Modified: lld/trunk/test/ELF/compressed-debug-conflict.s Modified: lld/trunk/test/ELF/compressed-debug-conflict.s URL: http://llvm.org/viewvc/llvm-project/lld/trunk/test/ELF/compressed-debug-conflict.s?rev=374473&r1=374472&r2=374473&view=diff 
============================================================================== --- lld/trunk/test/ELF/compressed-debug-conflict.s (original) +++ lld/trunk/test/ELF/compressed-debug-conflict.s Thu Oct 10 16:35:53 2019 @@ -13,9 +13,9 @@ # OBJ-NEXT: ] # ERROR: error: duplicate symbol: main -# ERROR-NEXT: >>> defined at reduced.c:2 (/tmp/reduced.c:2) +# ERROR-NEXT: >>> defined at reduced.c:2 ({{[/\\]}}tmp{{[/\\]}}reduced.c:2) # ERROR-NEXT: >>> -# ERROR-NEXT: >>> defined at reduced.c:2 (/tmp/reduced.c:2) +# ERROR-NEXT: >>> defined at reduced.c:2 ({{[/\\]}}tmp{{[/\\]}}reduced.c:2) # ERROR-NEXT: >>> .text From llvm-commits at lists.llvm.org Thu Oct 10 16:36:06 2019 From: llvm-commits at lists.llvm.org (Tom Stellard via llvm-commits) Date: Thu, 10 Oct 2019 23:36:06 -0000 Subject: [llvm] r374474 - docs/DeveloperPolicy: Add instructions for requesting GitHub commit access Message-ID: <20191010233606.7406092C58@lists.llvm.org> Author: tstellar Date: Thu Oct 10 16:36:06 2019 New Revision: 374474 URL: http://llvm.org/viewvc/llvm-project?rev=374474&view=rev Log: docs/DeveloperPolicy: Add instructions for requesting GitHub commit access Subscribers: mehdi_amini, jtony, xbolva00, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66840 Modified: llvm/trunk/docs/DeveloperPolicy.rst Modified: llvm/trunk/docs/DeveloperPolicy.rst URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/docs/DeveloperPolicy.rst?rev=374474&r1=374473&r2=374474&view=diff ============================================================================== --- llvm/trunk/docs/DeveloperPolicy.rst (original) +++ llvm/trunk/docs/DeveloperPolicy.rst Thu Oct 10 16:36:06 2019 @@ -396,6 +396,26 @@ to do so. .. _discuss the change/gather consensus: +Obtaining Commit Access to the GitHub Repository +------------------------------------------------ +We are currently in the process of migrating the project's source code from SVN +to a git repository on GitHub. 
We are maintaining a file in SVN to map +SVN usernames to GitHub usernames, so we can automatically grant access to +existing committers when we complete the migration to GitHub. In order to +request commit access, check out the github-usernames.txt file in meta/trunk and +add a line in the form of $SVN_USERNAME:$GITHUB_USERNAME and commit it. For +example: + +.. code:: console + + mkdir tmp-llvm-svn + cd tmp-llvm-svn + svn co https://$SVN_USERNAME at llvm.org/svn/llvm-project/meta/trunk + echo "$SVN_USERNAME:$GITHUB_USERNAME" >> trunk/github-usernames.txt + cd trunk + svn commit -m "Request commit access for $SVN_USERNAME" + + Making a Major Change --------------------- From llvm-commits at lists.llvm.org Thu Oct 10 16:37:49 2019 From: llvm-commits at lists.llvm.org (Lang Hames via llvm-commits) Date: Thu, 10 Oct 2019 23:37:49 -0000 Subject: [llvm] r374475 - [JITLink] Move MachO/x86 got test further down in the data section. Message-ID: <20191010233749.89FCD92BF8@lists.llvm.org> Author: lhames Date: Thu Oct 10 16:37:49 2019 New Revision: 374475 URL: http://llvm.org/viewvc/llvm-project?rev=374475&view=rev Log: [JITLink] Move MachO/x86 got test further down in the data section. 'named_data' should be the first symbol in the data section. Modified: llvm/trunk/test/ExecutionEngine/JITLink/X86/MachO_x86-64_relocations.s Modified: llvm/trunk/test/ExecutionEngine/JITLink/X86/MachO_x86-64_relocations.s URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/ExecutionEngine/JITLink/X86/MachO_x86-64_relocations.s?rev=374475&r1=374474&r2=374475&view=diff ============================================================================== --- llvm/trunk/test/ExecutionEngine/JITLink/X86/MachO_x86-64_relocations.s (original) +++ llvm/trunk/test/ExecutionEngine/JITLink/X86/MachO_x86-64_relocations.s Thu Oct 10 16:37:49 2019 @@ -129,18 +129,6 @@ Lanon_minuend_quad: Lanon_minuend_long: .long Lanon_minuend_long - named_data + 2 -# Check X86_64_RELOC_GOT handling. 
-# X86_64_RELOC_GOT is the data-section counterpart to X86_64_RELOC_GOTLD. It is -# handled exactly the same way, including having an implicit PC-rel offset of -4 -# (despite this not making sense in a data section, and requiring an explicit -# +4 addend to cancel it out and get the correct result). -# -# jitlink-check: *{4}test_got = (got_addr(macho_reloc.o, external_data) - test_got)[31:0] - .globl test_got - .p2align 2 -test_got: - .long external_data at GOTPCREL + 4 - # Named quad storage target (first named atom in __data). .globl named_data .p2align 3 @@ -284,6 +272,18 @@ subtractor_with_alt_entry_subtrahend_qua subtractor_with_alt_entry_subtrahend_quad_B: .quad 0 +# Check X86_64_RELOC_GOT handling. +# X86_64_RELOC_GOT is the data-section counterpart to X86_64_RELOC_GOTLD. It is +# handled exactly the same way, including having an implicit PC-rel offset of -4 +# (despite this not making sense in a data section, and requiring an explicit +# +4 addend to cancel it out and get the correct result). +# +# jitlink-check: *{4}test_got = (got_addr(macho_reloc.o, external_data) - test_got)[31:0] + .globl test_got + .p2align 2 +test_got: + .long external_data at GOTPCREL + 4 + # Check that unreferenced atoms in no-dead-strip sections are not dead stripped. # We need to use a local symbol for this as any named symbol will end up in the # ORC responsibility set, which is automatically marked live and would couse From llvm-commits at lists.llvm.org Thu Oct 10 16:37:51 2019 From: llvm-commits at lists.llvm.org (Lang Hames via llvm-commits) Date: Thu, 10 Oct 2019 23:37:51 -0000 Subject: [llvm] r374476 - [JITLink] Add an initial implementation of JITLink for MachO/AArch64. Message-ID: <20191010233751.760FA92C7A@lists.llvm.org> Author: lhames Date: Thu Oct 10 16:37:51 2019 New Revision: 374476 URL: http://llvm.org/viewvc/llvm-project?rev=374476&view=rev Log: [JITLink] Add an initial implementation of JITLink for MachO/AArch64. 
This implementation has support for all relocation types except TLV. Compact unwind sections are not yet supported, so exceptions/unwinding will not work. Added: llvm/trunk/include/llvm/ExecutionEngine/JITLink/MachO_arm64.h llvm/trunk/lib/ExecutionEngine/JITLink/MachO_arm64.cpp llvm/trunk/test/ExecutionEngine/JITLink/AArch64/ llvm/trunk/test/ExecutionEngine/JITLink/AArch64/MachO_Arm64_relocations.s llvm/trunk/test/ExecutionEngine/JITLink/AArch64/lit.local.cfg Modified: llvm/trunk/lib/ExecutionEngine/JITLink/CMakeLists.txt llvm/trunk/lib/ExecutionEngine/JITLink/MachO.cpp Added: llvm/trunk/include/llvm/ExecutionEngine/JITLink/MachO_arm64.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/ExecutionEngine/JITLink/MachO_arm64.h?rev=374476&view=auto ============================================================================== --- llvm/trunk/include/llvm/ExecutionEngine/JITLink/MachO_arm64.h (added) +++ llvm/trunk/include/llvm/ExecutionEngine/JITLink/MachO_arm64.h Thu Oct 10 16:37:51 2019 @@ -0,0 +1,60 @@ +//===---- MachO_arm64.h - JIT link functions for MachO/arm64 ----*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// +// jit-link functions for MachO/arm64. 
+// +//===----------------------------------------------------------------------===// + +#ifndef LLVM_EXECUTIONENGINE_JITLINK_MACHO_ARM64_H +#define LLVM_EXECUTIONENGINE_JITLINK_MACHO_ARM64_H + +#include "llvm/ExecutionEngine/JITLink/JITLink.h" + +namespace llvm { +namespace jitlink { + +namespace MachO_arm64_Edges { + +enum MachOARM64RelocationKind : Edge::Kind { + Branch26 = Edge::FirstRelocation, + Pointer32, + Pointer64, + Pointer64Anon, + Page21, + PageOffset12, + GOTPage21, + GOTPageOffset12, + PointerToGOT, + PairedAddend, + LDRLiteral19, + Delta32, + Delta64, + NegDelta32, + NegDelta64, +}; + +} // namespace MachO_arm64_Edges + +/// jit-link the given object buffer, which must be a MachO arm64 object file. +/// +/// If PrePrunePasses is empty then a default mark-live pass will be inserted +/// that will mark all exported atoms live. If PrePrunePasses is not empty, the +/// caller is responsible for including a pass to mark atoms as live. +/// +/// If PostPrunePasses is empty then a default GOT-and-stubs insertion pass will +/// be inserted. If PostPrunePasses is not empty then the caller is responsible +/// for including a pass to insert GOT and stub edges. +void jitLink_MachO_arm64(std::unique_ptr Ctx); + +/// Return the string name of the given MachO arm64 edge kind. 
+StringRef getMachOARM64RelocationKindName(Edge::Kind R); + +} // end namespace jitlink +} // end namespace llvm + +#endif // LLVM_EXECUTIONENGINE_JITLINK_MACHO_ARM64_H Modified: llvm/trunk/lib/ExecutionEngine/JITLink/CMakeLists.txt URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/ExecutionEngine/JITLink/CMakeLists.txt?rev=374476&r1=374475&r2=374476&view=diff ============================================================================== --- llvm/trunk/lib/ExecutionEngine/JITLink/CMakeLists.txt (original) +++ llvm/trunk/lib/ExecutionEngine/JITLink/CMakeLists.txt Thu Oct 10 16:37:51 2019 @@ -4,6 +4,7 @@ add_llvm_library(LLVMJITLink JITLinkMemoryManager.cpp EHFrameSupport.cpp MachO.cpp + MachO_arm64.cpp MachO_x86_64.cpp MachOLinkGraphBuilder.cpp Modified: llvm/trunk/lib/ExecutionEngine/JITLink/MachO.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/ExecutionEngine/JITLink/MachO.cpp?rev=374476&r1=374475&r2=374476&view=diff ============================================================================== --- llvm/trunk/lib/ExecutionEngine/JITLink/MachO.cpp (original) +++ llvm/trunk/lib/ExecutionEngine/JITLink/MachO.cpp Thu Oct 10 16:37:51 2019 @@ -14,6 +14,7 @@ #include "llvm/ExecutionEngine/JITLink/MachO.h" #include "llvm/BinaryFormat/MachO.h" +#include "llvm/ExecutionEngine/JITLink/MachO_arm64.h" #include "llvm/ExecutionEngine/JITLink/MachO_x86_64.h" #include "llvm/Support/Endian.h" #include "llvm/Support/Format.h" @@ -64,6 +65,8 @@ void jitLink_MachO(std::unique_ptr( + "__eh_frame section is marked zero-fill"); + return MachOEHFrameBinaryParser( + *this, EHFrameSection.Address, + StringRef(EHFrameSection.Data, EHFrameSection.Size), + *EHFrameSection.GraphSection, 8, 4, NegDelta32, Delta64) + .addToGraph(); + }); + } + +private: + static Expected + getRelocationKind(const MachO::relocation_info &RI) { + switch (RI.r_type) { + case MachO::ARM64_RELOC_UNSIGNED: + if (!RI.r_pcrel) { + if (RI.r_length == 3) + return RI.r_extern ? 
Pointer64 : Pointer64Anon; + else if (RI.r_length == 2) + return Pointer32; + } + break; + case MachO::ARM64_RELOC_SUBTRACTOR: + // SUBTRACTOR must be non-pc-rel, extern, with length 2 or 3. + // Initially represent SUBTRACTOR relocations with 'Delta'. + // They may be turned into NegDelta by parsePairRelocation. + if (!RI.r_pcrel && RI.r_extern) { + if (RI.r_length == 2) + return Delta32; + else if (RI.r_length == 3) + return Delta64; + } + break; + case MachO::ARM64_RELOC_BRANCH26: + if (RI.r_pcrel && RI.r_extern && RI.r_length == 2) + return Branch26; + break; + case MachO::ARM64_RELOC_PAGE21: + if (RI.r_pcrel && RI.r_extern && RI.r_length == 2) + return Page21; + break; + case MachO::ARM64_RELOC_PAGEOFF12: + if (!RI.r_pcrel && RI.r_extern && RI.r_length == 2) + return PageOffset12; + break; + case MachO::ARM64_RELOC_GOT_LOAD_PAGE21: + if (RI.r_pcrel && RI.r_extern && RI.r_length == 2) + return GOTPage21; + break; + case MachO::ARM64_RELOC_GOT_LOAD_PAGEOFF12: + if (!RI.r_pcrel && RI.r_extern && RI.r_length == 2) + return GOTPageOffset12; + break; + case MachO::ARM64_RELOC_POINTER_TO_GOT: + if (RI.r_pcrel && RI.r_extern && RI.r_length == 2) + return PointerToGOT; + break; + case MachO::ARM64_RELOC_ADDEND: + if (!RI.r_pcrel && !RI.r_extern && RI.r_length == 2) + return PairedAddend; + break; + } + + return make_error( + "Unsupported arm64 relocation: address=" + + formatv("{0:x8}", RI.r_address) + + ", symbolnum=" + formatv("{0:x6}", RI.r_symbolnum) + + ", kind=" + formatv("{0:x1}", RI.r_type) + + ", pc_rel=" + (RI.r_pcrel ? "true" : "false") + + ", extern=" + (RI.r_extern ? 
"true" : "false") + + ", length=" + formatv("{0:d}", RI.r_length)); + } + + MachO::relocation_info + getRelocationInfo(const object::relocation_iterator RelItr) { + MachO::any_relocation_info ARI = + getObject().getRelocation(RelItr->getRawDataRefImpl()); + MachO::relocation_info RI; + memcpy(&RI, &ARI, sizeof(MachO::relocation_info)); + return RI; + } + + using PairRelocInfo = + std::tuple; + + // Parses paired SUBTRACTOR/UNSIGNED relocations and, on success, + // returns the edge kind and addend to be used. + Expected + parsePairRelocation(Block &BlockToFix, Edge::Kind SubtractorKind, + const MachO::relocation_info &SubRI, + JITTargetAddress FixupAddress, const char *FixupContent, + object::relocation_iterator &UnsignedRelItr, + object::relocation_iterator &RelEnd) { + using namespace support; + + assert(((SubtractorKind == Delta32 && SubRI.r_length == 2) || + (SubtractorKind == Delta64 && SubRI.r_length == 3)) && + "Subtractor kind should match length"); + assert(SubRI.r_extern && "SUBTRACTOR reloc symbol should be extern"); + assert(!SubRI.r_pcrel && "SUBTRACTOR reloc should not be PCRel"); + + if (UnsignedRelItr == RelEnd) + return make_error("arm64 SUBTRACTOR without paired " + "UNSIGNED relocation"); + + auto UnsignedRI = getRelocationInfo(UnsignedRelItr); + + if (SubRI.r_address != UnsignedRI.r_address) + return make_error("arm64 SUBTRACTOR and paired UNSIGNED " + "point to different addresses"); + + if (SubRI.r_length != UnsignedRI.r_length) + return make_error("length of arm64 SUBTRACTOR and paired " + "UNSIGNED reloc must match"); + + Symbol *FromSymbol; + if (auto FromSymbolOrErr = findSymbolByIndex(SubRI.r_symbolnum)) + FromSymbol = FromSymbolOrErr->GraphSymbol; + else + return FromSymbolOrErr.takeError(); + + // Read the current fixup value. 
+ uint64_t FixupValue = 0; + if (SubRI.r_length == 3) + FixupValue = *(const little64_t *)FixupContent; + else + FixupValue = *(const little32_t *)FixupContent; + + // Find 'ToSymbol' using symbol number or address, depending on whether the + // paired UNSIGNED relocation is extern. + Symbol *ToSymbol = nullptr; + if (UnsignedRI.r_extern) { + // Find target symbol by symbol index. + if (auto ToSymbolOrErr = findSymbolByIndex(UnsignedRI.r_symbolnum)) + ToSymbol = ToSymbolOrErr->GraphSymbol; + else + return ToSymbolOrErr.takeError(); + } else { + if (auto ToSymbolOrErr = findSymbolByAddress(FixupValue)) + ToSymbol = &*ToSymbolOrErr; + else + return ToSymbolOrErr.takeError(); + FixupValue -= ToSymbol->getAddress(); + } + + MachOARM64RelocationKind DeltaKind; + Symbol *TargetSymbol; + uint64_t Addend; + if (&BlockToFix == &FromSymbol->getAddressable()) { + TargetSymbol = ToSymbol; + DeltaKind = (SubRI.r_length == 3) ? Delta64 : Delta32; + Addend = FixupValue + (FixupAddress - FromSymbol->getAddress()); + // FIXME: handle extern 'from'. + } else if (&BlockToFix == &ToSymbol->getAddressable()) { + TargetSymbol = &*FromSymbol; + DeltaKind = (SubRI.r_length == 3) ? NegDelta64 : NegDelta32; + Addend = FixupValue - (FixupAddress - ToSymbol->getAddress()); + } else { + // BlockToFix was neither FromSymbol nor ToSymbol. + return make_error("SUBTRACTOR relocation must fix up " + "either 'A' or 'B' (or a symbol in one " + "of their alt-entry groups)"); + } + + return PairRelocInfo(DeltaKind, TargetSymbol, Addend); + } + + Error addRelocations() override { + using namespace support; + auto &Obj = getObject(); + + for (auto &S : Obj.sections()) { + + JITTargetAddress SectionAddress = S.getAddress(); + + for (auto RelItr = S.relocation_begin(), RelEnd = S.relocation_end(); + RelItr != RelEnd; ++RelItr) { + + MachO::relocation_info RI = getRelocationInfo(RelItr); + + // Sanity check the relocation kind. 
+ auto Kind = getRelocationKind(RI); + if (!Kind) + return Kind.takeError(); + + // Find the address of the value to fix up. + JITTargetAddress FixupAddress = SectionAddress + (uint32_t)RI.r_address; + + LLVM_DEBUG({ + dbgs() << "Processing " << getMachOARM64RelocationKindName(*Kind) + << " relocation at " << format("0x%016" PRIx64, FixupAddress) + << "\n"; + }); + + // Find the block that the fixup points to. + Block *BlockToFix = nullptr; + { + auto SymbolToFixOrErr = findSymbolByAddress(FixupAddress); + if (!SymbolToFixOrErr) + return SymbolToFixOrErr.takeError(); + BlockToFix = &SymbolToFixOrErr->getBlock(); + } + + if (FixupAddress + static_cast<JITTargetAddress>(1ULL << RI.r_length) > + BlockToFix->getAddress() + BlockToFix->getContent().size()) + return make_error<JITLinkError>( + "Relocation content extends past end of fixup block"); + + // Get a pointer to the fixup content. + const char *FixupContent = BlockToFix->getContent().data() + + (FixupAddress - BlockToFix->getAddress()); + + // The target symbol and addend will be populated by the switch below. + Symbol *TargetSymbol = nullptr; + uint64_t Addend = 0; + + if (*Kind == PairedAddend) { + // If this is an Addend relocation then process it and move to the + // paired reloc. + + Addend = RI.r_symbolnum; + + if (RelItr == RelEnd) + return make_error<JITLinkError>("Unpaired Addend reloc at " + + formatv("{0:x16}", FixupAddress)); + ++RelItr; + RI = getRelocationInfo(RelItr); + + Kind = getRelocationKind(RI); + if (!Kind) + return Kind.takeError(); + + if (*Kind != Branch26 && *Kind != Page21 && *Kind != PageOffset12) + return make_error<JITLinkError>( + "Invalid relocation pair: Addend + " + + getMachOARM64RelocationKindName(*Kind)); + else + LLVM_DEBUG({ + dbgs() << " pair is " << getMachOARM64RelocationKindName(*Kind) + << "\n"; + }); + + // Find the address of the value to fix up. 
+ JITTargetAddress PairedFixupAddress = + SectionAddress + (uint32_t)RI.r_address; + if (PairedFixupAddress != FixupAddress) + return make_error("Paired relocation points at " + "different target"); + } + + switch (*Kind) { + case Branch26: { + if (auto TargetSymbolOrErr = findSymbolByIndex(RI.r_symbolnum)) + TargetSymbol = TargetSymbolOrErr->GraphSymbol; + else + return TargetSymbolOrErr.takeError(); + uint32_t Instr = *(const ulittle32_t *)FixupContent; + if ((Instr & 0x7fffffff) != 0x14000000) + return make_error("BRANCH26 target is not a B or BL " + "instruction with a zero addend"); + break; + } + case Pointer32: + if (auto TargetSymbolOrErr = findSymbolByIndex(RI.r_symbolnum)) + TargetSymbol = TargetSymbolOrErr->GraphSymbol; + else + return TargetSymbolOrErr.takeError(); + Addend = *(const ulittle32_t *)FixupContent; + break; + case Pointer64: + if (auto TargetSymbolOrErr = findSymbolByIndex(RI.r_symbolnum)) + TargetSymbol = TargetSymbolOrErr->GraphSymbol; + else + return TargetSymbolOrErr.takeError(); + Addend = *(const ulittle64_t *)FixupContent; + break; + case Pointer64Anon: { + JITTargetAddress TargetAddress = *(const ulittle64_t *)FixupContent; + if (auto TargetSymbolOrErr = findSymbolByAddress(TargetAddress)) + TargetSymbol = &*TargetSymbolOrErr; + else + return TargetSymbolOrErr.takeError(); + Addend = TargetAddress - TargetSymbol->getAddress(); + break; + } + case Page21: + case GOTPage21: { + if (auto TargetSymbolOrErr = findSymbolByIndex(RI.r_symbolnum)) + TargetSymbol = TargetSymbolOrErr->GraphSymbol; + else + return TargetSymbolOrErr.takeError(); + uint32_t Instr = *(const ulittle32_t *)FixupContent; + if ((Instr & 0xffffffe0) != 0x90000000) + return make_error("PAGE21/GOTPAGE21 target is not an " + "ADRP instruction with a zero " + "addend"); + break; + } + case PageOffset12: { + if (auto TargetSymbolOrErr = findSymbolByIndex(RI.r_symbolnum)) + TargetSymbol = TargetSymbolOrErr->GraphSymbol; + else + return TargetSymbolOrErr.takeError(); + break; 
+ } + case GOTPageOffset12: { + if (auto TargetSymbolOrErr = findSymbolByIndex(RI.r_symbolnum)) + TargetSymbol = TargetSymbolOrErr->GraphSymbol; + else + return TargetSymbolOrErr.takeError(); + uint32_t Instr = *(const ulittle32_t *)FixupContent; + if ((Instr & 0xfffffc00) != 0xf9400000) + return make_error("GOTPAGEOFF12 target is not an LDR " + "immediate instruction with a zero " + "addend"); + break; + } + case PointerToGOT: + if (auto TargetSymbolOrErr = findSymbolByIndex(RI.r_symbolnum)) + TargetSymbol = TargetSymbolOrErr->GraphSymbol; + else + return TargetSymbolOrErr.takeError(); + break; + case Delta32: + case Delta64: { + // We use Delta32/Delta64 to represent SUBTRACTOR relocations. + // parsePairRelocation handles the paired reloc, and returns the + // edge kind to be used (either Delta32/Delta64, or + // NegDelta32/NegDelta64, depending on the direction of the + // subtraction) along with the addend. + auto PairInfo = + parsePairRelocation(*BlockToFix, *Kind, RI, FixupAddress, + FixupContent, ++RelItr, RelEnd); + if (!PairInfo) + return PairInfo.takeError(); + std::tie(*Kind, TargetSymbol, Addend) = *PairInfo; + assert(TargetSymbol && "No target symbol from parsePairRelocation?"); + break; + } + default: + llvm_unreachable("Special relocation kind should not appear in " + "mach-o file"); + } + + LLVM_DEBUG({ + Edge GE(*Kind, FixupAddress - BlockToFix->getAddress(), *TargetSymbol, + Addend); + printEdge(dbgs(), *BlockToFix, GE, + getMachOARM64RelocationKindName(*Kind)); + dbgs() << "\n"; + }); + BlockToFix->addEdge(*Kind, FixupAddress - BlockToFix->getAddress(), + *TargetSymbol, Addend); + } + } + return Error::success(); + } + + unsigned NumSymbols = 0; +}; + +class MachO_arm64_GOTAndStubsBuilder + : public BasicGOTAndStubsBuilder { +public: + MachO_arm64_GOTAndStubsBuilder(LinkGraph &G) + : BasicGOTAndStubsBuilder(G) {} + + bool isGOTEdge(Edge &E) const { + return E.getKind() == GOTPage21 || E.getKind() == GOTPageOffset12 || + E.getKind() == 
PointerToGOT; + } + + Symbol &createGOTEntry(Symbol &Target) { + auto &GOTEntryBlock = G.createContentBlock( + getGOTSection(), getGOTEntryBlockContent(), 0, 8, 0); + GOTEntryBlock.addEdge(Pointer64, 0, Target, 0); + return G.addAnonymousSymbol(GOTEntryBlock, 0, 8, false, false); + } + + void fixGOTEdge(Edge &E, Symbol &GOTEntry) { + if (E.getKind() == GOTPage21 || E.getKind() == GOTPageOffset12) { + // Update the target, but leave the edge addend as-is. + E.setTarget(GOTEntry); + } else if (E.getKind() == PointerToGOT) { + E.setTarget(GOTEntry); + E.setKind(Delta32); + } else + llvm_unreachable("Not a GOT edge?"); + } + + bool isExternalBranchEdge(Edge &E) { + return E.getKind() == Branch26 && !E.getTarget().isDefined(); + } + + Symbol &createStub(Symbol &Target) { + auto &StubContentBlock = + G.createContentBlock(getStubsSection(), getStubBlockContent(), 0, 1, 0); + // Re-use GOT entries for stub targets. + auto &GOTEntrySymbol = getGOTEntrySymbol(Target); + StubContentBlock.addEdge(LDRLiteral19, 0, GOTEntrySymbol, 0); + return G.addAnonymousSymbol(StubContentBlock, 0, 8, true, false); + } + + void fixExternalBranchEdge(Edge &E, Symbol &Stub) { + assert(E.getKind() == Branch26 && "Not a Branch26 edge?"); + assert(E.getAddend() == 0 && "Branch26 edge has non-zero addend?"); + E.setTarget(Stub); + } + +private: + Section &getGOTSection() { + if (!GOTSection) + GOTSection = &G.createSection("$__GOT", sys::Memory::MF_READ); + return *GOTSection; + } + + Section &getStubsSection() { + if (!StubsSection) { + auto StubsProt = static_cast<sys::Memory::ProtectionFlags>( + sys::Memory::MF_READ | sys::Memory::MF_EXEC); + StubsSection = &G.createSection("$__STUBS", StubsProt); + } + return *StubsSection; + } + + StringRef getGOTEntryBlockContent() { + return StringRef(reinterpret_cast<const char *>(NullGOTEntryContent), + sizeof(NullGOTEntryContent)); + } + + StringRef getStubBlockContent() { + return StringRef(reinterpret_cast<const char *>(StubContent), + sizeof(StubContent)); + } + + static const uint8_t NullGOTEntryContent[8]; 
+ static const uint8_t StubContent[8]; + Section *GOTSection = nullptr; + Section *StubsSection = nullptr; +}; + +const uint8_t MachO_arm64_GOTAndStubsBuilder::NullGOTEntryContent[8] = { + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00}; +const uint8_t MachO_arm64_GOTAndStubsBuilder::StubContent[8] = { + 0x10, 0x00, 0x00, 0x58, // LDR x16, + 0x00, 0x02, 0x1f, 0xd6 // BR x16 +}; + +} // namespace + +namespace llvm { +namespace jitlink { + +class MachOJITLinker_arm64 : public JITLinker { + friend class JITLinker; + +public: + MachOJITLinker_arm64(std::unique_ptr Ctx, + PassConfiguration PassConfig) + : JITLinker(std::move(Ctx), std::move(PassConfig)) {} + +private: + StringRef getEdgeKindName(Edge::Kind R) const override { + return getMachOARM64RelocationKindName(R); + } + + Expected> + buildGraph(MemoryBufferRef ObjBuffer) override { + auto MachOObj = object::ObjectFile::createMachOObjectFile(ObjBuffer); + if (!MachOObj) + return MachOObj.takeError(); + return MachOLinkGraphBuilder_arm64(**MachOObj).buildGraph(); + } + + static Error targetOutOfRangeError(const Block &B, const Edge &E) { + std::string ErrMsg; + { + raw_string_ostream ErrStream(ErrMsg); + ErrStream << "Relocation target out of range: "; + printEdge(ErrStream, B, E, getMachOARM64RelocationKindName(E.getKind())); + ErrStream << "\n"; + } + return make_error(std::move(ErrMsg)); + } + + static unsigned getPageOffset12Shift(uint32_t Instr) { + constexpr uint32_t LDRLiteralMask = 0x3ffffc00; + + // Check for a GPR LDR immediate with a zero embedded literal. + // If found, the top two bits contain the shift. + if ((Instr & LDRLiteralMask) == 0x39400000) + return Instr >> 30; + + // Check for a Neon LDR immediate of size 64-bit or less with a zero + // embedded literal. If found, the top two bits contain the shift. + if ((Instr & LDRLiteralMask) == 0x3d400000) + return Instr >> 30; + + // Check for a Neon LDR immediate of size 128-bit with a zero embedded + // literal. 
+ constexpr uint32_t SizeBitsMask = 0xc0000000; + if ((Instr & (LDRLiteralMask | SizeBitsMask)) == 0x3dc00000) + return 4; + + return 0; + } + + Error applyFixup(Block &B, const Edge &E, char *BlockWorkingMem) const { + using namespace support; + + char *FixupPtr = BlockWorkingMem + E.getOffset(); + JITTargetAddress FixupAddress = B.getAddress() + E.getOffset(); + + switch (E.getKind()) { + case Branch26: { + assert((FixupAddress & 0x3) == 0 && "Branch-inst is not 32-bit aligned"); + + int64_t Value = E.getTarget().getAddress() - FixupAddress + E.getAddend(); + + if (static_cast<uint64_t>(Value) & 0x3) + return make_error<JITLinkError>("Branch26 target is not 32-bit " + "aligned"); + + if (Value < -(1 << 27) || Value > ((1 << 27) - 1)) + return targetOutOfRangeError(B, E); + + uint32_t RawInstr = *(little32_t *)FixupPtr; + assert((RawInstr & 0x7fffffff) == 0x14000000 && + "RawInstr isn't a B or BL immediate instruction"); + uint32_t Imm = (static_cast<uint32_t>(Value) & ((1 << 28) - 1)) >> 2; + uint32_t FixedInstr = RawInstr | Imm; + *(little32_t *)FixupPtr = FixedInstr; + break; + } + case Pointer32: { + uint64_t Value = E.getTarget().getAddress() + E.getAddend(); + if (Value > std::numeric_limits<uint32_t>::max()) + return targetOutOfRangeError(B, E); + *(ulittle32_t *)FixupPtr = Value; + break; + } + case Pointer64: { + uint64_t Value = E.getTarget().getAddress() + E.getAddend(); + *(ulittle64_t *)FixupPtr = Value; + break; + } + case Page21: + case GOTPage21: { + assert(E.getAddend() == 0 && "PAGE21/GOTPAGE21 with non-zero addend"); + uint64_t TargetPage = + E.getTarget().getAddress() & ~static_cast<uint64_t>(4096 - 1); + uint64_t PCPage = B.getAddress() & ~static_cast<uint64_t>(4096 - 1); + + int64_t PageDelta = TargetPage - PCPage; + if (PageDelta < -(1 << 30) || PageDelta > ((1 << 30) - 1)) + return targetOutOfRangeError(B, E); + + uint32_t RawInstr = *(ulittle32_t *)FixupPtr; + assert((RawInstr & 0xffffffe0) == 0x90000000 && + "RawInstr isn't an ADRP instruction"); + uint32_t ImmLo = (static_cast<uint64_t>(PageDelta) >> 12) & 
0x3; + uint32_t ImmHi = (static_cast<uint64_t>(PageDelta) >> 14) & 0x7ffff; + uint32_t FixedInstr = RawInstr | (ImmLo << 29) | (ImmHi << 5); + *(ulittle32_t *)FixupPtr = FixedInstr; + break; + } + case PageOffset12: { + assert(E.getAddend() == 0 && "PAGEOFF12 with non-zero addend"); + uint64_t TargetOffset = E.getTarget().getAddress() & 0xfff; + + uint32_t RawInstr = *(ulittle32_t *)FixupPtr; + unsigned ImmShift = getPageOffset12Shift(RawInstr); + + if (TargetOffset & ((1 << ImmShift) - 1)) + return make_error<JITLinkError>("PAGEOFF12 target is not aligned"); + + uint32_t EncodedImm = (TargetOffset >> ImmShift) << 10; + uint32_t FixedInstr = RawInstr | EncodedImm; + *(ulittle32_t *)FixupPtr = FixedInstr; + break; + } + case GOTPageOffset12: { + assert(E.getAddend() == 0 && "GOTPAGEOFF12 with non-zero addend"); + uint64_t TargetOffset = E.getTarget().getAddress() & 0xfff; + + uint32_t RawInstr = *(ulittle32_t *)FixupPtr; + assert((RawInstr & 0xfffffc00) == 0xf9400000 && + "RawInstr isn't a 64-bit LDR immediate"); + uint32_t FixedInstr = RawInstr | (TargetOffset << 10); + *(ulittle32_t *)FixupPtr = FixedInstr; + break; + } + case LDRLiteral19: { + assert((FixupAddress & 0x3) == 0 && "LDR is not 32-bit aligned"); + assert(E.getAddend() == 0 && "LDRLiteral19 with non-zero addend"); + uint32_t RawInstr = *(ulittle32_t *)FixupPtr; + assert(RawInstr == 0x58000010 && "RawInstr isn't a 64-bit LDR literal"); + int64_t Delta = E.getTarget().getAddress() - FixupAddress; + if (Delta & 0x3) + return make_error<JITLinkError>("LDR literal target is not 32-bit " + "aligned"); + if (Delta < -(1 << 20) || Delta > ((1 << 20) - 1)) + return targetOutOfRangeError(B, E); + + uint32_t EncodedImm = (static_cast<uint32_t>(Delta) >> 2) << 5; + uint32_t FixedInstr = RawInstr | EncodedImm; + *(ulittle32_t *)FixupPtr = FixedInstr; + break; + } + case Delta32: + case Delta64: + case NegDelta32: + case NegDelta64: { + int64_t Value; + if (E.getKind() == Delta32 || E.getKind() == Delta64) + Value = E.getTarget().getAddress() - FixupAddress + 
E.getAddend(); + else + Value = FixupAddress - E.getTarget().getAddress() + E.getAddend(); + + if (E.getKind() == Delta32 || E.getKind() == NegDelta32) { + if (Value < std::numeric_limits::min() || + Value > std::numeric_limits::max()) + return targetOutOfRangeError(B, E); + *(little32_t *)FixupPtr = Value; + } else + *(little64_t *)FixupPtr = Value; + break; + } + default: + llvm_unreachable("Unrecognized edge kind"); + } + + return Error::success(); + } + + uint64_t NullValue = 0; +}; + +void jitLink_MachO_arm64(std::unique_ptr Ctx) { + PassConfiguration Config; + Triple TT("arm64-apple-ios"); + + if (Ctx->shouldAddDefaultTargetPasses(TT)) { + // Add a mark-live pass. + if (auto MarkLive = Ctx->getMarkLivePass(TT)) + Config.PrePrunePasses.push_back(std::move(MarkLive)); + else + Config.PrePrunePasses.push_back(markAllSymbolsLive); + + // Add an in-place GOT/Stubs pass. + Config.PostPrunePasses.push_back([](LinkGraph &G) -> Error { + MachO_arm64_GOTAndStubsBuilder(G).run(); + return Error::success(); + }); + } + + if (auto Err = Ctx->modifyPassConfig(TT, Config)) + return Ctx->notifyFailed(std::move(Err)); + + // Construct a JITLinker and run the link function. 
+ MachOJITLinker_arm64::link(std::move(Ctx), std::move(Config)); +} + +StringRef getMachOARM64RelocationKindName(Edge::Kind R) { + switch (R) { + case Branch26: + return "Branch26"; + case Pointer64: + return "Pointer64"; + case Pointer64Anon: + return "Pointer64Anon"; + case Page21: + return "Page21"; + case PageOffset12: + return "PageOffset12"; + case GOTPage21: + return "GOTPage21"; + case GOTPageOffset12: + return "GOTPageOffset12"; + case PointerToGOT: + return "PointerToGOT"; + case PairedAddend: + return "PairedAddend"; + case LDRLiteral19: + return "LDRLiteral19"; + case Delta32: + return "Delta32"; + case Delta64: + return "Delta64"; + case NegDelta32: + return "NegDelta32"; + case NegDelta64: + return "NegDelta64"; + default: + return getGenericEdgeKindName(static_cast<Edge::Kind>(R)); + } +} + +} // end namespace jitlink +} // end namespace llvm Added: llvm/trunk/test/ExecutionEngine/JITLink/AArch64/MachO_Arm64_relocations.s URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/ExecutionEngine/JITLink/AArch64/MachO_Arm64_relocations.s?rev=374476&view=auto ============================================================================== --- llvm/trunk/test/ExecutionEngine/JITLink/AArch64/MachO_Arm64_relocations.s (added) +++ llvm/trunk/test/ExecutionEngine/JITLink/AArch64/MachO_Arm64_relocations.s Thu Oct 10 16:37:51 2019 @@ -0,0 +1,339 @@ +# RUN: rm -rf %t && mkdir -p %t +# RUN: llvm-mc -triple=arm64-apple-darwin19 -filetype=obj -o %t/macho_reloc.o %s +# RUN: llvm-jitlink -noexec -define-abs external_data=0xdeadbeef -define-abs external_func=0xcafef00d -check=%s %t/macho_reloc.o + + .section __TEXT,__text,regular,pure_instructions + + .p2align 2 +Lanon_func: + ret + + .globl named_func + .p2align 2 +named_func: + ret + +# Check ARM64_RELOC_BRANCH26 handling with a call to a local function. +# The branch instruction only encodes 26 bits of the 28-bit possible branch +# range, since the low 2 bits will always be zero. 
+# +# jitlink-check: decode_operand(test_local_call, 0)[25:0] = (named_func - test_local_call)[27:2] + .globl test_local_call + .p2align 2 +test_local_call: + bl named_func + + .globl _main + .p2align 2 +_main: + ret + +# Check ARM64_RELOC_GOTPAGE21 / ARM64_RELOC_GOTPAGEOFF12 handling with a +# reference to an external symbol. Validate both the reference to the GOT entry, +# and also the content of the GOT entry. +# +# For the GOTPAGE21/ADRP instruction we have the 21-bit delta to the 4k page +# containing the GOT entry for external_data. +# +# For the GOTPAGEOFF/LDR instruction we have the 12-bit offset of the entry +# within the page. +# +# jitlink-check: *{8}(got_addr(macho_reloc.o, external_data)) = external_data +# jitlink-check: decode_operand(test_gotpage21, 1) = (got_addr(macho_reloc.o, external_data)[32:12] - test_gotpage21[32:12]) +# jitlink-check: decode_operand(test_gotpageoff12, 2) = got_addr(macho_reloc.o, external_data)[11:3] + .globl test_gotpage21 + .p2align 2 +test_gotpage21: + adrp x0, external_data@GOTPAGE + .globl test_gotpageoff12 +test_gotpageoff12: + ldr x0, [x0, external_data@GOTPAGEOFF] + +# Check ARM64_RELOC_PAGE21 / ARM64_RELOC_PAGEOFF12 handling with a reference to +# a local symbol. +# +# For the PAGE21/ADRP instruction we have the 21-bit delta to the 4k page +# containing the global. +# +# For the PAGEOFF12 relocation we test the ADD instruction, all LDR/GPR +# variants and all LDR/Neon variants. 
+# +# jitlink-check: decode_operand(test_page21, 1) = (named_data[32:12] - test_page21[32:12]) +# jitlink-check: decode_operand(test_pageoff12add, 2) = named_data[11:0] +# jitlink-check: decode_operand(test_pageoff12gpr8, 2) = named_data[11:0] +# jitlink-check: decode_operand(test_pageoff12gpr16, 2) = named_data[11:1] +# jitlink-check: decode_operand(test_pageoff12gpr32, 2) = named_data[11:2] +# jitlink-check: decode_operand(test_pageoff12gpr64, 2) = named_data[11:3] +# jitlink-check: decode_operand(test_pageoff12neon8, 2) = named_data[11:0] +# jitlink-check: decode_operand(test_pageoff12neon16, 2) = named_data[11:1] +# jitlink-check: decode_operand(test_pageoff12neon32, 2) = named_data[11:2] +# jitlink-check: decode_operand(test_pageoff12neon64, 2) = named_data[11:3] +# jitlink-check: decode_operand(test_pageoff12neon128, 2) = named_data[11:4] + .globl test_page21 + .p2align 2 +test_page21: + adrp x0, named_data@PAGE + + .globl test_pageoff12add +test_pageoff12add: + add x0, x0, named_data@PAGEOFF + + .globl test_pageoff12gpr8 +test_pageoff12gpr8: + ldrb w0, [x0, named_data@PAGEOFF] + + .globl test_pageoff12gpr16 +test_pageoff12gpr16: + ldrh w0, [x0, named_data@PAGEOFF] + + .globl test_pageoff12gpr32 +test_pageoff12gpr32: + ldr w0, [x0, named_data@PAGEOFF] + + .globl test_pageoff12gpr64 +test_pageoff12gpr64: + ldr x0, [x0, named_data@PAGEOFF] + + .globl test_pageoff12neon8 +test_pageoff12neon8: + ldr b0, [x0, named_data@PAGEOFF] + + .globl test_pageoff12neon16 +test_pageoff12neon16: + ldr h0, [x0, named_data@PAGEOFF] + + .globl test_pageoff12neon32 +test_pageoff12neon32: + ldr s0, [x0, named_data@PAGEOFF] + + .globl test_pageoff12neon64 +test_pageoff12neon64: + ldr d0, [x0, named_data@PAGEOFF] + + .globl test_pageoff12neon128 +test_pageoff12neon128: + ldr q0, [x0, named_data@PAGEOFF] + +# Check that calls to external functions trigger the generation of stubs and GOT +# entries. 
+# +# jitlink-check: decode_operand(test_external_call, 0) = (stub_addr(macho_reloc.o, external_func) - test_external_call)[27:2] +# jitlink-check: *{8}(got_addr(macho_reloc.o, external_func)) = external_func + .globl test_external_call + .p2align 2 +test_external_call: + bl external_func + + .section __DATA,__data + +# Storage target for non-extern ARM64_RELOC_SUBTRACTOR relocs. + .p2align 3 +Lanon_data: + .quad 0x1111111111111111 + +# Check ARM64_RELOC_SUBTRACTOR Quad/Long in anonymous storage with anonymous +# minuend: "LA: .quad LA - B + C". The anonymous subtrahend form +# "LA: .quad B - LA + C" is not tested as subtrahends are not permitted to be +# anonymous. +# +# Note: the +8 offset in the expression below accounts for sizeof(Lanon_data). +# jitlink-check: *{8}(section_addr(macho_reloc.o, __data) + 8) = (section_addr(macho_reloc.o, __data) + 8) - named_data + 2 + .p2align 3 +Lanon_minuend_quad: + .quad Lanon_minuend_quad - named_data + 2 + +# Note: the +16 offset in the expression below accounts for sizeof(Lanon_data) + sizeof(Lanon_minuend_quad). +# jitlink-check: *{4}(section_addr(macho_reloc.o, __data) + 16) = ((section_addr(macho_reloc.o, __data) + 16) - named_data + 2)[31:0] + .p2align 2 +Lanon_minuend_long: + .long Lanon_minuend_long - named_data + 2 + +# Named quad storage target (first named atom in __data). +# Align to 16 for use as 128-bit load target. + .globl named_data + .p2align 4 +named_data: + .quad 0x2222222222222222 + .quad 0x3333333333333333 + +# An alt-entry point for named_data + .globl named_data_alt_entry + .p2align 3 + .alt_entry named_data_alt_entry +named_data_alt_entry: + .quad 0 + +# Check ARM64_RELOC_UNSIGNED / quad / extern handling by putting the address of +# a local named function into a quad symbol. 
+# +# jitlink-check: *{8}named_func_addr_quad = named_func + .globl named_func_addr_quad + .p2align 3 +named_func_addr_quad: + .quad named_func + +# Check ARM64_RELOC_UNSIGNED / quad / non-extern handling by putting the +# address of a local anonymous function into a quad symbol. +# +# jitlink-check: *{8}anon_func_addr_quad = section_addr(macho_reloc.o, __text) + .globl anon_func_addr_quad + .p2align 3 +anon_func_addr_quad: + .quad Lanon_func + +# ARM64_RELOC_SUBTRACTOR Quad/Long in named storage with anonymous minuend +# +# jitlink-check: *{8}anon_minuend_quad1 = section_addr(macho_reloc.o, __data) - anon_minuend_quad1 + 2 +# Only the form "B: .quad LA - B + C" is tested. The form "B: .quad B - LA + C" is +# invalid because the subtrahend cannot be local. + .globl anon_minuend_quad1 + .p2align 3 +anon_minuend_quad1: + .quad Lanon_data - anon_minuend_quad1 + 2 + +# jitlink-check: *{4}anon_minuend_long1 = (section_addr(macho_reloc.o, __data) - anon_minuend_long1 + 2)[31:0] + .globl anon_minuend_long1 + .p2align 2 +anon_minuend_long1: + .long Lanon_data - anon_minuend_long1 + 2 + +# Check ARM64_RELOC_SUBTRACTOR Quad/Long in named storage with minuend and subtrahend. +# Both forms "A: .quad A - B + C" and "A: .quad B - A + C" are tested. +# +# Check "A: .quad B - A + C". +# jitlink-check: *{8}subtrahend_quad2 = (named_data - subtrahend_quad2 - 2) + .globl subtrahend_quad2 + .p2align 3 +subtrahend_quad2: + .quad named_data - subtrahend_quad2 - 2 + +# Check "A: .long B - A + C". +# jitlink-check: *{4}subtrahend_long2 = (named_data - subtrahend_long2 - 2)[31:0] + .globl subtrahend_long2 + .p2align 2 +subtrahend_long2: + .long named_data - subtrahend_long2 - 2 + +# Check "A: .quad A - B + C". +# jitlink-check: *{8}minuend_quad3 = (minuend_quad3 - named_data - 2) + .globl minuend_quad3 + .p2align 3 +minuend_quad3: + .quad minuend_quad3 - named_data - 2 + +# Check "A: .long A - B + C". 
+# jitlink-check: *{4}minuend_long3 = (minuend_long3 - named_data - 2)[31:0] + .globl minuend_long3 + .p2align 2 +minuend_long3: + .long minuend_long3 - named_data - 2 + +# Check ARM64_RELOC_SUBTRACTOR handling for exprs of the form +# "A: .quad/long B - C + D", where 'B' or 'C' is at a fixed offset from 'A' +# (i.e. is part of an alt_entry chain that includes 'A'). +# +# Check "A: .long B - C + D" where 'B' is an alt_entry for 'A'. +# jitlink-check: *{4}subtractor_with_alt_entry_minuend_long = (subtractor_with_alt_entry_minuend_long_B - named_data + 2)[31:0] + .globl subtractor_with_alt_entry_minuend_long + .p2align 2 +subtractor_with_alt_entry_minuend_long: + .long subtractor_with_alt_entry_minuend_long_B - named_data + 2 + + .globl subtractor_with_alt_entry_minuend_long_B + .p2align 2 + .alt_entry subtractor_with_alt_entry_minuend_long_B +subtractor_with_alt_entry_minuend_long_B: + .long 0 + +# Check "A: .quad B - C + D" where 'B' is an alt_entry for 'A'. +# jitlink-check: *{8}subtractor_with_alt_entry_minuend_quad = (subtractor_with_alt_entry_minuend_quad_B - named_data + 2) + .globl subtractor_with_alt_entry_minuend_quad + .p2align 3 +subtractor_with_alt_entry_minuend_quad: + .quad subtractor_with_alt_entry_minuend_quad_B - named_data + 2 + + .globl subtractor_with_alt_entry_minuend_quad_B + .p2align 3 + .alt_entry subtractor_with_alt_entry_minuend_quad_B +subtractor_with_alt_entry_minuend_quad_B: + .quad 0 + +# Check "A: .long B - C + D" where 'C' is an alt_entry for 'A'. 
+# jitlink-check: *{4}subtractor_with_alt_entry_subtrahend_long = (named_data - subtractor_with_alt_entry_subtrahend_long_B + 2)[31:0] + .globl subtractor_with_alt_entry_subtrahend_long + .p2align 2 +subtractor_with_alt_entry_subtrahend_long: + .long named_data - subtractor_with_alt_entry_subtrahend_long_B + 2 + + .globl subtractor_with_alt_entry_subtrahend_long_B + .p2align 2 + .alt_entry subtractor_with_alt_entry_subtrahend_long_B +subtractor_with_alt_entry_subtrahend_long_B: + .long 0 + +# Check "A: .quad B - C + D" where 'C' is an alt_entry for 'A'. +# jitlink-check: *{8}subtractor_with_alt_entry_subtrahend_quad = (named_data - subtractor_with_alt_entry_subtrahend_quad_B + 2) + .globl subtractor_with_alt_entry_subtrahend_quad + .p2align 3 +subtractor_with_alt_entry_subtrahend_quad: + .quad named_data - subtractor_with_alt_entry_subtrahend_quad_B + 2 + + .globl subtractor_with_alt_entry_subtrahend_quad_B + .p2align 3 + .alt_entry subtractor_with_alt_entry_subtrahend_quad_B +subtractor_with_alt_entry_subtrahend_quad_B: + .quad 0 + +# Check ARM64_POINTER_TO_GOT handling. +# ARM64_POINTER_TO_GOT is a delta-32 to a GOT entry. +# +# jitlink-check: *{4}test_got = (got_addr(macho_reloc.o, external_data) - test_got)[31:0] + .globl test_got + .p2align 2 +test_got: + .long external_data@got - . + +# Check that unreferenced atoms in no-dead-strip sections are not dead stripped. +# We need to use a local symbol for this as any named symbol will end up in the +# ORC responsibility set, which is automatically marked live and would cause +# spurious passes. +# +# jitlink-check: *{8}section_addr(macho_reloc.o, __nds_test_sect) = 0 + .section __DATA,__nds_test_sect,regular,no_dead_strip + .quad 0 + +# Check that unreferenced local symbols that have been marked no-dead-strip are +# not dead-stripped. 
+# +# jitlink-check: *{8}section_addr(macho_reloc.o, __nds_test_nlst) = 0 + .section __DATA,__nds_test_nlst,regular + .no_dead_strip no_dead_strip_test_symbol +no_dead_strip_test_symbol: + .quad 0 + +# Check that explicit zero-fill symbols are supported +# jitlink-check: *{8}zero_fill_test = 0 + .globl zero_fill_test +.zerofill __DATA,__zero_fill_test,zero_fill_test,8,3 + +# Check that section alignments are respected. +# We test this by introducing two segments with alignment 8, each containing one +# byte of data. We require both symbols to have an aligned address. +# +# jitlink-check: section_alignment_check1[2:0] = 0 +# jitlink-check: section_alignment_check2[2:0] = 0 + .section __DATA,__sec_align_chk1 + .p2align 3 + + .globl section_alignment_check1 +section_alignment_check1: + .byte 0 + + .section __DATA,__sec_align_chk2 + .p2align 3 + + .globl section_alignment_check2 +section_alignment_check2: + .byte 0 + +.subsections_via_symbols Added: llvm/trunk/test/ExecutionEngine/JITLink/AArch64/lit.local.cfg URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/ExecutionEngine/JITLink/AArch64/lit.local.cfg?rev=374476&view=auto ============================================================================== --- llvm/trunk/test/ExecutionEngine/JITLink/AArch64/lit.local.cfg (added) +++ llvm/trunk/test/ExecutionEngine/JITLink/AArch64/lit.local.cfg Thu Oct 10 16:37:51 2019 @@ -0,0 +1,2 @@ +if not 'AArch64' in config.root.targets: + config.unsupported = True From llvm-commits at lists.llvm.org Thu Oct 10 16:35:31 2019 From: llvm-commits at lists.llvm.org (Philip Reames via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 23:35:31 +0000 (UTC) Subject: [PATCH] D68811: [CVP] Remove a masking operation if range information implies it's a noop In-Reply-To: References: Message-ID: <661e8452722493597a85b1df5875a6e0@localhost.localdomain> reames marked an inline comment as done. reames added inline comments. 
================ Comment at: lib/Transforms/Scalar/CorrelatedValuePropagation.cpp:713 + ConstantInt *RHS = dyn_cast<ConstantInt>(BinOp->getOperand(1)); + if (!RHS || !RHS->getValue().isMask()) + return false; ---------------- lebedev.ri wrote: > nikic wrote: > > The limit to masks seems a bit stronger than strictly necessary: The range needs to be <= than the trailing ones in RHS. That is for `0b11001111`, if the range is `<= 0b00001111` that is sufficient. Not sure if this is worth handling. > I'd say this is intentional to limit the number of `and`s we handle. It's definitely more restrictive than necessary. I decided the mask case (which is specifically what instcombine generates) was a reasonable case to break at. I'm happy to generalize (slightly) but will definitely do so in a separate patch. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68811/new/ https://reviews.llvm.org/D68811 From llvm-commits at lists.llvm.org Thu Oct 10 16:36:04 2019 From: llvm-commits at lists.llvm.org (Tom Stellard via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 23:36:04 +0000 (UTC) Subject: [PATCH] D66840: docs/DeveloperPolicy: Add instructions for requesting GitHub commit access In-Reply-To: References: Message-ID: This revision was not accepted when it landed; it landed in state "Needs Review". This revision was automatically updated to reflect the committed changes. Closed by commit rG97578b14fca6: docs/DeveloperPolicy: Add instructions for requesting GitHub commit access (authored by tstellar). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D66840/new/ https://reviews.llvm.org/D66840 Files: llvm/docs/DeveloperPolicy.rst Index: llvm/docs/DeveloperPolicy.rst =================================================================== --- llvm/docs/DeveloperPolicy.rst +++ llvm/docs/DeveloperPolicy.rst @@ -396,6 +396,26 @@ .. 
_discuss the change/gather consensus: +Obtaining Commit Access to the GitHub Repository +------------------------------------------------ +We are currently in the process of migrating the project's source code from SVN +to a git repository on GitHub. We are maintaining a file in SVN to map +SVN usernames to GitHub usernames, so we can automatically grant access to +existing committers when we complete the migration to GitHub. In order to +request commit access, check out the github-usernames.txt file in meta/trunk and +add a line in the form of $SVN_USERNAME:$GITHUB_USERNAME and commit it. For +example: + +.. code:: console + + mkdir tmp-llvm-svn + cd tmp-llvm-svn + svn co https://$SVN_USERNAME at llvm.org/svn/llvm-project/meta/trunk + echo "$SVN_USERNAME:$GITHUB_USERNAME" >> trunk/github-usernames.txt + cd trunk + svn commit -m "Request commit access for $SVN_USERNAME" + + Making a Major Change --------------------- -------------- next part -------------- A non-text attachment was scrubbed... Name: D66840.224500.patch Type: text/x-patch Size: 1143 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Thu Oct 10 16:44:35 2019 From: llvm-commits at lists.llvm.org (Vitaly Buka via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 23:44:35 +0000 (UTC) Subject: [PATCH] D68832: [tsan,msan] Insert module constructors in a module pass In-Reply-To: References: Message-ID: <479c21d90472836306dac87a7e1a9d39@localhost.localdomain> vitalybuka updated this revision to Diff 224501. vitalybuka marked 2 inline comments as done. vitalybuka added a comment. 
nfc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68832/new/ https://reviews.llvm.org/D68832 Files: clang/lib/CodeGen/BackendUtil.cpp clang/test/CodeGen/sanitizer-module-constructor.c llvm/include/llvm/Transforms/Instrumentation/MemorySanitizer.h llvm/include/llvm/Transforms/Instrumentation/ThreadSanitizer.h llvm/lib/Passes/PassRegistry.def llvm/lib/Transforms/Instrumentation/MemorySanitizer.cpp llvm/lib/Transforms/Instrumentation/ThreadSanitizer.cpp llvm/test/Instrumentation/MemorySanitizer/msan_basic.ll llvm/test/Instrumentation/ThreadSanitizer/tsan_basic.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D68832.224501.patch Type: text/x-patch Size: 15253 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Thu Oct 10 16:48:56 2019 From: llvm-commits at lists.llvm.org (Michael Liao via llvm-commits) Date: Thu, 10 Oct 2019 23:48:56 -0000 Subject: [llvm] r374479 - Fix compilation warning due to typo. Message-ID: <20191010234856.A59D29272A@lists.llvm.org> Author: hliao Date: Thu Oct 10 16:48:56 2019 New Revision: 374479 URL: http://llvm.org/viewvc/llvm-project?rev=374479&view=rev Log: Fix compilation warning due to typo. 
Modified: llvm/trunk/lib/ExecutionEngine/JITLink/MachO_arm64.cpp Modified: llvm/trunk/lib/ExecutionEngine/JITLink/MachO_arm64.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/ExecutionEngine/JITLink/MachO_arm64.cpp?rev=374479&r1=374478&r2=374479&view=diff ============================================================================== --- llvm/trunk/lib/ExecutionEngine/JITLink/MachO_arm64.cpp (original) +++ llvm/trunk/lib/ExecutionEngine/JITLink/MachO_arm64.cpp Thu Oct 10 16:48:56 2019 @@ -263,7 +263,7 @@ private: if (!Kind) return Kind.takeError(); - if (*Kind != Branch26 & *Kind != Page21 && *Kind != PageOffset12) + if (*Kind != Branch26 && *Kind != Page21 && *Kind != PageOffset12) return make_error( "Invalid relocation pair: Addend + " + getMachOARM64RelocationKindName(*Kind)); From llvm-commits at lists.llvm.org Thu Oct 10 16:49:07 2019 From: llvm-commits at lists.llvm.org (Vitaly Buka via llvm-commits) Date: Thu, 10 Oct 2019 23:49:07 -0000 Subject: [llvm] r374480 - [msan, NFC] Move option parsing into constructor Message-ID: <20191010234907.C65B192C90@lists.llvm.org> Author: vitalybuka Date: Thu Oct 10 16:49:07 2019 New Revision: 374480 URL: http://llvm.org/viewvc/llvm-project?rev=374480&view=rev Log: [msan, NFC] Move option parsing into constructor Modified: llvm/trunk/include/llvm/Transforms/Instrumentation/MemorySanitizer.h llvm/trunk/lib/Transforms/Instrumentation/MemorySanitizer.cpp Modified: llvm/trunk/include/llvm/Transforms/Instrumentation/MemorySanitizer.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Transforms/Instrumentation/MemorySanitizer.h?rev=374480&r1=374479&r2=374480&view=diff ============================================================================== --- llvm/trunk/include/llvm/Transforms/Instrumentation/MemorySanitizer.h (original) +++ llvm/trunk/include/llvm/Transforms/Instrumentation/MemorySanitizer.h Thu Oct 10 16:49:07 2019 @@ -19,12 +19,11 @@ namespace llvm { struct MemorySanitizerOptions { - 
MemorySanitizerOptions() = default; - MemorySanitizerOptions(int TrackOrigins, bool Recover, bool Kernel) - : TrackOrigins(TrackOrigins), Recover(Recover), Kernel(Kernel) {} - int TrackOrigins = 0; - bool Recover = false; - bool Kernel = false; + MemorySanitizerOptions() : MemorySanitizerOptions(0, false, false){}; + MemorySanitizerOptions(int TrackOrigins, bool Recover, bool Kernel); + bool Kernel; + int TrackOrigins; + bool Recover; }; // Insert MemorySanitizer instrumentation (detection of uninitialized reads) Modified: llvm/trunk/lib/Transforms/Instrumentation/MemorySanitizer.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Instrumentation/MemorySanitizer.cpp?rev=374480&r1=374479&r2=374480&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Instrumentation/MemorySanitizer.cpp (original) +++ llvm/trunk/lib/Transforms/Instrumentation/MemorySanitizer.cpp Thu Oct 10 16:49:07 2019 @@ -462,16 +462,9 @@ namespace { /// the module. class MemorySanitizer { public: - MemorySanitizer(Module &M, MemorySanitizerOptions Options) { - this->CompileKernel = - ClEnableKmsan.getNumOccurrences() > 0 ? ClEnableKmsan : Options.Kernel; - if (ClTrackOrigins.getNumOccurrences() > 0) - this->TrackOrigins = ClTrackOrigins; - else - this->TrackOrigins = this->CompileKernel ? 2 : Options.TrackOrigins; - this->Recover = ClKeepGoing.getNumOccurrences() > 0 - ? ClKeepGoing - : (this->CompileKernel | Options.Recover); + MemorySanitizer(Module &M, MemorySanitizerOptions Options) + : CompileKernel(Options.Kernel), TrackOrigins(Options.TrackOrigins), + Recover(Options.Recover) { initializeModule(M); } @@ -623,8 +616,17 @@ struct MemorySanitizerLegacyPass : publi MemorySanitizerOptions Options; }; +template T getOptOrDefault(const cl::opt &Opt, T Default) { + return (Opt.getNumOccurrences() > 0) ? 
Opt : Default; +} + } // end anonymous namespace +MemorySanitizerOptions::MemorySanitizerOptions(int TO, bool R, bool K) + : Kernel(getOptOrDefault(ClEnableKmsan, K)), + TrackOrigins(getOptOrDefault(ClTrackOrigins, Kernel ? 2 : TO)), + Recover(getOptOrDefault(ClKeepGoing, Kernel || R)) {} + PreservedAnalyses MemorySanitizerPass::run(Function &F, FunctionAnalysisManager &FAM) { MemorySanitizer Msan(*F.getParent(), Options); From llvm-commits at lists.llvm.org Thu Oct 10 16:49:11 2019 From: llvm-commits at lists.llvm.org (Vitaly Buka via llvm-commits) Date: Thu, 10 Oct 2019 23:49:11 -0000 Subject: [llvm] r374481 - [tsan, msan] Insert module constructors in a module pass Message-ID: <20191010234911.14E6792C90@lists.llvm.org> Author: vitalybuka Date: Thu Oct 10 16:49:10 2019 New Revision: 374481 URL: http://llvm.org/viewvc/llvm-project?rev=374481&view=rev Log: [tsan,msan] Insert module constructors in a module pass Summary: If we insert them from function pass some analysis may be missing or invalid. Fixes PR42877. 
Reviewers: eugenis, leonardchan Reviewed By: leonardchan Subscribers: hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D68832 Modified: llvm/trunk/include/llvm/Transforms/Instrumentation/MemorySanitizer.h llvm/trunk/include/llvm/Transforms/Instrumentation/ThreadSanitizer.h llvm/trunk/lib/Passes/PassRegistry.def llvm/trunk/lib/Transforms/Instrumentation/MemorySanitizer.cpp llvm/trunk/lib/Transforms/Instrumentation/ThreadSanitizer.cpp llvm/trunk/test/Instrumentation/MemorySanitizer/msan_basic.ll llvm/trunk/test/Instrumentation/ThreadSanitizer/tsan_basic.ll Modified: llvm/trunk/include/llvm/Transforms/Instrumentation/MemorySanitizer.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Transforms/Instrumentation/MemorySanitizer.h?rev=374481&r1=374480&r2=374481&view=diff ============================================================================== --- llvm/trunk/include/llvm/Transforms/Instrumentation/MemorySanitizer.h (original) +++ llvm/trunk/include/llvm/Transforms/Instrumentation/MemorySanitizer.h Thu Oct 10 16:49:10 2019 @@ -40,6 +40,7 @@ struct MemorySanitizerPass : public Pass MemorySanitizerPass(MemorySanitizerOptions Options) : Options(Options) {} PreservedAnalyses run(Function &F, FunctionAnalysisManager &FAM); + PreservedAnalyses run(Module &M, ModuleAnalysisManager &AM); private: MemorySanitizerOptions Options; Modified: llvm/trunk/include/llvm/Transforms/Instrumentation/ThreadSanitizer.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Transforms/Instrumentation/ThreadSanitizer.h?rev=374481&r1=374480&r2=374481&view=diff ============================================================================== --- llvm/trunk/include/llvm/Transforms/Instrumentation/ThreadSanitizer.h (original) +++ llvm/trunk/include/llvm/Transforms/Instrumentation/ThreadSanitizer.h Thu Oct 10 16:49:10 2019 @@ -27,6 +27,8 @@ FunctionPass *createThreadSanitizerLegac /// yet, the pass inserts the 
declarations. Otherwise the existing globals are struct ThreadSanitizerPass : public PassInfoMixin { PreservedAnalyses run(Function &F, FunctionAnalysisManager &FAM); + PreservedAnalyses run(Module &M, ModuleAnalysisManager &AM); }; + } // namespace llvm #endif /* LLVM_TRANSFORMS_INSTRUMENTATION_THREADSANITIZER_H */ Modified: llvm/trunk/lib/Passes/PassRegistry.def URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Passes/PassRegistry.def?rev=374481&r1=374480&r2=374481&view=diff ============================================================================== --- llvm/trunk/lib/Passes/PassRegistry.def (original) +++ llvm/trunk/lib/Passes/PassRegistry.def Thu Oct 10 16:49:10 2019 @@ -86,6 +86,8 @@ MODULE_PASS("synthetic-counts-propagatio MODULE_PASS("wholeprogramdevirt", WholeProgramDevirtPass(nullptr, nullptr)) MODULE_PASS("verify", VerifierPass()) MODULE_PASS("asan-module", ModuleAddressSanitizerPass(/*CompileKernel=*/false, false, true, false)) +MODULE_PASS("msan-module", MemorySanitizerPass({})) +MODULE_PASS("tsan-module", ThreadSanitizerPass()) MODULE_PASS("kasan-module", ModuleAddressSanitizerPass(/*CompileKernel=*/true, false, true, false)) MODULE_PASS("sancov-module", ModuleSanitizerCoveragePass()) MODULE_PASS("poison-checking", PoisonCheckingPass()) Modified: llvm/trunk/lib/Transforms/Instrumentation/MemorySanitizer.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Instrumentation/MemorySanitizer.cpp?rev=374481&r1=374480&r2=374481&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Instrumentation/MemorySanitizer.cpp (original) +++ llvm/trunk/lib/Transforms/Instrumentation/MemorySanitizer.cpp Thu Oct 10 16:49:10 2019 @@ -587,10 +587,26 @@ private: /// An empty volatile inline asm that prevents callback merge. 
InlineAsm *EmptyAsm; - - Function *MsanCtorFunction; }; +void insertModuleCtor(Module &M) { + getOrCreateSanitizerCtorAndInitFunctions( + M, kMsanModuleCtorName, kMsanInitName, + /*InitArgTypes=*/{}, + /*InitArgs=*/{}, + // This callback is invoked when the functions are created the first + // time. Hook them into the global ctors list in that case: + [&](Function *Ctor, FunctionCallee) { + if (!ClWithComdat) { + appendToGlobalCtors(M, Ctor, 0); + return; + } + Comdat *MsanCtorComdat = M.getOrInsertComdat(kMsanModuleCtorName); + Ctor->setComdat(MsanCtorComdat); + appendToGlobalCtors(M, Ctor, 0, Ctor); + }); +} + /// A legacy function pass for msan instrumentation. /// /// Instruments functions to detect unitialized reads. @@ -635,6 +651,14 @@ PreservedAnalyses MemorySanitizerPass::r return PreservedAnalyses::all(); } +PreservedAnalyses MemorySanitizerPass::run(Module &M, + ModuleAnalysisManager &AM) { + if (Options.Kernel) + return PreservedAnalyses::all(); + insertModuleCtor(M); + return PreservedAnalyses::none(); +} + char MemorySanitizerLegacyPass::ID = 0; INITIALIZE_PASS_BEGIN(MemorySanitizerLegacyPass, "msan", @@ -920,23 +944,6 @@ void MemorySanitizer::initializeModule(M OriginStoreWeights = MDBuilder(*C).createBranchWeights(1, 1000); if (!CompileKernel) { - std::tie(MsanCtorFunction, std::ignore) = - getOrCreateSanitizerCtorAndInitFunctions( - M, kMsanModuleCtorName, kMsanInitName, - /*InitArgTypes=*/{}, - /*InitArgs=*/{}, - // This callback is invoked when the functions are created the first - // time. 
Hook them into the global ctors list in that case: - [&](Function *Ctor, FunctionCallee) { - if (!ClWithComdat) { - appendToGlobalCtors(M, Ctor, 0); - return; - } - Comdat *MsanCtorComdat = M.getOrInsertComdat(kMsanModuleCtorName); - Ctor->setComdat(MsanCtorComdat); - appendToGlobalCtors(M, Ctor, 0, Ctor); - }); - if (TrackOrigins) M.getOrInsertGlobal("__msan_track_origins", IRB.getInt32Ty(), [&] { return new GlobalVariable( @@ -954,6 +961,8 @@ void MemorySanitizer::initializeModule(M } bool MemorySanitizerLegacyPass::doInitialization(Module &M) { + if (!Options.Kernel) + insertModuleCtor(M); MSan.emplace(M, Options); return true; } @@ -4578,8 +4587,9 @@ static VarArgHelper *CreateVarArgHelper( } bool MemorySanitizer::sanitizeFunction(Function &F, TargetLibraryInfo &TLI) { - if (!CompileKernel && (&F == MsanCtorFunction)) + if (!CompileKernel && F.getName() == kMsanModuleCtorName) return false; + MemorySanitizerVisitor Visitor(F, *this, TLI); // Clear out readonly/readnone attributes. Modified: llvm/trunk/lib/Transforms/Instrumentation/ThreadSanitizer.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Instrumentation/ThreadSanitizer.cpp?rev=374481&r1=374480&r2=374481&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Instrumentation/ThreadSanitizer.cpp (original) +++ llvm/trunk/lib/Transforms/Instrumentation/ThreadSanitizer.cpp Thu Oct 10 16:49:10 2019 @@ -92,11 +92,10 @@ namespace { /// ensures the __tsan_init function is in the list of global constructors for /// the module. 
struct ThreadSanitizer { - ThreadSanitizer(Module &M); bool sanitizeFunction(Function &F, const TargetLibraryInfo &TLI); private: - void initializeCallbacks(Module &M); + void initialize(Module &M); bool instrumentLoadOrStore(Instruction *I, const DataLayout &DL); bool instrumentAtomic(Instruction *I, const DataLayout &DL); bool instrumentMemIntrinsic(Instruction *I); @@ -108,8 +107,6 @@ private: void InsertRuntimeIgnores(Function &F); Type *IntptrTy; - IntegerType *OrdTy; - // Callbacks to run-time library are computed in doInitialization. FunctionCallee TsanFuncEntry; FunctionCallee TsanFuncExit; FunctionCallee TsanIgnoreBegin; @@ -130,7 +127,6 @@ private: FunctionCallee TsanVptrUpdate; FunctionCallee TsanVptrLoad; FunctionCallee MemmoveFn, MemcpyFn, MemsetFn; - Function *TsanCtorFunction; }; struct ThreadSanitizerLegacyPass : FunctionPass { @@ -143,16 +139,32 @@ struct ThreadSanitizerLegacyPass : Funct private: Optional TSan; }; + +void insertModuleCtor(Module &M) { + getOrCreateSanitizerCtorAndInitFunctions( + M, kTsanModuleCtorName, kTsanInitName, /*InitArgTypes=*/{}, + /*InitArgs=*/{}, + // This callback is invoked when the functions are created the first + // time. 
Hook them into the global ctors list in that case: + [&](Function *Ctor, FunctionCallee) { appendToGlobalCtors(M, Ctor, 0); }); +} + } // namespace PreservedAnalyses ThreadSanitizerPass::run(Function &F, FunctionAnalysisManager &FAM) { - ThreadSanitizer TSan(*F.getParent()); + ThreadSanitizer TSan; if (TSan.sanitizeFunction(F, FAM.getResult(F))) return PreservedAnalyses::none(); return PreservedAnalyses::all(); } +PreservedAnalyses ThreadSanitizerPass::run(Module &M, + ModuleAnalysisManager &MAM) { + insertModuleCtor(M); + return PreservedAnalyses::none(); +} + char ThreadSanitizerLegacyPass::ID = 0; INITIALIZE_PASS_BEGIN(ThreadSanitizerLegacyPass, "tsan", "ThreadSanitizer: detects data races.", false, false) @@ -169,7 +181,8 @@ void ThreadSanitizerLegacyPass::getAnaly } bool ThreadSanitizerLegacyPass::doInitialization(Module &M) { - TSan.emplace(M); + insertModuleCtor(M); + TSan.emplace(); return true; } @@ -183,7 +196,10 @@ FunctionPass *llvm::createThreadSanitize return new ThreadSanitizerLegacyPass(); } -void ThreadSanitizer::initializeCallbacks(Module &M) { +void ThreadSanitizer::initialize(Module &M) { + const DataLayout &DL = M.getDataLayout(); + IntptrTy = DL.getIntPtrType(M.getContext()); + IRBuilder<> IRB(M.getContext()); AttributeList Attr; Attr = Attr.addAttribute(M.getContext(), AttributeList::FunctionIndex, @@ -197,7 +213,7 @@ void ThreadSanitizer::initializeCallback IRB.getVoidTy()); TsanIgnoreEnd = M.getOrInsertFunction("__tsan_ignore_thread_end", Attr, IRB.getVoidTy()); - OrdTy = IRB.getInt32Ty(); + IntegerType *OrdTy = IRB.getInt32Ty(); for (size_t i = 0; i < kNumberOfAccessSizes; ++i) { const unsigned ByteSize = 1U << i; const unsigned BitSize = ByteSize * 8; @@ -280,20 +296,6 @@ void ThreadSanitizer::initializeCallback IRB.getInt8PtrTy(), IRB.getInt32Ty(), IntptrTy); } -ThreadSanitizer::ThreadSanitizer(Module &M) { - const DataLayout &DL = M.getDataLayout(); - IntptrTy = DL.getIntPtrType(M.getContext()); - std::tie(TsanCtorFunction, std::ignore) 
= - getOrCreateSanitizerCtorAndInitFunctions( - M, kTsanModuleCtorName, kTsanInitName, /*InitArgTypes=*/{}, - /*InitArgs=*/{}, - // This callback is invoked when the functions are created the first - // time. Hook them into the global ctors list in that case: - [&](Function *Ctor, FunctionCallee) { - appendToGlobalCtors(M, Ctor, 0); - }); -} - static bool isVtableAccess(Instruction *I) { if (MDNode *Tag = I->getMetadata(LLVMContext::MD_tbaa)) return Tag->isTBAAVtableAccess(); @@ -436,9 +438,9 @@ bool ThreadSanitizer::sanitizeFunction(F const TargetLibraryInfo &TLI) { // This is required to prevent instrumenting call to __tsan_init from within // the module constructor. - if (&F == TsanCtorFunction) + if (F.getName() == kTsanModuleCtorName) return false; - initializeCallbacks(*F.getParent()); + initialize(*F.getParent()); SmallVector AllLoadsAndStores; SmallVector LocalLoadsAndStores; SmallVector AtomicAccesses; Modified: llvm/trunk/test/Instrumentation/MemorySanitizer/msan_basic.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Instrumentation/MemorySanitizer/msan_basic.ll?rev=374481&r1=374480&r2=374481&view=diff ============================================================================== --- llvm/trunk/test/Instrumentation/MemorySanitizer/msan_basic.ll (original) +++ llvm/trunk/test/Instrumentation/MemorySanitizer/msan_basic.ll Thu Oct 10 16:49:10 2019 @@ -1,10 +1,9 @@ -; RUN: opt < %s -msan-check-access-address=0 -S -passes=msan 2>&1 | FileCheck \ -; RUN: -allow-deprecated-dag-overlap %s -; RUN: opt < %s -msan -msan-check-access-address=0 -S | FileCheck -allow-deprecated-dag-overlap %s -; RUN: opt < %s -msan-check-access-address=0 -msan-track-origins=1 -S \ -; RUN: -passes=msan 2>&1 | FileCheck -allow-deprecated-dag-overlap \ -; RUN: -check-prefix=CHECK -check-prefix=CHECK-ORIGINS %s -; RUN: opt < %s -msan -msan-check-access-address=0 -msan-track-origins=1 -S | FileCheck -allow-deprecated-dag-overlap -check-prefix=CHECK -check-prefix=CHECK-ORIGINS %s 
+; RUN: opt < %s -msan-check-access-address=0 -S -passes='module(msan-module),function(msan)' 2>&1 | FileCheck -allow-deprecated-dag-overlap %s +; RUN: opt < %s --passes='module(msan-module),function(msan)' -msan-check-access-address=0 -S | FileCheck -allow-deprecated-dag-overlap %s +; RUN: opt < %s -msan-check-access-address=0 -msan-track-origins=1 -S -passes='module(msan-module),function(msan)' 2>&1 | \ +; RUN: FileCheck -allow-deprecated-dag-overlap -check-prefixes=CHECK,CHECK-ORIGINS %s +; RUN: opt < %s -passes='module(msan-module),function(msan)' -msan-check-access-address=0 -msan-track-origins=1 -S | \ +; RUN: FileCheck -allow-deprecated-dag-overlap -check-prefixes=CHECK,CHECK-ORIGINS %s target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64-S128" target triple = "x86_64-unknown-linux-gnu" Modified: llvm/trunk/test/Instrumentation/ThreadSanitizer/tsan_basic.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Instrumentation/ThreadSanitizer/tsan_basic.ll?rev=374481&r1=374480&r2=374481&view=diff ============================================================================== --- llvm/trunk/test/Instrumentation/ThreadSanitizer/tsan_basic.ll (original) +++ llvm/trunk/test/Instrumentation/ThreadSanitizer/tsan_basic.ll Thu Oct 10 16:49:10 2019 @@ -1,5 +1,5 @@ ; RUN: opt < %s -tsan -S | FileCheck %s -; RUN: opt < %s -passes=tsan -S | FileCheck %s +; RUN: opt < %s -passes='function(tsan),module(tsan-module)' -S | FileCheck %s target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64" target triple = "x86_64-unknown-linux-gnu" From llvm-commits at lists.llvm.org Thu Oct 10 16:49:59 2019 From: llvm-commits at lists.llvm.org (GN Sync Bot via llvm-commits) Date: Thu, 10 Oct 2019 23:49:59 -0000 Subject: [llvm] r374482 - gn build: Merge r374476 Message-ID: 
<20191010234959.C814C92A20@lists.llvm.org> Author: gnsyncbot Date: Thu Oct 10 16:49:59 2019 New Revision: 374482 URL: http://llvm.org/viewvc/llvm-project?rev=374482&view=rev Log: gn build: Merge r374476 Modified: llvm/trunk/utils/gn/secondary/llvm/lib/ExecutionEngine/JITLink/BUILD.gn Modified: llvm/trunk/utils/gn/secondary/llvm/lib/ExecutionEngine/JITLink/BUILD.gn URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/gn/secondary/llvm/lib/ExecutionEngine/JITLink/BUILD.gn?rev=374482&r1=374481&r2=374482&view=diff ============================================================================== --- llvm/trunk/utils/gn/secondary/llvm/lib/ExecutionEngine/JITLink/BUILD.gn (original) +++ llvm/trunk/utils/gn/secondary/llvm/lib/ExecutionEngine/JITLink/BUILD.gn Thu Oct 10 16:49:59 2019 @@ -12,6 +12,7 @@ static_library("JITLink") { "JITLinkMemoryManager.cpp", "MachO.cpp", "MachOLinkGraphBuilder.cpp", + "MachO_arm64.cpp", "MachO_x86_64.cpp", ] } From llvm-commits at lists.llvm.org Thu Oct 10 16:54:43 2019 From: llvm-commits at lists.llvm.org (Vitaly Buka via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 23:54:43 +0000 (UTC) Subject: [PATCH] D68832: [tsan,msan] Insert module constructors in a module pass In-Reply-To: References: Message-ID: <8031e49a3d83a29c64973159ead79651@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rG5c72aa232e74: [tsan,msan] Insert module constructors in a module pass (authored by vitalybuka). 
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68832/new/ https://reviews.llvm.org/D68832 Files: clang/lib/CodeGen/BackendUtil.cpp clang/test/CodeGen/sanitizer-module-constructor.c llvm/include/llvm/Transforms/Instrumentation/MemorySanitizer.h llvm/include/llvm/Transforms/Instrumentation/ThreadSanitizer.h llvm/lib/Passes/PassRegistry.def llvm/lib/Transforms/Instrumentation/MemorySanitizer.cpp llvm/lib/Transforms/Instrumentation/ThreadSanitizer.cpp llvm/test/Instrumentation/MemorySanitizer/msan_basic.ll llvm/test/Instrumentation/ThreadSanitizer/tsan_basic.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D68832.224502.patch Type: text/x-patch Size: 15253 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Thu Oct 10 16:55:12 2019 From: llvm-commits at lists.llvm.org (Lang Hames via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 23:55:12 +0000 (UTC) Subject: [PATCH] D68732: Break out OrcError and RPC In-Reply-To: References: Message-ID: lhames accepted this revision. lhames added a comment. This revision is now accepted and ready to land. Hi Chris, Thanks for working on this — it’ll be great not to drag all those extra dependencies in. I think I would prefer to have an OrcRPC library (rather than OrcError), and split up the error definitions (RPC errors go in OrcRPC, others go in Orc Core). That said I’m happy for this to land as is and we can make that change in-tree. Longer term I’d love to break Orc up even more: Ideally there would be a Core library that depends on nothing but Support and Object, and an OrcIR library for IR layers. I think we can tackle that after ORCv1 is killed off though. — Lang. 
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68732/new/ https://reviews.llvm.org/D68732 From llvm-commits at lists.llvm.org Thu Oct 10 17:04:12 2019 From: llvm-commits at lists.llvm.org (Johannes Doerfert via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 00:04:12 +0000 (UTC) Subject: [PATCH] D68153: Make IR labels more precise In-Reply-To: References: Message-ID: <333066edb832019ec3bf55a373a4c004@localhost.localdomain> jdoerfert added a comment. In D68153#1698833, @greened wrote: > In D68153#1689791, @RKSimon wrote: > > > Wouldn't this mean that every regeneration would see this change? > > > Yes. The label pattern is just wrong as-is because it will match calls in some cases. I experienced this case myself now. I really like this change and I think breaking things now is better than debugging it over and over again. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68153/new/ https://reviews.llvm.org/D68153 From llvm-commits at lists.llvm.org Thu Oct 10 17:04:13 2019 From: llvm-commits at lists.llvm.org (Jian Cai via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 00:04:13 +0000 (UTC) Subject: [PATCH] D68764: [ARM][AsmParser] handles offset expression in parentheses In-Reply-To: References: Message-ID: jcai19 marked 6 inline comments as done. jcai19 added inline comments. ================ Comment at: llvm/test/MC/ARM/gas-compl.s:17 +.syntax unified + ldr r12, [sp, (15+5*5)] + ---------------- nickdesaulniers wrote: > Does gas support multiple parens, ie. `((15+5))`? Do we? Good question! Just verified GAS does, and we do too thanks to getParser().parseExpression(Offset) taking care of parentheses. 
  $ cat sample.s
  .syntax unified
  ldr r12, [sp, $(((15+5)*5))]
  $ armv7a-cros-linux-gnueabihf-as sample.s -o sample.o; armv7a-cros-linux-gnueabihf-objdump -d sample.o

  sample.o:     file format elf32-littlearm

  Disassembly of section .text:

  00000000 <.text>:
     0:   e59dc064    ldr ip, [sp, #100]  ; 0x64

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68764/new/

https://reviews.llvm.org/D68764

From llvm-commits at lists.llvm.org Thu Oct 10 17:04:15 2019
From: llvm-commits at lists.llvm.org (Jian Cai via Phabricator via llvm-commits)
Date: Fri, 11 Oct 2019 00:04:15 +0000 (UTC)
Subject: [PATCH] D68764: [ARM][AsmParser] handles offset expression in parentheses
In-Reply-To:
References:
Message-ID: <4071648e1582409f8112d823831a6e8d@localhost.localdomain>

jcai19 updated this revision to Diff 224503.
jcai19 added a comment.

Expand test cases and rename test file.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68764/new/

https://reviews.llvm.org/D68764

Files:
  llvm/lib/Target/ARM/AsmParser/ARMAsmParser.cpp
  llvm/test/MC/ARM/gas-compl-mem-offset-paren.s

Index: llvm/test/MC/ARM/gas-compl-mem-offset-paren.s
===================================================================
--- /dev/null
+++ llvm/test/MC/ARM/gas-compl-mem-offset-paren.s
@@ -0,0 +1,26 @@
+@ RUN: llvm-mc -triple=arm < %s | FileCheck %s
+
+.syntax unified
+
+@ CHECK: ldr r12, [sp, #15]
+ldr r12, [sp, (15)]
+
+@ CHECK: ldr r12, [sp, #15]
+ldr r12, [sp, #(15)]
+
+@ CHECK: ldr r12, [sp, #15]
+.syntax unified
+ldr r12, [sp, $(15)]
+
+@ CHECK: ldr r12, [sp, #100]
+.syntax unified
+ldr r12, [sp, (((15+5)*5))]
+
+@ CHECK: ldr r12, [sp, #100]
+.syntax unified
+ldr r12, [sp, #(((15+5)*5))]
+
+
+@ CHECK: ldr r12, [sp, #100]
+.syntax unified
+ldr r12, [sp, $(((15+5)*5))]
Index: llvm/lib/Target/ARM/AsmParser/ARMAsmParser.cpp
===================================================================
--- llvm/lib/Target/ARM/AsmParser/ARMAsmParser.cpp
+++ llvm/lib/Target/ARM/AsmParser/ARMAsmParser.cpp
@@ -5734,13 +5734,14 @@
   }
 
   // If we have a '#', it's an immediate offset, else assume it's a register
-  // offset. Be friendly and also accept a plain integer (without a leading
-  // hash) for gas compatibility.
+  // offset. Be friendly and also accept a plain integer or expression (without
+  // a leading hash) for gas compatibility.
   if (Parser.getTok().is(AsmToken::Hash) ||
       Parser.getTok().is(AsmToken::Dollar) ||
+      Parser.getTok().is(AsmToken::LParen) ||
       Parser.getTok().is(AsmToken::Integer)) {
-    if (Parser.getTok().isNot(AsmToken::Integer))
-      Parser.Lex(); // Eat '#' or '$'.
+    if (Parser.getTok().isNot(AsmToken::Integer) && Parser.getTok().isNot(AsmToken::LParen))
+      Parser.Lex(); // Eat '#' or '$'
     E = Parser.getTok().getLoc();
 
     bool isNegative = getParser().getTok().is(AsmToken::Minus);
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68764.224503.patch
Type: text/x-patch
Size: 1768 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Thu Oct 10 17:13:17 2019
From: llvm-commits at lists.llvm.org (Joel E. Denny via Phabricator via llvm-commits)
Date: Fri, 11 Oct 2019 00:13:17 +0000 (UTC)
Subject: [PATCH] D64135: [lit] Parse command-line options from LIT_OPTS
In-Reply-To:
References:
Message-ID: <202b9e407a4ee15593eec063a276020e@localhost.localdomain>

jdenny added a comment.

In D64135#1704685, @yln wrote:

> Hi Joel @jdenny,

Hi Julian.

> I would like to ask you to reconsider whether or not it is a good idea that the env var overrides the command line options.

I guess it depends on the intended use case: which is the default, and which is the override?

1. I added `LIT_OPTS` specifically in order to override the default options supplied by the build config. That implies `LIT_OPTS` should be parsed last (assuming later options override earlier options when order matters).
2. I think the convention you mention is for the (probably much more common) use case where an environment variable is intended to hold a default that can be overridden on the command line. Do we anticipate that people will actually use `LIT_OPTS` for that purpose?

To fully support the use case 1 without risking surprising behavior if someone attempts use case 2, we could try to move the `LIT_OPTS` implementation out of lit and into the build system. That is, use case 2 would become impossible.

Then again, order doesn't seem to matter anyway for options that I normally care about: `-a` and `-vv` seem to take effect no matter where they appear in relation to the usual default of `-sv`. Order does matter for `--filter`, and I assume it matters for `-D`. I don't know if anyone would ever build with those as default options and then want to override them. It might be surprising if you couldn't, given that 1 is the intended use case.

Without knowing which use cases beyond my own are real, I'm inclined to wait before selecting a path forward. @probinson reviewed, so perhaps he has some opinion.

> One more ask independent of what we decide: the `lit-opts.py` test does not go red when we change the override order. Can you adapt the test to give a signal in this case?

Good point.

> Also: thank you for adding and documenting this feature and even providing a test. I am already using it and find it useful!

Good to hear! Thanks.

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D64135/new/

https://reviews.llvm.org/D64135

From llvm-commits at lists.llvm.org Thu Oct 10 17:31:26 2019
From: llvm-commits at lists.llvm.org (Julian Lettner via Phabricator via llvm-commits)
Date: Fri, 11 Oct 2019 00:31:26 +0000 (UTC)
Subject: [PATCH] D68843: [lit] Create Run object later and only when it is needed
Message-ID:

yln created this revision.
yln added reviewers: rnk, ddunbar, serge-sans-paille, probinson, jdenny, cishida, nate_chandler, jordan_rose.
Herald added subscribers: llvm-commits, delcypher.
Herald added a project: LLVM.

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D68843

Files:
  llvm/utils/lit/lit/main.py

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68843.224509.patch
Type: text/x-patch
Size: 6665 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Thu Oct 10 17:38:42 2019
From: llvm-commits at lists.llvm.org (Craig Topper via llvm-commits)
Date: Fri, 11 Oct 2019 00:38:42 -0000
Subject: [llvm] r374486 - [X86] Update trunc_packus_v32i32_v32i8 test in min-legal-vector-width.ll to use a load for the large type and add the min-legal-vector-width attribute.
Message-ID: <20191011003842.29D8192A51@lists.llvm.org>

Author: ctopper
Date: Thu Oct 10 17:38:41 2019
New Revision: 374486

URL: http://llvm.org/viewvc/llvm-project?rev=374486&view=rev
Log:
[X86] Update trunc_packus_v32i32_v32i8 test in min-legal-vector-width.ll to use a load for the large type and add the min-legal-vector-width attribute.

The attribute is needed to avoid zmm registers. Using memory avoids argument splitting for large vectors.
Modified: llvm/trunk/test/CodeGen/X86/min-legal-vector-width.ll Modified: llvm/trunk/test/CodeGen/X86/min-legal-vector-width.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/min-legal-vector-width.ll?rev=374486&r1=374485&r2=374486&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/min-legal-vector-width.ll (original) +++ llvm/trunk/test/CodeGen/X86/min-legal-vector-width.ll Thu Oct 10 17:38:41 2019 @@ -1098,21 +1098,28 @@ define <16 x i8> @trunc_packus_v16i32_v1 ret <16 x i8> %f } -define <32 x i8> @trunc_packus_v32i32_v32i8(<32 x i32> %a0) { +define <32 x i8> @trunc_packus_v32i32_v32i8(<32 x i32>* %p) "min-legal-vector-width"="256" { ; CHECK-LABEL: trunc_packus_v32i32_v32i8: ; CHECK: # %bb.0: -; CHECK-NEXT: vpxor %xmm2, %xmm2, %xmm2 -; CHECK-NEXT: vpmaxsd %zmm2, %zmm0, %zmm0 -; CHECK-NEXT: vpmovusdb %zmm0, %xmm0 -; CHECK-NEXT: vpmaxsd %zmm2, %zmm1, %zmm1 -; CHECK-NEXT: vpmovusdb %zmm1, %xmm1 +; CHECK-NEXT: vpxor %xmm0, %xmm0, %xmm0 +; CHECK-NEXT: vpmaxsd 96(%rdi), %ymm0, %ymm1 +; CHECK-NEXT: vpmovusdb %ymm1, %xmm1 +; CHECK-NEXT: vpmaxsd 64(%rdi), %ymm0, %ymm2 +; CHECK-NEXT: vpmovusdb %ymm2, %xmm2 +; CHECK-NEXT: vpunpcklqdq {{.*#+}} xmm1 = xmm2[0],xmm1[0] +; CHECK-NEXT: vpmaxsd 32(%rdi), %ymm0, %ymm2 +; CHECK-NEXT: vpmovusdb %ymm2, %xmm2 +; CHECK-NEXT: vpmaxsd (%rdi), %ymm0, %ymm0 +; CHECK-NEXT: vpmovusdb %ymm0, %xmm0 +; CHECK-NEXT: vpunpcklqdq {{.*#+}} xmm0 = xmm0[0],xmm2[0] ; CHECK-NEXT: vinserti128 $1, %xmm1, %ymm0, %ymm0 ; CHECK-NEXT: retq - %1 = icmp slt <32 x i32> %a0, - %2 = select <32 x i1> %1, <32 x i32> %a0, <32 x i32> - %3 = icmp sgt <32 x i32> %2, zeroinitializer - %4 = select <32 x i1> %3, <32 x i32> %2, <32 x i32> zeroinitializer - %5 = trunc <32 x i32> %4 to <32 x i8> - ret <32 x i8> %5 + %a = load <32 x i32>, <32 x i32>* %p + %b = icmp slt <32 x i32> %a, + %c = select <32 x i1> %b, <32 x i32> %a, <32 x i32> + %d = icmp sgt <32 x i32> %c, zeroinitializer + %e 
= select <32 x i1> %d, <32 x i32> %c, <32 x i32> zeroinitializer + %f = trunc <32 x i32> %e to <32 x i8> + ret <32 x i8> %f } From llvm-commits at lists.llvm.org Thu Oct 10 17:38:51 2019 From: llvm-commits at lists.llvm.org (Craig Topper via llvm-commits) Date: Fri, 11 Oct 2019 00:38:51 -0000 Subject: [llvm] r374487 - [X86] Improve the AVX512 bailout in combineTruncateWithSat to allow pack instructions in more situations. Message-ID: <20191011003852.0C1B292CE5@lists.llvm.org> Author: ctopper Date: Thu Oct 10 17:38:51 2019 New Revision: 374487 URL: http://llvm.org/viewvc/llvm-project?rev=374487&view=rev Log: [X86] Improve the AVX512 bailout in combineTruncateWithSat to allow pack instructions in more situations. If we don't have VLX we won't end up selecting a saturating truncate for 256-bit or smaller vectors so we should just use the pack lowering. Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp llvm/trunk/test/CodeGen/X86/masked_store_trunc_ssat.ll llvm/trunk/test/CodeGen/X86/pmaddubsw.ll llvm/trunk/test/CodeGen/X86/vector-trunc-packus.ll llvm/trunk/test/CodeGen/X86/vector-trunc-ssat.ll Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=374487&r1=374486&r2=374487&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp (original) +++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Thu Oct 10 17:38:51 2019 @@ -39860,9 +39860,16 @@ static SDValue combineTruncateWithSat(SD } } + // vXi32 truncate instructions are available with AVX512F. + // vXi16 truncate instructions are only available with AVX512BW. + // For 256-bit or smaller vectors, we require VLX. + // FIXME: We could widen truncates to 512 to remove the VLX restriction. 
+ bool PreferAVX512 = ((Subtarget.hasAVX512() && InSVT == MVT::i32) || + (Subtarget.hasBWI() && InSVT == MVT::i16)) && + (Subtarget.hasVLX() || InVT.getSizeInBits() > 256); + if (VT.isVector() && isPowerOf2_32(VT.getVectorNumElements()) && - !(Subtarget.hasAVX512() && InSVT == MVT::i32) && - !(Subtarget.hasBWI() && InSVT == MVT::i16) && + !PreferAVX512 && (SVT == MVT::i8 || SVT == MVT::i16) && (InSVT == MVT::i16 || InSVT == MVT::i32)) { if (auto USatVal = detectSSatPattern(In, VT, true)) { Modified: llvm/trunk/test/CodeGen/X86/masked_store_trunc_ssat.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/masked_store_trunc_ssat.ll?rev=374487&r1=374486&r2=374487&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/masked_store_trunc_ssat.ll (original) +++ llvm/trunk/test/CodeGen/X86/masked_store_trunc_ssat.ll Thu Oct 10 17:38:51 2019 @@ -4594,11 +4594,8 @@ define void @truncstore_v8i32_v8i16(<8 x ; AVX512F: # %bb.0: ; AVX512F-NEXT: # kill: def $ymm1 killed $ymm1 def $zmm1 ; AVX512F-NEXT: vptestmd %zmm1, %zmm1, %k0 -; AVX512F-NEXT: vpbroadcastd {{.*#+}} ymm1 = [32767,32767,32767,32767,32767,32767,32767,32767] -; AVX512F-NEXT: vpminsd %ymm1, %ymm0, %ymm0 -; AVX512F-NEXT: vpbroadcastd {{.*#+}} ymm1 = [4294934528,4294934528,4294934528,4294934528,4294934528,4294934528,4294934528,4294934528] -; AVX512F-NEXT: vpmaxsd %ymm1, %ymm0, %ymm0 -; AVX512F-NEXT: vpmovdw %zmm0, %ymm0 +; AVX512F-NEXT: vextracti128 $1, %ymm0, %xmm1 +; AVX512F-NEXT: vpackssdw %xmm1, %xmm0, %xmm0 ; AVX512F-NEXT: kmovw %k0, %eax ; AVX512F-NEXT: testb $1, %al ; AVX512F-NEXT: jne .LBB11_1 @@ -4665,11 +4662,8 @@ define void @truncstore_v8i32_v8i16(<8 x ; AVX512BW-NEXT: vptestmd %zmm1, %zmm1, %k0 ; AVX512BW-NEXT: kshiftld $24, %k0, %k0 ; AVX512BW-NEXT: kshiftrd $24, %k0, %k1 -; AVX512BW-NEXT: vpbroadcastd {{.*#+}} ymm1 = [32767,32767,32767,32767,32767,32767,32767,32767] -; AVX512BW-NEXT: vpminsd %ymm1, %ymm0, %ymm0 -; 
AVX512BW-NEXT: vpbroadcastd {{.*#+}} ymm1 = [4294934528,4294934528,4294934528,4294934528,4294934528,4294934528,4294934528,4294934528] -; AVX512BW-NEXT: vpmaxsd %ymm1, %ymm0, %ymm0 -; AVX512BW-NEXT: vpmovdw %zmm0, %ymm0 +; AVX512BW-NEXT: vextracti128 $1, %ymm0, %xmm1 +; AVX512BW-NEXT: vpackssdw %xmm1, %xmm0, %xmm0 ; AVX512BW-NEXT: vmovdqu16 %zmm0, (%rdi) {%k1} ; AVX512BW-NEXT: vzeroupper ; AVX512BW-NEXT: retq @@ -4977,11 +4971,9 @@ define void @truncstore_v8i32_v8i8(<8 x ; AVX512F: # %bb.0: ; AVX512F-NEXT: # kill: def $ymm1 killed $ymm1 def $zmm1 ; AVX512F-NEXT: vptestmd %zmm1, %zmm1, %k0 -; AVX512F-NEXT: vpbroadcastd {{.*#+}} ymm1 = [127,127,127,127,127,127,127,127] -; AVX512F-NEXT: vpminsd %ymm1, %ymm0, %ymm0 -; AVX512F-NEXT: vpbroadcastd {{.*#+}} ymm1 = [4294967168,4294967168,4294967168,4294967168,4294967168,4294967168,4294967168,4294967168] -; AVX512F-NEXT: vpmaxsd %ymm1, %ymm0, %ymm0 -; AVX512F-NEXT: vpmovdb %zmm0, %xmm0 +; AVX512F-NEXT: vextracti128 $1, %ymm0, %xmm1 +; AVX512F-NEXT: vpackssdw %xmm1, %xmm0, %xmm0 +; AVX512F-NEXT: vpacksswb %xmm0, %xmm0, %xmm0 ; AVX512F-NEXT: kmovw %k0, %eax ; AVX512F-NEXT: testb $1, %al ; AVX512F-NEXT: jne .LBB12_1 @@ -5048,11 +5040,9 @@ define void @truncstore_v8i32_v8i8(<8 x ; AVX512BW-NEXT: vptestmd %zmm1, %zmm1, %k0 ; AVX512BW-NEXT: kshiftlq $56, %k0, %k0 ; AVX512BW-NEXT: kshiftrq $56, %k0, %k1 -; AVX512BW-NEXT: vpbroadcastd {{.*#+}} ymm1 = [127,127,127,127,127,127,127,127] -; AVX512BW-NEXT: vpminsd %ymm1, %ymm0, %ymm0 -; AVX512BW-NEXT: vpbroadcastd {{.*#+}} ymm1 = [4294967168,4294967168,4294967168,4294967168,4294967168,4294967168,4294967168,4294967168] -; AVX512BW-NEXT: vpmaxsd %ymm1, %ymm0, %ymm0 -; AVX512BW-NEXT: vpmovdb %zmm0, %xmm0 +; AVX512BW-NEXT: vextracti128 $1, %ymm0, %xmm1 +; AVX512BW-NEXT: vpackssdw %xmm1, %xmm0, %xmm0 +; AVX512BW-NEXT: vpacksswb %xmm0, %xmm0, %xmm0 ; AVX512BW-NEXT: vmovdqu8 %zmm0, (%rdi) {%k1} ; AVX512BW-NEXT: vzeroupper ; AVX512BW-NEXT: retq @@ -5192,10 +5182,6 @@ define void 
@truncstore_v4i32_v4i16(<4 x ; AVX512F: # %bb.0: ; AVX512F-NEXT: # kill: def $xmm1 killed $xmm1 def $zmm1 ; AVX512F-NEXT: vptestmd %zmm1, %zmm1, %k0 -; AVX512F-NEXT: vpbroadcastd {{.*#+}} xmm1 = [32767,32767,32767,32767] -; AVX512F-NEXT: vpminsd %xmm1, %xmm0, %xmm0 -; AVX512F-NEXT: vpbroadcastd {{.*#+}} xmm1 = [4294934528,4294934528,4294934528,4294934528] -; AVX512F-NEXT: vpmaxsd %xmm1, %xmm0, %xmm0 ; AVX512F-NEXT: vpackssdw %xmm0, %xmm0, %xmm0 ; AVX512F-NEXT: kmovw %k0, %eax ; AVX512F-NEXT: testb $1, %al @@ -5235,10 +5221,6 @@ define void @truncstore_v4i32_v4i16(<4 x ; AVX512BW-NEXT: vptestmd %zmm1, %zmm1, %k0 ; AVX512BW-NEXT: kshiftld $28, %k0, %k0 ; AVX512BW-NEXT: kshiftrd $28, %k0, %k1 -; AVX512BW-NEXT: vpbroadcastd {{.*#+}} xmm1 = [32767,32767,32767,32767] -; AVX512BW-NEXT: vpminsd %xmm1, %xmm0, %xmm0 -; AVX512BW-NEXT: vpbroadcastd {{.*#+}} xmm1 = [4294934528,4294934528,4294934528,4294934528] -; AVX512BW-NEXT: vpmaxsd %xmm1, %xmm0, %xmm0 ; AVX512BW-NEXT: vpackssdw %xmm0, %xmm0, %xmm0 ; AVX512BW-NEXT: vmovdqu16 %zmm0, (%rdi) {%k1} ; AVX512BW-NEXT: vzeroupper @@ -7302,9 +7284,8 @@ define void @truncstore_v16i16_v16i8(<16 ; AVX512BW-NEXT: # kill: def $xmm1 killed $xmm1 def $zmm1 ; AVX512BW-NEXT: vptestmb %zmm1, %zmm1, %k0 ; AVX512BW-NEXT: kmovw %k0, %k1 -; AVX512BW-NEXT: vpminsw {{.*}}(%rip), %ymm0, %ymm0 -; AVX512BW-NEXT: vpmaxsw {{.*}}(%rip), %ymm0, %ymm0 -; AVX512BW-NEXT: vpmovwb %zmm0, %ymm0 +; AVX512BW-NEXT: vextracti128 $1, %ymm0, %xmm1 +; AVX512BW-NEXT: vpacksswb %xmm1, %xmm0, %xmm0 ; AVX512BW-NEXT: vmovdqu8 %zmm0, (%rdi) {%k1} ; AVX512BW-NEXT: vzeroupper ; AVX512BW-NEXT: retq @@ -7601,8 +7582,6 @@ define void @truncstore_v8i16_v8i8(<8 x ; AVX512BW-NEXT: vptestmw %zmm1, %zmm1, %k0 ; AVX512BW-NEXT: kshiftlq $56, %k0, %k0 ; AVX512BW-NEXT: kshiftrq $56, %k0, %k1 -; AVX512BW-NEXT: vpminsw {{.*}}(%rip), %xmm0, %xmm0 -; AVX512BW-NEXT: vpmaxsw {{.*}}(%rip), %xmm0, %xmm0 ; AVX512BW-NEXT: vpacksswb %xmm0, %xmm0, %xmm0 ; AVX512BW-NEXT: vmovdqu8 %zmm0, (%rdi) {%k1} ; 
AVX512BW-NEXT: vzeroupper Modified: llvm/trunk/test/CodeGen/X86/pmaddubsw.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/pmaddubsw.ll?rev=374487&r1=374486&r2=374487&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/pmaddubsw.ll (original) +++ llvm/trunk/test/CodeGen/X86/pmaddubsw.ll Thu Oct 10 17:38:51 2019 @@ -349,53 +349,27 @@ define <8 x i16> @pmaddubsw_bad_extend(< ; AVX1-NEXT: vpackssdw %xmm0, %xmm3, %xmm0 ; AVX1-NEXT: retq ; -; AVX2-LABEL: pmaddubsw_bad_extend: -; AVX2: # %bb.0: -; AVX2-NEXT: vmovdqa (%rdi), %xmm0 -; AVX2-NEXT: vmovdqa (%rsi), %xmm1 -; AVX2-NEXT: vmovdqa {{.*#+}} xmm2 = <0,2,4,6,8,10,12,14,u,u,u,u,u,u,u,u> -; AVX2-NEXT: vpshufb %xmm2, %xmm0, %xmm3 -; AVX2-NEXT: vmovdqa {{.*#+}} xmm4 = <1,3,5,7,9,11,13,15,u,u,u,u,u,u,u,u> -; AVX2-NEXT: vpshufb %xmm4, %xmm0, %xmm0 -; AVX2-NEXT: vpshufb %xmm2, %xmm1, %xmm2 -; AVX2-NEXT: vpshufb %xmm4, %xmm1, %xmm1 -; AVX2-NEXT: vpmovsxbd %xmm3, %ymm3 -; AVX2-NEXT: vpmovzxbd {{.*#+}} ymm2 = xmm2[0],zero,zero,zero,xmm2[1],zero,zero,zero,xmm2[2],zero,zero,zero,xmm2[3],zero,zero,zero,xmm2[4],zero,zero,zero,xmm2[5],zero,zero,zero,xmm2[6],zero,zero,zero,xmm2[7],zero,zero,zero -; AVX2-NEXT: vpmulld %ymm2, %ymm3, %ymm2 -; AVX2-NEXT: vpmovzxbd {{.*#+}} ymm0 = xmm0[0],zero,zero,zero,xmm0[1],zero,zero,zero,xmm0[2],zero,zero,zero,xmm0[3],zero,zero,zero,xmm0[4],zero,zero,zero,xmm0[5],zero,zero,zero,xmm0[6],zero,zero,zero,xmm0[7],zero,zero,zero -; AVX2-NEXT: vpmovsxbd %xmm1, %ymm1 -; AVX2-NEXT: vpmulld %ymm1, %ymm0, %ymm0 -; AVX2-NEXT: vpaddd %ymm0, %ymm2, %ymm0 -; AVX2-NEXT: vextracti128 $1, %ymm0, %xmm1 -; AVX2-NEXT: vpackssdw %xmm1, %xmm0, %xmm0 -; AVX2-NEXT: vzeroupper -; AVX2-NEXT: retq -; -; AVX512-LABEL: pmaddubsw_bad_extend: -; AVX512: # %bb.0: -; AVX512-NEXT: vmovdqa (%rdi), %xmm0 -; AVX512-NEXT: vmovdqa (%rsi), %xmm1 -; AVX512-NEXT: vmovdqa {{.*#+}} xmm2 = <0,2,4,6,8,10,12,14,u,u,u,u,u,u,u,u> -; AVX512-NEXT: vpshufb 
%xmm2, %xmm0, %xmm3 -; AVX512-NEXT: vmovdqa {{.*#+}} xmm4 = <1,3,5,7,9,11,13,15,u,u,u,u,u,u,u,u> -; AVX512-NEXT: vpshufb %xmm4, %xmm0, %xmm0 -; AVX512-NEXT: vpshufb %xmm2, %xmm1, %xmm2 -; AVX512-NEXT: vpshufb %xmm4, %xmm1, %xmm1 -; AVX512-NEXT: vpmovsxbd %xmm3, %ymm3 -; AVX512-NEXT: vpmovzxbd {{.*#+}} ymm2 = xmm2[0],zero,zero,zero,xmm2[1],zero,zero,zero,xmm2[2],zero,zero,zero,xmm2[3],zero,zero,zero,xmm2[4],zero,zero,zero,xmm2[5],zero,zero,zero,xmm2[6],zero,zero,zero,xmm2[7],zero,zero,zero -; AVX512-NEXT: vpmulld %ymm2, %ymm3, %ymm2 -; AVX512-NEXT: vpmovzxbd {{.*#+}} ymm0 = xmm0[0],zero,zero,zero,xmm0[1],zero,zero,zero,xmm0[2],zero,zero,zero,xmm0[3],zero,zero,zero,xmm0[4],zero,zero,zero,xmm0[5],zero,zero,zero,xmm0[6],zero,zero,zero,xmm0[7],zero,zero,zero -; AVX512-NEXT: vpmovsxbd %xmm1, %ymm1 -; AVX512-NEXT: vpmulld %ymm1, %ymm0, %ymm0 -; AVX512-NEXT: vpaddd %ymm0, %ymm2, %ymm0 -; AVX512-NEXT: vpbroadcastd {{.*#+}} ymm1 = [4294934528,4294934528,4294934528,4294934528,4294934528,4294934528,4294934528,4294934528] -; AVX512-NEXT: vpmaxsd %ymm1, %ymm0, %ymm0 -; AVX512-NEXT: vpbroadcastd {{.*#+}} ymm1 = [32767,32767,32767,32767,32767,32767,32767,32767] -; AVX512-NEXT: vpminsd %ymm1, %ymm0, %ymm0 -; AVX512-NEXT: vpmovdw %zmm0, %ymm0 -; AVX512-NEXT: # kill: def $xmm0 killed $xmm0 killed $ymm0 -; AVX512-NEXT: vzeroupper -; AVX512-NEXT: retq +; AVX256-LABEL: pmaddubsw_bad_extend: +; AVX256: # %bb.0: +; AVX256-NEXT: vmovdqa (%rdi), %xmm0 +; AVX256-NEXT: vmovdqa (%rsi), %xmm1 +; AVX256-NEXT: vmovdqa {{.*#+}} xmm2 = <0,2,4,6,8,10,12,14,u,u,u,u,u,u,u,u> +; AVX256-NEXT: vpshufb %xmm2, %xmm0, %xmm3 +; AVX256-NEXT: vmovdqa {{.*#+}} xmm4 = <1,3,5,7,9,11,13,15,u,u,u,u,u,u,u,u> +; AVX256-NEXT: vpshufb %xmm4, %xmm0, %xmm0 +; AVX256-NEXT: vpshufb %xmm2, %xmm1, %xmm2 +; AVX256-NEXT: vpshufb %xmm4, %xmm1, %xmm1 +; AVX256-NEXT: vpmovsxbd %xmm3, %ymm3 +; AVX256-NEXT: vpmovzxbd {{.*#+}} ymm2 = 
xmm2[0],zero,zero,zero,xmm2[1],zero,zero,zero,xmm2[2],zero,zero,zero,xmm2[3],zero,zero,zero,xmm2[4],zero,zero,zero,xmm2[5],zero,zero,zero,xmm2[6],zero,zero,zero,xmm2[7],zero,zero,zero +; AVX256-NEXT: vpmulld %ymm2, %ymm3, %ymm2 +; AVX256-NEXT: vpmovzxbd {{.*#+}} ymm0 = xmm0[0],zero,zero,zero,xmm0[1],zero,zero,zero,xmm0[2],zero,zero,zero,xmm0[3],zero,zero,zero,xmm0[4],zero,zero,zero,xmm0[5],zero,zero,zero,xmm0[6],zero,zero,zero,xmm0[7],zero,zero,zero +; AVX256-NEXT: vpmovsxbd %xmm1, %ymm1 +; AVX256-NEXT: vpmulld %ymm1, %ymm0, %ymm0 +; AVX256-NEXT: vpaddd %ymm0, %ymm2, %ymm0 +; AVX256-NEXT: vextracti128 $1, %ymm0, %xmm1 +; AVX256-NEXT: vpackssdw %xmm1, %xmm0, %xmm0 +; AVX256-NEXT: vzeroupper +; AVX256-NEXT: retq %A = load <16 x i8>, <16 x i8>* %Aptr %B = load <16 x i8>, <16 x i8>* %Bptr %A_even = shufflevector <16 x i8> %A, <16 x i8> undef, <8 x i32> @@ -476,49 +450,25 @@ define <8 x i16> @pmaddubsw_bad_indices( ; AVX1-NEXT: vpackssdw %xmm0, %xmm3, %xmm0 ; AVX1-NEXT: retq ; -; AVX2-LABEL: pmaddubsw_bad_indices: -; AVX2: # %bb.0: -; AVX2-NEXT: vmovdqa (%rdi), %xmm0 -; AVX2-NEXT: vmovdqa (%rsi), %xmm1 -; AVX2-NEXT: vpshufb {{.*#+}} xmm2 = xmm0[1,2,5,6,9,10,13,14,u,u,u,u,u,u,u,u] -; AVX2-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,3,4,7,8,11,12,15,u,u,u,u,u,u,u,u] -; AVX2-NEXT: vpshufb {{.*#+}} xmm3 = xmm1[0,2,4,6,8,10,12,14,u,u,u,u,u,u,u,u] -; AVX2-NEXT: vpshufb {{.*#+}} xmm1 = xmm1[1,3,5,7,9,11,13,15,u,u,u,u,u,u,u,u] -; AVX2-NEXT: vpmovsxbd %xmm2, %ymm2 -; AVX2-NEXT: vpmovzxbd {{.*#+}} ymm3 = xmm3[0],zero,zero,zero,xmm3[1],zero,zero,zero,xmm3[2],zero,zero,zero,xmm3[3],zero,zero,zero,xmm3[4],zero,zero,zero,xmm3[5],zero,zero,zero,xmm3[6],zero,zero,zero,xmm3[7],zero,zero,zero -; AVX2-NEXT: vpmulld %ymm3, %ymm2, %ymm2 -; AVX2-NEXT: vpmovsxbd %xmm0, %ymm0 -; AVX2-NEXT: vpmovzxbd {{.*#+}} ymm1 = xmm1[0],zero,zero,zero,xmm1[1],zero,zero,zero,xmm1[2],zero,zero,zero,xmm1[3],zero,zero,zero,xmm1[4],zero,zero,zero,xmm1[5],zero,zero,zero,xmm1[6],zero,zero,zero,xmm1[7],zero,zero,zero -; 
AVX2-NEXT: vpmulld %ymm1, %ymm0, %ymm0 -; AVX2-NEXT: vpaddd %ymm0, %ymm2, %ymm0 -; AVX2-NEXT: vextracti128 $1, %ymm0, %xmm1 -; AVX2-NEXT: vpackssdw %xmm1, %xmm0, %xmm0 -; AVX2-NEXT: vzeroupper -; AVX2-NEXT: retq -; -; AVX512-LABEL: pmaddubsw_bad_indices: -; AVX512: # %bb.0: -; AVX512-NEXT: vmovdqa (%rdi), %xmm0 -; AVX512-NEXT: vmovdqa (%rsi), %xmm1 -; AVX512-NEXT: vpshufb {{.*#+}} xmm2 = xmm0[1,2,5,6,9,10,13,14,u,u,u,u,u,u,u,u] -; AVX512-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,3,4,7,8,11,12,15,u,u,u,u,u,u,u,u] -; AVX512-NEXT: vpshufb {{.*#+}} xmm3 = xmm1[0,2,4,6,8,10,12,14,u,u,u,u,u,u,u,u] -; AVX512-NEXT: vpshufb {{.*#+}} xmm1 = xmm1[1,3,5,7,9,11,13,15,u,u,u,u,u,u,u,u] -; AVX512-NEXT: vpmovsxbd %xmm2, %ymm2 -; AVX512-NEXT: vpmovzxbd {{.*#+}} ymm3 = xmm3[0],zero,zero,zero,xmm3[1],zero,zero,zero,xmm3[2],zero,zero,zero,xmm3[3],zero,zero,zero,xmm3[4],zero,zero,zero,xmm3[5],zero,zero,zero,xmm3[6],zero,zero,zero,xmm3[7],zero,zero,zero -; AVX512-NEXT: vpmulld %ymm3, %ymm2, %ymm2 -; AVX512-NEXT: vpmovsxbd %xmm0, %ymm0 -; AVX512-NEXT: vpmovzxbd {{.*#+}} ymm1 = xmm1[0],zero,zero,zero,xmm1[1],zero,zero,zero,xmm1[2],zero,zero,zero,xmm1[3],zero,zero,zero,xmm1[4],zero,zero,zero,xmm1[5],zero,zero,zero,xmm1[6],zero,zero,zero,xmm1[7],zero,zero,zero -; AVX512-NEXT: vpmulld %ymm1, %ymm0, %ymm0 -; AVX512-NEXT: vpaddd %ymm0, %ymm2, %ymm0 -; AVX512-NEXT: vpbroadcastd {{.*#+}} ymm1 = [4294934528,4294934528,4294934528,4294934528,4294934528,4294934528,4294934528,4294934528] -; AVX512-NEXT: vpmaxsd %ymm1, %ymm0, %ymm0 -; AVX512-NEXT: vpbroadcastd {{.*#+}} ymm1 = [32767,32767,32767,32767,32767,32767,32767,32767] -; AVX512-NEXT: vpminsd %ymm1, %ymm0, %ymm0 -; AVX512-NEXT: vpmovdw %zmm0, %ymm0 -; AVX512-NEXT: # kill: def $xmm0 killed $xmm0 killed $ymm0 -; AVX512-NEXT: vzeroupper -; AVX512-NEXT: retq +; AVX256-LABEL: pmaddubsw_bad_indices: +; AVX256: # %bb.0: +; AVX256-NEXT: vmovdqa (%rdi), %xmm0 +; AVX256-NEXT: vmovdqa (%rsi), %xmm1 +; AVX256-NEXT: vpshufb {{.*#+}} xmm2 = 
xmm0[1,2,5,6,9,10,13,14,u,u,u,u,u,u,u,u] +; AVX256-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,3,4,7,8,11,12,15,u,u,u,u,u,u,u,u] +; AVX256-NEXT: vpshufb {{.*#+}} xmm3 = xmm1[0,2,4,6,8,10,12,14,u,u,u,u,u,u,u,u] +; AVX256-NEXT: vpshufb {{.*#+}} xmm1 = xmm1[1,3,5,7,9,11,13,15,u,u,u,u,u,u,u,u] +; AVX256-NEXT: vpmovsxbd %xmm2, %ymm2 +; AVX256-NEXT: vpmovzxbd {{.*#+}} ymm3 = xmm3[0],zero,zero,zero,xmm3[1],zero,zero,zero,xmm3[2],zero,zero,zero,xmm3[3],zero,zero,zero,xmm3[4],zero,zero,zero,xmm3[5],zero,zero,zero,xmm3[6],zero,zero,zero,xmm3[7],zero,zero,zero +; AVX256-NEXT: vpmulld %ymm3, %ymm2, %ymm2 +; AVX256-NEXT: vpmovsxbd %xmm0, %ymm0 +; AVX256-NEXT: vpmovzxbd {{.*#+}} ymm1 = xmm1[0],zero,zero,zero,xmm1[1],zero,zero,zero,xmm1[2],zero,zero,zero,xmm1[3],zero,zero,zero,xmm1[4],zero,zero,zero,xmm1[5],zero,zero,zero,xmm1[6],zero,zero,zero,xmm1[7],zero,zero,zero +; AVX256-NEXT: vpmulld %ymm1, %ymm0, %ymm0 +; AVX256-NEXT: vpaddd %ymm0, %ymm2, %ymm0 +; AVX256-NEXT: vextracti128 $1, %ymm0, %xmm1 +; AVX256-NEXT: vpackssdw %xmm1, %xmm0, %xmm0 +; AVX256-NEXT: vzeroupper +; AVX256-NEXT: retq %A = load <16 x i8>, <16 x i8>* %Aptr %B = load <16 x i8>, <16 x i8>* %Bptr %A_even = shufflevector <16 x i8> %A, <16 x i8> undef, <8 x i32> ;indices aren't all even Modified: llvm/trunk/test/CodeGen/X86/vector-trunc-packus.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/vector-trunc-packus.ll?rev=374487&r1=374486&r2=374487&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/vector-trunc-packus.ll (original) +++ llvm/trunk/test/CodeGen/X86/vector-trunc-packus.ll Thu Oct 10 17:38:51 2019 @@ -1111,12 +1111,8 @@ define <8 x i16> @trunc_packus_v8i32_v8i ; ; AVX512F-LABEL: trunc_packus_v8i32_v8i16: ; AVX512F: # %bb.0: -; AVX512F-NEXT: vpbroadcastd {{.*#+}} ymm1 = [65535,65535,65535,65535,65535,65535,65535,65535] -; AVX512F-NEXT: vpminsd %ymm1, %ymm0, %ymm0 -; AVX512F-NEXT: vpxor %xmm1, %xmm1, %xmm1 -; AVX512F-NEXT: 
vpmaxsd %ymm1, %ymm0, %ymm0 -; AVX512F-NEXT: vpmovdw %zmm0, %ymm0 -; AVX512F-NEXT: # kill: def $xmm0 killed $xmm0 killed $ymm0 +; AVX512F-NEXT: vextracti128 $1, %ymm0, %xmm1 +; AVX512F-NEXT: vpackusdw %xmm1, %xmm0, %xmm0 ; AVX512F-NEXT: vzeroupper ; AVX512F-NEXT: retq ; @@ -1130,12 +1126,8 @@ define <8 x i16> @trunc_packus_v8i32_v8i ; ; AVX512BW-LABEL: trunc_packus_v8i32_v8i16: ; AVX512BW: # %bb.0: -; AVX512BW-NEXT: vpbroadcastd {{.*#+}} ymm1 = [65535,65535,65535,65535,65535,65535,65535,65535] -; AVX512BW-NEXT: vpminsd %ymm1, %ymm0, %ymm0 -; AVX512BW-NEXT: vpxor %xmm1, %xmm1, %xmm1 -; AVX512BW-NEXT: vpmaxsd %ymm1, %ymm0, %ymm0 -; AVX512BW-NEXT: vpmovdw %zmm0, %ymm0 -; AVX512BW-NEXT: # kill: def $xmm0 killed $xmm0 killed $ymm0 +; AVX512BW-NEXT: vextracti128 $1, %ymm0, %xmm1 +; AVX512BW-NEXT: vpackusdw %xmm1, %xmm0, %xmm0 ; AVX512BW-NEXT: vzeroupper ; AVX512BW-NEXT: retq ; @@ -2816,11 +2808,9 @@ define <8 x i8> @trunc_packus_v8i32_v8i8 ; ; AVX512F-LABEL: trunc_packus_v8i32_v8i8: ; AVX512F: # %bb.0: -; AVX512F-NEXT: vpbroadcastd {{.*#+}} ymm1 = [255,255,255,255,255,255,255,255] -; AVX512F-NEXT: vpminsd %ymm1, %ymm0, %ymm0 -; AVX512F-NEXT: vpxor %xmm1, %xmm1, %xmm1 -; AVX512F-NEXT: vpmaxsd %ymm1, %ymm0, %ymm0 -; AVX512F-NEXT: vpmovdb %zmm0, %xmm0 +; AVX512F-NEXT: vextracti128 $1, %ymm0, %xmm1 +; AVX512F-NEXT: vpackssdw %xmm1, %xmm0, %xmm0 +; AVX512F-NEXT: vpackuswb %xmm0, %xmm0, %xmm0 ; AVX512F-NEXT: vzeroupper ; AVX512F-NEXT: retq ; @@ -2834,11 +2824,9 @@ define <8 x i8> @trunc_packus_v8i32_v8i8 ; ; AVX512BW-LABEL: trunc_packus_v8i32_v8i8: ; AVX512BW: # %bb.0: -; AVX512BW-NEXT: vpbroadcastd {{.*#+}} ymm1 = [255,255,255,255,255,255,255,255] -; AVX512BW-NEXT: vpminsd %ymm1, %ymm0, %ymm0 -; AVX512BW-NEXT: vpxor %xmm1, %xmm1, %xmm1 -; AVX512BW-NEXT: vpmaxsd %ymm1, %ymm0, %ymm0 -; AVX512BW-NEXT: vpmovdb %zmm0, %xmm0 +; AVX512BW-NEXT: vextracti128 $1, %ymm0, %xmm1 +; AVX512BW-NEXT: vpackssdw %xmm1, %xmm0, %xmm0 +; AVX512BW-NEXT: vpackuswb %xmm0, %xmm0, %xmm0 ; 
AVX512BW-NEXT: vzeroupper ; AVX512BW-NEXT: retq ; @@ -2885,11 +2873,9 @@ define void @trunc_packus_v8i32_v8i8_sto ; ; AVX512F-LABEL: trunc_packus_v8i32_v8i8_store: ; AVX512F: # %bb.0: -; AVX512F-NEXT: vpbroadcastd {{.*#+}} ymm1 = [255,255,255,255,255,255,255,255] -; AVX512F-NEXT: vpminsd %ymm1, %ymm0, %ymm0 -; AVX512F-NEXT: vpxor %xmm1, %xmm1, %xmm1 -; AVX512F-NEXT: vpmaxsd %ymm1, %ymm0, %ymm0 -; AVX512F-NEXT: vpmovdb %zmm0, %xmm0 +; AVX512F-NEXT: vextracti128 $1, %ymm0, %xmm1 +; AVX512F-NEXT: vpackssdw %xmm1, %xmm0, %xmm0 +; AVX512F-NEXT: vpackuswb %xmm0, %xmm0, %xmm0 ; AVX512F-NEXT: vmovq %xmm0, (%rdi) ; AVX512F-NEXT: vzeroupper ; AVX512F-NEXT: retq @@ -2904,11 +2890,9 @@ define void @trunc_packus_v8i32_v8i8_sto ; ; AVX512BW-LABEL: trunc_packus_v8i32_v8i8_store: ; AVX512BW: # %bb.0: -; AVX512BW-NEXT: vpbroadcastd {{.*#+}} ymm1 = [255,255,255,255,255,255,255,255] -; AVX512BW-NEXT: vpminsd %ymm1, %ymm0, %ymm0 -; AVX512BW-NEXT: vpxor %xmm1, %xmm1, %xmm1 -; AVX512BW-NEXT: vpmaxsd %ymm1, %ymm0, %ymm0 -; AVX512BW-NEXT: vpmovdb %zmm0, %xmm0 +; AVX512BW-NEXT: vextracti128 $1, %ymm0, %xmm1 +; AVX512BW-NEXT: vpackssdw %xmm1, %xmm0, %xmm0 +; AVX512BW-NEXT: vpackuswb %xmm0, %xmm0, %xmm0 ; AVX512BW-NEXT: vmovq %xmm0, (%rdi) ; AVX512BW-NEXT: vzeroupper ; AVX512BW-NEXT: retq @@ -3007,11 +2991,8 @@ define <16 x i8> @trunc_packus_v16i16_v1 ; ; AVX512BW-LABEL: trunc_packus_v16i16_v16i8: ; AVX512BW: # %bb.0: -; AVX512BW-NEXT: vpminsw {{.*}}(%rip), %ymm0, %ymm0 -; AVX512BW-NEXT: vpxor %xmm1, %xmm1, %xmm1 -; AVX512BW-NEXT: vpmaxsw %ymm1, %ymm0, %ymm0 -; AVX512BW-NEXT: vpmovwb %zmm0, %ymm0 -; AVX512BW-NEXT: # kill: def $xmm0 killed $xmm0 killed $ymm0 +; AVX512BW-NEXT: vextracti128 $1, %ymm0, %xmm1 +; AVX512BW-NEXT: vpackuswb %xmm1, %xmm0, %xmm0 ; AVX512BW-NEXT: vzeroupper ; AVX512BW-NEXT: retq ; Modified: llvm/trunk/test/CodeGen/X86/vector-trunc-ssat.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/vector-trunc-ssat.ll?rev=374487&r1=374486&r2=374487&view=diff 
============================================================================== --- llvm/trunk/test/CodeGen/X86/vector-trunc-ssat.ll (original) +++ llvm/trunk/test/CodeGen/X86/vector-trunc-ssat.ll Thu Oct 10 17:38:51 2019 @@ -1078,12 +1078,8 @@ define <8 x i16> @trunc_ssat_v8i32_v8i16 ; ; AVX512F-LABEL: trunc_ssat_v8i32_v8i16: ; AVX512F: # %bb.0: -; AVX512F-NEXT: vpbroadcastd {{.*#+}} ymm1 = [32767,32767,32767,32767,32767,32767,32767,32767] -; AVX512F-NEXT: vpminsd %ymm1, %ymm0, %ymm0 -; AVX512F-NEXT: vpbroadcastd {{.*#+}} ymm1 = [4294934528,4294934528,4294934528,4294934528,4294934528,4294934528,4294934528,4294934528] -; AVX512F-NEXT: vpmaxsd %ymm1, %ymm0, %ymm0 -; AVX512F-NEXT: vpmovdw %zmm0, %ymm0 -; AVX512F-NEXT: # kill: def $xmm0 killed $xmm0 killed $ymm0 +; AVX512F-NEXT: vextracti128 $1, %ymm0, %xmm1 +; AVX512F-NEXT: vpackssdw %xmm1, %xmm0, %xmm0 ; AVX512F-NEXT: vzeroupper ; AVX512F-NEXT: retq ; @@ -1095,12 +1091,8 @@ define <8 x i16> @trunc_ssat_v8i32_v8i16 ; ; AVX512BW-LABEL: trunc_ssat_v8i32_v8i16: ; AVX512BW: # %bb.0: -; AVX512BW-NEXT: vpbroadcastd {{.*#+}} ymm1 = [32767,32767,32767,32767,32767,32767,32767,32767] -; AVX512BW-NEXT: vpminsd %ymm1, %ymm0, %ymm0 -; AVX512BW-NEXT: vpbroadcastd {{.*#+}} ymm1 = [4294934528,4294934528,4294934528,4294934528,4294934528,4294934528,4294934528,4294934528] -; AVX512BW-NEXT: vpmaxsd %ymm1, %ymm0, %ymm0 -; AVX512BW-NEXT: vpmovdw %zmm0, %ymm0 -; AVX512BW-NEXT: # kill: def $xmm0 killed $xmm0 killed $ymm0 +; AVX512BW-NEXT: vextracti128 $1, %ymm0, %xmm1 +; AVX512BW-NEXT: vpackssdw %xmm1, %xmm0, %xmm0 ; AVX512BW-NEXT: vzeroupper ; AVX512BW-NEXT: retq ; @@ -2795,11 +2787,9 @@ define <8 x i8> @trunc_ssat_v8i32_v8i8(< ; ; AVX512F-LABEL: trunc_ssat_v8i32_v8i8: ; AVX512F: # %bb.0: -; AVX512F-NEXT: vpbroadcastd {{.*#+}} ymm1 = [127,127,127,127,127,127,127,127] -; AVX512F-NEXT: vpminsd %ymm1, %ymm0, %ymm0 -; AVX512F-NEXT: vpbroadcastd {{.*#+}} ymm1 = 
[4294967168,4294967168,4294967168,4294967168,4294967168,4294967168,4294967168,4294967168] -; AVX512F-NEXT: vpmaxsd %ymm1, %ymm0, %ymm0 -; AVX512F-NEXT: vpmovdb %zmm0, %xmm0 +; AVX512F-NEXT: vextracti128 $1, %ymm0, %xmm1 +; AVX512F-NEXT: vpackssdw %xmm1, %xmm0, %xmm0 +; AVX512F-NEXT: vpacksswb %xmm0, %xmm0, %xmm0 ; AVX512F-NEXT: vzeroupper ; AVX512F-NEXT: retq ; @@ -2811,11 +2801,9 @@ define <8 x i8> @trunc_ssat_v8i32_v8i8(< ; ; AVX512BW-LABEL: trunc_ssat_v8i32_v8i8: ; AVX512BW: # %bb.0: -; AVX512BW-NEXT: vpbroadcastd {{.*#+}} ymm1 = [127,127,127,127,127,127,127,127] -; AVX512BW-NEXT: vpminsd %ymm1, %ymm0, %ymm0 -; AVX512BW-NEXT: vpbroadcastd {{.*#+}} ymm1 = [4294967168,4294967168,4294967168,4294967168,4294967168,4294967168,4294967168,4294967168] -; AVX512BW-NEXT: vpmaxsd %ymm1, %ymm0, %ymm0 -; AVX512BW-NEXT: vpmovdb %zmm0, %xmm0 +; AVX512BW-NEXT: vextracti128 $1, %ymm0, %xmm1 +; AVX512BW-NEXT: vpackssdw %xmm1, %xmm0, %xmm0 +; AVX512BW-NEXT: vpacksswb %xmm0, %xmm0, %xmm0 ; AVX512BW-NEXT: vzeroupper ; AVX512BW-NEXT: retq ; @@ -2860,11 +2848,9 @@ define void @trunc_ssat_v8i32_v8i8_store ; ; AVX512F-LABEL: trunc_ssat_v8i32_v8i8_store: ; AVX512F: # %bb.0: -; AVX512F-NEXT: vpbroadcastd {{.*#+}} ymm1 = [127,127,127,127,127,127,127,127] -; AVX512F-NEXT: vpminsd %ymm1, %ymm0, %ymm0 -; AVX512F-NEXT: vpbroadcastd {{.*#+}} ymm1 = [4294967168,4294967168,4294967168,4294967168,4294967168,4294967168,4294967168,4294967168] -; AVX512F-NEXT: vpmaxsd %ymm1, %ymm0, %ymm0 -; AVX512F-NEXT: vpmovdb %zmm0, %xmm0 +; AVX512F-NEXT: vextracti128 $1, %ymm0, %xmm1 +; AVX512F-NEXT: vpackssdw %xmm1, %xmm0, %xmm0 +; AVX512F-NEXT: vpacksswb %xmm0, %xmm0, %xmm0 ; AVX512F-NEXT: vmovq %xmm0, (%rdi) ; AVX512F-NEXT: vzeroupper ; AVX512F-NEXT: retq @@ -2877,11 +2863,9 @@ define void @trunc_ssat_v8i32_v8i8_store ; ; AVX512BW-LABEL: trunc_ssat_v8i32_v8i8_store: ; AVX512BW: # %bb.0: -; AVX512BW-NEXT: vpbroadcastd {{.*#+}} ymm1 = [127,127,127,127,127,127,127,127] -; AVX512BW-NEXT: vpminsd %ymm1, %ymm0, %ymm0 
-; AVX512BW-NEXT: vpbroadcastd {{.*#+}} ymm1 = [4294967168,4294967168,4294967168,4294967168,4294967168,4294967168,4294967168,4294967168] -; AVX512BW-NEXT: vpmaxsd %ymm1, %ymm0, %ymm0 -; AVX512BW-NEXT: vpmovdb %zmm0, %xmm0 +; AVX512BW-NEXT: vextracti128 $1, %ymm0, %xmm1 +; AVX512BW-NEXT: vpackssdw %xmm1, %xmm0, %xmm0 +; AVX512BW-NEXT: vpacksswb %xmm0, %xmm0, %xmm0 ; AVX512BW-NEXT: vmovq %xmm0, (%rdi) ; AVX512BW-NEXT: vzeroupper ; AVX512BW-NEXT: retq @@ -2976,10 +2960,8 @@ define <16 x i8> @trunc_ssat_v16i16_v16i ; ; AVX512BW-LABEL: trunc_ssat_v16i16_v16i8: ; AVX512BW: # %bb.0: -; AVX512BW-NEXT: vpminsw {{.*}}(%rip), %ymm0, %ymm0 -; AVX512BW-NEXT: vpmaxsw {{.*}}(%rip), %ymm0, %ymm0 -; AVX512BW-NEXT: vpmovwb %zmm0, %ymm0 -; AVX512BW-NEXT: # kill: def $xmm0 killed $xmm0 killed $ymm0 +; AVX512BW-NEXT: vextracti128 $1, %ymm0, %xmm1 +; AVX512BW-NEXT: vpacksswb %xmm1, %xmm0, %xmm0 ; AVX512BW-NEXT: vzeroupper ; AVX512BW-NEXT: retq ; From llvm-commits at lists.llvm.org Thu Oct 10 17:40:29 2019 From: llvm-commits at lists.llvm.org (Philip Reames via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 00:40:29 +0000 (UTC) Subject: [PATCH] D68844: [SCEV] Compute exit count for simple floating point IVs Message-ID: reames created this revision. reames added reviewers: nikic, sanjoy, scanon. Herald added subscribers: bollu, mcrosier. Herald added a project: LLVM. This patch adds the plumbing for computing exit counts for loops with floating point IVs. The logic is generally pretty simple (deliberately). The basic reasoning is that if we have an IV with integer values represented as floats, then we can compute the exit counts "as-if" they were ints. The motivation here is to eventually subsume the logic inside IndVarSimplify::handleFloatingPointIV, and restructure that one to simply perform the transform. By doing so, we handle a much broader class of floating point IV loops than we do today. To be fair, the actual motivation behind all of this is to break a toy microbenchmark. 
This is not something seen in real code. Repository: rL LLVM https://reviews.llvm.org/D68844 Files: include/llvm/Analysis/ScalarEvolution.h lib/Analysis/ScalarEvolution.cpp test/Analysis/ScalarEvolution/trip-count-float.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D68844.224508.patch Type: text/x-patch Size: 13317 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Thu Oct 10 17:57:41 2019 From: llvm-commits at lists.llvm.org (Volodymyr Sapsai via llvm-commits) Date: Fri, 11 Oct 2019 00:57:41 -0000 Subject: [llvm] r374490 - [Stats] Add ALWAYS_ENABLED_STATISTIC enabled regardless of LLVM_ENABLE_STATS. Message-ID: <20191011005741.43E7592A1F@lists.llvm.org> Author: vsapsai Date: Thu Oct 10 17:57:41 2019 New Revision: 374490 URL: http://llvm.org/viewvc/llvm-project?rev=374490&view=rev Log: [Stats] Add ALWAYS_ENABLED_STATISTIC enabled regardless of LLVM_ENABLE_STATS. The intended usage is to measure relatively expensive operations. So the cost of the statistic is negligible compared to the cost of a measured operation and can be enabled all the time without impairing the compilation time. 
rdar://problem/55715134 Reviewers: dsanders, bogner, rtereshin Reviewed By: dsanders Subscribers: hiraditya, jkorous, dexonsmith, ributzka, cfe-commits, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68252 Modified: llvm/trunk/include/llvm/ADT/Statistic.h llvm/trunk/lib/Support/Statistic.cpp llvm/trunk/unittests/ADT/StatisticTest.cpp Modified: llvm/trunk/include/llvm/ADT/Statistic.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/ADT/Statistic.h?rev=374490&r1=374489&r2=374490&view=diff ============================================================================== --- llvm/trunk/include/llvm/ADT/Statistic.h (original) +++ llvm/trunk/include/llvm/ADT/Statistic.h Thu Oct 10 17:57:41 2019 @@ -44,38 +44,39 @@ class raw_ostream; class raw_fd_ostream; class StringRef; -class Statistic { +class StatisticBase { public: const char *DebugType; const char *Name; const char *Desc; - std::atomic<unsigned> Value; - std::atomic<bool> Initialized; - unsigned getValue() const { return Value.load(std::memory_order_relaxed); } + StatisticBase(const char *DebugType, const char *Name, const char *Desc) + : DebugType(DebugType), Name(Name), Desc(Desc) {} + const char *getDebugType() const { return DebugType; } const char *getName() const { return Name; } const char *getDesc() const { return Desc; } +}; - /// construct - This should only be called for non-global statistics. - void construct(const char *debugtype, const char *name, const char *desc) { - DebugType = debugtype; - Name = name; - Desc = desc; - Value = 0; - Initialized = false; - } +class TrackingStatistic : public StatisticBase { +public: + std::atomic<unsigned> Value; + std::atomic<bool> Initialized; + + TrackingStatistic(const char *DebugType, const char *Name, const char *Desc) + : StatisticBase(DebugType, Name, Desc), Value(0), Initialized(false) {} + + unsigned getValue() const { return Value.load(std::memory_order_relaxed); } // Allow use of this class as the value itself. 
operator unsigned() const { return getValue(); } -#if LLVM_ENABLE_STATS - const Statistic &operator=(unsigned Val) { + const TrackingStatistic &operator=(unsigned Val) { Value.store(Val, std::memory_order_relaxed); return init(); } - const Statistic &operator++() { + const TrackingStatistic &operator++() { Value.fetch_add(1, std::memory_order_relaxed); return init(); } @@ -85,7 +86,7 @@ public: return Value.fetch_add(1, std::memory_order_relaxed); } - const Statistic &operator--() { + const TrackingStatistic &operator--() { Value.fetch_sub(1, std::memory_order_relaxed); return init(); } @@ -95,14 +96,14 @@ public: return Value.fetch_sub(1, std::memory_order_relaxed); } - const Statistic &operator+=(unsigned V) { + const TrackingStatistic &operator+=(unsigned V) { if (V == 0) return *this; Value.fetch_add(V, std::memory_order_relaxed); return init(); } - const Statistic &operator-=(unsigned V) { + const TrackingStatistic &operator-=(unsigned V) { if (V == 0) return *this; Value.fetch_sub(V, std::memory_order_relaxed); @@ -119,54 +120,57 @@ public: init(); } -#else // Statistics are disabled in release builds. - - const Statistic &operator=(unsigned Val) { +protected: + TrackingStatistic &init() { + if (!Initialized.load(std::memory_order_acquire)) + RegisterStatistic(); return *this; } - const Statistic &operator++() { - return *this; - } + void RegisterStatistic(); +}; - unsigned operator++(int) { - return 0; - } +class NoopStatistic : public StatisticBase { +public: + using StatisticBase::StatisticBase; - const Statistic &operator--() { - return *this; - } + unsigned getValue() const { return 0; } - unsigned operator--(int) { - return 0; - } + // Allow use of this class as the value itself. 
+ operator unsigned() const { return 0; } - const Statistic &operator+=(const unsigned &V) { - return *this; - } + const NoopStatistic &operator=(unsigned Val) { return *this; } - const Statistic &operator-=(const unsigned &V) { - return *this; - } + const NoopStatistic &operator++() { return *this; } - void updateMax(unsigned V) {} + unsigned operator++(int) { return 0; } -#endif // LLVM_ENABLE_STATS + const NoopStatistic &operator--() { return *this; } -protected: - Statistic &init() { - if (!Initialized.load(std::memory_order_acquire)) - RegisterStatistic(); - return *this; - } + unsigned operator--(int) { return 0; } - void RegisterStatistic(); + const NoopStatistic &operator+=(const unsigned &V) { return *this; } + + const NoopStatistic &operator-=(const unsigned &V) { return *this; } + + void updateMax(unsigned V) {} }; +#if LLVM_ENABLE_STATS +using Statistic = TrackingStatistic; +#else +using Statistic = NoopStatistic; +#endif + // STATISTIC - A macro to make definition of statistics really simple. This // automatically passes the DEBUG_TYPE of the file into the statistic. #define STATISTIC(VARNAME, DESC) \ - static llvm::Statistic VARNAME = {DEBUG_TYPE, #VARNAME, DESC, {0}, {false}} + static llvm::Statistic VARNAME = {DEBUG_TYPE, #VARNAME, DESC} + +// ALWAYS_ENABLED_STATISTIC - A macro to define a statistic like STATISTIC but +// it is enabled even if LLVM_ENABLE_STATS is off. +#define ALWAYS_ENABLED_STATISTIC(VARNAME, DESC) \ + static llvm::TrackingStatistic VARNAME = {DEBUG_TYPE, #VARNAME, DESC} /// Enable the collection and printing of statistics. 
void EnableStatistics(bool PrintOnExit = true); Modified: llvm/trunk/lib/Support/Statistic.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Support/Statistic.cpp?rev=374490&r1=374489&r2=374490&view=diff ============================================================================== --- llvm/trunk/lib/Support/Statistic.cpp (original) +++ llvm/trunk/lib/Support/Statistic.cpp Thu Oct 10 17:57:41 2019 @@ -57,7 +57,7 @@ namespace { /// This class is also used to look up statistic values from applications that /// use LLVM. class StatisticInfo { - std::vector Stats; + std::vector Stats; friend void llvm::PrintStatistics(); friend void llvm::PrintStatistics(raw_ostream &OS); @@ -66,14 +66,12 @@ class StatisticInfo { /// Sort statistics by debugtype,name,description. void sort(); public: - using const_iterator = std::vector::const_iterator; + using const_iterator = std::vector::const_iterator; StatisticInfo(); ~StatisticInfo(); - void addStatistic(Statistic *S) { - Stats.push_back(S); - } + void addStatistic(TrackingStatistic *S) { Stats.push_back(S); } const_iterator begin() const { return Stats.begin(); } const_iterator end() const { return Stats.end(); } @@ -90,7 +88,7 @@ static ManagedStaticgetDebugType(), RHS->getDebugType())) - return Cmp < 0; + llvm::stable_sort( + Stats, [](const TrackingStatistic *LHS, const TrackingStatistic *RHS) { + if (int Cmp = std::strcmp(LHS->getDebugType(), RHS->getDebugType())) + return Cmp < 0; - if (int Cmp = std::strcmp(LHS->getName(), RHS->getName())) - return Cmp < 0; + if (int Cmp = std::strcmp(LHS->getName(), RHS->getName())) + return Cmp < 0; - return std::strcmp(LHS->getDesc(), RHS->getDesc()) < 0; - }); + return std::strcmp(LHS->getDesc(), RHS->getDesc()) < 0; + }); } void StatisticInfo::reset() { @@ -207,7 +206,7 @@ void llvm::PrintStatisticsJSON(raw_ostre // Print all of the statistics. 
OS << "{\n"; const char *delim = ""; - for (const Statistic *Stat : Stats.Stats) { + for (const TrackingStatistic *Stat : Stats.Stats) { OS << delim; assert(yaml::needsQuotes(Stat->getDebugType()) == yaml::QuotingType::None && "Statistic group/type name is simple."); Modified: llvm/trunk/unittests/ADT/StatisticTest.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/unittests/ADT/StatisticTest.cpp?rev=374490&r1=374489&r2=374490&view=diff ============================================================================== --- llvm/trunk/unittests/ADT/StatisticTest.cpp (original) +++ llvm/trunk/unittests/ADT/StatisticTest.cpp Thu Oct 10 17:57:41 2019 @@ -17,6 +17,7 @@ namespace { #define DEBUG_TYPE "unittest" STATISTIC(Counter, "Counts things"); STATISTIC(Counter2, "Counts other things"); +ALWAYS_ENABLED_STATISTIC(AlwaysCounter, "Counts things always"); #if LLVM_ENABLE_STATS static void @@ -43,6 +44,12 @@ TEST(StatisticTest, Count) { #else EXPECT_EQ(Counter, 0u); #endif + + AlwaysCounter = 0; + EXPECT_EQ(AlwaysCounter, 0u); + AlwaysCounter++; + ++AlwaysCounter; + EXPECT_EQ(AlwaysCounter, 2u); } TEST(StatisticTest, Assign) { @@ -54,6 +61,9 @@ TEST(StatisticTest, Assign) { #else EXPECT_EQ(Counter, 0u); #endif + + AlwaysCounter = 2; + EXPECT_EQ(AlwaysCounter, 2u); } TEST(StatisticTest, API) { From llvm-commits at lists.llvm.org Thu Oct 10 17:59:02 2019 From: llvm-commits at lists.llvm.org (Nico Weber via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 00:59:02 +0000 (UTC) Subject: [PATCH] D68832: [tsan,msan] Insert module constructors in a module pass In-Reply-To: References: Message-ID: <1e51476e503e9912bf2fcdd178737daa@localhost.localdomain> thakis added a comment. 
This fails on Mac and Windows: http://45.33.8.238/win/247/step_6.txt Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68832/new/ https://reviews.llvm.org/D68832 From llvm-commits at lists.llvm.org Thu Oct 10 17:59:02 2019 From: llvm-commits at lists.llvm.org (Aaron Puchert via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 00:59:02 +0000 (UTC) Subject: [PATCH] D51741: [coro]Pass rvalue reference for named local variable to return_value In-Reply-To: References: Message-ID: <1afecb25909749c6a7834461157d4511@localhost.localdomain> aaronpuchert added a comment. Please have a look at D68845. This should address the issues that we discussed. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D51741/new/ https://reviews.llvm.org/D51741 From llvm-commits at lists.llvm.org Thu Oct 10 17:59:09 2019 From: llvm-commits at lists.llvm.org (Volodymyr Sapsai via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 00:59:09 +0000 (UTC) Subject: [PATCH] D68252: [Stats] Add ALWAYS_ENABLED_STATISTIC enabled regardless of LLVM_ENABLE_STATS. In-Reply-To: References: Message-ID: <95f1b8eb39a857027e986fca63ba7ff3@localhost.localdomain> This revision was automatically updated to reflect the committed changes. vsapsai marked an inline comment as done. Closed by commit rGadb203feda90: [Stats] Add ALWAYS_ENABLED_STATISTIC enabled regardless of LLVM_ENABLE_STATS. (authored by vsapsai). Changed prior to commit: https://reviews.llvm.org/D68252?vs=224235&id=224514#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68252/new/ https://reviews.llvm.org/D68252 Files: llvm/include/llvm/ADT/Statistic.h llvm/lib/Support/Statistic.cpp llvm/unittests/ADT/StatisticTest.cpp -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68252.224514.patch Type: text/x-patch Size: 8490 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Thu Oct 10 18:08:17 2019 From: llvm-commits at lists.llvm.org (Julian Lettner via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 01:08:17 +0000 (UTC) Subject: [PATCH] D68847: [lit] Small refactoring and cleanups in main.py Message-ID: yln created this revision. yln added reviewers: rnk, ddunbar, serge-sans-paille, probinson, jdenny, cishida, nate_chandler, jordan_rose. Herald added subscribers: llvm-commits, delcypher. Herald added a project: LLVM. - Remove outdated precautions for Python versions < 2.7 - Remove dead code related to `maxIndividualTestTime` option - Move printing of test and result summary out of main into its own function Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D68847 Files: llvm/utils/lit/lit/main.py -------------- next part -------------- A non-text attachment was scrubbed... Name: D68847.224515.patch Type: text/x-patch Size: 7138 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Thu Oct 10 18:08:18 2019 From: llvm-commits at lists.llvm.org (Johannes Doerfert via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 01:08:18 +0000 (UTC) Subject: [PATCH] D68765: [Attributor] Function signature rewrite infrastructure In-Reply-To: References: Message-ID: <34986a147cf2559379bbb0349be1e0a8@localhost.localdomain> jdoerfert updated this revision to Diff 224516. jdoerfert added a comment. 
Lessons learned from running *all* argument promotion tests, patches will follow Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68765/new/ https://reviews.llvm.org/D68765 Files: llvm/include/llvm/Transforms/IPO/Attributor.h llvm/lib/Transforms/IPO/Attributor.cpp llvm/test/Transforms/FunctionAttrs/align.ll llvm/test/Transforms/FunctionAttrs/liveness.ll llvm/test/Transforms/FunctionAttrs/noalias_returned.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D68765.224516.patch Type: text/x-patch Size: 23530 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Thu Oct 10 18:17:31 2019 From: llvm-commits at lists.llvm.org (Jordan Rupprecht via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 01:17:31 +0000 (UTC) Subject: [PATCH] D68848: [llvm-objdump] Use a counter for llvm-objdump -h instead of the section index. Message-ID: rupprecht created this revision. rupprecht added reviewers: grimar, jhenderson. rupprecht added a project: LLVM. Herald added subscribers: llvm-commits, seiya, arphaman, aheejin, arichardson, sbc100, emaste. Herald added a reviewer: espindola. rupprecht added a parent revision: D68730: [llvm-objdump] Adjust spacing and field width for --section-headers. rupprecht updated this revision to Diff 224519. rupprecht added a comment. Rebase against D68730 When listing the index in `llvm-objdump -h`, use a zero-based counter instead of the actual section index (e.g. shdr->sh_index for ELF). While this is effectively a noop for now (except one unit test for XCOFF), the index values will change in a future patch that filters certain sections out (e.g. symbol tables). See D68669 for more context. 
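The difference between a running counter and the real section index only becomes visible once some sections are filtered out. The following is a minimal Python sketch of that idea, not the llvm-objdump C++ implementation: the `Section` type, the `list_sections` function and the `skip_names` parameter are hypothetical names introduced purely for illustration.

```python
from typing import Iterable, List, NamedTuple, Tuple


class Section(NamedTuple):
    """Hypothetical stand-in for one section header entry."""
    index: int  # the section's real index in the object file
    name: str


def list_sections(sections: Iterable[Section],
                  skip_names: Tuple[str, ...] = ()) -> List[Tuple[int, str]]:
    """Return (Idx, Name) rows for the sections that are printed.

    The Idx column is a zero-based counter over the printed rows, not
    Section.index, so filtering sections out leaves no gaps in the listing.
    """
    kept = (s for s in sections if s.name not in skip_names)
    return [(idx, s.name) for idx, s in enumerate(kept)]
```

With nothing filtered, the counter and the real index coincide, which matches the description of the change as effectively a no-op for now. Once a section such as a symbol table is skipped, the rows after it are renumbered consecutively instead of keeping their original indices.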
Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D68848 Files: llvm/test/tools/llvm-objdump/xcoff-section-headers.test llvm/tools/llvm-objdump/llvm-objdump.cpp llvm/tools/llvm-objdump/llvm-objdump.h -------------- next part -------------- A non-text attachment was scrubbed... Name: D68848.224519.patch Type: text/x-patch Size: 6120 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Thu Oct 10 18:17:32 2019 From: llvm-commits at lists.llvm.org (Jordan Rupprecht via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 01:17:32 +0000 (UTC) Subject: [PATCH] D68848: [llvm-objdump] Use a counter for llvm-objdump -h instead of the section index. In-Reply-To: References: Message-ID: <3b2fc8a86cc7b818a37f5adf300e2fff@localhost.localdomain> rupprecht updated this revision to Diff 224519. rupprecht added a comment. Rebase against D68730 Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68848/new/ https://reviews.llvm.org/D68848 Files: llvm/test/tools/llvm-objdump/xcoff-section-headers.test llvm/tools/llvm-objdump/llvm-objdump.cpp llvm/tools/llvm-objdump/llvm-objdump.h -------------- next part -------------- A non-text attachment was scrubbed... Name: D68848.224519.patch Type: text/x-patch Size: 6120 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Thu Oct 10 18:17:33 2019 From: llvm-commits at lists.llvm.org (Stanislav Mekhanoshin via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 01:17:33 +0000 (UTC) Subject: [PATCH] D68338: [AMDGPU] Remove dubious logic in bidirectional list scheduler In-Reply-To: References: Message-ID: <812a718f7db0eca202e890b367c757c0@localhost.localdomain> rampitec added a comment. In D68338#1703821 , @foad wrote: > Reviewers: any advice on handling lots of test updates like this? I could pre-commit some of the tests, where I've made them strictly more lenient. 
I could also add -enable-misched=false to any tests that aren't specifically testing the scheduler, update them and pre-commit that, in order to protect them from this and future scheduler tweaks. I usually prefer to precommit non-essential test changes. It is less merges and smaller review. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68338/new/ https://reviews.llvm.org/D68338 From llvm-commits at lists.llvm.org Thu Oct 10 18:17:33 2019 From: llvm-commits at lists.llvm.org (Johannes Doerfert via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 01:17:33 +0000 (UTC) Subject: [PATCH] D68819: [Utils] Allow update_test_checks to check function arguments In-Reply-To: References: Message-ID: jdoerfert updated this revision to Diff 224520. jdoerfert added a comment. Lessons learned by running it on all argument promotion files: D68766 Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68819/new/ https://reviews.llvm.org/D68819 Files: llvm/utils/UpdateTestChecks/common.py llvm/utils/update_analyze_test_checks.py llvm/utils/update_mir_test_checks.py llvm/utils/update_test_checks.py -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68819.224520.patch Type: text/x-patch Size: 7919 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Thu Oct 10 18:28:27 2019 From: llvm-commits at lists.llvm.org (Matt Arsenault via llvm-commits) Date: Fri, 11 Oct 2019 01:28:27 -0000 Subject: [llvm] r374495 - AMDGPU: Move SelectFlatOffset back into AMDGPUISelDAGToDAG Message-ID: <20191011012827.DBDB38564C@lists.llvm.org> Author: arsenm Date: Thu Oct 10 18:28:27 2019 New Revision: 374495 URL: http://llvm.org/viewvc/llvm-project?rev=374495&view=rev Log: AMDGPU: Move SelectFlatOffset back into AMDGPUISelDAGToDAG Modified: llvm/trunk/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp llvm/trunk/lib/Target/AMDGPU/AMDGPUISelLowering.cpp llvm/trunk/lib/Target/AMDGPU/AMDGPUISelLowering.h Modified: llvm/trunk/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp?rev=374495&r1=374494&r2=374495&view=diff ============================================================================== --- llvm/trunk/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp (original) +++ llvm/trunk/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp Thu Oct 10 18:28:27 2019 @@ -209,15 +209,14 @@ private: bool SelectMUBUFOffset(SDValue Addr, SDValue &SRsrc, SDValue &Soffset, SDValue &Offset) const; + template + bool SelectFlatOffset(SDNode *N, SDValue Addr, SDValue &VAddr, + SDValue &Offset, SDValue &SLC) const; bool SelectFlatAtomic(SDNode *N, SDValue Addr, SDValue &VAddr, SDValue &Offset, SDValue &SLC) const; bool SelectFlatAtomicSigned(SDNode *N, SDValue Addr, SDValue &VAddr, SDValue &Offset, SDValue &SLC) const; - template - bool SelectFlatOffset(SDNode *N, SDValue Addr, SDValue &VAddr, - SDValue &Offset, SDValue &SLC) const; - bool SelectSMRDOffset(SDValue ByteOffsetNode, SDValue &Offset, bool &Imm) const; SDValue Expand32BitAddress(SDValue Addr) const; @@ -1606,14 +1605,48 @@ bool AMDGPUDAGToDAGISel::SelectMUBUFOffs return SelectMUBUFOffset(Addr, SRsrc, Soffset, Offset, GLC, 
SLC, TFE, DLC, SWZ); } +// Find a load or store from corresponding pattern root. +// Roots may be build_vector, bitconvert or their combinations. +static MemSDNode* findMemSDNode(SDNode *N) { + N = AMDGPUTargetLowering::stripBitcast(SDValue(N,0)).getNode(); + if (MemSDNode *MN = dyn_cast(N)) + return MN; + assert(isa(N)); + for (SDValue V : N->op_values()) + if (MemSDNode *MN = + dyn_cast(AMDGPUTargetLowering::stripBitcast(V))) + return MN; + llvm_unreachable("cannot find MemSDNode in the pattern!"); +} + template bool AMDGPUDAGToDAGISel::SelectFlatOffset(SDNode *N, SDValue Addr, SDValue &VAddr, SDValue &Offset, SDValue &SLC) const { - return static_cast(getTargetLowering())-> - SelectFlatOffset(IsSigned, *CurDAG, N, Addr, VAddr, Offset, SLC); + int64_t OffsetVal = 0; + + if (Subtarget->hasFlatInstOffsets() && + (!Subtarget->hasFlatSegmentOffsetBug() || + findMemSDNode(N)->getAddressSpace() != AMDGPUAS::FLAT_ADDRESS) && + CurDAG->isBaseWithConstantOffset(Addr)) { + SDValue N0 = Addr.getOperand(0); + SDValue N1 = Addr.getOperand(1); + int64_t COffsetVal = cast(N1)->getSExtValue(); + + const SIInstrInfo *TII = Subtarget->getInstrInfo(); + if (TII->isLegalFLATOffset(COffsetVal, findMemSDNode(N)->getAddressSpace(), + IsSigned)) { + Addr = N0; + OffsetVal = COffsetVal; + } + } + + VAddr = Addr; + Offset = CurDAG->getTargetConstant(OffsetVal, SDLoc(), MVT::i16); + SLC = CurDAG->getTargetConstant(0, SDLoc(), MVT::i1); + return true; } bool AMDGPUDAGToDAGISel::SelectFlatAtomic(SDNode *N, @@ -1625,10 +1658,10 @@ bool AMDGPUDAGToDAGISel::SelectFlatAtomi } bool AMDGPUDAGToDAGISel::SelectFlatAtomicSigned(SDNode *N, - SDValue Addr, - SDValue &VAddr, - SDValue &Offset, - SDValue &SLC) const { + SDValue Addr, + SDValue &VAddr, + SDValue &Offset, + SDValue &SLC) const { return SelectFlatOffset(N, Addr, VAddr, Offset, SLC); } Modified: llvm/trunk/lib/Target/AMDGPU/AMDGPUISelLowering.cpp URL: 
http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/AMDGPUISelLowering.cpp?rev=374495&r1=374494&r2=374495&view=diff ============================================================================== --- llvm/trunk/lib/Target/AMDGPU/AMDGPUISelLowering.cpp (original) +++ llvm/trunk/lib/Target/AMDGPU/AMDGPUISelLowering.cpp Thu Oct 10 18:28:27 2019 @@ -2828,54 +2828,6 @@ bool AMDGPUTargetLowering::shouldCombine return true; } -// Find a load or store from corresponding pattern root. -// Roots may be build_vector, bitconvert or their combinations. -static MemSDNode* findMemSDNode(SDNode *N) { - N = AMDGPUTargetLowering::stripBitcast(SDValue(N,0)).getNode(); - if (MemSDNode *MN = dyn_cast(N)) - return MN; - assert(isa(N)); - for (SDValue V : N->op_values()) - if (MemSDNode *MN = - dyn_cast(AMDGPUTargetLowering::stripBitcast(V))) - return MN; - llvm_unreachable("cannot find MemSDNode in the pattern!"); -} - -bool AMDGPUTargetLowering::SelectFlatOffset(bool IsSigned, - SelectionDAG &DAG, - SDNode *N, - SDValue Addr, - SDValue &VAddr, - SDValue &Offset, - SDValue &SLC) const { - const GCNSubtarget &ST = - DAG.getMachineFunction().getSubtarget(); - int64_t OffsetVal = 0; - - if (ST.hasFlatInstOffsets() && - (!ST.hasFlatSegmentOffsetBug() || - findMemSDNode(N)->getAddressSpace() != AMDGPUAS::FLAT_ADDRESS) && - DAG.isBaseWithConstantOffset(Addr)) { - SDValue N0 = Addr.getOperand(0); - SDValue N1 = Addr.getOperand(1); - int64_t COffsetVal = cast(N1)->getSExtValue(); - - const SIInstrInfo *TII = ST.getInstrInfo(); - if (TII->isLegalFLATOffset(COffsetVal, findMemSDNode(N)->getAddressSpace(), - IsSigned)) { - Addr = N0; - OffsetVal = COffsetVal; - } - } - - VAddr = Addr; - Offset = DAG.getTargetConstant(OffsetVal, SDLoc(), MVT::i16); - SLC = DAG.getTargetConstant(0, SDLoc(), MVT::i1); - - return true; -} - // Replace load of an illegal type with a store of a bitcast to a friendlier // type. 
SDValue AMDGPUTargetLowering::performLoadCombine(SDNode *N, Modified: llvm/trunk/lib/Target/AMDGPU/AMDGPUISelLowering.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/AMDGPUISelLowering.h?rev=374495&r1=374494&r2=374495&view=diff ============================================================================== --- llvm/trunk/lib/Target/AMDGPU/AMDGPUISelLowering.h (original) +++ llvm/trunk/lib/Target/AMDGPU/AMDGPUISelLowering.h Thu Oct 10 18:28:27 2019 @@ -326,10 +326,6 @@ public: } AtomicExpansionKind shouldExpandAtomicRMWInIR(AtomicRMWInst *) const override; - - bool SelectFlatOffset(bool IsSigned, SelectionDAG &DAG, SDNode *N, - SDValue Addr, SDValue &VAddr, SDValue &Offset, - SDValue &SLC) const; }; namespace AMDGPUISD { From llvm-commits at lists.llvm.org Thu Oct 10 18:26:36 2019 From: llvm-commits at lists.llvm.org (Kostya Serebryany via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 01:26:36 +0000 (UTC) Subject: [PATCH] D68752: [sancov] Use LLVM Support library JSON writer in favor of individual implementation In-Reply-To: References: Message-ID: <06524e471a84719a2f8f81749494e686@localhost.localdomain> kcc added a comment. I was actually hoping to get rid of this code entirely. Why do you need this change? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68752/new/ https://reviews.llvm.org/D68752 From llvm-commits at lists.llvm.org Thu Oct 10 18:26:38 2019 From: llvm-commits at lists.llvm.org (Matt Arsenault via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 01:26:38 +0000 (UTC) Subject: [PATCH] D64335: AMDGPU: Move SelectFlatOffset back into AMDGPUISelDAGToDAG In-Reply-To: References: Message-ID: <41273dec85fa48dd4090634e77709e23@localhost.localdomain> arsenm closed this revision. arsenm added a comment. 
r374495 CHANGES SINCE LAST ACTION https://reviews.llvm.org/D64335/new/ https://reviews.llvm.org/D64335 From llvm-commits at lists.llvm.org Thu Oct 10 18:26:39 2019 From: llvm-commits at lists.llvm.org (Johannes Doerfert via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 01:26:39 +0000 (UTC) Subject: [PATCH] D68850: [Utils] Deal with occasionally deleted functions Message-ID: jdoerfert created this revision. jdoerfert added reviewers: lebedev.ri, greened, spatel, xbolva00, RKSimon, mehdi_amini. Herald added a subscriber: bollu. Herald added a project: LLVM. When functions exist for some but not all run lines we need to be careful when selecting the prefix. So far, a common prefix was potentially chosen as there was never a "conflict" that would have caused otherwise. With this patch we avoid common prefixes if they are used by run lines that do not emit the function. Tested as part of D68766 and the follow up that adds the Attributor test lines to all argument promotion tests. Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D68850 Files: llvm/utils/UpdateTestChecks/common.py Index: llvm/utils/UpdateTestChecks/common.py =================================================================== --- llvm/utils/UpdateTestChecks/common.py +++ llvm/utils/UpdateTestChecks/common.py @@ -229,16 +229,37 @@ def add_checks(output_lines, comment_marker, prefix_list, func_dict, func_name, check_label_format, is_asm, is_analyze): + # blacklist are prefixes we cannot use to print the function because it doesn't exist in run lines that use these prefixes as well. + blacklist = set() printed_prefixes = [] for p in prefix_list: checkprefixes = p[0] + # If not all checkprefixes of this run line produced the function we cannot check for it as it does not + # exist for this run line. A subset of the check prefixes might know about the function but only because + # other run lines created it. 
+ if any(map(lambda checkprefix: func_name not in func_dict[checkprefix], checkprefixes)): + blacklist |= set(checkprefixes) + continue + + # blacklist is constructed, we can now emit the output + for p in prefix_list: + checkprefixes = p[0] + saved_output = None for checkprefix in checkprefixes: if checkprefix in printed_prefixes: break - # TODO func_dict[checkprefix] may be None, '' or not exist. - # Fix the call sites. - if func_name not in func_dict[checkprefix] or not func_dict[checkprefix][func_name]: - continue + + # prefix is blacklisted. We remember the output as we might need it later but we will not emit anything for the prefix. + if checkprefix in blacklist: + if not saved_output and func_name in func_dict[checkprefix]: + saved_output = func_dict[checkprefix][func_name] + continue + + # If we do not have output for this prefix but there is one saved, we go ahead with this prefix and the saved output. + if not func_dict[checkprefix][func_name]: + if not saved_output: + continue + func_dict[checkprefix][func_name] = saved_output # Add some space between different check prefixes, but not after the last # check line (before the test code). -------------- next part -------------- A non-text attachment was scrubbed... Name: D68850.224521.patch Type: text/x-patch Size: 2157 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Thu Oct 10 18:35:54 2019 From: llvm-commits at lists.llvm.org (Johannes Doerfert via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 01:35:54 +0000 (UTC) Subject: [PATCH] D68153: Make IR labels more precise In-Reply-To: References: Message-ID: jdoerfert added a comment. I stole this and added it into D68819 as well. 
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68153/new/ https://reviews.llvm.org/D68153 From llvm-commits at lists.llvm.org Thu Oct 10 18:35:55 2019 From: llvm-commits at lists.llvm.org (Johannes Doerfert via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 01:35:55 +0000 (UTC) Subject: [PATCH] D68851: [Utils] Allow update_test_checks to scrub attribute annotations Message-ID: jdoerfert created this revision. jdoerfert added reviewers: lebedev.ri, greened, spatel, xbolva00, RKSimon, mehdi_amini. Herald added a subscriber: bollu. Herald added a project: LLVM. Attribute annotations, e.g., #0, are not useful on their own. This patch adds a flag to update_test_checks to scrub them. Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D68851 Files: llvm/utils/UpdateTestChecks/common.py llvm/utils/update_test_checks.py Index: llvm/utils/update_test_checks.py =================================================================== --- llvm/utils/update_test_checks.py +++ llvm/utils/update_test_checks.py @@ -68,9 +68,15 @@ help='Do not scrub IR names') parser.add_argument('--function-signature', action='store_true', help='Keep function signature information around for the check line') + parser.add_argument('--scrub-attributes', action='store_true', + help='Remove attribute annotations (#0) from the end of check line') parser.add_argument('tests', nargs='+') args = parser.parse_args() + # If requested we scrub trailing attribute annotations, e.g., '#0', together with whitespaces + if args.scrub_attributes: + common.SCRUB_TRAILING_WHITESPACE_RE = common.SCRUB_TRAILING_WHITESPACE_AND_ATTRIBUTES_RE + script_name = os.path.basename(__file__) autogenerated_note = (ADVERT + 'utils/' + script_name) Index: llvm/utils/UpdateTestChecks/common.py =================================================================== --- llvm/utils/UpdateTestChecks/common.py +++ llvm/utils/UpdateTestChecks/common.py @@ -68,6 +68,7 @@ 
SCRUB_LEADING_WHITESPACE_RE = re.compile(r'^(\s+)') SCRUB_WHITESPACE_RE = re.compile(r'(?!^(| \w))[ \t]+', flags=re.M) SCRUB_TRAILING_WHITESPACE_RE = re.compile(r'[ \t]+$', flags=re.M) +SCRUB_TRAILING_WHITESPACE_AND_ATTRIBUTES_RE = re.compile(r'([ \t]|(#[0-9]+))+$', flags=re.M) SCRUB_KILL_COMMENT_RE = re.compile(r'^ *#+ +kill:.*\n') SCRUB_LOOP_COMMENT_RE = re.compile( r'# =>This Inner Loop Header:.*|# in Loop:.*', flags=re.M) -------------- next part -------------- A non-text attachment was scrubbed... Name: D68851.224523.patch Type: text/x-patch Size: 1630 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Thu Oct 10 18:35:55 2019 From: llvm-commits at lists.llvm.org (Johannes Doerfert via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 01:35:55 +0000 (UTC) Subject: [PATCH] D68766: [NFC][ArgPromo][Tests] Run update_test_checks on all ArgumentPromotion tests In-Reply-To: References: Message-ID: jdoerfert updated this revision to Diff 224522. jdoerfert added a comment. 
Rerun with D68850 and updated D68819 Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68766/new/ https://reviews.llvm.org/D68766 Files: llvm/test/Transforms/ArgumentPromotion/2008-02-01-ReturnAttrs.ll llvm/test/Transforms/ArgumentPromotion/2008-07-02-array-indexing.ll llvm/test/Transforms/ArgumentPromotion/2008-09-07-CGUpdate.ll llvm/test/Transforms/ArgumentPromotion/2008-09-08-CGUpdateSelfEdge.ll llvm/test/Transforms/ArgumentPromotion/X86/attributes.ll llvm/test/Transforms/ArgumentPromotion/X86/min-legal-vector-width.ll llvm/test/Transforms/ArgumentPromotion/X86/thiscall.ll llvm/test/Transforms/ArgumentPromotion/aggregate-promote.ll llvm/test/Transforms/ArgumentPromotion/attrs.ll llvm/test/Transforms/ArgumentPromotion/basictest.ll llvm/test/Transforms/ArgumentPromotion/byval-2.ll llvm/test/Transforms/ArgumentPromotion/byval.ll llvm/test/Transforms/ArgumentPromotion/chained.ll llvm/test/Transforms/ArgumentPromotion/control-flow.ll llvm/test/Transforms/ArgumentPromotion/control-flow2.ll llvm/test/Transforms/ArgumentPromotion/crash.ll llvm/test/Transforms/ArgumentPromotion/dbg.ll llvm/test/Transforms/ArgumentPromotion/fp80.ll llvm/test/Transforms/ArgumentPromotion/inalloca.ll llvm/test/Transforms/ArgumentPromotion/invalidation.ll llvm/test/Transforms/ArgumentPromotion/musttail.ll llvm/test/Transforms/ArgumentPromotion/naked_functions.ll llvm/test/Transforms/ArgumentPromotion/nonzero-address-spaces.ll llvm/test/Transforms/ArgumentPromotion/pr27568.ll llvm/test/Transforms/ArgumentPromotion/pr3085.ll llvm/test/Transforms/ArgumentPromotion/pr32917.ll llvm/test/Transforms/ArgumentPromotion/pr33641_remove_arg_dbgvalue.ll llvm/test/Transforms/ArgumentPromotion/profile.ll llvm/test/Transforms/ArgumentPromotion/reserve-tbaa.ll llvm/test/Transforms/ArgumentPromotion/sret.ll llvm/test/Transforms/ArgumentPromotion/tail.ll llvm/test/Transforms/ArgumentPromotion/variadic.ll -------------- next part -------------- A non-text attachment was 
scrubbed... Name: D68766.224522.patch Type: text/x-patch Size: 170775 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Thu Oct 10 18:43:37 2019 From: llvm-commits at lists.llvm.org (Volodymyr Sapsai via llvm-commits) Date: Fri, 11 Oct 2019 01:43:37 -0000 Subject: [polly] r374497 - [Stats] Fix polly build due to change in llvm::Statistic constructor in r374490. Message-ID: <20191011014337.2780192632@lists.llvm.org> Author: vsapsai Date: Thu Oct 10 18:43:36 2019 New Revision: 374497 URL: http://llvm.org/viewvc/llvm-project?rev=374497&view=rev Log: [Stats] Fix polly build due to change in llvm::Statistic constructor in r374490. Modified: polly/trunk/lib/Analysis/ScopDetectionDiagnostic.cpp Modified: polly/trunk/lib/Analysis/ScopDetectionDiagnostic.cpp URL: http://llvm.org/viewvc/llvm-project/polly/trunk/lib/Analysis/ScopDetectionDiagnostic.cpp?rev=374497&r1=374496&r2=374497&view=diff ============================================================================== --- polly/trunk/lib/Analysis/ScopDetectionDiagnostic.cpp (original) +++ polly/trunk/lib/Analysis/ScopDetectionDiagnostic.cpp Thu Oct 10 18:43:36 2019 @@ -46,9 +46,7 @@ using namespace llvm; #define SCOP_STAT(NAME, DESC) \ { \ - "polly-detect", "NAME", "Number of rejected regions: " DESC, {0}, { \ - false \ - } \ + "polly-detect", "NAME", "Number of rejected regions: " DESC \ } Statistic RejectStatistics[] = { From llvm-commits at lists.llvm.org Thu Oct 10 18:45:32 2019 From: llvm-commits at lists.llvm.org (Johannes Doerfert via llvm-commits) Date: Fri, 11 Oct 2019 01:45:32 -0000 Subject: [llvm] r374498 - [Attributor][FIX] Do not replace musstail calls with constant Message-ID: <20191011014532.D8AC992686@lists.llvm.org> Author: jdoerfert Date: Thu Oct 10 18:45:32 2019 New Revision: 374498 URL: http://llvm.org/viewvc/llvm-project?rev=374498&view=rev Log: [Attributor][FIX] Do not replace musstail calls with constant Modified: llvm/trunk/lib/Transforms/IPO/Attributor.cpp 
llvm/trunk/test/Transforms/FunctionAttrs/arg_returned.ll Modified: llvm/trunk/lib/Transforms/IPO/Attributor.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/IPO/Attributor.cpp?rev=374498&r1=374497&r2=374498&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/IPO/Attributor.cpp (original) +++ llvm/trunk/lib/Transforms/IPO/Attributor.cpp Thu Oct 10 18:45:32 2019 @@ -997,7 +997,7 @@ ChangeStatus AAReturnedValuesImpl::manif // Callback to replace the uses of CB with the constant C. auto ReplaceCallSiteUsersWith = [](CallBase &CB, Constant &C) { - if (CB.getNumUses() == 0) + if (CB.getNumUses() == 0 || CB.isMustTailCall()) return ChangeStatus::UNCHANGED; CB.replaceAllUsesWith(&C); return ChangeStatus::CHANGED; Modified: llvm/trunk/test/Transforms/FunctionAttrs/arg_returned.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/FunctionAttrs/arg_returned.ll?rev=374498&r1=374497&r2=374498&view=diff ============================================================================== --- llvm/trunk/test/Transforms/FunctionAttrs/arg_returned.ll (original) +++ llvm/trunk/test/Transforms/FunctionAttrs/arg_returned.ll Thu Oct 10 18:45:32 2019 @@ -830,6 +830,11 @@ define i32* @use_const() #0 { ; CHECK: ret i32* bitcast (i8* @G to i32*) ret i32* %c } +define i32* @dont_use_const() #0 { + %c = musttail call i32* @ret_const() + ; CHECK: ret i32* %c + ret i32* %c +} attributes #0 = { noinline nounwind uwtable } From llvm-commits at lists.llvm.org Thu Oct 10 18:50:31 2019 From: llvm-commits at lists.llvm.org (Lang Hames via llvm-commits) Date: Fri, 11 Oct 2019 01:50:31 -0000 Subject: [llvm] r374499 - [JITLink] Fix MachO/arm64 GOTPAGEOFF encoding. Message-ID: <20191011015031.ED31292DAB@lists.llvm.org> Author: lhames Date: Thu Oct 10 18:50:31 2019 New Revision: 374499 URL: http://llvm.org/viewvc/llvm-project?rev=374499&view=rev Log: [JITLink] Fix MachO/arm64 GOTPAGEOFF encoding. 
The original implementation failed to shift the immediate down. This should fix some of the bot failures due to r374476. Modified: llvm/trunk/lib/ExecutionEngine/JITLink/MachO_arm64.cpp Modified: llvm/trunk/lib/ExecutionEngine/JITLink/MachO_arm64.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/ExecutionEngine/JITLink/MachO_arm64.cpp?rev=374499&r1=374498&r2=374499&view=diff ============================================================================== --- llvm/trunk/lib/ExecutionEngine/JITLink/MachO_arm64.cpp (original) +++ llvm/trunk/lib/ExecutionEngine/JITLink/MachO_arm64.cpp Thu Oct 10 18:50:31 2019 @@ -614,12 +614,15 @@ private: } case GOTPageOffset12: { assert(E.getAddend() == 0 && "GOTPAGEOF12 with non-zero addend"); - uint64_t TargetOffset = E.getTarget().getAddress() & 0xfff; uint32_t RawInstr = *(ulittle32_t *)FixupPtr; assert((RawInstr & 0xfffffc00) == 0xf9400000 && "RawInstr isn't a 64-bit LDR immediate"); - uint32_t FixedInstr = RawInstr | (TargetOffset << 10); + + uint32_t TargetOffset = E.getTarget().getAddress() & 0xfff; + assert((TargetOffset & 0x7) == 0 && "GOT entry is not 8-byte aligned"); + uint32_t EncodedImm = (TargetOffset >> 3) << 10; + uint32_t FixedInstr = RawInstr | EncodedImm; *(ulittle32_t *)FixupPtr = FixedInstr; break; } From llvm-commits at lists.llvm.org Thu Oct 10 18:54:14 2019 From: llvm-commits at lists.llvm.org (Douglas Gliner via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 01:54:14 +0000 (UTC) Subject: [PATCH] D68752: [sancov] Use LLVM Support library JSON writer in favor of individual implementation In-Reply-To: References: Message-ID: dgg5503 added a comment. In D68752#1705235 , @kcc wrote: > I was actually hoping to get rid of this code entirely. > Why do you need this change? Hi @kcc, Are you referring to the JSON formatted coverage report or some other aspect of sancov? The original fix found at D51018 rev 2 essentially added support for Windows paths to `coverage-report-server.py`. 
This also required some changes to the JSON writer in sancov and sancov itself to ensure paths were properly normalized and escaped. I figured I might as well have sancov use the LLVM JSON support library to reduce maintenance and also fix any potential escaping issues for free. These changes branched from this suggestion . The original reason for D51018 was to allow coverage dumped from libFuzzer using `-dump-coverage` to be viewed via `coverage-report-server.py` when the binary was compiled on Windows and therefore had Windows paths in the debug info. This feature has since been deprecated in libFuzzer, however, I would argue these patches are still relevant since you can still dump coverage standalone using the sanitizer run-time (see here ). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68752/new/ https://reviews.llvm.org/D68752 From llvm-commits at lists.llvm.org Thu Oct 10 18:54:14 2019 From: llvm-commits at lists.llvm.org (Alexandre Ganea via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 01:54:14 +0000 (UTC) Subject: [PATCH] D68820: win: Move Parallel.h off concrt to cross-platform code In-Reply-To: References: Message-ID: aganea added a comment. Looks slightly better without ConcRT, thanks Nico! Here's some quick results showing the difference for the global hash parallelization in LLD, with MSVC OBJs. The algorithm is iterating on the .debug$T records, for a few thousands OBJs on 72 hyper threads: (we're saving about 2 secs on this test) **Before:** F10214323: lld-link-concrt.PNG **After:** F10214327: lld-link-no-concrt.PNG There's still this memory map lock eating half of the CPU time, I'm not too sure yet how to avoid it, if ever. Maybe touch the pages for the OBJ files in advance? Anyway, unrelated. 
F10214332: lld-link-mmaccessfault-lock.PNG Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68820/new/ https://reviews.llvm.org/D68820 From llvm-commits at lists.llvm.org Thu Oct 10 18:58:12 2019 From: llvm-commits at lists.llvm.org (Lang Hames via llvm-commits) Date: Fri, 11 Oct 2019 01:58:12 -0000 Subject: [llvm] r374500 - [JITLink] Disable the MachO/AArch64 testcase while investigating bot failures. Message-ID: <20191011015813.0426492674@lists.llvm.org> Author: lhames Date: Thu Oct 10 18:58:12 2019 New Revision: 374500 URL: http://llvm.org/viewvc/llvm-project?rev=374500&view=rev Log: [JITLink] Disable the MachO/AArch64 testcase while investigating bot failures. The windows bots are failing due to a memory layout error. Temporarily disabling while I investigate whether this can be worked around, or whether the test should be disabled on Windows. Modified: llvm/trunk/test/ExecutionEngine/JITLink/AArch64/lit.local.cfg Modified: llvm/trunk/test/ExecutionEngine/JITLink/AArch64/lit.local.cfg URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/ExecutionEngine/JITLink/AArch64/lit.local.cfg?rev=374500&r1=374499&r2=374500&view=diff ============================================================================== --- llvm/trunk/test/ExecutionEngine/JITLink/AArch64/lit.local.cfg (original) +++ llvm/trunk/test/ExecutionEngine/JITLink/AArch64/lit.local.cfg Thu Oct 10 18:58:12 2019 @@ -1,2 +1,2 @@ -if not 'AArch64' in config.root.targets: - config.unsupported = True +# if not 'AArch64' in config.root.targets: +config.unsupported = True From llvm-commits at lists.llvm.org Thu Oct 10 19:04:00 2019 From: llvm-commits at lists.llvm.org (Petr Hosek via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 02:04:00 +0000 (UTC) Subject: [PATCH] D68774: [libFuzzer] Don't prefix absolute paths in fuchsia. In-Reply-To: References: Message-ID: <0ada4956a7441e777b9b97cdc6df3fe3@localhost.localdomain> phosek added inline comments. 
================ Comment at: compiler-rt/lib/fuzzer/FuzzerUtilFuchsia.cpp:415 + bool IsAbsolutePath = Path.length() > 1 && Path[0] == '/'; + if (!IsAbsolutePath && Cmd.hasFlag("artifact_prefix")) { + Path = Cmd.getFlagValue("artifact_prefix") + "/" + Path; ---------------- Nit: no curly braces for block with a single statement (LLVM style). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68774/new/ https://reviews.llvm.org/D68774 From llvm-commits at lists.llvm.org Thu Oct 10 19:13:12 2019 From: llvm-commits at lists.llvm.org (Jonathan Metzman via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 02:13:12 +0000 (UTC) Subject: [PATCH] D51018: [sancov] Accommodate sancov and coverage report server for use under Windows In-Reply-To: References: Message-ID: <9c8c4fa55107534c9d49492b6e7925d3@localhost.localdomain> metzman added a comment. I don't consider myself a Windows expert but I don't see anything problematic from a Windows point of view. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D51018/new/ https://reviews.llvm.org/D51018 From llvm-commits at lists.llvm.org Thu Oct 10 19:22:13 2019 From: llvm-commits at lists.llvm.org (Chris Ye via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 02:22:13 +0000 (UTC) Subject: [PATCH] D68633: fix debug info affects output when opt inline In-Reply-To: References: Message-ID: <68d085e12d9aa1c79a05a19c58ba9231@localhost.localdomain> yechunliang marked 5 inline comments as done. yechunliang added a comment. In D68633#1699421 , @bjope wrote: > The code is written in a way that it skips any instruction, but moves contigous blocks of allocas in one splice (not sure exactly why, is that really faster?). 
I also could not understand why the scan continues over the block of allocas after the first alloca instruction that is not use_empty; the first commit gives some reasoning: https://github.com/llvm/llvm-project/commit/6f8865bf9 > Maybe the difference is that the check for AI->useEmpty() only is done for the first alloca in a sequence of alloca instructions? Or can't we just remove the loop at line 1847 (only moving one alloca at a time). With this example test case, the second alloca is use_empty and is inserted into the caller together with the first alloca (!use_empty). But if there is a dbg instruction between the first and second alloca instructions, the continued scan stops at the debug instruction. The program then returns to the enclosing for() loop and handles the second alloca as use_empty (because it has no users such as "xxx.sroa_cast = bitcast %rec1198* %volatileloadslot to i8*") and calls eraseFromParent on it. This differs from inlining without debug info, which does not erase the second alloca instruction. ================ Comment at: llvm/lib/Transforms/Utils/InlineFunction.cpp:1842-1843 + // Debuginfo (@llvm.dbg.value) will make different result, skip while allocas scanning + while (isa(I)) ++I; + ---------------- aprantl wrote: > jmorse wrote: > > Is there a possibility of an unrelated debug instruction being skipped here, and becoming part of the slice moved by lines 1847-1857? Moving dbg.values of arguments to the start of the caller may create a debug use-before-def situation, there could be other problem scenarios too. > > > > Using a debug-instruction filtering iterator (like here [0]) might just do-the-right-thing, I don't know whether feeding one to splice would behave correctly though. > > > > [0] https://github.com/llvm/llvm-project/blob/fdaa74217420729140f1786ea037ac445a724c8e/llvm/lib/Transforms/Utils/SimplifyCFG.cpp#L2592 > Don't we have an iterator that automatically skips debug intrinsics?
> Is there a possibility of an unrelated debug instruction being skipped here, By default, a debug instruction between allocas is inserted into the caller together with the block of allocas; I just kept the default behavior. I am not sure whether we need to remove the dbg instruction when inlining into the caller. > Using a debug-instruction filtering iterator (like here [0]) might just do-the-right-thing, I could not find a way to use debug-instruction filtering such as instructionsWithoutDebug(llvm::Instruction) with a BasicBlock::iterator. The types do not match, so I used DbgInfoIntrinsic for simplicity. ================ Comment at: llvm/lib/Transforms/Utils/InlineFunction.cpp:1851 ++I; } ---------------- jdoerfert wrote: > Don't we need to have similar logic here? What happens if there are two allocas, then the dbg intrinsic, then another one? Thanks for the comments, this is a good case. I have updated the code and test case to handle this. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68633/new/ https://reviews.llvm.org/D68633 From llvm-commits at lists.llvm.org Thu Oct 10 19:31:23 2019 From: llvm-commits at lists.llvm.org (Marco Vanotti via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 02:31:23 +0000 (UTC) Subject: [PATCH] D68774: [libFuzzer] Don't prefix absolute paths in fuchsia. In-Reply-To: References: Message-ID: <3784668c5a70f05911a9f44088a932ee@localhost.localdomain> charco updated this revision to Diff 224525. charco added a comment. Remove braces from single-statement if blocks Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68774/new/ https://reviews.llvm.org/D68774 Files: compiler-rt/lib/fuzzer/FuzzerUtilFuchsia.cpp Index: compiler-rt/lib/fuzzer/FuzzerUtilFuchsia.cpp =================================================================== --- compiler-rt/lib/fuzzer/FuzzerUtilFuchsia.cpp +++ compiler-rt/lib/fuzzer/FuzzerUtilFuchsia.cpp @@ -415,13 +415,14 @@ // that lacks a mutable working directory.
Fortunately, when this is the case // a mutable output directory must be specified using "-artifact_prefix=...", // so write the log file(s) there. + // However, we don't want to apply this logic for absolute paths. int FdOut = STDOUT_FILENO; if (Cmd.hasOutputFile()) { - std::string Path; - if (Cmd.hasFlag("artifact_prefix")) - Path = Cmd.getFlagValue("artifact_prefix") + "/" + Cmd.getOutputFile(); - else - Path = Cmd.getOutputFile(); + std::string Path = Cmd.getOutputFile(); + bool IsAbsolutePath = Path.length() > 1 && Path[0] == '/'; + if (!IsAbsolutePath && Cmd.hasFlag("artifact_prefix")) + Path = Cmd.getFlagValue("artifact_prefix") + "/" + Path; + FdOut = open(Path.c_str(), O_WRONLY | O_CREAT | O_TRUNC, 0); if (FdOut == -1) { Printf("libFuzzer: failed to open %s: %s\n", Path.c_str(), -------------- next part -------------- A non-text attachment was scrubbed... Name: D68774.224525.patch Type: text/x-patch Size: 1148 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Thu Oct 10 19:31:23 2019 From: llvm-commits at lists.llvm.org (Marco Vanotti via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 02:31:23 +0000 (UTC) Subject: [PATCH] D68774: [libFuzzer] Don't prefix absolute paths in fuchsia. In-Reply-To: References: Message-ID: <708d4c4ebf2302dffc21d1d726cfc16d@localhost.localdomain> charco marked 2 inline comments as done. charco added inline comments. ================ Comment at: compiler-rt/lib/fuzzer/FuzzerUtilFuchsia.cpp:415 + bool IsAbsolutePath = Path.length() > 1 && Path[0] == '/'; + if (!IsAbsolutePath && Cmd.hasFlag("artifact_prefix")) { + Path = Cmd.getFlagValue("artifact_prefix") + "/" + Path; ---------------- phosek wrote: > Nit: no curly braces for block with a single statement (LLVM style). Thanks! I didn't know about this rule. Could you point me to where it is defined? I checked http://llvm.org/docs/CodingStandards.html#source-code-formatting and also tried to see whether clang-format -style=LLVM caught it, but found nothing.
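For readers following the D68774 diff above, the path handling it introduces can be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the function name `resolve_output_path` and its parameters are hypothetical and do not exist in libFuzzer; the actual change is the C++ shown in the patch to FuzzerUtilFuchsia.cpp.

```python
from typing import Optional


def resolve_output_path(output_file: str, artifact_prefix: Optional[str]) -> str:
    """Hypothetical mirror of the patched logic: prefix only non-absolute paths.

    `artifact_prefix` stands in for the value of the "-artifact_prefix=..."
    flag, or None when the flag is absent.
    """
    # Same test as the patch: longer than one character and starting with '/'.
    is_absolute = len(output_file) > 1 and output_file[0] == "/"
    if not is_absolute and artifact_prefix is not None:
        return artifact_prefix + "/" + output_file
    return output_file


if __name__ == "__main__":
    print(resolve_output_path("log.txt", "/data"))
    print(resolve_output_path("/tmp/log.txt", "/data"))
```

Note that, as in the patch, a bare "/" fails the length check and is therefore still prefixed; whether that edge case matters in practice is not discussed in the review.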
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68774/new/ https://reviews.llvm.org/D68774 From llvm-commits at lists.llvm.org Thu Oct 10 19:42:17 2019 From: llvm-commits at lists.llvm.org (Volodymyr Sapsai via llvm-commits) Date: Fri, 11 Oct 2019 02:42:17 -0000 Subject: [polly] r374501 - [Stats] More polly fixes following llvm::Statistic changes in r374490. Message-ID: <20191011024217.3248292D21@lists.llvm.org> Author: vsapsai Date: Thu Oct 10 19:42:16 2019 New Revision: 374501 URL: http://llvm.org/viewvc/llvm-project?rev=374501&view=rev Log: [Stats] More polly fixes following llvm::Statistic changes in r374490. Modified: polly/trunk/lib/Transform/ScheduleOptimizer.cpp polly/trunk/lib/Transform/Simplify.cpp Modified: polly/trunk/lib/Transform/ScheduleOptimizer.cpp URL: http://llvm.org/viewvc/llvm-project/polly/trunk/lib/Transform/ScheduleOptimizer.cpp?rev=374501&r1=374500&r2=374501&view=diff ============================================================================== --- polly/trunk/lib/Transform/ScheduleOptimizer.cpp (original) +++ polly/trunk/lib/Transform/ScheduleOptimizer.cpp Thu Oct 10 19:42:16 2019 @@ -276,9 +276,9 @@ STATISTIC(NumBoxedLoopsOptimized, "Numbe #define THREE_STATISTICS(VARNAME, DESC) \ static Statistic VARNAME[3] = { \ - {DEBUG_TYPE, #VARNAME "0", DESC " (original)", {0}, {false}}, \ - {DEBUG_TYPE, #VARNAME "1", DESC " (after scheduler)", {0}, {false}}, \ - {DEBUG_TYPE, #VARNAME "2", DESC " (after optimizer)", {0}, {false}}} + {DEBUG_TYPE, #VARNAME "0", DESC " (original)"}, \ + {DEBUG_TYPE, #VARNAME "1", DESC " (after scheduler)"}, \ + {DEBUG_TYPE, #VARNAME "2", DESC " (after optimizer)"}} THREE_STATISTICS(NumBands, "Number of bands"); THREE_STATISTICS(NumBandMembers, "Number of band members"); Modified: polly/trunk/lib/Transform/Simplify.cpp URL: http://llvm.org/viewvc/llvm-project/polly/trunk/lib/Transform/Simplify.cpp?rev=374501&r1=374500&r2=374501&view=diff 
============================================================================== --- polly/trunk/lib/Transform/Simplify.cpp (original) +++ polly/trunk/lib/Transform/Simplify.cpp Thu Oct 10 19:42:16 2019 @@ -28,8 +28,8 @@ namespace { #define TWO_STATISTICS(VARNAME, DESC) \ static llvm::Statistic VARNAME[2] = { \ - {DEBUG_TYPE, #VARNAME "0", DESC " (first)", {0}, {false}}, \ - {DEBUG_TYPE, #VARNAME "1", DESC " (second)", {0}, {false}}} + {DEBUG_TYPE, #VARNAME "0", DESC " (first)"}, \ + {DEBUG_TYPE, #VARNAME "1", DESC " (second)"}} /// Number of max disjuncts we allow in removeOverwrites(). This is to avoid /// that the analysis of accesses in a statement is becoming too complex. Chosen From llvm-commits at lists.llvm.org Thu Oct 10 19:40:27 2019 From: llvm-commits at lists.llvm.org (Johannes Doerfert via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 02:40:27 +0000 (UTC) Subject: [PATCH] D68852: [Attributor] Pointer privatization attribute (argument promotion) Message-ID: jdoerfert created this revision. jdoerfert added reviewers: uenoku, sstefan1, lebedev.ri, hfinkel, vsk, dblaikie, davidxl, tejohnson, tstellar, echristo, chandlerc, efriedma. Herald added subscribers: arphaman, bollu, hiraditya. Herald added a project: LLVM. jdoerfert added a comment. I went through all the tests, added mem2reg/sroa where appropriate and modified the source sometimes, mostly to avoid UB. I think the results of the Attributor look good, all problems should have been addressed already. A pointer is privatizable if it can be replaced by a new, private one. Privatizing a pointer reduces the use count and the interaction between unrelated code parts. This is a first step towards replacing argument promotion. While we can already handle recursion (unlike argument promotion!) we are restricted to stack allocations for now because we do not analyze the uses in the callee. All argument promotion tests now run the Attributor as well.
Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D68852 Files: llvm/include/llvm/Transforms/IPO/ArgumentPromotion.h llvm/include/llvm/Transforms/IPO/Attributor.h llvm/lib/Transforms/IPO/ArgumentPromotion.cpp llvm/lib/Transforms/IPO/Attributor.cpp llvm/test/Transforms/ArgumentPromotion/2008-02-01-ReturnAttrs.ll llvm/test/Transforms/ArgumentPromotion/2008-07-02-array-indexing.ll llvm/test/Transforms/ArgumentPromotion/2008-09-07-CGUpdate.ll llvm/test/Transforms/ArgumentPromotion/2008-09-08-CGUpdateSelfEdge.ll llvm/test/Transforms/ArgumentPromotion/X86/attributes.ll llvm/test/Transforms/ArgumentPromotion/X86/min-legal-vector-width.ll llvm/test/Transforms/ArgumentPromotion/X86/thiscall.ll llvm/test/Transforms/ArgumentPromotion/aggregate-promote.ll llvm/test/Transforms/ArgumentPromotion/attrs.ll llvm/test/Transforms/ArgumentPromotion/basictest.ll llvm/test/Transforms/ArgumentPromotion/byval-2.ll llvm/test/Transforms/ArgumentPromotion/byval.ll llvm/test/Transforms/ArgumentPromotion/chained.ll llvm/test/Transforms/ArgumentPromotion/control-flow.ll llvm/test/Transforms/ArgumentPromotion/control-flow2.ll llvm/test/Transforms/ArgumentPromotion/crash.ll llvm/test/Transforms/ArgumentPromotion/dbg.ll llvm/test/Transforms/ArgumentPromotion/fp80.ll llvm/test/Transforms/ArgumentPromotion/inalloca.ll llvm/test/Transforms/ArgumentPromotion/invalidation.ll llvm/test/Transforms/ArgumentPromotion/musttail.ll llvm/test/Transforms/ArgumentPromotion/naked_functions.ll llvm/test/Transforms/ArgumentPromotion/nonzero-address-spaces.ll llvm/test/Transforms/ArgumentPromotion/pr27568.ll llvm/test/Transforms/ArgumentPromotion/pr3085.ll llvm/test/Transforms/ArgumentPromotion/pr32917.ll llvm/test/Transforms/ArgumentPromotion/pr33641_remove_arg_dbgvalue.ll llvm/test/Transforms/ArgumentPromotion/profile.ll llvm/test/Transforms/ArgumentPromotion/reserve-tbaa.ll llvm/test/Transforms/ArgumentPromotion/sret.ll llvm/test/Transforms/ArgumentPromotion/tail.ll 
llvm/test/Transforms/ArgumentPromotion/variadic.ll llvm/test/Transforms/FunctionAttrs/callbacks.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D68852.224526.patch Type: text/x-patch Size: 157660 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Thu Oct 10 19:40:27 2019 From: llvm-commits at lists.llvm.org (Johannes Doerfert via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 02:40:27 +0000 (UTC) Subject: [PATCH] D68852: [Attributor] Pointer privatization attribute (argument promotion) In-Reply-To: References: Message-ID: <32b3082fc60a8cac5ed201a5f45266a4@localhost.localdomain> jdoerfert added a comment. I went through all the tests, added mem2reg/sroa where appropriate and modified the source sometimes, mostly to avoid UB. I think the results of the Attributor look good, all problems should have been addressed already. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68852/new/ https://reviews.llvm.org/D68852 From llvm-commits at lists.llvm.org Thu Oct 10 19:44:20 2019 From: llvm-commits at lists.llvm.org (Nico Weber via llvm-commits) Date: Fri, 11 Oct 2019 02:44:20 -0000 Subject: [llvm] r374503 - Revert 374481 "[tsan, msan] Insert module constructors in a module pass" Message-ID: <20191011024420.C2FD492DA8@lists.llvm.org> Author: nico Date: Thu Oct 10 19:44:20 2019 New Revision: 374503 URL: http://llvm.org/viewvc/llvm-project?rev=374503&view=rev Log: Revert 374481 "[tsan,msan] Insert module constructors in a module pass" CodeGen/sanitizer-module-constructor.c fails on mac and windows, see e.g.
http://lab.llvm.org:8011/builders/clang-x64-windows-msvc/builds/11424 Modified: llvm/trunk/include/llvm/Transforms/Instrumentation/MemorySanitizer.h llvm/trunk/include/llvm/Transforms/Instrumentation/ThreadSanitizer.h llvm/trunk/lib/Passes/PassRegistry.def llvm/trunk/lib/Transforms/Instrumentation/MemorySanitizer.cpp llvm/trunk/lib/Transforms/Instrumentation/ThreadSanitizer.cpp llvm/trunk/test/Instrumentation/MemorySanitizer/msan_basic.ll llvm/trunk/test/Instrumentation/ThreadSanitizer/tsan_basic.ll Modified: llvm/trunk/include/llvm/Transforms/Instrumentation/MemorySanitizer.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Transforms/Instrumentation/MemorySanitizer.h?rev=374503&r1=374502&r2=374503&view=diff ============================================================================== --- llvm/trunk/include/llvm/Transforms/Instrumentation/MemorySanitizer.h (original) +++ llvm/trunk/include/llvm/Transforms/Instrumentation/MemorySanitizer.h Thu Oct 10 19:44:20 2019 @@ -40,7 +40,6 @@ struct MemorySanitizerPass : public Pass MemorySanitizerPass(MemorySanitizerOptions Options) : Options(Options) {} PreservedAnalyses run(Function &F, FunctionAnalysisManager &FAM); - PreservedAnalyses run(Module &M, ModuleAnalysisManager &AM); private: MemorySanitizerOptions Options; Modified: llvm/trunk/include/llvm/Transforms/Instrumentation/ThreadSanitizer.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Transforms/Instrumentation/ThreadSanitizer.h?rev=374503&r1=374502&r2=374503&view=diff ============================================================================== --- llvm/trunk/include/llvm/Transforms/Instrumentation/ThreadSanitizer.h (original) +++ llvm/trunk/include/llvm/Transforms/Instrumentation/ThreadSanitizer.h Thu Oct 10 19:44:20 2019 @@ -27,8 +27,6 @@ FunctionPass *createThreadSanitizerLegac /// yet, the pass inserts the declarations. 
Otherwise the existing globals are struct ThreadSanitizerPass : public PassInfoMixin { PreservedAnalyses run(Function &F, FunctionAnalysisManager &FAM); - PreservedAnalyses run(Module &M, ModuleAnalysisManager &AM); }; - } // namespace llvm #endif /* LLVM_TRANSFORMS_INSTRUMENTATION_THREADSANITIZER_H */ Modified: llvm/trunk/lib/Passes/PassRegistry.def URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Passes/PassRegistry.def?rev=374503&r1=374502&r2=374503&view=diff ============================================================================== --- llvm/trunk/lib/Passes/PassRegistry.def (original) +++ llvm/trunk/lib/Passes/PassRegistry.def Thu Oct 10 19:44:20 2019 @@ -86,8 +86,6 @@ MODULE_PASS("synthetic-counts-propagatio MODULE_PASS("wholeprogramdevirt", WholeProgramDevirtPass(nullptr, nullptr)) MODULE_PASS("verify", VerifierPass()) MODULE_PASS("asan-module", ModuleAddressSanitizerPass(/*CompileKernel=*/false, false, true, false)) -MODULE_PASS("msan-module", MemorySanitizerPass({})) -MODULE_PASS("tsan-module", ThreadSanitizerPass()) MODULE_PASS("kasan-module", ModuleAddressSanitizerPass(/*CompileKernel=*/true, false, true, false)) MODULE_PASS("sancov-module", ModuleSanitizerCoveragePass()) MODULE_PASS("poison-checking", PoisonCheckingPass()) Modified: llvm/trunk/lib/Transforms/Instrumentation/MemorySanitizer.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Instrumentation/MemorySanitizer.cpp?rev=374503&r1=374502&r2=374503&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Instrumentation/MemorySanitizer.cpp (original) +++ llvm/trunk/lib/Transforms/Instrumentation/MemorySanitizer.cpp Thu Oct 10 19:44:20 2019 @@ -587,25 +587,9 @@ private: /// An empty volatile inline asm that prevents callback merge. 
InlineAsm *EmptyAsm; -}; -void insertModuleCtor(Module &M) { - getOrCreateSanitizerCtorAndInitFunctions( - M, kMsanModuleCtorName, kMsanInitName, - /*InitArgTypes=*/{}, - /*InitArgs=*/{}, - // This callback is invoked when the functions are created the first - // time. Hook them into the global ctors list in that case: - [&](Function *Ctor, FunctionCallee) { - if (!ClWithComdat) { - appendToGlobalCtors(M, Ctor, 0); - return; - } - Comdat *MsanCtorComdat = M.getOrInsertComdat(kMsanModuleCtorName); - Ctor->setComdat(MsanCtorComdat); - appendToGlobalCtors(M, Ctor, 0, Ctor); - }); -} + Function *MsanCtorFunction; +}; /// A legacy function pass for msan instrumentation. /// @@ -651,14 +635,6 @@ PreservedAnalyses MemorySanitizerPass::r return PreservedAnalyses::all(); } -PreservedAnalyses MemorySanitizerPass::run(Module &M, - ModuleAnalysisManager &AM) { - if (Options.Kernel) - return PreservedAnalyses::all(); - insertModuleCtor(M); - return PreservedAnalyses::none(); -} - char MemorySanitizerLegacyPass::ID = 0; INITIALIZE_PASS_BEGIN(MemorySanitizerLegacyPass, "msan", @@ -944,6 +920,23 @@ void MemorySanitizer::initializeModule(M OriginStoreWeights = MDBuilder(*C).createBranchWeights(1, 1000); if (!CompileKernel) { + std::tie(MsanCtorFunction, std::ignore) = + getOrCreateSanitizerCtorAndInitFunctions( + M, kMsanModuleCtorName, kMsanInitName, + /*InitArgTypes=*/{}, + /*InitArgs=*/{}, + // This callback is invoked when the functions are created the first + // time. 
Hook them into the global ctors list in that case: + [&](Function *Ctor, FunctionCallee) { + if (!ClWithComdat) { + appendToGlobalCtors(M, Ctor, 0); + return; + } + Comdat *MsanCtorComdat = M.getOrInsertComdat(kMsanModuleCtorName); + Ctor->setComdat(MsanCtorComdat); + appendToGlobalCtors(M, Ctor, 0, Ctor); + }); + if (TrackOrigins) M.getOrInsertGlobal("__msan_track_origins", IRB.getInt32Ty(), [&] { return new GlobalVariable( @@ -961,8 +954,6 @@ void MemorySanitizer::initializeModule(M } bool MemorySanitizerLegacyPass::doInitialization(Module &M) { - if (!Options.Kernel) - insertModuleCtor(M); MSan.emplace(M, Options); return true; } @@ -4587,9 +4578,8 @@ static VarArgHelper *CreateVarArgHelper( } bool MemorySanitizer::sanitizeFunction(Function &F, TargetLibraryInfo &TLI) { - if (!CompileKernel && F.getName() == kMsanModuleCtorName) + if (!CompileKernel && (&F == MsanCtorFunction)) return false; - MemorySanitizerVisitor Visitor(F, *this, TLI); // Clear out readonly/readnone attributes. Modified: llvm/trunk/lib/Transforms/Instrumentation/ThreadSanitizer.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Instrumentation/ThreadSanitizer.cpp?rev=374503&r1=374502&r2=374503&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Instrumentation/ThreadSanitizer.cpp (original) +++ llvm/trunk/lib/Transforms/Instrumentation/ThreadSanitizer.cpp Thu Oct 10 19:44:20 2019 @@ -92,10 +92,11 @@ namespace { /// ensures the __tsan_init function is in the list of global constructors for /// the module. 
struct ThreadSanitizer { + ThreadSanitizer(Module &M); bool sanitizeFunction(Function &F, const TargetLibraryInfo &TLI); private: - void initialize(Module &M); + void initializeCallbacks(Module &M); bool instrumentLoadOrStore(Instruction *I, const DataLayout &DL); bool instrumentAtomic(Instruction *I, const DataLayout &DL); bool instrumentMemIntrinsic(Instruction *I); @@ -107,6 +108,8 @@ private: void InsertRuntimeIgnores(Function &F); Type *IntptrTy; + IntegerType *OrdTy; + // Callbacks to run-time library are computed in doInitialization. FunctionCallee TsanFuncEntry; FunctionCallee TsanFuncExit; FunctionCallee TsanIgnoreBegin; @@ -127,6 +130,7 @@ private: FunctionCallee TsanVptrUpdate; FunctionCallee TsanVptrLoad; FunctionCallee MemmoveFn, MemcpyFn, MemsetFn; + Function *TsanCtorFunction; }; struct ThreadSanitizerLegacyPass : FunctionPass { @@ -139,32 +143,16 @@ struct ThreadSanitizerLegacyPass : Funct private: Optional TSan; }; - -void insertModuleCtor(Module &M) { - getOrCreateSanitizerCtorAndInitFunctions( - M, kTsanModuleCtorName, kTsanInitName, /*InitArgTypes=*/{}, - /*InitArgs=*/{}, - // This callback is invoked when the functions are created the first - // time. 
Hook them into the global ctors list in that case: - [&](Function *Ctor, FunctionCallee) { appendToGlobalCtors(M, Ctor, 0); }); -} - } // namespace PreservedAnalyses ThreadSanitizerPass::run(Function &F, FunctionAnalysisManager &FAM) { - ThreadSanitizer TSan; + ThreadSanitizer TSan(*F.getParent()); if (TSan.sanitizeFunction(F, FAM.getResult(F))) return PreservedAnalyses::none(); return PreservedAnalyses::all(); } -PreservedAnalyses ThreadSanitizerPass::run(Module &M, - ModuleAnalysisManager &MAM) { - insertModuleCtor(M); - return PreservedAnalyses::none(); -} - char ThreadSanitizerLegacyPass::ID = 0; INITIALIZE_PASS_BEGIN(ThreadSanitizerLegacyPass, "tsan", "ThreadSanitizer: detects data races.", false, false) @@ -181,8 +169,7 @@ void ThreadSanitizerLegacyPass::getAnaly } bool ThreadSanitizerLegacyPass::doInitialization(Module &M) { - insertModuleCtor(M); - TSan.emplace(); + TSan.emplace(M); return true; } @@ -196,10 +183,7 @@ FunctionPass *llvm::createThreadSanitize return new ThreadSanitizerLegacyPass(); } -void ThreadSanitizer::initialize(Module &M) { - const DataLayout &DL = M.getDataLayout(); - IntptrTy = DL.getIntPtrType(M.getContext()); - +void ThreadSanitizer::initializeCallbacks(Module &M) { IRBuilder<> IRB(M.getContext()); AttributeList Attr; Attr = Attr.addAttribute(M.getContext(), AttributeList::FunctionIndex, @@ -213,7 +197,7 @@ void ThreadSanitizer::initialize(Module IRB.getVoidTy()); TsanIgnoreEnd = M.getOrInsertFunction("__tsan_ignore_thread_end", Attr, IRB.getVoidTy()); - IntegerType *OrdTy = IRB.getInt32Ty(); + OrdTy = IRB.getInt32Ty(); for (size_t i = 0; i < kNumberOfAccessSizes; ++i) { const unsigned ByteSize = 1U << i; const unsigned BitSize = ByteSize * 8; @@ -296,6 +280,20 @@ void ThreadSanitizer::initialize(Module IRB.getInt8PtrTy(), IRB.getInt32Ty(), IntptrTy); } +ThreadSanitizer::ThreadSanitizer(Module &M) { + const DataLayout &DL = M.getDataLayout(); + IntptrTy = DL.getIntPtrType(M.getContext()); + std::tie(TsanCtorFunction, std::ignore) = 
+ getOrCreateSanitizerCtorAndInitFunctions( + M, kTsanModuleCtorName, kTsanInitName, /*InitArgTypes=*/{}, + /*InitArgs=*/{}, + // This callback is invoked when the functions are created the first + // time. Hook them into the global ctors list in that case: + [&](Function *Ctor, FunctionCallee) { + appendToGlobalCtors(M, Ctor, 0); + }); +} + static bool isVtableAccess(Instruction *I) { if (MDNode *Tag = I->getMetadata(LLVMContext::MD_tbaa)) return Tag->isTBAAVtableAccess(); @@ -438,9 +436,9 @@ bool ThreadSanitizer::sanitizeFunction(F const TargetLibraryInfo &TLI) { // This is required to prevent instrumenting call to __tsan_init from within // the module constructor. - if (F.getName() == kTsanModuleCtorName) + if (&F == TsanCtorFunction) return false; - initialize(*F.getParent()); + initializeCallbacks(*F.getParent()); SmallVector AllLoadsAndStores; SmallVector LocalLoadsAndStores; SmallVector AtomicAccesses; Modified: llvm/trunk/test/Instrumentation/MemorySanitizer/msan_basic.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Instrumentation/MemorySanitizer/msan_basic.ll?rev=374503&r1=374502&r2=374503&view=diff ============================================================================== --- llvm/trunk/test/Instrumentation/MemorySanitizer/msan_basic.ll (original) +++ llvm/trunk/test/Instrumentation/MemorySanitizer/msan_basic.ll Thu Oct 10 19:44:20 2019 @@ -1,9 +1,10 @@ -; RUN: opt < %s -msan-check-access-address=0 -S -passes='module(msan-module),function(msan)' 2>&1 | FileCheck -allow-deprecated-dag-overlap %s -; RUN: opt < %s --passes='module(msan-module),function(msan)' -msan-check-access-address=0 -S | FileCheck -allow-deprecated-dag-overlap %s -; RUN: opt < %s -msan-check-access-address=0 -msan-track-origins=1 -S -passes='module(msan-module),function(msan)' 2>&1 | \ -; RUN: FileCheck -allow-deprecated-dag-overlap -check-prefixes=CHECK,CHECK-ORIGINS %s -; RUN: opt < %s -passes='module(msan-module),function(msan)' -msan-check-access-address=0 
-msan-track-origins=1 -S | \ -; RUN: FileCheck -allow-deprecated-dag-overlap -check-prefixes=CHECK,CHECK-ORIGINS %s +; RUN: opt < %s -msan-check-access-address=0 -S -passes=msan 2>&1 | FileCheck \ +; RUN: -allow-deprecated-dag-overlap %s +; RUN: opt < %s -msan -msan-check-access-address=0 -S | FileCheck -allow-deprecated-dag-overlap %s +; RUN: opt < %s -msan-check-access-address=0 -msan-track-origins=1 -S \ +; RUN: -passes=msan 2>&1 | FileCheck -allow-deprecated-dag-overlap \ +; RUN: -check-prefix=CHECK -check-prefix=CHECK-ORIGINS %s +; RUN: opt < %s -msan -msan-check-access-address=0 -msan-track-origins=1 -S | FileCheck -allow-deprecated-dag-overlap -check-prefix=CHECK -check-prefix=CHECK-ORIGINS %s target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64-S128" target triple = "x86_64-unknown-linux-gnu" Modified: llvm/trunk/test/Instrumentation/ThreadSanitizer/tsan_basic.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Instrumentation/ThreadSanitizer/tsan_basic.ll?rev=374503&r1=374502&r2=374503&view=diff ============================================================================== --- llvm/trunk/test/Instrumentation/ThreadSanitizer/tsan_basic.ll (original) +++ llvm/trunk/test/Instrumentation/ThreadSanitizer/tsan_basic.ll Thu Oct 10 19:44:20 2019 @@ -1,5 +1,5 @@ ; RUN: opt < %s -tsan -S | FileCheck %s -; RUN: opt < %s -passes='function(tsan),module(tsan-module)' -S | FileCheck %s +; RUN: opt < %s -passes=tsan -S | FileCheck %s target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64" target triple = "x86_64-unknown-linux-gnu" From llvm-commits at lists.llvm.org Thu Oct 10 19:42:19 2019 From: llvm-commits at lists.llvm.org (Nico Weber via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 02:42:19 +0000 (UTC) Subject: [PATCH] D68832: [tsan,msan] 
Insert module constructors in a module pass In-Reply-To: References: Message-ID: <668eafeed7ce9b75437666a4245164dd@localhost.localdomain> thakis added a comment. Reverted in r374503. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68832/new/ https://reviews.llvm.org/D68832 From llvm-commits at lists.llvm.org Thu Oct 10 19:49:37 2019 From: llvm-commits at lists.llvm.org (ChenZheng via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 02:49:37 +0000 (UTC) Subject: [PATCH] D68237: [PowerPC] Handle f16 as a storage type only In-Reply-To: References: Message-ID: <75c772fba3947dcbb20b73227777275b@localhost.localdomain> shchenz added inline comments. ================ Comment at: lib/Target/PowerPC/PPCISelLowering.cpp:184 + setTruncStoreAction(MVT::f32, MVT::f16, Expand); + } + ---------------- Do we need to handle ppcf128 also? ================ Comment at: lib/Target/PowerPC/PPCInstrVSX.td:114 [SDNPHasChain, SDNPMayLoad, SDNPMemOperand]>; +def extloadf16 : PatFrag<(ops node:$ptr), (extload node:$ptr)> { + let IsLoad = 1; ---------------- Guard under IsISA3_0? ================ Comment at: lib/Target/PowerPC/PPCInstrVSX.td:3263 (v2i64 (XXPERMDIs (VEXTSH2Ds (LXSIHZX xoaddr:$src)), 0))>; + // Load/convert and convert/store patterns for f16. ---------------- Guard under IsISA3_0? ================ Comment at: test/CodeGen/PowerPC/handle-f16-storage-type.ll:8 +; Function Attrs: nounwind readonly +define dso_local double @loadd(i16* nocapture readonly %a) local_unnamed_addr #0 { +; P8-LABEL: loadd: ---------------- `#0 `, seems all the function attributes are not defined? 
Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68237/new/

https://reviews.llvm.org/D68237

From llvm-commits at lists.llvm.org Thu Oct 10 19:49:37 2019
From: llvm-commits at lists.llvm.org (LiuChen via Phabricator via llvm-commits)
Date: Fri, 11 Oct 2019 02:49:37 +0000 (UTC)
Subject: [PATCH] D68854: add mayRaiseFPException flag and FPCW registers for X87 instructions
Message-ID:

LiuChen3 created this revision.
LiuChen3 added a reviewer: pengfei.
Herald added subscribers: llvm-commits, hiraditya.
Herald added a project: LLVM.

This patch adds the "mayRaiseFPException" flag, FPCW, and FPSW to X87 instructions that could raise a floating-point exception.

https://reviews.llvm.org/D68854

Files:
  llvm/lib/Target/X86/X86InstrFPStack.td
  llvm/lib/Target/X86/X86InstrFormats.td

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68854.224524.patch
Type: text/x-patch
Size: 11614 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Thu Oct 10 19:49:37 2019
From: llvm-commits at lists.llvm.org (Johannes Doerfert via Phabricator via llvm-commits)
Date: Fri, 11 Oct 2019 02:49:37 +0000 (UTC)
Subject: [PATCH] D68766: [NFC][ArgPromo][Tests] Run update_test_checks on all ArgumentPromotion tests
In-Reply-To:
References:
Message-ID:

jdoerfert added a comment.

In D68766#1704577, @lebedev.ri wrote:

> If you want to add attributor runlines, i really insist on following the example i showed, it will result in cleaner diff overall.
> That being said i like that the attributor runlines are in a separate diff.

D68852 adds the "privatizable pointer" attribute to the Attributor and runs it on all these tests. I can split it so we first run it without the privatizable pointer attribute, but it's unclear if that is helpful.
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68766/new/ https://reviews.llvm.org/D68766 From llvm-commits at lists.llvm.org Thu Oct 10 19:49:38 2019 From: llvm-commits at lists.llvm.org (Petr Hosek via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 02:49:38 +0000 (UTC) Subject: [PATCH] D68833: [CMake] Re-order runtimes in the order of dependencies In-Reply-To: References: Message-ID: <02fa787e77334dbbe04453a512728f3c@localhost.localdomain> phosek updated this revision to Diff 224533. phosek added a reviewer: smeenai. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68833/new/ https://reviews.llvm.org/D68833 Files: llvm/runtimes/CMakeLists.txt Index: llvm/runtimes/CMakeLists.txt =================================================================== --- llvm/runtimes/CMakeLists.txt +++ llvm/runtimes/CMakeLists.txt @@ -38,10 +38,10 @@ set(LLVM_EXTERNAL_${canon_name}_SOURCE_DIR "${CMAKE_CURRENT_SOURCE_DIR}/../../${proj}") endforeach() -function(get_compiler_rt_path path) +function(get_runtime_path runtime path) foreach(entry ${runtimes}) get_filename_component(projName ${entry} NAME) - if("${projName}" MATCHES "compiler-rt") + if("${projName}" MATCHES "${runtime}") set(${path} ${entry} PARENT_SCOPE) return() endif() @@ -68,18 +68,6 @@ "${CMAKE_CURRENT_SOURCE_DIR}/../cmake/modules" ) - # Some of the runtimes will conditionally use the compiler-rt sanitizers - # to make this work smoothly we ensure that compiler-rt is added first in - # the list of sub-projects. This allows other sub-projects to have checks - # like `if(TARGET asan)` to enable building with asan. 
- get_compiler_rt_path(compiler_rt_path) - if(compiler_rt_path) - list(REMOVE_ITEM runtimes ${compiler_rt_path}) - if(NOT DEFINED LLVM_BUILD_COMPILER_RT OR LLVM_BUILD_COMPILER_RT) - list(INSERT runtimes 0 ${compiler_rt_path}) - endif() - endif() - # Setting these variables will allow the sub-build to put their outputs into # the library and bin directories of the top-level build. set(LLVM_LIBRARY_OUTPUT_INTDIR ${LLVM_LIBRARY_DIR}) @@ -122,6 +110,17 @@ include(UseLibtool) endif() + # Re-order runtimes in the order of dependencies: libcxxabi depend on libunwind, + # libcxx depends on libcxxabi, some compiler-rt runtimes depend on libcxx. This + # allows these runtimes to have checks like `if(TARGET ${runtime})`. + foreach(runtime compiler-rt libcxx libcxxabi libunwind) + get_runtime_path(${runtime} runtime_path) + if(runtime_path) + list(REMOVE_ITEM runtimes ${runtime_path}) + list(INSERT runtimes 0 ${runtime_path}) + endif() + endforeach() + # This can be used to detect whether we're in the runtimes build. set(RUNTIMES_BUILD ON) @@ -291,7 +290,7 @@ # If compiler-rt is present we need to build the builtin libraries first. This # is required because the other runtimes need the builtin libraries present # before the just-built compiler can pass the configuration tests. - get_compiler_rt_path(compiler_rt_path) + get_runtime_path(compiler-rt compiler_rt_path) if(compiler_rt_path) if(NOT LLVM_BUILTIN_TARGETS) builtin_default_target(${compiler_rt_path} -------------- next part -------------- A non-text attachment was scrubbed... Name: D68833.224533.patch Type: text/x-patch Size: 2546 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Thu Oct 10 20:02:44 2019 From: llvm-commits at lists.llvm.org (Shoaib Meenai via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 03:02:44 +0000 (UTC) Subject: [PATCH] D68833: [CMake] Re-order runtimes in the order of dependencies In-Reply-To: References: Message-ID: smeenai accepted this revision. smeenai added a comment. 
This revision is now accepted and ready to land. Makes sense to me, particularly since this is generalizing what we're already doing for compiler-rt. Is compiler-rt's place in the dependency list accurate though? For example, libc++ needs a builtins library, which could be compiler-rt, and I think the compiler-rt to libc++ dependency is limited to things like libFuzzer. I don't think libc++ supports using an in-tree compiler-rt though, so that's probably a moot point. ================ Comment at: llvm/runtimes/CMakeLists.txt:113 + # Re-order runtimes in the order of dependencies: libcxxabi depend on libunwind, + # libcxx depends on libcxxabi, some compiler-rt runtimes depend on libcxx. This ---------------- depend on -> depends on Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68833/new/ https://reviews.llvm.org/D68833 From llvm-commits at lists.llvm.org Thu Oct 10 20:02:44 2019 From: llvm-commits at lists.llvm.org (Nikolai Tillmann via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 03:02:44 +0000 (UTC) Subject: [PATCH] D68530: [AArch64] Don't combine callee-save and local stack adjustment when optimizing for size In-Reply-To: References: Message-ID: Nikolai updated this revision to Diff 224535. Nikolai added a comment. Addressing feedback by removing option, now always changing behavior when optimizing for size. 
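For readers skimming the thread, the size-optimization behavior described in this update can be modeled in a few lines. This is a minimal sketch, not code from the patch: `FrameInfoModel` and `mayCombineCSRAndLocalBump` are hypothetical names, and the two checks mirror only the early returns visible in the AArch64FrameLowering.cpp hunk of this revision (an optsize check placed ahead of the existing zero-local-stack check). Any further conditions in the real function are omitted.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the per-function state the check reads.
// In the patch these come from MF.getFunction().hasOptSize() and
// AFI->getLocalStackSize(); here they are plain fields.
struct FrameInfoModel {
  bool HasOptSize;         // function is being optimized for size
  uint64_t LocalStackSize; // bytes of local stack, excluding callee saves
};

// Hypothetical helper: returns true when the callee-save area and the local
// stack area may be allocated with a single combined SP adjustment.
// Only the two early exits shown in the hunk are modeled.
bool mayCombineCSRAndLocalBump(const FrameInfoModel &FI) {
  // New in this revision: never combine when optimizing for size.
  if (FI.HasOptSize)
    return false;
  // Existing check: nothing to combine if there are no locals.
  if (FI.LocalStackSize == 0)
    return false;
  // Assumption: remaining checks of the real function are not modeled.
  return true;
}
```

With this model, a function marked optsize never takes the combined path, whatever its local stack size, which matches the two separate pre-indexed `stp` instructions that the new test expects.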
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68530/new/ https://reviews.llvm.org/D68530 Files: llvm/lib/Target/AArch64/AArch64FrameLowering.cpp llvm/test/CodeGen/AArch64/arm64-never-combine-csr-local-stack-bump-for-size.ll Index: llvm/test/CodeGen/AArch64/arm64-never-combine-csr-local-stack-bump-for-size.ll =================================================================== --- /dev/null +++ llvm/test/CodeGen/AArch64/arm64-never-combine-csr-local-stack-bump-for-size.ll @@ -0,0 +1,25 @@ +; RUN: llc < %s -mtriple=arm64-apple-ios7.0 -disable-post-ra | FileCheck %s + +; CHECK-LABEL: main: +; CHECK: stp x29, x30, [sp, #-16]! +; CHECK-NEXT: stp xzr, xzr, [sp, #-16]! +; CHECK: adrp x0, l_.str at PAGE +; CHECK: add x0, x0, l_.str at PAGEOFF +; CHECK-NEXT: bl _puts +; CHECK-NEXT: add sp, sp, #16 +; CHECK-NEXT: ldp x29, x30, [sp], #16 +; CHECK-NEXT: ret + + at .str = private unnamed_addr constant [7 x i8] c"hello\0A\00" + +define i32 @main() nounwind ssp optsize { +entry: + %local1 = alloca i64, align 8 + %local2 = alloca i64, align 8 + store i64 0, i64* %local1 + store i64 0, i64* %local2 + %call = call i32 @puts(i8* getelementptr inbounds ([7 x i8], [7 x i8]* @.str, i32 0, i32 0)) + ret i32 %call +} + +declare i32 @puts(i8*) Index: llvm/lib/Target/AArch64/AArch64FrameLowering.cpp =================================================================== --- llvm/lib/Target/AArch64/AArch64FrameLowering.cpp +++ llvm/lib/Target/AArch64/AArch64FrameLowering.cpp @@ -447,6 +447,9 @@ const AArch64Subtarget &Subtarget = MF.getSubtarget(); const AArch64RegisterInfo *RegInfo = Subtarget.getRegisterInfo(); + if (MF.getFunction().hasOptSize()) + return false; + if (AFI->getLocalStackSize() == 0) return false; -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68530.224535.patch
Type: text/x-patch
Size: 1572 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Thu Oct 10 20:02:45 2019
From: llvm-commits at lists.llvm.org (Chris Ye via Phabricator via llvm-commits)
Date: Fri, 11 Oct 2019 03:02:45 +0000 (UTC)
Subject: [PATCH] D68633: fix debug info affects output when opt inline
In-Reply-To:
References:
Message-ID:

yechunliang updated this revision to Diff 224534.
yechunliang marked an inline comment as done.
yechunliang added a comment.

Update patch:

1. Use BasicBlock::iterator with skipDebugIntrinsics instead of hand-rolling the skip. The recommended API, BasicBlock::instructionsWithoutDebug(), does not look suitable for handling llvm::Instruction in this code.
2. Follow LLVM style: check the condition and break first, then unconditionally execute the code.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68633/new/

https://reviews.llvm.org/D68633

Files:
  llvm/lib/Transforms/Utils/InlineFunction.cpp
  llvm/test/Transforms/Inline/inline-with-debuginfo.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68633.224534.patch
Type: text/x-patch
Size: 6738 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Thu Oct 10 20:09:25 2019
From: llvm-commits at lists.llvm.org (Volodymyr Sapsai via llvm-commits)
Date: Fri, 11 Oct 2019 03:09:25 -0000
Subject: [polly] r374504 - [Polly] Fix formatting violation. NFC.
Message-ID: <20191011030925.26DD792578@lists.llvm.org>

Author: vsapsai
Date: Thu Oct 10 20:09:24 2019
New Revision: 374504

URL: http://llvm.org/viewvc/llvm-project?rev=374504&view=rev
Log:
[Polly] Fix formatting violation. NFC.
Modified:
    polly/trunk/lib/Analysis/ScopDetectionDiagnostic.cpp

Modified: polly/trunk/lib/Analysis/ScopDetectionDiagnostic.cpp
URL: http://llvm.org/viewvc/llvm-project/polly/trunk/lib/Analysis/ScopDetectionDiagnostic.cpp?rev=374504&r1=374503&r2=374504&view=diff
==============================================================================
--- polly/trunk/lib/Analysis/ScopDetectionDiagnostic.cpp (original)
+++ polly/trunk/lib/Analysis/ScopDetectionDiagnostic.cpp Thu Oct 10 20:09:24 2019
@@ -45,9 +45,7 @@ using namespace llvm;
 #define DEBUG_TYPE "polly-detect"
 
 #define SCOP_STAT(NAME, DESC) \
-  { \
-    "polly-detect", "NAME", "Number of rejected regions: " DESC \
-  }
+  { "polly-detect", "NAME", "Number of rejected regions: " DESC }
 
 Statistic RejectStatistics[] = {
     SCOP_STAT(CFG, ""),

From llvm-commits at lists.llvm.org Thu Oct 10 20:07:53 2019
From: llvm-commits at lists.llvm.org (LiuChen via Phabricator via llvm-commits)
Date: Fri, 11 Oct 2019 03:07:53 +0000 (UTC)
Subject: [PATCH] D68857: [X86] Add strict fp support for operations of X87 instructions
Message-ID:

LiuChen3 created this revision.
Herald added subscribers: llvm-commits, pengfei, hiraditya.
Herald added a project: LLVM.
LiuChen3 added a reviewer: pengfei.

This is a follow-up patch to D68854. It adds strict fp support for basic operations of X87 instructions, including +, -, *, /, fp extensions, and fp truncations.

https://reviews.llvm.org/D68857

Files:
  llvm/lib/Target/X86/X86ISelLowering.cpp
  llvm/lib/Target/X86/X86InstrFPStack.td
  llvm/test/CodeGen/X86/x87-fp-strict-add.ll
  llvm/test/CodeGen/X86/x87-fp-strict-div.ll
  llvm/test/CodeGen/X86/x87-fp-strict-fpextend.ll
  llvm/test/CodeGen/X86/x87-fp-strict-fpround.ll
  llvm/test/CodeGen/X86/x87-fp-strict-mul.ll
  llvm/test/CodeGen/X86/x87-fp-strict-sqrt.ll
  llvm/test/CodeGen/X86/x87-fp-strict-sub.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68857.224531.patch
Type: text/x-patch
Size: 28320 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Thu Oct 10 20:12:07 2019
From: llvm-commits at lists.llvm.org (Chris Bieneman via Phabricator via llvm-commits)
Date: Fri, 11 Oct 2019 03:12:07 +0000 (UTC)
Subject: [PATCH] D68833: [CMake] Re-order runtimes in the order of dependencies
In-Reply-To:
References:
Message-ID: <9e657d23998bccc5add5c3f71aa6fc7b@localhost.localdomain>

beanz added a comment.

Two thoughts.

(1) Short term, I think this is wrong; in general we should avoid using `if(TARGET...)`. Compiler-rt has the most ridiculous magic in it for which targets are enabled or not, which is why it was put first. That allows the other runtimes to check against compiler-rt, since it was harder to replicate the logic for which sanitizers were included in the build. The other runtime libraries should be able to use the `HAVE_${runtime}` variables to determine the presence of the other runtimes. That should correctly allow libcxx, libcxxabi and libunwind to not be order dependent.

(2) Longer term, we really need to update the runtimes to make their dependencies based on generator expressions so that we can remove at least some of the order dependence.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68833/new/

https://reviews.llvm.org/D68833

From llvm-commits at lists.llvm.org Thu Oct 10 20:25:57 2019
From: llvm-commits at lists.llvm.org (Saleem Abdulrasool via Phabricator via llvm-commits)
Date: Fri, 11 Oct 2019 03:25:57 +0000 (UTC)
Subject: [PATCH] D68833: [CMake] Re-order runtimes in the order of dependencies
In-Reply-To:
References:
Message-ID: <706c11315f7935eb3c11e32dc5e93f83@localhost.localdomain>

compnerd added a comment.

I feel like I'm not understanding something. The point of CMake is that it tracks dependencies. Why are we manually tracking dependencies?
The `MSAN` check was needed due to the inverted dependencies; with unified builds, this shouldn't be an issue any longer. I think that propagating the checks this way is the wrong approach.

Can you please explain why you need the `if(TARGET runtime)` in the first place? Understanding that might help with coming up with a better solution (or possibly an existing solution).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68833/new/

https://reviews.llvm.org/D68833

From llvm-commits at lists.llvm.org Thu Oct 10 20:31:02 2019
From: llvm-commits at lists.llvm.org (Craig Topper via Phabricator via llvm-commits)
Date: Fri, 11 Oct 2019 03:31:02 +0000 (UTC)
Subject: [PATCH] D68857: [X86] Add strict fp support for operations of X87 instructions
In-Reply-To:
References:
Message-ID:

craig.topper added inline comments.

================
Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:601
+
+  // Handle constrained floating-point operations of scalar.
+  for (auto VT : { MVT::f32, MVT::f64, MVT::f80 }) {
----------------
Doesn't this stop working if sse1 is enabled? But we still need x87 for f64/f80.

================
Comment at: llvm/lib/Target/X86/X86InstrFPStack.td:372
 let SchedRW = [WriteMicrocoded] in {
-defm SIN : FPUnary;
-defm COS : FPUnary;
+defm SIN : FPUnary;
+defm COS : FPUnary;
----------------
SIN/COS are not mentioned in the X86ISelLowering code you added.

================
Comment at: llvm/test/CodeGen/X86/x87-fp-strict-sub.ll:84
+
+!0 = !{!1, !1, i64 0}
+!1 = !{!"float", !2, i64 0}
----------------
Is all this metadata needed?
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68857/new/ https://reviews.llvm.org/D68857 From llvm-commits at lists.llvm.org Thu Oct 10 20:38:29 2019 From: llvm-commits at lists.llvm.org (Kristina Brooks via llvm-commits) Date: Fri, 11 Oct 2019 04:38:29 +0100 Subject: [llvm] r356753 - [ObjectYAML] Add basic minidump generation support In-Reply-To: <20190322144726.9FC4F8A9D4@lists.llvm.org> References: <20190322144726.9FC4F8A9D4@lists.llvm.org> Message-ID: Hi, This doesn't build with modules enabled (using Clang r374503): FAILED: lib/ObjectYAML/CMakeFiles/LLVMObjectYAML.dir/MinidumpYAML.cpp.o /o/b/llvm-10/408/bin/clang++ -DGTEST_HAS_RTTI=0 -D_DEBUG -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -Ilib/ObjectYAML -I/home/src2/llvm-tainted/lib/ObjectYAML -Iinclude -I/home/src2/llvm-tainted/include -O3 -march=native -Wno-unused-command-line-argument -gline-tables-only -stdlib=libc++ -fPIC -fvisibility-inlines-hidden -Werror=date-time -Werror=unguarded-availability-new -std=c++14 -fmodules -fmodules-cache-path=/o/b/llvm-10/409/module.cache -Xclang -fmodules-local-submodule-visibility -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wstring-conversion -fdiagnostics-color -ffunction-sections -fdata-sections -flto=thin -O3 -UNDEBUG -fno-exceptions -fno-rtti -MD -MT lib/ObjectYAML/CMakeFiles/LLVMObjectYAML.dir/MinidumpYAML.cpp.o -MF lib/ObjectYAML/CMakeFiles/LLVMObjectYAML.dir/MinidumpYAML.cpp.o.d -o lib/ObjectYAML/CMakeFiles/LLVMObjectYAML.dir/MinidumpYAML.cpp.o -c /home/src2/llvm-tainted/lib/ObjectYAML/MinidumpYAML.cpp In module 'LLVM_Utils' imported from /home/src2/llvm-tainted/include/llvm/ObjectYAML/YAML.h:12: /home/src2/llvm-tainted/include/llvm/Support/YAMLTraits.h:819:48: error: call to function 'operator&' that is neither visible in the template 
definition nor found by argument-dependent lookup if ( bitSetMatch(Str, outputting() && (Val & ConstVal) == ConstVal) ) { ^ /home/src2/llvm-tainted/include/llvm/BinaryFormat/MinidumpConstants.def:118:1: note: in instantiation of function template specialization 'llvm::yaml::IO::bitSetCase' requested here HANDLE_MDMP_PROTECT(0x01, NoAccess, PAGE_NO_ACCESS) ^ /home/src2/llvm-tainted/lib/ObjectYAML/MinidumpYAML.cpp:119:6: note: expanded from macro 'HANDLE_MDMP_PROTECT' IO.bitSetCase(Protect, #NATIVENAME, MemoryProtection::NAME); ^ /home/src2/llvm-tainted/include/llvm/ADT/BitmaskEnum.h:111:3: note: 'operator&' should be declared prior to the call site or in namespace 'llvm::minidump' E operator&(E LHS, E RHS) { On Fri, Mar 22, 2019 at 2:46 PM Pavel Labath via llvm-commits wrote: > > Author: labath > Date: Fri Mar 22 07:47:26 2019 > New Revision: 356753 > > URL: http://llvm.org/viewvc/llvm-project?rev=356753&view=rev > Log: > [ObjectYAML] Add basic minidump generation support > > Summary: > This patch adds the ability to read a yaml form of a minidump file and > write it out as binary. Apart from the minidump header and the stream > directory, only three basic stream kinds are supported: > - Text: This kind is used for streams which contain textual data. This > is typically the contents of a /proc file on linux (e.g. > /proc/PID/maps). In this case, we just put the raw stream contents > into the yaml. > - SystemInfo: This stream contains various bits of information about the > host system in binary form. We expose the data in a structured form. > - Raw: This kind is used as a fallback when we don't have any special > knowledge about the stream. In this case, we just print the stream > contents in hex. > > For this code to be really useful, more stream kinds will need to be > added (particularly for things like lists of memory regions and loaded > modules). However, these can be added incrementally. 
> > Reviewers: jhenderson, zturner, clayborg, aprantl > > Subscribers: mgorny, lemo, llvm-commits, lldb-commits > > Tags: #llvm > > Differential Revision: https://reviews.llvm.org/D59482 > > Added: > llvm/trunk/include/llvm/ObjectYAML/MinidumpYAML.h > llvm/trunk/lib/ObjectYAML/MinidumpYAML.cpp > llvm/trunk/test/tools/yaml2obj/minidump-raw-stream-small-size.yaml > llvm/trunk/test/tools/yaml2obj/minidump-systeminfo-other-long.yaml > llvm/trunk/test/tools/yaml2obj/minidump-systeminfo-other-not-hex.yaml > llvm/trunk/test/tools/yaml2obj/minidump-systeminfo-other-short.yaml > llvm/trunk/test/tools/yaml2obj/minidump-systeminfo-x86-long.yaml > llvm/trunk/test/tools/yaml2obj/minidump-systeminfo-x86-short.yaml > llvm/trunk/tools/yaml2obj/yaml2minidump.cpp > llvm/trunk/unittests/ObjectYAML/MinidumpYAMLTest.cpp > Modified: > llvm/trunk/include/llvm/ObjectYAML/ObjectYAML.h > llvm/trunk/lib/ObjectYAML/CMakeLists.txt > llvm/trunk/lib/ObjectYAML/ObjectYAML.cpp > llvm/trunk/tools/yaml2obj/CMakeLists.txt > llvm/trunk/tools/yaml2obj/yaml2obj.cpp > llvm/trunk/tools/yaml2obj/yaml2obj.h > llvm/trunk/unittests/ObjectYAML/CMakeLists.txt > > Added: llvm/trunk/include/llvm/ObjectYAML/MinidumpYAML.h > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/ObjectYAML/MinidumpYAML.h?rev=356753&view=auto > ============================================================================== > --- llvm/trunk/include/llvm/ObjectYAML/MinidumpYAML.h (added) > +++ llvm/trunk/include/llvm/ObjectYAML/MinidumpYAML.h Fri Mar 22 07:47:26 2019 > @@ -0,0 +1,156 @@ > +//===- MinidumpYAML.h - Minidump YAMLIO implementation ----------*- C++ -*-===// > +// > +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. > +// See https://llvm.org/LICENSE.txt for license information. 
> +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception > +// > +//===----------------------------------------------------------------------===// > + > +#ifndef LLVM_OBJECTYAML_MINIDUMPYAML_H > +#define LLVM_OBJECTYAML_MINIDUMPYAML_H > + > +#include "llvm/BinaryFormat/Minidump.h" > +#include "llvm/ObjectYAML/YAML.h" > +#include "llvm/Support/YAMLTraits.h" > + > +namespace llvm { > +namespace MinidumpYAML { > + > +/// The base class for all minidump streams. The "Type" of the stream > +/// corresponds to the Stream Type field in the minidump file. The "Kind" field > +/// specifies how are we going to treat it. For highly specialized streams (e.g. > +/// SystemInfo), there is a 1:1 mapping between Types and Kinds, but in general > +/// one stream Kind can be used to represent multiple stream Types (e.g. any > +/// unrecognised stream Type will be handled via RawContentStream). The mapping > +/// from Types to Kinds is fixed and given by the static getKind function. > +struct Stream { > + enum class StreamKind { > + RawContent, > + SystemInfo, > + TextContent, > + }; > + > + Stream(StreamKind Kind, minidump::StreamType Type) : Kind(Kind), Type(Type) {} > + virtual ~Stream(); // anchor > + > + const StreamKind Kind; > + const minidump::StreamType Type; > + > + /// Get the stream Kind used for representing streams of a given Type. > + static StreamKind getKind(minidump::StreamType Type); > + > + /// Create an empty stream of the given Type. > + static std::unique_ptr create(minidump::StreamType Type); > +}; > + > +/// A minidump stream represented as a sequence of hex bytes. This is used as a > +/// fallback when no other stream kind is suitable. 
> +struct RawContentStream : public Stream { > + yaml::BinaryRef Content; > + yaml::Hex32 Size; > + > + RawContentStream(minidump::StreamType Type, ArrayRef Content = {}) > + : Stream(StreamKind::RawContent, Type), Content(Content), > + Size(Content.size()) {} > + > + static bool classof(const Stream *S) { > + return S->Kind == StreamKind::RawContent; > + } > +}; > + > +/// SystemInfo minidump stream. > +struct SystemInfoStream : public Stream { > + minidump::SystemInfo Info; > + > + explicit SystemInfoStream(const minidump::SystemInfo &Info) > + : Stream(StreamKind::SystemInfo, minidump::StreamType::SystemInfo), > + Info(Info) {} > + > + SystemInfoStream() > + : Stream(StreamKind::SystemInfo, minidump::StreamType::SystemInfo) { > + memset(&Info, 0, sizeof(Info)); > + } > + > + static bool classof(const Stream *S) { > + return S->Kind == StreamKind::SystemInfo; > + } > +}; > + > +/// A StringRef, which is printed using YAML block notation. > +LLVM_YAML_STRONG_TYPEDEF(StringRef, BlockStringRef) > + > +/// A minidump stream containing textual data (typically, the contents of a > +/// /proc/ file on linux). > +struct TextContentStream : public Stream { > + BlockStringRef Text; > + > + TextContentStream(minidump::StreamType Type, StringRef Text = {}) > + : Stream(StreamKind::TextContent, Type), Text(Text) {} > + > + static bool classof(const Stream *S) { > + return S->Kind == StreamKind::TextContent; > + } > +}; > + > +/// The top level structure representing a minidump object, consisting of a > +/// minidump header, and zero or more streams. To construct an Object from a > +/// minidump file, use the static create function. To serialize to/from yaml, > +/// use the appropriate streaming operator on a yaml stream. > +struct Object { > + Object() = default; > + Object(const Object &) = delete; > + Object &operator=(const Object &) = delete; > + Object(Object &&) = default; > + Object &operator=(Object &&) = default; > + > + /// The minidump header. 
> + minidump::Header Header; > + > + /// The list of streams in this minidump object. > + std::vector> Streams; > +}; > + > +/// Serialize the minidump file represented by Obj to OS in binary form. > +void writeAsBinary(Object &Obj, raw_ostream &OS); > + > +/// Serialize the yaml string as a minidump file to OS in binary form. > +Error writeAsBinary(StringRef Yaml, raw_ostream &OS); > + > +} // namespace MinidumpYAML > + > +namespace yaml { > +template <> struct BlockScalarTraits { > + static void output(const MinidumpYAML::BlockStringRef &Text, void *, > + raw_ostream &OS) { > + OS << Text; > + } > + > + static StringRef input(StringRef Scalar, void *, > + MinidumpYAML::BlockStringRef &Text) { > + Text = Scalar; > + return ""; > + } > +}; > + > +template <> struct MappingTraits> { > + static void mapping(IO &IO, std::unique_ptr &S); > + static StringRef validate(IO &IO, std::unique_ptr &S); > +}; > + > +} // namespace yaml > + > +} // namespace llvm > + > +LLVM_YAML_DECLARE_ENUM_TRAITS(llvm::minidump::ProcessorArchitecture) > +LLVM_YAML_DECLARE_ENUM_TRAITS(llvm::minidump::OSPlatform) > +LLVM_YAML_DECLARE_ENUM_TRAITS(llvm::minidump::StreamType) > + > +LLVM_YAML_DECLARE_MAPPING_TRAITS(llvm::minidump::CPUInfo::ArmInfo) > +LLVM_YAML_DECLARE_MAPPING_TRAITS(llvm::minidump::CPUInfo::OtherInfo) > +LLVM_YAML_DECLARE_MAPPING_TRAITS(llvm::minidump::CPUInfo::X86Info) > + > +LLVM_YAML_IS_SEQUENCE_VECTOR(std::unique_ptr) > + > +LLVM_YAML_DECLARE_MAPPING_TRAITS(llvm::MinidumpYAML::Object) > + > +#endif // LLVM_OBJECTYAML_MINIDUMPYAML_H > > Modified: llvm/trunk/include/llvm/ObjectYAML/ObjectYAML.h > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/ObjectYAML/ObjectYAML.h?rev=356753&r1=356752&r2=356753&view=diff > ============================================================================== > --- llvm/trunk/include/llvm/ObjectYAML/ObjectYAML.h (original) > +++ llvm/trunk/include/llvm/ObjectYAML/ObjectYAML.h Fri Mar 22 07:47:26 2019 > @@ -12,6 +12,7 @@ > #include 
"llvm/ObjectYAML/COFFYAML.h" > #include "llvm/ObjectYAML/ELFYAML.h" > #include "llvm/ObjectYAML/MachOYAML.h" > +#include "llvm/ObjectYAML/MinidumpYAML.h" > #include "llvm/ObjectYAML/WasmYAML.h" > #include "llvm/Support/YAMLTraits.h" > #include > @@ -26,6 +27,7 @@ struct YamlObjectFile { > std::unique_ptr Coff; > std::unique_ptr MachO; > std::unique_ptr FatMachO; > + std::unique_ptr Minidump; > std::unique_ptr Wasm; > }; > > > Modified: llvm/trunk/lib/ObjectYAML/CMakeLists.txt > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/ObjectYAML/CMakeLists.txt?rev=356753&r1=356752&r2=356753&view=diff > ============================================================================== > --- llvm/trunk/lib/ObjectYAML/CMakeLists.txt (original) > +++ llvm/trunk/lib/ObjectYAML/CMakeLists.txt Fri Mar 22 07:47:26 2019 > @@ -10,6 +10,7 @@ add_llvm_library(LLVMObjectYAML > ELFYAML.cpp > MachOYAML.cpp > ObjectYAML.cpp > + MinidumpYAML.cpp > WasmYAML.cpp > YAML.cpp > ) > > Added: llvm/trunk/lib/ObjectYAML/MinidumpYAML.cpp > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/ObjectYAML/MinidumpYAML.cpp?rev=356753&view=auto > ============================================================================== > --- llvm/trunk/lib/ObjectYAML/MinidumpYAML.cpp (added) > +++ llvm/trunk/lib/ObjectYAML/MinidumpYAML.cpp Fri Mar 22 07:47:26 2019 > @@ -0,0 +1,385 @@ > +//===- MinidumpYAML.cpp - Minidump YAMLIO implementation ------------------===// > +// > +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. > +// See https://llvm.org/LICENSE.txt for license information. 
> +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception > +// > +//===----------------------------------------------------------------------===// > + > +#include "llvm/ObjectYAML/MinidumpYAML.h" > + > +using namespace llvm; > +using namespace llvm::MinidumpYAML; > +using namespace llvm::minidump; > + > +namespace { > +class BlobAllocator { > +public: > + size_t tell() const { return NextOffset; } > + > + size_t AllocateCallback(size_t Size, > + std::function Callback) { > + size_t Offset = NextOffset; > + NextOffset += Size; > + Callbacks.push_back(std::move(Callback)); > + return Offset; > + } > + > + size_t AllocateBytes(ArrayRef Data) { > + return AllocateCallback( > + Data.size(), [Data](raw_ostream &OS) { OS << toStringRef(Data); }); > + } > + > + template size_t AllocateArray(ArrayRef Data) { > + return AllocateBytes({reinterpret_cast(Data.data()), > + sizeof(T) * Data.size()}); > + } > + > + template size_t AllocateObject(const T &Data) { > + return AllocateArray(makeArrayRef(Data)); > + } > + > + void writeTo(raw_ostream &OS) const; > + > +private: > + size_t NextOffset = 0; > + > + std::vector> Callbacks; > +}; > +} // namespace > + > +void BlobAllocator::writeTo(raw_ostream &OS) const { > + size_t BeginOffset = OS.tell(); > + for (const auto &Callback : Callbacks) > + Callback(OS); > + assert(OS.tell() == BeginOffset + NextOffset && > + "Callbacks wrote an unexpected number of bytes."); > + (void)BeginOffset; > +} > + > +/// Perform an optional yaml-mapping of an endian-aware type EndianType. The > +/// only purpose of this function is to avoid casting the Default value to the > +/// endian type; > +template > +static inline void mapOptional(yaml::IO &IO, const char *Key, EndianType &Val, > + typename EndianType::value_type Default) { > + IO.mapOptional(Key, Val, EndianType(Default)); > +} > + > +/// Yaml-map an endian-aware type EndianType as some other type MapType. 
> +template > +static inline void mapRequiredAs(yaml::IO &IO, const char *Key, > + EndianType &Val) { > + MapType Mapped = static_cast(Val); > + IO.mapRequired(Key, Mapped); > + Val = static_cast(Mapped); > +} > + > +/// Perform an optional yaml-mapping of an endian-aware type EndianType as some > +/// other type MapType. > +template > +static inline void mapOptionalAs(yaml::IO &IO, const char *Key, EndianType &Val, > + MapType Default) { > + MapType Mapped = static_cast(Val); > + IO.mapOptional(Key, Mapped, Default); > + Val = static_cast(Mapped); > +} > + > +namespace { > +/// Return the appropriate yaml Hex type for a given endian-aware type. > +template struct HexType; > +template <> struct HexType { using type = yaml::Hex16; }; > +template <> struct HexType { using type = yaml::Hex32; }; > +template <> struct HexType { using type = yaml::Hex64; }; > +} // namespace > + > +/// Yaml-map an endian-aware type as an appropriately-sized hex value. > +template > +static inline void mapRequiredHex(yaml::IO &IO, const char *Key, > + EndianType &Val) { > + mapRequiredAs::type>(IO, Key, Val); > +} > + > +/// Perform an optional yaml-mapping of an endian-aware type as an > +/// appropriately-sized hex value. 
> +template > +static inline void mapOptionalHex(yaml::IO &IO, const char *Key, > + EndianType &Val, > + typename EndianType::value_type Default) { > + mapOptionalAs::type>(IO, Key, Val, Default); > +} > + > +Stream::~Stream() = default; > + > +Stream::StreamKind Stream::getKind(StreamType Type) { > + switch (Type) { > + case StreamType::SystemInfo: > + return StreamKind::SystemInfo; > + case StreamType::LinuxCPUInfo: > + case StreamType::LinuxProcStatus: > + case StreamType::LinuxLSBRelease: > + case StreamType::LinuxCMDLine: > + case StreamType::LinuxMaps: > + case StreamType::LinuxProcStat: > + case StreamType::LinuxProcUptime: > + return StreamKind::TextContent; > + default: > + return StreamKind::RawContent; > + } > +} > + > +std::unique_ptr Stream::create(StreamType Type) { > + StreamKind Kind = getKind(Type); > + switch (Kind) { > + case StreamKind::RawContent: > + return llvm::make_unique(Type); > + case StreamKind::SystemInfo: > + return llvm::make_unique(); > + case StreamKind::TextContent: > + return llvm::make_unique(Type); > + } > + llvm_unreachable("Unhandled stream kind!"); > +} > + > +void yaml::ScalarEnumerationTraits::enumeration( > + IO &IO, ProcessorArchitecture &Arch) { > +#define HANDLE_MDMP_ARCH(CODE, NAME) \ > + IO.enumCase(Arch, #NAME, ProcessorArchitecture::NAME); > +#include "llvm/BinaryFormat/MinidumpConstants.def" > + IO.enumFallback(Arch); > +} > + > +void yaml::ScalarEnumerationTraits::enumeration(IO &IO, > + OSPlatform &Plat) { > +#define HANDLE_MDMP_PLATFORM(CODE, NAME) \ > + IO.enumCase(Plat, #NAME, OSPlatform::NAME); > +#include "llvm/BinaryFormat/MinidumpConstants.def" > + IO.enumFallback(Plat); > +} > + > +void yaml::ScalarEnumerationTraits::enumeration(IO &IO, > + StreamType &Type) { > +#define HANDLE_MDMP_STREAM_TYPE(CODE, NAME) \ > + IO.enumCase(Type, #NAME, StreamType::NAME); > +#include "llvm/BinaryFormat/MinidumpConstants.def" > + IO.enumFallback(Type); > +} > + > +void yaml::MappingTraits::mapping(IO &IO, > + 
CPUInfo::ArmInfo &Info) { > + mapRequiredHex(IO, "CPUID", Info.CPUID); > + mapOptionalHex(IO, "ELF hwcaps", Info.ElfHWCaps, 0); > +} > + > +namespace { > +template struct FixedSizeHex { > + FixedSizeHex(uint8_t (&Storage)[N]) : Storage(Storage) {} > + > + uint8_t (&Storage)[N]; > +}; > +} // namespace > + > +namespace llvm { > +namespace yaml { > +template struct ScalarTraits> { > + static void output(const FixedSizeHex &Fixed, void *, raw_ostream &OS) { > + OS << toHex(makeArrayRef(Fixed.Storage)); > + } > + > + static StringRef input(StringRef Scalar, void *, FixedSizeHex &Fixed) { > + if (!all_of(Scalar, isHexDigit)) > + return "Invalid hex digit in input"; > + if (Scalar.size() < 2 * N) > + return "String too short"; > + if (Scalar.size() > 2 * N) > + return "String too long"; > + copy(fromHex(Scalar), Fixed.Storage); > + return ""; > + } > + > + static QuotingType mustQuote(StringRef S) { return QuotingType::None; } > +}; > +} // namespace yaml > +} // namespace llvm > +void yaml::MappingTraits::mapping( > + IO &IO, CPUInfo::OtherInfo &Info) { > + FixedSizeHex Features(Info.ProcessorFeatures); > + IO.mapRequired("Features", Features); > +} > + > +namespace { > +/// A type which only accepts strings of a fixed size for yaml conversion. 
> +template struct FixedSizeString { > + FixedSizeString(char (&Storage)[N]) : Storage(Storage) {} > + > + char (&Storage)[N]; > +}; > +} // namespace > + > +namespace llvm { > +namespace yaml { > +template struct ScalarTraits> { > + static void output(const FixedSizeString &Fixed, void *, raw_ostream &OS) { > + OS << StringRef(Fixed.Storage, N); > + } > + > + static StringRef input(StringRef Scalar, void *, FixedSizeString &Fixed) { > + if (Scalar.size() < N) > + return "String too short"; > + if (Scalar.size() > N) > + return "String too long"; > + copy(Scalar, Fixed.Storage); > + return ""; > + } > + > + static QuotingType mustQuote(StringRef S) { return needsQuotes(S); } > +}; > +} // namespace yaml > +} // namespace llvm > + > +void yaml::MappingTraits::mapping(IO &IO, > + CPUInfo::X86Info &Info) { > + FixedSizeString VendorID(Info.VendorID); > + IO.mapRequired("Vendor ID", VendorID); > + > + mapRequiredHex(IO, "Version Info", Info.VersionInfo); > + mapRequiredHex(IO, "Feature Info", Info.FeatureInfo); > + mapOptionalHex(IO, "AMD Extended Features", Info.AMDExtendedFeatures, 0); > +} > + > +static void streamMapping(yaml::IO &IO, RawContentStream &Stream) { > + IO.mapOptional("Content", Stream.Content); > + IO.mapOptional("Size", Stream.Size, Stream.Content.binary_size()); > +} > + > +static StringRef streamValidate(RawContentStream &Stream) { > + if (Stream.Size.value < Stream.Content.binary_size()) > + return "Stream size must be greater or equal to the content size"; > + return ""; > +} > + > +static void streamMapping(yaml::IO &IO, SystemInfoStream &Stream) { > + SystemInfo &Info = Stream.Info; > + IO.mapRequired("Processor Arch", Info.ProcessorArch); > + mapOptional(IO, "Processor Level", Info.ProcessorLevel, 0); > + mapOptional(IO, "Processor Revision", Info.ProcessorRevision, 0); > + IO.mapOptional("Number of Processors", Info.NumberOfProcessors, 0); > + IO.mapOptional("Product type", Info.ProductType, 0); > + mapOptional(IO, "Major Version", 
Info.MajorVersion, 0); > + mapOptional(IO, "Minor Version", Info.MinorVersion, 0); > + mapOptional(IO, "Build Number", Info.BuildNumber, 0); > + IO.mapRequired("Platform ID", Info.PlatformId); > + mapOptionalHex(IO, "CSD Version RVA", Info.CSDVersionRVA, 0); > + mapOptionalHex(IO, "Suite Mask", Info.SuiteMask, 0); > + mapOptionalHex(IO, "Reserved", Info.Reserved, 0); > + switch (static_cast(Info.ProcessorArch)) { > + case ProcessorArchitecture::X86: > + case ProcessorArchitecture::AMD64: > + IO.mapOptional("CPU", Info.CPU.X86); > + break; > + case ProcessorArchitecture::ARM: > + case ProcessorArchitecture::ARM64: > + IO.mapOptional("CPU", Info.CPU.Arm); > + break; > + default: > + IO.mapOptional("CPU", Info.CPU.Other); > + break; > + } > +} > + > +static void streamMapping(yaml::IO &IO, TextContentStream &Stream) { > + IO.mapOptional("Text", Stream.Text); > +} > + > +void yaml::MappingTraits>::mapping( > + yaml::IO &IO, std::unique_ptr &S) { > + StreamType Type; > + if (IO.outputting()) > + Type = S->Type; > + IO.mapRequired("Type", Type); > + > + if (!IO.outputting()) > + S = MinidumpYAML::Stream::create(Type); > + switch (S->Kind) { > + case MinidumpYAML::Stream::StreamKind::RawContent: > + streamMapping(IO, llvm::cast(*S)); > + break; > + case MinidumpYAML::Stream::StreamKind::SystemInfo: > + streamMapping(IO, llvm::cast(*S)); > + break; > + case MinidumpYAML::Stream::StreamKind::TextContent: > + streamMapping(IO, llvm::cast(*S)); > + break; > + } > +} > + > +StringRef yaml::MappingTraits>::validate( > + yaml::IO &IO, std::unique_ptr &S) { > + switch (S->Kind) { > + case MinidumpYAML::Stream::StreamKind::RawContent: > + return streamValidate(cast(*S)); > + case MinidumpYAML::Stream::StreamKind::SystemInfo: > + case MinidumpYAML::Stream::StreamKind::TextContent: > + return ""; > + } > + llvm_unreachable("Fully covered switch above!"); > +} > + > +void yaml::MappingTraits::mapping(IO &IO, Object &O) { > + IO.mapTag("!minidump", true); > + mapOptionalHex(IO, 
"Signature", O.Header.Signature, Header::MagicSignature); > + mapOptionalHex(IO, "Version", O.Header.Version, Header::MagicVersion); > + mapOptionalHex(IO, "Flags", O.Header.Flags, 0); > + IO.mapRequired("Streams", O.Streams); > +} > + > +static Directory layout(BlobAllocator &File, Stream &S) { > + Directory Result; > + Result.Type = S.Type; > + Result.Location.RVA = File.tell(); > + switch (S.Kind) { > + case Stream::StreamKind::RawContent: { > + RawContentStream &Raw = cast(S); > + File.AllocateCallback(Raw.Size, [&Raw](raw_ostream &OS) { > + Raw.Content.writeAsBinary(OS); > + assert(Raw.Content.binary_size() <= Raw.Size); > + OS << std::string(Raw.Size - Raw.Content.binary_size(), '\0'); > + }); > + break; > + } > + case Stream::StreamKind::SystemInfo: > + File.AllocateObject(cast(S).Info); > + break; > + case Stream::StreamKind::TextContent: > + File.AllocateArray(arrayRefFromStringRef(cast(S).Text)); > + break; > + } > + Result.Location.DataSize = File.tell() - Result.Location.RVA; > + return Result; > +} > + > +void MinidumpYAML::writeAsBinary(Object &Obj, raw_ostream &OS) { > + BlobAllocator File; > + File.AllocateObject(Obj.Header); > + > + std::vector StreamDirectory(Obj.Streams.size()); > + Obj.Header.StreamDirectoryRVA = > + File.AllocateArray(makeArrayRef(StreamDirectory)); > + Obj.Header.NumberOfStreams = StreamDirectory.size(); > + > + for (auto &Stream : enumerate(Obj.Streams)) > + StreamDirectory[Stream.index()] = layout(File, *Stream.value()); > + > + File.writeTo(OS); > +} > + > +Error MinidumpYAML::writeAsBinary(StringRef Yaml, raw_ostream &OS) { > + yaml::Input Input(Yaml); > + Object Obj; > + Input >> Obj; > + if (std::error_code EC = Input.error()) > + return errorCodeToError(EC); > + > + writeAsBinary(Obj, OS); > + return Error::success(); > +} > > Modified: llvm/trunk/lib/ObjectYAML/ObjectYAML.cpp > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/ObjectYAML/ObjectYAML.cpp?rev=356753&r1=356752&r2=356753&view=diff > 
============================================================================== > --- llvm/trunk/lib/ObjectYAML/ObjectYAML.cpp (original) > +++ llvm/trunk/lib/ObjectYAML/ObjectYAML.cpp Fri Mar 22 07:47:26 2019 > @@ -45,6 +45,9 @@ void MappingTraits::mapp > ObjectFile.FatMachO.reset(new MachOYAML::UniversalBinary()); > MappingTraits::mapping(IO, > *ObjectFile.FatMachO); > + } else if (IO.mapTag("!minidump")) { > + ObjectFile.Minidump.reset(new MinidumpYAML::Object()); > + MappingTraits::mapping(IO, *ObjectFile.Minidump); > } else if (IO.mapTag("!WASM")) { > ObjectFile.Wasm.reset(new WasmYAML::Object()); > MappingTraits::mapping(IO, *ObjectFile.Wasm); > > Added: llvm/trunk/test/tools/yaml2obj/minidump-raw-stream-small-size.yaml > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/yaml2obj/minidump-raw-stream-small-size.yaml?rev=356753&view=auto > ============================================================================== > --- llvm/trunk/test/tools/yaml2obj/minidump-raw-stream-small-size.yaml (added) > +++ llvm/trunk/test/tools/yaml2obj/minidump-raw-stream-small-size.yaml Fri Mar 22 07:47:26 2019 > @@ -0,0 +1,9 @@ > +# RUN: not yaml2obj %s 2>&1 | FileCheck %s > + > +--- !minidump > +Streams: > + - Type: LinuxAuxv > + Size: 7 > + Content: DEADBEEFBAADF00D > + > +# CHECK: Stream size must be greater or equal to the content size > > Added: llvm/trunk/test/tools/yaml2obj/minidump-systeminfo-other-long.yaml > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/yaml2obj/minidump-systeminfo-other-long.yaml?rev=356753&view=auto > ============================================================================== > --- llvm/trunk/test/tools/yaml2obj/minidump-systeminfo-other-long.yaml (added) > +++ llvm/trunk/test/tools/yaml2obj/minidump-systeminfo-other-long.yaml Fri Mar 22 07:47:26 2019 > @@ -0,0 +1,13 @@ > +# RUN: not yaml2obj %s 2>&1 | FileCheck %s > + > +--- !minidump > +Streams: > + - Type: SystemInfo > + Processor Arch: PPC > + Platform ID: Linux > 
+ CPU: > + Features: 000102030405060708090a0b0c0d0e0f0 > + > + > +# CHECK: String too long > +# CHECK-NEXT: Features: 000102030405060708090a0b0c0d0e0f0 > > Added: llvm/trunk/test/tools/yaml2obj/minidump-systeminfo-other-not-hex.yaml > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/yaml2obj/minidump-systeminfo-other-not-hex.yaml?rev=356753&view=auto > ============================================================================== > --- llvm/trunk/test/tools/yaml2obj/minidump-systeminfo-other-not-hex.yaml (added) > +++ llvm/trunk/test/tools/yaml2obj/minidump-systeminfo-other-not-hex.yaml Fri Mar 22 07:47:26 2019 > @@ -0,0 +1,13 @@ > +# RUN: not yaml2obj %s 2>&1 | FileCheck %s > + > +--- !minidump > +Streams: > + - Type: SystemInfo > + Processor Arch: PPC > + Platform ID: Linux > + CPU: > + Features: 000102030405060708090a0b0c0d0e0g > + > + > +# CHECK: Invalid hex digit in input > +# CHECK-NEXT: Features: 000102030405060708090a0b0c0d0e0g > > Added: llvm/trunk/test/tools/yaml2obj/minidump-systeminfo-other-short.yaml > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/yaml2obj/minidump-systeminfo-other-short.yaml?rev=356753&view=auto > ============================================================================== > --- llvm/trunk/test/tools/yaml2obj/minidump-systeminfo-other-short.yaml (added) > +++ llvm/trunk/test/tools/yaml2obj/minidump-systeminfo-other-short.yaml Fri Mar 22 07:47:26 2019 > @@ -0,0 +1,13 @@ > +# RUN: not yaml2obj %s 2>&1 | FileCheck %s > + > +--- !minidump > +Streams: > + - Type: SystemInfo > + Processor Arch: PPC > + Platform ID: Linux > + CPU: > + Features: 000102030405060708090a0b0c0d0e0 > + > + > +# CHECK: String too short > +# CHECK-NEXT: Features: 000102030405060708090a0b0c0d0e0 > > Added: llvm/trunk/test/tools/yaml2obj/minidump-systeminfo-x86-long.yaml > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/yaml2obj/minidump-systeminfo-x86-long.yaml?rev=356753&view=auto > 
============================================================================== > --- llvm/trunk/test/tools/yaml2obj/minidump-systeminfo-x86-long.yaml (added) > +++ llvm/trunk/test/tools/yaml2obj/minidump-systeminfo-x86-long.yaml Fri Mar 22 07:47:26 2019 > @@ -0,0 +1,15 @@ > +# RUN: not yaml2obj %s 2>&1 | FileCheck %s > + > +--- !minidump > +Streams: > + - Type: SystemInfo > + Processor Arch: X86 > + Platform ID: Linux > + CPU: > + Vendor ID: LLVMLLVMLLVML > + Version Info: 0x01020304 > + Feature Info: 0x05060708 > + AMD Extended Features: 0x09000102 > + > +# CHECK: String too long > +# CHECK-NEXT: Vendor ID: LLVMLLVMLLVML > > Added: llvm/trunk/test/tools/yaml2obj/minidump-systeminfo-x86-short.yaml > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/yaml2obj/minidump-systeminfo-x86-short.yaml?rev=356753&view=auto > ============================================================================== > --- llvm/trunk/test/tools/yaml2obj/minidump-systeminfo-x86-short.yaml (added) > +++ llvm/trunk/test/tools/yaml2obj/minidump-systeminfo-x86-short.yaml Fri Mar 22 07:47:26 2019 > @@ -0,0 +1,15 @@ > +# RUN: not yaml2obj %s 2>&1 | FileCheck %s > + > +--- !minidump > +Streams: > + - Type: SystemInfo > + Processor Arch: X86 > + Platform ID: Linux > + CPU: > + Vendor ID: LLVMLLVMLLV > + Version Info: 0x01020304 > + Feature Info: 0x05060708 > + AMD Extended Features: 0x09000102 > + > +# CHECK: String too short > +# CHECK-NEXT: Vendor ID: LLVMLLVMLLV > > Modified: llvm/trunk/tools/yaml2obj/CMakeLists.txt > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/yaml2obj/CMakeLists.txt?rev=356753&r1=356752&r2=356753&view=diff > ============================================================================== > --- llvm/trunk/tools/yaml2obj/CMakeLists.txt (original) > +++ llvm/trunk/tools/yaml2obj/CMakeLists.txt Fri Mar 22 07:47:26 2019 > @@ -11,5 +11,6 @@ add_llvm_tool(yaml2obj > yaml2coff.cpp > yaml2elf.cpp > yaml2macho.cpp > + yaml2minidump.cpp > yaml2wasm.cpp > ) > > 
Added: llvm/trunk/tools/yaml2obj/yaml2minidump.cpp > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/yaml2obj/yaml2minidump.cpp?rev=356753&view=auto > ============================================================================== > --- llvm/trunk/tools/yaml2obj/yaml2minidump.cpp (added) > +++ llvm/trunk/tools/yaml2obj/yaml2minidump.cpp Fri Mar 22 07:47:26 2019 > @@ -0,0 +1,18 @@ > +//===- yaml2minidump.cpp - Convert a YAML file to a minidump file ---------===// > +// > +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. > +// See https://llvm.org/LICENSE.txt for license information. > +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception > +// > +//===----------------------------------------------------------------------===// > + > +#include "yaml2obj.h" > +#include "llvm/ObjectYAML/MinidumpYAML.h" > +#include "llvm/Support/raw_ostream.h" > + > +using namespace llvm; > + > +int yaml2minidump(MinidumpYAML::Object &Doc, raw_ostream &Out) { > + writeAsBinary(Doc, Out); > + return 0; > +} > > Modified: llvm/trunk/tools/yaml2obj/yaml2obj.cpp > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/yaml2obj/yaml2obj.cpp?rev=356753&r1=356752&r2=356753&view=diff > ============================================================================== > --- llvm/trunk/tools/yaml2obj/yaml2obj.cpp (original) > +++ llvm/trunk/tools/yaml2obj/yaml2obj.cpp Fri Mar 22 07:47:26 2019 > @@ -56,6 +56,8 @@ static int convertYAML(yaml::Input &YIn, > return yaml2coff(*Doc.Coff, Out); > if (Doc.MachO || Doc.FatMachO) > return yaml2macho(Doc, Out); > + if (Doc.Minidump) > + return yaml2minidump(*Doc.Minidump, Out); > if (Doc.Wasm) > return yaml2wasm(*Doc.Wasm, Out); > error("yaml2obj: Unknown document type!"); > > Modified: llvm/trunk/tools/yaml2obj/yaml2obj.h > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/yaml2obj/yaml2obj.h?rev=356753&r1=356752&r2=356753&view=diff > 
============================================================================== > --- llvm/trunk/tools/yaml2obj/yaml2obj.h (original) > +++ llvm/trunk/tools/yaml2obj/yaml2obj.h Fri Mar 22 07:47:26 2019 > @@ -22,6 +22,10 @@ namespace ELFYAML { > struct Object; > } > > +namespace MinidumpYAML { > +struct Object; > +} > + > namespace WasmYAML { > struct Object; > } > @@ -35,6 +39,7 @@ struct YamlObjectFile; > int yaml2coff(llvm::COFFYAML::Object &Doc, llvm::raw_ostream &Out); > int yaml2elf(llvm::ELFYAML::Object &Doc, llvm::raw_ostream &Out); > int yaml2macho(llvm::yaml::YamlObjectFile &Doc, llvm::raw_ostream &Out); > +int yaml2minidump(llvm::MinidumpYAML::Object &Doc, llvm::raw_ostream &Out); > int yaml2wasm(llvm::WasmYAML::Object &Doc, llvm::raw_ostream &Out); > > #endif > > Modified: llvm/trunk/unittests/ObjectYAML/CMakeLists.txt > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/unittests/ObjectYAML/CMakeLists.txt?rev=356753&r1=356752&r2=356753&view=diff > ============================================================================== > --- llvm/trunk/unittests/ObjectYAML/CMakeLists.txt (original) > +++ llvm/trunk/unittests/ObjectYAML/CMakeLists.txt Fri Mar 22 07:47:26 2019 > @@ -1,8 +1,11 @@ > set(LLVM_LINK_COMPONENTS > + Object > ObjectYAML > ) > > add_llvm_unittest(ObjectYAMLTests > + MinidumpYAMLTest.cpp > YAMLTest.cpp > ) > > +target_link_libraries(ObjectYAMLTests PRIVATE LLVMTestingSupport) > > Added: llvm/trunk/unittests/ObjectYAML/MinidumpYAMLTest.cpp > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/unittests/ObjectYAML/MinidumpYAMLTest.cpp?rev=356753&view=auto > ============================================================================== > --- llvm/trunk/unittests/ObjectYAML/MinidumpYAMLTest.cpp (added) > +++ llvm/trunk/unittests/ObjectYAML/MinidumpYAMLTest.cpp Fri Mar 22 07:47:26 2019 > @@ -0,0 +1,141 @@ > +//===- MinidumpYAMLTest.cpp - Tests for Minidump<->YAML code --------------===// > +// > +// Part of the LLVM Project, under the Apache 
License v2.0 with LLVM Exceptions. > +// See https://llvm.org/LICENSE.txt for license information. > +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception > +// > +//===----------------------------------------------------------------------===// > + > +#include "llvm/ObjectYAML/MinidumpYAML.h" > +#include "llvm/Object/Minidump.h" > +#include "llvm/ObjectYAML/ObjectYAML.h" > +#include "llvm/Testing/Support/Error.h" > +#include "gtest/gtest.h" > + > +using namespace llvm; > +using namespace llvm::minidump; > + > +static Expected> > +toBinary(SmallVectorImpl &Storage, StringRef Yaml) { > + Storage.clear(); > + raw_svector_ostream OS(Storage); > + if (Error E = MinidumpYAML::writeAsBinary(Yaml, OS)) > + return std::move(E); > + > + return object::MinidumpFile::create(MemoryBufferRef(OS.str(), "Binary")); > +} > + > +TEST(MinidumpYAML, Basic) { > + SmallString<0> Storage; > + auto ExpectedFile = toBinary(Storage, R"( > +--- !minidump > +Streams: > + - Type: SystemInfo > + Processor Arch: ARM64 > + Platform ID: Linux > + CSD Version RVA: 0x01020304 > + CPU: > + CPUID: 0x05060708 > + - Type: LinuxMaps > + Text: | > + 400d9000-400db000 r-xp 00000000 b3:04 227 /system/bin/app_process > + 400db000-400dc000 r--p 00001000 b3:04 227 /system/bin/app_process > + > + - Type: LinuxAuxv > + Content: DEADBEEFBAADF00D)"); > + ASSERT_THAT_EXPECTED(ExpectedFile, Succeeded()); > + object::MinidumpFile &File = **ExpectedFile; > + > + ASSERT_EQ(3u, File.streams().size()); > + > + EXPECT_EQ(StreamType::SystemInfo, File.streams()[0].Type); > + auto ExpectedSysInfo = File.getSystemInfo(); > + ASSERT_THAT_EXPECTED(ExpectedSysInfo, Succeeded()); > + const SystemInfo &SysInfo = *ExpectedSysInfo; > + EXPECT_EQ(ProcessorArchitecture::ARM64, SysInfo.ProcessorArch); > + EXPECT_EQ(OSPlatform::Linux, SysInfo.PlatformId); > + EXPECT_EQ(0x01020304u, SysInfo.CSDVersionRVA); > + EXPECT_EQ(0x05060708u, SysInfo.CPU.Arm.CPUID); > + > + EXPECT_EQ(StreamType::LinuxMaps, File.streams()[1].Type); > + 
EXPECT_EQ("400d9000-400db000 r-xp 00000000 b3:04 227 " > + "/system/bin/app_process\n" > + "400db000-400dc000 r--p 00001000 b3:04 227 " > + "/system/bin/app_process\n", > + toStringRef(*File.getRawStream(StreamType::LinuxMaps))); > + > + EXPECT_EQ(StreamType::LinuxAuxv, File.streams()[2].Type); > + EXPECT_EQ((ArrayRef{0xDE, 0xAD, 0xBE, 0xEF, 0xBA, 0xAD, 0xF0, 0x0D}), > + File.getRawStream(StreamType::LinuxAuxv)); > +} > + > +TEST(MinidumpYAML, RawContent) { > + SmallString<0> Storage; > + auto ExpectedFile = toBinary(Storage, R"( > +--- !minidump > +Streams: > + - Type: LinuxAuxv > + Size: 9 > + Content: DEADBEEFBAADF00D)"); > + ASSERT_THAT_EXPECTED(ExpectedFile, Succeeded()); > + object::MinidumpFile &File = **ExpectedFile; > + > + EXPECT_EQ( > + (ArrayRef{0xDE, 0xAD, 0xBE, 0xEF, 0xBA, 0xAD, 0xF0, 0x0D, 0x00}), > + File.getRawStream(StreamType::LinuxAuxv)); > +} > + > +TEST(MinidumpYAML, X86SystemInfo) { > + SmallString<0> Storage; > + auto ExpectedFile = toBinary(Storage, R"( > +--- !minidump > +Streams: > + - Type: SystemInfo > + Processor Arch: X86 > + Platform ID: Linux > + CPU: > + Vendor ID: LLVMLLVMLLVM > + Version Info: 0x01020304 > + Feature Info: 0x05060708 > + AMD Extended Features: 0x09000102)"); > + ASSERT_THAT_EXPECTED(ExpectedFile, Succeeded()); > + object::MinidumpFile &File = **ExpectedFile; > + > + ASSERT_EQ(1u, File.streams().size()); > + > + auto ExpectedSysInfo = File.getSystemInfo(); > + ASSERT_THAT_EXPECTED(ExpectedSysInfo, Succeeded()); > + const SystemInfo &SysInfo = *ExpectedSysInfo; > + EXPECT_EQ(ProcessorArchitecture::X86, SysInfo.ProcessorArch); > + EXPECT_EQ(OSPlatform::Linux, SysInfo.PlatformId); > + EXPECT_EQ("LLVMLLVMLLVM", StringRef(SysInfo.CPU.X86.VendorID, > + sizeof(SysInfo.CPU.X86.VendorID))); > + EXPECT_EQ(0x01020304u, SysInfo.CPU.X86.VersionInfo); > + EXPECT_EQ(0x05060708u, SysInfo.CPU.X86.FeatureInfo); > + EXPECT_EQ(0x09000102u, SysInfo.CPU.X86.AMDExtendedFeatures); > +} > + > +TEST(MinidumpYAML, OtherSystemInfo) { > + 
SmallString<0> Storage; > + auto ExpectedFile = toBinary(Storage, R"( > +--- !minidump > +Streams: > + - Type: SystemInfo > + Processor Arch: PPC > + Platform ID: Linux > + CPU: > + Features: 000102030405060708090a0b0c0d0e0f)"); > + ASSERT_THAT_EXPECTED(ExpectedFile, Succeeded()); > + object::MinidumpFile &File = **ExpectedFile; > + > + ASSERT_EQ(1u, File.streams().size()); > + > + auto ExpectedSysInfo = File.getSystemInfo(); > + ASSERT_THAT_EXPECTED(ExpectedSysInfo, Succeeded()); > + const SystemInfo &SysInfo = *ExpectedSysInfo; > + EXPECT_EQ(ProcessorArchitecture::PPC, SysInfo.ProcessorArch); > + EXPECT_EQ(OSPlatform::Linux, SysInfo.PlatformId); > + EXPECT_EQ( > + (ArrayRef{0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15}), > + makeArrayRef(SysInfo.CPU.Other.ProcessorFeatures)); > +} > > > _______________________________________________ > llvm-commits mailing list > llvm-commits at lists.llvm.org > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits From llvm-commits at lists.llvm.org Thu Oct 10 20:40:15 2019 From: llvm-commits at lists.llvm.org (Petr Hosek via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 03:40:15 +0000 (UTC) Subject: [PATCH] D68833: [CMake] Re-order runtimes in the order of dependencies In-Reply-To: References: Message-ID: phosek added a comment. See https://reviews.llvm.org/D68791 for an example where this came up. I'm open to other solutions, but I think what's in that change (which is what we were using until r374116 when it was accidentally removed) isn't great either. 
Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68833/new/

https://reviews.llvm.org/D68833

From llvm-commits at lists.llvm.org Thu Oct 10 20:43:50 2019
From: llvm-commits at lists.llvm.org (Fangrui Song via Phabricator via llvm-commits)
Date: Fri, 11 Oct 2019 03:43:50 +0000 (UTC)
Subject: [PATCH] D68772: [COFF] Wrap things in namespace lld { namespace coff {
In-Reply-To:
References:
Message-ID: <8431e1f306e8e90d14117ed64f7d602b@localhost.localdomain>

MaskRay marked an inline comment as done.
MaskRay added inline comments.

================
Comment at: lld/COFF/DebugTypes.cpp:213-214
 // moved here.
-Expected
-lld::coff::findTypeServerSource(const ObjFile *f) {
 Expected ts = TypeServerSource::findFromFile(f);
----------------
rnk wrote:
> I prefer this style for free functions because it makes it a hard error if there's a mismatch between the header and the cpp file. It's a pretty simple style rule: every function implemented in a .cpp file should either be qualified with a class or namespace name, or it should be marked static. Then you never have to worry about what the active namespace is outside of headers.
>
> That's just my personal preference and it's not in CodingStandards, but given how much we use free functions in LLD and LLVM, it's kind of nice.

Does the argument mean this patch should be reverted?

If we have interleaved classes and free functions, we may have:

```
namespace lld {
namespace coff {
void Class::method0() {}
}
}

void lld::coff::free0() {}

// we have to leave the active namespace, because otherwise [-Wextra-qualification]

namespace lld {
namespace coff {
void Class::method1() {}
}
}

void lld::coff::free1() {}
```

Instead of doing that, this patch uses an outermost `namespace lld { namespace coff {` so we will not need to think much about the active namespace.

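For readers following the style argument quoted above, here is a minimal sketch of the trade-off, not code from lld. The names `demo`, `findThing` and `checkOverloads` are hypothetical and stand in for a real header and .cpp pair. It assumes only standard C++ rules: a namespace-qualified definition must match an existing declaration, while a definition written inside a reopened namespace may silently introduce a new overload.

```cpp
#include <cassert>

// Stand-in for a header: one free function declared in a nested namespace.
namespace demo {
namespace coff {
int findThing(int key);
} // namespace coff
} // namespace demo

// Qualified style (the one rnk prefers): the definition must match the
// declaration above. If the signature were written as
//   long demo::coff::findThing(long key) { ... }
// the compiler would reject it, because no such function was declared
// in demo::coff.
int demo::coff::findThing(int key) { return key + 1; }

// Wrapped style (the one used by the patch): inside the reopened namespace,
// a definition whose signature differs from the header still compiles.
// It simply declares and defines a second overload.
namespace demo {
namespace coff {
long findThing(long key) { return key + 2; }
} // namespace coff
} // namespace demo

// Hypothetical self-check showing which definition each call resolves to.
inline void checkOverloads() {
  assert(demo::coff::findThing(1) == 2);  // int overload, declared in the "header"
  assert(demo::coff::findThing(1L) == 3); // long overload, introduced silently
}
```

In the wrapped style a header mismatch of this kind would typically surface later as a link error or an unexpected overload rather than as a compile error at the definition, which is the cost being weighed against not having to track the active namespace.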
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68772/new/ https://reviews.llvm.org/D68772 From llvm-commits at lists.llvm.org Thu Oct 10 20:46:39 2019 From: llvm-commits at lists.llvm.org (Craig Topper via llvm-commits) Date: Fri, 11 Oct 2019 03:46:39 -0000 Subject: [llvm] r374505 - [X86] Add more packus/ssat/usat truncate tests from legal vectors to less than 128-bit vectors. Message-ID: <20191011034639.541A192E17@lists.llvm.org> Author: ctopper Date: Thu Oct 10 20:46:39 2019 New Revision: 374505 URL: http://llvm.org/viewvc/llvm-project?rev=374505&view=rev Log: [X86] Add more packus/ssat/usat truncate tests from legal vectors to less than 128-bit vectors. Some of these have sub-optimal codegen for avx512 relative to avx2. Modified: llvm/trunk/test/CodeGen/X86/vector-trunc-packus.ll llvm/trunk/test/CodeGen/X86/vector-trunc-ssat.ll llvm/trunk/test/CodeGen/X86/vector-trunc-usat.ll Modified: llvm/trunk/test/CodeGen/X86/vector-trunc-packus.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/vector-trunc-packus.ll?rev=374505&r1=374504&r2=374505&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/vector-trunc-packus.ll (original) +++ llvm/trunk/test/CodeGen/X86/vector-trunc-packus.ll Thu Oct 10 20:46:39 2019 @@ -651,6 +651,557 @@ define <8 x i32> @trunc_packus_v8i64_v8i ; PACKUS saturation truncation to vXi16 ; +define <4 x i16> @trunc_packus_v4i64_v4i16(<4 x i64> %a0) { +; SSE2-LABEL: trunc_packus_v4i64_v4i16: +; SSE2: # %bb.0: +; SSE2-NEXT: movdqa {{.*#+}} xmm8 = [65535,65535] +; SSE2-NEXT: movdqa {{.*#+}} xmm2 = [2147483648,2147483648] +; SSE2-NEXT: movdqa %xmm1, %xmm3 +; SSE2-NEXT: pxor %xmm2, %xmm3 +; SSE2-NEXT: movdqa {{.*#+}} xmm5 = [2147549183,2147549183] +; SSE2-NEXT: movdqa %xmm5, %xmm6 +; SSE2-NEXT: pcmpgtd %xmm3, %xmm6 +; SSE2-NEXT: pshufd {{.*#+}} xmm7 = xmm6[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm5, %xmm3 +; SSE2-NEXT: pshufd 
{{.*#+}} xmm4 = xmm3[1,1,3,3] +; SSE2-NEXT: pand %xmm7, %xmm4 +; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm6[1,1,3,3] +; SSE2-NEXT: por %xmm4, %xmm3 +; SSE2-NEXT: pand %xmm3, %xmm1 +; SSE2-NEXT: pandn %xmm8, %xmm3 +; SSE2-NEXT: por %xmm1, %xmm3 +; SSE2-NEXT: movdqa %xmm0, %xmm1 +; SSE2-NEXT: pxor %xmm2, %xmm1 +; SSE2-NEXT: movdqa %xmm5, %xmm4 +; SSE2-NEXT: pcmpgtd %xmm1, %xmm4 +; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm4[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm5, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] +; SSE2-NEXT: pand %xmm6, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm4[1,1,3,3] +; SSE2-NEXT: por %xmm1, %xmm4 +; SSE2-NEXT: pand %xmm4, %xmm0 +; SSE2-NEXT: pandn %xmm8, %xmm4 +; SSE2-NEXT: por %xmm0, %xmm4 +; SSE2-NEXT: movdqa %xmm4, %xmm0 +; SSE2-NEXT: pxor %xmm2, %xmm0 +; SSE2-NEXT: movdqa %xmm0, %xmm1 +; SSE2-NEXT: pcmpgtd %xmm2, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm5 = xmm1[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm2, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] +; SSE2-NEXT: pand %xmm5, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] +; SSE2-NEXT: por %xmm0, %xmm1 +; SSE2-NEXT: pand %xmm4, %xmm1 +; SSE2-NEXT: movdqa %xmm3, %xmm0 +; SSE2-NEXT: pxor %xmm2, %xmm0 +; SSE2-NEXT: movdqa %xmm0, %xmm4 +; SSE2-NEXT: pcmpgtd %xmm2, %xmm4 +; SSE2-NEXT: pshufd {{.*#+}} xmm5 = xmm4[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm2, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] +; SSE2-NEXT: pand %xmm5, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm4[1,1,3,3] +; SSE2-NEXT: por %xmm0, %xmm2 +; SSE2-NEXT: pand %xmm3, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm2[0,2,2,3] +; SSE2-NEXT: pshuflw {{.*#+}} xmm2 = xmm0[0,2,2,3,4,5,6,7] +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm1[0,2,2,3] +; SSE2-NEXT: pshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7] +; SSE2-NEXT: punpckldq {{.*#+}} xmm0 = xmm0[0],xmm2[0],xmm0[1],xmm2[1] +; SSE2-NEXT: retq +; +; SSSE3-LABEL: trunc_packus_v4i64_v4i16: +; SSSE3: # %bb.0: +; SSSE3-NEXT: movdqa {{.*#+}} xmm8 = [65535,65535] +; 
SSSE3-NEXT: movdqa {{.*#+}} xmm2 = [2147483648,2147483648] +; SSSE3-NEXT: movdqa %xmm1, %xmm3 +; SSSE3-NEXT: pxor %xmm2, %xmm3 +; SSSE3-NEXT: movdqa {{.*#+}} xmm5 = [2147549183,2147549183] +; SSSE3-NEXT: movdqa %xmm5, %xmm6 +; SSSE3-NEXT: pcmpgtd %xmm3, %xmm6 +; SSSE3-NEXT: pshufd {{.*#+}} xmm7 = xmm6[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm5, %xmm3 +; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm3[1,1,3,3] +; SSSE3-NEXT: pand %xmm7, %xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm6[1,1,3,3] +; SSSE3-NEXT: por %xmm4, %xmm3 +; SSSE3-NEXT: pand %xmm3, %xmm1 +; SSSE3-NEXT: pandn %xmm8, %xmm3 +; SSSE3-NEXT: por %xmm1, %xmm3 +; SSSE3-NEXT: movdqa %xmm0, %xmm1 +; SSSE3-NEXT: pxor %xmm2, %xmm1 +; SSSE3-NEXT: movdqa %xmm5, %xmm4 +; SSSE3-NEXT: pcmpgtd %xmm1, %xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm4[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm5, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] +; SSSE3-NEXT: pand %xmm6, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm4[1,1,3,3] +; SSSE3-NEXT: por %xmm1, %xmm4 +; SSSE3-NEXT: pand %xmm4, %xmm0 +; SSSE3-NEXT: pandn %xmm8, %xmm4 +; SSSE3-NEXT: por %xmm0, %xmm4 +; SSSE3-NEXT: movdqa %xmm4, %xmm0 +; SSSE3-NEXT: pxor %xmm2, %xmm0 +; SSSE3-NEXT: movdqa %xmm0, %xmm1 +; SSSE3-NEXT: pcmpgtd %xmm2, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm1[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm2, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] +; SSSE3-NEXT: pand %xmm5, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] +; SSSE3-NEXT: por %xmm0, %xmm1 +; SSSE3-NEXT: pand %xmm4, %xmm1 +; SSSE3-NEXT: movdqa %xmm3, %xmm0 +; SSSE3-NEXT: pxor %xmm2, %xmm0 +; SSSE3-NEXT: movdqa %xmm0, %xmm4 +; SSSE3-NEXT: pcmpgtd %xmm2, %xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm4[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm2, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] +; SSSE3-NEXT: pand %xmm5, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm4[1,1,3,3] +; SSSE3-NEXT: por %xmm0, %xmm2 +; SSSE3-NEXT: pand %xmm3, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} 
xmm0 = xmm2[0,2,2,3] +; SSSE3-NEXT: pshuflw {{.*#+}} xmm2 = xmm0[0,2,2,3,4,5,6,7] +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm1[0,2,2,3] +; SSSE3-NEXT: pshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7] +; SSSE3-NEXT: punpckldq {{.*#+}} xmm0 = xmm0[0],xmm2[0],xmm0[1],xmm2[1] +; SSSE3-NEXT: retq +; +; SSE41-LABEL: trunc_packus_v4i64_v4i16: +; SSE41: # %bb.0: +; SSE41-NEXT: movdqa %xmm0, %xmm2 +; SSE41-NEXT: movapd {{.*#+}} xmm4 = [65535,65535] +; SSE41-NEXT: movdqa {{.*#+}} xmm3 = [2147483648,2147483648] +; SSE41-NEXT: movdqa %xmm1, %xmm0 +; SSE41-NEXT: pxor %xmm3, %xmm0 +; SSE41-NEXT: movdqa {{.*#+}} xmm6 = [2147549183,2147549183] +; SSE41-NEXT: movdqa %xmm6, %xmm5 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm5 +; SSE41-NEXT: movdqa %xmm6, %xmm7 +; SSE41-NEXT: pcmpgtd %xmm0, %xmm7 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm7[0,0,2,2] +; SSE41-NEXT: pand %xmm5, %xmm0 +; SSE41-NEXT: por %xmm7, %xmm0 +; SSE41-NEXT: movapd %xmm4, %xmm5 +; SSE41-NEXT: blendvpd %xmm0, %xmm1, %xmm5 +; SSE41-NEXT: movdqa %xmm2, %xmm0 +; SSE41-NEXT: pxor %xmm3, %xmm0 +; SSE41-NEXT: movdqa %xmm6, %xmm1 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm1 +; SSE41-NEXT: pcmpgtd %xmm0, %xmm6 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm6[0,0,2,2] +; SSE41-NEXT: pand %xmm1, %xmm0 +; SSE41-NEXT: por %xmm6, %xmm0 +; SSE41-NEXT: blendvpd %xmm0, %xmm2, %xmm4 +; SSE41-NEXT: pxor %xmm1, %xmm1 +; SSE41-NEXT: movapd %xmm4, %xmm2 +; SSE41-NEXT: xorpd %xmm3, %xmm2 +; SSE41-NEXT: movapd %xmm2, %xmm6 +; SSE41-NEXT: pcmpeqd %xmm3, %xmm6 +; SSE41-NEXT: pcmpgtd %xmm3, %xmm2 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm2[0,0,2,2] +; SSE41-NEXT: pand %xmm6, %xmm0 +; SSE41-NEXT: por %xmm2, %xmm0 +; SSE41-NEXT: pxor %xmm2, %xmm2 +; SSE41-NEXT: blendvpd %xmm0, %xmm4, %xmm2 +; SSE41-NEXT: movapd %xmm5, %xmm4 +; SSE41-NEXT: xorpd %xmm3, %xmm4 +; SSE41-NEXT: movapd %xmm4, %xmm6 +; SSE41-NEXT: pcmpeqd %xmm3, %xmm6 +; SSE41-NEXT: pcmpgtd %xmm3, %xmm4 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm4[0,0,2,2] +; SSE41-NEXT: pand %xmm6, %xmm0 +; SSE41-NEXT: por %xmm4, 
%xmm0 +; SSE41-NEXT: blendvpd %xmm0, %xmm5, %xmm1 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm1[0,2,2,3] +; SSE41-NEXT: pshuflw {{.*#+}} xmm1 = xmm0[0,2,2,3,4,5,6,7] +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm2[0,2,2,3] +; SSE41-NEXT: pshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7] +; SSE41-NEXT: punpckldq {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1] +; SSE41-NEXT: retq +; +; AVX1-LABEL: trunc_packus_v4i64_v4i16: +; AVX1: # %bb.0: +; AVX1-NEXT: vextractf128 $1, %ymm0, %xmm1 +; AVX1-NEXT: vmovdqa {{.*#+}} xmm2 = [65535,65535] +; AVX1-NEXT: vpcmpgtq %xmm1, %xmm2, %xmm3 +; AVX1-NEXT: vpcmpgtq %xmm0, %xmm2, %xmm4 +; AVX1-NEXT: vblendvpd %xmm4, %xmm0, %xmm2, %xmm0 +; AVX1-NEXT: vpxor %xmm4, %xmm4, %xmm4 +; AVX1-NEXT: vpcmpgtq %xmm4, %xmm0, %xmm5 +; AVX1-NEXT: vblendvpd %xmm3, %xmm1, %xmm2, %xmm1 +; AVX1-NEXT: vpcmpgtq %xmm4, %xmm1, %xmm2 +; AVX1-NEXT: vpand %xmm1, %xmm2, %xmm1 +; AVX1-NEXT: vpshufd {{.*#+}} xmm1 = xmm1[0,2,2,3] +; AVX1-NEXT: vpshuflw {{.*#+}} xmm1 = xmm1[0,2,2,3,4,5,6,7] +; AVX1-NEXT: vpand %xmm0, %xmm5, %xmm0 +; AVX1-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3] +; AVX1-NEXT: vpshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7] +; AVX1-NEXT: vpunpckldq {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1] +; AVX1-NEXT: vzeroupper +; AVX1-NEXT: retq +; +; AVX2-SLOW-LABEL: trunc_packus_v4i64_v4i16: +; AVX2-SLOW: # %bb.0: +; AVX2-SLOW-NEXT: vpbroadcastq {{.*#+}} ymm1 = [65535,65535,65535,65535] +; AVX2-SLOW-NEXT: vpcmpgtq %ymm0, %ymm1, %ymm2 +; AVX2-SLOW-NEXT: vblendvpd %ymm2, %ymm0, %ymm1, %ymm0 +; AVX2-SLOW-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; AVX2-SLOW-NEXT: vpcmpgtq %ymm1, %ymm0, %ymm1 +; AVX2-SLOW-NEXT: vpand %ymm0, %ymm1, %ymm0 +; AVX2-SLOW-NEXT: vextracti128 $1, %ymm0, %xmm1 +; AVX2-SLOW-NEXT: vpshufd {{.*#+}} xmm1 = xmm1[0,2,2,3] +; AVX2-SLOW-NEXT: vpshuflw {{.*#+}} xmm1 = xmm1[0,2,2,3,4,5,6,7] +; AVX2-SLOW-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3] +; AVX2-SLOW-NEXT: vpshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7] +; AVX2-SLOW-NEXT: vpunpckldq {{.*#+}} xmm0 = 
xmm0[0],xmm1[0],xmm0[1],xmm1[1] +; AVX2-SLOW-NEXT: vzeroupper +; AVX2-SLOW-NEXT: retq +; +; AVX2-FAST-LABEL: trunc_packus_v4i64_v4i16: +; AVX2-FAST: # %bb.0: +; AVX2-FAST-NEXT: vpbroadcastq {{.*#+}} ymm1 = [65535,65535,65535,65535] +; AVX2-FAST-NEXT: vpcmpgtq %ymm0, %ymm1, %ymm2 +; AVX2-FAST-NEXT: vblendvpd %ymm2, %ymm0, %ymm1, %ymm0 +; AVX2-FAST-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; AVX2-FAST-NEXT: vpcmpgtq %ymm1, %ymm0, %ymm1 +; AVX2-FAST-NEXT: vpand %ymm0, %ymm1, %ymm0 +; AVX2-FAST-NEXT: vextracti128 $1, %ymm0, %xmm1 +; AVX2-FAST-NEXT: vmovdqa {{.*#+}} xmm2 = [0,1,8,9,8,9,10,11,8,9,10,11,12,13,14,15] +; AVX2-FAST-NEXT: vpshufb %xmm2, %xmm1, %xmm1 +; AVX2-FAST-NEXT: vpshufb %xmm2, %xmm0, %xmm0 +; AVX2-FAST-NEXT: vpunpckldq {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1] +; AVX2-FAST-NEXT: vzeroupper +; AVX2-FAST-NEXT: retq +; +; AVX512F-LABEL: trunc_packus_v4i64_v4i16: +; AVX512F: # %bb.0: +; AVX512F-NEXT: # kill: def $ymm0 killed $ymm0 def $zmm0 +; AVX512F-NEXT: vpminsq {{.*}}(%rip){1to8}, %zmm0, %zmm0 +; AVX512F-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; AVX512F-NEXT: vpmaxsq %zmm1, %zmm0, %zmm0 +; AVX512F-NEXT: vpmovqw %zmm0, %xmm0 +; AVX512F-NEXT: vzeroupper +; AVX512F-NEXT: retq +; +; AVX512VL-LABEL: trunc_packus_v4i64_v4i16: +; AVX512VL: # %bb.0: +; AVX512VL-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; AVX512VL-NEXT: vpmaxsq %ymm1, %ymm0, %ymm0 +; AVX512VL-NEXT: vpmovusqw %ymm0, %xmm0 +; AVX512VL-NEXT: vzeroupper +; AVX512VL-NEXT: retq +; +; AVX512BW-LABEL: trunc_packus_v4i64_v4i16: +; AVX512BW: # %bb.0: +; AVX512BW-NEXT: # kill: def $ymm0 killed $ymm0 def $zmm0 +; AVX512BW-NEXT: vpminsq {{.*}}(%rip){1to8}, %zmm0, %zmm0 +; AVX512BW-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; AVX512BW-NEXT: vpmaxsq %zmm1, %zmm0, %zmm0 +; AVX512BW-NEXT: vpmovqw %zmm0, %xmm0 +; AVX512BW-NEXT: vzeroupper +; AVX512BW-NEXT: retq +; +; AVX512BWVL-LABEL: trunc_packus_v4i64_v4i16: +; AVX512BWVL: # %bb.0: +; AVX512BWVL-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; AVX512BWVL-NEXT: vpmaxsq %ymm1, %ymm0, %ymm0 +; 
AVX512BWVL-NEXT: vpmovusqw %ymm0, %xmm0 +; AVX512BWVL-NEXT: vzeroupper +; AVX512BWVL-NEXT: retq + %1 = icmp slt <4 x i64> %a0, + %2 = select <4 x i1> %1, <4 x i64> %a0, <4 x i64> + %3 = icmp sgt <4 x i64> %2, zeroinitializer + %4 = select <4 x i1> %3, <4 x i64> %2, <4 x i64> zeroinitializer + %5 = trunc <4 x i64> %4 to <4 x i16> + ret <4 x i16> %5 +} + +define void @trunc_packus_v4i64_v4i16_store(<4 x i64> %a0, <4 x i16> *%p1) { +; SSE2-LABEL: trunc_packus_v4i64_v4i16_store: +; SSE2: # %bb.0: +; SSE2-NEXT: movdqa {{.*#+}} xmm8 = [65535,65535] +; SSE2-NEXT: movdqa {{.*#+}} xmm2 = [2147483648,2147483648] +; SSE2-NEXT: movdqa %xmm1, %xmm3 +; SSE2-NEXT: pxor %xmm2, %xmm3 +; SSE2-NEXT: movdqa {{.*#+}} xmm5 = [2147549183,2147549183] +; SSE2-NEXT: movdqa %xmm5, %xmm6 +; SSE2-NEXT: pcmpgtd %xmm3, %xmm6 +; SSE2-NEXT: pshufd {{.*#+}} xmm7 = xmm6[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm5, %xmm3 +; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm3[1,1,3,3] +; SSE2-NEXT: pand %xmm7, %xmm4 +; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm6[1,1,3,3] +; SSE2-NEXT: por %xmm4, %xmm3 +; SSE2-NEXT: pand %xmm3, %xmm1 +; SSE2-NEXT: pandn %xmm8, %xmm3 +; SSE2-NEXT: por %xmm1, %xmm3 +; SSE2-NEXT: movdqa %xmm0, %xmm1 +; SSE2-NEXT: pxor %xmm2, %xmm1 +; SSE2-NEXT: movdqa %xmm5, %xmm4 +; SSE2-NEXT: pcmpgtd %xmm1, %xmm4 +; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm4[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm5, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] +; SSE2-NEXT: pand %xmm6, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm4[1,1,3,3] +; SSE2-NEXT: por %xmm1, %xmm4 +; SSE2-NEXT: pand %xmm4, %xmm0 +; SSE2-NEXT: pandn %xmm8, %xmm4 +; SSE2-NEXT: por %xmm0, %xmm4 +; SSE2-NEXT: movdqa %xmm4, %xmm0 +; SSE2-NEXT: pxor %xmm2, %xmm0 +; SSE2-NEXT: movdqa %xmm0, %xmm1 +; SSE2-NEXT: pcmpgtd %xmm2, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm5 = xmm1[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm2, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] +; SSE2-NEXT: pand %xmm5, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] +; 
SSE2-NEXT: por %xmm0, %xmm1 +; SSE2-NEXT: pand %xmm4, %xmm1 +; SSE2-NEXT: movdqa %xmm3, %xmm0 +; SSE2-NEXT: pxor %xmm2, %xmm0 +; SSE2-NEXT: movdqa %xmm0, %xmm4 +; SSE2-NEXT: pcmpgtd %xmm2, %xmm4 +; SSE2-NEXT: pshufd {{.*#+}} xmm5 = xmm4[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm2, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] +; SSE2-NEXT: pand %xmm5, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm4[1,1,3,3] +; SSE2-NEXT: por %xmm0, %xmm2 +; SSE2-NEXT: pand %xmm3, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm2[0,2,2,3] +; SSE2-NEXT: pshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7] +; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm1[0,2,2,3] +; SSE2-NEXT: pshuflw {{.*#+}} xmm1 = xmm1[0,2,2,3,4,5,6,7] +; SSE2-NEXT: punpckldq {{.*#+}} xmm1 = xmm1[0],xmm0[0],xmm1[1],xmm0[1] +; SSE2-NEXT: movq %xmm1, (%rdi) +; SSE2-NEXT: retq +; +; SSSE3-LABEL: trunc_packus_v4i64_v4i16_store: +; SSSE3: # %bb.0: +; SSSE3-NEXT: movdqa {{.*#+}} xmm8 = [65535,65535] +; SSSE3-NEXT: movdqa {{.*#+}} xmm2 = [2147483648,2147483648] +; SSSE3-NEXT: movdqa %xmm1, %xmm3 +; SSSE3-NEXT: pxor %xmm2, %xmm3 +; SSSE3-NEXT: movdqa {{.*#+}} xmm5 = [2147549183,2147549183] +; SSSE3-NEXT: movdqa %xmm5, %xmm6 +; SSSE3-NEXT: pcmpgtd %xmm3, %xmm6 +; SSSE3-NEXT: pshufd {{.*#+}} xmm7 = xmm6[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm5, %xmm3 +; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm3[1,1,3,3] +; SSSE3-NEXT: pand %xmm7, %xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm6[1,1,3,3] +; SSSE3-NEXT: por %xmm4, %xmm3 +; SSSE3-NEXT: pand %xmm3, %xmm1 +; SSSE3-NEXT: pandn %xmm8, %xmm3 +; SSSE3-NEXT: por %xmm1, %xmm3 +; SSSE3-NEXT: movdqa %xmm0, %xmm1 +; SSSE3-NEXT: pxor %xmm2, %xmm1 +; SSSE3-NEXT: movdqa %xmm5, %xmm4 +; SSSE3-NEXT: pcmpgtd %xmm1, %xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm4[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm5, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] +; SSSE3-NEXT: pand %xmm6, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm4[1,1,3,3] +; SSSE3-NEXT: por %xmm1, %xmm4 +; SSSE3-NEXT: pand %xmm4, %xmm0 +; 
SSSE3-NEXT: pandn %xmm8, %xmm4 +; SSSE3-NEXT: por %xmm0, %xmm4 +; SSSE3-NEXT: movdqa %xmm4, %xmm0 +; SSSE3-NEXT: pxor %xmm2, %xmm0 +; SSSE3-NEXT: movdqa %xmm0, %xmm1 +; SSSE3-NEXT: pcmpgtd %xmm2, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm1[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm2, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] +; SSSE3-NEXT: pand %xmm5, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] +; SSSE3-NEXT: por %xmm0, %xmm1 +; SSSE3-NEXT: pand %xmm4, %xmm1 +; SSSE3-NEXT: movdqa %xmm3, %xmm0 +; SSSE3-NEXT: pxor %xmm2, %xmm0 +; SSSE3-NEXT: movdqa %xmm0, %xmm4 +; SSSE3-NEXT: pcmpgtd %xmm2, %xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm4[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm2, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] +; SSSE3-NEXT: pand %xmm5, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm4[1,1,3,3] +; SSSE3-NEXT: por %xmm0, %xmm2 +; SSSE3-NEXT: pand %xmm3, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm2[0,2,2,3] +; SSSE3-NEXT: pshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7] +; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm1[0,2,2,3] +; SSSE3-NEXT: pshuflw {{.*#+}} xmm1 = xmm1[0,2,2,3,4,5,6,7] +; SSSE3-NEXT: punpckldq {{.*#+}} xmm1 = xmm1[0],xmm0[0],xmm1[1],xmm0[1] +; SSSE3-NEXT: movq %xmm1, (%rdi) +; SSSE3-NEXT: retq +; +; SSE41-LABEL: trunc_packus_v4i64_v4i16_store: +; SSE41: # %bb.0: +; SSE41-NEXT: movdqa %xmm0, %xmm2 +; SSE41-NEXT: movapd {{.*#+}} xmm4 = [65535,65535] +; SSE41-NEXT: movdqa {{.*#+}} xmm3 = [2147483648,2147483648] +; SSE41-NEXT: movdqa %xmm1, %xmm0 +; SSE41-NEXT: pxor %xmm3, %xmm0 +; SSE41-NEXT: movdqa {{.*#+}} xmm6 = [2147549183,2147549183] +; SSE41-NEXT: movdqa %xmm6, %xmm5 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm5 +; SSE41-NEXT: movdqa %xmm6, %xmm7 +; SSE41-NEXT: pcmpgtd %xmm0, %xmm7 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm7[0,0,2,2] +; SSE41-NEXT: pand %xmm5, %xmm0 +; SSE41-NEXT: por %xmm7, %xmm0 +; SSE41-NEXT: movapd %xmm4, %xmm5 +; SSE41-NEXT: blendvpd %xmm0, %xmm1, %xmm5 +; SSE41-NEXT: movdqa %xmm2, %xmm0 +; 
SSE41-NEXT: pxor %xmm3, %xmm0 +; SSE41-NEXT: movdqa %xmm6, %xmm1 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm1 +; SSE41-NEXT: pcmpgtd %xmm0, %xmm6 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm6[0,0,2,2] +; SSE41-NEXT: pand %xmm1, %xmm0 +; SSE41-NEXT: por %xmm6, %xmm0 +; SSE41-NEXT: blendvpd %xmm0, %xmm2, %xmm4 +; SSE41-NEXT: pxor %xmm1, %xmm1 +; SSE41-NEXT: movapd %xmm4, %xmm2 +; SSE41-NEXT: xorpd %xmm3, %xmm2 +; SSE41-NEXT: movapd %xmm2, %xmm6 +; SSE41-NEXT: pcmpeqd %xmm3, %xmm6 +; SSE41-NEXT: pcmpgtd %xmm3, %xmm2 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm2[0,0,2,2] +; SSE41-NEXT: pand %xmm6, %xmm0 +; SSE41-NEXT: por %xmm2, %xmm0 +; SSE41-NEXT: pxor %xmm2, %xmm2 +; SSE41-NEXT: blendvpd %xmm0, %xmm4, %xmm2 +; SSE41-NEXT: movapd %xmm5, %xmm4 +; SSE41-NEXT: xorpd %xmm3, %xmm4 +; SSE41-NEXT: movapd %xmm4, %xmm6 +; SSE41-NEXT: pcmpeqd %xmm3, %xmm6 +; SSE41-NEXT: pcmpgtd %xmm3, %xmm4 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm4[0,0,2,2] +; SSE41-NEXT: pand %xmm6, %xmm0 +; SSE41-NEXT: por %xmm4, %xmm0 +; SSE41-NEXT: blendvpd %xmm0, %xmm5, %xmm1 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm1[0,2,2,3] +; SSE41-NEXT: pshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7] +; SSE41-NEXT: pshufd {{.*#+}} xmm1 = xmm2[0,2,2,3] +; SSE41-NEXT: pshuflw {{.*#+}} xmm1 = xmm1[0,2,2,3,4,5,6,7] +; SSE41-NEXT: punpckldq {{.*#+}} xmm1 = xmm1[0],xmm0[0],xmm1[1],xmm0[1] +; SSE41-NEXT: movq %xmm1, (%rdi) +; SSE41-NEXT: retq +; +; AVX1-LABEL: trunc_packus_v4i64_v4i16_store: +; AVX1: # %bb.0: +; AVX1-NEXT: vextractf128 $1, %ymm0, %xmm1 +; AVX1-NEXT: vmovdqa {{.*#+}} xmm2 = [65535,65535] +; AVX1-NEXT: vpcmpgtq %xmm1, %xmm2, %xmm3 +; AVX1-NEXT: vpcmpgtq %xmm0, %xmm2, %xmm4 +; AVX1-NEXT: vblendvpd %xmm4, %xmm0, %xmm2, %xmm0 +; AVX1-NEXT: vpxor %xmm4, %xmm4, %xmm4 +; AVX1-NEXT: vpcmpgtq %xmm4, %xmm0, %xmm5 +; AVX1-NEXT: vblendvpd %xmm3, %xmm1, %xmm2, %xmm1 +; AVX1-NEXT: vpcmpgtq %xmm4, %xmm1, %xmm2 +; AVX1-NEXT: vpand %xmm1, %xmm2, %xmm1 +; AVX1-NEXT: vpshufd {{.*#+}} xmm1 = xmm1[0,2,2,3] +; AVX1-NEXT: vpshuflw {{.*#+}} xmm1 = 
xmm1[0,2,2,3,4,5,6,7] +; AVX1-NEXT: vpand %xmm0, %xmm5, %xmm0 +; AVX1-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3] +; AVX1-NEXT: vpshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7] +; AVX1-NEXT: vpunpckldq {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1] +; AVX1-NEXT: vmovq %xmm0, (%rdi) +; AVX1-NEXT: vzeroupper +; AVX1-NEXT: retq +; +; AVX2-SLOW-LABEL: trunc_packus_v4i64_v4i16_store: +; AVX2-SLOW: # %bb.0: +; AVX2-SLOW-NEXT: vpbroadcastq {{.*#+}} ymm1 = [65535,65535,65535,65535] +; AVX2-SLOW-NEXT: vpcmpgtq %ymm0, %ymm1, %ymm2 +; AVX2-SLOW-NEXT: vblendvpd %ymm2, %ymm0, %ymm1, %ymm0 +; AVX2-SLOW-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; AVX2-SLOW-NEXT: vpcmpgtq %ymm1, %ymm0, %ymm1 +; AVX2-SLOW-NEXT: vpand %ymm0, %ymm1, %ymm0 +; AVX2-SLOW-NEXT: vextracti128 $1, %ymm0, %xmm1 +; AVX2-SLOW-NEXT: vpshufd {{.*#+}} xmm1 = xmm1[0,2,2,3] +; AVX2-SLOW-NEXT: vpshuflw {{.*#+}} xmm1 = xmm1[0,2,2,3,4,5,6,7] +; AVX2-SLOW-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3] +; AVX2-SLOW-NEXT: vpshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7] +; AVX2-SLOW-NEXT: vpunpckldq {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1] +; AVX2-SLOW-NEXT: vmovq %xmm0, (%rdi) +; AVX2-SLOW-NEXT: vzeroupper +; AVX2-SLOW-NEXT: retq +; +; AVX2-FAST-LABEL: trunc_packus_v4i64_v4i16_store: +; AVX2-FAST: # %bb.0: +; AVX2-FAST-NEXT: vpbroadcastq {{.*#+}} ymm1 = [65535,65535,65535,65535] +; AVX2-FAST-NEXT: vpcmpgtq %ymm0, %ymm1, %ymm2 +; AVX2-FAST-NEXT: vblendvpd %ymm2, %ymm0, %ymm1, %ymm0 +; AVX2-FAST-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; AVX2-FAST-NEXT: vpcmpgtq %ymm1, %ymm0, %ymm1 +; AVX2-FAST-NEXT: vpand %ymm0, %ymm1, %ymm0 +; AVX2-FAST-NEXT: vextracti128 $1, %ymm0, %xmm1 +; AVX2-FAST-NEXT: vmovdqa {{.*#+}} xmm2 = [0,1,8,9,8,9,10,11,8,9,10,11,12,13,14,15] +; AVX2-FAST-NEXT: vpshufb %xmm2, %xmm1, %xmm1 +; AVX2-FAST-NEXT: vpshufb %xmm2, %xmm0, %xmm0 +; AVX2-FAST-NEXT: vpunpckldq {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1] +; AVX2-FAST-NEXT: vmovq %xmm0, (%rdi) +; AVX2-FAST-NEXT: vzeroupper +; AVX2-FAST-NEXT: retq +; +; AVX512F-LABEL: 
trunc_packus_v4i64_v4i16_store: +; AVX512F: # %bb.0: +; AVX512F-NEXT: # kill: def $ymm0 killed $ymm0 def $zmm0 +; AVX512F-NEXT: vpminsq {{.*}}(%rip){1to8}, %zmm0, %zmm0 +; AVX512F-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; AVX512F-NEXT: vpmaxsq %zmm1, %zmm0, %zmm0 +; AVX512F-NEXT: vpmovqw %zmm0, %xmm0 +; AVX512F-NEXT: vmovq %xmm0, (%rdi) +; AVX512F-NEXT: vzeroupper +; AVX512F-NEXT: retq +; +; AVX512VL-LABEL: trunc_packus_v4i64_v4i16_store: +; AVX512VL: # %bb.0: +; AVX512VL-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; AVX512VL-NEXT: vpmaxsq %ymm1, %ymm0, %ymm0 +; AVX512VL-NEXT: vpmovusqw %ymm0, (%rdi) +; AVX512VL-NEXT: vzeroupper +; AVX512VL-NEXT: retq +; +; AVX512BW-LABEL: trunc_packus_v4i64_v4i16_store: +; AVX512BW: # %bb.0: +; AVX512BW-NEXT: # kill: def $ymm0 killed $ymm0 def $zmm0 +; AVX512BW-NEXT: vpminsq {{.*}}(%rip){1to8}, %zmm0, %zmm0 +; AVX512BW-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; AVX512BW-NEXT: vpmaxsq %zmm1, %zmm0, %zmm0 +; AVX512BW-NEXT: vpmovqw %zmm0, %xmm0 +; AVX512BW-NEXT: vmovq %xmm0, (%rdi) +; AVX512BW-NEXT: vzeroupper +; AVX512BW-NEXT: retq +; +; AVX512BWVL-LABEL: trunc_packus_v4i64_v4i16_store: +; AVX512BWVL: # %bb.0: +; AVX512BWVL-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; AVX512BWVL-NEXT: vpmaxsq %ymm1, %ymm0, %ymm0 +; AVX512BWVL-NEXT: vpmovusqw %ymm0, (%rdi) +; AVX512BWVL-NEXT: vzeroupper +; AVX512BWVL-NEXT: retq + %1 = icmp slt <4 x i64> %a0, + %2 = select <4 x i1> %1, <4 x i64> %a0, <4 x i64> + %3 = icmp sgt <4 x i64> %2, zeroinitializer + %4 = select <4 x i1> %3, <4 x i64> %2, <4 x i64> zeroinitializer + %5 = trunc <4 x i64> %4 to <4 x i16> + store <4 x i16> %5, <4 x i16> *%p1 + ret void +} + define <8 x i16> @trunc_packus_v8i64_v8i16(<8 x i64> %a0) { ; SSE2-LABEL: trunc_packus_v8i64_v8i16: ; SSE2: # %bb.0: @@ -1036,6 +1587,163 @@ define <8 x i16> @trunc_packus_v8i64_v8i ret <8 x i16> %5 } +define <4 x i16> @trunc_packus_v4i32_v4i16(<4 x i32> %a0) { +; SSE2-LABEL: trunc_packus_v4i32_v4i16: +; SSE2: # %bb.0: +; SSE2-NEXT: movdqa {{.*#+}} xmm1 = 
[65535,65535,65535,65535] +; SSE2-NEXT: movdqa %xmm1, %xmm2 +; SSE2-NEXT: pcmpgtd %xmm0, %xmm2 +; SSE2-NEXT: pand %xmm2, %xmm0 +; SSE2-NEXT: pandn %xmm1, %xmm2 +; SSE2-NEXT: por %xmm0, %xmm2 +; SSE2-NEXT: pxor %xmm0, %xmm0 +; SSE2-NEXT: movdqa %xmm2, %xmm1 +; SSE2-NEXT: pcmpgtd %xmm0, %xmm1 +; SSE2-NEXT: pand %xmm2, %xmm1 +; SSE2-NEXT: pshuflw {{.*#+}} xmm0 = xmm1[0,2,2,3,4,5,6,7] +; SSE2-NEXT: pshufhw {{.*#+}} xmm0 = xmm0[0,1,2,3,4,6,6,7] +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm0[0,2,2,3] +; SSE2-NEXT: retq +; +; SSSE3-LABEL: trunc_packus_v4i32_v4i16: +; SSSE3: # %bb.0: +; SSSE3-NEXT: movdqa {{.*#+}} xmm1 = [65535,65535,65535,65535] +; SSSE3-NEXT: movdqa %xmm1, %xmm2 +; SSSE3-NEXT: pcmpgtd %xmm0, %xmm2 +; SSSE3-NEXT: pand %xmm2, %xmm0 +; SSSE3-NEXT: pandn %xmm1, %xmm2 +; SSSE3-NEXT: por %xmm0, %xmm2 +; SSSE3-NEXT: pxor %xmm1, %xmm1 +; SSSE3-NEXT: movdqa %xmm2, %xmm0 +; SSSE3-NEXT: pcmpgtd %xmm1, %xmm0 +; SSSE3-NEXT: pand %xmm2, %xmm0 +; SSSE3-NEXT: pshufb {{.*#+}} xmm0 = xmm0[0,1,4,5,8,9,12,13,8,9,12,13,12,13,14,15] +; SSSE3-NEXT: retq +; +; SSE41-LABEL: trunc_packus_v4i32_v4i16: +; SSE41: # %bb.0: +; SSE41-NEXT: packusdw %xmm0, %xmm0 +; SSE41-NEXT: retq +; +; AVX-LABEL: trunc_packus_v4i32_v4i16: +; AVX: # %bb.0: +; AVX-NEXT: vpackusdw %xmm0, %xmm0, %xmm0 +; AVX-NEXT: retq +; +; AVX512F-LABEL: trunc_packus_v4i32_v4i16: +; AVX512F: # %bb.0: +; AVX512F-NEXT: vpackusdw %xmm0, %xmm0, %xmm0 +; AVX512F-NEXT: retq +; +; AVX512VL-LABEL: trunc_packus_v4i32_v4i16: +; AVX512VL: # %bb.0: +; AVX512VL-NEXT: vpminsd {{.*}}(%rip){1to4}, %xmm0, %xmm0 +; AVX512VL-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; AVX512VL-NEXT: vpmaxsd %xmm1, %xmm0, %xmm0 +; AVX512VL-NEXT: vpackusdw %xmm0, %xmm0, %xmm0 +; AVX512VL-NEXT: retq +; +; AVX512BW-LABEL: trunc_packus_v4i32_v4i16: +; AVX512BW: # %bb.0: +; AVX512BW-NEXT: vpackusdw %xmm0, %xmm0, %xmm0 +; AVX512BW-NEXT: retq +; +; AVX512BWVL-LABEL: trunc_packus_v4i32_v4i16: +; AVX512BWVL: # %bb.0: +; AVX512BWVL-NEXT: vpminsd {{.*}}(%rip){1to4}, %xmm0, %xmm0 
+; AVX512BWVL-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; AVX512BWVL-NEXT: vpmaxsd %xmm1, %xmm0, %xmm0 +; AVX512BWVL-NEXT: vpackusdw %xmm0, %xmm0, %xmm0 +; AVX512BWVL-NEXT: retq + %1 = icmp slt <4 x i32> %a0, + %2 = select <4 x i1> %1, <4 x i32> %a0, <4 x i32> + %3 = icmp sgt <4 x i32> %2, zeroinitializer + %4 = select <4 x i1> %3, <4 x i32> %2, <4 x i32> zeroinitializer + %5 = trunc <4 x i32> %4 to <4 x i16> + ret <4 x i16> %5 +} + +define void @trunc_packus_v4i32_v4i16_store(<4 x i32> %a0, <4 x i16> *%p1) { +; SSE2-LABEL: trunc_packus_v4i32_v4i16_store: +; SSE2: # %bb.0: +; SSE2-NEXT: movdqa {{.*#+}} xmm1 = [65535,65535,65535,65535] +; SSE2-NEXT: movdqa %xmm1, %xmm2 +; SSE2-NEXT: pcmpgtd %xmm0, %xmm2 +; SSE2-NEXT: pand %xmm2, %xmm0 +; SSE2-NEXT: pandn %xmm1, %xmm2 +; SSE2-NEXT: por %xmm0, %xmm2 +; SSE2-NEXT: pxor %xmm0, %xmm0 +; SSE2-NEXT: movdqa %xmm2, %xmm1 +; SSE2-NEXT: pcmpgtd %xmm0, %xmm1 +; SSE2-NEXT: pand %xmm2, %xmm1 +; SSE2-NEXT: pshuflw {{.*#+}} xmm0 = xmm1[0,2,2,3,4,5,6,7] +; SSE2-NEXT: pshufhw {{.*#+}} xmm0 = xmm0[0,1,2,3,4,6,6,7] +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm0[0,2,2,3] +; SSE2-NEXT: movq %xmm0, (%rdi) +; SSE2-NEXT: retq +; +; SSSE3-LABEL: trunc_packus_v4i32_v4i16_store: +; SSSE3: # %bb.0: +; SSSE3-NEXT: movdqa {{.*#+}} xmm1 = [65535,65535,65535,65535] +; SSSE3-NEXT: movdqa %xmm1, %xmm2 +; SSSE3-NEXT: pcmpgtd %xmm0, %xmm2 +; SSSE3-NEXT: pand %xmm2, %xmm0 +; SSSE3-NEXT: pandn %xmm1, %xmm2 +; SSSE3-NEXT: por %xmm0, %xmm2 +; SSSE3-NEXT: pxor %xmm0, %xmm0 +; SSSE3-NEXT: movdqa %xmm2, %xmm1 +; SSSE3-NEXT: pcmpgtd %xmm0, %xmm1 +; SSSE3-NEXT: pand %xmm2, %xmm1 +; SSSE3-NEXT: pshufb {{.*#+}} xmm1 = xmm1[0,1,4,5,8,9,12,13,8,9,12,13,12,13,14,15] +; SSSE3-NEXT: movq %xmm1, (%rdi) +; SSSE3-NEXT: retq +; +; SSE41-LABEL: trunc_packus_v4i32_v4i16_store: +; SSE41: # %bb.0: +; SSE41-NEXT: packusdw %xmm0, %xmm0 +; SSE41-NEXT: movq %xmm0, (%rdi) +; SSE41-NEXT: retq +; +; AVX-LABEL: trunc_packus_v4i32_v4i16_store: +; AVX: # %bb.0: +; AVX-NEXT: vpackusdw %xmm0, %xmm0, 
%xmm0 +; AVX-NEXT: vmovq %xmm0, (%rdi) +; AVX-NEXT: retq +; +; AVX512F-LABEL: trunc_packus_v4i32_v4i16_store: +; AVX512F: # %bb.0: +; AVX512F-NEXT: vpackusdw %xmm0, %xmm0, %xmm0 +; AVX512F-NEXT: vmovq %xmm0, (%rdi) +; AVX512F-NEXT: retq +; +; AVX512VL-LABEL: trunc_packus_v4i32_v4i16_store: +; AVX512VL: # %bb.0: +; AVX512VL-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; AVX512VL-NEXT: vpmaxsd %xmm1, %xmm0, %xmm0 +; AVX512VL-NEXT: vpmovusdw %xmm0, (%rdi) +; AVX512VL-NEXT: retq +; +; AVX512BW-LABEL: trunc_packus_v4i32_v4i16_store: +; AVX512BW: # %bb.0: +; AVX512BW-NEXT: vpackusdw %xmm0, %xmm0, %xmm0 +; AVX512BW-NEXT: vmovq %xmm0, (%rdi) +; AVX512BW-NEXT: retq +; +; AVX512BWVL-LABEL: trunc_packus_v4i32_v4i16_store: +; AVX512BWVL: # %bb.0: +; AVX512BWVL-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; AVX512BWVL-NEXT: vpmaxsd %xmm1, %xmm0, %xmm0 +; AVX512BWVL-NEXT: vpmovusdw %xmm0, (%rdi) +; AVX512BWVL-NEXT: retq + %1 = icmp slt <4 x i32> %a0, + %2 = select <4 x i1> %1, <4 x i32> %a0, <4 x i32> + %3 = icmp sgt <4 x i32> %2, zeroinitializer + %4 = select <4 x i1> %3, <4 x i32> %2, <4 x i32> zeroinitializer + %5 = trunc <4 x i32> %4 to <4 x i16> + store <4 x i16> %5, <4 x i16> *%p1 + ret void +} + define <8 x i16> @trunc_packus_v8i32_v8i16(<8 x i32> %a0) { ; SSE2-LABEL: trunc_packus_v8i32_v8i16: ; SSE2: # %bb.0: @@ -1200,88 +1908,599 @@ define <16 x i16> @trunc_packus_v16i32_v ; SSSE3-NEXT: movdqa {{.*#+}} xmm6 = [65535,65535,65535,65535] ; SSSE3-NEXT: movdqa %xmm6, %xmm4 ; SSSE3-NEXT: pcmpgtd %xmm1, %xmm4 -; SSSE3-NEXT: pand %xmm4, %xmm1 -; SSSE3-NEXT: pandn %xmm6, %xmm4 +; SSSE3-NEXT: pand %xmm4, %xmm1 +; SSSE3-NEXT: pandn %xmm6, %xmm4 +; SSSE3-NEXT: por %xmm1, %xmm4 +; SSSE3-NEXT: movdqa %xmm6, %xmm5 +; SSSE3-NEXT: pcmpgtd %xmm0, %xmm5 +; SSSE3-NEXT: pand %xmm5, %xmm0 +; SSSE3-NEXT: pandn %xmm6, %xmm5 +; SSSE3-NEXT: por %xmm0, %xmm5 +; SSSE3-NEXT: movdqa %xmm6, %xmm0 +; SSSE3-NEXT: pcmpgtd %xmm3, %xmm0 +; SSSE3-NEXT: pand %xmm0, %xmm3 +; SSSE3-NEXT: pandn %xmm6, %xmm0 +; SSSE3-NEXT: por %xmm3, 
%xmm0 +; SSSE3-NEXT: movdqa %xmm6, %xmm3 +; SSSE3-NEXT: pcmpgtd %xmm2, %xmm3 +; SSSE3-NEXT: pand %xmm3, %xmm2 +; SSSE3-NEXT: pandn %xmm6, %xmm3 +; SSSE3-NEXT: por %xmm2, %xmm3 +; SSSE3-NEXT: pxor %xmm2, %xmm2 +; SSSE3-NEXT: movdqa %xmm3, %xmm1 +; SSSE3-NEXT: pcmpgtd %xmm2, %xmm1 +; SSSE3-NEXT: pand %xmm3, %xmm1 +; SSSE3-NEXT: movdqa %xmm0, %xmm3 +; SSSE3-NEXT: pcmpgtd %xmm2, %xmm3 +; SSSE3-NEXT: pand %xmm0, %xmm3 +; SSSE3-NEXT: movdqa %xmm5, %xmm0 +; SSSE3-NEXT: pcmpgtd %xmm2, %xmm0 +; SSSE3-NEXT: pand %xmm5, %xmm0 +; SSSE3-NEXT: movdqa %xmm4, %xmm5 +; SSSE3-NEXT: pcmpgtd %xmm2, %xmm5 +; SSSE3-NEXT: pand %xmm4, %xmm5 +; SSSE3-NEXT: pslld $16, %xmm5 +; SSSE3-NEXT: psrad $16, %xmm5 +; SSSE3-NEXT: pslld $16, %xmm0 +; SSSE3-NEXT: psrad $16, %xmm0 +; SSSE3-NEXT: packssdw %xmm5, %xmm0 +; SSSE3-NEXT: pslld $16, %xmm3 +; SSSE3-NEXT: psrad $16, %xmm3 +; SSSE3-NEXT: pslld $16, %xmm1 +; SSSE3-NEXT: psrad $16, %xmm1 +; SSSE3-NEXT: packssdw %xmm3, %xmm1 +; SSSE3-NEXT: retq +; +; SSE41-LABEL: trunc_packus_v16i32_v16i16: +; SSE41: # %bb.0: +; SSE41-NEXT: packusdw %xmm1, %xmm0 +; SSE41-NEXT: packusdw %xmm3, %xmm2 +; SSE41-NEXT: movdqa %xmm2, %xmm1 +; SSE41-NEXT: retq +; +; AVX1-LABEL: trunc_packus_v16i32_v16i16: +; AVX1: # %bb.0: +; AVX1-NEXT: vextractf128 $1, %ymm1, %xmm2 +; AVX1-NEXT: vpackusdw %xmm2, %xmm1, %xmm1 +; AVX1-NEXT: vextractf128 $1, %ymm0, %xmm2 +; AVX1-NEXT: vpackusdw %xmm2, %xmm0, %xmm0 +; AVX1-NEXT: vinsertf128 $1, %xmm1, %ymm0, %ymm0 +; AVX1-NEXT: retq +; +; AVX2-LABEL: trunc_packus_v16i32_v16i16: +; AVX2: # %bb.0: +; AVX2-NEXT: vpackusdw %ymm1, %ymm0, %ymm0 +; AVX2-NEXT: vpermq {{.*#+}} ymm0 = ymm0[0,2,1,3] +; AVX2-NEXT: retq +; +; AVX512-LABEL: trunc_packus_v16i32_v16i16: +; AVX512: # %bb.0: +; AVX512-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; AVX512-NEXT: vpmaxsd %zmm1, %zmm0, %zmm0 +; AVX512-NEXT: vpmovusdw %zmm0, %ymm0 +; AVX512-NEXT: retq + %1 = icmp slt <16 x i32> %a0, + %2 = select <16 x i1> %1, <16 x i32> %a0, <16 x i32> + %3 = icmp sgt <16 x i32> %2, 
zeroinitializer + %4 = select <16 x i1> %3, <16 x i32> %2, <16 x i32> zeroinitializer + %5 = trunc <16 x i32> %4 to <16 x i16> + ret <16 x i16> %5 +} + +; +; PACKUS saturation truncation to vXi8 +; + +define <4 x i8> @trunc_packus_v4i64_v4i8(<4 x i64> %a0) { +; SSE2-LABEL: trunc_packus_v4i64_v4i8: +; SSE2: # %bb.0: +; SSE2-NEXT: movdqa {{.*#+}} xmm8 = [255,255] +; SSE2-NEXT: movdqa {{.*#+}} xmm3 = [2147483648,2147483648] +; SSE2-NEXT: movdqa %xmm1, %xmm4 +; SSE2-NEXT: pxor %xmm3, %xmm4 +; SSE2-NEXT: movdqa {{.*#+}} xmm5 = [2147483903,2147483903] +; SSE2-NEXT: movdqa %xmm5, %xmm6 +; SSE2-NEXT: pcmpgtd %xmm4, %xmm6 +; SSE2-NEXT: pshufd {{.*#+}} xmm7 = xmm6[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm5, %xmm4 +; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm4[1,1,3,3] +; SSE2-NEXT: pand %xmm7, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm6[1,1,3,3] +; SSE2-NEXT: por %xmm2, %xmm4 +; SSE2-NEXT: pand %xmm4, %xmm1 +; SSE2-NEXT: pandn %xmm8, %xmm4 +; SSE2-NEXT: por %xmm1, %xmm4 +; SSE2-NEXT: movdqa %xmm0, %xmm1 +; SSE2-NEXT: pxor %xmm3, %xmm1 +; SSE2-NEXT: movdqa %xmm5, %xmm2 +; SSE2-NEXT: pcmpgtd %xmm1, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm2[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm5, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm5 = xmm1[1,1,3,3] +; SSE2-NEXT: pand %xmm6, %xmm5 +; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm2[1,1,3,3] +; SSE2-NEXT: por %xmm5, %xmm1 +; SSE2-NEXT: pand %xmm1, %xmm0 +; SSE2-NEXT: pandn %xmm8, %xmm1 +; SSE2-NEXT: por %xmm0, %xmm1 +; SSE2-NEXT: movdqa %xmm1, %xmm0 +; SSE2-NEXT: pxor %xmm3, %xmm0 +; SSE2-NEXT: movdqa %xmm0, %xmm2 +; SSE2-NEXT: pcmpgtd %xmm3, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm5 = xmm2[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm3, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm0[1,1,3,3] +; SSE2-NEXT: pand %xmm5, %xmm6 +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm2[1,1,3,3] +; SSE2-NEXT: por %xmm6, %xmm0 +; SSE2-NEXT: movdqa %xmm4, %xmm2 +; SSE2-NEXT: pxor %xmm3, %xmm2 +; SSE2-NEXT: movdqa %xmm2, %xmm5 +; SSE2-NEXT: pcmpgtd %xmm3, %xmm5 +; SSE2-NEXT: pshufd 
{{.*#+}} xmm6 = xmm5[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm3, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] +; SSE2-NEXT: pand %xmm6, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm5[1,1,3,3] +; SSE2-NEXT: por %xmm2, %xmm3 +; SSE2-NEXT: pand %xmm8, %xmm3 +; SSE2-NEXT: pand %xmm4, %xmm3 +; SSE2-NEXT: pand %xmm8, %xmm0 +; SSE2-NEXT: pand %xmm1, %xmm0 +; SSE2-NEXT: packuswb %xmm3, %xmm0 +; SSE2-NEXT: packuswb %xmm0, %xmm0 +; SSE2-NEXT: packuswb %xmm0, %xmm0 +; SSE2-NEXT: retq +; +; SSSE3-LABEL: trunc_packus_v4i64_v4i8: +; SSSE3: # %bb.0: +; SSSE3-NEXT: movdqa {{.*#+}} xmm8 = [255,255] +; SSSE3-NEXT: movdqa {{.*#+}} xmm2 = [2147483648,2147483648] +; SSSE3-NEXT: movdqa %xmm1, %xmm3 +; SSSE3-NEXT: pxor %xmm2, %xmm3 +; SSSE3-NEXT: movdqa {{.*#+}} xmm5 = [2147483903,2147483903] +; SSSE3-NEXT: movdqa %xmm5, %xmm6 +; SSSE3-NEXT: pcmpgtd %xmm3, %xmm6 +; SSSE3-NEXT: pshufd {{.*#+}} xmm7 = xmm6[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm5, %xmm3 +; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm3[1,1,3,3] +; SSSE3-NEXT: pand %xmm7, %xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm6[1,1,3,3] +; SSSE3-NEXT: por %xmm4, %xmm3 +; SSSE3-NEXT: pand %xmm3, %xmm1 +; SSSE3-NEXT: pandn %xmm8, %xmm3 +; SSSE3-NEXT: por %xmm1, %xmm3 +; SSSE3-NEXT: movdqa %xmm0, %xmm1 +; SSSE3-NEXT: pxor %xmm2, %xmm1 +; SSSE3-NEXT: movdqa %xmm5, %xmm4 +; SSSE3-NEXT: pcmpgtd %xmm1, %xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm4[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm5, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] +; SSSE3-NEXT: pand %xmm6, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm4[1,1,3,3] +; SSSE3-NEXT: por %xmm1, %xmm4 +; SSSE3-NEXT: pand %xmm4, %xmm0 +; SSSE3-NEXT: pandn %xmm8, %xmm4 +; SSSE3-NEXT: por %xmm0, %xmm4 +; SSSE3-NEXT: movdqa %xmm4, %xmm0 +; SSSE3-NEXT: pxor %xmm2, %xmm0 +; SSSE3-NEXT: movdqa %xmm0, %xmm1 +; SSSE3-NEXT: pcmpgtd %xmm2, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm1[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm2, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm0[1,1,3,3] +; 
SSSE3-NEXT: pand %xmm5, %xmm6 +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm1[1,1,3,3] +; SSSE3-NEXT: por %xmm6, %xmm0 +; SSSE3-NEXT: pand %xmm4, %xmm0 +; SSSE3-NEXT: movdqa %xmm3, %xmm1 +; SSSE3-NEXT: pxor %xmm2, %xmm1 +; SSSE3-NEXT: movdqa %xmm1, %xmm4 +; SSSE3-NEXT: pcmpgtd %xmm2, %xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm4[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm2, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] +; SSSE3-NEXT: pand %xmm5, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm4[1,1,3,3] +; SSSE3-NEXT: por %xmm1, %xmm2 +; SSSE3-NEXT: pand %xmm3, %xmm2 +; SSSE3-NEXT: movdqa {{.*#+}} xmm1 = <0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u> +; SSSE3-NEXT: pshufb %xmm1, %xmm2 +; SSSE3-NEXT: pshufb %xmm1, %xmm0 +; SSSE3-NEXT: punpcklwd {{.*#+}} xmm0 = xmm0[0],xmm2[0],xmm0[1],xmm2[1],xmm0[2],xmm2[2],xmm0[3],xmm2[3] +; SSSE3-NEXT: retq +; +; SSE41-LABEL: trunc_packus_v4i64_v4i8: +; SSE41: # %bb.0: +; SSE41-NEXT: movdqa %xmm0, %xmm2 +; SSE41-NEXT: movapd {{.*#+}} xmm4 = [255,255] +; SSE41-NEXT: movdqa {{.*#+}} xmm3 = [2147483648,2147483648] +; SSE41-NEXT: movdqa %xmm1, %xmm0 +; SSE41-NEXT: pxor %xmm3, %xmm0 +; SSE41-NEXT: movdqa {{.*#+}} xmm6 = [2147483903,2147483903] +; SSE41-NEXT: movdqa %xmm6, %xmm5 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm5 +; SSE41-NEXT: movdqa %xmm6, %xmm7 +; SSE41-NEXT: pcmpgtd %xmm0, %xmm7 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm7[0,0,2,2] +; SSE41-NEXT: pand %xmm5, %xmm0 +; SSE41-NEXT: por %xmm7, %xmm0 +; SSE41-NEXT: movapd %xmm4, %xmm5 +; SSE41-NEXT: blendvpd %xmm0, %xmm1, %xmm5 +; SSE41-NEXT: movdqa %xmm2, %xmm0 +; SSE41-NEXT: pxor %xmm3, %xmm0 +; SSE41-NEXT: movdqa %xmm6, %xmm1 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm1 +; SSE41-NEXT: pcmpgtd %xmm0, %xmm6 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm6[0,0,2,2] +; SSE41-NEXT: pand %xmm1, %xmm0 +; SSE41-NEXT: por %xmm6, %xmm0 +; SSE41-NEXT: blendvpd %xmm0, %xmm2, %xmm4 +; SSE41-NEXT: xorpd %xmm2, %xmm2 +; SSE41-NEXT: movapd %xmm4, %xmm1 +; SSE41-NEXT: xorpd %xmm3, %xmm1 +; SSE41-NEXT: movapd %xmm1, %xmm6 +; 
SSE41-NEXT: pcmpeqd %xmm3, %xmm6 +; SSE41-NEXT: pcmpgtd %xmm3, %xmm1 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm1[0,0,2,2] +; SSE41-NEXT: pand %xmm6, %xmm0 +; SSE41-NEXT: por %xmm1, %xmm0 +; SSE41-NEXT: pxor %xmm1, %xmm1 +; SSE41-NEXT: blendvpd %xmm0, %xmm4, %xmm1 +; SSE41-NEXT: movapd %xmm5, %xmm4 +; SSE41-NEXT: xorpd %xmm3, %xmm4 +; SSE41-NEXT: movapd %xmm4, %xmm6 +; SSE41-NEXT: pcmpeqd %xmm3, %xmm6 +; SSE41-NEXT: pcmpgtd %xmm3, %xmm4 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm4[0,0,2,2] +; SSE41-NEXT: pand %xmm6, %xmm0 +; SSE41-NEXT: por %xmm4, %xmm0 +; SSE41-NEXT: blendvpd %xmm0, %xmm5, %xmm2 +; SSE41-NEXT: movdqa {{.*#+}} xmm0 = <0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u> +; SSE41-NEXT: pshufb %xmm0, %xmm2 +; SSE41-NEXT: pshufb %xmm0, %xmm1 +; SSE41-NEXT: punpcklwd {{.*#+}} xmm1 = xmm1[0],xmm2[0],xmm1[1],xmm2[1],xmm1[2],xmm2[2],xmm1[3],xmm2[3] +; SSE41-NEXT: movdqa %xmm1, %xmm0 +; SSE41-NEXT: retq +; +; AVX1-LABEL: trunc_packus_v4i64_v4i8: +; AVX1: # %bb.0: +; AVX1-NEXT: vextractf128 $1, %ymm0, %xmm1 +; AVX1-NEXT: vmovdqa {{.*#+}} xmm2 = [255,255] +; AVX1-NEXT: vpcmpgtq %xmm1, %xmm2, %xmm3 +; AVX1-NEXT: vpcmpgtq %xmm0, %xmm2, %xmm4 +; AVX1-NEXT: vblendvpd %xmm4, %xmm0, %xmm2, %xmm0 +; AVX1-NEXT: vpxor %xmm4, %xmm4, %xmm4 +; AVX1-NEXT: vpcmpgtq %xmm4, %xmm0, %xmm5 +; AVX1-NEXT: vblendvpd %xmm3, %xmm1, %xmm2, %xmm1 +; AVX1-NEXT: vpcmpgtq %xmm4, %xmm1, %xmm2 +; AVX1-NEXT: vpand %xmm1, %xmm2, %xmm1 +; AVX1-NEXT: vmovdqa {{.*#+}} xmm2 = <0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u> +; AVX1-NEXT: vpshufb %xmm2, %xmm1, %xmm1 +; AVX1-NEXT: vpand %xmm0, %xmm5, %xmm0 +; AVX1-NEXT: vpshufb %xmm2, %xmm0, %xmm0 +; AVX1-NEXT: vpunpcklwd {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1],xmm0[2],xmm1[2],xmm0[3],xmm1[3] +; AVX1-NEXT: vzeroupper +; AVX1-NEXT: retq +; +; AVX2-LABEL: trunc_packus_v4i64_v4i8: +; AVX2: # %bb.0: +; AVX2-NEXT: vpbroadcastq {{.*#+}} ymm1 = [255,255,255,255] +; AVX2-NEXT: vpcmpgtq %ymm0, %ymm1, %ymm2 +; AVX2-NEXT: vblendvpd %ymm2, %ymm0, %ymm1, %ymm0 +; AVX2-NEXT: vpxor %xmm1, 
%xmm1, %xmm1 +; AVX2-NEXT: vpcmpgtq %ymm1, %ymm0, %ymm1 +; AVX2-NEXT: vpand %ymm0, %ymm1, %ymm0 +; AVX2-NEXT: vextracti128 $1, %ymm0, %xmm1 +; AVX2-NEXT: vmovdqa {{.*#+}} xmm2 = <0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u> +; AVX2-NEXT: vpshufb %xmm2, %xmm1, %xmm1 +; AVX2-NEXT: vpshufb %xmm2, %xmm0, %xmm0 +; AVX2-NEXT: vpunpcklwd {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1],xmm0[2],xmm1[2],xmm0[3],xmm1[3] +; AVX2-NEXT: vzeroupper +; AVX2-NEXT: retq +; +; AVX512F-LABEL: trunc_packus_v4i64_v4i8: +; AVX512F: # %bb.0: +; AVX512F-NEXT: # kill: def $ymm0 killed $ymm0 def $zmm0 +; AVX512F-NEXT: vpminsq {{.*}}(%rip){1to8}, %zmm0, %zmm0 +; AVX512F-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; AVX512F-NEXT: vpmaxsq %zmm1, %zmm0, %zmm0 +; AVX512F-NEXT: vpmovqb %zmm0, %xmm0 +; AVX512F-NEXT: vzeroupper +; AVX512F-NEXT: retq +; +; AVX512VL-LABEL: trunc_packus_v4i64_v4i8: +; AVX512VL: # %bb.0: +; AVX512VL-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; AVX512VL-NEXT: vpmaxsq %ymm1, %ymm0, %ymm0 +; AVX512VL-NEXT: vpmovusqb %ymm0, %xmm0 +; AVX512VL-NEXT: vzeroupper +; AVX512VL-NEXT: retq +; +; AVX512BW-LABEL: trunc_packus_v4i64_v4i8: +; AVX512BW: # %bb.0: +; AVX512BW-NEXT: # kill: def $ymm0 killed $ymm0 def $zmm0 +; AVX512BW-NEXT: vpminsq {{.*}}(%rip){1to8}, %zmm0, %zmm0 +; AVX512BW-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; AVX512BW-NEXT: vpmaxsq %zmm1, %zmm0, %zmm0 +; AVX512BW-NEXT: vpmovqb %zmm0, %xmm0 +; AVX512BW-NEXT: vzeroupper +; AVX512BW-NEXT: retq +; +; AVX512BWVL-LABEL: trunc_packus_v4i64_v4i8: +; AVX512BWVL: # %bb.0: +; AVX512BWVL-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; AVX512BWVL-NEXT: vpmaxsq %ymm1, %ymm0, %ymm0 +; AVX512BWVL-NEXT: vpmovusqb %ymm0, %xmm0 +; AVX512BWVL-NEXT: vzeroupper +; AVX512BWVL-NEXT: retq + %1 = icmp slt <4 x i64> %a0, + %2 = select <4 x i1> %1, <4 x i64> %a0, <4 x i64> + %3 = icmp sgt <4 x i64> %2, zeroinitializer + %4 = select <4 x i1> %3, <4 x i64> %2, <4 x i64> zeroinitializer + %5 = trunc <4 x i64> %4 to <4 x i8> + ret <4 x i8> %5 +} + +define void @trunc_packus_v4i64_v4i8_store(<4 x 
i64> %a0, <4 x i8> *%p1) { +; SSE2-LABEL: trunc_packus_v4i64_v4i8_store: +; SSE2: # %bb.0: +; SSE2-NEXT: movdqa {{.*#+}} xmm8 = [255,255] +; SSE2-NEXT: movdqa {{.*#+}} xmm3 = [2147483648,2147483648] +; SSE2-NEXT: movdqa %xmm1, %xmm4 +; SSE2-NEXT: pxor %xmm3, %xmm4 +; SSE2-NEXT: movdqa {{.*#+}} xmm5 = [2147483903,2147483903] +; SSE2-NEXT: movdqa %xmm5, %xmm6 +; SSE2-NEXT: pcmpgtd %xmm4, %xmm6 +; SSE2-NEXT: pshufd {{.*#+}} xmm7 = xmm6[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm5, %xmm4 +; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm4[1,1,3,3] +; SSE2-NEXT: pand %xmm7, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm6[1,1,3,3] +; SSE2-NEXT: por %xmm2, %xmm4 +; SSE2-NEXT: pand %xmm4, %xmm1 +; SSE2-NEXT: pandn %xmm8, %xmm4 +; SSE2-NEXT: por %xmm1, %xmm4 +; SSE2-NEXT: movdqa %xmm0, %xmm1 +; SSE2-NEXT: pxor %xmm3, %xmm1 +; SSE2-NEXT: movdqa %xmm5, %xmm2 +; SSE2-NEXT: pcmpgtd %xmm1, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm2[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm5, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm5 = xmm1[1,1,3,3] +; SSE2-NEXT: pand %xmm6, %xmm5 +; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm2[1,1,3,3] +; SSE2-NEXT: por %xmm5, %xmm1 +; SSE2-NEXT: pand %xmm1, %xmm0 +; SSE2-NEXT: pandn %xmm8, %xmm1 +; SSE2-NEXT: por %xmm0, %xmm1 +; SSE2-NEXT: movdqa %xmm1, %xmm0 +; SSE2-NEXT: pxor %xmm3, %xmm0 +; SSE2-NEXT: movdqa %xmm0, %xmm2 +; SSE2-NEXT: pcmpgtd %xmm3, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm5 = xmm2[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm3, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] +; SSE2-NEXT: pand %xmm5, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] +; SSE2-NEXT: por %xmm0, %xmm2 +; SSE2-NEXT: movdqa %xmm4, %xmm0 +; SSE2-NEXT: pxor %xmm3, %xmm0 +; SSE2-NEXT: movdqa %xmm0, %xmm5 +; SSE2-NEXT: pcmpgtd %xmm3, %xmm5 +; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm5[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm3, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] +; SSE2-NEXT: pand %xmm6, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm5[1,1,3,3] +; SSE2-NEXT: por %xmm0, %xmm3 +; 
SSE2-NEXT: pand %xmm8, %xmm3 +; SSE2-NEXT: pand %xmm4, %xmm3 +; SSE2-NEXT: pand %xmm8, %xmm2 +; SSE2-NEXT: pand %xmm1, %xmm2 +; SSE2-NEXT: packuswb %xmm3, %xmm2 +; SSE2-NEXT: packuswb %xmm0, %xmm2 +; SSE2-NEXT: packuswb %xmm0, %xmm2 +; SSE2-NEXT: movd %xmm2, (%rdi) +; SSE2-NEXT: retq +; +; SSSE3-LABEL: trunc_packus_v4i64_v4i8_store: +; SSSE3: # %bb.0: +; SSSE3-NEXT: movdqa {{.*#+}} xmm8 = [255,255] +; SSSE3-NEXT: movdqa {{.*#+}} xmm2 = [2147483648,2147483648] +; SSSE3-NEXT: movdqa %xmm1, %xmm3 +; SSSE3-NEXT: pxor %xmm2, %xmm3 +; SSSE3-NEXT: movdqa {{.*#+}} xmm5 = [2147483903,2147483903] +; SSSE3-NEXT: movdqa %xmm5, %xmm6 +; SSSE3-NEXT: pcmpgtd %xmm3, %xmm6 +; SSSE3-NEXT: pshufd {{.*#+}} xmm7 = xmm6[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm5, %xmm3 +; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm3[1,1,3,3] +; SSSE3-NEXT: pand %xmm7, %xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm6[1,1,3,3] +; SSSE3-NEXT: por %xmm4, %xmm3 +; SSSE3-NEXT: pand %xmm3, %xmm1 +; SSSE3-NEXT: pandn %xmm8, %xmm3 +; SSSE3-NEXT: por %xmm1, %xmm3 +; SSSE3-NEXT: movdqa %xmm0, %xmm1 +; SSSE3-NEXT: pxor %xmm2, %xmm1 +; SSSE3-NEXT: movdqa %xmm5, %xmm4 +; SSSE3-NEXT: pcmpgtd %xmm1, %xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm4[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm5, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] +; SSSE3-NEXT: pand %xmm6, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm4[1,1,3,3] ; SSSE3-NEXT: por %xmm1, %xmm4 -; SSSE3-NEXT: movdqa %xmm6, %xmm5 -; SSSE3-NEXT: pcmpgtd %xmm0, %xmm5 -; SSSE3-NEXT: pand %xmm5, %xmm0 -; SSSE3-NEXT: pandn %xmm6, %xmm5 -; SSSE3-NEXT: por %xmm0, %xmm5 -; SSSE3-NEXT: movdqa %xmm6, %xmm0 -; SSSE3-NEXT: pcmpgtd %xmm3, %xmm0 -; SSSE3-NEXT: pand %xmm0, %xmm3 -; SSSE3-NEXT: pandn %xmm6, %xmm0 -; SSSE3-NEXT: por %xmm3, %xmm0 -; SSSE3-NEXT: movdqa %xmm6, %xmm3 -; SSSE3-NEXT: pcmpgtd %xmm2, %xmm3 -; SSSE3-NEXT: pand %xmm3, %xmm2 -; SSSE3-NEXT: pandn %xmm6, %xmm3 -; SSSE3-NEXT: por %xmm2, %xmm3 -; SSSE3-NEXT: pxor %xmm2, %xmm2 -; SSSE3-NEXT: movdqa %xmm3, %xmm1 +; 
SSSE3-NEXT: pand %xmm4, %xmm0 +; SSSE3-NEXT: pandn %xmm8, %xmm4 +; SSSE3-NEXT: por %xmm0, %xmm4 +; SSSE3-NEXT: movdqa %xmm4, %xmm0 +; SSSE3-NEXT: pxor %xmm2, %xmm0 +; SSSE3-NEXT: movdqa %xmm0, %xmm1 ; SSSE3-NEXT: pcmpgtd %xmm2, %xmm1 -; SSSE3-NEXT: pand %xmm3, %xmm1 -; SSSE3-NEXT: movdqa %xmm0, %xmm3 -; SSSE3-NEXT: pcmpgtd %xmm2, %xmm3 -; SSSE3-NEXT: pand %xmm0, %xmm3 -; SSSE3-NEXT: movdqa %xmm5, %xmm0 -; SSSE3-NEXT: pcmpgtd %xmm2, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm1[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm2, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] ; SSSE3-NEXT: pand %xmm5, %xmm0 -; SSSE3-NEXT: movdqa %xmm4, %xmm5 -; SSSE3-NEXT: pcmpgtd %xmm2, %xmm5 -; SSSE3-NEXT: pand %xmm4, %xmm5 -; SSSE3-NEXT: pslld $16, %xmm5 -; SSSE3-NEXT: psrad $16, %xmm5 -; SSSE3-NEXT: pslld $16, %xmm0 -; SSSE3-NEXT: psrad $16, %xmm0 -; SSSE3-NEXT: packssdw %xmm5, %xmm0 -; SSSE3-NEXT: pslld $16, %xmm3 -; SSSE3-NEXT: psrad $16, %xmm3 -; SSSE3-NEXT: pslld $16, %xmm1 -; SSSE3-NEXT: psrad $16, %xmm1 -; SSSE3-NEXT: packssdw %xmm3, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] +; SSSE3-NEXT: por %xmm0, %xmm1 +; SSSE3-NEXT: pand %xmm4, %xmm1 +; SSSE3-NEXT: movdqa %xmm3, %xmm0 +; SSSE3-NEXT: pxor %xmm2, %xmm0 +; SSSE3-NEXT: movdqa %xmm0, %xmm4 +; SSSE3-NEXT: pcmpgtd %xmm2, %xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm4[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm2, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] +; SSSE3-NEXT: pand %xmm5, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm4[1,1,3,3] +; SSSE3-NEXT: por %xmm0, %xmm2 +; SSSE3-NEXT: pand %xmm3, %xmm2 +; SSSE3-NEXT: movdqa {{.*#+}} xmm0 = <0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u> +; SSSE3-NEXT: pshufb %xmm0, %xmm2 +; SSSE3-NEXT: pshufb %xmm0, %xmm1 +; SSSE3-NEXT: punpcklwd {{.*#+}} xmm1 = xmm1[0],xmm2[0],xmm1[1],xmm2[1],xmm1[2],xmm2[2],xmm1[3],xmm2[3] +; SSSE3-NEXT: movd %xmm1, (%rdi) ; SSSE3-NEXT: retq ; -; SSE41-LABEL: trunc_packus_v16i32_v16i16: +; SSE41-LABEL: trunc_packus_v4i64_v4i8_store: ; SSE41: # 
%bb.0: -; SSE41-NEXT: packusdw %xmm1, %xmm0 -; SSE41-NEXT: packusdw %xmm3, %xmm2 -; SSE41-NEXT: movdqa %xmm2, %xmm1 +; SSE41-NEXT: movdqa %xmm0, %xmm2 +; SSE41-NEXT: movapd {{.*#+}} xmm4 = [255,255] +; SSE41-NEXT: movdqa {{.*#+}} xmm3 = [2147483648,2147483648] +; SSE41-NEXT: movdqa %xmm1, %xmm0 +; SSE41-NEXT: pxor %xmm3, %xmm0 +; SSE41-NEXT: movdqa {{.*#+}} xmm6 = [2147483903,2147483903] +; SSE41-NEXT: movdqa %xmm6, %xmm5 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm5 +; SSE41-NEXT: movdqa %xmm6, %xmm7 +; SSE41-NEXT: pcmpgtd %xmm0, %xmm7 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm7[0,0,2,2] +; SSE41-NEXT: pand %xmm5, %xmm0 +; SSE41-NEXT: por %xmm7, %xmm0 +; SSE41-NEXT: movapd %xmm4, %xmm5 +; SSE41-NEXT: blendvpd %xmm0, %xmm1, %xmm5 +; SSE41-NEXT: movdqa %xmm2, %xmm0 +; SSE41-NEXT: pxor %xmm3, %xmm0 +; SSE41-NEXT: movdqa %xmm6, %xmm1 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm1 +; SSE41-NEXT: pcmpgtd %xmm0, %xmm6 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm6[0,0,2,2] +; SSE41-NEXT: pand %xmm1, %xmm0 +; SSE41-NEXT: por %xmm6, %xmm0 +; SSE41-NEXT: blendvpd %xmm0, %xmm2, %xmm4 +; SSE41-NEXT: pxor %xmm1, %xmm1 +; SSE41-NEXT: movapd %xmm4, %xmm2 +; SSE41-NEXT: xorpd %xmm3, %xmm2 +; SSE41-NEXT: movapd %xmm2, %xmm6 +; SSE41-NEXT: pcmpeqd %xmm3, %xmm6 +; SSE41-NEXT: pcmpgtd %xmm3, %xmm2 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm2[0,0,2,2] +; SSE41-NEXT: pand %xmm6, %xmm0 +; SSE41-NEXT: por %xmm2, %xmm0 +; SSE41-NEXT: pxor %xmm2, %xmm2 +; SSE41-NEXT: blendvpd %xmm0, %xmm4, %xmm2 +; SSE41-NEXT: movapd %xmm5, %xmm4 +; SSE41-NEXT: xorpd %xmm3, %xmm4 +; SSE41-NEXT: movapd %xmm4, %xmm6 +; SSE41-NEXT: pcmpeqd %xmm3, %xmm6 +; SSE41-NEXT: pcmpgtd %xmm3, %xmm4 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm4[0,0,2,2] +; SSE41-NEXT: pand %xmm6, %xmm0 +; SSE41-NEXT: por %xmm4, %xmm0 +; SSE41-NEXT: blendvpd %xmm0, %xmm5, %xmm1 +; SSE41-NEXT: movdqa {{.*#+}} xmm0 = <0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u> +; SSE41-NEXT: pshufb %xmm0, %xmm1 +; SSE41-NEXT: pshufb %xmm0, %xmm2 +; SSE41-NEXT: punpcklwd {{.*#+}} xmm2 = 
xmm2[0],xmm1[0],xmm2[1],xmm1[1],xmm2[2],xmm1[2],xmm2[3],xmm1[3] +; SSE41-NEXT: movd %xmm2, (%rdi) ; SSE41-NEXT: retq ; -; AVX1-LABEL: trunc_packus_v16i32_v16i16: +; AVX1-LABEL: trunc_packus_v4i64_v4i8_store: ; AVX1: # %bb.0: -; AVX1-NEXT: vextractf128 $1, %ymm1, %xmm2 -; AVX1-NEXT: vpackusdw %xmm2, %xmm1, %xmm1 -; AVX1-NEXT: vextractf128 $1, %ymm0, %xmm2 -; AVX1-NEXT: vpackusdw %xmm2, %xmm0, %xmm0 -; AVX1-NEXT: vinsertf128 $1, %xmm1, %ymm0, %ymm0 +; AVX1-NEXT: vextractf128 $1, %ymm0, %xmm1 +; AVX1-NEXT: vmovdqa {{.*#+}} xmm2 = [255,255] +; AVX1-NEXT: vpcmpgtq %xmm1, %xmm2, %xmm3 +; AVX1-NEXT: vpcmpgtq %xmm0, %xmm2, %xmm4 +; AVX1-NEXT: vblendvpd %xmm4, %xmm0, %xmm2, %xmm0 +; AVX1-NEXT: vpxor %xmm4, %xmm4, %xmm4 +; AVX1-NEXT: vpcmpgtq %xmm4, %xmm0, %xmm5 +; AVX1-NEXT: vblendvpd %xmm3, %xmm1, %xmm2, %xmm1 +; AVX1-NEXT: vpcmpgtq %xmm4, %xmm1, %xmm2 +; AVX1-NEXT: vpand %xmm1, %xmm2, %xmm1 +; AVX1-NEXT: vmovdqa {{.*#+}} xmm2 = <0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u> +; AVX1-NEXT: vpshufb %xmm2, %xmm1, %xmm1 +; AVX1-NEXT: vpand %xmm0, %xmm5, %xmm0 +; AVX1-NEXT: vpshufb %xmm2, %xmm0, %xmm0 +; AVX1-NEXT: vpunpcklwd {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1],xmm0[2],xmm1[2],xmm0[3],xmm1[3] +; AVX1-NEXT: vmovd %xmm0, (%rdi) +; AVX1-NEXT: vzeroupper ; AVX1-NEXT: retq ; -; AVX2-LABEL: trunc_packus_v16i32_v16i16: +; AVX2-LABEL: trunc_packus_v4i64_v4i8_store: ; AVX2: # %bb.0: -; AVX2-NEXT: vpackusdw %ymm1, %ymm0, %ymm0 -; AVX2-NEXT: vpermq {{.*#+}} ymm0 = ymm0[0,2,1,3] +; AVX2-NEXT: vpbroadcastq {{.*#+}} ymm1 = [255,255,255,255] +; AVX2-NEXT: vpcmpgtq %ymm0, %ymm1, %ymm2 +; AVX2-NEXT: vblendvpd %ymm2, %ymm0, %ymm1, %ymm0 +; AVX2-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; AVX2-NEXT: vpcmpgtq %ymm1, %ymm0, %ymm1 +; AVX2-NEXT: vpand %ymm0, %ymm1, %ymm0 +; AVX2-NEXT: vextracti128 $1, %ymm0, %xmm1 +; AVX2-NEXT: vmovdqa {{.*#+}} xmm2 = <0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u> +; AVX2-NEXT: vpshufb %xmm2, %xmm1, %xmm1 +; AVX2-NEXT: vpshufb %xmm2, %xmm0, %xmm0 +; AVX2-NEXT: vpunpcklwd {{.*#+}} xmm0 = 
xmm0[0],xmm1[0],xmm0[1],xmm1[1],xmm0[2],xmm1[2],xmm0[3],xmm1[3] +; AVX2-NEXT: vmovd %xmm0, (%rdi) +; AVX2-NEXT: vzeroupper ; AVX2-NEXT: retq ; -; AVX512-LABEL: trunc_packus_v16i32_v16i16: -; AVX512: # %bb.0: -; AVX512-NEXT: vpxor %xmm1, %xmm1, %xmm1 -; AVX512-NEXT: vpmaxsd %zmm1, %zmm0, %zmm0 -; AVX512-NEXT: vpmovusdw %zmm0, %ymm0 -; AVX512-NEXT: retq - %1 = icmp slt <16 x i32> %a0, - %2 = select <16 x i1> %1, <16 x i32> %a0, <16 x i32> - %3 = icmp sgt <16 x i32> %2, zeroinitializer - %4 = select <16 x i1> %3, <16 x i32> %2, <16 x i32> zeroinitializer - %5 = trunc <16 x i32> %4 to <16 x i16> - ret <16 x i16> %5 -} - +; AVX512F-LABEL: trunc_packus_v4i64_v4i8_store: +; AVX512F: # %bb.0: +; AVX512F-NEXT: # kill: def $ymm0 killed $ymm0 def $zmm0 +; AVX512F-NEXT: vpminsq {{.*}}(%rip){1to8}, %zmm0, %zmm0 +; AVX512F-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; AVX512F-NEXT: vpmaxsq %zmm1, %zmm0, %zmm0 +; AVX512F-NEXT: vpmovqb %zmm0, %xmm0 +; AVX512F-NEXT: vmovd %xmm0, (%rdi) +; AVX512F-NEXT: vzeroupper +; AVX512F-NEXT: retq +; +; AVX512VL-LABEL: trunc_packus_v4i64_v4i8_store: +; AVX512VL: # %bb.0: +; AVX512VL-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; AVX512VL-NEXT: vpmaxsq %ymm1, %ymm0, %ymm0 +; AVX512VL-NEXT: vpmovusqb %ymm0, (%rdi) +; AVX512VL-NEXT: vzeroupper +; AVX512VL-NEXT: retq ; -; PACKUS saturation truncation to v16i8 +; AVX512BW-LABEL: trunc_packus_v4i64_v4i8_store: +; AVX512BW: # %bb.0: +; AVX512BW-NEXT: # kill: def $ymm0 killed $ymm0 def $zmm0 +; AVX512BW-NEXT: vpminsq {{.*}}(%rip){1to8}, %zmm0, %zmm0 +; AVX512BW-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; AVX512BW-NEXT: vpmaxsq %zmm1, %zmm0, %zmm0 +; AVX512BW-NEXT: vpmovqb %zmm0, %xmm0 +; AVX512BW-NEXT: vmovd %xmm0, (%rdi) +; AVX512BW-NEXT: vzeroupper +; AVX512BW-NEXT: retq ; +; AVX512BWVL-LABEL: trunc_packus_v4i64_v4i8_store: +; AVX512BWVL: # %bb.0: +; AVX512BWVL-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; AVX512BWVL-NEXT: vpmaxsq %ymm1, %ymm0, %ymm0 +; AVX512BWVL-NEXT: vpmovusqb %ymm0, (%rdi) +; AVX512BWVL-NEXT: vzeroupper +; 
AVX512BWVL-NEXT: retq + %1 = icmp slt <4 x i64> %a0, + %2 = select <4 x i1> %1, <4 x i64> %a0, <4 x i64> + %3 = icmp sgt <4 x i64> %2, zeroinitializer + %4 = select <4 x i1> %3, <4 x i64> %2, <4 x i64> zeroinitializer + %5 = trunc <4 x i64> %4 to <4 x i8> + store <4 x i8> %5, <4 x i8> *%p1 + ret void +} define <8 x i8> @trunc_packus_v8i64_v8i8(<8 x i64> %a0) { ; SSE2-LABEL: trunc_packus_v8i64_v8i8: @@ -2783,6 +4002,210 @@ define <16 x i8> @trunc_packus_v16i64_v1 ret <16 x i8> %5 } +define <4 x i8> @trunc_packus_v4i32_v4i8(<4 x i32> %a0) { +; SSE2-LABEL: trunc_packus_v4i32_v4i8: +; SSE2: # %bb.0: +; SSE2-NEXT: movdqa {{.*#+}} xmm1 = [255,255,255,255] +; SSE2-NEXT: movdqa %xmm1, %xmm2 +; SSE2-NEXT: pcmpgtd %xmm0, %xmm2 +; SSE2-NEXT: pand %xmm2, %xmm0 +; SSE2-NEXT: pandn %xmm1, %xmm2 +; SSE2-NEXT: por %xmm0, %xmm2 +; SSE2-NEXT: pxor %xmm1, %xmm1 +; SSE2-NEXT: movdqa %xmm2, %xmm0 +; SSE2-NEXT: pcmpgtd %xmm1, %xmm0 +; SSE2-NEXT: pand %xmm2, %xmm0 +; SSE2-NEXT: pand {{.*}}(%rip), %xmm0 +; SSE2-NEXT: packuswb %xmm0, %xmm0 +; SSE2-NEXT: packuswb %xmm0, %xmm0 +; SSE2-NEXT: retq +; +; SSSE3-LABEL: trunc_packus_v4i32_v4i8: +; SSSE3: # %bb.0: +; SSSE3-NEXT: movdqa {{.*#+}} xmm1 = [255,255,255,255] +; SSSE3-NEXT: movdqa %xmm1, %xmm2 +; SSSE3-NEXT: pcmpgtd %xmm0, %xmm2 +; SSSE3-NEXT: pand %xmm2, %xmm0 +; SSSE3-NEXT: pandn %xmm1, %xmm2 +; SSSE3-NEXT: por %xmm0, %xmm2 +; SSSE3-NEXT: pxor %xmm1, %xmm1 +; SSSE3-NEXT: movdqa %xmm2, %xmm0 +; SSSE3-NEXT: pcmpgtd %xmm1, %xmm0 +; SSSE3-NEXT: pand %xmm2, %xmm0 +; SSSE3-NEXT: pshufb {{.*#+}} xmm0 = xmm0[0,4,8,12,u,u,u,u,u,u,u,u,u,u,u,u] +; SSSE3-NEXT: retq +; +; SSE41-LABEL: trunc_packus_v4i32_v4i8: +; SSE41: # %bb.0: +; SSE41-NEXT: pminsd {{.*}}(%rip), %xmm0 +; SSE41-NEXT: pxor %xmm1, %xmm1 +; SSE41-NEXT: pmaxsd %xmm1, %xmm0 +; SSE41-NEXT: pshufb {{.*#+}} xmm0 = xmm0[0,4,8,12,u,u,u,u,u,u,u,u,u,u,u,u] +; SSE41-NEXT: retq +; +; AVX1-LABEL: trunc_packus_v4i32_v4i8: +; AVX1: # %bb.0: +; AVX1-NEXT: vpminsd {{.*}}(%rip), %xmm0, %xmm0 +; 
AVX1-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; AVX1-NEXT: vpmaxsd %xmm1, %xmm0, %xmm0 +; AVX1-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,4,8,12,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX1-NEXT: retq +; +; AVX2-LABEL: trunc_packus_v4i32_v4i8: +; AVX2: # %bb.0: +; AVX2-NEXT: vpbroadcastd {{.*#+}} xmm1 = [255,255,255,255] +; AVX2-NEXT: vpminsd %xmm1, %xmm0, %xmm0 +; AVX2-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; AVX2-NEXT: vpmaxsd %xmm1, %xmm0, %xmm0 +; AVX2-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,4,8,12,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX2-NEXT: retq +; +; AVX512F-LABEL: trunc_packus_v4i32_v4i8: +; AVX512F: # %bb.0: +; AVX512F-NEXT: vpbroadcastd {{.*#+}} xmm1 = [255,255,255,255] +; AVX512F-NEXT: vpminsd %xmm1, %xmm0, %xmm0 +; AVX512F-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; AVX512F-NEXT: vpmaxsd %xmm1, %xmm0, %xmm0 +; AVX512F-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,4,8,12,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX512F-NEXT: retq +; +; AVX512VL-LABEL: trunc_packus_v4i32_v4i8: +; AVX512VL: # %bb.0: +; AVX512VL-NEXT: vpminsd {{.*}}(%rip){1to4}, %xmm0, %xmm0 +; AVX512VL-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; AVX512VL-NEXT: vpmaxsd %xmm1, %xmm0, %xmm0 +; AVX512VL-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,4,8,12,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX512VL-NEXT: retq +; +; AVX512BW-LABEL: trunc_packus_v4i32_v4i8: +; AVX512BW: # %bb.0: +; AVX512BW-NEXT: vpbroadcastd {{.*#+}} xmm1 = [255,255,255,255] +; AVX512BW-NEXT: vpminsd %xmm1, %xmm0, %xmm0 +; AVX512BW-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; AVX512BW-NEXT: vpmaxsd %xmm1, %xmm0, %xmm0 +; AVX512BW-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,4,8,12,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX512BW-NEXT: retq +; +; AVX512BWVL-LABEL: trunc_packus_v4i32_v4i8: +; AVX512BWVL: # %bb.0: +; AVX512BWVL-NEXT: vpminsd {{.*}}(%rip){1to4}, %xmm0, %xmm0 +; AVX512BWVL-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; AVX512BWVL-NEXT: vpmaxsd %xmm1, %xmm0, %xmm0 +; AVX512BWVL-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,4,8,12,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX512BWVL-NEXT: retq + %1 = icmp slt <4 x i32> %a0, + %2 = select <4 x i1> %1, <4 x i32> %a0, <4 x i32> + 
%3 = icmp sgt <4 x i32> %2, zeroinitializer + %4 = select <4 x i1> %3, <4 x i32> %2, <4 x i32> zeroinitializer + %5 = trunc <4 x i32> %4 to <4 x i8> + ret <4 x i8> %5 +} + +define void @trunc_packus_v4i32_v4i8_store(<4 x i32> %a0, <4 x i8> *%p1) { +; SSE2-LABEL: trunc_packus_v4i32_v4i8_store: +; SSE2: # %bb.0: +; SSE2-NEXT: movdqa {{.*#+}} xmm1 = [255,255,255,255] +; SSE2-NEXT: movdqa %xmm1, %xmm2 +; SSE2-NEXT: pcmpgtd %xmm0, %xmm2 +; SSE2-NEXT: pand %xmm2, %xmm0 +; SSE2-NEXT: pandn %xmm1, %xmm2 +; SSE2-NEXT: por %xmm0, %xmm2 +; SSE2-NEXT: pxor %xmm0, %xmm0 +; SSE2-NEXT: movdqa %xmm2, %xmm1 +; SSE2-NEXT: pcmpgtd %xmm0, %xmm1 +; SSE2-NEXT: pand %xmm2, %xmm1 +; SSE2-NEXT: pand {{.*}}(%rip), %xmm1 +; SSE2-NEXT: packuswb %xmm0, %xmm1 +; SSE2-NEXT: packuswb %xmm0, %xmm1 +; SSE2-NEXT: movd %xmm1, (%rdi) +; SSE2-NEXT: retq +; +; SSSE3-LABEL: trunc_packus_v4i32_v4i8_store: +; SSSE3: # %bb.0: +; SSSE3-NEXT: movdqa {{.*#+}} xmm1 = [255,255,255,255] +; SSSE3-NEXT: movdqa %xmm1, %xmm2 +; SSSE3-NEXT: pcmpgtd %xmm0, %xmm2 +; SSSE3-NEXT: pand %xmm2, %xmm0 +; SSSE3-NEXT: pandn %xmm1, %xmm2 +; SSSE3-NEXT: por %xmm0, %xmm2 +; SSSE3-NEXT: pxor %xmm0, %xmm0 +; SSSE3-NEXT: movdqa %xmm2, %xmm1 +; SSSE3-NEXT: pcmpgtd %xmm0, %xmm1 +; SSSE3-NEXT: pand %xmm2, %xmm1 +; SSSE3-NEXT: pshufb {{.*#+}} xmm1 = xmm1[0,4,8,12,u,u,u,u,u,u,u,u,u,u,u,u] +; SSSE3-NEXT: movd %xmm1, (%rdi) +; SSSE3-NEXT: retq +; +; SSE41-LABEL: trunc_packus_v4i32_v4i8_store: +; SSE41: # %bb.0: +; SSE41-NEXT: pminsd {{.*}}(%rip), %xmm0 +; SSE41-NEXT: pxor %xmm1, %xmm1 +; SSE41-NEXT: pmaxsd %xmm0, %xmm1 +; SSE41-NEXT: pshufb {{.*#+}} xmm1 = xmm1[0,4,8,12,u,u,u,u,u,u,u,u,u,u,u,u] +; SSE41-NEXT: movd %xmm1, (%rdi) +; SSE41-NEXT: retq +; +; AVX1-LABEL: trunc_packus_v4i32_v4i8_store: +; AVX1: # %bb.0: +; AVX1-NEXT: vpminsd {{.*}}(%rip), %xmm0, %xmm0 +; AVX1-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; AVX1-NEXT: vpmaxsd %xmm1, %xmm0, %xmm0 +; AVX1-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,4,8,12,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX1-NEXT: vmovd 
%xmm0, (%rdi) +; AVX1-NEXT: retq +; +; AVX2-LABEL: trunc_packus_v4i32_v4i8_store: +; AVX2: # %bb.0: +; AVX2-NEXT: vpbroadcastd {{.*#+}} xmm1 = [255,255,255,255] +; AVX2-NEXT: vpminsd %xmm1, %xmm0, %xmm0 +; AVX2-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; AVX2-NEXT: vpmaxsd %xmm1, %xmm0, %xmm0 +; AVX2-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,4,8,12,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX2-NEXT: vmovd %xmm0, (%rdi) +; AVX2-NEXT: retq +; +; AVX512F-LABEL: trunc_packus_v4i32_v4i8_store: +; AVX512F: # %bb.0: +; AVX512F-NEXT: vpbroadcastd {{.*#+}} xmm1 = [255,255,255,255] +; AVX512F-NEXT: vpminsd %xmm1, %xmm0, %xmm0 +; AVX512F-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; AVX512F-NEXT: vpmaxsd %xmm1, %xmm0, %xmm0 +; AVX512F-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,4,8,12,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX512F-NEXT: vmovd %xmm0, (%rdi) +; AVX512F-NEXT: retq +; +; AVX512VL-LABEL: trunc_packus_v4i32_v4i8_store: +; AVX512VL: # %bb.0: +; AVX512VL-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; AVX512VL-NEXT: vpmaxsd %xmm1, %xmm0, %xmm0 +; AVX512VL-NEXT: vpmovusdb %xmm0, (%rdi) +; AVX512VL-NEXT: retq +; +; AVX512BW-LABEL: trunc_packus_v4i32_v4i8_store: +; AVX512BW: # %bb.0: +; AVX512BW-NEXT: vpbroadcastd {{.*#+}} xmm1 = [255,255,255,255] +; AVX512BW-NEXT: vpminsd %xmm1, %xmm0, %xmm0 +; AVX512BW-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; AVX512BW-NEXT: vpmaxsd %xmm1, %xmm0, %xmm0 +; AVX512BW-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,4,8,12,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX512BW-NEXT: vmovd %xmm0, (%rdi) +; AVX512BW-NEXT: retq +; +; AVX512BWVL-LABEL: trunc_packus_v4i32_v4i8_store: +; AVX512BWVL: # %bb.0: +; AVX512BWVL-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; AVX512BWVL-NEXT: vpmaxsd %xmm1, %xmm0, %xmm0 +; AVX512BWVL-NEXT: vpmovusdb %xmm0, (%rdi) +; AVX512BWVL-NEXT: retq + %1 = icmp slt <4 x i32> %a0, + %2 = select <4 x i1> %1, <4 x i32> %a0, <4 x i32> + %3 = icmp sgt <4 x i32> %2, zeroinitializer + %4 = select <4 x i1> %3, <4 x i32> %2, <4 x i32> zeroinitializer + %5 = trunc <4 x i32> %4 to <4 x i8> + store <4 x i8> %5, <4 x i8> *%p1 + ret void +} + 
define <8 x i8> @trunc_packus_v8i32_v8i8(<8 x i32> %a0) { ; SSE-LABEL: trunc_packus_v8i32_v8i8: ; SSE: # %bb.0: @@ -2955,6 +4378,93 @@ define <16 x i8> @trunc_packus_v16i32_v1 ret <16 x i8> %5 } +define <8 x i8> @trunc_packus_v8i16_v8i8(<8 x i16> %a0) { +; SSE-LABEL: trunc_packus_v8i16_v8i8: +; SSE: # %bb.0: +; SSE-NEXT: packuswb %xmm0, %xmm0 +; SSE-NEXT: retq +; +; AVX-LABEL: trunc_packus_v8i16_v8i8: +; AVX: # %bb.0: +; AVX-NEXT: vpackuswb %xmm0, %xmm0, %xmm0 +; AVX-NEXT: retq +; +; AVX512F-LABEL: trunc_packus_v8i16_v8i8: +; AVX512F: # %bb.0: +; AVX512F-NEXT: vpackuswb %xmm0, %xmm0, %xmm0 +; AVX512F-NEXT: retq +; +; AVX512VL-LABEL: trunc_packus_v8i16_v8i8: +; AVX512VL: # %bb.0: +; AVX512VL-NEXT: vpackuswb %xmm0, %xmm0, %xmm0 +; AVX512VL-NEXT: retq +; +; AVX512BW-LABEL: trunc_packus_v8i16_v8i8: +; AVX512BW: # %bb.0: +; AVX512BW-NEXT: vpackuswb %xmm0, %xmm0, %xmm0 +; AVX512BW-NEXT: retq +; +; AVX512BWVL-LABEL: trunc_packus_v8i16_v8i8: +; AVX512BWVL: # %bb.0: +; AVX512BWVL-NEXT: vpminsw {{.*}}(%rip), %xmm0, %xmm0 +; AVX512BWVL-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; AVX512BWVL-NEXT: vpmaxsw %xmm1, %xmm0, %xmm0 +; AVX512BWVL-NEXT: vpackuswb %xmm0, %xmm0, %xmm0 +; AVX512BWVL-NEXT: retq + %1 = icmp slt <8 x i16> %a0, + %2 = select <8 x i1> %1, <8 x i16> %a0, <8 x i16> + %3 = icmp sgt <8 x i16> %2, zeroinitializer + %4 = select <8 x i1> %3, <8 x i16> %2, <8 x i16> zeroinitializer + %5 = trunc <8 x i16> %4 to <8 x i8> + ret <8 x i8> %5 +} + +define void @trunc_packus_v8i16_v8i8_store(<8 x i16> %a0, <8 x i8> *%p1) { +; SSE-LABEL: trunc_packus_v8i16_v8i8_store: +; SSE: # %bb.0: +; SSE-NEXT: packuswb %xmm0, %xmm0 +; SSE-NEXT: movq %xmm0, (%rdi) +; SSE-NEXT: retq +; +; AVX-LABEL: trunc_packus_v8i16_v8i8_store: +; AVX: # %bb.0: +; AVX-NEXT: vpackuswb %xmm0, %xmm0, %xmm0 +; AVX-NEXT: vmovq %xmm0, (%rdi) +; AVX-NEXT: retq +; +; AVX512F-LABEL: trunc_packus_v8i16_v8i8_store: +; AVX512F: # %bb.0: +; AVX512F-NEXT: vpackuswb %xmm0, %xmm0, %xmm0 +; AVX512F-NEXT: vmovq %xmm0, (%rdi) +; 
AVX512F-NEXT: retq +; +; AVX512VL-LABEL: trunc_packus_v8i16_v8i8_store: +; AVX512VL: # %bb.0: +; AVX512VL-NEXT: vpackuswb %xmm0, %xmm0, %xmm0 +; AVX512VL-NEXT: vmovq %xmm0, (%rdi) +; AVX512VL-NEXT: retq +; +; AVX512BW-LABEL: trunc_packus_v8i16_v8i8_store: +; AVX512BW: # %bb.0: +; AVX512BW-NEXT: vpackuswb %xmm0, %xmm0, %xmm0 +; AVX512BW-NEXT: vmovq %xmm0, (%rdi) +; AVX512BW-NEXT: retq +; +; AVX512BWVL-LABEL: trunc_packus_v8i16_v8i8_store: +; AVX512BWVL: # %bb.0: +; AVX512BWVL-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; AVX512BWVL-NEXT: vpmaxsw %xmm1, %xmm0, %xmm0 +; AVX512BWVL-NEXT: vpmovuswb %xmm0, (%rdi) +; AVX512BWVL-NEXT: retq + %1 = icmp slt <8 x i16> %a0, + %2 = select <8 x i1> %1, <8 x i16> %a0, <8 x i16> + %3 = icmp sgt <8 x i16> %2, zeroinitializer + %4 = select <8 x i1> %3, <8 x i16> %2, <8 x i16> zeroinitializer + %5 = trunc <8 x i16> %4 to <8 x i8> + store <8 x i8> %5, <8 x i8> *%p1 + ret void +} + define <16 x i8> @trunc_packus_v16i16_v16i8(<16 x i16> %a0) { ; SSE-LABEL: trunc_packus_v16i16_v16i8: ; SSE: # %bb.0: Modified: llvm/trunk/test/CodeGen/X86/vector-trunc-ssat.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/vector-trunc-ssat.ll?rev=374505&r1=374504&r2=374505&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/vector-trunc-ssat.ll (original) +++ llvm/trunk/test/CodeGen/X86/vector-trunc-ssat.ll Thu Oct 10 20:46:39 2019 @@ -671,6 +671,565 @@ define <8 x i32> @trunc_ssat_v8i64_v8i32 ; Signed saturation truncation to vXi16 ; +define <4 x i16> @trunc_ssat_v4i64_v4i16(<4 x i64> %a0) { +; SSE2-LABEL: trunc_ssat_v4i64_v4i16: +; SSE2: # %bb.0: +; SSE2-NEXT: movdqa {{.*#+}} xmm8 = [32767,32767] +; SSE2-NEXT: movdqa {{.*#+}} xmm2 = [2147483648,2147483648] +; SSE2-NEXT: movdqa %xmm1, %xmm3 +; SSE2-NEXT: pxor %xmm2, %xmm3 +; SSE2-NEXT: movdqa {{.*#+}} xmm5 = [2147516415,2147516415] +; SSE2-NEXT: movdqa %xmm5, %xmm6 +; SSE2-NEXT: pcmpgtd %xmm3, %xmm6 +; SSE2-NEXT: pshufd 
{{.*#+}} xmm7 = xmm6[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm5, %xmm3 +; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm3[1,1,3,3] +; SSE2-NEXT: pand %xmm7, %xmm4 +; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm6[1,1,3,3] +; SSE2-NEXT: por %xmm4, %xmm3 +; SSE2-NEXT: pand %xmm3, %xmm1 +; SSE2-NEXT: pandn %xmm8, %xmm3 +; SSE2-NEXT: por %xmm1, %xmm3 +; SSE2-NEXT: movdqa %xmm0, %xmm1 +; SSE2-NEXT: pxor %xmm2, %xmm1 +; SSE2-NEXT: movdqa %xmm5, %xmm4 +; SSE2-NEXT: pcmpgtd %xmm1, %xmm4 +; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm4[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm5, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] +; SSE2-NEXT: pand %xmm6, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm4[1,1,3,3] +; SSE2-NEXT: por %xmm1, %xmm4 +; SSE2-NEXT: pand %xmm4, %xmm0 +; SSE2-NEXT: pandn %xmm8, %xmm4 +; SSE2-NEXT: por %xmm0, %xmm4 +; SSE2-NEXT: movdqa {{.*#+}} xmm0 = [18446744073709518848,18446744073709518848] +; SSE2-NEXT: movdqa %xmm4, %xmm1 +; SSE2-NEXT: pxor %xmm2, %xmm1 +; SSE2-NEXT: movdqa {{.*#+}} xmm5 = [18446744071562035200,18446744071562035200] +; SSE2-NEXT: movdqa %xmm1, %xmm6 +; SSE2-NEXT: pcmpgtd %xmm5, %xmm6 +; SSE2-NEXT: pshufd {{.*#+}} xmm7 = xmm6[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm5, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] +; SSE2-NEXT: pand %xmm7, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm6[1,1,3,3] +; SSE2-NEXT: por %xmm1, %xmm6 +; SSE2-NEXT: pand %xmm6, %xmm4 +; SSE2-NEXT: pandn %xmm0, %xmm6 +; SSE2-NEXT: por %xmm4, %xmm6 +; SSE2-NEXT: pxor %xmm3, %xmm2 +; SSE2-NEXT: movdqa %xmm2, %xmm1 +; SSE2-NEXT: pcmpgtd %xmm5, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm1[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm5, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] +; SSE2-NEXT: pand %xmm4, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] +; SSE2-NEXT: por %xmm2, %xmm1 +; SSE2-NEXT: pand %xmm1, %xmm3 +; SSE2-NEXT: pandn %xmm0, %xmm1 +; SSE2-NEXT: por %xmm3, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm1[0,2,2,3] +; SSE2-NEXT: pshuflw {{.*#+}} xmm1 = 
xmm0[0,2,2,3,4,5,6,7] +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm6[0,2,2,3] +; SSE2-NEXT: pshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7] +; SSE2-NEXT: punpckldq {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1] +; SSE2-NEXT: retq +; +; SSSE3-LABEL: trunc_ssat_v4i64_v4i16: +; SSSE3: # %bb.0: +; SSSE3-NEXT: movdqa {{.*#+}} xmm8 = [32767,32767] +; SSSE3-NEXT: movdqa {{.*#+}} xmm2 = [2147483648,2147483648] +; SSSE3-NEXT: movdqa %xmm1, %xmm3 +; SSSE3-NEXT: pxor %xmm2, %xmm3 +; SSSE3-NEXT: movdqa {{.*#+}} xmm5 = [2147516415,2147516415] +; SSSE3-NEXT: movdqa %xmm5, %xmm6 +; SSSE3-NEXT: pcmpgtd %xmm3, %xmm6 +; SSSE3-NEXT: pshufd {{.*#+}} xmm7 = xmm6[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm5, %xmm3 +; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm3[1,1,3,3] +; SSSE3-NEXT: pand %xmm7, %xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm6[1,1,3,3] +; SSSE3-NEXT: por %xmm4, %xmm3 +; SSSE3-NEXT: pand %xmm3, %xmm1 +; SSSE3-NEXT: pandn %xmm8, %xmm3 +; SSSE3-NEXT: por %xmm1, %xmm3 +; SSSE3-NEXT: movdqa %xmm0, %xmm1 +; SSSE3-NEXT: pxor %xmm2, %xmm1 +; SSSE3-NEXT: movdqa %xmm5, %xmm4 +; SSSE3-NEXT: pcmpgtd %xmm1, %xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm4[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm5, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] +; SSSE3-NEXT: pand %xmm6, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm4[1,1,3,3] +; SSSE3-NEXT: por %xmm1, %xmm4 +; SSSE3-NEXT: pand %xmm4, %xmm0 +; SSSE3-NEXT: pandn %xmm8, %xmm4 +; SSSE3-NEXT: por %xmm0, %xmm4 +; SSSE3-NEXT: movdqa {{.*#+}} xmm0 = [18446744073709518848,18446744073709518848] +; SSSE3-NEXT: movdqa %xmm4, %xmm1 +; SSSE3-NEXT: pxor %xmm2, %xmm1 +; SSSE3-NEXT: movdqa {{.*#+}} xmm5 = [18446744071562035200,18446744071562035200] +; SSSE3-NEXT: movdqa %xmm1, %xmm6 +; SSSE3-NEXT: pcmpgtd %xmm5, %xmm6 +; SSSE3-NEXT: pshufd {{.*#+}} xmm7 = xmm6[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm5, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] +; SSSE3-NEXT: pand %xmm7, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm6[1,1,3,3] +; SSSE3-NEXT: por 
%xmm1, %xmm6 +; SSSE3-NEXT: pand %xmm6, %xmm4 +; SSSE3-NEXT: pandn %xmm0, %xmm6 +; SSSE3-NEXT: por %xmm4, %xmm6 +; SSSE3-NEXT: pxor %xmm3, %xmm2 +; SSSE3-NEXT: movdqa %xmm2, %xmm1 +; SSSE3-NEXT: pcmpgtd %xmm5, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm1[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm5, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] +; SSSE3-NEXT: pand %xmm4, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] +; SSSE3-NEXT: por %xmm2, %xmm1 +; SSSE3-NEXT: pand %xmm1, %xmm3 +; SSSE3-NEXT: pandn %xmm0, %xmm1 +; SSSE3-NEXT: por %xmm3, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm1[0,2,2,3] +; SSSE3-NEXT: pshuflw {{.*#+}} xmm1 = xmm0[0,2,2,3,4,5,6,7] +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm6[0,2,2,3] +; SSSE3-NEXT: pshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7] +; SSSE3-NEXT: punpckldq {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1] +; SSSE3-NEXT: retq +; +; SSE41-LABEL: trunc_ssat_v4i64_v4i16: +; SSE41: # %bb.0: +; SSE41-NEXT: movdqa %xmm0, %xmm2 +; SSE41-NEXT: movapd {{.*#+}} xmm4 = [32767,32767] +; SSE41-NEXT: movdqa {{.*#+}} xmm3 = [2147483648,2147483648] +; SSE41-NEXT: movdqa %xmm1, %xmm0 +; SSE41-NEXT: pxor %xmm3, %xmm0 +; SSE41-NEXT: movdqa {{.*#+}} xmm6 = [2147516415,2147516415] +; SSE41-NEXT: movdqa %xmm6, %xmm5 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm5 +; SSE41-NEXT: movdqa %xmm6, %xmm7 +; SSE41-NEXT: pcmpgtd %xmm0, %xmm7 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm7[0,0,2,2] +; SSE41-NEXT: pand %xmm5, %xmm0 +; SSE41-NEXT: por %xmm7, %xmm0 +; SSE41-NEXT: movapd %xmm4, %xmm5 +; SSE41-NEXT: blendvpd %xmm0, %xmm1, %xmm5 +; SSE41-NEXT: movdqa %xmm2, %xmm0 +; SSE41-NEXT: pxor %xmm3, %xmm0 +; SSE41-NEXT: movdqa %xmm6, %xmm1 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm1 +; SSE41-NEXT: pcmpgtd %xmm0, %xmm6 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm6[0,0,2,2] +; SSE41-NEXT: pand %xmm1, %xmm0 +; SSE41-NEXT: por %xmm6, %xmm0 +; SSE41-NEXT: blendvpd %xmm0, %xmm2, %xmm4 +; SSE41-NEXT: movapd {{.*#+}} xmm1 = [18446744073709518848,18446744073709518848] +; 
SSE41-NEXT: movapd %xmm4, %xmm2 +; SSE41-NEXT: xorpd %xmm3, %xmm2 +; SSE41-NEXT: movdqa {{.*#+}} xmm6 = [18446744071562035200,18446744071562035200] +; SSE41-NEXT: movapd %xmm2, %xmm7 +; SSE41-NEXT: pcmpeqd %xmm6, %xmm7 +; SSE41-NEXT: pcmpgtd %xmm6, %xmm2 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm2[0,0,2,2] +; SSE41-NEXT: pand %xmm7, %xmm0 +; SSE41-NEXT: por %xmm2, %xmm0 +; SSE41-NEXT: movapd %xmm1, %xmm2 +; SSE41-NEXT: blendvpd %xmm0, %xmm4, %xmm2 +; SSE41-NEXT: xorpd %xmm5, %xmm3 +; SSE41-NEXT: movapd %xmm3, %xmm4 +; SSE41-NEXT: pcmpeqd %xmm6, %xmm4 +; SSE41-NEXT: pcmpgtd %xmm6, %xmm3 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm3[0,0,2,2] +; SSE41-NEXT: pand %xmm4, %xmm0 +; SSE41-NEXT: por %xmm3, %xmm0 +; SSE41-NEXT: blendvpd %xmm0, %xmm5, %xmm1 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm1[0,2,2,3] +; SSE41-NEXT: pshuflw {{.*#+}} xmm1 = xmm0[0,2,2,3,4,5,6,7] +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm2[0,2,2,3] +; SSE41-NEXT: pshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7] +; SSE41-NEXT: punpckldq {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1] +; SSE41-NEXT: retq +; +; AVX1-LABEL: trunc_ssat_v4i64_v4i16: +; AVX1: # %bb.0: +; AVX1-NEXT: vextractf128 $1, %ymm0, %xmm1 +; AVX1-NEXT: vmovdqa {{.*#+}} xmm2 = [32767,32767] +; AVX1-NEXT: vpcmpgtq %xmm1, %xmm2, %xmm3 +; AVX1-NEXT: vpcmpgtq %xmm0, %xmm2, %xmm4 +; AVX1-NEXT: vblendvpd %xmm4, %xmm0, %xmm2, %xmm0 +; AVX1-NEXT: vmovdqa {{.*#+}} xmm4 = [18446744073709518848,18446744073709518848] +; AVX1-NEXT: vpcmpgtq %xmm4, %xmm0, %xmm5 +; AVX1-NEXT: vblendvpd %xmm3, %xmm1, %xmm2, %xmm1 +; AVX1-NEXT: vpcmpgtq %xmm4, %xmm1, %xmm2 +; AVX1-NEXT: vblendvpd %xmm2, %xmm1, %xmm4, %xmm1 +; AVX1-NEXT: vpermilps {{.*#+}} xmm1 = xmm1[0,2,2,3] +; AVX1-NEXT: vpshuflw {{.*#+}} xmm1 = xmm1[0,2,2,3,4,5,6,7] +; AVX1-NEXT: vblendvpd %xmm5, %xmm0, %xmm4, %xmm0 +; AVX1-NEXT: vpermilps {{.*#+}} xmm0 = xmm0[0,2,2,3] +; AVX1-NEXT: vpshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7] +; AVX1-NEXT: vpunpckldq {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1] +; 
AVX1-NEXT: vzeroupper +; AVX1-NEXT: retq +; +; AVX2-SLOW-LABEL: trunc_ssat_v4i64_v4i16: +; AVX2-SLOW: # %bb.0: +; AVX2-SLOW-NEXT: vpbroadcastq {{.*#+}} ymm1 = [32767,32767,32767,32767] +; AVX2-SLOW-NEXT: vpcmpgtq %ymm0, %ymm1, %ymm2 +; AVX2-SLOW-NEXT: vblendvpd %ymm2, %ymm0, %ymm1, %ymm0 +; AVX2-SLOW-NEXT: vpbroadcastq {{.*#+}} ymm1 = [18446744073709518848,18446744073709518848,18446744073709518848,18446744073709518848] +; AVX2-SLOW-NEXT: vpcmpgtq %ymm1, %ymm0, %ymm2 +; AVX2-SLOW-NEXT: vblendvpd %ymm2, %ymm0, %ymm1, %ymm0 +; AVX2-SLOW-NEXT: vextractf128 $1, %ymm0, %xmm1 +; AVX2-SLOW-NEXT: vpermilps {{.*#+}} xmm1 = xmm1[0,2,2,3] +; AVX2-SLOW-NEXT: vpshuflw {{.*#+}} xmm1 = xmm1[0,2,2,3,4,5,6,7] +; AVX2-SLOW-NEXT: vpermilps {{.*#+}} xmm0 = xmm0[0,2,2,3] +; AVX2-SLOW-NEXT: vpshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7] +; AVX2-SLOW-NEXT: vpunpckldq {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1] +; AVX2-SLOW-NEXT: vzeroupper +; AVX2-SLOW-NEXT: retq +; +; AVX2-FAST-LABEL: trunc_ssat_v4i64_v4i16: +; AVX2-FAST: # %bb.0: +; AVX2-FAST-NEXT: vpbroadcastq {{.*#+}} ymm1 = [32767,32767,32767,32767] +; AVX2-FAST-NEXT: vpcmpgtq %ymm0, %ymm1, %ymm2 +; AVX2-FAST-NEXT: vblendvpd %ymm2, %ymm0, %ymm1, %ymm0 +; AVX2-FAST-NEXT: vpbroadcastq {{.*#+}} ymm1 = [18446744073709518848,18446744073709518848,18446744073709518848,18446744073709518848] +; AVX2-FAST-NEXT: vpcmpgtq %ymm1, %ymm0, %ymm2 +; AVX2-FAST-NEXT: vblendvpd %ymm2, %ymm0, %ymm1, %ymm0 +; AVX2-FAST-NEXT: vextractf128 $1, %ymm0, %xmm1 +; AVX2-FAST-NEXT: vmovdqa {{.*#+}} xmm2 = [0,1,8,9,8,9,10,11,8,9,10,11,12,13,14,15] +; AVX2-FAST-NEXT: vpshufb %xmm2, %xmm1, %xmm1 +; AVX2-FAST-NEXT: vpshufb %xmm2, %xmm0, %xmm0 +; AVX2-FAST-NEXT: vpunpckldq {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1] +; AVX2-FAST-NEXT: vzeroupper +; AVX2-FAST-NEXT: retq +; +; AVX512F-LABEL: trunc_ssat_v4i64_v4i16: +; AVX512F: # %bb.0: +; AVX512F-NEXT: # kill: def $ymm0 killed $ymm0 def $zmm0 +; AVX512F-NEXT: vpminsq {{.*}}(%rip){1to8}, %zmm0, %zmm0 +; 
AVX512F-NEXT: vpmaxsq {{.*}}(%rip){1to8}, %zmm0, %zmm0 +; AVX512F-NEXT: vpmovqw %zmm0, %xmm0 +; AVX512F-NEXT: vzeroupper +; AVX512F-NEXT: retq +; +; AVX512VL-LABEL: trunc_ssat_v4i64_v4i16: +; AVX512VL: # %bb.0: +; AVX512VL-NEXT: vpmovsqw %ymm0, %xmm0 +; AVX512VL-NEXT: vzeroupper +; AVX512VL-NEXT: retq +; +; AVX512BW-LABEL: trunc_ssat_v4i64_v4i16: +; AVX512BW: # %bb.0: +; AVX512BW-NEXT: # kill: def $ymm0 killed $ymm0 def $zmm0 +; AVX512BW-NEXT: vpminsq {{.*}}(%rip){1to8}, %zmm0, %zmm0 +; AVX512BW-NEXT: vpmaxsq {{.*}}(%rip){1to8}, %zmm0, %zmm0 +; AVX512BW-NEXT: vpmovqw %zmm0, %xmm0 +; AVX512BW-NEXT: vzeroupper +; AVX512BW-NEXT: retq +; +; AVX512BWVL-LABEL: trunc_ssat_v4i64_v4i16: +; AVX512BWVL: # %bb.0: +; AVX512BWVL-NEXT: vpmovsqw %ymm0, %xmm0 +; AVX512BWVL-NEXT: vzeroupper +; AVX512BWVL-NEXT: retq + %1 = icmp slt <4 x i64> %a0, + %2 = select <4 x i1> %1, <4 x i64> %a0, <4 x i64> + %3 = icmp sgt <4 x i64> %2, + %4 = select <4 x i1> %3, <4 x i64> %2, <4 x i64> + %5 = trunc <4 x i64> %4 to <4 x i16> + ret <4 x i16> %5 +} + +define void @trunc_ssat_v4i64_v4i16_store(<4 x i64> %a0, <4 x i16> *%p1) { +; SSE2-LABEL: trunc_ssat_v4i64_v4i16_store: +; SSE2: # %bb.0: +; SSE2-NEXT: movdqa {{.*#+}} xmm8 = [32767,32767] +; SSE2-NEXT: movdqa {{.*#+}} xmm2 = [2147483648,2147483648] +; SSE2-NEXT: movdqa %xmm1, %xmm3 +; SSE2-NEXT: pxor %xmm2, %xmm3 +; SSE2-NEXT: movdqa {{.*#+}} xmm5 = [2147516415,2147516415] +; SSE2-NEXT: movdqa %xmm5, %xmm6 +; SSE2-NEXT: pcmpgtd %xmm3, %xmm6 +; SSE2-NEXT: pshufd {{.*#+}} xmm7 = xmm6[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm5, %xmm3 +; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm3[1,1,3,3] +; SSE2-NEXT: pand %xmm7, %xmm4 +; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm6[1,1,3,3] +; SSE2-NEXT: por %xmm4, %xmm3 +; SSE2-NEXT: pand %xmm3, %xmm1 +; SSE2-NEXT: pandn %xmm8, %xmm3 +; SSE2-NEXT: por %xmm1, %xmm3 +; SSE2-NEXT: movdqa %xmm0, %xmm1 +; SSE2-NEXT: pxor %xmm2, %xmm1 +; SSE2-NEXT: movdqa %xmm5, %xmm4 +; SSE2-NEXT: pcmpgtd %xmm1, %xmm4 +; SSE2-NEXT: pshufd {{.*#+}} xmm6 
= xmm4[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm5, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] +; SSE2-NEXT: pand %xmm6, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm4[1,1,3,3] +; SSE2-NEXT: por %xmm1, %xmm4 +; SSE2-NEXT: pand %xmm4, %xmm0 +; SSE2-NEXT: pandn %xmm8, %xmm4 +; SSE2-NEXT: por %xmm0, %xmm4 +; SSE2-NEXT: movdqa {{.*#+}} xmm0 = [18446744073709518848,18446744073709518848] +; SSE2-NEXT: movdqa %xmm4, %xmm1 +; SSE2-NEXT: pxor %xmm2, %xmm1 +; SSE2-NEXT: movdqa {{.*#+}} xmm5 = [18446744071562035200,18446744071562035200] +; SSE2-NEXT: movdqa %xmm1, %xmm6 +; SSE2-NEXT: pcmpgtd %xmm5, %xmm6 +; SSE2-NEXT: pshufd {{.*#+}} xmm7 = xmm6[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm5, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] +; SSE2-NEXT: pand %xmm7, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm6[1,1,3,3] +; SSE2-NEXT: por %xmm1, %xmm6 +; SSE2-NEXT: pand %xmm6, %xmm4 +; SSE2-NEXT: pandn %xmm0, %xmm6 +; SSE2-NEXT: por %xmm4, %xmm6 +; SSE2-NEXT: pxor %xmm3, %xmm2 +; SSE2-NEXT: movdqa %xmm2, %xmm1 +; SSE2-NEXT: pcmpgtd %xmm5, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm1[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm5, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] +; SSE2-NEXT: pand %xmm4, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] +; SSE2-NEXT: por %xmm2, %xmm1 +; SSE2-NEXT: pand %xmm1, %xmm3 +; SSE2-NEXT: pandn %xmm0, %xmm1 +; SSE2-NEXT: por %xmm3, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm1[0,2,2,3] +; SSE2-NEXT: pshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7] +; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm6[0,2,2,3] +; SSE2-NEXT: pshuflw {{.*#+}} xmm1 = xmm1[0,2,2,3,4,5,6,7] +; SSE2-NEXT: punpckldq {{.*#+}} xmm1 = xmm1[0],xmm0[0],xmm1[1],xmm0[1] +; SSE2-NEXT: movq %xmm1, (%rdi) +; SSE2-NEXT: retq +; +; SSSE3-LABEL: trunc_ssat_v4i64_v4i16_store: +; SSSE3: # %bb.0: +; SSSE3-NEXT: movdqa {{.*#+}} xmm8 = [32767,32767] +; SSSE3-NEXT: movdqa {{.*#+}} xmm2 = [2147483648,2147483648] +; SSSE3-NEXT: movdqa %xmm1, %xmm3 +; SSSE3-NEXT: pxor %xmm2, %xmm3 
+; SSSE3-NEXT: movdqa {{.*#+}} xmm5 = [2147516415,2147516415] +; SSSE3-NEXT: movdqa %xmm5, %xmm6 +; SSSE3-NEXT: pcmpgtd %xmm3, %xmm6 +; SSSE3-NEXT: pshufd {{.*#+}} xmm7 = xmm6[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm5, %xmm3 +; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm3[1,1,3,3] +; SSSE3-NEXT: pand %xmm7, %xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm6[1,1,3,3] +; SSSE3-NEXT: por %xmm4, %xmm3 +; SSSE3-NEXT: pand %xmm3, %xmm1 +; SSSE3-NEXT: pandn %xmm8, %xmm3 +; SSSE3-NEXT: por %xmm1, %xmm3 +; SSSE3-NEXT: movdqa %xmm0, %xmm1 +; SSSE3-NEXT: pxor %xmm2, %xmm1 +; SSSE3-NEXT: movdqa %xmm5, %xmm4 +; SSSE3-NEXT: pcmpgtd %xmm1, %xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm4[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm5, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] +; SSSE3-NEXT: pand %xmm6, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm4[1,1,3,3] +; SSSE3-NEXT: por %xmm1, %xmm4 +; SSSE3-NEXT: pand %xmm4, %xmm0 +; SSSE3-NEXT: pandn %xmm8, %xmm4 +; SSSE3-NEXT: por %xmm0, %xmm4 +; SSSE3-NEXT: movdqa {{.*#+}} xmm0 = [18446744073709518848,18446744073709518848] +; SSSE3-NEXT: movdqa %xmm4, %xmm1 +; SSSE3-NEXT: pxor %xmm2, %xmm1 +; SSSE3-NEXT: movdqa {{.*#+}} xmm5 = [18446744071562035200,18446744071562035200] +; SSSE3-NEXT: movdqa %xmm1, %xmm6 +; SSSE3-NEXT: pcmpgtd %xmm5, %xmm6 +; SSSE3-NEXT: pshufd {{.*#+}} xmm7 = xmm6[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm5, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] +; SSSE3-NEXT: pand %xmm7, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm6[1,1,3,3] +; SSSE3-NEXT: por %xmm1, %xmm6 +; SSSE3-NEXT: pand %xmm6, %xmm4 +; SSSE3-NEXT: pandn %xmm0, %xmm6 +; SSSE3-NEXT: por %xmm4, %xmm6 +; SSSE3-NEXT: pxor %xmm3, %xmm2 +; SSSE3-NEXT: movdqa %xmm2, %xmm1 +; SSSE3-NEXT: pcmpgtd %xmm5, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm1[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm5, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] +; SSSE3-NEXT: pand %xmm4, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] +; SSSE3-NEXT: por %xmm2, 
%xmm1 +; SSSE3-NEXT: pand %xmm1, %xmm3 +; SSSE3-NEXT: pandn %xmm0, %xmm1 +; SSSE3-NEXT: por %xmm3, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm1[0,2,2,3] +; SSSE3-NEXT: pshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7] +; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm6[0,2,2,3] +; SSSE3-NEXT: pshuflw {{.*#+}} xmm1 = xmm1[0,2,2,3,4,5,6,7] +; SSSE3-NEXT: punpckldq {{.*#+}} xmm1 = xmm1[0],xmm0[0],xmm1[1],xmm0[1] +; SSSE3-NEXT: movq %xmm1, (%rdi) +; SSSE3-NEXT: retq +; +; SSE41-LABEL: trunc_ssat_v4i64_v4i16_store: +; SSE41: # %bb.0: +; SSE41-NEXT: movdqa %xmm0, %xmm2 +; SSE41-NEXT: movapd {{.*#+}} xmm4 = [32767,32767] +; SSE41-NEXT: movdqa {{.*#+}} xmm3 = [2147483648,2147483648] +; SSE41-NEXT: movdqa %xmm1, %xmm0 +; SSE41-NEXT: pxor %xmm3, %xmm0 +; SSE41-NEXT: movdqa {{.*#+}} xmm6 = [2147516415,2147516415] +; SSE41-NEXT: movdqa %xmm6, %xmm5 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm5 +; SSE41-NEXT: movdqa %xmm6, %xmm7 +; SSE41-NEXT: pcmpgtd %xmm0, %xmm7 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm7[0,0,2,2] +; SSE41-NEXT: pand %xmm5, %xmm0 +; SSE41-NEXT: por %xmm7, %xmm0 +; SSE41-NEXT: movapd %xmm4, %xmm5 +; SSE41-NEXT: blendvpd %xmm0, %xmm1, %xmm5 +; SSE41-NEXT: movdqa %xmm2, %xmm0 +; SSE41-NEXT: pxor %xmm3, %xmm0 +; SSE41-NEXT: movdqa %xmm6, %xmm1 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm1 +; SSE41-NEXT: pcmpgtd %xmm0, %xmm6 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm6[0,0,2,2] +; SSE41-NEXT: pand %xmm1, %xmm0 +; SSE41-NEXT: por %xmm6, %xmm0 +; SSE41-NEXT: blendvpd %xmm0, %xmm2, %xmm4 +; SSE41-NEXT: movapd {{.*#+}} xmm1 = [18446744073709518848,18446744073709518848] +; SSE41-NEXT: movapd %xmm4, %xmm2 +; SSE41-NEXT: xorpd %xmm3, %xmm2 +; SSE41-NEXT: movdqa {{.*#+}} xmm6 = [18446744071562035200,18446744071562035200] +; SSE41-NEXT: movapd %xmm2, %xmm7 +; SSE41-NEXT: pcmpeqd %xmm6, %xmm7 +; SSE41-NEXT: pcmpgtd %xmm6, %xmm2 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm2[0,0,2,2] +; SSE41-NEXT: pand %xmm7, %xmm0 +; SSE41-NEXT: por %xmm2, %xmm0 +; SSE41-NEXT: movapd %xmm1, %xmm2 +; SSE41-NEXT: blendvpd 
%xmm0, %xmm4, %xmm2 +; SSE41-NEXT: xorpd %xmm5, %xmm3 +; SSE41-NEXT: movapd %xmm3, %xmm4 +; SSE41-NEXT: pcmpeqd %xmm6, %xmm4 +; SSE41-NEXT: pcmpgtd %xmm6, %xmm3 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm3[0,0,2,2] +; SSE41-NEXT: pand %xmm4, %xmm0 +; SSE41-NEXT: por %xmm3, %xmm0 +; SSE41-NEXT: blendvpd %xmm0, %xmm5, %xmm1 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm1[0,2,2,3] +; SSE41-NEXT: pshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7] +; SSE41-NEXT: pshufd {{.*#+}} xmm1 = xmm2[0,2,2,3] +; SSE41-NEXT: pshuflw {{.*#+}} xmm1 = xmm1[0,2,2,3,4,5,6,7] +; SSE41-NEXT: punpckldq {{.*#+}} xmm1 = xmm1[0],xmm0[0],xmm1[1],xmm0[1] +; SSE41-NEXT: movq %xmm1, (%rdi) +; SSE41-NEXT: retq +; +; AVX1-LABEL: trunc_ssat_v4i64_v4i16_store: +; AVX1: # %bb.0: +; AVX1-NEXT: vextractf128 $1, %ymm0, %xmm1 +; AVX1-NEXT: vmovdqa {{.*#+}} xmm2 = [32767,32767] +; AVX1-NEXT: vpcmpgtq %xmm1, %xmm2, %xmm3 +; AVX1-NEXT: vpcmpgtq %xmm0, %xmm2, %xmm4 +; AVX1-NEXT: vblendvpd %xmm4, %xmm0, %xmm2, %xmm0 +; AVX1-NEXT: vmovdqa {{.*#+}} xmm4 = [18446744073709518848,18446744073709518848] +; AVX1-NEXT: vpcmpgtq %xmm4, %xmm0, %xmm5 +; AVX1-NEXT: vblendvpd %xmm3, %xmm1, %xmm2, %xmm1 +; AVX1-NEXT: vpcmpgtq %xmm4, %xmm1, %xmm2 +; AVX1-NEXT: vblendvpd %xmm2, %xmm1, %xmm4, %xmm1 +; AVX1-NEXT: vpermilps {{.*#+}} xmm1 = xmm1[0,2,2,3] +; AVX1-NEXT: vpshuflw {{.*#+}} xmm1 = xmm1[0,2,2,3,4,5,6,7] +; AVX1-NEXT: vblendvpd %xmm5, %xmm0, %xmm4, %xmm0 +; AVX1-NEXT: vpermilps {{.*#+}} xmm0 = xmm0[0,2,2,3] +; AVX1-NEXT: vpshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7] +; AVX1-NEXT: vpunpckldq {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1] +; AVX1-NEXT: vmovq %xmm0, (%rdi) +; AVX1-NEXT: vzeroupper +; AVX1-NEXT: retq +; +; AVX2-SLOW-LABEL: trunc_ssat_v4i64_v4i16_store: +; AVX2-SLOW: # %bb.0: +; AVX2-SLOW-NEXT: vpbroadcastq {{.*#+}} ymm1 = [32767,32767,32767,32767] +; AVX2-SLOW-NEXT: vpcmpgtq %ymm0, %ymm1, %ymm2 +; AVX2-SLOW-NEXT: vblendvpd %ymm2, %ymm0, %ymm1, %ymm0 +; AVX2-SLOW-NEXT: vpbroadcastq {{.*#+}} ymm1 = 
[18446744073709518848,18446744073709518848,18446744073709518848,18446744073709518848] +; AVX2-SLOW-NEXT: vpcmpgtq %ymm1, %ymm0, %ymm2 +; AVX2-SLOW-NEXT: vblendvpd %ymm2, %ymm0, %ymm1, %ymm0 +; AVX2-SLOW-NEXT: vextractf128 $1, %ymm0, %xmm1 +; AVX2-SLOW-NEXT: vpermilps {{.*#+}} xmm1 = xmm1[0,2,2,3] +; AVX2-SLOW-NEXT: vpshuflw {{.*#+}} xmm1 = xmm1[0,2,2,3,4,5,6,7] +; AVX2-SLOW-NEXT: vpermilps {{.*#+}} xmm0 = xmm0[0,2,2,3] +; AVX2-SLOW-NEXT: vpshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7] +; AVX2-SLOW-NEXT: vpunpckldq {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1] +; AVX2-SLOW-NEXT: vmovq %xmm0, (%rdi) +; AVX2-SLOW-NEXT: vzeroupper +; AVX2-SLOW-NEXT: retq +; +; AVX2-FAST-LABEL: trunc_ssat_v4i64_v4i16_store: +; AVX2-FAST: # %bb.0: +; AVX2-FAST-NEXT: vpbroadcastq {{.*#+}} ymm1 = [32767,32767,32767,32767] +; AVX2-FAST-NEXT: vpcmpgtq %ymm0, %ymm1, %ymm2 +; AVX2-FAST-NEXT: vblendvpd %ymm2, %ymm0, %ymm1, %ymm0 +; AVX2-FAST-NEXT: vpbroadcastq {{.*#+}} ymm1 = [18446744073709518848,18446744073709518848,18446744073709518848,18446744073709518848] +; AVX2-FAST-NEXT: vpcmpgtq %ymm1, %ymm0, %ymm2 +; AVX2-FAST-NEXT: vblendvpd %ymm2, %ymm0, %ymm1, %ymm0 +; AVX2-FAST-NEXT: vextractf128 $1, %ymm0, %xmm1 +; AVX2-FAST-NEXT: vmovdqa {{.*#+}} xmm2 = [0,1,8,9,8,9,10,11,8,9,10,11,12,13,14,15] +; AVX2-FAST-NEXT: vpshufb %xmm2, %xmm1, %xmm1 +; AVX2-FAST-NEXT: vpshufb %xmm2, %xmm0, %xmm0 +; AVX2-FAST-NEXT: vpunpckldq {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1] +; AVX2-FAST-NEXT: vmovq %xmm0, (%rdi) +; AVX2-FAST-NEXT: vzeroupper +; AVX2-FAST-NEXT: retq +; +; AVX512F-LABEL: trunc_ssat_v4i64_v4i16_store: +; AVX512F: # %bb.0: +; AVX512F-NEXT: # kill: def $ymm0 killed $ymm0 def $zmm0 +; AVX512F-NEXT: vpminsq {{.*}}(%rip){1to8}, %zmm0, %zmm0 +; AVX512F-NEXT: vpmaxsq {{.*}}(%rip){1to8}, %zmm0, %zmm0 +; AVX512F-NEXT: vpmovqw %zmm0, %xmm0 +; AVX512F-NEXT: vmovq %xmm0, (%rdi) +; AVX512F-NEXT: vzeroupper +; AVX512F-NEXT: retq +; +; AVX512VL-LABEL: trunc_ssat_v4i64_v4i16_store: +; AVX512VL: # %bb.0: +; 
AVX512VL-NEXT: vpmovsqw %ymm0, (%rdi) +; AVX512VL-NEXT: vzeroupper +; AVX512VL-NEXT: retq +; +; AVX512BW-LABEL: trunc_ssat_v4i64_v4i16_store: +; AVX512BW: # %bb.0: +; AVX512BW-NEXT: # kill: def $ymm0 killed $ymm0 def $zmm0 +; AVX512BW-NEXT: vpminsq {{.*}}(%rip){1to8}, %zmm0, %zmm0 +; AVX512BW-NEXT: vpmaxsq {{.*}}(%rip){1to8}, %zmm0, %zmm0 +; AVX512BW-NEXT: vpmovqw %zmm0, %xmm0 +; AVX512BW-NEXT: vmovq %xmm0, (%rdi) +; AVX512BW-NEXT: vzeroupper +; AVX512BW-NEXT: retq +; +; AVX512BWVL-LABEL: trunc_ssat_v4i64_v4i16_store: +; AVX512BWVL: # %bb.0: +; AVX512BWVL-NEXT: vpmovsqw %ymm0, (%rdi) +; AVX512BWVL-NEXT: vzeroupper +; AVX512BWVL-NEXT: retq + %1 = icmp slt <4 x i64> %a0, + %2 = select <4 x i1> %1, <4 x i64> %a0, <4 x i64> + %3 = icmp sgt <4 x i64> %2, + %4 = select <4 x i1> %3, <4 x i64> %2, <4 x i64> + %5 = trunc <4 x i64> %4 to <4 x i16> + store <4 x i16> %5, <4 x i16> *%p1 + ret void +} + define <8 x i16> @trunc_ssat_v8i64_v8i16(<8 x i64> %a0) { ; SSE2-LABEL: trunc_ssat_v8i64_v8i16: ; SSE2: # %bb.0: @@ -1043,111 +1602,717 @@ define <8 x i16> @trunc_ssat_v8i64_v8i16 ; AVX2-NEXT: vzeroupper ; AVX2-NEXT: retq ; -; AVX512-LABEL: trunc_ssat_v8i64_v8i16: -; AVX512: # %bb.0: -; AVX512-NEXT: vpmovsqw %zmm0, %xmm0 -; AVX512-NEXT: vzeroupper -; AVX512-NEXT: retq - %1 = icmp slt <8 x i64> %a0, - %2 = select <8 x i1> %1, <8 x i64> %a0, <8 x i64> - %3 = icmp sgt <8 x i64> %2, - %4 = select <8 x i1> %3, <8 x i64> %2, <8 x i64> - %5 = trunc <8 x i64> %4 to <8 x i16> - ret <8 x i16> %5 -} - -define <8 x i16> @trunc_ssat_v8i32_v8i16(<8 x i32> %a0) { -; SSE-LABEL: trunc_ssat_v8i32_v8i16: -; SSE: # %bb.0: -; SSE-NEXT: packssdw %xmm1, %xmm0 -; SSE-NEXT: retq +; AVX512-LABEL: trunc_ssat_v8i64_v8i16: +; AVX512: # %bb.0: +; AVX512-NEXT: vpmovsqw %zmm0, %xmm0 +; AVX512-NEXT: vzeroupper +; AVX512-NEXT: retq + %1 = icmp slt <8 x i64> %a0, + %2 = select <8 x i1> %1, <8 x i64> %a0, <8 x i64> + %3 = icmp sgt <8 x i64> %2, + %4 = select <8 x i1> %3, <8 x i64> %2, <8 x i64> + %5 = trunc <8 x 
i64> %4 to <8 x i16> + ret <8 x i16> %5 +} + +define <4 x i16> @trunc_ssat_v4i32_v4i16(<4 x i32> %a0) { +; SSE-LABEL: trunc_ssat_v4i32_v4i16: +; SSE: # %bb.0: +; SSE-NEXT: packssdw %xmm0, %xmm0 +; SSE-NEXT: retq +; +; AVX-LABEL: trunc_ssat_v4i32_v4i16: +; AVX: # %bb.0: +; AVX-NEXT: vpackssdw %xmm0, %xmm0, %xmm0 +; AVX-NEXT: retq +; +; AVX512F-LABEL: trunc_ssat_v4i32_v4i16: +; AVX512F: # %bb.0: +; AVX512F-NEXT: vpackssdw %xmm0, %xmm0, %xmm0 +; AVX512F-NEXT: retq +; +; AVX512VL-LABEL: trunc_ssat_v4i32_v4i16: +; AVX512VL: # %bb.0: +; AVX512VL-NEXT: vpminsd {{.*}}(%rip){1to4}, %xmm0, %xmm0 +; AVX512VL-NEXT: vpmaxsd {{.*}}(%rip){1to4}, %xmm0, %xmm0 +; AVX512VL-NEXT: vpackssdw %xmm0, %xmm0, %xmm0 +; AVX512VL-NEXT: retq +; +; AVX512BW-LABEL: trunc_ssat_v4i32_v4i16: +; AVX512BW: # %bb.0: +; AVX512BW-NEXT: vpackssdw %xmm0, %xmm0, %xmm0 +; AVX512BW-NEXT: retq +; +; AVX512BWVL-LABEL: trunc_ssat_v4i32_v4i16: +; AVX512BWVL: # %bb.0: +; AVX512BWVL-NEXT: vpminsd {{.*}}(%rip){1to4}, %xmm0, %xmm0 +; AVX512BWVL-NEXT: vpmaxsd {{.*}}(%rip){1to4}, %xmm0, %xmm0 +; AVX512BWVL-NEXT: vpackssdw %xmm0, %xmm0, %xmm0 +; AVX512BWVL-NEXT: retq + %1 = icmp slt <4 x i32> %a0, + %2 = select <4 x i1> %1, <4 x i32> %a0, <4 x i32> + %3 = icmp sgt <4 x i32> %2, + %4 = select <4 x i1> %3, <4 x i32> %2, <4 x i32> + %5 = trunc <4 x i32> %4 to <4 x i16> + ret <4 x i16> %5 +} + +define void @trunc_ssat_v4i32_v4i16_store(<4 x i32> %a0, <4 x i16> *%p1) { +; SSE-LABEL: trunc_ssat_v4i32_v4i16_store: +; SSE: # %bb.0: +; SSE-NEXT: packssdw %xmm0, %xmm0 +; SSE-NEXT: movq %xmm0, (%rdi) +; SSE-NEXT: retq +; +; AVX-LABEL: trunc_ssat_v4i32_v4i16_store: +; AVX: # %bb.0: +; AVX-NEXT: vpackssdw %xmm0, %xmm0, %xmm0 +; AVX-NEXT: vmovq %xmm0, (%rdi) +; AVX-NEXT: retq +; +; AVX512F-LABEL: trunc_ssat_v4i32_v4i16_store: +; AVX512F: # %bb.0: +; AVX512F-NEXT: vpackssdw %xmm0, %xmm0, %xmm0 +; AVX512F-NEXT: vmovq %xmm0, (%rdi) +; AVX512F-NEXT: retq +; +; AVX512VL-LABEL: trunc_ssat_v4i32_v4i16_store: +; AVX512VL: # %bb.0: +; 
AVX512VL-NEXT: vpmovsdw %xmm0, (%rdi) +; AVX512VL-NEXT: retq +; +; AVX512BW-LABEL: trunc_ssat_v4i32_v4i16_store: +; AVX512BW: # %bb.0: +; AVX512BW-NEXT: vpackssdw %xmm0, %xmm0, %xmm0 +; AVX512BW-NEXT: vmovq %xmm0, (%rdi) +; AVX512BW-NEXT: retq +; +; AVX512BWVL-LABEL: trunc_ssat_v4i32_v4i16_store: +; AVX512BWVL: # %bb.0: +; AVX512BWVL-NEXT: vpmovsdw %xmm0, (%rdi) +; AVX512BWVL-NEXT: retq + %1 = icmp slt <4 x i32> %a0, + %2 = select <4 x i1> %1, <4 x i32> %a0, <4 x i32> + %3 = icmp sgt <4 x i32> %2, + %4 = select <4 x i1> %3, <4 x i32> %2, <4 x i32> + %5 = trunc <4 x i32> %4 to <4 x i16> + store <4 x i16> %5, <4 x i16> *%p1 + ret void +} + +define <8 x i16> @trunc_ssat_v8i32_v8i16(<8 x i32> %a0) { +; SSE-LABEL: trunc_ssat_v8i32_v8i16: +; SSE: # %bb.0: +; SSE-NEXT: packssdw %xmm1, %xmm0 +; SSE-NEXT: retq +; +; AVX1-LABEL: trunc_ssat_v8i32_v8i16: +; AVX1: # %bb.0: +; AVX1-NEXT: vextractf128 $1, %ymm0, %xmm1 +; AVX1-NEXT: vpackssdw %xmm1, %xmm0, %xmm0 +; AVX1-NEXT: vzeroupper +; AVX1-NEXT: retq +; +; AVX2-LABEL: trunc_ssat_v8i32_v8i16: +; AVX2: # %bb.0: +; AVX2-NEXT: vextracti128 $1, %ymm0, %xmm1 +; AVX2-NEXT: vpackssdw %xmm1, %xmm0, %xmm0 +; AVX2-NEXT: vzeroupper +; AVX2-NEXT: retq +; +; AVX512F-LABEL: trunc_ssat_v8i32_v8i16: +; AVX512F: # %bb.0: +; AVX512F-NEXT: vextracti128 $1, %ymm0, %xmm1 +; AVX512F-NEXT: vpackssdw %xmm1, %xmm0, %xmm0 +; AVX512F-NEXT: vzeroupper +; AVX512F-NEXT: retq +; +; AVX512VL-LABEL: trunc_ssat_v8i32_v8i16: +; AVX512VL: # %bb.0: +; AVX512VL-NEXT: vpmovsdw %ymm0, %xmm0 +; AVX512VL-NEXT: vzeroupper +; AVX512VL-NEXT: retq +; +; AVX512BW-LABEL: trunc_ssat_v8i32_v8i16: +; AVX512BW: # %bb.0: +; AVX512BW-NEXT: vextracti128 $1, %ymm0, %xmm1 +; AVX512BW-NEXT: vpackssdw %xmm1, %xmm0, %xmm0 +; AVX512BW-NEXT: vzeroupper +; AVX512BW-NEXT: retq +; +; AVX512BWVL-LABEL: trunc_ssat_v8i32_v8i16: +; AVX512BWVL: # %bb.0: +; AVX512BWVL-NEXT: vpmovsdw %ymm0, %xmm0 +; AVX512BWVL-NEXT: vzeroupper +; AVX512BWVL-NEXT: retq + %1 = icmp slt <8 x i32> %a0, + %2 = select 
<8 x i1> %1, <8 x i32> %a0, <8 x i32> + %3 = icmp sgt <8 x i32> %2, + %4 = select <8 x i1> %3, <8 x i32> %2, <8 x i32> + %5 = trunc <8 x i32> %4 to <8 x i16> + ret <8 x i16> %5 +} + +define <16 x i16> @trunc_ssat_v16i32_v16i16(<16 x i32> %a0) { +; SSE-LABEL: trunc_ssat_v16i32_v16i16: +; SSE: # %bb.0: +; SSE-NEXT: packssdw %xmm1, %xmm0 +; SSE-NEXT: packssdw %xmm3, %xmm2 +; SSE-NEXT: movdqa %xmm2, %xmm1 +; SSE-NEXT: retq +; +; AVX1-LABEL: trunc_ssat_v16i32_v16i16: +; AVX1: # %bb.0: +; AVX1-NEXT: vextractf128 $1, %ymm1, %xmm2 +; AVX1-NEXT: vpackssdw %xmm2, %xmm1, %xmm1 +; AVX1-NEXT: vextractf128 $1, %ymm0, %xmm2 +; AVX1-NEXT: vpackssdw %xmm2, %xmm0, %xmm0 +; AVX1-NEXT: vinsertf128 $1, %xmm1, %ymm0, %ymm0 +; AVX1-NEXT: retq +; +; AVX2-LABEL: trunc_ssat_v16i32_v16i16: +; AVX2: # %bb.0: +; AVX2-NEXT: vpackssdw %ymm1, %ymm0, %ymm0 +; AVX2-NEXT: vpermq {{.*#+}} ymm0 = ymm0[0,2,1,3] +; AVX2-NEXT: retq +; +; AVX512-LABEL: trunc_ssat_v16i32_v16i16: +; AVX512: # %bb.0: +; AVX512-NEXT: vpmovsdw %zmm0, %ymm0 +; AVX512-NEXT: retq + %1 = icmp slt <16 x i32> %a0, + %2 = select <16 x i1> %1, <16 x i32> %a0, <16 x i32> + %3 = icmp sgt <16 x i32> %2, + %4 = select <16 x i1> %3, <16 x i32> %2, <16 x i32> + %5 = trunc <16 x i32> %4 to <16 x i16> + ret <16 x i16> %5 +} + +; +; Signed saturation truncation to vXi8 +; + +define <4 x i8> @trunc_ssat_v4i64_v4i8(<4 x i64> %a0) { +; SSE2-LABEL: trunc_ssat_v4i64_v4i8: +; SSE2: # %bb.0: +; SSE2-NEXT: movdqa {{.*#+}} xmm8 = [127,127] +; SSE2-NEXT: movdqa {{.*#+}} xmm2 = [2147483648,2147483648] +; SSE2-NEXT: movdqa %xmm1, %xmm3 +; SSE2-NEXT: pxor %xmm2, %xmm3 +; SSE2-NEXT: movdqa {{.*#+}} xmm5 = [2147483775,2147483775] +; SSE2-NEXT: movdqa %xmm5, %xmm6 +; SSE2-NEXT: pcmpgtd %xmm3, %xmm6 +; SSE2-NEXT: pshufd {{.*#+}} xmm7 = xmm6[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm5, %xmm3 +; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm3[1,1,3,3] +; SSE2-NEXT: pand %xmm7, %xmm4 +; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm6[1,1,3,3] +; SSE2-NEXT: por %xmm4, %xmm3 +; SSE2-NEXT: 
pand %xmm3, %xmm1 +; SSE2-NEXT: pandn %xmm8, %xmm3 +; SSE2-NEXT: por %xmm1, %xmm3 +; SSE2-NEXT: movdqa %xmm0, %xmm1 +; SSE2-NEXT: pxor %xmm2, %xmm1 +; SSE2-NEXT: movdqa %xmm5, %xmm4 +; SSE2-NEXT: pcmpgtd %xmm1, %xmm4 +; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm4[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm5, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] +; SSE2-NEXT: pand %xmm6, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm4[1,1,3,3] +; SSE2-NEXT: por %xmm1, %xmm4 +; SSE2-NEXT: pand %xmm4, %xmm0 +; SSE2-NEXT: pandn %xmm8, %xmm4 +; SSE2-NEXT: por %xmm0, %xmm4 +; SSE2-NEXT: movdqa {{.*#+}} xmm8 = [18446744073709551488,18446744073709551488] +; SSE2-NEXT: movdqa %xmm4, %xmm0 +; SSE2-NEXT: pxor %xmm2, %xmm0 +; SSE2-NEXT: movdqa {{.*#+}} xmm5 = [18446744071562067840,18446744071562067840] +; SSE2-NEXT: movdqa %xmm0, %xmm6 +; SSE2-NEXT: pcmpgtd %xmm5, %xmm6 +; SSE2-NEXT: pshufd {{.*#+}} xmm7 = xmm6[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm5, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm0[1,1,3,3] +; SSE2-NEXT: pand %xmm7, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm6[1,1,3,3] +; SSE2-NEXT: por %xmm1, %xmm0 +; SSE2-NEXT: pand %xmm0, %xmm4 +; SSE2-NEXT: pandn %xmm8, %xmm0 +; SSE2-NEXT: por %xmm4, %xmm0 +; SSE2-NEXT: pxor %xmm3, %xmm2 +; SSE2-NEXT: movdqa %xmm2, %xmm1 +; SSE2-NEXT: pcmpgtd %xmm5, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm1[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm5, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] +; SSE2-NEXT: pand %xmm4, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] +; SSE2-NEXT: por %xmm2, %xmm1 +; SSE2-NEXT: pand %xmm1, %xmm3 +; SSE2-NEXT: pandn %xmm8, %xmm1 +; SSE2-NEXT: por %xmm3, %xmm1 +; SSE2-NEXT: movdqa {{.*#+}} xmm2 = [255,0,0,0,0,0,0,0,255,0,0,0,0,0,0,0] +; SSE2-NEXT: pand %xmm2, %xmm1 +; SSE2-NEXT: pand %xmm2, %xmm0 +; SSE2-NEXT: packuswb %xmm1, %xmm0 +; SSE2-NEXT: packuswb %xmm0, %xmm0 +; SSE2-NEXT: packuswb %xmm0, %xmm0 +; SSE2-NEXT: retq +; +; SSSE3-LABEL: trunc_ssat_v4i64_v4i8: +; SSSE3: # %bb.0: +; SSSE3-NEXT: 
movdqa {{.*#+}} xmm8 = [127,127] +; SSSE3-NEXT: movdqa {{.*#+}} xmm2 = [2147483648,2147483648] +; SSSE3-NEXT: movdqa %xmm1, %xmm3 +; SSSE3-NEXT: pxor %xmm2, %xmm3 +; SSSE3-NEXT: movdqa {{.*#+}} xmm5 = [2147483775,2147483775] +; SSSE3-NEXT: movdqa %xmm5, %xmm6 +; SSSE3-NEXT: pcmpgtd %xmm3, %xmm6 +; SSSE3-NEXT: pshufd {{.*#+}} xmm7 = xmm6[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm5, %xmm3 +; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm3[1,1,3,3] +; SSSE3-NEXT: pand %xmm7, %xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm6[1,1,3,3] +; SSSE3-NEXT: por %xmm4, %xmm3 +; SSSE3-NEXT: pand %xmm3, %xmm1 +; SSSE3-NEXT: pandn %xmm8, %xmm3 +; SSSE3-NEXT: por %xmm1, %xmm3 +; SSSE3-NEXT: movdqa %xmm0, %xmm1 +; SSSE3-NEXT: pxor %xmm2, %xmm1 +; SSSE3-NEXT: movdqa %xmm5, %xmm4 +; SSSE3-NEXT: pcmpgtd %xmm1, %xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm4[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm5, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] +; SSSE3-NEXT: pand %xmm6, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm4[1,1,3,3] +; SSSE3-NEXT: por %xmm1, %xmm4 +; SSSE3-NEXT: pand %xmm4, %xmm0 +; SSSE3-NEXT: pandn %xmm8, %xmm4 +; SSSE3-NEXT: por %xmm0, %xmm4 +; SSSE3-NEXT: movdqa {{.*#+}} xmm8 = [18446744073709551488,18446744073709551488] +; SSSE3-NEXT: movdqa %xmm4, %xmm0 +; SSSE3-NEXT: pxor %xmm2, %xmm0 +; SSSE3-NEXT: movdqa {{.*#+}} xmm5 = [18446744071562067840,18446744071562067840] +; SSSE3-NEXT: movdqa %xmm0, %xmm6 +; SSSE3-NEXT: pcmpgtd %xmm5, %xmm6 +; SSSE3-NEXT: pshufd {{.*#+}} xmm7 = xmm6[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm5, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm0[1,1,3,3] +; SSSE3-NEXT: pand %xmm7, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm6[1,1,3,3] +; SSSE3-NEXT: por %xmm1, %xmm0 +; SSSE3-NEXT: pand %xmm0, %xmm4 +; SSSE3-NEXT: pandn %xmm8, %xmm0 +; SSSE3-NEXT: por %xmm4, %xmm0 +; SSSE3-NEXT: pxor %xmm3, %xmm2 +; SSSE3-NEXT: movdqa %xmm2, %xmm1 +; SSSE3-NEXT: pcmpgtd %xmm5, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm1[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm5, %xmm2 
+; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] +; SSSE3-NEXT: pand %xmm4, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] +; SSSE3-NEXT: por %xmm2, %xmm1 +; SSSE3-NEXT: pand %xmm1, %xmm3 +; SSSE3-NEXT: pandn %xmm8, %xmm1 +; SSSE3-NEXT: por %xmm3, %xmm1 +; SSSE3-NEXT: movdqa {{.*#+}} xmm2 = <0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u> +; SSSE3-NEXT: pshufb %xmm2, %xmm1 +; SSSE3-NEXT: pshufb %xmm2, %xmm0 +; SSSE3-NEXT: punpcklwd {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1],xmm0[2],xmm1[2],xmm0[3],xmm1[3] +; SSSE3-NEXT: retq +; +; SSE41-LABEL: trunc_ssat_v4i64_v4i8: +; SSE41: # %bb.0: +; SSE41-NEXT: movdqa %xmm0, %xmm2 +; SSE41-NEXT: movapd {{.*#+}} xmm4 = [127,127] +; SSE41-NEXT: movdqa {{.*#+}} xmm3 = [2147483648,2147483648] +; SSE41-NEXT: movdqa %xmm1, %xmm0 +; SSE41-NEXT: pxor %xmm3, %xmm0 +; SSE41-NEXT: movdqa {{.*#+}} xmm6 = [2147483775,2147483775] +; SSE41-NEXT: movdqa %xmm6, %xmm5 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm5 +; SSE41-NEXT: movdqa %xmm6, %xmm7 +; SSE41-NEXT: pcmpgtd %xmm0, %xmm7 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm7[0,0,2,2] +; SSE41-NEXT: pand %xmm5, %xmm0 +; SSE41-NEXT: por %xmm7, %xmm0 +; SSE41-NEXT: movapd %xmm4, %xmm5 +; SSE41-NEXT: blendvpd %xmm0, %xmm1, %xmm5 +; SSE41-NEXT: movdqa %xmm2, %xmm0 +; SSE41-NEXT: pxor %xmm3, %xmm0 +; SSE41-NEXT: movdqa %xmm6, %xmm1 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm1 +; SSE41-NEXT: pcmpgtd %xmm0, %xmm6 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm6[0,0,2,2] +; SSE41-NEXT: pand %xmm1, %xmm0 +; SSE41-NEXT: por %xmm6, %xmm0 +; SSE41-NEXT: blendvpd %xmm0, %xmm2, %xmm4 +; SSE41-NEXT: movapd {{.*#+}} xmm2 = [18446744073709551488,18446744073709551488] +; SSE41-NEXT: movapd %xmm4, %xmm1 +; SSE41-NEXT: xorpd %xmm3, %xmm1 +; SSE41-NEXT: movdqa {{.*#+}} xmm6 = [18446744071562067840,18446744071562067840] +; SSE41-NEXT: movapd %xmm1, %xmm7 +; SSE41-NEXT: pcmpeqd %xmm6, %xmm7 +; SSE41-NEXT: pcmpgtd %xmm6, %xmm1 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm1[0,0,2,2] +; SSE41-NEXT: pand %xmm7, %xmm0 +; SSE41-NEXT: por %xmm1, 
%xmm0 +; SSE41-NEXT: movapd %xmm2, %xmm1 +; SSE41-NEXT: blendvpd %xmm0, %xmm4, %xmm1 +; SSE41-NEXT: xorpd %xmm5, %xmm3 +; SSE41-NEXT: movapd %xmm3, %xmm4 +; SSE41-NEXT: pcmpeqd %xmm6, %xmm4 +; SSE41-NEXT: pcmpgtd %xmm6, %xmm3 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm3[0,0,2,2] +; SSE41-NEXT: pand %xmm4, %xmm0 +; SSE41-NEXT: por %xmm3, %xmm0 +; SSE41-NEXT: blendvpd %xmm0, %xmm5, %xmm2 +; SSE41-NEXT: movdqa {{.*#+}} xmm0 = <0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u> +; SSE41-NEXT: pshufb %xmm0, %xmm2 +; SSE41-NEXT: pshufb %xmm0, %xmm1 +; SSE41-NEXT: punpcklwd {{.*#+}} xmm1 = xmm1[0],xmm2[0],xmm1[1],xmm2[1],xmm1[2],xmm2[2],xmm1[3],xmm2[3] +; SSE41-NEXT: movdqa %xmm1, %xmm0 +; SSE41-NEXT: retq +; +; AVX1-LABEL: trunc_ssat_v4i64_v4i8: +; AVX1: # %bb.0: +; AVX1-NEXT: vextractf128 $1, %ymm0, %xmm1 +; AVX1-NEXT: vmovdqa {{.*#+}} xmm2 = [127,127] +; AVX1-NEXT: vpcmpgtq %xmm1, %xmm2, %xmm3 +; AVX1-NEXT: vpcmpgtq %xmm0, %xmm2, %xmm4 +; AVX1-NEXT: vblendvpd %xmm4, %xmm0, %xmm2, %xmm0 +; AVX1-NEXT: vmovdqa {{.*#+}} xmm4 = [18446744073709551488,18446744073709551488] +; AVX1-NEXT: vpcmpgtq %xmm4, %xmm0, %xmm5 +; AVX1-NEXT: vblendvpd %xmm3, %xmm1, %xmm2, %xmm1 +; AVX1-NEXT: vpcmpgtq %xmm4, %xmm1, %xmm2 +; AVX1-NEXT: vblendvpd %xmm2, %xmm1, %xmm4, %xmm1 +; AVX1-NEXT: vmovdqa {{.*#+}} xmm2 = <0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u> +; AVX1-NEXT: vpshufb %xmm2, %xmm1, %xmm1 +; AVX1-NEXT: vblendvpd %xmm5, %xmm0, %xmm4, %xmm0 +; AVX1-NEXT: vpshufb %xmm2, %xmm0, %xmm0 +; AVX1-NEXT: vpunpcklwd {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1],xmm0[2],xmm1[2],xmm0[3],xmm1[3] +; AVX1-NEXT: vzeroupper +; AVX1-NEXT: retq +; +; AVX2-LABEL: trunc_ssat_v4i64_v4i8: +; AVX2: # %bb.0: +; AVX2-NEXT: vpbroadcastq {{.*#+}} ymm1 = [127,127,127,127] +; AVX2-NEXT: vpcmpgtq %ymm0, %ymm1, %ymm2 +; AVX2-NEXT: vblendvpd %ymm2, %ymm0, %ymm1, %ymm0 +; AVX2-NEXT: vpbroadcastq {{.*#+}} ymm1 = [18446744073709551488,18446744073709551488,18446744073709551488,18446744073709551488] +; AVX2-NEXT: vpcmpgtq %ymm1, %ymm0, %ymm2 +; 
AVX2-NEXT: vblendvpd %ymm2, %ymm0, %ymm1, %ymm0 +; AVX2-NEXT: vextractf128 $1, %ymm0, %xmm1 +; AVX2-NEXT: vmovdqa {{.*#+}} xmm2 = <0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u> +; AVX2-NEXT: vpshufb %xmm2, %xmm1, %xmm1 +; AVX2-NEXT: vpshufb %xmm2, %xmm0, %xmm0 +; AVX2-NEXT: vpunpcklwd {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1],xmm0[2],xmm1[2],xmm0[3],xmm1[3] +; AVX2-NEXT: vzeroupper +; AVX2-NEXT: retq +; +; AVX512F-LABEL: trunc_ssat_v4i64_v4i8: +; AVX512F: # %bb.0: +; AVX512F-NEXT: # kill: def $ymm0 killed $ymm0 def $zmm0 +; AVX512F-NEXT: vpminsq {{.*}}(%rip){1to8}, %zmm0, %zmm0 +; AVX512F-NEXT: vpmaxsq {{.*}}(%rip){1to8}, %zmm0, %zmm0 +; AVX512F-NEXT: vpmovqb %zmm0, %xmm0 +; AVX512F-NEXT: vzeroupper +; AVX512F-NEXT: retq +; +; AVX512VL-LABEL: trunc_ssat_v4i64_v4i8: +; AVX512VL: # %bb.0: +; AVX512VL-NEXT: vpmovsqb %ymm0, %xmm0 +; AVX512VL-NEXT: vzeroupper +; AVX512VL-NEXT: retq +; +; AVX512BW-LABEL: trunc_ssat_v4i64_v4i8: +; AVX512BW: # %bb.0: +; AVX512BW-NEXT: # kill: def $ymm0 killed $ymm0 def $zmm0 +; AVX512BW-NEXT: vpminsq {{.*}}(%rip){1to8}, %zmm0, %zmm0 +; AVX512BW-NEXT: vpmaxsq {{.*}}(%rip){1to8}, %zmm0, %zmm0 +; AVX512BW-NEXT: vpmovqb %zmm0, %xmm0 +; AVX512BW-NEXT: vzeroupper +; AVX512BW-NEXT: retq +; +; AVX512BWVL-LABEL: trunc_ssat_v4i64_v4i8: +; AVX512BWVL: # %bb.0: +; AVX512BWVL-NEXT: vpmovsqb %ymm0, %xmm0 +; AVX512BWVL-NEXT: vzeroupper +; AVX512BWVL-NEXT: retq + %1 = icmp slt <4 x i64> %a0, + %2 = select <4 x i1> %1, <4 x i64> %a0, <4 x i64> + %3 = icmp sgt <4 x i64> %2, + %4 = select <4 x i1> %3, <4 x i64> %2, <4 x i64> + %5 = trunc <4 x i64> %4 to <4 x i8> + ret <4 x i8> %5 +} + +define void @trunc_ssat_v4i64_v4i8_store(<4 x i64> %a0, <4 x i8> *%p1) { +; SSE2-LABEL: trunc_ssat_v4i64_v4i8_store: +; SSE2: # %bb.0: +; SSE2-NEXT: movdqa {{.*#+}} xmm8 = [127,127] +; SSE2-NEXT: movdqa {{.*#+}} xmm2 = [2147483648,2147483648] +; SSE2-NEXT: movdqa %xmm1, %xmm3 +; SSE2-NEXT: pxor %xmm2, %xmm3 +; SSE2-NEXT: movdqa {{.*#+}} xmm5 = [2147483775,2147483775] +; SSE2-NEXT: 
movdqa %xmm5, %xmm6 +; SSE2-NEXT: pcmpgtd %xmm3, %xmm6 +; SSE2-NEXT: pshufd {{.*#+}} xmm7 = xmm6[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm5, %xmm3 +; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm3[1,1,3,3] +; SSE2-NEXT: pand %xmm7, %xmm4 +; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm6[1,1,3,3] +; SSE2-NEXT: por %xmm4, %xmm3 +; SSE2-NEXT: pand %xmm3, %xmm1 +; SSE2-NEXT: pandn %xmm8, %xmm3 +; SSE2-NEXT: por %xmm1, %xmm3 +; SSE2-NEXT: movdqa %xmm0, %xmm1 +; SSE2-NEXT: pxor %xmm2, %xmm1 +; SSE2-NEXT: movdqa %xmm5, %xmm4 +; SSE2-NEXT: pcmpgtd %xmm1, %xmm4 +; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm4[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm5, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] +; SSE2-NEXT: pand %xmm6, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm4[1,1,3,3] +; SSE2-NEXT: por %xmm1, %xmm4 +; SSE2-NEXT: pand %xmm4, %xmm0 +; SSE2-NEXT: pandn %xmm8, %xmm4 +; SSE2-NEXT: por %xmm0, %xmm4 +; SSE2-NEXT: movdqa {{.*#+}} xmm8 = [18446744073709551488,18446744073709551488] +; SSE2-NEXT: movdqa %xmm4, %xmm0 +; SSE2-NEXT: pxor %xmm2, %xmm0 +; SSE2-NEXT: movdqa {{.*#+}} xmm5 = [18446744071562067840,18446744071562067840] +; SSE2-NEXT: movdqa %xmm0, %xmm6 +; SSE2-NEXT: pcmpgtd %xmm5, %xmm6 +; SSE2-NEXT: pshufd {{.*#+}} xmm7 = xmm6[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm5, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm0[1,1,3,3] +; SSE2-NEXT: pand %xmm7, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm6[1,1,3,3] +; SSE2-NEXT: por %xmm1, %xmm0 +; SSE2-NEXT: pand %xmm0, %xmm4 +; SSE2-NEXT: pandn %xmm8, %xmm0 +; SSE2-NEXT: por %xmm4, %xmm0 +; SSE2-NEXT: pxor %xmm3, %xmm2 +; SSE2-NEXT: movdqa %xmm2, %xmm1 +; SSE2-NEXT: pcmpgtd %xmm5, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm1[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm5, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] +; SSE2-NEXT: pand %xmm4, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] +; SSE2-NEXT: por %xmm2, %xmm1 +; SSE2-NEXT: pand %xmm1, %xmm3 +; SSE2-NEXT: pandn %xmm8, %xmm1 +; SSE2-NEXT: por %xmm3, %xmm1 +; SSE2-NEXT: movdqa 
{{.*#+}} xmm2 = [255,0,0,0,0,0,0,0,255,0,0,0,0,0,0,0] +; SSE2-NEXT: pand %xmm2, %xmm1 +; SSE2-NEXT: pand %xmm2, %xmm0 +; SSE2-NEXT: packuswb %xmm1, %xmm0 +; SSE2-NEXT: packuswb %xmm0, %xmm0 +; SSE2-NEXT: packuswb %xmm0, %xmm0 +; SSE2-NEXT: movd %xmm0, (%rdi) +; SSE2-NEXT: retq +; +; SSSE3-LABEL: trunc_ssat_v4i64_v4i8_store: +; SSSE3: # %bb.0: +; SSSE3-NEXT: movdqa {{.*#+}} xmm8 = [127,127] +; SSSE3-NEXT: movdqa {{.*#+}} xmm2 = [2147483648,2147483648] +; SSSE3-NEXT: movdqa %xmm1, %xmm3 +; SSSE3-NEXT: pxor %xmm2, %xmm3 +; SSSE3-NEXT: movdqa {{.*#+}} xmm5 = [2147483775,2147483775] +; SSSE3-NEXT: movdqa %xmm5, %xmm6 +; SSSE3-NEXT: pcmpgtd %xmm3, %xmm6 +; SSSE3-NEXT: pshufd {{.*#+}} xmm7 = xmm6[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm5, %xmm3 +; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm3[1,1,3,3] +; SSSE3-NEXT: pand %xmm7, %xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm6[1,1,3,3] +; SSSE3-NEXT: por %xmm4, %xmm3 +; SSSE3-NEXT: pand %xmm3, %xmm1 +; SSSE3-NEXT: pandn %xmm8, %xmm3 +; SSSE3-NEXT: por %xmm1, %xmm3 +; SSSE3-NEXT: movdqa %xmm0, %xmm1 +; SSSE3-NEXT: pxor %xmm2, %xmm1 +; SSSE3-NEXT: movdqa %xmm5, %xmm4 +; SSSE3-NEXT: pcmpgtd %xmm1, %xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm4[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm5, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] +; SSSE3-NEXT: pand %xmm6, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm4[1,1,3,3] +; SSSE3-NEXT: por %xmm1, %xmm4 +; SSSE3-NEXT: pand %xmm4, %xmm0 +; SSSE3-NEXT: pandn %xmm8, %xmm4 +; SSSE3-NEXT: por %xmm0, %xmm4 +; SSSE3-NEXT: movdqa {{.*#+}} xmm8 = [18446744073709551488,18446744073709551488] +; SSSE3-NEXT: movdqa %xmm4, %xmm1 +; SSSE3-NEXT: pxor %xmm2, %xmm1 +; SSSE3-NEXT: movdqa {{.*#+}} xmm5 = [18446744071562067840,18446744071562067840] +; SSSE3-NEXT: movdqa %xmm1, %xmm6 +; SSSE3-NEXT: pcmpgtd %xmm5, %xmm6 +; SSSE3-NEXT: pshufd {{.*#+}} xmm7 = xmm6[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm5, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm1[1,1,3,3] +; SSSE3-NEXT: pand %xmm7, %xmm0 +; SSSE3-NEXT: 
pshufd {{.*#+}} xmm1 = xmm6[1,1,3,3] +; SSSE3-NEXT: por %xmm0, %xmm1 +; SSSE3-NEXT: pand %xmm1, %xmm4 +; SSSE3-NEXT: pandn %xmm8, %xmm1 +; SSSE3-NEXT: por %xmm4, %xmm1 +; SSSE3-NEXT: pxor %xmm3, %xmm2 +; SSSE3-NEXT: movdqa %xmm2, %xmm0 +; SSSE3-NEXT: pcmpgtd %xmm5, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm0[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm5, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] +; SSSE3-NEXT: pand %xmm4, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] +; SSSE3-NEXT: por %xmm2, %xmm0 +; SSSE3-NEXT: pand %xmm0, %xmm3 +; SSSE3-NEXT: pandn %xmm8, %xmm0 +; SSSE3-NEXT: por %xmm3, %xmm0 +; SSSE3-NEXT: movdqa {{.*#+}} xmm2 = <0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u> +; SSSE3-NEXT: pshufb %xmm2, %xmm0 +; SSSE3-NEXT: pshufb %xmm2, %xmm1 +; SSSE3-NEXT: punpcklwd {{.*#+}} xmm1 = xmm1[0],xmm0[0],xmm1[1],xmm0[1],xmm1[2],xmm0[2],xmm1[3],xmm0[3] +; SSSE3-NEXT: movd %xmm1, (%rdi) +; SSSE3-NEXT: retq +; +; SSE41-LABEL: trunc_ssat_v4i64_v4i8_store: +; SSE41: # %bb.0: +; SSE41-NEXT: movdqa %xmm0, %xmm2 +; SSE41-NEXT: movapd {{.*#+}} xmm4 = [127,127] +; SSE41-NEXT: movdqa {{.*#+}} xmm3 = [2147483648,2147483648] +; SSE41-NEXT: movdqa %xmm1, %xmm0 +; SSE41-NEXT: pxor %xmm3, %xmm0 +; SSE41-NEXT: movdqa {{.*#+}} xmm6 = [2147483775,2147483775] +; SSE41-NEXT: movdqa %xmm6, %xmm5 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm5 +; SSE41-NEXT: movdqa %xmm6, %xmm7 +; SSE41-NEXT: pcmpgtd %xmm0, %xmm7 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm7[0,0,2,2] +; SSE41-NEXT: pand %xmm5, %xmm0 +; SSE41-NEXT: por %xmm7, %xmm0 +; SSE41-NEXT: movapd %xmm4, %xmm5 +; SSE41-NEXT: blendvpd %xmm0, %xmm1, %xmm5 +; SSE41-NEXT: movdqa %xmm2, %xmm0 +; SSE41-NEXT: pxor %xmm3, %xmm0 +; SSE41-NEXT: movdqa %xmm6, %xmm1 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm1 +; SSE41-NEXT: pcmpgtd %xmm0, %xmm6 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm6[0,0,2,2] +; SSE41-NEXT: pand %xmm1, %xmm0 +; SSE41-NEXT: por %xmm6, %xmm0 +; SSE41-NEXT: blendvpd %xmm0, %xmm2, %xmm4 +; SSE41-NEXT: movapd {{.*#+}} xmm1 = 
[18446744073709551488,18446744073709551488] +; SSE41-NEXT: movapd %xmm4, %xmm2 +; SSE41-NEXT: xorpd %xmm3, %xmm2 +; SSE41-NEXT: movdqa {{.*#+}} xmm6 = [18446744071562067840,18446744071562067840] +; SSE41-NEXT: movapd %xmm2, %xmm7 +; SSE41-NEXT: pcmpeqd %xmm6, %xmm7 +; SSE41-NEXT: pcmpgtd %xmm6, %xmm2 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm2[0,0,2,2] +; SSE41-NEXT: pand %xmm7, %xmm0 +; SSE41-NEXT: por %xmm2, %xmm0 +; SSE41-NEXT: movapd %xmm1, %xmm2 +; SSE41-NEXT: blendvpd %xmm0, %xmm4, %xmm2 +; SSE41-NEXT: xorpd %xmm5, %xmm3 +; SSE41-NEXT: movapd %xmm3, %xmm4 +; SSE41-NEXT: pcmpeqd %xmm6, %xmm4 +; SSE41-NEXT: pcmpgtd %xmm6, %xmm3 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm3[0,0,2,2] +; SSE41-NEXT: pand %xmm4, %xmm0 +; SSE41-NEXT: por %xmm3, %xmm0 +; SSE41-NEXT: blendvpd %xmm0, %xmm5, %xmm1 +; SSE41-NEXT: movdqa {{.*#+}} xmm0 = <0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u> +; SSE41-NEXT: pshufb %xmm0, %xmm1 +; SSE41-NEXT: pshufb %xmm0, %xmm2 +; SSE41-NEXT: punpcklwd {{.*#+}} xmm2 = xmm2[0],xmm1[0],xmm2[1],xmm1[1],xmm2[2],xmm1[2],xmm2[3],xmm1[3] +; SSE41-NEXT: movd %xmm2, (%rdi) +; SSE41-NEXT: retq ; -; AVX1-LABEL: trunc_ssat_v8i32_v8i16: +; AVX1-LABEL: trunc_ssat_v4i64_v4i8_store: ; AVX1: # %bb.0: ; AVX1-NEXT: vextractf128 $1, %ymm0, %xmm1 -; AVX1-NEXT: vpackssdw %xmm1, %xmm0, %xmm0 +; AVX1-NEXT: vmovdqa {{.*#+}} xmm2 = [127,127] +; AVX1-NEXT: vpcmpgtq %xmm1, %xmm2, %xmm3 +; AVX1-NEXT: vpcmpgtq %xmm0, %xmm2, %xmm4 +; AVX1-NEXT: vblendvpd %xmm4, %xmm0, %xmm2, %xmm0 +; AVX1-NEXT: vmovdqa {{.*#+}} xmm4 = [18446744073709551488,18446744073709551488] +; AVX1-NEXT: vpcmpgtq %xmm4, %xmm0, %xmm5 +; AVX1-NEXT: vblendvpd %xmm3, %xmm1, %xmm2, %xmm1 +; AVX1-NEXT: vpcmpgtq %xmm4, %xmm1, %xmm2 +; AVX1-NEXT: vblendvpd %xmm2, %xmm1, %xmm4, %xmm1 +; AVX1-NEXT: vmovdqa {{.*#+}} xmm2 = <0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u> +; AVX1-NEXT: vpshufb %xmm2, %xmm1, %xmm1 +; AVX1-NEXT: vblendvpd %xmm5, %xmm0, %xmm4, %xmm0 +; AVX1-NEXT: vpshufb %xmm2, %xmm0, %xmm0 +; AVX1-NEXT: vpunpcklwd {{.*#+}} xmm0 = 
xmm0[0],xmm1[0],xmm0[1],xmm1[1],xmm0[2],xmm1[2],xmm0[3],xmm1[3] +; AVX1-NEXT: vmovd %xmm0, (%rdi) ; AVX1-NEXT: vzeroupper ; AVX1-NEXT: retq ; -; AVX2-LABEL: trunc_ssat_v8i32_v8i16: +; AVX2-LABEL: trunc_ssat_v4i64_v4i8_store: ; AVX2: # %bb.0: -; AVX2-NEXT: vextracti128 $1, %ymm0, %xmm1 -; AVX2-NEXT: vpackssdw %xmm1, %xmm0, %xmm0 +; AVX2-NEXT: vpbroadcastq {{.*#+}} ymm1 = [127,127,127,127] +; AVX2-NEXT: vpcmpgtq %ymm0, %ymm1, %ymm2 +; AVX2-NEXT: vblendvpd %ymm2, %ymm0, %ymm1, %ymm0 +; AVX2-NEXT: vpbroadcastq {{.*#+}} ymm1 = [18446744073709551488,18446744073709551488,18446744073709551488,18446744073709551488] +; AVX2-NEXT: vpcmpgtq %ymm1, %ymm0, %ymm2 +; AVX2-NEXT: vblendvpd %ymm2, %ymm0, %ymm1, %ymm0 +; AVX2-NEXT: vextractf128 $1, %ymm0, %xmm1 +; AVX2-NEXT: vmovdqa {{.*#+}} xmm2 = <0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u> +; AVX2-NEXT: vpshufb %xmm2, %xmm1, %xmm1 +; AVX2-NEXT: vpshufb %xmm2, %xmm0, %xmm0 +; AVX2-NEXT: vpunpcklwd {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1],xmm0[2],xmm1[2],xmm0[3],xmm1[3] +; AVX2-NEXT: vmovd %xmm0, (%rdi) ; AVX2-NEXT: vzeroupper ; AVX2-NEXT: retq ; -; AVX512F-LABEL: trunc_ssat_v8i32_v8i16: +; AVX512F-LABEL: trunc_ssat_v4i64_v4i8_store: ; AVX512F: # %bb.0: -; AVX512F-NEXT: vextracti128 $1, %ymm0, %xmm1 -; AVX512F-NEXT: vpackssdw %xmm1, %xmm0, %xmm0 +; AVX512F-NEXT: # kill: def $ymm0 killed $ymm0 def $zmm0 +; AVX512F-NEXT: vpminsq {{.*}}(%rip){1to8}, %zmm0, %zmm0 +; AVX512F-NEXT: vpmaxsq {{.*}}(%rip){1to8}, %zmm0, %zmm0 +; AVX512F-NEXT: vpmovqb %zmm0, %xmm0 +; AVX512F-NEXT: vmovd %xmm0, (%rdi) ; AVX512F-NEXT: vzeroupper ; AVX512F-NEXT: retq ; -; AVX512VL-LABEL: trunc_ssat_v8i32_v8i16: +; AVX512VL-LABEL: trunc_ssat_v4i64_v4i8_store: ; AVX512VL: # %bb.0: -; AVX512VL-NEXT: vpmovsdw %ymm0, %xmm0 +; AVX512VL-NEXT: vpmovsqb %ymm0, (%rdi) ; AVX512VL-NEXT: vzeroupper ; AVX512VL-NEXT: retq ; -; AVX512BW-LABEL: trunc_ssat_v8i32_v8i16: +; AVX512BW-LABEL: trunc_ssat_v4i64_v4i8_store: ; AVX512BW: # %bb.0: -; AVX512BW-NEXT: vextracti128 $1, %ymm0, %xmm1 
-; AVX512BW-NEXT: vpackssdw %xmm1, %xmm0, %xmm0 +; AVX512BW-NEXT: # kill: def $ymm0 killed $ymm0 def $zmm0 +; AVX512BW-NEXT: vpminsq {{.*}}(%rip){1to8}, %zmm0, %zmm0 +; AVX512BW-NEXT: vpmaxsq {{.*}}(%rip){1to8}, %zmm0, %zmm0 +; AVX512BW-NEXT: vpmovqb %zmm0, %xmm0 +; AVX512BW-NEXT: vmovd %xmm0, (%rdi) ; AVX512BW-NEXT: vzeroupper ; AVX512BW-NEXT: retq ; -; AVX512BWVL-LABEL: trunc_ssat_v8i32_v8i16: +; AVX512BWVL-LABEL: trunc_ssat_v4i64_v4i8_store: ; AVX512BWVL: # %bb.0: -; AVX512BWVL-NEXT: vpmovsdw %ymm0, %xmm0 +; AVX512BWVL-NEXT: vpmovsqb %ymm0, (%rdi) ; AVX512BWVL-NEXT: vzeroupper ; AVX512BWVL-NEXT: retq - %1 = icmp slt <8 x i32> %a0, - %2 = select <8 x i1> %1, <8 x i32> %a0, <8 x i32> - %3 = icmp sgt <8 x i32> %2, - %4 = select <8 x i1> %3, <8 x i32> %2, <8 x i32> - %5 = trunc <8 x i32> %4 to <8 x i16> - ret <8 x i16> %5 -} - -define <16 x i16> @trunc_ssat_v16i32_v16i16(<16 x i32> %a0) { -; SSE-LABEL: trunc_ssat_v16i32_v16i16: -; SSE: # %bb.0: -; SSE-NEXT: packssdw %xmm1, %xmm0 -; SSE-NEXT: packssdw %xmm3, %xmm2 -; SSE-NEXT: movdqa %xmm2, %xmm1 -; SSE-NEXT: retq -; -; AVX1-LABEL: trunc_ssat_v16i32_v16i16: -; AVX1: # %bb.0: -; AVX1-NEXT: vextractf128 $1, %ymm1, %xmm2 -; AVX1-NEXT: vpackssdw %xmm2, %xmm1, %xmm1 -; AVX1-NEXT: vextractf128 $1, %ymm0, %xmm2 -; AVX1-NEXT: vpackssdw %xmm2, %xmm0, %xmm0 -; AVX1-NEXT: vinsertf128 $1, %xmm1, %ymm0, %ymm0 -; AVX1-NEXT: retq -; -; AVX2-LABEL: trunc_ssat_v16i32_v16i16: -; AVX2: # %bb.0: -; AVX2-NEXT: vpackssdw %ymm1, %ymm0, %ymm0 -; AVX2-NEXT: vpermq {{.*#+}} ymm0 = ymm0[0,2,1,3] -; AVX2-NEXT: retq -; -; AVX512-LABEL: trunc_ssat_v16i32_v16i16: -; AVX512: # %bb.0: -; AVX512-NEXT: vpmovsdw %zmm0, %ymm0 -; AVX512-NEXT: retq - %1 = icmp slt <16 x i32> %a0, - %2 = select <16 x i1> %1, <16 x i32> %a0, <16 x i32> - %3 = icmp sgt <16 x i32> %2, - %4 = select <16 x i1> %3, <16 x i32> %2, <16 x i32> - %5 = trunc <16 x i32> %4 to <16 x i16> - ret <16 x i16> %5 + %1 = icmp slt <4 x i64> %a0, + %2 = select <4 x i1> %1, <4 x i64> %a0, <4 x 
i64> + %3 = icmp sgt <4 x i64> %2, + %4 = select <4 x i1> %3, <4 x i64> %2, <4 x i64> + %5 = trunc <4 x i64> %4 to <4 x i8> + store <4 x i8> %5, <4 x i8> *%p1 + ret void } -; -; Signed saturation truncation to v16i8 -; - define <8 x i8> @trunc_ssat_v8i64_v8i8(<8 x i64> %a0) { ; SSE2-LABEL: trunc_ssat_v8i64_v8i8: ; SSE2: # %bb.0: @@ -2762,6 +3927,208 @@ define <16 x i8> @trunc_ssat_v16i64_v16i ret <16 x i8> %5 } +define <4 x i8> @trunc_ssat_v4i32_v4i8(<4 x i32> %a0) { +; SSE2-LABEL: trunc_ssat_v4i32_v4i8: +; SSE2: # %bb.0: +; SSE2-NEXT: movdqa {{.*#+}} xmm1 = [127,127,127,127] +; SSE2-NEXT: movdqa %xmm1, %xmm2 +; SSE2-NEXT: pcmpgtd %xmm0, %xmm2 +; SSE2-NEXT: pand %xmm2, %xmm0 +; SSE2-NEXT: pandn %xmm1, %xmm2 +; SSE2-NEXT: por %xmm0, %xmm2 +; SSE2-NEXT: movdqa {{.*#+}} xmm1 = [4294967168,4294967168,4294967168,4294967168] +; SSE2-NEXT: movdqa %xmm2, %xmm0 +; SSE2-NEXT: pcmpgtd %xmm1, %xmm0 +; SSE2-NEXT: pand %xmm0, %xmm2 +; SSE2-NEXT: pandn %xmm1, %xmm0 +; SSE2-NEXT: por %xmm2, %xmm0 +; SSE2-NEXT: pand {{.*}}(%rip), %xmm0 +; SSE2-NEXT: packuswb %xmm0, %xmm0 +; SSE2-NEXT: packuswb %xmm0, %xmm0 +; SSE2-NEXT: retq +; +; SSSE3-LABEL: trunc_ssat_v4i32_v4i8: +; SSSE3: # %bb.0: +; SSSE3-NEXT: movdqa {{.*#+}} xmm1 = [127,127,127,127] +; SSSE3-NEXT: movdqa %xmm1, %xmm2 +; SSSE3-NEXT: pcmpgtd %xmm0, %xmm2 +; SSSE3-NEXT: pand %xmm2, %xmm0 +; SSSE3-NEXT: pandn %xmm1, %xmm2 +; SSSE3-NEXT: por %xmm0, %xmm2 +; SSSE3-NEXT: movdqa {{.*#+}} xmm1 = [4294967168,4294967168,4294967168,4294967168] +; SSSE3-NEXT: movdqa %xmm2, %xmm0 +; SSSE3-NEXT: pcmpgtd %xmm1, %xmm0 +; SSSE3-NEXT: pand %xmm0, %xmm2 +; SSSE3-NEXT: pandn %xmm1, %xmm0 +; SSSE3-NEXT: por %xmm2, %xmm0 +; SSSE3-NEXT: pshufb {{.*#+}} xmm0 = xmm0[0,4,8,12,u,u,u,u,u,u,u,u,u,u,u,u] +; SSSE3-NEXT: retq +; +; SSE41-LABEL: trunc_ssat_v4i32_v4i8: +; SSE41: # %bb.0: +; SSE41-NEXT: pminsd {{.*}}(%rip), %xmm0 +; SSE41-NEXT: pmaxsd {{.*}}(%rip), %xmm0 +; SSE41-NEXT: pshufb {{.*#+}} xmm0 = xmm0[0,4,8,12,u,u,u,u,u,u,u,u,u,u,u,u] +; 
SSE41-NEXT: retq +; +; AVX1-LABEL: trunc_ssat_v4i32_v4i8: +; AVX1: # %bb.0: +; AVX1-NEXT: vpminsd {{.*}}(%rip), %xmm0, %xmm0 +; AVX1-NEXT: vpmaxsd {{.*}}(%rip), %xmm0, %xmm0 +; AVX1-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,4,8,12,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX1-NEXT: retq +; +; AVX2-LABEL: trunc_ssat_v4i32_v4i8: +; AVX2: # %bb.0: +; AVX2-NEXT: vpbroadcastd {{.*#+}} xmm1 = [127,127,127,127] +; AVX2-NEXT: vpminsd %xmm1, %xmm0, %xmm0 +; AVX2-NEXT: vpbroadcastd {{.*#+}} xmm1 = [4294967168,4294967168,4294967168,4294967168] +; AVX2-NEXT: vpmaxsd %xmm1, %xmm0, %xmm0 +; AVX2-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,4,8,12,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX2-NEXT: retq +; +; AVX512F-LABEL: trunc_ssat_v4i32_v4i8: +; AVX512F: # %bb.0: +; AVX512F-NEXT: vpbroadcastd {{.*#+}} xmm1 = [127,127,127,127] +; AVX512F-NEXT: vpminsd %xmm1, %xmm0, %xmm0 +; AVX512F-NEXT: vpbroadcastd {{.*#+}} xmm1 = [4294967168,4294967168,4294967168,4294967168] +; AVX512F-NEXT: vpmaxsd %xmm1, %xmm0, %xmm0 +; AVX512F-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,4,8,12,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX512F-NEXT: retq +; +; AVX512VL-LABEL: trunc_ssat_v4i32_v4i8: +; AVX512VL: # %bb.0: +; AVX512VL-NEXT: vpminsd {{.*}}(%rip){1to4}, %xmm0, %xmm0 +; AVX512VL-NEXT: vpmaxsd {{.*}}(%rip){1to4}, %xmm0, %xmm0 +; AVX512VL-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,4,8,12,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX512VL-NEXT: retq +; +; AVX512BW-LABEL: trunc_ssat_v4i32_v4i8: +; AVX512BW: # %bb.0: +; AVX512BW-NEXT: vpbroadcastd {{.*#+}} xmm1 = [127,127,127,127] +; AVX512BW-NEXT: vpminsd %xmm1, %xmm0, %xmm0 +; AVX512BW-NEXT: vpbroadcastd {{.*#+}} xmm1 = [4294967168,4294967168,4294967168,4294967168] +; AVX512BW-NEXT: vpmaxsd %xmm1, %xmm0, %xmm0 +; AVX512BW-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,4,8,12,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX512BW-NEXT: retq +; +; AVX512BWVL-LABEL: trunc_ssat_v4i32_v4i8: +; AVX512BWVL: # %bb.0: +; AVX512BWVL-NEXT: vpminsd {{.*}}(%rip){1to4}, %xmm0, %xmm0 +; AVX512BWVL-NEXT: vpmaxsd {{.*}}(%rip){1to4}, %xmm0, %xmm0 +; AVX512BWVL-NEXT: 
vpshufb {{.*#+}} xmm0 = xmm0[0,4,8,12,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX512BWVL-NEXT: retq + %1 = icmp slt <4 x i32> %a0, + %2 = select <4 x i1> %1, <4 x i32> %a0, <4 x i32> + %3 = icmp sgt <4 x i32> %2, + %4 = select <4 x i1> %3, <4 x i32> %2, <4 x i32> + %5 = trunc <4 x i32> %4 to <4 x i8> + ret <4 x i8> %5 +} + +define void @trunc_ssat_v4i32_v4i8_store(<4 x i32> %a0, <4 x i8> *%p1) { +; SSE2-LABEL: trunc_ssat_v4i32_v4i8_store: +; SSE2: # %bb.0: +; SSE2-NEXT: movdqa {{.*#+}} xmm1 = [127,127,127,127] +; SSE2-NEXT: movdqa %xmm1, %xmm2 +; SSE2-NEXT: pcmpgtd %xmm0, %xmm2 +; SSE2-NEXT: pand %xmm2, %xmm0 +; SSE2-NEXT: pandn %xmm1, %xmm2 +; SSE2-NEXT: por %xmm0, %xmm2 +; SSE2-NEXT: movdqa {{.*#+}} xmm0 = [4294967168,4294967168,4294967168,4294967168] +; SSE2-NEXT: movdqa %xmm2, %xmm1 +; SSE2-NEXT: pcmpgtd %xmm0, %xmm1 +; SSE2-NEXT: pand %xmm1, %xmm2 +; SSE2-NEXT: pandn %xmm0, %xmm1 +; SSE2-NEXT: por %xmm2, %xmm1 +; SSE2-NEXT: pand {{.*}}(%rip), %xmm1 +; SSE2-NEXT: packuswb %xmm0, %xmm1 +; SSE2-NEXT: packuswb %xmm0, %xmm1 +; SSE2-NEXT: movd %xmm1, (%rdi) +; SSE2-NEXT: retq +; +; SSSE3-LABEL: trunc_ssat_v4i32_v4i8_store: +; SSSE3: # %bb.0: +; SSSE3-NEXT: movdqa {{.*#+}} xmm1 = [127,127,127,127] +; SSSE3-NEXT: movdqa %xmm1, %xmm2 +; SSSE3-NEXT: pcmpgtd %xmm0, %xmm2 +; SSSE3-NEXT: pand %xmm2, %xmm0 +; SSSE3-NEXT: pandn %xmm1, %xmm2 +; SSSE3-NEXT: por %xmm0, %xmm2 +; SSSE3-NEXT: movdqa {{.*#+}} xmm0 = [4294967168,4294967168,4294967168,4294967168] +; SSSE3-NEXT: movdqa %xmm2, %xmm1 +; SSSE3-NEXT: pcmpgtd %xmm0, %xmm1 +; SSSE3-NEXT: pand %xmm1, %xmm2 +; SSSE3-NEXT: pandn %xmm0, %xmm1 +; SSSE3-NEXT: por %xmm2, %xmm1 +; SSSE3-NEXT: pshufb {{.*#+}} xmm1 = xmm1[0,4,8,12,u,u,u,u,u,u,u,u,u,u,u,u] +; SSSE3-NEXT: movd %xmm1, (%rdi) +; SSSE3-NEXT: retq +; +; SSE41-LABEL: trunc_ssat_v4i32_v4i8_store: +; SSE41: # %bb.0: +; SSE41-NEXT: pminsd {{.*}}(%rip), %xmm0 +; SSE41-NEXT: pmaxsd {{.*}}(%rip), %xmm0 +; SSE41-NEXT: pshufb {{.*#+}} xmm0 = xmm0[0,4,8,12,u,u,u,u,u,u,u,u,u,u,u,u] +; 
SSE41-NEXT: movd %xmm0, (%rdi) +; SSE41-NEXT: retq +; +; AVX1-LABEL: trunc_ssat_v4i32_v4i8_store: +; AVX1: # %bb.0: +; AVX1-NEXT: vpminsd {{.*}}(%rip), %xmm0, %xmm0 +; AVX1-NEXT: vpmaxsd {{.*}}(%rip), %xmm0, %xmm0 +; AVX1-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,4,8,12,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX1-NEXT: vmovd %xmm0, (%rdi) +; AVX1-NEXT: retq +; +; AVX2-LABEL: trunc_ssat_v4i32_v4i8_store: +; AVX2: # %bb.0: +; AVX2-NEXT: vpbroadcastd {{.*#+}} xmm1 = [127,127,127,127] +; AVX2-NEXT: vpminsd %xmm1, %xmm0, %xmm0 +; AVX2-NEXT: vpbroadcastd {{.*#+}} xmm1 = [4294967168,4294967168,4294967168,4294967168] +; AVX2-NEXT: vpmaxsd %xmm1, %xmm0, %xmm0 +; AVX2-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,4,8,12,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX2-NEXT: vmovd %xmm0, (%rdi) +; AVX2-NEXT: retq +; +; AVX512F-LABEL: trunc_ssat_v4i32_v4i8_store: +; AVX512F: # %bb.0: +; AVX512F-NEXT: vpbroadcastd {{.*#+}} xmm1 = [127,127,127,127] +; AVX512F-NEXT: vpminsd %xmm1, %xmm0, %xmm0 +; AVX512F-NEXT: vpbroadcastd {{.*#+}} xmm1 = [4294967168,4294967168,4294967168,4294967168] +; AVX512F-NEXT: vpmaxsd %xmm1, %xmm0, %xmm0 +; AVX512F-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,4,8,12,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX512F-NEXT: vmovd %xmm0, (%rdi) +; AVX512F-NEXT: retq +; +; AVX512VL-LABEL: trunc_ssat_v4i32_v4i8_store: +; AVX512VL: # %bb.0: +; AVX512VL-NEXT: vpmovsdb %xmm0, (%rdi) +; AVX512VL-NEXT: retq +; +; AVX512BW-LABEL: trunc_ssat_v4i32_v4i8_store: +; AVX512BW: # %bb.0: +; AVX512BW-NEXT: vpbroadcastd {{.*#+}} xmm1 = [127,127,127,127] +; AVX512BW-NEXT: vpminsd %xmm1, %xmm0, %xmm0 +; AVX512BW-NEXT: vpbroadcastd {{.*#+}} xmm1 = [4294967168,4294967168,4294967168,4294967168] +; AVX512BW-NEXT: vpmaxsd %xmm1, %xmm0, %xmm0 +; AVX512BW-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,4,8,12,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX512BW-NEXT: vmovd %xmm0, (%rdi) +; AVX512BW-NEXT: retq +; +; AVX512BWVL-LABEL: trunc_ssat_v4i32_v4i8_store: +; AVX512BWVL: # %bb.0: +; AVX512BWVL-NEXT: vpmovsdb %xmm0, (%rdi) +; AVX512BWVL-NEXT: retq + %1 = icmp slt <4 x 
i32> %a0, + %2 = select <4 x i1> %1, <4 x i32> %a0, <4 x i32> + %3 = icmp sgt <4 x i32> %2, + %4 = select <4 x i1> %3, <4 x i32> %2, <4 x i32> + %5 = trunc <4 x i32> %4 to <4 x i8> + store <4 x i8> %5, <4 x i8> *%p1 + ret void +} + define <8 x i8> @trunc_ssat_v8i32_v8i8(<8 x i32> %a0) { ; SSE-LABEL: trunc_ssat_v8i32_v8i8: ; SSE: # %bb.0: @@ -2924,6 +4291,90 @@ define <16 x i8> @trunc_ssat_v16i32_v16i ret <16 x i8> %5 } +define <8 x i8> @trunc_ssat_v8i16_v8i8(<8 x i16> %a0) { +; SSE-LABEL: trunc_ssat_v8i16_v8i8: +; SSE: # %bb.0: +; SSE-NEXT: packsswb %xmm0, %xmm0 +; SSE-NEXT: retq +; +; AVX-LABEL: trunc_ssat_v8i16_v8i8: +; AVX: # %bb.0: +; AVX-NEXT: vpacksswb %xmm0, %xmm0, %xmm0 +; AVX-NEXT: retq +; +; AVX512F-LABEL: trunc_ssat_v8i16_v8i8: +; AVX512F: # %bb.0: +; AVX512F-NEXT: vpacksswb %xmm0, %xmm0, %xmm0 +; AVX512F-NEXT: retq +; +; AVX512VL-LABEL: trunc_ssat_v8i16_v8i8: +; AVX512VL: # %bb.0: +; AVX512VL-NEXT: vpacksswb %xmm0, %xmm0, %xmm0 +; AVX512VL-NEXT: retq +; +; AVX512BW-LABEL: trunc_ssat_v8i16_v8i8: +; AVX512BW: # %bb.0: +; AVX512BW-NEXT: vpacksswb %xmm0, %xmm0, %xmm0 +; AVX512BW-NEXT: retq +; +; AVX512BWVL-LABEL: trunc_ssat_v8i16_v8i8: +; AVX512BWVL: # %bb.0: +; AVX512BWVL-NEXT: vpminsw {{.*}}(%rip), %xmm0, %xmm0 +; AVX512BWVL-NEXT: vpmaxsw {{.*}}(%rip), %xmm0, %xmm0 +; AVX512BWVL-NEXT: vpacksswb %xmm0, %xmm0, %xmm0 +; AVX512BWVL-NEXT: retq + %1 = icmp slt <8 x i16> %a0, + %2 = select <8 x i1> %1, <8 x i16> %a0, <8 x i16> + %3 = icmp sgt <8 x i16> %2, + %4 = select <8 x i1> %3, <8 x i16> %2, <8 x i16> + %5 = trunc <8 x i16> %4 to <8 x i8> + ret <8 x i8> %5 +} + +define void @trunc_ssat_v8i16_v8i8_store(<8 x i16> %a0, <8 x i8> *%p1) { +; SSE-LABEL: trunc_ssat_v8i16_v8i8_store: +; SSE: # %bb.0: +; SSE-NEXT: packsswb %xmm0, %xmm0 +; SSE-NEXT: movq %xmm0, (%rdi) +; SSE-NEXT: retq +; +; AVX-LABEL: trunc_ssat_v8i16_v8i8_store: +; AVX: # %bb.0: +; AVX-NEXT: vpacksswb %xmm0, %xmm0, %xmm0 +; AVX-NEXT: vmovq %xmm0, (%rdi) +; AVX-NEXT: retq +; +; AVX512F-LABEL: 
trunc_ssat_v8i16_v8i8_store: +; AVX512F: # %bb.0: +; AVX512F-NEXT: vpacksswb %xmm0, %xmm0, %xmm0 +; AVX512F-NEXT: vmovq %xmm0, (%rdi) +; AVX512F-NEXT: retq +; +; AVX512VL-LABEL: trunc_ssat_v8i16_v8i8_store: +; AVX512VL: # %bb.0: +; AVX512VL-NEXT: vpacksswb %xmm0, %xmm0, %xmm0 +; AVX512VL-NEXT: vmovq %xmm0, (%rdi) +; AVX512VL-NEXT: retq +; +; AVX512BW-LABEL: trunc_ssat_v8i16_v8i8_store: +; AVX512BW: # %bb.0: +; AVX512BW-NEXT: vpacksswb %xmm0, %xmm0, %xmm0 +; AVX512BW-NEXT: vmovq %xmm0, (%rdi) +; AVX512BW-NEXT: retq +; +; AVX512BWVL-LABEL: trunc_ssat_v8i16_v8i8_store: +; AVX512BWVL: # %bb.0: +; AVX512BWVL-NEXT: vpmovswb %xmm0, (%rdi) +; AVX512BWVL-NEXT: retq + %1 = icmp slt <8 x i16> %a0, + %2 = select <8 x i1> %1, <8 x i16> %a0, <8 x i16> + %3 = icmp sgt <8 x i16> %2, + %4 = select <8 x i1> %3, <8 x i16> %2, <8 x i16> + %5 = trunc <8 x i16> %4 to <8 x i8> + store <8 x i8> %5, <8 x i8> *%p1 + ret void +} + define <16 x i8> @trunc_ssat_v16i16_v16i8(<16 x i16> %a0) { ; SSE-LABEL: trunc_ssat_v16i16_v16i8: ; SSE: # %bb.0: Modified: llvm/trunk/test/CodeGen/X86/vector-trunc-usat.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/vector-trunc-usat.ll?rev=374505&r1=374504&r2=374505&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/vector-trunc-usat.ll (original) +++ llvm/trunk/test/CodeGen/X86/vector-trunc-usat.ll Thu Oct 10 20:46:39 2019 @@ -447,6 +447,399 @@ define <8 x i32> @trunc_usat_v8i64_v8i32 ; Unsigned saturation truncation to vXi16 ; +define <4 x i16> @trunc_usat_v4i64_v4i16(<4 x i64> %a0) { +; SSE2-LABEL: trunc_usat_v4i64_v4i16: +; SSE2: # %bb.0: +; SSE2-NEXT: movdqa {{.*#+}} xmm2 = [65535,65535] +; SSE2-NEXT: movdqa {{.*#+}} xmm3 = [9223372039002259456,9223372039002259456] +; SSE2-NEXT: movdqa %xmm0, %xmm4 +; SSE2-NEXT: pxor %xmm3, %xmm4 +; SSE2-NEXT: movdqa {{.*#+}} xmm5 = [9223372039002324991,9223372039002324991] +; SSE2-NEXT: movdqa %xmm5, %xmm6 +; SSE2-NEXT: pcmpgtd 
%xmm4, %xmm6 +; SSE2-NEXT: pshufd {{.*#+}} xmm7 = xmm6[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm5, %xmm4 +; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm4[1,1,3,3] +; SSE2-NEXT: pand %xmm7, %xmm4 +; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm6[1,1,3,3] +; SSE2-NEXT: por %xmm4, %xmm6 +; SSE2-NEXT: pand %xmm6, %xmm0 +; SSE2-NEXT: pandn %xmm2, %xmm6 +; SSE2-NEXT: por %xmm0, %xmm6 +; SSE2-NEXT: pxor %xmm1, %xmm3 +; SSE2-NEXT: movdqa %xmm5, %xmm0 +; SSE2-NEXT: pcmpgtd %xmm3, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm0[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm5, %xmm3 +; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm3[1,1,3,3] +; SSE2-NEXT: pand %xmm4, %xmm3 +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] +; SSE2-NEXT: por %xmm3, %xmm0 +; SSE2-NEXT: pand %xmm0, %xmm1 +; SSE2-NEXT: pandn %xmm2, %xmm0 +; SSE2-NEXT: por %xmm1, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm0[0,2,2,3] +; SSE2-NEXT: pshuflw {{.*#+}} xmm1 = xmm0[0,2,2,3,4,5,6,7] +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm6[0,2,2,3] +; SSE2-NEXT: pshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7] +; SSE2-NEXT: punpckldq {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1] +; SSE2-NEXT: retq +; +; SSSE3-LABEL: trunc_usat_v4i64_v4i16: +; SSSE3: # %bb.0: +; SSSE3-NEXT: movdqa {{.*#+}} xmm2 = [65535,65535] +; SSSE3-NEXT: movdqa {{.*#+}} xmm3 = [9223372039002259456,9223372039002259456] +; SSSE3-NEXT: movdqa %xmm0, %xmm4 +; SSSE3-NEXT: pxor %xmm3, %xmm4 +; SSSE3-NEXT: movdqa {{.*#+}} xmm5 = [9223372039002324991,9223372039002324991] +; SSSE3-NEXT: movdqa %xmm5, %xmm6 +; SSSE3-NEXT: pcmpgtd %xmm4, %xmm6 +; SSSE3-NEXT: pshufd {{.*#+}} xmm7 = xmm6[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm5, %xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm4[1,1,3,3] +; SSSE3-NEXT: pand %xmm7, %xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm6[1,1,3,3] +; SSSE3-NEXT: por %xmm4, %xmm6 +; SSSE3-NEXT: pand %xmm6, %xmm0 +; SSSE3-NEXT: pandn %xmm2, %xmm6 +; SSSE3-NEXT: por %xmm0, %xmm6 +; SSSE3-NEXT: pxor %xmm1, %xmm3 +; SSSE3-NEXT: movdqa %xmm5, %xmm0 +; SSSE3-NEXT: pcmpgtd %xmm3, %xmm0 +; 
SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm0[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm5, %xmm3 +; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm3[1,1,3,3] +; SSSE3-NEXT: pand %xmm4, %xmm3 +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] +; SSSE3-NEXT: por %xmm3, %xmm0 +; SSSE3-NEXT: pand %xmm0, %xmm1 +; SSSE3-NEXT: pandn %xmm2, %xmm0 +; SSSE3-NEXT: por %xmm1, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm0[0,2,2,3] +; SSSE3-NEXT: pshuflw {{.*#+}} xmm1 = xmm0[0,2,2,3,4,5,6,7] +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm6[0,2,2,3] +; SSSE3-NEXT: pshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7] +; SSSE3-NEXT: punpckldq {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1] +; SSSE3-NEXT: retq +; +; SSE41-LABEL: trunc_usat_v4i64_v4i16: +; SSE41: # %bb.0: +; SSE41-NEXT: movdqa %xmm0, %xmm2 +; SSE41-NEXT: movapd {{.*#+}} xmm3 = [65535,65535] +; SSE41-NEXT: movdqa {{.*#+}} xmm4 = [9223372039002259456,9223372039002259456] +; SSE41-NEXT: pxor %xmm4, %xmm0 +; SSE41-NEXT: movdqa {{.*#+}} xmm5 = [9223372039002324991,9223372039002324991] +; SSE41-NEXT: movdqa %xmm5, %xmm6 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm6 +; SSE41-NEXT: movdqa %xmm5, %xmm7 +; SSE41-NEXT: pcmpgtd %xmm0, %xmm7 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm7[0,0,2,2] +; SSE41-NEXT: pand %xmm6, %xmm0 +; SSE41-NEXT: por %xmm7, %xmm0 +; SSE41-NEXT: movapd %xmm3, %xmm6 +; SSE41-NEXT: blendvpd %xmm0, %xmm2, %xmm6 +; SSE41-NEXT: pxor %xmm1, %xmm4 +; SSE41-NEXT: movdqa %xmm5, %xmm2 +; SSE41-NEXT: pcmpeqd %xmm4, %xmm2 +; SSE41-NEXT: pcmpgtd %xmm4, %xmm5 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm5[0,0,2,2] +; SSE41-NEXT: pand %xmm2, %xmm0 +; SSE41-NEXT: por %xmm5, %xmm0 +; SSE41-NEXT: blendvpd %xmm0, %xmm1, %xmm3 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm3[0,2,2,3] +; SSE41-NEXT: pshuflw {{.*#+}} xmm1 = xmm0[0,2,2,3,4,5,6,7] +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm6[0,2,2,3] +; SSE41-NEXT: pshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7] +; SSE41-NEXT: punpckldq {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1] +; SSE41-NEXT: retq +; +; AVX1-LABEL: 
trunc_usat_v4i64_v4i16: +; AVX1: # %bb.0: +; AVX1-NEXT: vmovdqa {{.*#+}} xmm1 = [9223372036854775808,9223372036854775808] +; AVX1-NEXT: vpxor %xmm1, %xmm0, %xmm2 +; AVX1-NEXT: vmovdqa {{.*#+}} xmm3 = [9223372036854841343,9223372036854841343] +; AVX1-NEXT: vpcmpgtq %xmm2, %xmm3, %xmm2 +; AVX1-NEXT: vextractf128 $1, %ymm0, %xmm4 +; AVX1-NEXT: vpxor %xmm1, %xmm4, %xmm1 +; AVX1-NEXT: vpcmpgtq %xmm1, %xmm3, %xmm1 +; AVX1-NEXT: vmovapd {{.*#+}} xmm3 = [65535,65535] +; AVX1-NEXT: vblendvpd %xmm1, %xmm4, %xmm3, %xmm1 +; AVX1-NEXT: vpermilps {{.*#+}} xmm1 = xmm1[0,2,2,3] +; AVX1-NEXT: vpshuflw {{.*#+}} xmm1 = xmm1[0,2,2,3,4,5,6,7] +; AVX1-NEXT: vblendvpd %xmm2, %xmm0, %xmm3, %xmm0 +; AVX1-NEXT: vpermilps {{.*#+}} xmm0 = xmm0[0,2,2,3] +; AVX1-NEXT: vpshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7] +; AVX1-NEXT: vpunpckldq {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1] +; AVX1-NEXT: vzeroupper +; AVX1-NEXT: retq +; +; AVX2-SLOW-LABEL: trunc_usat_v4i64_v4i16: +; AVX2-SLOW: # %bb.0: +; AVX2-SLOW-NEXT: vbroadcastsd {{.*#+}} ymm1 = [65535,65535,65535,65535] +; AVX2-SLOW-NEXT: vpbroadcastq {{.*#+}} ymm2 = [9223372036854775808,9223372036854775808,9223372036854775808,9223372036854775808] +; AVX2-SLOW-NEXT: vpxor %ymm2, %ymm0, %ymm2 +; AVX2-SLOW-NEXT: vpbroadcastq {{.*#+}} ymm3 = [9223372036854841343,9223372036854841343,9223372036854841343,9223372036854841343] +; AVX2-SLOW-NEXT: vpcmpgtq %ymm2, %ymm3, %ymm2 +; AVX2-SLOW-NEXT: vblendvpd %ymm2, %ymm0, %ymm1, %ymm0 +; AVX2-SLOW-NEXT: vextractf128 $1, %ymm0, %xmm1 +; AVX2-SLOW-NEXT: vpermilps {{.*#+}} xmm1 = xmm1[0,2,2,3] +; AVX2-SLOW-NEXT: vpshuflw {{.*#+}} xmm1 = xmm1[0,2,2,3,4,5,6,7] +; AVX2-SLOW-NEXT: vpermilps {{.*#+}} xmm0 = xmm0[0,2,2,3] +; AVX2-SLOW-NEXT: vpshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7] +; AVX2-SLOW-NEXT: vpunpckldq {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1] +; AVX2-SLOW-NEXT: vzeroupper +; AVX2-SLOW-NEXT: retq +; +; AVX2-FAST-LABEL: trunc_usat_v4i64_v4i16: +; AVX2-FAST: # %bb.0: +; AVX2-FAST-NEXT: vbroadcastsd 
{{.*#+}} ymm1 = [65535,65535,65535,65535] +; AVX2-FAST-NEXT: vpbroadcastq {{.*#+}} ymm2 = [9223372036854775808,9223372036854775808,9223372036854775808,9223372036854775808] +; AVX2-FAST-NEXT: vpxor %ymm2, %ymm0, %ymm2 +; AVX2-FAST-NEXT: vpbroadcastq {{.*#+}} ymm3 = [9223372036854841343,9223372036854841343,9223372036854841343,9223372036854841343] +; AVX2-FAST-NEXT: vpcmpgtq %ymm2, %ymm3, %ymm2 +; AVX2-FAST-NEXT: vblendvpd %ymm2, %ymm0, %ymm1, %ymm0 +; AVX2-FAST-NEXT: vextractf128 $1, %ymm0, %xmm1 +; AVX2-FAST-NEXT: vmovdqa {{.*#+}} xmm2 = [0,1,8,9,8,9,10,11,8,9,10,11,12,13,14,15] +; AVX2-FAST-NEXT: vpshufb %xmm2, %xmm1, %xmm1 +; AVX2-FAST-NEXT: vpshufb %xmm2, %xmm0, %xmm0 +; AVX2-FAST-NEXT: vpunpckldq {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1] +; AVX2-FAST-NEXT: vzeroupper +; AVX2-FAST-NEXT: retq +; +; AVX512F-LABEL: trunc_usat_v4i64_v4i16: +; AVX512F: # %bb.0: +; AVX512F-NEXT: # kill: def $ymm0 killed $ymm0 def $zmm0 +; AVX512F-NEXT: vpminuq {{.*}}(%rip){1to8}, %zmm0, %zmm0 +; AVX512F-NEXT: vpmovqw %zmm0, %xmm0 +; AVX512F-NEXT: vzeroupper +; AVX512F-NEXT: retq +; +; AVX512VL-LABEL: trunc_usat_v4i64_v4i16: +; AVX512VL: # %bb.0: +; AVX512VL-NEXT: vpmovusqw %ymm0, %xmm0 +; AVX512VL-NEXT: vzeroupper +; AVX512VL-NEXT: retq +; +; AVX512BW-LABEL: trunc_usat_v4i64_v4i16: +; AVX512BW: # %bb.0: +; AVX512BW-NEXT: # kill: def $ymm0 killed $ymm0 def $zmm0 +; AVX512BW-NEXT: vpminuq {{.*}}(%rip){1to8}, %zmm0, %zmm0 +; AVX512BW-NEXT: vpmovqw %zmm0, %xmm0 +; AVX512BW-NEXT: vzeroupper +; AVX512BW-NEXT: retq +; +; AVX512BWVL-LABEL: trunc_usat_v4i64_v4i16: +; AVX512BWVL: # %bb.0: +; AVX512BWVL-NEXT: vpmovusqw %ymm0, %xmm0 +; AVX512BWVL-NEXT: vzeroupper +; AVX512BWVL-NEXT: retq + %1 = icmp ult <4 x i64> %a0, + %2 = select <4 x i1> %1, <4 x i64> %a0, <4 x i64> + %3 = trunc <4 x i64> %2 to <4 x i16> + ret <4 x i16> %3 +} + +define void @trunc_usat_v4i64_v4i16_store(<4 x i64> %a0, <4 x i16> *%p1) { +; SSE2-LABEL: trunc_usat_v4i64_v4i16_store: +; SSE2: # %bb.0: +; SSE2-NEXT: movdqa 
{{.*#+}} xmm2 = [65535,65535] +; SSE2-NEXT: movdqa {{.*#+}} xmm3 = [9223372039002259456,9223372039002259456] +; SSE2-NEXT: movdqa %xmm0, %xmm4 +; SSE2-NEXT: pxor %xmm3, %xmm4 +; SSE2-NEXT: movdqa {{.*#+}} xmm5 = [9223372039002324991,9223372039002324991] +; SSE2-NEXT: movdqa %xmm5, %xmm6 +; SSE2-NEXT: pcmpgtd %xmm4, %xmm6 +; SSE2-NEXT: pshufd {{.*#+}} xmm7 = xmm6[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm5, %xmm4 +; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm4[1,1,3,3] +; SSE2-NEXT: pand %xmm7, %xmm4 +; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm6[1,1,3,3] +; SSE2-NEXT: por %xmm4, %xmm6 +; SSE2-NEXT: pand %xmm6, %xmm0 +; SSE2-NEXT: pandn %xmm2, %xmm6 +; SSE2-NEXT: por %xmm0, %xmm6 +; SSE2-NEXT: pxor %xmm1, %xmm3 +; SSE2-NEXT: movdqa %xmm5, %xmm0 +; SSE2-NEXT: pcmpgtd %xmm3, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm0[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm5, %xmm3 +; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm3[1,1,3,3] +; SSE2-NEXT: pand %xmm4, %xmm3 +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] +; SSE2-NEXT: por %xmm3, %xmm0 +; SSE2-NEXT: pand %xmm0, %xmm1 +; SSE2-NEXT: pandn %xmm2, %xmm0 +; SSE2-NEXT: por %xmm1, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm0[0,2,2,3] +; SSE2-NEXT: pshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7] +; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm6[0,2,2,3] +; SSE2-NEXT: pshuflw {{.*#+}} xmm1 = xmm1[0,2,2,3,4,5,6,7] +; SSE2-NEXT: punpckldq {{.*#+}} xmm1 = xmm1[0],xmm0[0],xmm1[1],xmm0[1] +; SSE2-NEXT: movq %xmm1, (%rdi) +; SSE2-NEXT: retq +; +; SSSE3-LABEL: trunc_usat_v4i64_v4i16_store: +; SSSE3: # %bb.0: +; SSSE3-NEXT: movdqa {{.*#+}} xmm2 = [65535,65535] +; SSSE3-NEXT: movdqa {{.*#+}} xmm3 = [9223372039002259456,9223372039002259456] +; SSSE3-NEXT: movdqa %xmm0, %xmm4 +; SSSE3-NEXT: pxor %xmm3, %xmm4 +; SSSE3-NEXT: movdqa {{.*#+}} xmm5 = [9223372039002324991,9223372039002324991] +; SSSE3-NEXT: movdqa %xmm5, %xmm6 +; SSSE3-NEXT: pcmpgtd %xmm4, %xmm6 +; SSSE3-NEXT: pshufd {{.*#+}} xmm7 = xmm6[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm5, %xmm4 +; SSSE3-NEXT: pshufd 
{{.*#+}} xmm4 = xmm4[1,1,3,3] +; SSSE3-NEXT: pand %xmm7, %xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm6[1,1,3,3] +; SSSE3-NEXT: por %xmm4, %xmm6 +; SSSE3-NEXT: pand %xmm6, %xmm0 +; SSSE3-NEXT: pandn %xmm2, %xmm6 +; SSSE3-NEXT: por %xmm0, %xmm6 +; SSSE3-NEXT: pxor %xmm1, %xmm3 +; SSSE3-NEXT: movdqa %xmm5, %xmm0 +; SSSE3-NEXT: pcmpgtd %xmm3, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm0[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm5, %xmm3 +; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm3[1,1,3,3] +; SSSE3-NEXT: pand %xmm4, %xmm3 +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] +; SSSE3-NEXT: por %xmm3, %xmm0 +; SSSE3-NEXT: pand %xmm0, %xmm1 +; SSSE3-NEXT: pandn %xmm2, %xmm0 +; SSSE3-NEXT: por %xmm1, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm0[0,2,2,3] +; SSSE3-NEXT: pshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7] +; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm6[0,2,2,3] +; SSSE3-NEXT: pshuflw {{.*#+}} xmm1 = xmm1[0,2,2,3,4,5,6,7] +; SSSE3-NEXT: punpckldq {{.*#+}} xmm1 = xmm1[0],xmm0[0],xmm1[1],xmm0[1] +; SSSE3-NEXT: movq %xmm1, (%rdi) +; SSSE3-NEXT: retq +; +; SSE41-LABEL: trunc_usat_v4i64_v4i16_store: +; SSE41: # %bb.0: +; SSE41-NEXT: movdqa %xmm0, %xmm2 +; SSE41-NEXT: movapd {{.*#+}} xmm3 = [65535,65535] +; SSE41-NEXT: movdqa {{.*#+}} xmm4 = [9223372039002259456,9223372039002259456] +; SSE41-NEXT: pxor %xmm4, %xmm0 +; SSE41-NEXT: movdqa {{.*#+}} xmm5 = [9223372039002324991,9223372039002324991] +; SSE41-NEXT: movdqa %xmm5, %xmm6 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm6 +; SSE41-NEXT: movdqa %xmm5, %xmm7 +; SSE41-NEXT: pcmpgtd %xmm0, %xmm7 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm7[0,0,2,2] +; SSE41-NEXT: pand %xmm6, %xmm0 +; SSE41-NEXT: por %xmm7, %xmm0 +; SSE41-NEXT: movapd %xmm3, %xmm6 +; SSE41-NEXT: blendvpd %xmm0, %xmm2, %xmm6 +; SSE41-NEXT: pxor %xmm1, %xmm4 +; SSE41-NEXT: movdqa %xmm5, %xmm2 +; SSE41-NEXT: pcmpeqd %xmm4, %xmm2 +; SSE41-NEXT: pcmpgtd %xmm4, %xmm5 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm5[0,0,2,2] +; SSE41-NEXT: pand %xmm2, %xmm0 +; SSE41-NEXT: por %xmm5, 
%xmm0 +; SSE41-NEXT: blendvpd %xmm0, %xmm1, %xmm3 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm3[0,2,2,3] +; SSE41-NEXT: pshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7] +; SSE41-NEXT: pshufd {{.*#+}} xmm1 = xmm6[0,2,2,3] +; SSE41-NEXT: pshuflw {{.*#+}} xmm1 = xmm1[0,2,2,3,4,5,6,7] +; SSE41-NEXT: punpckldq {{.*#+}} xmm1 = xmm1[0],xmm0[0],xmm1[1],xmm0[1] +; SSE41-NEXT: movq %xmm1, (%rdi) +; SSE41-NEXT: retq +; +; AVX1-LABEL: trunc_usat_v4i64_v4i16_store: +; AVX1: # %bb.0: +; AVX1-NEXT: vmovdqa {{.*#+}} xmm1 = [9223372036854775808,9223372036854775808] +; AVX1-NEXT: vpxor %xmm1, %xmm0, %xmm2 +; AVX1-NEXT: vmovdqa {{.*#+}} xmm3 = [9223372036854841343,9223372036854841343] +; AVX1-NEXT: vpcmpgtq %xmm2, %xmm3, %xmm2 +; AVX1-NEXT: vextractf128 $1, %ymm0, %xmm4 +; AVX1-NEXT: vpxor %xmm1, %xmm4, %xmm1 +; AVX1-NEXT: vpcmpgtq %xmm1, %xmm3, %xmm1 +; AVX1-NEXT: vmovapd {{.*#+}} xmm3 = [65535,65535] +; AVX1-NEXT: vblendvpd %xmm1, %xmm4, %xmm3, %xmm1 +; AVX1-NEXT: vpermilps {{.*#+}} xmm1 = xmm1[0,2,2,3] +; AVX1-NEXT: vpshuflw {{.*#+}} xmm1 = xmm1[0,2,2,3,4,5,6,7] +; AVX1-NEXT: vblendvpd %xmm2, %xmm0, %xmm3, %xmm0 +; AVX1-NEXT: vpermilps {{.*#+}} xmm0 = xmm0[0,2,2,3] +; AVX1-NEXT: vpshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7] +; AVX1-NEXT: vpunpckldq {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1] +; AVX1-NEXT: vmovq %xmm0, (%rdi) +; AVX1-NEXT: vzeroupper +; AVX1-NEXT: retq +; +; AVX2-SLOW-LABEL: trunc_usat_v4i64_v4i16_store: +; AVX2-SLOW: # %bb.0: +; AVX2-SLOW-NEXT: vbroadcastsd {{.*#+}} ymm1 = [65535,65535,65535,65535] +; AVX2-SLOW-NEXT: vpbroadcastq {{.*#+}} ymm2 = [9223372036854775808,9223372036854775808,9223372036854775808,9223372036854775808] +; AVX2-SLOW-NEXT: vpxor %ymm2, %ymm0, %ymm2 +; AVX2-SLOW-NEXT: vpbroadcastq {{.*#+}} ymm3 = [9223372036854841343,9223372036854841343,9223372036854841343,9223372036854841343] +; AVX2-SLOW-NEXT: vpcmpgtq %ymm2, %ymm3, %ymm2 +; AVX2-SLOW-NEXT: vblendvpd %ymm2, %ymm0, %ymm1, %ymm0 +; AVX2-SLOW-NEXT: vextractf128 $1, %ymm0, %xmm1 +; 
AVX2-SLOW-NEXT: vpermilps {{.*#+}} xmm1 = xmm1[0,2,2,3] +; AVX2-SLOW-NEXT: vpshuflw {{.*#+}} xmm1 = xmm1[0,2,2,3,4,5,6,7] +; AVX2-SLOW-NEXT: vpermilps {{.*#+}} xmm0 = xmm0[0,2,2,3] +; AVX2-SLOW-NEXT: vpshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7] +; AVX2-SLOW-NEXT: vpunpckldq {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1] +; AVX2-SLOW-NEXT: vmovq %xmm0, (%rdi) +; AVX2-SLOW-NEXT: vzeroupper +; AVX2-SLOW-NEXT: retq +; +; AVX2-FAST-LABEL: trunc_usat_v4i64_v4i16_store: +; AVX2-FAST: # %bb.0: +; AVX2-FAST-NEXT: vbroadcastsd {{.*#+}} ymm1 = [65535,65535,65535,65535] +; AVX2-FAST-NEXT: vpbroadcastq {{.*#+}} ymm2 = [9223372036854775808,9223372036854775808,9223372036854775808,9223372036854775808] +; AVX2-FAST-NEXT: vpxor %ymm2, %ymm0, %ymm2 +; AVX2-FAST-NEXT: vpbroadcastq {{.*#+}} ymm3 = [9223372036854841343,9223372036854841343,9223372036854841343,9223372036854841343] +; AVX2-FAST-NEXT: vpcmpgtq %ymm2, %ymm3, %ymm2 +; AVX2-FAST-NEXT: vblendvpd %ymm2, %ymm0, %ymm1, %ymm0 +; AVX2-FAST-NEXT: vextractf128 $1, %ymm0, %xmm1 +; AVX2-FAST-NEXT: vmovdqa {{.*#+}} xmm2 = [0,1,8,9,8,9,10,11,8,9,10,11,12,13,14,15] +; AVX2-FAST-NEXT: vpshufb %xmm2, %xmm1, %xmm1 +; AVX2-FAST-NEXT: vpshufb %xmm2, %xmm0, %xmm0 +; AVX2-FAST-NEXT: vpunpckldq {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1] +; AVX2-FAST-NEXT: vmovq %xmm0, (%rdi) +; AVX2-FAST-NEXT: vzeroupper +; AVX2-FAST-NEXT: retq +; +; AVX512F-LABEL: trunc_usat_v4i64_v4i16_store: +; AVX512F: # %bb.0: +; AVX512F-NEXT: # kill: def $ymm0 killed $ymm0 def $zmm0 +; AVX512F-NEXT: vpminuq {{.*}}(%rip){1to8}, %zmm0, %zmm0 +; AVX512F-NEXT: vpmovqw %zmm0, %xmm0 +; AVX512F-NEXT: vmovq %xmm0, (%rdi) +; AVX512F-NEXT: vzeroupper +; AVX512F-NEXT: retq +; +; AVX512VL-LABEL: trunc_usat_v4i64_v4i16_store: +; AVX512VL: # %bb.0: +; AVX512VL-NEXT: vpmovusqw %ymm0, (%rdi) +; AVX512VL-NEXT: vzeroupper +; AVX512VL-NEXT: retq +; +; AVX512BW-LABEL: trunc_usat_v4i64_v4i16_store: +; AVX512BW: # %bb.0: +; AVX512BW-NEXT: # kill: def $ymm0 killed $ymm0 def $zmm0 +; 
AVX512BW-NEXT: vpminuq {{.*}}(%rip){1to8}, %zmm0, %zmm0 +; AVX512BW-NEXT: vpmovqw %zmm0, %xmm0 +; AVX512BW-NEXT: vmovq %xmm0, (%rdi) +; AVX512BW-NEXT: vzeroupper +; AVX512BW-NEXT: retq +; +; AVX512BWVL-LABEL: trunc_usat_v4i64_v4i16_store: +; AVX512BWVL: # %bb.0: +; AVX512BWVL-NEXT: vpmovusqw %ymm0, (%rdi) +; AVX512BWVL-NEXT: vzeroupper +; AVX512BWVL-NEXT: retq + %1 = icmp ult <4 x i64> %a0, + %2 = select <4 x i1> %1, <4 x i64> %a0, <4 x i64> + %3 = trunc <4 x i64> %2 to <4 x i16> + store <4 x i16> %3, <4 x i16> *%p1 + ret void +} + define <8 x i16> @trunc_usat_v8i64_v8i16(<8 x i64> %a0) { ; SSE2-LABEL: trunc_usat_v8i64_v8i16: ; SSE2: # %bb.0: @@ -693,6 +1086,166 @@ define <8 x i16> @trunc_usat_v8i64_v8i16 ret <8 x i16> %3 } +define <4 x i16> @trunc_usat_v4i32_v4i16(<4 x i32> %a0) { +; SSE2-LABEL: trunc_usat_v4i32_v4i16: +; SSE2: # %bb.0: +; SSE2-NEXT: movdqa {{.*#+}} xmm1 = [2147483648,2147483648,2147483648,2147483648] +; SSE2-NEXT: pxor %xmm0, %xmm1 +; SSE2-NEXT: movdqa {{.*#+}} xmm2 = [2147549183,2147549183,2147549183,2147549183] +; SSE2-NEXT: pcmpgtd %xmm1, %xmm2 +; SSE2-NEXT: pand %xmm2, %xmm0 +; SSE2-NEXT: pandn {{.*}}(%rip), %xmm2 +; SSE2-NEXT: por %xmm0, %xmm2 +; SSE2-NEXT: pshuflw {{.*#+}} xmm0 = xmm2[0,2,2,3,4,5,6,7] +; SSE2-NEXT: pshufhw {{.*#+}} xmm0 = xmm0[0,1,2,3,4,6,6,7] +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm0[0,2,2,3] +; SSE2-NEXT: retq +; +; SSSE3-LABEL: trunc_usat_v4i32_v4i16: +; SSSE3: # %bb.0: +; SSSE3-NEXT: movdqa {{.*#+}} xmm1 = [2147483648,2147483648,2147483648,2147483648] +; SSSE3-NEXT: pxor %xmm0, %xmm1 +; SSSE3-NEXT: movdqa {{.*#+}} xmm2 = [2147549183,2147549183,2147549183,2147549183] +; SSSE3-NEXT: pcmpgtd %xmm1, %xmm2 +; SSSE3-NEXT: pand %xmm2, %xmm0 +; SSSE3-NEXT: pandn {{.*}}(%rip), %xmm2 +; SSSE3-NEXT: por %xmm2, %xmm0 +; SSSE3-NEXT: pshufb {{.*#+}} xmm0 = xmm0[0,1,4,5,8,9,12,13,8,9,12,13,12,13,14,15] +; SSSE3-NEXT: retq +; +; SSE41-LABEL: trunc_usat_v4i32_v4i16: +; SSE41: # %bb.0: +; SSE41-NEXT: pminud {{.*}}(%rip), %xmm0 +; 
SSE41-NEXT: packusdw %xmm0, %xmm0 +; SSE41-NEXT: retq +; +; AVX1-LABEL: trunc_usat_v4i32_v4i16: +; AVX1: # %bb.0: +; AVX1-NEXT: vpminud {{.*}}(%rip), %xmm0, %xmm0 +; AVX1-NEXT: vpackusdw %xmm0, %xmm0, %xmm0 +; AVX1-NEXT: retq +; +; AVX2-LABEL: trunc_usat_v4i32_v4i16: +; AVX2: # %bb.0: +; AVX2-NEXT: vpbroadcastd {{.*#+}} xmm1 = [65535,65535,65535,65535] +; AVX2-NEXT: vpminud %xmm1, %xmm0, %xmm0 +; AVX2-NEXT: vpackusdw %xmm0, %xmm0, %xmm0 +; AVX2-NEXT: retq +; +; AVX512F-LABEL: trunc_usat_v4i32_v4i16: +; AVX512F: # %bb.0: +; AVX512F-NEXT: vpbroadcastd {{.*#+}} xmm1 = [65535,65535,65535,65535] +; AVX512F-NEXT: vpminud %xmm1, %xmm0, %xmm0 +; AVX512F-NEXT: vpackusdw %xmm0, %xmm0, %xmm0 +; AVX512F-NEXT: retq +; +; AVX512VL-LABEL: trunc_usat_v4i32_v4i16: +; AVX512VL: # %bb.0: +; AVX512VL-NEXT: vpminud {{.*}}(%rip){1to4}, %xmm0, %xmm0 +; AVX512VL-NEXT: vpackusdw %xmm0, %xmm0, %xmm0 +; AVX512VL-NEXT: retq +; +; AVX512BW-LABEL: trunc_usat_v4i32_v4i16: +; AVX512BW: # %bb.0: +; AVX512BW-NEXT: vpbroadcastd {{.*#+}} xmm1 = [65535,65535,65535,65535] +; AVX512BW-NEXT: vpminud %xmm1, %xmm0, %xmm0 +; AVX512BW-NEXT: vpackusdw %xmm0, %xmm0, %xmm0 +; AVX512BW-NEXT: retq +; +; AVX512BWVL-LABEL: trunc_usat_v4i32_v4i16: +; AVX512BWVL: # %bb.0: +; AVX512BWVL-NEXT: vpminud {{.*}}(%rip){1to4}, %xmm0, %xmm0 +; AVX512BWVL-NEXT: vpackusdw %xmm0, %xmm0, %xmm0 +; AVX512BWVL-NEXT: retq + %1 = icmp ult <4 x i32> %a0, + %2 = select <4 x i1> %1, <4 x i32> %a0, <4 x i32> + %3 = trunc <4 x i32> %2 to <4 x i16> + ret <4 x i16> %3 +} + +define void @trunc_usat_v4i32_v4i16_store(<4 x i32> %a0, <4 x i16> *%p1) { +; SSE2-LABEL: trunc_usat_v4i32_v4i16_store: +; SSE2: # %bb.0: +; SSE2-NEXT: movdqa {{.*#+}} xmm1 = [2147483648,2147483648,2147483648,2147483648] +; SSE2-NEXT: pxor %xmm0, %xmm1 +; SSE2-NEXT: movdqa {{.*#+}} xmm2 = [2147549183,2147549183,2147549183,2147549183] +; SSE2-NEXT: pcmpgtd %xmm1, %xmm2 +; SSE2-NEXT: pand %xmm2, %xmm0 +; SSE2-NEXT: pandn {{.*}}(%rip), %xmm2 +; SSE2-NEXT: por %xmm0, %xmm2 +; 
SSE2-NEXT: pshuflw {{.*#+}} xmm0 = xmm2[0,2,2,3,4,5,6,7] +; SSE2-NEXT: pshufhw {{.*#+}} xmm0 = xmm0[0,1,2,3,4,6,6,7] +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm0[0,2,2,3] +; SSE2-NEXT: movq %xmm0, (%rdi) +; SSE2-NEXT: retq +; +; SSSE3-LABEL: trunc_usat_v4i32_v4i16_store: +; SSSE3: # %bb.0: +; SSSE3-NEXT: movdqa {{.*#+}} xmm1 = [2147483648,2147483648,2147483648,2147483648] +; SSSE3-NEXT: pxor %xmm0, %xmm1 +; SSSE3-NEXT: movdqa {{.*#+}} xmm2 = [2147549183,2147549183,2147549183,2147549183] +; SSSE3-NEXT: pcmpgtd %xmm1, %xmm2 +; SSSE3-NEXT: pand %xmm2, %xmm0 +; SSSE3-NEXT: pandn {{.*}}(%rip), %xmm2 +; SSSE3-NEXT: por %xmm0, %xmm2 +; SSSE3-NEXT: pshufb {{.*#+}} xmm2 = xmm2[0,1,4,5,8,9,12,13,8,9,12,13,12,13,14,15] +; SSSE3-NEXT: movq %xmm2, (%rdi) +; SSSE3-NEXT: retq +; +; SSE41-LABEL: trunc_usat_v4i32_v4i16_store: +; SSE41: # %bb.0: +; SSE41-NEXT: pminud {{.*}}(%rip), %xmm0 +; SSE41-NEXT: packusdw %xmm0, %xmm0 +; SSE41-NEXT: movq %xmm0, (%rdi) +; SSE41-NEXT: retq +; +; AVX1-LABEL: trunc_usat_v4i32_v4i16_store: +; AVX1: # %bb.0: +; AVX1-NEXT: vpminud {{.*}}(%rip), %xmm0, %xmm0 +; AVX1-NEXT: vpackusdw %xmm0, %xmm0, %xmm0 +; AVX1-NEXT: vmovq %xmm0, (%rdi) +; AVX1-NEXT: retq +; +; AVX2-LABEL: trunc_usat_v4i32_v4i16_store: +; AVX2: # %bb.0: +; AVX2-NEXT: vpbroadcastd {{.*#+}} xmm1 = [65535,65535,65535,65535] +; AVX2-NEXT: vpminud %xmm1, %xmm0, %xmm0 +; AVX2-NEXT: vpackusdw %xmm0, %xmm0, %xmm0 +; AVX2-NEXT: vmovq %xmm0, (%rdi) +; AVX2-NEXT: retq +; +; AVX512F-LABEL: trunc_usat_v4i32_v4i16_store: +; AVX512F: # %bb.0: +; AVX512F-NEXT: vpbroadcastd {{.*#+}} xmm1 = [65535,65535,65535,65535] +; AVX512F-NEXT: vpminud %xmm1, %xmm0, %xmm0 +; AVX512F-NEXT: vpackusdw %xmm0, %xmm0, %xmm0 +; AVX512F-NEXT: vmovq %xmm0, (%rdi) +; AVX512F-NEXT: retq +; +; AVX512VL-LABEL: trunc_usat_v4i32_v4i16_store: +; AVX512VL: # %bb.0: +; AVX512VL-NEXT: vpmovusdw %xmm0, (%rdi) +; AVX512VL-NEXT: retq +; +; AVX512BW-LABEL: trunc_usat_v4i32_v4i16_store: +; AVX512BW: # %bb.0: +; AVX512BW-NEXT: vpbroadcastd 
{{.*#+}} xmm1 = [65535,65535,65535,65535] +; AVX512BW-NEXT: vpminud %xmm1, %xmm0, %xmm0 +; AVX512BW-NEXT: vpackusdw %xmm0, %xmm0, %xmm0 +; AVX512BW-NEXT: vmovq %xmm0, (%rdi) +; AVX512BW-NEXT: retq +; +; AVX512BWVL-LABEL: trunc_usat_v4i32_v4i16_store: +; AVX512BWVL: # %bb.0: +; AVX512BWVL-NEXT: vpmovusdw %xmm0, (%rdi) +; AVX512BWVL-NEXT: retq + %1 = icmp ult <4 x i32> %a0, + %2 = select <4 x i1> %1, <4 x i32> %a0, <4 x i32> + %3 = trunc <4 x i32> %2 to <4 x i16> + store <4 x i16> %3, <4 x i16> *%p1 + ret void +} + define <8 x i16> @trunc_usat_v8i32_v8i16(<8 x i32> %a0) { ; SSE2-LABEL: trunc_usat_v8i32_v8i16: ; SSE2: # %bb.0: @@ -938,8 +1491,361 @@ define <16 x i16> @trunc_usat_v16i32_v16 } ; -; Unsigned saturation truncation to v16i8 +; Unsigned saturation truncation to vXi8 +; + +define <4 x i8> @trunc_usat_v4i64_v4i8(<4 x i64> %a0) { +; SSE2-LABEL: trunc_usat_v4i64_v4i8: +; SSE2: # %bb.0: +; SSE2-NEXT: movdqa %xmm0, %xmm2 +; SSE2-NEXT: movdqa {{.*#+}} xmm8 = [255,255] +; SSE2-NEXT: movdqa {{.*#+}} xmm4 = [9223372039002259456,9223372039002259456] +; SSE2-NEXT: pxor %xmm4, %xmm0 +; SSE2-NEXT: movdqa {{.*#+}} xmm5 = [9223372039002259711,9223372039002259711] +; SSE2-NEXT: movdqa %xmm5, %xmm6 +; SSE2-NEXT: pcmpgtd %xmm0, %xmm6 +; SSE2-NEXT: pshufd {{.*#+}} xmm7 = xmm6[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm5, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm0[1,1,3,3] +; SSE2-NEXT: pand %xmm7, %xmm3 +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm6[1,1,3,3] +; SSE2-NEXT: por %xmm3, %xmm0 +; SSE2-NEXT: pand %xmm0, %xmm2 +; SSE2-NEXT: pandn %xmm8, %xmm0 +; SSE2-NEXT: por %xmm2, %xmm0 +; SSE2-NEXT: pxor %xmm1, %xmm4 +; SSE2-NEXT: movdqa %xmm5, %xmm2 +; SSE2-NEXT: pcmpgtd %xmm4, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm2[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm5, %xmm4 +; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm4[1,1,3,3] +; SSE2-NEXT: pand %xmm3, %xmm4 +; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] +; SSE2-NEXT: por %xmm4, %xmm2 +; SSE2-NEXT: pand %xmm2, %xmm1 +; SSE2-NEXT: pandn 
%xmm8, %xmm2 +; SSE2-NEXT: por %xmm1, %xmm2 +; SSE2-NEXT: pand %xmm8, %xmm2 +; SSE2-NEXT: pand %xmm8, %xmm0 +; SSE2-NEXT: packuswb %xmm2, %xmm0 +; SSE2-NEXT: packuswb %xmm0, %xmm0 +; SSE2-NEXT: packuswb %xmm0, %xmm0 +; SSE2-NEXT: retq +; +; SSSE3-LABEL: trunc_usat_v4i64_v4i8: +; SSSE3: # %bb.0: +; SSSE3-NEXT: movdqa {{.*#+}} xmm2 = [255,255] +; SSSE3-NEXT: movdqa {{.*#+}} xmm3 = [9223372039002259456,9223372039002259456] +; SSSE3-NEXT: movdqa %xmm0, %xmm4 +; SSSE3-NEXT: pxor %xmm3, %xmm4 +; SSSE3-NEXT: movdqa {{.*#+}} xmm5 = [9223372039002259711,9223372039002259711] +; SSSE3-NEXT: movdqa %xmm5, %xmm6 +; SSSE3-NEXT: pcmpgtd %xmm4, %xmm6 +; SSSE3-NEXT: pshufd {{.*#+}} xmm7 = xmm6[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm5, %xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm4[1,1,3,3] +; SSSE3-NEXT: pand %xmm7, %xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm6[1,1,3,3] +; SSSE3-NEXT: por %xmm4, %xmm6 +; SSSE3-NEXT: pand %xmm6, %xmm0 +; SSSE3-NEXT: pandn %xmm2, %xmm6 +; SSSE3-NEXT: por %xmm6, %xmm0 +; SSSE3-NEXT: pxor %xmm1, %xmm3 +; SSSE3-NEXT: movdqa %xmm5, %xmm4 +; SSSE3-NEXT: pcmpgtd %xmm3, %xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm4[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm5, %xmm3 +; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm3[1,1,3,3] +; SSSE3-NEXT: pand %xmm6, %xmm3 +; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm4[1,1,3,3] +; SSSE3-NEXT: por %xmm3, %xmm4 +; SSSE3-NEXT: pand %xmm4, %xmm1 +; SSSE3-NEXT: pandn %xmm2, %xmm4 +; SSSE3-NEXT: por %xmm1, %xmm4 +; SSSE3-NEXT: movdqa {{.*#+}} xmm1 = <0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u> +; SSSE3-NEXT: pshufb %xmm1, %xmm4 +; SSSE3-NEXT: pshufb %xmm1, %xmm0 +; SSSE3-NEXT: punpcklwd {{.*#+}} xmm0 = xmm0[0],xmm4[0],xmm0[1],xmm4[1],xmm0[2],xmm4[2],xmm0[3],xmm4[3] +; SSSE3-NEXT: retq +; +; SSE41-LABEL: trunc_usat_v4i64_v4i8: +; SSE41: # %bb.0: +; SSE41-NEXT: movdqa %xmm0, %xmm2 +; SSE41-NEXT: movapd {{.*#+}} xmm4 = [255,255] +; SSE41-NEXT: movdqa {{.*#+}} xmm5 = [9223372039002259456,9223372039002259456] +; SSE41-NEXT: pxor %xmm5, %xmm0 +; SSE41-NEXT: 
movdqa {{.*#+}} xmm6 = [9223372039002259711,9223372039002259711] +; SSE41-NEXT: movdqa %xmm6, %xmm3 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm3 +; SSE41-NEXT: movdqa %xmm6, %xmm7 +; SSE41-NEXT: pcmpgtd %xmm0, %xmm7 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm7[0,0,2,2] +; SSE41-NEXT: pand %xmm3, %xmm0 +; SSE41-NEXT: por %xmm7, %xmm0 +; SSE41-NEXT: movapd %xmm4, %xmm3 +; SSE41-NEXT: blendvpd %xmm0, %xmm2, %xmm3 +; SSE41-NEXT: pxor %xmm1, %xmm5 +; SSE41-NEXT: movdqa %xmm6, %xmm2 +; SSE41-NEXT: pcmpeqd %xmm5, %xmm2 +; SSE41-NEXT: pcmpgtd %xmm5, %xmm6 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm6[0,0,2,2] +; SSE41-NEXT: pand %xmm2, %xmm0 +; SSE41-NEXT: por %xmm6, %xmm0 +; SSE41-NEXT: blendvpd %xmm0, %xmm1, %xmm4 +; SSE41-NEXT: movdqa {{.*#+}} xmm0 = <0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u> +; SSE41-NEXT: pshufb %xmm0, %xmm4 +; SSE41-NEXT: pshufb %xmm0, %xmm3 +; SSE41-NEXT: punpcklwd {{.*#+}} xmm3 = xmm3[0],xmm4[0],xmm3[1],xmm4[1],xmm3[2],xmm4[2],xmm3[3],xmm4[3] +; SSE41-NEXT: movdqa %xmm3, %xmm0 +; SSE41-NEXT: retq +; +; AVX1-LABEL: trunc_usat_v4i64_v4i8: +; AVX1: # %bb.0: +; AVX1-NEXT: vmovdqa {{.*#+}} xmm1 = [9223372036854775808,9223372036854775808] +; AVX1-NEXT: vpxor %xmm1, %xmm0, %xmm2 +; AVX1-NEXT: vmovdqa {{.*#+}} xmm3 = [9223372036854776063,9223372036854776063] +; AVX1-NEXT: vpcmpgtq %xmm2, %xmm3, %xmm2 +; AVX1-NEXT: vextractf128 $1, %ymm0, %xmm4 +; AVX1-NEXT: vpxor %xmm1, %xmm4, %xmm1 +; AVX1-NEXT: vpcmpgtq %xmm1, %xmm3, %xmm1 +; AVX1-NEXT: vmovapd {{.*#+}} xmm3 = [255,255] +; AVX1-NEXT: vblendvpd %xmm1, %xmm4, %xmm3, %xmm1 +; AVX1-NEXT: vmovdqa {{.*#+}} xmm4 = <0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u> +; AVX1-NEXT: vpshufb %xmm4, %xmm1, %xmm1 +; AVX1-NEXT: vblendvpd %xmm2, %xmm0, %xmm3, %xmm0 +; AVX1-NEXT: vpshufb %xmm4, %xmm0, %xmm0 +; AVX1-NEXT: vpunpcklwd {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1],xmm0[2],xmm1[2],xmm0[3],xmm1[3] +; AVX1-NEXT: vzeroupper +; AVX1-NEXT: retq +; +; AVX2-LABEL: trunc_usat_v4i64_v4i8: +; AVX2: # %bb.0: +; AVX2-NEXT: vbroadcastsd {{.*#+}} ymm1 = 
[255,255,255,255] +; AVX2-NEXT: vpbroadcastq {{.*#+}} ymm2 = [9223372036854775808,9223372036854775808,9223372036854775808,9223372036854775808] +; AVX2-NEXT: vpxor %ymm2, %ymm0, %ymm2 +; AVX2-NEXT: vpbroadcastq {{.*#+}} ymm3 = [9223372036854776063,9223372036854776063,9223372036854776063,9223372036854776063] +; AVX2-NEXT: vpcmpgtq %ymm2, %ymm3, %ymm2 +; AVX2-NEXT: vblendvpd %ymm2, %ymm0, %ymm1, %ymm0 +; AVX2-NEXT: vextractf128 $1, %ymm0, %xmm1 +; AVX2-NEXT: vmovdqa {{.*#+}} xmm2 = <0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u> +; AVX2-NEXT: vpshufb %xmm2, %xmm1, %xmm1 +; AVX2-NEXT: vpshufb %xmm2, %xmm0, %xmm0 +; AVX2-NEXT: vpunpcklwd {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1],xmm0[2],xmm1[2],xmm0[3],xmm1[3] +; AVX2-NEXT: vzeroupper +; AVX2-NEXT: retq +; +; AVX512F-LABEL: trunc_usat_v4i64_v4i8: +; AVX512F: # %bb.0: +; AVX512F-NEXT: # kill: def $ymm0 killed $ymm0 def $zmm0 +; AVX512F-NEXT: vpminuq {{.*}}(%rip){1to8}, %zmm0, %zmm0 +; AVX512F-NEXT: vpmovqb %zmm0, %xmm0 +; AVX512F-NEXT: vzeroupper +; AVX512F-NEXT: retq +; +; AVX512VL-LABEL: trunc_usat_v4i64_v4i8: +; AVX512VL: # %bb.0: +; AVX512VL-NEXT: vpmovusqb %ymm0, %xmm0 +; AVX512VL-NEXT: vzeroupper +; AVX512VL-NEXT: retq +; +; AVX512BW-LABEL: trunc_usat_v4i64_v4i8: +; AVX512BW: # %bb.0: +; AVX512BW-NEXT: # kill: def $ymm0 killed $ymm0 def $zmm0 +; AVX512BW-NEXT: vpminuq {{.*}}(%rip){1to8}, %zmm0, %zmm0 +; AVX512BW-NEXT: vpmovqb %zmm0, %xmm0 +; AVX512BW-NEXT: vzeroupper +; AVX512BW-NEXT: retq +; +; AVX512BWVL-LABEL: trunc_usat_v4i64_v4i8: +; AVX512BWVL: # %bb.0: +; AVX512BWVL-NEXT: vpmovusqb %ymm0, %xmm0 +; AVX512BWVL-NEXT: vzeroupper +; AVX512BWVL-NEXT: retq + %1 = icmp ult <4 x i64> %a0, + %2 = select <4 x i1> %1, <4 x i64> %a0, <4 x i64> + %3 = trunc <4 x i64> %2 to <4 x i8> + ret <4 x i8> %3 +} + +define void @trunc_usat_v4i64_v4i8_store(<4 x i64> %a0, <4 x i8> *%p1) { +; SSE2-LABEL: trunc_usat_v4i64_v4i8_store: +; SSE2: # %bb.0: +; SSE2-NEXT: movdqa {{.*#+}} xmm8 = [255,255] +; SSE2-NEXT: movdqa {{.*#+}} xmm4 = 
[9223372039002259456,9223372039002259456] +; SSE2-NEXT: movdqa %xmm0, %xmm3 +; SSE2-NEXT: pxor %xmm4, %xmm3 +; SSE2-NEXT: movdqa {{.*#+}} xmm5 = [9223372039002259711,9223372039002259711] +; SSE2-NEXT: movdqa %xmm5, %xmm6 +; SSE2-NEXT: pcmpgtd %xmm3, %xmm6 +; SSE2-NEXT: pshufd {{.*#+}} xmm7 = xmm6[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm5, %xmm3 +; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm3[1,1,3,3] +; SSE2-NEXT: pand %xmm7, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm6[1,1,3,3] +; SSE2-NEXT: por %xmm2, %xmm3 +; SSE2-NEXT: pand %xmm3, %xmm0 +; SSE2-NEXT: pandn %xmm8, %xmm3 +; SSE2-NEXT: por %xmm0, %xmm3 +; SSE2-NEXT: pxor %xmm1, %xmm4 +; SSE2-NEXT: movdqa %xmm5, %xmm0 +; SSE2-NEXT: pcmpgtd %xmm4, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm0[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm5, %xmm4 +; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm4[1,1,3,3] +; SSE2-NEXT: pand %xmm2, %xmm4 +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] +; SSE2-NEXT: por %xmm4, %xmm0 +; SSE2-NEXT: pand %xmm0, %xmm1 +; SSE2-NEXT: pandn %xmm8, %xmm0 +; SSE2-NEXT: por %xmm1, %xmm0 +; SSE2-NEXT: pand %xmm8, %xmm0 +; SSE2-NEXT: pand %xmm8, %xmm3 +; SSE2-NEXT: packuswb %xmm0, %xmm3 +; SSE2-NEXT: packuswb %xmm0, %xmm3 +; SSE2-NEXT: packuswb %xmm0, %xmm3 +; SSE2-NEXT: movd %xmm3, (%rdi) +; SSE2-NEXT: retq +; +; SSSE3-LABEL: trunc_usat_v4i64_v4i8_store: +; SSSE3: # %bb.0: +; SSSE3-NEXT: movdqa {{.*#+}} xmm8 = [255,255] +; SSSE3-NEXT: movdqa {{.*#+}} xmm4 = [9223372039002259456,9223372039002259456] +; SSSE3-NEXT: movdqa %xmm0, %xmm3 +; SSSE3-NEXT: pxor %xmm4, %xmm3 +; SSSE3-NEXT: movdqa {{.*#+}} xmm5 = [9223372039002259711,9223372039002259711] +; SSSE3-NEXT: movdqa %xmm5, %xmm6 +; SSSE3-NEXT: pcmpgtd %xmm3, %xmm6 +; SSSE3-NEXT: pshufd {{.*#+}} xmm7 = xmm6[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm5, %xmm3 +; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm3[1,1,3,3] +; SSSE3-NEXT: pand %xmm7, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm6[1,1,3,3] +; SSSE3-NEXT: por %xmm2, %xmm3 +; SSSE3-NEXT: pand %xmm3, %xmm0 +; SSSE3-NEXT: pandn 
%xmm8, %xmm3 +; SSSE3-NEXT: por %xmm0, %xmm3 +; SSSE3-NEXT: pxor %xmm1, %xmm4 +; SSSE3-NEXT: movdqa %xmm5, %xmm0 +; SSSE3-NEXT: pcmpgtd %xmm4, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm0[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm5, %xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm4[1,1,3,3] +; SSSE3-NEXT: pand %xmm2, %xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] +; SSSE3-NEXT: por %xmm4, %xmm0 +; SSSE3-NEXT: pand %xmm0, %xmm1 +; SSSE3-NEXT: pandn %xmm8, %xmm0 +; SSSE3-NEXT: por %xmm1, %xmm0 +; SSSE3-NEXT: movdqa {{.*#+}} xmm1 = <0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u> +; SSSE3-NEXT: pshufb %xmm1, %xmm0 +; SSSE3-NEXT: pshufb %xmm1, %xmm3 +; SSSE3-NEXT: punpcklwd {{.*#+}} xmm3 = xmm3[0],xmm0[0],xmm3[1],xmm0[1],xmm3[2],xmm0[2],xmm3[3],xmm0[3] +; SSSE3-NEXT: movd %xmm3, (%rdi) +; SSSE3-NEXT: retq +; +; SSE41-LABEL: trunc_usat_v4i64_v4i8_store: +; SSE41: # %bb.0: +; SSE41-NEXT: movdqa %xmm0, %xmm2 +; SSE41-NEXT: movapd {{.*#+}} xmm3 = [255,255] +; SSE41-NEXT: movdqa {{.*#+}} xmm4 = [9223372039002259456,9223372039002259456] +; SSE41-NEXT: pxor %xmm4, %xmm0 +; SSE41-NEXT: movdqa {{.*#+}} xmm5 = [9223372039002259711,9223372039002259711] +; SSE41-NEXT: movdqa %xmm5, %xmm6 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm6 +; SSE41-NEXT: movdqa %xmm5, %xmm7 +; SSE41-NEXT: pcmpgtd %xmm0, %xmm7 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm7[0,0,2,2] +; SSE41-NEXT: pand %xmm6, %xmm0 +; SSE41-NEXT: por %xmm7, %xmm0 +; SSE41-NEXT: movapd %xmm3, %xmm6 +; SSE41-NEXT: blendvpd %xmm0, %xmm2, %xmm6 +; SSE41-NEXT: pxor %xmm1, %xmm4 +; SSE41-NEXT: movdqa %xmm5, %xmm2 +; SSE41-NEXT: pcmpeqd %xmm4, %xmm2 +; SSE41-NEXT: pcmpgtd %xmm4, %xmm5 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm5[0,0,2,2] +; SSE41-NEXT: pand %xmm2, %xmm0 +; SSE41-NEXT: por %xmm5, %xmm0 +; SSE41-NEXT: blendvpd %xmm0, %xmm1, %xmm3 +; SSE41-NEXT: movdqa {{.*#+}} xmm0 = <0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u> +; SSE41-NEXT: pshufb %xmm0, %xmm3 +; SSE41-NEXT: pshufb %xmm0, %xmm6 +; SSE41-NEXT: punpcklwd {{.*#+}} xmm6 = 
xmm6[0],xmm3[0],xmm6[1],xmm3[1],xmm6[2],xmm3[2],xmm6[3],xmm3[3] +; SSE41-NEXT: movd %xmm6, (%rdi) +; SSE41-NEXT: retq +; +; AVX1-LABEL: trunc_usat_v4i64_v4i8_store: +; AVX1: # %bb.0: +; AVX1-NEXT: vmovdqa {{.*#+}} xmm1 = [9223372036854775808,9223372036854775808] +; AVX1-NEXT: vpxor %xmm1, %xmm0, %xmm2 +; AVX1-NEXT: vmovdqa {{.*#+}} xmm3 = [9223372036854776063,9223372036854776063] +; AVX1-NEXT: vpcmpgtq %xmm2, %xmm3, %xmm2 +; AVX1-NEXT: vextractf128 $1, %ymm0, %xmm4 +; AVX1-NEXT: vpxor %xmm1, %xmm4, %xmm1 +; AVX1-NEXT: vpcmpgtq %xmm1, %xmm3, %xmm1 +; AVX1-NEXT: vmovapd {{.*#+}} xmm3 = [255,255] +; AVX1-NEXT: vblendvpd %xmm1, %xmm4, %xmm3, %xmm1 +; AVX1-NEXT: vmovdqa {{.*#+}} xmm4 = <0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u> +; AVX1-NEXT: vpshufb %xmm4, %xmm1, %xmm1 +; AVX1-NEXT: vblendvpd %xmm2, %xmm0, %xmm3, %xmm0 +; AVX1-NEXT: vpshufb %xmm4, %xmm0, %xmm0 +; AVX1-NEXT: vpunpcklwd {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1],xmm0[2],xmm1[2],xmm0[3],xmm1[3] +; AVX1-NEXT: vmovd %xmm0, (%rdi) +; AVX1-NEXT: vzeroupper +; AVX1-NEXT: retq +; +; AVX2-LABEL: trunc_usat_v4i64_v4i8_store: +; AVX2: # %bb.0: +; AVX2-NEXT: vbroadcastsd {{.*#+}} ymm1 = [255,255,255,255] +; AVX2-NEXT: vpbroadcastq {{.*#+}} ymm2 = [9223372036854775808,9223372036854775808,9223372036854775808,9223372036854775808] +; AVX2-NEXT: vpxor %ymm2, %ymm0, %ymm2 +; AVX2-NEXT: vpbroadcastq {{.*#+}} ymm3 = [9223372036854776063,9223372036854776063,9223372036854776063,9223372036854776063] +; AVX2-NEXT: vpcmpgtq %ymm2, %ymm3, %ymm2 +; AVX2-NEXT: vblendvpd %ymm2, %ymm0, %ymm1, %ymm0 +; AVX2-NEXT: vextractf128 $1, %ymm0, %xmm1 +; AVX2-NEXT: vmovdqa {{.*#+}} xmm2 = <0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u> +; AVX2-NEXT: vpshufb %xmm2, %xmm1, %xmm1 +; AVX2-NEXT: vpshufb %xmm2, %xmm0, %xmm0 +; AVX2-NEXT: vpunpcklwd {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1],xmm0[2],xmm1[2],xmm0[3],xmm1[3] +; AVX2-NEXT: vmovd %xmm0, (%rdi) +; AVX2-NEXT: vzeroupper +; AVX2-NEXT: retq ; +; AVX512F-LABEL: trunc_usat_v4i64_v4i8_store: +; AVX512F: # 
%bb.0: +; AVX512F-NEXT: # kill: def $ymm0 killed $ymm0 def $zmm0 +; AVX512F-NEXT: vpminuq {{.*}}(%rip){1to8}, %zmm0, %zmm0 +; AVX512F-NEXT: vpmovqb %zmm0, %xmm0 +; AVX512F-NEXT: vmovd %xmm0, (%rdi) +; AVX512F-NEXT: vzeroupper +; AVX512F-NEXT: retq +; +; AVX512VL-LABEL: trunc_usat_v4i64_v4i8_store: +; AVX512VL: # %bb.0: +; AVX512VL-NEXT: vpmovusqb %ymm0, (%rdi) +; AVX512VL-NEXT: vzeroupper +; AVX512VL-NEXT: retq +; +; AVX512BW-LABEL: trunc_usat_v4i64_v4i8_store: +; AVX512BW: # %bb.0: +; AVX512BW-NEXT: # kill: def $ymm0 killed $ymm0 def $zmm0 +; AVX512BW-NEXT: vpminuq {{.*}}(%rip){1to8}, %zmm0, %zmm0 +; AVX512BW-NEXT: vpmovqb %zmm0, %xmm0 +; AVX512BW-NEXT: vmovd %xmm0, (%rdi) +; AVX512BW-NEXT: vzeroupper +; AVX512BW-NEXT: retq +; +; AVX512BWVL-LABEL: trunc_usat_v4i64_v4i8_store: +; AVX512BWVL: # %bb.0: +; AVX512BWVL-NEXT: vpmovusqb %ymm0, (%rdi) +; AVX512BWVL-NEXT: vzeroupper +; AVX512BWVL-NEXT: retq + %1 = icmp ult <4 x i64> %a0, + %2 = select <4 x i1> %1, <4 x i64> %a0, <4 x i64> + %3 = trunc <4 x i64> %2 to <4 x i8> + store <4 x i8> %3, <4 x i8> *%p1 + ret void +} define <8 x i8> @trunc_usat_v8i64_v8i8(<8 x i64> %a0) { ; SSE2-LABEL: trunc_usat_v8i64_v8i8: @@ -1887,6 +2793,167 @@ define <16 x i8> @trunc_usat_v16i64_v16i ret <16 x i8> %3 } +define <4 x i8> @trunc_usat_v4i32_v4i8(<4 x i32> %a0) { +; SSE2-LABEL: trunc_usat_v4i32_v4i8: +; SSE2: # %bb.0: +; SSE2-NEXT: movdqa {{.*#+}} xmm2 = [2147483648,2147483648,2147483648,2147483648] +; SSE2-NEXT: pxor %xmm0, %xmm2 +; SSE2-NEXT: movdqa {{.*#+}} xmm1 = [2147483903,2147483903,2147483903,2147483903] +; SSE2-NEXT: pcmpgtd %xmm2, %xmm1 +; SSE2-NEXT: pand %xmm1, %xmm0 +; SSE2-NEXT: pandn {{.*}}(%rip), %xmm1 +; SSE2-NEXT: por %xmm0, %xmm1 +; SSE2-NEXT: pand {{.*}}(%rip), %xmm1 +; SSE2-NEXT: packuswb %xmm1, %xmm1 +; SSE2-NEXT: packuswb %xmm1, %xmm1 +; SSE2-NEXT: movdqa %xmm1, %xmm0 +; SSE2-NEXT: retq +; +; SSSE3-LABEL: trunc_usat_v4i32_v4i8: +; SSSE3: # %bb.0: +; SSSE3-NEXT: movdqa {{.*#+}} xmm1 = 
[2147483648,2147483648,2147483648,2147483648] +; SSSE3-NEXT: pxor %xmm0, %xmm1 +; SSSE3-NEXT: movdqa {{.*#+}} xmm2 = [2147483903,2147483903,2147483903,2147483903] +; SSSE3-NEXT: pcmpgtd %xmm1, %xmm2 +; SSSE3-NEXT: pand %xmm2, %xmm0 +; SSSE3-NEXT: pandn {{.*}}(%rip), %xmm2 +; SSSE3-NEXT: por %xmm2, %xmm0 +; SSSE3-NEXT: pshufb {{.*#+}} xmm0 = xmm0[0,4,8,12,u,u,u,u,u,u,u,u,u,u,u,u] +; SSSE3-NEXT: retq +; +; SSE41-LABEL: trunc_usat_v4i32_v4i8: +; SSE41: # %bb.0: +; SSE41-NEXT: pminud {{.*}}(%rip), %xmm0 +; SSE41-NEXT: pshufb {{.*#+}} xmm0 = xmm0[0,4,8,12,u,u,u,u,u,u,u,u,u,u,u,u] +; SSE41-NEXT: retq +; +; AVX1-LABEL: trunc_usat_v4i32_v4i8: +; AVX1: # %bb.0: +; AVX1-NEXT: vpminud {{.*}}(%rip), %xmm0, %xmm0 +; AVX1-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,4,8,12,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX1-NEXT: retq +; +; AVX2-LABEL: trunc_usat_v4i32_v4i8: +; AVX2: # %bb.0: +; AVX2-NEXT: vpbroadcastd {{.*#+}} xmm1 = [255,255,255,255] +; AVX2-NEXT: vpminud %xmm1, %xmm0, %xmm0 +; AVX2-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,4,8,12,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX2-NEXT: retq +; +; AVX512F-LABEL: trunc_usat_v4i32_v4i8: +; AVX512F: # %bb.0: +; AVX512F-NEXT: vpbroadcastd {{.*#+}} xmm1 = [255,255,255,255] +; AVX512F-NEXT: vpminud %xmm1, %xmm0, %xmm0 +; AVX512F-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,4,8,12,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX512F-NEXT: retq +; +; AVX512VL-LABEL: trunc_usat_v4i32_v4i8: +; AVX512VL: # %bb.0: +; AVX512VL-NEXT: vpminud {{.*}}(%rip){1to4}, %xmm0, %xmm0 +; AVX512VL-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,4,8,12,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX512VL-NEXT: retq +; +; AVX512BW-LABEL: trunc_usat_v4i32_v4i8: +; AVX512BW: # %bb.0: +; AVX512BW-NEXT: vpbroadcastd {{.*#+}} xmm1 = [255,255,255,255] +; AVX512BW-NEXT: vpminud %xmm1, %xmm0, %xmm0 +; AVX512BW-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,4,8,12,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX512BW-NEXT: retq +; +; AVX512BWVL-LABEL: trunc_usat_v4i32_v4i8: +; AVX512BWVL: # %bb.0: +; AVX512BWVL-NEXT: vpminud {{.*}}(%rip){1to4}, %xmm0, %xmm0 +; 
AVX512BWVL-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,4,8,12,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX512BWVL-NEXT: retq + %1 = icmp ult <4 x i32> %a0, + %2 = select <4 x i1> %1, <4 x i32> %a0, <4 x i32> + %3 = trunc <4 x i32> %2 to <4 x i8> + ret <4 x i8> %3 +} + +define void @trunc_usat_v4i32_v4i8_store(<4 x i32> %a0, <4 x i8> *%p1) { +; SSE2-LABEL: trunc_usat_v4i32_v4i8_store: +; SSE2: # %bb.0: +; SSE2-NEXT: movdqa {{.*#+}} xmm1 = [2147483648,2147483648,2147483648,2147483648] +; SSE2-NEXT: pxor %xmm0, %xmm1 +; SSE2-NEXT: movdqa {{.*#+}} xmm2 = [2147483903,2147483903,2147483903,2147483903] +; SSE2-NEXT: pcmpgtd %xmm1, %xmm2 +; SSE2-NEXT: pand %xmm2, %xmm0 +; SSE2-NEXT: pandn {{.*}}(%rip), %xmm2 +; SSE2-NEXT: por %xmm0, %xmm2 +; SSE2-NEXT: pand {{.*}}(%rip), %xmm2 +; SSE2-NEXT: packuswb %xmm0, %xmm2 +; SSE2-NEXT: packuswb %xmm0, %xmm2 +; SSE2-NEXT: movd %xmm2, (%rdi) +; SSE2-NEXT: retq +; +; SSSE3-LABEL: trunc_usat_v4i32_v4i8_store: +; SSSE3: # %bb.0: +; SSSE3-NEXT: movdqa {{.*#+}} xmm1 = [2147483648,2147483648,2147483648,2147483648] +; SSSE3-NEXT: pxor %xmm0, %xmm1 +; SSSE3-NEXT: movdqa {{.*#+}} xmm2 = [2147483903,2147483903,2147483903,2147483903] +; SSSE3-NEXT: pcmpgtd %xmm1, %xmm2 +; SSSE3-NEXT: pand %xmm2, %xmm0 +; SSSE3-NEXT: pandn {{.*}}(%rip), %xmm2 +; SSSE3-NEXT: por %xmm0, %xmm2 +; SSSE3-NEXT: pshufb {{.*#+}} xmm2 = xmm2[0,4,8,12,u,u,u,u,u,u,u,u,u,u,u,u] +; SSSE3-NEXT: movd %xmm2, (%rdi) +; SSSE3-NEXT: retq +; +; SSE41-LABEL: trunc_usat_v4i32_v4i8_store: +; SSE41: # %bb.0: +; SSE41-NEXT: pminud {{.*}}(%rip), %xmm0 +; SSE41-NEXT: pshufb {{.*#+}} xmm0 = xmm0[0,4,8,12,u,u,u,u,u,u,u,u,u,u,u,u] +; SSE41-NEXT: movd %xmm0, (%rdi) +; SSE41-NEXT: retq +; +; AVX1-LABEL: trunc_usat_v4i32_v4i8_store: +; AVX1: # %bb.0: +; AVX1-NEXT: vpminud {{.*}}(%rip), %xmm0, %xmm0 +; AVX1-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,4,8,12,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX1-NEXT: vmovd %xmm0, (%rdi) +; AVX1-NEXT: retq +; +; AVX2-LABEL: trunc_usat_v4i32_v4i8_store: +; AVX2: # %bb.0: +; AVX2-NEXT: 
vpbroadcastd {{.*#+}} xmm1 = [255,255,255,255] +; AVX2-NEXT: vpminud %xmm1, %xmm0, %xmm0 +; AVX2-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,4,8,12,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX2-NEXT: vmovd %xmm0, (%rdi) +; AVX2-NEXT: retq +; +; AVX512F-LABEL: trunc_usat_v4i32_v4i8_store: +; AVX512F: # %bb.0: +; AVX512F-NEXT: vpbroadcastd {{.*#+}} xmm1 = [255,255,255,255] +; AVX512F-NEXT: vpminud %xmm1, %xmm0, %xmm0 +; AVX512F-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,4,8,12,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX512F-NEXT: vmovd %xmm0, (%rdi) +; AVX512F-NEXT: retq +; +; AVX512VL-LABEL: trunc_usat_v4i32_v4i8_store: +; AVX512VL: # %bb.0: +; AVX512VL-NEXT: vpmovusdb %xmm0, (%rdi) +; AVX512VL-NEXT: retq +; +; AVX512BW-LABEL: trunc_usat_v4i32_v4i8_store: +; AVX512BW: # %bb.0: +; AVX512BW-NEXT: vpbroadcastd {{.*#+}} xmm1 = [255,255,255,255] +; AVX512BW-NEXT: vpminud %xmm1, %xmm0, %xmm0 +; AVX512BW-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,4,8,12,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX512BW-NEXT: vmovd %xmm0, (%rdi) +; AVX512BW-NEXT: retq +; +; AVX512BWVL-LABEL: trunc_usat_v4i32_v4i8_store: +; AVX512BWVL: # %bb.0: +; AVX512BWVL-NEXT: vpmovusdb %xmm0, (%rdi) +; AVX512BWVL-NEXT: retq + %1 = icmp ult <4 x i32> %a0, + %2 = select <4 x i1> %1, <4 x i32> %a0, <4 x i32> + %3 = trunc <4 x i32> %2 to <4 x i8> + store <4 x i8> %3, <4 x i8> *%p1 + ret void +} + define <8 x i8> @trunc_usat_v8i32_v8i8(<8 x i32> %a0) { ; SSE2-LABEL: trunc_usat_v8i32_v8i8: ; SSE2: # %bb.0: @@ -2247,6 +3314,109 @@ define <16 x i8> @trunc_usat_v16i32_v16i ret <16 x i8> %3 } +define <8 x i8> @trunc_usat_v8i16_v8i8(<8 x i16> %a0) { +; SSE2-LABEL: trunc_usat_v8i16_v8i8: +; SSE2: # %bb.0: +; SSE2-NEXT: pxor {{.*}}(%rip), %xmm0 +; SSE2-NEXT: pminsw {{.*}}(%rip), %xmm0 +; SSE2-NEXT: pand {{.*}}(%rip), %xmm0 +; SSE2-NEXT: packuswb %xmm0, %xmm0 +; SSE2-NEXT: retq +; +; SSSE3-LABEL: trunc_usat_v8i16_v8i8: +; SSSE3: # %bb.0: +; SSSE3-NEXT: pxor {{.*}}(%rip), %xmm0 +; SSSE3-NEXT: pminsw {{.*}}(%rip), %xmm0 +; SSSE3-NEXT: pshufb {{.*#+}} xmm0 = 
xmm0[0,2,4,6,8,10,12,14,u,u,u,u,u,u,u,u] +; SSSE3-NEXT: retq +; +; SSE41-LABEL: trunc_usat_v8i16_v8i8: +; SSE41: # %bb.0: +; SSE41-NEXT: pminuw {{.*}}(%rip), %xmm0 +; SSE41-NEXT: packuswb %xmm0, %xmm0 +; SSE41-NEXT: retq +; +; AVX-LABEL: trunc_usat_v8i16_v8i8: +; AVX: # %bb.0: +; AVX-NEXT: vpminuw {{.*}}(%rip), %xmm0, %xmm0 +; AVX-NEXT: vpackuswb %xmm0, %xmm0, %xmm0 +; AVX-NEXT: retq +; +; AVX512-LABEL: trunc_usat_v8i16_v8i8: +; AVX512: # %bb.0: +; AVX512-NEXT: vpminuw {{.*}}(%rip), %xmm0, %xmm0 +; AVX512-NEXT: vpackuswb %xmm0, %xmm0, %xmm0 +; AVX512-NEXT: retq + %1 = icmp ult <8 x i16> %a0, + %2 = select <8 x i1> %1, <8 x i16> %a0, <8 x i16> + %3 = trunc <8 x i16> %2 to <8 x i8> + ret <8 x i8> %3 +} + +define void @trunc_usat_v8i16_v8i8_store(<8 x i16> %a0, <8 x i8> *%p1) { +; SSE2-LABEL: trunc_usat_v8i16_v8i8_store: +; SSE2: # %bb.0: +; SSE2-NEXT: pxor {{.*}}(%rip), %xmm0 +; SSE2-NEXT: pminsw {{.*}}(%rip), %xmm0 +; SSE2-NEXT: pand {{.*}}(%rip), %xmm0 +; SSE2-NEXT: packuswb %xmm0, %xmm0 +; SSE2-NEXT: movq %xmm0, (%rdi) +; SSE2-NEXT: retq +; +; SSSE3-LABEL: trunc_usat_v8i16_v8i8_store: +; SSSE3: # %bb.0: +; SSSE3-NEXT: pxor {{.*}}(%rip), %xmm0 +; SSSE3-NEXT: pminsw {{.*}}(%rip), %xmm0 +; SSSE3-NEXT: pshufb {{.*#+}} xmm0 = xmm0[0,2,4,6,8,10,12,14,u,u,u,u,u,u,u,u] +; SSSE3-NEXT: movq %xmm0, (%rdi) +; SSSE3-NEXT: retq +; +; SSE41-LABEL: trunc_usat_v8i16_v8i8_store: +; SSE41: # %bb.0: +; SSE41-NEXT: pminuw {{.*}}(%rip), %xmm0 +; SSE41-NEXT: packuswb %xmm0, %xmm0 +; SSE41-NEXT: movq %xmm0, (%rdi) +; SSE41-NEXT: retq +; +; AVX-LABEL: trunc_usat_v8i16_v8i8_store: +; AVX: # %bb.0: +; AVX-NEXT: vpminuw {{.*}}(%rip), %xmm0, %xmm0 +; AVX-NEXT: vpackuswb %xmm0, %xmm0, %xmm0 +; AVX-NEXT: vmovq %xmm0, (%rdi) +; AVX-NEXT: retq +; +; AVX512F-LABEL: trunc_usat_v8i16_v8i8_store: +; AVX512F: # %bb.0: +; AVX512F-NEXT: vpminuw {{.*}}(%rip), %xmm0, %xmm0 +; AVX512F-NEXT: vpackuswb %xmm0, %xmm0, %xmm0 +; AVX512F-NEXT: vmovq %xmm0, (%rdi) +; AVX512F-NEXT: retq +; +; AVX512VL-LABEL: 
trunc_usat_v8i16_v8i8_store: +; AVX512VL: # %bb.0: +; AVX512VL-NEXT: vpminuw {{.*}}(%rip), %xmm0, %xmm0 +; AVX512VL-NEXT: vpackuswb %xmm0, %xmm0, %xmm0 +; AVX512VL-NEXT: vmovq %xmm0, (%rdi) +; AVX512VL-NEXT: retq +; +; AVX512BW-LABEL: trunc_usat_v8i16_v8i8_store: +; AVX512BW: # %bb.0: +; AVX512BW-NEXT: vpminuw {{.*}}(%rip), %xmm0, %xmm0 +; AVX512BW-NEXT: vpackuswb %xmm0, %xmm0, %xmm0 +; AVX512BW-NEXT: vmovq %xmm0, (%rdi) +; AVX512BW-NEXT: retq +; +; AVX512BWVL-LABEL: trunc_usat_v8i16_v8i8_store: +; AVX512BWVL: # %bb.0: +; AVX512BWVL-NEXT: vpmovuswb %xmm0, (%rdi) +; AVX512BWVL-NEXT: retq + %1 = icmp ult <8 x i16> %a0, + %2 = select <8 x i1> %1, <8 x i16> %a0, <8 x i16> + %3 = trunc <8 x i16> %2 to <8 x i8> + store <8 x i8> %3, <8 x i8> *%p1 + ret void +} + define <16 x i8> @trunc_usat_v16i16_v16i8(<16 x i16> %a0) { ; SSE2-LABEL: trunc_usat_v16i16_v16i8: ; SSE2: # %bb.0: From llvm-commits at lists.llvm.org Thu Oct 10 20:48:56 2019 From: llvm-commits at lists.llvm.org (Philip Reames via llvm-commits) Date: Fri, 11 Oct 2019 03:48:56 -0000 Subject: [llvm] r374506 - [CVP] Remove a masking operation if range information implies it's a noop Message-ID: <20191011034856.3470A92976@lists.llvm.org> Author: reames Date: Thu Oct 10 20:48:56 2019 New Revision: 374506 URL: http://llvm.org/viewvc/llvm-project?rev=374506&view=rev Log: [CVP] Remove a masking operation if range information implies it's a noop This is really a known bits style transformation, but known bits isn't context sensitive. The particular case which comes up happens to involve a range which allows range based reasoning to eliminate the mask pattern, so handle that case specifically in CVP. InstCombine likes to generate the mask-by-low-bits pattern when widening an arithmetic expression which includes a zext in the middle. 
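To make the condition described in this log concrete, here is a minimal standalone sketch; it is not the code from the patch. It assumes plain 32-bit unsigned values in place of APInt and ConstantRange, and `isLowBitMask` and `andIsNoop` are hypothetical helper names chosen for illustration. The idea: if the constant is a low-bit mask of the form 2^k - 1 and the largest value the left operand can take is no greater than that mask, the `and` cannot clear any bit.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: true if C is a non-empty run of ones starting at
// bit 0 (that is, C == 2^k - 1 for some k >= 1). This is meant to mirror
// the intent of APInt::isMask() for a fixed 32-bit width.
bool isLowBitMask(uint32_t C) { return C != 0 && (C & (C + 1)) == 0; }

// Hypothetical helper: MaxX is an inclusive unsigned upper bound on X,
// standing in for the unsigned maximum of the range known at the use.
// Returns true when (X & C) == X is guaranteed for every X <= MaxX.
bool andIsNoop(uint32_t MaxX, uint32_t C) {
  return isLowBitMask(C) && MaxX <= C;
}
```

With a bound of 255 and a mask of 255 or 1023 the `and` is redundant, while a bound of 256, or a constant such as 254 that is not a low-bit mask, keeps it. The check is sufficient rather than necessary: it can reject cases where the `and` happens to be a no-op for other reasons.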
Differential Revision: https://reviews.llvm.org/D68811 Added: llvm/trunk/test/Transforms/CorrelatedValuePropagation/and.ll Modified: llvm/trunk/lib/Transforms/Scalar/CorrelatedValuePropagation.cpp llvm/trunk/test/Transforms/CorrelatedValuePropagation/overflows.ll llvm/trunk/test/Transforms/CorrelatedValuePropagation/range.ll Modified: llvm/trunk/lib/Transforms/Scalar/CorrelatedValuePropagation.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/CorrelatedValuePropagation.cpp?rev=374506&r1=374505&r2=374506&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Scalar/CorrelatedValuePropagation.cpp (original) +++ llvm/trunk/lib/Transforms/Scalar/CorrelatedValuePropagation.cpp Thu Oct 10 20:48:56 2019 @@ -63,6 +63,7 @@ STATISTIC(NumUDivs, "Number of udivs STATISTIC(NumAShrs, "Number of ashr converted to lshr"); STATISTIC(NumSRems, "Number of srem converted to urem"); STATISTIC(NumSExt, "Number of sext converted to zext"); +STATISTIC(NumAnd, "Number of ands removed"); STATISTIC(NumOverflows, "Number of overflow checks removed"); STATISTIC(NumSaturating, "Number of saturating arithmetics converted to normal arithmetics"); @@ -700,6 +701,29 @@ static bool processBinOp(BinaryOperator return Changed; } +static bool processAnd(BinaryOperator *BinOp, LazyValueInfo *LVI) { + if (BinOp->getType()->isVectorTy()) + return false; + + // Pattern match (and lhs, C) where C includes a superset of bits which might + // be set in lhs. This is a common truncation idiom created by instcombine. 
+ BasicBlock *BB = BinOp->getParent(); + Value *LHS = BinOp->getOperand(0); + ConstantInt *RHS = dyn_cast<ConstantInt>(BinOp->getOperand(1)); + if (!RHS || !RHS->getValue().isMask()) + return false; + + ConstantRange LRange = LVI->getConstantRange(LHS, BB, BinOp); + if (!LRange.getUnsignedMax().ule(RHS->getValue())) + return false; + + BinOp->replaceAllUsesWith(LHS); + BinOp->eraseFromParent(); + NumAnd++; + return true; +} + + static Constant *getConstantAt(Value *V, Instruction *At, LazyValueInfo *LVI) { if (Constant *C = LVI->getConstant(V, At->getParent(), At)) return C; @@ -774,6 +798,9 @@ static bool runImpl(Function &F, LazyVal case Instruction::Sub: BBChanged |= processBinOp(cast<BinaryOperator>(II), LVI); break; + case Instruction::And: + BBChanged |= processAnd(cast<BinaryOperator>(II), LVI); + break; } } Added: llvm/trunk/test/Transforms/CorrelatedValuePropagation/and.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/CorrelatedValuePropagation/and.ll?rev=374506&view=auto ============================================================================== --- llvm/trunk/test/Transforms/CorrelatedValuePropagation/and.ll (added) +++ llvm/trunk/test/Transforms/CorrelatedValuePropagation/and.ll Thu Oct 10 20:48:56 2019 @@ -0,0 +1,127 @@ +; NOTE: Assertions have been autogenerated by utils/update_test_checks.py +; RUN: opt < %s -correlated-propagation -S | FileCheck %s + +define i32 @test(i32 %a) { +; CHECK-LABEL: @test( +; CHECK-NEXT: entry: +; CHECK-NEXT: [[CMP:%.*]] = icmp ult i32 [[A:%.*]], 128 +; CHECK-NEXT: br i1 [[CMP]], label [[CONTINUE:%.*]], label [[EXIT:%.*]] +; CHECK: continue: +; CHECK-NEXT: ret i32 [[A]] +; CHECK: exit: +; CHECK-NEXT: ret i32 -1 +; +entry: + %cmp = icmp ult i32 %a, 128 + br i1 %cmp, label %continue, label %exit +continue: + %and = and i32 %a, 255 + ret i32 %and +exit: + ret i32 -1 +} + +define i32 @test2(i32 %a) { +; CHECK-LABEL: @test2( +; CHECK-NEXT: entry: +; CHECK-NEXT: [[CMP:%.*]] = icmp ult i32 [[A:%.*]], 256 +; CHECK-NEXT: br i1 [[CMP]], label 
[[CONTINUE:%.*]], label [[EXIT:%.*]] +; CHECK: continue: +; CHECK-NEXT: ret i32 [[A]] +; CHECK: exit: +; CHECK-NEXT: ret i32 -1 +; +entry: + %cmp = icmp ult i32 %a, 256 + br i1 %cmp, label %continue, label %exit +continue: + %and = and i32 %a, 255 + ret i32 %and +exit: + ret i32 -1 +} + +define i32 @test3(i32 %a) { +; CHECK-LABEL: @test3( +; CHECK-NEXT: entry: +; CHECK-NEXT: [[CMP:%.*]] = icmp ult i32 [[A:%.*]], 256 +; CHECK-NEXT: br i1 [[CMP]], label [[CONTINUE:%.*]], label [[EXIT:%.*]] +; CHECK: continue: +; CHECK-NEXT: ret i32 [[A]] +; CHECK: exit: +; CHECK-NEXT: ret i32 -1 +; +entry: + %cmp = icmp ult i32 %a, 256 + br i1 %cmp, label %continue, label %exit +continue: + %and = and i32 %a, 1023 + ret i32 %and +exit: + ret i32 -1 +} + + +define i32 @neg1(i32 %a) { +; CHECK-LABEL: @neg1( +; CHECK-NEXT: entry: +; CHECK-NEXT: [[CMP:%.*]] = icmp ule i32 [[A:%.*]], 256 +; CHECK-NEXT: br i1 [[CMP]], label [[CONTINUE:%.*]], label [[EXIT:%.*]] +; CHECK: continue: +; CHECK-NEXT: [[AND:%.*]] = and i32 [[A]], 255 +; CHECK-NEXT: ret i32 [[AND]] +; CHECK: exit: +; CHECK-NEXT: ret i32 -1 +; +entry: + %cmp = icmp ule i32 %a, 256 + br i1 %cmp, label %continue, label %exit +continue: + %and = and i32 %a, 255 + ret i32 %and +exit: + ret i32 -1 +} + +define i32 @neg2(i32 %a) { +; CHECK-LABEL: @neg2( +; CHECK-NEXT: entry: +; CHECK-NEXT: [[CMP:%.*]] = icmp ult i32 [[A:%.*]], 513 +; CHECK-NEXT: br i1 [[CMP]], label [[CONTINUE:%.*]], label [[EXIT:%.*]] +; CHECK: continue: +; CHECK-NEXT: [[AND:%.*]] = and i32 [[A]], 255 +; CHECK-NEXT: ret i32 [[AND]] +; CHECK: exit: +; CHECK-NEXT: ret i32 -1 +; +entry: + %cmp = icmp ult i32 %a, 513 + br i1 %cmp, label %continue, label %exit +continue: + %and = and i32 %a, 255 + ret i32 %and +exit: + ret i32 -1 +} + +define i32 @neg3(i32 %a) { +; CHECK-LABEL: @neg3( +; CHECK-NEXT: entry: +; CHECK-NEXT: [[CMP:%.*]] = icmp ult i32 [[A:%.*]], 256 +; CHECK-NEXT: br i1 [[CMP]], label [[CONTINUE:%.*]], label [[EXIT:%.*]] +; CHECK: continue: +; CHECK-NEXT: 
[[AND:%.*]] = and i32 [[A]], 254 +; CHECK-NEXT: ret i32 [[AND]] +; CHECK: exit: +; CHECK-NEXT: ret i32 -1 +; +entry: + %cmp = icmp ult i32 %a, 256 + br i1 %cmp, label %continue, label %exit +continue: + %and = and i32 %a, 254 + ret i32 %and +exit: + ret i32 -1 +} + Modified: llvm/trunk/test/Transforms/CorrelatedValuePropagation/overflows.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/CorrelatedValuePropagation/overflows.ll?rev=374506&r1=374505&r2=374506&view=diff ============================================================================== --- llvm/trunk/test/Transforms/CorrelatedValuePropagation/overflows.ll (original) +++ llvm/trunk/test/Transforms/CorrelatedValuePropagation/overflows.ll Thu Oct 10 20:48:56 2019 @@ -1023,7 +1023,6 @@ define i1 @smul_and_cmp(i32 %x, i32 %y) ; CHECK-NEXT: [[MUL:%.*]] = extractvalue { i32, i1 } [[TMP0]], 0 ; CHECK-NEXT: br label [[CONT3:%.*]] ; CHECK: cont3: -; CHECK-NEXT: [[CMP5:%.*]] = and i1 true, true ; CHECK-NEXT: br label [[OUT]] ; CHECK: out: ; CHECK-NEXT: ret i1 true Modified: llvm/trunk/test/Transforms/CorrelatedValuePropagation/range.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/CorrelatedValuePropagation/range.ll?rev=374506&r1=374505&r2=374506&view=diff ============================================================================== --- llvm/trunk/test/Transforms/CorrelatedValuePropagation/range.ll (original) +++ llvm/trunk/test/Transforms/CorrelatedValuePropagation/range.ll Thu Oct 10 20:48:56 2019 @@ -745,10 +745,9 @@ target93: define i1 @test17_i1(i1 %a) { ; CHECK-LABEL: @test17_i1( ; CHECK-NEXT: entry: -; CHECK-NEXT: [[C:%.*]] = and i1 [[A:%.*]], true ; CHECK-NEXT: br label [[DISPATCH:%.*]] ; CHECK: dispatch: -; CHECK-NEXT: br i1 [[A]], label [[TRUE:%.*]], label [[DISPATCH]] +; CHECK-NEXT: br i1 [[A:%.*]], label [[TRUE:%.*]], label [[DISPATCH]] ; CHECK: true: ; CHECK-NEXT: ret i1 true ; From llvm-commits at lists.llvm.org Thu Oct 10 20:49:53 2019 From: llvm-commits at 
lists.llvm.org (Philip Reames via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 03:49:53 +0000 (UTC) Subject: [PATCH] D68811: [CVP] Remove a masking operation if range information implies it's a noop In-Reply-To: References: Message-ID: This revision was not accepted when it landed; it landed in state "Needs Review". This revision was automatically updated to reflect the committed changes. Closed by commit rG2d5820cd7225: [CVP] Remove a masking operation if range information implies it's a noop (authored by reames). Herald added a subscriber: hiraditya. Changed prior to commit: https://reviews.llvm.org/D68811?vs=224423&id=224537#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68811/new/ https://reviews.llvm.org/D68811 Files: llvm/lib/Transforms/Scalar/CorrelatedValuePropagation.cpp llvm/test/Transforms/CorrelatedValuePropagation/and.ll llvm/test/Transforms/CorrelatedValuePropagation/overflows.ll llvm/test/Transforms/CorrelatedValuePropagation/range.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D68811.224537.patch Type: text/x-patch Size: 6544 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Thu Oct 10 21:02:04 2019 From: llvm-commits at lists.llvm.org (Craig Topper via llvm-commits) Date: Fri, 11 Oct 2019 04:02:04 -0000 Subject: [llvm] r374507 - [X86] Add test case for trunc_packus_v16i32_v16i8_store to min-legal-vector-width.ll Message-ID: <20191011040204.7300D92D8A@lists.llvm.org> Author: ctopper Date: Thu Oct 10 21:02:04 2019 New Revision: 374507 URL: http://llvm.org/viewvc/llvm-project?rev=374507&view=rev Log: [X86] Add test case for trunc_packus_v16i32_v16i8_store to min-legal-vector-width.ll We aren't folding the vpmovuswb into the store. 
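For readers following the test that this commit adds, the sketch below models one lane of the expected instruction sequence in scalar C++. It is an illustration, not part of the patch: the function names are hypothetical, and the clamp bound of 255 is an assumption taken from the i8 result type, since the constant vector operands are missing from the archived IR. It shows that the signed-to-unsigned saturating pack (vpackusdw) followed by the unsigned saturating truncate (vpmovuswb) gives the same result as clamping a signed 32-bit value to [0, 255] and truncating.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Scalar model of the IR pattern: clamp a signed 32-bit lane to [0, 255]
// (the 255 bound is assumed, see the note above), then truncate to 8 bits.
uint8_t clampThenTrunc(int32_t X) {
  int32_t Clamped = std::min<int32_t>(std::max<int32_t>(X, 0), 255);
  return static_cast<uint8_t>(Clamped);
}

// Scalar model of one PACKUSDW lane: signed dword to unsigned word with
// unsigned saturation.
uint16_t packusDwordToWord(int32_t X) {
  int32_t Clamped = std::min<int32_t>(std::max<int32_t>(X, 0), 65535);
  return static_cast<uint16_t>(Clamped);
}

// Scalar model of one VPMOVUSWB lane: unsigned word to unsigned byte with
// unsigned saturation.
uint8_t usatWordToByte(uint16_t X) {
  return static_cast<uint8_t>(std::min<uint16_t>(X, 255));
}

// The two-step sequence that appears in the CHECK lines.
uint8_t twoStep(int32_t X) { return usatWordToByte(packusDwordToWord(X)); }
```

Because both steps saturate toward the same unsigned range, composing them is equivalent to the single clamp for every input.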
Modified: llvm/trunk/test/CodeGen/X86/min-legal-vector-width.ll Modified: llvm/trunk/test/CodeGen/X86/min-legal-vector-width.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/min-legal-vector-width.ll?rev=374507&r1=374506&r2=374507&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/min-legal-vector-width.ll (original) +++ llvm/trunk/test/CodeGen/X86/min-legal-vector-width.ll Thu Oct 10 21:02:04 2019 @@ -1080,7 +1080,7 @@ define void @vselect_split_v16i16_setcc( ret void } -define <16 x i8> @trunc_packus_v16i32_v16i8(<16 x i32>* %p, <16 x i8>* %q) "min-legal-vector-width"="256" { +define <16 x i8> @trunc_packus_v16i32_v16i8(<16 x i32>* %p) "min-legal-vector-width"="256" { ; CHECK-LABEL: trunc_packus_v16i32_v16i8: ; CHECK: # %bb.0: ; CHECK-NEXT: vmovdqa (%rdi), %ymm0 @@ -1098,6 +1098,26 @@ define <16 x i8> @trunc_packus_v16i32_v1 ret <16 x i8> %f } +define void @trunc_packus_v16i32_v16i8_store(<16 x i32>* %p, <16 x i8>* %q) "min-legal-vector-width"="256" { +; CHECK-LABEL: trunc_packus_v16i32_v16i8_store: +; CHECK: # %bb.0: +; CHECK-NEXT: vmovdqa (%rdi), %ymm0 +; CHECK-NEXT: vpackusdw 32(%rdi), %ymm0, %ymm0 +; CHECK-NEXT: vpermq {{.*#+}} ymm0 = ymm0[0,2,1,3] +; CHECK-NEXT: vpmovuswb %ymm0, %xmm0 +; CHECK-NEXT: vmovdqa %xmm0, (%rsi) +; CHECK-NEXT: vzeroupper +; CHECK-NEXT: retq + %a = load <16 x i32>, <16 x i32>* %p + %b = icmp slt <16 x i32> %a, + %c = select <16 x i1> %b, <16 x i32> %a, <16 x i32> + %d = icmp sgt <16 x i32> %c, zeroinitializer + %e = select <16 x i1> %d, <16 x i32> %c, <16 x i32> zeroinitializer + %f = trunc <16 x i32> %e to <16 x i8> + store <16 x i8> %f, <16 x i8>* %q + ret void +} + define <32 x i8> @trunc_packus_v32i32_v32i8(<32 x i32>* %p) "min-legal-vector-width"="256" { ; CHECK-LABEL: trunc_packus_v32i32_v32i8: ; CHECK: # %bb.0: From llvm-commits at lists.llvm.org Thu Oct 10 20:59:52 2019 From: llvm-commits at lists.llvm.org (Shreyansh Chouhan via 
Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 03:59:52 +0000 (UTC) Subject: [PATCH] D66604: [GVN] AnalyzeLoadAvailability: Replace a load after lifetime.end with undef (PR20811) In-Reply-To: References: Message-ID: <31c358d788e151b29700b96572c19665@localhost.localdomain> BK1603 marked an inline comment as done. BK1603 added a comment. @xbolva00 Yes, I don't have commit access, so I need someone to land this patch for me :) Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D66604/new/ https://reviews.llvm.org/D66604 From llvm-commits at lists.llvm.org Thu Oct 10 21:02:09 2019 From: llvm-commits at lists.llvm.org (David Greene via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 04:02:09 +0000 (UTC) Subject: [PATCH] D68819: [Utils] Allow update_test_checks to check function arguments In-Reply-To: References: Message-ID: greened added a comment. In D68819#1704962, @jdoerfert wrote: > We can, or should, combine D68153 and this, either in one or two patches. Sounds good. Do you think D68153 should operate under the `--function-signature` flag, under a different flag, or always include `define` in the pattern (meaning all tests will change when the tool is re-run)? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68819/new/ https://reviews.llvm.org/D68819 From llvm-commits at lists.llvm.org Thu Oct 10 21:02:50 2019 From: llvm-commits at lists.llvm.org (Galina Kistanova via llvm-commits) Date: Thu, 10 Oct 2019 21:02:50 -0700 Subject: LLVM buildmaster will be restarted soon Message-ID: Hello everyone, LLVM buildmaster will be updated and restarted in a few minutes. Thanks Galina -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From llvm-commits at lists.llvm.org Thu Oct 10 21:16:49 2019 From: llvm-commits at lists.llvm.org (Craig Topper via llvm-commits) Date: Fri, 11 Oct 2019 04:16:49 -0000 Subject: [llvm] r374509 - [X86] Add a DAG combine to turn v16i16->v16i8 VTRUNCUS+store into a saturating truncating store. Message-ID: <20191011041649.BF66692D8B@lists.llvm.org> Author: ctopper Date: Thu Oct 10 21:16:49 2019 New Revision: 374509 URL: http://llvm.org/viewvc/llvm-project?rev=374509&view=rev Log: [X86] Add a DAG combine to turn v16i16->v16i8 VTRUNCUS+store into a saturating truncating store. Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp llvm/trunk/test/CodeGen/X86/min-legal-vector-width.ll Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=374509&r1=374508&r2=374509&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp (original) +++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Thu Oct 10 21:16:49 2019 @@ -40448,6 +40448,19 @@ static SDValue combineStore(SDNode *N, S MVT::v16i8, St->getMemOperand()); } + // Try to fold a vpmovuswb 256->128 into a truncating store. + // FIXME: Generalize this to other types. + // FIXME: Do the same for signed saturation. + if (!St->isTruncatingStore() && VT == MVT::v16i8 && + St->getValue().getOpcode() == X86ISD::VTRUNCUS && + St->getValue().getOperand(0).getValueType() == MVT::v16i16 && + TLI.isTruncStoreLegal(MVT::v16i16, MVT::v16i8) && + St->getValue().hasOneUse()) { + return EmitTruncSStore(false /* Unsigned saturation */, St->getChain(), + dl, St->getValue().getOperand(0), St->getBasePtr(), + MVT::v16i8, St->getMemOperand(), DAG); + } + // Optimize trunc store (of multiple scalars) to shuffle and store. // First, pack all of the elements in one place. Next, store to memory // in fewer chunks. 
Modified: llvm/trunk/test/CodeGen/X86/min-legal-vector-width.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/min-legal-vector-width.ll?rev=374509&r1=374508&r2=374509&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/min-legal-vector-width.ll (original) +++ llvm/trunk/test/CodeGen/X86/min-legal-vector-width.ll Thu Oct 10 21:16:49 2019 @@ -1104,8 +1104,7 @@ define void @trunc_packus_v16i32_v16i8_s ; CHECK-NEXT: vmovdqa (%rdi), %ymm0 ; CHECK-NEXT: vpackusdw 32(%rdi), %ymm0, %ymm0 ; CHECK-NEXT: vpermq {{.*#+}} ymm0 = ymm0[0,2,1,3] -; CHECK-NEXT: vpmovuswb %ymm0, %xmm0 -; CHECK-NEXT: vmovdqa %xmm0, (%rsi) +; CHECK-NEXT: vpmovuswb %ymm0, (%rsi) ; CHECK-NEXT: vzeroupper ; CHECK-NEXT: retq %a = load <16 x i32>, <16 x i32>* %p From llvm-commits at lists.llvm.org Thu Oct 10 21:28:51 2019 From: llvm-commits at lists.llvm.org (Billy Robert O'Neal III via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 04:28:51 +0000 (UTC) Subject: [PATCH] D68820: win: Move Parallel.h off concrt to cross-platform code In-Reply-To: References: Message-ID: <9d7372acca388c22f3eac02797421780@localhost.localdomain> BillyONeal added inline comments. ================ Comment at: llvm/include/llvm/Support/Parallel.h:124 TaskGroup TG; parallel_quick_sort(Start, End, Comp, TG, llvm::Log2_64(std::distance(Start, End)) + 1); ---------------- If you get a chance to benchmark I'm curious how this compares to our std::sort(std::execution::par, ...) 
version :) Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68820/new/ https://reviews.llvm.org/D68820 From llvm-commits at lists.llvm.org Thu Oct 10 22:05:36 2019 From: llvm-commits at lists.llvm.org (Hubert Tong via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 05:05:36 +0000 (UTC) Subject: [PATCH] D68341: [AIX] TOC pseudo expansion for 64bit large + 64bit small + 32bit large modes In-Reply-To: References: Message-ID: hubert.reinterpretcast added inline comments. ================ Comment at: llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp:755 const MCExpr *Exp = - MCSymbolRefExpr::create(TOCEntry, MCSymbolRefExpr::VK_PPC_TOC, - OutContext); + MCSymbolRefExpr::create(TOCEntry, VK, OutContext); TmpInst.getOperand(1) = MCOperand::createExpr(Exp); ---------------- Similar cases below have two more spaces of indentation. ================ Comment at: llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp:845 + const MCSymbolRefExpr::VariantKind VK = + !IsAIX ? MCSymbolRefExpr::VK_PPC_TOC_HA : MCSymbolRefExpr::VK_PPC_U; const MCExpr *Exp = ---------------- See comment below re: the `!`. ================ Comment at: llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp:887 + const MCSymbolRefExpr::VariantKind VK = + !IsAIX ? MCSymbolRefExpr::VK_PPC_TOC_LO : MCSymbolRefExpr::VK_PPC_L; const MCExpr *Exp = ---------------- I believe @sfertile made a comment about avoiding the `!` in the condition in cases like these. ================ Comment at: llvm/test/CodeGen/PowerPC/lower-globaladdr64-aix-asm.ll:26 +; LARGE: ld [[REG4:[0-9]+]], LC1 at l([[REG2]]) +; LARGE: lwz [[REG4:[0-9]+]], 0([[REG3]]) + ---------------- This does not follow. `REG3` apparently holds the address of the operand for the load, so `REG4` holds the address of the target of the store. We are loading the value to `REG4` though, so its value will be clobbered before we get to the store. 
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68341/new/ https://reviews.llvm.org/D68341 From llvm-commits at lists.llvm.org Thu Oct 10 22:13:56 2019 From: llvm-commits at lists.llvm.org (Chen Zheng via llvm-commits) Date: Fri, 11 Oct 2019 05:13:56 -0000 Subject: [llvm] r374512 - [InstCombine] recognize popcount. Message-ID: <20191011051356.8518B92E6C@lists.llvm.org> Author: shchenz Date: Thu Oct 10 22:13:56 2019 New Revision: 374512 URL: http://llvm.org/viewvc/llvm-project?rev=374512&view=rev Log: [InstCombine] recognize popcount. This patch recognizes popcount intrinsic according to algorithm from website http://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetParallel Differential Revision: https://reviews.llvm.org/D68189 Added: llvm/trunk/test/Transforms/AggressiveInstCombine/popcount.ll Modified: llvm/trunk/include/llvm/IR/PatternMatch.h llvm/trunk/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp Modified: llvm/trunk/include/llvm/IR/PatternMatch.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/IR/PatternMatch.h?rev=374512&r1=374511&r2=374512&view=diff ============================================================================== --- llvm/trunk/include/llvm/IR/PatternMatch.h (original) +++ llvm/trunk/include/llvm/IR/PatternMatch.h Thu Oct 10 22:13:56 2019 @@ -643,11 +643,11 @@ struct bind_const_intval_ty { }; /// Match a specified integer value or vector of all elements of that -// value. +/// value. 
struct specific_intval { - uint64_t Val; + APInt Val; - specific_intval(uint64_t V) : Val(V) {} + specific_intval(APInt V) : Val(std::move(V)) {} template <typename ITy> bool match(ITy *V) { const auto *CI = dyn_cast<ConstantInt>(V); @@ -655,13 +655,19 @@ struct specific_intval { if (const auto *C = dyn_cast<Constant>(V)) CI = dyn_cast_or_null<ConstantInt>(C->getSplatValue()); - return CI && CI->getValue() == Val; + return CI && APInt::isSameValue(CI->getValue(), Val); } }; /// Match a specific integer value or vector with all elements equal to /// the value. -inline specific_intval m_SpecificInt(uint64_t V) { return specific_intval(V); } +inline specific_intval m_SpecificInt(APInt V) { + return specific_intval(std::move(V)); +} + +inline specific_intval m_SpecificInt(uint64_t V) { + return m_SpecificInt(APInt(64, V)); +} /// Match a ConstantInt and bind to its value. This does not match /// ConstantInts wider than 64-bits. Modified: llvm/trunk/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp?rev=374512&r1=374511&r2=374512&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp (original) +++ llvm/trunk/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp Thu Oct 10 22:13:56 2019 @@ -250,6 +250,72 @@ static bool foldAnyOrAllBitsSet(Instruct return true; } +// Try to recognize below function as popcount intrinsic. +// This is the "best" algorithm from +// http://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetParallel +// Also used in TargetLowering::expandCTPOP(). 
+// +// int popcount(unsigned int i) { +// i = i - ((i >> 1) & 0x55555555); +// i = (i & 0x33333333) + ((i >> 2) & 0x33333333); +// i = ((i + (i >> 4)) & 0x0F0F0F0F); +// return (i * 0x01010101) >> 24; +// } +static bool tryToRecognizePopCount(Instruction &I) { + if (I.getOpcode() != Instruction::LShr) + return false; + + Type *Ty = I.getType(); + if (!Ty->isIntOrIntVectorTy()) + return false; + + unsigned Len = Ty->getScalarSizeInBits(); + // FIXME: fix Len == 8 and other irregular type lengths. + if (!(Len <= 128 && Len > 8 && Len % 8 == 0)) + return false; + + APInt Mask55 = APInt::getSplat(Len, APInt(8, 0x55)); + APInt Mask33 = APInt::getSplat(Len, APInt(8, 0x33)); + APInt Mask0F = APInt::getSplat(Len, APInt(8, 0x0F)); + APInt Mask01 = APInt::getSplat(Len, APInt(8, 0x01)); + APInt MaskShift = APInt(Len, Len - 8); + + Value *Op0 = I.getOperand(0); + Value *Op1 = I.getOperand(1); + Value *MulOp0; + // Matching "(i * 0x01010101...) >> 24". + if ((match(Op0, m_Mul(m_Value(MulOp0), m_SpecificInt(Mask01)))) && + match(Op1, m_SpecificInt(MaskShift))) { + Value *ShiftOp0; + // Matching "((i + (i >> 4)) & 0x0F0F0F0F...)". + if (match(MulOp0, m_And(m_c_Add(m_LShr(m_Value(ShiftOp0), m_SpecificInt(4)), + m_Deferred(ShiftOp0)), + m_SpecificInt(Mask0F)))) { + Value *AndOp0; + // Matching "(i & 0x33333333...) + ((i >> 2) & 0x33333333...)". + if (match(ShiftOp0, + m_c_Add(m_And(m_Value(AndOp0), m_SpecificInt(Mask33)), + m_And(m_LShr(m_Deferred(AndOp0), m_SpecificInt(2)), + m_SpecificInt(Mask33))))) { + Value *Root, *SubOp1; + // Matching "i - ((i >> 1) & 0x55555555...)". 
+ if (match(AndOp0, m_Sub(m_Value(Root), m_Value(SubOp1))) && + match(SubOp1, m_And(m_LShr(m_Specific(Root), m_SpecificInt(1)), + m_SpecificInt(Mask55)))) { + LLVM_DEBUG(dbgs() << "Recognized popcount intrinsic\n"); + IRBuilder<> Builder(&I); + Function *Func = Intrinsic::getDeclaration( + I.getModule(), Intrinsic::ctpop, I.getType()); + I.replaceAllUsesWith(Builder.CreateCall(Func, {Root})); + return true; + } + } + } + } + + return false; +} + /// This is the entry point for folds that could be implemented in regular /// InstCombine, but they are separated because they are not expected to /// occur frequently and/or have more than a constant-length pattern match. @@ -268,6 +334,7 @@ static bool foldUnusualPatterns(Function for (Instruction &I : make_range(BB.rbegin(), BB.rend())) { MadeChange |= foldAnyOrAllBitsSet(I); MadeChange |= foldGuardedRotateToFunnelShift(I); + MadeChange |= tryToRecognizePopCount(I); } } Added: llvm/trunk/test/Transforms/AggressiveInstCombine/popcount.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/AggressiveInstCombine/popcount.ll?rev=374512&view=auto ============================================================================== --- llvm/trunk/test/Transforms/AggressiveInstCombine/popcount.ll (added) +++ llvm/trunk/test/Transforms/AggressiveInstCombine/popcount.ll Thu Oct 10 22:13:56 2019 @@ -0,0 +1,193 @@ +; NOTE: Assertions have been autogenerated by utils/update_test_checks.py +; RUN: opt -O3 < %s -instcombine -S | FileCheck %s + +;int popcount8(unsigned char i) { +; i = i - ((i >> 1) & 0x55); +; i = (i & 0x33) + ((i >> 2) & 0x33); +; i = ((i + (i >> 4)) & 0x0F); +; return (i * 0x01010101); +;} +define signext i32 @popcount8(i8 zeroext %0) { +; CHECK-LABEL: @popcount8( +; CHECK-NEXT: [[TMP2:%.*]] = lshr i8 [[TMP0:%.*]], 1 +; CHECK-NEXT: [[TMP3:%.*]] = and i8 [[TMP2]], 85 +; CHECK-NEXT: [[TMP4:%.*]] = sub i8 [[TMP0]], [[TMP3]] +; CHECK-NEXT: [[TMP5:%.*]] = and i8 [[TMP4]], 51 +; CHECK-NEXT: [[TMP6:%.*]] = lshr 
i8 [[TMP4]], 2 +; CHECK-NEXT: [[TMP7:%.*]] = and i8 [[TMP6]], 51 +; CHECK-NEXT: [[TMP8:%.*]] = add nuw nsw i8 [[TMP7]], [[TMP5]] +; CHECK-NEXT: [[TMP9:%.*]] = lshr i8 [[TMP8]], 4 +; CHECK-NEXT: [[TMP10:%.*]] = add nuw nsw i8 [[TMP9]], [[TMP8]] +; CHECK-NEXT: [[TMP11:%.*]] = and i8 [[TMP10]], 15 +; CHECK-NEXT: [[TMP12:%.*]] = zext i8 [[TMP11]] to i32 +; CHECK-NEXT: ret i32 [[TMP12]] +; + %2 = lshr i8 %0, 1 + %3 = and i8 %2, 85 + %4 = sub i8 %0, %3 + %5 = and i8 %4, 51 + %6 = lshr i8 %4, 2 + %7 = and i8 %6, 51 + %8 = add nuw nsw i8 %7, %5 + %9 = lshr i8 %8, 4 + %10 = add nuw nsw i8 %9, %8 + %11 = and i8 %10, 15 + %12 = zext i8 %11 to i32 + ret i32 %12 +} + +;int popcount32(unsigned i) { +; i = i - ((i >> 1) & 0x55555555); +; i = (i & 0x33333333) + ((i >> 2) & 0x33333333); +; i = ((i + (i >> 4)) & 0x0F0F0F0F); +; return (i * 0x01010101) >> 24; +;} +define signext i32 @popcount32(i32 zeroext %0) { +; CHECK-LABEL: @popcount32( +; CHECK-NEXT: [[TMP2:%.*]] = tail call i32 @llvm.ctpop.i32(i32 [[TMP0:%.*]]), !range !0 +; CHECK-NEXT: ret i32 [[TMP2]] +; + %2 = lshr i32 %0, 1 + %3 = and i32 %2, 1431655765 + %4 = sub i32 %0, %3 + %5 = and i32 %4, 858993459 + %6 = lshr i32 %4, 2 + %7 = and i32 %6, 858993459 + %8 = add nuw nsw i32 %7, %5 + %9 = lshr i32 %8, 4 + %10 = add nuw nsw i32 %9, %8 + %11 = and i32 %10, 252645135 + %12 = mul i32 %11, 16843009 + %13 = lshr i32 %12, 24 + ret i32 %13 +} + +;int popcount64(unsigned long long i) { +; i = i - ((i >> 1) & 0x5555555555555555); +; i = (i & 0x3333333333333333) + ((i >> 2) & 0x3333333333333333); +; i = ((i + (i >> 4)) & 0x0F0F0F0F0F0F0F0F); +; return (i * 0x0101010101010101) >> 56; +;} +define signext i32 @popcount64(i64 %0) { +; CHECK-LABEL: @popcount64( +; CHECK-NEXT: [[TMP2:%.*]] = tail call i64 @llvm.ctpop.i64(i64 [[TMP0:%.*]]), !range !1 +; CHECK-NEXT: [[TMP3:%.*]] = trunc i64 [[TMP2]] to i32 +; CHECK-NEXT: ret i32 [[TMP3]] +; + %2 = lshr i64 %0, 1 + %3 = and i64 %2, 6148914691236517205 + %4 = sub i64 %0, %3 + %5 = and i64 %4, 
3689348814741910323 + %6 = lshr i64 %4, 2 + %7 = and i64 %6, 3689348814741910323 + %8 = add nuw nsw i64 %7, %5 + %9 = lshr i64 %8, 4 + %10 = add nuw nsw i64 %9, %8 + %11 = and i64 %10, 1085102592571150095 + %12 = mul i64 %11, 72340172838076673 + %13 = lshr i64 %12, 56 + %14 = trunc i64 %13 to i32 + ret i32 %14 +} + +;int popcount128(__uint128_t i) { +; __uint128_t x = 0x5555555555555555; +; x <<= 64; +; x |= 0x5555555555555555; +; __uint128_t y = 0x3333333333333333; +; y <<= 64; +; y |= 0x3333333333333333; +; __uint128_t z = 0x0f0f0f0f0f0f0f0f; +; z <<= 64; +; z |= 0x0f0f0f0f0f0f0f0f; +; __uint128_t a = 0x0101010101010101; +; a <<= 64; +; a |= 0x0101010101010101; +; unsigned mask = 120; +; i = i - ((i >> 1) & x); +; i = (i & y) + ((i >> 2) & y); +; i = ((i + (i >> 4)) & z); +; return (i * a) >> mask; +;} +define signext i32 @popcount128(i128 %0) { +; CHECK-LABEL: @popcount128( +; CHECK-NEXT: [[TMP2:%.*]] = tail call i128 @llvm.ctpop.i128(i128 [[TMP0:%.*]]), !range !2 +; CHECK-NEXT: [[TMP3:%.*]] = trunc i128 [[TMP2]] to i32 +; CHECK-NEXT: ret i32 [[TMP3]] +; + %2 = lshr i128 %0, 1 + %3 = and i128 %2, 113427455640312821154458202477256070485 + %4 = sub i128 %0, %3 + %5 = and i128 %4, 68056473384187692692674921486353642291 + %6 = lshr i128 %4, 2 + %7 = and i128 %6, 68056473384187692692674921486353642291 + %8 = add nuw nsw i128 %7, %5 + %9 = lshr i128 %8, 4 + %10 = add nuw nsw i128 %9, %8 + %11 = and i128 %10, 20016609818878733144904388672456953615 + %12 = mul i128 %11, 1334440654591915542993625911497130241 + %13 = lshr i128 %12, 120 + %14 = trunc i128 %13 to i32 + ret i32 %14 +} + +;vector unsigned char popcount8vec(vector unsigned char i) +;{ +; i = i - ((i>> 1) & 0x55); +; i = (i & 0x33) + ((i >> 2) & 0x33); +; i = ((i + (i >> 4)) & 0x0F); +; return (i * 0x01); +;} +define <16 x i8> @popcount8vec(<16 x i8> %0) { +; CHECK-LABEL: @popcount8vec( +; CHECK-NEXT: [[TMP2:%.*]] = lshr <16 x i8> [[TMP0:%.*]], +; CHECK-NEXT: [[TMP3:%.*]] = and <16 x i8> [[TMP2]], +; 
CHECK-NEXT: [[TMP4:%.*]] = sub <16 x i8> [[TMP0]], [[TMP3]] +; CHECK-NEXT: [[TMP5:%.*]] = and <16 x i8> [[TMP4]], +; CHECK-NEXT: [[TMP6:%.*]] = lshr <16 x i8> [[TMP4]], +; CHECK-NEXT: [[TMP7:%.*]] = and <16 x i8> [[TMP6]], +; CHECK-NEXT: [[TMP8:%.*]] = add nuw nsw <16 x i8> [[TMP7]], [[TMP5]] +; CHECK-NEXT: [[TMP9:%.*]] = lshr <16 x i8> [[TMP8]], +; CHECK-NEXT: [[TMP10:%.*]] = add nuw nsw <16 x i8> [[TMP9]], [[TMP8]] +; CHECK-NEXT: [[TMP11:%.*]] = and <16 x i8> [[TMP10]], +; CHECK-NEXT: ret <16 x i8> [[TMP11]] +; + %2 = lshr <16 x i8> %0, + %3 = and <16 x i8> %2, + %4 = sub <16 x i8> %0, %3 + %5 = and <16 x i8> %4, + %6 = lshr <16 x i8> %4, + %7 = and <16 x i8> %6, + %8 = add nuw nsw <16 x i8> %7, %5 + %9 = lshr <16 x i8> %8, + %10 = add nuw nsw <16 x i8> %9, %8 + %11 = and <16 x i8> %10, + ret <16 x i8> %11 +} + +;vector unsigned int popcount32vec(vector unsigned int i) +;{ +; i = i - ((i>> 1) & 0x55555555); +; i = (i & 0x33333333) + ((i >> 2) & 0x33333333); +; i = ((i + (i >> 4)) & 0x0F0F0F0F); +; return (i * 0x01010101) >> 24; +;} +define <4 x i32> @popcount32vec(<4 x i32> %0) { +; CHECK-LABEL: @popcount32vec( +; CHECK-NEXT: [[TMP2:%.*]] = tail call <4 x i32> @llvm.ctpop.v4i32(<4 x i32> [[TMP0:%.*]]) +; CHECK-NEXT: ret <4 x i32> [[TMP2]] +; + %2 = lshr <4 x i32> %0, + %3 = and <4 x i32> %2, + %4 = sub <4 x i32> %0, %3 + %5 = and <4 x i32> %4, + %6 = lshr <4 x i32> %4, + %7 = and <4 x i32> %6, + %8 = add nuw nsw <4 x i32> %7, %5 + %9 = lshr <4 x i32> %8, + %10 = add nuw nsw <4 x i32> %9, %8 + %11 = and <4 x i32> %10, + %12 = mul <4 x i32> %11, + %13 = lshr <4 x i32> %12, + ret <4 x i32> %13 +} From llvm-commits at lists.llvm.org Thu Oct 10 22:14:37 2019 From: llvm-commits at lists.llvm.org (Craig Topper via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 05:14:37 +0000 (UTC) Subject: [PATCH] D68189: [InstCombine] recognize popcount implemented in hacker's delight. 
In-Reply-To: References: Message-ID: <04ce0920b0b7eb3f84bc4c5c94e7eb5b@localhost.localdomain> craig.topper added inline comments. ================ Comment at: llvm/test/Transforms/AggressiveInstCombine/popcount.ll:2 +; NOTE: Assertions have been autogenerated by utils/update_test_checks.py +; RUN: opt -O3 < %s -instcombine -S | FileCheck %s + ---------------- Why does this run the entire -O3 pipeline and not just the aggressive instcombine pass? CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68189/new/ https://reviews.llvm.org/D68189 From llvm-commits at lists.llvm.org Thu Oct 10 22:14:47 2019 From: llvm-commits at lists.llvm.org (ChenZheng via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 05:14:47 +0000 (UTC) Subject: [PATCH] D68189: [InstCombine] recognize popcount implemented in hacker's delight. In-Reply-To: References: Message-ID: <32e88ba35ec0dcb7bcfb4f5b4a537b43@localhost.localdomain> This revision was not accepted when it landed; it landed in state "Needs Review". This revision was automatically updated to reflect the committed changes. Closed by commit rGc17c5864fff6: [InstCombine] recognize popcount. (authored by shchenz). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68189/new/ https://reviews.llvm.org/D68189 Files: llvm/include/llvm/IR/PatternMatch.h llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp llvm/test/Transforms/AggressiveInstCombine/popcount.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D68189.224539.patch Type: text/x-patch Size: 13254 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Thu Oct 10 22:19:34 2019 From: llvm-commits at lists.llvm.org (Galina Kistanova via llvm-commits) Date: Fri, 11 Oct 2019 05:19:34 -0000 Subject: [zorg] r374513 - Updated llvm-clang-lld-x86_64-scei-ps4-ubuntu-fast and llvm-clang-lld-x86_64-scei-ps4-windows10pro-fast builders. 
Message-ID: <20191011051934.DB94792A49@lists.llvm.org> Author: gkistanova Date: Thu Oct 10 22:19:34 2019 New Revision: 374513 URL: http://llvm.org/viewvc/llvm-project?rev=374513&view=rev Log: Updated llvm-clang-lld-x86_64-scei-ps4-ubuntu-fast and llvm-clang-lld-x86_64-scei-ps4-windows10pro-fast builders. Modified: zorg/trunk/buildbot/osuosl/master/config/builders.py Modified: zorg/trunk/buildbot/osuosl/master/config/builders.py URL: http://llvm.org/viewvc/llvm-project/zorg/trunk/buildbot/osuosl/master/config/builders.py?rev=374513&r1=374512&r2=374513&view=diff ============================================================================== --- zorg/trunk/buildbot/osuosl/master/config/builders.py (original) +++ zorg/trunk/buildbot/osuosl/master/config/builders.py Thu Oct 10 22:19:34 2019 @@ -112,21 +112,22 @@ def _get_clang_fast_builders(): 'builddir': "llvm-clang-lld-x86_64-scei-ps4-ubuntu-fast", 'factory': UnifiedTreeBuilder.getCmakeWithNinjaBuildFactory( depends_on_projects=['llvm','clang','clang-tools-extra','compiler-rt','lld'], - extraCmakeOptions=["-DCMAKE_C_COMPILER=clang", - "-DCMAKE_CXX_COMPILER=clang++", - "-DCOMPILER_RT_BUILD_BUILTINS=OFF", - "-DCOMPILER_RT_BUILD_SANITIZERS=OFF", - "-DCOMPILER_RT_CAN_EXECUTE_TESTS=OFF", - "-DCOMPILER_RT_INCLUDE_TESTS=OFF", - "-DLLVM_TOOL_COMPILER_RT_BUILD=OFF", # TODO: Check why we depend on compiler-rt then? 
- "-DLLVM_BUILD_TESTS=ON", - "-DLLVM_BUILD_EXAMPLES=ON", - "-DCLANG_BUILD_EXAMPLES=ON", - "-DLLVM_TARGETS_TO_BUILD=X86", - "-DLLVM_DEFAULT_TARGET_TRIPLE=x86_64-scei-ps4", - "-DCMAKE_C_FLAGS='-Wdocumentation -Wno-documentation-deprecated-sync'", - "-DCMAKE_CXX_FLAGS='-std=c++11 -Wdocumentation -Wno-documentation-deprecated-sync'", - "-DLLVM_LIT_ARGS='-v -j36'"], + extra_configure_args=[ + "-DCMAKE_C_COMPILER=clang", + "-DCMAKE_CXX_COMPILER=clang++", + "-DCOMPILER_RT_BUILD_BUILTINS=OFF", + "-DCOMPILER_RT_BUILD_SANITIZERS=OFF", + "-DCOMPILER_RT_CAN_EXECUTE_TESTS=OFF", + "-DCOMPILER_RT_INCLUDE_TESTS=OFF", + "-DLLVM_TOOL_COMPILER_RT_BUILD=OFF", # TODO: Check why we depend on compiler-rt then? + "-DLLVM_BUILD_TESTS=ON", + "-DLLVM_BUILD_EXAMPLES=ON", + "-DCLANG_BUILD_EXAMPLES=ON", + "-DLLVM_TARGETS_TO_BUILD=X86", + "-DLLVM_DEFAULT_TARGET_TRIPLE=x86_64-scei-ps4", + "-DCMAKE_C_FLAGS='-Wdocumentation -Wno-documentation-deprecated-sync'", + "-DCMAKE_CXX_FLAGS='-std=c++11 -Wdocumentation -Wno-documentation-deprecated-sync'", + "-DLLVM_LIT_ARGS='-v -j36'"], env={'PATH':'/opt/llvm_37/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin'})}, {'name': "llvm-clang-lld-x86_64-scei-ps4-windows10pro-fast", @@ -136,13 +137,14 @@ def _get_clang_fast_builders(): 'factory': UnifiedTreeBuilder.getCmakeWithNinjaWithMSVCBuildFactory( vs="autodetect", depends_on_projects=['llvm','clang','clang-tools-extra','compiler-rt','lld'], - extraCmakeOptions=["-DLLVM_TOOL_COMPILER_RT_BUILD=OFF", # TODO: Check why we depend on compiler-rt then? - "-DLLVM_BUILD_TESTS=ON", - "-DLLVM_BUILD_EXAMPLES=ON", - "-DCLANG_BUILD_EXAMPLES=ON", - "-DLLVM_TARGETS_TO_BUILD=X86", - "-DLLVM_DEFAULT_TARGET_TRIPLE=x86_64-scei-ps4", - "-DLLVM_LIT_ARGS='-v -j80'"])}, + extra_configure_args=[ + "-DLLVM_TOOL_COMPILER_RT_BUILD=OFF", # TODO: Check why we depend on compiler-rt then? 
+ "-DLLVM_BUILD_TESTS=ON", + "-DLLVM_BUILD_EXAMPLES=ON", + "-DCLANG_BUILD_EXAMPLES=ON", + "-DLLVM_TARGETS_TO_BUILD=X86", + "-DLLVM_DEFAULT_TARGET_TRIPLE=x86_64-scei-ps4", + "-DLLVM_LIT_ARGS='-v -j80'"])}, {'name': "llvm-clang-x86_64-expensive-checks-win", 'slavenames':["ps4-buildslave2"], From llvm-commits at lists.llvm.org Thu Oct 10 22:23:49 2019 From: llvm-commits at lists.llvm.org (Jason Molenda via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 05:23:49 +0000 (UTC) Subject: [PATCH] D67347: [Windows] Use information from the PE32 exceptions directory to construct unwind plans In-Reply-To: References: Message-ID: jasonmolenda accepted this revision. jasonmolenda added a comment. This revision is now accepted and ready to land. Thanks for all of the work on this patch Aleksandr, I really appreciate the work and this will be a nice addition to lldb. Apologies again for not looking over the revised patch earlier -- this looks really good to me, please do commit it when you have a chance. Repository: rLLDB LLDB CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67347/new/ https://reviews.llvm.org/D67347 From llvm-commits at lists.llvm.org Thu Oct 10 22:23:50 2019 From: llvm-commits at lists.llvm.org (LiuChen via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 05:23:50 +0000 (UTC) Subject: [PATCH] D68857: [X86] Add strict fp support for operations of X87 instructions In-Reply-To: References: Message-ID: <3d34ae1b96168c89c9ac6684d368ff59@localhost.localdomain> LiuChen3 marked an inline comment as done. LiuChen3 added inline comments. ================ Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:601 + + // Handle constrained floating-point operations of scalar. + for (auto VT : { MVT::f32, MVT::f64, MVT::f80 }) { ---------------- craig.topper wrote: > Doesn't this stop working if sse1 is enabled? But we still need x87 for f64/f80. Yes. This can only be enabled when SSE is disabled. I'll find out which of these will still be needed when SSE is enabled.
I think most of them are related to f80. ================ Comment at: llvm/lib/Target/X86/X86InstrFPStack.td:372 let SchedRW = [WriteMicrocoded] in { -defm SIN : FPUnary; -defm COS : FPUnary; +defm SIN : FPUnary; +defm COS : FPUnary; ---------------- craig.topper wrote: > SIN/COS are not mentioned in the X86ISelLowering code you added. X86 uses library to calculate SIN/COS. X87 set SIN/COS as Expand. So I leave strict_fsin/strict_fcos as "expand" as default. ================ Comment at: llvm/test/CodeGen/X86/x87-fp-strict-sub.ll:84 + +!0 = !{!1, !1, i64 0} +!1 = !{!"float", !2, i64 0} ---------------- craig.topper wrote: > Is all this metadata needed? It's not necessary. I'll delete them. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68857/new/ https://reviews.llvm.org/D68857 From llvm-commits at lists.llvm.org Thu Oct 10 22:30:19 2019 From: llvm-commits at lists.llvm.org (Chen Zheng via llvm-commits) Date: Fri, 11 Oct 2019 05:30:19 -0000 Subject: [llvm] r374514 - [NFC] run specific pass instead of whole -O3 pipeline for popcount recoginzation testcase. Message-ID: <20191011053019.44C3E92E82@lists.llvm.org> Author: shchenz Date: Thu Oct 10 22:30:18 2019 New Revision: 374514 URL: http://llvm.org/viewvc/llvm-project?rev=374514&view=rev Log: [NFC] run specific pass instead of whole -O3 pipeline for popcount recoginzation testcase.
Modified: llvm/trunk/test/Transforms/AggressiveInstCombine/popcount.ll Modified: llvm/trunk/test/Transforms/AggressiveInstCombine/popcount.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/AggressiveInstCombine/popcount.ll?rev=374514&r1=374513&r2=374514&view=diff ============================================================================== --- llvm/trunk/test/Transforms/AggressiveInstCombine/popcount.ll (original) +++ llvm/trunk/test/Transforms/AggressiveInstCombine/popcount.ll Thu Oct 10 22:30:18 2019 @@ -1,5 +1,5 @@ ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py -; RUN: opt -O3 < %s -instcombine -S | FileCheck %s +; RUN: opt < %s -aggressive-instcombine -S | FileCheck %s ;int popcount8(unsigned char i) { ; i = i - ((i >> 1) & 0x55); @@ -44,7 +44,7 @@ define signext i32 @popcount8(i8 zeroext ;} define signext i32 @popcount32(i32 zeroext %0) { ; CHECK-LABEL: @popcount32( -; CHECK-NEXT: [[TMP2:%.*]] = tail call i32 @llvm.ctpop.i32(i32 [[TMP0:%.*]]), !range !0 +; CHECK-NEXT: [[TMP2:%.*]] = call i32 @llvm.ctpop.i32(i32 [[TMP0:%.*]]) ; CHECK-NEXT: ret i32 [[TMP2]] ; %2 = lshr i32 %0, 1 @@ -70,7 +70,7 @@ define signext i32 @popcount32(i32 zeroe ;} define signext i32 @popcount64(i64 %0) { ; CHECK-LABEL: @popcount64( -; CHECK-NEXT: [[TMP2:%.*]] = tail call i64 @llvm.ctpop.i64(i64 [[TMP0:%.*]]), !range !1 +; CHECK-NEXT: [[TMP2:%.*]] = call i64 @llvm.ctpop.i64(i64 [[TMP0:%.*]]) ; CHECK-NEXT: [[TMP3:%.*]] = trunc i64 [[TMP2]] to i32 ; CHECK-NEXT: ret i32 [[TMP3]] ; @@ -111,7 +111,7 @@ define signext i32 @popcount64(i64 %0) { ;} define signext i32 @popcount128(i128 %0) { ; CHECK-LABEL: @popcount128( -; CHECK-NEXT: [[TMP2:%.*]] = tail call i128 @llvm.ctpop.i128(i128 [[TMP0:%.*]]), !range !2 +; CHECK-NEXT: [[TMP2:%.*]] = call i128 @llvm.ctpop.i128(i128 [[TMP0:%.*]]) ; CHECK-NEXT: [[TMP3:%.*]] = trunc i128 [[TMP2]] to i32 ; CHECK-NEXT: ret i32 [[TMP3]] ; @@ -174,7 +174,7 @@ define <16 x i8> @popcount8vec(<16 x i8> ;} define <4 x 
i32> @popcount32vec(<4 x i32> %0) { ; CHECK-LABEL: @popcount32vec( -; CHECK-NEXT: [[TMP2:%.*]] = tail call <4 x i32> @llvm.ctpop.v4i32(<4 x i32> [[TMP0:%.*]]) +; CHECK-NEXT: [[TMP2:%.*]] = call <4 x i32> @llvm.ctpop.v4i32(<4 x i32> [[TMP0:%.*]]) ; CHECK-NEXT: ret <4 x i32> [[TMP2]] ; %2 = lshr <4 x i32> %0, From llvm-commits at lists.llvm.org Thu Oct 10 22:33:06 2019 From: llvm-commits at lists.llvm.org (Aleksandr Urakov via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 05:33:06 +0000 (UTC) Subject: [PATCH] D67347: [Windows] Use information from the PE32 exceptions directory to construct unwind plans In-Reply-To: References: Message-ID: aleksandr.urakov added a comment. Thanks a lot for the review! Repository: rLLDB LLDB CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67347/new/ https://reviews.llvm.org/D67347 From llvm-commits at lists.llvm.org Thu Oct 10 22:33:07 2019 From: llvm-commits at lists.llvm.org (ChenZheng via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 05:33:07 +0000 (UTC) Subject: [PATCH] D68189: [InstCombine] recognize popcount implemented in hacker's delight. In-Reply-To: References: Message-ID: <2a009c56500a380d47e32a1dfa308ff9@localhost.localdomain> shchenz marked an inline comment as done. shchenz added inline comments. ================ Comment at: llvm/test/Transforms/AggressiveInstCombine/popcount.ll:2 +; NOTE: Assertions have been autogenerated by utils/update_test_checks.py +; RUN: opt -O3 < %s -instcombine -S | FileCheck %s + ---------------- craig.topper wrote: > Why does this run the entire -O3 pipeline and not just the aggressive instcombine pass? 
Thanks, done in rL374514 Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68189/new/ https://reviews.llvm.org/D68189 From llvm-commits at lists.llvm.org Thu Oct 10 22:33:07 2019 From: llvm-commits at lists.llvm.org (Yonghong Song via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 05:33:07 +0000 (UTC) Subject: [PATCH] D68822: [WIP][BPF] Support external globals In-Reply-To: References: Message-ID: <9dd86492f7d77e4e7c7f13eb407e0623@localhost.localdomain> yonghong-song updated this revision to Diff 224540. yonghong-song edited the summary of this revision. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68822/new/ https://reviews.llvm.org/D68822 Files: llvm/lib/Target/BPF/BTF.h llvm/lib/Target/BPF/BTFDebug.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D68822.224540.patch Type: text/x-patch Size: 4118 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Thu Oct 10 22:33:08 2019 From: llvm-commits at lists.llvm.org (Craig Topper via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 05:33:08 +0000 (UTC) Subject: [PATCH] D68857: [X86] Add strict fp support for operations of X87 instructions In-Reply-To: References: Message-ID: <4b46bb4872525268c087102124a8da13@localhost.localdomain> craig.topper added inline comments. ================ Comment at: llvm/lib/Target/X86/X86InstrFPStack.td:372 let SchedRW = [WriteMicrocoded] in { -defm SIN : FPUnary; -defm COS : FPUnary; +defm SIN : FPUnary; +defm COS : FPUnary; ---------------- LiuChen3 wrote: > craig.topper wrote: > > SIN/COS are not mentioned in the X86ISelLowering code you added. > X86 uses library to calculate SIN/COS. X87 set SIN/COS as Expand. So I leave strict_fsin/strict_fcos as "expand" as default. Ok then don't change these lines. We shouldn't have entries in the isel table that we don't need. 
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68857/new/ https://reviews.llvm.org/D68857 From llvm-commits at lists.llvm.org Thu Oct 10 22:33:41 2019 From: llvm-commits at lists.llvm.org (Yi-Hong Lyu via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 05:33:41 +0000 (UTC) Subject: [PATCH] D68344: [PowerPC] Remove assertion "Shouldn't overwrite a register before it is killed" In-Reply-To: References: Message-ID: <9565333b842ba91cb0f4b372376037bd@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rG2fbfb04ffef4: [PowerPC] Remove assertion "Shouldn't overwrite a register before it is killed" (authored by Yi-Hong.Lyu). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68344/new/ https://reviews.llvm.org/D68344 Files: llvm/lib/Target/PowerPC/PPCPreEmitPeephole.cpp llvm/test/CodeGen/PowerPC/remove-redundant-load-imm.mir Index: llvm/test/CodeGen/PowerPC/remove-redundant-load-imm.mir =================================================================== --- llvm/test/CodeGen/PowerPC/remove-redundant-load-imm.mir +++ llvm/test/CodeGen/PowerPC/remove-redundant-load-imm.mir @@ -346,3 +346,25 @@ BLR8 implicit $lr8, implicit $rm ... +--- +name: overwrite_reg_before_killed +alignment: 16 +tracksRegLiveness: true +machineFunctionInfo: {} +body: | + bb.0.entry: + liveins: $x1 + + ; CHECK-LABEL: name: overwrite_reg_before_killed + ; CHECK: liveins: $x1 + ; CHECK: renamable $x3 = LI8 0 + ; CHECK: STD renamable $x3, 16, $x1 + ; CHECK: STD killed renamable $x3, 8, $x1 + ; CHECK: BLR8 implicit $lr8, implicit $rm + renamable $x3 = LI8 0 + STD renamable $x3, 16, $x1 + renamable $x3 = LI8 0 + STD killed renamable $x3, 8, $x1 + BLR8 implicit $lr8, implicit $rm + +... 
Index: llvm/lib/Target/PowerPC/PPCPreEmitPeephole.cpp =================================================================== --- llvm/lib/Target/PowerPC/PPCPreEmitPeephole.cpp +++ llvm/lib/Target/PowerPC/PPCPreEmitPeephole.cpp @@ -117,8 +117,6 @@ if (!AfterBBI->modifiesRegister(Reg, TRI)) continue; - assert(DeadOrKillToUnset && - "Shouldn't overwrite a register before it is killed"); // Finish scanning because Reg is overwritten by a non-load // instruction. if (AfterBBI->getOpcode() != Opc) @@ -134,12 +132,15 @@ // It loads same immediate value to the same Reg, which is redundant. // We would unset kill flag in previous Reg usage to extend live range // of Reg first, then remove the redundancy. - LLVM_DEBUG(dbgs() << " Unset dead/kill flag of " << *DeadOrKillToUnset - << " from " << *DeadOrKillToUnset->getParent()); - if (DeadOrKillToUnset->isDef()) - DeadOrKillToUnset->setIsDead(false); - else - DeadOrKillToUnset->setIsKill(false); + if (DeadOrKillToUnset) { + LLVM_DEBUG(dbgs() + << " Unset dead/kill flag of " << *DeadOrKillToUnset + << " from " << *DeadOrKillToUnset->getParent()); + if (DeadOrKillToUnset->isDef()) + DeadOrKillToUnset->setIsDead(false); + else + DeadOrKillToUnset->setIsKill(false); + } DeadOrKillToUnset = AfterBBI->findRegisterDefOperand(Reg, true, true, TRI); if (DeadOrKillToUnset) -------------- next part -------------- A non-text attachment was scrubbed... Name: D68344.224541.patch Type: text/x-patch Size: 2587 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Thu Oct 10 22:42:52 2019 From: llvm-commits at lists.llvm.org (LiuChen via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 05:42:52 +0000 (UTC) Subject: [PATCH] D68857: [X86] Add strict fp support for operations of X87 instructions In-Reply-To: References: Message-ID: <993fbc1e8018fd07220def84de254e20@localhost.localdomain> LiuChen3 marked an inline comment as done and an inline comment as not done. LiuChen3 added inline comments. 
================ Comment at: llvm/lib/Target/X86/X86InstrFPStack.td:372 let SchedRW = [WriteMicrocoded] in { -defm SIN : FPUnary; -defm COS : FPUnary; +defm SIN : FPUnary; +defm COS : FPUnary; ---------------- craig.topper wrote: > LiuChen3 wrote: > > craig.topper wrote: > > > SIN/COS are not mentioned in the X86ISelLowering code you added. > > X86 uses library to calculate SIN/COS. X87 set SIN/COS as Expand. So I leave strict_fsin/strict_fcos as "expand" as default. > Ok then don't change these lines. We shouldn't have entries in the isel table that we don't need. ok CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68857/new/ https://reviews.llvm.org/D68857 From llvm-commits at lists.llvm.org Fri Oct 11 00:02:22 2019 From: llvm-commits at lists.llvm.org (Chandler Carruth via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 07:02:22 +0000 (UTC) Subject: [PATCH] D65280: Add a pass to lower is.constant and objectsize intrinsics In-Reply-To: References: Message-ID: chandlerc added a comment. (Tried to get this out last weekend, but was blocked by the Phab down time... Sorry about that ...) Mostly nits around the exact code here. The approach looks really nice now (and sorry it took so many iterations to get there). ================ Comment at: lib/Passes/PassRegistry.def:188 FUNCTION_PASS("lower-guard-intrinsic", LowerGuardIntrinsicPass()) +FUNCTION_PASS("lower-is-constant", LowerConstantIntrinsicsPass()) FUNCTION_PASS("lower-widenable-condition", LowerWidenableConditionPass()) ---------------- Maybe `lower-constant-intrinsics` as a name? (Since it handles `objectsize` as well.) ================ Comment at: lib/Transforms/Scalar/LowerConstantIntrinsics.cpp:93 + for (Instruction &I: *BB) { + if (IntrinsicInst *II = dyn_cast<IntrinsicInst>(&I)) { + switch (II->getIntrinsicID()) { ---------------- Use an early continue to reduce indentation?
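For illustration only, a minimal sketch of the early-continue shape being suggested. It assumes hypothetical stand-in types (`Inst`, `Kind`) and a hypothetical helper (`collectWorklist`); none of these are the LLVM classes or the code from the patch under review.

```cpp
#include <vector>

// Hypothetical stand-ins for illustration; these are not LLVM types.
enum class Kind { Other, IsConstant, ObjectSize };

struct Inst {
  bool IsIntrinsic; // plays the role of a successful dyn_cast to an intrinsic
  Kind K;           // plays the role of the intrinsic ID
};

// Returns the indices of the instructions that would be queued for lowering.
std::vector<int> collectWorklist(const std::vector<Inst> &Insts) {
  std::vector<int> Worklist;
  for (int Idx = 0, E = static_cast<int>(Insts.size()); Idx != E; ++Idx) {
    const Inst &I = Insts[Idx];
    // Early continue: reject non-intrinsics first, so the switch below is
    // not nested inside an extra `if` block.
    if (!I.IsIntrinsic)
      continue;
    switch (I.K) {
    case Kind::IsConstant:
    case Kind::ObjectSize:
      Worklist.push_back(Idx);
      break;
    case Kind::Other:
      break;
    }
  }
  return Worklist;
}
```

The loop body stays one level shallower than the `if (... = dyn_cast...) { switch ... }` form quoted above, while visiting the same instructions.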
================ Comment at: lib/Transforms/Scalar/LowerConstantIntrinsics.cpp:96 + default: + continue; + case Intrinsic::is_constant: ---------------- Odd to continue here but break below. Doesn't matter in this case of course, but just seemed surprising. ================ Comment at: lib/Transforms/Scalar/LowerConstantIntrinsics.cpp:105-106 + } + if (Worklist.empty()) + return false; + ---------------- FWIW, this doesn't skip anything, the loop has the same behavior. ================ Comment at: lib/Transforms/Scalar/LowerConstantIntrinsics.cpp:112-117 + if (!II) + continue; + Value *NewValue; + switch (II->getIntrinsicID()) { + default: + continue; ---------------- For both the `II` thing and the `default` case -- do we really expect these to ever fail? I would expect either the VH to be null, or for it to definitively be one of the two intrinsics we added. Maybe switch to `cast_or_null` above with `VN.get()` or some such, and llvm_unreachable on the default case. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D65280/new/ https://reviews.llvm.org/D65280 From llvm-commits at lists.llvm.org Fri Oct 11 00:16:20 2019 From: llvm-commits at lists.llvm.org (Pavel Labath via llvm-commits) Date: Fri, 11 Oct 2019 07:16:20 -0000 Subject: [llvm] r374517 - Fix modules build for r374337 Message-ID: <20191011071620.13DB5928BC@lists.llvm.org> Author: labath Date: Fri Oct 11 00:16:19 2019 New Revision: 374517 URL: http://llvm.org/viewvc/llvm-project?rev=374517&view=rev Log: Fix modules build for r374337 A modules build failed with the following error: call to function 'operator&' that is neither visible in the template definition nor found by argument-dependent lookup Fix that by declaring the appropriate operators in the llvm::minidump namespace. 
Modified: llvm/trunk/include/llvm/BinaryFormat/Minidump.h Modified: llvm/trunk/include/llvm/BinaryFormat/Minidump.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/BinaryFormat/Minidump.h?rev=374517&r1=374516&r2=374517&view=diff ============================================================================== --- llvm/trunk/include/llvm/BinaryFormat/Minidump.h (original) +++ llvm/trunk/include/llvm/BinaryFormat/Minidump.h Fri Oct 11 00:16:19 2019 @@ -25,6 +25,8 @@ namespace llvm { namespace minidump { +LLVM_ENABLE_BITMASK_ENUMS_IN_NAMESPACE(); + /// The minidump header is the first part of a minidump file. It identifies the /// file as a minidump file, and gives the location of the stream directory. struct Header { From llvm-commits at lists.llvm.org Fri Oct 11 00:15:44 2019 From: llvm-commits at lists.llvm.org (Pavel Labath via llvm-commits) Date: Fri, 11 Oct 2019 09:15:44 +0200 Subject: [llvm] r356753 - [ObjectYAML] Add basic minidump generation support In-Reply-To: References: <20190322144726.9FC4F8A9D4@lists.llvm.org> Message-ID: <422efda3-bdd0-27fa-cc07-2907d0bddccf@labath.sk> On 11/10/2019 05:38, Kristina Brooks wrote: > Hi, > > This doesn't build with modules enabled (using Clang r374503): > > FAILED: lib/ObjectYAML/CMakeFiles/LLVMObjectYAML.dir/MinidumpYAML.cpp.o > /o/b/llvm-10/408/bin/clang++ -DGTEST_HAS_RTTI=0 -D_DEBUG > -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS > -D__STDC_LIMIT_MACROS -Ilib/ObjectYAML > -I/home/src2/llvm-tainted/lib/ObjectYAML -Iinclude > -I/home/src2/llvm-tainted/include -O3 -march=native > -Wno-unused-command-line-argument -gline-tables-only -stdlib=libc++ > -fPIC -fvisibility-inlines-hidden -Werror=date-time > -Werror=unguarded-availability-new -std=c++14 -fmodules > -fmodules-cache-path=/o/b/llvm-10/409/module.cache -Xclang > -fmodules-local-submodule-visibility -Wall -Wextra > -Wno-unused-parameter -Wwrite-strings -Wcast-qual > -Wmissing-field-initializers -pedantic -Wno-long-long > 
-Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type > -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wstring-conversion > -fdiagnostics-color -ffunction-sections -fdata-sections -flto=thin -O3 > -UNDEBUG -fno-exceptions -fno-rtti -MD -MT > lib/ObjectYAML/CMakeFiles/LLVMObjectYAML.dir/MinidumpYAML.cpp.o -MF > lib/ObjectYAML/CMakeFiles/LLVMObjectYAML.dir/MinidumpYAML.cpp.o.d -o > lib/ObjectYAML/CMakeFiles/LLVMObjectYAML.dir/MinidumpYAML.cpp.o -c > /home/src2/llvm-tainted/lib/ObjectYAML/MinidumpYAML.cpp > In module 'LLVM_Utils' imported from > /home/src2/llvm-tainted/include/llvm/ObjectYAML/YAML.h:12: > /home/src2/llvm-tainted/include/llvm/Support/YAMLTraits.h:819:48: > error: call to function 'operator&' that is neither visible in the > template definition nor found by argument-dependent lookup > if ( bitSetMatch(Str, outputting() && (Val & ConstVal) == ConstVal) ) { > ^ Should be fixed by r374517. pl From llvm-commits at lists.llvm.org Fri Oct 11 00:19:54 2019 From: llvm-commits at lists.llvm.org (Kadir Cetinkaya via llvm-commits) Date: Fri, 11 Oct 2019 07:19:54 -0000 Subject: [llvm] r374518 - [ADT][Statistics] Fix test after rL374490 Message-ID: <20191011071954.4811F92E6A@lists.llvm.org> Author: kadircet Date: Fri Oct 11 00:19:54 2019 New Revision: 374518 URL: http://llvm.org/viewvc/llvm-project?rev=374518&view=rev Log: [ADT][Statistics] Fix test after rL374490 Modified: llvm/trunk/unittests/ADT/StatisticTest.cpp Modified: llvm/trunk/unittests/ADT/StatisticTest.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/unittests/ADT/StatisticTest.cpp?rev=374518&r1=374517&r2=374518&view=diff ============================================================================== --- llvm/trunk/unittests/ADT/StatisticTest.cpp (original) +++ llvm/trunk/unittests/ADT/StatisticTest.cpp Fri Oct 11 00:19:54 2019 @@ -68,6 +68,8 @@ TEST(StatisticTest, Assign) { TEST(StatisticTest, API) { EnableStatistics(); + // Reset beforehand to make sure previous tests don't effect 
this one. + ResetStatistics(); Counter = 0; EXPECT_EQ(Counter, 0u); From llvm-commits at lists.llvm.org Fri Oct 11 00:24:36 2019 From: llvm-commits at lists.llvm.org (Craig Topper via llvm-commits) Date: Fri, 11 Oct 2019 07:24:36 -0000 Subject: [llvm] r374519 - [X86] Add v8i64->v8i8 ssat/usat/packus truncate tests to min-legal-vector-width.ll Message-ID: <20191011072436.92B7F92EE2@lists.llvm.org> Author: ctopper Date: Fri Oct 11 00:24:36 2019 New Revision: 374519 URL: http://llvm.org/viewvc/llvm-project?rev=374519&view=rev Log: [X86] Add v8i64->v8i8 ssat/usat/packus truncate tests to min-legal-vector-width.ll I wonder if we should split the v8i8 stores in order to form two v4i8 saturating truncating stores. This would remove the unpckl needed to concatenated the v4i8 results to make a single store. Modified: llvm/trunk/test/CodeGen/X86/min-legal-vector-width.ll Modified: llvm/trunk/test/CodeGen/X86/min-legal-vector-width.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/min-legal-vector-width.ll?rev=374519&r1=374518&r2=374519&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/min-legal-vector-width.ll (original) +++ llvm/trunk/test/CodeGen/X86/min-legal-vector-width.ll Fri Oct 11 00:24:36 2019 @@ -1142,3 +1142,106 @@ define <32 x i8> @trunc_packus_v32i32_v3 ret <32 x i8> %f } +define <8 x i8> @trunc_packus_v8i64_v8i8(<8 x i64> %a0) "min-legal-vector-width"="256" { +; CHECK-LABEL: trunc_packus_v8i64_v8i8: +; CHECK: # %bb.0: +; CHECK-NEXT: vpxor %xmm2, %xmm2, %xmm2 +; CHECK-NEXT: vpmaxsq %ymm2, %ymm1, %ymm1 +; CHECK-NEXT: vpmovusqb %ymm1, %xmm1 +; CHECK-NEXT: vpmaxsq %ymm2, %ymm0, %ymm0 +; CHECK-NEXT: vpmovusqb %ymm0, %xmm0 +; CHECK-NEXT: vpunpckldq {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1] +; CHECK-NEXT: vzeroupper +; CHECK-NEXT: retq + %1 = icmp slt <8 x i64> %a0, + %2 = select <8 x i1> %1, <8 x i64> %a0, <8 x i64> + %3 = icmp sgt <8 x i64> %2, zeroinitializer + %4 = 
select <8 x i1> %3, <8 x i64> %2, <8 x i64> zeroinitializer + %5 = trunc <8 x i64> %4 to <8 x i8> + ret <8 x i8> %5 +} + +define void @trunc_packus_v8i64_v8i8_store(<8 x i64> %a0, <8 x i8> *%p1) "min-legal-vector-width"="256" { +; CHECK-LABEL: trunc_packus_v8i64_v8i8_store: +; CHECK: # %bb.0: +; CHECK-NEXT: vpxor %xmm2, %xmm2, %xmm2 +; CHECK-NEXT: vpmaxsq %ymm2, %ymm1, %ymm1 +; CHECK-NEXT: vpmovusqb %ymm1, %xmm1 +; CHECK-NEXT: vpmaxsq %ymm2, %ymm0, %ymm0 +; CHECK-NEXT: vpmovusqb %ymm0, %xmm0 +; CHECK-NEXT: vpunpckldq {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1] +; CHECK-NEXT: vmovq %xmm0, (%rdi) +; CHECK-NEXT: vzeroupper +; CHECK-NEXT: retq + %1 = icmp slt <8 x i64> %a0, + %2 = select <8 x i1> %1, <8 x i64> %a0, <8 x i64> + %3 = icmp sgt <8 x i64> %2, zeroinitializer + %4 = select <8 x i1> %3, <8 x i64> %2, <8 x i64> zeroinitializer + %5 = trunc <8 x i64> %4 to <8 x i8> + store <8 x i8> %5, <8 x i8> *%p1 + ret void +} + +define <8 x i8> @trunc_ssat_v8i64_v8i8(<8 x i64> %a0) "min-legal-vector-width"="256" { +; CHECK-LABEL: trunc_ssat_v8i64_v8i8: +; CHECK: # %bb.0: +; CHECK-NEXT: vpmovsqb %ymm1, %xmm1 +; CHECK-NEXT: vpmovsqb %ymm0, %xmm0 +; CHECK-NEXT: vpunpckldq {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1] +; CHECK-NEXT: vzeroupper +; CHECK-NEXT: retq + %1 = icmp slt <8 x i64> %a0, + %2 = select <8 x i1> %1, <8 x i64> %a0, <8 x i64> + %3 = icmp sgt <8 x i64> %2, + %4 = select <8 x i1> %3, <8 x i64> %2, <8 x i64> + %5 = trunc <8 x i64> %4 to <8 x i8> + ret <8 x i8> %5 +} + +define void @trunc_ssat_v8i64_v8i8_store(<8 x i64> %a0, <8 x i8> *%p1) "min-legal-vector-width"="256" { +; CHECK-LABEL: trunc_ssat_v8i64_v8i8_store: +; CHECK: # %bb.0: +; CHECK-NEXT: vpmovsqb %ymm1, %xmm1 +; CHECK-NEXT: vpmovsqb %ymm0, %xmm0 +; CHECK-NEXT: vpunpckldq {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1] +; CHECK-NEXT: vmovq %xmm0, (%rdi) +; CHECK-NEXT: vzeroupper +; CHECK-NEXT: retq + %1 = icmp slt <8 x i64> %a0, + %2 = select <8 x i1> %1, <8 x i64> %a0, <8 x i64> + %3 = icmp sgt 
<8 x i64> %2, + %4 = select <8 x i1> %3, <8 x i64> %2, <8 x i64> + %5 = trunc <8 x i64> %4 to <8 x i8> + store <8 x i8> %5, <8 x i8> *%p1 + ret void +} + +define <8 x i8> @trunc_usat_v8i64_v8i8(<8 x i64> %a0) "min-legal-vector-width"="256" { +; CHECK-LABEL: trunc_usat_v8i64_v8i8: +; CHECK: # %bb.0: +; CHECK-NEXT: vpmovusqb %ymm1, %xmm1 +; CHECK-NEXT: vpmovusqb %ymm0, %xmm0 +; CHECK-NEXT: vpunpckldq {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1] +; CHECK-NEXT: vzeroupper +; CHECK-NEXT: retq + %1 = icmp ult <8 x i64> %a0, + %2 = select <8 x i1> %1, <8 x i64> %a0, <8 x i64> + %3 = trunc <8 x i64> %2 to <8 x i8> + ret <8 x i8> %3 +} + +define void @trunc_usat_v8i64_v8i8_store(<8 x i64> %a0, <8 x i8> *%p1) "min-legal-vector-width"="256" { +; CHECK-LABEL: trunc_usat_v8i64_v8i8_store: +; CHECK: # %bb.0: +; CHECK-NEXT: vpmovusqb %ymm1, %xmm1 +; CHECK-NEXT: vpmovusqb %ymm0, %xmm0 +; CHECK-NEXT: vpunpckldq {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1] +; CHECK-NEXT: vmovq %xmm0, (%rdi) +; CHECK-NEXT: vzeroupper +; CHECK-NEXT: retq + %1 = icmp ult <8 x i64> %a0, + %2 = select <8 x i1> %1, <8 x i64> %a0, <8 x i64> + %3 = trunc <8 x i64> %2 to <8 x i8> + store <8 x i8> %3, <8 x i8> *%p1 + ret void +} From llvm-commits at lists.llvm.org Fri Oct 11 00:30:07 2019 From: llvm-commits at lists.llvm.org (=?utf-8?q?Christian_K=C3=BChnel_via_Phabricator?= via llvm-commits) Date: Fri, 11 Oct 2019 07:30:07 +0000 (UTC) Subject: [PATCH] D68860: Test Diff - DO NOT MERGE Message-ID: kuhnel created this revision. Herald added a project: LLVM. Herald added a subscriber: llvm-commits. Test diff - DO NOT MERGE Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D68860 Files: llvm/DELETEME.txt Index: llvm/DELETEME.txt =================================================================== --- /dev/null +++ llvm/DELETEME.txt @@ -0,0 +1,2 @@ +dummy file for testing + -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68860.224548.patch Type: text/x-patch Size: 171 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Fri Oct 11 00:39:28 2019 From: llvm-commits at lists.llvm.org (pre-merge checks [bot] via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 07:39:28 +0000 (UTC) Subject: [PATCH] D68860: Test Diff - DO NOT MERGE In-Reply-To: References: Message-ID: <601584668b8786509b8f9938a0f61893@localhost.localdomain> merge_guards_bot added a comment. Build was aborted. Build results are available at http://results.llvm-merge-guard.org/Phabricator-30 See http://jenkins.llvm-merge-guard.org/job/Phabricator/30/ for more details. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68860/new/ https://reviews.llvm.org/D68860 From llvm-commits at lists.llvm.org Fri Oct 11 01:00:00 2019 From: llvm-commits at lists.llvm.org (Bjorn Pettersson via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 08:00:00 +0000 (UTC) Subject: [PATCH] D68633: fix debug info affects output when opt inline In-Reply-To: References: Message-ID: <8cfbf26534fb49ddd974224f9cb9ea75@localhost.localdomain> bjope added a comment. In D68633#1705291, @yechunliang wrote: > In D68633#1699421, @bjope wrote: > > > The code is written in a way that it skips any instruction, but moves contiguous blocks of allocas in one splice (not sure exactly why, is that really faster?). > > > I also could not understand why we continue to scan the alloca block after the first alloca instruction that is not use_empty; the commit that introduced this gives some reasoning: https://github.com/llvm/llvm-project/commit/6f8865bf9 > > > Maybe the difference is that the check for AI->useEmpty() is only done for the first alloca in a sequence of alloca instructions? Or can't we just remove the loop at line 1847 (only moving one alloca at a time). > > With this example test case, the second alloca is use_empty and will be inserted into the caller together with the first alloca (!use_empty). 
But if there is a dbg instruction between the first and the second alloca instruction, the continued scan will break, > then, with the debug instruction present, the program goes back to the outer for() loop, handles the second alloca as use_empty (because it has no use such as "xxx.sroa_cast = bitcast %rec1198* %volatileloadslot to i8*") and calls eraseFromParent. > This differs from the no-dbg case, where inlining will not erase the second alloca instruction. So the root cause is rather that we treat an alloca that is immediately preceded by another alloca differently from one that is preceded by another kind of instruction. This also happens with other instructions in between, and is not specific to dbg intrinsics (it could be interesting to add a test case where you replace the dbg intrinsics by something else). So I think that the solution might be based on one of these ideas: 1. Remove the check for use_empty in the outer loop. 2. Add a check for !use_empty in the inner loop. 3. Remove the inner loop (i.e. only splice one alloca at a time). Alternative 3 would be the simplest. If there really is a speedup from doing fewer splices, then alternative 2 still moves consecutive !use_empty allocas in batches. The idea with alternative 2 is to split batches on allocas that have no uses (as they are handled in the outer loop). Alternative 1 might work assuming that allocas with no uses are cleaned up somewhere else. But I think this alternative is the least interesting one. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68633/new/ https://reviews.llvm.org/D68633 From llvm-commits at lists.llvm.org Fri Oct 11 01:09:15 2019 From: llvm-commits at lists.llvm.org (Martin Storsjö via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 08:09:15 +0000 (UTC) Subject: [PATCH] D68450: [lit] Remove setting of the target-windows feature In-Reply-To: References: Message-ID: <7565b9493e5cc77c72297fe552151653@localhost.localdomain> mstorsjo added a comment. 
Ping @rnk - we should either go with this, or D68135. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68450/new/ https://reviews.llvm.org/D68450 From llvm-commits at lists.llvm.org Fri Oct 11 01:27:48 2019 From: llvm-commits at lists.llvm.org (ChenZheng via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 08:27:48 +0000 (UTC) Subject: [PATCH] D67088: [PowerPC] extend PPCPreIncPrep Pass for ds/dq form In-Reply-To: References: Message-ID: shchenz added a comment. The final patch is up. Could you help review it? Thanks a lot. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67088/new/ https://reviews.llvm.org/D67088 From llvm-commits at lists.llvm.org Fri Oct 11 01:36:55 2019 From: llvm-commits at lists.llvm.org (QingShan Zhang via llvm-commits) Date: Fri, 11 Oct 2019 08:36:55 -0000 Subject: [llvm] r374524 - [TableGen] Fix a bug that MCSchedClassDesc is interfered between different SchedModel Message-ID: <20191011083655.2D43E92ED3@lists.llvm.org> Author: qshanz Date: Fri Oct 11 01:36:54 2019 New Revision: 374524 URL: http://llvm.org/viewvc/llvm-project?rev=374524&view=rev Log: [TableGen] Fix a bug that MCSchedClassDesc is interfered between different SchedModel Assume that ModelA has a scheduling resource for InstA and ModelB has a scheduling resource for InstB. This is what the llvm::MCSchedClassDesc tables are expected to look like: llvm::MCSchedClassDesc ModelASchedClasses[] = { ... InstA, 0, ... InstB, -1,... }; llvm::MCSchedClassDesc ModelBSchedClasses[] = { ... InstA, -1,... InstB, 0,... }; Here -1 means an invalid number of micro ops, while any value >= 0 is valid. This is what they actually look like now: llvm::MCSchedClassDesc ModelASchedClasses[] = { ... InstA, 0, ... InstB, 0,... }; llvm::MCSchedClassDesc ModelBSchedClasses[] = { ... InstA, 0,... InstB, 0,... }; The compiler then hits an assertion because the SCDesc is now valid for both InstA and InstB. 
Differential Revision: https://reviews.llvm.org/D67950 Added: llvm/trunk/test/TableGen/InvalidMCSchedClassDesc.td Modified: llvm/trunk/test/CodeGen/ARM/ParallelDSP/unroll-n-jam-smlad.ll llvm/trunk/utils/TableGen/SubtargetEmitter.cpp Modified: llvm/trunk/test/CodeGen/ARM/ParallelDSP/unroll-n-jam-smlad.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/ARM/ParallelDSP/unroll-n-jam-smlad.ll?rev=374524&r1=374523&r2=374524&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/ARM/ParallelDSP/unroll-n-jam-smlad.ll (original) +++ llvm/trunk/test/CodeGen/ARM/ParallelDSP/unroll-n-jam-smlad.ll Fri Oct 11 01:36:54 2019 @@ -45,7 +45,6 @@ entry: ; CHECK-REG-PRESSURE: ldr{{.*}}, [sp ; CHECK-REG-PRESSURE: ldr{{.*}}, [sp ; CHECK-REG-PRESSURE: ldr{{.*}}, [sp -; CHECK-REG-PRESSURE: ldr{{.*}}, [sp ; CHECK-REG-PRESSURE: bne .LBB0_1 for.body: Added: llvm/trunk/test/TableGen/InvalidMCSchedClassDesc.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/TableGen/InvalidMCSchedClassDesc.td?rev=374524&view=auto ============================================================================== --- llvm/trunk/test/TableGen/InvalidMCSchedClassDesc.td (added) +++ llvm/trunk/test/TableGen/InvalidMCSchedClassDesc.td Fri Oct 11 01:36:54 2019 @@ -0,0 +1,47 @@ +// RUN: llvm-tblgen -gen-subtarget -I %p/../../include %s 2>&1 | FileCheck %s +// Check if it is valid MCSchedClassDesc if didn't have the resources. + +include "llvm/Target/Target.td" + +def MyTarget : Target; + +let OutOperandList = (outs), InOperandList = (ins) in { + def Inst_A : Instruction; + def Inst_B : Instruction; +} + +let CompleteModel = 0 in { + def SchedModel_A: SchedMachineModel; + def SchedModel_B: SchedMachineModel; + def SchedModel_C: SchedMachineModel; +} + +// Inst_B didn't have the resoures, and it is invalid. 
+// CHECK: SchedModel_ASchedClasses[] = { +// CHECK: {DBGFIELD("Inst_A") 1 +// CHECK-NEXT: {DBGFIELD("Inst_B") 16383 +let SchedModel = SchedModel_A in { + def Write_A : SchedWriteRes<[]>; + def : InstRW<[Write_A], (instrs Inst_A)>; +} + +// Inst_A didn't have the resoures, and it is invalid. +// CHECK: SchedModel_BSchedClasses[] = { +// CHECK: {DBGFIELD("Inst_A") 16383 +// CHECK-NEXT: {DBGFIELD("Inst_B") 1 +let SchedModel = SchedModel_B in { + def Write_B: SchedWriteRes<[]>; + def : InstRW<[Write_B], (instrs Inst_B)>; +} + +// CHECK: SchedModel_CSchedClasses[] = { +// CHECK: {DBGFIELD("Inst_A") 1 +// CHECK-NEXT: {DBGFIELD("Inst_B") 1 +let SchedModel = SchedModel_C in { + def Write_C: SchedWriteRes<[]>; + def : InstRW<[Write_C], (instrs Inst_A, Inst_B)>; +} + +def ProcessorA: ProcessorModel<"ProcessorA", SchedModel_A, []>; +def ProcessorB: ProcessorModel<"ProcessorB", SchedModel_B, []>; +def ProcessorC: ProcessorModel<"ProcessorC", SchedModel_C, []>; Modified: llvm/trunk/utils/TableGen/SubtargetEmitter.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/TableGen/SubtargetEmitter.cpp?rev=374524&r1=374523&r2=374524&view=diff ============================================================================== --- llvm/trunk/utils/TableGen/SubtargetEmitter.cpp (original) +++ llvm/trunk/utils/TableGen/SubtargetEmitter.cpp Fri Oct 11 01:36:54 2019 @@ -1057,6 +1057,7 @@ void SubtargetEmitter::GenSchedClassTabl LLVM_DEBUG(dbgs() << ProcModel.ModelName << " does not have resources for class " << SC.Name << '\n'); + SCDesc.NumMicroOps = MCSchedClassDesc::InvalidNumMicroOps; } } // Sum resources across all operand writes. 
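For readers following r374524, here is a minimal sketch of the sentinel idea behind the fix. It does not use the real LLVM headers: SchedClassDescSketch and buildTable are hypothetical names invented for illustration, and the only value taken from the patch is 16383, which is what the CHECK lines in InvalidMCSchedClassDesc.td expect for an invalid entry. The MarkInvalid flag is likewise an assumption that models the behaviour before and after the one-line change in SubtargetEmitter.cpp. 

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for llvm::MCSchedClassDesc, reduced to the one
// field this commit is about. Not the real LLVM type.
struct SchedClassDescSketch {
  // Sentinel taken from the CHECK lines in the new test (16383).
  static constexpr uint16_t InvalidNumMicroOps = (1u << 14) - 1;
  uint16_t NumMicroOps = 0;
  bool isValid() const { return NumMicroOps != InvalidNumMicroOps; }
};

// Builds one sched class table for a single model. HasResources[i] says
// whether this model describes instruction class i. MarkInvalid models
// the fix: without it, undescribed classes keep the default 0 and
// therefore look valid, which is the bug described in the log.
std::vector<SchedClassDescSketch>
buildTable(const std::vector<bool> &HasResources, bool MarkInvalid) {
  std::vector<SchedClassDescSketch> Table;
  for (bool Has : HasResources) {
    SchedClassDescSketch Desc;
    if (Has)
      Desc.NumMicroOps = 1;
    else if (MarkInvalid)
      Desc.NumMicroOps = SchedClassDescSketch::InvalidNumMicroOps;
    Table.push_back(Desc);
  }
  return Table;
}
```

With buildTable({true, false}, /*MarkInvalid=*/true) the second entry reports isValid() == false, matching SchedModel_A in the test; with MarkInvalid set to false it reports true, which is the state that let the assertion trigger. 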
From llvm-commits at lists.llvm.org Fri Oct 11 01:37:02 2019 From: llvm-commits at lists.llvm.org (Qing Shan Zhang via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 08:37:02 +0000 (UTC) Subject: [PATCH] D67950: [TableGen] Fix a bug that MCSchedClassDesc is interfered between different SchedModel In-Reply-To: References: Message-ID: This revision was automatically updated to reflect the committed changes. Closed by commit rGbb8d54001075: [TableGen] Fix a bug that MCSchedClassDesc is interfered between different… (authored by steven.zhang). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67950/new/ https://reviews.llvm.org/D67950 Files: llvm/test/CodeGen/ARM/ParallelDSP/unroll-n-jam-smlad.ll llvm/test/TableGen/InvalidMCSchedClassDesc.td llvm/utils/TableGen/SubtargetEmitter.cpp Index: llvm/utils/TableGen/SubtargetEmitter.cpp =================================================================== --- llvm/utils/TableGen/SubtargetEmitter.cpp +++ llvm/utils/TableGen/SubtargetEmitter.cpp @@ -1057,6 +1057,7 @@ LLVM_DEBUG(dbgs() << ProcModel.ModelName << " does not have resources for class " << SC.Name << '\n'); + SCDesc.NumMicroOps = MCSchedClassDesc::InvalidNumMicroOps; } } // Sum resources across all operand writes. Index: llvm/test/TableGen/InvalidMCSchedClassDesc.td =================================================================== --- /dev/null +++ llvm/test/TableGen/InvalidMCSchedClassDesc.td @@ -0,0 +1,47 @@ +// RUN: llvm-tblgen -gen-subtarget -I %p/../../include %s 2>&1 | FileCheck %s +// Check if it is valid MCSchedClassDesc if didn't have the resources. 
+ +include "llvm/Target/Target.td" + +def MyTarget : Target; + +let OutOperandList = (outs), InOperandList = (ins) in { + def Inst_A : Instruction; + def Inst_B : Instruction; +} + +let CompleteModel = 0 in { + def SchedModel_A: SchedMachineModel; + def SchedModel_B: SchedMachineModel; + def SchedModel_C: SchedMachineModel; +} + +// Inst_B didn't have the resoures, and it is invalid. +// CHECK: SchedModel_ASchedClasses[] = { +// CHECK: {DBGFIELD("Inst_A") 1 +// CHECK-NEXT: {DBGFIELD("Inst_B") 16383 +let SchedModel = SchedModel_A in { + def Write_A : SchedWriteRes<[]>; + def : InstRW<[Write_A], (instrs Inst_A)>; +} + +// Inst_A didn't have the resoures, and it is invalid. +// CHECK: SchedModel_BSchedClasses[] = { +// CHECK: {DBGFIELD("Inst_A") 16383 +// CHECK-NEXT: {DBGFIELD("Inst_B") 1 +let SchedModel = SchedModel_B in { + def Write_B: SchedWriteRes<[]>; + def : InstRW<[Write_B], (instrs Inst_B)>; +} + +// CHECK: SchedModel_CSchedClasses[] = { +// CHECK: {DBGFIELD("Inst_A") 1 +// CHECK-NEXT: {DBGFIELD("Inst_B") 1 +let SchedModel = SchedModel_C in { + def Write_C: SchedWriteRes<[]>; + def : InstRW<[Write_C], (instrs Inst_A, Inst_B)>; +} + +def ProcessorA: ProcessorModel<"ProcessorA", SchedModel_A, []>; +def ProcessorB: ProcessorModel<"ProcessorB", SchedModel_B, []>; +def ProcessorC: ProcessorModel<"ProcessorC", SchedModel_C, []>; Index: llvm/test/CodeGen/ARM/ParallelDSP/unroll-n-jam-smlad.ll =================================================================== --- llvm/test/CodeGen/ARM/ParallelDSP/unroll-n-jam-smlad.ll +++ llvm/test/CodeGen/ARM/ParallelDSP/unroll-n-jam-smlad.ll @@ -45,7 +45,6 @@ ; CHECK-REG-PRESSURE: ldr{{.*}}, [sp ; CHECK-REG-PRESSURE: ldr{{.*}}, [sp ; CHECK-REG-PRESSURE: ldr{{.*}}, [sp -; CHECK-REG-PRESSURE: ldr{{.*}}, [sp ; CHECK-REG-PRESSURE: bne .LBB0_1 for.body: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D67950.224550.patch Type: text/x-patch Size: 2791 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Fri Oct 11 01:47:03 2019 From: llvm-commits at lists.llvm.org (Vitaly Buka via llvm-commits) Date: Fri, 11 Oct 2019 08:47:03 -0000 Subject: [llvm] r374527 - Insert module constructors in a module pass Message-ID: <20191011084703.89A7292ECF@lists.llvm.org> Author: vitalybuka Date: Fri Oct 11 01:47:03 2019 New Revision: 374527 URL: http://llvm.org/viewvc/llvm-project?rev=374527&view=rev Log: Insert module constructors in a module pass Summary: If we insert them from function pass some analysis may be missing or invalid. Fixes PR42877. Reviewers: eugenis, leonardchan Reviewed By: leonardchan Subscribers: hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D68832 llvm-svn: 374481 Signed-off-by: Vitaly Buka Modified: llvm/trunk/include/llvm/Transforms/Instrumentation/MemorySanitizer.h llvm/trunk/include/llvm/Transforms/Instrumentation/ThreadSanitizer.h llvm/trunk/lib/Passes/PassRegistry.def llvm/trunk/lib/Transforms/Instrumentation/MemorySanitizer.cpp llvm/trunk/lib/Transforms/Instrumentation/ThreadSanitizer.cpp llvm/trunk/test/Instrumentation/MemorySanitizer/msan_basic.ll llvm/trunk/test/Instrumentation/ThreadSanitizer/tsan_basic.ll Modified: llvm/trunk/include/llvm/Transforms/Instrumentation/MemorySanitizer.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Transforms/Instrumentation/MemorySanitizer.h?rev=374527&r1=374526&r2=374527&view=diff ============================================================================== --- llvm/trunk/include/llvm/Transforms/Instrumentation/MemorySanitizer.h (original) +++ llvm/trunk/include/llvm/Transforms/Instrumentation/MemorySanitizer.h Fri Oct 11 01:47:03 2019 @@ -40,6 +40,7 @@ struct MemorySanitizerPass : public Pass MemorySanitizerPass(MemorySanitizerOptions Options) : Options(Options) {} PreservedAnalyses run(Function &F, 
FunctionAnalysisManager &FAM); + PreservedAnalyses run(Module &M, ModuleAnalysisManager &AM); private: MemorySanitizerOptions Options; Modified: llvm/trunk/include/llvm/Transforms/Instrumentation/ThreadSanitizer.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Transforms/Instrumentation/ThreadSanitizer.h?rev=374527&r1=374526&r2=374527&view=diff ============================================================================== --- llvm/trunk/include/llvm/Transforms/Instrumentation/ThreadSanitizer.h (original) +++ llvm/trunk/include/llvm/Transforms/Instrumentation/ThreadSanitizer.h Fri Oct 11 01:47:03 2019 @@ -27,6 +27,8 @@ FunctionPass *createThreadSanitizerLegac /// yet, the pass inserts the declarations. Otherwise the existing globals are struct ThreadSanitizerPass : public PassInfoMixin { PreservedAnalyses run(Function &F, FunctionAnalysisManager &FAM); + PreservedAnalyses run(Module &M, ModuleAnalysisManager &AM); }; + } // namespace llvm #endif /* LLVM_TRANSFORMS_INSTRUMENTATION_THREADSANITIZER_H */ Modified: llvm/trunk/lib/Passes/PassRegistry.def URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Passes/PassRegistry.def?rev=374527&r1=374526&r2=374527&view=diff ============================================================================== --- llvm/trunk/lib/Passes/PassRegistry.def (original) +++ llvm/trunk/lib/Passes/PassRegistry.def Fri Oct 11 01:47:03 2019 @@ -86,6 +86,8 @@ MODULE_PASS("synthetic-counts-propagatio MODULE_PASS("wholeprogramdevirt", WholeProgramDevirtPass(nullptr, nullptr)) MODULE_PASS("verify", VerifierPass()) MODULE_PASS("asan-module", ModuleAddressSanitizerPass(/*CompileKernel=*/false, false, true, false)) +MODULE_PASS("msan-module", MemorySanitizerPass({})) +MODULE_PASS("tsan-module", ThreadSanitizerPass()) MODULE_PASS("kasan-module", ModuleAddressSanitizerPass(/*CompileKernel=*/true, false, true, false)) MODULE_PASS("sancov-module", ModuleSanitizerCoveragePass()) MODULE_PASS("poison-checking", PoisonCheckingPass()) 
Modified: llvm/trunk/lib/Transforms/Instrumentation/MemorySanitizer.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Instrumentation/MemorySanitizer.cpp?rev=374527&r1=374526&r2=374527&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Instrumentation/MemorySanitizer.cpp (original) +++ llvm/trunk/lib/Transforms/Instrumentation/MemorySanitizer.cpp Fri Oct 11 01:47:03 2019 @@ -587,10 +587,26 @@ private: /// An empty volatile inline asm that prevents callback merge. InlineAsm *EmptyAsm; - - Function *MsanCtorFunction; }; +void insertModuleCtor(Module &M) { + getOrCreateSanitizerCtorAndInitFunctions( + M, kMsanModuleCtorName, kMsanInitName, + /*InitArgTypes=*/{}, + /*InitArgs=*/{}, + // This callback is invoked when the functions are created the first + // time. Hook them into the global ctors list in that case: + [&](Function *Ctor, FunctionCallee) { + if (!ClWithComdat) { + appendToGlobalCtors(M, Ctor, 0); + return; + } + Comdat *MsanCtorComdat = M.getOrInsertComdat(kMsanModuleCtorName); + Ctor->setComdat(MsanCtorComdat); + appendToGlobalCtors(M, Ctor, 0, Ctor); + }); +} + /// A legacy function pass for msan instrumentation. /// /// Instruments functions to detect unitialized reads. 
@@ -635,6 +651,14 @@ PreservedAnalyses MemorySanitizerPass::r return PreservedAnalyses::all(); } +PreservedAnalyses MemorySanitizerPass::run(Module &M, + ModuleAnalysisManager &AM) { + if (Options.Kernel) + return PreservedAnalyses::all(); + insertModuleCtor(M); + return PreservedAnalyses::none(); +} + char MemorySanitizerLegacyPass::ID = 0; INITIALIZE_PASS_BEGIN(MemorySanitizerLegacyPass, "msan", @@ -920,23 +944,6 @@ void MemorySanitizer::initializeModule(M OriginStoreWeights = MDBuilder(*C).createBranchWeights(1, 1000); if (!CompileKernel) { - std::tie(MsanCtorFunction, std::ignore) = - getOrCreateSanitizerCtorAndInitFunctions( - M, kMsanModuleCtorName, kMsanInitName, - /*InitArgTypes=*/{}, - /*InitArgs=*/{}, - // This callback is invoked when the functions are created the first - // time. Hook them into the global ctors list in that case: - [&](Function *Ctor, FunctionCallee) { - if (!ClWithComdat) { - appendToGlobalCtors(M, Ctor, 0); - return; - } - Comdat *MsanCtorComdat = M.getOrInsertComdat(kMsanModuleCtorName); - Ctor->setComdat(MsanCtorComdat); - appendToGlobalCtors(M, Ctor, 0, Ctor); - }); - if (TrackOrigins) M.getOrInsertGlobal("__msan_track_origins", IRB.getInt32Ty(), [&] { return new GlobalVariable( @@ -954,6 +961,8 @@ void MemorySanitizer::initializeModule(M } bool MemorySanitizerLegacyPass::doInitialization(Module &M) { + if (!Options.Kernel) + insertModuleCtor(M); MSan.emplace(M, Options); return true; } @@ -4578,8 +4587,9 @@ static VarArgHelper *CreateVarArgHelper( } bool MemorySanitizer::sanitizeFunction(Function &F, TargetLibraryInfo &TLI) { - if (!CompileKernel && (&F == MsanCtorFunction)) + if (!CompileKernel && F.getName() == kMsanModuleCtorName) return false; + MemorySanitizerVisitor Visitor(F, *this, TLI); // Clear out readonly/readnone attributes. 
Modified: llvm/trunk/lib/Transforms/Instrumentation/ThreadSanitizer.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Instrumentation/ThreadSanitizer.cpp?rev=374527&r1=374526&r2=374527&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Instrumentation/ThreadSanitizer.cpp (original) +++ llvm/trunk/lib/Transforms/Instrumentation/ThreadSanitizer.cpp Fri Oct 11 01:47:03 2019 @@ -92,11 +92,10 @@ namespace { /// ensures the __tsan_init function is in the list of global constructors for /// the module. struct ThreadSanitizer { - ThreadSanitizer(Module &M); bool sanitizeFunction(Function &F, const TargetLibraryInfo &TLI); private: - void initializeCallbacks(Module &M); + void initialize(Module &M); bool instrumentLoadOrStore(Instruction *I, const DataLayout &DL); bool instrumentAtomic(Instruction *I, const DataLayout &DL); bool instrumentMemIntrinsic(Instruction *I); @@ -108,8 +107,6 @@ private: void InsertRuntimeIgnores(Function &F); Type *IntptrTy; - IntegerType *OrdTy; - // Callbacks to run-time library are computed in doInitialization. FunctionCallee TsanFuncEntry; FunctionCallee TsanFuncExit; FunctionCallee TsanIgnoreBegin; @@ -130,7 +127,6 @@ private: FunctionCallee TsanVptrUpdate; FunctionCallee TsanVptrLoad; FunctionCallee MemmoveFn, MemcpyFn, MemsetFn; - Function *TsanCtorFunction; }; struct ThreadSanitizerLegacyPass : FunctionPass { @@ -143,16 +139,32 @@ struct ThreadSanitizerLegacyPass : Funct private: Optional TSan; }; + +void insertModuleCtor(Module &M) { + getOrCreateSanitizerCtorAndInitFunctions( + M, kTsanModuleCtorName, kTsanInitName, /*InitArgTypes=*/{}, + /*InitArgs=*/{}, + // This callback is invoked when the functions are created the first + // time. 
Hook them into the global ctors list in that case: + [&](Function *Ctor, FunctionCallee) { appendToGlobalCtors(M, Ctor, 0); }); +} + } // namespace PreservedAnalyses ThreadSanitizerPass::run(Function &F, FunctionAnalysisManager &FAM) { - ThreadSanitizer TSan(*F.getParent()); + ThreadSanitizer TSan; if (TSan.sanitizeFunction(F, FAM.getResult(F))) return PreservedAnalyses::none(); return PreservedAnalyses::all(); } +PreservedAnalyses ThreadSanitizerPass::run(Module &M, + ModuleAnalysisManager &MAM) { + insertModuleCtor(M); + return PreservedAnalyses::none(); +} + char ThreadSanitizerLegacyPass::ID = 0; INITIALIZE_PASS_BEGIN(ThreadSanitizerLegacyPass, "tsan", "ThreadSanitizer: detects data races.", false, false) @@ -169,7 +181,8 @@ void ThreadSanitizerLegacyPass::getAnaly } bool ThreadSanitizerLegacyPass::doInitialization(Module &M) { - TSan.emplace(M); + insertModuleCtor(M); + TSan.emplace(); return true; } @@ -183,7 +196,10 @@ FunctionPass *llvm::createThreadSanitize return new ThreadSanitizerLegacyPass(); } -void ThreadSanitizer::initializeCallbacks(Module &M) { +void ThreadSanitizer::initialize(Module &M) { + const DataLayout &DL = M.getDataLayout(); + IntptrTy = DL.getIntPtrType(M.getContext()); + IRBuilder<> IRB(M.getContext()); AttributeList Attr; Attr = Attr.addAttribute(M.getContext(), AttributeList::FunctionIndex, @@ -197,7 +213,7 @@ void ThreadSanitizer::initializeCallback IRB.getVoidTy()); TsanIgnoreEnd = M.getOrInsertFunction("__tsan_ignore_thread_end", Attr, IRB.getVoidTy()); - OrdTy = IRB.getInt32Ty(); + IntegerType *OrdTy = IRB.getInt32Ty(); for (size_t i = 0; i < kNumberOfAccessSizes; ++i) { const unsigned ByteSize = 1U << i; const unsigned BitSize = ByteSize * 8; @@ -280,20 +296,6 @@ void ThreadSanitizer::initializeCallback IRB.getInt8PtrTy(), IRB.getInt32Ty(), IntptrTy); } -ThreadSanitizer::ThreadSanitizer(Module &M) { - const DataLayout &DL = M.getDataLayout(); - IntptrTy = DL.getIntPtrType(M.getContext()); - std::tie(TsanCtorFunction, std::ignore) 
= - getOrCreateSanitizerCtorAndInitFunctions( - M, kTsanModuleCtorName, kTsanInitName, /*InitArgTypes=*/{}, - /*InitArgs=*/{}, - // This callback is invoked when the functions are created the first - // time. Hook them into the global ctors list in that case: - [&](Function *Ctor, FunctionCallee) { - appendToGlobalCtors(M, Ctor, 0); - }); -} - static bool isVtableAccess(Instruction *I) { if (MDNode *Tag = I->getMetadata(LLVMContext::MD_tbaa)) return Tag->isTBAAVtableAccess(); @@ -436,9 +438,9 @@ bool ThreadSanitizer::sanitizeFunction(F const TargetLibraryInfo &TLI) { // This is required to prevent instrumenting call to __tsan_init from within // the module constructor. - if (&F == TsanCtorFunction) + if (F.getName() == kTsanModuleCtorName) return false; - initializeCallbacks(*F.getParent()); + initialize(*F.getParent()); SmallVector AllLoadsAndStores; SmallVector LocalLoadsAndStores; SmallVector AtomicAccesses; Modified: llvm/trunk/test/Instrumentation/MemorySanitizer/msan_basic.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Instrumentation/MemorySanitizer/msan_basic.ll?rev=374527&r1=374526&r2=374527&view=diff ============================================================================== --- llvm/trunk/test/Instrumentation/MemorySanitizer/msan_basic.ll (original) +++ llvm/trunk/test/Instrumentation/MemorySanitizer/msan_basic.ll Fri Oct 11 01:47:03 2019 @@ -1,10 +1,9 @@ -; RUN: opt < %s -msan-check-access-address=0 -S -passes=msan 2>&1 | FileCheck \ -; RUN: -allow-deprecated-dag-overlap %s -; RUN: opt < %s -msan -msan-check-access-address=0 -S | FileCheck -allow-deprecated-dag-overlap %s -; RUN: opt < %s -msan-check-access-address=0 -msan-track-origins=1 -S \ -; RUN: -passes=msan 2>&1 | FileCheck -allow-deprecated-dag-overlap \ -; RUN: -check-prefix=CHECK -check-prefix=CHECK-ORIGINS %s -; RUN: opt < %s -msan -msan-check-access-address=0 -msan-track-origins=1 -S | FileCheck -allow-deprecated-dag-overlap -check-prefix=CHECK -check-prefix=CHECK-ORIGINS %s 
+; RUN: opt < %s -msan-check-access-address=0 -S -passes='module(msan-module),function(msan)' 2>&1 | FileCheck -allow-deprecated-dag-overlap %s +; RUN: opt < %s --passes='module(msan-module),function(msan)' -msan-check-access-address=0 -S | FileCheck -allow-deprecated-dag-overlap %s +; RUN: opt < %s -msan-check-access-address=0 -msan-track-origins=1 -S -passes='module(msan-module),function(msan)' 2>&1 | \ +; RUN: FileCheck -allow-deprecated-dag-overlap -check-prefixes=CHECK,CHECK-ORIGINS %s +; RUN: opt < %s -passes='module(msan-module),function(msan)' -msan-check-access-address=0 -msan-track-origins=1 -S | \ +; RUN: FileCheck -allow-deprecated-dag-overlap -check-prefixes=CHECK,CHECK-ORIGINS %s target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64-S128" target triple = "x86_64-unknown-linux-gnu" Modified: llvm/trunk/test/Instrumentation/ThreadSanitizer/tsan_basic.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Instrumentation/ThreadSanitizer/tsan_basic.ll?rev=374527&r1=374526&r2=374527&view=diff ============================================================================== --- llvm/trunk/test/Instrumentation/ThreadSanitizer/tsan_basic.ll (original) +++ llvm/trunk/test/Instrumentation/ThreadSanitizer/tsan_basic.ll Fri Oct 11 01:47:03 2019 @@ -1,5 +1,5 @@ ; RUN: opt < %s -tsan -S | FileCheck %s -; RUN: opt < %s -passes=tsan -S | FileCheck %s +; RUN: opt < %s -passes='function(tsan),module(tsan-module)' -S | FileCheck %s target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64" target triple = "x86_64-unknown-linux-gnu" From llvm-commits at lists.llvm.org Fri Oct 11 01:46:17 2019 From: llvm-commits at lists.llvm.org (Florian Hahn via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 08:46:17 +0000 (UTC) Subject: [PATCH] D66604: [GVN] 
AnalyzeLoadAvailability: Replace a load after lifetime.end with undef (PR20811) In-Reply-To: References: Message-ID: <3c2f76d2fb4c6bc7b74998a40d1aae4a@localhost.localdomain> fhahn added a comment. I can commit it for you in a bit, thanks for the patch! I'll generate the check lines before committing. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D66604/new/ https://reviews.llvm.org/D66604 From llvm-commits at lists.llvm.org Fri Oct 11 01:55:52 2019 From: llvm-commits at lists.llvm.org (LiuChen via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 08:55:52 +0000 (UTC) Subject: [PATCH] D68857: [X86] Add strict fp support for operations of X87 instructions In-Reply-To: References: Message-ID: <2ff5cacfa24c75e21bfd0d2a00fe7d8d@localhost.localdomain> LiuChen3 updated this revision to Diff 224553. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68857/new/ https://reviews.llvm.org/D68857 Files: llvm/lib/Target/X86/X86ISelLowering.cpp llvm/lib/Target/X86/X86InstrFPStack.td llvm/test/CodeGen/X86/x87-fp-strict-add.ll llvm/test/CodeGen/X86/x87-fp-strict-div.ll llvm/test/CodeGen/X86/x87-fp-strict-fpextend.ll llvm/test/CodeGen/X86/x87-fp-strict-fpround.ll llvm/test/CodeGen/X86/x87-fp-strict-mul.ll llvm/test/CodeGen/X86/x87-fp-strict-sqrt.ll llvm/test/CodeGen/X86/x87-fp-strict-sub.ll -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68857.224553.patch Type: text/x-patch Size: 26616 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Fri Oct 11 02:03:30 2019 From: llvm-commits at lists.llvm.org (Aleksandr Urakov via llvm-commits) Date: Fri, 11 Oct 2019 09:03:30 -0000 Subject: [llvm] r374528 - [Windows] Use information from the PE32 exceptions directory to construct unwind plans Message-ID: <20191011090330.81A1C88A13@lists.llvm.org> Author: aleksandr.urakov Date: Fri Oct 11 02:03:29 2019 New Revision: 374528 URL: http://llvm.org/viewvc/llvm-project?rev=374528&view=rev Log: [Windows] Use information from the PE32 exceptions directory to construct unwind plans This patch adds an implementation of unwinding using PE EH info. It allows to get almost ideal call stacks on 64-bit Windows systems (except some epilogue cases, but I believe that they can be fixed with unwind plan disassembly augmentation in the future). To achieve the goal the CallFrameInfo abstraction was made. It is based on the DWARFCallFrameInfo class interface with a few changes to make it less DWARF-specific. To implement the new interface for PECOFF object files the class PECallFrameInfo was written. It uses the next helper classes: - UnwindCodesIterator helps to iterate through UnwindCode structures (and processes chained infos transparently); - EHProgramBuilder with the use of UnwindCodesIterator constructs EHProgram; - EHProgram is, by fact, a vector of EHInstructions. It creates an abstraction over the low-level unwind codes and simplifies work with them. It contains only the information that is relevant to unwinding in the unified form. Also the required unwind codes are read from the object file only once with it; - EHProgramRange allows to take a range of EHProgram and to build an unwind row for it. So, PECallFrameInfo builds the EHProgram with EHProgramBuilder, takes the ranges corresponding to every offset in prologue and builds the rows of the resulted unwind plan. 
The resulted plan covers the whole range of the function except the epilogue. Reviewers: jasonmolenda, asmith, amccarth, clayborg, JDevlieghere, stella.stamenova, labath, espindola Reviewed By: jasonmolenda Subscribers: leonid.mashinskiy, emaste, mgorny, aprantl, arichardson, MaskRay, lldb-commits, llvm-commits Tags: #lldb Differential Revision: https://reviews.llvm.org/D67347 Modified: llvm/trunk/include/llvm/Support/Win64EH.h Modified: llvm/trunk/include/llvm/Support/Win64EH.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Support/Win64EH.h?rev=374528&r1=374527&r2=374528&view=diff ============================================================================== --- llvm/trunk/include/llvm/Support/Win64EH.h (original) +++ llvm/trunk/include/llvm/Support/Win64EH.h Fri Oct 11 02:03:29 2019 @@ -30,7 +30,9 @@ enum UnwindOpcodes { UOP_SetFPReg, UOP_SaveNonVol, UOP_SaveNonVolBig, - UOP_SaveXMM128 = 8, + UOP_Epilog, + UOP_SpareCode, + UOP_SaveXMM128, UOP_SaveXMM128Big, UOP_PushMachFrame, // The following set of unwind opcodes is for ARM64. They are documented at From llvm-commits at lists.llvm.org Fri Oct 11 02:05:19 2019 From: llvm-commits at lists.llvm.org (Aleksandr Urakov via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 09:05:19 +0000 (UTC) Subject: [PATCH] D67347: [Windows] Use information from the PE32 exceptions directory to construct unwind plans In-Reply-To: References: Message-ID: This revision was automatically updated to reflect the committed changes. Closed by commit rG30c2441a3262: [Windows] Use information from the PE32 exceptions directory to construct… (authored by aleksandr.urakov). 
Changed prior to commit: https://reviews.llvm.org/D67347?vs=220144&id=224554#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67347/new/ https://reviews.llvm.org/D67347 Files: lldb/include/lldb/Symbol/CallFrameInfo.h lldb/include/lldb/Symbol/FuncUnwinders.h lldb/include/lldb/Symbol/ObjectFile.h lldb/include/lldb/Symbol/UnwindTable.h lldb/include/lldb/lldb-forward.h lldb/source/Commands/CommandObjectTarget.cpp lldb/source/Plugins/ObjectFile/PECOFF/CMakeLists.txt lldb/source/Plugins/ObjectFile/PECOFF/ObjectFilePECOFF.cpp lldb/source/Plugins/ObjectFile/PECOFF/ObjectFilePECOFF.h lldb/source/Plugins/ObjectFile/PECOFF/PECallFrameInfo.cpp lldb/source/Plugins/ObjectFile/PECOFF/PECallFrameInfo.h lldb/source/Plugins/Process/Utility/RegisterContextLLDB.cpp lldb/source/Symbol/FuncUnwinders.cpp lldb/source/Symbol/ObjectFile.cpp lldb/source/Symbol/UnwindTable.cpp lldb/unittests/ObjectFile/CMakeLists.txt lldb/unittests/ObjectFile/PECOFF/CMakeLists.txt lldb/unittests/ObjectFile/PECOFF/TestPECallFrameInfo.cpp llvm/include/llvm/Support/Win64EH.h -------------- next part -------------- A non-text attachment was scrubbed... Name: D67347.224554.patch Type: text/x-patch Size: 51709 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Fri Oct 11 02:14:35 2019 From: llvm-commits at lists.llvm.org (Roman Lebedev via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 09:14:35 +0000 (UTC) Subject: [PATCH] D66004: [WIP][X86][SSE] SimplifyDemandedVectorEltsForTargetNode - add general shuffle combining support In-Reply-To: References: Message-ID: <153e5fa89d611fead0b51e8f7cdbe097@localhost.localdomain> lebedev.ri added a comment. Rebase this? 
Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D66004/new/ https://reviews.llvm.org/D66004 From llvm-commits at lists.llvm.org Fri Oct 11 02:14:36 2019 From: llvm-commits at lists.llvm.org (=?utf-8?q?Martin_Storsj=C3=B6_via_Phabricator?= via llvm-commits) Date: Fri, 11 Oct 2019 09:14:36 +0000 (UTC) Subject: [PATCH] D67347: [Windows] Use information from the PE32 exceptions directory to construct unwind plans In-Reply-To: References: Message-ID: <7b3e76e2c7c30bcb395b0d5798f4b8c3@localhost.localdomain> mstorsjo added a comment. Quick question here; will unwinding using DWARF debug info still work like before after this, for binaries that don't use SEH for exception unwinding? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67347/new/ https://reviews.llvm.org/D67347 From llvm-commits at lists.llvm.org Fri Oct 11 02:23:48 2019 From: llvm-commits at lists.llvm.org (Aleksandr Urakov via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 09:23:48 +0000 (UTC) Subject: [PATCH] D67347: [Windows] Use information from the PE32 exceptions directory to construct unwind plans In-Reply-To: References: Message-ID: <42374195e5baf82c8c1d5003f4795351@localhost.localdomain> aleksandr.urakov added a comment. In D67347#1705563 , @mstorsjo wrote: > Quick question here; will unwinding using DWARF debug info still work like before after this, for binaries that don't use SEH for exception unwinding? Hi Martin! Do I understand the question correctly: you mean x64 Windows binaries with an additional DWARF unwind info inside, right? In that case the info from PE32+ directory should be used, it has a higher priority than the debug info. I think it should have a higher priority because it is used by the system during an unwind and is always presented in x64 binaries, so we have stronger guarantees with it. 
In all other cases, when the info in PE32+ directory is not presented, all things should work as usual (then, the DWARF info will be used). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67347/new/ https://reviews.llvm.org/D67347 From llvm-commits at lists.llvm.org Fri Oct 11 02:34:36 2019 From: llvm-commits at lists.llvm.org (Anna Welker via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 09:34:36 +0000 (UTC) Subject: [PATCH] D68862: [ARM] Allocatable Global Register Variables for ARM Message-ID: anwel created this revision. anwel added reviewers: carwil, amilendra_arm, phosek, michaelplatings, efriedma. anwel added projects: LLVM, clang. Herald added subscribers: llvm-commits, cfe-commits, hiraditya, kristof.beyls. This patch combines two earlier patches aiming at providing the same support (https://reviews.llvm.org/D56003 for clang, https://reviews.llvm.org/D56005 for LLVM). It enables reservation of allocatable registers via command line options, which in turn allows them to be used as global named register variables. They will then not be used by the register allocator nor spilled to the stack. More information is available in the original RFC: http://lists.llvm.org/pipermail/llvm-dev/2018-December/128706.html Changes from the previous patches include: - adding a constraint to specify -ffixed-rN if rN is used as named register variable. - upgrading the frame-pointer warning to an error and throwing an error in LLVM, as well as clang.* Additionally this patch now only supports r6-r11. r4 and r5 are excluded from this patch as r4 is used as hard-coded scratch register in various parts of the ARM backend. r4 also appears to be used as an input register for a Windows asm routine (__chkstk). Similarly, the ABI of the segmented stack prologue for Android and Linux seems to use r4 and r5 as input registers. 
A separate patch could follow to add the support for r4 and/or r5, such that the whole range of allocatable registers (r4-r11) is available. As before it should be noted that this also changes the behaviour of the old -ffixed-r9 option. This option will now prevent the register from being spilled to the stack. *This was originally a warning, but we don't seem to have the necessary information to determine frame-pointer usage in the given context. Any insight here would be welcome. Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D68862 Files: clang/docs/ClangCommandLineReference.rst clang/include/clang/Basic/DiagnosticDriverKinds.td clang/include/clang/Basic/DiagnosticGroups.td clang/include/clang/Basic/DiagnosticSemaKinds.td clang/include/clang/Basic/TargetInfo.h clang/include/clang/Driver/Options.td clang/lib/Basic/Targets/ARM.cpp clang/lib/Basic/Targets/ARM.h clang/lib/Driver/ToolChains/Arch/ARM.cpp clang/lib/Sema/SemaDecl.cpp clang/test/Driver/arm-reserved-reg-options.c clang/test/Sema/arm-global-regs.c llvm/lib/Target/ARM/ARM.td llvm/lib/Target/ARM/ARMAsmPrinter.cpp llvm/lib/Target/ARM/ARMBaseRegisterInfo.cpp llvm/lib/Target/ARM/ARMFrameLowering.cpp llvm/lib/Target/ARM/ARMISelLowering.cpp llvm/lib/Target/ARM/ARMSubtarget.cpp llvm/lib/Target/ARM/ARMSubtarget.h llvm/lib/Target/ARM/ARMTargetTransformInfo.h llvm/test/CodeGen/ARM/reg-alloc-fixed-r6-vla.ll llvm/test/CodeGen/ARM/reg-alloc-with-fixed-reg-r6-modified.ll llvm/test/CodeGen/ARM/reg-alloc-with-fixed-reg-r6.ll llvm/test/CodeGen/ARM/reg-alloc-wout-fixed-regs.ll llvm/test/CodeGen/Thumb/callee_save_reserved.ll llvm/test/Feature/reserve_global_reg.ll -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68862.224555.patch Type: text/x-patch Size: 34177 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Fri Oct 11 02:44:57 2019 From: llvm-commits at lists.llvm.org (Carey Williams via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 09:44:57 +0000 (UTC) Subject: [PATCH] D56005: [RFC] [LLVM] Allocatable Global Register Variables for ARM In-Reply-To: References: Message-ID: <64884864eb4e340fabd754575cc78e32@localhost.localdomain> carwil abandoned this revision. carwil marked an inline comment as done. carwil added a comment. Superseded by https://reviews.llvm.org/D68862. ================ Comment at: lib/Target/ARM/ARMSubtarget.h:722 + if (i == 9 && isTargetMachO() && !HasV6Ops) { + return true; + } ---------------- efriedma wrote: > Can we handle this in initSubtargetFeatures instead, like we do for rwpi? It's sort of confusing to follow. I'm not quite sure what you mean. This is just a convenience function for checking which registers have been reserved. We're not actually setting the reservations here, that's handled by the ARM.td/reserve-rN rule(s). r9 could be reserved either with ffixed-r9/reserve-r9 or with -frwpi, the second case being handled in initTargetSubFeatures. For the other GPRs we only have the ffixed options, so there is no need. Am I misunderstanding? CHANGES SINCE LAST ACTION https://reviews.llvm.org/D56005/new/ https://reviews.llvm.org/D56005 From llvm-commits at lists.llvm.org Fri Oct 11 02:45:02 2019 From: llvm-commits at lists.llvm.org (Pavel Labath via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 09:45:02 +0000 (UTC) Subject: [PATCH] D68657: Update MinidumpYAML to use minidump::Exception for exception stream In-Reply-To: References: Message-ID: <64ee820c5d76a16c38d753530cc187aa@localhost.localdomain> labath added a comment. Thanks. IIUC, all the existing tests just cover the yaml2obj direction. Could you add something for the other direction too? 
Maybe add an exception stream to `test/tools/obj2yaml/basic-minidump.yaml`?


================
Comment at: llvm/include/llvm/ObjectYAML/MinidumpYAML.h:162-181
+/// ExceptionStream minidump stream.
+struct ExceptionStream : public Stream {
+  minidump::ExceptionStream MDExceptionStream;
+  yaml::BinaryRef ThreadContext;
+
+  explicit ExceptionStream(const minidump::ExceptionStream &MDExceptionStream,
+                           ArrayRef ThreadContext)
----------------
I've been trying to keep this somewhat sorted. Could you move this before the `MemoryInfoListStream` class?

Also, in the previous patch we've moved the default constructors to the front. It would be good to make this consistent with that.


================
Comment at: llvm/lib/ObjectYAML/MinidumpYAML.cpp:394
+  mapOptionalHex(IO, "Exception Address", Exception.ExceptionAddress, 0);
+  IO.mapOptional("Number Parameters", Exception.NumberParameters,
+                 support::ulittle32_t(0u));
----------------
This file has a helper function for this (`mapOptional(IO, "name", value, 0)`).

I'd consider changing the field name to "Number of Parameters" even though it does not match the field name, as it reads weird without that. I'm not sure why the Microsoft naming is inconsistent here -- most of the other minidump structs have "of" in their name already (BaseOfImage, SizeOfImage, etc.), but at least we can be consistent.


================
Comment at: llvm/lib/ObjectYAML/MinidumpYAML.cpp:408-414
+StringRef yaml::MappingTraits::validate(
+    yaml::IO &IO, minidump::Exception &Exception) {
+  if (Exception.NumberParameters > Exception::MaxParameters)
+    return "Exception reports too many parameters";
+  return "";
+}
+
----------------
Could you remove this bit too? While it is technically invalid, this is not something that yaml2obj needs to care about (as it does not prevent successful serialization), and it would be nice to be able to use it to generate a test case with an invalid number (because that is something lldb should care about and expect/handle).


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68657/new/

https://reviews.llvm.org/D68657

From llvm-commits at lists.llvm.org Fri Oct 11 02:46:24 2019
From: llvm-commits at lists.llvm.org (Pavel Labath via Phabricator via llvm-commits)
Date: Fri, 11 Oct 2019 09:46:24 +0000 (UTC)
Subject: [PATCH] D67347: [Windows] Use information from the PE32 exceptions directory to construct unwind plans
In-Reply-To:
References:
Message-ID: <61d779e580ac52a48ffa671fc5d38a83@localhost.localdomain>

labath added a comment.

In D67347#1705563, @mstorsjo wrote:

> Quick question here; will unwinding using DWARF debug info still work like before after this, for binaries that don't use SEH for exception unwinding?

Do you mean debug_frame or eh_frame? debug_frame should be completely unaffected by this. The interaction between eh_frame and SEH is more tricky, but I don't know if that's ever used/emitted on Windows (since presumably the system libraries don't know how to read it)...


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D67347/new/

https://reviews.llvm.org/D67347

From llvm-commits at lists.llvm.org Fri Oct 11 02:46:25 2019
From: llvm-commits at lists.llvm.org (Andrea Di Biagio via Phabricator via llvm-commits)
Date: Fri, 11 Oct 2019 09:46:25 +0000 (UTC)
Subject: [PATCH] D67950: [TableGen] Fix a bug that MCSchedClassDesc is interfered between different SchedModel
In-Reply-To:
References:
Message-ID: <49a4d5aa6c97ee0751eb81498bf1031e@localhost.localdomain>

andreadb added a comment.

I am not convinced that this patch is correct.
Isn’t the problem that your model was wrongly marked as complete?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D67950/new/

https://reviews.llvm.org/D67950

From llvm-commits at lists.llvm.org Fri Oct 11 02:54:09 2019
From: llvm-commits at lists.llvm.org (Pavel Labath via Phabricator via llvm-commits)
Date: Fri, 11 Oct 2019 09:54:09 +0000 (UTC)
Subject: [PATCH] D68656: Add ExceptionStream to llvm::Object::minidump
In-Reply-To:
References:
Message-ID:

labath added a comment.

Looks fine, just a few nits inline.


================
Comment at: llvm/include/llvm/BinaryFormat/Minidump.h:228

+// Exception stuff
+struct Exception {
----------------
Delete or put a more meaningful comment here.


================
Comment at: llvm/unittests/Object/MinidumpTest.cpp:756-758
+  if (!ExpectedStream) {
+    errs() << ExpectedStream.takeError();
+  }
----------------
Delete. The ASSERT_THAT_EXPECTED check should already print the error message if this fails. Was that not working for you for some reason?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68656/new/

https://reviews.llvm.org/D68656

From llvm-commits at lists.llvm.org Fri Oct 11 03:02:25 2019
From: llvm-commits at lists.llvm.org (Ulrich Weigand via llvm-commits)
Date: Fri, 11 Oct 2019 12:02:25 +0200
Subject: [PATCH] D68431: [msan] Add interceptors: crypt, crypt_r.
In-Reply-To:
References:
Message-ID:

Evgenii Stepanov wrote on 10.10.2019 22:46:32:

> Sure, done in r374448, let me know if it did not help.

This did fix the problem for me, thanks!

> My man page says that only this is necessary:
>        #define _XOPEN_SOURCE       /* See feature_test_macros(7) */
>        #include
> I don't define _XOPEN_SOURCE, maybe that's the real problem?

Hmm, interesting. That probably would have worked as well.

Bye,
Ulrich
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From llvm-commits at lists.llvm.org Fri Oct 11 03:22:29 2019 From: llvm-commits at lists.llvm.org (Sjoerd Meijer via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 10:22:29 +0000 (UTC) Subject: [PATCH] D68862: [ARM] Allocatable Global Register Variables for ARM In-Reply-To: References: Message-ID: <64f0dbc9698d046b833bee1d1892edb0@localhost.localdomain> SjoerdMeijer added a comment. Bit of a drive-by comment, but I can't say I am big fan of all the string matching on the register names. Not sure if this is a fair comment, because I haven't looked closely at it yet, but could we use more the `ARM::R[0-9]` values more? Perhaps that's difficult from the Clang parts? ================ Comment at: clang/lib/Basic/Targets/ARM.cpp:902 + std::vector &Features = getTargetOpts().Features; + std::string SearchFeature = "+reserve-" + RegName.str(); + for (std::string &Feature : Features) { ---------------- I was pointed at something similar myself recently, but if I am not mistaken then I think this is a use-after-free: "+reserve-" + RegName.str() this will allocate a temp `std::string` that `SearchFeature` points to, which then gets released, and `SearchFeature` is still pointing at it. ================ Comment at: llvm/test/CodeGen/ARM/reg-alloc-with-fixed-reg-r6-modified.ll:15 +; r6 = 10; +; unsigned int result = i + j + k + l +m + n + o + p; +; } ---------------- nit: `+m` -> ` + m` ================ Comment at: llvm/test/CodeGen/ARM/reg-alloc-with-fixed-reg-r6.ll:13 +; { +; unsigned int result = i + j + k + l +m + n + o + p; +; } ---------------- same nit ================ Comment at: llvm/test/CodeGen/ARM/reg-alloc-wout-fixed-regs.ll:3 +; +; Equivalent C source code +; void bar(unsigned int i, ---------------- As all these tests (this file and the ones above) are the same, the "equivalent C source code" is the same, perhaps move all these cases into 1 file. 
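
Regarding the `SearchFeature` comment above, here is a minimal sketch of the lifetime question, not the code from the patch. It assumes the variable is declared as `std::string`, as in the quoted line; `makeSearchFeature` and `hasReserveFeature` are hypothetical names used only for illustration. With an owning `std::string` the temporary is moved into the variable, whereas a non-owning view (such as `StringRef`) bound to the same temporary would dangle.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: builds "+reserve-<reg>". The temporary produced by
// operator+ is moved into the returned std::string, which owns its buffer.
inline std::string makeSearchFeature(const std::string &RegName) {
  return "+reserve-" + RegName;
}

// Hypothetical helper: checks whether the reserve feature for RegName is
// present in a list of feature strings.
inline bool hasReserveFeature(const std::vector<std::string> &Features,
                              const std::string &RegName) {
  // Owning copy: remains valid for the whole loop below.
  const std::string SearchFeature = makeSearchFeature(RegName);
  // By contrast, a non-owning view bound to the same temporary, e.g.
  //   llvm::StringRef Bad = "+reserve-" + RegName;  // hypothetical variant
  // would dangle once that statement ends.
  for (const std::string &Feature : Features)
    if (Feature == SearchFeature)
      return true;
  return false;
}
```

So whether the concern applies depends on the declared type of the variable, which is worth double-checking in the patch itself.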


================
Comment at: llvm/test/CodeGen/ARM/reg-alloc-wout-fixed-regs.ll:13
+; {
+;   unsigned int result = i + j + k + l +m + n + o + p;
+; }
----------------
same nit here


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68862/new/

https://reviews.llvm.org/D68862

From llvm-commits at lists.llvm.org Fri Oct 11 03:22:30 2019
From: llvm-commits at lists.llvm.org (George Rimar via Phabricator via llvm-commits)
Date: Fri, 11 Oct 2019 10:22:30 +0000 (UTC)
Subject: [PATCH] D68848: [llvm-objdump] Use a counter for llvm-objdump -h instead of the section index.
In-Reply-To:
References:
Message-ID:

grimar added a comment.

A few ideas/suggestions.


================
Comment at: llvm/tools/llvm-objdump/llvm-objdump.cpp:345

-static bool shouldKeep(object::SectionRef S) {
+struct FilterResult {
+  bool Keep;
----------------
It's common to wrap such things (helper types) into anonymous namespaces.
I think you can also add a helper function too:

```
namespace {
struct FilterResult {
  bool Keep;
  bool IncrementIndex;
};

FilterResult checkSectionFilter(object::SectionRef S) {
  ...
}
};
```


================
Comment at: llvm/tools/llvm-objdump/llvm-objdump.cpp:347
+  bool Keep;
+  bool IncrementIndex;
+};
----------------
I'd comment these fields. It wasn't obvious to me what they are used for.


================
Comment at: llvm/tools/llvm-objdump/llvm-objdump.cpp:352
   if (FilterSections.empty())
-    return true;
+    return {/*Keep=*/true, /*Increment=*/true};

----------------
I think we usually use the full variable name, i.e. Increment -> IncrementIndex


================
Comment at: llvm/tools/llvm-objdump/llvm-objdump.cpp:374
+  // zero (after the unsigned wrap).
+  if (Idx != nullptr)
+    *Idx = UINT64_MAX;
----------------
nit: `if (Idx)` would be shorter.

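
One possible reading of the `FilterResult` fields discussed above, as a rough standalone sketch rather than the code under review. The field meanings in the comments are assumptions, `Section` is a hypothetical stand-in for `object::SectionRef`, and `numberSections` is a made-up driver. It only illustrates the idea that the displayed index is a counter that starts at `UINT64_MAX` so the first increment wraps it to zero.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

namespace {

// Hypothetical stand-in for object::SectionRef.
struct Section {
  std::string Name;
  bool Hidden; // assumed: a section the tool never lists at all
};

struct FilterResult {
  bool Keep;           // assumed meaning: print this section
  bool IncrementIndex; // assumed meaning: section counts toward the index
};

FilterResult checkSectionFilter(const Section &S,
                                const std::vector<std::string> &FilterSections) {
  if (S.Hidden)
    return {/*Keep=*/false, /*IncrementIndex=*/false};
  if (FilterSections.empty())
    return {/*Keep=*/true, /*IncrementIndex=*/true};
  for (const std::string &Name : FilterSections)
    if (Name == S.Name)
      return {/*Keep=*/true, /*IncrementIndex=*/true};
  // Not selected by the filter, but still counted so indices stay stable.
  return {/*Keep=*/false, /*IncrementIndex=*/true};
}

// Returns an (index, name) pair for every kept section.
std::vector<std::pair<uint64_t, std::string>>
numberSections(const std::vector<Section> &Sections,
               const std::vector<std::string> &FilterSections) {
  std::vector<std::pair<uint64_t, std::string>> Out;
  uint64_t Idx = UINT64_MAX; // the first increment wraps this to zero
  for (const Section &S : Sections) {
    FilterResult Result = checkSectionFilter(S, FilterSections);
    if (Result.IncrementIndex)
      ++Idx;
    if (Result.Keep)
      Out.emplace_back(Idx, S.Name);
  }
  return Out;
}

} // namespace
```

In this sketch the index and the section travel together in the returned pair, so callers do not need a separate output parameter.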


================
Comment at: llvm/tools/llvm-objdump/llvm-objdump.cpp:378
+      [Idx](object::SectionRef S) {
+        auto Result = checkSectionFilter(S);
+        if (Idx != nullptr && Result.IncrementIndex)
----------------
Please avoid using auto for return types that are not obvious.


================
Comment at: llvm/tools/llvm-objdump/llvm-objdump.cpp:1699
+  uint64_t Idx;
+  for (const SectionRef &Section : ToolSectionFilter(*Obj, &Idx)) {
     StringRef Name = unwrapOrError(Section.getName(), Obj->getFileName());
----------------
Looking at this, should `ToolSectionFilter` just return `[(SectionRef&)Ref, (uint64_t )Index]` struct/pair instead?
It seems that could make the whole logic simpler.


================
Comment at: llvm/tools/llvm-objdump/llvm-objdump.h:86
+// filtered (e.g. symbol tables).
+SectionFilter ToolSectionFilter(llvm::object::ObjectFile const &O,
+                                uint64_t *Idx = nullptr);
----------------
I think this needs a full comment (it does not give any information about what this helper actually does atm):
I.e. it would be nice to see something like:

```
// Function is used to ...
// Idx is an optional output parameter that ...
```


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68848/new/

https://reviews.llvm.org/D68848

From llvm-commits at lists.llvm.org Fri Oct 11 03:22:31 2019
From: llvm-commits at lists.llvm.org (Martin Storsjö via Phabricator via llvm-commits)
Date: Fri, 11 Oct 2019 10:22:31 +0000 (UTC)
Subject: [PATCH] D67347: [Windows] Use information from the PE32 exceptions directory to construct unwind plans
In-Reply-To:
References:
Message-ID:

mstorsjo added a comment.

In D67347#1705611, @labath wrote:

> In D67347#1705563, @mstorsjo wrote:
>
> > Quick question here; will unwinding using DWARF debug info still work like before after this, for binaries that don't use SEH for exception unwinding?
>
>
> Do you mean debug_frame or eh_frame? debug_frame should be completely unaffected by this.
the interaction between eh_frame and SEH is more tricky, but I don't know if that's ever used/emitted on windows (since presumably the system libraries don't know how to read it)... I meant debug_frame. Ok, good if that's unaffected. In MinGW setups, you can have a number of different combinations of both unwind and debug info. The debug info normally is DWARF (i.e. debug_frame), but it can also (in pure clang/lld based environments, not with GCC/binutils) optionally use codeview/PDB. For unwind info, on x64, SEH is normally used, but it can also optionally use SjLj or DWARF (i.e. eh_frame). For i686 (and armv7), DWARF is the default. And for the cases where it does use SEH for C++ exception unwinding, it doesn't do it exactly like MSVC does, but it uses a special gcc personality function which unwinds one step at a time with RtlVirtualUnwind. So it still does use the system unwinder facility, but slightly differently. In D67347#1705573 , @aleksandr.urakov wrote: > In D67347#1705563 , @mstorsjo wrote: > > > Quick question here; will unwinding using DWARF debug info still work like before after this, for binaries that don't use SEH for exception unwinding? > > > Do I understand the question correctly: you mean x64 Windows binaries with an additional DWARF unwind info inside, right? In that case the info from PE32+ directory should be used, it has a higher priority than the debug info. I think it should have a higher priority because it is used by the system during an unwind and is always presented in x64 binaries, so we have stronger guarantees with it. No, I meant DWARF debug info. > In all other cases, when the info in PE32+ directory is not presented, all things should work as usual (then, the DWARF info will be used). Ok, that's good. If I build an environment for x86_64 that uses DWARF for exception handling, ExceptionTableRVA/ExceptionTableSize are zero in the data directory, so I would presume this would be skipped then, and use the DWARF info instead. 
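
A minimal sketch of the selection order discussed in this thread, under stated assumptions. This is not lldb's implementation: `DataDirectory`, `UnwindSource` and `chooseUnwindSource` are hypothetical names. The only behaviour taken from the discussion is that the PE exception directory is preferred when it is present, and that a zero RVA/size means it is skipped in favour of DWARF info.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a PE data directory entry (RVA plus size).
struct DataDirectory {
  uint32_t RVA;
  uint32_t Size;
};

enum class UnwindSource { PEExceptionDirectory, Dwarf, None };

// Prefer the PE exception directory when the image has one; otherwise
// fall back to DWARF info if it is available.
inline UnwindSource chooseUnwindSource(const DataDirectory &ExceptionDir,
                                       bool HasDwarf) {
  if (ExceptionDir.RVA != 0 && ExceptionDir.Size != 0)
    return UnwindSource::PEExceptionDirectory;
  return HasDwarf ? UnwindSource::Dwarf : UnwindSource::None;
}
```

For an x86_64 MinGW binary built with DWARF exception handling, where both directory fields are zero, this sketch picks `UnwindSource::Dwarf`, which matches the expectation stated above.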


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D67347/new/

https://reviews.llvm.org/D67347

From llvm-commits at lists.llvm.org Fri Oct 11 03:31:34 2019
From: llvm-commits at lists.llvm.org (Thomas Preud'homme via Phabricator via llvm-commits)
Date: Fri, 11 Oct 2019 10:31:34 +0000 (UTC)
Subject: [PATCH] D68863: [LNT] Python 3 support: don't assume order of cmake args
Message-ID:

thopre created this revision.
thopre added reviewers: cmatthews, hubert.reinterpretcast, kristof.beyls.
thopre added a parent revision: D68829: [LNT] Python 3 support: Parse HTML as text.

The runtest/test_suite-cache.shtest test assumes the order of the cmake
arguments, but that order changes between Python 2 and Python 3. Since
there is no good reason to have them sorted, this commit adapts the
testcase to accept any order by testing each argument in a separate
FileCheck invocation.


https://reviews.llvm.org/D68863

Files:
  tests/runtest/test_suite-cache.shtest


Index: tests/runtest/test_suite-cache.shtest
===================================================================
--- tests/runtest/test_suite-cache.shtest
+++ tests/runtest/test_suite-cache.shtest
@@ -14,8 +14,16 @@
 # RUN: --cmake-define FOO=BAR \
 # RUN: -D BAR=BAZ \
 # RUN: &> %t.cmake-cache.log
-# RUN: FileCheck --check-prefix CHECK-CACHE < %t.cmake-cache.log %s
-# CHECK-CACHE: Execute: {{.*}}cmake -DCMAKE_CXX_COMPILER:FILEPATH={{.*}}/FakeCompilers/clang++-r154331 -DCMAKE_C_COMPILER:FILEPATH={{.*}}FakeCompilers/clang-r154331 -C {{.*}}/Release.cmake {{.*}}-DFOO=BAR{{.*}}-DBAR=BAZ
+# RUN: FileCheck --check-prefix CHECK-CACHE1 < %t.cmake-cache.log %s
+# CHECK-CACHE1: Execute: {{.*}}cmake {{(.+ )?}}-DCMAKE_CXX_COMPILER:FILEPATH={{.*}}/FakeCompilers/clang++-r154331
+# RUN: FileCheck --check-prefix CHECK-CACHE2 < %t.cmake-cache.log %s
+# CHECK-CACHE2: Execute: {{.*}}cmake {{(.+ )?}}-DCMAKE_C_COMPILER:FILEPATH={{.*}}FakeCompilers/clang-r154331
+# RUN: FileCheck --check-prefix CHECK-CACHE3 < %t.cmake-cache.log %s
+# CHECK-CACHE3: Execute: {{.*}}cmake {{(.+ )?}}-C {{.*}}/Release.cmake
+# RUN: FileCheck --check-prefix CHECK-CACHE4 < %t.cmake-cache.log %s
+# CHECK-CACHE4: Execute: {{.*}}cmake {{(.+ )?}}-DFOO=BAR
+# RUN: FileCheck --check-prefix CHECK-CACHE5 < %t.cmake-cache.log %s
+# CHECK-CACHE5: Execute: {{.*}}cmake {{(.+ )?}}-DBAR=BAZ
 # RUN: rm -rf %t.SANDBOX
 
 # Check a run of test-suite using a invalid cmake cache
@@ -29,5 +37,5 @@
 # RUN: --use-lit %S/Inputs/test-suite-cmake/fake-lit \
 # RUN: --cmake-cache Debug \
 # RUN: &> %t.cmake-cache2.err || true
-# RUN: FileCheck --check-prefix CHECK-CACHE2 < %t.cmake-cache2.err %s
-# CHECK-CACHE2: Could not find CMake cache file
+# RUN: FileCheck --check-prefix CHECK-CACHE6 < %t.cmake-cache2.err %s
+# CHECK-CACHE6: Could not find CMake cache file


-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68863.224556.patch
Type: text/x-patch
Size: 1853 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Fri Oct 11 03:31:35 2019
From: llvm-commits at lists.llvm.org (Andrea Di Biagio via Phabricator via llvm-commits)
Date: Fri, 11 Oct 2019 10:31:35 +0000 (UTC)
Subject: [PATCH] D67950: [TableGen] Fix a bug that MCSchedClassDesc is interfered between different SchedModel
In-Reply-To:
References:
Message-ID: <0376ab757d917bfec34b58bf74976932@localhost.localdomain>

andreadb added a comment.

In D67950#1705613, @andreadb wrote:

> I am not convinced that this patch is correct.
>  Isn’t the problem that your model was wrongly marked as complete?


Never mind. If the issue were related to the model being marked as complete, then the tablegen backend would have complained earlier on when building the scheduling classes.

Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67950/new/ https://reviews.llvm.org/D67950 From llvm-commits at lists.llvm.org Fri Oct 11 03:52:32 2019 From: llvm-commits at lists.llvm.org (George Rimar via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 10:52:32 +0000 (UTC) Subject: [PATCH] D68730: [llvm-objdump] Adjust spacing and field width for --section-headers In-Reply-To: References: Message-ID: <213f0835442f233fd385fed26d46d46b@localhost.localdomain> grimar added inline comments. ================ Comment at: llvm/test/tools/llvm-objdump/section-headers-spacing.test:1 +## Check leading and trailing whitespace for full lines. +# RUN: yaml2obj %s -o %t-whitespace.o ---------------- rupprecht wrote: > grimar wrote: > > What do you think about combining these tests you have here into one that > > could use `yaml2obj --docnum=X` and check spacing, formatting etc in one place? > > (I am not sure it if it is usefull to have 3 different test files?) > I started out with one test file, but found it to be a collection of somewhat unrelated things -- e.g. name column width and 32 vs 64 bit column widths are different features. So I think it's better to have more focused test files. It's a slightly personal preference though. We often combine tests by a feature. I.e. test that checks the "-h" output might contain everything related to "-h" at once. Sometimes we do a split to make a test that contain only a error/warnings checks (if there are too many of them) or a particular set of tests that are very different. It is not a huge problem, but might be interesting what others think about this too though. (to summarize my position: I'd prefer to combine them to reduce the number of tests, but it is not critical and is OK as is probably). 
================ Comment at: llvm/tools/llvm-objdump/llvm-objdump.cpp:1694 + SectionTypes.push_back("BSS"); + std::string Type = llvm::join(SectionTypes, " "); ---------------- rupprecht wrote: > grimar wrote: > > Maybe I'd try to avoid using an additional vector and algorithm here and just: > > > > ``` > > std::string Type = Section.isText() ? "TEXT" : ""; > > if (Section.isData()) > > Type += Type.empty() ? "DATA" : " DATA"; > > if (Section.isBSS()) > > Type += Type.empty() ? "BSS" : " BSS"; > > ``` > I think the lack of using an algorithm is why there was odd trailing whitespace before, although I agree it would be great if this could be more succinct... But does my version have a trailing whitespace issue? I don't think so. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68730/new/ https://reviews.llvm.org/D68730 From llvm-commits at lists.llvm.org Fri Oct 11 03:52:35 2019 From: llvm-commits at lists.llvm.org (Kerry McLaughlin via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 10:52:35 +0000 (UTC) Subject: [PATCH] D67550: [AArch64][SVE] Implement unpack intrinsics In-Reply-To: References: Message-ID: <1b7a1ec021b8f588fae5f9176fb2731a@localhost.localdomain> kmclaughlin updated this revision to Diff 224558. kmclaughlin added a comment. Removed unused //SDPatternOperator op// from sve_int_perm_unpk class CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67550/new/ https://reviews.llvm.org/D67550 Files: include/llvm/IR/IntrinsicsAArch64.td lib/Target/AArch64/AArch64ISelLowering.cpp lib/Target/AArch64/AArch64ISelLowering.h lib/Target/AArch64/AArch64InstrInfo.td lib/Target/AArch64/AArch64SVEInstrInfo.td lib/Target/AArch64/SVEInstrFormats.td test/CodeGen/AArch64/sve-intrinsics-perm-select.ll -------------- next part -------------- A non-text attachment was scrubbed...
Name: D67550.224558.patch Type: text/x-patch Size: 10508 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Fri Oct 11 03:55:02 2019 From: llvm-commits at lists.llvm.org (Florian Hahn via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 10:55:02 +0000 (UTC) Subject: [PATCH] D66924: [NewGVN] Add phi-of-ops instr as user of FoundVal. In-Reply-To: References: Message-ID: <08734019ef58e67aead7cc9962edaf2f@localhost.localdomain> fhahn added a comment. ping Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D66924/new/ https://reviews.llvm.org/D66924 From llvm-commits at lists.llvm.org Fri Oct 11 04:09:10 2019 From: llvm-commits at lists.llvm.org (Chris Ye via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 11:09:10 +0000 (UTC) Subject: [PATCH] D68633: fix debug info affects output when opt inline In-Reply-To: References: Message-ID: yechunliang marked an inline comment as done. yechunliang added a comment. > So the root cause is rather that we treat an alloca being immediately preceeded by another alloca differrently from the case when it is preceeded by another kind of instruction. This happens also when having other instructions in between, and is not specific to dbg intrinsics (could be interesting to add a test case where you replace the dbg intrinsics by something else). Yes I think so, if the other instruction is not dbg instr which exist between two allocas, the InlineFunction with and without "-strip-debug" will make the same behavior, that should both erase second use_empty alloca. This patch is to fix the issue that debug instr impact InlineFunction generate different output. > So I think that the solution might be based on one of these ideas: > > 1. Remove the check for use_empty in the outer loop. > 2. Add a check for !use_empty in the inner loop. > 3. Remove the inner loop (i.e only splice one alloca at a time). 
These good ideas amount to a design change in how allocas are inlined, or an improvement to the splicing. Reading the code, I understand the alloca inlining behavior like this: first, find one !use_empty alloca; if the immediately following instructions are allocas, then even if they are use_empty they are all added one after another and moved to the caller together with the first alloca. If any other instruction (dbg or otherwise) sits between allocas, the next alloca is checked for use_empty, and erased if it is. Is this behavior correct, or could it be improved? I don't know much about alloca inlining, but it seems the code has run for many years with this design. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68633/new/ https://reviews.llvm.org/D68633 From llvm-commits at lists.llvm.org Fri Oct 11 04:09:11 2019 From: llvm-commits at lists.llvm.org (Kerry McLaughlin via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 11:09:11 +0000 (UTC) Subject: [PATCH] D67550: [AArch64][SVE] Implement unpack intrinsics In-Reply-To: References: Message-ID: <2df109201d2a481bc9d1ab0ef06c01af@localhost.localdomain> kmclaughlin marked 2 inline comments as done. kmclaughlin added inline comments. ================ Comment at: lib/Target/AArch64/SVEInstrFormats.td:836 class sve_int_perm_unpk sz16_64, bits<2> opc, string asm, - ZPRRegOp zprty1, ZPRRegOp zprty2> + ZPRRegOp zprty1, ZPRRegOp zprty2, SDPatternOperator op> : I<(outs zprty1:$Zd), (ins zprty2:$Zn), ---------------- greened wrote: > Where is `op` used? I assume that comes later but it would help to understand where this is going. Thanks for pointing this out, op isn't actually used here!
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67550/new/ https://reviews.llvm.org/D67550 From llvm-commits at lists.llvm.org Fri Oct 11 04:09:12 2019 From: llvm-commits at lists.llvm.org (Florian Hahn via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 11:09:12 +0000 (UTC) Subject: [PATCH] D66604: [GVN] AnalyzeLoadAvailability: Replace a load after lifetime.end with undef (PR20811) In-Reply-To: References: Message-ID: <3be641f8dd9156cd942a2a55b930e18d@localhost.localdomain> fhahn requested changes to this revision. fhahn added a comment. This revision now requires changes to proceed. This seems to cause MultiSource/Benchmarks/DOE-ProxyApps-C++/HPCCG/HPCCG and External/SPEC/CINT2000/176.gcc/176.gcc from https://github.com/llvm/test-suite to fail. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D66604/new/ https://reviews.llvm.org/D66604 From llvm-commits at lists.llvm.org Fri Oct 11 04:09:22 2019 From: llvm-commits at lists.llvm.org (Florian Hahn via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 11:09:22 +0000 (UTC) Subject: [PATCH] D66604: [GVN] AnalyzeLoadAvailability: Replace a load after lifetime.end with undef (PR20811) In-Reply-To: References: Message-ID: <7762a5c8bef5c2cd96a984ee377f9f85@localhost.localdomain> fhahn added a comment. In D66604#1705703 , @fhahn wrote: > This seems to cause MultiSource/Benchmarks/DOE-ProxyApps-C++/HPCCG/HPCCG and External/SPEC/CINT2000/176.gcc/176.gcc from https://github.com/llvm/test-suite to fail. It's not an actual runtime failure, both won't terminate. 
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D66604/new/ https://reviews.llvm.org/D66604 From llvm-commits at lists.llvm.org Fri Oct 11 04:11:21 2019 From: llvm-commits at lists.llvm.org (Thomas Preud'homme via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 11:11:21 +0000 (UTC) Subject: [PATCH] D68796: [LNT] Python 3 support: fix storage of json data as BLOB In-Reply-To: References: Message-ID: <96fa46c36720966338795573b0978c0f@localhost.localdomain> thopre updated this revision to Diff 224561. thopre added a comment. Add more cases of missing CAST to BLOB CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68796/new/ https://reviews.llvm.org/D68796 Files: lnt/server/db/migrations/upgrade_11_to_12.py lnt/server/db/migrations/upgrade_1_to_2.py lnt/server/db/testsuite.py lnt/server/db/testsuitedb.py tests/SharedInputs/SmallInstance/data/lnt_db_create.sql tests/server/db/Inputs/V4Pages_extra_records.sql tests/server/ui/Inputs/V4Pages_extra_records.sql -------------- next part -------------- A non-text attachment was scrubbed... Name: D68796.224561.patch Type: text/x-patch Size: 12336 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Fri Oct 11 04:13:49 2019 From: llvm-commits at lists.llvm.org (Qing Shan Zhang via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 11:13:49 +0000 (UTC) Subject: [PATCH] D67950: [TableGen] Fix a bug that MCSchedClassDesc is interfered between different SchedModel In-Reply-To: References: Message-ID: steven.zhang added a comment. In D67950#1705613, @andreadb wrote: > I am not convinced that this patch is correct. Isn’t the problem that your model was wrongly marked as complete? Even if we wrongly marked the model as complete, TableGen should still generate the data, or diagnose the problem, correctly. And yes, TableGen does emit diagnostic messages, but it still emits the SCDesc as valid, which messes up the model.
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67950/new/ https://reviews.llvm.org/D67950 From llvm-commits at lists.llvm.org Fri Oct 11 04:33:18 2019 From: llvm-commits at lists.llvm.org (Clement Courbet via llvm-commits) Date: Fri, 11 Oct 2019 11:33:18 -0000 Subject: [llvm] r374533 - [llvm-exegesis] Show noise cluster in analysis output. Message-ID: <20191011113318.83C6385FA8@lists.llvm.org> Author: courbet Date: Fri Oct 11 04:33:18 2019 New Revision: 374533 URL: http://llvm.org/viewvc/llvm-project?rev=374533&view=rev Log: [llvm-exegesis] Show noise cluster in analysis output. Reviewers: gchatelet Subscribers: tschuett, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68780 Added: llvm/trunk/test/tools/llvm-exegesis/X86/analysis-noise.test Modified: llvm/trunk/tools/llvm-exegesis/lib/Analysis.cpp llvm/trunk/tools/llvm-exegesis/lib/Analysis.h Added: llvm/trunk/test/tools/llvm-exegesis/X86/analysis-noise.test URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/llvm-exegesis/X86/analysis-noise.test?rev=374533&view=auto ============================================================================== --- llvm/trunk/test/tools/llvm-exegesis/X86/analysis-noise.test (added) +++ llvm/trunk/test/tools/llvm-exegesis/X86/analysis-noise.test Fri Oct 11 04:33:18 2019 @@ -0,0 +1,23 @@ +# RUN: llvm-exegesis -mode=analysis -benchmarks-file=%s -analysis-inconsistencies-output-file=- -analysis-clusters-output-file="" -analysis-numpoints=3 | FileCheck %s + +# CHECK: DOCTYPE +# CHECK: [noise] Cluster (1 points) + +--- +mode: latency +key: + instructions: + - 'ADD64rr RAX RAX RDI' + config: '' + register_initial_values: + - 'RAX=0x0' + - 'RDI=0x0' +cpu_name: haswell +llvm_triple: x86_64-unknown-linux-gnu +num_repetitions: 10000 +measurements: + - { key: latency, value: 1.0049, per_snippet_value: 1.0049 } +error: '' +info: Repeating a single implicitly serial instruction +assembled_snippet: 
48B8000000000000000048BF00000000000000004801F84801F84801F84801F84801F84801F84801F84801F84801F84801F84801F84801F84801F84801F84801F84801F8C3 +... Modified: llvm/trunk/tools/llvm-exegesis/lib/Analysis.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-exegesis/lib/Analysis.cpp?rev=374533&r1=374532&r2=374533&view=diff ============================================================================== --- llvm/trunk/tools/llvm-exegesis/lib/Analysis.cpp (original) +++ llvm/trunk/tools/llvm-exegesis/lib/Analysis.cpp Fri Oct 11 04:33:18 2019 @@ -268,6 +268,27 @@ static void writeLatencySnippetHtml(raw_ } } +void Analysis::printPointHtml(const InstructionBenchmark &Point, + llvm::raw_ostream &OS) const { + OS << "
  • (OS, Point.AssembledSnippet, "\n"); + OS << "\">"; + switch (Point.Mode) { + case InstructionBenchmark::Latency: + writeLatencySnippetHtml(OS, Point.Key.Instructions, *InstrInfo_); + break; + case InstructionBenchmark::Uops: + case InstructionBenchmark::InverseThroughput: + writeUopsSnippetHtml(OS, Point.Key.Instructions, *InstrInfo_); + break; + default: + llvm_unreachable("invalid mode"); + } + OS << " "; + writeEscaped(OS, Point.Key.Config); + OS << "
  • "; +} + void Analysis::printSchedClassClustersHtml( const std::vector &Clusters, const ResolvedSchedClass &RSC, raw_ostream &OS) const { @@ -292,25 +313,7 @@ void Analysis::printSchedClassClustersHt writeClusterId(OS, Cluster.id()); OS << "
      "; for (const size_t PointId : Cluster.getPointIds()) { - const auto &Point = Points[PointId]; - OS << "
    • (OS, Point.AssembledSnippet, - "\n"); - OS << "\">"; - switch (Point.Mode) { - case InstructionBenchmark::Latency: - writeLatencySnippetHtml(OS, Point.Key.Instructions, *InstrInfo_); - break; - case InstructionBenchmark::Uops: - case InstructionBenchmark::InverseThroughput: - writeUopsSnippetHtml(OS, Point.Key.Instructions, *InstrInfo_); - break; - default: - llvm_unreachable("invalid mode"); - } - OS << " "; - writeEscaped(OS, Point.Key.Config); - OS << "
    • "; + printPointHtml(Points[PointId], OS); } OS << "
    "; } +void Analysis::printClusterRawHtml( + const InstructionBenchmarkClustering::ClusterId &Id, StringRef display_name, + llvm::raw_ostream &OS) const { + const auto &Points = Clustering_.getPoints(); + const auto &Cluster = Clustering_.getCluster(Id); + if (Cluster.PointIndices.empty()) + return; + + OS << "

    " << display_name << " Cluster (" + << Cluster.PointIndices.size() << " points)

    "; + OS << ""; + // Table Header. + OS << ""; + for (const auto &Measurement : Points[Cluster.PointIndices[0]].Measurements) { + OS << ""; + } + OS << ""; + + // Point data. + for (const auto &PointId : Cluster.PointIndices) { + OS << ""; + for (const auto &Measurement : Points[PointId].Measurements) { + OS << ""; + } + OS << "
    ClusterIdOpcode/Config"; + writeEscaped(OS, Measurement.Key); + OS << "
    " << display_name << "
      "; + printPointHtml(Points[PointId], OS); + OS << "
    "; + writeMeasurementValue(OS, Measurement.PerInstructionValue); + } + OS << "
    "; + + OS << "
    "; + +} // namespace exegesis + static constexpr const char kHtmlHead[] = R"( llvm-exegesis Analysis Results @@ -549,6 +589,9 @@ Error Analysis::run"; } + printClusterRawHtml(InstructionBenchmarkClustering::ClusterId::noise(), + "[noise]", OS); + OS << ""; return Error::success(); } Modified: llvm/trunk/tools/llvm-exegesis/lib/Analysis.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-exegesis/lib/Analysis.h?rev=374533&r1=374532&r2=374533&view=diff ============================================================================== --- llvm/trunk/tools/llvm-exegesis/lib/Analysis.h (original) +++ llvm/trunk/tools/llvm-exegesis/lib/Analysis.h Fri Oct 11 04:33:18 2019 @@ -81,6 +81,12 @@ private: void printInstructionRowCsv(size_t PointId, raw_ostream &OS) const; + void printClusterRawHtml(const InstructionBenchmarkClustering::ClusterId &Id, + StringRef display_name, llvm::raw_ostream &OS) const; + + void printPointHtml(const InstructionBenchmark &Point, + llvm::raw_ostream &OS) const; + void printSchedClassClustersHtml(const std::vector &Clusters, const ResolvedSchedClass &SC, From llvm-commits at lists.llvm.org Fri Oct 11 04:34:18 2019 From: llvm-commits at lists.llvm.org (Simon Pilgrim via llvm-commits) Date: Fri, 11 Oct 2019 11:34:18 -0000 Subject: [llvm] r374534 - [X86] isFNEG - add recursion depth limit Message-ID: <20191011113418.DDC1991F23@lists.llvm.org> Author: rksimon Date: Fri Oct 11 04:34:18 2019 New Revision: 374534 URL: http://llvm.org/viewvc/llvm-project?rev=374534&view=rev Log: [X86] isFNEG - add recursion depth limit Now that its used by isNegatibleForFree we should try to avoid costly deep recursion Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=374534&r1=374533&r2=374534&view=diff ============================================================================== --- 
llvm/trunk/lib/Target/X86/X86ISelLowering.cpp (original) +++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Fri Oct 11 04:34:18 2019 @@ -41254,10 +41254,14 @@ static SDValue combineVTRUNC(SDNode *N, /// In this case we go though all bitcasts. /// This also recognizes splat of a negated value and returns the splat of that /// value. -static SDValue isFNEG(SelectionDAG &DAG, SDNode *N) { +static SDValue isFNEG(SelectionDAG &DAG, SDNode *N, unsigned Depth = 0) { if (N->getOpcode() == ISD::FNEG) return N->getOperand(0); + // Don't recurse exponentially. + if (Depth > SelectionDAG::MaxRecursionDepth) + return SDValue(); + unsigned ScalarSize = N->getValueType(0).getScalarSizeInBits(); SDValue Op = peekThroughBitcasts(SDValue(N, 0)); @@ -41271,7 +41275,7 @@ static SDValue isFNEG(SelectionDAG &DAG, // of this is VECTOR_SHUFFLE(-VEC1, UNDEF). The mask can be anything here. if (!SVOp->getOperand(1).isUndef()) return SDValue(); - if (SDValue NegOp0 = isFNEG(DAG, SVOp->getOperand(0).getNode())) + if (SDValue NegOp0 = isFNEG(DAG, SVOp->getOperand(0).getNode(), Depth + 1)) if (NegOp0.getValueType() == VT) // FIXME: Can we do better? return DAG.getVectorShuffle(VT, SDLoc(SVOp), NegOp0, DAG.getUNDEF(VT), SVOp->getMask()); @@ -41285,7 +41289,7 @@ static SDValue isFNEG(SelectionDAG &DAG, SDValue InsVal = Op.getOperand(1); if (!InsVector.isUndef()) return SDValue(); - if (SDValue NegInsVal = isFNEG(DAG, InsVal.getNode())) + if (SDValue NegInsVal = isFNEG(DAG, InsVal.getNode(), Depth + 1)) if (NegInsVal.getValueType() == VT.getVectorElementType()) // FIXME return DAG.getNode(ISD::INSERT_VECTOR_ELT, SDLoc(Op), VT, InsVector, NegInsVal, Op.getOperand(2)); @@ -41429,7 +41433,7 @@ char X86TargetLowering::isNegatibleForFr bool ForCodeSize, unsigned Depth) const { // fneg patterns are removable even if they have multiple uses. - if (isFNEG(DAG, Op.getNode())) + if (isFNEG(DAG, Op.getNode(), Depth)) return 2; // Don't recurse exponentially. 
@@ -41472,7 +41476,7 @@ SDValue X86TargetLowering::getNegatedExp bool ForCodeSize, unsigned Depth) const { // fneg patterns are removable even if they have multiple uses. - if (SDValue Arg = isFNEG(DAG, Op.getNode())) + if (SDValue Arg = isFNEG(DAG, Op.getNode(), Depth)) return DAG.getBitcast(Op.getValueType(), Arg); EVT VT = Op.getValueType(); From llvm-commits at lists.llvm.org Fri Oct 11 04:37:51 2019 From: llvm-commits at lists.llvm.org (Pavel Labath via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 11:37:51 +0000 (UTC) Subject: [PATCH] D68289: [lldb-server/android] Show more processes by relaxing some checks In-Reply-To: References: Message-ID: <99025ea1126ad752283ba5852dd09f8a@localhost.localdomain> labath added a comment. I've committed my ProcessInstanceInfoMatch fix now. Feel free to recommit when you're able to check the bots for failure. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68289/new/ https://reviews.llvm.org/D68289 From llvm-commits at lists.llvm.org Fri Oct 11 04:39:20 2019 From: llvm-commits at lists.llvm.org (Clement Courbet via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 11:39:20 +0000 (UTC) Subject: [PATCH] D68780: [llvm-exegesis] Show noise cluster in analysis output. In-Reply-To: References: Message-ID: This revision was automatically updated to reflect the committed changes. Closed by commit rGc8eb0547efc5: [llvm-exegesis] Show noise cluster in analysis output. (authored by courbet). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68780/new/ https://reviews.llvm.org/D68780 Files: llvm/test/tools/llvm-exegesis/X86/analysis-noise.test llvm/tools/llvm-exegesis/lib/Analysis.cpp llvm/tools/llvm-exegesis/lib/Analysis.h -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68780.224562.patch Type: text/x-patch Size: 5667 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Fri Oct 11 04:42:26 2019 From: llvm-commits at lists.llvm.org (Joerg Sonnenberger via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 11:42:26 +0000 (UTC) Subject: [PATCH] D65280: Add a pass to lower is.constant and objectsize intrinsics In-Reply-To: References: Message-ID: <65fbe86814131f9d12fc8f933de222e8@localhost.localdomain> joerg updated this revision to Diff 224563. joerg marked 6 inline comments as done. joerg added a comment. Adjust based on comments. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D65280/new/ https://reviews.llvm.org/D65280 Files: bindings/ocaml/transforms/scalar_opts/llvm_scalar_opts.mli bindings/ocaml/transforms/scalar_opts/scalar_opts_ocaml.c include/llvm-c/Transforms/Scalar.h include/llvm/InitializePasses.h include/llvm/LinkAllPasses.h include/llvm/Transforms/Scalar.h include/llvm/Transforms/Scalar/LowerConstantIntrinsics.h lib/CodeGen/CodeGenPrepare.cpp lib/CodeGen/GlobalISel/IRTranslator.cpp lib/CodeGen/SelectionDAG/FastISel.cpp lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp lib/CodeGen/TargetPassConfig.cpp lib/Passes/PassBuilder.cpp lib/Passes/PassRegistry.def lib/Transforms/IPO/PassManagerBuilder.cpp lib/Transforms/Scalar/CMakeLists.txt lib/Transforms/Scalar/LowerConstantIntrinsics.cpp lib/Transforms/Scalar/Scalar.cpp test/CodeGen/AArch64/GlobalISel/arm64-irtranslator.ll test/CodeGen/AArch64/O0-pipeline.ll test/CodeGen/AArch64/O3-pipeline.ll test/CodeGen/ARM/O3-pipeline.ll test/CodeGen/Generic/is-constant.ll test/CodeGen/X86/O0-pipeline.ll test/CodeGen/X86/O3-pipeline.ll test/CodeGen/X86/is-constant.ll test/CodeGen/X86/object-size.ll test/Other/new-pm-defaults.ll test/Other/new-pm-thinlto-defaults.ll test/Other/opt-O2-pipeline.ll test/Other/opt-O3-pipeline.ll test/Other/opt-Os-pipeline.ll test/Transforms/CodeGenPrepare/X86/overflow-intrinsics.ll test/Transforms/CodeGenPrepare/basic.ll 
test/Transforms/CodeGenPrepare/builtin-condition.ll test/Transforms/CodeGenPrepare/crash-on-large-allocas.ll test/Transforms/LowerConstantIntrinsics/constant-intrinsics.ll test/Transforms/LowerConstantIntrinsics/crash-on-large-allocas.ll test/Transforms/LowerConstantIntrinsics/objectsize_basic.ll utils/gn/secondary/llvm/lib/Transforms/Scalar/BUILD.gn -------------- next part -------------- A non-text attachment was scrubbed... Name: D65280.224563.patch Type: text/x-patch Size: 53174 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Fri Oct 11 04:43:01 2019 From: llvm-commits at lists.llvm.org (Kai Nacke via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 11:43:01 +0000 (UTC) Subject: [PATCH] D68146: [FileCheck] Implement --ignore-case option. In-Reply-To: References: Message-ID: <5c05eaa21bbb83a1e3615ba991de4c8b@localhost.localdomain> Kai added a comment. In D68146#1704250 , @rupprecht wrote: > In D68146#1703693 , @thakis wrote: > > > The test fails on Linux: http://lab.llvm.org:8011/builders/clang-cmake-x86_64-sde-avx512-linux/builds/28537/steps/ninja%20check%201/logs/FAIL%3A%20LLVM%3A%3Acheck-ignore-case.txt If it takes a while to fix please revert while you investigate. > > > > Also, https://github.com/llvm/llvm-project/commit/dfd2b6f07fc40a190335f580d8a965bbebfe94df looks like you touched ~all lines in docs/CommandGuide/FileCheck.rst and llvm/include/llvm/Support/FileCheck.h Maybe you converted them to windows line endings? If so, please undo that. (Maybe revert and reland with fixed line endings so that the diff for the actual change is readable.) > > > As described in http://llvm.org/docs/GettingStarted.html#checkout-llvm-from-git, the right way to checkout the repository on windows is: > > % git clone --config core.autocrlf=false https://github.com/llvm/llvm-project.git > > > This is the second time I've reviewed a change like this without realizing, would appreciate tips if this could be more visible in Phab somehow... 
Sorry, I really screwed this up. Not only did I manage to insert an additional character in the regex after my last test, but the day before I had trouble with the repository (the COM1.o and COM2.o names on Windows) and cloned it again, obviously missing the CR/LF setting I previously had in place. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68146/new/ https://reviews.llvm.org/D68146 From llvm-commits at lists.llvm.org Fri Oct 11 04:46:40 2019 From: llvm-commits at lists.llvm.org (Florian Hahn via llvm-commits) Date: Fri, 11 Oct 2019 11:46:40 -0000 Subject: [llvm] r374535 - [SCEV] Add stricter verification option. Message-ID: <20191011114640.A64F39189C@lists.llvm.org> Author: fhahn Date: Fri Oct 11 04:46:40 2019 New Revision: 374535 URL: http://llvm.org/viewvc/llvm-project?rev=374535&view=rev Log: [SCEV] Add stricter verification option. Currently -verify-scev only fails if there is a constant difference between two BE counts. This misses a lot of cases. This patch adds a -verify-scev-strict option, which fails for any non-zero differences, if used together with -verify-scev. With the stricter checking, some unit tests fail because of mismatches, especially around IndVarSimplify. If there is no reason I am missing for just checking constant deltas, I am planning on looking into the various failures.
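For illustration only, the difference between the default and the strict check described in the log above can be sketched as a small standalone predicate. This is a hedged sketch under stated assumptions, not the code from the patch: `Delta`, `constantDelta`, `symbolicDelta` and `tripCountChanged` are hypothetical stand-ins (none of them are ScalarEvolution APIs), and a backedge-count difference is assumed to be either a known constant or an opaque symbolic expression.

```cpp
#include <optional>

// Hypothetical stand-in for the difference between two backedge-taken
// counts. Not a ScalarEvolution type: the value is either a known constant
// or an opaque symbolic expression (modelled as an empty optional).
struct Delta {
  std::optional<long> Constant;

  bool isConstant() const { return Constant.has_value(); }
  // Only a known constant zero counts as zero in this sketch.
  bool isZero() const { return Constant.has_value() && *Constant == 0; }
};

// Hypothetical helpers for building the two kinds of difference.
inline Delta constantDelta(long Value) { return Delta{Value}; }
inline Delta symbolicDelta() { return Delta{std::nullopt}; }

// Returns true when verification should report a changed trip count.
// Default mode: only non-zero constant differences are reported.
// Strict mode: any difference not known to be zero is reported.
inline bool tripCountChanged(const Delta &D, bool Strict) {
  return (Strict || D.isConstant()) && !D.isZero();
}
```

Under this model a symbolic (non-constant) difference is ignored by default and reported only in strict mode, while a constant zero difference is never reported in either mode.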
Reviewers: efriedma, sanjoy.google, reames, atrick Reviewed By: sanjoy.google Differential Revision: https://reviews.llvm.org/D68592 Modified: llvm/trunk/lib/Analysis/ScalarEvolution.cpp Modified: llvm/trunk/lib/Analysis/ScalarEvolution.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/ScalarEvolution.cpp?rev=374535&r1=374534&r2=374535&view=diff ============================================================================== --- llvm/trunk/lib/Analysis/ScalarEvolution.cpp (original) +++ llvm/trunk/lib/Analysis/ScalarEvolution.cpp Fri Oct 11 04:46:40 2019 @@ -158,6 +158,9 @@ MaxBruteForceIterations("scalar-evolutio static cl::opt VerifySCEV( "verify-scev", cl::Hidden, cl::desc("Verify ScalarEvolution's backedge taken counts (slow)")); +static cl::opt VerifySCEVStrict( + "verify-scev-strict", cl::Hidden, + cl::desc("Enable stricter verification with -verify-scev is passed")); static cl::opt VerifySCEVMap("verify-scev-maps", cl::Hidden, cl::desc("Verify no dangling value in ScalarEvolution's " @@ -11922,14 +11925,14 @@ void ScalarEvolution::verify() const { SE.getTypeSizeInBits(NewBECount->getType())) CurBECount = SE2.getZeroExtendExpr(CurBECount, NewBECount->getType()); - auto *ConstantDelta = - dyn_cast(SE2.getMinusSCEV(CurBECount, NewBECount)); + const SCEV *Delta = SE2.getMinusSCEV(CurBECount, NewBECount); - if (ConstantDelta && ConstantDelta->getAPInt() != 0) { - dbgs() << "Trip Count Changed!\n"; + // Unless VerifySCEVStrict is set, we only compare constant deltas. 
+ if ((VerifySCEVStrict || isa(Delta)) && !Delta->isZero()) { + dbgs() << "Trip Count for " << *L << " Changed!\n"; dbgs() << "Old: " << *CurBECount << "\n"; dbgs() << "New: " << *NewBECount << "\n"; - dbgs() << "Delta: " << *ConstantDelta << "\n"; + dbgs() << "Delta: " << *Delta << "\n"; std::abort(); } } From llvm-commits at lists.llvm.org Fri Oct 11 04:46:51 2019 From: llvm-commits at lists.llvm.org (Sven van Haastregt via llvm-commits) Date: Fri, 11 Oct 2019 11:46:51 -0000 Subject: [www] r374536 - Add Clang tutorial abstract Message-ID: <20191011114651.C2E3192FFD@lists.llvm.org> Author: svenvh Date: Fri Oct 11 04:46:51 2019 New Revision: 374536 URL: http://llvm.org/viewvc/llvm-project?rev=374536&view=rev Log: Add Clang tutorial abstract Modified: www/trunk/devmtg/2019-10/talk-abstracts.html Modified: www/trunk/devmtg/2019-10/talk-abstracts.html URL: http://llvm.org/viewvc/llvm-project/www/trunk/devmtg/2019-10/talk-abstracts.html?rev=374536&r1=374535&r2=374536&view=diff ============================================================================== --- www/trunk/devmtg/2019-10/talk-abstracts.html (original) +++ www/trunk/devmtg/2019-10/talk-abstracts.html Fri Oct 11 04:46:51 2019 @@ -456,7 +456,9 @@ A strong testing infrastructure is criti
    Sven van Haastregt, Anastasia Stulova

    - Details coming soon. + This tutorial will give an overview of Clang. We will cover the distinction between the Clang compiler driver and the Clang language frontend, with an emphasis on the latter. We will examine the different Clang components that a C program goes through when being compiled, i.e., lexing, parsing, semantic analysis, and LLVM IR generation. This includes some of the Clang Abstract Syntax Tree (AST), Type, and the Diagnostics infrastructure. We will conclude by explaining the various ways in which Clang is tested. +

    + The tutorial is aimed at newcomers who have a basic understanding of compiler concepts and wish to learn about the architecture of Clang or start contributing to Clang.

    From llvm-commits at lists.llvm.org Fri Oct 11 04:51:03 2019 From: llvm-commits at lists.llvm.org (Simon Atanasyan via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 11:51:03 +0000 (UTC) Subject: [PATCH] D68776: [mips] Fix loading "double" immediate into a GPR and FPR In-Reply-To: References: Message-ID: atanasyan marked 2 inline comments as done. atanasyan added inline comments. ================ Comment at: llvm/lib/Target/Mips/AsmParser/MipsAsmParser.cpp:3481 + + if (isABI_N32() || isABI_N64()) { + if (loadImmediate(ImmOp64, TmpReg, Mips::NoRegister, false, true, IDLoc, ---------------- mstojanovic wrote: > An alternative to this condition could be `isGP64bit()`. Do you think there's a case where this wouldn't work and what do you prefer? I'm going to fix that and other similar code by a separate patch. ================ Comment at: llvm/lib/Target/Mips/AsmParser/MipsAsmParser.cpp:3482 + if (isABI_N32() || isABI_N64()) { + if (loadImmediate(ImmOp64, TmpReg, Mips::NoRegister, false, true, IDLoc, + Out, STI)) ---------------- mstojanovic wrote: > Is there a reason why the `IsAddress` argument is set to `true` in the `loadImmediate()` call? Good point, thanks. I missed that and just use the same value for this argument as in the original code. I think in both `loadImmediate` calls `IsAddress` should be `false`. I will fix that before commit. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68776/new/ https://reviews.llvm.org/D68776 From llvm-commits at lists.llvm.org Fri Oct 11 04:51:30 2019 From: llvm-commits at lists.llvm.org (Florian Hahn via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 11:51:30 +0000 (UTC) Subject: [PATCH] D68592: [SCEV] Add stricter verification option. In-Reply-To: References: Message-ID: This revision was automatically updated to reflect the committed changes. Closed by commit rG77fbf069f6dd: [SCEV] Add stricter verification option. (authored by fhahn). 
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68592/new/ https://reviews.llvm.org/D68592 Files: llvm/lib/Analysis/ScalarEvolution.cpp Index: llvm/lib/Analysis/ScalarEvolution.cpp =================================================================== --- llvm/lib/Analysis/ScalarEvolution.cpp +++ llvm/lib/Analysis/ScalarEvolution.cpp @@ -158,6 +158,9 @@ static cl::opt VerifySCEV( "verify-scev", cl::Hidden, cl::desc("Verify ScalarEvolution's backedge taken counts (slow)")); +static cl::opt VerifySCEVStrict( + "verify-scev-strict", cl::Hidden, + cl::desc("Enable stricter verification with -verify-scev is passed")); static cl::opt VerifySCEVMap("verify-scev-maps", cl::Hidden, cl::desc("Verify no dangling value in ScalarEvolution's " @@ -11922,14 +11925,14 @@ SE.getTypeSizeInBits(NewBECount->getType())) CurBECount = SE2.getZeroExtendExpr(CurBECount, NewBECount->getType()); - auto *ConstantDelta = - dyn_cast(SE2.getMinusSCEV(CurBECount, NewBECount)); + const SCEV *Delta = SE2.getMinusSCEV(CurBECount, NewBECount); - if (ConstantDelta && ConstantDelta->getAPInt() != 0) { - dbgs() << "Trip Count Changed!\n"; + // Unless VerifySCEVStrict is set, we only compare constant deltas. + if ((VerifySCEVStrict || isa(Delta)) && !Delta->isZero()) { + dbgs() << "Trip Count for " << *L << " Changed!\n"; dbgs() << "Old: " << *CurBECount << "\n"; dbgs() << "New: " << *NewBECount << "\n"; - dbgs() << "Delta: " << *ConstantDelta << "\n"; + dbgs() << "Delta: " << *Delta << "\n"; std::abort(); } } -------------- next part -------------- A non-text attachment was scrubbed... Name: D68592.224564.patch Type: text/x-patch Size: 1541 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Fri Oct 11 04:59:14 2019 From: llvm-commits at lists.llvm.org (Kai Nacke via llvm-commits) Date: Fri, 11 Oct 2019 11:59:14 -0000 Subject: [llvm] r374538 - [FileCheck] Implement --ignore-case option. 
Message-ID: <20191011115914.96B8A927D0@lists.llvm.org> Author: redstar Date: Fri Oct 11 04:59:14 2019 New Revision: 374538 URL: http://llvm.org/viewvc/llvm-project?rev=374538&view=rev Log: [FileCheck] Implement --ignore-case option. The FileCheck utility is enhanced to support a `--ignore-case` option. This is useful in cases where the output of Unix tools differs in case (e.g. case not specified by Posix). Reviewers: Bigcheese, jakehehrlich, rupprecht, espindola, alexshap, jhenderson, MaskRay Differential Revision: https://reviews.llvm.org/D68146 Added: llvm/trunk/test/FileCheck/check-ignore-case.txt Modified: llvm/trunk/docs/CommandGuide/FileCheck.rst llvm/trunk/include/llvm/Support/FileCheck.h llvm/trunk/lib/Support/FileCheck.cpp llvm/trunk/lib/Support/FileCheckImpl.h llvm/trunk/utils/FileCheck/FileCheck.cpp Modified: llvm/trunk/docs/CommandGuide/FileCheck.rst URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/docs/CommandGuide/FileCheck.rst?rev=374538&r1=374537&r2=374538&view=diff ============================================================================== --- llvm/trunk/docs/CommandGuide/FileCheck.rst (original) +++ llvm/trunk/docs/CommandGuide/FileCheck.rst Fri Oct 11 04:59:14 2019 @@ -71,6 +71,11 @@ and from the command line. The :option:`--strict-whitespace` argument disables this behavior. End-of-line sequences are canonicalized to UNIX-style ``\n`` in all modes. +.. option:: --ignore-case + + By default, FileCheck uses case-sensitive matching. This option causes + FileCheck to use case-insensitive matching. + .. 
option:: --implicit-check-not check-pattern Adds implicit negative checks for the specified patterns between positive Modified: llvm/trunk/include/llvm/Support/FileCheck.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Support/FileCheck.h?rev=374538&r1=374537&r2=374538&view=diff ============================================================================== --- llvm/trunk/include/llvm/Support/FileCheck.h (original) +++ llvm/trunk/include/llvm/Support/FileCheck.h Fri Oct 11 04:59:14 2019 @@ -30,6 +30,7 @@ struct FileCheckRequest { std::vector GlobalDefines; bool AllowEmptyInput = false; bool MatchFullLines = false; + bool IgnoreCase = false; bool EnableVarScope = false; bool AllowDeprecatedDagOverlap = false; bool Verbose = false; Modified: llvm/trunk/lib/Support/FileCheck.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Support/FileCheck.cpp?rev=374538&r1=374537&r2=374538&view=diff ============================================================================== --- llvm/trunk/lib/Support/FileCheck.cpp (original) +++ llvm/trunk/lib/Support/FileCheck.cpp Fri Oct 11 04:59:14 2019 @@ -320,6 +320,7 @@ bool FileCheckPattern::parsePattern(Stri SourceMgr &SM, const FileCheckRequest &Req) { bool MatchFullLinesHere = Req.MatchFullLines && CheckTy != Check::CheckNot; + IgnoreCase = Req.IgnoreCase; PatternLoc = SMLoc::getFromPointer(PatternStr.data()); @@ -619,7 +620,8 @@ Expected FileCheckPattern::match // If this is a fixed string pattern, just match it now. if (!FixedStr.empty()) { MatchLen = FixedStr.size(); - size_t Pos = Buffer.find(FixedStr); + size_t Pos = IgnoreCase ? 
Buffer.find_lower(FixedStr) + : Buffer.find(FixedStr); if (Pos == StringRef::npos) return make_error(); return Pos; @@ -657,7 +659,10 @@ Expected FileCheckPattern::match } SmallVector MatchInfo; - if (!Regex(RegExToMatch, Regex::Newline).match(Buffer, &MatchInfo)) + unsigned int Flags = Regex::Newline; + if (IgnoreCase) + Flags |= Regex::IgnoreCase; + if (!Regex(RegExToMatch, Flags).match(Buffer, &MatchInfo)) return make_error(); // Successful regex match. Modified: llvm/trunk/lib/Support/FileCheckImpl.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Support/FileCheckImpl.h?rev=374538&r1=374537&r2=374538&view=diff ============================================================================== --- llvm/trunk/lib/Support/FileCheckImpl.h (original) +++ llvm/trunk/lib/Support/FileCheckImpl.h Fri Oct 11 04:59:14 2019 @@ -428,6 +428,9 @@ class FileCheckPattern { /// line to the one with this CHECK. Optional LineNumber; + /// Ignore case while matching if set to true. + bool IgnoreCase = false; + public: FileCheckPattern(Check::FileCheckType Ty, FileCheckPatternContext *Context, Optional Line = None) Added: llvm/trunk/test/FileCheck/check-ignore-case.txt URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/FileCheck/check-ignore-case.txt?rev=374538&view=auto ============================================================================== --- llvm/trunk/test/FileCheck/check-ignore-case.txt (added) +++ llvm/trunk/test/FileCheck/check-ignore-case.txt Fri Oct 11 04:59:14 2019 @@ -0,0 +1,45 @@ +## Check that a full line is matched case insensitively. +# RUN: FileCheck --ignore-case --match-full-lines --check-prefix=FULL --input-file=%s %s + +## Check that a regular expression matches case insensitively. +# RUN: FileCheck --ignore-case --check-prefix=REGEX --input-file=%s %s + +## Check that a pattern from command line matches case insensitively. 
+# RUN: FileCheck --ignore-case --check-prefix=PAT --DPATTERN="THIS is the" --input-file=%s %s + +## Check that COUNT and NEXT work case insensitively. +# RUN: FileCheck --ignore-case --check-prefix=CNT --input-file=%s %s + +## Check that match on same line works case insensitively. +# RUN: FileCheck --ignore-case --check-prefix=LINE --input-file=%s %s + +## Check that option --implicit-not works case insensitively. +# RUN: sed '/^#/d' %s | FileCheck --implicit-check-not=sTrInG %s +# RUN: sed '/^#/d' %s | not FileCheck --ignore-case --implicit-check-not=sTrInG %s 2>&1 | FileCheck --check-prefix=ERROR %s + +this is the STRING to be matched + +# FULL: tHis iS The String TO be matched +# REGEX: s{{TRing}} +# PAT: [[PATTERN]] string + +Loop 1 +lOop 2 +loOp 3 +looP 4 +loop 5 +LOOP 6 +BREAK + +# CNT-COUNT-6: LOop {{[0-9]}} +# CNT-NOT: loop +# CNT-NEXT: break + +One Line To Match + +# LINE: {{o}}ne line +# LINE-SAME: {{t}}o match + +# ERROR: command line:1:{{[0-9]+}}: error: CHECK-NOT: excluded string found in input +# ERROR-NEXT: -implicit-check-not='sTrInG' +# ERROR: note: found here Modified: llvm/trunk/utils/FileCheck/FileCheck.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/FileCheck/FileCheck.cpp?rev=374538&r1=374537&r2=374538&view=diff ============================================================================== --- llvm/trunk/utils/FileCheck/FileCheck.cpp (original) +++ llvm/trunk/utils/FileCheck/FileCheck.cpp Fri Oct 11 04:59:14 2019 @@ -48,6 +48,10 @@ static cl::opt NoCanonicalizeWhite "strict-whitespace", cl::desc("Do not treat all horizontal whitespace as equivalent")); +static cl::opt IgnoreCase( + "ignore-case", + cl::desc("Use case-insensitive matching")); + static cl::list ImplicitCheckNot( "implicit-check-not", cl::desc("Add an implicit negative check with this pattern to every\n" @@ -555,6 +559,7 @@ int main(int argc, char **argv) { Req.VerboseVerbose = VerboseVerbose; Req.NoCanonicalizeWhiteSpace = NoCanonicalizeWhiteSpace; 
Req.MatchFullLines = MatchFullLines; + Req.IgnoreCase = IgnoreCase; if (VerboseVerbose) Req.Verbose = true; From llvm-commits at lists.llvm.org Fri Oct 11 04:59:55 2019 From: llvm-commits at lists.llvm.org (Oliver Stannard via llvm-commits) Date: Fri, 11 Oct 2019 11:59:55 -0000 Subject: [llvm] r374539 - Dead Virtual Function Elimination Message-ID: <20191011115956.0925192A44@lists.llvm.org> Author: ostannard Date: Fri Oct 11 04:59:55 2019 New Revision: 374539 URL: http://llvm.org/viewvc/llvm-project?rev=374539&view=rev Log: Dead Virtual Function Elimination Currently, it is hard for the compiler to remove unused C++ virtual functions, because they are all referenced from vtables, which are referenced by constructors. This means that if the constructor is called from any live code, then we keep every virtual function in the final link, even if there are no call sites which can use it. This patch allows unused virtual functions to be removed during LTO (and regular compilation in limited circumstances) by using type metadata to match virtual function call sites to the vtable slots they might load from. This information can then be used in the global dead code elimination pass instead of the references from vtables to virtual functions, to more accurately determine which functions are reachable. To make this transformation safe, I have changed clang's code-generation to always load virtual function pointers using the llvm.type.checked.load intrinsic, instead of regular load instructions. I originally tried writing this using clang's existing code-generation, which uses the llvm.type.test and llvm.assume intrinsics after doing a normal load. However, it is possible for optimisations to obscure the relationship between the GEP, load and llvm.type.test, causing GlobalDCE to fail to find virtual function call sites. The existing linkage and visibility types don't accurately describe the scope in which a virtual call could be made which uses a given vtable. 
This is wider than the visibility of the type itself, because a virtual function call could be made using a more-visible base class. I've added a new !vcall_visibility metadata type to represent this, described in TypeMetadata.rst. The internalization pass and libLTO have been updated to change this metadata when linking is performed.

This doesn't currently work with ThinLTO, because it needs to see every call to llvm.type.checked.load in the linkage unit. It might be possible to extend this optimisation to be able to use the ThinLTO summary, as was done for devirtualization, but until then that combination is rejected in the clang driver.

To test this, I've written a fuzzer which generates random C++ programs with complex class inheritance graphs, and virtual functions called through object and function pointers of different types. The programs are spread across multiple translation units and DSOs to test the different visibility restrictions.

I've also tried doing bootstrap builds of LLVM to test this. This isn't ideal, because only classes in anonymous namespaces can be optimised with -fvisibility=default, and some parts of LLVM (plugins and bugpoint) do not work correctly with -fvisibility=hidden. However, there are only 12 test failures when building with -fvisibility=hidden (and an unmodified compiler), and this change does not cause any new failures for either value of -fvisibility.

On the 7 C++ sub-benchmarks of SPEC2006, this gives a geomean code-size reduction of ~6%, over a baseline compiled with "-O2 -flto -fvisibility=hidden -fwhole-program-vtables". The best cases are reductions of ~14% in 450.soplex and 483.xalancbmk, and there are no code size increases.

I've also run this on a set of 8 mbed-os examples compiled for Armv7M, which show a geomean size reduction of ~3%, again with no size increases.

I had hoped that this would have no effect on performance, which would allow it to always be enabled (when using -fwhole-program-vtables).
However, the changes in clang to use the llvm.type.checked.load intrinsic are causing ~1% performance regression in the C++ parts of SPEC2006. It should be possible to recover some of this perf loss by teaching optimisations about the llvm.type.checked.load intrinsic, which would make it worth turning this on by default (though it's still dependent on -fwhole-program-vtables). Differential revision: https://reviews.llvm.org/D63932 Added: llvm/trunk/test/LTO/ARM/lto-linking-metadata.ll llvm/trunk/test/Transforms/GlobalDCE/virtual-functions-base-call.ll llvm/trunk/test/Transforms/GlobalDCE/virtual-functions-base-pointer-call.ll llvm/trunk/test/Transforms/GlobalDCE/virtual-functions-derived-call.ll llvm/trunk/test/Transforms/GlobalDCE/virtual-functions-derived-pointer-call.ll llvm/trunk/test/Transforms/GlobalDCE/virtual-functions-visibility-post-lto.ll llvm/trunk/test/Transforms/GlobalDCE/virtual-functions-visibility-pre-lto.ll llvm/trunk/test/Transforms/GlobalDCE/virtual-functions.ll llvm/trunk/test/Transforms/GlobalDCE/vtable-rtti.ll llvm/trunk/test/Transforms/Internalize/vcall-visibility.ll Modified: llvm/trunk/docs/LangRef.rst llvm/trunk/docs/TypeMetadata.rst llvm/trunk/include/llvm/Analysis/TypeMetadataUtils.h llvm/trunk/include/llvm/IR/FixedMetadataKinds.def llvm/trunk/include/llvm/IR/GlobalObject.h llvm/trunk/include/llvm/Transforms/IPO/GlobalDCE.h llvm/trunk/lib/Analysis/TypeMetadataUtils.cpp llvm/trunk/lib/IR/Metadata.cpp llvm/trunk/lib/LTO/LTO.cpp llvm/trunk/lib/LTO/LTOCodeGenerator.cpp llvm/trunk/lib/Transforms/IPO/GlobalDCE.cpp llvm/trunk/lib/Transforms/IPO/WholeProgramDevirt.cpp llvm/trunk/test/ThinLTO/X86/lazyload_metadata.ll Modified: llvm/trunk/docs/LangRef.rst URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/docs/LangRef.rst?rev=374539&r1=374538&r2=374539&view=diff ============================================================================== --- llvm/trunk/docs/LangRef.rst (original) +++ llvm/trunk/docs/LangRef.rst Fri Oct 11 04:59:55 2019 @@ 
-6264,6 +6264,13 @@ enum is the smallest type which can repr !0 = !{i32 1, !"short_wchar", i32 1} !1 = !{i32 1, !"short_enum", i32 0} +LTO Post-Link Module Flags Metadata +----------------------------------- + +Some optimisations are only when the entire LTO unit is present in the current +module. This is represented by the ``LTOPostLink`` module flags metadata, which +will be created with a value of ``1`` when LTO linking occurs. + Automatic Linker Flags Named Metadata ===================================== @@ -16809,6 +16816,8 @@ Overview: The ``llvm.type.test`` intrinsic tests whether the given pointer is associated with the given type identifier. +.. _type.checked.load: + '``llvm.type.checked.load``' Intrinsic ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Modified: llvm/trunk/docs/TypeMetadata.rst URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/docs/TypeMetadata.rst?rev=374539&r1=374538&r2=374539&view=diff ============================================================================== --- llvm/trunk/docs/TypeMetadata.rst (original) +++ llvm/trunk/docs/TypeMetadata.rst Fri Oct 11 04:59:55 2019 @@ -224,3 +224,67 @@ efficiently to minimize the sizes of the } .. _GlobalLayoutBuilder: https://github.com/llvm/llvm-project/blob/master/llvm/include/llvm/Transforms/IPO/LowerTypeTests.h + +``!vcall_visibility`` Metadata +============================== + +In order to allow removing unused function pointers from vtables, we need to +know whether every virtual call which could use it is known to the compiler, or +whether another translation unit could introduce more calls through the vtable. +This is not the same as the linkage of the vtable, because call sites could be +using a pointer of a more widely-visible base class. For example, consider this +code: + +.. 
code-block:: c++ + + __attribute__((visibility("default"))) + struct A { + virtual void f(); + }; + + __attribute__((visibility("hidden"))) + struct B : A { + virtual void f(); + }; + +With LTO, we know that all code which can see the declaration of ``B`` is +visible to us. However, a pointer to a ``B`` could be cast to ``A*`` and passed +to another linkage unit, which could then call ``f`` on it. This call would +load from the vtable for ``B`` (using the object pointer), and then call +``B::f``. This means we can't remove the function pointer from ``B``'s vtable, +or the implementation of ``B::f``. However, if we can see all code which knows +about any dynamic base class (which would be the case if ``B`` only inherited +from classes with hidden visibility), then this optimisation would be valid. + +This concept is represented in IR by the ``!vcall_visibility`` metadata +attached to vtable objects, with the following values: + +.. list-table:: + :header-rows: 1 + :widths: 10 90 + + * - Value + - Behavior + + * - 0 (or omitted) + - **Public** + Virtual function calls using this vtable could be made from external + code. + + * - 1 + - **Linkage Unit** + All virtual function calls which might use this vtable are in the + current LTO unit, meaning they will be in the current module once + LTO linking has been performed. + + * - 2 + - **Translation Unit** + All virtual function calls which might use this vtable are in the + current module. + +In addition, all function pointer loads from a vtable marked with the +``!vcall_visibility`` metadata (with a non-zero value) must be done using the +:ref:`llvm.type.checked.load ` intrinsic, so that virtual +calls sites can be correlated with the vtables which they might load from. +Other parts of the vtable (RTTI, offset-to-top, ...) can still be accessed with +normal loads. 
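
The ``!vcall_visibility`` table above, combined with the ``LTOPostLink`` module flag, amounts to a small decision rule for when a vtable is eligible for dead virtual function elimination. The following is a minimal standalone sketch of that rule only, not the LLVM implementation: the enum and the function name `isSafeForVFE` are hypothetical stand-ins, chosen to mirror the three documented metadata values and the condition used in `GlobalDCEPass::ScanVTables` in this patch.

```cpp
#include <cassert>

// Hypothetical stand-in for the three documented !vcall_visibility values.
// The numeric values follow the table in TypeMetadata.rst; the type itself
// is not part of the LLVM API.
enum class VCallVisibility {
  Public = 0,          // calls could come from external code
  LinkageUnit = 1,     // all calls are in the current LTO unit
  TranslationUnit = 2  // all calls are in the current module
};

// Assumed helper (not an LLVM function): returns true when every virtual
// call that might use the vtable is guaranteed to be in the current module,
// so unused slots can be reasoned about from call sites alone.
bool isSafeForVFE(VCallVisibility Vis, bool LTOPostLink) {
  if (Vis == VCallVisibility::TranslationUnit)
    return true;
  // Linkage-unit visibility only helps once LTO linking has merged the
  // whole LTO unit into this module.
  return LTOPostLink && Vis == VCallVisibility::LinkageUnit;
}

// Small usage example covering the three rows of the table.
void checkExamples() {
  assert(!isSafeForVFE(VCallVisibility::Public, /*LTOPostLink=*/true));
  assert(!isSafeForVFE(VCallVisibility::LinkageUnit, /*LTOPostLink=*/false));
  assert(isSafeForVFE(VCallVisibility::LinkageUnit, /*LTOPostLink=*/true));
  assert(isSafeForVFE(VCallVisibility::TranslationUnit, /*LTOPostLink=*/false));
}
```

In the patch itself, a vtable that passes this check can still be dropped from the safe set later, for example when a matching `llvm.type.checked.load` uses a non-constant offset; the sketch deliberately leaves that step out.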
Modified: llvm/trunk/include/llvm/Analysis/TypeMetadataUtils.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Analysis/TypeMetadataUtils.h?rev=374539&r1=374538&r2=374539&view=diff ============================================================================== --- llvm/trunk/include/llvm/Analysis/TypeMetadataUtils.h (original) +++ llvm/trunk/include/llvm/Analysis/TypeMetadataUtils.h Fri Oct 11 04:59:55 2019 @@ -50,6 +50,8 @@ void findDevirtualizableCallsForTypeChec SmallVectorImpl &LoadedPtrs, SmallVectorImpl &Preds, bool &HasNonCallUses, const CallInst *CI, DominatorTree &DT); + +Constant *getPointerAtOffset(Constant *I, uint64_t Offset, Module &M); } #endif Modified: llvm/trunk/include/llvm/IR/FixedMetadataKinds.def URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/IR/FixedMetadataKinds.def?rev=374539&r1=374538&r2=374539&view=diff ============================================================================== --- llvm/trunk/include/llvm/IR/FixedMetadataKinds.def (original) +++ llvm/trunk/include/llvm/IR/FixedMetadataKinds.def Fri Oct 11 04:59:55 2019 @@ -40,3 +40,4 @@ LLVM_FIXED_MD_KIND(MD_access_group, "llv LLVM_FIXED_MD_KIND(MD_callback, "callback", 26) LLVM_FIXED_MD_KIND(MD_preserve_access_index, "llvm.preserve.access.index", 27) LLVM_FIXED_MD_KIND(MD_misexpect, "misexpect", 28) +LLVM_FIXED_MD_KIND(MD_vcall_visibility, "vcall_visibility", 29) Modified: llvm/trunk/include/llvm/IR/GlobalObject.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/IR/GlobalObject.h?rev=374539&r1=374538&r2=374539&view=diff ============================================================================== --- llvm/trunk/include/llvm/IR/GlobalObject.h (original) +++ llvm/trunk/include/llvm/IR/GlobalObject.h Fri Oct 11 04:59:55 2019 @@ -28,6 +28,20 @@ class MDNode; class Metadata; class GlobalObject : public GlobalValue { +public: + // VCallVisibility - values for visibility metadata attached to vtables. 
This + // describes the scope in which a virtual call could end up being dispatched + // through this vtable. + enum VCallVisibility { + // Type is potentially visible to external code. + VCallVisibilityPublic = 0, + // Type is only visible to code which will be in the current Module after + // LTO internalization. + VCallVisibilityLinkageUnit = 1, + // Type is only visible to code in the current Module. + VCallVisibilityTranslationUnit = 2, + }; + protected: GlobalObject(Type *Ty, ValueTy VTy, Use *Ops, unsigned NumOps, LinkageTypes Linkage, const Twine &Name, @@ -163,6 +177,8 @@ public: void copyMetadata(const GlobalObject *Src, unsigned Offset); void addTypeMetadata(unsigned Offset, Metadata *TypeID); + void addVCallVisibilityMetadata(VCallVisibility Visibility); + VCallVisibility getVCallVisibility() const; protected: void copyAttributesFrom(const GlobalObject *Src); Modified: llvm/trunk/include/llvm/Transforms/IPO/GlobalDCE.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Transforms/IPO/GlobalDCE.h?rev=374539&r1=374538&r2=374539&view=diff ============================================================================== --- llvm/trunk/include/llvm/Transforms/IPO/GlobalDCE.h (original) +++ llvm/trunk/include/llvm/Transforms/IPO/GlobalDCE.h Fri Oct 11 04:59:55 2019 @@ -43,11 +43,25 @@ private: /// Comdat -> Globals in that Comdat section. std::unordered_multimap ComdatMembers; + /// !type metadata -> set of (vtable, offset) pairs + DenseMap, 4>> + TypeIdMap; + + // Global variables which are vtables, and which we have enough information + // about to safely do dead virtual function elimination. + SmallPtrSet VFESafeVTables; + void UpdateGVDependencies(GlobalValue &GV); void MarkLive(GlobalValue &GV, SmallVectorImpl *Updates = nullptr); bool RemoveUnusedGlobalValue(GlobalValue &GV); + // Dead virtual function elimination. 
+ void AddVirtualFunctionDependencies(Module &M); + void ScanVTables(Module &M); + void ScanTypeCheckedLoadIntrinsics(Module &M); + void ScanVTableLoad(Function *Caller, Metadata *TypeId, uint64_t CallOffset); + void ComputeDependencies(Value *V, SmallPtrSetImpl &U); }; Modified: llvm/trunk/lib/Analysis/TypeMetadataUtils.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/TypeMetadataUtils.cpp?rev=374539&r1=374538&r2=374539&view=diff ============================================================================== --- llvm/trunk/lib/Analysis/TypeMetadataUtils.cpp (original) +++ llvm/trunk/lib/Analysis/TypeMetadataUtils.cpp Fri Oct 11 04:59:55 2019 @@ -127,3 +127,35 @@ void llvm::findDevirtualizableCallsForTy findCallsAtConstantOffset(DevirtCalls, &HasNonCallUses, LoadedPtr, Offset->getZExtValue(), CI, DT); } + +Constant *llvm::getPointerAtOffset(Constant *I, uint64_t Offset, Module &M) { + if (I->getType()->isPointerTy()) { + if (Offset == 0) + return I; + return nullptr; + } + + const DataLayout &DL = M.getDataLayout(); + + if (auto *C = dyn_cast(I)) { + const StructLayout *SL = DL.getStructLayout(C->getType()); + if (Offset >= SL->getSizeInBytes()) + return nullptr; + + unsigned Op = SL->getElementContainingOffset(Offset); + return getPointerAtOffset(cast(I->getOperand(Op)), + Offset - SL->getElementOffset(Op), M); + } + if (auto *C = dyn_cast(I)) { + ArrayType *VTableTy = C->getType(); + uint64_t ElemSize = DL.getTypeAllocSize(VTableTy->getElementType()); + + unsigned Op = Offset / ElemSize; + if (Op >= C->getNumOperands()) + return nullptr; + + return getPointerAtOffset(cast(I->getOperand(Op)), + Offset % ElemSize, M); + } + return nullptr; +} Modified: llvm/trunk/lib/IR/Metadata.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/IR/Metadata.cpp?rev=374539&r1=374538&r2=374539&view=diff ============================================================================== --- llvm/trunk/lib/IR/Metadata.cpp (original) +++ 
llvm/trunk/lib/IR/Metadata.cpp Fri Oct 11 04:59:55 2019 @@ -1497,6 +1497,24 @@ void GlobalObject::addTypeMetadata(unsig TypeID})); } +void GlobalObject::addVCallVisibilityMetadata(VCallVisibility Visibility) { + addMetadata(LLVMContext::MD_vcall_visibility, + *MDNode::get(getContext(), + {ConstantAsMetadata::get(ConstantInt::get( + Type::getInt64Ty(getContext()), Visibility))})); +} + +GlobalObject::VCallVisibility GlobalObject::getVCallVisibility() const { + if (MDNode *MD = getMetadata(LLVMContext::MD_vcall_visibility)) { + uint64_t Val = cast( + cast(MD->getOperand(0))->getValue()) + ->getZExtValue(); + assert((Val >= 0 && Val <= 2) && "unknown vcall visibility!"); + return (VCallVisibility)Val; + } + return VCallVisibility::VCallVisibilityPublic; +} + void Function::setSubprogram(DISubprogram *SP) { setMetadata(LLVMContext::MD_dbg, SP); } Modified: llvm/trunk/lib/LTO/LTO.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/LTO/LTO.cpp?rev=374539&r1=374538&r2=374539&view=diff ============================================================================== --- llvm/trunk/lib/LTO/LTO.cpp (original) +++ llvm/trunk/lib/LTO/LTO.cpp Fri Oct 11 04:59:55 2019 @@ -1003,6 +1003,8 @@ Error LTO::runRegularLTO(AddStreamFn Add GV->setLinkage(GlobalValue::InternalLinkage); } + RegularLTO.CombinedModule->addModuleFlag(Module::Error, "LTOPostLink", 1); + if (Conf.PostInternalizeModuleHook && !Conf.PostInternalizeModuleHook(0, *RegularLTO.CombinedModule)) return Error::success(); Modified: llvm/trunk/lib/LTO/LTOCodeGenerator.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/LTO/LTOCodeGenerator.cpp?rev=374539&r1=374538&r2=374539&view=diff ============================================================================== --- llvm/trunk/lib/LTO/LTOCodeGenerator.cpp (original) +++ llvm/trunk/lib/LTO/LTOCodeGenerator.cpp Fri Oct 11 04:59:55 2019 @@ -463,6 +463,8 @@ void LTOCodeGenerator::applyScopeRestric internalizeModule(*MergedModule, mustPreserveGV); + 
MergedModule->addModuleFlag(Module::Error, "LTOPostLink", 1); + ScopeRestrictionsDone = true; } Modified: llvm/trunk/lib/Transforms/IPO/GlobalDCE.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/IPO/GlobalDCE.cpp?rev=374539&r1=374538&r2=374539&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/IPO/GlobalDCE.cpp (original) +++ llvm/trunk/lib/Transforms/IPO/GlobalDCE.cpp Fri Oct 11 04:59:55 2019 @@ -17,9 +17,11 @@ #include "llvm/Transforms/IPO/GlobalDCE.h" #include "llvm/ADT/SmallPtrSet.h" #include "llvm/ADT/Statistic.h" +#include "llvm/Analysis/TypeMetadataUtils.h" #include "llvm/IR/Instructions.h" #include "llvm/IR/IntrinsicInst.h" #include "llvm/IR/Module.h" +#include "llvm/IR/Operator.h" #include "llvm/Pass.h" #include "llvm/Transforms/IPO.h" #include "llvm/Transforms/Utils/CtorUtils.h" @@ -29,10 +31,15 @@ using namespace llvm; #define DEBUG_TYPE "globaldce" +static cl::opt + ClEnableVFE("enable-vfe", cl::Hidden, cl::init(true), cl::ZeroOrMore, + cl::desc("Enable virtual function elimination")); + STATISTIC(NumAliases , "Number of global aliases removed"); STATISTIC(NumFunctions, "Number of functions removed"); STATISTIC(NumIFuncs, "Number of indirect functions removed"); STATISTIC(NumVariables, "Number of global variables removed"); +STATISTIC(NumVFuncs, "Number of virtual functions removed"); namespace { class GlobalDCELegacyPass : public ModulePass { @@ -118,6 +125,15 @@ void GlobalDCEPass::UpdateGVDependencies ComputeDependencies(User, Deps); Deps.erase(&GV); // Remove self-reference. for (GlobalValue *GVU : Deps) { + // If this is a dep from a vtable to a virtual function, and we have + // complete information about all virtual call sites which could call + // though this vtable, then skip it, because the call site information will + // be more precise. 
+ if (VFESafeVTables.count(GVU) && isa(&GV)) { + LLVM_DEBUG(dbgs() << "Ignoring dep " << GVU->getName() << " -> " + << GV.getName() << "\n"); + continue; + } GVDependencies[GVU].insert(&GV); } } @@ -132,12 +148,133 @@ void GlobalDCEPass::MarkLive(GlobalValue if (Updates) Updates->push_back(&GV); if (Comdat *C = GV.getComdat()) { - for (auto &&CM : make_range(ComdatMembers.equal_range(C))) + for (auto &&CM : make_range(ComdatMembers.equal_range(C))) { MarkLive(*CM.second, Updates); // Recursion depth is only two because only // globals in the same comdat are visited. + } + } +} + +void GlobalDCEPass::ScanVTables(Module &M) { + SmallVector Types; + LLVM_DEBUG(dbgs() << "Building type info -> vtable map\n"); + + auto *LTOPostLinkMD = + cast_or_null(M.getModuleFlag("LTOPostLink")); + bool LTOPostLink = + LTOPostLinkMD && + (cast(LTOPostLinkMD->getValue())->getZExtValue() != 0); + + for (GlobalVariable &GV : M.globals()) { + Types.clear(); + GV.getMetadata(LLVMContext::MD_type, Types); + if (GV.isDeclaration() || Types.empty()) + continue; + + // Use the typeid metadata on the vtable to build a mapping from typeids to + // the list of (GV, offset) pairs which are the possible vtables for that + // typeid. + for (MDNode *Type : Types) { + Metadata *TypeID = Type->getOperand(1).get(); + + uint64_t Offset = + cast( + cast(Type->getOperand(0))->getValue()) + ->getZExtValue(); + + TypeIdMap[TypeID].insert(std::make_pair(&GV, Offset)); + } + + // If the type corresponding to the vtable is private to this translation + // unit, we know that we can see all virtual functions which might use it, + // so VFE is safe. 
+ if (auto GO = dyn_cast(&GV)) { + GlobalObject::VCallVisibility TypeVis = GV.getVCallVisibility(); + if (TypeVis == GlobalObject::VCallVisibilityTranslationUnit || + (LTOPostLink && + TypeVis == GlobalObject::VCallVisibilityLinkageUnit)) { + LLVM_DEBUG(dbgs() << GV.getName() << " is safe for VFE\n"); + VFESafeVTables.insert(&GV); + } + } + } +} + +void GlobalDCEPass::ScanVTableLoad(Function *Caller, Metadata *TypeId, + uint64_t CallOffset) { + for (auto &VTableInfo : TypeIdMap[TypeId]) { + GlobalVariable *VTable = VTableInfo.first; + uint64_t VTableOffset = VTableInfo.second; + + Constant *Ptr = + getPointerAtOffset(VTable->getInitializer(), VTableOffset + CallOffset, + *Caller->getParent()); + if (!Ptr) { + LLVM_DEBUG(dbgs() << "can't find pointer in vtable!\n"); + VFESafeVTables.erase(VTable); + return; + } + + auto Callee = dyn_cast(Ptr->stripPointerCasts()); + if (!Callee) { + LLVM_DEBUG(dbgs() << "vtable entry is not function pointer!\n"); + VFESafeVTables.erase(VTable); + return; + } + + LLVM_DEBUG(dbgs() << "vfunc dep " << Caller->getName() << " -> " + << Callee->getName() << "\n"); + GVDependencies[Caller].insert(Callee); + } +} + +void GlobalDCEPass::ScanTypeCheckedLoadIntrinsics(Module &M) { + LLVM_DEBUG(dbgs() << "Scanning type.checked.load intrinsics\n"); + Function *TypeCheckedLoadFunc = + M.getFunction(Intrinsic::getName(Intrinsic::type_checked_load)); + + if (!TypeCheckedLoadFunc) + return; + + for (auto U : TypeCheckedLoadFunc->users()) { + auto CI = dyn_cast(U); + if (!CI) + continue; + + auto *Offset = dyn_cast(CI->getArgOperand(1)); + Value *TypeIdValue = CI->getArgOperand(2); + auto *TypeId = cast(TypeIdValue)->getMetadata(); + + if (Offset) { + ScanVTableLoad(CI->getFunction(), TypeId, Offset->getZExtValue()); + } else { + // type.checked.load with a non-constant offset, so assume every entry in + // every matching vtable is used. 
+ for (auto &VTableInfo : TypeIdMap[TypeId]) { + VFESafeVTables.erase(VTableInfo.first); + } + } } } +void GlobalDCEPass::AddVirtualFunctionDependencies(Module &M) { + if (!ClEnableVFE) + return; + + ScanVTables(M); + + if (VFESafeVTables.empty()) + return; + + ScanTypeCheckedLoadIntrinsics(M); + + LLVM_DEBUG( + dbgs() << "VFE safe vtables:\n"; + for (auto *VTable : VFESafeVTables) + dbgs() << " " << VTable->getName() << "\n"; + ); +} + PreservedAnalyses GlobalDCEPass::run(Module &M, ModuleAnalysisManager &MAM) { bool Changed = false; @@ -163,6 +300,10 @@ PreservedAnalyses GlobalDCEPass::run(Mod if (Comdat *C = GA.getComdat()) ComdatMembers.insert(std::make_pair(C, &GA)); + // Add dependencies between virtual call sites and the virtual functions they + // might call, if we have that information. + AddVirtualFunctionDependencies(M); + // Loop over the module, adding globals which are obviously necessary. for (GlobalObject &GO : M.global_objects()) { Changed |= RemoveUnusedGlobalValue(GO); @@ -257,8 +398,17 @@ PreservedAnalyses GlobalDCEPass::run(Mod }; NumFunctions += DeadFunctions.size(); - for (Function *F : DeadFunctions) + for (Function *F : DeadFunctions) { + if (!F->use_empty()) { + // Virtual functions might still be referenced by one or more vtables, + // but if we've proven them to be unused then it's safe to replace the + // virtual function pointers with null, allowing us to remove the + // function itself. 
+ ++NumVFuncs; + F->replaceAllUsesWith(ConstantPointerNull::get(F->getType())); + } EraseUnusedGlobalValue(F); + } NumVariables += DeadGlobalVars.size(); for (GlobalVariable *GV : DeadGlobalVars) @@ -277,6 +427,8 @@ PreservedAnalyses GlobalDCEPass::run(Mod ConstantDependenciesCache.clear(); GVDependencies.clear(); ComdatMembers.clear(); + TypeIdMap.clear(); + VFESafeVTables.clear(); if (Changed) return PreservedAnalyses::none(); Modified: llvm/trunk/lib/Transforms/IPO/WholeProgramDevirt.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/IPO/WholeProgramDevirt.cpp?rev=374539&r1=374538&r2=374539&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/IPO/WholeProgramDevirt.cpp (original) +++ llvm/trunk/lib/Transforms/IPO/WholeProgramDevirt.cpp Fri Oct 11 04:59:55 2019 @@ -496,7 +496,6 @@ struct DevirtModule { void buildTypeIdentifierMap( std::vector &Bits, DenseMap> &TypeIdMap); - Constant *getPointerAtOffset(Constant *I, uint64_t Offset); bool tryFindVirtualCallTargets(std::vector &TargetsForSlot, const std::set &TypeMemberInfos, @@ -813,38 +812,6 @@ void DevirtModule::buildTypeIdentifierMa } } -Constant *DevirtModule::getPointerAtOffset(Constant *I, uint64_t Offset) { - if (I->getType()->isPointerTy()) { - if (Offset == 0) - return I; - return nullptr; - } - - const DataLayout &DL = M.getDataLayout(); - - if (auto *C = dyn_cast(I)) { - const StructLayout *SL = DL.getStructLayout(C->getType()); - if (Offset >= SL->getSizeInBytes()) - return nullptr; - - unsigned Op = SL->getElementContainingOffset(Offset); - return getPointerAtOffset(cast(I->getOperand(Op)), - Offset - SL->getElementOffset(Op)); - } - if (auto *C = dyn_cast(I)) { - ArrayType *VTableTy = C->getType(); - uint64_t ElemSize = DL.getTypeAllocSize(VTableTy->getElementType()); - - unsigned Op = Offset / ElemSize; - if (Op >= C->getNumOperands()) - return nullptr; - - return getPointerAtOffset(cast(I->getOperand(Op)), - 
Offset % ElemSize); - } - return nullptr; -} - bool DevirtModule::tryFindVirtualCallTargets( std::vector &TargetsForSlot, const std::set &TypeMemberInfos, uint64_t ByteOffset) { @@ -853,7 +820,7 @@ bool DevirtModule::tryFindVirtualCallTar return false; Constant *Ptr = getPointerAtOffset(TM.Bits->GV->getInitializer(), - TM.Offset + ByteOffset); + TM.Offset + ByteOffset, M); if (!Ptr) return false; @@ -1941,6 +1908,12 @@ bool DevirtModule::run() { for (VTableBits &B : Bits) rebuildGlobal(B); + // We have lowered or deleted the type checked load intrinsics, so we no + // longer have enough information to reason about the liveness of virtual + // function pointers in GlobalDCE. + for (GlobalVariable &GV : M.globals()) + GV.eraseMetadata(LLVMContext::MD_vcall_visibility); + return true; } Added: llvm/trunk/test/LTO/ARM/lto-linking-metadata.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/LTO/ARM/lto-linking-metadata.ll?rev=374539&view=auto ============================================================================== --- llvm/trunk/test/LTO/ARM/lto-linking-metadata.ll (added) +++ llvm/trunk/test/LTO/ARM/lto-linking-metadata.ll Fri Oct 11 04:59:55 2019 @@ -0,0 +1,19 @@ +; RUN: opt %s -o %t1.bc + +; RUN: llvm-lto %t1.bc -o %t1.save.opt -save-merged-module -O1 --exported-symbol=foo +; RUN: llvm-dis < %t1.save.opt.merged.bc | FileCheck %s + +; RUN: llvm-lto2 run %t1.bc -o %t.out.o -save-temps \ +; RUN: -r=%t1.bc,foo,pxl +; RUN: llvm-dis < %t.out.o.0.2.internalize.bc | FileCheck %s + +target datalayout = "e-m:e-p:32:32-Fi8-i64:64-v128:64:128-a:0:32-n32-S64" +target triple = "armv7a-unknown-linux" + +define void @foo() { +entry: + ret void +} + +; CHECK: !llvm.module.flags = !{[[MD_NUM:![0-9]+]]} +; CHECK: [[MD_NUM]] = !{i32 1, !"LTOPostLink", i32 1} Modified: llvm/trunk/test/ThinLTO/X86/lazyload_metadata.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/ThinLTO/X86/lazyload_metadata.ll?rev=374539&r1=374538&r2=374539&view=diff 
============================================================================== --- llvm/trunk/test/ThinLTO/X86/lazyload_metadata.ll (original) +++ llvm/trunk/test/ThinLTO/X86/lazyload_metadata.ll Fri Oct 11 04:59:55 2019 @@ -10,13 +10,13 @@ ; RUN: llvm-lto -thinlto-action=import %t2.bc -thinlto-index=%t3.bc \ ; RUN: -o /dev/null -stats \ ; RUN: 2>&1 | FileCheck %s -check-prefix=LAZY -; LAZY: 63 bitcode-reader - Number of Metadata records loaded +; LAZY: 65 bitcode-reader - Number of Metadata records loaded ; LAZY: 2 bitcode-reader - Number of MDStrings loaded ; RUN: llvm-lto -thinlto-action=import %t2.bc -thinlto-index=%t3.bc \ ; RUN: -o /dev/null -disable-ondemand-mds-loading -stats \ ; RUN: 2>&1 | FileCheck %s -check-prefix=NOTLAZY -; NOTLAZY: 72 bitcode-reader - Number of Metadata records loaded +; NOTLAZY: 74 bitcode-reader - Number of Metadata records loaded ; NOTLAZY: 7 bitcode-reader - Number of MDStrings loaded Added: llvm/trunk/test/Transforms/GlobalDCE/virtual-functions-base-call.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/GlobalDCE/virtual-functions-base-call.ll?rev=374539&view=auto ============================================================================== --- llvm/trunk/test/Transforms/GlobalDCE/virtual-functions-base-call.ll (added) +++ llvm/trunk/test/Transforms/GlobalDCE/virtual-functions-base-call.ll Fri Oct 11 04:59:55 2019 @@ -0,0 +1,78 @@ +; RUN: opt < %s -globaldce -S | FileCheck %s + +target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128" + +; struct A { +; A(); +; virtual int foo(); +; }; +; +; struct B : A { +; B(); +; virtual int foo(); +; }; +; +; A::A() {} +; B::B() {} +; int A::foo() { return 42; } +; int B::foo() { return 1337; } +; +; extern "C" int test(A *p) { return p->foo(); } + +; The virtual call in test could be dispatched to either A::foo or B::foo, so +; both must be retained. 
+ +%struct.A = type { i32 (...)** } +%struct.B = type { %struct.A } + +; CHECK: @_ZTV1A = internal unnamed_addr constant { [3 x i8*] } { [3 x i8*] [i8* null, i8* null, i8* bitcast (i32 (%struct.A*)* @_ZN1A3fooEv to i8*)] } + at _ZTV1A = internal unnamed_addr constant { [3 x i8*] } { [3 x i8*] [i8* null, i8* null, i8* bitcast (i32 (%struct.A*)* @_ZN1A3fooEv to i8*)] }, align 8, !type !0, !type !1, !vcall_visibility !2 + +; CHECK: @_ZTV1B = internal unnamed_addr constant { [3 x i8*] } { [3 x i8*] [i8* null, i8* null, i8* bitcast (i32 (%struct.B*)* @_ZN1B3fooEv to i8*)] } + at _ZTV1B = internal unnamed_addr constant { [3 x i8*] } { [3 x i8*] [i8* null, i8* null, i8* bitcast (i32 (%struct.B*)* @_ZN1B3fooEv to i8*)] }, align 8, !type !0, !type !1, !type !3, !type !4, !vcall_visibility !2 + +; CHECK: define internal i32 @_ZN1A3fooEv( +define internal i32 @_ZN1A3fooEv(%struct.A* nocapture readnone %this) { +entry: + ret i32 42 +} + +; CHECK: define internal i32 @_ZN1B3fooEv( +define internal i32 @_ZN1B3fooEv(%struct.B* nocapture readnone %this) { +entry: + ret i32 1337 +} + +define hidden void @_ZN1AC2Ev(%struct.A* nocapture %this) { +entry: + %0 = getelementptr inbounds %struct.A, %struct.A* %this, i64 0, i32 0 + store i32 (...)** bitcast (i8** getelementptr inbounds ({ [3 x i8*] }, { [3 x i8*] }* @_ZTV1A, i64 0, inrange i32 0, i64 2) to i32 (...)**), i32 (...)*** %0, align 8 + ret void +} + +define hidden void @_ZN1BC2Ev(%struct.B* nocapture %this) { +entry: + %0 = getelementptr inbounds %struct.B, %struct.B* %this, i64 0, i32 0, i32 0 + store i32 (...)** bitcast (i8** getelementptr inbounds ({ [3 x i8*] }, { [3 x i8*] }* @_ZTV1B, i64 0, inrange i32 0, i64 2) to i32 (...)**), i32 (...)*** %0, align 8 + ret void +} + +define hidden i32 @test(%struct.A* %p) { +entry: + %0 = bitcast %struct.A* %p to i8** + %vtable1 = load i8*, i8** %0, align 8 + %1 = tail call { i8*, i1 } @llvm.type.checked.load(i8* %vtable1, i32 0, metadata !"_ZTS1A"), !nosanitize !10 + %2 = extractvalue 
{ i8*, i1 } %1, 0, !nosanitize !10 + %3 = bitcast i8* %2 to i32 (%struct.A*)*, !nosanitize !10 + %call = tail call i32 %3(%struct.A* %p) + ret i32 %call +} + +declare { i8*, i1 } @llvm.type.checked.load(i8*, i32, metadata) #2 + +!0 = !{i64 16, !"_ZTS1A"} +!1 = !{i64 16, !"_ZTSM1AFivE.virtual"} +!2 = !{i64 2} +!3 = !{i64 16, !"_ZTS1B"} +!4 = !{i64 16, !"_ZTSM1BFivE.virtual"} +!10 = !{} Added: llvm/trunk/test/Transforms/GlobalDCE/virtual-functions-base-pointer-call.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/GlobalDCE/virtual-functions-base-pointer-call.ll?rev=374539&view=auto ============================================================================== --- llvm/trunk/test/Transforms/GlobalDCE/virtual-functions-base-pointer-call.ll (added) +++ llvm/trunk/test/Transforms/GlobalDCE/virtual-functions-base-pointer-call.ll Fri Oct 11 04:59:55 2019 @@ -0,0 +1,118 @@ +; RUN: opt < %s -globaldce -S | FileCheck %s + +target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128" + +; struct A { +; A(); +; virtual int foo(int); +; virtual int bar(float); +; }; +; +; struct B : A { +; B(); +; virtual int foo(int); +; virtual int bar(float); +; }; +; +; A::A() {} +; B::B() {} +; int A::foo(int) { return 1; } +; int A::bar(float) { return 2; } +; int B::foo(int) { return 3; } +; int B::bar(float) { return 4; } +; +; extern "C" int test(A *p, int (A::*q)(int)) { return (p->*q)(42); } + +; Member function pointers are tracked by the combination of their object type +; and function type, which must both be compatible. Here, the call is through a +; pointer of type "int (A::*q)(int)", so the call could be dispatched to A::foo +; or B::foo. It can't be dispatched to A::bar or B::bar as the function pointer +; does not match, so those can be removed. 
+ +%struct.A = type { i32 (...)** } +%struct.B = type { %struct.A } + +; CHECK: @_ZTV1A = internal unnamed_addr constant { [4 x i8*] } { [4 x i8*] [i8* null, i8* null, i8* bitcast (i32 (%struct.A*, i32)* @_ZN1A3fooEi to i8*), i8* null] } + at _ZTV1A = internal unnamed_addr constant { [4 x i8*] } { [4 x i8*] [i8* null, i8* null, i8* bitcast (i32 (%struct.A*, i32)* @_ZN1A3fooEi to i8*), i8* bitcast (i32 (%struct.A*, float)* @_ZN1A3barEf to i8*)] }, align 8, !type !0, !type !1, !type !2, !vcall_visibility !3 +; CHECK: @_ZTV1B = internal unnamed_addr constant { [4 x i8*] } { [4 x i8*] [i8* null, i8* null, i8* bitcast (i32 (%struct.B*, i32)* @_ZN1B3fooEi to i8*), i8* null] } + at _ZTV1B = internal unnamed_addr constant { [4 x i8*] } { [4 x i8*] [i8* null, i8* null, i8* bitcast (i32 (%struct.B*, i32)* @_ZN1B3fooEi to i8*), i8* bitcast (i32 (%struct.B*, float)* @_ZN1B3barEf to i8*)] }, align 8, !type !0, !type !1, !type !2, !type !4, !type !5, !type !6, !vcall_visibility !3 + + +; CHECK: define internal i32 @_ZN1A3fooEi( +define internal i32 @_ZN1A3fooEi(%struct.A* nocapture readnone %this, i32) unnamed_addr #1 align 2 { +entry: + ret i32 1 +} + +; CHECK-NOT: define internal i32 @_ZN1A3barEf( +define internal i32 @_ZN1A3barEf(%struct.A* nocapture readnone %this, float) unnamed_addr #1 align 2 { +entry: + ret i32 2 +} + +; CHECK: define internal i32 @_ZN1B3fooEi( +define internal i32 @_ZN1B3fooEi(%struct.B* nocapture readnone %this, i32) unnamed_addr #1 align 2 { +entry: + ret i32 3 +} + +; CHECK-NOT: define internal i32 @_ZN1B3barEf( +define internal i32 @_ZN1B3barEf(%struct.B* nocapture readnone %this, float) unnamed_addr #1 align 2 { +entry: + ret i32 4 +} + + +define hidden void @_ZN1AC2Ev(%struct.A* nocapture %this) { +entry: + %0 = getelementptr inbounds %struct.A, %struct.A* %this, i64 0, i32 0 + store i32 (...)** bitcast (i8** getelementptr inbounds ({ [4 x i8*] }, { [4 x i8*] }* @_ZTV1A, i64 0, inrange i32 0, i64 2) to i32 (...)**), i32 (...)*** %0, align 8 + ret 
void +} + +define hidden void @_ZN1BC2Ev(%struct.B* nocapture %this) { +entry: + %0 = getelementptr inbounds %struct.B, %struct.B* %this, i64 0, i32 0, i32 0 + store i32 (...)** bitcast (i8** getelementptr inbounds ({ [4 x i8*] }, { [4 x i8*] }* @_ZTV1B, i64 0, inrange i32 0, i64 2) to i32 (...)**), i32 (...)*** %0, align 8 + ret void +} + +define hidden i32 @test(%struct.A* %p, i64 %q.coerce0, i64 %q.coerce1) { +entry: + %0 = bitcast %struct.A* %p to i8* + %1 = getelementptr inbounds i8, i8* %0, i64 %q.coerce1 + %this.adjusted = bitcast i8* %1 to %struct.A* + %2 = and i64 %q.coerce0, 1 + %memptr.isvirtual = icmp eq i64 %2, 0 + br i1 %memptr.isvirtual, label %memptr.nonvirtual, label %memptr.virtual + +memptr.virtual: ; preds = %entry + %3 = bitcast i8* %1 to i8** + %vtable = load i8*, i8** %3, align 8 + %4 = add i64 %q.coerce0, -1 + %5 = getelementptr i8, i8* %vtable, i64 %4, !nosanitize !12 + %6 = tail call { i8*, i1 } @llvm.type.checked.load(i8* %5, i32 0, metadata !"_ZTSM1AFiiE.virtual"), !nosanitize !12 + %7 = extractvalue { i8*, i1 } %6, 0, !nosanitize !12 + %memptr.virtualfn = bitcast i8* %7 to i32 (%struct.A*, i32)*, !nosanitize !12 + br label %memptr.end + +memptr.nonvirtual: ; preds = %entry + %memptr.nonvirtualfn = inttoptr i64 %q.coerce0 to i32 (%struct.A*, i32)* + br label %memptr.end + +memptr.end: ; preds = %memptr.nonvirtual, %memptr.virtual + %8 = phi i32 (%struct.A*, i32)* [ %memptr.virtualfn, %memptr.virtual ], [ %memptr.nonvirtualfn, %memptr.nonvirtual ] + %call = tail call i32 %8(%struct.A* %this.adjusted, i32 42) + ret i32 %call +} + +declare { i8*, i1 } @llvm.type.checked.load(i8*, i32, metadata) + +!0 = !{i64 16, !"_ZTS1A"} +!1 = !{i64 16, !"_ZTSM1AFiiE.virtual"} +!2 = !{i64 24, !"_ZTSM1AFifE.virtual"} +!3 = !{i64 2} +!4 = !{i64 16, !"_ZTS1B"} +!5 = !{i64 16, !"_ZTSM1BFiiE.virtual"} +!6 = !{i64 24, !"_ZTSM1BFifE.virtual"} +!12 = !{} Added: llvm/trunk/test/Transforms/GlobalDCE/virtual-functions-derived-call.ll URL: 
http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/GlobalDCE/virtual-functions-derived-call.ll?rev=374539&view=auto ============================================================================== --- llvm/trunk/test/Transforms/GlobalDCE/virtual-functions-derived-call.ll (added) +++ llvm/trunk/test/Transforms/GlobalDCE/virtual-functions-derived-call.ll Fri Oct 11 04:59:55 2019 @@ -0,0 +1,78 @@ +; RUN: opt < %s -globaldce -S | FileCheck %s + +target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128" + +; struct A { +; A(); +; virtual int foo(); +; }; +; +; struct B : A { +; B(); +; virtual int foo(); +; }; +; +; A::A() {} +; B::B() {} +; int A::foo() { return 42; } +; int B::foo() { return 1337; } +; +; extern "C" int test(B *p) { return p->foo(); } + +; The virtual call in test can only be dispatched to B::foo (or a more-derived +; class, if there was one), so A::foo can be removed. + +%struct.A = type { i32 (...)** } +%struct.B = type { %struct.A } + +; CHECK: @_ZTV1A = internal unnamed_addr constant { [3 x i8*] } zeroinitializer + at _ZTV1A = internal unnamed_addr constant { [3 x i8*] } { [3 x i8*] [i8* null, i8* null, i8* bitcast (i32 (%struct.A*)* @_ZN1A3fooEv to i8*)] }, align 8, !type !0, !type !1, !vcall_visibility !2 + +; CHECK: @_ZTV1B = internal unnamed_addr constant { [3 x i8*] } { [3 x i8*] [i8* null, i8* null, i8* bitcast (i32 (%struct.B*)* @_ZN1B3fooEv to i8*)] } + at _ZTV1B = internal unnamed_addr constant { [3 x i8*] } { [3 x i8*] [i8* null, i8* null, i8* bitcast (i32 (%struct.B*)* @_ZN1B3fooEv to i8*)] }, align 8, !type !0, !type !1, !type !3, !type !4, !vcall_visibility !2 + +; CHECK-NOT: define internal i32 @_ZN1A3fooEv( +define internal i32 @_ZN1A3fooEv(%struct.A* nocapture readnone %this) { +entry: + ret i32 42 +} + +; CHECK: define internal i32 @_ZN1B3fooEv( +define internal i32 @_ZN1B3fooEv(%struct.B* nocapture readnone %this) { +entry: + ret i32 1337 +} + +define hidden void @_ZN1AC2Ev(%struct.A* nocapture %this) { +entry: + %0 = 
getelementptr inbounds %struct.A, %struct.A* %this, i64 0, i32 0 + store i32 (...)** bitcast (i8** getelementptr inbounds ({ [3 x i8*] }, { [3 x i8*] }* @_ZTV1A, i64 0, inrange i32 0, i64 2) to i32 (...)**), i32 (...)*** %0, align 8 + ret void +} + +define hidden void @_ZN1BC2Ev(%struct.B* nocapture %this) { +entry: + %0 = getelementptr inbounds %struct.B, %struct.B* %this, i64 0, i32 0, i32 0 + store i32 (...)** bitcast (i8** getelementptr inbounds ({ [3 x i8*] }, { [3 x i8*] }* @_ZTV1B, i64 0, inrange i32 0, i64 2) to i32 (...)**), i32 (...)*** %0, align 8 + ret void +} + +define hidden i32 @test(%struct.B* %p) { +entry: + %0 = bitcast %struct.B* %p to i8** + %vtable1 = load i8*, i8** %0, align 8 + %1 = tail call { i8*, i1 } @llvm.type.checked.load(i8* %vtable1, i32 0, metadata !"_ZTS1B"), !nosanitize !10 + %2 = extractvalue { i8*, i1 } %1, 0, !nosanitize !10 + %3 = bitcast i8* %2 to i32 (%struct.B*)*, !nosanitize !10 + %call = tail call i32 %3(%struct.B* %p) + ret i32 %call +} + +declare { i8*, i1 } @llvm.type.checked.load(i8*, i32, metadata) #2 + +!0 = !{i64 16, !"_ZTS1A"} +!1 = !{i64 16, !"_ZTSM1AFivE.virtual"} +!2 = !{i64 2} +!3 = !{i64 16, !"_ZTS1B"} +!4 = !{i64 16, !"_ZTSM1BFivE.virtual"} +!10 = !{} Added: llvm/trunk/test/Transforms/GlobalDCE/virtual-functions-derived-pointer-call.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/GlobalDCE/virtual-functions-derived-pointer-call.ll?rev=374539&view=auto ============================================================================== --- llvm/trunk/test/Transforms/GlobalDCE/virtual-functions-derived-pointer-call.ll (added) +++ llvm/trunk/test/Transforms/GlobalDCE/virtual-functions-derived-pointer-call.ll Fri Oct 11 04:59:55 2019 @@ -0,0 +1,120 @@ + +; RUN: opt < %s -globaldce -S | FileCheck %s + +target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128" + +; struct A { +; A(); +; virtual int foo(int); +; virtual int bar(float); +; }; +; +; struct B : A { +; B(); +; virtual int foo(int); +; 
virtual int bar(float); +; }; +; +; A::A() {} +; B::B() {} +; int A::foo(int) { return 1; } +; int A::bar(float) { return 2; } +; int B::foo(int) { return 3; } +; int B::bar(float) { return 4; } +; +; extern "C" int test(B *p, int (B::*q)(int)) { return (p->*q)(42); } + +; Member function pointers are tracked by the combination of their object type +; and function type, which must both be compatible. Here, the call is through a +; pointer of type "int (B::*q)(int)", so the call could only be dispatched to +; B::foo. It can't be dispatched to A::bar or B::bar as the function pointer +; does not match, and it can't be dispatched to A::foo as the object type +; doesn't match, so those can be removed. + +%struct.A = type { i32 (...)** } +%struct.B = type { %struct.A } + +; CHECK: @_ZTV1A = internal unnamed_addr constant { [4 x i8*] } zeroinitializer + at _ZTV1A = internal unnamed_addr constant { [4 x i8*] } { [4 x i8*] [i8* null, i8* null, i8* bitcast (i32 (%struct.A*, i32)* @_ZN1A3fooEi to i8*), i8* bitcast (i32 (%struct.A*, float)* @_ZN1A3barEf to i8*)] }, align 8, !type !0, !type !1, !type !2, !vcall_visibility !3 +; CHECK: @_ZTV1B = internal unnamed_addr constant { [4 x i8*] } { [4 x i8*] [i8* null, i8* null, i8* bitcast (i32 (%struct.B*, i32)* @_ZN1B3fooEi to i8*), i8* null] } + at _ZTV1B = internal unnamed_addr constant { [4 x i8*] } { [4 x i8*] [i8* null, i8* null, i8* bitcast (i32 (%struct.B*, i32)* @_ZN1B3fooEi to i8*), i8* bitcast (i32 (%struct.B*, float)* @_ZN1B3barEf to i8*)] }, align 8, !type !0, !type !1, !type !2, !type !4, !type !5, !type !6, !vcall_visibility !3 + + +; CHECK-NOT: define internal i32 @_ZN1A3fooEi( +define internal i32 @_ZN1A3fooEi(%struct.A* nocapture readnone %this, i32) unnamed_addr #1 align 2 { +entry: + ret i32 1 +} + +; CHECK-NOT: define internal i32 @_ZN1A3barEf( +define internal i32 @_ZN1A3barEf(%struct.A* nocapture readnone %this, float) unnamed_addr #1 align 2 { +entry: + ret i32 2 +} + +; CHECK: define internal i32 
@_ZN1B3fooEi( +define internal i32 @_ZN1B3fooEi(%struct.B* nocapture readnone %this, i32) unnamed_addr #1 align 2 { +entry: + ret i32 3 +} + +; CHECK-NOT: define internal i32 @_ZN1B3barEf( +define internal i32 @_ZN1B3barEf(%struct.B* nocapture readnone %this, float) unnamed_addr #1 align 2 { +entry: + ret i32 4 +} + + +define hidden void @_ZN1AC2Ev(%struct.A* nocapture %this) { +entry: + %0 = getelementptr inbounds %struct.A, %struct.A* %this, i64 0, i32 0 + store i32 (...)** bitcast (i8** getelementptr inbounds ({ [4 x i8*] }, { [4 x i8*] }* @_ZTV1A, i64 0, inrange i32 0, i64 2) to i32 (...)**), i32 (...)*** %0, align 8 + ret void +} + +define hidden void @_ZN1BC2Ev(%struct.B* nocapture %this) { +entry: + %0 = getelementptr inbounds %struct.B, %struct.B* %this, i64 0, i32 0, i32 0 + store i32 (...)** bitcast (i8** getelementptr inbounds ({ [4 x i8*] }, { [4 x i8*] }* @_ZTV1B, i64 0, inrange i32 0, i64 2) to i32 (...)**), i32 (...)*** %0, align 8 + ret void +} + +define hidden i32 @test(%struct.B* %p, i64 %q.coerce0, i64 %q.coerce1) { +entry: + %0 = bitcast %struct.B* %p to i8* + %1 = getelementptr inbounds i8, i8* %0, i64 %q.coerce1 + %this.adjusted = bitcast i8* %1 to %struct.B* + %2 = and i64 %q.coerce0, 1 + %memptr.isvirtual = icmp eq i64 %2, 0 + br i1 %memptr.isvirtual, label %memptr.nonvirtual, label %memptr.virtual + +memptr.virtual: ; preds = %entry + %3 = bitcast i8* %1 to i8** + %vtable = load i8*, i8** %3, align 8 + %4 = add i64 %q.coerce0, -1 + %5 = getelementptr i8, i8* %vtable, i64 %4, !nosanitize !12 + %6 = tail call { i8*, i1 } @llvm.type.checked.load(i8* %5, i32 0, metadata !"_ZTSM1BFiiE.virtual"), !nosanitize !12 + %7 = extractvalue { i8*, i1 } %6, 0, !nosanitize !12 + %memptr.virtualfn = bitcast i8* %7 to i32 (%struct.B*, i32)*, !nosanitize !12 + br label %memptr.end + +memptr.nonvirtual: ; preds = %entry + %memptr.nonvirtualfn = inttoptr i64 %q.coerce0 to i32 (%struct.B*, i32)* + br label %memptr.end + +memptr.end: ; preds = %memptr.nonvirtual, 
%memptr.virtual + %8 = phi i32 (%struct.B*, i32)* [ %memptr.virtualfn, %memptr.virtual ], [ %memptr.nonvirtualfn, %memptr.nonvirtual ] + %call = tail call i32 %8(%struct.B* %this.adjusted, i32 42) + ret i32 %call +} + +declare { i8*, i1 } @llvm.type.checked.load(i8*, i32, metadata) + +!0 = !{i64 16, !"_ZTS1A"} +!1 = !{i64 16, !"_ZTSM1AFiiE.virtual"} +!2 = !{i64 24, !"_ZTSM1AFifE.virtual"} +!3 = !{i64 2} +!4 = !{i64 16, !"_ZTS1B"} +!5 = !{i64 16, !"_ZTSM1BFiiE.virtual"} +!6 = !{i64 24, !"_ZTSM1BFifE.virtual"} +!12 = !{} Added: llvm/trunk/test/Transforms/GlobalDCE/virtual-functions-visibility-post-lto.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/GlobalDCE/virtual-functions-visibility-post-lto.ll?rev=374539&view=auto ============================================================================== --- llvm/trunk/test/Transforms/GlobalDCE/virtual-functions-visibility-post-lto.ll (added) +++ llvm/trunk/test/Transforms/GlobalDCE/virtual-functions-visibility-post-lto.ll Fri Oct 11 04:59:55 2019 @@ -0,0 +1,95 @@ +; RUN: opt < %s -globaldce -S | FileCheck %s + +; structs A, B and C have vcall_visibility of public, linkage-unit and +; translation-unit respectively. This test is run after LTO linking (the +; LTOPostLink metadata is present), so B and C can be VFE'd. 
+ +target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128" + +%struct.A = type { i32 (...)** } + + at _ZTV1A = hidden unnamed_addr constant { [3 x i8*] } { [3 x i8*] [i8* null, i8* null, i8* bitcast (void (%struct.A*)* @_ZN1A3fooEv to i8*)] }, align 8, !type !0, !type !1, !vcall_visibility !2 + +define internal void @_ZN1AC2Ev(%struct.A* %this) { +entry: + %0 = getelementptr inbounds %struct.A, %struct.A* %this, i64 0, i32 0 + store i32 (...)** bitcast (i8** getelementptr inbounds ({ [3 x i8*] }, { [3 x i8*] }* @_ZTV1A, i64 0, inrange i32 0, i64 2) to i32 (...)**), i32 (...)*** %0, align 8 + ret void +} + +; CHECK: define {{.*}} @_ZN1A3fooEv( +define internal void @_ZN1A3fooEv(%struct.A* nocapture %this) { +entry: + ret void +} + +define dso_local i8* @_Z6make_Av() { +entry: + %call = tail call i8* @_Znwm(i64 8) + %0 = bitcast i8* %call to %struct.A* + tail call void @_ZN1AC2Ev(%struct.A* %0) + ret i8* %call +} + + +%struct.B = type { i32 (...)** } + + at _ZTV1B = hidden unnamed_addr constant { [3 x i8*] } { [3 x i8*] [i8* null, i8* null, i8* bitcast (void (%struct.B*)* @_ZN1B3fooEv to i8*)] }, align 8, !type !0, !type !1, !vcall_visibility !3 + +define internal void @_ZN1BC2Ev(%struct.B* %this) { +entry: + %0 = getelementptr inbounds %struct.B, %struct.B* %this, i64 0, i32 0 + store i32 (...)** bitcast (i8** getelementptr inbounds ({ [3 x i8*] }, { [3 x i8*] }* @_ZTV1B, i64 0, inrange i32 0, i64 2) to i32 (...)**), i32 (...)*** %0, align 8 + ret void +} + +; CHECK-NOT: define {{.*}} @_ZN1B3fooEv( +define internal void @_ZN1B3fooEv(%struct.B* nocapture %this) { +entry: + ret void +} + +define dso_local i8* @_Z6make_Bv() { +entry: + %call = tail call i8* @_Znwm(i64 8) + %0 = bitcast i8* %call to %struct.B* + tail call void @_ZN1BC2Ev(%struct.B* %0) + ret i8* %call +} + + +%struct.C = type { i32 (...)** } + + at _ZTV1C = hidden unnamed_addr constant { [3 x i8*] } { [3 x i8*] [i8* null, i8* null, i8* bitcast (void (%struct.C*)* @_ZN1C3fooEv to i8*)] 
}, align 8, !type !0, !type !1, !vcall_visibility !4 + +define internal void @_ZN1CC2Ev(%struct.C* %this) { +entry: + %0 = getelementptr inbounds %struct.C, %struct.C* %this, i64 0, i32 0 + store i32 (...)** bitcast (i8** getelementptr inbounds ({ [3 x i8*] }, { [3 x i8*] }* @_ZTV1C, i64 0, inrange i32 0, i64 2) to i32 (...)**), i32 (...)*** %0, align 8 + ret void +} + +; CHECK-NOT: define {{.*}} @_ZN1C3fooEv( +define internal void @_ZN1C3fooEv(%struct.C* nocapture %this) { +entry: + ret void +} + +define dso_local i8* @_Z6make_Cv() { +entry: + %call = tail call i8* @_Znwm(i64 8) + %0 = bitcast i8* %call to %struct.C* + tail call void @_ZN1CC2Ev(%struct.C* %0) + ret i8* %call +} + +declare dso_local noalias nonnull i8* @_Znwm(i64) + +!llvm.module.flags = !{!5} + +!0 = !{i64 16, !"_ZTS1A"} +!1 = !{i64 16, !"_ZTSM1AFvvE.virtual"} +!2 = !{i64 0} ; public vcall visibility +!3 = !{i64 1} ; linkage-unit vcall visibility +!4 = !{i64 2} ; translation-unit vcall visibility +!5 = !{i32 1, !"LTOPostLink", i32 1} Added: llvm/trunk/test/Transforms/GlobalDCE/virtual-functions-visibility-pre-lto.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/GlobalDCE/virtual-functions-visibility-pre-lto.ll?rev=374539&view=auto ============================================================================== --- llvm/trunk/test/Transforms/GlobalDCE/virtual-functions-visibility-pre-lto.ll (added) +++ llvm/trunk/test/Transforms/GlobalDCE/virtual-functions-visibility-pre-lto.ll Fri Oct 11 04:59:55 2019 @@ -0,0 +1,94 @@ +; RUN: opt < %s -globaldce -S | FileCheck %s + +; structs A, B and C have vcall_visibility of public, linkage-unit and +; translation-unit respectively. This test is run before LTO linking occurs +; (the LTOPostLink metadata is not present), so only C can be VFE'd. 
+ +target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128" + +%struct.A = type { i32 (...)** } + + at _ZTV1A = hidden unnamed_addr constant { [3 x i8*] } { [3 x i8*] [i8* null, i8* null, i8* bitcast (void (%struct.A*)* @_ZN1A3fooEv to i8*)] }, align 8, !type !0, !type !1, !vcall_visibility !2 + +define internal void @_ZN1AC2Ev(%struct.A* %this) { +entry: + %0 = getelementptr inbounds %struct.A, %struct.A* %this, i64 0, i32 0 + store i32 (...)** bitcast (i8** getelementptr inbounds ({ [3 x i8*] }, { [3 x i8*] }* @_ZTV1A, i64 0, inrange i32 0, i64 2) to i32 (...)**), i32 (...)*** %0, align 8 + ret void +} + +; CHECK: define {{.*}} @_ZN1A3fooEv( +define internal void @_ZN1A3fooEv(%struct.A* nocapture %this) { +entry: + ret void +} + +define dso_local i8* @_Z6make_Av() { +entry: + %call = tail call i8* @_Znwm(i64 8) + %0 = bitcast i8* %call to %struct.A* + tail call void @_ZN1AC2Ev(%struct.A* %0) + ret i8* %call +} + + +%struct.B = type { i32 (...)** } + + at _ZTV1B = hidden unnamed_addr constant { [3 x i8*] } { [3 x i8*] [i8* null, i8* null, i8* bitcast (void (%struct.B*)* @_ZN1B3fooEv to i8*)] }, align 8, !type !0, !type !1, !vcall_visibility !3 + +define internal void @_ZN1BC2Ev(%struct.B* %this) { +entry: + %0 = getelementptr inbounds %struct.B, %struct.B* %this, i64 0, i32 0 + store i32 (...)** bitcast (i8** getelementptr inbounds ({ [3 x i8*] }, { [3 x i8*] }* @_ZTV1B, i64 0, inrange i32 0, i64 2) to i32 (...)**), i32 (...)*** %0, align 8 + ret void +} + +; CHECK: define {{.*}} @_ZN1B3fooEv( +define internal void @_ZN1B3fooEv(%struct.B* nocapture %this) { +entry: + ret void +} + +define dso_local i8* @_Z6make_Bv() { +entry: + %call = tail call i8* @_Znwm(i64 8) + %0 = bitcast i8* %call to %struct.B* + tail call void @_ZN1BC2Ev(%struct.B* %0) + ret i8* %call +} + + +%struct.C = type { i32 (...)** } + + at _ZTV1C = hidden unnamed_addr constant { [3 x i8*] } { [3 x i8*] [i8* null, i8* null, i8* bitcast (void (%struct.C*)* @_ZN1C3fooEv to i8*)] }, 
align 8, !type !0, !type !1, !vcall_visibility !4 + +define internal void @_ZN1CC2Ev(%struct.C* %this) { +entry: + %0 = getelementptr inbounds %struct.C, %struct.C* %this, i64 0, i32 0 + store i32 (...)** bitcast (i8** getelementptr inbounds ({ [3 x i8*] }, { [3 x i8*] }* @_ZTV1C, i64 0, inrange i32 0, i64 2) to i32 (...)**), i32 (...)*** %0, align 8 + ret void +} + +; CHECK-NOT: define {{.*}} @_ZN1C3fooEv( +define internal void @_ZN1C3fooEv(%struct.C* nocapture %this) { +entry: + ret void +} + +define dso_local i8* @_Z6make_Cv() { +entry: + %call = tail call i8* @_Znwm(i64 8) + %0 = bitcast i8* %call to %struct.C* + tail call void @_ZN1CC2Ev(%struct.C* %0) + ret i8* %call +} + +declare dso_local noalias nonnull i8* @_Znwm(i64) + +!llvm.module.flags = !{} + +!0 = !{i64 16, !"_ZTS1A"} +!1 = !{i64 16, !"_ZTSM1AFvvE.virtual"} +!2 = !{i64 0} ; public vcall visibility +!3 = !{i64 1} ; linkage-unit vcall visibility +!4 = !{i64 2} ; translation-unit vcall visibility Added: llvm/trunk/test/Transforms/GlobalDCE/virtual-functions.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/GlobalDCE/virtual-functions.ll?rev=374539&view=auto ============================================================================== --- llvm/trunk/test/Transforms/GlobalDCE/virtual-functions.ll (added) +++ llvm/trunk/test/Transforms/GlobalDCE/virtual-functions.ll Fri Oct 11 04:59:55 2019 @@ -0,0 +1,55 @@ +; RUN: opt < %s -globaldce -S | FileCheck %s + +target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128" + +declare dso_local noalias nonnull i8* @_Znwm(i64) +declare { i8*, i1 } @llvm.type.checked.load(i8*, i32, metadata) + +; %struct.A is a C++ struct with two virtual functions, A::foo and A::bar. The +; !vcall_visibility metadata is set on the vtable, so we know that all virtual +; calls through this vtable are visible and use the @llvm.type.checked.load +; intrinsic. 
Function test_A makes a call to A::foo, but there is no call to +; A::bar anywhere, so A::bar can be deleted, and its vtable slot replaced with +; null. + +%struct.A = type { i32 (...)** } + +; The pointer to A::bar in the vtable can be removed, because it will never be +; loaded. We replace it with null to keep the layout the same. Because it is at +; the end of the vtable we could potentially shrink the vtable, but don't +; currently do that. +; CHECK: @_ZTV1A = internal unnamed_addr constant { [4 x i8*] } { [4 x i8*] [i8* null, i8* null, i8* bitcast (i32 (%struct.A*)* @_ZN1A3fooEv to i8*), i8* null] } + at _ZTV1A = internal unnamed_addr constant { [4 x i8*] } { [4 x i8*] [i8* null, i8* null, i8* bitcast (i32 (%struct.A*)* @_ZN1A3fooEv to i8*), i8* bitcast (i32 (%struct.A*)* @_ZN1A3barEv to i8*)] }, align 8, !type !0, !type !1, !type !2, !vcall_visibility !3 + +; A::foo is called, so must be retained. +; CHECK: define internal i32 @_ZN1A3fooEv( +define internal i32 @_ZN1A3fooEv(%struct.A* nocapture readnone %this) { +entry: + ret i32 42 +} + +; A::bar is not used, so can be deleted. 
+; CHECK-NOT: define internal i32 @_ZN1A3barEv( +define internal i32 @_ZN1A3barEv(%struct.A* nocapture readnone %this) { +entry: + ret i32 1337 +} + +define dso_local i32 @test_A() { +entry: + %call = tail call i8* @_Znwm(i64 8) + %0 = bitcast i8* %call to %struct.A* + %1 = bitcast i8* %call to i32 (...)*** + store i32 (...)** bitcast (i8** getelementptr inbounds ({ [4 x i8*] }, { [4 x i8*] }* @_ZTV1A, i64 0, inrange i32 0, i64 2) to i32 (...)**), i32 (...)*** %1, align 8 + %2 = tail call { i8*, i1 } @llvm.type.checked.load(i8* bitcast (i8** getelementptr inbounds ({ [4 x i8*] }, { [4 x i8*] }* @_ZTV1A, i64 0, inrange i32 0, i64 2) to i8*), i32 0, metadata !"_ZTS1A"), !nosanitize !9 + %3 = extractvalue { i8*, i1 } %2, 0, !nosanitize !9 + %4 = bitcast i8* %3 to i32 (%struct.A*)*, !nosanitize !9 + %call1 = tail call i32 %4(%struct.A* nonnull %0) + ret i32 %call1 +} + +!0 = !{i64 16, !"_ZTS1A"} +!1 = !{i64 16, !"_ZTSM1AFivE.virtual"} +!2 = !{i64 24, !"_ZTSM1AFivE.virtual"} +!3 = !{i64 2} +!9 = !{} Added: llvm/trunk/test/Transforms/GlobalDCE/vtable-rtti.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/GlobalDCE/vtable-rtti.ll?rev=374539&view=auto ============================================================================== --- llvm/trunk/test/Transforms/GlobalDCE/vtable-rtti.ll (added) +++ llvm/trunk/test/Transforms/GlobalDCE/vtable-rtti.ll Fri Oct 11 04:59:55 2019 @@ -0,0 +1,47 @@ +; RUN: opt < %s -globaldce -S | FileCheck %s + +; We currently only use llvm.type.checked.load for virtual function pointers, +; not any other part of the vtable, so we can't remove the RTTI pointer even if +; it's never going to be loaded from. 
+ +target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128" + +%struct.A = type { i32 (...)** } + +; CHECK: @_ZTV1A = hidden unnamed_addr constant { [3 x i8*] } { [3 x i8*] [i8* null, i8* bitcast ({ i8*, i8* }* @_ZTI1A to i8*), i8* null] }, align 8, !type !0, !type !1, !vcall_visibility !2 + + at _ZTV1A = hidden unnamed_addr constant { [3 x i8*] } { [3 x i8*] [i8* null, i8* bitcast ({ i8*, i8* }* @_ZTI1A to i8*), i8* bitcast (void (%struct.A*)* @_ZN1A3fooEv to i8*)] }, align 8, !type !0, !type !1, !vcall_visibility !2 + at _ZTS1A = hidden constant [3 x i8] c"1A\00", align 1 + at _ZTI1A = hidden constant { i8*, i8* } { i8* bitcast (i8** getelementptr inbounds (i8*, i8** @_ZTVN10__cxxabiv117__class_type_infoE, i64 2) to i8*), i8* getelementptr inbounds ([3 x i8], [3 x i8]* @_ZTS1A, i32 0, i32 0) }, align 8 + +define internal void @_ZN1AC2Ev(%struct.A* %this) { +entry: + %0 = getelementptr inbounds %struct.A, %struct.A* %this, i64 0, i32 0 + store i32 (...)** bitcast (i8** getelementptr inbounds ({ [3 x i8*] }, { [3 x i8*] }* @_ZTV1A, i64 0, inrange i32 0, i64 2) to i32 (...)**), i32 (...)*** %0, align 8 + ret void +} + +; CHECK-NOT: define {{.*}} @_ZN1A3fooEv( +define internal void @_ZN1A3fooEv(%struct.A* nocapture %this) { +entry: + ret void +} + +define dso_local i8* @_Z6make_Av() { +entry: + %call = tail call i8* @_Znwm(i64 8) + %0 = bitcast i8* %call to %struct.A* + tail call void @_ZN1AC2Ev(%struct.A* %0) + ret i8* %call +} + + +declare dso_local noalias nonnull i8* @_Znwm(i64) + at _ZTVN10__cxxabiv117__class_type_infoE = external dso_local global i8* + +!llvm.module.flags = !{!3} + +!0 = !{i64 16, !"_ZTS1A"} +!1 = !{i64 16, !"_ZTSM1AFvvE.virtual"} +!2 = !{i64 2} ; translation-unit vcall visibility +!3 = !{i32 1, !"LTOPostLink", i32 1} Added: llvm/trunk/test/Transforms/Internalize/vcall-visibility.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/Internalize/vcall-visibility.ll?rev=374539&view=auto 
============================================================================== --- llvm/trunk/test/Transforms/Internalize/vcall-visibility.ll (added) +++ llvm/trunk/test/Transforms/Internalize/vcall-visibility.ll Fri Oct 11 04:59:55 2019 @@ -0,0 +1,64 @@ +; RUN: opt < %s -internalize -S | FileCheck %s + +%struct.A = type { i32 (...)** } +%struct.B = type { i32 (...)** } +%struct.C = type { i32 (...)** } + +; Class A has default visibility, so has no !vcall_visibility metadata before +; or after LTO. +; CHECK-NOT: @_ZTV1A = {{.*}}!vcall_visibility +@_ZTV1A = dso_local unnamed_addr constant { [3 x i8*] } { [3 x i8*] [i8* null, i8* null, i8* bitcast (void (%struct.A*)* @_ZN1A3fooEv to i8*)] }, align 8, !type !0, !type !1 + +; Class B has hidden visibility but public LTO visibility, so has no +; !vcall_visibility metadata before or after LTO. +; CHECK-NOT: @_ZTV1B = {{.*}}!vcall_visibility +@_ZTV1B = hidden unnamed_addr constant { [3 x i8*] } { [3 x i8*] [i8* null, i8* null, i8* bitcast (void (%struct.B*)* @_ZN1B3fooEv to i8*)] }, align 8, !type !2, !type !3 + +; Class C has hidden visibility, so the !vcall_visibility metadata is set to 1 +; (linkage unit) before LTO, and 2 (translation unit) after LTO. +; CHECK: @_ZTV1C ={{.*}}!vcall_visibility [[MD_TU_VIS:![0-9]+]] +@_ZTV1C = hidden unnamed_addr constant { [3 x i8*] } { [3 x i8*] [i8* null, i8* null, i8* bitcast (void (%struct.C*)* @_ZN1C3fooEv to i8*)] }, align 8, !type !4, !type !5, !vcall_visibility !6 + +; Class D has translation unit visibility before LTO, and this is not changed +; by LTO.
+; CHECK: @_ZTVN12_GLOBAL__N_11DE = {{.*}}!vcall_visibility [[MD_TU_VIS:![0-9]+]] +@_ZTVN12_GLOBAL__N_11DE = internal unnamed_addr constant { [3 x i8*] } zeroinitializer, align 8, !type !7, !type !9, !vcall_visibility !11 + +define dso_local void @_ZN1A3fooEv(%struct.A* nocapture %this) { +entry: + ret void +} + +define hidden void @_ZN1B3fooEv(%struct.B* nocapture %this) { +entry: + ret void +} + +define hidden void @_ZN1C3fooEv(%struct.C* nocapture %this) { +entry: + ret void +} + +define hidden noalias nonnull i8* @_Z6make_dv() { +entry: + %call = tail call i8* @_Znwm(i64 8) #3 + %0 = bitcast i8* %call to i32 (...)*** + store i32 (...)** bitcast (i8** getelementptr inbounds ({ [3 x i8*] }, { [3 x i8*] }* @_ZTVN12_GLOBAL__N_11DE, i64 0, inrange i32 0, i64 2) to i32 (...)**), i32 (...)*** %0, align 8 + ret i8* %call +} + +declare dso_local noalias nonnull i8* @_Znwm(i64) + +; CHECK: [[MD_TU_VIS]] = !{i64 2} +!0 = !{i64 16, !"_ZTS1A"} +!1 = !{i64 16, !"_ZTSM1AFvvE.virtual"} +!2 = !{i64 16, !"_ZTS1B"} +!3 = !{i64 16, !"_ZTSM1BFvvE.virtual"} +!4 = !{i64 16, !"_ZTS1C"} +!5 = !{i64 16, !"_ZTSM1CFvvE.virtual"} +!6 = !{i64 1} +!7 = !{i64 16, !8} +!8 = distinct !{} +!9 = !{i64 16, !10} +!10 = distinct !{} +!11 = !{i64 2} From llvm-commits at lists.llvm.org Fri Oct 11 05:07:58 2019 From: llvm-commits at lists.llvm.org (Joerg Sonnenberger via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 12:07:58 +0000 (UTC) Subject: [PATCH] D65280: Add a pass to lower is.constant and objectsize intrinsics In-Reply-To: References: Message-ID: <728f873fac4141dca19abc03b9147c0f@localhost.localdomain> joerg added inline comments. ================ Comment at: lib/Transforms/Scalar/LowerConstantIntrinsics.cpp:105-106 + } + if (Worklist.empty()) + return false; + ---------------- chandlerc wrote: > FWIW, this doesn't skip anything, the loop has the same behavior. It was primarily to get the correct return value, but I'm changing it to push the check to the final return.
================ Comment at: lib/Transforms/Scalar/LowerConstantIntrinsics.cpp:112-117 + if (!II) + continue; + Value *NewValue; + switch (II->getIntrinsicID()) { + default: + continue; ---------------- chandlerc wrote: > For both the `II` thing and the `default` case -- do we really expect these to ever fail? > > I would expect either the VH to be null, or for it to definitively be one of the two intrinsics we added. Maybe switch to `cast_or_null` above with `VN.get()` or some such, and llvm_unreachable on the default case. Yes, the same concerns as with the earlier version still apply. The recursive simplification can change the instruction type in place or remove it. The logic is still simpler since no new instructions can appear. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D65280/new/ https://reviews.llvm.org/D65280 From llvm-commits at lists.llvm.org Fri Oct 11 05:08:05 2019 From: llvm-commits at lists.llvm.org (David Zarzycki via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 12:08:05 +0000 (UTC) Subject: [PATCH] D68632: [X86] Make memcmp() use PTEST if possible and also enable AVX1 In-Reply-To: References: Message-ID: <97cc839cae709ef468453e51cb017d40@localhost.localdomain> davezarzycki updated this revision to Diff 224566. davezarzycki marked an inline comment as done and 2 inline comments as not done. davezarzycki added a comment. I believe I've incorporated all of the feedback so far. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68632/new/ https://reviews.llvm.org/D68632 Files: lib/CodeGen/SelectionDAG/DAGCombiner.cpp lib/Target/X86/X86ISelLowering.cpp test/CodeGen/X86/memcmp-minsize.ll test/CodeGen/X86/memcmp-optsize.ll test/CodeGen/X86/memcmp.ll test/CodeGen/X86/setcc-wide-types.ll -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68632.224566.patch Type: text/x-patch Size: 45593 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Fri Oct 11 05:11:03 2019 From: llvm-commits at lists.llvm.org (Oliver Stannard (Linaro) via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 12:11:03 +0000 (UTC) Subject: [PATCH] D63932: [GlobalDCE] Dead Virtual Function Elimination In-Reply-To: References: Message-ID: This revision was automatically updated to reflect the committed changes. Closed by commit rG9f6a873268e1: Dead Virtual Function Elimination (authored by ostannard). Changed prior to commit: https://reviews.llvm.org/D63932?vs=218363&id=224568#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D63932/new/ https://reviews.llvm.org/D63932 Files: clang/include/clang/Basic/CodeGenOptions.def clang/include/clang/Driver/Options.td clang/lib/CodeGen/CGClass.cpp clang/lib/CodeGen/CGVTables.cpp clang/lib/CodeGen/CodeGenModule.h clang/lib/CodeGen/ItaniumCXXABI.cpp clang/lib/Driver/ToolChains/Clang.cpp clang/lib/Frontend/CompilerInvocation.cpp clang/test/CodeGenCXX/vcall-visibility-metadata.cpp clang/test/CodeGenCXX/virtual-function-elimination.cpp clang/test/Driver/virtual-function-elimination.cpp llvm/docs/LangRef.rst llvm/docs/TypeMetadata.rst llvm/include/llvm/Analysis/TypeMetadataUtils.h llvm/include/llvm/IR/FixedMetadataKinds.def llvm/include/llvm/IR/GlobalObject.h llvm/include/llvm/Transforms/IPO/GlobalDCE.h llvm/lib/Analysis/TypeMetadataUtils.cpp llvm/lib/IR/Metadata.cpp llvm/lib/LTO/LTO.cpp llvm/lib/LTO/LTOCodeGenerator.cpp llvm/lib/Transforms/IPO/GlobalDCE.cpp llvm/lib/Transforms/IPO/WholeProgramDevirt.cpp llvm/test/LTO/ARM/lto-linking-metadata.ll llvm/test/ThinLTO/X86/lazyload_metadata.ll llvm/test/Transforms/GlobalDCE/virtual-functions-base-call.ll llvm/test/Transforms/GlobalDCE/virtual-functions-base-pointer-call.ll llvm/test/Transforms/GlobalDCE/virtual-functions-derived-call.ll 
llvm/test/Transforms/GlobalDCE/virtual-functions-derived-pointer-call.ll llvm/test/Transforms/GlobalDCE/virtual-functions-visibility-post-lto.ll llvm/test/Transforms/GlobalDCE/virtual-functions-visibility-pre-lto.ll llvm/test/Transforms/GlobalDCE/virtual-functions.ll llvm/test/Transforms/GlobalDCE/vtable-rtti.ll llvm/test/Transforms/Internalize/vcall-visibility.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D63932.224568.patch Type: text/x-patch Size: 75968 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Fri Oct 11 05:11:04 2019 From: llvm-commits at lists.llvm.org (Piotr Sobczak via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 12:11:04 +0000 (UTC) Subject: [PATCH] D68865: [InstCombine][AMDGPU] Fix crash with v3i16/v3f16 buffer intrinsics Message-ID: piotr created this revision. Herald added subscribers: llvm-commits, t-tye, tpr, dstuttard, yaxunl, nhaehnle, wdng, jvesely, kzhuravl. Herald added a project: LLVM. This is something of a workaround to avoid a crash later on in type legalizer (WidenVectorResult()). Also added some f16 tests, including a non-working v3f16 case with a FIXME. 
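For readers following along without an LLVM checkout, the condition this workaround adds can be modelled roughly as below. This is only a sketch under stated assumptions: `activeBits` and `skipsThreeWide16BitLoad` are hypothetical names (not LLVM APIs) standing in for `APInt::getActiveBits()` and the new early return, and the demanded-elements mask is represented as a plain `uint64_t` rather than an `APInt`.

```cpp
#include <cassert>  // only needed if you assert on the results
#include <cstdint>

// Hypothetical stand-in for APInt::getActiveBits(): the number of bits
// needed to represent the value (index of the highest set bit, plus one).
unsigned activeBits(std::uint64_t Mask) {
  unsigned Bits = 0;
  while (Mask != 0) {
    ++Bits;
    Mask >>= 1;
  }
  return Bits;
}

// Hypothetical model of the guard from the patch: when there is no dmask
// operand (DMaskIdx < 0), the element type is 16 bits wide, and the highest
// demanded element is element 2, shortening the load would need a 3-element
// vector of 16-bit values, so the simplification is skipped.
bool skipsThreeWide16BitLoad(int DMaskIdx, unsigned ScalarSizeInBits,
                             std::uint64_t DemandedElts) {
  return DMaskIdx < 0 && ScalarSizeInBits == 16 &&
         activeBits(DemandedElts) == 3;
}
```

In this model, demanding only element 2 of a 16-bit vector (mask 0x4) hits the guard, which lines up with the extract_elt2_raw_tbuffer_load_v4f16 test keeping the full v4f16 load, while demanding only element 1 (mask 0x2) does not, which lines up with the test that expects a v2f16 load.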
Repository: rL LLVM https://reviews.llvm.org/D68865 Files: lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp test/Transforms/InstCombine/AMDGPU/amdgcn-demanded-vector-elts.ll Index: test/Transforms/InstCombine/AMDGPU/amdgcn-demanded-vector-elts.ll =================================================================== --- test/Transforms/InstCombine/AMDGPU/amdgcn-demanded-vector-elts.ll +++ test/Transforms/InstCombine/AMDGPU/amdgcn-demanded-vector-elts.ll @@ -1474,6 +1474,51 @@ declare <4 x i32> @llvm.amdgcn.raw.tbuffer.load.v4i32(<4 x i32>, i32, i32, i32, i32) #1 +; CHECK-LABEL: @extract_elt3_raw_tbuffer_load_v4f16( +; CHECK-NEXT: %data = call <4 x half> @llvm.amdgcn.raw.tbuffer.load.v4f16(<4 x i32> %rsrc, i32 %arg0, i32 %arg1, i32 78, i32 0) +; CHECK-NEXT: %elt1 = extractelement <4 x half> %data, i32 3 +; CHECK-NEXT: ret half %elt1 +define amdgpu_ps half @extract_elt3_raw_tbuffer_load_v4f16(<4 x i32> inreg %rsrc, i32 %arg0, i32 inreg %arg1) #0 { + %data = call <4 x half> @llvm.amdgcn.raw.tbuffer.load.v4f16(<4 x i32> %rsrc, i32 %arg0, i32 %arg1, i32 78, i32 0) + %elt1 = extractelement <4 x half> %data, i32 3 + ret half %elt1 +} + +; FIXME: Enable load shortening when full support for v3f16 has been added (should expect call <3 x half> @llvm.amdgcn.raw.tbuffer.load.v3f16). 
+; CHECK-LABEL: @extract_elt2_raw_tbuffer_load_v4f16( +; CHECK-NEXT: %data = call <4 x half> @llvm.amdgcn.raw.tbuffer.load.v4f16(<4 x i32> %rsrc, i32 %arg0, i32 %arg1, i32 78, i32 0) +; CHECK-NEXT: %elt1 = extractelement <4 x half> %data, i32 2 +; CHECK-NEXT: ret half %elt1 +define amdgpu_ps half @extract_elt2_raw_tbuffer_load_v4f16(<4 x i32> inreg %rsrc, i32 %arg0, i32 inreg %arg1) #0 { + %data = call <4 x half> @llvm.amdgcn.raw.tbuffer.load.v4f16(<4 x i32> %rsrc, i32 %arg0, i32 %arg1, i32 78, i32 0) + %elt1 = extractelement <4 x half> %data, i32 2 + ret half %elt1 +} + +; CHECK-LABEL: @extract_elt1_raw_tbuffer_load_v4f16( +; CHECK-NEXT: %data = call <2 x half> @llvm.amdgcn.raw.tbuffer.load.v2f16(<4 x i32> %rsrc, i32 %arg0, i32 %arg1, i32 78, i32 0) +; CHECK-NEXT: %elt1 = extractelement <2 x half> %data, i32 1 +; CHECK-NEXT: ret half %elt1 +define amdgpu_ps half @extract_elt1_raw_tbuffer_load_v4f16(<4 x i32> inreg %rsrc, i32 %arg0, i32 inreg %arg1) #0 { + %data = call <4 x half> @llvm.amdgcn.raw.tbuffer.load.v4f16(<4 x i32> %rsrc, i32 %arg0, i32 %arg1, i32 78, i32 0) + %elt1 = extractelement <4 x half> %data, i32 1 + ret half %elt1 +} + +; CHECK-LABEL: @extract_elt0_raw_tbuffer_load_v4f16( +; CHECK-NEXT: %data = call half @llvm.amdgcn.raw.tbuffer.load.f16(<4 x i32> %rsrc, i32 %arg0, i32 %arg1, i32 78, i32 0) +; CHECK-NEXT: ret half %data +define amdgpu_ps half @extract_elt0_raw_tbuffer_load_v4f16(<4 x i32> inreg %rsrc, i32 %arg0, i32 inreg %arg1) #0 { + %data = call <4 x half> @llvm.amdgcn.raw.tbuffer.load.v4f16(<4 x i32> %rsrc, i32 %arg0, i32 %arg1, i32 78, i32 0) + %elt1 = extractelement <4 x half> %data, i32 0 + ret half %elt1 +} + +declare half @llvm.amdgcn.raw.tbuffer.load.f16(<4 x i32>, i32, i32, i32, i32) #1 +declare <2 x half> @llvm.amdgcn.raw.tbuffer.load.v2f16(<4 x i32>, i32, i32, i32, i32) #1 +declare <3 x half> @llvm.amdgcn.raw.tbuffer.load.v3f16(<4 x i32>, i32, i32, i32, i32) #1 +declare <4 x half> @llvm.amdgcn.raw.tbuffer.load.v4f16(<4 x i32>, i32, 
i32, i32, i32) #1 + ; -------------------------------------------------------------------- ; llvm.amdgcn.struct.tbuffer.load ; -------------------------------------------------------------------- Index: lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp =================================================================== --- lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp +++ lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp @@ -971,6 +971,13 @@ Value *InstCombiner::simplifyAMDGCNMemoryIntrinsicDemanded(IntrinsicInst *II, APInt DemandedElts, int DMaskIdx) { + + // FIXME: Allow v3i16/v3f16 in buffer intrinsics when the types are fully supported. + if (DMaskIdx < 0 && + II->getType()->getScalarSizeInBits() == 16 && + DemandedElts.getActiveBits() == 3) + return nullptr; + unsigned VWidth = II->getType()->getVectorNumElements(); if (VWidth == 1) return nullptr; -------------- next part -------------- A non-text attachment was scrubbed... Name: D68865.224567.patch Type: text/x-patch Size: 4176 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Fri Oct 11 05:18:29 2019 From: llvm-commits at lists.llvm.org (Piotr Sobczak via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 12:18:29 +0000 (UTC) Subject: [PATCH] D68865: [InstCombine][AMDGPU] Fix crash with v3i16/v3f16 buffer intrinsics In-Reply-To: References: Message-ID: piotr updated this revision to Diff 224570. piotr added a comment. Fixed wrong spacing. 
Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68865/new/ https://reviews.llvm.org/D68865 Files: lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp test/Transforms/InstCombine/AMDGPU/amdgcn-demanded-vector-elts.ll Index: test/Transforms/InstCombine/AMDGPU/amdgcn-demanded-vector-elts.ll =================================================================== --- test/Transforms/InstCombine/AMDGPU/amdgcn-demanded-vector-elts.ll +++ test/Transforms/InstCombine/AMDGPU/amdgcn-demanded-vector-elts.ll @@ -1474,6 +1474,51 @@ declare <4 x i32> @llvm.amdgcn.raw.tbuffer.load.v4i32(<4 x i32>, i32, i32, i32, i32) #1 +; CHECK-LABEL: @extract_elt3_raw_tbuffer_load_v4f16( +; CHECK-NEXT: %data = call <4 x half> @llvm.amdgcn.raw.tbuffer.load.v4f16(<4 x i32> %rsrc, i32 %arg0, i32 %arg1, i32 78, i32 0) +; CHECK-NEXT: %elt1 = extractelement <4 x half> %data, i32 3 +; CHECK-NEXT: ret half %elt1 +define amdgpu_ps half @extract_elt3_raw_tbuffer_load_v4f16(<4 x i32> inreg %rsrc, i32 %arg0, i32 inreg %arg1) #0 { + %data = call <4 x half> @llvm.amdgcn.raw.tbuffer.load.v4f16(<4 x i32> %rsrc, i32 %arg0, i32 %arg1, i32 78, i32 0) + %elt1 = extractelement <4 x half> %data, i32 3 + ret half %elt1 +} + +; FIXME: Enable load shortening when full support for v3f16 has been added (should expect call <3 x half> @llvm.amdgcn.raw.tbuffer.load.v3f16). 
+; CHECK-LABEL: @extract_elt2_raw_tbuffer_load_v4f16( +; CHECK-NEXT: %data = call <4 x half> @llvm.amdgcn.raw.tbuffer.load.v4f16(<4 x i32> %rsrc, i32 %arg0, i32 %arg1, i32 78, i32 0) +; CHECK-NEXT: %elt1 = extractelement <4 x half> %data, i32 2 +; CHECK-NEXT: ret half %elt1 +define amdgpu_ps half @extract_elt2_raw_tbuffer_load_v4f16(<4 x i32> inreg %rsrc, i32 %arg0, i32 inreg %arg1) #0 { + %data = call <4 x half> @llvm.amdgcn.raw.tbuffer.load.v4f16(<4 x i32> %rsrc, i32 %arg0, i32 %arg1, i32 78, i32 0) + %elt1 = extractelement <4 x half> %data, i32 2 + ret half %elt1 +} + +; CHECK-LABEL: @extract_elt1_raw_tbuffer_load_v4f16( +; CHECK-NEXT: %data = call <2 x half> @llvm.amdgcn.raw.tbuffer.load.v2f16(<4 x i32> %rsrc, i32 %arg0, i32 %arg1, i32 78, i32 0) +; CHECK-NEXT: %elt1 = extractelement <2 x half> %data, i32 1 +; CHECK-NEXT: ret half %elt1 +define amdgpu_ps half @extract_elt1_raw_tbuffer_load_v4f16(<4 x i32> inreg %rsrc, i32 %arg0, i32 inreg %arg1) #0 { + %data = call <4 x half> @llvm.amdgcn.raw.tbuffer.load.v4f16(<4 x i32> %rsrc, i32 %arg0, i32 %arg1, i32 78, i32 0) + %elt1 = extractelement <4 x half> %data, i32 1 + ret half %elt1 +} + +; CHECK-LABEL: @extract_elt0_raw_tbuffer_load_v4f16( +; CHECK-NEXT: %data = call half @llvm.amdgcn.raw.tbuffer.load.f16(<4 x i32> %rsrc, i32 %arg0, i32 %arg1, i32 78, i32 0) +; CHECK-NEXT: ret half %data +define amdgpu_ps half @extract_elt0_raw_tbuffer_load_v4f16(<4 x i32> inreg %rsrc, i32 %arg0, i32 inreg %arg1) #0 { + %data = call <4 x half> @llvm.amdgcn.raw.tbuffer.load.v4f16(<4 x i32> %rsrc, i32 %arg0, i32 %arg1, i32 78, i32 0) + %elt1 = extractelement <4 x half> %data, i32 0 + ret half %elt1 +} + +declare half @llvm.amdgcn.raw.tbuffer.load.f16(<4 x i32>, i32, i32, i32, i32) #1 +declare <2 x half> @llvm.amdgcn.raw.tbuffer.load.v2f16(<4 x i32>, i32, i32, i32, i32) #1 +declare <3 x half> @llvm.amdgcn.raw.tbuffer.load.v3f16(<4 x i32>, i32, i32, i32, i32) #1 +declare <4 x half> @llvm.amdgcn.raw.tbuffer.load.v4f16(<4 x i32>, i32, 
i32, i32, i32) #1 + ; -------------------------------------------------------------------- ; llvm.amdgcn.struct.tbuffer.load ; -------------------------------------------------------------------- Index: lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp =================================================================== --- lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp +++ lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp @@ -971,6 +971,13 @@ Value *InstCombiner::simplifyAMDGCNMemoryIntrinsicDemanded(IntrinsicInst *II, APInt DemandedElts, int DMaskIdx) { + + // FIXME: Allow v3i16/v3f16 in buffer intrinsics when the types are fully supported. + if (DMaskIdx < 0 && + II->getType()->getScalarSizeInBits() == 16 && + DemandedElts.getActiveBits() == 3) + return nullptr; + unsigned VWidth = II->getType()->getVectorNumElements(); if (VWidth == 1) return nullptr; -------------- next part -------------- A non-text attachment was scrubbed... Name: D68865.224570.patch Type: text/x-patch Size: 4177 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Fri Oct 11 05:27:11 2019 From: llvm-commits at lists.llvm.org (George Rimar via llvm-commits) Date: Fri, 11 Oct 2019 12:27:11 -0000 Subject: [llvm] r374541 - [llvm-readobj] - Remove excessive fields when dumping "Version symbols". Message-ID: <20191011122711.663E792687@lists.llvm.org> Author: grimar Date: Fri Oct 11 05:27:11 2019 New Revision: 374541 URL: http://llvm.org/viewvc/llvm-project?rev=374541&view=rev Log: [llvm-readobj] - Remove excessive fields when dumping "Version symbols". This removes a few fields that are not useful: "Section Name", "Address", "Offset" and "Link" (they duplicated the information available under the "Sections [" tag). 
Differential revision: https://reviews.llvm.org/D68704 Modified: llvm/trunk/test/tools/llvm-readobj/all.test llvm/trunk/test/tools/llvm-readobj/elf-versioninfo.test llvm/trunk/test/tools/yaml2obj/versym-section.yaml llvm/trunk/tools/llvm-readobj/ELFDumper.cpp Modified: llvm/trunk/test/tools/llvm-readobj/all.test URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/llvm-readobj/all.test?rev=374541&r1=374540&r2=374541&view=diff ============================================================================== --- llvm/trunk/test/tools/llvm-readobj/all.test (original) +++ llvm/trunk/test/tools/llvm-readobj/all.test Fri Oct 11 05:27:11 2019 @@ -11,7 +11,7 @@ # LLVM-ALL: Relocations [ # LLVM-ALL: Symbols [ # LLVM-ALL: ProgramHeaders [ -# LLVM-ALL: Version symbols { +# LLVM-ALL: Version symbols [ # LLVM-ALL: SHT_GNU_verdef { # LLVM-ALL: SHT_GNU_verneed { # LLVM-ALL: Addrsig [ Modified: llvm/trunk/test/tools/llvm-readobj/elf-versioninfo.test URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/llvm-readobj/elf-versioninfo.test?rev=374541&r1=374540&r2=374541&view=diff ============================================================================== --- llvm/trunk/test/tools/llvm-readobj/elf-versioninfo.test (original) +++ llvm/trunk/test/tools/llvm-readobj/elf-versioninfo.test Fri Oct 11 05:27:11 2019 @@ -77,38 +77,32 @@ DynamicSymbols: Binding: STB_GLOBAL ... 
-# LLVM: Version symbols { -# LLVM-NEXT: Section Name: .gnu.version -# LLVM-NEXT: Address: 0x0 -# LLVM-NEXT: Offset: 0x40 -# LLVM-NEXT: Link: 7 -# LLVM-NEXT: Symbols [ -# LLVM-NEXT: Symbol { -# LLVM-NEXT: Version: 0 -# LLVM-NEXT: Name: -# LLVM-NEXT: } -# LLVM-NEXT: Symbol { -# LLVM-NEXT: Version: 2 -# LLVM-NEXT: Name: sym1@@VERSION1 -# LLVM-NEXT: } -# LLVM-NEXT: Symbol { -# LLVM-NEXT: Version: 3 -# LLVM-NEXT: Name: sym2@@VERSION2 -# LLVM-NEXT: } -# LLVM-NEXT: Symbol { -# LLVM-NEXT: Version: 4 -# LLVM-NEXT: Name: sym3@v1 -# LLVM-NEXT: } -# LLVM-NEXT: Symbol { -# LLVM-NEXT: Version: 5 -# LLVM-NEXT: Name: sym4@v2 -# LLVM-NEXT: } -# LLVM-NEXT: Symbol { -# LLVM-NEXT: Version: 6 -# LLVM-NEXT: Name: sym5@v3 -# LLVM-NEXT: } -# LLVM-NEXT: ] -# LLVM-NEXT: } +# LLVM: Version symbols [ +# LLVM-NEXT: Symbol { +# LLVM-NEXT: Version: 0 +# LLVM-NEXT: Name: +# LLVM-NEXT: } +# LLVM-NEXT: Symbol { +# LLVM-NEXT: Version: 2 +# LLVM-NEXT: Name: sym1@@VERSION1 +# LLVM-NEXT: } +# LLVM-NEXT: Symbol { +# LLVM-NEXT: Version: 3 +# LLVM-NEXT: Name: sym2@@VERSION2 +# LLVM-NEXT: } +# LLVM-NEXT: Symbol { +# LLVM-NEXT: Version: 4 +# LLVM-NEXT: Name: sym3@v1 +# LLVM-NEXT: } +# LLVM-NEXT: Symbol { +# LLVM-NEXT: Version: 5 +# LLVM-NEXT: Name: sym4@v2 +# LLVM-NEXT: } +# LLVM-NEXT: Symbol { +# LLVM-NEXT: Version: 6 +# LLVM-NEXT: Name: sym5@v3 +# LLVM-NEXT: } +# LLVM-NEXT: ] # LLVM-NEXT: SHT_GNU_verdef { # LLVM-NEXT: Definition { # LLVM-NEXT: Version: 1 Modified: llvm/trunk/test/tools/yaml2obj/versym-section.yaml URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/yaml2obj/versym-section.yaml?rev=374541&r1=374540&r2=374541&view=diff ============================================================================== --- llvm/trunk/test/tools/yaml2obj/versym-section.yaml (original) +++ llvm/trunk/test/tools/yaml2obj/versym-section.yaml Fri Oct 11 05:27:11 2019 @@ -4,26 +4,20 @@ # RUN: yaml2obj --docnum=1 %s -o %t1 # RUN: llvm-readobj -V %t1 | FileCheck %s -# CHECK: Version symbols
{ -# CHECK-NEXT: Section Name: .gnu.version -# CHECK-NEXT: Address: 0x200210 -# CHECK-NEXT: Offset: 0x40 -# CHECK-NEXT: Link: 6 -# CHECK-NEXT: Symbols [ -# CHECK-NEXT: Symbol { -# CHECK-NEXT: Version: 0 -# CHECK-NEXT: Name: -# CHECK-NEXT: } -# CHECK-NEXT: Symbol { -# CHECK-NEXT: Version: 3 -# CHECK-NEXT: Name: f1@v1 -# CHECK-NEXT: } -# CHECK-NEXT: Symbol { -# CHECK-NEXT: Version: 4 -# CHECK-NEXT: Name: f2@v2 -# CHECK-NEXT: } -# CHECK-NEXT: ] -# CHECK-NEXT: } +# CHECK: Version symbols [ +# CHECK-NEXT: Symbol { +# CHECK-NEXT: Version: 0 +# CHECK-NEXT: Name: +# CHECK-NEXT: } +# CHECK-NEXT: Symbol { +# CHECK-NEXT: Version: 3 +# CHECK-NEXT: Name: f1@v1 +# CHECK-NEXT: } +# CHECK-NEXT: Symbol { +# CHECK-NEXT: Version: 4 +# CHECK-NEXT: Name: f2@v2 +# CHECK-NEXT: } +# CHECK-NEXT: ] # CHECK-NEXT: SHT_GNU_verdef { # CHECK-NEXT: } # CHECK-NEXT: SHT_GNU_verneed { Modified: llvm/trunk/tools/llvm-readobj/ELFDumper.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-readobj/ELFDumper.cpp?rev=374541&r1=374540&r2=374541&view=diff ============================================================================== --- llvm/trunk/tools/llvm-readobj/ELFDumper.cpp (original) +++ llvm/trunk/tools/llvm-readobj/ELFDumper.cpp Fri Oct 11 05:27:11 2019 @@ -5607,23 +5607,16 @@ void LLVMStyle::printProgramHeader template void LLVMStyle::printVersionSymbolSection(const ELFFile *Obj, const Elf_Shdr *Sec) { - DictScope SS(W, "Version symbols"); + ListScope SS(W, "Version symbols"); if (!Sec) return; - StringRef SecName = unwrapOrError(this->FileName, Obj->getSectionName(Sec)); - W.printNumber("Section Name", SecName, Sec->sh_name); - W.printHex("Address", Sec->sh_addr); - W.printHex("Offset", Sec->sh_offset); - W.printNumber("Link", Sec->sh_link); - const uint8_t *VersymBuf = reinterpret_cast(Obj->base() + Sec->sh_offset); const ELFDumper *Dumper = this->dumper(); StringRef StrTable = Dumper->getDynamicStringTable(); // Same number of entries in the dynamic symbol table
(DT_SYMTAB). - ListScope Syms(W, "Symbols"); for (const Elf_Sym &Sym : Dumper->dynamic_symbols()) { DictScope S(W, "Symbol"); const Elf_Versym *Versym = reinterpret_cast(VersymBuf); From llvm-commits at lists.llvm.org Fri Oct 11 05:27:20 2019 From: llvm-commits at lists.llvm.org (George Rimar via llvm-commits) Date: Fri, 11 Oct 2019 12:27:20 -0000 Subject: [lld] r374542 - [LLD][ELF] - Update test cases after llvm-readobj change. Message-ID: <20191011122720.CA94393047@lists.llvm.org> Author: grimar Date: Fri Oct 11 05:27:20 2019 New Revision: 374542 URL: http://llvm.org/viewvc/llvm-project?rev=374542&view=rev Log: [LLD][ELF] - Update test cases after llvm-readobj change. https://reviews.llvm.org/D68704 changed the output format. Modified: lld/trunk/test/ELF/empty-ver.s lld/trunk/test/ELF/empty-ver2.s lld/trunk/test/ELF/linkerscript/version-script.s lld/trunk/test/ELF/verdef-defaultver.s lld/trunk/test/ELF/verdef.s lld/trunk/test/ELF/verneed.s lld/trunk/test/ELF/version-script-extern-undefined.s lld/trunk/test/ELF/version-script-extern-wildcards.s lld/trunk/test/ELF/version-script-extern.s lld/trunk/test/ELF/version-script-extern2.s lld/trunk/test/ELF/version-script-locals-extern.s lld/trunk/test/ELF/version-script-symver2.s Modified: lld/trunk/test/ELF/empty-ver.s URL: http://llvm.org/viewvc/llvm-project/lld/trunk/test/ELF/empty-ver.s?rev=374542&r1=374541&r2=374542&view=diff ============================================================================== --- lld/trunk/test/ELF/empty-ver.s (original) +++ lld/trunk/test/ELF/empty-ver.s Fri Oct 11 05:27:20 2019 @@ -21,22 +21,16 @@ // CHECK-NEXT: 0000: 00666F6F 00742E73 6F007665 7200 |.foo.t.so.ver.| // CHECK-NEXT: ) -// CHECK: Version symbols { -// CHECK-NEXT: Section Name: -// CHECK-NEXT: Address: -// CHECK-NEXT: Offset: -// CHECK-NEXT: Link: -// CHECK-NEXT: Symbols [ -// CHECK-NEXT: Symbol { -// CHECK-NEXT: Version: 0 -// CHECK-NEXT: Name: -// CHECK-NEXT: } -// CHECK-NEXT: Symbol { -// CHECK-NEXT: Version: 2 -// 
CHECK-NEXT: Name: foo@ver -// CHECK-NEXT: } -// CHECK-NEXT: ] -// CHECK-NEXT: } +// CHECK: Version symbols [ +// CHECK-NEXT: Symbol { +// CHECK-NEXT: Version: 0 +// CHECK-NEXT: Name: +// CHECK-NEXT: } +// CHECK-NEXT: Symbol { +// CHECK-NEXT: Version: 2 +// CHECK-NEXT: Name: foo@ver +// CHECK-NEXT: } +// CHECK-NEXT: ] .global foo@ver Modified: lld/trunk/test/ELF/empty-ver2.s URL: http://llvm.org/viewvc/llvm-project/lld/trunk/test/ELF/empty-ver2.s?rev=374542&r1=374541&r2=374542&view=diff ============================================================================== --- lld/trunk/test/ELF/empty-ver2.s (original) +++ lld/trunk/test/ELF/empty-ver2.s Fri Oct 11 05:27:20 2019 @@ -5,7 +5,7 @@ # RUN: ld.lld %t.o -o t.so -shared -version-script %p/Inputs/empty-ver.ver # RUN: llvm-readobj --version-info t.so | FileCheck %s -# CHECK: Symbols [ +# CHECK: Version symbols [ # CHECK-NEXT: Symbol { # CHECK-NEXT: Version: 0 # CHECK-NEXT: Name: Modified: lld/trunk/test/ELF/linkerscript/version-script.s URL: http://llvm.org/viewvc/llvm-project/lld/trunk/test/ELF/linkerscript/version-script.s?rev=374542&r1=374541&r2=374542&view=diff ============================================================================== --- lld/trunk/test/ELF/linkerscript/version-script.s (original) +++ lld/trunk/test/ELF/linkerscript/version-script.s Fri Oct 11 05:27:20 2019 @@ -11,7 +11,7 @@ # RUN: llvm-readobj -V %t.so | FileCheck %s ## Check that we are able to version symbols defined in script.
-# CHECK: Symbols [ +# CHECK: Version symbols [ # CHECK-NEXT: Symbol { # CHECK-NEXT: Version: 0 # CHECK-NEXT: Name: @@ -38,7 +38,7 @@ # RUN: echo "und = 0x1; VERSION { V { global: und; local: *; }; }" > %t.script # RUN: ld.lld -T %t.script -shared --no-undefined-version %t.o -o %t.so # RUN: llvm-readobj -V %t.so | FileCheck %s --check-prefix=UNDEF -# UNDEF: Symbols [ +# UNDEF: Version symbols [ # UNDEF-NEXT: Symbol { # UNDEF-NEXT: Version: 0 # UNDEF-NEXT: Name: Modified: lld/trunk/test/ELF/verdef-defaultver.s URL: http://llvm.org/viewvc/llvm-project/lld/trunk/test/ELF/verdef-defaultver.s?rev=374542&r1=374541&r2=374542&view=diff ============================================================================== --- lld/trunk/test/ELF/verdef-defaultver.s (original) +++ lld/trunk/test/ELF/verdef-defaultver.s Fri Oct 11 05:27:20 2019 @@ -53,34 +53,28 @@ # DSO-NEXT: Section: .text # DSO-NEXT: } # DSO-NEXT: ] -# DSO-NEXT: Version symbols { -# DSO-NEXT: Section Name: .gnu.version -# DSO-NEXT: Address: 0x240 -# DSO-NEXT: Offset: 0x240 -# DSO-NEXT: Link: 1 -# DSO-NEXT: Symbols [ -# DSO-NEXT: Symbol { -# DSO-NEXT: Version: 0 -# DSO-NEXT: Name: -# DSO-NEXT: } -# DSO-NEXT: Symbol { -# DSO-NEXT: Version: 2 -# DSO-NEXT: Name: a@@V1 -# DSO-NEXT: } -# DSO-NEXT: Symbol { -# DSO-NEXT: Version: 3 -# DSO-NEXT: Name: b@@V2 -# DSO-NEXT: } -# DSO-NEXT: Symbol { -# DSO-NEXT: Version: 2 -# DSO-NEXT: Name: b@V1 -# DSO-NEXT: } -# DSO-NEXT: Symbol { -# DSO-NEXT: Version: 3 -# DSO-NEXT: Name: c@@V2 -# DSO-NEXT: } -# DSO-NEXT: ] -# DSO-NEXT: } +# DSO-NEXT: Version symbols [ +# DSO-NEXT: Symbol { +# DSO-NEXT: Version: 0 +# DSO-NEXT: Name: +# DSO-NEXT: } +# DSO-NEXT: Symbol { +# DSO-NEXT: Version: 2 +# DSO-NEXT: Name: a@@V1 +# DSO-NEXT: } +# DSO-NEXT: Symbol { +# DSO-NEXT: Version: 3 +# DSO-NEXT: Name: b@@V2 +# DSO-NEXT: } +# DSO-NEXT: Symbol { +# DSO-NEXT: Version: 2 +# DSO-NEXT: Name: b@V1 +# DSO-NEXT: } +# DSO-NEXT: Symbol { +# DSO-NEXT: Version: 3 +# DSO-NEXT: Name: c@@V2 +# DSO-NEXT: } +#
DSO-NEXT: ] # DSO-NEXT: SHT_GNU_verdef { # DSO-NEXT: Definition { # DSO-NEXT: Version: 1 @@ -148,30 +142,24 @@ # EXE-NEXT: Section: Undefined # EXE-NEXT: } # EXE-NEXT: ] -# EXE-NEXT: Version symbols { -# EXE-NEXT: Section Name: .gnu.version -# EXE-NEXT: Address: 0x200260 -# EXE-NEXT: Offset: 0x260 -# EXE-NEXT: Link: 1 -# EXE-NEXT: Symbols [ -# EXE-NEXT: Symbol { -# EXE-NEXT: Version: 0 -# EXE-NEXT: Name: -# EXE-NEXT: } -# EXE-NEXT: Symbol { -# EXE-NEXT: Version: 2 -# EXE-NEXT: Name: a@V1 -# EXE-NEXT: } -# EXE-NEXT: Symbol { -# EXE-NEXT: Version: 3 -# EXE-NEXT: Name: b@V2 -# EXE-NEXT: } -# EXE-NEXT: Symbol { -# EXE-NEXT: Version: 3 -# EXE-NEXT: Name: c@V2 -# EXE-NEXT: } -# EXE-NEXT: ] -# EXE-NEXT: } +# EXE-NEXT: Version symbols [ +# EXE-NEXT: Symbol { +# EXE-NEXT: Version: 0 +# EXE-NEXT: Name: +# EXE-NEXT: } +# EXE-NEXT: Symbol { +# EXE-NEXT: Version: 2 +# EXE-NEXT: Name: a@V1 +# EXE-NEXT: } +# EXE-NEXT: Symbol { +# EXE-NEXT: Version: 3 +# EXE-NEXT: Name: b@V2 +# EXE-NEXT: } +# EXE-NEXT: Symbol { +# EXE-NEXT: Version: 3 +# EXE-NEXT: Name: c@V2 +# EXE-NEXT: } +# EXE-NEXT: ] # EXE-NEXT: SHT_GNU_verdef { # EXE-NEXT: } # EXE-NEXT: SHT_GNU_verneed { Modified: lld/trunk/test/ELF/verdef.s URL: http://llvm.org/viewvc/llvm-project/lld/trunk/test/ELF/verdef.s?rev=374542&r1=374541&r2=374542&view=diff ============================================================================== --- lld/trunk/test/ELF/verdef.s (original) +++ lld/trunk/test/ELF/verdef.s Fri Oct 11 05:27:20 2019 @@ -6,30 +6,24 @@ # RUN: ld.lld --hash-style=sysv --version-script %t.script -shared -soname shared %t.o -o %t.so # RUN: llvm-readobj -V --dyn-syms %t.so | FileCheck --check-prefix=DSO %s -# DSO: Version symbols { -# DSO-NEXT: Section Name: .gnu.version -# DSO-NEXT: Address: 0x228 -# DSO-NEXT: Offset: 0x228 -# DSO-NEXT: Link: 1 -# DSO-NEXT: Symbols [ -# DSO-NEXT: Symbol { -# DSO-NEXT: Version: 0 -# DSO-NEXT: Name: -# DSO-NEXT: } -# DSO-NEXT: Symbol { -# DSO-NEXT: Version: 2 -# DSO-NEXT:
Name: a@@LIBSAMPLE_1.0 -# DSO-NEXT: } -# DSO-NEXT: Symbol { -# DSO-NEXT: Version: 3 -# DSO-NEXT: Name: b@@LIBSAMPLE_2.0 -# DSO-NEXT: } -# DSO-NEXT: Symbol { -# DSO-NEXT: Version: 4 -# DSO-NEXT: Name: c@@LIBSAMPLE_3.0 -# DSO-NEXT: } -# DSO-NEXT: ] -# DSO-NEXT: } +# DSO: Version symbols [ +# DSO-NEXT: Symbol { +# DSO-NEXT: Version: 0 +# DSO-NEXT: Name: +# DSO-NEXT: } +# DSO-NEXT: Symbol { +# DSO-NEXT: Version: 2 +# DSO-NEXT: Name: a@@LIBSAMPLE_1.0 +# DSO-NEXT: } +# DSO-NEXT: Symbol { +# DSO-NEXT: Version: 3 +# DSO-NEXT: Name: b@@LIBSAMPLE_2.0 +# DSO-NEXT: } +# DSO-NEXT: Symbol { +# DSO-NEXT: Version: 4 +# DSO-NEXT: Name: c@@LIBSAMPLE_3.0 +# DSO-NEXT: } +# DSO-NEXT: ] # DSO-NEXT: SHT_GNU_verdef { # DSO-NEXT: Definition { # DSO-NEXT: Version: 1 @@ -68,30 +62,24 @@ # RUN: ld.lld --hash-style=sysv %tmain.o %t.so -o %tout # RUN: llvm-readobj -V %tout | FileCheck --check-prefix=MAIN %s -# MAIN: Version symbols { -# MAIN-NEXT: Section Name: .gnu.version -# MAIN-NEXT: Address: 0x200260 -# MAIN-NEXT: Offset: 0x260 -# MAIN-NEXT: Link: 1 -# MAIN-NEXT: Symbols [ -# MAIN-NEXT: Symbol { -# MAIN-NEXT: Version: 0 -# MAIN-NEXT: Name: -# MAIN-NEXT: } -# MAIN-NEXT: Symbol { -# MAIN-NEXT: Version: 2 -# MAIN-NEXT: Name: a@LIBSAMPLE_1.0 -# MAIN-NEXT: } -# MAIN-NEXT: Symbol { -# MAIN-NEXT: Version: 3 -# MAIN-NEXT: Name: b@LIBSAMPLE_2.0 -# MAIN-NEXT: } -# MAIN-NEXT: Symbol { -# MAIN-NEXT: Version: 4 -# MAIN-NEXT: Name: c@LIBSAMPLE_3.0 -# MAIN-NEXT: } -# MAIN-NEXT: ] -# MAIN-NEXT: } +# MAIN: Version symbols [ +# MAIN-NEXT: Symbol { +# MAIN-NEXT: Version: 0 +# MAIN-NEXT: Name: +# MAIN-NEXT: } +# MAIN-NEXT: Symbol { +# MAIN-NEXT: Version: 2 +# MAIN-NEXT: Name: a@LIBSAMPLE_1.0 +# MAIN-NEXT: } +# MAIN-NEXT: Symbol { +# MAIN-NEXT: Version: 3 +# MAIN-NEXT: Name: b@LIBSAMPLE_2.0 +# MAIN-NEXT: } +# MAIN-NEXT: Symbol { +# MAIN-NEXT: Version: 4 +# MAIN-NEXT: Name: c@LIBSAMPLE_3.0 +# MAIN-NEXT: } +# MAIN-NEXT: ] # MAIN-NEXT: SHT_GNU_verdef { # MAIN-NEXT: } Modified:
lld/trunk/test/ELF/verneed.s URL: http://llvm.org/viewvc/llvm-project/lld/trunk/test/ELF/verneed.s?rev=374542&r1=374541&r2=374542&view=diff ============================================================================== --- lld/trunk/test/ELF/verneed.s (original) +++ lld/trunk/test/ELF/verneed.s Fri Oct 11 05:27:20 2019 @@ -117,30 +117,24 @@ # CHECK-NEXT: 0x000000006FFFFFFE VERNEED [[VERNEED]] # CHECK-NEXT: 0x000000006FFFFFFF VERNEEDNUM 2 -# CHECK: Version symbols { -# CHECK-NEXT: Section Name: .gnu.version -# CHECK-NEXT: Address: [[VERSYM]] -# CHECK-NEXT: Offset: [[VERSYM_OFFSET]] -# CHECK-NEXT: Link: 1 -# CHECK-NEXT: Symbols [ -# CHECK-NEXT: Symbol { -# CHECK-NEXT: Version: 0 -# CHECK-NEXT: Name: -# CHECK-NEXT: } -# CHECK-NEXT: Symbol { -# CHECK-NEXT: Version: 2 -# CHECK-NEXT: Name: f1 at v3 -# CHECK-NEXT: } -# CHECK-NEXT: Symbol { -# CHECK-NEXT: Version: 3 -# CHECK-NEXT: Name: f2 at v2 -# CHECK-NEXT: } -# CHECK-NEXT: Symbol { -# CHECK-NEXT: Version: 4 -# CHECK-NEXT: Name: g1 at v1 -# CHECK-NEXT: } -# CHECK-NEXT: ] -# CHECK-NEXT: } +# CHECK: Version symbols [ +# CHECK-NEXT: Symbol { +# CHECK-NEXT: Version: 0 +# CHECK-NEXT: Name: +# CHECK-NEXT: } +# CHECK-NEXT: Symbol { +# CHECK-NEXT: Version: 2 +# CHECK-NEXT: Name: f1 at v3 +# CHECK-NEXT: } +# CHECK-NEXT: Symbol { +# CHECK-NEXT: Version: 3 +# CHECK-NEXT: Name: f2 at v2 +# CHECK-NEXT: } +# CHECK-NEXT: Symbol { +# CHECK-NEXT: Version: 4 +# CHECK-NEXT: Name: g1 at v1 +# CHECK-NEXT: } +# CHECK-NEXT: ] # CHECK-NEXT: SHT_GNU_verdef { # CHECK-NEXT: } # CHECK-NEXT: SHT_GNU_verneed { Modified: lld/trunk/test/ELF/version-script-extern-undefined.s URL: http://llvm.org/viewvc/llvm-project/lld/trunk/test/ELF/version-script-extern-undefined.s?rev=374542&r1=374541&r2=374542&view=diff ============================================================================== --- lld/trunk/test/ELF/version-script-extern-undefined.s (original) +++ lld/trunk/test/ELF/version-script-extern-undefined.s Fri Oct 11 05:27:20 2019 @@ -5,7 +5,7 @@ # 
RUN: ld.lld --version-script %t.script -shared %t.o -o %t.so # RUN: llvm-readobj -V %t.so | FileCheck %s -# CHECK: Symbols [ +# CHECK: Version symbols [ # CHECK-NEXT: Symbol { # CHECK-NEXT: Version: 0 # CHECK-NEXT: Name: Modified: lld/trunk/test/ELF/version-script-extern-wildcards.s URL: http://llvm.org/viewvc/llvm-project/lld/trunk/test/ELF/version-script-extern-wildcards.s?rev=374542&r1=374541&r2=374542&view=diff ============================================================================== --- lld/trunk/test/ELF/version-script-extern-wildcards.s (original) +++ lld/trunk/test/ELF/version-script-extern-wildcards.s Fri Oct 11 05:27:20 2019 @@ -6,8 +6,7 @@ # RUN: ld.lld --version-script %t.script -shared %t.o -o %t.so # RUN: llvm-readobj -V --dyn-syms %t.so | FileCheck %s -# CHECK: Version symbols { -# CHECK: Symbols [ +# CHECK: Version symbols [ # CHECK: Name: _Z3bari # CHECK: Name: _Z3fooi@@FOO # CHECK: Name: _Z3zedi@@BAR Modified: lld/trunk/test/ELF/version-script-extern.s URL: http://llvm.org/viewvc/llvm-project/lld/trunk/test/ELF/version-script-extern.s?rev=374542&r1=374541&r2=374542&view=diff ============================================================================== --- lld/trunk/test/ELF/version-script-extern.s (original) +++ lld/trunk/test/ELF/version-script-extern.s Fri Oct 11 05:27:20 2019 @@ -66,38 +66,32 @@ # DSO-NEXT: Section: .text (0x6) # DSO-NEXT: } # DSO-NEXT: ] -# DSO-NEXT: Version symbols { -# DSO-NEXT: Section Name: .gnu.version -# DSO-NEXT: Address: -# DSO-NEXT: Offset: -# DSO-NEXT: Link: 1 -# DSO-NEXT: Symbols [ -# DSO-NEXT: Symbol { -# DSO-NEXT: Version: 0 -# DSO-NEXT: Name: -# DSO-NEXT: } -# DSO-NEXT: Symbol { -# DSO-NEXT: Version: 3 -# DSO-NEXT: Name: _Z3bari@@LIBSAMPLE_2.0 -# DSO-NEXT: } -# DSO-NEXT: Symbol { -# DSO-NEXT: Version: 2 -# DSO-NEXT: Name: _Z3fooi@@LIBSAMPLE_1.0 -# DSO-NEXT: } -# DSO-NEXT: Symbol { -# DSO-NEXT: Version: 2 -# DSO-NEXT: Name: _Z3zedi@@LIBSAMPLE_1.0 -# DSO-NEXT: } -# DSO-NEXT: Symbol { -# DSO-NEXT: Version: 2 
-# DSO-NEXT: Name: _ZN3abcC1Ev@@LIBSAMPLE_1.0 -# DSO-NEXT: } -# DSO-NEXT: Symbol { -# DSO-NEXT: Version: 2 -# DSO-NEXT: Name: _ZN3abcC2Ev@@LIBSAMPLE_1.0 -# DSO-NEXT: } -# DSO-NEXT: ] -# DSO-NEXT: } +# DSO-NEXT: Version symbols [ +# DSO-NEXT: Symbol { +# DSO-NEXT: Version: 0 +# DSO-NEXT: Name: +# DSO-NEXT: } +# DSO-NEXT: Symbol { +# DSO-NEXT: Version: 3 +# DSO-NEXT: Name: _Z3bari@@LIBSAMPLE_2.0 +# DSO-NEXT: } +# DSO-NEXT: Symbol { +# DSO-NEXT: Version: 2 +# DSO-NEXT: Name: _Z3fooi@@LIBSAMPLE_1.0 +# DSO-NEXT: } +# DSO-NEXT: Symbol { +# DSO-NEXT: Version: 2 +# DSO-NEXT: Name: _Z3zedi@@LIBSAMPLE_1.0 +# DSO-NEXT: } +# DSO-NEXT: Symbol { +# DSO-NEXT: Version: 2 +# DSO-NEXT: Name: _ZN3abcC1Ev@@LIBSAMPLE_1.0 +# DSO-NEXT: } +# DSO-NEXT: Symbol { +# DSO-NEXT: Version: 2 +# DSO-NEXT: Name: _ZN3abcC2Ev@@LIBSAMPLE_1.0 +# DSO-NEXT: } +# DSO-NEXT: ] .text .globl _Z3fooi Modified: lld/trunk/test/ELF/version-script-extern2.s URL: http://llvm.org/viewvc/llvm-project/lld/trunk/test/ELF/version-script-extern2.s?rev=374542&r1=374541&r2=374542&view=diff ============================================================================== --- lld/trunk/test/ELF/version-script-extern2.s (original) +++ lld/trunk/test/ELF/version-script-extern2.s Fri Oct 11 05:27:20 2019 @@ -5,7 +5,7 @@ # RUN: ld.lld --version-script %t.script -shared %t.o -o %t.so # RUN: llvm-readobj -V %t.so | FileCheck %s -# CHECK: Symbols [ +# CHECK: Version symbols [ # CHECK-NEXT: Symbol { # CHECK-NEXT: Version: 0 # CHECK-NEXT: Name: Modified: lld/trunk/test/ELF/version-script-locals-extern.s URL: http://llvm.org/viewvc/llvm-project/lld/trunk/test/ELF/version-script-locals-extern.s?rev=374542&r1=374541&r2=374542&view=diff ============================================================================== --- lld/trunk/test/ELF/version-script-locals-extern.s (original) +++ lld/trunk/test/ELF/version-script-locals-extern.s Fri Oct 11 05:27:20 2019 @@ -4,7 +4,7 @@ # RUN: echo "FOO { local: extern \"C++\" { \"abb(int)\"; }; };" > 
%t.script # RUN: ld.lld --version-script %t.script -shared %t.o -o %t.so # RUN: llvm-readobj -V %t.so | FileCheck %s --check-prefix=ABB -# ABB: Symbols [ +# ABB: Version symbols [ # ABB-NEXT: Symbol { # ABB-NEXT: Version: 0 # ABB-NEXT: Name: @@ -23,7 +23,7 @@ # RUN: echo "FOO { local: extern \"C++\" { abc*; }; };" > %t.script # RUN: ld.lld --version-script %t.script -shared %t.o -o %t.so # RUN: llvm-readobj -V %t.so | FileCheck %s --check-prefix=ABC -# ABC: Symbols [ +# ABC: Version symbols [ # ABC-NEXT: Symbol { # ABC-NEXT: Version: 0 # ABC-NEXT: Name: Modified: lld/trunk/test/ELF/version-script-symver2.s URL: http://llvm.org/viewvc/llvm-project/lld/trunk/test/ELF/version-script-symver2.s?rev=374542&r1=374541&r2=374542&view=diff ============================================================================== --- lld/trunk/test/ELF/version-script-symver2.s (original) +++ lld/trunk/test/ELF/version-script-symver2.s Fri Oct 11 05:27:20 2019 @@ -4,7 +4,7 @@ # RUN: ld.lld -shared %t.o --version-script %t.map -o %t.so --fatal-warnings # RUN: llvm-readobj -V %t.so | FileCheck %s -# CHECK: Symbols [ +# CHECK: Version symbols [ # CHECK-NEXT: Symbol { # CHECK-NEXT: Version: 0 # CHECK-NEXT: Name: From llvm-commits at lists.llvm.org Fri Oct 11 05:26:11 2019 From: llvm-commits at lists.llvm.org (Gil Rapaport via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 12:26:11 +0000 (UTC) Subject: [PATCH] D68577: [LV] Apply sink-after & interleave-groups as VPlan transformations (NFC) In-Reply-To: References: Message-ID: <753aaac48afd94ba4e9b21ccea8f208f@localhost.localdomain> gilr updated this revision to Diff 224571. gilr added a comment. Applied review comments. 
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68577/new/ https://reviews.llvm.org/D68577 Files: llvm/lib/Transforms/Vectorize/LoopVectorizationPlanner.h llvm/lib/Transforms/Vectorize/LoopVectorize.cpp llvm/lib/Transforms/Vectorize/VPRecipeBuilder.h llvm/lib/Transforms/Vectorize/VPlan.cpp llvm/lib/Transforms/Vectorize/VPlan.h -------------- next part -------------- A non-text attachment was scrubbed... Name: D68577.224571.patch Type: text/x-patch Size: 18919 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Fri Oct 11 05:26:31 2019 From: llvm-commits at lists.llvm.org (George Rimar via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 12:26:31 +0000 (UTC) Subject: [PATCH] D68704: [llvm-readobj] - Remove excessive fields when dumping "Version symbols". In-Reply-To: References: Message-ID: <73984324b0ba06bbdfee4e97ca27c88d@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rGe6e26339ff03: [llvm-readobj] - Remove excessive fields when dumping "Version symbols". (authored by grimar). Herald added a project: LLVM. Changed prior to commit: https://reviews.llvm.org/D68704?vs=224317&id=224572#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68704/new/ https://reviews.llvm.org/D68704 Files: llvm/test/tools/llvm-readobj/all.test llvm/test/tools/llvm-readobj/elf-versioninfo.test llvm/test/tools/yaml2obj/versym-section.yaml llvm/tools/llvm-readobj/ELFDumper.cpp -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68704.224572.patch Type: text/x-patch Size: 4967 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Fri Oct 11 05:28:53 2019 From: llvm-commits at lists.llvm.org (Nemanja Ivanovic via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 12:28:53 +0000 (UTC) Subject: [PATCH] D68237: [PowerPC] Handle f16 as a storage type only In-Reply-To: References: Message-ID: <3d300f23f0c8a079fbbd7e2b18c0485a@localhost.localdomain> nemanjai marked 4 inline comments as done. nemanjai added inline comments. ================ Comment at: lib/Target/PowerPC/PPCISelLowering.cpp:184 + setTruncStoreAction(MVT::f32, MVT::f16, Expand); + } + ---------------- shchenz wrote: > Do we need to handle ppcf128 also? Not really. That type is really just a pair of doubles and there is no register that can contain it, so it will always be broken up into a pair of doubles by the legalizer. ================ Comment at: lib/Target/PowerPC/PPCInstrVSX.td:114 [SDNPHasChain, SDNPMayLoad, SDNPMemOperand]>; +def extloadf16 : PatFrag<(ops node:$ptr), (extload node:$ptr)> { + let IsLoad = 1; ---------------- shchenz wrote: > Guard under IsISA3_0? Why? This is just a def of a pattern fragment. It is only defined here because it is missing in the target independent td file. ================ Comment at: lib/Target/PowerPC/PPCInstrVSX.td:3263 (v2i64 (XXPERMDIs (VEXTSH2Ds (LXSIHZX xoaddr:$src)), 0))>; + // Load/convert and convert/store patterns for f16. ---------------- shchenz wrote: > Guard under IsISA3_0? The instructions used are defined in a `Power9Vector` guard and so are these patterns. 
I realize this is hard to track down from the patch - this file is in desperate need of refactoring :( ================ Comment at: test/CodeGen/PowerPC/handle-f16-storage-type.ll:8 +; Function Attrs: nounwind readonly +define dso_local double @loadd(i16* nocapture readonly %a) local_unnamed_addr #0 { +; P8-LABEL: loadd: ---------------- shchenz wrote: > `#0 `, seems all the function attributes are not defined? Sure, I can get rid of these (actually, I'll just define #0 as `nounwind` and use it for all the functions so we don't get the CFI nodes). Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68237/new/ https://reviews.llvm.org/D68237 From llvm-commits at lists.llvm.org Fri Oct 11 05:33:12 2019 From: llvm-commits at lists.llvm.org (Simon Atanasyan via llvm-commits) Date: Fri, 11 Oct 2019 12:33:12 -0000 Subject: [llvm] r374544 - [mips] Fix loading "double" immediate into a GPR and FPR Message-ID: <20191011123312.B7ABE85399@lists.llvm.org> Author: atanasyan Date: Fri Oct 11 05:33:12 2019 New Revision: 374544 URL: http://llvm.org/viewvc/llvm-project?rev=374544&view=rev Log: [mips] Fix loading "double" immediate into a GPR and FPR If a "double" (64-bit) value has zero low 32 bits, it's possible to load such a value into a GP/FP register as an instruction immediate. But currently the assembler loads only the high 32 bits of the value. For example, if the target register is a GPR, the `li.d $4, 1.0` instruction is converted into `lui $4, 16368`. As a result, we get `0x3FF00000` in the register, while the correct representation of the `1.0` value is `0x3FF0000000000000`. The patch fixes that.
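The arithmetic in the log message can be checked with a small standalone sketch. This is not code from the patch: `doubleBits`, `luiResult` and `oriDsllResult` are hypothetical helper names, and the instruction effects are modelled as plain shifts (assumption: sign extension and the `dsll`/`dsll32` encoding split are ignored, which is harmless for the positive immediates used here).

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Raw IEEE-754 bit pattern of a double, obtained without aliasing tricks.
inline uint64_t doubleBits(double Value) {
  uint64_t Bits;
  static_assert(sizeof(Bits) == sizeof(Value), "double must be 64 bits");
  std::memcpy(&Bits, &Value, sizeof(Bits));
  return Bits;
}

// Simplified model of `lui Reg, Imm16`: the immediate lands in bits 16..31.
// (Hypothetical helper; sign extension on 64-bit targets is not modelled.)
inline uint64_t luiResult(uint16_t Imm16) { return uint64_t(Imm16) << 16; }

// Simplified model of `ori Reg, $zero, Imm16` followed by
// `dsll Reg, Reg, Shift`: a zero-extended immediate shifted left.
inline uint64_t oriDsllResult(uint16_t Imm16, unsigned Shift) {
  assert(Shift < 64 && "shift amount must fit a 64-bit register");
  return uint64_t(Imm16) << Shift;
}
```

With these helpers, `luiResult(16368)` gives `0x3FF00000`, which is only the high word of `doubleBits(1.0)`, while `oriDsllResult(65472, 46)` reproduces the full 64-bit pattern. That appears to match the N32/N64 `ori`/`dsll` pairs expected in the updated macro-li.d.s test.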
Differential Revision: https://reviews.llvm.org/D68776 Modified: llvm/trunk/lib/Target/Mips/AsmParser/MipsAsmParser.cpp llvm/trunk/test/MC/Mips/macro-li.d.s Modified: llvm/trunk/lib/Target/Mips/AsmParser/MipsAsmParser.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/Mips/AsmParser/MipsAsmParser.cpp?rev=374544&r1=374543&r2=374544&view=diff ============================================================================== --- llvm/trunk/lib/Target/Mips/AsmParser/MipsAsmParser.cpp (original) +++ llvm/trunk/lib/Target/Mips/AsmParser/MipsAsmParser.cpp Fri Oct 11 05:33:12 2019 @@ -3403,8 +3403,8 @@ bool MipsAsmParser::expandLoadDoubleImmT if (LoImmOp64 == 0) { if (isABI_N32() || isABI_N64()) { - if (loadImmediate(HiImmOp64, FirstReg, Mips::NoRegister, false, true, - IDLoc, Out, STI)) + if (loadImmediate(ImmOp64, FirstReg, Mips::NoRegister, false, true, IDLoc, + Out, STI)) return true; } else { if (loadImmediate(HiImmOp64, FirstReg, Mips::NoRegister, true, true, @@ -3477,12 +3477,20 @@ bool MipsAsmParser::expandLoadDoubleImmT !((HiImmOp64 & 0xffff0000) && (HiImmOp64 & 0x0000ffff))) { // FIXME: In the case where the constant is zero, we can load the // register directly from the zero register. 
- if (loadImmediate(HiImmOp64, TmpReg, Mips::NoRegister, true, true, IDLoc, + + if (isABI_N32() || isABI_N64()) { + if (loadImmediate(ImmOp64, TmpReg, Mips::NoRegister, false, false, IDLoc, + Out, STI)) + return true; + TOut.emitRR(Mips::DMTC1, FirstReg, TmpReg, IDLoc, STI); + return false; + } + + if (loadImmediate(HiImmOp64, TmpReg, Mips::NoRegister, true, false, IDLoc, Out, STI)) return true; - if (isABI_N32() || isABI_N64()) - TOut.emitRR(Mips::DMTC1, FirstReg, TmpReg, IDLoc, STI); - else if (hasMips32r2()) { + + if (hasMips32r2()) { TOut.emitRR(Mips::MTC1, FirstReg, Mips::ZERO, IDLoc, STI); TOut.emitRRR(Mips::MTHC1_D32, FirstReg, FirstReg, TmpReg, IDLoc, STI); } else { Modified: llvm/trunk/test/MC/Mips/macro-li.d.s URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/MC/Mips/macro-li.d.s?rev=374544&r1=374543&r2=374544&view=diff ============================================================================== --- llvm/trunk/test/MC/Mips/macro-li.d.s (original) +++ llvm/trunk/test/MC/Mips/macro-li.d.s Fri Oct 11 05:33:12 2019 @@ -49,12 +49,16 @@ li.d $4, 1.12345 # N32-N64: ld $4, 0($1) # encoding: [0x00,0x00,0x24,0xdc] li.d $4, 1 -# ALL: lui $4, 16368 # encoding: [0xf0,0x3f,0x04,0x3c] -# O32: addiu $5, $zero, 0 # encoding: [0x00,0x00,0x05,0x24] +# O32: lui $4, 16368 # encoding: [0xf0,0x3f,0x04,0x3c] +# O32: addiu $5, $zero, 0 # encoding: [0x00,0x00,0x05,0x24] +# N32-N64: ori $4, $zero, 65472 # encoding: [0xc0,0xff,0x04,0x34] +# N32-N64: dsll $4, $4, 46 # encoding: [0xbc,0x23,0x04,0x00] li.d $4, 1.0 -# ALL: lui $4, 16368 # encoding: [0xf0,0x3f,0x04,0x3c] -# O32: addiu $5, $zero, 0 # encoding: [0x00,0x00,0x05,0x24] +# O32: lui $4, 16368 # encoding: [0xf0,0x3f,0x04,0x3c] +# O32: addiu $5, $zero, 0 # encoding: [0x00,0x00,0x05,0x24] +# N32-N64: ori $4, $zero, 65472 # encoding: [0xc0,0xff,0x04,0x34] +# N32-N64: dsll $4, $4, 46 # encoding: [0xbc,0x23,0x04,0x00] li.d $4, 12345678910 # ALL: .section .rodata,"a", at progbits @@ -153,8 +157,10 @@ li.d $4, 0.4 # N32-N64: 
ld $4, 0($1) # encoding: [0x00,0x00,0x24,0xdc] li.d $4, 1.5 -# ALL: lui $4, 16376 # encoding: [0xf8,0x3f,0x04,0x3c] -# O32: addiu $5, $zero, 0 # encoding: [0x00,0x00,0x05,0x24] +# O32: lui $4, 16376 # encoding: [0xf8,0x3f,0x04,0x3c] +# O32: addiu $5, $zero, 0 # encoding: [0x00,0x00,0x05,0x24] +# N32-N64: ori $4, $zero, 65504 # encoding: [0xe0,0xff,0x04,0x34] +# N32-N64: dsll $4, $4, 46 # encoding: [0xbc,0x23,0x04,0x00] li.d $4, 12345678910.12345678910 # ALL: .section .rodata,"a", at progbits @@ -228,7 +234,7 @@ li.d $f4, 0 # CHECK-MIPS32r2: addiu $1, $zero, 0 # encoding: [0x00,0x00,0x01,0x24] # CHECK-MIPS32r2: mtc1 $zero, $f4 # encoding: [0x00,0x20,0x80,0x44] # CHECK-MIPS32r2: mthc1 $1, $f4 # encoding: [0x00,0x20,0xe1,0x44] -# N32-N64: addiu $1, $zero, 0 # encoding: [0x00,0x00,0x01,0x24] +# N32-N64: daddiu $1, $zero, 0 # encoding: [0x00,0x00,0x01,0x64] # N32-N64: dmtc1 $1, $f4 # encoding: [0x00,0x20,0xa1,0x44] li.d $f4, 0.0 @@ -238,7 +244,7 @@ li.d $f4, 0.0 # CHECK-MIPS32r2: addiu $1, $zero, 0 # encoding: [0x00,0x00,0x01,0x24] # CHECK-MIPS32r2: mtc1 $zero, $f4 # encoding: [0x00,0x20,0x80,0x44] # CHECK-MIPS32r2: mthc1 $1, $f4 # encoding: [0x00,0x20,0xe1,0x44] -# N32-N64: addiu $1, $zero, 0 # encoding: [0x00,0x00,0x01,0x24] +# N32-N64: daddiu $1, $zero, 0 # encoding: [0x00,0x00,0x01,0x64] # N32-N64: dmtc1 $1, $f4 # encoding: [0x00,0x20,0xa1,0x44] li.d $f4, 1.12345 @@ -271,7 +277,8 @@ li.d $f4, 1 # CHECK-MIPS32r2: lui $1, 16368 # encoding: [0xf0,0x3f,0x01,0x3c] # CHECK-MIPS32r2: mtc1 $zero, $f4 # encoding: [0x00,0x20,0x80,0x44] # CHECK-MIPS32r2: mthc1 $1, $f4 # encoding: [0x00,0x20,0xe1,0x44] -# N32-N64: lui $1, 16368 # encoding: [0xf0,0x3f,0x01,0x3c] +# N32-N64: ori $1, $zero, 65472 # encoding: [0xc0,0xff,0x01,0x34] +# N32-N64: dsll $1, $1, 46 # encoding: [0xbc,0x0b,0x01,0x00] # N32-N64: dmtc1 $1, $f4 # encoding: [0x00,0x20,0xa1,0x44] li.d $f4, 1.0 @@ -281,7 +288,8 @@ li.d $f4, 1.0 # CHECK-MIPS32r2: lui $1, 16368 # encoding: [0xf0,0x3f,0x01,0x3c] # CHECK-MIPS32r2: 
mtc1 $zero, $f4 # encoding: [0x00,0x20,0x80,0x44] # CHECK-MIPS32r2: mthc1 $1, $f4 # encoding: [0x00,0x20,0xe1,0x44] -# N32-N64: lui $1, 16368 # encoding: [0xf0,0x3f,0x01,0x3c] +# N32-N64: ori $1, $zero, 65472 # encoding: [0xc0,0xff,0x01,0x34] +# N32-N64: dsll $1, $1, 46 # encoding: [0xbc,0x0b,0x01,0x00] # N32-N64: dmtc1 $1, $f4 # encoding: [0x00,0x20,0xa1,0x44] li.d $f4, 12345678910 @@ -360,7 +368,8 @@ li.d $f4, 1.5 # CHECK-MIPS32r2: lui $1, 16376 # encoding: [0xf8,0x3f,0x01,0x3c] # CHECK-MIPS32r2: mtc1 $zero, $f4 # encoding: [0x00,0x20,0x80,0x44] # CHECK-MIPS32r2: mthc1 $1, $f4 # encoding: [0x00,0x20,0xe1,0x44] -# N32-N64: lui $1, 16376 # encoding: [0xf8,0x3f,0x01,0x3c] +# N32-N64: ori $1, $zero, 65504 # encoding: [0xe0,0xff,0x01,0x34] +# N32-N64: dsll $1, $1, 46 # encoding: [0xbc,0x0b,0x01,0x00] # N32-N64: dmtc1 $1, $f4 # encoding: [0x00,0x20,0xa1,0x44] li.d $f4, 2.5 @@ -370,7 +379,8 @@ li.d $f4, 2.5 # CHECK-MIPS32r2: lui $1, 16388 # encoding: [0x04,0x40,0x01,0x3c] # CHECK-MIPS32r2: mtc1 $zero, $f4 # encoding: [0x00,0x20,0x80,0x44] # CHECK-MIPS32r2: mthc1 $1, $f4 # encoding: [0x00,0x20,0xe1,0x44] -# N32-N64: lui $1, 16388 # encoding: [0x04,0x40,0x01,0x3c] +# N32-N64: ori $1, $zero, 32776 # encoding: [0x08,0x80,0x01,0x34] +# N32-N64: dsll $1, $1, 47 # encoding: [0xfc,0x0b,0x01,0x00] # N32-N64: dmtc1 $1, $f4 # encoding: [0x00,0x20,0xa1,0x44] li.d $f4, 2.515625 From llvm-commits at lists.llvm.org Fri Oct 11 05:35:47 2019 From: llvm-commits at lists.llvm.org (Petar Avramovic via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 12:35:47 +0000 (UTC) Subject: [PATCH] D68866: [MIPS GlobalISel] Refactor MipsRegisterBankInfo [NFC] Message-ID: Petar.Avramovic created this revision. Petar.Avramovic added reviewers: atanasyan, petarj. Herald added subscribers: llvm-commits, jrtc27, arichardson, rovka, sdardis. Herald added a project: LLVM. 
Check if size of operand LLT matches sizes of available register banks before inspecting the opcode in order to reduce number of checks. Factor commonly used pieces of code into functions. Repository: rL LLVM https://reviews.llvm.org/D68866 Files: lib/Target/Mips/MipsRegisterBankInfo.cpp lib/Target/Mips/MipsRegisterBankInfo.h -------------- next part -------------- A non-text attachment was scrubbed... Name: D68866.224573.patch Type: text/x-patch Size: 17482 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Fri Oct 11 05:35:47 2019 From: llvm-commits at lists.llvm.org (Pavel Labath via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 12:35:47 +0000 (UTC) Subject: [PATCH] D68270: DWARFDebugLoc: Add a function to get the address range of an entry In-Reply-To: References: Message-ID: labath marked an inline comment as done. labath added inline comments. ================ Comment at: lib/DebugInfo/DWARF/DWARFDebugLoc.cpp:291-295 + EntryIterator Absolute = + getAbsoluteLocations( + SectionedAddress{BaseAddr, SectionedAddress::UndefSection}, + LookupPooledAddress) + .begin(); ---------------- dblaikie wrote: > labath wrote: > > dblaikie wrote: > > > labath wrote: > > > > dblaikie wrote: > > > > > labath wrote: > > > > > > This parallel iteration is not completely nice, but I think it's worth being able to reuse the absolute range computation code. I'm open to ideas for improvement though. > > > > > Ah, I see - this is what you meant about "In particular it makes it possible to reuse this stuff in the dumping code, which would have been pretty hard with callbacks.". > > > > > > > > > > I'm wondering if that might be worth revisiting somewhat. A full iterator abstraction for one user here (well, two once you include lldb - but I assume it's likely going to build its own data structure from the iteration anyway, right? 
(it's not going to keep the iterator around, do anything interesting like partial iterations, re-iterate/etc - such that a callback would suffice)) > > > > > > > > > > I could imagine two callback APIs for this - one that gets entries and locations and one that only gets locations by filtering on the entry version. > > > > > > > > > > eg: > > > > > > > > > > // for non-verbose output: > > > > > LL.forEachEntry([&](const Entry &E, Expected L) { > > > > > if (Verbose && actually dumping debug_loc) > > > > > print(E) // print any LLE_*, raw parameters, etc > > > > > if (L) > > > > > print(*L) // print the resulting address range, section name (if verbose), > > > > > else > > > > > print(error stuff) > > > > > }); > > > > > > > > > > One question would be "when/where do we print the DWARF expression" - if there's an error computing the address range, we can still print the expression, so maybe that happens unconditionally at the end of the callback, using the expression in the Entry? (then, arguably, the expression doesn't need to be in the DWARFLocation - and I'd say make the DWARFLocation a sectioned range, exactly the same type as for ranges so that part of the dumping code, etc, can be maximally reused) > > > > Actually, what lldb currently does is that it does not build any data structures at all (except storing the pointer to the right place in the debug_loc section. Then, whenever it wants to do something to the loclist, it parses it afresh. I don't know why it does this exactly, but I assume it has something to do with most locations never being used, or being only a couple of times, and the actual parsing being fairly fast. What this means is that lldb is not really a single "user", but there are like four or five places where it iterates through the list, depending on what does it actually want to do with it. It also does partial iteration where it stops as soon as it find the entry it was interested in. 
> > > > Now, all of that is possible with a callback (though I am generally trying to avoid them), but it does resurface the issue of what should be the value of the second argument for DW_LLE_base_address entries (the thing which I originally used a error type for). > > > > Maybe this should be actually one callback API, taking two callback functions, with one of them being invoked for base_address entries, and one for others? However, if we stick to the current approaches in both LLE and RLE of making the address pool resolution function a parameter (which I'd like to keep, as it makes my job in lldb easier), then this would actually be three callbacks, which starts to get unwieldy. Though one of those callbacks could be removed with the "DWARFUnit implementing a AddrOffsetResolver interface" idea, which I really like. :) > > > Ah, thanks for the details on LLDB's location parsing logic. That's interesting indeed! > > > > > > I can appreciate an iterator-based API if that's the sort of usage we've got, though I expect it doesn't have any interest in the low-level encoding & just wants the fully processed address ranges/locations - it doesn't want base_address or end_of_list entries? & I think the dual-iteration is a fairly awkward API design, trying to iterate them in lock-step, etc. I'd rather avoid that if reasonably possible. 
> > > > > > Either having an iterator API that gives only the fully processed data/semantic view & a completely different API if you want to access the low level primitives (LLE, etc) (this is how ranges works - there's an API that gives a collection of ranges & abstracts over v4/v5/rnglists/etc - though that's partly motivated by a strong multi-client need for that functionality for symbolizing, etc - but I think it's a good abstraction/model anyway (& one of the reasons the inline range list printing doesn't include encoding information, the API it uses is too high level to even have access to it)) > > > > > > > Now, all of that is possible with a callback (though I am generally trying to avoid them), but it does resurface the issue of what should be the value of the second argument for DW_LLE_base_address entries (the thing which I originally used a error type for). > > > > > > Sorry, my intent in the above API was for the second argument to be Optional's "None" state when... oh, I see, I did use Expected there, rather than Optional, because there are legit error cases. > > > > > > I know it's sort of awkward, but I might be inclined to use Optional> there. I realize two layers of wrapping is a bit weird, but I think it'd be nicer than having an error state for what, I think, isn't erroneous. > > > > > > > Maybe this should be actually one callback API, taking two callback functions, with one of them being invoked for base_address entries, and one for others? However, if we stick to the current approaches in both LLE and RLE of making the address pool resolution function a parameter (which I'd like to keep, as it makes my job in lldb easier), then this would actually be three callbacks, which starts to get unwieldy. > > > > > > Don't mind three callbacks too much. > > > > > > > Though one of those callbacks could be removed with the "DWARFUnit implementing a AddrOffsetResolver interface" idea, which I really like. 
:) > > > > > > Sorry, I haven't really looked at where the address resolver callback is registered and alternative designs being discussed - but yeah, going off just the one-sentence, it seems reasonable to have the DWARFUnit own an address resolver/be the thing you consult when you want to resolve an address (just through a normal function call in DWARFUnit, perhaps - which might, internally, use a callback registered when it was constructed). > > > I know it's sort of awkward, but I might be inclined to use Optional> there. I realize two layers of wrapping is a bit weird, but I think it'd be nicer than having an error state for what, I think, isn't erroneous. > > Actually, my very first attempt at this patch used an `Expected>`, but then I scrapped it because I didn't think you'd like it. It's not the friendliest of APIs, but I think we can go with that. > > > > > Sorry, I haven't really looked at where the address resolver callback is registered and alternative designs being discussed - but yeah, going off just the one-sentence, it seems reasonable to have the DWARFUnit own an address resolver/be the thing you consult when you want to resolve an address (just through a normal function call in DWARFUnit, perhaps - which might, internally, use a callback registered when it was constructed). > > > > I think you got that backwards. I don't want the DWARFUnit to be the source of truth for address pool resolutions, as that would make it hard to use from lldb (it's far from ready to start using the llvm version right now). What I wanted was to replace the lambda/function_ref with a single-method interface. Then both DWARFUnits could implement that interface so that passing a DWARFUnit& would "just work" (but you wouldn't be limited to DWARFUnits as anyone could implement that interface, just like anyone can write a lambda). > As for Expected> (or Optional>) - yeah, I think this is a non-obvious API (both the general problem and this specific solution). 
I think it's probably worth discussing this design a bit more to save you time writing/rewriting things a bit. I guess there are a few layers of failure here. > > There's the possibility that the iteration itself could fail - even for debug_loc style lists (if we reached the end of the section before encountering a terminating {0,0}). That would suggest a fallible iterator idiom: http://llvm.org/docs/ProgrammersManual.html#building-fallible-iterators-and-iterator-ranges > > But then, yes, when looking at the "processed"/semantic view, that could fail too in the case of an invalid address index, etc. > > The generic/processed/abstracted-over-ranges-and-rnglists API for ranges produces a fully computer vector (& then returns Expected of that range) - is that reasonable? (this does mean manifesting a whole location in memory, which may not be needed so I could understand avoiding that even without fully implementing & demonstrating the vector solution is inadequate). > > But I /think/ maybe the we could/should have two APIs - one generic API that abstracts over loc/loclists and only provides the fully processed view, and another that is type specific for dumping the underlying representation (only used in dumping debug_loclists). If we were computing the final address ranges from scratch (which would be the best match for the current lldb usage, but which I am not considering now for fear of changing too many things), then I agree that we would need the fallible_iterator iterator thingy. But in this case we are "interpreting" the already parsed ranges, so we can assume some level of correctness here, and the thing that can fail is only the computation of a single range, which does not affect our ability to process the next entry. This indicates to me that either each entry in the list should be an Expected<>, or that the invalid entries should be just dropped (possibly accompanied by some flag which would tell the caller that the result was not exhaustive). 
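A rough sketch of the "each entry is its own Expected<>" idea, written against the standard library rather than LLVM so that it stands alone. Everything here is hypothetical: `RawEntry`, `AddressRange`, `RangeOrError`, `lookupPooledAddress` and `resolveAll` are illustration-only names, and `std::variant` merely stands in for `Expected<>`.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>
#include <variant>
#include <vector>

// Hypothetical stand-ins; none of these names come from the patch or LLVM.
struct RawEntry {     // an already-parsed entry, not yet made absolute
  bool UsesAddrIndex; // true if Low/High are address pool indices
  uint64_t Low, High;
};
struct AddressRange {
  uint64_t Low, High;
};

// Either a resolved range or a message explaining why this entry failed.
using RangeOrError = std::variant<AddressRange, std::string>;

// Toy address pool lookup (assumption): only indices 0 and 1 resolve.
inline std::optional<uint64_t> lookupPooledAddress(uint64_t Index) {
  static const uint64_t Pool[] = {0x1000, 0x2000};
  if (Index < 2)
    return Pool[Index];
  return std::nullopt;
}

// Resolve every entry independently: a failure in one entry is recorded
// in its own slot and does not stop processing of the following entries.
inline std::vector<RangeOrError>
resolveAll(const std::vector<RawEntry> &Entries, uint64_t Base) {
  std::vector<RangeOrError> Result;
  for (const RawEntry &E : Entries) {
    if (!E.UsesAddrIndex) {
      Result.push_back(AddressRange{Base + E.Low, Base + E.High});
      continue;
    }
    std::optional<uint64_t> Lo = lookupPooledAddress(E.Low);
    std::optional<uint64_t> Hi = lookupPooledAddress(E.High);
    if (Lo && Hi)
      Result.push_back(AddressRange{*Lo, *Hi});
    else
      Result.push_back(std::string("invalid address pool index"));
  }
  assert(Result.size() == Entries.size() && "one result per input entry");
  return Result;
}
```

The alternative mentioned above (dropping invalid entries and setting a "not exhaustive" flag) would return only the `AddressRange` values plus a boolean; the per-entry form keeps the position of each failure visible to the caller.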
This is connected to one of the issues I have with the debug ranges API -- it tries _really_ hard to return *something* -- if resolving the indirect base address entry fails, it is perfectly happy to use the address _index_ as the base address. This makes sense for dumping, where you want to show something (though it would still be good to indicate that you're not showing a real address), but it definitely does not help consumers which then need to make decisions based on the returned data. Anyway, yes, I agree that we need two APIs, and probably callbacks are the easiest way to achieve that. We could have a "base" callback that is not particularly nice to use, but provides the full information via a combination of `UnparsedLL` and `Optional>` arguments. The dumper could use that to print out everything it needs. And then we could have a second API, built on top of the first one, which ignores base address entries and the raw data and returns just a bunch of `Expected`. This could be used by users like lldb, who just want to see the final data. The `ParsedLL` type would be independent of the location list type, so that the debug_loc parser could provide the same kind of API (but implemented on top of something else, as the `UnparsedLL` types would differ). Also, under the hood, the location list dumper for debug_loclists (but not debug_loc) could reuse some implementation details with the debug_rnglists dumper via a suitable combination of templates and callbacks. How does that sound? Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68270/new/ https://reviews.llvm.org/D68270 From llvm-commits at lists.llvm.org Fri Oct 11 05:35:47 2019 From: llvm-commits at lists.llvm.org (Petar Avramovic via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 12:35:47 +0000 (UTC) Subject: [PATCH] D68867: [MIPS GlobalISel] Add MSA registers to fprb. Select vector load, store Message-ID: Petar.Avramovic created this revision.
Petar.Avramovic added reviewers: atanasyan, petarj. Herald added subscribers: llvm-commits, jrtc27, arichardson, rovka, sdardis. Herald added a project: LLVM. Add vector MSA register classes to fprb, they are 128 bit wide. MSA instructions use the same registers for both integer and floating point operations. Therefore we only need to check for vector element size during legalization or instruction selection. Add helper function in MipsLegalizerInfo and switch to legalIf LegalizeRuleSet to keep legalization rules compact since they depend on MipsSubtarget and presence of MSA. fprb is assigned to all vector operands. Move selectLoadStoreOpCode to MipsInstructionSelector in order to reduce number of arguments. Repository: rL LLVM https://reviews.llvm.org/D68867 Files: lib/Target/Mips/MipsInstructionSelector.cpp lib/Target/Mips/MipsLegalizerInfo.cpp lib/Target/Mips/MipsRegisterBankInfo.cpp lib/Target/Mips/MipsRegisterBanks.td test/CodeGen/Mips/GlobalISel/instruction-select/load_store_vec.mir test/CodeGen/Mips/GlobalISel/legalizer/load_store_vec.mir test/CodeGen/Mips/GlobalISel/llvm-ir/load_store_vec.ll test/CodeGen/Mips/GlobalISel/regbankselect/load_store_vec.mir -------------- next part -------------- A non-text attachment was scrubbed... Name: D68867.224575.patch Type: text/x-patch Size: 27777 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Fri Oct 11 05:36:00 2019 From: llvm-commits at lists.llvm.org (Simon Atanasyan via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 12:36:00 +0000 (UTC) Subject: [PATCH] D68776: [mips] Fix loading "double" immediate into a GPR and FPR In-Reply-To: References: Message-ID: <80774763d1b9b71f7c07e99ecacb7de8@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rGb051a19aa02d: [mips] Fix loading "double" immediate into a GPR and FPR (authored by atanasyan). 
Changed prior to commit: https://reviews.llvm.org/D68776?vs=224298&id=224576#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68776/new/ https://reviews.llvm.org/D68776 Files: llvm/lib/Target/Mips/AsmParser/MipsAsmParser.cpp llvm/test/MC/Mips/macro-li.d.s -------------- next part -------------- A non-text attachment was scrubbed... Name: D68776.224576.patch Type: text/x-patch Size: 7238 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Fri Oct 11 05:45:15 2019 From: llvm-commits at lists.llvm.org (Simon Atanasyan via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 12:45:15 +0000 (UTC) Subject: [PATCH] D68777: [mips] Use less instruction to load zero into FPR by li.s / li.d pseudos In-Reply-To: References: Message-ID: atanasyan updated this revision to Diff 224577. atanasyan added a comment. Rebased against the master branch. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68777/new/ https://reviews.llvm.org/D68777 Files: llvm/lib/Target/Mips/AsmParser/MipsAsmParser.cpp llvm/test/MC/Mips/macro-li.d.s llvm/test/MC/Mips/macro-li.s.s -------------- next part -------------- A non-text attachment was scrubbed... Name: D68777.224577.patch Type: text/x-patch Size: 5070 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Fri Oct 11 05:50:57 2019 From: llvm-commits at lists.llvm.org (Kai Nacke via llvm-commits) Date: Fri, 11 Oct 2019 12:50:57 -0000 Subject: [llvm] r374547 - [Tests] Output of od can be lower or upper case (llvm-objcopy/yaml2obj). Message-ID: <20191011125057.9FEC593059@lists.llvm.org> Author: redstar Date: Fri Oct 11 05:50:57 2019 New Revision: 374547 URL: http://llvm.org/viewvc/llvm-project?rev=374547&view=rev Log: [Tests] Output of od can be lower or upper case (llvm-objcopy/yaml2obj). The command `od -t x` is used to dump data in hex format. The LIT tests assumes that the hex characters are in lowercase. 
However, there are also platforms which use uppercase letter. To solve this issue the tests are updated to use the new `--ignore-case` option of FileCheck. Reviewers: Bigcheese, jakehehrlich, rupprecht, espindola, alexshap, jhenderson Differential Revision: https://reviews.llvm.org/D68693 Modified: llvm/trunk/test/tools/llvm-objcopy/ELF/basic-binary-copy.test llvm/trunk/test/tools/llvm-objcopy/ELF/binary-no-paddr.test llvm/trunk/test/tools/llvm-objcopy/ELF/binary-paddr.test llvm/trunk/test/tools/llvm-objcopy/ELF/binary-segment-layout.test llvm/trunk/test/tools/llvm-objcopy/ELF/check-addr-offset-align-binary.test llvm/trunk/test/tools/llvm-objcopy/ELF/dump-section.test llvm/trunk/test/tools/llvm-objcopy/ELF/preserve-segment-contents.test llvm/trunk/test/tools/llvm-objcopy/ELF/strip-all-gnu.test llvm/trunk/test/tools/llvm-objcopy/ELF/strip-sections.test llvm/trunk/test/tools/yaml2obj/elf-override-shoffset.yaml llvm/trunk/test/tools/yaml2obj/elf-override-shsize.yaml Modified: llvm/trunk/test/tools/llvm-objcopy/ELF/basic-binary-copy.test URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/llvm-objcopy/ELF/basic-binary-copy.test?rev=374547&r1=374546&r2=374547&view=diff ============================================================================== --- llvm/trunk/test/tools/llvm-objcopy/ELF/basic-binary-copy.test (original) +++ llvm/trunk/test/tools/llvm-objcopy/ELF/basic-binary-copy.test Fri Oct 11 05:50:57 2019 @@ -1,6 +1,6 @@ # RUN: yaml2obj %s -o %t # RUN: llvm-objcopy -O binary %t %t2 -# RUN: od -t x2 -v %t2 | FileCheck %s +# RUN: od -t x2 -v %t2 | FileCheck %s --ignore-case # RUN: wc -c < %t2 | FileCheck %s --check-prefix=SIZE !ELF Modified: llvm/trunk/test/tools/llvm-objcopy/ELF/binary-no-paddr.test URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/llvm-objcopy/ELF/binary-no-paddr.test?rev=374547&r1=374546&r2=374547&view=diff ============================================================================== --- 
llvm/trunk/test/tools/llvm-objcopy/ELF/binary-no-paddr.test (original) +++ llvm/trunk/test/tools/llvm-objcopy/ELF/binary-no-paddr.test Fri Oct 11 05:50:57 2019 @@ -1,6 +1,6 @@ # RUN: yaml2obj %s -o %t # RUN: llvm-objcopy -O binary %t %t2 -# RUN: od -t x2 -v %t2 | FileCheck %s +# RUN: od -t x2 -v %t2 | FileCheck %s --ignore-case # RUN: wc -c < %t2 | FileCheck %s --check-prefix=SIZE !ELF Modified: llvm/trunk/test/tools/llvm-objcopy/ELF/binary-paddr.test URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/llvm-objcopy/ELF/binary-paddr.test?rev=374547&r1=374546&r2=374547&view=diff ============================================================================== --- llvm/trunk/test/tools/llvm-objcopy/ELF/binary-paddr.test (original) +++ llvm/trunk/test/tools/llvm-objcopy/ELF/binary-paddr.test Fri Oct 11 05:50:57 2019 @@ -1,6 +1,6 @@ # RUN: yaml2obj %s -o %t # RUN: llvm-objcopy -O binary %t %t2 -# RUN: od -t x2 %t2 | FileCheck %s +# RUN: od -t x2 %t2 | FileCheck %s --ignore-case # RUN: wc -c < %t2 | FileCheck %s --check-prefix=SIZE !ELF Modified: llvm/trunk/test/tools/llvm-objcopy/ELF/binary-segment-layout.test URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/llvm-objcopy/ELF/binary-segment-layout.test?rev=374547&r1=374546&r2=374547&view=diff ============================================================================== --- llvm/trunk/test/tools/llvm-objcopy/ELF/binary-segment-layout.test (original) +++ llvm/trunk/test/tools/llvm-objcopy/ELF/binary-segment-layout.test Fri Oct 11 05:50:57 2019 @@ -1,6 +1,6 @@ # RUN: yaml2obj %s -o %t # RUN: llvm-objcopy -O binary %t %t2 -# RUN: od -t x2 %t2 | FileCheck %s +# RUN: od -t x2 %t2 | FileCheck %s --ignore-case # RUN: wc -c < %t2 | FileCheck %s --check-prefix=SIZE !ELF Modified: llvm/trunk/test/tools/llvm-objcopy/ELF/check-addr-offset-align-binary.test URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/llvm-objcopy/ELF/check-addr-offset-align-binary.test?rev=374547&r1=374546&r2=374547&view=diff 
============================================================================== --- llvm/trunk/test/tools/llvm-objcopy/ELF/check-addr-offset-align-binary.test (original) +++ llvm/trunk/test/tools/llvm-objcopy/ELF/check-addr-offset-align-binary.test Fri Oct 11 05:50:57 2019 @@ -1,6 +1,6 @@ # RUN: yaml2obj %s -o %t # RUN: llvm-objcopy -O binary %t %t2 -# RUN: od -t x1 %t2 | FileCheck %s +# RUN: od -t x1 %t2 | FileCheck %s --ignore-case !ELF FileHeader: Modified: llvm/trunk/test/tools/llvm-objcopy/ELF/dump-section.test URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/llvm-objcopy/ELF/dump-section.test?rev=374547&r1=374546&r2=374547&view=diff ============================================================================== --- llvm/trunk/test/tools/llvm-objcopy/ELF/dump-section.test (original) +++ llvm/trunk/test/tools/llvm-objcopy/ELF/dump-section.test Fri Oct 11 05:50:57 2019 @@ -4,8 +4,8 @@ # RUN: llvm-objcopy --dump-section .text=%t4 %t %t5 # RUN: llvm-objcopy --dump-section .foo=%t6 %t %t7 # RUN: not llvm-objcopy --dump-section .bar=%t8 %t %t9 2>&1 | FileCheck %s --check-prefix=NOBITS -DINPUT=%t -# RUN: od -t x1 %t2 | FileCheck %s -# RUN: od -t x1 %t6 | FileCheck %s --check-prefix=NON-ALLOC +# RUN: od -t x1 %t2 | FileCheck %s --ignore-case +# RUN: od -t x1 %t6 | FileCheck %s --ignore-case --check-prefix=NON-ALLOC # RUN: wc -c %t2 | FileCheck %s --check-prefix=SIZE # RUN: diff %t2 %t3 # RUN: diff %t4 %t3 Modified: llvm/trunk/test/tools/llvm-objcopy/ELF/preserve-segment-contents.test URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/llvm-objcopy/ELF/preserve-segment-contents.test?rev=374547&r1=374546&r2=374547&view=diff ============================================================================== --- llvm/trunk/test/tools/llvm-objcopy/ELF/preserve-segment-contents.test (original) +++ llvm/trunk/test/tools/llvm-objcopy/ELF/preserve-segment-contents.test Fri Oct 11 05:50:57 2019 @@ -13,13 +13,13 @@ # RUN: llvm-objcopy %t.base %t.stripped 
--regex -R blob.* # Show that the removal leaves the bytes as zeroes, as desired, for all our # test cases. -# RUN: od -t x1 -j 0x2000 -N 24 %t.stripped | FileCheck %s --check-prefix=CHECK1 -DPATTERN="00 00 00 00" -# RUN: od -t x1 -j 0x2100 -N 12 %t.stripped | FileCheck %s --check-prefix=CHECK2 -DPATTERN="00 00 00 00" -# RUN: od -t x1 -j 0x2200 -N 4 %t.stripped | FileCheck %s --check-prefix=CHECK3 -DPATTERN="00 00 00 00" -# RUN: od -t x1 -j 0x2300 -N 12 %t.stripped | FileCheck %s --check-prefix=CHECK4 -DPATTERN="00 00 00 00" -# RUN: od -t x1 -j 0x3000 -N 68 %t.stripped | FileCheck %s --check-prefix=CHECK5 -DPATTERN="00 00 00 00" -# RUN: od -t x1 -j 0x4000 -N 60 %t.stripped | FileCheck %s --check-prefix=CHECK6 -DPATTERN="00 00 00 00" -# RUN: od -t x1 -j 0x5000 -N 60 %t.stripped | FileCheck %s --check-prefix=CHECK7 -DPATTERN="00 00 00 00" +# RUN: od -t x1 -j 0x2000 -N 24 %t.stripped | FileCheck %s --ignore-case --check-prefix=CHECK1 -DPATTERN="00 00 00 00" +# RUN: od -t x1 -j 0x2100 -N 12 %t.stripped | FileCheck %s --ignore-case --check-prefix=CHECK2 -DPATTERN="00 00 00 00" +# RUN: od -t x1 -j 0x2200 -N 4 %t.stripped | FileCheck %s --ignore-case --check-prefix=CHECK3 -DPATTERN="00 00 00 00" +# RUN: od -t x1 -j 0x2300 -N 12 %t.stripped | FileCheck %s --ignore-case --check-prefix=CHECK4 -DPATTERN="00 00 00 00" +# RUN: od -t x1 -j 0x3000 -N 68 %t.stripped | FileCheck %s --ignore-case --check-prefix=CHECK5 -DPATTERN="00 00 00 00" +# RUN: od -t x1 -j 0x4000 -N 60 %t.stripped | FileCheck %s --ignore-case --check-prefix=CHECK6 -DPATTERN="00 00 00 00" +# RUN: od -t x1 -j 0x5000 -N 60 %t.stripped | FileCheck %s --ignore-case --check-prefix=CHECK7 -DPATTERN="00 00 00 00" # RUN: cp %t.stripped %t.in # RUN: echo "with open('%/t.in', 'rb+') as input:" > %t.py @@ -32,13 +32,13 @@ # RUN: echo " input.write(bytearray.fromhex('DEADBEEF'))" >> %t.py # RUN: %python %t.py # RUN: llvm-objcopy %t.in %t.out -# RUN: od -t x1 -j 0x2000 -N 24 %t.out | FileCheck %s --check-prefix=CHECK1 
-DPATTERN="de ad be ef" -# RUN: od -t x1 -j 0x2100 -N 12 %t.out | FileCheck %s --check-prefix=CHECK2 -DPATTERN="de ad be ef" -# RUN: od -t x1 -j 0x2200 -N 4 %t.out | FileCheck %s --check-prefix=CHECK3 -DPATTERN="de ad be ef" -# RUN: od -t x1 -j 0x2300 -N 12 %t.out | FileCheck %s --check-prefix=CHECK4 -DPATTERN="de ad be ef" -# RUN: od -t x1 -j 0x3000 -N 68 %t.out | FileCheck %s --check-prefix=CHECK5 -DPATTERN="de ad be ef" -# RUN: od -t x1 -j 0x4000 -N 60 %t.out | FileCheck %s --check-prefix=CHECK6 -DPATTERN="de ad be ef" -# RUN: od -t x1 -j 0x5000 -N 60 %t.out | FileCheck %s --check-prefix=CHECK7 -DPATTERN="de ad be ef" +# RUN: od -t x1 -j 0x2000 -N 24 %t.out | FileCheck %s --ignore-case --check-prefix=CHECK1 -DPATTERN="de ad be ef" +# RUN: od -t x1 -j 0x2100 -N 12 %t.out | FileCheck %s --ignore-case --check-prefix=CHECK2 -DPATTERN="de ad be ef" +# RUN: od -t x1 -j 0x2200 -N 4 %t.out | FileCheck %s --ignore-case --check-prefix=CHECK3 -DPATTERN="de ad be ef" +# RUN: od -t x1 -j 0x2300 -N 12 %t.out | FileCheck %s --ignore-case --check-prefix=CHECK4 -DPATTERN="de ad be ef" +# RUN: od -t x1 -j 0x3000 -N 68 %t.out | FileCheck %s --ignore-case --check-prefix=CHECK5 -DPATTERN="de ad be ef" +# RUN: od -t x1 -j 0x4000 -N 60 %t.out | FileCheck %s --ignore-case --check-prefix=CHECK6 -DPATTERN="de ad be ef" +# RUN: od -t x1 -j 0x5000 -N 60 %t.out | FileCheck %s --ignore-case --check-prefix=CHECK7 -DPATTERN="de ad be ef" # CHECK1: [[PATTERN]] 11 22 33 44 [[PATTERN]] [[PATTERN]] # CHECK1-NEXT: 55 66 77 88 [[PATTERN]] Modified: llvm/trunk/test/tools/llvm-objcopy/ELF/strip-all-gnu.test URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/llvm-objcopy/ELF/strip-all-gnu.test?rev=374547&r1=374546&r2=374547&view=diff ============================================================================== --- llvm/trunk/test/tools/llvm-objcopy/ELF/strip-all-gnu.test (original) +++ llvm/trunk/test/tools/llvm-objcopy/ELF/strip-all-gnu.test Fri Oct 11 05:50:57 2019 @@ -7,8 +7,8 @@ # Show 
that the debug section in a segment was removed, to match GNU. # First validate that the offset in use is correct. # RUN: llvm-objcopy %t %t4 -# RUN: od -t x1 -N 4 -j 120 %t4 | FileCheck %s --check-prefix=COPY-BYTES -# RUN: od -t x1 -N 4 -j 120 %t2 | FileCheck %s --check-prefix=STRIP-BYTES +# RUN: od -t x1 -N 4 -j 120 %t4 | FileCheck %s --ignore-case --check-prefix=COPY-BYTES +# RUN: od -t x1 -N 4 -j 120 %t2 | FileCheck %s --ignore-case --check-prefix=STRIP-BYTES !ELF FileHeader: Modified: llvm/trunk/test/tools/llvm-objcopy/ELF/strip-sections.test URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/llvm-objcopy/ELF/strip-sections.test?rev=374547&r1=374546&r2=374547&view=diff ============================================================================== --- llvm/trunk/test/tools/llvm-objcopy/ELF/strip-sections.test (original) +++ llvm/trunk/test/tools/llvm-objcopy/ELF/strip-sections.test Fri Oct 11 05:50:57 2019 @@ -1,12 +1,12 @@ # RUN: yaml2obj %s > %t # RUN: llvm-objcopy --strip-sections %t %t2 # RUN: llvm-readobj --file-headers --program-headers %t2 | FileCheck %s -# RUN: od -t x1 -j 4096 -N 12 %t2 | FileCheck %s --check-prefix=DATA +# RUN: od -t x1 -j 4096 -N 12 %t2 | FileCheck %s --ignore-case --check-prefix=DATA ## Sanity check the DATA-NOT line by showing that "fe ed fa ce" appears ## if --strip-sections is not specified. # RUN: llvm-objcopy %t %t3 -# RUN: od -t x1 -j 4096 -N 12 %t3 | FileCheck %s --check-prefix=VALIDATE +# RUN: od -t x1 -j 4096 -N 12 %t3 | FileCheck %s --ignore-case --check-prefix=VALIDATE ## Check that llvm-strip --strip-sections is equivalent to ## llvm-objcopy --strip-sections. 
Modified: llvm/trunk/test/tools/yaml2obj/elf-override-shoffset.yaml URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/yaml2obj/elf-override-shoffset.yaml?rev=374547&r1=374546&r2=374547&view=diff ============================================================================== --- llvm/trunk/test/tools/yaml2obj/elf-override-shoffset.yaml (original) +++ llvm/trunk/test/tools/yaml2obj/elf-override-shoffset.yaml Fri Oct 11 05:50:57 2019 @@ -75,7 +75,7 @@ Sections: # RUN: yaml2obj --docnum=3 %s -o %t3 # RUN: od -t x1 -v %t2 > %t.txt # RUN: od -t x1 -v %t3 >> %t.txt -# RUN: FileCheck %s --input-file=%t.txt --check-prefix=CASE2 +# RUN: FileCheck %s --input-file=%t.txt --ignore-case --check-prefix=CASE2 # CASE2: [[OFFSET:.*]] fe fe fe fe fe fe fe fe # CASE2: [[FILESIZE:.*]]{{$}} Modified: llvm/trunk/test/tools/yaml2obj/elf-override-shsize.yaml URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/yaml2obj/elf-override-shsize.yaml?rev=374547&r1=374546&r2=374547&view=diff ============================================================================== --- llvm/trunk/test/tools/yaml2obj/elf-override-shsize.yaml (original) +++ llvm/trunk/test/tools/yaml2obj/elf-override-shsize.yaml Fri Oct 11 05:50:57 2019 @@ -74,7 +74,7 @@ Sections: # RUN: yaml2obj --docnum=3 %s -o %t3 # RUN: od -t x1 -v %t2 > %t.txt # RUN: od -t x1 -v %t3 >> %t.txt -# RUN: FileCheck %s --input-file=%t.txt --check-prefix=CASE2 +# RUN: FileCheck %s --input-file=%t.txt --ignore-case --check-prefix=CASE2 # CASE2: [[OFFSET:.*]] fe fe fe fe fe fe fe fe # CASE2: [[FILESIZE:.*]]{{$}} @@ -136,7 +136,7 @@ Sections: ## bytes written is equal to Size in this case. 
# RUN: yaml2obj --docnum=5 %s -o %t5 -# RUN: od -t x1 -v %t5 | FileCheck %s --check-prefix=CASE5 +# RUN: od -t x1 -v %t5 | FileCheck %s --ignore-case --check-prefix=CASE5 # CASE5: aa aa 00 00 bb bb From llvm-commits at lists.llvm.org Fri Oct 11 05:54:15 2019 From: llvm-commits at lists.llvm.org (Orlando Cazalet-Hyams via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 12:54:15 +0000 (UTC) Subject: [PATCH] D68816: [NFC] Replace a linked list in LiveDebugVariables pass with a DenseMap In-Reply-To: References: Message-ID: Orlando updated this revision to Diff 224579. Orlando edited the summary of this revision. Orlando added a comment. Addressed comments and fixed Summary (previously it referred to LiveDebugValues.cpp instead of LiveDebugVariables.cpp). @aprantl > Not your code, but: add a FragmentInfo element? As you point out in D66415 we need to take care to track overlapping fragment ranges too. This shouldn't be too hard but it makes more sense to me to rebase D66415 on this and work from there. > How many elements does this have on average? Is a SmallDenseMap a win? The following numbers are only anecdotal of course. I compiled two example single file projects (A: ~100,000 loc, B: ~20,000 loc) and in both cases roughly 90% of UserVarMap.size() were <= 64. 
--------------------------------------
Project A: UserVarMap size
--------------------------------------
Size <= x | Num maps | % of total maps
--------------------------------------
        0 |      697 | 50
        2 |      770 | 55.2
        4 |      797 | 57.2
        8 |      858 | 61.5
       16 |      969 | 69.5
       32 |     1109 | 79.6
       64 |     1221 | 87.6
      128 |     1311 | 94.0
      256 |     1361 | 97.6
      512 |     1379 | 98.9
     1024 |     1388 | 99.6

--------------------------------------
Project B: UserVarMap size
--------------------------------------
Size <= x | Num maps | % of total maps
--------------------------------------
        0 |      296 | 50.3
        2 |      306 | 52.0
        4 |      333 | 56.6
        8 |      364 | 61.9
       16 |      397 | 67.5
       32 |      477 | 81.1
       64 |      530 | 90.1
      128 |      572 | 97.3
      256 |      582 | 99.0
      512 |      587 | 99.8
     1024 |      588 | 100

I don't have any self-host build time stats for this yet but the default DenseMap 64 element alloc seems almost perfect for these examples. Assuming a SmallDenseMap allocates space in the object (like SmallVector) and given that LDVImpl seems to only ever be heap allocated, a SmallDenseMap<..., 64> would likely give some (small) build time reduction. I'll add more info here when I can get some self-host build times.

CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68816/new/ https://reviews.llvm.org/D68816 Files: llvm/lib/CodeGen/LiveDebugVariables.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D68816.224579.patch Type: text/x-patch Size: 9756 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Fri Oct 11 05:58:38 2019 From: llvm-commits at lists.llvm.org (Simon Atanasyan via llvm-commits) Date: Fri, 11 Oct 2019 12:58:38 -0000 Subject: [llvm] r374548 - [mips] Follow-up to r374544. Fix test case. Message-ID: <20191011125838.28FE1838E5@lists.llvm.org> Author: atanasyan Date: Fri Oct 11 05:58:37 2019 New Revision: 374548 URL: http://llvm.org/viewvc/llvm-project?rev=374548&view=rev Log: [mips] Follow-up to r374544. Fix test case.
Modified: llvm/trunk/test/MC/Mips/macro-li.d.s Modified: llvm/trunk/test/MC/Mips/macro-li.d.s URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/MC/Mips/macro-li.d.s?rev=374548&r1=374547&r2=374548&view=diff ============================================================================== --- llvm/trunk/test/MC/Mips/macro-li.d.s (original) +++ llvm/trunk/test/MC/Mips/macro-li.d.s Fri Oct 11 05:58:37 2019 @@ -234,7 +234,7 @@ li.d $f4, 0 # CHECK-MIPS32r2: addiu $1, $zero, 0 # encoding: [0x00,0x00,0x01,0x24] # CHECK-MIPS32r2: mtc1 $zero, $f4 # encoding: [0x00,0x20,0x80,0x44] # CHECK-MIPS32r2: mthc1 $1, $f4 # encoding: [0x00,0x20,0xe1,0x44] -# N32-N64: daddiu $1, $zero, 0 # encoding: [0x00,0x00,0x01,0x64] +# N32-N64: addiu $1, $zero, 0 # encoding: [0x00,0x00,0x01,0x24] # N32-N64: dmtc1 $1, $f4 # encoding: [0x00,0x20,0xa1,0x44] li.d $f4, 0.0 @@ -244,7 +244,7 @@ li.d $f4, 0.0 # CHECK-MIPS32r2: addiu $1, $zero, 0 # encoding: [0x00,0x00,0x01,0x24] # CHECK-MIPS32r2: mtc1 $zero, $f4 # encoding: [0x00,0x20,0x80,0x44] # CHECK-MIPS32r2: mthc1 $1, $f4 # encoding: [0x00,0x20,0xe1,0x44] -# N32-N64: daddiu $1, $zero, 0 # encoding: [0x00,0x00,0x01,0x64] +# N32-N64: addiu $1, $zero, 0 # encoding: [0x00,0x00,0x01,0x24] # N32-N64: dmtc1 $1, $f4 # encoding: [0x00,0x20,0xa1,0x44] li.d $f4, 1.12345 From llvm-commits at lists.llvm.org Fri Oct 11 06:03:48 2019 From: llvm-commits at lists.llvm.org (Momchil Velikov via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 13:03:48 +0000 (UTC) Subject: [PATCH] D68862: [ARM] Allocatable Global Register Variables for ARM In-Reply-To: References: Message-ID: <596f92c63aaad014c11749eb9459fddb@localhost.localdomain> chill added a comment. TBH, I quite dislike the creeping abuse of `SubtargetFeature`s as code generation options. cf. Target.td:1477 //===----------------------------------------------------------------------===// // SubtargetFeature - A characteristic of the chip set. 
// IMHO, since reserved registers are per-function, this strongly suggests implementation as function attribute(s), rather than subtarget features (also for the pre-existing r9). It also opens the path towards a possible future `__attribute__`. ================ Comment at: clang/include/clang/Basic/TargetInfo.h:944 + /// using the corresponding -ffixed-RegName option. + virtual bool isRegisterReservedGlobally(StringRef RegName) const { + return true; ---------------- Parameter name can be omitted if unused; that would remove a potential warning. ================ Comment at: clang/lib/Basic/Targets/ARM.cpp:884 + StringRef RegName, unsigned RegSize, bool &HasSizeMismatch) const { + if (RegName.equals("r6") || RegName.equals("r7") || RegName.equals("r8") || + RegName.equals("r9") || RegName.equals("r10") || RegName.equals("r11") || ---------------- Perhaps you can use `RegName == "r6"` here, or a string switch? ================ Comment at: clang/lib/Basic/Targets/ARM.cpp:890 + } + return false; +} ---------------- `HasSizeMismatch` is not set along all possible paths. ================ Comment at: clang/lib/Basic/Targets/ARM.cpp:895 + // The "sp" register does not have a -ffixed-sp option, + // so enable it unconditionally. + if (RegName.equals("sp")) ---------------- s/enable/reserve/ ? ================ Comment at: clang/lib/Basic/Targets/ARM.cpp:899 + + // enable rN (N:6-11) registers only if the corresponding + // +reserve-rN feature is found ---------------- Likewise ? ================ Comment at: clang/lib/Basic/Targets/ARM.cpp:901-902 + // +reserve-rN feature is found + std::vector &Features = getTargetOpts().Features; + std::string SearchFeature = "+reserve-" + RegName.str(); + for (std::string &Feature : Features) { ---------------- These variables can be `const`.
================ Comment at: clang/lib/Basic/Targets/ARM.cpp:903-907 + for (std::string &Feature : Features) { + if (Feature.compare(SearchFeature) == 0) + return true; + } + return false; ---------------- This explicit loop can be written like: ``` return llvm::any_of(getTargetOpts().Features, [&](auto &P) { return P == SearchFeature; }); ``` ================ Comment at: llvm/lib/Target/ARM/ARMSubtarget.h:236 + // ReservedRRegisters[i] - R#i is not available as a general purpose register. + BitVector ReservedRRegisters; ---------------- The usual designation for these registers is "GPR". I suggest either `ReservedGPRegisters` or just `ReservedRegisters`, here and elsewhere. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68862/new/ https://reviews.llvm.org/D68862 From llvm-commits at lists.llvm.org Fri Oct 11 06:03:49 2019 From: llvm-commits at lists.llvm.org (Momchil Velikov via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 13:03:49 +0000 (UTC) Subject: [PATCH] D68862: [ARM] Allocatable Global Register Variables for ARM In-Reply-To: References: Message-ID: chill added inline comments. ================ Comment at: clang/lib/Basic/Targets/ARM.cpp:902 + std::vector &Features = getTargetOpts().Features; + std::string SearchFeature = "+reserve-" + RegName.str(); + for (std::string &Feature : Features) { ---------------- SjoerdMeijer wrote: > I was pointed at something similar myself recently, but if I am not mistaken then I think this is a use-after-free: > > "+reserve-" + RegName.str() > > this will allocate a temp `std::string` that `SearchFeature` points to, which then gets released, and `SearchFeature` is still pointing at it. Any temporaries would be destructed at the end of the full expression. By that time, the `SearchString` would be constructed and stand on its own.
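The lifetime argument above, together with the suggested replacement of the explicit loop, can be illustrated with standard-library types only. This is a sketch, not the Clang code: `isFeatureListed` is a hypothetical helper, and `std::any_of` is used in place of `llvm::any_of` so that the example compiles without LLVM headers.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper mirroring the reviewed logic with std:: types only.
bool isFeatureListed(const std::vector<std::string> &Features,
                     const std::string &RegName) {
  // operator+ yields a temporary std::string. SearchFeature is initialized
  // from it and owns its own character buffer, so nothing dangles once the
  // temporary is destroyed at the end of the full expression.
  const std::string SearchFeature = "+reserve-" + RegName;

  // Equivalent of the explicit for-loop: true if any element matches.
  return std::any_of(
      Features.begin(), Features.end(),
      [&](const std::string &F) { return F == SearchFeature; });
}
```

The concern would be justified if `SearchFeature` were a non-owning view (for instance a `StringRef` or `std::string_view`) bound to that temporary, because the view would then refer to freed storage after the statement ends.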
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68862/new/ https://reviews.llvm.org/D68862 From llvm-commits at lists.llvm.org Fri Oct 11 06:03:49 2019 From: llvm-commits at lists.llvm.org (Simon Atanasyan via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 13:03:49 +0000 (UTC) Subject: [PATCH] D68777: [mips] Use less instruction to load zero into FPR by li.s / li.d pseudos In-Reply-To: References: Message-ID: <84a6762512e6fe4e853f5e5f8df9338f@localhost.localdomain> atanasyan updated this revision to Diff 224580. atanasyan added a comment. Rebased against the master branch. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68777/new/ https://reviews.llvm.org/D68777 Files: llvm/lib/Target/Mips/AsmParser/MipsAsmParser.cpp llvm/test/MC/Mips/macro-li.d.s llvm/test/MC/Mips/macro-li.s.s -------------- next part -------------- A non-text attachment was scrubbed... Name: D68777.224580.patch Type: text/x-patch Size: 5070 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Fri Oct 11 06:12:55 2019 From: llvm-commits at lists.llvm.org (=?utf-8?q?Milo=C5=A1_Stojanovi=C4=87_via_Phabricator?= via llvm-commits) Date: Fri, 11 Oct 2019 13:12:55 +0000 (UTC) Subject: [PATCH] D68777: [mips] Use less instruction to load zero into FPR by li.s / li.d pseudos In-Reply-To: References: Message-ID: <3eaaaa40722f7d85d43a44d9bc4c48b6@localhost.localdomain> mstojanovic accepted this revision. mstojanovic added a comment. This revision is now accepted and ready to land. 
LGTM Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68777/new/ https://reviews.llvm.org/D68777 From llvm-commits at lists.llvm.org Fri Oct 11 06:12:58 2019 From: llvm-commits at lists.llvm.org (Sean Fertile via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 13:12:58 +0000 (UTC) Subject: [PATCH] D68815: [AIX] Use .space instead of .zero in assembly In-Reply-To: References: Message-ID: <705b48aac9bc961b6b15438e983320c1@localhost.localdomain> sfertile accepted this revision. sfertile added a comment. This revision is now accepted and ready to land. LGTM. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68815/new/ https://reviews.llvm.org/D68815 From llvm-commits at lists.llvm.org Fri Oct 11 06:22:11 2019 From: llvm-commits at lists.llvm.org (=?utf-8?q?Milo=C5=A1_Stojanovi=C4=87_via_Phabricator?= via llvm-commits) Date: Fri, 11 Oct 2019 13:22:11 +0000 (UTC) Subject: [PATCH] D68778: [mips] Store 64-bit `li.d' operand as a single 8-byte value In-Reply-To: References: Message-ID: <3033e2d8c33f0305d1e646d60b8effa8@localhost.localdomain> mstojanovic accepted this revision. mstojanovic added a comment. This revision is now accepted and ready to land. LGTM ================ Comment at: llvm/lib/Target/Mips/AsmParser/MipsAsmParser.cpp:3436 getStreamer().EmitLabel(Sym, IDLoc); - getStreamer().EmitIntValue(HiImmOp64, 4); - getStreamer().EmitIntValue(LoImmOp64, 4); + getStreamer().EmitValueToAlignment(8); + getStreamer().EmitIntValue(ImmOp64, 8); ---------------- With the elimination of `HiImmOp64` and `LoImmOp64` here the use of most of these variables goes down to a single place in the code which creates a chance for inlining, though they look fine even without that. 
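To make the difference between the two emission strategies in the diff above concrete, here is a small sketch that serializes a 64-bit immediate either as two 4-byte words (high word first) or as one 8-byte value. It assumes a little-endian byte order and an IEEE-754 `double`; `emitIntLE`, `emitAsTwoWords` and `emitAsOneValue` are hypothetical helpers and only loosely model what a streamer's integer emission would do.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Bit pattern of a double (assumes a 64-bit IEEE-754 double).
uint64_t bitsOf(double D) {
  static_assert(sizeof(double) == sizeof(uint64_t), "unexpected double size");
  uint64_t Bits;
  std::memcpy(&Bits, &D, sizeof(Bits));
  return Bits;
}

// Append the low Size bytes of Value in little-endian order.
void emitIntLE(std::vector<uint8_t> &Out, uint64_t Value, unsigned Size) {
  for (unsigned I = 0; I < Size; ++I)
    Out.push_back(static_cast<uint8_t>(Value >> (8 * I)));
}

// Old shape of the code: high 32 bits, then low 32 bits, 4 bytes each.
std::vector<uint8_t> emitAsTwoWords(uint64_t Imm) {
  std::vector<uint8_t> Out;
  emitIntLE(Out, Imm >> 32, 4);         // HiImmOp64
  emitIntLE(Out, Imm & 0xffffffffu, 4); // LoImmOp64
  return Out;
}

// New shape of the code: one 8-byte value.
std::vector<uint8_t> emitAsOneValue(uint64_t Imm) {
  std::vector<uint8_t> Out;
  emitIntLE(Out, Imm, 8);
  return Out;
}
```

For `1.0` (bit pattern `0x3FF0000000000000`) the two-word form places the non-zero bytes in the first word, while the single 8-byte form places them last, so under these assumptions the two layouts are not interchangeable on a little-endian target.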
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68778/new/ https://reviews.llvm.org/D68778 From llvm-commits at lists.llvm.org Fri Oct 11 06:22:12 2019 From: llvm-commits at lists.llvm.org (Jeroen Dobbelaere via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 13:22:12 +0000 (UTC) Subject: [PATCH] D68521: [PATCH 36/38] [noalias] Clang CodeGen for restrict-qualified pointers In-Reply-To: References: Message-ID: <84d4b8406e8d08590ef870bc71bcfc98@localhost.localdomain> jeroen.dobbelaere marked an inline comment as done. jeroen.dobbelaere added inline comments. ================ Comment at: clang/include/clang/Driver/CC1Options.td:287 + HelpText<"Only support restrict on function arguments">; +def full_restrict : Flag<["-"], "full-restrict">, + HelpText<"Enable full restrict support">; ---------------- I plan to move the options into the 'f_group'. These will then become: -fonly-restrict-arguments, -ffull-restrict and -fno-noalias-arguments. I'll also be adding a '-fno-full-restrict' option, so that we can easily fall back to the legacy behavior when the default is toggled. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68521/new/ https://reviews.llvm.org/D68521 From llvm-commits at lists.llvm.org Fri Oct 11 06:22:12 2019 From: llvm-commits at lists.llvm.org (David Candler via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 13:22:12 +0000 (UTC) Subject: [PATCH] D67216: [cfi] Add flag to always generate .debug_frame In-Reply-To: References: Message-ID: dcandler updated this revision to Diff 224581. dcandler retitled this revision from "[cfi] Add flag to always generate call frame information" to "[cfi] Add flag to always generate .debug_frame". dcandler edited the summary of this revision. dcandler added reviewers: rengolin, joerg. dcandler added a comment. Herald added subscribers: jsji, MaskRay, kbarton, nemanjai. 
I've modified the patch so that the new flag will ensure the cfi instructions are actually present to be emitted as well. I went ahead and renamed the flag -gdwarf-frame too, to better reflect that it's dealing with the debug information you'd otherwise get with -g, and is meant to specifically put the information in a .debug_frame section and not .eh_frame. Currently, two things signal the need for cfi: exceptions (via the function's needsUnwindTableEntry()), and debug (via the machine module information's hasDebugInfo()). At frame lowering, both trigger the same thing. But when the assembly printer decides on which section to use, needsUnwindTableEntry() is checked first and triggers the need for .eh_frame, while hasDebugInfo() is checked afterwards for whether .debug_frame is needed. So .debug_frame is only present when any level of debug is requested, and no functions need unwinding for exceptions. It wouldn't be appropriate to change either needsUnwindTableEntry() or hasDebugInfo(), so I've added a check for my flag alongside them. Because the same logic is used in multiple places, I've wrapped all three checks into one function to try and clean things up slightly. When deciding on which section to emit, the new flag means .debug_frame is produced instead of nothing. If .eh_frame would have been needed, rather than replace it, the new flag simply emits both .debug_frame and .eh_frame. The end result is that -gdwarf-frame should only provide a .debug_frame section as additional information, without otherwise modifying anything. The existing -funwind-tables (and -fasynchronous-unwind-tables) flag can be used to provide similar information, but because it takes the exception angle, it alters function attributes and ultimately produces .eh_frame instead.
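The decision logic described above can be summarized in a short sketch. It follows the prose only: `FrameSections`, `needsCFI` and `chooseFrameSections` are hypothetical names, and the three booleans stand in for needsUnwindTableEntry(), hasDebugInfo() and the new flag; the real patch may structure this differently.

```cpp
#include <cassert>

// Which call-frame sections end up in the output (hypothetical enum).
enum class FrameSections { None, EHFrame, DebugFrame, Both };

// The combined check: any of the three conditions requires cfi instructions.
bool needsCFI(bool NeedsUnwindTableEntry, bool HasDebugInfo,
              bool ForceDwarfFrame) {
  return NeedsUnwindTableEntry || HasDebugInfo || ForceDwarfFrame;
}

// Section choice: the unwind-table check comes first, as in the description.
FrameSections chooseFrameSections(bool NeedsUnwindTableEntry, bool HasDebugInfo,
                                  bool ForceDwarfFrame) {
  if (NeedsUnwindTableEntry)
    // The flag adds .debug_frame next to .eh_frame instead of replacing it.
    return ForceDwarfFrame ? FrameSections::Both : FrameSections::EHFrame;
  if (HasDebugInfo || ForceDwarfFrame)
    // Without the flag, .debug_frame appears only when debug info is requested.
    return FrameSections::DebugFrame;
  return FrameSections::None;
}
```

With the flag off, the function reproduces the existing behaviour described in the text; with it on, `.debug_frame` is always among the emitted sections.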
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67216/new/ https://reviews.llvm.org/D67216 Files: clang/include/clang/Basic/CodeGenOptions.def clang/include/clang/Driver/Options.td clang/lib/CodeGen/BackendUtil.cpp clang/lib/Driver/ToolChains/Clang.cpp clang/lib/Frontend/CompilerInvocation.cpp clang/test/Driver/gdwarf-frame.c llvm/include/llvm/CodeGen/CommandFlags.inc llvm/include/llvm/CodeGen/MachineFunction.h llvm/include/llvm/Target/TargetOptions.h llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp llvm/lib/CodeGen/AsmPrinter/DwarfCFIException.cpp llvm/lib/CodeGen/CFIInstrInserter.cpp llvm/lib/CodeGen/MachineFunction.cpp llvm/lib/Target/AArch64/AArch64FrameLowering.cpp llvm/lib/Target/ARC/ARCRegisterInfo.cpp llvm/lib/Target/Hexagon/HexagonFrameLowering.cpp llvm/lib/Target/PowerPC/PPCFrameLowering.cpp llvm/lib/Target/X86/X86FrameLowering.cpp llvm/lib/Target/X86/X86InstrInfo.cpp llvm/lib/Target/XCore/XCoreRegisterInfo.cpp llvm/test/CodeGen/ARM/dwarf-frame.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D67216.224581.patch Type: text/x-patch Size: 16096 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Fri Oct 11 06:22:14 2019 From: llvm-commits at lists.llvm.org (Sjoerd Meijer via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 13:22:14 +0000 (UTC) Subject: [PATCH] D68862: [ARM] Allocatable Global Register Variables for ARM In-Reply-To: References: Message-ID: SjoerdMeijer added inline comments. 
================ Comment at: clang/lib/Basic/Targets/ARM.cpp:902 + std::vector &Features = getTargetOpts().Features; + std::string SearchFeature = "+reserve-" + RegName.str(); + for (std::string &Feature : Features) { ---------------- chill wrote: > SjoerdMeijer wrote: > > I was pointed at something similar myself recently, but if I am not mistaken then I think this is a use-after-free: > > > > "+reserve-" + RegName.str() > > > > this will allocate a temp `std::string` that `SearchFeature` points to, which then gets released, and `SearchFeature` is still pointing at it. > Any temporaries would be destructed at the end of the full expression. By that time, the `SearchString` would be constructed and stand on its own. Ah yes, true. This is a std::string, not a StringRef as in my case. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68862/new/ https://reviews.llvm.org/D68862 From llvm-commits at lists.llvm.org Fri Oct 11 06:40:30 2019 From: llvm-commits at lists.llvm.org (Ettore Tiotto via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 13:40:30 +0000 (UTC) Subject: [PATCH] D68827: [DDG] Data Dependence Graph - Pi Block In-Reply-To: References: Message-ID: etiotto added inline comments. ================ Comment at: llvm/include/llvm/Analysis/DependenceGraphBuilder.h:115 + /// and false otherwise. + virtual bool shouldCreatePiBlocks() const { return true; } + ---------------- When would it not be desirable to create pi-blocks? Is this member function really needed at this point? ================ Comment at: llvm/lib/Analysis/DDG.cpp:16 +static cl::opt + CreatePiBlocks("ddg-pi-blocks", cl::init(true), cl::Hidden, cl::ZeroOrMore, ---------------- It is probably overkill to have an option to disable the creation of pi-blocks, at least at this point. Less is more :-) ================ Comment at: llvm/lib/Analysis/DDG.cpp:215 + // already reachable by root. + auto *Pi = dyn_cast(&N); + assert(!Root || Pi && "Root node is already added.
No more nodes can be added."); ---------------- This makes me wonder what happens when a node that is part of a pi-block is removed from the graph (via DirectedGraph::removeNode). We should update the PiBlockMap in that case to reflect that the node no longer exists in the graph. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68827/new/ https://reviews.llvm.org/D68827 From llvm-commits at lists.llvm.org Fri Oct 11 06:40:32 2019 From: llvm-commits at lists.llvm.org (Matthew Malcomson via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 13:40:32 +0000 (UTC) Subject: [PATCH] D68794: libhwasan initialisation include kernel syscall ABI relaxation In-Reply-To: References: Message-ID: <22f85a16c40b69f8a915868db4debf7a@localhost.localdomain> mmalcomson updated this revision to Diff 224582. mmalcomson edited the summary of this revision. mmalcomson added a comment. Run `prctl` syscall for Android, but ignore EINVAL failures. NOTE: I don't believe this distinguishes between running on a kernel with the tagged address ABI unconditional or running on a newer kernel or on a kernel with `sysctl abi.tagged_addr_disabled=1` (https://android.googlesource.com/kernel/common/+/690c4ca8a5715644370384672f24d95b042db74a/Documentation/arm64/tagged-address-abi.rst). I doubt this will be much of a concern -- there was already a requirement of having the correct Android kernel for things to work -- but I am mentioning it for posterity.
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68794/new/ https://reviews.llvm.org/D68794 Files: compiler-rt/lib/hwasan/hwasan.cpp compiler-rt/lib/hwasan/hwasan.h compiler-rt/lib/hwasan/hwasan_linux.cpp Index: compiler-rt/lib/hwasan/hwasan_linux.cpp =================================================================== --- compiler-rt/lib/hwasan/hwasan_linux.cpp +++ compiler-rt/lib/hwasan/hwasan_linux.cpp @@ -34,6 +34,7 @@ #include #include #include +#include #include "sanitizer_common/sanitizer_common.h" #include "sanitizer_common/sanitizer_procmaps.h" @@ -144,6 +145,33 @@ FindDynamicShadowStart(shadow_size_bytes); } +void InitPrctl() { + // This function uses the prctl interface to ask the kernel to accept + // tagged pointers. + // + // Here we unconditionally request that the PR_TAGGED_ADDR_ENABLE value is + // turned on, there is nothing else that can be done. +#define PR_SET_TAGGED_ADDR_CTRL 55 +#define PR_GET_TAGGED_ADDR_CTRL 56 +#define PR_TAGGED_ADDR_ENABLE (1UL << 0) + if (prctl(PR_SET_TAGGED_ADDR_CTRL, PR_TAGGED_ADDR_ENABLE, 0, 0, 0) == -1 + || ! prctl(PR_GET_TAGGED_ADDR_CTRL, 0, 0, 0, 0)) { +#if SANITIZER_ANDROID + // Some older Android kernels have the tagged pointer ABI on + // unconditionally, and hence don't have the tagged-addr prctl. + // + // In order to handle those we ignore getting EINVAL. */ + if (errno == EINVAL) + return; +#endif + Printf("FATAL: HWAddressSanitizer failed to enable tagged pointer syscall ABI.\n"); + Die(); + } +#undef PR_SET_TAGGED_ADDR_CTRL +#undef PR_GET_TAGGED_ADDR_CTRL +#undef PR_TAGGED_ADDR_ENABLE +} + bool InitShadow() { // Define the entire memory range. 
kHighMemEnd = GetHighMemEnd(); Index: compiler-rt/lib/hwasan/hwasan.h =================================================================== --- compiler-rt/lib/hwasan/hwasan.h +++ compiler-rt/lib/hwasan/hwasan.h @@ -74,6 +74,7 @@ bool ProtectRange(uptr beg, uptr end); bool InitShadow(); +void InitPrctl(); void InitThreads(); void MadviseShadow(); char *GetProcSelfMaps(); Index: compiler-rt/lib/hwasan/hwasan.cpp =================================================================== --- compiler-rt/lib/hwasan/hwasan.cpp +++ compiler-rt/lib/hwasan/hwasan.cpp @@ -354,6 +354,8 @@ hwasan_init_is_running = 1; SanitizerToolName = "HWAddressSanitizer"; + InitPrctl(); + InitTlsSize(); CacheBinaryName(); -------------- next part -------------- A non-text attachment was scrubbed... Name: D68794.224582.patch Type: text/x-patch Size: 2272 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Fri Oct 11 06:42:19 2019 From: llvm-commits at lists.llvm.org (Florian Hahn via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 13:42:19 +0000 (UTC) Subject: [PATCH] D68577: [LV] Apply sink-after & interleave-groups as VPlan transformations (NFC) In-Reply-To: References: Message-ID: fhahn added inline comments. ================ Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:7071 + // --------------------------------------------------------------------------- + // Transform initial VPlan: Apply previously taken decisions, in order, to ---------------- Not sure how others feel, but I think it would be great if we could move this transform out of LoopVectorize.cpp, to group together VP2VP transforms. I think it would fit well into llvm/lib/Transforms/Vectorize/VPlanHCFGTransforms.h (although the name mentions HCFGTransforms, maybe it should be just VplanToVplanTransforms.h/cpp). At first glance, I could not spot anything that would prevent moving it to a different file.
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68577/new/ https://reviews.llvm.org/D68577 From llvm-commits at lists.llvm.org Fri Oct 11 06:49:48 2019 From: llvm-commits at lists.llvm.org (Josh Berdine via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 13:49:48 +0000 (UTC) Subject: [PATCH] D52239: [OCaml] Add OCaml APIs to access DebugLoc info In-Reply-To: References: Message-ID: <15005e15afa3044a4f7201a61f03c246@localhost.localdomain> jberdine abandoned this revision. jberdine added a comment. Thanks @whitequark , that is very helpful. I'll close this diff in favor of D60902 and continue discussion there. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D52239/new/ https://reviews.llvm.org/D52239 From llvm-commits at lists.llvm.org Fri Oct 11 06:49:49 2019 From: llvm-commits at lists.llvm.org (Josh Berdine via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 13:49:49 +0000 (UTC) Subject: [PATCH] D60902: [OCaml] Add OCaml APIs to access DebugInfo In-Reply-To: References: Message-ID: jberdine added a comment. @whitequark, with this diff's approach of creating a hierarchy of types to mirror the LLVM-C DI types, is it acceptable to add the types and functions incrementally? That is, could we land this diff and add other types and functions later? Another question is about opam. Since this diff adds a sub-library, the opam package files will need to change. Is there any experience / best practice about how to handle this? In particular, since the opam package files are not in the llvm repo, it will not be easily possible to pin the llvm dev repo. 
Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D60902/new/ https://reviews.llvm.org/D60902 From llvm-commits at lists.llvm.org Fri Oct 11 06:59:09 2019 From: llvm-commits at lists.llvm.org (Florian Hahn via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 13:59:09 +0000 (UTC) Subject: [PATCH] D65204: [GVN] Also invalidate users of instructions replaced due to conditionals. In-Reply-To: References: Message-ID: fhahn added a comment. Ping. Eli, do you think we should change the values we use as results of PHI translations, given my last comment? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D65204/new/ https://reviews.llvm.org/D65204 From llvm-commits at lists.llvm.org Fri Oct 11 06:59:10 2019 From: llvm-commits at lists.llvm.org (whitequark via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 13:59:10 +0000 (UTC) Subject: [PATCH] D60902: [OCaml] Add OCaml APIs to access DebugInfo In-Reply-To: References: Message-ID: whitequark added a comment. > @whitequark, with this diff's approach of creating a hierarchy of types to mirror the LLVM-C DI types, is it acceptable to add the types and functions incrementally? That is, could we land this diff and add other types and functions later? Yes, that would be perfectly fine, and is in line with the extension of LLVM-C in the past. > Another question is about opam. Since this diff adds a sub-library, the opam package files will need to change. Is there any experience / best practice about how to handle this? In particular, since the opam package files are not in the llvm repo, it will not be easily possible to pin the llvm dev repo. I'm afraid there's no existing procedure for this. If the opam file does not have to be in the root of the repo (I forget exactly how it works), we could add it here. 
Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D60902/new/ https://reviews.llvm.org/D60902 From llvm-commits at lists.llvm.org Fri Oct 11 06:59:10 2019 From: llvm-commits at lists.llvm.org (David Stenberg via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 13:59:10 +0000 (UTC) Subject: [PATCH] D67492: [DebugInfo] Add a DW_OP_LLVM_entry_value operation In-Reply-To: References: Message-ID: dstenb marked 7 inline comments as done. dstenb added a comment. Any more comments on this or D67768 ? CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67492/new/ https://reviews.llvm.org/D67492 From llvm-commits at lists.llvm.org Fri Oct 11 06:59:11 2019 From: llvm-commits at lists.llvm.org (David Stenberg via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 13:59:11 +0000 (UTC) Subject: [PATCH] D68869: [DebugInfo] Fix truncation of call site immediates Message-ID: dstenb created this revision. dstenb added reviewers: djtodoro, NikolaPrica, aprantl, vsk. dstenb added a project: debug-info. Herald added subscribers: llvm-commits, hiraditya. Herald added a project: LLVM. This addresses a bug in collectCallSiteParameters() where call site immediates would be truncated from int64_t to unsigned. This fixes PR43525. Repository: rL LLVM https://reviews.llvm.org/D68869 Files: llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp llvm/test/DebugInfo/X86/dbgcall-site-64-bit-imms.ll Index: llvm/test/DebugInfo/X86/dbgcall-site-64-bit-imms.ll =================================================================== --- /dev/null +++ llvm/test/DebugInfo/X86/dbgcall-site-64-bit-imms.ll @@ -0,0 +1,56 @@ +; RUN: llc -O1 -debug-entry-values -filetype=obj -o - %s | llvm-dwarfdump - | FileCheck %s + +; Verify that the 64-bit call site immediates are not truncated. +; +; Reproducer for PR43525. 
+ +; Based on the following C program: +; +; #include +; +; extern void foo(int64_t); +; +; int main() { +; foo(INT64_C(0x1122334455667788)); +; foo(INT32_C(-100)); +; } + +; CHECK: DW_AT_GNU_call_site_value (DW_OP_constu 0x1122334455667788) +; CHECK: DW_AT_GNU_call_site_value (DW_OP_constu 0xffffffffffffff9c) + +target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128" +target triple = "x86_64-unknown-linux-gnu" + +; Function Attrs: nounwind uwtable +define i32 @main() !dbg !12 { +entry: + tail call void @foo(i64 1234605616436508552), !dbg !16 + tail call void @foo(i64 -100), !dbg !17 + ret i32 0, !dbg !18 +} + +declare !dbg !4 void @foo(i64) + +!llvm.dbg.cu = !{!0} +!llvm.module.flags = !{!8, !9, !10} +!llvm.ident = !{!11} + +!0 = distinct !DICompileUnit(language: DW_LANG_C99, file: !1, producer: "clang version 10.0.0", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, enums: !2, retainedTypes: !3, nameTableKind: None) +!1 = !DIFile(filename: "dbgcall-site-long-imms.c", directory: "/") +!2 = !{} +!3 = !{!4} +!4 = !DISubprogram(name: "foo", scope: !1, file: !1, line: 3, type: !5, flags: DIFlagPrototyped, spFlags: DISPFlagOptimized, retainedNodes: !2) +!5 = !DISubroutineType(types: !6) +!6 = !{null, !7} +!7 = !DIBasicType(name: "long int", size: 64, encoding: DW_ATE_signed) +!8 = !{i32 2, !"Dwarf Version", i32 4} +!9 = !{i32 2, !"Debug Info Version", i32 3} +!10 = !{i32 1, !"wchar_size", i32 4} +!11 = !{!"clang version 10.0.0"} +!12 = distinct !DISubprogram(name: "main", scope: !1, file: !1, line: 5, type: !13, scopeLine: 5, flags: DIFlagAllCallsDescribed, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !0, retainedNodes: !2) +!13 = !DISubroutineType(types: !14) +!14 = !{!15} +!15 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed) +!16 = !DILocation(line: 6, scope: !12) +!17 = !DILocation(line: 7, scope: !12) +!18 = !DILocation(line: 8, scope: !12) Index: llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp 
=================================================================== --- llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp +++ llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp @@ -620,7 +620,7 @@ if (auto ParamValue = TII->describeLoadedValue(*I)) { if (ParamValue->first.isImm()) { - unsigned Val = ParamValue->first.getImm(); + auto Val = ParamValue->first.getImm(); DbgValueLoc DbgLocVal(ParamValue->second, Val); finishCallSiteParam(DbgLocVal, Reg); } else if (ParamValue->first.isReg()) { -------------- next part -------------- A non-text attachment was scrubbed... Name: D68869.224584.patch Type: text/x-patch Size: 2953 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Fri Oct 11 07:01:04 2019 From: llvm-commits at lists.llvm.org (Piotr Sobczak via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 14:01:04 +0000 (UTC) Subject: [PATCH] D64911: [AMDGPU] Extend the SI Load/Store optimizer In-Reply-To: References: Message-ID: <1975857e90ee0c2f9bd85ba4c8ab135b@localhost.localdomain> piotr updated this revision to Diff 224585. piotr added a comment. Rebased and addressed review comments. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D64911/new/ https://reviews.llvm.org/D64911 Files: lib/Target/AMDGPU/SILoadStoreOptimizer.cpp test/CodeGen/AMDGPU/merge-image-load.mir test/CodeGen/AMDGPU/merge-image-sample.mir -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D64911.224585.patch Type: text/x-patch Size: 107336 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Fri Oct 11 07:05:10 2019 From: llvm-commits at lists.llvm.org (Dmitry Preobrazhensky via llvm-commits) Date: Fri, 11 Oct 2019 14:05:10 -0000 Subject: [llvm] r374553 - [AMDGPU][MC] Corrected parsing of optional operands Message-ID: <20191011140510.0ED8193144@lists.llvm.org> Author: dpreobra Date: Fri Oct 11 07:05:09 2019 New Revision: 374553 URL: http://llvm.org/viewvc/llvm-project?rev=374553&view=rev Log: [AMDGPU][MC] Corrected parsing of optional operands See https://bugs.llvm.org/show_bug.cgi?id=43486 Reviewers: artem.tamazov, arsenm Differential Revision: https://reviews.llvm.org/D68350 Modified: llvm/trunk/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp llvm/trunk/test/MC/AMDGPU/flat-global.s Modified: llvm/trunk/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp?rev=374553&r1=374552&r2=374553&view=diff ============================================================================== --- llvm/trunk/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp (original) +++ llvm/trunk/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp Fri Oct 11 07:05:09 2019 @@ -6074,8 +6074,6 @@ static const OptionalOperand AMDGPUOptio }; OperandMatchResultTy AMDGPUAsmParser::parseOptionalOperand(OperandVector &Operands) { - unsigned size = Operands.size(); - assert(size > 0); OperandMatchResultTy res = parseOptionalOpr(Operands); @@ -6090,17 +6088,13 @@ OperandMatchResultTy AMDGPUAsmParser::pa // to make sure autogenerated parser of custom operands never hit hardcoded // mandatory operands. - if (size == 1 || ((AMDGPUOperand &)*Operands[size - 1]).isRegKind()) { + for (unsigned i = 0; i < MAX_OPR_LOOKAHEAD; ++i) { + if (res != MatchOperand_Success || + isToken(AsmToken::EndOfStatement)) + break; - // We have parsed the first optional operand. 
- // Parse as many operands as necessary to skip all mandatory operands. - - for (unsigned i = 0; i < MAX_OPR_LOOKAHEAD; ++i) { - if (res != MatchOperand_Success || - getLexer().is(AsmToken::EndOfStatement)) break; - if (getLexer().is(AsmToken::Comma)) Parser.Lex(); - res = parseOptionalOpr(Operands); - } + trySkipToken(AsmToken::Comma); + res = parseOptionalOpr(Operands); } return res; Modified: llvm/trunk/test/MC/AMDGPU/flat-global.s URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/MC/AMDGPU/flat-global.s?rev=374553&r1=374552&r2=374553&view=diff ============================================================================== --- llvm/trunk/test/MC/AMDGPU/flat-global.s (original) +++ llvm/trunk/test/MC/AMDGPU/flat-global.s Fri Oct 11 07:05:09 2019 @@ -526,3 +526,8 @@ global_store_short_d16_hi v[3:4], v1, of // GFX10: encoding: [0x00,0x80,0x6c,0xdc,0x03,0x01,0x7d,0x00] // GFX9: global_store_short_d16_hi v[3:4], v1, off ; encoding: [0x00,0x80,0x6c,0xdc,0x03,0x01,0x7f,0x00] // VI-ERR: instruction not supported on this GPU + +global_atomic_add v0, v[1:2], v2, off glc slc +// GFX10: global_atomic_add v0, v[1:2], v2, off glc slc ; encoding: [0x00,0x80,0xcb,0xdc,0x01,0x02,0x7d,0x00] +// GFX9: global_atomic_add v0, v[1:2], v2, off glc slc ; encoding: [0x00,0x80,0x0b,0xdd,0x01,0x02,0x7f,0x00] +// VI-ERR: error: invalid operand for instruction From llvm-commits at lists.llvm.org Fri Oct 11 07:09:45 2019 From: llvm-commits at lists.llvm.org (Michael Liao via llvm-commits) Date: Fri, 11 Oct 2019 14:09:45 -0000 Subject: [llvm] r374554 - Fix compilation warnings. NFC. Message-ID: <20191011140945.1CF2792E53@lists.llvm.org> Author: hliao Date: Fri Oct 11 07:09:44 2019 New Revision: 374554 URL: http://llvm.org/viewvc/llvm-project?rev=374554&view=rev Log: Fix compilation warnings. NFC. 
Modified: llvm/trunk/lib/IR/Metadata.cpp llvm/trunk/lib/Transforms/IPO/GlobalDCE.cpp Modified: llvm/trunk/lib/IR/Metadata.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/IR/Metadata.cpp?rev=374554&r1=374553&r2=374554&view=diff ============================================================================== --- llvm/trunk/lib/IR/Metadata.cpp (original) +++ llvm/trunk/lib/IR/Metadata.cpp Fri Oct 11 07:09:44 2019 @@ -1509,7 +1509,7 @@ GlobalObject::VCallVisibility GlobalObje uint64_t Val = cast( cast(MD->getOperand(0))->getValue()) ->getZExtValue(); - assert((Val >= 0 && Val <= 2) && "unknown vcall visibility!"); + assert(Val <= 2 && "unknown vcall visibility!"); return (VCallVisibility)Val; } return VCallVisibility::VCallVisibilityPublic; Modified: llvm/trunk/lib/Transforms/IPO/GlobalDCE.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/IPO/GlobalDCE.cpp?rev=374554&r1=374553&r2=374554&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/IPO/GlobalDCE.cpp (original) +++ llvm/trunk/lib/Transforms/IPO/GlobalDCE.cpp Fri Oct 11 07:09:44 2019 @@ -189,7 +189,7 @@ void GlobalDCEPass::ScanVTables(Module & // unit, we know that we can see all virtual functions which might use it, // so VFE is safe. if (auto GO = dyn_cast(&GV)) { - GlobalObject::VCallVisibility TypeVis = GV.getVCallVisibility(); + GlobalObject::VCallVisibility TypeVis = GO->getVCallVisibility(); if (TypeVis == GlobalObject::VCallVisibilityTranslationUnit || (LTOPostLink && TypeVis == GlobalObject::VCallVisibilityLinkageUnit)) { From llvm-commits at lists.llvm.org Fri Oct 11 07:08:26 2019 From: llvm-commits at lists.llvm.org (=?utf-8?q?Micha=C5=82_G=C3=B3rny_via_Phabricator?= via llvm-commits) Date: Fri, 11 Oct 2019 14:08:26 +0000 (UTC) Subject: [PATCH] D68452: [llvm] [ocaml] Support linking against dylib In-Reply-To: References: Message-ID: mgorny added a comment. @whitequark, ping. 
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68452/new/ https://reviews.llvm.org/D68452 From llvm-commits at lists.llvm.org Fri Oct 11 07:08:27 2019 From: llvm-commits at lists.llvm.org (Piotr Sobczak via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 14:08:27 +0000 (UTC) Subject: [PATCH] D64911: [AMDGPU] Extend the SI Load/Store optimizer In-Reply-To: References: Message-ID: piotr marked 2 inline comments as done. piotr added inline comments. ================ Comment at: lib/Target/AMDGPU/SILoadStoreOptimizer.cpp:322-325 + if (AMDGPU::getNamedOperandIdx(Opc, AMDGPU::OpName::vaddr) == -1) + return UNKNOWN; + if (!TII.get(Opc).mayLoad() || TII.isGather4(Opc)) + return UNKNOWN; ---------------- nhaehnle wrote: > This should probably check mayStore instead of mayLoad: we want to exclude both stores and atomics. > > You could also move the check for TFE and LWE to here. Good point about atomics; I added a condition to bail out on mayStore(). I am keeping !mayLoad() to avoid merging IMAGE_GET_RESINFO. For TFE/LWE I would like to keep the checks where they are, because I dislike extending getInstClass() with an Instruction argument, which would be necessary to query the actual value of TFE/LWE. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D64911/new/ https://reviews.llvm.org/D64911 From llvm-commits at lists.llvm.org Fri Oct 11 07:08:34 2019 From: llvm-commits at lists.llvm.org (Dmitry Preobrazhensky via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 14:08:34 +0000 (UTC) Subject: [PATCH] D68350: [AMDGPU][MC][GFX9][GFX10] Corrected parsing of optional operands In-Reply-To: References: Message-ID: <202f33083ab496b37dcc3cde178ccac8@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rG882c3e3db52d: [AMDGPU][MC] Corrected parsing of optional operands (authored by dp). Herald added subscribers: llvm-commits, hiraditya. Herald added a project: LLVM.
Changed prior to commit: https://reviews.llvm.org/D68350?vs=222884&id=224589#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68350/new/ https://reviews.llvm.org/D68350 Files: llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp llvm/test/MC/AMDGPU/flat-global.s Index: llvm/test/MC/AMDGPU/flat-global.s =================================================================== --- llvm/test/MC/AMDGPU/flat-global.s +++ llvm/test/MC/AMDGPU/flat-global.s @@ -526,3 +526,8 @@ // GFX10: encoding: [0x00,0x80,0x6c,0xdc,0x03,0x01,0x7d,0x00] // GFX9: global_store_short_d16_hi v[3:4], v1, off ; encoding: [0x00,0x80,0x6c,0xdc,0x03,0x01,0x7f,0x00] // VI-ERR: instruction not supported on this GPU + +global_atomic_add v0, v[1:2], v2, off glc slc +// GFX10: global_atomic_add v0, v[1:2], v2, off glc slc ; encoding: [0x00,0x80,0xcb,0xdc,0x01,0x02,0x7d,0x00] +// GFX9: global_atomic_add v0, v[1:2], v2, off glc slc ; encoding: [0x00,0x80,0x0b,0xdd,0x01,0x02,0x7f,0x00] +// VI-ERR: error: invalid operand for instruction Index: llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp =================================================================== --- llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp +++ llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp @@ -6074,8 +6074,6 @@ }; OperandMatchResultTy AMDGPUAsmParser::parseOptionalOperand(OperandVector &Operands) { - unsigned size = Operands.size(); - assert(size > 0); OperandMatchResultTy res = parseOptionalOpr(Operands); @@ -6090,17 +6088,13 @@ // to make sure autogenerated parser of custom operands never hit hardcoded // mandatory operands. - if (size == 1 || ((AMDGPUOperand &)*Operands[size - 1]).isRegKind()) { - - // We have parsed the first optional operand. - // Parse as many operands as necessary to skip all mandatory operands. 
+ for (unsigned i = 0; i < MAX_OPR_LOOKAHEAD; ++i) { + if (res != MatchOperand_Success || + isToken(AsmToken::EndOfStatement)) + break; - for (unsigned i = 0; i < MAX_OPR_LOOKAHEAD; ++i) { - if (res != MatchOperand_Success || - getLexer().is(AsmToken::EndOfStatement)) break; - if (getLexer().is(AsmToken::Comma)) Parser.Lex(); - res = parseOptionalOpr(Operands); - } + trySkipToken(AsmToken::Comma); + res = parseOptionalOpr(Operands); } return res; -------------- next part -------------- A non-text attachment was scrubbed... Name: D68350.224589.patch Type: text/x-patch Size: 2068 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Fri Oct 11 07:10:30 2019 From: llvm-commits at lists.llvm.org (Josh Berdine via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 14:10:30 +0000 (UTC) Subject: [PATCH] D60902: [OCaml] Add OCaml APIs to access DebugInfo In-Reply-To: References: Message-ID: <93682e6e6546e9863431660008c60424@localhost.localdomain> jberdine added a comment. I will experiment with putting the opam file below the repo root. Another potential stumbling point is that the opam package includes some patches for the build/install, see https://github.com/ocaml/opam-repository/tree/master/packages/llvm/llvm.9.0.0/files . Would it be too strange to include such patches here? 
Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D60902/new/ https://reviews.llvm.org/D60902 From llvm-commits at lists.llvm.org Fri Oct 11 07:17:56 2019 From: llvm-commits at lists.llvm.org (Sanjay Patel via llvm-commits) Date: Fri, 11 Oct 2019 14:17:56 -0000 Subject: [llvm] r374555 - [DAGCombiner] fold vselect-of-constants to shift Message-ID: <20191011141756.6256493109@lists.llvm.org> Author: spatel Date: Fri Oct 11 07:17:56 2019 New Revision: 374555 URL: http://llvm.org/viewvc/llvm-project?rev=374555&view=rev Log: [DAGCombiner] fold vselect-of-constants to shift The diffs suggest that we are missing some more basic analysis/transforms, but this keeps the vector path in sync with the scalar (rL374397). This is again a preliminary step for introducing the reverse transform in IR as proposed in D63382. Modified: llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp llvm/trunk/test/CodeGen/X86/selectcc-to-shiftand.ll llvm/trunk/test/CodeGen/X86/vselect.ll Modified: llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp?rev=374555&r1=374554&r2=374555&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp (original) +++ llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp Fri Oct 11 07:17:56 2019 @@ -8614,6 +8614,15 @@ SDValue DAGCombiner::foldVSelectOfConsta return DAG.getNode(ISD::ADD, DL, VT, ExtendedCond, N2); } + // select Cond, Pow2C, 0 --> (zext Cond) << log2(Pow2C) + APInt Pow2C; + if (ISD::isConstantSplatVector(N1.getNode(), Pow2C) && Pow2C.isPowerOf2() && + isNullOrNullSplat(N2)) { + SDValue ZextCond = DAG.getZExtOrTrunc(Cond, DL, VT); + SDValue ShAmtC = DAG.getConstant(Pow2C.exactLogBase2(), DL, VT); + return DAG.getNode(ISD::SHL, DL, VT, ZextCond, ShAmtC); + } + // The general case for select-of-constants: // vselect Cond, C1, C2 --> xor (and (sext Cond), 
(C1^C2)), C2 // ...but that only makes sense if a vselect is slower than 2 logic ops, so Modified: llvm/trunk/test/CodeGen/X86/selectcc-to-shiftand.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/selectcc-to-shiftand.ll?rev=374555&r1=374554&r2=374555&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/selectcc-to-shiftand.ll (original) +++ llvm/trunk/test/CodeGen/X86/selectcc-to-shiftand.ll Fri Oct 11 07:17:56 2019 @@ -213,9 +213,8 @@ define <16 x i8> @sel_shift_bool_v16i8(< define <8 x i16> @sel_shift_bool_v8i16(<8 x i1> %t) { ; ANY-LABEL: sel_shift_bool_v8i16: ; ANY: # %bb.0: -; ANY-NEXT: psllw $15, %xmm0 -; ANY-NEXT: psraw $15, %xmm0 ; ANY-NEXT: pand {{.*}}(%rip), %xmm0 +; ANY-NEXT: psllw $7, %xmm0 ; ANY-NEXT: retq %shl= select <8 x i1> %t, <8 x i16> , <8 x i16> zeroinitializer ret <8 x i16> %shl @@ -224,9 +223,8 @@ define <8 x i16> @sel_shift_bool_v8i16(< define <4 x i32> @sel_shift_bool_v4i32(<4 x i1> %t) { ; ANY-LABEL: sel_shift_bool_v4i32: ; ANY: # %bb.0: -; ANY-NEXT: pslld $31, %xmm0 -; ANY-NEXT: psrad $31, %xmm0 ; ANY-NEXT: pand {{.*}}(%rip), %xmm0 +; ANY-NEXT: pslld $6, %xmm0 ; ANY-NEXT: retq %shl = select <4 x i1> %t, <4 x i32> , <4 x i32> zeroinitializer ret <4 x i32> %shl @@ -235,10 +233,8 @@ define <4 x i32> @sel_shift_bool_v4i32(< define <2 x i64> @sel_shift_bool_v2i64(<2 x i1> %t) { ; ANY-LABEL: sel_shift_bool_v2i64: ; ANY: # %bb.0: -; ANY-NEXT: psllq $63, %xmm0 -; ANY-NEXT: psrad $31, %xmm0 -; ANY-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] ; ANY-NEXT: pand {{.*}}(%rip), %xmm0 +; ANY-NEXT: psllq $16, %xmm0 ; ANY-NEXT: retq %shl = select <2 x i1> %t, <2 x i64> , <2 x i64> zeroinitializer ret <2 x i64> %shl Modified: llvm/trunk/test/CodeGen/X86/vselect.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/vselect.ll?rev=374555&r1=374554&r2=374555&view=diff ============================================================================== --- 
llvm/trunk/test/CodeGen/X86/vselect.ll (original) +++ llvm/trunk/test/CodeGen/X86/vselect.ll Fri Oct 11 07:17:56 2019 @@ -647,33 +647,22 @@ define void @vselect_allzeros_LHS_multip ; This test case previously crashed after r363802, r363850, and r363856 due ; any_extend_vector_inreg not being handled by the X86 backend. define i64 @vselect_any_extend_vector_inreg_crash(<8 x i8>* %x) { -; SSE2-LABEL: vselect_any_extend_vector_inreg_crash: -; SSE2: # %bb.0: -; SSE2-NEXT: movq {{.*#+}} xmm0 = mem[0],zero -; SSE2-NEXT: pcmpeqb {{.*}}(%rip), %xmm0 -; SSE2-NEXT: punpcklbw {{.*#+}} xmm0 = xmm0[0,0,1,1,2,2,3,3,4,4,5,5,6,6,7,7] -; SSE2-NEXT: punpcklwd {{.*#+}} xmm0 = xmm0[0,0,1,1,2,2,3,3] -; SSE2-NEXT: psrad $24, %xmm0 -; SSE2-NEXT: movq %xmm0, %rax -; SSE2-NEXT: andl $32768, %eax # imm = 0x8000 -; SSE2-NEXT: retq -; -; SSE41-LABEL: vselect_any_extend_vector_inreg_crash: -; SSE41: # %bb.0: -; SSE41-NEXT: movq {{.*#+}} xmm0 = mem[0],zero -; SSE41-NEXT: pcmpeqb {{.*}}(%rip), %xmm0 -; SSE41-NEXT: pmovsxbq %xmm0, %xmm0 -; SSE41-NEXT: movq %xmm0, %rax -; SSE41-NEXT: andl $32768, %eax # imm = 0x8000 -; SSE41-NEXT: retq +; SSE-LABEL: vselect_any_extend_vector_inreg_crash: +; SSE: # %bb.0: +; SSE-NEXT: movq {{.*#+}} xmm0 = mem[0],zero +; SSE-NEXT: pcmpeqb {{.*}}(%rip), %xmm0 +; SSE-NEXT: movq %xmm0, %rax +; SSE-NEXT: andl $1, %eax +; SSE-NEXT: shlq $15, %rax +; SSE-NEXT: retq ; ; AVX-LABEL: vselect_any_extend_vector_inreg_crash: ; AVX: # %bb.0: ; AVX-NEXT: vmovq {{.*#+}} xmm0 = mem[0],zero ; AVX-NEXT: vpcmpeqb {{.*}}(%rip), %xmm0, %xmm0 -; AVX-NEXT: vpmovsxbq %xmm0, %xmm0 ; AVX-NEXT: vmovq %xmm0, %rax -; AVX-NEXT: andl $32768, %eax # imm = 0x8000 +; AVX-NEXT: andl $1, %eax +; AVX-NEXT: shlq $15, %rax ; AVX-NEXT: retq 0: %1 = load <8 x i8>, <8 x i8>* %x From llvm-commits at lists.llvm.org Fri Oct 11 07:17:54 2019 From: llvm-commits at lists.llvm.org (whitequark via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 14:17:54 +0000 (UTC) Subject: [PATCH] D68452: [llvm] [ocaml] 
Support linking against dylib In-Reply-To: References: Message-ID: <63850cf45ca7a6d909d17ac3a01e1a30@localhost.localdomain> whitequark accepted this revision. whitequark added a comment. This revision is now accepted and ready to land. LGTM. I did not get a notification for this before for some reason. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68452/new/ https://reviews.llvm.org/D68452 From llvm-commits at lists.llvm.org Fri Oct 11 07:27:22 2019 From: llvm-commits at lists.llvm.org (Florian Hahn via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 14:27:22 +0000 (UTC) Subject: [PATCH] D68577: [LV] Apply sink-after & interleave-groups as VPlan transformations (NFC) In-Reply-To: References: Message-ID: <3bd3d47a9580b4abd4fe89ab18a82e47@localhost.localdomain> fhahn added a comment. A while ago, I put up a patch to do sinking just on the VPInstruction/recipe level, but I never finished integrating it: D46826. ================ Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:7079 + VPRecipeBase *Sink = RecipeBuilder.getRecipe(Entry.first); + Sink->removeFromParent(); + Sink->insertAfter(RecipeBuilder.getRecipe(Entry.second)); ---------------- This could just be `Sink->moveAfter(RecipeBuilder.getRecipe(Entry.second))`. I've added it in D46825 and now finally have a reason to commit it ;) CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68577/new/ https://reviews.llvm.org/D68577 From llvm-commits at lists.llvm.org Fri Oct 11 07:29:05 2019 From: llvm-commits at lists.llvm.org (Michał Górny via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 14:29:05 +0000 (UTC) Subject: [PATCH] D68452: [llvm] [ocaml] Support linking against dylib In-Reply-To: References: Message-ID: mgorny added a comment. No problem. Thanks! 
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68452/new/ https://reviews.llvm.org/D68452 From llvm-commits at lists.llvm.org Fri Oct 11 07:32:43 2019 From: llvm-commits at lists.llvm.org (Michal Gorny via llvm-commits) Date: Fri, 11 Oct 2019 14:32:43 -0000 Subject: [llvm] r374556 - [llvm] [ocaml] Support linking against dylib Message-ID: <20191011143243.CF45592FFF@lists.llvm.org> Author: mgorny Date: Fri Oct 11 07:32:43 2019 New Revision: 374556 URL: http://llvm.org/viewvc/llvm-project?rev=374556&view=rev Log: [llvm] [ocaml] Support linking against dylib Support linking OCaml modules against LLVM dylib when requested, rather than against static libs that might not be installed at all. Differential Revision: https://reviews.llvm.org/D68452 Modified: llvm/trunk/cmake/modules/AddOCaml.cmake Modified: llvm/trunk/cmake/modules/AddOCaml.cmake URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/cmake/modules/AddOCaml.cmake?rev=374556&r1=374555&r2=374556&view=diff ============================================================================== --- llvm/trunk/cmake/modules/AddOCaml.cmake (original) +++ llvm/trunk/cmake/modules/AddOCaml.cmake Fri Oct 11 07:32:43 2019 @@ -66,21 +66,25 @@ function(add_ocaml_library name) list(APPEND ocaml_flags "-custom") endif() - explicit_map_components_to_libraries(llvm_libs ${ARG_LLVM}) - foreach( llvm_lib ${llvm_libs} ) - list(APPEND ocaml_flags "-l${llvm_lib}" ) - endforeach() + if(LLVM_LINK_LLVM_DYLIB) + list(APPEND ocaml_flags "-lLLVM") + else() + explicit_map_components_to_libraries(llvm_libs ${ARG_LLVM}) + foreach( llvm_lib ${llvm_libs} ) + list(APPEND ocaml_flags "-l${llvm_lib}" ) + endforeach() - get_property(system_libs TARGET LLVMSupport PROPERTY LLVM_SYSTEM_LIBS) - foreach(system_lib ${system_libs}) - if (system_lib MATCHES "^-") - # If it's an option, pass it without changes. - list(APPEND ocaml_flags "${system_lib}" ) - else() - # Otherwise assume it's a library name we need to link with. 
- list(APPEND ocaml_flags "-l${system_lib}" ) - endif() - endforeach() + get_property(system_libs TARGET LLVMSupport PROPERTY LLVM_SYSTEM_LIBS) + foreach(system_lib ${system_libs}) + if (system_lib MATCHES "^-") + # If it's an option, pass it without changes. + list(APPEND ocaml_flags "${system_lib}" ) + else() + # Otherwise assume it's a library name we need to link with. + list(APPEND ocaml_flags "-l${system_lib}" ) + endif() + endforeach() + endif() string(REPLACE ";" " " ARG_CFLAGS "${ARG_CFLAGS}") set(c_flags "${ARG_CFLAGS} ${LLVM_DEFINITIONS}") From llvm-commits at lists.llvm.org Fri Oct 11 07:35:11 2019 From: llvm-commits at lists.llvm.org (Dmitry Preobrazhensky via llvm-commits) Date: Fri, 11 Oct 2019 14:35:11 -0000 Subject: [llvm] r374557 - [AMDGPU][MC][GFX10] Enabled null for 64-bit dst operands Message-ID: <20191011143511.EF91C858F9@lists.llvm.org> Author: dpreobra Date: Fri Oct 11 07:35:11 2019 New Revision: 374557 URL: http://llvm.org/viewvc/llvm-project?rev=374557&view=rev Log: [AMDGPU][MC][GFX10] Enabled null for 64-bit dst operands See https://bugs.llvm.org/show_bug.cgi?id=43524 Reviewers: arsenm, rampitec Differential Revision: https://reviews.llvm.org/D68785 Modified: llvm/trunk/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp llvm/trunk/test/MC/AMDGPU/sop1.s llvm/trunk/test/MC/AMDGPU/sop2.s llvm/trunk/test/MC/AMDGPU/sopk.s llvm/trunk/test/MC/Disassembler/AMDGPU/gfx10_dasm_all.txt Modified: llvm/trunk/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp?rev=374557&r1=374556&r2=374557&view=diff ============================================================================== --- llvm/trunk/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp (original) +++ llvm/trunk/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp Fri Oct 11 07:35:11 2019 @@ -290,6 +290,10 @@ public: return isOff() || isVReg32(); } + bool isNull() const { + return isRegKind() && getReg() == 
AMDGPU::SGPR_NULL; + } + bool isSDWAOperand(MVT type) const; bool isSDWAFP16Operand() const; bool isSDWAFP32Operand() const; @@ -6976,6 +6980,14 @@ unsigned AMDGPUAsmParser::validateTarget return Operand.isInterpAttr() ? Match_Success : Match_InvalidOperand; case MCK_AttrChan: return Operand.isAttrChan() ? Match_Success : Match_InvalidOperand; + case MCK_SReg_64: + case MCK_SReg_64_XEXEC: + // Null is defined as a 32-bit register but + // it should also be enabled with 64-bit operands. + // The following code enables it for SReg_64 operands + // used as source and destination. Remaining source + // operands are handled in isInlinableImm. + return Operand.isNull() ? Match_Success : Match_InvalidOperand; default: return Match_InvalidOperand; } Modified: llvm/trunk/test/MC/AMDGPU/sop1.s URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/MC/AMDGPU/sop1.s?rev=374557&r1=374556&r2=374557&view=diff ============================================================================== --- llvm/trunk/test/MC/AMDGPU/sop1.s (original) +++ llvm/trunk/test/MC/AMDGPU/sop1.s Fri Oct 11 07:35:11 2019 @@ -1,6 +1,7 @@ // RUN: not llvm-mc -arch=amdgcn -show-encoding %s | FileCheck --check-prefix=GCN --check-prefix=SICI %s // RUN: not llvm-mc -arch=amdgcn -mcpu=fiji -show-encoding %s 2>&1 | FileCheck --check-prefix=GCN --check-prefix=VI --check-prefix=GFX89 %s // RUN: not llvm-mc -arch=amdgcn -mcpu=gfx900 -show-encoding %s 2>&1 | FileCheck --check-prefix=GCN --check-prefix=GFX89 --check-prefix=GFX9 %s +// RUN: not llvm-mc -arch=amdgcn -mcpu=gfx1010 -show-encoding %s 2>&1 | FileCheck --check-prefix=GCN --check-prefix=GFX10 %s // RUN: not llvm-mc -arch=amdgcn -show-encoding %s 2>&1 | FileCheck --check-prefix=NOSICI --check-prefix=NOSICIVI %s // RUN: not llvm-mc -arch=amdgcn -mcpu=fiji -show-encoding %s 2>&1 | FileCheck --check-prefix=NOVI --check-prefix=NOSICIVI --check-prefix=NOGFX89 %s @@ -34,6 +35,10 @@ s_mov_b64 s[2:3], s[4:5] // SICI: s_mov_b64 s[2:3], s[4:5] ; encoding: 
[0x04,0x04,0x82,0xbe] // GFX89: s_mov_b64 s[2:3], s[4:5] ; encoding: [0x04,0x01,0x82,0xbe] +s_mov_b64 null, s[4:5] +// GFX10: s_mov_b64 null, s[4:5] ; encoding: [0x04,0x04,0xfd,0xbe] +// NOSICIVI: error: not a valid operand. + s_mov_b64 s[2:3], 0xffffffffffffffff // SICI: s_mov_b64 s[2:3], -1 ; encoding: [0xc1,0x04,0x82,0xbe] // GFX89: s_mov_b64 s[2:3], -1 ; encoding: [0xc1,0x01,0x82,0xbe] Modified: llvm/trunk/test/MC/AMDGPU/sop2.s URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/MC/AMDGPU/sop2.s?rev=374557&r1=374556&r2=374557&view=diff ============================================================================== --- llvm/trunk/test/MC/AMDGPU/sop2.s (original) +++ llvm/trunk/test/MC/AMDGPU/sop2.s Fri Oct 11 07:35:11 2019 @@ -3,6 +3,7 @@ // RUN: not llvm-mc -arch=amdgcn -mcpu=bonaire -show-encoding %s | FileCheck --check-prefix=GCN --check-prefix=SICI %s // RUN: not llvm-mc -arch=amdgcn -mcpu=fiji -show-encoding %s | FileCheck --check-prefix=GCN --check-prefix=GFX89 %s // RUN: not llvm-mc -arch=amdgcn -mcpu=gfx900 -show-encoding %s | FileCheck --check-prefix=GCN --check-prefix=GFX89 --check-prefix=GFX9 %s +// RUN: not llvm-mc -arch=amdgcn -mcpu=gfx1010 -show-encoding %s | FileCheck --check-prefix=GCN --check-prefix=GFX10 %s // RUN: not llvm-mc -arch=amdgcn -show-encoding %s 2>&1 | FileCheck --check-prefix=NOSICIVI %s // RUN: not llvm-mc -arch=amdgcn -mcpu=tahiti -show-encoding %s 2>&1 | FileCheck --check-prefix=NOSICIVI %s @@ -60,6 +61,10 @@ s_and_b32 s2, 0xFFFF0000, -65536 // SICI: s_and_b32 s2, 0xffff0000, 0xffff0000 ; encoding: [0xff,0xff,0x02,0x87,0x00,0x00,0xff,0xff] // GFX89: s_and_b32 s2, 0xffff0000, 0xffff0000 ; encoding: [0xff,0xff,0x02,0x86,0x00,0x00,0xff,0xff] +s_and_b64 null, s[4:5], s[6:7] +// GFX10: s_and_b64 null, s[4:5], s[6:7] ; encoding: [0x04,0x06,0xfd,0x87] +// NOSICIVI: error: not a valid operand. 
+ s_and_b64 s[2:3], s[4:5], s[6:7] // SICI: s_and_b64 s[2:3], s[4:5], s[6:7] ; encoding: [0x04,0x06,0x82,0x87] // GFX89: s_and_b64 s[2:3], s[4:5], s[6:7] ; encoding: [0x04,0x06,0x82,0x86] Modified: llvm/trunk/test/MC/AMDGPU/sopk.s URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/MC/AMDGPU/sopk.s?rev=374557&r1=374556&r2=374557&view=diff ============================================================================== --- llvm/trunk/test/MC/AMDGPU/sopk.s (original) +++ llvm/trunk/test/MC/AMDGPU/sopk.s Fri Oct 11 07:35:11 2019 @@ -1,12 +1,13 @@ // RUN: not llvm-mc -arch=amdgcn -show-encoding %s | FileCheck --check-prefix=GCN --check-prefix=SICI %s // RUN: not llvm-mc -arch=amdgcn -mcpu=tahiti -show-encoding %s | FileCheck --check-prefix=GCN --check-prefix=SICI %s // RUN: not llvm-mc -arch=amdgcn -mcpu=fiji -show-encoding %s | FileCheck --check-prefix=GCN --check-prefix=VI9 --check-prefix=VI %s -// RUN: llvm-mc -arch=amdgcn -mcpu=gfx900 -show-encoding %s | FileCheck --check-prefix=GCN --check-prefix=VI9 --check-prefix=GFX9 %s +// RUN: not llvm-mc -arch=amdgcn -mcpu=gfx900 -show-encoding %s | FileCheck --check-prefix=GCN --check-prefix=VI9 --check-prefix=GFX9 %s // RUN: not llvm-mc -arch=amdgcn -mcpu=gfx1010 -show-encoding %s | FileCheck --check-prefix=GCN --check-prefix=GFX10 %s // RUN: not llvm-mc -arch=amdgcn %s 2>&1 | FileCheck -check-prefix=NOSICIVI %s // RUN: not llvm-mc -arch=amdgcn -mcpu=tahiti %s 2>&1 | FileCheck -check-prefix=NOSICIVI -check-prefix=NOSI %s // RUN: not llvm-mc -arch=amdgcn -mcpu=fiji %s 2>&1 | FileCheck -check-prefix=NOSICIVI -check-prefix=NOVI %s +// RUN: not llvm-mc -arch=amdgcn -mcpu=gfx900 %s 2>&1 | FileCheck --check-prefix=NOGFX9 %s //===----------------------------------------------------------------------===// // Instructions @@ -319,6 +320,11 @@ s_endpgm_ordered_ps_done // GFX9: s_endpgm_ordered_ps_done ; encoding: [0x00,0x00,0x9e,0xbf] // NOSICIVI: error: instruction not supported on this GPU +s_call_b64 null, 12609 +// GFX10: 
s_call_b64 null, 12609 ; encoding: [0x41,0x31,0x7d,0xbb] +// NOSICIVI: error: not a valid operand. +// NOGFX9: error: not a valid operand. + s_call_b64 s[12:13], 12609 // GFX9: s_call_b64 s[12:13], 12609 ; encoding: [0x41,0x31,0x8c,0xba] // NOSICIVI: error: instruction not supported on this GPU Modified: llvm/trunk/test/MC/Disassembler/AMDGPU/gfx10_dasm_all.txt URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/MC/Disassembler/AMDGPU/gfx10_dasm_all.txt?rev=374557&r1=374556&r2=374557&view=diff ============================================================================== --- llvm/trunk/test/MC/Disassembler/AMDGPU/gfx10_dasm_all.txt (original) +++ llvm/trunk/test/MC/Disassembler/AMDGPU/gfx10_dasm_all.txt Fri Oct 11 07:35:11 2019 @@ -8978,6 +8978,9 @@ # GFX10: s_and_b32 vcc_lo, s1, s2 ; encoding: [0x01,0x02,0x6a,0x87] 0x01,0x02,0x6a,0x87 +# GFX10: s_and_b64 null, s[4:5], s[6:7] ; encoding: [0x04,0x06,0xfd,0x87] +0x04,0x06,0xfd,0x87 + # GFX10: s_and_b64 exec, s[2:3], s[4:5] ; encoding: [0x02,0x04,0xfe,0x87] 0x02,0x04,0xfe,0x87 @@ -11693,6 +11696,9 @@ # GFX10: s_buffer_store_dwordx4 s[96:99], s[8:11], s0 ; encoding: [0x04,0x18,0x68,0xf4,0x00,0x00,0x00,0x00] 0x04,0x18,0x68,0xf4,0x00,0x00,0x00,0x00 +# GFX10: s_call_b64 null, 12609 ; encoding: [0x41,0x31,0x7d,0xbb] +0x41,0x31,0x7d,0xbb + # GFX10: s_call_b64 exec, 4660 ; encoding: [0x34,0x12,0x7e,0xbb] 0x34,0x12,0x7e,0xbb @@ -15665,6 +15671,9 @@ # GFX10: s_mov_b32 vcc_lo, s1 ; encoding: [0x01,0x03,0xea,0xbe] 0x01,0x03,0xea,0xbe +# GFX10: s_mov_b64 null, s[4:5] ; encoding: [0x04,0x04,0xfd,0xbe] +0x04,0x04,0xfd,0xbe + # GFX10: s_mov_b64 exec, s[2:3] ; encoding: [0x02,0x04,0xfe,0xbe] 0x02,0x04,0xfe,0xbe From llvm-commits at lists.llvm.org Fri Oct 11 07:36:30 2019 From: llvm-commits at lists.llvm.org (Djordje Todorovic via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 14:36:30 +0000 (UTC) Subject: [PATCH] D68869: [DebugInfo] Fix truncation of call site immediates In-Reply-To: References: Message-ID: djtodoro added 
a comment. Looks reasonable, thanks for fixing this! Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68869/new/ https://reviews.llvm.org/D68869 From llvm-commits at lists.llvm.org Fri Oct 11 07:36:34 2019 From: llvm-commits at lists.llvm.org (Michał Górny via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 14:36:34 +0000 (UTC) Subject: [PATCH] D68452: [llvm] [ocaml] Support linking against dylib In-Reply-To: References: Message-ID: This revision was automatically updated to reflect the committed changes. Closed by commit rGda2a29a17cb4: [llvm] [ocaml] Support linking against dylib (authored by mgorny). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68452/new/ https://reviews.llvm.org/D68452 Files: llvm/cmake/modules/AddOCaml.cmake Index: llvm/cmake/modules/AddOCaml.cmake =================================================================== --- llvm/cmake/modules/AddOCaml.cmake +++ llvm/cmake/modules/AddOCaml.cmake @@ -66,21 +66,25 @@ list(APPEND ocaml_flags "-custom") endif() - explicit_map_components_to_libraries(llvm_libs ${ARG_LLVM}) - foreach( llvm_lib ${llvm_libs} ) - list(APPEND ocaml_flags "-l${llvm_lib}" ) - endforeach() + if(LLVM_LINK_LLVM_DYLIB) + list(APPEND ocaml_flags "-lLLVM") + else() + explicit_map_components_to_libraries(llvm_libs ${ARG_LLVM}) + foreach( llvm_lib ${llvm_libs} ) + list(APPEND ocaml_flags "-l${llvm_lib}" ) + endforeach() - get_property(system_libs TARGET LLVMSupport PROPERTY LLVM_SYSTEM_LIBS) - foreach(system_lib ${system_libs}) - if (system_lib MATCHES "^-") - # If it's an option, pass it without changes. - list(APPEND ocaml_flags "${system_lib}" ) - else() - # Otherwise assume it's a library name we need to link with. 
- list(APPEND ocaml_flags "-l${system_lib}" ) - endif() - endforeach() + get_property(system_libs TARGET LLVMSupport PROPERTY LLVM_SYSTEM_LIBS) + foreach(system_lib ${system_libs}) + if (system_lib MATCHES "^-") + # If it's an option, pass it without changes. + list(APPEND ocaml_flags "${system_lib}" ) + else() + # Otherwise assume it's a library name we need to link with. + list(APPEND ocaml_flags "-l${system_lib}" ) + endif() + endforeach() + endif() string(REPLACE ";" " " ARG_CFLAGS "${ARG_CFLAGS}") set(c_flags "${ARG_CFLAGS} ${LLVM_DEFINITIONS}") From llvm-commits at lists.llvm.org Fri Oct 11 07:36:53 2019 From: llvm-commits at lists.llvm.org (Dmitry Preobrazhensky via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 14:36:53 +0000 (UTC) Subject: [PATCH] D68785: [AMDGPU][MC][GFX10] Enabled null for 64-bit dst operands In-Reply-To: References: Message-ID: <8167744a6f6047ab599656aaccc00f78@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rG472c6b0aa022: [AMDGPU][MC][GFX10] Enabled null for 64-bit dst operands (authored by dp). Herald added subscribers: llvm-commits, hiraditya. Herald added a project: LLVM. Changed prior to commit: https://reviews.llvm.org/D68785?vs=224318&id=224592#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68785/new/ https://reviews.llvm.org/D68785 Files: llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp llvm/test/MC/AMDGPU/sop1.s llvm/test/MC/AMDGPU/sop2.s llvm/test/MC/AMDGPU/sopk.s llvm/test/MC/Disassembler/AMDGPU/gfx10_dasm_all.txt -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68785.224592.patch Type: text/x-patch Size: 7190 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Fri Oct 11 07:44:51 2019 From: llvm-commits at lists.llvm.org (Dmitry Preobrazhensky via llvm-commits) Date: Fri, 11 Oct 2019 14:44:51 -0000 Subject: [llvm] r374559 - [AMDGPU][MC][GFX6][GFX7][GFX10] Added instructions buffer_atomic_[fcmpswap/fmin/fmax]* Message-ID: <20191011144451.C070E8A7E3@lists.llvm.org> Author: dpreobra Date: Fri Oct 11 07:44:51 2019 New Revision: 374559 URL: http://llvm.org/viewvc/llvm-project?rev=374559&view=rev Log: [AMDGPU][MC][GFX6][GFX7][GFX10] Added instructions buffer_atomic_[fcmpswap/fmin/fmax]* See https://bugs.llvm.org/show_bug.cgi?id=28232 Reviewers: arsenm, rampitec Differential Revision: https://reviews.llvm.org/D68788 Added: llvm/trunk/test/MC/Disassembler/AMDGPU/mubuf_gfx10.txt Modified: llvm/trunk/lib/Target/AMDGPU/BUFInstructions.td llvm/trunk/test/MC/AMDGPU/mubuf-gfx10.s llvm/trunk/test/MC/AMDGPU/mubuf.s Modified: llvm/trunk/lib/Target/AMDGPU/BUFInstructions.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/BUFInstructions.td?rev=374559&r1=374558&r2=374559&view=diff ============================================================================== --- llvm/trunk/lib/Target/AMDGPU/BUFInstructions.td (original) +++ llvm/trunk/lib/Target/AMDGPU/BUFInstructions.td Fri Oct 11 07:44:51 2019 @@ -1010,19 +1010,36 @@ def BUFFER_STORE_LDS_DWORD : MUBUF_Pseud let SubtargetPredicate = isGFX6 in { // isn't on CI & VI /* defm BUFFER_ATOMIC_RSUB : MUBUF_Pseudo_Atomics <"buffer_atomic_rsub">; -defm BUFFER_ATOMIC_FCMPSWAP : MUBUF_Pseudo_Atomics <"buffer_atomic_fcmpswap">; -defm BUFFER_ATOMIC_FMIN : MUBUF_Pseudo_Atomics <"buffer_atomic_fmin">; -defm BUFFER_ATOMIC_FMAX : MUBUF_Pseudo_Atomics <"buffer_atomic_fmax">; defm BUFFER_ATOMIC_RSUB_X2 : MUBUF_Pseudo_Atomics <"buffer_atomic_rsub_x2">; -defm BUFFER_ATOMIC_FCMPSWAP_X2 : MUBUF_Pseudo_Atomics <"buffer_atomic_fcmpswap_x2">; -defm BUFFER_ATOMIC_FMIN_X2 : 
MUBUF_Pseudo_Atomics <"buffer_atomic_fmin_x2">; -defm BUFFER_ATOMIC_FMAX_X2 : MUBUF_Pseudo_Atomics <"buffer_atomic_fmax_x2">; */ def BUFFER_WBINVL1_SC : MUBUF_Invalidate <"buffer_wbinvl1_sc", int_amdgcn_buffer_wbinvl1_sc>; } +let SubtargetPredicate = isGFX6GFX7GFX10 in { + +defm BUFFER_ATOMIC_FCMPSWAP : MUBUF_Pseudo_Atomics < + "buffer_atomic_fcmpswap", VReg_64, v2f32, null_frag +>; +defm BUFFER_ATOMIC_FMIN : MUBUF_Pseudo_Atomics < + "buffer_atomic_fmin", VGPR_32, f32, null_frag +>; +defm BUFFER_ATOMIC_FMAX : MUBUF_Pseudo_Atomics < + "buffer_atomic_fmax", VGPR_32, f32, null_frag +>; +defm BUFFER_ATOMIC_FCMPSWAP_X2 : MUBUF_Pseudo_Atomics < + "buffer_atomic_fcmpswap_x2", VReg_128, v2f64, null_frag +>; +defm BUFFER_ATOMIC_FMIN_X2 : MUBUF_Pseudo_Atomics < + "buffer_atomic_fmin_x2", VReg_64, f64, null_frag +>; +defm BUFFER_ATOMIC_FMAX_X2 : MUBUF_Pseudo_Atomics < + "buffer_atomic_fmax_x2", VReg_64, f64, null_frag +>; + +} + let SubtargetPredicate = HasD16LoadStore in { defm BUFFER_LOAD_UBYTE_D16 : MUBUF_Pseudo_Loads < @@ -2025,10 +2042,9 @@ defm BUFFER_ATOMIC_OR : MUBUF_R defm BUFFER_ATOMIC_XOR : MUBUF_Real_Atomics_gfx6_gfx7_gfx10<0x03b>; defm BUFFER_ATOMIC_INC : MUBUF_Real_Atomics_gfx6_gfx7_gfx10<0x03c>; defm BUFFER_ATOMIC_DEC : MUBUF_Real_Atomics_gfx6_gfx7_gfx10<0x03d>; -// FIXME-GFX6-GFX7-GFX10: Add following instructions: -//defm BUFFER_ATOMIC_FCMPSWAP : MUBUF_Real_Atomics_gfx6_gfx7_gfx10<0x03e>; -//defm BUFFER_ATOMIC_FMIN : MUBUF_Real_Atomics_gfx6_gfx7_gfx10<0x03f>; -//defm BUFFER_ATOMIC_FMAX : MUBUF_Real_Atomics_gfx6_gfx7_gfx10<0x040>; +defm BUFFER_ATOMIC_FCMPSWAP : MUBUF_Real_Atomics_gfx6_gfx7_gfx10<0x03e>; +defm BUFFER_ATOMIC_FMIN : MUBUF_Real_Atomics_gfx6_gfx7_gfx10<0x03f>; +defm BUFFER_ATOMIC_FMAX : MUBUF_Real_Atomics_gfx6_gfx7_gfx10<0x040>; defm BUFFER_ATOMIC_SWAP_X2 : MUBUF_Real_Atomics_gfx6_gfx7_gfx10<0x050>; defm BUFFER_ATOMIC_CMPSWAP_X2 : MUBUF_Real_Atomics_gfx6_gfx7_gfx10<0x051>; defm BUFFER_ATOMIC_ADD_X2 : MUBUF_Real_Atomics_gfx6_gfx7_gfx10<0x052>; @@ 
-2043,10 +2059,9 @@ defm BUFFER_ATOMIC_XOR_X2 : MUBUF_R defm BUFFER_ATOMIC_INC_X2 : MUBUF_Real_Atomics_gfx6_gfx7_gfx10<0x05c>; defm BUFFER_ATOMIC_DEC_X2 : MUBUF_Real_Atomics_gfx6_gfx7_gfx10<0x05d>; // FIXME-GFX7: Need to handle hazard for BUFFER_ATOMIC_FCMPSWAP_X2 on GFX7. -// FIXME-GFX6-GFX7-GFX10: Add following instructions: -//defm BUFFER_ATOMIC_FCMPSWAP_X2 : MUBUF_Real_Atomics_gfx6_gfx7_gfx10<0x05e>; -//defm BUFFER_ATOMIC_FMIN_X2 : MUBUF_Real_Atomics_gfx6_gfx7_gfx10<0x05f>; -//defm BUFFER_ATOMIC_FMAX_X2 : MUBUF_Real_Atomics_gfx6_gfx7_gfx10<0x060>; +defm BUFFER_ATOMIC_FCMPSWAP_X2 : MUBUF_Real_Atomics_gfx6_gfx7_gfx10<0x05e>; +defm BUFFER_ATOMIC_FMIN_X2 : MUBUF_Real_Atomics_gfx6_gfx7_gfx10<0x05f>; +defm BUFFER_ATOMIC_FMAX_X2 : MUBUF_Real_Atomics_gfx6_gfx7_gfx10<0x060>; defm BUFFER_WBINVL1_SC : MUBUF_Real_gfx6<0x070>; defm BUFFER_WBINVL1_VOL : MUBUF_Real_gfx7<0x070>; Modified: llvm/trunk/test/MC/AMDGPU/mubuf-gfx10.s URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/MC/AMDGPU/mubuf-gfx10.s?rev=374559&r1=374558&r2=374559&view=diff ============================================================================== --- llvm/trunk/test/MC/AMDGPU/mubuf-gfx10.s (original) +++ llvm/trunk/test/MC/AMDGPU/mubuf-gfx10.s Fri Oct 11 07:44:51 2019 @@ -8,3 +8,39 @@ buffer_load_sbyte v5, off, s[8:11], s3 g buffer_load_sbyte v5, off, s[8:11], s3 glc slc dlc // GFX10: buffer_load_sbyte v5, off, s[8:11], s3 glc slc dlc ; encoding: [0x00,0xc0,0x24,0xe0,0x00,0x05,0x42,0x03] + +buffer_atomic_fcmpswap v[0:1], off, s[0:3], s0 offset:4095 +// GFX10: buffer_atomic_fcmpswap v[0:1], off, s[0:3], s0 offset:4095 ; encoding: [0xff,0x0f,0xf8,0xe0,0x00,0x00,0x00,0x00] + +buffer_atomic_fcmpswap_x2 v[0:3], off, s[0:3], s0 offset:4095 +// GFX10: buffer_atomic_fcmpswap_x2 v[0:3], off, s[0:3], s0 offset:4095 ; encoding: [0xff,0x0f,0x78,0xe1,0x00,0x00,0x00,0x00] + +buffer_atomic_fcmpswap_x2 v[0:3], v0, s[0:3], s0 idxen offset:4095 +// GFX10: buffer_atomic_fcmpswap_x2 v[0:3], v0, s[0:3], s0 idxen 
offset:4095 ; encoding: [0xff,0x2f,0x78,0xe1,0x00,0x00,0x00,0x00] + +buffer_atomic_fmax v1, off, s[0:3], s0 offset:4095 +// GFX10: buffer_atomic_fmax v1, off, s[0:3], s0 offset:4095 ; encoding: [0xff,0x0f,0x00,0xe1,0x00,0x01,0x00,0x00] + +buffer_atomic_fmax v0, off, s[0:3], s0 offset:7 +// GFX10: buffer_atomic_fmax v0, off, s[0:3], s0 offset:7 ; encoding: [0x07,0x00,0x00,0xe1,0x00,0x00,0x00,0x00] + +buffer_atomic_fmax v0, off, s[0:3], s0 offset:4095 glc +// GFX10: buffer_atomic_fmax v0, off, s[0:3], s0 offset:4095 glc ; encoding: [0xff,0x4f,0x00,0xe1,0x00,0x00,0x00,0x00] + +buffer_atomic_fmax_x2 v[5:6], off, s[0:3], s0 offset:4095 +// GFX10: buffer_atomic_fmax_x2 v[5:6], off, s[0:3], s0 offset:4095 ; encoding: [0xff,0x0f,0x80,0xe1,0x00,0x05,0x00,0x00] + +buffer_atomic_fmax_x2 v[0:1], v0, s[0:3], s0 idxen offset:4095 +// GFX10: buffer_atomic_fmax_x2 v[0:1], v0, s[0:3], s0 idxen offset:4095 ; encoding: [0xff,0x2f,0x80,0xe1,0x00,0x00,0x00,0x00] + +buffer_atomic_fmin v0, off, s[0:3], s0 +// GFX10: buffer_atomic_fmin v0, off, s[0:3], s0 ; encoding: [0x00,0x00,0xfc,0xe0,0x00,0x00,0x00,0x00] + +buffer_atomic_fmin v0, off, s[0:3], s0 offset:0 +// GFX10: buffer_atomic_fmin v0, off, s[0:3], s0 ; encoding: [0x00,0x00,0xfc,0xe0,0x00,0x00,0x00,0x00] + +buffer_atomic_fmin_x2 v[0:1], off, s[0:3], s0 offset:4095 slc +// GFX10: buffer_atomic_fmin_x2 v[0:1], off, s[0:3], s0 offset:4095 slc ; encoding: [0xff,0x0f,0x7c,0xe1,0x00,0x00,0x40,0x00] + +buffer_atomic_fmin_x2 v[0:1], v0, s[0:3], s0 idxen offset:4095 +// GFX10: buffer_atomic_fmin_x2 v[0:1], v0, s[0:3], s0 idxen offset:4095 ; encoding: [0xff,0x2f,0x7c,0xe1,0x00,0x00,0x00,0x00] Modified: llvm/trunk/test/MC/AMDGPU/mubuf.s URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/MC/AMDGPU/mubuf.s?rev=374559&r1=374558&r2=374559&view=diff ============================================================================== --- llvm/trunk/test/MC/AMDGPU/mubuf.s (original) +++ llvm/trunk/test/MC/AMDGPU/mubuf.s Fri Oct 11 07:44:51 2019 @@ 
-719,6 +719,62 @@ buffer_atomic_add v5, off, s[8:11], 0.15 // NOSICI: error: invalid operand for instruction // VI: buffer_atomic_add v5, off, s[8:11], 0.15915494 offset:4095 glc ; encoding: [0xff,0x4f,0x08,0xe1,0x00,0x05,0x02,0xf8] +buffer_atomic_fcmpswap v[0:1], off, s[0:3], s0 offset:4095 +// SICI: buffer_atomic_fcmpswap v[0:1], off, s[0:3], s0 offset:4095 ; encoding: [0xff,0x0f,0xf8,0xe0,0x00,0x00,0x00,0x00] +// NOVI: error: not a valid operand. + +buffer_atomic_fcmpswap v[0:1], v[0:1], s[0:3], s0 addr64 offset:4095 +// SICI: buffer_atomic_fcmpswap v[0:1], v[0:1], s[0:3], s0 addr64 offset:4095 ; encoding: [0xff,0x8f,0xf8,0xe0,0x00,0x00,0x00,0x00] +// NOVI: error: not a valid operand. + +buffer_atomic_fcmpswap_x2 v[0:3], off, s[0:3], s0 offset:4095 +// SICI: buffer_atomic_fcmpswap_x2 v[0:3], off, s[0:3], s0 offset:4095 ; encoding: [0xff,0x0f,0x78,0xe1,0x00,0x00,0x00,0x00] +// NOVI: error: not a valid operand. + +buffer_atomic_fcmpswap_x2 v[0:3], v0, s[0:3], s0 idxen offset:4095 +// SICI: buffer_atomic_fcmpswap_x2 v[0:3], v0, s[0:3], s0 idxen offset:4095 ; encoding: [0xff,0x2f,0x78,0xe1,0x00,0x00,0x00,0x00] +// NOVI: error: not a valid operand. + +buffer_atomic_fmax v1, off, s[0:3], s0 offset:4095 +// SICI: buffer_atomic_fmax v1, off, s[0:3], s0 offset:4095 ; encoding: [0xff,0x0f,0x00,0xe1,0x00,0x01,0x00,0x00] +// NOVI: error: not a valid operand. + +buffer_atomic_fmax v0, off, s[0:3], s0 offset:7 +// SICI: buffer_atomic_fmax v0, off, s[0:3], s0 offset:7 ; encoding: [0x07,0x00,0x00,0xe1,0x00,0x00,0x00,0x00] +// NOVI: error: not a valid operand. + +buffer_atomic_fmax v0, off, s[0:3], s0 offset:4095 glc +// SICI: buffer_atomic_fmax v0, off, s[0:3], s0 offset:4095 glc ; encoding: [0xff,0x4f,0x00,0xe1,0x00,0x00,0x00,0x00] +// NOVI: error: not a valid operand. 
+ +buffer_atomic_fmax_x2 v[5:6], off, s[0:3], s0 offset:4095 +// SICI: buffer_atomic_fmax_x2 v[5:6], off, s[0:3], s0 offset:4095 ; encoding: [0xff,0x0f,0x80,0xe1,0x00,0x05,0x00,0x00] +// NOVI: error: not a valid operand. + +buffer_atomic_fmax_x2 v[0:1], v0, s[0:3], s0 idxen offset:4095 +// SICI: buffer_atomic_fmax_x2 v[0:1], v0, s[0:3], s0 idxen offset:4095 ; encoding: [0xff,0x2f,0x80,0xe1,0x00,0x00,0x00,0x00] +// NOVI: error: not a valid operand. + +buffer_atomic_fmin v0, v[0:1], s[0:3], s0 addr64 offset:4095 +// SICI: buffer_atomic_fmin v0, v[0:1], s[0:3], s0 addr64 offset:4095 ; encoding: [0xff,0x8f,0xfc,0xe0,0x00,0x00,0x00,0x00] +// NOVI: error: not a valid operand. + +buffer_atomic_fmin v0, off, s[0:3], s0 +// SICI: buffer_atomic_fmin v0, off, s[0:3], s0 ; encoding: [0x00,0x00,0xfc,0xe0,0x00,0x00,0x00,0x00] +// NOVI: error: instruction not supported on this GPU + +buffer_atomic_fmin v0, off, s[0:3], s0 offset:0 +// SICI: buffer_atomic_fmin v0, off, s[0:3], s0 ; encoding: [0x00,0x00,0xfc,0xe0,0x00,0x00,0x00,0x00] +// NOVI: error: not a valid operand. + +buffer_atomic_fmin_x2 v[0:1], off, s[0:3], s0 offset:4095 slc +// SICI: buffer_atomic_fmin_x2 v[0:1], off, s[0:3], s0 offset:4095 slc ; encoding: [0xff,0x0f,0x7c,0xe1,0x00,0x00,0x40,0x00] +// NOVI: error: not a valid operand. + +buffer_atomic_fmin_x2 v[0:1], v0, s[0:3], s0 idxen offset:4095 +// SICI: buffer_atomic_fmin_x2 v[0:1], v0, s[0:3], s0 idxen offset:4095 ; encoding: [0xff,0x2f,0x7c,0xe1,0x00,0x00,0x00,0x00] +// NOVI: error: not a valid operand. 
+ //===----------------------------------------------------------------------===// // Lds support //===----------------------------------------------------------------------===// Added: llvm/trunk/test/MC/Disassembler/AMDGPU/mubuf_gfx10.txt URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/MC/Disassembler/AMDGPU/mubuf_gfx10.txt?rev=374559&view=auto ============================================================================== --- llvm/trunk/test/MC/Disassembler/AMDGPU/mubuf_gfx10.txt (added) +++ llvm/trunk/test/MC/Disassembler/AMDGPU/mubuf_gfx10.txt Fri Oct 11 07:44:51 2019 @@ -0,0 +1,31 @@ +# RUN: llvm-mc -arch=amdgcn -mcpu=gfx1010 -disassemble -show-encoding < %s | FileCheck %s + +# CHECK: buffer_atomic_fcmpswap v[5:6], off, s[8:11], s3 offset:4095 ; encoding: [0xff,0x0f,0xf8,0xe0,0x00,0x05,0x02,0x03] +0xff,0x0f,0xf8,0xe0,0x00,0x05,0x02,0x03 + +# CHECK: buffer_atomic_fcmpswap v[254:255], off, s[8:11], s3 offset:4095 ; encoding: [0xff,0x0f,0xf8,0xe0,0x00,0xfe,0x02,0x03] +0xff,0x0f,0xf8,0xe0,0x00,0xfe,0x02,0x03 + +# CHECK: buffer_atomic_fcmpswap_x2 v[5:8], off, s[8:11], s3 offset:7 ; encoding: [0x07,0x00,0x78,0xe1,0x00,0x05,0x02,0x03] +0x07,0x00,0x78,0xe1,0x00,0x05,0x02,0x03 + +# CHECK: buffer_atomic_fcmpswap_x2 v[5:8], off, s[8:11], s3 offset:4095 glc ; encoding: [0xff,0x4f,0x78,0xe1,0x00,0x05,0x02,0x03] +0xff,0x4f,0x78,0xe1,0x00,0x05,0x02,0x03 + +# CHECK: buffer_atomic_fmax v5, v0, s[8:11], s3 idxen offset:4095 ; encoding: [0xff,0x2f,0x00,0xe1,0x00,0x05,0x02,0x03] +0xff,0x2f,0x00,0xe1,0x00,0x05,0x02,0x03 + +# CHECK: buffer_atomic_fmax_x2 v[5:6], off, s[8:11], s3 offset:4095 glc ; encoding: [0xff,0x4f,0x80,0xe1,0x00,0x05,0x02,0x03] +0xff,0x4f,0x80,0xe1,0x00,0x05,0x02,0x03 + +# CHECK: buffer_atomic_fmax_x2 v[5:6], off, s[8:11], s3 offset:4095 slc ; encoding: [0xff,0x0f,0x80,0xe1,0x00,0x05,0x42,0x03] +0xff,0x0f,0x80,0xe1,0x00,0x05,0x42,0x03 + +# CHECK: buffer_atomic_fmin v5, off, s[8:11], s3 ; encoding: [0x00,0x00,0xfc,0xe0,0x00,0x05,0x02,0x03] 
+0x00,0x00,0xfc,0xe0,0x00,0x05,0x02,0x03 + +# CHECK: buffer_atomic_fmin v5, off, s[8:11], s3 offset:7 ; encoding: [0x07,0x00,0xfc,0xe0,0x00,0x05,0x02,0x03] +0x07,0x00,0xfc,0xe0,0x00,0x05,0x02,0x03 + +# CHECK: buffer_atomic_fmin_x2 v[5:6], off, ttmp[12:15], s3 offset:4095 ; encoding: [0xff,0x0f,0x7c,0xe1,0x00,0x05,0x1e,0x03] +0xff,0x0f,0x7c,0xe1,0x00,0x05,0x1e,0x03 From llvm-commits at lists.llvm.org Fri Oct 11 07:48:31 2019 From: llvm-commits at lists.llvm.org (GN Sync Bot via llvm-commits) Date: Fri, 11 Oct 2019 14:48:31 -0000 Subject: [llvm] r374560 - gn build: Merge r374558 Message-ID: <20191011144831.DB7FF83C7E@lists.llvm.org> Author: gnsyncbot Date: Fri Oct 11 07:48:31 2019 New Revision: 374560 URL: http://llvm.org/viewvc/llvm-project?rev=374560&view=rev Log: gn build: Merge r374558 Modified: llvm/trunk/utils/gn/secondary/clang/lib/Tooling/Transformer/BUILD.gn Modified: llvm/trunk/utils/gn/secondary/clang/lib/Tooling/Transformer/BUILD.gn URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/gn/secondary/clang/lib/Tooling/Transformer/BUILD.gn?rev=374560&r1=374559&r2=374560&view=diff ============================================================================== --- llvm/trunk/utils/gn/secondary/clang/lib/Tooling/Transformer/BUILD.gn (original) +++ llvm/trunk/utils/gn/secondary/clang/lib/Tooling/Transformer/BUILD.gn Fri Oct 11 07:48:31 2019 @@ -12,6 +12,7 @@ static_library("Transformer") { ] sources = [ "RangeSelector.cpp", + "RewriteRule.cpp", "SourceCode.cpp", "SourceCodeBuilders.cpp", "Stencil.cpp", From llvm-commits at lists.llvm.org Fri Oct 11 07:46:14 2019 From: llvm-commits at lists.llvm.org (Dmitry Preobrazhensky via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 14:46:14 +0000 (UTC) Subject: [PATCH] D68788: [AMDGPU][MC][GFX6][GFX7][GFX10] Added instructions buffer_atomic_[fcmpswap/fmin/fmax]* In-Reply-To: References: Message-ID: <040b82f5d3e0250d5ac55c1938dde6fe@localhost.localdomain> This revision was automatically updated to reflect the 
committed changes. Closed by commit rGb82fae01ea45: [AMDGPU][MC][GFX6][GFX7][GFX10] Added instructions buffer_atomic_… (authored by dp). Herald added subscribers: llvm-commits, hiraditya. Herald added a project: LLVM. Changed prior to commit: https://reviews.llvm.org/D68788?vs=224329&id=224594#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68788/new/ https://reviews.llvm.org/D68788 Files: llvm/lib/Target/AMDGPU/BUFInstructions.td llvm/test/MC/AMDGPU/mubuf-gfx10.s llvm/test/MC/AMDGPU/mubuf.s llvm/test/MC/Disassembler/AMDGPU/mubuf_gfx10.txt -------------- next part -------------- A non-text attachment was scrubbed... Name: D68788.224594.patch Type: text/x-patch Size: 11930 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Fri Oct 11 07:53:26 2019 From: llvm-commits at lists.llvm.org (Dmitry Preobrazhensky via llvm-commits) Date: Fri, 11 Oct 2019 14:53:26 -0000 Subject: [llvm] r374561 - [AMDGPU][MC][GFX9][GFX10] Corrected number of src operands for ds_[read/write]_addtid_b32 Message-ID: <20191011145327.0219B926D9@lists.llvm.org> Author: dpreobra Date: Fri Oct 11 07:53:26 2019 New Revision: 374561 URL: http://llvm.org/viewvc/llvm-project?rev=374561&view=rev Log: [AMDGPU][MC][GFX9][GFX10] Corrected number of src operands for ds_[read/write]_addtid_b32 See https://bugs.llvm.org/show_bug.cgi?id=37941 Reviewers: arsenm, rampitec Differential Revision: https://reviews.llvm.org/D68787 Modified: llvm/trunk/lib/Target/AMDGPU/DSInstructions.td llvm/trunk/test/MC/AMDGPU/ds-gfx9.s llvm/trunk/test/MC/AMDGPU/gfx10_asm_all.s llvm/trunk/test/MC/AMDGPU/gfx10_asm_err.s llvm/trunk/test/MC/Disassembler/AMDGPU/gfx10_dasm_all.txt Modified: llvm/trunk/lib/Target/AMDGPU/DSInstructions.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/DSInstructions.td?rev=374561&r1=374560&r2=374561&view=diff ============================================================================== --- 
llvm/trunk/lib/Target/AMDGPU/DSInstructions.td (original) +++ llvm/trunk/lib/Target/AMDGPU/DSInstructions.td Fri Oct 11 07:53:26 2019 @@ -81,6 +81,17 @@ class DS_Real : // DS Pseudo instructions +class DS_0A1D_NORET +: DS_Pseudo { + + let has_addr = 0; + let has_data1 = 0; + let has_vdst = 0; +} + class DS_1A1D_NORET : DS_Pseudo; } +} // End has_m0_read = 0 + let SubtargetPredicate = HasDSAddTid in { -def DS_WRITE_ADDTID_B32 : DS_1A1D_NORET<"ds_write_addtid_b32">; +def DS_WRITE_ADDTID_B32 : DS_0A1D_NORET<"ds_write_addtid_b32">; } -} // End has_m0_read = 0 } // End mayLoad = 0 defm DS_MSKOR_B32 : DS_1A2D_NORET_mc<"ds_mskor_b32">; @@ -543,13 +555,14 @@ def DS_READ_I8_D16_HI : DS_1A_RET_Tied< def DS_READ_U16_D16 : DS_1A_RET_Tied<"ds_read_u16_d16">; def DS_READ_U16_D16_HI : DS_1A_RET_Tied<"ds_read_u16_d16_hi">; } +} // End has_m0_read = 0 let SubtargetPredicate = HasDSAddTid in { -def DS_READ_ADDTID_B32 : DS_1A_RET<"ds_read_addtid_b32">; -} -} // End has_m0_read = 0 +def DS_READ_ADDTID_B32 : DS_0A_RET<"ds_read_addtid_b32">; } +} // End mayStore = 0 + def DS_CONSUME : DS_0A_RET<"ds_consume">; def DS_APPEND : DS_0A_RET<"ds_append">; def DS_ORDERED_COUNT : DS_1A_RET_GDS<"ds_ordered_count">; Modified: llvm/trunk/test/MC/AMDGPU/ds-gfx9.s URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/MC/AMDGPU/ds-gfx9.s?rev=374561&r1=374560&r2=374561&view=diff ============================================================================== --- llvm/trunk/test/MC/AMDGPU/ds-gfx9.s (original) +++ llvm/trunk/test/MC/AMDGPU/ds-gfx9.s Fri Oct 11 07:53:26 2019 @@ -33,10 +33,10 @@ ds_write_b16_d16_hi v8, v2 // VI-ERR: error: instruction not supported on this GPU // GFX9: ds_write_b16_d16_hi v8, v2 ; encoding: [0x00,0x00,0xaa,0xd8,0x08,0x02,0x00,0x00] -ds_write_addtid_b32 v8, v2 +ds_write_addtid_b32 v8 // VI-ERR: error: instruction not supported on this GPU -// GFX9: ds_write_addtid_b32 v8, v2 ; encoding: [0x00,0x00,0x3a,0xd8,0x08,0x02,0x00,0x00] +// GFX9: ds_write_addtid_b32 v8 ; encoding: 
[0x00,0x00,0x3a,0xd8,0x00,0x08,0x00,0x00] -ds_read_addtid_b32 v8, v2 +ds_read_addtid_b32 v8 // VI-ERR: error: instruction not supported on this GPU -// GFX9: ds_read_addtid_b32 v8, v2 ; encoding: [0x00,0x00,0x6c,0xd9,0x02,0x00,0x00,0x08] +// GFX9: ds_read_addtid_b32 v8 ; encoding: [0x00,0x00,0x6c,0xd9,0x00,0x00,0x00,0x08] Modified: llvm/trunk/test/MC/AMDGPU/gfx10_asm_all.s URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/MC/AMDGPU/gfx10_asm_all.s?rev=374561&r1=374560&r2=374561&view=diff ============================================================================== --- llvm/trunk/test/MC/AMDGPU/gfx10_asm_all.s (original) +++ llvm/trunk/test/MC/AMDGPU/gfx10_asm_all.s Fri Oct 11 07:53:26 2019 @@ -6538,47 +6538,47 @@ ds_read_u16_d16_hi v5, v1 offset:4 ds_read_u16_d16_hi v5, v1 offset:65535 gds // GFX10: encoding: [0xff,0xff,0x9e,0xda,0x01,0x00,0x00,0x05] -ds_write_addtid_b32 v5, v1 offset:65535 -// GFX10: encoding: [0xff,0xff,0xc0,0xda,0x05,0x01,0x00,0x00] +ds_write_addtid_b32 v5 offset:65535 +// GFX10: encoding: [0xff,0xff,0xc0,0xda,0x00,0x05,0x00,0x00] -ds_write_addtid_b32 v255, v1 offset:65535 -// GFX10: encoding: [0xff,0xff,0xc0,0xda,0xff,0x01,0x00,0x00] +ds_write_addtid_b32 v255 offset:65535 +// GFX10: encoding: [0xff,0xff,0xc0,0xda,0x00,0xff,0x00,0x00] -ds_write_addtid_b32 v5, v255 offset:65535 -// GFX10: encoding: [0xff,0xff,0xc0,0xda,0x05,0xff,0x00,0x00] +ds_write_addtid_b32 v5 offset:65535 +// GFX10: encoding: [0xff,0xff,0xc0,0xda,0x00,0x05,0x00,0x00] -ds_write_addtid_b32 v5, v1 -// GFX10: encoding: [0x00,0x00,0xc0,0xda,0x05,0x01,0x00,0x00] +ds_write_addtid_b32 v5 +// GFX10: encoding: [0x00,0x00,0xc0,0xda,0x00,0x05,0x00,0x00] -ds_write_addtid_b32 v5, v1 offset:0 -// GFX10: encoding: [0x00,0x00,0xc0,0xda,0x05,0x01,0x00,0x00] +ds_write_addtid_b32 v5 offset:0 +// GFX10: encoding: [0x00,0x00,0xc0,0xda,0x00,0x05,0x00,0x00] -ds_write_addtid_b32 v5, v1 offset:4 -// GFX10: encoding: [0x04,0x00,0xc0,0xda,0x05,0x01,0x00,0x00] +ds_write_addtid_b32 v5 offset:4 +// 
GFX10: encoding: [0x04,0x00,0xc0,0xda,0x00,0x05,0x00,0x00] -ds_write_addtid_b32 v5, v1 offset:65535 gds -// GFX10: encoding: [0xff,0xff,0xc2,0xda,0x05,0x01,0x00,0x00] +ds_write_addtid_b32 v5 offset:65535 gds +// GFX10: encoding: [0xff,0xff,0xc2,0xda,0x00,0x05,0x00,0x00] -ds_read_addtid_b32 v5, v1 offset:65535 -// GFX10: encoding: [0xff,0xff,0xc4,0xda,0x01,0x00,0x00,0x05] +ds_read_addtid_b32 v5 offset:65535 +// GFX10: encoding: [0xff,0xff,0xc4,0xda,0x00,0x00,0x00,0x05] -ds_read_addtid_b32 v255, v1 offset:65535 -// GFX10: encoding: [0xff,0xff,0xc4,0xda,0x01,0x00,0x00,0xff] +ds_read_addtid_b32 v255 offset:65535 +// GFX10: encoding: [0xff,0xff,0xc4,0xda,0x00,0x00,0x00,0xff] -ds_read_addtid_b32 v5, v255 offset:65535 -// GFX10: encoding: [0xff,0xff,0xc4,0xda,0xff,0x00,0x00,0x05] +ds_read_addtid_b32 v5 offset:65535 +// GFX10: encoding: [0xff,0xff,0xc4,0xda,0x00,0x00,0x00,0x05] -ds_read_addtid_b32 v5, v1 -// GFX10: encoding: [0x00,0x00,0xc4,0xda,0x01,0x00,0x00,0x05] +ds_read_addtid_b32 v5 +// GFX10: encoding: [0x00,0x00,0xc4,0xda,0x00,0x00,0x00,0x05] -ds_read_addtid_b32 v5, v1 offset:0 -// GFX10: encoding: [0x00,0x00,0xc4,0xda,0x01,0x00,0x00,0x05] +ds_read_addtid_b32 v5 offset:0 +// GFX10: encoding: [0x00,0x00,0xc4,0xda,0x00,0x00,0x00,0x05] -ds_read_addtid_b32 v5, v1 offset:4 -// GFX10: encoding: [0x04,0x00,0xc4,0xda,0x01,0x00,0x00,0x05] +ds_read_addtid_b32 v5 offset:4 +// GFX10: encoding: [0x04,0x00,0xc4,0xda,0x00,0x00,0x00,0x05] -ds_read_addtid_b32 v5, v1 offset:65535 gds -// GFX10: encoding: [0xff,0xff,0xc6,0xda,0x01,0x00,0x00,0x05] +ds_read_addtid_b32 v5 offset:65535 gds +// GFX10: encoding: [0xff,0xff,0xc6,0xda,0x00,0x00,0x00,0x05] ds_permute_b32 v0, v1, v2 // GFX10: encoding: [0x00,0x00,0xc8,0xda,0x01,0x02,0x00,0x00] Modified: llvm/trunk/test/MC/AMDGPU/gfx10_asm_err.s URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/MC/AMDGPU/gfx10_asm_err.s?rev=374561&r1=374560&r2=374561&view=diff ============================================================================== 
--- llvm/trunk/test/MC/AMDGPU/gfx10_asm_err.s (original) +++ llvm/trunk/test/MC/AMDGPU/gfx10_asm_err.s Fri Oct 11 07:53:26 2019 @@ -35,10 +35,10 @@ ds_read_u16_d16 v5, v1 ds_read_u16_d16_hi v5, v1 // GFX6-8: error: instruction not supported on this GPU -ds_write_addtid_b32 v5, v1 +ds_write_addtid_b32 v5 // GFX6-8: error: instruction not supported on this GPU -ds_read_addtid_b32 v5, v1 +ds_read_addtid_b32 v5 // GFX6-8: error: instruction not supported on this GPU // GFX8+. Modified: llvm/trunk/test/MC/Disassembler/AMDGPU/gfx10_dasm_all.txt URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/MC/Disassembler/AMDGPU/gfx10_dasm_all.txt?rev=374561&r1=374560&r2=374561&view=diff ============================================================================== --- llvm/trunk/test/MC/Disassembler/AMDGPU/gfx10_dasm_all.txt (original) +++ llvm/trunk/test/MC/Disassembler/AMDGPU/gfx10_dasm_all.txt Fri Oct 11 07:53:26 2019 @@ -5762,23 +5762,23 @@ # GFX10: ds_read2st64_b64 v[5:8], v255 offset0:127 offset1:255 ; encoding: [0x7f,0xff,0xe0,0xd9,0xff,0x00,0x00,0x05] 0x7f,0xff,0xe0,0xd9,0xff,0x00,0x00,0x05 -# GFX10: ds_read_addtid_b32 v255, v1 offset:65535 ; encoding: [0xff,0xff,0xc4,0xda,0x01,0x00,0x00,0xff] -0xff,0xff,0xc4,0xda,0x01,0x00,0x00,0xff +# GFX10: ds_read_addtid_b32 v255 offset:65535 ; encoding: [0xff,0xff,0xc4,0xda,0x00,0x00,0x00,0xff] +0xff 0xff 0xc4 0xda 0x00 0x00 0x00 0xff -# GFX10: ds_read_addtid_b32 v5, v1 ; encoding: [0x00,0x00,0xc4,0xda,0x01,0x00,0x00,0x05] -0x00,0x00,0xc4,0xda,0x01,0x00,0x00,0x05 +# GFX10: ds_read_addtid_b32 v5 ; encoding: [0x00,0x00,0xc4,0xda,0x00,0x00,0x00,0x05] +0x00 0x00 0xc4 0xda 0x00 0x00 0x00 0x05 -# GFX10: ds_read_addtid_b32 v5, v1 offset:4 ; encoding: [0x04,0x00,0xc4,0xda,0x01,0x00,0x00,0x05] -0x04,0x00,0xc4,0xda,0x01,0x00,0x00,0x05 +# GFX10: ds_read_addtid_b32 v5 offset:4 ; encoding: [0x04,0x00,0xc4,0xda,0x00,0x00,0x00,0x05] +0x04 0x00 0xc4 0xda 0x00 0x00 0x00 0x05 -# GFX10: ds_read_addtid_b32 v5, v1 offset:65535 ; encoding: 
[0xff,0xff,0xc4,0xda,0x01,0x00,0x00,0x05] -0xff,0xff,0xc4,0xda,0x01,0x00,0x00,0x05 +# GFX10: ds_read_addtid_b32 v5 offset:65535 ; encoding: [0xff,0xff,0xc4,0xda,0x00,0x00,0x00,0x05] +0xff 0xff 0xc4 0xda 0x00 0x00 0x00 0x05 -# GFX10: ds_read_addtid_b32 v5, v1 offset:65535 gds ; encoding: [0xff,0xff,0xc6,0xda,0x01,0x00,0x00,0x05] -0xff,0xff,0xc6,0xda,0x01,0x00,0x00,0x05 +# GFX10: ds_read_addtid_b32 v5 offset:65535 gds ; encoding: [0xff,0xff,0xc6,0xda,0x00,0x00,0x00,0x05] +0xff 0xff 0xc6 0xda 0x00 0x00 0x00 0x05 -# GFX10: ds_read_addtid_b32 v5, v255 offset:65535 ; encoding: [0xff,0xff,0xc4,0xda,0xff,0x00,0x00,0x05] -0xff,0xff,0xc4,0xda,0xff,0x00,0x00,0x05 +# GFX10: ds_read_addtid_b32 v5 offset:65535 ; encoding: [0xff,0xff,0xc4,0xda,0x00,0x00,0x00,0x05] +0xff 0xff 0xc4 0xda 0x00 0x00 0x00 0x05 # GFX10: ds_read_b128 v[252:255], v1 offset:65535 ; encoding: [0xff,0xff,0xfc,0xdb,0x01,0x00,0x00,0xfc] 0xff,0xff,0xfc,0xdb,0x01,0x00,0x00,0xfc @@ -7070,23 +7070,23 @@ # GFX10: ds_write2st64_b64 v255, v[2:3], v[3:4] offset0:127 offset1:255 ; encoding: [0x7f,0xff,0x3c,0xd9,0xff,0x02,0x03,0x00] 0x7f,0xff,0x3c,0xd9,0xff,0x02,0x03,0x00 -# GFX10: ds_write_addtid_b32 v255, v1 offset:65535 ; encoding: [0xff,0xff,0xc0,0xda,0xff,0x01,0x00,0x00] -0xff,0xff,0xc0,0xda,0xff,0x01,0x00,0x00 +# GFX10: ds_write_addtid_b32 v255 offset:65535 ; encoding: [0xff,0xff,0xc0,0xda,0x00,0xff,0x00,0x00] +0xff 0xff 0xc0 0xda 0x00 0xff 0x00 0x00 -# GFX10: ds_write_addtid_b32 v5, v1 ; encoding: [0x00,0x00,0xc0,0xda,0x05,0x01,0x00,0x00] -0x00,0x00,0xc0,0xda,0x05,0x01,0x00,0x00 +# GFX10: ds_write_addtid_b32 v5 ; encoding: [0x00,0x00,0xc0,0xda,0x00,0x05,0x00,0x00] +0x00 0x00 0xc0 0xda 0x00 0x05 0x00 0x00 -# GFX10: ds_write_addtid_b32 v5, v1 offset:4 ; encoding: [0x04,0x00,0xc0,0xda,0x05,0x01,0x00,0x00] -0x04,0x00,0xc0,0xda,0x05,0x01,0x00,0x00 +# GFX10: ds_write_addtid_b32 v5 offset:4 ; encoding: [0x04,0x00,0xc0,0xda,0x00,0x05,0x00,0x00] +0x04 0x00 0xc0 0xda 0x00 0x05 0x00 0x00 -# GFX10: ds_write_addtid_b32 v5, v1 
offset:65535 ; encoding: [0xff,0xff,0xc0,0xda,0x05,0x01,0x00,0x00] -0xff,0xff,0xc0,0xda,0x05,0x01,0x00,0x00 +# GFX10: ds_write_addtid_b32 v5 offset:65535 ; encoding: [0xff,0xff,0xc0,0xda,0x00,0x05,0x00,0x00] +0xff 0xff 0xc0 0xda 0x00 0x05 0x00 0x00 -# GFX10: ds_write_addtid_b32 v5, v1 offset:65535 gds ; encoding: [0xff,0xff,0xc2,0xda,0x05,0x01,0x00,0x00] -0xff,0xff,0xc2,0xda,0x05,0x01,0x00,0x00 +# GFX10: ds_write_addtid_b32 v5 offset:65535 gds ; encoding: [0xff,0xff,0xc2,0xda,0x00,0x05,0x00,0x00] +0xff 0xff 0xc2 0xda 0x00 0x05 0x00 0x00 -# GFX10: ds_write_addtid_b32 v5, v255 offset:65535 ; encoding: [0xff,0xff,0xc0,0xda,0x05,0xff,0x00,0x00] -0xff,0xff,0xc0,0xda,0x05,0xff,0x00,0x00 +# GFX10: ds_write_addtid_b32 v5 offset:65535 ; encoding: [0xff,0xff,0xc0,0xda,0x00,0x05,0x00,0x00] +0xff 0xff 0xc0 0xda 0x00 0x05 0x00 0x00 # GFX10: ds_write_b128 v1, v[252:255] offset:65535 ; encoding: [0xff,0xff,0x7c,0xdb,0x01,0xfc,0x00,0x00] 0xff,0xff,0x7c,0xdb,0x01,0xfc,0x00,0x00 From llvm-commits at lists.llvm.org Fri Oct 11 07:55:32 2019 From: llvm-commits at lists.llvm.org (Joseph Tremoulet via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 14:55:32 +0000 (UTC) Subject: [PATCH] D68656: Add ExceptionStream to llvm::Object::minidump In-Reply-To: References: Message-ID: JosephTremoulet updated this revision to Diff 224595. JosephTremoulet added a comment. - Remove useless comment and leftover debugging cruft Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68656/new/ https://reviews.llvm.org/D68656 Files: llvm/include/llvm/BinaryFormat/Minidump.h llvm/include/llvm/Object/Minidump.h llvm/unittests/Object/MinidumpTest.cpp -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68656.224595.patch Type: text/x-patch Size: 5415 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Fri Oct 11 07:55:32 2019 From: llvm-commits at lists.llvm.org (Joseph Tremoulet via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 14:55:32 +0000 (UTC) Subject: [PATCH] D68656: Add ExceptionStream to llvm::Object::minidump In-Reply-To: References: Message-ID: <40719aa1fd018183b54307b5d288ea65@localhost.localdomain> JosephTremoulet marked 3 inline comments as done. JosephTremoulet added inline comments. ================ Comment at: llvm/unittests/Object/MinidumpTest.cpp:756-758 + if (!ExpectedStream) { + errs() << ExpectedStream.takeError(); + } ---------------- labath wrote: > Delete. The ASSERT_THAT_EXPECTED check should already print the error message if this fails. Was that not working for you for some reason? Yeah, that does work as expected, this was leftover cruft from when I was misunderstanding the error message while writing the test, good catch. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68656/new/ https://reviews.llvm.org/D68656 From llvm-commits at lists.llvm.org Fri Oct 11 07:55:48 2019 From: llvm-commits at lists.llvm.org (Dmitry Preobrazhensky via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 14:55:48 +0000 (UTC) Subject: [PATCH] D68787: [AMDGPU][MC][GFX9][GFX10] Corrected number of src operands for ds_[read/write]_addtid_b32 In-Reply-To: References: Message-ID: <842d32a176adbf0435de9f7a0b6567e5@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rGc4995076c6bd: [AMDGPU][MC][GFX9][GFX10] Corrected number of src operands for ds_… (authored by dp). Herald added subscribers: llvm-commits, hiraditya. Herald added a project: LLVM. 
Changed prior to commit: https://reviews.llvm.org/D68787?vs=224321&id=224596#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68787/new/ https://reviews.llvm.org/D68787 Files: llvm/lib/Target/AMDGPU/DSInstructions.td llvm/test/MC/AMDGPU/ds-gfx9.s llvm/test/MC/AMDGPU/gfx10_asm_all.s llvm/test/MC/AMDGPU/gfx10_asm_err.s llvm/test/MC/Disassembler/AMDGPU/gfx10_dasm_all.txt -------------- next part -------------- A non-text attachment was scrubbed... Name: D68787.224596.patch Type: text/x-patch Size: 10565 bytes Desc: not available URL: 
From llvm-commits at lists.llvm.org Fri Oct 11 08:07:28 2019 From: llvm-commits at lists.llvm.org (David Tenty via llvm-commits) Date: Fri, 11 Oct 2019 15:07:28 -0000 Subject: [llvm] r374564 - [AIX] Use .space instead of .zero in assembly Message-ID: <20191011150728.EC35E86B76@lists.llvm.org> Author: daltenty Date: Fri Oct 11 08:07:28 2019 New Revision: 374564 URL: http://llvm.org/viewvc/llvm-project?rev=374564&view=rev Log: [AIX] Use .space instead of .zero in assembly Summary: The AIX system assembler does not understand .zero, so we should prefer emitting .space. Subscribers: nemanjai, hiraditya, kbarton, MaskRay, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68815 Added: llvm/trunk/test/CodeGen/PowerPC/aix-space.ll Modified: llvm/trunk/lib/Target/PowerPC/MCTargetDesc/PPCMCAsmInfo.cpp Modified: llvm/trunk/lib/Target/PowerPC/MCTargetDesc/PPCMCAsmInfo.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/PowerPC/MCTargetDesc/PPCMCAsmInfo.cpp?rev=374564&r1=374563&r2=374564&view=diff ============================================================================== --- llvm/trunk/lib/Target/PowerPC/MCTargetDesc/PPCMCAsmInfo.cpp (original) +++ llvm/trunk/lib/Target/PowerPC/MCTargetDesc/PPCMCAsmInfo.cpp Fri Oct 11 08:07:28 2019 @@ -86,4 +86,5 @@ void PPCXCOFFMCAsmInfo::anchor() {} PPCXCOFFMCAsmInfo::PPCXCOFFMCAsmInfo(bool Is64Bit, const Triple &T) { assert(!IsLittleEndian && "Little-endian XCOFF not supported."); CodePointerSize = CalleeSaveStackSlotSize = Is64Bit ? 
8 : 4; + ZeroDirective = "\t.space\t"; } Added: llvm/trunk/test/CodeGen/PowerPC/aix-space.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/PowerPC/aix-space.ll?rev=374564&view=auto ============================================================================== --- llvm/trunk/test/CodeGen/PowerPC/aix-space.ll (added) +++ llvm/trunk/test/CodeGen/PowerPC/aix-space.ll Fri Oct 11 08:07:28 2019 @@ -0,0 +1,17 @@ +; RUN: llc -verify-machineinstrs -O0 -mcpu=pwr7 -mtriple powerpc-ibm-aix-xcoff < %s | FileCheck %s + + at a = common global double 0.000000e+00, align 8 + +; Get some constants into the constant pool that need spacing for alignment +define void @e() { +entry: + %0 = load double, double* @a, align 8 + %mul = fmul double 1.500000e+00, %0 + store double %mul, double* @a, align 8 + %mul1 = fmul double 0x3F9C71C71C71C71C, %0 + store double %mul1, double* @a, align 8 + ret void +} + +; CHECK: .space 4 +; CHECK-NOT: .zero From llvm-commits at lists.llvm.org Fri Oct 11 08:05:08 2019 From: llvm-commits at lists.llvm.org (Sander de Smalen via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 15:05:08 +0000 (UTC) Subject: [PATCH] D67551: [AArch64][SVE] Implement sdot and udot (lane) intrinsics In-Reply-To: References: Message-ID: sdesmalen accepted this revision. sdesmalen added a comment. This revision is now accepted and ready to land. LGTM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67551/new/ https://reviews.llvm.org/D67551 From llvm-commits at lists.llvm.org Fri Oct 11 08:14:30 2019 From: llvm-commits at lists.llvm.org (Andrea Di Biagio via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 15:14:30 +0000 (UTC) Subject: [PATCH] D68871: [X86][BtVer2] Improved latency and throughput of float/vector loads and stores. Message-ID: andreadb created this revision. andreadb added reviewers: RKSimon, craig.topper, lebedev.ri. Herald added subscribers: courbet, gbedwell. andreadb added a comment. 
F10225163: exegesis-report.txt Posted the output from llvm-exegesis for all the affected instructions. This patch introduces the following changes to the btver2 scheduling model: The number of micro opcodes for YMM loads and stores is now 2 (it was incorrectly set to 1 for both aligned and misaligned loads/stores). Increased the number of AGU resource cycles for YMM loads and stores to 2cy (instead of 1cy). Removed JFPU01 and JFPX from the list of resources consumed by pure float/vector loads (no MMX). I verified with llvm-exegesis that pure XMM/YMM loads are no-pipe. They are dispatched to the FPU but not really issued on JFPU01. https://reviews.llvm.org/D68871 Files: lib/Target/X86/X86ScheduleBtVer2.td test/tools/llvm-mca/X86/BtVer2/bottleneck-hints-3.s test/tools/llvm-mca/X86/BtVer2/load-store-alias.s test/tools/llvm-mca/X86/BtVer2/memcpy-like-test.s test/tools/llvm-mca/X86/BtVer2/resources-avx1.s test/tools/llvm-mca/X86/BtVer2/resources-sse1.s test/tools/llvm-mca/X86/BtVer2/resources-sse2.s test/tools/llvm-mca/X86/BtVer2/resources-sse3.s -------------- next part -------------- A non-text attachment was scrubbed... Name: D68871.224598.patch Type: text/x-patch Size: 35971 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Fri Oct 11 08:14:31 2019 From: llvm-commits at lists.llvm.org (Digger via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 15:14:31 +0000 (UTC) Subject: [PATCH] D66969: Output XCOFF object text section header and symbol entry for program code In-Reply-To: References: Message-ID: DiggerLin marked 6 inline comments as done. DiggerLin added inline comments. ================ Comment at: llvm/lib/MC/XCOFFObjectWriter.cpp:150 + bool nameShouldBeInStringTable(const StringRef &); + void writeSymbolName(const StringRef &); ---------------- hubert.reinterpretcast wrote: > This should be a static member function or a non-member function. thanks for your suggestion. 
================ Comment at: llvm/lib/MC/XCOFFObjectWriter.cpp:361 + + W.write(CSectionRef.Address + SymbolOffset); + W.write(SectionIndex); ---------------- hubert.reinterpretcast wrote: > Maybe check for overflow here. added ================ Comment at: llvm/lib/MC/XCOFFObjectWriter.cpp:481 + writeSymbolTableEntryForCsectMemberLabel( + Sym, Csect, Text.Index, Layout.getSymbolOffset(*(Sym.MCSym))); + } ---------------- hubert.reinterpretcast wrote: > Please remove the excess parentheses. deleted ================ Comment at: llvm/test/CodeGen/PowerPC/aix-xcoff-common.ll:84 ; SYMS-NEXT: Symbol { -; SYMS-NEXT: Index: [[#Index:]] -; SYMS-NEXT: Name: a +; SYMS: Index: [[#Index:]]{{[[:space:]] *}}Name: a ; SYMS-NEXT: Value (RelocatableAddress): 0x0 ---------------- hubert.reinterpretcast wrote: > Can this be merged with the previous line? > ``` > SYMS: Symbol {{[{][[:space:]] *}}Index: [[#Index:]]{{[[:space:]] *}}Name: a{{$}} > ``` > yes ,your suggestion is more reasonable. ================ Comment at: llvm/test/CodeGen/PowerPC/aix-xcoff-lcomm.ll:36 +; OBJ: Section { +; OBJ: Index: 2 ; OBJ-NEXT: Name: .bss ---------------- hubert.reinterpretcast wrote: > Same comment re: merging with the previous line. changed ================ Comment at: llvm/test/CodeGen/PowerPC/aix-xcoff-lcomm.ll:56 ; SYMS-NEXT: Symbol { -; SYMS-NEXT: Index: [[#Index:]] -; SYMS-NEXT: Name: a +; SYMS: Index: [[#Index:]]{{[[:space:]] *}}Name: a ; SYMS-NEXT: Value (RelocatableAddress): 0x0 ---------------- hubert.reinterpretcast wrote: > Same comment re: merging with the previous line. changed as suggestion. 
Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D66969/new/ https://reviews.llvm.org/D66969 From llvm-commits at lists.llvm.org Fri Oct 11 08:14:31 2019 From: llvm-commits at lists.llvm.org (Andrea Di Biagio via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 15:14:31 +0000 (UTC) Subject: [PATCH] D68871: [X86][BtVer2] Improved latency and throughput of float/vector loads and stores. In-Reply-To: References: Message-ID: andreadb added a comment. F10225163: exegesis-report.txt Posted the output from llvm-exegesis for all the affected instructions. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68871/new/ https://reviews.llvm.org/D68871 From llvm-commits at lists.llvm.org Fri Oct 11 08:14:38 2019 From: llvm-commits at lists.llvm.org (David Tenty via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 15:14:38 +0000 (UTC) Subject: [PATCH] D68815: [AIX] Use .space instead of .zero in assembly In-Reply-To: References: Message-ID: This revision was automatically updated to reflect the committed changes. Closed by commit rG033d16cedc08: [AIX] Use .space instead of .zero in assembly (authored by daltenty). 
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68815/new/ https://reviews.llvm.org/D68815 Files: llvm/lib/Target/PowerPC/MCTargetDesc/PPCMCAsmInfo.cpp llvm/test/CodeGen/PowerPC/aix-space.ll Index: llvm/test/CodeGen/PowerPC/aix-space.ll =================================================================== --- /dev/null +++ llvm/test/CodeGen/PowerPC/aix-space.ll @@ -0,0 +1,17 @@ +; RUN: llc -verify-machineinstrs -O0 -mcpu=pwr7 -mtriple powerpc-ibm-aix-xcoff < %s | FileCheck %s + + at a = common global double 0.000000e+00, align 8 + +; Get some constants into the constant pool that need spacing for alignment +define void @e() { +entry: + %0 = load double, double* @a, align 8 + %mul = fmul double 1.500000e+00, %0 + store double %mul, double* @a, align 8 + %mul1 = fmul double 0x3F9C71C71C71C71C, %0 + store double %mul1, double* @a, align 8 + ret void +} + +; CHECK: .space 4 +; CHECK-NOT: .zero Index: llvm/lib/Target/PowerPC/MCTargetDesc/PPCMCAsmInfo.cpp =================================================================== --- llvm/lib/Target/PowerPC/MCTargetDesc/PPCMCAsmInfo.cpp +++ llvm/lib/Target/PowerPC/MCTargetDesc/PPCMCAsmInfo.cpp @@ -86,4 +86,5 @@ PPCXCOFFMCAsmInfo::PPCXCOFFMCAsmInfo(bool Is64Bit, const Triple &T) { assert(!IsLittleEndian && "Little-endian XCOFF not supported."); CodePointerSize = CalleeSaveStackSlotSize = Is64Bit ? 8 : 4; + ZeroDirective = "\t.space\t"; } -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68815.224600.patch Type: text/x-patch Size: 1221 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Fri Oct 11 08:23:56 2019 From: llvm-commits at lists.llvm.org (Digger via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 15:23:56 +0000 (UTC) Subject: [PATCH] D66969: Output XCOFF object text section header and symbol entry for program code In-Reply-To: References: Message-ID: <6aa0dd97200389a5d3a7b127ced3bbd6@localhost.localdomain> DiggerLin updated this revision to Diff 224604. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D66969/new/ https://reviews.llvm.org/D66969 Files: llvm/include/llvm/MC/MCSectionXCOFF.h llvm/lib/MC/MCXCOFFStreamer.cpp llvm/lib/MC/XCOFFObjectWriter.cpp llvm/test/CodeGen/PowerPC/aix-return55.ll llvm/test/CodeGen/PowerPC/aix-xcoff-common.ll llvm/test/CodeGen/PowerPC/aix-xcoff-lcomm.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D66969.224604.patch Type: text/x-patch Size: 20808 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Fri Oct 11 08:33:11 2019 From: llvm-commits at lists.llvm.org (Mikhail Maltsev via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 15:33:11 +0000 (UTC) Subject: [PATCH] D68342: [Analysis] Don't assume that overflow can't happen in EmitGEPOffset In-Reply-To: References: Message-ID: <91d3ab4ce5f8d17522bb036f7df38448@localhost.localdomain> miyuki added a comment. ping Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68342/new/ https://reviews.llvm.org/D68342 From llvm-commits at lists.llvm.org Fri Oct 11 08:33:12 2019 From: llvm-commits at lists.llvm.org (Alexandre Ganea via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 15:33:12 +0000 (UTC) Subject: [PATCH] D68820: win: Move Parallel.h off concrt to cross-platform code In-Reply-To: References: Message-ID: <3a14d6b15503c74303c8ecacac26d748@localhost.localdomain> aganea marked an inline comment as done. 
aganea added inline comments.

================
Comment at: llvm/include/llvm/Support/Parallel.h:124
   TaskGroup TG;
   parallel_quick_sort(Start, End, Comp, TG, llvm::Log2_64(std::distance(Start, End)) + 1);
----------------
BillyONeal wrote:
> If you get a chance to benchmark I'm curious how this compares to our std::sort(std::execution::par, ...) version :)

I ran a few AB/BA tests on LLD with my dataset. The cumulative time on all cores with ConcRT is consistently higher by about 300 ms on my 36-core Skylake (~1.9 sec for the ConcRT version, ~1.6 sec after this patch). There are only three places where we call `parallelSort` in LLD, so maybe this is not representative. But the dataset is quite big, ~22 GB of OBJs and LIBs. This is a Unity build of the Editor Release target of one of our games. I can also try with no Unity files; the dataset is then usually about an order of magnitude greater.

**Before:**
{F10225243}

**After:**
{F10225244}

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68820/new/

https://reviews.llvm.org/D68820

From llvm-commits at lists.llvm.org Fri Oct 11 08:36:56 2019
From: llvm-commits at lists.llvm.org (Florian Hahn via llvm-commits)
Date: Fri, 11 Oct 2019 15:36:56 -0000
Subject: [llvm] r374565 - [VPlan] Add moveAfter to VPRecipeBase.
Message-ID: <20191011153656.0CE4F9317C@lists.llvm.org>

Author: fhahn
Date: Fri Oct 11 08:36:55 2019
New Revision: 374565

URL: http://llvm.org/viewvc/llvm-project?rev=374565&view=rev
Log:
[VPlan] Add moveAfter to VPRecipeBase.

This patch adds a moveAfter method to VPRecipeBase, which can be used to
move elements after other elements, across VPBasicBlocks, if necessary.
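The moveAfter described in this log is implemented in the diff as a single list splice. As a rough illustration of that idea only, here is a minimal sketch that uses `std::list<int>` in place of LLVM's recipe list. The names `Block`, `moveAfter` (as a free function), and `contents` are hypothetical stand-ins and not part of the VPlan API.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <vector>

// Hypothetical stand-in for a basic block's recipe list.
using Block = std::list<int>;

// Unlink the element at `item` from `from` and relink it into `to`,
// immediately after `pos`. `from` and `to` may be the same list.
// splice() relinks the node in place, so nothing is copied.
void moveAfter(Block &from, Block::iterator item, Block &to,
               Block::iterator pos) {
  assert(pos != to.end() && "pos must refer to an existing element");
  to.splice(std::next(pos), from, item);
}

// Helper for inspecting the resulting order.
std::vector<int> contents(const Block &b) {
  return std::vector<int>(b.begin(), b.end());
}
```

Used the same way as the unit test in the patch, moving the first element of {1, 2, 3} after the second gives {2, 1, 3}, and moving an element into a second list removes it from the first.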
Reviewers: dcaballe, hsaito, rengolin, hfinkel Reviewed By: dcaballe Differential Revision: https://reviews.llvm.org/D46825 Modified: llvm/trunk/lib/Transforms/Vectorize/VPlan.cpp llvm/trunk/lib/Transforms/Vectorize/VPlan.h llvm/trunk/unittests/Transforms/Vectorize/VPlanTest.cpp Modified: llvm/trunk/lib/Transforms/Vectorize/VPlan.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Vectorize/VPlan.cpp?rev=374565&r1=374564&r2=374565&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Vectorize/VPlan.cpp (original) +++ llvm/trunk/lib/Transforms/Vectorize/VPlan.cpp Fri Oct 11 08:36:55 2019 @@ -283,6 +283,12 @@ iplist::iterator VPRecipeB return getParent()->getRecipeList().erase(getIterator()); } +void VPRecipeBase::moveAfter(VPRecipeBase *InsertPos) { + InsertPos->getParent()->getRecipeList().splice( + std::next(InsertPos->getIterator()), getParent()->getRecipeList(), + getIterator()); +} + void VPInstruction::generateInstruction(VPTransformState &State, unsigned Part) { IRBuilder<> &Builder = State.Builder; Modified: llvm/trunk/lib/Transforms/Vectorize/VPlan.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Vectorize/VPlan.h?rev=374565&r1=374564&r2=374565&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Vectorize/VPlan.h (original) +++ llvm/trunk/lib/Transforms/Vectorize/VPlan.h Fri Oct 11 08:36:55 2019 @@ -615,6 +615,10 @@ public: /// the specified recipe. void insertBefore(VPRecipeBase *InsertPos); + /// Unlink this recipe from its current VPBasicBlock and insert it into + /// the VPBasicBlock that MovePos lives in, right after MovePos. + void moveAfter(VPRecipeBase *MovePos); + /// This method unlinks 'this' from the containing basic block and deletes it. 
/// /// \returns an iterator pointing to the element after the erased one Modified: llvm/trunk/unittests/Transforms/Vectorize/VPlanTest.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/unittests/Transforms/Vectorize/VPlanTest.cpp?rev=374565&r1=374564&r2=374565&view=diff ============================================================================== --- llvm/trunk/unittests/Transforms/Vectorize/VPlanTest.cpp (original) +++ llvm/trunk/unittests/Transforms/Vectorize/VPlanTest.cpp Fri Oct 11 08:36:55 2019 @@ -59,5 +59,31 @@ TEST(VPInstructionTest, eraseFromParent) EXPECT_TRUE(VPBB1.empty()); } +TEST(VPInstructionTest, moveAfter) { + VPInstruction *I1 = new VPInstruction(0, {}); + VPInstruction *I2 = new VPInstruction(1, {}); + VPInstruction *I3 = new VPInstruction(2, {}); + + VPBasicBlock VPBB1; + VPBB1.appendRecipe(I1); + VPBB1.appendRecipe(I2); + VPBB1.appendRecipe(I3); + + I1->moveAfter(I2); + + CHECK_ITERATOR(VPBB1, I2, I1, I3); + + VPInstruction *I4 = new VPInstruction(4, {}); + VPInstruction *I5 = new VPInstruction(5, {}); + VPBasicBlock VPBB2; + VPBB2.appendRecipe(I4); + VPBB2.appendRecipe(I5); + + I3->moveAfter(I4); + + CHECK_ITERATOR(VPBB1, I2, I1); + CHECK_ITERATOR(VPBB2, I4, I3, I5); +} + } // namespace } // namespace llvm From llvm-commits at lists.llvm.org Fri Oct 11 08:42:24 2019 From: llvm-commits at lists.llvm.org (Florian Hahn via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 15:42:24 +0000 (UTC) Subject: [PATCH] D46825: [VPlan] Add moveAfter to VPRecipeBase. In-Reply-To: References: Message-ID: <662d5ba63922be5a613668be591a0c96@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rG39d4c9fd56e3: [VPlan] Add moveAfter to VPRecipeBase. (authored by fhahn). Herald added subscribers: psnobl, hiraditya. Herald added a project: LLVM. 
Changed prior to commit: https://reviews.llvm.org/D46825?vs=146826&id=224605#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D46825/new/ https://reviews.llvm.org/D46825 Files: llvm/lib/Transforms/Vectorize/VPlan.cpp llvm/lib/Transforms/Vectorize/VPlan.h llvm/unittests/Transforms/Vectorize/VPlanTest.cpp Index: llvm/unittests/Transforms/Vectorize/VPlanTest.cpp =================================================================== --- llvm/unittests/Transforms/Vectorize/VPlanTest.cpp +++ llvm/unittests/Transforms/Vectorize/VPlanTest.cpp @@ -59,5 +59,31 @@ EXPECT_TRUE(VPBB1.empty()); } +TEST(VPInstructionTest, moveAfter) { + VPInstruction *I1 = new VPInstruction(0, {}); + VPInstruction *I2 = new VPInstruction(1, {}); + VPInstruction *I3 = new VPInstruction(2, {}); + + VPBasicBlock VPBB1; + VPBB1.appendRecipe(I1); + VPBB1.appendRecipe(I2); + VPBB1.appendRecipe(I3); + + I1->moveAfter(I2); + + CHECK_ITERATOR(VPBB1, I2, I1, I3); + + VPInstruction *I4 = new VPInstruction(4, {}); + VPInstruction *I5 = new VPInstruction(5, {}); + VPBasicBlock VPBB2; + VPBB2.appendRecipe(I4); + VPBB2.appendRecipe(I5); + + I3->moveAfter(I4); + + CHECK_ITERATOR(VPBB1, I2, I1); + CHECK_ITERATOR(VPBB2, I4, I3, I5); +} + } // namespace } // namespace llvm Index: llvm/lib/Transforms/Vectorize/VPlan.h =================================================================== --- llvm/lib/Transforms/Vectorize/VPlan.h +++ llvm/lib/Transforms/Vectorize/VPlan.h @@ -615,6 +615,10 @@ /// the specified recipe. void insertBefore(VPRecipeBase *InsertPos); + /// Unlink this recipe from its current VPBasicBlock and insert it into + /// the VPBasicBlock that MovePos lives in, right after MovePos. + void moveAfter(VPRecipeBase *MovePos); + /// This method unlinks 'this' from the containing basic block and deletes it. 
/// /// \returns an iterator pointing to the element after the erased one Index: llvm/lib/Transforms/Vectorize/VPlan.cpp =================================================================== --- llvm/lib/Transforms/Vectorize/VPlan.cpp +++ llvm/lib/Transforms/Vectorize/VPlan.cpp @@ -283,6 +283,12 @@ return getParent()->getRecipeList().erase(getIterator()); } +void VPRecipeBase::moveAfter(VPRecipeBase *InsertPos) { + InsertPos->getParent()->getRecipeList().splice( + std::next(InsertPos->getIterator()), getParent()->getRecipeList(), + getIterator()); +} + void VPInstruction::generateInstruction(VPTransformState &State, unsigned Part) { IRBuilder<> &Builder = State.Builder; -------------- next part -------------- A non-text attachment was scrubbed... Name: D46825.224605.patch Type: text/x-patch Size: 2270 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Fri Oct 11 08:53:41 2019 From: llvm-commits at lists.llvm.org (Kerry McLaughlin via llvm-commits) Date: Fri, 11 Oct 2019 15:53:41 -0000 Subject: [llvm] r374566 - [AArch64][SVE] Implement sdot and udot (lane) intrinsics Message-ID: <20191011155341.6F18F83985@lists.llvm.org> Author: kmclaughlin Date: Fri Oct 11 08:53:41 2019 New Revision: 374566 URL: http://llvm.org/viewvc/llvm-project?rev=374566&view=rev Log: [AArch64][SVE] Implement sdot and udot (lane) intrinsics Summary: Implements the following arithmetic intrinsics: - int_aarch64_sve_sdot - int_aarch64_sve_sdot_lane - int_aarch64_sve_udot - int_aarch64_sve_udot_lane This patch includes tests for the Subdivide4Argument type added by D67549 Reviewers: sdesmalen, SjoerdMeijer, greened, rengolin, rovka Reviewed By: sdesmalen Subscribers: tschuett, kristof.beyls, rkruppe, psnobl, cfe-commits, llvm-commits Differential Revision: https://reviews.llvm.org/D67551 Modified: llvm/trunk/include/llvm/IR/IntrinsicsAArch64.td llvm/trunk/lib/Target/AArch64/AArch64InstrFormats.td llvm/trunk/lib/Target/AArch64/AArch64SVEInstrInfo.td 
llvm/trunk/lib/Target/AArch64/SVEInstrFormats.td llvm/trunk/test/CodeGen/AArch64/sve-intrinsics-int-arith.ll Modified: llvm/trunk/include/llvm/IR/IntrinsicsAArch64.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/IR/IntrinsicsAArch64.td?rev=374566&r1=374565&r2=374566&view=diff ============================================================================== --- llvm/trunk/include/llvm/IR/IntrinsicsAArch64.td (original) +++ llvm/trunk/include/llvm/IR/IntrinsicsAArch64.td Fri Oct 11 08:53:41 2019 @@ -780,6 +780,21 @@ let TargetPrefix = "aarch64" in { // Al [llvm_anyvector_ty], [IntrNoMem]>; + class AdvSIMD_SVE_DOT_Intrinsic + : Intrinsic<[llvm_anyvector_ty], + [LLVMMatchType<0>, + LLVMSubdivide4VectorType<0>, + LLVMSubdivide4VectorType<0>], + [IntrNoMem]>; + + class AdvSIMD_SVE_DOT_Indexed_Intrinsic + : Intrinsic<[llvm_anyvector_ty], + [LLVMMatchType<0>, + LLVMSubdivide4VectorType<0>, + LLVMSubdivide4VectorType<0>, + llvm_i32_ty], + [IntrNoMem]>; + // This class of intrinsics are not intended to be useful within LLVM IR but // are instead here to support some of the more regid parts of the ACLE. 
class Builtin_SVCVT @@ -799,6 +814,12 @@ let TargetPrefix = "aarch64" in { // Al def int_aarch64_sve_abs : AdvSIMD_Merged1VectorArg_Intrinsic; def int_aarch64_sve_neg : AdvSIMD_Merged1VectorArg_Intrinsic; +def int_aarch64_sve_sdot : AdvSIMD_SVE_DOT_Intrinsic; +def int_aarch64_sve_sdot_lane : AdvSIMD_SVE_DOT_Indexed_Intrinsic; + +def int_aarch64_sve_udot : AdvSIMD_SVE_DOT_Intrinsic; +def int_aarch64_sve_udot_lane : AdvSIMD_SVE_DOT_Indexed_Intrinsic; + // // Counting bits // Modified: llvm/trunk/lib/Target/AArch64/AArch64InstrFormats.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AArch64/AArch64InstrFormats.td?rev=374566&r1=374565&r2=374566&view=diff ============================================================================== --- llvm/trunk/lib/Target/AArch64/AArch64InstrFormats.td (original) +++ llvm/trunk/lib/Target/AArch64/AArch64InstrFormats.td Fri Oct 11 08:53:41 2019 @@ -1011,8 +1011,8 @@ class AsmVectorIndex - : Operand, ImmLeaf { +class AsmVectorIndexOpnd + : Operand, ImmLeaf { let ParserMatchClass = mc; let PrintMethod = "printVectorIndex"; } @@ -1023,11 +1023,17 @@ def VectorIndexHOperand : AsmVectorIndex def VectorIndexSOperand : AsmVectorIndex<0, 3>; def VectorIndexDOperand : AsmVectorIndex<0, 1>; -def VectorIndex1 : AsmVectorIndexOpnd; -def VectorIndexB : AsmVectorIndexOpnd; -def VectorIndexH : AsmVectorIndexOpnd; -def VectorIndexS : AsmVectorIndexOpnd; -def VectorIndexD : AsmVectorIndexOpnd; +def VectorIndex1 : AsmVectorIndexOpnd; +def VectorIndexB : AsmVectorIndexOpnd; +def VectorIndexH : AsmVectorIndexOpnd; +def VectorIndexS : AsmVectorIndexOpnd; +def VectorIndexD : AsmVectorIndexOpnd; + +def VectorIndex132b : AsmVectorIndexOpnd; +def VectorIndexB32b : AsmVectorIndexOpnd; +def VectorIndexH32b : AsmVectorIndexOpnd; +def VectorIndexS32b : AsmVectorIndexOpnd; +def VectorIndexD32b : AsmVectorIndexOpnd; def SVEVectorIndexExtDupBOperand : AsmVectorIndex<0, 63, "SVE">; def SVEVectorIndexExtDupHOperand : AsmVectorIndex<0, 31, "SVE">; @@ 
-1036,15 +1042,15 @@ def SVEVectorIndexExtDupDOperand : AsmVe def SVEVectorIndexExtDupQOperand : AsmVectorIndex<0, 3, "SVE">; def sve_elm_idx_extdup_b - : AsmVectorIndexOpnd; + : AsmVectorIndexOpnd; def sve_elm_idx_extdup_h - : AsmVectorIndexOpnd; + : AsmVectorIndexOpnd; def sve_elm_idx_extdup_s - : AsmVectorIndexOpnd; + : AsmVectorIndexOpnd; def sve_elm_idx_extdup_d - : AsmVectorIndexOpnd; + : AsmVectorIndexOpnd; def sve_elm_idx_extdup_q - : AsmVectorIndexOpnd; + : AsmVectorIndexOpnd; // 8-bit immediate for AdvSIMD where 64-bit values of the form: // aaaaaaaa bbbbbbbb cccccccc dddddddd eeeeeeee ffffffff gggggggg hhhhhhhh Modified: llvm/trunk/lib/Target/AArch64/AArch64SVEInstrInfo.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AArch64/AArch64SVEInstrInfo.td?rev=374566&r1=374565&r2=374566&view=diff ============================================================================== --- llvm/trunk/lib/Target/AArch64/AArch64SVEInstrInfo.td (original) +++ llvm/trunk/lib/Target/AArch64/AArch64SVEInstrInfo.td Fri Oct 11 08:53:41 2019 @@ -82,11 +82,11 @@ let Predicates = [HasSVE] in { defm SDIVR_ZPmZ : sve_int_bin_pred_arit_2_div<0b110, "sdivr">; defm UDIVR_ZPmZ : sve_int_bin_pred_arit_2_div<0b111, "udivr">; - defm SDOT_ZZZ : sve_intx_dot<0b0, "sdot">; - defm UDOT_ZZZ : sve_intx_dot<0b1, "udot">; + defm SDOT_ZZZ : sve_intx_dot<0b0, "sdot", int_aarch64_sve_sdot>; + defm UDOT_ZZZ : sve_intx_dot<0b1, "udot", int_aarch64_sve_udot>; - defm SDOT_ZZZI : sve_intx_dot_by_indexed_elem<0b0, "sdot">; - defm UDOT_ZZZI : sve_intx_dot_by_indexed_elem<0b1, "udot">; + defm SDOT_ZZZI : sve_intx_dot_by_indexed_elem<0b0, "sdot", int_aarch64_sve_sdot_lane>; + defm UDOT_ZZZI : sve_intx_dot_by_indexed_elem<0b1, "udot", int_aarch64_sve_udot_lane>; defm SXTB_ZPmZ : sve_int_un_pred_arit_0_h<0b000, "sxtb">; defm UXTB_ZPmZ : sve_int_un_pred_arit_0_h<0b001, "uxtb">; Modified: llvm/trunk/lib/Target/AArch64/SVEInstrFormats.td URL: 
http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AArch64/SVEInstrFormats.td?rev=374566&r1=374565&r2=374566&view=diff ============================================================================== --- llvm/trunk/lib/Target/AArch64/SVEInstrFormats.td (original) +++ llvm/trunk/lib/Target/AArch64/SVEInstrFormats.td Fri Oct 11 08:53:41 2019 @@ -2024,12 +2024,14 @@ class sve_intx_dot { +multiclass sve_intx_dot { def _S : sve_intx_dot<0b0, opc, asm, ZPR32, ZPR8>; def _D : sve_intx_dot<0b1, opc, asm, ZPR64, ZPR16>; + + def : SVE_3_Op_Pat(NAME # _S)>; + def : SVE_3_Op_Pat(NAME # _D)>; } //===----------------------------------------------------------------------===// @@ -2054,22 +2056,27 @@ class sve_intx_dot_by_indexed_elem { - def _S : sve_intx_dot_by_indexed_elem<0b0, opc, asm, ZPR32, ZPR8, ZPR3b8, VectorIndexS> { +multiclass sve_intx_dot_by_indexed_elem { + def _S : sve_intx_dot_by_indexed_elem<0b0, opc, asm, ZPR32, ZPR8, ZPR3b8, VectorIndexS32b> { bits<2> iop; bits<3> Zm; let Inst{20-19} = iop; let Inst{18-16} = Zm; } - def _D : sve_intx_dot_by_indexed_elem<0b1, opc, asm, ZPR64, ZPR16, ZPR4b16, VectorIndexD> { + def _D : sve_intx_dot_by_indexed_elem<0b1, opc, asm, ZPR64, ZPR16, ZPR4b16, VectorIndexD32b> { bits<1> iop; bits<4> Zm; let Inst{20} = iop; let Inst{19-16} = Zm; } + + def : Pat<(nxv4i32 (op nxv4i32:$Op1, nxv16i8:$Op2, nxv16i8:$Op3, (i32 VectorIndexS32b:$idx))), + (!cast(NAME # _S) $Op1, $Op2, $Op3, VectorIndexS32b:$idx)>; + def : Pat<(nxv2i64 (op nxv2i64:$Op1, nxv8i16:$Op2, nxv8i16:$Op3, (i32 VectorIndexD32b:$idx))), + (!cast(NAME # _D) $Op1, $Op2, $Op3, VectorIndexD32b:$idx)>; } //===----------------------------------------------------------------------===// Modified: llvm/trunk/test/CodeGen/AArch64/sve-intrinsics-int-arith.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AArch64/sve-intrinsics-int-arith.ll?rev=374566&r1=374565&r2=374566&view=diff ============================================================================== --- 
llvm/trunk/test/CodeGen/AArch64/sve-intrinsics-int-arith.ll (original) +++ llvm/trunk/test/CodeGen/AArch64/sve-intrinsics-int-arith.ll Fri Oct 11 08:53:41 2019 @@ -88,6 +88,87 @@ define @neg_i64( %out } +; SDOT + +define @sdot_i32( %a, %b, %c) { +; CHECK-LABEL: sdot_i32: +; CHECK: sdot z0.s, z1.b, z2.b +; CHECK-NEXT: ret + %out = call @llvm.aarch64.sve.sdot.nxv4i32( %a, + %b, + %c) + ret %out +} + +define @sdot_i64( %a, %b, %c) { +; CHECK-LABEL: sdot_i64: +; CHECK: sdot z0.d, z1.h, z2.h +; CHECK-NEXT: ret + %out = call @llvm.aarch64.sve.sdot.nxv2i64( %a, + %b, + %c) + ret %out +} + +; SDOT (Indexed) + +define @sdot_lane_i32( %a, %b, %c) { +; CHECK-LABEL: sdot_lane_i32: +; CHECK: sdot z0.s, z1.b, z2.b[2] +; CHECK-NEXT: ret + %out = call @llvm.aarch64.sve.sdot.lane.nxv4i32( %a, + %b, + %c, + i32 2) + ret %out +} + +define @sdot_lane_i64( %a, %b, %c) { +; CHECK-LABEL: sdot_lane_i64: +; CHECK: sdot z0.d, z1.h, z2.h[1] +; CHECK-NEXT: ret + %out = call @llvm.aarch64.sve.sdot.lane.nxv2i64( %a, + %b, + %c, + i32 1) + ret %out +} + +; UDOT + +define @udot_i32( %a, %b, %c) { +; CHECK-LABEL: udot_i32: +; CHECK: udot z0.s, z1.b, z2.b +; CHECK-NEXT: ret + %out = call @llvm.aarch64.sve.udot.nxv4i32( %a, + %b, + %c) + ret %out +} + +define @udot_i64( %a, %b, %c) { +; CHECK-LABEL: udot_i64: +; CHECK: udot z0.d, z1.h, z2.h +; CHECK-NEXT: ret + %out = call @llvm.aarch64.sve.udot.nxv2i64( %a, + %b, + %c) + ret %out +} + +; UDOT (Indexed) + +define @udot_lane_i32( %a, %b, %c) { +; CHECK-LABEL: udot_lane_i32: +; CHECK: udot z0.s, z1.b, z2.b[2] +; CHECK-NEXT: ret + %out = call @llvm.aarch64.sve.udot.lane.nxv4i32( %a, + %b, + %c, + i32 2) + ret %out +} + declare @llvm.aarch64.sve.abs.nxv16i8(, , ) declare @llvm.aarch64.sve.abs.nxv8i16(, , ) declare @llvm.aarch64.sve.abs.nxv4i32(, , ) @@ -97,3 +178,15 @@ declare @llvm.aarch64 declare @llvm.aarch64.sve.neg.nxv8i16(, , ) declare @llvm.aarch64.sve.neg.nxv4i32(, , ) declare @llvm.aarch64.sve.neg.nxv2i64(, , ) + +declare 
@llvm.aarch64.sve.sdot.nxv4i32(, , )
+declare @llvm.aarch64.sve.sdot.nxv2i64(, , )
+
+declare @llvm.aarch64.sve.sdot.lane.nxv4i32(, , , i32)
+declare @llvm.aarch64.sve.sdot.lane.nxv2i64(, , , i32)
+
+declare @llvm.aarch64.sve.udot.nxv4i32(, , )
+declare @llvm.aarch64.sve.udot.nxv2i64(, , )
+
+declare @llvm.aarch64.sve.udot.lane.nxv4i32(, , , i32)
+declare @llvm.aarch64.sve.udot.lane.nxv2i64(, , , i32)

From llvm-commits at lists.llvm.org Fri Oct 11 08:51:33 2019
From: llvm-commits at lists.llvm.org (Sanjay Patel via Phabricator via llvm-commits)
Date: Fri, 11 Oct 2019 15:51:33 +0000 (UTC)
Subject: [PATCH] D67841: [SLP] avoid reduction transform on patterns that the backend can load-combine
In-Reply-To:
References:
Message-ID:

spatel added a comment.

Ping.

There seems to be general agreement that this is a temporary (stopgap) solution until we can do limited load combining in IR. So it's a question of whether we're ok with a cost model hack to overcome the motivating bugs while we figure out how to add a new pass.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D67841/new/

https://reviews.llvm.org/D67841

From llvm-commits at lists.llvm.org Fri Oct 11 09:01:04 2019
From: llvm-commits at lists.llvm.org (Hubert Tong via Phabricator via llvm-commits)
Date: Fri, 11 Oct 2019 16:01:04 +0000 (UTC)
Subject: [PATCH] D68341: [AIX] TOC pseudo expansion for 64bit large + 64bit small + 32bit large modes
In-Reply-To:
References:
Message-ID:

hubert.reinterpretcast added inline comments.

================
Comment at: llvm/test/CodeGen/PowerPC/lower-globaladdr32-aix-asm.ll:23
+; LARGE: lwz [[REG2:[0-9]+]], LC0@l([[REG1]])
+; LARGE: lwz [[REG3:[0-9]+]], 0([[REG2]])
+; LARGE: addis [[REG4:[0-9]+]], LC1@u(2)
----------------
That the ordering and interleaving of the logical operations involved differ between the various cases seems to indicate that the test is already too complicated.
Please reduce the test to use a single memory operand (e.g., store a constant or return the value read). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68341/new/ https://reviews.llvm.org/D68341 From llvm-commits at lists.llvm.org Fri Oct 11 09:01:04 2019 From: llvm-commits at lists.llvm.org (Joseph Tremoulet via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 16:01:04 +0000 (UTC) Subject: [PATCH] D68657: Update MinidumpYAML to use minidump::Exception for exception stream In-Reply-To: References: Message-ID: <954e37453022fc2634a2003960e4a647@localhost.localdomain> JosephTremoulet updated this revision to Diff 224607. JosephTremoulet added a comment. Address review feedback - Add Exception stream to minidump-basic.yaml to test obj2yaml direction - Reorder ExceptionStream definition and constructors - Use the mapOptional helper - Replace "Number Parameters" with "Number of Parameters" in YAML - Stop using mapping traits validation for number of parameters, update test to ensure correctly de-yamlizing an exception record with an out-of-bounds number of parameters Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68657/new/ https://reviews.llvm.org/D68657 Files: lldb/packages/Python/lldbsuite/test/functionalities/postmortem/minidump-new/linux-x86_64.yaml llvm/include/llvm/ObjectYAML/MinidumpYAML.h llvm/lib/ObjectYAML/MinidumpEmitter.cpp llvm/lib/ObjectYAML/MinidumpYAML.cpp llvm/test/tools/obj2yaml/basic-minidump.yaml llvm/unittests/ObjectYAML/MinidumpYAMLTest.cpp -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68657.224607.patch Type: text/x-patch Size: 19333 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Fri Oct 11 09:01:05 2019 From: llvm-commits at lists.llvm.org (Joseph Tremoulet via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 16:01:05 +0000 (UTC) Subject: [PATCH] D68657: Update MinidumpYAML to use minidump::Exception for exception stream In-Reply-To: References: Message-ID: JosephTremoulet marked 4 inline comments as done. JosephTremoulet added a comment. Added Exception stream to minidump-basic.yaml as suggested. ================ Comment at: llvm/lib/ObjectYAML/MinidumpYAML.cpp:394 + mapOptionalHex(IO, "Exception Address", Exception.ExceptionAddress, 0); + IO.mapOptional("Number Parameters", Exception.NumberParameters, + support::ulittle32_t(0u)); ---------------- labath wrote: > This file has a helper function for this (`mapOptional(IO, "name", value, 0)`. I'd consider changing the field name to "Number of Parameters" even though it does not match the field name, as it reads weird without that. I'm not sure why the microsoft naming is inconsistent here -- most of the other minidump structs have "of" in their name already (BaseOfImage, SizeOfImage, etc.), but at least we can be consistent. Updated to use the helper, and changed the name in the YAML to "Number of Parameters". Let me know if it's important to you to also change the name of the field in the llvm::minidump::Exception type to `NumberOfParameters` -- I wasn't sure if you were suggesting that, and regardless my preference would be to leave that as-is to match breakpad aside from casing, as otherwise it's hard to know where to stop (e.g. change "ExceptionInformation" to "Parameters" to match "NumberOfParameters" and the YAML? Reconcile the several different ways that alignment padding fields are named? etc.) 
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68657/new/ https://reviews.llvm.org/D68657 From llvm-commits at lists.llvm.org Fri Oct 11 09:01:13 2019 From: llvm-commits at lists.llvm.org (Kerry McLaughlin via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 16:01:13 +0000 (UTC) Subject: [PATCH] D67551: [AArch64][SVE] Implement sdot and udot (lane) intrinsics In-Reply-To: References: Message-ID: <2c446430fd218eb9ef0793dfe4d75c32@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rGee0a0a34646f: [AArch64][SVE] Implement sdot and udot (lane) intrinsics (authored by kmclaughlin). Herald added a subscriber: hiraditya. Herald added a project: LLVM. Changed prior to commit: https://reviews.llvm.org/D67551?vs=220096&id=224608#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67551/new/ https://reviews.llvm.org/D67551 Files: llvm/include/llvm/IR/IntrinsicsAArch64.td llvm/lib/Target/AArch64/AArch64InstrFormats.td llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td llvm/lib/Target/AArch64/SVEInstrFormats.td llvm/test/CodeGen/AArch64/sve-intrinsics-int-arith.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D67551.224608.patch Type: text/x-patch Size: 14182 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Fri Oct 11 09:10:23 2019 From: llvm-commits at lists.llvm.org (Sanjay Patel via llvm-commits) Date: Fri, 11 Oct 2019 16:10:23 -0000 Subject: [llvm] r374568 - [AArch64] add tests for (v)select-of-constants; NFC Message-ID: <20191011161023.7F7AD83ED9@lists.llvm.org> Author: spatel Date: Fri Oct 11 09:10:23 2019 New Revision: 374568 URL: http://llvm.org/viewvc/llvm-project?rev=374568&view=rev Log: [AArch64] add tests for (v)select-of-constants; NFC These are copied from existing test files in x86/PPC. 
Added: llvm/trunk/test/CodeGen/AArch64/select_const.ll llvm/trunk/test/CodeGen/AArch64/vselect-constants.ll Added: llvm/trunk/test/CodeGen/AArch64/select_const.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AArch64/select_const.ll?rev=374568&view=auto ============================================================================== --- llvm/trunk/test/CodeGen/AArch64/select_const.ll (added) +++ llvm/trunk/test/CodeGen/AArch64/select_const.ll Fri Oct 11 09:10:23 2019 @@ -0,0 +1,625 @@ +; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py +; RUN: llc < %s -mtriple=aarch64-- | FileCheck %s + +; Select of constants: control flow / conditional moves can always be replaced by logic+math (but may not be worth it?). +; Test the zeroext/signext variants of each pattern to see if that makes a difference. + +; select Cond, 0, 1 --> zext (!Cond) + +define i32 @select_0_or_1(i1 %cond) { +; CHECK-LABEL: select_0_or_1: +; CHECK: // %bb.0: +; CHECK-NEXT: mvn w8, w0 +; CHECK-NEXT: and w0, w8, #0x1 +; CHECK-NEXT: ret + %sel = select i1 %cond, i32 0, i32 1 + ret i32 %sel +} + +define i32 @select_0_or_1_zeroext(i1 zeroext %cond) { +; CHECK-LABEL: select_0_or_1_zeroext: +; CHECK: // %bb.0: +; CHECK-NEXT: eor w0, w0, #0x1 +; CHECK-NEXT: ret + %sel = select i1 %cond, i32 0, i32 1 + ret i32 %sel +} + +define i32 @select_0_or_1_signext(i1 signext %cond) { +; CHECK-LABEL: select_0_or_1_signext: +; CHECK: // %bb.0: +; CHECK-NEXT: mvn w8, w0 +; CHECK-NEXT: and w0, w8, #0x1 +; CHECK-NEXT: ret + %sel = select i1 %cond, i32 0, i32 1 + ret i32 %sel +} + +; select Cond, 1, 0 --> zext (Cond) + +define i32 @select_1_or_0(i1 %cond) { +; CHECK-LABEL: select_1_or_0: +; CHECK: // %bb.0: +; CHECK-NEXT: and w0, w0, #0x1 +; CHECK-NEXT: ret + %sel = select i1 %cond, i32 1, i32 0 + ret i32 %sel +} + +define i32 @select_1_or_0_zeroext(i1 zeroext %cond) { +; CHECK-LABEL: select_1_or_0_zeroext: +; CHECK: // %bb.0: +; CHECK-NEXT: ret + %sel = select i1 %cond, i32 1, i32 
0 + ret i32 %sel +} + +define i32 @select_1_or_0_signext(i1 signext %cond) { +; CHECK-LABEL: select_1_or_0_signext: +; CHECK: // %bb.0: +; CHECK-NEXT: and w0, w0, #0x1 +; CHECK-NEXT: ret + %sel = select i1 %cond, i32 1, i32 0 + ret i32 %sel +} + +; select Cond, 0, -1 --> sext (!Cond) + +define i32 @select_0_or_neg1(i1 %cond) { +; CHECK-LABEL: select_0_or_neg1: +; CHECK: // %bb.0: +; CHECK-NEXT: mvn w8, w0 +; CHECK-NEXT: sbfx w0, w8, #0, #1 +; CHECK-NEXT: ret + %sel = select i1 %cond, i32 0, i32 -1 + ret i32 %sel +} + +define i32 @select_0_or_neg1_zeroext(i1 zeroext %cond) { +; CHECK-LABEL: select_0_or_neg1_zeroext: +; CHECK: // %bb.0: +; CHECK-NEXT: mvn w8, w0 +; CHECK-NEXT: sbfx w0, w8, #0, #1 +; CHECK-NEXT: ret + %sel = select i1 %cond, i32 0, i32 -1 + ret i32 %sel +} + +define i32 @select_0_or_neg1_signext(i1 signext %cond) { +; CHECK-LABEL: select_0_or_neg1_signext: +; CHECK: // %bb.0: +; CHECK-NEXT: mvn w0, w0 +; CHECK-NEXT: ret + %sel = select i1 %cond, i32 0, i32 -1 + ret i32 %sel +} + +; select Cond, -1, 0 --> sext (Cond) + +define i32 @select_neg1_or_0(i1 %cond) { +; CHECK-LABEL: select_neg1_or_0: +; CHECK: // %bb.0: +; CHECK-NEXT: sbfx w0, w0, #0, #1 +; CHECK-NEXT: ret + %sel = select i1 %cond, i32 -1, i32 0 + ret i32 %sel +} + +define i32 @select_neg1_or_0_zeroext(i1 zeroext %cond) { +; CHECK-LABEL: select_neg1_or_0_zeroext: +; CHECK: // %bb.0: +; CHECK-NEXT: sbfx w0, w0, #0, #1 +; CHECK-NEXT: ret + %sel = select i1 %cond, i32 -1, i32 0 + ret i32 %sel +} + +define i32 @select_neg1_or_0_signext(i1 signext %cond) { +; CHECK-LABEL: select_neg1_or_0_signext: +; CHECK: // %bb.0: +; CHECK-NEXT: ret + %sel = select i1 %cond, i32 -1, i32 0 + ret i32 %sel +} + +; select Cond, C+1, C --> add (zext Cond), C + +define i32 @select_Cplus1_C(i1 %cond) { +; CHECK-LABEL: select_Cplus1_C: +; CHECK: // %bb.0: +; CHECK-NEXT: tst w0, #0x1 +; CHECK-NEXT: mov w8, #41 +; CHECK-NEXT: cinc w0, w8, ne +; CHECK-NEXT: ret + %sel = select i1 %cond, i32 42, i32 41 + ret i32 %sel +} + 
+define i32 @select_Cplus1_C_zeroext(i1 zeroext %cond) { +; CHECK-LABEL: select_Cplus1_C_zeroext: +; CHECK: // %bb.0: +; CHECK-NEXT: cmp w0, #0 // =0 +; CHECK-NEXT: mov w8, #41 +; CHECK-NEXT: cinc w0, w8, ne +; CHECK-NEXT: ret + %sel = select i1 %cond, i32 42, i32 41 + ret i32 %sel +} + +define i32 @select_Cplus1_C_signext(i1 signext %cond) { +; CHECK-LABEL: select_Cplus1_C_signext: +; CHECK: // %bb.0: +; CHECK-NEXT: tst w0, #0x1 +; CHECK-NEXT: mov w8, #41 +; CHECK-NEXT: cinc w0, w8, ne +; CHECK-NEXT: ret + %sel = select i1 %cond, i32 42, i32 41 + ret i32 %sel +} + +; select Cond, C, C+1 --> add (sext Cond), C + +define i32 @select_C_Cplus1(i1 %cond) { +; CHECK-LABEL: select_C_Cplus1: +; CHECK: // %bb.0: +; CHECK-NEXT: tst w0, #0x1 +; CHECK-NEXT: mov w8, #41 +; CHECK-NEXT: cinc w0, w8, eq +; CHECK-NEXT: ret + %sel = select i1 %cond, i32 41, i32 42 + ret i32 %sel +} + +define i32 @select_C_Cplus1_zeroext(i1 zeroext %cond) { +; CHECK-LABEL: select_C_Cplus1_zeroext: +; CHECK: // %bb.0: +; CHECK-NEXT: cmp w0, #0 // =0 +; CHECK-NEXT: mov w8, #41 +; CHECK-NEXT: cinc w0, w8, eq +; CHECK-NEXT: ret + %sel = select i1 %cond, i32 41, i32 42 + ret i32 %sel +} + +define i32 @select_C_Cplus1_signext(i1 signext %cond) { +; CHECK-LABEL: select_C_Cplus1_signext: +; CHECK: // %bb.0: +; CHECK-NEXT: tst w0, #0x1 +; CHECK-NEXT: mov w8, #41 +; CHECK-NEXT: cinc w0, w8, eq +; CHECK-NEXT: ret + %sel = select i1 %cond, i32 41, i32 42 + ret i32 %sel +} + +; In general, select of 2 constants could be: +; select Cond, C1, C2 --> add (mul (zext Cond), C1-C2), C2 --> add (and (sext Cond), C1-C2), C2 + +define i32 @select_C1_C2(i1 %cond) { +; CHECK-LABEL: select_C1_C2: +; CHECK: // %bb.0: +; CHECK-NEXT: tst w0, #0x1 +; CHECK-NEXT: mov w8, #42 +; CHECK-NEXT: mov w9, #421 +; CHECK-NEXT: csel w0, w9, w8, ne +; CHECK-NEXT: ret + %sel = select i1 %cond, i32 421, i32 42 + ret i32 %sel +} + +define i32 @select_C1_C2_zeroext(i1 zeroext %cond) { +; CHECK-LABEL: select_C1_C2_zeroext: +; CHECK: // %bb.0: +; 
CHECK-NEXT: cmp w0, #0 // =0 +; CHECK-NEXT: mov w8, #42 +; CHECK-NEXT: mov w9, #421 +; CHECK-NEXT: csel w0, w9, w8, ne +; CHECK-NEXT: ret + %sel = select i1 %cond, i32 421, i32 42 + ret i32 %sel +} + +define i32 @select_C1_C2_signext(i1 signext %cond) { +; CHECK-LABEL: select_C1_C2_signext: +; CHECK: // %bb.0: +; CHECK-NEXT: tst w0, #0x1 +; CHECK-NEXT: mov w8, #42 +; CHECK-NEXT: mov w9, #421 +; CHECK-NEXT: csel w0, w9, w8, ne +; CHECK-NEXT: ret + %sel = select i1 %cond, i32 421, i32 42 + ret i32 %sel +} + +; A binary operator with constant after the select should always get folded into the select. + +define i8 @sel_constants_add_constant(i1 %cond) { +; CHECK-LABEL: sel_constants_add_constant: +; CHECK: // %bb.0: +; CHECK-NEXT: tst w0, #0x1 +; CHECK-NEXT: mov w8, #28 +; CHECK-NEXT: csinc w0, w8, wzr, eq +; CHECK-NEXT: ret + %sel = select i1 %cond, i8 -4, i8 23 + %bo = add i8 %sel, 5 + ret i8 %bo +} + +define i8 @sel_constants_sub_constant(i1 %cond) { +; CHECK-LABEL: sel_constants_sub_constant: +; CHECK: // %bb.0: +; CHECK-NEXT: tst w0, #0x1 +; CHECK-NEXT: mov w8, #18 +; CHECK-NEXT: mov w9, #-9 +; CHECK-NEXT: csel w0, w9, w8, ne +; CHECK-NEXT: ret + %sel = select i1 %cond, i8 -4, i8 23 + %bo = sub i8 %sel, 5 + ret i8 %bo +} + +define i8 @sel_constants_sub_constant_sel_constants(i1 %cond) { +; CHECK-LABEL: sel_constants_sub_constant_sel_constants: +; CHECK: // %bb.0: +; CHECK-NEXT: tst w0, #0x1 +; CHECK-NEXT: mov w8, #2 +; CHECK-NEXT: mov w9, #9 +; CHECK-NEXT: csel w0, w9, w8, ne +; CHECK-NEXT: ret + %sel = select i1 %cond, i8 -4, i8 3 + %bo = sub i8 5, %sel + ret i8 %bo +} + +define i8 @sel_constants_mul_constant(i1 %cond) { +; CHECK-LABEL: sel_constants_mul_constant: +; CHECK: // %bb.0: +; CHECK-NEXT: tst w0, #0x1 +; CHECK-NEXT: mov w8, #115 +; CHECK-NEXT: mov w9, #-20 +; CHECK-NEXT: csel w0, w9, w8, ne +; CHECK-NEXT: ret + %sel = select i1 %cond, i8 -4, i8 23 + %bo = mul i8 %sel, 5 + ret i8 %bo +} + +define i8 @sel_constants_sdiv_constant(i1 %cond) { +; 
CHECK-LABEL: sel_constants_sdiv_constant: +; CHECK: // %bb.0: +; CHECK-NEXT: tst w0, #0x1 +; CHECK-NEXT: mov w8, #4 +; CHECK-NEXT: csel w0, wzr, w8, ne +; CHECK-NEXT: ret + %sel = select i1 %cond, i8 -4, i8 23 + %bo = sdiv i8 %sel, 5 + ret i8 %bo +} + +define i8 @sdiv_constant_sel_constants(i1 %cond) { +; CHECK-LABEL: sdiv_constant_sel_constants: +; CHECK: // %bb.0: +; CHECK-NEXT: tst w0, #0x1 +; CHECK-NEXT: mov w8, #5 +; CHECK-NEXT: csel w0, wzr, w8, ne +; CHECK-NEXT: ret + %sel = select i1 %cond, i8 121, i8 23 + %bo = sdiv i8 120, %sel + ret i8 %bo +} + +define i8 @sel_constants_udiv_constant(i1 %cond) { +; CHECK-LABEL: sel_constants_udiv_constant: +; CHECK: // %bb.0: +; CHECK-NEXT: tst w0, #0x1 +; CHECK-NEXT: mov w8, #4 +; CHECK-NEXT: mov w9, #50 +; CHECK-NEXT: csel w0, w9, w8, ne +; CHECK-NEXT: ret + %sel = select i1 %cond, i8 -4, i8 23 + %bo = udiv i8 %sel, 5 + ret i8 %bo +} + +define i8 @udiv_constant_sel_constants(i1 %cond) { +; CHECK-LABEL: udiv_constant_sel_constants: +; CHECK: // %bb.0: +; CHECK-NEXT: tst w0, #0x1 +; CHECK-NEXT: mov w8, #5 +; CHECK-NEXT: csel w0, wzr, w8, ne +; CHECK-NEXT: ret + %sel = select i1 %cond, i8 -4, i8 23 + %bo = udiv i8 120, %sel + ret i8 %bo +} + +define i8 @sel_constants_srem_constant(i1 %cond) { +; CHECK-LABEL: sel_constants_srem_constant: +; CHECK: // %bb.0: +; CHECK-NEXT: tst w0, #0x1 +; CHECK-NEXT: mov w8, #-4 +; CHECK-NEXT: cinv w0, w8, eq +; CHECK-NEXT: ret + %sel = select i1 %cond, i8 -4, i8 23 + %bo = srem i8 %sel, 5 + ret i8 %bo +} + +define i8 @srem_constant_sel_constants(i1 %cond) { +; CHECK-LABEL: srem_constant_sel_constants: +; CHECK: // %bb.0: +; CHECK-NEXT: tst w0, #0x1 +; CHECK-NEXT: mov w8, #5 +; CHECK-NEXT: mov w9, #120 +; CHECK-NEXT: csel w0, w9, w8, ne +; CHECK-NEXT: ret + %sel = select i1 %cond, i8 121, i8 23 + %bo = srem i8 120, %sel + ret i8 %bo +} + +define i8 @sel_constants_urem_constant(i1 %cond) { +; CHECK-LABEL: sel_constants_urem_constant: +; CHECK: // %bb.0: +; CHECK-NEXT: tst w0, #0x1 +; 
CHECK-NEXT: mov w8, #2 +; CHECK-NEXT: cinc w0, w8, eq +; CHECK-NEXT: ret + %sel = select i1 %cond, i8 -4, i8 23 + %bo = urem i8 %sel, 5 + ret i8 %bo +} + +define i8 @urem_constant_sel_constants(i1 %cond) { +; CHECK-LABEL: urem_constant_sel_constants: +; CHECK: // %bb.0: +; CHECK-NEXT: tst w0, #0x1 +; CHECK-NEXT: mov w8, #5 +; CHECK-NEXT: mov w9, #120 +; CHECK-NEXT: csel w0, w9, w8, ne +; CHECK-NEXT: ret + %sel = select i1 %cond, i8 -4, i8 23 + %bo = urem i8 120, %sel + ret i8 %bo +} + +define i8 @sel_constants_and_constant(i1 %cond) { +; CHECK-LABEL: sel_constants_and_constant: +; CHECK: // %bb.0: +; CHECK-NEXT: tst w0, #0x1 +; CHECK-NEXT: mov w8, #4 +; CHECK-NEXT: cinc w0, w8, eq +; CHECK-NEXT: ret + %sel = select i1 %cond, i8 -4, i8 23 + %bo = and i8 %sel, 5 + ret i8 %bo +} + +define i8 @sel_constants_or_constant(i1 %cond) { +; CHECK-LABEL: sel_constants_or_constant: +; CHECK: // %bb.0: +; CHECK-NEXT: tst w0, #0x1 +; CHECK-NEXT: mov w8, #23 +; CHECK-NEXT: mov w9, #-3 +; CHECK-NEXT: csel w0, w9, w8, ne +; CHECK-NEXT: ret + %sel = select i1 %cond, i8 -4, i8 23 + %bo = or i8 %sel, 5 + ret i8 %bo +} + +define i8 @sel_constants_xor_constant(i1 %cond) { +; CHECK-LABEL: sel_constants_xor_constant: +; CHECK: // %bb.0: +; CHECK-NEXT: tst w0, #0x1 +; CHECK-NEXT: mov w8, #18 +; CHECK-NEXT: mov w9, #-7 +; CHECK-NEXT: csel w0, w9, w8, ne +; CHECK-NEXT: ret + %sel = select i1 %cond, i8 -4, i8 23 + %bo = xor i8 %sel, 5 + ret i8 %bo +} + +define i8 @sel_constants_shl_constant(i1 %cond) { +; CHECK-LABEL: sel_constants_shl_constant: +; CHECK: // %bb.0: +; CHECK-NEXT: tst w0, #0x1 +; CHECK-NEXT: mov w8, #-32 +; CHECK-NEXT: mov w9, #-128 +; CHECK-NEXT: csel w0, w9, w8, ne +; CHECK-NEXT: ret + %sel = select i1 %cond, i8 -4, i8 23 + %bo = shl i8 %sel, 5 + ret i8 %bo +} + +define i8 @shl_constant_sel_constants(i1 %cond) { +; CHECK-LABEL: shl_constant_sel_constants: +; CHECK: // %bb.0: +; CHECK-NEXT: tst w0, #0x1 +; CHECK-NEXT: mov w8, #2 +; CHECK-NEXT: cinc x8, x8, eq +; CHECK-NEXT: 
mov w9, #1 +; CHECK-NEXT: lsl w0, w9, w8 +; CHECK-NEXT: ret + %sel = select i1 %cond, i8 2, i8 3 + %bo = shl i8 1, %sel + ret i8 %bo +} + +define i8 @sel_constants_lshr_constant(i1 %cond) { +; CHECK-LABEL: sel_constants_lshr_constant: +; CHECK: // %bb.0: +; CHECK-NEXT: tst w0, #0x1 +; CHECK-NEXT: mov w8, #7 +; CHECK-NEXT: csel w0, w8, wzr, ne +; CHECK-NEXT: ret + %sel = select i1 %cond, i8 -4, i8 23 + %bo = lshr i8 %sel, 5 + ret i8 %bo +} + +define i8 @lshr_constant_sel_constants(i1 %cond) { +; CHECK-LABEL: lshr_constant_sel_constants: +; CHECK: // %bb.0: +; CHECK-NEXT: tst w0, #0x1 +; CHECK-NEXT: mov w8, #2 +; CHECK-NEXT: cinc x8, x8, eq +; CHECK-NEXT: mov w9, #64 +; CHECK-NEXT: lsr w0, w9, w8 +; CHECK-NEXT: ret + %sel = select i1 %cond, i8 2, i8 3 + %bo = lshr i8 64, %sel + ret i8 %bo +} + + +define i8 @sel_constants_ashr_constant(i1 %cond) { +; CHECK-LABEL: sel_constants_ashr_constant: +; CHECK: // %bb.0: +; CHECK-NEXT: sbfx w0, w0, #0, #1 +; CHECK-NEXT: ret + %sel = select i1 %cond, i8 -4, i8 23 + %bo = ashr i8 %sel, 5 + ret i8 %bo +} + +define i8 @ashr_constant_sel_constants(i1 %cond) { +; CHECK-LABEL: ashr_constant_sel_constants: +; CHECK: // %bb.0: +; CHECK-NEXT: tst w0, #0x1 +; CHECK-NEXT: mov w8, #2 +; CHECK-NEXT: cinc x8, x8, eq +; CHECK-NEXT: mov w9, #-128 +; CHECK-NEXT: asr w0, w9, w8 +; CHECK-NEXT: ret + %sel = select i1 %cond, i8 2, i8 3 + %bo = ashr i8 128, %sel + ret i8 %bo +} + +define double @sel_constants_fadd_constant(i1 %cond) { +; CHECK-LABEL: sel_constants_fadd_constant: +; CHECK: // %bb.0: +; CHECK-NEXT: adrp x8, .LCPI42_0 +; CHECK-NEXT: ldr d0, [x8, :lo12:.LCPI42_0] +; CHECK-NEXT: mov x8, #7378697629483820646 +; CHECK-NEXT: movk x8, #16444, lsl #48 +; CHECK-NEXT: tst w0, #0x1 +; CHECK-NEXT: fmov d1, x8 +; CHECK-NEXT: fcsel d0, d0, d1, ne +; CHECK-NEXT: ret + %sel = select i1 %cond, double -4.0, double 23.3 + %bo = fadd double %sel, 5.1 + ret double %bo +} + +define double @sel_constants_fsub_constant(i1 %cond) { +; CHECK-LABEL: 
sel_constants_fsub_constant: +; CHECK: // %bb.0: +; CHECK-NEXT: adrp x8, .LCPI43_0 +; CHECK-NEXT: ldr d0, [x8, :lo12:.LCPI43_0] +; CHECK-NEXT: mov x8, #3689348814741910323 +; CHECK-NEXT: movk x8, #49186, lsl #48 +; CHECK-NEXT: tst w0, #0x1 +; CHECK-NEXT: fmov d1, x8 +; CHECK-NEXT: fcsel d0, d1, d0, ne +; CHECK-NEXT: ret + %sel = select i1 %cond, double -4.0, double 23.3 + %bo = fsub double %sel, 5.1 + ret double %bo +} + +define double @fsub_constant_sel_constants(i1 %cond) { +; CHECK-LABEL: fsub_constant_sel_constants: +; CHECK: // %bb.0: +; CHECK-NEXT: adrp x8, .LCPI44_0 +; CHECK-NEXT: ldr d0, [x8, :lo12:.LCPI44_0] +; CHECK-NEXT: mov x8, #3689348814741910323 +; CHECK-NEXT: movk x8, #16418, lsl #48 +; CHECK-NEXT: tst w0, #0x1 +; CHECK-NEXT: fmov d1, x8 +; CHECK-NEXT: fcsel d0, d1, d0, ne +; CHECK-NEXT: ret + %sel = select i1 %cond, double -4.0, double 23.3 + %bo = fsub double 5.1, %sel + ret double %bo +} + +define double @sel_constants_fmul_constant(i1 %cond) { +; CHECK-LABEL: sel_constants_fmul_constant: +; CHECK: // %bb.0: +; CHECK-NEXT: adrp x8, .LCPI45_0 +; CHECK-NEXT: ldr d0, [x8, :lo12:.LCPI45_0] +; CHECK-NEXT: mov x8, #7378697629483820646 +; CHECK-NEXT: movk x8, #49204, lsl #48 +; CHECK-NEXT: tst w0, #0x1 +; CHECK-NEXT: fmov d1, x8 +; CHECK-NEXT: fcsel d0, d1, d0, ne +; CHECK-NEXT: ret + %sel = select i1 %cond, double -4.0, double 23.3 + %bo = fmul double %sel, 5.1 + ret double %bo +} + +define double @sel_constants_fdiv_constant(i1 %cond) { +; CHECK-LABEL: sel_constants_fdiv_constant: +; CHECK: // %bb.0: +; CHECK-NEXT: adrp x8, .LCPI46_0 +; CHECK-NEXT: adrp x9, .LCPI46_1 +; CHECK-NEXT: ldr d0, [x8, :lo12:.LCPI46_0] +; CHECK-NEXT: ldr d1, [x9, :lo12:.LCPI46_1] +; CHECK-NEXT: tst w0, #0x1 +; CHECK-NEXT: fcsel d0, d1, d0, ne +; CHECK-NEXT: ret + %sel = select i1 %cond, double -4.0, double 23.3 + %bo = fdiv double %sel, 5.1 + ret double %bo +} + +define double @fdiv_constant_sel_constants(i1 %cond) { +; CHECK-LABEL: fdiv_constant_sel_constants: +; CHECK: // 
%bb.0: +; CHECK-NEXT: adrp x8, .LCPI47_0 +; CHECK-NEXT: ldr d0, [x8, :lo12:.LCPI47_0] +; CHECK-NEXT: mov x8, #7378697629483820646 +; CHECK-NEXT: movk x8, #49140, lsl #48 +; CHECK-NEXT: tst w0, #0x1 +; CHECK-NEXT: fmov d1, x8 +; CHECK-NEXT: fcsel d0, d1, d0, ne +; CHECK-NEXT: ret + %sel = select i1 %cond, double -4.0, double 23.3 + %bo = fdiv double 5.1, %sel + ret double %bo +} + +define double @sel_constants_frem_constant(i1 %cond) { +; CHECK-LABEL: sel_constants_frem_constant: +; CHECK: // %bb.0: +; CHECK-NEXT: adrp x8, .LCPI48_0 +; CHECK-NEXT: ldr d0, [x8, :lo12:.LCPI48_0] +; CHECK-NEXT: tst w0, #0x1 +; CHECK-NEXT: fmov d1, #-4.00000000 +; CHECK-NEXT: fcsel d0, d1, d0, ne +; CHECK-NEXT: ret + %sel = select i1 %cond, double -4.0, double 23.3 + %bo = frem double %sel, 5.1 + ret double %bo +} + +define double @frem_constant_sel_constants(i1 %cond) { +; CHECK-LABEL: frem_constant_sel_constants: +; CHECK: // %bb.0: +; CHECK-NEXT: adrp x8, .LCPI49_0 +; CHECK-NEXT: ldr d0, [x8, :lo12:.LCPI49_0] +; CHECK-NEXT: mov x8, #7378697629483820646 +; CHECK-NEXT: movk x8, #16404, lsl #48 +; CHECK-NEXT: tst w0, #0x1 +; CHECK-NEXT: fmov d1, x8 +; CHECK-NEXT: fcsel d0, d0, d1, ne +; CHECK-NEXT: ret + %sel = select i1 %cond, double -4.0, double 23.3 + %bo = frem double 5.1, %sel + ret double %bo +} Added: llvm/trunk/test/CodeGen/AArch64/vselect-constants.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AArch64/vselect-constants.ll?rev=374568&view=auto ============================================================================== --- llvm/trunk/test/CodeGen/AArch64/vselect-constants.ll (added) +++ llvm/trunk/test/CodeGen/AArch64/vselect-constants.ll Fri Oct 11 09:10:23 2019 @@ -0,0 +1,195 @@ +; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py +; RUN: llc < %s -mtriple=aarch64-- | FileCheck %s + +; First, check the generic pattern for any 2 vector constants. Then, check special cases where +; the constants are all off-by-one. 
Finally, check the extra special cases where the constants +; include 0 or -1. +; Each minimal select test is repeated with a more typical pattern that includes a compare to +; generate the condition value. + +define <4 x i32> @sel_C1_or_C2_vec(<4 x i1> %cond) { +; CHECK-LABEL: sel_C1_or_C2_vec: +; CHECK: // %bb.0: +; CHECK-NEXT: adrp x8, .LCPI0_0 +; CHECK-NEXT: adrp x9, .LCPI0_1 +; CHECK-NEXT: ldr q1, [x8, :lo12:.LCPI0_0] +; CHECK-NEXT: ldr q2, [x9, :lo12:.LCPI0_1] +; CHECK-NEXT: ushll v0.4s, v0.4h, #0 +; CHECK-NEXT: shl v0.4s, v0.4s, #31 +; CHECK-NEXT: sshr v0.4s, v0.4s, #31 +; CHECK-NEXT: bsl v0.16b, v2.16b, v1.16b +; CHECK-NEXT: ret + %add = select <4 x i1> %cond, <4 x i32> , <4 x i32> + ret <4 x i32> %add +} + +define <4 x i32> @cmp_sel_C1_or_C2_vec(<4 x i32> %x, <4 x i32> %y) { +; CHECK-LABEL: cmp_sel_C1_or_C2_vec: +; CHECK: // %bb.0: +; CHECK-NEXT: adrp x8, .LCPI1_0 +; CHECK-NEXT: adrp x9, .LCPI1_1 +; CHECK-NEXT: ldr q2, [x8, :lo12:.LCPI1_0] +; CHECK-NEXT: ldr q3, [x9, :lo12:.LCPI1_1] +; CHECK-NEXT: cmeq v0.4s, v0.4s, v1.4s +; CHECK-NEXT: bsl v0.16b, v3.16b, v2.16b +; CHECK-NEXT: ret + %cond = icmp eq <4 x i32> %x, %y + %add = select <4 x i1> %cond, <4 x i32> , <4 x i32> + ret <4 x i32> %add +} + +define <4 x i32> @sel_Cplus1_or_C_vec(<4 x i1> %cond) { +; CHECK-LABEL: sel_Cplus1_or_C_vec: +; CHECK: // %bb.0: +; CHECK-NEXT: adrp x8, .LCPI2_0 +; CHECK-NEXT: adrp x9, .LCPI2_1 +; CHECK-NEXT: ldr q1, [x8, :lo12:.LCPI2_0] +; CHECK-NEXT: ldr q2, [x9, :lo12:.LCPI2_1] +; CHECK-NEXT: ushll v0.4s, v0.4h, #0 +; CHECK-NEXT: shl v0.4s, v0.4s, #31 +; CHECK-NEXT: sshr v0.4s, v0.4s, #31 +; CHECK-NEXT: bsl v0.16b, v2.16b, v1.16b +; CHECK-NEXT: ret + %add = select <4 x i1> %cond, <4 x i32> , <4 x i32> + ret <4 x i32> %add +} + +define <4 x i32> @cmp_sel_Cplus1_or_C_vec(<4 x i32> %x, <4 x i32> %y) { +; CHECK-LABEL: cmp_sel_Cplus1_or_C_vec: +; CHECK: // %bb.0: +; CHECK-NEXT: adrp x8, .LCPI3_0 +; CHECK-NEXT: adrp x9, .LCPI3_1 +; CHECK-NEXT: ldr q2, [x8, :lo12:.LCPI3_0] +; 
CHECK-NEXT: ldr q3, [x9, :lo12:.LCPI3_1] +; CHECK-NEXT: cmeq v0.4s, v0.4s, v1.4s +; CHECK-NEXT: bsl v0.16b, v3.16b, v2.16b +; CHECK-NEXT: ret + %cond = icmp eq <4 x i32> %x, %y + %add = select <4 x i1> %cond, <4 x i32> , <4 x i32> + ret <4 x i32> %add +} + +define <4 x i32> @sel_Cminus1_or_C_vec(<4 x i1> %cond) { +; CHECK-LABEL: sel_Cminus1_or_C_vec: +; CHECK: // %bb.0: +; CHECK-NEXT: adrp x8, .LCPI4_0 +; CHECK-NEXT: adrp x9, .LCPI4_1 +; CHECK-NEXT: ldr q1, [x8, :lo12:.LCPI4_0] +; CHECK-NEXT: ldr q2, [x9, :lo12:.LCPI4_1] +; CHECK-NEXT: ushll v0.4s, v0.4h, #0 +; CHECK-NEXT: shl v0.4s, v0.4s, #31 +; CHECK-NEXT: sshr v0.4s, v0.4s, #31 +; CHECK-NEXT: bsl v0.16b, v2.16b, v1.16b +; CHECK-NEXT: ret + %add = select <4 x i1> %cond, <4 x i32> , <4 x i32> + ret <4 x i32> %add +} + +define <4 x i32> @cmp_sel_Cminus1_or_C_vec(<4 x i32> %x, <4 x i32> %y) { +; CHECK-LABEL: cmp_sel_Cminus1_or_C_vec: +; CHECK: // %bb.0: +; CHECK-NEXT: adrp x8, .LCPI5_0 +; CHECK-NEXT: adrp x9, .LCPI5_1 +; CHECK-NEXT: ldr q2, [x8, :lo12:.LCPI5_0] +; CHECK-NEXT: ldr q3, [x9, :lo12:.LCPI5_1] +; CHECK-NEXT: cmeq v0.4s, v0.4s, v1.4s +; CHECK-NEXT: bsl v0.16b, v3.16b, v2.16b +; CHECK-NEXT: ret + %cond = icmp eq <4 x i32> %x, %y + %add = select <4 x i1> %cond, <4 x i32> , <4 x i32> + ret <4 x i32> %add +} + +define <4 x i32> @sel_minus1_or_0_vec(<4 x i1> %cond) { +; CHECK-LABEL: sel_minus1_or_0_vec: +; CHECK: // %bb.0: +; CHECK-NEXT: ushll v0.4s, v0.4h, #0 +; CHECK-NEXT: shl v0.4s, v0.4s, #31 +; CHECK-NEXT: sshr v0.4s, v0.4s, #31 +; CHECK-NEXT: ret + %add = select <4 x i1> %cond, <4 x i32> , <4 x i32> + ret <4 x i32> %add +} + +define <4 x i32> @cmp_sel_minus1_or_0_vec(<4 x i32> %x, <4 x i32> %y) { +; CHECK-LABEL: cmp_sel_minus1_or_0_vec: +; CHECK: // %bb.0: +; CHECK-NEXT: cmeq v0.4s, v0.4s, v1.4s +; CHECK-NEXT: ret + %cond = icmp eq <4 x i32> %x, %y + %add = select <4 x i1> %cond, <4 x i32> , <4 x i32> + ret <4 x i32> %add +} + +define <4 x i32> @sel_0_or_minus1_vec(<4 x i1> %cond) { +; CHECK-LABEL: 
sel_0_or_minus1_vec: +; CHECK: // %bb.0: +; CHECK-NEXT: ushll v0.4s, v0.4h, #0 +; CHECK-NEXT: shl v0.4s, v0.4s, #31 +; CHECK-NEXT: cmge v0.4s, v0.4s, #0 +; CHECK-NEXT: ret + %add = select <4 x i1> %cond, <4 x i32> , <4 x i32> + ret <4 x i32> %add +} + +define <4 x i32> @cmp_sel_0_or_minus1_vec(<4 x i32> %x, <4 x i32> %y) { +; CHECK-LABEL: cmp_sel_0_or_minus1_vec: +; CHECK: // %bb.0: +; CHECK-NEXT: cmeq v0.4s, v0.4s, v1.4s +; CHECK-NEXT: mvn v0.16b, v0.16b +; CHECK-NEXT: ret + %cond = icmp eq <4 x i32> %x, %y + %add = select <4 x i1> %cond, <4 x i32> , <4 x i32> + ret <4 x i32> %add +} + +define <4 x i32> @sel_1_or_0_vec(<4 x i1> %cond) { +; CHECK-LABEL: sel_1_or_0_vec: +; CHECK: // %bb.0: +; CHECK-NEXT: ushll v0.4s, v0.4h, #0 +; CHECK-NEXT: shl v0.4s, v0.4s, #31 +; CHECK-NEXT: sshr v0.4s, v0.4s, #31 +; CHECK-NEXT: movi v1.4s, #1 +; CHECK-NEXT: and v0.16b, v0.16b, v1.16b +; CHECK-NEXT: ret + %add = select <4 x i1> %cond, <4 x i32> , <4 x i32> + ret <4 x i32> %add +} + +define <4 x i32> @cmp_sel_1_or_0_vec(<4 x i32> %x, <4 x i32> %y) { +; CHECK-LABEL: cmp_sel_1_or_0_vec: +; CHECK: // %bb.0: +; CHECK-NEXT: cmeq v0.4s, v0.4s, v1.4s +; CHECK-NEXT: movi v1.4s, #1 +; CHECK-NEXT: and v0.16b, v0.16b, v1.16b +; CHECK-NEXT: ret + %cond = icmp eq <4 x i32> %x, %y + %add = select <4 x i1> %cond, <4 x i32> , <4 x i32> + ret <4 x i32> %add +} + +define <4 x i32> @sel_0_or_1_vec(<4 x i1> %cond) { +; CHECK-LABEL: sel_0_or_1_vec: +; CHECK: // %bb.0: +; CHECK-NEXT: ushll v0.4s, v0.4h, #0 +; CHECK-NEXT: shl v0.4s, v0.4s, #31 +; CHECK-NEXT: cmge v0.4s, v0.4s, #0 +; CHECK-NEXT: movi v1.4s, #1 +; CHECK-NEXT: and v0.16b, v0.16b, v1.16b +; CHECK-NEXT: ret + %add = select <4 x i1> %cond, <4 x i32> , <4 x i32> + ret <4 x i32> %add +} + +define <4 x i32> @cmp_sel_0_or_1_vec(<4 x i32> %x, <4 x i32> %y) { +; CHECK-LABEL: cmp_sel_0_or_1_vec: +; CHECK: // %bb.0: +; CHECK-NEXT: cmeq v0.4s, v0.4s, v1.4s +; CHECK-NEXT: movi v1.4s, #1 +; CHECK-NEXT: bic v0.16b, v1.16b, v0.16b +; CHECK-NEXT: ret + 
%cond = icmp eq <4 x i32> %x, %y
+ %add = select <4 x i1> %cond, <4 x i32> , <4 x i32>
+ ret <4 x i32> %add
+}
+

From llvm-commits at lists.llvm.org  Fri Oct 11 09:10:22 2019
From: llvm-commits at lists.llvm.org (David Stenberg via Phabricator via llvm-commits)
Date: Fri, 11 Oct 2019 16:10:22 +0000 (UTC)
Subject: [PATCH] D68465: [DebugInfo] Trim call-clobbered location list entries when tuning for GDB
In-Reply-To:
References:
Message-ID: <889f2f3022089ad14e14a34ec4bf49ac@localhost.localdomain>

dstenb added a comment.

In D68465#1704522, @dblaikie wrote:

> Unsigned operands don't /seem/ to me to be problematic here..
>
> GCC's output in -gdwarf-5 with the example you provided previously doesn't use debug_addr, instead using offset_pair (if there's a base address) or start_length (if I add another function to the example, and use -ffunction-sections, so the base address of the CU is constant zero):
>
>   .byte 0x8
>   .quad .LVL0
>   .uleb128 .LVL1-1-.LVL0
>   .uleb128 0x1
>   .byte 0x50
>   .byte 0
>
> With -gsplit-dwarf GCC has to use debug_addr, and we don't see any label arithmetic in debug_addr:
>
>   .section .debug_addr,"",@progbits
>   .quad .LVL1
>   .quad .LFB0
>   .quad .LFB1
>   .quad call
>   .quad value
>   .quad .LVL0
>
> & we do see it in debug_loclists.dwo:
>
>   .byte 0x3
>   .uleb128 0x5
>   .uleb128 .LVL1-1-.LVL0
>   .uleb128 0x1
>   .byte 0x50
>   .byte 0
>
> So if we are going to do this, I'd certainly want to match that sort of behavior - and not make changes to/add extra addresses to debug_addr if we don't have to.

Is that output based on the C reproducer I posted a few comments up? I don't think that case is representative for the issue with the offsets in the address pool. Since the variable is only described by a call-clobbered register, we only have to end a location list entry at `$return_addr - 1`, not start a location list entry at `$return_addr - 1`. I think we can get away without needing offsets in the address pool for such cases.

The C reproducer that the call-clobbered-split.mir test case is based on has an aggregate variable whose elements are described by a call-clobbered register and a constant, respectively, so we want to start a location list entry at `$return_addr - 1` for which only the constant element is described. Here is that C reproducer:

  extern void fn2(int *);

  void fn1() {
    int data[] = {1, 2};
    int *ptrs[] = {0, &data[1]};
    fn2(ptrs[1]);
    ptrs[1] = 0;
  }

If I compile that with GCC 7.4.0 using the following command line:

  $ gcc-7 -O1 -g -gdwarf-5 -gsplit-dwarf -S -o - foo.c

GCC emits an address pool entry with an offset:

  [...]
  .Ldebug_addr0:
          .quad   .LVL1
          .quad   .LVL2
          .quad   .LVL3
          .quad   .LFB0
          .quad   .LVL2-1   <----------
          .quad   fn2
          .quad   __stack_chk_fail
          .quad   .LVL0

So this patch seems to behave similarly to GCC, admittedly a quite old version of it. I'll build the latest GCC release and try the same with that.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68465/new/

https://reviews.llvm.org/D68465



From llvm-commits at lists.llvm.org  Fri Oct 11 09:10:22 2019
From: llvm-commits at lists.llvm.org (Nick Desaulniers via Phabricator via llvm-commits)
Date: Fri, 11 Oct 2019 16:10:22 +0000 (UTC)
Subject: [PATCH] D68764: [ARM][AsmParser] handles offset expression in parentheses
In-Reply-To:
References:
Message-ID:

nickdesaulniers added inline comments.

================
Comment at: llvm/lib/Target/ARM/AsmParser/ARMAsmParser.cpp:5736
   // If we have a '#', it's an immediate offset, else assume it's a register
+  // offset. Be friendly and also accept a plain integer or expression (without
----------------
Should this comment also mention `'$'`?

================
Comment at: llvm/lib/Target/ARM/AsmParser/ARMAsmParser.cpp:5743
       Parser.getTok().is(AsmToken::Integer)) {
-    if (Parser.getTok().isNot(AsmToken::Integer))
-      Parser.Lex(); // Eat '#' or '$'.
+    if (Parser.getTok().isNot(AsmToken::Integer) && Parser.getTok().isNot(AsmToken::LParen))
+      Parser.Lex(); // Eat '#' or '$'
----------------
This line length looks a little long. Did you remember to run `git-clang-format HEAD~` and amend that?

================
Comment at: llvm/lib/Target/ARM/AsmParser/ARMAsmParser.cpp:5743
       Parser.getTok().is(AsmToken::Integer)) {
-    if (Parser.getTok().isNot(AsmToken::Integer))
-      Parser.Lex(); // Eat '#' or '$'.
+    if (Parser.getTok().isNot(AsmToken::Integer) && Parser.getTok().isNot(AsmToken::LParen))
+      Parser.Lex(); // Eat '#' or '$'
----------------
nickdesaulniers wrote:
> This line length looks a little long. Did you remember to run `git-clang-format HEAD~` and amend that?
Based on the body of the if statement, would the condition `Parser.getTok().is(AsmToken::Hash) || Parser.getTok().is(AsmToken::Dollar)` work? If so, I think it would make more sense to use that, rather than check it's not the other cases.

================
Comment at: llvm/test/MC/ARM/gas-compl-mem-offset-paren.s:1
+@ RUN: llvm-mc -triple=arm < %s | FileCheck %s
+
----------------
Since this is a GAS compliance test, let's use a GAS triple, like `-triple=arm-linux-gnueabi`.

================
Comment at: llvm/test/MC/ARM/gas-compl-mem-offset-paren.s:3
+
+.syntax unified
+
----------------
If you remove this assembler directive outright, does the test still pass? If so, let's remove it. Also, it seems that you partially removed the other occurrences, but not all of them. It should occur once, or not at all (unless you wanted to test changing back and forth between them, but that's not what we're testing here).
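
The question in the second inline comment (negative check versus positive check) can be illustrated with a small stand-alone sketch. Everything below is hypothetical: `TokKind` is a stand-in for the real token kinds, and the assumption that the enclosing branch is entered only for `Hash`, `Dollar`, `Integer` and `LParen` is mine, since only the tail of that condition is visible in the quoted diff.

```cpp
// Hypothetical stand-in for the token kinds; this is not the real AsmToken enum.
enum class TokKind { Hash, Dollar, Integer, LParen, Minus };

// Negative form (shape of the check in the patch): consume the token unless
// it is one of the kinds that already start the offset expression.
inline bool eatsPrefixNegative(TokKind K) {
  return K != TokKind::Integer && K != TokKind::LParen;
}

// Positive form (shape of the check suggested in the review): consume the
// token only when it is '#' or '$'.
inline bool eatsPrefixPositive(TokKind K) {
  return K == TokKind::Hash || K == TokKind::Dollar;
}

// Assumption: the enclosing branch admits only these four kinds.
inline bool formsAgreeUnderGuard() {
  const TokKind Guarded[] = {TokKind::Hash, TokKind::Dollar,
                             TokKind::Integer, TokKind::LParen};
  for (TokKind K : Guarded)
    if (eatsPrefixNegative(K) != eatsPrefixPositive(K))
      return false;
  return true;
}
```

Under that assumption the two forms behave identically. They diverge only for kinds the guard does not admit (for example `Minus` here), which is why the positive form keeps working if the guard is later widened.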

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68764/new/

https://reviews.llvm.org/D68764



From llvm-commits at lists.llvm.org  Fri Oct 11 09:19:29 2019
From: llvm-commits at lists.llvm.org (Billy Robert O'Neal III via Phabricator via llvm-commits)
Date: Fri, 11 Oct 2019 16:19:29 +0000 (UTC)
Subject: [PATCH] D68820: win: Move Parallel.h off concrt to cross-platform code
In-Reply-To:
References:
Message-ID: <24dd4a1d90d403cd047b5d2bd3bdc106@localhost.localdomain>

BillyONeal added inline comments.

================
Comment at: llvm/include/llvm/Support/Parallel.h:124
   TaskGroup TG;
   parallel_quick_sort(Start, End, Comp, TG,
                       llvm::Log2_64(std::distance(Start, End)) + 1);
----------------
aganea wrote:
> BillyONeal wrote:
> > If you get a chance to benchmark I'm curious how this compares to our std::sort(std::execution::par, ...) version :)
> I ran a few AB/BA tests on LLD with my dataset. The cumulative time on all cores with ConcRT is consistently higher by about 300ms on my 36-core Skylake (~1.9 sec for the ConcRT version, ~1.6 sec after this patch). There are only three places where we `parallelSort` in LLD, so maybe this is not representative. But the dataset is quite big, ~22 GB of OBJs and LIBs. This is a Unity build of the Editor Release target of one of our games. I can also try with no Unity files; usually the dataset is about an order of magnitude greater.
>
> **Before:**
> {F10225243}
>
> **After:**
> {F10225244}
>
Not concrt; the std::sort(par...) standard parallel algorithm is an unrelated implementation.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68820/new/

https://reviews.llvm.org/D68820



From llvm-commits at lists.llvm.org  Fri Oct 11 09:19:29 2019
From: llvm-commits at lists.llvm.org (Christian Kühnel via Phabricator via llvm-commits)
Date: Fri, 11 Oct 2019 16:19:29 +0000 (UTC)
Subject: [PATCH] D68860: Test Diff - DO NOT MERGE
In-Reply-To:
References:
Message-ID: <2baeea81d37042c6ffb28e8adab64478@localhost.localdomain>

kuhnel updated this revision to Diff 224610.
kuhnel added a comment.

- 2nd edit

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68860/new/

https://reviews.llvm.org/D68860

Files:
  llvm/DELETEME.txt

Index: llvm/DELETEME.txt
===================================================================
--- /dev/null
+++ llvm/DELETEME.txt
@@ -0,0 +1,3 @@
+dummy file for testing
+2nd edit
+
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68860.224610.patch
Type: text/x-patch
Size: 181 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org  Fri Oct 11 09:19:30 2019
From: llvm-commits at lists.llvm.org (Tim Corringham via Phabricator via llvm-commits)
Date: Fri, 11 Oct 2019 16:19:30 +0000 (UTC)
Subject: [PATCH] D68873: [AMDGPU] Amend target loop unroll defaults
Message-ID:

timcorringham created this revision.
Herald added subscribers: llvm-commits, hiraditya, t-tye, tpr, dstuttard, yaxunl, nhaehnle, wdng, jvesely, kzhuravl, arsenm.
Herald added a project: LLVM.

Amend the loop unroll thresholds for PAL shaders to be more aggressive.

This gives an overall performance benefit on a representative sample of shaders.

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D68873

Files:
  llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp
  llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.h

Index: llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.h
===================================================================
--- llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.h
+++ llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.h
@@ -49,6 +49,8 @@
   const TargetSubtargetInfo *ST;
   const TargetLoweringBase *TLI;
+  AMDGPUSubtarget::Generation Gen;
+
   const TargetSubtargetInfo *getST() const { return ST; }
   const TargetLoweringBase *getTLI() const { return TLI; }
@@ -57,7 +59,8 @@
     : BaseT(TM, F.getParent()->getDataLayout()),
       TargetTriple(TM->getTargetTriple()),
       ST(static_cast(TM->getSubtargetImpl(F))),
-      TLI(ST->getTargetLowering()) {}
+      TLI(ST->getTargetLowering()),
+      Gen(TM->getSubtarget(F).getGeneration()) {}
   void getUnrollingPreferences(Loop *L, ScalarEvolution &SE,
                                TTI::UnrollingPreferences &UP);
Index: llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp
===================================================================
--- llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp
+++ llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp
@@ -96,6 +96,19 @@
   // TODO: Do we want runtime unrolling?
+  // Set more aggressive defaults for PAL shaders
+  if (TargetTriple.getOS() == Triple::AMDPAL) {
+    UP.MaxPercentThresholdBoost = 1000;
+    // and even more aggressive for GFX10
+    if (Gen >= AMDGPUSubtarget::GFX10) {
+      UP.Threshold = 1100;
+      UP.PartialThreshold = 1100;
+    } else {
+      UP.Threshold = 700;
+      UP.PartialThreshold = 700;
+    }
+  }
+
   // Maximum alloca size than can fit registers. Reserve 16 registers.
   const unsigned MaxAlloca = (256 - 16) * 4;
   unsigned ThresholdPrivate = UnrollThresholdPrivate;
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68873.224612.patch
Type: text/x-patch
Size: 1805 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org  Fri Oct 11 09:28:40 2019
From: llvm-commits at lists.llvm.org (Sid Manning via Phabricator via llvm-commits)
Date: Fri, 11 Oct 2019 16:28:40 +0000 (UTC)
Subject: [PATCH] D68875: Check for branch range overflows.
Message-ID:

sidneym created this revision.
sidneym added reviewers: ruiu, MaskRay, bcain, kparzysz, shankare.
Herald added subscribers: arichardson, emaste.
Herald added a reviewer: espindola.
Herald added a project: LLVM.

Check for branch range overflows and add a testcase.

Repository:
  rLLD LLVM Linker

https://reviews.llvm.org/D68875

Files:
  lld/ELF/Arch/Hexagon.cpp
  lld/test/ELF/hexagon-verify.s

Index: lld/test/ELF/hexagon-verify.s
===================================================================
--- /dev/null
+++ lld/test/ELF/hexagon-verify.s
@@ -0,0 +1,37 @@
+# REQUIRES: hexagon
+# RUN: llvm-mc -filetype=obj -triple=hexagon-unknown-elf %s -o %t.o
+# RUN: not ld.lld %t.o -o %t 2>&1 | FileCheck %s
+
+#CHECK: relocation R_HEX_B9_PCREL out of range: 1028 is not in [-1024, 1023]
+#CHECK: relocation R_HEX_B13_PCREL out of range: 16388 is not in [-16384, 16383]
+#CHECK: relocation R_HEX_B15_PCREL out of range: 65540 is not in [-65536, 65535]
+#CHECK: relocation R_HEX_B22_PCREL out of range: 8388612 is not in [-2097152, 2097151]
+
+
+ .globl _start
+ .type _start, @function
+_start:
+
+.section _pc9, "ax"
+{r0 = #0; jump #pc9}
+.space (1<<10)
+.section b9, "ax"
+pc9:
+
+.section _pc13, "ax"
+if (r0==#0) jump:t #pc13
+.space (1<<14)
+.section b13, "ax"
+pc13:
+
+.section _pc15, "ax"
+if (p0) jump #pc15
+.space (1<<16)
+.section b15, "ax"
+pc15:
+
+.section _pc22, "ax"
+jump #pc22
+.space (1<<23)
+.section b22, "ax"
+pc22:
Index: lld/ELF/Arch/Hexagon.cpp
===================================================================
--- lld/ELF/Arch/Hexagon.cpp
+++ lld/ELF/Arch/Hexagon.cpp
@@ -242,15 +242,18 @@
     or32le(loc, applyMask(0x0fff3fff, val >>
6));
     break;
   case R_HEX_B9_PCREL:
+    checkInt(loc, val, 11, type);
     or32le(loc, applyMask(0x003000fe, val >> 2));
     break;
   case R_HEX_B9_PCREL_X:
     or32le(loc, applyMask(0x003000fe, val & 0x3f));
     break;
   case R_HEX_B13_PCREL:
+    checkInt(loc, val, 15, type);
     or32le(loc, applyMask(0x00202ffe, val >> 2));
     break;
   case R_HEX_B15_PCREL:
+    checkInt(loc, val, 17, type);
     or32le(loc, applyMask(0x00df20fe, val >> 2));
     break;
   case R_HEX_B15_PCREL_X:
@@ -258,6 +261,7 @@
     break;
   case R_HEX_B22_PCREL:
   case R_HEX_PLT_B22_PCREL:
+    checkInt(loc, val, 22, type);
     or32le(loc, applyMask(0x1ff3ffe, val >> 2));
     break;
   case R_HEX_B22_PCREL_X:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68875.224614.patch
Type: text/x-patch
Size: 1981 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org  Fri Oct 11 09:37:56 2019
From: llvm-commits at lists.llvm.org (pre-merge checks [bot] via Phabricator via llvm-commits)
Date: Fri, 11 Oct 2019 16:37:56 +0000 (UTC)
Subject: [PATCH] D68860: Test Diff - DO NOT MERGE
In-Reply-To:
References:
Message-ID: <0ddeb97d21feb9db05f1aa6bab3975be@localhost.localdomain>

merge_guards_bot added a comment.

Build results are available at http://results.llvm-merge-guard.org/Phabricator-31

See http://jenkins.llvm-merge-guard.org/job/Phabricator/31/ for more details.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68860/new/

https://reviews.llvm.org/D68860



From llvm-commits at lists.llvm.org  Fri Oct 11 09:37:56 2019
From: llvm-commits at lists.llvm.org (Louis Dionne via Phabricator via llvm-commits)
Date: Fri, 11 Oct 2019 16:37:56 +0000 (UTC)
Subject: [PATCH] D68833: [CMake] Re-order runtimes in the order of dependencies
In-Reply-To:
References:
Message-ID: <61c6a7c6346adc0f8bc3570609df5807@localhost.localdomain>

ldionne accepted this revision.
ldionne added a comment.

CMake tracks dependencies between targets, but not between directories. If the CMakeLists.txt in some directory (e.g.
`/libcxx/`) needs a target defined in another directory (e.g. `/libcxxabi/`), one has to make sure that `libcxxabi`'s `CMakeLists.txt` is included before `libcxx`'s `CMakeLists.txt`. This isn't new or vexing, IMO. If that is what this patch ensures (I don't know the runtimes build very well), I think this is good.

> [...] in general we should avoid using `if(TARGET...)`.

I'd like to question that assertion. What is it based on?

> The other runtime libraries should be able to use the `HAVE_${runtime}` variables to determine the presence of the other runtimes

IMO, the `HAVE_${runtime}` variables are the weird ones here. The normal LLVM monorepo build orders the directories correctly, and we don't run into that issue. I don't see how `HAVE_${runtime}` can get around things like being able to query properties of e.g. `cxxabi_shared` inside the libcxx build before `cxxabi_shared` has been defined.

I do support a push towards using generator expressions more, but I don't think generator expressions are a complete solution to this problem. I'd like to see this patch go in in some form so that we can remove the hacky workaround introduced in https://reviews.llvm.org/D68791.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68833/new/

https://reviews.llvm.org/D68833



From llvm-commits at lists.llvm.org  Fri Oct 11 09:37:57 2019
From: llvm-commits at lists.llvm.org (Andrew J Wock via Phabricator via llvm-commits)
Date: Fri, 11 Oct 2019 16:37:57 +0000 (UTC)
Subject: [PATCH] D65967: [SeparateConstOffsetFromGEP][PowerPC] Fix: sext(a) + sext(b) -> sext(a + b) matches add and sub instructions with one another
In-Reply-To:
References:
Message-ID: <87ff4adb25ba9ba2e99e7fd1c6378cb3@localhost.localdomain>

ajwock added a comment.
ping CHANGES SINCE LAST ACTION https://reviews.llvm.org/D65967/new/ https://reviews.llvm.org/D65967 From llvm-commits at lists.llvm.org Fri Oct 11 09:38:21 2019 From: llvm-commits at lists.llvm.org (Andrew J Wock via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 16:38:21 +0000 (UTC) Subject: [PATCH] D64662: [FPEnv] [PowerPC] Lower ppc_fp128 StrictFP Nodes to libcalls In-Reply-To: References: Message-ID: <5d2c1d447741b3f119283a0ee1b57889@localhost.localdomain> ajwock added a comment. ping Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D64662/new/ https://reviews.llvm.org/D64662 From llvm-commits at lists.llvm.org Fri Oct 11 09:47:16 2019 From: llvm-commits at lists.llvm.org (Alexey Bataev via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 16:47:16 +0000 (UTC) Subject: [PATCH] D67841: [SLP] avoid reduction transform on patterns that the backend can load-combine In-Reply-To: References: Message-ID: <5f67bc452e48b405755117456af3d504@localhost.localdomain> ABataev added inline comments. ================ Comment at: llvm/include/llvm/CodeGen/BasicTTIImpl.h:1698 + /// may not be necessary. + llvm::Optional getLoadCombineCost(Value *V) { + using namespace llvm::PatternMatch; ---------------- Shall we just terminate the reduction of this special construct unconditionally? Do we need this cost calculation just to prevent the reduction all the time we see this pattern? If yes, then, probably, we don't need to calculate the cost. There is a function `isTreeTinyAndNotFullyVectorizable()`. Can you put pattern matching analysis for this particular construct in this function without any additional cost analysis? 
================ Comment at: llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp:6504-6505 case RK_Arithmetic: - ScalarReduxCost = - TTI->getArithmeticInstrCost(ReductionData.getOpcode(), ScalarTy); + if (ReductionData.getOpcode() == Instruction::Or) + LoadCombineCost = TTI->getLoadCombineCost(FirstReducedVal); + if (LoadCombineCost) { ---------------- Maybe better to put the check for the operation into the function itself? CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67841/new/ https://reviews.llvm.org/D67841 From llvm-commits at lists.llvm.org Fri Oct 11 09:56:21 2019 From: llvm-commits at lists.llvm.org (Owen Reynolds via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 16:56:21 +0000 (UTC) Subject: [PATCH] D68033: [llvm-ar] Make paths case insensitive when on windows In-Reply-To: References: Message-ID: <7b48c489ad14033b9b1cb6160c01349b@localhost.localdomain> gbreynoo updated this revision to Diff 224621. gbreynoo added a comment. Update llvm-ar command guide with case insensitivity details, and include a test for archived files with paths for names. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68033/new/ https://reviews.llvm.org/D68033 Files: llvm/docs/CommandGuide/llvm-ar.rst llvm/test/tools/llvm-ar/Inputs/path-names.a llvm/test/tools/llvm-ar/non-windows-name-case.test llvm/test/tools/llvm-ar/path-names.test llvm/test/tools/llvm-ar/windows-name-case.test llvm/tools/llvm-ar/llvm-ar.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D68033.224621.patch Type: text/x-patch Size: 7469 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Fri Oct 11 09:56:23 2019 From: llvm-commits at lists.llvm.org (Owen Reynolds via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 16:56:23 +0000 (UTC) Subject: [PATCH] D68033: [llvm-ar] Make paths case insensitive when on windows In-Reply-To: References: Message-ID: <7789956c9418ca54e1e2e65137751965@localhost.localdomain> gbreynoo added a comment. 
Due to updating the command guide I have not added the case insensitivity details to the llvm-ar help text. Would it be preferred to have these details in both? CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68033/new/ https://reviews.llvm.org/D68033 From llvm-commits at lists.llvm.org Fri Oct 11 09:56:24 2019 From: llvm-commits at lists.llvm.org (Owen Reynolds via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 16:56:24 +0000 (UTC) Subject: [PATCH] D68033: [llvm-ar] Make paths case insensitive when on windows In-Reply-To: References: Message-ID: <907b1352fe070d69c05538566a67f9a4@localhost.localdomain> gbreynoo marked an inline comment as done. gbreynoo added inline comments. ================ Comment at: llvm/tools/llvm-ar/llvm-ar.cpp:509 +#else + return normalizePath(Path1) == normalizePath(Path2); +#endif ---------------- rupprecht wrote: > I'm not quite sure about this change... a few of the callsites before were `Name == normalizePath(Path)`, not `normalizePath(Name) == normalizePath(Path)`. My past experiences of compatibility testing llvm-ar vs GNU ar has largely been paged out, but I think this may have been one of the differences. It may actually be something we want, but we should test it. e.g. to test the `performReadOperation` can you see if extracting "foo/file.txt" will end up extracting "bar/file.txt" (in a situation where `CompareFullPath` is false)? You are correct. I have added a test for this behaviour which now matches that of gnu-ar. 
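The scenario rupprecht asks about (whether "foo/file.txt" can end up matching "bar/file.txt" when full paths are not compared) can be illustrated with a small standalone sketch. This is not llvm-ar's code: `normalizeForCompare`, `baseName` and `samePath` are hypothetical names, and the two boolean parameters are assumptions standing in for the `CompareFullPath` setting and the Windows case-insensitivity discussed in this review.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper (not llvm-ar's normalizePath): unify separators and,
// if requested, fold the path to lower case.
static std::string normalizeForCompare(std::string path, bool caseInsensitive) {
  std::replace(path.begin(), path.end(), '\\', '/');
  if (caseInsensitive)
    std::transform(path.begin(), path.end(), path.begin(), [](unsigned char c) {
      return static_cast<char>(std::tolower(c));
    });
  return path;
}

// Hypothetical helper: the component after the last '/'.
static std::string baseName(const std::string &path) {
  std::size_t slash = path.find_last_of('/');
  return slash == std::string::npos ? path : path.substr(slash + 1);
}

// Compare either whole paths or only the file names, after normalizing
// both sides in the same way.
static bool samePath(const std::string &a, const std::string &b,
                     bool compareFullPath, bool caseInsensitive) {
  const std::string na = normalizeForCompare(a, caseInsensitive);
  const std::string nb = normalizeForCompare(b, caseInsensitive);
  return compareFullPath ? na == nb : baseName(na) == baseName(nb);
}
```

In this sketch, `samePath("foo/file.txt", "bar/file.txt", false, false)` is true while the same call with `compareFullPath` set to true is false, which is the kind of difference the added test is meant to pin down.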
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68033/new/ https://reviews.llvm.org/D68033 From llvm-commits at lists.llvm.org Fri Oct 11 10:05:41 2019 From: llvm-commits at lists.llvm.org (Digger via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 17:05:41 +0000 (UTC) Subject: [PATCH] D67008: [llvm-readobj][XCOFF]implement parsing relocation information for 32-bit xcoff objectfile In-Reply-To: References: Message-ID: <6a84a7c64d4363910f76d90a76165805@localhost.localdomain> DiggerLin marked an inline comment as done. DiggerLin added inline comments. ================ Comment at: llvm/include/llvm/Object/XCOFFObjectFile.h:168 + +bool isRelocationSigned(XCOFFRelocation32 &Reloc); + ---------------- hubert.reinterpretcast wrote: > DiggerLin wrote: > > hubert.reinterpretcast wrote: > > > sfertile wrote: > > > > hubert.reinterpretcast wrote: > > > > > DiggerLin wrote: > > > > > > hubert.reinterpretcast wrote: > > > > > > > Do these need to be declared in the header? Are they called only in one `.cpp` file? If so, they can be made `static` in the `.cpp` file. Otherwise, it seems odd that these aren't `const` member functions of `XCOFFRelocation32`. > > > > > > the llvm-readobj is using those function and obj2yaml will use them too. > > > > > It is still odd to me that these aren't `const` non-static member functions of `XCOFFRelocation32`. > > > > I think were these originally templated to work with both 32-bit and 64-bit relocations, which explains why they aren't member functions. > > > Would using CRTP with a base class template work for that case? > > as Sean's comment, for we only implement 32 bits relocation, we do not use any template for the relocation implement this moment. > All the more reason why these should be non-static member functions in the context of this patch. 
changed to member functions Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67008/new/ https://reviews.llvm.org/D67008 From llvm-commits at lists.llvm.org Fri Oct 11 10:05:42 2019 From: llvm-commits at lists.llvm.org (Jordan Rupprecht via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 17:05:42 +0000 (UTC) Subject: [PATCH] D68730: [llvm-objdump] Adjust spacing and field width for --section-headers In-Reply-To: References: Message-ID: <88e525f62f6b2a609787f8a8c2a66b8d@localhost.localdomain> rupprecht updated this revision to Diff 224627. rupprecht marked 5 inline comments as done. rupprecht added a comment. - Merge tests - Avoid llvm::join() call Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68730/new/ https://reviews.llvm.org/D68730 Files: lld/test/ELF/got32-i386.s lld/test/ELF/got32x-i386.s llvm/test/tools/llvm-objdump/section-headers.test llvm/test/tools/llvm-objdump/wasm.txt llvm/test/tools/llvm-objdump/xcoff-section-headers.test llvm/tools/llvm-objdump/llvm-objdump.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D68730.224627.patch Type: text/x-patch Size: 12923 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Fri Oct 11 10:05:41 2019 From: llvm-commits at lists.llvm.org (Digger via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 17:05:41 +0000 (UTC) Subject: [PATCH] D67008: [llvm-readobj][XCOFF]implement parsing relocation information for 32-bit xcoff objectfile In-Reply-To: References: Message-ID: DiggerLin updated this revision to Diff 224623. 
Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67008/new/ https://reviews.llvm.org/D67008 Files: llvm/include/llvm/BinaryFormat/XCOFF.h llvm/include/llvm/Object/XCOFFObjectFile.h llvm/lib/Object/XCOFFObjectFile.cpp llvm/test/tools/llvm-readobj/reloc_overflow.test llvm/test/tools/llvm-readobj/xcoff-basic.test llvm/tools/llvm-readobj/XCOFFDumper.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D67008.224623.patch Type: text/x-patch Size: 19620 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Fri Oct 11 10:05:42 2019 From: llvm-commits at lists.llvm.org (Brian Cain via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 17:05:42 +0000 (UTC) Subject: [PATCH] D68875: Check for branch range overflows. In-Reply-To: References: Message-ID: <2d95bd41cc885e71997a4448e917d88e@localhost.localdomain> bcain accepted this revision. bcain added a comment. This revision is now accepted and ready to land. LGTM, but suggestions for stricter test case. ================ Comment at: lld/test/ELF/hexagon-verify.s:5-9 +#CHECK: relocation R_HEX_B9_PCREL out of range: 1028 is not in [-1024, 1023] +#CHECK: relocation R_HEX_B13_PCREL out of range: 16388 is not in [-16384, 16383] +#CHECK: relocation R_HEX_B15_PCREL out of range: 65540 is not in [-65536, 65535] +#CHECK: relocation R_HEX_B22_PCREL out of range: 8388612 is not in [-2097152, 2097151] + ---------------- I'd make the subsequent ones CHECK-NEXT. Also may make sense to constrain the FileCheck with implicit-check-not='out of range' to make sure no range errors are emitted that aren't checked for. 
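For readers following the ranges in the CHECK lines above, here is a minimal sketch of a signed N-bit range check. It assumes that a width of N bits means the interval [-2^(N-1), 2^(N-1)-1], which agrees with the first three expected messages (11, 15 and 17 bits). `fitsSignedBits` is a hypothetical name used only for illustration; it is not lld's `checkInt`.

```cpp
#include <cstdint>

// Hypothetical helper (not lld's checkInt): returns true when `value` lies
// in the signed range of `bits` bits, i.e. [-2^(bits-1), 2^(bits-1) - 1].
// Assumes 1 <= bits <= 63 so the shift below is well defined.
static bool fitsSignedBits(int64_t value, unsigned bits) {
  const int64_t limit = int64_t{1} << (bits - 1);
  return value >= -limit && value < limit;
}
```

Under that assumption, `fitsSignedBits(1028, 11)` is false and `fitsSignedBits(1023, 11)` is true, which lines up with the `[-1024, 1023]` interval in the R_HEX_B9_PCREL message.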
Repository: rLLD LLVM Linker CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68875/new/ https://reviews.llvm.org/D68875 From llvm-commits at lists.llvm.org Fri Oct 11 10:05:42 2019 From: llvm-commits at lists.llvm.org (Jordan Rupprecht via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 17:05:42 +0000 (UTC) Subject: [PATCH] D68730: [llvm-objdump] Adjust spacing and field width for --section-headers In-Reply-To: References: Message-ID: rupprecht added inline comments. ================ Comment at: llvm/test/tools/llvm-objdump/section-headers-spacing.test:1 +## Check leading and trailing whitespace for full lines. +# RUN: yaml2obj %s -o %t-whitespace.o ---------------- grimar wrote: > rupprecht wrote: > > grimar wrote: > > > What do you think about combining these tests you have here into one that > > > could use `yaml2obj --docnum=X` and check spacing, formatting etc in one place? > > > (I am not sure it if it is usefull to have 3 different test files?) > > I started out with one test file, but found it to be a collection of somewhat unrelated things -- e.g. name column width and 32 vs 64 bit column widths are different features. So I think it's better to have more focused test files. It's a slightly personal preference though. > We often combine tests by a feature. I.e. test that checks the "-h" output might contain everything related to "-h" at once. Sometimes we do a split to make a test that contain only a error/warnings checks (if there are too many of them) or a particular set of tests that are very different. > > It is not a huge problem, but might be interesting what others think about this too though. > (to summarize my position: I'd prefer to combine them to reduce the number of tests, but it is not critical and is OK as is probably). I don't have that strong of preference either, so merged. 
My only concern is now it's on the larger end of test sizes (7th of all objdump tests per `find llvm/test/tools/llvm-objdump/ -name '*.test' | xargs wc -l | sort -nr | tail +2 | head -10`), but it's still not excessively long. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68730/new/ https://reviews.llvm.org/D68730 From llvm-commits at lists.llvm.org Fri Oct 11 10:05:44 2019 From: llvm-commits at lists.llvm.org (Kerry McLaughlin via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 17:05:44 +0000 (UTC) Subject: [PATCH] D68877: [AArch64][SVE] Implement masked load intrinsics Message-ID: kmclaughlin created this revision. kmclaughlin added reviewers: huntergr, rovka, greened. Herald added subscribers: psnobl, rkruppe, hiraditya, kristof.beyls, tschuett. Herald added a project: LLVM. kmclaughlin added a parent revision: D47775: [AArch64][SVE] Add SPLAT_VECTOR ISD Node. Adds support for codegen of masked loads, with non-extending, zero-extending and sign-extending variants. Depends on the changes in D47775 for isConstantSplatVectorMaskForType Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D68877 Files: llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp llvm/lib/CodeGen/TargetLoweringBase.cpp llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp llvm/lib/Target/AArch64/AArch64InstrInfo.td llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td llvm/lib/Target/AArch64/AArch64TargetTransformInfo.h llvm/lib/Target/AArch64/SVEInstrFormats.td llvm/test/CodeGen/AArch64/sve-masked-ldst-nonext.ll llvm/test/CodeGen/AArch64/sve-masked-ldst-sext.ll llvm/test/CodeGen/AArch64/sve-masked-ldst-zext.ll -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68877.224618.patch Type: text/x-patch Size: 29069 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Fri Oct 11 10:15:14 2019 From: llvm-commits at lists.llvm.org (Hubert Tong via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 17:15:14 +0000 (UTC) Subject: [PATCH] D66969: Output XCOFF object text section header and symbol entry for program code In-Reply-To: References: Message-ID: <404b961be30f688fe23172339e548d8b@localhost.localdomain> hubert.reinterpretcast accepted this revision. hubert.reinterpretcast added a comment. This revision is now accepted and ready to land. LGTM with minor changes that can be made on the check-in. ================ Comment at: llvm/lib/MC/XCOFFObjectWriter.cpp:360 + writeSymbolName(SymbolRef.getName()); + assert(CSectionRef.Address + SymbolOffset <= UINT32_MAX && + "Symbol address overflows."); ---------------- This would not be sufficient to avoid overflow if `SymbolOffset` is less than `UINT32_MAX` away from `UINT64_MAX`, use: `SymbolOffset <= UINT32_MAX - CSectionRef.Address`. ================ Comment at: llvm/test/CodeGen/PowerPC/aix-xcoff-common.ll:83 ; SYMS-NEXT: Symbols [ -; SYMS-NEXT: Symbol { -; SYMS-NEXT: Index: [[#Index:]] -; SYMS-NEXT: Name: a +; SYMS: Symbol {{{[[:space:]] *}}Index: [[#Index:]]{{[[:space:]] *}}Name: a ; SYMS-NEXT: Value (RelocatableAddress): 0x0 ---------------- Please use `{{[{]` instead of `{{{` to avoid ambiguity as to where the regular expression starts and, if the regular expression starts with the first `{{`, to avoid the undefined results indicated by POSIX regarding `{` as the first character of an ERE. ================ Comment at: llvm/test/CodeGen/PowerPC/aix-xcoff-lcomm.ll:54 ; SYMS-NEXT: Symbols [ -; SYMS-NEXT: Symbol { -; SYMS-NEXT: Index: [[#Index:]] -; SYMS-NEXT: Name: a +; SYMS: Symbol {{{[[:space:]] *}}Index: [[#Index:]]{{[[:space:]] *}}Name: a ; SYMS-NEXT: Value (RelocatableAddress): 0x0 ---------------- Same comment. 
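The overflow concern raised on XCOFFObjectWriter.cpp:360 above can be shown with a short standalone sketch. It assumes both operands are 64-bit unsigned values; the function names are hypothetical and exist only to contrast the two forms of the check.

```cpp
#include <cstdint>

// Naive form: the 64-bit addition can wrap around before the comparison,
// so a huge offset may appear to fit.
static bool naiveFits(uint64_t address, uint64_t offset) {
  return address + offset <= UINT32_MAX;
}

// Rearranged form suggested in the review: no addition, so nothing wraps.
// The first test guards the subtraction; that guard is an assumption of
// this sketch for the case where `address` itself is wider than 32 bits.
static bool safeFits(uint64_t address, uint64_t offset) {
  return address <= UINT32_MAX && offset <= UINT32_MAX - address;
}
```

With `address = 16` and `offset = UINT64_MAX - 7`, the sum wraps to 8, so `naiveFits` returns true even though the real sum is far out of range, while `safeFits` returns false.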
Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D66969/new/ https://reviews.llvm.org/D66969 From llvm-commits at lists.llvm.org Fri Oct 11 10:24:01 2019 From: llvm-commits at lists.llvm.org (serge via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 17:24:01 +0000 (UTC) Subject: [PATCH] D68720: Support -fstack-clash-protection for x86 In-Reply-To: References: Message-ID: serge-sans-paille updated this revision to Diff 224606. serge-sans-paille added a comment. Ensure the distance between two probes is at most PAGE_SIZE. Use calls as free probes. Fix alignment for dynamic alloca. This passes the llvm-test suite and, thanks to the use of calls, no inserted probes are needed to compile sqlite! Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68720/new/ https://reviews.llvm.org/D68720 Files: clang/docs/ReleaseNotes.rst clang/include/clang/Basic/CodeGenOptions.def clang/include/clang/Basic/DiagnosticFrontendKinds.td clang/include/clang/Basic/TargetInfo.h clang/include/clang/Driver/CC1Options.td clang/include/clang/Driver/Options.td clang/lib/Basic/Targets/X86.h clang/lib/CodeGen/CGStmt.cpp clang/lib/CodeGen/CodeGenModule.cpp clang/lib/Driver/ToolChains/Clang.cpp clang/lib/Frontend/CompilerInvocation.cpp clang/test/CodeGen/stack-clash-protection.c clang/test/Driver/stack-clash-protection.c llvm/docs/ReleaseNotes.rst llvm/include/llvm/CodeGen/TargetLowering.h llvm/lib/Target/X86/X86CallFrameOptimization.cpp llvm/lib/Target/X86/X86FrameLowering.cpp llvm/lib/Target/X86/X86FrameLowering.h llvm/lib/Target/X86/X86ISelLowering.cpp llvm/lib/Target/X86/X86ISelLowering.h llvm/lib/Target/X86/X86InstrCompiler.td llvm/lib/Target/X86/X86InstrInfo.td llvm/test/CodeGen/X86/stack-clash-dynamic-alloca.ll llvm/test/CodeGen/X86/stack-clash-medium-natural-probes.ll llvm/test/CodeGen/X86/stack-clash-medium.ll llvm/test/CodeGen/X86/stack-clash-no-free-probe.ll llvm/test/CodeGen/X86/stack-clash-small.ll -------------- next part 
-------------- A non-text attachment was scrubbed... Name: D68720.224606.patch Type: text/x-patch Size: 36852 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Fri Oct 11 10:33:20 2019 From: llvm-commits at lists.llvm.org (Hubert Tong via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 17:33:20 +0000 (UTC) Subject: [PATCH] D67008: [llvm-readobj][XCOFF]implement parsing relocation information for 32-bit xcoff objectfile In-Reply-To: References: Message-ID: hubert.reinterpretcast added a comment. I've marked comments (all minor) that have not yet been addressed. ================ Comment at: llvm/include/llvm/BinaryFormat/XCOFF.h:192 + ///< displacement that is the difference between the address of + ///< the refrenced symbol and the address of the refrenced branch + ///< instruction. References a non modifiable instruction. ---------------- hubert.reinterpretcast wrote: > The typo, "refrenced", is still here. Still not seeing the change: https://reviews.llvm.org/D67008?id=222860#inline-616695 ================ Comment at: llvm/include/llvm/BinaryFormat/XCOFF.h:193 + ///< the refrenced symbol and the address of the refrenced branch + ///< instruction. References a non modifiable instruction. + R_RBA = 0x18, ///< Branch absolute relocation. Similar to the R_BA but ---------------- hubert.reinterpretcast wrote: > Still missing the hyphen for "non-modifiable". Still not seeing the change: https://reviews.llvm.org/D67008?id=222860#inline-616694 ================ Comment at: llvm/include/llvm/BinaryFormat/XCOFF.h:194 + ///< instruction. References a non modifiable instruction. + R_RBA = 0x18, ///< Branch absolute relocation. Similar to the R_BA but + ///< references a modifiable instruction. ---------------- hubert.reinterpretcast wrote: > Either remove the "the" for this line or add "relocation" after "R_BA". 
Still not seeing the change: https://reviews.llvm.org/D67008?id=222860#inline-616692 ================ Comment at: llvm/include/llvm/Object/XCOFFObjectFile.h:165 +}; class XCOFFObjectFile : public ObjectFile { private: ---------------- Blank line between the class definitions please. ================ Comment at: llvm/lib/Object/XCOFFObjectFile.cpp:556 } +// In an XCOFF32 file, if more than 65,534 relocation entries are required, +// the field value will be 65535, and an STYP_OVRFLO section header will ---------------- hubert.reinterpretcast wrote: > DiggerLin wrote: > > hubert.reinterpretcast wrote: > > > We can reduce the amount of background for the comment to what is necessary to understand the code here: > > > In an XCOFF32 file, when the field value is 65535, then an STYP_OVRFLO section header contains the actual count of relocation entries in the s_paddr field. STYP_OVRFLO headers contain the section index of their corresponding sections as their raw "NumberOfRelocations" field value. > > added. > I am not seeing the change. 
Still not seeing the change: https://reviews.llvm.org/D67008?id=222237#inline-613037 ================ Comment at: llvm/lib/Object/XCOFFObjectFile.cpp:593 + Sec.FileOffsetToRelocationInfo); + auto RelocEntNumOrErr = getLogicalNumberOfRelocationEntries(Sec); + if (Error E = RelocEntNumOrErr.takeError()) ---------------- hubert.reinterpretcast wrote: > Suggestion: `NumRelocEntriesOrErr` Still not seeing the change: https://reviews.llvm.org/D67008?id=222860#inline-616702 ================ Comment at: llvm/lib/Object/XCOFFObjectFile.cpp:597 + + uint32_t RelocEntNum = RelocEntNumOrErr.get(); + ---------------- hubert.reinterpretcast wrote: > Suggestion: `NumRelocEntries` Still not seeing the change: https://reviews.llvm.org/D67008?id=222860#inline-616704 Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67008/new/ https://reviews.llvm.org/D67008 From llvm-commits at lists.llvm.org Fri Oct 11 10:42:24 2019 From: llvm-commits at lists.llvm.org (Nico Weber via llvm-commits) Date: Fri, 11 Oct 2019 17:42:24 -0000 Subject: [llvm] r374575 - gn build: (manually) merge r374110 Message-ID: <20191011174224.6873B89090@lists.llvm.org> Author: nico Date: Fri Oct 11 10:42:24 2019 New Revision: 374575 URL: http://llvm.org/viewvc/llvm-project?rev=374575&view=rev Log: gn build: (manually) merge r374110 Modified: llvm/trunk/utils/gn/secondary/clang/test/BUILD.gn Modified: llvm/trunk/utils/gn/secondary/clang/test/BUILD.gn URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/gn/secondary/clang/test/BUILD.gn?rev=374575&r1=374574&r2=374575&view=diff ============================================================================== --- llvm/trunk/utils/gn/secondary/clang/test/BUILD.gn (original) +++ llvm/trunk/utils/gn/secondary/clang/test/BUILD.gn Fri Oct 11 10:42:24 2019 @@ -144,6 +144,7 @@ group("test") { "//llvm/tools/llvm-config", "//llvm/tools/llvm-dis", "//llvm/tools/llvm-dwarfdump", + "//llvm/tools/llvm-ifs", "//llvm/tools/llvm-lto", "//llvm/tools/llvm-lto2", 
"//llvm/tools/llvm-modextract", From llvm-commits at lists.llvm.org Fri Oct 11 10:42:30 2019 From: llvm-commits at lists.llvm.org (Jordan Rupprecht via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 17:42:30 +0000 (UTC) Subject: [PATCH] D68848: [llvm-objdump] Use a counter for llvm-objdump -h instead of the section index. In-Reply-To: References: Message-ID: <505cb2f223e23119903fe3d06601ece4@localhost.localdomain> rupprecht updated this revision to Diff 224632. rupprecht marked 9 inline comments as done. rupprecht added a comment. - Add/update several comments - Use simplified pointer check - Avoid auto as the returned type is not obvious Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68848/new/ https://reviews.llvm.org/D68848 Files: lld/test/ELF/got32-i386.s lld/test/ELF/got32x-i386.s llvm/test/tools/llvm-objdump/section-headers.test llvm/test/tools/llvm-objdump/wasm.txt llvm/test/tools/llvm-objdump/xcoff-section-headers.test llvm/tools/llvm-objdump/llvm-objdump.cpp llvm/tools/llvm-objdump/llvm-objdump.h -------------- next part -------------- A non-text attachment was scrubbed... Name: D68848.224632.patch Type: text/x-patch Size: 17037 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Fri Oct 11 10:42:31 2019 From: llvm-commits at lists.llvm.org (Jordan Rupprecht via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 17:42:31 +0000 (UTC) Subject: [PATCH] D68848: [llvm-objdump] Use a counter for llvm-objdump -h instead of the section index. In-Reply-To: References: Message-ID: <0d6bb6331d4a9dc2807caebbbf2b35b0@localhost.localdomain> rupprecht added inline comments. 
================ Comment at: llvm/tools/llvm-objdump/llvm-objdump.cpp:345 -static bool shouldKeep(object::SectionRef S) { +struct FilterResult { + bool Keep; ---------------- grimar wrote: > It's common to wrap such things (helper types) into anonymous namespaces: > I think you can also add a helper function too: > > ``` > namespace { > struct FilterResult { > bool Keep; > bool IncrementIndex; > }; > > FilterResult checkSectionFilter(object::SectionRef S) { > ... > } > }; > ``` > > > Wrapped just the struct per http://llvm.org/docs/CodingStandards.html#anonymous-namespaces ================ Comment at: llvm/tools/llvm-objdump/llvm-objdump.cpp:352 if (FilterSections.empty()) - return true; + return {/*Keep=*/true, /*Increment=*/true}; ---------------- grimar wrote: > I think we use a full variable name usually, i.e. Increment -> IncrementIndex Sorry, I originally had the variable named `Increment` but forgot to update these comments. ================ Comment at: llvm/tools/llvm-objdump/llvm-objdump.cpp:1699 + uint64_t Idx; + for (const SectionRef &Section : ToolSectionFilter(*Obj, &Idx)) { StringRef Name = unwrapOrError(Section.getName(), Obj->getFileName()); ---------------- grimar wrote: > Looking at this, > should `ToolSectionFilter` just return `[(SectionRef&)Ref, (uint64_t )Index]` struct/pair instead? It seems could make the whole logic simper. That's one of the paths I considered while writing this patch. A couple thoughts: 1) This is the only place that needs this counter value, and loop iteration in every other case (7 other places in this file, 1 in the MachO dumper) would now be a little more complicated even when callers don't care about the index, e.g. 
it would now be: ``` for (const SomeWrapperType &Foo : ToolSectionFilter(*Obj)) { const SectionRef &Section = Foo.Section; ``` 2) llvm has `make_filter_range` which probably did not exist at the time this code was first written, and it would be nice to completely remove `SectionFilter` and `SectionFilterIterator` from llvm-objdump.h in favor of those standard llvm libraries, but AIUI in order to use that, the return type would need to be `SectionRef`, not some wrapper type. (I'm trying to do that in a separate branch, but I'm dealing with template woes). 3) But on the plus side, it does avoid out parameters which would be very nice. I think 1&2 outweigh 3, so I'm leaning towards this approach. But I'm still exploring alternatives. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68848/new/ https://reviews.llvm.org/D68848 From llvm-commits at lists.llvm.org Fri Oct 11 10:42:31 2019 From: llvm-commits at lists.llvm.org (Jordan Rupprecht via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 17:42:31 +0000 (UTC) Subject: [PATCH] D68848: [llvm-objdump] Use a counter for llvm-objdump -h instead of the section index. In-Reply-To: References: Message-ID: <9ed2dcf877715afcdb4620ba1b62ec69@localhost.localdomain> rupprecht updated this revision to Diff 224633. rupprecht added a comment. Rebase against D68730 again Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68848/new/ https://reviews.llvm.org/D68848 Files: llvm/test/tools/llvm-objdump/xcoff-section-headers.test llvm/tools/llvm-objdump/llvm-objdump.cpp llvm/tools/llvm-objdump/llvm-objdump.h -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68848.224633.patch Type: text/x-patch Size: 7169 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Fri Oct 11 10:54:15 2019 From: llvm-commits at lists.llvm.org (Simon Pilgrim via llvm-commits) Date: Fri, 11 Oct 2019 17:54:15 -0000 Subject: [llvm] r374579 - [X86][SSE] Add support for v4i8 add reduction Message-ID: <20191011175415.6D02392577@lists.llvm.org> Author: rksimon Date: Fri Oct 11 10:54:15 2019 New Revision: 374579 URL: http://llvm.org/viewvc/llvm-project?rev=374579&view=rev Log: [X86][SSE] Add support for v4i8 add reduction Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp llvm/trunk/test/CodeGen/X86/vector-reduce-add.ll Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=374579&r1=374578&r2=374579&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp (original) +++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Fri Oct 11 10:54:15 2019 @@ -36239,10 +36239,15 @@ static SDValue combineReductionToHorizon SDLoc DL(ExtElt); - if (VecVT == MVT::v8i8) { + // vXi8 reduction - sub 128-bit vector. + if (VecVT == MVT::v4i8 || VecVT == MVT::v8i8) { + // Pad with zero. + if (VecVT == MVT::v4i8) + Rdx = DAG.getNode(ISD::CONCAT_VECTORS, DL, MVT::v8i8, Rdx, + DAG.getConstant(0, DL, VecVT)); // Pad with undef. 
Rdx = DAG.getNode(ISD::CONCAT_VECTORS, DL, MVT::v16i8, Rdx, - DAG.getUNDEF(VecVT)); + DAG.getUNDEF(MVT::v8i8)); Rdx = DAG.getNode(X86ISD::PSADBW, DL, MVT::v2i64, Rdx, DAG.getConstant(0, DL, MVT::v16i8)); Rdx = DAG.getBitcast(MVT::v16i8, Rdx); Modified: llvm/trunk/test/CodeGen/X86/vector-reduce-add.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/vector-reduce-add.ll?rev=374579&r1=374578&r2=374579&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/vector-reduce-add.ll (original) +++ llvm/trunk/test/CodeGen/X86/vector-reduce-add.ll Fri Oct 11 10:54:15 2019 @@ -1029,44 +1029,36 @@ define i8 @test_v2i8_load(<2 x i8>* %p) define i8 @test_v4i8(<4 x i8> %a0) { ; SSE2-LABEL: test_v4i8: ; SSE2: # %bb.0: -; SSE2-NEXT: movdqa %xmm0, %xmm1 -; SSE2-NEXT: psrld $16, %xmm1 -; SSE2-NEXT: paddb %xmm0, %xmm1 -; SSE2-NEXT: movdqa %xmm1, %xmm0 -; SSE2-NEXT: psrlw $8, %xmm0 -; SSE2-NEXT: paddb %xmm1, %xmm0 +; SSE2-NEXT: pxor %xmm1, %xmm1 +; SSE2-NEXT: punpckldq {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1] +; SSE2-NEXT: psadbw %xmm1, %xmm0 ; SSE2-NEXT: movd %xmm0, %eax ; SSE2-NEXT: # kill: def $al killed $al killed $eax ; SSE2-NEXT: retq ; ; SSE41-LABEL: test_v4i8: ; SSE41: # %bb.0: -; SSE41-NEXT: movdqa %xmm0, %xmm1 -; SSE41-NEXT: psrld $16, %xmm1 -; SSE41-NEXT: paddb %xmm0, %xmm1 -; SSE41-NEXT: movdqa %xmm1, %xmm0 -; SSE41-NEXT: psrlw $8, %xmm0 -; SSE41-NEXT: paddb %xmm1, %xmm0 -; SSE41-NEXT: pextrb $0, %xmm0, %eax +; SSE41-NEXT: pmovzxdq {{.*#+}} xmm0 = xmm0[0],zero,xmm0[1],zero +; SSE41-NEXT: pxor %xmm1, %xmm1 +; SSE41-NEXT: psadbw %xmm0, %xmm1 +; SSE41-NEXT: pextrb $0, %xmm1, %eax ; SSE41-NEXT: # kill: def $al killed $al killed $eax ; SSE41-NEXT: retq ; ; AVX-LABEL: test_v4i8: ; AVX: # %bb.0: -; AVX-NEXT: vpsrld $16, %xmm0, %xmm1 -; AVX-NEXT: vpaddb %xmm1, %xmm0, %xmm0 -; AVX-NEXT: vpsrlw $8, %xmm0, %xmm1 -; AVX-NEXT: vpaddb %xmm1, %xmm0, %xmm0 +; AVX-NEXT: vpmovzxdq {{.*#+}} xmm0 = 
xmm0[0],zero,xmm0[1],zero +; AVX-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; AVX-NEXT: vpsadbw %xmm1, %xmm0, %xmm0 ; AVX-NEXT: vpextrb $0, %xmm0, %eax ; AVX-NEXT: # kill: def $al killed $al killed $eax ; AVX-NEXT: retq ; ; AVX512-LABEL: test_v4i8: ; AVX512: # %bb.0: -; AVX512-NEXT: vpsrld $16, %xmm0, %xmm1 -; AVX512-NEXT: vpaddb %xmm1, %xmm0, %xmm0 -; AVX512-NEXT: vpsrlw $8, %xmm0, %xmm1 -; AVX512-NEXT: vpaddb %xmm1, %xmm0, %xmm0 +; AVX512-NEXT: vpmovzxdq {{.*#+}} xmm0 = xmm0[0],zero,xmm0[1],zero +; AVX512-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; AVX512-NEXT: vpsadbw %xmm1, %xmm0, %xmm0 ; AVX512-NEXT: vpextrb $0, %xmm0, %eax ; AVX512-NEXT: # kill: def $al killed $al killed $eax ; AVX512-NEXT: retq @@ -1078,36 +1070,28 @@ define i8 @test_v4i8_load(<4 x i8>* %p) ; SSE2-LABEL: test_v4i8_load: ; SSE2: # %bb.0: ; SSE2-NEXT: movd {{.*#+}} xmm0 = mem[0],zero,zero,zero -; SSE2-NEXT: movdqa %xmm0, %xmm1 -; SSE2-NEXT: psrld $16, %xmm1 -; SSE2-NEXT: paddb %xmm0, %xmm1 -; SSE2-NEXT: movdqa %xmm1, %xmm0 -; SSE2-NEXT: psrlw $8, %xmm0 -; SSE2-NEXT: paddb %xmm1, %xmm0 -; SSE2-NEXT: movd %xmm0, %eax +; SSE2-NEXT: pxor %xmm1, %xmm1 +; SSE2-NEXT: psadbw %xmm0, %xmm1 +; SSE2-NEXT: movd %xmm1, %eax ; SSE2-NEXT: # kill: def $al killed $al killed $eax ; SSE2-NEXT: retq ; ; SSE41-LABEL: test_v4i8_load: ; SSE41: # %bb.0: ; SSE41-NEXT: movd {{.*#+}} xmm0 = mem[0],zero,zero,zero -; SSE41-NEXT: movdqa %xmm0, %xmm1 -; SSE41-NEXT: psrld $16, %xmm1 -; SSE41-NEXT: paddb %xmm0, %xmm1 -; SSE41-NEXT: movdqa %xmm1, %xmm0 -; SSE41-NEXT: psrlw $8, %xmm0 -; SSE41-NEXT: paddb %xmm1, %xmm0 -; SSE41-NEXT: pextrb $0, %xmm0, %eax +; SSE41-NEXT: pmovzxdq {{.*#+}} xmm0 = xmm0[0],zero,xmm0[1],zero +; SSE41-NEXT: pxor %xmm1, %xmm1 +; SSE41-NEXT: psadbw %xmm0, %xmm1 +; SSE41-NEXT: pextrb $0, %xmm1, %eax ; SSE41-NEXT: # kill: def $al killed $al killed $eax ; SSE41-NEXT: retq ; ; AVX-LABEL: test_v4i8_load: ; AVX: # %bb.0: ; AVX-NEXT: vmovd {{.*#+}} xmm0 = mem[0],zero,zero,zero -; AVX-NEXT: vpsrld $16, %xmm0, %xmm1 -; AVX-NEXT: 
vpaddb %xmm1, %xmm0, %xmm0 -; AVX-NEXT: vpsrlw $8, %xmm0, %xmm1 -; AVX-NEXT: vpaddb %xmm1, %xmm0, %xmm0 +; AVX-NEXT: vpmovzxdq {{.*#+}} xmm0 = xmm0[0],zero,xmm0[1],zero +; AVX-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; AVX-NEXT: vpsadbw %xmm1, %xmm0, %xmm0 ; AVX-NEXT: vpextrb $0, %xmm0, %eax ; AVX-NEXT: # kill: def $al killed $al killed $eax ; AVX-NEXT: retq @@ -1115,10 +1099,9 @@ define i8 @test_v4i8_load(<4 x i8>* %p) ; AVX512-LABEL: test_v4i8_load: ; AVX512: # %bb.0: ; AVX512-NEXT: vmovd {{.*#+}} xmm0 = mem[0],zero,zero,zero -; AVX512-NEXT: vpsrld $16, %xmm0, %xmm1 -; AVX512-NEXT: vpaddb %xmm1, %xmm0, %xmm0 -; AVX512-NEXT: vpsrlw $8, %xmm0, %xmm1 -; AVX512-NEXT: vpaddb %xmm1, %xmm0, %xmm0 +; AVX512-NEXT: vpmovzxdq {{.*#+}} xmm0 = xmm0[0],zero,xmm0[1],zero +; AVX512-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; AVX512-NEXT: vpsadbw %xmm1, %xmm0, %xmm0 ; AVX512-NEXT: vpextrb $0, %xmm0, %eax ; AVX512-NEXT: # kill: def $al killed $al killed $eax ; AVX512-NEXT: retq From llvm-commits at lists.llvm.org Fri Oct 11 10:51:55 2019 From: llvm-commits at lists.llvm.org (Johannes Doerfert via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 17:51:55 +0000 (UTC) Subject: [PATCH] D68819: [Utils] Allow update_test_checks to check function arguments In-Reply-To: References: Message-ID: <411c7c31f2ee42e4a09080ccd327e343@localhost.localdomain> jdoerfert added a comment. In D68819#1705424 , @greened wrote: > In D68819#1704962 , @jdoerfert wrote: > > > We can, or should, combine D68153 and this, either in one or two patches. > > > Sounds good. Do you think D68153 should operate under the `--function-signature` flag, under a different flag or always include `define` in the pattern (meaning all tests will change when the tool is re-run)? I think D68153 is a small enough change and at the same time helpful enough to run it always. Adding function signature is not for everyone so I added the flag. I can put it under the flag if people think it should live there though. 
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68819/new/ https://reviews.llvm.org/D68819 From llvm-commits at lists.llvm.org Fri Oct 11 11:01:16 2019 From: llvm-commits at lists.llvm.org (Matt Arsenault via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 18:01:16 +0000 (UTC) Subject: [PATCH] D68149: LiveIntervals: Fix handleMoveUp with subreg def moving across a def In-Reply-To: References: Message-ID: arsenm added a comment. ping CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68149/new/ https://reviews.llvm.org/D68149 From llvm-commits at lists.llvm.org Fri Oct 11 11:01:18 2019 From: llvm-commits at lists.llvm.org (Sid Manning via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 18:01:18 +0000 (UTC) Subject: [PATCH] D68875: [lld] Check for branch range overflows. In-Reply-To: References: Message-ID: <218c86dd577392c474dc0851dcca482d@localhost.localdomain> sidneym updated this revision to Diff 224637. sidneym added a comment. Update test case. 
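As a reading aid for the range checks in this revision: the expected diagnostics in `hexagon-verify.s` (for example `1028 is not in [-1024, 1023]` for `R_HEX_B9_PCREL`) suggest that `checkInt(loc, val, 11, type)` accepts exactly the values that fit in an 11-bit signed integer. The sketch below models that test in standalone C++. `fitsSigned` and `demoHexagonRanges` are hypothetical names introduced here, and the whole block is an assumption about the behaviour, not lld's actual implementation.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not lld's code): returns true if `value` fits in a
// signed two's-complement integer that is `bits` wide, i.e. if it lies in
// [-(2^(bits-1)), 2^(bits-1) - 1]. Assumes 1 <= bits <= 63.
constexpr bool fitsSigned(int64_t value, unsigned bits) {
  return value >= -(int64_t{1} << (bits - 1)) &&
         value <= (int64_t{1} << (bits - 1)) - 1;
}

// Replays the four out-of-range values named in the test's CHECK lines,
// using the bit widths passed to checkInt in the diff (11, 15, 17, 22).
inline void demoHexagonRanges() {
  assert(!fitsSigned(1028, 11));     // R_HEX_B9_PCREL:  [-1024, 1023]
  assert(!fitsSigned(16388, 15));    // R_HEX_B13_PCREL: [-16384, 16383]
  assert(!fitsSigned(65540, 17));    // R_HEX_B15_PCREL: [-65536, 65535]
  assert(!fitsSigned(8388612, 22));  // R_HEX_B22_PCREL: [-2097152, 2097151]
  assert(fitsSigned(1023, 11) && fitsSigned(-1024, 11));  // boundary values
}
```

The widths are larger than the relocation names suggest (11 rather than 9, and so on), presumably because the low two bits of the byte offset are dropped by the `val >> 2` in each case; that reading is inferred from the diff rather than stated in the review.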
Repository: rLLD LLVM Linker CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68875/new/ https://reviews.llvm.org/D68875 Files: lld/ELF/Arch/Hexagon.cpp lld/test/ELF/hexagon-verify.s Index: lld/test/ELF/hexagon-verify.s =================================================================== --- /dev/null +++ lld/test/ELF/hexagon-verify.s @@ -0,0 +1,37 @@ +# REQUIRES: hexagon +# RUN: llvm-mc -filetype=obj -triple=hexagon-unknown-elf %s -o %t.o +# RUN: not ld.lld %t.o -o %t 2>&1 | FileCheck --implicit-check-not "out of range" %s + +#CHECK: relocation R_HEX_B9_PCREL out of range: 1028 is not in [-1024, 1023] +#CHECK-NEXT: relocation R_HEX_B13_PCREL out of range: 16388 is not in [-16384, 16383] +#CHECK-NEXT: relocation R_HEX_B15_PCREL out of range: 65540 is not in [-65536, 65535] +#CHECK-NEXT: relocation R_HEX_B22_PCREL out of range: 8388612 is not in [-2097152, 2097151] + + + .globl _start + .type _start, @function +_start: + +.section _pc9, "ax" +{r0 = #0; jump #pc9} +.space (1<<10) +.section b9, "ax" +pc9: + +.section _pc13, "ax" +if (r0==#0) jump:t #pc13 +.space (1<<14) +.section b13, "ax" +pc13: + +.section _pc15, "ax" +if (p0) jump #pc15 +.space (1<<16) +.section b15, "ax" +pc15: + +.section _pc22, "ax" +jump #pc22 +.space (1<<23) +.section b22, "ax" +pc22: Index: lld/ELF/Arch/Hexagon.cpp =================================================================== --- lld/ELF/Arch/Hexagon.cpp +++ lld/ELF/Arch/Hexagon.cpp @@ -242,15 +242,18 @@ or32le(loc, applyMask(0x0fff3fff, val >> 6)); break; case R_HEX_B9_PCREL: + checkInt(loc, val, 11, type); or32le(loc, applyMask(0x003000fe, val >> 2)); break; case R_HEX_B9_PCREL_X: or32le(loc, applyMask(0x003000fe, val & 0x3f)); break; case R_HEX_B13_PCREL: + checkInt(loc, val, 15, type); or32le(loc, applyMask(0x00202ffe, val >> 2)); break; case R_HEX_B15_PCREL: + checkInt(loc, val, 17, type); or32le(loc, applyMask(0x00df20fe, val >> 2)); break; case R_HEX_B15_PCREL_X: @@ -258,6 +261,7 @@ break; case R_HEX_B22_PCREL: case 
R_HEX_PLT_B22_PCREL: + checkInt(loc, val, 22, type); or32le(loc, applyMask(0x1ff3ffe, val >> 2)); break; case R_HEX_B22_PCREL_X: -------------- next part -------------- A non-text attachment was scrubbed... Name: D68875.224637.patch Type: text/x-patch Size: 2032 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Fri Oct 11 11:19:32 2019 From: llvm-commits at lists.llvm.org (David Blaikie via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 18:19:32 +0000 (UTC) Subject: [PATCH] D68465: [DebugInfo] Trim call-clobbered location list entries when tuning for GDB In-Reply-To: References: Message-ID: <3680868f3becff5347c0f9331927c371@localhost.localdomain> dblaikie added a comment. > I don't think that case is representative for the issue with the offsets in the address pool. Since the variable is only described by a call-clobbered register, we only have to end a location list entry at `$return_addr - 1`, not start a location list entry at `$return_addr - 1`. I think we can get away without needing offsets in the address pool for such cases. > > The C reproducer that the call-clobbered-split.mir test case is based on has an aggregate variable whose elements are described by a call-clobbered register respectively a constant, so we want to start a location list entry at `$return_addr - 1` for which only the constant element is described. > > Here is that C reproducer: > > extern void fn2(int *); > > void fn1() { > int data[] = {1, 2}; > int *ptrs[] = {0, &data[1]}; > fn2(ptrs[1]); > ptrs[1] = 0; > } > > > If I compile that with GCC 7.4.0 using the following command line: > > $ gcc-7 -O1 -g -gdwarf-5 -gsplit-dwarf -S -o - foo.c > > > GCC emits an address pool entry with an offset: > > [...] > .Ldebug_addr0: > .quad .LVL1 > .quad .LVL2 > .quad .LVL3 > .quad .LFB0 > .quad .LVL2-1 <---------- > .quad fn2 > .quad __stack_chk_fail > .quad .LVL0 > > > So this patch seems to behave similarly as, admittedly a quite old version of, GCC. 
I'll build the latest GCC release and try the same with that. Yeah, GCC 8.1 has the same behavior. But I think we can avoid that in LLVM with some of the loclist changes I'm working on - we should only ever be using the start of a function (not necessarily the current function, if there are multiple functions in the same section) as a base address in location lists. So locations like this should be rendered as offset pairs relative to that base address & should always be positive, even if you subtract 1. I'm wondering whether it might be worth deferring all this design discussion until after there's more clarity from GDB - since I expect that's the right path forward & this design discussion may be unnecessary. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68465/new/ https://reviews.llvm.org/D68465 From llvm-commits at lists.llvm.org Fri Oct 11 11:19:32 2019 From: llvm-commits at lists.llvm.org (Chris Bieneman via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 18:19:32 +0000 (UTC) Subject: [PATCH] D68833: [CMake] Re-order runtimes in the order of dependencies In-Reply-To: References: Message-ID: <65365b4b6fe95fabdae8512b67bf9c1d@localhost.localdomain> beanz added a comment. In D68833#1706104 , @ldionne wrote: > CMake tracks dependencies between targets, but not between directories. If the CMakeLists.txt in some directory (e.g. `/libcxx/`) needs a target defined in another directory (e.g. `/libcxxabi/`), one has to make sure that `libcxxabi`'s `CMakeLists.txt` is included before `libcxx`'s `CMakeLists.txt`. This isn't new or vexing, IMO. If that is what this patch ensures (I don't know the runtimes build very well), I think this is good. The problem is that libcxx depends on compiler-rt, and compiler-rt depends on libcxx. Which means we have circular dependencies. The solution is to migrate to CMake usage patterns that don't require strict ordering. Specifically the use of generator expressions in this (and many other situations) is warranted. 
> I'd like to question that affirmation. What is it based on? The fact that we have circular dependencies at the project level (not the target level). The `if (TARGET ...)` feature in CMake requires that targets are processed in specific orders and cannot deal with projects that have circular dependency relationships. > IMO, the `HAVE_${runtime}` variables are the weird ones here. The normal LLVM monorepo build orders the directories correctly, and we don't run into that issue. I'm not sure what you mean by this. What I *think* you are referring to is specifying runtime libraries in `LLVM_ENABLE_PROJECTS` which uses the llvm/projects subdirectory for building them. This build flow is also part of the monorepo and uses `LLVM_ENABLE_RUNTIMES`. Fundamentally `LLVM_ENABLE_PROJECTS` is the incorrect way to build runtime libraries because they are not built with the in-tree compiler. While this works for most development cases it is 100% wrong for building and shipping toolchains, and it is questionable practice to have developers building and testing things that aren't representative of what we ship. > I don't see how `HAVE_${runtime}` can get around things like being able to query properties of e.g. `cxxabi_shared` inside the libcxx build before `cxxabi_shared` has been defined. I do support a push towards using generator expressions more, but I don't think generator expressions are a complete solution to this problem. What do you think that generator expressions can't do that is relevant to this problem? > I'd like to see this patch go in under some form so that we can remove the hacky workaround introduced in https://reviews.llvm.org/D68791. You can also remove that hacky workaround using generator expressions. Replacing one hack with another isn't exactly a meaningful transformation of the codebase. 
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68833/new/ https://reviews.llvm.org/D68833 From llvm-commits at lists.llvm.org Fri Oct 11 11:19:32 2019 From: llvm-commits at lists.llvm.org (whitequark via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 18:19:32 +0000 (UTC) Subject: [PATCH] D60902: [OCaml] Add OCaml APIs to access DebugInfo In-Reply-To: References: Message-ID: whitequark added a comment. > Would it be too strange to include such patches here? Yes. The patches should either be removed or investigated and applied in tree. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D60902/new/ https://reviews.llvm.org/D60902 From llvm-commits at lists.llvm.org Fri Oct 11 11:19:32 2019 From: llvm-commits at lists.llvm.org (Ulrich Weigand via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 18:19:32 +0000 (UTC) Subject: [PATCH] D67105: [TargetLowering] Fix another potential FPE in expandFP_TO_UINT In-Reply-To: References: Message-ID: <954eda160c907389efe7ada38054c850@localhost.localdomain> uweigand updated this revision to Diff 224642. uweigand marked an inline comment as done. uweigand added a comment. Rebase against current mainline -- Ping CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67105/new/ https://reviews.llvm.org/D67105 Files: lib/CodeGen/SelectionDAG/TargetLowering.cpp test/CodeGen/SystemZ/fp-strict-conv-10.ll test/CodeGen/SystemZ/fp-strict-conv-12.ll test/CodeGen/X86/fp-intrinsics.ll test/CodeGen/X86/vector-constrained-fp-intrinsics.ll -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D67105.224642.patch Type: text/x-patch Size: 37606 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Fri Oct 11 11:30:58 2019 From: llvm-commits at lists.llvm.org (Evandro Menezes via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 18:30:58 +0000 (UTC) Subject: [PATCH] D67199: [InstCombine] Expand the simplification of log() In-Reply-To: References: Message-ID: <01cc33e5f03a2e52c4035a23f50ae535@localhost.localdomain> evandro added a comment. For the record, the issue reported in PR43617 was fixed by rL374243 and a test case was added by rL374453 . Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67199/new/ https://reviews.llvm.org/D67199 From llvm-commits at lists.llvm.org Fri Oct 11 11:40:24 2019 From: llvm-commits at lists.llvm.org (Xiangling Liao via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 18:40:24 +0000 (UTC) Subject: [PATCH] D68341: [AIX] TOC pseudo expansion for 64bit large + 64bit small + 32bit large modes In-Reply-To: References: Message-ID: <1c3fe397a2dd4e1d67a20156a7ce97d3@localhost.localdomain> Xiangling_L marked 7 inline comments as done. Xiangling_L added inline comments. ================ Comment at: llvm/test/CodeGen/PowerPC/lower-globaladdr32-aix-asm.ll:23 +; LARGE: lwz [[REG2:[0-9]+]], LC0 at l([[REG1]]) +; LARGE: lwz [[REG3:[0-9]+]], 0([[REG2]]) +; LARGE: addis [[REG4:[0-9]+]], LC1 at u(2) ---------------- hubert.reinterpretcast wrote: > That the ordering and interleaving of the logical operations involved differ between the various cases seem to indicate that the test is already too complicated. Please reduce the test to use a single memory operand (e.g., store a constant or return the value read). @sfertile I guess your original purpose of creating this testcase is to test if load from TOC works for both `load` and `store`? 
================ Comment at: llvm/test/CodeGen/PowerPC/lower-globaladdr64-aix-asm.ll:26 +; LARGE: ld [[REG4:[0-9]+]], LC1 at l([[REG2]]) +; LARGE: lwz [[REG4:[0-9]+]], 0([[REG3]]) + ---------------- hubert.reinterpretcast wrote: > This does not follow. `REG3` apparently holds the address of the operand for the load, so `REG4` holds the address of the target of the store. We are loading the value to `REG4` though, so its value will be clobbered before we get to the store. Sorry, it was my mistake, it should be `[[REG5:[0-9]+]]` Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68341/new/ https://reviews.llvm.org/D68341 From llvm-commits at lists.llvm.org Fri Oct 11 11:46:38 2019 From: llvm-commits at lists.llvm.org (Louis Dionne via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 18:46:38 +0000 (UTC) Subject: [PATCH] D68833: [CMake] Re-order runtimes in the order of dependencies In-Reply-To: References: Message-ID: <6ca097ebf7c6b29e9a5b0aeb24b01f23@localhost.localdomain> ldionne added a comment. In D68833#1706296 , @beanz wrote: > In D68833#1706104 , @ldionne wrote: > > > CMake tracks dependencies between targets, but not between directories. If the CMakeLists.txt in some directory (e.g. `/libcxx/`) needs a target defined in another directory (e.g. `/libcxxabi/`), one has to make sure that `libcxxabi`'s `CMakeLists.txt` is included before `libcxx`'s `CMakeLists.txt`. This isn't new or vexing, IMO. If that is what this patch ensures (I don't know the runtimes build very well), I think this is good. > > > [...] > I'm not sure what you mean by this. What I *think* you are referring to is specifying runtime libraries in `LLVM_ENABLE_PROJECTS` which uses the llvm/projects subdirectory for building them. Yes, precisely. That's also the currently preferred way of building libc++: https://libcxx.llvm.org/docs/BuildingLibcxx.html > This build flow is also part of the monorepo and uses `LLVM_ENABLE_RUNTIMES`. 
Fundamentally `LLVM_ENABLE_PROJECTS` is the incorrect way to build runtime libraries because they are not built with the in-tree compiler. While this works for most development cases it is 100% wrong for building and shipping toolchains, and it is questionable practice to have developers building and testing things that aren't representative of what we ship. Not everybody ships libc++/libc++abi as part of the toolchain, and for those, building with whatever `CMAKE_CXX_COMPILER` they specify is really the right thing to do. Don't get me wrong, I'm 100% on board that there's value in having this runtime build, however let's not pretend that it's the only correct way to build libc++. > > >> I don't see how `HAVE_${runtime}` can get around things like being able to query properties of e.g. `cxxabi_shared` inside the libcxx build before `cxxabi_shared` has been defined. I do support a push towards using generator expressions more, but I don't think generator expressions are a complete solution to this problem. > > What do you think that generator expressions can't do that is relevant to this problem? Say I need to generate a file based on the properties of a target. I'll need to call `get_target_property` on a target that hasn't been defined yet, and there's no way around that because `file(GENERATE)` does not expand generator expressions. > > >> I'd like to see this patch go in under some form so that we can remove the hacky workaround introduced in https://reviews.llvm.org/D68791. > > You can also remove that hacky workaround using generator expressions. Are you thinking about this? set(libname "$,$,${lib}>") list(APPEND link_libraries "${CMAKE_LINK_LIBRARY_FLAG}${libname}") That is clever, I had not thought about it. If the above workaround works, I don't care about this patch that much. I still think we need to clarify the status of the Runtimes build and document it, unless that's already done and I've missed it (in which case please point it to me). 
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68833/new/ https://reviews.llvm.org/D68833 From llvm-commits at lists.llvm.org Fri Oct 11 11:46:41 2019 From: llvm-commits at lists.llvm.org (Xiangling Liao via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 18:46:41 +0000 (UTC) Subject: [PATCH] D68341: [AIX] TOC pseudo expansion for 64bit large + 64bit small + 32bit large modes In-Reply-To: References: Message-ID: <515c17c45eb146f6c3e055b8e0200f35@localhost.localdomain> Xiangling_L updated this revision to Diff 224645. Xiangling_L marked 2 inline comments as done. Xiangling_L added a comment. correct code format & the testcase Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68341/new/ https://reviews.llvm.org/D68341 Files: llvm/include/llvm/MC/MCExpr.h llvm/lib/MC/MCExpr.cpp llvm/lib/Target/PowerPC/MCTargetDesc/PPCInstPrinter.cpp llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp llvm/lib/Target/PowerPC/PPCInstrInfo.cpp llvm/test/CodeGen/PowerPC/lower-globaladdr32-aix-asm.ll llvm/test/CodeGen/PowerPC/lower-globaladdr64-aix-asm.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D68341.224645.patch Type: text/x-patch Size: 12974 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Fri Oct 11 11:49:41 2019 From: llvm-commits at lists.llvm.org (Hubert Tong via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 18:49:41 +0000 (UTC) Subject: [PATCH] D68341: [AIX] TOC pseudo expansion for 64bit large + 64bit small + 32bit large modes In-Reply-To: References: Message-ID: hubert.reinterpretcast marked an inline comment as done. hubert.reinterpretcast added inline comments. 
================ Comment at: llvm/test/CodeGen/PowerPC/lower-globaladdr32-aix-asm.ll:23 +; LARGE: lwz [[REG2:[0-9]+]], LC0 at l([[REG1]]) +; LARGE: lwz [[REG3:[0-9]+]], 0([[REG2]]) +; LARGE: addis [[REG4:[0-9]+]], LC1 at u(2) ---------------- Xiangling_L wrote: > hubert.reinterpretcast wrote: > > That the ordering and interleaving of the logical operations involved differ between the various cases seem to indicate that the test is already too complicated. Please reduce the test to use a single memory operand (e.g., store a constant or return the value read). > @sfertile I guess your original purpose of creating this testcase is to test if load from TOC works for both `load` and `store`? If that is indeed the intent, then the goal can be achieved with more tests that are simpler. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68341/new/ https://reviews.llvm.org/D68341 From llvm-commits at lists.llvm.org Fri Oct 11 11:49:42 2019 From: llvm-commits at lists.llvm.org (Daniil Fukalov via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 18:49:42 +0000 (UTC) Subject: [PATCH] D68881: [AMDGPU] Improve code size cost model Message-ID: dfukalov created this revision. dfukalov added reviewers: rampitec, arsenm. Herald added subscribers: hiraditya, t-tye, tpr, dstuttard, yaxunl, nhaehnle, wdng, jvesely, kzhuravl. Herald added a project: LLVM. Added estimation for zero size insertelement, extractelement and llvm.fabs operators. Updated inline/unroll parameters default values. Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D68881 Files: llvm/lib/Target/AMDGPU/AMDGPUInline.cpp llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.h llvm/test/Analysis/CostModel/AMDGPU/extractelement.ll llvm/test/Analysis/CostModel/AMDGPU/fabs.ll llvm/test/Analysis/CostModel/AMDGPU/insertelement.ll -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68881.224646.patch Type: text/x-patch Size: 10859 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Fri Oct 11 11:51:08 2019 From: llvm-commits at lists.llvm.org (Craig Topper via llvm-commits) Date: Fri, 11 Oct 2019 11:51:08 -0700 Subject: [llvm] r374579 - [X86][SSE] Add support for v4i8 add reduction In-Reply-To: <20191011175415.6D02392577@lists.llvm.org> References: <20191011175415.6D02392577@lists.llvm.org> Message-ID: Why do the load cases use a movzxdq after the movd? That seems unnecessary. The movd should have generated 0s already. ~Craig On Fri, Oct 11, 2019 at 10:51 AM Simon Pilgrim via llvm-commits < llvm-commits at lists.llvm.org> wrote: > Author: rksimon > Date: Fri Oct 11 10:54:15 2019 > New Revision: 374579 > > URL: http://llvm.org/viewvc/llvm-project?rev=374579&view=rev > Log: > [X86][SSE] Add support for v4i8 add reduction > > Modified: > llvm/trunk/lib/Target/X86/X86ISelLowering.cpp > llvm/trunk/test/CodeGen/X86/vector-reduce-add.ll > > [...] -------------- next part -------------- An HTML attachment was scrubbed... 
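On the question raised in this reply about zero padding before `psadbw`: the lowering in r374579 pads a v4i8 value with zeros up to eight bytes and then sums it against an all-zero operand. A scalar model can show why the upper four bytes of that low 64-bit lane have to be zero, while the upper lane can stay undefined. The code below is only an illustrative sketch; `psadbwLane` and `reduceAddV4i8` are hypothetical names, and it models the arithmetic rather than the DAG combine or the actual instruction selection.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Scalar model of one 64-bit lane of PSADBW: the sum of absolute differences
// of eight unsigned byte pairs. The maximum result is 8 * 255 = 2040.
inline uint16_t psadbwLane(const std::array<uint8_t, 8> &a,
                           const std::array<uint8_t, 8> &b) {
  unsigned sum = 0;
  for (std::size_t i = 0; i < 8; ++i)
    sum += static_cast<unsigned>(std::abs(int(a[i]) - int(b[i])));
  return static_cast<uint16_t>(sum);
}

// Model of the v4i8 add reduction: the four input bytes go in the low half
// of the lane, the upper four bytes are zero, the second operand is all
// zero, and the result is truncated to i8 (the value read back through $al).
inline uint8_t reduceAddV4i8(const std::array<uint8_t, 4> &v) {
  std::array<uint8_t, 8> lane{};  // value-initialised, so bytes 4..7 are zero
  for (std::size_t i = 0; i < 4; ++i)
    lane[i] = v[i];
  const std::array<uint8_t, 8> zero{};
  return static_cast<uint8_t>(psadbwLane(lane, zero));
}
```

With `{1, 2, 3, 4}` the model returns 10, and with `{200, 100, 50, 10}` it returns 104 (360 truncated to eight bits). If bytes 4..7 of the lane held leftover data instead of zeros, they would be added into the sum as well, which is the reason for the explicit zero pad in the v4i8 case. The model says nothing about whether the extend after `movd` in the load cases is redundant; that question is left to the thread.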
URL: From llvm-commits at lists.llvm.org Fri Oct 11 12:06:35 2019 From: llvm-commits at lists.llvm.org (David Blaikie via llvm-commits) Date: Fri, 11 Oct 2019 19:06:35 -0000 Subject: [llvm] r374582 - llvm-dwarfdump: Add verbose printing for debug_loclists Message-ID: <20191011190635.7376C88DA6@lists.llvm.org> Author: dblaikie Date: Fri Oct 11 12:06:35 2019 New Revision: 374582 URL: http://llvm.org/viewvc/llvm-project?rev=374582&view=rev Log: llvm-dwarfdump: Add verbose printing for debug_loclists Modified: llvm/trunk/include/llvm/DebugInfo/DWARF/DWARFDebugLoc.h llvm/trunk/lib/DebugInfo/DWARF/DWARFContext.cpp llvm/trunk/lib/DebugInfo/DWARF/DWARFDebugLoc.cpp llvm/trunk/lib/DebugInfo/DWARF/DWARFDie.cpp llvm/trunk/test/CodeGen/X86/debug-loclists.ll llvm/trunk/test/DebugInfo/X86/dwarfdump-debug-loclists.test llvm/trunk/test/DebugInfo/X86/fission-ranges.ll llvm/trunk/test/tools/llvm-dwarfdump/X86/debug_loclists_startx_length.s Modified: llvm/trunk/include/llvm/DebugInfo/DWARF/DWARFDebugLoc.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/DebugInfo/DWARF/DWARFDebugLoc.h?rev=374582&r1=374581&r2=374582&view=diff ============================================================================== --- llvm/trunk/include/llvm/DebugInfo/DWARF/DWARFDebugLoc.h (original) +++ llvm/trunk/include/llvm/DebugInfo/DWARF/DWARFDebugLoc.h Fri Oct 11 12:06:35 2019 @@ -11,6 +11,7 @@ #include "llvm/ADT/Optional.h" #include "llvm/ADT/SmallVector.h" +#include "llvm/DebugInfo/DIContext.h" #include "llvm/DebugInfo/DWARF/DWARFDataExtractor.h" #include "llvm/DebugInfo/DWARF/DWARFRelocMap.h" #include @@ -42,6 +43,7 @@ public: /// Dump this list on OS. void dump(raw_ostream &OS, uint64_t BaseAddress, bool IsLittleEndian, unsigned AddressSize, const MCRegisterInfo *MRI, DWARFUnit *U, + DIDumpOptions DumpOpts, unsigned Indent) const; }; @@ -58,7 +60,7 @@ private: public: /// Print the location lists found within the debug_loc section. 
- void dump(raw_ostream &OS, const MCRegisterInfo *RegInfo, + void dump(raw_ostream &OS, const MCRegisterInfo *RegInfo, DIDumpOptions DumpOpts, Optional Offset) const; /// Parse the debug_loc section accessible via the 'data' parameter using the @@ -76,9 +78,13 @@ class DWARFDebugLoclists { public: struct Entry { uint8_t Kind; + uint64_t Offset; uint64_t Value0; uint64_t Value1; SmallVector Loc; + void dump(raw_ostream &OS, uint64_t &BaseAddr, bool IsLittleEndian, + unsigned AddressSize, const MCRegisterInfo *MRI, DWARFUnit *U, + DIDumpOptions DumpOpts, unsigned Indent, size_t MaxEncodingStringLength) const; }; struct LocationList { @@ -86,7 +92,7 @@ public: SmallVector Entries; void dump(raw_ostream &OS, uint64_t BaseAddr, bool IsLittleEndian, unsigned AddressSize, const MCRegisterInfo *RegInfo, - DWARFUnit *U, unsigned Indent) const; + DWARFUnit *U, DIDumpOptions DumpOpts, unsigned Indent) const; }; private: @@ -101,7 +107,7 @@ private: public: void parse(DataExtractor data, uint64_t Offset, uint64_t EndOffset, uint16_t Version); void dump(raw_ostream &OS, uint64_t BaseAddr, const MCRegisterInfo *RegInfo, - Optional Offset) const; + DIDumpOptions DumpOpts, Optional Offset) const; /// Return the location list at the given offset or nullptr. 
LocationList const *getLocationListAtOffset(uint64_t Offset) const; Modified: llvm/trunk/lib/DebugInfo/DWARF/DWARFContext.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/DebugInfo/DWARF/DWARFContext.cpp?rev=374582&r1=374581&r2=374582&view=diff ============================================================================== --- llvm/trunk/lib/DebugInfo/DWARF/DWARFContext.cpp (original) +++ llvm/trunk/lib/DebugInfo/DWARF/DWARFContext.cpp Fri Oct 11 12:06:35 2019 @@ -305,7 +305,7 @@ static void dumpLoclistsSection(raw_ostr DWARFDebugLoclists Loclists; uint64_t EndOffset = Header.length() + Header.getHeaderOffset(); Loclists.parse(LocData, Offset, EndOffset, Header.getVersion()); - Loclists.dump(OS, 0, MRI, DumpOffset); + Loclists.dump(OS, 0, MRI, DumpOpts, DumpOffset); Offset = EndOffset; } } @@ -382,7 +382,7 @@ void DWARFContext::dump( if (const auto *Off = shouldDump(Explicit, ".debug_loc", DIDT_ID_DebugLoc, DObj->getLocSection().Data)) { - getDebugLoc()->dump(OS, getRegisterInfo(), *Off); + getDebugLoc()->dump(OS, getRegisterInfo(), DumpOpts, *Off); } if (const auto *Off = shouldDump(Explicit, ".debug_loclists", DIDT_ID_DebugLoclists, @@ -394,7 +394,7 @@ void DWARFContext::dump( if (const auto *Off = shouldDump(ExplicitDWO, ".debug_loc.dwo", DIDT_ID_DebugLoc, DObj->getLocDWOSection().Data)) { - getDebugLocDWO()->dump(OS, 0, getRegisterInfo(), *Off); + getDebugLocDWO()->dump(OS, 0, getRegisterInfo(), DumpOpts, *Off); } if (const auto *Off = shouldDump(Explicit, ".debug_frame", DIDT_ID_DebugFrame, Modified: llvm/trunk/lib/DebugInfo/DWARF/DWARFDebugLoc.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/DebugInfo/DWARF/DWARFDebugLoc.cpp?rev=374582&r1=374581&r2=374582&view=diff ============================================================================== --- llvm/trunk/lib/DebugInfo/DWARF/DWARFDebugLoc.cpp (original) +++ llvm/trunk/lib/DebugInfo/DWARF/DWARFDebugLoc.cpp Fri Oct 11 12:06:35 2019 @@ -9,6 +9,7 @@ #include 
"llvm/DebugInfo/DWARF/DWARFDebugLoc.h" #include "llvm/ADT/StringRef.h" #include "llvm/BinaryFormat/Dwarf.h" +#include "llvm/Bitcode/BitcodeAnalyzer.h" #include "llvm/DebugInfo/DWARF/DWARFContext.h" #include "llvm/DebugInfo/DWARF/DWARFExpression.h" #include "llvm/DebugInfo/DWARF/DWARFRelocMap.h" @@ -39,6 +40,7 @@ void DWARFDebugLoc::LocationList::dump(r bool IsLittleEndian, unsigned AddressSize, const MCRegisterInfo *MRI, DWARFUnit *U, + DIDumpOptions DumpOpts, unsigned Indent) const { for (const Entry &E : Entries) { OS << '\n'; @@ -62,12 +64,12 @@ DWARFDebugLoc::getLocationListAtOffset(u return nullptr; } -void DWARFDebugLoc::dump(raw_ostream &OS, const MCRegisterInfo *MRI, +void DWARFDebugLoc::dump(raw_ostream &OS, const MCRegisterInfo *MRI, DIDumpOptions DumpOpts, Optional Offset) const { auto DumpLocationList = [&](const LocationList &L) { OS << format("0x%8.8" PRIx64 ": ", L.Offset); - L.dump(OS, 0, IsLittleEndian, AddressSize, MRI, nullptr, 12); - OS << "\n\n"; + L.dump(OS, 0, IsLittleEndian, AddressSize, MRI, nullptr, DumpOpts, 12); + OS << "\n"; }; if (Offset) { @@ -78,6 +80,8 @@ void DWARFDebugLoc::dump(raw_ostream &OS for (const LocationList &L : Locations) { DumpLocationList(L); + if (&L != &Locations.back()) + OS << '\n'; } } @@ -146,7 +150,11 @@ DWARFDebugLoclists::parseOneLocationList while (auto Kind = Data.getU8(C)) { Entry E; E.Kind = Kind; + E.Offset = C.tell() - 1; switch (Kind) { + case dwarf::DW_LLE_base_addressx: + E.Value0 = Data.getULEB128(C); + break; case dwarf::DW_LLE_startx_length: E.Value0 = Data.getULEB128(C); // Pre-DWARF 5 has different interpretation of the length field. We have @@ -173,7 +181,8 @@ DWARFDebugLoclists::parseOneLocationList "LLE of kind %x not supported", (int)Kind); } - if (Kind != dwarf::DW_LLE_base_address) { + if (Kind != dwarf::DW_LLE_base_address && + Kind != dwarf::DW_LLE_base_addressx) { unsigned Bytes = Version >= 5 ? 
Data.getULEB128(C) : Data.getU16(C); // A single location description describing the location of the object... Data.getU8(C, E.Loc, Bytes); @@ -183,6 +192,10 @@ DWARFDebugLoclists::parseOneLocationList } if (Error Err = C.takeError()) return std::move(Err); + Entry E; + E.Kind = dwarf::DW_LLE_end_of_list; + E.Offset = C.tell() - 1; + LL.Entries.push_back(E); *Offset = C.tell(); return LL; } @@ -210,51 +223,106 @@ DWARFDebugLoclists::getLocationListAtOff return nullptr; } -void DWARFDebugLoclists::LocationList::dump(raw_ostream &OS, uint64_t BaseAddr, - bool IsLittleEndian, - unsigned AddressSize, - const MCRegisterInfo *MRI, - DWARFUnit *U, - unsigned Indent) const { - for (const Entry &E : Entries) { - switch (E.Kind) { +void DWARFDebugLoclists::Entry::dump(raw_ostream &OS, uint64_t &BaseAddr, + bool IsLittleEndian, unsigned AddressSize, + const MCRegisterInfo *MRI, DWARFUnit *U, + DIDumpOptions DumpOpts, unsigned Indent, + size_t MaxEncodingStringLength) const { + if (DumpOpts.Verbose) { + OS << "\n"; + OS.indent(Indent); + auto EncodingString = dwarf::LocListEncodingString(Kind); + // Unsupported encodings should have been reported during parsing. 
+ assert(!EncodingString.empty() && "Unknown loclist entry encoding"); + OS << format("%s%*c", EncodingString.data(), + MaxEncodingStringLength - EncodingString.size() + 1, '('); + switch (Kind) { case dwarf::DW_LLE_startx_length: - OS << '\n'; - OS.indent(Indent); - OS << "Addr idx " << E.Value0 << " (w/ length " << E.Value1 << "): "; - break; case dwarf::DW_LLE_start_length: - OS << '\n'; - OS.indent(Indent); - OS << format("[0x%*.*" PRIx64 ", 0x%*.*" PRIx64 "): ", AddressSize * 2, - AddressSize * 2, E.Value0, AddressSize * 2, AddressSize * 2, - E.Value0 + E.Value1); - break; case dwarf::DW_LLE_offset_pair: - OS << '\n'; - OS.indent(Indent); - OS << format("[0x%*.*" PRIx64 ", 0x%*.*" PRIx64 "): ", AddressSize * 2, - AddressSize * 2, BaseAddr + E.Value0, AddressSize * 2, - AddressSize * 2, BaseAddr + E.Value1); + OS << format("0x%*.*" PRIx64 ", 0x%*.*" PRIx64, AddressSize * 2, + AddressSize * 2, Value0, AddressSize * 2, AddressSize * 2, + Value1); break; + case dwarf::DW_LLE_base_addressx: case dwarf::DW_LLE_base_address: - BaseAddr = E.Value0; + OS << format("0x%*.*" PRIx64, AddressSize * 2, AddressSize * 2, + Value0); + break; + case dwarf::DW_LLE_end_of_list: break; - default: - llvm_unreachable("unreachable locations list kind"); } - - dumpExpression(OS, E.Loc, IsLittleEndian, AddressSize, MRI, U); + OS << ')'; + } + auto PrintPrefix = [&] { + OS << "\n"; + OS.indent(Indent); + if (DumpOpts.Verbose) + OS << format("%*s", MaxEncodingStringLength, (const char *)"=> "); + }; + switch (Kind) { + case dwarf::DW_LLE_startx_length: + PrintPrefix(); + OS << "Addr idx " << Value0 << " (w/ length " << Value1 << "): "; + break; + case dwarf::DW_LLE_start_length: + PrintPrefix(); + DWARFAddressRange(Value0, Value0 + Value1) + .dump(OS, AddressSize, DumpOpts); + OS << ": "; + break; + case dwarf::DW_LLE_offset_pair: + PrintPrefix(); + DWARFAddressRange(BaseAddr + Value0, BaseAddr + Value1) + .dump(OS, AddressSize, DumpOpts); + OS << ": "; + break; + case 
dwarf::DW_LLE_base_addressx: + if (!DumpOpts.Verbose) + return; + break; + case dwarf::DW_LLE_end_of_list: + if (!DumpOpts.Verbose) + return; + break; + case dwarf::DW_LLE_base_address: + BaseAddr = Value0; + if (!DumpOpts.Verbose) + return; + break; + default: + llvm_unreachable("unreachable locations list kind"); } + + dumpExpression(OS, Loc, IsLittleEndian, AddressSize, MRI, U); +} +void DWARFDebugLoclists::LocationList::dump(raw_ostream &OS, uint64_t BaseAddr, + bool IsLittleEndian, + unsigned AddressSize, + const MCRegisterInfo *MRI, + DWARFUnit *U, + DIDumpOptions DumpOpts, + unsigned Indent) const { + size_t MaxEncodingStringLength = 0; + if (DumpOpts.Verbose) + for (const auto &Entry : Entries) + MaxEncodingStringLength = + std::max(MaxEncodingStringLength, + dwarf::LocListEncodingString(Entry.Kind).size()); + + for (const Entry &E : Entries) + E.dump(OS, BaseAddr, IsLittleEndian, AddressSize, MRI, U, DumpOpts, Indent, + MaxEncodingStringLength); } void DWARFDebugLoclists::dump(raw_ostream &OS, uint64_t BaseAddr, - const MCRegisterInfo *MRI, + const MCRegisterInfo *MRI, DIDumpOptions DumpOpts, Optional Offset) const { auto DumpLocationList = [&](const LocationList &L) { OS << format("0x%8.8" PRIx64 ": ", L.Offset); - L.dump(OS, BaseAddr, IsLittleEndian, AddressSize, MRI, nullptr, /*Indent=*/12); - OS << "\n\n"; + L.dump(OS, BaseAddr, IsLittleEndian, AddressSize, MRI, nullptr, DumpOpts, + /*Indent=*/12); + OS << "\n"; }; if (Offset) { @@ -265,5 +333,7 @@ void DWARFDebugLoclists::dump(raw_ostrea for (const LocationList &L : Locations) { DumpLocationList(L); + if (&L != &Locations.back()) + OS << '\n'; } } Modified: llvm/trunk/lib/DebugInfo/DWARF/DWARFDie.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/DebugInfo/DWARF/DWARFDie.cpp?rev=374582&r1=374581&r2=374582&view=diff ============================================================================== --- llvm/trunk/lib/DebugInfo/DWARF/DWARFDie.cpp (original) +++ 
llvm/trunk/lib/DebugInfo/DWARF/DWARFDie.cpp Fri Oct 11 12:06:35 2019 @@ -97,8 +97,10 @@ static void dumpLocation(raw_ostream &OS uint64_t BaseAddr = 0; if (Optional BA = U->getBaseAddress()) BaseAddr = BA->Address; + auto LLDumpOpts = DumpOpts; + LLDumpOpts.Verbose = false; ExpectedLL->dump(OS, BaseAddr, Ctx.isLittleEndian(), Obj.getAddressSize(), - MRI, U, Indent); + MRI, U, LLDumpOpts, Indent); } else { OS << '\n'; OS.indent(Indent); Modified: llvm/trunk/test/CodeGen/X86/debug-loclists.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/debug-loclists.ll?rev=374582&r1=374581&r2=374582&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/debug-loclists.ll (original) +++ llvm/trunk/test/CodeGen/X86/debug-loclists.ll Fri Oct 11 12:06:35 2019 @@ -13,8 +13,10 @@ ; CHECK: .debug_loclists contents: ; CHECK-NEXT: 0x00000000: locations list header: length = 0x00000015, version = 0x0005, addr_size = 0x08, seg_size = 0x00, offset_entry_count = 0x00000000 ; CHECK-NEXT: 0x0000000c: -; CHECK-NEXT: [0x0000000000000000, 0x0000000000000004): DW_OP_breg5 RDI+0 -; CHECK-NEXT: [0x0000000000000004, 0x0000000000000012): DW_OP_breg3 RBX+0 +; CHECK-NEXT: DW_LLE_offset_pair(0x0000000000000000, 0x0000000000000004) +; CHECK-NEXT: => [0x0000000000000000, 0x0000000000000004): DW_OP_breg5 RDI+0 +; CHECK-NEXT: DW_LLE_offset_pair(0x0000000000000004, 0x0000000000000012) +; CHECK-NEXT: => [0x0000000000000004, 0x0000000000000012): DW_OP_breg3 RBX+0 ; There is no way to use llvm-dwarfdump atm (2018, october) to verify the DW_LLE_* codes emited, ; because dumper is not yet implements that. Use asm code to do this check instead. 
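As a rough illustration of what the "=>" lines in the verbose dump represent, the following standalone C++ sketch resolves a sequence of location list entries to half-open address ranges, following the same rules as the switch in DWARFDebugLoclists::Entry::dump above. The names LLEKind, LLEntry, AddrRange and resolveRanges are hypothetical and are not part of LLVM; indexed forms such as DW_LLE_startx_length are left out because resolving them needs the address table.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the DW_LLE_* kinds handled in this patch;
// these are not the LLVM enumerators.
enum class LLEKind { BaseAddress, OffsetPair, StartLength, EndOfList };

struct LLEntry {
  LLEKind Kind;
  uint64_t Value0;
  uint64_t Value1;
};

using AddrRange = std::pair<uint64_t, uint64_t>; // half-open [low, high)

// Walk the entries in order, tracking the current base address the way
// Entry::dump does, and collect the ranges that would follow "=>".
std::vector<AddrRange> resolveRanges(const std::vector<LLEntry> &Entries,
                                     uint64_t BaseAddr) {
  std::vector<AddrRange> Ranges;
  for (const LLEntry &E : Entries) {
    switch (E.Kind) {
    case LLEKind::BaseAddress:
      BaseAddr = E.Value0; // only affects offset pairs that come later
      break;
    case LLEKind::OffsetPair:
      Ranges.emplace_back(BaseAddr + E.Value0, BaseAddr + E.Value1);
      break;
    case LLEKind::StartLength:
      Ranges.emplace_back(E.Value0, E.Value0 + E.Value1);
      break;
    case LLEKind::EndOfList:
      return Ranges; // terminator: nothing after it belongs to this list
    }
  }
  return Ranges;
}
```

Fed the entries exercised by the dwarfdump-debug-loclists.test changes in this commit (offset pair 0x0/0x10, base address 0x500, offset pair 0x30/0x40, start/length 0x700/0x10, end of list) with an initial base of 0, this sketch yields [0x0, 0x10), [0x530, 0x540) and [0x700, 0x710), which are the ranges the updated CHECK lines expect.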
Modified: llvm/trunk/test/DebugInfo/X86/dwarfdump-debug-loclists.test URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/DebugInfo/X86/dwarfdump-debug-loclists.test?rev=374582&r1=374581&r2=374582&view=diff ============================================================================== --- llvm/trunk/test/DebugInfo/X86/dwarfdump-debug-loclists.test (original) +++ llvm/trunk/test/DebugInfo/X86/dwarfdump-debug-loclists.test Fri Oct 11 12:06:35 2019 @@ -11,9 +11,14 @@ # CHECK: .debug_loclists contents: # CHECK-NEXT: 0x00000000: locations list header: length = 0x0000002c, version = 0x0005, addr_size = 0x08, seg_size = 0x00, offset_entry_count = 0x00000000 # CHECK-NEXT: 0x0000000c: -# CHECK-NEXT: [0x0000000000000000, 0x0000000000000010): DW_OP_breg5 RDI+0 -# CHECK-NEXT: [0x0000000000000530, 0x0000000000000540): DW_OP_breg6 RBP-8, DW_OP_deref -# CHECK-NEXT: [0x0000000000000700, 0x0000000000000710): DW_OP_breg5 RDI+0 +# CHECK-NEXT: DW_LLE_offset_pair (0x0000000000000000, 0x0000000000000010) +# CHECK-NEXT: => [0x0000000000000000, 0x0000000000000010): DW_OP_breg5 RDI+0 +# CHECK-NEXT: DW_LLE_base_address(0x0000000000000500) +# CHECK-NEXT: DW_LLE_offset_pair (0x0000000000000030, 0x0000000000000040) +# CHECK-NEXT: => [0x0000000000000530, 0x0000000000000540): DW_OP_breg6 RBP-8, DW_OP_deref +# CHECK-NEXT: DW_LLE_start_length(0x0000000000000700, 0x0000000000000010) +# CHECK-NEXT: => [0x0000000000000700, 0x0000000000000710): DW_OP_breg5 RDI+0 +# CHECK-NEXT: DW_LLE_end_of_list () .section .debug_str,"MS", at progbits,1 .asciz "stub" Modified: llvm/trunk/test/DebugInfo/X86/fission-ranges.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/DebugInfo/X86/fission-ranges.ll?rev=374582&r1=374581&r2=374582&view=diff ============================================================================== --- llvm/trunk/test/DebugInfo/X86/fission-ranges.ll (original) +++ llvm/trunk/test/DebugInfo/X86/fission-ranges.ll Fri Oct 11 12:06:35 2019 @@ -45,18 +45,31 @@ ; if they've changed due 
to a bugfix, change in register allocation, etc. ; CHECK: [[A]]: -; CHECK-NEXT: Addr idx 2 (w/ length 15): DW_OP_consts +0, DW_OP_stack_value -; CHECK-NEXT: Addr idx 3 (w/ length 15): DW_OP_reg0 RAX -; CHECK-NEXT: Addr idx 4 (w/ length 18): DW_OP_breg7 RSP-8 +; CHECK-NEXT: DW_LLE_startx_length(0x00000002, 0x0000000f) +; CHECK-NEXT: => Addr idx 2 (w/ length 15): DW_OP_consts +0, DW_OP_stack_value +; CHECK-NEXT: DW_LLE_startx_length(0x00000003, 0x0000000f) +; CHECK-NEXT: => Addr idx 3 (w/ length 15): DW_OP_reg0 RAX +; CHECK-NEXT: DW_LLE_startx_length(0x00000004, 0x00000012) +; CHECK-NEXT: => Addr idx 4 (w/ length 18): DW_OP_breg7 RSP-8 +; CHECK-NEXT: DW_LLE_end_of_list () ; CHECK: [[E]]: -; CHECK-NEXT: Addr idx 5 (w/ length 9): DW_OP_reg0 RAX -; CHECK-NEXT: Addr idx 6 (w/ length 98): DW_OP_breg7 RSP-44 +; CHECK-NEXT: DW_LLE_startx_length(0x00000005, 0x00000009) +; CHECK-NEXT: => Addr idx 5 (w/ length 9): DW_OP_reg0 RAX +; CHECK-NEXT: DW_LLE_startx_length(0x00000006, 0x00000062) +; CHECK-NEXT: => Addr idx 6 (w/ length 98): DW_OP_breg7 RSP-44 +; CHECK-NEXT: DW_LLE_end_of_list () ; CHECK: [[B]]: -; CHECK-NEXT: Addr idx 7 (w/ length 15): DW_OP_reg0 RAX -; CHECK-NEXT: Addr idx 8 (w/ length 66): DW_OP_breg7 RSP-32 +; CHECK-NEXT: DW_LLE_startx_length(0x00000007, 0x0000000f) +; CHECK-NEXT: => Addr idx 7 (w/ length 15): DW_OP_reg0 RAX +; CHECK-NEXT: DW_LLE_startx_length(0x00000008, 0x00000042) +; CHECK-NEXT: => Addr idx 8 (w/ length 66): DW_OP_breg7 RSP-32 +; CHECK-NEXT: DW_LLE_end_of_list () ; CHECK: [[D]]: -; CHECK-NEXT: Addr idx 9 (w/ length 15): DW_OP_reg0 RAX -; CHECK-NEXT: Addr idx 10 (w/ length 42): DW_OP_breg7 RSP-20 +; CHECK-NEXT: DW_LLE_startx_length(0x00000009, 0x0000000f) +; CHECK-NEXT: => Addr idx 9 (w/ length 15): DW_OP_reg0 RAX +; CHECK-NEXT: DW_LLE_startx_length(0x0000000a, 0x0000002a) +; CHECK-NEXT: => Addr idx 10 (w/ length 42): DW_OP_breg7 RSP-20 +; CHECK-NEXT: DW_LLE_end_of_list () ; Make sure we don't produce any relocations in any .dwo section (though in 
particular, debug_info.dwo) ; HDR-NOT: .rela.{{.*}}.dwo Modified: llvm/trunk/test/tools/llvm-dwarfdump/X86/debug_loclists_startx_length.s URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/llvm-dwarfdump/X86/debug_loclists_startx_length.s?rev=374582&r1=374581&r2=374582&view=diff ============================================================================== --- llvm/trunk/test/tools/llvm-dwarfdump/X86/debug_loclists_startx_length.s (original) +++ llvm/trunk/test/tools/llvm-dwarfdump/X86/debug_loclists_startx_length.s Fri Oct 11 12:06:35 2019 @@ -8,7 +8,9 @@ # CHECK: .debug_loclists contents: # CHECK-NEXT: 0x00000000: locations list header: length = 0x0000000e, version = 0x0005, addr_size = 0x08, seg_size = 0x00, offset_entry_count = 0x00000000 # CHECK-NEXT: 0x0000000c: -# CHECK-NEXT: Addr idx 1 (w/ length 16): DW_OP_reg5 RDI +# CHECK-NEXT: DW_LLE_startx_length(0x0000000000000001, 0x0000000000000010) +# CHECK-NEXT: => Addr idx 1 (w/ length 16): DW_OP_reg5 RDI +# CHECK-NEXT: DW_LLE_end_of_list () .section .debug_loclists,"", at progbits .long .Ldebug_loclist_table_end0-.Ldebug_loclist_table_start0 From llvm-commits at lists.llvm.org Fri Oct 11 12:05:40 2019 From: llvm-commits at lists.llvm.org (Matt Arsenault via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 19:05:40 +0000 (UTC) Subject: [PATCH] D68881: [AMDGPU] Improve code size cost model In-Reply-To: References: Message-ID: arsenm added inline comments. 
================ Comment at: llvm/lib/Target/AMDGPU/AMDGPUInline.cpp:54 static cl::opt -MaxBB("amdgpu-inline-max-bb", cl::Hidden, cl::init(300), +MaxBB("amdgpu-inline-max-bb", cl::Hidden, cl::init(1100), cl::desc("Maximum BB number allowed in a function after inlining" ---------------- This is a separate change ================ Comment at: llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp:698 + ArrayRef Operands) { + // Estimate extractelement elimination + if (const ExtractElementInst *EE = dyn_cast(U)) { ---------------- We already report vector insert/extract as free. Why does this need to look at these specifically? What is the purpose of Operands which seems to be ignored? What uses this version? I thought the set of cost model function with specific value contexts were only used by the vectorizers ================ Comment at: llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.h:207 - unsigned getInliningThresholdMultiplier() { return 7; } + unsigned getInliningThresholdMultiplier() { return 9; } ---------------- This is a separate change Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68881/new/ https://reviews.llvm.org/D68881 From llvm-commits at lists.llvm.org Fri Oct 11 12:08:50 2019 From: llvm-commits at lists.llvm.org (David Stenberg via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 19:08:50 +0000 (UTC) Subject: [PATCH] D68465: [DebugInfo] Trim call-clobbered location list entries when tuning for GDB In-Reply-To: References: Message-ID: <2bf1e5cb701ba50ca588d950d4e13222@localhost.localdomain> dstenb added a comment. In D68465#1706294 , @dblaikie wrote: > Yeah, GCC 8.1 has the same behavior. > > But I think we can avoid that in LLVM with some of the loclist changes I'm working on - we should only ever be using the start of a function (not necessarily the current function, if there are multiple functions in the same section) as a base address in location lists. 
So locations like this should be rendered as offset pairs relative to that base address & should always be positive, even if you subtract 1. Okay! > I'm wondering whether it might be worth deferring all this design discussion until after there's more clarity from GDB - since I expect that's the right path forward & this design discussion may be unnecessary. Yes, agreed. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68465/new/ https://reviews.llvm.org/D68465 From llvm-commits at lists.llvm.org Fri Oct 11 12:08:51 2019 From: llvm-commits at lists.llvm.org (Matt Arsenault via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 19:08:51 +0000 (UTC) Subject: [PATCH] D68865: [InstCombine][AMDGPU] Fix crash with v3i16/v3f16 buffer intrinsics In-Reply-To: References: Message-ID: <704885c4237686eec41239cfc26fcc31@localhost.localdomain> arsenm added inline comments. ================ Comment at: lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp:975 + + // FIXME: Allow v3i16/v3f16 in buffer intrinsics when the types are fully supported. + if (DMaskIdx < 0 && ---------------- I think I have these working in GlobalISel already ================ Comment at: lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp:977 + if (DMaskIdx < 0 && + II->getType()->getScalarSizeInBits() == 16 && + DemandedElts.getActiveBits() == 3) ---------------- != 32 would be a bit safer Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68865/new/ https://reviews.llvm.org/D68865 From llvm-commits at lists.llvm.org Fri Oct 11 12:08:48 2019 From: llvm-commits at lists.llvm.org (Galina Kistanova via llvm-commits) Date: Fri, 11 Oct 2019 12:08:48 -0700 Subject: Attention bot owners Message-ID: Hello all bot owners, As all of you know, we are moving to the GitHub monorepo very soon. We are actively working on the buildbot to prepare a solution for switching from SVN to GitHub when the time comes. It will require some action on your bots.
At this point it is clear that you will need to: * Make sure you have a reasonably recent version of git installed and in the system path for the buildbot account. * Once the transition to GitHub is done, remove the old source code and build directory. This can be done later, assuming you have enough room on that hard drive for two sets of the source and build files. Thanks Galina From llvm-commits at lists.llvm.org Fri Oct 11 12:18:03 2019 From: llvm-commits at lists.llvm.org (Digger via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 19:18:03 +0000 (UTC) Subject: [PATCH] D67008: [llvm-readobj][XCOFF]implement parsing relocation information for 32-bit xcoff objectfile In-Reply-To: References: Message-ID: <9d0d13c9162eb5eac33294b3066d3caa@localhost.localdomain> DiggerLin marked 6 inline comments as done. DiggerLin added inline comments. ================ Comment at: llvm/include/llvm/BinaryFormat/XCOFF.h:193 + ///< the refrenced symbol and the address of the refrenced branch + ///< instruction. References a non modifiable instruction. + R_RBA = 0x18, ///< Branch absolute relocation. Similar to the R_BA but ---------------- hubert.reinterpretcast wrote: > hubert.reinterpretcast wrote: > > Still missing the hyphen for "non-modifiable". > Still not seeing the change: https://reviews.llvm.org/D67008?id=222860#inline-616694 changed ================ Comment at: llvm/include/llvm/BinaryFormat/XCOFF.h:194 + ///< instruction. References a non modifiable instruction. + R_RBA = 0x18, ///< Branch absolute relocation. Similar to the R_BA but + ///< references a modifiable instruction. ---------------- hubert.reinterpretcast wrote: > hubert.reinterpretcast wrote: > > Either remove the "the" for this line or add "relocation" after "R_BA".
> Still not seeing the change: https://reviews.llvm.org/D67008?id=222860#inline-616692 changed ================ Comment at: llvm/include/llvm/Object/XCOFFObjectFile.h:165 +}; class XCOFFObjectFile : public ObjectFile { private: ---------------- hubert.reinterpretcast wrote: > Blank line between the class definitions please. added ================ Comment at: llvm/lib/Object/XCOFFObjectFile.cpp:597 + + uint32_t RelocEntNum = RelocEntNumOrErr.get(); + ---------------- hubert.reinterpretcast wrote: > hubert.reinterpretcast wrote: > > Suggestion: `NumRelocEntries` > Still not seeing the change: https://reviews.llvm.org/D67008?id=222860#inline-616704 changed Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67008/new/ https://reviews.llvm.org/D67008 From llvm-commits at lists.llvm.org Fri Oct 11 12:23:53 2019 From: llvm-commits at lists.llvm.org (Digger via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 19:23:53 +0000 (UTC) Subject: [PATCH] D67008: [llvm-readobj][XCOFF]implement parsing relocation information for 32-bit xcoff objectfile In-Reply-To: References: Message-ID: <1e3c14def9764542caea1fa6030b07eb@localhost.localdomain> DiggerLin marked 2 inline comments as done. DiggerLin added inline comments. ================ Comment at: llvm/lib/Object/XCOFFObjectFile.cpp:556 } +// In an XCOFF32 file, if more than 65,534 relocation entries are required, +// the field value will be 65535, and an STYP_OVRFLO section header will ---------------- hubert.reinterpretcast wrote: > hubert.reinterpretcast wrote: > > DiggerLin wrote: > > > hubert.reinterpretcast wrote: > > > > We can reduce the amount of background for the comment to what is necessary to understand the code here: > > > > In an XCOFF32 file, when the field value is 65535, then an STYP_OVRFLO section header contains the actual count of relocation entries in the s_paddr field. 
STYP_OVRFLO headers contain the section index of their corresponding sections as their raw "NumberOfRelocations" field value. > > > added. > > I am not seeing the change. > Still not seeing the change: https://reviews.llvm.org/D67008?id=222237#inline-613037 changed the comment as suggestion. ================ Comment at: llvm/lib/Object/XCOFFObjectFile.cpp:593 + Sec.FileOffsetToRelocationInfo); + auto RelocEntNumOrErr = getLogicalNumberOfRelocationEntries(Sec); + if (Error E = RelocEntNumOrErr.takeError()) ---------------- hubert.reinterpretcast wrote: > hubert.reinterpretcast wrote: > > Suggestion: `NumRelocEntriesOrErr` > Still not seeing the change: https://reviews.llvm.org/D67008?id=222860#inline-616702 changed Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67008/new/ https://reviews.llvm.org/D67008 From llvm-commits at lists.llvm.org Fri Oct 11 12:33:03 2019 From: llvm-commits at lists.llvm.org (Digger via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 19:33:03 +0000 (UTC) Subject: [PATCH] D67008: [llvm-readobj][XCOFF]implement parsing relocation information for 32-bit xcoff objectfile In-Reply-To: References: Message-ID: <7556e16ca4b5b6521130fbb0d3dad330@localhost.localdomain> DiggerLin updated this revision to Diff 224655. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67008/new/ https://reviews.llvm.org/D67008 Files: llvm/include/llvm/BinaryFormat/XCOFF.h llvm/include/llvm/Object/XCOFFObjectFile.h llvm/lib/Object/XCOFFObjectFile.cpp llvm/test/tools/llvm-readobj/reloc_overflow.test llvm/test/tools/llvm-readobj/xcoff-basic.test llvm/tools/llvm-readobj/XCOFFDumper.cpp -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D67008.224655.patch Type: text/x-patch Size: 19635 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Fri Oct 11 12:46:06 2019 From: llvm-commits at lists.llvm.org (Tony Jiang via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 19:46:06 +0000 (UTC) Subject: [PATCH] D66840: docs/DeveloperPolicy: Add instructions for requesting GitHub commit access In-Reply-To: References: Message-ID: <23d53914fea69d1177d17695c75a255c@localhost.localdomain> jtony added a comment. In D66840#1702109 , @jtony wrote: > I am not able to run the last step successfully. Initially, I thought it was because I had used the wrong password. So I sent another new password hash to Chris Lattner to update it. He updated the password hash for me. When I used the new password to run `svn commit -m "Request commit access for jtony"`, it still failed. Anyone know why? Thanks! Has anyone run into a similar issue? What should I do? Thanks! Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D66840/new/ https://reviews.llvm.org/D66840 From llvm-commits at lists.llvm.org Fri Oct 11 12:51:20 2019 From: llvm-commits at lists.llvm.org (Stella Stamenova via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 19:51:20 +0000 (UTC) Subject: [PATCH] D67347: [Windows] Use information from the PE32 exceptions directory to construct unwind plans In-Reply-To: References: Message-ID: <4a64145b708d0e510aeb39132941893b@localhost.localdomain> stella.stamenova added a comment. It looks like this change fixed at least one of the XFAILed tests on Windows: http://lab.llvm.org:8011/builders/lldb-x64-windows-ninja/builds/9751 So now the test results would be red because of the unexpectedly passing test (if there wasn't another failure). Could you have a look at whether any of the other tests that were XFAILed for the same bug are also now passing and remove the expected failure tags as appropriate?
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67347/new/ https://reviews.llvm.org/D67347 From llvm-commits at lists.llvm.org Fri Oct 11 13:00:42 2019 From: llvm-commits at lists.llvm.org (James Nagurne via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 20:00:42 +0000 (UTC) Subject: [PATCH] D63978: Clang Interface Stubs merger plumbing for Driver In-Reply-To: References: Message-ID: JamesNagurne added a comment. Our team maintains a downstream embedded ARM clang distribution and some tests from this commit have begun to fail for us. For a number of these tests, there was a REQUIRES: x86-registered-target at the top, which has now been removed. Specifically, externstatic.c, merge-conflict-test.c, object-float.c, and object.c are failing. object* tests seem to be based on object.cpp, which had the REQUIRES line, and externstatic.c also had that line prior to the change. I see that @compnerd suggested the removal, but were you certain that these tests would work on clang toolchains for which x86 is not a registered target? For a failure example, here the output of lit for our toolchain. If you can make sense of it, I'd appreciate input on how we can fix or work around it: > /arm-llvm/Release/llvm/bin/clang -c -o - -emit-interface-stubs /llvm-project/clang/test/InterfaceStubs/object.c | /arm-llvm/Release/llvm/bin/FileCheck -check-prefix=CHECK-TAPI /llvm-project/clang/test/InterfaceStubs/object.c /llvm-project/clang/test/InterfaceStubs/object.c:5:16: error: CHECK-TAPI: expected string not found in input // CHECK-TAPI: data: { Type: Object, Size: 4 } ^ :1:1: note: scanning from here --- !experimental-ifs-v1 ^ And when run without FileCheck, our raw output: > /arm-llvm/Release/llvm/bin/clang -c -o - -emit-interface-stubs /llvm-project/clang/test/InterfaceStubs/object.c --- !experimental-ifs-v1 IfsVersion: 1.0 Triple: thumbv7em-ti-none-eabihf ObjectFileFormat: ELF Symbols: ... 
Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D63978/new/ https://reviews.llvm.org/D63978 From llvm-commits at lists.llvm.org Fri Oct 11 13:05:16 2019 From: llvm-commits at lists.llvm.org (Guozhi Wei via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 20:05:16 +0000 (UTC) Subject: [PATCH] D68414: [SROA] Enhance AggLoadStoreRewriter to rewrite integer load/store if it covers multi fields in original aggregate In-Reply-To: References: Message-ID: <39003e4ac695f35ba2dd9fb2d0339293@localhost.localdomain> Carrot added a comment. ping Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68414/new/ https://reviews.llvm.org/D68414 From llvm-commits at lists.llvm.org Fri Oct 11 13:18:52 2019 From: llvm-commits at lists.llvm.org (Puyan Lotfi via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 20:18:52 +0000 (UTC) Subject: [PATCH] D63978: Clang Interface Stubs merger plumbing for Driver In-Reply-To: References: Message-ID: <7ec12e4e0a326e8d2f6b0014fb408f23@localhost.localdomain> plotfi marked an inline comment as done. plotfi added a comment. In D63978#1706420 , @JamesNagurne wrote: > Our team maintains a downstream embedded ARM clang distribution and some tests from this commit have begun to fail for us. > For a number of these tests, there was a REQUIRES: x86-registered-target at the top, which has now been removed. Specifically, externstatic.c, merge-conflict-test.c, object-float.c, and object.c are failing. > > object* tests seem to be based on object.cpp, which had the REQUIRES line, and externstatic.c also had that line prior to the change. > I see that @compnerd suggested the removal, but were you certain that these tests would work on clang toolchains for which x86 is not a registered target? > > For a failure example, here the output of lit for our toolchain. 
If you can make sense of it, I'd appreciate input on how we can fix or work around it: > > > /arm-llvm/Release/llvm/bin/clang -c -o - -emit-interface-stubs /llvm-project/clang/test/InterfaceStubs/object.c | /arm-llvm/Release/llvm/bin/FileCheck -check-prefix=CHECK-TAPI /llvm-project/clang/test/InterfaceStubs/object.c > /llvm-project/clang/test/InterfaceStubs/object.c:5:16: error: CHECK-TAPI: expected string not found in input > // CHECK-TAPI: data: { Type: Object, Size: 4 } > ^ > :1:1: note: scanning from here > --- !experimental-ifs-v1 > ^ > > > And when run without FileCheck, our raw output: > > > /arm-llvm/Release/llvm/bin/clang -c -o - -emit-interface-stubs /llvm-project/clang/test/InterfaceStubs/object.c > --- !experimental-ifs-v1 > IfsVersion: 1.0 > Triple: thumbv7em-ti-none-eabihf > ObjectFileFormat: ELF > Symbols: > ... > I am sorry for this James. I can add back the REQUIRES lines for now and coordinate with you on making sure your downstream bots are not affected again if the REQUIRES are removed again. By chance are your bots accessible publicly? Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D63978/new/ https://reviews.llvm.org/D63978 From llvm-commits at lists.llvm.org Fri Oct 11 13:22:47 2019 From: llvm-commits at lists.llvm.org (Quentin Colombet via llvm-commits) Date: Fri, 11 Oct 2019 20:22:47 -0000 Subject: [llvm] r374588 - [MachineIRBuilder] Fix an assertion failure with buildMerge Message-ID: <20191011202247.BAEC28AC6E@lists.llvm.org> Author: qcolombet Date: Fri Oct 11 13:22:47 2019 New Revision: 374588 URL: http://llvm.org/viewvc/llvm-project?rev=374588&view=rev Log: [MachineIRBuilder] Fix an assertion failure with buildMerge Teach buildMerge how to deal with scalar to vector kind of requests. Prior to this patch, buildMerge would issue either a G_MERGE_VALUES when all the vregs are scalars or a G_CONCAT_VECTORS when the destination vreg is a vector. 
G_CONCAT_VECTORS was actually not the proper instruction when the source vregs were scalars, and the compiler would assert that the sources must be vectors. Instead, what we want is to issue a G_BUILD_VECTOR in this situation. This patch fixes that. Modified: llvm/trunk/lib/CodeGen/GlobalISel/MachineIRBuilder.cpp llvm/trunk/unittests/CodeGen/GlobalISel/MachineIRBuilderTest.cpp Modified: llvm/trunk/lib/CodeGen/GlobalISel/MachineIRBuilder.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/GlobalISel/MachineIRBuilder.cpp?rev=374588&r1=374587&r2=374588&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/GlobalISel/MachineIRBuilder.cpp (original) +++ llvm/trunk/lib/CodeGen/GlobalISel/MachineIRBuilder.cpp Fri Oct 11 13:22:47 2019 @@ -1063,8 +1063,11 @@ MachineInstrBuilder MachineIRBuilder::bu "input operands do not cover output register"); if (SrcOps.size() == 1) return buildCast(DstOps[0], SrcOps[0]); - if (DstOps[0].getLLTTy(*getMRI()).isVector()) - return buildInstr(TargetOpcode::G_CONCAT_VECTORS, DstOps, SrcOps); + if (DstOps[0].getLLTTy(*getMRI()).isVector()) { + if (SrcOps[0].getLLTTy(*getMRI()).isVector()) + return buildInstr(TargetOpcode::G_CONCAT_VECTORS, DstOps, SrcOps); + return buildInstr(TargetOpcode::G_BUILD_VECTOR, DstOps, SrcOps); + } break; } case TargetOpcode::G_EXTRACT_VECTOR_ELT: { Modified: llvm/trunk/unittests/CodeGen/GlobalISel/MachineIRBuilderTest.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/unittests/CodeGen/GlobalISel/MachineIRBuilderTest.cpp?rev=374588&r1=374587&r2=374588&view=diff ============================================================================== --- llvm/trunk/unittests/CodeGen/GlobalISel/MachineIRBuilderTest.cpp (original) +++ llvm/trunk/unittests/CodeGen/GlobalISel/MachineIRBuilderTest.cpp Fri Oct 11 13:22:47 2019 @@ -314,3 +314,42 @@ TEST_F(GISelMITest, BuildAtomicRMW) { EXPECT_TRUE(CheckMachineFunction(*MF, CheckStr)) << *MF; } + 
+TEST_F(GISelMITest, BuildMerge) { + setUp(); + if (!TM) + return; + + LLT S32 = LLT::scalar(32); + Register RegC0 = B.buildConstant(S32, 0)->getOperand(0).getReg(); + Register RegC1 = B.buildConstant(S32, 1)->getOperand(0).getReg(); + Register RegC2 = B.buildConstant(S32, 2)->getOperand(0).getReg(); + Register RegC3 = B.buildConstant(S32, 3)->getOperand(0).getReg(); + + // Merging plain constants as one big blob of bit should produce a + // G_MERGE_VALUES. + B.buildMerge(LLT::scalar(128), {RegC0, RegC1, RegC2, RegC3}); + // Merging plain constants to a vector should produce a G_BUILD_VECTOR. + LLT V2x32 = LLT::vector(2, 32); + Register RegC0C1 = + B.buildMerge(V2x32, {RegC0, RegC1})->getOperand(0).getReg(); + Register RegC2C3 = + B.buildMerge(V2x32, {RegC2, RegC3})->getOperand(0).getReg(); + // Merging vector constants to a vector should produce a G_CONCAT_VECTORS. + B.buildMerge(LLT::vector(4, 32), {RegC0C1, RegC2C3}); + // Merging vector constants to a plain type is not allowed. + // Nothing else to test. 
+ + auto CheckStr = R"( + ; CHECK: [[C0:%[0-9]+]]:_(s32) = G_CONSTANT i32 0 + ; CHECK: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 1 + ; CHECK: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 2 + ; CHECK: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 3 + ; CHECK: {{%[0-9]+}}:_(s128) = G_MERGE_VALUES [[C0]]:_(s32), [[C1]]:_(s32), [[C2]]:_(s32), [[C3]]:_(s32) + ; CHECK: [[LOW2x32:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[C0]]:_(s32), [[C1]]:_(s32) + ; CHECK: [[HIGH2x32:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[C2]]:_(s32), [[C3]]:_(s32) + ; CHECK: {{%[0-9]+}}:_(<4 x s32>) = G_CONCAT_VECTORS [[LOW2x32]]:_(<2 x s32>), [[HIGH2x32]]:_(<2 x s32>) + )"; + + EXPECT_TRUE(CheckMachineFunction(*MF, CheckStr)) << *MF; +} From llvm-commits at lists.llvm.org Fri Oct 11 13:22:58 2019 From: llvm-commits at lists.llvm.org (Quentin Colombet via llvm-commits) Date: Fri, 11 Oct 2019 20:22:58 -0000 Subject: [llvm] r374589 - [GISel][CallLowering] Enable vector support in argument lowering Message-ID: <20191011202258.215498AD03@lists.llvm.org> Author: qcolombet Date: Fri Oct 11 13:22:57 2019 New Revision: 374589 URL: http://llvm.org/viewvc/llvm-project?rev=374589&view=rev Log: [GISel][CallLowering] Enable vector support in argument lowering The existing code is actually already enough to handle the splitting of vector arguments, but we were lacking a test case. This commit adds a test case for vector argument lowering involving splitting and enables the related support in call lowering.
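For illustration only, the exact-split condition that call lowering still enforces in this commit's diff can be sketched outside LLVM as below. This is a minimal sketch, not LLVM API: `isExactSplit` and its plain-integer parameters are hypothetical stand-ins for the comparison of `NewVT.getSizeInBits() * NumParts` against `CurVT.getSizeInBits()`.

```cpp
#include <cstdint>

// Hypothetical helper (not part of LLVM): returns true when a value of
// OrigBits bits is covered exactly by NumParts registers of PartBits bits.
constexpr bool isExactSplit(uint64_t OrigBits, uint64_t PartBits,
                            unsigned NumParts) {
  return PartBits * NumParts == OrigBits;
}

// <4 x i64> (256 bits) passed in two 128-bit vector registers: exact.
static_assert(isExactSplit(256, 128, 2), "expected an exact split");
// A 96-bit value spread over two 64-bit registers would not be exact.
static_assert(!isExactSplit(96, 64, 2), "expected an inexact split");
```

The first case mirrors the test added by this commit, where a <4 x i64> argument arrives as two <2 x s64> copies that are rejoined with G_CONCAT_VECTORS.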
Added: llvm/trunk/test/CodeGen/AArch64/GlobalISel/irtranslator-split-vector-arg.ll Modified: llvm/trunk/lib/CodeGen/GlobalISel/CallLowering.cpp Modified: llvm/trunk/lib/CodeGen/GlobalISel/CallLowering.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/GlobalISel/CallLowering.cpp?rev=374589&r1=374588&r2=374589&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/GlobalISel/CallLowering.cpp (original) +++ llvm/trunk/lib/CodeGen/GlobalISel/CallLowering.cpp Fri Oct 11 13:22:57 2019 @@ -198,14 +198,12 @@ bool CallLowering::handleAssignments(CCS unsigned NumParts = TLI->getNumRegistersForCallingConv( F.getContext(), F.getCallingConv(), CurVT); if (NumParts > 1) { - if (CurVT.isVector()) - return false; // For now only handle exact splits. if (NewVT.getSizeInBits() * NumParts != CurVT.getSizeInBits()) return false; } - // For incoming arguments (return values), we could have values in + // For incoming arguments (physregs to vregs), we could have values in // physregs (or memlocs) which we want to extract and copy to vregs. // During this, we might have to deal with the LLT being split across // multiple regs, so we have to record this information for later. @@ -221,7 +219,7 @@ bool CallLowering::handleAssignments(CCS return false; } else { // We're handling an incoming arg which is split over multiple regs. - // E.g. returning an s128 on AArch64. + // E.g. passing an s128 on AArch64. 
ISD::ArgFlagsTy OrigFlags = Args[i].Flags[0]; Args[i].OrigRegs.push_back(Args[i].Regs[0]); Args[i].Regs.clear(); Added: llvm/trunk/test/CodeGen/AArch64/GlobalISel/irtranslator-split-vector-arg.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AArch64/GlobalISel/irtranslator-split-vector-arg.ll?rev=374589&view=auto ============================================================================== --- llvm/trunk/test/CodeGen/AArch64/GlobalISel/irtranslator-split-vector-arg.ll (added) +++ llvm/trunk/test/CodeGen/AArch64/GlobalISel/irtranslator-split-vector-arg.ll Fri Oct 11 13:22:57 2019 @@ -0,0 +1,22 @@ +; NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py +; RUN: llc -global-isel -global-isel-abort=1 %s -stop-after=irtranslator -o - | FileCheck %s +target triple = "aarch64-apple-ios" + +; Check that we correctly split %arg into two vector registers of +; size <2 x i64>. +define hidden fastcc <4 x float> @foo(<4 x i64> %arg) unnamed_addr #0 { + ; CHECK-LABEL: name: foo + ; CHECK: bb.1.bb: + ; CHECK: liveins: $q0, $q1 + ; CHECK: [[COPY:%[0-9]+]]:_(<2 x s64>) = COPY $q0 + ; CHECK: [[COPY1:%[0-9]+]]:_(<2 x s64>) = COPY $q1 + ; CHECK: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s64>) = G_CONCAT_VECTORS [[COPY]](<2 x s64>), [[COPY1]](<2 x s64>) + ; CHECK: [[UITOFP:%[0-9]+]]:_(<4 x s32>) = G_UITOFP [[CONCAT_VECTORS]](<4 x s64>) + ; CHECK: $q0 = COPY [[UITOFP]](<4 x s32>) + ; CHECK: RET_ReallyLR implicit $q0 +bb: + %tmp = uitofp <4 x i64> %arg to <4 x float> + ret <4 x float> %tmp +} + +attributes #0 = { nounwind readnone } From llvm-commits at lists.llvm.org Fri Oct 11 13:26:08 2019 From: llvm-commits at lists.llvm.org (Simon Atanasyan via llvm-commits) Date: Fri, 11 Oct 2019 20:26:08 -0000 Subject: [llvm] r374590 - [Mips][llvm-exegesis] Add a Mips target Message-ID: <20191011202608.EFCD28830F@lists.llvm.org> Author: atanasyan Date: Fri Oct 11 13:26:08 2019 New Revision: 374590 URL: http://llvm.org/viewvc/llvm-project?rev=374590&view=rev Log: 
[Mips][llvm-exegesis] Add a Mips target The target does just enough to be able to run llvm-exegesis in latency mode for at least some opcodes. Patch by Miloš Stojanović. Differential Revision: https://reviews.llvm.org/D68649 Added: llvm/trunk/lib/Target/Mips/MipsPfmCounters.td llvm/trunk/tools/llvm-exegesis/lib/Mips/ llvm/trunk/tools/llvm-exegesis/lib/Mips/CMakeLists.txt llvm/trunk/tools/llvm-exegesis/lib/Mips/LLVMBuild.txt llvm/trunk/tools/llvm-exegesis/lib/Mips/Target.cpp llvm/trunk/unittests/tools/llvm-exegesis/Mips/ llvm/trunk/unittests/tools/llvm-exegesis/Mips/CMakeLists.txt llvm/trunk/unittests/tools/llvm-exegesis/Mips/TargetTest.cpp Modified: llvm/trunk/lib/Target/Mips/CMakeLists.txt llvm/trunk/lib/Target/Mips/Mips.td llvm/trunk/tools/llvm-exegesis/lib/Assembler.cpp llvm/trunk/tools/llvm-exegesis/lib/CMakeLists.txt llvm/trunk/unittests/tools/llvm-exegesis/CMakeLists.txt Modified: llvm/trunk/lib/Target/Mips/CMakeLists.txt URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/Mips/CMakeLists.txt?rev=374590&r1=374589&r2=374590&view=diff ============================================================================== --- llvm/trunk/lib/Target/Mips/CMakeLists.txt (original) +++ llvm/trunk/lib/Target/Mips/CMakeLists.txt Fri Oct 11 13:26:08 2019 @@ -13,6 +13,7 @@ tablegen(LLVM MipsGenMCPseudoLowering.in tablegen(LLVM MipsGenRegisterBank.inc -gen-register-bank) tablegen(LLVM MipsGenRegisterInfo.inc -gen-register-info) tablegen(LLVM MipsGenSubtargetInfo.inc -gen-subtarget) +tablegen(LLVM MipsGenExegesis.inc -gen-exegesis) add_public_tablegen_target(MipsCommonTableGen) Modified: llvm/trunk/lib/Target/Mips/Mips.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/Mips/Mips.td?rev=374590&r1=374589&r2=374590&view=diff ============================================================================== --- llvm/trunk/lib/Target/Mips/Mips.td (original) +++ llvm/trunk/lib/Target/Mips/Mips.td Fri Oct 11 13:26:08 2019 @@ -263,3 +263,9 @@ def Mips : Target { let 
AssemblyParserVariants = [MipsAsmParserVariant]; let AllowRegisterRenaming = 1; } + +//===----------------------------------------------------------------------===// +// Pfm Counters +//===----------------------------------------------------------------------===// + +include "MipsPfmCounters.td" Added: llvm/trunk/lib/Target/Mips/MipsPfmCounters.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/Mips/MipsPfmCounters.td?rev=374590&view=auto ============================================================================== --- llvm/trunk/lib/Target/Mips/MipsPfmCounters.td (added) +++ llvm/trunk/lib/Target/Mips/MipsPfmCounters.td Fri Oct 11 13:26:08 2019 @@ -0,0 +1,18 @@ +//===-- MipsPfmCounters.td - Mips Hardware Counters --------*- tablegen -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// +// This describes the available hardware counters for Mips. 
+// +//===----------------------------------------------------------------------===// + +def CpuCyclesPfmCounter : PfmCounter<"CYCLES">; + +def DefaultPfmCounters : ProcPfmCounters { + let CycleCounter = CpuCyclesPfmCounter; +} +def : PfmCountersDefaultBinding<DefaultPfmCounters>; Modified: llvm/trunk/tools/llvm-exegesis/lib/Assembler.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-exegesis/lib/Assembler.cpp?rev=374590&r1=374589&r2=374590&view=diff ============================================================================== --- llvm/trunk/tools/llvm-exegesis/lib/Assembler.cpp (original) +++ llvm/trunk/tools/llvm-exegesis/lib/Assembler.cpp Fri Oct 11 13:26:08 2019 @@ -227,9 +227,11 @@ void assembleToStream(const ExegesisTarg ET.addTargetSpecificPasses(PM); TPC->printAndVerify("After ExegesisTarget::addTargetSpecificPasses"); // Adding the following passes: + // - postrapseudos: expands pseudo return instructions used on some targets. // - machineverifier: checks that the MachineFunction is well formed. // - prologepilog: saves and restore callee saved registers. 
- for (const char *PassName : {"machineverifier", "prologepilog"}) + for (const char *PassName : + {"postrapseudos", "machineverifier", "prologepilog"}) if (addPass(PM, PassName, *TPC)) report_fatal_error("Unable to add a mandatory pass"); TPC->setInitialized(); Modified: llvm/trunk/tools/llvm-exegesis/lib/CMakeLists.txt URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-exegesis/lib/CMakeLists.txt?rev=374590&r1=374589&r2=374590&view=diff ============================================================================== --- llvm/trunk/tools/llvm-exegesis/lib/CMakeLists.txt (original) +++ llvm/trunk/tools/llvm-exegesis/lib/CMakeLists.txt Fri Oct 11 13:26:08 2019 @@ -12,6 +12,10 @@ if (LLVM_TARGETS_TO_BUILD MATCHES "Power add_subdirectory(PowerPC) set(TARGETS_TO_APPEND "${TARGETS_TO_APPEND} PowerPC") endif() +if (LLVM_TARGETS_TO_BUILD MATCHES "Mips") + add_subdirectory(Mips) + set(TARGETS_TO_APPEND "${TARGETS_TO_APPEND} Mips") +endif() set(LLVM_EXEGESIS_TARGETS "${LLVM_EXEGESIS_TARGETS} ${TARGETS_TO_APPEND}" PARENT_SCOPE) Added: llvm/trunk/tools/llvm-exegesis/lib/Mips/CMakeLists.txt URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-exegesis/lib/Mips/CMakeLists.txt?rev=374590&view=auto ============================================================================== --- llvm/trunk/tools/llvm-exegesis/lib/Mips/CMakeLists.txt (added) +++ llvm/trunk/tools/llvm-exegesis/lib/Mips/CMakeLists.txt Fri Oct 11 13:26:08 2019 @@ -0,0 +1,18 @@ +include_directories( + ${LLVM_MAIN_SRC_DIR}/lib/Target/Mips + ${LLVM_BINARY_DIR}/lib/Target/Mips + ) + +add_library(LLVMExegesisMips + STATIC + Target.cpp + ) + +llvm_update_compile_flags(LLVMExegesisMips) +llvm_map_components_to_libnames(libs + Mips + Exegesis + ) + +target_link_libraries(LLVMExegesisMips ${libs}) +set_target_properties(LLVMExegesisMips PROPERTIES FOLDER "Libraries") Added: llvm/trunk/tools/llvm-exegesis/lib/Mips/LLVMBuild.txt URL: 
http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-exegesis/lib/Mips/LLVMBuild.txt?rev=374590&view=auto ============================================================================== --- llvm/trunk/tools/llvm-exegesis/lib/Mips/LLVMBuild.txt (added) +++ llvm/trunk/tools/llvm-exegesis/lib/Mips/LLVMBuild.txt Fri Oct 11 13:26:08 2019 @@ -0,0 +1,21 @@ +;===- ./tools/llvm-exegesis/lib/Mips/LLVMBuild.txt -------------*- Conf -*--===; +; +; Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +; See https://llvm.org/LICENSE.txt for license information. +; SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +; +;===------------------------------------------------------------------------===; +; +; This is an LLVMBuild description file for the components in this subdirectory. +; +; For more information on the LLVMBuild system, please see: +; +; http://llvm.org/docs/LLVMBuild.html +; +;===------------------------------------------------------------------------===; + +[component_0] +type = Library +name = ExegesisMips +parent = Libraries +required_libraries = Mips Added: llvm/trunk/tools/llvm-exegesis/lib/Mips/Target.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-exegesis/lib/Mips/Target.cpp?rev=374590&view=auto ============================================================================== --- llvm/trunk/tools/llvm-exegesis/lib/Mips/Target.cpp (added) +++ llvm/trunk/tools/llvm-exegesis/lib/Mips/Target.cpp Fri Oct 11 13:26:08 2019 @@ -0,0 +1,67 @@ +//===-- Target.cpp ----------------------------------------------*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. 
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +#include "../Target.h" +#include "../Latency.h" +#include "Mips.h" +#include "MipsRegisterInfo.h" + +namespace llvm { +namespace exegesis { + +#include "MipsGenExegesis.inc" + +namespace { +class ExegesisMipsTarget : public ExegesisTarget { +public: + ExegesisMipsTarget() : ExegesisTarget(MipsCpuPfmCounters) {} + +private: + std::vector<MCInst> setRegTo(const MCSubtargetInfo &STI, unsigned Reg, + const APInt &Value) const override; + bool matchesArch(Triple::ArchType Arch) const override { + return Arch == Triple::mips || Arch == Triple::mipsel || + Arch == Triple::mips64 || Arch == Triple::mips64el; + } +}; +} // end anonymous namespace + +// Generates instruction to load an immediate value into a register. +static MCInst loadImmediate(unsigned Reg, unsigned RegBitWidth, + const APInt &Value) { + if (Value.getActiveBits() > 16) + llvm_unreachable("Not implemented for Values wider than 16 bits"); + if (Value.getBitWidth() > RegBitWidth) + llvm_unreachable("Value must fit in the Register"); + return MCInstBuilder(Mips::ORi) + .addReg(Reg) + .addReg(Mips::ZERO) + .addImm(Value.getZExtValue()); +} + +std::vector<MCInst> ExegesisMipsTarget::setRegTo(const MCSubtargetInfo &STI, + unsigned Reg, + const APInt &Value) const { + if (Mips::GPR32RegClass.contains(Reg)) + return {loadImmediate(Reg, 32, Value)}; + if (Mips::GPR64RegClass.contains(Reg)) + return {loadImmediate(Reg, 64, Value)}; + errs() << "setRegTo is not implemented, results will be unreliable\n"; + return {}; +} + +static ExegesisTarget *getTheExegesisMipsTarget() { + static ExegesisMipsTarget Target; + return &Target; +} + +void InitializeMipsExegesisTarget() { + ExegesisTarget::registerTarget(getTheExegesisMipsTarget()); +} + +} // namespace exegesis +} // namespace llvm Modified: llvm/trunk/unittests/tools/llvm-exegesis/CMakeLists.txt URL: 
http://llvm.org/viewvc/llvm-project/llvm/trunk/unittests/tools/llvm-exegesis/CMakeLists.txt?rev=374590&r1=374589&r2=374590&view=diff ============================================================================== --- llvm/trunk/unittests/tools/llvm-exegesis/CMakeLists.txt (original) +++ llvm/trunk/unittests/tools/llvm-exegesis/CMakeLists.txt Fri Oct 11 13:26:08 2019 @@ -30,3 +30,6 @@ endif() if(LLVM_TARGETS_TO_BUILD MATCHES "PowerPC") add_subdirectory(PowerPC) endif() +if(LLVM_TARGETS_TO_BUILD MATCHES "Mips") + add_subdirectory(Mips) +endif() Added: llvm/trunk/unittests/tools/llvm-exegesis/Mips/CMakeLists.txt URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/unittests/tools/llvm-exegesis/Mips/CMakeLists.txt?rev=374590&view=auto ============================================================================== --- llvm/trunk/unittests/tools/llvm-exegesis/Mips/CMakeLists.txt (added) +++ llvm/trunk/unittests/tools/llvm-exegesis/Mips/CMakeLists.txt Fri Oct 11 13:26:08 2019 @@ -0,0 +1,21 @@ +include_directories( + ${LLVM_MAIN_SRC_DIR}/lib/Target/Mips + ${LLVM_BINARY_DIR}/lib/Target/Mips + ${LLVM_MAIN_SRC_DIR}/tools/llvm-exegesis/lib + ) + +set(LLVM_LINK_COMPONENTS + MC + MCParser + Object + Support + Symbolize + Mips + ) + +add_llvm_unittest(LLVMExegesisMipsTests + TargetTest.cpp + ) +target_link_libraries(LLVMExegesisMipsTests PRIVATE + LLVMExegesis + LLVMExegesisMips) Added: llvm/trunk/unittests/tools/llvm-exegesis/Mips/TargetTest.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/unittests/tools/llvm-exegesis/Mips/TargetTest.cpp?rev=374590&view=auto ============================================================================== --- llvm/trunk/unittests/tools/llvm-exegesis/Mips/TargetTest.cpp (added) +++ llvm/trunk/unittests/tools/llvm-exegesis/Mips/TargetTest.cpp Fri Oct 11 13:26:08 2019 @@ -0,0 +1,91 @@ +//===-- TargetTest.cpp ------------------------------------------*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM 
Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// + +#include "Target.h" + +#include +#include + +#include "MCTargetDesc/MipsMCTargetDesc.h" +#include "llvm/Support/TargetRegistry.h" +#include "llvm/Support/TargetSelect.h" +#include "gmock/gmock.h" +#include "gtest/gtest.h" + +namespace llvm { +namespace exegesis { + +void InitializeMipsExegesisTarget(); + +namespace { + +using testing::AllOf; +using testing::ElementsAre; +using testing::Eq; +using testing::Matcher; +using testing::Property; + +Matcher<MCOperand> IsImm(int64_t Value) { + return AllOf(Property(&MCOperand::isImm, Eq(true)), + Property(&MCOperand::getImm, Eq(Value))); +} + +Matcher<MCOperand> IsReg(unsigned Reg) { + return AllOf(Property(&MCOperand::isReg, Eq(true)), + Property(&MCOperand::getReg, Eq(Reg))); +} + +Matcher<MCInst> OpcodeIs(unsigned Opcode) { + return Property(&MCInst::getOpcode, Eq(Opcode)); +} + +Matcher<MCInst> IsLoadLowImm(int64_t Reg, int64_t Value) { + return AllOf(OpcodeIs(Mips::ORi), + ElementsAre(IsReg(Reg), IsReg(Mips::ZERO), IsImm(Value))); +} + +constexpr const char kTriple[] = "mips-unknown-linux"; + +class MipsTargetTest : public ::testing::Test { +protected: + MipsTargetTest() : State(kTriple, "mips32", "") {} + + static void SetUpTestCase() { + LLVMInitializeMipsTargetInfo(); + LLVMInitializeMipsTarget(); + LLVMInitializeMipsTargetMC(); + InitializeMipsExegesisTarget(); + } + + std::vector<MCInst> setRegTo(unsigned Reg, const APInt &Value) { + return State.getExegesisTarget().setRegTo(State.getSubtargetInfo(), Reg, + Value); + } + + LLVMState State; +}; + +TEST_F(MipsTargetTest, SetRegToConstant) { + const uint16_t Value = 0xFFFFU; + const unsigned Reg = Mips::T0; + EXPECT_THAT(setRegTo(Reg, APInt(16, Value)), + ElementsAre(IsLoadLowImm(Reg, Value))); +} + +TEST_F(MipsTargetTest, DefaultPfmCounters) { + const std::string Expected = "CYCLES"; + 
EXPECT_EQ(State.getExegesisTarget().getPfmCounters("").CycleCounter, + Expected); + EXPECT_EQ( + State.getExegesisTarget().getPfmCounters("unknown_cpu").CycleCounter, + Expected); +} + +} // namespace +} // namespace exegesis +} // namespace llvm From llvm-commits at lists.llvm.org Fri Oct 11 13:24:29 2019 From: llvm-commits at lists.llvm.org (Hubert Tong via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 20:24:29 +0000 (UTC) Subject: [PATCH] D67008: [llvm-readobj][XCOFF]implement parsing relocation information for 32-bit xcoff objectfile In-Reply-To: References: Message-ID: hubert.reinterpretcast added a comment. Thanks @DiggerLin. I think this is almost ready. ================ Comment at: llvm/include/llvm/Object/XCOFFObjectFile.h:151 + + // Packed field, see XR_* masks for details of packing. + uint8_t Info; ---------------- Move the masks to the start of this class. Separate the nested types/constants from the fields using an access specifier label (e.g., `public`) or a comment. ================ Comment at: llvm/include/llvm/Object/XCOFFObjectFile.h:156 + + bool isRelocationSigned() const; + ---------------- Separate the fields from the methods using an access specifier label or a comment. ================ Comment at: llvm/include/llvm/Object/XCOFFObjectFile.h:158 + + // If the Fixup bit is set, it indicates that the linker has modified + // the instruction the relocation refers to. ---------------- Remove the comment and the blank line before it once the mask constants are defined in the class. ================ Comment at: llvm/include/llvm/Object/XCOFFObjectFile.h:313 + getLogicalNumberOfRelocationEntries(const XCOFFSectionHeader32 &Sec) const; + Expected> + relocations(const XCOFFSectionHeader32 &) const; ---------------- I would prefer a blank line between multi-line function declarations. 
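As a rough illustration of the layout these comments ask for (mask constants first, then fields, then methods, each group set off by an access specifier label), a minimal sketch might look like the following. Everything here is an assumption made for the example: `RelocationSketch`, the two mask names and their values, the plain integer field types, and `isFixupIndicated` are hypothetical and are not taken from the patch under review; only `Info` and `isRelocationSigned` appear in the quoted code.

```cpp
#include <cstdint>

// Hypothetical struct for illustration; not the declaration in
// XCOFFObjectFile.h.
struct RelocationSketch {
  // Nested constants first. Names and values are assumed for this sketch.
  static constexpr uint8_t XR_SIGN_INDICATOR_MASK = 0x80;
  static constexpr uint8_t XR_FIXUP_INDICATOR_MASK = 0x40;

public: // Fields.
  uint32_t VirtualAddress;
  uint32_t SymbolIndex;

  // Packed field, see XR_* masks for details of packing.
  uint8_t Info;

  uint8_t Type;

public: // Methods.
  bool isRelocationSigned() const {
    return (Info & XR_SIGN_INDICATOR_MASK) != 0;
  }

  bool isFixupIndicated() const {
    return (Info & XR_FIXUP_INDICATOR_MASK) != 0;
  }
};
```

With the masks documented at the top of the class, the per-method comment about the Fixup bit becomes redundant, which appears to be the point of the third comment.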
Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67008/new/ https://reviews.llvm.org/D67008 From llvm-commits at lists.llvm.org Fri Oct 11 13:25:04 2019 From: llvm-commits at lists.llvm.org (Z Nguyen-Huu via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 20:25:04 +0000 (UTC) Subject: [PATCH] D68886: Remove unnecessary codes in llvm-dwarfdump Message-ID: duongnhn created this revision. duongnhn added reviewers: JDevlieghere, MaskRay. duongnhn added projects: LLVM, debug-info. Herald added subscribers: llvm-commits, mgorny. This code is not needed. Removing it can reduce the size in the x64 Windows build from ~14MB to ~3MB. Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D68886 Files: llvm/tools/llvm-dwarfdump/CMakeLists.txt llvm/tools/llvm-dwarfdump/llvm-dwarfdump.cpp Index: llvm/tools/llvm-dwarfdump/llvm-dwarfdump.cpp =================================================================== --- llvm/tools/llvm-dwarfdump/llvm-dwarfdump.cpp +++ llvm/tools/llvm-dwarfdump/llvm-dwarfdump.cpp @@ -566,9 +566,6 @@ int main(int argc, char **argv) { InitLLVM X(argc, argv); - llvm::InitializeAllTargetInfos(); - llvm::InitializeAllTargetMCs(); - HideUnrelatedOptions({&DwarfDumpCategory, &SectionCategory, &ColorCategory}); cl::ParseCommandLineOptions( argc, argv, Index: llvm/tools/llvm-dwarfdump/CMakeLists.txt =================================================================== --- llvm/tools/llvm-dwarfdump/CMakeLists.txt +++ llvm/tools/llvm-dwarfdump/CMakeLists.txt @@ -1,8 +1,5 @@ set(LLVM_LINK_COMPONENTS DebugInfoDWARF - AllTargetsDescs - AllTargetsInfos - MC Object Support ) -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68886.224667.patch Type: text/x-patch Size: 844 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Fri Oct 11 13:25:25 2019 From: llvm-commits at lists.llvm.org (Simon Atanasyan via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 20:25:25 +0000 (UTC) Subject: [PATCH] D68649: [Mips][llvm-exegesis] Add a Mips target In-Reply-To: References: Message-ID: This revision was automatically updated to reflect the committed changes. Closed by commit rGcf1ba238d4f7: [Mips][llvm-exegesis] Add a Mips target (authored by atanasyan). Herald added subscribers: jrtc27, hiraditya. Herald added a project: LLVM. Changed prior to commit: https://reviews.llvm.org/D68649?vs=224106&id=224670#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68649/new/ https://reviews.llvm.org/D68649 Files: llvm/lib/Target/Mips/CMakeLists.txt llvm/lib/Target/Mips/Mips.td llvm/lib/Target/Mips/MipsPfmCounters.td llvm/tools/llvm-exegesis/lib/Assembler.cpp llvm/tools/llvm-exegesis/lib/CMakeLists.txt llvm/tools/llvm-exegesis/lib/Mips/CMakeLists.txt llvm/tools/llvm-exegesis/lib/Mips/LLVMBuild.txt llvm/tools/llvm-exegesis/lib/Mips/Target.cpp llvm/unittests/tools/llvm-exegesis/CMakeLists.txt llvm/unittests/tools/llvm-exegesis/Mips/CMakeLists.txt llvm/unittests/tools/llvm-exegesis/Mips/TargetTest.cpp -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68649.224670.patch Type: text/x-patch Size: 11302 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Fri Oct 11 13:33:03 2019 From: llvm-commits at lists.llvm.org (David Green via llvm-commits) Date: Fri, 11 Oct 2019 20:33:03 -0000 Subject: [llvm] r374592 - Revert 374373: [Codegen] Alter the default promotion for saturating adds and subs Message-ID: <20191011203303.A62A78EE2F@lists.llvm.org> Author: dmgreen Date: Fri Oct 11 13:33:03 2019 New Revision: 374592 URL: http://llvm.org/viewvc/llvm-project?rev=374592&view=rev Log: Revert 374373: [Codegen] Alter the default promotion for saturating adds and subs This commit is not extending the promoted integers as it should. Reverting whilst I look into the details. Modified: llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp llvm/trunk/test/CodeGen/AArch64/sadd_sat.ll llvm/trunk/test/CodeGen/AArch64/sadd_sat_vec.ll llvm/trunk/test/CodeGen/AArch64/ssub_sat.ll llvm/trunk/test/CodeGen/AArch64/ssub_sat_vec.ll llvm/trunk/test/CodeGen/AArch64/uadd_sat.ll llvm/trunk/test/CodeGen/AArch64/uadd_sat_vec.ll llvm/trunk/test/CodeGen/AArch64/usub_sat.ll llvm/trunk/test/CodeGen/AArch64/usub_sat_vec.ll llvm/trunk/test/CodeGen/ARM/sadd_sat.ll llvm/trunk/test/CodeGen/ARM/ssub_sat.ll llvm/trunk/test/CodeGen/ARM/uadd_sat.ll llvm/trunk/test/CodeGen/ARM/usub_sat.ll llvm/trunk/test/CodeGen/X86/sadd_sat.ll llvm/trunk/test/CodeGen/X86/ssub_sat.ll llvm/trunk/test/CodeGen/X86/uadd_sat.ll llvm/trunk/test/CodeGen/X86/usub_sat.ll Modified: llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp?rev=374592&r1=374591&r2=374592&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp (original) +++ llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp Fri Oct 11 13:33:03 2019 @@ -642,78 +642,48 @@ SDValue 
DAGTypeLegalizer::PromoteIntRes_ } SDValue DAGTypeLegalizer::PromoteIntRes_ADDSUBSAT(SDNode *N) { - // If the promoted type is legal, we can convert this to: - // 1. ANY_EXTEND iN to iM - // 2. SHL by M-N - // 3. [US][ADD|SUB]SAT - // 4. L/ASHR by M-N - // Else it is more efficient to convert this to a min and a max - // operation in the higher precision arithmetic. + // For promoting iN -> iM, this can be expanded by + // 1. ANY_EXTEND iN to iM + // 2. SHL by M-N + // 3. [US][ADD|SUB]SAT + // 4. L/ASHR by M-N SDLoc dl(N); SDValue Op1 = N->getOperand(0); SDValue Op2 = N->getOperand(1); unsigned OldBits = Op1.getScalarValueSizeInBits(); unsigned Opcode = N->getOpcode(); + unsigned ShiftOp; + switch (Opcode) { + case ISD::SADDSAT: + case ISD::SSUBSAT: + ShiftOp = ISD::SRA; + break; + case ISD::UADDSAT: + case ISD::USUBSAT: + ShiftOp = ISD::SRL; + break; + default: + llvm_unreachable("Expected opcode to be signed or unsigned saturation " + "addition or subtraction"); + } SDValue Op1Promoted = GetPromotedInteger(Op1); SDValue Op2Promoted = GetPromotedInteger(Op2); + EVT PromotedType = Op1Promoted.getValueType(); unsigned NewBits = PromotedType.getScalarSizeInBits(); - - if (TLI.isOperationLegalOrCustom(Opcode, PromotedType)) { - unsigned ShiftOp; - switch (Opcode) { - case ISD::SADDSAT: - case ISD::SSUBSAT: - ShiftOp = ISD::SRA; - break; - case ISD::UADDSAT: - case ISD::USUBSAT: - ShiftOp = ISD::SRL; - break; - default: - llvm_unreachable("Expected opcode to be signed or unsigned saturation " - "addition or subtraction"); - } - - unsigned SHLAmount = NewBits - OldBits; - EVT SHVT = TLI.getShiftAmountTy(PromotedType, DAG.getDataLayout()); - SDValue ShiftAmount = DAG.getConstant(SHLAmount, dl, SHVT); - Op1Promoted = - DAG.getNode(ISD::SHL, dl, PromotedType, Op1Promoted, ShiftAmount); - Op2Promoted = - DAG.getNode(ISD::SHL, dl, PromotedType, Op2Promoted, ShiftAmount); - - SDValue Result = - DAG.getNode(Opcode, dl, PromotedType, Op1Promoted, Op2Promoted); - return 
DAG.getNode(ShiftOp, dl, PromotedType, Result, ShiftAmount); - } else { - if (Opcode == ISD::USUBSAT) { - SDValue Max = - DAG.getNode(ISD::UMAX, dl, PromotedType, Op1Promoted, Op2Promoted); - return DAG.getNode(ISD::SUB, dl, PromotedType, Max, Op2Promoted); - } - - if (Opcode == ISD::UADDSAT) { - APInt MaxVal = APInt::getAllOnesValue(OldBits).zext(NewBits); - SDValue SatMax = DAG.getConstant(MaxVal, dl, PromotedType); - SDValue Add = - DAG.getNode(ISD::ADD, dl, PromotedType, Op1Promoted, Op2Promoted); - return DAG.getNode(ISD::UMIN, dl, PromotedType, Add, SatMax); - } - - unsigned AddOp = Opcode == ISD::SADDSAT ? ISD::ADD : ISD::SUB; - APInt MinVal = APInt::getSignedMinValue(OldBits).sext(NewBits); - APInt MaxVal = APInt::getSignedMaxValue(OldBits).sext(NewBits); - SDValue SatMin = DAG.getConstant(MinVal, dl, PromotedType); - SDValue SatMax = DAG.getConstant(MaxVal, dl, PromotedType); - SDValue Result = - DAG.getNode(AddOp, dl, PromotedType, Op1Promoted, Op2Promoted); - Result = DAG.getNode(ISD::SMIN, dl, PromotedType, Result, SatMax); - Result = DAG.getNode(ISD::SMAX, dl, PromotedType, Result, SatMin); - return Result; - } + unsigned SHLAmount = NewBits - OldBits; + EVT SHVT = TLI.getShiftAmountTy(PromotedType, DAG.getDataLayout()); + SDValue ShiftAmount = DAG.getConstant(SHLAmount, dl, SHVT); + Op1Promoted = + DAG.getNode(ISD::SHL, dl, PromotedType, Op1Promoted, ShiftAmount); + Op2Promoted = + DAG.getNode(ISD::SHL, dl, PromotedType, Op2Promoted, ShiftAmount); + + SDValue Result = + DAG.getNode(Opcode, dl, PromotedType, Op1Promoted, Op2Promoted); + return DAG.getNode(ShiftOp, dl, PromotedType, Result, ShiftAmount); } SDValue DAGTypeLegalizer::PromoteIntRes_MULFIX(SDNode *N) { Modified: llvm/trunk/test/CodeGen/AArch64/sadd_sat.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AArch64/sadd_sat.ll?rev=374592&r1=374591&r2=374592&view=diff ============================================================================== --- 
llvm/trunk/test/CodeGen/AArch64/sadd_sat.ll (original) +++ llvm/trunk/test/CodeGen/AArch64/sadd_sat.ll Fri Oct 11 13:33:03 2019 @@ -39,13 +39,14 @@ define i64 @func2(i64 %x, i64 %y) nounwi define i16 @func16(i16 %x, i16 %y) nounwind { ; CHECK-LABEL: func16: ; CHECK: // %bb.0: -; CHECK-NEXT: add w8, w0, w1 -; CHECK-NEXT: mov w9, #32767 -; CHECK-NEXT: cmp w8, w9 -; CHECK-NEXT: csel w8, w8, w9, lt -; CHECK-NEXT: cmn w8, #8, lsl #12 // =32768 -; CHECK-NEXT: mov w9, #-32768 -; CHECK-NEXT: csel w0, w8, w9, gt +; CHECK-NEXT: lsl w8, w0, #16 +; CHECK-NEXT: adds w10, w8, w1, lsl #16 +; CHECK-NEXT: mov w9, #2147483647 +; CHECK-NEXT: cmp w10, #0 // =0 +; CHECK-NEXT: cinv w9, w9, ge +; CHECK-NEXT: adds w8, w8, w1, lsl #16 +; CHECK-NEXT: csel w8, w9, w8, vs +; CHECK-NEXT: asr w0, w8, #16 ; CHECK-NEXT: ret %tmp = call i16 @llvm.sadd.sat.i16(i16 %x, i16 %y); ret i16 %tmp; @@ -54,13 +55,14 @@ define i16 @func16(i16 %x, i16 %y) nounw define i8 @func8(i8 %x, i8 %y) nounwind { ; CHECK-LABEL: func8: ; CHECK: // %bb.0: -; CHECK-NEXT: add w8, w0, w1 -; CHECK-NEXT: mov w9, #127 -; CHECK-NEXT: cmp w8, #127 // =127 -; CHECK-NEXT: csel w8, w8, w9, lt -; CHECK-NEXT: cmn w8, #128 // =128 -; CHECK-NEXT: mov w9, #-128 -; CHECK-NEXT: csel w0, w8, w9, gt +; CHECK-NEXT: lsl w8, w0, #24 +; CHECK-NEXT: adds w10, w8, w1, lsl #24 +; CHECK-NEXT: mov w9, #2147483647 +; CHECK-NEXT: cmp w10, #0 // =0 +; CHECK-NEXT: cinv w9, w9, ge +; CHECK-NEXT: adds w8, w8, w1, lsl #24 +; CHECK-NEXT: csel w8, w9, w8, vs +; CHECK-NEXT: asr w0, w8, #24 ; CHECK-NEXT: ret %tmp = call i8 @llvm.sadd.sat.i8(i8 %x, i8 %y); ret i8 %tmp; @@ -69,13 +71,14 @@ define i8 @func8(i8 %x, i8 %y) nounwind define i4 @func3(i4 %x, i4 %y) nounwind { ; CHECK-LABEL: func3: ; CHECK: // %bb.0: -; CHECK-NEXT: add w8, w0, w1 -; CHECK-NEXT: mov w9, #7 -; CHECK-NEXT: cmp w8, #7 // =7 -; CHECK-NEXT: csel w8, w8, w9, lt -; CHECK-NEXT: cmn w8, #8 // =8 -; CHECK-NEXT: mov w9, #-8 -; CHECK-NEXT: csel w0, w8, w9, gt +; CHECK-NEXT: lsl w8, w0, #28 +; 
CHECK-NEXT: adds w10, w8, w1, lsl #28 +; CHECK-NEXT: mov w9, #2147483647 +; CHECK-NEXT: cmp w10, #0 // =0 +; CHECK-NEXT: cinv w9, w9, ge +; CHECK-NEXT: adds w8, w8, w1, lsl #28 +; CHECK-NEXT: csel w8, w9, w8, vs +; CHECK-NEXT: asr w0, w8, #28 ; CHECK-NEXT: ret %tmp = call i4 @llvm.sadd.sat.i4(i4 %x, i4 %y); ret i4 %tmp; Modified: llvm/trunk/test/CodeGen/AArch64/sadd_sat_vec.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AArch64/sadd_sat_vec.ll?rev=374592&r1=374591&r2=374592&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/AArch64/sadd_sat_vec.ll (original) +++ llvm/trunk/test/CodeGen/AArch64/sadd_sat_vec.ll Fri Oct 11 13:33:03 2019 @@ -236,23 +236,30 @@ define void @v4i8(<4 x i8>* %px, <4 x i8 ; CHECK-NEXT: ldrb w9, [x1] ; CHECK-NEXT: ldrb w10, [x0, #1] ; CHECK-NEXT: ldrb w11, [x1, #1] +; CHECK-NEXT: ldrb w12, [x0, #2] ; CHECK-NEXT: fmov s0, w8 +; CHECK-NEXT: ldrb w8, [x1, #2] ; CHECK-NEXT: fmov s1, w9 -; CHECK-NEXT: ldrb w8, [x0, #2] -; CHECK-NEXT: ldrb w9, [x1, #2] ; CHECK-NEXT: mov v0.h[1], w10 +; CHECK-NEXT: ldrb w9, [x0, #3] +; CHECK-NEXT: ldrb w10, [x1, #3] ; CHECK-NEXT: mov v1.h[1], w11 -; CHECK-NEXT: ldrb w10, [x0, #3] -; CHECK-NEXT: ldrb w11, [x1, #3] -; CHECK-NEXT: mov v0.h[2], w8 -; CHECK-NEXT: mov v1.h[2], w9 -; CHECK-NEXT: mov v0.h[3], w10 -; CHECK-NEXT: mov v1.h[3], w11 -; CHECK-NEXT: add v0.4h, v0.4h, v1.4h -; CHECK-NEXT: movi v1.4h, #127 -; CHECK-NEXT: smin v0.4h, v0.4h, v1.4h -; CHECK-NEXT: mvni v1.4h, #127 -; CHECK-NEXT: smax v0.4h, v0.4h, v1.4h +; CHECK-NEXT: mov v0.h[2], w12 +; CHECK-NEXT: mov v1.h[2], w8 +; CHECK-NEXT: mov v0.h[3], w9 +; CHECK-NEXT: mov v1.h[3], w10 +; CHECK-NEXT: shl v1.4h, v1.4h, #8 +; CHECK-NEXT: shl v0.4h, v0.4h, #8 +; CHECK-NEXT: add v3.4h, v0.4h, v1.4h +; CHECK-NEXT: cmlt v4.4h, v3.4h, #0 +; CHECK-NEXT: mvni v2.4h, #128, lsl #8 +; CHECK-NEXT: cmlt v1.4h, v1.4h, #0 +; CHECK-NEXT: cmgt v0.4h, v0.4h, v3.4h +; CHECK-NEXT: mvn v5.8b, v4.8b 
+; CHECK-NEXT: bsl v2.8b, v4.8b, v5.8b +; CHECK-NEXT: eor v0.8b, v1.8b, v0.8b +; CHECK-NEXT: bsl v0.8b, v2.8b, v3.8b +; CHECK-NEXT: sshr v0.4h, v0.4h, #8 ; CHECK-NEXT: xtn v0.8b, v0.8h ; CHECK-NEXT: str s0, [x2] ; CHECK-NEXT: ret @@ -271,14 +278,21 @@ define void @v2i8(<2 x i8>* %px, <2 x i8 ; CHECK-NEXT: ldrb w10, [x0, #1] ; CHECK-NEXT: ldrb w11, [x1, #1] ; CHECK-NEXT: fmov s0, w8 -; CHECK-NEXT: fmov s1, w9 +; CHECK-NEXT: fmov s2, w9 ; CHECK-NEXT: mov v0.s[1], w10 -; CHECK-NEXT: mov v1.s[1], w11 -; CHECK-NEXT: add v0.2s, v0.2s, v1.2s -; CHECK-NEXT: movi v1.2s, #127 -; CHECK-NEXT: smin v0.2s, v0.2s, v1.2s -; CHECK-NEXT: mvni v1.2s, #127 -; CHECK-NEXT: smax v0.2s, v0.2s, v1.2s +; CHECK-NEXT: mov v2.s[1], w11 +; CHECK-NEXT: shl v2.2s, v2.2s, #24 +; CHECK-NEXT: shl v0.2s, v0.2s, #24 +; CHECK-NEXT: add v3.2s, v0.2s, v2.2s +; CHECK-NEXT: cmlt v4.2s, v3.2s, #0 +; CHECK-NEXT: mvni v1.2s, #128, lsl #24 +; CHECK-NEXT: cmlt v2.2s, v2.2s, #0 +; CHECK-NEXT: cmgt v0.2s, v0.2s, v3.2s +; CHECK-NEXT: mvn v5.8b, v4.8b +; CHECK-NEXT: eor v0.8b, v2.8b, v0.8b +; CHECK-NEXT: bsl v1.8b, v4.8b, v5.8b +; CHECK-NEXT: bsl v0.8b, v1.8b, v3.8b +; CHECK-NEXT: ushr v0.2s, v0.2s, #24 ; CHECK-NEXT: mov w8, v0.s[1] ; CHECK-NEXT: fmov w9, s0 ; CHECK-NEXT: strb w8, [x2, #1] @@ -322,14 +336,21 @@ define void @v2i16(<2 x i16>* %px, <2 x ; CHECK-NEXT: ldrh w10, [x0, #2] ; CHECK-NEXT: ldrh w11, [x1, #2] ; CHECK-NEXT: fmov s0, w8 -; CHECK-NEXT: fmov s1, w9 +; CHECK-NEXT: fmov s2, w9 ; CHECK-NEXT: mov v0.s[1], w10 -; CHECK-NEXT: mov v1.s[1], w11 -; CHECK-NEXT: add v0.2s, v0.2s, v1.2s -; CHECK-NEXT: movi v1.2s, #127, msl #8 -; CHECK-NEXT: smin v0.2s, v0.2s, v1.2s -; CHECK-NEXT: mvni v1.2s, #127, msl #8 -; CHECK-NEXT: smax v0.2s, v0.2s, v1.2s +; CHECK-NEXT: mov v2.s[1], w11 +; CHECK-NEXT: shl v2.2s, v2.2s, #16 +; CHECK-NEXT: shl v0.2s, v0.2s, #16 +; CHECK-NEXT: add v3.2s, v0.2s, v2.2s +; CHECK-NEXT: cmlt v4.2s, v3.2s, #0 +; CHECK-NEXT: mvni v1.2s, #128, lsl #24 +; CHECK-NEXT: cmlt v2.2s, v2.2s, #0 +; 
CHECK-NEXT: cmgt v0.2s, v0.2s, v3.2s +; CHECK-NEXT: mvn v5.8b, v4.8b +; CHECK-NEXT: eor v0.8b, v2.8b, v0.8b +; CHECK-NEXT: bsl v1.8b, v4.8b, v5.8b +; CHECK-NEXT: bsl v0.8b, v1.8b, v3.8b +; CHECK-NEXT: ushr v0.2s, v0.2s, #16 ; CHECK-NEXT: mov w8, v0.s[1] ; CHECK-NEXT: fmov w9, s0 ; CHECK-NEXT: strh w8, [x2, #2] @@ -441,11 +462,18 @@ define void @v1i16(<1 x i16>* %px, <1 x define <16 x i4> @v16i4(<16 x i4> %x, <16 x i4> %y) nounwind { ; CHECK-LABEL: v16i4: ; CHECK: // %bb.0: -; CHECK-NEXT: add v0.16b, v0.16b, v1.16b -; CHECK-NEXT: movi v1.16b, #7 -; CHECK-NEXT: smin v0.16b, v0.16b, v1.16b -; CHECK-NEXT: movi v1.16b, #248 -; CHECK-NEXT: smax v0.16b, v0.16b, v1.16b +; CHECK-NEXT: shl v1.16b, v1.16b, #4 +; CHECK-NEXT: shl v0.16b, v0.16b, #4 +; CHECK-NEXT: add v3.16b, v0.16b, v1.16b +; CHECK-NEXT: cmlt v4.16b, v3.16b, #0 +; CHECK-NEXT: movi v2.16b, #127 +; CHECK-NEXT: cmlt v1.16b, v1.16b, #0 +; CHECK-NEXT: cmgt v0.16b, v0.16b, v3.16b +; CHECK-NEXT: mvn v5.16b, v4.16b +; CHECK-NEXT: bsl v2.16b, v4.16b, v5.16b +; CHECK-NEXT: eor v0.16b, v1.16b, v0.16b +; CHECK-NEXT: bsl v0.16b, v2.16b, v3.16b +; CHECK-NEXT: sshr v0.16b, v0.16b, #4 ; CHECK-NEXT: ret %z = call <16 x i4> @llvm.sadd.sat.v16i4(<16 x i4> %x, <16 x i4> %y) ret <16 x i4> %z @@ -454,11 +482,18 @@ define <16 x i4> @v16i4(<16 x i4> %x, <1 define <16 x i1> @v16i1(<16 x i1> %x, <16 x i1> %y) nounwind { ; CHECK-LABEL: v16i1: ; CHECK: // %bb.0: -; CHECK-NEXT: add v0.16b, v0.16b, v1.16b -; CHECK-NEXT: movi v1.2d, #0000000000000000 -; CHECK-NEXT: smin v0.16b, v0.16b, v1.16b -; CHECK-NEXT: movi v1.2d, #0xffffffffffffffff -; CHECK-NEXT: smax v0.16b, v0.16b, v1.16b +; CHECK-NEXT: shl v1.16b, v1.16b, #7 +; CHECK-NEXT: shl v0.16b, v0.16b, #7 +; CHECK-NEXT: add v3.16b, v0.16b, v1.16b +; CHECK-NEXT: cmlt v4.16b, v3.16b, #0 +; CHECK-NEXT: movi v2.16b, #127 +; CHECK-NEXT: cmlt v1.16b, v1.16b, #0 +; CHECK-NEXT: cmgt v0.16b, v0.16b, v3.16b +; CHECK-NEXT: mvn v5.16b, v4.16b +; CHECK-NEXT: bsl v2.16b, v4.16b, v5.16b +; CHECK-NEXT: eor 
v0.16b, v1.16b, v0.16b +; CHECK-NEXT: bsl v0.16b, v2.16b, v3.16b +; CHECK-NEXT: sshr v0.16b, v0.16b, #7 ; CHECK-NEXT: ret %z = call <16 x i1> @llvm.sadd.sat.v16i1(<16 x i1> %x, <16 x i1> %y) ret <16 x i1> %z Modified: llvm/trunk/test/CodeGen/AArch64/ssub_sat.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AArch64/ssub_sat.ll?rev=374592&r1=374591&r2=374592&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/AArch64/ssub_sat.ll (original) +++ llvm/trunk/test/CodeGen/AArch64/ssub_sat.ll Fri Oct 11 13:33:03 2019 @@ -39,13 +39,14 @@ define i64 @func2(i64 %x, i64 %y) nounwi define i16 @func16(i16 %x, i16 %y) nounwind { ; CHECK-LABEL: func16: ; CHECK: // %bb.0: -; CHECK-NEXT: sub w8, w0, w1 -; CHECK-NEXT: mov w9, #32767 -; CHECK-NEXT: cmp w8, w9 -; CHECK-NEXT: csel w8, w8, w9, lt -; CHECK-NEXT: cmn w8, #8, lsl #12 // =32768 -; CHECK-NEXT: mov w9, #-32768 -; CHECK-NEXT: csel w0, w8, w9, gt +; CHECK-NEXT: lsl w8, w0, #16 +; CHECK-NEXT: subs w10, w8, w1, lsl #16 +; CHECK-NEXT: mov w9, #2147483647 +; CHECK-NEXT: cmp w10, #0 // =0 +; CHECK-NEXT: cinv w9, w9, ge +; CHECK-NEXT: subs w8, w8, w1, lsl #16 +; CHECK-NEXT: csel w8, w9, w8, vs +; CHECK-NEXT: asr w0, w8, #16 ; CHECK-NEXT: ret %tmp = call i16 @llvm.ssub.sat.i16(i16 %x, i16 %y); ret i16 %tmp; @@ -54,13 +55,14 @@ define i16 @func16(i16 %x, i16 %y) nounw define i8 @func8(i8 %x, i8 %y) nounwind { ; CHECK-LABEL: func8: ; CHECK: // %bb.0: -; CHECK-NEXT: sub w8, w0, w1 -; CHECK-NEXT: mov w9, #127 -; CHECK-NEXT: cmp w8, #127 // =127 -; CHECK-NEXT: csel w8, w8, w9, lt -; CHECK-NEXT: cmn w8, #128 // =128 -; CHECK-NEXT: mov w9, #-128 -; CHECK-NEXT: csel w0, w8, w9, gt +; CHECK-NEXT: lsl w8, w0, #24 +; CHECK-NEXT: subs w10, w8, w1, lsl #24 +; CHECK-NEXT: mov w9, #2147483647 +; CHECK-NEXT: cmp w10, #0 // =0 +; CHECK-NEXT: cinv w9, w9, ge +; CHECK-NEXT: subs w8, w8, w1, lsl #24 +; CHECK-NEXT: csel w8, w9, w8, vs +; CHECK-NEXT: asr w0, w8, #24 ; 
CHECK-NEXT: ret %tmp = call i8 @llvm.ssub.sat.i8(i8 %x, i8 %y); ret i8 %tmp; @@ -69,13 +71,14 @@ define i8 @func8(i8 %x, i8 %y) nounwind define i4 @func3(i4 %x, i4 %y) nounwind { ; CHECK-LABEL: func3: ; CHECK: // %bb.0: -; CHECK-NEXT: sub w8, w0, w1 -; CHECK-NEXT: mov w9, #7 -; CHECK-NEXT: cmp w8, #7 // =7 -; CHECK-NEXT: csel w8, w8, w9, lt -; CHECK-NEXT: cmn w8, #8 // =8 -; CHECK-NEXT: mov w9, #-8 -; CHECK-NEXT: csel w0, w8, w9, gt +; CHECK-NEXT: lsl w8, w0, #28 +; CHECK-NEXT: subs w10, w8, w1, lsl #28 +; CHECK-NEXT: mov w9, #2147483647 +; CHECK-NEXT: cmp w10, #0 // =0 +; CHECK-NEXT: cinv w9, w9, ge +; CHECK-NEXT: subs w8, w8, w1, lsl #28 +; CHECK-NEXT: csel w8, w9, w8, vs +; CHECK-NEXT: asr w0, w8, #28 ; CHECK-NEXT: ret %tmp = call i4 @llvm.ssub.sat.i4(i4 %x, i4 %y); ret i4 %tmp; Modified: llvm/trunk/test/CodeGen/AArch64/ssub_sat_vec.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AArch64/ssub_sat_vec.ll?rev=374592&r1=374591&r2=374592&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/AArch64/ssub_sat_vec.ll (original) +++ llvm/trunk/test/CodeGen/AArch64/ssub_sat_vec.ll Fri Oct 11 13:33:03 2019 @@ -237,23 +237,30 @@ define void @v4i8(<4 x i8>* %px, <4 x i8 ; CHECK-NEXT: ldrb w9, [x1] ; CHECK-NEXT: ldrb w10, [x0, #1] ; CHECK-NEXT: ldrb w11, [x1, #1] +; CHECK-NEXT: ldrb w12, [x0, #2] ; CHECK-NEXT: fmov s0, w8 +; CHECK-NEXT: ldrb w8, [x1, #2] ; CHECK-NEXT: fmov s1, w9 -; CHECK-NEXT: ldrb w8, [x0, #2] -; CHECK-NEXT: ldrb w9, [x1, #2] ; CHECK-NEXT: mov v0.h[1], w10 +; CHECK-NEXT: ldrb w9, [x0, #3] +; CHECK-NEXT: ldrb w10, [x1, #3] ; CHECK-NEXT: mov v1.h[1], w11 -; CHECK-NEXT: ldrb w10, [x0, #3] -; CHECK-NEXT: ldrb w11, [x1, #3] -; CHECK-NEXT: mov v0.h[2], w8 -; CHECK-NEXT: mov v1.h[2], w9 -; CHECK-NEXT: mov v0.h[3], w10 -; CHECK-NEXT: mov v1.h[3], w11 -; CHECK-NEXT: sub v0.4h, v0.4h, v1.4h -; CHECK-NEXT: movi v1.4h, #127 -; CHECK-NEXT: smin v0.4h, v0.4h, v1.4h -; CHECK-NEXT: mvni 
v1.4h, #127 -; CHECK-NEXT: smax v0.4h, v0.4h, v1.4h +; CHECK-NEXT: mov v0.h[2], w12 +; CHECK-NEXT: mov v1.h[2], w8 +; CHECK-NEXT: mov v0.h[3], w9 +; CHECK-NEXT: mov v1.h[3], w10 +; CHECK-NEXT: shl v1.4h, v1.4h, #8 +; CHECK-NEXT: shl v0.4h, v0.4h, #8 +; CHECK-NEXT: sub v3.4h, v0.4h, v1.4h +; CHECK-NEXT: cmlt v4.4h, v3.4h, #0 +; CHECK-NEXT: mvni v2.4h, #128, lsl #8 +; CHECK-NEXT: cmgt v1.4h, v1.4h, #0 +; CHECK-NEXT: cmgt v0.4h, v0.4h, v3.4h +; CHECK-NEXT: mvn v5.8b, v4.8b +; CHECK-NEXT: bsl v2.8b, v4.8b, v5.8b +; CHECK-NEXT: eor v0.8b, v1.8b, v0.8b +; CHECK-NEXT: bsl v0.8b, v2.8b, v3.8b +; CHECK-NEXT: sshr v0.4h, v0.4h, #8 ; CHECK-NEXT: xtn v0.8b, v0.8h ; CHECK-NEXT: str s0, [x2] ; CHECK-NEXT: ret @@ -272,14 +279,21 @@ define void @v2i8(<2 x i8>* %px, <2 x i8 ; CHECK-NEXT: ldrb w10, [x0, #1] ; CHECK-NEXT: ldrb w11, [x1, #1] ; CHECK-NEXT: fmov s0, w8 -; CHECK-NEXT: fmov s1, w9 +; CHECK-NEXT: fmov s2, w9 ; CHECK-NEXT: mov v0.s[1], w10 -; CHECK-NEXT: mov v1.s[1], w11 -; CHECK-NEXT: sub v0.2s, v0.2s, v1.2s -; CHECK-NEXT: movi v1.2s, #127 -; CHECK-NEXT: smin v0.2s, v0.2s, v1.2s -; CHECK-NEXT: mvni v1.2s, #127 -; CHECK-NEXT: smax v0.2s, v0.2s, v1.2s +; CHECK-NEXT: mov v2.s[1], w11 +; CHECK-NEXT: shl v2.2s, v2.2s, #24 +; CHECK-NEXT: shl v0.2s, v0.2s, #24 +; CHECK-NEXT: sub v3.2s, v0.2s, v2.2s +; CHECK-NEXT: cmlt v4.2s, v3.2s, #0 +; CHECK-NEXT: mvni v1.2s, #128, lsl #24 +; CHECK-NEXT: cmgt v2.2s, v2.2s, #0 +; CHECK-NEXT: cmgt v0.2s, v0.2s, v3.2s +; CHECK-NEXT: mvn v5.8b, v4.8b +; CHECK-NEXT: eor v0.8b, v2.8b, v0.8b +; CHECK-NEXT: bsl v1.8b, v4.8b, v5.8b +; CHECK-NEXT: bsl v0.8b, v1.8b, v3.8b +; CHECK-NEXT: ushr v0.2s, v0.2s, #24 ; CHECK-NEXT: mov w8, v0.s[1] ; CHECK-NEXT: fmov w9, s0 ; CHECK-NEXT: strb w8, [x2, #1] @@ -323,14 +337,21 @@ define void @v2i16(<2 x i16>* %px, <2 x ; CHECK-NEXT: ldrh w10, [x0, #2] ; CHECK-NEXT: ldrh w11, [x1, #2] ; CHECK-NEXT: fmov s0, w8 -; CHECK-NEXT: fmov s1, w9 +; CHECK-NEXT: fmov s2, w9 ; CHECK-NEXT: mov v0.s[1], w10 -; CHECK-NEXT: mov 
v1.s[1], w11 -; CHECK-NEXT: sub v0.2s, v0.2s, v1.2s -; CHECK-NEXT: movi v1.2s, #127, msl #8 -; CHECK-NEXT: smin v0.2s, v0.2s, v1.2s -; CHECK-NEXT: mvni v1.2s, #127, msl #8 -; CHECK-NEXT: smax v0.2s, v0.2s, v1.2s +; CHECK-NEXT: mov v2.s[1], w11 +; CHECK-NEXT: shl v2.2s, v2.2s, #16 +; CHECK-NEXT: shl v0.2s, v0.2s, #16 +; CHECK-NEXT: sub v3.2s, v0.2s, v2.2s +; CHECK-NEXT: cmlt v4.2s, v3.2s, #0 +; CHECK-NEXT: mvni v1.2s, #128, lsl #24 +; CHECK-NEXT: cmgt v2.2s, v2.2s, #0 +; CHECK-NEXT: cmgt v0.2s, v0.2s, v3.2s +; CHECK-NEXT: mvn v5.8b, v4.8b +; CHECK-NEXT: eor v0.8b, v2.8b, v0.8b +; CHECK-NEXT: bsl v1.8b, v4.8b, v5.8b +; CHECK-NEXT: bsl v0.8b, v1.8b, v3.8b +; CHECK-NEXT: ushr v0.2s, v0.2s, #16 ; CHECK-NEXT: mov w8, v0.s[1] ; CHECK-NEXT: fmov w9, s0 ; CHECK-NEXT: strh w8, [x2, #2] @@ -442,11 +463,18 @@ define void @v1i16(<1 x i16>* %px, <1 x define <16 x i4> @v16i4(<16 x i4> %x, <16 x i4> %y) nounwind { ; CHECK-LABEL: v16i4: ; CHECK: // %bb.0: -; CHECK-NEXT: sub v0.16b, v0.16b, v1.16b -; CHECK-NEXT: movi v1.16b, #7 -; CHECK-NEXT: smin v0.16b, v0.16b, v1.16b -; CHECK-NEXT: movi v1.16b, #248 -; CHECK-NEXT: smax v0.16b, v0.16b, v1.16b +; CHECK-NEXT: shl v1.16b, v1.16b, #4 +; CHECK-NEXT: shl v0.16b, v0.16b, #4 +; CHECK-NEXT: sub v3.16b, v0.16b, v1.16b +; CHECK-NEXT: cmlt v4.16b, v3.16b, #0 +; CHECK-NEXT: movi v2.16b, #127 +; CHECK-NEXT: cmgt v1.16b, v1.16b, #0 +; CHECK-NEXT: cmgt v0.16b, v0.16b, v3.16b +; CHECK-NEXT: mvn v5.16b, v4.16b +; CHECK-NEXT: bsl v2.16b, v4.16b, v5.16b +; CHECK-NEXT: eor v0.16b, v1.16b, v0.16b +; CHECK-NEXT: bsl v0.16b, v2.16b, v3.16b +; CHECK-NEXT: sshr v0.16b, v0.16b, #4 ; CHECK-NEXT: ret %z = call <16 x i4> @llvm.ssub.sat.v16i4(<16 x i4> %x, <16 x i4> %y) ret <16 x i4> %z @@ -455,11 +483,18 @@ define <16 x i4> @v16i4(<16 x i4> %x, <1 define <16 x i1> @v16i1(<16 x i1> %x, <16 x i1> %y) nounwind { ; CHECK-LABEL: v16i1: ; CHECK: // %bb.0: -; CHECK-NEXT: sub v0.16b, v0.16b, v1.16b -; CHECK-NEXT: movi v1.2d, #0000000000000000 -; CHECK-NEXT: smin 
v0.16b, v0.16b, v1.16b -; CHECK-NEXT: movi v1.2d, #0xffffffffffffffff -; CHECK-NEXT: smax v0.16b, v0.16b, v1.16b +; CHECK-NEXT: shl v1.16b, v1.16b, #7 +; CHECK-NEXT: shl v0.16b, v0.16b, #7 +; CHECK-NEXT: sub v3.16b, v0.16b, v1.16b +; CHECK-NEXT: cmlt v4.16b, v3.16b, #0 +; CHECK-NEXT: movi v2.16b, #127 +; CHECK-NEXT: cmgt v1.16b, v1.16b, #0 +; CHECK-NEXT: cmgt v0.16b, v0.16b, v3.16b +; CHECK-NEXT: mvn v5.16b, v4.16b +; CHECK-NEXT: bsl v2.16b, v4.16b, v5.16b +; CHECK-NEXT: eor v0.16b, v1.16b, v0.16b +; CHECK-NEXT: bsl v0.16b, v2.16b, v3.16b +; CHECK-NEXT: sshr v0.16b, v0.16b, #7 ; CHECK-NEXT: ret %z = call <16 x i1> @llvm.ssub.sat.v16i1(<16 x i1> %x, <16 x i1> %y) ret <16 x i1> %z Modified: llvm/trunk/test/CodeGen/AArch64/uadd_sat.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AArch64/uadd_sat.ll?rev=374592&r1=374591&r2=374592&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/AArch64/uadd_sat.ll (original) +++ llvm/trunk/test/CodeGen/AArch64/uadd_sat.ll Fri Oct 11 13:33:03 2019 @@ -30,10 +30,10 @@ define i64 @func2(i64 %x, i64 %y) nounwi define i16 @func16(i16 %x, i16 %y) nounwind { ; CHECK-LABEL: func16: ; CHECK: // %bb.0: -; CHECK-NEXT: add w8, w0, w1 -; CHECK-NEXT: mov w9, #65535 -; CHECK-NEXT: cmp w8, w9 -; CHECK-NEXT: csel w0, w8, w9, lo +; CHECK-NEXT: lsl w8, w0, #16 +; CHECK-NEXT: adds w8, w8, w1, lsl #16 +; CHECK-NEXT: csinv w8, w8, wzr, lo +; CHECK-NEXT: lsr w0, w8, #16 ; CHECK-NEXT: ret %tmp = call i16 @llvm.uadd.sat.i16(i16 %x, i16 %y); ret i16 %tmp; @@ -42,10 +42,10 @@ define i16 @func16(i16 %x, i16 %y) nounw define i8 @func8(i8 %x, i8 %y) nounwind { ; CHECK-LABEL: func8: ; CHECK: // %bb.0: -; CHECK-NEXT: add w8, w0, w1 -; CHECK-NEXT: cmp w8, #255 // =255 -; CHECK-NEXT: mov w9, #255 -; CHECK-NEXT: csel w0, w8, w9, lo +; CHECK-NEXT: lsl w8, w0, #24 +; CHECK-NEXT: adds w8, w8, w1, lsl #24 +; CHECK-NEXT: csinv w8, w8, wzr, lo +; CHECK-NEXT: lsr w0, w8, #24 ; CHECK-NEXT: 
ret %tmp = call i8 @llvm.uadd.sat.i8(i8 %x, i8 %y); ret i8 %tmp; @@ -54,10 +54,10 @@ define i8 @func8(i8 %x, i8 %y) nounwind define i4 @func3(i4 %x, i4 %y) nounwind { ; CHECK-LABEL: func3: ; CHECK: // %bb.0: -; CHECK-NEXT: add w8, w0, w1 -; CHECK-NEXT: cmp w8, #15 // =15 -; CHECK-NEXT: mov w9, #15 -; CHECK-NEXT: csel w0, w8, w9, lo +; CHECK-NEXT: lsl w8, w0, #28 +; CHECK-NEXT: adds w8, w8, w1, lsl #28 +; CHECK-NEXT: csinv w8, w8, wzr, lo +; CHECK-NEXT: lsr w0, w8, #28 ; CHECK-NEXT: ret %tmp = call i4 @llvm.uadd.sat.i4(i4 %x, i4 %y); ret i4 %tmp; Modified: llvm/trunk/test/CodeGen/AArch64/uadd_sat_vec.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AArch64/uadd_sat_vec.ll?rev=374592&r1=374591&r2=374592&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/AArch64/uadd_sat_vec.ll (original) +++ llvm/trunk/test/CodeGen/AArch64/uadd_sat_vec.ll Fri Oct 11 13:33:03 2019 @@ -142,25 +142,28 @@ define void @v8i8(<8 x i8>* %px, <8 x i8 define void @v4i8(<4 x i8>* %px, <4 x i8>* %py, <4 x i8>* %pz) nounwind { ; CHECK-LABEL: v4i8: ; CHECK: // %bb.0: -; CHECK-NEXT: ldrb w8, [x0] ; CHECK-NEXT: ldrb w9, [x1] -; CHECK-NEXT: ldrb w10, [x0, #1] +; CHECK-NEXT: ldrb w8, [x0] ; CHECK-NEXT: ldrb w11, [x1, #1] -; CHECK-NEXT: fmov s0, w8 +; CHECK-NEXT: ldrb w10, [x0, #1] ; CHECK-NEXT: fmov s1, w9 -; CHECK-NEXT: ldrb w8, [x0, #2] ; CHECK-NEXT: ldrb w9, [x1, #2] -; CHECK-NEXT: mov v0.h[1], w10 +; CHECK-NEXT: fmov s0, w8 +; CHECK-NEXT: ldrb w8, [x0, #2] ; CHECK-NEXT: mov v1.h[1], w11 -; CHECK-NEXT: ldrb w10, [x0, #3] ; CHECK-NEXT: ldrb w11, [x1, #3] -; CHECK-NEXT: mov v0.h[2], w8 +; CHECK-NEXT: mov v0.h[1], w10 +; CHECK-NEXT: ldrb w10, [x0, #3] ; CHECK-NEXT: mov v1.h[2], w9 -; CHECK-NEXT: mov v0.h[3], w10 +; CHECK-NEXT: mov v0.h[2], w8 ; CHECK-NEXT: mov v1.h[3], w11 +; CHECK-NEXT: mov v0.h[3], w10 +; CHECK-NEXT: shl v1.4h, v1.4h, #8 +; CHECK-NEXT: shl v0.4h, v0.4h, #8 +; CHECK-NEXT: mvn v2.8b, v1.8b +; 
CHECK-NEXT: umin v0.4h, v0.4h, v2.4h ; CHECK-NEXT: add v0.4h, v0.4h, v1.4h -; CHECK-NEXT: movi d1, #0xff00ff00ff00ff -; CHECK-NEXT: umin v0.4h, v0.4h, v1.4h +; CHECK-NEXT: ushr v0.4h, v0.4h, #8 ; CHECK-NEXT: xtn v0.8b, v0.8h ; CHECK-NEXT: str s0, [x2] ; CHECK-NEXT: ret @@ -174,17 +177,20 @@ define void @v4i8(<4 x i8>* %px, <4 x i8 define void @v2i8(<2 x i8>* %px, <2 x i8>* %py, <2 x i8>* %pz) nounwind { ; CHECK-LABEL: v2i8: ; CHECK: // %bb.0: -; CHECK-NEXT: ldrb w8, [x0] ; CHECK-NEXT: ldrb w9, [x1] -; CHECK-NEXT: ldrb w10, [x0, #1] +; CHECK-NEXT: ldrb w8, [x0] ; CHECK-NEXT: ldrb w11, [x1, #1] -; CHECK-NEXT: fmov s0, w8 +; CHECK-NEXT: ldrb w10, [x0, #1] ; CHECK-NEXT: fmov s1, w9 -; CHECK-NEXT: mov v0.s[1], w10 +; CHECK-NEXT: fmov s0, w8 ; CHECK-NEXT: mov v1.s[1], w11 +; CHECK-NEXT: mov v0.s[1], w10 +; CHECK-NEXT: shl v1.2s, v1.2s, #24 +; CHECK-NEXT: shl v0.2s, v0.2s, #24 +; CHECK-NEXT: mvn v2.8b, v1.8b +; CHECK-NEXT: umin v0.2s, v0.2s, v2.2s ; CHECK-NEXT: add v0.2s, v0.2s, v1.2s -; CHECK-NEXT: movi d1, #0x0000ff000000ff -; CHECK-NEXT: umin v0.2s, v0.2s, v1.2s +; CHECK-NEXT: ushr v0.2s, v0.2s, #24 ; CHECK-NEXT: mov w8, v0.s[1] ; CHECK-NEXT: fmov w9, s0 ; CHECK-NEXT: strb w8, [x2, #1] @@ -217,17 +223,20 @@ define void @v4i16(<4 x i16>* %px, <4 x define void @v2i16(<2 x i16>* %px, <2 x i16>* %py, <2 x i16>* %pz) nounwind { ; CHECK-LABEL: v2i16: ; CHECK: // %bb.0: -; CHECK-NEXT: ldrh w8, [x0] ; CHECK-NEXT: ldrh w9, [x1] -; CHECK-NEXT: ldrh w10, [x0, #2] +; CHECK-NEXT: ldrh w8, [x0] ; CHECK-NEXT: ldrh w11, [x1, #2] -; CHECK-NEXT: fmov s0, w8 +; CHECK-NEXT: ldrh w10, [x0, #2] ; CHECK-NEXT: fmov s1, w9 -; CHECK-NEXT: mov v0.s[1], w10 +; CHECK-NEXT: fmov s0, w8 ; CHECK-NEXT: mov v1.s[1], w11 +; CHECK-NEXT: mov v0.s[1], w10 +; CHECK-NEXT: shl v1.2s, v1.2s, #16 +; CHECK-NEXT: shl v0.2s, v0.2s, #16 +; CHECK-NEXT: mvn v2.8b, v1.8b +; CHECK-NEXT: umin v0.2s, v0.2s, v2.2s ; CHECK-NEXT: add v0.2s, v0.2s, v1.2s -; CHECK-NEXT: movi d1, #0x00ffff0000ffff -; CHECK-NEXT: umin v0.2s, 
v0.2s, v1.2s +; CHECK-NEXT: ushr v0.2s, v0.2s, #16 ; CHECK-NEXT: mov w8, v0.s[1] ; CHECK-NEXT: fmov w9, s0 ; CHECK-NEXT: strh w8, [x2, #2] @@ -309,9 +318,12 @@ define void @v1i16(<1 x i16>* %px, <1 x define <16 x i4> @v16i4(<16 x i4> %x, <16 x i4> %y) nounwind { ; CHECK-LABEL: v16i4: ; CHECK: // %bb.0: +; CHECK-NEXT: shl v1.16b, v1.16b, #4 +; CHECK-NEXT: shl v0.16b, v0.16b, #4 +; CHECK-NEXT: mvn v2.16b, v1.16b +; CHECK-NEXT: umin v0.16b, v0.16b, v2.16b ; CHECK-NEXT: add v0.16b, v0.16b, v1.16b -; CHECK-NEXT: movi v1.16b, #15 -; CHECK-NEXT: umin v0.16b, v0.16b, v1.16b +; CHECK-NEXT: ushr v0.16b, v0.16b, #4 ; CHECK-NEXT: ret %z = call <16 x i4> @llvm.uadd.sat.v16i4(<16 x i4> %x, <16 x i4> %y) ret <16 x i4> %z @@ -320,9 +332,12 @@ define <16 x i4> @v16i4(<16 x i4> %x, <1 define <16 x i1> @v16i1(<16 x i1> %x, <16 x i1> %y) nounwind { ; CHECK-LABEL: v16i1: ; CHECK: // %bb.0: +; CHECK-NEXT: shl v1.16b, v1.16b, #7 +; CHECK-NEXT: shl v0.16b, v0.16b, #7 +; CHECK-NEXT: mvn v2.16b, v1.16b +; CHECK-NEXT: umin v0.16b, v0.16b, v2.16b ; CHECK-NEXT: add v0.16b, v0.16b, v1.16b -; CHECK-NEXT: movi v1.16b, #1 -; CHECK-NEXT: umin v0.16b, v0.16b, v1.16b +; CHECK-NEXT: ushr v0.16b, v0.16b, #7 ; CHECK-NEXT: ret %z = call <16 x i1> @llvm.uadd.sat.v16i1(<16 x i1> %x, <16 x i1> %y) ret <16 x i1> %z Modified: llvm/trunk/test/CodeGen/AArch64/usub_sat.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AArch64/usub_sat.ll?rev=374592&r1=374591&r2=374592&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/AArch64/usub_sat.ll (original) +++ llvm/trunk/test/CodeGen/AArch64/usub_sat.ll Fri Oct 11 13:33:03 2019 @@ -30,9 +30,10 @@ define i64 @func2(i64 %x, i64 %y) nounwi define i16 @func16(i16 %x, i16 %y) nounwind { ; CHECK-LABEL: func16: ; CHECK: // %bb.0: -; CHECK-NEXT: cmp w0, w1 -; CHECK-NEXT: csel w8, w0, w1, hi -; CHECK-NEXT: sub w0, w8, w1 +; CHECK-NEXT: lsl w8, w0, #16 +; CHECK-NEXT: subs w8, w8, w1, lsl #16 +; 
CHECK-NEXT: csel w8, wzr, w8, lo +; CHECK-NEXT: lsr w0, w8, #16 ; CHECK-NEXT: ret %tmp = call i16 @llvm.usub.sat.i16(i16 %x, i16 %y); ret i16 %tmp; @@ -41,9 +42,10 @@ define i16 @func16(i16 %x, i16 %y) nounw define i8 @func8(i8 %x, i8 %y) nounwind { ; CHECK-LABEL: func8: ; CHECK: // %bb.0: -; CHECK-NEXT: cmp w0, w1 -; CHECK-NEXT: csel w8, w0, w1, hi -; CHECK-NEXT: sub w0, w8, w1 +; CHECK-NEXT: lsl w8, w0, #24 +; CHECK-NEXT: subs w8, w8, w1, lsl #24 +; CHECK-NEXT: csel w8, wzr, w8, lo +; CHECK-NEXT: lsr w0, w8, #24 ; CHECK-NEXT: ret %tmp = call i8 @llvm.usub.sat.i8(i8 %x, i8 %y); ret i8 %tmp; @@ -52,9 +54,10 @@ define i8 @func8(i8 %x, i8 %y) nounwind define i4 @func3(i4 %x, i4 %y) nounwind { ; CHECK-LABEL: func3: ; CHECK: // %bb.0: -; CHECK-NEXT: cmp w0, w1 -; CHECK-NEXT: csel w8, w0, w1, hi -; CHECK-NEXT: sub w0, w8, w1 +; CHECK-NEXT: lsl w8, w0, #28 +; CHECK-NEXT: subs w8, w8, w1, lsl #28 +; CHECK-NEXT: csel w8, wzr, w8, lo +; CHECK-NEXT: lsr w0, w8, #28 ; CHECK-NEXT: ret %tmp = call i4 @llvm.usub.sat.i4(i4 %x, i4 %y); ret i4 %tmp; Modified: llvm/trunk/test/CodeGen/AArch64/usub_sat_vec.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AArch64/usub_sat_vec.ll?rev=374592&r1=374591&r2=374592&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/AArch64/usub_sat_vec.ll (original) +++ llvm/trunk/test/CodeGen/AArch64/usub_sat_vec.ll Fri Oct 11 13:33:03 2019 @@ -144,8 +144,11 @@ define void @v4i8(<4 x i8>* %px, <4 x i8 ; CHECK-NEXT: mov v1.h[2], w9 ; CHECK-NEXT: mov v0.h[3], w10 ; CHECK-NEXT: mov v1.h[3], w11 +; CHECK-NEXT: shl v1.4h, v1.4h, #8 +; CHECK-NEXT: shl v0.4h, v0.4h, #8 ; CHECK-NEXT: umax v0.4h, v0.4h, v1.4h ; CHECK-NEXT: sub v0.4h, v0.4h, v1.4h +; CHECK-NEXT: ushr v0.4h, v0.4h, #8 ; CHECK-NEXT: xtn v0.8b, v0.8h ; CHECK-NEXT: str s0, [x2] ; CHECK-NEXT: ret @@ -167,8 +170,11 @@ define void @v2i8(<2 x i8>* %px, <2 x i8 ; CHECK-NEXT: fmov s1, w9 ; CHECK-NEXT: mov v0.s[1], w10 ; 
CHECK-NEXT: mov v1.s[1], w11 +; CHECK-NEXT: shl v1.2s, v1.2s, #24 +; CHECK-NEXT: shl v0.2s, v0.2s, #24 ; CHECK-NEXT: umax v0.2s, v0.2s, v1.2s ; CHECK-NEXT: sub v0.2s, v0.2s, v1.2s +; CHECK-NEXT: ushr v0.2s, v0.2s, #24 ; CHECK-NEXT: mov w8, v0.s[1] ; CHECK-NEXT: fmov w9, s0 ; CHECK-NEXT: strb w8, [x2, #1] @@ -208,8 +214,11 @@ define void @v2i16(<2 x i16>* %px, <2 x ; CHECK-NEXT: fmov s1, w9 ; CHECK-NEXT: mov v0.s[1], w10 ; CHECK-NEXT: mov v1.s[1], w11 +; CHECK-NEXT: shl v1.2s, v1.2s, #16 +; CHECK-NEXT: shl v0.2s, v0.2s, #16 ; CHECK-NEXT: umax v0.2s, v0.2s, v1.2s ; CHECK-NEXT: sub v0.2s, v0.2s, v1.2s +; CHECK-NEXT: ushr v0.2s, v0.2s, #16 ; CHECK-NEXT: mov w8, v0.s[1] ; CHECK-NEXT: fmov w9, s0 ; CHECK-NEXT: strh w8, [x2, #2] @@ -286,8 +295,11 @@ define void @v1i16(<1 x i16>* %px, <1 x define <16 x i4> @v16i4(<16 x i4> %x, <16 x i4> %y) nounwind { ; CHECK-LABEL: v16i4: ; CHECK: // %bb.0: +; CHECK-NEXT: shl v1.16b, v1.16b, #4 +; CHECK-NEXT: shl v0.16b, v0.16b, #4 ; CHECK-NEXT: umax v0.16b, v0.16b, v1.16b ; CHECK-NEXT: sub v0.16b, v0.16b, v1.16b +; CHECK-NEXT: ushr v0.16b, v0.16b, #4 ; CHECK-NEXT: ret %z = call <16 x i4> @llvm.usub.sat.v16i4(<16 x i4> %x, <16 x i4> %y) ret <16 x i4> %z @@ -296,8 +308,11 @@ define <16 x i4> @v16i4(<16 x i4> %x, <1 define <16 x i1> @v16i1(<16 x i1> %x, <16 x i1> %y) nounwind { ; CHECK-LABEL: v16i1: ; CHECK: // %bb.0: +; CHECK-NEXT: shl v1.16b, v1.16b, #7 +; CHECK-NEXT: shl v0.16b, v0.16b, #7 ; CHECK-NEXT: umax v0.16b, v0.16b, v1.16b ; CHECK-NEXT: sub v0.16b, v0.16b, v1.16b +; CHECK-NEXT: ushr v0.16b, v0.16b, #7 ; CHECK-NEXT: ret %z = call <16 x i1> @llvm.usub.sat.v16i1(<16 x i1> %x, <16 x i1> %y) ret <16 x i1> %z Modified: llvm/trunk/test/CodeGen/ARM/sadd_sat.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/ARM/sadd_sat.ll?rev=374592&r1=374591&r2=374592&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/ARM/sadd_sat.ll (original) +++ 
llvm/trunk/test/CodeGen/ARM/sadd_sat.ll Fri Oct 11 13:33:03 2019 @@ -210,51 +210,67 @@ define i64 @func2(i64 %x, i64 %y) nounwi define i16 @func16(i16 %x, i16 %y) nounwind { ; CHECK-T1-LABEL: func16: ; CHECK-T1: @ %bb.0: -; CHECK-T1-NEXT: adds r0, r0, r1 -; CHECK-T1-NEXT: ldr r1, .LCPI2_0 -; CHECK-T1-NEXT: cmp r0, r1 -; CHECK-T1-NEXT: blt .LBB2_2 +; CHECK-T1-NEXT: lsls r3, r1, #16 +; CHECK-T1-NEXT: lsls r1, r0, #16 +; CHECK-T1-NEXT: movs r2, #1 +; CHECK-T1-NEXT: adds r0, r1, r3 +; CHECK-T1-NEXT: mov r3, r2 +; CHECK-T1-NEXT: bmi .LBB2_2 ; CHECK-T1-NEXT: @ %bb.1: -; CHECK-T1-NEXT: mov r0, r1 +; CHECK-T1-NEXT: movs r3, #0 ; CHECK-T1-NEXT: .LBB2_2: -; CHECK-T1-NEXT: ldr r1, .LCPI2_1 -; CHECK-T1-NEXT: cmp r0, r1 -; CHECK-T1-NEXT: bgt .LBB2_4 +; CHECK-T1-NEXT: cmp r3, #0 +; CHECK-T1-NEXT: bne .LBB2_4 ; CHECK-T1-NEXT: @ %bb.3: -; CHECK-T1-NEXT: mov r0, r1 +; CHECK-T1-NEXT: lsls r2, r2, #31 +; CHECK-T1-NEXT: cmp r0, r1 +; CHECK-T1-NEXT: bvs .LBB2_5 +; CHECK-T1-NEXT: b .LBB2_6 ; CHECK-T1-NEXT: .LBB2_4: +; CHECK-T1-NEXT: ldr r2, .LCPI2_0 +; CHECK-T1-NEXT: cmp r0, r1 +; CHECK-T1-NEXT: bvc .LBB2_6 +; CHECK-T1-NEXT: .LBB2_5: +; CHECK-T1-NEXT: mov r0, r2 +; CHECK-T1-NEXT: .LBB2_6: +; CHECK-T1-NEXT: asrs r0, r0, #16 ; CHECK-T1-NEXT: bx lr ; CHECK-T1-NEXT: .p2align 2 -; CHECK-T1-NEXT: @ %bb.5: +; CHECK-T1-NEXT: @ %bb.7: ; CHECK-T1-NEXT: .LCPI2_0: -; CHECK-T1-NEXT: .long 32767 @ 0x7fff -; CHECK-T1-NEXT: .LCPI2_1: -; CHECK-T1-NEXT: .long 4294934528 @ 0xffff8000 +; CHECK-T1-NEXT: .long 2147483647 @ 0x7fffffff ; ; CHECK-T2-LABEL: func16: ; CHECK-T2: @ %bb.0: -; CHECK-T2-NEXT: add r0, r1 -; CHECK-T2-NEXT: movw r1, #32767 -; CHECK-T2-NEXT: cmp r0, r1 -; CHECK-T2-NEXT: it lt -; CHECK-T2-NEXT: movlt r1, r0 -; CHECK-T2-NEXT: movw r0, #32768 -; CHECK-T2-NEXT: cmn.w r1, #32768 -; CHECK-T2-NEXT: movt r0, #65535 -; CHECK-T2-NEXT: it gt -; CHECK-T2-NEXT: movgt r0, r1 +; CHECK-T2-NEXT: lsls r2, r0, #16 +; CHECK-T2-NEXT: add.w r1, r2, r1, lsl #16 +; CHECK-T2-NEXT: movs r2, #0 +; CHECK-T2-NEXT: 
cmp r1, #0 +; CHECK-T2-NEXT: mov.w r3, #-2147483648 +; CHECK-T2-NEXT: it mi +; CHECK-T2-NEXT: movmi r2, #1 +; CHECK-T2-NEXT: cmp r2, #0 +; CHECK-T2-NEXT: it ne +; CHECK-T2-NEXT: mvnne r3, #-2147483648 +; CHECK-T2-NEXT: cmp.w r1, r0, lsl #16 +; CHECK-T2-NEXT: it vc +; CHECK-T2-NEXT: movvc r3, r1 +; CHECK-T2-NEXT: asrs r0, r3, #16 ; CHECK-T2-NEXT: bx lr ; ; CHECK-ARM-LABEL: func16: ; CHECK-ARM: @ %bb.0: -; CHECK-ARM-NEXT: add r0, r0, r1 -; CHECK-ARM-NEXT: movw r1, #32767 -; CHECK-ARM-NEXT: cmp r0, r1 -; CHECK-ARM-NEXT: movlt r1, r0 -; CHECK-ARM-NEXT: movw r0, #32768 -; CHECK-ARM-NEXT: movt r0, #65535 -; CHECK-ARM-NEXT: cmn r1, #32768 -; CHECK-ARM-NEXT: movgt r0, r1 +; CHECK-ARM-NEXT: lsl r2, r0, #16 +; CHECK-ARM-NEXT: add r1, r2, r1, lsl #16 +; CHECK-ARM-NEXT: mov r2, #0 +; CHECK-ARM-NEXT: cmp r1, #0 +; CHECK-ARM-NEXT: movwmi r2, #1 +; CHECK-ARM-NEXT: mov r3, #-2147483648 +; CHECK-ARM-NEXT: cmp r2, #0 +; CHECK-ARM-NEXT: mvnne r3, #-2147483648 +; CHECK-ARM-NEXT: cmp r1, r0, lsl #16 +; CHECK-ARM-NEXT: movvc r3, r1 +; CHECK-ARM-NEXT: asr r0, r3, #16 ; CHECK-ARM-NEXT: bx lr %tmp = call i16 @llvm.sadd.sat.i16(i16 %x, i16 %y) ret i16 %tmp @@ -263,39 +279,67 @@ define i16 @func16(i16 %x, i16 %y) nounw define i8 @func8(i8 %x, i8 %y) nounwind { ; CHECK-T1-LABEL: func8: ; CHECK-T1: @ %bb.0: -; CHECK-T1-NEXT: adds r0, r0, r1 -; CHECK-T1-NEXT: movs r1, #127 -; CHECK-T1-NEXT: cmp r0, #127 -; CHECK-T1-NEXT: blt .LBB3_2 +; CHECK-T1-NEXT: lsls r3, r1, #24 +; CHECK-T1-NEXT: lsls r1, r0, #24 +; CHECK-T1-NEXT: movs r2, #1 +; CHECK-T1-NEXT: adds r0, r1, r3 +; CHECK-T1-NEXT: mov r3, r2 +; CHECK-T1-NEXT: bmi .LBB3_2 ; CHECK-T1-NEXT: @ %bb.1: -; CHECK-T1-NEXT: mov r0, r1 +; CHECK-T1-NEXT: movs r3, #0 ; CHECK-T1-NEXT: .LBB3_2: -; CHECK-T1-NEXT: mvns r1, r1 -; CHECK-T1-NEXT: cmp r0, r1 -; CHECK-T1-NEXT: bgt .LBB3_4 +; CHECK-T1-NEXT: cmp r3, #0 +; CHECK-T1-NEXT: bne .LBB3_4 ; CHECK-T1-NEXT: @ %bb.3: -; CHECK-T1-NEXT: mov r0, r1 +; CHECK-T1-NEXT: lsls r2, r2, #31 +; CHECK-T1-NEXT: cmp r0, r1 
+; CHECK-T1-NEXT: bvs .LBB3_5 +; CHECK-T1-NEXT: b .LBB3_6 ; CHECK-T1-NEXT: .LBB3_4: +; CHECK-T1-NEXT: ldr r2, .LCPI3_0 +; CHECK-T1-NEXT: cmp r0, r1 +; CHECK-T1-NEXT: bvc .LBB3_6 +; CHECK-T1-NEXT: .LBB3_5: +; CHECK-T1-NEXT: mov r0, r2 +; CHECK-T1-NEXT: .LBB3_6: +; CHECK-T1-NEXT: asrs r0, r0, #24 ; CHECK-T1-NEXT: bx lr +; CHECK-T1-NEXT: .p2align 2 +; CHECK-T1-NEXT: @ %bb.7: +; CHECK-T1-NEXT: .LCPI3_0: +; CHECK-T1-NEXT: .long 2147483647 @ 0x7fffffff ; ; CHECK-T2-LABEL: func8: ; CHECK-T2: @ %bb.0: -; CHECK-T2-NEXT: add r0, r1 -; CHECK-T2-NEXT: cmp r0, #127 -; CHECK-T2-NEXT: it ge -; CHECK-T2-NEXT: movge r0, #127 -; CHECK-T2-NEXT: cmn.w r0, #128 -; CHECK-T2-NEXT: it le -; CHECK-T2-NEXT: mvnle r0, #127 +; CHECK-T2-NEXT: lsls r2, r0, #24 +; CHECK-T2-NEXT: add.w r1, r2, r1, lsl #24 +; CHECK-T2-NEXT: movs r2, #0 +; CHECK-T2-NEXT: cmp r1, #0 +; CHECK-T2-NEXT: mov.w r3, #-2147483648 +; CHECK-T2-NEXT: it mi +; CHECK-T2-NEXT: movmi r2, #1 +; CHECK-T2-NEXT: cmp r2, #0 +; CHECK-T2-NEXT: it ne +; CHECK-T2-NEXT: mvnne r3, #-2147483648 +; CHECK-T2-NEXT: cmp.w r1, r0, lsl #24 +; CHECK-T2-NEXT: it vc +; CHECK-T2-NEXT: movvc r3, r1 +; CHECK-T2-NEXT: asrs r0, r3, #24 ; CHECK-T2-NEXT: bx lr ; ; CHECK-ARM-LABEL: func8: ; CHECK-ARM: @ %bb.0: -; CHECK-ARM-NEXT: add r0, r0, r1 -; CHECK-ARM-NEXT: cmp r0, #127 -; CHECK-ARM-NEXT: movge r0, #127 -; CHECK-ARM-NEXT: cmn r0, #128 -; CHECK-ARM-NEXT: mvnle r0, #127 +; CHECK-ARM-NEXT: lsl r2, r0, #24 +; CHECK-ARM-NEXT: add r1, r2, r1, lsl #24 +; CHECK-ARM-NEXT: mov r2, #0 +; CHECK-ARM-NEXT: cmp r1, #0 +; CHECK-ARM-NEXT: movwmi r2, #1 +; CHECK-ARM-NEXT: mov r3, #-2147483648 +; CHECK-ARM-NEXT: cmp r2, #0 +; CHECK-ARM-NEXT: mvnne r3, #-2147483648 +; CHECK-ARM-NEXT: cmp r1, r0, lsl #24 +; CHECK-ARM-NEXT: movvc r3, r1 +; CHECK-ARM-NEXT: asr r0, r3, #24 ; CHECK-ARM-NEXT: bx lr %tmp = call i8 @llvm.sadd.sat.i8(i8 %x, i8 %y) ret i8 %tmp @@ -304,39 +348,67 @@ define i8 @func8(i8 %x, i8 %y) nounwind define i4 @func3(i4 %x, i4 %y) nounwind { ; CHECK-T1-LABEL: 
func3: ; CHECK-T1: @ %bb.0: -; CHECK-T1-NEXT: adds r0, r0, r1 -; CHECK-T1-NEXT: movs r1, #7 -; CHECK-T1-NEXT: cmp r0, #7 -; CHECK-T1-NEXT: blt .LBB4_2 +; CHECK-T1-NEXT: lsls r3, r1, #28 +; CHECK-T1-NEXT: lsls r1, r0, #28 +; CHECK-T1-NEXT: movs r2, #1 +; CHECK-T1-NEXT: adds r0, r1, r3 +; CHECK-T1-NEXT: mov r3, r2 +; CHECK-T1-NEXT: bmi .LBB4_2 ; CHECK-T1-NEXT: @ %bb.1: -; CHECK-T1-NEXT: mov r0, r1 +; CHECK-T1-NEXT: movs r3, #0 ; CHECK-T1-NEXT: .LBB4_2: -; CHECK-T1-NEXT: mvns r1, r1 -; CHECK-T1-NEXT: cmp r0, r1 -; CHECK-T1-NEXT: bgt .LBB4_4 +; CHECK-T1-NEXT: cmp r3, #0 +; CHECK-T1-NEXT: bne .LBB4_4 ; CHECK-T1-NEXT: @ %bb.3: -; CHECK-T1-NEXT: mov r0, r1 +; CHECK-T1-NEXT: lsls r2, r2, #31 +; CHECK-T1-NEXT: cmp r0, r1 +; CHECK-T1-NEXT: bvs .LBB4_5 +; CHECK-T1-NEXT: b .LBB4_6 ; CHECK-T1-NEXT: .LBB4_4: +; CHECK-T1-NEXT: ldr r2, .LCPI4_0 +; CHECK-T1-NEXT: cmp r0, r1 +; CHECK-T1-NEXT: bvc .LBB4_6 +; CHECK-T1-NEXT: .LBB4_5: +; CHECK-T1-NEXT: mov r0, r2 +; CHECK-T1-NEXT: .LBB4_6: +; CHECK-T1-NEXT: asrs r0, r0, #28 ; CHECK-T1-NEXT: bx lr +; CHECK-T1-NEXT: .p2align 2 +; CHECK-T1-NEXT: @ %bb.7: +; CHECK-T1-NEXT: .LCPI4_0: +; CHECK-T1-NEXT: .long 2147483647 @ 0x7fffffff ; ; CHECK-T2-LABEL: func3: ; CHECK-T2: @ %bb.0: -; CHECK-T2-NEXT: add r0, r1 -; CHECK-T2-NEXT: cmp r0, #7 -; CHECK-T2-NEXT: it ge -; CHECK-T2-NEXT: movge r0, #7 -; CHECK-T2-NEXT: cmn.w r0, #8 -; CHECK-T2-NEXT: it le -; CHECK-T2-NEXT: mvnle r0, #7 +; CHECK-T2-NEXT: lsls r2, r0, #28 +; CHECK-T2-NEXT: add.w r1, r2, r1, lsl #28 +; CHECK-T2-NEXT: movs r2, #0 +; CHECK-T2-NEXT: cmp r1, #0 +; CHECK-T2-NEXT: mov.w r3, #-2147483648 +; CHECK-T2-NEXT: it mi +; CHECK-T2-NEXT: movmi r2, #1 +; CHECK-T2-NEXT: cmp r2, #0 +; CHECK-T2-NEXT: it ne +; CHECK-T2-NEXT: mvnne r3, #-2147483648 +; CHECK-T2-NEXT: cmp.w r1, r0, lsl #28 +; CHECK-T2-NEXT: it vc +; CHECK-T2-NEXT: movvc r3, r1 +; CHECK-T2-NEXT: asrs r0, r3, #28 ; CHECK-T2-NEXT: bx lr ; ; CHECK-ARM-LABEL: func3: ; CHECK-ARM: @ %bb.0: -; CHECK-ARM-NEXT: add r0, r0, r1 -; 
CHECK-ARM-NEXT: cmp r0, #7 -; CHECK-ARM-NEXT: movge r0, #7 -; CHECK-ARM-NEXT: cmn r0, #8 -; CHECK-ARM-NEXT: mvnle r0, #7 +; CHECK-ARM-NEXT: lsl r2, r0, #28 +; CHECK-ARM-NEXT: add r1, r2, r1, lsl #28 +; CHECK-ARM-NEXT: mov r2, #0 +; CHECK-ARM-NEXT: cmp r1, #0 +; CHECK-ARM-NEXT: movwmi r2, #1 +; CHECK-ARM-NEXT: mov r3, #-2147483648 +; CHECK-ARM-NEXT: cmp r2, #0 +; CHECK-ARM-NEXT: mvnne r3, #-2147483648 +; CHECK-ARM-NEXT: cmp r1, r0, lsl #28 +; CHECK-ARM-NEXT: movvc r3, r1 +; CHECK-ARM-NEXT: asr r0, r3, #28 ; CHECK-ARM-NEXT: bx lr %tmp = call i4 @llvm.sadd.sat.i4(i4 %x, i4 %y) ret i4 %tmp Modified: llvm/trunk/test/CodeGen/ARM/ssub_sat.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/ARM/ssub_sat.ll?rev=374592&r1=374591&r2=374592&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/ARM/ssub_sat.ll (original) +++ llvm/trunk/test/CodeGen/ARM/ssub_sat.ll Fri Oct 11 13:33:03 2019 @@ -212,51 +212,69 @@ define i64 @func2(i64 %x, i64 %y) nounwi define i16 @func16(i16 %x, i16 %y) nounwind { ; CHECK-T1-LABEL: func16: ; CHECK-T1: @ %bb.0: -; CHECK-T1-NEXT: subs r0, r0, r1 -; CHECK-T1-NEXT: ldr r1, .LCPI2_0 -; CHECK-T1-NEXT: cmp r0, r1 -; CHECK-T1-NEXT: blt .LBB2_2 +; CHECK-T1-NEXT: .save {r4, lr} +; CHECK-T1-NEXT: push {r4, lr} +; CHECK-T1-NEXT: lsls r1, r1, #16 +; CHECK-T1-NEXT: lsls r2, r0, #16 +; CHECK-T1-NEXT: movs r3, #1 +; CHECK-T1-NEXT: subs r0, r2, r1 +; CHECK-T1-NEXT: mov r4, r3 +; CHECK-T1-NEXT: bmi .LBB2_2 ; CHECK-T1-NEXT: @ %bb.1: -; CHECK-T1-NEXT: mov r0, r1 +; CHECK-T1-NEXT: movs r4, #0 ; CHECK-T1-NEXT: .LBB2_2: -; CHECK-T1-NEXT: ldr r1, .LCPI2_1 -; CHECK-T1-NEXT: cmp r0, r1 -; CHECK-T1-NEXT: bgt .LBB2_4 +; CHECK-T1-NEXT: cmp r4, #0 +; CHECK-T1-NEXT: bne .LBB2_4 ; CHECK-T1-NEXT: @ %bb.3: -; CHECK-T1-NEXT: mov r0, r1 +; CHECK-T1-NEXT: lsls r3, r3, #31 +; CHECK-T1-NEXT: cmp r2, r1 +; CHECK-T1-NEXT: bvs .LBB2_5 +; CHECK-T1-NEXT: b .LBB2_6 ; CHECK-T1-NEXT: .LBB2_4: -; CHECK-T1-NEXT: bx 
lr +; CHECK-T1-NEXT: ldr r3, .LCPI2_0 +; CHECK-T1-NEXT: cmp r2, r1 +; CHECK-T1-NEXT: bvc .LBB2_6 +; CHECK-T1-NEXT: .LBB2_5: +; CHECK-T1-NEXT: mov r0, r3 +; CHECK-T1-NEXT: .LBB2_6: +; CHECK-T1-NEXT: asrs r0, r0, #16 +; CHECK-T1-NEXT: pop {r4, pc} ; CHECK-T1-NEXT: .p2align 2 -; CHECK-T1-NEXT: @ %bb.5: +; CHECK-T1-NEXT: @ %bb.7: ; CHECK-T1-NEXT: .LCPI2_0: -; CHECK-T1-NEXT: .long 32767 @ 0x7fff -; CHECK-T1-NEXT: .LCPI2_1: -; CHECK-T1-NEXT: .long 4294934528 @ 0xffff8000 +; CHECK-T1-NEXT: .long 2147483647 @ 0x7fffffff ; ; CHECK-T2-LABEL: func16: ; CHECK-T2: @ %bb.0: -; CHECK-T2-NEXT: subs r0, r0, r1 -; CHECK-T2-NEXT: movw r1, #32767 -; CHECK-T2-NEXT: cmp r0, r1 -; CHECK-T2-NEXT: it lt -; CHECK-T2-NEXT: movlt r1, r0 -; CHECK-T2-NEXT: movw r0, #32768 -; CHECK-T2-NEXT: cmn.w r1, #32768 -; CHECK-T2-NEXT: movt r0, #65535 -; CHECK-T2-NEXT: it gt -; CHECK-T2-NEXT: movgt r0, r1 +; CHECK-T2-NEXT: lsls r0, r0, #16 +; CHECK-T2-NEXT: sub.w r12, r0, r1, lsl #16 +; CHECK-T2-NEXT: movs r3, #0 +; CHECK-T2-NEXT: cmp.w r12, #0 +; CHECK-T2-NEXT: mov.w r2, #-2147483648 +; CHECK-T2-NEXT: it mi +; CHECK-T2-NEXT: movmi r3, #1 +; CHECK-T2-NEXT: cmp r3, #0 +; CHECK-T2-NEXT: it ne +; CHECK-T2-NEXT: mvnne r2, #-2147483648 +; CHECK-T2-NEXT: cmp.w r0, r1, lsl #16 +; CHECK-T2-NEXT: it vc +; CHECK-T2-NEXT: movvc r2, r12 +; CHECK-T2-NEXT: asrs r0, r2, #16 ; CHECK-T2-NEXT: bx lr ; ; CHECK-ARM-LABEL: func16: ; CHECK-ARM: @ %bb.0: -; CHECK-ARM-NEXT: sub r0, r0, r1 -; CHECK-ARM-NEXT: movw r1, #32767 -; CHECK-ARM-NEXT: cmp r0, r1 -; CHECK-ARM-NEXT: movlt r1, r0 -; CHECK-ARM-NEXT: movw r0, #32768 -; CHECK-ARM-NEXT: movt r0, #65535 -; CHECK-ARM-NEXT: cmn r1, #32768 -; CHECK-ARM-NEXT: movgt r0, r1 +; CHECK-ARM-NEXT: lsl r0, r0, #16 +; CHECK-ARM-NEXT: sub r12, r0, r1, lsl #16 +; CHECK-ARM-NEXT: mov r3, #0 +; CHECK-ARM-NEXT: cmp r12, #0 +; CHECK-ARM-NEXT: movwmi r3, #1 +; CHECK-ARM-NEXT: mov r2, #-2147483648 +; CHECK-ARM-NEXT: cmp r3, #0 +; CHECK-ARM-NEXT: mvnne r2, #-2147483648 +; CHECK-ARM-NEXT: cmp r0, r1, 
lsl #16 +; CHECK-ARM-NEXT: movvc r2, r12 +; CHECK-ARM-NEXT: asr r0, r2, #16 ; CHECK-ARM-NEXT: bx lr %tmp = call i16 @llvm.ssub.sat.i16(i16 %x, i16 %y) ret i16 %tmp @@ -265,39 +283,69 @@ define i16 @func16(i16 %x, i16 %y) nounw define i8 @func8(i8 %x, i8 %y) nounwind { ; CHECK-T1-LABEL: func8: ; CHECK-T1: @ %bb.0: -; CHECK-T1-NEXT: subs r0, r0, r1 -; CHECK-T1-NEXT: movs r1, #127 -; CHECK-T1-NEXT: cmp r0, #127 -; CHECK-T1-NEXT: blt .LBB3_2 +; CHECK-T1-NEXT: .save {r4, lr} +; CHECK-T1-NEXT: push {r4, lr} +; CHECK-T1-NEXT: lsls r1, r1, #24 +; CHECK-T1-NEXT: lsls r2, r0, #24 +; CHECK-T1-NEXT: movs r3, #1 +; CHECK-T1-NEXT: subs r0, r2, r1 +; CHECK-T1-NEXT: mov r4, r3 +; CHECK-T1-NEXT: bmi .LBB3_2 ; CHECK-T1-NEXT: @ %bb.1: -; CHECK-T1-NEXT: mov r0, r1 +; CHECK-T1-NEXT: movs r4, #0 ; CHECK-T1-NEXT: .LBB3_2: -; CHECK-T1-NEXT: mvns r1, r1 -; CHECK-T1-NEXT: cmp r0, r1 -; CHECK-T1-NEXT: bgt .LBB3_4 +; CHECK-T1-NEXT: cmp r4, #0 +; CHECK-T1-NEXT: bne .LBB3_4 ; CHECK-T1-NEXT: @ %bb.3: -; CHECK-T1-NEXT: mov r0, r1 +; CHECK-T1-NEXT: lsls r3, r3, #31 +; CHECK-T1-NEXT: cmp r2, r1 +; CHECK-T1-NEXT: bvs .LBB3_5 +; CHECK-T1-NEXT: b .LBB3_6 ; CHECK-T1-NEXT: .LBB3_4: -; CHECK-T1-NEXT: bx lr +; CHECK-T1-NEXT: ldr r3, .LCPI3_0 +; CHECK-T1-NEXT: cmp r2, r1 +; CHECK-T1-NEXT: bvc .LBB3_6 +; CHECK-T1-NEXT: .LBB3_5: +; CHECK-T1-NEXT: mov r0, r3 +; CHECK-T1-NEXT: .LBB3_6: +; CHECK-T1-NEXT: asrs r0, r0, #24 +; CHECK-T1-NEXT: pop {r4, pc} +; CHECK-T1-NEXT: .p2align 2 +; CHECK-T1-NEXT: @ %bb.7: +; CHECK-T1-NEXT: .LCPI3_0: +; CHECK-T1-NEXT: .long 2147483647 @ 0x7fffffff ; ; CHECK-T2-LABEL: func8: ; CHECK-T2: @ %bb.0: -; CHECK-T2-NEXT: subs r0, r0, r1 -; CHECK-T2-NEXT: cmp r0, #127 -; CHECK-T2-NEXT: it ge -; CHECK-T2-NEXT: movge r0, #127 -; CHECK-T2-NEXT: cmn.w r0, #128 -; CHECK-T2-NEXT: it le -; CHECK-T2-NEXT: mvnle r0, #127 +; CHECK-T2-NEXT: lsls r0, r0, #24 +; CHECK-T2-NEXT: sub.w r12, r0, r1, lsl #24 +; CHECK-T2-NEXT: movs r3, #0 +; CHECK-T2-NEXT: cmp.w r12, #0 +; CHECK-T2-NEXT: mov.w r2, 
#-2147483648 +; CHECK-T2-NEXT: it mi +; CHECK-T2-NEXT: movmi r3, #1 +; CHECK-T2-NEXT: cmp r3, #0 +; CHECK-T2-NEXT: it ne +; CHECK-T2-NEXT: mvnne r2, #-2147483648 +; CHECK-T2-NEXT: cmp.w r0, r1, lsl #24 +; CHECK-T2-NEXT: it vc +; CHECK-T2-NEXT: movvc r2, r12 +; CHECK-T2-NEXT: asrs r0, r2, #24 ; CHECK-T2-NEXT: bx lr ; ; CHECK-ARM-LABEL: func8: ; CHECK-ARM: @ %bb.0: -; CHECK-ARM-NEXT: sub r0, r0, r1 -; CHECK-ARM-NEXT: cmp r0, #127 -; CHECK-ARM-NEXT: movge r0, #127 -; CHECK-ARM-NEXT: cmn r0, #128 -; CHECK-ARM-NEXT: mvnle r0, #127 +; CHECK-ARM-NEXT: lsl r0, r0, #24 +; CHECK-ARM-NEXT: sub r12, r0, r1, lsl #24 +; CHECK-ARM-NEXT: mov r3, #0 +; CHECK-ARM-NEXT: cmp r12, #0 +; CHECK-ARM-NEXT: movwmi r3, #1 +; CHECK-ARM-NEXT: mov r2, #-2147483648 +; CHECK-ARM-NEXT: cmp r3, #0 +; CHECK-ARM-NEXT: mvnne r2, #-2147483648 +; CHECK-ARM-NEXT: cmp r0, r1, lsl #24 +; CHECK-ARM-NEXT: movvc r2, r12 +; CHECK-ARM-NEXT: asr r0, r2, #24 ; CHECK-ARM-NEXT: bx lr %tmp = call i8 @llvm.ssub.sat.i8(i8 %x, i8 %y) ret i8 %tmp @@ -306,39 +354,69 @@ define i8 @func8(i8 %x, i8 %y) nounwind define i4 @func3(i4 %x, i4 %y) nounwind { ; CHECK-T1-LABEL: func3: ; CHECK-T1: @ %bb.0: -; CHECK-T1-NEXT: subs r0, r0, r1 -; CHECK-T1-NEXT: movs r1, #7 -; CHECK-T1-NEXT: cmp r0, #7 -; CHECK-T1-NEXT: blt .LBB4_2 +; CHECK-T1-NEXT: .save {r4, lr} +; CHECK-T1-NEXT: push {r4, lr} +; CHECK-T1-NEXT: lsls r1, r1, #28 +; CHECK-T1-NEXT: lsls r2, r0, #28 +; CHECK-T1-NEXT: movs r3, #1 +; CHECK-T1-NEXT: subs r0, r2, r1 +; CHECK-T1-NEXT: mov r4, r3 +; CHECK-T1-NEXT: bmi .LBB4_2 ; CHECK-T1-NEXT: @ %bb.1: -; CHECK-T1-NEXT: mov r0, r1 +; CHECK-T1-NEXT: movs r4, #0 ; CHECK-T1-NEXT: .LBB4_2: -; CHECK-T1-NEXT: mvns r1, r1 -; CHECK-T1-NEXT: cmp r0, r1 -; CHECK-T1-NEXT: bgt .LBB4_4 +; CHECK-T1-NEXT: cmp r4, #0 +; CHECK-T1-NEXT: bne .LBB4_4 ; CHECK-T1-NEXT: @ %bb.3: -; CHECK-T1-NEXT: mov r0, r1 +; CHECK-T1-NEXT: lsls r3, r3, #31 +; CHECK-T1-NEXT: cmp r2, r1 +; CHECK-T1-NEXT: bvs .LBB4_5 +; CHECK-T1-NEXT: b .LBB4_6 ; CHECK-T1-NEXT: .LBB4_4: 
-; CHECK-T1-NEXT: bx lr +; CHECK-T1-NEXT: ldr r3, .LCPI4_0 +; CHECK-T1-NEXT: cmp r2, r1 +; CHECK-T1-NEXT: bvc .LBB4_6 +; CHECK-T1-NEXT: .LBB4_5: +; CHECK-T1-NEXT: mov r0, r3 +; CHECK-T1-NEXT: .LBB4_6: +; CHECK-T1-NEXT: asrs r0, r0, #28 +; CHECK-T1-NEXT: pop {r4, pc} +; CHECK-T1-NEXT: .p2align 2 +; CHECK-T1-NEXT: @ %bb.7: +; CHECK-T1-NEXT: .LCPI4_0: +; CHECK-T1-NEXT: .long 2147483647 @ 0x7fffffff ; ; CHECK-T2-LABEL: func3: ; CHECK-T2: @ %bb.0: -; CHECK-T2-NEXT: subs r0, r0, r1 -; CHECK-T2-NEXT: cmp r0, #7 -; CHECK-T2-NEXT: it ge -; CHECK-T2-NEXT: movge r0, #7 -; CHECK-T2-NEXT: cmn.w r0, #8 -; CHECK-T2-NEXT: it le -; CHECK-T2-NEXT: mvnle r0, #7 +; CHECK-T2-NEXT: lsls r0, r0, #28 +; CHECK-T2-NEXT: sub.w r12, r0, r1, lsl #28 +; CHECK-T2-NEXT: movs r3, #0 +; CHECK-T2-NEXT: cmp.w r12, #0 +; CHECK-T2-NEXT: mov.w r2, #-2147483648 +; CHECK-T2-NEXT: it mi +; CHECK-T2-NEXT: movmi r3, #1 +; CHECK-T2-NEXT: cmp r3, #0 +; CHECK-T2-NEXT: it ne +; CHECK-T2-NEXT: mvnne r2, #-2147483648 +; CHECK-T2-NEXT: cmp.w r0, r1, lsl #28 +; CHECK-T2-NEXT: it vc +; CHECK-T2-NEXT: movvc r2, r12 +; CHECK-T2-NEXT: asrs r0, r2, #28 ; CHECK-T2-NEXT: bx lr ; ; CHECK-ARM-LABEL: func3: ; CHECK-ARM: @ %bb.0: -; CHECK-ARM-NEXT: sub r0, r0, r1 -; CHECK-ARM-NEXT: cmp r0, #7 -; CHECK-ARM-NEXT: movge r0, #7 -; CHECK-ARM-NEXT: cmn r0, #8 -; CHECK-ARM-NEXT: mvnle r0, #7 +; CHECK-ARM-NEXT: lsl r0, r0, #28 +; CHECK-ARM-NEXT: sub r12, r0, r1, lsl #28 +; CHECK-ARM-NEXT: mov r3, #0 +; CHECK-ARM-NEXT: cmp r12, #0 +; CHECK-ARM-NEXT: movwmi r3, #1 +; CHECK-ARM-NEXT: mov r2, #-2147483648 +; CHECK-ARM-NEXT: cmp r3, #0 +; CHECK-ARM-NEXT: mvnne r2, #-2147483648 +; CHECK-ARM-NEXT: cmp r0, r1, lsl #28 +; CHECK-ARM-NEXT: movvc r2, r12 +; CHECK-ARM-NEXT: asr r0, r2, #28 ; CHECK-ARM-NEXT: bx lr %tmp = call i4 @llvm.ssub.sat.i4(i4 %x, i4 %y) ret i4 %tmp Modified: llvm/trunk/test/CodeGen/ARM/uadd_sat.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/ARM/uadd_sat.ll?rev=374592&r1=374591&r2=374592&view=diff 
============================================================================== --- llvm/trunk/test/CodeGen/ARM/uadd_sat.ll (original) +++ llvm/trunk/test/CodeGen/ARM/uadd_sat.ll Fri Oct 11 13:33:03 2019 @@ -93,34 +93,34 @@ define i64 @func2(i64 %x, i64 %y) nounwi define i16 @func16(i16 %x, i16 %y) nounwind { ; CHECK-T1-LABEL: func16: ; CHECK-T1: @ %bb.0: +; CHECK-T1-NEXT: lsls r1, r1, #16 +; CHECK-T1-NEXT: lsls r0, r0, #16 ; CHECK-T1-NEXT: adds r0, r0, r1 -; CHECK-T1-NEXT: ldr r1, .LCPI2_0 -; CHECK-T1-NEXT: cmp r0, r1 ; CHECK-T1-NEXT: blo .LBB2_2 ; CHECK-T1-NEXT: @ %bb.1: -; CHECK-T1-NEXT: mov r0, r1 +; CHECK-T1-NEXT: movs r0, #0 +; CHECK-T1-NEXT: mvns r0, r0 ; CHECK-T1-NEXT: .LBB2_2: +; CHECK-T1-NEXT: lsrs r0, r0, #16 ; CHECK-T1-NEXT: bx lr -; CHECK-T1-NEXT: .p2align 2 -; CHECK-T1-NEXT: @ %bb.3: -; CHECK-T1-NEXT: .LCPI2_0: -; CHECK-T1-NEXT: .long 65535 @ 0xffff ; ; CHECK-T2-LABEL: func16: ; CHECK-T2: @ %bb.0: -; CHECK-T2-NEXT: add r1, r0 -; CHECK-T2-NEXT: movw r0, #65535 -; CHECK-T2-NEXT: cmp r1, r0 +; CHECK-T2-NEXT: lsls r2, r0, #16 +; CHECK-T2-NEXT: add.w r1, r2, r1, lsl #16 +; CHECK-T2-NEXT: cmp.w r1, r0, lsl #16 ; CHECK-T2-NEXT: it lo -; CHECK-T2-NEXT: movlo r0, r1 +; CHECK-T2-NEXT: movlo.w r1, #-1 +; CHECK-T2-NEXT: lsrs r0, r1, #16 ; CHECK-T2-NEXT: bx lr ; ; CHECK-ARM-LABEL: func16: ; CHECK-ARM: @ %bb.0: -; CHECK-ARM-NEXT: add r1, r0, r1 -; CHECK-ARM-NEXT: movw r0, #65535 -; CHECK-ARM-NEXT: cmp r1, r0 -; CHECK-ARM-NEXT: movlo r0, r1 +; CHECK-ARM-NEXT: lsl r2, r0, #16 +; CHECK-ARM-NEXT: add r1, r2, r1, lsl #16 +; CHECK-ARM-NEXT: cmp r1, r0, lsl #16 +; CHECK-ARM-NEXT: mvnlo r1, #0 +; CHECK-ARM-NEXT: lsr r0, r1, #16 ; CHECK-ARM-NEXT: bx lr %tmp = call i16 @llvm.uadd.sat.i16(i16 %x, i16 %y) ret i16 %tmp @@ -129,27 +129,34 @@ define i16 @func16(i16 %x, i16 %y) nounw define i8 @func8(i8 %x, i8 %y) nounwind { ; CHECK-T1-LABEL: func8: ; CHECK-T1: @ %bb.0: +; CHECK-T1-NEXT: lsls r1, r1, #24 +; CHECK-T1-NEXT: lsls r0, r0, #24 ; CHECK-T1-NEXT: adds r0, r0, r1 -; 
CHECK-T1-NEXT: cmp r0, #255 ; CHECK-T1-NEXT: blo .LBB3_2 ; CHECK-T1-NEXT: @ %bb.1: -; CHECK-T1-NEXT: movs r0, #255 +; CHECK-T1-NEXT: movs r0, #0 +; CHECK-T1-NEXT: mvns r0, r0 ; CHECK-T1-NEXT: .LBB3_2: +; CHECK-T1-NEXT: lsrs r0, r0, #24 ; CHECK-T1-NEXT: bx lr ; ; CHECK-T2-LABEL: func8: ; CHECK-T2: @ %bb.0: -; CHECK-T2-NEXT: add r0, r1 -; CHECK-T2-NEXT: cmp r0, #255 -; CHECK-T2-NEXT: it hs -; CHECK-T2-NEXT: movhs r0, #255 +; CHECK-T2-NEXT: lsls r2, r0, #24 +; CHECK-T2-NEXT: add.w r1, r2, r1, lsl #24 +; CHECK-T2-NEXT: cmp.w r1, r0, lsl #24 +; CHECK-T2-NEXT: it lo +; CHECK-T2-NEXT: movlo.w r1, #-1 +; CHECK-T2-NEXT: lsrs r0, r1, #24 ; CHECK-T2-NEXT: bx lr ; ; CHECK-ARM-LABEL: func8: ; CHECK-ARM: @ %bb.0: -; CHECK-ARM-NEXT: add r0, r0, r1 -; CHECK-ARM-NEXT: cmp r0, #255 -; CHECK-ARM-NEXT: movhs r0, #255 +; CHECK-ARM-NEXT: lsl r2, r0, #24 +; CHECK-ARM-NEXT: add r1, r2, r1, lsl #24 +; CHECK-ARM-NEXT: cmp r1, r0, lsl #24 +; CHECK-ARM-NEXT: mvnlo r1, #0 +; CHECK-ARM-NEXT: lsr r0, r1, #24 ; CHECK-ARM-NEXT: bx lr %tmp = call i8 @llvm.uadd.sat.i8(i8 %x, i8 %y) ret i8 %tmp @@ -158,27 +165,34 @@ define i8 @func8(i8 %x, i8 %y) nounwind define i4 @func3(i4 %x, i4 %y) nounwind { ; CHECK-T1-LABEL: func3: ; CHECK-T1: @ %bb.0: +; CHECK-T1-NEXT: lsls r1, r1, #28 +; CHECK-T1-NEXT: lsls r0, r0, #28 ; CHECK-T1-NEXT: adds r0, r0, r1 -; CHECK-T1-NEXT: cmp r0, #15 ; CHECK-T1-NEXT: blo .LBB4_2 ; CHECK-T1-NEXT: @ %bb.1: -; CHECK-T1-NEXT: movs r0, #15 +; CHECK-T1-NEXT: movs r0, #0 +; CHECK-T1-NEXT: mvns r0, r0 ; CHECK-T1-NEXT: .LBB4_2: +; CHECK-T1-NEXT: lsrs r0, r0, #28 ; CHECK-T1-NEXT: bx lr ; ; CHECK-T2-LABEL: func3: ; CHECK-T2: @ %bb.0: -; CHECK-T2-NEXT: add r0, r1 -; CHECK-T2-NEXT: cmp r0, #15 -; CHECK-T2-NEXT: it hs -; CHECK-T2-NEXT: movhs r0, #15 +; CHECK-T2-NEXT: lsls r2, r0, #28 +; CHECK-T2-NEXT: add.w r1, r2, r1, lsl #28 +; CHECK-T2-NEXT: cmp.w r1, r0, lsl #28 +; CHECK-T2-NEXT: it lo +; CHECK-T2-NEXT: movlo.w r1, #-1 +; CHECK-T2-NEXT: lsrs r0, r1, #28 ; CHECK-T2-NEXT: bx lr ; ; 
CHECK-ARM-LABEL: func3: ; CHECK-ARM: @ %bb.0: -; CHECK-ARM-NEXT: add r0, r0, r1 -; CHECK-ARM-NEXT: cmp r0, #15 -; CHECK-ARM-NEXT: movhs r0, #15 +; CHECK-ARM-NEXT: lsl r2, r0, #28 +; CHECK-ARM-NEXT: add r1, r2, r1, lsl #28 +; CHECK-ARM-NEXT: cmp r1, r0, lsl #28 +; CHECK-ARM-NEXT: mvnlo r1, #0 +; CHECK-ARM-NEXT: lsr r0, r1, #28 ; CHECK-ARM-NEXT: bx lr %tmp = call i4 @llvm.uadd.sat.i4(i4 %x, i4 %y) ret i4 %tmp Modified: llvm/trunk/test/CodeGen/ARM/usub_sat.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/ARM/usub_sat.ll?rev=374592&r1=374591&r2=374592&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/ARM/usub_sat.ll (original) +++ llvm/trunk/test/CodeGen/ARM/usub_sat.ll Fri Oct 11 13:33:03 2019 @@ -93,27 +93,33 @@ define i64 @func2(i64 %x, i64 %y) nounwi define i16 @func16(i16 %x, i16 %y) nounwind { ; CHECK-T1-LABEL: func16: ; CHECK-T1: @ %bb.0: -; CHECK-T1-NEXT: cmp r0, r1 -; CHECK-T1-NEXT: bhi .LBB2_2 +; CHECK-T1-NEXT: lsls r1, r1, #16 +; CHECK-T1-NEXT: lsls r0, r0, #16 +; CHECK-T1-NEXT: subs r0, r0, r1 +; CHECK-T1-NEXT: bhs .LBB2_2 ; CHECK-T1-NEXT: @ %bb.1: -; CHECK-T1-NEXT: mov r0, r1 +; CHECK-T1-NEXT: movs r0, #0 ; CHECK-T1-NEXT: .LBB2_2: -; CHECK-T1-NEXT: subs r0, r0, r1 +; CHECK-T1-NEXT: lsrs r0, r0, #16 ; CHECK-T1-NEXT: bx lr ; ; CHECK-T2-LABEL: func16: ; CHECK-T2: @ %bb.0: -; CHECK-T2-NEXT: cmp r0, r1 -; CHECK-T2-NEXT: it ls -; CHECK-T2-NEXT: movls r0, r1 -; CHECK-T2-NEXT: subs r0, r0, r1 +; CHECK-T2-NEXT: lsls r0, r0, #16 +; CHECK-T2-NEXT: sub.w r2, r0, r1, lsl #16 +; CHECK-T2-NEXT: cmp.w r0, r1, lsl #16 +; CHECK-T2-NEXT: it lo +; CHECK-T2-NEXT: movlo r2, #0 +; CHECK-T2-NEXT: lsrs r0, r2, #16 ; CHECK-T2-NEXT: bx lr ; ; CHECK-ARM-LABEL: func16: ; CHECK-ARM: @ %bb.0: -; CHECK-ARM-NEXT: cmp r0, r1 -; CHECK-ARM-NEXT: movls r0, r1 -; CHECK-ARM-NEXT: sub r0, r0, r1 +; CHECK-ARM-NEXT: lsl r0, r0, #16 +; CHECK-ARM-NEXT: sub r2, r0, r1, lsl #16 +; CHECK-ARM-NEXT: cmp r0, r1, lsl #16 
+; CHECK-ARM-NEXT: movlo r2, #0 +; CHECK-ARM-NEXT: lsr r0, r2, #16 ; CHECK-ARM-NEXT: bx lr %tmp = call i16 @llvm.usub.sat.i16(i16 %x, i16 %y) ret i16 %tmp @@ -122,27 +128,33 @@ define i16 @func16(i16 %x, i16 %y) nounw define i8 @func8(i8 %x, i8 %y) nounwind { ; CHECK-T1-LABEL: func8: ; CHECK-T1: @ %bb.0: -; CHECK-T1-NEXT: cmp r0, r1 -; CHECK-T1-NEXT: bhi .LBB3_2 +; CHECK-T1-NEXT: lsls r1, r1, #24 +; CHECK-T1-NEXT: lsls r0, r0, #24 +; CHECK-T1-NEXT: subs r0, r0, r1 +; CHECK-T1-NEXT: bhs .LBB3_2 ; CHECK-T1-NEXT: @ %bb.1: -; CHECK-T1-NEXT: mov r0, r1 +; CHECK-T1-NEXT: movs r0, #0 ; CHECK-T1-NEXT: .LBB3_2: -; CHECK-T1-NEXT: subs r0, r0, r1 +; CHECK-T1-NEXT: lsrs r0, r0, #24 ; CHECK-T1-NEXT: bx lr ; ; CHECK-T2-LABEL: func8: ; CHECK-T2: @ %bb.0: -; CHECK-T2-NEXT: cmp r0, r1 -; CHECK-T2-NEXT: it ls -; CHECK-T2-NEXT: movls r0, r1 -; CHECK-T2-NEXT: subs r0, r0, r1 +; CHECK-T2-NEXT: lsls r0, r0, #24 +; CHECK-T2-NEXT: sub.w r2, r0, r1, lsl #24 +; CHECK-T2-NEXT: cmp.w r0, r1, lsl #24 +; CHECK-T2-NEXT: it lo +; CHECK-T2-NEXT: movlo r2, #0 +; CHECK-T2-NEXT: lsrs r0, r2, #24 ; CHECK-T2-NEXT: bx lr ; ; CHECK-ARM-LABEL: func8: ; CHECK-ARM: @ %bb.0: -; CHECK-ARM-NEXT: cmp r0, r1 -; CHECK-ARM-NEXT: movls r0, r1 -; CHECK-ARM-NEXT: sub r0, r0, r1 +; CHECK-ARM-NEXT: lsl r0, r0, #24 +; CHECK-ARM-NEXT: sub r2, r0, r1, lsl #24 +; CHECK-ARM-NEXT: cmp r0, r1, lsl #24 +; CHECK-ARM-NEXT: movlo r2, #0 +; CHECK-ARM-NEXT: lsr r0, r2, #24 ; CHECK-ARM-NEXT: bx lr %tmp = call i8 @llvm.usub.sat.i8(i8 %x, i8 %y) ret i8 %tmp @@ -151,27 +163,33 @@ define i8 @func8(i8 %x, i8 %y) nounwind define i4 @func3(i4 %x, i4 %y) nounwind { ; CHECK-T1-LABEL: func3: ; CHECK-T1: @ %bb.0: -; CHECK-T1-NEXT: cmp r0, r1 -; CHECK-T1-NEXT: bhi .LBB4_2 +; CHECK-T1-NEXT: lsls r1, r1, #28 +; CHECK-T1-NEXT: lsls r0, r0, #28 +; CHECK-T1-NEXT: subs r0, r0, r1 +; CHECK-T1-NEXT: bhs .LBB4_2 ; CHECK-T1-NEXT: @ %bb.1: -; CHECK-T1-NEXT: mov r0, r1 +; CHECK-T1-NEXT: movs r0, #0 ; CHECK-T1-NEXT: .LBB4_2: -; CHECK-T1-NEXT: subs r0, r0, 
r1 +; CHECK-T1-NEXT: lsrs r0, r0, #28 ; CHECK-T1-NEXT: bx lr ; ; CHECK-T2-LABEL: func3: ; CHECK-T2: @ %bb.0: -; CHECK-T2-NEXT: cmp r0, r1 -; CHECK-T2-NEXT: it ls -; CHECK-T2-NEXT: movls r0, r1 -; CHECK-T2-NEXT: subs r0, r0, r1 +; CHECK-T2-NEXT: lsls r0, r0, #28 +; CHECK-T2-NEXT: sub.w r2, r0, r1, lsl #28 +; CHECK-T2-NEXT: cmp.w r0, r1, lsl #28 +; CHECK-T2-NEXT: it lo +; CHECK-T2-NEXT: movlo r2, #0 +; CHECK-T2-NEXT: lsrs r0, r2, #28 ; CHECK-T2-NEXT: bx lr ; ; CHECK-ARM-LABEL: func3: ; CHECK-ARM: @ %bb.0: -; CHECK-ARM-NEXT: cmp r0, r1 -; CHECK-ARM-NEXT: movls r0, r1 -; CHECK-ARM-NEXT: sub r0, r0, r1 +; CHECK-ARM-NEXT: lsl r0, r0, #28 +; CHECK-ARM-NEXT: sub r2, r0, r1, lsl #28 +; CHECK-ARM-NEXT: cmp r0, r1, lsl #28 +; CHECK-ARM-NEXT: movlo r2, #0 +; CHECK-ARM-NEXT: lsr r0, r2, #28 ; CHECK-ARM-NEXT: bx lr %tmp = call i4 @llvm.usub.sat.i4(i4 %x, i4 %y) ret i4 %tmp Modified: llvm/trunk/test/CodeGen/X86/sadd_sat.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/sadd_sat.ll?rev=374592&r1=374591&r2=374592&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/sadd_sat.ll (original) +++ llvm/trunk/test/CodeGen/X86/sadd_sat.ll Fri Oct 11 13:33:03 2019 @@ -159,27 +159,34 @@ define i4 @func3(i4 %x, i4 %y) nounwind ; X86-LABEL: func3: ; X86: # %bb.0: ; X86-NEXT: movb {{[0-9]+}}(%esp), %al -; X86-NEXT: addb {{[0-9]+}}(%esp), %al -; X86-NEXT: movzbl %al, %ecx -; X86-NEXT: cmpb $7, %al -; X86-NEXT: movl $7, %edx -; X86-NEXT: cmovll %ecx, %edx -; X86-NEXT: cmpb $-8, %dl -; X86-NEXT: movl $248, %eax -; X86-NEXT: cmovgl %edx, %eax +; X86-NEXT: movb {{[0-9]+}}(%esp), %dl +; X86-NEXT: shlb $4, %dl +; X86-NEXT: shlb $4, %al +; X86-NEXT: xorl %ecx, %ecx +; X86-NEXT: movb %al, %ah +; X86-NEXT: addb %dl, %ah +; X86-NEXT: setns %cl +; X86-NEXT: addl $127, %ecx +; X86-NEXT: addb %dl, %al +; X86-NEXT: movzbl %al, %eax +; X86-NEXT: cmovol %ecx, %eax +; X86-NEXT: sarb $4, %al ; X86-NEXT: # kill: def $al 
killed $al killed $eax ; X86-NEXT: retl ; ; X64-LABEL: func3: ; X64: # %bb.0: +; X64-NEXT: shlb $4, %sil +; X64-NEXT: shlb $4, %dil +; X64-NEXT: xorl %ecx, %ecx +; X64-NEXT: movl %edi, %eax +; X64-NEXT: addb %sil, %al +; X64-NEXT: setns %cl +; X64-NEXT: addl $127, %ecx ; X64-NEXT: addb %sil, %dil ; X64-NEXT: movzbl %dil, %eax -; X64-NEXT: cmpb $7, %al -; X64-NEXT: movl $7, %ecx -; X64-NEXT: cmovll %eax, %ecx -; X64-NEXT: cmpb $-8, %cl -; X64-NEXT: movl $248, %eax -; X64-NEXT: cmovgl %ecx, %eax +; X64-NEXT: cmovol %ecx, %eax +; X64-NEXT: sarb $4, %al ; X64-NEXT: # kill: def $al killed $al killed $eax ; X64-NEXT: retq %tmp = call i4 @llvm.sadd.sat.i4(i4 %x, i4 %y); Modified: llvm/trunk/test/CodeGen/X86/ssub_sat.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/ssub_sat.ll?rev=374592&r1=374591&r2=374592&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/ssub_sat.ll (original) +++ llvm/trunk/test/CodeGen/X86/ssub_sat.ll Fri Oct 11 13:33:03 2019 @@ -159,27 +159,34 @@ define i4 @func3(i4 %x, i4 %y) nounwind ; X86-LABEL: func3: ; X86: # %bb.0: ; X86-NEXT: movb {{[0-9]+}}(%esp), %al -; X86-NEXT: subb {{[0-9]+}}(%esp), %al -; X86-NEXT: movzbl %al, %ecx -; X86-NEXT: cmpb $7, %al -; X86-NEXT: movl $7, %edx -; X86-NEXT: cmovll %ecx, %edx -; X86-NEXT: cmpb $-8, %dl -; X86-NEXT: movl $248, %eax -; X86-NEXT: cmovgl %edx, %eax +; X86-NEXT: movb {{[0-9]+}}(%esp), %dl +; X86-NEXT: shlb $4, %dl +; X86-NEXT: shlb $4, %al +; X86-NEXT: xorl %ecx, %ecx +; X86-NEXT: movb %al, %ah +; X86-NEXT: subb %dl, %ah +; X86-NEXT: setns %cl +; X86-NEXT: addl $127, %ecx +; X86-NEXT: subb %dl, %al +; X86-NEXT: movzbl %al, %eax +; X86-NEXT: cmovol %ecx, %eax +; X86-NEXT: sarb $4, %al ; X86-NEXT: # kill: def $al killed $al killed $eax ; X86-NEXT: retl ; ; X64-LABEL: func3: ; X64: # %bb.0: +; X64-NEXT: shlb $4, %sil +; X64-NEXT: shlb $4, %dil +; X64-NEXT: xorl %ecx, %ecx +; X64-NEXT: movl %edi, %eax +; X64-NEXT: 
subb %sil, %al +; X64-NEXT: setns %cl +; X64-NEXT: addl $127, %ecx ; X64-NEXT: subb %sil, %dil ; X64-NEXT: movzbl %dil, %eax -; X64-NEXT: cmpb $7, %al -; X64-NEXT: movl $7, %ecx -; X64-NEXT: cmovll %eax, %ecx -; X64-NEXT: cmpb $-8, %cl -; X64-NEXT: movl $248, %eax -; X64-NEXT: cmovgl %ecx, %eax +; X64-NEXT: cmovol %ecx, %eax +; X64-NEXT: sarb $4, %al ; X64-NEXT: # kill: def $al killed $al killed $eax ; X64-NEXT: retq %tmp = call i4 @llvm.ssub.sat.i4(i4 %x, i4 %y) Modified: llvm/trunk/test/CodeGen/X86/uadd_sat.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/uadd_sat.ll?rev=374592&r1=374591&r2=374592&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/uadd_sat.ll (original) +++ llvm/trunk/test/CodeGen/X86/uadd_sat.ll Fri Oct 11 13:33:03 2019 @@ -98,21 +98,26 @@ define i4 @func3(i4 %x, i4 %y) nounwind ; X86-LABEL: func3: ; X86: # %bb.0: ; X86-NEXT: movb {{[0-9]+}}(%esp), %al -; X86-NEXT: addb {{[0-9]+}}(%esp), %al +; X86-NEXT: movb {{[0-9]+}}(%esp), %cl +; X86-NEXT: shlb $4, %cl +; X86-NEXT: shlb $4, %al +; X86-NEXT: addb %cl, %al ; X86-NEXT: movzbl %al, %ecx -; X86-NEXT: cmpb $15, %al -; X86-NEXT: movl $15, %eax -; X86-NEXT: cmovbl %ecx, %eax +; X86-NEXT: movl $255, %eax +; X86-NEXT: cmovael %ecx, %eax +; X86-NEXT: shrb $4, %al ; X86-NEXT: # kill: def $al killed $al killed $eax ; X86-NEXT: retl ; ; X64-LABEL: func3: ; X64: # %bb.0: +; X64-NEXT: shlb $4, %sil +; X64-NEXT: shlb $4, %dil ; X64-NEXT: addb %sil, %dil ; X64-NEXT: movzbl %dil, %ecx -; X64-NEXT: cmpb $15, %cl -; X64-NEXT: movl $15, %eax -; X64-NEXT: cmovbl %ecx, %eax +; X64-NEXT: movl $255, %eax +; X64-NEXT: cmovael %ecx, %eax +; X64-NEXT: shrb $4, %al ; X64-NEXT: # kill: def $al killed $al killed $eax ; X64-NEXT: retq %tmp = call i4 @llvm.uadd.sat.i4(i4 %x, i4 %y) Modified: llvm/trunk/test/CodeGen/X86/usub_sat.ll URL: 
http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/usub_sat.ll?rev=374592&r1=374591&r2=374592&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/usub_sat.ll (original) +++ llvm/trunk/test/CodeGen/X86/usub_sat.ll Fri Oct 11 13:33:03 2019 @@ -97,21 +97,27 @@ define i8 @func8(i8 %x, i8 %y) nounwind define i4 @func3(i4 %x, i4 %y) nounwind { ; X86-LABEL: func3: ; X86: # %bb.0: -; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx -; X86-NEXT: movl {{[0-9]+}}(%esp), %edx -; X86-NEXT: cmpb %cl, %dl -; X86-NEXT: movl %ecx, %eax -; X86-NEXT: cmoval %edx, %eax +; X86-NEXT: movb {{[0-9]+}}(%esp), %al +; X86-NEXT: movb {{[0-9]+}}(%esp), %cl +; X86-NEXT: shlb $4, %cl +; X86-NEXT: shlb $4, %al +; X86-NEXT: xorl %edx, %edx ; X86-NEXT: subb %cl, %al +; X86-NEXT: movzbl %al, %eax +; X86-NEXT: cmovbl %edx, %eax +; X86-NEXT: shrb $4, %al ; X86-NEXT: # kill: def $al killed $al killed $eax ; X86-NEXT: retl ; ; X64-LABEL: func3: ; X64: # %bb.0: -; X64-NEXT: cmpb %sil, %dil -; X64-NEXT: movl %esi, %eax -; X64-NEXT: cmoval %edi, %eax -; X64-NEXT: subb %sil, %al +; X64-NEXT: shlb $4, %sil +; X64-NEXT: shlb $4, %dil +; X64-NEXT: xorl %ecx, %ecx +; X64-NEXT: subb %sil, %dil +; X64-NEXT: movzbl %dil, %eax +; X64-NEXT: cmovbl %ecx, %eax +; X64-NEXT: shrb $4, %al ; X64-NEXT: # kill: def $al killed $al killed $eax ; X64-NEXT: retq %tmp = call i4 @llvm.usub.sat.i4(i4 %x, i4 %y) From llvm-commits at lists.llvm.org Fri Oct 11 13:35:12 2019 From: llvm-commits at lists.llvm.org (James Nagurne via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 20:35:12 +0000 (UTC) Subject: [PATCH] D63978: Clang Interface Stubs merger plumbing for Driver In-Reply-To: References: Message-ID: JamesNagurne added a comment. In D63978#1706448 , @plotfi wrote: > In D63978#1706420 , @JamesNagurne wrote: > > > Our team maintains a downstream embedded ARM clang distribution and some tests from this commit have begun to fail for us. 
> > For a number of these tests, there was a REQUIRES: x86-registered-target at the top, which has now been removed. Specifically, externstatic.c, merge-conflict-test.c, object-float.c, and object.c are failing. > > > > object* tests seem to be based on object.cpp, which had the REQUIRES line, and externstatic.c also had that line prior to the change. > > I see that @compnerd suggested the removal, but were you certain that these tests would work on clang toolchains for which x86 is not a registered target? > > > > For a failure example, here the output of lit for our toolchain. If you can make sense of it, I'd appreciate input on how we can fix or work around it: > > > > > /arm-llvm/Release/llvm/bin/clang -c -o - -emit-interface-stubs /llvm-project/clang/test/InterfaceStubs/object.c | /arm-llvm/Release/llvm/bin/FileCheck -check-prefix=CHECK-TAPI /llvm-project/clang/test/InterfaceStubs/object.c > > /llvm-project/clang/test/InterfaceStubs/object.c:5:16: error: CHECK-TAPI: expected string not found in input > > // CHECK-TAPI: data: { Type: Object, Size: 4 } > > ^ > > :1:1: note: scanning from here > > --- !experimental-ifs-v1 > > ^ > > > > > > And when run without FileCheck, our raw output: > > > > > /arm-llvm/Release/llvm/bin/clang -c -o - -emit-interface-stubs /llvm-project/clang/test/InterfaceStubs/object.c > > --- !experimental-ifs-v1 > > IfsVersion: 1.0 > > Triple: thumbv7em-ti-none-eabihf > > ObjectFileFormat: ELF > > Symbols: > > ... > > > > > I am sorry for this James. I can add back the REQUIRES lines for now and coordinate with you on making sure your downstream bots are not affected again if the REQUIRES are removed again. > By chance are your bots accessible publicly? Sadly, they are not. It's on our list of things to investigate, but we don't have the resources to do such a thing quite yet. I'm looking into the 'arm7*' buildbots to see if they are built similar to ours so I am not leaving you entirely without something to look at. 
However, if it seems to be common knowledge to always include an X86 target, I think I can talk to my team and change up what we do. These buildbots seem to also do LLVM_TARGETS_TO_BUILD=ARM, and then set the default target triple to a non-x86 triple (the host's). That could point towards us being in error here. I'll investigate things a little further, and update when I get the chance. To be clear: this feature should work for any ELF target, correct? Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D63978/new/ https://reviews.llvm.org/D63978 From llvm-commits at lists.llvm.org Fri Oct 11 13:35:13 2019 From: llvm-commits at lists.llvm.org (David Blaikie via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 20:35:13 +0000 (UTC) Subject: [PATCH] D68270: DWARFDebugLoc: Add a function to get the address range of an entry In-Reply-To: References: Message-ID: dblaikie added inline comments. ================ Comment at: lib/DebugInfo/DWARF/DWARFDebugLoc.cpp:291-295 + EntryIterator Absolute = + getAbsoluteLocations( + SectionedAddress{BaseAddr, SectionedAddress::UndefSection}, + LookupPooledAddress) + .begin(); ---------------- labath wrote: > dblaikie wrote: > > labath wrote: > > > dblaikie wrote: > > > > labath wrote: > > > > > dblaikie wrote: > > > > > > labath wrote: > > > > > > > This parallel iteration is not completely nice, but I think it's worth being able to reuse the absolute range computation code. I'm open to ideas for improvement though. > > > > > > Ah, I see - this is what you meant about "In particular it makes it possible to reuse this stuff in the dumping code, which would have been pretty hard with callbacks.". > > > > > > > > > > > > I'm wondering if that might be worth revisiting somewhat. A full iterator abstraction for one user here (well, two once you include lldb - but I assume it's likely going to build its own data structure from the iteration anyway, right?
(it's not going to keep the iterator around, do anything interesting like partial iterations, re-iterate/etc - such that a callback would suffice)) > > > > > > > > > > > > I could imagine two callback APIs for this - one that gets entries and locations and one that only gets locations by filtering on the entry version. > > > > > > > > > > > > eg: > > > > > > > > > > > > // for non-verbose output: > > > > > > LL.forEachEntry([&](const Entry &E, Expected L) { > > > > > > if (Verbose && actually dumping debug_loc) > > > > > > print(E) // print any LLE_*, raw parameters, etc > > > > > > if (L) > > > > > > print(*L) // print the resulting address range, section name (if verbose), > > > > > > else > > > > > > print(error stuff) > > > > > > }); > > > > > > > > > > > > One question would be "when/where do we print the DWARF expression" - if there's an error computing the address range, we can still print the expression, so maybe that happens unconditionally at the end of the callback, using the expression in the Entry? (then, arguably, the expression doesn't need to be in the DWARFLocation - and I'd say make the DWARFLocation a sectioned range, exactly the same type as for ranges so that part of the dumping code, etc, can be maximally reused) > > > > > Actually, what lldb currently does is that it does not build any data structures at all (except storing the pointer to the right place in the debug_loc section). Then, whenever it wants to do something to the loclist, it parses it afresh. I don't know why it does this exactly, but I assume it has something to do with most locations never being used, or being used only a couple of times, and the actual parsing being fairly fast. What this means is that lldb is not really a single "user", but there are like four or five places where it iterates through the list, depending on what it actually wants to do with it. It also does partial iteration where it stops as soon as it finds the entry it was interested in.
> > > > > Now, all of that is possible with a callback (though I am generally trying to avoid them), but it does resurface the issue of what should be the value of the second argument for DW_LLE_base_address entries (the thing which I originally used a error type for). > > > > > Maybe this should be actually one callback API, taking two callback functions, with one of them being invoked for base_address entries, and one for others? However, if we stick to the current approaches in both LLE and RLE of making the address pool resolution function a parameter (which I'd like to keep, as it makes my job in lldb easier), then this would actually be three callbacks, which starts to get unwieldy. Though one of those callbacks could be removed with the "DWARFUnit implementing a AddrOffsetResolver interface" idea, which I really like. :) > > > > Ah, thanks for the details on LLDB's location parsing logic. That's interesting indeed! > > > > > > > > I can appreciate an iterator-based API if that's the sort of usage we've got, though I expect it doesn't have any interest in the low-level encoding & just wants the fully processed address ranges/locations - it doesn't want base_address or end_of_list entries? & I think the dual-iteration is a fairly awkward API design, trying to iterate them in lock-step, etc. I'd rather avoid that if reasonably possible. 
> > > > > > > > Either having an iterator API that gives only the fully processed data/semantic view & a completely different API if you want to access the low level primitives (LLE, etc) (this is how ranges works - there's an API that gives a collection of ranges & abstracts over v4/v5/rnglists/etc - though that's partly motivated by a strong multi-client need for that functionality for symbolizing, etc - but I think it's a good abstraction/model anyway (& one of the reasons the inline range list printing doesn't include encoding information, the API it uses is too high level to even have access to it)) > > > > > > > > > Now, all of that is possible with a callback (though I am generally trying to avoid them), but it does resurface the issue of what should be the value of the second argument for DW_LLE_base_address entries (the thing which I originally used a error type for). > > > > > > > > Sorry, my intent in the above API was for the second argument to be Optional's "None" state when... oh, I see, I did use Expected there, rather than Optional, because there are legit error cases. > > > > > > > > I know it's sort of awkward, but I might be inclined to use Optional> there. I realize two layers of wrapping is a bit weird, but I think it'd be nicer than having an error state for what, I think, isn't erroneous. > > > > > > > > > Maybe this should be actually one callback API, taking two callback functions, with one of them being invoked for base_address entries, and one for others? However, if we stick to the current approaches in both LLE and RLE of making the address pool resolution function a parameter (which I'd like to keep, as it makes my job in lldb easier), then this would actually be three callbacks, which starts to get unwieldy. > > > > > > > > Don't mind three callbacks too much. > > > > > > > > > Though one of those callbacks could be removed with the "DWARFUnit implementing a AddrOffsetResolver interface" idea, which I really like. 
:) > > > > > > > > Sorry, I haven't really looked at where the address resolver callback is registered and alternative designs being discussed - but yeah, going off just the one-sentence, it seems reasonable to have the DWARFUnit own an address resolver/be the thing you consult when you want to resolve an address (just through a normal function call in DWARFUnit, perhaps - which might, internally, use a callback registered when it was constructed). > > > > I know it's sort of awkward, but I might be inclined to use Optional> there. I realize two layers of wrapping is a bit weird, but I think it'd be nicer than having an error state for what, I think, isn't erroneous. > > > Actually, my very first attempt at this patch used an `Expected>`, but then I scrapped it because I didn't think you'd like it. It's not the friendliest of APIs, but I think we can go with that. > > > > > > > Sorry, I haven't really looked at where the address resolver callback is registered and alternative designs being discussed - but yeah, going off just the one-sentence, it seems reasonable to have the DWARFUnit own an address resolver/be the thing you consult when you want to resolve an address (just through a normal function call in DWARFUnit, perhaps - which might, internally, use a callback registered when it was constructed). > > > > > > I think you got that backwards. I don't want the DWARFUnit to be the source of truth for address pool resolutions, as that would make it hard to use from lldb (it's far from ready to start using the llvm version right now). What I wanted was to replace the lambda/function_ref with a single-method interface. Then both DWARFUnits could implement that interface so that passing a DWARFUnit& would "just work" (but you wouldn't be limited to DWARFUnits as anyone could implement that interface, just like anyone can write a lambda). 
> > As for Expected> (or Optional>) - yeah, I think this is a non-obvious API (both the general problem and this specific solution). I think it's probably worth discussing this design a bit more to save you time writing/rewriting things a bit. I guess there are a few layers of failure here. > > > > There's the possibility that the iteration itself could fail - even for debug_loc style lists (if we reached the end of the section before encountering a terminating {0,0}). That would suggest a fallible iterator idiom: http://llvm.org/docs/ProgrammersManual.html#building-fallible-iterators-and-iterator-ranges > > > > But then, yes, when looking at the "processed"/semantic view, that could fail too in the case of an invalid address index, etc. > > > > The generic/processed/abstracted-over-ranges-and-rnglists API for ranges produces a fully computed vector (& then returns Expected of that range) - is that reasonable? (this does mean manifesting a whole location in memory, which may not be needed so I could understand avoiding that even without fully implementing & demonstrating the vector solution is inadequate). > > > > But I /think/ maybe we could/should have two APIs - one generic API that abstracts over loc/loclists and only provides the fully processed view, and another that is type specific for dumping the underlying representation (only used in dumping debug_loclists). > If we were computing the final address ranges from scratch (which would be the best match for the current lldb usage, but which I am not considering now for fear of changing too many things), then I agree that we would need the fallible_iterator thingy. But in this case we are "interpreting" the already parsed ranges, so we can assume some level of correctness here, and the thing that can fail is only the computation of a single range, which does not affect our ability to process the next entry.
> This indicates to me that either each entry in the list should be an Expected<>, or that the invalid entries should just be dropped (possibly accompanied by some flag which would tell the caller that the result was not exhaustive). > > This is connected to one of the issues I have with the debug ranges API -- it tries _really_ hard to return *something* -- if resolving the indirect base address entry fails, it is perfectly happy to use the address _index_ as the base address. This makes sense for dumping, where you want to show something (though it would still be good to indicate that you're not showing a real address), but it definitely does not help consumers which then need to make decisions based on the returned data. > > Anyway, yes, I agree that we need two APIs, and probably callbacks are the easiest way to achieve that. We could have a "base" callback that is not particularly nice to use, but provides the full information via a combination of `UnparsedLL` and `Optional>` arguments. The dumper could use that to print out everything it needs. And then we could have a second API, built on top of the first one, which ignores base address entries and the raw data and returns just a bunch of `Expected`. This could be used by users like lldb, who just want to see the final data. The `ParsedLL` type would be independent of the location list type, so that the debug_loc parser could provide the same kind of API (but implemented on top of something else, as the `UnparsedLL` types would differ). Also, under the hood, the location list dumper for debug_loclists (but not debug_loc) could reuse some implementation details with the debug_rnglists dumper via a suitable combination of templates and callbacks. > > How does that sound? What sort of things are you concerned about with deeper API changes here? I think it's probably worth building the "right" thing now - as good a time as any.
LLVM's debug info APIs, as you've pointed out, aren't exactly "sturdy" (treating address indexes as offsets, etc, etc), so no time like the present to clean it up. I think if we had an abstraction over v4 and v5 location descriptions, parsing from scratch, fallible iterators, etc - that'd be the ideal thing to use in the inline dumping code (that dumps inside debug_info) - which currently uses "parseOneLocationList" - so it is parsing from scratch and dumping. But equally I understand not wanting to make you/me/anyone fix everything when just trying to get something actually done. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68270/new/ https://reviews.llvm.org/D68270 From llvm-commits at lists.llvm.org Fri Oct 11 13:35:14 2019 From: llvm-commits at lists.llvm.org (Reid Kleckner via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 20:35:14 +0000 (UTC) Subject: [PATCH] D68839: [lit] Fix internal diff's --strip-trailing-cr and use it In-Reply-To: References: Message-ID: <4e419a54d18f618c31867457bf092093@localhost.localdomain> rnk accepted this revision. rnk added a comment. This revision is now accepted and ready to land. I looked into these tests, and it seems that they would've failed if it were not for Python's universal newline translator thing. When I diff the files in question with gnu diff from git bash, they appear to be different without `-w` or `--strip-trailing-cr`. So, your fix makes lit's diff more like gnu diff, and fixes the tests to work in that mode. The only downside is that this is one more way for tests to pass on Linux but fail on Windows out of the box. However, if we want to address that, I think we should fix it by adding a lit substitution for "\bdiff\b " to add `--strip-trailing-cr` on Windows. 
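The difference described above can be illustrated with a minimal sketch. This is not lit's actual diff implementation; `split_lines` and `same_content` are hypothetical names made up for this example, and the sketch only assumes that `--strip-trailing-cr` drops a carriage return at the end of each input line before comparing.

```python
def split_lines(data: bytes, strip_trailing_cr: bool = False) -> list:
    """Split raw bytes on LF only, optionally dropping one trailing CR per line.

    Working on bytes avoids any newline translation done by text-mode file
    reading, which is what hid the mismatch in the first place.
    """
    lines = data.split(b"\n")
    if strip_trailing_cr:
        lines = [l[:-1] if l.endswith(b"\r") else l for l in lines]
    return lines


def same_content(a: bytes, b: bytes, strip_trailing_cr: bool = False) -> bool:
    """Compare two inputs line by line, with optional CR stripping."""
    return split_lines(a, strip_trailing_cr) == split_lines(b, strip_trailing_cr)


crlf = b"first\r\nsecond\r\n"  # file checked out with Windows line endings
lf = b"first\nsecond\n"        # same file with Unix line endings

print(same_content(crlf, lf))                          # strict comparison
print(same_content(crlf, lf, strip_trailing_cr=True))  # CR-insensitive comparison
```

Unlike `-w`, this leaves all other whitespace significant, so a test that differs by a trailing space would still be reported as different.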
lgtm Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68839/new/ https://reviews.llvm.org/D68839 From llvm-commits at lists.llvm.org Fri Oct 11 13:35:15 2019 From: llvm-commits at lists.llvm.org (Hubert Tong via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 20:35:15 +0000 (UTC) Subject: [PATCH] D68341: [AIX] TOC pseudo expansion for 64bit large + 64bit small + 32bit large modes In-Reply-To: References: Message-ID: <3ccfa983e5d6166840a9873f6e71d5c7@localhost.localdomain> hubert.reinterpretcast added inline comments. ================ Comment at: llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp:926 + MCSymbolRefExpr::create(MOSymbol, MCSymbolRefExpr::VK_PPC_GOT_TPREL_HA, OutContext); EmitToStreamer(*OutStreamer, MCInstBuilder(PPC::ADDIS8) ---------------- Two more spaces here. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68341/new/ https://reviews.llvm.org/D68341 From llvm-commits at lists.llvm.org Fri Oct 11 13:35:16 2019 From: llvm-commits at lists.llvm.org (Adrian Prantl via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 20:35:16 +0000 (UTC) Subject: [PATCH] D68633: fix debug info affects output when opt inline In-Reply-To: References: Message-ID: <9ad3934fde017c4d73216bc60428dfcf@localhost.localdomain> aprantl added inline comments. ================ Comment at: llvm/lib/Transforms/Utils/InlineFunction.cpp:1855 + // exit the scan loop. + if (!isa(I) || + !allocaWouldBeStaticInEntry(cast(I))) ---------------- Does this work when I == E here? 
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68633/new/ https://reviews.llvm.org/D68633 From llvm-commits at lists.llvm.org Fri Oct 11 13:35:18 2019 From: llvm-commits at lists.llvm.org (Reid Kleckner via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 20:35:18 +0000 (UTC) Subject: [PATCH] D68450: [lit] Remove setting of the target-windows feature In-Reply-To: References: Message-ID: <0258f5434f409fec13706cc7e07d4b6a@localhost.localdomain> rnk accepted this revision. rnk added a comment. This revision is now accepted and ready to land. lgtm Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68450/new/ https://reviews.llvm.org/D68450 From llvm-commits at lists.llvm.org Fri Oct 11 13:37:07 2019 From: llvm-commits at lists.llvm.org (Reid Kleckner via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 20:37:07 +0000 (UTC) Subject: [PATCH] D68836: [lit] Small cleanups in main.py In-Reply-To: References: Message-ID: rnk accepted this revision. rnk added a comment. This revision is now accepted and ready to land. lgtm ================ Comment at: llvm/utils/lit/lit/main.py:32 + import tempfile lit_tmp = tempfile.mkdtemp(prefix="lit_tmp_") os.environ.update({ ---------------- Unrelated, but I wonder if we should augment this logic to garbage collect old `lit_tmp_` directories that are 24+ hours old. I routinely find lots of leaked lit_tmp_ directories because oftentimes the parent Python process is killed before it gets to the finally block below. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68836/new/ https://reviews.llvm.org/D68836 From llvm-commits at lists.llvm.org Fri Oct 11 13:37:07 2019 From: llvm-commits at lists.llvm.org (Sylvestre Ledru via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 20:37:07 +0000 (UTC) Subject: [PATCH] D66733: [analyzer] Add a checker option to detect nested dead stores In-Reply-To: References: Message-ID: sylvestre.ledru added a comment. 
I added it to the release notes here : https://reviews.llvm.org/rC374593 I am wondering if the option( WarnForDeadNestedAssignments ) to disable it is really necessary? I haven't seen any false positive while deadstore has some. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D66733/new/ https://reviews.llvm.org/D66733 From llvm-commits at lists.llvm.org Fri Oct 11 13:39:42 2019 From: llvm-commits at lists.llvm.org (Galina Kistanova via llvm-commits) Date: Fri, 11 Oct 2019 20:39:42 -0000 Subject: [zorg] r374594 - Added legacy mode support for automatic SVN schedulers. Message-ID: <20191011203942.481B491E55@lists.llvm.org> Author: gkistanova Date: Fri Oct 11 13:39:42 2019 New Revision: 374594 URL: http://llvm.org/viewvc/llvm-project?rev=374594&view=rev Log: Added legacy mode support for automatic SVN schedulers. Modified: zorg/trunk/buildbot/osuosl/master/config/schedulers.py Modified: zorg/trunk/buildbot/osuosl/master/config/schedulers.py URL: http://llvm.org/viewvc/llvm-project/zorg/trunk/buildbot/osuosl/master/config/schedulers.py?rev=374594&r1=374593&r2=374594&view=diff ============================================================================== --- zorg/trunk/buildbot/osuosl/master/config/schedulers.py (original) +++ zorg/trunk/buildbot/osuosl/master/config/schedulers.py Fri Oct 11 13:39:42 2019 @@ -26,8 +26,10 @@ def getSingleBranchSchedulers(builders, for builder in builders: # Only for the builders created with LLVMBuildFactory or similar. if getattr(builder['factory'], 'depends_on_projects', None): - # And only if this builder does not yet have an assigned scheduler. - if builder['name'] not in builders_with_schedulers: + # And only if this builder is in the legacy mode and + # does not yet have an assigned scheduler. + if getattr(builder['factory'], 'is_legacy_mode', True) and \ + builder['name'] not in builders_with_schedulers: # This builder is a candidate for an automatic scheduler. 
builders_with_automatic_schedulers.append(builder) From llvm-commits at lists.llvm.org Fri Oct 11 13:45:32 2019 From: llvm-commits at lists.llvm.org (Jian Cai via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 20:45:32 +0000 (UTC) Subject: [PATCH] D68764: [ARM][AsmParser] handles offset expression in parentheses In-Reply-To: References: Message-ID: <73e39b20131fd54cf246619b8d155588@localhost.localdomain> jcai19 updated this revision to Diff 224673. jcai19 marked 4 inline comments as done. jcai19 added a comment. Update based on comments. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68764/new/ https://reviews.llvm.org/D68764 Files: llvm/lib/Target/ARM/AsmParser/ARMAsmParser.cpp llvm/test/MC/ARM/gas-compl-mem-offset-paren.s Index: llvm/test/MC/ARM/gas-compl-mem-offset-paren.s =================================================================== --- /dev/null +++ llvm/test/MC/ARM/gas-compl-mem-offset-paren.s @@ -0,0 +1,20 @@ +@ RUN: llvm-mc -triple=arm-linux-gnueab < %s | FileCheck %s + +@ CHECK: ldr r12, [sp, #15] +ldr r12, [sp, (15)] + +@ CHECK: ldr r12, [sp, #15] +ldr r12, [sp, #(15)] + +@ CHECK: ldr r12, [sp, #15] +ldr r12, [sp, $(15)] + +@ CHECK: ldr r12, [sp, #100] +ldr r12, [sp, (((15+5)*5))] + +@ CHECK: ldr r12, [sp, #100] +ldr r12, [sp, #(((15+5)*5))] + + +@ CHECK: ldr r12, [sp, #100] +ldr r12, [sp, $(((15+5)*5))] Index: llvm/lib/Target/ARM/AsmParser/ARMAsmParser.cpp =================================================================== --- llvm/lib/Target/ARM/AsmParser/ARMAsmParser.cpp +++ llvm/lib/Target/ARM/AsmParser/ARMAsmParser.cpp @@ -5733,14 +5733,16 @@ return false; } - // If we have a '#', it's an immediate offset, else assume it's a register - // offset. Be friendly and also accept a plain integer (without a leading - // hash) for gas compatibility. + // If we have a '#' or '$', it's an immediate offset, else assume it's a + // register offset. 
Be friendly and also accept a plain integer or expression + // (without a leading hash) for gas compatibility. if (Parser.getTok().is(AsmToken::Hash) || Parser.getTok().is(AsmToken::Dollar) || + Parser.getTok().is(AsmToken::LParen) || Parser.getTok().is(AsmToken::Integer)) { - if (Parser.getTok().isNot(AsmToken::Integer)) - Parser.Lex(); // Eat '#' or '$'. + if (Parser.getTok().is(AsmToken::Hash) || + Parser.getTok().is(AsmToken::Dollar)) + Parser.Lex(); // Eat '#' or '$' E = Parser.getTok().getLoc(); bool isNegative = getParser().getTok().is(AsmToken::Minus); -------------- next part -------------- A non-text attachment was scrubbed... Name: D68764.224673.patch Type: text/x-patch Size: 1796 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Fri Oct 11 13:45:33 2019 From: llvm-commits at lists.llvm.org (Jian Cai via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 20:45:33 +0000 (UTC) Subject: [PATCH] D68764: [ARM][AsmParser] handles offset expression in parentheses In-Reply-To: References: Message-ID: <25fe7e2e69dfcb5bc9e1298b6ef005e9@localhost.localdomain> jcai19 marked 3 inline comments as done. jcai19 added inline comments. ================ Comment at: llvm/test/MC/ARM/gas-compl-mem-offset-paren.s:3 + +.syntax unified + ---------------- nickdesaulniers wrote: > If you remove this assembler directive outright, does the test still pass? If so, let's remove it. Also, it seems that you partially removed the other occurrences, but not all of them. It should occur once, or not at all (unless you wanted to test changing back and forth between them, but that's not what we're testing here). Oops! Thanks for the catch. 
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68764/new/ https://reviews.llvm.org/D68764 From llvm-commits at lists.llvm.org Fri Oct 11 13:55:02 2019 From: llvm-commits at lists.llvm.org (Reid Kleckner via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 20:55:02 +0000 (UTC) Subject: [PATCH] D68772: [COFF] Wrap things in namespace lld { namespace coff { In-Reply-To: References: Message-ID: <1bab4229d46b3de833ab9d89247d29e2@localhost.localdomain> rnk added inline comments. ================ Comment at: lld/COFF/DebugTypes.cpp:213-214 // moved here. -Expected -lld::coff::findTypeServerSource(const ObjFile *f) { Expected ts = TypeServerSource::findFromFile(f); ---------------- MaskRay wrote: > rnk wrote: > > I prefer this style for free functions because it makes it a hard error if there's a mismatch between the header and the cpp file. It's a pretty simple style rule: every function implemented in a .cpp file should either be qualified with a class or namespace name, or it should be marked static. Then you never have to worry about what the active namespace is outside of headers. > > > > That's just my personal preference and it's not in CodingStandards, but given how much we use free functions in LLD and LLVM, it's kind of nice. > Does the argument mean this patch should be reverted? > > If we have interleaved classes and free functions, we may have: > > ``` > namespace lld { > namespace coff { > void Class::method0() {} > } > } > > void lld::coff::free0() {} // we have to leave the active namespace, because otherwise [-Wextra-qualification] > > namespace lld { > namespace coff { > void Class::method1() {} > } > } > > void lld::coff::free1() {} > ``` > > Instead of doing that, this patch uses an outer most `namespace lld { namespace coff {` so we will not need to think much about the active namespace. > Does the argument mean this patch should be reverted? Maybe. 
I'm saying I would prefer to go the opposite direction from this patch, and standardize on the `lld::coff::foo` names. But, this is just my opinion, not a standard, and I want to see if people agree first. > If we have interleaved classes and free functions, we may have: > ... The code pattern we had before this change looked like: ``` // Foo.h namespace lld { namespace coff { class Foo { void bar(); }; void baz(); } } // lld::coff // Foo.cpp using namespace lld; using namespace lld::coff; void Foo::bar() { ... } void lld::coff::baz() { ... } ``` We never needed to open and close the namespaces in the first place, because classes like Foo are already in scope. In general, when do we have to open namespace in a .cpp file? I don't think there are any cases that really matter. I guess another thing that makes me lean this way is the LLVM preference for as few scopes as possible: - use early return/break/continue - prefer static to anon namespace This style seems like it fits into that: - have as few open namespace scopes as possible Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68772/new/ https://reviews.llvm.org/D68772 From llvm-commits at lists.llvm.org Fri Oct 11 13:58:27 2019 From: llvm-commits at lists.llvm.org (Quentin Colombet via llvm-commits) Date: Fri, 11 Oct 2019 20:58:27 -0000 Subject: [llvm] r374595 - [GISel][UnitTest] Fix a bunch of tests that were not doing anything Message-ID: <20191011205827.2056C840B8@lists.llvm.org> Author: qcolombet Date: Fri Oct 11 13:58:26 2019 New Revision: 374595 URL: http://llvm.org/viewvc/llvm-project?rev=374595&view=rev Log: [GISel][UnitTest] Fix a bunch of tests that were not doing anything After r368065, all the tests using GISelMITest must call setUp() before doing anything, otherwise the TargetMachine is not going to be set up. A few tests added after that commit were not doing that and ended up testing effectively nothing. 
Fix the setup of all the tests and fix the failing tests. Modified: llvm/trunk/unittests/CodeGen/GlobalISel/KnownBitsTest.cpp llvm/trunk/unittests/CodeGen/GlobalISel/MachineIRBuilderTest.cpp Modified: llvm/trunk/unittests/CodeGen/GlobalISel/KnownBitsTest.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/unittests/CodeGen/GlobalISel/KnownBitsTest.cpp?rev=374595&r1=374594&r2=374595&view=diff ============================================================================== --- llvm/trunk/unittests/CodeGen/GlobalISel/KnownBitsTest.cpp (original) +++ llvm/trunk/unittests/CodeGen/GlobalISel/KnownBitsTest.cpp Fri Oct 11 13:58:26 2019 @@ -120,17 +120,16 @@ TEST_F(GISelMITest, TestKnownBits) { } TEST_F(GISelMITest, TestSignBitIsZero) { + setUp(); if (!TM) return; const LLT S32 = LLT::scalar(32); - auto SignBit = B.buildConstant(S32, 0x8000000); + auto SignBit = B.buildConstant(S32, 0x80000000); auto Zero = B.buildConstant(S32, 0); GISelKnownBits KnownBits(*MF); EXPECT_TRUE(KnownBits.signBitIsZero(Zero.getReg(0))); - EXPECT_FALSE(KnownBits.signBitIsZero(Zero.getReg(0))); EXPECT_FALSE(KnownBits.signBitIsZero(SignBit.getReg(0))); - EXPECT_TRUE(KnownBits.signBitIsZero(SignBit.getReg(0))); } Modified: llvm/trunk/unittests/CodeGen/GlobalISel/MachineIRBuilderTest.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/unittests/CodeGen/GlobalISel/MachineIRBuilderTest.cpp?rev=374595&r1=374594&r2=374595&view=diff ============================================================================== --- llvm/trunk/unittests/CodeGen/GlobalISel/MachineIRBuilderTest.cpp (original) +++ llvm/trunk/unittests/CodeGen/GlobalISel/MachineIRBuilderTest.cpp Fri Oct 11 13:58:26 2019 @@ -74,6 +74,7 @@ TEST_F(GISelMITest, TestBuildConstantFCo #endif TEST_F(GISelMITest, DstOpSrcOp) { + setUp(); if (!TM) return; @@ -99,6 +100,7 @@ TEST_F(GISelMITest, DstOpSrcOp) { } TEST_F(GISelMITest, BuildUnmerge) { + setUp(); if (!TM) return; @@ -119,6 +121,7 @@ TEST_F(GISelMITest, BuildUnmerge) { } TEST_F(GISelMITest, 
TestBuildFPInsts) { + setUp(); if (!TM) return; @@ -154,6 +157,7 @@ TEST_F(GISelMITest, TestBuildFPInsts) { } TEST_F(GISelMITest, BuildIntrinsic) { + setUp(); if (!TM) return; @@ -182,6 +186,7 @@ TEST_F(GISelMITest, BuildIntrinsic) { } TEST_F(GISelMITest, BuildXor) { + setUp(); if (!TM) return; @@ -210,6 +215,7 @@ TEST_F(GISelMITest, BuildXor) { } TEST_F(GISelMITest, BuildBitCounts) { + setUp(); if (!TM) return; @@ -237,6 +243,7 @@ TEST_F(GISelMITest, BuildBitCounts) { } TEST_F(GISelMITest, BuildCasts) { + setUp(); if (!TM) return; @@ -261,6 +268,7 @@ TEST_F(GISelMITest, BuildCasts) { } TEST_F(GISelMITest, BuildMinMax) { + setUp(); if (!TM) return; @@ -286,6 +294,7 @@ TEST_F(GISelMITest, BuildMinMax) { } TEST_F(GISelMITest, BuildAtomicRMW) { + setUp(); if (!TM) return; From llvm-commits at lists.llvm.org Fri Oct 11 14:04:28 2019 From: llvm-commits at lists.llvm.org (Digger via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 21:04:28 +0000 (UTC) Subject: [PATCH] D67008: [llvm-readobj][XCOFF]implement parsing relocation information for 32-bit xcoff objectfile In-Reply-To: References: Message-ID: DiggerLin marked 4 inline comments as done. DiggerLin added inline comments. ================ Comment at: llvm/include/llvm/Object/XCOFFObjectFile.h:151 + + // Packed field, see XR_* masks for details of packing. + uint8_t Info; ---------------- hubert.reinterpretcast wrote: > Move the masks to the start of this class. Separate the nested types/constants from the fields using an access specifier label (e.g., `public`) or a comment. changed a suggestion. ================ Comment at: llvm/include/llvm/Object/XCOFFObjectFile.h:156 + + bool isRelocationSigned() const; + ---------------- hubert.reinterpretcast wrote: > Separate the fields from the methods using an access specified label or a comment. 
changed as suggested ================ Comment at: llvm/include/llvm/Object/XCOFFObjectFile.h:158 + + // If the Fixup bit is set, it indicates that the linker has modified + // the instruction the relocation refers to. ---------------- hubert.reinterpretcast wrote: > Remove the comment and the blank line before it once the mask constants are defined in the class. changed as suggested ================ Comment at: llvm/include/llvm/Object/XCOFFObjectFile.h:313 + getLogicalNumberOfRelocationEntries(const XCOFFSectionHeader32 &Sec) const; + Expected> + relocations(const XCOFFSectionHeader32 &) const; ---------------- hubert.reinterpretcast wrote: > I would prefer a blank line between multi-line function declarations. added Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67008/new/ https://reviews.llvm.org/D67008 From llvm-commits at lists.llvm.org Fri Oct 11 14:04:28 2019 From: llvm-commits at lists.llvm.org (Chris Bieneman via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 21:04:28 +0000 (UTC) Subject: [PATCH] D68833: [CMake] Re-order runtimes in the order of dependencies In-Reply-To: References: Message-ID: <9faef269128327a630ccb65ab5eb5db0@localhost.localdomain> beanz added a comment. In D68833#1706315, @ldionne wrote: > Yes, precisely. That's also the currently preferred way of building libc++: https://libcxx.llvm.org/docs/BuildingLibcxx.html That documentation is more than a bit lacking, as is much of the LLVM documentation. > Not everybody ships libc++/libc++abi as part of the toolchain, and for those, building with whatever `CMAKE_CXX_COMPILER` they specify is really the right thing to do. While this is true, in many instances, even when libc++ isn't shipped with a toolchain, it is locked to one. Darwin is a prime example of this. On Darwin libc++ is shipped as part of the OS, but that cycle is closely coordinated with the toolchain updates and the two are usually kept in sync.
> Don't get me wrong, I'm 100% on board that there's value in having this runtime build, however let's not pretend that it's the only correct way to build libc++. I would argue if you don't ship libc++ as part of the toolchain you shouldn't build it as part of the toolchain either. In which case the standalone build configuration is the correct way to build it. My intention isn't to say building libcxx as a runtime is the only correct way to build libc++, my intention is to state that *if* you are building libc++ with the toolchain, building it as a runtime is the only correct way to build it. > Say I need to generate a file based on the properties of a target. I'll need to call `get_target_property` on a target that hasn't been defined yet, and there's no way around that because `file(GENERATE)` does not expand generator expressions. Is this something you need to do? If so I'd question higher-level decisions about how the build is structured. > Are you thinking about this? > > set(libname "$,$,${lib}>") > list(APPEND link_libraries "${CMAKE_LINK_LIBRARY_FLAG}${libname}") > > > That is clever, I had not thought about it. > > If the above workaround works, I don't care about this patch that much. I still think we need to clarify the status of the Runtimes build and document it, unless that's already done and I've missed it (in which case please point it to me). That is a much better approach. CMake 3.11 is when the `TARGET_EXISTS` generator expression was added, although the documentation wasn't updated until CMake 3.15, so this change would require a CMake version update, which I don't think is unreasonable. 
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68833/new/ https://reviews.llvm.org/D68833 From llvm-commits at lists.llvm.org Fri Oct 11 14:04:28 2019 From: llvm-commits at lists.llvm.org (Nick Desaulniers via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 21:04:28 +0000 (UTC) Subject: [PATCH] D68764: [ARM][AsmParser] handles offset expression in parentheses In-Reply-To: References: Message-ID: <176461607ee57fb71fed7dc08fe27bfe@localhost.localdomain> nickdesaulniers added inline comments. ================ Comment at: llvm/test/MC/ARM/gas-compl-mem-offset-paren.s:1 +@ RUN: llvm-mc -triple=arm < %s | FileCheck %s + ---------------- nickdesaulniers wrote: > Since this is a GAS compliance test, let's use a GAS triple, like `-triple=arm-linux-gnueabi`. missing the `i` on the end. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68764/new/ https://reviews.llvm.org/D68764 From llvm-commits at lists.llvm.org Fri Oct 11 14:04:29 2019 From: llvm-commits at lists.llvm.org (Digger via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 21:04:29 +0000 (UTC) Subject: [PATCH] D67008: [llvm-readobj][XCOFF]implement parsing relocation information for 32-bit xcoff objectfile In-Reply-To: References: Message-ID: <2d537229ef4b09566a43468a20c70691@localhost.localdomain> DiggerLin updated this revision to Diff 224675. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67008/new/ https://reviews.llvm.org/D67008 Files: llvm/include/llvm/BinaryFormat/XCOFF.h llvm/include/llvm/Object/XCOFFObjectFile.h llvm/lib/Object/XCOFFObjectFile.cpp llvm/test/tools/llvm-readobj/reloc_overflow.test llvm/test/tools/llvm-readobj/xcoff-basic.test llvm/tools/llvm-readobj/XCOFFDumper.cpp -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D67008.224675.patch Type: text/x-patch Size: 19552 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Fri Oct 11 14:04:29 2019 From: llvm-commits at lists.llvm.org (Joel E. Denny via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 21:04:29 +0000 (UTC) Subject: [PATCH] D68839: [lit] Fix internal diff's --strip-trailing-cr and use it In-Reply-To: References: Message-ID: jdenny added a comment. In D68839#1706504 , @rnk wrote: > I looked into these tests, and it seems that they would've failed if it were not for Python's universal newline translator thing. When I diff the files in question with gnu diff from git bash, they appear to be different without `-w` or `--strip-trailing-cr`. So, your fix makes lit's diff more like gnu diff, and fixes the tests to work in that mode. Thanks for verifying. I only had access to logs, so I was doing some guesswork. > The only downside is that this is one more way for tests to pass on Linux but fail on Windows out of the box. Yes, I debated whether the old behavior was desirable, but ultimately I concluded that consistency with external diffs is more important to avoid surprising behavior. > However, if we want to address that, I think we should fix it by adding a lit substitution for "\bdiff\b " to add `--strip-trailing-cr` on Windows. Agreed. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68839/new/ https://reviews.llvm.org/D68839 From llvm-commits at lists.llvm.org Fri Oct 11 14:04:30 2019 From: llvm-commits at lists.llvm.org (Dmitry Mikulin via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 21:04:30 +0000 (UTC) Subject: [PATCH] D67985: CFI: wrong type passed to llvm.type.test with multiple inheritance devirtualization In-Reply-To: References: Message-ID: dmikulin added a comment. 
@pcc : poke CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67985/new/ https://reviews.llvm.org/D67985 From llvm-commits at lists.llvm.org Fri Oct 11 14:13:37 2019 From: llvm-commits at lists.llvm.org (Thomas Lively via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 21:13:37 +0000 (UTC) Subject: [PATCH] D68889: [WebAssembly] Allow multivalue types in block signature operands Message-ID: tlively created this revision. tlively added reviewers: aheejin, dschuff, aardappel. Herald added subscribers: llvm-commits, sunfish, hiraditya, jgravelle-google, sbc100. Herald added a project: LLVM. Renames `ExprType` to the more apt `BlockType` and adds a variant for multivalue blocks. Currently non-void blocks are only generated at the end of functions where the block return type needs to agree with the function return type, and that remains true for multivalue blocks. That invariant means that the actual signature does not need to be stored in the block signature `MachineOperand` because it can be inferred by `WebAssemblyMCInstLower` from the return type of the parent function. `WebAssemblyMCInstLower` continues to lower block signature operands to immediates when possible but lowers multivalue signatures to function type symbols. The AsmParser and Disassembler are updated to handle multivalue block types as well. 
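A minimal sketch of the classification described above, assuming (as the summary states) that a non-void block only appears where its result must match the return types of the enclosing function. The names `BlockSig` and `blockSigForReturns` are hypothetical and do not come from the patch; the value-type byte values are the standard WebAssembly binary encodings.

```cpp
#include <cstdint>
#include <vector>

// WebAssembly value types with their binary encodings.
enum class ValType : uint8_t { I32 = 0x7F, I64 = 0x7E, F32 = 0x7D, F64 = 0x7C };

// Hypothetical summary of a block signature: void, one result, or
// multivalue. The multivalue case carries no result list here because it
// can be looked up from the enclosing function's type when lowering.
struct BlockSig {
  enum Kind { Void, Single, Multivalue } kind;
  ValType single;  // meaningful only when kind == Single
};

// Derive the block signature from the enclosing function's return types.
BlockSig blockSigForReturns(const std::vector<ValType> &returns) {
  if (returns.empty())
    return {BlockSig::Void, ValType::I32};
  if (returns.size() == 1)
    return {BlockSig::Single, returns[0]};
  // More than one result cannot be encoded as a single immediate and has
  // to refer to a function type instead.
  return {BlockSig::Multivalue, ValType::I32};
}
```

With this shape, the void and single-result cases can still be lowered to an immediate, and only the multivalue case needs a reference to a function type.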
Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D68889 Files: llvm/lib/Target/WebAssembly/AsmParser/WebAssemblyAsmParser.cpp llvm/lib/Target/WebAssembly/Disassembler/LLVMBuild.txt llvm/lib/Target/WebAssembly/Disassembler/WebAssemblyDisassembler.cpp llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyInstPrinter.cpp llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyMCCodeEmitter.cpp llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyMCTargetDesc.h llvm/lib/Target/WebAssembly/WebAssemblyCFGStackify.cpp llvm/lib/Target/WebAssembly/WebAssemblyMCInstLower.cpp llvm/lib/Target/WebAssembly/WebAssemblyMCInstLower.h llvm/test/CodeGen/WebAssembly/multivalue.ll llvm/test/MC/Disassembler/WebAssembly/wasm-error.txt llvm/test/MC/WebAssembly/basic-assembly.s llvm/tools/llvm-mc/Disassembler.cpp llvm/tools/llvm-mc/Disassembler.h llvm/tools/llvm-mc/llvm-mc.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D68889.224676.patch Type: text/x-patch Size: 24640 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Fri Oct 11 14:13:37 2019 From: llvm-commits at lists.llvm.org (Jake Ehrlich via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 21:13:37 +0000 (UTC) Subject: [PATCH] D68886: Remove unnecessary codes in llvm-dwarfdump In-Reply-To: References: Message-ID: jakehehrlich added a comment. I'm not sure why I was added but this looks fine to me. If it compiles and runs on everyone's system I can't see how this could be anything but good. 
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68886/new/ https://reviews.llvm.org/D68886 From llvm-commits at lists.llvm.org Fri Oct 11 14:13:38 2019 From: llvm-commits at lists.llvm.org (Peter Collingbourne via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 21:13:38 +0000 (UTC) Subject: [PATCH] D67985: CFI: wrong type passed to llvm.type.test with multiple inheritance devirtualization In-Reply-To: References: Message-ID: pcc accepted this revision. pcc added a comment. This revision is now accepted and ready to land. LGTM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67985/new/ https://reviews.llvm.org/D67985 From llvm-commits at lists.llvm.org Fri Oct 11 14:22:46 2019 From: llvm-commits at lists.llvm.org (Hubert Tong via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 21:22:46 +0000 (UTC) Subject: [PATCH] D67008: [llvm-readobj][XCOFF]implement parsing relocation information for 32-bit xcoff objectfile In-Reply-To: References: Message-ID: <4589c6174c106938b5b4e2c8fdfd7128@localhost.localdomain> hubert.reinterpretcast accepted this revision. hubert.reinterpretcast added a comment. This revision is now accepted and ready to land. LGTM. Thanks. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67008/new/ https://reviews.llvm.org/D67008 From llvm-commits at lists.llvm.org Fri Oct 11 14:22:46 2019 From: llvm-commits at lists.llvm.org (Vedant Kumar via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 21:22:46 +0000 (UTC) Subject: [PATCH] D52199: [profile] Install headers for custom runtime maintainers In-Reply-To: References: Message-ID: <49a5c0e8f9caaff3746ffa5b0eb59575@localhost.localdomain> vsk added a comment. 
Friendly ping CHANGES SINCE LAST ACTION https://reviews.llvm.org/D52199/new/ https://reviews.llvm.org/D52199 From llvm-commits at lists.llvm.org Fri Oct 11 14:41:04 2019 From: llvm-commits at lists.llvm.org (Bjorn Pettersson via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 21:41:04 +0000 (UTC) Subject: [PATCH] D68633: [utils] InlineFunction: fix for debug info affecting optimizations In-Reply-To: References: Message-ID: <957d82af2f1874a09f045db51ca94d10@localhost.localdomain> bjope added a reviewer: fhahn. bjope added a comment. In D68633#1705697 , @yechunliang wrote: > > So the root cause is rather that we treat an alloca being immediately preceded by another alloca differently from the case when it is preceded by another kind of instruction. This happens also when having other instructions in between, and is not specific to dbg intrinsics (could be interesting to add a test case where you replace the dbg intrinsics by something else). > > Yes, I think so. If the instruction between two allocas is not a dbg instruction, InlineFunction behaves the same with and without "-strip-debug": both erase the second use_empty alloca. This patch fixes the issue that debug instructions cause InlineFunction to generate different output. > > > So I think that the solution might be based on one of these ideas: > > > > 1. Remove the check for use_empty in the outer loop. > > 2. Add a check for !use_empty in the inner loop. > > 3. Remove the inner loop (i.e. only splice one alloca at a time). > > These are good ideas, but they concern a design change to alloca inlining or an improvement to the splice. > Reading the code, I understand the alloca inlining behavior like this: first detect one !use_empty alloca; if the immediately following instructions are allocas, even use_empty ones, they are all added one after another and moved to the caller together with the first alloca. 
If another instruction (dbg or otherwise) exists between allocas, the next alloca is checked for use_empty and erased if it is. Is this behavior correct, or could it be improved? I don't know much about alloca inlining, but the code seems to have run for many years with this design. I'm no expert on this part of the code either. But there is no reasonable logic in handling allocas differently depending on the existence of other allocas here afaict. It looks like it has been like that for over 10 years, but I doubt that it justifies adding yet another level of illogical handling here. This time handling dbg intrinsics differently depending on the existence of, possibly unrelated, alloca instructions. It would just make this an even bigger mess. Would be much better if we could fix what seems to have been a bug(?) for over a decade (which also would solve the debug invariance problem). (Maybe you'll need another set of reviewers considering that the fix would impact the alloca inlining slightly, rather than just aiming at dbg intrinsics?) CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68633/new/ https://reviews.llvm.org/D68633 From llvm-commits at lists.llvm.org Fri Oct 11 14:41:05 2019 From: llvm-commits at lists.llvm.org (Artem Belevich via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 21:41:05 +0000 (UTC) Subject: [PATCH] D68892: [NVPTX] Restructure shfl instrinsics and add variants that return a predicate. Message-ID: tra created this revision. tra added a reviewer: timshen. Herald added subscribers: jdoerfert, sanjoy.google, bixia, hiraditya, jholewinski. Herald added a project: LLVM. Restructure shfl intrinsics and add variants that return a predicate. Amend constraints for non-sync variants that are no longer available on sm_70+ with PTX6.4+. 
https://reviews.llvm.org/D68892 Files: llvm/include/llvm/IR/IntrinsicsNVVM.td llvm/lib/Target/NVPTX/NVPTXInstrInfo.td llvm/lib/Target/NVPTX/NVPTXIntrinsics.td llvm/test/CodeGen/NVPTX/shfl-p.ll llvm/test/CodeGen/NVPTX/shfl-sync-p.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D68892.224679.patch Type: text/x-patch Size: 31888 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Fri Oct 11 14:51:24 2019 From: llvm-commits at lists.llvm.org (Simon Atanasyan via llvm-commits) Date: Fri, 11 Oct 2019 21:51:24 -0000 Subject: [llvm] r374597 - [mips] Use less instruction to load zero into FPR by li.s / li.d pseudos Message-ID: <20191011215124.2EE2986DB8@lists.llvm.org> Author: atanasyan Date: Fri Oct 11 14:51:23 2019 New Revision: 374597 URL: http://llvm.org/viewvc/llvm-project?rev=374597&view=rev Log: [mips] Use less instruction to load zero into FPR by li.s / li.d pseudos If `li.s` or `li.d` loads zero into a FPR, it's not necessary to load zero into `at` GPR register and then move its value into a floating point register. We can use as a source register the `zero / $0` one. 
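As a rough illustration of the log message above (a standalone sketch, not the MipsAsmParser code): the expansion only needs the `$at` scratch register when the bit pattern to load is non-zero. The function name `expandLiS` and the string-based instruction list are made up for this example, and the load into `$1` is left as a placeholder because the real sequence depends on the immediate.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical model of expanding `li.s <fpr>, <imm>`; `bits` is the
// single-precision bit pattern of the immediate.
std::vector<std::string> expandLiS(uint32_t bits, const std::string &fpr) {
  std::vector<std::string> out;
  std::string src = "$zero";  // a zero pattern needs no scratch register
  if (bits != 0) {
    src = "$1";  // $at, only claimed when a value must be materialized
    out.push_back("<materialize " + std::to_string(bits) + " in $1>");
  }
  // Move the GPR contents into the floating-point register.
  out.push_back("mtc1 " + src + ", " + fpr);
  return out;
}
```

For a zero immediate this yields the single `mtc1 $zero, $f4` that the updated `macro-li.s.s` checks expect; any other pattern still goes through `$1`.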
Differential Revision: https://reviews.llvm.org/D68777 Modified: llvm/trunk/lib/Target/Mips/AsmParser/MipsAsmParser.cpp llvm/trunk/test/MC/Mips/macro-li.d.s llvm/trunk/test/MC/Mips/macro-li.s.s Modified: llvm/trunk/lib/Target/Mips/AsmParser/MipsAsmParser.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/Mips/AsmParser/MipsAsmParser.cpp?rev=374597&r1=374596&r2=374597&view=diff ============================================================================== --- llvm/trunk/lib/Target/Mips/AsmParser/MipsAsmParser.cpp (original) +++ llvm/trunk/lib/Target/Mips/AsmParser/MipsAsmParser.cpp Fri Oct 11 14:51:23 2019 @@ -3345,13 +3345,16 @@ bool MipsAsmParser::expandLoadSingleImmT uint32_t ImmOp32 = covertDoubleImmToSingleImm(ImmOp64); - unsigned TmpReg = getATReg(IDLoc); - if (!TmpReg) - return true; + unsigned TmpReg = Mips::ZERO; + if (ImmOp32 != 0) { + TmpReg = getATReg(IDLoc); + if (!TmpReg) + return true; + } if (Lo_32(ImmOp64) == 0) { - if (loadImmediate(ImmOp32, TmpReg, Mips::NoRegister, true, true, IDLoc, Out, - STI)) + if (TmpReg != Mips::ZERO && loadImmediate(ImmOp32, TmpReg, Mips::NoRegister, + true, false, IDLoc, Out, STI)) return true; TOut.emitRR(Mips::MTC1, FirstReg, TmpReg, IDLoc, STI); return false; @@ -3469,24 +3472,26 @@ bool MipsAsmParser::expandLoadDoubleImmT uint32_t LoImmOp64 = Lo_32(ImmOp64); uint32_t HiImmOp64 = Hi_32(ImmOp64); - unsigned TmpReg = getATReg(IDLoc); - if (!TmpReg) - return true; + unsigned TmpReg = Mips::ZERO; + if (ImmOp64 != 0) { + TmpReg = getATReg(IDLoc); + if (!TmpReg) + return true; + } if ((LoImmOp64 == 0) && !((HiImmOp64 & 0xffff0000) && (HiImmOp64 & 0x0000ffff))) { - // FIXME: In the case where the constant is zero, we can load the - // register directly from the zero register. 
- if (isABI_N32() || isABI_N64()) { - if (loadImmediate(ImmOp64, TmpReg, Mips::NoRegister, false, false, IDLoc, + if (TmpReg != Mips::ZERO && + loadImmediate(ImmOp64, TmpReg, Mips::NoRegister, false, false, IDLoc, Out, STI)) return true; TOut.emitRR(Mips::DMTC1, FirstReg, TmpReg, IDLoc, STI); return false; } - if (loadImmediate(HiImmOp64, TmpReg, Mips::NoRegister, true, false, IDLoc, + if (TmpReg != Mips::ZERO && + loadImmediate(HiImmOp64, TmpReg, Mips::NoRegister, true, false, IDLoc, Out, STI)) return true; Modified: llvm/trunk/test/MC/Mips/macro-li.d.s URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/MC/Mips/macro-li.d.s?rev=374597&r1=374596&r2=374597&view=diff ============================================================================== --- llvm/trunk/test/MC/Mips/macro-li.d.s (original) +++ llvm/trunk/test/MC/Mips/macro-li.d.s Fri Oct 11 14:51:23 2019 @@ -228,24 +228,18 @@ li.d $4, 12345678910123456789.1234567891 # N32-N64: ld $4, 0($1) # encoding: [0x00,0x00,0x24,0xdc] li.d $f4, 0 -# O32: addiu $1, $zero, 0 # encoding: [0x00,0x00,0x01,0x24] -# O32: mtc1 $1, $f5 # encoding: [0x00,0x28,0x81,0x44] +# O32: mtc1 $zero, $f5 # encoding: [0x00,0x28,0x80,0x44] # O32: mtc1 $zero, $f4 # encoding: [0x00,0x20,0x80,0x44] -# CHECK-MIPS32r2: addiu $1, $zero, 0 # encoding: [0x00,0x00,0x01,0x24] # CHECK-MIPS32r2: mtc1 $zero, $f4 # encoding: [0x00,0x20,0x80,0x44] -# CHECK-MIPS32r2: mthc1 $1, $f4 # encoding: [0x00,0x20,0xe1,0x44] -# N32-N64: addiu $1, $zero, 0 # encoding: [0x00,0x00,0x01,0x24] -# N32-N64: dmtc1 $1, $f4 # encoding: [0x00,0x20,0xa1,0x44] +# CHECK-MIPS32r2: mthc1 $zero, $f4 # encoding: [0x00,0x20,0xe0,0x44] +# N32-N64: dmtc1 $zero, $f4 # encoding: [0x00,0x20,0xa0,0x44] li.d $f4, 0.0 -# O32: addiu $1, $zero, 0 # encoding: [0x00,0x00,0x01,0x24] -# O32: mtc1 $1, $f5 # encoding: [0x00,0x28,0x81,0x44] +# O32: mtc1 $zero, $f5 # encoding: [0x00,0x28,0x80,0x44] # O32: mtc1 $zero, $f4 # encoding: [0x00,0x20,0x80,0x44] -# CHECK-MIPS32r2: addiu $1, $zero, 0 # 
encoding: [0x00,0x00,0x01,0x24] # CHECK-MIPS32r2: mtc1 $zero, $f4 # encoding: [0x00,0x20,0x80,0x44] -# CHECK-MIPS32r2: mthc1 $1, $f4 # encoding: [0x00,0x20,0xe1,0x44] -# N32-N64: addiu $1, $zero, 0 # encoding: [0x00,0x00,0x01,0x24] -# N32-N64: dmtc1 $1, $f4 # encoding: [0x00,0x20,0xa1,0x44] +# CHECK-MIPS32r2: mthc1 $zero, $f4 # encoding: [0x00,0x20,0xe0,0x44] +# N32-N64: dmtc1 $zero, $f4 # encoding: [0x00,0x20,0xa0,0x44] li.d $f4, 1.12345 # ALL: .section .rodata,"a", at progbits Modified: llvm/trunk/test/MC/Mips/macro-li.s.s URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/MC/Mips/macro-li.s.s?rev=374597&r1=374596&r2=374597&view=diff ============================================================================== --- llvm/trunk/test/MC/Mips/macro-li.s.s (original) +++ llvm/trunk/test/MC/Mips/macro-li.s.s Fri Oct 11 14:51:23 2019 @@ -45,12 +45,10 @@ li.s $4, 12345678910123456789.1234567891 # ALL: ori $4, $4, 21674 # encoding: [0xaa,0x54,0x84,0x34] li.s $f4, 0 -# ALL: addiu $1, $zero, 0 # encoding: [0x00,0x00,0x01,0x24] -# ALL: mtc1 $1, $f4 # encoding: [0x00,0x20,0x81,0x44] +# ALL: mtc1 $zero, $f4 # encoding: [0x00,0x20,0x80,0x44] li.s $f4, 0.0 -# ALL: addiu $1, $zero, 0 # encoding: [0x00,0x00,0x01,0x24] -# ALL: mtc1 $1, $f4 # encoding: [0x00,0x20,0x81,0x44] +# ALL: mtc1 $zero, $f4 # encoding: [0x00,0x20,0x80,0x44] li.s $f4, 1.12345 # ALL: .section .rodata,"a", at progbits From llvm-commits at lists.llvm.org Fri Oct 11 14:51:33 2019 From: llvm-commits at lists.llvm.org (Simon Atanasyan via llvm-commits) Date: Fri, 11 Oct 2019 21:51:33 -0000 Subject: [llvm] r374598 - [mips] Store 64-bit `li.d' operand as a single 8-byte value Message-ID: <20191011215133.8325D8AC1A@lists.llvm.org> Author: atanasyan Date: Fri Oct 11 14:51:33 2019 New Revision: 374598 URL: http://llvm.org/viewvc/llvm-project?rev=374598&view=rev Log: [mips] Store 64-bit `li.d' operand as a single 8-byte value Now assembler generates two consecutive `.4byte` directives to store 64-bit `li.d' operand. 
The first directive stores high 4-byte of the value. The second directive stores low 4-byte of the value. But on 64-bit system we load this value at once and get wrong result if the system is little-endian. This patch fixes the bug. It stores the `li.d' operand as a single 8-byte value. Differential Revision: https://reviews.llvm.org/D68778 Modified: llvm/trunk/lib/Target/Mips/AsmParser/MipsAsmParser.cpp llvm/trunk/test/MC/Mips/macro-li.d.s Modified: llvm/trunk/lib/Target/Mips/AsmParser/MipsAsmParser.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/Mips/AsmParser/MipsAsmParser.cpp?rev=374598&r1=374597&r2=374598&view=diff ============================================================================== --- llvm/trunk/lib/Target/Mips/AsmParser/MipsAsmParser.cpp (original) +++ llvm/trunk/lib/Target/Mips/AsmParser/MipsAsmParser.cpp Fri Oct 11 14:51:33 2019 @@ -3433,8 +3433,8 @@ bool MipsAsmParser::expandLoadDoubleImmT getStreamer().SwitchSection(ReadOnlySection); getStreamer().EmitLabel(Sym, IDLoc); - getStreamer().EmitIntValue(HiImmOp64, 4); - getStreamer().EmitIntValue(LoImmOp64, 4); + getStreamer().EmitValueToAlignment(8); + getStreamer().EmitIntValue(ImmOp64, 8); getStreamer().SwitchSection(CS); if (emitPartialAddress(TOut, IDLoc, Sym)) @@ -3519,8 +3519,8 @@ bool MipsAsmParser::expandLoadDoubleImmT getStreamer().SwitchSection(ReadOnlySection); getStreamer().EmitLabel(Sym, IDLoc); - getStreamer().EmitIntValue(HiImmOp64, 4); - getStreamer().EmitIntValue(LoImmOp64, 4); + getStreamer().EmitValueToAlignment(8); + getStreamer().EmitIntValue(ImmOp64, 8); getStreamer().SwitchSection(CS); if (emitPartialAddress(TOut, IDLoc, Sym)) Modified: llvm/trunk/test/MC/Mips/macro-li.d.s URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/MC/Mips/macro-li.d.s?rev=374598&r1=374597&r2=374598&view=diff ============================================================================== --- llvm/trunk/test/MC/Mips/macro-li.d.s (original) +++ 
llvm/trunk/test/MC/Mips/macro-li.d.s Fri Oct 11 14:51:33 2019 @@ -17,11 +17,11 @@ li.d $4, 0.0 # N32-N64: daddiu $4, $zero, 0 # encoding: [0x00,0x00,0x04,0x64] li.d $4, 1.12345 -# ALL: .section .rodata,"a", at progbits -# ALL: [[LABEL:\$tmp[0-9]+]]: -# ALL: .4byte 1072822694 -# ALL: .4byte 3037400872 -# ALL: .text +# ALL: .section .rodata,"a", at progbits +# ALL-NEXT: [[LABEL:\$tmp[0-9]+]]: +# ALL-NEXT: .p2align 3 +# ALL-NEXT: .8byte 4607738388174016296 +# ALL-NEXT: .text # O32-N32-NO-PIC: lui $1, %hi([[LABEL]]) # encoding: [A,A,0x01,0x3c] # O32-N32-NO-PIC: # fixup A - offset: 0, value: %hi([[LABEL]]), kind: fixup_Mips_HI16 # O32-N32-NO-PIC: addiu $1, $1, %lo([[LABEL]]) # encoding: [A,A,0x21,0x24] @@ -61,11 +61,11 @@ li.d $4, 1.0 # N32-N64: dsll $4, $4, 46 # encoding: [0xbc,0x23,0x04,0x00] li.d $4, 12345678910 -# ALL: .section .rodata,"a", at progbits -# ALL: [[LABEL:\$tmp[0-9]+]]: -# ALL: .4byte 1107754720 -# ALL: .4byte 3790602240 -# ALL: .text +# ALL: .section .rodata,"a", at progbits +# ALL-NEXT: [[LABEL:\$tmp[0-9]+]]: +# ALL-NEXT: .p2align 3 +# ALL-NEXT: .8byte 4757770298180239360 +# ALL-NEXT: .text # O32-N32-NO-PIC: lui $1, %hi([[LABEL]]) # encoding: [A,A,0x01,0x3c] # O32-N32-NO-PIC: # fixup A - offset: 0, value: %hi([[LABEL]]), kind: fixup_Mips_HI16 # O32-N32-NO-PIC: addiu $1, $1, %lo([[LABEL]]) # encoding: [A,A,0x21,0x24] @@ -93,11 +93,11 @@ li.d $4, 12345678910 # N32-N64: ld $4, 0($1) # encoding: [0x00,0x00,0x24,0xdc] li.d $4, 12345678910.0 -# ALL: .section .rodata,"a", at progbits -# ALL: [[LABEL:\$tmp[0-9]+]]: -# ALL: .4byte 1107754720 -# ALL: .4byte 3790602240 -# ALL: .text +# ALL: .section .rodata,"a", at progbits +# ALL-NEXT: [[LABEL:\$tmp[0-9]+]]: +# ALL-NEXT: .p2align 3 +# ALL-NEXT: .8byte 4757770298180239360 +# ALL-NEXT: .text # O32-N32-NO-PIC: lui $1, %hi([[LABEL]]) # encoding: [A,A,0x01,0x3c] # O32-N32-NO-PIC: # fixup A - offset: 0, value: %hi([[LABEL]]), kind: fixup_Mips_HI16 # O32-N32-NO-PIC: addiu $1, $1, %lo([[LABEL]]) # encoding: 
[A,A,0x21,0x24] @@ -125,11 +125,11 @@ li.d $4, 12345678910.0 # N32-N64: ld $4, 0($1) # encoding: [0x00,0x00,0x24,0xdc] li.d $4, 0.4 -# ALL: .section .rodata,"a", at progbits -# ALL: [[LABEL:\$tmp[0-9]+]]: -# ALL: .4byte 1071225241 -# ALL: .4byte 2576980378 -# ALL: .text +# ALL: .section .rodata,"a", at progbits +# ALL-NEXT: [[LABEL:\$tmp[0-9]+]]: +# ALL-NEXT: .p2align 3 +# ALL-NEXT: .8byte 4600877379321698714 +# ALL-NEXT: .text # O32-N32-NO-PIC: lui $1, %hi([[LABEL]]) # encoding: [A,A,0x01,0x3c] # O32-N32-NO-PIC: # fixup A - offset: 0, value: %hi([[LABEL]]), kind: fixup_Mips_HI16 # O32-N32-NO-PIC: addiu $1, $1, %lo([[LABEL]]) # encoding: [A,A,0x21,0x24] @@ -163,11 +163,11 @@ li.d $4, 1.5 # N32-N64: dsll $4, $4, 46 # encoding: [0xbc,0x23,0x04,0x00] li.d $4, 12345678910.12345678910 -# ALL: .section .rodata,"a", at progbits -# ALL: [[LABEL:\$tmp[0-9]+]]: -# ALL: .4byte 1107754720 -# ALL: .4byte 3790666967 -# ALL: .text +# ALL: .section .rodata,"a", at progbits +# ALL-NEXT: [[LABEL:\$tmp[0-9]+]]: +# ALL-NEXT: .p2align 3 +# ALL-NEXT: .8byte 4757770298180304087 +# ALL-NEXT: .text # O32-N32-NO-PIC: lui $1, %hi([[LABEL]]) # encoding: [A,A,0x01,0x3c] # O32-N32-NO-PIC: # fixup A - offset: 0, value: %hi([[LABEL]]), kind: fixup_Mips_HI16 # O32-N32-NO-PIC: addiu $1, $1, %lo([[LABEL]]) # encoding: [A,A,0x21,0x24] @@ -196,11 +196,11 @@ li.d $4, 12345678910.12345678910 li.d $4, 12345678910123456789.12345678910 -# ALL: .section .rodata,"a", at progbits -# ALL: [[LABEL:\$tmp[0-9]+]]: -# ALL: .4byte 1139108501 -# ALL: .4byte 836738583 -# ALL: .text +# ALL: .section .rodata,"a", at progbits +# ALL-NEXT: [[LABEL:\$tmp[0-9]+]]: +# ALL-NEXT: .p2align 3 +# ALL-NEXT: .8byte 4892433759227321879 +# ALL-NEXT: .text # O32-N32-NO-PIC: lui $1, %hi([[LABEL]]) # encoding: [A,A,0x01,0x3c] # O32-N32-NO-PIC: # fixup A - offset: 0, value: %hi([[LABEL]]), kind: fixup_Mips_HI16 # O32-N32-NO-PIC: addiu $1, $1, %lo([[LABEL]]) # encoding: [A,A,0x21,0x24] @@ -242,11 +242,11 @@ li.d $f4, 0.0 # N32-N64: dmtc1 
$zero, $f4 # encoding: [0x00,0x20,0xa0,0x44] li.d $f4, 1.12345 -# ALL: .section .rodata,"a", at progbits -# ALL: [[LABEL:\$tmp[0-9]+]]: -# ALL: .4byte 1072822694 -# ALL: .4byte 3037400872 -# ALL: .text +# ALL: .section .rodata,"a", at progbits +# ALL-NEXT: [[LABEL:\$tmp[0-9]+]]: +# ALL-NEXT: .p2align 3 +# ALL-NEXT: .8byte 4607738388174016296 +# ALL-NEXT: .text # O32-N32-PIC: lw $1, %got([[LABEL]])($gp) # encoding: [A,A,0x81,0x8f] # O32-N32-PIC: # fixup A - offset: 0, value: %got([[LABEL]]), kind: fixup_Mips_GOT # N64-PIC: ld $1, %got([[LABEL]])($gp) # encoding: [A,A,0x81,0xdf] @@ -287,11 +287,11 @@ li.d $f4, 1.0 # N32-N64: dmtc1 $1, $f4 # encoding: [0x00,0x20,0xa1,0x44] li.d $f4, 12345678910 -# ALL: .section .rodata,"a", at progbits -# ALL: [[LABEL:\$tmp[0-9]+]]: -# ALL: .4byte 1107754720 -# ALL: .4byte 3790602240 -# ALL: .text +# ALL: .section .rodata,"a", at progbits +# ALL-NEXT: [[LABEL:\$tmp[0-9]+]]: +# ALL-NEXT: .p2align 3 +# ALL-NEXT: .8byte 4757770298180239360 +# ALL-NEXT: .text # O32-N32-PIC: lw $1, %got([[LABEL]])($gp) # encoding: [A,A,0x81,0x8f] # O32-N32-PIC: # fixup A - offset: 0, value: %got([[LABEL]]), kind: fixup_Mips_GOT # N64-PIC: ld $1, %got([[LABEL]])($gp) # encoding: [A,A,0x81,0xdf] @@ -310,11 +310,11 @@ li.d $f4, 12345678910 # ALL: # fixup A - offset: 0, value: %lo([[LABEL]]), kind: fixup_Mips_LO16 li.d $f4, 12345678910.0 -# ALL: .section .rodata,"a", at progbits -# ALL: [[LABEL:\$tmp[0-9]+]]: -# ALL: .4byte 1107754720 -# ALL: .4byte 3790602240 -# ALL: .text +# ALL: .section .rodata,"a", at progbits +# ALL-NEXT: [[LABEL:\$tmp[0-9]+]]: +# ALL-NEXT: .p2align 3 +# ALL-NEXT: .8byte 4757770298180239360 +# ALL-NEXT: .text # O32-N32-PIC: lw $1, %got([[LABEL]])($gp) # encoding: [A,A,0x81,0x8f] # O32-N32-PIC: # fixup A - offset: 0, value: %got([[LABEL]]), kind: fixup_Mips_GOT # N64-PIC: ld $1, %got([[LABEL]])($gp) # encoding: [A,A,0x81,0xdf] @@ -333,11 +333,11 @@ li.d $f4, 12345678910.0 # ALL: # fixup A - offset: 0, value: %lo([[LABEL]]), kind: 
fixup_Mips_LO16 li.d $f4, 0.4 -# ALL: .section .rodata,"a", at progbits -# ALL: [[LABEL:\$tmp[0-9]+]]: -# ALL: .4byte 1071225241 -# ALL: .4byte 2576980378 -# ALL: .text +# ALL: .section .rodata,"a", at progbits +# ALL-NEXT: [[LABEL:\$tmp[0-9]+]]: +# ALL-NEXT: .p2align 3 +# ALL-NEXT: .8byte 4600877379321698714 +# ALL-NEXT: .text # O32-N32-PIC: lw $1, %got([[LABEL]])($gp) # encoding: [A,A,0x81,0x8f] # O32-N32-PIC: # fixup A - offset: 0, value: %got([[LABEL]]), kind: fixup_Mips_GOT # N64-PIC: ld $1, %got([[LABEL]])($gp) # encoding: [A,A,0x81,0xdf] @@ -378,11 +378,11 @@ li.d $f4, 2.5 # N32-N64: dmtc1 $1, $f4 # encoding: [0x00,0x20,0xa1,0x44] li.d $f4, 2.515625 -# ALL: .section .rodata,"a", at progbits -# ALL: [[LABEL:\$tmp[0-9]+]]: -# ALL: .4byte 1074012160 -# ALL: .4byte 0 -# ALL: .text +# ALL: .section .rodata,"a", at progbits +# ALL-NEXT: [[LABEL:\$tmp[0-9]+]]: +# ALL-NEXT: .p2align 3 +# ALL-NEXT: .8byte 4612847102706319360 +# ALL-NEXT: .text # O32-N32-PIC: lw $1, %got([[LABEL]])($gp) # encoding: [A,A,0x81,0x8f] # O32-N32-PIC: # fixup A - offset: 0, value: %got([[LABEL]]), kind: fixup_Mips_GOT # N64-PIC: ld $1, %got([[LABEL]])($gp) # encoding: [A,A,0x81,0xdf] @@ -401,11 +401,11 @@ li.d $f4, 2.515625 # ALL: # fixup A - offset: 0, value: %lo([[LABEL]]), kind: fixup_Mips_LO16 li.d $f4, 12345678910.12345678910 -# ALL: .section .rodata,"a", at progbits -# ALL: [[LABEL:\$tmp[0-9]+]]: -# ALL: .4byte 1107754720 -# ALL: .4byte 3790666967 -# ALL: .text +# ALL: .section .rodata,"a", at progbits +# ALL-NEXT: [[LABEL:\$tmp[0-9]+]]: +# ALL-NEXT: .p2align 3 +# ALL-NEXT: .8byte 4757770298180304087 +# ALL-NEXT: .text # O32-N32-PIC: lw $1, %got([[LABEL]])($gp) # encoding: [A,A,0x81,0x8f] # O32-N32-PIC: # fixup A - offset: 0, value: %got([[LABEL]]), kind: fixup_Mips_GOT # N64-PIC: ld $1, %got([[LABEL]])($gp) # encoding: [A,A,0x81,0xdf] @@ -424,11 +424,11 @@ li.d $f4, 12345678910.12345678910 # ALL: # fixup A - offset: 0, value: %lo([[LABEL]]), kind: fixup_Mips_LO16 li.d $f4, 
12345678910123456789.12345678910 -# ALL: .section .rodata,"a", at progbits -# ALL: [[LABEL:\$tmp[0-9]+]]: -# ALL: .4byte 1139108501 -# ALL: .4byte 836738583 -# ALL: .text +# ALL: .section .rodata,"a", at progbits +# ALL-NEXT: [[LABEL:\$tmp[0-9]+]]: +# ALL-NEXT: .p2align 3 +# ALL-NEXT: .8byte 4892433759227321879 +# ALL-NEXT: .text # O32-N32-PIC: lw $1, %got([[LABEL]])($gp) # encoding: [A,A,0x81,0x8f] # O32-N32-PIC: # fixup A - offset: 0, value: %got([[LABEL]]), kind: fixup_Mips_GOT # N64-PIC: ld $1, %got([[LABEL]])($gp) # encoding: [A,A,0x81,0xdf] From llvm-commits at lists.llvm.org Fri Oct 11 14:51:39 2019 From: llvm-commits at lists.llvm.org (Simon Atanasyan via llvm-commits) Date: Fri, 11 Oct 2019 21:51:39 -0000 Subject: [llvm] r374599 - [mips] Remove unused local variables. NFC Message-ID: <20191011215139.A6807932F2@lists.llvm.org> Author: atanasyan Date: Fri Oct 11 14:51:39 2019 New Revision: 374599 URL: http://llvm.org/viewvc/llvm-project?rev=374599&view=rev Log: [mips] Remove unused local variables. 
NFC Modified: llvm/trunk/lib/Target/Mips/AsmParser/MipsAsmParser.cpp Modified: llvm/trunk/lib/Target/Mips/AsmParser/MipsAsmParser.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/Mips/AsmParser/MipsAsmParser.cpp?rev=374599&r1=374598&r2=374599&view=diff ============================================================================== --- llvm/trunk/lib/Target/Mips/AsmParser/MipsAsmParser.cpp (original) +++ llvm/trunk/lib/Target/Mips/AsmParser/MipsAsmParser.cpp Fri Oct 11 14:51:39 2019 @@ -3322,9 +3322,7 @@ bool MipsAsmParser::expandLoadSingleImmT unsigned FirstReg = Inst.getOperand(0).getReg(); uint64_t ImmOp64 = Inst.getOperand(1).getImm(); - ImmOp64 = convertIntToDoubleImm(ImmOp64); - - uint32_t ImmOp32 = covertDoubleImmToSingleImm(ImmOp64); + uint32_t ImmOp32 = covertDoubleImmToSingleImm(convertIntToDoubleImm(ImmOp64)); return loadImmediate(ImmOp32, FirstReg, Mips::NoRegister, true, true, IDLoc, Out, STI); @@ -3397,20 +3395,13 @@ bool MipsAsmParser::expandLoadDoubleImmT ImmOp64 = convertIntToDoubleImm(ImmOp64); - uint32_t LoImmOp64 = Lo_32(ImmOp64); - uint32_t HiImmOp64 = Hi_32(ImmOp64); - - unsigned TmpReg = getATReg(IDLoc); - if (!TmpReg) - return true; - - if (LoImmOp64 == 0) { + if (Lo_32(ImmOp64) == 0) { if (isABI_N32() || isABI_N64()) { if (loadImmediate(ImmOp64, FirstReg, Mips::NoRegister, false, true, IDLoc, Out, STI)) return true; } else { - if (loadImmediate(HiImmOp64, FirstReg, Mips::NoRegister, true, true, + if (loadImmediate(Hi_32(ImmOp64), FirstReg, Mips::NoRegister, true, true, IDLoc, Out, STI)) return true; @@ -3437,6 +3428,10 @@ bool MipsAsmParser::expandLoadDoubleImmT getStreamer().EmitIntValue(ImmOp64, 8); getStreamer().SwitchSection(CS); + unsigned TmpReg = getATReg(IDLoc); + if (!TmpReg) + return true; + if (emitPartialAddress(TOut, IDLoc, Sym)) return true; @@ -3469,9 +3464,6 @@ bool MipsAsmParser::expandLoadDoubleImmT ImmOp64 = convertIntToDoubleImm(ImmOp64); - uint32_t LoImmOp64 = Lo_32(ImmOp64); - uint32_t HiImmOp64 = 
Hi_32(ImmOp64); - unsigned TmpReg = Mips::ZERO; if (ImmOp64 != 0) { TmpReg = getATReg(IDLoc); @@ -3479,8 +3471,8 @@ bool MipsAsmParser::expandLoadDoubleImmT return true; } - if ((LoImmOp64 == 0) && - !((HiImmOp64 & 0xffff0000) && (HiImmOp64 & 0x0000ffff))) { + if ((Lo_32(ImmOp64) == 0) && + !((Hi_32(ImmOp64) & 0xffff0000) && (Hi_32(ImmOp64) & 0x0000ffff))) { if (isABI_N32() || isABI_N64()) { if (TmpReg != Mips::ZERO && loadImmediate(ImmOp64, TmpReg, Mips::NoRegister, false, false, IDLoc, @@ -3491,8 +3483,8 @@ bool MipsAsmParser::expandLoadDoubleImmT } if (TmpReg != Mips::ZERO && - loadImmediate(HiImmOp64, TmpReg, Mips::NoRegister, true, false, IDLoc, - Out, STI)) + loadImmediate(Hi_32(ImmOp64), TmpReg, Mips::NoRegister, true, false, + IDLoc, Out, STI)) return true; if (hasMips32r2()) { From llvm-commits at lists.llvm.org Fri Oct 11 14:50:10 2019 From: llvm-commits at lists.llvm.org (Sam Clegg via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 21:50:10 +0000 (UTC) Subject: [PATCH] D68889: [WebAssembly] Allow multivalue types in block signature operands In-Reply-To: References: Message-ID: <206f33224e9b6c874742407dfb784507@localhost.localdomain> sbc100 added inline comments. ================ Comment at: llvm/tools/llvm-mc/llvm-mc.cpp:520 if (disassemble) - Res = Disassembler::disassemble(*TheTarget, TripleName, *STI, *Str, - *Buffer, SrcMgr, Out->os()); ---------------- This change to remove the context creation seems separate. If so, can you split it out? That way this change can stay wasm specific. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68889/new/ https://reviews.llvm.org/D68889 From llvm-commits at lists.llvm.org Fri Oct 11 14:50:11 2019 From: llvm-commits at lists.llvm.org (Julian Lettner via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 21:50:11 +0000 (UTC) Subject: [PATCH] D68836: [lit] Small cleanups in main.py In-Reply-To: References: Message-ID: yln marked an inline comment as done. 
yln added inline comments. ================ Comment at: llvm/utils/lit/lit/main.py:32 + import tempfile lit_tmp = tempfile.mkdtemp(prefix="lit_tmp_") os.environ.update({ ---------------- rnk wrote: > Unrelated, but I wonder if we should augment this logic to garbage collect old `lit_tmp_` directories that are 24+ hours old. I routinely find lots of leaked lit_tmp_ directories because oftentimes the parent Python process is killed before it gets to the finally block below. One of the improvements for lit that I want to explore is graceful shutdown on `CTRL+C`. If that isn't possible, we should implement your suggestion. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68836/new/ https://reviews.llvm.org/D68836 From llvm-commits at lists.llvm.org Fri Oct 11 14:50:11 2019 From: llvm-commits at lists.llvm.org (David Blaikie via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 21:50:11 +0000 (UTC) Subject: [PATCH] D68620: DebugInfo: Use base address selection entries for debug_loc In-Reply-To: References: Message-ID: <2147017675ba497c2ae36d01d848ecff@localhost.localdomain> dblaikie marked an inline comment as done. dblaikie added a comment. In D68620#1699420 , @labath wrote: > LLDB seems to have support for base address selection in v4 debug_loc. It does not have support for v5 LLE_base_address(x) stuff, but the whole of v5 location list support is kind of wonky, which also is why I am looking at getting it to use the llvm version of the parser. Yeah, that sort of summarizes GDB's support too. > As for llvm-dwarfdump, feel free to add new encodings there. My plan is to add support for all LLE encodings, but since I also need to figure out a way to refactor all of that stuff, it may take a while before I get to that. Having one or two new encodings appear in the mean time should only be a minor nuisance. 
Had to take a few goes at this to see if there was a good mid-point of refactoring & think I found one that coalesces some of the codepaths for verbose, non-verbose, and inline dumping - insofar as seemed reasonable, I tried to make things more similar to debug_rnglists (in several cases just at least making the code look similar, even though it's not shared yet). ================ Comment at: lib/CodeGen/AsmPrinter/DwarfDebug.cpp:2328 + BaseIsSet = true; + if (UseDwarf5) { + Asm->OutStreamer->AddComment(StringifyEnum(BaseAddressx)); ---------------- probinson wrote: > Would it be more readable this way? > ``` > if (!UseDwarf5) { > Base = NewBase; > BaseIsSet = true; > Asm->OutStreamer->EmitIntValue(-1, Size); > // etc > } else if (NewBase != Begin || P.second.size() > 1) { > Base = NewBase; > BaseIsSet = true; > Asm->OutStreamer->AddComment(StringifyEnum(BaseAddressx)); > // etc > } > ``` > As there are only 2 lines in common. (My eye caught `if (!UseDwarf5)` and two lines later `if (UseDwarf5)` and did a double-take.) Sure, looks good to me! I know this whole function's got several cases to think about & is a bit unwieldy. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68620/new/ https://reviews.llvm.org/D68620 From llvm-commits at lists.llvm.org Fri Oct 11 14:52:41 2019 From: llvm-commits at lists.llvm.org (David Blaikie via llvm-commits) Date: Fri, 11 Oct 2019 21:52:41 -0000 Subject: [llvm] r374600 - DebugInfo: Use base address selection entries for debug_loc Message-ID: <20191011215241.E677993351@lists.llvm.org> Author: dblaikie Date: Fri Oct 11 14:52:41 2019 New Revision: 374600 URL: http://llvm.org/viewvc/llvm-project?rev=374600&view=rev Log: DebugInfo: Use base address selection entries for debug_loc Unify the range and loc emission (for both DWARFv4 and DWARFv5 style lists) and take advantage of that unification to use strategic base addresses for loclists. 
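For readers who want to see the DWARFv5 side of that choice in isolation, here is a rough Python model of it. It is not the C++ from this commit, only a sketch under stated assumptions: the compile unit has no base address of its own, ranges are plain (section, begin, end) tuples with byte offsets, offset 0 stands for the section's label (assumed to be in the address pool already), and the function name and the ".text.f1" section name are hypothetical. It only reflects the rule that a base address entry is emitted when the first range does not begin at the section label, or when more than one range in the section can share the base.

```python
from collections import OrderedDict


def model_v5_entries(ranges):
    """Model which DWARFv5 list entries would be chosen for one list.

    `ranges` is a list of (section, begin, end) tuples; begin and end are
    byte offsets from the start of `section`. Returns a list of tuples
    naming the entries, ending with ("end_of_list",).
    """
    # Group ranges by section, keeping first-seen order.
    by_section = OrderedDict()
    for section, begin, end in ranges:
        by_section.setdefault(section, []).append((begin, end))

    entries = []
    for section, spans in by_section.items():
        first_begin = spans[0][0]
        # A base address pays off if the first range does not start at the
        # section label (offset 0 in this model), or if several ranges in
        # the section can share it.
        use_base = first_begin != 0 or len(spans) > 1
        if use_base:
            entries.append(("base_addressx", section))
        for begin, end in spans:
            if use_base:
                entries.append(("offset_pair", begin, end))
            else:
                # Single range starting at the section label.
                entries.append(("startx_length", section, end - begin))
    entries.append(("end_of_list",))
    return entries


# Hypothetical inputs shaped like the three lists checked in the updated
# debug-loclists.ll test (variables y, x and r).
for name, ranges in [
    ("y", [(".text.f1", 0, 3), (".text.f1", 3, 4)]),
    ("x", [(".text.f1", 0, 3)]),
    ("r", [(".text.f1", 3, 4)]),
]:
    print(name, model_v5_entries(ranges))
```

With these inputs the model picks a base address plus offset pairs for y and r, and a single startx_length entry for x, which is the same shape as the CHECK lines in the test diff further down.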
Differential Revision: https://reviews.llvm.org/D68620 Modified: llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.cpp llvm/trunk/test/CodeGen/X86/debug-loclists.ll llvm/trunk/test/DebugInfo/X86/sret.ll Modified: llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.cpp?rev=374600&r1=374599&r2=374600&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.cpp (original) +++ llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.cpp Fri Oct 11 14:52:41 2019 @@ -2293,14 +2293,121 @@ static MCSymbol *emitLoclistsTableHeader return TableEnd; } +template <typename Ranges, typename PayloadEmitter> +static void emitRangeList( + DwarfDebug &DD, AsmPrinter *Asm, MCSymbol *Sym, const Ranges &R, + const DwarfCompileUnit &CU, unsigned BaseAddressx, unsigned OffsetPair, + unsigned StartxLength, unsigned EndOfList, + StringRef (*StringifyEnum)(unsigned), + bool ShouldUseBaseAddress, + PayloadEmitter EmitPayload) { + + auto Size = Asm->MAI->getCodePointerSize(); + bool UseDwarf5 = DD.getDwarfVersion() >= 5; + + // Emit our symbol so we can find the beginning of the range. + Asm->OutStreamer->EmitLabel(Sym); + + // Gather all the ranges that apply to the same section so they can share + // a base address entry. 
+ MapVector> SectionRanges; + + for (const auto &Range : R) + SectionRanges[&Range.Begin->getSection()].push_back(&Range); + + const MCSymbol *CUBase = CU.getBaseAddress(); + bool BaseIsSet = false; + for (const auto &P : SectionRanges) { + auto *Base = CUBase; + if (!Base && ShouldUseBaseAddress) { + const MCSymbol *Begin = P.second.front()->Begin; + const MCSymbol *NewBase = DD.getSectionLabel(&Begin->getSection()); + if (!UseDwarf5) { + Base = NewBase; + BaseIsSet = true; + Asm->OutStreamer->EmitIntValue(-1, Size); + Asm->OutStreamer->AddComment(" base address"); + Asm->OutStreamer->EmitSymbolValue(Base, Size); + } else if (NewBase != Begin || P.second.size() > 1) { + // Only use a base address if + // * the existing pool address doesn't match (NewBase != Begin) + // * or, there's more than one entry to share the base address + Base = NewBase; + BaseIsSet = true; + Asm->OutStreamer->AddComment(StringifyEnum(BaseAddressx)); + Asm->emitInt8(BaseAddressx); + Asm->OutStreamer->AddComment(" base address index"); + Asm->EmitULEB128(DD.getAddressPool().getIndex(Base)); + } + } else if (BaseIsSet && !UseDwarf5) { + BaseIsSet = false; + assert(!Base); + Asm->OutStreamer->EmitIntValue(-1, Size); + Asm->OutStreamer->EmitIntValue(0, Size); + } + + for (const auto *RS : P.second) { + const MCSymbol *Begin = RS->Begin; + const MCSymbol *End = RS->End; + assert(Begin && "Range without a begin symbol?"); + assert(End && "Range without an end symbol?"); + if (Base) { + if (UseDwarf5) { + // Emit offset_pair when we have a base. 
+ Asm->OutStreamer->AddComment(StringifyEnum(OffsetPair)); + Asm->emitInt8(OffsetPair); + Asm->OutStreamer->AddComment(" starting offset"); + Asm->EmitLabelDifferenceAsULEB128(Begin, Base); + Asm->OutStreamer->AddComment(" ending offset"); + Asm->EmitLabelDifferenceAsULEB128(End, Base); + } else { + Asm->EmitLabelDifference(Begin, Base, Size); + Asm->EmitLabelDifference(End, Base, Size); + } + } else if (UseDwarf5) { + Asm->OutStreamer->AddComment(StringifyEnum(StartxLength)); + Asm->emitInt8(StartxLength); + Asm->OutStreamer->AddComment(" start index"); + Asm->EmitULEB128(DD.getAddressPool().getIndex(Begin)); + Asm->OutStreamer->AddComment(" length"); + Asm->EmitLabelDifferenceAsULEB128(End, Begin); + } else { + Asm->OutStreamer->EmitSymbolValue(Begin, Size); + Asm->OutStreamer->EmitSymbolValue(End, Size); + } + EmitPayload(*RS); + } + } + + if (UseDwarf5) { + Asm->OutStreamer->AddComment(StringifyEnum(EndOfList)); + Asm->emitInt8(EndOfList); + } else { + // Terminate the list with two 0 values. + Asm->OutStreamer->EmitIntValue(0, Size); + Asm->OutStreamer->EmitIntValue(0, Size); + } +} + +static void emitLocList(DwarfDebug &DD, AsmPrinter *Asm, const DebugLocStream::List &List) { + emitRangeList( + DD, Asm, List.Label, DD.getDebugLocs().getEntries(List), *List.CU, + dwarf::DW_LLE_base_addressx, dwarf::DW_LLE_offset_pair, + dwarf::DW_LLE_startx_length, dwarf::DW_LLE_end_of_list, + llvm::dwarf::LocListEncodingString, + /* ShouldUseBaseAddress */ true, + [&](const DebugLocStream::Entry &E) { + DD.emitDebugLocEntryLocation(E, List.CU); + }); +} + // Emit locations into the .debug_loc/.debug_rnglists section. void DwarfDebug::emitDebugLoc() { if (DebugLocs.getLists().empty()) return; - bool IsLocLists = getDwarfVersion() >= 5; MCSymbol *TableEnd = nullptr; - if (IsLocLists) { + if (getDwarfVersion() >= 5) { Asm->OutStreamer->SwitchSection( Asm->getObjFileLowering().getDwarfLoclistsSection()); TableEnd = emitLoclistsTableHeader(Asm, useSplitDwarf() ? 
SkeletonHolder @@ -2310,63 +2417,8 @@ void DwarfDebug::emitDebugLoc() { Asm->getObjFileLowering().getDwarfLocSection()); } - unsigned char Size = Asm->MAI->getCodePointerSize(); - for (const auto &List : DebugLocs.getLists()) { - Asm->OutStreamer->EmitLabel(List.Label); - - const DwarfCompileUnit *CU = List.CU; - const MCSymbol *Base = CU->getBaseAddress(); - for (const auto &Entry : DebugLocs.getEntries(List)) { - if (Base) { - // Set up the range. This range is relative to the entry point of the - // compile unit. This is a hard coded 0 for low_pc when we're emitting - // ranges, or the DW_AT_low_pc on the compile unit otherwise. - if (IsLocLists) { - Asm->OutStreamer->AddComment("DW_LLE_offset_pair"); - Asm->OutStreamer->EmitIntValue(dwarf::DW_LLE_offset_pair, 1); - Asm->OutStreamer->AddComment(" starting offset"); - Asm->EmitLabelDifferenceAsULEB128(Entry.Begin, Base); - Asm->OutStreamer->AddComment(" ending offset"); - Asm->EmitLabelDifferenceAsULEB128(Entry.End, Base); - } else { - Asm->EmitLabelDifference(Entry.Begin, Base, Size); - Asm->EmitLabelDifference(Entry.End, Base, Size); - } - - emitDebugLocEntryLocation(Entry, CU); - continue; - } - - // We have no base address. - if (IsLocLists) { - // TODO: Use DW_LLE_base_addressx + DW_LLE_offset_pair, or - // DW_LLE_startx_length in case if there is only a single range. - // That should reduce the size of the debug data emited. - // For now just use the DW_LLE_startx_length for all cases. 
- Asm->OutStreamer->AddComment("DW_LLE_startx_length"); - Asm->emitInt8(dwarf::DW_LLE_startx_length); - Asm->OutStreamer->AddComment(" start idx"); - Asm->EmitULEB128(AddrPool.getIndex(Entry.Begin)); - Asm->OutStreamer->AddComment(" length"); - Asm->EmitLabelDifferenceAsULEB128(Entry.End, Entry.Begin); - } else { - Asm->OutStreamer->EmitSymbolValue(Entry.Begin, Size); - Asm->OutStreamer->EmitSymbolValue(Entry.End, Size); - } - - emitDebugLocEntryLocation(Entry, CU); - } - - if (IsLocLists) { - // .debug_loclists section ends with DW_LLE_end_of_list. - Asm->OutStreamer->AddComment("DW_LLE_end_of_list"); - Asm->OutStreamer->EmitIntValue(dwarf::DW_LLE_end_of_list, 1); - } else { - // Terminate the .debug_loc list with two 0 values. - Asm->OutStreamer->EmitIntValue(0, Size); - Asm->OutStreamer->EmitIntValue(0, Size); - } - } + for (const auto &List : DebugLocs.getLists()) + emitLocList(*this, Asm, List); if (TableEnd) Asm->OutStreamer->EmitLabel(TableEnd); @@ -2556,103 +2608,16 @@ void DwarfDebug::emitDebugARanges() { } } -template -static void emitRangeList(DwarfDebug &DD, AsmPrinter *Asm, MCSymbol *Sym, - const Ranges &R, const DwarfCompileUnit &CU, - unsigned BaseAddressx, unsigned OffsetPair, - unsigned StartxLength, unsigned EndOfList, - StringRef (*StringifyEnum)(unsigned)) { - auto DwarfVersion = DD.getDwarfVersion(); - // Emit our symbol so we can find the beginning of the range. - Asm->OutStreamer->EmitLabel(Sym); - // Gather all the ranges that apply to the same section so they can share - // a base address entry. - MapVector> SectionRanges; - // Size for our labels. 
- auto Size = Asm->MAI->getCodePointerSize(); - - for (const RangeSpan &Range : R) - SectionRanges[&Range.Begin->getSection()].push_back(&Range); - - const MCSymbol *CUBase = CU.getBaseAddress(); - bool BaseIsSet = false; - for (const auto &P : SectionRanges) { - // Don't bother with a base address entry if there's only one range in - // this section in this range list - for example ranges for a CU will - // usually consist of single regions from each of many sections - // (-ffunction-sections, or just C++ inline functions) except under LTO - // or optnone where there may be holes in a single CU's section - // contributions. - auto *Base = CUBase; - if (!Base && (P.second.size() > 1 || DwarfVersion < 5) && - (CU.getCUNode()->getRangesBaseAddress() || DwarfVersion >= 5)) { - BaseIsSet = true; - Base = DD.getSectionLabel(&P.second.front()->Begin->getSection()); - if (DwarfVersion >= 5) { - Asm->OutStreamer->AddComment(StringifyEnum(BaseAddressx)); - Asm->OutStreamer->EmitIntValue(BaseAddressx, 1); - Asm->OutStreamer->AddComment(" base address index"); - Asm->EmitULEB128(DD.getAddressPool().getIndex(Base)); - } else { - Asm->OutStreamer->EmitIntValue(-1, Size); - Asm->OutStreamer->AddComment(" base address"); - Asm->OutStreamer->EmitSymbolValue(Base, Size); - } - } else if (BaseIsSet && DwarfVersion < 5) { - BaseIsSet = false; - assert(!Base); - Asm->OutStreamer->EmitIntValue(-1, Size); - Asm->OutStreamer->EmitIntValue(0, Size); - } - - for (const auto *RS : P.second) { - const MCSymbol *Begin = RS->Begin; - const MCSymbol *End = RS->End; - assert(Begin && "Range without a begin symbol?"); - assert(End && "Range without an end symbol?"); - if (Base) { - if (DwarfVersion >= 5) { - // Emit DW_RLE_offset_pair when we have a base. 
- Asm->OutStreamer->AddComment(StringifyEnum(OffsetPair)); - Asm->emitInt8(OffsetPair); - Asm->OutStreamer->AddComment(" starting offset"); - Asm->EmitLabelDifferenceAsULEB128(Begin, Base); - Asm->OutStreamer->AddComment(" ending offset"); - Asm->EmitLabelDifferenceAsULEB128(End, Base); - } else { - Asm->EmitLabelDifference(Begin, Base, Size); - Asm->EmitLabelDifference(End, Base, Size); - } - } else if (DwarfVersion >= 5) { - Asm->OutStreamer->AddComment(StringifyEnum(StartxLength)); - Asm->emitInt8(StartxLength); - Asm->OutStreamer->AddComment(" start index"); - Asm->EmitULEB128(DD.getAddressPool().getIndex(Begin)); - Asm->OutStreamer->AddComment(" length"); - Asm->EmitLabelDifferenceAsULEB128(End, Begin); - } else { - Asm->OutStreamer->EmitSymbolValue(Begin, Size); - Asm->OutStreamer->EmitSymbolValue(End, Size); - } - } - } - if (DwarfVersion >= 5) { - Asm->OutStreamer->AddComment(StringifyEnum(EndOfList)); - Asm->emitInt8(EndOfList); - } else { - // Terminate the list with two 0 values. - Asm->OutStreamer->EmitIntValue(0, Size); - Asm->OutStreamer->EmitIntValue(0, Size); - } -} - /// Emit a single range list. We handle both DWARF v5 and earlier. 
static void emitRangeList(DwarfDebug &DD, AsmPrinter *Asm, const RangeSpanList &List) { emitRangeList(DD, Asm, List.getSym(), List.getRanges(), List.getCU(), dwarf::DW_RLE_base_addressx, dwarf::DW_RLE_offset_pair, dwarf::DW_RLE_startx_length, dwarf::DW_RLE_end_of_list, - llvm::dwarf::RangeListEncodingString); + llvm::dwarf::RangeListEncodingString, + List.getCU().getCUNode()->getRangesBaseAddress() || + DD.getDwarfVersion() >= 5, + [](auto) {}); } static void emitDebugRangesImpl(DwarfDebug &DD, AsmPrinter *Asm, Modified: llvm/trunk/test/CodeGen/X86/debug-loclists.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/debug-loclists.ll?rev=374600&r1=374599&r2=374600&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/debug-loclists.ll (original) +++ llvm/trunk/test/CodeGen/X86/debug-loclists.ll Fri Oct 11 14:52:41 2019 @@ -1,144 +1,119 @@ -; RUN: llc -mtriple=x86_64-pc-linux -filetype=obj -o %t < %s -; RUN: llvm-dwarfdump -v %t | FileCheck %s +; RUN: llc -mtriple=x86_64-pc-linux -filetype=obj -function-sections -o %t < %s +; RUN: llvm-dwarfdump -v -debug-info -debug-loclists %t | FileCheck %s -; CHECK: 0x00000033: DW_TAG_formal_parameter [3] -; CHECK-NEXT: DW_AT_location [DW_FORM_sec_offset] (0x0000000c -; CHECK-NEXT: [0x0000000000000000, 0x0000000000000004): DW_OP_breg5 RDI+0 -; CHECK-NEXT: [0x0000000000000004, 0x0000000000000012): DW_OP_breg3 RBX+0) -; CHECK-NEXT: DW_AT_name [DW_FORM_strx1] (indexed (0000000e) string = "a") -; CHECK-NEXT: DW_AT_decl_file [DW_FORM_data1] ("/home/folder{{\\|\/}}test.cc") -; CHECK-NEXT: DW_AT_decl_line [DW_FORM_data1] (6) -; CHECK-NEXT: DW_AT_type [DW_FORM_ref4] (cu + 0x0040 => {0x00000040} "A") +; CHECK: DW_TAG_variable +; FIXME: Use DW_FORM_loclistx to reduce relocations +; CHECK-NEXT: DW_AT_location [DW_FORM_sec_offset] (0x0000000c +; CHECK-NEXT: [0x0000000000000000, 0x0000000000000003): DW_OP_consts +3, DW_OP_stack_value +; CHECK-NEXT: 
[0x0000000000000003, 0x0000000000000004): DW_OP_consts +4, DW_OP_stack_value) +; CHECK-NEXT: DW_AT_name {{.*}} "y" + +; CHECK: DW_TAG_variable +; FIXME: Use DW_FORM_loclistx to reduce relocations +; CHECK-NEXT: DW_AT_location [DW_FORM_sec_offset] (0x0000001d +; CHECK-NEXT: Addr idx 0 (w/ length 3): DW_OP_consts +5, DW_OP_stack_value) +; CHECK-NEXT: DW_AT_name {{.*}} "x" + +; CHECK: DW_TAG_variable +; FIXME: Use DW_FORM_loclistx to reduce relocations +; CHECK-NEXT: DW_AT_location [DW_FORM_sec_offset] (0x00000025 +; CHECK-NEXT: [0x0000000000000003, 0x0000000000000004): DW_OP_reg0 RAX) +; CHECK-NEXT: DW_AT_name {{.*}} "r" ; CHECK: .debug_loclists contents: -; CHECK-NEXT: 0x00000000: locations list header: length = 0x00000015, version = 0x0005, addr_size = 0x08, seg_size = 0x00, offset_entry_count = 0x00000000 -; CHECK-NEXT: 0x0000000c: -; CHECK-NEXT: DW_LLE_offset_pair(0x0000000000000000, 0x0000000000000004) -; CHECK-NEXT: => [0x0000000000000000, 0x0000000000000004): DW_OP_breg5 RDI+0 -; CHECK-NEXT: DW_LLE_offset_pair(0x0000000000000004, 0x0000000000000012) -; CHECK-NEXT: => [0x0000000000000004, 0x0000000000000012): DW_OP_breg3 RBX+0 - -; There is no way to use llvm-dwarfdump atm (2018, october) to verify the DW_LLE_* codes emited, -; because dumper is not yet implements that. Use asm code to do this check instead. 
-; -; RUN: llc -mtriple=x86_64-pc-linux -filetype=asm < %s -o - | FileCheck %s --check-prefix=ASM -; ASM: .section .debug_loclists,"",@progbits -; ASM-NEXT: .long .Ldebug_loclist_table_end0-.Ldebug_loclist_table_start0 # Length -; ASM-NEXT: .Ldebug_loclist_table_start0: -; ASM-NEXT: .short 5 # Version -; ASM-NEXT: .byte 8 # Address size -; ASM-NEXT: .byte 0 # Segment selector size -; ASM-NEXT: .long 0 # Offset entry count -; ASM-NEXT: .Lloclists_table_base0: -; ASM-NEXT: .Ldebug_loc0: -; ASM-NEXT: .byte 4 # DW_LLE_offset_pair -; ASM-NEXT: .uleb128 .Lfunc_begin0-.Lfunc_begin0 # starting offset -; ASM-NEXT: .uleb128 .Ltmp0-.Lfunc_begin0 # ending offset -; ASM-NEXT: .byte 2 # Loc expr size -; ASM-NEXT: .byte 117 # DW_OP_breg5 -; ASM-NEXT: .byte 0 # 0 -; ASM-NEXT: .byte 4 # DW_LLE_offset_pair -; ASM-NEXT: .uleb128 .Ltmp0-.Lfunc_begin0 # starting offset -; ASM-NEXT: .uleb128 .Ltmp1-.Lfunc_begin0 # ending offset -; ASM-NEXT: .byte 2 # Loc expr size -; ASM-NEXT: .byte 115 # DW_OP_breg3 -; ASM-NEXT: .byte 0 # 0 -; ASM-NEXT: .byte 0 # DW_LLE_end_of_list -; ASM-NEXT: .Ldebug_loclist_table_end0: - -; ModuleID = 'test.cc' -source_filename = "test.cc" -target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128" -target triple = "x86_64-unknown-linux-gnu" - -%struct.A = type { i32 (...)** } - -@_ZTV1A = dso_local unnamed_addr constant { [4 x i8*] } { [4 x i8*] [i8* null, i8* bitcast ({ i8*, i8* }* @_ZTI1A to i8*), i8* bitcast (void (%struct.A*)* @_ZN1A3fooEv to i8*), i8* bitcast (void (%struct.A*)* @_ZN1A3barEv to i8*)] }, align 8 -@_ZTVN10__cxxabiv117__class_type_infoE = external dso_local global i8* -@_ZTS1A = dso_local constant [3 x i8] c"1A\00", align 1 -@_ZTI1A = dso_local constant { i8*, i8* } { i8* bitcast (i8** getelementptr inbounds (i8*, i8** @_ZTVN10__cxxabiv117__class_type_infoE, i64 2) to i8*), i8* getelementptr inbounds ([3 x i8], [3 x i8]* @_ZTS1A, i32 0, i32 0) }, align 8 - -; Function Attrs: noinline optnone uwtable -define dso_local void 
@_Z3baz1A(%struct.A* %a) #0 !dbg !7 { -entry: - call void @llvm.dbg.declare(metadata %struct.A* %a, metadata !23, metadata !DIExpression()), !dbg !24 - call void @_ZN1A3fooEv(%struct.A* %a), !dbg !25 - call void @_ZN1A3barEv(%struct.A* %a), !dbg !26 - ret void, !dbg !27 -} +; CHECK-NEXT: 0x00000000: locations list header: length = 0x00000029, version = 0x0005, addr_size = 0x08, seg_size = 0x00, offset_entry_count = 0x00000000 -; Function Attrs: nounwind readnone speculatable -declare void @llvm.dbg.declare(metadata, metadata, metadata) #1 +; Don't use startx_length if there's more than one entry, because the shared +; base address will be useful for both the range that does start at the start of +; the function, and the one that doesn't. -; Function Attrs: noinline nounwind optnone uwtable -define dso_local void @_ZN1A3fooEv(%struct.A* %this) unnamed_addr #2 align 2 !dbg !28 { -entry: - %this.addr = alloca %struct.A*, align 8 - store %struct.A* %this, %struct.A** %this.addr, align 8 - call void @llvm.dbg.declare(metadata %struct.A** %this.addr, metadata !29, metadata !DIExpression()), !dbg !31 - %this1 = load %struct.A*, %struct.A** %this.addr, align 8 - ret void, !dbg !32 -} +; CHECK-NEXT: 0x0000000c: +; CHECK-NEXT: DW_LLE_base_addressx(0x0000000000000000) +; CHECK-NEXT: DW_LLE_offset_pair (0x0000000000000000, 0x0000000000000003) +; CHECK-NEXT: => [0x0000000000000000, 0x0000000000000003): DW_OP_consts +3, DW_OP_stack_value +; CHECK-NEXT: DW_LLE_offset_pair (0x0000000000000003, 0x0000000000000004) +; CHECK-NEXT: => [0x0000000000000003, 0x0000000000000004): DW_OP_consts +4, DW_OP_stack_value +; CHECK-NEXT: DW_LLE_end_of_list () + +; Show that startx_length can be used when the address range starts at the start of the function. 
+ +; CHECK: 0x0000001d: +; CHECK-NEXT: DW_LLE_startx_length(0x0000000000000000, 0x0000000000000003) +; CHECK-NEXT: => Addr idx 0 (w/ length 3): DW_OP_consts +5, DW_OP_stack_value +; CHECK-NEXT: DW_LLE_end_of_list () + +; And use a base address when the range doesn't start at an existing/useful +; address in the pool. + +; CHECK: 0x00000025: +; CHECK-NEXT: DW_LLE_base_addressx(0x0000000000000000) +; CHECK-NEXT: DW_LLE_offset_pair (0x0000000000000003, 0x0000000000000004) +; CHECK-NEXT: => [0x0000000000000003, 0x0000000000000004): DW_OP_reg0 RAX +; CHECK-NEXT: DW_LLE_end_of_list () + +; Built with clang -O3 -ffunction-sections from source: +; +; int f1(int i, int j) { +; int x = 5; +; int y = 3; +; int r = i + j; +; int undef; +; x = undef; +; y = 4; +; return r; +; } +; void f2() { +; } -; Function Attrs: noinline nounwind optnone uwtable -define dso_local void @_ZN1A3barEv(%struct.A* %this) unnamed_addr #2 align 2 !dbg !33 { +; Function Attrs: norecurse nounwind readnone uwtable +define dso_local i32 @_Z2f1ii(i32 %i, i32 %j) local_unnamed_addr !dbg !7 { entry: - %this.addr = alloca %struct.A*, align 8 - store %struct.A* %this, %struct.A** %this.addr, align 8 - call void @llvm.dbg.declare(metadata %struct.A** %this.addr, metadata !34, metadata !DIExpression()), !dbg !35 - %this1 = load %struct.A*, %struct.A** %this.addr, align 8 - ret void, !dbg !36 + call void @llvm.dbg.value(metadata i32 %i, metadata !12, metadata !DIExpression()), !dbg !18 + call void @llvm.dbg.value(metadata i32 %j, metadata !13, metadata !DIExpression()), !dbg !18 + call void @llvm.dbg.value(metadata i32 5, metadata !14, metadata !DIExpression()), !dbg !18 + call void @llvm.dbg.value(metadata i32 3, metadata !15, metadata !DIExpression()), !dbg !18 + %add = add nsw i32 %j, %i, !dbg !19 + call void @llvm.dbg.value(metadata i32 %add, metadata !16, metadata !DIExpression()), !dbg !18 + call void @llvm.dbg.value(metadata i32 undef, metadata !14, metadata !DIExpression()), !dbg !18 + call void 
@llvm.dbg.value(metadata i32 4, metadata !15, metadata !DIExpression()), !dbg !18 + ret i32 %add, !dbg !20 } -; Function Attrs: noinline norecurse nounwind optnone uwtable -define dso_local i32 @main() #3 !dbg !37 { +; Function Attrs: norecurse nounwind readnone uwtable +define dso_local void @_Z2f2v() local_unnamed_addr !dbg !21 { entry: - %retval = alloca i32, align 4 - store i32 0, i32* %retval, align 4 - ret i32 0, !dbg !38 + ret void, !dbg !24 } +; Function Attrs: nounwind readnone speculatable willreturn +declare void @llvm.dbg.value(metadata, metadata, metadata) !llvm.dbg.cu = !{!0} !llvm.module.flags = !{!3, !4, !5} !llvm.ident = !{!6} -!0 = distinct !DICompileUnit(language: DW_LANG_C_plus_plus, file: !1, producer: "clang version 8.0.0 (trunk 344035)", isOptimized: false, runtimeVersion: 0, emissionKind: FullDebug, enums: !2, nameTableKind: None) -!1 = !DIFile(filename: "test.cc", directory: "/home/folder", checksumkind: CSK_MD5, checksum: "e0f357ad6dcb791a774a0dae55baf5e7") +!0 = distinct !DICompileUnit(language: DW_LANG_C_plus_plus_14, file: !1, producer: "clang version 10.0.0 (trunk 374581) (llvm/trunk 374579)", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, enums: !2, nameTableKind: None) +!1 = !DIFile(filename: "loc2.cpp", directory: "/usr/local/google/home/blaikie/dev/scratch", checksumkind: CSK_MD5, checksum: "91e0069c680e2a63f4f885ec93f5d07e") !2 = !{} !3 = !{i32 2, !"Dwarf Version", i32 5} !4 = !{i32 2, !"Debug Info Version", i32 3} !5 = !{i32 1, !"wchar_size", i32 4} -!6 = !{!"clang version 8.0.0 (trunk 344035)"} -!7 = distinct !DISubprogram(name: "baz", linkageName: "_Z3baz1A", scope: !1, file: !1, line: 6, type: !8, isLocal: false, isDefinition: true, scopeLine: 6, flags: DIFlagPrototyped, isOptimized: false, unit: !0, retainedNodes: !2) +!6 = !{!"clang version 10.0.0 (trunk 374581) (llvm/trunk 374579)"} +!7 = distinct !DISubprogram(name: "f1", linkageName: "_Z2f1ii", scope: !1, file: !1, line: 1, type: !8, scopeLine: 1, flags: 
DIFlagPrototyped | DIFlagAllCallsDescribed, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !0, retainedNodes: !11) !8 = !DISubroutineType(types: !9) -!9 = !{null, !10} -!10 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "A", file: !1, line: 1, size: 64, flags: DIFlagTypePassByReference, elements: !11, vtableHolder: !10, identifier: "_ZTS1A") -!11 = !{!12, !18, !22} -!12 = !DIDerivedType(tag: DW_TAG_member, name: "_vptr$A", scope: !1, file: !1, baseType: !13, size: 64, flags: DIFlagArtificial) -!13 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !14, size: 64) -!14 = !DIDerivedType(tag: DW_TAG_pointer_type, name: "__vtbl_ptr_type", baseType: !15, size: 64) -!15 = !DISubroutineType(types: !16) -!16 = !{!17} -!17 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed) -!18 = !DISubprogram(name: "foo", linkageName: "_ZN1A3fooEv", scope: !10, file: !1, line: 2, type: !19, isLocal: false, isDefinition: false, scopeLine: 2, containingType: !10, virtuality: DW_VIRTUALITY_virtual, virtualIndex: 0, flags: DIFlagPrototyped, isOptimized: false) -!19 = !DISubroutineType(types: !20) -!20 = !{null, !21} -!21 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !10, size: 64, flags: DIFlagArtificial | DIFlagObjectPointer) -!22 = !DISubprogram(name: "bar", linkageName: "_ZN1A3barEv", scope: !10, file: !1, line: 3, type: !19, isLocal: false, isDefinition: false, scopeLine: 3, containingType: !10, virtuality: DW_VIRTUALITY_virtual, virtualIndex: 1, flags: DIFlagPrototyped, isOptimized: false) -!23 = !DILocalVariable(name: "a", arg: 1, scope: !7, file: !1, line: 6, type: !10) -!24 = !DILocation(line: 6, column: 19, scope: !7) -!25 = !DILocation(line: 7, column: 6, scope: !7) -!26 = !DILocation(line: 8, column: 6, scope: !7) -!27 = !DILocation(line: 9, column: 1, scope: !7) -!28 = distinct !DISubprogram(name: "foo", linkageName: "_ZN1A3fooEv", scope: !10, file: !1, line: 12, type: !19, isLocal: false, isDefinition: true, scopeLine: 12, flags: 
DIFlagPrototyped, isOptimized: false, unit: !0, declaration: !18, retainedNodes: !2) -!29 = !DILocalVariable(name: "this", arg: 1, scope: !28, type: !30, flags: DIFlagArtificial | DIFlagObjectPointer) -!30 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !10, size: 64) -!31 = !DILocation(line: 0, scope: !28) -!32 = !DILocation(line: 12, column: 16, scope: !28) -!33 = distinct !DISubprogram(name: "bar", linkageName: "_ZN1A3barEv", scope: !10, file: !1, line: 13, type: !19, isLocal: false, isDefinition: true, scopeLine: 13, flags: DIFlagPrototyped, isOptimized: false, unit: !0, declaration: !22, retainedNodes: !2) -!34 = !DILocalVariable(name: "this", arg: 1, scope: !33, type: !30, flags: DIFlagArtificial | DIFlagObjectPointer) -!35 = !DILocation(line: 0, scope: !33) -!36 = !DILocation(line: 13, column: 16, scope: !33) -!37 = distinct !DISubprogram(name: "main", scope: !1, file: !1, line: 15, type: !15, isLocal: false, isDefinition: true, scopeLine: 15, flags: DIFlagPrototyped, isOptimized: false, unit: !0, retainedNodes: !2) -!38 = !DILocation(line: 16, column: 3, scope: !37) +!9 = !{!10, !10, !10} +!10 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed) +!11 = !{!12, !13, !14, !15, !16, !17} +!12 = !DILocalVariable(name: "i", arg: 1, scope: !7, file: !1, line: 1, type: !10) +!13 = !DILocalVariable(name: "j", arg: 2, scope: !7, file: !1, line: 1, type: !10) +!14 = !DILocalVariable(name: "x", scope: !7, file: !1, line: 2, type: !10) +!15 = !DILocalVariable(name: "y", scope: !7, file: !1, line: 3, type: !10) +!16 = !DILocalVariable(name: "r", scope: !7, file: !1, line: 4, type: !10) +!17 = !DILocalVariable(name: "undef", scope: !7, file: !1, line: 5, type: !10) +!18 = !DILocation(line: 0, scope: !7) +!19 = !DILocation(line: 4, column: 13, scope: !7) +!20 = !DILocation(line: 8, column: 3, scope: !7) +!21 = distinct !DISubprogram(name: "f2", linkageName: "_Z2f2v", scope: !1, file: !1, line: 10, type: !22, scopeLine: 10, flags: DIFlagPrototyped | 
DIFlagAllCallsDescribed, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !0, retainedNodes: !2) +!22 = !DISubroutineType(types: !23) +!23 = !{null} +!24 = !DILocation(line: 11, column: 1, scope: !21) Modified: llvm/trunk/test/DebugInfo/X86/sret.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/DebugInfo/X86/sret.ll?rev=374600&r1=374599&r2=374600&view=diff ============================================================================== --- llvm/trunk/test/DebugInfo/X86/sret.ll (original) +++ llvm/trunk/test/DebugInfo/X86/sret.ll Fri Oct 11 14:52:41 2019 @@ -11,6 +11,7 @@ ; CHECK: _ZN1B9AInstanceEv ; CHECK: DW_TAG_variable ; CHECK-NEXT: DW_AT_location [DW_FORM_sec_offset] (0x00000000 +; CHECK-NEXT: [0xffffffffffffffff, {{.*}}): {{$}} ; CHECK-NEXT: [{{.*}}, {{.*}}): DW_OP_breg5 RDI+0 ; CHECK-NEXT: [{{.*}}, {{.*}}): DW_OP_breg6 RBP-24, DW_OP_deref) ; CHECK-NEXT: DW_AT_name {{.*}}"a" From llvm-commits at lists.llvm.org Fri Oct 11 14:51:03 2019 From: llvm-commits at lists.llvm.org (Galina Kistanova via llvm-commits) Date: Fri, 11 Oct 2019 14:51:03 -0700 Subject: LLVM buildmaster could be unavailable today at 5:00 pm PST Message-ID: Hello everyone, LLVM buildmaster could be unavailable for a short time on Friday, October 11th at 5:00 pm PST due to network maintenance. Thanks Galina From llvm-commits at lists.llvm.org Fri Oct 11 14:57:06 2019 From: llvm-commits at lists.llvm.org (Julian Lettner via llvm-commits) Date: Fri, 11 Oct 2019 21:57:06 -0000 Subject: [llvm] r374601 - [lit] Change regex filter to ignore case Message-ID: <20191011215706.E793088B04@lists.llvm.org> Author: yln Date: Fri Oct 11 14:57:06 2019 New Revision: 374601 URL: http://llvm.org/viewvc/llvm-project?rev=374601&view=rev Log: [lit] Change regex filter to ignore case Make regex filter `--filter=REGEX` option more lenient via `re.IGNORECASE`. 
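The idea is to compile the pattern once, at argument-parsing time, with `re.IGNORECASE`, so that a bad pattern is reported as an ordinary argument error. A minimal standalone sketch follows, assuming a plain `argparse` parser rather than lit's own `cl_arguments.py`; the parser setup, the `select` helper and the sample test names are hypothetical and only there to make the example runnable.

```python
import argparse
import re


def case_insensitive_regex(arg):
    """argparse `type=` callback: compile `arg`, ignoring case."""
    try:
        return re.compile(arg, re.IGNORECASE)
    except re.error as reason:
        # argparse turns ArgumentTypeError into a normal usage error.
        raise argparse.ArgumentTypeError(
            "invalid regular expression: '{}', {}".format(arg, reason))


def select(names, pattern):
    """Keep only the names in which the compiled pattern matches somewhere."""
    return [n for n in names if pattern.search(n)]


# Hypothetical parser and test names, for illustration only.
parser = argparse.ArgumentParser()
parser.add_argument("--filter", metavar="REGEX", type=case_insensitive_regex)
opts = parser.parse_args(["--filter", "O[A-Z]E"])

names = [
    "suite :: test-one.txt",
    "suite :: test-two.txt",
    "suite :: subdir/test-three.py",
]
print(select(names, opts.filter))
```

Because the option value is already a compiled pattern by the time the tests are selected, the caller only needs `pattern.search(...)` and no longer has to handle compile errors itself.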
Reviewed By: yln Differential Revision: https://reviews.llvm.org/D68834 Modified: llvm/trunk/utils/lit/lit/cl_arguments.py llvm/trunk/utils/lit/lit/main.py llvm/trunk/utils/lit/tests/selecting.py Modified: llvm/trunk/utils/lit/lit/cl_arguments.py URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/lit/cl_arguments.py?rev=374601&r1=374600&r2=374601&view=diff ============================================================================== --- llvm/trunk/utils/lit/lit/cl_arguments.py (original) +++ llvm/trunk/utils/lit/lit/cl_arguments.py Fri Oct 11 14:57:06 2019 @@ -152,6 +152,7 @@ def parse_args(): default=False) selection_group.add_argument("--filter", metavar="REGEX", + type=_case_insensitive_regex, help="Only run tests with paths matching the given regular expression", default=os.environ.get("LIT_FILTER")) selection_group.add_argument("--num-shards", @@ -201,14 +202,22 @@ def parse_args(): return opts def _positive_int(arg): + desc = "requires positive integer, but found '{}'" try: n = int(arg) except ValueError: - raise _arg_error('positive integer', arg) + raise _error(desc, arg) if n <= 0: - raise _arg_error('positive integer', arg) + raise _error(desc, arg) return n -def _arg_error(desc, arg): - msg = "requires %s, but found '%s'" % (desc, arg) +def _case_insensitive_regex(arg): + import re + try: + return re.compile(arg, re.IGNORECASE) + except re.error as reason: + raise _error("invalid regular expression: '{}', {}", arg, reason) + +def _error(desc, *args): + msg = desc.format(*args) return argparse.ArgumentTypeError(msg) Modified: llvm/trunk/utils/lit/lit/main.py URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/lit/main.py?rev=374601&r1=374600&r2=374601&view=diff ============================================================================== --- llvm/trunk/utils/lit/lit/main.py (original) +++ llvm/trunk/utils/lit/lit/main.py Fri Oct 11 14:57:06 2019 @@ -10,7 +10,6 @@ from __future__ import absolute_import import os import platform 
import random -import re import sys import time import tempfile @@ -115,7 +114,7 @@ def main_with_tmp(builtinParameters): numTotalTests = len(run.tests) if opts.filter: - filter_tests(run, opts) + run.tests = [t for t in run.tests if opts.filter.search(t.getFullName())] order_tests(run, opts) @@ -277,15 +276,6 @@ def print_suites_or_tests(run, opts): # Exit. sys.exit(0) -def filter_tests(run, opts): - try: - rex = re.compile(opts.filter) - except: - parser.error("invalid regular expression for --filter: %r" % ( - opts.filter)) - run.tests = [result_test for result_test in run.tests - if rex.search(result_test.getFullName())] - def order_tests(run, opts): if opts.shuffle: random.shuffle(run.tests) Modified: llvm/trunk/utils/lit/tests/selecting.py URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/selecting.py?rev=374601&r1=374600&r2=374601&view=diff ============================================================================== --- llvm/trunk/utils/lit/tests/selecting.py (original) +++ llvm/trunk/utils/lit/tests/selecting.py Fri Oct 11 14:57:06 2019 @@ -1,17 +1,13 @@ # RUN: %{lit} %{inputs}/discovery | FileCheck --check-prefix=CHECK-BASIC %s # CHECK-BASIC: Testing: 5 tests -# Check that regex-filtering works +# Check that regex-filtering works, is case-insensitive, and can be configured via env var. # # RUN: %{lit} --filter 'o[a-z]e' %{inputs}/discovery | FileCheck --check-prefix=CHECK-FILTER %s +# RUN: %{lit} --filter 'O[A-Z]E' %{inputs}/discovery | FileCheck --check-prefix=CHECK-FILTER %s +# RUN: env LIT_FILTER='o[a-z]e' %{lit} %{inputs}/discovery | FileCheck --check-prefix=CHECK-FILTER %s # CHECK-FILTER: Testing: 2 of 5 tests -# Check that regex-filtering based on environment variables work. 
-# -# RUN: env LIT_FILTER='o[a-z]e' %{lit} %{inputs}/discovery | FileCheck --check-prefix=CHECK-FILTER-ENV %s -# CHECK-FILTER-ENV: Testing: 2 of 5 tests - - # Check that maximum counts work # # RUN: %{lit} --max-tests 3 %{inputs}/discovery | FileCheck --check-prefix=CHECK-MAX %s From llvm-commits at lists.llvm.org Fri Oct 11 14:57:09 2019 From: llvm-commits at lists.llvm.org (Julian Lettner via llvm-commits) Date: Fri, 11 Oct 2019 21:57:09 -0000 Subject: [llvm] r374602 - [lit] Small cleanups in main.py Message-ID: <20191011215709.DD1B793388@lists.llvm.org> Author: yln Date: Fri Oct 11 14:57:09 2019 New Revision: 374602 URL: http://llvm.org/viewvc/llvm-project?rev=374602&view=rev Log: [lit] Small cleanups in main.py * Extract separate function for running tests from main * Push single-usage imports to point of usage * Remove unnecessary sys.exit(0) calls Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D68836 Modified: llvm/trunk/utils/lit/lit/main.py Modified: llvm/trunk/utils/lit/lit/main.py URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/lit/main.py?rev=374602&r1=374601&r2=374602&view=diff ============================================================================== --- llvm/trunk/utils/lit/lit/main.py (original) +++ llvm/trunk/utils/lit/lit/main.py Fri Oct 11 14:57:09 2019 @@ -9,12 +9,8 @@ See lit.pod for more information. from __future__ import absolute_import import os import platform -import random import sys import time -import tempfile -import shutil -from xml.sax.saxutils import quoteattr import lit.cl_arguments import lit.discovery @@ -32,6 +28,7 @@ def main(builtinParameters = {}): # the buildbot level. lit_tmp = None if 'LIT_PRESERVES_TMP' not in os.environ: + import tempfile lit_tmp = tempfile.mkdtemp(prefix="lit_tmp_") os.environ.update({ 'TMPDIR': lit_tmp, @@ -48,6 +45,7 @@ def main(builtinParameters = {}): finally: if lit_tmp: try: + import shutil shutil.rmtree(lit_tmp) except: # FIXME: Re-try after timeout on Windows. 
@@ -142,23 +140,7 @@ def main_with_tmp(builtinParameters): # Don't create more workers than tests. opts.numWorkers = min(len(run.tests), opts.numWorkers) - increase_process_limit(litConfig, opts) - - display = lit.display.create_display(opts, len(run.tests), - numTotalTests, opts.numWorkers) - def progress_callback(test): - display.update(test) - if opts.incremental: - update_incremental_cache(test) - - startTime = time.time() - try: - run.execute_tests(progress_callback, opts.numWorkers, opts.maxTime) - except KeyboardInterrupt: - sys.exit(2) - testing_time = time.time() - startTime - - display.finish() + testing_time = run_tests(run, litConfig, opts, numTotalTests) if not opts.quiet: print('Testing Time: %.2fs' % (testing_time,)) @@ -231,7 +213,6 @@ def main_with_tmp(builtinParameters): if hasFailures: sys.exit(1) - sys.exit(0) def create_user_parameters(builtinParameters, opts): @@ -273,11 +254,9 @@ def print_suites_or_tests(run, opts): for test in ts_tests: print(' %s' % (test.getFullName(),)) - # Exit. 
- sys.exit(0) - def order_tests(run, opts): if opts.shuffle: + import random random.shuffle(run.tests) elif opts.incremental: run.tests.sort(key = by_mtime, reverse = True) @@ -320,6 +299,26 @@ def increase_process_limit(litConfig, op except: pass +def run_tests(run, litConfig, opts, numTotalTests): + increase_process_limit(litConfig, opts) + + display = lit.display.create_display(opts, len(run.tests), + numTotalTests, opts.numWorkers) + def progress_callback(test): + display.update(test) + if opts.incremental: + update_incremental_cache(test) + + startTime = time.time() + try: + run.execute_tests(progress_callback, opts.numWorkers, opts.maxTime) + except KeyboardInterrupt: + sys.exit(2) + testing_time = time.time() - startTime + + display.finish() + return testing_time + def write_test_results(run, lit_config, testing_time, output_path): try: import json @@ -379,6 +378,7 @@ def write_test_results(run, lit_config, f.close() def write_test_results_xunit(run, opts): + from xml.sax.saxutils import quoteattr # Collect the tests, indexed by test suite by_suite = {} for result_test in run.tests: From llvm-commits at lists.llvm.org Fri Oct 11 14:59:25 2019 From: llvm-commits at lists.llvm.org (Julian Lettner via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 21:59:25 +0000 (UTC) Subject: [PATCH] D68834: [lit] Change regex filter to ignore case In-Reply-To: References: Message-ID: <436c85a82d698eed825a6ef1bcc5415c@localhost.localdomain> yln accepted this revision. yln added a comment. This revision is now accepted and ready to land. I am accepting this myself because it is a small and reasonable (to me) change. Please voice your concerns if you have any. 
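For reference, the --filter handling that D68834 moves into argument parsing can be sketched in isolation as below. This is a minimal standalone sketch, not the lit source: the names case_insensitive_regex, build_parser and select_tests, the "demo" program name, and the sample test names are assumptions made up for illustration. Only argparse and re from the Python standard library are used.

```python
import argparse
import re


def case_insensitive_regex(arg):
    # Compile the user-supplied pattern once, ignoring case. A bad pattern is
    # reported as an ArgumentTypeError so argparse can print a usage error.
    try:
        return re.compile(arg, re.IGNORECASE)
    except re.error as reason:
        raise argparse.ArgumentTypeError(
            "invalid regular expression: '{}', {}".format(arg, reason))


def build_parser():
    # Hypothetical parser with a single option; the real lit parser has many more.
    parser = argparse.ArgumentParser(prog="demo")
    parser.add_argument("--filter", metavar="REGEX",
                        type=case_insensitive_regex,
                        help="Only keep names matching the given regular expression")
    return parser


def select_tests(names, pattern):
    # Keep only the names the compiled pattern matches anywhere in the string.
    return [name for name in names if pattern.search(name)]


if __name__ == "__main__":
    opts = build_parser().parse_args(["--filter", "O[A-Z]E"])
    print(select_tests(["one.txt", "two.txt", "three.txt"], opts.filter))
```

Because the pattern is compiled by the type callback, the caller receives a ready-to-use pattern object and needs no separate validation step, which appears to be what lets the patch delete filter_tests from main.py.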
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68834/new/ https://reviews.llvm.org/D68834 From llvm-commits at lists.llvm.org Fri Oct 11 14:59:25 2019 From: llvm-commits at lists.llvm.org (Matt Arsenault via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 21:59:25 +0000 (UTC) Subject: [PATCH] D68893: AMDGPU: Split flat offsets that don't fit in DAG Message-ID: arsenm created this revision. arsenm added reviewers: rampitec, vpykhtin. Herald added subscribers: jfb, t-tye, tpr, dstuttard, yaxunl, nhaehnle, wdng, jvesely, kzhuravl. We handle it this way for some other address spaces. Since r349196, SILoadStoreOptimizer has been trying to do this. This is after SIFoldOperands runs, which can change the addressing patterns. It's simpler to just split this earlier. https://reviews.llvm.org/D68893 Files: lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp lib/Target/AMDGPU/SIInstrInfo.cpp lib/Target/AMDGPU/SIInstrInfo.h test/CodeGen/AMDGPU/cgp-addressing-modes.ll test/CodeGen/AMDGPU/flat-address-space.ll test/CodeGen/AMDGPU/global-saddr.ll test/CodeGen/AMDGPU/global_atomics.ll test/CodeGen/AMDGPU/global_atomics_i64.ll test/CodeGen/AMDGPU/promote-constOffset-to-imm.ll test/CodeGen/AMDGPU/store-hi16.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D68893.224684.patch Type: text/x-patch Size: 25782 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Fri Oct 11 14:59:26 2019 From: llvm-commits at lists.llvm.org (Matt Arsenault via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 21:59:26 +0000 (UTC) Subject: [PATCH] D68894: AMDGPU: Increase vcc liveness scan threshold Message-ID: arsenm created this revision. arsenm added reviewers: rampitec, nhaehnle. Herald added subscribers: t-tye, tpr, dstuttard, yaxunl, wdng, jvesely, kzhuravl. Avoids a test regression in a future patch. Also add debug printing on this case, so I waste less time debugging folds in the future. 
https://reviews.llvm.org/D68894 Files: lib/Target/AMDGPU/SIFoldOperands.cpp test/CodeGen/AMDGPU/copy-illegal-type.ll test/CodeGen/AMDGPU/cvt_f32_ubyte.ll test/CodeGen/AMDGPU/ds-negative-offset-addressing-mode-loop.ll test/CodeGen/AMDGPU/fence-barrier.ll test/CodeGen/AMDGPU/promote-constOffset-to-imm.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D68894.224685.patch Type: text/x-patch Size: 6202 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Fri Oct 11 14:59:26 2019 From: llvm-commits at lists.llvm.org (Matt Arsenault via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 21:59:26 +0000 (UTC) Subject: [PATCH] D68895: AMDGPU: Erase redundant redefs of m0 in SIFoldOperands Message-ID: arsenm created this revision. arsenm added reviewers: rampitec, kerbowa, tstellar. Herald added subscribers: t-tye, tpr, dstuttard, yaxunl, nhaehnle, wdng, jvesely, kzhuravl. Only handle simple inter-block redefs of m0 to the same value. This avoids interference from redefs of m0 in SILoadStoreOptimizer. I was initially teaching that pass to ignore redefs of m0, but having them not exist beforehand is much simpler. This is in preparation for deleting the current special m0 handling in SIFixSGPRCopies to allow the register coalescer to handle the difficult cases. https://reviews.llvm.org/D68895 Files: lib/Target/AMDGPU/SIFoldOperands.cpp test/CodeGen/AMDGPU/fold-operands-remove-m0-redef.mir -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68895.224686.patch Type: text/x-patch Size: 15059 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Fri Oct 11 14:59:31 2019 From: llvm-commits at lists.llvm.org (Simon Atanasyan via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 21:59:31 +0000 (UTC) Subject: [PATCH] D68777: [mips] Use less instruction to load zero into FPR by li.s / li.d pseudos In-Reply-To: References: Message-ID: <1cce76370abf382314de94ac02c3ad44@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rG5ebe3511b35d: [mips] Use less instruction to load zero into FPR by li.s / li.d pseudos (authored by atanasyan). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68777/new/ https://reviews.llvm.org/D68777 Files: llvm/lib/Target/Mips/AsmParser/MipsAsmParser.cpp llvm/test/MC/Mips/macro-li.d.s llvm/test/MC/Mips/macro-li.s.s -------------- next part -------------- A non-text attachment was scrubbed... Name: D68777.224688.patch Type: text/x-patch Size: 5070 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Fri Oct 11 14:59:33 2019 From: llvm-commits at lists.llvm.org (Simon Atanasyan via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 21:59:33 +0000 (UTC) Subject: [PATCH] D68778: [mips] Store 64-bit `li.d' operand as a single 8-byte value In-Reply-To: References: Message-ID: <144cca8c1d375a38f6f220ed9aea7289@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rG66048fed8289: [mips] Store 64-bit `li.d' operand as a single 8-byte value (authored by atanasyan). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68778/new/ https://reviews.llvm.org/D68778 Files: llvm/lib/Target/Mips/AsmParser/MipsAsmParser.cpp llvm/test/MC/Mips/macro-li.d.s -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68778.224689.patch Type: text/x-patch Size: 10668 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Fri Oct 11 15:00:29 2019 From: llvm-commits at lists.llvm.org (David Blaikie via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 22:00:29 +0000 (UTC) Subject: [PATCH] D68620: DebugInfo: Use base address selection entries for debug_loc In-Reply-To: References: Message-ID: <707e3ba1f27ca1ea3e823dc861e74ad0@localhost.localdomain> This revision was not accepted when it landed; it landed in state "Needs Revision". This revision was automatically updated to reflect the committed changes. Closed by commit rG289c45cc62e4: DebugInfo: Use base address selection entries for debug_loc (authored by dblaikie). Herald added subscribers: ormris, hiraditya. Changed prior to commit: https://reviews.llvm.org/D68620?vs=223716&id=224690#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68620/new/ https://reviews.llvm.org/D68620 Files: llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp llvm/test/CodeGen/X86/debug-loclists.ll llvm/test/DebugInfo/X86/sret.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D68620.224690.patch Type: text/x-patch Size: 28484 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Fri Oct 11 15:00:30 2019 From: llvm-commits at lists.llvm.org (Evgenii Stepanov via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 22:00:30 +0000 (UTC) Subject: [PATCH] D68794: libhwasan initialisation include kernel syscall ABI relaxation In-Reply-To: References: Message-ID: <2ca4eae46faf2b482e60f85e1a855eba@localhost.localdomain> eugenis added a comment. In D68794#1705862 , @mmalcomson wrote: > Run `prctl` syscall for Android, but ignore EINVAL failures. 
> > NOTE: I don't believe this distinguishes between running on a kernel with the tagged address ABI unconditional or running on a newer kernel or on a kernel with `sysctl abi.tagged_addr_disabled=1` > (https://android.googlesource.com/kernel/common/+/690c4ca8a5715644370384672f24d95b042db74a/Documentation/arm64/tagged-address-abi.rst) This is a good point. It appears that PR_GET_TAGGED_ADDR_CTRL works even when abi.tagged_addr_disabled=1; we can use it to tell these two cases apart, but there is not a lot we could do with that information. A real test would be invoking a random syscall with a tagged pointer, e.g. uname(). I don't think we need to go that far, but leaving it up to you. ================ Comment at: compiler-rt/lib/hwasan/hwasan.cpp:357 + InitPrctl(); + ---------------- Please move it to InitInstrumentation to handle __hwasan_init_static, too. ================ Comment at: compiler-rt/lib/hwasan/hwasan_linux.cpp:157 +#define PR_TAGGED_ADDR_ENABLE (1UL << 0) + if (prctl(PR_SET_TAGGED_ADDR_CTRL, PR_TAGGED_ADDR_ENABLE, 0, 0, 0) == -1 + || ! prctl(PR_GET_TAGGED_ADDR_CTRL, 0, 0, 0, 0)) { ---------------- This needs to be internal_prctl because the prctl implementation in libc can be built with hwasan. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68794/new/ https://reviews.llvm.org/D68794 From llvm-commits at lists.llvm.org Fri Oct 11 15:03:37 2019 From: llvm-commits at lists.llvm.org (Stanislav Mekhanoshin via llvm-commits) Date: Fri, 11 Oct 2019 22:03:37 -0000 Subject: [llvm] r374604 - [AMDGPU] link dpp pseudos and real instructions on gfx10 Message-ID: <20191011220337.1B9AA8588B@lists.llvm.org> Author: rampitec Date: Fri Oct 11 15:03:36 2019 New Revision: 374604 URL: http://llvm.org/viewvc/llvm-project?rev=374604&view=rev Log: [AMDGPU] link dpp pseudos and real instructions on gfx10 This defaults to zero fi operand, but we do not expose it anyway. Should we expose it later it needs to be added to the pseudo. 
This enables dpp combining on gfx10. Differential Revision: https://reviews.llvm.org/D68888 Added: llvm/trunk/test/CodeGen/AMDGPU/dpp_combine.ll Modified: llvm/trunk/lib/Target/AMDGPU/AMDGPUMCInstLower.cpp llvm/trunk/lib/Target/AMDGPU/VOP1Instructions.td llvm/trunk/lib/Target/AMDGPU/VOP2Instructions.td llvm/trunk/test/CodeGen/AMDGPU/atomic_optimizations_local_pointer.ll Modified: llvm/trunk/lib/Target/AMDGPU/AMDGPUMCInstLower.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/AMDGPUMCInstLower.cpp?rev=374604&r1=374603&r2=374604&view=diff ============================================================================== --- llvm/trunk/lib/Target/AMDGPU/AMDGPUMCInstLower.cpp (original) +++ llvm/trunk/lib/Target/AMDGPU/AMDGPUMCInstLower.cpp Fri Oct 11 15:03:36 2019 @@ -211,6 +211,10 @@ void AMDGPUMCInstLower::lower(const Mach lowerOperand(MO, MCOp); OutMI.addOperand(MCOp); } + + int FIIdx = AMDGPU::getNamedOperandIdx(MCOpcode, AMDGPU::OpName::fi); + if (FIIdx >= (int)OutMI.getNumOperands()) + OutMI.addOperand(MCOperand::createImm(0)); } bool AMDGPUAsmPrinter::lowerOperand(const MachineOperand &MO, Modified: llvm/trunk/lib/Target/AMDGPU/VOP1Instructions.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/VOP1Instructions.td?rev=374604&r1=374603&r2=374604&view=diff ============================================================================== --- llvm/trunk/lib/Target/AMDGPU/VOP1Instructions.td (original) +++ llvm/trunk/lib/Target/AMDGPU/VOP1Instructions.td Fri Oct 11 15:03:36 2019 @@ -441,7 +441,7 @@ let SubtargetPredicate = isGFX10Plus in // Target-specific instruction encodings. 
//===----------------------------------------------------------------------===// -class VOP1_DPP op, VOP1_Pseudo ps, VOPProfile p = ps.Pfl, bit isDPP16 = 0> : +class VOP1_DPP op, VOP1_DPP_Pseudo ps, VOPProfile p = ps.Pfl, bit isDPP16 = 0> : VOP_DPP { let hasSideEffects = ps.hasSideEffects; let Defs = ps.Defs; @@ -455,8 +455,9 @@ class VOP1_DPP op, VOP1_Pseudo p let Inst{31-25} = 0x3f; } -class VOP1_DPP16 op, VOP1_Pseudo ps, VOPProfile p = ps.Pfl> : - VOP1_DPP { +class VOP1_DPP16 op, VOP1_DPP_Pseudo ps, VOPProfile p = ps.Pfl> : + VOP1_DPP, + SIMCInstr { let AssemblerPredicate = !if(p.HasExt, HasDPP16, DisableInst); let SubtargetPredicate = HasDPP16; } @@ -507,7 +508,7 @@ let AssemblerPredicate = isGFX10Plus, De } multiclass VOP1_Real_dpp_gfx10 op> { foreach _ = BoolToList(NAME#"_e32").Pfl.HasExtDPP>.ret in - def _dpp_gfx10 : VOP1_DPP16(NAME#"_e32")> { + def _dpp_gfx10 : VOP1_DPP16(NAME#"_dpp")> { let DecoderNamespace = "SDWA10"; } } @@ -840,7 +841,7 @@ def V_MOVRELD_B32_V4 : V_MOVRELD_B32_pse def V_MOVRELD_B32_V8 : V_MOVRELD_B32_pseudo; def V_MOVRELD_B32_V16 : V_MOVRELD_B32_pseudo; -let OtherPredicates = [isGFX8GFX9] in { +let OtherPredicates = [isGFX8Plus] in { def : GCNPat < (i32 (int_amdgcn_mov_dpp i32:$src, timm:$dpp_ctrl, timm:$row_mask, timm:$bank_mask, @@ -858,7 +859,7 @@ def : GCNPat < (as_i1imm $bound_ctrl)) >; -} // End OtherPredicates = [isGFX8GFX9] +} // End OtherPredicates = [isGFX8Plus] let OtherPredicates = [isGFX8Plus] in { def : GCNPat< @@ -916,20 +917,4 @@ def : GCNPat < (i32 (int_amdgcn_mov_dpp8 i32:$src, timm:$dpp8)), (V_MOV_B32_dpp8_gfx10 $src, $src, (as_i32imm $dpp8), (i32 DPP8Mode.FI_0)) >; - -def : GCNPat < - (i32 (int_amdgcn_mov_dpp i32:$src, timm:$dpp_ctrl, timm:$row_mask, timm:$bank_mask, - timm:$bound_ctrl)), - (V_MOV_B32_dpp_gfx10 $src, $src, (as_i32imm $dpp_ctrl), - (as_i32imm $row_mask), (as_i32imm $bank_mask), - (as_i1imm $bound_ctrl), (i32 0)) ->; - -def : GCNPat < - (i32 (int_amdgcn_update_dpp i32:$old, i32:$src, timm:$dpp_ctrl, 
timm:$row_mask, - timm:$bank_mask, timm:$bound_ctrl)), - (V_MOV_B32_dpp_gfx10 $old, $src, (as_i32imm $dpp_ctrl), - (as_i32imm $row_mask), (as_i32imm $bank_mask), - (as_i1imm $bound_ctrl), (i32 0)) ->; } // End OtherPredicates = [isGFX10Plus] Modified: llvm/trunk/lib/Target/AMDGPU/VOP2Instructions.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/VOP2Instructions.td?rev=374604&r1=374603&r2=374604&view=diff ============================================================================== --- llvm/trunk/lib/Target/AMDGPU/VOP2Instructions.td (original) +++ llvm/trunk/lib/Target/AMDGPU/VOP2Instructions.td Fri Oct 11 15:03:36 2019 @@ -658,14 +658,14 @@ let Constraints = "$vdst = $src2", isCommutable = 1, IsDOT = 1 in { let SubtargetPredicate = HasDot5Insts in - defm V_DOT2C_F32_F16 : VOP2Inst_e32<"v_dot2c_f32_f16", VOP_DOT_ACC_F32_V2F16>; + defm V_DOT2C_F32_F16 : VOP2Inst<"v_dot2c_f32_f16", VOP_DOT_ACC_F32_V2F16>; let SubtargetPredicate = HasDot6Insts in - defm V_DOT4C_I32_I8 : VOP2Inst_e32<"v_dot4c_i32_i8", VOP_DOT_ACC_I32_I32>; + defm V_DOT4C_I32_I8 : VOP2Inst<"v_dot4c_i32_i8", VOP_DOT_ACC_I32_I32>; let SubtargetPredicate = HasDot4Insts in - defm V_DOT2C_I32_I16 : VOP2Inst_e32<"v_dot2c_i32_i16", VOP_DOT_ACC_I32_I32>; + defm V_DOT2C_I32_I16 : VOP2Inst<"v_dot2c_i32_i16", VOP_DOT_ACC_I32_I32>; let SubtargetPredicate = HasDot3Insts in - defm V_DOT8C_I32_I4 : VOP2Inst_e32<"v_dot8c_i32_i4", VOP_DOT_ACC_I32_I32>; + defm V_DOT8C_I32_I4 : VOP2Inst<"v_dot8c_i32_i4", VOP_DOT_ACC_I32_I32>; } let AddedComplexity = 30 in { @@ -800,7 +800,7 @@ def : GCNPat< // Target-specific instruction encodings. 
//===----------------------------------------------------------------------===// -class VOP2_DPP op, VOP2_Pseudo ps, +class VOP2_DPP op, VOP2_DPP_Pseudo ps, string opName = ps.OpName, VOPProfile p = ps.Pfl, bit IsDPP16 = 0> : VOP_DPP { @@ -818,13 +818,18 @@ class VOP2_DPP op, VOP2_Pseudo p let Inst{31} = 0x0; } -class VOP2_DPP16 op, VOP2_Pseudo ps, +class Base_VOP2_DPP16 op, VOP2_DPP_Pseudo ps, string opName = ps.OpName, VOPProfile p = ps.Pfl> : VOP2_DPP { let AssemblerPredicate = !if(p.HasExt, HasDPP16, DisableInst); let SubtargetPredicate = HasDPP16; } +class VOP2_DPP16 op, VOP2_DPP_Pseudo ps, + string opName = ps.OpName, VOPProfile p = ps.Pfl> : + Base_VOP2_DPP16, + SIMCInstr ; + class VOP2_DPP8 op, VOP2_Pseudo ps, string opName = ps.OpName, VOPProfile p = ps.Pfl> : VOP_DPP8 { @@ -885,7 +890,7 @@ let AssemblerPredicate = isGFX10Plus, De } multiclass VOP2_Real_dpp_gfx10 op> { foreach _ = BoolToList(NAME#"_e32").Pfl.HasExtDPP>.ret in - def _dpp_gfx10 : VOP2_DPP16(NAME#"_e32")> { + def _dpp_gfx10 : VOP2_DPP16(NAME#"_dpp")> { let DecoderNamespace = "SDWA10"; } } @@ -929,7 +934,7 @@ let AssemblerPredicate = isGFX10Plus, De multiclass VOP2_Real_dpp_gfx10_with_name op, string opName, string asmName> { foreach _ = BoolToList(opName#"_e32").Pfl.HasExtDPP>.ret in - def _dpp_gfx10 : VOP2_DPP16(opName#"_e32")> { + def _dpp_gfx10 : VOP2_DPP16(opName#"_dpp")> { VOP2_Pseudo ps = !cast(opName#"_e32"); let AsmString = asmName # ps.Pfl.AsmDPP16; } @@ -969,7 +974,7 @@ let AssemblerPredicate = isGFX10Plus, De } foreach _ = BoolToList(opName#"_e32").Pfl.HasExtDPP>.ret in def _dpp_gfx10 : - VOP2_DPP16(opName#"_e32"), asmName> { + VOP2_DPP16(opName#"_dpp"), asmName> { string AsmDPP = !cast(opName#"_e32").Pfl.AsmDPP16; let AsmString = asmName # !subst(", vcc", "", AsmDPP); let DecoderNamespace = "SDWA10"; @@ -992,7 +997,7 @@ let AssemblerPredicate = isGFX10Plus, De let DecoderNamespace = "SDWA10"; } def _dpp_w32_gfx10 : - VOP2_DPP16(opName#"_e32"), asmName> { + 
Base_VOP2_DPP16(opName#"_dpp"), asmName> { string AsmDPP = !cast(opName#"_e32").Pfl.AsmDPP16; let AsmString = asmName # !subst("vcc", "vcc_lo", AsmDPP); let isAsmParserOnly = 1; @@ -1015,7 +1020,7 @@ let AssemblerPredicate = isGFX10Plus, De let DecoderNamespace = "SDWA10"; } def _dpp_w64_gfx10 : - VOP2_DPP16(opName#"_e32"), asmName> { + Base_VOP2_DPP16(opName#"_dpp"), asmName> { string AsmDPP = !cast(opName#"_e32").Pfl.AsmDPP16; let AsmString = asmName # AsmDPP; let isAsmParserOnly = 1; @@ -1513,7 +1518,7 @@ defm V_XNOR_B32 : VOP2_Real_e32e64_vi <0 } // End SubtargetPredicate = HasDLInsts multiclass VOP2_Real_DOT_ACC_gfx9 op> : VOP2_Real_e32_vi { - def _dpp : VOP2_DPP(NAME#"_e32")>; + def _dpp_vi : VOP2_DPP(NAME#"_dpp")>; } multiclass VOP2_Real_DOT_ACC_gfx10 op> : Modified: llvm/trunk/test/CodeGen/AMDGPU/atomic_optimizations_local_pointer.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/atomic_optimizations_local_pointer.ll?rev=374604&r1=374603&r2=374604&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/AMDGPU/atomic_optimizations_local_pointer.ll (original) +++ llvm/trunk/test/CodeGen/AMDGPU/atomic_optimizations_local_pointer.ll Fri Oct 11 15:03:36 2019 @@ -1,9 +1,9 @@ ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py -; RUN: llc -march=amdgcn -mtriple=amdgcn---amdgiz -amdgpu-atomic-optimizations=true -verify-machineinstrs < %s | FileCheck -enable-var-scope -check-prefixes=GCN,GCN64,GFX7LESS %s -; RUN: llc -march=amdgcn -mtriple=amdgcn---amdgiz -mcpu=tonga -mattr=-flat-for-global -amdgpu-atomic-optimizations=true -verify-machineinstrs < %s | FileCheck -enable-var-scope -check-prefixes=GCN,GCN64,GFX8MORE,GFX8MORE64,DPPCOMB %s -; RUN: llc -march=amdgcn -mtriple=amdgcn---amdgiz -mcpu=gfx900 -mattr=-flat-for-global -amdgpu-atomic-optimizations=true -verify-machineinstrs < %s | FileCheck -enable-var-scope 
-check-prefixes=GCN,GCN64,GFX8MORE,GFX8MORE64,DPPCOMB %s -; RUN: llc -march=amdgcn -mtriple=amdgcn---amdgiz -mcpu=gfx1010 -mattr=-wavefrontsize32,+wavefrontsize64 -mattr=-flat-for-global -amdgpu-atomic-optimizations=true -verify-machineinstrs < %s | FileCheck -enable-var-scope -check-prefixes=GCN,GCN64,GFX8MORE,GFX8MORE64,GFX1064 %s -; RUN: llc -march=amdgcn -mtriple=amdgcn---amdgiz -mcpu=gfx1010 -mattr=+wavefrontsize32,-wavefrontsize64 -mattr=-flat-for-global -amdgpu-atomic-optimizations=true -verify-machineinstrs < %s | FileCheck -enable-var-scope -check-prefixes=GCN,GCN32,GFX8MORE,GFX8MORE32,GFX1032 %s +; RUN: llc -march=amdgcn -mtriple=amdgcn---amdgiz -amdgpu-atomic-optimizations=true -verify-machineinstrs < %s | FileCheck -enable-var-scope -check-prefixes=GFX7LESS %s +; RUN: llc -march=amdgcn -mtriple=amdgcn---amdgiz -mcpu=tonga -mattr=-flat-for-global -amdgpu-atomic-optimizations=true -verify-machineinstrs < %s | FileCheck -enable-var-scope -check-prefixes=GFX8 %s +; RUN: llc -march=amdgcn -mtriple=amdgcn---amdgiz -mcpu=gfx900 -mattr=-flat-for-global -amdgpu-atomic-optimizations=true -verify-machineinstrs < %s | FileCheck -enable-var-scope -check-prefixes=GFX9 %s +; RUN: llc -march=amdgcn -mtriple=amdgcn---amdgiz -mcpu=gfx1010 -mattr=-wavefrontsize32,+wavefrontsize64 -mattr=-flat-for-global -amdgpu-atomic-optimizations=true -verify-machineinstrs < %s | FileCheck -enable-var-scope -check-prefixes=GFX1064 %s +; RUN: llc -march=amdgcn -mtriple=amdgcn---amdgiz -mcpu=gfx1010 -mattr=+wavefrontsize32,-wavefrontsize64 -mattr=-flat-for-global -amdgpu-atomic-optimizations=true -verify-machineinstrs < %s | FileCheck -enable-var-scope -check-prefixes=GFX1032 %s declare i32 @llvm.amdgcn.workitem.id.x() @@ -12,53 +12,596 @@ declare i32 @llvm.amdgcn.workitem.id.x() ; Show that what the atomic optimization pass will do for local pointers. 
-; GCN-LABEL: add_i32_constant: -; GCN32: v_cmp_ne_u32_e64 s[[exec_lo:[0-9]+]], 1, 0 -; GCN64: v_cmp_ne_u32_e64 s{{\[}}[[exec_lo:[0-9]+]]:[[exec_hi:[0-9]+]]{{\]}}, 1, 0 -; GCN: v_mbcnt_lo_u32_b32{{(_e[0-9]+)?}} v[[mbcnt:[0-9]+]], s[[exec_lo]], 0 -; GCN64: v_mbcnt_hi_u32_b32{{(_e[0-9]+)?}} v[[mbcnt]], s[[exec_hi]], v[[mbcnt]] -; GCN: v_cmp_eq_u32{{(_e[0-9]+)?}} vcc{{(_lo)?}}, 0, v[[mbcnt]] -; GCN32: s_bcnt1_i32_b32 s[[popcount:[0-9]+]], s[[exec_lo]] -; GCN64: s_bcnt1_i32_b64 s[[popcount:[0-9]+]], s{{\[}}[[exec_lo]]:[[exec_hi]]{{\]}} -; GCN: v_mul_u32_u24{{(_e[0-9]+)?}} v[[value:[0-9]+]], s[[popcount]], 5 -; GCN: ds_add_rtn_u32 v{{[0-9]+}}, v{{[0-9]+}}, v[[value]] define amdgpu_kernel void @add_i32_constant(i32 addrspace(1)* %out) { +; +; +; GFX7LESS-LABEL: add_i32_constant: +; GFX7LESS: ; %bb.0: ; %entry +; GFX7LESS-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x9 +; GFX7LESS-NEXT: v_cmp_ne_u32_e64 s[4:5], 1, 0 +; GFX7LESS-NEXT: v_mbcnt_lo_u32_b32_e64 v0, s4, 0 +; GFX7LESS-NEXT: v_mbcnt_hi_u32_b32_e32 v0, s5, v0 +; GFX7LESS-NEXT: v_cmp_eq_u32_e32 vcc, 0, v0 +; GFX7LESS-NEXT: ; implicit-def: $vgpr1 +; GFX7LESS-NEXT: s_and_saveexec_b64 s[2:3], vcc +; GFX7LESS-NEXT: ; mask branch BB0_2 +; GFX7LESS-NEXT: s_cbranch_execz BB0_2 +; GFX7LESS-NEXT: BB0_1: +; GFX7LESS-NEXT: s_bcnt1_i32_b64 s4, s[4:5] +; GFX7LESS-NEXT: v_mov_b32_e32 v1, local_var32 at abs32@lo +; GFX7LESS-NEXT: v_mul_u32_u24_e64 v2, s4, 5 +; GFX7LESS-NEXT: s_mov_b32 m0, -1 +; GFX7LESS-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX7LESS-NEXT: ds_add_rtn_u32 v1, v1, v2 +; GFX7LESS-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX7LESS-NEXT: buffer_wbinvl1 +; GFX7LESS-NEXT: BB0_2: +; GFX7LESS-NEXT: s_or_b64 exec, exec, s[2:3] +; GFX7LESS-NEXT: v_readfirstlane_b32 s2, v1 +; GFX7LESS-NEXT: s_mov_b32 s3, 0xf000 +; GFX7LESS-NEXT: v_mad_u32_u24 v0, v0, 5, s2 +; GFX7LESS-NEXT: s_mov_b32 s2, -1 +; GFX7LESS-NEXT: s_waitcnt lgkmcnt(0) +; GFX7LESS-NEXT: buffer_store_dword v0, off, s[0:3], 0 +; GFX7LESS-NEXT: s_endpgm +; +; GFX8-LABEL: 
add_i32_constant: +; GFX8: ; %bb.0: ; %entry +; GFX8-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x24 +; GFX8-NEXT: v_cmp_ne_u32_e64 s[4:5], 1, 0 +; GFX8-NEXT: v_mbcnt_lo_u32_b32 v0, s4, 0 +; GFX8-NEXT: v_mbcnt_hi_u32_b32 v0, s5, v0 +; GFX8-NEXT: v_cmp_eq_u32_e32 vcc, 0, v0 +; GFX8-NEXT: ; implicit-def: $vgpr1 +; GFX8-NEXT: s_and_saveexec_b64 s[2:3], vcc +; GFX8-NEXT: ; mask branch BB0_2 +; GFX8-NEXT: s_cbranch_execz BB0_2 +; GFX8-NEXT: BB0_1: +; GFX8-NEXT: s_bcnt1_i32_b64 s4, s[4:5] +; GFX8-NEXT: v_mul_u32_u24_e64 v1, s4, 5 +; GFX8-NEXT: v_mov_b32_e32 v2, local_var32 at abs32@lo +; GFX8-NEXT: s_mov_b32 m0, -1 +; GFX8-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX8-NEXT: ds_add_rtn_u32 v1, v2, v1 +; GFX8-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX8-NEXT: buffer_wbinvl1_vol +; GFX8-NEXT: BB0_2: +; GFX8-NEXT: s_or_b64 exec, exec, s[2:3] +; GFX8-NEXT: v_readfirstlane_b32 s2, v1 +; GFX8-NEXT: v_mad_u32_u24 v0, v0, 5, s2 +; GFX8-NEXT: s_mov_b32 s3, 0xf000 +; GFX8-NEXT: s_mov_b32 s2, -1 +; GFX8-NEXT: s_nop 1 +; GFX8-NEXT: s_waitcnt lgkmcnt(0) +; GFX8-NEXT: buffer_store_dword v0, off, s[0:3], 0 +; GFX8-NEXT: s_endpgm +; +; GFX9-LABEL: add_i32_constant: +; GFX9: ; %bb.0: ; %entry +; GFX9-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x24 +; GFX9-NEXT: v_cmp_ne_u32_e64 s[4:5], 1, 0 +; GFX9-NEXT: v_mbcnt_lo_u32_b32 v0, s4, 0 +; GFX9-NEXT: v_mbcnt_hi_u32_b32 v0, s5, v0 +; GFX9-NEXT: v_cmp_eq_u32_e32 vcc, 0, v0 +; GFX9-NEXT: ; implicit-def: $vgpr1 +; GFX9-NEXT: s_and_saveexec_b64 s[2:3], vcc +; GFX9-NEXT: ; mask branch BB0_2 +; GFX9-NEXT: s_cbranch_execz BB0_2 +; GFX9-NEXT: BB0_1: +; GFX9-NEXT: s_bcnt1_i32_b64 s4, s[4:5] +; GFX9-NEXT: v_mul_u32_u24_e64 v1, s4, 5 +; GFX9-NEXT: v_mov_b32_e32 v2, local_var32 at abs32@lo +; GFX9-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX9-NEXT: ds_add_rtn_u32 v1, v2, v1 +; GFX9-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX9-NEXT: buffer_wbinvl1_vol +; GFX9-NEXT: BB0_2: +; GFX9-NEXT: s_or_b64 exec, exec, s[2:3] +; GFX9-NEXT: v_readfirstlane_b32 s2, v1 +; GFX9-NEXT: 
v_mad_u32_u24 v0, v0, 5, s2 +; GFX9-NEXT: s_mov_b32 s3, 0xf000 +; GFX9-NEXT: s_mov_b32 s2, -1 +; GFX9-NEXT: s_nop 1 +; GFX9-NEXT: s_waitcnt lgkmcnt(0) +; GFX9-NEXT: buffer_store_dword v0, off, s[0:3], 0 +; GFX9-NEXT: s_endpgm +; +; GFX1064-LABEL: add_i32_constant: +; GFX1064: ; %bb.0: ; %entry +; GFX1064-NEXT: v_cmp_ne_u32_e64 s[2:3], 1, 0 +; GFX1064-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x24 +; GFX1064-NEXT: ; implicit-def: $vgpr1 +; GFX1064-NEXT: v_mbcnt_lo_u32_b32_e64 v0, s2, 0 +; GFX1064-NEXT: v_mbcnt_hi_u32_b32_e64 v0, s3, v0 +; GFX1064-NEXT: v_cmp_eq_u32_e32 vcc, 0, v0 +; GFX1064-NEXT: s_and_saveexec_b64 s[4:5], vcc +; GFX1064-NEXT: ; mask branch BB0_2 +; GFX1064-NEXT: s_cbranch_execz BB0_2 +; GFX1064-NEXT: BB0_1: +; GFX1064-NEXT: s_bcnt1_i32_b64 s2, s[2:3] +; GFX1064-NEXT: v_mov_b32_e32 v2, local_var32 at abs32@lo +; GFX1064-NEXT: v_mul_u32_u24_e64 v1, s2, 5 +; GFX1064-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1064-NEXT: s_waitcnt_vscnt null, 0x0 +; GFX1064-NEXT: ds_add_rtn_u32 v1, v2, v1 +; GFX1064-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1064-NEXT: buffer_gl0_inv +; GFX1064-NEXT: buffer_gl1_inv +; GFX1064-NEXT: BB0_2: +; GFX1064-NEXT: v_nop +; GFX1064-NEXT: s_or_b64 exec, exec, s[4:5] +; GFX1064-NEXT: v_readfirstlane_b32 s2, v1 +; GFX1064-NEXT: s_mov_b32 s3, 0x31016000 +; GFX1064-NEXT: v_mad_u32_u24 v0, v0, 5, s2 +; GFX1064-NEXT: s_mov_b32 s2, -1 +; GFX1064-NEXT: s_nop 1 +; GFX1064-NEXT: s_waitcnt lgkmcnt(0) +; GFX1064-NEXT: buffer_store_dword v0, off, s[0:3], 0 +; GFX1064-NEXT: s_endpgm +; +; GFX1032-LABEL: add_i32_constant: +; GFX1032: ; %bb.0: ; %entry +; GFX1032-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x24 +; GFX1032-NEXT: v_cmp_ne_u32_e64 s3, 1, 0 +; GFX1032-NEXT: ; implicit-def: $vcc_hi +; GFX1032-NEXT: ; implicit-def: $vgpr1 +; GFX1032-NEXT: v_mbcnt_lo_u32_b32_e64 v0, s3, 0 +; GFX1032-NEXT: v_cmp_eq_u32_e32 vcc_lo, 0, v0 +; GFX1032-NEXT: s_and_saveexec_b32 s2, vcc_lo +; GFX1032-NEXT: ; mask branch BB0_2 +; GFX1032-NEXT: s_cbranch_execz BB0_2 +; 
GFX1032-NEXT: BB0_1: +; GFX1032-NEXT: s_bcnt1_i32_b32 s3, s3 +; GFX1032-NEXT: v_mov_b32_e32 v2, local_var32 at abs32@lo +; GFX1032-NEXT: v_mul_u32_u24_e64 v1, s3, 5 +; GFX1032-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1032-NEXT: s_waitcnt_vscnt null, 0x0 +; GFX1032-NEXT: ds_add_rtn_u32 v1, v2, v1 +; GFX1032-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1032-NEXT: buffer_gl0_inv +; GFX1032-NEXT: buffer_gl1_inv +; GFX1032-NEXT: BB0_2: +; GFX1032-NEXT: v_nop +; GFX1032-NEXT: s_or_b32 exec_lo, exec_lo, s2 +; GFX1032-NEXT: v_readfirstlane_b32 s2, v1 +; GFX1032-NEXT: s_mov_b32 s3, 0x31016000 +; GFX1032-NEXT: v_mad_u32_u24 v0, v0, 5, s2 +; GFX1032-NEXT: s_mov_b32 s2, -1 +; GFX1032-NEXT: s_nop 1 +; GFX1032-NEXT: s_waitcnt lgkmcnt(0) +; GFX1032-NEXT: buffer_store_dword v0, off, s[0:3], 0 +; GFX1032-NEXT: s_endpgm entry: %old = atomicrmw add i32 addrspace(3)* @local_var32, i32 5 acq_rel store i32 %old, i32 addrspace(1)* %out ret void } -; GCN-LABEL: add_i32_uniform: -; GCN32: v_cmp_ne_u32_e64 s[[exec_lo:[0-9]+]], 1, 0 -; GCN64: v_cmp_ne_u32_e64 s{{\[}}[[exec_lo:[0-9]+]]:[[exec_hi:[0-9]+]]{{\]}}, 1, 0 -; GCN: v_mbcnt_lo_u32_b32{{(_e[0-9]+)?}} v[[mbcnt:[0-9]+]], s[[exec_lo]], 0 -; GCN64: v_mbcnt_hi_u32_b32{{(_e[0-9]+)?}} v[[mbcnt]], s[[exec_hi]], v[[mbcnt]] -; GCN: v_cmp_eq_u32{{(_e[0-9]+)?}} vcc{{(_lo)?}}, 0, v[[mbcnt]] -; GCN32: s_bcnt1_i32_b32 s[[popcount:[0-9]+]], s[[exec_lo]] -; GCN64: s_bcnt1_i32_b64 s[[popcount:[0-9]+]], s{{\[}}[[exec_lo]]:[[exec_hi]]{{\]}} -; GCN: s_mul_i32 s[[scalar_value:[0-9]+]], s{{[0-9]+}}, s[[popcount]] -; GCN: v_mov_b32{{(_e[0-9]+)?}} v[[value:[0-9]+]], s[[scalar_value]] -; GCN: ds_add_rtn_u32 v{{[0-9]+}}, v{{[0-9]+}}, v[[value]] define amdgpu_kernel void @add_i32_uniform(i32 addrspace(1)* %out, i32 %additive) { +; +; +; GFX7LESS-LABEL: add_i32_uniform: +; GFX7LESS: ; %bb.0: ; %entry +; GFX7LESS-NEXT: s_load_dwordx2 s[4:5], s[0:1], 0x9 +; GFX7LESS-NEXT: s_load_dword s2, s[0:1], 0xb +; GFX7LESS-NEXT: v_cmp_ne_u32_e64 s[6:7], 1, 0 +; GFX7LESS-NEXT: 
v_mbcnt_lo_u32_b32_e64 v0, s6, 0 +; GFX7LESS-NEXT: v_mbcnt_hi_u32_b32_e32 v0, s7, v0 +; GFX7LESS-NEXT: v_cmp_eq_u32_e32 vcc, 0, v0 +; GFX7LESS-NEXT: ; implicit-def: $vgpr1 +; GFX7LESS-NEXT: s_and_saveexec_b64 s[0:1], vcc +; GFX7LESS-NEXT: ; mask branch BB1_2 +; GFX7LESS-NEXT: s_cbranch_execz BB1_2 +; GFX7LESS-NEXT: BB1_1: +; GFX7LESS-NEXT: s_bcnt1_i32_b64 s3, s[6:7] +; GFX7LESS-NEXT: s_waitcnt lgkmcnt(0) +; GFX7LESS-NEXT: s_mul_i32 s3, s2, s3 +; GFX7LESS-NEXT: v_mov_b32_e32 v1, local_var32 at abs32@lo +; GFX7LESS-NEXT: v_mov_b32_e32 v2, s3 +; GFX7LESS-NEXT: s_mov_b32 m0, -1 +; GFX7LESS-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX7LESS-NEXT: ds_add_rtn_u32 v1, v1, v2 +; GFX7LESS-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX7LESS-NEXT: buffer_wbinvl1 +; GFX7LESS-NEXT: BB1_2: +; GFX7LESS-NEXT: s_or_b64 exec, exec, s[0:1] +; GFX7LESS-NEXT: v_readfirstlane_b32 s0, v1 +; GFX7LESS-NEXT: s_waitcnt lgkmcnt(0) +; GFX7LESS-NEXT: v_mul_lo_u32 v0, s2, v0 +; GFX7LESS-NEXT: s_mov_b32 s7, 0xf000 +; GFX7LESS-NEXT: v_add_i32_e32 v0, vcc, s0, v0 +; GFX7LESS-NEXT: s_mov_b32 s6, -1 +; GFX7LESS-NEXT: buffer_store_dword v0, off, s[4:7], 0 +; GFX7LESS-NEXT: s_endpgm +; +; GFX8-LABEL: add_i32_uniform: +; GFX8: ; %bb.0: ; %entry +; GFX8-NEXT: s_load_dwordx2 s[4:5], s[0:1], 0x24 +; GFX8-NEXT: s_load_dword s0, s[0:1], 0x2c +; GFX8-NEXT: v_cmp_ne_u32_e64 s[6:7], 1, 0 +; GFX8-NEXT: v_mbcnt_lo_u32_b32 v0, s6, 0 +; GFX8-NEXT: v_mbcnt_hi_u32_b32 v0, s7, v0 +; GFX8-NEXT: v_cmp_eq_u32_e32 vcc, 0, v0 +; GFX8-NEXT: ; implicit-def: $vgpr1 +; GFX8-NEXT: s_and_saveexec_b64 s[2:3], vcc +; GFX8-NEXT: ; mask branch BB1_2 +; GFX8-NEXT: s_cbranch_execz BB1_2 +; GFX8-NEXT: BB1_1: +; GFX8-NEXT: s_bcnt1_i32_b64 s1, s[6:7] +; GFX8-NEXT: s_waitcnt lgkmcnt(0) +; GFX8-NEXT: s_mul_i32 s1, s0, s1 +; GFX8-NEXT: v_mov_b32_e32 v1, local_var32 at abs32@lo +; GFX8-NEXT: v_mov_b32_e32 v2, s1 +; GFX8-NEXT: s_mov_b32 m0, -1 +; GFX8-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX8-NEXT: ds_add_rtn_u32 v1, v1, v2 +; GFX8-NEXT: s_waitcnt 
vmcnt(0) lgkmcnt(0) +; GFX8-NEXT: buffer_wbinvl1_vol +; GFX8-NEXT: BB1_2: +; GFX8-NEXT: s_or_b64 exec, exec, s[2:3] +; GFX8-NEXT: s_waitcnt lgkmcnt(0) +; GFX8-NEXT: v_mul_lo_u32 v0, s0, v0 +; GFX8-NEXT: v_readfirstlane_b32 s0, v1 +; GFX8-NEXT: s_mov_b32 s7, 0xf000 +; GFX8-NEXT: s_mov_b32 s6, -1 +; GFX8-NEXT: v_add_u32_e32 v0, vcc, s0, v0 +; GFX8-NEXT: buffer_store_dword v0, off, s[4:7], 0 +; GFX8-NEXT: s_endpgm +; +; GFX9-LABEL: add_i32_uniform: +; GFX9: ; %bb.0: ; %entry +; GFX9-NEXT: s_load_dwordx2 s[4:5], s[0:1], 0x24 +; GFX9-NEXT: s_load_dword s0, s[0:1], 0x2c +; GFX9-NEXT: v_cmp_ne_u32_e64 s[6:7], 1, 0 +; GFX9-NEXT: v_mbcnt_lo_u32_b32 v0, s6, 0 +; GFX9-NEXT: v_mbcnt_hi_u32_b32 v0, s7, v0 +; GFX9-NEXT: v_cmp_eq_u32_e32 vcc, 0, v0 +; GFX9-NEXT: ; implicit-def: $vgpr1 +; GFX9-NEXT: s_and_saveexec_b64 s[2:3], vcc +; GFX9-NEXT: ; mask branch BB1_2 +; GFX9-NEXT: s_cbranch_execz BB1_2 +; GFX9-NEXT: BB1_1: +; GFX9-NEXT: s_bcnt1_i32_b64 s1, s[6:7] +; GFX9-NEXT: s_waitcnt lgkmcnt(0) +; GFX9-NEXT: s_mul_i32 s1, s0, s1 +; GFX9-NEXT: v_mov_b32_e32 v1, local_var32 at abs32@lo +; GFX9-NEXT: v_mov_b32_e32 v2, s1 +; GFX9-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX9-NEXT: ds_add_rtn_u32 v1, v1, v2 +; GFX9-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX9-NEXT: buffer_wbinvl1_vol +; GFX9-NEXT: BB1_2: +; GFX9-NEXT: s_or_b64 exec, exec, s[2:3] +; GFX9-NEXT: s_waitcnt lgkmcnt(0) +; GFX9-NEXT: v_mul_lo_u32 v0, s0, v0 +; GFX9-NEXT: v_readfirstlane_b32 s0, v1 +; GFX9-NEXT: s_mov_b32 s7, 0xf000 +; GFX9-NEXT: s_mov_b32 s6, -1 +; GFX9-NEXT: v_add_u32_e32 v0, s0, v0 +; GFX9-NEXT: buffer_store_dword v0, off, s[4:7], 0 +; GFX9-NEXT: s_endpgm +; +; GFX1064-LABEL: add_i32_uniform: +; GFX1064: ; %bb.0: ; %entry +; GFX1064-NEXT: s_load_dwordx2 s[4:5], s[0:1], 0x24 +; GFX1064-NEXT: v_cmp_ne_u32_e64 s[2:3], 1, 0 +; GFX1064-NEXT: s_load_dword s0, s[0:1], 0x2c +; GFX1064-NEXT: ; implicit-def: $vgpr1 +; GFX1064-NEXT: v_mbcnt_lo_u32_b32_e64 v0, s2, 0 +; GFX1064-NEXT: v_mbcnt_hi_u32_b32_e64 v0, s3, v0 +; 
GFX1064-NEXT: v_cmp_eq_u32_e32 vcc, 0, v0 +; GFX1064-NEXT: s_and_saveexec_b64 s[6:7], vcc +; GFX1064-NEXT: ; mask branch BB1_2 +; GFX1064-NEXT: s_cbranch_execz BB1_2 +; GFX1064-NEXT: BB1_1: +; GFX1064-NEXT: s_bcnt1_i32_b64 s1, s[2:3] +; GFX1064-NEXT: v_mov_b32_e32 v1, local_var32 at abs32@lo +; GFX1064-NEXT: s_waitcnt lgkmcnt(0) +; GFX1064-NEXT: s_mul_i32 s1, s0, s1 +; GFX1064-NEXT: v_mov_b32_e32 v2, s1 +; GFX1064-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1064-NEXT: s_waitcnt_vscnt null, 0x0 +; GFX1064-NEXT: ds_add_rtn_u32 v1, v1, v2 +; GFX1064-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1064-NEXT: buffer_gl0_inv +; GFX1064-NEXT: buffer_gl1_inv +; GFX1064-NEXT: BB1_2: +; GFX1064-NEXT: v_nop +; GFX1064-NEXT: s_or_b64 exec, exec, s[6:7] +; GFX1064-NEXT: s_waitcnt lgkmcnt(0) +; GFX1064-NEXT: v_mul_lo_u32 v0, s0, v0 +; GFX1064-NEXT: v_readfirstlane_b32 s0, v1 +; GFX1064-NEXT: s_mov_b32 s7, 0x31016000 +; GFX1064-NEXT: s_mov_b32 s6, -1 +; GFX1064-NEXT: v_add_nc_u32_e32 v0, s0, v0 +; GFX1064-NEXT: buffer_store_dword v0, off, s[4:7], 0 +; GFX1064-NEXT: s_endpgm +; +; GFX1032-LABEL: add_i32_uniform: +; GFX1032: ; %bb.0: ; %entry +; GFX1032-NEXT: s_load_dwordx2 s[4:5], s[0:1], 0x24 +; GFX1032-NEXT: s_load_dword s0, s[0:1], 0x2c +; GFX1032-NEXT: v_cmp_ne_u32_e64 s2, 1, 0 +; GFX1032-NEXT: ; implicit-def: $vcc_hi +; GFX1032-NEXT: ; implicit-def: $vgpr1 +; GFX1032-NEXT: v_mbcnt_lo_u32_b32_e64 v0, s2, 0 +; GFX1032-NEXT: v_cmp_eq_u32_e32 vcc_lo, 0, v0 +; GFX1032-NEXT: s_and_saveexec_b32 s1, vcc_lo +; GFX1032-NEXT: ; mask branch BB1_2 +; GFX1032-NEXT: s_cbranch_execz BB1_2 +; GFX1032-NEXT: BB1_1: +; GFX1032-NEXT: s_bcnt1_i32_b32 s2, s2 +; GFX1032-NEXT: v_mov_b32_e32 v1, local_var32 at abs32@lo +; GFX1032-NEXT: s_waitcnt lgkmcnt(0) +; GFX1032-NEXT: s_mul_i32 s2, s0, s2 +; GFX1032-NEXT: v_mov_b32_e32 v2, s2 +; GFX1032-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1032-NEXT: s_waitcnt_vscnt null, 0x0 +; GFX1032-NEXT: ds_add_rtn_u32 v1, v1, v2 +; GFX1032-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) 
+; GFX1032-NEXT: buffer_gl0_inv +; GFX1032-NEXT: buffer_gl1_inv +; GFX1032-NEXT: BB1_2: +; GFX1032-NEXT: v_nop +; GFX1032-NEXT: s_or_b32 exec_lo, exec_lo, s1 +; GFX1032-NEXT: s_waitcnt lgkmcnt(0) +; GFX1032-NEXT: v_mul_lo_u32 v0, s0, v0 +; GFX1032-NEXT: v_readfirstlane_b32 s0, v1 +; GFX1032-NEXT: s_mov_b32 s7, 0x31016000 +; GFX1032-NEXT: s_mov_b32 s6, -1 +; GFX1032-NEXT: v_add_nc_u32_e32 v0, s0, v0 +; GFX1032-NEXT: buffer_store_dword v0, off, s[4:7], 0 +; GFX1032-NEXT: s_endpgm entry: %old = atomicrmw add i32 addrspace(3)* @local_var32, i32 %additive acq_rel store i32 %old, i32 addrspace(1)* %out ret void } -; GCN-LABEL: add_i32_varying: ; GFX7LESS-NOT: v_mbcnt_lo_u32_b32 ; GFX7LESS-NOT: v_mbcnt_hi_u32_b32 ; GFX7LESS-NOT: s_bcnt1_i32_b64 -; GFX7LESS: ds_add_rtn_u32 v{{[0-9]+}}, v{{[0-9]+}}, v{{[0-9]+}} ; DPPCOMB: v_add_u32_dpp ; DPPCOMB: v_add_u32_dpp ; GFX8MORE32: v_readlane_b32 s[[scalar_value:[0-9]+]], v{{[0-9]+}}, 31 -; GFX8MORE64: v_readlane_b32 s[[scalar_value:[0-9]+]], v{{[0-9]+}}, 63 ; GFX8MORE: v_mov_b32{{(_e[0-9]+)?}} v[[value:[0-9]+]], s[[scalar_value]] ; GFX8MORE: ds_add_rtn_u32 v{{[0-9]+}}, v{{[0-9]+}}, v[[value]] define amdgpu_kernel void @add_i32_varying(i32 addrspace(1)* %out) { +; +; +; GFX7LESS-LABEL: add_i32_varying: +; GFX7LESS: ; %bb.0: ; %entry +; GFX7LESS-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x9 +; GFX7LESS-NEXT: v_mov_b32_e32 v1, local_var32 at abs32@lo +; GFX7LESS-NEXT: s_mov_b32 m0, -1 +; GFX7LESS-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX7LESS-NEXT: ds_add_rtn_u32 v0, v1, v0 +; GFX7LESS-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX7LESS-NEXT: buffer_wbinvl1 +; GFX7LESS-NEXT: s_mov_b32 s3, 0xf000 +; GFX7LESS-NEXT: s_mov_b32 s2, -1 +; GFX7LESS-NEXT: buffer_store_dword v0, off, s[0:3], 0 +; GFX7LESS-NEXT: s_endpgm +; +; GFX8-LABEL: add_i32_varying: +; GFX8: ; %bb.0: ; %entry +; GFX8-NEXT: v_mov_b32_e32 v2, v0 +; GFX8-NEXT: s_or_saveexec_b64 s[2:3], -1 +; GFX8-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x24 +; GFX8-NEXT: v_mov_b32_e32 v1, 0 +; 
GFX8-NEXT: s_mov_b64 exec, s[2:3] +; GFX8-NEXT: v_cmp_ne_u32_e64 s[2:3], 1, 0 +; GFX8-NEXT: v_mbcnt_lo_u32_b32 v0, s2, 0 +; GFX8-NEXT: v_mbcnt_hi_u32_b32 v0, s3, v0 +; GFX8-NEXT: s_not_b64 exec, exec +; GFX8-NEXT: v_mov_b32_e32 v2, 0 +; GFX8-NEXT: s_not_b64 exec, exec +; GFX8-NEXT: s_or_saveexec_b64 s[4:5], -1 +; GFX8-NEXT: v_add_u32_dpp v2, vcc, v2, v2 row_shr:1 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX8-NEXT: s_nop 1 +; GFX8-NEXT: v_add_u32_dpp v2, vcc, v2, v2 row_shr:2 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX8-NEXT: s_nop 1 +; GFX8-NEXT: v_add_u32_dpp v2, vcc, v2, v2 row_shr:4 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX8-NEXT: s_nop 1 +; GFX8-NEXT: v_add_u32_dpp v2, vcc, v2, v2 row_shr:8 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX8-NEXT: s_nop 1 +; GFX8-NEXT: v_add_u32_dpp v2, vcc, v2, v2 row_bcast:15 row_mask:0xa bank_mask:0xf +; GFX8-NEXT: s_nop 1 +; GFX8-NEXT: v_add_u32_dpp v2, vcc, v2, v2 row_bcast:31 row_mask:0xc bank_mask:0xf +; GFX8-NEXT: v_readlane_b32 s2, v2, 63 +; GFX8-NEXT: s_nop 0 +; GFX8-NEXT: v_mov_b32_dpp v1, v2 wave_shr:1 row_mask:0xf bank_mask:0xf +; GFX8-NEXT: s_mov_b64 exec, s[4:5] +; GFX8-NEXT: v_cmp_eq_u32_e32 vcc, 0, v0 +; GFX8-NEXT: ; implicit-def: $vgpr0 +; GFX8-NEXT: s_and_saveexec_b64 s[4:5], vcc +; GFX8-NEXT: ; mask branch BB2_2 +; GFX8-NEXT: s_cbranch_execz BB2_2 +; GFX8-NEXT: BB2_1: +; GFX8-NEXT: v_mov_b32_e32 v0, local_var32 at abs32@lo +; GFX8-NEXT: v_mov_b32_e32 v3, s2 +; GFX8-NEXT: s_mov_b32 m0, -1 +; GFX8-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX8-NEXT: ds_add_rtn_u32 v0, v0, v3 +; GFX8-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX8-NEXT: buffer_wbinvl1_vol +; GFX8-NEXT: BB2_2: +; GFX8-NEXT: s_or_b64 exec, exec, s[4:5] +; GFX8-NEXT: v_readfirstlane_b32 s2, v0 +; GFX8-NEXT: v_mov_b32_e32 v0, v1 +; GFX8-NEXT: v_add_u32_e32 v0, vcc, s2, v0 +; GFX8-NEXT: s_mov_b32 s3, 0xf000 +; GFX8-NEXT: s_mov_b32 s2, -1 +; GFX8-NEXT: s_nop 0 +; GFX8-NEXT: s_waitcnt lgkmcnt(0) +; GFX8-NEXT: buffer_store_dword v0, off, s[0:3], 0 +; 
GFX8-NEXT: s_endpgm +; +; GFX9-LABEL: add_i32_varying: +; GFX9: ; %bb.0: ; %entry +; GFX9-NEXT: v_mov_b32_e32 v2, v0 +; GFX9-NEXT: s_or_saveexec_b64 s[2:3], -1 +; GFX9-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x24 +; GFX9-NEXT: v_mov_b32_e32 v1, 0 +; GFX9-NEXT: s_mov_b64 exec, s[2:3] +; GFX9-NEXT: v_cmp_ne_u32_e64 s[2:3], 1, 0 +; GFX9-NEXT: v_mbcnt_lo_u32_b32 v0, s2, 0 +; GFX9-NEXT: v_mbcnt_hi_u32_b32 v0, s3, v0 +; GFX9-NEXT: s_not_b64 exec, exec +; GFX9-NEXT: v_mov_b32_e32 v2, 0 +; GFX9-NEXT: s_not_b64 exec, exec +; GFX9-NEXT: s_or_saveexec_b64 s[4:5], -1 +; GFX9-NEXT: v_add_u32_dpp v2, v2, v2 row_shr:1 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX9-NEXT: s_nop 1 +; GFX9-NEXT: v_add_u32_dpp v2, v2, v2 row_shr:2 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX9-NEXT: s_nop 1 +; GFX9-NEXT: v_add_u32_dpp v2, v2, v2 row_shr:4 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX9-NEXT: s_nop 1 +; GFX9-NEXT: v_add_u32_dpp v2, v2, v2 row_shr:8 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX9-NEXT: s_nop 1 +; GFX9-NEXT: v_add_u32_dpp v2, v2, v2 row_bcast:15 row_mask:0xa bank_mask:0xf +; GFX9-NEXT: s_nop 1 +; GFX9-NEXT: v_add_u32_dpp v2, v2, v2 row_bcast:31 row_mask:0xc bank_mask:0xf +; GFX9-NEXT: v_readlane_b32 s2, v2, 63 +; GFX9-NEXT: s_nop 0 +; GFX9-NEXT: v_mov_b32_dpp v1, v2 wave_shr:1 row_mask:0xf bank_mask:0xf +; GFX9-NEXT: s_mov_b64 exec, s[4:5] +; GFX9-NEXT: v_cmp_eq_u32_e32 vcc, 0, v0 +; GFX9-NEXT: ; implicit-def: $vgpr0 +; GFX9-NEXT: s_and_saveexec_b64 s[4:5], vcc +; GFX9-NEXT: ; mask branch BB2_2 +; GFX9-NEXT: s_cbranch_execz BB2_2 +; GFX9-NEXT: BB2_1: +; GFX9-NEXT: v_mov_b32_e32 v0, local_var32 at abs32@lo +; GFX9-NEXT: v_mov_b32_e32 v3, s2 +; GFX9-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX9-NEXT: ds_add_rtn_u32 v0, v0, v3 +; GFX9-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX9-NEXT: buffer_wbinvl1_vol +; GFX9-NEXT: BB2_2: +; GFX9-NEXT: s_or_b64 exec, exec, s[4:5] +; GFX9-NEXT: v_readfirstlane_b32 s2, v0 +; GFX9-NEXT: v_mov_b32_e32 v0, v1 +; GFX9-NEXT: v_add_u32_e32 v0, s2, v0 
+; GFX9-NEXT: s_mov_b32 s3, 0xf000 +; GFX9-NEXT: s_mov_b32 s2, -1 +; GFX9-NEXT: s_nop 0 +; GFX9-NEXT: s_waitcnt lgkmcnt(0) +; GFX9-NEXT: buffer_store_dword v0, off, s[0:3], 0 +; GFX9-NEXT: s_endpgm +; +; GFX1064-LABEL: add_i32_varying: +; GFX1064: ; %bb.0: ; %entry +; GFX1064-NEXT: v_mov_b32_e32 v2, v0 +; GFX1064-NEXT: s_or_saveexec_b64 s[2:3], -1 +; GFX1064-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x24 +; GFX1064-NEXT: v_mov_b32_e32 v1, 0 +; GFX1064-NEXT: s_mov_b64 exec, s[2:3] +; GFX1064-NEXT: v_cmp_ne_u32_e64 s[2:3], 1, 0 +; GFX1064-NEXT: v_mbcnt_lo_u32_b32_e64 v0, s2, 0 +; GFX1064-NEXT: v_mbcnt_hi_u32_b32_e64 v0, s3, v0 +; GFX1064-NEXT: s_not_b64 exec, exec +; GFX1064-NEXT: v_mov_b32_e32 v2, 0 +; GFX1064-NEXT: s_not_b64 exec, exec +; GFX1064-NEXT: s_or_saveexec_b64 s[4:5], -1 +; GFX1064-NEXT: v_add_nc_u32_dpp v2, v2, v2 row_shr:1 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX1064-NEXT: v_add_nc_u32_dpp v2, v2, v2 row_shr:2 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX1064-NEXT: v_add_nc_u32_dpp v2, v2, v2 row_shr:4 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX1064-NEXT: v_add_nc_u32_dpp v2, v2, v2 row_shr:8 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX1064-NEXT: v_mov_b32_e32 v3, v2 +; GFX1064-NEXT: v_permlanex16_b32 v3, v3, -1, -1 +; GFX1064-NEXT: v_add_nc_u32_dpp v2, v3, v2 quad_perm:[0,1,2,3] row_mask:0xa bank_mask:0xf +; GFX1064-NEXT: v_readlane_b32 s2, v2, 31 +; GFX1064-NEXT: v_mov_b32_e32 v3, s2 +; GFX1064-NEXT: v_add_nc_u32_dpp v2, v3, v2 quad_perm:[0,1,2,3] row_mask:0xc bank_mask:0xf +; GFX1064-NEXT: v_readlane_b32 s2, v2, 15 +; GFX1064-NEXT: v_mov_b32_dpp v1, v2 row_shr:1 row_mask:0xf bank_mask:0xf +; GFX1064-NEXT: v_readlane_b32 s3, v2, 31 +; GFX1064-NEXT: v_readlane_b32 s6, v2, 47 +; GFX1064-NEXT: v_writelane_b32 v1, s2, 16 +; GFX1064-NEXT: s_mov_b32 s2, -1 +; GFX1064-NEXT: v_writelane_b32 v1, s3, 32 +; GFX1064-NEXT: v_readlane_b32 s3, v2, 63 +; GFX1064-NEXT: v_writelane_b32 v1, s6, 48 +; GFX1064-NEXT: s_mov_b64 exec, s[4:5] +; GFX1064-NEXT: 
v_cmp_eq_u32_e32 vcc, 0, v0 +; GFX1064-NEXT: ; implicit-def: $vgpr0 +; GFX1064-NEXT: s_and_saveexec_b64 s[4:5], vcc +; GFX1064-NEXT: ; mask branch BB2_2 +; GFX1064-NEXT: s_cbranch_execz BB2_2 +; GFX1064-NEXT: BB2_1: +; GFX1064-NEXT: v_mov_b32_e32 v0, local_var32 at abs32@lo +; GFX1064-NEXT: v_mov_b32_e32 v7, s3 +; GFX1064-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1064-NEXT: s_waitcnt_vscnt null, 0x0 +; GFX1064-NEXT: ds_add_rtn_u32 v0, v0, v7 +; GFX1064-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1064-NEXT: buffer_gl0_inv +; GFX1064-NEXT: buffer_gl1_inv +; GFX1064-NEXT: BB2_2: +; GFX1064-NEXT: v_nop +; GFX1064-NEXT: s_or_b64 exec, exec, s[4:5] +; GFX1064-NEXT: v_readfirstlane_b32 s3, v0 +; GFX1064-NEXT: v_mov_b32_e32 v0, v1 +; GFX1064-NEXT: v_add_nc_u32_e32 v0, s3, v0 +; GFX1064-NEXT: s_mov_b32 s3, 0x31016000 +; GFX1064-NEXT: s_nop 1 +; GFX1064-NEXT: s_waitcnt lgkmcnt(0) +; GFX1064-NEXT: buffer_store_dword v0, off, s[0:3], 0 +; GFX1064-NEXT: s_endpgm +; +; GFX1032-LABEL: add_i32_varying: +; GFX1032: ; %bb.0: ; %entry +; GFX1032-NEXT: ; implicit-def: $vcc_hi +; GFX1032-NEXT: v_mov_b32_e32 v2, v0 +; GFX1032-NEXT: s_or_saveexec_b32 s2, -1 +; GFX1032-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x24 +; GFX1032-NEXT: v_mov_b32_e32 v1, 0 +; GFX1032-NEXT: s_mov_b32 exec_lo, s2 +; GFX1032-NEXT: v_cmp_ne_u32_e64 s2, 1, 0 +; GFX1032-NEXT: v_mbcnt_lo_u32_b32_e64 v0, s2, 0 +; GFX1032-NEXT: s_not_b32 exec_lo, exec_lo +; GFX1032-NEXT: v_mov_b32_e32 v2, 0 +; GFX1032-NEXT: s_not_b32 exec_lo, exec_lo +; GFX1032-NEXT: s_or_saveexec_b32 s4, -1 +; GFX1032-NEXT: v_add_nc_u32_dpp v2, v2, v2 row_shr:1 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX1032-NEXT: s_mov_b32 s2, -1 +; GFX1032-NEXT: v_add_nc_u32_dpp v2, v2, v2 row_shr:2 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX1032-NEXT: v_add_nc_u32_dpp v2, v2, v2 row_shr:4 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX1032-NEXT: v_add_nc_u32_dpp v2, v2, v2 row_shr:8 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX1032-NEXT: v_mov_b32_e32 v3, v2 +; 
GFX1032-NEXT: v_permlanex16_b32 v3, v3, -1, -1 +; GFX1032-NEXT: v_add_nc_u32_dpp v2, v3, v2 quad_perm:[0,1,2,3] row_mask:0xa bank_mask:0xf +; GFX1032-NEXT: v_readlane_b32 s3, v2, 31 +; GFX1032-NEXT: v_mov_b32_dpp v1, v2 row_shr:1 row_mask:0xf bank_mask:0xf +; GFX1032-NEXT: v_readlane_b32 s5, v2, 15 +; GFX1032-NEXT: v_writelane_b32 v1, s5, 16 +; GFX1032-NEXT: s_mov_b32 exec_lo, s4 +; GFX1032-NEXT: v_cmp_eq_u32_e32 vcc_lo, 0, v0 +; GFX1032-NEXT: ; implicit-def: $vgpr0 +; GFX1032-NEXT: s_and_saveexec_b32 s4, vcc_lo +; GFX1032-NEXT: ; mask branch BB2_2 +; GFX1032-NEXT: s_cbranch_execz BB2_2 +; GFX1032-NEXT: BB2_1: +; GFX1032-NEXT: v_mov_b32_e32 v0, local_var32 at abs32@lo +; GFX1032-NEXT: v_mov_b32_e32 v7, s3 +; GFX1032-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1032-NEXT: s_waitcnt_vscnt null, 0x0 +; GFX1032-NEXT: ds_add_rtn_u32 v0, v0, v7 +; GFX1032-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1032-NEXT: buffer_gl0_inv +; GFX1032-NEXT: buffer_gl1_inv +; GFX1032-NEXT: BB2_2: +; GFX1032-NEXT: v_nop +; GFX1032-NEXT: s_or_b32 exec_lo, exec_lo, s4 +; GFX1032-NEXT: v_readfirstlane_b32 s3, v0 +; GFX1032-NEXT: v_mov_b32_e32 v0, v1 +; GFX1032-NEXT: v_add_nc_u32_e32 v0, s3, v0 +; GFX1032-NEXT: s_mov_b32 s3, 0x31016000 +; GFX1032-NEXT: s_nop 1 +; GFX1032-NEXT: s_waitcnt lgkmcnt(0) +; GFX1032-NEXT: buffer_store_dword v0, off, s[0:3], 0 +; GFX1032-NEXT: s_endpgm entry: %lane = call i32 @llvm.amdgcn.workitem.id.x() %old = atomicrmw add i32 addrspace(3)* @local_var32, i32 %lane acq_rel @@ -67,64 +610,241 @@ entry: } define amdgpu_kernel void @add_i32_varying_gfx1032(i32 addrspace(1)* %out) { +; +; +; GFX7LESS-LABEL: add_i32_varying_gfx1032: +; GFX7LESS: ; %bb.0: ; %entry +; GFX7LESS-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x9 +; GFX7LESS-NEXT: v_mov_b32_e32 v1, local_var32 at abs32@lo +; GFX7LESS-NEXT: s_mov_b32 m0, -1 +; GFX7LESS-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX7LESS-NEXT: ds_add_rtn_u32 v0, v1, v0 +; GFX7LESS-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX7LESS-NEXT: buffer_wbinvl1 
+; GFX7LESS-NEXT: s_mov_b32 s3, 0xf000 +; GFX7LESS-NEXT: s_mov_b32 s2, -1 +; GFX7LESS-NEXT: buffer_store_dword v0, off, s[0:3], 0 +; GFX7LESS-NEXT: s_endpgm +; +; GFX8-LABEL: add_i32_varying_gfx1032: +; GFX8: ; %bb.0: ; %entry +; GFX8-NEXT: v_mov_b32_e32 v2, v0 +; GFX8-NEXT: s_or_saveexec_b64 s[2:3], -1 +; GFX8-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x24 +; GFX8-NEXT: v_mov_b32_e32 v1, 0 +; GFX8-NEXT: s_mov_b64 exec, s[2:3] +; GFX8-NEXT: v_cmp_ne_u32_e64 s[2:3], 1, 0 +; GFX8-NEXT: v_mbcnt_lo_u32_b32 v0, s2, 0 +; GFX8-NEXT: v_mbcnt_hi_u32_b32 v0, s3, v0 +; GFX8-NEXT: s_not_b64 exec, exec +; GFX8-NEXT: v_mov_b32_e32 v2, 0 +; GFX8-NEXT: s_not_b64 exec, exec +; GFX8-NEXT: s_or_saveexec_b64 s[4:5], -1 +; GFX8-NEXT: v_add_u32_dpp v2, vcc, v2, v2 row_shr:1 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX8-NEXT: s_nop 1 +; GFX8-NEXT: v_add_u32_dpp v2, vcc, v2, v2 row_shr:2 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX8-NEXT: s_nop 1 +; GFX8-NEXT: v_add_u32_dpp v2, vcc, v2, v2 row_shr:4 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX8-NEXT: s_nop 1 +; GFX8-NEXT: v_add_u32_dpp v2, vcc, v2, v2 row_shr:8 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX8-NEXT: s_nop 1 +; GFX8-NEXT: v_add_u32_dpp v2, vcc, v2, v2 row_bcast:15 row_mask:0xa bank_mask:0xf +; GFX8-NEXT: s_nop 1 +; GFX8-NEXT: v_add_u32_dpp v2, vcc, v2, v2 row_bcast:31 row_mask:0xc bank_mask:0xf +; GFX8-NEXT: v_readlane_b32 s2, v2, 63 +; GFX8-NEXT: s_nop 0 +; GFX8-NEXT: v_mov_b32_dpp v1, v2 wave_shr:1 row_mask:0xf bank_mask:0xf +; GFX8-NEXT: s_mov_b64 exec, s[4:5] +; GFX8-NEXT: v_cmp_eq_u32_e32 vcc, 0, v0 +; GFX8-NEXT: ; implicit-def: $vgpr0 +; GFX8-NEXT: s_and_saveexec_b64 s[4:5], vcc +; GFX8-NEXT: ; mask branch BB3_2 +; GFX8-NEXT: s_cbranch_execz BB3_2 +; GFX8-NEXT: BB3_1: +; GFX8-NEXT: v_mov_b32_e32 v0, local_var32 at abs32@lo +; GFX8-NEXT: v_mov_b32_e32 v3, s2 +; GFX8-NEXT: s_mov_b32 m0, -1 +; GFX8-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX8-NEXT: ds_add_rtn_u32 v0, v0, v3 +; GFX8-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; 
GFX8-NEXT: buffer_wbinvl1_vol +; GFX8-NEXT: BB3_2: +; GFX8-NEXT: s_or_b64 exec, exec, s[4:5] +; GFX8-NEXT: v_readfirstlane_b32 s2, v0 +; GFX8-NEXT: v_mov_b32_e32 v0, v1 +; GFX8-NEXT: v_add_u32_e32 v0, vcc, s2, v0 +; GFX8-NEXT: s_mov_b32 s3, 0xf000 +; GFX8-NEXT: s_mov_b32 s2, -1 +; GFX8-NEXT: s_nop 0 +; GFX8-NEXT: s_waitcnt lgkmcnt(0) +; GFX8-NEXT: buffer_store_dword v0, off, s[0:3], 0 +; GFX8-NEXT: s_endpgm +; +; GFX9-LABEL: add_i32_varying_gfx1032: +; GFX9: ; %bb.0: ; %entry +; GFX9-NEXT: v_mov_b32_e32 v2, v0 +; GFX9-NEXT: s_or_saveexec_b64 s[2:3], -1 +; GFX9-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x24 +; GFX9-NEXT: v_mov_b32_e32 v1, 0 +; GFX9-NEXT: s_mov_b64 exec, s[2:3] +; GFX9-NEXT: v_cmp_ne_u32_e64 s[2:3], 1, 0 +; GFX9-NEXT: v_mbcnt_lo_u32_b32 v0, s2, 0 +; GFX9-NEXT: v_mbcnt_hi_u32_b32 v0, s3, v0 +; GFX9-NEXT: s_not_b64 exec, exec +; GFX9-NEXT: v_mov_b32_e32 v2, 0 +; GFX9-NEXT: s_not_b64 exec, exec +; GFX9-NEXT: s_or_saveexec_b64 s[4:5], -1 +; GFX9-NEXT: v_add_u32_dpp v2, v2, v2 row_shr:1 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX9-NEXT: s_nop 1 +; GFX9-NEXT: v_add_u32_dpp v2, v2, v2 row_shr:2 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX9-NEXT: s_nop 1 +; GFX9-NEXT: v_add_u32_dpp v2, v2, v2 row_shr:4 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX9-NEXT: s_nop 1 +; GFX9-NEXT: v_add_u32_dpp v2, v2, v2 row_shr:8 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX9-NEXT: s_nop 1 +; GFX9-NEXT: v_add_u32_dpp v2, v2, v2 row_bcast:15 row_mask:0xa bank_mask:0xf +; GFX9-NEXT: s_nop 1 +; GFX9-NEXT: v_add_u32_dpp v2, v2, v2 row_bcast:31 row_mask:0xc bank_mask:0xf +; GFX9-NEXT: v_readlane_b32 s2, v2, 63 +; GFX9-NEXT: s_nop 0 +; GFX9-NEXT: v_mov_b32_dpp v1, v2 wave_shr:1 row_mask:0xf bank_mask:0xf +; GFX9-NEXT: s_mov_b64 exec, s[4:5] +; GFX9-NEXT: v_cmp_eq_u32_e32 vcc, 0, v0 +; GFX9-NEXT: ; implicit-def: $vgpr0 +; GFX9-NEXT: s_and_saveexec_b64 s[4:5], vcc +; GFX9-NEXT: ; mask branch BB3_2 +; GFX9-NEXT: s_cbranch_execz BB3_2 +; GFX9-NEXT: BB3_1: +; GFX9-NEXT: v_mov_b32_e32 
v0, local_var32 at abs32@lo +; GFX9-NEXT: v_mov_b32_e32 v3, s2 +; GFX9-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX9-NEXT: ds_add_rtn_u32 v0, v0, v3 +; GFX9-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX9-NEXT: buffer_wbinvl1_vol +; GFX9-NEXT: BB3_2: +; GFX9-NEXT: s_or_b64 exec, exec, s[4:5] +; GFX9-NEXT: v_readfirstlane_b32 s2, v0 +; GFX9-NEXT: v_mov_b32_e32 v0, v1 +; GFX9-NEXT: v_add_u32_e32 v0, s2, v0 +; GFX9-NEXT: s_mov_b32 s3, 0xf000 +; GFX9-NEXT: s_mov_b32 s2, -1 +; GFX9-NEXT: s_nop 0 +; GFX9-NEXT: s_waitcnt lgkmcnt(0) +; GFX9-NEXT: buffer_store_dword v0, off, s[0:3], 0 +; GFX9-NEXT: s_endpgm +; +; GFX1064-LABEL: add_i32_varying_gfx1032: +; GFX1064: ; %bb.0: ; %entry +; GFX1064-NEXT: v_mov_b32_e32 v2, v0 +; GFX1064-NEXT: s_or_saveexec_b64 s[2:3], -1 +; GFX1064-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x24 +; GFX1064-NEXT: v_mov_b32_e32 v1, 0 +; GFX1064-NEXT: s_mov_b64 exec, s[2:3] +; GFX1064-NEXT: v_cmp_ne_u32_e64 s[2:3], 1, 0 +; GFX1064-NEXT: v_mbcnt_lo_u32_b32_e64 v0, s2, 0 +; GFX1064-NEXT: v_mbcnt_hi_u32_b32_e64 v0, s3, v0 +; GFX1064-NEXT: s_not_b64 exec, exec +; GFX1064-NEXT: v_mov_b32_e32 v2, 0 +; GFX1064-NEXT: s_not_b64 exec, exec +; GFX1064-NEXT: s_or_saveexec_b64 s[4:5], -1 +; GFX1064-NEXT: v_add_nc_u32_dpp v2, v2, v2 row_shr:1 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX1064-NEXT: v_add_nc_u32_dpp v2, v2, v2 row_shr:2 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX1064-NEXT: v_add_nc_u32_dpp v2, v2, v2 row_shr:4 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX1064-NEXT: v_add_nc_u32_dpp v2, v2, v2 row_shr:8 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX1064-NEXT: v_mov_b32_e32 v3, v2 +; GFX1064-NEXT: v_permlanex16_b32 v3, v3, -1, -1 +; GFX1064-NEXT: v_add_nc_u32_dpp v2, v3, v2 quad_perm:[0,1,2,3] row_mask:0xa bank_mask:0xf +; GFX1064-NEXT: v_readlane_b32 s2, v2, 31 +; GFX1064-NEXT: v_mov_b32_e32 v3, s2 +; GFX1064-NEXT: v_add_nc_u32_dpp v2, v3, v2 quad_perm:[0,1,2,3] row_mask:0xc bank_mask:0xf +; GFX1064-NEXT: v_readlane_b32 s2, v2, 15 +; GFX1064-NEXT: 
v_mov_b32_dpp v1, v2 row_shr:1 row_mask:0xf bank_mask:0xf +; GFX1064-NEXT: v_readlane_b32 s3, v2, 31 +; GFX1064-NEXT: v_readlane_b32 s6, v2, 47 +; GFX1064-NEXT: v_writelane_b32 v1, s2, 16 +; GFX1064-NEXT: s_mov_b32 s2, -1 +; GFX1064-NEXT: v_writelane_b32 v1, s3, 32 +; GFX1064-NEXT: v_readlane_b32 s3, v2, 63 +; GFX1064-NEXT: v_writelane_b32 v1, s6, 48 +; GFX1064-NEXT: s_mov_b64 exec, s[4:5] +; GFX1064-NEXT: v_cmp_eq_u32_e32 vcc, 0, v0 +; GFX1064-NEXT: ; implicit-def: $vgpr0 +; GFX1064-NEXT: s_and_saveexec_b64 s[4:5], vcc +; GFX1064-NEXT: ; mask branch BB3_2 +; GFX1064-NEXT: s_cbranch_execz BB3_2 +; GFX1064-NEXT: BB3_1: +; GFX1064-NEXT: v_mov_b32_e32 v0, local_var32 at abs32@lo +; GFX1064-NEXT: v_mov_b32_e32 v7, s3 +; GFX1064-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1064-NEXT: s_waitcnt_vscnt null, 0x0 +; GFX1064-NEXT: ds_add_rtn_u32 v0, v0, v7 +; GFX1064-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1064-NEXT: buffer_gl0_inv +; GFX1064-NEXT: buffer_gl1_inv +; GFX1064-NEXT: BB3_2: +; GFX1064-NEXT: v_nop +; GFX1064-NEXT: s_or_b64 exec, exec, s[4:5] +; GFX1064-NEXT: v_readfirstlane_b32 s3, v0 +; GFX1064-NEXT: v_mov_b32_e32 v0, v1 +; GFX1064-NEXT: v_add_nc_u32_e32 v0, s3, v0 +; GFX1064-NEXT: s_mov_b32 s3, 0x31016000 +; GFX1064-NEXT: s_nop 1 +; GFX1064-NEXT: s_waitcnt lgkmcnt(0) +; GFX1064-NEXT: buffer_store_dword v0, off, s[0:3], 0 +; GFX1064-NEXT: s_endpgm +; ; GFX1032-LABEL: add_i32_varying_gfx1032: -; GFX1032: v_mov_b32_e32 v2, v0 -; GFX1032: s_or_saveexec_b32 s2, -1 -; GFX1032: s_load_dwordx2 s[0:1], s[0:1], 0x24 -; GFX1032: v_mov_b32_e32 v1, 0 -; GFX1032: s_mov_b32 exec_lo, s2 -; GFX1032: v_cmp_ne_u32_e64 s2, 1, 0 -; GFX1032: v_mbcnt_lo_u32_b32_e64 v0, s2, 0 -; GFX1032: s_not_b32 exec_lo, exec_lo -; GFX1032: v_mov_b32_e32 v2, 0 -; GFX1032: s_not_b32 exec_lo, exec_lo -; GFX1032: s_or_saveexec_b32 s4, -1 -; GFX1032: v_mov_b32_e32 v3, v1 -; GFX1032: v_mov_b32_e32 v4, v1 -; GFX1032: s_mov_b32 s2, -1 -; GFX1032: v_mov_b32_dpp v3, v2 row_shr:1 row_mask:0xf bank_mask:0xf -; 
GFX1032: v_add_nc_u32_e32 v2, v2, v3 -; GFX1032: v_mov_b32_e32 v3, v1 -; GFX1032: v_mov_b32_dpp v3, v2 row_shr:2 row_mask:0xf bank_mask:0xf -; GFX1032: v_add_nc_u32_e32 v2, v2, v3 -; GFX1032: v_mov_b32_e32 v3, v1 -; GFX1032: v_mov_b32_dpp v3, v2 row_shr:4 row_mask:0xf bank_mask:0xf -; GFX1032: v_add_nc_u32_e32 v2, v2, v3 -; GFX1032: v_mov_b32_e32 v3, v1 -; GFX1032: v_mov_b32_dpp v3, v2 row_shr:8 row_mask:0xf bank_mask:0xf -; GFX1032: v_add_nc_u32_e32 v2, v2, v3 -; GFX1032: v_mov_b32_e32 v3, v2 -; GFX1032: v_permlanex16_b32 v3, v3, -1, -1 -; GFX1032: v_mov_b32_dpp v4, v3 quad_perm:[0,1,2,3] row_mask:0xa bank_mask:0xf -; GFX1032: v_add_nc_u32_e32 v2, v2, v4 -; GFX1032: v_readlane_b32 s3, v2, 31 -; GFX1032: v_mov_b32_dpp v1, v2 row_shr:1 row_mask:0xf bank_mask:0xf -; GFX1032: v_readlane_b32 s5, v2, 15 -; GFX1032: v_writelane_b32 v1, s5, 16 -; GFX1032: s_mov_b32 exec_lo, s4 -; GFX1032: v_cmp_eq_u32_e32 vcc_lo, 0, v0 -; GFX1032: s_and_saveexec_b32 s4, vcc_lo -; GFX1032: s_cbranch_execz BB3_2 -; GFX1032: BB3_1: -; GFX1032: v_mov_b32_e32 v0, local_var32 at abs32@lo -; GFX1032: v_mov_b32_e32 v5, s3 -; GFX1032: s_waitcnt vmcnt(0) lgkmcnt(0) -; GFX1032: s_waitcnt_vscnt null, 0x0 -; GFX1032: ds_add_rtn_u32 v0, v0, v5 -; GFX1032: s_waitcnt vmcnt(0) lgkmcnt(0) -; GFX1032: buffer_gl0_inv -; GFX1032: buffer_gl1_inv -; GFX1032: BB3_2: -; GFX1032: v_nop -; GFX1032: s_or_b32 exec_lo, exec_lo, s4 -; GFX1032: v_readfirstlane_b32 s3, v0 -; GFX1032: v_mov_b32_e32 v0, v1 -; GFX1032: v_add_nc_u32_e32 v0, s3, v0 -; GFX1032: s_mov_b32 s3, 0x31016000 -; GFX1032: s_nop 1 -; GFX1032: s_waitcnt lgkmcnt(0) -; GFX1032: buffer_store_dword v0, off, s[0:3], 0 -; GFX1032: s_endpgm +; GFX1032: ; %bb.0: ; %entry +; GFX1032-NEXT: ; implicit-def: $vcc_hi +; GFX1032-NEXT: v_mov_b32_e32 v2, v0 +; GFX1032-NEXT: s_or_saveexec_b32 s2, -1 +; GFX1032-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x24 +; GFX1032-NEXT: v_mov_b32_e32 v1, 0 +; GFX1032-NEXT: s_mov_b32 exec_lo, s2 +; GFX1032-NEXT: v_cmp_ne_u32_e64 s2, 1, 0 +; 
GFX1032-NEXT: v_mbcnt_lo_u32_b32_e64 v0, s2, 0 +; GFX1032-NEXT: s_not_b32 exec_lo, exec_lo +; GFX1032-NEXT: v_mov_b32_e32 v2, 0 +; GFX1032-NEXT: s_not_b32 exec_lo, exec_lo +; GFX1032-NEXT: s_or_saveexec_b32 s4, -1 +; GFX1032-NEXT: v_add_nc_u32_dpp v2, v2, v2 row_shr:1 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX1032-NEXT: s_mov_b32 s2, -1 +; GFX1032-NEXT: v_add_nc_u32_dpp v2, v2, v2 row_shr:2 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX1032-NEXT: v_add_nc_u32_dpp v2, v2, v2 row_shr:4 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX1032-NEXT: v_add_nc_u32_dpp v2, v2, v2 row_shr:8 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX1032-NEXT: v_mov_b32_e32 v3, v2 +; GFX1032-NEXT: v_permlanex16_b32 v3, v3, -1, -1 +; GFX1032-NEXT: v_add_nc_u32_dpp v2, v3, v2 quad_perm:[0,1,2,3] row_mask:0xa bank_mask:0xf +; GFX1032-NEXT: v_readlane_b32 s3, v2, 31 +; GFX1032-NEXT: v_mov_b32_dpp v1, v2 row_shr:1 row_mask:0xf bank_mask:0xf +; GFX1032-NEXT: v_readlane_b32 s5, v2, 15 +; GFX1032-NEXT: v_writelane_b32 v1, s5, 16 +; GFX1032-NEXT: s_mov_b32 exec_lo, s4 +; GFX1032-NEXT: v_cmp_eq_u32_e32 vcc_lo, 0, v0 +; GFX1032-NEXT: ; implicit-def: $vgpr0 +; GFX1032-NEXT: s_and_saveexec_b32 s4, vcc_lo +; GFX1032-NEXT: ; mask branch BB3_2 +; GFX1032-NEXT: s_cbranch_execz BB3_2 +; GFX1032-NEXT: BB3_1: +; GFX1032-NEXT: v_mov_b32_e32 v0, local_var32 at abs32@lo +; GFX1032-NEXT: v_mov_b32_e32 v7, s3 +; GFX1032-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1032-NEXT: s_waitcnt_vscnt null, 0x0 +; GFX1032-NEXT: ds_add_rtn_u32 v0, v0, v7 +; GFX1032-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1032-NEXT: buffer_gl0_inv +; GFX1032-NEXT: buffer_gl1_inv +; GFX1032-NEXT: BB3_2: +; GFX1032-NEXT: v_nop +; GFX1032-NEXT: s_or_b32 exec_lo, exec_lo, s4 +; GFX1032-NEXT: v_readfirstlane_b32 s3, v0 +; GFX1032-NEXT: v_mov_b32_e32 v0, v1 +; GFX1032-NEXT: v_add_nc_u32_e32 v0, s3, v0 +; GFX1032-NEXT: s_mov_b32 s3, 0x31016000 +; GFX1032-NEXT: s_nop 1 +; GFX1032-NEXT: s_waitcnt lgkmcnt(0) +; GFX1032-NEXT: buffer_store_dword v0, off, 
s[0:3], 0 +; GFX1032-NEXT: s_endpgm entry: %lane = call i32 @llvm.amdgcn.workitem.id.x() %old = atomicrmw add i32 addrspace(3)* @local_var32, i32 %lane acq_rel @@ -133,74 +853,241 @@ entry: } define amdgpu_kernel void @add_i32_varying_gfx1064(i32 addrspace(1)* %out) { +; +; +; GFX7LESS-LABEL: add_i32_varying_gfx1064: +; GFX7LESS: ; %bb.0: ; %entry +; GFX7LESS-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x9 +; GFX7LESS-NEXT: v_mov_b32_e32 v1, local_var32 at abs32@lo +; GFX7LESS-NEXT: s_mov_b32 m0, -1 +; GFX7LESS-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX7LESS-NEXT: ds_add_rtn_u32 v0, v1, v0 +; GFX7LESS-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX7LESS-NEXT: buffer_wbinvl1 +; GFX7LESS-NEXT: s_mov_b32 s3, 0xf000 +; GFX7LESS-NEXT: s_mov_b32 s2, -1 +; GFX7LESS-NEXT: buffer_store_dword v0, off, s[0:3], 0 +; GFX7LESS-NEXT: s_endpgm +; +; GFX8-LABEL: add_i32_varying_gfx1064: +; GFX8: ; %bb.0: ; %entry +; GFX8-NEXT: v_mov_b32_e32 v2, v0 +; GFX8-NEXT: s_or_saveexec_b64 s[2:3], -1 +; GFX8-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x24 +; GFX8-NEXT: v_mov_b32_e32 v1, 0 +; GFX8-NEXT: s_mov_b64 exec, s[2:3] +; GFX8-NEXT: v_cmp_ne_u32_e64 s[2:3], 1, 0 +; GFX8-NEXT: v_mbcnt_lo_u32_b32 v0, s2, 0 +; GFX8-NEXT: v_mbcnt_hi_u32_b32 v0, s3, v0 +; GFX8-NEXT: s_not_b64 exec, exec +; GFX8-NEXT: v_mov_b32_e32 v2, 0 +; GFX8-NEXT: s_not_b64 exec, exec +; GFX8-NEXT: s_or_saveexec_b64 s[4:5], -1 +; GFX8-NEXT: v_add_u32_dpp v2, vcc, v2, v2 row_shr:1 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX8-NEXT: s_nop 1 +; GFX8-NEXT: v_add_u32_dpp v2, vcc, v2, v2 row_shr:2 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX8-NEXT: s_nop 1 +; GFX8-NEXT: v_add_u32_dpp v2, vcc, v2, v2 row_shr:4 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX8-NEXT: s_nop 1 +; GFX8-NEXT: v_add_u32_dpp v2, vcc, v2, v2 row_shr:8 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX8-NEXT: s_nop 1 +; GFX8-NEXT: v_add_u32_dpp v2, vcc, v2, v2 row_bcast:15 row_mask:0xa bank_mask:0xf +; GFX8-NEXT: s_nop 1 +; GFX8-NEXT: v_add_u32_dpp v2, vcc, v2, v2 
row_bcast:31 row_mask:0xc bank_mask:0xf +; GFX8-NEXT: v_readlane_b32 s2, v2, 63 +; GFX8-NEXT: s_nop 0 +; GFX8-NEXT: v_mov_b32_dpp v1, v2 wave_shr:1 row_mask:0xf bank_mask:0xf +; GFX8-NEXT: s_mov_b64 exec, s[4:5] +; GFX8-NEXT: v_cmp_eq_u32_e32 vcc, 0, v0 +; GFX8-NEXT: ; implicit-def: $vgpr0 +; GFX8-NEXT: s_and_saveexec_b64 s[4:5], vcc +; GFX8-NEXT: ; mask branch BB4_2 +; GFX8-NEXT: s_cbranch_execz BB4_2 +; GFX8-NEXT: BB4_1: +; GFX8-NEXT: v_mov_b32_e32 v0, local_var32 at abs32@lo +; GFX8-NEXT: v_mov_b32_e32 v3, s2 +; GFX8-NEXT: s_mov_b32 m0, -1 +; GFX8-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX8-NEXT: ds_add_rtn_u32 v0, v0, v3 +; GFX8-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX8-NEXT: buffer_wbinvl1_vol +; GFX8-NEXT: BB4_2: +; GFX8-NEXT: s_or_b64 exec, exec, s[4:5] +; GFX8-NEXT: v_readfirstlane_b32 s2, v0 +; GFX8-NEXT: v_mov_b32_e32 v0, v1 +; GFX8-NEXT: v_add_u32_e32 v0, vcc, s2, v0 +; GFX8-NEXT: s_mov_b32 s3, 0xf000 +; GFX8-NEXT: s_mov_b32 s2, -1 +; GFX8-NEXT: s_nop 0 +; GFX8-NEXT: s_waitcnt lgkmcnt(0) +; GFX8-NEXT: buffer_store_dword v0, off, s[0:3], 0 +; GFX8-NEXT: s_endpgm +; +; GFX9-LABEL: add_i32_varying_gfx1064: +; GFX9: ; %bb.0: ; %entry +; GFX9-NEXT: v_mov_b32_e32 v2, v0 +; GFX9-NEXT: s_or_saveexec_b64 s[2:3], -1 +; GFX9-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x24 +; GFX9-NEXT: v_mov_b32_e32 v1, 0 +; GFX9-NEXT: s_mov_b64 exec, s[2:3] +; GFX9-NEXT: v_cmp_ne_u32_e64 s[2:3], 1, 0 +; GFX9-NEXT: v_mbcnt_lo_u32_b32 v0, s2, 0 +; GFX9-NEXT: v_mbcnt_hi_u32_b32 v0, s3, v0 +; GFX9-NEXT: s_not_b64 exec, exec +; GFX9-NEXT: v_mov_b32_e32 v2, 0 +; GFX9-NEXT: s_not_b64 exec, exec +; GFX9-NEXT: s_or_saveexec_b64 s[4:5], -1 +; GFX9-NEXT: v_add_u32_dpp v2, v2, v2 row_shr:1 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX9-NEXT: s_nop 1 +; GFX9-NEXT: v_add_u32_dpp v2, v2, v2 row_shr:2 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX9-NEXT: s_nop 1 +; GFX9-NEXT: v_add_u32_dpp v2, v2, v2 row_shr:4 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX9-NEXT: s_nop 1 +; GFX9-NEXT: v_add_u32_dpp 
v2, v2, v2 row_shr:8 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX9-NEXT: s_nop 1 +; GFX9-NEXT: v_add_u32_dpp v2, v2, v2 row_bcast:15 row_mask:0xa bank_mask:0xf +; GFX9-NEXT: s_nop 1 +; GFX9-NEXT: v_add_u32_dpp v2, v2, v2 row_bcast:31 row_mask:0xc bank_mask:0xf +; GFX9-NEXT: v_readlane_b32 s2, v2, 63 +; GFX9-NEXT: s_nop 0 +; GFX9-NEXT: v_mov_b32_dpp v1, v2 wave_shr:1 row_mask:0xf bank_mask:0xf +; GFX9-NEXT: s_mov_b64 exec, s[4:5] +; GFX9-NEXT: v_cmp_eq_u32_e32 vcc, 0, v0 +; GFX9-NEXT: ; implicit-def: $vgpr0 +; GFX9-NEXT: s_and_saveexec_b64 s[4:5], vcc +; GFX9-NEXT: ; mask branch BB4_2 +; GFX9-NEXT: s_cbranch_execz BB4_2 +; GFX9-NEXT: BB4_1: +; GFX9-NEXT: v_mov_b32_e32 v0, local_var32 at abs32@lo +; GFX9-NEXT: v_mov_b32_e32 v3, s2 +; GFX9-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX9-NEXT: ds_add_rtn_u32 v0, v0, v3 +; GFX9-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX9-NEXT: buffer_wbinvl1_vol +; GFX9-NEXT: BB4_2: +; GFX9-NEXT: s_or_b64 exec, exec, s[4:5] +; GFX9-NEXT: v_readfirstlane_b32 s2, v0 +; GFX9-NEXT: v_mov_b32_e32 v0, v1 +; GFX9-NEXT: v_add_u32_e32 v0, s2, v0 +; GFX9-NEXT: s_mov_b32 s3, 0xf000 +; GFX9-NEXT: s_mov_b32 s2, -1 +; GFX9-NEXT: s_nop 0 +; GFX9-NEXT: s_waitcnt lgkmcnt(0) +; GFX9-NEXT: buffer_store_dword v0, off, s[0:3], 0 +; GFX9-NEXT: s_endpgm +; ; GFX1064-LABEL: add_i32_varying_gfx1064: -; GFX1064: v_mov_b32_e32 v2, v0 -; GFX1064: s_or_saveexec_b64 s[2:3], -1 -; GFX1064: s_load_dwordx2 s[0:1], s[0:1], 0x24 -; GFX1064: v_mov_b32_e32 v1, 0 -; GFX1064: s_mov_b64 exec, s[2:3] -; GFX1064: v_cmp_ne_u32_e64 s[2:3], 1, 0 -; GFX1064: v_mbcnt_lo_u32_b32_e64 v0, s2, 0 -; GFX1064: v_mbcnt_hi_u32_b32_e64 v0, s3, v0 -; GFX1064: s_not_b64 exec, exec -; GFX1064: v_mov_b32_e32 v2, 0 -; GFX1064: s_not_b64 exec, exec -; GFX1064: s_or_saveexec_b64 s[4:5], -1 -; GFX1064: v_mov_b32_e32 v3, v1 -; GFX1064: v_mov_b32_e32 v4, v1 -; GFX1064: s_mov_b32 s2, -1 -; GFX1064: v_mov_b32_dpp v3, v2 row_shr:1 row_mask:0xf bank_mask:0xf -; GFX1064: v_add_nc_u32_e32 v2, v2, v3 -; 
GFX1064: v_mov_b32_e32 v3, v1 -; GFX1064: v_mov_b32_dpp v3, v2 row_shr:2 row_mask:0xf bank_mask:0xf -; GFX1064: v_add_nc_u32_e32 v2, v2, v3 -; GFX1064: v_mov_b32_e32 v3, v1 -; GFX1064: v_mov_b32_dpp v3, v2 row_shr:4 row_mask:0xf bank_mask:0xf -; GFX1064: v_add_nc_u32_e32 v2, v2, v3 -; GFX1064: v_mov_b32_e32 v3, v1 -; GFX1064: v_mov_b32_dpp v3, v2 row_shr:8 row_mask:0xf bank_mask:0xf -; GFX1064: v_add_nc_u32_e32 v2, v2, v3 -; GFX1064: v_mov_b32_e32 v3, v2 -; GFX1064: v_permlanex16_b32 v3, v3, -1, -1 -; GFX1064: v_mov_b32_dpp v4, v3 quad_perm:[0,1,2,3] row_mask:0xa bank_mask:0xf -; GFX1064: v_add_nc_u32_e32 v2, v2, v4 -; GFX1064: v_mov_b32_e32 v4, v1 -; GFX1064: v_readlane_b32 s3, v2, 31 -; GFX1064: v_mov_b32_e32 v3, s3 -; GFX1064: v_mov_b32_dpp v4, v3 quad_perm:[0,1,2,3] row_mask:0xc bank_mask:0xf -; GFX1064: v_add_nc_u32_e32 v2, v2, v4 -; GFX1064: v_mov_b32_dpp v1, v2 row_shr:1 row_mask:0xf bank_mask:0xf -; GFX1064: v_readlane_b32 s3, v2, 15 -; GFX1064: v_readlane_b32 s6, v2, 31 -; GFX1064: v_writelane_b32 v1, s3, 16 -; GFX1064: v_readlane_b32 s3, v2, 63 -; GFX1064: v_writelane_b32 v1, s6, 32 -; GFX1064: v_readlane_b32 s6, v2, 47 -; GFX1064: v_writelane_b32 v1, s6, 48 -; GFX1064: s_mov_b64 exec, s[4:5] -; GFX1064: v_cmp_eq_u32_e32 vcc, 0, v0 -; GFX1064: s_and_saveexec_b64 s[4:5], vcc -; GFX1064: s_cbranch_execz BB4_2 -; GFX1064: BB4_1: -; GFX1064: v_mov_b32_e32 v0, local_var32 at abs32@lo -; GFX1064: v_mov_b32_e32 v5, s3 -; GFX1064: s_waitcnt vmcnt(0) lgkmcnt(0) -; GFX1064: s_waitcnt_vscnt null, 0x0 -; GFX1064: ds_add_rtn_u32 v0, v0, v5 -; GFX1064: s_waitcnt vmcnt(0) lgkmcnt(0) -; GFX1064: buffer_gl0_inv -; GFX1064: buffer_gl1_inv -; GFX1064: BB4_2: -; GFX1064: v_nop -; GFX1064: s_or_b64 exec, exec, s[4:5] -; GFX1064: v_readfirstlane_b32 s3, v0 -; GFX1064: v_mov_b32_e32 v0, v1 -; GFX1064: v_add_nc_u32_e32 v0, s3, v0 -; GFX1064: s_mov_b32 s3, 0x31016000 -; GFX1064: s_nop 1 -; GFX1064: s_waitcnt lgkmcnt(0) -; GFX1064: buffer_store_dword v0, off, s[0:3], 0 -; GFX1064: 
s_endpgm +; GFX1064: ; %bb.0: ; %entry +; GFX1064-NEXT: v_mov_b32_e32 v2, v0 +; GFX1064-NEXT: s_or_saveexec_b64 s[2:3], -1 +; GFX1064-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x24 +; GFX1064-NEXT: v_mov_b32_e32 v1, 0 +; GFX1064-NEXT: s_mov_b64 exec, s[2:3] +; GFX1064-NEXT: v_cmp_ne_u32_e64 s[2:3], 1, 0 +; GFX1064-NEXT: v_mbcnt_lo_u32_b32_e64 v0, s2, 0 +; GFX1064-NEXT: v_mbcnt_hi_u32_b32_e64 v0, s3, v0 +; GFX1064-NEXT: s_not_b64 exec, exec +; GFX1064-NEXT: v_mov_b32_e32 v2, 0 +; GFX1064-NEXT: s_not_b64 exec, exec +; GFX1064-NEXT: s_or_saveexec_b64 s[4:5], -1 +; GFX1064-NEXT: v_add_nc_u32_dpp v2, v2, v2 row_shr:1 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX1064-NEXT: v_add_nc_u32_dpp v2, v2, v2 row_shr:2 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX1064-NEXT: v_add_nc_u32_dpp v2, v2, v2 row_shr:4 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX1064-NEXT: v_add_nc_u32_dpp v2, v2, v2 row_shr:8 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX1064-NEXT: v_mov_b32_e32 v3, v2 +; GFX1064-NEXT: v_permlanex16_b32 v3, v3, -1, -1 +; GFX1064-NEXT: v_add_nc_u32_dpp v2, v3, v2 quad_perm:[0,1,2,3] row_mask:0xa bank_mask:0xf +; GFX1064-NEXT: v_readlane_b32 s2, v2, 31 +; GFX1064-NEXT: v_mov_b32_e32 v3, s2 +; GFX1064-NEXT: v_add_nc_u32_dpp v2, v3, v2 quad_perm:[0,1,2,3] row_mask:0xc bank_mask:0xf +; GFX1064-NEXT: v_readlane_b32 s2, v2, 15 +; GFX1064-NEXT: v_mov_b32_dpp v1, v2 row_shr:1 row_mask:0xf bank_mask:0xf +; GFX1064-NEXT: v_readlane_b32 s3, v2, 31 +; GFX1064-NEXT: v_readlane_b32 s6, v2, 47 +; GFX1064-NEXT: v_writelane_b32 v1, s2, 16 +; GFX1064-NEXT: s_mov_b32 s2, -1 +; GFX1064-NEXT: v_writelane_b32 v1, s3, 32 +; GFX1064-NEXT: v_readlane_b32 s3, v2, 63 +; GFX1064-NEXT: v_writelane_b32 v1, s6, 48 +; GFX1064-NEXT: s_mov_b64 exec, s[4:5] +; GFX1064-NEXT: v_cmp_eq_u32_e32 vcc, 0, v0 +; GFX1064-NEXT: ; implicit-def: $vgpr0 +; GFX1064-NEXT: s_and_saveexec_b64 s[4:5], vcc +; GFX1064-NEXT: ; mask branch BB4_2 +; GFX1064-NEXT: s_cbranch_execz BB4_2 +; GFX1064-NEXT: BB4_1: +; GFX1064-NEXT: 
v_mov_b32_e32 v0, local_var32 at abs32@lo +; GFX1064-NEXT: v_mov_b32_e32 v7, s3 +; GFX1064-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1064-NEXT: s_waitcnt_vscnt null, 0x0 +; GFX1064-NEXT: ds_add_rtn_u32 v0, v0, v7 +; GFX1064-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1064-NEXT: buffer_gl0_inv +; GFX1064-NEXT: buffer_gl1_inv +; GFX1064-NEXT: BB4_2: +; GFX1064-NEXT: v_nop +; GFX1064-NEXT: s_or_b64 exec, exec, s[4:5] +; GFX1064-NEXT: v_readfirstlane_b32 s3, v0 +; GFX1064-NEXT: v_mov_b32_e32 v0, v1 +; GFX1064-NEXT: v_add_nc_u32_e32 v0, s3, v0 +; GFX1064-NEXT: s_mov_b32 s3, 0x31016000 +; GFX1064-NEXT: s_nop 1 +; GFX1064-NEXT: s_waitcnt lgkmcnt(0) +; GFX1064-NEXT: buffer_store_dword v0, off, s[0:3], 0 +; GFX1064-NEXT: s_endpgm +; +; GFX1032-LABEL: add_i32_varying_gfx1064: +; GFX1032: ; %bb.0: ; %entry +; GFX1032-NEXT: ; implicit-def: $vcc_hi +; GFX1032-NEXT: v_mov_b32_e32 v2, v0 +; GFX1032-NEXT: s_or_saveexec_b32 s2, -1 +; GFX1032-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x24 +; GFX1032-NEXT: v_mov_b32_e32 v1, 0 +; GFX1032-NEXT: s_mov_b32 exec_lo, s2 +; GFX1032-NEXT: v_cmp_ne_u32_e64 s2, 1, 0 +; GFX1032-NEXT: v_mbcnt_lo_u32_b32_e64 v0, s2, 0 +; GFX1032-NEXT: s_not_b32 exec_lo, exec_lo +; GFX1032-NEXT: v_mov_b32_e32 v2, 0 +; GFX1032-NEXT: s_not_b32 exec_lo, exec_lo +; GFX1032-NEXT: s_or_saveexec_b32 s4, -1 +; GFX1032-NEXT: v_add_nc_u32_dpp v2, v2, v2 row_shr:1 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX1032-NEXT: s_mov_b32 s2, -1 +; GFX1032-NEXT: v_add_nc_u32_dpp v2, v2, v2 row_shr:2 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX1032-NEXT: v_add_nc_u32_dpp v2, v2, v2 row_shr:4 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX1032-NEXT: v_add_nc_u32_dpp v2, v2, v2 row_shr:8 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX1032-NEXT: v_mov_b32_e32 v3, v2 +; GFX1032-NEXT: v_permlanex16_b32 v3, v3, -1, -1 +; GFX1032-NEXT: v_add_nc_u32_dpp v2, v3, v2 quad_perm:[0,1,2,3] row_mask:0xa bank_mask:0xf +; GFX1032-NEXT: v_readlane_b32 s3, v2, 31 +; GFX1032-NEXT: v_mov_b32_dpp v1, v2 row_shr:1 
row_mask:0xf bank_mask:0xf +; GFX1032-NEXT: v_readlane_b32 s5, v2, 15 +; GFX1032-NEXT: v_writelane_b32 v1, s5, 16 +; GFX1032-NEXT: s_mov_b32 exec_lo, s4 +; GFX1032-NEXT: v_cmp_eq_u32_e32 vcc_lo, 0, v0 +; GFX1032-NEXT: ; implicit-def: $vgpr0 +; GFX1032-NEXT: s_and_saveexec_b32 s4, vcc_lo +; GFX1032-NEXT: ; mask branch BB4_2 +; GFX1032-NEXT: s_cbranch_execz BB4_2 +; GFX1032-NEXT: BB4_1: +; GFX1032-NEXT: v_mov_b32_e32 v0, local_var32 at abs32@lo +; GFX1032-NEXT: v_mov_b32_e32 v7, s3 +; GFX1032-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1032-NEXT: s_waitcnt_vscnt null, 0x0 +; GFX1032-NEXT: ds_add_rtn_u32 v0, v0, v7 +; GFX1032-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1032-NEXT: buffer_gl0_inv +; GFX1032-NEXT: buffer_gl1_inv +; GFX1032-NEXT: BB4_2: +; GFX1032-NEXT: v_nop +; GFX1032-NEXT: s_or_b32 exec_lo, exec_lo, s4 +; GFX1032-NEXT: v_readfirstlane_b32 s3, v0 +; GFX1032-NEXT: v_mov_b32_e32 v0, v1 +; GFX1032-NEXT: v_add_nc_u32_e32 v0, s3, v0 +; GFX1032-NEXT: s_mov_b32 s3, 0x31016000 +; GFX1032-NEXT: s_nop 1 +; GFX1032-NEXT: s_waitcnt lgkmcnt(0) +; GFX1032-NEXT: buffer_store_dword v0, off, s[0:3], 0 +; GFX1032-NEXT: s_endpgm entry: %lane = call i32 @llvm.amdgcn.workitem.id.x() %old = atomicrmw add i32 addrspace(3)* @local_var32, i32 %lane acq_rel @@ -208,46 +1095,495 @@ entry: ret void } -; GCN-LABEL: add_i64_constant: -; GCN32: v_cmp_ne_u32_e64 s[[exec_lo:[0-9]+]], 1, 0 -; GCN64: v_cmp_ne_u32_e64 s{{\[}}[[exec_lo:[0-9]+]]:[[exec_hi:[0-9]+]]{{\]}}, 1, 0 -; GCN: v_mbcnt_lo_u32_b32{{(_e[0-9]+)?}} v[[mbcnt:[0-9]+]], s[[exec_lo]], 0 -; GCN64: v_mbcnt_hi_u32_b32{{(_e[0-9]+)?}} v[[mbcnt]], s[[exec_hi]], v[[mbcnt]] -; GCN: v_cmp_eq_u32{{(_e[0-9]+)?}} vcc{{(_lo)?}}, 0, v[[mbcnt]] -; GCN32: s_bcnt1_i32_b32 s[[popcount:[0-9]+]], s[[exec_lo]] -; GCN64: s_bcnt1_i32_b64 s[[popcount:[0-9]+]], s{{\[}}[[exec_lo]]:[[exec_hi]]{{\]}} -; GCN: v_mul_hi_u32_u24{{(_e[0-9]+)?}} v[[value_hi:[0-9]+]], s[[popcount]], 5 -; GCN: v_mul_u32_u24{{(_e[0-9]+)?}} v[[value_lo:[0-9]+]], s[[popcount]], 5 -; 
GCN: ds_add_rtn_u64 v{{\[}}{{[0-9]+}}:{{[0-9]+}}{{\]}}, v{{[0-9]+}}, v{{\[}}[[value_lo]]:[[value_hi]]{{\]}} define amdgpu_kernel void @add_i64_constant(i64 addrspace(1)* %out) { +; +; +; GFX7LESS-LABEL: add_i64_constant: +; GFX7LESS: ; %bb.0: ; %entry +; GFX7LESS-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x9 +; GFX7LESS-NEXT: v_cmp_ne_u32_e64 s[4:5], 1, 0 +; GFX7LESS-NEXT: v_mbcnt_lo_u32_b32_e64 v0, s4, 0 +; GFX7LESS-NEXT: v_mbcnt_hi_u32_b32_e32 v0, s5, v0 +; GFX7LESS-NEXT: v_cmp_eq_u32_e32 vcc, 0, v0 +; GFX7LESS-NEXT: ; implicit-def: $vgpr1_vgpr2 +; GFX7LESS-NEXT: s_and_saveexec_b64 s[2:3], vcc +; GFX7LESS-NEXT: ; mask branch BB5_2 +; GFX7LESS-NEXT: s_cbranch_execz BB5_2 +; GFX7LESS-NEXT: BB5_1: +; GFX7LESS-NEXT: s_bcnt1_i32_b64 s4, s[4:5] +; GFX7LESS-NEXT: v_mov_b32_e32 v3, local_var64 at abs32@lo +; GFX7LESS-NEXT: v_mul_hi_u32_u24_e64 v2, s4, 5 +; GFX7LESS-NEXT: v_mul_u32_u24_e64 v1, s4, 5 +; GFX7LESS-NEXT: s_mov_b32 m0, -1 +; GFX7LESS-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX7LESS-NEXT: ds_add_rtn_u64 v[1:2], v3, v[1:2] +; GFX7LESS-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX7LESS-NEXT: buffer_wbinvl1 +; GFX7LESS-NEXT: BB5_2: +; GFX7LESS-NEXT: s_or_b64 exec, exec, s[2:3] +; GFX7LESS-NEXT: v_readfirstlane_b32 s2, v1 +; GFX7LESS-NEXT: v_readfirstlane_b32 s4, v2 +; GFX7LESS-NEXT: v_mul_hi_u32_u24_e32 v1, 5, v0 +; GFX7LESS-NEXT: v_mul_u32_u24_e32 v0, 5, v0 +; GFX7LESS-NEXT: s_mov_b32 s3, 0xf000 +; GFX7LESS-NEXT: v_mov_b32_e32 v2, s4 +; GFX7LESS-NEXT: v_add_i32_e32 v0, vcc, s2, v0 +; GFX7LESS-NEXT: v_addc_u32_e32 v1, vcc, v2, v1, vcc +; GFX7LESS-NEXT: s_mov_b32 s2, -1 +; GFX7LESS-NEXT: s_waitcnt lgkmcnt(0) +; GFX7LESS-NEXT: buffer_store_dwordx2 v[0:1], off, s[0:3], 0 +; GFX7LESS-NEXT: s_endpgm +; +; GFX8-LABEL: add_i64_constant: +; GFX8: ; %bb.0: ; %entry +; GFX8-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x24 +; GFX8-NEXT: v_cmp_ne_u32_e64 s[4:5], 1, 0 +; GFX8-NEXT: v_mbcnt_lo_u32_b32 v0, s4, 0 +; GFX8-NEXT: v_mbcnt_hi_u32_b32 v0, s5, v0 +; GFX8-NEXT: v_cmp_eq_u32_e32 vcc, 0, 
v0 +; GFX8-NEXT: ; implicit-def: $vgpr1_vgpr2 +; GFX8-NEXT: s_and_saveexec_b64 s[2:3], vcc +; GFX8-NEXT: ; mask branch BB5_2 +; GFX8-NEXT: s_cbranch_execz BB5_2 +; GFX8-NEXT: BB5_1: +; GFX8-NEXT: s_bcnt1_i32_b64 s4, s[4:5] +; GFX8-NEXT: v_mul_hi_u32_u24_e64 v2, s4, 5 +; GFX8-NEXT: v_mul_u32_u24_e64 v1, s4, 5 +; GFX8-NEXT: v_mov_b32_e32 v3, local_var64 at abs32@lo +; GFX8-NEXT: s_mov_b32 m0, -1 +; GFX8-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX8-NEXT: ds_add_rtn_u64 v[1:2], v3, v[1:2] +; GFX8-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX8-NEXT: buffer_wbinvl1_vol +; GFX8-NEXT: BB5_2: +; GFX8-NEXT: s_or_b64 exec, exec, s[2:3] +; GFX8-NEXT: v_readfirstlane_b32 s2, v1 +; GFX8-NEXT: v_readfirstlane_b32 s3, v2 +; GFX8-NEXT: v_mad_u64_u32 v[0:1], s[2:3], v0, 5, s[2:3] +; GFX8-NEXT: s_mov_b32 s3, 0xf000 +; GFX8-NEXT: s_mov_b32 s2, -1 +; GFX8-NEXT: s_nop 2 +; GFX8-NEXT: s_waitcnt lgkmcnt(0) +; GFX8-NEXT: buffer_store_dwordx2 v[0:1], off, s[0:3], 0 +; GFX8-NEXT: s_endpgm +; +; GFX9-LABEL: add_i64_constant: +; GFX9: ; %bb.0: ; %entry +; GFX9-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x24 +; GFX9-NEXT: v_cmp_ne_u32_e64 s[4:5], 1, 0 +; GFX9-NEXT: v_mbcnt_lo_u32_b32 v0, s4, 0 +; GFX9-NEXT: v_mbcnt_hi_u32_b32 v0, s5, v0 +; GFX9-NEXT: v_cmp_eq_u32_e32 vcc, 0, v0 +; GFX9-NEXT: ; implicit-def: $vgpr1_vgpr2 +; GFX9-NEXT: s_and_saveexec_b64 s[2:3], vcc +; GFX9-NEXT: ; mask branch BB5_2 +; GFX9-NEXT: s_cbranch_execz BB5_2 +; GFX9-NEXT: BB5_1: +; GFX9-NEXT: s_bcnt1_i32_b64 s4, s[4:5] +; GFX9-NEXT: v_mul_hi_u32_u24_e64 v2, s4, 5 +; GFX9-NEXT: v_mul_u32_u24_e64 v1, s4, 5 +; GFX9-NEXT: v_mov_b32_e32 v3, local_var64 at abs32@lo +; GFX9-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX9-NEXT: ds_add_rtn_u64 v[1:2], v3, v[1:2] +; GFX9-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX9-NEXT: buffer_wbinvl1_vol +; GFX9-NEXT: BB5_2: +; GFX9-NEXT: s_or_b64 exec, exec, s[2:3] +; GFX9-NEXT: v_readfirstlane_b32 s2, v1 +; GFX9-NEXT: v_readfirstlane_b32 s3, v2 +; GFX9-NEXT: v_mad_u64_u32 v[0:1], s[2:3], v0, 5, s[2:3] +; 
GFX9-NEXT: s_mov_b32 s3, 0xf000 +; GFX9-NEXT: s_mov_b32 s2, -1 +; GFX9-NEXT: s_nop 2 +; GFX9-NEXT: s_waitcnt lgkmcnt(0) +; GFX9-NEXT: buffer_store_dwordx2 v[0:1], off, s[0:3], 0 +; GFX9-NEXT: s_endpgm +; +; GFX1064-LABEL: add_i64_constant: +; GFX1064: ; %bb.0: ; %entry +; GFX1064-NEXT: v_cmp_ne_u32_e64 s[2:3], 1, 0 +; GFX1064-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x24 +; GFX1064-NEXT: ; implicit-def: $vgpr1_vgpr2 +; GFX1064-NEXT: v_mbcnt_lo_u32_b32_e64 v0, s2, 0 +; GFX1064-NEXT: v_mbcnt_hi_u32_b32_e64 v0, s3, v0 +; GFX1064-NEXT: v_cmp_eq_u32_e32 vcc, 0, v0 +; GFX1064-NEXT: s_and_saveexec_b64 s[4:5], vcc +; GFX1064-NEXT: ; mask branch BB5_2 +; GFX1064-NEXT: s_cbranch_execz BB5_2 +; GFX1064-NEXT: BB5_1: +; GFX1064-NEXT: s_bcnt1_i32_b64 s2, s[2:3] +; GFX1064-NEXT: v_mov_b32_e32 v3, local_var64 at abs32@lo +; GFX1064-NEXT: v_mul_hi_u32_u24_e64 v2, s2, 5 +; GFX1064-NEXT: v_mul_u32_u24_e64 v1, s2, 5 +; GFX1064-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1064-NEXT: s_waitcnt_vscnt null, 0x0 +; GFX1064-NEXT: ds_add_rtn_u64 v[1:2], v3, v[1:2] +; GFX1064-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1064-NEXT: buffer_gl0_inv +; GFX1064-NEXT: buffer_gl1_inv +; GFX1064-NEXT: BB5_2: +; GFX1064-NEXT: v_nop +; GFX1064-NEXT: s_or_b64 exec, exec, s[4:5] +; GFX1064-NEXT: v_readfirstlane_b32 s2, v1 +; GFX1064-NEXT: v_readfirstlane_b32 s3, v2 +; GFX1064-NEXT: v_mad_u64_u32 v[0:1], s[2:3], v0, 5, s[2:3] +; GFX1064-NEXT: s_mov_b32 s3, 0x31016000 +; GFX1064-NEXT: s_mov_b32 s2, -1 +; GFX1064-NEXT: s_nop 2 +; GFX1064-NEXT: s_waitcnt lgkmcnt(0) +; GFX1064-NEXT: buffer_store_dwordx2 v[0:1], off, s[0:3], 0 +; GFX1064-NEXT: s_endpgm +; +; GFX1032-LABEL: add_i64_constant: +; GFX1032: ; %bb.0: ; %entry +; GFX1032-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x24 +; GFX1032-NEXT: v_cmp_ne_u32_e64 s3, 1, 0 +; GFX1032-NEXT: ; implicit-def: $vcc_hi +; GFX1032-NEXT: ; implicit-def: $vgpr1_vgpr2 +; GFX1032-NEXT: v_mbcnt_lo_u32_b32_e64 v0, s3, 0 +; GFX1032-NEXT: v_cmp_eq_u32_e32 vcc_lo, 0, v0 +; GFX1032-NEXT: 
s_and_saveexec_b32 s2, vcc_lo +; GFX1032-NEXT: ; mask branch BB5_2 +; GFX1032-NEXT: s_cbranch_execz BB5_2 +; GFX1032-NEXT: BB5_1: +; GFX1032-NEXT: s_bcnt1_i32_b32 s3, s3 +; GFX1032-NEXT: v_mov_b32_e32 v3, local_var64 at abs32@lo +; GFX1032-NEXT: v_mul_hi_u32_u24_e64 v2, s3, 5 +; GFX1032-NEXT: v_mul_u32_u24_e64 v1, s3, 5 +; GFX1032-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1032-NEXT: s_waitcnt_vscnt null, 0x0 +; GFX1032-NEXT: ds_add_rtn_u64 v[1:2], v3, v[1:2] +; GFX1032-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1032-NEXT: buffer_gl0_inv +; GFX1032-NEXT: buffer_gl1_inv +; GFX1032-NEXT: BB5_2: +; GFX1032-NEXT: v_nop +; GFX1032-NEXT: s_or_b32 exec_lo, exec_lo, s2 +; GFX1032-NEXT: v_readfirstlane_b32 s2, v1 +; GFX1032-NEXT: v_readfirstlane_b32 s3, v2 +; GFX1032-NEXT: v_mad_u64_u32 v[0:1], s2, v0, 5, s[2:3] +; GFX1032-NEXT: s_mov_b32 s3, 0x31016000 +; GFX1032-NEXT: s_mov_b32 s2, -1 +; GFX1032-NEXT: s_nop 2 +; GFX1032-NEXT: s_waitcnt lgkmcnt(0) +; GFX1032-NEXT: buffer_store_dwordx2 v[0:1], off, s[0:3], 0 +; GFX1032-NEXT: s_endpgm entry: %old = atomicrmw add i64 addrspace(3)* @local_var64, i64 5 acq_rel store i64 %old, i64 addrspace(1)* %out ret void } -; GCN-LABEL: add_i64_uniform: -; GCN32: v_cmp_ne_u32_e64 s[[exec_lo:[0-9]+]], 1, 0 -; GCN64: v_cmp_ne_u32_e64 s{{\[}}[[exec_lo:[0-9]+]]:[[exec_hi:[0-9]+]]{{\]}}, 1, 0 -; GCN: v_mbcnt_lo_u32_b32{{(_e[0-9]+)?}} v[[mbcnt:[0-9]+]], s[[exec_lo]], 0 -; GCN64: v_mbcnt_hi_u32_b32{{(_e[0-9]+)?}} v[[mbcnt]], s[[exec_hi]], v[[mbcnt]] -; GCN: v_cmp_eq_u32{{(_e[0-9]+)?}} vcc{{(_lo)?}}, 0, v[[mbcnt]] -; GCN32: s_bcnt1_i32_b32 s{{[0-9]+}}, s[[exec_lo]] -; GCN64: s_bcnt1_i32_b64 s{{[0-9]+}}, s{{\[}}[[exec_lo]]:[[exec_hi]]{{\]}} -; GCN: ds_add_rtn_u64 v{{\[}}{{[0-9]+}}:{{[0-9]+}}{{\]}}, v{{[0-9]+}}, v{{\[}}{{[0-9]+}}:{{[0-9]+}}{{\]}} define amdgpu_kernel void @add_i64_uniform(i64 addrspace(1)* %out, i64 %additive) { +; +; +; GFX7LESS-LABEL: add_i64_uniform: +; GFX7LESS: ; %bb.0: ; %entry +; GFX7LESS-NEXT: s_load_dwordx4 s[0:3], s[0:1], 0x9 
+; GFX7LESS-NEXT: v_cmp_ne_u32_e64 s[6:7], 1, 0 +; GFX7LESS-NEXT: v_mbcnt_lo_u32_b32_e64 v0, s6, 0 +; GFX7LESS-NEXT: v_mbcnt_hi_u32_b32_e32 v0, s7, v0 +; GFX7LESS-NEXT: v_cmp_eq_u32_e32 vcc, 0, v0 +; GFX7LESS-NEXT: ; implicit-def: $vgpr1_vgpr2 +; GFX7LESS-NEXT: s_and_saveexec_b64 s[4:5], vcc +; GFX7LESS-NEXT: ; mask branch BB6_2 +; GFX7LESS-NEXT: s_cbranch_execz BB6_2 +; GFX7LESS-NEXT: BB6_1: +; GFX7LESS-NEXT: s_bcnt1_i32_b64 s6, s[6:7] +; GFX7LESS-NEXT: v_mov_b32_e32 v3, local_var64 at abs32@lo +; GFX7LESS-NEXT: s_waitcnt lgkmcnt(0) +; GFX7LESS-NEXT: s_mul_i32 s7, s3, s6 +; GFX7LESS-NEXT: v_mov_b32_e32 v1, s6 +; GFX7LESS-NEXT: v_mul_hi_u32 v1, s2, v1 +; GFX7LESS-NEXT: s_mul_i32 s6, s2, s6 +; GFX7LESS-NEXT: v_add_i32_e32 v2, vcc, s7, v1 +; GFX7LESS-NEXT: v_mov_b32_e32 v1, s6 +; GFX7LESS-NEXT: s_mov_b32 m0, -1 +; GFX7LESS-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX7LESS-NEXT: ds_add_rtn_u64 v[1:2], v3, v[1:2] +; GFX7LESS-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX7LESS-NEXT: buffer_wbinvl1 +; GFX7LESS-NEXT: BB6_2: +; GFX7LESS-NEXT: s_or_b64 exec, exec, s[4:5] +; GFX7LESS-NEXT: s_mov_b32 s7, 0xf000 +; GFX7LESS-NEXT: s_mov_b32 s6, -1 +; GFX7LESS-NEXT: s_waitcnt lgkmcnt(0) +; GFX7LESS-NEXT: s_mov_b32 s4, s0 +; GFX7LESS-NEXT: s_mov_b32 s5, s1 +; GFX7LESS-NEXT: v_readfirstlane_b32 s0, v1 +; GFX7LESS-NEXT: v_readfirstlane_b32 s1, v2 +; GFX7LESS-NEXT: v_mul_lo_u32 v1, s3, v0 +; GFX7LESS-NEXT: v_mul_hi_u32 v2, s2, v0 +; GFX7LESS-NEXT: v_mul_lo_u32 v0, s2, v0 +; GFX7LESS-NEXT: v_add_i32_e32 v1, vcc, v2, v1 +; GFX7LESS-NEXT: v_mov_b32_e32 v2, s1 +; GFX7LESS-NEXT: v_add_i32_e32 v0, vcc, s0, v0 +; GFX7LESS-NEXT: v_addc_u32_e32 v1, vcc, v2, v1, vcc +; GFX7LESS-NEXT: buffer_store_dwordx2 v[0:1], off, s[4:7], 0 +; GFX7LESS-NEXT: s_endpgm +; +; GFX8-LABEL: add_i64_uniform: +; GFX8: ; %bb.0: ; %entry +; GFX8-NEXT: s_load_dwordx4 s[0:3], s[0:1], 0x24 +; GFX8-NEXT: v_cmp_ne_u32_e64 s[6:7], 1, 0 +; GFX8-NEXT: v_mbcnt_lo_u32_b32 v0, s6, 0 +; GFX8-NEXT: v_mbcnt_hi_u32_b32 v0, s7, v0 +; 
GFX8-NEXT: v_cmp_eq_u32_e32 vcc, 0, v0 +; GFX8-NEXT: ; implicit-def: $vgpr1_vgpr2 +; GFX8-NEXT: s_and_saveexec_b64 s[4:5], vcc +; GFX8-NEXT: ; mask branch BB6_2 +; GFX8-NEXT: s_cbranch_execz BB6_2 +; GFX8-NEXT: BB6_1: +; GFX8-NEXT: s_bcnt1_i32_b64 s6, s[6:7] +; GFX8-NEXT: v_mov_b32_e32 v1, s6 +; GFX8-NEXT: s_waitcnt lgkmcnt(0) +; GFX8-NEXT: v_mul_hi_u32 v1, s2, v1 +; GFX8-NEXT: s_mul_i32 s7, s3, s6 +; GFX8-NEXT: s_mul_i32 s6, s2, s6 +; GFX8-NEXT: v_mov_b32_e32 v3, local_var64 at abs32@lo +; GFX8-NEXT: v_add_u32_e32 v2, vcc, s7, v1 +; GFX8-NEXT: v_mov_b32_e32 v1, s6 +; GFX8-NEXT: s_mov_b32 m0, -1 +; GFX8-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX8-NEXT: ds_add_rtn_u64 v[1:2], v3, v[1:2] +; GFX8-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX8-NEXT: buffer_wbinvl1_vol +; GFX8-NEXT: BB6_2: +; GFX8-NEXT: s_or_b64 exec, exec, s[4:5] +; GFX8-NEXT: s_waitcnt lgkmcnt(0) +; GFX8-NEXT: s_mov_b32 s4, s0 +; GFX8-NEXT: v_readfirstlane_b32 s0, v1 +; GFX8-NEXT: v_mul_lo_u32 v1, s3, v0 +; GFX8-NEXT: v_mul_hi_u32 v3, s2, v0 +; GFX8-NEXT: v_mul_lo_u32 v0, s2, v0 +; GFX8-NEXT: s_mov_b32 s5, s1 +; GFX8-NEXT: v_readfirstlane_b32 s1, v2 +; GFX8-NEXT: v_add_u32_e32 v1, vcc, v3, v1 +; GFX8-NEXT: v_mov_b32_e32 v2, s1 +; GFX8-NEXT: v_add_u32_e32 v0, vcc, s0, v0 +; GFX8-NEXT: s_mov_b32 s7, 0xf000 +; GFX8-NEXT: s_mov_b32 s6, -1 +; GFX8-NEXT: v_addc_u32_e32 v1, vcc, v2, v1, vcc +; GFX8-NEXT: buffer_store_dwordx2 v[0:1], off, s[4:7], 0 +; GFX8-NEXT: s_endpgm +; +; GFX9-LABEL: add_i64_uniform: +; GFX9: ; %bb.0: ; %entry +; GFX9-NEXT: s_load_dwordx4 s[0:3], s[0:1], 0x24 +; GFX9-NEXT: v_cmp_ne_u32_e64 s[6:7], 1, 0 +; GFX9-NEXT: v_mbcnt_lo_u32_b32 v0, s6, 0 +; GFX9-NEXT: v_mbcnt_hi_u32_b32 v0, s7, v0 +; GFX9-NEXT: v_cmp_eq_u32_e32 vcc, 0, v0 +; GFX9-NEXT: ; implicit-def: $vgpr1_vgpr2 +; GFX9-NEXT: s_and_saveexec_b64 s[4:5], vcc +; GFX9-NEXT: ; mask branch BB6_2 +; GFX9-NEXT: s_cbranch_execz BB6_2 +; GFX9-NEXT: BB6_1: +; GFX9-NEXT: s_bcnt1_i32_b64 s6, s[6:7] +; GFX9-NEXT: v_mov_b32_e32 v1, s6 +; 
GFX9-NEXT: s_waitcnt lgkmcnt(0) +; GFX9-NEXT: v_mul_hi_u32 v2, s2, v1 +; GFX9-NEXT: s_mul_i32 s7, s3, s6 +; GFX9-NEXT: s_mul_i32 s6, s2, s6 +; GFX9-NEXT: v_mov_b32_e32 v1, s6 +; GFX9-NEXT: v_add_u32_e32 v2, s7, v2 +; GFX9-NEXT: v_mov_b32_e32 v3, local_var64 at abs32@lo +; GFX9-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX9-NEXT: ds_add_rtn_u64 v[1:2], v3, v[1:2] +; GFX9-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX9-NEXT: buffer_wbinvl1_vol +; GFX9-NEXT: BB6_2: +; GFX9-NEXT: s_or_b64 exec, exec, s[4:5] +; GFX9-NEXT: s_waitcnt lgkmcnt(0) +; GFX9-NEXT: v_mul_lo_u32 v3, s3, v0 +; GFX9-NEXT: v_mul_hi_u32 v4, s2, v0 +; GFX9-NEXT: v_mul_lo_u32 v0, s2, v0 +; GFX9-NEXT: s_mov_b32 s4, s0 +; GFX9-NEXT: v_readfirstlane_b32 s0, v1 +; GFX9-NEXT: s_mov_b32 s5, s1 +; GFX9-NEXT: v_readfirstlane_b32 s1, v2 +; GFX9-NEXT: v_add_u32_e32 v1, v4, v3 +; GFX9-NEXT: v_mov_b32_e32 v2, s1 +; GFX9-NEXT: v_add_co_u32_e32 v0, vcc, s0, v0 +; GFX9-NEXT: s_mov_b32 s7, 0xf000 +; GFX9-NEXT: s_mov_b32 s6, -1 +; GFX9-NEXT: v_addc_co_u32_e32 v1, vcc, v2, v1, vcc +; GFX9-NEXT: buffer_store_dwordx2 v[0:1], off, s[4:7], 0 +; GFX9-NEXT: s_endpgm +; +; GFX1064-LABEL: add_i64_uniform: +; GFX1064: ; %bb.0: ; %entry +; GFX1064-NEXT: v_cmp_ne_u32_e64 s[6:7], 1, 0 +; GFX1064-NEXT: s_load_dwordx4 s[0:3], s[0:1], 0x24 +; GFX1064-NEXT: ; implicit-def: $vgpr1_vgpr2 +; GFX1064-NEXT: v_mbcnt_lo_u32_b32_e64 v0, s6, 0 +; GFX1064-NEXT: v_mbcnt_hi_u32_b32_e64 v0, s7, v0 +; GFX1064-NEXT: v_cmp_eq_u32_e32 vcc, 0, v0 +; GFX1064-NEXT: s_and_saveexec_b64 s[4:5], vcc +; GFX1064-NEXT: ; mask branch BB6_2 +; GFX1064-NEXT: s_cbranch_execz BB6_2 +; GFX1064-NEXT: BB6_1: +; GFX1064-NEXT: s_bcnt1_i32_b64 s6, s[6:7] +; GFX1064-NEXT: v_mov_b32_e32 v3, local_var64 at abs32@lo +; GFX1064-NEXT: s_waitcnt lgkmcnt(0) +; GFX1064-NEXT: v_mul_hi_u32 v2, s2, s6 +; GFX1064-NEXT: s_mul_i32 s7, s2, s6 +; GFX1064-NEXT: s_mul_i32 s6, s3, s6 +; GFX1064-NEXT: v_mov_b32_e32 v1, s7 +; GFX1064-NEXT: v_add_nc_u32_e32 v2, s6, v2 +; GFX1064-NEXT: s_waitcnt 
vmcnt(0) lgkmcnt(0) +; GFX1064-NEXT: s_waitcnt_vscnt null, 0x0 +; GFX1064-NEXT: ds_add_rtn_u64 v[1:2], v3, v[1:2] +; GFX1064-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1064-NEXT: buffer_gl0_inv +; GFX1064-NEXT: buffer_gl1_inv +; GFX1064-NEXT: BB6_2: +; GFX1064-NEXT: v_nop +; GFX1064-NEXT: s_or_b64 exec, exec, s[4:5] +; GFX1064-NEXT: s_waitcnt lgkmcnt(0) +; GFX1064-NEXT: v_mul_lo_u32 v3, s3, v0 +; GFX1064-NEXT: v_mul_hi_u32 v4, s2, v0 +; GFX1064-NEXT: v_mul_lo_u32 v0, s2, v0 +; GFX1064-NEXT: v_readfirstlane_b32 s4, v1 +; GFX1064-NEXT: v_readfirstlane_b32 s5, v2 +; GFX1064-NEXT: s_mov_b32 s3, 0x31016000 +; GFX1064-NEXT: s_mov_b32 s2, -1 +; GFX1064-NEXT: v_add_nc_u32_e32 v1, v4, v3 +; GFX1064-NEXT: v_add_co_u32_e64 v0, vcc, s4, v0 +; GFX1064-NEXT: v_add_co_ci_u32_e32 v1, vcc, s5, v1, vcc +; GFX1064-NEXT: buffer_store_dwordx2 v[0:1], off, s[0:3], 0 +; GFX1064-NEXT: s_endpgm +; +; GFX1032-LABEL: add_i64_uniform: +; GFX1032: ; %bb.0: ; %entry +; GFX1032-NEXT: s_load_dwordx4 s[0:3], s[0:1], 0x24 +; GFX1032-NEXT: v_cmp_ne_u32_e64 s5, 1, 0 +; GFX1032-NEXT: ; implicit-def: $vcc_hi +; GFX1032-NEXT: ; implicit-def: $vgpr1_vgpr2 +; GFX1032-NEXT: v_mbcnt_lo_u32_b32_e64 v0, s5, 0 +; GFX1032-NEXT: v_cmp_eq_u32_e32 vcc_lo, 0, v0 +; GFX1032-NEXT: s_and_saveexec_b32 s4, vcc_lo +; GFX1032-NEXT: ; mask branch BB6_2 +; GFX1032-NEXT: s_cbranch_execz BB6_2 +; GFX1032-NEXT: BB6_1: +; GFX1032-NEXT: s_bcnt1_i32_b32 s5, s5 +; GFX1032-NEXT: v_mov_b32_e32 v3, local_var64 at abs32@lo +; GFX1032-NEXT: s_waitcnt lgkmcnt(0) +; GFX1032-NEXT: v_mul_hi_u32 v2, s2, s5 +; GFX1032-NEXT: s_mul_i32 s6, s2, s5 +; GFX1032-NEXT: s_mul_i32 s5, s3, s5 +; GFX1032-NEXT: v_mov_b32_e32 v1, s6 +; GFX1032-NEXT: v_add_nc_u32_e32 v2, s5, v2 +; GFX1032-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1032-NEXT: s_waitcnt_vscnt null, 0x0 +; GFX1032-NEXT: ds_add_rtn_u64 v[1:2], v3, v[1:2] +; GFX1032-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1032-NEXT: buffer_gl0_inv +; GFX1032-NEXT: buffer_gl1_inv +; GFX1032-NEXT: BB6_2: +; 
GFX1032-NEXT: v_nop +; GFX1032-NEXT: s_or_b32 exec_lo, exec_lo, s4 +; GFX1032-NEXT: s_waitcnt lgkmcnt(0) +; GFX1032-NEXT: v_mul_lo_u32 v3, s3, v0 +; GFX1032-NEXT: v_mul_hi_u32 v4, s2, v0 +; GFX1032-NEXT: v_mul_lo_u32 v0, s2, v0 +; GFX1032-NEXT: v_readfirstlane_b32 s4, v1 +; GFX1032-NEXT: v_readfirstlane_b32 s5, v2 +; GFX1032-NEXT: s_mov_b32 s3, 0x31016000 +; GFX1032-NEXT: s_mov_b32 s2, -1 +; GFX1032-NEXT: v_add_nc_u32_e32 v1, v4, v3 +; GFX1032-NEXT: v_add_co_u32_e64 v0, vcc_lo, s4, v0 +; GFX1032-NEXT: v_add_co_ci_u32_e32 v1, vcc_lo, s5, v1, vcc_lo +; GFX1032-NEXT: buffer_store_dwordx2 v[0:1], off, s[0:3], 0 +; GFX1032-NEXT: s_endpgm entry: %old = atomicrmw add i64 addrspace(3)* @local_var64, i64 %additive acq_rel store i64 %old, i64 addrspace(1)* %out ret void } -; GCN-LABEL: add_i64_varying: ; GCN-NOT: v_mbcnt_lo_u32_b32 ; GCN-NOT: v_mbcnt_hi_u32_b32 ; GCN-NOT: s_bcnt1_i32_b64 -; GCN: ds_add_rtn_u64 v{{\[}}{{[0-9]+}}:{{[0-9]+}}{{\]}}, v{{[0-9]+}}, v{{\[}}{{[0-9]+}}:{{[0-9]+}}{{\]}} define amdgpu_kernel void @add_i64_varying(i64 addrspace(1)* %out) { +; +; +; GFX7LESS-LABEL: add_i64_varying: +; GFX7LESS: ; %bb.0: ; %entry +; GFX7LESS-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x9 +; GFX7LESS-NEXT: v_mov_b32_e32 v1, 0 +; GFX7LESS-NEXT: v_mov_b32_e32 v2, local_var64 at abs32@lo +; GFX7LESS-NEXT: s_mov_b32 m0, -1 +; GFX7LESS-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX7LESS-NEXT: ds_add_rtn_u64 v[0:1], v2, v[0:1] +; GFX7LESS-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX7LESS-NEXT: buffer_wbinvl1 +; GFX7LESS-NEXT: s_mov_b32 s3, 0xf000 +; GFX7LESS-NEXT: s_mov_b32 s2, -1 +; GFX7LESS-NEXT: buffer_store_dwordx2 v[0:1], off, s[0:3], 0 +; GFX7LESS-NEXT: s_endpgm +; +; GFX8-LABEL: add_i64_varying: +; GFX8: ; %bb.0: ; %entry +; GFX8-NEXT: v_mov_b32_e32 v1, 0 +; GFX8-NEXT: v_mov_b32_e32 v2, local_var64 at abs32@lo +; GFX8-NEXT: s_mov_b32 m0, -1 +; GFX8-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x24 +; GFX8-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX8-NEXT: ds_add_rtn_u64 v[0:1], v2, v[0:1] +; 
GFX8-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX8-NEXT: buffer_wbinvl1_vol +; GFX8-NEXT: s_mov_b32 s3, 0xf000 +; GFX8-NEXT: s_mov_b32 s2, -1 +; GFX8-NEXT: buffer_store_dwordx2 v[0:1], off, s[0:3], 0 +; GFX8-NEXT: s_endpgm +; +; GFX9-LABEL: add_i64_varying: +; GFX9: ; %bb.0: ; %entry +; GFX9-NEXT: v_mov_b32_e32 v1, 0 +; GFX9-NEXT: v_mov_b32_e32 v2, local_var64 at abs32@lo +; GFX9-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x24 +; GFX9-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX9-NEXT: ds_add_rtn_u64 v[0:1], v2, v[0:1] +; GFX9-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX9-NEXT: buffer_wbinvl1_vol +; GFX9-NEXT: s_mov_b32 s3, 0xf000 +; GFX9-NEXT: s_mov_b32 s2, -1 +; GFX9-NEXT: buffer_store_dwordx2 v[0:1], off, s[0:3], 0 +; GFX9-NEXT: s_endpgm +; +; GFX1064-LABEL: add_i64_varying: +; GFX1064: ; %bb.0: ; %entry +; GFX1064-NEXT: v_mov_b32_e32 v1, 0 +; GFX1064-NEXT: v_mov_b32_e32 v2, local_var64 at abs32@lo +; GFX1064-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x24 +; GFX1064-NEXT: s_mov_b32 s3, 0x31016000 +; GFX1064-NEXT: s_mov_b32 s2, -1 +; GFX1064-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1064-NEXT: s_waitcnt_vscnt null, 0x0 +; GFX1064-NEXT: ds_add_rtn_u64 v[0:1], v2, v[0:1] +; GFX1064-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1064-NEXT: buffer_gl0_inv +; GFX1064-NEXT: buffer_gl1_inv +; GFX1064-NEXT: buffer_store_dwordx2 v[0:1], off, s[0:3], 0 +; GFX1064-NEXT: s_endpgm +; +; GFX1032-LABEL: add_i64_varying: +; GFX1032: ; %bb.0: ; %entry +; GFX1032-NEXT: v_mov_b32_e32 v1, 0 +; GFX1032-NEXT: v_mov_b32_e32 v2, local_var64 at abs32@lo +; GFX1032-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x24 +; GFX1032-NEXT: s_mov_b32 s3, 0x31016000 +; GFX1032-NEXT: s_mov_b32 s2, -1 +; GFX1032-NEXT: ; implicit-def: $vcc_hi +; GFX1032-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1032-NEXT: s_waitcnt_vscnt null, 0x0 +; GFX1032-NEXT: ds_add_rtn_u64 v[0:1], v2, v[0:1] +; GFX1032-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1032-NEXT: buffer_gl0_inv +; GFX1032-NEXT: buffer_gl1_inv +; GFX1032-NEXT: buffer_store_dwordx2 
v[0:1], off, s[0:3], 0 +; GFX1032-NEXT: s_endpgm entry: %lane = call i32 @llvm.amdgcn.workitem.id.x() %zext = zext i32 %lane to i64 @@ -256,53 +1592,601 @@ entry: ret void } -; GCN-LABEL: sub_i32_constant: -; GCN32: v_cmp_ne_u32_e64 s[[exec_lo:[0-9]+]], 1, 0 -; GCN64: v_cmp_ne_u32_e64 s{{\[}}[[exec_lo:[0-9]+]]:[[exec_hi:[0-9]+]]{{\]}}, 1, 0 -; GCN: v_mbcnt_lo_u32_b32{{(_e[0-9]+)?}} v[[mbcnt:[0-9]+]], s[[exec_lo]], 0 -; GCN64: v_mbcnt_hi_u32_b32{{(_e[0-9]+)?}} v[[mbcnt]], s[[exec_hi]], v[[mbcnt]] -; GCN: v_cmp_eq_u32{{(_e[0-9]+)?}} vcc{{(_lo)?}}, 0, v[[mbcnt]] -; GCN32: s_bcnt1_i32_b32 s[[popcount:[0-9]+]], s[[exec_lo]] -; GCN64: s_bcnt1_i32_b64 s[[popcount:[0-9]+]], s{{\[}}[[exec_lo]]:[[exec_hi]]{{\]}} -; GCN: v_mul_u32_u24{{(_e[0-9]+)?}} v[[value:[0-9]+]], s[[popcount]], 5 -; GCN: ds_sub_rtn_u32 v{{[0-9]+}}, v{{[0-9]+}}, v[[value]] define amdgpu_kernel void @sub_i32_constant(i32 addrspace(1)* %out) { +; +; +; GFX7LESS-LABEL: sub_i32_constant: +; GFX7LESS: ; %bb.0: ; %entry +; GFX7LESS-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x9 +; GFX7LESS-NEXT: v_cmp_ne_u32_e64 s[4:5], 1, 0 +; GFX7LESS-NEXT: v_mbcnt_lo_u32_b32_e64 v0, s4, 0 +; GFX7LESS-NEXT: v_mbcnt_hi_u32_b32_e32 v0, s5, v0 +; GFX7LESS-NEXT: v_cmp_eq_u32_e32 vcc, 0, v0 +; GFX7LESS-NEXT: ; implicit-def: $vgpr1 +; GFX7LESS-NEXT: s_and_saveexec_b64 s[2:3], vcc +; GFX7LESS-NEXT: ; mask branch BB8_2 +; GFX7LESS-NEXT: s_cbranch_execz BB8_2 +; GFX7LESS-NEXT: BB8_1: +; GFX7LESS-NEXT: s_bcnt1_i32_b64 s4, s[4:5] +; GFX7LESS-NEXT: v_mov_b32_e32 v1, local_var32 at abs32@lo +; GFX7LESS-NEXT: v_mul_u32_u24_e64 v2, s4, 5 +; GFX7LESS-NEXT: s_mov_b32 m0, -1 +; GFX7LESS-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX7LESS-NEXT: ds_sub_rtn_u32 v1, v1, v2 +; GFX7LESS-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX7LESS-NEXT: buffer_wbinvl1 +; GFX7LESS-NEXT: BB8_2: +; GFX7LESS-NEXT: s_or_b64 exec, exec, s[2:3] +; GFX7LESS-NEXT: v_readfirstlane_b32 s2, v1 +; GFX7LESS-NEXT: v_mul_u32_u24_e32 v0, 5, v0 +; GFX7LESS-NEXT: s_mov_b32 s3, 0xf000 +; 
GFX7LESS-NEXT: v_sub_i32_e32 v0, vcc, s2, v0 +; GFX7LESS-NEXT: s_mov_b32 s2, -1 +; GFX7LESS-NEXT: s_waitcnt lgkmcnt(0) +; GFX7LESS-NEXT: buffer_store_dword v0, off, s[0:3], 0 +; GFX7LESS-NEXT: s_endpgm +; +; GFX8-LABEL: sub_i32_constant: +; GFX8: ; %bb.0: ; %entry +; GFX8-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x24 +; GFX8-NEXT: v_cmp_ne_u32_e64 s[4:5], 1, 0 +; GFX8-NEXT: v_mbcnt_lo_u32_b32 v0, s4, 0 +; GFX8-NEXT: v_mbcnt_hi_u32_b32 v0, s5, v0 +; GFX8-NEXT: v_cmp_eq_u32_e32 vcc, 0, v0 +; GFX8-NEXT: ; implicit-def: $vgpr1 +; GFX8-NEXT: s_and_saveexec_b64 s[2:3], vcc +; GFX8-NEXT: ; mask branch BB8_2 +; GFX8-NEXT: s_cbranch_execz BB8_2 +; GFX8-NEXT: BB8_1: +; GFX8-NEXT: s_bcnt1_i32_b64 s4, s[4:5] +; GFX8-NEXT: v_mul_u32_u24_e64 v1, s4, 5 +; GFX8-NEXT: v_mov_b32_e32 v2, local_var32 at abs32@lo +; GFX8-NEXT: s_mov_b32 m0, -1 +; GFX8-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX8-NEXT: ds_sub_rtn_u32 v1, v2, v1 +; GFX8-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX8-NEXT: buffer_wbinvl1_vol +; GFX8-NEXT: BB8_2: +; GFX8-NEXT: s_or_b64 exec, exec, s[2:3] +; GFX8-NEXT: v_readfirstlane_b32 s2, v1 +; GFX8-NEXT: v_mul_u32_u24_e32 v0, 5, v0 +; GFX8-NEXT: v_sub_u32_e32 v0, vcc, s2, v0 +; GFX8-NEXT: s_mov_b32 s3, 0xf000 +; GFX8-NEXT: s_mov_b32 s2, -1 +; GFX8-NEXT: s_nop 0 +; GFX8-NEXT: s_waitcnt lgkmcnt(0) +; GFX8-NEXT: buffer_store_dword v0, off, s[0:3], 0 +; GFX8-NEXT: s_endpgm +; +; GFX9-LABEL: sub_i32_constant: +; GFX9: ; %bb.0: ; %entry +; GFX9-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x24 +; GFX9-NEXT: v_cmp_ne_u32_e64 s[4:5], 1, 0 +; GFX9-NEXT: v_mbcnt_lo_u32_b32 v0, s4, 0 +; GFX9-NEXT: v_mbcnt_hi_u32_b32 v0, s5, v0 +; GFX9-NEXT: v_cmp_eq_u32_e32 vcc, 0, v0 +; GFX9-NEXT: ; implicit-def: $vgpr1 +; GFX9-NEXT: s_and_saveexec_b64 s[2:3], vcc +; GFX9-NEXT: ; mask branch BB8_2 +; GFX9-NEXT: s_cbranch_execz BB8_2 +; GFX9-NEXT: BB8_1: +; GFX9-NEXT: s_bcnt1_i32_b64 s4, s[4:5] +; GFX9-NEXT: v_mul_u32_u24_e64 v1, s4, 5 +; GFX9-NEXT: v_mov_b32_e32 v2, local_var32 at abs32@lo +; GFX9-NEXT: 
s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX9-NEXT: ds_sub_rtn_u32 v1, v2, v1 +; GFX9-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX9-NEXT: buffer_wbinvl1_vol +; GFX9-NEXT: BB8_2: +; GFX9-NEXT: s_or_b64 exec, exec, s[2:3] +; GFX9-NEXT: v_readfirstlane_b32 s2, v1 +; GFX9-NEXT: v_mul_u32_u24_e32 v0, 5, v0 +; GFX9-NEXT: v_sub_u32_e32 v0, s2, v0 +; GFX9-NEXT: s_mov_b32 s3, 0xf000 +; GFX9-NEXT: s_mov_b32 s2, -1 +; GFX9-NEXT: s_nop 0 +; GFX9-NEXT: s_waitcnt lgkmcnt(0) +; GFX9-NEXT: buffer_store_dword v0, off, s[0:3], 0 +; GFX9-NEXT: s_endpgm +; +; GFX1064-LABEL: sub_i32_constant: +; GFX1064: ; %bb.0: ; %entry +; GFX1064-NEXT: v_cmp_ne_u32_e64 s[2:3], 1, 0 +; GFX1064-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x24 +; GFX1064-NEXT: ; implicit-def: $vgpr1 +; GFX1064-NEXT: v_mbcnt_lo_u32_b32_e64 v0, s2, 0 +; GFX1064-NEXT: v_mbcnt_hi_u32_b32_e64 v0, s3, v0 +; GFX1064-NEXT: v_cmp_eq_u32_e32 vcc, 0, v0 +; GFX1064-NEXT: s_and_saveexec_b64 s[4:5], vcc +; GFX1064-NEXT: ; mask branch BB8_2 +; GFX1064-NEXT: s_cbranch_execz BB8_2 +; GFX1064-NEXT: BB8_1: +; GFX1064-NEXT: s_bcnt1_i32_b64 s2, s[2:3] +; GFX1064-NEXT: v_mov_b32_e32 v2, local_var32@abs32@lo +; GFX1064-NEXT: v_mul_u32_u24_e64 v1, s2, 5 +; GFX1064-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1064-NEXT: s_waitcnt_vscnt null, 0x0 +; GFX1064-NEXT: ds_sub_rtn_u32 v1, v2, v1 +; GFX1064-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1064-NEXT: buffer_gl0_inv +; GFX1064-NEXT: buffer_gl1_inv +; GFX1064-NEXT: BB8_2: +; GFX1064-NEXT: v_nop +; GFX1064-NEXT: s_or_b64 exec, exec, s[4:5] +; GFX1064-NEXT: v_readfirstlane_b32 s2, v1 +; GFX1064-NEXT: v_mul_u32_u24_e32 v0, 5, v0 +; GFX1064-NEXT: s_mov_b32 s3, 0x31016000 +; GFX1064-NEXT: v_sub_nc_u32_e32 v0, s2, v0 +; GFX1064-NEXT: s_mov_b32 s2, -1 +; GFX1064-NEXT: s_nop 0 +; GFX1064-NEXT: s_waitcnt lgkmcnt(0) +; GFX1064-NEXT: buffer_store_dword v0, off, s[0:3], 0 +; GFX1064-NEXT: s_endpgm +; +; GFX1032-LABEL: sub_i32_constant: +; GFX1032: ; %bb.0: ; %entry +; GFX1032-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x24 +; 
GFX1032-NEXT: v_cmp_ne_u32_e64 s3, 1, 0 +; GFX1032-NEXT: ; implicit-def: $vcc_hi +; GFX1032-NEXT: ; implicit-def: $vgpr1 +; GFX1032-NEXT: v_mbcnt_lo_u32_b32_e64 v0, s3, 0 +; GFX1032-NEXT: v_cmp_eq_u32_e32 vcc_lo, 0, v0 +; GFX1032-NEXT: s_and_saveexec_b32 s2, vcc_lo +; GFX1032-NEXT: ; mask branch BB8_2 +; GFX1032-NEXT: s_cbranch_execz BB8_2 +; GFX1032-NEXT: BB8_1: +; GFX1032-NEXT: s_bcnt1_i32_b32 s3, s3 +; GFX1032-NEXT: v_mov_b32_e32 v2, local_var32@abs32@lo +; GFX1032-NEXT: v_mul_u32_u24_e64 v1, s3, 5 +; GFX1032-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1032-NEXT: s_waitcnt_vscnt null, 0x0 +; GFX1032-NEXT: ds_sub_rtn_u32 v1, v2, v1 +; GFX1032-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1032-NEXT: buffer_gl0_inv +; GFX1032-NEXT: buffer_gl1_inv +; GFX1032-NEXT: BB8_2: +; GFX1032-NEXT: v_nop +; GFX1032-NEXT: s_or_b32 exec_lo, exec_lo, s2 +; GFX1032-NEXT: v_readfirstlane_b32 s2, v1 +; GFX1032-NEXT: v_mul_u32_u24_e32 v0, 5, v0 +; GFX1032-NEXT: s_mov_b32 s3, 0x31016000 +; GFX1032-NEXT: v_sub_nc_u32_e32 v0, s2, v0 +; GFX1032-NEXT: s_mov_b32 s2, -1 +; GFX1032-NEXT: s_nop 0 +; GFX1032-NEXT: s_waitcnt lgkmcnt(0) +; GFX1032-NEXT: buffer_store_dword v0, off, s[0:3], 0 +; GFX1032-NEXT: s_endpgm entry: %old = atomicrmw sub i32 addrspace(3)* @local_var32, i32 5 acq_rel store i32 %old, i32 addrspace(1)* %out ret void } -; GCN-LABEL: sub_i32_uniform: -; GCN32: v_cmp_ne_u32_e64 s[[exec_lo:[0-9]+]], 1, 0 -; GCN64: v_cmp_ne_u32_e64 s{{\[}}[[exec_lo:[0-9]+]]:[[exec_hi:[0-9]+]]{{\]}}, 1, 0 -; GCN: v_mbcnt_lo_u32_b32{{(_e[0-9]+)?}} v[[mbcnt:[0-9]+]], s[[exec_lo]], 0 -; GCN64: v_mbcnt_hi_u32_b32{{(_e[0-9]+)?}} v[[mbcnt]], s[[exec_hi]], v[[mbcnt]] -; GCN: v_cmp_eq_u32{{(_e[0-9]+)?}} vcc{{(_lo)?}}, 0, v[[mbcnt]] -; GCN32: s_bcnt1_i32_b32 s[[popcount:[0-9]+]], s[[exec_lo]] -; GCN64: s_bcnt1_i32_b64 s[[popcount:[0-9]+]], s{{\[}}[[exec_lo]]:[[exec_hi]]{{\]}} -; GCN: s_mul_i32 s[[scalar_value:[0-9]+]], s{{[0-9]+}}, s[[popcount]] -; GCN: v_mov_b32{{(_e[0-9]+)?}} v[[value:[0-9]+]], 
s[[scalar_value]] -; GCN: ds_sub_rtn_u32 v{{[0-9]+}}, v{{[0-9]+}}, v[[value]] define amdgpu_kernel void @sub_i32_uniform(i32 addrspace(1)* %out, i32 %subitive) { +; +; +; GFX7LESS-LABEL: sub_i32_uniform: +; GFX7LESS: ; %bb.0: ; %entry +; GFX7LESS-NEXT: s_load_dwordx2 s[4:5], s[0:1], 0x9 +; GFX7LESS-NEXT: s_load_dword s2, s[0:1], 0xb +; GFX7LESS-NEXT: v_cmp_ne_u32_e64 s[6:7], 1, 0 +; GFX7LESS-NEXT: v_mbcnt_lo_u32_b32_e64 v0, s6, 0 +; GFX7LESS-NEXT: v_mbcnt_hi_u32_b32_e32 v0, s7, v0 +; GFX7LESS-NEXT: v_cmp_eq_u32_e32 vcc, 0, v0 +; GFX7LESS-NEXT: ; implicit-def: $vgpr1 +; GFX7LESS-NEXT: s_and_saveexec_b64 s[0:1], vcc +; GFX7LESS-NEXT: ; mask branch BB9_2 +; GFX7LESS-NEXT: s_cbranch_execz BB9_2 +; GFX7LESS-NEXT: BB9_1: +; GFX7LESS-NEXT: s_bcnt1_i32_b64 s3, s[6:7] +; GFX7LESS-NEXT: s_waitcnt lgkmcnt(0) +; GFX7LESS-NEXT: s_mul_i32 s3, s2, s3 +; GFX7LESS-NEXT: v_mov_b32_e32 v1, local_var32@abs32@lo +; GFX7LESS-NEXT: v_mov_b32_e32 v2, s3 +; GFX7LESS-NEXT: s_mov_b32 m0, -1 +; GFX7LESS-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX7LESS-NEXT: ds_sub_rtn_u32 v1, v1, v2 +; GFX7LESS-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX7LESS-NEXT: buffer_wbinvl1 +; GFX7LESS-NEXT: BB9_2: +; GFX7LESS-NEXT: s_or_b64 exec, exec, s[0:1] +; GFX7LESS-NEXT: v_readfirstlane_b32 s0, v1 +; GFX7LESS-NEXT: s_waitcnt lgkmcnt(0) +; GFX7LESS-NEXT: v_mul_lo_u32 v0, s2, v0 +; GFX7LESS-NEXT: s_mov_b32 s7, 0xf000 +; GFX7LESS-NEXT: v_sub_i32_e32 v0, vcc, s0, v0 +; GFX7LESS-NEXT: s_mov_b32 s6, -1 +; GFX7LESS-NEXT: buffer_store_dword v0, off, s[4:7], 0 +; GFX7LESS-NEXT: s_endpgm +; +; GFX8-LABEL: sub_i32_uniform: +; GFX8: ; %bb.0: ; %entry +; GFX8-NEXT: s_load_dwordx2 s[4:5], s[0:1], 0x24 +; GFX8-NEXT: s_load_dword s0, s[0:1], 0x2c +; GFX8-NEXT: v_cmp_ne_u32_e64 s[6:7], 1, 0 +; GFX8-NEXT: v_mbcnt_lo_u32_b32 v0, s6, 0 +; GFX8-NEXT: v_mbcnt_hi_u32_b32 v0, s7, v0 +; GFX8-NEXT: v_cmp_eq_u32_e32 vcc, 0, v0 +; GFX8-NEXT: ; implicit-def: $vgpr1 +; GFX8-NEXT: s_and_saveexec_b64 s[2:3], vcc +; GFX8-NEXT: ; mask branch 
BB9_2 +; GFX8-NEXT: s_cbranch_execz BB9_2 +; GFX8-NEXT: BB9_1: +; GFX8-NEXT: s_bcnt1_i32_b64 s1, s[6:7] +; GFX8-NEXT: s_waitcnt lgkmcnt(0) +; GFX8-NEXT: s_mul_i32 s1, s0, s1 +; GFX8-NEXT: v_mov_b32_e32 v1, local_var32@abs32@lo +; GFX8-NEXT: v_mov_b32_e32 v2, s1 +; GFX8-NEXT: s_mov_b32 m0, -1 +; GFX8-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX8-NEXT: ds_sub_rtn_u32 v1, v1, v2 +; GFX8-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX8-NEXT: buffer_wbinvl1_vol +; GFX8-NEXT: BB9_2: +; GFX8-NEXT: s_or_b64 exec, exec, s[2:3] +; GFX8-NEXT: s_waitcnt lgkmcnt(0) +; GFX8-NEXT: v_mul_lo_u32 v0, s0, v0 +; GFX8-NEXT: v_readfirstlane_b32 s0, v1 +; GFX8-NEXT: s_mov_b32 s7, 0xf000 +; GFX8-NEXT: s_mov_b32 s6, -1 +; GFX8-NEXT: v_sub_u32_e32 v0, vcc, s0, v0 +; GFX8-NEXT: buffer_store_dword v0, off, s[4:7], 0 +; GFX8-NEXT: s_endpgm +; +; GFX9-LABEL: sub_i32_uniform: +; GFX9: ; %bb.0: ; %entry +; GFX9-NEXT: s_load_dwordx2 s[4:5], s[0:1], 0x24 +; GFX9-NEXT: s_load_dword s0, s[0:1], 0x2c +; GFX9-NEXT: v_cmp_ne_u32_e64 s[6:7], 1, 0 +; GFX9-NEXT: v_mbcnt_lo_u32_b32 v0, s6, 0 +; GFX9-NEXT: v_mbcnt_hi_u32_b32 v0, s7, v0 +; GFX9-NEXT: v_cmp_eq_u32_e32 vcc, 0, v0 +; GFX9-NEXT: ; implicit-def: $vgpr1 +; GFX9-NEXT: s_and_saveexec_b64 s[2:3], vcc +; GFX9-NEXT: ; mask branch BB9_2 +; GFX9-NEXT: s_cbranch_execz BB9_2 +; GFX9-NEXT: BB9_1: +; GFX9-NEXT: s_bcnt1_i32_b64 s1, s[6:7] +; GFX9-NEXT: s_waitcnt lgkmcnt(0) +; GFX9-NEXT: s_mul_i32 s1, s0, s1 +; GFX9-NEXT: v_mov_b32_e32 v1, local_var32@abs32@lo +; GFX9-NEXT: v_mov_b32_e32 v2, s1 +; GFX9-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX9-NEXT: ds_sub_rtn_u32 v1, v1, v2 +; GFX9-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX9-NEXT: buffer_wbinvl1_vol +; GFX9-NEXT: BB9_2: +; GFX9-NEXT: s_or_b64 exec, exec, s[2:3] +; GFX9-NEXT: s_waitcnt lgkmcnt(0) +; GFX9-NEXT: v_mul_lo_u32 v0, s0, v0 +; GFX9-NEXT: v_readfirstlane_b32 s0, v1 +; GFX9-NEXT: s_mov_b32 s7, 0xf000 +; GFX9-NEXT: s_mov_b32 s6, -1 +; GFX9-NEXT: v_sub_u32_e32 v0, s0, v0 +; GFX9-NEXT: buffer_store_dword 
v0, off, s[4:7], 0 +; GFX9-NEXT: s_endpgm +; +; GFX1064-LABEL: sub_i32_uniform: +; GFX1064: ; %bb.0: ; %entry +; GFX1064-NEXT: s_load_dwordx2 s[4:5], s[0:1], 0x24 +; GFX1064-NEXT: v_cmp_ne_u32_e64 s[2:3], 1, 0 +; GFX1064-NEXT: s_load_dword s0, s[0:1], 0x2c +; GFX1064-NEXT: ; implicit-def: $vgpr1 +; GFX1064-NEXT: v_mbcnt_lo_u32_b32_e64 v0, s2, 0 +; GFX1064-NEXT: v_mbcnt_hi_u32_b32_e64 v0, s3, v0 +; GFX1064-NEXT: v_cmp_eq_u32_e32 vcc, 0, v0 +; GFX1064-NEXT: s_and_saveexec_b64 s[6:7], vcc +; GFX1064-NEXT: ; mask branch BB9_2 +; GFX1064-NEXT: s_cbranch_execz BB9_2 +; GFX1064-NEXT: BB9_1: +; GFX1064-NEXT: s_bcnt1_i32_b64 s1, s[2:3] +; GFX1064-NEXT: v_mov_b32_e32 v1, local_var32@abs32@lo +; GFX1064-NEXT: s_waitcnt lgkmcnt(0) +; GFX1064-NEXT: s_mul_i32 s1, s0, s1 +; GFX1064-NEXT: v_mov_b32_e32 v2, s1 +; GFX1064-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1064-NEXT: s_waitcnt_vscnt null, 0x0 +; GFX1064-NEXT: ds_sub_rtn_u32 v1, v1, v2 +; GFX1064-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1064-NEXT: buffer_gl0_inv +; GFX1064-NEXT: buffer_gl1_inv +; GFX1064-NEXT: BB9_2: +; GFX1064-NEXT: v_nop +; GFX1064-NEXT: s_or_b64 exec, exec, s[6:7] +; GFX1064-NEXT: s_waitcnt lgkmcnt(0) +; GFX1064-NEXT: v_mul_lo_u32 v0, s0, v0 +; GFX1064-NEXT: v_readfirstlane_b32 s0, v1 +; GFX1064-NEXT: s_mov_b32 s7, 0x31016000 +; GFX1064-NEXT: s_mov_b32 s6, -1 +; GFX1064-NEXT: v_sub_nc_u32_e32 v0, s0, v0 +; GFX1064-NEXT: buffer_store_dword v0, off, s[4:7], 0 +; GFX1064-NEXT: s_endpgm +; +; GFX1032-LABEL: sub_i32_uniform: +; GFX1032: ; %bb.0: ; %entry +; GFX1032-NEXT: s_load_dwordx2 s[4:5], s[0:1], 0x24 +; GFX1032-NEXT: s_load_dword s0, s[0:1], 0x2c +; GFX1032-NEXT: v_cmp_ne_u32_e64 s2, 1, 0 +; GFX1032-NEXT: ; implicit-def: $vcc_hi +; GFX1032-NEXT: ; implicit-def: $vgpr1 +; GFX1032-NEXT: v_mbcnt_lo_u32_b32_e64 v0, s2, 0 +; GFX1032-NEXT: v_cmp_eq_u32_e32 vcc_lo, 0, v0 +; GFX1032-NEXT: s_and_saveexec_b32 s1, vcc_lo +; GFX1032-NEXT: ; mask branch BB9_2 +; GFX1032-NEXT: s_cbranch_execz BB9_2 +; GFX1032-NEXT: 
BB9_1: +; GFX1032-NEXT: s_bcnt1_i32_b32 s2, s2 +; GFX1032-NEXT: v_mov_b32_e32 v1, local_var32@abs32@lo +; GFX1032-NEXT: s_waitcnt lgkmcnt(0) +; GFX1032-NEXT: s_mul_i32 s2, s0, s2 +; GFX1032-NEXT: v_mov_b32_e32 v2, s2 +; GFX1032-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1032-NEXT: s_waitcnt_vscnt null, 0x0 +; GFX1032-NEXT: ds_sub_rtn_u32 v1, v1, v2 +; GFX1032-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1032-NEXT: buffer_gl0_inv +; GFX1032-NEXT: buffer_gl1_inv +; GFX1032-NEXT: BB9_2: +; GFX1032-NEXT: v_nop +; GFX1032-NEXT: s_or_b32 exec_lo, exec_lo, s1 +; GFX1032-NEXT: s_waitcnt lgkmcnt(0) +; GFX1032-NEXT: v_mul_lo_u32 v0, s0, v0 +; GFX1032-NEXT: v_readfirstlane_b32 s0, v1 +; GFX1032-NEXT: s_mov_b32 s7, 0x31016000 +; GFX1032-NEXT: s_mov_b32 s6, -1 +; GFX1032-NEXT: v_sub_nc_u32_e32 v0, s0, v0 +; GFX1032-NEXT: buffer_store_dword v0, off, s[4:7], 0 +; GFX1032-NEXT: s_endpgm entry: %old = atomicrmw sub i32 addrspace(3)* @local_var32, i32 %subitive acq_rel store i32 %old, i32 addrspace(1)* %out ret void } -; GCN-LABEL: sub_i32_varying: ; GFX7LESS-NOT: v_mbcnt_lo_u32_b32 ; GFX7LESS-NOT: v_mbcnt_hi_u32_b32 ; GFX7LESS-NOT: s_bcnt1_i32_b64 -; GFX7LESS: ds_sub_rtn_u32 v{{[0-9]+}}, v{{[0-9]+}}, v{{[0-9]+}} ; DPPCOMB: v_add_u32_dpp ; DPPCOMB: v_add_u32_dpp ; GFX8MORE32: v_readlane_b32 s[[scalar_value:[0-9]+]], v{{[0-9]+}}, 31 -; GFX8MORE64: v_readlane_b32 s[[scalar_value:[0-9]+]], v{{[0-9]+}}, 63 ; GFX8MORE: v_mov_b32{{(_e[0-9]+)?}} v[[value:[0-9]+]], s[[scalar_value]] ; GFX8MORE: ds_sub_rtn_u32 v{{[0-9]+}}, v{{[0-9]+}}, v[[value]] define amdgpu_kernel void @sub_i32_varying(i32 addrspace(1)* %out) { +; +; +; GFX7LESS-LABEL: sub_i32_varying: +; GFX7LESS: ; %bb.0: ; %entry +; GFX7LESS-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x9 +; GFX7LESS-NEXT: v_mov_b32_e32 v1, local_var32@abs32@lo +; GFX7LESS-NEXT: s_mov_b32 m0, -1 +; GFX7LESS-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX7LESS-NEXT: ds_sub_rtn_u32 v0, v1, v0 +; GFX7LESS-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX7LESS-NEXT: 
buffer_wbinvl1 +; GFX7LESS-NEXT: s_mov_b32 s3, 0xf000 +; GFX7LESS-NEXT: s_mov_b32 s2, -1 +; GFX7LESS-NEXT: buffer_store_dword v0, off, s[0:3], 0 +; GFX7LESS-NEXT: s_endpgm +; +; GFX8-LABEL: sub_i32_varying: +; GFX8: ; %bb.0: ; %entry +; GFX8-NEXT: v_mov_b32_e32 v2, v0 +; GFX8-NEXT: s_or_saveexec_b64 s[2:3], -1 +; GFX8-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x24 +; GFX8-NEXT: v_mov_b32_e32 v1, 0 +; GFX8-NEXT: s_mov_b64 exec, s[2:3] +; GFX8-NEXT: v_cmp_ne_u32_e64 s[2:3], 1, 0 +; GFX8-NEXT: v_mbcnt_lo_u32_b32 v0, s2, 0 +; GFX8-NEXT: v_mbcnt_hi_u32_b32 v0, s3, v0 +; GFX8-NEXT: s_not_b64 exec, exec +; GFX8-NEXT: v_mov_b32_e32 v2, 0 +; GFX8-NEXT: s_not_b64 exec, exec +; GFX8-NEXT: s_or_saveexec_b64 s[4:5], -1 +; GFX8-NEXT: v_add_u32_dpp v2, vcc, v2, v2 row_shr:1 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX8-NEXT: s_nop 1 +; GFX8-NEXT: v_add_u32_dpp v2, vcc, v2, v2 row_shr:2 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX8-NEXT: s_nop 1 +; GFX8-NEXT: v_add_u32_dpp v2, vcc, v2, v2 row_shr:4 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX8-NEXT: s_nop 1 +; GFX8-NEXT: v_add_u32_dpp v2, vcc, v2, v2 row_shr:8 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX8-NEXT: s_nop 1 +; GFX8-NEXT: v_add_u32_dpp v2, vcc, v2, v2 row_bcast:15 row_mask:0xa bank_mask:0xf +; GFX8-NEXT: s_nop 1 +; GFX8-NEXT: v_add_u32_dpp v2, vcc, v2, v2 row_bcast:31 row_mask:0xc bank_mask:0xf +; GFX8-NEXT: v_readlane_b32 s2, v2, 63 +; GFX8-NEXT: s_nop 0 +; GFX8-NEXT: v_mov_b32_dpp v1, v2 wave_shr:1 row_mask:0xf bank_mask:0xf +; GFX8-NEXT: s_mov_b64 exec, s[4:5] +; GFX8-NEXT: v_cmp_eq_u32_e32 vcc, 0, v0 +; GFX8-NEXT: ; implicit-def: $vgpr0 +; GFX8-NEXT: s_and_saveexec_b64 s[4:5], vcc +; GFX8-NEXT: ; mask branch BB10_2 +; GFX8-NEXT: s_cbranch_execz BB10_2 +; GFX8-NEXT: BB10_1: +; GFX8-NEXT: v_mov_b32_e32 v0, local_var32@abs32@lo +; GFX8-NEXT: v_mov_b32_e32 v3, s2 +; GFX8-NEXT: s_mov_b32 m0, -1 +; GFX8-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX8-NEXT: ds_sub_rtn_u32 v0, v0, v3 +; GFX8-NEXT: s_waitcnt vmcnt(0) 
lgkmcnt(0) +; GFX8-NEXT: buffer_wbinvl1_vol +; GFX8-NEXT: BB10_2: +; GFX8-NEXT: s_or_b64 exec, exec, s[4:5] +; GFX8-NEXT: v_readfirstlane_b32 s2, v0 +; GFX8-NEXT: v_mov_b32_e32 v0, v1 +; GFX8-NEXT: v_sub_u32_e32 v0, vcc, s2, v0 +; GFX8-NEXT: s_mov_b32 s3, 0xf000 +; GFX8-NEXT: s_mov_b32 s2, -1 +; GFX8-NEXT: s_nop 0 +; GFX8-NEXT: s_waitcnt lgkmcnt(0) +; GFX8-NEXT: buffer_store_dword v0, off, s[0:3], 0 +; GFX8-NEXT: s_endpgm +; +; GFX9-LABEL: sub_i32_varying: +; GFX9: ; %bb.0: ; %entry +; GFX9-NEXT: v_mov_b32_e32 v2, v0 +; GFX9-NEXT: s_or_saveexec_b64 s[2:3], -1 +; GFX9-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x24 +; GFX9-NEXT: v_mov_b32_e32 v1, 0 +; GFX9-NEXT: s_mov_b64 exec, s[2:3] +; GFX9-NEXT: v_cmp_ne_u32_e64 s[2:3], 1, 0 +; GFX9-NEXT: v_mbcnt_lo_u32_b32 v0, s2, 0 +; GFX9-NEXT: v_mbcnt_hi_u32_b32 v0, s3, v0 +; GFX9-NEXT: s_not_b64 exec, exec +; GFX9-NEXT: v_mov_b32_e32 v2, 0 +; GFX9-NEXT: s_not_b64 exec, exec +; GFX9-NEXT: s_or_saveexec_b64 s[4:5], -1 +; GFX9-NEXT: v_add_u32_dpp v2, v2, v2 row_shr:1 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX9-NEXT: s_nop 1 +; GFX9-NEXT: v_add_u32_dpp v2, v2, v2 row_shr:2 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX9-NEXT: s_nop 1 +; GFX9-NEXT: v_add_u32_dpp v2, v2, v2 row_shr:4 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX9-NEXT: s_nop 1 +; GFX9-NEXT: v_add_u32_dpp v2, v2, v2 row_shr:8 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX9-NEXT: s_nop 1 +; GFX9-NEXT: v_add_u32_dpp v2, v2, v2 row_bcast:15 row_mask:0xa bank_mask:0xf +; GFX9-NEXT: s_nop 1 +; GFX9-NEXT: v_add_u32_dpp v2, v2, v2 row_bcast:31 row_mask:0xc bank_mask:0xf +; GFX9-NEXT: v_readlane_b32 s2, v2, 63 +; GFX9-NEXT: s_nop 0 +; GFX9-NEXT: v_mov_b32_dpp v1, v2 wave_shr:1 row_mask:0xf bank_mask:0xf +; GFX9-NEXT: s_mov_b64 exec, s[4:5] +; GFX9-NEXT: v_cmp_eq_u32_e32 vcc, 0, v0 +; GFX9-NEXT: ; implicit-def: $vgpr0 +; GFX9-NEXT: s_and_saveexec_b64 s[4:5], vcc +; GFX9-NEXT: ; mask branch BB10_2 +; GFX9-NEXT: s_cbranch_execz BB10_2 +; GFX9-NEXT: BB10_1: +; GFX9-NEXT: 
v_mov_b32_e32 v0, local_var32@abs32@lo +; GFX9-NEXT: v_mov_b32_e32 v3, s2 +; GFX9-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX9-NEXT: ds_sub_rtn_u32 v0, v0, v3 +; GFX9-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX9-NEXT: buffer_wbinvl1_vol +; GFX9-NEXT: BB10_2: +; GFX9-NEXT: s_or_b64 exec, exec, s[4:5] +; GFX9-NEXT: v_readfirstlane_b32 s2, v0 +; GFX9-NEXT: v_mov_b32_e32 v0, v1 +; GFX9-NEXT: v_sub_u32_e32 v0, s2, v0 +; GFX9-NEXT: s_mov_b32 s3, 0xf000 +; GFX9-NEXT: s_mov_b32 s2, -1 +; GFX9-NEXT: s_nop 0 +; GFX9-NEXT: s_waitcnt lgkmcnt(0) +; GFX9-NEXT: buffer_store_dword v0, off, s[0:3], 0 +; GFX9-NEXT: s_endpgm +; +; GFX1064-LABEL: sub_i32_varying: +; GFX1064: ; %bb.0: ; %entry +; GFX1064-NEXT: v_mov_b32_e32 v2, v0 +; GFX1064-NEXT: s_or_saveexec_b64 s[2:3], -1 +; GFX1064-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x24 +; GFX1064-NEXT: v_mov_b32_e32 v1, 0 +; GFX1064-NEXT: s_mov_b64 exec, s[2:3] +; GFX1064-NEXT: v_cmp_ne_u32_e64 s[2:3], 1, 0 +; GFX1064-NEXT: v_mbcnt_lo_u32_b32_e64 v0, s2, 0 +; GFX1064-NEXT: v_mbcnt_hi_u32_b32_e64 v0, s3, v0 +; GFX1064-NEXT: s_not_b64 exec, exec +; GFX1064-NEXT: v_mov_b32_e32 v2, 0 +; GFX1064-NEXT: s_not_b64 exec, exec +; GFX1064-NEXT: s_or_saveexec_b64 s[4:5], -1 +; GFX1064-NEXT: v_add_nc_u32_dpp v2, v2, v2 row_shr:1 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX1064-NEXT: v_add_nc_u32_dpp v2, v2, v2 row_shr:2 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX1064-NEXT: v_add_nc_u32_dpp v2, v2, v2 row_shr:4 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX1064-NEXT: v_add_nc_u32_dpp v2, v2, v2 row_shr:8 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX1064-NEXT: v_mov_b32_e32 v3, v2 +; GFX1064-NEXT: v_permlanex16_b32 v3, v3, -1, -1 +; GFX1064-NEXT: v_add_nc_u32_dpp v2, v3, v2 quad_perm:[0,1,2,3] row_mask:0xa bank_mask:0xf +; GFX1064-NEXT: v_readlane_b32 s2, v2, 31 +; GFX1064-NEXT: v_mov_b32_e32 v3, s2 +; GFX1064-NEXT: v_add_nc_u32_dpp v2, v3, v2 quad_perm:[0,1,2,3] row_mask:0xc bank_mask:0xf +; GFX1064-NEXT: v_readlane_b32 s2, v2, 15 +; GFX1064-NEXT: 
v_mov_b32_dpp v1, v2 row_shr:1 row_mask:0xf bank_mask:0xf +; GFX1064-NEXT: v_readlane_b32 s3, v2, 31 +; GFX1064-NEXT: v_readlane_b32 s6, v2, 47 +; GFX1064-NEXT: v_writelane_b32 v1, s2, 16 +; GFX1064-NEXT: s_mov_b32 s2, -1 +; GFX1064-NEXT: v_writelane_b32 v1, s3, 32 +; GFX1064-NEXT: v_readlane_b32 s3, v2, 63 +; GFX1064-NEXT: v_writelane_b32 v1, s6, 48 +; GFX1064-NEXT: s_mov_b64 exec, s[4:5] +; GFX1064-NEXT: v_cmp_eq_u32_e32 vcc, 0, v0 +; GFX1064-NEXT: ; implicit-def: $vgpr0 +; GFX1064-NEXT: s_and_saveexec_b64 s[4:5], vcc +; GFX1064-NEXT: ; mask branch BB10_2 +; GFX1064-NEXT: s_cbranch_execz BB10_2 +; GFX1064-NEXT: BB10_1: +; GFX1064-NEXT: v_mov_b32_e32 v0, local_var32@abs32@lo +; GFX1064-NEXT: v_mov_b32_e32 v7, s3 +; GFX1064-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1064-NEXT: s_waitcnt_vscnt null, 0x0 +; GFX1064-NEXT: ds_sub_rtn_u32 v0, v0, v7 +; GFX1064-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1064-NEXT: buffer_gl0_inv +; GFX1064-NEXT: buffer_gl1_inv +; GFX1064-NEXT: BB10_2: +; GFX1064-NEXT: v_nop +; GFX1064-NEXT: s_or_b64 exec, exec, s[4:5] +; GFX1064-NEXT: v_readfirstlane_b32 s3, v0 +; GFX1064-NEXT: v_mov_b32_e32 v0, v1 +; GFX1064-NEXT: v_sub_nc_u32_e32 v0, s3, v0 +; GFX1064-NEXT: s_mov_b32 s3, 0x31016000 +; GFX1064-NEXT: s_nop 1 +; GFX1064-NEXT: s_waitcnt lgkmcnt(0) +; GFX1064-NEXT: buffer_store_dword v0, off, s[0:3], 0 +; GFX1064-NEXT: s_endpgm +; +; GFX1032-LABEL: sub_i32_varying: +; GFX1032: ; %bb.0: ; %entry +; GFX1032-NEXT: ; implicit-def: $vcc_hi +; GFX1032-NEXT: v_mov_b32_e32 v2, v0 +; GFX1032-NEXT: s_or_saveexec_b32 s2, -1 +; GFX1032-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x24 +; GFX1032-NEXT: v_mov_b32_e32 v1, 0 +; GFX1032-NEXT: s_mov_b32 exec_lo, s2 +; GFX1032-NEXT: v_cmp_ne_u32_e64 s2, 1, 0 +; GFX1032-NEXT: v_mbcnt_lo_u32_b32_e64 v0, s2, 0 +; GFX1032-NEXT: s_not_b32 exec_lo, exec_lo +; GFX1032-NEXT: v_mov_b32_e32 v2, 0 +; GFX1032-NEXT: s_not_b32 exec_lo, exec_lo +; GFX1032-NEXT: s_or_saveexec_b32 s4, -1 +; GFX1032-NEXT: v_add_nc_u32_dpp v2, v2, v2 
row_shr:1 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX1032-NEXT: s_mov_b32 s2, -1 +; GFX1032-NEXT: v_add_nc_u32_dpp v2, v2, v2 row_shr:2 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX1032-NEXT: v_add_nc_u32_dpp v2, v2, v2 row_shr:4 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX1032-NEXT: v_add_nc_u32_dpp v2, v2, v2 row_shr:8 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX1032-NEXT: v_mov_b32_e32 v3, v2 +; GFX1032-NEXT: v_permlanex16_b32 v3, v3, -1, -1 +; GFX1032-NEXT: v_add_nc_u32_dpp v2, v3, v2 quad_perm:[0,1,2,3] row_mask:0xa bank_mask:0xf +; GFX1032-NEXT: v_readlane_b32 s3, v2, 31 +; GFX1032-NEXT: v_mov_b32_dpp v1, v2 row_shr:1 row_mask:0xf bank_mask:0xf +; GFX1032-NEXT: v_readlane_b32 s5, v2, 15 +; GFX1032-NEXT: v_writelane_b32 v1, s5, 16 +; GFX1032-NEXT: s_mov_b32 exec_lo, s4 +; GFX1032-NEXT: v_cmp_eq_u32_e32 vcc_lo, 0, v0 +; GFX1032-NEXT: ; implicit-def: $vgpr0 +; GFX1032-NEXT: s_and_saveexec_b32 s4, vcc_lo +; GFX1032-NEXT: ; mask branch BB10_2 +; GFX1032-NEXT: s_cbranch_execz BB10_2 +; GFX1032-NEXT: BB10_1: +; GFX1032-NEXT: v_mov_b32_e32 v0, local_var32@abs32@lo +; GFX1032-NEXT: v_mov_b32_e32 v7, s3 +; GFX1032-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1032-NEXT: s_waitcnt_vscnt null, 0x0 +; GFX1032-NEXT: ds_sub_rtn_u32 v0, v0, v7 +; GFX1032-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1032-NEXT: buffer_gl0_inv +; GFX1032-NEXT: buffer_gl1_inv +; GFX1032-NEXT: BB10_2: +; GFX1032-NEXT: v_nop +; GFX1032-NEXT: s_or_b32 exec_lo, exec_lo, s4 +; GFX1032-NEXT: v_readfirstlane_b32 s3, v0 +; GFX1032-NEXT: v_mov_b32_e32 v0, v1 +; GFX1032-NEXT: v_sub_nc_u32_e32 v0, s3, v0 +; GFX1032-NEXT: s_mov_b32 s3, 0x31016000 +; GFX1032-NEXT: s_nop 1 +; GFX1032-NEXT: s_waitcnt lgkmcnt(0) +; GFX1032-NEXT: buffer_store_dword v0, off, s[0:3], 0 +; GFX1032-NEXT: s_endpgm entry: %lane = call i32 @llvm.amdgcn.workitem.id.x() %old = atomicrmw sub i32 addrspace(3)* @local_var32, i32 %lane acq_rel @@ -310,46 +2194,505 @@ entry: ret void } -; GCN-LABEL: sub_i64_constant: -; GCN32: 
v_cmp_ne_u32_e64 s[[exec_lo:[0-9]+]], 1, 0 -; GCN64: v_cmp_ne_u32_e64 s{{\[}}[[exec_lo:[0-9]+]]:[[exec_hi:[0-9]+]]{{\]}}, 1, 0 -; GCN: v_mbcnt_lo_u32_b32{{(_e[0-9]+)?}} v[[mbcnt:[0-9]+]], s[[exec_lo]], 0 -; GCN64: v_mbcnt_hi_u32_b32{{(_e[0-9]+)?}} v[[mbcnt]], s[[exec_hi]], v[[mbcnt]] -; GCN: v_cmp_eq_u32{{(_e[0-9]+)?}} vcc{{(_lo)?}}, 0, v[[mbcnt]] -; GCN32: s_bcnt1_i32_b32 s[[popcount:[0-9]+]], s[[exec_lo]] -; GCN64: s_bcnt1_i32_b64 s[[popcount:[0-9]+]], s{{\[}}[[exec_lo]]:[[exec_hi]]{{\]}} -; GCN: v_mul_hi_u32_u24{{(_e[0-9]+)?}} v[[value_hi:[0-9]+]], s[[popcount]], 5 -; GCN: v_mul_u32_u24{{(_e[0-9]+)?}} v[[value_lo:[0-9]+]], s[[popcount]], 5 -; GCN: ds_sub_rtn_u64 v{{\[}}{{[0-9]+}}:{{[0-9]+}}{{\]}}, v{{[0-9]+}}, v{{\[}}[[value_lo]]:[[value_hi]]{{\]}} define amdgpu_kernel void @sub_i64_constant(i64 addrspace(1)* %out) { +; +; +; GFX7LESS-LABEL: sub_i64_constant: +; GFX7LESS: ; %bb.0: ; %entry +; GFX7LESS-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x9 +; GFX7LESS-NEXT: v_cmp_ne_u32_e64 s[4:5], 1, 0 +; GFX7LESS-NEXT: v_mbcnt_lo_u32_b32_e64 v0, s4, 0 +; GFX7LESS-NEXT: v_mbcnt_hi_u32_b32_e32 v0, s5, v0 +; GFX7LESS-NEXT: v_cmp_eq_u32_e32 vcc, 0, v0 +; GFX7LESS-NEXT: ; implicit-def: $vgpr1_vgpr2 +; GFX7LESS-NEXT: s_and_saveexec_b64 s[2:3], vcc +; GFX7LESS-NEXT: ; mask branch BB11_2 +; GFX7LESS-NEXT: s_cbranch_execz BB11_2 +; GFX7LESS-NEXT: BB11_1: +; GFX7LESS-NEXT: s_bcnt1_i32_b64 s4, s[4:5] +; GFX7LESS-NEXT: v_mov_b32_e32 v3, local_var64@abs32@lo +; GFX7LESS-NEXT: v_mul_hi_u32_u24_e64 v2, s4, 5 +; GFX7LESS-NEXT: v_mul_u32_u24_e64 v1, s4, 5 +; GFX7LESS-NEXT: s_mov_b32 m0, -1 +; GFX7LESS-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX7LESS-NEXT: ds_sub_rtn_u64 v[1:2], v3, v[1:2] +; GFX7LESS-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX7LESS-NEXT: buffer_wbinvl1 +; GFX7LESS-NEXT: BB11_2: +; GFX7LESS-NEXT: s_or_b64 exec, exec, s[2:3] +; GFX7LESS-NEXT: v_readfirstlane_b32 s2, v1 +; GFX7LESS-NEXT: v_readfirstlane_b32 s4, v2 +; GFX7LESS-NEXT: v_mul_hi_u32_u24_e32 v1, 5, v0 +; 
GFX7LESS-NEXT: v_mul_u32_u24_e32 v0, 5, v0 +; GFX7LESS-NEXT: s_mov_b32 s3, 0xf000 +; GFX7LESS-NEXT: v_mov_b32_e32 v2, s4 +; GFX7LESS-NEXT: v_sub_i32_e32 v0, vcc, s2, v0 +; GFX7LESS-NEXT: v_subb_u32_e32 v1, vcc, v2, v1, vcc +; GFX7LESS-NEXT: s_mov_b32 s2, -1 +; GFX7LESS-NEXT: s_waitcnt lgkmcnt(0) +; GFX7LESS-NEXT: buffer_store_dwordx2 v[0:1], off, s[0:3], 0 +; GFX7LESS-NEXT: s_endpgm +; +; GFX8-LABEL: sub_i64_constant: +; GFX8: ; %bb.0: ; %entry +; GFX8-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x24 +; GFX8-NEXT: v_cmp_ne_u32_e64 s[4:5], 1, 0 +; GFX8-NEXT: v_mbcnt_lo_u32_b32 v0, s4, 0 +; GFX8-NEXT: v_mbcnt_hi_u32_b32 v0, s5, v0 +; GFX8-NEXT: v_cmp_eq_u32_e32 vcc, 0, v0 +; GFX8-NEXT: ; implicit-def: $vgpr1_vgpr2 +; GFX8-NEXT: s_and_saveexec_b64 s[2:3], vcc +; GFX8-NEXT: ; mask branch BB11_2 +; GFX8-NEXT: s_cbranch_execz BB11_2 +; GFX8-NEXT: BB11_1: +; GFX8-NEXT: s_bcnt1_i32_b64 s4, s[4:5] +; GFX8-NEXT: v_mul_hi_u32_u24_e64 v2, s4, 5 +; GFX8-NEXT: v_mul_u32_u24_e64 v1, s4, 5 +; GFX8-NEXT: v_mov_b32_e32 v3, local_var64@abs32@lo +; GFX8-NEXT: s_mov_b32 m0, -1 +; GFX8-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX8-NEXT: ds_sub_rtn_u64 v[1:2], v3, v[1:2] +; GFX8-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX8-NEXT: buffer_wbinvl1_vol +; GFX8-NEXT: BB11_2: +; GFX8-NEXT: s_or_b64 exec, exec, s[2:3] +; GFX8-NEXT: v_readfirstlane_b32 s3, v2 +; GFX8-NEXT: v_readfirstlane_b32 s2, v1 +; GFX8-NEXT: v_mul_hi_u32_u24_e32 v1, 5, v0 +; GFX8-NEXT: v_mul_u32_u24_e32 v0, 5, v0 +; GFX8-NEXT: v_mov_b32_e32 v2, s3 +; GFX8-NEXT: v_sub_u32_e32 v0, vcc, s2, v0 +; GFX8-NEXT: v_subb_u32_e32 v1, vcc, v2, v1, vcc +; GFX8-NEXT: s_mov_b32 s3, 0xf000 +; GFX8-NEXT: s_mov_b32 s2, -1 +; GFX8-NEXT: s_waitcnt lgkmcnt(0) +; GFX8-NEXT: buffer_store_dwordx2 v[0:1], off, s[0:3], 0 +; GFX8-NEXT: s_endpgm +; +; GFX9-LABEL: sub_i64_constant: +; GFX9: ; %bb.0: ; %entry +; GFX9-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x24 +; GFX9-NEXT: v_cmp_ne_u32_e64 s[4:5], 1, 0 +; GFX9-NEXT: v_mbcnt_lo_u32_b32 v0, s4, 0 +; GFX9-NEXT: 
v_mbcnt_hi_u32_b32 v0, s5, v0 +; GFX9-NEXT: v_cmp_eq_u32_e32 vcc, 0, v0 +; GFX9-NEXT: ; implicit-def: $vgpr1_vgpr2 +; GFX9-NEXT: s_and_saveexec_b64 s[2:3], vcc +; GFX9-NEXT: ; mask branch BB11_2 +; GFX9-NEXT: s_cbranch_execz BB11_2 +; GFX9-NEXT: BB11_1: +; GFX9-NEXT: s_bcnt1_i32_b64 s4, s[4:5] +; GFX9-NEXT: v_mul_hi_u32_u24_e64 v2, s4, 5 +; GFX9-NEXT: v_mul_u32_u24_e64 v1, s4, 5 +; GFX9-NEXT: v_mov_b32_e32 v3, local_var64@abs32@lo +; GFX9-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX9-NEXT: ds_sub_rtn_u64 v[1:2], v3, v[1:2] +; GFX9-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX9-NEXT: buffer_wbinvl1_vol +; GFX9-NEXT: BB11_2: +; GFX9-NEXT: s_or_b64 exec, exec, s[2:3] +; GFX9-NEXT: v_readfirstlane_b32 s3, v2 +; GFX9-NEXT: v_readfirstlane_b32 s2, v1 +; GFX9-NEXT: v_mul_hi_u32_u24_e32 v1, 5, v0 +; GFX9-NEXT: v_mul_u32_u24_e32 v0, 5, v0 +; GFX9-NEXT: v_mov_b32_e32 v2, s3 +; GFX9-NEXT: v_sub_co_u32_e32 v0, vcc, s2, v0 +; GFX9-NEXT: v_subb_co_u32_e32 v1, vcc, v2, v1, vcc +; GFX9-NEXT: s_mov_b32 s3, 0xf000 +; GFX9-NEXT: s_mov_b32 s2, -1 +; GFX9-NEXT: s_waitcnt lgkmcnt(0) +; GFX9-NEXT: buffer_store_dwordx2 v[0:1], off, s[0:3], 0 +; GFX9-NEXT: s_endpgm +; +; GFX1064-LABEL: sub_i64_constant: +; GFX1064: ; %bb.0: ; %entry +; GFX1064-NEXT: v_cmp_ne_u32_e64 s[2:3], 1, 0 +; GFX1064-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x24 +; GFX1064-NEXT: ; implicit-def: $vgpr1_vgpr2 +; GFX1064-NEXT: v_mbcnt_lo_u32_b32_e64 v0, s2, 0 +; GFX1064-NEXT: v_mbcnt_hi_u32_b32_e64 v0, s3, v0 +; GFX1064-NEXT: v_cmp_eq_u32_e32 vcc, 0, v0 +; GFX1064-NEXT: s_and_saveexec_b64 s[4:5], vcc +; GFX1064-NEXT: ; mask branch BB11_2 +; GFX1064-NEXT: s_cbranch_execz BB11_2 +; GFX1064-NEXT: BB11_1: +; GFX1064-NEXT: s_bcnt1_i32_b64 s2, s[2:3] +; GFX1064-NEXT: v_mov_b32_e32 v3, local_var64@abs32@lo +; GFX1064-NEXT: v_mul_hi_u32_u24_e64 v2, s2, 5 +; GFX1064-NEXT: v_mul_u32_u24_e64 v1, s2, 5 +; GFX1064-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1064-NEXT: s_waitcnt_vscnt null, 0x0 +; GFX1064-NEXT: ds_sub_rtn_u64 v[1:2], v3, 
v[1:2] +; GFX1064-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1064-NEXT: buffer_gl0_inv +; GFX1064-NEXT: buffer_gl1_inv +; GFX1064-NEXT: BB11_2: +; GFX1064-NEXT: v_nop +; GFX1064-NEXT: s_or_b64 exec, exec, s[4:5] +; GFX1064-NEXT: v_readfirstlane_b32 s2, v1 +; GFX1064-NEXT: v_mul_u32_u24_e32 v1, 5, v0 +; GFX1064-NEXT: v_readfirstlane_b32 s3, v2 +; GFX1064-NEXT: v_mul_hi_u32_u24_e32 v2, 5, v0 +; GFX1064-NEXT: v_sub_co_u32_e64 v0, vcc, s2, v1 +; GFX1064-NEXT: s_mov_b32 s2, -1 +; GFX1064-NEXT: v_sub_co_ci_u32_e32 v1, vcc, s3, v2, vcc +; GFX1064-NEXT: s_mov_b32 s3, 0x31016000 +; GFX1064-NEXT: s_waitcnt lgkmcnt(0) +; GFX1064-NEXT: buffer_store_dwordx2 v[0:1], off, s[0:3], 0 +; GFX1064-NEXT: s_endpgm +; +; GFX1032-LABEL: sub_i64_constant: +; GFX1032: ; %bb.0: ; %entry +; GFX1032-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x24 +; GFX1032-NEXT: v_cmp_ne_u32_e64 s3, 1, 0 +; GFX1032-NEXT: ; implicit-def: $vcc_hi +; GFX1032-NEXT: ; implicit-def: $vgpr1_vgpr2 +; GFX1032-NEXT: v_mbcnt_lo_u32_b32_e64 v0, s3, 0 +; GFX1032-NEXT: v_cmp_eq_u32_e32 vcc_lo, 0, v0 +; GFX1032-NEXT: s_and_saveexec_b32 s2, vcc_lo +; GFX1032-NEXT: ; mask branch BB11_2 +; GFX1032-NEXT: s_cbranch_execz BB11_2 +; GFX1032-NEXT: BB11_1: +; GFX1032-NEXT: s_bcnt1_i32_b32 s3, s3 +; GFX1032-NEXT: v_mov_b32_e32 v3, local_var64@abs32@lo +; GFX1032-NEXT: v_mul_hi_u32_u24_e64 v2, s3, 5 +; GFX1032-NEXT: v_mul_u32_u24_e64 v1, s3, 5 +; GFX1032-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1032-NEXT: s_waitcnt_vscnt null, 0x0 +; GFX1032-NEXT: ds_sub_rtn_u64 v[1:2], v3, v[1:2] +; GFX1032-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1032-NEXT: buffer_gl0_inv +; GFX1032-NEXT: buffer_gl1_inv +; GFX1032-NEXT: BB11_2: +; GFX1032-NEXT: v_nop +; GFX1032-NEXT: s_or_b32 exec_lo, exec_lo, s2 +; GFX1032-NEXT: v_readfirstlane_b32 s2, v1 +; GFX1032-NEXT: v_mul_u32_u24_e32 v1, 5, v0 +; GFX1032-NEXT: v_readfirstlane_b32 s3, v2 +; GFX1032-NEXT: v_mul_hi_u32_u24_e32 v2, 5, v0 +; GFX1032-NEXT: v_sub_co_u32_e64 v0, vcc_lo, s2, v1 +; GFX1032-NEXT: 
s_mov_b32 s2, -1 +; GFX1032-NEXT: v_sub_co_ci_u32_e32 v1, vcc_lo, s3, v2, vcc_lo +; GFX1032-NEXT: s_mov_b32 s3, 0x31016000 +; GFX1032-NEXT: s_waitcnt lgkmcnt(0) +; GFX1032-NEXT: buffer_store_dwordx2 v[0:1], off, s[0:3], 0 +; GFX1032-NEXT: s_endpgm entry: %old = atomicrmw sub i64 addrspace(3)* @local_var64, i64 5 acq_rel store i64 %old, i64 addrspace(1)* %out ret void } -; GCN-LABEL: sub_i64_uniform: -; GCN32: v_cmp_ne_u32_e64 s[[exec_lo:[0-9]+]], 1, 0 -; GCN64: v_cmp_ne_u32_e64 s{{\[}}[[exec_lo:[0-9]+]]:[[exec_hi:[0-9]+]]{{\]}}, 1, 0 -; GCN: v_mbcnt_lo_u32_b32{{(_e[0-9]+)?}} v[[mbcnt:[0-9]+]], s[[exec_lo]], 0 -; GCN64: v_mbcnt_hi_u32_b32{{(_e[0-9]+)?}} v[[mbcnt]], s[[exec_hi]], v[[mbcnt]] -; GCN: v_cmp_eq_u32{{(_e[0-9]+)?}} vcc{{(_lo)?}}, 0, v[[mbcnt]] -; GCN32: s_bcnt1_i32_b32 s{{[0-9]+}}, s[[exec_lo]] -; GCN64: s_bcnt1_i32_b64 s{{[0-9]+}}, s{{\[}}[[exec_lo]]:[[exec_hi]]{{\]}} -; GCN: ds_sub_rtn_u64 v{{\[}}{{[0-9]+}}:{{[0-9]+}}{{\]}}, v{{[0-9]+}}, v{{\[}}{{[0-9]+}}:{{[0-9]+}}{{\]}} define amdgpu_kernel void @sub_i64_uniform(i64 addrspace(1)* %out, i64 %subitive) { +; +; +; GFX7LESS-LABEL: sub_i64_uniform: +; GFX7LESS: ; %bb.0: ; %entry +; GFX7LESS-NEXT: s_load_dwordx4 s[0:3], s[0:1], 0x9 +; GFX7LESS-NEXT: v_cmp_ne_u32_e64 s[6:7], 1, 0 +; GFX7LESS-NEXT: v_mbcnt_lo_u32_b32_e64 v0, s6, 0 +; GFX7LESS-NEXT: v_mbcnt_hi_u32_b32_e32 v0, s7, v0 +; GFX7LESS-NEXT: v_cmp_eq_u32_e32 vcc, 0, v0 +; GFX7LESS-NEXT: ; implicit-def: $vgpr1_vgpr2 +; GFX7LESS-NEXT: s_and_saveexec_b64 s[4:5], vcc +; GFX7LESS-NEXT: ; mask branch BB12_2 +; GFX7LESS-NEXT: s_cbranch_execz BB12_2 +; GFX7LESS-NEXT: BB12_1: +; GFX7LESS-NEXT: s_bcnt1_i32_b64 s6, s[6:7] +; GFX7LESS-NEXT: v_mov_b32_e32 v3, local_var64@abs32@lo +; GFX7LESS-NEXT: s_waitcnt lgkmcnt(0) +; GFX7LESS-NEXT: s_mul_i32 s7, s3, s6 +; GFX7LESS-NEXT: v_mov_b32_e32 v1, s6 +; GFX7LESS-NEXT: v_mul_hi_u32 v1, s2, v1 +; GFX7LESS-NEXT: s_mul_i32 s6, s2, s6 +; GFX7LESS-NEXT: v_add_i32_e32 v2, vcc, s7, v1 +; GFX7LESS-NEXT: v_mov_b32_e32 v1, s6 +; 
GFX7LESS-NEXT: s_mov_b32 m0, -1 +; GFX7LESS-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX7LESS-NEXT: ds_sub_rtn_u64 v[1:2], v3, v[1:2] +; GFX7LESS-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX7LESS-NEXT: buffer_wbinvl1 +; GFX7LESS-NEXT: BB12_2: +; GFX7LESS-NEXT: s_or_b64 exec, exec, s[4:5] +; GFX7LESS-NEXT: s_mov_b32 s7, 0xf000 +; GFX7LESS-NEXT: s_mov_b32 s6, -1 +; GFX7LESS-NEXT: s_waitcnt lgkmcnt(0) +; GFX7LESS-NEXT: s_mov_b32 s4, s0 +; GFX7LESS-NEXT: s_mov_b32 s5, s1 +; GFX7LESS-NEXT: v_readfirstlane_b32 s0, v1 +; GFX7LESS-NEXT: v_readfirstlane_b32 s1, v2 +; GFX7LESS-NEXT: v_mul_lo_u32 v1, s3, v0 +; GFX7LESS-NEXT: v_mul_hi_u32 v2, s2, v0 +; GFX7LESS-NEXT: v_mul_lo_u32 v0, s2, v0 +; GFX7LESS-NEXT: v_add_i32_e32 v1, vcc, v2, v1 +; GFX7LESS-NEXT: v_mov_b32_e32 v2, s1 +; GFX7LESS-NEXT: v_sub_i32_e32 v0, vcc, s0, v0 +; GFX7LESS-NEXT: v_subb_u32_e32 v1, vcc, v2, v1, vcc +; GFX7LESS-NEXT: buffer_store_dwordx2 v[0:1], off, s[4:7], 0 +; GFX7LESS-NEXT: s_endpgm +; +; GFX8-LABEL: sub_i64_uniform: +; GFX8: ; %bb.0: ; %entry +; GFX8-NEXT: s_load_dwordx4 s[0:3], s[0:1], 0x24 +; GFX8-NEXT: v_cmp_ne_u32_e64 s[6:7], 1, 0 +; GFX8-NEXT: v_mbcnt_lo_u32_b32 v0, s6, 0 +; GFX8-NEXT: v_mbcnt_hi_u32_b32 v0, s7, v0 +; GFX8-NEXT: v_cmp_eq_u32_e32 vcc, 0, v0 +; GFX8-NEXT: ; implicit-def: $vgpr1_vgpr2 +; GFX8-NEXT: s_and_saveexec_b64 s[4:5], vcc +; GFX8-NEXT: ; mask branch BB12_2 +; GFX8-NEXT: s_cbranch_execz BB12_2 +; GFX8-NEXT: BB12_1: +; GFX8-NEXT: s_bcnt1_i32_b64 s6, s[6:7] +; GFX8-NEXT: v_mov_b32_e32 v1, s6 +; GFX8-NEXT: s_waitcnt lgkmcnt(0) +; GFX8-NEXT: v_mul_hi_u32 v1, s2, v1 +; GFX8-NEXT: s_mul_i32 s7, s3, s6 +; GFX8-NEXT: s_mul_i32 s6, s2, s6 +; GFX8-NEXT: v_mov_b32_e32 v3, local_var64@abs32@lo +; GFX8-NEXT: v_add_u32_e32 v2, vcc, s7, v1 +; GFX8-NEXT: v_mov_b32_e32 v1, s6 +; GFX8-NEXT: s_mov_b32 m0, -1 +; GFX8-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX8-NEXT: ds_sub_rtn_u64 v[1:2], v3, v[1:2] +; GFX8-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX8-NEXT: buffer_wbinvl1_vol +; GFX8-NEXT: 
BB12_2: +; GFX8-NEXT: s_or_b64 exec, exec, s[4:5] +; GFX8-NEXT: s_waitcnt lgkmcnt(0) +; GFX8-NEXT: s_mov_b32 s4, s0 +; GFX8-NEXT: v_readfirstlane_b32 s0, v1 +; GFX8-NEXT: v_mul_lo_u32 v1, s3, v0 +; GFX8-NEXT: v_mul_hi_u32 v3, s2, v0 +; GFX8-NEXT: v_mul_lo_u32 v0, s2, v0 +; GFX8-NEXT: s_mov_b32 s5, s1 +; GFX8-NEXT: v_readfirstlane_b32 s1, v2 +; GFX8-NEXT: v_add_u32_e32 v1, vcc, v3, v1 +; GFX8-NEXT: v_mov_b32_e32 v2, s1 +; GFX8-NEXT: v_sub_u32_e32 v0, vcc, s0, v0 +; GFX8-NEXT: s_mov_b32 s7, 0xf000 +; GFX8-NEXT: s_mov_b32 s6, -1 +; GFX8-NEXT: v_subb_u32_e32 v1, vcc, v2, v1, vcc +; GFX8-NEXT: buffer_store_dwordx2 v[0:1], off, s[4:7], 0 +; GFX8-NEXT: s_endpgm +; +; GFX9-LABEL: sub_i64_uniform: +; GFX9: ; %bb.0: ; %entry +; GFX9-NEXT: s_load_dwordx4 s[0:3], s[0:1], 0x24 +; GFX9-NEXT: v_cmp_ne_u32_e64 s[6:7], 1, 0 +; GFX9-NEXT: v_mbcnt_lo_u32_b32 v0, s6, 0 +; GFX9-NEXT: v_mbcnt_hi_u32_b32 v0, s7, v0 +; GFX9-NEXT: v_cmp_eq_u32_e32 vcc, 0, v0 +; GFX9-NEXT: ; implicit-def: $vgpr1_vgpr2 +; GFX9-NEXT: s_and_saveexec_b64 s[4:5], vcc +; GFX9-NEXT: ; mask branch BB12_2 +; GFX9-NEXT: s_cbranch_execz BB12_2 +; GFX9-NEXT: BB12_1: +; GFX9-NEXT: s_bcnt1_i32_b64 s6, s[6:7] +; GFX9-NEXT: v_mov_b32_e32 v1, s6 +; GFX9-NEXT: s_waitcnt lgkmcnt(0) +; GFX9-NEXT: v_mul_hi_u32 v2, s2, v1 +; GFX9-NEXT: s_mul_i32 s7, s3, s6 +; GFX9-NEXT: s_mul_i32 s6, s2, s6 +; GFX9-NEXT: v_mov_b32_e32 v1, s6 +; GFX9-NEXT: v_add_u32_e32 v2, s7, v2 +; GFX9-NEXT: v_mov_b32_e32 v3, local_var64 at abs32@lo +; GFX9-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX9-NEXT: ds_sub_rtn_u64 v[1:2], v3, v[1:2] +; GFX9-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX9-NEXT: buffer_wbinvl1_vol +; GFX9-NEXT: BB12_2: +; GFX9-NEXT: s_or_b64 exec, exec, s[4:5] +; GFX9-NEXT: s_waitcnt lgkmcnt(0) +; GFX9-NEXT: v_mul_lo_u32 v3, s3, v0 +; GFX9-NEXT: v_mul_hi_u32 v4, s2, v0 +; GFX9-NEXT: v_mul_lo_u32 v0, s2, v0 +; GFX9-NEXT: s_mov_b32 s4, s0 +; GFX9-NEXT: v_readfirstlane_b32 s0, v1 +; GFX9-NEXT: s_mov_b32 s5, s1 +; GFX9-NEXT: v_readfirstlane_b32 
s1, v2 +; GFX9-NEXT: v_add_u32_e32 v1, v4, v3 +; GFX9-NEXT: v_mov_b32_e32 v2, s1 +; GFX9-NEXT: v_sub_co_u32_e32 v0, vcc, s0, v0 +; GFX9-NEXT: s_mov_b32 s7, 0xf000 +; GFX9-NEXT: s_mov_b32 s6, -1 +; GFX9-NEXT: v_subb_co_u32_e32 v1, vcc, v2, v1, vcc +; GFX9-NEXT: buffer_store_dwordx2 v[0:1], off, s[4:7], 0 +; GFX9-NEXT: s_endpgm +; +; GFX1064-LABEL: sub_i64_uniform: +; GFX1064: ; %bb.0: ; %entry +; GFX1064-NEXT: v_cmp_ne_u32_e64 s[6:7], 1, 0 +; GFX1064-NEXT: s_load_dwordx4 s[0:3], s[0:1], 0x24 +; GFX1064-NEXT: ; implicit-def: $vgpr1_vgpr2 +; GFX1064-NEXT: v_mbcnt_lo_u32_b32_e64 v0, s6, 0 +; GFX1064-NEXT: v_mbcnt_hi_u32_b32_e64 v0, s7, v0 +; GFX1064-NEXT: v_cmp_eq_u32_e32 vcc, 0, v0 +; GFX1064-NEXT: s_and_saveexec_b64 s[4:5], vcc +; GFX1064-NEXT: ; mask branch BB12_2 +; GFX1064-NEXT: s_cbranch_execz BB12_2 +; GFX1064-NEXT: BB12_1: +; GFX1064-NEXT: s_bcnt1_i32_b64 s6, s[6:7] +; GFX1064-NEXT: v_mov_b32_e32 v3, local_var64 at abs32@lo +; GFX1064-NEXT: s_waitcnt lgkmcnt(0) +; GFX1064-NEXT: v_mul_hi_u32 v2, s2, s6 +; GFX1064-NEXT: s_mul_i32 s7, s2, s6 +; GFX1064-NEXT: s_mul_i32 s6, s3, s6 +; GFX1064-NEXT: v_mov_b32_e32 v1, s7 +; GFX1064-NEXT: v_add_nc_u32_e32 v2, s6, v2 +; GFX1064-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1064-NEXT: s_waitcnt_vscnt null, 0x0 +; GFX1064-NEXT: ds_sub_rtn_u64 v[1:2], v3, v[1:2] +; GFX1064-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1064-NEXT: buffer_gl0_inv +; GFX1064-NEXT: buffer_gl1_inv +; GFX1064-NEXT: BB12_2: +; GFX1064-NEXT: v_nop +; GFX1064-NEXT: s_or_b64 exec, exec, s[4:5] +; GFX1064-NEXT: s_waitcnt lgkmcnt(0) +; GFX1064-NEXT: v_mul_lo_u32 v3, s3, v0 +; GFX1064-NEXT: v_mul_hi_u32 v4, s2, v0 +; GFX1064-NEXT: v_mul_lo_u32 v0, s2, v0 +; GFX1064-NEXT: v_readfirstlane_b32 s4, v1 +; GFX1064-NEXT: v_readfirstlane_b32 s5, v2 +; GFX1064-NEXT: s_mov_b32 s3, 0x31016000 +; GFX1064-NEXT: s_mov_b32 s2, -1 +; GFX1064-NEXT: v_add_nc_u32_e32 v1, v4, v3 +; GFX1064-NEXT: v_sub_co_u32_e64 v0, vcc, s4, v0 +; GFX1064-NEXT: v_sub_co_ci_u32_e32 v1, vcc, s5, v1, 
vcc +; GFX1064-NEXT: buffer_store_dwordx2 v[0:1], off, s[0:3], 0 +; GFX1064-NEXT: s_endpgm +; +; GFX1032-LABEL: sub_i64_uniform: +; GFX1032: ; %bb.0: ; %entry +; GFX1032-NEXT: s_load_dwordx4 s[0:3], s[0:1], 0x24 +; GFX1032-NEXT: v_cmp_ne_u32_e64 s5, 1, 0 +; GFX1032-NEXT: ; implicit-def: $vcc_hi +; GFX1032-NEXT: ; implicit-def: $vgpr1_vgpr2 +; GFX1032-NEXT: v_mbcnt_lo_u32_b32_e64 v0, s5, 0 +; GFX1032-NEXT: v_cmp_eq_u32_e32 vcc_lo, 0, v0 +; GFX1032-NEXT: s_and_saveexec_b32 s4, vcc_lo +; GFX1032-NEXT: ; mask branch BB12_2 +; GFX1032-NEXT: s_cbranch_execz BB12_2 +; GFX1032-NEXT: BB12_1: +; GFX1032-NEXT: s_bcnt1_i32_b32 s5, s5 +; GFX1032-NEXT: v_mov_b32_e32 v3, local_var64 at abs32@lo +; GFX1032-NEXT: s_waitcnt lgkmcnt(0) +; GFX1032-NEXT: v_mul_hi_u32 v2, s2, s5 +; GFX1032-NEXT: s_mul_i32 s6, s2, s5 +; GFX1032-NEXT: s_mul_i32 s5, s3, s5 +; GFX1032-NEXT: v_mov_b32_e32 v1, s6 +; GFX1032-NEXT: v_add_nc_u32_e32 v2, s5, v2 +; GFX1032-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1032-NEXT: s_waitcnt_vscnt null, 0x0 +; GFX1032-NEXT: ds_sub_rtn_u64 v[1:2], v3, v[1:2] +; GFX1032-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1032-NEXT: buffer_gl0_inv +; GFX1032-NEXT: buffer_gl1_inv +; GFX1032-NEXT: BB12_2: +; GFX1032-NEXT: v_nop +; GFX1032-NEXT: s_or_b32 exec_lo, exec_lo, s4 +; GFX1032-NEXT: s_waitcnt lgkmcnt(0) +; GFX1032-NEXT: v_mul_lo_u32 v3, s3, v0 +; GFX1032-NEXT: v_mul_hi_u32 v4, s2, v0 +; GFX1032-NEXT: v_mul_lo_u32 v0, s2, v0 +; GFX1032-NEXT: v_readfirstlane_b32 s4, v1 +; GFX1032-NEXT: v_readfirstlane_b32 s5, v2 +; GFX1032-NEXT: s_mov_b32 s3, 0x31016000 +; GFX1032-NEXT: s_mov_b32 s2, -1 +; GFX1032-NEXT: v_add_nc_u32_e32 v1, v4, v3 +; GFX1032-NEXT: v_sub_co_u32_e64 v0, vcc_lo, s4, v0 +; GFX1032-NEXT: v_sub_co_ci_u32_e32 v1, vcc_lo, s5, v1, vcc_lo +; GFX1032-NEXT: buffer_store_dwordx2 v[0:1], off, s[0:3], 0 +; GFX1032-NEXT: s_endpgm entry: %old = atomicrmw sub i64 addrspace(3)* @local_var64, i64 %subitive acq_rel store i64 %old, i64 addrspace(1)* %out ret void } -; GCN-LABEL: 
sub_i64_varying: ; GCN-NOT: v_mbcnt_lo_u32_b32 ; GCN-NOT: v_mbcnt_hi_u32_b32 ; GCN-NOT: s_bcnt1_i32_b64 -; GCN: ds_sub_rtn_u64 v{{\[}}{{[0-9]+}}:{{[0-9]+}}{{\]}}, v{{[0-9]+}}, v{{\[}}{{[0-9]+}}:{{[0-9]+}}{{\]}} define amdgpu_kernel void @sub_i64_varying(i64 addrspace(1)* %out) { +; +; +; GFX7LESS-LABEL: sub_i64_varying: +; GFX7LESS: ; %bb.0: ; %entry +; GFX7LESS-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x9 +; GFX7LESS-NEXT: v_mov_b32_e32 v1, 0 +; GFX7LESS-NEXT: v_mov_b32_e32 v2, local_var64 at abs32@lo +; GFX7LESS-NEXT: s_mov_b32 m0, -1 +; GFX7LESS-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX7LESS-NEXT: ds_sub_rtn_u64 v[0:1], v2, v[0:1] +; GFX7LESS-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX7LESS-NEXT: buffer_wbinvl1 +; GFX7LESS-NEXT: s_mov_b32 s3, 0xf000 +; GFX7LESS-NEXT: s_mov_b32 s2, -1 +; GFX7LESS-NEXT: buffer_store_dwordx2 v[0:1], off, s[0:3], 0 +; GFX7LESS-NEXT: s_endpgm +; +; GFX8-LABEL: sub_i64_varying: +; GFX8: ; %bb.0: ; %entry +; GFX8-NEXT: v_mov_b32_e32 v1, 0 +; GFX8-NEXT: v_mov_b32_e32 v2, local_var64 at abs32@lo +; GFX8-NEXT: s_mov_b32 m0, -1 +; GFX8-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x24 +; GFX8-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX8-NEXT: ds_sub_rtn_u64 v[0:1], v2, v[0:1] +; GFX8-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX8-NEXT: buffer_wbinvl1_vol +; GFX8-NEXT: s_mov_b32 s3, 0xf000 +; GFX8-NEXT: s_mov_b32 s2, -1 +; GFX8-NEXT: buffer_store_dwordx2 v[0:1], off, s[0:3], 0 +; GFX8-NEXT: s_endpgm +; +; GFX9-LABEL: sub_i64_varying: +; GFX9: ; %bb.0: ; %entry +; GFX9-NEXT: v_mov_b32_e32 v1, 0 +; GFX9-NEXT: v_mov_b32_e32 v2, local_var64 at abs32@lo +; GFX9-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x24 +; GFX9-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX9-NEXT: ds_sub_rtn_u64 v[0:1], v2, v[0:1] +; GFX9-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX9-NEXT: buffer_wbinvl1_vol +; GFX9-NEXT: s_mov_b32 s3, 0xf000 +; GFX9-NEXT: s_mov_b32 s2, -1 +; GFX9-NEXT: buffer_store_dwordx2 v[0:1], off, s[0:3], 0 +; GFX9-NEXT: s_endpgm +; +; GFX1064-LABEL: sub_i64_varying: +; GFX1064: 
; %bb.0: ; %entry +; GFX1064-NEXT: v_mov_b32_e32 v1, 0 +; GFX1064-NEXT: v_mov_b32_e32 v2, local_var64 at abs32@lo +; GFX1064-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x24 +; GFX1064-NEXT: s_mov_b32 s3, 0x31016000 +; GFX1064-NEXT: s_mov_b32 s2, -1 +; GFX1064-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1064-NEXT: s_waitcnt_vscnt null, 0x0 +; GFX1064-NEXT: ds_sub_rtn_u64 v[0:1], v2, v[0:1] +; GFX1064-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1064-NEXT: buffer_gl0_inv +; GFX1064-NEXT: buffer_gl1_inv +; GFX1064-NEXT: buffer_store_dwordx2 v[0:1], off, s[0:3], 0 +; GFX1064-NEXT: s_endpgm +; +; GFX1032-LABEL: sub_i64_varying: +; GFX1032: ; %bb.0: ; %entry +; GFX1032-NEXT: v_mov_b32_e32 v1, 0 +; GFX1032-NEXT: v_mov_b32_e32 v2, local_var64 at abs32@lo +; GFX1032-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x24 +; GFX1032-NEXT: s_mov_b32 s3, 0x31016000 +; GFX1032-NEXT: s_mov_b32 s2, -1 +; GFX1032-NEXT: ; implicit-def: $vcc_hi +; GFX1032-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1032-NEXT: s_waitcnt_vscnt null, 0x0 +; GFX1032-NEXT: ds_sub_rtn_u64 v[0:1], v2, v[0:1] +; GFX1032-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1032-NEXT: buffer_gl0_inv +; GFX1032-NEXT: buffer_gl1_inv +; GFX1032-NEXT: buffer_store_dwordx2 v[0:1], off, s[0:3], 0 +; GFX1032-NEXT: s_endpgm entry: %lane = call i32 @llvm.amdgcn.workitem.id.x() %zext = zext i32 %lane to i64 @@ -358,12 +2701,245 @@ entry: ret void } -; GCN-LABEL: and_i32_varying: ; GFX8MORE32: v_readlane_b32 s[[scalar_value:[0-9]+]], v{{[0-9]+}}, 31 -; GFX8MORE64: v_readlane_b32 s[[scalar_value:[0-9]+]], v{{[0-9]+}}, 63 ; GFX8MORE: v_mov_b32{{(_e[0-9]+)?}} v[[value:[0-9]+]], s[[scalar_value]] ; GFX8MORE: ds_and_rtn_b32 v{{[0-9]+}}, v{{[0-9]+}}, v[[value]] define amdgpu_kernel void @and_i32_varying(i32 addrspace(1)* %out) { +; +; +; GFX7LESS-LABEL: and_i32_varying: +; GFX7LESS: ; %bb.0: ; %entry +; GFX7LESS-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x9 +; GFX7LESS-NEXT: v_mov_b32_e32 v1, local_var32 at abs32@lo +; GFX7LESS-NEXT: s_mov_b32 m0, -1 +; GFX7LESS-NEXT: 
s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX7LESS-NEXT: ds_and_rtn_b32 v0, v1, v0 +; GFX7LESS-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX7LESS-NEXT: buffer_wbinvl1 +; GFX7LESS-NEXT: s_mov_b32 s3, 0xf000 +; GFX7LESS-NEXT: s_mov_b32 s2, -1 +; GFX7LESS-NEXT: buffer_store_dword v0, off, s[0:3], 0 +; GFX7LESS-NEXT: s_endpgm +; +; GFX8-LABEL: and_i32_varying: +; GFX8: ; %bb.0: ; %entry +; GFX8-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x24 +; GFX8-NEXT: v_cmp_ne_u32_e64 s[2:3], 1, 0 +; GFX8-NEXT: v_mbcnt_lo_u32_b32 v3, s2, 0 +; GFX8-NEXT: v_mbcnt_hi_u32_b32 v3, s3, v3 +; GFX8-NEXT: v_mov_b32_e32 v2, v0 +; GFX8-NEXT: s_or_saveexec_b64 s[2:3], -1 +; GFX8-NEXT: v_mov_b32_e32 v1, -1 +; GFX8-NEXT: s_mov_b64 exec, s[2:3] +; GFX8-NEXT: s_not_b64 exec, exec +; GFX8-NEXT: v_mov_b32_e32 v2, -1 +; GFX8-NEXT: s_not_b64 exec, exec +; GFX8-NEXT: s_or_saveexec_b64 s[4:5], -1 +; GFX8-NEXT: v_and_b32_dpp v2, v2, v2 row_shr:1 row_mask:0xf bank_mask:0xf +; GFX8-NEXT: s_nop 1 +; GFX8-NEXT: v_and_b32_dpp v2, v2, v2 row_shr:2 row_mask:0xf bank_mask:0xf +; GFX8-NEXT: s_nop 1 +; GFX8-NEXT: v_and_b32_dpp v2, v2, v2 row_shr:4 row_mask:0xf bank_mask:0xf +; GFX8-NEXT: s_nop 1 +; GFX8-NEXT: v_and_b32_dpp v2, v2, v2 row_shr:8 row_mask:0xf bank_mask:0xf +; GFX8-NEXT: s_nop 1 +; GFX8-NEXT: v_and_b32_dpp v2, v2, v2 row_bcast:15 row_mask:0xa bank_mask:0xf +; GFX8-NEXT: s_nop 1 +; GFX8-NEXT: v_and_b32_dpp v2, v2, v2 row_bcast:31 row_mask:0xc bank_mask:0xf +; GFX8-NEXT: v_readlane_b32 s2, v2, 63 +; GFX8-NEXT: s_nop 0 +; GFX8-NEXT: v_mov_b32_dpp v1, v2 wave_shr:1 row_mask:0xf bank_mask:0xf +; GFX8-NEXT: s_mov_b64 exec, s[4:5] +; GFX8-NEXT: v_cmp_eq_u32_e32 vcc, 0, v3 +; GFX8-NEXT: ; implicit-def: $vgpr0 +; GFX8-NEXT: s_and_saveexec_b64 s[4:5], vcc +; GFX8-NEXT: ; mask branch BB14_2 +; GFX8-NEXT: s_cbranch_execz BB14_2 +; GFX8-NEXT: BB14_1: +; GFX8-NEXT: v_mov_b32_e32 v0, local_var32 at abs32@lo +; GFX8-NEXT: v_mov_b32_e32 v3, s2 +; GFX8-NEXT: s_mov_b32 m0, -1 +; GFX8-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX8-NEXT: 
ds_and_rtn_b32 v0, v0, v3 +; GFX8-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX8-NEXT: buffer_wbinvl1_vol +; GFX8-NEXT: BB14_2: +; GFX8-NEXT: s_or_b64 exec, exec, s[4:5] +; GFX8-NEXT: v_readfirstlane_b32 s2, v0 +; GFX8-NEXT: v_mov_b32_e32 v0, v1 +; GFX8-NEXT: v_and_b32_e32 v0, s2, v0 +; GFX8-NEXT: s_mov_b32 s3, 0xf000 +; GFX8-NEXT: s_mov_b32 s2, -1 +; GFX8-NEXT: s_nop 0 +; GFX8-NEXT: s_waitcnt lgkmcnt(0) +; GFX8-NEXT: buffer_store_dword v0, off, s[0:3], 0 +; GFX8-NEXT: s_endpgm +; +; GFX9-LABEL: and_i32_varying: +; GFX9: ; %bb.0: ; %entry +; GFX9-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x24 +; GFX9-NEXT: v_cmp_ne_u32_e64 s[2:3], 1, 0 +; GFX9-NEXT: v_mbcnt_lo_u32_b32 v3, s2, 0 +; GFX9-NEXT: v_mbcnt_hi_u32_b32 v3, s3, v3 +; GFX9-NEXT: v_mov_b32_e32 v2, v0 +; GFX9-NEXT: s_or_saveexec_b64 s[2:3], -1 +; GFX9-NEXT: v_mov_b32_e32 v1, -1 +; GFX9-NEXT: s_mov_b64 exec, s[2:3] +; GFX9-NEXT: s_not_b64 exec, exec +; GFX9-NEXT: v_mov_b32_e32 v2, -1 +; GFX9-NEXT: s_not_b64 exec, exec +; GFX9-NEXT: s_or_saveexec_b64 s[4:5], -1 +; GFX9-NEXT: v_and_b32_dpp v2, v2, v2 row_shr:1 row_mask:0xf bank_mask:0xf +; GFX9-NEXT: s_nop 1 +; GFX9-NEXT: v_and_b32_dpp v2, v2, v2 row_shr:2 row_mask:0xf bank_mask:0xf +; GFX9-NEXT: s_nop 1 +; GFX9-NEXT: v_and_b32_dpp v2, v2, v2 row_shr:4 row_mask:0xf bank_mask:0xf +; GFX9-NEXT: s_nop 1 +; GFX9-NEXT: v_and_b32_dpp v2, v2, v2 row_shr:8 row_mask:0xf bank_mask:0xf +; GFX9-NEXT: s_nop 1 +; GFX9-NEXT: v_and_b32_dpp v2, v2, v2 row_bcast:15 row_mask:0xa bank_mask:0xf +; GFX9-NEXT: s_nop 1 +; GFX9-NEXT: v_and_b32_dpp v2, v2, v2 row_bcast:31 row_mask:0xc bank_mask:0xf +; GFX9-NEXT: v_readlane_b32 s2, v2, 63 +; GFX9-NEXT: s_nop 0 +; GFX9-NEXT: v_mov_b32_dpp v1, v2 wave_shr:1 row_mask:0xf bank_mask:0xf +; GFX9-NEXT: s_mov_b64 exec, s[4:5] +; GFX9-NEXT: v_cmp_eq_u32_e32 vcc, 0, v3 +; GFX9-NEXT: ; implicit-def: $vgpr0 +; GFX9-NEXT: s_and_saveexec_b64 s[4:5], vcc +; GFX9-NEXT: ; mask branch BB14_2 +; GFX9-NEXT: s_cbranch_execz BB14_2 +; GFX9-NEXT: BB14_1: +; GFX9-NEXT: 
v_mov_b32_e32 v0, local_var32 at abs32@lo +; GFX9-NEXT: v_mov_b32_e32 v3, s2 +; GFX9-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX9-NEXT: ds_and_rtn_b32 v0, v0, v3 +; GFX9-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX9-NEXT: buffer_wbinvl1_vol +; GFX9-NEXT: BB14_2: +; GFX9-NEXT: s_or_b64 exec, exec, s[4:5] +; GFX9-NEXT: v_readfirstlane_b32 s2, v0 +; GFX9-NEXT: v_mov_b32_e32 v0, v1 +; GFX9-NEXT: v_and_b32_e32 v0, s2, v0 +; GFX9-NEXT: s_mov_b32 s3, 0xf000 +; GFX9-NEXT: s_mov_b32 s2, -1 +; GFX9-NEXT: s_nop 0 +; GFX9-NEXT: s_waitcnt lgkmcnt(0) +; GFX9-NEXT: buffer_store_dword v0, off, s[0:3], 0 +; GFX9-NEXT: s_endpgm +; +; GFX1064-LABEL: and_i32_varying: +; GFX1064: ; %bb.0: ; %entry +; GFX1064-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x24 +; GFX1064-NEXT: v_cmp_ne_u32_e64 s[2:3], 1, 0 +; GFX1064-NEXT: v_mov_b32_e32 v2, v0 +; GFX1064-NEXT: v_mbcnt_lo_u32_b32_e64 v4, s2, 0 +; GFX1064-NEXT: v_mbcnt_hi_u32_b32_e64 v4, s3, v4 +; GFX1064-NEXT: s_or_saveexec_b64 s[2:3], -1 +; GFX1064-NEXT: v_mov_b32_e32 v1, -1 +; GFX1064-NEXT: s_mov_b64 exec, s[2:3] +; GFX1064-NEXT: s_not_b64 exec, exec +; GFX1064-NEXT: v_mov_b32_e32 v2, -1 +; GFX1064-NEXT: s_not_b64 exec, exec +; GFX1064-NEXT: s_or_saveexec_b64 s[4:5], -1 +; GFX1064-NEXT: v_and_b32_dpp v2, v2, v2 row_shr:1 row_mask:0xf bank_mask:0xf +; GFX1064-NEXT: v_and_b32_dpp v2, v2, v2 row_shr:2 row_mask:0xf bank_mask:0xf +; GFX1064-NEXT: v_and_b32_dpp v2, v2, v2 row_shr:4 row_mask:0xf bank_mask:0xf +; GFX1064-NEXT: v_and_b32_dpp v2, v2, v2 row_shr:8 row_mask:0xf bank_mask:0xf +; GFX1064-NEXT: v_mov_b32_e32 v3, v2 +; GFX1064-NEXT: v_permlanex16_b32 v3, v3, -1, -1 +; GFX1064-NEXT: v_and_b32_dpp v2, v3, v2 quad_perm:[0,1,2,3] row_mask:0xa bank_mask:0xf +; GFX1064-NEXT: v_readlane_b32 s2, v2, 31 +; GFX1064-NEXT: v_mov_b32_e32 v3, s2 +; GFX1064-NEXT: v_and_b32_dpp v2, v3, v2 quad_perm:[0,1,2,3] row_mask:0xc bank_mask:0xf +; GFX1064-NEXT: v_readlane_b32 s2, v2, 15 +; GFX1064-NEXT: v_mov_b32_dpp v1, v2 row_shr:1 row_mask:0xf bank_mask:0xf +; 
GFX1064-NEXT: v_readlane_b32 s3, v2, 31 +; GFX1064-NEXT: v_readlane_b32 s6, v2, 47 +; GFX1064-NEXT: v_writelane_b32 v1, s2, 16 +; GFX1064-NEXT: s_mov_b32 s2, -1 +; GFX1064-NEXT: v_writelane_b32 v1, s3, 32 +; GFX1064-NEXT: v_readlane_b32 s3, v2, 63 +; GFX1064-NEXT: v_writelane_b32 v1, s6, 48 +; GFX1064-NEXT: s_mov_b64 exec, s[4:5] +; GFX1064-NEXT: v_cmp_eq_u32_e32 vcc, 0, v4 +; GFX1064-NEXT: ; implicit-def: $vgpr0 +; GFX1064-NEXT: s_and_saveexec_b64 s[4:5], vcc +; GFX1064-NEXT: ; mask branch BB14_2 +; GFX1064-NEXT: s_cbranch_execz BB14_2 +; GFX1064-NEXT: BB14_1: +; GFX1064-NEXT: v_mov_b32_e32 v0, local_var32 at abs32@lo +; GFX1064-NEXT: v_mov_b32_e32 v7, s3 +; GFX1064-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1064-NEXT: s_waitcnt_vscnt null, 0x0 +; GFX1064-NEXT: ds_and_rtn_b32 v0, v0, v7 +; GFX1064-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1064-NEXT: buffer_gl0_inv +; GFX1064-NEXT: buffer_gl1_inv +; GFX1064-NEXT: BB14_2: +; GFX1064-NEXT: v_nop +; GFX1064-NEXT: s_or_b64 exec, exec, s[4:5] +; GFX1064-NEXT: v_readfirstlane_b32 s3, v0 +; GFX1064-NEXT: v_mov_b32_e32 v0, v1 +; GFX1064-NEXT: v_and_b32_e32 v0, s3, v0 +; GFX1064-NEXT: s_mov_b32 s3, 0x31016000 +; GFX1064-NEXT: s_nop 1 +; GFX1064-NEXT: s_waitcnt lgkmcnt(0) +; GFX1064-NEXT: buffer_store_dword v0, off, s[0:3], 0 +; GFX1064-NEXT: s_endpgm +; +; GFX1032-LABEL: and_i32_varying: +; GFX1032: ; %bb.0: ; %entry +; GFX1032-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x24 +; GFX1032-NEXT: v_cmp_ne_u32_e64 s2, 1, 0 +; GFX1032-NEXT: ; implicit-def: $vcc_hi +; GFX1032-NEXT: v_mov_b32_e32 v2, v0 +; GFX1032-NEXT: v_mbcnt_lo_u32_b32_e64 v4, s2, 0 +; GFX1032-NEXT: s_or_saveexec_b32 s2, -1 +; GFX1032-NEXT: v_mov_b32_e32 v1, -1 +; GFX1032-NEXT: s_mov_b32 exec_lo, s2 +; GFX1032-NEXT: s_not_b32 exec_lo, exec_lo +; GFX1032-NEXT: v_mov_b32_e32 v2, -1 +; GFX1032-NEXT: s_not_b32 exec_lo, exec_lo +; GFX1032-NEXT: s_or_saveexec_b32 s4, -1 +; GFX1032-NEXT: v_and_b32_dpp v2, v2, v2 row_shr:1 row_mask:0xf bank_mask:0xf +; GFX1032-NEXT: s_mov_b32 
s2, -1 +; GFX1032-NEXT: v_and_b32_dpp v2, v2, v2 row_shr:2 row_mask:0xf bank_mask:0xf +; GFX1032-NEXT: v_and_b32_dpp v2, v2, v2 row_shr:4 row_mask:0xf bank_mask:0xf +; GFX1032-NEXT: v_and_b32_dpp v2, v2, v2 row_shr:8 row_mask:0xf bank_mask:0xf +; GFX1032-NEXT: v_mov_b32_e32 v3, v2 +; GFX1032-NEXT: v_permlanex16_b32 v3, v3, -1, -1 +; GFX1032-NEXT: v_and_b32_dpp v2, v3, v2 quad_perm:[0,1,2,3] row_mask:0xa bank_mask:0xf +; GFX1032-NEXT: v_readlane_b32 s3, v2, 31 +; GFX1032-NEXT: v_mov_b32_dpp v1, v2 row_shr:1 row_mask:0xf bank_mask:0xf +; GFX1032-NEXT: v_readlane_b32 s5, v2, 15 +; GFX1032-NEXT: v_writelane_b32 v1, s5, 16 +; GFX1032-NEXT: s_mov_b32 exec_lo, s4 +; GFX1032-NEXT: v_cmp_eq_u32_e32 vcc_lo, 0, v4 +; GFX1032-NEXT: ; implicit-def: $vgpr0 +; GFX1032-NEXT: s_and_saveexec_b32 s4, vcc_lo +; GFX1032-NEXT: ; mask branch BB14_2 +; GFX1032-NEXT: s_cbranch_execz BB14_2 +; GFX1032-NEXT: BB14_1: +; GFX1032-NEXT: v_mov_b32_e32 v0, local_var32 at abs32@lo +; GFX1032-NEXT: v_mov_b32_e32 v7, s3 +; GFX1032-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1032-NEXT: s_waitcnt_vscnt null, 0x0 +; GFX1032-NEXT: ds_and_rtn_b32 v0, v0, v7 +; GFX1032-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1032-NEXT: buffer_gl0_inv +; GFX1032-NEXT: buffer_gl1_inv +; GFX1032-NEXT: BB14_2: +; GFX1032-NEXT: v_nop +; GFX1032-NEXT: s_or_b32 exec_lo, exec_lo, s4 +; GFX1032-NEXT: v_readfirstlane_b32 s3, v0 +; GFX1032-NEXT: v_mov_b32_e32 v0, v1 +; GFX1032-NEXT: v_and_b32_e32 v0, s3, v0 +; GFX1032-NEXT: s_mov_b32 s3, 0x31016000 +; GFX1032-NEXT: s_nop 1 +; GFX1032-NEXT: s_waitcnt lgkmcnt(0) +; GFX1032-NEXT: buffer_store_dword v0, off, s[0:3], 0 +; GFX1032-NEXT: s_endpgm entry: %lane = call i32 @llvm.amdgcn.workitem.id.x() %old = atomicrmw and i32 addrspace(3)* @local_var32, i32 %lane acq_rel @@ -371,12 +2947,245 @@ entry: ret void } -; GCN-LABEL: or_i32_varying: ; GFX8MORE32: v_readlane_b32 s[[scalar_value:[0-9]+]], v{{[0-9]+}}, 31 -; GFX8MORE64: v_readlane_b32 s[[scalar_value:[0-9]+]], v{{[0-9]+}}, 63 ; GFX8MORE: 
v_mov_b32{{(_e[0-9]+)?}} v[[value:[0-9]+]], s[[scalar_value]] ; GFX8MORE: ds_or_rtn_b32 v{{[0-9]+}}, v{{[0-9]+}}, v[[value]] define amdgpu_kernel void @or_i32_varying(i32 addrspace(1)* %out) { +; +; +; GFX7LESS-LABEL: or_i32_varying: +; GFX7LESS: ; %bb.0: ; %entry +; GFX7LESS-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x9 +; GFX7LESS-NEXT: v_mov_b32_e32 v1, local_var32 at abs32@lo +; GFX7LESS-NEXT: s_mov_b32 m0, -1 +; GFX7LESS-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX7LESS-NEXT: ds_or_rtn_b32 v0, v1, v0 +; GFX7LESS-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX7LESS-NEXT: buffer_wbinvl1 +; GFX7LESS-NEXT: s_mov_b32 s3, 0xf000 +; GFX7LESS-NEXT: s_mov_b32 s2, -1 +; GFX7LESS-NEXT: buffer_store_dword v0, off, s[0:3], 0 +; GFX7LESS-NEXT: s_endpgm +; +; GFX8-LABEL: or_i32_varying: +; GFX8: ; %bb.0: ; %entry +; GFX8-NEXT: v_mov_b32_e32 v2, v0 +; GFX8-NEXT: s_or_saveexec_b64 s[2:3], -1 +; GFX8-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x24 +; GFX8-NEXT: v_mov_b32_e32 v1, 0 +; GFX8-NEXT: s_mov_b64 exec, s[2:3] +; GFX8-NEXT: v_cmp_ne_u32_e64 s[2:3], 1, 0 +; GFX8-NEXT: v_mbcnt_lo_u32_b32 v0, s2, 0 +; GFX8-NEXT: v_mbcnt_hi_u32_b32 v0, s3, v0 +; GFX8-NEXT: s_not_b64 exec, exec +; GFX8-NEXT: v_mov_b32_e32 v2, 0 +; GFX8-NEXT: s_not_b64 exec, exec +; GFX8-NEXT: s_or_saveexec_b64 s[4:5], -1 +; GFX8-NEXT: v_or_b32_dpp v2, v2, v2 row_shr:1 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX8-NEXT: s_nop 1 +; GFX8-NEXT: v_or_b32_dpp v2, v2, v2 row_shr:2 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX8-NEXT: s_nop 1 +; GFX8-NEXT: v_or_b32_dpp v2, v2, v2 row_shr:4 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX8-NEXT: s_nop 1 +; GFX8-NEXT: v_or_b32_dpp v2, v2, v2 row_shr:8 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX8-NEXT: s_nop 1 +; GFX8-NEXT: v_or_b32_dpp v2, v2, v2 row_bcast:15 row_mask:0xa bank_mask:0xf +; GFX8-NEXT: s_nop 1 +; GFX8-NEXT: v_or_b32_dpp v2, v2, v2 row_bcast:31 row_mask:0xc bank_mask:0xf +; GFX8-NEXT: v_readlane_b32 s2, v2, 63 +; GFX8-NEXT: s_nop 0 +; GFX8-NEXT: v_mov_b32_dpp v1, v2 
wave_shr:1 row_mask:0xf bank_mask:0xf +; GFX8-NEXT: s_mov_b64 exec, s[4:5] +; GFX8-NEXT: v_cmp_eq_u32_e32 vcc, 0, v0 +; GFX8-NEXT: ; implicit-def: $vgpr0 +; GFX8-NEXT: s_and_saveexec_b64 s[4:5], vcc +; GFX8-NEXT: ; mask branch BB15_2 +; GFX8-NEXT: s_cbranch_execz BB15_2 +; GFX8-NEXT: BB15_1: +; GFX8-NEXT: v_mov_b32_e32 v0, local_var32 at abs32@lo +; GFX8-NEXT: v_mov_b32_e32 v3, s2 +; GFX8-NEXT: s_mov_b32 m0, -1 +; GFX8-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX8-NEXT: ds_or_rtn_b32 v0, v0, v3 +; GFX8-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX8-NEXT: buffer_wbinvl1_vol +; GFX8-NEXT: BB15_2: +; GFX8-NEXT: s_or_b64 exec, exec, s[4:5] +; GFX8-NEXT: v_readfirstlane_b32 s2, v0 +; GFX8-NEXT: v_mov_b32_e32 v0, v1 +; GFX8-NEXT: v_or_b32_e32 v0, s2, v0 +; GFX8-NEXT: s_mov_b32 s3, 0xf000 +; GFX8-NEXT: s_mov_b32 s2, -1 +; GFX8-NEXT: s_nop 0 +; GFX8-NEXT: s_waitcnt lgkmcnt(0) +; GFX8-NEXT: buffer_store_dword v0, off, s[0:3], 0 +; GFX8-NEXT: s_endpgm +; +; GFX9-LABEL: or_i32_varying: +; GFX9: ; %bb.0: ; %entry +; GFX9-NEXT: v_mov_b32_e32 v2, v0 +; GFX9-NEXT: s_or_saveexec_b64 s[2:3], -1 +; GFX9-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x24 +; GFX9-NEXT: v_mov_b32_e32 v1, 0 +; GFX9-NEXT: s_mov_b64 exec, s[2:3] +; GFX9-NEXT: v_cmp_ne_u32_e64 s[2:3], 1, 0 +; GFX9-NEXT: v_mbcnt_lo_u32_b32 v0, s2, 0 +; GFX9-NEXT: v_mbcnt_hi_u32_b32 v0, s3, v0 +; GFX9-NEXT: s_not_b64 exec, exec +; GFX9-NEXT: v_mov_b32_e32 v2, 0 +; GFX9-NEXT: s_not_b64 exec, exec +; GFX9-NEXT: s_or_saveexec_b64 s[4:5], -1 +; GFX9-NEXT: v_or_b32_dpp v2, v2, v2 row_shr:1 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX9-NEXT: s_nop 1 +; GFX9-NEXT: v_or_b32_dpp v2, v2, v2 row_shr:2 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX9-NEXT: s_nop 1 +; GFX9-NEXT: v_or_b32_dpp v2, v2, v2 row_shr:4 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX9-NEXT: s_nop 1 +; GFX9-NEXT: v_or_b32_dpp v2, v2, v2 row_shr:8 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX9-NEXT: s_nop 1 +; GFX9-NEXT: v_or_b32_dpp v2, v2, v2 row_bcast:15 row_mask:0xa 
bank_mask:0xf +; GFX9-NEXT: s_nop 1 +; GFX9-NEXT: v_or_b32_dpp v2, v2, v2 row_bcast:31 row_mask:0xc bank_mask:0xf +; GFX9-NEXT: v_readlane_b32 s2, v2, 63 +; GFX9-NEXT: s_nop 0 +; GFX9-NEXT: v_mov_b32_dpp v1, v2 wave_shr:1 row_mask:0xf bank_mask:0xf +; GFX9-NEXT: s_mov_b64 exec, s[4:5] +; GFX9-NEXT: v_cmp_eq_u32_e32 vcc, 0, v0 +; GFX9-NEXT: ; implicit-def: $vgpr0 +; GFX9-NEXT: s_and_saveexec_b64 s[4:5], vcc +; GFX9-NEXT: ; mask branch BB15_2 +; GFX9-NEXT: s_cbranch_execz BB15_2 +; GFX9-NEXT: BB15_1: +; GFX9-NEXT: v_mov_b32_e32 v0, local_var32 at abs32@lo +; GFX9-NEXT: v_mov_b32_e32 v3, s2 +; GFX9-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX9-NEXT: ds_or_rtn_b32 v0, v0, v3 +; GFX9-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX9-NEXT: buffer_wbinvl1_vol +; GFX9-NEXT: BB15_2: +; GFX9-NEXT: s_or_b64 exec, exec, s[4:5] +; GFX9-NEXT: v_readfirstlane_b32 s2, v0 +; GFX9-NEXT: v_mov_b32_e32 v0, v1 +; GFX9-NEXT: v_or_b32_e32 v0, s2, v0 +; GFX9-NEXT: s_mov_b32 s3, 0xf000 +; GFX9-NEXT: s_mov_b32 s2, -1 +; GFX9-NEXT: s_nop 0 +; GFX9-NEXT: s_waitcnt lgkmcnt(0) +; GFX9-NEXT: buffer_store_dword v0, off, s[0:3], 0 +; GFX9-NEXT: s_endpgm +; +; GFX1064-LABEL: or_i32_varying: +; GFX1064: ; %bb.0: ; %entry +; GFX1064-NEXT: v_mov_b32_e32 v2, v0 +; GFX1064-NEXT: s_or_saveexec_b64 s[2:3], -1 +; GFX1064-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x24 +; GFX1064-NEXT: v_mov_b32_e32 v1, 0 +; GFX1064-NEXT: s_mov_b64 exec, s[2:3] +; GFX1064-NEXT: v_cmp_ne_u32_e64 s[2:3], 1, 0 +; GFX1064-NEXT: v_mbcnt_lo_u32_b32_e64 v0, s2, 0 +; GFX1064-NEXT: v_mbcnt_hi_u32_b32_e64 v0, s3, v0 +; GFX1064-NEXT: s_not_b64 exec, exec +; GFX1064-NEXT: v_mov_b32_e32 v2, 0 +; GFX1064-NEXT: s_not_b64 exec, exec +; GFX1064-NEXT: s_or_saveexec_b64 s[4:5], -1 +; GFX1064-NEXT: v_or_b32_dpp v2, v2, v2 row_shr:1 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX1064-NEXT: v_or_b32_dpp v2, v2, v2 row_shr:2 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX1064-NEXT: v_or_b32_dpp v2, v2, v2 row_shr:4 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; 
GFX1064-NEXT: v_or_b32_dpp v2, v2, v2 row_shr:8 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX1064-NEXT: v_mov_b32_e32 v3, v2 +; GFX1064-NEXT: v_permlanex16_b32 v3, v3, -1, -1 +; GFX1064-NEXT: v_or_b32_dpp v2, v3, v2 quad_perm:[0,1,2,3] row_mask:0xa bank_mask:0xf +; GFX1064-NEXT: v_readlane_b32 s2, v2, 31 +; GFX1064-NEXT: v_mov_b32_e32 v3, s2 +; GFX1064-NEXT: v_or_b32_dpp v2, v3, v2 quad_perm:[0,1,2,3] row_mask:0xc bank_mask:0xf +; GFX1064-NEXT: v_readlane_b32 s2, v2, 15 +; GFX1064-NEXT: v_mov_b32_dpp v1, v2 row_shr:1 row_mask:0xf bank_mask:0xf +; GFX1064-NEXT: v_readlane_b32 s3, v2, 31 +; GFX1064-NEXT: v_readlane_b32 s6, v2, 47 +; GFX1064-NEXT: v_writelane_b32 v1, s2, 16 +; GFX1064-NEXT: s_mov_b32 s2, -1 +; GFX1064-NEXT: v_writelane_b32 v1, s3, 32 +; GFX1064-NEXT: v_readlane_b32 s3, v2, 63 +; GFX1064-NEXT: v_writelane_b32 v1, s6, 48 +; GFX1064-NEXT: s_mov_b64 exec, s[4:5] +; GFX1064-NEXT: v_cmp_eq_u32_e32 vcc, 0, v0 +; GFX1064-NEXT: ; implicit-def: $vgpr0 +; GFX1064-NEXT: s_and_saveexec_b64 s[4:5], vcc +; GFX1064-NEXT: ; mask branch BB15_2 +; GFX1064-NEXT: s_cbranch_execz BB15_2 +; GFX1064-NEXT: BB15_1: +; GFX1064-NEXT: v_mov_b32_e32 v0, local_var32 at abs32@lo +; GFX1064-NEXT: v_mov_b32_e32 v7, s3 +; GFX1064-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1064-NEXT: s_waitcnt_vscnt null, 0x0 +; GFX1064-NEXT: ds_or_rtn_b32 v0, v0, v7 +; GFX1064-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1064-NEXT: buffer_gl0_inv +; GFX1064-NEXT: buffer_gl1_inv +; GFX1064-NEXT: BB15_2: +; GFX1064-NEXT: v_nop +; GFX1064-NEXT: s_or_b64 exec, exec, s[4:5] +; GFX1064-NEXT: v_readfirstlane_b32 s3, v0 +; GFX1064-NEXT: v_mov_b32_e32 v0, v1 +; GFX1064-NEXT: v_or_b32_e32 v0, s3, v0 +; GFX1064-NEXT: s_mov_b32 s3, 0x31016000 +; GFX1064-NEXT: s_nop 1 +; GFX1064-NEXT: s_waitcnt lgkmcnt(0) +; GFX1064-NEXT: buffer_store_dword v0, off, s[0:3], 0 +; GFX1064-NEXT: s_endpgm +; +; GFX1032-LABEL: or_i32_varying: +; GFX1032: ; %bb.0: ; %entry +; GFX1032-NEXT: ; implicit-def: $vcc_hi +; GFX1032-NEXT: 
v_mov_b32_e32 v2, v0 +; GFX1032-NEXT: s_or_saveexec_b32 s2, -1 +; GFX1032-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x24 +; GFX1032-NEXT: v_mov_b32_e32 v1, 0 +; GFX1032-NEXT: s_mov_b32 exec_lo, s2 +; GFX1032-NEXT: v_cmp_ne_u32_e64 s2, 1, 0 +; GFX1032-NEXT: v_mbcnt_lo_u32_b32_e64 v0, s2, 0 +; GFX1032-NEXT: s_not_b32 exec_lo, exec_lo +; GFX1032-NEXT: v_mov_b32_e32 v2, 0 +; GFX1032-NEXT: s_not_b32 exec_lo, exec_lo +; GFX1032-NEXT: s_or_saveexec_b32 s4, -1 +; GFX1032-NEXT: v_or_b32_dpp v2, v2, v2 row_shr:1 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX1032-NEXT: s_mov_b32 s2, -1 +; GFX1032-NEXT: v_or_b32_dpp v2, v2, v2 row_shr:2 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX1032-NEXT: v_or_b32_dpp v2, v2, v2 row_shr:4 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX1032-NEXT: v_or_b32_dpp v2, v2, v2 row_shr:8 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX1032-NEXT: v_mov_b32_e32 v3, v2 +; GFX1032-NEXT: v_permlanex16_b32 v3, v3, -1, -1 +; GFX1032-NEXT: v_or_b32_dpp v2, v3, v2 quad_perm:[0,1,2,3] row_mask:0xa bank_mask:0xf +; GFX1032-NEXT: v_readlane_b32 s3, v2, 31 +; GFX1032-NEXT: v_mov_b32_dpp v1, v2 row_shr:1 row_mask:0xf bank_mask:0xf +; GFX1032-NEXT: v_readlane_b32 s5, v2, 15 +; GFX1032-NEXT: v_writelane_b32 v1, s5, 16 +; GFX1032-NEXT: s_mov_b32 exec_lo, s4 +; GFX1032-NEXT: v_cmp_eq_u32_e32 vcc_lo, 0, v0 +; GFX1032-NEXT: ; implicit-def: $vgpr0 +; GFX1032-NEXT: s_and_saveexec_b32 s4, vcc_lo +; GFX1032-NEXT: ; mask branch BB15_2 +; GFX1032-NEXT: s_cbranch_execz BB15_2 +; GFX1032-NEXT: BB15_1: +; GFX1032-NEXT: v_mov_b32_e32 v0, local_var32 at abs32@lo +; GFX1032-NEXT: v_mov_b32_e32 v7, s3 +; GFX1032-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1032-NEXT: s_waitcnt_vscnt null, 0x0 +; GFX1032-NEXT: ds_or_rtn_b32 v0, v0, v7 +; GFX1032-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1032-NEXT: buffer_gl0_inv +; GFX1032-NEXT: buffer_gl1_inv +; GFX1032-NEXT: BB15_2: +; GFX1032-NEXT: v_nop +; GFX1032-NEXT: s_or_b32 exec_lo, exec_lo, s4 +; GFX1032-NEXT: v_readfirstlane_b32 s3, v0 +; 
GFX1032-NEXT: v_mov_b32_e32 v0, v1 +; GFX1032-NEXT: v_or_b32_e32 v0, s3, v0 +; GFX1032-NEXT: s_mov_b32 s3, 0x31016000 +; GFX1032-NEXT: s_nop 1 +; GFX1032-NEXT: s_waitcnt lgkmcnt(0) +; GFX1032-NEXT: buffer_store_dword v0, off, s[0:3], 0 +; GFX1032-NEXT: s_endpgm entry: %lane = call i32 @llvm.amdgcn.workitem.id.x() %old = atomicrmw or i32 addrspace(3)* @local_var32, i32 %lane acq_rel @@ -384,12 +3193,245 @@ entry: ret void } -; GCN-LABEL: xor_i32_varying: ; GFX8MORE32: v_readlane_b32 s[[scalar_value:[0-9]+]], v{{[0-9]+}}, 31 -; GFX8MORE64: v_readlane_b32 s[[scalar_value:[0-9]+]], v{{[0-9]+}}, 63 ; GFX8MORE: v_mov_b32{{(_e[0-9]+)?}} v[[value:[0-9]+]], s[[scalar_value]] ; GFX8MORE: ds_xor_rtn_b32 v{{[0-9]+}}, v{{[0-9]+}}, v[[value]] define amdgpu_kernel void @xor_i32_varying(i32 addrspace(1)* %out) { +; +; +; GFX7LESS-LABEL: xor_i32_varying: +; GFX7LESS: ; %bb.0: ; %entry +; GFX7LESS-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x9 +; GFX7LESS-NEXT: v_mov_b32_e32 v1, local_var32 at abs32@lo +; GFX7LESS-NEXT: s_mov_b32 m0, -1 +; GFX7LESS-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX7LESS-NEXT: ds_xor_rtn_b32 v0, v1, v0 +; GFX7LESS-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX7LESS-NEXT: buffer_wbinvl1 +; GFX7LESS-NEXT: s_mov_b32 s3, 0xf000 +; GFX7LESS-NEXT: s_mov_b32 s2, -1 +; GFX7LESS-NEXT: buffer_store_dword v0, off, s[0:3], 0 +; GFX7LESS-NEXT: s_endpgm +; +; GFX8-LABEL: xor_i32_varying: +; GFX8: ; %bb.0: ; %entry +; GFX8-NEXT: v_mov_b32_e32 v2, v0 +; GFX8-NEXT: s_or_saveexec_b64 s[2:3], -1 +; GFX8-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x24 +; GFX8-NEXT: v_mov_b32_e32 v1, 0 +; GFX8-NEXT: s_mov_b64 exec, s[2:3] +; GFX8-NEXT: v_cmp_ne_u32_e64 s[2:3], 1, 0 +; GFX8-NEXT: v_mbcnt_lo_u32_b32 v0, s2, 0 +; GFX8-NEXT: v_mbcnt_hi_u32_b32 v0, s3, v0 +; GFX8-NEXT: s_not_b64 exec, exec +; GFX8-NEXT: v_mov_b32_e32 v2, 0 +; GFX8-NEXT: s_not_b64 exec, exec +; GFX8-NEXT: s_or_saveexec_b64 s[4:5], -1 +; GFX8-NEXT: v_xor_b32_dpp v2, v2, v2 row_shr:1 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; 
GFX8-NEXT: s_nop 1 +; GFX8-NEXT: v_xor_b32_dpp v2, v2, v2 row_shr:2 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX8-NEXT: s_nop 1 +; GFX8-NEXT: v_xor_b32_dpp v2, v2, v2 row_shr:4 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX8-NEXT: s_nop 1 +; GFX8-NEXT: v_xor_b32_dpp v2, v2, v2 row_shr:8 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX8-NEXT: s_nop 1 +; GFX8-NEXT: v_xor_b32_dpp v2, v2, v2 row_bcast:15 row_mask:0xa bank_mask:0xf +; GFX8-NEXT: s_nop 1 +; GFX8-NEXT: v_xor_b32_dpp v2, v2, v2 row_bcast:31 row_mask:0xc bank_mask:0xf +; GFX8-NEXT: v_readlane_b32 s2, v2, 63 +; GFX8-NEXT: s_nop 0 +; GFX8-NEXT: v_mov_b32_dpp v1, v2 wave_shr:1 row_mask:0xf bank_mask:0xf +; GFX8-NEXT: s_mov_b64 exec, s[4:5] +; GFX8-NEXT: v_cmp_eq_u32_e32 vcc, 0, v0 +; GFX8-NEXT: ; implicit-def: $vgpr0 +; GFX8-NEXT: s_and_saveexec_b64 s[4:5], vcc +; GFX8-NEXT: ; mask branch BB16_2 +; GFX8-NEXT: s_cbranch_execz BB16_2 +; GFX8-NEXT: BB16_1: +; GFX8-NEXT: v_mov_b32_e32 v0, local_var32 at abs32@lo +; GFX8-NEXT: v_mov_b32_e32 v3, s2 +; GFX8-NEXT: s_mov_b32 m0, -1 +; GFX8-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX8-NEXT: ds_xor_rtn_b32 v0, v0, v3 +; GFX8-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX8-NEXT: buffer_wbinvl1_vol +; GFX8-NEXT: BB16_2: +; GFX8-NEXT: s_or_b64 exec, exec, s[4:5] +; GFX8-NEXT: v_readfirstlane_b32 s2, v0 +; GFX8-NEXT: v_mov_b32_e32 v0, v1 +; GFX8-NEXT: v_xor_b32_e32 v0, s2, v0 +; GFX8-NEXT: s_mov_b32 s3, 0xf000 +; GFX8-NEXT: s_mov_b32 s2, -1 +; GFX8-NEXT: s_nop 0 +; GFX8-NEXT: s_waitcnt lgkmcnt(0) +; GFX8-NEXT: buffer_store_dword v0, off, s[0:3], 0 +; GFX8-NEXT: s_endpgm +; +; GFX9-LABEL: xor_i32_varying: +; GFX9: ; %bb.0: ; %entry +; GFX9-NEXT: v_mov_b32_e32 v2, v0 +; GFX9-NEXT: s_or_saveexec_b64 s[2:3], -1 +; GFX9-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x24 +; GFX9-NEXT: v_mov_b32_e32 v1, 0 +; GFX9-NEXT: s_mov_b64 exec, s[2:3] +; GFX9-NEXT: v_cmp_ne_u32_e64 s[2:3], 1, 0 +; GFX9-NEXT: v_mbcnt_lo_u32_b32 v0, s2, 0 +; GFX9-NEXT: v_mbcnt_hi_u32_b32 v0, s3, v0 +; GFX9-NEXT: s_not_b64 
exec, exec +; GFX9-NEXT: v_mov_b32_e32 v2, 0 +; GFX9-NEXT: s_not_b64 exec, exec +; GFX9-NEXT: s_or_saveexec_b64 s[4:5], -1 +; GFX9-NEXT: v_xor_b32_dpp v2, v2, v2 row_shr:1 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX9-NEXT: s_nop 1 +; GFX9-NEXT: v_xor_b32_dpp v2, v2, v2 row_shr:2 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX9-NEXT: s_nop 1 +; GFX9-NEXT: v_xor_b32_dpp v2, v2, v2 row_shr:4 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX9-NEXT: s_nop 1 +; GFX9-NEXT: v_xor_b32_dpp v2, v2, v2 row_shr:8 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX9-NEXT: s_nop 1 +; GFX9-NEXT: v_xor_b32_dpp v2, v2, v2 row_bcast:15 row_mask:0xa bank_mask:0xf +; GFX9-NEXT: s_nop 1 +; GFX9-NEXT: v_xor_b32_dpp v2, v2, v2 row_bcast:31 row_mask:0xc bank_mask:0xf +; GFX9-NEXT: v_readlane_b32 s2, v2, 63 +; GFX9-NEXT: s_nop 0 +; GFX9-NEXT: v_mov_b32_dpp v1, v2 wave_shr:1 row_mask:0xf bank_mask:0xf +; GFX9-NEXT: s_mov_b64 exec, s[4:5] +; GFX9-NEXT: v_cmp_eq_u32_e32 vcc, 0, v0 +; GFX9-NEXT: ; implicit-def: $vgpr0 +; GFX9-NEXT: s_and_saveexec_b64 s[4:5], vcc +; GFX9-NEXT: ; mask branch BB16_2 +; GFX9-NEXT: s_cbranch_execz BB16_2 +; GFX9-NEXT: BB16_1: +; GFX9-NEXT: v_mov_b32_e32 v0, local_var32 at abs32@lo +; GFX9-NEXT: v_mov_b32_e32 v3, s2 +; GFX9-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX9-NEXT: ds_xor_rtn_b32 v0, v0, v3 +; GFX9-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX9-NEXT: buffer_wbinvl1_vol +; GFX9-NEXT: BB16_2: +; GFX9-NEXT: s_or_b64 exec, exec, s[4:5] +; GFX9-NEXT: v_readfirstlane_b32 s2, v0 +; GFX9-NEXT: v_mov_b32_e32 v0, v1 +; GFX9-NEXT: v_xor_b32_e32 v0, s2, v0 +; GFX9-NEXT: s_mov_b32 s3, 0xf000 +; GFX9-NEXT: s_mov_b32 s2, -1 +; GFX9-NEXT: s_nop 0 +; GFX9-NEXT: s_waitcnt lgkmcnt(0) +; GFX9-NEXT: buffer_store_dword v0, off, s[0:3], 0 +; GFX9-NEXT: s_endpgm +; +; GFX1064-LABEL: xor_i32_varying: +; GFX1064: ; %bb.0: ; %entry +; GFX1064-NEXT: v_mov_b32_e32 v2, v0 +; GFX1064-NEXT: s_or_saveexec_b64 s[2:3], -1 +; GFX1064-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x24 +; GFX1064-NEXT: 
v_mov_b32_e32 v1, 0 +; GFX1064-NEXT: s_mov_b64 exec, s[2:3] +; GFX1064-NEXT: v_cmp_ne_u32_e64 s[2:3], 1, 0 +; GFX1064-NEXT: v_mbcnt_lo_u32_b32_e64 v0, s2, 0 +; GFX1064-NEXT: v_mbcnt_hi_u32_b32_e64 v0, s3, v0 +; GFX1064-NEXT: s_not_b64 exec, exec +; GFX1064-NEXT: v_mov_b32_e32 v2, 0 +; GFX1064-NEXT: s_not_b64 exec, exec +; GFX1064-NEXT: s_or_saveexec_b64 s[4:5], -1 +; GFX1064-NEXT: v_xor_b32_dpp v2, v2, v2 row_shr:1 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX1064-NEXT: v_xor_b32_dpp v2, v2, v2 row_shr:2 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX1064-NEXT: v_xor_b32_dpp v2, v2, v2 row_shr:4 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX1064-NEXT: v_xor_b32_dpp v2, v2, v2 row_shr:8 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX1064-NEXT: v_mov_b32_e32 v3, v2 +; GFX1064-NEXT: v_permlanex16_b32 v3, v3, -1, -1 +; GFX1064-NEXT: v_xor_b32_dpp v2, v3, v2 quad_perm:[0,1,2,3] row_mask:0xa bank_mask:0xf +; GFX1064-NEXT: v_readlane_b32 s2, v2, 31 +; GFX1064-NEXT: v_mov_b32_e32 v3, s2 +; GFX1064-NEXT: v_xor_b32_dpp v2, v3, v2 quad_perm:[0,1,2,3] row_mask:0xc bank_mask:0xf +; GFX1064-NEXT: v_readlane_b32 s2, v2, 15 +; GFX1064-NEXT: v_mov_b32_dpp v1, v2 row_shr:1 row_mask:0xf bank_mask:0xf +; GFX1064-NEXT: v_readlane_b32 s3, v2, 31 +; GFX1064-NEXT: v_readlane_b32 s6, v2, 47 +; GFX1064-NEXT: v_writelane_b32 v1, s2, 16 +; GFX1064-NEXT: s_mov_b32 s2, -1 +; GFX1064-NEXT: v_writelane_b32 v1, s3, 32 +; GFX1064-NEXT: v_readlane_b32 s3, v2, 63 +; GFX1064-NEXT: v_writelane_b32 v1, s6, 48 +; GFX1064-NEXT: s_mov_b64 exec, s[4:5] +; GFX1064-NEXT: v_cmp_eq_u32_e32 vcc, 0, v0 +; GFX1064-NEXT: ; implicit-def: $vgpr0 +; GFX1064-NEXT: s_and_saveexec_b64 s[4:5], vcc +; GFX1064-NEXT: ; mask branch BB16_2 +; GFX1064-NEXT: s_cbranch_execz BB16_2 +; GFX1064-NEXT: BB16_1: +; GFX1064-NEXT: v_mov_b32_e32 v0, local_var32 at abs32@lo +; GFX1064-NEXT: v_mov_b32_e32 v7, s3 +; GFX1064-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1064-NEXT: s_waitcnt_vscnt null, 0x0 +; GFX1064-NEXT: ds_xor_rtn_b32 v0, v0, 
v7 +; GFX1064-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1064-NEXT: buffer_gl0_inv +; GFX1064-NEXT: buffer_gl1_inv +; GFX1064-NEXT: BB16_2: +; GFX1064-NEXT: v_nop +; GFX1064-NEXT: s_or_b64 exec, exec, s[4:5] +; GFX1064-NEXT: v_readfirstlane_b32 s3, v0 +; GFX1064-NEXT: v_mov_b32_e32 v0, v1 +; GFX1064-NEXT: v_xor_b32_e32 v0, s3, v0 +; GFX1064-NEXT: s_mov_b32 s3, 0x31016000 +; GFX1064-NEXT: s_nop 1 +; GFX1064-NEXT: s_waitcnt lgkmcnt(0) +; GFX1064-NEXT: buffer_store_dword v0, off, s[0:3], 0 +; GFX1064-NEXT: s_endpgm +; +; GFX1032-LABEL: xor_i32_varying: +; GFX1032: ; %bb.0: ; %entry +; GFX1032-NEXT: ; implicit-def: $vcc_hi +; GFX1032-NEXT: v_mov_b32_e32 v2, v0 +; GFX1032-NEXT: s_or_saveexec_b32 s2, -1 +; GFX1032-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x24 +; GFX1032-NEXT: v_mov_b32_e32 v1, 0 +; GFX1032-NEXT: s_mov_b32 exec_lo, s2 +; GFX1032-NEXT: v_cmp_ne_u32_e64 s2, 1, 0 +; GFX1032-NEXT: v_mbcnt_lo_u32_b32_e64 v0, s2, 0 +; GFX1032-NEXT: s_not_b32 exec_lo, exec_lo +; GFX1032-NEXT: v_mov_b32_e32 v2, 0 +; GFX1032-NEXT: s_not_b32 exec_lo, exec_lo +; GFX1032-NEXT: s_or_saveexec_b32 s4, -1 +; GFX1032-NEXT: v_xor_b32_dpp v2, v2, v2 row_shr:1 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX1032-NEXT: s_mov_b32 s2, -1 +; GFX1032-NEXT: v_xor_b32_dpp v2, v2, v2 row_shr:2 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX1032-NEXT: v_xor_b32_dpp v2, v2, v2 row_shr:4 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX1032-NEXT: v_xor_b32_dpp v2, v2, v2 row_shr:8 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX1032-NEXT: v_mov_b32_e32 v3, v2 +; GFX1032-NEXT: v_permlanex16_b32 v3, v3, -1, -1 +; GFX1032-NEXT: v_xor_b32_dpp v2, v3, v2 quad_perm:[0,1,2,3] row_mask:0xa bank_mask:0xf +; GFX1032-NEXT: v_readlane_b32 s3, v2, 31 +; GFX1032-NEXT: v_mov_b32_dpp v1, v2 row_shr:1 row_mask:0xf bank_mask:0xf +; GFX1032-NEXT: v_readlane_b32 s5, v2, 15 +; GFX1032-NEXT: v_writelane_b32 v1, s5, 16 +; GFX1032-NEXT: s_mov_b32 exec_lo, s4 +; GFX1032-NEXT: v_cmp_eq_u32_e32 vcc_lo, 0, v0 +; GFX1032-NEXT: ; implicit-def: 
$vgpr0 +; GFX1032-NEXT: s_and_saveexec_b32 s4, vcc_lo +; GFX1032-NEXT: ; mask branch BB16_2 +; GFX1032-NEXT: s_cbranch_execz BB16_2 +; GFX1032-NEXT: BB16_1: +; GFX1032-NEXT: v_mov_b32_e32 v0, local_var32 at abs32@lo +; GFX1032-NEXT: v_mov_b32_e32 v7, s3 +; GFX1032-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1032-NEXT: s_waitcnt_vscnt null, 0x0 +; GFX1032-NEXT: ds_xor_rtn_b32 v0, v0, v7 +; GFX1032-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1032-NEXT: buffer_gl0_inv +; GFX1032-NEXT: buffer_gl1_inv +; GFX1032-NEXT: BB16_2: +; GFX1032-NEXT: v_nop +; GFX1032-NEXT: s_or_b32 exec_lo, exec_lo, s4 +; GFX1032-NEXT: v_readfirstlane_b32 s3, v0 +; GFX1032-NEXT: v_mov_b32_e32 v0, v1 +; GFX1032-NEXT: v_xor_b32_e32 v0, s3, v0 +; GFX1032-NEXT: s_mov_b32 s3, 0x31016000 +; GFX1032-NEXT: s_nop 1 +; GFX1032-NEXT: s_waitcnt lgkmcnt(0) +; GFX1032-NEXT: buffer_store_dword v0, off, s[0:3], 0 +; GFX1032-NEXT: s_endpgm entry: %lane = call i32 @llvm.amdgcn.workitem.id.x() %old = atomicrmw xor i32 addrspace(3)* @local_var32, i32 %lane acq_rel @@ -397,12 +3439,245 @@ entry: ret void } -; GCN-LABEL: max_i32_varying: ; GFX8MORE32: v_readlane_b32 s[[scalar_value:[0-9]+]], v{{[0-9]+}}, 31 -; GFX8MORE64: v_readlane_b32 s[[scalar_value:[0-9]+]], v{{[0-9]+}}, 63 ; GFX8MORE: v_mov_b32{{(_e[0-9]+)?}} v[[value:[0-9]+]], s[[scalar_value]] ; GFX8MORE: ds_max_rtn_i32 v{{[0-9]+}}, v{{[0-9]+}}, v[[value]] define amdgpu_kernel void @max_i32_varying(i32 addrspace(1)* %out) { +; +; +; GFX7LESS-LABEL: max_i32_varying: +; GFX7LESS: ; %bb.0: ; %entry +; GFX7LESS-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x9 +; GFX7LESS-NEXT: v_mov_b32_e32 v1, local_var32 at abs32@lo +; GFX7LESS-NEXT: s_mov_b32 m0, -1 +; GFX7LESS-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX7LESS-NEXT: ds_max_rtn_i32 v0, v1, v0 +; GFX7LESS-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX7LESS-NEXT: buffer_wbinvl1 +; GFX7LESS-NEXT: s_mov_b32 s3, 0xf000 +; GFX7LESS-NEXT: s_mov_b32 s2, -1 +; GFX7LESS-NEXT: buffer_store_dword v0, off, s[0:3], 0 +; GFX7LESS-NEXT: 
s_endpgm +; +; GFX8-LABEL: max_i32_varying: +; GFX8: ; %bb.0: ; %entry +; GFX8-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x24 +; GFX8-NEXT: v_cmp_ne_u32_e64 s[2:3], 1, 0 +; GFX8-NEXT: v_mbcnt_lo_u32_b32 v3, s2, 0 +; GFX8-NEXT: v_mbcnt_hi_u32_b32 v3, s3, v3 +; GFX8-NEXT: v_mov_b32_e32 v2, v0 +; GFX8-NEXT: s_or_saveexec_b64 s[2:3], -1 +; GFX8-NEXT: v_bfrev_b32_e32 v1, 1 +; GFX8-NEXT: s_mov_b64 exec, s[2:3] +; GFX8-NEXT: s_not_b64 exec, exec +; GFX8-NEXT: v_mov_b32_e32 v2, v1 +; GFX8-NEXT: s_not_b64 exec, exec +; GFX8-NEXT: s_or_saveexec_b64 s[4:5], -1 +; GFX8-NEXT: v_max_i32_dpp v2, v2, v2 row_shr:1 row_mask:0xf bank_mask:0xf +; GFX8-NEXT: s_nop 1 +; GFX8-NEXT: v_max_i32_dpp v2, v2, v2 row_shr:2 row_mask:0xf bank_mask:0xf +; GFX8-NEXT: s_nop 1 +; GFX8-NEXT: v_max_i32_dpp v2, v2, v2 row_shr:4 row_mask:0xf bank_mask:0xf +; GFX8-NEXT: s_nop 1 +; GFX8-NEXT: v_max_i32_dpp v2, v2, v2 row_shr:8 row_mask:0xf bank_mask:0xf +; GFX8-NEXT: s_nop 1 +; GFX8-NEXT: v_max_i32_dpp v2, v2, v2 row_bcast:15 row_mask:0xa bank_mask:0xf +; GFX8-NEXT: s_nop 1 +; GFX8-NEXT: v_max_i32_dpp v2, v2, v2 row_bcast:31 row_mask:0xc bank_mask:0xf +; GFX8-NEXT: v_readlane_b32 s2, v2, 63 +; GFX8-NEXT: s_nop 0 +; GFX8-NEXT: v_mov_b32_dpp v1, v2 wave_shr:1 row_mask:0xf bank_mask:0xf +; GFX8-NEXT: s_mov_b64 exec, s[4:5] +; GFX8-NEXT: v_cmp_eq_u32_e32 vcc, 0, v3 +; GFX8-NEXT: ; implicit-def: $vgpr0 +; GFX8-NEXT: s_and_saveexec_b64 s[4:5], vcc +; GFX8-NEXT: ; mask branch BB17_2 +; GFX8-NEXT: s_cbranch_execz BB17_2 +; GFX8-NEXT: BB17_1: +; GFX8-NEXT: v_mov_b32_e32 v0, local_var32 at abs32@lo +; GFX8-NEXT: v_mov_b32_e32 v3, s2 +; GFX8-NEXT: s_mov_b32 m0, -1 +; GFX8-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX8-NEXT: ds_max_rtn_i32 v0, v0, v3 +; GFX8-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX8-NEXT: buffer_wbinvl1_vol +; GFX8-NEXT: BB17_2: +; GFX8-NEXT: s_or_b64 exec, exec, s[4:5] +; GFX8-NEXT: v_readfirstlane_b32 s2, v0 +; GFX8-NEXT: v_mov_b32_e32 v0, v1 +; GFX8-NEXT: v_max_i32_e32 v0, s2, v0 +; GFX8-NEXT: s_mov_b32 
s3, 0xf000 +; GFX8-NEXT: s_mov_b32 s2, -1 +; GFX8-NEXT: s_nop 0 +; GFX8-NEXT: s_waitcnt lgkmcnt(0) +; GFX8-NEXT: buffer_store_dword v0, off, s[0:3], 0 +; GFX8-NEXT: s_endpgm +; +; GFX9-LABEL: max_i32_varying: +; GFX9: ; %bb.0: ; %entry +; GFX9-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x24 +; GFX9-NEXT: v_cmp_ne_u32_e64 s[2:3], 1, 0 +; GFX9-NEXT: v_mbcnt_lo_u32_b32 v3, s2, 0 +; GFX9-NEXT: v_mbcnt_hi_u32_b32 v3, s3, v3 +; GFX9-NEXT: v_mov_b32_e32 v2, v0 +; GFX9-NEXT: s_or_saveexec_b64 s[2:3], -1 +; GFX9-NEXT: v_bfrev_b32_e32 v1, 1 +; GFX9-NEXT: s_mov_b64 exec, s[2:3] +; GFX9-NEXT: s_not_b64 exec, exec +; GFX9-NEXT: v_mov_b32_e32 v2, v1 +; GFX9-NEXT: s_not_b64 exec, exec +; GFX9-NEXT: s_or_saveexec_b64 s[4:5], -1 +; GFX9-NEXT: v_max_i32_dpp v2, v2, v2 row_shr:1 row_mask:0xf bank_mask:0xf +; GFX9-NEXT: s_nop 1 +; GFX9-NEXT: v_max_i32_dpp v2, v2, v2 row_shr:2 row_mask:0xf bank_mask:0xf +; GFX9-NEXT: s_nop 1 +; GFX9-NEXT: v_max_i32_dpp v2, v2, v2 row_shr:4 row_mask:0xf bank_mask:0xf +; GFX9-NEXT: s_nop 1 +; GFX9-NEXT: v_max_i32_dpp v2, v2, v2 row_shr:8 row_mask:0xf bank_mask:0xf +; GFX9-NEXT: s_nop 1 +; GFX9-NEXT: v_max_i32_dpp v2, v2, v2 row_bcast:15 row_mask:0xa bank_mask:0xf +; GFX9-NEXT: s_nop 1 +; GFX9-NEXT: v_max_i32_dpp v2, v2, v2 row_bcast:31 row_mask:0xc bank_mask:0xf +; GFX9-NEXT: v_readlane_b32 s2, v2, 63 +; GFX9-NEXT: s_nop 0 +; GFX9-NEXT: v_mov_b32_dpp v1, v2 wave_shr:1 row_mask:0xf bank_mask:0xf +; GFX9-NEXT: s_mov_b64 exec, s[4:5] +; GFX9-NEXT: v_cmp_eq_u32_e32 vcc, 0, v3 +; GFX9-NEXT: ; implicit-def: $vgpr0 +; GFX9-NEXT: s_and_saveexec_b64 s[4:5], vcc +; GFX9-NEXT: ; mask branch BB17_2 +; GFX9-NEXT: s_cbranch_execz BB17_2 +; GFX9-NEXT: BB17_1: +; GFX9-NEXT: v_mov_b32_e32 v0, local_var32 at abs32@lo +; GFX9-NEXT: v_mov_b32_e32 v3, s2 +; GFX9-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX9-NEXT: ds_max_rtn_i32 v0, v0, v3 +; GFX9-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX9-NEXT: buffer_wbinvl1_vol +; GFX9-NEXT: BB17_2: +; GFX9-NEXT: s_or_b64 exec, exec, s[4:5] +; 
GFX9-NEXT: v_readfirstlane_b32 s2, v0 +; GFX9-NEXT: v_mov_b32_e32 v0, v1 +; GFX9-NEXT: v_max_i32_e32 v0, s2, v0 +; GFX9-NEXT: s_mov_b32 s3, 0xf000 +; GFX9-NEXT: s_mov_b32 s2, -1 +; GFX9-NEXT: s_nop 0 +; GFX9-NEXT: s_waitcnt lgkmcnt(0) +; GFX9-NEXT: buffer_store_dword v0, off, s[0:3], 0 +; GFX9-NEXT: s_endpgm +; +; GFX1064-LABEL: max_i32_varying: +; GFX1064: ; %bb.0: ; %entry +; GFX1064-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x24 +; GFX1064-NEXT: v_cmp_ne_u32_e64 s[2:3], 1, 0 +; GFX1064-NEXT: v_mov_b32_e32 v2, v0 +; GFX1064-NEXT: v_mbcnt_lo_u32_b32_e64 v4, s2, 0 +; GFX1064-NEXT: v_mbcnt_hi_u32_b32_e64 v4, s3, v4 +; GFX1064-NEXT: s_or_saveexec_b64 s[2:3], -1 +; GFX1064-NEXT: v_bfrev_b32_e32 v1, 1 +; GFX1064-NEXT: s_mov_b64 exec, s[2:3] +; GFX1064-NEXT: s_not_b64 exec, exec +; GFX1064-NEXT: v_mov_b32_e32 v2, v1 +; GFX1064-NEXT: s_not_b64 exec, exec +; GFX1064-NEXT: s_or_saveexec_b64 s[4:5], -1 +; GFX1064-NEXT: v_max_i32_dpp v2, v2, v2 row_shr:1 row_mask:0xf bank_mask:0xf +; GFX1064-NEXT: v_max_i32_dpp v2, v2, v2 row_shr:2 row_mask:0xf bank_mask:0xf +; GFX1064-NEXT: v_max_i32_dpp v2, v2, v2 row_shr:4 row_mask:0xf bank_mask:0xf +; GFX1064-NEXT: v_max_i32_dpp v2, v2, v2 row_shr:8 row_mask:0xf bank_mask:0xf +; GFX1064-NEXT: v_mov_b32_e32 v3, v2 +; GFX1064-NEXT: v_permlanex16_b32 v3, v3, -1, -1 +; GFX1064-NEXT: v_max_i32_dpp v2, v3, v2 quad_perm:[0,1,2,3] row_mask:0xa bank_mask:0xf +; GFX1064-NEXT: v_readlane_b32 s2, v2, 31 +; GFX1064-NEXT: v_mov_b32_e32 v3, s2 +; GFX1064-NEXT: v_max_i32_dpp v2, v3, v2 quad_perm:[0,1,2,3] row_mask:0xc bank_mask:0xf +; GFX1064-NEXT: v_readlane_b32 s2, v2, 15 +; GFX1064-NEXT: v_mov_b32_dpp v1, v2 row_shr:1 row_mask:0xf bank_mask:0xf +; GFX1064-NEXT: v_readlane_b32 s3, v2, 31 +; GFX1064-NEXT: v_readlane_b32 s6, v2, 47 +; GFX1064-NEXT: v_writelane_b32 v1, s2, 16 +; GFX1064-NEXT: s_mov_b32 s2, -1 +; GFX1064-NEXT: v_writelane_b32 v1, s3, 32 +; GFX1064-NEXT: v_readlane_b32 s3, v2, 63 +; GFX1064-NEXT: v_writelane_b32 v1, s6, 48 +; GFX1064-NEXT: 
s_mov_b64 exec, s[4:5] +; GFX1064-NEXT: v_cmp_eq_u32_e32 vcc, 0, v4 +; GFX1064-NEXT: ; implicit-def: $vgpr0 +; GFX1064-NEXT: s_and_saveexec_b64 s[4:5], vcc +; GFX1064-NEXT: ; mask branch BB17_2 +; GFX1064-NEXT: s_cbranch_execz BB17_2 +; GFX1064-NEXT: BB17_1: +; GFX1064-NEXT: v_mov_b32_e32 v0, local_var32 at abs32@lo +; GFX1064-NEXT: v_mov_b32_e32 v7, s3 +; GFX1064-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1064-NEXT: s_waitcnt_vscnt null, 0x0 +; GFX1064-NEXT: ds_max_rtn_i32 v0, v0, v7 +; GFX1064-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1064-NEXT: buffer_gl0_inv +; GFX1064-NEXT: buffer_gl1_inv +; GFX1064-NEXT: BB17_2: +; GFX1064-NEXT: v_nop +; GFX1064-NEXT: s_or_b64 exec, exec, s[4:5] +; GFX1064-NEXT: v_readfirstlane_b32 s3, v0 +; GFX1064-NEXT: v_mov_b32_e32 v0, v1 +; GFX1064-NEXT: v_max_i32_e32 v0, s3, v0 +; GFX1064-NEXT: s_mov_b32 s3, 0x31016000 +; GFX1064-NEXT: s_nop 1 +; GFX1064-NEXT: s_waitcnt lgkmcnt(0) +; GFX1064-NEXT: buffer_store_dword v0, off, s[0:3], 0 +; GFX1064-NEXT: s_endpgm +; +; GFX1032-LABEL: max_i32_varying: +; GFX1032: ; %bb.0: ; %entry +; GFX1032-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x24 +; GFX1032-NEXT: v_cmp_ne_u32_e64 s2, 1, 0 +; GFX1032-NEXT: ; implicit-def: $vcc_hi +; GFX1032-NEXT: v_mov_b32_e32 v2, v0 +; GFX1032-NEXT: v_mbcnt_lo_u32_b32_e64 v4, s2, 0 +; GFX1032-NEXT: s_or_saveexec_b32 s2, -1 +; GFX1032-NEXT: v_bfrev_b32_e32 v1, 1 +; GFX1032-NEXT: s_mov_b32 exec_lo, s2 +; GFX1032-NEXT: s_not_b32 exec_lo, exec_lo +; GFX1032-NEXT: v_mov_b32_e32 v2, v1 +; GFX1032-NEXT: s_not_b32 exec_lo, exec_lo +; GFX1032-NEXT: s_or_saveexec_b32 s4, -1 +; GFX1032-NEXT: v_max_i32_dpp v2, v2, v2 row_shr:1 row_mask:0xf bank_mask:0xf +; GFX1032-NEXT: s_mov_b32 s2, -1 +; GFX1032-NEXT: v_max_i32_dpp v2, v2, v2 row_shr:2 row_mask:0xf bank_mask:0xf +; GFX1032-NEXT: v_max_i32_dpp v2, v2, v2 row_shr:4 row_mask:0xf bank_mask:0xf +; GFX1032-NEXT: v_max_i32_dpp v2, v2, v2 row_shr:8 row_mask:0xf bank_mask:0xf +; GFX1032-NEXT: v_mov_b32_e32 v3, v2 +; GFX1032-NEXT: 
v_permlanex16_b32 v3, v3, -1, -1 +; GFX1032-NEXT: v_max_i32_dpp v2, v3, v2 quad_perm:[0,1,2,3] row_mask:0xa bank_mask:0xf +; GFX1032-NEXT: v_readlane_b32 s3, v2, 31 +; GFX1032-NEXT: v_mov_b32_dpp v1, v2 row_shr:1 row_mask:0xf bank_mask:0xf +; GFX1032-NEXT: v_readlane_b32 s5, v2, 15 +; GFX1032-NEXT: v_writelane_b32 v1, s5, 16 +; GFX1032-NEXT: s_mov_b32 exec_lo, s4 +; GFX1032-NEXT: v_cmp_eq_u32_e32 vcc_lo, 0, v4 +; GFX1032-NEXT: ; implicit-def: $vgpr0 +; GFX1032-NEXT: s_and_saveexec_b32 s4, vcc_lo +; GFX1032-NEXT: ; mask branch BB17_2 +; GFX1032-NEXT: s_cbranch_execz BB17_2 +; GFX1032-NEXT: BB17_1: +; GFX1032-NEXT: v_mov_b32_e32 v0, local_var32 at abs32@lo +; GFX1032-NEXT: v_mov_b32_e32 v7, s3 +; GFX1032-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1032-NEXT: s_waitcnt_vscnt null, 0x0 +; GFX1032-NEXT: ds_max_rtn_i32 v0, v0, v7 +; GFX1032-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1032-NEXT: buffer_gl0_inv +; GFX1032-NEXT: buffer_gl1_inv +; GFX1032-NEXT: BB17_2: +; GFX1032-NEXT: v_nop +; GFX1032-NEXT: s_or_b32 exec_lo, exec_lo, s4 +; GFX1032-NEXT: v_readfirstlane_b32 s3, v0 +; GFX1032-NEXT: v_mov_b32_e32 v0, v1 +; GFX1032-NEXT: v_max_i32_e32 v0, s3, v0 +; GFX1032-NEXT: s_mov_b32 s3, 0x31016000 +; GFX1032-NEXT: s_nop 1 +; GFX1032-NEXT: s_waitcnt lgkmcnt(0) +; GFX1032-NEXT: buffer_store_dword v0, off, s[0:3], 0 +; GFX1032-NEXT: s_endpgm entry: %lane = call i32 @llvm.amdgcn.workitem.id.x() %old = atomicrmw max i32 addrspace(3)* @local_var32, i32 %lane acq_rel @@ -410,28 +3685,440 @@ entry: ret void } -; GCN-LABEL: max_i64_constant: -; GCN32: v_cmp_ne_u32_e64 s[[exec_lo:[0-9]+]], 1, 0 -; GCN64: v_cmp_ne_u32_e64 s{{\[}}[[exec_lo:[0-9]+]]:[[exec_hi:[0-9]+]]{{\]}}, 1, 0 -; GCN: v_mbcnt_lo_u32_b32{{(_e[0-9]+)?}} v[[mbcnt:[0-9]+]], s[[exec_lo]], 0 -; GCN64: v_mbcnt_hi_u32_b32{{(_e[0-9]+)?}} v[[mbcnt]], s[[exec_hi]], v[[mbcnt]] -; GCN: v_cmp_eq_u32{{(_e[0-9]+)?}} vcc{{(_lo)?}}, 0, v[[mbcnt]] -; GCN: v_mov_b32{{(_e[0-9]+)?}} v[[value_lo:[0-9]+]], 5 -; GCN: v_mov_b32{{(_e[0-9]+)?}} 
v[[value_hi:[0-9]+]], 0 -; GCN: ds_max_rtn_i64 v{{\[}}{{[0-9]+}}:{{[0-9]+}}{{\]}}, v{{[0-9]+}}, v{{\[}}[[value_lo]]:[[value_hi]]{{\]}} define amdgpu_kernel void @max_i64_constant(i64 addrspace(1)* %out) { +; +; +; GFX7LESS-LABEL: max_i64_constant: +; GFX7LESS: ; %bb.0: ; %entry +; GFX7LESS-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x9 +; GFX7LESS-NEXT: v_cmp_ne_u32_e64 s[2:3], 1, 0 +; GFX7LESS-NEXT: v_mbcnt_lo_u32_b32_e64 v0, s2, 0 +; GFX7LESS-NEXT: v_mbcnt_hi_u32_b32_e32 v0, s3, v0 +; GFX7LESS-NEXT: v_cmp_eq_u32_e32 vcc, 0, v0 +; GFX7LESS-NEXT: ; implicit-def: $vgpr0_vgpr1 +; GFX7LESS-NEXT: s_and_saveexec_b64 s[2:3], vcc +; GFX7LESS-NEXT: ; mask branch BB18_2 +; GFX7LESS-NEXT: s_cbranch_execz BB18_2 +; GFX7LESS-NEXT: BB18_1: +; GFX7LESS-NEXT: v_mov_b32_e32 v2, local_var64 at abs32@lo +; GFX7LESS-NEXT: v_mov_b32_e32 v0, 5 +; GFX7LESS-NEXT: v_mov_b32_e32 v1, 0 +; GFX7LESS-NEXT: s_mov_b32 m0, -1 +; GFX7LESS-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX7LESS-NEXT: ds_max_rtn_i64 v[0:1], v2, v[0:1] +; GFX7LESS-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX7LESS-NEXT: buffer_wbinvl1 +; GFX7LESS-NEXT: BB18_2: +; GFX7LESS-NEXT: s_or_b64 exec, exec, s[2:3] +; GFX7LESS-NEXT: v_readfirstlane_b32 s4, v0 +; GFX7LESS-NEXT: v_readfirstlane_b32 s5, v1 +; GFX7LESS-NEXT: v_bfrev_b32_e32 v1, 1 +; GFX7LESS-NEXT: v_cndmask_b32_e64 v0, 5, 0, vcc +; GFX7LESS-NEXT: s_mov_b32 s3, 0xf000 +; GFX7LESS-NEXT: v_cndmask_b32_e32 v1, 0, v1, vcc +; GFX7LESS-NEXT: v_mov_b32_e32 v2, s5 +; GFX7LESS-NEXT: v_mov_b32_e32 v3, s4 +; GFX7LESS-NEXT: v_cmp_gt_i64_e32 vcc, s[4:5], v[0:1] +; GFX7LESS-NEXT: v_cndmask_b32_e32 v1, v1, v2, vcc +; GFX7LESS-NEXT: v_cndmask_b32_e32 v0, v0, v3, vcc +; GFX7LESS-NEXT: s_mov_b32 s2, -1 +; GFX7LESS-NEXT: s_waitcnt lgkmcnt(0) +; GFX7LESS-NEXT: buffer_store_dwordx2 v[0:1], off, s[0:3], 0 +; GFX7LESS-NEXT: s_endpgm +; +; GFX8-LABEL: max_i64_constant: +; GFX8: ; %bb.0: ; %entry +; GFX8-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x24 +; GFX8-NEXT: v_cmp_ne_u32_e64 s[2:3], 1, 0 +; GFX8-NEXT: 
v_mbcnt_lo_u32_b32 v0, s2, 0 +; GFX8-NEXT: v_mbcnt_hi_u32_b32 v0, s3, v0 +; GFX8-NEXT: v_cmp_eq_u32_e32 vcc, 0, v0 +; GFX8-NEXT: ; implicit-def: $vgpr0_vgpr1 +; GFX8-NEXT: s_and_saveexec_b64 s[2:3], vcc +; GFX8-NEXT: ; mask branch BB18_2 +; GFX8-NEXT: s_cbranch_execz BB18_2 +; GFX8-NEXT: BB18_1: +; GFX8-NEXT: v_mov_b32_e32 v0, 5 +; GFX8-NEXT: v_mov_b32_e32 v2, local_var64 at abs32@lo +; GFX8-NEXT: v_mov_b32_e32 v1, 0 +; GFX8-NEXT: s_mov_b32 m0, -1 +; GFX8-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX8-NEXT: ds_max_rtn_i64 v[0:1], v2, v[0:1] +; GFX8-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX8-NEXT: buffer_wbinvl1_vol +; GFX8-NEXT: BB18_2: +; GFX8-NEXT: s_or_b64 exec, exec, s[2:3] +; GFX8-NEXT: v_readfirstlane_b32 s2, v0 +; GFX8-NEXT: v_bfrev_b32_e32 v0, 1 +; GFX8-NEXT: v_readfirstlane_b32 s3, v1 +; GFX8-NEXT: v_cndmask_b32_e32 v1, 0, v0, vcc +; GFX8-NEXT: v_cndmask_b32_e64 v0, 5, 0, vcc +; GFX8-NEXT: v_cmp_gt_i64_e32 vcc, s[2:3], v[0:1] +; GFX8-NEXT: v_mov_b32_e32 v2, s3 +; GFX8-NEXT: v_cndmask_b32_e32 v1, v1, v2, vcc +; GFX8-NEXT: v_mov_b32_e32 v2, s2 +; GFX8-NEXT: v_cndmask_b32_e32 v0, v0, v2, vcc +; GFX8-NEXT: s_mov_b32 s3, 0xf000 +; GFX8-NEXT: s_mov_b32 s2, -1 +; GFX8-NEXT: s_waitcnt lgkmcnt(0) +; GFX8-NEXT: buffer_store_dwordx2 v[0:1], off, s[0:3], 0 +; GFX8-NEXT: s_endpgm +; +; GFX9-LABEL: max_i64_constant: +; GFX9: ; %bb.0: ; %entry +; GFX9-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x24 +; GFX9-NEXT: v_cmp_ne_u32_e64 s[2:3], 1, 0 +; GFX9-NEXT: v_mbcnt_lo_u32_b32 v0, s2, 0 +; GFX9-NEXT: v_mbcnt_hi_u32_b32 v0, s3, v0 +; GFX9-NEXT: v_cmp_eq_u32_e32 vcc, 0, v0 +; GFX9-NEXT: ; implicit-def: $vgpr0_vgpr1 +; GFX9-NEXT: s_and_saveexec_b64 s[2:3], vcc +; GFX9-NEXT: ; mask branch BB18_2 +; GFX9-NEXT: s_cbranch_execz BB18_2 +; GFX9-NEXT: BB18_1: +; GFX9-NEXT: v_mov_b32_e32 v0, 5 +; GFX9-NEXT: v_mov_b32_e32 v2, local_var64 at abs32@lo +; GFX9-NEXT: v_mov_b32_e32 v1, 0 +; GFX9-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX9-NEXT: ds_max_rtn_i64 v[0:1], v2, v[0:1] +; GFX9-NEXT: 
s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX9-NEXT: buffer_wbinvl1_vol +; GFX9-NEXT: BB18_2: +; GFX9-NEXT: s_or_b64 exec, exec, s[2:3] +; GFX9-NEXT: v_readfirstlane_b32 s2, v0 +; GFX9-NEXT: v_bfrev_b32_e32 v0, 1 +; GFX9-NEXT: v_readfirstlane_b32 s3, v1 +; GFX9-NEXT: v_cndmask_b32_e32 v1, 0, v0, vcc +; GFX9-NEXT: v_cndmask_b32_e64 v0, 5, 0, vcc +; GFX9-NEXT: v_cmp_gt_i64_e32 vcc, s[2:3], v[0:1] +; GFX9-NEXT: v_mov_b32_e32 v2, s3 +; GFX9-NEXT: v_cndmask_b32_e32 v1, v1, v2, vcc +; GFX9-NEXT: v_mov_b32_e32 v2, s2 +; GFX9-NEXT: v_cndmask_b32_e32 v0, v0, v2, vcc +; GFX9-NEXT: s_mov_b32 s3, 0xf000 +; GFX9-NEXT: s_mov_b32 s2, -1 +; GFX9-NEXT: s_waitcnt lgkmcnt(0) +; GFX9-NEXT: buffer_store_dwordx2 v[0:1], off, s[0:3], 0 +; GFX9-NEXT: s_endpgm +; +; GFX1064-LABEL: max_i64_constant: +; GFX1064: ; %bb.0: ; %entry +; GFX1064-NEXT: v_cmp_ne_u32_e64 s[2:3], 1, 0 +; GFX1064-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x24 +; GFX1064-NEXT: v_mbcnt_lo_u32_b32_e64 v0, s2, 0 +; GFX1064-NEXT: v_mbcnt_hi_u32_b32_e64 v0, s3, v0 +; GFX1064-NEXT: v_cmp_eq_u32_e32 vcc, 0, v0 +; GFX1064-NEXT: ; implicit-def: $vgpr0_vgpr1 +; GFX1064-NEXT: s_and_saveexec_b64 s[2:3], vcc +; GFX1064-NEXT: ; mask branch BB18_2 +; GFX1064-NEXT: s_cbranch_execz BB18_2 +; GFX1064-NEXT: BB18_1: +; GFX1064-NEXT: v_mov_b32_e32 v0, 5 +; GFX1064-NEXT: v_mov_b32_e32 v2, local_var64 at abs32@lo +; GFX1064-NEXT: v_mov_b32_e32 v1, 0 +; GFX1064-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1064-NEXT: s_waitcnt_vscnt null, 0x0 +; GFX1064-NEXT: ds_max_rtn_i64 v[0:1], v2, v[0:1] +; GFX1064-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1064-NEXT: buffer_gl0_inv +; GFX1064-NEXT: buffer_gl1_inv +; GFX1064-NEXT: BB18_2: +; GFX1064-NEXT: v_nop +; GFX1064-NEXT: s_or_b64 exec, exec, s[2:3] +; GFX1064-NEXT: v_readfirstlane_b32 s4, v0 +; GFX1064-NEXT: v_readfirstlane_b32 s5, v1 +; GFX1064-NEXT: v_cndmask_b32_e64 v1, 0, 0x80000000, vcc +; GFX1064-NEXT: v_cndmask_b32_e64 v0, 5, 0, vcc +; GFX1064-NEXT: s_mov_b32 s3, 0x31016000 +; GFX1064-NEXT: s_mov_b32 s2, -1 
+; GFX1064-NEXT: v_cmp_gt_i64_e32 vcc, s[4:5], v[0:1] +; GFX1064-NEXT: v_cndmask_b32_e64 v1, v1, s5, vcc +; GFX1064-NEXT: v_cndmask_b32_e64 v0, v0, s4, vcc +; GFX1064-NEXT: s_waitcnt lgkmcnt(0) +; GFX1064-NEXT: buffer_store_dwordx2 v[0:1], off, s[0:3], 0 +; GFX1064-NEXT: s_endpgm +; +; GFX1032-LABEL: max_i64_constant: +; GFX1032: ; %bb.0: ; %entry +; GFX1032-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x24 +; GFX1032-NEXT: v_cmp_ne_u32_e64 s2, 1, 0 +; GFX1032-NEXT: ; implicit-def: $vcc_hi +; GFX1032-NEXT: v_mbcnt_lo_u32_b32_e64 v0, s2, 0 +; GFX1032-NEXT: v_cmp_eq_u32_e32 vcc_lo, 0, v0 +; GFX1032-NEXT: ; implicit-def: $vgpr0_vgpr1 +; GFX1032-NEXT: s_and_saveexec_b32 s2, vcc_lo +; GFX1032-NEXT: ; mask branch BB18_2 +; GFX1032-NEXT: s_cbranch_execz BB18_2 +; GFX1032-NEXT: BB18_1: +; GFX1032-NEXT: v_mov_b32_e32 v0, 5 +; GFX1032-NEXT: v_mov_b32_e32 v2, local_var64 at abs32@lo +; GFX1032-NEXT: v_mov_b32_e32 v1, 0 +; GFX1032-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1032-NEXT: s_waitcnt_vscnt null, 0x0 +; GFX1032-NEXT: ds_max_rtn_i64 v[0:1], v2, v[0:1] +; GFX1032-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1032-NEXT: buffer_gl0_inv +; GFX1032-NEXT: buffer_gl1_inv +; GFX1032-NEXT: BB18_2: +; GFX1032-NEXT: v_nop +; GFX1032-NEXT: s_or_b32 exec_lo, exec_lo, s2 +; GFX1032-NEXT: v_readfirstlane_b32 s4, v0 +; GFX1032-NEXT: v_readfirstlane_b32 s5, v1 +; GFX1032-NEXT: v_cndmask_b32_e64 v1, 0, 0x80000000, vcc_lo +; GFX1032-NEXT: v_cndmask_b32_e64 v0, 5, 0, vcc_lo +; GFX1032-NEXT: s_mov_b32 s3, 0x31016000 +; GFX1032-NEXT: s_mov_b32 s2, -1 +; GFX1032-NEXT: v_cmp_gt_i64_e32 vcc_lo, s[4:5], v[0:1] +; GFX1032-NEXT: v_cndmask_b32_e64 v1, v1, s5, vcc_lo +; GFX1032-NEXT: v_cndmask_b32_e64 v0, v0, s4, vcc_lo +; GFX1032-NEXT: s_waitcnt lgkmcnt(0) +; GFX1032-NEXT: buffer_store_dwordx2 v[0:1], off, s[0:3], 0 +; GFX1032-NEXT: s_endpgm entry: %old = atomicrmw max i64 addrspace(3)* @local_var64, i64 5 acq_rel store i64 %old, i64 addrspace(1)* %out ret void } -; GCN-LABEL: min_i32_varying: ; GFX8MORE32: 
v_readlane_b32 s[[scalar_value:[0-9]+]], v{{[0-9]+}}, 31 -; GFX8MORE64: v_readlane_b32 s[[scalar_value:[0-9]+]], v{{[0-9]+}}, 63 ; GFX8MORE: v_mov_b32{{(_e[0-9]+)?}} v[[value:[0-9]+]], s[[scalar_value]] ; GFX8MORE: ds_min_rtn_i32 v{{[0-9]+}}, v{{[0-9]+}}, v[[value]] define amdgpu_kernel void @min_i32_varying(i32 addrspace(1)* %out) { +; +; +; GFX7LESS-LABEL: min_i32_varying: +; GFX7LESS: ; %bb.0: ; %entry +; GFX7LESS-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x9 +; GFX7LESS-NEXT: v_mov_b32_e32 v1, local_var32 at abs32@lo +; GFX7LESS-NEXT: s_mov_b32 m0, -1 +; GFX7LESS-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX7LESS-NEXT: ds_min_rtn_i32 v0, v1, v0 +; GFX7LESS-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX7LESS-NEXT: buffer_wbinvl1 +; GFX7LESS-NEXT: s_mov_b32 s3, 0xf000 +; GFX7LESS-NEXT: s_mov_b32 s2, -1 +; GFX7LESS-NEXT: buffer_store_dword v0, off, s[0:3], 0 +; GFX7LESS-NEXT: s_endpgm +; +; GFX8-LABEL: min_i32_varying: +; GFX8: ; %bb.0: ; %entry +; GFX8-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x24 +; GFX8-NEXT: v_cmp_ne_u32_e64 s[2:3], 1, 0 +; GFX8-NEXT: v_mbcnt_lo_u32_b32 v3, s2, 0 +; GFX8-NEXT: v_mbcnt_hi_u32_b32 v3, s3, v3 +; GFX8-NEXT: v_mov_b32_e32 v2, v0 +; GFX8-NEXT: s_or_saveexec_b64 s[2:3], -1 +; GFX8-NEXT: v_bfrev_b32_e32 v1, -2 +; GFX8-NEXT: s_mov_b64 exec, s[2:3] +; GFX8-NEXT: s_not_b64 exec, exec +; GFX8-NEXT: v_mov_b32_e32 v2, v1 +; GFX8-NEXT: s_not_b64 exec, exec +; GFX8-NEXT: s_or_saveexec_b64 s[4:5], -1 +; GFX8-NEXT: v_min_i32_dpp v2, v2, v2 row_shr:1 row_mask:0xf bank_mask:0xf +; GFX8-NEXT: s_nop 1 +; GFX8-NEXT: v_min_i32_dpp v2, v2, v2 row_shr:2 row_mask:0xf bank_mask:0xf +; GFX8-NEXT: s_nop 1 +; GFX8-NEXT: v_min_i32_dpp v2, v2, v2 row_shr:4 row_mask:0xf bank_mask:0xf +; GFX8-NEXT: s_nop 1 +; GFX8-NEXT: v_min_i32_dpp v2, v2, v2 row_shr:8 row_mask:0xf bank_mask:0xf +; GFX8-NEXT: s_nop 1 +; GFX8-NEXT: v_min_i32_dpp v2, v2, v2 row_bcast:15 row_mask:0xa bank_mask:0xf +; GFX8-NEXT: s_nop 1 +; GFX8-NEXT: v_min_i32_dpp v2, v2, v2 row_bcast:31 row_mask:0xc 
bank_mask:0xf +; GFX8-NEXT: v_readlane_b32 s2, v2, 63 +; GFX8-NEXT: s_nop 0 +; GFX8-NEXT: v_mov_b32_dpp v1, v2 wave_shr:1 row_mask:0xf bank_mask:0xf +; GFX8-NEXT: s_mov_b64 exec, s[4:5] +; GFX8-NEXT: v_cmp_eq_u32_e32 vcc, 0, v3 +; GFX8-NEXT: ; implicit-def: $vgpr0 +; GFX8-NEXT: s_and_saveexec_b64 s[4:5], vcc +; GFX8-NEXT: ; mask branch BB19_2 +; GFX8-NEXT: s_cbranch_execz BB19_2 +; GFX8-NEXT: BB19_1: +; GFX8-NEXT: v_mov_b32_e32 v0, local_var32 at abs32@lo +; GFX8-NEXT: v_mov_b32_e32 v3, s2 +; GFX8-NEXT: s_mov_b32 m0, -1 +; GFX8-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX8-NEXT: ds_min_rtn_i32 v0, v0, v3 +; GFX8-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX8-NEXT: buffer_wbinvl1_vol +; GFX8-NEXT: BB19_2: +; GFX8-NEXT: s_or_b64 exec, exec, s[4:5] +; GFX8-NEXT: v_readfirstlane_b32 s2, v0 +; GFX8-NEXT: v_mov_b32_e32 v0, v1 +; GFX8-NEXT: v_min_i32_e32 v0, s2, v0 +; GFX8-NEXT: s_mov_b32 s3, 0xf000 +; GFX8-NEXT: s_mov_b32 s2, -1 +; GFX8-NEXT: s_nop 0 +; GFX8-NEXT: s_waitcnt lgkmcnt(0) +; GFX8-NEXT: buffer_store_dword v0, off, s[0:3], 0 +; GFX8-NEXT: s_endpgm +; +; GFX9-LABEL: min_i32_varying: +; GFX9: ; %bb.0: ; %entry +; GFX9-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x24 +; GFX9-NEXT: v_cmp_ne_u32_e64 s[2:3], 1, 0 +; GFX9-NEXT: v_mbcnt_lo_u32_b32 v3, s2, 0 +; GFX9-NEXT: v_mbcnt_hi_u32_b32 v3, s3, v3 +; GFX9-NEXT: v_mov_b32_e32 v2, v0 +; GFX9-NEXT: s_or_saveexec_b64 s[2:3], -1 +; GFX9-NEXT: v_bfrev_b32_e32 v1, -2 +; GFX9-NEXT: s_mov_b64 exec, s[2:3] +; GFX9-NEXT: s_not_b64 exec, exec +; GFX9-NEXT: v_mov_b32_e32 v2, v1 +; GFX9-NEXT: s_not_b64 exec, exec +; GFX9-NEXT: s_or_saveexec_b64 s[4:5], -1 +; GFX9-NEXT: v_min_i32_dpp v2, v2, v2 row_shr:1 row_mask:0xf bank_mask:0xf +; GFX9-NEXT: s_nop 1 +; GFX9-NEXT: v_min_i32_dpp v2, v2, v2 row_shr:2 row_mask:0xf bank_mask:0xf +; GFX9-NEXT: s_nop 1 +; GFX9-NEXT: v_min_i32_dpp v2, v2, v2 row_shr:4 row_mask:0xf bank_mask:0xf +; GFX9-NEXT: s_nop 1 +; GFX9-NEXT: v_min_i32_dpp v2, v2, v2 row_shr:8 row_mask:0xf bank_mask:0xf +; GFX9-NEXT: s_nop 1 
+; GFX9-NEXT: v_min_i32_dpp v2, v2, v2 row_bcast:15 row_mask:0xa bank_mask:0xf +; GFX9-NEXT: s_nop 1 +; GFX9-NEXT: v_min_i32_dpp v2, v2, v2 row_bcast:31 row_mask:0xc bank_mask:0xf +; GFX9-NEXT: v_readlane_b32 s2, v2, 63 +; GFX9-NEXT: s_nop 0 +; GFX9-NEXT: v_mov_b32_dpp v1, v2 wave_shr:1 row_mask:0xf bank_mask:0xf +; GFX9-NEXT: s_mov_b64 exec, s[4:5] +; GFX9-NEXT: v_cmp_eq_u32_e32 vcc, 0, v3 +; GFX9-NEXT: ; implicit-def: $vgpr0 +; GFX9-NEXT: s_and_saveexec_b64 s[4:5], vcc +; GFX9-NEXT: ; mask branch BB19_2 +; GFX9-NEXT: s_cbranch_execz BB19_2 +; GFX9-NEXT: BB19_1: +; GFX9-NEXT: v_mov_b32_e32 v0, local_var32 at abs32@lo +; GFX9-NEXT: v_mov_b32_e32 v3, s2 +; GFX9-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX9-NEXT: ds_min_rtn_i32 v0, v0, v3 +; GFX9-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX9-NEXT: buffer_wbinvl1_vol +; GFX9-NEXT: BB19_2: +; GFX9-NEXT: s_or_b64 exec, exec, s[4:5] +; GFX9-NEXT: v_readfirstlane_b32 s2, v0 +; GFX9-NEXT: v_mov_b32_e32 v0, v1 +; GFX9-NEXT: v_min_i32_e32 v0, s2, v0 +; GFX9-NEXT: s_mov_b32 s3, 0xf000 +; GFX9-NEXT: s_mov_b32 s2, -1 +; GFX9-NEXT: s_nop 0 +; GFX9-NEXT: s_waitcnt lgkmcnt(0) +; GFX9-NEXT: buffer_store_dword v0, off, s[0:3], 0 +; GFX9-NEXT: s_endpgm +; +; GFX1064-LABEL: min_i32_varying: +; GFX1064: ; %bb.0: ; %entry +; GFX1064-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x24 +; GFX1064-NEXT: v_cmp_ne_u32_e64 s[2:3], 1, 0 +; GFX1064-NEXT: v_mov_b32_e32 v2, v0 +; GFX1064-NEXT: v_mbcnt_lo_u32_b32_e64 v4, s2, 0 +; GFX1064-NEXT: v_mbcnt_hi_u32_b32_e64 v4, s3, v4 +; GFX1064-NEXT: s_or_saveexec_b64 s[2:3], -1 +; GFX1064-NEXT: v_bfrev_b32_e32 v1, -2 +; GFX1064-NEXT: s_mov_b64 exec, s[2:3] +; GFX1064-NEXT: s_not_b64 exec, exec +; GFX1064-NEXT: v_mov_b32_e32 v2, v1 +; GFX1064-NEXT: s_not_b64 exec, exec +; GFX1064-NEXT: s_or_saveexec_b64 s[4:5], -1 +; GFX1064-NEXT: v_min_i32_dpp v2, v2, v2 row_shr:1 row_mask:0xf bank_mask:0xf +; GFX1064-NEXT: v_min_i32_dpp v2, v2, v2 row_shr:2 row_mask:0xf bank_mask:0xf +; GFX1064-NEXT: v_min_i32_dpp v2, v2, v2 
row_shr:4 row_mask:0xf bank_mask:0xf +; GFX1064-NEXT: v_min_i32_dpp v2, v2, v2 row_shr:8 row_mask:0xf bank_mask:0xf +; GFX1064-NEXT: v_mov_b32_e32 v3, v2 +; GFX1064-NEXT: v_permlanex16_b32 v3, v3, -1, -1 +; GFX1064-NEXT: v_min_i32_dpp v2, v3, v2 quad_perm:[0,1,2,3] row_mask:0xa bank_mask:0xf +; GFX1064-NEXT: v_readlane_b32 s2, v2, 31 +; GFX1064-NEXT: v_mov_b32_e32 v3, s2 +; GFX1064-NEXT: v_min_i32_dpp v2, v3, v2 quad_perm:[0,1,2,3] row_mask:0xc bank_mask:0xf +; GFX1064-NEXT: v_readlane_b32 s2, v2, 15 +; GFX1064-NEXT: v_mov_b32_dpp v1, v2 row_shr:1 row_mask:0xf bank_mask:0xf +; GFX1064-NEXT: v_readlane_b32 s3, v2, 31 +; GFX1064-NEXT: v_readlane_b32 s6, v2, 47 +; GFX1064-NEXT: v_writelane_b32 v1, s2, 16 +; GFX1064-NEXT: s_mov_b32 s2, -1 +; GFX1064-NEXT: v_writelane_b32 v1, s3, 32 +; GFX1064-NEXT: v_readlane_b32 s3, v2, 63 +; GFX1064-NEXT: v_writelane_b32 v1, s6, 48 +; GFX1064-NEXT: s_mov_b64 exec, s[4:5] +; GFX1064-NEXT: v_cmp_eq_u32_e32 vcc, 0, v4 +; GFX1064-NEXT: ; implicit-def: $vgpr0 +; GFX1064-NEXT: s_and_saveexec_b64 s[4:5], vcc +; GFX1064-NEXT: ; mask branch BB19_2 +; GFX1064-NEXT: s_cbranch_execz BB19_2 +; GFX1064-NEXT: BB19_1: +; GFX1064-NEXT: v_mov_b32_e32 v0, local_var32 at abs32@lo +; GFX1064-NEXT: v_mov_b32_e32 v7, s3 +; GFX1064-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1064-NEXT: s_waitcnt_vscnt null, 0x0 +; GFX1064-NEXT: ds_min_rtn_i32 v0, v0, v7 +; GFX1064-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1064-NEXT: buffer_gl0_inv +; GFX1064-NEXT: buffer_gl1_inv +; GFX1064-NEXT: BB19_2: +; GFX1064-NEXT: v_nop +; GFX1064-NEXT: s_or_b64 exec, exec, s[4:5] +; GFX1064-NEXT: v_readfirstlane_b32 s3, v0 +; GFX1064-NEXT: v_mov_b32_e32 v0, v1 +; GFX1064-NEXT: v_min_i32_e32 v0, s3, v0 +; GFX1064-NEXT: s_mov_b32 s3, 0x31016000 +; GFX1064-NEXT: s_nop 1 +; GFX1064-NEXT: s_waitcnt lgkmcnt(0) +; GFX1064-NEXT: buffer_store_dword v0, off, s[0:3], 0 +; GFX1064-NEXT: s_endpgm +; +; GFX1032-LABEL: min_i32_varying: +; GFX1032: ; %bb.0: ; %entry +; GFX1032-NEXT: s_load_dwordx2 
s[0:1], s[0:1], 0x24 +; GFX1032-NEXT: v_cmp_ne_u32_e64 s2, 1, 0 +; GFX1032-NEXT: ; implicit-def: $vcc_hi +; GFX1032-NEXT: v_mov_b32_e32 v2, v0 +; GFX1032-NEXT: v_mbcnt_lo_u32_b32_e64 v4, s2, 0 +; GFX1032-NEXT: s_or_saveexec_b32 s2, -1 +; GFX1032-NEXT: v_bfrev_b32_e32 v1, -2 +; GFX1032-NEXT: s_mov_b32 exec_lo, s2 +; GFX1032-NEXT: s_not_b32 exec_lo, exec_lo +; GFX1032-NEXT: v_mov_b32_e32 v2, v1 +; GFX1032-NEXT: s_not_b32 exec_lo, exec_lo +; GFX1032-NEXT: s_or_saveexec_b32 s4, -1 +; GFX1032-NEXT: v_min_i32_dpp v2, v2, v2 row_shr:1 row_mask:0xf bank_mask:0xf +; GFX1032-NEXT: s_mov_b32 s2, -1 +; GFX1032-NEXT: v_min_i32_dpp v2, v2, v2 row_shr:2 row_mask:0xf bank_mask:0xf +; GFX1032-NEXT: v_min_i32_dpp v2, v2, v2 row_shr:4 row_mask:0xf bank_mask:0xf +; GFX1032-NEXT: v_min_i32_dpp v2, v2, v2 row_shr:8 row_mask:0xf bank_mask:0xf +; GFX1032-NEXT: v_mov_b32_e32 v3, v2 +; GFX1032-NEXT: v_permlanex16_b32 v3, v3, -1, -1 +; GFX1032-NEXT: v_min_i32_dpp v2, v3, v2 quad_perm:[0,1,2,3] row_mask:0xa bank_mask:0xf +; GFX1032-NEXT: v_readlane_b32 s3, v2, 31 +; GFX1032-NEXT: v_mov_b32_dpp v1, v2 row_shr:1 row_mask:0xf bank_mask:0xf +; GFX1032-NEXT: v_readlane_b32 s5, v2, 15 +; GFX1032-NEXT: v_writelane_b32 v1, s5, 16 +; GFX1032-NEXT: s_mov_b32 exec_lo, s4 +; GFX1032-NEXT: v_cmp_eq_u32_e32 vcc_lo, 0, v4 +; GFX1032-NEXT: ; implicit-def: $vgpr0 +; GFX1032-NEXT: s_and_saveexec_b32 s4, vcc_lo +; GFX1032-NEXT: ; mask branch BB19_2 +; GFX1032-NEXT: s_cbranch_execz BB19_2 +; GFX1032-NEXT: BB19_1: +; GFX1032-NEXT: v_mov_b32_e32 v0, local_var32 at abs32@lo +; GFX1032-NEXT: v_mov_b32_e32 v7, s3 +; GFX1032-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1032-NEXT: s_waitcnt_vscnt null, 0x0 +; GFX1032-NEXT: ds_min_rtn_i32 v0, v0, v7 +; GFX1032-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1032-NEXT: buffer_gl0_inv +; GFX1032-NEXT: buffer_gl1_inv +; GFX1032-NEXT: BB19_2: +; GFX1032-NEXT: v_nop +; GFX1032-NEXT: s_or_b32 exec_lo, exec_lo, s4 +; GFX1032-NEXT: v_readfirstlane_b32 s3, v0 +; GFX1032-NEXT: 
v_mov_b32_e32 v0, v1 +; GFX1032-NEXT: v_min_i32_e32 v0, s3, v0 +; GFX1032-NEXT: s_mov_b32 s3, 0x31016000 +; GFX1032-NEXT: s_nop 1 +; GFX1032-NEXT: s_waitcnt lgkmcnt(0) +; GFX1032-NEXT: buffer_store_dword v0, off, s[0:3], 0 +; GFX1032-NEXT: s_endpgm entry: %lane = call i32 @llvm.amdgcn.workitem.id.x() %old = atomicrmw min i32 addrspace(3)* @local_var32, i32 %lane acq_rel @@ -439,28 +4126,440 @@ entry: ret void } -; GCN-LABEL: min_i64_constant: -; GCN32: v_cmp_ne_u32_e64 s[[exec_lo:[0-9]+]], 1, 0 -; GCN64: v_cmp_ne_u32_e64 s{{\[}}[[exec_lo:[0-9]+]]:[[exec_hi:[0-9]+]]{{\]}}, 1, 0 -; GCN: v_mbcnt_lo_u32_b32{{(_e[0-9]+)?}} v[[mbcnt:[0-9]+]], s[[exec_lo]], 0 -; GCN64: v_mbcnt_hi_u32_b32{{(_e[0-9]+)?}} v[[mbcnt]], s[[exec_hi]], v[[mbcnt]] -; GCN: v_cmp_eq_u32{{(_e[0-9]+)?}} vcc{{(_lo)?}}, 0, v[[mbcnt]] -; GCN: v_mov_b32{{(_e[0-9]+)?}} v[[value_lo:[0-9]+]], 5 -; GCN: v_mov_b32{{(_e[0-9]+)?}} v[[value_hi:[0-9]+]], 0 -; GCN: ds_min_rtn_i64 v{{\[}}{{[0-9]+}}:{{[0-9]+}}{{\]}}, v{{[0-9]+}}, v{{\[}}[[value_lo]]:[[value_hi]]{{\]}} define amdgpu_kernel void @min_i64_constant(i64 addrspace(1)* %out) { +; +; +; GFX7LESS-LABEL: min_i64_constant: +; GFX7LESS: ; %bb.0: ; %entry +; GFX7LESS-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x9 +; GFX7LESS-NEXT: v_cmp_ne_u32_e64 s[2:3], 1, 0 +; GFX7LESS-NEXT: v_mbcnt_lo_u32_b32_e64 v0, s2, 0 +; GFX7LESS-NEXT: v_mbcnt_hi_u32_b32_e32 v0, s3, v0 +; GFX7LESS-NEXT: v_cmp_eq_u32_e32 vcc, 0, v0 +; GFX7LESS-NEXT: ; implicit-def: $vgpr0_vgpr1 +; GFX7LESS-NEXT: s_and_saveexec_b64 s[2:3], vcc +; GFX7LESS-NEXT: ; mask branch BB20_2 +; GFX7LESS-NEXT: s_cbranch_execz BB20_2 +; GFX7LESS-NEXT: BB20_1: +; GFX7LESS-NEXT: v_mov_b32_e32 v2, local_var64 at abs32@lo +; GFX7LESS-NEXT: v_mov_b32_e32 v0, 5 +; GFX7LESS-NEXT: v_mov_b32_e32 v1, 0 +; GFX7LESS-NEXT: s_mov_b32 m0, -1 +; GFX7LESS-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX7LESS-NEXT: ds_min_rtn_i64 v[0:1], v2, v[0:1] +; GFX7LESS-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX7LESS-NEXT: buffer_wbinvl1 +; GFX7LESS-NEXT: 
BB20_2: +; GFX7LESS-NEXT: s_or_b64 exec, exec, s[2:3] +; GFX7LESS-NEXT: v_readfirstlane_b32 s4, v0 +; GFX7LESS-NEXT: v_readfirstlane_b32 s5, v1 +; GFX7LESS-NEXT: v_bfrev_b32_e32 v1, -2 +; GFX7LESS-NEXT: s_mov_b32 s2, -1 +; GFX7LESS-NEXT: v_cndmask_b32_e64 v0, 5, -1, vcc +; GFX7LESS-NEXT: v_cndmask_b32_e32 v1, 0, v1, vcc +; GFX7LESS-NEXT: v_mov_b32_e32 v2, s5 +; GFX7LESS-NEXT: v_mov_b32_e32 v3, s4 +; GFX7LESS-NEXT: v_cmp_lt_i64_e32 vcc, s[4:5], v[0:1] +; GFX7LESS-NEXT: v_cndmask_b32_e32 v1, v1, v2, vcc +; GFX7LESS-NEXT: v_cndmask_b32_e32 v0, v0, v3, vcc +; GFX7LESS-NEXT: s_mov_b32 s3, 0xf000 +; GFX7LESS-NEXT: s_waitcnt lgkmcnt(0) +; GFX7LESS-NEXT: buffer_store_dwordx2 v[0:1], off, s[0:3], 0 +; GFX7LESS-NEXT: s_endpgm +; +; GFX8-LABEL: min_i64_constant: +; GFX8: ; %bb.0: ; %entry +; GFX8-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x24 +; GFX8-NEXT: v_cmp_ne_u32_e64 s[2:3], 1, 0 +; GFX8-NEXT: v_mbcnt_lo_u32_b32 v0, s2, 0 +; GFX8-NEXT: v_mbcnt_hi_u32_b32 v0, s3, v0 +; GFX8-NEXT: v_cmp_eq_u32_e32 vcc, 0, v0 +; GFX8-NEXT: ; implicit-def: $vgpr0_vgpr1 +; GFX8-NEXT: s_and_saveexec_b64 s[2:3], vcc +; GFX8-NEXT: ; mask branch BB20_2 +; GFX8-NEXT: s_cbranch_execz BB20_2 +; GFX8-NEXT: BB20_1: +; GFX8-NEXT: v_mov_b32_e32 v0, 5 +; GFX8-NEXT: v_mov_b32_e32 v2, local_var64 at abs32@lo +; GFX8-NEXT: v_mov_b32_e32 v1, 0 +; GFX8-NEXT: s_mov_b32 m0, -1 +; GFX8-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX8-NEXT: ds_min_rtn_i64 v[0:1], v2, v[0:1] +; GFX8-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX8-NEXT: buffer_wbinvl1_vol +; GFX8-NEXT: BB20_2: +; GFX8-NEXT: s_or_b64 exec, exec, s[2:3] +; GFX8-NEXT: v_readfirstlane_b32 s4, v0 +; GFX8-NEXT: v_bfrev_b32_e32 v0, -2 +; GFX8-NEXT: v_readfirstlane_b32 s5, v1 +; GFX8-NEXT: v_cndmask_b32_e32 v1, 0, v0, vcc +; GFX8-NEXT: v_cndmask_b32_e64 v0, 5, -1, vcc +; GFX8-NEXT: v_cmp_lt_i64_e32 vcc, s[4:5], v[0:1] +; GFX8-NEXT: v_mov_b32_e32 v2, s5 +; GFX8-NEXT: v_cndmask_b32_e32 v1, v1, v2, vcc +; GFX8-NEXT: v_mov_b32_e32 v2, s4 +; GFX8-NEXT: s_mov_b32 s2, -1 +; 
GFX8-NEXT: v_cndmask_b32_e32 v0, v0, v2, vcc +; GFX8-NEXT: s_mov_b32 s3, 0xf000 +; GFX8-NEXT: s_waitcnt lgkmcnt(0) +; GFX8-NEXT: buffer_store_dwordx2 v[0:1], off, s[0:3], 0 +; GFX8-NEXT: s_endpgm +; +; GFX9-LABEL: min_i64_constant: +; GFX9: ; %bb.0: ; %entry +; GFX9-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x24 +; GFX9-NEXT: v_cmp_ne_u32_e64 s[2:3], 1, 0 +; GFX9-NEXT: v_mbcnt_lo_u32_b32 v0, s2, 0 +; GFX9-NEXT: v_mbcnt_hi_u32_b32 v0, s3, v0 +; GFX9-NEXT: v_cmp_eq_u32_e32 vcc, 0, v0 +; GFX9-NEXT: ; implicit-def: $vgpr0_vgpr1 +; GFX9-NEXT: s_and_saveexec_b64 s[2:3], vcc +; GFX9-NEXT: ; mask branch BB20_2 +; GFX9-NEXT: s_cbranch_execz BB20_2 +; GFX9-NEXT: BB20_1: +; GFX9-NEXT: v_mov_b32_e32 v0, 5 +; GFX9-NEXT: v_mov_b32_e32 v2, local_var64 at abs32@lo +; GFX9-NEXT: v_mov_b32_e32 v1, 0 +; GFX9-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX9-NEXT: ds_min_rtn_i64 v[0:1], v2, v[0:1] +; GFX9-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX9-NEXT: buffer_wbinvl1_vol +; GFX9-NEXT: BB20_2: +; GFX9-NEXT: s_or_b64 exec, exec, s[2:3] +; GFX9-NEXT: v_readfirstlane_b32 s4, v0 +; GFX9-NEXT: v_bfrev_b32_e32 v0, -2 +; GFX9-NEXT: v_readfirstlane_b32 s5, v1 +; GFX9-NEXT: v_cndmask_b32_e32 v1, 0, v0, vcc +; GFX9-NEXT: v_cndmask_b32_e64 v0, 5, -1, vcc +; GFX9-NEXT: v_cmp_lt_i64_e32 vcc, s[4:5], v[0:1] +; GFX9-NEXT: v_mov_b32_e32 v2, s5 +; GFX9-NEXT: v_cndmask_b32_e32 v1, v1, v2, vcc +; GFX9-NEXT: v_mov_b32_e32 v2, s4 +; GFX9-NEXT: s_mov_b32 s2, -1 +; GFX9-NEXT: v_cndmask_b32_e32 v0, v0, v2, vcc +; GFX9-NEXT: s_mov_b32 s3, 0xf000 +; GFX9-NEXT: s_waitcnt lgkmcnt(0) +; GFX9-NEXT: buffer_store_dwordx2 v[0:1], off, s[0:3], 0 +; GFX9-NEXT: s_endpgm +; +; GFX1064-LABEL: min_i64_constant: +; GFX1064: ; %bb.0: ; %entry +; GFX1064-NEXT: v_cmp_ne_u32_e64 s[2:3], 1, 0 +; GFX1064-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x24 +; GFX1064-NEXT: v_mbcnt_lo_u32_b32_e64 v0, s2, 0 +; GFX1064-NEXT: v_mbcnt_hi_u32_b32_e64 v0, s3, v0 +; GFX1064-NEXT: v_cmp_eq_u32_e32 vcc, 0, v0 +; GFX1064-NEXT: ; implicit-def: $vgpr0_vgpr1 +; 
GFX1064-NEXT: s_and_saveexec_b64 s[2:3], vcc +; GFX1064-NEXT: ; mask branch BB20_2 +; GFX1064-NEXT: s_cbranch_execz BB20_2 +; GFX1064-NEXT: BB20_1: +; GFX1064-NEXT: v_mov_b32_e32 v0, 5 +; GFX1064-NEXT: v_mov_b32_e32 v2, local_var64 at abs32@lo +; GFX1064-NEXT: v_mov_b32_e32 v1, 0 +; GFX1064-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1064-NEXT: s_waitcnt_vscnt null, 0x0 +; GFX1064-NEXT: ds_min_rtn_i64 v[0:1], v2, v[0:1] +; GFX1064-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1064-NEXT: buffer_gl0_inv +; GFX1064-NEXT: buffer_gl1_inv +; GFX1064-NEXT: BB20_2: +; GFX1064-NEXT: v_nop +; GFX1064-NEXT: s_or_b64 exec, exec, s[2:3] +; GFX1064-NEXT: v_readfirstlane_b32 s4, v0 +; GFX1064-NEXT: v_readfirstlane_b32 s5, v1 +; GFX1064-NEXT: v_cndmask_b32_e64 v1, 0, 0x7fffffff, vcc +; GFX1064-NEXT: v_cndmask_b32_e64 v0, 5, -1, vcc +; GFX1064-NEXT: s_mov_b32 s2, -1 +; GFX1064-NEXT: s_mov_b32 s3, 0x31016000 +; GFX1064-NEXT: v_cmp_lt_i64_e32 vcc, s[4:5], v[0:1] +; GFX1064-NEXT: v_cndmask_b32_e64 v1, v1, s5, vcc +; GFX1064-NEXT: v_cndmask_b32_e64 v0, v0, s4, vcc +; GFX1064-NEXT: s_waitcnt lgkmcnt(0) +; GFX1064-NEXT: buffer_store_dwordx2 v[0:1], off, s[0:3], 0 +; GFX1064-NEXT: s_endpgm +; +; GFX1032-LABEL: min_i64_constant: +; GFX1032: ; %bb.0: ; %entry +; GFX1032-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x24 +; GFX1032-NEXT: v_cmp_ne_u32_e64 s2, 1, 0 +; GFX1032-NEXT: ; implicit-def: $vcc_hi +; GFX1032-NEXT: v_mbcnt_lo_u32_b32_e64 v0, s2, 0 +; GFX1032-NEXT: v_cmp_eq_u32_e32 vcc_lo, 0, v0 +; GFX1032-NEXT: ; implicit-def: $vgpr0_vgpr1 +; GFX1032-NEXT: s_and_saveexec_b32 s2, vcc_lo +; GFX1032-NEXT: ; mask branch BB20_2 +; GFX1032-NEXT: s_cbranch_execz BB20_2 +; GFX1032-NEXT: BB20_1: +; GFX1032-NEXT: v_mov_b32_e32 v0, 5 +; GFX1032-NEXT: v_mov_b32_e32 v2, local_var64 at abs32@lo +; GFX1032-NEXT: v_mov_b32_e32 v1, 0 +; GFX1032-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1032-NEXT: s_waitcnt_vscnt null, 0x0 +; GFX1032-NEXT: ds_min_rtn_i64 v[0:1], v2, v[0:1] +; GFX1032-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) 
+; GFX1032-NEXT: buffer_gl0_inv +; GFX1032-NEXT: buffer_gl1_inv +; GFX1032-NEXT: BB20_2: +; GFX1032-NEXT: v_nop +; GFX1032-NEXT: s_or_b32 exec_lo, exec_lo, s2 +; GFX1032-NEXT: v_readfirstlane_b32 s4, v0 +; GFX1032-NEXT: v_readfirstlane_b32 s5, v1 +; GFX1032-NEXT: v_cndmask_b32_e64 v1, 0, 0x7fffffff, vcc_lo +; GFX1032-NEXT: v_cndmask_b32_e64 v0, 5, -1, vcc_lo +; GFX1032-NEXT: s_mov_b32 s2, -1 +; GFX1032-NEXT: s_mov_b32 s3, 0x31016000 +; GFX1032-NEXT: v_cmp_lt_i64_e32 vcc_lo, s[4:5], v[0:1] +; GFX1032-NEXT: v_cndmask_b32_e64 v1, v1, s5, vcc_lo +; GFX1032-NEXT: v_cndmask_b32_e64 v0, v0, s4, vcc_lo +; GFX1032-NEXT: s_waitcnt lgkmcnt(0) +; GFX1032-NEXT: buffer_store_dwordx2 v[0:1], off, s[0:3], 0 +; GFX1032-NEXT: s_endpgm entry: %old = atomicrmw min i64 addrspace(3)* @local_var64, i64 5 acq_rel store i64 %old, i64 addrspace(1)* %out ret void } -; GCN-LABEL: umax_i32_varying: ; GFX8MORE32: v_readlane_b32 s[[scalar_value:[0-9]+]], v{{[0-9]+}}, 31 -; GFX8MORE64: v_readlane_b32 s[[scalar_value:[0-9]+]], v{{[0-9]+}}, 63 ; GFX8MORE: v_mov_b32{{(_e[0-9]+)?}} v[[value:[0-9]+]], s[[scalar_value]] ; GFX8MORE: ds_max_rtn_u32 v{{[0-9]+}}, v{{[0-9]+}}, v[[value]] define amdgpu_kernel void @umax_i32_varying(i32 addrspace(1)* %out) { +; +; +; GFX7LESS-LABEL: umax_i32_varying: +; GFX7LESS: ; %bb.0: ; %entry +; GFX7LESS-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x9 +; GFX7LESS-NEXT: v_mov_b32_e32 v1, local_var32 at abs32@lo +; GFX7LESS-NEXT: s_mov_b32 m0, -1 +; GFX7LESS-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX7LESS-NEXT: ds_max_rtn_u32 v0, v1, v0 +; GFX7LESS-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX7LESS-NEXT: buffer_wbinvl1 +; GFX7LESS-NEXT: s_mov_b32 s3, 0xf000 +; GFX7LESS-NEXT: s_mov_b32 s2, -1 +; GFX7LESS-NEXT: buffer_store_dword v0, off, s[0:3], 0 +; GFX7LESS-NEXT: s_endpgm +; +; GFX8-LABEL: umax_i32_varying: +; GFX8: ; %bb.0: ; %entry +; GFX8-NEXT: v_mov_b32_e32 v2, v0 +; GFX8-NEXT: s_or_saveexec_b64 s[2:3], -1 +; GFX8-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x24 +; GFX8-NEXT: 
v_mov_b32_e32 v1, 0 +; GFX8-NEXT: s_mov_b64 exec, s[2:3] +; GFX8-NEXT: v_cmp_ne_u32_e64 s[2:3], 1, 0 +; GFX8-NEXT: v_mbcnt_lo_u32_b32 v0, s2, 0 +; GFX8-NEXT: v_mbcnt_hi_u32_b32 v0, s3, v0 +; GFX8-NEXT: s_not_b64 exec, exec +; GFX8-NEXT: v_mov_b32_e32 v2, 0 +; GFX8-NEXT: s_not_b64 exec, exec +; GFX8-NEXT: s_or_saveexec_b64 s[4:5], -1 +; GFX8-NEXT: v_max_u32_dpp v2, v2, v2 row_shr:1 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX8-NEXT: s_nop 1 +; GFX8-NEXT: v_max_u32_dpp v2, v2, v2 row_shr:2 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX8-NEXT: s_nop 1 +; GFX8-NEXT: v_max_u32_dpp v2, v2, v2 row_shr:4 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX8-NEXT: s_nop 1 +; GFX8-NEXT: v_max_u32_dpp v2, v2, v2 row_shr:8 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX8-NEXT: s_nop 1 +; GFX8-NEXT: v_max_u32_dpp v2, v2, v2 row_bcast:15 row_mask:0xa bank_mask:0xf +; GFX8-NEXT: s_nop 1 +; GFX8-NEXT: v_max_u32_dpp v2, v2, v2 row_bcast:31 row_mask:0xc bank_mask:0xf +; GFX8-NEXT: v_readlane_b32 s2, v2, 63 +; GFX8-NEXT: s_nop 0 +; GFX8-NEXT: v_mov_b32_dpp v1, v2 wave_shr:1 row_mask:0xf bank_mask:0xf +; GFX8-NEXT: s_mov_b64 exec, s[4:5] +; GFX8-NEXT: v_cmp_eq_u32_e32 vcc, 0, v0 +; GFX8-NEXT: ; implicit-def: $vgpr0 +; GFX8-NEXT: s_and_saveexec_b64 s[4:5], vcc +; GFX8-NEXT: ; mask branch BB21_2 +; GFX8-NEXT: s_cbranch_execz BB21_2 +; GFX8-NEXT: BB21_1: +; GFX8-NEXT: v_mov_b32_e32 v0, local_var32 at abs32@lo +; GFX8-NEXT: v_mov_b32_e32 v3, s2 +; GFX8-NEXT: s_mov_b32 m0, -1 +; GFX8-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX8-NEXT: ds_max_rtn_u32 v0, v0, v3 +; GFX8-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX8-NEXT: buffer_wbinvl1_vol +; GFX8-NEXT: BB21_2: +; GFX8-NEXT: s_or_b64 exec, exec, s[4:5] +; GFX8-NEXT: v_readfirstlane_b32 s2, v0 +; GFX8-NEXT: v_mov_b32_e32 v0, v1 +; GFX8-NEXT: v_max_u32_e32 v0, s2, v0 +; GFX8-NEXT: s_mov_b32 s3, 0xf000 +; GFX8-NEXT: s_mov_b32 s2, -1 +; GFX8-NEXT: s_nop 0 +; GFX8-NEXT: s_waitcnt lgkmcnt(0) +; GFX8-NEXT: buffer_store_dword v0, off, s[0:3], 0 +; GFX8-NEXT: 
s_endpgm +; +; GFX9-LABEL: umax_i32_varying: +; GFX9: ; %bb.0: ; %entry +; GFX9-NEXT: v_mov_b32_e32 v2, v0 +; GFX9-NEXT: s_or_saveexec_b64 s[2:3], -1 +; GFX9-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x24 +; GFX9-NEXT: v_mov_b32_e32 v1, 0 +; GFX9-NEXT: s_mov_b64 exec, s[2:3] +; GFX9-NEXT: v_cmp_ne_u32_e64 s[2:3], 1, 0 +; GFX9-NEXT: v_mbcnt_lo_u32_b32 v0, s2, 0 +; GFX9-NEXT: v_mbcnt_hi_u32_b32 v0, s3, v0 +; GFX9-NEXT: s_not_b64 exec, exec +; GFX9-NEXT: v_mov_b32_e32 v2, 0 +; GFX9-NEXT: s_not_b64 exec, exec +; GFX9-NEXT: s_or_saveexec_b64 s[4:5], -1 +; GFX9-NEXT: v_max_u32_dpp v2, v2, v2 row_shr:1 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX9-NEXT: s_nop 1 +; GFX9-NEXT: v_max_u32_dpp v2, v2, v2 row_shr:2 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX9-NEXT: s_nop 1 +; GFX9-NEXT: v_max_u32_dpp v2, v2, v2 row_shr:4 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX9-NEXT: s_nop 1 +; GFX9-NEXT: v_max_u32_dpp v2, v2, v2 row_shr:8 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX9-NEXT: s_nop 1 +; GFX9-NEXT: v_max_u32_dpp v2, v2, v2 row_bcast:15 row_mask:0xa bank_mask:0xf +; GFX9-NEXT: s_nop 1 +; GFX9-NEXT: v_max_u32_dpp v2, v2, v2 row_bcast:31 row_mask:0xc bank_mask:0xf +; GFX9-NEXT: v_readlane_b32 s2, v2, 63 +; GFX9-NEXT: s_nop 0 +; GFX9-NEXT: v_mov_b32_dpp v1, v2 wave_shr:1 row_mask:0xf bank_mask:0xf +; GFX9-NEXT: s_mov_b64 exec, s[4:5] +; GFX9-NEXT: v_cmp_eq_u32_e32 vcc, 0, v0 +; GFX9-NEXT: ; implicit-def: $vgpr0 +; GFX9-NEXT: s_and_saveexec_b64 s[4:5], vcc +; GFX9-NEXT: ; mask branch BB21_2 +; GFX9-NEXT: s_cbranch_execz BB21_2 +; GFX9-NEXT: BB21_1: +; GFX9-NEXT: v_mov_b32_e32 v0, local_var32 at abs32@lo +; GFX9-NEXT: v_mov_b32_e32 v3, s2 +; GFX9-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX9-NEXT: ds_max_rtn_u32 v0, v0, v3 +; GFX9-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX9-NEXT: buffer_wbinvl1_vol +; GFX9-NEXT: BB21_2: +; GFX9-NEXT: s_or_b64 exec, exec, s[4:5] +; GFX9-NEXT: v_readfirstlane_b32 s2, v0 +; GFX9-NEXT: v_mov_b32_e32 v0, v1 +; GFX9-NEXT: v_max_u32_e32 v0, s2, v0 +; 
GFX9-NEXT: s_mov_b32 s3, 0xf000 +; GFX9-NEXT: s_mov_b32 s2, -1 +; GFX9-NEXT: s_nop 0 +; GFX9-NEXT: s_waitcnt lgkmcnt(0) +; GFX9-NEXT: buffer_store_dword v0, off, s[0:3], 0 +; GFX9-NEXT: s_endpgm +; +; GFX1064-LABEL: umax_i32_varying: +; GFX1064: ; %bb.0: ; %entry +; GFX1064-NEXT: v_mov_b32_e32 v2, v0 +; GFX1064-NEXT: s_or_saveexec_b64 s[2:3], -1 +; GFX1064-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x24 +; GFX1064-NEXT: v_mov_b32_e32 v1, 0 +; GFX1064-NEXT: s_mov_b64 exec, s[2:3] +; GFX1064-NEXT: v_cmp_ne_u32_e64 s[2:3], 1, 0 +; GFX1064-NEXT: v_mbcnt_lo_u32_b32_e64 v0, s2, 0 +; GFX1064-NEXT: v_mbcnt_hi_u32_b32_e64 v0, s3, v0 +; GFX1064-NEXT: s_not_b64 exec, exec +; GFX1064-NEXT: v_mov_b32_e32 v2, 0 +; GFX1064-NEXT: s_not_b64 exec, exec +; GFX1064-NEXT: s_or_saveexec_b64 s[4:5], -1 +; GFX1064-NEXT: v_max_u32_dpp v2, v2, v2 row_shr:1 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX1064-NEXT: v_max_u32_dpp v2, v2, v2 row_shr:2 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX1064-NEXT: v_max_u32_dpp v2, v2, v2 row_shr:4 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX1064-NEXT: v_max_u32_dpp v2, v2, v2 row_shr:8 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX1064-NEXT: v_mov_b32_e32 v3, v2 +; GFX1064-NEXT: v_permlanex16_b32 v3, v3, -1, -1 +; GFX1064-NEXT: v_max_u32_dpp v2, v3, v2 quad_perm:[0,1,2,3] row_mask:0xa bank_mask:0xf +; GFX1064-NEXT: v_readlane_b32 s2, v2, 31 +; GFX1064-NEXT: v_mov_b32_e32 v3, s2 +; GFX1064-NEXT: v_max_u32_dpp v2, v3, v2 quad_perm:[0,1,2,3] row_mask:0xc bank_mask:0xf +; GFX1064-NEXT: v_readlane_b32 s2, v2, 15 +; GFX1064-NEXT: v_mov_b32_dpp v1, v2 row_shr:1 row_mask:0xf bank_mask:0xf +; GFX1064-NEXT: v_readlane_b32 s3, v2, 31 +; GFX1064-NEXT: v_readlane_b32 s6, v2, 47 +; GFX1064-NEXT: v_writelane_b32 v1, s2, 16 +; GFX1064-NEXT: s_mov_b32 s2, -1 +; GFX1064-NEXT: v_writelane_b32 v1, s3, 32 +; GFX1064-NEXT: v_readlane_b32 s3, v2, 63 +; GFX1064-NEXT: v_writelane_b32 v1, s6, 48 +; GFX1064-NEXT: s_mov_b64 exec, s[4:5] +; GFX1064-NEXT: v_cmp_eq_u32_e32 vcc, 0, v0 
+; GFX1064-NEXT: ; implicit-def: $vgpr0 +; GFX1064-NEXT: s_and_saveexec_b64 s[4:5], vcc +; GFX1064-NEXT: ; mask branch BB21_2 +; GFX1064-NEXT: s_cbranch_execz BB21_2 +; GFX1064-NEXT: BB21_1: +; GFX1064-NEXT: v_mov_b32_e32 v0, local_var32 at abs32@lo +; GFX1064-NEXT: v_mov_b32_e32 v7, s3 +; GFX1064-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1064-NEXT: s_waitcnt_vscnt null, 0x0 +; GFX1064-NEXT: ds_max_rtn_u32 v0, v0, v7 +; GFX1064-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1064-NEXT: buffer_gl0_inv +; GFX1064-NEXT: buffer_gl1_inv +; GFX1064-NEXT: BB21_2: +; GFX1064-NEXT: v_nop +; GFX1064-NEXT: s_or_b64 exec, exec, s[4:5] +; GFX1064-NEXT: v_readfirstlane_b32 s3, v0 +; GFX1064-NEXT: v_mov_b32_e32 v0, v1 +; GFX1064-NEXT: v_max_u32_e32 v0, s3, v0 +; GFX1064-NEXT: s_mov_b32 s3, 0x31016000 +; GFX1064-NEXT: s_nop 1 +; GFX1064-NEXT: s_waitcnt lgkmcnt(0) +; GFX1064-NEXT: buffer_store_dword v0, off, s[0:3], 0 +; GFX1064-NEXT: s_endpgm +; +; GFX1032-LABEL: umax_i32_varying: +; GFX1032: ; %bb.0: ; %entry +; GFX1032-NEXT: ; implicit-def: $vcc_hi +; GFX1032-NEXT: v_mov_b32_e32 v2, v0 +; GFX1032-NEXT: s_or_saveexec_b32 s2, -1 +; GFX1032-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x24 +; GFX1032-NEXT: v_mov_b32_e32 v1, 0 +; GFX1032-NEXT: s_mov_b32 exec_lo, s2 +; GFX1032-NEXT: v_cmp_ne_u32_e64 s2, 1, 0 +; GFX1032-NEXT: v_mbcnt_lo_u32_b32_e64 v0, s2, 0 +; GFX1032-NEXT: s_not_b32 exec_lo, exec_lo +; GFX1032-NEXT: v_mov_b32_e32 v2, 0 +; GFX1032-NEXT: s_not_b32 exec_lo, exec_lo +; GFX1032-NEXT: s_or_saveexec_b32 s4, -1 +; GFX1032-NEXT: v_max_u32_dpp v2, v2, v2 row_shr:1 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX1032-NEXT: s_mov_b32 s2, -1 +; GFX1032-NEXT: v_max_u32_dpp v2, v2, v2 row_shr:2 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX1032-NEXT: v_max_u32_dpp v2, v2, v2 row_shr:4 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX1032-NEXT: v_max_u32_dpp v2, v2, v2 row_shr:8 row_mask:0xf bank_mask:0xf bound_ctrl:0 +; GFX1032-NEXT: v_mov_b32_e32 v3, v2 +; GFX1032-NEXT: v_permlanex16_b32 v3, v3, 
-1, -1 +; GFX1032-NEXT: v_max_u32_dpp v2, v3, v2 quad_perm:[0,1,2,3] row_mask:0xa bank_mask:0xf +; GFX1032-NEXT: v_readlane_b32 s3, v2, 31 +; GFX1032-NEXT: v_mov_b32_dpp v1, v2 row_shr:1 row_mask:0xf bank_mask:0xf +; GFX1032-NEXT: v_readlane_b32 s5, v2, 15 +; GFX1032-NEXT: v_writelane_b32 v1, s5, 16 +; GFX1032-NEXT: s_mov_b32 exec_lo, s4 +; GFX1032-NEXT: v_cmp_eq_u32_e32 vcc_lo, 0, v0 +; GFX1032-NEXT: ; implicit-def: $vgpr0 +; GFX1032-NEXT: s_and_saveexec_b32 s4, vcc_lo +; GFX1032-NEXT: ; mask branch BB21_2 +; GFX1032-NEXT: s_cbranch_execz BB21_2 +; GFX1032-NEXT: BB21_1: +; GFX1032-NEXT: v_mov_b32_e32 v0, local_var32 at abs32@lo +; GFX1032-NEXT: v_mov_b32_e32 v7, s3 +; GFX1032-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1032-NEXT: s_waitcnt_vscnt null, 0x0 +; GFX1032-NEXT: ds_max_rtn_u32 v0, v0, v7 +; GFX1032-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1032-NEXT: buffer_gl0_inv +; GFX1032-NEXT: buffer_gl1_inv +; GFX1032-NEXT: BB21_2: +; GFX1032-NEXT: v_nop +; GFX1032-NEXT: s_or_b32 exec_lo, exec_lo, s4 +; GFX1032-NEXT: v_readfirstlane_b32 s3, v0 +; GFX1032-NEXT: v_mov_b32_e32 v0, v1 +; GFX1032-NEXT: v_max_u32_e32 v0, s3, v0 +; GFX1032-NEXT: s_mov_b32 s3, 0x31016000 +; GFX1032-NEXT: s_nop 1 +; GFX1032-NEXT: s_waitcnt lgkmcnt(0) +; GFX1032-NEXT: buffer_store_dword v0, off, s[0:3], 0 +; GFX1032-NEXT: s_endpgm entry: %lane = call i32 @llvm.amdgcn.workitem.id.x() %old = atomicrmw umax i32 addrspace(3)* @local_var32, i32 %lane acq_rel @@ -468,28 +4567,437 @@ entry: ret void } -; GCN-LABEL: umax_i64_constant: -; GCN32: v_cmp_ne_u32_e64 s[[exec_lo:[0-9]+]], 1, 0 -; GCN64: v_cmp_ne_u32_e64 s{{\[}}[[exec_lo:[0-9]+]]:[[exec_hi:[0-9]+]]{{\]}}, 1, 0 -; GCN: v_mbcnt_lo_u32_b32{{(_e[0-9]+)?}} v[[mbcnt:[0-9]+]], s[[exec_lo]], 0 -; GCN64: v_mbcnt_hi_u32_b32{{(_e[0-9]+)?}} v[[mbcnt]], s[[exec_hi]], v[[mbcnt]] -; GCN: v_cmp_eq_u32{{(_e[0-9]+)?}} vcc{{(_lo)?}}, 0, v[[mbcnt]] -; GCN: v_mov_b32{{(_e[0-9]+)?}} v[[value_lo:[0-9]+]], 5 -; GCN: v_mov_b32{{(_e[0-9]+)?}} v[[value_hi:[0-9]+]], 0 
-; GCN: ds_max_rtn_u64 v{{\[}}{{[0-9]+}}:{{[0-9]+}}{{\]}}, v{{[0-9]+}}, v{{\[}}[[value_lo]]:[[value_hi]]{{\]}} define amdgpu_kernel void @umax_i64_constant(i64 addrspace(1)* %out) { +; +; +; GFX7LESS-LABEL: umax_i64_constant: +; GFX7LESS: ; %bb.0: ; %entry +; GFX7LESS-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x9 +; GFX7LESS-NEXT: v_cmp_ne_u32_e64 s[2:3], 1, 0 +; GFX7LESS-NEXT: v_mbcnt_lo_u32_b32_e64 v0, s2, 0 +; GFX7LESS-NEXT: v_mbcnt_hi_u32_b32_e32 v0, s3, v0 +; GFX7LESS-NEXT: v_cmp_eq_u32_e32 vcc, 0, v0 +; GFX7LESS-NEXT: ; implicit-def: $vgpr0_vgpr1 +; GFX7LESS-NEXT: s_and_saveexec_b64 s[2:3], vcc +; GFX7LESS-NEXT: ; mask branch BB22_2 +; GFX7LESS-NEXT: s_cbranch_execz BB22_2 +; GFX7LESS-NEXT: BB22_1: +; GFX7LESS-NEXT: v_mov_b32_e32 v2, local_var64 at abs32@lo +; GFX7LESS-NEXT: v_mov_b32_e32 v0, 5 +; GFX7LESS-NEXT: v_mov_b32_e32 v1, 0 +; GFX7LESS-NEXT: s_mov_b32 m0, -1 +; GFX7LESS-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX7LESS-NEXT: ds_max_rtn_u64 v[0:1], v2, v[0:1] +; GFX7LESS-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX7LESS-NEXT: buffer_wbinvl1 +; GFX7LESS-NEXT: BB22_2: +; GFX7LESS-NEXT: s_or_b64 exec, exec, s[2:3] +; GFX7LESS-NEXT: v_readfirstlane_b32 s4, v0 +; GFX7LESS-NEXT: v_readfirstlane_b32 s5, v1 +; GFX7LESS-NEXT: v_mov_b32_e32 v1, 0 +; GFX7LESS-NEXT: v_cndmask_b32_e64 v0, 5, 0, vcc +; GFX7LESS-NEXT: s_mov_b32 s3, 0xf000 +; GFX7LESS-NEXT: v_mov_b32_e32 v2, s4 +; GFX7LESS-NEXT: v_cmp_gt_u64_e32 vcc, s[4:5], v[0:1] +; GFX7LESS-NEXT: v_cndmask_b32_e32 v0, v0, v2, vcc +; GFX7LESS-NEXT: v_mov_b32_e32 v1, s5 +; GFX7LESS-NEXT: v_cndmask_b32_e32 v1, 0, v1, vcc +; GFX7LESS-NEXT: s_mov_b32 s2, -1 +; GFX7LESS-NEXT: s_waitcnt lgkmcnt(0) +; GFX7LESS-NEXT: buffer_store_dwordx2 v[0:1], off, s[0:3], 0 +; GFX7LESS-NEXT: s_endpgm +; +; GFX8-LABEL: umax_i64_constant: +; GFX8: ; %bb.0: ; %entry +; GFX8-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x24 +; GFX8-NEXT: v_cmp_ne_u32_e64 s[2:3], 1, 0 +; GFX8-NEXT: v_mbcnt_lo_u32_b32 v0, s2, 0 +; GFX8-NEXT: v_mbcnt_hi_u32_b32 v0, s3, v0 +; 
GFX8-NEXT: v_cmp_eq_u32_e32 vcc, 0, v0 +; GFX8-NEXT: ; implicit-def: $vgpr0_vgpr1 +; GFX8-NEXT: s_and_saveexec_b64 s[2:3], vcc +; GFX8-NEXT: ; mask branch BB22_2 +; GFX8-NEXT: s_cbranch_execz BB22_2 +; GFX8-NEXT: BB22_1: +; GFX8-NEXT: v_mov_b32_e32 v0, 5 +; GFX8-NEXT: v_mov_b32_e32 v2, local_var64 at abs32@lo +; GFX8-NEXT: v_mov_b32_e32 v1, 0 +; GFX8-NEXT: s_mov_b32 m0, -1 +; GFX8-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX8-NEXT: ds_max_rtn_u64 v[0:1], v2, v[0:1] +; GFX8-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX8-NEXT: buffer_wbinvl1_vol +; GFX8-NEXT: BB22_2: +; GFX8-NEXT: s_or_b64 exec, exec, s[2:3] +; GFX8-NEXT: v_readfirstlane_b32 s2, v0 +; GFX8-NEXT: v_readfirstlane_b32 s3, v1 +; GFX8-NEXT: v_mov_b32_e32 v1, 0 +; GFX8-NEXT: v_cndmask_b32_e64 v0, 5, 0, vcc +; GFX8-NEXT: v_cmp_gt_u64_e32 vcc, s[2:3], v[0:1] +; GFX8-NEXT: v_mov_b32_e32 v1, s3 +; GFX8-NEXT: v_mov_b32_e32 v2, s2 +; GFX8-NEXT: v_cndmask_b32_e32 v0, v0, v2, vcc +; GFX8-NEXT: v_cndmask_b32_e32 v1, 0, v1, vcc +; GFX8-NEXT: s_mov_b32 s3, 0xf000 +; GFX8-NEXT: s_mov_b32 s2, -1 +; GFX8-NEXT: s_waitcnt lgkmcnt(0) +; GFX8-NEXT: buffer_store_dwordx2 v[0:1], off, s[0:3], 0 +; GFX8-NEXT: s_endpgm +; +; GFX9-LABEL: umax_i64_constant: +; GFX9: ; %bb.0: ; %entry +; GFX9-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x24 +; GFX9-NEXT: v_cmp_ne_u32_e64 s[2:3], 1, 0 +; GFX9-NEXT: v_mbcnt_lo_u32_b32 v0, s2, 0 +; GFX9-NEXT: v_mbcnt_hi_u32_b32 v0, s3, v0 +; GFX9-NEXT: v_cmp_eq_u32_e32 vcc, 0, v0 +; GFX9-NEXT: ; implicit-def: $vgpr0_vgpr1 +; GFX9-NEXT: s_and_saveexec_b64 s[2:3], vcc +; GFX9-NEXT: ; mask branch BB22_2 +; GFX9-NEXT: s_cbranch_execz BB22_2 +; GFX9-NEXT: BB22_1: +; GFX9-NEXT: v_mov_b32_e32 v0, 5 +; GFX9-NEXT: v_mov_b32_e32 v2, local_var64 at abs32@lo +; GFX9-NEXT: v_mov_b32_e32 v1, 0 +; GFX9-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX9-NEXT: ds_max_rtn_u64 v[0:1], v2, v[0:1] +; GFX9-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX9-NEXT: buffer_wbinvl1_vol +; GFX9-NEXT: BB22_2: +; GFX9-NEXT: s_or_b64 exec, exec, s[2:3] +; 
GFX9-NEXT: v_readfirstlane_b32 s2, v0 +; GFX9-NEXT: v_readfirstlane_b32 s3, v1 +; GFX9-NEXT: v_mov_b32_e32 v1, 0 +; GFX9-NEXT: v_cndmask_b32_e64 v0, 5, 0, vcc +; GFX9-NEXT: v_cmp_gt_u64_e32 vcc, s[2:3], v[0:1] +; GFX9-NEXT: v_mov_b32_e32 v1, s3 +; GFX9-NEXT: v_mov_b32_e32 v2, s2 +; GFX9-NEXT: v_cndmask_b32_e32 v0, v0, v2, vcc +; GFX9-NEXT: v_cndmask_b32_e32 v1, 0, v1, vcc +; GFX9-NEXT: s_mov_b32 s3, 0xf000 +; GFX9-NEXT: s_mov_b32 s2, -1 +; GFX9-NEXT: s_waitcnt lgkmcnt(0) +; GFX9-NEXT: buffer_store_dwordx2 v[0:1], off, s[0:3], 0 +; GFX9-NEXT: s_endpgm +; +; GFX1064-LABEL: umax_i64_constant: +; GFX1064: ; %bb.0: ; %entry +; GFX1064-NEXT: v_cmp_ne_u32_e64 s[2:3], 1, 0 +; GFX1064-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x24 +; GFX1064-NEXT: v_mbcnt_lo_u32_b32_e64 v0, s2, 0 +; GFX1064-NEXT: v_mbcnt_hi_u32_b32_e64 v0, s3, v0 +; GFX1064-NEXT: v_cmp_eq_u32_e32 vcc, 0, v0 +; GFX1064-NEXT: ; implicit-def: $vgpr0_vgpr1 +; GFX1064-NEXT: s_and_saveexec_b64 s[2:3], vcc +; GFX1064-NEXT: ; mask branch BB22_2 +; GFX1064-NEXT: s_cbranch_execz BB22_2 +; GFX1064-NEXT: BB22_1: +; GFX1064-NEXT: v_mov_b32_e32 v0, 5 +; GFX1064-NEXT: v_mov_b32_e32 v2, local_var64 at abs32@lo +; GFX1064-NEXT: v_mov_b32_e32 v1, 0 +; GFX1064-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1064-NEXT: s_waitcnt_vscnt null, 0x0 +; GFX1064-NEXT: ds_max_rtn_u64 v[0:1], v2, v[0:1] +; GFX1064-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1064-NEXT: buffer_gl0_inv +; GFX1064-NEXT: buffer_gl1_inv +; GFX1064-NEXT: BB22_2: +; GFX1064-NEXT: v_nop +; GFX1064-NEXT: s_or_b64 exec, exec, s[2:3] +; GFX1064-NEXT: v_readfirstlane_b32 s4, v0 +; GFX1064-NEXT: v_readfirstlane_b32 s5, v1 +; GFX1064-NEXT: v_mov_b32_e32 v1, 0 +; GFX1064-NEXT: v_cndmask_b32_e64 v0, 5, 0, vcc +; GFX1064-NEXT: s_mov_b32 s3, 0x31016000 +; GFX1064-NEXT: s_mov_b32 s2, -1 +; GFX1064-NEXT: v_cmp_gt_u64_e32 vcc, s[4:5], v[0:1] +; GFX1064-NEXT: v_cndmask_b32_e64 v0, v0, s4, vcc +; GFX1064-NEXT: v_cndmask_b32_e64 v1, 0, s5, vcc +; GFX1064-NEXT: s_waitcnt lgkmcnt(0) +; 
GFX1064-NEXT: buffer_store_dwordx2 v[0:1], off, s[0:3], 0 +; GFX1064-NEXT: s_endpgm +; +; GFX1032-LABEL: umax_i64_constant: +; GFX1032: ; %bb.0: ; %entry +; GFX1032-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x24 +; GFX1032-NEXT: v_cmp_ne_u32_e64 s2, 1, 0 +; GFX1032-NEXT: ; implicit-def: $vcc_hi +; GFX1032-NEXT: v_mbcnt_lo_u32_b32_e64 v0, s2, 0 +; GFX1032-NEXT: v_cmp_eq_u32_e32 vcc_lo, 0, v0 +; GFX1032-NEXT: ; implicit-def: $vgpr0_vgpr1 +; GFX1032-NEXT: s_and_saveexec_b32 s2, vcc_lo +; GFX1032-NEXT: ; mask branch BB22_2 +; GFX1032-NEXT: s_cbranch_execz BB22_2 +; GFX1032-NEXT: BB22_1: +; GFX1032-NEXT: v_mov_b32_e32 v0, 5 +; GFX1032-NEXT: v_mov_b32_e32 v2, local_var64 at abs32@lo +; GFX1032-NEXT: v_mov_b32_e32 v1, 0 +; GFX1032-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1032-NEXT: s_waitcnt_vscnt null, 0x0 +; GFX1032-NEXT: ds_max_rtn_u64 v[0:1], v2, v[0:1] +; GFX1032-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1032-NEXT: buffer_gl0_inv +; GFX1032-NEXT: buffer_gl1_inv +; GFX1032-NEXT: BB22_2: +; GFX1032-NEXT: v_nop +; GFX1032-NEXT: s_or_b32 exec_lo, exec_lo, s2 +; GFX1032-NEXT: v_readfirstlane_b32 s4, v0 +; GFX1032-NEXT: v_readfirstlane_b32 s5, v1 +; GFX1032-NEXT: v_mov_b32_e32 v1, 0 +; GFX1032-NEXT: v_cndmask_b32_e64 v0, 5, 0, vcc_lo +; GFX1032-NEXT: s_mov_b32 s3, 0x31016000 +; GFX1032-NEXT: s_mov_b32 s2, -1 +; GFX1032-NEXT: v_cmp_gt_u64_e32 vcc_lo, s[4:5], v[0:1] +; GFX1032-NEXT: v_cndmask_b32_e64 v0, v0, s4, vcc_lo +; GFX1032-NEXT: v_cndmask_b32_e64 v1, 0, s5, vcc_lo +; GFX1032-NEXT: s_waitcnt lgkmcnt(0) +; GFX1032-NEXT: buffer_store_dwordx2 v[0:1], off, s[0:3], 0 +; GFX1032-NEXT: s_endpgm entry: %old = atomicrmw umax i64 addrspace(3)* @local_var64, i64 5 acq_rel store i64 %old, i64 addrspace(1)* %out ret void } -; GCN-LABEL: umin_i32_varying: ; GFX8MORE32: v_readlane_b32 s[[scalar_value:[0-9]+]], v{{[0-9]+}}, 31 -; GFX8MORE64: v_readlane_b32 s[[scalar_value:[0-9]+]], v{{[0-9]+}}, 63 ; GFX8MORE: v_mov_b32{{(_e[0-9]+)?}} v[[value:[0-9]+]], s[[scalar_value]] ; GFX8MORE: 
ds_min_rtn_u32 v{{[0-9]+}}, v{{[0-9]+}}, v[[value]] define amdgpu_kernel void @umin_i32_varying(i32 addrspace(1)* %out) { +; +; +; GFX7LESS-LABEL: umin_i32_varying: +; GFX7LESS: ; %bb.0: ; %entry +; GFX7LESS-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x9 +; GFX7LESS-NEXT: v_mov_b32_e32 v1, local_var32 at abs32@lo +; GFX7LESS-NEXT: s_mov_b32 m0, -1 +; GFX7LESS-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX7LESS-NEXT: ds_min_rtn_u32 v0, v1, v0 +; GFX7LESS-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX7LESS-NEXT: buffer_wbinvl1 +; GFX7LESS-NEXT: s_mov_b32 s3, 0xf000 +; GFX7LESS-NEXT: s_mov_b32 s2, -1 +; GFX7LESS-NEXT: buffer_store_dword v0, off, s[0:3], 0 +; GFX7LESS-NEXT: s_endpgm +; +; GFX8-LABEL: umin_i32_varying: +; GFX8: ; %bb.0: ; %entry +; GFX8-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x24 +; GFX8-NEXT: v_cmp_ne_u32_e64 s[2:3], 1, 0 +; GFX8-NEXT: v_mbcnt_lo_u32_b32 v3, s2, 0 +; GFX8-NEXT: v_mbcnt_hi_u32_b32 v3, s3, v3 +; GFX8-NEXT: v_mov_b32_e32 v2, v0 +; GFX8-NEXT: s_or_saveexec_b64 s[2:3], -1 +; GFX8-NEXT: v_mov_b32_e32 v1, -1 +; GFX8-NEXT: s_mov_b64 exec, s[2:3] +; GFX8-NEXT: s_not_b64 exec, exec +; GFX8-NEXT: v_mov_b32_e32 v2, -1 +; GFX8-NEXT: s_not_b64 exec, exec +; GFX8-NEXT: s_or_saveexec_b64 s[4:5], -1 +; GFX8-NEXT: v_min_u32_dpp v2, v2, v2 row_shr:1 row_mask:0xf bank_mask:0xf +; GFX8-NEXT: s_nop 1 +; GFX8-NEXT: v_min_u32_dpp v2, v2, v2 row_shr:2 row_mask:0xf bank_mask:0xf +; GFX8-NEXT: s_nop 1 +; GFX8-NEXT: v_min_u32_dpp v2, v2, v2 row_shr:4 row_mask:0xf bank_mask:0xf +; GFX8-NEXT: s_nop 1 +; GFX8-NEXT: v_min_u32_dpp v2, v2, v2 row_shr:8 row_mask:0xf bank_mask:0xf +; GFX8-NEXT: s_nop 1 +; GFX8-NEXT: v_min_u32_dpp v2, v2, v2 row_bcast:15 row_mask:0xa bank_mask:0xf +; GFX8-NEXT: s_nop 1 +; GFX8-NEXT: v_min_u32_dpp v2, v2, v2 row_bcast:31 row_mask:0xc bank_mask:0xf +; GFX8-NEXT: v_readlane_b32 s2, v2, 63 +; GFX8-NEXT: s_nop 0 +; GFX8-NEXT: v_mov_b32_dpp v1, v2 wave_shr:1 row_mask:0xf bank_mask:0xf +; GFX8-NEXT: s_mov_b64 exec, s[4:5] +; GFX8-NEXT: v_cmp_eq_u32_e32 vcc, 0, 
v3 +; GFX8-NEXT: ; implicit-def: $vgpr0 +; GFX8-NEXT: s_and_saveexec_b64 s[4:5], vcc +; GFX8-NEXT: ; mask branch BB23_2 +; GFX8-NEXT: s_cbranch_execz BB23_2 +; GFX8-NEXT: BB23_1: +; GFX8-NEXT: v_mov_b32_e32 v0, local_var32 at abs32@lo +; GFX8-NEXT: v_mov_b32_e32 v3, s2 +; GFX8-NEXT: s_mov_b32 m0, -1 +; GFX8-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX8-NEXT: ds_min_rtn_u32 v0, v0, v3 +; GFX8-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX8-NEXT: buffer_wbinvl1_vol +; GFX8-NEXT: BB23_2: +; GFX8-NEXT: s_or_b64 exec, exec, s[4:5] +; GFX8-NEXT: v_readfirstlane_b32 s2, v0 +; GFX8-NEXT: v_mov_b32_e32 v0, v1 +; GFX8-NEXT: v_min_u32_e32 v0, s2, v0 +; GFX8-NEXT: s_mov_b32 s3, 0xf000 +; GFX8-NEXT: s_mov_b32 s2, -1 +; GFX8-NEXT: s_nop 0 +; GFX8-NEXT: s_waitcnt lgkmcnt(0) +; GFX8-NEXT: buffer_store_dword v0, off, s[0:3], 0 +; GFX8-NEXT: s_endpgm +; +; GFX9-LABEL: umin_i32_varying: +; GFX9: ; %bb.0: ; %entry +; GFX9-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x24 +; GFX9-NEXT: v_cmp_ne_u32_e64 s[2:3], 1, 0 +; GFX9-NEXT: v_mbcnt_lo_u32_b32 v3, s2, 0 +; GFX9-NEXT: v_mbcnt_hi_u32_b32 v3, s3, v3 +; GFX9-NEXT: v_mov_b32_e32 v2, v0 +; GFX9-NEXT: s_or_saveexec_b64 s[2:3], -1 +; GFX9-NEXT: v_mov_b32_e32 v1, -1 +; GFX9-NEXT: s_mov_b64 exec, s[2:3] +; GFX9-NEXT: s_not_b64 exec, exec +; GFX9-NEXT: v_mov_b32_e32 v2, -1 +; GFX9-NEXT: s_not_b64 exec, exec +; GFX9-NEXT: s_or_saveexec_b64 s[4:5], -1 +; GFX9-NEXT: v_min_u32_dpp v2, v2, v2 row_shr:1 row_mask:0xf bank_mask:0xf +; GFX9-NEXT: s_nop 1 +; GFX9-NEXT: v_min_u32_dpp v2, v2, v2 row_shr:2 row_mask:0xf bank_mask:0xf +; GFX9-NEXT: s_nop 1 +; GFX9-NEXT: v_min_u32_dpp v2, v2, v2 row_shr:4 row_mask:0xf bank_mask:0xf +; GFX9-NEXT: s_nop 1 +; GFX9-NEXT: v_min_u32_dpp v2, v2, v2 row_shr:8 row_mask:0xf bank_mask:0xf +; GFX9-NEXT: s_nop 1 +; GFX9-NEXT: v_min_u32_dpp v2, v2, v2 row_bcast:15 row_mask:0xa bank_mask:0xf +; GFX9-NEXT: s_nop 1 +; GFX9-NEXT: v_min_u32_dpp v2, v2, v2 row_bcast:31 row_mask:0xc bank_mask:0xf +; GFX9-NEXT: v_readlane_b32 s2, v2, 63 +; 
GFX9-NEXT: s_nop 0 +; GFX9-NEXT: v_mov_b32_dpp v1, v2 wave_shr:1 row_mask:0xf bank_mask:0xf +; GFX9-NEXT: s_mov_b64 exec, s[4:5] +; GFX9-NEXT: v_cmp_eq_u32_e32 vcc, 0, v3 +; GFX9-NEXT: ; implicit-def: $vgpr0 +; GFX9-NEXT: s_and_saveexec_b64 s[4:5], vcc +; GFX9-NEXT: ; mask branch BB23_2 +; GFX9-NEXT: s_cbranch_execz BB23_2 +; GFX9-NEXT: BB23_1: +; GFX9-NEXT: v_mov_b32_e32 v0, local_var32 at abs32@lo +; GFX9-NEXT: v_mov_b32_e32 v3, s2 +; GFX9-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX9-NEXT: ds_min_rtn_u32 v0, v0, v3 +; GFX9-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX9-NEXT: buffer_wbinvl1_vol +; GFX9-NEXT: BB23_2: +; GFX9-NEXT: s_or_b64 exec, exec, s[4:5] +; GFX9-NEXT: v_readfirstlane_b32 s2, v0 +; GFX9-NEXT: v_mov_b32_e32 v0, v1 +; GFX9-NEXT: v_min_u32_e32 v0, s2, v0 +; GFX9-NEXT: s_mov_b32 s3, 0xf000 +; GFX9-NEXT: s_mov_b32 s2, -1 +; GFX9-NEXT: s_nop 0 +; GFX9-NEXT: s_waitcnt lgkmcnt(0) +; GFX9-NEXT: buffer_store_dword v0, off, s[0:3], 0 +; GFX9-NEXT: s_endpgm +; +; GFX1064-LABEL: umin_i32_varying: +; GFX1064: ; %bb.0: ; %entry +; GFX1064-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x24 +; GFX1064-NEXT: v_cmp_ne_u32_e64 s[2:3], 1, 0 +; GFX1064-NEXT: v_mov_b32_e32 v2, v0 +; GFX1064-NEXT: v_mbcnt_lo_u32_b32_e64 v4, s2, 0 +; GFX1064-NEXT: v_mbcnt_hi_u32_b32_e64 v4, s3, v4 +; GFX1064-NEXT: s_or_saveexec_b64 s[2:3], -1 +; GFX1064-NEXT: v_mov_b32_e32 v1, -1 +; GFX1064-NEXT: s_mov_b64 exec, s[2:3] +; GFX1064-NEXT: s_not_b64 exec, exec +; GFX1064-NEXT: v_mov_b32_e32 v2, -1 +; GFX1064-NEXT: s_not_b64 exec, exec +; GFX1064-NEXT: s_or_saveexec_b64 s[4:5], -1 +; GFX1064-NEXT: v_min_u32_dpp v2, v2, v2 row_shr:1 row_mask:0xf bank_mask:0xf +; GFX1064-NEXT: v_min_u32_dpp v2, v2, v2 row_shr:2 row_mask:0xf bank_mask:0xf +; GFX1064-NEXT: v_min_u32_dpp v2, v2, v2 row_shr:4 row_mask:0xf bank_mask:0xf +; GFX1064-NEXT: v_min_u32_dpp v2, v2, v2 row_shr:8 row_mask:0xf bank_mask:0xf +; GFX1064-NEXT: v_mov_b32_e32 v3, v2 +; GFX1064-NEXT: v_permlanex16_b32 v3, v3, -1, -1 +; GFX1064-NEXT: 
v_min_u32_dpp v2, v3, v2 quad_perm:[0,1,2,3] row_mask:0xa bank_mask:0xf +; GFX1064-NEXT: v_readlane_b32 s2, v2, 31 +; GFX1064-NEXT: v_mov_b32_e32 v3, s2 +; GFX1064-NEXT: v_min_u32_dpp v2, v3, v2 quad_perm:[0,1,2,3] row_mask:0xc bank_mask:0xf +; GFX1064-NEXT: v_readlane_b32 s2, v2, 15 +; GFX1064-NEXT: v_mov_b32_dpp v1, v2 row_shr:1 row_mask:0xf bank_mask:0xf +; GFX1064-NEXT: v_readlane_b32 s3, v2, 31 +; GFX1064-NEXT: v_readlane_b32 s6, v2, 47 +; GFX1064-NEXT: v_writelane_b32 v1, s2, 16 +; GFX1064-NEXT: s_mov_b32 s2, -1 +; GFX1064-NEXT: v_writelane_b32 v1, s3, 32 +; GFX1064-NEXT: v_readlane_b32 s3, v2, 63 +; GFX1064-NEXT: v_writelane_b32 v1, s6, 48 +; GFX1064-NEXT: s_mov_b64 exec, s[4:5] +; GFX1064-NEXT: v_cmp_eq_u32_e32 vcc, 0, v4 +; GFX1064-NEXT: ; implicit-def: $vgpr0 +; GFX1064-NEXT: s_and_saveexec_b64 s[4:5], vcc +; GFX1064-NEXT: ; mask branch BB23_2 +; GFX1064-NEXT: s_cbranch_execz BB23_2 +; GFX1064-NEXT: BB23_1: +; GFX1064-NEXT: v_mov_b32_e32 v0, local_var32 at abs32@lo +; GFX1064-NEXT: v_mov_b32_e32 v7, s3 +; GFX1064-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1064-NEXT: s_waitcnt_vscnt null, 0x0 +; GFX1064-NEXT: ds_min_rtn_u32 v0, v0, v7 +; GFX1064-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1064-NEXT: buffer_gl0_inv +; GFX1064-NEXT: buffer_gl1_inv +; GFX1064-NEXT: BB23_2: +; GFX1064-NEXT: v_nop +; GFX1064-NEXT: s_or_b64 exec, exec, s[4:5] +; GFX1064-NEXT: v_readfirstlane_b32 s3, v0 +; GFX1064-NEXT: v_mov_b32_e32 v0, v1 +; GFX1064-NEXT: v_min_u32_e32 v0, s3, v0 +; GFX1064-NEXT: s_mov_b32 s3, 0x31016000 +; GFX1064-NEXT: s_nop 1 +; GFX1064-NEXT: s_waitcnt lgkmcnt(0) +; GFX1064-NEXT: buffer_store_dword v0, off, s[0:3], 0 +; GFX1064-NEXT: s_endpgm +; +; GFX1032-LABEL: umin_i32_varying: +; GFX1032: ; %bb.0: ; %entry +; GFX1032-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x24 +; GFX1032-NEXT: v_cmp_ne_u32_e64 s2, 1, 0 +; GFX1032-NEXT: ; implicit-def: $vcc_hi +; GFX1032-NEXT: v_mov_b32_e32 v2, v0 +; GFX1032-NEXT: v_mbcnt_lo_u32_b32_e64 v4, s2, 0 +; GFX1032-NEXT: 
s_or_saveexec_b32 s2, -1 +; GFX1032-NEXT: v_mov_b32_e32 v1, -1 +; GFX1032-NEXT: s_mov_b32 exec_lo, s2 +; GFX1032-NEXT: s_not_b32 exec_lo, exec_lo +; GFX1032-NEXT: v_mov_b32_e32 v2, -1 +; GFX1032-NEXT: s_not_b32 exec_lo, exec_lo +; GFX1032-NEXT: s_or_saveexec_b32 s4, -1 +; GFX1032-NEXT: v_min_u32_dpp v2, v2, v2 row_shr:1 row_mask:0xf bank_mask:0xf +; GFX1032-NEXT: s_mov_b32 s2, -1 +; GFX1032-NEXT: v_min_u32_dpp v2, v2, v2 row_shr:2 row_mask:0xf bank_mask:0xf +; GFX1032-NEXT: v_min_u32_dpp v2, v2, v2 row_shr:4 row_mask:0xf bank_mask:0xf +; GFX1032-NEXT: v_min_u32_dpp v2, v2, v2 row_shr:8 row_mask:0xf bank_mask:0xf +; GFX1032-NEXT: v_mov_b32_e32 v3, v2 +; GFX1032-NEXT: v_permlanex16_b32 v3, v3, -1, -1 +; GFX1032-NEXT: v_min_u32_dpp v2, v3, v2 quad_perm:[0,1,2,3] row_mask:0xa bank_mask:0xf +; GFX1032-NEXT: v_readlane_b32 s3, v2, 31 +; GFX1032-NEXT: v_mov_b32_dpp v1, v2 row_shr:1 row_mask:0xf bank_mask:0xf +; GFX1032-NEXT: v_readlane_b32 s5, v2, 15 +; GFX1032-NEXT: v_writelane_b32 v1, s5, 16 +; GFX1032-NEXT: s_mov_b32 exec_lo, s4 +; GFX1032-NEXT: v_cmp_eq_u32_e32 vcc_lo, 0, v4 +; GFX1032-NEXT: ; implicit-def: $vgpr0 +; GFX1032-NEXT: s_and_saveexec_b32 s4, vcc_lo +; GFX1032-NEXT: ; mask branch BB23_2 +; GFX1032-NEXT: s_cbranch_execz BB23_2 +; GFX1032-NEXT: BB23_1: +; GFX1032-NEXT: v_mov_b32_e32 v0, local_var32 at abs32@lo +; GFX1032-NEXT: v_mov_b32_e32 v7, s3 +; GFX1032-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1032-NEXT: s_waitcnt_vscnt null, 0x0 +; GFX1032-NEXT: ds_min_rtn_u32 v0, v0, v7 +; GFX1032-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1032-NEXT: buffer_gl0_inv +; GFX1032-NEXT: buffer_gl1_inv +; GFX1032-NEXT: BB23_2: +; GFX1032-NEXT: v_nop +; GFX1032-NEXT: s_or_b32 exec_lo, exec_lo, s4 +; GFX1032-NEXT: v_readfirstlane_b32 s3, v0 +; GFX1032-NEXT: v_mov_b32_e32 v0, v1 +; GFX1032-NEXT: v_min_u32_e32 v0, s3, v0 +; GFX1032-NEXT: s_mov_b32 s3, 0x31016000 +; GFX1032-NEXT: s_nop 1 +; GFX1032-NEXT: s_waitcnt lgkmcnt(0) +; GFX1032-NEXT: buffer_store_dword v0, off, s[0:3], 0 
+; GFX1032-NEXT: s_endpgm entry: %lane = call i32 @llvm.amdgcn.workitem.id.x() %old = atomicrmw umin i32 addrspace(3)* @local_var32, i32 %lane acq_rel @@ -497,16 +5005,192 @@ entry: ret void } -; GCN-LABEL: umin_i64_constant: -; GCN32: v_cmp_ne_u32_e64 s[[exec_lo:[0-9]+]], 1, 0 -; GCN64: v_cmp_ne_u32_e64 s{{\[}}[[exec_lo:[0-9]+]]:[[exec_hi:[0-9]+]]{{\]}}, 1, 0 -; GCN: v_mbcnt_lo_u32_b32{{(_e[0-9]+)?}} v[[mbcnt:[0-9]+]], s[[exec_lo]], 0 -; GCN64: v_mbcnt_hi_u32_b32{{(_e[0-9]+)?}} v[[mbcnt]], s[[exec_hi]], v[[mbcnt]] -; GCN: v_cmp_eq_u32{{(_e[0-9]+)?}} vcc{{(_lo)?}}, 0, v[[mbcnt]] -; GCN: v_mov_b32{{(_e[0-9]+)?}} v[[value_lo:[0-9]+]], 5 -; GCN: v_mov_b32{{(_e[0-9]+)?}} v[[value_hi:[0-9]+]], 0 -; GCN: ds_min_rtn_u64 v{{\[}}{{[0-9]+}}:{{[0-9]+}}{{\]}}, v{{[0-9]+}}, v{{\[}}[[value_lo]]:[[value_hi]]{{\]}} define amdgpu_kernel void @umin_i64_constant(i64 addrspace(1)* %out) { +; +; +; GFX7LESS-LABEL: umin_i64_constant: +; GFX7LESS: ; %bb.0: ; %entry +; GFX7LESS-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x9 +; GFX7LESS-NEXT: v_cmp_ne_u32_e64 s[2:3], 1, 0 +; GFX7LESS-NEXT: v_mbcnt_lo_u32_b32_e64 v0, s2, 0 +; GFX7LESS-NEXT: v_mbcnt_hi_u32_b32_e32 v0, s3, v0 +; GFX7LESS-NEXT: v_cmp_eq_u32_e32 vcc, 0, v0 +; GFX7LESS-NEXT: ; implicit-def: $vgpr0_vgpr1 +; GFX7LESS-NEXT: s_and_saveexec_b64 s[2:3], vcc +; GFX7LESS-NEXT: ; mask branch BB24_2 +; GFX7LESS-NEXT: s_cbranch_execz BB24_2 +; GFX7LESS-NEXT: BB24_1: +; GFX7LESS-NEXT: v_mov_b32_e32 v2, local_var64 at abs32@lo +; GFX7LESS-NEXT: v_mov_b32_e32 v0, 5 +; GFX7LESS-NEXT: v_mov_b32_e32 v1, 0 +; GFX7LESS-NEXT: s_mov_b32 m0, -1 +; GFX7LESS-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX7LESS-NEXT: ds_min_rtn_u64 v[0:1], v2, v[0:1] +; GFX7LESS-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX7LESS-NEXT: buffer_wbinvl1 +; GFX7LESS-NEXT: BB24_2: +; GFX7LESS-NEXT: s_or_b64 exec, exec, s[2:3] +; GFX7LESS-NEXT: v_readfirstlane_b32 s4, v0 +; GFX7LESS-NEXT: v_readfirstlane_b32 s5, v1 +; GFX7LESS-NEXT: s_mov_b32 s2, -1 +; GFX7LESS-NEXT: v_cndmask_b32_e64 v1, 
0, -1, vcc +; GFX7LESS-NEXT: v_cndmask_b32_e64 v0, 5, -1, vcc +; GFX7LESS-NEXT: v_mov_b32_e32 v2, s5 +; GFX7LESS-NEXT: v_cmp_lt_u64_e32 vcc, s[4:5], v[0:1] +; GFX7LESS-NEXT: v_cndmask_b32_e32 v1, v1, v2, vcc +; GFX7LESS-NEXT: v_mov_b32_e32 v2, s4 +; GFX7LESS-NEXT: v_cndmask_b32_e32 v0, v0, v2, vcc +; GFX7LESS-NEXT: s_mov_b32 s3, 0xf000 +; GFX7LESS-NEXT: s_waitcnt lgkmcnt(0) +; GFX7LESS-NEXT: buffer_store_dwordx2 v[0:1], off, s[0:3], 0 +; GFX7LESS-NEXT: s_endpgm +; +; GFX8-LABEL: umin_i64_constant: +; GFX8: ; %bb.0: ; %entry +; GFX8-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x24 +; GFX8-NEXT: v_cmp_ne_u32_e64 s[2:3], 1, 0 +; GFX8-NEXT: v_mbcnt_lo_u32_b32 v0, s2, 0 +; GFX8-NEXT: v_mbcnt_hi_u32_b32 v0, s3, v0 +; GFX8-NEXT: v_cmp_eq_u32_e32 vcc, 0, v0 +; GFX8-NEXT: ; implicit-def: $vgpr0_vgpr1 +; GFX8-NEXT: s_and_saveexec_b64 s[2:3], vcc +; GFX8-NEXT: ; mask branch BB24_2 +; GFX8-NEXT: s_cbranch_execz BB24_2 +; GFX8-NEXT: BB24_1: +; GFX8-NEXT: v_mov_b32_e32 v0, 5 +; GFX8-NEXT: v_mov_b32_e32 v2, local_var64 at abs32@lo +; GFX8-NEXT: v_mov_b32_e32 v1, 0 +; GFX8-NEXT: s_mov_b32 m0, -1 +; GFX8-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX8-NEXT: ds_min_rtn_u64 v[0:1], v2, v[0:1] +; GFX8-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX8-NEXT: buffer_wbinvl1_vol +; GFX8-NEXT: BB24_2: +; GFX8-NEXT: s_or_b64 exec, exec, s[2:3] +; GFX8-NEXT: v_readfirstlane_b32 s5, v1 +; GFX8-NEXT: v_readfirstlane_b32 s4, v0 +; GFX8-NEXT: v_cndmask_b32_e64 v1, 0, -1, vcc +; GFX8-NEXT: v_cndmask_b32_e64 v0, 5, -1, vcc +; GFX8-NEXT: v_cmp_lt_u64_e32 vcc, s[4:5], v[0:1] +; GFX8-NEXT: v_mov_b32_e32 v2, s5 +; GFX8-NEXT: v_cndmask_b32_e32 v1, v1, v2, vcc +; GFX8-NEXT: v_mov_b32_e32 v2, s4 +; GFX8-NEXT: s_mov_b32 s2, -1 +; GFX8-NEXT: v_cndmask_b32_e32 v0, v0, v2, vcc +; GFX8-NEXT: s_mov_b32 s3, 0xf000 +; GFX8-NEXT: s_waitcnt lgkmcnt(0) +; GFX8-NEXT: buffer_store_dwordx2 v[0:1], off, s[0:3], 0 +; GFX8-NEXT: s_endpgm +; +; GFX9-LABEL: umin_i64_constant: +; GFX9: ; %bb.0: ; %entry +; GFX9-NEXT: s_load_dwordx2 s[0:1], 
s[0:1], 0x24 +; GFX9-NEXT: v_cmp_ne_u32_e64 s[2:3], 1, 0 +; GFX9-NEXT: v_mbcnt_lo_u32_b32 v0, s2, 0 +; GFX9-NEXT: v_mbcnt_hi_u32_b32 v0, s3, v0 +; GFX9-NEXT: v_cmp_eq_u32_e32 vcc, 0, v0 +; GFX9-NEXT: ; implicit-def: $vgpr0_vgpr1 +; GFX9-NEXT: s_and_saveexec_b64 s[2:3], vcc +; GFX9-NEXT: ; mask branch BB24_2 +; GFX9-NEXT: s_cbranch_execz BB24_2 +; GFX9-NEXT: BB24_1: +; GFX9-NEXT: v_mov_b32_e32 v0, 5 +; GFX9-NEXT: v_mov_b32_e32 v2, local_var64 at abs32@lo +; GFX9-NEXT: v_mov_b32_e32 v1, 0 +; GFX9-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX9-NEXT: ds_min_rtn_u64 v[0:1], v2, v[0:1] +; GFX9-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX9-NEXT: buffer_wbinvl1_vol +; GFX9-NEXT: BB24_2: +; GFX9-NEXT: s_or_b64 exec, exec, s[2:3] +; GFX9-NEXT: v_readfirstlane_b32 s5, v1 +; GFX9-NEXT: v_readfirstlane_b32 s4, v0 +; GFX9-NEXT: v_cndmask_b32_e64 v1, 0, -1, vcc +; GFX9-NEXT: v_cndmask_b32_e64 v0, 5, -1, vcc +; GFX9-NEXT: v_cmp_lt_u64_e32 vcc, s[4:5], v[0:1] +; GFX9-NEXT: v_mov_b32_e32 v2, s5 +; GFX9-NEXT: v_cndmask_b32_e32 v1, v1, v2, vcc +; GFX9-NEXT: v_mov_b32_e32 v2, s4 +; GFX9-NEXT: s_mov_b32 s2, -1 +; GFX9-NEXT: v_cndmask_b32_e32 v0, v0, v2, vcc +; GFX9-NEXT: s_mov_b32 s3, 0xf000 +; GFX9-NEXT: s_waitcnt lgkmcnt(0) +; GFX9-NEXT: buffer_store_dwordx2 v[0:1], off, s[0:3], 0 +; GFX9-NEXT: s_endpgm +; +; GFX1064-LABEL: umin_i64_constant: +; GFX1064: ; %bb.0: ; %entry +; GFX1064-NEXT: v_cmp_ne_u32_e64 s[2:3], 1, 0 +; GFX1064-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x24 +; GFX1064-NEXT: v_mbcnt_lo_u32_b32_e64 v0, s2, 0 +; GFX1064-NEXT: v_mbcnt_hi_u32_b32_e64 v0, s3, v0 +; GFX1064-NEXT: v_cmp_eq_u32_e32 vcc, 0, v0 +; GFX1064-NEXT: ; implicit-def: $vgpr0_vgpr1 +; GFX1064-NEXT: s_and_saveexec_b64 s[2:3], vcc +; GFX1064-NEXT: ; mask branch BB24_2 +; GFX1064-NEXT: s_cbranch_execz BB24_2 +; GFX1064-NEXT: BB24_1: +; GFX1064-NEXT: v_mov_b32_e32 v0, 5 +; GFX1064-NEXT: v_mov_b32_e32 v2, local_var64 at abs32@lo +; GFX1064-NEXT: v_mov_b32_e32 v1, 0 +; GFX1064-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; 
GFX1064-NEXT: s_waitcnt_vscnt null, 0x0 +; GFX1064-NEXT: ds_min_rtn_u64 v[0:1], v2, v[0:1] +; GFX1064-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1064-NEXT: buffer_gl0_inv +; GFX1064-NEXT: buffer_gl1_inv +; GFX1064-NEXT: BB24_2: +; GFX1064-NEXT: v_nop +; GFX1064-NEXT: s_or_b64 exec, exec, s[2:3] +; GFX1064-NEXT: v_readfirstlane_b32 s4, v0 +; GFX1064-NEXT: v_readfirstlane_b32 s5, v1 +; GFX1064-NEXT: v_cndmask_b32_e64 v1, 0, -1, vcc +; GFX1064-NEXT: v_cndmask_b32_e64 v0, 5, -1, vcc +; GFX1064-NEXT: s_mov_b32 s2, -1 +; GFX1064-NEXT: s_mov_b32 s3, 0x31016000 +; GFX1064-NEXT: v_cmp_lt_u64_e32 vcc, s[4:5], v[0:1] +; GFX1064-NEXT: v_cndmask_b32_e64 v1, v1, s5, vcc +; GFX1064-NEXT: v_cndmask_b32_e64 v0, v0, s4, vcc +; GFX1064-NEXT: s_waitcnt lgkmcnt(0) +; GFX1064-NEXT: buffer_store_dwordx2 v[0:1], off, s[0:3], 0 +; GFX1064-NEXT: s_endpgm +; +; GFX1032-LABEL: umin_i64_constant: +; GFX1032: ; %bb.0: ; %entry +; GFX1032-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x24 +; GFX1032-NEXT: v_cmp_ne_u32_e64 s2, 1, 0 +; GFX1032-NEXT: ; implicit-def: $vcc_hi +; GFX1032-NEXT: v_mbcnt_lo_u32_b32_e64 v0, s2, 0 +; GFX1032-NEXT: v_cmp_eq_u32_e32 vcc_lo, 0, v0 +; GFX1032-NEXT: ; implicit-def: $vgpr0_vgpr1 +; GFX1032-NEXT: s_and_saveexec_b32 s2, vcc_lo +; GFX1032-NEXT: ; mask branch BB24_2 +; GFX1032-NEXT: s_cbranch_execz BB24_2 +; GFX1032-NEXT: BB24_1: +; GFX1032-NEXT: v_mov_b32_e32 v0, 5 +; GFX1032-NEXT: v_mov_b32_e32 v2, local_var64 at abs32@lo +; GFX1032-NEXT: v_mov_b32_e32 v1, 0 +; GFX1032-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1032-NEXT: s_waitcnt_vscnt null, 0x0 +; GFX1032-NEXT: ds_min_rtn_u64 v[0:1], v2, v[0:1] +; GFX1032-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0) +; GFX1032-NEXT: buffer_gl0_inv +; GFX1032-NEXT: buffer_gl1_inv +; GFX1032-NEXT: BB24_2: +; GFX1032-NEXT: v_nop +; GFX1032-NEXT: s_or_b32 exec_lo, exec_lo, s2 +; GFX1032-NEXT: v_readfirstlane_b32 s4, v0 +; GFX1032-NEXT: v_readfirstlane_b32 s5, v1 +; GFX1032-NEXT: v_cndmask_b32_e64 v1, 0, -1, vcc_lo +; GFX1032-NEXT: v_cndmask_b32_e64 
v0, 5, -1, vcc_lo +; GFX1032-NEXT: s_mov_b32 s2, -1 +; GFX1032-NEXT: s_mov_b32 s3, 0x31016000 +; GFX1032-NEXT: v_cmp_lt_u64_e32 vcc_lo, s[4:5], v[0:1] +; GFX1032-NEXT: v_cndmask_b32_e64 v1, v1, s5, vcc_lo +; GFX1032-NEXT: v_cndmask_b32_e64 v0, v0, s4, vcc_lo +; GFX1032-NEXT: s_waitcnt lgkmcnt(0) +; GFX1032-NEXT: buffer_store_dwordx2 v[0:1], off, s[0:3], 0 +; GFX1032-NEXT: s_endpgm entry: %old = atomicrmw umin i64 addrspace(3)* @local_var64, i64 5 acq_rel store i64 %old, i64 addrspace(1)* %out Added: llvm/trunk/test/CodeGen/AMDGPU/dpp_combine.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/dpp_combine.ll?rev=374604&view=auto ============================================================================== --- llvm/trunk/test/CodeGen/AMDGPU/dpp_combine.ll (added) +++ llvm/trunk/test/CodeGen/AMDGPU/dpp_combine.ll Fri Oct 11 15:03:36 2019 @@ -0,0 +1,53 @@ +; RUN: llc -march=amdgcn -mcpu=gfx900 -verify-machineinstrs < %s | FileCheck %s -check-prefix=GCN +; RUN: llc -march=amdgcn -mcpu=gfx1010 -verify-machineinstrs < %s | FileCheck %s -check-prefix=GCN + +; GCN-LABEL: {{^}}dpp_add: +; GCN: global_load_dword [[V:v[0-9]+]], +; GCN: v_add_{{(nc_)?}}u32_dpp [[V]], [[V]], [[V]] quad_perm:[1,0,0,0] row_mask:0xf bank_mask:0xf bound_ctrl:0{{$}} +define amdgpu_kernel void @dpp_add(i32 addrspace(1)* %arg) { + %id = tail call i32 @llvm.amdgcn.workitem.id.x() + %gep = getelementptr inbounds i32, i32 addrspace(1)* %arg, i32 %id + %load = load i32, i32 addrspace(1)* %gep + %tmp0 = call i32 @llvm.amdgcn.update.dpp.i32(i32 %load, i32 %load, i32 1, i32 15, i32 15, i1 1) #0 + %add = add i32 %tmp0, %load + store i32 %add, i32 addrspace(1)* %gep + ret void +} + +; GCN-LABEL: {{^}}dpp_ceil: +; GCN: global_load_dword [[V:v[0-9]+]], +; GCN: v_ceil_f32_dpp [[V]], [[V]] quad_perm:[1,0,0,0] row_mask:0xf bank_mask:0xf bound_ctrl:0{{$}} +define amdgpu_kernel void @dpp_ceil(i32 addrspace(1)* %arg) { + %id = tail call i32 @llvm.amdgcn.workitem.id.x() + %gep = getelementptr 
inbounds i32, i32 addrspace(1)* %arg, i32 %id
+  %load = load i32, i32 addrspace(1)* %gep
+  %tmp0 = call i32 @llvm.amdgcn.update.dpp.i32(i32 %load, i32 %load, i32 1, i32 15, i32 15, i1 1) #0
+  %tmp1 = bitcast i32 %tmp0 to float
+  %round = tail call float @llvm.ceil.f32(float %tmp1)
+  %tmp2 = bitcast float %round to i32
+  store i32 %tmp2, i32 addrspace(1)* %gep
+  ret void
+}
+
+; GCN-LABEL: {{^}}dpp_fadd:
+; GCN: global_load_dword [[V:v[0-9]+]],
+; GCN: v_add_f32_dpp [[V]], [[V]], [[V]] quad_perm:[1,0,0,0] row_mask:0xf bank_mask:0xf bound_ctrl:0{{$}}
+define amdgpu_kernel void @dpp_fadd(i32 addrspace(1)* %arg) {
+  %id = tail call i32 @llvm.amdgcn.workitem.id.x()
+  %gep = getelementptr inbounds i32, i32 addrspace(1)* %arg, i32 %id
+  %load = load i32, i32 addrspace(1)* %gep
+  %tmp0 = call i32 @llvm.amdgcn.update.dpp.i32(i32 %load, i32 %load, i32 1, i32 15, i32 15, i1 1) #0
+  %tmp1 = bitcast i32 %tmp0 to float
+  %t = bitcast i32 %load to float
+  %add = fadd float %tmp1, %t
+  %tmp2 = bitcast float %add to i32
+  store i32 %tmp2, i32 addrspace(1)* %gep
+  ret void
+}
+
+
+declare i32 @llvm.amdgcn.workitem.id.x()
+declare i32 @llvm.amdgcn.update.dpp.i32(i32, i32, i32, i32, i32, i1) #0
+declare float @llvm.ceil.f32(float)
+
+attributes #0 = { nounwind readnone convergent }

From llvm-commits at lists.llvm.org Fri Oct 11 15:01:19 2019
From: llvm-commits at lists.llvm.org (Julian Lettner via Phabricator via llvm-commits)
Date: Fri, 11 Oct 2019 22:01:19 +0000 (UTC)
Subject: [PATCH] D68834: [lit] Change regex filter to ignore case
In-Reply-To:
References:
Message-ID:

This revision was automatically updated to reflect the committed changes.
Closed by commit rGac36dafb6921: [lit] Change regex filter to ignore case (authored by yln).
Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68834/new/

https://reviews.llvm.org/D68834

Files:
  llvm/utils/lit/lit/cl_arguments.py
  llvm/utils/lit/lit/main.py
  llvm/utils/lit/tests/selecting.py

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68834.224691.patch
Type: text/x-patch
Size: 3416 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Fri Oct 11 15:01:20 2019
From: llvm-commits at lists.llvm.org (Julian Lettner via Phabricator via llvm-commits)
Date: Fri, 11 Oct 2019 22:01:20 +0000 (UTC)
Subject: [PATCH] D68836: [lit] Small cleanups in main.py
In-Reply-To:
References:
Message-ID:

This revision was automatically updated to reflect the committed changes.
Closed by commit rGafa8903ad6de: [lit] Small cleanups in main.py (authored by yln).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68836/new/

https://reviews.llvm.org/D68836

Files:
  llvm/utils/lit/lit/main.py

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68836.224692.patch
Type: text/x-patch
Size: 3224 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Fri Oct 11 15:10:37 2019
From: llvm-commits at lists.llvm.org (Adrian Prantl via Phabricator via llvm-commits)
Date: Fri, 11 Oct 2019 22:10:37 +0000 (UTC)
Subject: [PATCH] D68816: [NFC] Replace a linked list in LiveDebugVariables pass with a DenseMap
In-Reply-To:
References:
Message-ID: <230a8f29d24c3c97d155fc6312ff3bdd@localhost.localdomain>

aprantl accepted this revision.
aprantl added a comment.
This revision is now accepted and ready to land.

This is good to land with whatever conclusions we draw from the SmallDenseMap experiment. Thanks!
CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68816/new/

https://reviews.llvm.org/D68816

From llvm-commits at lists.llvm.org Fri Oct 11 15:10:39 2019
From: llvm-commits at lists.llvm.org (Adrian Prantl via Phabricator via llvm-commits)
Date: Fri, 11 Oct 2019 22:10:39 +0000 (UTC)
Subject: [PATCH] D67768: [DebugInfo] Add interface for pre-calculating the size of emitted DWARF
In-Reply-To:
References:
Message-ID:

aprantl accepted this revision.
aprantl added a comment.
This revision is now accepted and ready to land.

lgtm with inline comments addressed

================
Comment at: llvm/lib/CodeGen/AsmPrinter/DwarfExpression.h:337
+ SmallString<32> Bytes;
+ SmallVector Comments;
+ BufferByteStreamer BS;
----------------
A SmallVector doesn't seem to make that much sense, since strings are always heap-allocated. Either this should be a vector or a SmallVector.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D67768/new/

https://reviews.llvm.org/D67768

From llvm-commits at lists.llvm.org Fri Oct 11 15:10:39 2019
From: llvm-commits at lists.llvm.org (Petr Hosek via Phabricator via llvm-commits)
Date: Fri, 11 Oct 2019 22:10:39 +0000 (UTC)
Subject: [PATCH] D68833: [CMake] Re-order runtimes in the order of dependencies
In-Reply-To:
References:
Message-ID:

phosek added a comment.

In D68833#1706544, @beanz wrote:

> In D68833#1706315, @ldionne wrote:
>
> > Yes, precisely. That's also the currently preferred way of building libc++: https://libcxx.llvm.org/docs/BuildingLibcxx.html
>
>
> That documentation is more than a bit lacking, as is much of the LLVM documentation.
>
> > Not everybody ships libc++/libc++abi as part of the toolchain, and for those, building with whatever `CMAKE_CXX_COMPILER` they specify is really the right thing to do.
>
> While this is true, in many instances libc++ even when libc++ isn't shipped with a toolchain it is locked to one. Darwin is a prime example of this. On Darwin libc++ is shipped as part of the OS, but that cycle is closely coordinated with the toolchain updates and the two are usually kept in sync.
>
> > Don't get me wrong, I'm 100% on board that there's value in having this runtime build, however let's not pretend that it's the only correct way to build libc++.
>
> I would argue if you don't ship libc++ as part of the toolchain you shouldn't build it as part of the toolchain either. In which case the standalone build configuration is the correct way to build it. My intention isn't to say building libcxx as a runtime is the only correct way to build libc++, my intention is to state that *if* you are building libc++ with the toolchain, building it as a runtime is the only correct way to build it.
>
> > Say I need to generate a file based on the properties of a target. I'll need to call `get_target_property` on a target that hasn't been defined yet, and there's no way around that because `file(GENERATE)` does not expand generator expressions.
>
> Is this something you need to do? If so I'd question higher-level decisions about how the build is structured.
>
> > Are you thinking about this?
> >
> >   set(libname "$,$,${lib}>")
> >   list(APPEND link_libraries "${CMAKE_LINK_LIBRARY_FLAG}${libname}")
> >
> >
> > That is clever, I had not thought about it.
> >
> > If the above workaround works, I don't care about this patch that much. I still think we need to clarify the status of the Runtimes build and document it, unless that's already done and I've missed it (in which case please point it to me).
>
> That is a much better approach. CMake 3.11 is when the `TARGET_EXISTS` generator expression was added, although the documentation wasn't updated until CMake 3.15, so this change would require a CMake version update, which I don't think is unreasonable.

I've tested D68880 which requires newer CMake but it does seem to be working. Shall we start the discussion about bumping the minimum CMake version requirement on llvm-dev?
Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68833/new/

https://reviews.llvm.org/D68833

From llvm-commits at lists.llvm.org Fri Oct 11 15:10:40 2019
From: llvm-commits at lists.llvm.org (Stanislav Mekhanoshin via Phabricator via llvm-commits)
Date: Fri, 11 Oct 2019 22:10:40 +0000 (UTC)
Subject: [PATCH] D68894: AMDGPU: Increase vcc liveness scan threshold
In-Reply-To:
References:
Message-ID: <23dbe62ebda769adf718d53a838c0efb@localhost.localdomain>

rampitec accepted this revision.
rampitec added a comment.
This revision is now accepted and ready to land.

LGTM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68894/new/

https://reviews.llvm.org/D68894

From llvm-commits at lists.llvm.org Fri Oct 11 15:20:01 2019
From: llvm-commits at lists.llvm.org (Sam Clegg via Phabricator via llvm-commits)
Date: Fri, 11 Oct 2019 22:20:01 +0000 (UTC)
Subject: [PATCH] D68751: [lld][WebAssembly] Where possible handle signature mismatches via an adaptor function
In-Reply-To:
References:
Message-ID: <5ab0a97990d93d076eae3640a34a5e5c@localhost.localdomain>

sbc100 added a comment.

friendly ping..

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68751/new/

https://reviews.llvm.org/D68751

From llvm-commits at lists.llvm.org Fri Oct 11 15:20:01 2019
From: llvm-commits at lists.llvm.org (Sam Clegg via Phabricator via llvm-commits)
Date: Fri, 11 Oct 2019 22:20:01 +0000 (UTC)
Subject: [PATCH] D68751: [lld][WebAssembly] Where possible handle signature mismatches via an adaptor function
In-Reply-To:
References:
Message-ID:

sbc100 updated this revision to Diff 224695.
sbc100 added a comment.

- rebase

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68751/new/

https://reviews.llvm.org/D68751

Files:
  lld/test/wasm/lto/signature-mismatch.ll
  lld/wasm/InputChunks.h
  lld/wasm/MarkLive.cpp
  lld/wasm/SymbolTable.cpp
  lld/wasm/SymbolTable.h
  lld/wasm/Writer.cpp

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68751.224695.patch
Type: text/x-patch
Size: 11032 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Fri Oct 11 15:20:01 2019
From: llvm-commits at lists.llvm.org (Adrian Prantl via Phabricator via llvm-commits)
Date: Fri, 11 Oct 2019 22:20:01 +0000 (UTC)
Subject: [PATCH] D67492: [DebugInfo] Add a DW_OP_LLVM_entry_value operation
In-Reply-To:
References:
Message-ID:

aprantl accepted this revision.
aprantl added inline comments.
This revision is now accepted and ready to land.

================
Comment at: llvm/lib/IR/Verifier.cpp:5024
+
+ AssertDI(!E->isEntryValue(), "Entry values are not allowed in LLVM IR", &I);
+}
----------------
"only allowed in MIR" is perhaps more helpful.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D67492/new/

https://reviews.llvm.org/D67492

From llvm-commits at lists.llvm.org Fri Oct 11 15:20:02 2019
From: llvm-commits at lists.llvm.org (Puyan Lotfi via Phabricator via llvm-commits)
Date: Fri, 11 Oct 2019 22:20:02 +0000 (UTC)
Subject: [PATCH] D63978: Clang Interface Stubs merger plumbing for Driver
In-Reply-To:
References:
Message-ID: <68c262f4b18a948579f7085ef5240a20@localhost.localdomain>

plotfi added a comment.

In D63978#1706502, @JamesNagurne wrote:

> In D63978#1706448, @plotfi wrote:
>
> > In D63978#1706420, @JamesNagurne wrote:
> >
> > > Our team maintains a downstream embedded ARM clang distribution and some tests from this commit have begun to fail for us.
> > > For a number of these tests, there was a REQUIRES: x86-registered-target at the top, which has now been removed. Specifically, externstatic.c, merge-conflict-test.c, object-float.c, and object.c are failing.
> > >
> > > object* tests seem to be based on object.cpp, which had the REQUIRES line, and externstatic.c also had that line prior to the change.
> > > I see that @compnerd suggested the removal, but were you certain that these tests would work on clang toolchains for which x86 is not a registered target?
> > >
> > > For a failure example, here the output of lit for our toolchain. If you can make sense of it, I'd appreciate input on how we can fix or work around it:
> > >
> > >   > /arm-llvm/Release/llvm/bin/clang -c -o - -emit-interface-stubs /llvm-project/clang/test/InterfaceStubs/object.c | /arm-llvm/Release/llvm/bin/FileCheck -check-prefix=CHECK-TAPI /llvm-project/clang/test/InterfaceStubs/object.c
> > >   /llvm-project/clang/test/InterfaceStubs/object.c:5:16: error: CHECK-TAPI: expected string not found in input
> > >   // CHECK-TAPI: data: { Type: Object, Size: 4 }
> > >                  ^
> > >   :1:1: note: scanning from here
> > >   --- !experimental-ifs-v1
> > >   ^
> > >
> > >
> > > And when run without FileCheck, our raw output:
> > >
> > >   > /arm-llvm/Release/llvm/bin/clang -c -o - -emit-interface-stubs /llvm-project/clang/test/InterfaceStubs/object.c
> > >   --- !experimental-ifs-v1
> > >   IfsVersion: 1.0
> > >   Triple: thumbv7em-ti-none-eabihf
> > >   ObjectFileFormat: ELF
> > >   Symbols:
> > >   ...
> > >
> >
> >
> > I am sorry for this James. I can add back the REQUIRES lines for now and coordinate with you on making sure your downstream bots are not affected again if the REQUIRES are removed again.
> > By chance are your bots accessible publicly?
>
>
> Sadly, they are not. It's on our list of things to investigate, but we don't have the resources to do such a thing quite yet.
> I'm looking into the 'arm7*' buildbots to see if they are built similar to ours so I am not leaving you entirely without something to look at. However, if it seems to be common knowledge to always include an X86 target, I think I can talk to my team and change up what we do.
>
> These buildbots seem to also do LLVM_TARGETS_TO_BUILD=ARM, and then set the default target triple to a non-x86 triple (the host's)
>
> That could point towards us being in error here. I'll investigate things a little further, and update when I get the chance.
> To be clear: this feature should work for any ELF target, correct?

Yes, it is designed to work for all ELF targets but at the moment it is still in an early state. I am on the llvm IRC as zer0_ BTW

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D63978/new/

https://reviews.llvm.org/D63978

From llvm-commits at lists.llvm.org Fri Oct 11 15:20:02 2019
From: llvm-commits at lists.llvm.org (Adrian Prantl via Phabricator via llvm-commits)
Date: Fri, 11 Oct 2019 22:20:02 +0000 (UTC)
Subject: [PATCH] D68869: [DebugInfo] Fix truncation of call site immediates
In-Reply-To:
References:
Message-ID: <1ccb2d0ff1400501b6f3f99551b11115@localhost.localdomain>

aprantl accepted this revision.
aprantl added a comment.
This revision is now accepted and ready to land.

LGTM with comment addressed.

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68869/new/

https://reviews.llvm.org/D68869

From llvm-commits at lists.llvm.org Fri Oct 11 15:20:02 2019
From: llvm-commits at lists.llvm.org (Adrian Prantl via Phabricator via llvm-commits)
Date: Fri, 11 Oct 2019 22:20:02 +0000 (UTC)
Subject: [PATCH] D68869: [DebugInfo] Fix truncation of call site immediates
In-Reply-To:
References:
Message-ID: <378af5df116882bc1ae624a1d679a501@localhost.localdomain>

aprantl added inline comments.
================ Comment at: llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp:623 if (ParamValue->first.isImm()) { - unsigned Val = ParamValue->first.getImm(); + auto Val = ParamValue->first.getImm(); DbgValueLoc DbgLocVal(ParamValue->second, Val); ---------------- We should only use auto where the type is obvious from the context. Let's use uint64_t here. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68869/new/ https://reviews.llvm.org/D68869 From llvm-commits at lists.llvm.org Fri Oct 11 15:28:04 2019 From: llvm-commits at lists.llvm.org (Stanislav Mekhanoshin via llvm-commits) Date: Fri, 11 Oct 2019 22:28:04 -0000 Subject: [llvm] r374607 - [AMDGPU] Use GCN prefix in dpp_combine.mir. NFC. Message-ID: <20191011222804.7FCC886855@lists.llvm.org> Author: rampitec Date: Fri Oct 11 15:28:04 2019 New Revision: 374607 URL: http://llvm.org/viewvc/llvm-project?rev=374607&view=rev Log: [AMDGPU] Use GCN prefix in dpp_combine.mir. NFC. Modified: llvm/trunk/test/CodeGen/AMDGPU/dpp_combine.mir Modified: llvm/trunk/test/CodeGen/AMDGPU/dpp_combine.mir URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AMDGPU/dpp_combine.mir?rev=374607&r1=374606&r2=374607&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/AMDGPU/dpp_combine.mir (original) +++ llvm/trunk/test/CodeGen/AMDGPU/dpp_combine.mir Fri Oct 11 15:28:04 2019 @@ -1,20 +1,20 @@ -# RUN: llc -march=amdgcn -mcpu=gfx900 -run-pass=gcn-dpp-combine -verify-machineinstrs -o - %s | FileCheck %s +# RUN: llc -march=amdgcn -mcpu=gfx900 -run-pass=gcn-dpp-combine -verify-machineinstrs -o - %s | FileCheck %s -check-prefix=GCN --- # old is undefined: only combine when masks are fully enabled and # bound_ctrl:0 is set, otherwise the result of DPP VALU op can be undefined. 
-# CHECK-LABEL: name: old_is_undef -# CHECK: %2:vgpr_32 = IMPLICIT_DEF +# GCN-LABEL: name: old_is_undef +# GCN: %2:vgpr_32 = IMPLICIT_DEF # VOP2: -# CHECK: %4:vgpr_32 = V_ADD_U32_dpp %2, %0, %1, 1, 15, 15, 1, implicit $exec -# CHECK: %6:vgpr_32 = V_ADD_U32_e32 %5, %1, implicit $exec -# CHECK: %8:vgpr_32 = V_ADD_U32_e32 %7, %1, implicit $exec -# CHECK: %10:vgpr_32 = V_ADD_U32_e32 %9, %1, implicit $exec +# GCN: %4:vgpr_32 = V_ADD_U32_dpp %2, %0, %1, 1, 15, 15, 1, implicit $exec +# GCN: %6:vgpr_32 = V_ADD_U32_e32 %5, %1, implicit $exec +# GCN: %8:vgpr_32 = V_ADD_U32_e32 %7, %1, implicit $exec +# GCN: %10:vgpr_32 = V_ADD_U32_e32 %9, %1, implicit $exec # VOP1: -# CHECK: %12:vgpr_32 = V_NOT_B32_dpp %2, %0, 1, 15, 15, 1, implicit $exec -# CHECK: %14:vgpr_32 = V_NOT_B32_e32 %13, implicit $exec -# CHECK: %16:vgpr_32 = V_NOT_B32_e32 %15, implicit $exec -# CHECK: %18:vgpr_32 = V_NOT_B32_e32 %17, implicit $exec +# GCN: %12:vgpr_32 = V_NOT_B32_dpp %2, %0, 1, 15, 15, 1, implicit $exec +# GCN: %14:vgpr_32 = V_NOT_B32_e32 %13, implicit $exec +# GCN: %16:vgpr_32 = V_NOT_B32_e32 %15, implicit $exec +# GCN: %18:vgpr_32 = V_NOT_B32_e32 %17, implicit $exec name: old_is_undef tracksRegLiveness: true body: | @@ -53,21 +53,21 @@ body: | # old is zero cases: -# CHECK-LABEL: name: old_is_0 +# GCN-LABEL: name: old_is_0 # VOP2: # case 1: old is zero, masks are fully enabled, bound_ctrl:0 is on: # the DPP mov result would be either zero ({src lane disabled}|{src lane is # out of range}) or active src lane result - can combine with old = undef. # undef is preffered as it makes life easier for the regalloc. 
-# CHECK: [[U1:%[0-9]+]]:vgpr_32 = IMPLICIT_DEF -# CHECK: %4:vgpr_32 = V_ADD_U32_dpp [[U1]], %0, %1, 1, 15, 15, 1, implicit $exec +# GCN: [[U1:%[0-9]+]]:vgpr_32 = IMPLICIT_DEF +# GCN: %4:vgpr_32 = V_ADD_U32_dpp [[U1]], %0, %1, 1, 15, 15, 1, implicit $exec # case 2: old is zero, masks are fully enabled, bound_ctrl:0 is off: # as the DPP mov old is zero this case is no different from case 1 - combine it # setting bound_ctrl0 on for the combined DPP VALU op to make old undefined -# CHECK: [[U2:%[0-9]+]]:vgpr_32 = IMPLICIT_DEF -# CHECK: %6:vgpr_32 = V_ADD_U32_dpp [[U2]], %0, %1, 1, 15, 15, 1, implicit $exec +# GCN: [[U2:%[0-9]+]]:vgpr_32 = IMPLICIT_DEF +# GCN: %6:vgpr_32 = V_ADD_U32_dpp [[U2]], %0, %1, 1, 15, 15, 1, implicit $exec # case 3: masks are partialy disabled, bound_ctrl:0 is on: # the DPP mov result would be either zero ({src lane disabled}|{src lane is @@ -77,7 +77,7 @@ body: | # with identity value. # Special case: the bound_ctrl for the combined DPP VALU op isn't important # here but let's make it off to keep the combiner's logic simpler. -# CHECK: %8:vgpr_32 = V_ADD_U32_dpp %1, %0, %1, 1, 14, 15, 0, implicit $exec +# GCN: %8:vgpr_32 = V_ADD_U32_dpp %1, %0, %1, 1, 14, 15, 0, implicit $exec # case 4: masks are partialy disabled, bound_ctrl:0 is off: # the DPP mov result would be either zero ({src lane disabled}|{src lane is @@ -85,19 +85,19 @@ body: | # active src lane result - can combine with old = src1 of the VALU op. 
# The VALU op should have the same masks as DPP mov as they select # lanes with identity value -# CHECK: %10:vgpr_32 = V_ADD_U32_dpp %1, %0, %1, 1, 14, 15, 0, implicit $exec +# GCN: %10:vgpr_32 = V_ADD_U32_dpp %1, %0, %1, 1, 14, 15, 0, implicit $exec # VOP1: # see case 1 -# CHECK: [[U3:%[0-9]+]]:vgpr_32 = IMPLICIT_DEF -# CHECK: %12:vgpr_32 = V_NOT_B32_dpp [[U3]], %0, 1, 15, 15, 1, implicit $exec +# GCN: [[U3:%[0-9]+]]:vgpr_32 = IMPLICIT_DEF +# GCN: %12:vgpr_32 = V_NOT_B32_dpp [[U3]], %0, 1, 15, 15, 1, implicit $exec # see case 2 -# CHECK: [[U4:%[0-9]+]]:vgpr_32 = IMPLICIT_DEF -# CHECK: %14:vgpr_32 = V_NOT_B32_dpp [[U4]], %0, 1, 15, 15, 1, implicit $exec +# GCN: [[U4:%[0-9]+]]:vgpr_32 = IMPLICIT_DEF +# GCN: %14:vgpr_32 = V_NOT_B32_dpp [[U4]], %0, 1, 15, 15, 1, implicit $exec # case 3 and 4 not appliable as there is no way to specify unchanged result # for the unary VALU op -# CHECK: %16:vgpr_32 = V_NOT_B32_e32 %15, implicit $exec -# CHECK: %18:vgpr_32 = V_NOT_B32_e32 %17, implicit $exec +# GCN: %16:vgpr_32 = V_NOT_B32_e32 %15, implicit $exec +# GCN: %18:vgpr_32 = V_NOT_B32_e32 %17, implicit $exec name: old_is_0 tracksRegLiveness: true @@ -143,11 +143,11 @@ body: | # The DPP VALU op should have the same masks (and bctrl) as DPP mov as they # select lanes with identity value -# CHECK-LABEL: name: nonzero_old_is_identity_masks_enabled_bctl_off -# CHECK: %4:vgpr_32 = V_MUL_U32_U24_dpp %1, %0, %1, 1, 15, 15, 0, implicit $exec -# CHECK: %7:vgpr_32 = V_AND_B32_dpp %1, %0, %1, 1, 15, 15, 0, implicit $exec -# CHECK: %10:vgpr_32 = V_MAX_I32_dpp %1, %0, %1, 1, 15, 15, 0, implicit $exec -# CHECK: %13:vgpr_32 = V_MIN_I32_dpp %1, %0, %1, 1, 15, 15, 0, implicit $exec +# GCN-LABEL: name: nonzero_old_is_identity_masks_enabled_bctl_off +# GCN: %4:vgpr_32 = V_MUL_U32_U24_dpp %1, %0, %1, 1, 15, 15, 0, implicit $exec +# GCN: %7:vgpr_32 = V_AND_B32_dpp %1, %0, %1, 1, 15, 15, 0, implicit $exec +# GCN: %10:vgpr_32 = V_MAX_I32_dpp %1, %0, %1, 1, 15, 15, 0, implicit $exec +# GCN: %13:vgpr_32 
= V_MIN_I32_dpp %1, %0, %1, 1, 15, 15, 0, implicit $exec name: nonzero_old_is_identity_masks_enabled_bctl_off tracksRegLiveness: true @@ -181,11 +181,11 @@ body: | # The DPP VALU op should have the same masks (and bctrl) as DPP mov as they # select lanes with identity value -# CHECK-LABEL: name: nonzero_old_is_identity_masks_partially_disabled_bctl_off -# CHECK: %4:vgpr_32 = V_MUL_U32_U24_dpp %1, %0, %1, 1, 14, 15, 0, implicit $exec -# CHECK: %7:vgpr_32 = V_AND_B32_dpp %1, %0, %1, 1, 15, 14, 0, implicit $exec -# CHECK: %10:vgpr_32 = V_MAX_I32_dpp %1, %0, %1, 1, 14, 15, 0, implicit $exec -# CHECK: %13:vgpr_32 = V_MIN_I32_dpp %1, %0, %1, 1, 15, 14, 0, implicit $exec +# GCN-LABEL: name: nonzero_old_is_identity_masks_partially_disabled_bctl_off +# GCN: %4:vgpr_32 = V_MUL_U32_U24_dpp %1, %0, %1, 1, 14, 15, 0, implicit $exec +# GCN: %7:vgpr_32 = V_AND_B32_dpp %1, %0, %1, 1, 15, 14, 0, implicit $exec +# GCN: %10:vgpr_32 = V_MAX_I32_dpp %1, %0, %1, 1, 14, 15, 0, implicit $exec +# GCN: %13:vgpr_32 = V_MIN_I32_dpp %1, %0, %1, 1, 15, 14, 0, implicit $exec name: nonzero_old_is_identity_masks_partially_disabled_bctl_off tracksRegLiveness: true @@ -219,11 +219,11 @@ body: | # 3. 
DPP mov's old value if the mov's dest VGPR write is disabled by masks # can't combine -# CHECK-LABEL: name: nonzero_old_is_identity_masks_partially_disabled_bctl0 -# CHECK: %4:vgpr_32 = V_MUL_U32_U24_e32 %3, %1, implicit $exec -# CHECK: %7:vgpr_32 = V_AND_B32_e32 %6, %1, implicit $exec -# CHECK: %10:vgpr_32 = V_MAX_I32_e32 %9, %1, implicit $exec -# CHECK: %13:vgpr_32 = V_MIN_I32_e32 %12, %1, implicit $exec +# GCN-LABEL: name: nonzero_old_is_identity_masks_partially_disabled_bctl0 +# GCN: %4:vgpr_32 = V_MUL_U32_U24_e32 %3, %1, implicit $exec +# GCN: %7:vgpr_32 = V_AND_B32_e32 %6, %1, implicit $exec +# GCN: %10:vgpr_32 = V_MAX_I32_e32 %9, %1, implicit $exec +# GCN: %13:vgpr_32 = V_MIN_I32_e32 %12, %1, implicit $exec name: nonzero_old_is_identity_masks_partially_disabled_bctl0 tracksRegLiveness: true @@ -251,13 +251,13 @@ body: | ... # when the DPP source isn't a src0 operand the operation should be commuted if possible -# CHECK-LABEL: name: dpp_commute -# CHECK: %4:vgpr_32 = V_MUL_U32_U24_dpp %1, %0, %1, 1, 14, 15, 0, implicit $exec -# CHECK: %7:vgpr_32 = V_AND_B32_dpp %1, %0, %1, 1, 15, 14, 0, implicit $exec -# CHECK: %10:vgpr_32 = V_MAX_I32_dpp %1, %0, %1, 1, 14, 15, 0, implicit $exec -# CHECK: %13:vgpr_32 = V_MIN_I32_dpp %1, %0, %1, 1, 15, 14, 0, implicit $exec -# CHECK: %16:vgpr_32 = V_SUBREV_I32_dpp %1, %0, %1, 1, 14, 15, 0, implicit-def $vcc, implicit $exec -# CHECK: %19:vgpr_32 = V_ADD_I32_e32 5, %18, implicit-def $vcc, implicit $exec +# GCN-LABEL: name: dpp_commute +# GCN: %4:vgpr_32 = V_MUL_U32_U24_dpp %1, %0, %1, 1, 14, 15, 0, implicit $exec +# GCN: %7:vgpr_32 = V_AND_B32_dpp %1, %0, %1, 1, 15, 14, 0, implicit $exec +# GCN: %10:vgpr_32 = V_MAX_I32_dpp %1, %0, %1, 1, 14, 15, 0, implicit $exec +# GCN: %13:vgpr_32 = V_MIN_I32_dpp %1, %0, %1, 1, 15, 14, 0, implicit $exec +# GCN: %16:vgpr_32 = V_SUBREV_I32_dpp %1, %0, %1, 1, 14, 15, 0, implicit-def $vcc, implicit $exec +# GCN: %19:vgpr_32 = V_ADD_I32_e32 5, %18, implicit-def $vcc, implicit $exec name: 
dpp_commute tracksRegLiveness: true body: | @@ -294,12 +294,12 @@ body: | ... # check for floating point modifiers -# CHECK-LABEL: name: add_f32_e64 -# CHECK: %3:vgpr_32 = V_MOV_B32_dpp undef %2, %1, 1, 15, 15, 1, implicit $exec -# CHECK: %4:vgpr_32 = V_ADD_F32_e64 0, %3, 0, %0, 0, 1, implicit $exec -# CHECK: %6:vgpr_32 = V_ADD_F32_dpp %2, 0, %1, 0, %0, 1, 15, 15, 1, implicit $exec -# CHECK: %8:vgpr_32 = V_ADD_F32_dpp %2, 1, %1, 2, %0, 1, 15, 15, 1, implicit $exec -# CHECK: %10:vgpr_32 = V_ADD_F32_e64 4, %9, 8, %0, 0, 0, implicit $exec +# GCN-LABEL: name: add_f32_e64 +# GCN: %3:vgpr_32 = V_MOV_B32_dpp undef %2, %1, 1, 15, 15, 1, implicit $exec +# GCN: %4:vgpr_32 = V_ADD_F32_e64 0, %3, 0, %0, 0, 1, implicit $exec +# GCN: %6:vgpr_32 = V_ADD_F32_dpp %2, 0, %1, 0, %0, 1, 15, 15, 1, implicit $exec +# GCN: %8:vgpr_32 = V_ADD_F32_dpp %2, 1, %1, 2, %0, 1, 15, 15, 1, implicit $exec +# GCN: %10:vgpr_32 = V_ADD_F32_e64 4, %9, 8, %0, 0, 0, implicit $exec name: add_f32_e64 tracksRegLiveness: true @@ -329,9 +329,9 @@ body: | ... # check for e64 modifiers -# CHECK-LABEL: name: add_u32_e64 -# CHECK: %4:vgpr_32 = V_ADD_U32_dpp %2, %0, %1, 1, 15, 15, 1, implicit $exec -# CHECK: %6:vgpr_32 = V_ADD_U32_e64 %5, %1, 1, implicit $exec +# GCN-LABEL: name: add_u32_e64 +# GCN: %4:vgpr_32 = V_ADD_U32_dpp %2, %0, %1, 1, 15, 15, 1, implicit $exec +# GCN: %6:vgpr_32 = V_ADD_U32_e64 %5, %1, 1, implicit $exec name: add_u32_e64 tracksRegLiveness: true @@ -353,12 +353,12 @@ body: | ... 
# tests on sequences of dpp consumers -# CHECK-LABEL: name: dpp_seq -# CHECK: %4:vgpr_32 = V_ADD_I32_dpp %1, %0, %1, 1, 14, 15, 0, implicit-def $vcc, implicit $exec -# CHECK: %5:vgpr_32 = V_SUBREV_I32_dpp %1, %0, %1, 1, 14, 15, 0, implicit-def $vcc, implicit $exec -# CHECK: %6:vgpr_32 = V_OR_B32_dpp %1, %0, %1, 1, 14, 15, 0, implicit $exec +# GCN-LABEL: name: dpp_seq +# GCN: %4:vgpr_32 = V_ADD_I32_dpp %1, %0, %1, 1, 14, 15, 0, implicit-def $vcc, implicit $exec +# GCN: %5:vgpr_32 = V_SUBREV_I32_dpp %1, %0, %1, 1, 14, 15, 0, implicit-def $vcc, implicit $exec +# GCN: %6:vgpr_32 = V_OR_B32_dpp %1, %0, %1, 1, 14, 15, 0, implicit $exec # broken sequence: -# CHECK: %7:vgpr_32 = V_MOV_B32_dpp %2, %0, 1, 14, 15, 0, implicit $exec +# GCN: %7:vgpr_32 = V_MOV_B32_dpp %2, %0, 1, 14, 15, 0, implicit $exec name: dpp_seq tracksRegLiveness: true @@ -381,10 +381,10 @@ body: | ... # tests on sequences of dpp consumers followed by control flow -# CHECK-LABEL: name: dpp_seq_cf -# CHECK: %4:vgpr_32 = V_ADD_I32_dpp %1, %0, %1, 1, 14, 15, 0, implicit-def $vcc, implicit $exec -# CHECK: %5:vgpr_32 = V_SUBREV_I32_dpp %1, %0, %1, 1, 14, 15, 0, implicit-def $vcc, implicit $exec -# CHECK: %6:vgpr_32 = V_OR_B32_dpp %1, %0, %1, 1, 14, 15, 0, implicit $exec +# GCN-LABEL: name: dpp_seq_cf +# GCN: %4:vgpr_32 = V_ADD_I32_dpp %1, %0, %1, 1, 14, 15, 0, implicit-def $vcc, implicit $exec +# GCN: %5:vgpr_32 = V_SUBREV_I32_dpp %1, %0, %1, 1, 14, 15, 0, implicit-def $vcc, implicit $exec +# GCN: %6:vgpr_32 = V_OR_B32_dpp %1, %0, %1, 1, 14, 15, 0, implicit $exec name: dpp_seq_cf tracksRegLiveness: true @@ -413,8 +413,8 @@ body: | ... # old reg def is in diff BB - cannot combine -# CHECK-LABEL: name: old_in_diff_bb -# CHECK: %3:vgpr_32 = V_MOV_B32_dpp %2, %1, 1, 1, 1, 0, implicit $exec +# GCN-LABEL: name: old_in_diff_bb +# GCN: %3:vgpr_32 = V_MOV_B32_dpp %2, %1, 1, 1, 1, 0, implicit $exec name: old_in_diff_bb tracksRegLiveness: true @@ -434,8 +434,8 @@ body: | ... 
# old reg def is in diff BB but bound_ctrl:0 - can combine -# CHECK-LABEL: name: old_in_diff_bb_bctrl_zero -# CHECK: %4:vgpr_32 = V_ADD_U32_dpp {{%[0-9]}}, %0, %1, 1, 15, 15, 1, implicit $exec +# GCN-LABEL: name: old_in_diff_bb_bctrl_zero +# GCN: %4:vgpr_32 = V_ADD_U32_dpp {{%[0-9]}}, %0, %1, 1, 15, 15, 1, implicit $exec name: old_in_diff_bb_bctrl_zero tracksRegLiveness: true @@ -455,8 +455,8 @@ body: | ... # EXEC mask changed between def and use - cannot combine -# CHECK-LABEL: name: exec_changed -# CHECK: %3:vgpr_32 = V_MOV_B32_dpp %2, %0, 1, 15, 15, 1, implicit $exec +# GCN-LABEL: name: exec_changed +# GCN: %3:vgpr_32 = V_MOV_B32_dpp %2, %0, 1, 15, 15, 1, implicit $exec name: exec_changed tracksRegLiveness: true @@ -475,8 +475,8 @@ body: | # test if $old definition is correctly tracked through subreg manipulation pseudos -# CHECK-LABEL: name: mul_old_subreg -# CHECK: %7:vgpr_32 = V_MUL_I32_I24_dpp %0.sub1, %1, %0.sub1, 1, 1, 1, 0, implicit $exec +# GCN-LABEL: name: mul_old_subreg +# GCN: %7:vgpr_32 = V_MUL_I32_I24_dpp %0.sub1, %1, %0.sub1, 1, 1, 1, 0, implicit $exec name: mul_old_subreg tracksRegLiveness: true @@ -494,8 +494,8 @@ body: | %7:vgpr_32 = V_MUL_I32_I24_e32 %6, %0.sub1, implicit $exec ... -# CHECK-LABEL: name: add_old_subreg -# CHECK: %5:vgpr_32 = V_ADD_U32_dpp %0.sub1, %1, %0.sub1, 1, 1, 1, 0, implicit $exec +# GCN-LABEL: name: add_old_subreg +# GCN: %5:vgpr_32 = V_ADD_U32_dpp %0.sub1, %1, %0.sub1, 1, 1, 1, 0, implicit $exec name: add_old_subreg tracksRegLiveness: true @@ -511,8 +511,8 @@ body: | %5:vgpr_32 = V_ADD_U32_e32 %4, %0.sub1, implicit $exec ... -# CHECK-LABEL: name: add_old_subreg_undef -# CHECK: %5:vgpr_32 = V_ADD_U32_dpp undef %3.sub1, %1, %0.sub1, 1, 15, 15, 1, implicit $exec +# GCN-LABEL: name: add_old_subreg_undef +# GCN: %5:vgpr_32 = V_ADD_U32_dpp undef %3.sub1, %1, %0.sub1, 1, 15, 15, 1, implicit $exec name: add_old_subreg_undef tracksRegLiveness: true @@ -529,8 +529,8 @@ body: | ... 
# Test instruction which does not have modifiers in VOP1 form but does in DPP form. -# CHECK-LABEL: name: dpp_vop1 -# CHECK: %3:vgpr_32 = V_CEIL_F32_dpp %0, 0, undef %2:vgpr_32, 1, 15, 15, 1, implicit $exec +# GCN-LABEL: name: dpp_vop1 +# GCN: %3:vgpr_32 = V_CEIL_F32_dpp %0, 0, undef %2:vgpr_32, 1, 15, 15, 1, implicit $exec name: dpp_vop1 tracksRegLiveness: true body: | @@ -541,8 +541,8 @@ body: | ... # Test instruction which does not have modifiers in VOP2 form but does in DPP form. -# CHECK-LABEL: name: dpp_min -# CHECK: %3:vgpr_32 = V_MIN_F32_dpp %0, 0, undef %2:vgpr_32, 0, undef %4:vgpr_32, 1, 15, 15, 1, implicit $exec +# GCN-LABEL: name: dpp_min +# GCN: %3:vgpr_32 = V_MIN_F32_dpp %0, 0, undef %2:vgpr_32, 0, undef %4:vgpr_32, 1, 15, 15, 1, implicit $exec name: dpp_min tracksRegLiveness: true body: | @@ -553,8 +553,8 @@ body: | ... # Test an undef old operand -# CHECK-LABEL: name: dpp_undef_old -# CHECK: %3:vgpr_32 = V_CEIL_F32_dpp undef %1:vgpr_32, 0, undef %2:vgpr_32, 1, 15, 15, 1, implicit $exec +# GCN-LABEL: name: dpp_undef_old +# GCN: %3:vgpr_32 = V_CEIL_F32_dpp undef %1:vgpr_32, 0, undef %2:vgpr_32, 1, 15, 15, 1, implicit $exec name: dpp_undef_old tracksRegLiveness: true body: | From llvm-commits at lists.llvm.org Fri Oct 11 15:29:08 2019 From: llvm-commits at lists.llvm.org (Thomas Lively via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 22:29:08 +0000 (UTC) Subject: [PATCH] D68889: [WebAssembly] Allow multivalue types in block signature operands In-Reply-To: References: Message-ID: tlively marked an inline comment as done. tlively added inline comments. ================ Comment at: llvm/tools/llvm-mc/llvm-mc.cpp:520 if (disassemble) - Res = Disassembler::disassemble(*TheTarget, TripleName, *STI, *Str, - *Buffer, SrcMgr, Out->os()); ---------------- sbc100 wrote: > This change to remove the context creation seems seperate. If so can you split it out? That was this change can stay wasm specific. It's not separate, unfortunately. 
In order to create a wasm symbol from the disassembler we need a bunch of target information that this target contains that the old one did not have. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68889/new/ https://reviews.llvm.org/D68889 From llvm-commits at lists.llvm.org Fri Oct 11 15:29:08 2019 From: llvm-commits at lists.llvm.org (Thomas Lively via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 22:29:08 +0000 (UTC) Subject: [PATCH] D68889: [WebAssembly] Allow multivalue types in block signature operands In-Reply-To: References: Message-ID: <54dd30f9a5e586784aa72493fd9fbc61@localhost.localdomain> tlively marked an inline comment as done. tlively added inline comments. ================ Comment at: llvm/tools/llvm-mc/llvm-mc.cpp:520 if (disassemble) - Res = Disassembler::disassemble(*TheTarget, TripleName, *STI, *Str, - *Buffer, SrcMgr, Out->os()); ---------------- tlively wrote: > sbc100 wrote: > > This change to remove the context creation seems seperate. If so can you split it out? That was this change can stay wasm specific. > It's not separate, unfortunately. In order to create a wasm symbol from the disassembler we need a bunch of target information that this target contains that the old one did not have. I am happy to reconsider the wisdom of creating a symbol in the disassembler, though. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68889/new/ https://reviews.llvm.org/D68889 From llvm-commits at lists.llvm.org Fri Oct 11 15:38:24 2019 From: llvm-commits at lists.llvm.org (Andrii Nakryiko via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 22:38:24 +0000 (UTC) Subject: [PATCH] D68822: [WIP][BPF] Support external globals In-Reply-To: References: Message-ID: anakryiko added inline comments. 
================ Comment at: llvm/lib/Target/BPF/BTF.h:183-184 VAR_GLOBAL_ALLOCATED = 1, ///< Linkage: ExternalLinkage - VAR_GLOBAL_TENTATIVE = 2, ///< Linkage: CommonLinkage - VAR_GLOBAL_EXTERNAL = 3, ///< Linkage: ExternalLinkage + VAR_GLOBAL_EXTERNAL = 2, ///< Linkage: ExternalLinkage + VAR_GLOBAL_TENTATIVE = 3, ///< Linkage: CommonLinkage }; ---------------- what's the difference between EXTERNAL and TENTATIVE? ================ Comment at: llvm/lib/Target/BPF/BTFDebug.cpp:1064 + // Whether DataSec is readonly or not can be found from the + // corresponding ELF section flags. ---------------- I remember there were problems figuring out datasec size when emitting BTF. Is this still the case? Does the same problem prevent readonly flag in DATASEC or you are trying to not put redundant information into BTF that can be derived from ELF? ================ Comment at: llvm/lib/Target/BPF/BTFDebug.cpp:1112 + if (SecName.empty()) + continue; ---------------- we'll just ignore externs with GlobalValue::CommonLinkage, right? Should we just assign a special section name to them instead? feels bad to only have a partial list of externs. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68822/new/ https://reviews.llvm.org/D68822 From llvm-commits at lists.llvm.org Fri Oct 11 15:38:35 2019 From: llvm-commits at lists.llvm.org (Sam Clegg via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 22:38:35 +0000 (UTC) Subject: [PATCH] D68889: [WebAssembly] Allow multivalue types in block signature operands In-Reply-To: References: Message-ID: <04ad14ac842c76a8f5a69b6f3db95392@localhost.localdomain> sbc100 added inline comments. ================ Comment at: llvm/tools/llvm-mc/llvm-mc.cpp:520 if (disassemble) - Res = Disassembler::disassemble(*TheTarget, TripleName, *STI, *Str, - *Buffer, SrcMgr, Out->os()); ---------------- tlively wrote: > tlively wrote: > > sbc100 wrote: > > > This change to remove the context creation seems seperate. 
If so can you split it out? That was this change can stay wasm specific. > > It's not separate, unfortunately. In order to create a wasm symbol from the disassembler we need a bunch of target information that this target contains that the old one did not have. > I am happy to reconsider the wisdom of creating a symbol in the disassembler, though. I was confused because the existing disassembly already needs a context and even creates one saying: `// Set up the MCContext for creating symbols and MCExpr's.`. So that approach seems justified and consistent with the existing code. I'm not clear why it wasn't done this way to begin with. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68889/new/ https://reviews.llvm.org/D68889 From llvm-commits at lists.llvm.org Fri Oct 11 15:56:26 2019 From: llvm-commits at lists.llvm.org (Thomas Lively via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 22:56:26 +0000 (UTC) Subject: [PATCH] D68889: [WebAssembly] Allow multivalue types in block signature operands In-Reply-To: References: Message-ID: tlively marked an inline comment as done. tlively added inline comments. ================ Comment at: llvm/tools/llvm-mc/llvm-mc.cpp:520 if (disassemble) - Res = Disassembler::disassemble(*TheTarget, TripleName, *STI, *Str, - *Buffer, SrcMgr, Out->os()); ---------------- sbc100 wrote: > tlively wrote: > > tlively wrote: > > > sbc100 wrote: > > > > This change to remove the context creation seems seperate. If so can you split it out? That was this change can stay wasm specific. > > > It's not separate, unfortunately. In order to create a wasm symbol from the disassembler we need a bunch of target information that this target contains that the old one did not have. > > I am happy to reconsider the wisdom of creating a symbol in the disassembler, though. 
> I was confused because the existing disassembly already needs a context and even creates one saying: `// Set up the MCContext for creating symbols and MCExpr's.`. So that approach seems justified and consistent with the existing code. I'm not clear why it wasn't done this way to begin with. Yeah it was surprising that the previous code didn't work. With the previous code, asking the context for a symbol returned a generic MCSymbol. With the new code it returns a MCSymbolWasm, which we need to be able to set the symbol type to `wasm::WASM_SYMBOL_TYPE_FUNCTION`. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68889/new/ https://reviews.llvm.org/D68889 From llvm-commits at lists.llvm.org Fri Oct 11 15:56:26 2019 From: llvm-commits at lists.llvm.org (Z Nguyen-Huu via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 22:56:26 +0000 (UTC) Subject: [PATCH] D68886: Remove unnecessary codes in llvm-dwarfdump In-Reply-To: References: Message-ID: duongnhn added a comment. In D68886#1706567 , @jakehehrlich wrote: > I'm not sure why I was added but this looks fine to me. If it compiles and runs on everyone's system I can't see how this could be anything but good. Sorry, I'm not able to find the exact owner for this so I thought you might be related. 
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68886/new/ https://reviews.llvm.org/D68886 From llvm-commits at lists.llvm.org Fri Oct 11 16:05:24 2019 From: llvm-commits at lists.llvm.org (Nico Weber via llvm-commits) Date: Fri, 11 Oct 2019 23:05:24 -0000 Subject: [llvm] r374608 - gn build: Cmanually) merge r374590 Message-ID: <20191011230524.E60F7932E2@lists.llvm.org> Author: nico Date: Fri Oct 11 16:05:24 2019 New Revision: 374608 URL: http://llvm.org/viewvc/llvm-project?rev=374608&view=rev Log: gn build: Cmanually) merge r374590 Added: llvm/trunk/utils/gn/secondary/llvm/tools/llvm-exegesis/lib/Mips/ llvm/trunk/utils/gn/secondary/llvm/tools/llvm-exegesis/lib/Mips/BUILD.gn llvm/trunk/utils/gn/secondary/llvm/unittests/tools/llvm-exegesis/Mips/ llvm/trunk/utils/gn/secondary/llvm/unittests/tools/llvm-exegesis/Mips/BUILD.gn Modified: llvm/trunk/utils/gn/secondary/llvm/lib/Target/targets.gni llvm/trunk/utils/gn/secondary/llvm/unittests/BUILD.gn Modified: llvm/trunk/utils/gn/secondary/llvm/lib/Target/targets.gni URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/gn/secondary/llvm/lib/Target/targets.gni?rev=374608&r1=374607&r2=374608&view=diff ============================================================================== --- llvm/trunk/utils/gn/secondary/llvm/lib/Target/targets.gni (original) +++ llvm/trunk/utils/gn/secondary/llvm/lib/Target/targets.gni Fri Oct 11 16:05:24 2019 @@ -46,6 +46,7 @@ if (llvm_targets_to_build == "host") { llvm_build_AArch64 = false llvm_build_ARM = false llvm_build_BPF = false +llvm_build_Mips = false llvm_build_PowerPC = false llvm_build_WebAssembly = false llvm_build_X86 = false @@ -56,6 +57,8 @@ foreach(target, llvm_targets_to_build) { llvm_build_ARM = true } else if (target == "BPF") { llvm_build_BPF = true + } else if (target == "Mips") { + llvm_build_Mips = true } else if (target == "PowerPC") { llvm_build_PowerPC = true } else if (target == "WebAssembly") { @@ -63,17 +66,16 @@ 
foreach(target, llvm_targets_to_build) { } else if (target == "X86") { llvm_build_X86 = true } else if (target == "AMDGPU" || target == "AVR" || target == "Hexagon" || - target == "Lanai" || target == "Mips" || target == "NVPTX" || - target == "RISCV" || target == "Sparc" || target == "SystemZ") { + target == "Lanai" || target == "NVPTX" || target == "RISCV" || + target == "Sparc" || target == "SystemZ") { # Nothing to do. } else { all_targets_string = "" foreach(target, llvm_all_targets) { all_targets_string += "$0x0a " + target } - assert(false, - "Unknown target '$target' in llvm_targets_to_build. " + - "Known targets:" + all_targets_string) + assert(false, "Unknown target '$target' in llvm_targets_to_build. " + + "Known targets:" + all_targets_string) } } Added: llvm/trunk/utils/gn/secondary/llvm/tools/llvm-exegesis/lib/Mips/BUILD.gn URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/gn/secondary/llvm/tools/llvm-exegesis/lib/Mips/BUILD.gn?rev=374608&view=auto ============================================================================== --- llvm/trunk/utils/gn/secondary/llvm/tools/llvm-exegesis/lib/Mips/BUILD.gn (added) +++ llvm/trunk/utils/gn/secondary/llvm/tools/llvm-exegesis/lib/Mips/BUILD.gn Fri Oct 11 16:05:24 2019 @@ -0,0 +1,21 @@ +import("//llvm/utils/TableGen/tablegen.gni") + +tablegen("MipsGenExegesis") { + args = [ "-gen-exegesis" ] + td_file = "//llvm/lib/Target/Mips/Mips.td" +} + +static_library("Mips") { + output_name = "LLVMExegesisMips" + deps = [ + ":MipsGenExegesis", + + # Exegesis reaches inside the Target/Mips tablegen internals and must + # depend on these Target/Mips-internal build targets. 
+ "//llvm/lib/Target/Mips/MCTargetDesc", + ] + sources = [ + "Target.cpp", + ] + include_dirs = [ "//llvm/lib/Target/Mips" ] +} Modified: llvm/trunk/utils/gn/secondary/llvm/unittests/BUILD.gn URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/gn/secondary/llvm/unittests/BUILD.gn?rev=374608&r1=374607&r2=374608&view=diff ============================================================================== --- llvm/trunk/utils/gn/secondary/llvm/unittests/BUILD.gn (original) +++ llvm/trunk/utils/gn/secondary/llvm/unittests/BUILD.gn Fri Oct 11 16:05:24 2019 @@ -61,12 +61,15 @@ group("unittests") { "tools/llvm-exegesis/ARM:LLVMExegesisARMTests", ] } - if (llvm_build_WebAssembly) { - deps += [ "Target/WebAssembly:WebAssemblyTests" ] + if (llvm_build_Mips) { + deps += [ "tools/llvm-exegesis/Mips:LLVMExegesisMipsTests" ] } if (llvm_build_PowerPC) { deps += [ "tools/llvm-exegesis/PowerPC:LLVMExegesisPowerPCTests" ] } + if (llvm_build_WebAssembly) { + deps += [ "Target/WebAssembly:WebAssemblyTests" ] + } if (llvm_build_X86) { deps += [ "tools/llvm-exegesis/X86:LLVMExegesisX86Tests" ] } Added: llvm/trunk/utils/gn/secondary/llvm/unittests/tools/llvm-exegesis/Mips/BUILD.gn URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/gn/secondary/llvm/unittests/tools/llvm-exegesis/Mips/BUILD.gn?rev=374608&view=auto ============================================================================== --- llvm/trunk/utils/gn/secondary/llvm/unittests/tools/llvm-exegesis/Mips/BUILD.gn (added) +++ llvm/trunk/utils/gn/secondary/llvm/unittests/tools/llvm-exegesis/Mips/BUILD.gn Fri Oct 11 16:05:24 2019 @@ -0,0 +1,25 @@ +import("//llvm/utils/unittest/unittest.gni") + +unittest("LLVMExegesisMipsTests") { + deps = [ + "//llvm/lib/DebugInfo/Symbolize", + "//llvm/lib/MC", + "//llvm/lib/MC/MCParser", + "//llvm/lib/Object", + "//llvm/lib/Support", + "//llvm/lib/Target/Mips", + + # Exegesis reaches inside the Target/Mips tablegen internals and must + # depend on these Target/Mips-internal build targets. 
+ "//llvm/lib/Target/Mips/MCTargetDesc", + "//llvm/tools/llvm-exegesis/lib", + "//llvm/tools/llvm-exegesis/lib/Mips", + ] + include_dirs = [ + "//llvm/lib/Target/Mips", + "//llvm/tools/llvm-exegesis/lib", + ] + sources = [ + "TargetTest.cpp", + ] +} From llvm-commits at lists.llvm.org Fri Oct 11 16:12:04 2019 From: llvm-commits at lists.llvm.org (GN Sync Bot via llvm-commits) Date: Fri, 11 Oct 2019 23:12:04 -0000 Subject: [llvm] r374610 - gn build: Merge r235758 Message-ID: <20191011231204.CE6C38951F@lists.llvm.org> Author: gnsyncbot Date: Fri Oct 11 16:12:04 2019 New Revision: 374610 URL: http://llvm.org/viewvc/llvm-project?rev=374610&view=rev Log: gn build: Merge r235758 Modified: llvm/trunk/utils/gn/secondary/libunwind/src/BUILD.gn Modified: llvm/trunk/utils/gn/secondary/libunwind/src/BUILD.gn URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/gn/secondary/libunwind/src/BUILD.gn?rev=374610&r1=374609&r2=374610&view=diff ============================================================================== --- llvm/trunk/utils/gn/secondary/libunwind/src/BUILD.gn (original) +++ llvm/trunk/utils/gn/secondary/libunwind/src/BUILD.gn Fri Oct 11 16:12:04 2019 @@ -24,6 +24,7 @@ if (target_os == "mac") { } unwind_sources = [ + "Unwind_AppleExtras.cpp", "libunwind.cpp", "Unwind-EHABI.cpp", "Unwind-seh.cpp", From llvm-commits at lists.llvm.org Fri Oct 11 16:14:51 2019 From: llvm-commits at lists.llvm.org (Yonghong Song via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 23:14:51 +0000 (UTC) Subject: [PATCH] D68822: [WIP][BPF] Support external globals In-Reply-To: References: Message-ID: <7b079a54e638aaffa34bc786c9d834e4@localhost.localdomain> yonghong-song marked 3 inline comments as done. yonghong-song added inline comments. 
================
Comment at: llvm/lib/Target/BPF/BTF.h:183-184
   VAR_GLOBAL_ALLOCATED = 1, ///< Linkage: ExternalLinkage
-  VAR_GLOBAL_TENTATIVE = 2, ///< Linkage: CommonLinkage
-  VAR_GLOBAL_EXTERNAL = 3, ///< Linkage: ExternalLinkage
+  VAR_GLOBAL_EXTERNAL = 2, ///< Linkage: ExternalLinkage
+  VAR_GLOBAL_TENTATIVE = 3, ///< Linkage: CommonLinkage
 };
----------------
anakryiko wrote:
> what's the difference between EXTERNAL and TENTATIVE?
TENTATIVE is for .common symbols. Just compile a file containing "int g;" and you will see one.

================
Comment at: llvm/lib/Target/BPF/BTFDebug.cpp:1064
+  // Whether DataSec is readonly or not can be found from the
+  // corresponding ELF section flags.
----------------
anakryiko wrote:
> I remember there were problems figuring out datasec size when emitting BTF. Is this still the case? Does the same problem prevent readonly flag in DATASEC or you are trying to not put redundant information into BTF that can be derived from ELF?
Yes, this is still the case. The debugging info is emitted earlier, and we do not know the data section until the very end of compilation, which is too late. The readonly property of a section is marked by the ELF object writer. The BPF backend can only detect readonly variables through type inspection, so it MAY mark the container DataSec as a readonly section. Let us rely on ELF checking for now, since ELF section inspection is required to get the readonly data.

================
Comment at: llvm/lib/Target/BPF/BTFDebug.cpp:1112
+    if (SecName.empty())
+      continue;
----------------
anakryiko wrote:
> we'll just ignore externs with GlobalValue::CommonLinkage, right? Should we just assign a special section name to them instead? feels bad to only have a partial list of externs.
No, all externs will be emitted with their special section name. CommonLinkage is not an extern.
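
To make the TENTATIVE versus EXTERNAL distinction above concrete, here is a minimal C sketch. It assumes the file is compiled as C with -fcommon (a compiler that defaults to -fno-common would place the tentative definition in .bss instead). The variable and function names are hypothetical, and the enum names in the comments are the ones from the patch under review, which is still WIP and may change.

```c
#include <stddef.h>

/* Definition with an initializer: storage is allocated in .data.
   In the patch under review this is the VAR_GLOBAL_ALLOCATED case. */
int allocated_var = 42;

/* Tentative definition: no initializer and no 'extern'. With -fcommon
   this becomes a common symbol (CommonLinkage in LLVM IR), which is
   the case VAR_GLOBAL_TENTATIVE is meant to describe. */
int tentative_var;

/* Declaration only: the storage lives in some other object file.
   This is the VAR_GLOBAL_EXTERNAL case. */
extern int external_var;

int read_allocated(void) { return allocated_var; }

/* A tentative definition that never gets an initializer is
   zero-initialized, so this returns 0. */
int read_tentative(void) { return tentative_var; }

/* sizeof needs only the declared type, so this does not require
   external_var to be defined anywhere at link time. */
size_t external_size(void) { return sizeof external_var; }
```

Compiling this with something like "clang -fcommon -c" and running nm on the object should list tentative_var with type C (common) and allocated_var with type D (initialized data). Since external_var is never used as a value here, it would most likely not appear in the object file at all; an extern only shows up as undefined once the code actually references it.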
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68822/new/ https://reviews.llvm.org/D68822 From llvm-commits at lists.llvm.org Fri Oct 11 16:14:51 2019 From: llvm-commits at lists.llvm.org (Joel Klinghed via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 23:14:51 +0000 (UTC) Subject: [PATCH] D67322: [LLD][ThinLTO] Handle GUID collision in import global processing In-Reply-To: References: Message-ID: <6179ffc85daa2df10ae22a6dac664d5f@localhost.localdomain> the_jk updated this revision to Diff 224702. the_jk added a comment. Updated to latest master. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67322/new/ https://reviews.llvm.org/D67322 Files: llvm/lib/Transforms/Utils/FunctionImportUtils.cpp llvm/test/ThinLTO/X86/Inputs/guid_collision.ll llvm/test/ThinLTO/X86/guid_collision.ll Index: llvm/test/ThinLTO/X86/guid_collision.ll =================================================================== --- /dev/null +++ llvm/test/ThinLTO/X86/guid_collision.ll @@ -0,0 +1,17 @@ +; Make sure LTO succeeds even if %t.bc contains a GlobalVariable F and +; %t2.bc cointains a Function F with the same GUID. 
+; +; RUN: opt -module-summary %s -o %t.bc +; RUN: opt -module-summary %p/Inputs/guid_collision.ll -o %t2.bc +; RUN: llvm-lto2 run %t.bc %t2.bc -o %t.out \ +; RUN: -r=%t.bc,dummy,px -r=%t2.bc,dummy2,px + +target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128" +target triple = "x86_64-pc-linux-gnu" + +; The source for the GUID for this symbol will be -:F +source_filename = "-" + at F = internal constant i8 0 + +; Needed to give llvm-lto2 something to do + at dummy = global i32 0 Index: llvm/test/ThinLTO/X86/Inputs/guid_collision.ll =================================================================== --- /dev/null +++ llvm/test/ThinLTO/X86/Inputs/guid_collision.ll @@ -0,0 +1,12 @@ +target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128" +target triple = "x86_64-pc-linux-gnu" + +; The source for the GUID for this symbol will be -:F +source_filename = "-" +define internal fastcc i64 @F() { + ret i64 0 +} + +; Needed to give llvm-lto2 something to do + at dummy2 = global i32 0 + Index: llvm/lib/Transforms/Utils/FunctionImportUtils.cpp =================================================================== --- llvm/lib/Transforms/Utils/FunctionImportUtils.cpp +++ llvm/lib/Transforms/Utils/FunctionImportUtils.cpp @@ -239,11 +239,20 @@ // propagateConstants hasn't been run. We can't internalize GV // in such case. if (!GV.isDeclaration() && VI && ImportIndex.withGlobalValueDeadStripping()) { - const auto &SL = VI.getSummaryList(); - auto *GVS = SL.empty() ? 
nullptr : dyn_cast(SL[0].get()); - // At this stage "maybe" is "definitely" - if (GVS && (GVS->maybeReadOnly() || GVS->maybeWriteOnly())) - cast(&GV)->addAttribute("thinlto-internalize"); + if (GlobalVariable *V = dyn_cast(&GV)) { + GlobalVarSummary* GVS = nullptr; + for (auto &S : VI.getSummaryList()) { + GVS = dyn_cast(S->getBaseObject()); + if (GVS) { + if (GVS->modulePath() == M.getModuleIdentifier()) + break; + GVS = nullptr; + } + } + // At this stage "maybe" is "definitely" + if (GVS && (GVS->maybeReadOnly() || GVS->maybeWriteOnly())) + V->addAttribute("thinlto-internalize"); + } } bool DoPromote = false; -------------- next part -------------- A non-text attachment was scrubbed... Name: D67322.224702.patch Type: text/x-patch Size: 2595 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Fri Oct 11 16:22:36 2019 From: llvm-commits at lists.llvm.org (Nico Weber via llvm-commits) Date: Fri, 11 Oct 2019 23:22:36 -0000 Subject: [llvm] r374611 - gn build: (manually) merge r374606 better Message-ID: <20191011232236.995BB93460@lists.llvm.org> Author: nico Date: Fri Oct 11 16:22:36 2019 New Revision: 374611 URL: http://llvm.org/viewvc/llvm-project?rev=374611&view=rev Log: gn build: (manually) merge r374606 better Modified: llvm/trunk/utils/gn/secondary/libunwind/src/BUILD.gn Modified: llvm/trunk/utils/gn/secondary/libunwind/src/BUILD.gn URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/gn/secondary/libunwind/src/BUILD.gn?rev=374611&r1=374610&r2=374611&view=diff ============================================================================== --- llvm/trunk/utils/gn/secondary/libunwind/src/BUILD.gn (original) +++ llvm/trunk/utils/gn/secondary/libunwind/src/BUILD.gn Fri Oct 11 16:22:36 2019 @@ -24,7 +24,6 @@ if (target_os == "mac") { } unwind_sources = [ - "Unwind_AppleExtras.cpp", "libunwind.cpp", "Unwind-EHABI.cpp", "Unwind-seh.cpp", @@ -46,7 +45,11 @@ unwind_sources = [ "UnwindCursor.hpp", ] if (target_os == "mac") { - unwind_sources += [ 
"src/Unwind_AppleExtras.cpp" ] + unwind_sources += [ + # This comment prevents `gn format` from putting the file on the same line + # as `sources +=`, for sync_source_lists_from_cmake.py. + "Unwind_AppleExtras.cpp", + ] } config("unwind_config") { From llvm-commits at lists.llvm.org Fri Oct 11 16:24:00 2019 From: llvm-commits at lists.llvm.org (James Nagurne via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 23:24:00 +0000 (UTC) Subject: [PATCH] D63978: Clang Interface Stubs merger plumbing for Driver In-Reply-To: References: Message-ID: <8f2b9d5b78e20ca315f6c9e09fceb7f6@localhost.localdomain> JamesNagurne added a comment. In D63978#1706714 , @plotfi wrote: > In D63978#1706502 , @JamesNagurne wrote: > > > In D63978#1706448 , @plotfi wrote: > > > > > In D63978#1706420 , @JamesNagurne wrote: > > > > > > > Our team maintains a downstream embedded ARM clang distribution and some tests from this commit have begun to fail for us. > > > > For a number of these tests, there was a REQUIRES: x86-registered-target at the top, which has now been removed. Specifically, externstatic.c, merge-conflict-test.c, object-float.c, and object.c are failing. > > > > > > > > object* tests seem to be based on object.cpp, which had the REQUIRES line, and externstatic.c also had that line prior to the change. > > > > I see that @compnerd suggested the removal, but were you certain that these tests would work on clang toolchains for which x86 is not a registered target? > > > > > > > > For a failure example, here the output of lit for our toolchain. 
If you can make sense of it, I'd appreciate input on how we can fix or work around it: > > > > > > > > > /arm-llvm/Release/llvm/bin/clang -c -o - -emit-interface-stubs /llvm-project/clang/test/InterfaceStubs/object.c | /arm-llvm/Release/llvm/bin/FileCheck -check-prefix=CHECK-TAPI /llvm-project/clang/test/InterfaceStubs/object.c > > > > /llvm-project/clang/test/InterfaceStubs/object.c:5:16: error: CHECK-TAPI: expected string not found in input > > > > // CHECK-TAPI: data: { Type: Object, Size: 4 } > > > > ^ > > > > :1:1: note: scanning from here > > > > --- !experimental-ifs-v1 > > > > ^ > > > > > > > > > > > > And when run without FileCheck, our raw output: > > > > > > > > > /arm-llvm/Release/llvm/bin/clang -c -o - -emit-interface-stubs /llvm-project/clang/test/InterfaceStubs/object.c > > > > --- !experimental-ifs-v1 > > > > IfsVersion: 1.0 > > > > Triple: thumbv7em-ti-none-eabihf > > > > ObjectFileFormat: ELF > > > > Symbols: > > > > ... > > > > > > > > > > > > > I am sorry for this James. I can add back the REQUIRES lines for now and coordinate with you on making sure your downstream bots are not affected again if the REQUIRES are removed again. > > > By chance are your bots accessible publicly? > > > > > > Sadly, they are not. It's on our list of things to investigate, but we don't have the resources to do such a thing quite yet. > > I'm looking into the 'arm7*' buildbots to see if they are built similar to ours so I am not leaving you entirely without something to look at. However, if it seems to be common knowledge to always include an X86 target, I think I can talk to my team and change up what we do. > > > > These buildbots seem to also do LLVM_TARGETS_TO_BUILD=ARM, and then set the default target triple to a non-x86 triple (the host's) > > > > That could point towards us being in error here. I'll investigate things a little further, and update when I get the chance. > > To be clear: this feature should work for any ELF target, correct? 
> > > Yes, it is designed to work for all ELF targets but at the moment it is still in an early state. I am on the llvm IRC as zer0_ BTW I'd love to bounce ideas off of people in IRC, but the big mean IT security guys say no to any sort of chat programs. It's a real shame. I found the assumption being missed though, so good news! Our targets assume hidden visibility by default. After scanning your code (and realizing 'interface' is spelled as 'iterface' in a number of places), I noticed it was looking only for externally visible decls. After that, I scanned out changes and found a sneaky '-fvisibility=hidden' in our toolchain options. By running all of your tests with '-fvisibility=default', our toolchain passes! If you're willing to review/commit the fix upstream, I'm putting up a review presently. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D63978/new/ https://reviews.llvm.org/D63978 From llvm-commits at lists.llvm.org Fri Oct 11 16:35:14 2019 From: llvm-commits at lists.llvm.org (Jake Ehrlich via llvm-commits) Date: Fri, 11 Oct 2019 23:35:14 -0000 Subject: [compiler-rt] r374612 - [libFuzzer] Don't prefix absolute paths in fuchsia. Message-ID: <20191011233514.2505C93397@lists.llvm.org> Author: jakehehrlich Date: Fri Oct 11 16:35:13 2019 New Revision: 374612 URL: http://llvm.org/viewvc/llvm-project?rev=374612&view=rev Log: [libFuzzer] Don't prefix absolute paths in fuchsia. The ExecuteCommand function in fuchsia used to prefix the getOutputFile for each command run with the artifact_prefix flag if it was available, because fuchsia components don't have a writable working directory. However, if a file with a global path is provided, fuchsia should honor that. An example of this is using the global /tmp directory to store stuff. In fuchsia it ended up being translated to data///tmp, whereas we want to make sure it is using /tmp (which is available to components using the isolated-temp feature). 
To test this I made the change, compiled fuchsia with this toolchain and ran a fuzzer with the -fork=1 flag (that mode makes use of the /tmp directory). I also tested that normal fuzzing workflow was not affected by this. Author: charco (Marco Vanotti) Differential Revision: https://reviews.llvm.org/D68774 Modified: compiler-rt/trunk/lib/fuzzer/FuzzerUtilFuchsia.cpp Modified: compiler-rt/trunk/lib/fuzzer/FuzzerUtilFuchsia.cpp URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/fuzzer/FuzzerUtilFuchsia.cpp?rev=374612&r1=374611&r2=374612&view=diff ============================================================================== --- compiler-rt/trunk/lib/fuzzer/FuzzerUtilFuchsia.cpp (original) +++ compiler-rt/trunk/lib/fuzzer/FuzzerUtilFuchsia.cpp Fri Oct 11 16:35:13 2019 @@ -407,13 +407,14 @@ int ExecuteCommand(const Command &Cmd) { // that lacks a mutable working directory. Fortunately, when this is the case // a mutable output directory must be specified using "-artifact_prefix=...", // so write the log file(s) there. + // However, we don't want to apply this logic for absolute paths. 
int FdOut = STDOUT_FILENO; if (Cmd.hasOutputFile()) { - std::string Path; - if (Cmd.hasFlag("artifact_prefix")) - Path = Cmd.getFlagValue("artifact_prefix") + "/" + Cmd.getOutputFile(); - else - Path = Cmd.getOutputFile(); + std::string Path = Cmd.getOutputFile(); + bool IsAbsolutePath = Path.length() > 1 && Path[0] == '/'; + if (!IsAbsolutePath && Cmd.hasFlag("artifact_prefix")) + Path = Cmd.getFlagValue("artifact_prefix") + "/" + Path; + FdOut = open(Path.c_str(), O_WRONLY | O_CREAT | O_TRUNC, 0); if (FdOut == -1) { Printf("libFuzzer: failed to open %s: %s\n", Path.c_str(), From llvm-commits at lists.llvm.org Fri Oct 11 16:33:10 2019 From: llvm-commits at lists.llvm.org (David Li via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 23:33:10 +0000 (UTC) Subject: [PATCH] D68898: JumpThreading: enhance JT to handle BB with no successor and address comparison Message-ID: davidxl created this revision. davidxl added reviewers: efriedma, wmi. Herald added a subscriber: jfb. Current JT only process (clone) BBs with multiple successors in JT with the aim to thread the predecessor with a successor BB. This misses opportunities to to handle return BB where the return value can be simplified with threading (cloning). Example: #include #include constexpr std::array x = {1, 7, 17}; bool Contains(int i) { return std::find(x.begin(), x.end(), i) != x.end(); } Clang produces inefficient code: _Z8Containsi: # @_Z8Containsi .cfi_startproc 1. %bb.0: cmpl $1, %edi je .LBB0_1 2. %bb.2: cmpl $7, %edi jne .LBB0_3 3. %bb.4: movl $_ZL1x+4, %eax jmp .LBB0_5 .LBB0_1: movl $_ZL1x, %eax jmp .LBB0_5 .LBB0_3: cmpl $17, %edi movl $_ZL1x+8, %ecx movl $_ZL1x+12, %eax cmoveq %rcx, %rax .LBB0_5: movl $_ZL1x+12, %ecx cmpq %rcx, %rax setne %al retq While GCC produces: _Z8Containsi: .LFB1534: .cfi_startproc movl $1, %eax cmpl $1, %edi je .L1 cmpl $7, %edi je .L1 cmpl $17, %edi sete %al .L1 : ret This patch address the issue. 
After the fix, the generated code looks like: _Z8Containsi: # @_Z8Containsi .cfi_startproc addl $-1, %edi cmpl $16, %edi ja .LBB0_2 movl $65601, %eax # imm = 0x10041 movl %edi, %ecx shrl %cl, %eax andb $1, %al retq .LBB0_2: # %_ZSt4findIPKiiET_S2_S2_RKT0_.exit.thread xorl %eax, %eax retq https://reviews.llvm.org/D68898 Files: include/llvm/Transforms/Scalar/JumpThreading.h lib/Transforms/Scalar/JumpThreading.cpp test/Transforms/JumpThreading/addr.ll test/Transforms/JumpThreading/return.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D68898.224704.patch Type: text/x-patch Size: 17827 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Fri Oct 11 16:33:10 2019 From: llvm-commits at lists.llvm.org (Puyan Lotfi via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 23:33:10 +0000 (UTC) Subject: [PATCH] D63978: Clang Interface Stubs merger plumbing for Driver In-Reply-To: References: Message-ID: <4219d70455e084aac6dc03ecda5480f6@localhost.localdomain> plotfi added a comment. In D63978#1706764 , @JamesNagurne wrote: > In D63978#1706714 , @plotfi wrote: > > > In D63978#1706502 , @JamesNagurne wrote: > > > > > In D63978#1706448 , @plotfi wrote: > > > > > > > In D63978#1706420 , @JamesNagurne wrote: > > > > > > > > > Our team maintains a downstream embedded ARM clang distribution and some tests from this commit have begun to fail for us. > > > > > For a number of these tests, there was a REQUIRES: x86-registered-target at the top, which has now been removed. Specifically, externstatic.c, merge-conflict-test.c, object-float.c, and object.c are failing. > > > > > > > > > > object* tests seem to be based on object.cpp, which had the REQUIRES line, and externstatic.c also had that line prior to the change. > > > > > I see that @compnerd suggested the removal, but were you certain that these tests would work on clang toolchains for which x86 is not a registered target? 
> > > > > > > > > > For a failure example, here the output of lit for our toolchain. If you can make sense of it, I'd appreciate input on how we can fix or work around it: > > > > > > > > > > > /arm-llvm/Release/llvm/bin/clang -c -o - -emit-interface-stubs /llvm-project/clang/test/InterfaceStubs/object.c | /arm-llvm/Release/llvm/bin/FileCheck -check-prefix=CHECK-TAPI /llvm-project/clang/test/InterfaceStubs/object.c > > > > > /llvm-project/clang/test/InterfaceStubs/object.c:5:16: error: CHECK-TAPI: expected string not found in input > > > > > // CHECK-TAPI: data: { Type: Object, Size: 4 } > > > > > ^ > > > > > :1:1: note: scanning from here > > > > > --- !experimental-ifs-v1 > > > > > ^ > > > > > > > > > > > > > > > And when run without FileCheck, our raw output: > > > > > > > > > > > /arm-llvm/Release/llvm/bin/clang -c -o - -emit-interface-stubs /llvm-project/clang/test/InterfaceStubs/object.c > > > > > --- !experimental-ifs-v1 > > > > > IfsVersion: 1.0 > > > > > Triple: thumbv7em-ti-none-eabihf > > > > > ObjectFileFormat: ELF > > > > > Symbols: > > > > > ... > > > > > > > > > > > > > > > > > I am sorry for this James. I can add back the REQUIRES lines for now and coordinate with you on making sure your downstream bots are not affected again if the REQUIRES are removed again. > > > > By chance are your bots accessible publicly? > > > > > > > > > Sadly, they are not. It's on our list of things to investigate, but we don't have the resources to do such a thing quite yet. > > > I'm looking into the 'arm7*' buildbots to see if they are built similar to ours so I am not leaving you entirely without something to look at. However, if it seems to be common knowledge to always include an X86 target, I think I can talk to my team and change up what we do. > > > > > > These buildbots seem to also do LLVM_TARGETS_TO_BUILD=ARM, and then set the default target triple to a non-x86 triple (the host's) > > > > > > That could point towards us being in error here. 
I'll investigate things a little further, and update when I get the chance. > > > To be clear: this feature should work for any ELF target, correct? > > > > > > Yes, it is designed to work for all ELF targets but at the moment it is still in an early state. I am on the llvm IRC as zer0_ BTW > > > I'd love to bounce ideas off of people in IRC, but the big mean IT security guys say no to any sort of chat programs. It's a real shame. > I found the assumption being missed though, so good news! > Our targets assume hidden visibility by default. After scanning your code (and realizing 'interface' is spelled as 'iterface' in a number of places), I noticed it was looking only for externally visible decls. After that, I scanned out changes and found a sneaky '-fvisibility=hidden' in our toolchain options. > > By running all of your tests with '-fvisibility=default', our toolchain passes! If you're willing to review/commit the fix upstream, I'm putting up a review presently. Fan-freakin-tastic! I can help review, and can you include the places where I misspelled things?? Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D63978/new/ https://reviews.llvm.org/D63978 From llvm-commits at lists.llvm.org Fri Oct 11 16:42:23 2019 From: llvm-commits at lists.llvm.org (Jake Ehrlich via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 23:42:23 +0000 (UTC) Subject: [PATCH] D68774: [libFuzzer] Don't prefix absolute paths in fuchsia. In-Reply-To: References: Message-ID: <06fb8967ece7eb6db72b8ae1f729e5ce@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rGcde860a1c996: [libFuzzer] Don't prefix absolute paths in fuchsia. (authored by jakehehrlich). 
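The path handling that this commit changes can be sketched in isolation as below. This is only an illustration, not libFuzzer code: `ResolveOutputPath` and its parameters are hypothetical names, and the real change lives inline in `ExecuteCommand` and reads its inputs from the `Command` object. The sketch keeps the same test as the patch, where a path counts as absolute only if it is longer than one character and starts with `/`.

```cpp
#include <string>

// Hypothetical helper (not part of libFuzzer): decide where an output file
// should be written. A relative path is placed under the artifact prefix
// when one is given; an absolute path is returned unchanged.
std::string ResolveOutputPath(const std::string &OutputFile,
                              const std::string &ArtifactPrefix,
                              bool HasArtifactPrefix) {
  std::string Path = OutputFile;
  // Same check as in the patch: longer than one character, leading '/'.
  bool IsAbsolutePath = Path.length() > 1 && Path[0] == '/';
  if (!IsAbsolutePath && HasArtifactPrefix)
    Path = ArtifactPrefix + "/" + Path;
  return Path;
}
```

With a prefix of `data`, a relative name such as `log.txt` would resolve to `data/log.txt`, while `/tmp/log.txt` would be left alone.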
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68774/new/ https://reviews.llvm.org/D68774 Files: compiler-rt/lib/fuzzer/FuzzerUtilFuchsia.cpp Index: compiler-rt/lib/fuzzer/FuzzerUtilFuchsia.cpp =================================================================== --- compiler-rt/lib/fuzzer/FuzzerUtilFuchsia.cpp +++ compiler-rt/lib/fuzzer/FuzzerUtilFuchsia.cpp @@ -407,13 +407,14 @@ // that lacks a mutable working directory. Fortunately, when this is the case // a mutable output directory must be specified using "-artifact_prefix=...", // so write the log file(s) there. + // However, we don't want to apply this logic for absolute paths. int FdOut = STDOUT_FILENO; if (Cmd.hasOutputFile()) { - std::string Path; - if (Cmd.hasFlag("artifact_prefix")) - Path = Cmd.getFlagValue("artifact_prefix") + "/" + Cmd.getOutputFile(); - else - Path = Cmd.getOutputFile(); + std::string Path = Cmd.getOutputFile(); + bool IsAbsolutePath = Path.length() > 1 && Path[0] == '/'; + if (!IsAbsolutePath && Cmd.hasFlag("artifact_prefix")) + Path = Cmd.getFlagValue("artifact_prefix") + "/" + Path; + FdOut = open(Path.c_str(), O_WRONLY | O_CREAT | O_TRUNC, 0); if (FdOut == -1) { Printf("libFuzzer: failed to open %s: %s\n", Path.c_str(), -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68774.224708.patch Type: text/x-patch Size: 1148 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Fri Oct 11 16:51:25 2019 From: llvm-commits at lists.llvm.org (David Blaikie via llvm-commits) Date: Fri, 11 Oct 2019 23:51:25 -0000 Subject: [llvm] r374613 - DebugInfo: Reduce the scope of some variables related to debug_ranges emission Message-ID: <20191011235125.204AD8A9A5@lists.llvm.org> Author: dblaikie Date: Fri Oct 11 16:51:24 2019 New Revision: 374613 URL: http://llvm.org/viewvc/llvm-project?rev=374613&view=rev Log: DebugInfo: Reduce the scope of some variables related to debug_ranges emission Minor tidy up/NFC Modified: llvm/trunk/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp Modified: llvm/trunk/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp?rev=374613&r1=374612&r2=374613&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp (original) +++ llvm/trunk/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp Fri Oct 11 16:51:24 2019 @@ -467,14 +467,6 @@ void DwarfCompileUnit::constructScopeDIE void DwarfCompileUnit::addScopeRangeList(DIE &ScopeDIE, SmallVector Range) { - const TargetLoweringObjectFile &TLOF = Asm->getObjFileLowering(); - - // Emit the offset into .debug_ranges or .debug_rnglists as a relocatable - // label. emitDIE() will handle emitting it appropriately. - const MCSymbol *RangeSectionSym = - DD->getDwarfVersion() >= 5 - ? TLOF.getDwarfRnglistsSection()->getBeginSymbol() - : TLOF.getDwarfRangesSection()->getBeginSymbol(); HasRangeLists = true; @@ -493,12 +485,17 @@ void DwarfCompileUnit::addScopeRangeList // (DW_RLE_startx_endx etc.). 
if (DD->getDwarfVersion() >= 5) addUInt(ScopeDIE, dwarf::DW_AT_ranges, dwarf::DW_FORM_rnglistx, Index); - else if (isDwoUnit()) - addSectionDelta(ScopeDIE, dwarf::DW_AT_ranges, List.getSym(), - RangeSectionSym); - else - addSectionLabel(ScopeDIE, dwarf::DW_AT_ranges, List.getSym(), - RangeSectionSym); + else { + const TargetLoweringObjectFile &TLOF = Asm->getObjFileLowering(); + const MCSymbol *RangeSectionSym = + TLOF.getDwarfRangesSection()->getBeginSymbol(); + if (isDwoUnit()) + addSectionDelta(ScopeDIE, dwarf::DW_AT_ranges, List.getSym(), + RangeSectionSym); + else + addSectionLabel(ScopeDIE, dwarf::DW_AT_ranges, List.getSym(), + RangeSectionSym); + } } void DwarfCompileUnit::attachRangesOrLowHighPC( From llvm-commits at lists.llvm.org Fri Oct 11 16:51:35 2019 From: llvm-commits at lists.llvm.org (Jordan Rupprecht via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 23:51:35 +0000 (UTC) Subject: [PATCH] D66613: [support][llvm-objcopy] Add support for shell wildcards In-Reply-To: References: Message-ID: <4401d0e81ee5cacd80891a7bb8b35a0f@localhost.localdomain> rupprecht added a comment. In D66613#1659632 , @alexshap wrote: > khm, I don't insist, but personally I would split out the changes in lib/Support and the corresponding unit tests into a separate patch Ack; I'll submit them separately, but I'm leaving them together in the patch for now, e.g. to show how the added glob support is needed by llvm-objcopy tests. In D66613#1643926 , @MaskRay wrote: > I just realized that you can remove the glob (I'd call it glob, not wildcard) change from this patch, and just use `lib/Support/Regex.cpp:GlobPattern`. > > `GlobPattern` is currently used by lld to do version script/dynamic list matching. In version scripts/dynamic lists, `[:` and `[=` are syntax error (ld.bfd), and I don't think anyone using `[.`. But to make it fully `fnmatch(pat, str, 0)` capable (in case someone uses character classes like `[[:digit:]]`), you can add these enhancement to a separate change. 
In D66613#1703550 , @evgeny777 wrote: > I wonder if you can use GlobPattern (llvm/Support/GlobPattern.h) or extend this class. One of the reasons we use it in lld is because of much better performance (see D26241 ), > This may not be the case for llvm-objcopy, but still using original pattern matcher looks nicer than translating to regexps. OK, removed the regex transformation and using glob instead. I decided to leave out character classes for now, so the changes to glob are actually pretty minimal. lld tests are still passing w/ these glob changes. I hope the test coverage is sufficient there. ================ Comment at: llvm/docs/CommandGuide/llvm-objcopy.rst:134 + + Allow wildcard syntax for symbol-related flags. On by default for + section-related flags. Incompatible with --regex. ---------------- MaskRay wrote: > > On by default for section-related flags. > > This is another thing I'm not sure if we want to do. > > I'd like users to specify `-w` to get wildcard semantics for section options. I checked the ruby/nacl/netbsd links and they all appear to use -w Those were just easy-to-find examples of wildcard usage. There are others using wildcards w/o `-w`, e.g. 
https://chromium.googlesource.com/chromiumos/platform/ec/+/refs/heads/master/Makefile.rules#66 ``` cmd_ec_elf_to_flat ?= $(OBJCOPY) --set-section-flags .roshared=share -R .dram* \ -O binary $< $@ cmd_ec_elf_to_flat_dram ?= $(OBJCOPY) -j .dram* -O binary $< $@ ``` Older kernel versions: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/drivers/firmware/efi/libstub/Makefile?h=linux-4.4.y#n57 ``` STUBCOPY_FLAGS-y := -R .debug* -R *ksymtab* -R *kcrctab* ``` Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D66613/new/ https://reviews.llvm.org/D66613 From llvm-commits at lists.llvm.org Fri Oct 11 16:51:35 2019 From: llvm-commits at lists.llvm.org (Jordan Rupprecht via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 23:51:35 +0000 (UTC) Subject: [PATCH] D66613: [support][llvm-objcopy] Add support for shell wildcards In-Reply-To: References: Message-ID: <398757af41f9b88d266e68e632b50a3e@localhost.localdomain> rupprecht updated this revision to Diff 224709. rupprecht marked 3 inline comments as done. rupprecht added a comment. - Use GlobPattern instead of Regex - Log a warning if the glob expression is invalid Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D66613/new/ https://reviews.llvm.org/D66613 Files: llvm/docs/CommandGuide/llvm-objcopy.rst llvm/docs/CommandGuide/llvm-strip.rst llvm/include/llvm/Support/GlobPattern.h llvm/lib/Support/GlobPattern.cpp llvm/test/tools/llvm-objcopy/ELF/wildcard-flags.test llvm/test/tools/llvm-objcopy/ELF/wildcard-syntax.test llvm/tools/llvm-objcopy/CommonOpts.td llvm/tools/llvm-objcopy/CopyConfig.cpp llvm/tools/llvm-objcopy/CopyConfig.h llvm/tools/llvm-objcopy/llvm-objcopy.cpp llvm/unittests/Support/GlobPatternTest.cpp -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D66613.224709.patch Type: text/x-patch Size: 40162 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Fri Oct 11 16:51:36 2019 From: llvm-commits at lists.llvm.org (Derek Schuff via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 23:51:36 +0000 (UTC) Subject: [PATCH] D68889: [WebAssembly] Allow multivalue types in block signature operands In-Reply-To: References: Message-ID: <6e20a5d834676da9c27ebce99de70d84@localhost.localdomain> dschuff added inline comments. ================ Comment at: llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyMCTargetDesc.h:135 + Exnref = unsigned(wasm::ValType::EXNREF), + // Will be lowered to match the function return type in MCInstLower + Multivalue = 0xffff, ---------------- The invariant that only fallthrough-return blocks are allowed to be multivalue should probably be restated here. edit: I guess it's not just fallthrough-return blocks, right? Just the last block, even if it has an explicit return? We should probably have some more tests for those cases. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68889/new/ https://reviews.llvm.org/D68889 From llvm-commits at lists.llvm.org Fri Oct 11 16:51:37 2019 From: llvm-commits at lists.llvm.org (Tim Shen via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 23:51:37 +0000 (UTC) Subject: [PATCH] D68892: [NVPTX] Restructure shfl instrinsics and add variants that return a predicate. In-Reply-To: References: Message-ID: <0b99f1707b7aa7366d4a37b9d7164f54@localhost.localdomain> timshen accepted this revision. timshen added inline comments. This revision is now accepted and ready to land. 
================ Comment at: llvm/include/llvm/IR/IntrinsicsNVVM.td:280 +class SHFL_INFO { + string Suffix = !if(sync, "sync_","") + # mode # "_" ---------------- nit: format `if(sync, "sync_","")` to `if(sync, "sync_", "")` ================ Comment at: llvm/include/llvm/IR/IntrinsicsNVVM.td:293 + !eq(type,"f32"): llvm_float_ty); + list RetTy = !listconcat( + [OpType], !if(return_pred, [llvm_i1_ty], [])); ---------------- Seems cleaner to just have `= !if(return_pred, [OpType, llvm_i1_ty], [OpType])`. ================ Comment at: llvm/include/llvm/IR/IntrinsicsNVVM.td:295 + [OpType], !if(return_pred, [llvm_i1_ty], [])); + list ArgsTy = !listconcat( + !if(sync, [llvm_i32_ty], []), ---------------- ditto. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68892/new/ https://reviews.llvm.org/D68892 From llvm-commits at lists.llvm.org Fri Oct 11 17:00:59 2019 From: llvm-commits at lists.llvm.org (Craig Topper via llvm-commits) Date: Sat, 12 Oct 2019 00:00:59 -0000 Subject: [llvm] r374614 - [X86] Add test case showing missing opportunity to fold vmovsdb into a store after type legalization. NFC Message-ID: <20191012000100.03B4B8A9A8@lists.llvm.org> Author: ctopper Date: Fri Oct 11 17:00:59 2019 New Revision: 374614 URL: http://llvm.org/viewvc/llvm-project?rev=374614&view=rev Log: [X86] Add test case showing missing opportunity to fold vmovsdb into a store after type legalization. 
NFC Modified: llvm/trunk/test/CodeGen/X86/avx512-trunc.ll Modified: llvm/trunk/test/CodeGen/X86/avx512-trunc.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/avx512-trunc.ll?rev=374614&r1=374613&r2=374614&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/avx512-trunc.ll (original) +++ llvm/trunk/test/CodeGen/X86/avx512-trunc.ll Fri Oct 11 17:00:59 2019 @@ -1044,3 +1044,23 @@ define void @negative_test2_smax_usat_tr store <16 x i8> %x6, <16 x i8>* %res, align 1 ret void } + +define void @ssat_trunc_db_1024_mem(<32 x i32> %i, <32 x i8>* %p) { +; ALL-LABEL: ssat_trunc_db_1024_mem: +; ALL: ## %bb.0: +; ALL-NEXT: vpmovsdb %zmm0, %xmm0 +; ALL-NEXT: vpmovsdb %zmm1, %xmm1 +; ALL-NEXT: vmovdqu %xmm1, 16(%rdi) +; ALL-NEXT: vmovdqu %xmm0, (%rdi) +; ALL-NEXT: vzeroupper +; ALL-NEXT: retq + %x1 = icmp sgt <32 x i32> %i, + %x2 = select <32 x i1> %x1, <32 x i32> %i, <32 x i32> + %x3 = icmp slt <32 x i32> %x2, + %x5 = select <32 x i1> %x3, <32 x i32> %x2, <32 x i32> + %x6 = trunc <32 x i32> %x5 to <32 x i8> + store <32 x i8>%x6, <32 x i8>* %p, align 1 + ret void +} + From llvm-commits at lists.llvm.org Fri Oct 11 17:01:08 2019 From: llvm-commits at lists.llvm.org (Craig Topper via llvm-commits) Date: Sat, 12 Oct 2019 00:01:08 -0000 Subject: [llvm] r374615 - [X86] Fold a VTRUNCS/VTRUNCUS+store into a saturating truncating store. Message-ID: <20191012000108.D4B8B934BB@lists.llvm.org> Author: ctopper Date: Fri Oct 11 17:01:08 2019 New Revision: 374615 URL: http://llvm.org/viewvc/llvm-project?rev=374615&view=rev Log: [X86] Fold a VTRUNCS/VTRUNCUS+store into a saturating truncating store. We already did this for VTRUNCUS with a specific combination of types. This extends this to VTRUNCS and handles any types where a truncating store is legal. 
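As a rough scalar model of what the saturating truncations in this log compute per lane, the sketch below narrows a 32-bit value to 8 bits with signed saturation (the behaviour of `vpmovsdb`) and with unsigned saturation (the behaviour of `vpmovusdb`). The function names are hypothetical and chosen only for illustration; the combine itself works on DAG nodes, not on scalar code like this.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical scalar model of a signed saturating truncation (i32 -> i8):
// clamp the value to [-128, 127] before narrowing.
int8_t SatTruncSigned(int32_t V) {
  int32_t Clamped =
      std::max<int32_t>(INT8_MIN, std::min<int32_t>(V, INT8_MAX));
  return static_cast<int8_t>(Clamped);
}

// Hypothetical scalar model of an unsigned saturating truncation
// (i32 -> i8): the input is treated as unsigned and clamped to 255.
uint8_t SatTruncUnsigned(uint32_t V) {
  return static_cast<uint8_t>(std::min<uint32_t>(V, UINT8_MAX));
}
```

Folding the truncation into the store means the clamp, the narrowing, and the memory write are done by a single instruction with a memory destination instead of a truncate followed by a separate vector store.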
Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp llvm/trunk/test/CodeGen/X86/avx512-trunc.ll Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=374615&r1=374614&r2=374615&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp (original) +++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Fri Oct 11 17:01:08 2019 @@ -40332,11 +40332,11 @@ static SDValue combineStore(SDNode *N, S TargetLowering::DAGCombinerInfo &DCI, const X86Subtarget &Subtarget) { StoreSDNode *St = cast(N); - EVT VT = St->getValue().getValueType(); EVT StVT = St->getMemoryVT(); SDLoc dl(St); unsigned Alignment = St->getAlignment(); - SDValue StoredVal = St->getOperand(1); + SDValue StoredVal = St->getValue(); + EVT VT = StoredVal.getValueType(); const TargetLowering &TLI = DAG.getTargetLoweringInfo(); // Convert a store of vXi1 into a store of iX and a bitcast. @@ -40453,17 +40453,15 @@ static SDValue combineStore(SDNode *N, S MVT::v16i8, St->getMemOperand()); } - // Try to fold a vpmovuswb 256->128 into a truncating store. - // FIXME: Generalize this to other types. - // FIXME: Do the same for signed saturation. - if (!St->isTruncatingStore() && VT == MVT::v16i8 && - St->getValue().getOpcode() == X86ISD::VTRUNCUS && - St->getValue().getOperand(0).getValueType() == MVT::v16i16 && - TLI.isTruncStoreLegal(MVT::v16i16, MVT::v16i8) && - St->getValue().hasOneUse()) { - return EmitTruncSStore(false /* Unsigned saturation */, St->getChain(), - dl, St->getValue().getOperand(0), St->getBasePtr(), - MVT::v16i8, St->getMemOperand(), DAG); + // Try to fold a VTRUNCUS or VTRUNCS into a truncating store. 
+ if (!St->isTruncatingStore() && StoredVal.hasOneUse() && + (StoredVal.getOpcode() == X86ISD::VTRUNCUS || + StoredVal.getOpcode() == X86ISD::VTRUNCS) && + TLI.isTruncStoreLegal(StoredVal.getOperand(0).getValueType(), VT)) { + bool IsSigned = StoredVal.getOpcode() == X86ISD::VTRUNCS; + return EmitTruncSStore(IsSigned, St->getChain(), + dl, StoredVal.getOperand(0), St->getBasePtr(), + VT, St->getMemOperand(), DAG); } // Optimize trunc store (of multiple scalars) to shuffle and store. Modified: llvm/trunk/test/CodeGen/X86/avx512-trunc.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/avx512-trunc.ll?rev=374615&r1=374614&r2=374615&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/avx512-trunc.ll (original) +++ llvm/trunk/test/CodeGen/X86/avx512-trunc.ll Fri Oct 11 17:01:08 2019 @@ -690,10 +690,8 @@ define <32 x i8> @usat_trunc_db_1024(<32 define void @usat_trunc_db_1024_mem(<32 x i32> %i, <32 x i8>* %p) { ; ALL-LABEL: usat_trunc_db_1024_mem: ; ALL: ## %bb.0: -; ALL-NEXT: vpmovusdb %zmm0, %xmm0 -; ALL-NEXT: vpmovusdb %zmm1, %xmm1 -; ALL-NEXT: vmovdqu %xmm1, 16(%rdi) -; ALL-NEXT: vmovdqu %xmm0, (%rdi) +; ALL-NEXT: vpmovusdb %zmm1, 16(%rdi) +; ALL-NEXT: vpmovusdb %zmm0, (%rdi) ; ALL-NEXT: vzeroupper ; ALL-NEXT: retq %x3 = icmp ult <32 x i32> %i, @@ -957,12 +955,10 @@ define void @smax_usat_trunc_db_1024_mem ; ALL-LABEL: smax_usat_trunc_db_1024_mem: ; ALL: ## %bb.0: ; ALL-NEXT: vpxor %xmm2, %xmm2, %xmm2 -; ALL-NEXT: vpmaxsd %zmm2, %zmm1, %zmm1 ; ALL-NEXT: vpmaxsd %zmm2, %zmm0, %zmm0 -; ALL-NEXT: vpmovusdb %zmm0, %xmm0 -; ALL-NEXT: vpmovusdb %zmm1, %xmm1 -; ALL-NEXT: vmovdqu %xmm1, 16(%rdi) -; ALL-NEXT: vmovdqu %xmm0, (%rdi) +; ALL-NEXT: vpmaxsd %zmm2, %zmm1, %zmm1 +; ALL-NEXT: vpmovusdb %zmm1, 16(%rdi) +; ALL-NEXT: vpmovusdb %zmm0, (%rdi) ; ALL-NEXT: vzeroupper ; ALL-NEXT: retq %x1 = icmp sgt <32 x i32> %i, @@ -1048,10 +1044,8 @@ define void @negative_test2_smax_usat_tr define 
void @ssat_trunc_db_1024_mem(<32 x i32> %i, <32 x i8>* %p) { ; ALL-LABEL: ssat_trunc_db_1024_mem: ; ALL: ## %bb.0: -; ALL-NEXT: vpmovsdb %zmm0, %xmm0 -; ALL-NEXT: vpmovsdb %zmm1, %xmm1 -; ALL-NEXT: vmovdqu %xmm1, 16(%rdi) -; ALL-NEXT: vmovdqu %xmm0, (%rdi) +; ALL-NEXT: vpmovsdb %zmm1, 16(%rdi) +; ALL-NEXT: vpmovsdb %zmm0, (%rdi) ; ALL-NEXT: vzeroupper ; ALL-NEXT: retq %x1 = icmp sgt <32 x i32> %i, gokturk created this revision. Herald added subscribers: llvm-commits, s.egerton, benna, psnobl, PkmX, rogfer01, shiva0217, kito-cheng, simoncook, mgorny. Herald added a project: LLVM. gokturk added reviewers: erichkeane, rengolin, mgorny. LLVM configuration fails with 'unable to guess system type' on riscv64. Add support for detecting riscv32 and riscv64 systems. Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D68899 Files: llvm/cmake/config.guess Index: llvm/cmake/config.guess =================================================================== --- llvm/cmake/config.guess +++ llvm/cmake/config.guess @@ -973,6 +973,30 @@ ppc:Linux:*:*) echo powerpc-unknown-linux-gnu exit ;; + riscv32:Linux:*:* | riscv64:Linux:*:*) + LIBC=gnu + eval $set_cc_for_build + # Do not check for __GLIBC__ because uclibc defines it too + sed 's/^ //' << EOF >$dummy.c + #include + #if defined(__UCLIBC__) + LIBC=uclibc + #elif defined(__dietlibc__) + LIBC=dietlibc + #endif +EOF + eval `$CC_FOR_BUILD -E $dummy.c 2>/dev/null | grep '^LIBC'` + + # There is no features test macro for musl + # Follow the GNU's config.guess approach of + # checking the output of ldd + if command -v ldd >/dev/null && \ + ldd --version 2>&1 | grep -q ^musl; then + LIBC=musl + fi + + echo ${UNAME_MACHINE}-unknown-linux-${LIBC} + exit ;; s390:Linux:*:* | s390x:Linux:*:*) echo ${UNAME_MACHINE}-ibm-linux exit ;; -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68899.224712.patch Type: text/x-patch Size: 964 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Fri Oct 11 17:01:01 2019 From: llvm-commits at lists.llvm.org (Hideki Saito via Phabricator via llvm-commits) Date: Sat, 12 Oct 2019 00:01:01 +0000 (UTC) Subject: [PATCH] D68651: [InstCombine] Signed saturation patterns In-Reply-To: References: Message-ID: hsaito added a comment. In D68651#1704375 , @kparzysz wrote: > I'm in favor of treating signed saturation as canonical. The issue in delaying detection of such cases to instruction selection is the volatility of the IR: there is no guarantee that the IR will remain in the same form (expected by isel) from one day to the next. For example, some optimization may decide to just promote the operations to the wider type and only do the extension/truncate once, depending on how many saturating operations may be near one another. Handling this variability in isel is just not feasible. I don't want to hijack this review. Let me have the rest of the discussion in the form of RFC on llvm-dev. Vector idiom discussions resulted in ~20 idioms. It would be nice if we can come up with a basic guideline on how to think and how to make a case for the new canonical form. For example, TI said they are interested in saturating mul. If saturating add/sub have a canonical form, I don't see why saturating mul should not. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68651/new/ https://reviews.llvm.org/D68651 From llvm-commits at lists.llvm.org Fri Oct 11 17:01:02 2019 From: llvm-commits at lists.llvm.org (Johannes Doerfert via Phabricator via llvm-commits) Date: Sat, 12 Oct 2019 00:01:02 +0000 (UTC) Subject: [PATCH] D68900: [SROA] Reuse existing lifetime markers if possible Message-ID: jdoerfert created this revision. jdoerfert added reviewers: reames, ssarda, t.p.northover, hfinkel. Herald added subscribers: bollu, hiraditya. Herald added a project: LLVM. 
If the underlying alloca did not change, we do not necessarily need new
lifetime markers. This patch adds a check and reuses the old ones if possible.

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D68900

Files:
  llvm/lib/Transforms/Scalar/SROA.cpp
  llvm/test/Transforms/SROA/reuse_lifetime_markers.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68900.224713.patch
Type: text/x-patch
Size: 4740 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Fri Oct 11 17:01:02 2019
From: llvm-commits at lists.llvm.org (Jordan Rupprecht via Phabricator via llvm-commits)
Date: Sat, 12 Oct 2019 00:01:02 +0000 (UTC)
Subject: [PATCH] D68886: Remove unnecessary codes in llvm-dwarfdump
In-Reply-To:
References:
Message-ID:

rupprecht added a reviewer: dblaikie.
rupprecht added a comment.

Does `check-all` pass with this change? I'd imagine these are necessary to
print target-specific information.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68886/new/

https://reviews.llvm.org/D68886

From llvm-commits at lists.llvm.org Fri Oct 11 17:10:35 2019
From: llvm-commits at lists.llvm.org (Wei Mi via Phabricator via llvm-commits)
Date: Sat, 12 Oct 2019 00:10:35 +0000 (UTC)
Subject: [PATCH] D68901: [SampleFDO] Add profile remapping support for profile on-demand loading used by ExtBinary format profile
Message-ID:

wmi created this revision.
wmi added reviewers: davidxl, rsmith, wenlei.
Herald added a subscriber: hiraditya.
Herald added a project: LLVM.

Profile on-demand loading was added for ExtBinary format profiles in
https://reviews.llvm.org/rL374233, but currently profile on-demand loading
doesn't work well with profile remapping. This patch adds that support.

Suppose a function in the current module has an outline instance in the
profile. The function name in the module is different from the name of the
outline instance, but the remapper knows the two names are equal. When loading
the profile on demand, the outline instance has to be loaded with the
remapper's help.

Before the patch, the steps to read the profile are as follows:

- create the profile reader
- the profile reader reads the profile
- create the profile remapper
- the remapper sets the underlying reader
- reset the profile reader to the remapper

With the patch, the steps to read the profile are changed to:

- create the profile reader
- create the profile remapper
- the profile reader sets the underlying remapper
- the profile reader reads the profile
- the remapper sets the underlying reader
- reset the profile reader to the remapper

Repository:
  rL LLVM

https://reviews.llvm.org/D68901

Files:
  llvm/include/llvm/ProfileData/SampleProfReader.h
  llvm/lib/ProfileData/SampleProfReader.cpp
  llvm/lib/Transforms/IPO/SampleProfile.cpp
  llvm/test/Transforms/SampleProfile/remap.ll
  llvm/unittests/ProfileData/SampleProfTest.cpp

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68901.224710.patch
Type: text/x-patch
Size: 15728 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Fri Oct 11 17:23:16 2019
From: llvm-commits at lists.llvm.org (Vedant Kumar via llvm-commits)
Date: Sat, 12 Oct 2019 00:23:16 -0000
Subject: [llvm] r374617 - [llvm-profdata] Make "malformed-ptr-to-counter-array.test" textual
Message-ID: <20191012002316.09865932F2@lists.llvm.org>

Author: vedantk
Date: Fri Oct 11 17:23:15 2019
New Revision: 374617

URL: http://llvm.org/viewvc/llvm-project?rev=374617&view=rev
Log:
[llvm-profdata] Make "malformed-ptr-to-counter-array.test" textual

As pointed out in https://reviews.llvm.org/D66979 post-commit, making this
test textual would make it more maintainable.
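To make the printf lines in this test easier to read, here is a minimal sketch of how a 64-bit header field maps to the escaped byte string passed to printf. It is not part of the commit: the helper name toPrintfEscapes is hypothetical, and it assumes the fields are written as little-endian 8-byte words, which is what the test's first line ('\201rforpl\377', the magic) implies.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Spell a 64-bit value as 8 little-endian bytes in printf(1) format-string
// notation: printable ASCII is kept as-is, everything else becomes a
// three-digit octal escape. Hypothetical helper, for illustration only.
std::string toPrintfEscapes(uint64_t Value) {
  std::string Out;
  for (int I = 0; I < 8; ++I) {
    unsigned Byte = static_cast<unsigned>((Value >> (8 * I)) & 0xff);
    // Characters with a special meaning to printf or the shell quoting are
    // always escaped.
    bool Printable = Byte >= 0x20 && Byte < 0x7f && Byte != '\\' &&
                     Byte != '%' && Byte != '\'';
    if (Printable) {
      Out.push_back(static_cast<char>(Byte));
    } else {
      char Buf[8];
      std::snprintf(Buf, sizeof(Buf), "\\%03o", Byte);
      Out += Buf;
    }
  }
  return Out;
}
```

For the magic value 0xff6c70726f667281 this yields \201rforpl\377, matching the first RUN line. The test itself writes shorter escapes such as \4 and \0 for small values; the three-digit form used here is equivalent and avoids ambiguity when a digit follows an escape.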
Differential Revision: https://reviews.llvm.org/D68718 Removed: llvm/trunk/test/tools/llvm-profdata/Inputs/malformed-ptr-to-counter-array.profraw Modified: llvm/trunk/test/tools/llvm-profdata/malformed-ptr-to-counter-array.test Removed: llvm/trunk/test/tools/llvm-profdata/Inputs/malformed-ptr-to-counter-array.profraw URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/llvm-profdata/Inputs/malformed-ptr-to-counter-array.profraw?rev=374616&view=auto ============================================================================== Binary file - no diff available. Modified: llvm/trunk/test/tools/llvm-profdata/malformed-ptr-to-counter-array.test URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/llvm-profdata/malformed-ptr-to-counter-array.test?rev=374617&r1=374616&r2=374617&view=diff ============================================================================== --- llvm/trunk/test/tools/llvm-profdata/malformed-ptr-to-counter-array.test (original) +++ llvm/trunk/test/tools/llvm-profdata/malformed-ptr-to-counter-array.test Fri Oct 11 17:23:15 2019 @@ -1,5 +1,50 @@ -REQUIRES: zlib +// Header +// +// INSTR_PROF_RAW_HEADER(uint64_t, Magic, __llvm_profile_get_magic()) +// INSTR_PROF_RAW_HEADER(uint64_t, Version, __llvm_profile_get_version()) +// INSTR_PROF_RAW_HEADER(uint64_t, DataSize, DataSize) +// INSTR_PROF_RAW_HEADER(uint64_t, CountersSize, CountersSize) +// INSTR_PROF_RAW_HEADER(uint64_t, NamesSize, NamesSize) +// INSTR_PROF_RAW_HEADER(uint64_t, CountersDelta, (uintptr_t)CountersBegin) +// INSTR_PROF_RAW_HEADER(uint64_t, NamesDelta, (uintptr_t)NamesBegin) +// INSTR_PROF_RAW_HEADER(uint64_t, ValueKindLast, IPVK_Last) -RUN: not llvm-profdata merge -o /dev/null %p/Inputs/malformed-ptr-to-counter-array.profraw 2>&1 | FileCheck %s +RUN: printf '\201rforpl\377' > %t.profraw +RUN: printf '\4\0\0\0\0\0\0\0' >> %t.profraw +RUN: printf '\1\0\0\0\0\0\0\0' >> %t.profraw +RUN: printf '\2\0\0\0\0\0\0\0' >> %t.profraw +RUN: printf '\10\0\0\0\0\0\0\0' >> %t.profraw 
+RUN: printf '\0\0\6\0\1\0\0\0' >> %t.profraw +RUN: printf '\0\0\6\0\2\0\0\0' >> %t.profraw +RUN: printf '\0\0\0\0\0\0\0\0' >> %t.profraw + +// Data Section +// +// struct ProfData { +// #define INSTR_PROF_DATA(Type, LLVMType, Name, Initializer) \ +// Type Name; +// #include "llvm/ProfileData/InstrProfData.inc" +// }; + +RUN: printf '\067\265\035\031\112\165\023\344' >> %t.profraw +RUN: printf '\02\0\0\0\0\0\0\0' >> %t.profraw + +// Note: The CounterPtr here is off-by-one. This should trigger a malformed profile error. +RUN: printf '\0\0\6\0\1\0\0\1' >> %t.profraw + +RUN: printf '\0\0\0\0\0\0\0\0' >> %t.profraw +RUN: printf '\0\0\0\0\0\0\0\0' >> %t.profraw +RUN: printf '\02\0\0\0\0\0\0\0' >> %t.profraw + +// Counter Section + +RUN: printf '\067\0\0\0\0\0\0\0' >> %t.profraw +RUN: printf '\101\0\0\0\0\0\0\0' >> %t.profraw + +// Name Section + +RUN: printf '\3\0bar\0\0\0' >> %t.profraw + +RUN: not llvm-profdata merge -o /dev/null %t.profraw 2>&1 | FileCheck %s CHECK: Malformed instrumentation profile data From llvm-commits at lists.llvm.org Fri Oct 11 17:27:13 2019 From: llvm-commits at lists.llvm.org (David Blaikie via llvm-commits) Date: Sat, 12 Oct 2019 00:27:13 -0000 Subject: [llvm] r374619 - DebugInfo: Fix msan use-of-uninitialized exposed by r374600 Message-ID: <20191012002713.292BE81F7A@lists.llvm.org> Author: dblaikie Date: Fri Oct 11 17:27:12 2019 New Revision: 374619 URL: http://llvm.org/viewvc/llvm-project?rev=374619&view=rev Log: DebugInfo: Fix msan use-of-uninitialized exposed by r374600 Modified: llvm/trunk/lib/DebugInfo/DWARF/DWARFDebugLoc.cpp Modified: llvm/trunk/lib/DebugInfo/DWARF/DWARFDebugLoc.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/DebugInfo/DWARF/DWARFDebugLoc.cpp?rev=374619&r1=374618&r2=374619&view=diff ============================================================================== --- llvm/trunk/lib/DebugInfo/DWARF/DWARFDebugLoc.cpp (original) +++ llvm/trunk/lib/DebugInfo/DWARF/DWARFDebugLoc.cpp Fri Oct 11 17:27:12 2019 @@ 
-90,6 +90,7 @@ DWARFDebugLoc::parseOneLocationList(cons uint64_t *Offset) { LocationList LL; LL.Offset = *Offset; + AddressSize = Data.getAddressSize(); DataExtractor::Cursor C(*Offset); // 2.6.2 Location Lists From llvm-commits at lists.llvm.org Fri Oct 11 17:27:38 2019 From: llvm-commits at lists.llvm.org (David Blaikie via llvm-commits) Date: Fri, 11 Oct 2019 17:27:38 -0700 Subject: [llvm] r374600 - DebugInfo: Use base address selection entries for debug_loc In-Reply-To: <20191011215241.E677993351@lists.llvm.org> References: <20191011215241.E677993351@lists.llvm.org> Message-ID: r374619 fixes a use-of-uninitialized detected by msan exposed by this patch. On Fri, Oct 11, 2019 at 2:50 PM David Blaikie via llvm-commits < llvm-commits at lists.llvm.org> wrote: > Author: dblaikie > Date: Fri Oct 11 14:52:41 2019 > New Revision: 374600 > > URL: http://llvm.org/viewvc/llvm-project?rev=374600&view=rev > Log: > DebugInfo: Use base address selection entries for debug_loc > > Unify the range and loc emission (for both DWARFv4 and DWARFv5 style > lists) and take advantage of that unification to use strategic base > addresses for loclists. 
> > Differential Revision: https://reviews.llvm.org/D68620 > > Modified: > llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.cpp > llvm/trunk/test/CodeGen/X86/debug-loclists.ll > llvm/trunk/test/DebugInfo/X86/sret.ll > > Modified: llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.cpp > URL: > http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.cpp?rev=374600&r1=374599&r2=374600&view=diff > > ============================================================================== > --- llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.cpp (original) > +++ llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.cpp Fri Oct 11 14:52:41 > 2019 > @@ -2293,14 +2293,121 @@ static MCSymbol *emitLoclistsTableHeader > return TableEnd; > } > > +template > +static void emitRangeList( > + DwarfDebug &DD, AsmPrinter *Asm, MCSymbol *Sym, const Ranges &R, > + const DwarfCompileUnit &CU, unsigned BaseAddressx, unsigned > OffsetPair, > + unsigned StartxLength, unsigned EndOfList, > + StringRef (*StringifyEnum)(unsigned), > + bool ShouldUseBaseAddress, > + PayloadEmitter EmitPayload) { > + > + auto Size = Asm->MAI->getCodePointerSize(); > + bool UseDwarf5 = DD.getDwarfVersion() >= 5; > + > + // Emit our symbol so we can find the beginning of the range. > + Asm->OutStreamer->EmitLabel(Sym); > + > + // Gather all the ranges that apply to the same section so they can > share > + // a base address entry. 
> + MapVector> > SectionRanges; > + > + for (const auto &Range : R) > + SectionRanges[&Range.Begin->getSection()].push_back(&Range); > + > + const MCSymbol *CUBase = CU.getBaseAddress(); > + bool BaseIsSet = false; > + for (const auto &P : SectionRanges) { > + auto *Base = CUBase; > + if (!Base && ShouldUseBaseAddress) { > + const MCSymbol *Begin = P.second.front()->Begin; > + const MCSymbol *NewBase = DD.getSectionLabel(&Begin->getSection()); > + if (!UseDwarf5) { > + Base = NewBase; > + BaseIsSet = true; > + Asm->OutStreamer->EmitIntValue(-1, Size); > + Asm->OutStreamer->AddComment(" base address"); > + Asm->OutStreamer->EmitSymbolValue(Base, Size); > + } else if (NewBase != Begin || P.second.size() > 1) { > + // Only use a base address if > + // * the existing pool address doesn't match (NewBase != Begin) > + // * or, there's more than one entry to share the base address > + Base = NewBase; > + BaseIsSet = true; > + Asm->OutStreamer->AddComment(StringifyEnum(BaseAddressx)); > + Asm->emitInt8(BaseAddressx); > + Asm->OutStreamer->AddComment(" base address index"); > + Asm->EmitULEB128(DD.getAddressPool().getIndex(Base)); > + } > + } else if (BaseIsSet && !UseDwarf5) { > + BaseIsSet = false; > + assert(!Base); > + Asm->OutStreamer->EmitIntValue(-1, Size); > + Asm->OutStreamer->EmitIntValue(0, Size); > + } > + > + for (const auto *RS : P.second) { > + const MCSymbol *Begin = RS->Begin; > + const MCSymbol *End = RS->End; > + assert(Begin && "Range without a begin symbol?"); > + assert(End && "Range without an end symbol?"); > + if (Base) { > + if (UseDwarf5) { > + // Emit offset_pair when we have a base. 
> + Asm->OutStreamer->AddComment(StringifyEnum(OffsetPair)); > + Asm->emitInt8(OffsetPair); > + Asm->OutStreamer->AddComment(" starting offset"); > + Asm->EmitLabelDifferenceAsULEB128(Begin, Base); > + Asm->OutStreamer->AddComment(" ending offset"); > + Asm->EmitLabelDifferenceAsULEB128(End, Base); > + } else { > + Asm->EmitLabelDifference(Begin, Base, Size); > + Asm->EmitLabelDifference(End, Base, Size); > + } > + } else if (UseDwarf5) { > + Asm->OutStreamer->AddComment(StringifyEnum(StartxLength)); > + Asm->emitInt8(StartxLength); > + Asm->OutStreamer->AddComment(" start index"); > + Asm->EmitULEB128(DD.getAddressPool().getIndex(Begin)); > + Asm->OutStreamer->AddComment(" length"); > + Asm->EmitLabelDifferenceAsULEB128(End, Begin); > + } else { > + Asm->OutStreamer->EmitSymbolValue(Begin, Size); > + Asm->OutStreamer->EmitSymbolValue(End, Size); > + } > + EmitPayload(*RS); > + } > + } > + > + if (UseDwarf5) { > + Asm->OutStreamer->AddComment(StringifyEnum(EndOfList)); > + Asm->emitInt8(EndOfList); > + } else { > + // Terminate the list with two 0 values. > + Asm->OutStreamer->EmitIntValue(0, Size); > + Asm->OutStreamer->EmitIntValue(0, Size); > + } > +} > + > +static void emitLocList(DwarfDebug &DD, AsmPrinter *Asm, const > DebugLocStream::List &List) { > + emitRangeList( > + DD, Asm, List.Label, DD.getDebugLocs().getEntries(List), *List.CU, > + dwarf::DW_LLE_base_addressx, dwarf::DW_LLE_offset_pair, > + dwarf::DW_LLE_startx_length, dwarf::DW_LLE_end_of_list, > + llvm::dwarf::LocListEncodingString, > + /* ShouldUseBaseAddress */ true, > + [&](const DebugLocStream::Entry &E) { > + DD.emitDebugLocEntryLocation(E, List.CU); > + }); > +} > + > // Emit locations into the .debug_loc/.debug_rnglists section. 
> void DwarfDebug::emitDebugLoc() { > if (DebugLocs.getLists().empty()) > return; > > - bool IsLocLists = getDwarfVersion() >= 5; > MCSymbol *TableEnd = nullptr; > - if (IsLocLists) { > + if (getDwarfVersion() >= 5) { > Asm->OutStreamer->SwitchSection( > Asm->getObjFileLowering().getDwarfLoclistsSection()); > TableEnd = emitLoclistsTableHeader(Asm, useSplitDwarf() ? > SkeletonHolder > @@ -2310,63 +2417,8 @@ void DwarfDebug::emitDebugLoc() { > Asm->getObjFileLowering().getDwarfLocSection()); > } > > - unsigned char Size = Asm->MAI->getCodePointerSize(); > - for (const auto &List : DebugLocs.getLists()) { > - Asm->OutStreamer->EmitLabel(List.Label); > - > - const DwarfCompileUnit *CU = List.CU; > - const MCSymbol *Base = CU->getBaseAddress(); > - for (const auto &Entry : DebugLocs.getEntries(List)) { > - if (Base) { > - // Set up the range. This range is relative to the entry point of > the > - // compile unit. This is a hard coded 0 for low_pc when we're > emitting > - // ranges, or the DW_AT_low_pc on the compile unit otherwise. > - if (IsLocLists) { > - Asm->OutStreamer->AddComment("DW_LLE_offset_pair"); > - Asm->OutStreamer->EmitIntValue(dwarf::DW_LLE_offset_pair, 1); > - Asm->OutStreamer->AddComment(" starting offset"); > - Asm->EmitLabelDifferenceAsULEB128(Entry.Begin, Base); > - Asm->OutStreamer->AddComment(" ending offset"); > - Asm->EmitLabelDifferenceAsULEB128(Entry.End, Base); > - } else { > - Asm->EmitLabelDifference(Entry.Begin, Base, Size); > - Asm->EmitLabelDifference(Entry.End, Base, Size); > - } > - > - emitDebugLocEntryLocation(Entry, CU); > - continue; > - } > - > - // We have no base address. > - if (IsLocLists) { > - // TODO: Use DW_LLE_base_addressx + DW_LLE_offset_pair, or > - // DW_LLE_startx_length in case if there is only a single range. > - // That should reduce the size of the debug data emited. > - // For now just use the DW_LLE_startx_length for all cases. 
> - Asm->OutStreamer->AddComment("DW_LLE_startx_length"); > - Asm->emitInt8(dwarf::DW_LLE_startx_length); > - Asm->OutStreamer->AddComment(" start idx"); > - Asm->EmitULEB128(AddrPool.getIndex(Entry.Begin)); > - Asm->OutStreamer->AddComment(" length"); > - Asm->EmitLabelDifferenceAsULEB128(Entry.End, Entry.Begin); > - } else { > - Asm->OutStreamer->EmitSymbolValue(Entry.Begin, Size); > - Asm->OutStreamer->EmitSymbolValue(Entry.End, Size); > - } > - > - emitDebugLocEntryLocation(Entry, CU); > - } > - > - if (IsLocLists) { > - // .debug_loclists section ends with DW_LLE_end_of_list. > - Asm->OutStreamer->AddComment("DW_LLE_end_of_list"); > - Asm->OutStreamer->EmitIntValue(dwarf::DW_LLE_end_of_list, 1); > - } else { > - // Terminate the .debug_loc list with two 0 values. > - Asm->OutStreamer->EmitIntValue(0, Size); > - Asm->OutStreamer->EmitIntValue(0, Size); > - } > - } > + for (const auto &List : DebugLocs.getLists()) > + emitLocList(*this, Asm, List); > > if (TableEnd) > Asm->OutStreamer->EmitLabel(TableEnd); > @@ -2556,103 +2608,16 @@ void DwarfDebug::emitDebugARanges() { > } > } > > -template > -static void emitRangeList(DwarfDebug &DD, AsmPrinter *Asm, MCSymbol *Sym, > - const Ranges &R, const DwarfCompileUnit &CU, > - unsigned BaseAddressx, unsigned OffsetPair, > - unsigned StartxLength, unsigned EndOfList, > - StringRef (*StringifyEnum)(unsigned)) { > - auto DwarfVersion = DD.getDwarfVersion(); > - // Emit our symbol so we can find the beginning of the range. > - Asm->OutStreamer->EmitLabel(Sym); > - // Gather all the ranges that apply to the same section so they can > share > - // a base address entry. > - MapVector> > SectionRanges; > - // Size for our labels. 
> - auto Size = Asm->MAI->getCodePointerSize(); > - > - for (const RangeSpan &Range : R) > - SectionRanges[&Range.Begin->getSection()].push_back(&Range); > - > - const MCSymbol *CUBase = CU.getBaseAddress(); > - bool BaseIsSet = false; > - for (const auto &P : SectionRanges) { > - // Don't bother with a base address entry if there's only one range in > - // this section in this range list - for example ranges for a CU will > - // usually consist of single regions from each of many sections > - // (-ffunction-sections, or just C++ inline functions) except under > LTO > - // or optnone where there may be holes in a single CU's section > - // contributions. > - auto *Base = CUBase; > - if (!Base && (P.second.size() > 1 || DwarfVersion < 5) && > - (CU.getCUNode()->getRangesBaseAddress() || DwarfVersion >= 5)) { > - BaseIsSet = true; > - Base = DD.getSectionLabel(&P.second.front()->Begin->getSection()); > - if (DwarfVersion >= 5) { > - Asm->OutStreamer->AddComment(StringifyEnum(BaseAddressx)); > - Asm->OutStreamer->EmitIntValue(BaseAddressx, 1); > - Asm->OutStreamer->AddComment(" base address index"); > - Asm->EmitULEB128(DD.getAddressPool().getIndex(Base)); > - } else { > - Asm->OutStreamer->EmitIntValue(-1, Size); > - Asm->OutStreamer->AddComment(" base address"); > - Asm->OutStreamer->EmitSymbolValue(Base, Size); > - } > - } else if (BaseIsSet && DwarfVersion < 5) { > - BaseIsSet = false; > - assert(!Base); > - Asm->OutStreamer->EmitIntValue(-1, Size); > - Asm->OutStreamer->EmitIntValue(0, Size); > - } > - > - for (const auto *RS : P.second) { > - const MCSymbol *Begin = RS->Begin; > - const MCSymbol *End = RS->End; > - assert(Begin && "Range without a begin symbol?"); > - assert(End && "Range without an end symbol?"); > - if (Base) { > - if (DwarfVersion >= 5) { > - // Emit DW_RLE_offset_pair when we have a base. 
> - Asm->OutStreamer->AddComment(StringifyEnum(OffsetPair)); > - Asm->emitInt8(OffsetPair); > - Asm->OutStreamer->AddComment(" starting offset"); > - Asm->EmitLabelDifferenceAsULEB128(Begin, Base); > - Asm->OutStreamer->AddComment(" ending offset"); > - Asm->EmitLabelDifferenceAsULEB128(End, Base); > - } else { > - Asm->EmitLabelDifference(Begin, Base, Size); > - Asm->EmitLabelDifference(End, Base, Size); > - } > - } else if (DwarfVersion >= 5) { > - Asm->OutStreamer->AddComment(StringifyEnum(StartxLength)); > - Asm->emitInt8(StartxLength); > - Asm->OutStreamer->AddComment(" start index"); > - Asm->EmitULEB128(DD.getAddressPool().getIndex(Begin)); > - Asm->OutStreamer->AddComment(" length"); > - Asm->EmitLabelDifferenceAsULEB128(End, Begin); > - } else { > - Asm->OutStreamer->EmitSymbolValue(Begin, Size); > - Asm->OutStreamer->EmitSymbolValue(End, Size); > - } > - } > - } > - if (DwarfVersion >= 5) { > - Asm->OutStreamer->AddComment(StringifyEnum(EndOfList)); > - Asm->emitInt8(EndOfList); > - } else { > - // Terminate the list with two 0 values. > - Asm->OutStreamer->EmitIntValue(0, Size); > - Asm->OutStreamer->EmitIntValue(0, Size); > - } > -} > - > /// Emit a single range list. We handle both DWARF v5 and earlier. 
> static void emitRangeList(DwarfDebug &DD, AsmPrinter *Asm, > const RangeSpanList &List) { > emitRangeList(DD, Asm, List.getSym(), List.getRanges(), List.getCU(), > dwarf::DW_RLE_base_addressx, dwarf::DW_RLE_offset_pair, > dwarf::DW_RLE_startx_length, dwarf::DW_RLE_end_of_list, > - llvm::dwarf::RangeListEncodingString); > + llvm::dwarf::RangeListEncodingString, > + List.getCU().getCUNode()->getRangesBaseAddress() || > + DD.getDwarfVersion() >= 5, > + [](auto) {}); > } > > static void emitDebugRangesImpl(DwarfDebug &DD, AsmPrinter *Asm, > > Modified: llvm/trunk/test/CodeGen/X86/debug-loclists.ll > URL: > http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/debug-loclists.ll?rev=374600&r1=374599&r2=374600&view=diff > > ============================================================================== > --- llvm/trunk/test/CodeGen/X86/debug-loclists.ll (original) > +++ llvm/trunk/test/CodeGen/X86/debug-loclists.ll Fri Oct 11 14:52:41 2019 > @@ -1,144 +1,119 @@ > -; RUN: llc -mtriple=x86_64-pc-linux -filetype=obj -o %t < %s > -; RUN: llvm-dwarfdump -v %t | FileCheck %s > +; RUN: llc -mtriple=x86_64-pc-linux -filetype=obj -function-sections -o > %t < %s > +; RUN: llvm-dwarfdump -v -debug-info -debug-loclists %t | FileCheck %s > > -; CHECK: 0x00000033: DW_TAG_formal_parameter [3] > -; CHECK-NEXT: DW_AT_location [DW_FORM_sec_offset] > (0x0000000c > -; CHECK-NEXT: [0x0000000000000000, 0x0000000000000004): > DW_OP_breg5 RDI+0 > -; CHECK-NEXT: [0x0000000000000004, 0x0000000000000012): > DW_OP_breg3 RBX+0) > -; CHECK-NEXT: DW_AT_name [DW_FORM_strx1] (indexed > (0000000e) string = "a") > -; CHECK-NEXT: DW_AT_decl_file [DW_FORM_data1] > ("/home/folder{{\\|\/}}test.cc") > -; CHECK-NEXT: DW_AT_decl_line [DW_FORM_data1] (6) > -; CHECK-NEXT: DW_AT_type [DW_FORM_ref4] (cu + 0x0040 => > {0x00000040} "A") > +; CHECK: DW_TAG_variable > +; FIXME: Use DW_FORM_loclistx to reduce relocations > +; CHECK-NEXT: DW_AT_location [DW_FORM_sec_offset] (0x0000000c > +; CHECK-NEXT: 
[0x0000000000000000, 0x0000000000000003): DW_OP_consts > +3, DW_OP_stack_value > +; CHECK-NEXT: [0x0000000000000003, 0x0000000000000004): DW_OP_consts > +4, DW_OP_stack_value) > +; CHECK-NEXT: DW_AT_name {{.*}} "y" > + > +; CHECK: DW_TAG_variable > +; FIXME: Use DW_FORM_loclistx to reduce relocations > +; CHECK-NEXT: DW_AT_location [DW_FORM_sec_offset] (0x0000001d > +; CHECK-NEXT: Addr idx 0 (w/ length 3): DW_OP_consts +5, > DW_OP_stack_value) > +; CHECK-NEXT: DW_AT_name {{.*}} "x" > + > +; CHECK: DW_TAG_variable > +; FIXME: Use DW_FORM_loclistx to reduce relocations > +; CHECK-NEXT: DW_AT_location [DW_FORM_sec_offset] (0x00000025 > +; CHECK-NEXT: [0x0000000000000003, 0x0000000000000004): DW_OP_reg0 > RAX) > +; CHECK-NEXT: DW_AT_name {{.*}} "r" > > ; CHECK: .debug_loclists contents: > -; CHECK-NEXT: 0x00000000: locations list header: length = 0x00000015, > version = 0x0005, addr_size = 0x08, seg_size = 0x00, offset_entry_count = > 0x00000000 > -; CHECK-NEXT: 0x0000000c: > -; CHECK-NEXT: DW_LLE_offset_pair(0x0000000000000000, 0x0000000000000004) > -; CHECK-NEXT: => [0x0000000000000000, 0x0000000000000004): > DW_OP_breg5 RDI+0 > -; CHECK-NEXT: DW_LLE_offset_pair(0x0000000000000004, 0x0000000000000012) > -; CHECK-NEXT: => [0x0000000000000004, 0x0000000000000012): > DW_OP_breg3 RBX+0 > - > -; There is no way to use llvm-dwarfdump atm (2018, october) to verify the > DW_LLE_* codes emited, > -; because dumper is not yet implements that. Use asm code to do this > check instead. 
> -; > -; RUN: llc -mtriple=x86_64-pc-linux -filetype=asm < %s -o - | FileCheck > %s --check-prefix=ASM > -; ASM: .section .debug_loclists,"", at progbits > -; ASM-NEXT: .long .Ldebug_loclist_table_end0-.Ldebug_loclist_table_start0 > # Length > -; ASM-NEXT: .Ldebug_loclist_table_start0: > -; ASM-NEXT: .short 5 # Version > -; ASM-NEXT: .byte 8 # Address size > -; ASM-NEXT: .byte 0 # Segment selector size > -; ASM-NEXT: .long 0 # Offset entry count > -; ASM-NEXT: .Lloclists_table_base0: > -; ASM-NEXT: .Ldebug_loc0: > -; ASM-NEXT: .byte 4 # DW_LLE_offset_pair > -; ASM-NEXT: .uleb128 .Lfunc_begin0-.Lfunc_begin0 # starting offset > -; ASM-NEXT: .uleb128 .Ltmp0-.Lfunc_begin0 # ending offset > -; ASM-NEXT: .byte 2 # Loc expr size > -; ASM-NEXT: .byte 117 # DW_OP_breg5 > -; ASM-NEXT: .byte 0 # 0 > -; ASM-NEXT: .byte 4 # DW_LLE_offset_pair > -; ASM-NEXT: .uleb128 .Ltmp0-.Lfunc_begin0 # starting offset > -; ASM-NEXT: .uleb128 .Ltmp1-.Lfunc_begin0 # ending offset > -; ASM-NEXT: .byte 2 # Loc expr size > -; ASM-NEXT: .byte 115 # DW_OP_breg3 > -; ASM-NEXT: .byte 0 # 0 > -; ASM-NEXT: .byte 0 # DW_LLE_end_of_list > -; ASM-NEXT: .Ldebug_loclist_table_end0: > - > -; ModuleID = 'test.cc' > -source_filename = "test.cc" > -target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128" > -target triple = "x86_64-unknown-linux-gnu" > - > -%struct.A = type { i32 (...)** } > - > - at _ZTV1A = dso_local unnamed_addr constant { [4 x i8*] } { [4 x i8*] [i8* > null, i8* bitcast ({ i8*, i8* }* @_ZTI1A to i8*), i8* bitcast (void > (%struct.A*)* @_ZN1A3fooEv to i8*), i8* bitcast (void (%struct.A*)* > @_ZN1A3barEv to i8*)] }, align 8 > - at _ZTVN10__cxxabiv117__class_type_infoE = external dso_local global i8* > - at _ZTS1A = dso_local constant [3 x i8] c"1A\00", align 1 > - at _ZTI1A = dso_local constant { i8*, i8* } { i8* bitcast (i8** > getelementptr inbounds (i8*, i8** @_ZTVN10__cxxabiv117__class_type_infoE, > i64 2) to i8*), i8* getelementptr inbounds ([3 x i8], [3 x i8]* @_ZTS1A, > i32 0, i32 
0) }, align 8 > - > -; Function Attrs: noinline optnone uwtable > -define dso_local void @_Z3baz1A(%struct.A* %a) #0 !dbg !7 { > -entry: > - call void @llvm.dbg.declare(metadata %struct.A* %a, metadata !23, > metadata !DIExpression()), !dbg !24 > - call void @_ZN1A3fooEv(%struct.A* %a), !dbg !25 > - call void @_ZN1A3barEv(%struct.A* %a), !dbg !26 > - ret void, !dbg !27 > -} > +; CHECK-NEXT: 0x00000000: locations list header: length = 0x00000029, > version = 0x0005, addr_size = 0x08, seg_size = 0x00, offset_entry_count = > 0x00000000 > > -; Function Attrs: nounwind readnone speculatable > -declare void @llvm.dbg.declare(metadata, metadata, metadata) #1 > +; Don't use startx_length if there's more than one entry, because the > shared > +; base address will be useful for both the range that does start at the > start of > +; the function, and the one that doesn't. > > -; Function Attrs: noinline nounwind optnone uwtable > -define dso_local void @_ZN1A3fooEv(%struct.A* %this) unnamed_addr #2 > align 2 !dbg !28 { > -entry: > - %this.addr = alloca %struct.A*, align 8 > - store %struct.A* %this, %struct.A** %this.addr, align 8 > - call void @llvm.dbg.declare(metadata %struct.A** %this.addr, metadata > !29, metadata !DIExpression()), !dbg !31 > - %this1 = load %struct.A*, %struct.A** %this.addr, align 8 > - ret void, !dbg !32 > -} > +; CHECK-NEXT: 0x0000000c: > +; CHECK-NEXT: DW_LLE_base_addressx(0x0000000000000000) > +; CHECK-NEXT: DW_LLE_offset_pair (0x0000000000000000, > 0x0000000000000003) > +; CHECK-NEXT: => [0x0000000000000000, > 0x0000000000000003): DW_OP_consts +3, DW_OP_stack_value > +; CHECK-NEXT: DW_LLE_offset_pair (0x0000000000000003, > 0x0000000000000004) > +; CHECK-NEXT: => [0x0000000000000003, > 0x0000000000000004): DW_OP_consts +4, DW_OP_stack_value > +; CHECK-NEXT: DW_LLE_end_of_list () > + > +; Show that startx_length can be used when the address range starts at > the start of the function. 
> + > +; CHECK: 0x0000001d: > +; CHECK-NEXT: DW_LLE_startx_length(0x0000000000000000, > 0x0000000000000003) > +; CHECK-NEXT: => Addr idx 0 (w/ length 3): > DW_OP_consts +5, DW_OP_stack_value > +; CHECK-NEXT: DW_LLE_end_of_list () > + > +; And use a base address when the range doesn't start at an > existing/useful > +; address in the pool. > + > +; CHECK: 0x00000025: > +; CHECK-NEXT: DW_LLE_base_addressx(0x0000000000000000) > +; CHECK-NEXT: DW_LLE_offset_pair (0x0000000000000003, > 0x0000000000000004) > +; CHECK-NEXT: => [0x0000000000000003, > 0x0000000000000004): DW_OP_reg0 RAX > +; CHECK-NEXT: DW_LLE_end_of_list () > + > +; Built with clang -O3 -ffunction-sections from source: > +; > +; int f1(int i, int j) { > +; int x = 5; > +; int y = 3; > +; int r = i + j; > +; int undef; > +; x = undef; > +; y = 4; > +; return r; > +; } > +; void f2() { > +; } > > -; Function Attrs: noinline nounwind optnone uwtable > -define dso_local void @_ZN1A3barEv(%struct.A* %this) unnamed_addr #2 > align 2 !dbg !33 { > +; Function Attrs: norecurse nounwind readnone uwtable > +define dso_local i32 @_Z2f1ii(i32 %i, i32 %j) local_unnamed_addr !dbg !7 { > entry: > - %this.addr = alloca %struct.A*, align 8 > - store %struct.A* %this, %struct.A** %this.addr, align 8 > - call void @llvm.dbg.declare(metadata %struct.A** %this.addr, metadata > !34, metadata !DIExpression()), !dbg !35 > - %this1 = load %struct.A*, %struct.A** %this.addr, align 8 > - ret void, !dbg !36 > + call void @llvm.dbg.value(metadata i32 %i, metadata !12, metadata > !DIExpression()), !dbg !18 > + call void @llvm.dbg.value(metadata i32 %j, metadata !13, metadata > !DIExpression()), !dbg !18 > + call void @llvm.dbg.value(metadata i32 5, metadata !14, metadata > !DIExpression()), !dbg !18 > + call void @llvm.dbg.value(metadata i32 3, metadata !15, metadata > !DIExpression()), !dbg !18 > + %add = add nsw i32 %j, %i, !dbg !19 > + call void @llvm.dbg.value(metadata i32 %add, metadata !16, metadata > !DIExpression()), !dbg !18 > 
+ call void @llvm.dbg.value(metadata i32 undef, metadata !14, metadata > !DIExpression()), !dbg !18 > + call void @llvm.dbg.value(metadata i32 4, metadata !15, metadata > !DIExpression()), !dbg !18 > + ret i32 %add, !dbg !20 > } > > -; Function Attrs: noinline norecurse nounwind optnone uwtable > -define dso_local i32 @main() #3 !dbg !37 { > +; Function Attrs: norecurse nounwind readnone uwtable > +define dso_local void @_Z2f2v() local_unnamed_addr !dbg !21 { > entry: > - %retval = alloca i32, align 4 > - store i32 0, i32* %retval, align 4 > - ret i32 0, !dbg !38 > + ret void, !dbg !24 > } > > +; Function Attrs: nounwind readnone speculatable willreturn > +declare void @llvm.dbg.value(metadata, metadata, metadata) > > !llvm.dbg.cu = !{!0} > !llvm.module.flags = !{!3, !4, !5} > !llvm.ident = !{!6} > > -!0 = distinct !DICompileUnit(language: DW_LANG_C_plus_plus, file: !1, > producer: "clang version 8.0.0 (trunk 344035)", isOptimized: false, > runtimeVersion: 0, emissionKind: FullDebug, enums: !2, nameTableKind: None) > -!1 = !DIFile(filename: "test.cc", directory: "/home/folder", > checksumkind: CSK_MD5, checksum: "e0f357ad6dcb791a774a0dae55baf5e7") > +!0 = distinct !DICompileUnit(language: DW_LANG_C_plus_plus_14, file: !1, > producer: "clang version 10.0.0 (trunk 374581) (llvm/trunk 374579)", > isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, enums: !2, > nameTableKind: None) > +!1 = !DIFile(filename: "loc2.cpp", directory: > "/usr/local/google/home/blaikie/dev/scratch", checksumkind: CSK_MD5, > checksum: "91e0069c680e2a63f4f885ec93f5d07e") > !2 = !{} > !3 = !{i32 2, !"Dwarf Version", i32 5} > !4 = !{i32 2, !"Debug Info Version", i32 3} > !5 = !{i32 1, !"wchar_size", i32 4} > -!6 = !{!"clang version 8.0.0 (trunk 344035)"} > -!7 = distinct !DISubprogram(name: "baz", linkageName: "_Z3baz1A", scope: > !1, file: !1, line: 6, type: !8, isLocal: false, isDefinition: true, > scopeLine: 6, flags: DIFlagPrototyped, isOptimized: false, unit: !0, > retainedNodes: 
!2) > +!6 = !{!"clang version 10.0.0 (trunk 374581) (llvm/trunk 374579)"} > +!7 = distinct !DISubprogram(name: "f1", linkageName: "_Z2f1ii", scope: > !1, file: !1, line: 1, type: !8, scopeLine: 1, flags: DIFlagPrototyped | > DIFlagAllCallsDescribed, spFlags: DISPFlagDefinition | DISPFlagOptimized, > unit: !0, retainedNodes: !11) > !8 = !DISubroutineType(types: !9) > -!9 = !{null, !10} > -!10 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "A", > file: !1, line: 1, size: 64, flags: DIFlagTypePassByReference, elements: > !11, vtableHolder: !10, identifier: "_ZTS1A") > -!11 = !{!12, !18, !22} > -!12 = !DIDerivedType(tag: DW_TAG_member, name: "_vptr$A", scope: !1, > file: !1, baseType: !13, size: 64, flags: DIFlagArtificial) > -!13 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !14, size: 64) > -!14 = !DIDerivedType(tag: DW_TAG_pointer_type, name: "__vtbl_ptr_type", > baseType: !15, size: 64) > -!15 = !DISubroutineType(types: !16) > -!16 = !{!17} > -!17 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed) > -!18 = !DISubprogram(name: "foo", linkageName: "_ZN1A3fooEv", scope: !10, > file: !1, line: 2, type: !19, isLocal: false, isDefinition: false, > scopeLine: 2, containingType: !10, virtuality: DW_VIRTUALITY_virtual, > virtualIndex: 0, flags: DIFlagPrototyped, isOptimized: false) > -!19 = !DISubroutineType(types: !20) > -!20 = !{null, !21} > -!21 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !10, size: 64, > flags: DIFlagArtificial | DIFlagObjectPointer) > -!22 = !DISubprogram(name: "bar", linkageName: "_ZN1A3barEv", scope: !10, > file: !1, line: 3, type: !19, isLocal: false, isDefinition: false, > scopeLine: 3, containingType: !10, virtuality: DW_VIRTUALITY_virtual, > virtualIndex: 1, flags: DIFlagPrototyped, isOptimized: false) > -!23 = !DILocalVariable(name: "a", arg: 1, scope: !7, file: !1, line: 6, > type: !10) > -!24 = !DILocation(line: 6, column: 19, scope: !7) > -!25 = !DILocation(line: 7, column: 6, scope: !7) > -!26 = 
!DILocation(line: 8, column: 6, scope: !7) > -!27 = !DILocation(line: 9, column: 1, scope: !7) > -!28 = distinct !DISubprogram(name: "foo", linkageName: "_ZN1A3fooEv", > scope: !10, file: !1, line: 12, type: !19, isLocal: false, isDefinition: > true, scopeLine: 12, flags: DIFlagPrototyped, isOptimized: false, unit: !0, > declaration: !18, retainedNodes: !2) > -!29 = !DILocalVariable(name: "this", arg: 1, scope: !28, type: !30, > flags: DIFlagArtificial | DIFlagObjectPointer) > -!30 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !10, size: 64) > -!31 = !DILocation(line: 0, scope: !28) > -!32 = !DILocation(line: 12, column: 16, scope: !28) > -!33 = distinct !DISubprogram(name: "bar", linkageName: "_ZN1A3barEv", > scope: !10, file: !1, line: 13, type: !19, isLocal: false, isDefinition: > true, scopeLine: 13, flags: DIFlagPrototyped, isOptimized: false, unit: !0, > declaration: !22, retainedNodes: !2) > -!34 = !DILocalVariable(name: "this", arg: 1, scope: !33, type: !30, > flags: DIFlagArtificial | DIFlagObjectPointer) > -!35 = !DILocation(line: 0, scope: !33) > -!36 = !DILocation(line: 13, column: 16, scope: !33) > -!37 = distinct !DISubprogram(name: "main", scope: !1, file: !1, line: 15, > type: !15, isLocal: false, isDefinition: true, scopeLine: 15, flags: > DIFlagPrototyped, isOptimized: false, unit: !0, retainedNodes: !2) > -!38 = !DILocation(line: 16, column: 3, scope: !37) > +!9 = !{!10, !10, !10} > +!10 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed) > +!11 = !{!12, !13, !14, !15, !16, !17} > +!12 = !DILocalVariable(name: "i", arg: 1, scope: !7, file: !1, line: 1, > type: !10) > +!13 = !DILocalVariable(name: "j", arg: 2, scope: !7, file: !1, line: 1, > type: !10) > +!14 = !DILocalVariable(name: "x", scope: !7, file: !1, line: 2, type: !10) > +!15 = !DILocalVariable(name: "y", scope: !7, file: !1, line: 3, type: !10) > +!16 = !DILocalVariable(name: "r", scope: !7, file: !1, line: 4, type: !10) > +!17 = !DILocalVariable(name: "undef", scope: 
!7, file: !1, line: 5, type: > !10) > +!18 = !DILocation(line: 0, scope: !7) > +!19 = !DILocation(line: 4, column: 13, scope: !7) > +!20 = !DILocation(line: 8, column: 3, scope: !7) > +!21 = distinct !DISubprogram(name: "f2", linkageName: "_Z2f2v", scope: > !1, file: !1, line: 10, type: !22, scopeLine: 10, flags: DIFlagPrototyped | > DIFlagAllCallsDescribed, spFlags: DISPFlagDefinition | DISPFlagOptimized, > unit: !0, retainedNodes: !2) > +!22 = !DISubroutineType(types: !23) > +!23 = !{null} > +!24 = !DILocation(line: 11, column: 1, scope: !21) > > Modified: llvm/trunk/test/DebugInfo/X86/sret.ll > URL: > http://llvm.org/viewvc/llvm-project/llvm/trunk/test/DebugInfo/X86/sret.ll?rev=374600&r1=374599&r2=374600&view=diff > > ============================================================================== > --- llvm/trunk/test/DebugInfo/X86/sret.ll (original) > +++ llvm/trunk/test/DebugInfo/X86/sret.ll Fri Oct 11 14:52:41 2019 > @@ -11,6 +11,7 @@ > ; CHECK: _ZN1B9AInstanceEv > ; CHECK: DW_TAG_variable > ; CHECK-NEXT: DW_AT_location [DW_FORM_sec_offset] (0x00000000 > +; CHECK-NEXT: [0xffffffffffffffff, {{.*}}): {{$}} > ; CHECK-NEXT: [{{.*}}, {{.*}}): DW_OP_breg5 RDI+0 > ; CHECK-NEXT: [{{.*}}, {{.*}}): DW_OP_breg6 RBP-24, DW_OP_deref) > ; CHECK-NEXT: DW_AT_name {{.*}}"a" > > > _______________________________________________ > llvm-commits mailing list > llvm-commits at lists.llvm.org > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits > -------------- next part -------------- An HTML attachment was scrubbed... URL: From llvm-commits at lists.llvm.org Fri Oct 11 17:28:37 2019 From: llvm-commits at lists.llvm.org (Thomas Lively via Phabricator via llvm-commits) Date: Sat, 12 Oct 2019 00:28:37 +0000 (UTC) Subject: [PATCH] D68889: [WebAssembly] Allow multivalue types in block signature operands In-Reply-To: References: Message-ID: <7b6e67e7f08efb05a02cee935ea4b0e4@localhost.localdomain> tlively marked an inline comment as done. tlively added inline comments. 
================ Comment at: llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyMCTargetDesc.h:135 + Exnref = unsigned(wasm::ValType::EXNREF), + // Will be lowered to match the function return type in MCInstLower + Multivalue = 0xffff, ---------------- dschuff wrote: > The invariant that only fallthrough-return blocks are allowed to be multivalue should probably be restated here. > edit: I guess it's not just fallthrough-return blocks, right? Just the last block, even if it has an explicit return? > We should probably have some more tests for those cases. Right, it's the last block and the last block within that block recursively. In fact this only happens with explicit returns inside the blocks (or blocks that are otherwise never exited) because blocks that could be fallthrough return blocks instead set their results to a local then have a `local.get` in the return position. I will expand this comment to reiterate this invariant. I only added one additional test for this case because the algorithm for determining when to set block types was not changed and is already tested in cfg-stackify.ll. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68889/new/ https://reviews.llvm.org/D68889 From llvm-commits at lists.llvm.org Fri Oct 11 17:28:46 2019 From: llvm-commits at lists.llvm.org (Vedant Kumar via Phabricator via llvm-commits) Date: Sat, 12 Oct 2019 00:28:46 +0000 (UTC) Subject: [PATCH] D68718: [llvm-profdata] Make "malformed-ptr-to-counter-array.test" textual In-Reply-To: References: Message-ID: <6d1b69c7f286b9a0ff2b2a7899909959@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rG852e3b207651: [llvm-profdata] Make "malformed-ptr-to-counter-array.test" textual (authored by vsk). 
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68718/new/ https://reviews.llvm.org/D68718 Files: llvm/test/tools/llvm-profdata/Inputs/malformed-ptr-to-counter-array.profraw llvm/test/tools/llvm-profdata/malformed-ptr-to-counter-array.test Index: llvm/test/tools/llvm-profdata/malformed-ptr-to-counter-array.test =================================================================== --- llvm/test/tools/llvm-profdata/malformed-ptr-to-counter-array.test +++ llvm/test/tools/llvm-profdata/malformed-ptr-to-counter-array.test @@ -1,5 +1,50 @@ -REQUIRES: zlib +// Header +// +// INSTR_PROF_RAW_HEADER(uint64_t, Magic, __llvm_profile_get_magic()) +// INSTR_PROF_RAW_HEADER(uint64_t, Version, __llvm_profile_get_version()) +// INSTR_PROF_RAW_HEADER(uint64_t, DataSize, DataSize) +// INSTR_PROF_RAW_HEADER(uint64_t, CountersSize, CountersSize) +// INSTR_PROF_RAW_HEADER(uint64_t, NamesSize, NamesSize) +// INSTR_PROF_RAW_HEADER(uint64_t, CountersDelta, (uintptr_t)CountersBegin) +// INSTR_PROF_RAW_HEADER(uint64_t, NamesDelta, (uintptr_t)NamesBegin) +// INSTR_PROF_RAW_HEADER(uint64_t, ValueKindLast, IPVK_Last) -RUN: not llvm-profdata merge -o /dev/null %p/Inputs/malformed-ptr-to-counter-array.profraw 2>&1 | FileCheck %s +RUN: printf '\201rforpl\377' > %t.profraw +RUN: printf '\4\0\0\0\0\0\0\0' >> %t.profraw +RUN: printf '\1\0\0\0\0\0\0\0' >> %t.profraw +RUN: printf '\2\0\0\0\0\0\0\0' >> %t.profraw +RUN: printf '\10\0\0\0\0\0\0\0' >> %t.profraw +RUN: printf '\0\0\6\0\1\0\0\0' >> %t.profraw +RUN: printf '\0\0\6\0\2\0\0\0' >> %t.profraw +RUN: printf '\0\0\0\0\0\0\0\0' >> %t.profraw + +// Data Section +// +// struct ProfData { +// #define INSTR_PROF_DATA(Type, LLVMType, Name, Initializer) \ +// Type Name; +// #include "llvm/ProfileData/InstrProfData.inc" +// }; + +RUN: printf '\067\265\035\031\112\165\023\344' >> %t.profraw +RUN: printf '\02\0\0\0\0\0\0\0' >> %t.profraw + +// Note: The CounterPtr here is off-by-one. 
This should trigger a malformed profile error. +RUN: printf '\0\0\6\0\1\0\0\1' >> %t.profraw + +RUN: printf '\0\0\0\0\0\0\0\0' >> %t.profraw +RUN: printf '\0\0\0\0\0\0\0\0' >> %t.profraw +RUN: printf '\02\0\0\0\0\0\0\0' >> %t.profraw + +// Counter Section + +RUN: printf '\067\0\0\0\0\0\0\0' >> %t.profraw +RUN: printf '\101\0\0\0\0\0\0\0' >> %t.profraw + +// Name Section + +RUN: printf '\3\0bar\0\0\0' >> %t.profraw + +RUN: not llvm-profdata merge -o /dev/null %t.profraw 2>&1 | FileCheck %s CHECK: Malformed instrumentation profile data -------------- next part -------------- A non-text attachment was scrubbed... Name: D68718.224714.patch Type: text/x-patch Size: 2230 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Fri Oct 11 17:35:27 2019 From: llvm-commits at lists.llvm.org (Philip Reames via llvm-commits) Date: Fri, 11 Oct 2019 17:35:27 -0700 Subject: [llvm] r374535 - [SCEV] Add stricter verification option. In-Reply-To: <20191011114640.A64F39189C@lists.llvm.org> References: <20191011114640.A64F39189C@lists.llvm.org> Message-ID: I may be missing the obvious, but why is a symbolic expression which *may* be zero a violation here?  It would seem to be a missed canonicalization, nothing more. I agree that a SCEV which is known (via isKnownPredicate?) not to be zero is a bug. Philip On 10/11/2019 4:46 AM, Florian Hahn via llvm-commits wrote: > Author: fhahn > Date: Fri Oct 11 04:46:40 2019 > New Revision: 374535 > > URL: http://llvm.org/viewvc/llvm-project?rev=374535&view=rev > Log: > [SCEV] Add stricter verification option. > > Currently -verify-scev only fails if there is a constant difference > between two BE counts. This misses a lot of cases. > > This patch adds a -verify-scev-strict options, which fails for any > non-zero differences, if used together with -verify-scev. > > With the stricter checking, some unit tests fail because > of mis-matches, especially around IndVarSimplify. 
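A rough model of the two verification modes described in the quoted commit message, written as a Python sketch rather than the actual ScalarEvolution code. The function name `trip_count_changed` and the representation of a delta (an `int` for a constant difference, a `str` for a symbolic one) are assumptions made purely for illustration.

```python
def trip_count_changed(delta, strict=False):
    """Return True when the modeled verifier would report a changed trip count.

    delta:  hypothetical stand-in for the difference of two backedge-taken
            counts; an int models a constant, a str models a symbolic expression.
    strict: models passing -verify-scev-strict together with -verify-scev.
    """
    is_constant = isinstance(delta, int)
    is_zero = is_constant and delta == 0
    # Non-strict mode only inspects constant deltas; strict mode inspects all.
    return (strict or is_constant) and not is_zero
```

Under this model a symbolic delta such as `"%n - %m"` passes the default check and is reported only with `strict=True`, which is the case the reply above asks about.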
> > If there is no reason I am missing for just checking constant deltas, I > am planning on looking into the various failures. > > Reviewers: efriedma, sanjoy.google, reames, atrick > > Reviewed By: sanjoy.google > > Differential Revision: https://reviews.llvm.org/D68592 > > Modified: > llvm/trunk/lib/Analysis/ScalarEvolution.cpp > > Modified: llvm/trunk/lib/Analysis/ScalarEvolution.cpp > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/ScalarEvolution.cpp?rev=374535&r1=374534&r2=374535&view=diff > ============================================================================== > --- llvm/trunk/lib/Analysis/ScalarEvolution.cpp (original) > +++ llvm/trunk/lib/Analysis/ScalarEvolution.cpp Fri Oct 11 04:46:40 2019 > @@ -158,6 +158,9 @@ MaxBruteForceIterations("scalar-evolutio > static cl::opt<bool> VerifySCEV( > "verify-scev", cl::Hidden, > cl::desc("Verify ScalarEvolution's backedge taken counts (slow)")); > +static cl::opt<bool> VerifySCEVStrict( > + "verify-scev-strict", cl::Hidden, > + cl::desc("Enable stricter verification with -verify-scev is passed")); > static cl::opt<bool> > VerifySCEVMap("verify-scev-maps", cl::Hidden, > cl::desc("Verify no dangling value in ScalarEvolution's " > @@ -11922,14 +11925,14 @@ void ScalarEvolution::verify() const { > SE.getTypeSizeInBits(NewBECount->getType())) > CurBECount = SE2.getZeroExtendExpr(CurBECount, NewBECount->getType()); > > - auto *ConstantDelta = > - dyn_cast<SCEVConstant>(SE2.getMinusSCEV(CurBECount, NewBECount)); > + const SCEV *Delta = SE2.getMinusSCEV(CurBECount, NewBECount); > > - if (ConstantDelta && ConstantDelta->getAPInt() != 0) { > - dbgs() << "Trip Count Changed!\n"; > + // Unless VerifySCEVStrict is set, we only compare constant deltas. 
> + if ((VerifySCEVStrict || isa<SCEVConstant>(Delta)) && !Delta->isZero()) { > + dbgs() << "Trip Count for " << *L << " Changed!\n"; > dbgs() << "Old: " << *CurBECount << "\n"; > dbgs() << "New: " << *NewBECount << "\n"; > - dbgs() << "Delta: " << *ConstantDelta << "\n"; > + dbgs() << "Delta: " << *Delta << "\n"; > std::abort(); > } > } From llvm-commits at lists.llvm.org Fri Oct 11 17:38:02 2019 From: llvm-commits at lists.llvm.org (Eli Friedman via Phabricator via llvm-commits) Date: Sat, 12 Oct 2019 00:38:02 +0000 (UTC) Subject: [PATCH] D68898: JumpThreading: enhance JT to handle BB with no successor and address comparison In-Reply-To: References: Message-ID: efriedma added a comment. If the terminator is a "ret", or some arbitrary terminator that doesn't simplify, it's not really "threading"; it's just tail duplication. That's likely profitable in some cases, but using ThreadEdge to perform the transform seems confusing. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68898/new/ https://reviews.llvm.org/D68898 From llvm-commits at lists.llvm.org Fri Oct 11 17:38:03 2019 From: llvm-commits at lists.llvm.org (Thomas Lively via Phabricator via llvm-commits) Date: Sat, 12 Oct 2019 00:38:03 +0000 (UTC) Subject: [PATCH] D68889: [WebAssembly] Allow multivalue types in block signature operands In-Reply-To: References: Message-ID: <11e39c10717243b472250d07fe0aa6fe@localhost.localdomain> tlively updated this revision to Diff 224715. tlively added a comment. 
- Explain more about multivalue types in WebAssembly::BlockType Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68889/new/ https://reviews.llvm.org/D68889 Files: llvm/lib/Target/WebAssembly/AsmParser/WebAssemblyAsmParser.cpp llvm/lib/Target/WebAssembly/Disassembler/LLVMBuild.txt llvm/lib/Target/WebAssembly/Disassembler/WebAssemblyDisassembler.cpp llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyInstPrinter.cpp llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyMCCodeEmitter.cpp llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyMCTargetDesc.h llvm/lib/Target/WebAssembly/WebAssemblyCFGStackify.cpp llvm/lib/Target/WebAssembly/WebAssemblyMCInstLower.cpp llvm/lib/Target/WebAssembly/WebAssemblyMCInstLower.h llvm/test/CodeGen/WebAssembly/multivalue.ll llvm/test/MC/Disassembler/WebAssembly/wasm-error.txt llvm/test/MC/WebAssembly/basic-assembly.s llvm/tools/llvm-mc/Disassembler.cpp llvm/tools/llvm-mc/Disassembler.h llvm/tools/llvm-mc/llvm-mc.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D68889.224715.patch Type: text/x-patch Size: 24957 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Fri Oct 11 17:56:06 2019 From: llvm-commits at lists.llvm.org (Heejin Ahn via Phabricator via llvm-commits) Date: Sat, 12 Oct 2019 00:56:06 +0000 (UTC) Subject: [PATCH] D68889: [WebAssembly] Allow multivalue types in block signature operands In-Reply-To: References: Message-ID: <067f702e213ae852b7b771bf8ec5b122@localhost.localdomain> aheejin added a comment. Nice! Mostly LGTM. > Currently non-void blocks are only generated at the end of functions where the block return type needs to agree with the function return type, and that remains true for multivalue blocks. That invariant means that the actual signature does not need to be stored in the block signature MachineOperand because it can be inferred by WebAssemblyMCInstLower from the return type of the parent function. 
I guess this is a tentative state before you implement the rest of the proposal in full, right? If other blocks are able to return multivalue, are we gonna change their operands to also take typeindex? ================ Comment at: llvm/lib/Target/WebAssembly/Disassembler/WebAssemblyDisassembler.cpp:224 + if (Val < 0) { + // Negative values are single septet value types or empty types + if (Size != PrevSize + 1) { ---------------- What are septet values, and when is `Val` negative? All values of `BlockType` look unsigned. Maybe I'm missing something..? ================ Comment at: llvm/lib/Target/WebAssembly/Disassembler/WebAssemblyDisassembler.cpp:227 + MI.addOperand( + MCOperand::createImm(int64_t(WebAssembly::BlockType::Invalid))); + } else { ---------------- When does this happen? ================ Comment at: llvm/lib/Target/WebAssembly/Disassembler/WebAssemblyDisassembler.cpp:239 + MI.addOperand(MCOperand::createExpr(Expr)); + } break; ---------------- Disassembler is not going to print multivalue signatures then? Is this tentative or permanent? ================ Comment at: llvm/lib/Target/WebAssembly/WebAssemblyMCInstLower.cpp:203 + computeLegalValueVTs(F, TM, RetTy, CallerRetTys); + llvm::valTypesFromMVTs(CallerRetTys, Returns); +} ---------------- Nit: Do we need `llvm::` here? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68889/new/ https://reviews.llvm.org/D68889 From llvm-commits at lists.llvm.org Fri Oct 11 17:56:07 2019 From: llvm-commits at lists.llvm.org (Heejin Ahn via Phabricator via llvm-commits) Date: Sat, 12 Oct 2019 00:56:07 +0000 (UTC) Subject: [PATCH] D68889: [WebAssembly] Allow multivalue types in block signature operands In-Reply-To: References: Message-ID: <5b42b492ddd7f08aa8208ec99c12b955@localhost.localdomain> aheejin added inline comments. 
================ Comment at: llvm/lib/Target/WebAssembly/WebAssemblyCFGStackify.cpp:1248 + case WebAssembly::END_BLOCK: + case WebAssembly::END_LOOP: EndToBegin[&MI]->getOperand(0).setImm(int32_t(RetType)); ---------------- It's preexisting, but I think we should add `END_TRY` here too. Not sure if I can generate a test case that ends with `end_try` returning something easily though.. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68889/new/ https://reviews.llvm.org/D68889 From llvm-commits at lists.llvm.org Fri Oct 11 18:05:16 2019 From: llvm-commits at lists.llvm.org (Xiangling Liao via Phabricator via llvm-commits) Date: Sat, 12 Oct 2019 01:05:16 +0000 (UTC) Subject: [PATCH] D68341: [AIX] TOC pseudo expansion for 64bit large + 64bit small + 32bit large modes In-Reply-To: References: Message-ID: <7a38ab635d8f2f0766b6a08472d742a9@localhost.localdomain> Xiangling_L marked 2 inline comments as done. Xiangling_L added inline comments. ================ Comment at: llvm/test/CodeGen/PowerPC/lower-globaladdr32-aix-asm.ll:23 +; LARGE: lwz [[REG2:[0-9]+]], LC0@l([[REG1]]) +; LARGE: lwz [[REG3:[0-9]+]], 0([[REG2]]) +; LARGE: addis [[REG4:[0-9]+]], LC1@u(2) ---------------- hubert.reinterpretcast wrote: > Xiangling_L wrote: > > hubert.reinterpretcast wrote: > > > That the ordering and interleaving of the logical operations involved differ between the various cases seem to indicate that the test is already too complicated. Please reduce the test to use a single memory operand (e.g., store a constant or return the value read). > > @sfertile I guess your original purpose of creating this testcase is to test if load from TOC works for both `load` and `store`? > If that is indeed the intent, then the goal can be achieved with more tests that are simpler. 
Thanks, I will update the testcase Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68341/new/ https://reviews.llvm.org/D68341 From llvm-commits at lists.llvm.org Fri Oct 11 18:05:17 2019 From: llvm-commits at lists.llvm.org (Thomas Lively via Phabricator via llvm-commits) Date: Sat, 12 Oct 2019 01:05:17 +0000 (UTC) Subject: [PATCH] D68889: [WebAssembly] Allow multivalue types in block signature operands In-Reply-To: References: Message-ID: <2f9e2e89fe5e38b903f41b729e55b80c@localhost.localdomain> tlively marked 4 inline comments as done. tlively added inline comments. ================ Comment at: llvm/lib/Target/WebAssembly/Disassembler/WebAssemblyDisassembler.cpp:224 + if (Val < 0) { + // Negative values are single septet value types or empty types + if (Size != PrevSize + 1) { ---------------- aheejin wrote: > What are septet values, and when is `Val` negative? All values of `BlockType` look unsigned. Maybe I'm missing something..? See https://webassembly.github.io/multi-value/core/binary/instructions.html#control-instructions for details on the encoding. By `septet value` I just mean a group of 7 bits. That's where the `& 0x7f` comes from. ================ Comment at: llvm/lib/Target/WebAssembly/Disassembler/WebAssemblyDisassembler.cpp:227 + MI.addOperand( + MCOperand::createImm(int64_t(WebAssembly::BlockType::Invalid))); + } else { ---------------- aheejin wrote: > When does this happen? See wasm-error.txt for an example. Basically anytime you have a negative SLEB128 value here that occupies more than one byte, which is invalid according to the spec. ================ Comment at: llvm/lib/Target/WebAssembly/Disassembler/WebAssemblyDisassembler.cpp:239 + MI.addOperand(MCOperand::createExpr(Expr)); + } break; ---------------- aheejin wrote: > Disassembler is not going to print multivalue signatures then? Is this tentative or permanent? 
I don't think there's any way to access the type section from the disassembler, so this is permanent unless there's a large overhaul. cc @aardappel for the details. ================ Comment at: llvm/lib/Target/WebAssembly/WebAssemblyCFGStackify.cpp:1248 + case WebAssembly::END_BLOCK: + case WebAssembly::END_LOOP: EndToBegin[&MI]->getOperand(0).setImm(int32_t(RetType)); ---------------- aheejin wrote: > It's preexisting, but I think we should add `END_TRY` here too. Not sure if I can generate a test case that ends with `end_try` returning something easily though.. Sounds good. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68889/new/ https://reviews.llvm.org/D68889 From llvm-commits at lists.llvm.org Fri Oct 11 18:23:23 2019 From: llvm-commits at lists.llvm.org (Fangrui Song via Phabricator via llvm-commits) Date: Sat, 12 Oct 2019 01:23:23 +0000 (UTC) Subject: [PATCH] D68875: [lld] Check for branch range overflows. In-Reply-To: References: Message-ID: <821c43a23461160c008d3a7ac9944682@localhost.localdomain> MaskRay added inline comments. ================ Comment at: lld/test/ELF/hexagon-verify.s:5 + +#CHECK: relocation R_HEX_B9_PCREL out of range: 1028 is not in [-1024, 1023] +#CHECK-NEXT: relocation R_HEX_B13_PCREL out of range: 16388 is not in [-16384, 16383] ---------------- `#CHECK` -> `# CHECK` If such branch relocation types have a categorical name, e.g. jump, consider rename this test to something like `hexagon-jump-error.s` ``` x86-64-reloc-error.s aarch64-lo21-error.s # the name is good ``` Repository: rLLD LLVM Linker CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68875/new/ https://reviews.llvm.org/D68875 From llvm-commits at lists.llvm.org Fri Oct 11 18:23:23 2019 From: llvm-commits at lists.llvm.org (Fangrui Song via Phabricator via llvm-commits) Date: Sat, 12 Oct 2019 01:23:23 +0000 (UTC) Subject: [PATCH] D68875: [lld] Check for branch range overflows. 
In-Reply-To: References: Message-ID: MaskRay added inline comments. ================ Comment at: lld/test/ELF/hexagon-verify.s:21 + +.section _pc13, "ax" +if (r0==#0) jump:t #pc13 ---------------- Consider moving the CHECK lines just before the corresponding instructions. ================ Comment at: lld/test/ELF/hexagon-verify.s:28 +.section _pc15, "ax" +if (p0) jump #pc15 +.space (1<<16) ---------------- If `jump #b15` works, you can delete the local label `pc15`. Repository: rLLD LLVM Linker CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68875/new/ https://reviews.llvm.org/D68875 From llvm-commits at lists.llvm.org Fri Oct 11 18:32:22 2019 From: llvm-commits at lists.llvm.org (Thomas Lively via Phabricator via llvm-commits) Date: Sat, 12 Oct 2019 01:32:22 +0000 (UTC) Subject: [PATCH] D68902: [WebAssembly] Trapping fptoint builtins and intrinsics Message-ID: tlively created this revision. tlively added a reviewer: aheejin. Herald added subscribers: llvm-commits, cfe-commits, sunfish, hiraditya, jgravelle-google, sbc100, dschuff. Herald added projects: clang, LLVM. The WebAssembly backend lowers fptoint instructions to a code sequence that checks for overflow to avoid traps because fptoint is supposed to be speculatable. These new builtins and intrinsics give users a way to depend on the trapping semantics of the underlying instructions and avoid the extra code generated normally. Patch by coffee and tlively. Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D68902 Files: clang/include/clang/Basic/BuiltinsWebAssembly.def clang/lib/CodeGen/CGBuiltin.cpp clang/test/CodeGen/builtins-wasm.c llvm/include/llvm/IR/IntrinsicsWebAssembly.td llvm/lib/Target/WebAssembly/WebAssemblyInstrConv.td llvm/test/CodeGen/WebAssembly/conv-trap.ll -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68902.224718.patch Type: text/x-patch Size: 10431 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Fri Oct 11 18:41:25 2019 From: llvm-commits at lists.llvm.org (Xiangling Liao via Phabricator via llvm-commits) Date: Sat, 12 Oct 2019 01:41:25 +0000 (UTC) Subject: [PATCH] D68341: [AIX] TOC pseudo expansion for 64bit large + 64bit small + 32bit large modes In-Reply-To: References: Message-ID: Xiangling_L updated this revision to Diff 224720. Xiangling_L added a comment. Update testcases: split into simpler ones Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68341/new/ https://reviews.llvm.org/D68341 Files: llvm/include/llvm/MC/MCExpr.h llvm/lib/MC/MCExpr.cpp llvm/lib/Target/PowerPC/MCTargetDesc/PPCInstPrinter.cpp llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp llvm/lib/Target/PowerPC/PPCInstrInfo.cpp llvm/test/CodeGen/PowerPC/lower-globaladdr32-aix-asm.ll llvm/test/CodeGen/PowerPC/lower-globaladdr64-aix-asm.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D68341.224720.patch Type: text/x-patch Size: 13632 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Fri Oct 11 18:50:36 2019 From: llvm-commits at lists.llvm.org (Vitaly Buka via llvm-commits) Date: Sat, 12 Oct 2019 01:50:36 -0000 Subject: [llvm] r374623 - [asan] Return true from instrumentModule Message-ID: <20191012015036.DB51C93482@lists.llvm.org> Author: vitalybuka Date: Fri Oct 11 18:50:36 2019 New Revision: 374623 URL: http://llvm.org/viewvc/llvm-project?rev=374623&view=rev Log: [asan] Return true from instrumentModule createSanitizerCtorAndInitFunctions always change the module. 
Modified: llvm/trunk/lib/Transforms/Instrumentation/AddressSanitizer.cpp Modified: llvm/trunk/lib/Transforms/Instrumentation/AddressSanitizer.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Instrumentation/AddressSanitizer.cpp?rev=374623&r1=374622&r2=374623&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Instrumentation/AddressSanitizer.cpp (original) +++ llvm/trunk/lib/Transforms/Instrumentation/AddressSanitizer.cpp Fri Oct 11 18:50:36 2019 @@ -2439,11 +2439,10 @@ bool ModuleAddressSanitizer::instrumentM /*InitArgs=*/{}, VersionCheckName); bool CtorComdat = true; - bool Changed = false; // TODO(glider): temporarily disabled globals instrumentation for KASan. if (ClGlobals) { IRBuilder<> IRB(AsanCtorFunction->getEntryBlock().getTerminator()); - Changed |= InstrumentGlobals(IRB, M, &CtorComdat); + InstrumentGlobals(IRB, M, &CtorComdat); } const uint64_t Priority = GetCtorAndDtorPriority(TargetTriple); @@ -2464,7 +2463,7 @@ bool ModuleAddressSanitizer::instrumentM appendToGlobalDtors(M, AsanDtorFunction, Priority); } - return Changed; + return true; } void AddressSanitizer::initializeCallbacks(Module &M) { From llvm-commits at lists.llvm.org Fri Oct 11 18:59:53 2019 From: llvm-commits at lists.llvm.org (Fangrui Song via Phabricator via llvm-commits) Date: Sat, 12 Oct 2019 01:59:53 +0000 (UTC) Subject: [PATCH] D66613: [support][llvm-objcopy] Add support for shell wildcards In-Reply-To: References: Message-ID: MaskRay added inline comments. ================ Comment at: llvm/test/tools/llvm-objcopy/ELF/wildcard-syntax.test:1 +# RUN: yaml2obj --docnum=1 %s > %t.o + ---------------- Move the file-level comment before `RUN: ` ================ Comment at: llvm/test/tools/llvm-objcopy/ELF/wildcard-syntax.test:3 + +## This test checks that llvm-objcopy accepts wildcard syntax correctly. + ---------------- The name is `wildcard-syntax.test` (good). 
Both `wildcard` and `glob` are referred to in this test. Personally I think `wildcard` is a subset of `glob` and `glob` is a better name here: glob is a POSIX-specified concept, and the function `fnmatch` that objcopy uses accepts glob, not just wildcard. Unify the naming here. ================ Comment at: llvm/test/tools/llvm-objcopy/ELF/wildcard-syntax.test:21 +## ! (as a leading character) prevents matches (not dependent on ordering). +# RUN: llvm-objcopy --remove-section='.???' --remove-section='!.f*' %t.o %t.negmatch.o +# RUN: llvm-readobj --sections %t.negmatch.o \ ---------------- What does `--remove-section='.???' --remove-section='!.f*' --remove-section='.???'` do? ================ Comment at: llvm/test/tools/llvm-objcopy/ELF/wildcard-syntax.test:25 + +## [a-z] matches a range of characters +# RUN: llvm-objcopy --remove-section='.[a-c][a-a][q-z]' %t.o %t.range.o ---------------- End the comment with a full stop. ================ Comment at: llvm/test/tools/llvm-objcopy/ELF/wildcard-syntax.test:26 +## [a-z] matches a range of characters +# RUN: llvm-objcopy --remove-section='.[a-c][a-a][q-z]' %t.o %t.range.o +# RUN: llvm-readobj --sections %t.range.o \ ---------------- If llvm-objcopy called setlocale, RE Bracket Expression (also used by glob) would be non-portable (behaviors of locales other than the POSIX locale are unspecified). Just for fun, glibc>=2.27 makes `w` and `v` have the same collating order, and a `w` test may fail in the Swedish locale (if regex matching is used): https://sourceware.org/bugzilla/show_bug.cgi?id=23393 (Comment #41 said this is basically a wontfix). In any case, llvm-objcopy does not call setlocale, `w` is not used in the test, and we do not use regex, so we are good. ================ Comment at: llvm/test/tools/llvm-objcopy/ELF/wildcard-syntax.test:73 +## ] doesn't close the character class as a first character.
+# RUN: llvm-objcopy --remove-section='[]xyz]' %t.special.o %t.class.2.o +# RUN: llvm-readobj --sections %t.class.2.o \ ---------------- Probably add a section named `z` to enhance the test. ================ Comment at: llvm/tools/llvm-objcopy/CopyConfig.h:115 + create(StringRef Pattern, MatchStyle MS, + std::function ErrorCallback); + ---------------- If ErrorCallback is not stored, consider llvm::function_ref. ================ Comment at: llvm/tools/llvm-objcopy/CopyConfig.h:119 + bool operator==(StringRef S) const { + return R ? return R->match(S) : G ? return G->match(S) : Name == S; + } ---------------- Remove some `return` Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D66613/new/ https://reviews.llvm.org/D66613 From llvm-commits at lists.llvm.org Fri Oct 11 19:09:10 2019 From: llvm-commits at lists.llvm.org (Matt Arsenault via Phabricator via llvm-commits) Date: Sat, 12 Oct 2019 02:09:10 +0000 (UTC) Subject: [PATCH] D68900: [SROA] Reuse existing lifetime markers if possible In-Reply-To: References: Message-ID: <32c2af9d0ba122fea064f6efa77707b7@localhost.localdomain> arsenm accepted this revision. arsenm added a comment. This revision is now accepted and ready to land. Herald added a subscriber: wdng. 
LGTM Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68900/new/ https://reviews.llvm.org/D68900 From llvm-commits at lists.llvm.org Fri Oct 11 19:29:24 2019 From: llvm-commits at lists.llvm.org (Vitaly Buka via llvm-commits) Date: Sat, 12 Oct 2019 02:29:24 -0000 Subject: [llvm] r374628 - [sancov] Use LLVM Support library JSON writer in favor of individual implementation Message-ID: <20191012022924.42BD787CB1@lists.llvm.org> Author: vitalybuka Date: Fri Oct 11 19:29:24 2019 New Revision: 374628 URL: http://llvm.org/viewvc/llvm-project?rev=374628&view=rev Log: [sancov] Use LLVM Support library JSON writer in favor of individual implementation Summary: In this diff, I've replaced the individual implementation of `JSONWriter` with `json::OStream` provided by `llvm/Support/JSON.h`. Important Note: The output format of the JSON is considerably different compared to the original implementation. Important differences include: * New line for each entry in an array (should make diffs cleaner) * No space between keys and colon in attributed object entries. * Attributes with empty strings will now print the attribute name and a quote pair rather than excluding the attribute altogether Examples of these differences can be seen in the changes to the sancov tests which compare the JSON output. Patch by Douglas Gliner. 
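The formatting differences listed above can be illustrated with a small sketch. Python's standard `json` module is used here purely as a stand-in (an assumption; the patch itself uses `json::OStream` from `llvm/Support/JSON.h`, whose output may differ in details), and the sample values are borrowed from the sancov merge test:

```python
import json

# Sample data modelled on the MERGE2 check lines in merge.test. The
# binary hash is deliberately empty to show that the key is still emitted.
coverage = {
    "covered-points": ["04e132b", "04e1472"],
    "binary-hash": "",
    "point-symbol-info": {
        "test/tools/sancov/Inputs/foo.cpp": {"foo()": {"04e178c": "5:0"}},
    },
}

# indent=2 mirrors the indent of 2 passed to the stream in the patch.
text = json.dumps(coverage, indent=2)
print(text)
```

With this stand-in, each array entry lands on its own line, keys are followed directly by a colon, and the empty `binary-hash` attribute is printed rather than dropped, which matches the three differences the summary describes.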
Reviewers: kcc, filcab, phosek, morehouse, vitalybuka, metzman Subscribers: mehdi_amini, dexonsmith, llvm-commits Tags: #sanitizers, #llvm Differential Revision: https://reviews.llvm.org/D68752 Modified: llvm/trunk/test/tools/sancov/merge.test llvm/trunk/test/tools/sancov/symbolize.test llvm/trunk/test/tools/sancov/symbolize_noskip_dead_files.test llvm/trunk/tools/sancov/sancov.cpp Modified: llvm/trunk/test/tools/sancov/merge.test URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/sancov/merge.test?rev=374628&r1=374627&r2=374628&view=diff ============================================================================== --- llvm/trunk/test/tools/sancov/merge.test (original) +++ llvm/trunk/test/tools/sancov/merge.test Fri Oct 11 19:29:24 2019 @@ -3,62 +3,81 @@ RUN: sancov -merge %p/Inputs/test-linux_ RUN: sancov -merge %p/Inputs/test-linux_x86_64.0.symcov %p/Inputs/test-linux_x86_64.1.symcov| FileCheck --check-prefix=MERGE2 %s MERGE1: { -MERGE1-NEXT: "covered-points" : ["4e132b", "4e1472", "4e1520", "4e1553", "4e1586"], -MERGE1-NEXT: "binary-hash" : "BB3CDD5045AED83906F6ADCC1C4DAF7E2596A6B5", -MERGE1-NEXT: "point-symbol-info" : { -MERGE1-NEXT: "test/tools/sancov/Inputs/foo.cpp" : { -MERGE1-NEXT: "foo()" : { -MERGE1-NEXT: "4e178c" : "5:0" +MERGE1-NEXT: "covered-points": [ +MERGE1-NEXT: "4e132b", +MERGE1-NEXT: "4e1472", +MERGE1-NEXT: "4e1520", +MERGE1-NEXT: "4e1553", +MERGE1-NEXT: "4e1586" +MERGE1-NEXT: ], +MERGE1-NEXT: "binary-hash": "BB3CDD5045AED83906F6ADCC1C4DAF7E2596A6B5", +MERGE1-NEXT: "point-symbol-info": { +MERGE1-NEXT: "test/tools/sancov/Inputs/foo.cpp": { +MERGE1-NEXT: "foo()": { +MERGE1-NEXT: "4e178c": "5:0" MERGE1-NEXT: } MERGE1-NEXT: }, -MERGE1-NEXT: "test/tools/sancov/Inputs/test.cpp" : { -MERGE1-NEXT: "bar(std::string)" : { -MERGE1-NEXT: "4e132b" : "12:0" +MERGE1-NEXT: "test/tools/sancov/Inputs/test.cpp": { +MERGE1-NEXT: "bar(std::string)": { +MERGE1-NEXT: "4e132b": "12:0" MERGE1-NEXT: }, -MERGE1-NEXT: "main" : { -MERGE1-NEXT: "4e1472" : "14:0", 
-MERGE1-NEXT: "4e14c2" : "16:9", -MERGE1-NEXT: "4e1520" : "17:5", -MERGE1-NEXT: "4e1553" : "17:5", -MERGE1-NEXT: "4e1586" : "17:5", -MERGE1-NEXT: "4e1635" : "19:1", -MERGE1-NEXT: "4e1690" : "17:5" +MERGE1-NEXT: "main": { +MERGE1-NEXT: "4e1472": "14:0", +MERGE1-NEXT: "4e14c2": "16:9", +MERGE1-NEXT: "4e1520": "17:5", +MERGE1-NEXT: "4e1553": "17:5", +MERGE1-NEXT: "4e1586": "17:5", +MERGE1-NEXT: "4e1635": "19:1", +MERGE1-NEXT: "4e1690": "17:5" MERGE1-NEXT: } MERGE1-NEXT: } MERGE1-NEXT: } MERGE1-NEXT: } MERGE2: { -MERGE2-NEXT: "covered-points" : ["04e132b", "04e1472", "04e1520", "04e1553", "04e1586", "14e132b", "14e1472", "14e14c2", "14e1520", "14e1553", "14e1586", "14e178c"], -MERGE2-NEXT: "point-symbol-info" : { -MERGE2-NEXT: "test/tools/sancov/Inputs/foo.cpp" : { -MERGE2-NEXT: "foo()" : { -MERGE2-NEXT: "04e178c" : "5:0", -MERGE2-NEXT: "14e178c" : "5:0" +MERGE2-NEXT: "covered-points": [ +MERGE2-NEXT: "04e132b", +MERGE2-NEXT: "04e1472", +MERGE2-NEXT: "04e1520", +MERGE2-NEXT: "04e1553", +MERGE2-NEXT: "04e1586", +MERGE2-NEXT: "14e132b", +MERGE2-NEXT: "14e1472", +MERGE2-NEXT: "14e14c2", +MERGE2-NEXT: "14e1520", +MERGE2-NEXT: "14e1553", +MERGE2-NEXT: "14e1586", +MERGE2-NEXT: "14e178c" +MERGE2-NEXT: ], +MERGE2-NEXT: "binary-hash": "", +MERGE2-NEXT: "point-symbol-info": { +MERGE2-NEXT: "test/tools/sancov/Inputs/foo.cpp": { +MERGE2-NEXT: "foo()": { +MERGE2-NEXT: "04e178c": "5:0", +MERGE2-NEXT: "14e178c": "5:0" MERGE2-NEXT: } MERGE2-NEXT: }, -MERGE2-NEXT: "test/tools/sancov/Inputs/test.cpp" : { -MERGE2-NEXT: "bar(std::string)" : { -MERGE2-NEXT: "04e132b" : "12:0", -MERGE2-NEXT: "14e132b" : "12:0" +MERGE2-NEXT: "test/tools/sancov/Inputs/test.cpp": { +MERGE2-NEXT: "bar(std::string)": { +MERGE2-NEXT: "04e132b": "12:0", +MERGE2-NEXT: "14e132b": "12:0" MERGE2-NEXT: }, -MERGE2-NEXT: "main" : { -MERGE2-NEXT: "04e1472" : "14:0", -MERGE2-NEXT: "04e14c2" : "16:9", -MERGE2-NEXT: "04e1520" : "17:5", -MERGE2-NEXT: "04e1553" : "17:5", -MERGE2-NEXT: "04e1586" : "17:5", -MERGE2-NEXT: 
"04e1635" : "19:1", -MERGE2-NEXT: "04e1690" : "17:5", -MERGE2-NEXT: "14e1472" : "14:0", -MERGE2-NEXT: "14e14c2" : "16:9", -MERGE2-NEXT: "14e1520" : "17:5", -MERGE2-NEXT: "14e1553" : "17:5", -MERGE2-NEXT: "14e1586" : "17:5", -MERGE2-NEXT: "14e1635" : "19:1", -MERGE2-NEXT: "14e1690" : "17:5" +MERGE2-NEXT: "main": { +MERGE2-NEXT: "04e1472": "14:0", +MERGE2-NEXT: "04e14c2": "16:9", +MERGE2-NEXT: "04e1520": "17:5", +MERGE2-NEXT: "04e1553": "17:5", +MERGE2-NEXT: "04e1586": "17:5", +MERGE2-NEXT: "04e1635": "19:1", +MERGE2-NEXT: "04e1690": "17:5", +MERGE2-NEXT: "14e1472": "14:0", +MERGE2-NEXT: "14e14c2": "16:9", +MERGE2-NEXT: "14e1520": "17:5", +MERGE2-NEXT: "14e1553": "17:5", +MERGE2-NEXT: "14e1586": "17:5", +MERGE2-NEXT: "14e1635": "19:1", +MERGE2-NEXT: "14e1690": "17:5" MERGE2-NEXT: } MERGE2-NEXT: } MERGE2-NEXT: } MERGE2-NEXT: } - Modified: llvm/trunk/test/tools/sancov/symbolize.test URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/sancov/symbolize.test?rev=374628&r1=374627&r2=374628&view=diff ============================================================================== --- llvm/trunk/test/tools/sancov/symbolize.test (original) +++ llvm/trunk/test/tools/sancov/symbolize.test Fri Oct 11 19:29:24 2019 @@ -2,23 +2,28 @@ REQUIRES: x86_64-linux RUN: sancov -symbolize -strip_path_prefix="llvm/" %p/Inputs/test-linux_x86_64 %p/Inputs/test-linux_x86_64.0.sancov | FileCheck %s CHECK: { -CHECK-NEXT: "covered-points" : ["4e132b", "4e1472", "4e1520", "4e1553", "4e1586"], -CHECK-NEXT: "binary-hash" : "BB3CDD5045AED83906F6ADCC1C4DAF7E2596A6B5", -CHECK-NEXT: "point-symbol-info" : { -CHECK-NEXT: "test/tools/sancov/Inputs/test.cpp" : { -CHECK-NEXT: "bar(std::string)" : { -CHECK-NEXT: "4e132b" : "12:0" +CHECK-NEXT: "covered-points": [ +CHECK-NEXT: "4e132b", +CHECK-NEXT: "4e1472", +CHECK-NEXT: "4e1520", +CHECK-NEXT: "4e1553", +CHECK-NEXT: "4e1586" +CHECK-NEXT: ], +CHECK-NEXT: "binary-hash": "BB3CDD5045AED83906F6ADCC1C4DAF7E2596A6B5", +CHECK-NEXT: "point-symbol-info": { 
+CHECK-NEXT: "test/tools/sancov/Inputs/test.cpp": { +CHECK-NEXT: "bar(std::string)": { +CHECK-NEXT: "4e132b": "12:0" CHECK-NEXT: }, -CHECK-NEXT: "main" : { -CHECK-NEXT: "4e1472" : "14:0", -CHECK-NEXT: "4e14c2" : "16:9", -CHECK-NEXT: "4e1520" : "17:5", -CHECK-NEXT: "4e1553" : "17:5", -CHECK-NEXT: "4e1586" : "17:5", -CHECK-NEXT: "4e1635" : "19:1", -CHECK-NEXT: "4e1690" : "17:5" +CHECK-NEXT: "main": { +CHECK-NEXT: "4e1472": "14:0", +CHECK-NEXT: "4e14c2": "16:9", +CHECK-NEXT: "4e1520": "17:5", +CHECK-NEXT: "4e1553": "17:5", +CHECK-NEXT: "4e1586": "17:5", +CHECK-NEXT: "4e1635": "19:1", +CHECK-NEXT: "4e1690": "17:5" CHECK-NEXT: } CHECK-NEXT: } CHECK-NEXT: } CHECK-NEXT:} - Modified: llvm/trunk/test/tools/sancov/symbolize_noskip_dead_files.test URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/sancov/symbolize_noskip_dead_files.test?rev=374628&r1=374627&r2=374628&view=diff ============================================================================== --- llvm/trunk/test/tools/sancov/symbolize_noskip_dead_files.test (original) +++ llvm/trunk/test/tools/sancov/symbolize_noskip_dead_files.test Fri Oct 11 19:29:24 2019 @@ -2,28 +2,33 @@ REQUIRES: x86_64-linux RUN: sancov -symbolize -skip-dead-files=0 -strip_path_prefix="llvm/" %p/Inputs/test-linux_x86_64 %p/Inputs/test-linux_x86_64.0.sancov | FileCheck %s CHECK: { -CHECK-NEXT: "covered-points" : ["4e132b", "4e1472", "4e1520", "4e1553", "4e1586"], -CHECK-NEXT: "binary-hash" : "BB3CDD5045AED83906F6ADCC1C4DAF7E2596A6B5", -CHECK-NEXT: "point-symbol-info" : { -CHECK-NEXT: "test/tools/sancov/Inputs/foo.cpp" : { -CHECK-NEXT: "foo()" : { -CHECK-NEXT: "4e178c" : "5:0" -CHECK-NEXT: } -CHECK-NEXT: }, -CHECK-NEXT: "test/tools/sancov/Inputs/test.cpp" : { -CHECK-NEXT: "bar(std::string)" : { -CHECK-NEXT: "4e132b" : "12:0" +CHECK-NEXT: "covered-points": [ +CHECK-NEXT: "4e132b", +CHECK-NEXT: "4e1472", +CHECK-NEXT: "4e1520", +CHECK-NEXT: "4e1553", +CHECK-NEXT: "4e1586" +CHECK-NEXT: ], +CHECK-NEXT: "binary-hash": 
"BB3CDD5045AED83906F6ADCC1C4DAF7E2596A6B5", +CHECK-NEXT: "point-symbol-info": { +CHECK-NEXT: "test/tools/sancov/Inputs/foo.cpp": { +CHECK-NEXT: "foo()": { +CHECK-NEXT: "4e178c": "5:0" +CHECK-NEXT: } +CHECK-NEXT: }, +CHECK-NEXT: "test/tools/sancov/Inputs/test.cpp": { +CHECK-NEXT: "bar(std::string)": { +CHECK-NEXT: "4e132b": "12:0" CHECK-NEXT: }, -CHECK-NEXT: "main" : { -CHECK-NEXT: "4e1472" : "14:0", -CHECK-NEXT: "4e14c2" : "16:9", -CHECK-NEXT: "4e1520" : "17:5", -CHECK-NEXT: "4e1553" : "17:5", -CHECK-NEXT: "4e1586" : "17:5", -CHECK-NEXT: "4e1635" : "19:1", -CHECK-NEXT: "4e1690" : "17:5" +CHECK-NEXT: "main": { +CHECK-NEXT: "4e1472": "14:0", +CHECK-NEXT: "4e14c2": "16:9", +CHECK-NEXT: "4e1520": "17:5", +CHECK-NEXT: "4e1553": "17:5", +CHECK-NEXT: "4e1586": "17:5", +CHECK-NEXT: "4e1635": "19:1", +CHECK-NEXT: "4e1690": "17:5" CHECK-NEXT: } CHECK-NEXT: } CHECK-NEXT: } CHECK-NEXT:} - Modified: llvm/trunk/tools/sancov/sancov.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/sancov/sancov.cpp?rev=374628&r1=374627&r2=374628&view=diff ============================================================================== --- llvm/trunk/tools/sancov/sancov.cpp (original) +++ llvm/trunk/tools/sancov/sancov.cpp Fri Oct 11 19:29:24 2019 @@ -31,6 +31,7 @@ #include "llvm/Support/Errc.h" #include "llvm/Support/ErrorOr.h" #include "llvm/Support/FileSystem.h" +#include "llvm/Support/JSON.h" #include "llvm/Support/MD5.h" #include "llvm/Support/ManagedStatic.h" #include "llvm/Support/MemoryBuffer.h" @@ -284,87 +285,6 @@ static raw_ostream &operator<<(raw_ostre return OS; } -// Helper for writing out JSON. Handles indents and commas using -// scope variables for objects and arrays. -class JSONWriter { -public: - JSONWriter(raw_ostream &Out) : OS(Out) {} - JSONWriter(const JSONWriter &) = delete; - ~JSONWriter() { OS << "\n"; } - - void operator<<(StringRef S) { printJSONStringLiteral(S, OS); } - - // Helper RAII class to output JSON objects. 
- class Object { - public: - Object(JSONWriter *W, raw_ostream &OS) : W(W), OS(OS) { - OS << "{"; - W->Indent++; - } - ~Object() { - W->Indent--; - OS << "\n"; - W->indent(); - OS << "}"; - } - - void key(StringRef Key) { - Index++; - if (Index > 0) - OS << ","; - OS << "\n"; - W->indent(); - printJSONStringLiteral(Key, OS); - OS << " : "; - } - - private: - JSONWriter *W; - raw_ostream &OS; - int Index = -1; - }; - - Object object() { return {this, OS}; } - - // Helper RAII class to output JSON arrays. - class Array { - public: - Array(raw_ostream &OS) : OS(OS) { OS << "["; } - ~Array() { OS << "]"; } - void next() { - Index++; - if (Index > 0) - OS << ", "; - } - - private: - raw_ostream &OS; - int Index = -1; - }; - - Array array() { return {OS}; } - -private: - void indent() { OS.indent(Indent * 2); } - - static void printJSONStringLiteral(StringRef S, raw_ostream &OS) { - if (S.find('"') == std::string::npos) { - OS << "\"" << S << "\""; - return; - } - OS << "\""; - for (char Ch : S.bytes()) { - if (Ch == '"') - OS << "\\"; - OS << Ch; - } - OS << "\""; - } - - raw_ostream &OS; - int Indent = 0; -}; - // Output symbolized information for coverage points in JSON. // Format: // { @@ -375,10 +295,9 @@ private: // } // } // } -static void operator<<(JSONWriter &W, +static void operator<<(json::OStream &W, const std::vector &Points) { // Group points by file. - auto ByFile(W.object()); std::map> PointsByFile; for (const auto &Point : Points) { for (const DILineInfo &Loc : Point.Locs) { @@ -388,10 +307,6 @@ static void operator<<(JSONWriter &W, for (const auto &P : PointsByFile) { std::string FileName = P.first; - ByFile.key(FileName); - - // Group points by function. 
- auto ByFn(W.object()); std::map> PointsByFn; for (auto PointPtr : P.second) { for (const DILineInfo &Loc : PointPtr->Locs) { @@ -399,54 +314,42 @@ static void operator<<(JSONWriter &W, } } - for (const auto &P : PointsByFn) { - std::string FunctionName = P.first; - std::set WrittenIds; - - ByFn.key(FunctionName); - - // Output : ":". - auto ById(W.object()); - for (const CoveragePoint *Point : P.second) { - for (const auto &Loc : Point->Locs) { - if (Loc.FileName != FileName || Loc.FunctionName != FunctionName) - continue; - if (WrittenIds.find(Point->Id) != WrittenIds.end()) - continue; - - WrittenIds.insert(Point->Id); - ById.key(Point->Id); - W << (utostr(Loc.Line) + ":" + utostr(Loc.Column)); - } + W.attributeObject(P.first, [&] { + // Group points by function. + for (const auto &P : PointsByFn) { + std::string FunctionName = P.first; + std::set WrittenIds; + + W.attributeObject(FunctionName, [&] { + for (const CoveragePoint *Point : P.second) { + for (const auto &Loc : Point->Locs) { + if (Loc.FileName != FileName || Loc.FunctionName != FunctionName) + continue; + if (WrittenIds.find(Point->Id) != WrittenIds.end()) + continue; + + // Output : ":". 
+ WrittenIds.insert(Point->Id); + W.attribute(Point->Id, + (utostr(Loc.Line) + ":" + utostr(Loc.Column))); + } + } + }); } - } + }); } } -static void operator<<(JSONWriter &W, const SymbolizedCoverage &C) { - auto O(W.object()); - - { - O.key("covered-points"); - auto PointsArray(W.array()); - - for (const std::string &P : C.CoveredIds) { - PointsArray.next(); - W << P; - } - } - - { - if (!C.BinaryHash.empty()) { - O.key("binary-hash"); - W << C.BinaryHash; - } - } - - { - O.key("point-symbol-info"); - W << C.Points; - } +static void operator<<(json::OStream &W, const SymbolizedCoverage &C) { + W.object([&] { + W.attributeArray("covered-points", [&] { + for (const std::string &P : C.CoveredIds) { + W.value(P); + } + }); + W.attribute("binary-hash", C.BinaryHash); + W.attributeObject("point-symbol-info", [&] { W << C.Points; }); + }); } static std::string parseScalarString(yaml::Node *N) { @@ -1275,7 +1178,7 @@ int main(int Argc, char **Argv) { } case MergeAction: case SymbolizeAction: { // merge & symbolize are synonims. - JSONWriter W(outs()); + json::OStream W(outs(), 2); W << *Coverage; return 0; } From llvm-commits at lists.llvm.org Fri Oct 11 19:29:26 2019 From: llvm-commits at lists.llvm.org (Vitaly Buka via llvm-commits) Date: Sat, 12 Oct 2019 02:29:26 -0000 Subject: [llvm] r374629 - [sancov] Accommodate sancov and coverage report server for use under Windows Message-ID: <20191012022926.74074934DF@lists.llvm.org> Author: vitalybuka Date: Fri Oct 11 19:29:26 2019 New Revision: 374629 URL: http://llvm.org/viewvc/llvm-project?rev=374629&view=rev Log: [sancov] Accommodate sancov and coverage report server for use under Windows Summary: This patch makes the following changes to SanCov and its complementary Python script in order to resolve issues pertaining to non-UNIX file paths in JSON symbolization information: * Convert all paths to use forward slash. * Update `coverage-report-server.py` to correctly handle paths to sources which contain spaces. 
* Remove Linux platform restriction for all SanCov unit tests. All SanCov tests passed when run on my local Windows machine. Patch by Douglas Gliner. Reviewers: kcc, filcab, phosek, morehouse, vitalybuka, metzman Reviewed By: vitalybuka Subscribers: vsk, Dor1s, llvm-commits Tags: #sanitizers, #llvm Differential Revision: https://reviews.llvm.org/D51018 Modified: llvm/trunk/test/tools/sancov/blacklist.test llvm/trunk/test/tools/sancov/covered_functions.test llvm/trunk/test/tools/sancov/merge.test llvm/trunk/test/tools/sancov/not_covered_functions.test llvm/trunk/test/tools/sancov/print.test llvm/trunk/test/tools/sancov/stats.test llvm/trunk/test/tools/sancov/symbolize.test llvm/trunk/test/tools/sancov/symbolize_noskip_dead_files.test llvm/trunk/test/tools/sancov/validation.test llvm/trunk/tools/sancov/coverage-report-server.py llvm/trunk/tools/sancov/sancov.cpp Modified: llvm/trunk/test/tools/sancov/blacklist.test URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/sancov/blacklist.test?rev=374629&r1=374628&r2=374629&view=diff ============================================================================== --- llvm/trunk/test/tools/sancov/blacklist.test (original) +++ llvm/trunk/test/tools/sancov/blacklist.test Fri Oct 11 19:29:26 2019 @@ -1,4 +1,4 @@ -REQUIRES: x86_64-linux +REQUIRES: x86-registered-target RUN: sancov -covered-functions %p/Inputs/test-linux_x86_64 %p/Inputs/test-linux_x86_64.0.sancov | FileCheck %s --check-prefix=ALL RUN: sancov -covered-functions -blacklist %p/Inputs/fun_blacklist.txt %p/Inputs/test-linux_x86_64 %p/Inputs/test-linux_x86_64.0.sancov | FileCheck %s RUN: sancov -covered-functions -blacklist %p/Inputs/src_blacklist.txt %p/Inputs/test-linux_x86_64 %p/Inputs/test-linux_x86_64.1.sancov | FileCheck --check-prefix=CHECK1 %s Modified: llvm/trunk/test/tools/sancov/covered_functions.test URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/sancov/covered_functions.test?rev=374629&r1=374628&r2=374629&view=diff
============================================================================== --- llvm/trunk/test/tools/sancov/covered_functions.test (original) +++ llvm/trunk/test/tools/sancov/covered_functions.test Fri Oct 11 19:29:26 2019 @@ -1,4 +1,4 @@ -REQUIRES: x86_64-linux +REQUIRES: x86-registered-target RUN: sancov -covered-functions %p/Inputs/test-linux_x86_64 %p/Inputs/test-linux_x86_64.0.sancov | FileCheck %s RUN: sancov -covered-functions -strip_path_prefix=Inputs/ %p/Inputs/test-linux_x86_64 %p/Inputs/test-linux_x86_64.0.sancov | FileCheck --check-prefix=STRIP_PATH %s RUN: sancov -demangle=0 -covered-functions %p/Inputs/test-linux_x86_64 %p/Inputs/test-linux_x86_64.0.sancov | FileCheck --check-prefix=NO_DEMANGLE %s Modified: llvm/trunk/test/tools/sancov/merge.test URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/sancov/merge.test?rev=374629&r1=374628&r2=374629&view=diff ============================================================================== --- llvm/trunk/test/tools/sancov/merge.test (original) +++ llvm/trunk/test/tools/sancov/merge.test Fri Oct 11 19:29:26 2019 @@ -1,4 +1,4 @@ -REQUIRES: x86_64-linux +REQUIRES: x86-registered-target RUN: sancov -merge %p/Inputs/test-linux_x86_64.0.symcov| FileCheck --check-prefix=MERGE1 %s RUN: sancov -merge %p/Inputs/test-linux_x86_64.0.symcov %p/Inputs/test-linux_x86_64.1.symcov| FileCheck --check-prefix=MERGE2 %s Modified: llvm/trunk/test/tools/sancov/not_covered_functions.test URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/sancov/not_covered_functions.test?rev=374629&r1=374628&r2=374629&view=diff ============================================================================== --- llvm/trunk/test/tools/sancov/not_covered_functions.test (original) +++ llvm/trunk/test/tools/sancov/not_covered_functions.test Fri Oct 11 19:29:26 2019 @@ -1,4 +1,4 @@ -REQUIRES: x86_64-linux +REQUIRES: x86-registered-target RUN: sancov -skip-dead-files=0 -not-covered-functions %p/Inputs/test-linux_x86_64 
%p/Inputs/test-linux_x86_64.0.sancov | FileCheck %s RUN: sancov -not-covered-functions %p/Inputs/test-linux_x86_64 %p/Inputs/test-linux_x86_64.1.sancov | FileCheck --check-prefix=CHECK1 --allow-empty %s Modified: llvm/trunk/test/tools/sancov/print.test URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/sancov/print.test?rev=374629&r1=374628&r2=374629&view=diff ============================================================================== --- llvm/trunk/test/tools/sancov/print.test (original) +++ llvm/trunk/test/tools/sancov/print.test Fri Oct 11 19:29:26 2019 @@ -1,4 +1,4 @@ -REQUIRES: x86_64-linux +REQUIRES: x86-registered-target RUN: sancov -print %p/Inputs/test-linux_x86_64.0.sancov | FileCheck %s CHECK: 0x4e132b Modified: llvm/trunk/test/tools/sancov/stats.test URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/sancov/stats.test?rev=374629&r1=374628&r2=374629&view=diff ============================================================================== --- llvm/trunk/test/tools/sancov/stats.test (original) +++ llvm/trunk/test/tools/sancov/stats.test Fri Oct 11 19:29:26 2019 @@ -1,4 +1,4 @@ -REQUIRES: x86_64-linux +REQUIRES: x86-registered-target RUN: sancov -print-coverage-stats %p/Inputs/test-linux_x86_64 %p/Inputs/test-linux_x86_64.0.sancov | FileCheck %s CHECK: all-edges: 8 Modified: llvm/trunk/test/tools/sancov/symbolize.test URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/sancov/symbolize.test?rev=374629&r1=374628&r2=374629&view=diff ============================================================================== --- llvm/trunk/test/tools/sancov/symbolize.test (original) +++ llvm/trunk/test/tools/sancov/symbolize.test Fri Oct 11 19:29:26 2019 @@ -1,5 +1,6 @@ -REQUIRES: x86_64-linux -RUN: sancov -symbolize -strip_path_prefix="llvm/" %p/Inputs/test-linux_x86_64 %p/Inputs/test-linux_x86_64.0.sancov | FileCheck %s +REQUIRES: x86-registered-target +RUN: sancov -symbolize -strip_path_prefix="llvm/" %p/Inputs/test-linux_x86_64 
%p/Inputs/test-linux_x86_64.0.sancov | FileCheck %s --check-prefixes=CHECK,STRIP +RUN: sancov -symbolize %p/Inputs/test-linux_x86_64 %p/Inputs/test-linux_x86_64.0.sancov | FileCheck %s --check-prefixes=CHECK,NOSTRIP CHECK: { CHECK-NEXT: "covered-points": [ @@ -11,7 +12,8 @@ CHECK-NEXT: "4e1586" CHECK-NEXT: ], CHECK-NEXT: "binary-hash": "BB3CDD5045AED83906F6ADCC1C4DAF7E2596A6B5", CHECK-NEXT: "point-symbol-info": { -CHECK-NEXT: "test/tools/sancov/Inputs/test.cpp": { +STRIP-NEXT: "test/tools/sancov/Inputs/test.cpp": { +NOSTRIP-NEXT: "/usr/local/google/home/aizatsky/src/llvm/test/tools/sancov/Inputs/test.cpp": { CHECK-NEXT: "bar(std::string)": { CHECK-NEXT: "4e132b": "12:0" CHECK-NEXT: }, Modified: llvm/trunk/test/tools/sancov/symbolize_noskip_dead_files.test URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/sancov/symbolize_noskip_dead_files.test?rev=374629&r1=374628&r2=374629&view=diff ============================================================================== --- llvm/trunk/test/tools/sancov/symbolize_noskip_dead_files.test (original) +++ llvm/trunk/test/tools/sancov/symbolize_noskip_dead_files.test Fri Oct 11 19:29:26 2019 @@ -1,4 +1,4 @@ -REQUIRES: x86_64-linux +REQUIRES: x86-registered-target RUN: sancov -symbolize -skip-dead-files=0 -strip_path_prefix="llvm/" %p/Inputs/test-linux_x86_64 %p/Inputs/test-linux_x86_64.0.sancov | FileCheck %s CHECK: { Modified: llvm/trunk/test/tools/sancov/validation.test URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/sancov/validation.test?rev=374629&r1=374628&r2=374629&view=diff ============================================================================== --- llvm/trunk/test/tools/sancov/validation.test (original) +++ llvm/trunk/test/tools/sancov/validation.test Fri Oct 11 19:29:26 2019 @@ -1,4 +1,4 @@ -REQUIRES: x86_64-linux +REQUIRES: x86-registered-target RUN: not sancov -covered-functions %p/Inputs/test-linux_x86_64 2>&1 | FileCheck --check-prefix=NOCFILE %s NOCFILE: WARNING: No coverage file 
for {{.*}}test-linux_x86_64 Modified: llvm/trunk/tools/sancov/coverage-report-server.py URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/sancov/coverage-report-server.py?rev=374629&r1=374628&r2=374629&view=diff ============================================================================== --- llvm/trunk/tools/sancov/coverage-report-server.py (original) +++ llvm/trunk/tools/sancov/coverage-report-server.py Fri Oct 11 19:29:26 2019 @@ -32,6 +32,7 @@ import html import os import string import math +import urllib INDEX_PAGE_TMPL = """ @@ -128,6 +129,7 @@ class ServerHandler(http.server.BaseHTTP src_path = None def do_GET(self): + norm_path = os.path.normpath(urllib.parse.unquote(self.path[1:])) if self.path == '/': self.send_response(200) self.send_header("Content-type", "text/html; charset=utf-8") @@ -147,8 +149,8 @@ class ServerHandler(http.server.BaseHTTP response = string.Template(INDEX_PAGE_TMPL).safe_substitute( filenames='\n'.join(filelist)) self.wfile.write(response.encode('UTF-8', 'replace')) - elif self.symcov_data.has_file(self.path[1:]): - filename = self.path[1:] + elif self.symcov_data.has_file(norm_path): + filename = norm_path filepath = os.path.join(self.src_path, filename) if not os.path.exists(filepath): self.send_response(404) Modified: llvm/trunk/tools/sancov/sancov.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/sancov/sancov.cpp?rev=374629&r1=374628&r2=374629&view=diff ============================================================================== --- llvm/trunk/tools/sancov/sancov.cpp (original) +++ llvm/trunk/tools/sancov/sancov.cpp Fri Oct 11 19:29:26 2019 @@ -469,7 +469,7 @@ static std::unique_ptr S(FileName); sys::path::remove_dots(S, /* remove_dot_dot */ true); - return stripPathPrefix(S.str().str()); + return stripPathPrefix(sys::path::convert_to_slash(S.str())); } class Blacklists { From llvm-commits at lists.llvm.org Fri Oct 11 19:27:22 2019 From: llvm-commits at lists.llvm.org (Amy Kwan via Phabricator via 
llvm-commits) Date: Sat, 12 Oct 2019 02:27:22 +0000 (UTC) Subject: [PATCH] D68443: [PowerPC] Spill CR LT bits on P9 using setb In-Reply-To: References: Message-ID: <379ee34a38a33316347290896e3c789f@localhost.localdomain> amyk added a comment. Herald added a subscriber: wuzish. Ping. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68443/new/ https://reviews.llvm.org/D68443 From llvm-commits at lists.llvm.org Fri Oct 11 19:36:36 2019 From: llvm-commits at lists.llvm.org (David Li via Phabricator via llvm-commits) Date: Sat, 12 Oct 2019 02:36:36 +0000 (UTC) Subject: [PATCH] D68898: JumpThreading: enhance JT to handle BB with no successor and address comparison In-Reply-To: References: Message-ID: <1b42d6b3941a195fe8b48826de7c3e27@localhost.localdomain> davidxl added a comment. JumpThreading is basically basic block cloning followed by control flow simplification. This is just a special case where the second part is missing. There is already another special case in JT -- if all the Pred's target successor is the same, there is no threading either -- basically there is only control flow simplification part without the basic cloning. These two cases are just at two different ends of the spectrum. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68898/new/ https://reviews.llvm.org/D68898 From llvm-commits at lists.llvm.org Fri Oct 11 19:36:42 2019 From: llvm-commits at lists.llvm.org (Vitaly Buka via Phabricator via llvm-commits) Date: Sat, 12 Oct 2019 02:36:42 +0000 (UTC) Subject: [PATCH] D68752: [sancov] Use LLVM Support library JSON writer in favor of individual implementation In-Reply-To: References: Message-ID: <5dd21441d42fe6d9dfb2533c0f565b1e@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rGe8a462a01923: [sancov] Use LLVM Support library JSON writer in favor of individual… (authored by vitalybuka). 
Changed prior to commit: https://reviews.llvm.org/D68752?vs=224214&id=224723#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68752/new/ https://reviews.llvm.org/D68752 Files: llvm/test/tools/sancov/merge.test llvm/test/tools/sancov/symbolize.test llvm/test/tools/sancov/symbolize_noskip_dead_files.test llvm/tools/sancov/sancov.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D68752.224723.patch Type: text/x-patch Size: 14548 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Fri Oct 11 19:36:43 2019 From: llvm-commits at lists.llvm.org (Vitaly Buka via Phabricator via llvm-commits) Date: Sat, 12 Oct 2019 02:36:43 +0000 (UTC) Subject: [PATCH] D51018: [sancov] Accommodate sancov and coverage report server for use under Windows In-Reply-To: References: Message-ID: <93f37848e17e0ccef5c362c911713eb7@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rG23aa2aec7818: [sancov] Accommodate sancov and coverage report server for use under Windows (authored by vitalybuka). Changed prior to commit: https://reviews.llvm.org/D51018?vs=224223&id=224724#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D51018/new/ https://reviews.llvm.org/D51018 Files: llvm/test/tools/sancov/blacklist.test llvm/test/tools/sancov/covered_functions.test llvm/test/tools/sancov/merge.test llvm/test/tools/sancov/not_covered_functions.test llvm/test/tools/sancov/print.test llvm/test/tools/sancov/stats.test llvm/test/tools/sancov/symbolize.test llvm/test/tools/sancov/symbolize_noskip_dead_files.test llvm/test/tools/sancov/validation.test llvm/tools/sancov/coverage-report-server.py llvm/tools/sancov/sancov.cpp -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D51018.224724.patch Type: text/x-patch Size: 7015 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Fri Oct 11 19:53:04 2019 From: llvm-commits at lists.llvm.org (Zi Xuan Wu via llvm-commits) Date: Sat, 12 Oct 2019 02:53:04 -0000 Subject: [llvm] r374634 - recommit: [LoopVectorize][PowerPC] Estimate int and float register pressure separately in loop-vectorize Message-ID: <20191012025304.A8EDD87B28@lists.llvm.org> Author: wuzish Date: Fri Oct 11 19:53:04 2019 New Revision: 374634 URL: http://llvm.org/viewvc/llvm-project?rev=374634&view=rev

Log:
recommit: [LoopVectorize][PowerPC] Estimate int and float register pressure separately in loop-vectorize

In loop-vectorize, the interleave count and vectorization factor depend on the number of target registers. Currently, register pressure is not estimated separately for different register classes (in particular, scalar float types should not be counted in the same pool as int types), so the estimate is not accurate. Specifically, it causes too much interleaving/unrolling, which results in too many register spills in the loop body and hurts performance.

So we need to classify registers at the IR level. Importantly, these are abstract register classes, not the target register classes that the backend defines in its .td files. They are used to establish the mapping between the types of IR values and the number of simultaneous live ranges to which we'd like to limit some set of those types.

For example, on the POWER target the register counts are special when VSX is enabled. With VSX enabled, there are 32 int scalar registers (GPR) and 64 float scalar registers (VSR), and int and float vector registers are both 64 (VSR). So there should be 2 kinds of register class when VSX is enabled, and 3 kinds of register class when VSX is NOT enabled.

On the POWER target, this gives a big (+~30%) performance improvement in one specific benchmark (503.bwaves_r) of SPEC2017, with no other obvious regressions.
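
The POWER mapping described in the log can be modeled in a few lines of standalone C++. This is only an illustrative sketch, not the code in the patch: the function names `registerClassFor` and `numberOfRegisters` and the boolean parameters `HasVSX`, `IsVector` and `IsFloat` are assumptions made here so the example compiles without LLVM headers (the real hooks take a `Type *` and query the subtarget). The class names and register counts are the ones given in the log and the diff.

```cpp
#include <cassert>

// Abstract register classes, using the names that appear in the patch.
enum PPCRegisterClass { GPRRC, FPRRC, VRRC, VSXRC };

// Hypothetical stand-in for the type-to-class hook. The real hook inspects
// an llvm::Type; here the caller states whether the value is a vector and
// whether it is a (scalar) float.
unsigned registerClassFor(bool HasVSX, bool IsVector, bool IsFloat) {
  if (IsVector)
    return HasVSX ? VSXRC : VRRC;
  if (IsFloat)
    return HasVSX ? VSXRC : FPRRC;
  return GPRRC;
}

// Hypothetical stand-in for the register-count hook: with VSX there are
// 32 GPRs and 64 VSX registers; without VSX every class has 32 registers.
unsigned numberOfRegisters(bool HasVSX, unsigned ClassID) {
  assert(ClassID <= VSXRC && "unknown register class");
  if (HasVSX)
    return ClassID == GPRRC ? 32 : 64;
  return 32;
}
```

With VSX, only two classes are ever returned (GPRRC and VSXRC); without VSX, three are (GPRRC, FPRRC and VRRC), which matches the "2 kinds" versus "3 kinds" statement in the log.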
Differential revision: https://reviews.llvm.org/D67148 Added: llvm/trunk/test/Transforms/LoopVectorize/PowerPC/reg-usage.ll Modified: llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h llvm/trunk/include/llvm/Analysis/TargetTransformInfoImpl.h llvm/trunk/include/llvm/CodeGen/BasicTTIImpl.h llvm/trunk/lib/Analysis/TargetTransformInfo.cpp llvm/trunk/lib/Target/AArch64/AArch64TargetTransformInfo.h llvm/trunk/lib/Target/ARM/ARMTargetTransformInfo.h llvm/trunk/lib/Target/PowerPC/PPCTargetTransformInfo.cpp llvm/trunk/lib/Target/PowerPC/PPCTargetTransformInfo.h llvm/trunk/lib/Target/SystemZ/SystemZTargetTransformInfo.cpp llvm/trunk/lib/Target/SystemZ/SystemZTargetTransformInfo.h llvm/trunk/lib/Target/WebAssembly/WebAssemblyTargetTransformInfo.cpp llvm/trunk/lib/Target/WebAssembly/WebAssemblyTargetTransformInfo.h llvm/trunk/lib/Target/X86/X86TargetTransformInfo.cpp llvm/trunk/lib/Target/X86/X86TargetTransformInfo.h llvm/trunk/lib/Target/XCore/XCoreTargetTransformInfo.h llvm/trunk/lib/Transforms/Scalar/LoopStrengthReduce.cpp llvm/trunk/lib/Transforms/Vectorize/LoopVectorize.cpp llvm/trunk/lib/Transforms/Vectorize/SLPVectorizer.cpp llvm/trunk/test/Transforms/LoopVectorize/X86/reg-usage-debug.ll llvm/trunk/test/Transforms/LoopVectorize/X86/reg-usage.ll Modified: llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h?rev=374634&r1=374633&r2=374634&view=diff ============================================================================== --- llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h (original) +++ llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h Fri Oct 11 19:53:04 2019 @@ -788,10 +788,23 @@ public: /// Additional properties of an operand's values. enum OperandValueProperties { OP_None = 0, OP_PowerOf2 = 1 }; - /// \return The number of scalar or vector registers that the target has. - /// If 'Vectors' is true, it returns the number of vector registers. 
If it is - /// set to false, it returns the number of scalar registers. - unsigned getNumberOfRegisters(bool Vector) const; + /// \return the number of registers in the target-provided register class. + unsigned getNumberOfRegisters(unsigned ClassID) const; + + /// \return the target-provided register class ID for the provided type, + /// accounting for type promotion and other type-legalization techniques that the target might apply. + /// However, it specifically does not account for the scalarization or splitting of vector types. + /// Should a vector type require scalarization or splitting into multiple underlying vector registers, + /// that type should be mapped to a register class containing no registers. + /// Specifically, this is designed to provide a simple, high-level view of the register allocation + /// later performed by the backend. These register classes don't necessarily map onto the + /// register classes used by the backend. + /// FIXME: It's not currently possible to determine how many registers + /// are used by the provided type. + unsigned getRegisterClassForType(bool Vector, Type *Ty = nullptr) const; + + /// \return the target-provided register class name + const char* getRegisterClassName(unsigned ClassID) const; /// \return The width of the largest scalar or vector register type. 
unsigned getRegisterBitWidth(bool Vector) const; @@ -1245,7 +1258,9 @@ public: Type *Ty) = 0; virtual int getIntImmCost(Intrinsic::ID IID, unsigned Idx, const APInt &Imm, Type *Ty) = 0; - virtual unsigned getNumberOfRegisters(bool Vector) = 0; + virtual unsigned getNumberOfRegisters(unsigned ClassID) const = 0; + virtual unsigned getRegisterClassForType(bool Vector, Type *Ty = nullptr) const = 0; + virtual const char* getRegisterClassName(unsigned ClassID) const = 0; virtual unsigned getRegisterBitWidth(bool Vector) const = 0; virtual unsigned getMinVectorRegisterBitWidth() = 0; virtual bool shouldMaximizeVectorBandwidth(bool OptSize) const = 0; @@ -1602,8 +1617,14 @@ public: Type *Ty) override { return Impl.getIntImmCost(IID, Idx, Imm, Ty); } - unsigned getNumberOfRegisters(bool Vector) override { - return Impl.getNumberOfRegisters(Vector); + unsigned getNumberOfRegisters(unsigned ClassID) const override { + return Impl.getNumberOfRegisters(ClassID); + } + unsigned getRegisterClassForType(bool Vector, Type *Ty = nullptr) const override { + return Impl.getRegisterClassForType(Vector, Ty); + } + const char* getRegisterClassName(unsigned ClassID) const override { + return Impl.getRegisterClassName(ClassID); } unsigned getRegisterBitWidth(bool Vector) const override { return Impl.getRegisterBitWidth(Vector); Modified: llvm/trunk/include/llvm/Analysis/TargetTransformInfoImpl.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Analysis/TargetTransformInfoImpl.h?rev=374634&r1=374633&r2=374634&view=diff ============================================================================== --- llvm/trunk/include/llvm/Analysis/TargetTransformInfoImpl.h (original) +++ llvm/trunk/include/llvm/Analysis/TargetTransformInfoImpl.h Fri Oct 11 19:53:04 2019 @@ -354,7 +354,20 @@ public: return TTI::TCC_Free; } - unsigned getNumberOfRegisters(bool Vector) { return 8; } + unsigned getNumberOfRegisters(unsigned ClassID) const { return 8; } + + unsigned 
getRegisterClassForType(bool Vector, Type *Ty = nullptr) const { + return Vector ? 1 : 0; + }; + + const char* getRegisterClassName(unsigned ClassID) const { + switch (ClassID) { + default: + return "Generic::Unknown Register Class"; + case 0: return "Generic::ScalarRC"; + case 1: return "Generic::VectorRC"; + } + } unsigned getRegisterBitWidth(bool Vector) const { return 32; } Modified: llvm/trunk/include/llvm/CodeGen/BasicTTIImpl.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/CodeGen/BasicTTIImpl.h?rev=374634&r1=374633&r2=374634&view=diff ============================================================================== --- llvm/trunk/include/llvm/CodeGen/BasicTTIImpl.h (original) +++ llvm/trunk/include/llvm/CodeGen/BasicTTIImpl.h Fri Oct 11 19:53:04 2019 @@ -553,8 +553,6 @@ public: /// \name Vector TTI Implementations /// @{ - unsigned getNumberOfRegisters(bool Vector) { return Vector ? 0 : 1; } - unsigned getRegisterBitWidth(bool Vector) const { return 32; } /// Estimate the overhead of scalarizing an instruction. 
Insert and Extract Modified: llvm/trunk/lib/Analysis/TargetTransformInfo.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/TargetTransformInfo.cpp?rev=374634&r1=374633&r2=374634&view=diff ============================================================================== --- llvm/trunk/lib/Analysis/TargetTransformInfo.cpp (original) +++ llvm/trunk/lib/Analysis/TargetTransformInfo.cpp Fri Oct 11 19:53:04 2019 @@ -466,8 +466,16 @@ int TargetTransformInfo::getIntImmCost(I return Cost; } -unsigned TargetTransformInfo::getNumberOfRegisters(bool Vector) const { - return TTIImpl->getNumberOfRegisters(Vector); +unsigned TargetTransformInfo::getNumberOfRegisters(unsigned ClassID) const { + return TTIImpl->getNumberOfRegisters(ClassID); +} + +unsigned TargetTransformInfo::getRegisterClassForType(bool Vector, Type *Ty) const { + return TTIImpl->getRegisterClassForType(Vector, Ty); +} + +const char* TargetTransformInfo::getRegisterClassName(unsigned ClassID) const { + return TTIImpl->getRegisterClassName(ClassID); } unsigned TargetTransformInfo::getRegisterBitWidth(bool Vector) const { Modified: llvm/trunk/lib/Target/AArch64/AArch64TargetTransformInfo.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AArch64/AArch64TargetTransformInfo.h?rev=374634&r1=374633&r2=374634&view=diff ============================================================================== --- llvm/trunk/lib/Target/AArch64/AArch64TargetTransformInfo.h (original) +++ llvm/trunk/lib/Target/AArch64/AArch64TargetTransformInfo.h Fri Oct 11 19:53:04 2019 @@ -85,7 +85,8 @@ public: bool enableInterleavedAccessVectorization() { return true; } - unsigned getNumberOfRegisters(bool Vector) { + unsigned getNumberOfRegisters(unsigned ClassID) const { + bool Vector = (ClassID == 1); if (Vector) { if (ST->hasNEON()) return 32; Modified: llvm/trunk/lib/Target/ARM/ARMTargetTransformInfo.h URL: 
http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMTargetTransformInfo.h?rev=374634&r1=374633&r2=374634&view=diff ============================================================================== --- llvm/trunk/lib/Target/ARM/ARMTargetTransformInfo.h (original) +++ llvm/trunk/lib/Target/ARM/ARMTargetTransformInfo.h Fri Oct 11 19:53:04 2019 @@ -122,7 +122,8 @@ public: /// \name Vector TTI Implementations /// @{ - unsigned getNumberOfRegisters(bool Vector) { + unsigned getNumberOfRegisters(unsigned ClassID) const { + bool Vector = (ClassID == 1); if (Vector) { if (ST->hasNEON()) return 16; Modified: llvm/trunk/lib/Target/PowerPC/PPCTargetTransformInfo.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/PowerPC/PPCTargetTransformInfo.cpp?rev=374634&r1=374633&r2=374634&view=diff ============================================================================== --- llvm/trunk/lib/Target/PowerPC/PPCTargetTransformInfo.cpp (original) +++ llvm/trunk/lib/Target/PowerPC/PPCTargetTransformInfo.cpp Fri Oct 11 19:53:04 2019 @@ -594,10 +594,37 @@ bool PPCTTIImpl::enableInterleavedAccess return true; } -unsigned PPCTTIImpl::getNumberOfRegisters(bool Vector) { - if (Vector && !ST->hasAltivec() && !ST->hasQPX()) - return 0; - return ST->hasVSX() ? 64 : 32; +unsigned PPCTTIImpl::getNumberOfRegisters(unsigned ClassID) const { + assert(ClassID == GPRRC || ClassID == FPRRC || + ClassID == VRRC || ClassID == VSXRC); + if (ST->hasVSX()) { + assert(ClassID == GPRRC || ClassID == VSXRC); + return ClassID == GPRRC ? 32 : 64; + } + assert(ClassID == GPRRC || ClassID == FPRRC || ClassID == VRRC); + return 32; +} + +unsigned PPCTTIImpl::getRegisterClassForType(bool Vector, Type *Ty) const { + if (Vector) + return ST->hasVSX() ? VSXRC : VRRC; + else if (Ty && Ty->getScalarType()->isFloatTy()) + return ST->hasVSX() ? 
VSXRC : FPRRC; + else + return GPRRC; +} + +const char* PPCTTIImpl::getRegisterClassName(unsigned ClassID) const { + + switch (ClassID) { + default: + llvm_unreachable("unknown register class"); + return "PPC::unknown register class"; + case GPRRC: return "PPC::GPRRC"; + case FPRRC: return "PPC::FPRRC"; + case VRRC: return "PPC::VRRC"; + case VSXRC: return "PPC::VSXRC"; + } } unsigned PPCTTIImpl::getRegisterBitWidth(bool Vector) const { Modified: llvm/trunk/lib/Target/PowerPC/PPCTargetTransformInfo.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/PowerPC/PPCTargetTransformInfo.h?rev=374634&r1=374633&r2=374634&view=diff ============================================================================== --- llvm/trunk/lib/Target/PowerPC/PPCTargetTransformInfo.h (original) +++ llvm/trunk/lib/Target/PowerPC/PPCTargetTransformInfo.h Fri Oct 11 19:53:04 2019 @@ -72,7 +72,13 @@ public: TTI::MemCmpExpansionOptions enableMemCmpExpansion(bool OptSize, bool IsZeroCmp) const; bool enableInterleavedAccessVectorization(); - unsigned getNumberOfRegisters(bool Vector); + + enum PPCRegisterClass { + GPRRC, FPRRC, VRRC, VSXRC + }; + unsigned getNumberOfRegisters(unsigned ClassID) const; + unsigned getRegisterClassForType(bool Vector, Type *Ty = nullptr) const; + const char* getRegisterClassName(unsigned ClassID) const; unsigned getRegisterBitWidth(bool Vector) const; unsigned getCacheLineSize() const override; unsigned getPrefetchDistance() const override; Modified: llvm/trunk/lib/Target/SystemZ/SystemZTargetTransformInfo.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/SystemZ/SystemZTargetTransformInfo.cpp?rev=374634&r1=374633&r2=374634&view=diff ============================================================================== --- llvm/trunk/lib/Target/SystemZ/SystemZTargetTransformInfo.cpp (original) +++ llvm/trunk/lib/Target/SystemZ/SystemZTargetTransformInfo.cpp Fri Oct 11 19:53:04 2019 @@ -304,7 +304,8 @@ bool SystemZTTIImpl::isLSRCostLess(Targe 
C2.ScaleCost, C2.SetupCost); } -unsigned SystemZTTIImpl::getNumberOfRegisters(bool Vector) { +unsigned SystemZTTIImpl::getNumberOfRegisters(unsigned ClassID) const { + bool Vector = (ClassID == 1); if (!Vector) // Discount the stack pointer. Also leave out %r0, since it can't // be used in an address. Modified: llvm/trunk/lib/Target/SystemZ/SystemZTargetTransformInfo.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/SystemZ/SystemZTargetTransformInfo.h?rev=374634&r1=374633&r2=374634&view=diff ============================================================================== --- llvm/trunk/lib/Target/SystemZ/SystemZTargetTransformInfo.h (original) +++ llvm/trunk/lib/Target/SystemZ/SystemZTargetTransformInfo.h Fri Oct 11 19:53:04 2019 @@ -56,7 +56,7 @@ public: /// \name Vector TTI Implementations /// @{ - unsigned getNumberOfRegisters(bool Vector); + unsigned getNumberOfRegisters(unsigned ClassID) const; unsigned getRegisterBitWidth(bool Vector) const; unsigned getCacheLineSize() const override { return 256; } Modified: llvm/trunk/lib/Target/WebAssembly/WebAssemblyTargetTransformInfo.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/WebAssembly/WebAssemblyTargetTransformInfo.cpp?rev=374634&r1=374633&r2=374634&view=diff ============================================================================== --- llvm/trunk/lib/Target/WebAssembly/WebAssemblyTargetTransformInfo.cpp (original) +++ llvm/trunk/lib/Target/WebAssembly/WebAssemblyTargetTransformInfo.cpp Fri Oct 11 19:53:04 2019 @@ -25,10 +25,11 @@ WebAssemblyTTIImpl::getPopcntSupport(uns return TargetTransformInfo::PSK_FastHardware; } -unsigned WebAssemblyTTIImpl::getNumberOfRegisters(bool Vector) { - unsigned Result = BaseT::getNumberOfRegisters(Vector); +unsigned WebAssemblyTTIImpl::getNumberOfRegisters(unsigned ClassID) const { + unsigned Result = BaseT::getNumberOfRegisters(ClassID); // For SIMD, use at least 16 registers, as a rough guess. 
+ bool Vector = (ClassID == 1); if (Vector) Result = std::max(Result, 16u); Modified: llvm/trunk/lib/Target/WebAssembly/WebAssemblyTargetTransformInfo.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/WebAssembly/WebAssemblyTargetTransformInfo.h?rev=374634&r1=374633&r2=374634&view=diff ============================================================================== --- llvm/trunk/lib/Target/WebAssembly/WebAssemblyTargetTransformInfo.h (original) +++ llvm/trunk/lib/Target/WebAssembly/WebAssemblyTargetTransformInfo.h Fri Oct 11 19:53:04 2019 @@ -53,7 +53,7 @@ public: /// \name Vector TTI Implementations /// @{ - unsigned getNumberOfRegisters(bool Vector); + unsigned getNumberOfRegisters(unsigned ClassID) const; unsigned getRegisterBitWidth(bool Vector) const; unsigned getArithmeticInstrCost( unsigned Opcode, Type *Ty, Modified: llvm/trunk/lib/Target/X86/X86TargetTransformInfo.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86TargetTransformInfo.cpp?rev=374634&r1=374633&r2=374634&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86TargetTransformInfo.cpp (original) +++ llvm/trunk/lib/Target/X86/X86TargetTransformInfo.cpp Fri Oct 11 19:53:04 2019 @@ -116,7 +116,8 @@ llvm::Optional X86TTIImpl::get llvm_unreachable("Unknown TargetTransformInfo::CacheLevel"); } -unsigned X86TTIImpl::getNumberOfRegisters(bool Vector) { +unsigned X86TTIImpl::getNumberOfRegisters(unsigned ClassID) const { + bool Vector = (ClassID == 1); if (Vector && !ST->hasSSE1()) return 0; Modified: llvm/trunk/lib/Target/X86/X86TargetTransformInfo.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86TargetTransformInfo.h?rev=374634&r1=374633&r2=374634&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86TargetTransformInfo.h (original) +++ llvm/trunk/lib/Target/X86/X86TargetTransformInfo.h Fri Oct 11 19:53:04 2019 
@@ -116,7 +116,7 @@ public: /// \name Vector TTI Implementations /// @{ - unsigned getNumberOfRegisters(bool Vector); + unsigned getNumberOfRegisters(unsigned ClassID) const; unsigned getRegisterBitWidth(bool Vector) const; unsigned getLoadStoreVecRegBitWidth(unsigned AS) const; unsigned getMaxInterleaveFactor(unsigned VF); Modified: llvm/trunk/lib/Target/XCore/XCoreTargetTransformInfo.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/XCore/XCoreTargetTransformInfo.h?rev=374634&r1=374633&r2=374634&view=diff ============================================================================== --- llvm/trunk/lib/Target/XCore/XCoreTargetTransformInfo.h (original) +++ llvm/trunk/lib/Target/XCore/XCoreTargetTransformInfo.h Fri Oct 11 19:53:04 2019 @@ -40,7 +40,8 @@ public: : BaseT(TM, F.getParent()->getDataLayout()), ST(TM->getSubtargetImpl()), TLI(ST->getTargetLowering()) {} - unsigned getNumberOfRegisters(bool Vector) { + unsigned getNumberOfRegisters(unsigned ClassID) const { + bool Vector = (ClassID == 1); if (Vector) { return 0; } Modified: llvm/trunk/lib/Transforms/Scalar/LoopStrengthReduce.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/LoopStrengthReduce.cpp?rev=374634&r1=374633&r2=374634&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Scalar/LoopStrengthReduce.cpp (original) +++ llvm/trunk/lib/Transforms/Scalar/LoopStrengthReduce.cpp Fri Oct 11 19:53:04 2019 @@ -1386,7 +1386,9 @@ void Cost::RateFormula(const Formula &F, // Treat every new register that exceeds TTI.getNumberOfRegisters() - 1 as // additional instruction (at least fill). - unsigned TTIRegNum = TTI->getNumberOfRegisters(false) - 1; + // TODO: Need distinguish register class? 
+ unsigned TTIRegNum = TTI->getNumberOfRegisters( + TTI->getRegisterClassForType(false, F.getType())) - 1; if (C.NumRegs > TTIRegNum) { // Cost already exceeded TTIRegNum, then only newly added register can add // new instructions. Modified: llvm/trunk/lib/Transforms/Vectorize/LoopVectorize.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Vectorize/LoopVectorize.cpp?rev=374634&r1=374633&r2=374634&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Vectorize/LoopVectorize.cpp (original) +++ llvm/trunk/lib/Transforms/Vectorize/LoopVectorize.cpp Fri Oct 11 19:53:04 2019 @@ -1006,10 +1006,11 @@ public: /// of a loop. struct RegisterUsage { /// Holds the number of loop invariant values that are used in the loop. - unsigned LoopInvariantRegs; - + /// The key is ClassID of target-provided register class. + SmallMapVector LoopInvariantRegs; /// Holds the maximum number of concurrent live intervals in the loop. - unsigned MaxLocalUsers; + /// The key is ClassID of target-provided register class. + SmallMapVector MaxLocalUsers; }; /// \return Returns information about the register usages of the loop for the @@ -4985,9 +4986,14 @@ LoopVectorizationCostModel::computeFeasi // Select the largest VF which doesn't require more registers than existing // ones. 
- unsigned TargetNumRegisters = TTI.getNumberOfRegisters(true); for (int i = RUs.size() - 1; i >= 0; --i) { - if (RUs[i].MaxLocalUsers <= TargetNumRegisters) { + bool Selected = true; + for (auto& pair : RUs[i].MaxLocalUsers) { + unsigned TargetNumRegisters = TTI.getNumberOfRegisters(pair.first); + if (pair.second > TargetNumRegisters) + Selected = false; + } + if (Selected) { MaxVF = VFs[i]; break; } @@ -5138,22 +5144,12 @@ unsigned LoopVectorizationCostModel::sel if (TC > 1 && TC < TinyTripCountInterleaveThreshold) return 1; - unsigned TargetNumRegisters = TTI.getNumberOfRegisters(VF > 1); - LLVM_DEBUG(dbgs() << "LV: The target has " << TargetNumRegisters - << " registers\n"); - - if (VF == 1) { - if (ForceTargetNumScalarRegs.getNumOccurrences() > 0) - TargetNumRegisters = ForceTargetNumScalarRegs; - } else { - if (ForceTargetNumVectorRegs.getNumOccurrences() > 0) - TargetNumRegisters = ForceTargetNumVectorRegs; - } - RegisterUsage R = calculateRegisterUsage({VF})[0]; // We divide by these constants so assume that we have at least one // instruction that uses at least one register. - R.MaxLocalUsers = std::max(R.MaxLocalUsers, 1U); + for (auto& pair : R.MaxLocalUsers) { + pair.second = std::max(pair.second, 1U); + } // We calculate the interleave count using the following formula. // Subtract the number of loop invariants from the number of available @@ -5166,13 +5162,35 @@ unsigned LoopVectorizationCostModel::sel // We also want power of two interleave counts to ensure that the induction // variable of the vector loop wraps to zero, when tail is folded by masking; // this currently happens when OptForSize, in which case IC is set to 1 above. - unsigned IC = PowerOf2Floor((TargetNumRegisters - R.LoopInvariantRegs) / - R.MaxLocalUsers); + unsigned IC = UINT_MAX; - // Don't count the induction variable as interleaved. 
- if (EnableIndVarRegisterHeur) - IC = PowerOf2Floor((TargetNumRegisters - R.LoopInvariantRegs - 1) / - std::max(1U, (R.MaxLocalUsers - 1))); + for (auto& pair : R.MaxLocalUsers) { + unsigned TargetNumRegisters = TTI.getNumberOfRegisters(pair.first); + LLVM_DEBUG(dbgs() << "LV: The target has " << TargetNumRegisters + << " registers of " + << TTI.getRegisterClassName(pair.first) << " register class\n"); + if (VF == 1) { + if (ForceTargetNumScalarRegs.getNumOccurrences() > 0) + TargetNumRegisters = ForceTargetNumScalarRegs; + } else { + if (ForceTargetNumVectorRegs.getNumOccurrences() > 0) + TargetNumRegisters = ForceTargetNumVectorRegs; + } + unsigned MaxLocalUsers = pair.second; + unsigned LoopInvariantRegs = 0; + if (R.LoopInvariantRegs.find(pair.first) != R.LoopInvariantRegs.end()) + LoopInvariantRegs = R.LoopInvariantRegs[pair.first]; + + unsigned TmpIC = PowerOf2Floor((TargetNumRegisters - LoopInvariantRegs) / MaxLocalUsers); + // Don't count the induction variable as interleaved. + if (EnableIndVarRegisterHeur) { + TmpIC = + PowerOf2Floor((TargetNumRegisters - LoopInvariantRegs - 1) / + std::max(1U, (MaxLocalUsers - 1))); + } + + IC = std::min(IC, TmpIC); + } // Clamp the interleave ranges to reasonable counts. unsigned MaxInterleaveCount = TTI.getMaxInterleaveFactor(VF); @@ -5354,7 +5372,7 @@ LoopVectorizationCostModel::calculateReg const DataLayout &DL = TheFunction->getParent()->getDataLayout(); SmallVector RUs(VFs.size()); - SmallVector MaxUsages(VFs.size(), 0); + SmallVector, 8> MaxUsages(VFs.size()); LLVM_DEBUG(dbgs() << "LV(REG): Calculating max register usage:\n"); @@ -5384,21 +5402,45 @@ LoopVectorizationCostModel::calculateReg // For each VF find the maximum usage of registers. for (unsigned j = 0, e = VFs.size(); j < e; ++j) { + // Count the number of live intervals. 
+ SmallMapVector RegUsage; + if (VFs[j] == 1) { - MaxUsages[j] = std::max(MaxUsages[j], OpenIntervals.size()); - continue; + for (auto Inst : OpenIntervals) { + unsigned ClassID = TTI.getRegisterClassForType(false, Inst->getType()); + if (RegUsage.find(ClassID) == RegUsage.end()) + RegUsage[ClassID] = 1; + else + RegUsage[ClassID] += 1; + } + } else { + collectUniformsAndScalars(VFs[j]); + for (auto Inst : OpenIntervals) { + // Skip ignored values for VF > 1. + if (VecValuesToIgnore.find(Inst) != VecValuesToIgnore.end()) + continue; + if (isScalarAfterVectorization(Inst, VFs[j])) { + unsigned ClassID = TTI.getRegisterClassForType(false, Inst->getType()); + if (RegUsage.find(ClassID) == RegUsage.end()) + RegUsage[ClassID] = 1; + else + RegUsage[ClassID] += 1; + } else { + unsigned ClassID = TTI.getRegisterClassForType(true, Inst->getType()); + if (RegUsage.find(ClassID) == RegUsage.end()) + RegUsage[ClassID] = GetRegUsage(Inst->getType(), VFs[j]); + else + RegUsage[ClassID] += GetRegUsage(Inst->getType(), VFs[j]); + } + } } - collectUniformsAndScalars(VFs[j]); - // Count the number of live intervals. - unsigned RegUsage = 0; - for (auto Inst : OpenIntervals) { - // Skip ignored values for VF > 1. 
- if (VecValuesToIgnore.find(Inst) != VecValuesToIgnore.end() || - isScalarAfterVectorization(Inst, VFs[j])) - continue; - RegUsage += GetRegUsage(Inst->getType(), VFs[j]); + + for (auto& pair : RegUsage) { + if (MaxUsages[j].find(pair.first) != MaxUsages[j].end()) + MaxUsages[j][pair.first] = std::max(MaxUsages[j][pair.first], pair.second); + else + MaxUsages[j][pair.first] = pair.second; } - MaxUsages[j] = std::max(MaxUsages[j], RegUsage); } LLVM_DEBUG(dbgs() << "LV(REG): At #" << i << " Interval # " @@ -5409,18 +5451,32 @@ LoopVectorizationCostModel::calculateReg } for (unsigned i = 0, e = VFs.size(); i < e; ++i) { - unsigned Invariant = 0; - if (VFs[i] == 1) - Invariant = LoopInvariants.size(); - else { - for (auto Inst : LoopInvariants) - Invariant += GetRegUsage(Inst->getType(), VFs[i]); + SmallMapVector Invariant; + + for (auto Inst : LoopInvariants) { + unsigned Usage = VFs[i] == 1 ? 1 : GetRegUsage(Inst->getType(), VFs[i]); + unsigned ClassID = TTI.getRegisterClassForType(VFs[i] > 1, Inst->getType()); + if (Invariant.find(ClassID) == Invariant.end()) + Invariant[ClassID] = Usage; + else + Invariant[ClassID] += Usage; } LLVM_DEBUG(dbgs() << "LV(REG): VF = " << VFs[i] << '\n'); - LLVM_DEBUG(dbgs() << "LV(REG): Found max usage: " << MaxUsages[i] << '\n'); - LLVM_DEBUG(dbgs() << "LV(REG): Found invariant usage: " << Invariant - << '\n'); + LLVM_DEBUG(dbgs() << "LV(REG): Found max usage: " + << MaxUsages[i].size() << " item\n"); + for (const auto& pair : MaxUsages[i]) { + LLVM_DEBUG(dbgs() << "LV(REG): RegisterClass: " + << TTI.getRegisterClassName(pair.first) + << ", " << pair.second << " registers \n"); + } + LLVM_DEBUG(dbgs() << "LV(REG): Found invariant usage: " + << Invariant.size() << " item\n"); + for (const auto& pair : Invariant) { + LLVM_DEBUG(dbgs() << "LV(REG): RegisterClass: " + << TTI.getRegisterClassName(pair.first) + << ", " << pair.second << " registers \n"); + } RU.LoopInvariantRegs = Invariant; RU.MaxLocalUsers = MaxUsages[i]; @@ -7760,7 
+7816,8 @@ bool LoopVectorizePass::runImpl( // The second condition is necessary because, even if the target has no // vector registers, loop vectorization may still enable scalar // interleaving. - if (!TTI->getNumberOfRegisters(true) && TTI->getMaxInterleaveFactor(1) < 2) + if (!TTI->getNumberOfRegisters(TTI->getRegisterClassForType(true)) && + TTI->getMaxInterleaveFactor(1) < 2) return false; bool Changed = false; Modified: llvm/trunk/lib/Transforms/Vectorize/SLPVectorizer.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Vectorize/SLPVectorizer.cpp?rev=374634&r1=374633&r2=374634&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Vectorize/SLPVectorizer.cpp (original) +++ llvm/trunk/lib/Transforms/Vectorize/SLPVectorizer.cpp Fri Oct 11 19:53:04 2019 @@ -5237,7 +5237,7 @@ bool SLPVectorizerPass::runImpl(Function // If the target claims to have no vector registers don't attempt // vectorization. - if (!TTI->getNumberOfRegisters(true)) + if (!TTI->getNumberOfRegisters(TTI->getRegisterClassForType(true))) return false; // Don't vectorize when the attribute NoImplicitFloat is used. 
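
The per-class interleave-count selection in the LoopVectorize.cpp changes above can be sketched as a standalone function. This is a simplified model under stated assumptions, not the committed code: `std::map` replaces `SmallMapVector`, `powerOf2Floor` is a local stand-in for LLVM's `PowerOf2Floor`, the `Available` map replaces the calls to `TTI.getNumberOfRegisters`, the `EnableIndVarRegisterHeur` variant and the later clamping are left out, and the guard against unsigned underflow is an addition that the patch does not contain.

```cpp
#include <algorithm>
#include <cassert>
#include <climits>
#include <map>

// Largest power of two that is <= X, or 0 when X is 0.
// Local stand-in for llvm::PowerOf2Floor.
unsigned powerOf2Floor(unsigned X) {
  if (X == 0)
    return 0;
  unsigned P = 1;
  while (P <= X / 2)
    P *= 2;
  return P;
}

// Maps a register class ID to a register count.
using RegCounts = std::map<unsigned, unsigned>;

// For every register class that has live values in the loop, compute the
// interleave count that class alone would allow, then take the minimum so
// that no class is oversubscribed. Returns UINT_MAX when MaxLocalUsers is
// empty; the caller is expected to clamp the result.
unsigned interleaveCount(const RegCounts &Available,
                         const RegCounts &MaxLocalUsers,
                         const RegCounts &LoopInvariantRegs) {
  unsigned IC = UINT_MAX;
  for (const auto &Entry : MaxLocalUsers) {
    unsigned ClassID = Entry.first;
    // Assume at least one live value so the division is well defined.
    unsigned Users = std::max(Entry.second, 1u);
    unsigned NumRegs = Available.at(ClassID);
    auto It = LoopInvariantRegs.find(ClassID);
    unsigned Invariant = It == LoopInvariantRegs.end() ? 0 : It->second;
    // Underflow guard: an assumption of this sketch, not part of the patch.
    unsigned Free = NumRegs > Invariant ? NumRegs - Invariant : 0;
    IC = std::min(IC, powerOf2Floor(Free / Users));
  }
  return IC;
}
```

For example, with 32 registers in class 0 (2 of them holding loop invariants, 10 concurrent live values) and 64 registers in class 1 (3 concurrent live values), class 0 allows an interleave count of 2 and class 1 allows 16, so the function returns 2.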
Added: llvm/trunk/test/Transforms/LoopVectorize/PowerPC/reg-usage.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/LoopVectorize/PowerPC/reg-usage.ll?rev=374634&view=auto ============================================================================== --- llvm/trunk/test/Transforms/LoopVectorize/PowerPC/reg-usage.ll (added) +++ llvm/trunk/test/Transforms/LoopVectorize/PowerPC/reg-usage.ll Fri Oct 11 19:53:04 2019 @@ -0,0 +1,129 @@ +; RUN: opt < %s -debug-only=loop-vectorize -loop-vectorize -vectorizer-maximize-bandwidth -O2 -mtriple=powerpc64-unknown-linux -S -mcpu=pwr8 2>&1 | FileCheck %s --check-prefixes=CHECK,CHECK-PWR8 +; RUN: opt < %s -debug-only=loop-vectorize -loop-vectorize -vectorizer-maximize-bandwidth -O2 -mtriple=powerpc64le-unknown-linux -S -mcpu=pwr9 2>&1 | FileCheck %s --check-prefixes=CHECK,CHECK-PWR9 +; REQUIRES: asserts + +@a = global [1024 x i8] zeroinitializer, align 16 +@b = global [1024 x i8] zeroinitializer, align 16 + +define i32 @foo() { +; CHECK-LABEL: foo + +; CHECK-PWR8: Setting best plan to VF=16, UF=4 + +; CHECK-PWR9: Setting best plan to VF=8, UF=8 + + +entry: + br label %for.body + +for.cond.cleanup: + %add.lcssa = phi i32 [ %add, %for.body ] + ret i32 %add.lcssa + +for.body: + %indvars.iv = phi i64 [ 0, %entry ], [ %indvars.iv.next, %for.body ] + %s.015 = phi i32 [ 0, %entry ], [ %add, %for.body ] + %arrayidx = getelementptr inbounds [1024 x i8], [1024 x i8]* @a, i64 0, i64 %indvars.iv + %0 = load i8, i8* %arrayidx, align 1 + %conv = zext i8 %0 to i32 + %arrayidx2 = getelementptr inbounds [1024 x i8], [1024 x i8]* @b, i64 0, i64 %indvars.iv + %1 = load i8, i8* %arrayidx2, align 1 + %conv3 = zext i8 %1 to i32 + %sub = sub nsw i32 %conv, %conv3 + %ispos = icmp sgt i32 %sub, -1 + %neg = sub nsw i32 0, %sub + %2 = select i1 %ispos, i32 %sub, i32 %neg + %add = add nsw i32 %2, %s.015 + %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1 + %exitcond = icmp eq i64 %indvars.iv.next, 1024 + br i1 %exitcond, label 
%for.cond.cleanup, label %for.body +} + +define i32 @goo() { +; For indvars.iv used in a computating chain only feeding into getelementptr or cmp, +; it will not have vector version and the vector register usage will not exceed the +; available vector register number. + +; CHECK-LABEL: goo + +; CHECK: Setting best plan to VF=16, UF=4 + +entry: + br label %for.body + +for.cond.cleanup: ; preds = %for.body + %add.lcssa = phi i32 [ %add, %for.body ] + ret i32 %add.lcssa + +for.body: ; preds = %for.body, %entry + %indvars.iv = phi i64 [ 0, %entry ], [ %indvars.iv.next, %for.body ] + %s.015 = phi i32 [ 0, %entry ], [ %add, %for.body ] + %tmp1 = add nsw i64 %indvars.iv, 3 + %arrayidx = getelementptr inbounds [1024 x i8], [1024 x i8]* @a, i64 0, i64 %tmp1 + %tmp = load i8, i8* %arrayidx, align 1 + %conv = zext i8 %tmp to i32 + %tmp2 = add nsw i64 %indvars.iv, 2 + %arrayidx2 = getelementptr inbounds [1024 x i8], [1024 x i8]* @b, i64 0, i64 %tmp2 + %tmp3 = load i8, i8* %arrayidx2, align 1 + %conv3 = zext i8 %tmp3 to i32 + %sub = sub nsw i32 %conv, %conv3 + %ispos = icmp sgt i32 %sub, -1 + %neg = sub nsw i32 0, %sub + %tmp4 = select i1 %ispos, i32 %sub, i32 %neg + %add = add nsw i32 %tmp4, %s.015 + %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1 + %exitcond = icmp eq i64 %indvars.iv.next, 1024 + br i1 %exitcond, label %for.cond.cleanup, label %for.body +} + +define i64 @bar(i64* nocapture %a) { +; CHECK-LABEL: bar + +; CHECK: Setting best plan to VF=2, UF=12 + +entry: + br label %for.body + +for.cond.cleanup: + %add2.lcssa = phi i64 [ %add2, %for.body ] + ret i64 %add2.lcssa + +for.body: + %i.012 = phi i64 [ 0, %entry ], [ %inc, %for.body ] + %s.011 = phi i64 [ 0, %entry ], [ %add2, %for.body ] + %arrayidx = getelementptr inbounds i64, i64* %a, i64 %i.012 + %0 = load i64, i64* %arrayidx, align 8 + %add = add nsw i64 %0, %i.012 + store i64 %add, i64* %arrayidx, align 8 + %add2 = add nsw i64 %add, %s.011 + %inc = add nuw nsw i64 %i.012, 1 + %exitcond = icmp eq i64 %inc, 1024 
+ br i1 %exitcond, label %for.cond.cleanup, label %for.body +} + + at d = external global [0 x i64], align 8 + at e = external global [0 x i32], align 4 + at c = external global [0 x i32], align 4 + +define void @hoo(i32 %n) { +; CHECK-LABEL: hoo +; CHECK: Setting best plan to VF=1, UF=12 + +entry: + br label %for.body + +for.body: ; preds = %for.body, %entry + %indvars.iv = phi i64 [ 0, %entry ], [ %indvars.iv.next, %for.body ] + %arrayidx = getelementptr inbounds [0 x i64], [0 x i64]* @d, i64 0, i64 %indvars.iv + %tmp = load i64, i64* %arrayidx, align 8 + %arrayidx1 = getelementptr inbounds [0 x i32], [0 x i32]* @e, i64 0, i64 %tmp + %tmp1 = load i32, i32* %arrayidx1, align 4 + %arrayidx3 = getelementptr inbounds [0 x i32], [0 x i32]* @c, i64 0, i64 %indvars.iv + store i32 %tmp1, i32* %arrayidx3, align 4 + %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1 + %exitcond = icmp eq i64 %indvars.iv.next, 10000 + br i1 %exitcond, label %for.end, label %for.body + +for.end: ; preds = %for.body + ret void +} Modified: llvm/trunk/test/Transforms/LoopVectorize/X86/reg-usage-debug.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/LoopVectorize/X86/reg-usage-debug.ll?rev=374634&r1=374633&r2=374634&view=diff ============================================================================== --- llvm/trunk/test/Transforms/LoopVectorize/X86/reg-usage-debug.ll (original) +++ llvm/trunk/test/Transforms/LoopVectorize/X86/reg-usage-debug.ll Fri Oct 11 19:53:04 2019 @@ -22,7 +22,11 @@ target datalayout = "e-m:e-i64:64-f80:12 target triple = "x86_64-unknown-linux-gnu" ; CHECK: LV: Checking a loop in "test_g" -; CHECK: LV(REG): Found max usage: 2 +; CHECK: LV(REG): Found max usage: 2 item +; CHECK-NEXT: LV(REG): RegisterClass: Generic::ScalarRC, 2 registers +; CHECK-NEXT: LV(REG): RegisterClass: Generic::VectorRC, 2 registers +; CHECK-NEXT: LV(REG): Found invariant usage: 1 item +; CHECK-NEXT: LV(REG): RegisterClass: Generic::VectorRC, 2 registers define i32 
@test_g(i32* nocapture readonly %a, i32 %n) local_unnamed_addr !dbg !6 { entry: @@ -60,7 +64,11 @@ for.end: } ; CHECK: LV: Checking a loop in "test" -; CHECK: LV(REG): Found max usage: 2 +; CHECK: LV(REG): Found max usage: 2 item +; CHECK-NEXT: LV(REG): RegisterClass: Generic::ScalarRC, 2 registers +; CHECK-NEXT: LV(REG): RegisterClass: Generic::VectorRC, 2 registers +; CHECK-NEXT: LV(REG): Found invariant usage: 1 item +; CHECK-NEXT: LV(REG): RegisterClass: Generic::VectorRC, 2 registers define i32 @test(i32* nocapture readonly %a, i32 %n) local_unnamed_addr { entry: Modified: llvm/trunk/test/Transforms/LoopVectorize/X86/reg-usage.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/LoopVectorize/X86/reg-usage.ll?rev=374634&r1=374633&r2=374634&view=diff ============================================================================== --- llvm/trunk/test/Transforms/LoopVectorize/X86/reg-usage.ll (original) +++ llvm/trunk/test/Transforms/LoopVectorize/X86/reg-usage.ll Fri Oct 11 19:53:04 2019 @@ -11,9 +11,15 @@ define i32 @foo() { ; ; CHECK-LABEL: foo ; CHECK: LV(REG): VF = 8 -; CHECK-NEXT: LV(REG): Found max usage: 7 +; CHECK-NEXT: LV(REG): Found max usage: 2 item +; CHECK-NEXT: LV(REG): RegisterClass: Generic::ScalarRC, 2 registers +; CHECK-NEXT: LV(REG): RegisterClass: Generic::VectorRC, 7 registers +; CHECK-NEXT: LV(REG): Found invariant usage: 0 item ; CHECK: LV(REG): VF = 16 -; CHECK-NEXT: LV(REG): Found max usage: 13 +; CHECK-NEXT: LV(REG): Found max usage: 2 item +; CHECK-NEXT: LV(REG): RegisterClass: Generic::ScalarRC, 2 registers +; CHECK-NEXT: LV(REG): RegisterClass: Generic::VectorRC, 13 registers +; CHECK-NEXT: LV(REG): Found invariant usage: 0 item entry: br label %for.body @@ -47,9 +53,15 @@ define i32 @goo() { ; available vector register number. 
; CHECK-LABEL: goo ; CHECK: LV(REG): VF = 8 -; CHECK-NEXT: LV(REG): Found max usage: 7 +; CHECK-NEXT: LV(REG): Found max usage: 2 item +; CHECK-NEXT: LV(REG): RegisterClass: Generic::ScalarRC, 2 registers +; CHECK-NEXT: LV(REG): RegisterClass: Generic::VectorRC, 7 registers +; CHECK-NEXT: LV(REG): Found invariant usage: 0 item ; CHECK: LV(REG): VF = 16 -; CHECK-NEXT: LV(REG): Found max usage: 13 +; CHECK-NEXT: LV(REG): Found max usage: 2 item +; CHECK-NEXT: LV(REG): RegisterClass: Generic::ScalarRC, 2 registers +; CHECK-NEXT: LV(REG): RegisterClass: Generic::VectorRC, 13 registers +; CHECK-NEXT: LV(REG): Found invariant usage: 0 item entry: br label %for.body @@ -81,8 +93,11 @@ for.body: define i64 @bar(i64* nocapture %a) { ; CHECK-LABEL: bar ; CHECK: LV(REG): VF = 2 -; CHECK: LV(REG): Found max usage: 3 -; +; CHECK-NEXT: LV(REG): Found max usage: 2 item +; CHECK-NEXT: LV(REG): RegisterClass: Generic::VectorRC, 3 registers +; CHECK-NEXT: LV(REG): RegisterClass: Generic::ScalarRC, 1 registers +; CHECK-NEXT: LV(REG): Found invariant usage: 0 item + entry: br label %for.body @@ -113,8 +128,11 @@ define void @hoo(i32 %n) { ; so the max usage of AVX512 vector register will be 2. ; AVX512F-LABEL: bar ; AVX512F: LV(REG): VF = 16 -; AVX512F: LV(REG): Found max usage: 2 -; +; AVX512F-CHECK: LV(REG): Found max usage: 2 item +; AVX512F-CHECK: LV(REG): RegisterClass: Generic::ScalarRC, 2 registers +; AVX512F-CHECK: LV(REG): RegisterClass: Generic::VectorRC, 2 registers +; AVX512F-CHECK: LV(REG): Found invariant usage: 0 item + entry: br label %for.body From llvm-commits at lists.llvm.org Fri Oct 11 20:22:27 2019 From: llvm-commits at lists.llvm.org (Z Nguyen-Huu via Phabricator via llvm-commits) Date: Sat, 12 Oct 2019 03:22:27 +0000 (UTC) Subject: [PATCH] D68886: Remove unnecessary codes in llvm-dwarfdump In-Reply-To: References: Message-ID: <5b676b9f75c661e59dd6e6b0c8d20bc2@localhost.localdomain> duongnhn added a comment. 
In D68886#1706821, @rupprecht wrote: > Does `check-all` pass with this change? I'd imagine these are necessary to print target-specific information. You are right, the `--all` option does not work properly with this change. I will delete this. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68886/new/ https://reviews.llvm.org/D68886 From llvm-commits at lists.llvm.org Fri Oct 11 20:31:30 2019 From: llvm-commits at lists.llvm.org (Hubert Tong via Phabricator via llvm-commits) Date: Sat, 12 Oct 2019 03:31:30 +0000 (UTC) Subject: [PATCH] D68341: [AIX] TOC pseudo expansion for 64bit large + 64bit small + 32bit large modes In-Reply-To: References: Message-ID: hubert.reinterpretcast accepted this revision. hubert.reinterpretcast marked an inline comment as done. hubert.reinterpretcast added a comment. This revision is now accepted and ready to land. LGTM with minor changes. ================ Comment at: llvm/lib/Target/PowerPC/MCTargetDesc/PPCInstPrinter.cpp:82 + "The third operand of an addis instruction should be a symbol " + "reference expression if it is an expression at all."); + ---------------- Indentation: ``` assert(isa(MI->getOperand(2).getExpr()) && "The third operand of an addis instruction should be a symbol " "reference expression if it is an expression at all."); ``` ================ Comment at: llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp:734 case PPC::LDtoc: { + assert (!IsDarwin && "TOC is an ELF/XCOFF construct"); + ---------------- The space after `assert` seems odd. `clang-format` removes this space. ================ Comment at: llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp:766 + // Transform %rd = ADDIStocHA %rA, @sym(%r2) + LowerPPCMachineInstrToMCInst(MI, TmpInst, *this, IsDarwin); + ---------------- We know `IsDarwin` is false here. For code readability purposes, I am okay with this line as-is because the Darwin code is supposedly going away.
================ Comment at: llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp:821 case PPC::ADDIStocHA8: { + assert (!IsDarwin && "TOC is an ELF/XCOFF construct"); + ---------------- Same comment about the space. ================ Comment at: llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp:860 case PPC::LDtocL: { + assert (!IsDarwin && "TOC is an ELF/XCOFF construct"); + ---------------- Same comment about the space. ================ Comment at: llvm/lib/Target/PowerPC/PPCInstrInfo.cpp:332 case PPC::ADDIStocHA8: + case PPC::ADDIStocHA: case PPC::ADDItocL: ---------------- This patch adds `ADDIStocHA` to two switches with a different ordering relative to `ADDIStocHA8` in each switch. ================ Comment at: llvm/test/CodeGen/PowerPC/lower-globaladdr32-aix-asm.ll:15 + +; SMALL-LABEL: test_load +; SMALL: lwz [[REG1:[0-9]+]], LC0(2) ---------------- This is a function entry point label, so ``` SMALL-LABEL: .test_load:{{$}} ``` is appropriate. ================ Comment at: llvm/test/CodeGen/PowerPC/lower-globaladdr32-aix-asm.ll:20 + +; LARGE-LABEL: test_load +; LARGE: addis [[REG1:[0-9]+]], LC0@u(2) ---------------- Same comment re: function entry label. ================ Comment at: llvm/test/CodeGen/PowerPC/lower-globaladdr32-aix-asm.ll:33 -; CHECK-LABEL: test -; CHECK-DAG: lwz [[REG1:[0-9]+]], LC0(2) -; CHECK-DAG: lwz [[REG2:[0-9]+]], LC1(2) -; CHECK-DAG: lwz [[REG3:[0-9]+]], 0([[REG1]]) -; CHECK: stw [[REG3]], 0([[REG2]]) -; CHECK: blr +; SMALL-LABEL: test_store +; SMALL: lwz [[REG1:[0-9]+]], LC1(2) ---------------- Same comment re: function entry label. ================ Comment at: llvm/test/CodeGen/PowerPC/lower-globaladdr32-aix-asm.ll:38 + +; LARGE-LABEL: test_store +; LARGE: addis [[REG1:[0-9]+]], LC1@u(2) ---------------- Same comment re: function entry label. ================ Comment at: llvm/test/CodeGen/PowerPC/lower-globaladdr64-aix-asm.ll:15 + +; SMALL-LABEL: test_load +; SMALL: ld [[REG1:[0-9]+]], LC0(2) ---------------- Same comment re: function entry label.
================ Comment at: llvm/test/CodeGen/PowerPC/lower-globaladdr64-aix-asm.ll:17 +; SMALL: ld [[REG1:[0-9]+]], LC0(2) +; SMALL: lwz [[REG2:[0-9]+]], 0([[REG1]]) +; SMALL: blr ---------------- Just a note: Currently, the front-end seems to not generate `signext` or `zeroext` into the IR for 64-bit AIX. It seems the default behaviour matches `zeroext` (meaning the return type is unsigned). ================ Comment at: llvm/test/CodeGen/PowerPC/lower-globaladdr64-aix-asm.ll:20 + +; LARGE-LABEL: test_load +; LARGE: addis [[REG1:[0-9]+]], LC0@u(2) ---------------- Same comment re: function entry label. ================ Comment at: llvm/test/CodeGen/PowerPC/lower-globaladdr64-aix-asm.ll:33 + +; SMALL-LABEL: test_store +; SMALL: ld [[REG1:[0-9]+]], LC1(2) ---------------- Same comment re: function entry label. ================ Comment at: llvm/test/CodeGen/PowerPC/lower-globaladdr64-aix-asm.ll:38 + +; LARGE-LABEL: test_store +; LARGE: addis [[REG1:[0-9]+]], LC1@u(2) ---------------- Same comment re: function entry label. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68341/new/ https://reviews.llvm.org/D68341 From llvm-commits at lists.llvm.org Fri Oct 11 21:08:32 2019 From: llvm-commits at lists.llvm.org (Hubert Tong via llvm-commits) Date: Sat, 12 Oct 2019 04:08:32 -0000 Subject: [llvm] r374635 - NFC: clang-format rL374420 and adjust comment wording Message-ID: <20191012040832.2C44982A17@lists.llvm.org> Author: hubert.reinterpretcast Date: Fri Oct 11 21:08:31 2019 New Revision: 374635 URL: http://llvm.org/viewvc/llvm-project?rev=374635&view=rev Log: NFC: clang-format rL374420 and adjust comment wording The commit of rL374420 had various formatting issues, including lines that exceed 80 columns. This patch applies `git clang-format` on the changes from commit 13bd3ef40d8b1586f26a022e01b21e56c91e05bd. It further adjusts a comment to clarify the domain of inputs upon which a newly added function is meant to operate.
The adjustment to the comment was suggested in a post-commit comment on D68721 and discussed off-list with @sfertile. Modified: llvm/trunk/lib/Target/PowerPC/PPCAsmPrinter.cpp Modified: llvm/trunk/lib/Target/PowerPC/PPCAsmPrinter.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/PowerPC/PPCAsmPrinter.cpp?rev=374635&r1=374634&r2=374635&view=diff ============================================================================== --- llvm/trunk/lib/Target/PowerPC/PPCAsmPrinter.cpp (original) +++ llvm/trunk/lib/Target/PowerPC/PPCAsmPrinter.cpp Fri Oct 11 21:08:31 2019 @@ -512,9 +512,11 @@ void PPCAsmPrinter::EmitTlsCall(const Ma .addExpr(SymVar)); } -/// Map the machine operand to its corresponding MCSymbol. -static MCSymbol *getMCSymbolForTOCPseudoMO(const MachineOperand &MO, AsmPrinter &AP) { - switch(MO.getType()) { +/// Map a machine operand for a TOC pseudo-machine instruction to its +/// corresponding MCSymbol. +static MCSymbol *getMCSymbolForTOCPseudoMO(const MachineOperand &MO, + AsmPrinter &AP) { + switch (MO.getType()) { case MachineOperand::MO_GlobalAddress: return AP.getSymbol(MO.getGlobal()); case MachineOperand::MO_ConstantPoolIndex: @@ -771,9 +773,9 @@ void PPCAsmPrinter::EmitInstruction(cons const MCSymbol *MOSymbol = getMCSymbolForTOCPseudoMO(MO, *this); const bool GlobalToc = - MO.isGlobal() && Subtarget->isGVIndirectSymbol(MO.getGlobal()); + MO.isGlobal() && Subtarget->isGVIndirectSymbol(MO.getGlobal()); if (GlobalToc || MO.isJTI() || MO.isBlockAddress() || - (MO.isCPI() && TM.getCodeModel() == CodeModel::Large)) + (MO.isCPI() && TM.getCodeModel() == CodeModel::Large)) MOSymbol = lookUpOrCreateTOCEntry(MOSymbol); const MCExpr *Exp = @@ -834,9 +836,9 @@ void PPCAsmPrinter::EmitInstruction(cons const MachineOperand &MO = MI->getOperand(2); assert((MO.isGlobal() || MO.isCPI()) && "Invalid operand for ADDItocL."); - LLVM_DEBUG( - assert(!(MO.isGlobal() && Subtarget->isGVIndirectSymbol(MO.getGlobal())) && - "Interposable definitions must use 
indirect access.")); + LLVM_DEBUG(assert( + !(MO.isGlobal() && Subtarget->isGVIndirectSymbol(MO.getGlobal())) && + "Interposable definitions must use indirect access.")); const MCExpr *Exp = MCSymbolRefExpr::create(getMCSymbolForTOCPseudoMO(MO, *this), @@ -1376,7 +1378,7 @@ bool PPCLinuxAsmPrinter::doFinalization( ".got2", ELF::SHT_PROGBITS, ELF::SHF_WRITE | ELF::SHF_ALLOC); OutStreamer->SwitchSection(Section); - for (const auto &TOCMapPair: TOC) { + for (const auto &TOCMapPair : TOC) { const MCSymbol *const TOCEntryTarget = TOCMapPair.first; MCSymbol *const TOCEntryLabel = TOCMapPair.second; From llvm-commits at lists.llvm.org Fri Oct 11 21:07:26 2019 From: llvm-commits at lists.llvm.org (Austin Kerbow via Phabricator via llvm-commits) Date: Sat, 12 Oct 2019 04:07:26 +0000 (UTC) Subject: [PATCH] D68895: AMDGPU: Erase redundant redefs of m0 in SIFoldOperands In-Reply-To: References: Message-ID: kerbowa accepted this revision. kerbowa added a comment. This revision is now accepted and ready to land. LGTM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68895/new/ https://reviews.llvm.org/D68895 From llvm-commits at lists.llvm.org Fri Oct 11 22:01:22 2019 From: llvm-commits at lists.llvm.org (Hubert Tong via Phabricator via llvm-commits) Date: Sat, 12 Oct 2019 05:01:22 +0000 (UTC) Subject: [PATCH] D68903: [LNT] NFC: Fix order of globals and locals on exec Message-ID: hubert.reinterpretcast created this revision. hubert.reinterpretcast added reviewers: cmatthews, thopre, kristof.beyls. Per https://docs.python.org/3/library/functions.html#exec, the order of `globals` comes before `locals`. Since `globals` and `locals` are always the same object for the call in question, we can remove `locals`, and thereby `globals` will be used for both the global and the local variables. 
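
A minimal sketch of the namespace behaviour described above, for illustration only; it is not the LNT code. The names `MODULE_SOURCE`, `load_module_namespace`, and `load_with_split_namespaces` are hypothetical. It shows why passing a single dict to `exec` is enough when globals and locals are meant to be the same object, and what can go wrong when they are two distinct dicts.

```python
# Hypothetical stand-in for the contents of a test module file.
MODULE_SOURCE = """
import math

def area(r):
    # `math` is looked up in the function's globals when area() is called.
    return math.pi * r * r
"""


def load_module_namespace(source):
    """Execute `source` with one dict serving as both globals and locals."""
    namespace = {}
    exec(source, namespace)  # locals defaults to the globals dict
    return namespace


def load_with_split_namespaces(source):
    """Execute `source` with distinct globals and locals dicts."""
    module_globals, module_locals = {}, {}
    exec(source, module_globals, module_locals)  # globals first, then locals
    return module_globals, module_locals


ns = load_module_namespace(MODULE_SOURCE)
print(ns["area"](1.0))

g, l = load_with_split_namespaces(MODULE_SOURCE)
try:
    l["area"](1.0)
except NameError as err:
    # `import math` bound the name in the locals dict, but area() only
    # searches its globals dict (and builtins), so the lookup fails.
    print("split namespaces:", err)
```

With a single dict, names bound at module level stay visible to the functions defined in the module, which is the usual expectation when loading a module-like file.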
https://reviews.llvm.org/D68903 Files: lnt/tests/nt.py Index: lnt/tests/nt.py =================================================================== --- lnt/tests/nt.py +++ lnt/tests/nt.py @@ -536,7 +536,7 @@ results = [] for name in test_modules: # First, load the test module file. - locals = globals = {} + globals = {} test_path = os.path.join(config.test_suite_root, 'LNTBased', name) # This is where shared code between test modules should go. sys.path.append(os.path.join(config.test_suite_root, 'LNTBased/lib')) @@ -544,7 +544,7 @@ module_path = os.path.join(test_path, 'TestModule') module_file = open(module_path) try: - exec(module_file, locals, globals) + exec(module_file, globals) except Exception: info = traceback.format_exc() fatal("unable to import test module: %r\n%s" % ( -------------- next part -------------- A non-text attachment was scrubbed... Name: D68903.224727.patch Type: text/x-patch Size: 878 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Fri Oct 11 22:23:43 2019 From: llvm-commits at lists.llvm.org (Vitaly Buka via llvm-commits) Date: Sat, 12 Oct 2019 05:23:43 -0000 Subject: [llvm] r374636 - Revert 374629 "[sancov] Accommodate sancov and coverage report server for use under Windows" Message-ID: <20191012052343.B7F56877C4@lists.llvm.org> Author: vitalybuka Date: Fri Oct 11 22:23:43 2019 New Revision: 374636 URL: http://llvm.org/viewvc/llvm-project?rev=374636&view=rev Log: Revert 374629 "[sancov] Accommodate sancov and coverage report server for use under Windows" http://lab.llvm.org:8011/builders/clang-s390x-linux/builds/27650/steps/ninja%20check%201/logs/stdio http://lab.llvm.org:8011/builders/clang-ppc64be-linux-lnt/builds/31759 http://lab.llvm.org:8011/builders/clang-s390x-linux-lnt/builds/15095 http://lab.llvm.org:8011/builders/clang-ppc64be-linux-multistage/builds/21075 http://lab.llvm.org:8011/builders/clang-ppc64be-linux-lnt/builds/31759 Modified: llvm/trunk/test/tools/sancov/blacklist.test 
llvm/trunk/test/tools/sancov/covered_functions.test llvm/trunk/test/tools/sancov/merge.test llvm/trunk/test/tools/sancov/not_covered_functions.test llvm/trunk/test/tools/sancov/print.test llvm/trunk/test/tools/sancov/stats.test llvm/trunk/test/tools/sancov/symbolize.test llvm/trunk/test/tools/sancov/symbolize_noskip_dead_files.test llvm/trunk/test/tools/sancov/validation.test llvm/trunk/tools/sancov/coverage-report-server.py llvm/trunk/tools/sancov/sancov.cpp Modified: llvm/trunk/test/tools/sancov/blacklist.test URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/sancov/blacklist.test?rev=374636&r1=374635&r2=374636&view=diff ============================================================================== --- llvm/trunk/test/tools/sancov/blacklist.test (original) +++ llvm/trunk/test/tools/sancov/blacklist.test Fri Oct 11 22:23:43 2019 @@ -1,4 +1,4 @@ -REQUIRES: x86-registered-target +REQUIRES: x86_64-linux RUN: sancov -covered-functions %p/Inputs/test-linux_x86_64 %p/Inputs/test-linux_x86_64.0.sancov | FileCheck %s --check-prefix=ALL RUN: sancov -covered-functions -blacklist %p/Inputs/fun_blacklist.txt %p/Inputs/test-linux_x86_64 %p/Inputs/test-linux_x86_64.0.sancov | FileCheck %s RUN: sancov -covered-functions -blacklist %p/Inputs/src_blacklist.txt %p/Inputs/test-linux_x86_64 %p/Inputs/test-linux_x86_64.1.sancov | FileCheck --check-prefix=CHECK1 %s Modified: llvm/trunk/test/tools/sancov/covered_functions.test URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/sancov/covered_functions.test?rev=374636&r1=374635&r2=374636&view=diff ============================================================================== --- llvm/trunk/test/tools/sancov/covered_functions.test (original) +++ llvm/trunk/test/tools/sancov/covered_functions.test Fri Oct 11 22:23:43 2019 @@ -1,4 +1,4 @@ -REQUIRES: x86-registered-target +REQUIRES: x86_64-linux RUN: sancov -covered-functions %p/Inputs/test-linux_x86_64 %p/Inputs/test-linux_x86_64.0.sancov | FileCheck %s RUN: sancov 
-covered-functions -strip_path_prefix=Inputs/ %p/Inputs/test-linux_x86_64 %p/Inputs/test-linux_x86_64.0.sancov | FileCheck --check-prefix=STRIP_PATH %s RUN: sancov -demangle=0 -covered-functions %p/Inputs/test-linux_x86_64 %p/Inputs/test-linux_x86_64.0.sancov | FileCheck --check-prefix=NO_DEMANGLE %s Modified: llvm/trunk/test/tools/sancov/merge.test URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/sancov/merge.test?rev=374636&r1=374635&r2=374636&view=diff ============================================================================== --- llvm/trunk/test/tools/sancov/merge.test (original) +++ llvm/trunk/test/tools/sancov/merge.test Fri Oct 11 22:23:43 2019 @@ -1,4 +1,4 @@ -REQUIRES: x86-registered-target +REQUIRES: x86_64-linux RUN: sancov -merge %p/Inputs/test-linux_x86_64.0.symcov| FileCheck --check-prefix=MERGE1 %s RUN: sancov -merge %p/Inputs/test-linux_x86_64.0.symcov %p/Inputs/test-linux_x86_64.1.symcov| FileCheck --check-prefix=MERGE2 %s Modified: llvm/trunk/test/tools/sancov/not_covered_functions.test URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/sancov/not_covered_functions.test?rev=374636&r1=374635&r2=374636&view=diff ============================================================================== --- llvm/trunk/test/tools/sancov/not_covered_functions.test (original) +++ llvm/trunk/test/tools/sancov/not_covered_functions.test Fri Oct 11 22:23:43 2019 @@ -1,4 +1,4 @@ -REQUIRES: x86-registered-target +REQUIRES: x86_64-linux RUN: sancov -skip-dead-files=0 -not-covered-functions %p/Inputs/test-linux_x86_64 %p/Inputs/test-linux_x86_64.0.sancov | FileCheck %s RUN: sancov -not-covered-functions %p/Inputs/test-linux_x86_64 %p/Inputs/test-linux_x86_64.1.sancov | FileCheck --check-prefix=CHECK1 --allow-empty %s Modified: llvm/trunk/test/tools/sancov/print.test URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/sancov/print.test?rev=374636&r1=374635&r2=374636&view=diff 
============================================================================== --- llvm/trunk/test/tools/sancov/print.test (original) +++ llvm/trunk/test/tools/sancov/print.test Fri Oct 11 22:23:43 2019 @@ -1,4 +1,4 @@ -REQUIRES: x86-registered-target +REQUIRES: x86_64-linux RUN: sancov -print %p/Inputs/test-linux_x86_64.0.sancov | FileCheck %s CHECK: 0x4e132b Modified: llvm/trunk/test/tools/sancov/stats.test URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/sancov/stats.test?rev=374636&r1=374635&r2=374636&view=diff ============================================================================== --- llvm/trunk/test/tools/sancov/stats.test (original) +++ llvm/trunk/test/tools/sancov/stats.test Fri Oct 11 22:23:43 2019 @@ -1,4 +1,4 @@ -REQUIRES: x86-registered-target +REQUIRES: x86_64-linux RUN: sancov -print-coverage-stats %p/Inputs/test-linux_x86_64 %p/Inputs/test-linux_x86_64.0.sancov | FileCheck %s CHECK: all-edges: 8 Modified: llvm/trunk/test/tools/sancov/symbolize.test URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/sancov/symbolize.test?rev=374636&r1=374635&r2=374636&view=diff ============================================================================== --- llvm/trunk/test/tools/sancov/symbolize.test (original) +++ llvm/trunk/test/tools/sancov/symbolize.test Fri Oct 11 22:23:43 2019 @@ -1,6 +1,5 @@ -REQUIRES: x86-registered-target -RUN: sancov -symbolize -strip_path_prefix="llvm/" %p/Inputs/test-linux_x86_64 %p/Inputs/test-linux_x86_64.0.sancov | FileCheck %s --check-prefixes=CHECK,STRIP -RUN: sancov -symbolize %p/Inputs/test-linux_x86_64 %p/Inputs/test-linux_x86_64.0.sancov | FileCheck %s --check-prefixes=CHECK,NOSTRIP +REQUIRES: x86_64-linux +RUN: sancov -symbolize -strip_path_prefix="llvm/" %p/Inputs/test-linux_x86_64 %p/Inputs/test-linux_x86_64.0.sancov | FileCheck %s CHECK: { CHECK-NEXT: "covered-points": [ @@ -12,8 +11,7 @@ CHECK-NEXT: "4e1586" CHECK-NEXT: ], CHECK-NEXT: "binary-hash": 
"BB3CDD5045AED83906F6ADCC1C4DAF7E2596A6B5", CHECK-NEXT: "point-symbol-info": { -STRIP-NEXT: "test/tools/sancov/Inputs/test.cpp": { -NOSTRIP-NEXT: "/usr/local/google/home/aizatsky/src/llvm/test/tools/sancov/Inputs/test.cpp": { +CHECK-NEXT: "test/tools/sancov/Inputs/test.cpp": { CHECK-NEXT: "bar(std::string)": { CHECK-NEXT: "4e132b": "12:0" CHECK-NEXT: }, Modified: llvm/trunk/test/tools/sancov/symbolize_noskip_dead_files.test URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/sancov/symbolize_noskip_dead_files.test?rev=374636&r1=374635&r2=374636&view=diff ============================================================================== --- llvm/trunk/test/tools/sancov/symbolize_noskip_dead_files.test (original) +++ llvm/trunk/test/tools/sancov/symbolize_noskip_dead_files.test Fri Oct 11 22:23:43 2019 @@ -1,4 +1,4 @@ -REQUIRES: x86-registered-target +REQUIRES: x86_64-linux RUN: sancov -symbolize -skip-dead-files=0 -strip_path_prefix="llvm/" %p/Inputs/test-linux_x86_64 %p/Inputs/test-linux_x86_64.0.sancov | FileCheck %s CHECK: { Modified: llvm/trunk/test/tools/sancov/validation.test URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/sancov/validation.test?rev=374636&r1=374635&r2=374636&view=diff ============================================================================== --- llvm/trunk/test/tools/sancov/validation.test (original) +++ llvm/trunk/test/tools/sancov/validation.test Fri Oct 11 22:23:43 2019 @@ -1,4 +1,4 @@ -REQUIRES: x86-registered-target +REQUIRES: x86_64-linux RUN: not sancov -covered-functions %p/Inputs/test-linux_x86_64 2>&1 | FileCheck --check-prefix=NOCFILE %s NOCFILE: WARNING: No coverage file for {{.*}}test-linux_x86_64 Modified: llvm/trunk/tools/sancov/coverage-report-server.py URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/sancov/coverage-report-server.py?rev=374636&r1=374635&r2=374636&view=diff ============================================================================== --- 
llvm/trunk/tools/sancov/coverage-report-server.py (original) +++ llvm/trunk/tools/sancov/coverage-report-server.py Fri Oct 11 22:23:43 2019 @@ -32,7 +32,6 @@ import html import os import string import math -import urllib INDEX_PAGE_TMPL = """ @@ -129,7 +128,6 @@ class ServerHandler(http.server.BaseHTTP src_path = None def do_GET(self): - norm_path = os.path.normpath(urllib.parse.unquote(self.path[1:])) if self.path == '/': self.send_response(200) self.send_header("Content-type", "text/html; charset=utf-8") @@ -149,8 +147,8 @@ class ServerHandler(http.server.BaseHTTP response = string.Template(INDEX_PAGE_TMPL).safe_substitute( filenames='\n'.join(filelist)) self.wfile.write(response.encode('UTF-8', 'replace')) - elif self.symcov_data.has_file(norm_path): - filename = norm_path + elif self.symcov_data.has_file(self.path[1:]): + filename = self.path[1:] filepath = os.path.join(self.src_path, filename) if not os.path.exists(filepath): self.send_response(404) Modified: llvm/trunk/tools/sancov/sancov.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/sancov/sancov.cpp?rev=374636&r1=374635&r2=374636&view=diff ============================================================================== --- llvm/trunk/tools/sancov/sancov.cpp (original) +++ llvm/trunk/tools/sancov/sancov.cpp Fri Oct 11 22:23:43 2019 @@ -469,7 +469,7 @@ static std::unique_ptr S(FileName); sys::path::remove_dots(S, /* remove_dot_dot */ true); - return stripPathPrefix(sys::path::convert_to_slash(S.str())); + return stripPathPrefix(S.str().str()); } class Blacklists { From llvm-commits at lists.llvm.org Fri Oct 11 22:28:17 2019 From: llvm-commits at lists.llvm.org (Vitaly Buka via Phabricator via llvm-commits) Date: Sat, 12 Oct 2019 05:28:17 +0000 (UTC) Subject: [PATCH] D51018: [sancov] Accommodate sancov and coverage report server for use under Windows In-Reply-To: References: Message-ID: <05feb14dbda87fcdc68a1d8cadfbb400@localhost.localdomain> vitalybuka reopened this revision. 
vitalybuka added a comment. This revision is now accepted and ready to land. Reverted https://reviews.llvm.org/rL374636 Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D51018/new/ https://reviews.llvm.org/D51018 From llvm-commits at lists.llvm.org Fri Oct 11 22:46:12 2019 From: llvm-commits at lists.llvm.org (Amy Kwan via Phabricator via llvm-commits) Date: Sat, 12 Oct 2019 05:46:12 +0000 (UTC) Subject: [PATCH] D68576: [PowerPC] Fix VSX clobbers of CSR registers In-Reply-To: References: Message-ID: amyk added inline comments. Herald added a subscriber: wuzish. ================ Comment at: test/CodeGen/PowerPC/inline-asm-vsx-clobbers.ll:2 +; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py +; RUN: llc -mcpu=pwr9 -mtriple=powerpc64le-unknown-unknown \ +; RUN: -enable-ppc-quad-precision -ppc-vsr-nums-as-vr \ ---------------- Should we add `-verify-machineinstrs` to the test case? Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68576/new/ https://reviews.llvm.org/D68576 From llvm-commits at lists.llvm.org Fri Oct 11 23:07:17 2019 From: llvm-commits at lists.llvm.org (Roman Tereshin via Phabricator via llvm-commits) Date: Sat, 12 Oct 2019 06:07:17 +0000 (UTC) Subject: [PATCH] D68905: [update_mir_test_checks] Handle MI flags properly Message-ID: rtereshin created this revision. rtereshin added a reviewer: bogner. Herald added subscribers: llvm-commits, Petar.Avramovic, atanasyan, jrtc27, nhaehnle, jvesely, sdardis. Herald added a project: LLVM. Previously we would generate literal check lines with no regexps for vregs, because MI flags (nsw, ninf, etc.) were not recognized as part of an MI. This patch fixes that. It includes updates to the MIR tests that suffered from the problem.
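
To make the problem concrete, here is a rough sketch of how a pattern for a vreg definition can fail once MI flags sit between `=` and the opcode, and how allowing an optional run of flags repairs it. This is an assumption-laden illustration, not the regular expression used in `utils/update_mir_test_checks.py`: the names `MI_FLAGS`, `NAIVE_DEF_RE`, `FLAG_AWARE_DEF_RE`, and `parse_vreg_def` are hypothetical, and the flag list is assumed and not exhaustive.

```python
import re

# Assumed, non-exhaustive list of MI flags, for illustration only.
MI_FLAGS = ("nnan", "ninf", "nsz", "arcp", "contract", "afn",
            "reassoc", "nuw", "nsw", "exact")
FLAG_ALTERNATION = "|".join(MI_FLAGS)

# Hypothetical flag-unaware pattern: expects the opcode right after " = ".
NAIVE_DEF_RE = re.compile(
    r"^\s*%(?P<vreg>[0-9]+):(?P<type>\S+) = "
    r"(?P<opcode>[A-Z][A-Za-z0-9_]*)"
)

# Hypothetical flag-aware pattern: zero or more known flags, each followed
# by a space, may precede the opcode.
FLAG_AWARE_DEF_RE = re.compile(
    r"^\s*%(?P<vreg>[0-9]+):(?P<type>\S+) = "
    r"(?P<flags>(?:(?:" + FLAG_ALTERNATION + r") )*)"
    r"(?P<opcode>[A-Z][A-Za-z0-9_]*)"
)


def parse_vreg_def(line, pattern=FLAG_AWARE_DEF_RE):
    """Return (vreg number, list of flags, opcode), or None if no match."""
    m = pattern.match(line)
    if m is None:
        return None
    flags = m.groupdict().get("flags") or ""
    return m.group("vreg"), flags.split(), m.group("opcode")


flagged = "    %2:_(s32) = nsw G_ADD %0, %1"
print(parse_vreg_def(flagged, NAIVE_DEF_RE))   # the naive pattern gives up
print(parse_vreg_def(flagged))                 # the flag-aware one matches
```

Once a definition line is recognized, a tool can go on to substitute FileCheck captures for the vreg names; a line that is not recognized can only be emitted literally, which is the symptom the description mentions.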
Repository: rL LLVM https://reviews.llvm.org/D68905 Files: test/CodeGen/AArch64/GlobalISel/legalize-dyn-alloca.mir test/CodeGen/AArch64/GlobalISel/prelegalizercombiner-br.mir test/CodeGen/AArch64/GlobalISel/regbank-fma.mir test/CodeGen/AArch64/GlobalISel/select-jump-table-brjt.mir test/CodeGen/AMDGPU/GlobalISel/legalize-fadd.mir test/CodeGen/AMDGPU/GlobalISel/legalize-fcmp.mir test/CodeGen/AMDGPU/GlobalISel/legalize-fcopysign.mir test/CodeGen/AMDGPU/GlobalISel/legalize-fcos.mir test/CodeGen/AMDGPU/GlobalISel/legalize-fmaxnum.mir test/CodeGen/AMDGPU/GlobalISel/legalize-fminnum.mir test/CodeGen/AMDGPU/GlobalISel/legalize-fmul.mir test/CodeGen/AMDGPU/GlobalISel/legalize-fpext.mir test/CodeGen/AMDGPU/GlobalISel/legalize-fsin.mir test/CodeGen/AMDGPU/GlobalISel/legalize-fsub.mir test/CodeGen/AMDGPU/GlobalISel/legalize-intrinsic-amdgcn-fdiv-fast.mir test/CodeGen/Mips/GlobalISel/legalizer/dyn_stackalloc.mir utils/update_mir_test_checks.py -------------- next part -------------- A non-text attachment was scrubbed... Name: D68905.224729.patch Type: text/x-patch Size: 47871 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Fri Oct 11 23:07:17 2019 From: llvm-commits at lists.llvm.org (Simon Atanasyan via Phabricator via llvm-commits) Date: Sat, 12 Oct 2019 06:07:17 +0000 (UTC) Subject: [PATCH] D68778: [mips] Store 64-bit `li.d' operand as a single 8-byte value In-Reply-To: References: Message-ID: atanasyan added a comment. Thanks for all reviews. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68778/new/ https://reviews.llvm.org/D68778 From llvm-commits at lists.llvm.org Fri Oct 11 23:14:02 2019 From: llvm-commits at lists.llvm.org (Alexander Shaposhnikov via llvm-commits) Date: Sat, 12 Oct 2019 06:14:02 -0000 Subject: [llvm] r374637 - [llvm-lipo] Pass ArrayRef by value. 
Message-ID: <20191012061402.A72FF87C86@lists.llvm.org> Author: alexshap Date: Fri Oct 11 23:14:02 2019 New Revision: 374637 URL: http://llvm.org/viewvc/llvm-project?rev=374637&view=rev Log: [llvm-lipo] Pass ArrayRef by value. Pass ArrayRef by value, fix formatting. NFC. Test plan: make check-all Modified: llvm/trunk/tools/llvm-lipo/llvm-lipo.cpp Modified: llvm/trunk/tools/llvm-lipo/llvm-lipo.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-lipo/llvm-lipo.cpp?rev=374637&r1=374636&r2=374637&view=diff ============================================================================== --- llvm/trunk/tools/llvm-lipo/llvm-lipo.cpp (original) +++ llvm/trunk/tools/llvm-lipo/llvm-lipo.cpp Fri Oct 11 23:14:02 2019 @@ -441,10 +441,10 @@ readInputBinaries(ArrayRef In if (IF.ArchType && (B->isMachO() || B->isArchive())) { const auto S = B->isMachO() ? Slice(cast(B)) : Slice(cast(B)); - const auto SpecifiedCPUType = - MachO::getCPUTypeFromArchitecture( - MachO::mapToArchitecture(Triple(*IF.ArchType))) - .first; + const auto SpecifiedCPUType = MachO::getCPUTypeFromArchitecture( + MachO::getArchitectureFromName( + Triple(*IF.ArchType).getArchName())) + .first; // For compatibility with cctools' lipo the comparison is relaxed just to // checking cputypes. if (S.getCPUType() != SpecifiedCPUType) @@ -583,7 +583,7 @@ static void extractSlice(ArrayRef &Slices) { +static void checkArchDuplicates(ArrayRef Slices) { DenseMap CPUIds; for (const auto &S : Slices) { auto Entry = CPUIds.try_emplace(S.getCPUID(), S.getBinary()); From llvm-commits at lists.llvm.org Fri Oct 11 23:34:46 2019 From: llvm-commits at lists.llvm.org (Alex Cameron via Phabricator via llvm-commits) Date: Sat, 12 Oct 2019 06:34:46 +0000 (UTC) Subject: [PATCH] D68906: [llvm-size] Tidy up error messages Message-ID: tetsuo-cpp created this revision. tetsuo-cpp added reviewers: grimar, MaskRay. Herald added subscribers: llvm-commits, rupprecht. Herald added a project: LLVM. 
Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=42970 This patch is to clean up some formatting inconsistencies in the error messages of `llvm-size` and to correctly exit with non-zero in all error cases (there were a few which were exiting with zero). When working on the error messages, I based the formatting off the way that `llvm-objdump` does it. Repository: rL LLVM https://reviews.llvm.org/D68906 Files: llvm/test/tools/llvm-size/invalid-input.test llvm/test/tools/llvm-size/no-input.test llvm/tools/llvm-size/llvm-size.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D68906.224730.patch Type: text/x-patch Size: 6692 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Fri Oct 11 23:40:24 2019 From: llvm-commits at lists.llvm.org (Martin Storsjo via llvm-commits) Date: Sat, 12 Oct 2019 06:40:24 -0000 Subject: [llvm] r374639 - [lit] Remove setting of the target-windows feature Message-ID: <20191012064024.F369E935A3@lists.llvm.org> Author: mstorsjo Date: Fri Oct 11 23:40:24 2019 New Revision: 374639 URL: http://llvm.org/viewvc/llvm-project?rev=374639&view=rev Log: [lit] Remove setting of the target-windows feature No other OSes use a target- feature, and no tests depend on it any longer. 
Differential Revision: https://reviews.llvm.org/D68450 Modified: llvm/trunk/utils/lit/lit/llvm/config.py Modified: llvm/trunk/utils/lit/lit/llvm/config.py URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/lit/llvm/config.py?rev=374639&r1=374638&r2=374639&view=diff ============================================================================== --- llvm/trunk/utils/lit/lit/llvm/config.py (original) +++ llvm/trunk/utils/lit/lit/llvm/config.py Fri Oct 11 23:40:24 2019 @@ -93,8 +93,6 @@ class LLVMConfig(object): 'ASAN_OPTIONS', 'detect_leaks=1', append_path=True) if re.match(r'^x86_64.*-linux', target_triple): features.add('x86_64-linux') - if re.match(r'.*-windows-msvc$', target_triple): - features.add('target-windows') if re.match(r'^i.86.*', target_triple): features.add('target-x86') elif re.match(r'^x86_64.*', target_triple): From llvm-commits at lists.llvm.org Fri Oct 11 23:43:46 2019 From: llvm-commits at lists.llvm.org (=?utf-8?q?Martin_Storsj=C3=B6_via_Phabricator?= via llvm-commits) Date: Sat, 12 Oct 2019 06:43:46 +0000 (UTC) Subject: [PATCH] D68450: [lit] Remove setting of the target-windows feature In-Reply-To: References: Message-ID: <5a26fed55c89bd6d6620cab142cdee30@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rGfe88be8c3af9: [lit] Remove setting of the target-windows feature (authored by mstorsjo). 
Changed prior to commit: https://reviews.llvm.org/D68450?vs=223159&id=224731#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68450/new/ https://reviews.llvm.org/D68450 Files: llvm/utils/lit/lit/llvm/config.py Index: llvm/utils/lit/lit/llvm/config.py =================================================================== --- llvm/utils/lit/lit/llvm/config.py +++ llvm/utils/lit/lit/llvm/config.py @@ -93,8 +93,6 @@ 'ASAN_OPTIONS', 'detect_leaks=1', append_path=True) if re.match(r'^x86_64.*-linux', target_triple): features.add('x86_64-linux') - if re.match(r'.*-windows-msvc$', target_triple): - features.add('target-windows') if re.match(r'^i.86.*', target_triple): features.add('target-x86') elif re.match(r'^x86_64.*', target_triple): -------------- next part -------------- A non-text attachment was scrubbed... Name: D68450.224731.patch Type: text/x-patch Size: 651 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Sat Oct 12 00:42:45 2019 From: llvm-commits at lists.llvm.org (Simon Atanasyan via llvm-commits) Date: Sat, 12 Oct 2019 07:42:45 -0000 Subject: [llvm] r374640 - [mips] Fix `loadImmediate` calls when load non-address values. Message-ID: <20191012074245.197C991FCE@lists.llvm.org> Author: atanasyan Date: Sat Oct 12 00:42:44 2019 New Revision: 374640 URL: http://llvm.org/viewvc/llvm-project?rev=374640&view=rev Log: [mips] Fix `loadImmediate` calls when load non-address values. 
Modified: llvm/trunk/lib/Target/Mips/AsmParser/MipsAsmParser.cpp llvm/trunk/test/MC/Mips/macro-li.d.s Modified: llvm/trunk/lib/Target/Mips/AsmParser/MipsAsmParser.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/Mips/AsmParser/MipsAsmParser.cpp?rev=374640&r1=374639&r2=374640&view=diff ============================================================================== --- llvm/trunk/lib/Target/Mips/AsmParser/MipsAsmParser.cpp (original) +++ llvm/trunk/lib/Target/Mips/AsmParser/MipsAsmParser.cpp Sat Oct 12 00:42:44 2019 @@ -3324,7 +3324,7 @@ bool MipsAsmParser::expandLoadSingleImmT uint32_t ImmOp32 = covertDoubleImmToSingleImm(convertIntToDoubleImm(ImmOp64)); - return loadImmediate(ImmOp32, FirstReg, Mips::NoRegister, true, true, IDLoc, + return loadImmediate(ImmOp32, FirstReg, Mips::NoRegister, true, false, IDLoc, Out, STI); } @@ -3397,15 +3397,15 @@ bool MipsAsmParser::expandLoadDoubleImmT if (Lo_32(ImmOp64) == 0) { if (isABI_N32() || isABI_N64()) { - if (loadImmediate(ImmOp64, FirstReg, Mips::NoRegister, false, true, IDLoc, - Out, STI)) + if (loadImmediate(ImmOp64, FirstReg, Mips::NoRegister, false, false, + IDLoc, Out, STI)) return true; } else { - if (loadImmediate(Hi_32(ImmOp64), FirstReg, Mips::NoRegister, true, true, + if (loadImmediate(Hi_32(ImmOp64), FirstReg, Mips::NoRegister, true, false, IDLoc, Out, STI)) return true; - if (loadImmediate(0, nextReg(FirstReg), Mips::NoRegister, true, true, + if (loadImmediate(0, nextReg(FirstReg), Mips::NoRegister, true, false, IDLoc, Out, STI)) return true; } Modified: llvm/trunk/test/MC/Mips/macro-li.d.s URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/MC/Mips/macro-li.d.s?rev=374640&r1=374639&r2=374640&view=diff ============================================================================== --- llvm/trunk/test/MC/Mips/macro-li.d.s (original) +++ llvm/trunk/test/MC/Mips/macro-li.d.s Sat Oct 12 00:42:44 2019 @@ -9,12 +9,12 @@ li.d $4, 0 # O32: addiu $4, $zero, 0 # encoding: [0x00,0x00,0x04,0x24] # 
O32: addiu $5, $zero, 0 # encoding: [0x00,0x00,0x05,0x24] -# N32-N64: daddiu $4, $zero, 0 # encoding: [0x00,0x00,0x04,0x64] +# N32-N64: addiu $4, $zero, 0 # encoding: [0x00,0x00,0x04,0x24] li.d $4, 0.0 # O32: addiu $4, $zero, 0 # encoding: [0x00,0x00,0x04,0x24] # O32: addiu $5, $zero, 0 # encoding: [0x00,0x00,0x05,0x24] -# N32-N64: daddiu $4, $zero, 0 # encoding: [0x00,0x00,0x04,0x64] +# N32-N64: addiu $4, $zero, 0 # encoding: [0x00,0x00,0x04,0x24] li.d $4, 1.12345 # ALL: .section .rodata,"a", at progbits From llvm-commits at lists.llvm.org Sat Oct 12 00:42:51 2019 From: llvm-commits at lists.llvm.org (Simon Atanasyan via llvm-commits) Date: Sat, 12 Oct 2019 07:42:51 -0000 Subject: [llvm] r374641 - [mips] Rely on GPR size not ABI when select instruction to load value into register Message-ID: <20191012074251.346AC932F2@lists.llvm.org> Author: atanasyan Date: Sat Oct 12 00:42:51 2019 New Revision: 374641 URL: http://llvm.org/viewvc/llvm-project?rev=374641&view=rev Log: [mips] Rely on GPR size not ABI when select instruction to load value into register Modified: llvm/trunk/lib/Target/Mips/AsmParser/MipsAsmParser.cpp Modified: llvm/trunk/lib/Target/Mips/AsmParser/MipsAsmParser.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/Mips/AsmParser/MipsAsmParser.cpp?rev=374641&r1=374640&r2=374641&view=diff ============================================================================== --- llvm/trunk/lib/Target/Mips/AsmParser/MipsAsmParser.cpp (original) +++ llvm/trunk/lib/Target/Mips/AsmParser/MipsAsmParser.cpp Sat Oct 12 00:42:51 2019 @@ -3396,7 +3396,7 @@ bool MipsAsmParser::expandLoadDoubleImmT ImmOp64 = convertIntToDoubleImm(ImmOp64); if (Lo_32(ImmOp64) == 0) { - if (isABI_N32() || isABI_N64()) { + if (isGP64bit()) { if (loadImmediate(ImmOp64, FirstReg, Mips::NoRegister, false, false, IDLoc, Out, STI)) return true; @@ -3435,14 +3435,10 @@ bool MipsAsmParser::expandLoadDoubleImmT if (emitPartialAddress(TOut, IDLoc, Sym)) return true; - if (isABI_N64()) - 
TOut.emitRRX(Mips::DADDiu, TmpReg, TmpReg, MCOperand::createExpr(LoExpr), - IDLoc, STI); - else - TOut.emitRRX(Mips::ADDiu, TmpReg, TmpReg, MCOperand::createExpr(LoExpr), - IDLoc, STI); + TOut.emitRRX(isABI_N64() ? Mips::DADDiu : Mips::ADDiu, TmpReg, TmpReg, + MCOperand::createExpr(LoExpr), IDLoc, STI); - if (isABI_N32() || isABI_N64()) + if (isGP64bit()) TOut.emitRRI(Mips::LD, FirstReg, TmpReg, 0, IDLoc, STI); else { TOut.emitRRI(Mips::LW, FirstReg, TmpReg, 0, IDLoc, STI); @@ -3473,7 +3469,7 @@ bool MipsAsmParser::expandLoadDoubleImmT if ((Lo_32(ImmOp64) == 0) && !((Hi_32(ImmOp64) & 0xffff0000) && (Hi_32(ImmOp64) & 0x0000ffff))) { - if (isABI_N32() || isABI_N64()) { + if (isGP64bit()) { if (TmpReg != Mips::ZERO && loadImmediate(ImmOp64, TmpReg, Mips::NoRegister, false, false, IDLoc, Out, STI)) From llvm-commits at lists.llvm.org Sat Oct 12 00:47:12 2019 From: llvm-commits at lists.llvm.org (Simon Atanasyan via Phabricator via llvm-commits) Date: Sat, 12 Oct 2019 07:47:12 +0000 (UTC) Subject: [PATCH] D68866: [MIPS GlobalISel] Refactor MipsRegisterBankInfo [NFC] In-Reply-To: References: Message-ID: <56c3a7961d71490215f47af6181f9c86@localhost.localdomain> atanasyan accepted this revision. atanasyan added a comment. This revision is now accepted and ready to land. 
LGTM Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68866/new/ https://reviews.llvm.org/D68866 From llvm-commits at lists.llvm.org Sat Oct 12 00:59:24 2019 From: llvm-commits at lists.llvm.org (Craig Topper via llvm-commits) Date: Sat, 12 Oct 2019 07:59:24 -0000 Subject: [llvm] r374642 - [X86] Test SKX cpu in the vector-trunc-packus/ssat/usat.ll tests instead of min-legal-vector-width.ll Message-ID: <20191012075924.A7DA986F23@lists.llvm.org> Author: ctopper Date: Sat Oct 12 00:59:24 2019 New Revision: 374642 URL: http://llvm.org/viewvc/llvm-project?rev=374642&view=rev Log: [X86] Test SKX cpu in the vector-trunc-packus/ssat/usat.ll tests instead of min-legal-vector-width.ll This adds "min-legal-vector-width"="256" function attributes to all the tests for a larger than 256-bit input. Also switch any larger than 512-bit inputs to use a load. This makes the arguments consistent with min-legal-vector-width attribute which should usually be at least as large as the arguments. The SKX configuration will avoid using zmm registers on the modified test cases. For many of them we should use something closer to the AVX2 codegen with pack instructions instead of the avx512 saturating truncates. 
Modified: llvm/trunk/test/CodeGen/X86/min-legal-vector-width.ll llvm/trunk/test/CodeGen/X86/vector-trunc-packus.ll llvm/trunk/test/CodeGen/X86/vector-trunc-ssat.ll llvm/trunk/test/CodeGen/X86/vector-trunc-usat.ll Modified: llvm/trunk/test/CodeGen/X86/min-legal-vector-width.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/min-legal-vector-width.ll?rev=374642&r1=374641&r2=374642&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/min-legal-vector-width.ll (original) +++ llvm/trunk/test/CodeGen/X86/min-legal-vector-width.ll Sat Oct 12 00:59:24 2019 @@ -1116,132 +1116,3 @@ define void @trunc_packus_v16i32_v16i8_s store <16 x i8> %f, <16 x i8>* %q ret void } - -define <32 x i8> @trunc_packus_v32i32_v32i8(<32 x i32>* %p) "min-legal-vector-width"="256" { -; CHECK-LABEL: trunc_packus_v32i32_v32i8: -; CHECK: # %bb.0: -; CHECK-NEXT: vpxor %xmm0, %xmm0, %xmm0 -; CHECK-NEXT: vpmaxsd 96(%rdi), %ymm0, %ymm1 -; CHECK-NEXT: vpmovusdb %ymm1, %xmm1 -; CHECK-NEXT: vpmaxsd 64(%rdi), %ymm0, %ymm2 -; CHECK-NEXT: vpmovusdb %ymm2, %xmm2 -; CHECK-NEXT: vpunpcklqdq {{.*#+}} xmm1 = xmm2[0],xmm1[0] -; CHECK-NEXT: vpmaxsd 32(%rdi), %ymm0, %ymm2 -; CHECK-NEXT: vpmovusdb %ymm2, %xmm2 -; CHECK-NEXT: vpmaxsd (%rdi), %ymm0, %ymm0 -; CHECK-NEXT: vpmovusdb %ymm0, %xmm0 -; CHECK-NEXT: vpunpcklqdq {{.*#+}} xmm0 = xmm0[0],xmm2[0] -; CHECK-NEXT: vinserti128 $1, %xmm1, %ymm0, %ymm0 -; CHECK-NEXT: retq - %a = load <32 x i32>, <32 x i32>* %p - %b = icmp slt <32 x i32> %a, - %c = select <32 x i1> %b, <32 x i32> %a, <32 x i32> - %d = icmp sgt <32 x i32> %c, zeroinitializer - %e = select <32 x i1> %d, <32 x i32> %c, <32 x i32> zeroinitializer - %f = trunc <32 x i32> %e to <32 x i8> - ret <32 x i8> %f -} - -define <8 x i8> @trunc_packus_v8i64_v8i8(<8 x i64> %a0) "min-legal-vector-width"="256" { -; CHECK-LABEL: trunc_packus_v8i64_v8i8: -; CHECK: # %bb.0: -; CHECK-NEXT: vpxor %xmm2, %xmm2, %xmm2 -; CHECK-NEXT: vpmaxsq 
%ymm2, %ymm1, %ymm1 -; CHECK-NEXT: vpmovusqb %ymm1, %xmm1 -; CHECK-NEXT: vpmaxsq %ymm2, %ymm0, %ymm0 -; CHECK-NEXT: vpmovusqb %ymm0, %xmm0 -; CHECK-NEXT: vpunpckldq {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1] -; CHECK-NEXT: vzeroupper -; CHECK-NEXT: retq - %1 = icmp slt <8 x i64> %a0, - %2 = select <8 x i1> %1, <8 x i64> %a0, <8 x i64> - %3 = icmp sgt <8 x i64> %2, zeroinitializer - %4 = select <8 x i1> %3, <8 x i64> %2, <8 x i64> zeroinitializer - %5 = trunc <8 x i64> %4 to <8 x i8> - ret <8 x i8> %5 -} - -define void @trunc_packus_v8i64_v8i8_store(<8 x i64> %a0, <8 x i8> *%p1) "min-legal-vector-width"="256" { -; CHECK-LABEL: trunc_packus_v8i64_v8i8_store: -; CHECK: # %bb.0: -; CHECK-NEXT: vpxor %xmm2, %xmm2, %xmm2 -; CHECK-NEXT: vpmaxsq %ymm2, %ymm1, %ymm1 -; CHECK-NEXT: vpmovusqb %ymm1, %xmm1 -; CHECK-NEXT: vpmaxsq %ymm2, %ymm0, %ymm0 -; CHECK-NEXT: vpmovusqb %ymm0, %xmm0 -; CHECK-NEXT: vpunpckldq {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1] -; CHECK-NEXT: vmovq %xmm0, (%rdi) -; CHECK-NEXT: vzeroupper -; CHECK-NEXT: retq - %1 = icmp slt <8 x i64> %a0, - %2 = select <8 x i1> %1, <8 x i64> %a0, <8 x i64> - %3 = icmp sgt <8 x i64> %2, zeroinitializer - %4 = select <8 x i1> %3, <8 x i64> %2, <8 x i64> zeroinitializer - %5 = trunc <8 x i64> %4 to <8 x i8> - store <8 x i8> %5, <8 x i8> *%p1 - ret void -} - -define <8 x i8> @trunc_ssat_v8i64_v8i8(<8 x i64> %a0) "min-legal-vector-width"="256" { -; CHECK-LABEL: trunc_ssat_v8i64_v8i8: -; CHECK: # %bb.0: -; CHECK-NEXT: vpmovsqb %ymm1, %xmm1 -; CHECK-NEXT: vpmovsqb %ymm0, %xmm0 -; CHECK-NEXT: vpunpckldq {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1] -; CHECK-NEXT: vzeroupper -; CHECK-NEXT: retq - %1 = icmp slt <8 x i64> %a0, - %2 = select <8 x i1> %1, <8 x i64> %a0, <8 x i64> - %3 = icmp sgt <8 x i64> %2, - %4 = select <8 x i1> %3, <8 x i64> %2, <8 x i64> - %5 = trunc <8 x i64> %4 to <8 x i8> - ret <8 x i8> %5 -} - -define void @trunc_ssat_v8i64_v8i8_store(<8 x i64> %a0, <8 x i8> *%p1) "min-legal-vector-width"="256" 
{ -; CHECK-LABEL: trunc_ssat_v8i64_v8i8_store: -; CHECK: # %bb.0: -; CHECK-NEXT: vpmovsqb %ymm1, %xmm1 -; CHECK-NEXT: vpmovsqb %ymm0, %xmm0 -; CHECK-NEXT: vpunpckldq {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1] -; CHECK-NEXT: vmovq %xmm0, (%rdi) -; CHECK-NEXT: vzeroupper -; CHECK-NEXT: retq - %1 = icmp slt <8 x i64> %a0, - %2 = select <8 x i1> %1, <8 x i64> %a0, <8 x i64> - %3 = icmp sgt <8 x i64> %2, - %4 = select <8 x i1> %3, <8 x i64> %2, <8 x i64> - %5 = trunc <8 x i64> %4 to <8 x i8> - store <8 x i8> %5, <8 x i8> *%p1 - ret void -} - -define <8 x i8> @trunc_usat_v8i64_v8i8(<8 x i64> %a0) "min-legal-vector-width"="256" { -; CHECK-LABEL: trunc_usat_v8i64_v8i8: -; CHECK: # %bb.0: -; CHECK-NEXT: vpmovusqb %ymm1, %xmm1 -; CHECK-NEXT: vpmovusqb %ymm0, %xmm0 -; CHECK-NEXT: vpunpckldq {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1] -; CHECK-NEXT: vzeroupper -; CHECK-NEXT: retq - %1 = icmp ult <8 x i64> %a0, - %2 = select <8 x i1> %1, <8 x i64> %a0, <8 x i64> - %3 = trunc <8 x i64> %2 to <8 x i8> - ret <8 x i8> %3 -} - -define void @trunc_usat_v8i64_v8i8_store(<8 x i64> %a0, <8 x i8> *%p1) "min-legal-vector-width"="256" { -; CHECK-LABEL: trunc_usat_v8i64_v8i8_store: -; CHECK: # %bb.0: -; CHECK-NEXT: vpmovusqb %ymm1, %xmm1 -; CHECK-NEXT: vpmovusqb %ymm0, %xmm0 -; CHECK-NEXT: vpunpckldq {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1] -; CHECK-NEXT: vmovq %xmm0, (%rdi) -; CHECK-NEXT: vzeroupper -; CHECK-NEXT: retq - %1 = icmp ult <8 x i64> %a0, - %2 = select <8 x i1> %1, <8 x i64> %a0, <8 x i64> - %3 = trunc <8 x i64> %2 to <8 x i8> - store <8 x i8> %3, <8 x i8> *%p1 - ret void -} Modified: llvm/trunk/test/CodeGen/X86/vector-trunc-packus.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/vector-trunc-packus.ll?rev=374642&r1=374641&r2=374642&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/vector-trunc-packus.ll (original) +++ llvm/trunk/test/CodeGen/X86/vector-trunc-packus.ll 
Sat Oct 12 00:59:24 2019 @@ -9,6 +9,7 @@ ; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=+avx512vl,+fast-variable-shuffle | FileCheck %s --check-prefixes=AVX512,AVX512VL ; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=+avx512bw,+fast-variable-shuffle | FileCheck %s --check-prefixes=AVX512,AVX512BW ; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=+avx512bw,+avx512vl,+fast-variable-shuffle | FileCheck %s --check-prefixes=AVX512,AVX512BWVL +; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mcpu=skx | FileCheck %s --check-prefixes=SKX ; ; PACKUS saturation truncation to vXi32 @@ -257,6 +258,14 @@ define <4 x i32> @trunc_packus_v4i64_v4i ; AVX512BWVL-NEXT: vpmovusqd %ymm0, %xmm0 ; AVX512BWVL-NEXT: vzeroupper ; AVX512BWVL-NEXT: retq +; +; SKX-LABEL: trunc_packus_v4i64_v4i32: +; SKX: # %bb.0: +; SKX-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; SKX-NEXT: vpmaxsq %ymm1, %ymm0, %ymm0 +; SKX-NEXT: vpmovusqd %ymm0, %xmm0 +; SKX-NEXT: vzeroupper +; SKX-NEXT: retq %1 = icmp slt <4 x i64> %a0, %2 = select <4 x i1> %1, <4 x i64> %a0, <4 x i64> %3 = icmp sgt <4 x i64> %2, zeroinitializer @@ -266,338 +275,354 @@ define <4 x i32> @trunc_packus_v4i64_v4i } -define <8 x i32> @trunc_packus_v8i64_v8i32(<8 x i64> %a0) { +define <8 x i32> @trunc_packus_v8i64_v8i32(<8 x i64>* %p0) "min-legal-vector-width"="256" { ; SSE2-LABEL: trunc_packus_v8i64_v8i32: ; SSE2: # %bb.0: +; SSE2-NEXT: movdqa (%rdi), %xmm3 +; SSE2-NEXT: movdqa 16(%rdi), %xmm7 +; SSE2-NEXT: movdqa 32(%rdi), %xmm6 +; SSE2-NEXT: movdqa 48(%rdi), %xmm9 ; SSE2-NEXT: movdqa {{.*#+}} xmm8 = [4294967295,4294967295] -; SSE2-NEXT: movdqa {{.*#+}} xmm10 = [2147483648,2147483648] -; SSE2-NEXT: movdqa %xmm0, %xmm5 -; SSE2-NEXT: pxor %xmm10, %xmm5 -; SSE2-NEXT: movdqa {{.*#+}} xmm9 = [2147483647,2147483647] -; SSE2-NEXT: movdqa %xmm9, %xmm6 -; SSE2-NEXT: pcmpgtd %xmm5, %xmm6 -; SSE2-NEXT: pshufd {{.*#+}} xmm7 = xmm6[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm9, %xmm5 -; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm5[1,1,3,3] -; SSE2-NEXT: pand 
%xmm7, %xmm4 -; SSE2-NEXT: pshufd {{.*#+}} xmm5 = xmm6[1,1,3,3] -; SSE2-NEXT: por %xmm4, %xmm5 -; SSE2-NEXT: pand %xmm5, %xmm0 -; SSE2-NEXT: pandn %xmm8, %xmm5 -; SSE2-NEXT: por %xmm0, %xmm5 -; SSE2-NEXT: movdqa %xmm1, %xmm0 -; SSE2-NEXT: pxor %xmm10, %xmm0 -; SSE2-NEXT: movdqa %xmm9, %xmm4 -; SSE2-NEXT: pcmpgtd %xmm0, %xmm4 -; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm4[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm9, %xmm0 -; SSE2-NEXT: pshufd {{.*#+}} xmm7 = xmm0[1,1,3,3] -; SSE2-NEXT: pand %xmm6, %xmm7 -; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm4[1,1,3,3] -; SSE2-NEXT: por %xmm7, %xmm0 -; SSE2-NEXT: pand %xmm0, %xmm1 -; SSE2-NEXT: pandn %xmm8, %xmm0 -; SSE2-NEXT: por %xmm1, %xmm0 -; SSE2-NEXT: movdqa %xmm2, %xmm1 -; SSE2-NEXT: pxor %xmm10, %xmm1 -; SSE2-NEXT: movdqa %xmm9, %xmm4 -; SSE2-NEXT: pcmpgtd %xmm1, %xmm4 -; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm4[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm9, %xmm1 -; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] -; SSE2-NEXT: pand %xmm6, %xmm1 -; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm4[1,1,3,3] -; SSE2-NEXT: por %xmm1, %xmm6 -; SSE2-NEXT: pand %xmm6, %xmm2 -; SSE2-NEXT: pandn %xmm8, %xmm6 -; SSE2-NEXT: por %xmm2, %xmm6 -; SSE2-NEXT: movdqa %xmm3, %xmm1 -; SSE2-NEXT: pxor %xmm10, %xmm1 -; SSE2-NEXT: movdqa %xmm9, %xmm2 -; SSE2-NEXT: pcmpgtd %xmm1, %xmm2 -; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm2[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm9, %xmm1 -; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] -; SSE2-NEXT: pand %xmm4, %xmm1 -; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] -; SSE2-NEXT: por %xmm1, %xmm2 +; SSE2-NEXT: movdqa {{.*#+}} xmm11 = [2147483648,2147483648] +; SSE2-NEXT: movdqa %xmm3, %xmm2 +; SSE2-NEXT: pxor %xmm11, %xmm2 +; SSE2-NEXT: movdqa {{.*#+}} xmm10 = [2147483647,2147483647] +; SSE2-NEXT: movdqa %xmm10, %xmm5 +; SSE2-NEXT: pcmpgtd %xmm2, %xmm5 +; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm5[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm10, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm2[1,1,3,3] +; SSE2-NEXT: pand %xmm1, %xmm4 +; SSE2-NEXT: pshufd 
{{.*#+}} xmm2 = xmm5[1,1,3,3] +; SSE2-NEXT: por %xmm4, %xmm2 ; SSE2-NEXT: pand %xmm2, %xmm3 ; SSE2-NEXT: pandn %xmm8, %xmm2 ; SSE2-NEXT: por %xmm3, %xmm2 -; SSE2-NEXT: movdqa %xmm2, %xmm1 -; SSE2-NEXT: pxor %xmm10, %xmm1 -; SSE2-NEXT: movdqa %xmm1, %xmm3 -; SSE2-NEXT: pcmpgtd %xmm10, %xmm3 +; SSE2-NEXT: movdqa %xmm7, %xmm1 +; SSE2-NEXT: pxor %xmm11, %xmm1 +; SSE2-NEXT: movdqa %xmm10, %xmm3 +; SSE2-NEXT: pcmpgtd %xmm1, %xmm3 ; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm3[0,0,2,2] ; SSE2-NEXT: pcmpeqd %xmm10, %xmm1 ; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] ; SSE2-NEXT: pand %xmm4, %xmm1 ; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm3[1,1,3,3] ; SSE2-NEXT: por %xmm1, %xmm3 -; SSE2-NEXT: pand %xmm2, %xmm3 +; SSE2-NEXT: pand %xmm3, %xmm7 +; SSE2-NEXT: pandn %xmm8, %xmm3 +; SSE2-NEXT: por %xmm7, %xmm3 ; SSE2-NEXT: movdqa %xmm6, %xmm1 -; SSE2-NEXT: pxor %xmm10, %xmm1 -; SSE2-NEXT: movdqa %xmm1, %xmm2 -; SSE2-NEXT: pcmpgtd %xmm10, %xmm2 -; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm2[0,0,2,2] +; SSE2-NEXT: pxor %xmm11, %xmm1 +; SSE2-NEXT: movdqa %xmm10, %xmm4 +; SSE2-NEXT: pcmpgtd %xmm1, %xmm4 +; SSE2-NEXT: pshufd {{.*#+}} xmm5 = xmm4[0,0,2,2] ; SSE2-NEXT: pcmpeqd %xmm10, %xmm1 -; SSE2-NEXT: pshufd {{.*#+}} xmm7 = xmm1[1,1,3,3] -; SSE2-NEXT: pand %xmm4, %xmm7 -; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm2[1,1,3,3] -; SSE2-NEXT: por %xmm7, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] +; SSE2-NEXT: pand %xmm5, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm7 = xmm4[1,1,3,3] +; SSE2-NEXT: por %xmm1, %xmm7 +; SSE2-NEXT: pand %xmm7, %xmm6 +; SSE2-NEXT: pandn %xmm8, %xmm7 +; SSE2-NEXT: por %xmm6, %xmm7 +; SSE2-NEXT: movdqa %xmm9, %xmm1 +; SSE2-NEXT: pxor %xmm11, %xmm1 +; SSE2-NEXT: movdqa %xmm10, %xmm4 +; SSE2-NEXT: pcmpgtd %xmm1, %xmm4 +; SSE2-NEXT: pshufd {{.*#+}} xmm5 = xmm4[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm10, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] +; SSE2-NEXT: pand %xmm5, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm4[1,1,3,3] +; SSE2-NEXT: por %xmm1, %xmm4 +; 
SSE2-NEXT: pand %xmm4, %xmm9 +; SSE2-NEXT: pandn %xmm8, %xmm4 +; SSE2-NEXT: por %xmm9, %xmm4 +; SSE2-NEXT: movdqa %xmm4, %xmm1 +; SSE2-NEXT: pxor %xmm11, %xmm1 +; SSE2-NEXT: movdqa %xmm1, %xmm5 +; SSE2-NEXT: pcmpgtd %xmm11, %xmm5 +; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm5[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm11, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] ; SSE2-NEXT: pand %xmm6, %xmm1 -; SSE2-NEXT: shufps {{.*#+}} xmm1 = xmm1[0,2],xmm3[0,2] -; SSE2-NEXT: movdqa %xmm0, %xmm2 -; SSE2-NEXT: pxor %xmm10, %xmm2 -; SSE2-NEXT: movdqa %xmm2, %xmm3 -; SSE2-NEXT: pcmpgtd %xmm10, %xmm3 -; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm3[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm10, %xmm2 -; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] -; SSE2-NEXT: pand %xmm4, %xmm2 -; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm3[1,1,3,3] -; SSE2-NEXT: por %xmm2, %xmm3 -; SSE2-NEXT: pand %xmm0, %xmm3 -; SSE2-NEXT: movdqa %xmm5, %xmm0 -; SSE2-NEXT: pxor %xmm10, %xmm0 -; SSE2-NEXT: movdqa %xmm0, %xmm2 -; SSE2-NEXT: pcmpgtd %xmm10, %xmm2 -; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm2[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm10, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm5 = xmm5[1,1,3,3] +; SSE2-NEXT: por %xmm1, %xmm5 +; SSE2-NEXT: pand %xmm4, %xmm5 +; SSE2-NEXT: movdqa %xmm7, %xmm1 +; SSE2-NEXT: pxor %xmm11, %xmm1 +; SSE2-NEXT: movdqa %xmm1, %xmm4 +; SSE2-NEXT: pcmpgtd %xmm11, %xmm4 +; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm4[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm11, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm1[1,1,3,3] +; SSE2-NEXT: pand %xmm6, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm4[1,1,3,3] +; SSE2-NEXT: por %xmm0, %xmm1 +; SSE2-NEXT: pand %xmm7, %xmm1 +; SSE2-NEXT: shufps {{.*#+}} xmm1 = xmm1[0,2],xmm5[0,2] +; SSE2-NEXT: movdqa %xmm3, %xmm0 +; SSE2-NEXT: pxor %xmm11, %xmm0 +; SSE2-NEXT: movdqa %xmm0, %xmm4 +; SSE2-NEXT: pcmpgtd %xmm11, %xmm4 +; SSE2-NEXT: pshufd {{.*#+}} xmm5 = xmm4[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm11, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] +; SSE2-NEXT: pand %xmm5, %xmm0 
+; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm4[1,1,3,3] +; SSE2-NEXT: por %xmm0, %xmm4 +; SSE2-NEXT: pand %xmm3, %xmm4 +; SSE2-NEXT: movdqa %xmm2, %xmm0 +; SSE2-NEXT: pxor %xmm11, %xmm0 +; SSE2-NEXT: movdqa %xmm0, %xmm3 +; SSE2-NEXT: pcmpgtd %xmm11, %xmm3 +; SSE2-NEXT: pshufd {{.*#+}} xmm5 = xmm3[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm11, %xmm0 ; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm0[1,1,3,3] -; SSE2-NEXT: pand %xmm4, %xmm6 -; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm2[1,1,3,3] +; SSE2-NEXT: pand %xmm5, %xmm6 +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm3[1,1,3,3] ; SSE2-NEXT: por %xmm6, %xmm0 -; SSE2-NEXT: pand %xmm5, %xmm0 -; SSE2-NEXT: shufps {{.*#+}} xmm0 = xmm0[0,2],xmm3[0,2] +; SSE2-NEXT: pand %xmm2, %xmm0 +; SSE2-NEXT: shufps {{.*#+}} xmm0 = xmm0[0,2],xmm4[0,2] ; SSE2-NEXT: retq ; ; SSSE3-LABEL: trunc_packus_v8i64_v8i32: ; SSSE3: # %bb.0: +; SSSE3-NEXT: movdqa (%rdi), %xmm3 +; SSSE3-NEXT: movdqa 16(%rdi), %xmm7 +; SSSE3-NEXT: movdqa 32(%rdi), %xmm6 +; SSSE3-NEXT: movdqa 48(%rdi), %xmm9 ; SSSE3-NEXT: movdqa {{.*#+}} xmm8 = [4294967295,4294967295] -; SSSE3-NEXT: movdqa {{.*#+}} xmm10 = [2147483648,2147483648] -; SSSE3-NEXT: movdqa %xmm0, %xmm5 -; SSSE3-NEXT: pxor %xmm10, %xmm5 -; SSSE3-NEXT: movdqa {{.*#+}} xmm9 = [2147483647,2147483647] -; SSSE3-NEXT: movdqa %xmm9, %xmm6 -; SSSE3-NEXT: pcmpgtd %xmm5, %xmm6 -; SSSE3-NEXT: pshufd {{.*#+}} xmm7 = xmm6[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm9, %xmm5 -; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm5[1,1,3,3] -; SSSE3-NEXT: pand %xmm7, %xmm4 -; SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm6[1,1,3,3] -; SSSE3-NEXT: por %xmm4, %xmm5 -; SSSE3-NEXT: pand %xmm5, %xmm0 -; SSSE3-NEXT: pandn %xmm8, %xmm5 -; SSSE3-NEXT: por %xmm0, %xmm5 -; SSSE3-NEXT: movdqa %xmm1, %xmm0 -; SSSE3-NEXT: pxor %xmm10, %xmm0 -; SSSE3-NEXT: movdqa %xmm9, %xmm4 -; SSSE3-NEXT: pcmpgtd %xmm0, %xmm4 -; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm4[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm9, %xmm0 -; SSSE3-NEXT: pshufd {{.*#+}} xmm7 = xmm0[1,1,3,3] -; SSSE3-NEXT: pand %xmm6, %xmm7 -; 
SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm4[1,1,3,3] -; SSSE3-NEXT: por %xmm7, %xmm0 -; SSSE3-NEXT: pand %xmm0, %xmm1 -; SSSE3-NEXT: pandn %xmm8, %xmm0 -; SSSE3-NEXT: por %xmm1, %xmm0 -; SSSE3-NEXT: movdqa %xmm2, %xmm1 -; SSSE3-NEXT: pxor %xmm10, %xmm1 -; SSSE3-NEXT: movdqa %xmm9, %xmm4 -; SSSE3-NEXT: pcmpgtd %xmm1, %xmm4 -; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm4[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm9, %xmm1 -; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] -; SSSE3-NEXT: pand %xmm6, %xmm1 -; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm4[1,1,3,3] -; SSSE3-NEXT: por %xmm1, %xmm6 -; SSSE3-NEXT: pand %xmm6, %xmm2 -; SSSE3-NEXT: pandn %xmm8, %xmm6 -; SSSE3-NEXT: por %xmm2, %xmm6 -; SSSE3-NEXT: movdqa %xmm3, %xmm1 -; SSSE3-NEXT: pxor %xmm10, %xmm1 -; SSSE3-NEXT: movdqa %xmm9, %xmm2 -; SSSE3-NEXT: pcmpgtd %xmm1, %xmm2 -; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm2[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm9, %xmm1 -; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] -; SSSE3-NEXT: pand %xmm4, %xmm1 -; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] -; SSSE3-NEXT: por %xmm1, %xmm2 +; SSSE3-NEXT: movdqa {{.*#+}} xmm11 = [2147483648,2147483648] +; SSSE3-NEXT: movdqa %xmm3, %xmm2 +; SSSE3-NEXT: pxor %xmm11, %xmm2 +; SSSE3-NEXT: movdqa {{.*#+}} xmm10 = [2147483647,2147483647] +; SSSE3-NEXT: movdqa %xmm10, %xmm5 +; SSSE3-NEXT: pcmpgtd %xmm2, %xmm5 +; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm5[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm10, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm2[1,1,3,3] +; SSSE3-NEXT: pand %xmm1, %xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm5[1,1,3,3] +; SSSE3-NEXT: por %xmm4, %xmm2 ; SSSE3-NEXT: pand %xmm2, %xmm3 ; SSSE3-NEXT: pandn %xmm8, %xmm2 ; SSSE3-NEXT: por %xmm3, %xmm2 -; SSSE3-NEXT: movdqa %xmm2, %xmm1 -; SSSE3-NEXT: pxor %xmm10, %xmm1 -; SSSE3-NEXT: movdqa %xmm1, %xmm3 -; SSSE3-NEXT: pcmpgtd %xmm10, %xmm3 +; SSSE3-NEXT: movdqa %xmm7, %xmm1 +; SSSE3-NEXT: pxor %xmm11, %xmm1 +; SSSE3-NEXT: movdqa %xmm10, %xmm3 +; SSSE3-NEXT: pcmpgtd %xmm1, %xmm3 ; SSSE3-NEXT: pshufd 
{{.*#+}} xmm4 = xmm3[0,0,2,2] ; SSSE3-NEXT: pcmpeqd %xmm10, %xmm1 ; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] ; SSSE3-NEXT: pand %xmm4, %xmm1 ; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm3[1,1,3,3] ; SSSE3-NEXT: por %xmm1, %xmm3 -; SSSE3-NEXT: pand %xmm2, %xmm3 +; SSSE3-NEXT: pand %xmm3, %xmm7 +; SSSE3-NEXT: pandn %xmm8, %xmm3 +; SSSE3-NEXT: por %xmm7, %xmm3 ; SSSE3-NEXT: movdqa %xmm6, %xmm1 -; SSSE3-NEXT: pxor %xmm10, %xmm1 -; SSSE3-NEXT: movdqa %xmm1, %xmm2 -; SSSE3-NEXT: pcmpgtd %xmm10, %xmm2 -; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm2[0,0,2,2] +; SSSE3-NEXT: pxor %xmm11, %xmm1 +; SSSE3-NEXT: movdqa %xmm10, %xmm4 +; SSSE3-NEXT: pcmpgtd %xmm1, %xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm4[0,0,2,2] ; SSSE3-NEXT: pcmpeqd %xmm10, %xmm1 -; SSSE3-NEXT: pshufd {{.*#+}} xmm7 = xmm1[1,1,3,3] -; SSSE3-NEXT: pand %xmm4, %xmm7 -; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm2[1,1,3,3] -; SSSE3-NEXT: por %xmm7, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] +; SSSE3-NEXT: pand %xmm5, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm7 = xmm4[1,1,3,3] +; SSSE3-NEXT: por %xmm1, %xmm7 +; SSSE3-NEXT: pand %xmm7, %xmm6 +; SSSE3-NEXT: pandn %xmm8, %xmm7 +; SSSE3-NEXT: por %xmm6, %xmm7 +; SSSE3-NEXT: movdqa %xmm9, %xmm1 +; SSSE3-NEXT: pxor %xmm11, %xmm1 +; SSSE3-NEXT: movdqa %xmm10, %xmm4 +; SSSE3-NEXT: pcmpgtd %xmm1, %xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm4[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm10, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] +; SSSE3-NEXT: pand %xmm5, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm4[1,1,3,3] +; SSSE3-NEXT: por %xmm1, %xmm4 +; SSSE3-NEXT: pand %xmm4, %xmm9 +; SSSE3-NEXT: pandn %xmm8, %xmm4 +; SSSE3-NEXT: por %xmm9, %xmm4 +; SSSE3-NEXT: movdqa %xmm4, %xmm1 +; SSSE3-NEXT: pxor %xmm11, %xmm1 +; SSSE3-NEXT: movdqa %xmm1, %xmm5 +; SSSE3-NEXT: pcmpgtd %xmm11, %xmm5 +; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm5[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm11, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] ; SSSE3-NEXT: pand %xmm6, %xmm1 
-; SSSE3-NEXT: shufps {{.*#+}} xmm1 = xmm1[0,2],xmm3[0,2] -; SSSE3-NEXT: movdqa %xmm0, %xmm2 -; SSSE3-NEXT: pxor %xmm10, %xmm2 -; SSSE3-NEXT: movdqa %xmm2, %xmm3 -; SSSE3-NEXT: pcmpgtd %xmm10, %xmm3 -; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm3[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm10, %xmm2 -; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] -; SSSE3-NEXT: pand %xmm4, %xmm2 -; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm3[1,1,3,3] -; SSSE3-NEXT: por %xmm2, %xmm3 -; SSSE3-NEXT: pand %xmm0, %xmm3 -; SSSE3-NEXT: movdqa %xmm5, %xmm0 -; SSSE3-NEXT: pxor %xmm10, %xmm0 -; SSSE3-NEXT: movdqa %xmm0, %xmm2 -; SSSE3-NEXT: pcmpgtd %xmm10, %xmm2 -; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm2[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm10, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm5[1,1,3,3] +; SSSE3-NEXT: por %xmm1, %xmm5 +; SSSE3-NEXT: pand %xmm4, %xmm5 +; SSSE3-NEXT: movdqa %xmm7, %xmm1 +; SSSE3-NEXT: pxor %xmm11, %xmm1 +; SSSE3-NEXT: movdqa %xmm1, %xmm4 +; SSSE3-NEXT: pcmpgtd %xmm11, %xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm4[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm11, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm1[1,1,3,3] +; SSSE3-NEXT: pand %xmm6, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm4[1,1,3,3] +; SSSE3-NEXT: por %xmm0, %xmm1 +; SSSE3-NEXT: pand %xmm7, %xmm1 +; SSSE3-NEXT: shufps {{.*#+}} xmm1 = xmm1[0,2],xmm5[0,2] +; SSSE3-NEXT: movdqa %xmm3, %xmm0 +; SSSE3-NEXT: pxor %xmm11, %xmm0 +; SSSE3-NEXT: movdqa %xmm0, %xmm4 +; SSSE3-NEXT: pcmpgtd %xmm11, %xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm4[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm11, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] +; SSSE3-NEXT: pand %xmm5, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm4[1,1,3,3] +; SSSE3-NEXT: por %xmm0, %xmm4 +; SSSE3-NEXT: pand %xmm3, %xmm4 +; SSSE3-NEXT: movdqa %xmm2, %xmm0 +; SSSE3-NEXT: pxor %xmm11, %xmm0 +; SSSE3-NEXT: movdqa %xmm0, %xmm3 +; SSSE3-NEXT: pcmpgtd %xmm11, %xmm3 +; SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm3[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm11, %xmm0 ; 
SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm0[1,1,3,3] -; SSSE3-NEXT: pand %xmm4, %xmm6 -; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm2[1,1,3,3] +; SSSE3-NEXT: pand %xmm5, %xmm6 +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm3[1,1,3,3] ; SSSE3-NEXT: por %xmm6, %xmm0 -; SSSE3-NEXT: pand %xmm5, %xmm0 -; SSSE3-NEXT: shufps {{.*#+}} xmm0 = xmm0[0,2],xmm3[0,2] +; SSSE3-NEXT: pand %xmm2, %xmm0 +; SSSE3-NEXT: shufps {{.*#+}} xmm0 = xmm0[0,2],xmm4[0,2] ; SSSE3-NEXT: retq ; ; SSE41-LABEL: trunc_packus_v8i64_v8i32: ; SSE41: # %bb.0: -; SSE41-NEXT: movdqa %xmm0, %xmm9 -; SSE41-NEXT: movapd {{.*#+}} xmm7 = [4294967295,4294967295] -; SSE41-NEXT: movdqa {{.*#+}} xmm10 = [2147483648,2147483648] -; SSE41-NEXT: pxor %xmm10, %xmm0 -; SSE41-NEXT: movdqa {{.*#+}} xmm5 = [2147483647,2147483647] -; SSE41-NEXT: movdqa %xmm5, %xmm4 -; SSE41-NEXT: pcmpeqd %xmm0, %xmm4 -; SSE41-NEXT: movdqa %xmm5, %xmm6 +; SSE41-NEXT: movdqa (%rdi), %xmm5 +; SSE41-NEXT: movdqa 16(%rdi), %xmm4 +; SSE41-NEXT: movdqa 32(%rdi), %xmm10 +; SSE41-NEXT: movdqa 48(%rdi), %xmm9 +; SSE41-NEXT: movapd {{.*#+}} xmm1 = [4294967295,4294967295] +; SSE41-NEXT: movdqa {{.*#+}} xmm3 = [2147483648,2147483648] +; SSE41-NEXT: movdqa %xmm5, %xmm0 +; SSE41-NEXT: pxor %xmm3, %xmm0 +; SSE41-NEXT: movdqa {{.*#+}} xmm2 = [2147483647,2147483647] +; SSE41-NEXT: movdqa %xmm2, %xmm7 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm7 +; SSE41-NEXT: movdqa %xmm2, %xmm6 ; SSE41-NEXT: pcmpgtd %xmm0, %xmm6 ; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm6[0,0,2,2] -; SSE41-NEXT: pand %xmm4, %xmm0 +; SSE41-NEXT: pand %xmm7, %xmm0 ; SSE41-NEXT: por %xmm6, %xmm0 -; SSE41-NEXT: movapd %xmm7, %xmm8 -; SSE41-NEXT: blendvpd %xmm0, %xmm9, %xmm8 -; SSE41-NEXT: movdqa %xmm1, %xmm0 -; SSE41-NEXT: pxor %xmm10, %xmm0 -; SSE41-NEXT: movdqa %xmm5, %xmm4 +; SSE41-NEXT: movapd %xmm1, %xmm8 +; SSE41-NEXT: blendvpd %xmm0, %xmm5, %xmm8 +; SSE41-NEXT: movdqa %xmm4, %xmm0 +; SSE41-NEXT: pxor %xmm3, %xmm0 +; SSE41-NEXT: movdqa %xmm2, %xmm5 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm5 +; SSE41-NEXT: movdqa %xmm2, 
%xmm6 +; SSE41-NEXT: pcmpgtd %xmm0, %xmm6 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm6[0,0,2,2] +; SSE41-NEXT: pand %xmm5, %xmm0 +; SSE41-NEXT: por %xmm6, %xmm0 +; SSE41-NEXT: movapd %xmm1, %xmm5 +; SSE41-NEXT: blendvpd %xmm0, %xmm4, %xmm5 +; SSE41-NEXT: movdqa %xmm10, %xmm0 +; SSE41-NEXT: pxor %xmm3, %xmm0 +; SSE41-NEXT: movdqa %xmm2, %xmm4 ; SSE41-NEXT: pcmpeqd %xmm0, %xmm4 -; SSE41-NEXT: movdqa %xmm5, %xmm6 +; SSE41-NEXT: movdqa %xmm2, %xmm6 ; SSE41-NEXT: pcmpgtd %xmm0, %xmm6 ; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm6[0,0,2,2] ; SSE41-NEXT: pand %xmm4, %xmm0 ; SSE41-NEXT: por %xmm6, %xmm0 -; SSE41-NEXT: movapd %xmm7, %xmm9 -; SSE41-NEXT: blendvpd %xmm0, %xmm1, %xmm9 -; SSE41-NEXT: movdqa %xmm2, %xmm0 -; SSE41-NEXT: pxor %xmm10, %xmm0 -; SSE41-NEXT: movdqa %xmm5, %xmm1 -; SSE41-NEXT: pcmpeqd %xmm0, %xmm1 -; SSE41-NEXT: movdqa %xmm5, %xmm4 -; SSE41-NEXT: pcmpgtd %xmm0, %xmm4 -; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm4[0,0,2,2] -; SSE41-NEXT: pand %xmm1, %xmm0 -; SSE41-NEXT: por %xmm4, %xmm0 -; SSE41-NEXT: movapd %xmm7, %xmm4 -; SSE41-NEXT: blendvpd %xmm0, %xmm2, %xmm4 -; SSE41-NEXT: movdqa %xmm3, %xmm0 -; SSE41-NEXT: pxor %xmm10, %xmm0 -; SSE41-NEXT: movdqa %xmm5, %xmm1 -; SSE41-NEXT: pcmpeqd %xmm0, %xmm1 -; SSE41-NEXT: pcmpgtd %xmm0, %xmm5 -; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm5[0,0,2,2] -; SSE41-NEXT: pand %xmm1, %xmm0 -; SSE41-NEXT: por %xmm5, %xmm0 -; SSE41-NEXT: blendvpd %xmm0, %xmm3, %xmm7 -; SSE41-NEXT: xorpd %xmm2, %xmm2 -; SSE41-NEXT: movapd %xmm7, %xmm1 -; SSE41-NEXT: xorpd %xmm10, %xmm1 -; SSE41-NEXT: movapd %xmm1, %xmm3 -; SSE41-NEXT: pcmpeqd %xmm10, %xmm3 -; SSE41-NEXT: pcmpgtd %xmm10, %xmm1 -; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm1[0,0,2,2] -; SSE41-NEXT: pand %xmm3, %xmm0 -; SSE41-NEXT: por %xmm1, %xmm0 -; SSE41-NEXT: pxor %xmm3, %xmm3 -; SSE41-NEXT: blendvpd %xmm0, %xmm7, %xmm3 +; SSE41-NEXT: movapd %xmm1, %xmm4 +; SSE41-NEXT: blendvpd %xmm0, %xmm10, %xmm4 +; SSE41-NEXT: movdqa %xmm9, %xmm0 +; SSE41-NEXT: pxor %xmm3, %xmm0 +; SSE41-NEXT: movdqa 
%xmm2, %xmm6 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm6 +; SSE41-NEXT: pcmpgtd %xmm0, %xmm2 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm2[0,0,2,2] +; SSE41-NEXT: pand %xmm6, %xmm0 +; SSE41-NEXT: por %xmm2, %xmm0 +; SSE41-NEXT: blendvpd %xmm0, %xmm9, %xmm1 +; SSE41-NEXT: pxor %xmm2, %xmm2 +; SSE41-NEXT: movapd %xmm1, %xmm6 +; SSE41-NEXT: xorpd %xmm3, %xmm6 +; SSE41-NEXT: movapd %xmm6, %xmm7 +; SSE41-NEXT: pcmpeqd %xmm3, %xmm7 +; SSE41-NEXT: pcmpgtd %xmm3, %xmm6 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm6[0,0,2,2] +; SSE41-NEXT: pand %xmm7, %xmm0 +; SSE41-NEXT: por %xmm6, %xmm0 +; SSE41-NEXT: pxor %xmm6, %xmm6 +; SSE41-NEXT: blendvpd %xmm0, %xmm1, %xmm6 ; SSE41-NEXT: movapd %xmm4, %xmm1 -; SSE41-NEXT: xorpd %xmm10, %xmm1 -; SSE41-NEXT: movapd %xmm1, %xmm5 -; SSE41-NEXT: pcmpeqd %xmm10, %xmm5 -; SSE41-NEXT: pcmpgtd %xmm10, %xmm1 +; SSE41-NEXT: xorpd %xmm3, %xmm1 +; SSE41-NEXT: movapd %xmm1, %xmm7 +; SSE41-NEXT: pcmpeqd %xmm3, %xmm7 +; SSE41-NEXT: pcmpgtd %xmm3, %xmm1 ; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm1[0,0,2,2] -; SSE41-NEXT: pand %xmm5, %xmm0 +; SSE41-NEXT: pand %xmm7, %xmm0 ; SSE41-NEXT: por %xmm1, %xmm0 ; SSE41-NEXT: pxor %xmm1, %xmm1 ; SSE41-NEXT: blendvpd %xmm0, %xmm4, %xmm1 -; SSE41-NEXT: shufps {{.*#+}} xmm1 = xmm1[0,2],xmm3[0,2] -; SSE41-NEXT: movapd %xmm9, %xmm3 -; SSE41-NEXT: xorpd %xmm10, %xmm3 -; SSE41-NEXT: movapd %xmm3, %xmm4 -; SSE41-NEXT: pcmpeqd %xmm10, %xmm4 -; SSE41-NEXT: pcmpgtd %xmm10, %xmm3 -; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm3[0,0,2,2] -; SSE41-NEXT: pand %xmm4, %xmm0 -; SSE41-NEXT: por %xmm3, %xmm0 -; SSE41-NEXT: pxor %xmm3, %xmm3 -; SSE41-NEXT: blendvpd %xmm0, %xmm9, %xmm3 -; SSE41-NEXT: movapd %xmm8, %xmm4 -; SSE41-NEXT: xorpd %xmm10, %xmm4 -; SSE41-NEXT: movapd %xmm4, %xmm5 -; SSE41-NEXT: pcmpeqd %xmm10, %xmm5 -; SSE41-NEXT: pcmpgtd %xmm10, %xmm4 +; SSE41-NEXT: shufps {{.*#+}} xmm1 = xmm1[0,2],xmm6[0,2] +; SSE41-NEXT: movapd %xmm5, %xmm4 +; SSE41-NEXT: xorpd %xmm3, %xmm4 +; SSE41-NEXT: movapd %xmm4, %xmm6 +; SSE41-NEXT: pcmpeqd %xmm3, %xmm6 
+; SSE41-NEXT: pcmpgtd %xmm3, %xmm4 ; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm4[0,0,2,2] -; SSE41-NEXT: pand %xmm5, %xmm0 +; SSE41-NEXT: pand %xmm6, %xmm0 ; SSE41-NEXT: por %xmm4, %xmm0 +; SSE41-NEXT: pxor %xmm4, %xmm4 +; SSE41-NEXT: blendvpd %xmm0, %xmm5, %xmm4 +; SSE41-NEXT: movapd %xmm8, %xmm5 +; SSE41-NEXT: xorpd %xmm3, %xmm5 +; SSE41-NEXT: movapd %xmm5, %xmm6 +; SSE41-NEXT: pcmpeqd %xmm3, %xmm6 +; SSE41-NEXT: pcmpgtd %xmm3, %xmm5 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm5[0,0,2,2] +; SSE41-NEXT: pand %xmm6, %xmm0 +; SSE41-NEXT: por %xmm5, %xmm0 ; SSE41-NEXT: blendvpd %xmm0, %xmm8, %xmm2 -; SSE41-NEXT: shufps {{.*#+}} xmm2 = xmm2[0,2],xmm3[0,2] +; SSE41-NEXT: shufps {{.*#+}} xmm2 = xmm2[0,2],xmm4[0,2] ; SSE41-NEXT: movaps %xmm2, %xmm0 ; SSE41-NEXT: retq ; ; AVX1-LABEL: trunc_packus_v8i64_v8i32: ; AVX1: # %bb.0: -; AVX1-NEXT: vextractf128 $1, %ymm1, %xmm2 -; AVX1-NEXT: vmovdqa {{.*#+}} xmm3 = [4294967295,4294967295] -; AVX1-NEXT: vpcmpgtq %xmm2, %xmm3, %xmm8 -; AVX1-NEXT: vpcmpgtq %xmm1, %xmm3, %xmm5 -; AVX1-NEXT: vextractf128 $1, %ymm0, %xmm6 -; AVX1-NEXT: vpcmpgtq %xmm6, %xmm3, %xmm7 -; AVX1-NEXT: vpcmpgtq %xmm0, %xmm3, %xmm4 -; AVX1-NEXT: vblendvpd %xmm4, %xmm0, %xmm3, %xmm0 -; AVX1-NEXT: vpxor %xmm4, %xmm4, %xmm4 -; AVX1-NEXT: vpcmpgtq %xmm4, %xmm0, %xmm9 -; AVX1-NEXT: vblendvpd %xmm7, %xmm6, %xmm3, %xmm6 -; AVX1-NEXT: vpcmpgtq %xmm4, %xmm6, %xmm7 -; AVX1-NEXT: vblendvpd %xmm5, %xmm1, %xmm3, %xmm1 -; AVX1-NEXT: vpcmpgtq %xmm4, %xmm1, %xmm5 -; AVX1-NEXT: vblendvpd %xmm8, %xmm2, %xmm3, %xmm2 -; AVX1-NEXT: vpcmpgtq %xmm4, %xmm2, %xmm3 -; AVX1-NEXT: vpand %xmm2, %xmm3, %xmm2 -; AVX1-NEXT: vpand %xmm1, %xmm5, %xmm1 -; AVX1-NEXT: vshufps {{.*#+}} xmm1 = xmm1[0,2],xmm2[0,2] -; AVX1-NEXT: vpand %xmm6, %xmm7, %xmm2 +; AVX1-NEXT: vmovdqa (%rdi), %xmm0 +; AVX1-NEXT: vmovdqa 16(%rdi), %xmm1 +; AVX1-NEXT: vmovdqa 32(%rdi), %xmm2 +; AVX1-NEXT: vmovdqa 48(%rdi), %xmm3 +; AVX1-NEXT: vmovdqa {{.*#+}} xmm4 = [4294967295,4294967295] +; AVX1-NEXT: vpcmpgtq %xmm3, %xmm4, %xmm8 +; 
AVX1-NEXT: vpcmpgtq %xmm2, %xmm4, %xmm6 +; AVX1-NEXT: vpcmpgtq %xmm1, %xmm4, %xmm7 +; AVX1-NEXT: vpcmpgtq %xmm0, %xmm4, %xmm5 +; AVX1-NEXT: vblendvpd %xmm5, %xmm0, %xmm4, %xmm0 +; AVX1-NEXT: vpxor %xmm5, %xmm5, %xmm5 +; AVX1-NEXT: vpcmpgtq %xmm5, %xmm0, %xmm9 +; AVX1-NEXT: vblendvpd %xmm7, %xmm1, %xmm4, %xmm1 +; AVX1-NEXT: vpcmpgtq %xmm5, %xmm1, %xmm7 +; AVX1-NEXT: vblendvpd %xmm6, %xmm2, %xmm4, %xmm2 +; AVX1-NEXT: vpcmpgtq %xmm5, %xmm2, %xmm6 +; AVX1-NEXT: vblendvpd %xmm8, %xmm3, %xmm4, %xmm3 +; AVX1-NEXT: vpcmpgtq %xmm5, %xmm3, %xmm4 +; AVX1-NEXT: vpand %xmm3, %xmm4, %xmm3 +; AVX1-NEXT: vpand %xmm2, %xmm6, %xmm2 +; AVX1-NEXT: vshufps {{.*#+}} xmm2 = xmm2[0,2],xmm3[0,2] +; AVX1-NEXT: vpand %xmm1, %xmm7, %xmm1 ; AVX1-NEXT: vpand %xmm0, %xmm9, %xmm0 -; AVX1-NEXT: vshufps {{.*#+}} xmm0 = xmm0[0,2],xmm2[0,2] -; AVX1-NEXT: vinsertf128 $1, %xmm1, %ymm0, %ymm0 +; AVX1-NEXT: vshufps {{.*#+}} xmm0 = xmm0[0,2],xmm1[0,2] +; AVX1-NEXT: vinsertf128 $1, %xmm2, %ymm0, %ymm0 ; AVX1-NEXT: retq ; ; AVX2-SLOW-LABEL: trunc_packus_v8i64_v8i32: ; AVX2-SLOW: # %bb.0: +; AVX2-SLOW-NEXT: vmovdqa (%rdi), %ymm0 +; AVX2-SLOW-NEXT: vmovdqa 32(%rdi), %ymm1 ; AVX2-SLOW-NEXT: vpbroadcastq {{.*#+}} ymm2 = [4294967295,4294967295,4294967295,4294967295] ; AVX2-SLOW-NEXT: vpcmpgtq %ymm1, %ymm2, %ymm3 ; AVX2-SLOW-NEXT: vblendvpd %ymm3, %ymm1, %ymm2, %ymm1 @@ -617,6 +642,8 @@ define <8 x i32> @trunc_packus_v8i64_v8i ; ; AVX2-FAST-LABEL: trunc_packus_v8i64_v8i32: ; AVX2-FAST: # %bb.0: +; AVX2-FAST-NEXT: vmovdqa (%rdi), %ymm0 +; AVX2-FAST-NEXT: vmovdqa 32(%rdi), %ymm1 ; AVX2-FAST-NEXT: vpbroadcastq {{.*#+}} ymm2 = [4294967295,4294967295,4294967295,4294967295] ; AVX2-FAST-NEXT: vpcmpgtq %ymm0, %ymm2, %ymm3 ; AVX2-FAST-NEXT: vblendvpd %ymm3, %ymm0, %ymm2, %ymm0 @@ -635,10 +662,21 @@ define <8 x i32> @trunc_packus_v8i64_v8i ; ; AVX512-LABEL: trunc_packus_v8i64_v8i32: ; AVX512: # %bb.0: -; AVX512-NEXT: vpxor %xmm1, %xmm1, %xmm1 -; AVX512-NEXT: vpmaxsq %zmm1, %zmm0, %zmm0 +; AVX512-NEXT: vpxor %xmm0, %xmm0, 
%xmm0 +; AVX512-NEXT: vpmaxsq (%rdi), %zmm0, %zmm0 ; AVX512-NEXT: vpmovusqd %zmm0, %ymm0 ; AVX512-NEXT: retq +; +; SKX-LABEL: trunc_packus_v8i64_v8i32: +; SKX: # %bb.0: +; SKX-NEXT: vpxor %xmm0, %xmm0, %xmm0 +; SKX-NEXT: vpmaxsq (%rdi), %ymm0, %ymm1 +; SKX-NEXT: vpmovusqd %ymm1, %xmm1 +; SKX-NEXT: vpmaxsq 32(%rdi), %ymm0, %ymm0 +; SKX-NEXT: vpmovusqd %ymm0, %xmm0 +; SKX-NEXT: vinserti128 $1, %xmm0, %ymm1, %ymm0 +; SKX-NEXT: retq + %a0 = load <8 x i64>, <8 x i64>* %p0 %1 = icmp slt <8 x i64> %a0, %2 = select <8 x i1> %1, <8 x i64> %a0, <8 x i64> %3 = icmp sgt <8 x i64> %2, zeroinitializer @@ -914,6 +952,14 @@ define <4 x i16> @trunc_packus_v4i64_v4i ; AVX512BWVL-NEXT: vpmovusqw %ymm0, %xmm0 ; AVX512BWVL-NEXT: vzeroupper ; AVX512BWVL-NEXT: retq +; +; SKX-LABEL: trunc_packus_v4i64_v4i16: +; SKX: # %bb.0: +; SKX-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; SKX-NEXT: vpmaxsq %ymm1, %ymm0, %ymm0 +; SKX-NEXT: vpmovusqw %ymm0, %xmm0 +; SKX-NEXT: vzeroupper +; SKX-NEXT: retq %1 = icmp slt <4 x i64> %a0, %2 = select <4 x i1> %1, <4 x i64> %a0, <4 x i64> %3 = icmp sgt <4 x i64> %2, zeroinitializer @@ -1193,6 +1239,14 @@ define void @trunc_packus_v4i64_v4i16_st ; AVX512BWVL-NEXT: vpmovusqw %ymm0, (%rdi) ; AVX512BWVL-NEXT: vzeroupper ; AVX512BWVL-NEXT: retq +; +; SKX-LABEL: trunc_packus_v4i64_v4i16_store: +; SKX: # %bb.0: +; SKX-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; SKX-NEXT: vpmaxsq %ymm1, %ymm0, %ymm0 +; SKX-NEXT: vpmovusqw %ymm0, (%rdi) +; SKX-NEXT: vzeroupper +; SKX-NEXT: retq %1 = icmp slt <4 x i64> %a0, %2 = select <4 x i1> %1, <4 x i64> %a0, <4 x i64> %3 = icmp sgt <4 x i64> %2, zeroinitializer @@ -1202,265 +1256,276 @@ define void @trunc_packus_v4i64_v4i16_st ret void } -define <8 x i16> @trunc_packus_v8i64_v8i16(<8 x i64> %a0) { +define <8 x i16> @trunc_packus_v8i64_v8i16(<8 x i64>* %p0) "min-legal-vector-width"="256" { ; SSE2-LABEL: trunc_packus_v8i64_v8i16: ; SSE2: # %bb.0: +; SSE2-NEXT: movdqa (%rdi), %xmm7 +; SSE2-NEXT: movdqa 16(%rdi), %xmm2 +; SSE2-NEXT: movdqa 32(%rdi), 
%xmm9 +; SSE2-NEXT: movdqa 48(%rdi), %xmm6 ; SSE2-NEXT: movdqa {{.*#+}} xmm8 = [65535,65535] -; SSE2-NEXT: movdqa {{.*#+}} xmm10 = [2147483648,2147483648] -; SSE2-NEXT: movdqa %xmm1, %xmm5 -; SSE2-NEXT: pxor %xmm10, %xmm5 -; SSE2-NEXT: movdqa {{.*#+}} xmm9 = [2147549183,2147549183] -; SSE2-NEXT: movdqa %xmm9, %xmm6 -; SSE2-NEXT: pcmpgtd %xmm5, %xmm6 -; SSE2-NEXT: pshufd {{.*#+}} xmm7 = xmm6[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm9, %xmm5 -; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm5[1,1,3,3] -; SSE2-NEXT: pand %xmm7, %xmm4 -; SSE2-NEXT: pshufd {{.*#+}} xmm5 = xmm6[1,1,3,3] -; SSE2-NEXT: por %xmm4, %xmm5 -; SSE2-NEXT: pand %xmm5, %xmm1 -; SSE2-NEXT: pandn %xmm8, %xmm5 -; SSE2-NEXT: por %xmm1, %xmm5 -; SSE2-NEXT: movdqa %xmm0, %xmm1 -; SSE2-NEXT: pxor %xmm10, %xmm1 -; SSE2-NEXT: movdqa %xmm9, %xmm4 -; SSE2-NEXT: pcmpgtd %xmm1, %xmm4 -; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm4[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm9, %xmm1 -; SSE2-NEXT: pshufd {{.*#+}} xmm7 = xmm1[1,1,3,3] -; SSE2-NEXT: pand %xmm6, %xmm7 -; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm4[1,1,3,3] -; SSE2-NEXT: por %xmm7, %xmm1 -; SSE2-NEXT: pand %xmm1, %xmm0 +; SSE2-NEXT: movdqa {{.*#+}} xmm11 = [2147483648,2147483648] +; SSE2-NEXT: movdqa %xmm2, %xmm1 +; SSE2-NEXT: pxor %xmm11, %xmm1 +; SSE2-NEXT: movdqa {{.*#+}} xmm10 = [2147549183,2147549183] +; SSE2-NEXT: movdqa %xmm10, %xmm5 +; SSE2-NEXT: pcmpgtd %xmm1, %xmm5 +; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm5[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm10, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm1[1,1,3,3] +; SSE2-NEXT: pand %xmm3, %xmm4 +; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm5[1,1,3,3] +; SSE2-NEXT: por %xmm4, %xmm1 +; SSE2-NEXT: pand %xmm1, %xmm2 ; SSE2-NEXT: pandn %xmm8, %xmm1 -; SSE2-NEXT: por %xmm0, %xmm1 -; SSE2-NEXT: movdqa %xmm3, %xmm0 -; SSE2-NEXT: pxor %xmm10, %xmm0 -; SSE2-NEXT: movdqa %xmm9, %xmm4 -; SSE2-NEXT: pcmpgtd %xmm0, %xmm4 -; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm4[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm9, %xmm0 -; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] 
-; SSE2-NEXT: pand %xmm6, %xmm0 -; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm4[1,1,3,3] -; SSE2-NEXT: por %xmm0, %xmm6 -; SSE2-NEXT: pand %xmm6, %xmm3 -; SSE2-NEXT: pandn %xmm8, %xmm6 -; SSE2-NEXT: por %xmm3, %xmm6 -; SSE2-NEXT: movdqa %xmm2, %xmm0 -; SSE2-NEXT: pxor %xmm10, %xmm0 -; SSE2-NEXT: movdqa %xmm9, %xmm3 -; SSE2-NEXT: pcmpgtd %xmm0, %xmm3 -; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm3[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm9, %xmm0 -; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] -; SSE2-NEXT: pand %xmm4, %xmm0 -; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm3[1,1,3,3] -; SSE2-NEXT: por %xmm0, %xmm3 -; SSE2-NEXT: pand %xmm3, %xmm2 -; SSE2-NEXT: pandn %xmm8, %xmm3 -; SSE2-NEXT: por %xmm2, %xmm3 -; SSE2-NEXT: movdqa %xmm3, %xmm0 -; SSE2-NEXT: pxor %xmm10, %xmm0 -; SSE2-NEXT: movdqa %xmm0, %xmm2 -; SSE2-NEXT: pcmpgtd %xmm10, %xmm2 -; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm2[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm10, %xmm0 -; SSE2-NEXT: pshufd {{.*#+}} xmm7 = xmm0[1,1,3,3] -; SSE2-NEXT: pand %xmm4, %xmm7 -; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm2[1,1,3,3] -; SSE2-NEXT: por %xmm7, %xmm0 -; SSE2-NEXT: pand %xmm3, %xmm0 -; SSE2-NEXT: movdqa %xmm6, %xmm2 -; SSE2-NEXT: pxor %xmm10, %xmm2 -; SSE2-NEXT: movdqa %xmm2, %xmm3 -; SSE2-NEXT: pcmpgtd %xmm10, %xmm3 +; SSE2-NEXT: por %xmm2, %xmm1 +; SSE2-NEXT: movdqa %xmm7, %xmm2 +; SSE2-NEXT: pxor %xmm11, %xmm2 +; SSE2-NEXT: movdqa %xmm10, %xmm3 +; SSE2-NEXT: pcmpgtd %xmm2, %xmm3 ; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm3[0,0,2,2] ; SSE2-NEXT: pcmpeqd %xmm10, %xmm2 -; SSE2-NEXT: pshufd {{.*#+}} xmm7 = xmm2[1,1,3,3] -; SSE2-NEXT: pand %xmm4, %xmm7 +; SSE2-NEXT: pshufd {{.*#+}} xmm5 = xmm2[1,1,3,3] +; SSE2-NEXT: pand %xmm4, %xmm5 ; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm3[1,1,3,3] +; SSE2-NEXT: por %xmm5, %xmm2 +; SSE2-NEXT: pand %xmm2, %xmm7 +; SSE2-NEXT: pandn %xmm8, %xmm2 ; SSE2-NEXT: por %xmm7, %xmm2 -; SSE2-NEXT: pand %xmm6, %xmm2 -; SSE2-NEXT: movdqa %xmm1, %xmm3 -; SSE2-NEXT: pxor %xmm10, %xmm3 -; SSE2-NEXT: movdqa %xmm3, %xmm4 -; SSE2-NEXT: 
pcmpgtd %xmm10, %xmm4 -; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm4[0,0,2,2] +; SSE2-NEXT: movdqa %xmm6, %xmm3 +; SSE2-NEXT: pxor %xmm11, %xmm3 +; SSE2-NEXT: movdqa %xmm10, %xmm4 +; SSE2-NEXT: pcmpgtd %xmm3, %xmm4 +; SSE2-NEXT: pshufd {{.*#+}} xmm5 = xmm4[0,0,2,2] ; SSE2-NEXT: pcmpeqd %xmm10, %xmm3 ; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm3[1,1,3,3] -; SSE2-NEXT: pand %xmm6, %xmm3 -; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm4[1,1,3,3] -; SSE2-NEXT: por %xmm3, %xmm4 -; SSE2-NEXT: pand %xmm1, %xmm4 -; SSE2-NEXT: movdqa %xmm5, %xmm1 -; SSE2-NEXT: pxor %xmm10, %xmm1 -; SSE2-NEXT: movdqa %xmm1, %xmm3 -; SSE2-NEXT: pcmpgtd %xmm10, %xmm3 -; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm3[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm10, %xmm1 -; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] -; SSE2-NEXT: pand %xmm6, %xmm1 +; SSE2-NEXT: pand %xmm5, %xmm3 +; SSE2-NEXT: pshufd {{.*#+}} xmm7 = xmm4[1,1,3,3] +; SSE2-NEXT: por %xmm3, %xmm7 +; SSE2-NEXT: pand %xmm7, %xmm6 +; SSE2-NEXT: pandn %xmm8, %xmm7 +; SSE2-NEXT: por %xmm6, %xmm7 +; SSE2-NEXT: movdqa %xmm9, %xmm3 +; SSE2-NEXT: pxor %xmm11, %xmm3 +; SSE2-NEXT: movdqa %xmm10, %xmm4 +; SSE2-NEXT: pcmpgtd %xmm3, %xmm4 +; SSE2-NEXT: pshufd {{.*#+}} xmm5 = xmm4[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm10, %xmm3 ; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm3[1,1,3,3] -; SSE2-NEXT: por %xmm1, %xmm3 ; SSE2-NEXT: pand %xmm5, %xmm3 -; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm3[0,2,2,3] +; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm4[1,1,3,3] +; SSE2-NEXT: por %xmm3, %xmm4 +; SSE2-NEXT: pand %xmm4, %xmm9 +; SSE2-NEXT: pandn %xmm8, %xmm4 +; SSE2-NEXT: por %xmm9, %xmm4 +; SSE2-NEXT: movdqa %xmm4, %xmm3 +; SSE2-NEXT: pxor %xmm11, %xmm3 +; SSE2-NEXT: movdqa %xmm3, %xmm5 +; SSE2-NEXT: pcmpgtd %xmm11, %xmm5 +; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm5[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm11, %xmm3 +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm3[1,1,3,3] +; SSE2-NEXT: pand %xmm6, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm5[1,1,3,3] +; SSE2-NEXT: por %xmm0, %xmm3 +; SSE2-NEXT: pand %xmm4, %xmm3 +; 
SSE2-NEXT: movdqa %xmm7, %xmm0 +; SSE2-NEXT: pxor %xmm11, %xmm0 +; SSE2-NEXT: movdqa %xmm0, %xmm4 +; SSE2-NEXT: pcmpgtd %xmm11, %xmm4 +; SSE2-NEXT: pshufd {{.*#+}} xmm5 = xmm4[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm11, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] +; SSE2-NEXT: pand %xmm5, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm4[1,1,3,3] +; SSE2-NEXT: por %xmm0, %xmm4 +; SSE2-NEXT: pand %xmm7, %xmm4 +; SSE2-NEXT: movdqa %xmm2, %xmm0 +; SSE2-NEXT: pxor %xmm11, %xmm0 +; SSE2-NEXT: movdqa %xmm0, %xmm5 +; SSE2-NEXT: pcmpgtd %xmm11, %xmm5 +; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm5[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm11, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] +; SSE2-NEXT: pand %xmm6, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm5 = xmm5[1,1,3,3] +; SSE2-NEXT: por %xmm0, %xmm5 +; SSE2-NEXT: pand %xmm2, %xmm5 +; SSE2-NEXT: movdqa %xmm1, %xmm0 +; SSE2-NEXT: pxor %xmm11, %xmm0 +; SSE2-NEXT: movdqa %xmm0, %xmm2 +; SSE2-NEXT: pcmpgtd %xmm11, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm2[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm11, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] +; SSE2-NEXT: pand %xmm6, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] +; SSE2-NEXT: por %xmm0, %xmm2 +; SSE2-NEXT: pand %xmm1, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm2[0,2,2,3] +; SSE2-NEXT: pshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7] +; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm5[0,2,2,3] ; SSE2-NEXT: pshuflw {{.*#+}} xmm1 = xmm1[0,2,2,3,4,5,6,7] -; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm4[0,2,2,3] -; SSE2-NEXT: pshuflw {{.*#+}} xmm3 = xmm3[0,2,2,3,4,5,6,7] -; SSE2-NEXT: punpckldq {{.*#+}} xmm3 = xmm3[0],xmm1[0],xmm3[1],xmm1[1] -; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm2[0,2,2,3] -; SSE2-NEXT: pshuflw {{.*#+}} xmm1 = xmm1[0,1,0,2,4,5,6,7] -; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm0[0,2,2,3] +; SSE2-NEXT: punpckldq {{.*#+}} xmm1 = xmm1[0],xmm0[0],xmm1[1],xmm0[1] +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm4[0,2,2,3] +; SSE2-NEXT: pshuflw {{.*#+}} xmm2 = 
xmm0[0,1,0,2,4,5,6,7] +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm3[0,2,2,3] ; SSE2-NEXT: pshuflw {{.*#+}} xmm0 = xmm0[0,1,0,2,4,5,6,7] -; SSE2-NEXT: punpckldq {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1] -; SSE2-NEXT: movsd {{.*#+}} xmm0 = xmm3[0],xmm0[1] +; SSE2-NEXT: punpckldq {{.*#+}} xmm0 = xmm0[0],xmm2[0],xmm0[1],xmm2[1] +; SSE2-NEXT: movsd {{.*#+}} xmm0 = xmm1[0],xmm0[1] ; SSE2-NEXT: retq ; ; SSSE3-LABEL: trunc_packus_v8i64_v8i16: ; SSSE3: # %bb.0: +; SSSE3-NEXT: movdqa (%rdi), %xmm7 +; SSSE3-NEXT: movdqa 16(%rdi), %xmm2 +; SSSE3-NEXT: movdqa 32(%rdi), %xmm9 +; SSSE3-NEXT: movdqa 48(%rdi), %xmm6 ; SSSE3-NEXT: movdqa {{.*#+}} xmm8 = [65535,65535] -; SSSE3-NEXT: movdqa {{.*#+}} xmm10 = [2147483648,2147483648] -; SSSE3-NEXT: movdqa %xmm1, %xmm5 -; SSSE3-NEXT: pxor %xmm10, %xmm5 -; SSSE3-NEXT: movdqa {{.*#+}} xmm9 = [2147549183,2147549183] -; SSSE3-NEXT: movdqa %xmm9, %xmm6 -; SSSE3-NEXT: pcmpgtd %xmm5, %xmm6 -; SSSE3-NEXT: pshufd {{.*#+}} xmm7 = xmm6[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm9, %xmm5 -; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm5[1,1,3,3] -; SSSE3-NEXT: pand %xmm7, %xmm4 -; SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm6[1,1,3,3] -; SSSE3-NEXT: por %xmm4, %xmm5 -; SSSE3-NEXT: pand %xmm5, %xmm1 -; SSSE3-NEXT: pandn %xmm8, %xmm5 -; SSSE3-NEXT: por %xmm1, %xmm5 -; SSSE3-NEXT: movdqa %xmm0, %xmm1 -; SSSE3-NEXT: pxor %xmm10, %xmm1 -; SSSE3-NEXT: movdqa %xmm9, %xmm4 -; SSSE3-NEXT: pcmpgtd %xmm1, %xmm4 -; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm4[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm9, %xmm1 -; SSSE3-NEXT: pshufd {{.*#+}} xmm7 = xmm1[1,1,3,3] -; SSSE3-NEXT: pand %xmm6, %xmm7 -; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm4[1,1,3,3] -; SSSE3-NEXT: por %xmm7, %xmm1 -; SSSE3-NEXT: pand %xmm1, %xmm0 +; SSSE3-NEXT: movdqa {{.*#+}} xmm11 = [2147483648,2147483648] +; SSSE3-NEXT: movdqa %xmm2, %xmm1 +; SSSE3-NEXT: pxor %xmm11, %xmm1 +; SSSE3-NEXT: movdqa {{.*#+}} xmm10 = [2147549183,2147549183] +; SSSE3-NEXT: movdqa %xmm10, %xmm5 +; SSSE3-NEXT: pcmpgtd %xmm1, %xmm5 +; SSSE3-NEXT: pshufd 
{{.*#+}} xmm3 = xmm5[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm10, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm1[1,1,3,3] +; SSSE3-NEXT: pand %xmm3, %xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm5[1,1,3,3] +; SSSE3-NEXT: por %xmm4, %xmm1 +; SSSE3-NEXT: pand %xmm1, %xmm2 ; SSSE3-NEXT: pandn %xmm8, %xmm1 -; SSSE3-NEXT: por %xmm0, %xmm1 -; SSSE3-NEXT: movdqa %xmm3, %xmm0 -; SSSE3-NEXT: pxor %xmm10, %xmm0 -; SSSE3-NEXT: movdqa %xmm9, %xmm4 -; SSSE3-NEXT: pcmpgtd %xmm0, %xmm4 -; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm4[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm9, %xmm0 -; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] -; SSSE3-NEXT: pand %xmm6, %xmm0 -; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm4[1,1,3,3] -; SSSE3-NEXT: por %xmm0, %xmm6 -; SSSE3-NEXT: pand %xmm6, %xmm3 -; SSSE3-NEXT: pandn %xmm8, %xmm6 -; SSSE3-NEXT: por %xmm3, %xmm6 -; SSSE3-NEXT: movdqa %xmm2, %xmm0 -; SSSE3-NEXT: pxor %xmm10, %xmm0 -; SSSE3-NEXT: movdqa %xmm9, %xmm3 -; SSSE3-NEXT: pcmpgtd %xmm0, %xmm3 -; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm3[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm9, %xmm0 -; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] -; SSSE3-NEXT: pand %xmm4, %xmm0 -; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm3[1,1,3,3] -; SSSE3-NEXT: por %xmm0, %xmm3 -; SSSE3-NEXT: pand %xmm3, %xmm2 -; SSSE3-NEXT: pandn %xmm8, %xmm3 -; SSSE3-NEXT: por %xmm2, %xmm3 -; SSSE3-NEXT: movdqa %xmm3, %xmm0 -; SSSE3-NEXT: pxor %xmm10, %xmm0 -; SSSE3-NEXT: movdqa %xmm0, %xmm2 -; SSSE3-NEXT: pcmpgtd %xmm10, %xmm2 -; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm2[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm10, %xmm0 -; SSSE3-NEXT: pshufd {{.*#+}} xmm7 = xmm0[1,1,3,3] -; SSSE3-NEXT: pand %xmm4, %xmm7 -; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm2[1,1,3,3] -; SSSE3-NEXT: por %xmm7, %xmm0 -; SSSE3-NEXT: pand %xmm3, %xmm0 -; SSSE3-NEXT: movdqa %xmm6, %xmm2 -; SSSE3-NEXT: pxor %xmm10, %xmm2 -; SSSE3-NEXT: movdqa %xmm2, %xmm3 -; SSSE3-NEXT: pcmpgtd %xmm10, %xmm3 +; SSSE3-NEXT: por %xmm2, %xmm1 +; SSSE3-NEXT: movdqa %xmm7, %xmm2 +; SSSE3-NEXT: pxor %xmm11, %xmm2 
+; SSSE3-NEXT: movdqa %xmm10, %xmm3 +; SSSE3-NEXT: pcmpgtd %xmm2, %xmm3 ; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm3[0,0,2,2] ; SSSE3-NEXT: pcmpeqd %xmm10, %xmm2 -; SSSE3-NEXT: pshufd {{.*#+}} xmm7 = xmm2[1,1,3,3] -; SSSE3-NEXT: pand %xmm4, %xmm7 +; SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm2[1,1,3,3] +; SSSE3-NEXT: pand %xmm4, %xmm5 ; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm3[1,1,3,3] +; SSSE3-NEXT: por %xmm5, %xmm2 +; SSSE3-NEXT: pand %xmm2, %xmm7 +; SSSE3-NEXT: pandn %xmm8, %xmm2 ; SSSE3-NEXT: por %xmm7, %xmm2 -; SSSE3-NEXT: pand %xmm6, %xmm2 -; SSSE3-NEXT: movdqa %xmm1, %xmm3 -; SSSE3-NEXT: pxor %xmm10, %xmm3 -; SSSE3-NEXT: movdqa %xmm3, %xmm4 -; SSSE3-NEXT: pcmpgtd %xmm10, %xmm4 -; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm4[0,0,2,2] +; SSSE3-NEXT: movdqa %xmm6, %xmm3 +; SSSE3-NEXT: pxor %xmm11, %xmm3 +; SSSE3-NEXT: movdqa %xmm10, %xmm4 +; SSSE3-NEXT: pcmpgtd %xmm3, %xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm4[0,0,2,2] ; SSSE3-NEXT: pcmpeqd %xmm10, %xmm3 ; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm3[1,1,3,3] -; SSSE3-NEXT: pand %xmm6, %xmm3 -; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm4[1,1,3,3] -; SSSE3-NEXT: por %xmm3, %xmm4 -; SSSE3-NEXT: pand %xmm1, %xmm4 -; SSSE3-NEXT: movdqa %xmm5, %xmm1 -; SSSE3-NEXT: pxor %xmm10, %xmm1 -; SSSE3-NEXT: movdqa %xmm1, %xmm3 -; SSSE3-NEXT: pcmpgtd %xmm10, %xmm3 -; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm3[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm10, %xmm1 -; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] -; SSSE3-NEXT: pand %xmm6, %xmm1 +; SSSE3-NEXT: pand %xmm5, %xmm3 +; SSSE3-NEXT: pshufd {{.*#+}} xmm7 = xmm4[1,1,3,3] +; SSSE3-NEXT: por %xmm3, %xmm7 +; SSSE3-NEXT: pand %xmm7, %xmm6 +; SSSE3-NEXT: pandn %xmm8, %xmm7 +; SSSE3-NEXT: por %xmm6, %xmm7 +; SSSE3-NEXT: movdqa %xmm9, %xmm3 +; SSSE3-NEXT: pxor %xmm11, %xmm3 +; SSSE3-NEXT: movdqa %xmm10, %xmm4 +; SSSE3-NEXT: pcmpgtd %xmm3, %xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm4[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm10, %xmm3 ; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm3[1,1,3,3] -; SSSE3-NEXT: por 
%xmm1, %xmm3 ; SSSE3-NEXT: pand %xmm5, %xmm3 -; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm3[0,2,2,3] +; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm4[1,1,3,3] +; SSSE3-NEXT: por %xmm3, %xmm4 +; SSSE3-NEXT: pand %xmm4, %xmm9 +; SSSE3-NEXT: pandn %xmm8, %xmm4 +; SSSE3-NEXT: por %xmm9, %xmm4 +; SSSE3-NEXT: movdqa %xmm4, %xmm3 +; SSSE3-NEXT: pxor %xmm11, %xmm3 +; SSSE3-NEXT: movdqa %xmm3, %xmm5 +; SSSE3-NEXT: pcmpgtd %xmm11, %xmm5 +; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm5[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm11, %xmm3 +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm3[1,1,3,3] +; SSSE3-NEXT: pand %xmm6, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm5[1,1,3,3] +; SSSE3-NEXT: por %xmm0, %xmm3 +; SSSE3-NEXT: pand %xmm4, %xmm3 +; SSSE3-NEXT: movdqa %xmm7, %xmm0 +; SSSE3-NEXT: pxor %xmm11, %xmm0 +; SSSE3-NEXT: movdqa %xmm0, %xmm4 +; SSSE3-NEXT: pcmpgtd %xmm11, %xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm4[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm11, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] +; SSSE3-NEXT: pand %xmm5, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm4[1,1,3,3] +; SSSE3-NEXT: por %xmm0, %xmm4 +; SSSE3-NEXT: pand %xmm7, %xmm4 +; SSSE3-NEXT: movdqa %xmm2, %xmm0 +; SSSE3-NEXT: pxor %xmm11, %xmm0 +; SSSE3-NEXT: movdqa %xmm0, %xmm5 +; SSSE3-NEXT: pcmpgtd %xmm11, %xmm5 +; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm5[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm11, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] +; SSSE3-NEXT: pand %xmm6, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm5[1,1,3,3] +; SSSE3-NEXT: por %xmm0, %xmm5 +; SSSE3-NEXT: pand %xmm2, %xmm5 +; SSSE3-NEXT: movdqa %xmm1, %xmm0 +; SSSE3-NEXT: pxor %xmm11, %xmm0 +; SSSE3-NEXT: movdqa %xmm0, %xmm2 +; SSSE3-NEXT: pcmpgtd %xmm11, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm2[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm11, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] +; SSSE3-NEXT: pand %xmm6, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] +; SSSE3-NEXT: por %xmm0, %xmm2 +; SSSE3-NEXT: pand 
%xmm1, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm2[0,2,2,3] +; SSSE3-NEXT: pshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7] +; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm5[0,2,2,3] ; SSSE3-NEXT: pshuflw {{.*#+}} xmm1 = xmm1[0,2,2,3,4,5,6,7] -; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm4[0,2,2,3] -; SSSE3-NEXT: pshuflw {{.*#+}} xmm3 = xmm3[0,2,2,3,4,5,6,7] -; SSSE3-NEXT: punpckldq {{.*#+}} xmm3 = xmm3[0],xmm1[0],xmm3[1],xmm1[1] -; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm2[0,2,2,3] -; SSSE3-NEXT: pshuflw {{.*#+}} xmm1 = xmm1[0,1,0,2,4,5,6,7] -; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm0[0,2,2,3] +; SSSE3-NEXT: punpckldq {{.*#+}} xmm1 = xmm1[0],xmm0[0],xmm1[1],xmm0[1] +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm4[0,2,2,3] +; SSSE3-NEXT: pshuflw {{.*#+}} xmm2 = xmm0[0,1,0,2,4,5,6,7] +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm3[0,2,2,3] ; SSSE3-NEXT: pshuflw {{.*#+}} xmm0 = xmm0[0,1,0,2,4,5,6,7] -; SSSE3-NEXT: punpckldq {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1] -; SSSE3-NEXT: movsd {{.*#+}} xmm0 = xmm3[0],xmm0[1] +; SSSE3-NEXT: punpckldq {{.*#+}} xmm0 = xmm0[0],xmm2[0],xmm0[1],xmm2[1] +; SSSE3-NEXT: movsd {{.*#+}} xmm0 = xmm1[0],xmm0[1] ; SSSE3-NEXT: retq ; ; SSE41-LABEL: trunc_packus_v8i64_v8i16: ; SSE41: # %bb.0: -; SSE41-NEXT: movdqa %xmm0, %xmm9 -; SSE41-NEXT: movapd {{.*#+}} xmm7 = [65535,65535] -; SSE41-NEXT: movdqa {{.*#+}} xmm10 = [2147483648,2147483648] -; SSE41-NEXT: movdqa %xmm2, %xmm0 -; SSE41-NEXT: pxor %xmm10, %xmm0 +; SSE41-NEXT: movdqa (%rdi), %xmm10 +; SSE41-NEXT: movdqa 16(%rdi), %xmm9 +; SSE41-NEXT: movdqa 32(%rdi), %xmm3 +; SSE41-NEXT: movdqa 48(%rdi), %xmm5 +; SSE41-NEXT: movapd {{.*#+}} xmm1 = [65535,65535] +; SSE41-NEXT: movdqa {{.*#+}} xmm2 = [2147483648,2147483648] +; SSE41-NEXT: movdqa %xmm3, %xmm0 +; SSE41-NEXT: pxor %xmm2, %xmm0 ; SSE41-NEXT: movdqa {{.*#+}} xmm4 = [2147549183,2147549183] -; SSE41-NEXT: movdqa %xmm4, %xmm5 -; SSE41-NEXT: pcmpeqd %xmm0, %xmm5 +; SSE41-NEXT: movdqa %xmm4, %xmm7 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm7 ; SSE41-NEXT: movdqa 
%xmm4, %xmm6 ; SSE41-NEXT: pcmpgtd %xmm0, %xmm6 ; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm6[0,0,2,2] -; SSE41-NEXT: pand %xmm5, %xmm0 +; SSE41-NEXT: pand %xmm7, %xmm0 ; SSE41-NEXT: por %xmm6, %xmm0 -; SSE41-NEXT: movapd %xmm7, %xmm8 -; SSE41-NEXT: blendvpd %xmm0, %xmm2, %xmm8 -; SSE41-NEXT: movdqa %xmm3, %xmm0 -; SSE41-NEXT: pxor %xmm10, %xmm0 -; SSE41-NEXT: movdqa %xmm4, %xmm2 -; SSE41-NEXT: pcmpeqd %xmm0, %xmm2 -; SSE41-NEXT: movdqa %xmm4, %xmm5 -; SSE41-NEXT: pcmpgtd %xmm0, %xmm5 -; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm5[0,0,2,2] -; SSE41-NEXT: pand %xmm2, %xmm0 -; SSE41-NEXT: por %xmm5, %xmm0 -; SSE41-NEXT: movapd %xmm7, %xmm2 -; SSE41-NEXT: blendvpd %xmm0, %xmm3, %xmm2 -; SSE41-NEXT: movdqa %xmm9, %xmm0 -; SSE41-NEXT: pxor %xmm10, %xmm0 +; SSE41-NEXT: movapd %xmm1, %xmm8 +; SSE41-NEXT: blendvpd %xmm0, %xmm3, %xmm8 +; SSE41-NEXT: movdqa %xmm5, %xmm0 +; SSE41-NEXT: pxor %xmm2, %xmm0 +; SSE41-NEXT: movdqa %xmm4, %xmm3 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm3 +; SSE41-NEXT: movdqa %xmm4, %xmm6 +; SSE41-NEXT: pcmpgtd %xmm0, %xmm6 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm6[0,0,2,2] +; SSE41-NEXT: pand %xmm3, %xmm0 +; SSE41-NEXT: por %xmm6, %xmm0 +; SSE41-NEXT: movapd %xmm1, %xmm6 +; SSE41-NEXT: blendvpd %xmm0, %xmm5, %xmm6 +; SSE41-NEXT: movdqa %xmm10, %xmm0 +; SSE41-NEXT: pxor %xmm2, %xmm0 ; SSE41-NEXT: movdqa %xmm4, %xmm3 ; SSE41-NEXT: pcmpeqd %xmm0, %xmm3 ; SSE41-NEXT: movdqa %xmm4, %xmm5 @@ -1468,93 +1533,96 @@ define <8 x i16> @trunc_packus_v8i64_v8i ; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm5[0,0,2,2] ; SSE41-NEXT: pand %xmm3, %xmm0 ; SSE41-NEXT: por %xmm5, %xmm0 -; SSE41-NEXT: movapd %xmm7, %xmm6 -; SSE41-NEXT: blendvpd %xmm0, %xmm9, %xmm6 -; SSE41-NEXT: movdqa %xmm1, %xmm0 -; SSE41-NEXT: pxor %xmm10, %xmm0 -; SSE41-NEXT: movdqa %xmm4, %xmm3 -; SSE41-NEXT: pcmpeqd %xmm0, %xmm3 +; SSE41-NEXT: movapd %xmm1, %xmm3 +; SSE41-NEXT: blendvpd %xmm0, %xmm10, %xmm3 +; SSE41-NEXT: movdqa %xmm9, %xmm0 +; SSE41-NEXT: pxor %xmm2, %xmm0 +; SSE41-NEXT: movdqa %xmm4, %xmm5 +; 
SSE41-NEXT: pcmpeqd %xmm0, %xmm5 ; SSE41-NEXT: pcmpgtd %xmm0, %xmm4 ; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm4[0,0,2,2] -; SSE41-NEXT: pand %xmm3, %xmm0 +; SSE41-NEXT: pand %xmm5, %xmm0 ; SSE41-NEXT: por %xmm4, %xmm0 -; SSE41-NEXT: blendvpd %xmm0, %xmm1, %xmm7 -; SSE41-NEXT: pxor %xmm3, %xmm3 -; SSE41-NEXT: movapd %xmm7, %xmm1 -; SSE41-NEXT: xorpd %xmm10, %xmm1 +; SSE41-NEXT: blendvpd %xmm0, %xmm9, %xmm1 +; SSE41-NEXT: pxor %xmm5, %xmm5 ; SSE41-NEXT: movapd %xmm1, %xmm4 -; SSE41-NEXT: pcmpeqd %xmm10, %xmm4 -; SSE41-NEXT: pcmpgtd %xmm10, %xmm1 -; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm1[0,0,2,2] -; SSE41-NEXT: pand %xmm4, %xmm0 -; SSE41-NEXT: por %xmm1, %xmm0 +; SSE41-NEXT: xorpd %xmm2, %xmm4 +; SSE41-NEXT: movapd %xmm4, %xmm7 +; SSE41-NEXT: pcmpeqd %xmm2, %xmm7 +; SSE41-NEXT: pcmpgtd %xmm2, %xmm4 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm4[0,0,2,2] +; SSE41-NEXT: pand %xmm7, %xmm0 +; SSE41-NEXT: por %xmm4, %xmm0 ; SSE41-NEXT: pxor %xmm4, %xmm4 -; SSE41-NEXT: blendvpd %xmm0, %xmm7, %xmm4 -; SSE41-NEXT: movapd %xmm6, %xmm1 -; SSE41-NEXT: xorpd %xmm10, %xmm1 -; SSE41-NEXT: movapd %xmm1, %xmm5 -; SSE41-NEXT: pcmpeqd %xmm10, %xmm5 -; SSE41-NEXT: pcmpgtd %xmm10, %xmm1 +; SSE41-NEXT: blendvpd %xmm0, %xmm1, %xmm4 +; SSE41-NEXT: movapd %xmm3, %xmm1 +; SSE41-NEXT: xorpd %xmm2, %xmm1 +; SSE41-NEXT: movapd %xmm1, %xmm7 +; SSE41-NEXT: pcmpeqd %xmm2, %xmm7 +; SSE41-NEXT: pcmpgtd %xmm2, %xmm1 ; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm1[0,0,2,2] -; SSE41-NEXT: pand %xmm5, %xmm0 +; SSE41-NEXT: pand %xmm7, %xmm0 ; SSE41-NEXT: por %xmm1, %xmm0 ; SSE41-NEXT: pxor %xmm1, %xmm1 -; SSE41-NEXT: blendvpd %xmm0, %xmm6, %xmm1 +; SSE41-NEXT: blendvpd %xmm0, %xmm3, %xmm1 ; SSE41-NEXT: packusdw %xmm4, %xmm1 -; SSE41-NEXT: movapd %xmm2, %xmm4 -; SSE41-NEXT: xorpd %xmm10, %xmm4 -; SSE41-NEXT: movapd %xmm4, %xmm5 -; SSE41-NEXT: pcmpeqd %xmm10, %xmm5 -; SSE41-NEXT: pcmpgtd %xmm10, %xmm4 +; SSE41-NEXT: movapd %xmm6, %xmm3 +; SSE41-NEXT: xorpd %xmm2, %xmm3 +; SSE41-NEXT: movapd %xmm3, %xmm4 +; SSE41-NEXT: 
pcmpeqd %xmm2, %xmm4 +; SSE41-NEXT: pcmpgtd %xmm2, %xmm3 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm3[0,0,2,2] +; SSE41-NEXT: pand %xmm4, %xmm0 +; SSE41-NEXT: por %xmm3, %xmm0 +; SSE41-NEXT: pxor %xmm3, %xmm3 +; SSE41-NEXT: blendvpd %xmm0, %xmm6, %xmm3 +; SSE41-NEXT: movapd %xmm8, %xmm4 +; SSE41-NEXT: xorpd %xmm2, %xmm4 +; SSE41-NEXT: movapd %xmm4, %xmm6 +; SSE41-NEXT: pcmpeqd %xmm2, %xmm6 +; SSE41-NEXT: pcmpgtd %xmm2, %xmm4 ; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm4[0,0,2,2] -; SSE41-NEXT: pand %xmm5, %xmm0 +; SSE41-NEXT: pand %xmm6, %xmm0 ; SSE41-NEXT: por %xmm4, %xmm0 -; SSE41-NEXT: pxor %xmm4, %xmm4 -; SSE41-NEXT: blendvpd %xmm0, %xmm2, %xmm4 -; SSE41-NEXT: movapd %xmm8, %xmm2 -; SSE41-NEXT: xorpd %xmm10, %xmm2 -; SSE41-NEXT: movapd %xmm2, %xmm5 -; SSE41-NEXT: pcmpeqd %xmm10, %xmm5 -; SSE41-NEXT: pcmpgtd %xmm10, %xmm2 -; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm2[0,0,2,2] -; SSE41-NEXT: pand %xmm5, %xmm0 -; SSE41-NEXT: por %xmm2, %xmm0 -; SSE41-NEXT: blendvpd %xmm0, %xmm8, %xmm3 -; SSE41-NEXT: packusdw %xmm4, %xmm3 -; SSE41-NEXT: packusdw %xmm3, %xmm1 +; SSE41-NEXT: blendvpd %xmm0, %xmm8, %xmm5 +; SSE41-NEXT: packusdw %xmm3, %xmm5 +; SSE41-NEXT: packusdw %xmm5, %xmm1 ; SSE41-NEXT: movdqa %xmm1, %xmm0 ; SSE41-NEXT: retq ; ; AVX1-LABEL: trunc_packus_v8i64_v8i16: ; AVX1: # %bb.0: -; AVX1-NEXT: vextractf128 $1, %ymm1, %xmm2 -; AVX1-NEXT: vmovdqa {{.*#+}} xmm3 = [65535,65535] -; AVX1-NEXT: vpcmpgtq %xmm2, %xmm3, %xmm8 -; AVX1-NEXT: vpcmpgtq %xmm1, %xmm3, %xmm5 -; AVX1-NEXT: vextractf128 $1, %ymm0, %xmm6 -; AVX1-NEXT: vpcmpgtq %xmm6, %xmm3, %xmm7 -; AVX1-NEXT: vpcmpgtq %xmm0, %xmm3, %xmm4 -; AVX1-NEXT: vblendvpd %xmm4, %xmm0, %xmm3, %xmm0 -; AVX1-NEXT: vpxor %xmm4, %xmm4, %xmm4 -; AVX1-NEXT: vpcmpgtq %xmm4, %xmm0, %xmm9 -; AVX1-NEXT: vblendvpd %xmm7, %xmm6, %xmm3, %xmm6 -; AVX1-NEXT: vpcmpgtq %xmm4, %xmm6, %xmm7 -; AVX1-NEXT: vblendvpd %xmm5, %xmm1, %xmm3, %xmm1 -; AVX1-NEXT: vpcmpgtq %xmm4, %xmm1, %xmm5 -; AVX1-NEXT: vblendvpd %xmm8, %xmm2, %xmm3, %xmm2 -; AVX1-NEXT: 
vpcmpgtq %xmm4, %xmm2, %xmm3 -; AVX1-NEXT: vpand %xmm2, %xmm3, %xmm2 -; AVX1-NEXT: vpand %xmm1, %xmm5, %xmm1 -; AVX1-NEXT: vpackusdw %xmm2, %xmm1, %xmm1 -; AVX1-NEXT: vpand %xmm6, %xmm7, %xmm2 +; AVX1-NEXT: vmovdqa (%rdi), %xmm0 +; AVX1-NEXT: vmovdqa 16(%rdi), %xmm1 +; AVX1-NEXT: vmovdqa 32(%rdi), %xmm2 +; AVX1-NEXT: vmovdqa 48(%rdi), %xmm3 +; AVX1-NEXT: vmovdqa {{.*#+}} xmm4 = [65535,65535] +; AVX1-NEXT: vpcmpgtq %xmm3, %xmm4, %xmm8 +; AVX1-NEXT: vpcmpgtq %xmm2, %xmm4, %xmm6 +; AVX1-NEXT: vpcmpgtq %xmm1, %xmm4, %xmm7 +; AVX1-NEXT: vpcmpgtq %xmm0, %xmm4, %xmm5 +; AVX1-NEXT: vblendvpd %xmm5, %xmm0, %xmm4, %xmm0 +; AVX1-NEXT: vpxor %xmm5, %xmm5, %xmm5 +; AVX1-NEXT: vpcmpgtq %xmm5, %xmm0, %xmm9 +; AVX1-NEXT: vblendvpd %xmm7, %xmm1, %xmm4, %xmm1 +; AVX1-NEXT: vpcmpgtq %xmm5, %xmm1, %xmm7 +; AVX1-NEXT: vblendvpd %xmm6, %xmm2, %xmm4, %xmm2 +; AVX1-NEXT: vpcmpgtq %xmm5, %xmm2, %xmm6 +; AVX1-NEXT: vblendvpd %xmm8, %xmm3, %xmm4, %xmm3 +; AVX1-NEXT: vpcmpgtq %xmm5, %xmm3, %xmm4 +; AVX1-NEXT: vpand %xmm3, %xmm4, %xmm3 +; AVX1-NEXT: vpand %xmm2, %xmm6, %xmm2 +; AVX1-NEXT: vpackusdw %xmm3, %xmm2, %xmm2 +; AVX1-NEXT: vpand %xmm1, %xmm7, %xmm1 ; AVX1-NEXT: vpand %xmm0, %xmm9, %xmm0 -; AVX1-NEXT: vpackusdw %xmm2, %xmm0, %xmm0 ; AVX1-NEXT: vpackusdw %xmm1, %xmm0, %xmm0 -; AVX1-NEXT: vzeroupper +; AVX1-NEXT: vpackusdw %xmm2, %xmm0, %xmm0 ; AVX1-NEXT: retq ; ; AVX2-LABEL: trunc_packus_v8i64_v8i16: ; AVX2: # %bb.0: +; AVX2-NEXT: vmovdqa (%rdi), %ymm0 +; AVX2-NEXT: vmovdqa 32(%rdi), %ymm1 ; AVX2-NEXT: vpbroadcastq {{.*#+}} ymm2 = [65535,65535,65535,65535] ; AVX2-NEXT: vpcmpgtq %ymm0, %ymm2, %ymm3 ; AVX2-NEXT: vblendvpd %ymm3, %ymm0, %ymm2, %ymm0 @@ -1574,11 +1642,23 @@ define <8 x i16> @trunc_packus_v8i64_v8i ; ; AVX512-LABEL: trunc_packus_v8i64_v8i16: ; AVX512: # %bb.0: -; AVX512-NEXT: vpxor %xmm1, %xmm1, %xmm1 -; AVX512-NEXT: vpmaxsq %zmm1, %zmm0, %zmm0 +; AVX512-NEXT: vpxor %xmm0, %xmm0, %xmm0 +; AVX512-NEXT: vpmaxsq (%rdi), %zmm0, %zmm0 ; AVX512-NEXT: vpmovusqw %zmm0, %xmm0 ; 
AVX512-NEXT: vzeroupper ; AVX512-NEXT: retq +; +; SKX-LABEL: trunc_packus_v8i64_v8i16: +; SKX: # %bb.0: +; SKX-NEXT: vpxor %xmm0, %xmm0, %xmm0 +; SKX-NEXT: vpmaxsq 32(%rdi), %ymm0, %ymm1 +; SKX-NEXT: vpmovusqw %ymm1, %xmm1 +; SKX-NEXT: vpmaxsq (%rdi), %ymm0, %ymm0 +; SKX-NEXT: vpmovusqw %ymm0, %xmm0 +; SKX-NEXT: vpunpcklqdq {{.*#+}} xmm0 = xmm0[0],xmm1[0] +; SKX-NEXT: vzeroupper +; SKX-NEXT: retq + %a0 = load <8 x i64>, <8 x i64>* %p0 %1 = icmp slt <8 x i64> %a0, %2 = select <8 x i1> %1, <8 x i64> %a0, <8 x i64> %3 = icmp sgt <8 x i64> %2, zeroinitializer @@ -1655,6 +1735,14 @@ define <4 x i16> @trunc_packus_v4i32_v4i ; AVX512BWVL-NEXT: vpmaxsd %xmm1, %xmm0, %xmm0 ; AVX512BWVL-NEXT: vpackusdw %xmm0, %xmm0, %xmm0 ; AVX512BWVL-NEXT: retq +; +; SKX-LABEL: trunc_packus_v4i32_v4i16: +; SKX: # %bb.0: +; SKX-NEXT: vpminsd {{.*}}(%rip){1to4}, %xmm0, %xmm0 +; SKX-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; SKX-NEXT: vpmaxsd %xmm1, %xmm0, %xmm0 +; SKX-NEXT: vpackusdw %xmm0, %xmm0, %xmm0 +; SKX-NEXT: retq %1 = icmp slt <4 x i32> %a0, %2 = select <4 x i1> %1, <4 x i32> %a0, <4 x i32> %3 = icmp sgt <4 x i32> %2, zeroinitializer @@ -1735,6 +1823,13 @@ define void @trunc_packus_v4i32_v4i16_st ; AVX512BWVL-NEXT: vpmaxsd %xmm1, %xmm0, %xmm0 ; AVX512BWVL-NEXT: vpmovusdw %xmm0, (%rdi) ; AVX512BWVL-NEXT: retq +; +; SKX-LABEL: trunc_packus_v4i32_v4i16_store: +; SKX: # %bb.0: +; SKX-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; SKX-NEXT: vpmaxsd %xmm1, %xmm0, %xmm0 +; SKX-NEXT: vpmovusdw %xmm0, (%rdi) +; SKX-NEXT: retq %1 = icmp slt <4 x i32> %a0, %2 = select <4 x i1> %1, <4 x i32> %a0, <4 x i32> %3 = icmp sgt <4 x i32> %2, zeroinitializer @@ -1846,6 +1941,14 @@ define <8 x i16> @trunc_packus_v8i32_v8i ; AVX512BWVL-NEXT: vpmovusdw %ymm0, %xmm0 ; AVX512BWVL-NEXT: vzeroupper ; AVX512BWVL-NEXT: retq +; +; SKX-LABEL: trunc_packus_v8i32_v8i16: +; SKX: # %bb.0: +; SKX-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; SKX-NEXT: vpmaxsd %ymm1, %ymm0, %ymm0 +; SKX-NEXT: vpmovusdw %ymm0, %xmm0 +; SKX-NEXT: vzeroupper +; SKX-NEXT: 
retq %1 = icmp slt <8 x i32> %a0, %2 = select <8 x i1> %1, <8 x i32> %a0, <8 x i32> %3 = icmp sgt <8 x i32> %2, zeroinitializer @@ -1854,131 +1957,154 @@ define <8 x i16> @trunc_packus_v8i32_v8i ret <8 x i16> %5 } -define <16 x i16> @trunc_packus_v16i32_v16i16(<16 x i32> %a0) { +define <16 x i16> @trunc_packus_v16i32_v16i16(<16 x i32>* %p0) "min-legal-vector-width"="256" { ; SSE2-LABEL: trunc_packus_v16i32_v16i16: ; SSE2: # %bb.0: -; SSE2-NEXT: movdqa {{.*#+}} xmm6 = [65535,65535,65535,65535] -; SSE2-NEXT: movdqa %xmm6, %xmm4 -; SSE2-NEXT: pcmpgtd %xmm1, %xmm4 -; SSE2-NEXT: pand %xmm4, %xmm1 -; SSE2-NEXT: pandn %xmm6, %xmm4 -; SSE2-NEXT: por %xmm1, %xmm4 -; SSE2-NEXT: movdqa %xmm6, %xmm5 -; SSE2-NEXT: pcmpgtd %xmm0, %xmm5 -; SSE2-NEXT: pand %xmm5, %xmm0 -; SSE2-NEXT: pandn %xmm6, %xmm5 -; SSE2-NEXT: por %xmm0, %xmm5 -; SSE2-NEXT: movdqa %xmm6, %xmm0 -; SSE2-NEXT: pcmpgtd %xmm3, %xmm0 -; SSE2-NEXT: pand %xmm0, %xmm3 -; SSE2-NEXT: pandn %xmm6, %xmm0 -; SSE2-NEXT: por %xmm3, %xmm0 -; SSE2-NEXT: movdqa %xmm6, %xmm3 -; SSE2-NEXT: pcmpgtd %xmm2, %xmm3 -; SSE2-NEXT: pand %xmm3, %xmm2 -; SSE2-NEXT: pandn %xmm6, %xmm3 -; SSE2-NEXT: por %xmm2, %xmm3 -; SSE2-NEXT: pxor %xmm2, %xmm2 -; SSE2-NEXT: movdqa %xmm3, %xmm1 -; SSE2-NEXT: pcmpgtd %xmm2, %xmm1 +; SSE2-NEXT: movdqa (%rdi), %xmm1 +; SSE2-NEXT: movdqa 16(%rdi), %xmm3 +; SSE2-NEXT: movdqa 32(%rdi), %xmm0 +; SSE2-NEXT: movdqa 48(%rdi), %xmm4 +; SSE2-NEXT: movdqa {{.*#+}} xmm5 = [65535,65535,65535,65535] +; SSE2-NEXT: movdqa %xmm5, %xmm2 +; SSE2-NEXT: pcmpgtd %xmm3, %xmm2 +; SSE2-NEXT: pand %xmm2, %xmm3 +; SSE2-NEXT: pandn %xmm5, %xmm2 +; SSE2-NEXT: por %xmm3, %xmm2 +; SSE2-NEXT: movdqa %xmm5, %xmm3 +; SSE2-NEXT: pcmpgtd %xmm1, %xmm3 ; SSE2-NEXT: pand %xmm3, %xmm1 -; SSE2-NEXT: movdqa %xmm0, %xmm3 -; SSE2-NEXT: pcmpgtd %xmm2, %xmm3 -; SSE2-NEXT: pand %xmm0, %xmm3 -; SSE2-NEXT: movdqa %xmm5, %xmm0 -; SSE2-NEXT: pcmpgtd %xmm2, %xmm0 -; SSE2-NEXT: pand %xmm5, %xmm0 -; SSE2-NEXT: movdqa %xmm4, %xmm5 -; SSE2-NEXT: pcmpgtd %xmm2, 
%xmm5 -; SSE2-NEXT: pand %xmm4, %xmm5 -; SSE2-NEXT: pslld $16, %xmm5 -; SSE2-NEXT: psrad $16, %xmm5 -; SSE2-NEXT: pslld $16, %xmm0 -; SSE2-NEXT: psrad $16, %xmm0 -; SSE2-NEXT: packssdw %xmm5, %xmm0 +; SSE2-NEXT: pandn %xmm5, %xmm3 +; SSE2-NEXT: por %xmm1, %xmm3 +; SSE2-NEXT: movdqa %xmm5, %xmm6 +; SSE2-NEXT: pcmpgtd %xmm4, %xmm6 +; SSE2-NEXT: pand %xmm6, %xmm4 +; SSE2-NEXT: pandn %xmm5, %xmm6 +; SSE2-NEXT: por %xmm4, %xmm6 +; SSE2-NEXT: movdqa %xmm5, %xmm4 +; SSE2-NEXT: pcmpgtd %xmm0, %xmm4 +; SSE2-NEXT: pand %xmm4, %xmm0 +; SSE2-NEXT: pandn %xmm5, %xmm4 +; SSE2-NEXT: por %xmm0, %xmm4 +; SSE2-NEXT: pxor %xmm5, %xmm5 +; SSE2-NEXT: movdqa %xmm4, %xmm1 +; SSE2-NEXT: pcmpgtd %xmm5, %xmm1 +; SSE2-NEXT: pand %xmm4, %xmm1 +; SSE2-NEXT: movdqa %xmm6, %xmm4 +; SSE2-NEXT: pcmpgtd %xmm5, %xmm4 +; SSE2-NEXT: pand %xmm6, %xmm4 +; SSE2-NEXT: movdqa %xmm3, %xmm0 +; SSE2-NEXT: pcmpgtd %xmm5, %xmm0 +; SSE2-NEXT: pand %xmm3, %xmm0 +; SSE2-NEXT: movdqa %xmm2, %xmm3 +; SSE2-NEXT: pcmpgtd %xmm5, %xmm3 +; SSE2-NEXT: pand %xmm2, %xmm3 ; SSE2-NEXT: pslld $16, %xmm3 ; SSE2-NEXT: psrad $16, %xmm3 +; SSE2-NEXT: pslld $16, %xmm0 +; SSE2-NEXT: psrad $16, %xmm0 +; SSE2-NEXT: packssdw %xmm3, %xmm0 +; SSE2-NEXT: pslld $16, %xmm4 +; SSE2-NEXT: psrad $16, %xmm4 ; SSE2-NEXT: pslld $16, %xmm1 ; SSE2-NEXT: psrad $16, %xmm1 -; SSE2-NEXT: packssdw %xmm3, %xmm1 +; SSE2-NEXT: packssdw %xmm4, %xmm1 ; SSE2-NEXT: retq ; ; SSSE3-LABEL: trunc_packus_v16i32_v16i16: ; SSSE3: # %bb.0: -; SSSE3-NEXT: movdqa {{.*#+}} xmm6 = [65535,65535,65535,65535] -; SSSE3-NEXT: movdqa %xmm6, %xmm4 -; SSSE3-NEXT: pcmpgtd %xmm1, %xmm4 -; SSSE3-NEXT: pand %xmm4, %xmm1 -; SSSE3-NEXT: pandn %xmm6, %xmm4 -; SSSE3-NEXT: por %xmm1, %xmm4 -; SSSE3-NEXT: movdqa %xmm6, %xmm5 -; SSSE3-NEXT: pcmpgtd %xmm0, %xmm5 -; SSSE3-NEXT: pand %xmm5, %xmm0 -; SSSE3-NEXT: pandn %xmm6, %xmm5 -; SSSE3-NEXT: por %xmm0, %xmm5 -; SSSE3-NEXT: movdqa %xmm6, %xmm0 -; SSSE3-NEXT: pcmpgtd %xmm3, %xmm0 -; SSSE3-NEXT: pand %xmm0, %xmm3 -; SSSE3-NEXT: pandn %xmm6, 
%xmm0 -; SSSE3-NEXT: por %xmm3, %xmm0 -; SSSE3-NEXT: movdqa %xmm6, %xmm3 -; SSSE3-NEXT: pcmpgtd %xmm2, %xmm3 -; SSSE3-NEXT: pand %xmm3, %xmm2 -; SSSE3-NEXT: pandn %xmm6, %xmm3 -; SSSE3-NEXT: por %xmm2, %xmm3 -; SSSE3-NEXT: pxor %xmm2, %xmm2 -; SSSE3-NEXT: movdqa %xmm3, %xmm1 -; SSSE3-NEXT: pcmpgtd %xmm2, %xmm1 +; SSSE3-NEXT: movdqa (%rdi), %xmm1 +; SSSE3-NEXT: movdqa 16(%rdi), %xmm3 +; SSSE3-NEXT: movdqa 32(%rdi), %xmm0 +; SSSE3-NEXT: movdqa 48(%rdi), %xmm4 +; SSSE3-NEXT: movdqa {{.*#+}} xmm5 = [65535,65535,65535,65535] +; SSSE3-NEXT: movdqa %xmm5, %xmm2 +; SSSE3-NEXT: pcmpgtd %xmm3, %xmm2 +; SSSE3-NEXT: pand %xmm2, %xmm3 +; SSSE3-NEXT: pandn %xmm5, %xmm2 +; SSSE3-NEXT: por %xmm3, %xmm2 +; SSSE3-NEXT: movdqa %xmm5, %xmm3 +; SSSE3-NEXT: pcmpgtd %xmm1, %xmm3 ; SSSE3-NEXT: pand %xmm3, %xmm1 -; SSSE3-NEXT: movdqa %xmm0, %xmm3 -; SSSE3-NEXT: pcmpgtd %xmm2, %xmm3 -; SSSE3-NEXT: pand %xmm0, %xmm3 -; SSSE3-NEXT: movdqa %xmm5, %xmm0 -; SSSE3-NEXT: pcmpgtd %xmm2, %xmm0 -; SSSE3-NEXT: pand %xmm5, %xmm0 -; SSSE3-NEXT: movdqa %xmm4, %xmm5 -; SSSE3-NEXT: pcmpgtd %xmm2, %xmm5 -; SSSE3-NEXT: pand %xmm4, %xmm5 -; SSSE3-NEXT: pslld $16, %xmm5 -; SSSE3-NEXT: psrad $16, %xmm5 -; SSSE3-NEXT: pslld $16, %xmm0 -; SSSE3-NEXT: psrad $16, %xmm0 -; SSSE3-NEXT: packssdw %xmm5, %xmm0 +; SSSE3-NEXT: pandn %xmm5, %xmm3 +; SSSE3-NEXT: por %xmm1, %xmm3 +; SSSE3-NEXT: movdqa %xmm5, %xmm6 +; SSSE3-NEXT: pcmpgtd %xmm4, %xmm6 +; SSSE3-NEXT: pand %xmm6, %xmm4 +; SSSE3-NEXT: pandn %xmm5, %xmm6 +; SSSE3-NEXT: por %xmm4, %xmm6 +; SSSE3-NEXT: movdqa %xmm5, %xmm4 +; SSSE3-NEXT: pcmpgtd %xmm0, %xmm4 +; SSSE3-NEXT: pand %xmm4, %xmm0 +; SSSE3-NEXT: pandn %xmm5, %xmm4 +; SSSE3-NEXT: por %xmm0, %xmm4 +; SSSE3-NEXT: pxor %xmm5, %xmm5 +; SSSE3-NEXT: movdqa %xmm4, %xmm1 +; SSSE3-NEXT: pcmpgtd %xmm5, %xmm1 +; SSSE3-NEXT: pand %xmm4, %xmm1 +; SSSE3-NEXT: movdqa %xmm6, %xmm4 +; SSSE3-NEXT: pcmpgtd %xmm5, %xmm4 +; SSSE3-NEXT: pand %xmm6, %xmm4 +; SSSE3-NEXT: movdqa %xmm3, %xmm0 +; SSSE3-NEXT: pcmpgtd %xmm5, %xmm0 +; 
SSSE3-NEXT: pand %xmm3, %xmm0 +; SSSE3-NEXT: movdqa %xmm2, %xmm3 +; SSSE3-NEXT: pcmpgtd %xmm5, %xmm3 +; SSSE3-NEXT: pand %xmm2, %xmm3 ; SSSE3-NEXT: pslld $16, %xmm3 ; SSSE3-NEXT: psrad $16, %xmm3 +; SSSE3-NEXT: pslld $16, %xmm0 +; SSSE3-NEXT: psrad $16, %xmm0 +; SSSE3-NEXT: packssdw %xmm3, %xmm0 +; SSSE3-NEXT: pslld $16, %xmm4 +; SSSE3-NEXT: psrad $16, %xmm4 ; SSSE3-NEXT: pslld $16, %xmm1 ; SSSE3-NEXT: psrad $16, %xmm1 -; SSSE3-NEXT: packssdw %xmm3, %xmm1 +; SSSE3-NEXT: packssdw %xmm4, %xmm1 ; SSSE3-NEXT: retq ; ; SSE41-LABEL: trunc_packus_v16i32_v16i16: ; SSE41: # %bb.0: -; SSE41-NEXT: packusdw %xmm1, %xmm0 -; SSE41-NEXT: packusdw %xmm3, %xmm2 -; SSE41-NEXT: movdqa %xmm2, %xmm1 +; SSE41-NEXT: movdqa (%rdi), %xmm0 +; SSE41-NEXT: movdqa 32(%rdi), %xmm1 +; SSE41-NEXT: packusdw 16(%rdi), %xmm0 +; SSE41-NEXT: packusdw 48(%rdi), %xmm1 ; SSE41-NEXT: retq ; ; AVX1-LABEL: trunc_packus_v16i32_v16i16: ; AVX1: # %bb.0: -; AVX1-NEXT: vextractf128 $1, %ymm1, %xmm2 -; AVX1-NEXT: vpackusdw %xmm2, %xmm1, %xmm1 -; AVX1-NEXT: vextractf128 $1, %ymm0, %xmm2 -; AVX1-NEXT: vpackusdw %xmm2, %xmm0, %xmm0 +; AVX1-NEXT: vmovdqa (%rdi), %xmm0 +; AVX1-NEXT: vmovdqa 32(%rdi), %xmm1 +; AVX1-NEXT: vpackusdw 48(%rdi), %xmm1, %xmm1 +; AVX1-NEXT: vpackusdw 16(%rdi), %xmm0, %xmm0 ; AVX1-NEXT: vinsertf128 $1, %xmm1, %ymm0, %ymm0 ; AVX1-NEXT: retq ; ; AVX2-LABEL: trunc_packus_v16i32_v16i16: ; AVX2: # %bb.0: -; AVX2-NEXT: vpackusdw %ymm1, %ymm0, %ymm0 +; AVX2-NEXT: vmovdqa (%rdi), %ymm0 +; AVX2-NEXT: vpackusdw 32(%rdi), %ymm0, %ymm0 ; AVX2-NEXT: vpermq {{.*#+}} ymm0 = ymm0[0,2,1,3] ; AVX2-NEXT: retq ; ; AVX512-LABEL: trunc_packus_v16i32_v16i16: ; AVX512: # %bb.0: -; AVX512-NEXT: vpxor %xmm1, %xmm1, %xmm1 -; AVX512-NEXT: vpmaxsd %zmm1, %zmm0, %zmm0 +; AVX512-NEXT: vpxor %xmm0, %xmm0, %xmm0 +; AVX512-NEXT: vpmaxsd (%rdi), %zmm0, %zmm0 ; AVX512-NEXT: vpmovusdw %zmm0, %ymm0 ; AVX512-NEXT: retq +; +; SKX-LABEL: trunc_packus_v16i32_v16i16: +; SKX: # %bb.0: +; SKX-NEXT: vpbroadcastd {{.*#+}} ymm0 = 
[65535,65535,65535,65535,65535,65535,65535,65535] +; SKX-NEXT: vpminsd (%rdi), %ymm0, %ymm1 +; SKX-NEXT: vpminsd 32(%rdi), %ymm0, %ymm0 +; SKX-NEXT: vpxor %xmm2, %xmm2, %xmm2 +; SKX-NEXT: vpmaxsd %ymm2, %ymm0, %ymm0 +; SKX-NEXT: vpmaxsd %ymm2, %ymm1, %ymm1 +; SKX-NEXT: vpackusdw %ymm0, %ymm1, %ymm0 +; SKX-NEXT: vpermq {{.*#+}} ymm0 = ymm0[0,2,1,3] +; SKX-NEXT: retq + %a0 = load <16 x i32>, <16 x i32>* %p0 %1 = icmp slt <16 x i32> %a0, %2 = select <16 x i1> %1, <16 x i32> %a0, <16 x i32> %3 = icmp sgt <16 x i32> %2, zeroinitializer @@ -2235,6 +2361,14 @@ define <4 x i8> @trunc_packus_v4i64_v4i8 ; AVX512BWVL-NEXT: vpmovusqb %ymm0, %xmm0 ; AVX512BWVL-NEXT: vzeroupper ; AVX512BWVL-NEXT: retq +; +; SKX-LABEL: trunc_packus_v4i64_v4i8: +; SKX: # %bb.0: +; SKX-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; SKX-NEXT: vpmaxsq %ymm1, %ymm0, %ymm0 +; SKX-NEXT: vpmovusqb %ymm0, %xmm0 +; SKX-NEXT: vzeroupper +; SKX-NEXT: retq %1 = icmp slt <4 x i64> %a0, %2 = select <4 x i1> %1, <4 x i64> %a0, <4 x i64> %3 = icmp sgt <4 x i64> %2, zeroinitializer @@ -2493,6 +2627,14 @@ define void @trunc_packus_v4i64_v4i8_sto ; AVX512BWVL-NEXT: vpmovusqb %ymm0, (%rdi) ; AVX512BWVL-NEXT: vzeroupper ; AVX512BWVL-NEXT: retq +; +; SKX-LABEL: trunc_packus_v4i64_v4i8_store: +; SKX: # %bb.0: +; SKX-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; SKX-NEXT: vpmaxsq %ymm1, %ymm0, %ymm0 +; SKX-NEXT: vpmovusqb %ymm0, (%rdi) +; SKX-NEXT: vzeroupper +; SKX-NEXT: retq %1 = icmp slt <4 x i64> %a0, %2 = select <4 x i1> %1, <4 x i64> %a0, <4 x i64> %3 = icmp sgt <4 x i64> %2, zeroinitializer @@ -2502,251 +2644,262 @@ define void @trunc_packus_v4i64_v4i8_sto ret void } -define <8 x i8> @trunc_packus_v8i64_v8i8(<8 x i64> %a0) { +define <8 x i8> @trunc_packus_v8i64_v8i8(<8 x i64>* %p0) "min-legal-vector-width"="256" { ; SSE2-LABEL: trunc_packus_v8i64_v8i8: ; SSE2: # %bb.0: +; SSE2-NEXT: movdqa (%rdi), %xmm5 +; SSE2-NEXT: movdqa 16(%rdi), %xmm9 +; SSE2-NEXT: movdqa 32(%rdi), %xmm3 +; SSE2-NEXT: movdqa 48(%rdi), %xmm7 ; SSE2-NEXT: movdqa 
{{.*#+}} xmm8 = [255,255] -; SSE2-NEXT: movdqa {{.*#+}} xmm10 = [2147483648,2147483648] -; SSE2-NEXT: movdqa %xmm2, %xmm5 -; SSE2-NEXT: pxor %xmm10, %xmm5 -; SSE2-NEXT: movdqa {{.*#+}} xmm9 = [2147483903,2147483903] -; SSE2-NEXT: movdqa %xmm9, %xmm7 -; SSE2-NEXT: pcmpgtd %xmm5, %xmm7 -; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm7[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm9, %xmm5 -; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm5[1,1,3,3] -; SSE2-NEXT: pand %xmm6, %xmm4 -; SSE2-NEXT: pshufd {{.*#+}} xmm5 = xmm7[1,1,3,3] -; SSE2-NEXT: por %xmm4, %xmm5 -; SSE2-NEXT: pand %xmm5, %xmm2 -; SSE2-NEXT: pandn %xmm8, %xmm5 -; SSE2-NEXT: por %xmm2, %xmm5 +; SSE2-NEXT: movdqa {{.*#+}} xmm11 = [2147483648,2147483648] ; SSE2-NEXT: movdqa %xmm3, %xmm2 -; SSE2-NEXT: pxor %xmm10, %xmm2 -; SSE2-NEXT: movdqa %xmm9, %xmm4 -; SSE2-NEXT: pcmpgtd %xmm2, %xmm4 -; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm4[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm9, %xmm2 -; SSE2-NEXT: pshufd {{.*#+}} xmm7 = xmm2[1,1,3,3] -; SSE2-NEXT: pand %xmm6, %xmm7 -; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm4[1,1,3,3] -; SSE2-NEXT: por %xmm7, %xmm2 +; SSE2-NEXT: pxor %xmm11, %xmm2 +; SSE2-NEXT: movdqa {{.*#+}} xmm10 = [2147483903,2147483903] +; SSE2-NEXT: movdqa %xmm10, %xmm6 +; SSE2-NEXT: pcmpgtd %xmm2, %xmm6 +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm6[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm10, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm2[1,1,3,3] +; SSE2-NEXT: pand %xmm0, %xmm4 +; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm6[1,1,3,3] +; SSE2-NEXT: por %xmm4, %xmm2 ; SSE2-NEXT: pand %xmm2, %xmm3 ; SSE2-NEXT: pandn %xmm8, %xmm2 ; SSE2-NEXT: por %xmm3, %xmm2 -; SSE2-NEXT: movdqa %xmm0, %xmm3 -; SSE2-NEXT: pxor %xmm10, %xmm3 -; SSE2-NEXT: movdqa %xmm9, %xmm4 -; SSE2-NEXT: pcmpgtd %xmm3, %xmm4 -; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm4[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm9, %xmm3 -; SSE2-NEXT: pshufd {{.*#+}} xmm7 = xmm3[1,1,3,3] -; SSE2-NEXT: pand %xmm6, %xmm7 -; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm4[1,1,3,3] -; SSE2-NEXT: por %xmm7, %xmm3 -; SSE2-NEXT: pand 
%xmm3, %xmm0 -; SSE2-NEXT: pandn %xmm8, %xmm3 +; SSE2-NEXT: movdqa %xmm7, %xmm0 +; SSE2-NEXT: pxor %xmm11, %xmm0 +; SSE2-NEXT: movdqa %xmm10, %xmm3 +; SSE2-NEXT: pcmpgtd %xmm0, %xmm3 +; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm3[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm10, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] +; SSE2-NEXT: pand %xmm4, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm3[1,1,3,3] ; SSE2-NEXT: por %xmm0, %xmm3 -; SSE2-NEXT: movdqa %xmm1, %xmm0 -; SSE2-NEXT: pxor %xmm10, %xmm0 -; SSE2-NEXT: movdqa %xmm9, %xmm4 +; SSE2-NEXT: pand %xmm3, %xmm7 +; SSE2-NEXT: pandn %xmm8, %xmm3 +; SSE2-NEXT: por %xmm7, %xmm3 +; SSE2-NEXT: movdqa %xmm5, %xmm0 +; SSE2-NEXT: pxor %xmm11, %xmm0 +; SSE2-NEXT: movdqa %xmm10, %xmm4 ; SSE2-NEXT: pcmpgtd %xmm0, %xmm4 ; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm4[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm9, %xmm0 +; SSE2-NEXT: pcmpeqd %xmm10, %xmm0 ; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] ; SSE2-NEXT: pand %xmm6, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm7 = xmm4[1,1,3,3] +; SSE2-NEXT: por %xmm0, %xmm7 +; SSE2-NEXT: pand %xmm7, %xmm5 +; SSE2-NEXT: pandn %xmm8, %xmm7 +; SSE2-NEXT: por %xmm5, %xmm7 +; SSE2-NEXT: movdqa %xmm9, %xmm0 +; SSE2-NEXT: pxor %xmm11, %xmm0 +; SSE2-NEXT: movdqa %xmm10, %xmm4 +; SSE2-NEXT: pcmpgtd %xmm0, %xmm4 +; SSE2-NEXT: pshufd {{.*#+}} xmm5 = xmm4[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm10, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] +; SSE2-NEXT: pand %xmm5, %xmm0 ; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm4[1,1,3,3] ; SSE2-NEXT: por %xmm0, %xmm4 -; SSE2-NEXT: pand %xmm4, %xmm1 +; SSE2-NEXT: pand %xmm4, %xmm9 ; SSE2-NEXT: pandn %xmm8, %xmm4 -; SSE2-NEXT: por %xmm1, %xmm4 +; SSE2-NEXT: por %xmm9, %xmm4 ; SSE2-NEXT: movdqa %xmm4, %xmm0 -; SSE2-NEXT: pxor %xmm10, %xmm0 -; SSE2-NEXT: movdqa %xmm0, %xmm1 -; SSE2-NEXT: pcmpgtd %xmm10, %xmm1 -; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm1[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm10, %xmm0 +; SSE2-NEXT: pxor %xmm11, %xmm0 +; SSE2-NEXT: movdqa %xmm0, %xmm5 +; SSE2-NEXT: 
pcmpgtd %xmm11, %xmm5 +; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm5[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm11, %xmm0 ; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] ; SSE2-NEXT: pand %xmm6, %xmm0 -; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] -; SSE2-NEXT: por %xmm0, %xmm1 -; SSE2-NEXT: pand %xmm4, %xmm1 -; SSE2-NEXT: movdqa %xmm3, %xmm0 -; SSE2-NEXT: pxor %xmm10, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm5 = xmm5[1,1,3,3] +; SSE2-NEXT: por %xmm0, %xmm5 +; SSE2-NEXT: pand %xmm4, %xmm5 +; SSE2-NEXT: movdqa %xmm7, %xmm0 +; SSE2-NEXT: pxor %xmm11, %xmm0 ; SSE2-NEXT: movdqa %xmm0, %xmm4 -; SSE2-NEXT: pcmpgtd %xmm10, %xmm4 +; SSE2-NEXT: pcmpgtd %xmm11, %xmm4 ; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm4[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm10, %xmm0 -; SSE2-NEXT: pshufd {{.*#+}} xmm7 = xmm0[1,1,3,3] -; SSE2-NEXT: pand %xmm6, %xmm7 +; SSE2-NEXT: pcmpeqd %xmm11, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm0[1,1,3,3] +; SSE2-NEXT: pand %xmm6, %xmm1 ; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm4[1,1,3,3] -; SSE2-NEXT: por %xmm7, %xmm0 -; SSE2-NEXT: pand %xmm3, %xmm0 -; SSE2-NEXT: packuswb %xmm1, %xmm0 +; SSE2-NEXT: por %xmm1, %xmm0 +; SSE2-NEXT: pand %xmm7, %xmm0 +; SSE2-NEXT: packuswb %xmm5, %xmm0 +; SSE2-NEXT: movdqa %xmm3, %xmm1 +; SSE2-NEXT: pxor %xmm11, %xmm1 +; SSE2-NEXT: movdqa %xmm1, %xmm4 +; SSE2-NEXT: pcmpgtd %xmm11, %xmm4 +; SSE2-NEXT: pshufd {{.*#+}} xmm5 = xmm4[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm11, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] +; SSE2-NEXT: pand %xmm5, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm4[1,1,3,3] +; SSE2-NEXT: por %xmm1, %xmm4 +; SSE2-NEXT: pand %xmm3, %xmm4 ; SSE2-NEXT: movdqa %xmm2, %xmm1 -; SSE2-NEXT: pxor %xmm10, %xmm1 +; SSE2-NEXT: pxor %xmm11, %xmm1 ; SSE2-NEXT: movdqa %xmm1, %xmm3 -; SSE2-NEXT: pcmpgtd %xmm10, %xmm3 -; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm3[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm10, %xmm1 +; SSE2-NEXT: pcmpgtd %xmm11, %xmm3 +; SSE2-NEXT: pshufd {{.*#+}} xmm5 = xmm3[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm11, %xmm1 ; 
SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] -; SSE2-NEXT: pand %xmm4, %xmm1 +; SSE2-NEXT: pand %xmm5, %xmm1 ; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm3[1,1,3,3] ; SSE2-NEXT: por %xmm1, %xmm3 ; SSE2-NEXT: pand %xmm2, %xmm3 -; SSE2-NEXT: movdqa %xmm5, %xmm1 -; SSE2-NEXT: pxor %xmm10, %xmm1 -; SSE2-NEXT: movdqa %xmm1, %xmm2 -; SSE2-NEXT: pcmpgtd %xmm10, %xmm2 -; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm2[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm10, %xmm1 -; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] -; SSE2-NEXT: pand %xmm4, %xmm1 -; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] -; SSE2-NEXT: por %xmm1, %xmm2 -; SSE2-NEXT: pand %xmm5, %xmm2 -; SSE2-NEXT: packuswb %xmm3, %xmm2 -; SSE2-NEXT: packuswb %xmm2, %xmm0 +; SSE2-NEXT: packuswb %xmm4, %xmm3 +; SSE2-NEXT: packuswb %xmm3, %xmm0 ; SSE2-NEXT: packuswb %xmm0, %xmm0 ; SSE2-NEXT: retq ; ; SSSE3-LABEL: trunc_packus_v8i64_v8i8: ; SSSE3: # %bb.0: +; SSSE3-NEXT: movdqa (%rdi), %xmm5 +; SSSE3-NEXT: movdqa 16(%rdi), %xmm9 +; SSSE3-NEXT: movdqa 32(%rdi), %xmm3 +; SSSE3-NEXT: movdqa 48(%rdi), %xmm7 ; SSSE3-NEXT: movdqa {{.*#+}} xmm8 = [255,255] -; SSSE3-NEXT: movdqa {{.*#+}} xmm10 = [2147483648,2147483648] -; SSSE3-NEXT: movdqa %xmm2, %xmm5 -; SSSE3-NEXT: pxor %xmm10, %xmm5 -; SSSE3-NEXT: movdqa {{.*#+}} xmm9 = [2147483903,2147483903] -; SSSE3-NEXT: movdqa %xmm9, %xmm7 -; SSSE3-NEXT: pcmpgtd %xmm5, %xmm7 -; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm7[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm9, %xmm5 -; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm5[1,1,3,3] -; SSSE3-NEXT: pand %xmm6, %xmm4 -; SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm7[1,1,3,3] -; SSSE3-NEXT: por %xmm4, %xmm5 -; SSSE3-NEXT: pand %xmm5, %xmm2 -; SSSE3-NEXT: pandn %xmm8, %xmm5 -; SSSE3-NEXT: por %xmm2, %xmm5 +; SSSE3-NEXT: movdqa {{.*#+}} xmm11 = [2147483648,2147483648] ; SSSE3-NEXT: movdqa %xmm3, %xmm2 -; SSSE3-NEXT: pxor %xmm10, %xmm2 -; SSSE3-NEXT: movdqa %xmm9, %xmm4 -; SSSE3-NEXT: pcmpgtd %xmm2, %xmm4 -; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm4[0,0,2,2] -; SSSE3-NEXT: pcmpeqd 
%xmm9, %xmm2 -; SSSE3-NEXT: pshufd {{.*#+}} xmm7 = xmm2[1,1,3,3] -; SSSE3-NEXT: pand %xmm6, %xmm7 -; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm4[1,1,3,3] -; SSSE3-NEXT: por %xmm7, %xmm2 +; SSSE3-NEXT: pxor %xmm11, %xmm2 +; SSSE3-NEXT: movdqa {{.*#+}} xmm10 = [2147483903,2147483903] +; SSSE3-NEXT: movdqa %xmm10, %xmm6 +; SSSE3-NEXT: pcmpgtd %xmm2, %xmm6 +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm6[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm10, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm2[1,1,3,3] +; SSSE3-NEXT: pand %xmm0, %xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm6[1,1,3,3] +; SSSE3-NEXT: por %xmm4, %xmm2 ; SSSE3-NEXT: pand %xmm2, %xmm3 ; SSSE3-NEXT: pandn %xmm8, %xmm2 ; SSSE3-NEXT: por %xmm3, %xmm2 -; SSSE3-NEXT: movdqa %xmm0, %xmm3 -; SSSE3-NEXT: pxor %xmm10, %xmm3 -; SSSE3-NEXT: movdqa %xmm9, %xmm4 -; SSSE3-NEXT: pcmpgtd %xmm3, %xmm4 -; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm4[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm9, %xmm3 -; SSSE3-NEXT: pshufd {{.*#+}} xmm7 = xmm3[1,1,3,3] -; SSSE3-NEXT: pand %xmm6, %xmm7 -; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm4[1,1,3,3] -; SSSE3-NEXT: por %xmm7, %xmm3 -; SSSE3-NEXT: pand %xmm3, %xmm0 -; SSSE3-NEXT: pandn %xmm8, %xmm3 +; SSSE3-NEXT: movdqa %xmm7, %xmm0 +; SSSE3-NEXT: pxor %xmm11, %xmm0 +; SSSE3-NEXT: movdqa %xmm10, %xmm3 +; SSSE3-NEXT: pcmpgtd %xmm0, %xmm3 +; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm3[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm10, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] +; SSSE3-NEXT: pand %xmm4, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm3[1,1,3,3] ; SSSE3-NEXT: por %xmm0, %xmm3 -; SSSE3-NEXT: movdqa %xmm1, %xmm0 -; SSSE3-NEXT: pxor %xmm10, %xmm0 -; SSSE3-NEXT: movdqa %xmm9, %xmm4 +; SSSE3-NEXT: pand %xmm3, %xmm7 +; SSSE3-NEXT: pandn %xmm8, %xmm3 +; SSSE3-NEXT: por %xmm7, %xmm3 +; SSSE3-NEXT: movdqa %xmm5, %xmm0 +; SSSE3-NEXT: pxor %xmm11, %xmm0 +; SSSE3-NEXT: movdqa %xmm10, %xmm4 ; SSSE3-NEXT: pcmpgtd %xmm0, %xmm4 ; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm4[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm9, %xmm0 +; 
SSSE3-NEXT: pcmpeqd %xmm10, %xmm0 ; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] ; SSSE3-NEXT: pand %xmm6, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm7 = xmm4[1,1,3,3] +; SSSE3-NEXT: por %xmm0, %xmm7 +; SSSE3-NEXT: pand %xmm7, %xmm5 +; SSSE3-NEXT: pandn %xmm8, %xmm7 +; SSSE3-NEXT: por %xmm5, %xmm7 +; SSSE3-NEXT: movdqa %xmm9, %xmm0 +; SSSE3-NEXT: pxor %xmm11, %xmm0 +; SSSE3-NEXT: movdqa %xmm10, %xmm4 +; SSSE3-NEXT: pcmpgtd %xmm0, %xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm4[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm10, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] +; SSSE3-NEXT: pand %xmm5, %xmm0 ; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm4[1,1,3,3] ; SSSE3-NEXT: por %xmm0, %xmm4 -; SSSE3-NEXT: pand %xmm4, %xmm1 +; SSSE3-NEXT: pand %xmm4, %xmm9 ; SSSE3-NEXT: pandn %xmm8, %xmm4 -; SSSE3-NEXT: por %xmm1, %xmm4 +; SSSE3-NEXT: por %xmm9, %xmm4 ; SSSE3-NEXT: movdqa %xmm4, %xmm0 -; SSSE3-NEXT: pxor %xmm10, %xmm0 -; SSSE3-NEXT: movdqa %xmm0, %xmm1 -; SSSE3-NEXT: pcmpgtd %xmm10, %xmm1 -; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm1[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm10, %xmm0 +; SSSE3-NEXT: pxor %xmm11, %xmm0 +; SSSE3-NEXT: movdqa %xmm0, %xmm5 +; SSSE3-NEXT: pcmpgtd %xmm11, %xmm5 +; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm5[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm11, %xmm0 ; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] ; SSSE3-NEXT: pand %xmm6, %xmm0 -; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] -; SSSE3-NEXT: por %xmm0, %xmm1 -; SSSE3-NEXT: pand %xmm4, %xmm1 -; SSSE3-NEXT: movdqa %xmm3, %xmm0 -; SSSE3-NEXT: pxor %xmm10, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm5[1,1,3,3] +; SSSE3-NEXT: por %xmm0, %xmm5 +; SSSE3-NEXT: pand %xmm4, %xmm5 +; SSSE3-NEXT: movdqa %xmm7, %xmm0 +; SSSE3-NEXT: pxor %xmm11, %xmm0 ; SSSE3-NEXT: movdqa %xmm0, %xmm4 -; SSSE3-NEXT: pcmpgtd %xmm10, %xmm4 +; SSSE3-NEXT: pcmpgtd %xmm11, %xmm4 ; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm4[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm10, %xmm0 -; SSSE3-NEXT: pshufd {{.*#+}} xmm7 = xmm0[1,1,3,3] -; 
SSSE3-NEXT: pand %xmm6, %xmm7 +; SSSE3-NEXT: pcmpeqd %xmm11, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm0[1,1,3,3] +; SSSE3-NEXT: pand %xmm6, %xmm1 ; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm4[1,1,3,3] -; SSSE3-NEXT: por %xmm7, %xmm0 -; SSSE3-NEXT: pand %xmm3, %xmm0 -; SSSE3-NEXT: packuswb %xmm1, %xmm0 +; SSSE3-NEXT: por %xmm1, %xmm0 +; SSSE3-NEXT: pand %xmm7, %xmm0 +; SSSE3-NEXT: packuswb %xmm5, %xmm0 +; SSSE3-NEXT: movdqa %xmm3, %xmm1 +; SSSE3-NEXT: pxor %xmm11, %xmm1 +; SSSE3-NEXT: movdqa %xmm1, %xmm4 +; SSSE3-NEXT: pcmpgtd %xmm11, %xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm4[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm11, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] +; SSSE3-NEXT: pand %xmm5, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm4[1,1,3,3] +; SSSE3-NEXT: por %xmm1, %xmm4 +; SSSE3-NEXT: pand %xmm3, %xmm4 ; SSSE3-NEXT: movdqa %xmm2, %xmm1 -; SSSE3-NEXT: pxor %xmm10, %xmm1 +; SSSE3-NEXT: pxor %xmm11, %xmm1 ; SSSE3-NEXT: movdqa %xmm1, %xmm3 -; SSSE3-NEXT: pcmpgtd %xmm10, %xmm3 -; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm3[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm10, %xmm1 +; SSSE3-NEXT: pcmpgtd %xmm11, %xmm3 +; SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm3[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm11, %xmm1 ; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] -; SSSE3-NEXT: pand %xmm4, %xmm1 +; SSSE3-NEXT: pand %xmm5, %xmm1 ; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm3[1,1,3,3] ; SSSE3-NEXT: por %xmm1, %xmm3 ; SSSE3-NEXT: pand %xmm2, %xmm3 -; SSSE3-NEXT: movdqa %xmm5, %xmm1 -; SSSE3-NEXT: pxor %xmm10, %xmm1 -; SSSE3-NEXT: movdqa %xmm1, %xmm2 -; SSSE3-NEXT: pcmpgtd %xmm10, %xmm2 -; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm2[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm10, %xmm1 -; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] -; SSSE3-NEXT: pand %xmm4, %xmm1 -; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] -; SSSE3-NEXT: por %xmm1, %xmm2 -; SSSE3-NEXT: pand %xmm5, %xmm2 -; SSSE3-NEXT: packuswb %xmm3, %xmm2 -; SSSE3-NEXT: packuswb %xmm2, %xmm0 +; SSSE3-NEXT: packuswb %xmm4, 
%xmm3 +; SSSE3-NEXT: packuswb %xmm3, %xmm0 ; SSSE3-NEXT: packuswb %xmm0, %xmm0 ; SSSE3-NEXT: retq ; ; SSE41-LABEL: trunc_packus_v8i64_v8i8: ; SSE41: # %bb.0: -; SSE41-NEXT: movdqa %xmm0, %xmm9 -; SSE41-NEXT: movapd {{.*#+}} xmm7 = [255,255] -; SSE41-NEXT: movdqa {{.*#+}} xmm10 = [2147483648,2147483648] -; SSE41-NEXT: movdqa %xmm2, %xmm0 -; SSE41-NEXT: pxor %xmm10, %xmm0 +; SSE41-NEXT: movdqa (%rdi), %xmm10 +; SSE41-NEXT: movdqa 16(%rdi), %xmm9 +; SSE41-NEXT: movdqa 32(%rdi), %xmm3 +; SSE41-NEXT: movdqa 48(%rdi), %xmm5 +; SSE41-NEXT: movapd {{.*#+}} xmm1 = [255,255] +; SSE41-NEXT: movdqa {{.*#+}} xmm2 = [2147483648,2147483648] +; SSE41-NEXT: movdqa %xmm3, %xmm0 +; SSE41-NEXT: pxor %xmm2, %xmm0 ; SSE41-NEXT: movdqa {{.*#+}} xmm4 = [2147483903,2147483903] -; SSE41-NEXT: movdqa %xmm4, %xmm5 -; SSE41-NEXT: pcmpeqd %xmm0, %xmm5 +; SSE41-NEXT: movdqa %xmm4, %xmm7 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm7 ; SSE41-NEXT: movdqa %xmm4, %xmm6 ; SSE41-NEXT: pcmpgtd %xmm0, %xmm6 ; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm6[0,0,2,2] -; SSE41-NEXT: pand %xmm5, %xmm0 +; SSE41-NEXT: pand %xmm7, %xmm0 ; SSE41-NEXT: por %xmm6, %xmm0 -; SSE41-NEXT: movapd %xmm7, %xmm8 -; SSE41-NEXT: blendvpd %xmm0, %xmm2, %xmm8 -; SSE41-NEXT: movdqa %xmm3, %xmm0 -; SSE41-NEXT: pxor %xmm10, %xmm0 -; SSE41-NEXT: movdqa %xmm4, %xmm2 -; SSE41-NEXT: pcmpeqd %xmm0, %xmm2 -; SSE41-NEXT: movdqa %xmm4, %xmm5 -; SSE41-NEXT: pcmpgtd %xmm0, %xmm5 -; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm5[0,0,2,2] -; SSE41-NEXT: pand %xmm2, %xmm0 -; SSE41-NEXT: por %xmm5, %xmm0 -; SSE41-NEXT: movapd %xmm7, %xmm2 -; SSE41-NEXT: blendvpd %xmm0, %xmm3, %xmm2 -; SSE41-NEXT: movdqa %xmm9, %xmm0 -; SSE41-NEXT: pxor %xmm10, %xmm0 +; SSE41-NEXT: movapd %xmm1, %xmm8 +; SSE41-NEXT: blendvpd %xmm0, %xmm3, %xmm8 +; SSE41-NEXT: movdqa %xmm5, %xmm0 +; SSE41-NEXT: pxor %xmm2, %xmm0 +; SSE41-NEXT: movdqa %xmm4, %xmm3 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm3 +; SSE41-NEXT: movdqa %xmm4, %xmm6 +; SSE41-NEXT: pcmpgtd %xmm0, %xmm6 +; SSE41-NEXT: pshufd {{.*#+}} 
xmm0 = xmm6[0,0,2,2] +; SSE41-NEXT: pand %xmm3, %xmm0 +; SSE41-NEXT: por %xmm6, %xmm0 +; SSE41-NEXT: movapd %xmm1, %xmm6 +; SSE41-NEXT: blendvpd %xmm0, %xmm5, %xmm6 +; SSE41-NEXT: movdqa %xmm10, %xmm0 +; SSE41-NEXT: pxor %xmm2, %xmm0 ; SSE41-NEXT: movdqa %xmm4, %xmm3 ; SSE41-NEXT: pcmpeqd %xmm0, %xmm3 ; SSE41-NEXT: movdqa %xmm4, %xmm5 @@ -2754,95 +2907,98 @@ define <8 x i8> @trunc_packus_v8i64_v8i8 ; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm5[0,0,2,2] ; SSE41-NEXT: pand %xmm3, %xmm0 ; SSE41-NEXT: por %xmm5, %xmm0 -; SSE41-NEXT: movapd %xmm7, %xmm6 -; SSE41-NEXT: blendvpd %xmm0, %xmm9, %xmm6 -; SSE41-NEXT: movdqa %xmm1, %xmm0 -; SSE41-NEXT: pxor %xmm10, %xmm0 -; SSE41-NEXT: movdqa %xmm4, %xmm3 -; SSE41-NEXT: pcmpeqd %xmm0, %xmm3 +; SSE41-NEXT: movapd %xmm1, %xmm3 +; SSE41-NEXT: blendvpd %xmm0, %xmm10, %xmm3 +; SSE41-NEXT: movdqa %xmm9, %xmm0 +; SSE41-NEXT: pxor %xmm2, %xmm0 +; SSE41-NEXT: movdqa %xmm4, %xmm5 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm5 ; SSE41-NEXT: pcmpgtd %xmm0, %xmm4 ; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm4[0,0,2,2] -; SSE41-NEXT: pand %xmm3, %xmm0 +; SSE41-NEXT: pand %xmm5, %xmm0 ; SSE41-NEXT: por %xmm4, %xmm0 -; SSE41-NEXT: blendvpd %xmm0, %xmm1, %xmm7 -; SSE41-NEXT: pxor %xmm3, %xmm3 -; SSE41-NEXT: movapd %xmm7, %xmm1 -; SSE41-NEXT: xorpd %xmm10, %xmm1 +; SSE41-NEXT: blendvpd %xmm0, %xmm9, %xmm1 +; SSE41-NEXT: pxor %xmm5, %xmm5 ; SSE41-NEXT: movapd %xmm1, %xmm4 -; SSE41-NEXT: pcmpeqd %xmm10, %xmm4 -; SSE41-NEXT: pcmpgtd %xmm10, %xmm1 -; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm1[0,0,2,2] -; SSE41-NEXT: pand %xmm4, %xmm0 -; SSE41-NEXT: por %xmm1, %xmm0 +; SSE41-NEXT: xorpd %xmm2, %xmm4 +; SSE41-NEXT: movapd %xmm4, %xmm7 +; SSE41-NEXT: pcmpeqd %xmm2, %xmm7 +; SSE41-NEXT: pcmpgtd %xmm2, %xmm4 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm4[0,0,2,2] +; SSE41-NEXT: pand %xmm7, %xmm0 +; SSE41-NEXT: por %xmm4, %xmm0 ; SSE41-NEXT: pxor %xmm4, %xmm4 -; SSE41-NEXT: blendvpd %xmm0, %xmm7, %xmm4 -; SSE41-NEXT: movapd %xmm6, %xmm1 -; SSE41-NEXT: xorpd %xmm10, %xmm1 -; 
SSE41-NEXT: movapd %xmm1, %xmm5 -; SSE41-NEXT: pcmpeqd %xmm10, %xmm5 -; SSE41-NEXT: pcmpgtd %xmm10, %xmm1 +; SSE41-NEXT: blendvpd %xmm0, %xmm1, %xmm4 +; SSE41-NEXT: movapd %xmm3, %xmm1 +; SSE41-NEXT: xorpd %xmm2, %xmm1 +; SSE41-NEXT: movapd %xmm1, %xmm7 +; SSE41-NEXT: pcmpeqd %xmm2, %xmm7 +; SSE41-NEXT: pcmpgtd %xmm2, %xmm1 ; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm1[0,0,2,2] -; SSE41-NEXT: pand %xmm5, %xmm0 +; SSE41-NEXT: pand %xmm7, %xmm0 ; SSE41-NEXT: por %xmm1, %xmm0 ; SSE41-NEXT: pxor %xmm1, %xmm1 -; SSE41-NEXT: blendvpd %xmm0, %xmm6, %xmm1 +; SSE41-NEXT: blendvpd %xmm0, %xmm3, %xmm1 ; SSE41-NEXT: packusdw %xmm4, %xmm1 -; SSE41-NEXT: movapd %xmm2, %xmm4 -; SSE41-NEXT: xorpd %xmm10, %xmm4 -; SSE41-NEXT: movapd %xmm4, %xmm5 -; SSE41-NEXT: pcmpeqd %xmm10, %xmm5 -; SSE41-NEXT: pcmpgtd %xmm10, %xmm4 +; SSE41-NEXT: movapd %xmm6, %xmm3 +; SSE41-NEXT: xorpd %xmm2, %xmm3 +; SSE41-NEXT: movapd %xmm3, %xmm4 +; SSE41-NEXT: pcmpeqd %xmm2, %xmm4 +; SSE41-NEXT: pcmpgtd %xmm2, %xmm3 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm3[0,0,2,2] +; SSE41-NEXT: pand %xmm4, %xmm0 +; SSE41-NEXT: por %xmm3, %xmm0 +; SSE41-NEXT: pxor %xmm3, %xmm3 +; SSE41-NEXT: blendvpd %xmm0, %xmm6, %xmm3 +; SSE41-NEXT: movapd %xmm8, %xmm4 +; SSE41-NEXT: xorpd %xmm2, %xmm4 +; SSE41-NEXT: movapd %xmm4, %xmm6 +; SSE41-NEXT: pcmpeqd %xmm2, %xmm6 +; SSE41-NEXT: pcmpgtd %xmm2, %xmm4 ; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm4[0,0,2,2] -; SSE41-NEXT: pand %xmm5, %xmm0 +; SSE41-NEXT: pand %xmm6, %xmm0 ; SSE41-NEXT: por %xmm4, %xmm0 -; SSE41-NEXT: pxor %xmm4, %xmm4 -; SSE41-NEXT: blendvpd %xmm0, %xmm2, %xmm4 -; SSE41-NEXT: movapd %xmm8, %xmm2 -; SSE41-NEXT: xorpd %xmm10, %xmm2 -; SSE41-NEXT: movapd %xmm2, %xmm5 -; SSE41-NEXT: pcmpeqd %xmm10, %xmm5 -; SSE41-NEXT: pcmpgtd %xmm10, %xmm2 -; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm2[0,0,2,2] -; SSE41-NEXT: pand %xmm5, %xmm0 -; SSE41-NEXT: por %xmm2, %xmm0 -; SSE41-NEXT: blendvpd %xmm0, %xmm8, %xmm3 -; SSE41-NEXT: packusdw %xmm4, %xmm3 -; SSE41-NEXT: packusdw %xmm3, %xmm1 +; 
SSE41-NEXT: blendvpd %xmm0, %xmm8, %xmm5 +; SSE41-NEXT: packusdw %xmm3, %xmm5 +; SSE41-NEXT: packusdw %xmm5, %xmm1 ; SSE41-NEXT: packuswb %xmm1, %xmm1 ; SSE41-NEXT: movdqa %xmm1, %xmm0 ; SSE41-NEXT: retq ; ; AVX1-LABEL: trunc_packus_v8i64_v8i8: ; AVX1: # %bb.0: -; AVX1-NEXT: vextractf128 $1, %ymm1, %xmm2 -; AVX1-NEXT: vmovdqa {{.*#+}} xmm3 = [255,255] -; AVX1-NEXT: vpcmpgtq %xmm2, %xmm3, %xmm8 -; AVX1-NEXT: vpcmpgtq %xmm1, %xmm3, %xmm5 -; AVX1-NEXT: vextractf128 $1, %ymm0, %xmm6 -; AVX1-NEXT: vpcmpgtq %xmm6, %xmm3, %xmm7 -; AVX1-NEXT: vpcmpgtq %xmm0, %xmm3, %xmm4 -; AVX1-NEXT: vblendvpd %xmm4, %xmm0, %xmm3, %xmm0 -; AVX1-NEXT: vpxor %xmm4, %xmm4, %xmm4 -; AVX1-NEXT: vpcmpgtq %xmm4, %xmm0, %xmm9 -; AVX1-NEXT: vblendvpd %xmm7, %xmm6, %xmm3, %xmm6 -; AVX1-NEXT: vpcmpgtq %xmm4, %xmm6, %xmm7 -; AVX1-NEXT: vblendvpd %xmm5, %xmm1, %xmm3, %xmm1 -; AVX1-NEXT: vpcmpgtq %xmm4, %xmm1, %xmm5 -; AVX1-NEXT: vblendvpd %xmm8, %xmm2, %xmm3, %xmm2 -; AVX1-NEXT: vpcmpgtq %xmm4, %xmm2, %xmm3 -; AVX1-NEXT: vpand %xmm2, %xmm3, %xmm2 -; AVX1-NEXT: vpand %xmm1, %xmm5, %xmm1 -; AVX1-NEXT: vpackusdw %xmm2, %xmm1, %xmm1 -; AVX1-NEXT: vpand %xmm6, %xmm7, %xmm2 +; AVX1-NEXT: vmovdqa (%rdi), %xmm0 +; AVX1-NEXT: vmovdqa 16(%rdi), %xmm1 +; AVX1-NEXT: vmovdqa 32(%rdi), %xmm2 +; AVX1-NEXT: vmovdqa 48(%rdi), %xmm3 +; AVX1-NEXT: vmovdqa {{.*#+}} xmm4 = [255,255] +; AVX1-NEXT: vpcmpgtq %xmm3, %xmm4, %xmm8 +; AVX1-NEXT: vpcmpgtq %xmm2, %xmm4, %xmm6 +; AVX1-NEXT: vpcmpgtq %xmm1, %xmm4, %xmm7 +; AVX1-NEXT: vpcmpgtq %xmm0, %xmm4, %xmm5 +; AVX1-NEXT: vblendvpd %xmm5, %xmm0, %xmm4, %xmm0 +; AVX1-NEXT: vpxor %xmm5, %xmm5, %xmm5 +; AVX1-NEXT: vpcmpgtq %xmm5, %xmm0, %xmm9 +; AVX1-NEXT: vblendvpd %xmm7, %xmm1, %xmm4, %xmm1 +; AVX1-NEXT: vpcmpgtq %xmm5, %xmm1, %xmm7 +; AVX1-NEXT: vblendvpd %xmm6, %xmm2, %xmm4, %xmm2 +; AVX1-NEXT: vpcmpgtq %xmm5, %xmm2, %xmm6 +; AVX1-NEXT: vblendvpd %xmm8, %xmm3, %xmm4, %xmm3 +; AVX1-NEXT: vpcmpgtq %xmm5, %xmm3, %xmm4 +; AVX1-NEXT: vpand %xmm3, %xmm4, %xmm3 +; AVX1-NEXT: vpand 
%xmm2, %xmm6, %xmm2 +; AVX1-NEXT: vpackusdw %xmm3, %xmm2, %xmm2 +; AVX1-NEXT: vpand %xmm1, %xmm7, %xmm1 ; AVX1-NEXT: vpand %xmm0, %xmm9, %xmm0 -; AVX1-NEXT: vpackusdw %xmm2, %xmm0, %xmm0 ; AVX1-NEXT: vpackusdw %xmm1, %xmm0, %xmm0 +; AVX1-NEXT: vpackusdw %xmm2, %xmm0, %xmm0 ; AVX1-NEXT: vpackuswb %xmm0, %xmm0, %xmm0 -; AVX1-NEXT: vzeroupper ; AVX1-NEXT: retq ; ; AVX2-LABEL: trunc_packus_v8i64_v8i8: ; AVX2: # %bb.0: +; AVX2-NEXT: vmovdqa (%rdi), %ymm0 +; AVX2-NEXT: vmovdqa 32(%rdi), %ymm1 ; AVX2-NEXT: vpbroadcastq {{.*#+}} ymm2 = [255,255,255,255] ; AVX2-NEXT: vpcmpgtq %ymm1, %ymm2, %ymm3 ; AVX2-NEXT: vblendvpd %ymm3, %ymm1, %ymm2, %ymm1 @@ -2869,11 +3025,23 @@ define <8 x i8> @trunc_packus_v8i64_v8i8 ; ; AVX512-LABEL: trunc_packus_v8i64_v8i8: ; AVX512: # %bb.0: -; AVX512-NEXT: vpxor %xmm1, %xmm1, %xmm1 -; AVX512-NEXT: vpmaxsq %zmm1, %zmm0, %zmm0 +; AVX512-NEXT: vpxor %xmm0, %xmm0, %xmm0 +; AVX512-NEXT: vpmaxsq (%rdi), %zmm0, %zmm0 ; AVX512-NEXT: vpmovusqb %zmm0, %xmm0 ; AVX512-NEXT: vzeroupper ; AVX512-NEXT: retq +; +; SKX-LABEL: trunc_packus_v8i64_v8i8: +; SKX: # %bb.0: +; SKX-NEXT: vpxor %xmm0, %xmm0, %xmm0 +; SKX-NEXT: vpmaxsq 32(%rdi), %ymm0, %ymm1 +; SKX-NEXT: vpmovusqb %ymm1, %xmm1 +; SKX-NEXT: vpmaxsq (%rdi), %ymm0, %ymm0 +; SKX-NEXT: vpmovusqb %ymm0, %xmm0 +; SKX-NEXT: vpunpckldq {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1] +; SKX-NEXT: vzeroupper +; SKX-NEXT: retq + %a0 = load <8 x i64>, <8 x i64>* %p0 %1 = icmp slt <8 x i64> %a0, %2 = select <8 x i1> %1, <8 x i64> %a0, <8 x i64> %3 = icmp sgt <8 x i64> %2, zeroinitializer @@ -2882,350 +3050,364 @@ define <8 x i8> @trunc_packus_v8i64_v8i8 ret <8 x i8> %5 } -define void @trunc_packus_v8i64_v8i8_store(<8 x i64> %a0, <8 x i8> *%p1) { +define void @trunc_packus_v8i64_v8i8_store(<8 x i64>* %p0, <8 x i8> *%p1) "min-legal-vector-width"="256" { ; SSE2-LABEL: trunc_packus_v8i64_v8i8_store: ; SSE2: # %bb.0: +; SSE2-NEXT: movdqa (%rdi), %xmm5 +; SSE2-NEXT: movdqa 16(%rdi), %xmm9 +; SSE2-NEXT: movdqa 32(%rdi), %xmm2 
+; SSE2-NEXT: movdqa 48(%rdi), %xmm7 ; SSE2-NEXT: movdqa {{.*#+}} xmm8 = [255,255] -; SSE2-NEXT: movdqa {{.*#+}} xmm10 = [2147483648,2147483648] -; SSE2-NEXT: movdqa %xmm2, %xmm5 -; SSE2-NEXT: pxor %xmm10, %xmm5 -; SSE2-NEXT: movdqa {{.*#+}} xmm9 = [2147483903,2147483903] -; SSE2-NEXT: movdqa %xmm9, %xmm7 -; SSE2-NEXT: pcmpgtd %xmm5, %xmm7 -; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm7[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm9, %xmm5 -; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm5[1,1,3,3] -; SSE2-NEXT: pand %xmm6, %xmm4 -; SSE2-NEXT: pshufd {{.*#+}} xmm5 = xmm7[1,1,3,3] -; SSE2-NEXT: por %xmm4, %xmm5 -; SSE2-NEXT: pand %xmm5, %xmm2 -; SSE2-NEXT: pandn %xmm8, %xmm5 -; SSE2-NEXT: por %xmm2, %xmm5 -; SSE2-NEXT: movdqa %xmm3, %xmm2 -; SSE2-NEXT: pxor %xmm10, %xmm2 -; SSE2-NEXT: movdqa %xmm9, %xmm4 -; SSE2-NEXT: pcmpgtd %xmm2, %xmm4 -; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm4[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm9, %xmm2 -; SSE2-NEXT: pshufd {{.*#+}} xmm7 = xmm2[1,1,3,3] -; SSE2-NEXT: pand %xmm6, %xmm7 -; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm4[1,1,3,3] -; SSE2-NEXT: por %xmm7, %xmm2 -; SSE2-NEXT: pand %xmm2, %xmm3 +; SSE2-NEXT: movdqa {{.*#+}} xmm11 = [2147483648,2147483648] +; SSE2-NEXT: movdqa %xmm2, %xmm1 +; SSE2-NEXT: pxor %xmm11, %xmm1 +; SSE2-NEXT: movdqa {{.*#+}} xmm10 = [2147483903,2147483903] +; SSE2-NEXT: movdqa %xmm10, %xmm6 +; SSE2-NEXT: pcmpgtd %xmm1, %xmm6 +; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm6[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm10, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm1[1,1,3,3] +; SSE2-NEXT: pand %xmm3, %xmm4 +; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm6[1,1,3,3] +; SSE2-NEXT: por %xmm4, %xmm1 +; SSE2-NEXT: pand %xmm1, %xmm2 +; SSE2-NEXT: pandn %xmm8, %xmm1 +; SSE2-NEXT: por %xmm2, %xmm1 +; SSE2-NEXT: movdqa %xmm7, %xmm2 +; SSE2-NEXT: pxor %xmm11, %xmm2 +; SSE2-NEXT: movdqa %xmm10, %xmm3 +; SSE2-NEXT: pcmpgtd %xmm2, %xmm3 +; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm3[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm10, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm2[1,1,3,3] +; 
SSE2-NEXT: pand %xmm4, %xmm6 +; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm3[1,1,3,3] +; SSE2-NEXT: por %xmm6, %xmm2 +; SSE2-NEXT: pand %xmm2, %xmm7 ; SSE2-NEXT: pandn %xmm8, %xmm2 -; SSE2-NEXT: por %xmm3, %xmm2 -; SSE2-NEXT: movdqa %xmm0, %xmm3 -; SSE2-NEXT: pxor %xmm10, %xmm3 -; SSE2-NEXT: movdqa %xmm9, %xmm4 +; SSE2-NEXT: por %xmm7, %xmm2 +; SSE2-NEXT: movdqa %xmm5, %xmm3 +; SSE2-NEXT: pxor %xmm11, %xmm3 +; SSE2-NEXT: movdqa %xmm10, %xmm4 ; SSE2-NEXT: pcmpgtd %xmm3, %xmm4 ; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm4[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm9, %xmm3 -; SSE2-NEXT: pshufd {{.*#+}} xmm7 = xmm3[1,1,3,3] -; SSE2-NEXT: pand %xmm6, %xmm7 +; SSE2-NEXT: pcmpeqd %xmm10, %xmm3 +; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm3[1,1,3,3] +; SSE2-NEXT: pand %xmm6, %xmm3 +; SSE2-NEXT: pshufd {{.*#+}} xmm7 = xmm4[1,1,3,3] +; SSE2-NEXT: por %xmm3, %xmm7 +; SSE2-NEXT: pand %xmm7, %xmm5 +; SSE2-NEXT: pandn %xmm8, %xmm7 +; SSE2-NEXT: por %xmm5, %xmm7 +; SSE2-NEXT: movdqa %xmm9, %xmm3 +; SSE2-NEXT: pxor %xmm11, %xmm3 +; SSE2-NEXT: movdqa %xmm10, %xmm4 +; SSE2-NEXT: pcmpgtd %xmm3, %xmm4 +; SSE2-NEXT: pshufd {{.*#+}} xmm5 = xmm4[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm10, %xmm3 +; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm3[1,1,3,3] +; SSE2-NEXT: pand %xmm5, %xmm3 +; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm4[1,1,3,3] +; SSE2-NEXT: por %xmm3, %xmm4 +; SSE2-NEXT: pand %xmm4, %xmm9 +; SSE2-NEXT: pandn %xmm8, %xmm4 +; SSE2-NEXT: por %xmm9, %xmm4 +; SSE2-NEXT: movdqa %xmm4, %xmm3 +; SSE2-NEXT: pxor %xmm11, %xmm3 +; SSE2-NEXT: movdqa %xmm3, %xmm5 +; SSE2-NEXT: pcmpgtd %xmm11, %xmm5 +; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm5[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm11, %xmm3 +; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm3[1,1,3,3] +; SSE2-NEXT: pand %xmm6, %xmm3 +; SSE2-NEXT: pshufd {{.*#+}} xmm5 = xmm5[1,1,3,3] +; SSE2-NEXT: por %xmm3, %xmm5 +; SSE2-NEXT: pand %xmm4, %xmm5 +; SSE2-NEXT: movdqa %xmm7, %xmm3 +; SSE2-NEXT: pxor %xmm11, %xmm3 +; SSE2-NEXT: movdqa %xmm3, %xmm4 +; SSE2-NEXT: pcmpgtd %xmm11, %xmm4 +; SSE2-NEXT: 
pshufd {{.*#+}} xmm6 = xmm4[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm11, %xmm3 +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm3[1,1,3,3] +; SSE2-NEXT: pand %xmm6, %xmm0 ; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm4[1,1,3,3] -; SSE2-NEXT: por %xmm7, %xmm3 -; SSE2-NEXT: pand %xmm3, %xmm0 -; SSE2-NEXT: pandn %xmm8, %xmm3 ; SSE2-NEXT: por %xmm0, %xmm3 -; SSE2-NEXT: movdqa %xmm1, %xmm0 -; SSE2-NEXT: pxor %xmm10, %xmm0 -; SSE2-NEXT: movdqa %xmm9, %xmm4 -; SSE2-NEXT: pcmpgtd %xmm0, %xmm4 -; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm4[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm9, %xmm0 +; SSE2-NEXT: pand %xmm7, %xmm3 +; SSE2-NEXT: packuswb %xmm5, %xmm3 +; SSE2-NEXT: movdqa %xmm2, %xmm0 +; SSE2-NEXT: pxor %xmm11, %xmm0 +; SSE2-NEXT: movdqa %xmm0, %xmm4 +; SSE2-NEXT: pcmpgtd %xmm11, %xmm4 +; SSE2-NEXT: pshufd {{.*#+}} xmm5 = xmm4[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm11, %xmm0 ; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] -; SSE2-NEXT: pand %xmm6, %xmm0 +; SSE2-NEXT: pand %xmm5, %xmm0 ; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm4[1,1,3,3] ; SSE2-NEXT: por %xmm0, %xmm4 -; SSE2-NEXT: pand %xmm4, %xmm1 -; SSE2-NEXT: pandn %xmm8, %xmm4 -; SSE2-NEXT: por %xmm1, %xmm4 -; SSE2-NEXT: movdqa %xmm4, %xmm0 -; SSE2-NEXT: pxor %xmm10, %xmm0 -; SSE2-NEXT: movdqa %xmm0, %xmm1 -; SSE2-NEXT: pcmpgtd %xmm10, %xmm1 -; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm1[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm10, %xmm0 +; SSE2-NEXT: pand %xmm2, %xmm4 +; SSE2-NEXT: movdqa %xmm1, %xmm0 +; SSE2-NEXT: pxor %xmm11, %xmm0 +; SSE2-NEXT: movdqa %xmm0, %xmm2 +; SSE2-NEXT: pcmpgtd %xmm11, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm5 = xmm2[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm11, %xmm0 ; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] -; SSE2-NEXT: pand %xmm6, %xmm0 -; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] -; SSE2-NEXT: por %xmm0, %xmm1 -; SSE2-NEXT: pand %xmm4, %xmm1 -; SSE2-NEXT: movdqa %xmm3, %xmm0 -; SSE2-NEXT: pxor %xmm10, %xmm0 -; SSE2-NEXT: movdqa %xmm0, %xmm4 -; SSE2-NEXT: pcmpgtd %xmm10, %xmm4 -; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm4[0,0,2,2] 
-; SSE2-NEXT: pcmpeqd %xmm10, %xmm0 -; SSE2-NEXT: pshufd {{.*#+}} xmm7 = xmm0[1,1,3,3] -; SSE2-NEXT: pand %xmm6, %xmm7 -; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm4[1,1,3,3] -; SSE2-NEXT: por %xmm7, %xmm0 -; SSE2-NEXT: pand %xmm3, %xmm0 -; SSE2-NEXT: packuswb %xmm1, %xmm0 -; SSE2-NEXT: movdqa %xmm2, %xmm1 -; SSE2-NEXT: pxor %xmm10, %xmm1 -; SSE2-NEXT: movdqa %xmm1, %xmm3 -; SSE2-NEXT: pcmpgtd %xmm10, %xmm3 -; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm3[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm10, %xmm1 -; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] -; SSE2-NEXT: pand %xmm4, %xmm1 -; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm3[1,1,3,3] -; SSE2-NEXT: por %xmm1, %xmm3 -; SSE2-NEXT: pand %xmm2, %xmm3 -; SSE2-NEXT: movdqa %xmm5, %xmm1 -; SSE2-NEXT: pxor %xmm10, %xmm1 -; SSE2-NEXT: movdqa %xmm1, %xmm2 -; SSE2-NEXT: pcmpgtd %xmm10, %xmm2 -; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm2[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm10, %xmm1 -; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] -; SSE2-NEXT: pand %xmm4, %xmm1 +; SSE2-NEXT: pand %xmm5, %xmm0 ; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] -; SSE2-NEXT: por %xmm1, %xmm2 -; SSE2-NEXT: pand %xmm5, %xmm2 -; SSE2-NEXT: packuswb %xmm3, %xmm2 -; SSE2-NEXT: packuswb %xmm2, %xmm0 -; SSE2-NEXT: packuswb %xmm0, %xmm0 -; SSE2-NEXT: movq %xmm0, (%rdi) +; SSE2-NEXT: por %xmm0, %xmm2 +; SSE2-NEXT: pand %xmm1, %xmm2 +; SSE2-NEXT: packuswb %xmm4, %xmm2 +; SSE2-NEXT: packuswb %xmm2, %xmm3 +; SSE2-NEXT: packuswb %xmm0, %xmm3 +; SSE2-NEXT: movq %xmm3, (%rsi) ; SSE2-NEXT: retq ; ; SSSE3-LABEL: trunc_packus_v8i64_v8i8_store: ; SSSE3: # %bb.0: +; SSSE3-NEXT: movdqa (%rdi), %xmm5 +; SSSE3-NEXT: movdqa 16(%rdi), %xmm9 +; SSSE3-NEXT: movdqa 32(%rdi), %xmm2 +; SSSE3-NEXT: movdqa 48(%rdi), %xmm7 ; SSSE3-NEXT: movdqa {{.*#+}} xmm8 = [255,255] -; SSSE3-NEXT: movdqa {{.*#+}} xmm10 = [2147483648,2147483648] -; SSSE3-NEXT: movdqa %xmm2, %xmm5 -; SSSE3-NEXT: pxor %xmm10, %xmm5 -; SSSE3-NEXT: movdqa {{.*#+}} xmm9 = [2147483903,2147483903] -; SSSE3-NEXT: movdqa %xmm9, %xmm7 -; 
SSSE3-NEXT: pcmpgtd %xmm5, %xmm7 -; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm7[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm9, %xmm5 -; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm5[1,1,3,3] -; SSSE3-NEXT: pand %xmm6, %xmm4 -; SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm7[1,1,3,3] -; SSSE3-NEXT: por %xmm4, %xmm5 -; SSSE3-NEXT: pand %xmm5, %xmm2 -; SSSE3-NEXT: pandn %xmm8, %xmm5 -; SSSE3-NEXT: por %xmm2, %xmm5 -; SSSE3-NEXT: movdqa %xmm3, %xmm2 -; SSSE3-NEXT: pxor %xmm10, %xmm2 -; SSSE3-NEXT: movdqa %xmm9, %xmm4 -; SSSE3-NEXT: pcmpgtd %xmm2, %xmm4 -; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm4[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm9, %xmm2 -; SSSE3-NEXT: pshufd {{.*#+}} xmm7 = xmm2[1,1,3,3] -; SSSE3-NEXT: pand %xmm6, %xmm7 -; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm4[1,1,3,3] -; SSSE3-NEXT: por %xmm7, %xmm2 -; SSSE3-NEXT: pand %xmm2, %xmm3 +; SSSE3-NEXT: movdqa {{.*#+}} xmm11 = [2147483648,2147483648] +; SSSE3-NEXT: movdqa %xmm2, %xmm1 +; SSSE3-NEXT: pxor %xmm11, %xmm1 +; SSSE3-NEXT: movdqa {{.*#+}} xmm10 = [2147483903,2147483903] +; SSSE3-NEXT: movdqa %xmm10, %xmm6 +; SSSE3-NEXT: pcmpgtd %xmm1, %xmm6 +; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm6[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm10, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm1[1,1,3,3] +; SSSE3-NEXT: pand %xmm3, %xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm6[1,1,3,3] +; SSSE3-NEXT: por %xmm4, %xmm1 +; SSSE3-NEXT: pand %xmm1, %xmm2 +; SSSE3-NEXT: pandn %xmm8, %xmm1 +; SSSE3-NEXT: por %xmm2, %xmm1 +; SSSE3-NEXT: movdqa %xmm7, %xmm2 +; SSSE3-NEXT: pxor %xmm11, %xmm2 +; SSSE3-NEXT: movdqa %xmm10, %xmm3 +; SSSE3-NEXT: pcmpgtd %xmm2, %xmm3 +; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm3[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm10, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm2[1,1,3,3] +; SSSE3-NEXT: pand %xmm4, %xmm6 +; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm3[1,1,3,3] +; SSSE3-NEXT: por %xmm6, %xmm2 +; SSSE3-NEXT: pand %xmm2, %xmm7 ; SSSE3-NEXT: pandn %xmm8, %xmm2 -; SSSE3-NEXT: por %xmm3, %xmm2 -; SSSE3-NEXT: movdqa %xmm0, %xmm3 -; SSSE3-NEXT: pxor 
%xmm10, %xmm3 -; SSSE3-NEXT: movdqa %xmm9, %xmm4 +; SSSE3-NEXT: por %xmm7, %xmm2 +; SSSE3-NEXT: movdqa %xmm5, %xmm3 +; SSSE3-NEXT: pxor %xmm11, %xmm3 +; SSSE3-NEXT: movdqa %xmm10, %xmm4 ; SSSE3-NEXT: pcmpgtd %xmm3, %xmm4 ; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm4[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm9, %xmm3 -; SSSE3-NEXT: pshufd {{.*#+}} xmm7 = xmm3[1,1,3,3] -; SSSE3-NEXT: pand %xmm6, %xmm7 +; SSSE3-NEXT: pcmpeqd %xmm10, %xmm3 +; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm3[1,1,3,3] +; SSSE3-NEXT: pand %xmm6, %xmm3 +; SSSE3-NEXT: pshufd {{.*#+}} xmm7 = xmm4[1,1,3,3] +; SSSE3-NEXT: por %xmm3, %xmm7 +; SSSE3-NEXT: pand %xmm7, %xmm5 +; SSSE3-NEXT: pandn %xmm8, %xmm7 +; SSSE3-NEXT: por %xmm5, %xmm7 +; SSSE3-NEXT: movdqa %xmm9, %xmm3 +; SSSE3-NEXT: pxor %xmm11, %xmm3 +; SSSE3-NEXT: movdqa %xmm10, %xmm4 +; SSSE3-NEXT: pcmpgtd %xmm3, %xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm4[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm10, %xmm3 +; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm3[1,1,3,3] +; SSSE3-NEXT: pand %xmm5, %xmm3 +; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm4[1,1,3,3] +; SSSE3-NEXT: por %xmm3, %xmm4 +; SSSE3-NEXT: pand %xmm4, %xmm9 +; SSSE3-NEXT: pandn %xmm8, %xmm4 +; SSSE3-NEXT: por %xmm9, %xmm4 +; SSSE3-NEXT: movdqa %xmm4, %xmm3 +; SSSE3-NEXT: pxor %xmm11, %xmm3 +; SSSE3-NEXT: movdqa %xmm3, %xmm5 +; SSSE3-NEXT: pcmpgtd %xmm11, %xmm5 +; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm5[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm11, %xmm3 +; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm3[1,1,3,3] +; SSSE3-NEXT: pand %xmm6, %xmm3 +; SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm5[1,1,3,3] +; SSSE3-NEXT: por %xmm3, %xmm5 +; SSSE3-NEXT: pand %xmm4, %xmm5 +; SSSE3-NEXT: movdqa %xmm7, %xmm3 +; SSSE3-NEXT: pxor %xmm11, %xmm3 +; SSSE3-NEXT: movdqa %xmm3, %xmm4 +; SSSE3-NEXT: pcmpgtd %xmm11, %xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm4[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm11, %xmm3 +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm3[1,1,3,3] +; SSSE3-NEXT: pand %xmm6, %xmm0 ; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm4[1,1,3,3] -; 
SSSE3-NEXT: por %xmm7, %xmm3 -; SSSE3-NEXT: pand %xmm3, %xmm0 -; SSSE3-NEXT: pandn %xmm8, %xmm3 ; SSSE3-NEXT: por %xmm0, %xmm3 -; SSSE3-NEXT: movdqa %xmm1, %xmm0 -; SSSE3-NEXT: pxor %xmm10, %xmm0 -; SSSE3-NEXT: movdqa %xmm9, %xmm4 -; SSSE3-NEXT: pcmpgtd %xmm0, %xmm4 -; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm4[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm9, %xmm0 +; SSSE3-NEXT: pand %xmm7, %xmm3 +; SSSE3-NEXT: packuswb %xmm5, %xmm3 +; SSSE3-NEXT: movdqa %xmm2, %xmm0 +; SSSE3-NEXT: pxor %xmm11, %xmm0 +; SSSE3-NEXT: movdqa %xmm0, %xmm4 +; SSSE3-NEXT: pcmpgtd %xmm11, %xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm4[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm11, %xmm0 ; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] -; SSSE3-NEXT: pand %xmm6, %xmm0 +; SSSE3-NEXT: pand %xmm5, %xmm0 ; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm4[1,1,3,3] ; SSSE3-NEXT: por %xmm0, %xmm4 -; SSSE3-NEXT: pand %xmm4, %xmm1 -; SSSE3-NEXT: pandn %xmm8, %xmm4 -; SSSE3-NEXT: por %xmm1, %xmm4 -; SSSE3-NEXT: movdqa %xmm4, %xmm0 -; SSSE3-NEXT: pxor %xmm10, %xmm0 -; SSSE3-NEXT: movdqa %xmm0, %xmm1 -; SSSE3-NEXT: pcmpgtd %xmm10, %xmm1 -; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm1[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm10, %xmm0 +; SSSE3-NEXT: pand %xmm2, %xmm4 +; SSSE3-NEXT: movdqa %xmm1, %xmm0 +; SSSE3-NEXT: pxor %xmm11, %xmm0 +; SSSE3-NEXT: movdqa %xmm0, %xmm2 +; SSSE3-NEXT: pcmpgtd %xmm11, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm2[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm11, %xmm0 ; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] -; SSSE3-NEXT: pand %xmm6, %xmm0 -; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] -; SSSE3-NEXT: por %xmm0, %xmm1 -; SSSE3-NEXT: pand %xmm4, %xmm1 -; SSSE3-NEXT: movdqa %xmm3, %xmm0 -; SSSE3-NEXT: pxor %xmm10, %xmm0 -; SSSE3-NEXT: movdqa %xmm0, %xmm4 -; SSSE3-NEXT: pcmpgtd %xmm10, %xmm4 -; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm4[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm10, %xmm0 -; SSSE3-NEXT: pshufd {{.*#+}} xmm7 = xmm0[1,1,3,3] -; SSSE3-NEXT: pand %xmm6, %xmm7 -; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = 
xmm4[1,1,3,3] -; SSSE3-NEXT: por %xmm7, %xmm0 -; SSSE3-NEXT: pand %xmm3, %xmm0 -; SSSE3-NEXT: packuswb %xmm1, %xmm0 -; SSSE3-NEXT: movdqa %xmm2, %xmm1 -; SSSE3-NEXT: pxor %xmm10, %xmm1 -; SSSE3-NEXT: movdqa %xmm1, %xmm3 -; SSSE3-NEXT: pcmpgtd %xmm10, %xmm3 -; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm3[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm10, %xmm1 -; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] -; SSSE3-NEXT: pand %xmm4, %xmm1 -; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm3[1,1,3,3] -; SSSE3-NEXT: por %xmm1, %xmm3 -; SSSE3-NEXT: pand %xmm2, %xmm3 -; SSSE3-NEXT: movdqa %xmm5, %xmm1 -; SSSE3-NEXT: pxor %xmm10, %xmm1 -; SSSE3-NEXT: movdqa %xmm1, %xmm2 -; SSSE3-NEXT: pcmpgtd %xmm10, %xmm2 -; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm2[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm10, %xmm1 -; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] -; SSSE3-NEXT: pand %xmm4, %xmm1 +; SSSE3-NEXT: pand %xmm5, %xmm0 ; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] -; SSSE3-NEXT: por %xmm1, %xmm2 -; SSSE3-NEXT: pand %xmm5, %xmm2 -; SSSE3-NEXT: packuswb %xmm3, %xmm2 -; SSSE3-NEXT: packuswb %xmm2, %xmm0 -; SSSE3-NEXT: packuswb %xmm0, %xmm0 -; SSSE3-NEXT: movq %xmm0, (%rdi) +; SSSE3-NEXT: por %xmm0, %xmm2 +; SSSE3-NEXT: pand %xmm1, %xmm2 +; SSSE3-NEXT: packuswb %xmm4, %xmm2 +; SSSE3-NEXT: packuswb %xmm2, %xmm3 +; SSSE3-NEXT: packuswb %xmm0, %xmm3 +; SSSE3-NEXT: movq %xmm3, (%rsi) ; SSSE3-NEXT: retq ; ; SSE41-LABEL: trunc_packus_v8i64_v8i8_store: ; SSE41: # %bb.0: -; SSE41-NEXT: movdqa %xmm0, %xmm9 -; SSE41-NEXT: movapd {{.*#+}} xmm7 = [255,255] -; SSE41-NEXT: movdqa {{.*#+}} xmm10 = [2147483648,2147483648] +; SSE41-NEXT: movdqa (%rdi), %xmm10 +; SSE41-NEXT: movdqa 16(%rdi), %xmm9 +; SSE41-NEXT: movdqa 32(%rdi), %xmm2 +; SSE41-NEXT: movdqa 48(%rdi), %xmm5 +; SSE41-NEXT: movapd {{.*#+}} xmm4 = [255,255] +; SSE41-NEXT: movdqa {{.*#+}} xmm1 = [2147483648,2147483648] ; SSE41-NEXT: movdqa %xmm2, %xmm0 -; SSE41-NEXT: pxor %xmm10, %xmm0 -; SSE41-NEXT: movdqa {{.*#+}} xmm4 = [2147483903,2147483903] -; 
SSE41-NEXT: movdqa %xmm4, %xmm5 -; SSE41-NEXT: pcmpeqd %xmm0, %xmm5 -; SSE41-NEXT: movdqa %xmm4, %xmm6 +; SSE41-NEXT: pxor %xmm1, %xmm0 +; SSE41-NEXT: movdqa {{.*#+}} xmm3 = [2147483903,2147483903] +; SSE41-NEXT: movdqa %xmm3, %xmm7 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm7 +; SSE41-NEXT: movdqa %xmm3, %xmm6 ; SSE41-NEXT: pcmpgtd %xmm0, %xmm6 ; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm6[0,0,2,2] -; SSE41-NEXT: pand %xmm5, %xmm0 +; SSE41-NEXT: pand %xmm7, %xmm0 ; SSE41-NEXT: por %xmm6, %xmm0 -; SSE41-NEXT: movapd %xmm7, %xmm8 +; SSE41-NEXT: movapd %xmm4, %xmm8 ; SSE41-NEXT: blendvpd %xmm0, %xmm2, %xmm8 -; SSE41-NEXT: movdqa %xmm3, %xmm0 -; SSE41-NEXT: pxor %xmm10, %xmm0 -; SSE41-NEXT: movdqa %xmm4, %xmm2 +; SSE41-NEXT: movdqa %xmm5, %xmm0 +; SSE41-NEXT: pxor %xmm1, %xmm0 +; SSE41-NEXT: movdqa %xmm3, %xmm2 ; SSE41-NEXT: pcmpeqd %xmm0, %xmm2 -; SSE41-NEXT: movdqa %xmm4, %xmm5 -; SSE41-NEXT: pcmpgtd %xmm0, %xmm5 -; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm5[0,0,2,2] +; SSE41-NEXT: movdqa %xmm3, %xmm6 +; SSE41-NEXT: pcmpgtd %xmm0, %xmm6 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm6[0,0,2,2] ; SSE41-NEXT: pand %xmm2, %xmm0 -; SSE41-NEXT: por %xmm5, %xmm0 -; SSE41-NEXT: movapd %xmm7, %xmm2 -; SSE41-NEXT: blendvpd %xmm0, %xmm3, %xmm2 -; SSE41-NEXT: movdqa %xmm9, %xmm0 -; SSE41-NEXT: pxor %xmm10, %xmm0 -; SSE41-NEXT: movdqa %xmm4, %xmm3 -; SSE41-NEXT: pcmpeqd %xmm0, %xmm3 -; SSE41-NEXT: movdqa %xmm4, %xmm5 +; SSE41-NEXT: por %xmm6, %xmm0 +; SSE41-NEXT: movapd %xmm4, %xmm6 +; SSE41-NEXT: blendvpd %xmm0, %xmm5, %xmm6 +; SSE41-NEXT: movdqa %xmm10, %xmm0 +; SSE41-NEXT: pxor %xmm1, %xmm0 +; SSE41-NEXT: movdqa %xmm3, %xmm2 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm2 +; SSE41-NEXT: movdqa %xmm3, %xmm5 ; SSE41-NEXT: pcmpgtd %xmm0, %xmm5 ; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm5[0,0,2,2] -; SSE41-NEXT: pand %xmm3, %xmm0 -; SSE41-NEXT: por %xmm5, %xmm0 -; SSE41-NEXT: movapd %xmm7, %xmm3 -; SSE41-NEXT: blendvpd %xmm0, %xmm9, %xmm3 -; SSE41-NEXT: movdqa %xmm1, %xmm0 -; SSE41-NEXT: pxor %xmm10, %xmm0 -; 
SSE41-NEXT: movdqa %xmm4, %xmm5 +; SSE41-NEXT: pand %xmm2, %xmm0 +; SSE41-NEXT: por %xmm5, %xmm0 +; SSE41-NEXT: movapd %xmm4, %xmm2 +; SSE41-NEXT: blendvpd %xmm0, %xmm10, %xmm2 +; SSE41-NEXT: movdqa %xmm9, %xmm0 +; SSE41-NEXT: pxor %xmm1, %xmm0 +; SSE41-NEXT: movdqa %xmm3, %xmm5 ; SSE41-NEXT: pcmpeqd %xmm0, %xmm5 -; SSE41-NEXT: pcmpgtd %xmm0, %xmm4 -; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm4[0,0,2,2] -; SSE41-NEXT: pand %xmm5, %xmm0 -; SSE41-NEXT: por %xmm4, %xmm0 -; SSE41-NEXT: blendvpd %xmm0, %xmm1, %xmm7 -; SSE41-NEXT: xorpd %xmm1, %xmm1 -; SSE41-NEXT: movapd %xmm7, %xmm4 -; SSE41-NEXT: xorpd %xmm10, %xmm4 -; SSE41-NEXT: movapd %xmm4, %xmm5 -; SSE41-NEXT: pcmpeqd %xmm10, %xmm5 -; SSE41-NEXT: pcmpgtd %xmm10, %xmm4 -; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm4[0,0,2,2] +; SSE41-NEXT: pcmpgtd %xmm0, %xmm3 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm3[0,0,2,2] ; SSE41-NEXT: pand %xmm5, %xmm0 -; SSE41-NEXT: por %xmm4, %xmm0 +; SSE41-NEXT: por %xmm3, %xmm0 +; SSE41-NEXT: blendvpd %xmm0, %xmm9, %xmm4 ; SSE41-NEXT: pxor %xmm5, %xmm5 -; SSE41-NEXT: blendvpd %xmm0, %xmm7, %xmm5 -; SSE41-NEXT: movapd %xmm3, %xmm4 -; SSE41-NEXT: xorpd %xmm10, %xmm4 -; SSE41-NEXT: movapd %xmm4, %xmm6 -; SSE41-NEXT: pcmpeqd %xmm10, %xmm6 -; SSE41-NEXT: pcmpgtd %xmm10, %xmm4 +; SSE41-NEXT: movapd %xmm4, %xmm3 +; SSE41-NEXT: xorpd %xmm1, %xmm3 +; SSE41-NEXT: movapd %xmm3, %xmm7 +; SSE41-NEXT: pcmpeqd %xmm1, %xmm7 +; SSE41-NEXT: pcmpgtd %xmm1, %xmm3 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm3[0,0,2,2] +; SSE41-NEXT: pand %xmm7, %xmm0 +; SSE41-NEXT: por %xmm3, %xmm0 +; SSE41-NEXT: pxor %xmm3, %xmm3 +; SSE41-NEXT: blendvpd %xmm0, %xmm4, %xmm3 +; SSE41-NEXT: movapd %xmm2, %xmm4 +; SSE41-NEXT: xorpd %xmm1, %xmm4 +; SSE41-NEXT: movapd %xmm4, %xmm7 +; SSE41-NEXT: pcmpeqd %xmm1, %xmm7 +; SSE41-NEXT: pcmpgtd %xmm1, %xmm4 ; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm4[0,0,2,2] -; SSE41-NEXT: pand %xmm6, %xmm0 +; SSE41-NEXT: pand %xmm7, %xmm0 ; SSE41-NEXT: por %xmm4, %xmm0 ; SSE41-NEXT: pxor %xmm4, %xmm4 -; 
SSE41-NEXT: blendvpd %xmm0, %xmm3, %xmm4 -; SSE41-NEXT: packusdw %xmm5, %xmm4 +; SSE41-NEXT: blendvpd %xmm0, %xmm2, %xmm4 +; SSE41-NEXT: packusdw %xmm3, %xmm4 +; SSE41-NEXT: movapd %xmm6, %xmm2 +; SSE41-NEXT: xorpd %xmm1, %xmm2 ; SSE41-NEXT: movapd %xmm2, %xmm3 -; SSE41-NEXT: xorpd %xmm10, %xmm3 -; SSE41-NEXT: movapd %xmm3, %xmm5 -; SSE41-NEXT: pcmpeqd %xmm10, %xmm5 -; SSE41-NEXT: pcmpgtd %xmm10, %xmm3 -; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm3[0,0,2,2] -; SSE41-NEXT: pand %xmm5, %xmm0 -; SSE41-NEXT: por %xmm3, %xmm0 -; SSE41-NEXT: pxor %xmm3, %xmm3 -; SSE41-NEXT: blendvpd %xmm0, %xmm2, %xmm3 -; SSE41-NEXT: movapd %xmm8, %xmm2 -; SSE41-NEXT: xorpd %xmm10, %xmm2 -; SSE41-NEXT: movapd %xmm2, %xmm5 -; SSE41-NEXT: pcmpeqd %xmm10, %xmm5 -; SSE41-NEXT: pcmpgtd %xmm10, %xmm2 +; SSE41-NEXT: pcmpeqd %xmm1, %xmm3 +; SSE41-NEXT: pcmpgtd %xmm1, %xmm2 ; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm2[0,0,2,2] -; SSE41-NEXT: pand %xmm5, %xmm0 +; SSE41-NEXT: pand %xmm3, %xmm0 ; SSE41-NEXT: por %xmm2, %xmm0 -; SSE41-NEXT: blendvpd %xmm0, %xmm8, %xmm1 -; SSE41-NEXT: packusdw %xmm3, %xmm1 -; SSE41-NEXT: packusdw %xmm1, %xmm4 +; SSE41-NEXT: pxor %xmm2, %xmm2 +; SSE41-NEXT: blendvpd %xmm0, %xmm6, %xmm2 +; SSE41-NEXT: movapd %xmm8, %xmm3 +; SSE41-NEXT: xorpd %xmm1, %xmm3 +; SSE41-NEXT: movapd %xmm3, %xmm6 +; SSE41-NEXT: pcmpeqd %xmm1, %xmm6 +; SSE41-NEXT: pcmpgtd %xmm1, %xmm3 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm3[0,0,2,2] +; SSE41-NEXT: pand %xmm6, %xmm0 +; SSE41-NEXT: por %xmm3, %xmm0 +; SSE41-NEXT: blendvpd %xmm0, %xmm8, %xmm5 +; SSE41-NEXT: packusdw %xmm2, %xmm5 +; SSE41-NEXT: packusdw %xmm5, %xmm4 ; SSE41-NEXT: packuswb %xmm0, %xmm4 -; SSE41-NEXT: movq %xmm4, (%rdi) +; SSE41-NEXT: movq %xmm4, (%rsi) ; SSE41-NEXT: retq ; ; AVX1-LABEL: trunc_packus_v8i64_v8i8_store: ; AVX1: # %bb.0: -; AVX1-NEXT: vextractf128 $1, %ymm1, %xmm2 -; AVX1-NEXT: vmovdqa {{.*#+}} xmm3 = [255,255] -; AVX1-NEXT: vpcmpgtq %xmm2, %xmm3, %xmm8 -; AVX1-NEXT: vpcmpgtq %xmm1, %xmm3, %xmm5 -; AVX1-NEXT: vextractf128 
$1, %ymm0, %xmm6 -; AVX1-NEXT: vpcmpgtq %xmm6, %xmm3, %xmm7 -; AVX1-NEXT: vpcmpgtq %xmm0, %xmm3, %xmm4 -; AVX1-NEXT: vblendvpd %xmm4, %xmm0, %xmm3, %xmm0 -; AVX1-NEXT: vpxor %xmm4, %xmm4, %xmm4 -; AVX1-NEXT: vpcmpgtq %xmm4, %xmm0, %xmm9 -; AVX1-NEXT: vblendvpd %xmm7, %xmm6, %xmm3, %xmm6 -; AVX1-NEXT: vpcmpgtq %xmm4, %xmm6, %xmm7 -; AVX1-NEXT: vblendvpd %xmm5, %xmm1, %xmm3, %xmm1 -; AVX1-NEXT: vpcmpgtq %xmm4, %xmm1, %xmm5 -; AVX1-NEXT: vblendvpd %xmm8, %xmm2, %xmm3, %xmm2 -; AVX1-NEXT: vpcmpgtq %xmm4, %xmm2, %xmm3 -; AVX1-NEXT: vpand %xmm2, %xmm3, %xmm2 -; AVX1-NEXT: vpand %xmm1, %xmm5, %xmm1 -; AVX1-NEXT: vpackusdw %xmm2, %xmm1, %xmm1 -; AVX1-NEXT: vpand %xmm6, %xmm7, %xmm2 +; AVX1-NEXT: vmovdqa (%rdi), %xmm0 +; AVX1-NEXT: vmovdqa 16(%rdi), %xmm1 +; AVX1-NEXT: vmovdqa 32(%rdi), %xmm2 +; AVX1-NEXT: vmovdqa 48(%rdi), %xmm3 +; AVX1-NEXT: vmovdqa {{.*#+}} xmm4 = [255,255] +; AVX1-NEXT: vpcmpgtq %xmm3, %xmm4, %xmm8 +; AVX1-NEXT: vpcmpgtq %xmm2, %xmm4, %xmm6 +; AVX1-NEXT: vpcmpgtq %xmm1, %xmm4, %xmm7 +; AVX1-NEXT: vpcmpgtq %xmm0, %xmm4, %xmm5 +; AVX1-NEXT: vblendvpd %xmm5, %xmm0, %xmm4, %xmm0 +; AVX1-NEXT: vpxor %xmm5, %xmm5, %xmm5 +; AVX1-NEXT: vpcmpgtq %xmm5, %xmm0, %xmm9 +; AVX1-NEXT: vblendvpd %xmm7, %xmm1, %xmm4, %xmm1 +; AVX1-NEXT: vpcmpgtq %xmm5, %xmm1, %xmm7 +; AVX1-NEXT: vblendvpd %xmm6, %xmm2, %xmm4, %xmm2 +; AVX1-NEXT: vpcmpgtq %xmm5, %xmm2, %xmm6 +; AVX1-NEXT: vblendvpd %xmm8, %xmm3, %xmm4, %xmm3 +; AVX1-NEXT: vpcmpgtq %xmm5, %xmm3, %xmm4 +; AVX1-NEXT: vpand %xmm3, %xmm4, %xmm3 +; AVX1-NEXT: vpand %xmm2, %xmm6, %xmm2 +; AVX1-NEXT: vpackusdw %xmm3, %xmm2, %xmm2 +; AVX1-NEXT: vpand %xmm1, %xmm7, %xmm1 ; AVX1-NEXT: vpand %xmm0, %xmm9, %xmm0 -; AVX1-NEXT: vpackusdw %xmm2, %xmm0, %xmm0 ; AVX1-NEXT: vpackusdw %xmm1, %xmm0, %xmm0 +; AVX1-NEXT: vpackusdw %xmm2, %xmm0, %xmm0 ; AVX1-NEXT: vpackuswb %xmm0, %xmm0, %xmm0 -; AVX1-NEXT: vmovq %xmm0, (%rdi) -; AVX1-NEXT: vzeroupper +; AVX1-NEXT: vmovq %xmm0, (%rsi) ; AVX1-NEXT: retq ; ; AVX2-LABEL: 
trunc_packus_v8i64_v8i8_store: ; AVX2: # %bb.0: +; AVX2-NEXT: vmovdqa (%rdi), %ymm0 +; AVX2-NEXT: vmovdqa 32(%rdi), %ymm1 ; AVX2-NEXT: vpbroadcastq {{.*#+}} ymm2 = [255,255,255,255] ; AVX2-NEXT: vpcmpgtq %ymm1, %ymm2, %ymm3 ; AVX2-NEXT: vblendvpd %ymm3, %ymm1, %ymm2, %ymm1 @@ -3247,17 +3429,30 @@ define void @trunc_packus_v8i64_v8i8_sto ; AVX2-NEXT: vpshufb %xmm3, %xmm0, %xmm0 ; AVX2-NEXT: vpunpcklwd {{.*#+}} xmm0 = xmm0[0],xmm2[0],xmm0[1],xmm2[1],xmm0[2],xmm2[2],xmm0[3],xmm2[3] ; AVX2-NEXT: vpblendd {{.*#+}} xmm0 = xmm0[0],xmm1[1],xmm0[2,3] -; AVX2-NEXT: vmovq %xmm0, (%rdi) +; AVX2-NEXT: vmovq %xmm0, (%rsi) ; AVX2-NEXT: vzeroupper ; AVX2-NEXT: retq ; ; AVX512-LABEL: trunc_packus_v8i64_v8i8_store: ; AVX512: # %bb.0: -; AVX512-NEXT: vpxor %xmm1, %xmm1, %xmm1 -; AVX512-NEXT: vpmaxsq %zmm1, %zmm0, %zmm0 -; AVX512-NEXT: vpmovusqb %zmm0, (%rdi) +; AVX512-NEXT: vpxor %xmm0, %xmm0, %xmm0 +; AVX512-NEXT: vpmaxsq (%rdi), %zmm0, %zmm0 +; AVX512-NEXT: vpmovusqb %zmm0, (%rsi) ; AVX512-NEXT: vzeroupper ; AVX512-NEXT: retq +; +; SKX-LABEL: trunc_packus_v8i64_v8i8_store: +; SKX: # %bb.0: +; SKX-NEXT: vpxor %xmm0, %xmm0, %xmm0 +; SKX-NEXT: vpmaxsq 32(%rdi), %ymm0, %ymm1 +; SKX-NEXT: vpmovusqb %ymm1, %xmm1 +; SKX-NEXT: vpmaxsq (%rdi), %ymm0, %ymm0 +; SKX-NEXT: vpmovusqb %ymm0, %xmm0 +; SKX-NEXT: vpunpckldq {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1] +; SKX-NEXT: vmovq %xmm0, (%rsi) +; SKX-NEXT: vzeroupper +; SKX-NEXT: retq + %a0 = load <8 x i64>, <8 x i64>* %p0 %1 = icmp slt <8 x i64> %a0, %2 = select <8 x i1> %1, <8 x i64> %a0, <8 x i64> %3 = icmp sgt <8 x i64> %2, zeroinitializer @@ -3267,653 +3462,683 @@ define void @trunc_packus_v8i64_v8i8_sto ret void } -define <16 x i8> @trunc_packus_v16i64_v16i8(<16 x i64> %a0) { +define <16 x i8> @trunc_packus_v16i64_v16i8(<16 x i64>* %p0) "min-legal-vector-width"="256" { ; SSE2-LABEL: trunc_packus_v16i64_v16i8: ; SSE2: # %bb.0: -; SSE2-NEXT: movdqa {{.*#+}} xmm10 = [255,255] -; SSE2-NEXT: movdqa {{.*#+}} xmm8 = [2147483648,2147483648] 
-; SSE2-NEXT: movdqa %xmm6, %xmm9 -; SSE2-NEXT: pxor %xmm8, %xmm9 -; SSE2-NEXT: movdqa {{.*#+}} xmm11 = [2147483903,2147483903] -; SSE2-NEXT: movdqa %xmm11, %xmm12 -; SSE2-NEXT: pcmpgtd %xmm9, %xmm12 -; SSE2-NEXT: pshufd {{.*#+}} xmm13 = xmm12[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm11, %xmm9 -; SSE2-NEXT: pshufd {{.*#+}} xmm14 = xmm9[1,1,3,3] -; SSE2-NEXT: pand %xmm13, %xmm14 -; SSE2-NEXT: pshufd {{.*#+}} xmm9 = xmm12[1,1,3,3] -; SSE2-NEXT: por %xmm14, %xmm9 -; SSE2-NEXT: pand %xmm9, %xmm6 -; SSE2-NEXT: pandn %xmm10, %xmm9 -; SSE2-NEXT: por %xmm6, %xmm9 -; SSE2-NEXT: movdqa %xmm7, %xmm6 -; SSE2-NEXT: pxor %xmm8, %xmm6 -; SSE2-NEXT: movdqa %xmm11, %xmm12 -; SSE2-NEXT: pcmpgtd %xmm6, %xmm12 -; SSE2-NEXT: pshufd {{.*#+}} xmm13 = xmm12[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm11, %xmm6 -; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm6[1,1,3,3] -; SSE2-NEXT: pand %xmm13, %xmm6 -; SSE2-NEXT: pshufd {{.*#+}} xmm12 = xmm12[1,1,3,3] -; SSE2-NEXT: por %xmm6, %xmm12 -; SSE2-NEXT: pand %xmm12, %xmm7 -; SSE2-NEXT: pandn %xmm10, %xmm12 -; SSE2-NEXT: por %xmm7, %xmm12 -; SSE2-NEXT: movdqa %xmm4, %xmm6 -; SSE2-NEXT: pxor %xmm8, %xmm6 -; SSE2-NEXT: movdqa %xmm11, %xmm7 -; SSE2-NEXT: pcmpgtd %xmm6, %xmm7 -; SSE2-NEXT: pshufd {{.*#+}} xmm13 = xmm7[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm11, %xmm6 -; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm6[1,1,3,3] -; SSE2-NEXT: pand %xmm13, %xmm6 -; SSE2-NEXT: pshufd {{.*#+}} xmm13 = xmm7[1,1,3,3] -; SSE2-NEXT: por %xmm6, %xmm13 -; SSE2-NEXT: pand %xmm13, %xmm4 -; SSE2-NEXT: pandn %xmm10, %xmm13 -; SSE2-NEXT: por %xmm4, %xmm13 -; SSE2-NEXT: movdqa %xmm5, %xmm4 -; SSE2-NEXT: pxor %xmm8, %xmm4 -; SSE2-NEXT: movdqa %xmm11, %xmm6 +; SSE2-NEXT: movdqa (%rdi), %xmm10 +; SSE2-NEXT: movdqa 16(%rdi), %xmm9 +; SSE2-NEXT: movdqa 32(%rdi), %xmm15 +; SSE2-NEXT: movdqa 48(%rdi), %xmm13 +; SSE2-NEXT: movdqa 80(%rdi), %xmm7 +; SSE2-NEXT: movdqa 64(%rdi), %xmm5 +; SSE2-NEXT: movdqa 112(%rdi), %xmm3 +; SSE2-NEXT: movdqa 96(%rdi), %xmm0 +; SSE2-NEXT: movdqa {{.*#+}} xmm8 = [255,255] +; 
SSE2-NEXT: movdqa {{.*#+}} xmm1 = [2147483648,2147483648] +; SSE2-NEXT: movdqa %xmm0, %xmm4 +; SSE2-NEXT: pxor %xmm1, %xmm4 +; SSE2-NEXT: movdqa {{.*#+}} xmm14 = [2147483903,2147483903] +; SSE2-NEXT: movdqa %xmm14, %xmm6 ; SSE2-NEXT: pcmpgtd %xmm4, %xmm6 -; SSE2-NEXT: pshufd {{.*#+}} xmm7 = xmm6[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm11, %xmm4 -; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm4[1,1,3,3] -; SSE2-NEXT: pand %xmm7, %xmm4 -; SSE2-NEXT: pshufd {{.*#+}} xmm14 = xmm6[1,1,3,3] -; SSE2-NEXT: por %xmm4, %xmm14 -; SSE2-NEXT: pand %xmm14, %xmm5 -; SSE2-NEXT: pandn %xmm10, %xmm14 -; SSE2-NEXT: por %xmm5, %xmm14 -; SSE2-NEXT: movdqa %xmm2, %xmm4 -; SSE2-NEXT: pxor %xmm8, %xmm4 -; SSE2-NEXT: movdqa %xmm11, %xmm5 -; SSE2-NEXT: pcmpgtd %xmm4, %xmm5 -; SSE2-NEXT: pshufd {{.*#+}} xmm7 = xmm5[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm11, %xmm4 +; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm6[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm14, %xmm4 ; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm4[1,1,3,3] -; SSE2-NEXT: pand %xmm7, %xmm4 -; SSE2-NEXT: pshufd {{.*#+}} xmm5 = xmm5[1,1,3,3] -; SSE2-NEXT: por %xmm4, %xmm5 -; SSE2-NEXT: pand %xmm5, %xmm2 -; SSE2-NEXT: pandn %xmm10, %xmm5 -; SSE2-NEXT: por %xmm2, %xmm5 -; SSE2-NEXT: movdqa %xmm3, %xmm2 -; SSE2-NEXT: pxor %xmm8, %xmm2 -; SSE2-NEXT: movdqa %xmm11, %xmm4 -; SSE2-NEXT: pcmpgtd %xmm2, %xmm4 -; SSE2-NEXT: pshufd {{.*#+}} xmm7 = xmm4[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm11, %xmm2 -; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm2[1,1,3,3] -; SSE2-NEXT: pand %xmm7, %xmm6 -; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm4[1,1,3,3] -; SSE2-NEXT: por %xmm6, %xmm2 -; SSE2-NEXT: pand %xmm2, %xmm3 -; SSE2-NEXT: pandn %xmm10, %xmm2 -; SSE2-NEXT: por %xmm3, %xmm2 -; SSE2-NEXT: movdqa %xmm0, %xmm3 -; SSE2-NEXT: pxor %xmm8, %xmm3 -; SSE2-NEXT: movdqa %xmm11, %xmm4 -; SSE2-NEXT: pcmpgtd %xmm3, %xmm4 -; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm4[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm11, %xmm3 -; SSE2-NEXT: pshufd {{.*#+}} xmm7 = xmm3[1,1,3,3] -; SSE2-NEXT: pand %xmm6, %xmm7 -; SSE2-NEXT: pshufd 
{{.*#+}} xmm3 = xmm4[1,1,3,3] -; SSE2-NEXT: por %xmm7, %xmm3 -; SSE2-NEXT: pand %xmm3, %xmm0 -; SSE2-NEXT: pandn %xmm10, %xmm3 -; SSE2-NEXT: por %xmm0, %xmm3 -; SSE2-NEXT: movdqa %xmm1, %xmm0 -; SSE2-NEXT: pxor %xmm8, %xmm0 -; SSE2-NEXT: movdqa %xmm11, %xmm4 -; SSE2-NEXT: pcmpgtd %xmm0, %xmm4 -; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm4[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm11, %xmm0 +; SSE2-NEXT: pand %xmm2, %xmm4 +; SSE2-NEXT: pshufd {{.*#+}} xmm11 = xmm6[1,1,3,3] +; SSE2-NEXT: por %xmm4, %xmm11 +; SSE2-NEXT: pand %xmm11, %xmm0 +; SSE2-NEXT: pandn %xmm8, %xmm11 +; SSE2-NEXT: por %xmm0, %xmm11 +; SSE2-NEXT: movdqa %xmm3, %xmm0 +; SSE2-NEXT: pxor %xmm1, %xmm0 +; SSE2-NEXT: movdqa %xmm14, %xmm2 +; SSE2-NEXT: pcmpgtd %xmm0, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm2[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm14, %xmm0 ; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] -; SSE2-NEXT: pand %xmm6, %xmm0 -; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm4[1,1,3,3] +; SSE2-NEXT: pand %xmm4, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm12 = xmm2[1,1,3,3] +; SSE2-NEXT: por %xmm0, %xmm12 +; SSE2-NEXT: pand %xmm12, %xmm3 +; SSE2-NEXT: pandn %xmm8, %xmm12 +; SSE2-NEXT: por %xmm3, %xmm12 +; SSE2-NEXT: movdqa %xmm5, %xmm0 +; SSE2-NEXT: pxor %xmm1, %xmm0 +; SSE2-NEXT: movdqa %xmm14, %xmm2 +; SSE2-NEXT: pcmpgtd %xmm0, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm2[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm14, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] +; SSE2-NEXT: pand %xmm3, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm2[1,1,3,3] ; SSE2-NEXT: por %xmm0, %xmm4 -; SSE2-NEXT: pand %xmm4, %xmm1 -; SSE2-NEXT: pandn %xmm10, %xmm4 -; SSE2-NEXT: por %xmm1, %xmm4 -; SSE2-NEXT: movdqa %xmm4, %xmm0 -; SSE2-NEXT: pxor %xmm8, %xmm0 -; SSE2-NEXT: movdqa %xmm0, %xmm1 -; SSE2-NEXT: pcmpgtd %xmm8, %xmm1 -; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm1[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm8, %xmm0 +; SSE2-NEXT: pand %xmm4, %xmm5 +; SSE2-NEXT: pandn %xmm8, %xmm4 +; SSE2-NEXT: por %xmm5, %xmm4 +; SSE2-NEXT: movdqa %xmm7, 
%xmm0 +; SSE2-NEXT: pxor %xmm1, %xmm0 +; SSE2-NEXT: movdqa %xmm14, %xmm2 +; SSE2-NEXT: pcmpgtd %xmm0, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm2[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm14, %xmm0 ; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] -; SSE2-NEXT: pand %xmm6, %xmm0 -; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] -; SSE2-NEXT: por %xmm0, %xmm1 -; SSE2-NEXT: pand %xmm4, %xmm1 -; SSE2-NEXT: movdqa %xmm3, %xmm0 -; SSE2-NEXT: pxor %xmm8, %xmm0 -; SSE2-NEXT: movdqa %xmm0, %xmm4 -; SSE2-NEXT: pcmpgtd %xmm8, %xmm4 -; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm4[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm8, %xmm0 -; SSE2-NEXT: pshufd {{.*#+}} xmm7 = xmm0[1,1,3,3] -; SSE2-NEXT: pand %xmm6, %xmm7 -; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm4[1,1,3,3] -; SSE2-NEXT: por %xmm7, %xmm0 ; SSE2-NEXT: pand %xmm3, %xmm0 -; SSE2-NEXT: packuswb %xmm1, %xmm0 -; SSE2-NEXT: movdqa %xmm2, %xmm1 -; SSE2-NEXT: pxor %xmm8, %xmm1 -; SSE2-NEXT: movdqa %xmm1, %xmm3 -; SSE2-NEXT: pcmpgtd %xmm8, %xmm3 -; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm3[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm8, %xmm1 -; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] -; SSE2-NEXT: pand %xmm4, %xmm1 -; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm3[1,1,3,3] -; SSE2-NEXT: por %xmm1, %xmm3 -; SSE2-NEXT: pand %xmm2, %xmm3 -; SSE2-NEXT: movdqa %xmm5, %xmm1 -; SSE2-NEXT: pxor %xmm8, %xmm1 -; SSE2-NEXT: movdqa %xmm1, %xmm2 -; SSE2-NEXT: pcmpgtd %xmm8, %xmm2 -; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm2[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm8, %xmm1 -; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] -; SSE2-NEXT: pand %xmm4, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm5 = xmm2[1,1,3,3] +; SSE2-NEXT: por %xmm0, %xmm5 +; SSE2-NEXT: pand %xmm5, %xmm7 +; SSE2-NEXT: pandn %xmm8, %xmm5 +; SSE2-NEXT: por %xmm7, %xmm5 +; SSE2-NEXT: movdqa %xmm15, %xmm0 +; SSE2-NEXT: pxor %xmm1, %xmm0 +; SSE2-NEXT: movdqa %xmm14, %xmm2 +; SSE2-NEXT: pcmpgtd %xmm0, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm2[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm14, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} 
xmm0 = xmm0[1,1,3,3] +; SSE2-NEXT: pand %xmm3, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm7 = xmm2[1,1,3,3] +; SSE2-NEXT: por %xmm0, %xmm7 +; SSE2-NEXT: pand %xmm7, %xmm15 +; SSE2-NEXT: pandn %xmm8, %xmm7 +; SSE2-NEXT: por %xmm15, %xmm7 +; SSE2-NEXT: movdqa %xmm13, %xmm0 +; SSE2-NEXT: pxor %xmm1, %xmm0 +; SSE2-NEXT: movdqa %xmm14, %xmm2 +; SSE2-NEXT: pcmpgtd %xmm0, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm2[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm14, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] +; SSE2-NEXT: pand %xmm3, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm15 = xmm2[1,1,3,3] +; SSE2-NEXT: por %xmm0, %xmm15 +; SSE2-NEXT: pand %xmm15, %xmm13 +; SSE2-NEXT: pandn %xmm8, %xmm15 +; SSE2-NEXT: por %xmm13, %xmm15 +; SSE2-NEXT: movdqa %xmm10, %xmm0 +; SSE2-NEXT: pxor %xmm1, %xmm0 +; SSE2-NEXT: movdqa %xmm14, %xmm3 +; SSE2-NEXT: pcmpgtd %xmm0, %xmm3 +; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm3[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm14, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] +; SSE2-NEXT: pand %xmm6, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm13 = xmm3[1,1,3,3] +; SSE2-NEXT: por %xmm0, %xmm13 +; SSE2-NEXT: pand %xmm13, %xmm10 +; SSE2-NEXT: pandn %xmm8, %xmm13 +; SSE2-NEXT: por %xmm10, %xmm13 +; SSE2-NEXT: movdqa %xmm9, %xmm0 +; SSE2-NEXT: pxor %xmm1, %xmm0 +; SSE2-NEXT: movdqa %xmm14, %xmm6 +; SSE2-NEXT: pcmpgtd %xmm0, %xmm6 +; SSE2-NEXT: pshufd {{.*#+}} xmm10 = xmm6[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm14, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] +; SSE2-NEXT: pand %xmm10, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm6[1,1,3,3] +; SSE2-NEXT: por %xmm0, %xmm6 +; SSE2-NEXT: pand %xmm6, %xmm9 +; SSE2-NEXT: pandn %xmm8, %xmm6 +; SSE2-NEXT: por %xmm9, %xmm6 +; SSE2-NEXT: movdqa %xmm6, %xmm0 +; SSE2-NEXT: pxor %xmm1, %xmm0 +; SSE2-NEXT: movdqa %xmm0, %xmm2 +; SSE2-NEXT: pcmpgtd %xmm1, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm8 = xmm2[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm1, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] +; SSE2-NEXT: pand 
%xmm8, %xmm0 ; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] -; SSE2-NEXT: por %xmm1, %xmm2 -; SSE2-NEXT: pand %xmm5, %xmm2 -; SSE2-NEXT: packuswb %xmm3, %xmm2 +; SSE2-NEXT: por %xmm0, %xmm2 +; SSE2-NEXT: pand %xmm6, %xmm2 +; SSE2-NEXT: movdqa %xmm13, %xmm0 +; SSE2-NEXT: pxor %xmm1, %xmm0 +; SSE2-NEXT: movdqa %xmm0, %xmm6 +; SSE2-NEXT: pcmpgtd %xmm1, %xmm6 +; SSE2-NEXT: pshufd {{.*#+}} xmm8 = xmm6[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm1, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm0[1,1,3,3] +; SSE2-NEXT: pand %xmm8, %xmm3 +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm6[1,1,3,3] +; SSE2-NEXT: por %xmm3, %xmm0 +; SSE2-NEXT: pand %xmm13, %xmm0 ; SSE2-NEXT: packuswb %xmm2, %xmm0 -; SSE2-NEXT: movdqa %xmm14, %xmm1 -; SSE2-NEXT: pxor %xmm8, %xmm1 -; SSE2-NEXT: movdqa %xmm1, %xmm2 -; SSE2-NEXT: pcmpgtd %xmm8, %xmm2 -; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm2[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm8, %xmm1 -; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] -; SSE2-NEXT: pand %xmm3, %xmm1 +; SSE2-NEXT: movdqa %xmm15, %xmm2 +; SSE2-NEXT: pxor %xmm1, %xmm2 +; SSE2-NEXT: movdqa %xmm2, %xmm3 +; SSE2-NEXT: pcmpgtd %xmm1, %xmm3 +; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm3[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm1, %xmm2 ; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] -; SSE2-NEXT: por %xmm1, %xmm2 -; SSE2-NEXT: pand %xmm14, %xmm2 -; SSE2-NEXT: movdqa %xmm13, %xmm1 -; SSE2-NEXT: pxor %xmm8, %xmm1 -; SSE2-NEXT: movdqa %xmm1, %xmm3 -; SSE2-NEXT: pcmpgtd %xmm8, %xmm3 -; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm3[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm8, %xmm1 -; SSE2-NEXT: pshufd {{.*#+}} xmm5 = xmm1[1,1,3,3] -; SSE2-NEXT: pand %xmm4, %xmm5 -; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm3[1,1,3,3] -; SSE2-NEXT: por %xmm5, %xmm1 -; SSE2-NEXT: pand %xmm13, %xmm1 -; SSE2-NEXT: packuswb %xmm2, %xmm1 -; SSE2-NEXT: movdqa %xmm12, %xmm2 -; SSE2-NEXT: pxor %xmm8, %xmm2 +; SSE2-NEXT: pand %xmm6, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm3[1,1,3,3] +; SSE2-NEXT: por %xmm2, %xmm3 +; SSE2-NEXT: pand %xmm15, %xmm3 +; 
SSE2-NEXT: movdqa %xmm7, %xmm2 +; SSE2-NEXT: pxor %xmm1, %xmm2 +; SSE2-NEXT: movdqa %xmm2, %xmm6 +; SSE2-NEXT: pcmpgtd %xmm1, %xmm6 +; SSE2-NEXT: pshufd {{.*#+}} xmm8 = xmm6[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm1, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] +; SSE2-NEXT: pand %xmm8, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm6[1,1,3,3] +; SSE2-NEXT: por %xmm2, %xmm6 +; SSE2-NEXT: pand %xmm7, %xmm6 +; SSE2-NEXT: packuswb %xmm3, %xmm6 +; SSE2-NEXT: packuswb %xmm6, %xmm0 +; SSE2-NEXT: movdqa %xmm5, %xmm2 +; SSE2-NEXT: pxor %xmm1, %xmm2 ; SSE2-NEXT: movdqa %xmm2, %xmm3 -; SSE2-NEXT: pcmpgtd %xmm8, %xmm3 -; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm3[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm8, %xmm2 +; SSE2-NEXT: pcmpgtd %xmm1, %xmm3 +; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm3[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm1, %xmm2 ; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] -; SSE2-NEXT: pand %xmm4, %xmm2 +; SSE2-NEXT: pand %xmm6, %xmm2 ; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm3[1,1,3,3] ; SSE2-NEXT: por %xmm2, %xmm3 -; SSE2-NEXT: pand %xmm12, %xmm3 -; SSE2-NEXT: movdqa %xmm9, %xmm2 -; SSE2-NEXT: pxor %xmm8, %xmm2 -; SSE2-NEXT: movdqa %xmm2, %xmm4 -; SSE2-NEXT: pcmpgtd %xmm8, %xmm4 +; SSE2-NEXT: pand %xmm5, %xmm3 +; SSE2-NEXT: movdqa %xmm4, %xmm2 +; SSE2-NEXT: pxor %xmm1, %xmm2 +; SSE2-NEXT: movdqa %xmm2, %xmm5 +; SSE2-NEXT: pcmpgtd %xmm1, %xmm5 +; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm5[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm1, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm7 = xmm2[1,1,3,3] +; SSE2-NEXT: pand %xmm6, %xmm7 +; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm5[1,1,3,3] +; SSE2-NEXT: por %xmm7, %xmm2 +; SSE2-NEXT: pand %xmm4, %xmm2 +; SSE2-NEXT: packuswb %xmm3, %xmm2 +; SSE2-NEXT: movdqa %xmm12, %xmm3 +; SSE2-NEXT: pxor %xmm1, %xmm3 +; SSE2-NEXT: movdqa %xmm3, %xmm4 +; SSE2-NEXT: pcmpgtd %xmm1, %xmm4 ; SSE2-NEXT: pshufd {{.*#+}} xmm5 = xmm4[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm8, %xmm2 -; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] -; SSE2-NEXT: pand %xmm5, %xmm2 +; SSE2-NEXT: 
pcmpeqd %xmm1, %xmm3 +; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm3[1,1,3,3] +; SSE2-NEXT: pand %xmm5, %xmm3 ; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm4[1,1,3,3] -; SSE2-NEXT: por %xmm2, %xmm4 -; SSE2-NEXT: pand %xmm9, %xmm4 -; SSE2-NEXT: packuswb %xmm3, %xmm4 -; SSE2-NEXT: packuswb %xmm4, %xmm1 -; SSE2-NEXT: packuswb %xmm1, %xmm0 +; SSE2-NEXT: por %xmm3, %xmm4 +; SSE2-NEXT: pand %xmm12, %xmm4 +; SSE2-NEXT: movdqa %xmm11, %xmm3 +; SSE2-NEXT: pxor %xmm1, %xmm3 +; SSE2-NEXT: movdqa %xmm3, %xmm5 +; SSE2-NEXT: pcmpgtd %xmm1, %xmm5 +; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm5[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm1, %xmm3 +; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm3[1,1,3,3] +; SSE2-NEXT: pand %xmm6, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm5[1,1,3,3] +; SSE2-NEXT: por %xmm1, %xmm3 +; SSE2-NEXT: pand %xmm11, %xmm3 +; SSE2-NEXT: packuswb %xmm4, %xmm3 +; SSE2-NEXT: packuswb %xmm3, %xmm2 +; SSE2-NEXT: packuswb %xmm2, %xmm0 ; SSE2-NEXT: retq ; ; SSSE3-LABEL: trunc_packus_v16i64_v16i8: ; SSSE3: # %bb.0: -; SSSE3-NEXT: movdqa {{.*#+}} xmm10 = [255,255] -; SSSE3-NEXT: movdqa {{.*#+}} xmm8 = [2147483648,2147483648] -; SSSE3-NEXT: movdqa %xmm6, %xmm9 -; SSSE3-NEXT: pxor %xmm8, %xmm9 -; SSSE3-NEXT: movdqa {{.*#+}} xmm11 = [2147483903,2147483903] -; SSSE3-NEXT: movdqa %xmm11, %xmm12 -; SSSE3-NEXT: pcmpgtd %xmm9, %xmm12 -; SSSE3-NEXT: pshufd {{.*#+}} xmm13 = xmm12[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm11, %xmm9 -; SSSE3-NEXT: pshufd {{.*#+}} xmm14 = xmm9[1,1,3,3] -; SSSE3-NEXT: pand %xmm13, %xmm14 -; SSSE3-NEXT: pshufd {{.*#+}} xmm9 = xmm12[1,1,3,3] -; SSSE3-NEXT: por %xmm14, %xmm9 -; SSSE3-NEXT: pand %xmm9, %xmm6 -; SSSE3-NEXT: pandn %xmm10, %xmm9 -; SSSE3-NEXT: por %xmm6, %xmm9 -; SSSE3-NEXT: movdqa %xmm7, %xmm6 -; SSSE3-NEXT: pxor %xmm8, %xmm6 -; SSSE3-NEXT: movdqa %xmm11, %xmm12 -; SSSE3-NEXT: pcmpgtd %xmm6, %xmm12 -; SSSE3-NEXT: pshufd {{.*#+}} xmm13 = xmm12[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm11, %xmm6 -; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm6[1,1,3,3] -; SSSE3-NEXT: pand %xmm13, %xmm6 -; 
SSSE3-NEXT: pshufd {{.*#+}} xmm12 = xmm12[1,1,3,3] -; SSSE3-NEXT: por %xmm6, %xmm12 -; SSSE3-NEXT: pand %xmm12, %xmm7 -; SSSE3-NEXT: pandn %xmm10, %xmm12 -; SSSE3-NEXT: por %xmm7, %xmm12 -; SSSE3-NEXT: movdqa %xmm4, %xmm6 -; SSSE3-NEXT: pxor %xmm8, %xmm6 -; SSSE3-NEXT: movdqa %xmm11, %xmm7 -; SSSE3-NEXT: pcmpgtd %xmm6, %xmm7 -; SSSE3-NEXT: pshufd {{.*#+}} xmm13 = xmm7[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm11, %xmm6 -; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm6[1,1,3,3] -; SSSE3-NEXT: pand %xmm13, %xmm6 -; SSSE3-NEXT: pshufd {{.*#+}} xmm13 = xmm7[1,1,3,3] -; SSSE3-NEXT: por %xmm6, %xmm13 -; SSSE3-NEXT: pand %xmm13, %xmm4 -; SSSE3-NEXT: pandn %xmm10, %xmm13 -; SSSE3-NEXT: por %xmm4, %xmm13 -; SSSE3-NEXT: movdqa %xmm5, %xmm4 -; SSSE3-NEXT: pxor %xmm8, %xmm4 -; SSSE3-NEXT: movdqa %xmm11, %xmm6 +; SSSE3-NEXT: movdqa (%rdi), %xmm10 +; SSSE3-NEXT: movdqa 16(%rdi), %xmm9 +; SSSE3-NEXT: movdqa 32(%rdi), %xmm15 +; SSSE3-NEXT: movdqa 48(%rdi), %xmm13 +; SSSE3-NEXT: movdqa 80(%rdi), %xmm7 +; SSSE3-NEXT: movdqa 64(%rdi), %xmm5 +; SSSE3-NEXT: movdqa 112(%rdi), %xmm3 +; SSSE3-NEXT: movdqa 96(%rdi), %xmm0 +; SSSE3-NEXT: movdqa {{.*#+}} xmm8 = [255,255] +; SSSE3-NEXT: movdqa {{.*#+}} xmm1 = [2147483648,2147483648] +; SSSE3-NEXT: movdqa %xmm0, %xmm4 +; SSSE3-NEXT: pxor %xmm1, %xmm4 +; SSSE3-NEXT: movdqa {{.*#+}} xmm14 = [2147483903,2147483903] +; SSSE3-NEXT: movdqa %xmm14, %xmm6 ; SSSE3-NEXT: pcmpgtd %xmm4, %xmm6 -; SSSE3-NEXT: pshufd {{.*#+}} xmm7 = xmm6[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm11, %xmm4 -; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm4[1,1,3,3] -; SSSE3-NEXT: pand %xmm7, %xmm4 -; SSSE3-NEXT: pshufd {{.*#+}} xmm14 = xmm6[1,1,3,3] -; SSSE3-NEXT: por %xmm4, %xmm14 -; SSSE3-NEXT: pand %xmm14, %xmm5 -; SSSE3-NEXT: pandn %xmm10, %xmm14 -; SSSE3-NEXT: por %xmm5, %xmm14 -; SSSE3-NEXT: movdqa %xmm2, %xmm4 -; SSSE3-NEXT: pxor %xmm8, %xmm4 -; SSSE3-NEXT: movdqa %xmm11, %xmm5 -; SSSE3-NEXT: pcmpgtd %xmm4, %xmm5 -; SSSE3-NEXT: pshufd {{.*#+}} xmm7 = xmm5[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm11, 
%xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm6[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm14, %xmm4 ; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm4[1,1,3,3] -; SSSE3-NEXT: pand %xmm7, %xmm4 -; SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm5[1,1,3,3] -; SSSE3-NEXT: por %xmm4, %xmm5 -; SSSE3-NEXT: pand %xmm5, %xmm2 -; SSSE3-NEXT: pandn %xmm10, %xmm5 -; SSSE3-NEXT: por %xmm2, %xmm5 -; SSSE3-NEXT: movdqa %xmm3, %xmm2 -; SSSE3-NEXT: pxor %xmm8, %xmm2 -; SSSE3-NEXT: movdqa %xmm11, %xmm4 -; SSSE3-NEXT: pcmpgtd %xmm2, %xmm4 -; SSSE3-NEXT: pshufd {{.*#+}} xmm7 = xmm4[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm11, %xmm2 -; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm2[1,1,3,3] -; SSSE3-NEXT: pand %xmm7, %xmm6 -; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm4[1,1,3,3] -; SSSE3-NEXT: por %xmm6, %xmm2 -; SSSE3-NEXT: pand %xmm2, %xmm3 -; SSSE3-NEXT: pandn %xmm10, %xmm2 -; SSSE3-NEXT: por %xmm3, %xmm2 -; SSSE3-NEXT: movdqa %xmm0, %xmm3 -; SSSE3-NEXT: pxor %xmm8, %xmm3 -; SSSE3-NEXT: movdqa %xmm11, %xmm4 -; SSSE3-NEXT: pcmpgtd %xmm3, %xmm4 -; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm4[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm11, %xmm3 -; SSSE3-NEXT: pshufd {{.*#+}} xmm7 = xmm3[1,1,3,3] -; SSSE3-NEXT: pand %xmm6, %xmm7 -; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm4[1,1,3,3] -; SSSE3-NEXT: por %xmm7, %xmm3 +; SSSE3-NEXT: pand %xmm2, %xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm11 = xmm6[1,1,3,3] +; SSSE3-NEXT: por %xmm4, %xmm11 +; SSSE3-NEXT: pand %xmm11, %xmm0 +; SSSE3-NEXT: pandn %xmm8, %xmm11 +; SSSE3-NEXT: por %xmm0, %xmm11 +; SSSE3-NEXT: movdqa %xmm3, %xmm0 +; SSSE3-NEXT: pxor %xmm1, %xmm0 +; SSSE3-NEXT: movdqa %xmm14, %xmm2 +; SSSE3-NEXT: pcmpgtd %xmm0, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm2[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm14, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] +; SSSE3-NEXT: pand %xmm4, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm12 = xmm2[1,1,3,3] +; SSSE3-NEXT: por %xmm0, %xmm12 +; SSSE3-NEXT: pand %xmm12, %xmm3 +; SSSE3-NEXT: pandn %xmm8, %xmm12 +; SSSE3-NEXT: por %xmm3, %xmm12 +; SSSE3-NEXT: 
movdqa %xmm5, %xmm0 +; SSSE3-NEXT: pxor %xmm1, %xmm0 +; SSSE3-NEXT: movdqa %xmm14, %xmm2 +; SSSE3-NEXT: pcmpgtd %xmm0, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm2[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm14, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] +; SSSE3-NEXT: pand %xmm3, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm2[1,1,3,3] +; SSSE3-NEXT: por %xmm0, %xmm4 +; SSSE3-NEXT: pand %xmm4, %xmm5 +; SSSE3-NEXT: pandn %xmm8, %xmm4 +; SSSE3-NEXT: por %xmm5, %xmm4 +; SSSE3-NEXT: movdqa %xmm7, %xmm0 +; SSSE3-NEXT: pxor %xmm1, %xmm0 +; SSSE3-NEXT: movdqa %xmm14, %xmm2 +; SSSE3-NEXT: pcmpgtd %xmm0, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm2[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm14, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] +; SSSE3-NEXT: pand %xmm3, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm2[1,1,3,3] +; SSSE3-NEXT: por %xmm0, %xmm5 +; SSSE3-NEXT: pand %xmm5, %xmm7 +; SSSE3-NEXT: pandn %xmm8, %xmm5 +; SSSE3-NEXT: por %xmm7, %xmm5 +; SSSE3-NEXT: movdqa %xmm15, %xmm0 +; SSSE3-NEXT: pxor %xmm1, %xmm0 +; SSSE3-NEXT: movdqa %xmm14, %xmm2 +; SSSE3-NEXT: pcmpgtd %xmm0, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm2[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm14, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] ; SSSE3-NEXT: pand %xmm3, %xmm0 -; SSSE3-NEXT: pandn %xmm10, %xmm3 -; SSSE3-NEXT: por %xmm0, %xmm3 -; SSSE3-NEXT: movdqa %xmm1, %xmm0 -; SSSE3-NEXT: pxor %xmm8, %xmm0 -; SSSE3-NEXT: movdqa %xmm11, %xmm4 -; SSSE3-NEXT: pcmpgtd %xmm0, %xmm4 -; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm4[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm11, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm7 = xmm2[1,1,3,3] +; SSSE3-NEXT: por %xmm0, %xmm7 +; SSSE3-NEXT: pand %xmm7, %xmm15 +; SSSE3-NEXT: pandn %xmm8, %xmm7 +; SSSE3-NEXT: por %xmm15, %xmm7 +; SSSE3-NEXT: movdqa %xmm13, %xmm0 +; SSSE3-NEXT: pxor %xmm1, %xmm0 +; SSSE3-NEXT: movdqa %xmm14, %xmm2 +; SSSE3-NEXT: pcmpgtd %xmm0, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm2[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm14, 
%xmm0 ; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] -; SSSE3-NEXT: pand %xmm6, %xmm0 -; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm4[1,1,3,3] -; SSSE3-NEXT: por %xmm0, %xmm4 -; SSSE3-NEXT: pand %xmm4, %xmm1 -; SSSE3-NEXT: pandn %xmm10, %xmm4 -; SSSE3-NEXT: por %xmm1, %xmm4 -; SSSE3-NEXT: movdqa %xmm4, %xmm0 -; SSSE3-NEXT: pxor %xmm8, %xmm0 -; SSSE3-NEXT: movdqa %xmm0, %xmm1 -; SSSE3-NEXT: pcmpgtd %xmm8, %xmm1 -; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm1[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm8, %xmm0 +; SSSE3-NEXT: pand %xmm3, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm15 = xmm2[1,1,3,3] +; SSSE3-NEXT: por %xmm0, %xmm15 +; SSSE3-NEXT: pand %xmm15, %xmm13 +; SSSE3-NEXT: pandn %xmm8, %xmm15 +; SSSE3-NEXT: por %xmm13, %xmm15 +; SSSE3-NEXT: movdqa %xmm10, %xmm0 +; SSSE3-NEXT: pxor %xmm1, %xmm0 +; SSSE3-NEXT: movdqa %xmm14, %xmm3 +; SSSE3-NEXT: pcmpgtd %xmm0, %xmm3 +; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm3[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm14, %xmm0 ; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] ; SSSE3-NEXT: pand %xmm6, %xmm0 -; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] -; SSSE3-NEXT: por %xmm0, %xmm1 -; SSSE3-NEXT: pand %xmm4, %xmm1 -; SSSE3-NEXT: movdqa %xmm3, %xmm0 -; SSSE3-NEXT: pxor %xmm8, %xmm0 -; SSSE3-NEXT: movdqa %xmm0, %xmm4 -; SSSE3-NEXT: pcmpgtd %xmm8, %xmm4 -; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm4[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm8, %xmm0 -; SSSE3-NEXT: pshufd {{.*#+}} xmm7 = xmm0[1,1,3,3] -; SSSE3-NEXT: pand %xmm6, %xmm7 -; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm4[1,1,3,3] -; SSSE3-NEXT: por %xmm7, %xmm0 -; SSSE3-NEXT: pand %xmm3, %xmm0 -; SSSE3-NEXT: packuswb %xmm1, %xmm0 -; SSSE3-NEXT: movdqa %xmm2, %xmm1 -; SSSE3-NEXT: pxor %xmm8, %xmm1 -; SSSE3-NEXT: movdqa %xmm1, %xmm3 -; SSSE3-NEXT: pcmpgtd %xmm8, %xmm3 -; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm3[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm8, %xmm1 -; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] -; SSSE3-NEXT: pand %xmm4, %xmm1 -; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm3[1,1,3,3] -; SSSE3-NEXT: 
por %xmm1, %xmm3 -; SSSE3-NEXT: pand %xmm2, %xmm3 -; SSSE3-NEXT: movdqa %xmm5, %xmm1 -; SSSE3-NEXT: pxor %xmm8, %xmm1 -; SSSE3-NEXT: movdqa %xmm1, %xmm2 -; SSSE3-NEXT: pcmpgtd %xmm8, %xmm2 -; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm2[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm8, %xmm1 -; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] -; SSSE3-NEXT: pand %xmm4, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm13 = xmm3[1,1,3,3] +; SSSE3-NEXT: por %xmm0, %xmm13 +; SSSE3-NEXT: pand %xmm13, %xmm10 +; SSSE3-NEXT: pandn %xmm8, %xmm13 +; SSSE3-NEXT: por %xmm10, %xmm13 +; SSSE3-NEXT: movdqa %xmm9, %xmm0 +; SSSE3-NEXT: pxor %xmm1, %xmm0 +; SSSE3-NEXT: movdqa %xmm14, %xmm6 +; SSSE3-NEXT: pcmpgtd %xmm0, %xmm6 +; SSSE3-NEXT: pshufd {{.*#+}} xmm10 = xmm6[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm14, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] +; SSSE3-NEXT: pand %xmm10, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm6[1,1,3,3] +; SSSE3-NEXT: por %xmm0, %xmm6 +; SSSE3-NEXT: pand %xmm6, %xmm9 +; SSSE3-NEXT: pandn %xmm8, %xmm6 +; SSSE3-NEXT: por %xmm9, %xmm6 +; SSSE3-NEXT: movdqa %xmm6, %xmm0 +; SSSE3-NEXT: pxor %xmm1, %xmm0 +; SSSE3-NEXT: movdqa %xmm0, %xmm2 +; SSSE3-NEXT: pcmpgtd %xmm1, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm8 = xmm2[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm1, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] +; SSSE3-NEXT: pand %xmm8, %xmm0 ; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] -; SSSE3-NEXT: por %xmm1, %xmm2 -; SSSE3-NEXT: pand %xmm5, %xmm2 -; SSSE3-NEXT: packuswb %xmm3, %xmm2 +; SSSE3-NEXT: por %xmm0, %xmm2 +; SSSE3-NEXT: pand %xmm6, %xmm2 +; SSSE3-NEXT: movdqa %xmm13, %xmm0 +; SSSE3-NEXT: pxor %xmm1, %xmm0 +; SSSE3-NEXT: movdqa %xmm0, %xmm6 +; SSSE3-NEXT: pcmpgtd %xmm1, %xmm6 +; SSSE3-NEXT: pshufd {{.*#+}} xmm8 = xmm6[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm1, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm0[1,1,3,3] +; SSSE3-NEXT: pand %xmm8, %xmm3 +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm6[1,1,3,3] +; SSSE3-NEXT: por %xmm3, %xmm0 +; SSSE3-NEXT: pand 
%xmm13, %xmm0 ; SSSE3-NEXT: packuswb %xmm2, %xmm0 -; SSSE3-NEXT: movdqa %xmm14, %xmm1 -; SSSE3-NEXT: pxor %xmm8, %xmm1 -; SSSE3-NEXT: movdqa %xmm1, %xmm2 -; SSSE3-NEXT: pcmpgtd %xmm8, %xmm2 -; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm2[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm8, %xmm1 -; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] -; SSSE3-NEXT: pand %xmm3, %xmm1 +; SSSE3-NEXT: movdqa %xmm15, %xmm2 +; SSSE3-NEXT: pxor %xmm1, %xmm2 +; SSSE3-NEXT: movdqa %xmm2, %xmm3 +; SSSE3-NEXT: pcmpgtd %xmm1, %xmm3 +; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm3[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm1, %xmm2 ; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] -; SSSE3-NEXT: por %xmm1, %xmm2 -; SSSE3-NEXT: pand %xmm14, %xmm2 -; SSSE3-NEXT: movdqa %xmm13, %xmm1 -; SSSE3-NEXT: pxor %xmm8, %xmm1 -; SSSE3-NEXT: movdqa %xmm1, %xmm3 -; SSSE3-NEXT: pcmpgtd %xmm8, %xmm3 -; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm3[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm8, %xmm1 -; SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm1[1,1,3,3] -; SSSE3-NEXT: pand %xmm4, %xmm5 -; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm3[1,1,3,3] -; SSSE3-NEXT: por %xmm5, %xmm1 -; SSSE3-NEXT: pand %xmm13, %xmm1 -; SSSE3-NEXT: packuswb %xmm2, %xmm1 -; SSSE3-NEXT: movdqa %xmm12, %xmm2 -; SSSE3-NEXT: pxor %xmm8, %xmm2 +; SSSE3-NEXT: pand %xmm6, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm3[1,1,3,3] +; SSSE3-NEXT: por %xmm2, %xmm3 +; SSSE3-NEXT: pand %xmm15, %xmm3 +; SSSE3-NEXT: movdqa %xmm7, %xmm2 +; SSSE3-NEXT: pxor %xmm1, %xmm2 +; SSSE3-NEXT: movdqa %xmm2, %xmm6 +; SSSE3-NEXT: pcmpgtd %xmm1, %xmm6 +; SSSE3-NEXT: pshufd {{.*#+}} xmm8 = xmm6[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm1, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] +; SSSE3-NEXT: pand %xmm8, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm6[1,1,3,3] +; SSSE3-NEXT: por %xmm2, %xmm6 +; SSSE3-NEXT: pand %xmm7, %xmm6 +; SSSE3-NEXT: packuswb %xmm3, %xmm6 +; SSSE3-NEXT: packuswb %xmm6, %xmm0 +; SSSE3-NEXT: movdqa %xmm5, %xmm2 +; SSSE3-NEXT: pxor %xmm1, %xmm2 ; SSSE3-NEXT: movdqa %xmm2, 
%xmm3 -; SSSE3-NEXT: pcmpgtd %xmm8, %xmm3 -; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm3[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm8, %xmm2 +; SSSE3-NEXT: pcmpgtd %xmm1, %xmm3 +; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm3[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm1, %xmm2 ; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] -; SSSE3-NEXT: pand %xmm4, %xmm2 +; SSSE3-NEXT: pand %xmm6, %xmm2 ; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm3[1,1,3,3] ; SSSE3-NEXT: por %xmm2, %xmm3 -; SSSE3-NEXT: pand %xmm12, %xmm3 -; SSSE3-NEXT: movdqa %xmm9, %xmm2 -; SSSE3-NEXT: pxor %xmm8, %xmm2 -; SSSE3-NEXT: movdqa %xmm2, %xmm4 -; SSSE3-NEXT: pcmpgtd %xmm8, %xmm4 +; SSSE3-NEXT: pand %xmm5, %xmm3 +; SSSE3-NEXT: movdqa %xmm4, %xmm2 +; SSSE3-NEXT: pxor %xmm1, %xmm2 +; SSSE3-NEXT: movdqa %xmm2, %xmm5 +; SSSE3-NEXT: pcmpgtd %xmm1, %xmm5 +; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm5[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm1, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm7 = xmm2[1,1,3,3] +; SSSE3-NEXT: pand %xmm6, %xmm7 +; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm5[1,1,3,3] +; SSSE3-NEXT: por %xmm7, %xmm2 +; SSSE3-NEXT: pand %xmm4, %xmm2 +; SSSE3-NEXT: packuswb %xmm3, %xmm2 +; SSSE3-NEXT: movdqa %xmm12, %xmm3 +; SSSE3-NEXT: pxor %xmm1, %xmm3 +; SSSE3-NEXT: movdqa %xmm3, %xmm4 +; SSSE3-NEXT: pcmpgtd %xmm1, %xmm4 ; SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm4[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm8, %xmm2 -; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] -; SSSE3-NEXT: pand %xmm5, %xmm2 +; SSSE3-NEXT: pcmpeqd %xmm1, %xmm3 +; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm3[1,1,3,3] +; SSSE3-NEXT: pand %xmm5, %xmm3 ; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm4[1,1,3,3] -; SSSE3-NEXT: por %xmm2, %xmm4 -; SSSE3-NEXT: pand %xmm9, %xmm4 -; SSSE3-NEXT: packuswb %xmm3, %xmm4 -; SSSE3-NEXT: packuswb %xmm4, %xmm1 -; SSSE3-NEXT: packuswb %xmm1, %xmm0 +; SSSE3-NEXT: por %xmm3, %xmm4 +; SSSE3-NEXT: pand %xmm12, %xmm4 +; SSSE3-NEXT: movdqa %xmm11, %xmm3 +; SSSE3-NEXT: pxor %xmm1, %xmm3 +; SSSE3-NEXT: movdqa %xmm3, %xmm5 +; SSSE3-NEXT: pcmpgtd %xmm1, %xmm5 +; 
SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm5[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm1, %xmm3 +; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm3[1,1,3,3] +; SSSE3-NEXT: pand %xmm6, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm5[1,1,3,3] +; SSSE3-NEXT: por %xmm1, %xmm3 +; SSSE3-NEXT: pand %xmm11, %xmm3 +; SSSE3-NEXT: packuswb %xmm4, %xmm3 +; SSSE3-NEXT: packuswb %xmm3, %xmm2 +; SSSE3-NEXT: packuswb %xmm2, %xmm0 ; SSSE3-NEXT: retq ; ; SSE41-LABEL: trunc_packus_v16i64_v16i8: ; SSE41: # %bb.0: -; SSE41-NEXT: movdqa %xmm0, %xmm8 -; SSE41-NEXT: movapd {{.*#+}} xmm11 = [255,255] -; SSE41-NEXT: movdqa {{.*#+}} xmm9 = [2147483648,2147483648] -; SSE41-NEXT: movdqa %xmm6, %xmm0 -; SSE41-NEXT: pxor %xmm9, %xmm0 -; SSE41-NEXT: movdqa {{.*#+}} xmm12 = [2147483903,2147483903] -; SSE41-NEXT: movdqa %xmm12, %xmm10 -; SSE41-NEXT: pcmpeqd %xmm0, %xmm10 -; SSE41-NEXT: movdqa %xmm12, %xmm13 -; SSE41-NEXT: pcmpgtd %xmm0, %xmm13 -; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm13[0,0,2,2] -; SSE41-NEXT: pand %xmm10, %xmm0 -; SSE41-NEXT: por %xmm13, %xmm0 -; SSE41-NEXT: movapd %xmm11, %xmm10 -; SSE41-NEXT: blendvpd %xmm0, %xmm6, %xmm10 -; SSE41-NEXT: movdqa %xmm7, %xmm0 -; SSE41-NEXT: pxor %xmm9, %xmm0 -; SSE41-NEXT: movdqa %xmm12, %xmm13 -; SSE41-NEXT: pcmpeqd %xmm0, %xmm13 -; SSE41-NEXT: movdqa %xmm12, %xmm6 -; SSE41-NEXT: pcmpgtd %xmm0, %xmm6 -; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm6[0,0,2,2] -; SSE41-NEXT: pand %xmm13, %xmm0 -; SSE41-NEXT: por %xmm6, %xmm0 -; SSE41-NEXT: movapd %xmm11, %xmm13 -; SSE41-NEXT: blendvpd %xmm0, %xmm7, %xmm13 +; SSE41-NEXT: movdqa (%rdi), %xmm10 +; SSE41-NEXT: movdqa 16(%rdi), %xmm9 +; SSE41-NEXT: movdqa 32(%rdi), %xmm14 +; SSE41-NEXT: movdqa 48(%rdi), %xmm12 +; SSE41-NEXT: movdqa 80(%rdi), %xmm15 +; SSE41-NEXT: movdqa 64(%rdi), %xmm6 +; SSE41-NEXT: movdqa 112(%rdi), %xmm13 +; SSE41-NEXT: movdqa 96(%rdi), %xmm4 +; SSE41-NEXT: movapd {{.*#+}} xmm1 = [255,255] +; SSE41-NEXT: movdqa {{.*#+}} xmm2 = [2147483648,2147483648] ; SSE41-NEXT: movdqa %xmm4, %xmm0 -; SSE41-NEXT: pxor %xmm9, 
%xmm0 -; SSE41-NEXT: movdqa %xmm12, %xmm6 -; SSE41-NEXT: pcmpeqd %xmm0, %xmm6 -; SSE41-NEXT: movdqa %xmm12, %xmm7 -; SSE41-NEXT: pcmpgtd %xmm0, %xmm7 -; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm7[0,0,2,2] -; SSE41-NEXT: pand %xmm6, %xmm0 -; SSE41-NEXT: por %xmm7, %xmm0 -; SSE41-NEXT: movapd %xmm11, %xmm14 -; SSE41-NEXT: blendvpd %xmm0, %xmm4, %xmm14 -; SSE41-NEXT: movdqa %xmm5, %xmm0 -; SSE41-NEXT: pxor %xmm9, %xmm0 -; SSE41-NEXT: movdqa %xmm12, %xmm4 +; SSE41-NEXT: pxor %xmm2, %xmm0 +; SSE41-NEXT: movdqa {{.*#+}} xmm7 = [2147483903,2147483903] +; SSE41-NEXT: movdqa %xmm7, %xmm3 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm3 +; SSE41-NEXT: movdqa %xmm7, %xmm5 +; SSE41-NEXT: pcmpgtd %xmm0, %xmm5 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm5[0,0,2,2] +; SSE41-NEXT: pand %xmm3, %xmm0 +; SSE41-NEXT: por %xmm5, %xmm0 +; SSE41-NEXT: movapd %xmm1, %xmm8 +; SSE41-NEXT: blendvpd %xmm0, %xmm4, %xmm8 +; SSE41-NEXT: movdqa %xmm13, %xmm0 +; SSE41-NEXT: pxor %xmm2, %xmm0 +; SSE41-NEXT: movdqa %xmm7, %xmm3 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm3 +; SSE41-NEXT: movdqa %xmm7, %xmm4 +; SSE41-NEXT: pcmpgtd %xmm0, %xmm4 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm4[0,0,2,2] +; SSE41-NEXT: pand %xmm3, %xmm0 +; SSE41-NEXT: por %xmm4, %xmm0 +; SSE41-NEXT: movapd %xmm1, %xmm11 +; SSE41-NEXT: blendvpd %xmm0, %xmm13, %xmm11 +; SSE41-NEXT: movdqa %xmm6, %xmm0 +; SSE41-NEXT: pxor %xmm2, %xmm0 +; SSE41-NEXT: movdqa %xmm7, %xmm3 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm3 +; SSE41-NEXT: movdqa %xmm7, %xmm4 +; SSE41-NEXT: pcmpgtd %xmm0, %xmm4 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm4[0,0,2,2] +; SSE41-NEXT: pand %xmm3, %xmm0 +; SSE41-NEXT: por %xmm4, %xmm0 +; SSE41-NEXT: movapd %xmm1, %xmm13 +; SSE41-NEXT: blendvpd %xmm0, %xmm6, %xmm13 +; SSE41-NEXT: movdqa %xmm15, %xmm0 +; SSE41-NEXT: pxor %xmm2, %xmm0 +; SSE41-NEXT: movdqa %xmm7, %xmm3 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm3 +; SSE41-NEXT: movdqa %xmm7, %xmm4 +; SSE41-NEXT: pcmpgtd %xmm0, %xmm4 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm4[0,0,2,2] +; SSE41-NEXT: pand %xmm3, %xmm0 
+; SSE41-NEXT: por %xmm4, %xmm0 +; SSE41-NEXT: movapd %xmm1, %xmm6 +; SSE41-NEXT: blendvpd %xmm0, %xmm15, %xmm6 +; SSE41-NEXT: movdqa %xmm14, %xmm0 +; SSE41-NEXT: pxor %xmm2, %xmm0 +; SSE41-NEXT: movdqa %xmm7, %xmm3 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm3 +; SSE41-NEXT: movdqa %xmm7, %xmm4 +; SSE41-NEXT: pcmpgtd %xmm0, %xmm4 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm4[0,0,2,2] +; SSE41-NEXT: pand %xmm3, %xmm0 +; SSE41-NEXT: por %xmm4, %xmm0 +; SSE41-NEXT: movapd %xmm1, %xmm15 +; SSE41-NEXT: blendvpd %xmm0, %xmm14, %xmm15 +; SSE41-NEXT: movdqa %xmm12, %xmm0 +; SSE41-NEXT: pxor %xmm2, %xmm0 +; SSE41-NEXT: movdqa %xmm7, %xmm4 ; SSE41-NEXT: pcmpeqd %xmm0, %xmm4 -; SSE41-NEXT: movdqa %xmm12, %xmm6 -; SSE41-NEXT: pcmpgtd %xmm0, %xmm6 -; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm6[0,0,2,2] +; SSE41-NEXT: movdqa %xmm7, %xmm5 +; SSE41-NEXT: pcmpgtd %xmm0, %xmm5 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm5[0,0,2,2] ; SSE41-NEXT: pand %xmm4, %xmm0 -; SSE41-NEXT: por %xmm6, %xmm0 -; SSE41-NEXT: movapd %xmm11, %xmm15 -; SSE41-NEXT: blendvpd %xmm0, %xmm5, %xmm15 -; SSE41-NEXT: movdqa %xmm2, %xmm0 -; SSE41-NEXT: pxor %xmm9, %xmm0 -; SSE41-NEXT: movdqa %xmm12, %xmm5 +; SSE41-NEXT: por %xmm5, %xmm0 +; SSE41-NEXT: movapd %xmm1, %xmm4 +; SSE41-NEXT: blendvpd %xmm0, %xmm12, %xmm4 +; SSE41-NEXT: movdqa %xmm10, %xmm0 +; SSE41-NEXT: pxor %xmm2, %xmm0 +; SSE41-NEXT: movdqa %xmm7, %xmm5 ; SSE41-NEXT: pcmpeqd %xmm0, %xmm5 -; SSE41-NEXT: movdqa %xmm12, %xmm6 -; SSE41-NEXT: pcmpgtd %xmm0, %xmm6 -; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm6[0,0,2,2] -; SSE41-NEXT: pand %xmm5, %xmm0 -; SSE41-NEXT: por %xmm6, %xmm0 -; SSE41-NEXT: movapd %xmm11, %xmm5 -; SSE41-NEXT: blendvpd %xmm0, %xmm2, %xmm5 -; SSE41-NEXT: movdqa %xmm3, %xmm0 -; SSE41-NEXT: pxor %xmm9, %xmm0 -; SSE41-NEXT: movdqa %xmm12, %xmm2 -; SSE41-NEXT: pcmpeqd %xmm0, %xmm2 -; SSE41-NEXT: movdqa %xmm12, %xmm6 -; SSE41-NEXT: pcmpgtd %xmm0, %xmm6 -; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm6[0,0,2,2] -; SSE41-NEXT: pand %xmm2, %xmm0 -; SSE41-NEXT: por 
%xmm6, %xmm0 -; SSE41-NEXT: movapd %xmm11, %xmm6 -; SSE41-NEXT: blendvpd %xmm0, %xmm3, %xmm6 -; SSE41-NEXT: movdqa %xmm8, %xmm0 -; SSE41-NEXT: pxor %xmm9, %xmm0 -; SSE41-NEXT: movdqa %xmm12, %xmm2 -; SSE41-NEXT: pcmpeqd %xmm0, %xmm2 -; SSE41-NEXT: movdqa %xmm12, %xmm3 +; SSE41-NEXT: movdqa %xmm7, %xmm3 ; SSE41-NEXT: pcmpgtd %xmm0, %xmm3 ; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm3[0,0,2,2] -; SSE41-NEXT: pand %xmm2, %xmm0 +; SSE41-NEXT: pand %xmm5, %xmm0 ; SSE41-NEXT: por %xmm3, %xmm0 -; SSE41-NEXT: movapd %xmm11, %xmm3 -; SSE41-NEXT: blendvpd %xmm0, %xmm8, %xmm3 -; SSE41-NEXT: movdqa %xmm1, %xmm0 -; SSE41-NEXT: pxor %xmm9, %xmm0 -; SSE41-NEXT: movdqa %xmm12, %xmm2 -; SSE41-NEXT: pcmpeqd %xmm0, %xmm2 -; SSE41-NEXT: pcmpgtd %xmm0, %xmm12 -; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm12[0,0,2,2] -; SSE41-NEXT: pand %xmm2, %xmm0 -; SSE41-NEXT: por %xmm12, %xmm0 -; SSE41-NEXT: blendvpd %xmm0, %xmm1, %xmm11 -; SSE41-NEXT: pxor %xmm2, %xmm2 -; SSE41-NEXT: movapd %xmm11, %xmm1 -; SSE41-NEXT: xorpd %xmm9, %xmm1 +; SSE41-NEXT: movapd %xmm1, %xmm5 +; SSE41-NEXT: blendvpd %xmm0, %xmm10, %xmm5 +; SSE41-NEXT: movdqa %xmm9, %xmm0 +; SSE41-NEXT: pxor %xmm2, %xmm0 +; SSE41-NEXT: movdqa %xmm7, %xmm3 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm3 +; SSE41-NEXT: pcmpgtd %xmm0, %xmm7 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm7[0,0,2,2] +; SSE41-NEXT: pand %xmm3, %xmm0 +; SSE41-NEXT: por %xmm7, %xmm0 +; SSE41-NEXT: blendvpd %xmm0, %xmm9, %xmm1 +; SSE41-NEXT: xorpd %xmm9, %xmm9 +; SSE41-NEXT: movapd %xmm1, %xmm3 +; SSE41-NEXT: xorpd %xmm2, %xmm3 +; SSE41-NEXT: movapd %xmm3, %xmm7 +; SSE41-NEXT: pcmpeqd %xmm2, %xmm7 +; SSE41-NEXT: pcmpgtd %xmm2, %xmm3 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm3[0,0,2,2] +; SSE41-NEXT: pand %xmm7, %xmm0 +; SSE41-NEXT: por %xmm3, %xmm0 +; SSE41-NEXT: pxor %xmm3, %xmm3 +; SSE41-NEXT: blendvpd %xmm0, %xmm1, %xmm3 +; SSE41-NEXT: movapd %xmm5, %xmm1 +; SSE41-NEXT: xorpd %xmm2, %xmm1 ; SSE41-NEXT: movapd %xmm1, %xmm7 -; SSE41-NEXT: pcmpeqd %xmm9, %xmm7 -; SSE41-NEXT: pcmpgtd %xmm9, 
%xmm1 +; SSE41-NEXT: pcmpeqd %xmm2, %xmm7 +; SSE41-NEXT: pcmpgtd %xmm2, %xmm1 ; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm1[0,0,2,2] ; SSE41-NEXT: pand %xmm7, %xmm0 ; SSE41-NEXT: por %xmm1, %xmm0 -; SSE41-NEXT: pxor %xmm7, %xmm7 -; SSE41-NEXT: blendvpd %xmm0, %xmm11, %xmm7 -; SSE41-NEXT: movapd %xmm3, %xmm1 -; SSE41-NEXT: xorpd %xmm9, %xmm1 -; SSE41-NEXT: movapd %xmm1, %xmm4 -; SSE41-NEXT: pcmpeqd %xmm9, %xmm4 -; SSE41-NEXT: pcmpgtd %xmm9, %xmm1 -; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm1[0,0,2,2] -; SSE41-NEXT: pand %xmm4, %xmm0 -; SSE41-NEXT: por %xmm1, %xmm0 ; SSE41-NEXT: pxor %xmm1, %xmm1 -; SSE41-NEXT: blendvpd %xmm0, %xmm3, %xmm1 -; SSE41-NEXT: packusdw %xmm7, %xmm1 -; SSE41-NEXT: movapd %xmm6, %xmm3 -; SSE41-NEXT: xorpd %xmm9, %xmm3 -; SSE41-NEXT: movapd %xmm3, %xmm4 -; SSE41-NEXT: pcmpeqd %xmm9, %xmm4 -; SSE41-NEXT: pcmpgtd %xmm9, %xmm3 +; SSE41-NEXT: blendvpd %xmm0, %xmm5, %xmm1 +; SSE41-NEXT: packusdw %xmm3, %xmm1 +; SSE41-NEXT: movapd %xmm4, %xmm3 +; SSE41-NEXT: xorpd %xmm2, %xmm3 +; SSE41-NEXT: movapd %xmm3, %xmm5 +; SSE41-NEXT: pcmpeqd %xmm2, %xmm5 +; SSE41-NEXT: pcmpgtd %xmm2, %xmm3 ; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm3[0,0,2,2] -; SSE41-NEXT: pand %xmm4, %xmm0 +; SSE41-NEXT: pand %xmm5, %xmm0 ; SSE41-NEXT: por %xmm3, %xmm0 ; SSE41-NEXT: pxor %xmm3, %xmm3 -; SSE41-NEXT: blendvpd %xmm0, %xmm6, %xmm3 -; SSE41-NEXT: movapd %xmm5, %xmm4 -; SSE41-NEXT: xorpd %xmm9, %xmm4 -; SSE41-NEXT: movapd %xmm4, %xmm6 -; SSE41-NEXT: pcmpeqd %xmm9, %xmm6 -; SSE41-NEXT: pcmpgtd %xmm9, %xmm4 +; SSE41-NEXT: blendvpd %xmm0, %xmm4, %xmm3 +; SSE41-NEXT: movapd %xmm15, %xmm4 +; SSE41-NEXT: xorpd %xmm2, %xmm4 +; SSE41-NEXT: movapd %xmm4, %xmm5 +; SSE41-NEXT: pcmpeqd %xmm2, %xmm5 +; SSE41-NEXT: pcmpgtd %xmm2, %xmm4 ; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm4[0,0,2,2] -; SSE41-NEXT: pand %xmm6, %xmm0 +; SSE41-NEXT: pand %xmm5, %xmm0 ; SSE41-NEXT: por %xmm4, %xmm0 ; SSE41-NEXT: pxor %xmm4, %xmm4 -; SSE41-NEXT: blendvpd %xmm0, %xmm5, %xmm4 +; SSE41-NEXT: blendvpd %xmm0, %xmm15, %xmm4 
; SSE41-NEXT: packusdw %xmm3, %xmm4 ; SSE41-NEXT: packusdw %xmm4, %xmm1 -; SSE41-NEXT: movapd %xmm15, %xmm3 -; SSE41-NEXT: xorpd %xmm9, %xmm3 +; SSE41-NEXT: movapd %xmm6, %xmm3 +; SSE41-NEXT: xorpd %xmm2, %xmm3 ; SSE41-NEXT: movapd %xmm3, %xmm4 -; SSE41-NEXT: pcmpeqd %xmm9, %xmm4 -; SSE41-NEXT: pcmpgtd %xmm9, %xmm3 +; SSE41-NEXT: pcmpeqd %xmm2, %xmm4 +; SSE41-NEXT: pcmpgtd %xmm2, %xmm3 ; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm3[0,0,2,2] ; SSE41-NEXT: pand %xmm4, %xmm0 ; SSE41-NEXT: por %xmm3, %xmm0 ; SSE41-NEXT: pxor %xmm4, %xmm4 -; SSE41-NEXT: blendvpd %xmm0, %xmm15, %xmm4 -; SSE41-NEXT: movapd %xmm14, %xmm3 -; SSE41-NEXT: xorpd %xmm9, %xmm3 +; SSE41-NEXT: blendvpd %xmm0, %xmm6, %xmm4 +; SSE41-NEXT: movapd %xmm13, %xmm3 +; SSE41-NEXT: xorpd %xmm2, %xmm3 ; SSE41-NEXT: movapd %xmm3, %xmm5 -; SSE41-NEXT: pcmpeqd %xmm9, %xmm5 -; SSE41-NEXT: pcmpgtd %xmm9, %xmm3 +; SSE41-NEXT: pcmpeqd %xmm2, %xmm5 +; SSE41-NEXT: pcmpgtd %xmm2, %xmm3 ; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm3[0,0,2,2] ; SSE41-NEXT: pand %xmm5, %xmm0 ; SSE41-NEXT: por %xmm3, %xmm0 ; SSE41-NEXT: pxor %xmm3, %xmm3 -; SSE41-NEXT: blendvpd %xmm0, %xmm14, %xmm3 +; SSE41-NEXT: blendvpd %xmm0, %xmm13, %xmm3 ; SSE41-NEXT: packusdw %xmm4, %xmm3 -; SSE41-NEXT: movapd %xmm13, %xmm4 -; SSE41-NEXT: xorpd %xmm9, %xmm4 +; SSE41-NEXT: movapd %xmm11, %xmm4 +; SSE41-NEXT: xorpd %xmm2, %xmm4 ; SSE41-NEXT: movapd %xmm4, %xmm5 -; SSE41-NEXT: pcmpeqd %xmm9, %xmm5 -; SSE41-NEXT: pcmpgtd %xmm9, %xmm4 +; SSE41-NEXT: pcmpeqd %xmm2, %xmm5 +; SSE41-NEXT: pcmpgtd %xmm2, %xmm4 ; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm4[0,0,2,2] ; SSE41-NEXT: pand %xmm5, %xmm0 ; SSE41-NEXT: por %xmm4, %xmm0 ; SSE41-NEXT: pxor %xmm4, %xmm4 -; SSE41-NEXT: blendvpd %xmm0, %xmm13, %xmm4 -; SSE41-NEXT: movapd %xmm10, %xmm5 -; SSE41-NEXT: xorpd %xmm9, %xmm5 +; SSE41-NEXT: blendvpd %xmm0, %xmm11, %xmm4 +; SSE41-NEXT: movapd %xmm8, %xmm5 +; SSE41-NEXT: xorpd %xmm2, %xmm5 ; SSE41-NEXT: movapd %xmm5, %xmm6 -; SSE41-NEXT: pcmpeqd %xmm9, %xmm6 -; SSE41-NEXT: pcmpgtd 
%xmm9, %xmm5 +; SSE41-NEXT: pcmpeqd %xmm2, %xmm6 +; SSE41-NEXT: pcmpgtd %xmm2, %xmm5 ; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm5[0,0,2,2] ; SSE41-NEXT: pand %xmm6, %xmm0 ; SSE41-NEXT: por %xmm5, %xmm0 -; SSE41-NEXT: blendvpd %xmm0, %xmm10, %xmm2 -; SSE41-NEXT: packusdw %xmm4, %xmm2 -; SSE41-NEXT: packusdw %xmm2, %xmm3 +; SSE41-NEXT: blendvpd %xmm0, %xmm8, %xmm9 +; SSE41-NEXT: packusdw %xmm4, %xmm9 +; SSE41-NEXT: packusdw %xmm9, %xmm3 ; SSE41-NEXT: packuswb %xmm3, %xmm1 ; SSE41-NEXT: movdqa %xmm1, %xmm0 ; SSE41-NEXT: retq ; ; AVX1-LABEL: trunc_packus_v16i64_v16i8: ; AVX1: # %bb.0: -; AVX1-NEXT: vextractf128 $1, %ymm3, %xmm8 -; AVX1-NEXT: vmovdqa {{.*#+}} xmm5 = [255,255] -; AVX1-NEXT: vextractf128 $1, %ymm2, %xmm9 -; AVX1-NEXT: vextractf128 $1, %ymm1, %xmm7 -; AVX1-NEXT: vextractf128 $1, %ymm0, %xmm4 -; AVX1-NEXT: vpcmpgtq %xmm0, %xmm5, %xmm6 -; AVX1-NEXT: vblendvpd %xmm6, %xmm0, %xmm5, %xmm10 -; AVX1-NEXT: vpcmpgtq %xmm4, %xmm5, %xmm6 -; AVX1-NEXT: vblendvpd %xmm6, %xmm4, %xmm5, %xmm11 -; AVX1-NEXT: vpcmpgtq %xmm1, %xmm5, %xmm6 -; AVX1-NEXT: vblendvpd %xmm6, %xmm1, %xmm5, %xmm1 -; AVX1-NEXT: vpcmpgtq %xmm7, %xmm5, %xmm6 -; AVX1-NEXT: vblendvpd %xmm6, %xmm7, %xmm5, %xmm6 -; AVX1-NEXT: vpcmpgtq %xmm2, %xmm5, %xmm7 -; AVX1-NEXT: vblendvpd %xmm7, %xmm2, %xmm5, %xmm2 -; AVX1-NEXT: vpcmpgtq %xmm9, %xmm5, %xmm7 -; AVX1-NEXT: vblendvpd %xmm7, %xmm9, %xmm5, %xmm7 -; AVX1-NEXT: vpcmpgtq %xmm3, %xmm5, %xmm0 -; AVX1-NEXT: vblendvpd %xmm0, %xmm3, %xmm5, %xmm0 -; AVX1-NEXT: vpcmpgtq %xmm8, %xmm5, %xmm3 -; AVX1-NEXT: vblendvpd %xmm3, %xmm8, %xmm5, %xmm3 -; AVX1-NEXT: vpxor %xmm5, %xmm5, %xmm5 -; AVX1-NEXT: vpcmpgtq %xmm5, %xmm3, %xmm4 -; AVX1-NEXT: vpand %xmm3, %xmm4, %xmm3 -; AVX1-NEXT: vpcmpgtq %xmm5, %xmm0, %xmm4 -; AVX1-NEXT: vpand %xmm0, %xmm4, %xmm0 -; AVX1-NEXT: vpackusdw %xmm3, %xmm0, %xmm0 -; AVX1-NEXT: vpcmpgtq %xmm5, %xmm7, %xmm3 -; AVX1-NEXT: vpand %xmm7, %xmm3, %xmm3 -; AVX1-NEXT: vpcmpgtq %xmm5, %xmm2, %xmm4 -; AVX1-NEXT: vpand %xmm2, %xmm4, %xmm2 +; AVX1-NEXT: 
vmovdqa 112(%rdi), %xmm8 +; AVX1-NEXT: vmovdqa {{.*#+}} xmm1 = [255,255] +; AVX1-NEXT: vmovdqa 96(%rdi), %xmm9 +; AVX1-NEXT: vmovdqa 80(%rdi), %xmm3 +; AVX1-NEXT: vmovdqa 64(%rdi), %xmm4 +; AVX1-NEXT: vmovdqa (%rdi), %xmm5 +; AVX1-NEXT: vmovdqa 16(%rdi), %xmm6 +; AVX1-NEXT: vmovdqa 32(%rdi), %xmm7 +; AVX1-NEXT: vmovdqa 48(%rdi), %xmm0 +; AVX1-NEXT: vpcmpgtq %xmm5, %xmm1, %xmm2 +; AVX1-NEXT: vblendvpd %xmm2, %xmm5, %xmm1, %xmm10 +; AVX1-NEXT: vpcmpgtq %xmm6, %xmm1, %xmm5 +; AVX1-NEXT: vblendvpd %xmm5, %xmm6, %xmm1, %xmm11 +; AVX1-NEXT: vpcmpgtq %xmm7, %xmm1, %xmm6 +; AVX1-NEXT: vblendvpd %xmm6, %xmm7, %xmm1, %xmm6 +; AVX1-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm7 +; AVX1-NEXT: vblendvpd %xmm7, %xmm0, %xmm1, %xmm0 +; AVX1-NEXT: vpcmpgtq %xmm4, %xmm1, %xmm7 +; AVX1-NEXT: vblendvpd %xmm7, %xmm4, %xmm1, %xmm4 +; AVX1-NEXT: vpcmpgtq %xmm3, %xmm1, %xmm7 +; AVX1-NEXT: vblendvpd %xmm7, %xmm3, %xmm1, %xmm3 +; AVX1-NEXT: vpcmpgtq %xmm9, %xmm1, %xmm7 +; AVX1-NEXT: vblendvpd %xmm7, %xmm9, %xmm1, %xmm7 +; AVX1-NEXT: vpcmpgtq %xmm8, %xmm1, %xmm2 +; AVX1-NEXT: vblendvpd %xmm2, %xmm8, %xmm1, %xmm1 +; AVX1-NEXT: vpxor %xmm2, %xmm2, %xmm2 +; AVX1-NEXT: vpcmpgtq %xmm2, %xmm1, %xmm5 +; AVX1-NEXT: vpand %xmm1, %xmm5, %xmm1 +; AVX1-NEXT: vpcmpgtq %xmm2, %xmm7, %xmm5 +; AVX1-NEXT: vpand %xmm7, %xmm5, %xmm5 +; AVX1-NEXT: vpackusdw %xmm1, %xmm5, %xmm1 +; AVX1-NEXT: vpcmpgtq %xmm2, %xmm3, %xmm5 +; AVX1-NEXT: vpand %xmm3, %xmm5, %xmm3 +; AVX1-NEXT: vpcmpgtq %xmm2, %xmm4, %xmm5 +; AVX1-NEXT: vpand %xmm4, %xmm5, %xmm4 +; AVX1-NEXT: vpackusdw %xmm3, %xmm4, %xmm3 +; AVX1-NEXT: vpackusdw %xmm1, %xmm3, %xmm1 +; AVX1-NEXT: vpcmpgtq %xmm2, %xmm0, %xmm3 +; AVX1-NEXT: vpand %xmm0, %xmm3, %xmm0 +; AVX1-NEXT: vpcmpgtq %xmm2, %xmm6, %xmm3 +; AVX1-NEXT: vpand %xmm6, %xmm3, %xmm3 +; AVX1-NEXT: vpackusdw %xmm0, %xmm3, %xmm0 +; AVX1-NEXT: vpcmpgtq %xmm2, %xmm11, %xmm3 +; AVX1-NEXT: vpand %xmm11, %xmm3, %xmm3 +; AVX1-NEXT: vpcmpgtq %xmm2, %xmm10, %xmm2 +; AVX1-NEXT: vpand %xmm10, %xmm2, %xmm2 ; AVX1-NEXT: vpackusdw 
%xmm3, %xmm2, %xmm2 ; AVX1-NEXT: vpackusdw %xmm0, %xmm2, %xmm0 -; AVX1-NEXT: vpcmpgtq %xmm5, %xmm6, %xmm2 -; AVX1-NEXT: vpand %xmm6, %xmm2, %xmm2 -; AVX1-NEXT: vpcmpgtq %xmm5, %xmm1, %xmm3 -; AVX1-NEXT: vpand %xmm1, %xmm3, %xmm1 -; AVX1-NEXT: vpackusdw %xmm2, %xmm1, %xmm1 -; AVX1-NEXT: vpcmpgtq %xmm5, %xmm11, %xmm2 -; AVX1-NEXT: vpand %xmm11, %xmm2, %xmm2 -; AVX1-NEXT: vpcmpgtq %xmm5, %xmm10, %xmm3 -; AVX1-NEXT: vpand %xmm10, %xmm3, %xmm3 -; AVX1-NEXT: vpackusdw %xmm2, %xmm3, %xmm2 -; AVX1-NEXT: vpackusdw %xmm1, %xmm2, %xmm1 -; AVX1-NEXT: vpackuswb %xmm0, %xmm1, %xmm0 -; AVX1-NEXT: vzeroupper +; AVX1-NEXT: vpackuswb %xmm1, %xmm0, %xmm0 ; AVX1-NEXT: retq ; ; AVX2-LABEL: trunc_packus_v16i64_v16i8: ; AVX2: # %bb.0: +; AVX2-NEXT: vmovdqa (%rdi), %ymm0 +; AVX2-NEXT: vmovdqa 32(%rdi), %ymm1 +; AVX2-NEXT: vmovdqa 64(%rdi), %ymm2 +; AVX2-NEXT: vmovdqa 96(%rdi), %ymm3 ; AVX2-NEXT: vpbroadcastq {{.*#+}} ymm4 = [255,255,255,255] ; AVX2-NEXT: vpcmpgtq %ymm2, %ymm4, %ymm5 ; AVX2-NEXT: vblendvpd %ymm5, %ymm2, %ymm4, %ymm2 @@ -3945,25 +4170,25 @@ define <16 x i8> @trunc_packus_v16i64_v1 ; ; AVX512F-LABEL: trunc_packus_v16i64_v16i8: ; AVX512F: # %bb.0: -; AVX512F-NEXT: vpbroadcastq {{.*#+}} zmm2 = [255,255,255,255,255,255,255,255] -; AVX512F-NEXT: vpminsq %zmm2, %zmm0, %zmm0 -; AVX512F-NEXT: vpminsq %zmm2, %zmm1, %zmm1 +; AVX512F-NEXT: vpbroadcastq {{.*#+}} zmm0 = [255,255,255,255,255,255,255,255] +; AVX512F-NEXT: vpminsq (%rdi), %zmm0, %zmm1 +; AVX512F-NEXT: vpminsq 64(%rdi), %zmm0, %zmm0 ; AVX512F-NEXT: vpxor %xmm2, %xmm2, %xmm2 -; AVX512F-NEXT: vpmaxsq %zmm2, %zmm1, %zmm1 ; AVX512F-NEXT: vpmaxsq %zmm2, %zmm0, %zmm0 -; AVX512F-NEXT: vpmovqd %zmm0, %ymm0 +; AVX512F-NEXT: vpmaxsq %zmm2, %zmm1, %zmm1 ; AVX512F-NEXT: vpmovqd %zmm1, %ymm1 -; AVX512F-NEXT: vinserti64x4 $1, %ymm1, %zmm0, %zmm0 +; AVX512F-NEXT: vpmovqd %zmm0, %ymm0 +; AVX512F-NEXT: vinserti64x4 $1, %ymm0, %zmm1, %zmm0 ; AVX512F-NEXT: vpmovdb %zmm0, %xmm0 ; AVX512F-NEXT: vzeroupper ; AVX512F-NEXT: retq ; ; 
AVX512VL-LABEL: trunc_packus_v16i64_v16i8: ; AVX512VL: # %bb.0: -; AVX512VL-NEXT: vpxor %xmm2, %xmm2, %xmm2 -; AVX512VL-NEXT: vpmaxsq %zmm2, %zmm1, %zmm1 +; AVX512VL-NEXT: vpxor %xmm0, %xmm0, %xmm0 +; AVX512VL-NEXT: vpmaxsq 64(%rdi), %zmm0, %zmm1 ; AVX512VL-NEXT: vpmovusqb %zmm1, %xmm1 -; AVX512VL-NEXT: vpmaxsq %zmm2, %zmm0, %zmm0 +; AVX512VL-NEXT: vpmaxsq (%rdi), %zmm0, %zmm0 ; AVX512VL-NEXT: vpmovusqb %zmm0, %xmm0 ; AVX512VL-NEXT: vpunpcklqdq {{.*#+}} xmm0 = xmm0[0],xmm1[0] ; AVX512VL-NEXT: vzeroupper @@ -3971,29 +4196,47 @@ define <16 x i8> @trunc_packus_v16i64_v1 ; ; AVX512BW-LABEL: trunc_packus_v16i64_v16i8: ; AVX512BW: # %bb.0: -; AVX512BW-NEXT: vpbroadcastq {{.*#+}} zmm2 = [255,255,255,255,255,255,255,255] -; AVX512BW-NEXT: vpminsq %zmm2, %zmm0, %zmm0 -; AVX512BW-NEXT: vpminsq %zmm2, %zmm1, %zmm1 +; AVX512BW-NEXT: vpbroadcastq {{.*#+}} zmm0 = [255,255,255,255,255,255,255,255] +; AVX512BW-NEXT: vpminsq (%rdi), %zmm0, %zmm1 +; AVX512BW-NEXT: vpminsq 64(%rdi), %zmm0, %zmm0 ; AVX512BW-NEXT: vpxor %xmm2, %xmm2, %xmm2 -; AVX512BW-NEXT: vpmaxsq %zmm2, %zmm1, %zmm1 ; AVX512BW-NEXT: vpmaxsq %zmm2, %zmm0, %zmm0 -; AVX512BW-NEXT: vpmovqd %zmm0, %ymm0 +; AVX512BW-NEXT: vpmaxsq %zmm2, %zmm1, %zmm1 ; AVX512BW-NEXT: vpmovqd %zmm1, %ymm1 -; AVX512BW-NEXT: vinserti64x4 $1, %ymm1, %zmm0, %zmm0 +; AVX512BW-NEXT: vpmovqd %zmm0, %ymm0 +; AVX512BW-NEXT: vinserti64x4 $1, %ymm0, %zmm1, %zmm0 ; AVX512BW-NEXT: vpmovdb %zmm0, %xmm0 ; AVX512BW-NEXT: vzeroupper ; AVX512BW-NEXT: retq ; ; AVX512BWVL-LABEL: trunc_packus_v16i64_v16i8: ; AVX512BWVL: # %bb.0: -; AVX512BWVL-NEXT: vpxor %xmm2, %xmm2, %xmm2 -; AVX512BWVL-NEXT: vpmaxsq %zmm2, %zmm1, %zmm1 +; AVX512BWVL-NEXT: vpxor %xmm0, %xmm0, %xmm0 +; AVX512BWVL-NEXT: vpmaxsq 64(%rdi), %zmm0, %zmm1 ; AVX512BWVL-NEXT: vpmovusqb %zmm1, %xmm1 -; AVX512BWVL-NEXT: vpmaxsq %zmm2, %zmm0, %zmm0 +; AVX512BWVL-NEXT: vpmaxsq (%rdi), %zmm0, %zmm0 ; AVX512BWVL-NEXT: vpmovusqb %zmm0, %xmm0 ; AVX512BWVL-NEXT: vpunpcklqdq {{.*#+}} xmm0 = xmm0[0],xmm1[0] ; 
AVX512BWVL-NEXT: vzeroupper ; AVX512BWVL-NEXT: retq +; +; SKX-LABEL: trunc_packus_v16i64_v16i8: +; SKX: # %bb.0: +; SKX-NEXT: vpxor %xmm0, %xmm0, %xmm0 +; SKX-NEXT: vpmaxsq 96(%rdi), %ymm0, %ymm1 +; SKX-NEXT: vpmovusqb %ymm1, %xmm1 +; SKX-NEXT: vpmaxsq 64(%rdi), %ymm0, %ymm2 +; SKX-NEXT: vpmovusqb %ymm2, %xmm2 +; SKX-NEXT: vpunpckldq {{.*#+}} xmm1 = xmm2[0],xmm1[0],xmm2[1],xmm1[1] +; SKX-NEXT: vpmaxsq 32(%rdi), %ymm0, %ymm2 +; SKX-NEXT: vpmovusqb %ymm2, %xmm2 +; SKX-NEXT: vpmaxsq (%rdi), %ymm0, %ymm0 +; SKX-NEXT: vpmovusqb %ymm0, %xmm0 +; SKX-NEXT: vpunpckldq {{.*#+}} xmm0 = xmm0[0],xmm2[0],xmm0[1],xmm2[1] +; SKX-NEXT: vpunpcklqdq {{.*#+}} xmm0 = xmm0[0],xmm1[0] +; SKX-NEXT: vzeroupper +; SKX-NEXT: retq + %a0 = load <16 x i64>, <16 x i64>* %p0 %1 = icmp slt <16 x i64> %a0, %2 = select <16 x i1> %1, <16 x i64> %a0, <16 x i64> %3 = icmp sgt <16 x i64> %2, zeroinitializer @@ -4002,7 +4245,7 @@ define <16 x i8> @trunc_packus_v16i64_v1 ret <16 x i8> %5 } -define <4 x i8> @trunc_packus_v4i32_v4i8(<4 x i32> %a0) { +define <4 x i8> @trunc_packus_v4i32_v4i8(<4 x i32> %a0) "min-legal-vector-width"="256" { ; SSE2-LABEL: trunc_packus_v4i32_v4i8: ; SSE2: # %bb.0: ; SSE2-NEXT: movdqa {{.*#+}} xmm1 = [255,255,255,255] @@ -4093,6 +4336,14 @@ define <4 x i8> @trunc_packus_v4i32_v4i8 ; AVX512BWVL-NEXT: vpmaxsd %xmm1, %xmm0, %xmm0 ; AVX512BWVL-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,4,8,12,u,u,u,u,u,u,u,u,u,u,u,u] ; AVX512BWVL-NEXT: retq +; +; SKX-LABEL: trunc_packus_v4i32_v4i8: +; SKX: # %bb.0: +; SKX-NEXT: vpminsd {{.*}}(%rip){1to4}, %xmm0, %xmm0 +; SKX-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; SKX-NEXT: vpmaxsd %xmm1, %xmm0, %xmm0 +; SKX-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,4,8,12,u,u,u,u,u,u,u,u,u,u,u,u] +; SKX-NEXT: retq %1 = icmp slt <4 x i32> %a0, %2 = select <4 x i1> %1, <4 x i32> %a0, <4 x i32> %3 = icmp sgt <4 x i32> %2, zeroinitializer @@ -4197,6 +4448,13 @@ define void @trunc_packus_v4i32_v4i8_sto ; AVX512BWVL-NEXT: vpmaxsd %xmm1, %xmm0, %xmm0 ; AVX512BWVL-NEXT: vpmovusdb %xmm0, 
(%rdi) ; AVX512BWVL-NEXT: retq +; +; SKX-LABEL: trunc_packus_v4i32_v4i8_store: +; SKX: # %bb.0: +; SKX-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; SKX-NEXT: vpmaxsd %xmm1, %xmm0, %xmm0 +; SKX-NEXT: vpmovusdb %xmm0, (%rdi) +; SKX-NEXT: retq %1 = icmp slt <4 x i32> %a0, %2 = select <4 x i1> %1, <4 x i32> %a0, <4 x i32> %3 = icmp sgt <4 x i32> %2, zeroinitializer @@ -4260,6 +4518,14 @@ define <8 x i8> @trunc_packus_v8i32_v8i8 ; AVX512BWVL-NEXT: vpmovusdb %ymm0, %xmm0 ; AVX512BWVL-NEXT: vzeroupper ; AVX512BWVL-NEXT: retq +; +; SKX-LABEL: trunc_packus_v8i32_v8i8: +; SKX: # %bb.0: +; SKX-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; SKX-NEXT: vpmaxsd %ymm1, %ymm0, %ymm0 +; SKX-NEXT: vpmovusdb %ymm0, %xmm0 +; SKX-NEXT: vzeroupper +; SKX-NEXT: retq %1 = icmp slt <8 x i32> %a0, %2 = select <8 x i1> %1, <8 x i32> %a0, <8 x i32> %3 = icmp sgt <8 x i32> %2, zeroinitializer @@ -4327,6 +4593,14 @@ define void @trunc_packus_v8i32_v8i8_sto ; AVX512BWVL-NEXT: vpmovusdb %ymm0, (%rdi) ; AVX512BWVL-NEXT: vzeroupper ; AVX512BWVL-NEXT: retq +; +; SKX-LABEL: trunc_packus_v8i32_v8i8_store: +; SKX: # %bb.0: +; SKX-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; SKX-NEXT: vpmaxsd %ymm1, %ymm0, %ymm0 +; SKX-NEXT: vpmovusdb %ymm0, (%rdi) +; SKX-NEXT: vzeroupper +; SKX-NEXT: retq %1 = icmp slt <8 x i32> %a0, %2 = select <8 x i1> %1, <8 x i32> %a0, <8 x i32> %3 = icmp sgt <8 x i32> %2, zeroinitializer @@ -4336,27 +4610,29 @@ define void @trunc_packus_v8i32_v8i8_sto ret void } -define <16 x i8> @trunc_packus_v16i32_v16i8(<16 x i32> %a0) { +define <16 x i8> @trunc_packus_v16i32_v16i8(<16 x i32>* %p0) "min-legal-vector-width"="256" { ; SSE-LABEL: trunc_packus_v16i32_v16i8: ; SSE: # %bb.0: -; SSE-NEXT: packssdw %xmm3, %xmm2 -; SSE-NEXT: packssdw %xmm1, %xmm0 -; SSE-NEXT: packuswb %xmm2, %xmm0 +; SSE-NEXT: movdqa (%rdi), %xmm0 +; SSE-NEXT: movdqa 32(%rdi), %xmm1 +; SSE-NEXT: packssdw 48(%rdi), %xmm1 +; SSE-NEXT: packssdw 16(%rdi), %xmm0 +; SSE-NEXT: packuswb %xmm1, %xmm0 ; SSE-NEXT: retq ; ; AVX1-LABEL: trunc_packus_v16i32_v16i8: ; 
AVX1: # %bb.0: -; AVX1-NEXT: vextractf128 $1, %ymm1, %xmm2 -; AVX1-NEXT: vpackssdw %xmm2, %xmm1, %xmm1 -; AVX1-NEXT: vextractf128 $1, %ymm0, %xmm2 -; AVX1-NEXT: vpackssdw %xmm2, %xmm0, %xmm0 +; AVX1-NEXT: vmovdqa (%rdi), %xmm0 +; AVX1-NEXT: vmovdqa 32(%rdi), %xmm1 +; AVX1-NEXT: vpackssdw 48(%rdi), %xmm1, %xmm1 +; AVX1-NEXT: vpackssdw 16(%rdi), %xmm0, %xmm0 ; AVX1-NEXT: vpackuswb %xmm1, %xmm0, %xmm0 -; AVX1-NEXT: vzeroupper ; AVX1-NEXT: retq ; ; AVX2-LABEL: trunc_packus_v16i32_v16i8: ; AVX2: # %bb.0: -; AVX2-NEXT: vpackssdw %ymm1, %ymm0, %ymm0 +; AVX2-NEXT: vmovdqa (%rdi), %ymm0 +; AVX2-NEXT: vpackssdw 32(%rdi), %ymm0, %ymm0 ; AVX2-NEXT: vpermq {{.*#+}} ymm0 = ymm0[0,2,1,3] ; AVX2-NEXT: vextracti128 $1, %ymm0, %xmm1 ; AVX2-NEXT: vpackuswb %xmm1, %xmm0, %xmm0 @@ -4365,11 +4641,21 @@ define <16 x i8> @trunc_packus_v16i32_v1 ; ; AVX512-LABEL: trunc_packus_v16i32_v16i8: ; AVX512: # %bb.0: -; AVX512-NEXT: vpxor %xmm1, %xmm1, %xmm1 -; AVX512-NEXT: vpmaxsd %zmm1, %zmm0, %zmm0 +; AVX512-NEXT: vpxor %xmm0, %xmm0, %xmm0 +; AVX512-NEXT: vpmaxsd (%rdi), %zmm0, %zmm0 ; AVX512-NEXT: vpmovusdb %zmm0, %xmm0 ; AVX512-NEXT: vzeroupper ; AVX512-NEXT: retq +; +; SKX-LABEL: trunc_packus_v16i32_v16i8: +; SKX: # %bb.0: +; SKX-NEXT: vmovdqa (%rdi), %ymm0 +; SKX-NEXT: vpackusdw 32(%rdi), %ymm0, %ymm0 +; SKX-NEXT: vpermq {{.*#+}} ymm0 = ymm0[0,2,1,3] +; SKX-NEXT: vpmovuswb %ymm0, %xmm0 +; SKX-NEXT: vzeroupper +; SKX-NEXT: retq + %a0 = load <16 x i32>, <16 x i32>* %p0 %1 = icmp slt <16 x i32> %a0, %2 = select <16 x i1> %1, <16 x i32> %a0, <16 x i32> %3 = icmp sgt <16 x i32> %2, zeroinitializer @@ -4378,6 +4664,64 @@ define <16 x i8> @trunc_packus_v16i32_v1 ret <16 x i8> %5 } +define void @trunc_packus_v16i32_v16i8_store(<16 x i32>* %p0, <16 x i8>* %p1) "min-legal-vector-width"="256" { +; SSE-LABEL: trunc_packus_v16i32_v16i8_store: +; SSE: # %bb.0: +; SSE-NEXT: movdqa (%rdi), %xmm0 +; SSE-NEXT: movdqa 32(%rdi), %xmm1 +; SSE-NEXT: packssdw 48(%rdi), %xmm1 +; SSE-NEXT: packssdw 16(%rdi), %xmm0 
+; SSE-NEXT: packuswb %xmm1, %xmm0 +; SSE-NEXT: movdqa %xmm0, (%rsi) +; SSE-NEXT: retq +; +; AVX1-LABEL: trunc_packus_v16i32_v16i8_store: +; AVX1: # %bb.0: +; AVX1-NEXT: vmovdqa (%rdi), %xmm0 +; AVX1-NEXT: vmovdqa 32(%rdi), %xmm1 +; AVX1-NEXT: vpackssdw 48(%rdi), %xmm1, %xmm1 +; AVX1-NEXT: vpackssdw 16(%rdi), %xmm0, %xmm0 +; AVX1-NEXT: vpackuswb %xmm1, %xmm0, %xmm0 +; AVX1-NEXT: vmovdqa %xmm0, (%rsi) +; AVX1-NEXT: retq +; +; AVX2-LABEL: trunc_packus_v16i32_v16i8_store: +; AVX2: # %bb.0: +; AVX2-NEXT: vmovdqa (%rdi), %ymm0 +; AVX2-NEXT: vpackssdw 32(%rdi), %ymm0, %ymm0 +; AVX2-NEXT: vpermq {{.*#+}} ymm0 = ymm0[0,2,1,3] +; AVX2-NEXT: vextracti128 $1, %ymm0, %xmm1 +; AVX2-NEXT: vpackuswb %xmm1, %xmm0, %xmm0 +; AVX2-NEXT: vmovdqa %xmm0, (%rsi) +; AVX2-NEXT: vzeroupper +; AVX2-NEXT: retq +; +; AVX512-LABEL: trunc_packus_v16i32_v16i8_store: +; AVX512: # %bb.0: +; AVX512-NEXT: vpxor %xmm0, %xmm0, %xmm0 +; AVX512-NEXT: vpmaxsd (%rdi), %zmm0, %zmm0 +; AVX512-NEXT: vpmovusdb %zmm0, (%rsi) +; AVX512-NEXT: vzeroupper +; AVX512-NEXT: retq +; +; SKX-LABEL: trunc_packus_v16i32_v16i8_store: +; SKX: # %bb.0: +; SKX-NEXT: vmovdqa (%rdi), %ymm0 +; SKX-NEXT: vpackusdw 32(%rdi), %ymm0, %ymm0 +; SKX-NEXT: vpermq {{.*#+}} ymm0 = ymm0[0,2,1,3] +; SKX-NEXT: vpmovuswb %ymm0, (%rsi) +; SKX-NEXT: vzeroupper +; SKX-NEXT: retq + %a = load <16 x i32>, <16 x i32>* %p0 + %b = icmp slt <16 x i32> %a, + %c = select <16 x i1> %b, <16 x i32> %a, <16 x i32> + %d = icmp sgt <16 x i32> %c, zeroinitializer + %e = select <16 x i1> %d, <16 x i32> %c, <16 x i32> zeroinitializer + %f = trunc <16 x i32> %e to <16 x i8> + store <16 x i8> %f, <16 x i8>* %p1 + ret void +} + define <8 x i8> @trunc_packus_v8i16_v8i8(<8 x i16> %a0) { ; SSE-LABEL: trunc_packus_v8i16_v8i8: ; SSE: # %bb.0: @@ -4411,6 +4755,14 @@ define <8 x i8> @trunc_packus_v8i16_v8i8 ; AVX512BWVL-NEXT: vpmaxsw %xmm1, %xmm0, %xmm0 ; AVX512BWVL-NEXT: vpackuswb %xmm0, %xmm0, %xmm0 ; AVX512BWVL-NEXT: retq +; +; SKX-LABEL: trunc_packus_v8i16_v8i8: +; SKX: 
# %bb.0: +; SKX-NEXT: vpminsw {{.*}}(%rip), %xmm0, %xmm0 +; SKX-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; SKX-NEXT: vpmaxsw %xmm1, %xmm0, %xmm0 +; SKX-NEXT: vpackuswb %xmm0, %xmm0, %xmm0 +; SKX-NEXT: retq %1 = icmp slt <8 x i16> %a0, %2 = select <8 x i1> %1, <8 x i16> %a0, <8 x i16> %3 = icmp sgt <8 x i16> %2, zeroinitializer @@ -4456,6 +4808,13 @@ define void @trunc_packus_v8i16_v8i8_sto ; AVX512BWVL-NEXT: vpmaxsw %xmm1, %xmm0, %xmm0 ; AVX512BWVL-NEXT: vpmovuswb %xmm0, (%rdi) ; AVX512BWVL-NEXT: retq +; +; SKX-LABEL: trunc_packus_v8i16_v8i8_store: +; SKX: # %bb.0: +; SKX-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; SKX-NEXT: vpmaxsw %xmm1, %xmm0, %xmm0 +; SKX-NEXT: vpmovuswb %xmm0, (%rdi) +; SKX-NEXT: retq %1 = icmp slt <8 x i16> %a0, %2 = select <8 x i1> %1, <8 x i16> %a0, <8 x i16> %3 = icmp sgt <8 x i16> %2, zeroinitializer @@ -4513,6 +4872,14 @@ define <16 x i8> @trunc_packus_v16i16_v1 ; AVX512BWVL-NEXT: vpmovuswb %ymm0, %xmm0 ; AVX512BWVL-NEXT: vzeroupper ; AVX512BWVL-NEXT: retq +; +; SKX-LABEL: trunc_packus_v16i16_v16i8: +; SKX: # %bb.0: +; SKX-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; SKX-NEXT: vpmaxsw %ymm1, %ymm0, %ymm0 +; SKX-NEXT: vpmovuswb %ymm0, %xmm0 +; SKX-NEXT: vzeroupper +; SKX-NEXT: retq %1 = icmp slt <16 x i16> %a0, %2 = select <16 x i1> %1, <16 x i16> %a0, <16 x i16> %3 = icmp sgt <16 x i16> %2, zeroinitializer @@ -4521,56 +4888,71 @@ define <16 x i8> @trunc_packus_v16i16_v1 ret <16 x i8> %5 } -define <32 x i8> @trunc_packus_v32i16_v32i8(<32 x i16> %a0) { +define <32 x i8> @trunc_packus_v32i16_v32i8(<32 x i16>* %p0) "min-legal-vector-width"="256" { ; SSE-LABEL: trunc_packus_v32i16_v32i8: ; SSE: # %bb.0: -; SSE-NEXT: packuswb %xmm1, %xmm0 -; SSE-NEXT: packuswb %xmm3, %xmm2 -; SSE-NEXT: movdqa %xmm2, %xmm1 +; SSE-NEXT: movdqa (%rdi), %xmm0 +; SSE-NEXT: movdqa 32(%rdi), %xmm1 +; SSE-NEXT: packuswb 16(%rdi), %xmm0 +; SSE-NEXT: packuswb 48(%rdi), %xmm1 ; SSE-NEXT: retq ; ; AVX1-LABEL: trunc_packus_v32i16_v32i8: ; AVX1: # %bb.0: -; AVX1-NEXT: vextractf128 $1, %ymm1, %xmm2 -; 
AVX1-NEXT: vpackuswb %xmm2, %xmm1, %xmm1 -; AVX1-NEXT: vextractf128 $1, %ymm0, %xmm2 -; AVX1-NEXT: vpackuswb %xmm2, %xmm0, %xmm0 +; AVX1-NEXT: vmovdqa (%rdi), %xmm0 +; AVX1-NEXT: vmovdqa 32(%rdi), %xmm1 +; AVX1-NEXT: vpackuswb 48(%rdi), %xmm1, %xmm1 +; AVX1-NEXT: vpackuswb 16(%rdi), %xmm0, %xmm0 ; AVX1-NEXT: vinsertf128 $1, %xmm1, %ymm0, %ymm0 ; AVX1-NEXT: retq ; ; AVX2-LABEL: trunc_packus_v32i16_v32i8: ; AVX2: # %bb.0: -; AVX2-NEXT: vpackuswb %ymm1, %ymm0, %ymm0 +; AVX2-NEXT: vmovdqa (%rdi), %ymm0 +; AVX2-NEXT: vpackuswb 32(%rdi), %ymm0, %ymm0 ; AVX2-NEXT: vpermq {{.*#+}} ymm0 = ymm0[0,2,1,3] ; AVX2-NEXT: retq ; ; AVX512F-LABEL: trunc_packus_v32i16_v32i8: ; AVX512F: # %bb.0: -; AVX512F-NEXT: vextracti64x4 $1, %zmm0, %ymm1 -; AVX512F-NEXT: vpackuswb %ymm1, %ymm0, %ymm0 +; AVX512F-NEXT: vmovdqa (%rdi), %ymm0 +; AVX512F-NEXT: vpackuswb 32(%rdi), %ymm0, %ymm0 ; AVX512F-NEXT: vpermq {{.*#+}} ymm0 = ymm0[0,2,1,3] ; AVX512F-NEXT: retq ; ; AVX512VL-LABEL: trunc_packus_v32i16_v32i8: ; AVX512VL: # %bb.0: -; AVX512VL-NEXT: vextracti64x4 $1, %zmm0, %ymm1 -; AVX512VL-NEXT: vpackuswb %ymm1, %ymm0, %ymm0 +; AVX512VL-NEXT: vmovdqa (%rdi), %ymm0 +; AVX512VL-NEXT: vpackuswb 32(%rdi), %ymm0, %ymm0 ; AVX512VL-NEXT: vpermq {{.*#+}} ymm0 = ymm0[0,2,1,3] ; AVX512VL-NEXT: retq ; ; AVX512BW-LABEL: trunc_packus_v32i16_v32i8: ; AVX512BW: # %bb.0: -; AVX512BW-NEXT: vpxor %xmm1, %xmm1, %xmm1 -; AVX512BW-NEXT: vpmaxsw %zmm1, %zmm0, %zmm0 +; AVX512BW-NEXT: vpxor %xmm0, %xmm0, %xmm0 +; AVX512BW-NEXT: vpmaxsw (%rdi), %zmm0, %zmm0 ; AVX512BW-NEXT: vpmovuswb %zmm0, %ymm0 ; AVX512BW-NEXT: retq ; ; AVX512BWVL-LABEL: trunc_packus_v32i16_v32i8: ; AVX512BWVL: # %bb.0: -; AVX512BWVL-NEXT: vpxor %xmm1, %xmm1, %xmm1 -; AVX512BWVL-NEXT: vpmaxsw %zmm1, %zmm0, %zmm0 +; AVX512BWVL-NEXT: vpxor %xmm0, %xmm0, %xmm0 +; AVX512BWVL-NEXT: vpmaxsw (%rdi), %zmm0, %zmm0 ; AVX512BWVL-NEXT: vpmovuswb %zmm0, %ymm0 ; AVX512BWVL-NEXT: retq +; +; SKX-LABEL: trunc_packus_v32i16_v32i8: +; SKX: # %bb.0: +; SKX-NEXT: vmovdqa 
{{.*#+}} ymm0 = [255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255] +; SKX-NEXT: vpminsw (%rdi), %ymm0, %ymm1 +; SKX-NEXT: vpminsw 32(%rdi), %ymm0, %ymm0 +; SKX-NEXT: vpxor %xmm2, %xmm2, %xmm2 +; SKX-NEXT: vpmaxsw %ymm2, %ymm0, %ymm0 +; SKX-NEXT: vpmaxsw %ymm2, %ymm1, %ymm1 +; SKX-NEXT: vpackuswb %ymm0, %ymm1, %ymm0 +; SKX-NEXT: vpermq {{.*#+}} ymm0 = ymm0[0,2,1,3] +; SKX-NEXT: retq + %a0 = load <32 x i16>, <32 x i16>* %p0 %1 = icmp slt <32 x i16> %a0, %2 = select <32 x i1> %1, <32 x i16> %a0, <32 x i16> %3 = icmp sgt <32 x i16> %2, zeroinitializer @@ -4579,52 +4961,74 @@ define <32 x i8> @trunc_packus_v32i16_v3 ret <32 x i8> %5 } -define <32 x i8> @trunc_packus_v32i32_v32i8(<32 x i32> %a0) { +define <32 x i8> @trunc_packus_v32i32_v32i8(<32 x i32>* %p0) "min-legal-vector-width"="256" { ; SSE-LABEL: trunc_packus_v32i32_v32i8: ; SSE: # %bb.0: -; SSE-NEXT: packssdw %xmm3, %xmm2 -; SSE-NEXT: packssdw %xmm1, %xmm0 +; SSE-NEXT: movdqa (%rdi), %xmm0 +; SSE-NEXT: movdqa 32(%rdi), %xmm2 +; SSE-NEXT: movdqa 64(%rdi), %xmm1 +; SSE-NEXT: movdqa 96(%rdi), %xmm3 +; SSE-NEXT: packssdw 48(%rdi), %xmm2 +; SSE-NEXT: packssdw 16(%rdi), %xmm0 ; SSE-NEXT: packuswb %xmm2, %xmm0 -; SSE-NEXT: packssdw %xmm7, %xmm6 -; SSE-NEXT: packssdw %xmm5, %xmm4 -; SSE-NEXT: packuswb %xmm6, %xmm4 -; SSE-NEXT: movdqa %xmm4, %xmm1 +; SSE-NEXT: packssdw 112(%rdi), %xmm3 +; SSE-NEXT: packssdw 80(%rdi), %xmm1 +; SSE-NEXT: packuswb %xmm3, %xmm1 ; SSE-NEXT: retq ; ; AVX1-LABEL: trunc_packus_v32i32_v32i8: ; AVX1: # %bb.0: -; AVX1-NEXT: vextractf128 $1, %ymm3, %xmm4 -; AVX1-NEXT: vpackssdw %xmm4, %xmm3, %xmm3 -; AVX1-NEXT: vextractf128 $1, %ymm2, %xmm4 -; AVX1-NEXT: vpackssdw %xmm4, %xmm2, %xmm2 +; AVX1-NEXT: vmovdqa (%rdi), %xmm0 +; AVX1-NEXT: vmovdqa 32(%rdi), %xmm1 +; AVX1-NEXT: vmovdqa 64(%rdi), %xmm2 +; AVX1-NEXT: vmovdqa 96(%rdi), %xmm3 +; AVX1-NEXT: vpackssdw 112(%rdi), %xmm3, %xmm3 +; AVX1-NEXT: vpackssdw 80(%rdi), %xmm2, %xmm2 ; AVX1-NEXT: vpackuswb %xmm3, %xmm2, %xmm2 -; AVX1-NEXT: 
vextractf128 $1, %ymm1, %xmm3 -; AVX1-NEXT: vpackssdw %xmm3, %xmm1, %xmm1 -; AVX1-NEXT: vextractf128 $1, %ymm0, %xmm3 -; AVX1-NEXT: vpackssdw %xmm3, %xmm0, %xmm0 +; AVX1-NEXT: vpackssdw 48(%rdi), %xmm1, %xmm1 +; AVX1-NEXT: vpackssdw 16(%rdi), %xmm0, %xmm0 ; AVX1-NEXT: vpackuswb %xmm1, %xmm0, %xmm0 ; AVX1-NEXT: vinsertf128 $1, %xmm2, %ymm0, %ymm0 ; AVX1-NEXT: retq ; ; AVX2-LABEL: trunc_packus_v32i32_v32i8: ; AVX2: # %bb.0: -; AVX2-NEXT: vpackssdw %ymm3, %ymm2, %ymm2 -; AVX2-NEXT: vpermq {{.*#+}} ymm2 = ymm2[0,2,1,3] -; AVX2-NEXT: vpackssdw %ymm1, %ymm0, %ymm0 +; AVX2-NEXT: vmovdqa (%rdi), %ymm0 +; AVX2-NEXT: vmovdqa 64(%rdi), %ymm1 +; AVX2-NEXT: vpackssdw 96(%rdi), %ymm1, %ymm1 +; AVX2-NEXT: vpermq {{.*#+}} ymm1 = ymm1[0,2,1,3] +; AVX2-NEXT: vpackssdw 32(%rdi), %ymm0, %ymm0 ; AVX2-NEXT: vpermq {{.*#+}} ymm0 = ymm0[0,2,1,3] -; AVX2-NEXT: vpackuswb %ymm2, %ymm0, %ymm0 +; AVX2-NEXT: vpackuswb %ymm1, %ymm0, %ymm0 ; AVX2-NEXT: vpermq {{.*#+}} ymm0 = ymm0[0,2,1,3] ; AVX2-NEXT: retq ; ; AVX512-LABEL: trunc_packus_v32i32_v32i8: ; AVX512: # %bb.0: -; AVX512-NEXT: vpxor %xmm2, %xmm2, %xmm2 -; AVX512-NEXT: vpmaxsd %zmm2, %zmm0, %zmm0 -; AVX512-NEXT: vpmovusdb %zmm0, %xmm0 -; AVX512-NEXT: vpmaxsd %zmm2, %zmm1, %zmm1 +; AVX512-NEXT: vpxor %xmm0, %xmm0, %xmm0 +; AVX512-NEXT: vpmaxsd (%rdi), %zmm0, %zmm1 ; AVX512-NEXT: vpmovusdb %zmm1, %xmm1 -; AVX512-NEXT: vinserti128 $1, %xmm1, %ymm0, %ymm0 +; AVX512-NEXT: vpmaxsd 64(%rdi), %zmm0, %zmm0 +; AVX512-NEXT: vpmovusdb %zmm0, %xmm0 +; AVX512-NEXT: vinserti128 $1, %xmm0, %ymm1, %ymm0 ; AVX512-NEXT: retq +; +; SKX-LABEL: trunc_packus_v32i32_v32i8: +; SKX: # %bb.0: +; SKX-NEXT: vpxor %xmm0, %xmm0, %xmm0 +; SKX-NEXT: vpmaxsd 96(%rdi), %ymm0, %ymm1 +; SKX-NEXT: vpmovusdb %ymm1, %xmm1 +; SKX-NEXT: vpmaxsd 64(%rdi), %ymm0, %ymm2 +; SKX-NEXT: vpmovusdb %ymm2, %xmm2 +; SKX-NEXT: vpunpcklqdq {{.*#+}} xmm1 = xmm2[0],xmm1[0] +; SKX-NEXT: vpmaxsd 32(%rdi), %ymm0, %ymm2 +; SKX-NEXT: vpmovusdb %ymm2, %xmm2 +; SKX-NEXT: vpmaxsd (%rdi), %ymm0, %ymm0 +; 
SKX-NEXT: vpmovusdb %ymm0, %xmm0 +; SKX-NEXT: vpunpcklqdq {{.*#+}} xmm0 = xmm0[0],xmm2[0] +; SKX-NEXT: vinserti128 $1, %xmm1, %ymm0, %ymm0 +; SKX-NEXT: retq + %a0 = load <32 x i32>, <32 x i32>* %p0 %1 = icmp slt <32 x i32> %a0, %2 = select <32 x i1> %1, <32 x i32> %a0, <32 x i32> %3 = icmp sgt <32 x i32> %2, zeroinitializer Modified: llvm/trunk/test/CodeGen/X86/vector-trunc-ssat.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/vector-trunc-ssat.ll?rev=374642&r1=374641&r2=374642&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/vector-trunc-ssat.ll (original) +++ llvm/trunk/test/CodeGen/X86/vector-trunc-ssat.ll Sat Oct 12 00:59:24 2019 @@ -9,6 +9,7 @@ ; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=+avx512vl,+fast-variable-shuffle | FileCheck %s --check-prefixes=AVX512,AVX512VL ; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=+avx512bw,+fast-variable-shuffle | FileCheck %s --check-prefixes=AVX512,AVX512BW ; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=+avx512bw,+avx512vl,+fast-variable-shuffle | FileCheck %s --check-prefixes=AVX512,AVX512BWVL +; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mcpu=skx | FileCheck %s --check-prefixes=SKX ; ; Signed saturation truncation to vXi32 @@ -261,6 +262,12 @@ define <4 x i32> @trunc_ssat_v4i64_v4i32 ; AVX512BWVL-NEXT: vpmovsqd %ymm0, %xmm0 ; AVX512BWVL-NEXT: vzeroupper ; AVX512BWVL-NEXT: retq +; +; SKX-LABEL: trunc_ssat_v4i64_v4i32: +; SKX: # %bb.0: +; SKX-NEXT: vpmovsqd %ymm0, %xmm0 +; SKX-NEXT: vzeroupper +; SKX-NEXT: retq %1 = icmp slt <4 x i64> %a0, %2 = select <4 x i1> %1, <4 x i64> %a0, <4 x i64> %3 = icmp sgt <4 x i64> %2, @@ -270,322 +277,334 @@ define <4 x i32> @trunc_ssat_v4i64_v4i32 } -define <8 x i32> @trunc_ssat_v8i64_v8i32(<8 x i64> %a0) { +define <8 x i32> @trunc_ssat_v8i64_v8i32(<8 x i64>* %p0) "min-legal-vector-width"="256" { ; SSE2-LABEL: trunc_ssat_v8i64_v8i32: ; SSE2: # %bb.0: +; SSE2-NEXT: 
movdqa (%rdi), %xmm3 +; SSE2-NEXT: movdqa 16(%rdi), %xmm5 +; SSE2-NEXT: movdqa 32(%rdi), %xmm7 +; SSE2-NEXT: movdqa 48(%rdi), %xmm9 ; SSE2-NEXT: movdqa {{.*#+}} xmm8 = [2147483647,2147483647] -; SSE2-NEXT: movdqa {{.*#+}} xmm4 = [2147483648,2147483648] -; SSE2-NEXT: movdqa %xmm0, %xmm5 -; SSE2-NEXT: pxor %xmm4, %xmm5 -; SSE2-NEXT: movdqa {{.*#+}} xmm9 = [4294967295,4294967295] -; SSE2-NEXT: movdqa %xmm9, %xmm7 -; SSE2-NEXT: pcmpgtd %xmm5, %xmm7 -; SSE2-NEXT: pshufd {{.*#+}} xmm10 = xmm7[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm9, %xmm5 -; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm5[1,1,3,3] -; SSE2-NEXT: pand %xmm10, %xmm6 -; SSE2-NEXT: pshufd {{.*#+}} xmm5 = xmm7[1,1,3,3] -; SSE2-NEXT: por %xmm6, %xmm5 -; SSE2-NEXT: pand %xmm5, %xmm0 +; SSE2-NEXT: movdqa {{.*#+}} xmm0 = [2147483648,2147483648] +; SSE2-NEXT: movdqa %xmm3, %xmm2 +; SSE2-NEXT: pxor %xmm0, %xmm2 +; SSE2-NEXT: movdqa {{.*#+}} xmm10 = [4294967295,4294967295] +; SSE2-NEXT: movdqa %xmm10, %xmm6 +; SSE2-NEXT: pcmpgtd %xmm2, %xmm6 +; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm6[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm10, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm2[1,1,3,3] +; SSE2-NEXT: pand %xmm1, %xmm4 +; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm6[1,1,3,3] +; SSE2-NEXT: por %xmm4, %xmm2 +; SSE2-NEXT: pand %xmm2, %xmm3 +; SSE2-NEXT: pandn %xmm8, %xmm2 +; SSE2-NEXT: por %xmm3, %xmm2 +; SSE2-NEXT: movdqa %xmm5, %xmm1 +; SSE2-NEXT: pxor %xmm0, %xmm1 +; SSE2-NEXT: movdqa %xmm10, %xmm3 +; SSE2-NEXT: pcmpgtd %xmm1, %xmm3 +; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm3[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm10, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] +; SSE2-NEXT: pand %xmm4, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm3[1,1,3,3] +; SSE2-NEXT: por %xmm1, %xmm3 +; SSE2-NEXT: pand %xmm3, %xmm5 +; SSE2-NEXT: pandn %xmm8, %xmm3 +; SSE2-NEXT: por %xmm5, %xmm3 +; SSE2-NEXT: movdqa %xmm7, %xmm1 +; SSE2-NEXT: pxor %xmm0, %xmm1 +; SSE2-NEXT: movdqa %xmm10, %xmm4 +; SSE2-NEXT: pcmpgtd %xmm1, %xmm4 +; SSE2-NEXT: pshufd {{.*#+}} xmm5 = 
xmm4[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm10, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] +; SSE2-NEXT: pand %xmm5, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm5 = xmm4[1,1,3,3] +; SSE2-NEXT: por %xmm1, %xmm5 +; SSE2-NEXT: pand %xmm5, %xmm7 ; SSE2-NEXT: pandn %xmm8, %xmm5 -; SSE2-NEXT: por %xmm0, %xmm5 -; SSE2-NEXT: movdqa %xmm1, %xmm0 -; SSE2-NEXT: pxor %xmm4, %xmm0 -; SSE2-NEXT: movdqa %xmm9, %xmm6 -; SSE2-NEXT: pcmpgtd %xmm0, %xmm6 -; SSE2-NEXT: pshufd {{.*#+}} xmm10 = xmm6[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm9, %xmm0 -; SSE2-NEXT: pshufd {{.*#+}} xmm7 = xmm0[1,1,3,3] -; SSE2-NEXT: pand %xmm10, %xmm7 -; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm6[1,1,3,3] -; SSE2-NEXT: por %xmm7, %xmm0 -; SSE2-NEXT: pand %xmm0, %xmm1 -; SSE2-NEXT: pandn %xmm8, %xmm0 -; SSE2-NEXT: por %xmm1, %xmm0 -; SSE2-NEXT: movdqa %xmm2, %xmm1 -; SSE2-NEXT: pxor %xmm4, %xmm1 -; SSE2-NEXT: movdqa %xmm9, %xmm6 -; SSE2-NEXT: pcmpgtd %xmm1, %xmm6 -; SSE2-NEXT: pshufd {{.*#+}} xmm7 = xmm6[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm9, %xmm1 +; SSE2-NEXT: por %xmm7, %xmm5 +; SSE2-NEXT: movdqa %xmm9, %xmm1 +; SSE2-NEXT: pxor %xmm0, %xmm1 +; SSE2-NEXT: movdqa %xmm10, %xmm4 +; SSE2-NEXT: pcmpgtd %xmm1, %xmm4 +; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm4[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm10, %xmm1 ; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] -; SSE2-NEXT: pand %xmm7, %xmm1 -; SSE2-NEXT: pshufd {{.*#+}} xmm7 = xmm6[1,1,3,3] +; SSE2-NEXT: pand %xmm6, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm7 = xmm4[1,1,3,3] ; SSE2-NEXT: por %xmm1, %xmm7 -; SSE2-NEXT: pand %xmm7, %xmm2 +; SSE2-NEXT: pand %xmm7, %xmm9 ; SSE2-NEXT: pandn %xmm8, %xmm7 -; SSE2-NEXT: por %xmm2, %xmm7 -; SSE2-NEXT: movdqa %xmm3, %xmm1 -; SSE2-NEXT: pxor %xmm4, %xmm1 -; SSE2-NEXT: movdqa %xmm9, %xmm2 -; SSE2-NEXT: pcmpgtd %xmm1, %xmm2 -; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm2[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm9, %xmm1 -; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] -; SSE2-NEXT: pand %xmm6, %xmm1 -; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm2[1,1,3,3] -; 
SSE2-NEXT: por %xmm1, %xmm6 -; SSE2-NEXT: pand %xmm6, %xmm3 -; SSE2-NEXT: pandn %xmm8, %xmm6 -; SSE2-NEXT: por %xmm3, %xmm6 +; SSE2-NEXT: por %xmm9, %xmm7 ; SSE2-NEXT: movdqa {{.*#+}} xmm8 = [18446744071562067968,18446744071562067968] -; SSE2-NEXT: movdqa %xmm6, %xmm1 -; SSE2-NEXT: pxor %xmm4, %xmm1 +; SSE2-NEXT: movdqa %xmm7, %xmm1 +; SSE2-NEXT: pxor %xmm0, %xmm1 ; SSE2-NEXT: movdqa {{.*#+}} xmm9 = [18446744069414584320,18446744069414584320] -; SSE2-NEXT: movdqa %xmm1, %xmm2 -; SSE2-NEXT: pcmpgtd %xmm9, %xmm2 -; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm2[0,0,2,2] +; SSE2-NEXT: movdqa %xmm1, %xmm4 +; SSE2-NEXT: pcmpgtd %xmm9, %xmm4 +; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm4[0,0,2,2] ; SSE2-NEXT: pcmpeqd %xmm9, %xmm1 ; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] -; SSE2-NEXT: pand %xmm3, %xmm1 -; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] -; SSE2-NEXT: por %xmm1, %xmm2 -; SSE2-NEXT: pand %xmm2, %xmm6 -; SSE2-NEXT: pandn %xmm8, %xmm2 -; SSE2-NEXT: por %xmm6, %xmm2 -; SSE2-NEXT: movdqa %xmm7, %xmm1 -; SSE2-NEXT: pxor %xmm4, %xmm1 -; SSE2-NEXT: movdqa %xmm1, %xmm3 -; SSE2-NEXT: pcmpgtd %xmm9, %xmm3 -; SSE2-NEXT: pshufd {{.*#+}} xmm10 = xmm3[0,0,2,2] +; SSE2-NEXT: pand %xmm6, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm4[1,1,3,3] +; SSE2-NEXT: por %xmm1, %xmm4 +; SSE2-NEXT: pand %xmm4, %xmm7 +; SSE2-NEXT: pandn %xmm8, %xmm4 +; SSE2-NEXT: por %xmm7, %xmm4 +; SSE2-NEXT: movdqa %xmm5, %xmm1 +; SSE2-NEXT: pxor %xmm0, %xmm1 +; SSE2-NEXT: movdqa %xmm1, %xmm6 +; SSE2-NEXT: pcmpgtd %xmm9, %xmm6 +; SSE2-NEXT: pshufd {{.*#+}} xmm10 = xmm6[0,0,2,2] ; SSE2-NEXT: pcmpeqd %xmm9, %xmm1 -; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm1[1,1,3,3] -; SSE2-NEXT: pand %xmm10, %xmm6 -; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm3[1,1,3,3] -; SSE2-NEXT: por %xmm6, %xmm1 -; SSE2-NEXT: pand %xmm1, %xmm7 -; SSE2-NEXT: pandn %xmm8, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm7 = xmm1[1,1,3,3] +; SSE2-NEXT: pand %xmm10, %xmm7 +; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm6[1,1,3,3] ; SSE2-NEXT: por %xmm7, %xmm1 -; 
SSE2-NEXT: shufps {{.*#+}} xmm1 = xmm1[0,2],xmm2[0,2] -; SSE2-NEXT: movdqa %xmm0, %xmm2 -; SSE2-NEXT: pxor %xmm4, %xmm2 -; SSE2-NEXT: movdqa %xmm2, %xmm3 -; SSE2-NEXT: pcmpgtd %xmm9, %xmm3 -; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm3[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm9, %xmm2 -; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] -; SSE2-NEXT: pand %xmm6, %xmm2 -; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm3[1,1,3,3] -; SSE2-NEXT: por %xmm2, %xmm3 -; SSE2-NEXT: pand %xmm3, %xmm0 -; SSE2-NEXT: pandn %xmm8, %xmm3 -; SSE2-NEXT: por %xmm0, %xmm3 -; SSE2-NEXT: pxor %xmm5, %xmm4 -; SSE2-NEXT: movdqa %xmm4, %xmm0 -; SSE2-NEXT: pcmpgtd %xmm9, %xmm0 -; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm0[0,0,2,2] +; SSE2-NEXT: pand %xmm1, %xmm5 +; SSE2-NEXT: pandn %xmm8, %xmm1 +; SSE2-NEXT: por %xmm5, %xmm1 +; SSE2-NEXT: shufps {{.*#+}} xmm1 = xmm1[0,2],xmm4[0,2] +; SSE2-NEXT: movdqa %xmm3, %xmm4 +; SSE2-NEXT: pxor %xmm0, %xmm4 +; SSE2-NEXT: movdqa %xmm4, %xmm5 +; SSE2-NEXT: pcmpgtd %xmm9, %xmm5 +; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm5[0,0,2,2] ; SSE2-NEXT: pcmpeqd %xmm9, %xmm4 ; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm4[1,1,3,3] -; SSE2-NEXT: pand %xmm2, %xmm4 -; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] -; SSE2-NEXT: por %xmm4, %xmm0 -; SSE2-NEXT: pand %xmm0, %xmm5 +; SSE2-NEXT: pand %xmm6, %xmm4 +; SSE2-NEXT: pshufd {{.*#+}} xmm5 = xmm5[1,1,3,3] +; SSE2-NEXT: por %xmm4, %xmm5 +; SSE2-NEXT: pand %xmm5, %xmm3 +; SSE2-NEXT: pandn %xmm8, %xmm5 +; SSE2-NEXT: por %xmm3, %xmm5 +; SSE2-NEXT: pxor %xmm2, %xmm0 +; SSE2-NEXT: movdqa %xmm0, %xmm3 +; SSE2-NEXT: pcmpgtd %xmm9, %xmm3 +; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm3[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm9, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm0[1,1,3,3] +; SSE2-NEXT: pand %xmm4, %xmm6 +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm3[1,1,3,3] +; SSE2-NEXT: por %xmm6, %xmm0 +; SSE2-NEXT: pand %xmm0, %xmm2 ; SSE2-NEXT: pandn %xmm8, %xmm0 -; SSE2-NEXT: por %xmm5, %xmm0 -; SSE2-NEXT: shufps {{.*#+}} xmm0 = xmm0[0,2],xmm3[0,2] +; SSE2-NEXT: por %xmm2, 
%xmm0 +; SSE2-NEXT: shufps {{.*#+}} xmm0 = xmm0[0,2],xmm5[0,2] ; SSE2-NEXT: retq ; ; SSSE3-LABEL: trunc_ssat_v8i64_v8i32: ; SSSE3: # %bb.0: +; SSSE3-NEXT: movdqa (%rdi), %xmm3 +; SSSE3-NEXT: movdqa 16(%rdi), %xmm5 +; SSSE3-NEXT: movdqa 32(%rdi), %xmm7 +; SSSE3-NEXT: movdqa 48(%rdi), %xmm9 ; SSSE3-NEXT: movdqa {{.*#+}} xmm8 = [2147483647,2147483647] -; SSSE3-NEXT: movdqa {{.*#+}} xmm4 = [2147483648,2147483648] -; SSSE3-NEXT: movdqa %xmm0, %xmm5 -; SSSE3-NEXT: pxor %xmm4, %xmm5 -; SSSE3-NEXT: movdqa {{.*#+}} xmm9 = [4294967295,4294967295] -; SSSE3-NEXT: movdqa %xmm9, %xmm7 -; SSSE3-NEXT: pcmpgtd %xmm5, %xmm7 -; SSSE3-NEXT: pshufd {{.*#+}} xmm10 = xmm7[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm9, %xmm5 -; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm5[1,1,3,3] -; SSSE3-NEXT: pand %xmm10, %xmm6 -; SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm7[1,1,3,3] -; SSSE3-NEXT: por %xmm6, %xmm5 -; SSSE3-NEXT: pand %xmm5, %xmm0 +; SSSE3-NEXT: movdqa {{.*#+}} xmm0 = [2147483648,2147483648] +; SSSE3-NEXT: movdqa %xmm3, %xmm2 +; SSSE3-NEXT: pxor %xmm0, %xmm2 +; SSSE3-NEXT: movdqa {{.*#+}} xmm10 = [4294967295,4294967295] +; SSSE3-NEXT: movdqa %xmm10, %xmm6 +; SSSE3-NEXT: pcmpgtd %xmm2, %xmm6 +; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm6[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm10, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm2[1,1,3,3] +; SSSE3-NEXT: pand %xmm1, %xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm6[1,1,3,3] +; SSSE3-NEXT: por %xmm4, %xmm2 +; SSSE3-NEXT: pand %xmm2, %xmm3 +; SSSE3-NEXT: pandn %xmm8, %xmm2 +; SSSE3-NEXT: por %xmm3, %xmm2 +; SSSE3-NEXT: movdqa %xmm5, %xmm1 +; SSSE3-NEXT: pxor %xmm0, %xmm1 +; SSSE3-NEXT: movdqa %xmm10, %xmm3 +; SSSE3-NEXT: pcmpgtd %xmm1, %xmm3 +; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm3[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm10, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] +; SSSE3-NEXT: pand %xmm4, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm3[1,1,3,3] +; SSSE3-NEXT: por %xmm1, %xmm3 +; SSSE3-NEXT: pand %xmm3, %xmm5 +; SSSE3-NEXT: pandn %xmm8, %xmm3 +; 
SSSE3-NEXT: por %xmm5, %xmm3 +; SSSE3-NEXT: movdqa %xmm7, %xmm1 +; SSSE3-NEXT: pxor %xmm0, %xmm1 +; SSSE3-NEXT: movdqa %xmm10, %xmm4 +; SSSE3-NEXT: pcmpgtd %xmm1, %xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm4[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm10, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] +; SSSE3-NEXT: pand %xmm5, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm4[1,1,3,3] +; SSSE3-NEXT: por %xmm1, %xmm5 +; SSSE3-NEXT: pand %xmm5, %xmm7 ; SSSE3-NEXT: pandn %xmm8, %xmm5 -; SSSE3-NEXT: por %xmm0, %xmm5 -; SSSE3-NEXT: movdqa %xmm1, %xmm0 -; SSSE3-NEXT: pxor %xmm4, %xmm0 -; SSSE3-NEXT: movdqa %xmm9, %xmm6 -; SSSE3-NEXT: pcmpgtd %xmm0, %xmm6 -; SSSE3-NEXT: pshufd {{.*#+}} xmm10 = xmm6[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm9, %xmm0 -; SSSE3-NEXT: pshufd {{.*#+}} xmm7 = xmm0[1,1,3,3] -; SSSE3-NEXT: pand %xmm10, %xmm7 -; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm6[1,1,3,3] -; SSSE3-NEXT: por %xmm7, %xmm0 -; SSSE3-NEXT: pand %xmm0, %xmm1 -; SSSE3-NEXT: pandn %xmm8, %xmm0 -; SSSE3-NEXT: por %xmm1, %xmm0 -; SSSE3-NEXT: movdqa %xmm2, %xmm1 -; SSSE3-NEXT: pxor %xmm4, %xmm1 -; SSSE3-NEXT: movdqa %xmm9, %xmm6 -; SSSE3-NEXT: pcmpgtd %xmm1, %xmm6 -; SSSE3-NEXT: pshufd {{.*#+}} xmm7 = xmm6[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm9, %xmm1 +; SSSE3-NEXT: por %xmm7, %xmm5 +; SSSE3-NEXT: movdqa %xmm9, %xmm1 +; SSSE3-NEXT: pxor %xmm0, %xmm1 +; SSSE3-NEXT: movdqa %xmm10, %xmm4 +; SSSE3-NEXT: pcmpgtd %xmm1, %xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm4[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm10, %xmm1 ; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] -; SSSE3-NEXT: pand %xmm7, %xmm1 -; SSSE3-NEXT: pshufd {{.*#+}} xmm7 = xmm6[1,1,3,3] +; SSSE3-NEXT: pand %xmm6, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm7 = xmm4[1,1,3,3] ; SSSE3-NEXT: por %xmm1, %xmm7 -; SSSE3-NEXT: pand %xmm7, %xmm2 +; SSSE3-NEXT: pand %xmm7, %xmm9 ; SSSE3-NEXT: pandn %xmm8, %xmm7 -; SSSE3-NEXT: por %xmm2, %xmm7 -; SSSE3-NEXT: movdqa %xmm3, %xmm1 -; SSSE3-NEXT: pxor %xmm4, %xmm1 -; SSSE3-NEXT: movdqa %xmm9, %xmm2 -; 
SSSE3-NEXT: pcmpgtd %xmm1, %xmm2 -; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm2[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm9, %xmm1 -; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] -; SSSE3-NEXT: pand %xmm6, %xmm1 -; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm2[1,1,3,3] -; SSSE3-NEXT: por %xmm1, %xmm6 -; SSSE3-NEXT: pand %xmm6, %xmm3 -; SSSE3-NEXT: pandn %xmm8, %xmm6 -; SSSE3-NEXT: por %xmm3, %xmm6 +; SSSE3-NEXT: por %xmm9, %xmm7 ; SSSE3-NEXT: movdqa {{.*#+}} xmm8 = [18446744071562067968,18446744071562067968] -; SSSE3-NEXT: movdqa %xmm6, %xmm1 -; SSSE3-NEXT: pxor %xmm4, %xmm1 +; SSSE3-NEXT: movdqa %xmm7, %xmm1 +; SSSE3-NEXT: pxor %xmm0, %xmm1 ; SSSE3-NEXT: movdqa {{.*#+}} xmm9 = [18446744069414584320,18446744069414584320] -; SSSE3-NEXT: movdqa %xmm1, %xmm2 -; SSSE3-NEXT: pcmpgtd %xmm9, %xmm2 -; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm2[0,0,2,2] +; SSSE3-NEXT: movdqa %xmm1, %xmm4 +; SSSE3-NEXT: pcmpgtd %xmm9, %xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm4[0,0,2,2] ; SSSE3-NEXT: pcmpeqd %xmm9, %xmm1 ; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] -; SSSE3-NEXT: pand %xmm3, %xmm1 -; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] -; SSSE3-NEXT: por %xmm1, %xmm2 -; SSSE3-NEXT: pand %xmm2, %xmm6 -; SSSE3-NEXT: pandn %xmm8, %xmm2 -; SSSE3-NEXT: por %xmm6, %xmm2 -; SSSE3-NEXT: movdqa %xmm7, %xmm1 -; SSSE3-NEXT: pxor %xmm4, %xmm1 -; SSSE3-NEXT: movdqa %xmm1, %xmm3 -; SSSE3-NEXT: pcmpgtd %xmm9, %xmm3 -; SSSE3-NEXT: pshufd {{.*#+}} xmm10 = xmm3[0,0,2,2] +; SSSE3-NEXT: pand %xmm6, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm4[1,1,3,3] +; SSSE3-NEXT: por %xmm1, %xmm4 +; SSSE3-NEXT: pand %xmm4, %xmm7 +; SSSE3-NEXT: pandn %xmm8, %xmm4 +; SSSE3-NEXT: por %xmm7, %xmm4 +; SSSE3-NEXT: movdqa %xmm5, %xmm1 +; SSSE3-NEXT: pxor %xmm0, %xmm1 +; SSSE3-NEXT: movdqa %xmm1, %xmm6 +; SSSE3-NEXT: pcmpgtd %xmm9, %xmm6 +; SSSE3-NEXT: pshufd {{.*#+}} xmm10 = xmm6[0,0,2,2] ; SSSE3-NEXT: pcmpeqd %xmm9, %xmm1 -; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm1[1,1,3,3] -; SSSE3-NEXT: pand %xmm10, %xmm6 -; 
SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm3[1,1,3,3] -; SSSE3-NEXT: por %xmm6, %xmm1 -; SSSE3-NEXT: pand %xmm1, %xmm7 -; SSSE3-NEXT: pandn %xmm8, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm7 = xmm1[1,1,3,3] +; SSSE3-NEXT: pand %xmm10, %xmm7 +; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm6[1,1,3,3] ; SSSE3-NEXT: por %xmm7, %xmm1 -; SSSE3-NEXT: shufps {{.*#+}} xmm1 = xmm1[0,2],xmm2[0,2] -; SSSE3-NEXT: movdqa %xmm0, %xmm2 -; SSSE3-NEXT: pxor %xmm4, %xmm2 -; SSSE3-NEXT: movdqa %xmm2, %xmm3 -; SSSE3-NEXT: pcmpgtd %xmm9, %xmm3 -; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm3[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm9, %xmm2 -; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] -; SSSE3-NEXT: pand %xmm6, %xmm2 -; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm3[1,1,3,3] -; SSSE3-NEXT: por %xmm2, %xmm3 -; SSSE3-NEXT: pand %xmm3, %xmm0 -; SSSE3-NEXT: pandn %xmm8, %xmm3 -; SSSE3-NEXT: por %xmm0, %xmm3 -; SSSE3-NEXT: pxor %xmm5, %xmm4 -; SSSE3-NEXT: movdqa %xmm4, %xmm0 -; SSSE3-NEXT: pcmpgtd %xmm9, %xmm0 -; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm0[0,0,2,2] +; SSSE3-NEXT: pand %xmm1, %xmm5 +; SSSE3-NEXT: pandn %xmm8, %xmm1 +; SSSE3-NEXT: por %xmm5, %xmm1 +; SSSE3-NEXT: shufps {{.*#+}} xmm1 = xmm1[0,2],xmm4[0,2] +; SSSE3-NEXT: movdqa %xmm3, %xmm4 +; SSSE3-NEXT: pxor %xmm0, %xmm4 +; SSSE3-NEXT: movdqa %xmm4, %xmm5 +; SSSE3-NEXT: pcmpgtd %xmm9, %xmm5 +; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm5[0,0,2,2] ; SSSE3-NEXT: pcmpeqd %xmm9, %xmm4 ; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm4[1,1,3,3] -; SSSE3-NEXT: pand %xmm2, %xmm4 -; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] -; SSSE3-NEXT: por %xmm4, %xmm0 -; SSSE3-NEXT: pand %xmm0, %xmm5 +; SSSE3-NEXT: pand %xmm6, %xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm5[1,1,3,3] +; SSSE3-NEXT: por %xmm4, %xmm5 +; SSSE3-NEXT: pand %xmm5, %xmm3 +; SSSE3-NEXT: pandn %xmm8, %xmm5 +; SSSE3-NEXT: por %xmm3, %xmm5 +; SSSE3-NEXT: pxor %xmm2, %xmm0 +; SSSE3-NEXT: movdqa %xmm0, %xmm3 +; SSSE3-NEXT: pcmpgtd %xmm9, %xmm3 +; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm3[0,0,2,2] +; SSSE3-NEXT: 
pcmpeqd %xmm9, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm0[1,1,3,3] +; SSSE3-NEXT: pand %xmm4, %xmm6 +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm3[1,1,3,3] +; SSSE3-NEXT: por %xmm6, %xmm0 +; SSSE3-NEXT: pand %xmm0, %xmm2 ; SSSE3-NEXT: pandn %xmm8, %xmm0 -; SSSE3-NEXT: por %xmm5, %xmm0 -; SSSE3-NEXT: shufps {{.*#+}} xmm0 = xmm0[0,2],xmm3[0,2] +; SSSE3-NEXT: por %xmm2, %xmm0 +; SSSE3-NEXT: shufps {{.*#+}} xmm0 = xmm0[0,2],xmm5[0,2] ; SSSE3-NEXT: retq ; ; SSE41-LABEL: trunc_ssat_v8i64_v8i32: ; SSE41: # %bb.0: -; SSE41-NEXT: movdqa %xmm0, %xmm9 -; SSE41-NEXT: movapd {{.*#+}} xmm10 = [2147483647,2147483647] -; SSE41-NEXT: movdqa {{.*#+}} xmm5 = [2147483648,2147483648] -; SSE41-NEXT: pxor %xmm5, %xmm0 -; SSE41-NEXT: movdqa {{.*#+}} xmm4 = [4294967295,4294967295] -; SSE41-NEXT: movdqa %xmm4, %xmm7 +; SSE41-NEXT: movdqa (%rdi), %xmm5 +; SSE41-NEXT: movdqa 16(%rdi), %xmm4 +; SSE41-NEXT: movdqa 32(%rdi), %xmm10 +; SSE41-NEXT: movdqa 48(%rdi), %xmm9 +; SSE41-NEXT: movapd {{.*#+}} xmm1 = [2147483647,2147483647] +; SSE41-NEXT: movdqa {{.*#+}} xmm3 = [2147483648,2147483648] +; SSE41-NEXT: movdqa %xmm5, %xmm0 +; SSE41-NEXT: pxor %xmm3, %xmm0 +; SSE41-NEXT: movdqa {{.*#+}} xmm2 = [4294967295,4294967295] +; SSE41-NEXT: movdqa %xmm2, %xmm7 ; SSE41-NEXT: pcmpeqd %xmm0, %xmm7 -; SSE41-NEXT: movdqa %xmm4, %xmm6 +; SSE41-NEXT: movdqa %xmm2, %xmm6 ; SSE41-NEXT: pcmpgtd %xmm0, %xmm6 ; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm6[0,0,2,2] ; SSE41-NEXT: pand %xmm7, %xmm0 ; SSE41-NEXT: por %xmm6, %xmm0 -; SSE41-NEXT: movapd %xmm10, %xmm8 -; SSE41-NEXT: blendvpd %xmm0, %xmm9, %xmm8 -; SSE41-NEXT: movdqa %xmm1, %xmm0 -; SSE41-NEXT: pxor %xmm5, %xmm0 -; SSE41-NEXT: movdqa %xmm4, %xmm6 -; SSE41-NEXT: pcmpeqd %xmm0, %xmm6 -; SSE41-NEXT: movdqa %xmm4, %xmm7 -; SSE41-NEXT: pcmpgtd %xmm0, %xmm7 -; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm7[0,0,2,2] -; SSE41-NEXT: pand %xmm6, %xmm0 -; SSE41-NEXT: por %xmm7, %xmm0 -; SSE41-NEXT: movapd %xmm10, %xmm9 -; SSE41-NEXT: blendvpd %xmm0, %xmm1, %xmm9 -; SSE41-NEXT: 
movdqa %xmm2, %xmm0 -; SSE41-NEXT: pxor %xmm5, %xmm0 -; SSE41-NEXT: movdqa %xmm4, %xmm1 -; SSE41-NEXT: pcmpeqd %xmm0, %xmm1 -; SSE41-NEXT: movdqa %xmm4, %xmm6 +; SSE41-NEXT: movapd %xmm1, %xmm8 +; SSE41-NEXT: blendvpd %xmm0, %xmm5, %xmm8 +; SSE41-NEXT: movdqa %xmm4, %xmm0 +; SSE41-NEXT: pxor %xmm3, %xmm0 +; SSE41-NEXT: movdqa %xmm2, %xmm5 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm5 +; SSE41-NEXT: movdqa %xmm2, %xmm6 ; SSE41-NEXT: pcmpgtd %xmm0, %xmm6 ; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm6[0,0,2,2] -; SSE41-NEXT: pand %xmm1, %xmm0 +; SSE41-NEXT: pand %xmm5, %xmm0 ; SSE41-NEXT: por %xmm6, %xmm0 -; SSE41-NEXT: movapd %xmm10, %xmm6 -; SSE41-NEXT: blendvpd %xmm0, %xmm2, %xmm6 -; SSE41-NEXT: movdqa %xmm3, %xmm0 -; SSE41-NEXT: pxor %xmm5, %xmm0 -; SSE41-NEXT: movdqa %xmm4, %xmm1 -; SSE41-NEXT: pcmpeqd %xmm0, %xmm1 -; SSE41-NEXT: pcmpgtd %xmm0, %xmm4 -; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm4[0,0,2,2] -; SSE41-NEXT: pand %xmm1, %xmm0 -; SSE41-NEXT: por %xmm4, %xmm0 -; SSE41-NEXT: blendvpd %xmm0, %xmm3, %xmm10 -; SSE41-NEXT: movapd {{.*#+}} xmm2 = [18446744071562067968,18446744071562067968] -; SSE41-NEXT: movapd %xmm10, %xmm1 -; SSE41-NEXT: xorpd %xmm5, %xmm1 -; SSE41-NEXT: movdqa {{.*#+}} xmm3 = [18446744069414584320,18446744069414584320] -; SSE41-NEXT: movapd %xmm1, %xmm4 -; SSE41-NEXT: pcmpeqd %xmm3, %xmm4 -; SSE41-NEXT: pcmpgtd %xmm3, %xmm1 -; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm1[0,0,2,2] +; SSE41-NEXT: movapd %xmm1, %xmm11 +; SSE41-NEXT: blendvpd %xmm0, %xmm4, %xmm11 +; SSE41-NEXT: movdqa %xmm10, %xmm0 +; SSE41-NEXT: pxor %xmm3, %xmm0 +; SSE41-NEXT: movdqa %xmm2, %xmm4 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm4 +; SSE41-NEXT: movdqa %xmm2, %xmm6 +; SSE41-NEXT: pcmpgtd %xmm0, %xmm6 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm6[0,0,2,2] ; SSE41-NEXT: pand %xmm4, %xmm0 -; SSE41-NEXT: por %xmm1, %xmm0 -; SSE41-NEXT: movapd %xmm2, %xmm4 +; SSE41-NEXT: por %xmm6, %xmm0 +; SSE41-NEXT: movapd %xmm1, %xmm4 ; SSE41-NEXT: blendvpd %xmm0, %xmm10, %xmm4 -; SSE41-NEXT: movapd %xmm6, %xmm1 -; 
SSE41-NEXT: xorpd %xmm5, %xmm1 +; SSE41-NEXT: movdqa %xmm9, %xmm0 +; SSE41-NEXT: pxor %xmm3, %xmm0 +; SSE41-NEXT: movdqa %xmm2, %xmm6 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm6 +; SSE41-NEXT: pcmpgtd %xmm0, %xmm2 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm2[0,0,2,2] +; SSE41-NEXT: pand %xmm6, %xmm0 +; SSE41-NEXT: por %xmm2, %xmm0 +; SSE41-NEXT: blendvpd %xmm0, %xmm9, %xmm1 +; SSE41-NEXT: movapd {{.*#+}} xmm2 = [18446744071562067968,18446744071562067968] +; SSE41-NEXT: movapd %xmm1, %xmm7 +; SSE41-NEXT: xorpd %xmm3, %xmm7 +; SSE41-NEXT: movdqa {{.*#+}} xmm6 = [18446744069414584320,18446744069414584320] +; SSE41-NEXT: movapd %xmm7, %xmm5 +; SSE41-NEXT: pcmpeqd %xmm6, %xmm5 +; SSE41-NEXT: pcmpgtd %xmm6, %xmm7 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm7[0,0,2,2] +; SSE41-NEXT: pand %xmm5, %xmm0 +; SSE41-NEXT: por %xmm7, %xmm0 +; SSE41-NEXT: movapd %xmm2, %xmm5 +; SSE41-NEXT: blendvpd %xmm0, %xmm1, %xmm5 +; SSE41-NEXT: movapd %xmm4, %xmm1 +; SSE41-NEXT: xorpd %xmm3, %xmm1 ; SSE41-NEXT: movapd %xmm1, %xmm7 -; SSE41-NEXT: pcmpeqd %xmm3, %xmm7 -; SSE41-NEXT: pcmpgtd %xmm3, %xmm1 +; SSE41-NEXT: pcmpeqd %xmm6, %xmm7 +; SSE41-NEXT: pcmpgtd %xmm6, %xmm1 ; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm1[0,0,2,2] ; SSE41-NEXT: pand %xmm7, %xmm0 ; SSE41-NEXT: por %xmm1, %xmm0 ; SSE41-NEXT: movapd %xmm2, %xmm1 -; SSE41-NEXT: blendvpd %xmm0, %xmm6, %xmm1 -; SSE41-NEXT: shufps {{.*#+}} xmm1 = xmm1[0,2],xmm4[0,2] -; SSE41-NEXT: movapd %xmm9, %xmm4 -; SSE41-NEXT: xorpd %xmm5, %xmm4 -; SSE41-NEXT: movapd %xmm4, %xmm6 -; SSE41-NEXT: pcmpeqd %xmm3, %xmm6 -; SSE41-NEXT: pcmpgtd %xmm3, %xmm4 +; SSE41-NEXT: blendvpd %xmm0, %xmm4, %xmm1 +; SSE41-NEXT: shufps {{.*#+}} xmm1 = xmm1[0,2],xmm5[0,2] +; SSE41-NEXT: movapd %xmm11, %xmm4 +; SSE41-NEXT: xorpd %xmm3, %xmm4 +; SSE41-NEXT: movapd %xmm4, %xmm5 +; SSE41-NEXT: pcmpeqd %xmm6, %xmm5 +; SSE41-NEXT: pcmpgtd %xmm6, %xmm4 ; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm4[0,0,2,2] -; SSE41-NEXT: pand %xmm6, %xmm0 +; SSE41-NEXT: pand %xmm5, %xmm0 ; SSE41-NEXT: por %xmm4, 
%xmm0 ; SSE41-NEXT: movapd %xmm2, %xmm4 -; SSE41-NEXT: blendvpd %xmm0, %xmm9, %xmm4 -; SSE41-NEXT: xorpd %xmm8, %xmm5 -; SSE41-NEXT: movapd %xmm5, %xmm6 -; SSE41-NEXT: pcmpeqd %xmm3, %xmm6 -; SSE41-NEXT: pcmpgtd %xmm3, %xmm5 -; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm5[0,0,2,2] -; SSE41-NEXT: pand %xmm6, %xmm0 -; SSE41-NEXT: por %xmm5, %xmm0 +; SSE41-NEXT: blendvpd %xmm0, %xmm11, %xmm4 +; SSE41-NEXT: xorpd %xmm8, %xmm3 +; SSE41-NEXT: movapd %xmm3, %xmm5 +; SSE41-NEXT: pcmpeqd %xmm6, %xmm5 +; SSE41-NEXT: pcmpgtd %xmm6, %xmm3 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm3[0,0,2,2] +; SSE41-NEXT: pand %xmm5, %xmm0 +; SSE41-NEXT: por %xmm3, %xmm0 ; SSE41-NEXT: blendvpd %xmm0, %xmm8, %xmm2 ; SSE41-NEXT: shufps {{.*#+}} xmm2 = xmm2[0,2],xmm4[0,2] ; SSE41-NEXT: movaps %xmm2, %xmm0 @@ -593,33 +612,37 @@ define <8 x i32> @trunc_ssat_v8i64_v8i32 ; ; AVX1-LABEL: trunc_ssat_v8i64_v8i32: ; AVX1: # %bb.0: -; AVX1-NEXT: vextractf128 $1, %ymm1, %xmm2 -; AVX1-NEXT: vmovdqa {{.*#+}} xmm3 = [2147483647,2147483647] -; AVX1-NEXT: vpcmpgtq %xmm2, %xmm3, %xmm8 -; AVX1-NEXT: vpcmpgtq %xmm1, %xmm3, %xmm5 -; AVX1-NEXT: vextractf128 $1, %ymm0, %xmm6 -; AVX1-NEXT: vpcmpgtq %xmm6, %xmm3, %xmm7 -; AVX1-NEXT: vpcmpgtq %xmm0, %xmm3, %xmm4 -; AVX1-NEXT: vblendvpd %xmm4, %xmm0, %xmm3, %xmm0 -; AVX1-NEXT: vmovdqa {{.*#+}} xmm4 = [18446744071562067968,18446744071562067968] -; AVX1-NEXT: vpcmpgtq %xmm4, %xmm0, %xmm9 -; AVX1-NEXT: vblendvpd %xmm7, %xmm6, %xmm3, %xmm6 -; AVX1-NEXT: vpcmpgtq %xmm4, %xmm6, %xmm7 -; AVX1-NEXT: vblendvpd %xmm5, %xmm1, %xmm3, %xmm1 -; AVX1-NEXT: vpcmpgtq %xmm4, %xmm1, %xmm5 -; AVX1-NEXT: vblendvpd %xmm8, %xmm2, %xmm3, %xmm2 -; AVX1-NEXT: vpcmpgtq %xmm4, %xmm2, %xmm3 -; AVX1-NEXT: vblendvpd %xmm3, %xmm2, %xmm4, %xmm2 -; AVX1-NEXT: vblendvpd %xmm5, %xmm1, %xmm4, %xmm1 -; AVX1-NEXT: vshufps {{.*#+}} xmm1 = xmm1[0,2],xmm2[0,2] -; AVX1-NEXT: vblendvpd %xmm7, %xmm6, %xmm4, %xmm2 -; AVX1-NEXT: vblendvpd %xmm9, %xmm0, %xmm4, %xmm0 -; AVX1-NEXT: vshufps {{.*#+}} xmm0 = xmm0[0,2],xmm2[0,2] 
-; AVX1-NEXT: vinsertf128 $1, %xmm1, %ymm0, %ymm0 +; AVX1-NEXT: vmovdqa (%rdi), %xmm0 +; AVX1-NEXT: vmovdqa 16(%rdi), %xmm1 +; AVX1-NEXT: vmovdqa 32(%rdi), %xmm2 +; AVX1-NEXT: vmovdqa 48(%rdi), %xmm3 +; AVX1-NEXT: vmovdqa {{.*#+}} xmm4 = [2147483647,2147483647] +; AVX1-NEXT: vpcmpgtq %xmm3, %xmm4, %xmm8 +; AVX1-NEXT: vpcmpgtq %xmm2, %xmm4, %xmm6 +; AVX1-NEXT: vpcmpgtq %xmm1, %xmm4, %xmm7 +; AVX1-NEXT: vpcmpgtq %xmm0, %xmm4, %xmm5 +; AVX1-NEXT: vblendvpd %xmm5, %xmm0, %xmm4, %xmm0 +; AVX1-NEXT: vmovdqa {{.*#+}} xmm5 = [18446744071562067968,18446744071562067968] +; AVX1-NEXT: vpcmpgtq %xmm5, %xmm0, %xmm9 +; AVX1-NEXT: vblendvpd %xmm7, %xmm1, %xmm4, %xmm1 +; AVX1-NEXT: vpcmpgtq %xmm5, %xmm1, %xmm7 +; AVX1-NEXT: vblendvpd %xmm6, %xmm2, %xmm4, %xmm2 +; AVX1-NEXT: vpcmpgtq %xmm5, %xmm2, %xmm6 +; AVX1-NEXT: vblendvpd %xmm8, %xmm3, %xmm4, %xmm3 +; AVX1-NEXT: vpcmpgtq %xmm5, %xmm3, %xmm4 +; AVX1-NEXT: vblendvpd %xmm4, %xmm3, %xmm5, %xmm3 +; AVX1-NEXT: vblendvpd %xmm6, %xmm2, %xmm5, %xmm2 +; AVX1-NEXT: vshufps {{.*#+}} xmm2 = xmm2[0,2],xmm3[0,2] +; AVX1-NEXT: vblendvpd %xmm7, %xmm1, %xmm5, %xmm1 +; AVX1-NEXT: vblendvpd %xmm9, %xmm0, %xmm5, %xmm0 +; AVX1-NEXT: vshufps {{.*#+}} xmm0 = xmm0[0,2],xmm1[0,2] +; AVX1-NEXT: vinsertf128 $1, %xmm2, %ymm0, %ymm0 ; AVX1-NEXT: retq ; ; AVX2-SLOW-LABEL: trunc_ssat_v8i64_v8i32: ; AVX2-SLOW: # %bb.0: +; AVX2-SLOW-NEXT: vmovdqa (%rdi), %ymm0 +; AVX2-SLOW-NEXT: vmovdqa 32(%rdi), %ymm1 ; AVX2-SLOW-NEXT: vpbroadcastq {{.*#+}} ymm2 = [2147483647,2147483647,2147483647,2147483647] ; AVX2-SLOW-NEXT: vpcmpgtq %ymm1, %ymm2, %ymm3 ; AVX2-SLOW-NEXT: vblendvpd %ymm3, %ymm1, %ymm2, %ymm1 @@ -639,6 +662,8 @@ define <8 x i32> @trunc_ssat_v8i64_v8i32 ; ; AVX2-FAST-LABEL: trunc_ssat_v8i64_v8i32: ; AVX2-FAST: # %bb.0: +; AVX2-FAST-NEXT: vmovdqa (%rdi), %ymm0 +; AVX2-FAST-NEXT: vmovdqa 32(%rdi), %ymm1 ; AVX2-FAST-NEXT: vpbroadcastq {{.*#+}} ymm2 = [2147483647,2147483647,2147483647,2147483647] ; AVX2-FAST-NEXT: vpcmpgtq %ymm0, %ymm2, %ymm3 ; AVX2-FAST-NEXT: 
vblendvpd %ymm3, %ymm0, %ymm2, %ymm0 @@ -657,8 +682,19 @@ define <8 x i32> @trunc_ssat_v8i64_v8i32 ; ; AVX512-LABEL: trunc_ssat_v8i64_v8i32: ; AVX512: # %bb.0: +; AVX512-NEXT: vmovdqa64 (%rdi), %zmm0 ; AVX512-NEXT: vpmovsqd %zmm0, %ymm0 ; AVX512-NEXT: retq +; +; SKX-LABEL: trunc_ssat_v8i64_v8i32: +; SKX: # %bb.0: +; SKX-NEXT: vmovdqa (%rdi), %ymm0 +; SKX-NEXT: vmovdqa 32(%rdi), %ymm1 +; SKX-NEXT: vpmovsqd %ymm0, %xmm0 +; SKX-NEXT: vpmovsqd %ymm1, %xmm1 +; SKX-NEXT: vinserti128 $1, %xmm1, %ymm0, %ymm0 +; SKX-NEXT: retq + %a0 = load <8 x i64>, <8 x i64>* %p0 %1 = icmp slt <8 x i64> %a0, %2 = select <8 x i1> %1, <8 x i64> %a0, <8 x i64> %3 = icmp sgt <8 x i64> %2, @@ -938,6 +974,12 @@ define <4 x i16> @trunc_ssat_v4i64_v4i16 ; AVX512BWVL-NEXT: vpmovsqw %ymm0, %xmm0 ; AVX512BWVL-NEXT: vzeroupper ; AVX512BWVL-NEXT: retq +; +; SKX-LABEL: trunc_ssat_v4i64_v4i16: +; SKX: # %bb.0: +; SKX-NEXT: vpmovsqw %ymm0, %xmm0 +; SKX-NEXT: vzeroupper +; SKX-NEXT: retq %1 = icmp slt <4 x i64> %a0, %2 = select <4 x i1> %1, <4 x i64> %a0, <4 x i64> %3 = icmp sgt <4 x i64> %2, @@ -1221,6 +1263,12 @@ define void @trunc_ssat_v4i64_v4i16_stor ; AVX512BWVL-NEXT: vpmovsqw %ymm0, (%rdi) ; AVX512BWVL-NEXT: vzeroupper ; AVX512BWVL-NEXT: retq +; +; SKX-LABEL: trunc_ssat_v4i64_v4i16_store: +; SKX: # %bb.0: +; SKX-NEXT: vpmovsqw %ymm0, (%rdi) +; SKX-NEXT: vzeroupper +; SKX-NEXT: retq %1 = icmp slt <4 x i64> %a0, %2 = select <4 x i1> %1, <4 x i64> %a0, <4 x i64> %3 = icmp sgt <4 x i64> %2, @@ -1230,81 +1278,85 @@ define void @trunc_ssat_v4i64_v4i16_stor ret void } -define <8 x i16> @trunc_ssat_v8i64_v8i16(<8 x i64> %a0) { +define <8 x i16> @trunc_ssat_v8i64_v8i16(<8 x i64>* %p0) "min-legal-vector-width"="256" { ; SSE2-LABEL: trunc_ssat_v8i64_v8i16: ; SSE2: # %bb.0: +; SSE2-NEXT: movdqa (%rdi), %xmm6 +; SSE2-NEXT: movdqa 16(%rdi), %xmm9 +; SSE2-NEXT: movdqa 32(%rdi), %xmm3 +; SSE2-NEXT: movdqa 48(%rdi), %xmm5 ; SSE2-NEXT: movdqa {{.*#+}} xmm8 = [32767,32767] -; SSE2-NEXT: movdqa {{.*#+}} xmm4 = 
[2147483648,2147483648] -; SSE2-NEXT: movdqa %xmm2, %xmm5 -; SSE2-NEXT: pxor %xmm4, %xmm5 -; SSE2-NEXT: movdqa {{.*#+}} xmm9 = [2147516415,2147516415] -; SSE2-NEXT: movdqa %xmm9, %xmm7 -; SSE2-NEXT: pcmpgtd %xmm5, %xmm7 -; SSE2-NEXT: pshufd {{.*#+}} xmm10 = xmm7[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm9, %xmm5 -; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm5[1,1,3,3] -; SSE2-NEXT: pand %xmm10, %xmm6 -; SSE2-NEXT: pshufd {{.*#+}} xmm5 = xmm7[1,1,3,3] -; SSE2-NEXT: por %xmm6, %xmm5 -; SSE2-NEXT: pand %xmm5, %xmm2 -; SSE2-NEXT: pandn %xmm8, %xmm5 -; SSE2-NEXT: por %xmm2, %xmm5 +; SSE2-NEXT: movdqa {{.*#+}} xmm1 = [2147483648,2147483648] ; SSE2-NEXT: movdqa %xmm3, %xmm2 -; SSE2-NEXT: pxor %xmm4, %xmm2 -; SSE2-NEXT: movdqa %xmm9, %xmm6 -; SSE2-NEXT: pcmpgtd %xmm2, %xmm6 -; SSE2-NEXT: pshufd {{.*#+}} xmm10 = xmm6[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm9, %xmm2 -; SSE2-NEXT: pshufd {{.*#+}} xmm7 = xmm2[1,1,3,3] -; SSE2-NEXT: pand %xmm10, %xmm7 -; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm6[1,1,3,3] -; SSE2-NEXT: por %xmm7, %xmm2 +; SSE2-NEXT: pxor %xmm1, %xmm2 +; SSE2-NEXT: movdqa {{.*#+}} xmm10 = [2147516415,2147516415] +; SSE2-NEXT: movdqa %xmm10, %xmm7 +; SSE2-NEXT: pcmpgtd %xmm2, %xmm7 +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm7[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm10, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm2[1,1,3,3] +; SSE2-NEXT: pand %xmm0, %xmm4 +; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm7[1,1,3,3] +; SSE2-NEXT: por %xmm4, %xmm2 ; SSE2-NEXT: pand %xmm2, %xmm3 ; SSE2-NEXT: pandn %xmm8, %xmm2 ; SSE2-NEXT: por %xmm3, %xmm2 -; SSE2-NEXT: movdqa %xmm0, %xmm3 -; SSE2-NEXT: pxor %xmm4, %xmm3 -; SSE2-NEXT: movdqa %xmm9, %xmm6 -; SSE2-NEXT: pcmpgtd %xmm3, %xmm6 -; SSE2-NEXT: pshufd {{.*#+}} xmm10 = xmm6[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm9, %xmm3 -; SSE2-NEXT: pshufd {{.*#+}} xmm7 = xmm3[1,1,3,3] -; SSE2-NEXT: pand %xmm10, %xmm7 -; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm6[1,1,3,3] -; SSE2-NEXT: por %xmm7, %xmm3 -; SSE2-NEXT: pand %xmm3, %xmm0 -; SSE2-NEXT: pandn %xmm8, %xmm3 +; SSE2-NEXT: movdqa 
%xmm5, %xmm0 +; SSE2-NEXT: pxor %xmm1, %xmm0 +; SSE2-NEXT: movdqa %xmm10, %xmm3 +; SSE2-NEXT: pcmpgtd %xmm0, %xmm3 +; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm3[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm10, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] +; SSE2-NEXT: pand %xmm4, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm3[1,1,3,3] ; SSE2-NEXT: por %xmm0, %xmm3 -; SSE2-NEXT: movdqa %xmm1, %xmm0 -; SSE2-NEXT: pxor %xmm4, %xmm0 -; SSE2-NEXT: movdqa %xmm9, %xmm6 -; SSE2-NEXT: pcmpgtd %xmm0, %xmm6 -; SSE2-NEXT: pshufd {{.*#+}} xmm7 = xmm6[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm9, %xmm0 +; SSE2-NEXT: pand %xmm3, %xmm5 +; SSE2-NEXT: pandn %xmm8, %xmm3 +; SSE2-NEXT: por %xmm5, %xmm3 +; SSE2-NEXT: movdqa %xmm6, %xmm0 +; SSE2-NEXT: pxor %xmm1, %xmm0 +; SSE2-NEXT: movdqa %xmm10, %xmm4 +; SSE2-NEXT: pcmpgtd %xmm0, %xmm4 +; SSE2-NEXT: pshufd {{.*#+}} xmm5 = xmm4[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm10, %xmm0 ; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] -; SSE2-NEXT: pand %xmm7, %xmm0 -; SSE2-NEXT: pshufd {{.*#+}} xmm7 = xmm6[1,1,3,3] +; SSE2-NEXT: pand %xmm5, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm5 = xmm4[1,1,3,3] +; SSE2-NEXT: por %xmm0, %xmm5 +; SSE2-NEXT: pand %xmm5, %xmm6 +; SSE2-NEXT: pandn %xmm8, %xmm5 +; SSE2-NEXT: por %xmm6, %xmm5 +; SSE2-NEXT: movdqa %xmm9, %xmm0 +; SSE2-NEXT: pxor %xmm1, %xmm0 +; SSE2-NEXT: movdqa %xmm10, %xmm4 +; SSE2-NEXT: pcmpgtd %xmm0, %xmm4 +; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm4[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm10, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] +; SSE2-NEXT: pand %xmm6, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm7 = xmm4[1,1,3,3] ; SSE2-NEXT: por %xmm0, %xmm7 -; SSE2-NEXT: pand %xmm7, %xmm1 +; SSE2-NEXT: pand %xmm7, %xmm9 ; SSE2-NEXT: pandn %xmm8, %xmm7 -; SSE2-NEXT: por %xmm1, %xmm7 +; SSE2-NEXT: por %xmm9, %xmm7 ; SSE2-NEXT: movdqa {{.*#+}} xmm8 = [18446744073709518848,18446744073709518848] ; SSE2-NEXT: movdqa %xmm7, %xmm0 -; SSE2-NEXT: pxor %xmm4, %xmm0 +; SSE2-NEXT: pxor %xmm1, %xmm0 ; SSE2-NEXT: movdqa {{.*#+}} 
xmm9 = [18446744071562035200,18446744071562035200] -; SSE2-NEXT: movdqa %xmm0, %xmm1 -; SSE2-NEXT: pcmpgtd %xmm9, %xmm1 -; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm1[0,0,2,2] +; SSE2-NEXT: movdqa %xmm0, %xmm4 +; SSE2-NEXT: pcmpgtd %xmm9, %xmm4 +; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm4[0,0,2,2] ; SSE2-NEXT: pcmpeqd %xmm9, %xmm0 ; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] ; SSE2-NEXT: pand %xmm6, %xmm0 -; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] -; SSE2-NEXT: por %xmm0, %xmm1 -; SSE2-NEXT: pand %xmm1, %xmm7 -; SSE2-NEXT: pandn %xmm8, %xmm1 -; SSE2-NEXT: por %xmm7, %xmm1 -; SSE2-NEXT: movdqa %xmm3, %xmm0 -; SSE2-NEXT: pxor %xmm4, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm4[1,1,3,3] +; SSE2-NEXT: por %xmm0, %xmm4 +; SSE2-NEXT: pand %xmm4, %xmm7 +; SSE2-NEXT: pandn %xmm8, %xmm4 +; SSE2-NEXT: por %xmm7, %xmm4 +; SSE2-NEXT: movdqa %xmm5, %xmm0 +; SSE2-NEXT: pxor %xmm1, %xmm0 ; SSE2-NEXT: movdqa %xmm0, %xmm6 ; SSE2-NEXT: pcmpgtd %xmm9, %xmm6 ; SSE2-NEXT: pshufd {{.*#+}} xmm10 = xmm6[0,0,2,2] @@ -1313,113 +1365,117 @@ define <8 x i16> @trunc_ssat_v8i64_v8i16 ; SSE2-NEXT: pand %xmm10, %xmm7 ; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm6[1,1,3,3] ; SSE2-NEXT: por %xmm7, %xmm0 -; SSE2-NEXT: pand %xmm0, %xmm3 +; SSE2-NEXT: pand %xmm0, %xmm5 ; SSE2-NEXT: pandn %xmm8, %xmm0 -; SSE2-NEXT: por %xmm3, %xmm0 -; SSE2-NEXT: packssdw %xmm1, %xmm0 -; SSE2-NEXT: movdqa %xmm2, %xmm1 -; SSE2-NEXT: pxor %xmm4, %xmm1 +; SSE2-NEXT: por %xmm5, %xmm0 +; SSE2-NEXT: packssdw %xmm4, %xmm0 +; SSE2-NEXT: movdqa %xmm3, %xmm4 +; SSE2-NEXT: pxor %xmm1, %xmm4 +; SSE2-NEXT: movdqa %xmm4, %xmm5 +; SSE2-NEXT: pcmpgtd %xmm9, %xmm5 +; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm5[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm9, %xmm4 +; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm4[1,1,3,3] +; SSE2-NEXT: pand %xmm6, %xmm4 +; SSE2-NEXT: pshufd {{.*#+}} xmm5 = xmm5[1,1,3,3] +; SSE2-NEXT: por %xmm4, %xmm5 +; SSE2-NEXT: pand %xmm5, %xmm3 +; SSE2-NEXT: pandn %xmm8, %xmm5 +; SSE2-NEXT: por %xmm3, %xmm5 +; SSE2-NEXT: pxor %xmm2, 
%xmm1 ; SSE2-NEXT: movdqa %xmm1, %xmm3 ; SSE2-NEXT: pcmpgtd %xmm9, %xmm3 -; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm3[0,0,2,2] +; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm3[0,0,2,2] ; SSE2-NEXT: pcmpeqd %xmm9, %xmm1 ; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] -; SSE2-NEXT: pand %xmm6, %xmm1 +; SSE2-NEXT: pand %xmm4, %xmm1 ; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm3[1,1,3,3] ; SSE2-NEXT: por %xmm1, %xmm3 ; SSE2-NEXT: pand %xmm3, %xmm2 ; SSE2-NEXT: pandn %xmm8, %xmm3 ; SSE2-NEXT: por %xmm2, %xmm3 -; SSE2-NEXT: pxor %xmm5, %xmm4 -; SSE2-NEXT: movdqa %xmm4, %xmm1 -; SSE2-NEXT: pcmpgtd %xmm9, %xmm1 -; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm1[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm9, %xmm4 -; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm4[1,1,3,3] -; SSE2-NEXT: pand %xmm2, %xmm4 -; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] -; SSE2-NEXT: por %xmm4, %xmm1 -; SSE2-NEXT: pand %xmm1, %xmm5 -; SSE2-NEXT: pandn %xmm8, %xmm1 -; SSE2-NEXT: por %xmm5, %xmm1 -; SSE2-NEXT: packssdw %xmm3, %xmm1 -; SSE2-NEXT: packssdw %xmm1, %xmm0 +; SSE2-NEXT: packssdw %xmm5, %xmm3 +; SSE2-NEXT: packssdw %xmm3, %xmm0 ; SSE2-NEXT: retq ; ; SSSE3-LABEL: trunc_ssat_v8i64_v8i16: ; SSSE3: # %bb.0: +; SSSE3-NEXT: movdqa (%rdi), %xmm6 +; SSSE3-NEXT: movdqa 16(%rdi), %xmm9 +; SSSE3-NEXT: movdqa 32(%rdi), %xmm3 +; SSSE3-NEXT: movdqa 48(%rdi), %xmm5 ; SSSE3-NEXT: movdqa {{.*#+}} xmm8 = [32767,32767] -; SSSE3-NEXT: movdqa {{.*#+}} xmm4 = [2147483648,2147483648] -; SSSE3-NEXT: movdqa %xmm2, %xmm5 -; SSSE3-NEXT: pxor %xmm4, %xmm5 -; SSSE3-NEXT: movdqa {{.*#+}} xmm9 = [2147516415,2147516415] -; SSSE3-NEXT: movdqa %xmm9, %xmm7 -; SSSE3-NEXT: pcmpgtd %xmm5, %xmm7 -; SSSE3-NEXT: pshufd {{.*#+}} xmm10 = xmm7[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm9, %xmm5 -; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm5[1,1,3,3] -; SSSE3-NEXT: pand %xmm10, %xmm6 -; SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm7[1,1,3,3] -; SSSE3-NEXT: por %xmm6, %xmm5 -; SSSE3-NEXT: pand %xmm5, %xmm2 -; SSSE3-NEXT: pandn %xmm8, %xmm5 -; SSSE3-NEXT: por %xmm2, %xmm5 +; 
SSSE3-NEXT: movdqa {{.*#+}} xmm1 = [2147483648,2147483648] ; SSSE3-NEXT: movdqa %xmm3, %xmm2 -; SSSE3-NEXT: pxor %xmm4, %xmm2 -; SSSE3-NEXT: movdqa %xmm9, %xmm6 -; SSSE3-NEXT: pcmpgtd %xmm2, %xmm6 -; SSSE3-NEXT: pshufd {{.*#+}} xmm10 = xmm6[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm9, %xmm2 -; SSSE3-NEXT: pshufd {{.*#+}} xmm7 = xmm2[1,1,3,3] -; SSSE3-NEXT: pand %xmm10, %xmm7 -; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm6[1,1,3,3] -; SSSE3-NEXT: por %xmm7, %xmm2 +; SSSE3-NEXT: pxor %xmm1, %xmm2 +; SSSE3-NEXT: movdqa {{.*#+}} xmm10 = [2147516415,2147516415] +; SSSE3-NEXT: movdqa %xmm10, %xmm7 +; SSSE3-NEXT: pcmpgtd %xmm2, %xmm7 +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm7[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm10, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm2[1,1,3,3] +; SSSE3-NEXT: pand %xmm0, %xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm7[1,1,3,3] +; SSSE3-NEXT: por %xmm4, %xmm2 ; SSSE3-NEXT: pand %xmm2, %xmm3 ; SSSE3-NEXT: pandn %xmm8, %xmm2 ; SSSE3-NEXT: por %xmm3, %xmm2 -; SSSE3-NEXT: movdqa %xmm0, %xmm3 -; SSSE3-NEXT: pxor %xmm4, %xmm3 -; SSSE3-NEXT: movdqa %xmm9, %xmm6 -; SSSE3-NEXT: pcmpgtd %xmm3, %xmm6 -; SSSE3-NEXT: pshufd {{.*#+}} xmm10 = xmm6[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm9, %xmm3 -; SSSE3-NEXT: pshufd {{.*#+}} xmm7 = xmm3[1,1,3,3] -; SSSE3-NEXT: pand %xmm10, %xmm7 -; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm6[1,1,3,3] -; SSSE3-NEXT: por %xmm7, %xmm3 -; SSSE3-NEXT: pand %xmm3, %xmm0 -; SSSE3-NEXT: pandn %xmm8, %xmm3 +; SSSE3-NEXT: movdqa %xmm5, %xmm0 +; SSSE3-NEXT: pxor %xmm1, %xmm0 +; SSSE3-NEXT: movdqa %xmm10, %xmm3 +; SSSE3-NEXT: pcmpgtd %xmm0, %xmm3 +; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm3[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm10, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] +; SSSE3-NEXT: pand %xmm4, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm3[1,1,3,3] ; SSSE3-NEXT: por %xmm0, %xmm3 -; SSSE3-NEXT: movdqa %xmm1, %xmm0 -; SSSE3-NEXT: pxor %xmm4, %xmm0 -; SSSE3-NEXT: movdqa %xmm9, %xmm6 -; SSSE3-NEXT: pcmpgtd %xmm0, %xmm6 -; SSSE3-NEXT: 
pshufd {{.*#+}} xmm7 = xmm6[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm9, %xmm0 +; SSSE3-NEXT: pand %xmm3, %xmm5 +; SSSE3-NEXT: pandn %xmm8, %xmm3 +; SSSE3-NEXT: por %xmm5, %xmm3 +; SSSE3-NEXT: movdqa %xmm6, %xmm0 +; SSSE3-NEXT: pxor %xmm1, %xmm0 +; SSSE3-NEXT: movdqa %xmm10, %xmm4 +; SSSE3-NEXT: pcmpgtd %xmm0, %xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm4[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm10, %xmm0 ; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] -; SSSE3-NEXT: pand %xmm7, %xmm0 -; SSSE3-NEXT: pshufd {{.*#+}} xmm7 = xmm6[1,1,3,3] +; SSSE3-NEXT: pand %xmm5, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm4[1,1,3,3] +; SSSE3-NEXT: por %xmm0, %xmm5 +; SSSE3-NEXT: pand %xmm5, %xmm6 +; SSSE3-NEXT: pandn %xmm8, %xmm5 +; SSSE3-NEXT: por %xmm6, %xmm5 +; SSSE3-NEXT: movdqa %xmm9, %xmm0 +; SSSE3-NEXT: pxor %xmm1, %xmm0 +; SSSE3-NEXT: movdqa %xmm10, %xmm4 +; SSSE3-NEXT: pcmpgtd %xmm0, %xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm4[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm10, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] +; SSSE3-NEXT: pand %xmm6, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm7 = xmm4[1,1,3,3] ; SSSE3-NEXT: por %xmm0, %xmm7 -; SSSE3-NEXT: pand %xmm7, %xmm1 +; SSSE3-NEXT: pand %xmm7, %xmm9 ; SSSE3-NEXT: pandn %xmm8, %xmm7 -; SSSE3-NEXT: por %xmm1, %xmm7 +; SSSE3-NEXT: por %xmm9, %xmm7 ; SSSE3-NEXT: movdqa {{.*#+}} xmm8 = [18446744073709518848,18446744073709518848] ; SSSE3-NEXT: movdqa %xmm7, %xmm0 -; SSSE3-NEXT: pxor %xmm4, %xmm0 +; SSSE3-NEXT: pxor %xmm1, %xmm0 ; SSSE3-NEXT: movdqa {{.*#+}} xmm9 = [18446744071562035200,18446744071562035200] -; SSSE3-NEXT: movdqa %xmm0, %xmm1 -; SSSE3-NEXT: pcmpgtd %xmm9, %xmm1 -; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm1[0,0,2,2] +; SSSE3-NEXT: movdqa %xmm0, %xmm4 +; SSSE3-NEXT: pcmpgtd %xmm9, %xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm4[0,0,2,2] ; SSSE3-NEXT: pcmpeqd %xmm9, %xmm0 ; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] ; SSSE3-NEXT: pand %xmm6, %xmm0 -; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] -; 
SSSE3-NEXT: por %xmm0, %xmm1 -; SSSE3-NEXT: pand %xmm1, %xmm7 -; SSSE3-NEXT: pandn %xmm8, %xmm1 -; SSSE3-NEXT: por %xmm7, %xmm1 -; SSSE3-NEXT: movdqa %xmm3, %xmm0 -; SSSE3-NEXT: pxor %xmm4, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm4[1,1,3,3] +; SSSE3-NEXT: por %xmm0, %xmm4 +; SSSE3-NEXT: pand %xmm4, %xmm7 +; SSSE3-NEXT: pandn %xmm8, %xmm4 +; SSSE3-NEXT: por %xmm7, %xmm4 +; SSSE3-NEXT: movdqa %xmm5, %xmm0 +; SSSE3-NEXT: pxor %xmm1, %xmm0 ; SSSE3-NEXT: movdqa %xmm0, %xmm6 ; SSSE3-NEXT: pcmpgtd %xmm9, %xmm6 ; SSSE3-NEXT: pshufd {{.*#+}} xmm10 = xmm6[0,0,2,2] @@ -1428,46 +1484,49 @@ define <8 x i16> @trunc_ssat_v8i64_v8i16 ; SSSE3-NEXT: pand %xmm10, %xmm7 ; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm6[1,1,3,3] ; SSSE3-NEXT: por %xmm7, %xmm0 -; SSSE3-NEXT: pand %xmm0, %xmm3 +; SSSE3-NEXT: pand %xmm0, %xmm5 ; SSSE3-NEXT: pandn %xmm8, %xmm0 -; SSSE3-NEXT: por %xmm3, %xmm0 -; SSSE3-NEXT: packssdw %xmm1, %xmm0 -; SSSE3-NEXT: movdqa %xmm2, %xmm1 -; SSSE3-NEXT: pxor %xmm4, %xmm1 +; SSSE3-NEXT: por %xmm5, %xmm0 +; SSSE3-NEXT: packssdw %xmm4, %xmm0 +; SSSE3-NEXT: movdqa %xmm3, %xmm4 +; SSSE3-NEXT: pxor %xmm1, %xmm4 +; SSSE3-NEXT: movdqa %xmm4, %xmm5 +; SSSE3-NEXT: pcmpgtd %xmm9, %xmm5 +; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm5[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm9, %xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm4[1,1,3,3] +; SSSE3-NEXT: pand %xmm6, %xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm5[1,1,3,3] +; SSSE3-NEXT: por %xmm4, %xmm5 +; SSSE3-NEXT: pand %xmm5, %xmm3 +; SSSE3-NEXT: pandn %xmm8, %xmm5 +; SSSE3-NEXT: por %xmm3, %xmm5 +; SSSE3-NEXT: pxor %xmm2, %xmm1 ; SSSE3-NEXT: movdqa %xmm1, %xmm3 ; SSSE3-NEXT: pcmpgtd %xmm9, %xmm3 -; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm3[0,0,2,2] +; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm3[0,0,2,2] ; SSSE3-NEXT: pcmpeqd %xmm9, %xmm1 ; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] -; SSSE3-NEXT: pand %xmm6, %xmm1 +; SSSE3-NEXT: pand %xmm4, %xmm1 ; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm3[1,1,3,3] ; SSSE3-NEXT: por %xmm1, %xmm3 ; 
SSSE3-NEXT: pand %xmm3, %xmm2 ; SSSE3-NEXT: pandn %xmm8, %xmm3 ; SSSE3-NEXT: por %xmm2, %xmm3 -; SSSE3-NEXT: pxor %xmm5, %xmm4 -; SSSE3-NEXT: movdqa %xmm4, %xmm1 -; SSSE3-NEXT: pcmpgtd %xmm9, %xmm1 -; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm1[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm9, %xmm4 -; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm4[1,1,3,3] -; SSSE3-NEXT: pand %xmm2, %xmm4 -; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] -; SSSE3-NEXT: por %xmm4, %xmm1 -; SSSE3-NEXT: pand %xmm1, %xmm5 -; SSSE3-NEXT: pandn %xmm8, %xmm1 -; SSSE3-NEXT: por %xmm5, %xmm1 -; SSSE3-NEXT: packssdw %xmm3, %xmm1 -; SSSE3-NEXT: packssdw %xmm1, %xmm0 +; SSSE3-NEXT: packssdw %xmm5, %xmm3 +; SSSE3-NEXT: packssdw %xmm3, %xmm0 ; SSSE3-NEXT: retq ; ; SSE41-LABEL: trunc_ssat_v8i64_v8i16: ; SSE41: # %bb.0: -; SSE41-NEXT: movdqa %xmm0, %xmm10 -; SSE41-NEXT: movapd {{.*#+}} xmm11 = [32767,32767] -; SSE41-NEXT: movdqa {{.*#+}} xmm5 = [2147483648,2147483648] -; SSE41-NEXT: movdqa %xmm2, %xmm0 -; SSE41-NEXT: pxor %xmm5, %xmm0 +; SSE41-NEXT: movdqa (%rdi), %xmm10 +; SSE41-NEXT: movdqa 16(%rdi), %xmm9 +; SSE41-NEXT: movdqa 32(%rdi), %xmm3 +; SSE41-NEXT: movdqa 48(%rdi), %xmm5 +; SSE41-NEXT: movapd {{.*#+}} xmm1 = [32767,32767] +; SSE41-NEXT: movdqa {{.*#+}} xmm2 = [2147483648,2147483648] +; SSE41-NEXT: movdqa %xmm3, %xmm0 +; SSE41-NEXT: pxor %xmm2, %xmm0 ; SSE41-NEXT: movdqa {{.*#+}} xmm4 = [2147516415,2147516415] ; SSE41-NEXT: movdqa %xmm4, %xmm7 ; SSE41-NEXT: pcmpeqd %xmm0, %xmm7 @@ -1476,115 +1535,118 @@ define <8 x i16> @trunc_ssat_v8i64_v8i16 ; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm6[0,0,2,2] ; SSE41-NEXT: pand %xmm7, %xmm0 ; SSE41-NEXT: por %xmm6, %xmm0 -; SSE41-NEXT: movapd %xmm11, %xmm8 -; SSE41-NEXT: blendvpd %xmm0, %xmm2, %xmm8 -; SSE41-NEXT: movdqa %xmm3, %xmm0 -; SSE41-NEXT: pxor %xmm5, %xmm0 -; SSE41-NEXT: movdqa %xmm4, %xmm2 -; SSE41-NEXT: pcmpeqd %xmm0, %xmm2 +; SSE41-NEXT: movapd %xmm1, %xmm8 +; SSE41-NEXT: blendvpd %xmm0, %xmm3, %xmm8 +; SSE41-NEXT: movdqa %xmm5, %xmm0 +; SSE41-NEXT: pxor 
%xmm2, %xmm0 +; SSE41-NEXT: movdqa %xmm4, %xmm3 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm3 ; SSE41-NEXT: movdqa %xmm4, %xmm6 ; SSE41-NEXT: pcmpgtd %xmm0, %xmm6 ; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm6[0,0,2,2] -; SSE41-NEXT: pand %xmm2, %xmm0 +; SSE41-NEXT: pand %xmm3, %xmm0 ; SSE41-NEXT: por %xmm6, %xmm0 -; SSE41-NEXT: movapd %xmm11, %xmm9 -; SSE41-NEXT: blendvpd %xmm0, %xmm3, %xmm9 +; SSE41-NEXT: movapd %xmm1, %xmm11 +; SSE41-NEXT: blendvpd %xmm0, %xmm5, %xmm11 ; SSE41-NEXT: movdqa %xmm10, %xmm0 -; SSE41-NEXT: pxor %xmm5, %xmm0 -; SSE41-NEXT: movdqa %xmm4, %xmm2 -; SSE41-NEXT: pcmpeqd %xmm0, %xmm2 -; SSE41-NEXT: movdqa %xmm4, %xmm3 -; SSE41-NEXT: pcmpgtd %xmm0, %xmm3 -; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm3[0,0,2,2] -; SSE41-NEXT: pand %xmm2, %xmm0 -; SSE41-NEXT: por %xmm3, %xmm0 -; SSE41-NEXT: movapd %xmm11, %xmm2 -; SSE41-NEXT: blendvpd %xmm0, %xmm10, %xmm2 -; SSE41-NEXT: movdqa %xmm1, %xmm0 -; SSE41-NEXT: pxor %xmm5, %xmm0 +; SSE41-NEXT: pxor %xmm2, %xmm0 ; SSE41-NEXT: movdqa %xmm4, %xmm3 ; SSE41-NEXT: pcmpeqd %xmm0, %xmm3 +; SSE41-NEXT: movdqa %xmm4, %xmm5 +; SSE41-NEXT: pcmpgtd %xmm0, %xmm5 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm5[0,0,2,2] +; SSE41-NEXT: pand %xmm3, %xmm0 +; SSE41-NEXT: por %xmm5, %xmm0 +; SSE41-NEXT: movapd %xmm1, %xmm3 +; SSE41-NEXT: blendvpd %xmm0, %xmm10, %xmm3 +; SSE41-NEXT: movdqa %xmm9, %xmm0 +; SSE41-NEXT: pxor %xmm2, %xmm0 +; SSE41-NEXT: movdqa %xmm4, %xmm5 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm5 ; SSE41-NEXT: pcmpgtd %xmm0, %xmm4 ; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm4[0,0,2,2] -; SSE41-NEXT: pand %xmm3, %xmm0 +; SSE41-NEXT: pand %xmm5, %xmm0 ; SSE41-NEXT: por %xmm4, %xmm0 -; SSE41-NEXT: blendvpd %xmm0, %xmm1, %xmm11 -; SSE41-NEXT: movapd {{.*#+}} xmm3 = [18446744073709518848,18446744073709518848] -; SSE41-NEXT: movapd %xmm11, %xmm1 -; SSE41-NEXT: xorpd %xmm5, %xmm1 -; SSE41-NEXT: movdqa {{.*#+}} xmm4 = [18446744071562035200,18446744071562035200] -; SSE41-NEXT: movapd %xmm1, %xmm6 -; SSE41-NEXT: pcmpeqd %xmm4, %xmm6 -; SSE41-NEXT: 
pcmpgtd %xmm4, %xmm1 -; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm1[0,0,2,2] -; SSE41-NEXT: pand %xmm6, %xmm0 -; SSE41-NEXT: por %xmm1, %xmm0 -; SSE41-NEXT: movapd %xmm3, %xmm6 -; SSE41-NEXT: blendvpd %xmm0, %xmm11, %xmm6 -; SSE41-NEXT: movapd %xmm2, %xmm1 -; SSE41-NEXT: xorpd %xmm5, %xmm1 +; SSE41-NEXT: blendvpd %xmm0, %xmm9, %xmm1 +; SSE41-NEXT: movapd {{.*#+}} xmm5 = [18446744073709518848,18446744073709518848] +; SSE41-NEXT: movapd %xmm1, %xmm4 +; SSE41-NEXT: xorpd %xmm2, %xmm4 +; SSE41-NEXT: movdqa {{.*#+}} xmm6 = [18446744071562035200,18446744071562035200] +; SSE41-NEXT: movapd %xmm4, %xmm7 +; SSE41-NEXT: pcmpeqd %xmm6, %xmm7 +; SSE41-NEXT: pcmpgtd %xmm6, %xmm4 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm4[0,0,2,2] +; SSE41-NEXT: pand %xmm7, %xmm0 +; SSE41-NEXT: por %xmm4, %xmm0 +; SSE41-NEXT: movapd %xmm5, %xmm4 +; SSE41-NEXT: blendvpd %xmm0, %xmm1, %xmm4 +; SSE41-NEXT: movapd %xmm3, %xmm1 +; SSE41-NEXT: xorpd %xmm2, %xmm1 ; SSE41-NEXT: movapd %xmm1, %xmm7 -; SSE41-NEXT: pcmpeqd %xmm4, %xmm7 -; SSE41-NEXT: pcmpgtd %xmm4, %xmm1 +; SSE41-NEXT: pcmpeqd %xmm6, %xmm7 +; SSE41-NEXT: pcmpgtd %xmm6, %xmm1 ; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm1[0,0,2,2] ; SSE41-NEXT: pand %xmm7, %xmm0 ; SSE41-NEXT: por %xmm1, %xmm0 -; SSE41-NEXT: movapd %xmm3, %xmm1 -; SSE41-NEXT: blendvpd %xmm0, %xmm2, %xmm1 -; SSE41-NEXT: packssdw %xmm6, %xmm1 -; SSE41-NEXT: movapd %xmm9, %xmm2 -; SSE41-NEXT: xorpd %xmm5, %xmm2 -; SSE41-NEXT: movapd %xmm2, %xmm6 -; SSE41-NEXT: pcmpeqd %xmm4, %xmm6 -; SSE41-NEXT: pcmpgtd %xmm4, %xmm2 +; SSE41-NEXT: movapd %xmm5, %xmm1 +; SSE41-NEXT: blendvpd %xmm0, %xmm3, %xmm1 +; SSE41-NEXT: packssdw %xmm4, %xmm1 +; SSE41-NEXT: movapd %xmm11, %xmm3 +; SSE41-NEXT: xorpd %xmm2, %xmm3 +; SSE41-NEXT: movapd %xmm3, %xmm4 +; SSE41-NEXT: pcmpeqd %xmm6, %xmm4 +; SSE41-NEXT: pcmpgtd %xmm6, %xmm3 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm3[0,0,2,2] +; SSE41-NEXT: pand %xmm4, %xmm0 +; SSE41-NEXT: por %xmm3, %xmm0 +; SSE41-NEXT: movapd %xmm5, %xmm3 +; SSE41-NEXT: blendvpd %xmm0, 
%xmm11, %xmm3 +; SSE41-NEXT: xorpd %xmm8, %xmm2 +; SSE41-NEXT: movapd %xmm2, %xmm4 +; SSE41-NEXT: pcmpeqd %xmm6, %xmm4 +; SSE41-NEXT: pcmpgtd %xmm6, %xmm2 ; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm2[0,0,2,2] -; SSE41-NEXT: pand %xmm6, %xmm0 +; SSE41-NEXT: pand %xmm4, %xmm0 ; SSE41-NEXT: por %xmm2, %xmm0 -; SSE41-NEXT: movapd %xmm3, %xmm2 -; SSE41-NEXT: blendvpd %xmm0, %xmm9, %xmm2 -; SSE41-NEXT: xorpd %xmm8, %xmm5 -; SSE41-NEXT: movapd %xmm5, %xmm6 -; SSE41-NEXT: pcmpeqd %xmm4, %xmm6 -; SSE41-NEXT: pcmpgtd %xmm4, %xmm5 -; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm5[0,0,2,2] -; SSE41-NEXT: pand %xmm6, %xmm0 -; SSE41-NEXT: por %xmm5, %xmm0 -; SSE41-NEXT: blendvpd %xmm0, %xmm8, %xmm3 -; SSE41-NEXT: packssdw %xmm2, %xmm3 -; SSE41-NEXT: packssdw %xmm3, %xmm1 +; SSE41-NEXT: blendvpd %xmm0, %xmm8, %xmm5 +; SSE41-NEXT: packssdw %xmm3, %xmm5 +; SSE41-NEXT: packssdw %xmm5, %xmm1 ; SSE41-NEXT: movdqa %xmm1, %xmm0 ; SSE41-NEXT: retq ; ; AVX1-LABEL: trunc_ssat_v8i64_v8i16: ; AVX1: # %bb.0: -; AVX1-NEXT: vextractf128 $1, %ymm1, %xmm2 -; AVX1-NEXT: vmovdqa {{.*#+}} xmm3 = [32767,32767] -; AVX1-NEXT: vpcmpgtq %xmm2, %xmm3, %xmm8 -; AVX1-NEXT: vpcmpgtq %xmm1, %xmm3, %xmm5 -; AVX1-NEXT: vextractf128 $1, %ymm0, %xmm6 -; AVX1-NEXT: vpcmpgtq %xmm6, %xmm3, %xmm7 -; AVX1-NEXT: vpcmpgtq %xmm0, %xmm3, %xmm4 -; AVX1-NEXT: vblendvpd %xmm4, %xmm0, %xmm3, %xmm0 -; AVX1-NEXT: vmovdqa {{.*#+}} xmm4 = [18446744073709518848,18446744073709518848] -; AVX1-NEXT: vpcmpgtq %xmm4, %xmm0, %xmm9 -; AVX1-NEXT: vblendvpd %xmm7, %xmm6, %xmm3, %xmm6 -; AVX1-NEXT: vpcmpgtq %xmm4, %xmm6, %xmm7 -; AVX1-NEXT: vblendvpd %xmm5, %xmm1, %xmm3, %xmm1 -; AVX1-NEXT: vpcmpgtq %xmm4, %xmm1, %xmm5 -; AVX1-NEXT: vblendvpd %xmm8, %xmm2, %xmm3, %xmm2 -; AVX1-NEXT: vpcmpgtq %xmm4, %xmm2, %xmm3 -; AVX1-NEXT: vblendvpd %xmm3, %xmm2, %xmm4, %xmm2 -; AVX1-NEXT: vblendvpd %xmm5, %xmm1, %xmm4, %xmm1 -; AVX1-NEXT: vpackssdw %xmm2, %xmm1, %xmm1 -; AVX1-NEXT: vblendvpd %xmm7, %xmm6, %xmm4, %xmm2 -; AVX1-NEXT: vblendvpd %xmm9, %xmm0, %xmm4, 
%xmm0 -; AVX1-NEXT: vpackssdw %xmm2, %xmm0, %xmm0 +; AVX1-NEXT: vmovdqa (%rdi), %xmm0 +; AVX1-NEXT: vmovdqa 16(%rdi), %xmm1 +; AVX1-NEXT: vmovdqa 32(%rdi), %xmm2 +; AVX1-NEXT: vmovdqa 48(%rdi), %xmm3 +; AVX1-NEXT: vmovdqa {{.*#+}} xmm4 = [32767,32767] +; AVX1-NEXT: vpcmpgtq %xmm3, %xmm4, %xmm8 +; AVX1-NEXT: vpcmpgtq %xmm2, %xmm4, %xmm6 +; AVX1-NEXT: vpcmpgtq %xmm1, %xmm4, %xmm7 +; AVX1-NEXT: vpcmpgtq %xmm0, %xmm4, %xmm5 +; AVX1-NEXT: vblendvpd %xmm5, %xmm0, %xmm4, %xmm0 +; AVX1-NEXT: vmovdqa {{.*#+}} xmm5 = [18446744073709518848,18446744073709518848] +; AVX1-NEXT: vpcmpgtq %xmm5, %xmm0, %xmm9 +; AVX1-NEXT: vblendvpd %xmm7, %xmm1, %xmm4, %xmm1 +; AVX1-NEXT: vpcmpgtq %xmm5, %xmm1, %xmm7 +; AVX1-NEXT: vblendvpd %xmm6, %xmm2, %xmm4, %xmm2 +; AVX1-NEXT: vpcmpgtq %xmm5, %xmm2, %xmm6 +; AVX1-NEXT: vblendvpd %xmm8, %xmm3, %xmm4, %xmm3 +; AVX1-NEXT: vpcmpgtq %xmm5, %xmm3, %xmm4 +; AVX1-NEXT: vblendvpd %xmm4, %xmm3, %xmm5, %xmm3 +; AVX1-NEXT: vblendvpd %xmm6, %xmm2, %xmm5, %xmm2 +; AVX1-NEXT: vpackssdw %xmm3, %xmm2, %xmm2 +; AVX1-NEXT: vblendvpd %xmm7, %xmm1, %xmm5, %xmm1 +; AVX1-NEXT: vblendvpd %xmm9, %xmm0, %xmm5, %xmm0 ; AVX1-NEXT: vpackssdw %xmm1, %xmm0, %xmm0 -; AVX1-NEXT: vzeroupper +; AVX1-NEXT: vpackssdw %xmm2, %xmm0, %xmm0 ; AVX1-NEXT: retq ; ; AVX2-LABEL: trunc_ssat_v8i64_v8i16: ; AVX2: # %bb.0: +; AVX2-NEXT: vmovdqa (%rdi), %ymm0 +; AVX2-NEXT: vmovdqa 32(%rdi), %ymm1 ; AVX2-NEXT: vpbroadcastq {{.*#+}} ymm2 = [32767,32767,32767,32767] ; AVX2-NEXT: vpcmpgtq %ymm0, %ymm2, %ymm3 ; AVX2-NEXT: vblendvpd %ymm3, %ymm0, %ymm2, %ymm0 @@ -1604,9 +1666,21 @@ define <8 x i16> @trunc_ssat_v8i64_v8i16 ; ; AVX512-LABEL: trunc_ssat_v8i64_v8i16: ; AVX512: # %bb.0: +; AVX512-NEXT: vmovdqa64 (%rdi), %zmm0 ; AVX512-NEXT: vpmovsqw %zmm0, %xmm0 ; AVX512-NEXT: vzeroupper ; AVX512-NEXT: retq +; +; SKX-LABEL: trunc_ssat_v8i64_v8i16: +; SKX: # %bb.0: +; SKX-NEXT: vmovdqa (%rdi), %ymm0 +; SKX-NEXT: vmovdqa 32(%rdi), %ymm1 +; SKX-NEXT: vpmovsqw %ymm1, %xmm1 +; SKX-NEXT: vpmovsqw %ymm0, %xmm0 
+; SKX-NEXT: vpunpcklqdq {{.*#+}} xmm0 = xmm0[0],xmm1[0] +; SKX-NEXT: vzeroupper +; SKX-NEXT: retq + %a0 = load <8 x i64>, <8 x i64>* %p0 %1 = icmp slt <8 x i64> %a0, %2 = select <8 x i1> %1, <8 x i64> %a0, <8 x i64> %3 = icmp sgt <8 x i64> %2, @@ -1649,6 +1723,13 @@ define <4 x i16> @trunc_ssat_v4i32_v4i16 ; AVX512BWVL-NEXT: vpmaxsd {{.*}}(%rip){1to4}, %xmm0, %xmm0 ; AVX512BWVL-NEXT: vpackssdw %xmm0, %xmm0, %xmm0 ; AVX512BWVL-NEXT: retq +; +; SKX-LABEL: trunc_ssat_v4i32_v4i16: +; SKX: # %bb.0: +; SKX-NEXT: vpminsd {{.*}}(%rip){1to4}, %xmm0, %xmm0 +; SKX-NEXT: vpmaxsd {{.*}}(%rip){1to4}, %xmm0, %xmm0 +; SKX-NEXT: vpackssdw %xmm0, %xmm0, %xmm0 +; SKX-NEXT: retq %1 = icmp slt <4 x i32> %a0, %2 = select <4 x i1> %1, <4 x i32> %a0, <4 x i32> %3 = icmp sgt <4 x i32> %2, @@ -1691,6 +1772,11 @@ define void @trunc_ssat_v4i32_v4i16_stor ; AVX512BWVL: # %bb.0: ; AVX512BWVL-NEXT: vpmovsdw %xmm0, (%rdi) ; AVX512BWVL-NEXT: retq +; +; SKX-LABEL: trunc_ssat_v4i32_v4i16_store: +; SKX: # %bb.0: +; SKX-NEXT: vpmovsdw %xmm0, (%rdi) +; SKX-NEXT: retq %1 = icmp slt <4 x i32> %a0, %2 = select <4 x i1> %1, <4 x i32> %a0, <4 x i32> %3 = icmp sgt <4 x i32> %2, @@ -1745,6 +1831,12 @@ define <8 x i16> @trunc_ssat_v8i32_v8i16 ; AVX512BWVL-NEXT: vpmovsdw %ymm0, %xmm0 ; AVX512BWVL-NEXT: vzeroupper ; AVX512BWVL-NEXT: retq +; +; SKX-LABEL: trunc_ssat_v8i32_v8i16: +; SKX: # %bb.0: +; SKX-NEXT: vpmovsdw %ymm0, %xmm0 +; SKX-NEXT: vzeroupper +; SKX-NEXT: retq %1 = icmp slt <8 x i32> %a0, %2 = select <8 x i1> %1, <8 x i32> %a0, <8 x i32> %3 = icmp sgt <8 x i32> %2, @@ -1753,33 +1845,49 @@ define <8 x i16> @trunc_ssat_v8i32_v8i16 ret <8 x i16> %5 } -define <16 x i16> @trunc_ssat_v16i32_v16i16(<16 x i32> %a0) { +define <16 x i16> @trunc_ssat_v16i32_v16i16(<16 x i32>* %p0) "min-legal-vector-width"="256" { ; SSE-LABEL: trunc_ssat_v16i32_v16i16: ; SSE: # %bb.0: -; SSE-NEXT: packssdw %xmm1, %xmm0 -; SSE-NEXT: packssdw %xmm3, %xmm2 -; SSE-NEXT: movdqa %xmm2, %xmm1 +; SSE-NEXT: movdqa (%rdi), %xmm0 +; 
SSE-NEXT: movdqa 32(%rdi), %xmm1 +; SSE-NEXT: packssdw 16(%rdi), %xmm0 +; SSE-NEXT: packssdw 48(%rdi), %xmm1 ; SSE-NEXT: retq ; ; AVX1-LABEL: trunc_ssat_v16i32_v16i16: ; AVX1: # %bb.0: -; AVX1-NEXT: vextractf128 $1, %ymm1, %xmm2 -; AVX1-NEXT: vpackssdw %xmm2, %xmm1, %xmm1 -; AVX1-NEXT: vextractf128 $1, %ymm0, %xmm2 -; AVX1-NEXT: vpackssdw %xmm2, %xmm0, %xmm0 +; AVX1-NEXT: vmovdqa (%rdi), %xmm0 +; AVX1-NEXT: vmovdqa 32(%rdi), %xmm1 +; AVX1-NEXT: vpackssdw 48(%rdi), %xmm1, %xmm1 +; AVX1-NEXT: vpackssdw 16(%rdi), %xmm0, %xmm0 ; AVX1-NEXT: vinsertf128 $1, %xmm1, %ymm0, %ymm0 ; AVX1-NEXT: retq ; ; AVX2-LABEL: trunc_ssat_v16i32_v16i16: ; AVX2: # %bb.0: -; AVX2-NEXT: vpackssdw %ymm1, %ymm0, %ymm0 +; AVX2-NEXT: vmovdqa (%rdi), %ymm0 +; AVX2-NEXT: vpackssdw 32(%rdi), %ymm0, %ymm0 ; AVX2-NEXT: vpermq {{.*#+}} ymm0 = ymm0[0,2,1,3] ; AVX2-NEXT: retq ; ; AVX512-LABEL: trunc_ssat_v16i32_v16i16: ; AVX512: # %bb.0: +; AVX512-NEXT: vmovdqa64 (%rdi), %zmm0 ; AVX512-NEXT: vpmovsdw %zmm0, %ymm0 ; AVX512-NEXT: retq +; +; SKX-LABEL: trunc_ssat_v16i32_v16i16: +; SKX: # %bb.0: +; SKX-NEXT: vpbroadcastd {{.*#+}} ymm0 = [32767,32767,32767,32767,32767,32767,32767,32767] +; SKX-NEXT: vpminsd (%rdi), %ymm0, %ymm1 +; SKX-NEXT: vpminsd 32(%rdi), %ymm0, %ymm0 +; SKX-NEXT: vpbroadcastd {{.*#+}} ymm2 = [4294934528,4294934528,4294934528,4294934528,4294934528,4294934528,4294934528,4294934528] +; SKX-NEXT: vpmaxsd %ymm2, %ymm0, %ymm0 +; SKX-NEXT: vpmaxsd %ymm2, %ymm1, %ymm1 +; SKX-NEXT: vpackssdw %ymm0, %ymm1, %ymm0 +; SKX-NEXT: vpermq {{.*#+}} ymm0 = ymm0[0,2,1,3] +; SKX-NEXT: retq + %a0 = load <16 x i32>, <16 x i32>* %p0 %1 = icmp slt <16 x i32> %a0, %2 = select <16 x i1> %1, <16 x i32> %a0, <16 x i32> %3 = icmp sgt <16 x i32> %2, @@ -2041,6 +2149,12 @@ define <4 x i8> @trunc_ssat_v4i64_v4i8(< ; AVX512BWVL-NEXT: vpmovsqb %ymm0, %xmm0 ; AVX512BWVL-NEXT: vzeroupper ; AVX512BWVL-NEXT: retq +; +; SKX-LABEL: trunc_ssat_v4i64_v4i8: +; SKX: # %bb.0: +; SKX-NEXT: vpmovsqb %ymm0, %xmm0 +; SKX-NEXT: 
vzeroupper +; SKX-NEXT: retq %1 = icmp slt <4 x i64> %a0, %2 = select <4 x i1> %1, <4 x i64> %a0, <4 x i64> %3 = icmp sgt <4 x i64> %2, @@ -2304,6 +2418,12 @@ define void @trunc_ssat_v4i64_v4i8_store ; AVX512BWVL-NEXT: vpmovsqb %ymm0, (%rdi) ; AVX512BWVL-NEXT: vzeroupper ; AVX512BWVL-NEXT: retq +; +; SKX-LABEL: trunc_ssat_v4i64_v4i8_store: +; SKX: # %bb.0: +; SKX-NEXT: vpmovsqb %ymm0, (%rdi) +; SKX-NEXT: vzeroupper +; SKX-NEXT: retq %1 = icmp slt <4 x i64> %a0, %2 = select <4 x i1> %1, <4 x i64> %a0, <4 x i64> %3 = icmp sgt <4 x i64> %2, @@ -2313,344 +2433,355 @@ define void @trunc_ssat_v4i64_v4i8_store ret void } -define <8 x i8> @trunc_ssat_v8i64_v8i8(<8 x i64> %a0) { +define <8 x i8> @trunc_ssat_v8i64_v8i8(<8 x i64>* %p0) "min-legal-vector-width"="256" { ; SSE2-LABEL: trunc_ssat_v8i64_v8i8: ; SSE2: # %bb.0: +; SSE2-NEXT: movdqa (%rdi), %xmm9 +; SSE2-NEXT: movdqa 16(%rdi), %xmm7 +; SSE2-NEXT: movdqa 32(%rdi), %xmm5 +; SSE2-NEXT: movdqa 48(%rdi), %xmm3 ; SSE2-NEXT: movdqa {{.*#+}} xmm8 = [127,127] -; SSE2-NEXT: movdqa {{.*#+}} xmm4 = [2147483648,2147483648] -; SSE2-NEXT: movdqa %xmm3, %xmm5 -; SSE2-NEXT: pxor %xmm4, %xmm5 -; SSE2-NEXT: movdqa {{.*#+}} xmm9 = [2147483775,2147483775] -; SSE2-NEXT: movdqa %xmm9, %xmm7 -; SSE2-NEXT: pcmpgtd %xmm5, %xmm7 -; SSE2-NEXT: pshufd {{.*#+}} xmm10 = xmm7[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm9, %xmm5 -; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm5[1,1,3,3] -; SSE2-NEXT: pand %xmm10, %xmm6 -; SSE2-NEXT: pshufd {{.*#+}} xmm5 = xmm7[1,1,3,3] -; SSE2-NEXT: por %xmm6, %xmm5 -; SSE2-NEXT: pand %xmm5, %xmm3 -; SSE2-NEXT: pandn %xmm8, %xmm5 -; SSE2-NEXT: por %xmm3, %xmm5 -; SSE2-NEXT: movdqa %xmm2, %xmm3 -; SSE2-NEXT: pxor %xmm4, %xmm3 -; SSE2-NEXT: movdqa %xmm9, %xmm6 -; SSE2-NEXT: pcmpgtd %xmm3, %xmm6 -; SSE2-NEXT: pshufd {{.*#+}} xmm10 = xmm6[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm9, %xmm3 -; SSE2-NEXT: pshufd {{.*#+}} xmm7 = xmm3[1,1,3,3] -; SSE2-NEXT: pand %xmm10, %xmm7 -; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm6[1,1,3,3] -; SSE2-NEXT: por 
%xmm7, %xmm3 -; SSE2-NEXT: pand %xmm3, %xmm2 -; SSE2-NEXT: pandn %xmm8, %xmm3 -; SSE2-NEXT: por %xmm2, %xmm3 -; SSE2-NEXT: movdqa %xmm1, %xmm2 -; SSE2-NEXT: pxor %xmm4, %xmm2 -; SSE2-NEXT: movdqa %xmm9, %xmm6 +; SSE2-NEXT: movdqa {{.*#+}} xmm1 = [2147483648,2147483648] +; SSE2-NEXT: movdqa %xmm3, %xmm2 +; SSE2-NEXT: pxor %xmm1, %xmm2 +; SSE2-NEXT: movdqa {{.*#+}} xmm10 = [2147483775,2147483775] +; SSE2-NEXT: movdqa %xmm10, %xmm6 ; SSE2-NEXT: pcmpgtd %xmm2, %xmm6 -; SSE2-NEXT: pshufd {{.*#+}} xmm10 = xmm6[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm9, %xmm2 -; SSE2-NEXT: pshufd {{.*#+}} xmm7 = xmm2[1,1,3,3] -; SSE2-NEXT: pand %xmm10, %xmm7 +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm6[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm10, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm2[1,1,3,3] +; SSE2-NEXT: pand %xmm0, %xmm4 ; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm6[1,1,3,3] -; SSE2-NEXT: por %xmm7, %xmm2 -; SSE2-NEXT: pand %xmm2, %xmm1 +; SSE2-NEXT: por %xmm4, %xmm2 +; SSE2-NEXT: pand %xmm2, %xmm3 ; SSE2-NEXT: pandn %xmm8, %xmm2 -; SSE2-NEXT: por %xmm1, %xmm2 -; SSE2-NEXT: movdqa %xmm0, %xmm1 -; SSE2-NEXT: pxor %xmm4, %xmm1 -; SSE2-NEXT: movdqa %xmm9, %xmm6 -; SSE2-NEXT: pcmpgtd %xmm1, %xmm6 -; SSE2-NEXT: pshufd {{.*#+}} xmm7 = xmm6[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm9, %xmm1 -; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] -; SSE2-NEXT: pand %xmm7, %xmm1 -; SSE2-NEXT: pshufd {{.*#+}} xmm7 = xmm6[1,1,3,3] -; SSE2-NEXT: por %xmm1, %xmm7 -; SSE2-NEXT: pand %xmm7, %xmm0 -; SSE2-NEXT: pandn %xmm8, %xmm7 +; SSE2-NEXT: por %xmm3, %xmm2 +; SSE2-NEXT: movdqa %xmm5, %xmm0 +; SSE2-NEXT: pxor %xmm1, %xmm0 +; SSE2-NEXT: movdqa %xmm10, %xmm3 +; SSE2-NEXT: pcmpgtd %xmm0, %xmm3 +; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm3[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm10, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] +; SSE2-NEXT: pand %xmm4, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm3[1,1,3,3] +; SSE2-NEXT: por %xmm0, %xmm3 +; SSE2-NEXT: pand %xmm3, %xmm5 +; SSE2-NEXT: pandn %xmm8, %xmm3 +; SSE2-NEXT: por 
%xmm5, %xmm3 +; SSE2-NEXT: movdqa %xmm7, %xmm0 +; SSE2-NEXT: pxor %xmm1, %xmm0 +; SSE2-NEXT: movdqa %xmm10, %xmm4 +; SSE2-NEXT: pcmpgtd %xmm0, %xmm4 +; SSE2-NEXT: pshufd {{.*#+}} xmm5 = xmm4[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm10, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] +; SSE2-NEXT: pand %xmm5, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm5 = xmm4[1,1,3,3] +; SSE2-NEXT: por %xmm0, %xmm5 +; SSE2-NEXT: pand %xmm5, %xmm7 +; SSE2-NEXT: pandn %xmm8, %xmm5 +; SSE2-NEXT: por %xmm7, %xmm5 +; SSE2-NEXT: movdqa %xmm9, %xmm0 +; SSE2-NEXT: pxor %xmm1, %xmm0 +; SSE2-NEXT: movdqa %xmm10, %xmm4 +; SSE2-NEXT: pcmpgtd %xmm0, %xmm4 +; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm4[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm10, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] +; SSE2-NEXT: pand %xmm6, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm7 = xmm4[1,1,3,3] ; SSE2-NEXT: por %xmm0, %xmm7 +; SSE2-NEXT: pand %xmm7, %xmm9 +; SSE2-NEXT: pandn %xmm8, %xmm7 +; SSE2-NEXT: por %xmm9, %xmm7 ; SSE2-NEXT: movdqa {{.*#+}} xmm8 = [18446744073709551488,18446744073709551488] ; SSE2-NEXT: movdqa %xmm7, %xmm0 -; SSE2-NEXT: pxor %xmm4, %xmm0 +; SSE2-NEXT: pxor %xmm1, %xmm0 ; SSE2-NEXT: movdqa {{.*#+}} xmm9 = [18446744071562067840,18446744071562067840] -; SSE2-NEXT: movdqa %xmm0, %xmm1 -; SSE2-NEXT: pcmpgtd %xmm9, %xmm1 -; SSE2-NEXT: pshufd {{.*#+}} xmm10 = xmm1[0,0,2,2] +; SSE2-NEXT: movdqa %xmm0, %xmm4 +; SSE2-NEXT: pcmpgtd %xmm9, %xmm4 +; SSE2-NEXT: pshufd {{.*#+}} xmm10 = xmm4[0,0,2,2] ; SSE2-NEXT: pcmpeqd %xmm9, %xmm0 ; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm0[1,1,3,3] ; SSE2-NEXT: pand %xmm10, %xmm6 -; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm1[1,1,3,3] +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm4[1,1,3,3] ; SSE2-NEXT: por %xmm6, %xmm0 ; SSE2-NEXT: pand %xmm0, %xmm7 ; SSE2-NEXT: pandn %xmm8, %xmm0 ; SSE2-NEXT: por %xmm7, %xmm0 -; SSE2-NEXT: movdqa %xmm2, %xmm1 -; SSE2-NEXT: pxor %xmm4, %xmm1 -; SSE2-NEXT: movdqa %xmm1, %xmm6 +; SSE2-NEXT: movdqa %xmm5, %xmm4 +; SSE2-NEXT: pxor %xmm1, %xmm4 +; SSE2-NEXT: 
movdqa %xmm4, %xmm6 ; SSE2-NEXT: pcmpgtd %xmm9, %xmm6 ; SSE2-NEXT: pshufd {{.*#+}} xmm7 = xmm6[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm9, %xmm1 -; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] -; SSE2-NEXT: pand %xmm7, %xmm1 +; SSE2-NEXT: pcmpeqd %xmm9, %xmm4 +; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm4[1,1,3,3] +; SSE2-NEXT: pand %xmm7, %xmm4 ; SSE2-NEXT: pshufd {{.*#+}} xmm7 = xmm6[1,1,3,3] -; SSE2-NEXT: por %xmm1, %xmm7 -; SSE2-NEXT: pand %xmm7, %xmm2 +; SSE2-NEXT: por %xmm4, %xmm7 +; SSE2-NEXT: pand %xmm7, %xmm5 ; SSE2-NEXT: pandn %xmm8, %xmm7 -; SSE2-NEXT: por %xmm2, %xmm7 -; SSE2-NEXT: movdqa %xmm3, %xmm1 -; SSE2-NEXT: pxor %xmm4, %xmm1 -; SSE2-NEXT: movdqa %xmm1, %xmm2 -; SSE2-NEXT: pcmpgtd %xmm9, %xmm2 -; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm2[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm9, %xmm1 -; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] -; SSE2-NEXT: pand %xmm6, %xmm1 -; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] -; SSE2-NEXT: por %xmm1, %xmm2 -; SSE2-NEXT: pand %xmm2, %xmm3 -; SSE2-NEXT: pandn %xmm8, %xmm2 -; SSE2-NEXT: por %xmm3, %xmm2 -; SSE2-NEXT: pxor %xmm5, %xmm4 -; SSE2-NEXT: movdqa %xmm4, %xmm1 -; SSE2-NEXT: pcmpgtd %xmm9, %xmm1 -; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm1[0,0,2,2] +; SSE2-NEXT: por %xmm5, %xmm7 +; SSE2-NEXT: movdqa %xmm3, %xmm4 +; SSE2-NEXT: pxor %xmm1, %xmm4 +; SSE2-NEXT: movdqa %xmm4, %xmm5 +; SSE2-NEXT: pcmpgtd %xmm9, %xmm5 +; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm5[0,0,2,2] ; SSE2-NEXT: pcmpeqd %xmm9, %xmm4 ; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm4[1,1,3,3] -; SSE2-NEXT: pand %xmm3, %xmm4 +; SSE2-NEXT: pand %xmm6, %xmm4 +; SSE2-NEXT: pshufd {{.*#+}} xmm5 = xmm5[1,1,3,3] +; SSE2-NEXT: por %xmm4, %xmm5 +; SSE2-NEXT: pand %xmm5, %xmm3 +; SSE2-NEXT: pandn %xmm8, %xmm5 +; SSE2-NEXT: por %xmm3, %xmm5 +; SSE2-NEXT: pxor %xmm2, %xmm1 +; SSE2-NEXT: movdqa %xmm1, %xmm3 +; SSE2-NEXT: pcmpgtd %xmm9, %xmm3 +; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm3[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm9, %xmm1 ; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] -; 
SSE2-NEXT: por %xmm4, %xmm1 -; SSE2-NEXT: pand %xmm1, %xmm5 -; SSE2-NEXT: pandn %xmm8, %xmm1 -; SSE2-NEXT: por %xmm5, %xmm1 -; SSE2-NEXT: movdqa {{.*#+}} xmm3 = [255,0,0,0,0,0,0,0,255,0,0,0,0,0,0,0] -; SSE2-NEXT: pand %xmm3, %xmm1 +; SSE2-NEXT: pand %xmm4, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm3[1,1,3,3] +; SSE2-NEXT: por %xmm1, %xmm3 ; SSE2-NEXT: pand %xmm3, %xmm2 -; SSE2-NEXT: packuswb %xmm1, %xmm2 -; SSE2-NEXT: pand %xmm3, %xmm7 -; SSE2-NEXT: pand %xmm3, %xmm0 +; SSE2-NEXT: pandn %xmm8, %xmm3 +; SSE2-NEXT: por %xmm2, %xmm3 +; SSE2-NEXT: movdqa {{.*#+}} xmm1 = [255,0,0,0,0,0,0,0,255,0,0,0,0,0,0,0] +; SSE2-NEXT: pand %xmm1, %xmm3 +; SSE2-NEXT: pand %xmm1, %xmm5 +; SSE2-NEXT: packuswb %xmm3, %xmm5 +; SSE2-NEXT: pand %xmm1, %xmm7 +; SSE2-NEXT: pand %xmm1, %xmm0 ; SSE2-NEXT: packuswb %xmm7, %xmm0 -; SSE2-NEXT: packuswb %xmm2, %xmm0 +; SSE2-NEXT: packuswb %xmm5, %xmm0 ; SSE2-NEXT: packuswb %xmm0, %xmm0 ; SSE2-NEXT: retq ; ; SSSE3-LABEL: trunc_ssat_v8i64_v8i8: ; SSSE3: # %bb.0: +; SSSE3-NEXT: movdqa (%rdi), %xmm9 +; SSSE3-NEXT: movdqa 16(%rdi), %xmm7 +; SSSE3-NEXT: movdqa 32(%rdi), %xmm5 +; SSSE3-NEXT: movdqa 48(%rdi), %xmm3 ; SSSE3-NEXT: movdqa {{.*#+}} xmm8 = [127,127] -; SSSE3-NEXT: movdqa {{.*#+}} xmm4 = [2147483648,2147483648] -; SSSE3-NEXT: movdqa %xmm3, %xmm5 -; SSSE3-NEXT: pxor %xmm4, %xmm5 -; SSSE3-NEXT: movdqa {{.*#+}} xmm9 = [2147483775,2147483775] -; SSSE3-NEXT: movdqa %xmm9, %xmm7 -; SSSE3-NEXT: pcmpgtd %xmm5, %xmm7 -; SSSE3-NEXT: pshufd {{.*#+}} xmm10 = xmm7[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm9, %xmm5 -; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm5[1,1,3,3] -; SSSE3-NEXT: pand %xmm10, %xmm6 -; SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm7[1,1,3,3] -; SSSE3-NEXT: por %xmm6, %xmm5 -; SSSE3-NEXT: pand %xmm5, %xmm3 -; SSSE3-NEXT: pandn %xmm8, %xmm5 -; SSSE3-NEXT: por %xmm3, %xmm5 -; SSSE3-NEXT: movdqa %xmm2, %xmm3 -; SSSE3-NEXT: pxor %xmm4, %xmm3 -; SSSE3-NEXT: movdqa %xmm9, %xmm6 -; SSSE3-NEXT: pcmpgtd %xmm3, %xmm6 -; SSSE3-NEXT: pshufd {{.*#+}} xmm10 = 
xmm6[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm9, %xmm3 -; SSSE3-NEXT: pshufd {{.*#+}} xmm7 = xmm3[1,1,3,3] -; SSSE3-NEXT: pand %xmm10, %xmm7 -; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm6[1,1,3,3] -; SSSE3-NEXT: por %xmm7, %xmm3 -; SSSE3-NEXT: pand %xmm3, %xmm2 -; SSSE3-NEXT: pandn %xmm8, %xmm3 -; SSSE3-NEXT: por %xmm2, %xmm3 -; SSSE3-NEXT: movdqa %xmm1, %xmm2 -; SSSE3-NEXT: pxor %xmm4, %xmm2 -; SSSE3-NEXT: movdqa %xmm9, %xmm6 +; SSSE3-NEXT: movdqa {{.*#+}} xmm1 = [2147483648,2147483648] +; SSSE3-NEXT: movdqa %xmm3, %xmm2 +; SSSE3-NEXT: pxor %xmm1, %xmm2 +; SSSE3-NEXT: movdqa {{.*#+}} xmm10 = [2147483775,2147483775] +; SSSE3-NEXT: movdqa %xmm10, %xmm6 ; SSSE3-NEXT: pcmpgtd %xmm2, %xmm6 -; SSSE3-NEXT: pshufd {{.*#+}} xmm10 = xmm6[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm9, %xmm2 -; SSSE3-NEXT: pshufd {{.*#+}} xmm7 = xmm2[1,1,3,3] -; SSSE3-NEXT: pand %xmm10, %xmm7 +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm6[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm10, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm2[1,1,3,3] +; SSSE3-NEXT: pand %xmm0, %xmm4 ; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm6[1,1,3,3] -; SSSE3-NEXT: por %xmm7, %xmm2 -; SSSE3-NEXT: pand %xmm2, %xmm1 +; SSSE3-NEXT: por %xmm4, %xmm2 +; SSSE3-NEXT: pand %xmm2, %xmm3 ; SSSE3-NEXT: pandn %xmm8, %xmm2 -; SSSE3-NEXT: por %xmm1, %xmm2 -; SSSE3-NEXT: movdqa %xmm0, %xmm1 -; SSSE3-NEXT: pxor %xmm4, %xmm1 -; SSSE3-NEXT: movdqa %xmm9, %xmm6 -; SSSE3-NEXT: pcmpgtd %xmm1, %xmm6 -; SSSE3-NEXT: pshufd {{.*#+}} xmm7 = xmm6[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm9, %xmm1 -; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] -; SSSE3-NEXT: pand %xmm7, %xmm1 -; SSSE3-NEXT: pshufd {{.*#+}} xmm7 = xmm6[1,1,3,3] -; SSSE3-NEXT: por %xmm1, %xmm7 -; SSSE3-NEXT: pand %xmm7, %xmm0 -; SSSE3-NEXT: pandn %xmm8, %xmm7 +; SSSE3-NEXT: por %xmm3, %xmm2 +; SSSE3-NEXT: movdqa %xmm5, %xmm0 +; SSSE3-NEXT: pxor %xmm1, %xmm0 +; SSSE3-NEXT: movdqa %xmm10, %xmm3 +; SSSE3-NEXT: pcmpgtd %xmm0, %xmm3 +; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm3[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm10, 
%xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] +; SSSE3-NEXT: pand %xmm4, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm3[1,1,3,3] +; SSSE3-NEXT: por %xmm0, %xmm3 +; SSSE3-NEXT: pand %xmm3, %xmm5 +; SSSE3-NEXT: pandn %xmm8, %xmm3 +; SSSE3-NEXT: por %xmm5, %xmm3 +; SSSE3-NEXT: movdqa %xmm7, %xmm0 +; SSSE3-NEXT: pxor %xmm1, %xmm0 +; SSSE3-NEXT: movdqa %xmm10, %xmm4 +; SSSE3-NEXT: pcmpgtd %xmm0, %xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm4[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm10, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] +; SSSE3-NEXT: pand %xmm5, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm4[1,1,3,3] +; SSSE3-NEXT: por %xmm0, %xmm5 +; SSSE3-NEXT: pand %xmm5, %xmm7 +; SSSE3-NEXT: pandn %xmm8, %xmm5 +; SSSE3-NEXT: por %xmm7, %xmm5 +; SSSE3-NEXT: movdqa %xmm9, %xmm0 +; SSSE3-NEXT: pxor %xmm1, %xmm0 +; SSSE3-NEXT: movdqa %xmm10, %xmm4 +; SSSE3-NEXT: pcmpgtd %xmm0, %xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm4[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm10, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] +; SSSE3-NEXT: pand %xmm6, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm7 = xmm4[1,1,3,3] ; SSSE3-NEXT: por %xmm0, %xmm7 +; SSSE3-NEXT: pand %xmm7, %xmm9 +; SSSE3-NEXT: pandn %xmm8, %xmm7 +; SSSE3-NEXT: por %xmm9, %xmm7 ; SSSE3-NEXT: movdqa {{.*#+}} xmm8 = [18446744073709551488,18446744073709551488] ; SSSE3-NEXT: movdqa %xmm7, %xmm0 -; SSSE3-NEXT: pxor %xmm4, %xmm0 +; SSSE3-NEXT: pxor %xmm1, %xmm0 ; SSSE3-NEXT: movdqa {{.*#+}} xmm9 = [18446744071562067840,18446744071562067840] -; SSSE3-NEXT: movdqa %xmm0, %xmm1 -; SSSE3-NEXT: pcmpgtd %xmm9, %xmm1 -; SSSE3-NEXT: pshufd {{.*#+}} xmm10 = xmm1[0,0,2,2] +; SSSE3-NEXT: movdqa %xmm0, %xmm4 +; SSSE3-NEXT: pcmpgtd %xmm9, %xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm10 = xmm4[0,0,2,2] ; SSSE3-NEXT: pcmpeqd %xmm9, %xmm0 ; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm0[1,1,3,3] ; SSSE3-NEXT: pand %xmm10, %xmm6 -; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm1[1,1,3,3] +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm4[1,1,3,3] 
; SSSE3-NEXT: por %xmm6, %xmm0 ; SSSE3-NEXT: pand %xmm0, %xmm7 ; SSSE3-NEXT: pandn %xmm8, %xmm0 ; SSSE3-NEXT: por %xmm7, %xmm0 -; SSSE3-NEXT: movdqa %xmm2, %xmm1 -; SSSE3-NEXT: pxor %xmm4, %xmm1 -; SSSE3-NEXT: movdqa %xmm1, %xmm6 +; SSSE3-NEXT: movdqa %xmm5, %xmm4 +; SSSE3-NEXT: pxor %xmm1, %xmm4 +; SSSE3-NEXT: movdqa %xmm4, %xmm6 ; SSSE3-NEXT: pcmpgtd %xmm9, %xmm6 ; SSSE3-NEXT: pshufd {{.*#+}} xmm7 = xmm6[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm9, %xmm1 -; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] -; SSSE3-NEXT: pand %xmm7, %xmm1 +; SSSE3-NEXT: pcmpeqd %xmm9, %xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm4[1,1,3,3] +; SSSE3-NEXT: pand %xmm7, %xmm4 ; SSSE3-NEXT: pshufd {{.*#+}} xmm7 = xmm6[1,1,3,3] -; SSSE3-NEXT: por %xmm1, %xmm7 -; SSSE3-NEXT: pand %xmm7, %xmm2 +; SSSE3-NEXT: por %xmm4, %xmm7 +; SSSE3-NEXT: pand %xmm7, %xmm5 ; SSSE3-NEXT: pandn %xmm8, %xmm7 -; SSSE3-NEXT: por %xmm2, %xmm7 -; SSSE3-NEXT: movdqa %xmm3, %xmm1 -; SSSE3-NEXT: pxor %xmm4, %xmm1 -; SSSE3-NEXT: movdqa %xmm1, %xmm2 -; SSSE3-NEXT: pcmpgtd %xmm9, %xmm2 -; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm2[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm9, %xmm1 -; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] -; SSSE3-NEXT: pand %xmm6, %xmm1 -; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] -; SSSE3-NEXT: por %xmm1, %xmm2 -; SSSE3-NEXT: pand %xmm2, %xmm3 -; SSSE3-NEXT: pandn %xmm8, %xmm2 -; SSSE3-NEXT: por %xmm3, %xmm2 -; SSSE3-NEXT: pxor %xmm5, %xmm4 -; SSSE3-NEXT: movdqa %xmm4, %xmm1 -; SSSE3-NEXT: pcmpgtd %xmm9, %xmm1 -; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm1[0,0,2,2] +; SSSE3-NEXT: por %xmm5, %xmm7 +; SSSE3-NEXT: movdqa %xmm3, %xmm4 +; SSSE3-NEXT: pxor %xmm1, %xmm4 +; SSSE3-NEXT: movdqa %xmm4, %xmm5 +; SSSE3-NEXT: pcmpgtd %xmm9, %xmm5 +; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm5[0,0,2,2] ; SSSE3-NEXT: pcmpeqd %xmm9, %xmm4 ; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm4[1,1,3,3] -; SSSE3-NEXT: pand %xmm3, %xmm4 +; SSSE3-NEXT: pand %xmm6, %xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm5[1,1,3,3] +; SSSE3-NEXT: 
por %xmm4, %xmm5 +; SSSE3-NEXT: pand %xmm5, %xmm3 +; SSSE3-NEXT: pandn %xmm8, %xmm5 +; SSSE3-NEXT: por %xmm3, %xmm5 +; SSSE3-NEXT: pxor %xmm2, %xmm1 +; SSSE3-NEXT: movdqa %xmm1, %xmm3 +; SSSE3-NEXT: pcmpgtd %xmm9, %xmm3 +; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm3[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm9, %xmm1 ; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] -; SSSE3-NEXT: por %xmm4, %xmm1 -; SSSE3-NEXT: pand %xmm1, %xmm5 -; SSSE3-NEXT: pandn %xmm8, %xmm1 -; SSSE3-NEXT: por %xmm5, %xmm1 -; SSSE3-NEXT: movdqa {{.*#+}} xmm3 = [255,0,0,0,0,0,0,0,255,0,0,0,0,0,0,0] -; SSSE3-NEXT: pand %xmm3, %xmm1 +; SSSE3-NEXT: pand %xmm4, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm3[1,1,3,3] +; SSSE3-NEXT: por %xmm1, %xmm3 ; SSSE3-NEXT: pand %xmm3, %xmm2 -; SSSE3-NEXT: packuswb %xmm1, %xmm2 -; SSSE3-NEXT: pand %xmm3, %xmm7 -; SSSE3-NEXT: pand %xmm3, %xmm0 +; SSSE3-NEXT: pandn %xmm8, %xmm3 +; SSSE3-NEXT: por %xmm2, %xmm3 +; SSSE3-NEXT: movdqa {{.*#+}} xmm1 = [255,0,0,0,0,0,0,0,255,0,0,0,0,0,0,0] +; SSSE3-NEXT: pand %xmm1, %xmm3 +; SSSE3-NEXT: pand %xmm1, %xmm5 +; SSSE3-NEXT: packuswb %xmm3, %xmm5 +; SSSE3-NEXT: pand %xmm1, %xmm7 +; SSSE3-NEXT: pand %xmm1, %xmm0 ; SSSE3-NEXT: packuswb %xmm7, %xmm0 -; SSSE3-NEXT: packuswb %xmm2, %xmm0 +; SSSE3-NEXT: packuswb %xmm5, %xmm0 ; SSSE3-NEXT: packuswb %xmm0, %xmm0 ; SSSE3-NEXT: retq ; ; SSE41-LABEL: trunc_ssat_v8i64_v8i8: ; SSE41: # %bb.0: -; SSE41-NEXT: movdqa %xmm0, %xmm8 -; SSE41-NEXT: movapd {{.*#+}} xmm11 = [127,127] -; SSE41-NEXT: movdqa {{.*#+}} xmm5 = [2147483648,2147483648] -; SSE41-NEXT: movdqa %xmm3, %xmm0 -; SSE41-NEXT: pxor %xmm5, %xmm0 -; SSE41-NEXT: movdqa {{.*#+}} xmm6 = [2147483775,2147483775] -; SSE41-NEXT: movdqa %xmm6, %xmm7 +; SSE41-NEXT: movdqa (%rdi), %xmm9 +; SSE41-NEXT: movdqa 16(%rdi), %xmm10 +; SSE41-NEXT: movdqa 32(%rdi), %xmm3 +; SSE41-NEXT: movdqa 48(%rdi), %xmm5 +; SSE41-NEXT: movapd {{.*#+}} xmm4 = [127,127] +; SSE41-NEXT: movdqa {{.*#+}} xmm2 = [2147483648,2147483648] +; SSE41-NEXT: movdqa %xmm5, %xmm0 +; 
SSE41-NEXT: pxor %xmm2, %xmm0 +; SSE41-NEXT: movdqa {{.*#+}} xmm1 = [2147483775,2147483775] +; SSE41-NEXT: movdqa %xmm1, %xmm7 ; SSE41-NEXT: pcmpeqd %xmm0, %xmm7 -; SSE41-NEXT: movdqa %xmm6, %xmm4 -; SSE41-NEXT: pcmpgtd %xmm0, %xmm4 -; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm4[0,0,2,2] +; SSE41-NEXT: movdqa %xmm1, %xmm6 +; SSE41-NEXT: pcmpgtd %xmm0, %xmm6 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm6[0,0,2,2] ; SSE41-NEXT: pand %xmm7, %xmm0 -; SSE41-NEXT: por %xmm4, %xmm0 -; SSE41-NEXT: movapd %xmm11, %xmm9 -; SSE41-NEXT: blendvpd %xmm0, %xmm3, %xmm9 -; SSE41-NEXT: movdqa %xmm2, %xmm0 -; SSE41-NEXT: pxor %xmm5, %xmm0 -; SSE41-NEXT: movdqa %xmm6, %xmm3 +; SSE41-NEXT: por %xmm6, %xmm0 +; SSE41-NEXT: movapd %xmm4, %xmm8 +; SSE41-NEXT: blendvpd %xmm0, %xmm5, %xmm8 +; SSE41-NEXT: movdqa %xmm3, %xmm0 +; SSE41-NEXT: pxor %xmm2, %xmm0 +; SSE41-NEXT: movdqa %xmm1, %xmm5 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm5 +; SSE41-NEXT: movdqa %xmm1, %xmm6 +; SSE41-NEXT: pcmpgtd %xmm0, %xmm6 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm6[0,0,2,2] +; SSE41-NEXT: pand %xmm5, %xmm0 +; SSE41-NEXT: por %xmm6, %xmm0 +; SSE41-NEXT: movapd %xmm4, %xmm11 +; SSE41-NEXT: blendvpd %xmm0, %xmm3, %xmm11 +; SSE41-NEXT: movdqa %xmm10, %xmm0 +; SSE41-NEXT: pxor %xmm2, %xmm0 +; SSE41-NEXT: movdqa %xmm1, %xmm3 ; SSE41-NEXT: pcmpeqd %xmm0, %xmm3 -; SSE41-NEXT: movdqa %xmm6, %xmm4 -; SSE41-NEXT: pcmpgtd %xmm0, %xmm4 -; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm4[0,0,2,2] -; SSE41-NEXT: pand %xmm3, %xmm0 -; SSE41-NEXT: por %xmm4, %xmm0 -; SSE41-NEXT: movapd %xmm11, %xmm10 -; SSE41-NEXT: blendvpd %xmm0, %xmm2, %xmm10 -; SSE41-NEXT: movdqa %xmm1, %xmm0 -; SSE41-NEXT: pxor %xmm5, %xmm0 -; SSE41-NEXT: movdqa %xmm6, %xmm2 -; SSE41-NEXT: pcmpeqd %xmm0, %xmm2 -; SSE41-NEXT: movdqa %xmm6, %xmm3 -; SSE41-NEXT: pcmpgtd %xmm0, %xmm3 -; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm3[0,0,2,2] -; SSE41-NEXT: pand %xmm2, %xmm0 -; SSE41-NEXT: por %xmm3, %xmm0 -; SSE41-NEXT: movapd %xmm11, %xmm3 -; SSE41-NEXT: blendvpd %xmm0, %xmm1, %xmm3 -; 
SSE41-NEXT: movdqa %xmm8, %xmm0 -; SSE41-NEXT: pxor %xmm5, %xmm0 -; SSE41-NEXT: movdqa %xmm6, %xmm1 -; SSE41-NEXT: pcmpeqd %xmm0, %xmm1 +; SSE41-NEXT: movdqa %xmm1, %xmm6 ; SSE41-NEXT: pcmpgtd %xmm0, %xmm6 ; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm6[0,0,2,2] -; SSE41-NEXT: pand %xmm1, %xmm0 +; SSE41-NEXT: pand %xmm3, %xmm0 ; SSE41-NEXT: por %xmm6, %xmm0 -; SSE41-NEXT: blendvpd %xmm0, %xmm8, %xmm11 -; SSE41-NEXT: movapd {{.*#+}} xmm2 = [18446744073709551488,18446744073709551488] -; SSE41-NEXT: movapd %xmm11, %xmm1 -; SSE41-NEXT: xorpd %xmm5, %xmm1 -; SSE41-NEXT: movdqa {{.*#+}} xmm4 = [18446744071562067840,18446744071562067840] -; SSE41-NEXT: movapd %xmm1, %xmm6 -; SSE41-NEXT: pcmpeqd %xmm4, %xmm6 -; SSE41-NEXT: pcmpgtd %xmm4, %xmm1 +; SSE41-NEXT: movapd %xmm4, %xmm3 +; SSE41-NEXT: blendvpd %xmm0, %xmm10, %xmm3 +; SSE41-NEXT: movdqa %xmm9, %xmm0 +; SSE41-NEXT: pxor %xmm2, %xmm0 +; SSE41-NEXT: movdqa %xmm1, %xmm6 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm6 +; SSE41-NEXT: pcmpgtd %xmm0, %xmm1 ; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm1[0,0,2,2] ; SSE41-NEXT: pand %xmm6, %xmm0 ; SSE41-NEXT: por %xmm1, %xmm0 -; SSE41-NEXT: movapd %xmm2, %xmm1 -; SSE41-NEXT: blendvpd %xmm0, %xmm11, %xmm1 -; SSE41-NEXT: movapd %xmm3, %xmm6 -; SSE41-NEXT: xorpd %xmm5, %xmm6 -; SSE41-NEXT: movapd %xmm6, %xmm7 -; SSE41-NEXT: pcmpeqd %xmm4, %xmm7 -; SSE41-NEXT: pcmpgtd %xmm4, %xmm6 -; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm6[0,0,2,2] -; SSE41-NEXT: pand %xmm7, %xmm0 -; SSE41-NEXT: por %xmm6, %xmm0 -; SSE41-NEXT: movapd %xmm2, %xmm7 -; SSE41-NEXT: blendvpd %xmm0, %xmm3, %xmm7 -; SSE41-NEXT: movapd %xmm10, %xmm3 -; SSE41-NEXT: xorpd %xmm5, %xmm3 -; SSE41-NEXT: movapd %xmm3, %xmm6 -; SSE41-NEXT: pcmpeqd %xmm4, %xmm6 -; SSE41-NEXT: pcmpgtd %xmm4, %xmm3 +; SSE41-NEXT: blendvpd %xmm0, %xmm9, %xmm4 +; SSE41-NEXT: movapd {{.*#+}} xmm6 = [18446744073709551488,18446744073709551488] +; SSE41-NEXT: movapd %xmm4, %xmm1 +; SSE41-NEXT: xorpd %xmm2, %xmm1 +; SSE41-NEXT: movdqa {{.*#+}} xmm7 = 
[18446744071562067840,18446744071562067840] +; SSE41-NEXT: movapd %xmm1, %xmm5 +; SSE41-NEXT: pcmpeqd %xmm7, %xmm5 +; SSE41-NEXT: pcmpgtd %xmm7, %xmm1 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm1[0,0,2,2] +; SSE41-NEXT: pand %xmm5, %xmm0 +; SSE41-NEXT: por %xmm1, %xmm0 +; SSE41-NEXT: movapd %xmm6, %xmm1 +; SSE41-NEXT: blendvpd %xmm0, %xmm4, %xmm1 +; SSE41-NEXT: movapd %xmm3, %xmm4 +; SSE41-NEXT: xorpd %xmm2, %xmm4 +; SSE41-NEXT: movapd %xmm4, %xmm5 +; SSE41-NEXT: pcmpeqd %xmm7, %xmm5 +; SSE41-NEXT: pcmpgtd %xmm7, %xmm4 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm4[0,0,2,2] +; SSE41-NEXT: pand %xmm5, %xmm0 +; SSE41-NEXT: por %xmm4, %xmm0 +; SSE41-NEXT: movapd %xmm6, %xmm4 +; SSE41-NEXT: blendvpd %xmm0, %xmm3, %xmm4 +; SSE41-NEXT: movapd %xmm11, %xmm3 +; SSE41-NEXT: xorpd %xmm2, %xmm3 +; SSE41-NEXT: movapd %xmm3, %xmm5 +; SSE41-NEXT: pcmpeqd %xmm7, %xmm5 +; SSE41-NEXT: pcmpgtd %xmm7, %xmm3 ; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm3[0,0,2,2] -; SSE41-NEXT: pand %xmm6, %xmm0 +; SSE41-NEXT: pand %xmm5, %xmm0 ; SSE41-NEXT: por %xmm3, %xmm0 -; SSE41-NEXT: movapd %xmm2, %xmm3 -; SSE41-NEXT: blendvpd %xmm0, %xmm10, %xmm3 -; SSE41-NEXT: xorpd %xmm9, %xmm5 -; SSE41-NEXT: movapd %xmm5, %xmm6 -; SSE41-NEXT: pcmpeqd %xmm4, %xmm6 -; SSE41-NEXT: pcmpgtd %xmm4, %xmm5 -; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm5[0,0,2,2] -; SSE41-NEXT: pand %xmm6, %xmm0 -; SSE41-NEXT: por %xmm5, %xmm0 -; SSE41-NEXT: blendvpd %xmm0, %xmm9, %xmm2 +; SSE41-NEXT: movapd %xmm6, %xmm3 +; SSE41-NEXT: blendvpd %xmm0, %xmm11, %xmm3 +; SSE41-NEXT: xorpd %xmm8, %xmm2 +; SSE41-NEXT: movapd %xmm2, %xmm5 +; SSE41-NEXT: pcmpeqd %xmm7, %xmm5 +; SSE41-NEXT: pcmpgtd %xmm7, %xmm2 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm2[0,0,2,2] +; SSE41-NEXT: pand %xmm5, %xmm0 +; SSE41-NEXT: por %xmm2, %xmm0 +; SSE41-NEXT: blendvpd %xmm0, %xmm8, %xmm6 ; SSE41-NEXT: movapd {{.*#+}} xmm0 = [255,0,0,0,0,0,0,0,255,0,0,0,0,0,0,0] -; SSE41-NEXT: andpd %xmm0, %xmm2 +; SSE41-NEXT: andpd %xmm0, %xmm6 ; SSE41-NEXT: andpd %xmm0, %xmm3 -; SSE41-NEXT: 
packusdw %xmm2, %xmm3 -; SSE41-NEXT: andpd %xmm0, %xmm7 +; SSE41-NEXT: packusdw %xmm6, %xmm3 +; SSE41-NEXT: andpd %xmm0, %xmm4 ; SSE41-NEXT: andpd %xmm0, %xmm1 -; SSE41-NEXT: packusdw %xmm7, %xmm1 +; SSE41-NEXT: packusdw %xmm4, %xmm1 ; SSE41-NEXT: packusdw %xmm3, %xmm1 ; SSE41-NEXT: packuswb %xmm1, %xmm1 ; SSE41-NEXT: movdqa %xmm1, %xmm0 @@ -2658,32 +2789,34 @@ define <8 x i8> @trunc_ssat_v8i64_v8i8(< ; ; AVX1-LABEL: trunc_ssat_v8i64_v8i8: ; AVX1: # %bb.0: -; AVX1-NEXT: vmovapd {{.*#+}} ymm8 = [127,127,127,127] -; AVX1-NEXT: vextractf128 $1, %ymm1, %xmm3 -; AVX1-NEXT: vmovdqa {{.*#+}} xmm4 = [127,127] -; AVX1-NEXT: vpcmpgtq %xmm3, %xmm4, %xmm5 -; AVX1-NEXT: vpcmpgtq %xmm1, %xmm4, %xmm6 -; AVX1-NEXT: vinsertf128 $1, %xmm5, %ymm6, %ymm7 -; AVX1-NEXT: vblendvpd %ymm7, %ymm1, %ymm8, %ymm9 -; AVX1-NEXT: vextractf128 $1, %ymm0, %xmm2 -; AVX1-NEXT: vpcmpgtq %xmm2, %xmm4, %xmm7 -; AVX1-NEXT: vpcmpgtq %xmm0, %xmm4, %xmm10 -; AVX1-NEXT: vinsertf128 $1, %xmm7, %ymm10, %ymm11 -; AVX1-NEXT: vblendvpd %ymm11, %ymm0, %ymm8, %ymm8 +; AVX1-NEXT: vmovapd {{.*#+}} ymm9 = [127,127,127,127] +; AVX1-NEXT: vmovdqa 48(%rdi), %xmm1 +; AVX1-NEXT: vmovdqa {{.*#+}} xmm2 = [127,127] +; AVX1-NEXT: vpcmpgtq %xmm1, %xmm2, %xmm3 +; AVX1-NEXT: vmovdqa (%rdi), %xmm4 +; AVX1-NEXT: vmovdqa 16(%rdi), %xmm5 +; AVX1-NEXT: vmovdqa 32(%rdi), %xmm6 +; AVX1-NEXT: vpcmpgtq %xmm6, %xmm2, %xmm7 +; AVX1-NEXT: vinsertf128 $1, %xmm3, %ymm7, %ymm8 +; AVX1-NEXT: vblendvpd %ymm8, 32(%rdi), %ymm9, %ymm8 +; AVX1-NEXT: vpcmpgtq %xmm5, %xmm2, %xmm0 +; AVX1-NEXT: vpcmpgtq %xmm4, %xmm2, %xmm10 +; AVX1-NEXT: vinsertf128 $1, %xmm0, %ymm10, %ymm11 +; AVX1-NEXT: vblendvpd %ymm11, (%rdi), %ymm9, %ymm9 ; AVX1-NEXT: vmovapd {{.*#+}} ymm11 = [18446744073709551488,18446744073709551488,18446744073709551488,18446744073709551488] -; AVX1-NEXT: vblendvpd %xmm7, %xmm2, %xmm4, %xmm2 -; AVX1-NEXT: vmovdqa {{.*#+}} xmm7 = [18446744073709551488,18446744073709551488] -; AVX1-NEXT: vpcmpgtq %xmm7, %xmm2, %xmm2 -; AVX1-NEXT: vblendvpd %xmm10, 
%xmm0, %xmm4, %xmm0 -; AVX1-NEXT: vpcmpgtq %xmm7, %xmm0, %xmm0 -; AVX1-NEXT: vinsertf128 $1, %xmm2, %ymm0, %ymm0 -; AVX1-NEXT: vblendvpd %ymm0, %ymm8, %ymm11, %ymm0 -; AVX1-NEXT: vblendvpd %xmm5, %xmm3, %xmm4, %xmm2 -; AVX1-NEXT: vpcmpgtq %xmm7, %xmm2, %xmm2 -; AVX1-NEXT: vblendvpd %xmm6, %xmm1, %xmm4, %xmm1 -; AVX1-NEXT: vpcmpgtq %xmm7, %xmm1, %xmm1 -; AVX1-NEXT: vinsertf128 $1, %xmm2, %ymm1, %ymm1 -; AVX1-NEXT: vblendvpd %ymm1, %ymm9, %ymm11, %ymm1 +; AVX1-NEXT: vblendvpd %xmm0, %xmm5, %xmm2, %xmm0 +; AVX1-NEXT: vmovdqa {{.*#+}} xmm5 = [18446744073709551488,18446744073709551488] +; AVX1-NEXT: vpcmpgtq %xmm5, %xmm0, %xmm0 +; AVX1-NEXT: vblendvpd %xmm10, %xmm4, %xmm2, %xmm4 +; AVX1-NEXT: vpcmpgtq %xmm5, %xmm4, %xmm4 +; AVX1-NEXT: vinsertf128 $1, %xmm0, %ymm4, %ymm0 +; AVX1-NEXT: vblendvpd %ymm0, %ymm9, %ymm11, %ymm0 +; AVX1-NEXT: vblendvpd %xmm3, %xmm1, %xmm2, %xmm1 +; AVX1-NEXT: vpcmpgtq %xmm5, %xmm1, %xmm1 +; AVX1-NEXT: vblendvpd %xmm7, %xmm6, %xmm2, %xmm2 +; AVX1-NEXT: vpcmpgtq %xmm5, %xmm2, %xmm2 +; AVX1-NEXT: vinsertf128 $1, %xmm1, %ymm2, %ymm1 +; AVX1-NEXT: vblendvpd %ymm1, %ymm8, %ymm11, %ymm1 ; AVX1-NEXT: vmovapd {{.*#+}} ymm2 = [255,255,255,255] ; AVX1-NEXT: vandpd %ymm2, %ymm1, %ymm1 ; AVX1-NEXT: vextractf128 $1, %ymm1, %xmm3 @@ -2698,6 +2831,8 @@ define <8 x i8> @trunc_ssat_v8i64_v8i8(< ; ; AVX2-LABEL: trunc_ssat_v8i64_v8i8: ; AVX2: # %bb.0: +; AVX2-NEXT: vmovdqa (%rdi), %ymm0 +; AVX2-NEXT: vmovdqa 32(%rdi), %ymm1 ; AVX2-NEXT: vpbroadcastq {{.*#+}} ymm2 = [127,127,127,127] ; AVX2-NEXT: vpcmpgtq %ymm1, %ymm2, %ymm3 ; AVX2-NEXT: vblendvpd %ymm3, %ymm1, %ymm2, %ymm1 @@ -2724,9 +2859,21 @@ define <8 x i8> @trunc_ssat_v8i64_v8i8(< ; ; AVX512-LABEL: trunc_ssat_v8i64_v8i8: ; AVX512: # %bb.0: +; AVX512-NEXT: vmovdqa64 (%rdi), %zmm0 ; AVX512-NEXT: vpmovsqb %zmm0, %xmm0 ; AVX512-NEXT: vzeroupper ; AVX512-NEXT: retq +; +; SKX-LABEL: trunc_ssat_v8i64_v8i8: +; SKX: # %bb.0: +; SKX-NEXT: vmovdqa (%rdi), %ymm0 +; SKX-NEXT: vmovdqa 32(%rdi), %ymm1 +; SKX-NEXT: vpmovsqb 
%ymm1, %xmm1 +; SKX-NEXT: vpmovsqb %ymm0, %xmm0 +; SKX-NEXT: vpunpckldq {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1] +; SKX-NEXT: vzeroupper +; SKX-NEXT: retq + %a0 = load <8 x i64>, <8 x i64>* %p0 %1 = icmp slt <8 x i64> %a0, %2 = select <8 x i1> %1, <8 x i64> %a0, <8 x i64> %3 = icmp sgt <8 x i64> %2, @@ -2737,379 +2884,392 @@ define <8 x i8> @trunc_ssat_v8i64_v8i8(< ; TODO: The AVX1 codegen shows a missed opportunity to narrow blendv+logic to 128-bit. -define void @trunc_ssat_v8i64_v8i8_store(<8 x i64> %a0, <8 x i8> *%p1) { +define void @trunc_ssat_v8i64_v8i8_store(<8 x i64>* %p0, <8 x i8> *%p1) "min-legal-vector-width"="256" { ; SSE2-LABEL: trunc_ssat_v8i64_v8i8_store: ; SSE2: # %bb.0: +; SSE2-NEXT: movdqa (%rdi), %xmm9 +; SSE2-NEXT: movdqa 16(%rdi), %xmm7 +; SSE2-NEXT: movdqa 32(%rdi), %xmm5 +; SSE2-NEXT: movdqa 48(%rdi), %xmm2 ; SSE2-NEXT: movdqa {{.*#+}} xmm8 = [127,127] -; SSE2-NEXT: movdqa {{.*#+}} xmm4 = [2147483648,2147483648] -; SSE2-NEXT: movdqa %xmm3, %xmm5 -; SSE2-NEXT: pxor %xmm4, %xmm5 -; SSE2-NEXT: movdqa {{.*#+}} xmm9 = [2147483775,2147483775] -; SSE2-NEXT: movdqa %xmm9, %xmm7 -; SSE2-NEXT: pcmpgtd %xmm5, %xmm7 -; SSE2-NEXT: pshufd {{.*#+}} xmm10 = xmm7[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm9, %xmm5 -; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm5[1,1,3,3] -; SSE2-NEXT: pand %xmm10, %xmm6 -; SSE2-NEXT: pshufd {{.*#+}} xmm5 = xmm7[1,1,3,3] -; SSE2-NEXT: por %xmm6, %xmm5 +; SSE2-NEXT: movdqa {{.*#+}} xmm0 = [2147483648,2147483648] +; SSE2-NEXT: movdqa %xmm2, %xmm1 +; SSE2-NEXT: pxor %xmm0, %xmm1 +; SSE2-NEXT: movdqa {{.*#+}} xmm10 = [2147483775,2147483775] +; SSE2-NEXT: movdqa %xmm10, %xmm6 +; SSE2-NEXT: pcmpgtd %xmm1, %xmm6 +; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm6[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm10, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm1[1,1,3,3] +; SSE2-NEXT: pand %xmm3, %xmm4 +; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm6[1,1,3,3] +; SSE2-NEXT: por %xmm4, %xmm1 +; SSE2-NEXT: pand %xmm1, %xmm2 +; SSE2-NEXT: pandn %xmm8, %xmm1 +; SSE2-NEXT: por 
%xmm2, %xmm1 +; SSE2-NEXT: movdqa %xmm5, %xmm2 +; SSE2-NEXT: pxor %xmm0, %xmm2 +; SSE2-NEXT: movdqa %xmm10, %xmm3 +; SSE2-NEXT: pcmpgtd %xmm2, %xmm3 +; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm3[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm10, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm2[1,1,3,3] +; SSE2-NEXT: pand %xmm4, %xmm6 +; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm3[1,1,3,3] +; SSE2-NEXT: por %xmm6, %xmm2 +; SSE2-NEXT: pand %xmm2, %xmm5 +; SSE2-NEXT: pandn %xmm8, %xmm2 +; SSE2-NEXT: por %xmm5, %xmm2 +; SSE2-NEXT: movdqa %xmm7, %xmm3 +; SSE2-NEXT: pxor %xmm0, %xmm3 +; SSE2-NEXT: movdqa %xmm10, %xmm4 +; SSE2-NEXT: pcmpgtd %xmm3, %xmm4 +; SSE2-NEXT: pshufd {{.*#+}} xmm5 = xmm4[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm10, %xmm3 +; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm3[1,1,3,3] ; SSE2-NEXT: pand %xmm5, %xmm3 -; SSE2-NEXT: pandn %xmm8, %xmm5 +; SSE2-NEXT: pshufd {{.*#+}} xmm5 = xmm4[1,1,3,3] ; SSE2-NEXT: por %xmm3, %xmm5 -; SSE2-NEXT: movdqa %xmm2, %xmm3 -; SSE2-NEXT: pxor %xmm4, %xmm3 -; SSE2-NEXT: movdqa %xmm9, %xmm6 -; SSE2-NEXT: pcmpgtd %xmm3, %xmm6 -; SSE2-NEXT: pshufd {{.*#+}} xmm10 = xmm6[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm9, %xmm3 -; SSE2-NEXT: pshufd {{.*#+}} xmm7 = xmm3[1,1,3,3] -; SSE2-NEXT: pand %xmm10, %xmm7 -; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm6[1,1,3,3] -; SSE2-NEXT: por %xmm7, %xmm3 -; SSE2-NEXT: pand %xmm3, %xmm2 -; SSE2-NEXT: pandn %xmm8, %xmm3 -; SSE2-NEXT: por %xmm2, %xmm3 -; SSE2-NEXT: movdqa %xmm1, %xmm2 -; SSE2-NEXT: pxor %xmm4, %xmm2 -; SSE2-NEXT: movdqa %xmm9, %xmm6 -; SSE2-NEXT: pcmpgtd %xmm2, %xmm6 -; SSE2-NEXT: pshufd {{.*#+}} xmm10 = xmm6[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm9, %xmm2 -; SSE2-NEXT: pshufd {{.*#+}} xmm7 = xmm2[1,1,3,3] -; SSE2-NEXT: pand %xmm10, %xmm7 -; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm6[1,1,3,3] -; SSE2-NEXT: por %xmm7, %xmm2 -; SSE2-NEXT: pand %xmm2, %xmm1 -; SSE2-NEXT: pandn %xmm8, %xmm2 -; SSE2-NEXT: por %xmm1, %xmm2 -; SSE2-NEXT: movdqa %xmm0, %xmm1 -; SSE2-NEXT: pxor %xmm4, %xmm1 -; SSE2-NEXT: movdqa %xmm9, %xmm6 -; SSE2-NEXT: 
pcmpgtd %xmm1, %xmm6 -; SSE2-NEXT: pshufd {{.*#+}} xmm7 = xmm6[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm9, %xmm1 -; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] -; SSE2-NEXT: pand %xmm7, %xmm1 -; SSE2-NEXT: pshufd {{.*#+}} xmm7 = xmm6[1,1,3,3] -; SSE2-NEXT: por %xmm1, %xmm7 -; SSE2-NEXT: pand %xmm7, %xmm0 +; SSE2-NEXT: pand %xmm5, %xmm7 +; SSE2-NEXT: pandn %xmm8, %xmm5 +; SSE2-NEXT: por %xmm7, %xmm5 +; SSE2-NEXT: movdqa %xmm9, %xmm3 +; SSE2-NEXT: pxor %xmm0, %xmm3 +; SSE2-NEXT: movdqa %xmm10, %xmm4 +; SSE2-NEXT: pcmpgtd %xmm3, %xmm4 +; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm4[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm10, %xmm3 +; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm3[1,1,3,3] +; SSE2-NEXT: pand %xmm6, %xmm3 +; SSE2-NEXT: pshufd {{.*#+}} xmm7 = xmm4[1,1,3,3] +; SSE2-NEXT: por %xmm3, %xmm7 +; SSE2-NEXT: pand %xmm7, %xmm9 ; SSE2-NEXT: pandn %xmm8, %xmm7 -; SSE2-NEXT: por %xmm0, %xmm7 +; SSE2-NEXT: por %xmm9, %xmm7 ; SSE2-NEXT: movdqa {{.*#+}} xmm8 = [18446744073709551488,18446744073709551488] -; SSE2-NEXT: movdqa %xmm7, %xmm0 -; SSE2-NEXT: pxor %xmm4, %xmm0 +; SSE2-NEXT: movdqa %xmm7, %xmm3 +; SSE2-NEXT: pxor %xmm0, %xmm3 ; SSE2-NEXT: movdqa {{.*#+}} xmm9 = [18446744071562067840,18446744071562067840] -; SSE2-NEXT: movdqa %xmm0, %xmm1 -; SSE2-NEXT: pcmpgtd %xmm9, %xmm1 -; SSE2-NEXT: pshufd {{.*#+}} xmm10 = xmm1[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm9, %xmm0 -; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm0[1,1,3,3] +; SSE2-NEXT: movdqa %xmm3, %xmm4 +; SSE2-NEXT: pcmpgtd %xmm9, %xmm4 +; SSE2-NEXT: pshufd {{.*#+}} xmm10 = xmm4[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm9, %xmm3 +; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm3[1,1,3,3] ; SSE2-NEXT: pand %xmm10, %xmm6 -; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm1[1,1,3,3] -; SSE2-NEXT: por %xmm6, %xmm0 -; SSE2-NEXT: pand %xmm0, %xmm7 -; SSE2-NEXT: pandn %xmm8, %xmm0 -; SSE2-NEXT: por %xmm7, %xmm0 -; SSE2-NEXT: movdqa %xmm2, %xmm1 -; SSE2-NEXT: pxor %xmm4, %xmm1 -; SSE2-NEXT: movdqa %xmm1, %xmm6 +; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm4[1,1,3,3] +; SSE2-NEXT: por 
%xmm6, %xmm3 +; SSE2-NEXT: pand %xmm3, %xmm7 +; SSE2-NEXT: pandn %xmm8, %xmm3 +; SSE2-NEXT: por %xmm7, %xmm3 +; SSE2-NEXT: movdqa %xmm5, %xmm4 +; SSE2-NEXT: pxor %xmm0, %xmm4 +; SSE2-NEXT: movdqa %xmm4, %xmm6 ; SSE2-NEXT: pcmpgtd %xmm9, %xmm6 ; SSE2-NEXT: pshufd {{.*#+}} xmm7 = xmm6[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm9, %xmm1 -; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] -; SSE2-NEXT: pand %xmm7, %xmm1 +; SSE2-NEXT: pcmpeqd %xmm9, %xmm4 +; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm4[1,1,3,3] +; SSE2-NEXT: pand %xmm7, %xmm4 ; SSE2-NEXT: pshufd {{.*#+}} xmm7 = xmm6[1,1,3,3] -; SSE2-NEXT: por %xmm1, %xmm7 -; SSE2-NEXT: pand %xmm7, %xmm2 +; SSE2-NEXT: por %xmm4, %xmm7 +; SSE2-NEXT: pand %xmm7, %xmm5 ; SSE2-NEXT: pandn %xmm8, %xmm7 -; SSE2-NEXT: por %xmm2, %xmm7 -; SSE2-NEXT: movdqa %xmm3, %xmm1 -; SSE2-NEXT: pxor %xmm4, %xmm1 -; SSE2-NEXT: movdqa %xmm1, %xmm2 +; SSE2-NEXT: por %xmm5, %xmm7 +; SSE2-NEXT: movdqa %xmm2, %xmm4 +; SSE2-NEXT: pxor %xmm0, %xmm4 +; SSE2-NEXT: movdqa %xmm4, %xmm5 +; SSE2-NEXT: pcmpgtd %xmm9, %xmm5 +; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm5[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm9, %xmm4 +; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm4[1,1,3,3] +; SSE2-NEXT: pand %xmm6, %xmm4 +; SSE2-NEXT: pshufd {{.*#+}} xmm5 = xmm5[1,1,3,3] +; SSE2-NEXT: por %xmm4, %xmm5 +; SSE2-NEXT: pand %xmm5, %xmm2 +; SSE2-NEXT: pandn %xmm8, %xmm5 +; SSE2-NEXT: por %xmm2, %xmm5 +; SSE2-NEXT: pxor %xmm1, %xmm0 +; SSE2-NEXT: movdqa %xmm0, %xmm2 ; SSE2-NEXT: pcmpgtd %xmm9, %xmm2 -; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm2[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm9, %xmm1 -; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] -; SSE2-NEXT: pand %xmm6, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm2[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm9, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] +; SSE2-NEXT: pand %xmm4, %xmm0 ; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] -; SSE2-NEXT: por %xmm1, %xmm2 -; SSE2-NEXT: pand %xmm2, %xmm3 +; SSE2-NEXT: por %xmm0, %xmm2 +; SSE2-NEXT: pand %xmm2, %xmm1 ; 
SSE2-NEXT: pandn %xmm8, %xmm2 -; SSE2-NEXT: por %xmm3, %xmm2 -; SSE2-NEXT: pxor %xmm5, %xmm4 -; SSE2-NEXT: movdqa %xmm4, %xmm1 -; SSE2-NEXT: pcmpgtd %xmm9, %xmm1 -; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm1[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm9, %xmm4 -; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm4[1,1,3,3] -; SSE2-NEXT: pand %xmm3, %xmm4 -; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] -; SSE2-NEXT: por %xmm4, %xmm1 -; SSE2-NEXT: pand %xmm1, %xmm5 -; SSE2-NEXT: pandn %xmm8, %xmm1 -; SSE2-NEXT: por %xmm5, %xmm1 -; SSE2-NEXT: movdqa {{.*#+}} xmm3 = [255,0,0,0,0,0,0,0,255,0,0,0,0,0,0,0] -; SSE2-NEXT: pand %xmm3, %xmm1 -; SSE2-NEXT: pand %xmm3, %xmm2 -; SSE2-NEXT: packuswb %xmm1, %xmm2 -; SSE2-NEXT: pand %xmm3, %xmm7 -; SSE2-NEXT: pand %xmm3, %xmm0 -; SSE2-NEXT: packuswb %xmm7, %xmm0 -; SSE2-NEXT: packuswb %xmm2, %xmm0 -; SSE2-NEXT: packuswb %xmm0, %xmm0 -; SSE2-NEXT: movq %xmm0, (%rdi) +; SSE2-NEXT: por %xmm1, %xmm2 +; SSE2-NEXT: movdqa {{.*#+}} xmm0 = [255,0,0,0,0,0,0,0,255,0,0,0,0,0,0,0] +; SSE2-NEXT: pand %xmm0, %xmm2 +; SSE2-NEXT: pand %xmm0, %xmm5 +; SSE2-NEXT: packuswb %xmm2, %xmm5 +; SSE2-NEXT: pand %xmm0, %xmm7 +; SSE2-NEXT: pand %xmm0, %xmm3 +; SSE2-NEXT: packuswb %xmm7, %xmm3 +; SSE2-NEXT: packuswb %xmm5, %xmm3 +; SSE2-NEXT: packuswb %xmm0, %xmm3 +; SSE2-NEXT: movq %xmm3, (%rsi) ; SSE2-NEXT: retq ; ; SSSE3-LABEL: trunc_ssat_v8i64_v8i8_store: ; SSSE3: # %bb.0: +; SSSE3-NEXT: movdqa (%rdi), %xmm9 +; SSSE3-NEXT: movdqa 16(%rdi), %xmm7 +; SSSE3-NEXT: movdqa 32(%rdi), %xmm5 +; SSSE3-NEXT: movdqa 48(%rdi), %xmm2 ; SSSE3-NEXT: movdqa {{.*#+}} xmm8 = [127,127] -; SSSE3-NEXT: movdqa {{.*#+}} xmm4 = [2147483648,2147483648] -; SSSE3-NEXT: movdqa %xmm3, %xmm5 -; SSSE3-NEXT: pxor %xmm4, %xmm5 -; SSSE3-NEXT: movdqa {{.*#+}} xmm9 = [2147483775,2147483775] -; SSSE3-NEXT: movdqa %xmm9, %xmm7 -; SSSE3-NEXT: pcmpgtd %xmm5, %xmm7 -; SSSE3-NEXT: pshufd {{.*#+}} xmm10 = xmm7[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm9, %xmm5 -; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm5[1,1,3,3] -; SSSE3-NEXT: 
pand %xmm10, %xmm6 -; SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm7[1,1,3,3] -; SSSE3-NEXT: por %xmm6, %xmm5 +; SSSE3-NEXT: movdqa {{.*#+}} xmm0 = [2147483648,2147483648] +; SSSE3-NEXT: movdqa %xmm2, %xmm1 +; SSSE3-NEXT: pxor %xmm0, %xmm1 +; SSSE3-NEXT: movdqa {{.*#+}} xmm10 = [2147483775,2147483775] +; SSSE3-NEXT: movdqa %xmm10, %xmm6 +; SSSE3-NEXT: pcmpgtd %xmm1, %xmm6 +; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm6[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm10, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm1[1,1,3,3] +; SSSE3-NEXT: pand %xmm3, %xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm6[1,1,3,3] +; SSSE3-NEXT: por %xmm4, %xmm1 +; SSSE3-NEXT: pand %xmm1, %xmm2 +; SSSE3-NEXT: pandn %xmm8, %xmm1 +; SSSE3-NEXT: por %xmm2, %xmm1 +; SSSE3-NEXT: movdqa %xmm5, %xmm2 +; SSSE3-NEXT: pxor %xmm0, %xmm2 +; SSSE3-NEXT: movdqa %xmm10, %xmm3 +; SSSE3-NEXT: pcmpgtd %xmm2, %xmm3 +; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm3[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm10, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm2[1,1,3,3] +; SSSE3-NEXT: pand %xmm4, %xmm6 +; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm3[1,1,3,3] +; SSSE3-NEXT: por %xmm6, %xmm2 +; SSSE3-NEXT: pand %xmm2, %xmm5 +; SSSE3-NEXT: pandn %xmm8, %xmm2 +; SSSE3-NEXT: por %xmm5, %xmm2 +; SSSE3-NEXT: movdqa %xmm7, %xmm3 +; SSSE3-NEXT: pxor %xmm0, %xmm3 +; SSSE3-NEXT: movdqa %xmm10, %xmm4 +; SSSE3-NEXT: pcmpgtd %xmm3, %xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm4[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm10, %xmm3 +; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm3[1,1,3,3] ; SSSE3-NEXT: pand %xmm5, %xmm3 -; SSSE3-NEXT: pandn %xmm8, %xmm5 +; SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm4[1,1,3,3] ; SSSE3-NEXT: por %xmm3, %xmm5 -; SSSE3-NEXT: movdqa %xmm2, %xmm3 -; SSSE3-NEXT: pxor %xmm4, %xmm3 -; SSSE3-NEXT: movdqa %xmm9, %xmm6 -; SSSE3-NEXT: pcmpgtd %xmm3, %xmm6 -; SSSE3-NEXT: pshufd {{.*#+}} xmm10 = xmm6[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm9, %xmm3 -; SSSE3-NEXT: pshufd {{.*#+}} xmm7 = xmm3[1,1,3,3] -; SSSE3-NEXT: pand %xmm10, %xmm7 -; SSSE3-NEXT: pshufd {{.*#+}} 
xmm3 = xmm6[1,1,3,3] -; SSSE3-NEXT: por %xmm7, %xmm3 -; SSSE3-NEXT: pand %xmm3, %xmm2 -; SSSE3-NEXT: pandn %xmm8, %xmm3 -; SSSE3-NEXT: por %xmm2, %xmm3 -; SSSE3-NEXT: movdqa %xmm1, %xmm2 -; SSSE3-NEXT: pxor %xmm4, %xmm2 -; SSSE3-NEXT: movdqa %xmm9, %xmm6 -; SSSE3-NEXT: pcmpgtd %xmm2, %xmm6 -; SSSE3-NEXT: pshufd {{.*#+}} xmm10 = xmm6[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm9, %xmm2 -; SSSE3-NEXT: pshufd {{.*#+}} xmm7 = xmm2[1,1,3,3] -; SSSE3-NEXT: pand %xmm10, %xmm7 -; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm6[1,1,3,3] -; SSSE3-NEXT: por %xmm7, %xmm2 -; SSSE3-NEXT: pand %xmm2, %xmm1 -; SSSE3-NEXT: pandn %xmm8, %xmm2 -; SSSE3-NEXT: por %xmm1, %xmm2 -; SSSE3-NEXT: movdqa %xmm0, %xmm1 -; SSSE3-NEXT: pxor %xmm4, %xmm1 -; SSSE3-NEXT: movdqa %xmm9, %xmm6 -; SSSE3-NEXT: pcmpgtd %xmm1, %xmm6 -; SSSE3-NEXT: pshufd {{.*#+}} xmm7 = xmm6[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm9, %xmm1 -; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] -; SSSE3-NEXT: pand %xmm7, %xmm1 -; SSSE3-NEXT: pshufd {{.*#+}} xmm7 = xmm6[1,1,3,3] -; SSSE3-NEXT: por %xmm1, %xmm7 -; SSSE3-NEXT: pand %xmm7, %xmm0 +; SSSE3-NEXT: pand %xmm5, %xmm7 +; SSSE3-NEXT: pandn %xmm8, %xmm5 +; SSSE3-NEXT: por %xmm7, %xmm5 +; SSSE3-NEXT: movdqa %xmm9, %xmm3 +; SSSE3-NEXT: pxor %xmm0, %xmm3 +; SSSE3-NEXT: movdqa %xmm10, %xmm4 +; SSSE3-NEXT: pcmpgtd %xmm3, %xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm4[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm10, %xmm3 +; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm3[1,1,3,3] +; SSSE3-NEXT: pand %xmm6, %xmm3 +; SSSE3-NEXT: pshufd {{.*#+}} xmm7 = xmm4[1,1,3,3] +; SSSE3-NEXT: por %xmm3, %xmm7 +; SSSE3-NEXT: pand %xmm7, %xmm9 ; SSSE3-NEXT: pandn %xmm8, %xmm7 -; SSSE3-NEXT: por %xmm0, %xmm7 +; SSSE3-NEXT: por %xmm9, %xmm7 ; SSSE3-NEXT: movdqa {{.*#+}} xmm8 = [18446744073709551488,18446744073709551488] -; SSSE3-NEXT: movdqa %xmm7, %xmm0 -; SSSE3-NEXT: pxor %xmm4, %xmm0 +; SSSE3-NEXT: movdqa %xmm7, %xmm3 +; SSSE3-NEXT: pxor %xmm0, %xmm3 ; SSSE3-NEXT: movdqa {{.*#+}} xmm9 = 
[18446744071562067840,18446744071562067840] -; SSSE3-NEXT: movdqa %xmm0, %xmm1 -; SSSE3-NEXT: pcmpgtd %xmm9, %xmm1 -; SSSE3-NEXT: pshufd {{.*#+}} xmm10 = xmm1[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm9, %xmm0 -; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm0[1,1,3,3] +; SSSE3-NEXT: movdqa %xmm3, %xmm4 +; SSSE3-NEXT: pcmpgtd %xmm9, %xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm10 = xmm4[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm9, %xmm3 +; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm3[1,1,3,3] ; SSSE3-NEXT: pand %xmm10, %xmm6 -; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm1[1,1,3,3] -; SSSE3-NEXT: por %xmm6, %xmm0 -; SSSE3-NEXT: pand %xmm0, %xmm7 -; SSSE3-NEXT: pandn %xmm8, %xmm0 -; SSSE3-NEXT: por %xmm7, %xmm0 -; SSSE3-NEXT: movdqa %xmm2, %xmm1 -; SSSE3-NEXT: pxor %xmm4, %xmm1 -; SSSE3-NEXT: movdqa %xmm1, %xmm6 +; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm4[1,1,3,3] +; SSSE3-NEXT: por %xmm6, %xmm3 +; SSSE3-NEXT: pand %xmm3, %xmm7 +; SSSE3-NEXT: pandn %xmm8, %xmm3 +; SSSE3-NEXT: por %xmm7, %xmm3 +; SSSE3-NEXT: movdqa %xmm5, %xmm4 +; SSSE3-NEXT: pxor %xmm0, %xmm4 +; SSSE3-NEXT: movdqa %xmm4, %xmm6 ; SSSE3-NEXT: pcmpgtd %xmm9, %xmm6 ; SSSE3-NEXT: pshufd {{.*#+}} xmm7 = xmm6[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm9, %xmm1 -; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] -; SSSE3-NEXT: pand %xmm7, %xmm1 +; SSSE3-NEXT: pcmpeqd %xmm9, %xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm4[1,1,3,3] +; SSSE3-NEXT: pand %xmm7, %xmm4 ; SSSE3-NEXT: pshufd {{.*#+}} xmm7 = xmm6[1,1,3,3] -; SSSE3-NEXT: por %xmm1, %xmm7 -; SSSE3-NEXT: pand %xmm7, %xmm2 +; SSSE3-NEXT: por %xmm4, %xmm7 +; SSSE3-NEXT: pand %xmm7, %xmm5 ; SSSE3-NEXT: pandn %xmm8, %xmm7 -; SSSE3-NEXT: por %xmm2, %xmm7 -; SSSE3-NEXT: movdqa %xmm3, %xmm1 -; SSSE3-NEXT: pxor %xmm4, %xmm1 -; SSSE3-NEXT: movdqa %xmm1, %xmm2 +; SSSE3-NEXT: por %xmm5, %xmm7 +; SSSE3-NEXT: movdqa %xmm2, %xmm4 +; SSSE3-NEXT: pxor %xmm0, %xmm4 +; SSSE3-NEXT: movdqa %xmm4, %xmm5 +; SSSE3-NEXT: pcmpgtd %xmm9, %xmm5 +; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm5[0,0,2,2] +; SSSE3-NEXT: pcmpeqd 
%xmm9, %xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm4[1,1,3,3] +; SSSE3-NEXT: pand %xmm6, %xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm5[1,1,3,3] +; SSSE3-NEXT: por %xmm4, %xmm5 +; SSSE3-NEXT: pand %xmm5, %xmm2 +; SSSE3-NEXT: pandn %xmm8, %xmm5 +; SSSE3-NEXT: por %xmm2, %xmm5 +; SSSE3-NEXT: pxor %xmm1, %xmm0 +; SSSE3-NEXT: movdqa %xmm0, %xmm2 ; SSSE3-NEXT: pcmpgtd %xmm9, %xmm2 -; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm2[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm9, %xmm1 -; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] -; SSSE3-NEXT: pand %xmm6, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm2[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm9, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] +; SSSE3-NEXT: pand %xmm4, %xmm0 ; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] -; SSSE3-NEXT: por %xmm1, %xmm2 -; SSSE3-NEXT: pand %xmm2, %xmm3 +; SSSE3-NEXT: por %xmm0, %xmm2 +; SSSE3-NEXT: pand %xmm2, %xmm1 ; SSSE3-NEXT: pandn %xmm8, %xmm2 -; SSSE3-NEXT: por %xmm3, %xmm2 -; SSSE3-NEXT: pxor %xmm5, %xmm4 -; SSSE3-NEXT: movdqa %xmm4, %xmm1 -; SSSE3-NEXT: pcmpgtd %xmm9, %xmm1 -; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm1[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm9, %xmm4 -; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm4[1,1,3,3] -; SSSE3-NEXT: pand %xmm3, %xmm4 -; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] -; SSSE3-NEXT: por %xmm4, %xmm1 -; SSSE3-NEXT: pand %xmm1, %xmm5 -; SSSE3-NEXT: pandn %xmm8, %xmm1 -; SSSE3-NEXT: por %xmm5, %xmm1 -; SSSE3-NEXT: movdqa {{.*#+}} xmm3 = [255,0,0,0,0,0,0,0,255,0,0,0,0,0,0,0] -; SSSE3-NEXT: pand %xmm3, %xmm1 -; SSSE3-NEXT: pand %xmm3, %xmm2 -; SSSE3-NEXT: packuswb %xmm1, %xmm2 -; SSSE3-NEXT: pand %xmm3, %xmm7 -; SSSE3-NEXT: pand %xmm3, %xmm0 -; SSSE3-NEXT: packuswb %xmm7, %xmm0 -; SSSE3-NEXT: packuswb %xmm2, %xmm0 -; SSSE3-NEXT: packuswb %xmm0, %xmm0 -; SSSE3-NEXT: movq %xmm0, (%rdi) +; SSSE3-NEXT: por %xmm1, %xmm2 +; SSSE3-NEXT: movdqa {{.*#+}} xmm0 = [255,0,0,0,0,0,0,0,255,0,0,0,0,0,0,0] +; SSSE3-NEXT: pand %xmm0, %xmm2 +; SSSE3-NEXT: pand %xmm0, %xmm5 
+; SSSE3-NEXT: packuswb %xmm2, %xmm5 +; SSSE3-NEXT: pand %xmm0, %xmm7 +; SSSE3-NEXT: pand %xmm0, %xmm3 +; SSSE3-NEXT: packuswb %xmm7, %xmm3 +; SSSE3-NEXT: packuswb %xmm5, %xmm3 +; SSSE3-NEXT: packuswb %xmm0, %xmm3 +; SSSE3-NEXT: movq %xmm3, (%rsi) ; SSSE3-NEXT: retq ; ; SSE41-LABEL: trunc_ssat_v8i64_v8i8_store: ; SSE41: # %bb.0: -; SSE41-NEXT: movdqa %xmm0, %xmm8 -; SSE41-NEXT: movapd {{.*#+}} xmm11 = [127,127] -; SSE41-NEXT: movdqa {{.*#+}} xmm5 = [2147483648,2147483648] -; SSE41-NEXT: movdqa %xmm3, %xmm0 -; SSE41-NEXT: pxor %xmm5, %xmm0 -; SSE41-NEXT: movdqa {{.*#+}} xmm6 = [2147483775,2147483775] -; SSE41-NEXT: movdqa %xmm6, %xmm7 +; SSE41-NEXT: movdqa (%rdi), %xmm9 +; SSE41-NEXT: movdqa 16(%rdi), %xmm10 +; SSE41-NEXT: movdqa 32(%rdi), %xmm2 +; SSE41-NEXT: movdqa 48(%rdi), %xmm4 +; SSE41-NEXT: movapd {{.*#+}} xmm3 = [127,127] +; SSE41-NEXT: movdqa {{.*#+}} xmm1 = [2147483648,2147483648] +; SSE41-NEXT: movdqa %xmm4, %xmm0 +; SSE41-NEXT: pxor %xmm1, %xmm0 +; SSE41-NEXT: movdqa {{.*#+}} xmm5 = [2147483775,2147483775] +; SSE41-NEXT: movdqa %xmm5, %xmm7 ; SSE41-NEXT: pcmpeqd %xmm0, %xmm7 -; SSE41-NEXT: movdqa %xmm6, %xmm4 -; SSE41-NEXT: pcmpgtd %xmm0, %xmm4 -; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm4[0,0,2,2] +; SSE41-NEXT: movdqa %xmm5, %xmm6 +; SSE41-NEXT: pcmpgtd %xmm0, %xmm6 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm6[0,0,2,2] ; SSE41-NEXT: pand %xmm7, %xmm0 -; SSE41-NEXT: por %xmm4, %xmm0 -; SSE41-NEXT: movapd %xmm11, %xmm9 -; SSE41-NEXT: blendvpd %xmm0, %xmm3, %xmm9 +; SSE41-NEXT: por %xmm6, %xmm0 +; SSE41-NEXT: movapd %xmm3, %xmm8 +; SSE41-NEXT: blendvpd %xmm0, %xmm4, %xmm8 ; SSE41-NEXT: movdqa %xmm2, %xmm0 -; SSE41-NEXT: pxor %xmm5, %xmm0 -; SSE41-NEXT: movdqa %xmm6, %xmm3 -; SSE41-NEXT: pcmpeqd %xmm0, %xmm3 -; SSE41-NEXT: movdqa %xmm6, %xmm4 -; SSE41-NEXT: pcmpgtd %xmm0, %xmm4 -; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm4[0,0,2,2] -; SSE41-NEXT: pand %xmm3, %xmm0 -; SSE41-NEXT: por %xmm4, %xmm0 -; SSE41-NEXT: movapd %xmm11, %xmm10 -; SSE41-NEXT: blendvpd %xmm0, 
%xmm2, %xmm10 -; SSE41-NEXT: movdqa %xmm1, %xmm0 -; SSE41-NEXT: pxor %xmm5, %xmm0 -; SSE41-NEXT: movdqa %xmm6, %xmm2 +; SSE41-NEXT: pxor %xmm1, %xmm0 +; SSE41-NEXT: movdqa %xmm5, %xmm4 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm4 +; SSE41-NEXT: movdqa %xmm5, %xmm6 +; SSE41-NEXT: pcmpgtd %xmm0, %xmm6 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm6[0,0,2,2] +; SSE41-NEXT: pand %xmm4, %xmm0 +; SSE41-NEXT: por %xmm6, %xmm0 +; SSE41-NEXT: movapd %xmm3, %xmm11 +; SSE41-NEXT: blendvpd %xmm0, %xmm2, %xmm11 +; SSE41-NEXT: movdqa %xmm10, %xmm0 +; SSE41-NEXT: pxor %xmm1, %xmm0 +; SSE41-NEXT: movdqa %xmm5, %xmm2 ; SSE41-NEXT: pcmpeqd %xmm0, %xmm2 -; SSE41-NEXT: movdqa %xmm6, %xmm3 -; SSE41-NEXT: pcmpgtd %xmm0, %xmm3 -; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm3[0,0,2,2] -; SSE41-NEXT: pand %xmm2, %xmm0 -; SSE41-NEXT: por %xmm3, %xmm0 -; SSE41-NEXT: movapd %xmm11, %xmm3 -; SSE41-NEXT: blendvpd %xmm0, %xmm1, %xmm3 -; SSE41-NEXT: movdqa %xmm8, %xmm0 -; SSE41-NEXT: pxor %xmm5, %xmm0 -; SSE41-NEXT: movdqa %xmm6, %xmm1 -; SSE41-NEXT: pcmpeqd %xmm0, %xmm1 +; SSE41-NEXT: movdqa %xmm5, %xmm6 ; SSE41-NEXT: pcmpgtd %xmm0, %xmm6 ; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm6[0,0,2,2] -; SSE41-NEXT: pand %xmm1, %xmm0 +; SSE41-NEXT: pand %xmm2, %xmm0 ; SSE41-NEXT: por %xmm6, %xmm0 -; SSE41-NEXT: blendvpd %xmm0, %xmm8, %xmm11 -; SSE41-NEXT: movapd {{.*#+}} xmm1 = [18446744073709551488,18446744073709551488] -; SSE41-NEXT: movapd %xmm11, %xmm2 -; SSE41-NEXT: xorpd %xmm5, %xmm2 -; SSE41-NEXT: movdqa {{.*#+}} xmm4 = [18446744071562067840,18446744071562067840] -; SSE41-NEXT: movapd %xmm2, %xmm6 -; SSE41-NEXT: pcmpeqd %xmm4, %xmm6 -; SSE41-NEXT: pcmpgtd %xmm4, %xmm2 -; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm2[0,0,2,2] +; SSE41-NEXT: movapd %xmm3, %xmm2 +; SSE41-NEXT: blendvpd %xmm0, %xmm10, %xmm2 +; SSE41-NEXT: movdqa %xmm9, %xmm0 +; SSE41-NEXT: pxor %xmm1, %xmm0 +; SSE41-NEXT: movdqa %xmm5, %xmm6 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm6 +; SSE41-NEXT: pcmpgtd %xmm0, %xmm5 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm5[0,0,2,2] 
; SSE41-NEXT: pand %xmm6, %xmm0 -; SSE41-NEXT: por %xmm2, %xmm0 -; SSE41-NEXT: movapd %xmm1, %xmm2 -; SSE41-NEXT: blendvpd %xmm0, %xmm11, %xmm2 +; SSE41-NEXT: por %xmm5, %xmm0 +; SSE41-NEXT: blendvpd %xmm0, %xmm9, %xmm3 +; SSE41-NEXT: movapd {{.*#+}} xmm5 = [18446744073709551488,18446744073709551488] ; SSE41-NEXT: movapd %xmm3, %xmm6 -; SSE41-NEXT: xorpd %xmm5, %xmm6 -; SSE41-NEXT: movapd %xmm6, %xmm7 -; SSE41-NEXT: pcmpeqd %xmm4, %xmm7 -; SSE41-NEXT: pcmpgtd %xmm4, %xmm6 +; SSE41-NEXT: xorpd %xmm1, %xmm6 +; SSE41-NEXT: movdqa {{.*#+}} xmm7 = [18446744071562067840,18446744071562067840] +; SSE41-NEXT: movapd %xmm6, %xmm4 +; SSE41-NEXT: pcmpeqd %xmm7, %xmm4 +; SSE41-NEXT: pcmpgtd %xmm7, %xmm6 ; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm6[0,0,2,2] -; SSE41-NEXT: pand %xmm7, %xmm0 +; SSE41-NEXT: pand %xmm4, %xmm0 ; SSE41-NEXT: por %xmm6, %xmm0 -; SSE41-NEXT: movapd %xmm1, %xmm7 -; SSE41-NEXT: blendvpd %xmm0, %xmm3, %xmm7 -; SSE41-NEXT: movapd %xmm10, %xmm3 -; SSE41-NEXT: xorpd %xmm5, %xmm3 -; SSE41-NEXT: movapd %xmm3, %xmm6 -; SSE41-NEXT: pcmpeqd %xmm4, %xmm6 -; SSE41-NEXT: pcmpgtd %xmm4, %xmm3 +; SSE41-NEXT: movapd %xmm5, %xmm6 +; SSE41-NEXT: blendvpd %xmm0, %xmm3, %xmm6 +; SSE41-NEXT: movapd %xmm2, %xmm3 +; SSE41-NEXT: xorpd %xmm1, %xmm3 +; SSE41-NEXT: movapd %xmm3, %xmm4 +; SSE41-NEXT: pcmpeqd %xmm7, %xmm4 +; SSE41-NEXT: pcmpgtd %xmm7, %xmm3 ; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm3[0,0,2,2] -; SSE41-NEXT: pand %xmm6, %xmm0 +; SSE41-NEXT: pand %xmm4, %xmm0 ; SSE41-NEXT: por %xmm3, %xmm0 -; SSE41-NEXT: movapd %xmm1, %xmm3 -; SSE41-NEXT: blendvpd %xmm0, %xmm10, %xmm3 -; SSE41-NEXT: xorpd %xmm9, %xmm5 -; SSE41-NEXT: movapd %xmm5, %xmm6 -; SSE41-NEXT: pcmpeqd %xmm4, %xmm6 -; SSE41-NEXT: pcmpgtd %xmm4, %xmm5 -; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm5[0,0,2,2] -; SSE41-NEXT: pand %xmm6, %xmm0 -; SSE41-NEXT: por %xmm5, %xmm0 -; SSE41-NEXT: blendvpd %xmm0, %xmm9, %xmm1 +; SSE41-NEXT: movapd %xmm5, %xmm3 +; SSE41-NEXT: blendvpd %xmm0, %xmm2, %xmm3 +; SSE41-NEXT: movapd %xmm11, 
%xmm2 +; SSE41-NEXT: xorpd %xmm1, %xmm2 +; SSE41-NEXT: movapd %xmm2, %xmm4 +; SSE41-NEXT: pcmpeqd %xmm7, %xmm4 +; SSE41-NEXT: pcmpgtd %xmm7, %xmm2 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm2[0,0,2,2] +; SSE41-NEXT: pand %xmm4, %xmm0 +; SSE41-NEXT: por %xmm2, %xmm0 +; SSE41-NEXT: movapd %xmm5, %xmm2 +; SSE41-NEXT: blendvpd %xmm0, %xmm11, %xmm2 +; SSE41-NEXT: xorpd %xmm8, %xmm1 +; SSE41-NEXT: movapd %xmm1, %xmm4 +; SSE41-NEXT: pcmpeqd %xmm7, %xmm4 +; SSE41-NEXT: pcmpgtd %xmm7, %xmm1 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm1[0,0,2,2] +; SSE41-NEXT: pand %xmm4, %xmm0 +; SSE41-NEXT: por %xmm1, %xmm0 +; SSE41-NEXT: blendvpd %xmm0, %xmm8, %xmm5 ; SSE41-NEXT: movapd {{.*#+}} xmm0 = [255,0,0,0,0,0,0,0,255,0,0,0,0,0,0,0] -; SSE41-NEXT: andpd %xmm0, %xmm1 -; SSE41-NEXT: andpd %xmm0, %xmm3 -; SSE41-NEXT: packusdw %xmm1, %xmm3 -; SSE41-NEXT: andpd %xmm0, %xmm7 +; SSE41-NEXT: andpd %xmm0, %xmm5 ; SSE41-NEXT: andpd %xmm0, %xmm2 -; SSE41-NEXT: packusdw %xmm7, %xmm2 -; SSE41-NEXT: packusdw %xmm3, %xmm2 -; SSE41-NEXT: packuswb %xmm0, %xmm2 -; SSE41-NEXT: movq %xmm2, (%rdi) +; SSE41-NEXT: packusdw %xmm5, %xmm2 +; SSE41-NEXT: andpd %xmm0, %xmm3 +; SSE41-NEXT: andpd %xmm0, %xmm6 +; SSE41-NEXT: packusdw %xmm3, %xmm6 +; SSE41-NEXT: packusdw %xmm2, %xmm6 +; SSE41-NEXT: packuswb %xmm0, %xmm6 +; SSE41-NEXT: movq %xmm6, (%rsi) ; SSE41-NEXT: retq ; ; AVX1-LABEL: trunc_ssat_v8i64_v8i8_store: ; AVX1: # %bb.0: -; AVX1-NEXT: vmovapd {{.*#+}} ymm8 = [127,127,127,127] -; AVX1-NEXT: vextractf128 $1, %ymm1, %xmm3 -; AVX1-NEXT: vmovdqa {{.*#+}} xmm4 = [127,127] -; AVX1-NEXT: vpcmpgtq %xmm3, %xmm4, %xmm5 -; AVX1-NEXT: vpcmpgtq %xmm1, %xmm4, %xmm6 -; AVX1-NEXT: vinsertf128 $1, %xmm5, %ymm6, %ymm7 -; AVX1-NEXT: vblendvpd %ymm7, %ymm1, %ymm8, %ymm9 -; AVX1-NEXT: vextractf128 $1, %ymm0, %xmm2 -; AVX1-NEXT: vpcmpgtq %xmm2, %xmm4, %xmm7 -; AVX1-NEXT: vpcmpgtq %xmm0, %xmm4, %xmm10 -; AVX1-NEXT: vinsertf128 $1, %xmm7, %ymm10, %ymm11 -; AVX1-NEXT: vblendvpd %ymm11, %ymm0, %ymm8, %ymm8 +; AVX1-NEXT: vmovapd 
{{.*#+}} ymm9 = [127,127,127,127] +; AVX1-NEXT: vmovdqa 48(%rdi), %xmm1 +; AVX1-NEXT: vmovdqa {{.*#+}} xmm2 = [127,127] +; AVX1-NEXT: vpcmpgtq %xmm1, %xmm2, %xmm3 +; AVX1-NEXT: vmovdqa (%rdi), %xmm4 +; AVX1-NEXT: vmovdqa 16(%rdi), %xmm5 +; AVX1-NEXT: vmovdqa 32(%rdi), %xmm6 +; AVX1-NEXT: vpcmpgtq %xmm6, %xmm2, %xmm7 +; AVX1-NEXT: vinsertf128 $1, %xmm3, %ymm7, %ymm8 +; AVX1-NEXT: vblendvpd %ymm8, 32(%rdi), %ymm9, %ymm8 +; AVX1-NEXT: vpcmpgtq %xmm5, %xmm2, %xmm0 +; AVX1-NEXT: vpcmpgtq %xmm4, %xmm2, %xmm10 +; AVX1-NEXT: vinsertf128 $1, %xmm0, %ymm10, %ymm11 +; AVX1-NEXT: vblendvpd %ymm11, (%rdi), %ymm9, %ymm9 ; AVX1-NEXT: vmovapd {{.*#+}} ymm11 = [18446744073709551488,18446744073709551488,18446744073709551488,18446744073709551488] -; AVX1-NEXT: vblendvpd %xmm7, %xmm2, %xmm4, %xmm2 -; AVX1-NEXT: vmovdqa {{.*#+}} xmm7 = [18446744073709551488,18446744073709551488] -; AVX1-NEXT: vpcmpgtq %xmm7, %xmm2, %xmm2 -; AVX1-NEXT: vblendvpd %xmm10, %xmm0, %xmm4, %xmm0 -; AVX1-NEXT: vpcmpgtq %xmm7, %xmm0, %xmm0 -; AVX1-NEXT: vinsertf128 $1, %xmm2, %ymm0, %ymm0 -; AVX1-NEXT: vblendvpd %ymm0, %ymm8, %ymm11, %ymm0 -; AVX1-NEXT: vblendvpd %xmm5, %xmm3, %xmm4, %xmm2 -; AVX1-NEXT: vpcmpgtq %xmm7, %xmm2, %xmm2 -; AVX1-NEXT: vblendvpd %xmm6, %xmm1, %xmm4, %xmm1 -; AVX1-NEXT: vpcmpgtq %xmm7, %xmm1, %xmm1 -; AVX1-NEXT: vinsertf128 $1, %xmm2, %ymm1, %ymm1 -; AVX1-NEXT: vblendvpd %ymm1, %ymm9, %ymm11, %ymm1 +; AVX1-NEXT: vblendvpd %xmm0, %xmm5, %xmm2, %xmm0 +; AVX1-NEXT: vmovdqa {{.*#+}} xmm5 = [18446744073709551488,18446744073709551488] +; AVX1-NEXT: vpcmpgtq %xmm5, %xmm0, %xmm0 +; AVX1-NEXT: vblendvpd %xmm10, %xmm4, %xmm2, %xmm4 +; AVX1-NEXT: vpcmpgtq %xmm5, %xmm4, %xmm4 +; AVX1-NEXT: vinsertf128 $1, %xmm0, %ymm4, %ymm0 +; AVX1-NEXT: vblendvpd %ymm0, %ymm9, %ymm11, %ymm0 +; AVX1-NEXT: vblendvpd %xmm3, %xmm1, %xmm2, %xmm1 +; AVX1-NEXT: vpcmpgtq %xmm5, %xmm1, %xmm1 +; AVX1-NEXT: vblendvpd %xmm7, %xmm6, %xmm2, %xmm2 +; AVX1-NEXT: vpcmpgtq %xmm5, %xmm2, %xmm2 +; AVX1-NEXT: vinsertf128 $1, %xmm1, 
%ymm2, %ymm1
+; AVX1-NEXT: vblendvpd %ymm1, %ymm8, %ymm11, %ymm1
 ; AVX1-NEXT: vmovapd {{.*#+}} ymm2 = [255,255,255,255]
 ; AVX1-NEXT: vandpd %ymm2, %ymm1, %ymm1
 ; AVX1-NEXT: vextractf128 $1, %ymm1, %xmm3
@@ -3119,12 +3279,14 @@ define void @trunc_ssat_v8i64_v8i8_store
 ; AVX1-NEXT: vpackusdw %xmm2, %xmm0, %xmm0
 ; AVX1-NEXT: vpackusdw %xmm1, %xmm0, %xmm0
 ; AVX1-NEXT: vpackuswb %xmm0, %xmm0, %xmm0
-; AVX1-NEXT: vmovq %xmm0, (%rdi)
+; AVX1-NEXT: vmovq %xmm0, (%rsi)
 ; AVX1-NEXT: vzeroupper
 ; AVX1-NEXT: retq
 ;
 ; AVX2-LABEL: trunc_ssat_v8i64_v8i8_store:
 ; AVX2: # %bb.0:
+; AVX2-NEXT: vmovdqa (%rdi), %ymm0
+; AVX2-NEXT: vmovdqa 32(%rdi), %ymm1
 ; AVX2-NEXT: vpbroadcastq {{.*#+}} ymm2 = [127,127,127,127]
 ; AVX2-NEXT: vpcmpgtq %ymm1, %ymm2, %ymm3
 ; AVX2-NEXT: vblendvpd %ymm3, %ymm1, %ymm2, %ymm1
@@ -3146,15 +3308,28 @@ define void @trunc_ssat_v8i64_v8i8_store
 ; AVX2-NEXT: vpshufb %xmm3, %xmm0, %xmm0
 ; AVX2-NEXT: vpunpcklwd {{.*#+}} xmm0 = xmm0[0],xmm2[0],xmm0[1],xmm2[1],xmm0[2],xmm2[2],xmm0[3],xmm2[3]
 ; AVX2-NEXT: vpblendd {{.*#+}} xmm0 = xmm0[0],xmm1[1],xmm0[2,3]
-; AVX2-NEXT: vmovq %xmm0, (%rdi)
+; AVX2-NEXT: vmovq %xmm0, (%rsi)
 ; AVX2-NEXT: vzeroupper
 ; AVX2-NEXT: retq
 ;
 ; AVX512-LABEL: trunc_ssat_v8i64_v8i8_store:
 ; AVX512: # %bb.0:
-; AVX512-NEXT: vpmovsqb %zmm0, (%rdi)
+; AVX512-NEXT: vmovdqa64 (%rdi), %zmm0
+; AVX512-NEXT: vpmovsqb %zmm0, (%rsi)
 ; AVX512-NEXT: vzeroupper
 ; AVX512-NEXT: retq
+;
+; SKX-LABEL: trunc_ssat_v8i64_v8i8_store:
+; SKX: # %bb.0:
+; SKX-NEXT: vmovdqa (%rdi), %ymm0
+; SKX-NEXT: vmovdqa 32(%rdi), %ymm1
+; SKX-NEXT: vpmovsqb %ymm1, %xmm1
+; SKX-NEXT: vpmovsqb %ymm0, %xmm0
+; SKX-NEXT: vpunpckldq {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1]
+; SKX-NEXT: vmovq %xmm0, (%rsi)
+; SKX-NEXT: vzeroupper
+; SKX-NEXT: retq
+  %a0 = load <8 x i64>, <8 x i64>* %p0
   %1 = icmp slt <8 x i64> %a0,
   %2 = select <8 x i1> %1, <8 x i64> %a0, <8 x i64>
   %3 = icmp sgt <8 x i64> %2,
@@ -3164,687 +3339,717 @@ define void @trunc_ssat_v8i64_v8i8_store
   ret void
 }
-define <16 x
i8> @trunc_ssat_v16i64_v16i8(<16 x i64> %a0) { +define <16 x i8> @trunc_ssat_v16i64_v16i8(<16 x i64>* %p0) "min-legal-vector-width"="256" { ; SSE2-LABEL: trunc_ssat_v16i64_v16i8: ; SSE2: # %bb.0: -; SSE2-NEXT: movdqa {{.*#+}} xmm10 = [127,127] -; SSE2-NEXT: movdqa {{.*#+}} xmm8 = [2147483648,2147483648] -; SSE2-NEXT: movdqa %xmm6, %xmm9 -; SSE2-NEXT: pxor %xmm8, %xmm9 -; SSE2-NEXT: movdqa {{.*#+}} xmm11 = [2147483775,2147483775] -; SSE2-NEXT: movdqa %xmm11, %xmm12 -; SSE2-NEXT: pcmpgtd %xmm9, %xmm12 -; SSE2-NEXT: pshufd {{.*#+}} xmm13 = xmm12[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm11, %xmm9 -; SSE2-NEXT: pshufd {{.*#+}} xmm14 = xmm9[1,1,3,3] -; SSE2-NEXT: pand %xmm13, %xmm14 -; SSE2-NEXT: pshufd {{.*#+}} xmm9 = xmm12[1,1,3,3] -; SSE2-NEXT: por %xmm14, %xmm9 -; SSE2-NEXT: pand %xmm9, %xmm6 -; SSE2-NEXT: pandn %xmm10, %xmm9 -; SSE2-NEXT: por %xmm6, %xmm9 -; SSE2-NEXT: movdqa %xmm7, %xmm6 -; SSE2-NEXT: pxor %xmm8, %xmm6 -; SSE2-NEXT: movdqa %xmm11, %xmm12 -; SSE2-NEXT: pcmpgtd %xmm6, %xmm12 -; SSE2-NEXT: pshufd {{.*#+}} xmm13 = xmm12[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm11, %xmm6 -; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm6[1,1,3,3] -; SSE2-NEXT: pand %xmm13, %xmm6 -; SSE2-NEXT: pshufd {{.*#+}} xmm12 = xmm12[1,1,3,3] -; SSE2-NEXT: por %xmm6, %xmm12 -; SSE2-NEXT: pand %xmm12, %xmm7 -; SSE2-NEXT: pandn %xmm10, %xmm12 -; SSE2-NEXT: por %xmm7, %xmm12 -; SSE2-NEXT: movdqa %xmm4, %xmm6 -; SSE2-NEXT: pxor %xmm8, %xmm6 -; SSE2-NEXT: movdqa %xmm11, %xmm7 -; SSE2-NEXT: pcmpgtd %xmm6, %xmm7 -; SSE2-NEXT: pshufd {{.*#+}} xmm13 = xmm7[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm11, %xmm6 -; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm6[1,1,3,3] -; SSE2-NEXT: pand %xmm13, %xmm6 -; SSE2-NEXT: pshufd {{.*#+}} xmm13 = xmm7[1,1,3,3] -; SSE2-NEXT: por %xmm6, %xmm13 -; SSE2-NEXT: pand %xmm13, %xmm4 -; SSE2-NEXT: pandn %xmm10, %xmm13 -; SSE2-NEXT: por %xmm4, %xmm13 -; SSE2-NEXT: movdqa %xmm5, %xmm4 -; SSE2-NEXT: pxor %xmm8, %xmm4 -; SSE2-NEXT: movdqa %xmm11, %xmm6 -; SSE2-NEXT: pcmpgtd %xmm4, %xmm6 -; SSE2-NEXT: 
pshufd {{.*#+}} xmm7 = xmm6[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm11, %xmm4 -; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm4[1,1,3,3] -; SSE2-NEXT: pand %xmm7, %xmm4 -; SSE2-NEXT: pshufd {{.*#+}} xmm14 = xmm6[1,1,3,3] -; SSE2-NEXT: por %xmm4, %xmm14 -; SSE2-NEXT: pand %xmm14, %xmm5 -; SSE2-NEXT: pandn %xmm10, %xmm14 -; SSE2-NEXT: por %xmm5, %xmm14 -; SSE2-NEXT: movdqa %xmm2, %xmm4 -; SSE2-NEXT: pxor %xmm8, %xmm4 -; SSE2-NEXT: movdqa %xmm11, %xmm5 -; SSE2-NEXT: pcmpgtd %xmm4, %xmm5 -; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm5[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm11, %xmm4 -; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm4[1,1,3,3] -; SSE2-NEXT: pand %xmm6, %xmm4 +; SSE2-NEXT: movdqa (%rdi), %xmm10 +; SSE2-NEXT: movdqa 16(%rdi), %xmm9 +; SSE2-NEXT: movdqa 32(%rdi), %xmm15 +; SSE2-NEXT: movdqa 48(%rdi), %xmm13 +; SSE2-NEXT: movdqa 80(%rdi), %xmm6 +; SSE2-NEXT: movdqa 64(%rdi), %xmm3 +; SSE2-NEXT: movdqa 112(%rdi), %xmm4 +; SSE2-NEXT: movdqa 96(%rdi), %xmm7 +; SSE2-NEXT: movdqa {{.*#+}} xmm8 = [127,127] +; SSE2-NEXT: movdqa {{.*#+}} xmm1 = [2147483648,2147483648] +; SSE2-NEXT: movdqa %xmm7, %xmm5 +; SSE2-NEXT: pxor %xmm1, %xmm5 +; SSE2-NEXT: movdqa {{.*#+}} xmm14 = [2147483775,2147483775] +; SSE2-NEXT: movdqa %xmm14, %xmm0 +; SSE2-NEXT: pcmpgtd %xmm5, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm0[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm14, %xmm5 ; SSE2-NEXT: pshufd {{.*#+}} xmm5 = xmm5[1,1,3,3] -; SSE2-NEXT: por %xmm4, %xmm5 -; SSE2-NEXT: pand %xmm5, %xmm2 -; SSE2-NEXT: pandn %xmm10, %xmm5 -; SSE2-NEXT: por %xmm2, %xmm5 -; SSE2-NEXT: movdqa %xmm3, %xmm2 -; SSE2-NEXT: pxor %xmm8, %xmm2 -; SSE2-NEXT: movdqa %xmm11, %xmm4 -; SSE2-NEXT: pcmpgtd %xmm2, %xmm4 -; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm4[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm11, %xmm2 -; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] -; SSE2-NEXT: pand %xmm6, %xmm2 -; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm4[1,1,3,3] -; SSE2-NEXT: por %xmm2, %xmm6 -; SSE2-NEXT: pand %xmm6, %xmm3 -; SSE2-NEXT: pandn %xmm10, %xmm6 -; SSE2-NEXT: por %xmm3, %xmm6 
-; SSE2-NEXT: movdqa %xmm0, %xmm2 -; SSE2-NEXT: pxor %xmm8, %xmm2 -; SSE2-NEXT: movdqa %xmm11, %xmm3 -; SSE2-NEXT: pcmpgtd %xmm2, %xmm3 -; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm3[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm11, %xmm2 -; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] -; SSE2-NEXT: pand %xmm4, %xmm2 -; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm3[1,1,3,3] -; SSE2-NEXT: por %xmm2, %xmm3 -; SSE2-NEXT: pand %xmm3, %xmm0 -; SSE2-NEXT: pandn %xmm10, %xmm3 -; SSE2-NEXT: por %xmm0, %xmm3 -; SSE2-NEXT: movdqa %xmm1, %xmm0 -; SSE2-NEXT: pxor %xmm8, %xmm0 -; SSE2-NEXT: movdqa %xmm11, %xmm2 +; SSE2-NEXT: pand %xmm2, %xmm5 +; SSE2-NEXT: pshufd {{.*#+}} xmm11 = xmm0[1,1,3,3] +; SSE2-NEXT: por %xmm5, %xmm11 +; SSE2-NEXT: pand %xmm11, %xmm7 +; SSE2-NEXT: pandn %xmm8, %xmm11 +; SSE2-NEXT: por %xmm7, %xmm11 +; SSE2-NEXT: movdqa %xmm4, %xmm0 +; SSE2-NEXT: pxor %xmm1, %xmm0 +; SSE2-NEXT: movdqa %xmm14, %xmm2 +; SSE2-NEXT: pcmpgtd %xmm0, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm5 = xmm2[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm14, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] +; SSE2-NEXT: pand %xmm5, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm12 = xmm2[1,1,3,3] +; SSE2-NEXT: por %xmm0, %xmm12 +; SSE2-NEXT: pand %xmm12, %xmm4 +; SSE2-NEXT: pandn %xmm8, %xmm12 +; SSE2-NEXT: por %xmm4, %xmm12 +; SSE2-NEXT: movdqa %xmm3, %xmm0 +; SSE2-NEXT: pxor %xmm1, %xmm0 +; SSE2-NEXT: movdqa %xmm14, %xmm2 ; SSE2-NEXT: pcmpgtd %xmm0, %xmm2 ; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm2[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm11, %xmm0 +; SSE2-NEXT: pcmpeqd %xmm14, %xmm0 ; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] ; SSE2-NEXT: pand %xmm4, %xmm0 ; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm2[1,1,3,3] ; SSE2-NEXT: por %xmm0, %xmm4 -; SSE2-NEXT: pand %xmm4, %xmm1 -; SSE2-NEXT: pandn %xmm10, %xmm4 -; SSE2-NEXT: por %xmm1, %xmm4 -; SSE2-NEXT: movdqa {{.*#+}} xmm10 = [18446744073709551488,18446744073709551488] -; SSE2-NEXT: movdqa %xmm4, %xmm0 -; SSE2-NEXT: pxor %xmm8, %xmm0 -; SSE2-NEXT: movdqa {{.*#+}} xmm11 = 
[18446744071562067840,18446744071562067840] -; SSE2-NEXT: movdqa %xmm0, %xmm1 -; SSE2-NEXT: pcmpgtd %xmm11, %xmm1 -; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm1[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm11, %xmm0 +; SSE2-NEXT: pand %xmm4, %xmm3 +; SSE2-NEXT: pandn %xmm8, %xmm4 +; SSE2-NEXT: por %xmm3, %xmm4 +; SSE2-NEXT: movdqa %xmm6, %xmm0 +; SSE2-NEXT: pxor %xmm1, %xmm0 +; SSE2-NEXT: movdqa %xmm14, %xmm2 +; SSE2-NEXT: pcmpgtd %xmm0, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm2[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm14, %xmm0 ; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] -; SSE2-NEXT: pand %xmm2, %xmm0 -; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] -; SSE2-NEXT: por %xmm0, %xmm1 -; SSE2-NEXT: pand %xmm1, %xmm4 -; SSE2-NEXT: pandn %xmm10, %xmm1 -; SSE2-NEXT: por %xmm4, %xmm1 -; SSE2-NEXT: movdqa %xmm3, %xmm0 -; SSE2-NEXT: pxor %xmm8, %xmm0 +; SSE2-NEXT: pand %xmm3, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm5 = xmm2[1,1,3,3] +; SSE2-NEXT: por %xmm0, %xmm5 +; SSE2-NEXT: pand %xmm5, %xmm6 +; SSE2-NEXT: pandn %xmm8, %xmm5 +; SSE2-NEXT: por %xmm6, %xmm5 +; SSE2-NEXT: movdqa %xmm15, %xmm0 +; SSE2-NEXT: pxor %xmm1, %xmm0 +; SSE2-NEXT: movdqa %xmm14, %xmm2 +; SSE2-NEXT: pcmpgtd %xmm0, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm2[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm14, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] +; SSE2-NEXT: pand %xmm3, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm2[1,1,3,3] +; SSE2-NEXT: por %xmm0, %xmm6 +; SSE2-NEXT: pand %xmm6, %xmm15 +; SSE2-NEXT: pandn %xmm8, %xmm6 +; SSE2-NEXT: por %xmm15, %xmm6 +; SSE2-NEXT: movdqa %xmm13, %xmm0 +; SSE2-NEXT: pxor %xmm1, %xmm0 +; SSE2-NEXT: movdqa %xmm14, %xmm2 +; SSE2-NEXT: pcmpgtd %xmm0, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm2[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm14, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] +; SSE2-NEXT: pand %xmm3, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm15 = xmm2[1,1,3,3] +; SSE2-NEXT: por %xmm0, %xmm15 +; SSE2-NEXT: pand %xmm15, %xmm13 +; SSE2-NEXT: pandn %xmm8, 
%xmm15 +; SSE2-NEXT: por %xmm13, %xmm15 +; SSE2-NEXT: movdqa %xmm10, %xmm0 +; SSE2-NEXT: pxor %xmm1, %xmm0 +; SSE2-NEXT: movdqa %xmm14, %xmm3 +; SSE2-NEXT: pcmpgtd %xmm0, %xmm3 +; SSE2-NEXT: pshufd {{.*#+}} xmm7 = xmm3[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm14, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] +; SSE2-NEXT: pand %xmm7, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm13 = xmm3[1,1,3,3] +; SSE2-NEXT: por %xmm0, %xmm13 +; SSE2-NEXT: pand %xmm13, %xmm10 +; SSE2-NEXT: pandn %xmm8, %xmm13 +; SSE2-NEXT: por %xmm10, %xmm13 +; SSE2-NEXT: movdqa %xmm9, %xmm0 +; SSE2-NEXT: pxor %xmm1, %xmm0 +; SSE2-NEXT: movdqa %xmm14, %xmm7 +; SSE2-NEXT: pcmpgtd %xmm0, %xmm7 +; SSE2-NEXT: pshufd {{.*#+}} xmm10 = xmm7[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm14, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] +; SSE2-NEXT: pand %xmm10, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm7 = xmm7[1,1,3,3] +; SSE2-NEXT: por %xmm0, %xmm7 +; SSE2-NEXT: pand %xmm7, %xmm9 +; SSE2-NEXT: pandn %xmm8, %xmm7 +; SSE2-NEXT: por %xmm9, %xmm7 +; SSE2-NEXT: movdqa {{.*#+}} xmm8 = [18446744073709551488,18446744073709551488] +; SSE2-NEXT: movdqa %xmm7, %xmm0 +; SSE2-NEXT: pxor %xmm1, %xmm0 +; SSE2-NEXT: movdqa {{.*#+}} xmm9 = [18446744071562067840,18446744071562067840] ; SSE2-NEXT: movdqa %xmm0, %xmm2 -; SSE2-NEXT: pcmpgtd %xmm11, %xmm2 -; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm2[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm11, %xmm0 -; SSE2-NEXT: pshufd {{.*#+}} xmm7 = xmm0[1,1,3,3] -; SSE2-NEXT: pand %xmm4, %xmm7 -; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm2[1,1,3,3] -; SSE2-NEXT: por %xmm7, %xmm0 -; SSE2-NEXT: pand %xmm0, %xmm3 -; SSE2-NEXT: pandn %xmm10, %xmm0 +; SSE2-NEXT: pcmpgtd %xmm9, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm10 = xmm2[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm9, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] +; SSE2-NEXT: pand %xmm10, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] +; SSE2-NEXT: por %xmm0, %xmm2 +; SSE2-NEXT: pand %xmm2, %xmm7 +; SSE2-NEXT: pandn %xmm8, %xmm2 +; SSE2-NEXT: 
por %xmm7, %xmm2 +; SSE2-NEXT: movdqa %xmm13, %xmm0 +; SSE2-NEXT: pxor %xmm1, %xmm0 +; SSE2-NEXT: movdqa %xmm0, %xmm7 +; SSE2-NEXT: pcmpgtd %xmm9, %xmm7 +; SSE2-NEXT: pshufd {{.*#+}} xmm10 = xmm7[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm9, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm0[1,1,3,3] +; SSE2-NEXT: pand %xmm10, %xmm3 +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm7[1,1,3,3] ; SSE2-NEXT: por %xmm3, %xmm0 -; SSE2-NEXT: packssdw %xmm1, %xmm0 -; SSE2-NEXT: movdqa %xmm6, %xmm1 -; SSE2-NEXT: pxor %xmm8, %xmm1 -; SSE2-NEXT: movdqa %xmm1, %xmm2 -; SSE2-NEXT: pcmpgtd %xmm11, %xmm2 -; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm2[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm11, %xmm1 -; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] -; SSE2-NEXT: pand %xmm3, %xmm1 +; SSE2-NEXT: pand %xmm0, %xmm13 +; SSE2-NEXT: pandn %xmm8, %xmm0 +; SSE2-NEXT: por %xmm13, %xmm0 +; SSE2-NEXT: packssdw %xmm2, %xmm0 +; SSE2-NEXT: movdqa %xmm15, %xmm2 +; SSE2-NEXT: pxor %xmm1, %xmm2 +; SSE2-NEXT: movdqa %xmm2, %xmm3 +; SSE2-NEXT: pcmpgtd %xmm9, %xmm3 +; SSE2-NEXT: pshufd {{.*#+}} xmm7 = xmm3[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm9, %xmm2 ; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] -; SSE2-NEXT: por %xmm1, %xmm2 -; SSE2-NEXT: pand %xmm2, %xmm6 -; SSE2-NEXT: pandn %xmm10, %xmm2 -; SSE2-NEXT: por %xmm6, %xmm2 -; SSE2-NEXT: movdqa %xmm5, %xmm1 -; SSE2-NEXT: pxor %xmm8, %xmm1 -; SSE2-NEXT: movdqa %xmm1, %xmm3 -; SSE2-NEXT: pcmpgtd %xmm11, %xmm3 -; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm3[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm11, %xmm1 -; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] -; SSE2-NEXT: pand %xmm4, %xmm1 +; SSE2-NEXT: pand %xmm7, %xmm2 ; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm3[1,1,3,3] -; SSE2-NEXT: por %xmm1, %xmm3 +; SSE2-NEXT: por %xmm2, %xmm3 +; SSE2-NEXT: pand %xmm3, %xmm15 +; SSE2-NEXT: pandn %xmm8, %xmm3 +; SSE2-NEXT: por %xmm15, %xmm3 +; SSE2-NEXT: movdqa %xmm6, %xmm2 +; SSE2-NEXT: pxor %xmm1, %xmm2 +; SSE2-NEXT: movdqa %xmm2, %xmm7 +; SSE2-NEXT: pcmpgtd %xmm9, %xmm7 +; SSE2-NEXT: pshufd {{.*#+}} xmm10 
= xmm7[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm9, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] +; SSE2-NEXT: pand %xmm10, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm7 = xmm7[1,1,3,3] +; SSE2-NEXT: por %xmm2, %xmm7 +; SSE2-NEXT: pand %xmm7, %xmm6 +; SSE2-NEXT: pandn %xmm8, %xmm7 +; SSE2-NEXT: por %xmm6, %xmm7 +; SSE2-NEXT: packssdw %xmm3, %xmm7 +; SSE2-NEXT: packssdw %xmm7, %xmm0 +; SSE2-NEXT: movdqa %xmm5, %xmm2 +; SSE2-NEXT: pxor %xmm1, %xmm2 +; SSE2-NEXT: movdqa %xmm2, %xmm3 +; SSE2-NEXT: pcmpgtd %xmm9, %xmm3 +; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm3[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm9, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] +; SSE2-NEXT: pand %xmm6, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm3[1,1,3,3] +; SSE2-NEXT: por %xmm2, %xmm3 ; SSE2-NEXT: pand %xmm3, %xmm5 -; SSE2-NEXT: pandn %xmm10, %xmm3 +; SSE2-NEXT: pandn %xmm8, %xmm3 ; SSE2-NEXT: por %xmm5, %xmm3 -; SSE2-NEXT: packssdw %xmm2, %xmm3 -; SSE2-NEXT: packssdw %xmm3, %xmm0 -; SSE2-NEXT: movdqa %xmm14, %xmm1 -; SSE2-NEXT: pxor %xmm8, %xmm1 -; SSE2-NEXT: movdqa %xmm1, %xmm2 -; SSE2-NEXT: pcmpgtd %xmm11, %xmm2 -; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm2[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm11, %xmm1 -; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] -; SSE2-NEXT: pand %xmm3, %xmm1 -; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] -; SSE2-NEXT: por %xmm1, %xmm2 -; SSE2-NEXT: pand %xmm2, %xmm14 -; SSE2-NEXT: pandn %xmm10, %xmm2 -; SSE2-NEXT: por %xmm14, %xmm2 -; SSE2-NEXT: movdqa %xmm13, %xmm1 -; SSE2-NEXT: pxor %xmm8, %xmm1 +; SSE2-NEXT: movdqa %xmm4, %xmm2 +; SSE2-NEXT: pxor %xmm1, %xmm2 +; SSE2-NEXT: movdqa %xmm2, %xmm5 +; SSE2-NEXT: pcmpgtd %xmm9, %xmm5 +; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm5[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm9, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm7 = xmm2[1,1,3,3] +; SSE2-NEXT: pand %xmm6, %xmm7 +; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm5[1,1,3,3] +; SSE2-NEXT: por %xmm7, %xmm2 +; SSE2-NEXT: pand %xmm2, %xmm4 +; SSE2-NEXT: pandn %xmm8, %xmm2 +; SSE2-NEXT: por 
%xmm4, %xmm2 +; SSE2-NEXT: packssdw %xmm3, %xmm2 +; SSE2-NEXT: movdqa %xmm12, %xmm3 +; SSE2-NEXT: pxor %xmm1, %xmm3 +; SSE2-NEXT: movdqa %xmm3, %xmm4 +; SSE2-NEXT: pcmpgtd %xmm9, %xmm4 +; SSE2-NEXT: pshufd {{.*#+}} xmm5 = xmm4[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm9, %xmm3 +; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm3[1,1,3,3] +; SSE2-NEXT: pand %xmm5, %xmm3 +; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm4[1,1,3,3] +; SSE2-NEXT: por %xmm3, %xmm4 +; SSE2-NEXT: pand %xmm4, %xmm12 +; SSE2-NEXT: pandn %xmm8, %xmm4 +; SSE2-NEXT: por %xmm12, %xmm4 +; SSE2-NEXT: pxor %xmm11, %xmm1 ; SSE2-NEXT: movdqa %xmm1, %xmm3 -; SSE2-NEXT: pcmpgtd %xmm11, %xmm3 -; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm3[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm11, %xmm1 +; SSE2-NEXT: pcmpgtd %xmm9, %xmm3 +; SSE2-NEXT: pshufd {{.*#+}} xmm5 = xmm3[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm9, %xmm1 ; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] -; SSE2-NEXT: pand %xmm4, %xmm1 +; SSE2-NEXT: pand %xmm5, %xmm1 ; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm3[1,1,3,3] ; SSE2-NEXT: por %xmm1, %xmm3 -; SSE2-NEXT: pand %xmm3, %xmm13 -; SSE2-NEXT: pandn %xmm10, %xmm3 -; SSE2-NEXT: por %xmm13, %xmm3 -; SSE2-NEXT: packssdw %xmm2, %xmm3 -; SSE2-NEXT: movdqa %xmm12, %xmm1 -; SSE2-NEXT: pxor %xmm8, %xmm1 -; SSE2-NEXT: movdqa %xmm1, %xmm2 -; SSE2-NEXT: pcmpgtd %xmm11, %xmm2 -; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm2[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm11, %xmm1 -; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] -; SSE2-NEXT: pand %xmm4, %xmm1 -; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] -; SSE2-NEXT: por %xmm1, %xmm2 -; SSE2-NEXT: pand %xmm2, %xmm12 -; SSE2-NEXT: pandn %xmm10, %xmm2 -; SSE2-NEXT: por %xmm12, %xmm2 -; SSE2-NEXT: pxor %xmm9, %xmm8 -; SSE2-NEXT: movdqa %xmm8, %xmm1 -; SSE2-NEXT: pcmpgtd %xmm11, %xmm1 -; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm1[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm11, %xmm8 -; SSE2-NEXT: pshufd {{.*#+}} xmm5 = xmm8[1,1,3,3] -; SSE2-NEXT: pand %xmm4, %xmm5 -; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] -; SSE2-NEXT: 
por %xmm5, %xmm1 -; SSE2-NEXT: pand %xmm1, %xmm9 -; SSE2-NEXT: pandn %xmm10, %xmm1 -; SSE2-NEXT: por %xmm9, %xmm1 -; SSE2-NEXT: packssdw %xmm2, %xmm1 -; SSE2-NEXT: packssdw %xmm1, %xmm3 -; SSE2-NEXT: packsswb %xmm3, %xmm0 +; SSE2-NEXT: pand %xmm3, %xmm11 +; SSE2-NEXT: pandn %xmm8, %xmm3 +; SSE2-NEXT: por %xmm11, %xmm3 +; SSE2-NEXT: packssdw %xmm4, %xmm3 +; SSE2-NEXT: packssdw %xmm3, %xmm2 +; SSE2-NEXT: packsswb %xmm2, %xmm0 ; SSE2-NEXT: retq ; ; SSSE3-LABEL: trunc_ssat_v16i64_v16i8: ; SSSE3: # %bb.0: -; SSSE3-NEXT: movdqa {{.*#+}} xmm10 = [127,127] -; SSSE3-NEXT: movdqa {{.*#+}} xmm8 = [2147483648,2147483648] -; SSSE3-NEXT: movdqa %xmm6, %xmm9 -; SSSE3-NEXT: pxor %xmm8, %xmm9 -; SSSE3-NEXT: movdqa {{.*#+}} xmm11 = [2147483775,2147483775] -; SSSE3-NEXT: movdqa %xmm11, %xmm12 -; SSSE3-NEXT: pcmpgtd %xmm9, %xmm12 -; SSSE3-NEXT: pshufd {{.*#+}} xmm13 = xmm12[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm11, %xmm9 -; SSSE3-NEXT: pshufd {{.*#+}} xmm14 = xmm9[1,1,3,3] -; SSSE3-NEXT: pand %xmm13, %xmm14 -; SSSE3-NEXT: pshufd {{.*#+}} xmm9 = xmm12[1,1,3,3] -; SSSE3-NEXT: por %xmm14, %xmm9 -; SSSE3-NEXT: pand %xmm9, %xmm6 -; SSSE3-NEXT: pandn %xmm10, %xmm9 -; SSSE3-NEXT: por %xmm6, %xmm9 -; SSSE3-NEXT: movdqa %xmm7, %xmm6 -; SSSE3-NEXT: pxor %xmm8, %xmm6 -; SSSE3-NEXT: movdqa %xmm11, %xmm12 -; SSSE3-NEXT: pcmpgtd %xmm6, %xmm12 -; SSSE3-NEXT: pshufd {{.*#+}} xmm13 = xmm12[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm11, %xmm6 -; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm6[1,1,3,3] -; SSSE3-NEXT: pand %xmm13, %xmm6 -; SSSE3-NEXT: pshufd {{.*#+}} xmm12 = xmm12[1,1,3,3] -; SSSE3-NEXT: por %xmm6, %xmm12 -; SSSE3-NEXT: pand %xmm12, %xmm7 -; SSSE3-NEXT: pandn %xmm10, %xmm12 -; SSSE3-NEXT: por %xmm7, %xmm12 -; SSSE3-NEXT: movdqa %xmm4, %xmm6 -; SSSE3-NEXT: pxor %xmm8, %xmm6 -; SSSE3-NEXT: movdqa %xmm11, %xmm7 -; SSSE3-NEXT: pcmpgtd %xmm6, %xmm7 -; SSSE3-NEXT: pshufd {{.*#+}} xmm13 = xmm7[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm11, %xmm6 -; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm6[1,1,3,3] -; SSSE3-NEXT: pand 
%xmm13, %xmm6 -; SSSE3-NEXT: pshufd {{.*#+}} xmm13 = xmm7[1,1,3,3] -; SSSE3-NEXT: por %xmm6, %xmm13 -; SSSE3-NEXT: pand %xmm13, %xmm4 -; SSSE3-NEXT: pandn %xmm10, %xmm13 -; SSSE3-NEXT: por %xmm4, %xmm13 -; SSSE3-NEXT: movdqa %xmm5, %xmm4 -; SSSE3-NEXT: pxor %xmm8, %xmm4 -; SSSE3-NEXT: movdqa %xmm11, %xmm6 -; SSSE3-NEXT: pcmpgtd %xmm4, %xmm6 -; SSSE3-NEXT: pshufd {{.*#+}} xmm7 = xmm6[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm11, %xmm4 -; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm4[1,1,3,3] -; SSSE3-NEXT: pand %xmm7, %xmm4 -; SSSE3-NEXT: pshufd {{.*#+}} xmm14 = xmm6[1,1,3,3] -; SSSE3-NEXT: por %xmm4, %xmm14 -; SSSE3-NEXT: pand %xmm14, %xmm5 -; SSSE3-NEXT: pandn %xmm10, %xmm14 -; SSSE3-NEXT: por %xmm5, %xmm14 -; SSSE3-NEXT: movdqa %xmm2, %xmm4 -; SSSE3-NEXT: pxor %xmm8, %xmm4 -; SSSE3-NEXT: movdqa %xmm11, %xmm5 -; SSSE3-NEXT: pcmpgtd %xmm4, %xmm5 -; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm5[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm11, %xmm4 -; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm4[1,1,3,3] -; SSSE3-NEXT: pand %xmm6, %xmm4 +; SSSE3-NEXT: movdqa (%rdi), %xmm10 +; SSSE3-NEXT: movdqa 16(%rdi), %xmm9 +; SSSE3-NEXT: movdqa 32(%rdi), %xmm15 +; SSSE3-NEXT: movdqa 48(%rdi), %xmm13 +; SSSE3-NEXT: movdqa 80(%rdi), %xmm6 +; SSSE3-NEXT: movdqa 64(%rdi), %xmm3 +; SSSE3-NEXT: movdqa 112(%rdi), %xmm4 +; SSSE3-NEXT: movdqa 96(%rdi), %xmm7 +; SSSE3-NEXT: movdqa {{.*#+}} xmm8 = [127,127] +; SSSE3-NEXT: movdqa {{.*#+}} xmm1 = [2147483648,2147483648] +; SSSE3-NEXT: movdqa %xmm7, %xmm5 +; SSSE3-NEXT: pxor %xmm1, %xmm5 +; SSSE3-NEXT: movdqa {{.*#+}} xmm14 = [2147483775,2147483775] +; SSSE3-NEXT: movdqa %xmm14, %xmm0 +; SSSE3-NEXT: pcmpgtd %xmm5, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm0[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm14, %xmm5 ; SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm5[1,1,3,3] -; SSSE3-NEXT: por %xmm4, %xmm5 -; SSSE3-NEXT: pand %xmm5, %xmm2 -; SSSE3-NEXT: pandn %xmm10, %xmm5 -; SSSE3-NEXT: por %xmm2, %xmm5 -; SSSE3-NEXT: movdqa %xmm3, %xmm2 -; SSSE3-NEXT: pxor %xmm8, %xmm2 -; SSSE3-NEXT: 
movdqa %xmm11, %xmm4 -; SSSE3-NEXT: pcmpgtd %xmm2, %xmm4 -; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm4[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm11, %xmm2 -; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] -; SSSE3-NEXT: pand %xmm6, %xmm2 -; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm4[1,1,3,3] -; SSSE3-NEXT: por %xmm2, %xmm6 -; SSSE3-NEXT: pand %xmm6, %xmm3 -; SSSE3-NEXT: pandn %xmm10, %xmm6 -; SSSE3-NEXT: por %xmm3, %xmm6 -; SSSE3-NEXT: movdqa %xmm0, %xmm2 -; SSSE3-NEXT: pxor %xmm8, %xmm2 -; SSSE3-NEXT: movdqa %xmm11, %xmm3 -; SSSE3-NEXT: pcmpgtd %xmm2, %xmm3 -; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm3[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm11, %xmm2 -; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] -; SSSE3-NEXT: pand %xmm4, %xmm2 -; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm3[1,1,3,3] -; SSSE3-NEXT: por %xmm2, %xmm3 -; SSSE3-NEXT: pand %xmm3, %xmm0 -; SSSE3-NEXT: pandn %xmm10, %xmm3 -; SSSE3-NEXT: por %xmm0, %xmm3 -; SSSE3-NEXT: movdqa %xmm1, %xmm0 -; SSSE3-NEXT: pxor %xmm8, %xmm0 -; SSSE3-NEXT: movdqa %xmm11, %xmm2 +; SSSE3-NEXT: pand %xmm2, %xmm5 +; SSSE3-NEXT: pshufd {{.*#+}} xmm11 = xmm0[1,1,3,3] +; SSSE3-NEXT: por %xmm5, %xmm11 +; SSSE3-NEXT: pand %xmm11, %xmm7 +; SSSE3-NEXT: pandn %xmm8, %xmm11 +; SSSE3-NEXT: por %xmm7, %xmm11 +; SSSE3-NEXT: movdqa %xmm4, %xmm0 +; SSSE3-NEXT: pxor %xmm1, %xmm0 +; SSSE3-NEXT: movdqa %xmm14, %xmm2 +; SSSE3-NEXT: pcmpgtd %xmm0, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm2[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm14, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] +; SSSE3-NEXT: pand %xmm5, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm12 = xmm2[1,1,3,3] +; SSSE3-NEXT: por %xmm0, %xmm12 +; SSSE3-NEXT: pand %xmm12, %xmm4 +; SSSE3-NEXT: pandn %xmm8, %xmm12 +; SSSE3-NEXT: por %xmm4, %xmm12 +; SSSE3-NEXT: movdqa %xmm3, %xmm0 +; SSSE3-NEXT: pxor %xmm1, %xmm0 +; SSSE3-NEXT: movdqa %xmm14, %xmm2 ; SSSE3-NEXT: pcmpgtd %xmm0, %xmm2 ; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm2[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm11, %xmm0 +; SSSE3-NEXT: pcmpeqd %xmm14, 
%xmm0 ; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] ; SSSE3-NEXT: pand %xmm4, %xmm0 ; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm2[1,1,3,3] ; SSSE3-NEXT: por %xmm0, %xmm4 -; SSSE3-NEXT: pand %xmm4, %xmm1 -; SSSE3-NEXT: pandn %xmm10, %xmm4 -; SSSE3-NEXT: por %xmm1, %xmm4 -; SSSE3-NEXT: movdqa {{.*#+}} xmm10 = [18446744073709551488,18446744073709551488] -; SSSE3-NEXT: movdqa %xmm4, %xmm0 -; SSSE3-NEXT: pxor %xmm8, %xmm0 -; SSSE3-NEXT: movdqa {{.*#+}} xmm11 = [18446744071562067840,18446744071562067840] -; SSSE3-NEXT: movdqa %xmm0, %xmm1 -; SSSE3-NEXT: pcmpgtd %xmm11, %xmm1 -; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm1[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm11, %xmm0 +; SSSE3-NEXT: pand %xmm4, %xmm3 +; SSSE3-NEXT: pandn %xmm8, %xmm4 +; SSSE3-NEXT: por %xmm3, %xmm4 +; SSSE3-NEXT: movdqa %xmm6, %xmm0 +; SSSE3-NEXT: pxor %xmm1, %xmm0 +; SSSE3-NEXT: movdqa %xmm14, %xmm2 +; SSSE3-NEXT: pcmpgtd %xmm0, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm2[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm14, %xmm0 ; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] -; SSSE3-NEXT: pand %xmm2, %xmm0 -; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] -; SSSE3-NEXT: por %xmm0, %xmm1 -; SSSE3-NEXT: pand %xmm1, %xmm4 -; SSSE3-NEXT: pandn %xmm10, %xmm1 -; SSSE3-NEXT: por %xmm4, %xmm1 -; SSSE3-NEXT: movdqa %xmm3, %xmm0 -; SSSE3-NEXT: pxor %xmm8, %xmm0 +; SSSE3-NEXT: pand %xmm3, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm2[1,1,3,3] +; SSSE3-NEXT: por %xmm0, %xmm5 +; SSSE3-NEXT: pand %xmm5, %xmm6 +; SSSE3-NEXT: pandn %xmm8, %xmm5 +; SSSE3-NEXT: por %xmm6, %xmm5 +; SSSE3-NEXT: movdqa %xmm15, %xmm0 +; SSSE3-NEXT: pxor %xmm1, %xmm0 +; SSSE3-NEXT: movdqa %xmm14, %xmm2 +; SSSE3-NEXT: pcmpgtd %xmm0, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm2[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm14, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] +; SSSE3-NEXT: pand %xmm3, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm2[1,1,3,3] +; SSSE3-NEXT: por %xmm0, %xmm6 +; SSSE3-NEXT: pand %xmm6, %xmm15 +; SSSE3-NEXT: pandn 
%xmm8, %xmm6 +; SSSE3-NEXT: por %xmm15, %xmm6 +; SSSE3-NEXT: movdqa %xmm13, %xmm0 +; SSSE3-NEXT: pxor %xmm1, %xmm0 +; SSSE3-NEXT: movdqa %xmm14, %xmm2 +; SSSE3-NEXT: pcmpgtd %xmm0, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm2[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm14, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] +; SSSE3-NEXT: pand %xmm3, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm15 = xmm2[1,1,3,3] +; SSSE3-NEXT: por %xmm0, %xmm15 +; SSSE3-NEXT: pand %xmm15, %xmm13 +; SSSE3-NEXT: pandn %xmm8, %xmm15 +; SSSE3-NEXT: por %xmm13, %xmm15 +; SSSE3-NEXT: movdqa %xmm10, %xmm0 +; SSSE3-NEXT: pxor %xmm1, %xmm0 +; SSSE3-NEXT: movdqa %xmm14, %xmm3 +; SSSE3-NEXT: pcmpgtd %xmm0, %xmm3 +; SSSE3-NEXT: pshufd {{.*#+}} xmm7 = xmm3[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm14, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] +; SSSE3-NEXT: pand %xmm7, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm13 = xmm3[1,1,3,3] +; SSSE3-NEXT: por %xmm0, %xmm13 +; SSSE3-NEXT: pand %xmm13, %xmm10 +; SSSE3-NEXT: pandn %xmm8, %xmm13 +; SSSE3-NEXT: por %xmm10, %xmm13 +; SSSE3-NEXT: movdqa %xmm9, %xmm0 +; SSSE3-NEXT: pxor %xmm1, %xmm0 +; SSSE3-NEXT: movdqa %xmm14, %xmm7 +; SSSE3-NEXT: pcmpgtd %xmm0, %xmm7 +; SSSE3-NEXT: pshufd {{.*#+}} xmm10 = xmm7[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm14, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] +; SSSE3-NEXT: pand %xmm10, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm7 = xmm7[1,1,3,3] +; SSSE3-NEXT: por %xmm0, %xmm7 +; SSSE3-NEXT: pand %xmm7, %xmm9 +; SSSE3-NEXT: pandn %xmm8, %xmm7 +; SSSE3-NEXT: por %xmm9, %xmm7 +; SSSE3-NEXT: movdqa {{.*#+}} xmm8 = [18446744073709551488,18446744073709551488] +; SSSE3-NEXT: movdqa %xmm7, %xmm0 +; SSSE3-NEXT: pxor %xmm1, %xmm0 +; SSSE3-NEXT: movdqa {{.*#+}} xmm9 = [18446744071562067840,18446744071562067840] ; SSSE3-NEXT: movdqa %xmm0, %xmm2 -; SSSE3-NEXT: pcmpgtd %xmm11, %xmm2 -; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm2[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm11, %xmm0 -; SSSE3-NEXT: pshufd {{.*#+}} xmm7 = xmm0[1,1,3,3] 
-; SSSE3-NEXT: pand %xmm4, %xmm7 -; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm2[1,1,3,3] -; SSSE3-NEXT: por %xmm7, %xmm0 -; SSSE3-NEXT: pand %xmm0, %xmm3 -; SSSE3-NEXT: pandn %xmm10, %xmm0 +; SSSE3-NEXT: pcmpgtd %xmm9, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm10 = xmm2[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm9, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] +; SSSE3-NEXT: pand %xmm10, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] +; SSSE3-NEXT: por %xmm0, %xmm2 +; SSSE3-NEXT: pand %xmm2, %xmm7 +; SSSE3-NEXT: pandn %xmm8, %xmm2 +; SSSE3-NEXT: por %xmm7, %xmm2 +; SSSE3-NEXT: movdqa %xmm13, %xmm0 +; SSSE3-NEXT: pxor %xmm1, %xmm0 +; SSSE3-NEXT: movdqa %xmm0, %xmm7 +; SSSE3-NEXT: pcmpgtd %xmm9, %xmm7 +; SSSE3-NEXT: pshufd {{.*#+}} xmm10 = xmm7[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm9, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm0[1,1,3,3] +; SSSE3-NEXT: pand %xmm10, %xmm3 +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm7[1,1,3,3] ; SSSE3-NEXT: por %xmm3, %xmm0 -; SSSE3-NEXT: packssdw %xmm1, %xmm0 -; SSSE3-NEXT: movdqa %xmm6, %xmm1 -; SSSE3-NEXT: pxor %xmm8, %xmm1 -; SSSE3-NEXT: movdqa %xmm1, %xmm2 -; SSSE3-NEXT: pcmpgtd %xmm11, %xmm2 -; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm2[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm11, %xmm1 -; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] -; SSSE3-NEXT: pand %xmm3, %xmm1 +; SSSE3-NEXT: pand %xmm0, %xmm13 +; SSSE3-NEXT: pandn %xmm8, %xmm0 +; SSSE3-NEXT: por %xmm13, %xmm0 +; SSSE3-NEXT: packssdw %xmm2, %xmm0 +; SSSE3-NEXT: movdqa %xmm15, %xmm2 +; SSSE3-NEXT: pxor %xmm1, %xmm2 +; SSSE3-NEXT: movdqa %xmm2, %xmm3 +; SSSE3-NEXT: pcmpgtd %xmm9, %xmm3 +; SSSE3-NEXT: pshufd {{.*#+}} xmm7 = xmm3[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm9, %xmm2 ; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] -; SSSE3-NEXT: por %xmm1, %xmm2 -; SSSE3-NEXT: pand %xmm2, %xmm6 -; SSSE3-NEXT: pandn %xmm10, %xmm2 -; SSSE3-NEXT: por %xmm6, %xmm2 -; SSSE3-NEXT: movdqa %xmm5, %xmm1 -; SSSE3-NEXT: pxor %xmm8, %xmm1 -; SSSE3-NEXT: movdqa %xmm1, %xmm3 -; SSSE3-NEXT: 
pcmpgtd %xmm11, %xmm3 -; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm3[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm11, %xmm1 -; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] -; SSSE3-NEXT: pand %xmm4, %xmm1 +; SSSE3-NEXT: pand %xmm7, %xmm2 ; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm3[1,1,3,3] -; SSSE3-NEXT: por %xmm1, %xmm3 +; SSSE3-NEXT: por %xmm2, %xmm3 +; SSSE3-NEXT: pand %xmm3, %xmm15 +; SSSE3-NEXT: pandn %xmm8, %xmm3 +; SSSE3-NEXT: por %xmm15, %xmm3 +; SSSE3-NEXT: movdqa %xmm6, %xmm2 +; SSSE3-NEXT: pxor %xmm1, %xmm2 +; SSSE3-NEXT: movdqa %xmm2, %xmm7 +; SSSE3-NEXT: pcmpgtd %xmm9, %xmm7 +; SSSE3-NEXT: pshufd {{.*#+}} xmm10 = xmm7[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm9, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] +; SSSE3-NEXT: pand %xmm10, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm7 = xmm7[1,1,3,3] +; SSSE3-NEXT: por %xmm2, %xmm7 +; SSSE3-NEXT: pand %xmm7, %xmm6 +; SSSE3-NEXT: pandn %xmm8, %xmm7 +; SSSE3-NEXT: por %xmm6, %xmm7 +; SSSE3-NEXT: packssdw %xmm3, %xmm7 +; SSSE3-NEXT: packssdw %xmm7, %xmm0 +; SSSE3-NEXT: movdqa %xmm5, %xmm2 +; SSSE3-NEXT: pxor %xmm1, %xmm2 +; SSSE3-NEXT: movdqa %xmm2, %xmm3 +; SSSE3-NEXT: pcmpgtd %xmm9, %xmm3 +; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm3[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm9, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] +; SSSE3-NEXT: pand %xmm6, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm3[1,1,3,3] +; SSSE3-NEXT: por %xmm2, %xmm3 ; SSSE3-NEXT: pand %xmm3, %xmm5 -; SSSE3-NEXT: pandn %xmm10, %xmm3 +; SSSE3-NEXT: pandn %xmm8, %xmm3 ; SSSE3-NEXT: por %xmm5, %xmm3 -; SSSE3-NEXT: packssdw %xmm2, %xmm3 -; SSSE3-NEXT: packssdw %xmm3, %xmm0 -; SSSE3-NEXT: movdqa %xmm14, %xmm1 -; SSSE3-NEXT: pxor %xmm8, %xmm1 -; SSSE3-NEXT: movdqa %xmm1, %xmm2 -; SSSE3-NEXT: pcmpgtd %xmm11, %xmm2 -; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm2[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm11, %xmm1 -; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] -; SSSE3-NEXT: pand %xmm3, %xmm1 -; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] -; SSSE3-NEXT: 
por %xmm1, %xmm2 -; SSSE3-NEXT: pand %xmm2, %xmm14 -; SSSE3-NEXT: pandn %xmm10, %xmm2 -; SSSE3-NEXT: por %xmm14, %xmm2 -; SSSE3-NEXT: movdqa %xmm13, %xmm1 -; SSSE3-NEXT: pxor %xmm8, %xmm1 +; SSSE3-NEXT: movdqa %xmm4, %xmm2 +; SSSE3-NEXT: pxor %xmm1, %xmm2 +; SSSE3-NEXT: movdqa %xmm2, %xmm5 +; SSSE3-NEXT: pcmpgtd %xmm9, %xmm5 +; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm5[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm9, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm7 = xmm2[1,1,3,3] +; SSSE3-NEXT: pand %xmm6, %xmm7 +; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm5[1,1,3,3] +; SSSE3-NEXT: por %xmm7, %xmm2 +; SSSE3-NEXT: pand %xmm2, %xmm4 +; SSSE3-NEXT: pandn %xmm8, %xmm2 +; SSSE3-NEXT: por %xmm4, %xmm2 +; SSSE3-NEXT: packssdw %xmm3, %xmm2 +; SSSE3-NEXT: movdqa %xmm12, %xmm3 +; SSSE3-NEXT: pxor %xmm1, %xmm3 +; SSSE3-NEXT: movdqa %xmm3, %xmm4 +; SSSE3-NEXT: pcmpgtd %xmm9, %xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm4[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm9, %xmm3 +; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm3[1,1,3,3] +; SSSE3-NEXT: pand %xmm5, %xmm3 +; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm4[1,1,3,3] +; SSSE3-NEXT: por %xmm3, %xmm4 +; SSSE3-NEXT: pand %xmm4, %xmm12 +; SSSE3-NEXT: pandn %xmm8, %xmm4 +; SSSE3-NEXT: por %xmm12, %xmm4 +; SSSE3-NEXT: pxor %xmm11, %xmm1 ; SSSE3-NEXT: movdqa %xmm1, %xmm3 -; SSSE3-NEXT: pcmpgtd %xmm11, %xmm3 -; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm3[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm11, %xmm1 +; SSSE3-NEXT: pcmpgtd %xmm9, %xmm3 +; SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm3[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm9, %xmm1 ; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] -; SSSE3-NEXT: pand %xmm4, %xmm1 +; SSSE3-NEXT: pand %xmm5, %xmm1 ; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm3[1,1,3,3] ; SSSE3-NEXT: por %xmm1, %xmm3 -; SSSE3-NEXT: pand %xmm3, %xmm13 -; SSSE3-NEXT: pandn %xmm10, %xmm3 -; SSSE3-NEXT: por %xmm13, %xmm3 -; SSSE3-NEXT: packssdw %xmm2, %xmm3 -; SSSE3-NEXT: movdqa %xmm12, %xmm1 -; SSSE3-NEXT: pxor %xmm8, %xmm1 -; SSSE3-NEXT: movdqa %xmm1, %xmm2 -; SSSE3-NEXT: 
pcmpgtd %xmm11, %xmm2 -; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm2[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm11, %xmm1 -; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] -; SSSE3-NEXT: pand %xmm4, %xmm1 -; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] -; SSSE3-NEXT: por %xmm1, %xmm2 -; SSSE3-NEXT: pand %xmm2, %xmm12 -; SSSE3-NEXT: pandn %xmm10, %xmm2 -; SSSE3-NEXT: por %xmm12, %xmm2 -; SSSE3-NEXT: pxor %xmm9, %xmm8 -; SSSE3-NEXT: movdqa %xmm8, %xmm1 -; SSSE3-NEXT: pcmpgtd %xmm11, %xmm1 -; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm1[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm11, %xmm8 -; SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm8[1,1,3,3] -; SSSE3-NEXT: pand %xmm4, %xmm5 -; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] -; SSSE3-NEXT: por %xmm5, %xmm1 -; SSSE3-NEXT: pand %xmm1, %xmm9 -; SSSE3-NEXT: pandn %xmm10, %xmm1 -; SSSE3-NEXT: por %xmm9, %xmm1 -; SSSE3-NEXT: packssdw %xmm2, %xmm1 -; SSSE3-NEXT: packssdw %xmm1, %xmm3 -; SSSE3-NEXT: packsswb %xmm3, %xmm0 +; SSSE3-NEXT: pand %xmm3, %xmm11 +; SSSE3-NEXT: pandn %xmm8, %xmm3 +; SSSE3-NEXT: por %xmm11, %xmm3 +; SSSE3-NEXT: packssdw %xmm4, %xmm3 +; SSSE3-NEXT: packssdw %xmm3, %xmm2 +; SSSE3-NEXT: packsswb %xmm2, %xmm0 ; SSSE3-NEXT: retq ; ; SSE41-LABEL: trunc_ssat_v16i64_v16i8: ; SSE41: # %bb.0: -; SSE41-NEXT: movdqa %xmm0, %xmm8 -; SSE41-NEXT: movapd {{.*#+}} xmm11 = [127,127] -; SSE41-NEXT: movdqa {{.*#+}} xmm9 = [2147483648,2147483648] -; SSE41-NEXT: movdqa %xmm6, %xmm0 -; SSE41-NEXT: pxor %xmm9, %xmm0 -; SSE41-NEXT: movdqa {{.*#+}} xmm12 = [2147483775,2147483775] -; SSE41-NEXT: movdqa %xmm12, %xmm10 -; SSE41-NEXT: pcmpeqd %xmm0, %xmm10 -; SSE41-NEXT: movdqa %xmm12, %xmm13 -; SSE41-NEXT: pcmpgtd %xmm0, %xmm13 -; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm13[0,0,2,2] -; SSE41-NEXT: pand %xmm10, %xmm0 -; SSE41-NEXT: por %xmm13, %xmm0 -; SSE41-NEXT: movapd %xmm11, %xmm10 -; SSE41-NEXT: blendvpd %xmm0, %xmm6, %xmm10 -; SSE41-NEXT: movdqa %xmm7, %xmm0 -; SSE41-NEXT: pxor %xmm9, %xmm0 -; SSE41-NEXT: movdqa %xmm12, %xmm13 -; SSE41-NEXT: 
pcmpeqd %xmm0, %xmm13 -; SSE41-NEXT: movdqa %xmm12, %xmm6 -; SSE41-NEXT: pcmpgtd %xmm0, %xmm6 -; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm6[0,0,2,2] -; SSE41-NEXT: pand %xmm13, %xmm0 -; SSE41-NEXT: por %xmm6, %xmm0 -; SSE41-NEXT: movapd %xmm11, %xmm13 -; SSE41-NEXT: blendvpd %xmm0, %xmm7, %xmm13 -; SSE41-NEXT: movdqa %xmm4, %xmm0 -; SSE41-NEXT: pxor %xmm9, %xmm0 -; SSE41-NEXT: movdqa %xmm12, %xmm6 -; SSE41-NEXT: pcmpeqd %xmm0, %xmm6 -; SSE41-NEXT: movdqa %xmm12, %xmm7 -; SSE41-NEXT: pcmpgtd %xmm0, %xmm7 -; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm7[0,0,2,2] -; SSE41-NEXT: pand %xmm6, %xmm0 -; SSE41-NEXT: por %xmm7, %xmm0 -; SSE41-NEXT: movapd %xmm11, %xmm14 -; SSE41-NEXT: blendvpd %xmm0, %xmm4, %xmm14 -; SSE41-NEXT: movdqa %xmm5, %xmm0 -; SSE41-NEXT: pxor %xmm9, %xmm0 -; SSE41-NEXT: movdqa %xmm12, %xmm4 -; SSE41-NEXT: pcmpeqd %xmm0, %xmm4 -; SSE41-NEXT: movdqa %xmm12, %xmm6 -; SSE41-NEXT: pcmpgtd %xmm0, %xmm6 -; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm6[0,0,2,2] -; SSE41-NEXT: pand %xmm4, %xmm0 -; SSE41-NEXT: por %xmm6, %xmm0 -; SSE41-NEXT: movapd %xmm11, %xmm15 -; SSE41-NEXT: blendvpd %xmm0, %xmm5, %xmm15 -; SSE41-NEXT: movdqa %xmm2, %xmm0 -; SSE41-NEXT: pxor %xmm9, %xmm0 -; SSE41-NEXT: movdqa %xmm12, %xmm5 +; SSE41-NEXT: movdqa (%rdi), %xmm11 +; SSE41-NEXT: movdqa 16(%rdi), %xmm9 +; SSE41-NEXT: movdqa 32(%rdi), %xmm15 +; SSE41-NEXT: movdqa 48(%rdi), %xmm12 +; SSE41-NEXT: movdqa 80(%rdi), %xmm4 +; SSE41-NEXT: movdqa 64(%rdi), %xmm14 +; SSE41-NEXT: movdqa 112(%rdi), %xmm13 +; SSE41-NEXT: movdqa 96(%rdi), %xmm3 +; SSE41-NEXT: movapd {{.*#+}} xmm1 = [127,127] +; SSE41-NEXT: movdqa {{.*#+}} xmm2 = [2147483648,2147483648] +; SSE41-NEXT: movdqa %xmm3, %xmm0 +; SSE41-NEXT: pxor %xmm2, %xmm0 +; SSE41-NEXT: movdqa {{.*#+}} xmm7 = [2147483775,2147483775] +; SSE41-NEXT: movdqa %xmm7, %xmm5 ; SSE41-NEXT: pcmpeqd %xmm0, %xmm5 -; SSE41-NEXT: movdqa %xmm12, %xmm6 +; SSE41-NEXT: movdqa %xmm7, %xmm6 ; SSE41-NEXT: pcmpgtd %xmm0, %xmm6 ; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm6[0,0,2,2] ; 
SSE41-NEXT: pand %xmm5, %xmm0 ; SSE41-NEXT: por %xmm6, %xmm0 -; SSE41-NEXT: movapd %xmm11, %xmm5 -; SSE41-NEXT: blendvpd %xmm0, %xmm2, %xmm5 -; SSE41-NEXT: movdqa %xmm3, %xmm0 -; SSE41-NEXT: pxor %xmm9, %xmm0 -; SSE41-NEXT: movdqa %xmm12, %xmm2 -; SSE41-NEXT: pcmpeqd %xmm0, %xmm2 -; SSE41-NEXT: movdqa %xmm12, %xmm6 +; SSE41-NEXT: movapd %xmm1, %xmm8 +; SSE41-NEXT: blendvpd %xmm0, %xmm3, %xmm8 +; SSE41-NEXT: movdqa %xmm13, %xmm0 +; SSE41-NEXT: pxor %xmm2, %xmm0 +; SSE41-NEXT: movdqa %xmm7, %xmm3 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm3 +; SSE41-NEXT: movdqa %xmm7, %xmm5 +; SSE41-NEXT: pcmpgtd %xmm0, %xmm5 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm5[0,0,2,2] +; SSE41-NEXT: pand %xmm3, %xmm0 +; SSE41-NEXT: por %xmm5, %xmm0 +; SSE41-NEXT: movapd %xmm1, %xmm10 +; SSE41-NEXT: blendvpd %xmm0, %xmm13, %xmm10 +; SSE41-NEXT: movdqa %xmm14, %xmm0 +; SSE41-NEXT: pxor %xmm2, %xmm0 +; SSE41-NEXT: movdqa %xmm7, %xmm3 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm3 +; SSE41-NEXT: movdqa %xmm7, %xmm5 +; SSE41-NEXT: pcmpgtd %xmm0, %xmm5 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm5[0,0,2,2] +; SSE41-NEXT: pand %xmm3, %xmm0 +; SSE41-NEXT: por %xmm5, %xmm0 +; SSE41-NEXT: movapd %xmm1, %xmm13 +; SSE41-NEXT: blendvpd %xmm0, %xmm14, %xmm13 +; SSE41-NEXT: movdqa %xmm4, %xmm0 +; SSE41-NEXT: pxor %xmm2, %xmm0 +; SSE41-NEXT: movdqa %xmm7, %xmm3 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm3 +; SSE41-NEXT: movdqa %xmm7, %xmm5 +; SSE41-NEXT: pcmpgtd %xmm0, %xmm5 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm5[0,0,2,2] +; SSE41-NEXT: pand %xmm3, %xmm0 +; SSE41-NEXT: por %xmm5, %xmm0 +; SSE41-NEXT: movapd %xmm1, %xmm14 +; SSE41-NEXT: blendvpd %xmm0, %xmm4, %xmm14 +; SSE41-NEXT: movdqa %xmm15, %xmm0 +; SSE41-NEXT: pxor %xmm2, %xmm0 +; SSE41-NEXT: movdqa %xmm7, %xmm3 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm3 +; SSE41-NEXT: movdqa %xmm7, %xmm4 +; SSE41-NEXT: pcmpgtd %xmm0, %xmm4 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm4[0,0,2,2] +; SSE41-NEXT: pand %xmm3, %xmm0 +; SSE41-NEXT: por %xmm4, %xmm0 +; SSE41-NEXT: movapd %xmm1, %xmm4 +; 
SSE41-NEXT: blendvpd %xmm0, %xmm15, %xmm4 +; SSE41-NEXT: movdqa %xmm12, %xmm0 +; SSE41-NEXT: pxor %xmm2, %xmm0 +; SSE41-NEXT: movdqa %xmm7, %xmm3 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm3 +; SSE41-NEXT: movdqa %xmm7, %xmm5 +; SSE41-NEXT: pcmpgtd %xmm0, %xmm5 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm5[0,0,2,2] +; SSE41-NEXT: pand %xmm3, %xmm0 +; SSE41-NEXT: por %xmm5, %xmm0 +; SSE41-NEXT: movapd %xmm1, %xmm15 +; SSE41-NEXT: blendvpd %xmm0, %xmm12, %xmm15 +; SSE41-NEXT: movdqa %xmm11, %xmm0 +; SSE41-NEXT: pxor %xmm2, %xmm0 +; SSE41-NEXT: movdqa %xmm7, %xmm3 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm3 +; SSE41-NEXT: movdqa %xmm7, %xmm6 ; SSE41-NEXT: pcmpgtd %xmm0, %xmm6 ; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm6[0,0,2,2] -; SSE41-NEXT: pand %xmm2, %xmm0 +; SSE41-NEXT: pand %xmm3, %xmm0 ; SSE41-NEXT: por %xmm6, %xmm0 -; SSE41-NEXT: movapd %xmm11, %xmm6 -; SSE41-NEXT: blendvpd %xmm0, %xmm3, %xmm6 -; SSE41-NEXT: movdqa %xmm8, %xmm0 -; SSE41-NEXT: pxor %xmm9, %xmm0 -; SSE41-NEXT: movdqa %xmm12, %xmm2 -; SSE41-NEXT: pcmpeqd %xmm0, %xmm2 -; SSE41-NEXT: movdqa %xmm12, %xmm3 -; SSE41-NEXT: pcmpgtd %xmm0, %xmm3 -; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm3[0,0,2,2] -; SSE41-NEXT: pand %xmm2, %xmm0 -; SSE41-NEXT: por %xmm3, %xmm0 -; SSE41-NEXT: movapd %xmm11, %xmm7 -; SSE41-NEXT: blendvpd %xmm0, %xmm8, %xmm7 -; SSE41-NEXT: movdqa %xmm1, %xmm0 -; SSE41-NEXT: pxor %xmm9, %xmm0 -; SSE41-NEXT: movdqa %xmm12, %xmm2 -; SSE41-NEXT: pcmpeqd %xmm0, %xmm2 -; SSE41-NEXT: pcmpgtd %xmm0, %xmm12 -; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm12[0,0,2,2] -; SSE41-NEXT: pand %xmm2, %xmm0 -; SSE41-NEXT: por %xmm12, %xmm0 -; SSE41-NEXT: blendvpd %xmm0, %xmm1, %xmm11 -; SSE41-NEXT: movapd {{.*#+}} xmm2 = [18446744073709551488,18446744073709551488] -; SSE41-NEXT: movapd %xmm11, %xmm1 -; SSE41-NEXT: xorpd %xmm9, %xmm1 -; SSE41-NEXT: movdqa {{.*#+}} xmm8 = [18446744071562067840,18446744071562067840] -; SSE41-NEXT: movapd %xmm1, %xmm4 -; SSE41-NEXT: pcmpeqd %xmm8, %xmm4 -; SSE41-NEXT: pcmpgtd %xmm8, %xmm1 +; SSE41-NEXT: 
movapd %xmm1, %xmm6 +; SSE41-NEXT: blendvpd %xmm0, %xmm11, %xmm6 +; SSE41-NEXT: movdqa %xmm9, %xmm0 +; SSE41-NEXT: pxor %xmm2, %xmm0 +; SSE41-NEXT: movdqa %xmm7, %xmm3 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm3 +; SSE41-NEXT: pcmpgtd %xmm0, %xmm7 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm7[0,0,2,2] +; SSE41-NEXT: pand %xmm3, %xmm0 +; SSE41-NEXT: por %xmm7, %xmm0 +; SSE41-NEXT: blendvpd %xmm0, %xmm9, %xmm1 +; SSE41-NEXT: movapd {{.*#+}} xmm7 = [18446744073709551488,18446744073709551488] +; SSE41-NEXT: movapd %xmm1, %xmm5 +; SSE41-NEXT: xorpd %xmm2, %xmm5 +; SSE41-NEXT: movdqa {{.*#+}} xmm9 = [18446744071562067840,18446744071562067840] +; SSE41-NEXT: movapd %xmm5, %xmm3 +; SSE41-NEXT: pcmpeqd %xmm9, %xmm3 +; SSE41-NEXT: pcmpgtd %xmm9, %xmm5 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm5[0,0,2,2] +; SSE41-NEXT: pand %xmm3, %xmm0 +; SSE41-NEXT: por %xmm5, %xmm0 +; SSE41-NEXT: movapd %xmm7, %xmm3 +; SSE41-NEXT: blendvpd %xmm0, %xmm1, %xmm3 +; SSE41-NEXT: movapd %xmm6, %xmm1 +; SSE41-NEXT: xorpd %xmm2, %xmm1 +; SSE41-NEXT: movapd %xmm1, %xmm5 +; SSE41-NEXT: pcmpeqd %xmm9, %xmm5 +; SSE41-NEXT: pcmpgtd %xmm9, %xmm1 ; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm1[0,0,2,2] -; SSE41-NEXT: pand %xmm4, %xmm0 +; SSE41-NEXT: pand %xmm5, %xmm0 ; SSE41-NEXT: por %xmm1, %xmm0 -; SSE41-NEXT: movapd %xmm2, %xmm4 -; SSE41-NEXT: blendvpd %xmm0, %xmm11, %xmm4 ; SSE41-NEXT: movapd %xmm7, %xmm1 -; SSE41-NEXT: xorpd %xmm9, %xmm1 -; SSE41-NEXT: movapd %xmm1, %xmm3 -; SSE41-NEXT: pcmpeqd %xmm8, %xmm3 -; SSE41-NEXT: pcmpgtd %xmm8, %xmm1 -; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm1[0,0,2,2] -; SSE41-NEXT: pand %xmm3, %xmm0 -; SSE41-NEXT: por %xmm1, %xmm0 -; SSE41-NEXT: movapd %xmm2, %xmm1 -; SSE41-NEXT: blendvpd %xmm0, %xmm7, %xmm1 -; SSE41-NEXT: packssdw %xmm4, %xmm1 -; SSE41-NEXT: movapd %xmm6, %xmm3 -; SSE41-NEXT: xorpd %xmm9, %xmm3 -; SSE41-NEXT: movapd %xmm3, %xmm4 -; SSE41-NEXT: pcmpeqd %xmm8, %xmm4 -; SSE41-NEXT: pcmpgtd %xmm8, %xmm3 +; SSE41-NEXT: blendvpd %xmm0, %xmm6, %xmm1 +; SSE41-NEXT: packssdw %xmm3, 
%xmm1 +; SSE41-NEXT: movapd %xmm15, %xmm3 +; SSE41-NEXT: xorpd %xmm2, %xmm3 +; SSE41-NEXT: movapd %xmm3, %xmm5 +; SSE41-NEXT: pcmpeqd %xmm9, %xmm5 +; SSE41-NEXT: pcmpgtd %xmm9, %xmm3 ; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm3[0,0,2,2] -; SSE41-NEXT: pand %xmm4, %xmm0 +; SSE41-NEXT: pand %xmm5, %xmm0 ; SSE41-NEXT: por %xmm3, %xmm0 -; SSE41-NEXT: movapd %xmm2, %xmm3 -; SSE41-NEXT: blendvpd %xmm0, %xmm6, %xmm3 -; SSE41-NEXT: movapd %xmm5, %xmm4 -; SSE41-NEXT: xorpd %xmm9, %xmm4 -; SSE41-NEXT: movapd %xmm4, %xmm6 -; SSE41-NEXT: pcmpeqd %xmm8, %xmm6 -; SSE41-NEXT: pcmpgtd %xmm8, %xmm4 -; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm4[0,0,2,2] +; SSE41-NEXT: movapd %xmm7, %xmm3 +; SSE41-NEXT: blendvpd %xmm0, %xmm15, %xmm3 +; SSE41-NEXT: movapd %xmm4, %xmm5 +; SSE41-NEXT: xorpd %xmm2, %xmm5 +; SSE41-NEXT: movapd %xmm5, %xmm6 +; SSE41-NEXT: pcmpeqd %xmm9, %xmm6 +; SSE41-NEXT: pcmpgtd %xmm9, %xmm5 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm5[0,0,2,2] ; SSE41-NEXT: pand %xmm6, %xmm0 -; SSE41-NEXT: por %xmm4, %xmm0 -; SSE41-NEXT: movapd %xmm2, %xmm4 -; SSE41-NEXT: blendvpd %xmm0, %xmm5, %xmm4 -; SSE41-NEXT: packssdw %xmm3, %xmm4 -; SSE41-NEXT: packssdw %xmm4, %xmm1 -; SSE41-NEXT: movapd %xmm15, %xmm3 -; SSE41-NEXT: xorpd %xmm9, %xmm3 +; SSE41-NEXT: por %xmm5, %xmm0 +; SSE41-NEXT: movapd %xmm7, %xmm5 +; SSE41-NEXT: blendvpd %xmm0, %xmm4, %xmm5 +; SSE41-NEXT: packssdw %xmm3, %xmm5 +; SSE41-NEXT: packssdw %xmm5, %xmm1 +; SSE41-NEXT: movapd %xmm14, %xmm3 +; SSE41-NEXT: xorpd %xmm2, %xmm3 ; SSE41-NEXT: movapd %xmm3, %xmm4 -; SSE41-NEXT: pcmpeqd %xmm8, %xmm4 -; SSE41-NEXT: pcmpgtd %xmm8, %xmm3 +; SSE41-NEXT: pcmpeqd %xmm9, %xmm4 +; SSE41-NEXT: pcmpgtd %xmm9, %xmm3 ; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm3[0,0,2,2] ; SSE41-NEXT: pand %xmm4, %xmm0 ; SSE41-NEXT: por %xmm3, %xmm0 -; SSE41-NEXT: movapd %xmm2, %xmm3 -; SSE41-NEXT: blendvpd %xmm0, %xmm15, %xmm3 -; SSE41-NEXT: movapd %xmm14, %xmm4 -; SSE41-NEXT: xorpd %xmm9, %xmm4 +; SSE41-NEXT: movapd %xmm7, %xmm3 +; SSE41-NEXT: blendvpd %xmm0, 
%xmm14, %xmm3 +; SSE41-NEXT: movapd %xmm13, %xmm4 +; SSE41-NEXT: xorpd %xmm2, %xmm4 ; SSE41-NEXT: movapd %xmm4, %xmm5 -; SSE41-NEXT: pcmpeqd %xmm8, %xmm5 -; SSE41-NEXT: pcmpgtd %xmm8, %xmm4 +; SSE41-NEXT: pcmpeqd %xmm9, %xmm5 +; SSE41-NEXT: pcmpgtd %xmm9, %xmm4 ; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm4[0,0,2,2] ; SSE41-NEXT: pand %xmm5, %xmm0 ; SSE41-NEXT: por %xmm4, %xmm0 -; SSE41-NEXT: movapd %xmm2, %xmm4 -; SSE41-NEXT: blendvpd %xmm0, %xmm14, %xmm4 +; SSE41-NEXT: movapd %xmm7, %xmm4 +; SSE41-NEXT: blendvpd %xmm0, %xmm13, %xmm4 ; SSE41-NEXT: packssdw %xmm3, %xmm4 -; SSE41-NEXT: movapd %xmm13, %xmm3 -; SSE41-NEXT: xorpd %xmm9, %xmm3 +; SSE41-NEXT: movapd %xmm10, %xmm3 +; SSE41-NEXT: xorpd %xmm2, %xmm3 ; SSE41-NEXT: movapd %xmm3, %xmm5 -; SSE41-NEXT: pcmpeqd %xmm8, %xmm5 -; SSE41-NEXT: pcmpgtd %xmm8, %xmm3 +; SSE41-NEXT: pcmpeqd %xmm9, %xmm5 +; SSE41-NEXT: pcmpgtd %xmm9, %xmm3 ; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm3[0,0,2,2] ; SSE41-NEXT: pand %xmm5, %xmm0 ; SSE41-NEXT: por %xmm3, %xmm0 -; SSE41-NEXT: movapd %xmm2, %xmm3 -; SSE41-NEXT: blendvpd %xmm0, %xmm13, %xmm3 -; SSE41-NEXT: xorpd %xmm10, %xmm9 -; SSE41-NEXT: movapd %xmm9, %xmm5 -; SSE41-NEXT: pcmpeqd %xmm8, %xmm5 -; SSE41-NEXT: pcmpgtd %xmm8, %xmm9 -; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm9[0,0,2,2] +; SSE41-NEXT: movapd %xmm7, %xmm3 +; SSE41-NEXT: blendvpd %xmm0, %xmm10, %xmm3 +; SSE41-NEXT: xorpd %xmm8, %xmm2 +; SSE41-NEXT: movapd %xmm2, %xmm5 +; SSE41-NEXT: pcmpeqd %xmm9, %xmm5 +; SSE41-NEXT: pcmpgtd %xmm9, %xmm2 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm2[0,0,2,2] ; SSE41-NEXT: pand %xmm5, %xmm0 -; SSE41-NEXT: por %xmm9, %xmm0 -; SSE41-NEXT: blendvpd %xmm0, %xmm10, %xmm2 -; SSE41-NEXT: packssdw %xmm3, %xmm2 -; SSE41-NEXT: packssdw %xmm2, %xmm4 +; SSE41-NEXT: por %xmm2, %xmm0 +; SSE41-NEXT: blendvpd %xmm0, %xmm8, %xmm7 +; SSE41-NEXT: packssdw %xmm3, %xmm7 +; SSE41-NEXT: packssdw %xmm7, %xmm4 ; SSE41-NEXT: packsswb %xmm4, %xmm1 ; SSE41-NEXT: movdqa %xmm1, %xmm0 ; SSE41-NEXT: retq ; ; AVX1-LABEL: 
trunc_ssat_v16i64_v16i8: ; AVX1: # %bb.0: -; AVX1-NEXT: vextractf128 $1, %ymm3, %xmm8 -; AVX1-NEXT: vmovdqa {{.*#+}} xmm5 = [127,127] -; AVX1-NEXT: vextractf128 $1, %ymm2, %xmm9 -; AVX1-NEXT: vextractf128 $1, %ymm1, %xmm7 -; AVX1-NEXT: vextractf128 $1, %ymm0, %xmm4 -; AVX1-NEXT: vpcmpgtq %xmm0, %xmm5, %xmm6 -; AVX1-NEXT: vblendvpd %xmm6, %xmm0, %xmm5, %xmm10 -; AVX1-NEXT: vpcmpgtq %xmm4, %xmm5, %xmm6 -; AVX1-NEXT: vblendvpd %xmm6, %xmm4, %xmm5, %xmm11 -; AVX1-NEXT: vpcmpgtq %xmm1, %xmm5, %xmm6 -; AVX1-NEXT: vblendvpd %xmm6, %xmm1, %xmm5, %xmm1 -; AVX1-NEXT: vpcmpgtq %xmm7, %xmm5, %xmm6 -; AVX1-NEXT: vblendvpd %xmm6, %xmm7, %xmm5, %xmm6 -; AVX1-NEXT: vpcmpgtq %xmm2, %xmm5, %xmm7 -; AVX1-NEXT: vblendvpd %xmm7, %xmm2, %xmm5, %xmm2 -; AVX1-NEXT: vpcmpgtq %xmm9, %xmm5, %xmm7 -; AVX1-NEXT: vblendvpd %xmm7, %xmm9, %xmm5, %xmm7 -; AVX1-NEXT: vpcmpgtq %xmm3, %xmm5, %xmm0 -; AVX1-NEXT: vblendvpd %xmm0, %xmm3, %xmm5, %xmm0 -; AVX1-NEXT: vpcmpgtq %xmm8, %xmm5, %xmm3 -; AVX1-NEXT: vblendvpd %xmm3, %xmm8, %xmm5, %xmm3 -; AVX1-NEXT: vmovdqa {{.*#+}} xmm5 = [18446744073709551488,18446744073709551488] -; AVX1-NEXT: vpcmpgtq %xmm5, %xmm3, %xmm4 -; AVX1-NEXT: vblendvpd %xmm4, %xmm3, %xmm5, %xmm8 -; AVX1-NEXT: vpcmpgtq %xmm5, %xmm0, %xmm4 -; AVX1-NEXT: vblendvpd %xmm4, %xmm0, %xmm5, %xmm0 -; AVX1-NEXT: vpcmpgtq %xmm5, %xmm7, %xmm4 -; AVX1-NEXT: vblendvpd %xmm4, %xmm7, %xmm5, %xmm4 -; AVX1-NEXT: vpcmpgtq %xmm5, %xmm2, %xmm7 -; AVX1-NEXT: vblendvpd %xmm7, %xmm2, %xmm5, %xmm2 -; AVX1-NEXT: vpcmpgtq %xmm5, %xmm6, %xmm7 -; AVX1-NEXT: vblendvpd %xmm7, %xmm6, %xmm5, %xmm6 -; AVX1-NEXT: vpcmpgtq %xmm5, %xmm1, %xmm7 -; AVX1-NEXT: vblendvpd %xmm7, %xmm1, %xmm5, %xmm1 -; AVX1-NEXT: vpcmpgtq %xmm5, %xmm11, %xmm7 -; AVX1-NEXT: vblendvpd %xmm7, %xmm11, %xmm5, %xmm7 -; AVX1-NEXT: vpcmpgtq %xmm5, %xmm10, %xmm3 -; AVX1-NEXT: vblendvpd %xmm3, %xmm10, %xmm5, %xmm3 -; AVX1-NEXT: vpackssdw %xmm8, %xmm0, %xmm0 -; AVX1-NEXT: vpackssdw %xmm4, %xmm2, %xmm2 -; AVX1-NEXT: vpackssdw %xmm0, %xmm2, %xmm0 -; 
AVX1-NEXT: vpackssdw %xmm6, %xmm1, %xmm1 -; AVX1-NEXT: vpackssdw %xmm7, %xmm3, %xmm2 -; AVX1-NEXT: vpackssdw %xmm1, %xmm2, %xmm1 -; AVX1-NEXT: vpacksswb %xmm0, %xmm1, %xmm0 -; AVX1-NEXT: vzeroupper +; AVX1-NEXT: vmovdqa 112(%rdi), %xmm8 +; AVX1-NEXT: vmovdqa {{.*#+}} xmm1 = [127,127] +; AVX1-NEXT: vmovdqa 96(%rdi), %xmm9 +; AVX1-NEXT: vmovdqa 80(%rdi), %xmm3 +; AVX1-NEXT: vmovdqa 64(%rdi), %xmm4 +; AVX1-NEXT: vmovdqa (%rdi), %xmm5 +; AVX1-NEXT: vmovdqa 16(%rdi), %xmm6 +; AVX1-NEXT: vmovdqa 32(%rdi), %xmm7 +; AVX1-NEXT: vmovdqa 48(%rdi), %xmm0 +; AVX1-NEXT: vpcmpgtq %xmm5, %xmm1, %xmm2 +; AVX1-NEXT: vblendvpd %xmm2, %xmm5, %xmm1, %xmm10 +; AVX1-NEXT: vpcmpgtq %xmm6, %xmm1, %xmm5 +; AVX1-NEXT: vblendvpd %xmm5, %xmm6, %xmm1, %xmm11 +; AVX1-NEXT: vpcmpgtq %xmm7, %xmm1, %xmm6 +; AVX1-NEXT: vblendvpd %xmm6, %xmm7, %xmm1, %xmm6 +; AVX1-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm7 +; AVX1-NEXT: vblendvpd %xmm7, %xmm0, %xmm1, %xmm0 +; AVX1-NEXT: vpcmpgtq %xmm4, %xmm1, %xmm7 +; AVX1-NEXT: vblendvpd %xmm7, %xmm4, %xmm1, %xmm4 +; AVX1-NEXT: vpcmpgtq %xmm3, %xmm1, %xmm7 +; AVX1-NEXT: vblendvpd %xmm7, %xmm3, %xmm1, %xmm3 +; AVX1-NEXT: vpcmpgtq %xmm9, %xmm1, %xmm7 +; AVX1-NEXT: vblendvpd %xmm7, %xmm9, %xmm1, %xmm7 +; AVX1-NEXT: vpcmpgtq %xmm8, %xmm1, %xmm2 +; AVX1-NEXT: vblendvpd %xmm2, %xmm8, %xmm1, %xmm1 +; AVX1-NEXT: vmovdqa {{.*#+}} xmm2 = [18446744073709551488,18446744073709551488] +; AVX1-NEXT: vpcmpgtq %xmm2, %xmm1, %xmm5 +; AVX1-NEXT: vblendvpd %xmm5, %xmm1, %xmm2, %xmm8 +; AVX1-NEXT: vpcmpgtq %xmm2, %xmm7, %xmm5 +; AVX1-NEXT: vblendvpd %xmm5, %xmm7, %xmm2, %xmm5 +; AVX1-NEXT: vpcmpgtq %xmm2, %xmm3, %xmm7 +; AVX1-NEXT: vblendvpd %xmm7, %xmm3, %xmm2, %xmm3 +; AVX1-NEXT: vpcmpgtq %xmm2, %xmm4, %xmm7 +; AVX1-NEXT: vblendvpd %xmm7, %xmm4, %xmm2, %xmm4 +; AVX1-NEXT: vpcmpgtq %xmm2, %xmm0, %xmm7 +; AVX1-NEXT: vblendvpd %xmm7, %xmm0, %xmm2, %xmm0 +; AVX1-NEXT: vpcmpgtq %xmm2, %xmm6, %xmm7 +; AVX1-NEXT: vblendvpd %xmm7, %xmm6, %xmm2, %xmm6 +; AVX1-NEXT: vpcmpgtq %xmm2, %xmm11, %xmm7 +; 
AVX1-NEXT: vblendvpd %xmm7, %xmm11, %xmm2, %xmm7 +; AVX1-NEXT: vpcmpgtq %xmm2, %xmm10, %xmm1 +; AVX1-NEXT: vblendvpd %xmm1, %xmm10, %xmm2, %xmm1 +; AVX1-NEXT: vpackssdw %xmm8, %xmm5, %xmm2 +; AVX1-NEXT: vpackssdw %xmm3, %xmm4, %xmm3 +; AVX1-NEXT: vpackssdw %xmm2, %xmm3, %xmm2 +; AVX1-NEXT: vpackssdw %xmm0, %xmm6, %xmm0 +; AVX1-NEXT: vpackssdw %xmm7, %xmm1, %xmm1 +; AVX1-NEXT: vpackssdw %xmm0, %xmm1, %xmm0 +; AVX1-NEXT: vpacksswb %xmm2, %xmm0, %xmm0 ; AVX1-NEXT: retq ; ; AVX2-LABEL: trunc_ssat_v16i64_v16i8: ; AVX2: # %bb.0: +; AVX2-NEXT: vmovdqa (%rdi), %ymm0 +; AVX2-NEXT: vmovdqa 32(%rdi), %ymm1 +; AVX2-NEXT: vmovdqa 64(%rdi), %ymm2 +; AVX2-NEXT: vmovdqa 96(%rdi), %ymm3 ; AVX2-NEXT: vpbroadcastq {{.*#+}} ymm4 = [127,127,127,127] ; AVX2-NEXT: vpcmpgtq %ymm2, %ymm4, %ymm5 ; AVX2-NEXT: vblendvpd %ymm5, %ymm2, %ymm4, %ymm2 @@ -3876,21 +4081,23 @@ define <16 x i8> @trunc_ssat_v16i64_v16i ; ; AVX512F-LABEL: trunc_ssat_v16i64_v16i8: ; AVX512F: # %bb.0: -; AVX512F-NEXT: vpbroadcastq {{.*#+}} zmm2 = [127,127,127,127,127,127,127,127] -; AVX512F-NEXT: vpminsq %zmm2, %zmm0, %zmm0 -; AVX512F-NEXT: vpminsq %zmm2, %zmm1, %zmm1 +; AVX512F-NEXT: vpbroadcastq {{.*#+}} zmm0 = [127,127,127,127,127,127,127,127] +; AVX512F-NEXT: vpminsq (%rdi), %zmm0, %zmm1 +; AVX512F-NEXT: vpminsq 64(%rdi), %zmm0, %zmm0 ; AVX512F-NEXT: vpbroadcastq {{.*#+}} zmm2 = [18446744073709551488,18446744073709551488,18446744073709551488,18446744073709551488,18446744073709551488,18446744073709551488,18446744073709551488,18446744073709551488] -; AVX512F-NEXT: vpmaxsq %zmm2, %zmm1, %zmm1 ; AVX512F-NEXT: vpmaxsq %zmm2, %zmm0, %zmm0 -; AVX512F-NEXT: vpmovqd %zmm0, %ymm0 +; AVX512F-NEXT: vpmaxsq %zmm2, %zmm1, %zmm1 ; AVX512F-NEXT: vpmovqd %zmm1, %ymm1 -; AVX512F-NEXT: vinserti64x4 $1, %ymm1, %zmm0, %zmm0 +; AVX512F-NEXT: vpmovqd %zmm0, %ymm0 +; AVX512F-NEXT: vinserti64x4 $1, %ymm0, %zmm1, %zmm0 ; AVX512F-NEXT: vpmovdb %zmm0, %xmm0 ; AVX512F-NEXT: vzeroupper ; AVX512F-NEXT: retq ; ; AVX512VL-LABEL: 
trunc_ssat_v16i64_v16i8: ; AVX512VL: # %bb.0: +; AVX512VL-NEXT: vmovdqa64 (%rdi), %zmm0 +; AVX512VL-NEXT: vmovdqa64 64(%rdi), %zmm1 ; AVX512VL-NEXT: vpmovsqb %zmm1, %xmm1 ; AVX512VL-NEXT: vpmovsqb %zmm0, %xmm0 ; AVX512VL-NEXT: vpunpcklqdq {{.*#+}} xmm0 = xmm0[0],xmm1[0] @@ -3899,26 +4106,45 @@ define <16 x i8> @trunc_ssat_v16i64_v16i ; ; AVX512BW-LABEL: trunc_ssat_v16i64_v16i8: ; AVX512BW: # %bb.0: -; AVX512BW-NEXT: vpbroadcastq {{.*#+}} zmm2 = [127,127,127,127,127,127,127,127] -; AVX512BW-NEXT: vpminsq %zmm2, %zmm0, %zmm0 -; AVX512BW-NEXT: vpminsq %zmm2, %zmm1, %zmm1 +; AVX512BW-NEXT: vpbroadcastq {{.*#+}} zmm0 = [127,127,127,127,127,127,127,127] +; AVX512BW-NEXT: vpminsq (%rdi), %zmm0, %zmm1 +; AVX512BW-NEXT: vpminsq 64(%rdi), %zmm0, %zmm0 ; AVX512BW-NEXT: vpbroadcastq {{.*#+}} zmm2 = [18446744073709551488,18446744073709551488,18446744073709551488,18446744073709551488,18446744073709551488,18446744073709551488,18446744073709551488,18446744073709551488] -; AVX512BW-NEXT: vpmaxsq %zmm2, %zmm1, %zmm1 ; AVX512BW-NEXT: vpmaxsq %zmm2, %zmm0, %zmm0 -; AVX512BW-NEXT: vpmovqd %zmm0, %ymm0 +; AVX512BW-NEXT: vpmaxsq %zmm2, %zmm1, %zmm1 ; AVX512BW-NEXT: vpmovqd %zmm1, %ymm1 -; AVX512BW-NEXT: vinserti64x4 $1, %ymm1, %zmm0, %zmm0 +; AVX512BW-NEXT: vpmovqd %zmm0, %ymm0 +; AVX512BW-NEXT: vinserti64x4 $1, %ymm0, %zmm1, %zmm0 ; AVX512BW-NEXT: vpmovdb %zmm0, %xmm0 ; AVX512BW-NEXT: vzeroupper ; AVX512BW-NEXT: retq ; ; AVX512BWVL-LABEL: trunc_ssat_v16i64_v16i8: ; AVX512BWVL: # %bb.0: +; AVX512BWVL-NEXT: vmovdqa64 (%rdi), %zmm0 +; AVX512BWVL-NEXT: vmovdqa64 64(%rdi), %zmm1 ; AVX512BWVL-NEXT: vpmovsqb %zmm1, %xmm1 ; AVX512BWVL-NEXT: vpmovsqb %zmm0, %xmm0 ; AVX512BWVL-NEXT: vpunpcklqdq {{.*#+}} xmm0 = xmm0[0],xmm1[0] ; AVX512BWVL-NEXT: vzeroupper ; AVX512BWVL-NEXT: retq +; +; SKX-LABEL: trunc_ssat_v16i64_v16i8: +; SKX: # %bb.0: +; SKX-NEXT: vmovdqa (%rdi), %ymm0 +; SKX-NEXT: vmovdqa 32(%rdi), %ymm1 +; SKX-NEXT: vmovdqa 64(%rdi), %ymm2 +; SKX-NEXT: vmovdqa 96(%rdi), %ymm3 +; SKX-NEXT: 
vpmovsqb %ymm3, %xmm3 +; SKX-NEXT: vpmovsqb %ymm2, %xmm2 +; SKX-NEXT: vpunpckldq {{.*#+}} xmm2 = xmm2[0],xmm3[0],xmm2[1],xmm3[1] +; SKX-NEXT: vpmovsqb %ymm1, %xmm1 +; SKX-NEXT: vpmovsqb %ymm0, %xmm0 +; SKX-NEXT: vpunpckldq {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1] +; SKX-NEXT: vpunpcklqdq {{.*#+}} xmm0 = xmm0[0],xmm2[0] +; SKX-NEXT: vzeroupper +; SKX-NEXT: retq + %a0 = load <16 x i64>, <16 x i64>* %p0 %1 = icmp slt <16 x i64> %a0, %2 = select <16 x i1> %1, <16 x i64> %a0, <16 x i64> %3 = icmp sgt <16 x i64> %2, @@ -4018,6 +4244,13 @@ define <4 x i8> @trunc_ssat_v4i32_v4i8(< ; AVX512BWVL-NEXT: vpmaxsd {{.*}}(%rip){1to4}, %xmm0, %xmm0 ; AVX512BWVL-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,4,8,12,u,u,u,u,u,u,u,u,u,u,u,u] ; AVX512BWVL-NEXT: retq +; +; SKX-LABEL: trunc_ssat_v4i32_v4i8: +; SKX: # %bb.0: +; SKX-NEXT: vpminsd {{.*}}(%rip){1to4}, %xmm0, %xmm0 +; SKX-NEXT: vpmaxsd {{.*}}(%rip){1to4}, %xmm0, %xmm0 +; SKX-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,4,8,12,u,u,u,u,u,u,u,u,u,u,u,u] +; SKX-NEXT: retq %1 = icmp slt <4 x i32> %a0, %2 = select <4 x i1> %1, <4 x i32> %a0, <4 x i32> %3 = icmp sgt <4 x i32> %2, @@ -4120,6 +4353,11 @@ define void @trunc_ssat_v4i32_v4i8_store ; AVX512BWVL: # %bb.0: ; AVX512BWVL-NEXT: vpmovsdb %xmm0, (%rdi) ; AVX512BWVL-NEXT: retq +; +; SKX-LABEL: trunc_ssat_v4i32_v4i8_store: +; SKX: # %bb.0: +; SKX-NEXT: vpmovsdb %xmm0, (%rdi) +; SKX-NEXT: retq %1 = icmp slt <4 x i32> %a0, %2 = select <4 x i1> %1, <4 x i32> %a0, <4 x i32> %3 = icmp sgt <4 x i32> %2, @@ -4179,6 +4417,12 @@ define <8 x i8> @trunc_ssat_v8i32_v8i8(< ; AVX512BWVL-NEXT: vpmovsdb %ymm0, %xmm0 ; AVX512BWVL-NEXT: vzeroupper ; AVX512BWVL-NEXT: retq +; +; SKX-LABEL: trunc_ssat_v8i32_v8i8: +; SKX: # %bb.0: +; SKX-NEXT: vpmovsdb %ymm0, %xmm0 +; SKX-NEXT: vzeroupper +; SKX-NEXT: retq %1 = icmp slt <8 x i32> %a0, %2 = select <8 x i1> %1, <8 x i32> %a0, <8 x i32> %3 = icmp sgt <8 x i32> %2, @@ -4242,6 +4486,12 @@ define void @trunc_ssat_v8i32_v8i8_store ; AVX512BWVL-NEXT: vpmovsdb %ymm0, 
(%rdi) ; AVX512BWVL-NEXT: vzeroupper ; AVX512BWVL-NEXT: retq +; +; SKX-LABEL: trunc_ssat_v8i32_v8i8_store: +; SKX: # %bb.0: +; SKX-NEXT: vpmovsdb %ymm0, (%rdi) +; SKX-NEXT: vzeroupper +; SKX-NEXT: retq %1 = icmp slt <8 x i32> %a0, %2 = select <8 x i1> %1, <8 x i32> %a0, <8 x i32> %3 = icmp sgt <8 x i32> %2, @@ -4251,27 +4501,29 @@ define void @trunc_ssat_v8i32_v8i8_store ret void } -define <16 x i8> @trunc_ssat_v16i32_v16i8(<16 x i32> %a0) { +define <16 x i8> @trunc_ssat_v16i32_v16i8(<16 x i32>* %p0) "min-legal-vector-width"="256" { ; SSE-LABEL: trunc_ssat_v16i32_v16i8: ; SSE: # %bb.0: -; SSE-NEXT: packssdw %xmm3, %xmm2 -; SSE-NEXT: packssdw %xmm1, %xmm0 -; SSE-NEXT: packsswb %xmm2, %xmm0 +; SSE-NEXT: movdqa (%rdi), %xmm0 +; SSE-NEXT: movdqa 32(%rdi), %xmm1 +; SSE-NEXT: packssdw 48(%rdi), %xmm1 +; SSE-NEXT: packssdw 16(%rdi), %xmm0 +; SSE-NEXT: packsswb %xmm1, %xmm0 ; SSE-NEXT: retq ; ; AVX1-LABEL: trunc_ssat_v16i32_v16i8: ; AVX1: # %bb.0: -; AVX1-NEXT: vextractf128 $1, %ymm1, %xmm2 -; AVX1-NEXT: vpackssdw %xmm2, %xmm1, %xmm1 -; AVX1-NEXT: vextractf128 $1, %ymm0, %xmm2 -; AVX1-NEXT: vpackssdw %xmm2, %xmm0, %xmm0 +; AVX1-NEXT: vmovdqa (%rdi), %xmm0 +; AVX1-NEXT: vmovdqa 32(%rdi), %xmm1 +; AVX1-NEXT: vpackssdw 48(%rdi), %xmm1, %xmm1 +; AVX1-NEXT: vpackssdw 16(%rdi), %xmm0, %xmm0 ; AVX1-NEXT: vpacksswb %xmm1, %xmm0, %xmm0 -; AVX1-NEXT: vzeroupper ; AVX1-NEXT: retq ; ; AVX2-LABEL: trunc_ssat_v16i32_v16i8: ; AVX2: # %bb.0: -; AVX2-NEXT: vpackssdw %ymm1, %ymm0, %ymm0 +; AVX2-NEXT: vmovdqa (%rdi), %ymm0 +; AVX2-NEXT: vpackssdw 32(%rdi), %ymm0, %ymm0 ; AVX2-NEXT: vpermq {{.*#+}} ymm0 = ymm0[0,2,1,3] ; AVX2-NEXT: vextracti128 $1, %ymm0, %xmm1 ; AVX2-NEXT: vpacksswb %xmm1, %xmm0, %xmm0 @@ -4280,9 +4532,21 @@ define <16 x i8> @trunc_ssat_v16i32_v16i ; ; AVX512-LABEL: trunc_ssat_v16i32_v16i8: ; AVX512: # %bb.0: +; AVX512-NEXT: vmovdqa64 (%rdi), %zmm0 ; AVX512-NEXT: vpmovsdb %zmm0, %xmm0 ; AVX512-NEXT: vzeroupper ; AVX512-NEXT: retq +; +; SKX-LABEL: trunc_ssat_v16i32_v16i8: +; 
SKX: # %bb.0: +; SKX-NEXT: vmovdqa (%rdi), %ymm0 +; SKX-NEXT: vmovdqa 32(%rdi), %ymm1 +; SKX-NEXT: vpmovsdb %ymm1, %xmm1 +; SKX-NEXT: vpmovsdb %ymm0, %xmm0 +; SKX-NEXT: vpunpcklqdq {{.*#+}} xmm0 = xmm0[0],xmm1[0] +; SKX-NEXT: vzeroupper +; SKX-NEXT: retq + %a0 = load <16 x i32>, <16 x i32>* %p0 %1 = icmp slt <16 x i32> %a0, %2 = select <16 x i1> %1, <16 x i32> %a0, <16 x i32> %3 = icmp sgt <16 x i32> %2, @@ -4291,6 +4555,65 @@ define <16 x i8> @trunc_ssat_v16i32_v16i ret <16 x i8> %5 } +define void @trunc_ssat_v16i32_v16i8_store(<16 x i32>* %p0, <16 x i8>* %p1) "min-legal-vector-width"="256" { +; SSE-LABEL: trunc_ssat_v16i32_v16i8_store: +; SSE: # %bb.0: +; SSE-NEXT: movdqa (%rdi), %xmm0 +; SSE-NEXT: movdqa 32(%rdi), %xmm1 +; SSE-NEXT: packssdw 48(%rdi), %xmm1 +; SSE-NEXT: packssdw 16(%rdi), %xmm0 +; SSE-NEXT: packsswb %xmm1, %xmm0 +; SSE-NEXT: movdqa %xmm0, (%rsi) +; SSE-NEXT: retq +; +; AVX1-LABEL: trunc_ssat_v16i32_v16i8_store: +; AVX1: # %bb.0: +; AVX1-NEXT: vmovdqa (%rdi), %xmm0 +; AVX1-NEXT: vmovdqa 32(%rdi), %xmm1 +; AVX1-NEXT: vpackssdw 48(%rdi), %xmm1, %xmm1 +; AVX1-NEXT: vpackssdw 16(%rdi), %xmm0, %xmm0 +; AVX1-NEXT: vpacksswb %xmm1, %xmm0, %xmm0 +; AVX1-NEXT: vmovdqa %xmm0, (%rsi) +; AVX1-NEXT: retq +; +; AVX2-LABEL: trunc_ssat_v16i32_v16i8_store: +; AVX2: # %bb.0: +; AVX2-NEXT: vmovdqa (%rdi), %ymm0 +; AVX2-NEXT: vpackssdw 32(%rdi), %ymm0, %ymm0 +; AVX2-NEXT: vpermq {{.*#+}} ymm0 = ymm0[0,2,1,3] +; AVX2-NEXT: vextracti128 $1, %ymm0, %xmm1 +; AVX2-NEXT: vpacksswb %xmm1, %xmm0, %xmm0 +; AVX2-NEXT: vmovdqa %xmm0, (%rsi) +; AVX2-NEXT: vzeroupper +; AVX2-NEXT: retq +; +; AVX512-LABEL: trunc_ssat_v16i32_v16i8_store: +; AVX512: # %bb.0: +; AVX512-NEXT: vmovdqa64 (%rdi), %zmm0 +; AVX512-NEXT: vpmovsdb %zmm0, (%rsi) +; AVX512-NEXT: vzeroupper +; AVX512-NEXT: retq +; +; SKX-LABEL: trunc_ssat_v16i32_v16i8_store: +; SKX: # %bb.0: +; SKX-NEXT: vmovdqa (%rdi), %ymm0 +; SKX-NEXT: vmovdqa 32(%rdi), %ymm1 +; SKX-NEXT: vpmovsdb %ymm1, %xmm1 +; SKX-NEXT: vpmovsdb %ymm0, 
%xmm0 +; SKX-NEXT: vpunpcklqdq {{.*#+}} xmm0 = xmm0[0],xmm1[0] +; SKX-NEXT: vmovdqa %xmm0, (%rsi) +; SKX-NEXT: vzeroupper +; SKX-NEXT: retq + %a0 = load <16 x i32>, <16 x i32>* %p0 + %1 = icmp slt <16 x i32> %a0, + %2 = select <16 x i1> %1, <16 x i32> %a0, <16 x i32> + %3 = icmp sgt <16 x i32> %2, + %4 = select <16 x i1> %3, <16 x i32> %2, <16 x i32> + %5 = trunc <16 x i32> %4 to <16 x i8> + store <16 x i8> %5, <16 x i8>* %p1 + ret void +} + define <8 x i8> @trunc_ssat_v8i16_v8i8(<8 x i16> %a0) { ; SSE-LABEL: trunc_ssat_v8i16_v8i8: ; SSE: # %bb.0: @@ -4323,6 +4646,13 @@ define <8 x i8> @trunc_ssat_v8i16_v8i8(< ; AVX512BWVL-NEXT: vpmaxsw {{.*}}(%rip), %xmm0, %xmm0 ; AVX512BWVL-NEXT: vpacksswb %xmm0, %xmm0, %xmm0 ; AVX512BWVL-NEXT: retq +; +; SKX-LABEL: trunc_ssat_v8i16_v8i8: +; SKX: # %bb.0: +; SKX-NEXT: vpminsw {{.*}}(%rip), %xmm0, %xmm0 +; SKX-NEXT: vpmaxsw {{.*}}(%rip), %xmm0, %xmm0 +; SKX-NEXT: vpacksswb %xmm0, %xmm0, %xmm0 +; SKX-NEXT: retq %1 = icmp slt <8 x i16> %a0, %2 = select <8 x i1> %1, <8 x i16> %a0, <8 x i16> %3 = icmp sgt <8 x i16> %2, @@ -4366,6 +4696,11 @@ define void @trunc_ssat_v8i16_v8i8_store ; AVX512BWVL: # %bb.0: ; AVX512BWVL-NEXT: vpmovswb %xmm0, (%rdi) ; AVX512BWVL-NEXT: retq +; +; SKX-LABEL: trunc_ssat_v8i16_v8i8_store: +; SKX: # %bb.0: +; SKX-NEXT: vpmovswb %xmm0, (%rdi) +; SKX-NEXT: retq %1 = icmp slt <8 x i16> %a0, %2 = select <8 x i1> %1, <8 x i16> %a0, <8 x i16> %3 = icmp sgt <8 x i16> %2, @@ -4421,6 +4756,12 @@ define <16 x i8> @trunc_ssat_v16i16_v16i ; AVX512BWVL-NEXT: vpmovswb %ymm0, %xmm0 ; AVX512BWVL-NEXT: vzeroupper ; AVX512BWVL-NEXT: retq +; +; SKX-LABEL: trunc_ssat_v16i16_v16i8: +; SKX: # %bb.0: +; SKX-NEXT: vpmovswb %ymm0, %xmm0 +; SKX-NEXT: vzeroupper +; SKX-NEXT: retq %1 = icmp slt <16 x i16> %a0, %2 = select <16 x i1> %1, <16 x i16> %a0, <16 x i16> %3 = icmp sgt <16 x i16> %2, @@ -4429,52 +4770,69 @@ define <16 x i8> @trunc_ssat_v16i16_v16i ret <16 x i8> %5 } -define <32 x i8> @trunc_ssat_v32i16_v32i8(<32 x i16> %a0) { 
+define <32 x i8> @trunc_ssat_v32i16_v32i8(<32 x i16>* %p0) "min-legal-vector-width"="256" { ; SSE-LABEL: trunc_ssat_v32i16_v32i8: ; SSE: # %bb.0: -; SSE-NEXT: packsswb %xmm1, %xmm0 -; SSE-NEXT: packsswb %xmm3, %xmm2 -; SSE-NEXT: movdqa %xmm2, %xmm1 +; SSE-NEXT: movdqa (%rdi), %xmm0 +; SSE-NEXT: movdqa 32(%rdi), %xmm1 +; SSE-NEXT: packsswb 16(%rdi), %xmm0 +; SSE-NEXT: packsswb 48(%rdi), %xmm1 ; SSE-NEXT: retq ; ; AVX1-LABEL: trunc_ssat_v32i16_v32i8: ; AVX1: # %bb.0: -; AVX1-NEXT: vextractf128 $1, %ymm1, %xmm2 -; AVX1-NEXT: vpacksswb %xmm2, %xmm1, %xmm1 -; AVX1-NEXT: vextractf128 $1, %ymm0, %xmm2 -; AVX1-NEXT: vpacksswb %xmm2, %xmm0, %xmm0 +; AVX1-NEXT: vmovdqa (%rdi), %xmm0 +; AVX1-NEXT: vmovdqa 32(%rdi), %xmm1 +; AVX1-NEXT: vpacksswb 48(%rdi), %xmm1, %xmm1 +; AVX1-NEXT: vpacksswb 16(%rdi), %xmm0, %xmm0 ; AVX1-NEXT: vinsertf128 $1, %xmm1, %ymm0, %ymm0 ; AVX1-NEXT: retq ; ; AVX2-LABEL: trunc_ssat_v32i16_v32i8: ; AVX2: # %bb.0: -; AVX2-NEXT: vpacksswb %ymm1, %ymm0, %ymm0 +; AVX2-NEXT: vmovdqa (%rdi), %ymm0 +; AVX2-NEXT: vpacksswb 32(%rdi), %ymm0, %ymm0 ; AVX2-NEXT: vpermq {{.*#+}} ymm0 = ymm0[0,2,1,3] ; AVX2-NEXT: retq ; ; AVX512F-LABEL: trunc_ssat_v32i16_v32i8: ; AVX512F: # %bb.0: -; AVX512F-NEXT: vextracti64x4 $1, %zmm0, %ymm1 -; AVX512F-NEXT: vpacksswb %ymm1, %ymm0, %ymm0 +; AVX512F-NEXT: vmovdqa (%rdi), %ymm0 +; AVX512F-NEXT: vpacksswb 32(%rdi), %ymm0, %ymm0 ; AVX512F-NEXT: vpermq {{.*#+}} ymm0 = ymm0[0,2,1,3] ; AVX512F-NEXT: retq ; ; AVX512VL-LABEL: trunc_ssat_v32i16_v32i8: ; AVX512VL: # %bb.0: -; AVX512VL-NEXT: vextracti64x4 $1, %zmm0, %ymm1 -; AVX512VL-NEXT: vpacksswb %ymm1, %ymm0, %ymm0 +; AVX512VL-NEXT: vmovdqa (%rdi), %ymm0 +; AVX512VL-NEXT: vpacksswb 32(%rdi), %ymm0, %ymm0 ; AVX512VL-NEXT: vpermq {{.*#+}} ymm0 = ymm0[0,2,1,3] ; AVX512VL-NEXT: retq ; ; AVX512BW-LABEL: trunc_ssat_v32i16_v32i8: ; AVX512BW: # %bb.0: +; AVX512BW-NEXT: vmovdqa64 (%rdi), %zmm0 ; AVX512BW-NEXT: vpmovswb %zmm0, %ymm0 ; AVX512BW-NEXT: retq ; ; AVX512BWVL-LABEL: 
trunc_ssat_v32i16_v32i8: ; AVX512BWVL: # %bb.0: +; AVX512BWVL-NEXT: vmovdqa64 (%rdi), %zmm0 ; AVX512BWVL-NEXT: vpmovswb %zmm0, %ymm0 ; AVX512BWVL-NEXT: retq +; +; SKX-LABEL: trunc_ssat_v32i16_v32i8: +; SKX: # %bb.0: +; SKX-NEXT: vmovdqa {{.*#+}} ymm0 = [127,127,127,127,127,127,127,127,127,127,127,127,127,127,127,127] +; SKX-NEXT: vpminsw (%rdi), %ymm0, %ymm1 +; SKX-NEXT: vpminsw 32(%rdi), %ymm0, %ymm0 +; SKX-NEXT: vmovdqa {{.*#+}} ymm2 = [65408,65408,65408,65408,65408,65408,65408,65408,65408,65408,65408,65408,65408,65408,65408,65408] +; SKX-NEXT: vpmaxsw %ymm2, %ymm0, %ymm0 +; SKX-NEXT: vpmaxsw %ymm2, %ymm1, %ymm1 +; SKX-NEXT: vpacksswb %ymm0, %ymm1, %ymm0 +; SKX-NEXT: vpermq {{.*#+}} ymm0 = ymm0[0,2,1,3] +; SKX-NEXT: retq + %a0 = load <32 x i16>, <32 x i16>* %p0 %1 = icmp slt <32 x i16> %a0, %2 = select <32 x i1> %1, <32 x i16> %a0, <32 x i16> %3 = icmp sgt <32 x i16> %2, @@ -4483,49 +4841,72 @@ define <32 x i8> @trunc_ssat_v32i16_v32i ret <32 x i8> %5 } -define <32 x i8> @trunc_ssat_v32i32_v32i8(<32 x i32> %a0) { +define <32 x i8> @trunc_ssat_v32i32_v32i8(<32 x i32>* %p0) "min-legal-vector-width"="256" { ; SSE-LABEL: trunc_ssat_v32i32_v32i8: ; SSE: # %bb.0: -; SSE-NEXT: packssdw %xmm3, %xmm2 -; SSE-NEXT: packssdw %xmm1, %xmm0 +; SSE-NEXT: movdqa (%rdi), %xmm0 +; SSE-NEXT: movdqa 32(%rdi), %xmm2 +; SSE-NEXT: movdqa 64(%rdi), %xmm1 +; SSE-NEXT: movdqa 96(%rdi), %xmm3 +; SSE-NEXT: packssdw 48(%rdi), %xmm2 +; SSE-NEXT: packssdw 16(%rdi), %xmm0 ; SSE-NEXT: packsswb %xmm2, %xmm0 -; SSE-NEXT: packssdw %xmm7, %xmm6 -; SSE-NEXT: packssdw %xmm5, %xmm4 -; SSE-NEXT: packsswb %xmm6, %xmm4 -; SSE-NEXT: movdqa %xmm4, %xmm1 +; SSE-NEXT: packssdw 112(%rdi), %xmm3 +; SSE-NEXT: packssdw 80(%rdi), %xmm1 +; SSE-NEXT: packsswb %xmm3, %xmm1 ; SSE-NEXT: retq ; ; AVX1-LABEL: trunc_ssat_v32i32_v32i8: ; AVX1: # %bb.0: -; AVX1-NEXT: vextractf128 $1, %ymm3, %xmm4 -; AVX1-NEXT: vpackssdw %xmm4, %xmm3, %xmm3 -; AVX1-NEXT: vextractf128 $1, %ymm2, %xmm4 -; AVX1-NEXT: vpackssdw %xmm4, %xmm2, 
%xmm2 +; AVX1-NEXT: vmovdqa (%rdi), %xmm0 +; AVX1-NEXT: vmovdqa 32(%rdi), %xmm1 +; AVX1-NEXT: vmovdqa 64(%rdi), %xmm2 +; AVX1-NEXT: vmovdqa 96(%rdi), %xmm3 +; AVX1-NEXT: vpackssdw 112(%rdi), %xmm3, %xmm3 +; AVX1-NEXT: vpackssdw 80(%rdi), %xmm2, %xmm2 ; AVX1-NEXT: vpacksswb %xmm3, %xmm2, %xmm2 -; AVX1-NEXT: vextractf128 $1, %ymm1, %xmm3 -; AVX1-NEXT: vpackssdw %xmm3, %xmm1, %xmm1 -; AVX1-NEXT: vextractf128 $1, %ymm0, %xmm3 -; AVX1-NEXT: vpackssdw %xmm3, %xmm0, %xmm0 +; AVX1-NEXT: vpackssdw 48(%rdi), %xmm1, %xmm1 +; AVX1-NEXT: vpackssdw 16(%rdi), %xmm0, %xmm0 ; AVX1-NEXT: vpacksswb %xmm1, %xmm0, %xmm0 ; AVX1-NEXT: vinsertf128 $1, %xmm2, %ymm0, %ymm0 ; AVX1-NEXT: retq ; ; AVX2-LABEL: trunc_ssat_v32i32_v32i8: ; AVX2: # %bb.0: -; AVX2-NEXT: vpackssdw %ymm3, %ymm2, %ymm2 -; AVX2-NEXT: vpermq {{.*#+}} ymm2 = ymm2[0,2,1,3] -; AVX2-NEXT: vpackssdw %ymm1, %ymm0, %ymm0 +; AVX2-NEXT: vmovdqa (%rdi), %ymm0 +; AVX2-NEXT: vmovdqa 64(%rdi), %ymm1 +; AVX2-NEXT: vpackssdw 96(%rdi), %ymm1, %ymm1 +; AVX2-NEXT: vpermq {{.*#+}} ymm1 = ymm1[0,2,1,3] +; AVX2-NEXT: vpackssdw 32(%rdi), %ymm0, %ymm0 ; AVX2-NEXT: vpermq {{.*#+}} ymm0 = ymm0[0,2,1,3] -; AVX2-NEXT: vpacksswb %ymm2, %ymm0, %ymm0 +; AVX2-NEXT: vpacksswb %ymm1, %ymm0, %ymm0 ; AVX2-NEXT: vpermq {{.*#+}} ymm0 = ymm0[0,2,1,3] ; AVX2-NEXT: retq ; ; AVX512-LABEL: trunc_ssat_v32i32_v32i8: ; AVX512: # %bb.0: +; AVX512-NEXT: vmovdqa64 (%rdi), %zmm0 +; AVX512-NEXT: vmovdqa64 64(%rdi), %zmm1 ; AVX512-NEXT: vpmovsdb %zmm0, %xmm0 ; AVX512-NEXT: vpmovsdb %zmm1, %xmm1 ; AVX512-NEXT: vinserti128 $1, %xmm1, %ymm0, %ymm0 ; AVX512-NEXT: retq +; +; SKX-LABEL: trunc_ssat_v32i32_v32i8: +; SKX: # %bb.0: +; SKX-NEXT: vmovdqa (%rdi), %ymm0 +; SKX-NEXT: vmovdqa 32(%rdi), %ymm1 +; SKX-NEXT: vmovdqa 64(%rdi), %ymm2 +; SKX-NEXT: vmovdqa 96(%rdi), %ymm3 +; SKX-NEXT: vpmovsdb %ymm3, %xmm3 +; SKX-NEXT: vpmovsdb %ymm2, %xmm2 +; SKX-NEXT: vpunpcklqdq {{.*#+}} xmm2 = xmm2[0],xmm3[0] +; SKX-NEXT: vpmovsdb %ymm1, %xmm1 +; SKX-NEXT: vpmovsdb %ymm0, %xmm0 +; SKX-NEXT: 
vpunpcklqdq {{.*#+}} xmm0 = xmm0[0],xmm1[0] +; SKX-NEXT: vinserti128 $1, %xmm2, %ymm0, %ymm0 +; SKX-NEXT: retq + %a0 = load <32 x i32>, <32 x i32>* %p0 %1 = icmp slt <32 x i32> %a0, %2 = select <32 x i1> %1, <32 x i32> %a0, <32 x i32> %3 = icmp sgt <32 x i32> %2, Modified: llvm/trunk/test/CodeGen/X86/vector-trunc-usat.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/vector-trunc-usat.ll?rev=374642&r1=374641&r2=374642&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/vector-trunc-usat.ll (original) +++ llvm/trunk/test/CodeGen/X86/vector-trunc-usat.ll Sat Oct 12 00:59:24 2019 @@ -9,6 +9,7 @@ ; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=+avx512vl,+fast-variable-shuffle | FileCheck %s --check-prefixes=AVX512,AVX512VL ; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=+avx512bw,+fast-variable-shuffle | FileCheck %s --check-prefixes=AVX512,AVX512BW ; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=+avx512bw,+avx512vl,+fast-variable-shuffle | FileCheck %s --check-prefixes=AVX512,AVX512BWVL +; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mcpu=skx | FileCheck %s --check-prefixes=SKX ; ; Unsigned saturation truncation to vXi32 @@ -192,214 +193,235 @@ define <4 x i32> @trunc_usat_v4i64_v4i32 ; AVX512BWVL-NEXT: vpmovqd %ymm1, %xmm0 ; AVX512BWVL-NEXT: vzeroupper ; AVX512BWVL-NEXT: retq +; +; SKX-LABEL: trunc_usat_v4i64_v4i32: +; SKX: # %bb.0: +; SKX-NEXT: vpcmpltuq {{.*}}(%rip){1to4}, %ymm0, %k1 +; SKX-NEXT: vmovdqa {{.*#+}} ymm1 = [4294967295,4294967295,4294967295,429496729] +; SKX-NEXT: vmovdqa64 %ymm0, %ymm1 {%k1} +; SKX-NEXT: vpmovqd %ymm1, %xmm0 +; SKX-NEXT: vzeroupper +; SKX-NEXT: retq %1 = icmp ult <4 x i64> %a0, %2 = select <4 x i1> %1, <4 x i64> %a0, <4 x i64> %3 = trunc <4 x i64> %2 to <4 x i32> ret <4 x i32> %3 } -define <8 x i32> @trunc_usat_v8i64_v8i32(<8 x i64> %a0) { +define <8 x i32> @trunc_usat_v8i64_v8i32(<8 x i64>* %p0) { ; SSE2-LABEL: 
trunc_usat_v8i64_v8i32: ; SSE2: # %bb.0: +; SSE2-NEXT: movdqa (%rdi), %xmm9 +; SSE2-NEXT: movdqa 16(%rdi), %xmm5 +; SSE2-NEXT: movdqa 32(%rdi), %xmm6 +; SSE2-NEXT: movdqa 48(%rdi), %xmm1 ; SSE2-NEXT: movdqa {{.*#+}} xmm8 = [4294967295,4294967295] -; SSE2-NEXT: movdqa {{.*#+}} xmm5 = [9223372039002259456,9223372039002259456] -; SSE2-NEXT: movdqa %xmm3, %xmm7 -; SSE2-NEXT: pxor %xmm5, %xmm7 -; SSE2-NEXT: movdqa {{.*#+}} xmm9 = [9223372039002259455,9223372039002259455] -; SSE2-NEXT: movdqa %xmm9, %xmm6 -; SSE2-NEXT: pcmpgtd %xmm7, %xmm6 -; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm6[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm9, %xmm7 +; SSE2-NEXT: movdqa {{.*#+}} xmm0 = [9223372039002259456,9223372039002259456] +; SSE2-NEXT: movdqa %xmm1, %xmm7 +; SSE2-NEXT: pxor %xmm0, %xmm7 +; SSE2-NEXT: movdqa {{.*#+}} xmm10 = [9223372039002259455,9223372039002259455] +; SSE2-NEXT: movdqa %xmm10, %xmm4 +; SSE2-NEXT: pcmpgtd %xmm7, %xmm4 +; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm4[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm10, %xmm7 ; SSE2-NEXT: pshufd {{.*#+}} xmm7 = xmm7[1,1,3,3] -; SSE2-NEXT: pand %xmm4, %xmm7 -; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm6[1,1,3,3] -; SSE2-NEXT: por %xmm7, %xmm4 -; SSE2-NEXT: pand %xmm4, %xmm3 -; SSE2-NEXT: pandn %xmm8, %xmm4 -; SSE2-NEXT: por %xmm3, %xmm4 -; SSE2-NEXT: movdqa %xmm2, %xmm3 -; SSE2-NEXT: pxor %xmm5, %xmm3 -; SSE2-NEXT: movdqa %xmm9, %xmm6 -; SSE2-NEXT: pcmpgtd %xmm3, %xmm6 -; SSE2-NEXT: pshufd {{.*#+}} xmm10 = xmm6[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm9, %xmm3 -; SSE2-NEXT: pshufd {{.*#+}} xmm7 = xmm3[1,1,3,3] -; SSE2-NEXT: pand %xmm10, %xmm7 -; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm6[1,1,3,3] -; SSE2-NEXT: por %xmm7, %xmm3 -; SSE2-NEXT: pand %xmm3, %xmm2 -; SSE2-NEXT: pandn %xmm8, %xmm3 -; SSE2-NEXT: por %xmm2, %xmm3 -; SSE2-NEXT: shufps {{.*#+}} xmm3 = xmm3[0,2],xmm4[0,2] -; SSE2-NEXT: movdqa %xmm1, %xmm2 -; SSE2-NEXT: pxor %xmm5, %xmm2 -; SSE2-NEXT: movdqa %xmm9, %xmm4 -; SSE2-NEXT: pcmpgtd %xmm2, %xmm4 -; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm4[0,0,2,2] -; 
SSE2-NEXT: pcmpeqd %xmm9, %xmm2 -; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] -; SSE2-NEXT: pand %xmm6, %xmm2 -; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm4[1,1,3,3] -; SSE2-NEXT: por %xmm2, %xmm4 -; SSE2-NEXT: pand %xmm4, %xmm1 -; SSE2-NEXT: pandn %xmm8, %xmm4 -; SSE2-NEXT: por %xmm1, %xmm4 -; SSE2-NEXT: pxor %xmm0, %xmm5 -; SSE2-NEXT: movdqa %xmm9, %xmm1 -; SSE2-NEXT: pcmpgtd %xmm5, %xmm1 -; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm1[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm9, %xmm5 -; SSE2-NEXT: pshufd {{.*#+}} xmm5 = xmm5[1,1,3,3] -; SSE2-NEXT: pand %xmm2, %xmm5 -; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] -; SSE2-NEXT: por %xmm5, %xmm1 -; SSE2-NEXT: pand %xmm1, %xmm0 +; SSE2-NEXT: pand %xmm2, %xmm7 +; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm4[1,1,3,3] +; SSE2-NEXT: por %xmm7, %xmm2 +; SSE2-NEXT: pand %xmm2, %xmm1 +; SSE2-NEXT: pandn %xmm8, %xmm2 +; SSE2-NEXT: por %xmm1, %xmm2 +; SSE2-NEXT: movdqa %xmm6, %xmm1 +; SSE2-NEXT: pxor %xmm0, %xmm1 +; SSE2-NEXT: movdqa %xmm10, %xmm4 +; SSE2-NEXT: pcmpgtd %xmm1, %xmm4 +; SSE2-NEXT: pshufd {{.*#+}} xmm7 = xmm4[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm10, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm1[1,1,3,3] +; SSE2-NEXT: pand %xmm7, %xmm3 +; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm4[1,1,3,3] +; SSE2-NEXT: por %xmm3, %xmm1 +; SSE2-NEXT: pand %xmm1, %xmm6 ; SSE2-NEXT: pandn %xmm8, %xmm1 -; SSE2-NEXT: por %xmm1, %xmm0 -; SSE2-NEXT: shufps {{.*#+}} xmm0 = xmm0[0,2],xmm4[0,2] -; SSE2-NEXT: movaps %xmm3, %xmm1 +; SSE2-NEXT: por %xmm6, %xmm1 +; SSE2-NEXT: shufps {{.*#+}} xmm1 = xmm1[0,2],xmm2[0,2] +; SSE2-NEXT: movdqa %xmm5, %xmm2 +; SSE2-NEXT: pxor %xmm0, %xmm2 +; SSE2-NEXT: movdqa %xmm10, %xmm3 +; SSE2-NEXT: pcmpgtd %xmm2, %xmm3 +; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm3[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm10, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] +; SSE2-NEXT: pand %xmm4, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm3[1,1,3,3] +; SSE2-NEXT: por %xmm2, %xmm3 +; SSE2-NEXT: pand %xmm3, %xmm5 +; SSE2-NEXT: pandn %xmm8, 
%xmm3 +; SSE2-NEXT: por %xmm5, %xmm3 +; SSE2-NEXT: pxor %xmm9, %xmm0 +; SSE2-NEXT: movdqa %xmm10, %xmm2 +; SSE2-NEXT: pcmpgtd %xmm0, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm2[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm10, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm5 = xmm0[1,1,3,3] +; SSE2-NEXT: pand %xmm4, %xmm5 +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm2[1,1,3,3] +; SSE2-NEXT: por %xmm5, %xmm0 +; SSE2-NEXT: pand %xmm0, %xmm9 +; SSE2-NEXT: pandn %xmm8, %xmm0 +; SSE2-NEXT: por %xmm9, %xmm0 +; SSE2-NEXT: shufps {{.*#+}} xmm0 = xmm0[0,2],xmm3[0,2] ; SSE2-NEXT: retq ; ; SSSE3-LABEL: trunc_usat_v8i64_v8i32: ; SSSE3: # %bb.0: +; SSSE3-NEXT: movdqa (%rdi), %xmm9 +; SSSE3-NEXT: movdqa 16(%rdi), %xmm5 +; SSSE3-NEXT: movdqa 32(%rdi), %xmm6 +; SSSE3-NEXT: movdqa 48(%rdi), %xmm1 ; SSSE3-NEXT: movdqa {{.*#+}} xmm8 = [4294967295,4294967295] -; SSSE3-NEXT: movdqa {{.*#+}} xmm5 = [9223372039002259456,9223372039002259456] -; SSSE3-NEXT: movdqa %xmm3, %xmm7 -; SSSE3-NEXT: pxor %xmm5, %xmm7 -; SSSE3-NEXT: movdqa {{.*#+}} xmm9 = [9223372039002259455,9223372039002259455] -; SSSE3-NEXT: movdqa %xmm9, %xmm6 -; SSSE3-NEXT: pcmpgtd %xmm7, %xmm6 -; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm6[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm9, %xmm7 +; SSSE3-NEXT: movdqa {{.*#+}} xmm0 = [9223372039002259456,9223372039002259456] +; SSSE3-NEXT: movdqa %xmm1, %xmm7 +; SSSE3-NEXT: pxor %xmm0, %xmm7 +; SSSE3-NEXT: movdqa {{.*#+}} xmm10 = [9223372039002259455,9223372039002259455] +; SSSE3-NEXT: movdqa %xmm10, %xmm4 +; SSSE3-NEXT: pcmpgtd %xmm7, %xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm4[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm10, %xmm7 ; SSSE3-NEXT: pshufd {{.*#+}} xmm7 = xmm7[1,1,3,3] -; SSSE3-NEXT: pand %xmm4, %xmm7 -; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm6[1,1,3,3] -; SSSE3-NEXT: por %xmm7, %xmm4 -; SSSE3-NEXT: pand %xmm4, %xmm3 -; SSSE3-NEXT: pandn %xmm8, %xmm4 -; SSSE3-NEXT: por %xmm3, %xmm4 -; SSSE3-NEXT: movdqa %xmm2, %xmm3 -; SSSE3-NEXT: pxor %xmm5, %xmm3 -; SSSE3-NEXT: movdqa %xmm9, %xmm6 -; SSSE3-NEXT: 
pcmpgtd %xmm3, %xmm6 -; SSSE3-NEXT: pshufd {{.*#+}} xmm10 = xmm6[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm9, %xmm3 -; SSSE3-NEXT: pshufd {{.*#+}} xmm7 = xmm3[1,1,3,3] -; SSSE3-NEXT: pand %xmm10, %xmm7 -; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm6[1,1,3,3] -; SSSE3-NEXT: por %xmm7, %xmm3 -; SSSE3-NEXT: pand %xmm3, %xmm2 -; SSSE3-NEXT: pandn %xmm8, %xmm3 -; SSSE3-NEXT: por %xmm2, %xmm3 -; SSSE3-NEXT: shufps {{.*#+}} xmm3 = xmm3[0,2],xmm4[0,2] -; SSSE3-NEXT: movdqa %xmm1, %xmm2 -; SSSE3-NEXT: pxor %xmm5, %xmm2 -; SSSE3-NEXT: movdqa %xmm9, %xmm4 -; SSSE3-NEXT: pcmpgtd %xmm2, %xmm4 -; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm4[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm9, %xmm2 -; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] -; SSSE3-NEXT: pand %xmm6, %xmm2 -; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm4[1,1,3,3] -; SSSE3-NEXT: por %xmm2, %xmm4 -; SSSE3-NEXT: pand %xmm4, %xmm1 -; SSSE3-NEXT: pandn %xmm8, %xmm4 -; SSSE3-NEXT: por %xmm1, %xmm4 -; SSSE3-NEXT: pxor %xmm0, %xmm5 -; SSSE3-NEXT: movdqa %xmm9, %xmm1 -; SSSE3-NEXT: pcmpgtd %xmm5, %xmm1 -; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm1[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm9, %xmm5 -; SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm5[1,1,3,3] -; SSSE3-NEXT: pand %xmm2, %xmm5 -; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] -; SSSE3-NEXT: por %xmm5, %xmm1 -; SSSE3-NEXT: pand %xmm1, %xmm0 +; SSSE3-NEXT: pand %xmm2, %xmm7 +; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm4[1,1,3,3] +; SSSE3-NEXT: por %xmm7, %xmm2 +; SSSE3-NEXT: pand %xmm2, %xmm1 +; SSSE3-NEXT: pandn %xmm8, %xmm2 +; SSSE3-NEXT: por %xmm1, %xmm2 +; SSSE3-NEXT: movdqa %xmm6, %xmm1 +; SSSE3-NEXT: pxor %xmm0, %xmm1 +; SSSE3-NEXT: movdqa %xmm10, %xmm4 +; SSSE3-NEXT: pcmpgtd %xmm1, %xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm7 = xmm4[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm10, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm1[1,1,3,3] +; SSSE3-NEXT: pand %xmm7, %xmm3 +; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm4[1,1,3,3] +; SSSE3-NEXT: por %xmm3, %xmm1 +; SSSE3-NEXT: pand %xmm1, %xmm6 ; SSSE3-NEXT: pandn %xmm8, 
%xmm1 -; SSSE3-NEXT: por %xmm1, %xmm0 -; SSSE3-NEXT: shufps {{.*#+}} xmm0 = xmm0[0,2],xmm4[0,2] -; SSSE3-NEXT: movaps %xmm3, %xmm1 +; SSSE3-NEXT: por %xmm6, %xmm1 +; SSSE3-NEXT: shufps {{.*#+}} xmm1 = xmm1[0,2],xmm2[0,2] +; SSSE3-NEXT: movdqa %xmm5, %xmm2 +; SSSE3-NEXT: pxor %xmm0, %xmm2 +; SSSE3-NEXT: movdqa %xmm10, %xmm3 +; SSSE3-NEXT: pcmpgtd %xmm2, %xmm3 +; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm3[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm10, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] +; SSSE3-NEXT: pand %xmm4, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm3[1,1,3,3] +; SSSE3-NEXT: por %xmm2, %xmm3 +; SSSE3-NEXT: pand %xmm3, %xmm5 +; SSSE3-NEXT: pandn %xmm8, %xmm3 +; SSSE3-NEXT: por %xmm5, %xmm3 +; SSSE3-NEXT: pxor %xmm9, %xmm0 +; SSSE3-NEXT: movdqa %xmm10, %xmm2 +; SSSE3-NEXT: pcmpgtd %xmm0, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm2[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm10, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm0[1,1,3,3] +; SSSE3-NEXT: pand %xmm4, %xmm5 +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm2[1,1,3,3] +; SSSE3-NEXT: por %xmm5, %xmm0 +; SSSE3-NEXT: pand %xmm0, %xmm9 +; SSSE3-NEXT: pandn %xmm8, %xmm0 +; SSSE3-NEXT: por %xmm9, %xmm0 +; SSSE3-NEXT: shufps {{.*#+}} xmm0 = xmm0[0,2],xmm3[0,2] ; SSSE3-NEXT: retq ; ; SSE41-LABEL: trunc_usat_v8i64_v8i32: ; SSE41: # %bb.0: -; SSE41-NEXT: movdqa %xmm0, %xmm8 -; SSE41-NEXT: movapd {{.*#+}} xmm9 = [4294967295,4294967295] +; SSE41-NEXT: movdqa (%rdi), %xmm8 +; SSE41-NEXT: movdqa 16(%rdi), %xmm9 +; SSE41-NEXT: movdqa 32(%rdi), %xmm7 +; SSE41-NEXT: movdqa 48(%rdi), %xmm1 +; SSE41-NEXT: movapd {{.*#+}} xmm2 = [4294967295,4294967295] ; SSE41-NEXT: movdqa {{.*#+}} xmm5 = [9223372039002259456,9223372039002259456] -; SSE41-NEXT: movdqa %xmm3, %xmm0 +; SSE41-NEXT: movdqa %xmm1, %xmm0 ; SSE41-NEXT: pxor %xmm5, %xmm0 -; SSE41-NEXT: movdqa {{.*#+}} xmm4 = [9223372039002259455,9223372039002259455] -; SSE41-NEXT: movdqa %xmm4, %xmm6 +; SSE41-NEXT: movdqa {{.*#+}} xmm3 = 
[9223372039002259455,9223372039002259455] +; SSE41-NEXT: movdqa %xmm3, %xmm6 ; SSE41-NEXT: pcmpeqd %xmm0, %xmm6 -; SSE41-NEXT: movdqa %xmm4, %xmm7 -; SSE41-NEXT: pcmpgtd %xmm0, %xmm7 -; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm7[0,0,2,2] +; SSE41-NEXT: movdqa %xmm3, %xmm4 +; SSE41-NEXT: pcmpgtd %xmm0, %xmm4 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm4[0,0,2,2] ; SSE41-NEXT: pand %xmm6, %xmm0 -; SSE41-NEXT: por %xmm7, %xmm0 -; SSE41-NEXT: movapd %xmm9, %xmm6 -; SSE41-NEXT: blendvpd %xmm0, %xmm3, %xmm6 -; SSE41-NEXT: movdqa %xmm2, %xmm0 +; SSE41-NEXT: por %xmm4, %xmm0 +; SSE41-NEXT: movapd %xmm2, %xmm4 +; SSE41-NEXT: blendvpd %xmm0, %xmm1, %xmm4 +; SSE41-NEXT: movdqa %xmm7, %xmm0 ; SSE41-NEXT: pxor %xmm5, %xmm0 -; SSE41-NEXT: movdqa %xmm4, %xmm3 -; SSE41-NEXT: pcmpeqd %xmm0, %xmm3 -; SSE41-NEXT: movdqa %xmm4, %xmm7 -; SSE41-NEXT: pcmpgtd %xmm0, %xmm7 -; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm7[0,0,2,2] -; SSE41-NEXT: pand %xmm3, %xmm0 -; SSE41-NEXT: por %xmm7, %xmm0 -; SSE41-NEXT: movapd %xmm9, %xmm3 -; SSE41-NEXT: blendvpd %xmm0, %xmm2, %xmm3 -; SSE41-NEXT: shufps {{.*#+}} xmm3 = xmm3[0,2],xmm6[0,2] -; SSE41-NEXT: movdqa %xmm1, %xmm0 +; SSE41-NEXT: movdqa %xmm3, %xmm1 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm1 +; SSE41-NEXT: movdqa %xmm3, %xmm6 +; SSE41-NEXT: pcmpgtd %xmm0, %xmm6 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm6[0,0,2,2] +; SSE41-NEXT: pand %xmm1, %xmm0 +; SSE41-NEXT: por %xmm6, %xmm0 +; SSE41-NEXT: movapd %xmm2, %xmm1 +; SSE41-NEXT: blendvpd %xmm0, %xmm7, %xmm1 +; SSE41-NEXT: shufps {{.*#+}} xmm1 = xmm1[0,2],xmm4[0,2] +; SSE41-NEXT: movdqa %xmm9, %xmm0 ; SSE41-NEXT: pxor %xmm5, %xmm0 -; SSE41-NEXT: movdqa %xmm4, %xmm2 -; SSE41-NEXT: pcmpeqd %xmm0, %xmm2 -; SSE41-NEXT: movdqa %xmm4, %xmm6 +; SSE41-NEXT: movdqa %xmm3, %xmm4 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm4 +; SSE41-NEXT: movdqa %xmm3, %xmm6 ; SSE41-NEXT: pcmpgtd %xmm0, %xmm6 ; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm6[0,0,2,2] -; SSE41-NEXT: pand %xmm2, %xmm0 +; SSE41-NEXT: pand %xmm4, %xmm0 ; SSE41-NEXT: por %xmm6, 
%xmm0 -; SSE41-NEXT: movapd %xmm9, %xmm2 -; SSE41-NEXT: blendvpd %xmm0, %xmm1, %xmm2 +; SSE41-NEXT: movapd %xmm2, %xmm4 +; SSE41-NEXT: blendvpd %xmm0, %xmm9, %xmm4 ; SSE41-NEXT: pxor %xmm8, %xmm5 -; SSE41-NEXT: movdqa %xmm4, %xmm1 -; SSE41-NEXT: pcmpeqd %xmm5, %xmm1 -; SSE41-NEXT: pcmpgtd %xmm5, %xmm4 -; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm4[0,0,2,2] -; SSE41-NEXT: pand %xmm1, %xmm0 -; SSE41-NEXT: por %xmm4, %xmm0 -; SSE41-NEXT: blendvpd %xmm0, %xmm8, %xmm9 -; SSE41-NEXT: shufps {{.*#+}} xmm9 = xmm9[0,2],xmm2[0,2] -; SSE41-NEXT: movaps %xmm9, %xmm0 -; SSE41-NEXT: movaps %xmm3, %xmm1 +; SSE41-NEXT: movdqa %xmm3, %xmm6 +; SSE41-NEXT: pcmpeqd %xmm5, %xmm6 +; SSE41-NEXT: pcmpgtd %xmm5, %xmm3 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm3[0,0,2,2] +; SSE41-NEXT: pand %xmm6, %xmm0 +; SSE41-NEXT: por %xmm3, %xmm0 +; SSE41-NEXT: blendvpd %xmm0, %xmm8, %xmm2 +; SSE41-NEXT: shufps {{.*#+}} xmm2 = xmm2[0,2],xmm4[0,2] +; SSE41-NEXT: movaps %xmm2, %xmm0 ; SSE41-NEXT: retq ; ; AVX1-LABEL: trunc_usat_v8i64_v8i32: ; AVX1: # %bb.0: -; AVX1-NEXT: vmovdqa {{.*#+}} xmm2 = [9223372036854775808,9223372036854775808] -; AVX1-NEXT: vpxor %xmm2, %xmm0, %xmm3 -; AVX1-NEXT: vmovdqa {{.*#+}} xmm4 = [9223372041149743103,9223372041149743103] -; AVX1-NEXT: vpcmpgtq %xmm3, %xmm4, %xmm8 -; AVX1-NEXT: vextractf128 $1, %ymm0, %xmm5 -; AVX1-NEXT: vpxor %xmm2, %xmm5, %xmm6 -; AVX1-NEXT: vpcmpgtq %xmm6, %xmm4, %xmm6 -; AVX1-NEXT: vpxor %xmm2, %xmm1, %xmm7 -; AVX1-NEXT: vpcmpgtq %xmm7, %xmm4, %xmm7 -; AVX1-NEXT: vextractf128 $1, %ymm1, %xmm3 -; AVX1-NEXT: vpxor %xmm2, %xmm3, %xmm2 -; AVX1-NEXT: vpcmpgtq %xmm2, %xmm4, %xmm2 -; AVX1-NEXT: vmovapd {{.*#+}} xmm4 = [4294967295,4294967295] -; AVX1-NEXT: vblendvpd %xmm2, %xmm3, %xmm4, %xmm2 -; AVX1-NEXT: vblendvpd %xmm7, %xmm1, %xmm4, %xmm1 -; AVX1-NEXT: vshufps {{.*#+}} xmm1 = xmm1[0,2],xmm2[0,2] -; AVX1-NEXT: vblendvpd %xmm6, %xmm5, %xmm4, %xmm2 -; AVX1-NEXT: vblendvpd %xmm8, %xmm0, %xmm4, %xmm0 -; AVX1-NEXT: vshufps {{.*#+}} xmm0 = xmm0[0,2],xmm2[0,2] -; 
AVX1-NEXT: vinsertf128 $1, %xmm1, %ymm0, %ymm0 +; AVX1-NEXT: vmovdqa (%rdi), %xmm0 +; AVX1-NEXT: vmovdqa 16(%rdi), %xmm1 +; AVX1-NEXT: vmovdqa 32(%rdi), %xmm2 +; AVX1-NEXT: vmovdqa 48(%rdi), %xmm3 +; AVX1-NEXT: vmovdqa {{.*#+}} xmm4 = [9223372036854775808,9223372036854775808] +; AVX1-NEXT: vpxor %xmm4, %xmm0, %xmm5 +; AVX1-NEXT: vmovdqa {{.*#+}} xmm6 = [9223372041149743103,9223372041149743103] +; AVX1-NEXT: vpcmpgtq %xmm5, %xmm6, %xmm8 +; AVX1-NEXT: vpxor %xmm4, %xmm1, %xmm7 +; AVX1-NEXT: vpcmpgtq %xmm7, %xmm6, %xmm7 +; AVX1-NEXT: vpxor %xmm4, %xmm2, %xmm5 +; AVX1-NEXT: vpcmpgtq %xmm5, %xmm6, %xmm5 +; AVX1-NEXT: vpxor %xmm4, %xmm3, %xmm4 +; AVX1-NEXT: vpcmpgtq %xmm4, %xmm6, %xmm4 +; AVX1-NEXT: vmovapd {{.*#+}} xmm6 = [4294967295,4294967295] +; AVX1-NEXT: vblendvpd %xmm4, %xmm3, %xmm6, %xmm3 +; AVX1-NEXT: vblendvpd %xmm5, %xmm2, %xmm6, %xmm2 +; AVX1-NEXT: vshufps {{.*#+}} xmm2 = xmm2[0,2],xmm3[0,2] +; AVX1-NEXT: vblendvpd %xmm7, %xmm1, %xmm6, %xmm1 +; AVX1-NEXT: vblendvpd %xmm8, %xmm0, %xmm6, %xmm0 +; AVX1-NEXT: vshufps {{.*#+}} xmm0 = xmm0[0,2],xmm1[0,2] +; AVX1-NEXT: vinsertf128 $1, %xmm2, %ymm0, %ymm0 ; AVX1-NEXT: retq ; ; AVX2-SLOW-LABEL: trunc_usat_v8i64_v8i32: ; AVX2-SLOW: # %bb.0: +; AVX2-SLOW-NEXT: vmovdqa (%rdi), %ymm0 +; AVX2-SLOW-NEXT: vmovdqa 32(%rdi), %ymm1 ; AVX2-SLOW-NEXT: vbroadcastsd {{.*#+}} ymm2 = [4294967295,4294967295,4294967295,4294967295] ; AVX2-SLOW-NEXT: vpbroadcastq {{.*#+}} ymm3 = [9223372036854775808,9223372036854775808,9223372036854775808,9223372036854775808] ; AVX2-SLOW-NEXT: vpxor %ymm3, %ymm0, %ymm4 @@ -418,6 +440,8 @@ define <8 x i32> @trunc_usat_v8i64_v8i32 ; ; AVX2-FAST-LABEL: trunc_usat_v8i64_v8i32: ; AVX2-FAST: # %bb.0: +; AVX2-FAST-NEXT: vmovdqa (%rdi), %ymm0 +; AVX2-FAST-NEXT: vmovdqa 32(%rdi), %ymm1 ; AVX2-FAST-NEXT: vbroadcastsd {{.*#+}} ymm2 = [4294967295,4294967295,4294967295,4294967295] ; AVX2-FAST-NEXT: vpbroadcastq {{.*#+}} ymm3 = [9223372036854775808,9223372036854775808,9223372036854775808,9223372036854775808] ; 
AVX2-FAST-NEXT: vpxor %ymm3, %ymm1, %ymm4 @@ -435,8 +459,16 @@ define <8 x i32> @trunc_usat_v8i64_v8i32 ; ; AVX512-LABEL: trunc_usat_v8i64_v8i32: ; AVX512: # %bb.0: +; AVX512-NEXT: vmovdqa64 (%rdi), %zmm0 ; AVX512-NEXT: vpmovusqd %zmm0, %ymm0 ; AVX512-NEXT: retq +; +; SKX-LABEL: trunc_usat_v8i64_v8i32: +; SKX: # %bb.0: +; SKX-NEXT: vmovdqa64 (%rdi), %zmm0 +; SKX-NEXT: vpmovusqd %zmm0, %ymm0 +; SKX-NEXT: retq + %a0 = load <8 x i64>, <8 x i64>* %p0 %1 = icmp ult <8 x i64> %a0, %2 = select <8 x i1> %1, <8 x i64> %a0, <8 x i64> %3 = trunc <8 x i64> %2 to <8 x i32> @@ -633,6 +665,12 @@ define <4 x i16> @trunc_usat_v4i64_v4i16 ; AVX512BWVL-NEXT: vpmovusqw %ymm0, %xmm0 ; AVX512BWVL-NEXT: vzeroupper ; AVX512BWVL-NEXT: retq +; +; SKX-LABEL: trunc_usat_v4i64_v4i16: +; SKX: # %bb.0: +; SKX-NEXT: vpmovusqw %ymm0, %xmm0 +; SKX-NEXT: vzeroupper +; SKX-NEXT: retq %1 = icmp ult <4 x i64> %a0, %2 = select <4 x i1> %1, <4 x i64> %a0, <4 x i64> %3 = trunc <4 x i64> %2 to <4 x i16> @@ -833,6 +871,12 @@ define void @trunc_usat_v4i64_v4i16_stor ; AVX512BWVL-NEXT: vpmovusqw %ymm0, (%rdi) ; AVX512BWVL-NEXT: vzeroupper ; AVX512BWVL-NEXT: retq +; +; SKX-LABEL: trunc_usat_v4i64_v4i16_store: +; SKX: # %bb.0: +; SKX-NEXT: vpmovusqw %ymm0, (%rdi) +; SKX-NEXT: vzeroupper +; SKX-NEXT: retq %1 = icmp ult <4 x i64> %a0, %2 = select <4 x i1> %1, <4 x i64> %a0, <4 x i64> %3 = trunc <4 x i64> %2 to <4 x i16> @@ -840,225 +884,239 @@ define void @trunc_usat_v4i64_v4i16_stor ret void } -define <8 x i16> @trunc_usat_v8i64_v8i16(<8 x i64> %a0) { +define <8 x i16> @trunc_usat_v8i64_v8i16(<8 x i64>* %p0) { ; SSE2-LABEL: trunc_usat_v8i64_v8i16: ; SSE2: # %bb.0: +; SSE2-NEXT: movdqa (%rdi), %xmm4 +; SSE2-NEXT: movdqa 16(%rdi), %xmm9 +; SSE2-NEXT: movdqa 32(%rdi), %xmm6 +; SSE2-NEXT: movdqa 48(%rdi), %xmm7 ; SSE2-NEXT: movdqa {{.*#+}} xmm8 = [65535,65535] -; SSE2-NEXT: movdqa {{.*#+}} xmm6 = [9223372039002259456,9223372039002259456] -; SSE2-NEXT: movdqa %xmm2, %xmm5 -; SSE2-NEXT: pxor %xmm6, %xmm5 -; SSE2-NEXT: 
movdqa {{.*#+}} xmm9 = [9223372039002324991,9223372039002324991] -; SSE2-NEXT: movdqa %xmm9, %xmm7 -; SSE2-NEXT: pcmpgtd %xmm5, %xmm7 -; SSE2-NEXT: pshufd {{.*#+}} xmm10 = xmm7[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm9, %xmm5 -; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm5[1,1,3,3] -; SSE2-NEXT: pand %xmm10, %xmm4 -; SSE2-NEXT: pshufd {{.*#+}} xmm5 = xmm7[1,1,3,3] -; SSE2-NEXT: por %xmm4, %xmm5 -; SSE2-NEXT: pand %xmm5, %xmm2 -; SSE2-NEXT: pandn %xmm8, %xmm5 -; SSE2-NEXT: por %xmm2, %xmm5 -; SSE2-NEXT: movdqa %xmm3, %xmm2 -; SSE2-NEXT: pxor %xmm6, %xmm2 -; SSE2-NEXT: movdqa %xmm9, %xmm4 -; SSE2-NEXT: pcmpgtd %xmm2, %xmm4 -; SSE2-NEXT: pshufd {{.*#+}} xmm10 = xmm4[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm9, %xmm2 -; SSE2-NEXT: pshufd {{.*#+}} xmm7 = xmm2[1,1,3,3] -; SSE2-NEXT: pand %xmm10, %xmm7 -; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm4[1,1,3,3] -; SSE2-NEXT: por %xmm7, %xmm2 -; SSE2-NEXT: pand %xmm2, %xmm3 +; SSE2-NEXT: movdqa {{.*#+}} xmm3 = [9223372039002259456,9223372039002259456] +; SSE2-NEXT: movdqa %xmm6, %xmm2 +; SSE2-NEXT: pxor %xmm3, %xmm2 +; SSE2-NEXT: movdqa {{.*#+}} xmm10 = [9223372039002324991,9223372039002324991] +; SSE2-NEXT: movdqa %xmm10, %xmm5 +; SSE2-NEXT: pcmpgtd %xmm2, %xmm5 +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm5[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm10, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm2[1,1,3,3] +; SSE2-NEXT: pand %xmm0, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm5[1,1,3,3] +; SSE2-NEXT: por %xmm1, %xmm2 +; SSE2-NEXT: pand %xmm2, %xmm6 ; SSE2-NEXT: pandn %xmm8, %xmm2 -; SSE2-NEXT: por %xmm3, %xmm2 -; SSE2-NEXT: movdqa %xmm0, %xmm3 -; SSE2-NEXT: pxor %xmm6, %xmm3 -; SSE2-NEXT: movdqa %xmm9, %xmm4 -; SSE2-NEXT: pcmpgtd %xmm3, %xmm4 -; SSE2-NEXT: pshufd {{.*#+}} xmm7 = xmm4[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm9, %xmm3 +; SSE2-NEXT: por %xmm6, %xmm2 +; SSE2-NEXT: movdqa %xmm7, %xmm0 +; SSE2-NEXT: pxor %xmm3, %xmm0 +; SSE2-NEXT: movdqa %xmm10, %xmm1 +; SSE2-NEXT: pcmpgtd %xmm0, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm5 = xmm1[0,0,2,2] +; SSE2-NEXT: 
pcmpeqd %xmm10, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] +; SSE2-NEXT: pand %xmm5, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm1[1,1,3,3] +; SSE2-NEXT: por %xmm0, %xmm6 +; SSE2-NEXT: pand %xmm6, %xmm7 +; SSE2-NEXT: pandn %xmm8, %xmm6 +; SSE2-NEXT: por %xmm7, %xmm6 +; SSE2-NEXT: movdqa %xmm4, %xmm0 +; SSE2-NEXT: pxor %xmm3, %xmm0 +; SSE2-NEXT: movdqa %xmm10, %xmm1 +; SSE2-NEXT: pcmpgtd %xmm0, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm5 = xmm1[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm10, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] +; SSE2-NEXT: pand %xmm5, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] +; SSE2-NEXT: por %xmm0, %xmm1 +; SSE2-NEXT: pand %xmm1, %xmm4 +; SSE2-NEXT: pandn %xmm8, %xmm1 +; SSE2-NEXT: por %xmm4, %xmm1 +; SSE2-NEXT: pxor %xmm9, %xmm3 +; SSE2-NEXT: movdqa %xmm10, %xmm0 +; SSE2-NEXT: pcmpgtd %xmm3, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm0[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm10, %xmm3 ; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm3[1,1,3,3] -; SSE2-NEXT: pand %xmm7, %xmm3 -; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm4[1,1,3,3] -; SSE2-NEXT: por %xmm3, %xmm4 -; SSE2-NEXT: pand %xmm4, %xmm0 -; SSE2-NEXT: pandn %xmm8, %xmm4 -; SSE2-NEXT: por %xmm0, %xmm4 -; SSE2-NEXT: pxor %xmm1, %xmm6 -; SSE2-NEXT: movdqa %xmm9, %xmm0 -; SSE2-NEXT: pcmpgtd %xmm6, %xmm0 -; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm0[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm9, %xmm6 -; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm6[1,1,3,3] -; SSE2-NEXT: pand %xmm3, %xmm6 +; SSE2-NEXT: pand %xmm4, %xmm3 ; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] -; SSE2-NEXT: por %xmm6, %xmm0 -; SSE2-NEXT: pand %xmm0, %xmm1 +; SSE2-NEXT: por %xmm3, %xmm0 +; SSE2-NEXT: pand %xmm0, %xmm9 ; SSE2-NEXT: pandn %xmm8, %xmm0 -; SSE2-NEXT: por %xmm1, %xmm0 +; SSE2-NEXT: por %xmm9, %xmm0 ; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm0[0,2,2,3] ; SSE2-NEXT: pshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7] -; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm4[0,2,2,3] +; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm1[0,2,2,3] 
; SSE2-NEXT: pshuflw {{.*#+}} xmm1 = xmm1[0,2,2,3,4,5,6,7] ; SSE2-NEXT: punpckldq {{.*#+}} xmm1 = xmm1[0],xmm0[0],xmm1[1],xmm0[1] +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm6[0,2,2,3] +; SSE2-NEXT: pshuflw {{.*#+}} xmm3 = xmm0[0,1,0,2,4,5,6,7] ; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm2[0,2,2,3] -; SSE2-NEXT: pshuflw {{.*#+}} xmm2 = xmm0[0,1,0,2,4,5,6,7] -; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm5[0,2,2,3] ; SSE2-NEXT: pshuflw {{.*#+}} xmm0 = xmm0[0,1,0,2,4,5,6,7] -; SSE2-NEXT: punpckldq {{.*#+}} xmm0 = xmm0[0],xmm2[0],xmm0[1],xmm2[1] +; SSE2-NEXT: punpckldq {{.*#+}} xmm0 = xmm0[0],xmm3[0],xmm0[1],xmm3[1] ; SSE2-NEXT: movsd {{.*#+}} xmm0 = xmm1[0],xmm0[1] ; SSE2-NEXT: retq ; ; SSSE3-LABEL: trunc_usat_v8i64_v8i16: ; SSSE3: # %bb.0: +; SSSE3-NEXT: movdqa (%rdi), %xmm4 +; SSSE3-NEXT: movdqa 16(%rdi), %xmm9 +; SSSE3-NEXT: movdqa 32(%rdi), %xmm6 +; SSSE3-NEXT: movdqa 48(%rdi), %xmm7 ; SSSE3-NEXT: movdqa {{.*#+}} xmm8 = [65535,65535] -; SSSE3-NEXT: movdqa {{.*#+}} xmm6 = [9223372039002259456,9223372039002259456] -; SSSE3-NEXT: movdqa %xmm2, %xmm5 -; SSSE3-NEXT: pxor %xmm6, %xmm5 -; SSSE3-NEXT: movdqa {{.*#+}} xmm9 = [9223372039002324991,9223372039002324991] -; SSSE3-NEXT: movdqa %xmm9, %xmm7 -; SSSE3-NEXT: pcmpgtd %xmm5, %xmm7 -; SSSE3-NEXT: pshufd {{.*#+}} xmm10 = xmm7[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm9, %xmm5 -; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm5[1,1,3,3] -; SSSE3-NEXT: pand %xmm10, %xmm4 -; SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm7[1,1,3,3] -; SSSE3-NEXT: por %xmm4, %xmm5 -; SSSE3-NEXT: pand %xmm5, %xmm2 -; SSSE3-NEXT: pandn %xmm8, %xmm5 -; SSSE3-NEXT: por %xmm2, %xmm5 -; SSSE3-NEXT: movdqa %xmm3, %xmm2 -; SSSE3-NEXT: pxor %xmm6, %xmm2 -; SSSE3-NEXT: movdqa %xmm9, %xmm4 -; SSSE3-NEXT: pcmpgtd %xmm2, %xmm4 -; SSSE3-NEXT: pshufd {{.*#+}} xmm10 = xmm4[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm9, %xmm2 -; SSSE3-NEXT: pshufd {{.*#+}} xmm7 = xmm2[1,1,3,3] -; SSSE3-NEXT: pand %xmm10, %xmm7 -; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm4[1,1,3,3] -; SSSE3-NEXT: por %xmm7, %xmm2 -; 
SSSE3-NEXT: pand %xmm2, %xmm3 +; SSSE3-NEXT: movdqa {{.*#+}} xmm3 = [9223372039002259456,9223372039002259456] +; SSSE3-NEXT: movdqa %xmm6, %xmm2 +; SSSE3-NEXT: pxor %xmm3, %xmm2 +; SSSE3-NEXT: movdqa {{.*#+}} xmm10 = [9223372039002324991,9223372039002324991] +; SSSE3-NEXT: movdqa %xmm10, %xmm5 +; SSSE3-NEXT: pcmpgtd %xmm2, %xmm5 +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm5[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm10, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm2[1,1,3,3] +; SSSE3-NEXT: pand %xmm0, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm5[1,1,3,3] +; SSSE3-NEXT: por %xmm1, %xmm2 +; SSSE3-NEXT: pand %xmm2, %xmm6 ; SSSE3-NEXT: pandn %xmm8, %xmm2 -; SSSE3-NEXT: por %xmm3, %xmm2 -; SSSE3-NEXT: movdqa %xmm0, %xmm3 -; SSSE3-NEXT: pxor %xmm6, %xmm3 -; SSSE3-NEXT: movdqa %xmm9, %xmm4 -; SSSE3-NEXT: pcmpgtd %xmm3, %xmm4 -; SSSE3-NEXT: pshufd {{.*#+}} xmm7 = xmm4[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm9, %xmm3 +; SSSE3-NEXT: por %xmm6, %xmm2 +; SSSE3-NEXT: movdqa %xmm7, %xmm0 +; SSSE3-NEXT: pxor %xmm3, %xmm0 +; SSSE3-NEXT: movdqa %xmm10, %xmm1 +; SSSE3-NEXT: pcmpgtd %xmm0, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm1[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm10, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] +; SSSE3-NEXT: pand %xmm5, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm1[1,1,3,3] +; SSSE3-NEXT: por %xmm0, %xmm6 +; SSSE3-NEXT: pand %xmm6, %xmm7 +; SSSE3-NEXT: pandn %xmm8, %xmm6 +; SSSE3-NEXT: por %xmm7, %xmm6 +; SSSE3-NEXT: movdqa %xmm4, %xmm0 +; SSSE3-NEXT: pxor %xmm3, %xmm0 +; SSSE3-NEXT: movdqa %xmm10, %xmm1 +; SSSE3-NEXT: pcmpgtd %xmm0, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm1[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm10, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] +; SSSE3-NEXT: pand %xmm5, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] +; SSSE3-NEXT: por %xmm0, %xmm1 +; SSSE3-NEXT: pand %xmm1, %xmm4 +; SSSE3-NEXT: pandn %xmm8, %xmm1 +; SSSE3-NEXT: por %xmm4, %xmm1 +; SSSE3-NEXT: pxor %xmm9, %xmm3 +; SSSE3-NEXT: movdqa 
%xmm10, %xmm0 +; SSSE3-NEXT: pcmpgtd %xmm3, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm0[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm10, %xmm3 ; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm3[1,1,3,3] -; SSSE3-NEXT: pand %xmm7, %xmm3 -; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm4[1,1,3,3] -; SSSE3-NEXT: por %xmm3, %xmm4 -; SSSE3-NEXT: pand %xmm4, %xmm0 -; SSSE3-NEXT: pandn %xmm8, %xmm4 -; SSSE3-NEXT: por %xmm0, %xmm4 -; SSSE3-NEXT: pxor %xmm1, %xmm6 -; SSSE3-NEXT: movdqa %xmm9, %xmm0 -; SSSE3-NEXT: pcmpgtd %xmm6, %xmm0 -; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm0[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm9, %xmm6 -; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm6[1,1,3,3] -; SSSE3-NEXT: pand %xmm3, %xmm6 +; SSSE3-NEXT: pand %xmm4, %xmm3 ; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] -; SSSE3-NEXT: por %xmm6, %xmm0 -; SSSE3-NEXT: pand %xmm0, %xmm1 +; SSSE3-NEXT: por %xmm3, %xmm0 +; SSSE3-NEXT: pand %xmm0, %xmm9 ; SSSE3-NEXT: pandn %xmm8, %xmm0 -; SSSE3-NEXT: por %xmm1, %xmm0 +; SSSE3-NEXT: por %xmm9, %xmm0 ; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm0[0,2,2,3] ; SSSE3-NEXT: pshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7] -; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm4[0,2,2,3] +; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm1[0,2,2,3] ; SSSE3-NEXT: pshuflw {{.*#+}} xmm1 = xmm1[0,2,2,3,4,5,6,7] ; SSSE3-NEXT: punpckldq {{.*#+}} xmm1 = xmm1[0],xmm0[0],xmm1[1],xmm0[1] +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm6[0,2,2,3] +; SSSE3-NEXT: pshuflw {{.*#+}} xmm3 = xmm0[0,1,0,2,4,5,6,7] ; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm2[0,2,2,3] -; SSSE3-NEXT: pshuflw {{.*#+}} xmm2 = xmm0[0,1,0,2,4,5,6,7] -; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm5[0,2,2,3] ; SSSE3-NEXT: pshuflw {{.*#+}} xmm0 = xmm0[0,1,0,2,4,5,6,7] -; SSSE3-NEXT: punpckldq {{.*#+}} xmm0 = xmm0[0],xmm2[0],xmm0[1],xmm2[1] +; SSSE3-NEXT: punpckldq {{.*#+}} xmm0 = xmm0[0],xmm3[0],xmm0[1],xmm3[1] ; SSSE3-NEXT: movsd {{.*#+}} xmm0 = xmm1[0],xmm0[1] ; SSSE3-NEXT: retq ; ; SSE41-LABEL: trunc_usat_v8i64_v8i16: ; SSE41: # %bb.0: -; SSE41-NEXT: movdqa %xmm0, %xmm8 -; 
SSE41-NEXT: movapd {{.*#+}} xmm9 = [65535,65535] -; SSE41-NEXT: movdqa {{.*#+}} xmm7 = [9223372039002259456,9223372039002259456] +; SSE41-NEXT: movdqa (%rdi), %xmm7 +; SSE41-NEXT: movdqa 16(%rdi), %xmm1 +; SSE41-NEXT: movdqa 32(%rdi), %xmm8 +; SSE41-NEXT: movdqa 48(%rdi), %xmm9 +; SSE41-NEXT: movapd {{.*#+}} xmm2 = [65535,65535] +; SSE41-NEXT: movdqa {{.*#+}} xmm5 = [9223372039002259456,9223372039002259456] ; SSE41-NEXT: movdqa %xmm1, %xmm0 -; SSE41-NEXT: pxor %xmm7, %xmm0 -; SSE41-NEXT: movdqa {{.*#+}} xmm4 = [9223372039002324991,9223372039002324991] -; SSE41-NEXT: movdqa %xmm4, %xmm5 -; SSE41-NEXT: pcmpeqd %xmm0, %xmm5 -; SSE41-NEXT: movdqa %xmm4, %xmm6 -; SSE41-NEXT: pcmpgtd %xmm0, %xmm6 -; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm6[0,0,2,2] -; SSE41-NEXT: pand %xmm5, %xmm0 -; SSE41-NEXT: por %xmm6, %xmm0 -; SSE41-NEXT: movapd %xmm9, %xmm5 -; SSE41-NEXT: blendvpd %xmm0, %xmm1, %xmm5 -; SSE41-NEXT: movdqa %xmm8, %xmm0 -; SSE41-NEXT: pxor %xmm7, %xmm0 -; SSE41-NEXT: movdqa %xmm4, %xmm1 -; SSE41-NEXT: pcmpeqd %xmm0, %xmm1 -; SSE41-NEXT: movdqa %xmm4, %xmm6 +; SSE41-NEXT: pxor %xmm5, %xmm0 +; SSE41-NEXT: movdqa {{.*#+}} xmm3 = [9223372039002324991,9223372039002324991] +; SSE41-NEXT: movdqa %xmm3, %xmm6 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm6 +; SSE41-NEXT: movdqa %xmm3, %xmm4 +; SSE41-NEXT: pcmpgtd %xmm0, %xmm4 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm4[0,0,2,2] +; SSE41-NEXT: pand %xmm6, %xmm0 +; SSE41-NEXT: por %xmm4, %xmm0 +; SSE41-NEXT: movapd %xmm2, %xmm4 +; SSE41-NEXT: blendvpd %xmm0, %xmm1, %xmm4 +; SSE41-NEXT: movdqa %xmm7, %xmm0 +; SSE41-NEXT: pxor %xmm5, %xmm0 +; SSE41-NEXT: movdqa %xmm3, %xmm1 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm1 +; SSE41-NEXT: movdqa %xmm3, %xmm6 ; SSE41-NEXT: pcmpgtd %xmm0, %xmm6 ; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm6[0,0,2,2] ; SSE41-NEXT: pand %xmm1, %xmm0 ; SSE41-NEXT: por %xmm6, %xmm0 -; SSE41-NEXT: movapd %xmm9, %xmm1 -; SSE41-NEXT: blendvpd %xmm0, %xmm8, %xmm1 -; SSE41-NEXT: packusdw %xmm5, %xmm1 -; SSE41-NEXT: movdqa %xmm3, %xmm0 -; 
SSE41-NEXT: pxor %xmm7, %xmm0 -; SSE41-NEXT: movdqa %xmm4, %xmm5 -; SSE41-NEXT: pcmpeqd %xmm0, %xmm5 -; SSE41-NEXT: movdqa %xmm4, %xmm6 +; SSE41-NEXT: movapd %xmm2, %xmm1 +; SSE41-NEXT: blendvpd %xmm0, %xmm7, %xmm1 +; SSE41-NEXT: packusdw %xmm4, %xmm1 +; SSE41-NEXT: movdqa %xmm9, %xmm0 +; SSE41-NEXT: pxor %xmm5, %xmm0 +; SSE41-NEXT: movdqa %xmm3, %xmm4 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm4 +; SSE41-NEXT: movdqa %xmm3, %xmm6 ; SSE41-NEXT: pcmpgtd %xmm0, %xmm6 ; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm6[0,0,2,2] -; SSE41-NEXT: pand %xmm5, %xmm0 +; SSE41-NEXT: pand %xmm4, %xmm0 ; SSE41-NEXT: por %xmm6, %xmm0 -; SSE41-NEXT: movapd %xmm9, %xmm5 -; SSE41-NEXT: blendvpd %xmm0, %xmm3, %xmm5 -; SSE41-NEXT: pxor %xmm2, %xmm7 -; SSE41-NEXT: movdqa %xmm4, %xmm3 -; SSE41-NEXT: pcmpeqd %xmm7, %xmm3 -; SSE41-NEXT: pcmpgtd %xmm7, %xmm4 -; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm4[0,0,2,2] -; SSE41-NEXT: pand %xmm3, %xmm0 -; SSE41-NEXT: por %xmm4, %xmm0 -; SSE41-NEXT: blendvpd %xmm0, %xmm2, %xmm9 -; SSE41-NEXT: packusdw %xmm5, %xmm9 -; SSE41-NEXT: packusdw %xmm9, %xmm1 +; SSE41-NEXT: movapd %xmm2, %xmm4 +; SSE41-NEXT: blendvpd %xmm0, %xmm9, %xmm4 +; SSE41-NEXT: pxor %xmm8, %xmm5 +; SSE41-NEXT: movdqa %xmm3, %xmm6 +; SSE41-NEXT: pcmpeqd %xmm5, %xmm6 +; SSE41-NEXT: pcmpgtd %xmm5, %xmm3 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm3[0,0,2,2] +; SSE41-NEXT: pand %xmm6, %xmm0 +; SSE41-NEXT: por %xmm3, %xmm0 +; SSE41-NEXT: blendvpd %xmm0, %xmm8, %xmm2 +; SSE41-NEXT: packusdw %xmm4, %xmm2 +; SSE41-NEXT: packusdw %xmm2, %xmm1 ; SSE41-NEXT: movdqa %xmm1, %xmm0 ; SSE41-NEXT: retq ; ; AVX1-LABEL: trunc_usat_v8i64_v8i16: ; AVX1: # %bb.0: -; AVX1-NEXT: vmovdqa {{.*#+}} xmm2 = [9223372036854775808,9223372036854775808] -; AVX1-NEXT: vpxor %xmm2, %xmm0, %xmm3 -; AVX1-NEXT: vmovdqa {{.*#+}} xmm4 = [9223372036854841343,9223372036854841343] -; AVX1-NEXT: vpcmpgtq %xmm3, %xmm4, %xmm8 -; AVX1-NEXT: vextractf128 $1, %ymm0, %xmm5 -; AVX1-NEXT: vpxor %xmm2, %xmm5, %xmm6 -; AVX1-NEXT: vpcmpgtq %xmm6, %xmm4, %xmm6 -; 
AVX1-NEXT: vpxor %xmm2, %xmm1, %xmm7 -; AVX1-NEXT: vpcmpgtq %xmm7, %xmm4, %xmm7 -; AVX1-NEXT: vextractf128 $1, %ymm1, %xmm3 -; AVX1-NEXT: vpxor %xmm2, %xmm3, %xmm2 -; AVX1-NEXT: vpcmpgtq %xmm2, %xmm4, %xmm2 -; AVX1-NEXT: vmovapd {{.*#+}} xmm4 = [65535,65535] -; AVX1-NEXT: vblendvpd %xmm2, %xmm3, %xmm4, %xmm2 -; AVX1-NEXT: vblendvpd %xmm7, %xmm1, %xmm4, %xmm1 -; AVX1-NEXT: vpackusdw %xmm2, %xmm1, %xmm1 -; AVX1-NEXT: vblendvpd %xmm6, %xmm5, %xmm4, %xmm2 -; AVX1-NEXT: vblendvpd %xmm8, %xmm0, %xmm4, %xmm0 -; AVX1-NEXT: vpackusdw %xmm2, %xmm0, %xmm0 +; AVX1-NEXT: vmovdqa (%rdi), %xmm0 +; AVX1-NEXT: vmovdqa 16(%rdi), %xmm1 +; AVX1-NEXT: vmovdqa 32(%rdi), %xmm2 +; AVX1-NEXT: vmovdqa 48(%rdi), %xmm3 +; AVX1-NEXT: vmovdqa {{.*#+}} xmm4 = [9223372036854775808,9223372036854775808] +; AVX1-NEXT: vpxor %xmm4, %xmm0, %xmm5 +; AVX1-NEXT: vmovdqa {{.*#+}} xmm6 = [9223372036854841343,9223372036854841343] +; AVX1-NEXT: vpcmpgtq %xmm5, %xmm6, %xmm8 +; AVX1-NEXT: vpxor %xmm4, %xmm1, %xmm7 +; AVX1-NEXT: vpcmpgtq %xmm7, %xmm6, %xmm7 +; AVX1-NEXT: vpxor %xmm4, %xmm2, %xmm5 +; AVX1-NEXT: vpcmpgtq %xmm5, %xmm6, %xmm5 +; AVX1-NEXT: vpxor %xmm4, %xmm3, %xmm4 +; AVX1-NEXT: vpcmpgtq %xmm4, %xmm6, %xmm4 +; AVX1-NEXT: vmovapd {{.*#+}} xmm6 = [65535,65535] +; AVX1-NEXT: vblendvpd %xmm4, %xmm3, %xmm6, %xmm3 +; AVX1-NEXT: vblendvpd %xmm5, %xmm2, %xmm6, %xmm2 +; AVX1-NEXT: vpackusdw %xmm3, %xmm2, %xmm2 +; AVX1-NEXT: vblendvpd %xmm7, %xmm1, %xmm6, %xmm1 +; AVX1-NEXT: vblendvpd %xmm8, %xmm0, %xmm6, %xmm0 ; AVX1-NEXT: vpackusdw %xmm1, %xmm0, %xmm0 -; AVX1-NEXT: vzeroupper +; AVX1-NEXT: vpackusdw %xmm2, %xmm0, %xmm0 ; AVX1-NEXT: retq ; ; AVX2-LABEL: trunc_usat_v8i64_v8i16: ; AVX2: # %bb.0: +; AVX2-NEXT: vmovdqa (%rdi), %ymm0 +; AVX2-NEXT: vmovdqa 32(%rdi), %ymm1 ; AVX2-NEXT: vbroadcastsd {{.*#+}} ymm2 = [65535,65535,65535,65535] ; AVX2-NEXT: vpbroadcastq {{.*#+}} ymm3 = [9223372036854775808,9223372036854775808,9223372036854775808,9223372036854775808] ; AVX2-NEXT: vpxor %ymm3, %ymm1, %ymm4 @@ -1077,9 
+1135,18 @@ define <8 x i16> @trunc_usat_v8i64_v8i16 ; ; AVX512-LABEL: trunc_usat_v8i64_v8i16: ; AVX512: # %bb.0: +; AVX512-NEXT: vmovdqa64 (%rdi), %zmm0 ; AVX512-NEXT: vpmovusqw %zmm0, %xmm0 ; AVX512-NEXT: vzeroupper ; AVX512-NEXT: retq +; +; SKX-LABEL: trunc_usat_v8i64_v8i16: +; SKX: # %bb.0: +; SKX-NEXT: vmovdqa64 (%rdi), %zmm0 +; SKX-NEXT: vpmovusqw %zmm0, %xmm0 +; SKX-NEXT: vzeroupper +; SKX-NEXT: retq + %a0 = load <8 x i64>, <8 x i64>* %p0 %1 = icmp ult <8 x i64> %a0, %2 = select <8 x i1> %1, <8 x i64> %a0, <8 x i64> %3 = trunc <8 x i64> %2 to <8 x i16> @@ -1157,6 +1224,12 @@ define <4 x i16> @trunc_usat_v4i32_v4i16 ; AVX512BWVL-NEXT: vpminud {{.*}}(%rip){1to4}, %xmm0, %xmm0 ; AVX512BWVL-NEXT: vpackusdw %xmm0, %xmm0, %xmm0 ; AVX512BWVL-NEXT: retq +; +; SKX-LABEL: trunc_usat_v4i32_v4i16: +; SKX: # %bb.0: +; SKX-NEXT: vpminud {{.*}}(%rip){1to4}, %xmm0, %xmm0 +; SKX-NEXT: vpackusdw %xmm0, %xmm0, %xmm0 +; SKX-NEXT: retq %1 = icmp ult <4 x i32> %a0, %2 = select <4 x i1> %1, <4 x i32> %a0, <4 x i32> %3 = trunc <4 x i32> %2 to <4 x i16> @@ -1239,6 +1312,11 @@ define void @trunc_usat_v4i32_v4i16_stor ; AVX512BWVL: # %bb.0: ; AVX512BWVL-NEXT: vpmovusdw %xmm0, (%rdi) ; AVX512BWVL-NEXT: retq +; +; SKX-LABEL: trunc_usat_v4i32_v4i16_store: +; SKX: # %bb.0: +; SKX-NEXT: vpmovusdw %xmm0, (%rdi) +; SKX-NEXT: retq %1 = icmp ult <4 x i32> %a0, %2 = select <4 x i1> %1, <4 x i32> %a0, <4 x i32> %3 = trunc <4 x i32> %2 to <4 x i16> @@ -1350,140 +1428,160 @@ define <8 x i16> @trunc_usat_v8i32_v8i16 ; AVX512BWVL-NEXT: vpmovusdw %ymm0, %xmm0 ; AVX512BWVL-NEXT: vzeroupper ; AVX512BWVL-NEXT: retq +; +; SKX-LABEL: trunc_usat_v8i32_v8i16: +; SKX: # %bb.0: +; SKX-NEXT: vpmovusdw %ymm0, %xmm0 +; SKX-NEXT: vzeroupper +; SKX-NEXT: retq %1 = icmp ult <8 x i32> %a0, %2 = select <8 x i1> %1, <8 x i32> %a0, <8 x i32> %3 = trunc <8 x i32> %2 to <8 x i16> ret <8 x i16> %3 } -define <16 x i16> @trunc_usat_v16i32_v16i16(<16 x i32> %a0) { +define <16 x i16> @trunc_usat_v16i32_v16i16(<16 x i32>* %p0) 
{ ; SSE2-LABEL: trunc_usat_v16i32_v16i16: ; SSE2: # %bb.0: -; SSE2-NEXT: movdqa %xmm1, %xmm8 +; SSE2-NEXT: movdqa (%rdi), %xmm5 +; SSE2-NEXT: movdqa 16(%rdi), %xmm8 +; SSE2-NEXT: movdqa 32(%rdi), %xmm0 +; SSE2-NEXT: movdqa 48(%rdi), %xmm4 ; SSE2-NEXT: movdqa {{.*#+}} xmm6 = [2147483648,2147483648,2147483648,2147483648] -; SSE2-NEXT: movdqa %xmm2, %xmm7 -; SSE2-NEXT: pxor %xmm6, %xmm7 -; SSE2-NEXT: movdqa {{.*#+}} xmm5 = [2147549183,2147549183,2147549183,2147549183] -; SSE2-NEXT: movdqa %xmm5, %xmm1 -; SSE2-NEXT: pcmpgtd %xmm7, %xmm1 -; SSE2-NEXT: pcmpeqd %xmm7, %xmm7 -; SSE2-NEXT: pand %xmm1, %xmm2 -; SSE2-NEXT: pxor %xmm7, %xmm1 -; SSE2-NEXT: por %xmm2, %xmm1 -; SSE2-NEXT: movdqa %xmm3, %xmm4 -; SSE2-NEXT: pxor %xmm6, %xmm4 -; SSE2-NEXT: movdqa %xmm5, %xmm2 -; SSE2-NEXT: pcmpgtd %xmm4, %xmm2 -; SSE2-NEXT: pand %xmm2, %xmm3 -; SSE2-NEXT: pxor %xmm7, %xmm2 -; SSE2-NEXT: por %xmm3, %xmm2 ; SSE2-NEXT: movdqa %xmm0, %xmm3 ; SSE2-NEXT: pxor %xmm6, %xmm3 +; SSE2-NEXT: movdqa {{.*#+}} xmm2 = [2147549183,2147549183,2147549183,2147549183] +; SSE2-NEXT: movdqa %xmm2, %xmm1 +; SSE2-NEXT: pcmpgtd %xmm3, %xmm1 +; SSE2-NEXT: pcmpeqd %xmm7, %xmm7 +; SSE2-NEXT: pand %xmm1, %xmm0 +; SSE2-NEXT: pxor %xmm7, %xmm1 +; SSE2-NEXT: por %xmm0, %xmm1 +; SSE2-NEXT: movdqa %xmm4, %xmm0 +; SSE2-NEXT: pxor %xmm6, %xmm0 +; SSE2-NEXT: movdqa %xmm2, %xmm3 +; SSE2-NEXT: pcmpgtd %xmm0, %xmm3 +; SSE2-NEXT: pand %xmm3, %xmm4 +; SSE2-NEXT: pxor %xmm7, %xmm3 +; SSE2-NEXT: por %xmm4, %xmm3 ; SSE2-NEXT: movdqa %xmm5, %xmm4 -; SSE2-NEXT: pcmpgtd %xmm3, %xmm4 -; SSE2-NEXT: pand %xmm4, %xmm0 -; SSE2-NEXT: pxor %xmm7, %xmm4 -; SSE2-NEXT: por %xmm4, %xmm0 +; SSE2-NEXT: pxor %xmm6, %xmm4 +; SSE2-NEXT: movdqa %xmm2, %xmm0 +; SSE2-NEXT: pcmpgtd %xmm4, %xmm0 +; SSE2-NEXT: pand %xmm0, %xmm5 +; SSE2-NEXT: pxor %xmm7, %xmm0 +; SSE2-NEXT: por %xmm5, %xmm0 ; SSE2-NEXT: pxor %xmm8, %xmm6 -; SSE2-NEXT: pcmpgtd %xmm6, %xmm5 -; SSE2-NEXT: pxor %xmm5, %xmm7 -; SSE2-NEXT: pand %xmm8, %xmm5 -; SSE2-NEXT: por %xmm7, %xmm5 -; 
SSE2-NEXT: pslld $16, %xmm5 -; SSE2-NEXT: psrad $16, %xmm5 -; SSE2-NEXT: pslld $16, %xmm0 -; SSE2-NEXT: psrad $16, %xmm0 -; SSE2-NEXT: packssdw %xmm5, %xmm0 +; SSE2-NEXT: pcmpgtd %xmm6, %xmm2 +; SSE2-NEXT: pxor %xmm2, %xmm7 +; SSE2-NEXT: pand %xmm8, %xmm2 +; SSE2-NEXT: por %xmm7, %xmm2 ; SSE2-NEXT: pslld $16, %xmm2 ; SSE2-NEXT: psrad $16, %xmm2 +; SSE2-NEXT: pslld $16, %xmm0 +; SSE2-NEXT: psrad $16, %xmm0 +; SSE2-NEXT: packssdw %xmm2, %xmm0 +; SSE2-NEXT: pslld $16, %xmm3 +; SSE2-NEXT: psrad $16, %xmm3 ; SSE2-NEXT: pslld $16, %xmm1 ; SSE2-NEXT: psrad $16, %xmm1 -; SSE2-NEXT: packssdw %xmm2, %xmm1 +; SSE2-NEXT: packssdw %xmm3, %xmm1 ; SSE2-NEXT: retq ; ; SSSE3-LABEL: trunc_usat_v16i32_v16i16: ; SSSE3: # %bb.0: -; SSSE3-NEXT: movdqa %xmm1, %xmm8 +; SSSE3-NEXT: movdqa (%rdi), %xmm5 +; SSSE3-NEXT: movdqa 16(%rdi), %xmm8 +; SSSE3-NEXT: movdqa 32(%rdi), %xmm0 +; SSSE3-NEXT: movdqa 48(%rdi), %xmm4 ; SSSE3-NEXT: movdqa {{.*#+}} xmm6 = [2147483648,2147483648,2147483648,2147483648] -; SSSE3-NEXT: movdqa %xmm2, %xmm7 -; SSSE3-NEXT: pxor %xmm6, %xmm7 -; SSSE3-NEXT: movdqa {{.*#+}} xmm5 = [2147549183,2147549183,2147549183,2147549183] -; SSSE3-NEXT: movdqa %xmm5, %xmm1 -; SSSE3-NEXT: pcmpgtd %xmm7, %xmm1 -; SSSE3-NEXT: pcmpeqd %xmm7, %xmm7 -; SSSE3-NEXT: pand %xmm1, %xmm2 -; SSSE3-NEXT: pxor %xmm7, %xmm1 -; SSSE3-NEXT: por %xmm2, %xmm1 -; SSSE3-NEXT: movdqa %xmm3, %xmm4 -; SSSE3-NEXT: pxor %xmm6, %xmm4 -; SSSE3-NEXT: movdqa %xmm5, %xmm2 -; SSSE3-NEXT: pcmpgtd %xmm4, %xmm2 -; SSSE3-NEXT: pand %xmm2, %xmm3 -; SSSE3-NEXT: pxor %xmm7, %xmm2 -; SSSE3-NEXT: por %xmm3, %xmm2 ; SSSE3-NEXT: movdqa %xmm0, %xmm3 ; SSSE3-NEXT: pxor %xmm6, %xmm3 +; SSSE3-NEXT: movdqa {{.*#+}} xmm2 = [2147549183,2147549183,2147549183,2147549183] +; SSSE3-NEXT: movdqa %xmm2, %xmm1 +; SSSE3-NEXT: pcmpgtd %xmm3, %xmm1 +; SSSE3-NEXT: pcmpeqd %xmm7, %xmm7 +; SSSE3-NEXT: pand %xmm1, %xmm0 +; SSSE3-NEXT: pxor %xmm7, %xmm1 +; SSSE3-NEXT: por %xmm0, %xmm1 +; SSSE3-NEXT: movdqa %xmm4, %xmm0 +; SSSE3-NEXT: pxor %xmm6, 
%xmm0 +; SSSE3-NEXT: movdqa %xmm2, %xmm3 +; SSSE3-NEXT: pcmpgtd %xmm0, %xmm3 +; SSSE3-NEXT: pand %xmm3, %xmm4 +; SSSE3-NEXT: pxor %xmm7, %xmm3 +; SSSE3-NEXT: por %xmm4, %xmm3 ; SSSE3-NEXT: movdqa %xmm5, %xmm4 -; SSSE3-NEXT: pcmpgtd %xmm3, %xmm4 -; SSSE3-NEXT: pand %xmm4, %xmm0 -; SSSE3-NEXT: pxor %xmm7, %xmm4 -; SSSE3-NEXT: por %xmm4, %xmm0 +; SSSE3-NEXT: pxor %xmm6, %xmm4 +; SSSE3-NEXT: movdqa %xmm2, %xmm0 +; SSSE3-NEXT: pcmpgtd %xmm4, %xmm0 +; SSSE3-NEXT: pand %xmm0, %xmm5 +; SSSE3-NEXT: pxor %xmm7, %xmm0 +; SSSE3-NEXT: por %xmm5, %xmm0 ; SSSE3-NEXT: pxor %xmm8, %xmm6 -; SSSE3-NEXT: pcmpgtd %xmm6, %xmm5 -; SSSE3-NEXT: pxor %xmm5, %xmm7 -; SSSE3-NEXT: pand %xmm8, %xmm5 -; SSSE3-NEXT: por %xmm7, %xmm5 -; SSSE3-NEXT: pslld $16, %xmm5 -; SSSE3-NEXT: psrad $16, %xmm5 -; SSSE3-NEXT: pslld $16, %xmm0 -; SSSE3-NEXT: psrad $16, %xmm0 -; SSSE3-NEXT: packssdw %xmm5, %xmm0 +; SSSE3-NEXT: pcmpgtd %xmm6, %xmm2 +; SSSE3-NEXT: pxor %xmm2, %xmm7 +; SSSE3-NEXT: pand %xmm8, %xmm2 +; SSSE3-NEXT: por %xmm7, %xmm2 ; SSSE3-NEXT: pslld $16, %xmm2 ; SSSE3-NEXT: psrad $16, %xmm2 +; SSSE3-NEXT: pslld $16, %xmm0 +; SSSE3-NEXT: psrad $16, %xmm0 +; SSSE3-NEXT: packssdw %xmm2, %xmm0 +; SSSE3-NEXT: pslld $16, %xmm3 +; SSSE3-NEXT: psrad $16, %xmm3 ; SSSE3-NEXT: pslld $16, %xmm1 ; SSSE3-NEXT: psrad $16, %xmm1 -; SSSE3-NEXT: packssdw %xmm2, %xmm1 +; SSSE3-NEXT: packssdw %xmm3, %xmm1 ; SSSE3-NEXT: retq ; ; SSE41-LABEL: trunc_usat_v16i32_v16i16: ; SSE41: # %bb.0: -; SSE41-NEXT: movdqa {{.*#+}} xmm4 = [65535,65535,65535,65535] -; SSE41-NEXT: pminud %xmm4, %xmm3 -; SSE41-NEXT: pminud %xmm4, %xmm2 -; SSE41-NEXT: packusdw %xmm3, %xmm2 -; SSE41-NEXT: pminud %xmm4, %xmm1 -; SSE41-NEXT: pminud %xmm4, %xmm0 -; SSE41-NEXT: packusdw %xmm1, %xmm0 -; SSE41-NEXT: movdqa %xmm2, %xmm1 +; SSE41-NEXT: movdqa {{.*#+}} xmm0 = [65535,65535,65535,65535] +; SSE41-NEXT: movdqa 48(%rdi), %xmm2 +; SSE41-NEXT: pminud %xmm0, %xmm2 +; SSE41-NEXT: movdqa 32(%rdi), %xmm1 +; SSE41-NEXT: pminud %xmm0, %xmm1 +; SSE41-NEXT: packusdw 
%xmm2, %xmm1 +; SSE41-NEXT: movdqa 16(%rdi), %xmm2 +; SSE41-NEXT: pminud %xmm0, %xmm2 +; SSE41-NEXT: pminud (%rdi), %xmm0 +; SSE41-NEXT: packusdw %xmm2, %xmm0 ; SSE41-NEXT: retq ; ; AVX1-LABEL: trunc_usat_v16i32_v16i16: ; AVX1: # %bb.0: -; AVX1-NEXT: vextractf128 $1, %ymm0, %xmm2 -; AVX1-NEXT: vmovdqa {{.*#+}} xmm3 = [65535,65535,65535,65535] -; AVX1-NEXT: vpminud %xmm3, %xmm2, %xmm2 -; AVX1-NEXT: vpminud %xmm3, %xmm0, %xmm0 +; AVX1-NEXT: vmovdqa {{.*#+}} xmm0 = [65535,65535,65535,65535] +; AVX1-NEXT: vpminud 16(%rdi), %xmm0, %xmm1 +; AVX1-NEXT: vpminud (%rdi), %xmm0, %xmm2 +; AVX1-NEXT: vpackusdw %xmm1, %xmm2, %xmm1 +; AVX1-NEXT: vpminud 48(%rdi), %xmm0, %xmm2 +; AVX1-NEXT: vpminud 32(%rdi), %xmm0, %xmm0 ; AVX1-NEXT: vpackusdw %xmm2, %xmm0, %xmm0 -; AVX1-NEXT: vextractf128 $1, %ymm1, %xmm2 -; AVX1-NEXT: vpminud %xmm3, %xmm2, %xmm2 -; AVX1-NEXT: vpminud %xmm3, %xmm1, %xmm1 -; AVX1-NEXT: vpackusdw %xmm2, %xmm1, %xmm1 -; AVX1-NEXT: vinsertf128 $1, %xmm1, %ymm0, %ymm0 +; AVX1-NEXT: vinsertf128 $1, %xmm0, %ymm1, %ymm0 ; AVX1-NEXT: retq ; ; AVX2-LABEL: trunc_usat_v16i32_v16i16: ; AVX2: # %bb.0: -; AVX2-NEXT: vpbroadcastd {{.*#+}} ymm2 = [65535,65535,65535,65535,65535,65535,65535,65535] -; AVX2-NEXT: vpminud %ymm2, %ymm1, %ymm1 -; AVX2-NEXT: vpminud %ymm2, %ymm0, %ymm0 +; AVX2-NEXT: vpbroadcastd {{.*#+}} ymm0 = [65535,65535,65535,65535,65535,65535,65535,65535] +; AVX2-NEXT: vpminud 32(%rdi), %ymm0, %ymm1 +; AVX2-NEXT: vpminud (%rdi), %ymm0, %ymm0 ; AVX2-NEXT: vpackusdw %ymm1, %ymm0, %ymm0 ; AVX2-NEXT: vpermq {{.*#+}} ymm0 = ymm0[0,2,1,3] ; AVX2-NEXT: retq ; ; AVX512-LABEL: trunc_usat_v16i32_v16i16: ; AVX512: # %bb.0: +; AVX512-NEXT: vmovdqa64 (%rdi), %zmm0 ; AVX512-NEXT: vpmovusdw %zmm0, %ymm0 ; AVX512-NEXT: retq +; +; SKX-LABEL: trunc_usat_v16i32_v16i16: +; SKX: # %bb.0: +; SKX-NEXT: vmovdqa64 (%rdi), %zmm0 +; SKX-NEXT: vpmovusdw %zmm0, %ymm0 +; SKX-NEXT: retq + %a0 = load <16 x i32>, <16 x i32>* %p0 %1 = icmp ult <16 x i32> %a0, %2 = select <16 x i1> %1, <16 x i32> 
%a0, <16 x i32> %3 = trunc <16 x i32> %2 to <16 x i16> @@ -1661,6 +1759,12 @@ define <4 x i8> @trunc_usat_v4i64_v4i8(< ; AVX512BWVL-NEXT: vpmovusqb %ymm0, %xmm0 ; AVX512BWVL-NEXT: vzeroupper ; AVX512BWVL-NEXT: retq +; +; SKX-LABEL: trunc_usat_v4i64_v4i8: +; SKX: # %bb.0: +; SKX-NEXT: vpmovusqb %ymm0, %xmm0 +; SKX-NEXT: vzeroupper +; SKX-NEXT: retq %1 = icmp ult <4 x i64> %a0, %2 = select <4 x i1> %1, <4 x i64> %a0, <4 x i64> %3 = trunc <4 x i64> %2 to <4 x i8> @@ -1840,6 +1944,12 @@ define void @trunc_usat_v4i64_v4i8_store ; AVX512BWVL-NEXT: vpmovusqb %ymm0, (%rdi) ; AVX512BWVL-NEXT: vzeroupper ; AVX512BWVL-NEXT: retq +; +; SKX-LABEL: trunc_usat_v4i64_v4i8_store: +; SKX: # %bb.0: +; SKX-NEXT: vpmovusqb %ymm0, (%rdi) +; SKX-NEXT: vzeroupper +; SKX-NEXT: retq %1 = icmp ult <4 x i64> %a0, %2 = select <4 x i1> %1, <4 x i64> %a0, <4 x i64> %3 = trunc <4 x i64> %2 to <4 x i8> @@ -1847,215 +1957,227 @@ define void @trunc_usat_v4i64_v4i8_store ret void } -define <8 x i8> @trunc_usat_v8i64_v8i8(<8 x i64> %a0) { +define <8 x i8> @trunc_usat_v8i64_v8i8(<8 x i64>* %p0) { ; SSE2-LABEL: trunc_usat_v8i64_v8i8: ; SSE2: # %bb.0: -; SSE2-NEXT: movdqa %xmm0, %xmm4 +; SSE2-NEXT: movdqa (%rdi), %xmm6 +; SSE2-NEXT: movdqa 16(%rdi), %xmm0 +; SSE2-NEXT: movdqa 32(%rdi), %xmm9 +; SSE2-NEXT: movdqa 48(%rdi), %xmm5 ; SSE2-NEXT: movdqa {{.*#+}} xmm8 = [255,255] -; SSE2-NEXT: movdqa {{.*#+}} xmm6 = [9223372039002259456,9223372039002259456] -; SSE2-NEXT: movdqa %xmm1, %xmm0 -; SSE2-NEXT: pxor %xmm6, %xmm0 -; SSE2-NEXT: movdqa {{.*#+}} xmm9 = [9223372039002259711,9223372039002259711] -; SSE2-NEXT: movdqa %xmm9, %xmm7 -; SSE2-NEXT: pcmpgtd %xmm0, %xmm7 -; SSE2-NEXT: pshufd {{.*#+}} xmm5 = xmm7[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm9, %xmm0 -; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] -; SSE2-NEXT: pand %xmm5, %xmm0 -; SSE2-NEXT: pshufd {{.*#+}} xmm5 = xmm7[1,1,3,3] -; SSE2-NEXT: por %xmm0, %xmm5 -; SSE2-NEXT: pand %xmm5, %xmm1 -; SSE2-NEXT: pandn %xmm8, %xmm5 -; SSE2-NEXT: por %xmm1, %xmm5 -; 
SSE2-NEXT: movdqa %xmm4, %xmm0 -; SSE2-NEXT: pxor %xmm6, %xmm0 -; SSE2-NEXT: movdqa %xmm9, %xmm1 -; SSE2-NEXT: pcmpgtd %xmm0, %xmm1 -; SSE2-NEXT: pshufd {{.*#+}} xmm10 = xmm1[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm9, %xmm0 -; SSE2-NEXT: pshufd {{.*#+}} xmm7 = xmm0[1,1,3,3] -; SSE2-NEXT: pand %xmm10, %xmm7 -; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm1[1,1,3,3] -; SSE2-NEXT: por %xmm7, %xmm0 -; SSE2-NEXT: pand %xmm0, %xmm4 +; SSE2-NEXT: movdqa {{.*#+}} xmm3 = [9223372039002259456,9223372039002259456] +; SSE2-NEXT: movdqa %xmm0, %xmm7 +; SSE2-NEXT: pxor %xmm3, %xmm7 +; SSE2-NEXT: movdqa {{.*#+}} xmm10 = [9223372039002259711,9223372039002259711] +; SSE2-NEXT: movdqa %xmm10, %xmm4 +; SSE2-NEXT: pcmpgtd %xmm7, %xmm4 +; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm4[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm10, %xmm7 +; SSE2-NEXT: pshufd {{.*#+}} xmm7 = xmm7[1,1,3,3] +; SSE2-NEXT: pand %xmm1, %xmm7 +; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm4[1,1,3,3] +; SSE2-NEXT: por %xmm7, %xmm1 +; SSE2-NEXT: pand %xmm1, %xmm0 +; SSE2-NEXT: pandn %xmm8, %xmm1 +; SSE2-NEXT: por %xmm0, %xmm1 +; SSE2-NEXT: movdqa %xmm6, %xmm0 +; SSE2-NEXT: pxor %xmm3, %xmm0 +; SSE2-NEXT: movdqa %xmm10, %xmm4 +; SSE2-NEXT: pcmpgtd %xmm0, %xmm4 +; SSE2-NEXT: pshufd {{.*#+}} xmm7 = xmm4[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm10, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm0[1,1,3,3] +; SSE2-NEXT: pand %xmm7, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm4[1,1,3,3] +; SSE2-NEXT: por %xmm2, %xmm0 +; SSE2-NEXT: pand %xmm0, %xmm6 ; SSE2-NEXT: pandn %xmm8, %xmm0 -; SSE2-NEXT: por %xmm4, %xmm0 -; SSE2-NEXT: packuswb %xmm5, %xmm0 -; SSE2-NEXT: movdqa %xmm3, %xmm1 -; SSE2-NEXT: pxor %xmm6, %xmm1 -; SSE2-NEXT: movdqa %xmm9, %xmm4 -; SSE2-NEXT: pcmpgtd %xmm1, %xmm4 -; SSE2-NEXT: pshufd {{.*#+}} xmm5 = xmm4[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm9, %xmm1 +; SSE2-NEXT: por %xmm6, %xmm0 +; SSE2-NEXT: packuswb %xmm1, %xmm0 +; SSE2-NEXT: movdqa %xmm5, %xmm1 +; SSE2-NEXT: pxor %xmm3, %xmm1 +; SSE2-NEXT: movdqa %xmm10, %xmm2 +; SSE2-NEXT: pcmpgtd %xmm1, 
%xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm2[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm10, %xmm1 ; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] -; SSE2-NEXT: pand %xmm5, %xmm1 -; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm4[1,1,3,3] -; SSE2-NEXT: por %xmm1, %xmm4 +; SSE2-NEXT: pand %xmm4, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] +; SSE2-NEXT: por %xmm1, %xmm2 +; SSE2-NEXT: pand %xmm2, %xmm5 +; SSE2-NEXT: pandn %xmm8, %xmm2 +; SSE2-NEXT: por %xmm5, %xmm2 +; SSE2-NEXT: pxor %xmm9, %xmm3 +; SSE2-NEXT: movdqa %xmm10, %xmm1 +; SSE2-NEXT: pcmpgtd %xmm3, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm1[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm10, %xmm3 +; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm3[1,1,3,3] ; SSE2-NEXT: pand %xmm4, %xmm3 -; SSE2-NEXT: pandn %xmm8, %xmm4 -; SSE2-NEXT: por %xmm3, %xmm4 -; SSE2-NEXT: pxor %xmm2, %xmm6 -; SSE2-NEXT: movdqa %xmm9, %xmm1 -; SSE2-NEXT: pcmpgtd %xmm6, %xmm1 -; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm1[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm9, %xmm6 -; SSE2-NEXT: pshufd {{.*#+}} xmm5 = xmm6[1,1,3,3] -; SSE2-NEXT: pand %xmm3, %xmm5 ; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] -; SSE2-NEXT: por %xmm5, %xmm1 -; SSE2-NEXT: pand %xmm1, %xmm2 +; SSE2-NEXT: por %xmm3, %xmm1 +; SSE2-NEXT: pand %xmm1, %xmm9 ; SSE2-NEXT: pandn %xmm8, %xmm1 -; SSE2-NEXT: por %xmm2, %xmm1 -; SSE2-NEXT: packuswb %xmm4, %xmm1 +; SSE2-NEXT: por %xmm9, %xmm1 +; SSE2-NEXT: packuswb %xmm2, %xmm1 ; SSE2-NEXT: packuswb %xmm1, %xmm0 ; SSE2-NEXT: packuswb %xmm0, %xmm0 ; SSE2-NEXT: retq ; ; SSSE3-LABEL: trunc_usat_v8i64_v8i8: ; SSSE3: # %bb.0: -; SSSE3-NEXT: movdqa %xmm0, %xmm4 +; SSSE3-NEXT: movdqa (%rdi), %xmm6 +; SSSE3-NEXT: movdqa 16(%rdi), %xmm0 +; SSSE3-NEXT: movdqa 32(%rdi), %xmm9 +; SSSE3-NEXT: movdqa 48(%rdi), %xmm5 ; SSSE3-NEXT: movdqa {{.*#+}} xmm8 = [255,255] -; SSSE3-NEXT: movdqa {{.*#+}} xmm6 = [9223372039002259456,9223372039002259456] -; SSSE3-NEXT: movdqa %xmm1, %xmm0 -; SSSE3-NEXT: pxor %xmm6, %xmm0 -; SSSE3-NEXT: movdqa {{.*#+}} xmm9 = 
[9223372039002259711,9223372039002259711] -; SSSE3-NEXT: movdqa %xmm9, %xmm7 -; SSSE3-NEXT: pcmpgtd %xmm0, %xmm7 -; SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm7[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm9, %xmm0 -; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] -; SSSE3-NEXT: pand %xmm5, %xmm0 -; SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm7[1,1,3,3] -; SSSE3-NEXT: por %xmm0, %xmm5 -; SSSE3-NEXT: pand %xmm5, %xmm1 -; SSSE3-NEXT: pandn %xmm8, %xmm5 -; SSSE3-NEXT: por %xmm1, %xmm5 -; SSSE3-NEXT: movdqa %xmm4, %xmm0 -; SSSE3-NEXT: pxor %xmm6, %xmm0 -; SSSE3-NEXT: movdqa %xmm9, %xmm1 -; SSSE3-NEXT: pcmpgtd %xmm0, %xmm1 -; SSSE3-NEXT: pshufd {{.*#+}} xmm10 = xmm1[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm9, %xmm0 -; SSSE3-NEXT: pshufd {{.*#+}} xmm7 = xmm0[1,1,3,3] -; SSSE3-NEXT: pand %xmm10, %xmm7 -; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm1[1,1,3,3] -; SSSE3-NEXT: por %xmm7, %xmm0 -; SSSE3-NEXT: pand %xmm0, %xmm4 +; SSSE3-NEXT: movdqa {{.*#+}} xmm3 = [9223372039002259456,9223372039002259456] +; SSSE3-NEXT: movdqa %xmm0, %xmm7 +; SSSE3-NEXT: pxor %xmm3, %xmm7 +; SSSE3-NEXT: movdqa {{.*#+}} xmm10 = [9223372039002259711,9223372039002259711] +; SSSE3-NEXT: movdqa %xmm10, %xmm4 +; SSSE3-NEXT: pcmpgtd %xmm7, %xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm4[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm10, %xmm7 +; SSSE3-NEXT: pshufd {{.*#+}} xmm7 = xmm7[1,1,3,3] +; SSSE3-NEXT: pand %xmm1, %xmm7 +; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm4[1,1,3,3] +; SSSE3-NEXT: por %xmm7, %xmm1 +; SSSE3-NEXT: pand %xmm1, %xmm0 +; SSSE3-NEXT: pandn %xmm8, %xmm1 +; SSSE3-NEXT: por %xmm0, %xmm1 +; SSSE3-NEXT: movdqa %xmm6, %xmm0 +; SSSE3-NEXT: pxor %xmm3, %xmm0 +; SSSE3-NEXT: movdqa %xmm10, %xmm4 +; SSSE3-NEXT: pcmpgtd %xmm0, %xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm7 = xmm4[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm10, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm0[1,1,3,3] +; SSSE3-NEXT: pand %xmm7, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm4[1,1,3,3] +; SSSE3-NEXT: por %xmm2, %xmm0 +; SSSE3-NEXT: pand %xmm0, %xmm6 ; 
SSSE3-NEXT: pandn %xmm8, %xmm0 -; SSSE3-NEXT: por %xmm4, %xmm0 -; SSSE3-NEXT: packuswb %xmm5, %xmm0 -; SSSE3-NEXT: movdqa %xmm3, %xmm1 -; SSSE3-NEXT: pxor %xmm6, %xmm1 -; SSSE3-NEXT: movdqa %xmm9, %xmm4 -; SSSE3-NEXT: pcmpgtd %xmm1, %xmm4 -; SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm4[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm9, %xmm1 +; SSSE3-NEXT: por %xmm6, %xmm0 +; SSSE3-NEXT: packuswb %xmm1, %xmm0 +; SSSE3-NEXT: movdqa %xmm5, %xmm1 +; SSSE3-NEXT: pxor %xmm3, %xmm1 +; SSSE3-NEXT: movdqa %xmm10, %xmm2 +; SSSE3-NEXT: pcmpgtd %xmm1, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm2[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm10, %xmm1 ; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] -; SSSE3-NEXT: pand %xmm5, %xmm1 -; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm4[1,1,3,3] -; SSSE3-NEXT: por %xmm1, %xmm4 +; SSSE3-NEXT: pand %xmm4, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] +; SSSE3-NEXT: por %xmm1, %xmm2 +; SSSE3-NEXT: pand %xmm2, %xmm5 +; SSSE3-NEXT: pandn %xmm8, %xmm2 +; SSSE3-NEXT: por %xmm5, %xmm2 +; SSSE3-NEXT: pxor %xmm9, %xmm3 +; SSSE3-NEXT: movdqa %xmm10, %xmm1 +; SSSE3-NEXT: pcmpgtd %xmm3, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm1[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm10, %xmm3 +; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm3[1,1,3,3] ; SSSE3-NEXT: pand %xmm4, %xmm3 -; SSSE3-NEXT: pandn %xmm8, %xmm4 -; SSSE3-NEXT: por %xmm3, %xmm4 -; SSSE3-NEXT: pxor %xmm2, %xmm6 -; SSSE3-NEXT: movdqa %xmm9, %xmm1 -; SSSE3-NEXT: pcmpgtd %xmm6, %xmm1 -; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm1[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm9, %xmm6 -; SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm6[1,1,3,3] -; SSSE3-NEXT: pand %xmm3, %xmm5 ; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] -; SSSE3-NEXT: por %xmm5, %xmm1 -; SSSE3-NEXT: pand %xmm1, %xmm2 +; SSSE3-NEXT: por %xmm3, %xmm1 +; SSSE3-NEXT: pand %xmm1, %xmm9 ; SSSE3-NEXT: pandn %xmm8, %xmm1 -; SSSE3-NEXT: por %xmm2, %xmm1 -; SSSE3-NEXT: packuswb %xmm4, %xmm1 +; SSSE3-NEXT: por %xmm9, %xmm1 +; SSSE3-NEXT: packuswb %xmm2, %xmm1 ; SSSE3-NEXT: 
packuswb %xmm1, %xmm0 ; SSSE3-NEXT: packuswb %xmm0, %xmm0 ; SSSE3-NEXT: retq ; ; SSE41-LABEL: trunc_usat_v8i64_v8i8: ; SSE41: # %bb.0: -; SSE41-NEXT: movdqa %xmm0, %xmm8 -; SSE41-NEXT: movapd {{.*#+}} xmm9 = [255,255] -; SSE41-NEXT: movdqa {{.*#+}} xmm7 = [9223372039002259456,9223372039002259456] +; SSE41-NEXT: movdqa (%rdi), %xmm7 +; SSE41-NEXT: movdqa 16(%rdi), %xmm1 +; SSE41-NEXT: movdqa 32(%rdi), %xmm8 +; SSE41-NEXT: movdqa 48(%rdi), %xmm9 +; SSE41-NEXT: movapd {{.*#+}} xmm2 = [255,255] +; SSE41-NEXT: movdqa {{.*#+}} xmm5 = [9223372039002259456,9223372039002259456] ; SSE41-NEXT: movdqa %xmm1, %xmm0 -; SSE41-NEXT: pxor %xmm7, %xmm0 -; SSE41-NEXT: movdqa {{.*#+}} xmm4 = [9223372039002259711,9223372039002259711] -; SSE41-NEXT: movdqa %xmm4, %xmm5 -; SSE41-NEXT: pcmpeqd %xmm0, %xmm5 -; SSE41-NEXT: movdqa %xmm4, %xmm6 -; SSE41-NEXT: pcmpgtd %xmm0, %xmm6 -; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm6[0,0,2,2] -; SSE41-NEXT: pand %xmm5, %xmm0 -; SSE41-NEXT: por %xmm6, %xmm0 -; SSE41-NEXT: movapd %xmm9, %xmm5 -; SSE41-NEXT: blendvpd %xmm0, %xmm1, %xmm5 -; SSE41-NEXT: movdqa %xmm8, %xmm0 -; SSE41-NEXT: pxor %xmm7, %xmm0 -; SSE41-NEXT: movdqa %xmm4, %xmm1 +; SSE41-NEXT: pxor %xmm5, %xmm0 +; SSE41-NEXT: movdqa {{.*#+}} xmm3 = [9223372039002259711,9223372039002259711] +; SSE41-NEXT: movdqa %xmm3, %xmm6 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm6 +; SSE41-NEXT: movdqa %xmm3, %xmm4 +; SSE41-NEXT: pcmpgtd %xmm0, %xmm4 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm4[0,0,2,2] +; SSE41-NEXT: pand %xmm6, %xmm0 +; SSE41-NEXT: por %xmm4, %xmm0 +; SSE41-NEXT: movapd %xmm2, %xmm4 +; SSE41-NEXT: blendvpd %xmm0, %xmm1, %xmm4 +; SSE41-NEXT: movdqa %xmm7, %xmm0 +; SSE41-NEXT: pxor %xmm5, %xmm0 +; SSE41-NEXT: movdqa %xmm3, %xmm1 ; SSE41-NEXT: pcmpeqd %xmm0, %xmm1 -; SSE41-NEXT: movdqa %xmm4, %xmm6 +; SSE41-NEXT: movdqa %xmm3, %xmm6 ; SSE41-NEXT: pcmpgtd %xmm0, %xmm6 ; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm6[0,0,2,2] ; SSE41-NEXT: pand %xmm1, %xmm0 ; SSE41-NEXT: por %xmm6, %xmm0 -; SSE41-NEXT: movapd 
%xmm9, %xmm1 -; SSE41-NEXT: blendvpd %xmm0, %xmm8, %xmm1 -; SSE41-NEXT: packusdw %xmm5, %xmm1 -; SSE41-NEXT: movdqa %xmm3, %xmm0 -; SSE41-NEXT: pxor %xmm7, %xmm0 -; SSE41-NEXT: movdqa %xmm4, %xmm5 -; SSE41-NEXT: pcmpeqd %xmm0, %xmm5 -; SSE41-NEXT: movdqa %xmm4, %xmm6 +; SSE41-NEXT: movapd %xmm2, %xmm1 +; SSE41-NEXT: blendvpd %xmm0, %xmm7, %xmm1 +; SSE41-NEXT: packusdw %xmm4, %xmm1 +; SSE41-NEXT: movdqa %xmm9, %xmm0 +; SSE41-NEXT: pxor %xmm5, %xmm0 +; SSE41-NEXT: movdqa %xmm3, %xmm4 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm4 +; SSE41-NEXT: movdqa %xmm3, %xmm6 ; SSE41-NEXT: pcmpgtd %xmm0, %xmm6 ; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm6[0,0,2,2] -; SSE41-NEXT: pand %xmm5, %xmm0 +; SSE41-NEXT: pand %xmm4, %xmm0 ; SSE41-NEXT: por %xmm6, %xmm0 -; SSE41-NEXT: movapd %xmm9, %xmm5 -; SSE41-NEXT: blendvpd %xmm0, %xmm3, %xmm5 -; SSE41-NEXT: pxor %xmm2, %xmm7 -; SSE41-NEXT: movdqa %xmm4, %xmm3 -; SSE41-NEXT: pcmpeqd %xmm7, %xmm3 -; SSE41-NEXT: pcmpgtd %xmm7, %xmm4 -; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm4[0,0,2,2] -; SSE41-NEXT: pand %xmm3, %xmm0 -; SSE41-NEXT: por %xmm4, %xmm0 -; SSE41-NEXT: blendvpd %xmm0, %xmm2, %xmm9 -; SSE41-NEXT: packusdw %xmm5, %xmm9 -; SSE41-NEXT: packusdw %xmm9, %xmm1 +; SSE41-NEXT: movapd %xmm2, %xmm4 +; SSE41-NEXT: blendvpd %xmm0, %xmm9, %xmm4 +; SSE41-NEXT: pxor %xmm8, %xmm5 +; SSE41-NEXT: movdqa %xmm3, %xmm6 +; SSE41-NEXT: pcmpeqd %xmm5, %xmm6 +; SSE41-NEXT: pcmpgtd %xmm5, %xmm3 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm3[0,0,2,2] +; SSE41-NEXT: pand %xmm6, %xmm0 +; SSE41-NEXT: por %xmm3, %xmm0 +; SSE41-NEXT: blendvpd %xmm0, %xmm8, %xmm2 +; SSE41-NEXT: packusdw %xmm4, %xmm2 +; SSE41-NEXT: packusdw %xmm2, %xmm1 ; SSE41-NEXT: packuswb %xmm1, %xmm1 ; SSE41-NEXT: movdqa %xmm1, %xmm0 ; SSE41-NEXT: retq ; ; AVX1-LABEL: trunc_usat_v8i64_v8i8: ; AVX1: # %bb.0: -; AVX1-NEXT: vmovdqa {{.*#+}} xmm2 = [9223372036854775808,9223372036854775808] -; AVX1-NEXT: vpxor %xmm2, %xmm0, %xmm3 -; AVX1-NEXT: vmovdqa {{.*#+}} xmm4 = [9223372036854776063,9223372036854776063] -; 
AVX1-NEXT: vpcmpgtq %xmm3, %xmm4, %xmm8 -; AVX1-NEXT: vextractf128 $1, %ymm0, %xmm5 -; AVX1-NEXT: vpxor %xmm2, %xmm5, %xmm6 -; AVX1-NEXT: vpcmpgtq %xmm6, %xmm4, %xmm6 -; AVX1-NEXT: vpxor %xmm2, %xmm1, %xmm7 -; AVX1-NEXT: vpcmpgtq %xmm7, %xmm4, %xmm7 -; AVX1-NEXT: vextractf128 $1, %ymm1, %xmm3 -; AVX1-NEXT: vpxor %xmm2, %xmm3, %xmm2 -; AVX1-NEXT: vpcmpgtq %xmm2, %xmm4, %xmm2 -; AVX1-NEXT: vmovapd {{.*#+}} xmm4 = [255,255] -; AVX1-NEXT: vblendvpd %xmm2, %xmm3, %xmm4, %xmm2 -; AVX1-NEXT: vblendvpd %xmm7, %xmm1, %xmm4, %xmm1 -; AVX1-NEXT: vpackusdw %xmm2, %xmm1, %xmm1 -; AVX1-NEXT: vblendvpd %xmm6, %xmm5, %xmm4, %xmm2 -; AVX1-NEXT: vblendvpd %xmm8, %xmm0, %xmm4, %xmm0 -; AVX1-NEXT: vpackusdw %xmm2, %xmm0, %xmm0 +; AVX1-NEXT: vmovdqa (%rdi), %xmm0 +; AVX1-NEXT: vmovdqa 16(%rdi), %xmm1 +; AVX1-NEXT: vmovdqa 32(%rdi), %xmm2 +; AVX1-NEXT: vmovdqa 48(%rdi), %xmm3 +; AVX1-NEXT: vmovdqa {{.*#+}} xmm4 = [9223372036854775808,9223372036854775808] +; AVX1-NEXT: vpxor %xmm4, %xmm0, %xmm5 +; AVX1-NEXT: vmovdqa {{.*#+}} xmm6 = [9223372036854776063,9223372036854776063] +; AVX1-NEXT: vpcmpgtq %xmm5, %xmm6, %xmm8 +; AVX1-NEXT: vpxor %xmm4, %xmm1, %xmm7 +; AVX1-NEXT: vpcmpgtq %xmm7, %xmm6, %xmm7 +; AVX1-NEXT: vpxor %xmm4, %xmm2, %xmm5 +; AVX1-NEXT: vpcmpgtq %xmm5, %xmm6, %xmm5 +; AVX1-NEXT: vpxor %xmm4, %xmm3, %xmm4 +; AVX1-NEXT: vpcmpgtq %xmm4, %xmm6, %xmm4 +; AVX1-NEXT: vmovapd {{.*#+}} xmm6 = [255,255] +; AVX1-NEXT: vblendvpd %xmm4, %xmm3, %xmm6, %xmm3 +; AVX1-NEXT: vblendvpd %xmm5, %xmm2, %xmm6, %xmm2 +; AVX1-NEXT: vpackusdw %xmm3, %xmm2, %xmm2 +; AVX1-NEXT: vblendvpd %xmm7, %xmm1, %xmm6, %xmm1 +; AVX1-NEXT: vblendvpd %xmm8, %xmm0, %xmm6, %xmm0 ; AVX1-NEXT: vpackusdw %xmm1, %xmm0, %xmm0 +; AVX1-NEXT: vpackusdw %xmm2, %xmm0, %xmm0 ; AVX1-NEXT: vpackuswb %xmm0, %xmm0, %xmm0 -; AVX1-NEXT: vzeroupper ; AVX1-NEXT: retq ; ; AVX2-LABEL: trunc_usat_v8i64_v8i8: ; AVX2: # %bb.0: +; AVX2-NEXT: vmovdqa (%rdi), %ymm0 +; AVX2-NEXT: vmovdqa 32(%rdi), %ymm1 ; AVX2-NEXT: vbroadcastsd {{.*#+}} ymm2 = 
[255,255,255,255]
 ; AVX2-NEXT: vpbroadcastq {{.*#+}} ymm3 = [9223372036854775808,9223372036854775808,9223372036854775808,9223372036854775808]
 ; AVX2-NEXT: vpxor %ymm3, %ymm0, %ymm4
@@ -2081,225 +2203,248 @@ define <8 x i8> @trunc_usat_v8i64_v8i8(<
 ;
 ; AVX512-LABEL: trunc_usat_v8i64_v8i8:
 ; AVX512: # %bb.0:
+; AVX512-NEXT: vmovdqa64 (%rdi), %zmm0
 ; AVX512-NEXT: vpmovusqb %zmm0, %xmm0
 ; AVX512-NEXT: vzeroupper
 ; AVX512-NEXT: retq
+;
+; SKX-LABEL: trunc_usat_v8i64_v8i8:
+; SKX: # %bb.0:
+; SKX-NEXT: vmovdqa64 (%rdi), %zmm0
+; SKX-NEXT: vpmovusqb %zmm0, %xmm0
+; SKX-NEXT: vzeroupper
+; SKX-NEXT: retq
+  %a0 = load <8 x i64>, <8 x i64>* %p0
   %1 = icmp ult <8 x i64> %a0,
   %2 = select <8 x i1> %1, <8 x i64> %a0, <8 x i64>
   %3 = trunc <8 x i64> %2 to <8 x i8>
   ret <8 x i8> %3
 }
-define void @trunc_usat_v8i64_v8i8_store(<8 x i64> %a0, <8 x i8> *%p1) {
+define void @trunc_usat_v8i64_v8i8_store(<8 x i64>* %p0, <8 x i8> *%p1) {
 ; SSE2-LABEL: trunc_usat_v8i64_v8i8_store:
 ; SSE2: # %bb.0:
+; SSE2-NEXT: movdqa (%rdi), %xmm6
+; SSE2-NEXT: movdqa 16(%rdi), %xmm5
+; SSE2-NEXT: movdqa 32(%rdi), %xmm9
+; SSE2-NEXT: movdqa 48(%rdi), %xmm4
 ; SSE2-NEXT: movdqa {{.*#+}} xmm8 = [255,255]
-; SSE2-NEXT: movdqa {{.*#+}} xmm5 = [9223372039002259456,9223372039002259456]
-; SSE2-NEXT: movdqa %xmm1, %xmm7
-; SSE2-NEXT: pxor %xmm5, %xmm7
-; SSE2-NEXT: movdqa {{.*#+}} xmm9 = [9223372039002259711,9223372039002259711]
-; SSE2-NEXT: movdqa %xmm9, %xmm6
-; SSE2-NEXT: pcmpgtd %xmm7, %xmm6
-; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm6[0,0,2,2]
-; SSE2-NEXT: pcmpeqd %xmm9, %xmm7
+; SSE2-NEXT: movdqa {{.*#+}} xmm2 = [9223372039002259456,9223372039002259456]
+; SSE2-NEXT: movdqa %xmm5, %xmm7
+; SSE2-NEXT: pxor %xmm2, %xmm7
+; SSE2-NEXT: movdqa {{.*#+}} xmm1 = [9223372039002259711,9223372039002259711]
+; SSE2-NEXT: movdqa %xmm1, %xmm3
+; SSE2-NEXT: pcmpgtd %xmm7, %xmm3
+; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm3[0,0,2,2]
+; SSE2-NEXT: pcmpeqd %xmm1, %xmm7
 ; SSE2-NEXT: pshufd {{.*#+}} xmm7 = xmm7[1,1,3,3]
-; SSE2-NEXT:
pand %xmm4, %xmm7 -; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm6[1,1,3,3] -; SSE2-NEXT: por %xmm7, %xmm4 -; SSE2-NEXT: pand %xmm4, %xmm1 -; SSE2-NEXT: pandn %xmm8, %xmm4 -; SSE2-NEXT: por %xmm1, %xmm4 -; SSE2-NEXT: movdqa %xmm0, %xmm1 -; SSE2-NEXT: pxor %xmm5, %xmm1 -; SSE2-NEXT: movdqa %xmm9, %xmm6 -; SSE2-NEXT: pcmpgtd %xmm1, %xmm6 -; SSE2-NEXT: pshufd {{.*#+}} xmm10 = xmm6[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm9, %xmm1 -; SSE2-NEXT: pshufd {{.*#+}} xmm7 = xmm1[1,1,3,3] -; SSE2-NEXT: pand %xmm10, %xmm7 -; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm6[1,1,3,3] -; SSE2-NEXT: por %xmm7, %xmm1 -; SSE2-NEXT: pand %xmm1, %xmm0 -; SSE2-NEXT: pandn %xmm8, %xmm1 -; SSE2-NEXT: por %xmm0, %xmm1 -; SSE2-NEXT: packuswb %xmm4, %xmm1 -; SSE2-NEXT: movdqa %xmm3, %xmm0 -; SSE2-NEXT: pxor %xmm5, %xmm0 -; SSE2-NEXT: movdqa %xmm9, %xmm4 -; SSE2-NEXT: pcmpgtd %xmm0, %xmm4 -; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm4[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm9, %xmm0 +; SSE2-NEXT: pand %xmm0, %xmm7 +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm3[1,1,3,3] +; SSE2-NEXT: por %xmm7, %xmm0 +; SSE2-NEXT: pand %xmm0, %xmm5 +; SSE2-NEXT: pandn %xmm8, %xmm0 +; SSE2-NEXT: por %xmm5, %xmm0 +; SSE2-NEXT: movdqa %xmm6, %xmm3 +; SSE2-NEXT: pxor %xmm2, %xmm3 +; SSE2-NEXT: movdqa %xmm1, %xmm5 +; SSE2-NEXT: pcmpgtd %xmm3, %xmm5 +; SSE2-NEXT: pshufd {{.*#+}} xmm7 = xmm5[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm1, %xmm3 +; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm3[1,1,3,3] +; SSE2-NEXT: pand %xmm7, %xmm3 +; SSE2-NEXT: pshufd {{.*#+}} xmm5 = xmm5[1,1,3,3] +; SSE2-NEXT: por %xmm3, %xmm5 +; SSE2-NEXT: pand %xmm5, %xmm6 +; SSE2-NEXT: pandn %xmm8, %xmm5 +; SSE2-NEXT: por %xmm6, %xmm5 +; SSE2-NEXT: packuswb %xmm0, %xmm5 +; SSE2-NEXT: movdqa %xmm4, %xmm0 +; SSE2-NEXT: pxor %xmm2, %xmm0 +; SSE2-NEXT: movdqa %xmm1, %xmm3 +; SSE2-NEXT: pcmpgtd %xmm0, %xmm3 +; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm3[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm1, %xmm0 ; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] ; SSE2-NEXT: pand %xmm6, %xmm0 -; SSE2-NEXT: pshufd {{.*#+}} 
xmm4 = xmm4[1,1,3,3] -; SSE2-NEXT: por %xmm0, %xmm4 -; SSE2-NEXT: pand %xmm4, %xmm3 -; SSE2-NEXT: pandn %xmm8, %xmm4 -; SSE2-NEXT: por %xmm3, %xmm4 -; SSE2-NEXT: pxor %xmm2, %xmm5 -; SSE2-NEXT: movdqa %xmm9, %xmm0 -; SSE2-NEXT: pcmpgtd %xmm5, %xmm0 -; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm0[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm9, %xmm5 -; SSE2-NEXT: pshufd {{.*#+}} xmm5 = xmm5[1,1,3,3] -; SSE2-NEXT: pand %xmm3, %xmm5 +; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm3[1,1,3,3] +; SSE2-NEXT: por %xmm0, %xmm3 +; SSE2-NEXT: pand %xmm3, %xmm4 +; SSE2-NEXT: pandn %xmm8, %xmm3 +; SSE2-NEXT: por %xmm4, %xmm3 +; SSE2-NEXT: pxor %xmm9, %xmm2 +; SSE2-NEXT: movdqa %xmm1, %xmm0 +; SSE2-NEXT: pcmpgtd %xmm2, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm0[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm1, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm2[1,1,3,3] +; SSE2-NEXT: pand %xmm4, %xmm1 ; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] -; SSE2-NEXT: por %xmm5, %xmm0 -; SSE2-NEXT: pand %xmm0, %xmm2 +; SSE2-NEXT: por %xmm1, %xmm0 +; SSE2-NEXT: pand %xmm0, %xmm9 ; SSE2-NEXT: pandn %xmm8, %xmm0 -; SSE2-NEXT: por %xmm2, %xmm0 -; SSE2-NEXT: packuswb %xmm4, %xmm0 -; SSE2-NEXT: packuswb %xmm0, %xmm1 -; SSE2-NEXT: packuswb %xmm0, %xmm1 -; SSE2-NEXT: movq %xmm1, (%rdi) +; SSE2-NEXT: por %xmm9, %xmm0 +; SSE2-NEXT: packuswb %xmm3, %xmm0 +; SSE2-NEXT: packuswb %xmm0, %xmm5 +; SSE2-NEXT: packuswb %xmm0, %xmm5 +; SSE2-NEXT: movq %xmm5, (%rsi) ; SSE2-NEXT: retq ; ; SSSE3-LABEL: trunc_usat_v8i64_v8i8_store: ; SSSE3: # %bb.0: +; SSSE3-NEXT: movdqa (%rdi), %xmm6 +; SSSE3-NEXT: movdqa 16(%rdi), %xmm5 +; SSSE3-NEXT: movdqa 32(%rdi), %xmm9 +; SSSE3-NEXT: movdqa 48(%rdi), %xmm4 ; SSSE3-NEXT: movdqa {{.*#+}} xmm8 = [255,255] -; SSSE3-NEXT: movdqa {{.*#+}} xmm5 = [9223372039002259456,9223372039002259456] -; SSSE3-NEXT: movdqa %xmm1, %xmm7 -; SSSE3-NEXT: pxor %xmm5, %xmm7 -; SSSE3-NEXT: movdqa {{.*#+}} xmm9 = [9223372039002259711,9223372039002259711] -; SSSE3-NEXT: movdqa %xmm9, %xmm6 -; SSSE3-NEXT: pcmpgtd %xmm7, %xmm6 -; 
SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm6[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm9, %xmm7 +; SSSE3-NEXT: movdqa {{.*#+}} xmm2 = [9223372039002259456,9223372039002259456] +; SSSE3-NEXT: movdqa %xmm5, %xmm7 +; SSSE3-NEXT: pxor %xmm2, %xmm7 +; SSSE3-NEXT: movdqa {{.*#+}} xmm1 = [9223372039002259711,9223372039002259711] +; SSSE3-NEXT: movdqa %xmm1, %xmm3 +; SSSE3-NEXT: pcmpgtd %xmm7, %xmm3 +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm3[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm1, %xmm7 ; SSSE3-NEXT: pshufd {{.*#+}} xmm7 = xmm7[1,1,3,3] -; SSSE3-NEXT: pand %xmm4, %xmm7 -; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm6[1,1,3,3] -; SSSE3-NEXT: por %xmm7, %xmm4 -; SSSE3-NEXT: pand %xmm4, %xmm1 -; SSSE3-NEXT: pandn %xmm8, %xmm4 -; SSSE3-NEXT: por %xmm1, %xmm4 -; SSSE3-NEXT: movdqa %xmm0, %xmm1 -; SSSE3-NEXT: pxor %xmm5, %xmm1 -; SSSE3-NEXT: movdqa %xmm9, %xmm6 -; SSSE3-NEXT: pcmpgtd %xmm1, %xmm6 -; SSSE3-NEXT: pshufd {{.*#+}} xmm10 = xmm6[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm9, %xmm1 -; SSSE3-NEXT: pshufd {{.*#+}} xmm7 = xmm1[1,1,3,3] -; SSSE3-NEXT: pand %xmm10, %xmm7 -; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm6[1,1,3,3] -; SSSE3-NEXT: por %xmm7, %xmm1 -; SSSE3-NEXT: pand %xmm1, %xmm0 -; SSSE3-NEXT: pandn %xmm8, %xmm1 -; SSSE3-NEXT: por %xmm0, %xmm1 -; SSSE3-NEXT: packuswb %xmm4, %xmm1 -; SSSE3-NEXT: movdqa %xmm3, %xmm0 -; SSSE3-NEXT: pxor %xmm5, %xmm0 -; SSSE3-NEXT: movdqa %xmm9, %xmm4 -; SSSE3-NEXT: pcmpgtd %xmm0, %xmm4 -; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm4[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm9, %xmm0 +; SSSE3-NEXT: pand %xmm0, %xmm7 +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm3[1,1,3,3] +; SSSE3-NEXT: por %xmm7, %xmm0 +; SSSE3-NEXT: pand %xmm0, %xmm5 +; SSSE3-NEXT: pandn %xmm8, %xmm0 +; SSSE3-NEXT: por %xmm5, %xmm0 +; SSSE3-NEXT: movdqa %xmm6, %xmm3 +; SSSE3-NEXT: pxor %xmm2, %xmm3 +; SSSE3-NEXT: movdqa %xmm1, %xmm5 +; SSSE3-NEXT: pcmpgtd %xmm3, %xmm5 +; SSSE3-NEXT: pshufd {{.*#+}} xmm7 = xmm5[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm1, %xmm3 +; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm3[1,1,3,3] +; 
SSSE3-NEXT: pand %xmm7, %xmm3 +; SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm5[1,1,3,3] +; SSSE3-NEXT: por %xmm3, %xmm5 +; SSSE3-NEXT: pand %xmm5, %xmm6 +; SSSE3-NEXT: pandn %xmm8, %xmm5 +; SSSE3-NEXT: por %xmm6, %xmm5 +; SSSE3-NEXT: packuswb %xmm0, %xmm5 +; SSSE3-NEXT: movdqa %xmm4, %xmm0 +; SSSE3-NEXT: pxor %xmm2, %xmm0 +; SSSE3-NEXT: movdqa %xmm1, %xmm3 +; SSSE3-NEXT: pcmpgtd %xmm0, %xmm3 +; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm3[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm1, %xmm0 ; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] ; SSSE3-NEXT: pand %xmm6, %xmm0 -; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm4[1,1,3,3] -; SSSE3-NEXT: por %xmm0, %xmm4 -; SSSE3-NEXT: pand %xmm4, %xmm3 -; SSSE3-NEXT: pandn %xmm8, %xmm4 -; SSSE3-NEXT: por %xmm3, %xmm4 -; SSSE3-NEXT: pxor %xmm2, %xmm5 -; SSSE3-NEXT: movdqa %xmm9, %xmm0 -; SSSE3-NEXT: pcmpgtd %xmm5, %xmm0 -; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm0[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm9, %xmm5 -; SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm5[1,1,3,3] -; SSSE3-NEXT: pand %xmm3, %xmm5 +; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm3[1,1,3,3] +; SSSE3-NEXT: por %xmm0, %xmm3 +; SSSE3-NEXT: pand %xmm3, %xmm4 +; SSSE3-NEXT: pandn %xmm8, %xmm3 +; SSSE3-NEXT: por %xmm4, %xmm3 +; SSSE3-NEXT: pxor %xmm9, %xmm2 +; SSSE3-NEXT: movdqa %xmm1, %xmm0 +; SSSE3-NEXT: pcmpgtd %xmm2, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm0[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm1, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm2[1,1,3,3] +; SSSE3-NEXT: pand %xmm4, %xmm1 ; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] -; SSSE3-NEXT: por %xmm5, %xmm0 -; SSSE3-NEXT: pand %xmm0, %xmm2 +; SSSE3-NEXT: por %xmm1, %xmm0 +; SSSE3-NEXT: pand %xmm0, %xmm9 ; SSSE3-NEXT: pandn %xmm8, %xmm0 -; SSSE3-NEXT: por %xmm2, %xmm0 -; SSSE3-NEXT: packuswb %xmm4, %xmm0 -; SSSE3-NEXT: packuswb %xmm0, %xmm1 -; SSSE3-NEXT: packuswb %xmm0, %xmm1 -; SSSE3-NEXT: movq %xmm1, (%rdi) +; SSSE3-NEXT: por %xmm9, %xmm0 +; SSSE3-NEXT: packuswb %xmm3, %xmm0 +; SSSE3-NEXT: packuswb %xmm0, %xmm5 +; SSSE3-NEXT: 
packuswb %xmm0, %xmm5 +; SSSE3-NEXT: movq %xmm5, (%rsi) ; SSSE3-NEXT: retq ; ; SSE41-LABEL: trunc_usat_v8i64_v8i8_store: ; SSE41: # %bb.0: -; SSE41-NEXT: movdqa %xmm0, %xmm8 -; SSE41-NEXT: movapd {{.*#+}} xmm9 = [255,255] -; SSE41-NEXT: movdqa {{.*#+}} xmm7 = [9223372039002259456,9223372039002259456] -; SSE41-NEXT: movdqa %xmm1, %xmm0 -; SSE41-NEXT: pxor %xmm7, %xmm0 -; SSE41-NEXT: movdqa {{.*#+}} xmm4 = [9223372039002259711,9223372039002259711] -; SSE41-NEXT: movdqa %xmm4, %xmm5 +; SSE41-NEXT: movdqa (%rdi), %xmm7 +; SSE41-NEXT: movdqa 16(%rdi), %xmm6 +; SSE41-NEXT: movdqa 32(%rdi), %xmm8 +; SSE41-NEXT: movdqa 48(%rdi), %xmm9 +; SSE41-NEXT: movapd {{.*#+}} xmm1 = [255,255] +; SSE41-NEXT: movdqa {{.*#+}} xmm4 = [9223372039002259456,9223372039002259456] +; SSE41-NEXT: movdqa %xmm6, %xmm0 +; SSE41-NEXT: pxor %xmm4, %xmm0 +; SSE41-NEXT: movdqa {{.*#+}} xmm2 = [9223372039002259711,9223372039002259711] +; SSE41-NEXT: movdqa %xmm2, %xmm5 ; SSE41-NEXT: pcmpeqd %xmm0, %xmm5 -; SSE41-NEXT: movdqa %xmm4, %xmm6 -; SSE41-NEXT: pcmpgtd %xmm0, %xmm6 -; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm6[0,0,2,2] +; SSE41-NEXT: movdqa %xmm2, %xmm3 +; SSE41-NEXT: pcmpgtd %xmm0, %xmm3 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm3[0,0,2,2] ; SSE41-NEXT: pand %xmm5, %xmm0 -; SSE41-NEXT: por %xmm6, %xmm0 -; SSE41-NEXT: movapd %xmm9, %xmm5 -; SSE41-NEXT: blendvpd %xmm0, %xmm1, %xmm5 -; SSE41-NEXT: movdqa %xmm8, %xmm0 -; SSE41-NEXT: pxor %xmm7, %xmm0 -; SSE41-NEXT: movdqa %xmm4, %xmm1 -; SSE41-NEXT: pcmpeqd %xmm0, %xmm1 -; SSE41-NEXT: movdqa %xmm4, %xmm6 -; SSE41-NEXT: pcmpgtd %xmm0, %xmm6 -; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm6[0,0,2,2] -; SSE41-NEXT: pand %xmm1, %xmm0 -; SSE41-NEXT: por %xmm6, %xmm0 -; SSE41-NEXT: movapd %xmm9, %xmm1 -; SSE41-NEXT: blendvpd %xmm0, %xmm8, %xmm1 -; SSE41-NEXT: packusdw %xmm5, %xmm1 -; SSE41-NEXT: movdqa %xmm3, %xmm0 -; SSE41-NEXT: pxor %xmm7, %xmm0 -; SSE41-NEXT: movdqa %xmm4, %xmm5 +; SSE41-NEXT: por %xmm3, %xmm0 +; SSE41-NEXT: movapd %xmm1, %xmm3 +; SSE41-NEXT: 
blendvpd %xmm0, %xmm6, %xmm3 +; SSE41-NEXT: movdqa %xmm7, %xmm0 +; SSE41-NEXT: pxor %xmm4, %xmm0 +; SSE41-NEXT: movdqa %xmm2, %xmm5 ; SSE41-NEXT: pcmpeqd %xmm0, %xmm5 -; SSE41-NEXT: movdqa %xmm4, %xmm6 +; SSE41-NEXT: movdqa %xmm2, %xmm6 ; SSE41-NEXT: pcmpgtd %xmm0, %xmm6 ; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm6[0,0,2,2] ; SSE41-NEXT: pand %xmm5, %xmm0 ; SSE41-NEXT: por %xmm6, %xmm0 -; SSE41-NEXT: movapd %xmm9, %xmm5 -; SSE41-NEXT: blendvpd %xmm0, %xmm3, %xmm5 -; SSE41-NEXT: pxor %xmm2, %xmm7 -; SSE41-NEXT: movdqa %xmm4, %xmm3 -; SSE41-NEXT: pcmpeqd %xmm7, %xmm3 -; SSE41-NEXT: pcmpgtd %xmm7, %xmm4 -; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm4[0,0,2,2] +; SSE41-NEXT: movapd %xmm1, %xmm6 +; SSE41-NEXT: blendvpd %xmm0, %xmm7, %xmm6 +; SSE41-NEXT: packusdw %xmm3, %xmm6 +; SSE41-NEXT: movdqa %xmm9, %xmm0 +; SSE41-NEXT: pxor %xmm4, %xmm0 +; SSE41-NEXT: movdqa %xmm2, %xmm3 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm3 +; SSE41-NEXT: movdqa %xmm2, %xmm5 +; SSE41-NEXT: pcmpgtd %xmm0, %xmm5 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm5[0,0,2,2] ; SSE41-NEXT: pand %xmm3, %xmm0 -; SSE41-NEXT: por %xmm4, %xmm0 -; SSE41-NEXT: blendvpd %xmm0, %xmm2, %xmm9 -; SSE41-NEXT: packusdw %xmm5, %xmm9 -; SSE41-NEXT: packusdw %xmm9, %xmm1 -; SSE41-NEXT: packuswb %xmm0, %xmm1 -; SSE41-NEXT: movq %xmm1, (%rdi) +; SSE41-NEXT: por %xmm5, %xmm0 +; SSE41-NEXT: movapd %xmm1, %xmm3 +; SSE41-NEXT: blendvpd %xmm0, %xmm9, %xmm3 +; SSE41-NEXT: pxor %xmm8, %xmm4 +; SSE41-NEXT: movdqa %xmm2, %xmm5 +; SSE41-NEXT: pcmpeqd %xmm4, %xmm5 +; SSE41-NEXT: pcmpgtd %xmm4, %xmm2 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm2[0,0,2,2] +; SSE41-NEXT: pand %xmm5, %xmm0 +; SSE41-NEXT: por %xmm2, %xmm0 +; SSE41-NEXT: blendvpd %xmm0, %xmm8, %xmm1 +; SSE41-NEXT: packusdw %xmm3, %xmm1 +; SSE41-NEXT: packusdw %xmm1, %xmm6 +; SSE41-NEXT: packuswb %xmm0, %xmm6 +; SSE41-NEXT: movq %xmm6, (%rsi) ; SSE41-NEXT: retq ; ; AVX1-LABEL: trunc_usat_v8i64_v8i8_store: ; AVX1: # %bb.0: -; AVX1-NEXT: vmovdqa {{.*#+}} xmm2 = 
[9223372036854775808,9223372036854775808] -; AVX1-NEXT: vpxor %xmm2, %xmm0, %xmm3 -; AVX1-NEXT: vmovdqa {{.*#+}} xmm4 = [9223372036854776063,9223372036854776063] -; AVX1-NEXT: vpcmpgtq %xmm3, %xmm4, %xmm8 -; AVX1-NEXT: vextractf128 $1, %ymm0, %xmm5 -; AVX1-NEXT: vpxor %xmm2, %xmm5, %xmm6 -; AVX1-NEXT: vpcmpgtq %xmm6, %xmm4, %xmm6 -; AVX1-NEXT: vpxor %xmm2, %xmm1, %xmm7 -; AVX1-NEXT: vpcmpgtq %xmm7, %xmm4, %xmm7 -; AVX1-NEXT: vextractf128 $1, %ymm1, %xmm3 -; AVX1-NEXT: vpxor %xmm2, %xmm3, %xmm2 -; AVX1-NEXT: vpcmpgtq %xmm2, %xmm4, %xmm2 -; AVX1-NEXT: vmovapd {{.*#+}} xmm4 = [255,255] -; AVX1-NEXT: vblendvpd %xmm2, %xmm3, %xmm4, %xmm2 -; AVX1-NEXT: vblendvpd %xmm7, %xmm1, %xmm4, %xmm1 -; AVX1-NEXT: vpackusdw %xmm2, %xmm1, %xmm1 -; AVX1-NEXT: vblendvpd %xmm6, %xmm5, %xmm4, %xmm2 -; AVX1-NEXT: vblendvpd %xmm8, %xmm0, %xmm4, %xmm0 -; AVX1-NEXT: vpackusdw %xmm2, %xmm0, %xmm0 +; AVX1-NEXT: vmovdqa (%rdi), %xmm0 +; AVX1-NEXT: vmovdqa 16(%rdi), %xmm1 +; AVX1-NEXT: vmovdqa 32(%rdi), %xmm2 +; AVX1-NEXT: vmovdqa 48(%rdi), %xmm3 +; AVX1-NEXT: vmovdqa {{.*#+}} xmm4 = [9223372036854775808,9223372036854775808] +; AVX1-NEXT: vpxor %xmm4, %xmm0, %xmm5 +; AVX1-NEXT: vmovdqa {{.*#+}} xmm6 = [9223372036854776063,9223372036854776063] +; AVX1-NEXT: vpcmpgtq %xmm5, %xmm6, %xmm8 +; AVX1-NEXT: vpxor %xmm4, %xmm1, %xmm7 +; AVX1-NEXT: vpcmpgtq %xmm7, %xmm6, %xmm7 +; AVX1-NEXT: vpxor %xmm4, %xmm2, %xmm5 +; AVX1-NEXT: vpcmpgtq %xmm5, %xmm6, %xmm5 +; AVX1-NEXT: vpxor %xmm4, %xmm3, %xmm4 +; AVX1-NEXT: vpcmpgtq %xmm4, %xmm6, %xmm4 +; AVX1-NEXT: vmovapd {{.*#+}} xmm6 = [255,255] +; AVX1-NEXT: vblendvpd %xmm4, %xmm3, %xmm6, %xmm3 +; AVX1-NEXT: vblendvpd %xmm5, %xmm2, %xmm6, %xmm2 +; AVX1-NEXT: vpackusdw %xmm3, %xmm2, %xmm2 +; AVX1-NEXT: vblendvpd %xmm7, %xmm1, %xmm6, %xmm1 +; AVX1-NEXT: vblendvpd %xmm8, %xmm0, %xmm6, %xmm0 ; AVX1-NEXT: vpackusdw %xmm1, %xmm0, %xmm0 +; AVX1-NEXT: vpackusdw %xmm2, %xmm0, %xmm0 ; AVX1-NEXT: vpackuswb %xmm0, %xmm0, %xmm0 -; AVX1-NEXT: vmovq %xmm0, (%rdi) -; AVX1-NEXT: 
vzeroupper
+; AVX1-NEXT: vmovq %xmm0, (%rsi)
 ; AVX1-NEXT: retq
 ;
 ; AVX2-LABEL: trunc_usat_v8i64_v8i8_store:
 ; AVX2: # %bb.0:
+; AVX2-NEXT: vmovdqa (%rdi), %ymm0
+; AVX2-NEXT: vmovdqa 32(%rdi), %ymm1
 ; AVX2-NEXT: vbroadcastsd {{.*#+}} ymm2 = [255,255,255,255]
 ; AVX2-NEXT: vpbroadcastq {{.*#+}} ymm3 = [9223372036854775808,9223372036854775808,9223372036854775808,9223372036854775808]
 ; AVX2-NEXT: vpxor %ymm3, %ymm0, %ymm4
@@ -2320,15 +2465,24 @@ define void @trunc_usat_v8i64_v8i8_store
 ; AVX2-NEXT: vpshufb %xmm3, %xmm0, %xmm0
 ; AVX2-NEXT: vpunpcklwd {{.*#+}} xmm0 = xmm0[0],xmm2[0],xmm0[1],xmm2[1],xmm0[2],xmm2[2],xmm0[3],xmm2[3]
 ; AVX2-NEXT: vpblendd {{.*#+}} xmm0 = xmm0[0],xmm1[1],xmm0[2,3]
-; AVX2-NEXT: vmovq %xmm0, (%rdi)
+; AVX2-NEXT: vmovq %xmm0, (%rsi)
 ; AVX2-NEXT: vzeroupper
 ; AVX2-NEXT: retq
 ;
 ; AVX512-LABEL: trunc_usat_v8i64_v8i8_store:
 ; AVX512: # %bb.0:
-; AVX512-NEXT: vpmovusqb %zmm0, (%rdi)
+; AVX512-NEXT: vmovdqa64 (%rdi), %zmm0
+; AVX512-NEXT: vpmovusqb %zmm0, (%rsi)
 ; AVX512-NEXT: vzeroupper
 ; AVX512-NEXT: retq
+;
+; SKX-LABEL: trunc_usat_v8i64_v8i8_store:
+; SKX: # %bb.0:
+; SKX-NEXT: vmovdqa64 (%rdi), %zmm0
+; SKX-NEXT: vpmovusqb %zmm0, (%rsi)
+; SKX-NEXT: vzeroupper
+; SKX-NEXT: retq
+  %a0 = load <8 x i64>, <8 x i64>* %p0
   %1 = icmp ult <8 x i64> %a0,
   %2 = select <8 x i1> %1, <8 x i64> %a0, <8 x i64>
   %3 = trunc <8 x i64> %2 to <8 x i8>
@@ -2336,119 +2490,127 @@ define void @trunc_usat_v8i64_v8i8_store
   ret void
 }
-define <16 x i8> @trunc_usat_v16i64_v16i8(<16 x i64> %a0) {
+define <16 x i8> @trunc_usat_v16i64_v16i8(<16 x i64>* %p0) {
 ; SSE2-LABEL: trunc_usat_v16i64_v16i8:
 ; SSE2: # %bb.0:
+; SSE2-NEXT: movdqa 96(%rdi), %xmm9
+; SSE2-NEXT: movdqa 112(%rdi), %xmm10
+; SSE2-NEXT: movdqa 64(%rdi), %xmm11
+; SSE2-NEXT: movdqa 80(%rdi), %xmm12
+; SSE2-NEXT: movdqa (%rdi), %xmm3
+; SSE2-NEXT: movdqa 16(%rdi), %xmm6
+; SSE2-NEXT: movdqa 32(%rdi), %xmm13
+; SSE2-NEXT: movdqa 48(%rdi), %xmm1
 ; SSE2-NEXT: movdqa {{.*#+}} xmm8 = [255,255]
-; SSE2-NEXT: movdqa
{{.*#+}} xmm9 = [9223372039002259456,9223372039002259456] -; SSE2-NEXT: movdqa %xmm1, %xmm11 -; SSE2-NEXT: pxor %xmm9, %xmm11 -; SSE2-NEXT: movdqa {{.*#+}} xmm10 = [9223372039002259711,9223372039002259711] -; SSE2-NEXT: movdqa %xmm10, %xmm12 -; SSE2-NEXT: pcmpgtd %xmm11, %xmm12 -; SSE2-NEXT: pshufd {{.*#+}} xmm13 = xmm12[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm10, %xmm11 -; SSE2-NEXT: pshufd {{.*#+}} xmm11 = xmm11[1,1,3,3] -; SSE2-NEXT: pand %xmm13, %xmm11 -; SSE2-NEXT: pshufd {{.*#+}} xmm12 = xmm12[1,1,3,3] -; SSE2-NEXT: por %xmm11, %xmm12 -; SSE2-NEXT: pand %xmm12, %xmm1 -; SSE2-NEXT: pandn %xmm8, %xmm12 -; SSE2-NEXT: por %xmm1, %xmm12 -; SSE2-NEXT: movdqa %xmm0, %xmm1 -; SSE2-NEXT: pxor %xmm9, %xmm1 -; SSE2-NEXT: movdqa %xmm10, %xmm11 -; SSE2-NEXT: pcmpgtd %xmm1, %xmm11 -; SSE2-NEXT: pshufd {{.*#+}} xmm13 = xmm11[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm10, %xmm1 -; SSE2-NEXT: pshufd {{.*#+}} xmm14 = xmm1[1,1,3,3] -; SSE2-NEXT: pand %xmm13, %xmm14 -; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm11[1,1,3,3] -; SSE2-NEXT: por %xmm14, %xmm1 -; SSE2-NEXT: pand %xmm1, %xmm0 -; SSE2-NEXT: pandn %xmm8, %xmm1 -; SSE2-NEXT: por %xmm1, %xmm0 -; SSE2-NEXT: packuswb %xmm12, %xmm0 -; SSE2-NEXT: movdqa %xmm3, %xmm1 -; SSE2-NEXT: pxor %xmm9, %xmm1 -; SSE2-NEXT: movdqa %xmm10, %xmm11 -; SSE2-NEXT: pcmpgtd %xmm1, %xmm11 -; SSE2-NEXT: pshufd {{.*#+}} xmm12 = xmm11[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm10, %xmm1 -; SSE2-NEXT: pshufd {{.*#+}} xmm13 = xmm1[1,1,3,3] -; SSE2-NEXT: pand %xmm12, %xmm13 -; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm11[1,1,3,3] -; SSE2-NEXT: por %xmm13, %xmm1 -; SSE2-NEXT: pand %xmm1, %xmm3 -; SSE2-NEXT: pandn %xmm8, %xmm1 -; SSE2-NEXT: por %xmm3, %xmm1 -; SSE2-NEXT: movdqa %xmm2, %xmm3 -; SSE2-NEXT: pxor %xmm9, %xmm3 -; SSE2-NEXT: movdqa %xmm10, %xmm11 -; SSE2-NEXT: pcmpgtd %xmm3, %xmm11 -; SSE2-NEXT: pshufd {{.*#+}} xmm12 = xmm11[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm10, %xmm3 -; SSE2-NEXT: pshufd {{.*#+}} xmm13 = xmm3[1,1,3,3] -; SSE2-NEXT: pand %xmm12, %xmm13 -; SSE2-NEXT: pshufd 
{{.*#+}} xmm3 = xmm11[1,1,3,3] -; SSE2-NEXT: por %xmm13, %xmm3 -; SSE2-NEXT: pand %xmm3, %xmm2 -; SSE2-NEXT: pandn %xmm8, %xmm3 +; SSE2-NEXT: movdqa {{.*#+}} xmm4 = [9223372039002259456,9223372039002259456] +; SSE2-NEXT: movdqa %xmm6, %xmm0 +; SSE2-NEXT: pxor %xmm4, %xmm0 +; SSE2-NEXT: movdqa {{.*#+}} xmm14 = [9223372039002259711,9223372039002259711] +; SSE2-NEXT: movdqa %xmm14, %xmm5 +; SSE2-NEXT: pcmpgtd %xmm0, %xmm5 +; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm5[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm14, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] +; SSE2-NEXT: pand %xmm2, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm5[1,1,3,3] +; SSE2-NEXT: por %xmm0, %xmm2 +; SSE2-NEXT: pand %xmm2, %xmm6 +; SSE2-NEXT: pandn %xmm8, %xmm2 +; SSE2-NEXT: por %xmm6, %xmm2 +; SSE2-NEXT: movdqa %xmm3, %xmm0 +; SSE2-NEXT: pxor %xmm4, %xmm0 +; SSE2-NEXT: movdqa %xmm14, %xmm5 +; SSE2-NEXT: pcmpgtd %xmm0, %xmm5 +; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm5[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm14, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm7 = xmm0[1,1,3,3] +; SSE2-NEXT: pand %xmm6, %xmm7 +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm5[1,1,3,3] +; SSE2-NEXT: por %xmm7, %xmm0 +; SSE2-NEXT: pand %xmm0, %xmm3 +; SSE2-NEXT: pandn %xmm8, %xmm0 +; SSE2-NEXT: por %xmm3, %xmm0 +; SSE2-NEXT: packuswb %xmm2, %xmm0 +; SSE2-NEXT: movdqa %xmm1, %xmm2 +; SSE2-NEXT: pxor %xmm4, %xmm2 +; SSE2-NEXT: movdqa %xmm14, %xmm3 +; SSE2-NEXT: pcmpgtd %xmm2, %xmm3 +; SSE2-NEXT: pshufd {{.*#+}} xmm5 = xmm3[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm14, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] +; SSE2-NEXT: pand %xmm5, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm3[1,1,3,3] ; SSE2-NEXT: por %xmm2, %xmm3 -; SSE2-NEXT: packuswb %xmm1, %xmm3 -; SSE2-NEXT: packuswb %xmm3, %xmm0 -; SSE2-NEXT: movdqa %xmm5, %xmm1 -; SSE2-NEXT: pxor %xmm9, %xmm1 -; SSE2-NEXT: movdqa %xmm10, %xmm2 +; SSE2-NEXT: pand %xmm3, %xmm1 +; SSE2-NEXT: pandn %xmm8, %xmm3 +; SSE2-NEXT: por %xmm1, %xmm3 +; SSE2-NEXT: movdqa %xmm13, %xmm1 +; SSE2-NEXT: 
pxor %xmm4, %xmm1 +; SSE2-NEXT: movdqa %xmm14, %xmm2 +; SSE2-NEXT: pcmpgtd %xmm1, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm5 = xmm2[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm14, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] +; SSE2-NEXT: pand %xmm5, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] +; SSE2-NEXT: por %xmm1, %xmm2 +; SSE2-NEXT: pand %xmm2, %xmm13 +; SSE2-NEXT: pandn %xmm8, %xmm2 +; SSE2-NEXT: por %xmm13, %xmm2 +; SSE2-NEXT: packuswb %xmm3, %xmm2 +; SSE2-NEXT: packuswb %xmm2, %xmm0 +; SSE2-NEXT: movdqa %xmm12, %xmm1 +; SSE2-NEXT: pxor %xmm4, %xmm1 +; SSE2-NEXT: movdqa %xmm14, %xmm2 ; SSE2-NEXT: pcmpgtd %xmm1, %xmm2 ; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm2[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm10, %xmm1 +; SSE2-NEXT: pcmpeqd %xmm14, %xmm1 ; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] ; SSE2-NEXT: pand %xmm3, %xmm1 ; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] ; SSE2-NEXT: por %xmm1, %xmm2 -; SSE2-NEXT: pand %xmm2, %xmm5 +; SSE2-NEXT: pand %xmm2, %xmm12 ; SSE2-NEXT: pandn %xmm8, %xmm2 -; SSE2-NEXT: por %xmm5, %xmm2 -; SSE2-NEXT: movdqa %xmm4, %xmm1 -; SSE2-NEXT: pxor %xmm9, %xmm1 -; SSE2-NEXT: movdqa %xmm10, %xmm3 +; SSE2-NEXT: por %xmm12, %xmm2 +; SSE2-NEXT: movdqa %xmm11, %xmm1 +; SSE2-NEXT: pxor %xmm4, %xmm1 +; SSE2-NEXT: movdqa %xmm14, %xmm3 ; SSE2-NEXT: pcmpgtd %xmm1, %xmm3 -; SSE2-NEXT: pshufd {{.*#+}} xmm11 = xmm3[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm10, %xmm1 -; SSE2-NEXT: pshufd {{.*#+}} xmm5 = xmm1[1,1,3,3] -; SSE2-NEXT: pand %xmm11, %xmm5 +; SSE2-NEXT: pshufd {{.*#+}} xmm5 = xmm3[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm14, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm6 = xmm1[1,1,3,3] +; SSE2-NEXT: pand %xmm5, %xmm6 ; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm3[1,1,3,3] -; SSE2-NEXT: por %xmm5, %xmm1 -; SSE2-NEXT: pand %xmm1, %xmm4 +; SSE2-NEXT: por %xmm6, %xmm1 +; SSE2-NEXT: pand %xmm1, %xmm11 ; SSE2-NEXT: pandn %xmm8, %xmm1 -; SSE2-NEXT: por %xmm4, %xmm1 +; SSE2-NEXT: por %xmm11, %xmm1 ; SSE2-NEXT: packuswb %xmm2, %xmm1 -; SSE2-NEXT: movdqa 
%xmm7, %xmm2 -; SSE2-NEXT: pxor %xmm9, %xmm2 -; SSE2-NEXT: movdqa %xmm10, %xmm3 +; SSE2-NEXT: movdqa %xmm10, %xmm2 +; SSE2-NEXT: pxor %xmm4, %xmm2 +; SSE2-NEXT: movdqa %xmm14, %xmm3 ; SSE2-NEXT: pcmpgtd %xmm2, %xmm3 -; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm3[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm10, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm5 = xmm3[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm14, %xmm2 ; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] -; SSE2-NEXT: pand %xmm4, %xmm2 +; SSE2-NEXT: pand %xmm5, %xmm2 ; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm3[1,1,3,3] ; SSE2-NEXT: por %xmm2, %xmm3 -; SSE2-NEXT: pand %xmm3, %xmm7 +; SSE2-NEXT: pand %xmm3, %xmm10 ; SSE2-NEXT: pandn %xmm8, %xmm3 -; SSE2-NEXT: por %xmm7, %xmm3 -; SSE2-NEXT: pxor %xmm6, %xmm9 -; SSE2-NEXT: movdqa %xmm10, %xmm2 -; SSE2-NEXT: pcmpgtd %xmm9, %xmm2 -; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm2[0,0,2,2] -; SSE2-NEXT: pcmpeqd %xmm10, %xmm9 -; SSE2-NEXT: pshufd {{.*#+}} xmm5 = xmm9[1,1,3,3] -; SSE2-NEXT: pand %xmm4, %xmm5 +; SSE2-NEXT: por %xmm10, %xmm3 +; SSE2-NEXT: pxor %xmm9, %xmm4 +; SSE2-NEXT: movdqa %xmm14, %xmm2 +; SSE2-NEXT: pcmpgtd %xmm4, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm5 = xmm2[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm14, %xmm4 +; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm4[1,1,3,3] +; SSE2-NEXT: pand %xmm5, %xmm4 ; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] -; SSE2-NEXT: por %xmm5, %xmm2 -; SSE2-NEXT: pand %xmm2, %xmm6 +; SSE2-NEXT: por %xmm4, %xmm2 +; SSE2-NEXT: pand %xmm2, %xmm9 ; SSE2-NEXT: pandn %xmm8, %xmm2 -; SSE2-NEXT: por %xmm6, %xmm2 +; SSE2-NEXT: por %xmm9, %xmm2 ; SSE2-NEXT: packuswb %xmm3, %xmm2 ; SSE2-NEXT: packuswb %xmm2, %xmm1 ; SSE2-NEXT: packuswb %xmm1, %xmm0 @@ -2456,116 +2618,124 @@ define <16 x i8> @trunc_usat_v16i64_v16i ; ; SSSE3-LABEL: trunc_usat_v16i64_v16i8: ; SSSE3: # %bb.0: +; SSSE3-NEXT: movdqa 96(%rdi), %xmm9 +; SSSE3-NEXT: movdqa 112(%rdi), %xmm10 +; SSSE3-NEXT: movdqa 64(%rdi), %xmm11 +; SSSE3-NEXT: movdqa 80(%rdi), %xmm12 +; SSSE3-NEXT: movdqa (%rdi), %xmm3 +; SSSE3-NEXT: 
movdqa 16(%rdi), %xmm6 +; SSSE3-NEXT: movdqa 32(%rdi), %xmm13 +; SSSE3-NEXT: movdqa 48(%rdi), %xmm1 ; SSSE3-NEXT: movdqa {{.*#+}} xmm8 = [255,255] -; SSSE3-NEXT: movdqa {{.*#+}} xmm9 = [9223372039002259456,9223372039002259456] -; SSSE3-NEXT: movdqa %xmm1, %xmm11 -; SSSE3-NEXT: pxor %xmm9, %xmm11 -; SSSE3-NEXT: movdqa {{.*#+}} xmm10 = [9223372039002259711,9223372039002259711] -; SSSE3-NEXT: movdqa %xmm10, %xmm12 -; SSSE3-NEXT: pcmpgtd %xmm11, %xmm12 -; SSSE3-NEXT: pshufd {{.*#+}} xmm13 = xmm12[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm10, %xmm11 -; SSSE3-NEXT: pshufd {{.*#+}} xmm11 = xmm11[1,1,3,3] -; SSSE3-NEXT: pand %xmm13, %xmm11 -; SSSE3-NEXT: pshufd {{.*#+}} xmm12 = xmm12[1,1,3,3] -; SSSE3-NEXT: por %xmm11, %xmm12 -; SSSE3-NEXT: pand %xmm12, %xmm1 -; SSSE3-NEXT: pandn %xmm8, %xmm12 -; SSSE3-NEXT: por %xmm1, %xmm12 -; SSSE3-NEXT: movdqa %xmm0, %xmm1 -; SSSE3-NEXT: pxor %xmm9, %xmm1 -; SSSE3-NEXT: movdqa %xmm10, %xmm11 -; SSSE3-NEXT: pcmpgtd %xmm1, %xmm11 -; SSSE3-NEXT: pshufd {{.*#+}} xmm13 = xmm11[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm10, %xmm1 -; SSSE3-NEXT: pshufd {{.*#+}} xmm14 = xmm1[1,1,3,3] -; SSSE3-NEXT: pand %xmm13, %xmm14 -; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm11[1,1,3,3] -; SSSE3-NEXT: por %xmm14, %xmm1 -; SSSE3-NEXT: pand %xmm1, %xmm0 -; SSSE3-NEXT: pandn %xmm8, %xmm1 -; SSSE3-NEXT: por %xmm1, %xmm0 -; SSSE3-NEXT: packuswb %xmm12, %xmm0 -; SSSE3-NEXT: movdqa %xmm3, %xmm1 -; SSSE3-NEXT: pxor %xmm9, %xmm1 -; SSSE3-NEXT: movdqa %xmm10, %xmm11 -; SSSE3-NEXT: pcmpgtd %xmm1, %xmm11 -; SSSE3-NEXT: pshufd {{.*#+}} xmm12 = xmm11[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm10, %xmm1 -; SSSE3-NEXT: pshufd {{.*#+}} xmm13 = xmm1[1,1,3,3] -; SSSE3-NEXT: pand %xmm12, %xmm13 -; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm11[1,1,3,3] -; SSSE3-NEXT: por %xmm13, %xmm1 -; SSSE3-NEXT: pand %xmm1, %xmm3 -; SSSE3-NEXT: pandn %xmm8, %xmm1 -; SSSE3-NEXT: por %xmm3, %xmm1 -; SSSE3-NEXT: movdqa %xmm2, %xmm3 -; SSSE3-NEXT: pxor %xmm9, %xmm3 -; SSSE3-NEXT: movdqa %xmm10, %xmm11 -; SSSE3-NEXT: 
pcmpgtd %xmm3, %xmm11 -; SSSE3-NEXT: pshufd {{.*#+}} xmm12 = xmm11[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm10, %xmm3 -; SSSE3-NEXT: pshufd {{.*#+}} xmm13 = xmm3[1,1,3,3] -; SSSE3-NEXT: pand %xmm12, %xmm13 -; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm11[1,1,3,3] -; SSSE3-NEXT: por %xmm13, %xmm3 -; SSSE3-NEXT: pand %xmm3, %xmm2 -; SSSE3-NEXT: pandn %xmm8, %xmm3 +; SSSE3-NEXT: movdqa {{.*#+}} xmm4 = [9223372039002259456,9223372039002259456] +; SSSE3-NEXT: movdqa %xmm6, %xmm0 +; SSSE3-NEXT: pxor %xmm4, %xmm0 +; SSSE3-NEXT: movdqa {{.*#+}} xmm14 = [9223372039002259711,9223372039002259711] +; SSSE3-NEXT: movdqa %xmm14, %xmm5 +; SSSE3-NEXT: pcmpgtd %xmm0, %xmm5 +; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm5[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm14, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] +; SSSE3-NEXT: pand %xmm2, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm5[1,1,3,3] +; SSSE3-NEXT: por %xmm0, %xmm2 +; SSSE3-NEXT: pand %xmm2, %xmm6 +; SSSE3-NEXT: pandn %xmm8, %xmm2 +; SSSE3-NEXT: por %xmm6, %xmm2 +; SSSE3-NEXT: movdqa %xmm3, %xmm0 +; SSSE3-NEXT: pxor %xmm4, %xmm0 +; SSSE3-NEXT: movdqa %xmm14, %xmm5 +; SSSE3-NEXT: pcmpgtd %xmm0, %xmm5 +; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm5[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm14, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm7 = xmm0[1,1,3,3] +; SSSE3-NEXT: pand %xmm6, %xmm7 +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm5[1,1,3,3] +; SSSE3-NEXT: por %xmm7, %xmm0 +; SSSE3-NEXT: pand %xmm0, %xmm3 +; SSSE3-NEXT: pandn %xmm8, %xmm0 +; SSSE3-NEXT: por %xmm3, %xmm0 +; SSSE3-NEXT: packuswb %xmm2, %xmm0 +; SSSE3-NEXT: movdqa %xmm1, %xmm2 +; SSSE3-NEXT: pxor %xmm4, %xmm2 +; SSSE3-NEXT: movdqa %xmm14, %xmm3 +; SSSE3-NEXT: pcmpgtd %xmm2, %xmm3 +; SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm3[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm14, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] +; SSSE3-NEXT: pand %xmm5, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm3[1,1,3,3] ; SSSE3-NEXT: por %xmm2, %xmm3 -; SSSE3-NEXT: packuswb %xmm1, %xmm3 -; SSSE3-NEXT: 
packuswb %xmm3, %xmm0 -; SSSE3-NEXT: movdqa %xmm5, %xmm1 -; SSSE3-NEXT: pxor %xmm9, %xmm1 -; SSSE3-NEXT: movdqa %xmm10, %xmm2 +; SSSE3-NEXT: pand %xmm3, %xmm1 +; SSSE3-NEXT: pandn %xmm8, %xmm3 +; SSSE3-NEXT: por %xmm1, %xmm3 +; SSSE3-NEXT: movdqa %xmm13, %xmm1 +; SSSE3-NEXT: pxor %xmm4, %xmm1 +; SSSE3-NEXT: movdqa %xmm14, %xmm2 +; SSSE3-NEXT: pcmpgtd %xmm1, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm2[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm14, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] +; SSSE3-NEXT: pand %xmm5, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] +; SSSE3-NEXT: por %xmm1, %xmm2 +; SSSE3-NEXT: pand %xmm2, %xmm13 +; SSSE3-NEXT: pandn %xmm8, %xmm2 +; SSSE3-NEXT: por %xmm13, %xmm2 +; SSSE3-NEXT: packuswb %xmm3, %xmm2 +; SSSE3-NEXT: packuswb %xmm2, %xmm0 +; SSSE3-NEXT: movdqa %xmm12, %xmm1 +; SSSE3-NEXT: pxor %xmm4, %xmm1 +; SSSE3-NEXT: movdqa %xmm14, %xmm2 ; SSSE3-NEXT: pcmpgtd %xmm1, %xmm2 ; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm2[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm10, %xmm1 +; SSSE3-NEXT: pcmpeqd %xmm14, %xmm1 ; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] ; SSSE3-NEXT: pand %xmm3, %xmm1 ; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] ; SSSE3-NEXT: por %xmm1, %xmm2 -; SSSE3-NEXT: pand %xmm2, %xmm5 +; SSSE3-NEXT: pand %xmm2, %xmm12 ; SSSE3-NEXT: pandn %xmm8, %xmm2 -; SSSE3-NEXT: por %xmm5, %xmm2 -; SSSE3-NEXT: movdqa %xmm4, %xmm1 -; SSSE3-NEXT: pxor %xmm9, %xmm1 -; SSSE3-NEXT: movdqa %xmm10, %xmm3 +; SSSE3-NEXT: por %xmm12, %xmm2 +; SSSE3-NEXT: movdqa %xmm11, %xmm1 +; SSSE3-NEXT: pxor %xmm4, %xmm1 +; SSSE3-NEXT: movdqa %xmm14, %xmm3 ; SSSE3-NEXT: pcmpgtd %xmm1, %xmm3 -; SSSE3-NEXT: pshufd {{.*#+}} xmm11 = xmm3[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm10, %xmm1 -; SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm1[1,1,3,3] -; SSSE3-NEXT: pand %xmm11, %xmm5 +; SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm3[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm14, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm6 = xmm1[1,1,3,3] +; SSSE3-NEXT: pand %xmm5, %xmm6 ; 
SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm3[1,1,3,3] -; SSSE3-NEXT: por %xmm5, %xmm1 -; SSSE3-NEXT: pand %xmm1, %xmm4 +; SSSE3-NEXT: por %xmm6, %xmm1 +; SSSE3-NEXT: pand %xmm1, %xmm11 ; SSSE3-NEXT: pandn %xmm8, %xmm1 -; SSSE3-NEXT: por %xmm4, %xmm1 +; SSSE3-NEXT: por %xmm11, %xmm1 ; SSSE3-NEXT: packuswb %xmm2, %xmm1 -; SSSE3-NEXT: movdqa %xmm7, %xmm2 -; SSSE3-NEXT: pxor %xmm9, %xmm2 -; SSSE3-NEXT: movdqa %xmm10, %xmm3 +; SSSE3-NEXT: movdqa %xmm10, %xmm2 +; SSSE3-NEXT: pxor %xmm4, %xmm2 +; SSSE3-NEXT: movdqa %xmm14, %xmm3 ; SSSE3-NEXT: pcmpgtd %xmm2, %xmm3 -; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm3[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm10, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm3[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm14, %xmm2 ; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] -; SSSE3-NEXT: pand %xmm4, %xmm2 +; SSSE3-NEXT: pand %xmm5, %xmm2 ; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm3[1,1,3,3] ; SSSE3-NEXT: por %xmm2, %xmm3 -; SSSE3-NEXT: pand %xmm3, %xmm7 +; SSSE3-NEXT: pand %xmm3, %xmm10 ; SSSE3-NEXT: pandn %xmm8, %xmm3 -; SSSE3-NEXT: por %xmm7, %xmm3 -; SSSE3-NEXT: pxor %xmm6, %xmm9 -; SSSE3-NEXT: movdqa %xmm10, %xmm2 -; SSSE3-NEXT: pcmpgtd %xmm9, %xmm2 -; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm2[0,0,2,2] -; SSSE3-NEXT: pcmpeqd %xmm10, %xmm9 -; SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm9[1,1,3,3] -; SSSE3-NEXT: pand %xmm4, %xmm5 +; SSSE3-NEXT: por %xmm10, %xmm3 +; SSSE3-NEXT: pxor %xmm9, %xmm4 +; SSSE3-NEXT: movdqa %xmm14, %xmm2 +; SSSE3-NEXT: pcmpgtd %xmm4, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm2[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm14, %xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm4[1,1,3,3] +; SSSE3-NEXT: pand %xmm5, %xmm4 ; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] -; SSSE3-NEXT: por %xmm5, %xmm2 -; SSSE3-NEXT: pand %xmm2, %xmm6 +; SSSE3-NEXT: por %xmm4, %xmm2 +; SSSE3-NEXT: pand %xmm2, %xmm9 ; SSSE3-NEXT: pandn %xmm8, %xmm2 -; SSSE3-NEXT: por %xmm6, %xmm2 +; SSSE3-NEXT: por %xmm9, %xmm2 ; SSSE3-NEXT: packuswb %xmm3, %xmm2 ; SSSE3-NEXT: packuswb 
%xmm2, %xmm1 ; SSSE3-NEXT: packuswb %xmm1, %xmm0 @@ -2573,155 +2743,168 @@ define <16 x i8> @trunc_usat_v16i64_v16i ; ; SSE41-LABEL: trunc_usat_v16i64_v16i8: ; SSE41: # %bb.0: -; SSE41-NEXT: movdqa %xmm0, %xmm8 -; SSE41-NEXT: movapd {{.*#+}} xmm9 = [255,255] -; SSE41-NEXT: movdqa {{.*#+}} xmm11 = [9223372039002259456,9223372039002259456] +; SSE41-NEXT: movdqa 96(%rdi), %xmm8 +; SSE41-NEXT: movdqa 112(%rdi), %xmm9 +; SSE41-NEXT: movdqa 64(%rdi), %xmm10 +; SSE41-NEXT: movdqa 80(%rdi), %xmm11 +; SSE41-NEXT: movdqa (%rdi), %xmm2 +; SSE41-NEXT: movdqa 16(%rdi), %xmm1 +; SSE41-NEXT: movdqa 32(%rdi), %xmm12 +; SSE41-NEXT: movdqa 48(%rdi), %xmm13 +; SSE41-NEXT: movapd {{.*#+}} xmm3 = [255,255] +; SSE41-NEXT: movdqa {{.*#+}} xmm6 = [9223372039002259456,9223372039002259456] ; SSE41-NEXT: movdqa %xmm1, %xmm0 -; SSE41-NEXT: pxor %xmm11, %xmm0 -; SSE41-NEXT: movdqa {{.*#+}} xmm10 = [9223372039002259711,9223372039002259711] -; SSE41-NEXT: movdqa %xmm10, %xmm12 -; SSE41-NEXT: pcmpeqd %xmm0, %xmm12 -; SSE41-NEXT: movdqa %xmm10, %xmm13 -; SSE41-NEXT: pcmpgtd %xmm0, %xmm13 -; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm13[0,0,2,2] -; SSE41-NEXT: pand %xmm12, %xmm0 -; SSE41-NEXT: por %xmm13, %xmm0 -; SSE41-NEXT: movapd %xmm9, %xmm12 -; SSE41-NEXT: blendvpd %xmm0, %xmm1, %xmm12 -; SSE41-NEXT: movdqa %xmm8, %xmm0 -; SSE41-NEXT: pxor %xmm11, %xmm0 -; SSE41-NEXT: movdqa %xmm10, %xmm13 -; SSE41-NEXT: pcmpeqd %xmm0, %xmm13 -; SSE41-NEXT: movdqa %xmm10, %xmm1 -; SSE41-NEXT: pcmpgtd %xmm0, %xmm1 -; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm1[0,0,2,2] -; SSE41-NEXT: pand %xmm13, %xmm0 -; SSE41-NEXT: por %xmm1, %xmm0 -; SSE41-NEXT: movapd %xmm9, %xmm13 -; SSE41-NEXT: blendvpd %xmm0, %xmm8, %xmm13 -; SSE41-NEXT: packusdw %xmm12, %xmm13 -; SSE41-NEXT: movdqa %xmm3, %xmm0 -; SSE41-NEXT: pxor %xmm11, %xmm0 -; SSE41-NEXT: movdqa %xmm10, %xmm8 -; SSE41-NEXT: pcmpeqd %xmm0, %xmm8 -; SSE41-NEXT: movdqa %xmm10, %xmm1 -; SSE41-NEXT: pcmpgtd %xmm0, %xmm1 -; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm1[0,0,2,2] -; 
SSE41-NEXT: pand %xmm8, %xmm0 -; SSE41-NEXT: por %xmm1, %xmm0 -; SSE41-NEXT: movapd %xmm9, %xmm8 -; SSE41-NEXT: blendvpd %xmm0, %xmm3, %xmm8 +; SSE41-NEXT: pxor %xmm6, %xmm0 +; SSE41-NEXT: movdqa {{.*#+}} xmm4 = [9223372039002259711,9223372039002259711] +; SSE41-NEXT: movdqa %xmm4, %xmm7 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm7 +; SSE41-NEXT: movdqa %xmm4, %xmm5 +; SSE41-NEXT: pcmpgtd %xmm0, %xmm5 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm5[0,0,2,2] +; SSE41-NEXT: pand %xmm7, %xmm0 +; SSE41-NEXT: por %xmm5, %xmm0 +; SSE41-NEXT: movapd %xmm3, %xmm5 +; SSE41-NEXT: blendvpd %xmm0, %xmm1, %xmm5 ; SSE41-NEXT: movdqa %xmm2, %xmm0 -; SSE41-NEXT: pxor %xmm11, %xmm0 -; SSE41-NEXT: movdqa %xmm10, %xmm3 -; SSE41-NEXT: pcmpeqd %xmm0, %xmm3 -; SSE41-NEXT: movdqa %xmm10, %xmm1 -; SSE41-NEXT: pcmpgtd %xmm0, %xmm1 -; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm1[0,0,2,2] -; SSE41-NEXT: pand %xmm3, %xmm0 -; SSE41-NEXT: por %xmm1, %xmm0 -; SSE41-NEXT: movapd %xmm9, %xmm1 -; SSE41-NEXT: blendvpd %xmm0, %xmm2, %xmm1 -; SSE41-NEXT: packusdw %xmm8, %xmm1 -; SSE41-NEXT: packusdw %xmm1, %xmm13 -; SSE41-NEXT: movdqa %xmm5, %xmm0 -; SSE41-NEXT: pxor %xmm11, %xmm0 -; SSE41-NEXT: movdqa %xmm10, %xmm1 +; SSE41-NEXT: pxor %xmm6, %xmm0 +; SSE41-NEXT: movdqa %xmm4, %xmm1 ; SSE41-NEXT: pcmpeqd %xmm0, %xmm1 -; SSE41-NEXT: movdqa %xmm10, %xmm2 -; SSE41-NEXT: pcmpgtd %xmm0, %xmm2 -; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm2[0,0,2,2] +; SSE41-NEXT: movdqa %xmm4, %xmm7 +; SSE41-NEXT: pcmpgtd %xmm0, %xmm7 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm7[0,0,2,2] ; SSE41-NEXT: pand %xmm1, %xmm0 -; SSE41-NEXT: por %xmm2, %xmm0 -; SSE41-NEXT: movapd %xmm9, %xmm1 -; SSE41-NEXT: blendvpd %xmm0, %xmm5, %xmm1 -; SSE41-NEXT: movdqa %xmm4, %xmm0 -; SSE41-NEXT: pxor %xmm11, %xmm0 -; SSE41-NEXT: movdqa %xmm10, %xmm2 +; SSE41-NEXT: por %xmm7, %xmm0 +; SSE41-NEXT: movapd %xmm3, %xmm1 +; SSE41-NEXT: blendvpd %xmm0, %xmm2, %xmm1 +; SSE41-NEXT: packusdw %xmm5, %xmm1 +; SSE41-NEXT: movdqa %xmm13, %xmm0 +; SSE41-NEXT: pxor %xmm6, %xmm0 +; 
SSE41-NEXT: movdqa %xmm4, %xmm2 ; SSE41-NEXT: pcmpeqd %xmm0, %xmm2 -; SSE41-NEXT: movdqa %xmm10, %xmm3 -; SSE41-NEXT: pcmpgtd %xmm0, %xmm3 -; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm3[0,0,2,2] +; SSE41-NEXT: movdqa %xmm4, %xmm5 +; SSE41-NEXT: pcmpgtd %xmm0, %xmm5 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm5[0,0,2,2] ; SSE41-NEXT: pand %xmm2, %xmm0 -; SSE41-NEXT: por %xmm3, %xmm0 -; SSE41-NEXT: movapd %xmm9, %xmm2 -; SSE41-NEXT: blendvpd %xmm0, %xmm4, %xmm2 -; SSE41-NEXT: packusdw %xmm1, %xmm2 -; SSE41-NEXT: movdqa %xmm7, %xmm0 -; SSE41-NEXT: pxor %xmm11, %xmm0 -; SSE41-NEXT: movdqa %xmm10, %xmm1 -; SSE41-NEXT: pcmpeqd %xmm0, %xmm1 -; SSE41-NEXT: movdqa %xmm10, %xmm3 -; SSE41-NEXT: pcmpgtd %xmm0, %xmm3 -; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm3[0,0,2,2] -; SSE41-NEXT: pand %xmm1, %xmm0 -; SSE41-NEXT: por %xmm3, %xmm0 -; SSE41-NEXT: movapd %xmm9, %xmm1 -; SSE41-NEXT: blendvpd %xmm0, %xmm7, %xmm1 -; SSE41-NEXT: pxor %xmm6, %xmm11 -; SSE41-NEXT: movdqa %xmm10, %xmm3 -; SSE41-NEXT: pcmpeqd %xmm11, %xmm3 -; SSE41-NEXT: pcmpgtd %xmm11, %xmm10 -; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm10[0,0,2,2] -; SSE41-NEXT: pand %xmm3, %xmm0 -; SSE41-NEXT: por %xmm10, %xmm0 -; SSE41-NEXT: blendvpd %xmm0, %xmm6, %xmm9 -; SSE41-NEXT: packusdw %xmm1, %xmm9 -; SSE41-NEXT: packusdw %xmm9, %xmm2 -; SSE41-NEXT: packuswb %xmm2, %xmm13 -; SSE41-NEXT: movdqa %xmm13, %xmm0 +; SSE41-NEXT: por %xmm5, %xmm0 +; SSE41-NEXT: movapd %xmm3, %xmm2 +; SSE41-NEXT: blendvpd %xmm0, %xmm13, %xmm2 +; SSE41-NEXT: movdqa %xmm12, %xmm0 +; SSE41-NEXT: pxor %xmm6, %xmm0 +; SSE41-NEXT: movdqa %xmm4, %xmm5 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm5 +; SSE41-NEXT: movdqa %xmm4, %xmm7 +; SSE41-NEXT: pcmpgtd %xmm0, %xmm7 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm7[0,0,2,2] +; SSE41-NEXT: pand %xmm5, %xmm0 +; SSE41-NEXT: por %xmm7, %xmm0 +; SSE41-NEXT: movapd %xmm3, %xmm5 +; SSE41-NEXT: blendvpd %xmm0, %xmm12, %xmm5 +; SSE41-NEXT: packusdw %xmm2, %xmm5 +; SSE41-NEXT: packusdw %xmm5, %xmm1 +; SSE41-NEXT: movdqa %xmm11, %xmm0 +; 
SSE41-NEXT: pxor %xmm6, %xmm0 +; SSE41-NEXT: movdqa %xmm4, %xmm2 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm2 +; SSE41-NEXT: movdqa %xmm4, %xmm5 +; SSE41-NEXT: pcmpgtd %xmm0, %xmm5 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm5[0,0,2,2] +; SSE41-NEXT: pand %xmm2, %xmm0 +; SSE41-NEXT: por %xmm5, %xmm0 +; SSE41-NEXT: movapd %xmm3, %xmm5 +; SSE41-NEXT: blendvpd %xmm0, %xmm11, %xmm5 +; SSE41-NEXT: movdqa %xmm10, %xmm0 +; SSE41-NEXT: pxor %xmm6, %xmm0 +; SSE41-NEXT: movdqa %xmm4, %xmm2 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm2 +; SSE41-NEXT: movdqa %xmm4, %xmm7 +; SSE41-NEXT: pcmpgtd %xmm0, %xmm7 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm7[0,0,2,2] +; SSE41-NEXT: pand %xmm2, %xmm0 +; SSE41-NEXT: por %xmm7, %xmm0 +; SSE41-NEXT: movapd %xmm3, %xmm2 +; SSE41-NEXT: blendvpd %xmm0, %xmm10, %xmm2 +; SSE41-NEXT: packusdw %xmm5, %xmm2 +; SSE41-NEXT: movdqa %xmm9, %xmm0 +; SSE41-NEXT: pxor %xmm6, %xmm0 +; SSE41-NEXT: movdqa %xmm4, %xmm5 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm5 +; SSE41-NEXT: movdqa %xmm4, %xmm7 +; SSE41-NEXT: pcmpgtd %xmm0, %xmm7 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm7[0,0,2,2] +; SSE41-NEXT: pand %xmm5, %xmm0 +; SSE41-NEXT: por %xmm7, %xmm0 +; SSE41-NEXT: movapd %xmm3, %xmm5 +; SSE41-NEXT: blendvpd %xmm0, %xmm9, %xmm5 +; SSE41-NEXT: pxor %xmm8, %xmm6 +; SSE41-NEXT: movdqa %xmm4, %xmm7 +; SSE41-NEXT: pcmpeqd %xmm6, %xmm7 +; SSE41-NEXT: pcmpgtd %xmm6, %xmm4 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm4[0,0,2,2] +; SSE41-NEXT: pand %xmm7, %xmm0 +; SSE41-NEXT: por %xmm4, %xmm0 +; SSE41-NEXT: blendvpd %xmm0, %xmm8, %xmm3 +; SSE41-NEXT: packusdw %xmm5, %xmm3 +; SSE41-NEXT: packusdw %xmm3, %xmm2 +; SSE41-NEXT: packuswb %xmm2, %xmm1 +; SSE41-NEXT: movdqa %xmm1, %xmm0 ; SSE41-NEXT: retq ; ; AVX1-LABEL: trunc_usat_v16i64_v16i8: ; AVX1: # %bb.0: -; AVX1-NEXT: vmovdqa %ymm0, %ymm8 -; AVX1-NEXT: vmovdqa {{.*#+}} xmm5 = [9223372036854775808,9223372036854775808] -; AVX1-NEXT: vpxor %xmm5, %xmm8, %xmm4 +; AVX1-NEXT: vmovdqa (%rdi), %xmm12 +; AVX1-NEXT: vmovdqa 16(%rdi), %xmm13 +; AVX1-NEXT: vmovdqa 
32(%rdi), %xmm15 +; AVX1-NEXT: vmovdqa 48(%rdi), %xmm9 +; AVX1-NEXT: vmovdqa {{.*#+}} xmm4 = [9223372036854775808,9223372036854775808] +; AVX1-NEXT: vpxor %xmm4, %xmm12, %xmm5 ; AVX1-NEXT: vmovdqa {{.*#+}} xmm6 = [9223372036854776063,9223372036854776063] -; AVX1-NEXT: vpcmpgtq %xmm4, %xmm6, %xmm0 +; AVX1-NEXT: vpcmpgtq %xmm5, %xmm6, %xmm0 ; AVX1-NEXT: vmovdqa %xmm0, {{[-0-9]+}}(%r{{[sb]}}p) # 16-byte Spill -; AVX1-NEXT: vextractf128 $1, %ymm8, %xmm11 -; AVX1-NEXT: vpxor %xmm5, %xmm11, %xmm4 -; AVX1-NEXT: vpcmpgtq %xmm4, %xmm6, %xmm0 +; AVX1-NEXT: vpxor %xmm4, %xmm13, %xmm7 +; AVX1-NEXT: vpcmpgtq %xmm7, %xmm6, %xmm0 ; AVX1-NEXT: vmovdqa %xmm0, {{[-0-9]+}}(%r{{[sb]}}p) # 16-byte Spill -; AVX1-NEXT: vpxor %xmm5, %xmm1, %xmm4 -; AVX1-NEXT: vpcmpgtq %xmm4, %xmm6, %xmm10 -; AVX1-NEXT: vextractf128 $1, %ymm1, %xmm14 -; AVX1-NEXT: vpxor %xmm5, %xmm14, %xmm7 -; AVX1-NEXT: vpcmpgtq %xmm7, %xmm6, %xmm12 -; AVX1-NEXT: vpxor %xmm5, %xmm2, %xmm7 -; AVX1-NEXT: vpcmpgtq %xmm7, %xmm6, %xmm13 -; AVX1-NEXT: vextractf128 $1, %ymm2, %xmm7 -; AVX1-NEXT: vpxor %xmm5, %xmm7, %xmm4 -; AVX1-NEXT: vpcmpgtq %xmm4, %xmm6, %xmm15 -; AVX1-NEXT: vpxor %xmm5, %xmm3, %xmm4 +; AVX1-NEXT: vpxor %xmm4, %xmm15, %xmm5 +; AVX1-NEXT: vpcmpgtq %xmm5, %xmm6, %xmm10 +; AVX1-NEXT: vpxor %xmm4, %xmm9, %xmm7 +; AVX1-NEXT: vpcmpgtq %xmm7, %xmm6, %xmm11 +; AVX1-NEXT: vmovdqa 64(%rdi), %xmm5 +; AVX1-NEXT: vpxor %xmm4, %xmm5, %xmm7 +; AVX1-NEXT: vpcmpgtq %xmm7, %xmm6, %xmm14 +; AVX1-NEXT: vmovdqa 80(%rdi), %xmm3 +; AVX1-NEXT: vpxor %xmm4, %xmm3, %xmm1 +; AVX1-NEXT: vpcmpgtq %xmm1, %xmm6, %xmm1 +; AVX1-NEXT: vmovdqa 96(%rdi), %xmm7 +; AVX1-NEXT: vpxor %xmm4, %xmm7, %xmm2 +; AVX1-NEXT: vpcmpgtq %xmm2, %xmm6, %xmm2 +; AVX1-NEXT: vmovdqa 112(%rdi), %xmm0 +; AVX1-NEXT: vpxor %xmm4, %xmm0, %xmm4 ; AVX1-NEXT: vpcmpgtq %xmm4, %xmm6, %xmm4 -; AVX1-NEXT: vextractf128 $1, %ymm3, %xmm0 -; AVX1-NEXT: vpxor %xmm5, %xmm0, %xmm5 -; AVX1-NEXT: vpcmpgtq %xmm5, %xmm6, %xmm5 ; AVX1-NEXT: vmovapd {{.*#+}} xmm6 = [255,255] -; AVX1-NEXT: 
vblendvpd %xmm5, %xmm0, %xmm6, %xmm9 -; AVX1-NEXT: vblendvpd %xmm4, %xmm3, %xmm6, %xmm3 -; AVX1-NEXT: vblendvpd %xmm15, %xmm7, %xmm6, %xmm4 -; AVX1-NEXT: vblendvpd %xmm13, %xmm2, %xmm6, %xmm2 -; AVX1-NEXT: vblendvpd %xmm12, %xmm14, %xmm6, %xmm5 -; AVX1-NEXT: vblendvpd %xmm10, %xmm1, %xmm6, %xmm1 +; AVX1-NEXT: vblendvpd %xmm4, %xmm0, %xmm6, %xmm8 +; AVX1-NEXT: vblendvpd %xmm2, %xmm7, %xmm6, %xmm2 +; AVX1-NEXT: vblendvpd %xmm1, %xmm3, %xmm6, %xmm1 +; AVX1-NEXT: vblendvpd %xmm14, %xmm5, %xmm6, %xmm3 +; AVX1-NEXT: vblendvpd %xmm11, %xmm9, %xmm6, %xmm4 +; AVX1-NEXT: vblendvpd %xmm10, %xmm15, %xmm6, %xmm5 ; AVX1-NEXT: vmovapd {{[-0-9]+}}(%r{{[sb]}}p), %xmm0 # 16-byte Reload -; AVX1-NEXT: vblendvpd %xmm0, %xmm11, %xmm6, %xmm7 +; AVX1-NEXT: vblendvpd %xmm0, %xmm13, %xmm6, %xmm7 ; AVX1-NEXT: vmovapd {{[-0-9]+}}(%r{{[sb]}}p), %xmm0 # 16-byte Reload -; AVX1-NEXT: vblendvpd %xmm0, %xmm8, %xmm6, %xmm6 -; AVX1-NEXT: vpackusdw %xmm9, %xmm3, %xmm0 -; AVX1-NEXT: vpackusdw %xmm4, %xmm2, %xmm2 -; AVX1-NEXT: vpackusdw %xmm0, %xmm2, %xmm0 -; AVX1-NEXT: vpackusdw %xmm5, %xmm1, %xmm1 +; AVX1-NEXT: vblendvpd %xmm0, %xmm12, %xmm6, %xmm6 +; AVX1-NEXT: vpackusdw %xmm8, %xmm2, %xmm0 +; AVX1-NEXT: vpackusdw %xmm1, %xmm3, %xmm1 +; AVX1-NEXT: vpackusdw %xmm0, %xmm1, %xmm0 +; AVX1-NEXT: vpackusdw %xmm4, %xmm5, %xmm1 ; AVX1-NEXT: vpackusdw %xmm7, %xmm6, %xmm2 ; AVX1-NEXT: vpackusdw %xmm1, %xmm2, %xmm1 ; AVX1-NEXT: vpackuswb %xmm0, %xmm1, %xmm0 -; AVX1-NEXT: vzeroupper ; AVX1-NEXT: retq ; ; AVX2-LABEL: trunc_usat_v16i64_v16i8: ; AVX2: # %bb.0: +; AVX2-NEXT: vmovdqa (%rdi), %ymm0 +; AVX2-NEXT: vmovdqa 32(%rdi), %ymm1 +; AVX2-NEXT: vmovdqa 64(%rdi), %ymm2 +; AVX2-NEXT: vmovdqa 96(%rdi), %ymm3 ; AVX2-NEXT: vbroadcastsd {{.*#+}} ymm4 = [255,255,255,255] ; AVX2-NEXT: vpbroadcastq {{.*#+}} ymm5 = [9223372036854775808,9223372036854775808,9223372036854775808,9223372036854775808] ; AVX2-NEXT: vpxor %ymm5, %ymm1, %ymm6 @@ -2750,9 +2933,9 @@ define <16 x i8> @trunc_usat_v16i64_v16i ; ; AVX512F-LABEL: 
trunc_usat_v16i64_v16i8: ; AVX512F: # %bb.0: -; AVX512F-NEXT: vpbroadcastq {{.*#+}} zmm2 = [255,255,255,255,255,255,255,255] -; AVX512F-NEXT: vpminuq %zmm2, %zmm1, %zmm1 -; AVX512F-NEXT: vpminuq %zmm2, %zmm0, %zmm0 +; AVX512F-NEXT: vpbroadcastq {{.*#+}} zmm0 = [255,255,255,255,255,255,255,255] +; AVX512F-NEXT: vpminuq 64(%rdi), %zmm0, %zmm1 +; AVX512F-NEXT: vpminuq (%rdi), %zmm0, %zmm0 ; AVX512F-NEXT: vpmovqd %zmm0, %ymm0 ; AVX512F-NEXT: vpmovqd %zmm1, %ymm1 ; AVX512F-NEXT: vinserti64x4 $1, %ymm1, %zmm0, %zmm0 @@ -2762,6 +2945,8 @@ define <16 x i8> @trunc_usat_v16i64_v16i ; ; AVX512VL-LABEL: trunc_usat_v16i64_v16i8: ; AVX512VL: # %bb.0: +; AVX512VL-NEXT: vmovdqa64 (%rdi), %zmm0 +; AVX512VL-NEXT: vmovdqa64 64(%rdi), %zmm1 ; AVX512VL-NEXT: vpmovusqb %zmm1, %xmm1 ; AVX512VL-NEXT: vpmovusqb %zmm0, %xmm0 ; AVX512VL-NEXT: vpunpcklqdq {{.*#+}} xmm0 = xmm0[0],xmm1[0] @@ -2770,9 +2955,9 @@ define <16 x i8> @trunc_usat_v16i64_v16i ; ; AVX512BW-LABEL: trunc_usat_v16i64_v16i8: ; AVX512BW: # %bb.0: -; AVX512BW-NEXT: vpbroadcastq {{.*#+}} zmm2 = [255,255,255,255,255,255,255,255] -; AVX512BW-NEXT: vpminuq %zmm2, %zmm1, %zmm1 -; AVX512BW-NEXT: vpminuq %zmm2, %zmm0, %zmm0 +; AVX512BW-NEXT: vpbroadcastq {{.*#+}} zmm0 = [255,255,255,255,255,255,255,255] +; AVX512BW-NEXT: vpminuq 64(%rdi), %zmm0, %zmm1 +; AVX512BW-NEXT: vpminuq (%rdi), %zmm0, %zmm0 ; AVX512BW-NEXT: vpmovqd %zmm0, %ymm0 ; AVX512BW-NEXT: vpmovqd %zmm1, %ymm1 ; AVX512BW-NEXT: vinserti64x4 $1, %ymm1, %zmm0, %zmm0 @@ -2782,11 +2967,24 @@ define <16 x i8> @trunc_usat_v16i64_v16i ; ; AVX512BWVL-LABEL: trunc_usat_v16i64_v16i8: ; AVX512BWVL: # %bb.0: +; AVX512BWVL-NEXT: vmovdqa64 (%rdi), %zmm0 +; AVX512BWVL-NEXT: vmovdqa64 64(%rdi), %zmm1 ; AVX512BWVL-NEXT: vpmovusqb %zmm1, %xmm1 ; AVX512BWVL-NEXT: vpmovusqb %zmm0, %xmm0 ; AVX512BWVL-NEXT: vpunpcklqdq {{.*#+}} xmm0 = xmm0[0],xmm1[0] ; AVX512BWVL-NEXT: vzeroupper ; AVX512BWVL-NEXT: retq +; +; SKX-LABEL: trunc_usat_v16i64_v16i8: +; SKX: # %bb.0: +; SKX-NEXT: vmovdqa64 (%rdi), 
%zmm0 +; SKX-NEXT: vmovdqa64 64(%rdi), %zmm1 +; SKX-NEXT: vpmovusqb %zmm1, %xmm1 +; SKX-NEXT: vpmovusqb %zmm0, %xmm0 +; SKX-NEXT: vpunpcklqdq {{.*#+}} xmm0 = xmm0[0],xmm1[0] +; SKX-NEXT: vzeroupper +; SKX-NEXT: retq + %a0 = load <16 x i64>, <16 x i64>* %p0 %1 = icmp ult <16 x i64> %a0, %2 = select <16 x i1> %1, <16 x i64> %a0, <16 x i64> %3 = trunc <16 x i64> %2 to <16 x i8> @@ -2865,6 +3063,12 @@ define <4 x i8> @trunc_usat_v4i32_v4i8(< ; AVX512BWVL-NEXT: vpminud {{.*}}(%rip){1to4}, %xmm0, %xmm0 ; AVX512BWVL-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,4,8,12,u,u,u,u,u,u,u,u,u,u,u,u] ; AVX512BWVL-NEXT: retq +; +; SKX-LABEL: trunc_usat_v4i32_v4i8: +; SKX: # %bb.0: +; SKX-NEXT: vpminud {{.*}}(%rip){1to4}, %xmm0, %xmm0 +; SKX-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,4,8,12,u,u,u,u,u,u,u,u,u,u,u,u] +; SKX-NEXT: retq %1 = icmp ult <4 x i32> %a0, %2 = select <4 x i1> %1, <4 x i32> %a0, <4 x i32> %3 = trunc <4 x i32> %2 to <4 x i8> @@ -2947,6 +3151,11 @@ define void @trunc_usat_v4i32_v4i8_store ; AVX512BWVL: # %bb.0: ; AVX512BWVL-NEXT: vpmovusdb %xmm0, (%rdi) ; AVX512BWVL-NEXT: retq +; +; SKX-LABEL: trunc_usat_v4i32_v4i8_store: +; SKX: # %bb.0: +; SKX-NEXT: vpmovusdb %xmm0, (%rdi) +; SKX-NEXT: retq %1 = icmp ult <4 x i32> %a0, %2 = select <4 x i1> %1, <4 x i32> %a0, <4 x i32> %3 = trunc <4 x i32> %2 to <4 x i8> @@ -3063,6 +3272,12 @@ define <8 x i8> @trunc_usat_v8i32_v8i8(< ; AVX512BWVL-NEXT: vpmovusdb %ymm0, %xmm0 ; AVX512BWVL-NEXT: vzeroupper ; AVX512BWVL-NEXT: retq +; +; SKX-LABEL: trunc_usat_v8i32_v8i8: +; SKX: # %bb.0: +; SKX-NEXT: vpmovusdb %ymm0, %xmm0 +; SKX-NEXT: vzeroupper +; SKX-NEXT: retq %1 = icmp ult <8 x i32> %a0, %2 = select <8 x i1> %1, <8 x i32> %a0, <8 x i32> %3 = trunc <8 x i32> %2 to <8 x i8> @@ -3184,6 +3399,12 @@ define void @trunc_usat_v8i32_v8i8_store ; AVX512BWVL-NEXT: vpmovusdb %ymm0, (%rdi) ; AVX512BWVL-NEXT: vzeroupper ; AVX512BWVL-NEXT: retq +; +; SKX-LABEL: trunc_usat_v8i32_v8i8_store: +; SKX: # %bb.0: +; SKX-NEXT: vpmovusdb %ymm0, (%rdi) +; SKX-NEXT: 
vzeroupper +; SKX-NEXT: retq %1 = icmp ult <8 x i32> %a0, %2 = select <8 x i1> %1, <8 x i32> %a0, <8 x i32> %3 = trunc <8 x i32> %2 to <8 x i8> @@ -3191,127 +3412,290 @@ define void @trunc_usat_v8i32_v8i8_store ret void } -define <16 x i8> @trunc_usat_v16i32_v16i8(<16 x i32> %a0) { -; SSE2-LABEL: trunc_usat_v16i32_v16i8: +define <16 x i8> @trunc_usat_v16i32_v16i8(<16 x i32>* %p0) { +; SSE2-LABEL: trunc_usat_v16i32_v16i8: +; SSE2: # %bb.0: +; SSE2-NEXT: movdqa (%rdi), %xmm6 +; SSE2-NEXT: movdqa 16(%rdi), %xmm0 +; SSE2-NEXT: movdqa 32(%rdi), %xmm1 +; SSE2-NEXT: movdqa 48(%rdi), %xmm5 +; SSE2-NEXT: movdqa {{.*#+}} xmm8 = [255,255,255,255] +; SSE2-NEXT: movdqa {{.*#+}} xmm4 = [2147483648,2147483648,2147483648,2147483648] +; SSE2-NEXT: movdqa %xmm0, %xmm7 +; SSE2-NEXT: pxor %xmm4, %xmm7 +; SSE2-NEXT: movdqa {{.*#+}} xmm3 = [2147483903,2147483903,2147483903,2147483903] +; SSE2-NEXT: movdqa %xmm3, %xmm2 +; SSE2-NEXT: pcmpgtd %xmm7, %xmm2 +; SSE2-NEXT: pand %xmm2, %xmm0 +; SSE2-NEXT: pandn %xmm8, %xmm2 +; SSE2-NEXT: por %xmm0, %xmm2 +; SSE2-NEXT: movdqa %xmm6, %xmm7 +; SSE2-NEXT: pxor %xmm4, %xmm7 +; SSE2-NEXT: movdqa %xmm3, %xmm0 +; SSE2-NEXT: pcmpgtd %xmm7, %xmm0 +; SSE2-NEXT: pand %xmm0, %xmm6 +; SSE2-NEXT: pandn %xmm8, %xmm0 +; SSE2-NEXT: por %xmm6, %xmm0 +; SSE2-NEXT: packuswb %xmm2, %xmm0 +; SSE2-NEXT: movdqa %xmm5, %xmm2 +; SSE2-NEXT: pxor %xmm4, %xmm2 +; SSE2-NEXT: movdqa %xmm3, %xmm6 +; SSE2-NEXT: pcmpgtd %xmm2, %xmm6 +; SSE2-NEXT: pand %xmm6, %xmm5 +; SSE2-NEXT: pandn %xmm8, %xmm6 +; SSE2-NEXT: por %xmm5, %xmm6 +; SSE2-NEXT: pxor %xmm1, %xmm4 +; SSE2-NEXT: pcmpgtd %xmm4, %xmm3 +; SSE2-NEXT: pand %xmm3, %xmm1 +; SSE2-NEXT: pandn %xmm8, %xmm3 +; SSE2-NEXT: por %xmm1, %xmm3 +; SSE2-NEXT: packuswb %xmm6, %xmm3 +; SSE2-NEXT: packuswb %xmm3, %xmm0 +; SSE2-NEXT: retq +; +; SSSE3-LABEL: trunc_usat_v16i32_v16i8: +; SSSE3: # %bb.0: +; SSSE3-NEXT: movdqa (%rdi), %xmm6 +; SSSE3-NEXT: movdqa 16(%rdi), %xmm0 +; SSSE3-NEXT: movdqa 32(%rdi), %xmm1 +; SSSE3-NEXT: movdqa 48(%rdi), 
%xmm5 +; SSSE3-NEXT: movdqa {{.*#+}} xmm8 = [255,255,255,255] +; SSSE3-NEXT: movdqa {{.*#+}} xmm4 = [2147483648,2147483648,2147483648,2147483648] +; SSSE3-NEXT: movdqa %xmm0, %xmm7 +; SSSE3-NEXT: pxor %xmm4, %xmm7 +; SSSE3-NEXT: movdqa {{.*#+}} xmm3 = [2147483903,2147483903,2147483903,2147483903] +; SSSE3-NEXT: movdqa %xmm3, %xmm2 +; SSSE3-NEXT: pcmpgtd %xmm7, %xmm2 +; SSSE3-NEXT: pand %xmm2, %xmm0 +; SSSE3-NEXT: pandn %xmm8, %xmm2 +; SSSE3-NEXT: por %xmm0, %xmm2 +; SSSE3-NEXT: movdqa %xmm6, %xmm7 +; SSSE3-NEXT: pxor %xmm4, %xmm7 +; SSSE3-NEXT: movdqa %xmm3, %xmm0 +; SSSE3-NEXT: pcmpgtd %xmm7, %xmm0 +; SSSE3-NEXT: pand %xmm0, %xmm6 +; SSSE3-NEXT: pandn %xmm8, %xmm0 +; SSSE3-NEXT: por %xmm6, %xmm0 +; SSSE3-NEXT: packuswb %xmm2, %xmm0 +; SSSE3-NEXT: movdqa %xmm5, %xmm2 +; SSSE3-NEXT: pxor %xmm4, %xmm2 +; SSSE3-NEXT: movdqa %xmm3, %xmm6 +; SSSE3-NEXT: pcmpgtd %xmm2, %xmm6 +; SSSE3-NEXT: pand %xmm6, %xmm5 +; SSSE3-NEXT: pandn %xmm8, %xmm6 +; SSSE3-NEXT: por %xmm5, %xmm6 +; SSSE3-NEXT: pxor %xmm1, %xmm4 +; SSSE3-NEXT: pcmpgtd %xmm4, %xmm3 +; SSSE3-NEXT: pand %xmm3, %xmm1 +; SSSE3-NEXT: pandn %xmm8, %xmm3 +; SSSE3-NEXT: por %xmm1, %xmm3 +; SSSE3-NEXT: packuswb %xmm6, %xmm3 +; SSSE3-NEXT: packuswb %xmm3, %xmm0 +; SSSE3-NEXT: retq +; +; SSE41-LABEL: trunc_usat_v16i32_v16i8: +; SSE41: # %bb.0: +; SSE41-NEXT: movdqa {{.*#+}} xmm1 = [255,255,255,255] +; SSE41-NEXT: movdqa 16(%rdi), %xmm2 +; SSE41-NEXT: pminud %xmm1, %xmm2 +; SSE41-NEXT: movdqa (%rdi), %xmm0 +; SSE41-NEXT: pminud %xmm1, %xmm0 +; SSE41-NEXT: packusdw %xmm2, %xmm0 +; SSE41-NEXT: movdqa 48(%rdi), %xmm2 +; SSE41-NEXT: pminud %xmm1, %xmm2 +; SSE41-NEXT: pminud 32(%rdi), %xmm1 +; SSE41-NEXT: packusdw %xmm2, %xmm1 +; SSE41-NEXT: packuswb %xmm1, %xmm0 +; SSE41-NEXT: retq +; +; AVX1-LABEL: trunc_usat_v16i32_v16i8: +; AVX1: # %bb.0: +; AVX1-NEXT: vmovdqa {{.*#+}} xmm0 = [255,255,255,255] +; AVX1-NEXT: vpminud 16(%rdi), %xmm0, %xmm1 +; AVX1-NEXT: vpminud (%rdi), %xmm0, %xmm2 +; AVX1-NEXT: vpackusdw %xmm1, %xmm2, %xmm1 +; 
AVX1-NEXT: vpminud 48(%rdi), %xmm0, %xmm2 +; AVX1-NEXT: vpminud 32(%rdi), %xmm0, %xmm0 +; AVX1-NEXT: vpackusdw %xmm2, %xmm0, %xmm0 +; AVX1-NEXT: vpackuswb %xmm0, %xmm1, %xmm0 +; AVX1-NEXT: retq +; +; AVX2-LABEL: trunc_usat_v16i32_v16i8: +; AVX2: # %bb.0: +; AVX2-NEXT: vpbroadcastd {{.*#+}} ymm0 = [255,255,255,255,255,255,255,255] +; AVX2-NEXT: vpminud 32(%rdi), %ymm0, %ymm1 +; AVX2-NEXT: vpminud (%rdi), %ymm0, %ymm0 +; AVX2-NEXT: vpackusdw %ymm1, %ymm0, %ymm0 +; AVX2-NEXT: vpermq {{.*#+}} ymm0 = ymm0[0,2,1,3] +; AVX2-NEXT: vextracti128 $1, %ymm0, %xmm1 +; AVX2-NEXT: vpackuswb %xmm1, %xmm0, %xmm0 +; AVX2-NEXT: vzeroupper +; AVX2-NEXT: retq +; +; AVX512-LABEL: trunc_usat_v16i32_v16i8: +; AVX512: # %bb.0: +; AVX512-NEXT: vmovdqa64 (%rdi), %zmm0 +; AVX512-NEXT: vpmovusdb %zmm0, %xmm0 +; AVX512-NEXT: vzeroupper +; AVX512-NEXT: retq +; +; SKX-LABEL: trunc_usat_v16i32_v16i8: +; SKX: # %bb.0: +; SKX-NEXT: vmovdqa64 (%rdi), %zmm0 +; SKX-NEXT: vpmovusdb %zmm0, %xmm0 +; SKX-NEXT: vzeroupper +; SKX-NEXT: retq + %a0 = load <16 x i32>, <16 x i32>* %p0 + %1 = icmp ult <16 x i32> %a0, + %2 = select <16 x i1> %1, <16 x i32> %a0, <16 x i32> + %3 = trunc <16 x i32> %2 to <16 x i8> + ret <16 x i8> %3 +} + +define void @trunc_usat_v16i32_v16i8_store(<16 x i32>* %p0, <16 x i8>* %p1) { +; SSE2-LABEL: trunc_usat_v16i32_v16i8_store: ; SSE2: # %bb.0: +; SSE2-NEXT: movdqa (%rdi), %xmm6 +; SSE2-NEXT: movdqa 16(%rdi), %xmm5 +; SSE2-NEXT: movdqa 32(%rdi), %xmm0 +; SSE2-NEXT: movdqa 48(%rdi), %xmm4 ; SSE2-NEXT: movdqa {{.*#+}} xmm8 = [255,255,255,255] -; SSE2-NEXT: movdqa {{.*#+}} xmm6 = [2147483648,2147483648,2147483648,2147483648] -; SSE2-NEXT: movdqa %xmm1, %xmm7 -; SSE2-NEXT: pxor %xmm6, %xmm7 -; SSE2-NEXT: movdqa {{.*#+}} xmm5 = [2147483903,2147483903,2147483903,2147483903] -; SSE2-NEXT: movdqa %xmm5, %xmm4 -; SSE2-NEXT: pcmpgtd %xmm7, %xmm4 -; SSE2-NEXT: pand %xmm4, %xmm1 -; SSE2-NEXT: pandn %xmm8, %xmm4 -; SSE2-NEXT: por %xmm1, %xmm4 -; SSE2-NEXT: movdqa %xmm0, %xmm1 -; SSE2-NEXT: pxor 
%xmm6, %xmm1 +; SSE2-NEXT: movdqa {{.*#+}} xmm3 = [2147483648,2147483648,2147483648,2147483648] ; SSE2-NEXT: movdqa %xmm5, %xmm7 -; SSE2-NEXT: pcmpgtd %xmm1, %xmm7 -; SSE2-NEXT: pand %xmm7, %xmm0 -; SSE2-NEXT: pandn %xmm8, %xmm7 -; SSE2-NEXT: por %xmm7, %xmm0 -; SSE2-NEXT: packuswb %xmm4, %xmm0 -; SSE2-NEXT: movdqa %xmm3, %xmm1 -; SSE2-NEXT: pxor %xmm6, %xmm1 -; SSE2-NEXT: movdqa %xmm5, %xmm4 -; SSE2-NEXT: pcmpgtd %xmm1, %xmm4 -; SSE2-NEXT: pand %xmm4, %xmm3 -; SSE2-NEXT: pandn %xmm8, %xmm4 -; SSE2-NEXT: por %xmm3, %xmm4 -; SSE2-NEXT: pxor %xmm2, %xmm6 -; SSE2-NEXT: pcmpgtd %xmm6, %xmm5 -; SSE2-NEXT: pand %xmm5, %xmm2 +; SSE2-NEXT: pxor %xmm3, %xmm7 +; SSE2-NEXT: movdqa {{.*#+}} xmm2 = [2147483903,2147483903,2147483903,2147483903] +; SSE2-NEXT: movdqa %xmm2, %xmm1 +; SSE2-NEXT: pcmpgtd %xmm7, %xmm1 +; SSE2-NEXT: pand %xmm1, %xmm5 +; SSE2-NEXT: pandn %xmm8, %xmm1 +; SSE2-NEXT: por %xmm5, %xmm1 +; SSE2-NEXT: movdqa %xmm6, %xmm7 +; SSE2-NEXT: pxor %xmm3, %xmm7 +; SSE2-NEXT: movdqa %xmm2, %xmm5 +; SSE2-NEXT: pcmpgtd %xmm7, %xmm5 +; SSE2-NEXT: pand %xmm5, %xmm6 ; SSE2-NEXT: pandn %xmm8, %xmm5 -; SSE2-NEXT: por %xmm2, %xmm5 -; SSE2-NEXT: packuswb %xmm4, %xmm5 -; SSE2-NEXT: packuswb %xmm5, %xmm0 +; SSE2-NEXT: por %xmm6, %xmm5 +; SSE2-NEXT: packuswb %xmm1, %xmm5 +; SSE2-NEXT: movdqa %xmm4, %xmm1 +; SSE2-NEXT: pxor %xmm3, %xmm1 +; SSE2-NEXT: movdqa %xmm2, %xmm6 +; SSE2-NEXT: pcmpgtd %xmm1, %xmm6 +; SSE2-NEXT: pand %xmm6, %xmm4 +; SSE2-NEXT: pandn %xmm8, %xmm6 +; SSE2-NEXT: por %xmm4, %xmm6 +; SSE2-NEXT: pxor %xmm0, %xmm3 +; SSE2-NEXT: pcmpgtd %xmm3, %xmm2 +; SSE2-NEXT: pand %xmm2, %xmm0 +; SSE2-NEXT: pandn %xmm8, %xmm2 +; SSE2-NEXT: por %xmm0, %xmm2 +; SSE2-NEXT: packuswb %xmm6, %xmm2 +; SSE2-NEXT: packuswb %xmm2, %xmm5 +; SSE2-NEXT: movdqa %xmm5, (%rsi) ; SSE2-NEXT: retq ; -; SSSE3-LABEL: trunc_usat_v16i32_v16i8: +; SSSE3-LABEL: trunc_usat_v16i32_v16i8_store: ; SSSE3: # %bb.0: +; SSSE3-NEXT: movdqa (%rdi), %xmm6 +; SSSE3-NEXT: movdqa 16(%rdi), %xmm5 +; SSSE3-NEXT: movdqa 
32(%rdi), %xmm0 +; SSSE3-NEXT: movdqa 48(%rdi), %xmm4 ; SSSE3-NEXT: movdqa {{.*#+}} xmm8 = [255,255,255,255] -; SSSE3-NEXT: movdqa {{.*#+}} xmm6 = [2147483648,2147483648,2147483648,2147483648] -; SSSE3-NEXT: movdqa %xmm1, %xmm7 -; SSSE3-NEXT: pxor %xmm6, %xmm7 -; SSSE3-NEXT: movdqa {{.*#+}} xmm5 = [2147483903,2147483903,2147483903,2147483903] -; SSSE3-NEXT: movdqa %xmm5, %xmm4 -; SSSE3-NEXT: pcmpgtd %xmm7, %xmm4 -; SSSE3-NEXT: pand %xmm4, %xmm1 -; SSSE3-NEXT: pandn %xmm8, %xmm4 -; SSSE3-NEXT: por %xmm1, %xmm4 -; SSSE3-NEXT: movdqa %xmm0, %xmm1 -; SSSE3-NEXT: pxor %xmm6, %xmm1 +; SSSE3-NEXT: movdqa {{.*#+}} xmm3 = [2147483648,2147483648,2147483648,2147483648] ; SSSE3-NEXT: movdqa %xmm5, %xmm7 -; SSSE3-NEXT: pcmpgtd %xmm1, %xmm7 -; SSSE3-NEXT: pand %xmm7, %xmm0 -; SSSE3-NEXT: pandn %xmm8, %xmm7 -; SSSE3-NEXT: por %xmm7, %xmm0 -; SSSE3-NEXT: packuswb %xmm4, %xmm0 -; SSSE3-NEXT: movdqa %xmm3, %xmm1 -; SSSE3-NEXT: pxor %xmm6, %xmm1 -; SSSE3-NEXT: movdqa %xmm5, %xmm4 -; SSSE3-NEXT: pcmpgtd %xmm1, %xmm4 -; SSSE3-NEXT: pand %xmm4, %xmm3 -; SSSE3-NEXT: pandn %xmm8, %xmm4 -; SSSE3-NEXT: por %xmm3, %xmm4 -; SSSE3-NEXT: pxor %xmm2, %xmm6 -; SSSE3-NEXT: pcmpgtd %xmm6, %xmm5 -; SSSE3-NEXT: pand %xmm5, %xmm2 +; SSSE3-NEXT: pxor %xmm3, %xmm7 +; SSSE3-NEXT: movdqa {{.*#+}} xmm2 = [2147483903,2147483903,2147483903,2147483903] +; SSSE3-NEXT: movdqa %xmm2, %xmm1 +; SSSE3-NEXT: pcmpgtd %xmm7, %xmm1 +; SSSE3-NEXT: pand %xmm1, %xmm5 +; SSSE3-NEXT: pandn %xmm8, %xmm1 +; SSSE3-NEXT: por %xmm5, %xmm1 +; SSSE3-NEXT: movdqa %xmm6, %xmm7 +; SSSE3-NEXT: pxor %xmm3, %xmm7 +; SSSE3-NEXT: movdqa %xmm2, %xmm5 +; SSSE3-NEXT: pcmpgtd %xmm7, %xmm5 +; SSSE3-NEXT: pand %xmm5, %xmm6 ; SSSE3-NEXT: pandn %xmm8, %xmm5 -; SSSE3-NEXT: por %xmm2, %xmm5 -; SSSE3-NEXT: packuswb %xmm4, %xmm5 -; SSSE3-NEXT: packuswb %xmm5, %xmm0 +; SSSE3-NEXT: por %xmm6, %xmm5 +; SSSE3-NEXT: packuswb %xmm1, %xmm5 +; SSSE3-NEXT: movdqa %xmm4, %xmm1 +; SSSE3-NEXT: pxor %xmm3, %xmm1 +; SSSE3-NEXT: movdqa %xmm2, %xmm6 +; SSSE3-NEXT: 
pcmpgtd %xmm1, %xmm6 +; SSSE3-NEXT: pand %xmm6, %xmm4 +; SSSE3-NEXT: pandn %xmm8, %xmm6 +; SSSE3-NEXT: por %xmm4, %xmm6 +; SSSE3-NEXT: pxor %xmm0, %xmm3 +; SSSE3-NEXT: pcmpgtd %xmm3, %xmm2 +; SSSE3-NEXT: pand %xmm2, %xmm0 +; SSSE3-NEXT: pandn %xmm8, %xmm2 +; SSSE3-NEXT: por %xmm0, %xmm2 +; SSSE3-NEXT: packuswb %xmm6, %xmm2 +; SSSE3-NEXT: packuswb %xmm2, %xmm5 +; SSSE3-NEXT: movdqa %xmm5, (%rsi) ; SSSE3-NEXT: retq ; -; SSE41-LABEL: trunc_usat_v16i32_v16i8: +; SSE41-LABEL: trunc_usat_v16i32_v16i8_store: ; SSE41: # %bb.0: -; SSE41-NEXT: movdqa {{.*#+}} xmm4 = [255,255,255,255] -; SSE41-NEXT: pminud %xmm4, %xmm1 -; SSE41-NEXT: pminud %xmm4, %xmm0 +; SSE41-NEXT: movdqa {{.*#+}} xmm0 = [255,255,255,255] +; SSE41-NEXT: movdqa 16(%rdi), %xmm1 +; SSE41-NEXT: pminud %xmm0, %xmm1 +; SSE41-NEXT: movdqa (%rdi), %xmm2 +; SSE41-NEXT: pminud %xmm0, %xmm2 +; SSE41-NEXT: packusdw %xmm1, %xmm2 +; SSE41-NEXT: movdqa 48(%rdi), %xmm1 +; SSE41-NEXT: pminud %xmm0, %xmm1 +; SSE41-NEXT: pminud 32(%rdi), %xmm0 ; SSE41-NEXT: packusdw %xmm1, %xmm0 -; SSE41-NEXT: pminud %xmm4, %xmm3 -; SSE41-NEXT: pminud %xmm4, %xmm2 -; SSE41-NEXT: packusdw %xmm3, %xmm2 -; SSE41-NEXT: packuswb %xmm2, %xmm0 +; SSE41-NEXT: packuswb %xmm0, %xmm2 +; SSE41-NEXT: movdqa %xmm2, (%rsi) ; SSE41-NEXT: retq ; -; AVX1-LABEL: trunc_usat_v16i32_v16i8: +; AVX1-LABEL: trunc_usat_v16i32_v16i8_store: ; AVX1: # %bb.0: -; AVX1-NEXT: vextractf128 $1, %ymm0, %xmm2 -; AVX1-NEXT: vmovdqa {{.*#+}} xmm3 = [255,255,255,255] -; AVX1-NEXT: vpminud %xmm3, %xmm2, %xmm2 -; AVX1-NEXT: vpminud %xmm3, %xmm0, %xmm0 +; AVX1-NEXT: vmovdqa {{.*#+}} xmm0 = [255,255,255,255] +; AVX1-NEXT: vpminud 16(%rdi), %xmm0, %xmm1 +; AVX1-NEXT: vpminud (%rdi), %xmm0, %xmm2 +; AVX1-NEXT: vpackusdw %xmm1, %xmm2, %xmm1 +; AVX1-NEXT: vpminud 48(%rdi), %xmm0, %xmm2 +; AVX1-NEXT: vpminud 32(%rdi), %xmm0, %xmm0 ; AVX1-NEXT: vpackusdw %xmm2, %xmm0, %xmm0 -; AVX1-NEXT: vextractf128 $1, %ymm1, %xmm2 -; AVX1-NEXT: vpminud %xmm3, %xmm2, %xmm2 -; AVX1-NEXT: vpminud %xmm3, 
%xmm1, %xmm1 -; AVX1-NEXT: vpackusdw %xmm2, %xmm1, %xmm1 -; AVX1-NEXT: vpackuswb %xmm1, %xmm0, %xmm0 -; AVX1-NEXT: vzeroupper +; AVX1-NEXT: vpackuswb %xmm0, %xmm1, %xmm0 +; AVX1-NEXT: vmovdqa %xmm0, (%rsi) ; AVX1-NEXT: retq ; -; AVX2-LABEL: trunc_usat_v16i32_v16i8: +; AVX2-LABEL: trunc_usat_v16i32_v16i8_store: ; AVX2: # %bb.0: -; AVX2-NEXT: vpbroadcastd {{.*#+}} ymm2 = [255,255,255,255,255,255,255,255] -; AVX2-NEXT: vpminud %ymm2, %ymm1, %ymm1 -; AVX2-NEXT: vpminud %ymm2, %ymm0, %ymm0 +; AVX2-NEXT: vpbroadcastd {{.*#+}} ymm0 = [255,255,255,255,255,255,255,255] +; AVX2-NEXT: vpminud 32(%rdi), %ymm0, %ymm1 +; AVX2-NEXT: vpminud (%rdi), %ymm0, %ymm0 ; AVX2-NEXT: vpackusdw %ymm1, %ymm0, %ymm0 ; AVX2-NEXT: vpermq {{.*#+}} ymm0 = ymm0[0,2,1,3] ; AVX2-NEXT: vextracti128 $1, %ymm0, %xmm1 ; AVX2-NEXT: vpackuswb %xmm1, %xmm0, %xmm0 +; AVX2-NEXT: vmovdqa %xmm0, (%rsi) ; AVX2-NEXT: vzeroupper ; AVX2-NEXT: retq ; -; AVX512-LABEL: trunc_usat_v16i32_v16i8: +; AVX512-LABEL: trunc_usat_v16i32_v16i8_store: ; AVX512: # %bb.0: -; AVX512-NEXT: vpmovusdb %zmm0, %xmm0 +; AVX512-NEXT: vmovdqa64 (%rdi), %zmm0 +; AVX512-NEXT: vpmovusdb %zmm0, (%rsi) ; AVX512-NEXT: vzeroupper ; AVX512-NEXT: retq +; +; SKX-LABEL: trunc_usat_v16i32_v16i8_store: +; SKX: # %bb.0: +; SKX-NEXT: vmovdqa64 (%rdi), %zmm0 +; SKX-NEXT: vpmovusdb %zmm0, (%rsi) +; SKX-NEXT: vzeroupper +; SKX-NEXT: retq + %a0 = load <16 x i32>, <16 x i32>* %p0 %1 = icmp ult <16 x i32> %a0, %2 = select <16 x i1> %1, <16 x i32> %a0, <16 x i32> %3 = trunc <16 x i32> %2 to <16 x i8> - ret <16 x i8> %3 + store <16 x i8> %3, <16 x i8>* %p1 + ret void } define <8 x i8> @trunc_usat_v8i16_v8i8(<8 x i16> %a0) { @@ -3347,6 +3731,12 @@ define <8 x i8> @trunc_usat_v8i16_v8i8(< ; AVX512-NEXT: vpminuw {{.*}}(%rip), %xmm0, %xmm0 ; AVX512-NEXT: vpackuswb %xmm0, %xmm0, %xmm0 ; AVX512-NEXT: retq +; +; SKX-LABEL: trunc_usat_v8i16_v8i8: +; SKX: # %bb.0: +; SKX-NEXT: vpminuw {{.*}}(%rip), %xmm0, %xmm0 +; SKX-NEXT: vpackuswb %xmm0, %xmm0, %xmm0 +; SKX-NEXT: 
retq %1 = icmp ult <8 x i16> %a0, %2 = select <8 x i1> %1, <8 x i16> %a0, <8 x i16> %3 = trunc <8 x i16> %2 to <8 x i8> @@ -3410,6 +3800,11 @@ define void @trunc_usat_v8i16_v8i8_store ; AVX512BWVL: # %bb.0: ; AVX512BWVL-NEXT: vpmovuswb %xmm0, (%rdi) ; AVX512BWVL-NEXT: retq +; +; SKX-LABEL: trunc_usat_v8i16_v8i8_store: +; SKX: # %bb.0: +; SKX-NEXT: vpmovuswb %xmm0, (%rdi) +; SKX-NEXT: retq %1 = icmp ult <8 x i16> %a0, %2 = select <8 x i1> %1, <8 x i16> %a0, <8 x i16> %3 = trunc <8 x i16> %2 to <8 x i8> @@ -3499,96 +3894,107 @@ define <16 x i8> @trunc_usat_v16i16_v16i ; AVX512BWVL-NEXT: vpmovuswb %ymm0, %xmm0 ; AVX512BWVL-NEXT: vzeroupper ; AVX512BWVL-NEXT: retq +; +; SKX-LABEL: trunc_usat_v16i16_v16i8: +; SKX: # %bb.0: +; SKX-NEXT: vpmovuswb %ymm0, %xmm0 +; SKX-NEXT: vzeroupper +; SKX-NEXT: retq %1 = icmp ult <16 x i16> %a0, %2 = select <16 x i1> %1, <16 x i16> %a0, <16 x i16> %3 = trunc <16 x i16> %2 to <16 x i8> ret <16 x i8> %3 } -define <32 x i8> @trunc_usat_v32i16_v32i8(<32 x i16> %a0) { +define <32 x i8> @trunc_usat_v32i16_v32i8(<32 x i16>* %p0) { ; SSE2-LABEL: trunc_usat_v32i16_v32i8: ; SSE2: # %bb.0: -; SSE2-NEXT: movdqa {{.*#+}} xmm4 = [32768,32768,32768,32768,32768,32768,32768,32768] -; SSE2-NEXT: pxor %xmm4, %xmm3 -; SSE2-NEXT: movdqa {{.*#+}} xmm5 = [33023,33023,33023,33023,33023,33023,33023,33023] -; SSE2-NEXT: pminsw %xmm5, %xmm3 -; SSE2-NEXT: pxor %xmm4, %xmm3 -; SSE2-NEXT: pxor %xmm4, %xmm2 -; SSE2-NEXT: pminsw %xmm5, %xmm2 -; SSE2-NEXT: pxor %xmm4, %xmm2 -; SSE2-NEXT: packuswb %xmm3, %xmm2 -; SSE2-NEXT: pxor %xmm4, %xmm1 -; SSE2-NEXT: pminsw %xmm5, %xmm1 -; SSE2-NEXT: pxor %xmm4, %xmm1 -; SSE2-NEXT: pxor %xmm4, %xmm0 -; SSE2-NEXT: pminsw %xmm5, %xmm0 -; SSE2-NEXT: pxor %xmm4, %xmm0 -; SSE2-NEXT: packuswb %xmm1, %xmm0 -; SSE2-NEXT: movdqa %xmm2, %xmm1 +; SSE2-NEXT: movdqa {{.*#+}} xmm2 = [32768,32768,32768,32768,32768,32768,32768,32768] +; SSE2-NEXT: movdqa 48(%rdi), %xmm0 +; SSE2-NEXT: pxor %xmm2, %xmm0 +; SSE2-NEXT: movdqa {{.*#+}} xmm3 = 
[33023,33023,33023,33023,33023,33023,33023,33023] +; SSE2-NEXT: pminsw %xmm3, %xmm0 +; SSE2-NEXT: pxor %xmm2, %xmm0 +; SSE2-NEXT: movdqa 32(%rdi), %xmm1 +; SSE2-NEXT: pxor %xmm2, %xmm1 +; SSE2-NEXT: pminsw %xmm3, %xmm1 +; SSE2-NEXT: pxor %xmm2, %xmm1 +; SSE2-NEXT: packuswb %xmm0, %xmm1 +; SSE2-NEXT: movdqa 16(%rdi), %xmm4 +; SSE2-NEXT: pxor %xmm2, %xmm4 +; SSE2-NEXT: pminsw %xmm3, %xmm4 +; SSE2-NEXT: pxor %xmm2, %xmm4 +; SSE2-NEXT: movdqa (%rdi), %xmm0 +; SSE2-NEXT: pxor %xmm2, %xmm0 +; SSE2-NEXT: pminsw %xmm3, %xmm0 +; SSE2-NEXT: pxor %xmm2, %xmm0 +; SSE2-NEXT: packuswb %xmm4, %xmm0 ; SSE2-NEXT: retq ; ; SSSE3-LABEL: trunc_usat_v32i16_v32i8: ; SSSE3: # %bb.0: -; SSSE3-NEXT: movdqa {{.*#+}} xmm4 = [32768,32768,32768,32768,32768,32768,32768,32768] -; SSSE3-NEXT: pxor %xmm4, %xmm3 -; SSSE3-NEXT: movdqa {{.*#+}} xmm5 = [33023,33023,33023,33023,33023,33023,33023,33023] -; SSSE3-NEXT: pminsw %xmm5, %xmm3 -; SSSE3-NEXT: pxor %xmm4, %xmm3 -; SSSE3-NEXT: pxor %xmm4, %xmm2 -; SSSE3-NEXT: pminsw %xmm5, %xmm2 -; SSSE3-NEXT: pxor %xmm4, %xmm2 -; SSSE3-NEXT: packuswb %xmm3, %xmm2 -; SSSE3-NEXT: pxor %xmm4, %xmm1 -; SSSE3-NEXT: pminsw %xmm5, %xmm1 -; SSSE3-NEXT: pxor %xmm4, %xmm1 -; SSSE3-NEXT: pxor %xmm4, %xmm0 -; SSSE3-NEXT: pminsw %xmm5, %xmm0 -; SSSE3-NEXT: pxor %xmm4, %xmm0 -; SSSE3-NEXT: packuswb %xmm1, %xmm0 -; SSSE3-NEXT: movdqa %xmm2, %xmm1 +; SSSE3-NEXT: movdqa {{.*#+}} xmm2 = [32768,32768,32768,32768,32768,32768,32768,32768] +; SSSE3-NEXT: movdqa 48(%rdi), %xmm0 +; SSSE3-NEXT: pxor %xmm2, %xmm0 +; SSSE3-NEXT: movdqa {{.*#+}} xmm3 = [33023,33023,33023,33023,33023,33023,33023,33023] +; SSSE3-NEXT: pminsw %xmm3, %xmm0 +; SSSE3-NEXT: pxor %xmm2, %xmm0 +; SSSE3-NEXT: movdqa 32(%rdi), %xmm1 +; SSSE3-NEXT: pxor %xmm2, %xmm1 +; SSSE3-NEXT: pminsw %xmm3, %xmm1 +; SSSE3-NEXT: pxor %xmm2, %xmm1 +; SSSE3-NEXT: packuswb %xmm0, %xmm1 +; SSSE3-NEXT: movdqa 16(%rdi), %xmm4 +; SSSE3-NEXT: pxor %xmm2, %xmm4 +; SSSE3-NEXT: pminsw %xmm3, %xmm4 +; SSSE3-NEXT: pxor %xmm2, %xmm4 +; 
SSSE3-NEXT: movdqa (%rdi), %xmm0 +; SSSE3-NEXT: pxor %xmm2, %xmm0 +; SSSE3-NEXT: pminsw %xmm3, %xmm0 +; SSSE3-NEXT: pxor %xmm2, %xmm0 +; SSSE3-NEXT: packuswb %xmm4, %xmm0 ; SSSE3-NEXT: retq ; ; SSE41-LABEL: trunc_usat_v32i16_v32i8: ; SSE41: # %bb.0: -; SSE41-NEXT: movdqa {{.*#+}} xmm4 = [255,255,255,255,255,255,255,255] -; SSE41-NEXT: pminuw %xmm4, %xmm3 -; SSE41-NEXT: pminuw %xmm4, %xmm2 -; SSE41-NEXT: packuswb %xmm3, %xmm2 -; SSE41-NEXT: pminuw %xmm4, %xmm1 -; SSE41-NEXT: pminuw %xmm4, %xmm0 -; SSE41-NEXT: packuswb %xmm1, %xmm0 -; SSE41-NEXT: movdqa %xmm2, %xmm1 +; SSE41-NEXT: movdqa {{.*#+}} xmm0 = [255,255,255,255,255,255,255,255] +; SSE41-NEXT: movdqa 48(%rdi), %xmm2 +; SSE41-NEXT: pminuw %xmm0, %xmm2 +; SSE41-NEXT: movdqa 32(%rdi), %xmm1 +; SSE41-NEXT: pminuw %xmm0, %xmm1 +; SSE41-NEXT: packuswb %xmm2, %xmm1 +; SSE41-NEXT: movdqa 16(%rdi), %xmm2 +; SSE41-NEXT: pminuw %xmm0, %xmm2 +; SSE41-NEXT: pminuw (%rdi), %xmm0 +; SSE41-NEXT: packuswb %xmm2, %xmm0 ; SSE41-NEXT: retq ; ; AVX1-LABEL: trunc_usat_v32i16_v32i8: ; AVX1: # %bb.0: -; AVX1-NEXT: vextractf128 $1, %ymm0, %xmm2 -; AVX1-NEXT: vmovdqa {{.*#+}} xmm3 = [255,255,255,255,255,255,255,255] -; AVX1-NEXT: vpminuw %xmm3, %xmm2, %xmm2 -; AVX1-NEXT: vpminuw %xmm3, %xmm0, %xmm0 +; AVX1-NEXT: vmovdqa {{.*#+}} xmm0 = [255,255,255,255,255,255,255,255] +; AVX1-NEXT: vpminuw 16(%rdi), %xmm0, %xmm1 +; AVX1-NEXT: vpminuw (%rdi), %xmm0, %xmm2 +; AVX1-NEXT: vpackuswb %xmm1, %xmm2, %xmm1 +; AVX1-NEXT: vpminuw 48(%rdi), %xmm0, %xmm2 +; AVX1-NEXT: vpminuw 32(%rdi), %xmm0, %xmm0 ; AVX1-NEXT: vpackuswb %xmm2, %xmm0, %xmm0 -; AVX1-NEXT: vextractf128 $1, %ymm1, %xmm2 -; AVX1-NEXT: vpminuw %xmm3, %xmm2, %xmm2 -; AVX1-NEXT: vpminuw %xmm3, %xmm1, %xmm1 -; AVX1-NEXT: vpackuswb %xmm2, %xmm1, %xmm1 -; AVX1-NEXT: vinsertf128 $1, %xmm1, %ymm0, %ymm0 +; AVX1-NEXT: vinsertf128 $1, %xmm0, %ymm1, %ymm0 ; AVX1-NEXT: retq ; ; AVX2-LABEL: trunc_usat_v32i16_v32i8: ; AVX2: # %bb.0: -; AVX2-NEXT: vmovdqa {{.*#+}} ymm2 = 
[255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255] -; AVX2-NEXT: vpminuw %ymm2, %ymm1, %ymm1 -; AVX2-NEXT: vpminuw %ymm2, %ymm0, %ymm0 +; AVX2-NEXT: vmovdqa {{.*#+}} ymm0 = [255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255] +; AVX2-NEXT: vpminuw 32(%rdi), %ymm0, %ymm1 +; AVX2-NEXT: vpminuw (%rdi), %ymm0, %ymm0 ; AVX2-NEXT: vpackuswb %ymm1, %ymm0, %ymm0 ; AVX2-NEXT: vpermq {{.*#+}} ymm0 = ymm0[0,2,1,3] ; AVX2-NEXT: retq ; ; AVX512F-LABEL: trunc_usat_v32i16_v32i8: ; AVX512F: # %bb.0: -; AVX512F-NEXT: vextracti64x4 $1, %zmm0, %ymm1 -; AVX512F-NEXT: vmovdqa {{.*#+}} ymm2 = [255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255] -; AVX512F-NEXT: vpminuw %ymm2, %ymm1, %ymm1 -; AVX512F-NEXT: vpminuw %ymm2, %ymm0, %ymm0 +; AVX512F-NEXT: vmovdqa {{.*#+}} ymm0 = [255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255] +; AVX512F-NEXT: vpminuw 32(%rdi), %ymm0, %ymm1 +; AVX512F-NEXT: vpminuw (%rdi), %ymm0, %ymm0 ; AVX512F-NEXT: vpmovzxwd {{.*#+}} zmm0 = ymm0[0],zero,ymm0[1],zero,ymm0[2],zero,ymm0[3],zero,ymm0[4],zero,ymm0[5],zero,ymm0[6],zero,ymm0[7],zero,ymm0[8],zero,ymm0[9],zero,ymm0[10],zero,ymm0[11],zero,ymm0[12],zero,ymm0[13],zero,ymm0[14],zero,ymm0[15],zero ; AVX512F-NEXT: vpmovdb %zmm0, %xmm0 ; AVX512F-NEXT: vpmovzxwd {{.*#+}} zmm1 = ymm1[0],zero,ymm1[1],zero,ymm1[2],zero,ymm1[3],zero,ymm1[4],zero,ymm1[5],zero,ymm1[6],zero,ymm1[7],zero,ymm1[8],zero,ymm1[9],zero,ymm1[10],zero,ymm1[11],zero,ymm1[12],zero,ymm1[13],zero,ymm1[14],zero,ymm1[15],zero @@ -3598,10 +4004,9 @@ define <32 x i8> @trunc_usat_v32i16_v32i ; ; AVX512VL-LABEL: trunc_usat_v32i16_v32i8: ; AVX512VL: # %bb.0: -; AVX512VL-NEXT: vextracti64x4 $1, %zmm0, %ymm1 -; AVX512VL-NEXT: vmovdqa {{.*#+}} ymm2 = [255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255] -; AVX512VL-NEXT: vpminuw %ymm2, %ymm1, %ymm1 -; AVX512VL-NEXT: vpminuw %ymm2, %ymm0, %ymm0 +; AVX512VL-NEXT: vmovdqa {{.*#+}} ymm0 = [255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255] 
+; AVX512VL-NEXT: vpminuw 32(%rdi), %ymm0, %ymm1 +; AVX512VL-NEXT: vpminuw (%rdi), %ymm0, %ymm0 ; AVX512VL-NEXT: vpmovzxwd {{.*#+}} zmm0 = ymm0[0],zero,ymm0[1],zero,ymm0[2],zero,ymm0[3],zero,ymm0[4],zero,ymm0[5],zero,ymm0[6],zero,ymm0[7],zero,ymm0[8],zero,ymm0[9],zero,ymm0[10],zero,ymm0[11],zero,ymm0[12],zero,ymm0[13],zero,ymm0[14],zero,ymm0[15],zero ; AVX512VL-NEXT: vpmovdb %zmm0, %xmm0 ; AVX512VL-NEXT: vpmovzxwd {{.*#+}} zmm1 = ymm1[0],zero,ymm1[1],zero,ymm1[2],zero,ymm1[3],zero,ymm1[4],zero,ymm1[5],zero,ymm1[6],zero,ymm1[7],zero,ymm1[8],zero,ymm1[9],zero,ymm1[10],zero,ymm1[11],zero,ymm1[12],zero,ymm1[13],zero,ymm1[14],zero,ymm1[15],zero @@ -3611,221 +4016,258 @@ define <32 x i8> @trunc_usat_v32i16_v32i ; ; AVX512BW-LABEL: trunc_usat_v32i16_v32i8: ; AVX512BW: # %bb.0: +; AVX512BW-NEXT: vmovdqa64 (%rdi), %zmm0 ; AVX512BW-NEXT: vpmovuswb %zmm0, %ymm0 ; AVX512BW-NEXT: retq ; ; AVX512BWVL-LABEL: trunc_usat_v32i16_v32i8: ; AVX512BWVL: # %bb.0: +; AVX512BWVL-NEXT: vmovdqa64 (%rdi), %zmm0 ; AVX512BWVL-NEXT: vpmovuswb %zmm0, %ymm0 ; AVX512BWVL-NEXT: retq +; +; SKX-LABEL: trunc_usat_v32i16_v32i8: +; SKX: # %bb.0: +; SKX-NEXT: vmovdqa64 (%rdi), %zmm0 +; SKX-NEXT: vpmovuswb %zmm0, %ymm0 +; SKX-NEXT: retq + %a0 = load <32 x i16>, <32 x i16>* %p0 %1 = icmp ult <32 x i16> %a0, %2 = select <32 x i1> %1, <32 x i16> %a0, <32 x i16> %3 = trunc <32 x i16> %2 to <32 x i8> ret <32 x i8> %3 } -define <32 x i8> @trunc_usat_v32i32_v32i8(<32 x i32> %a0) { +define <32 x i8> @trunc_usat_v32i32_v32i8(<32 x i32>* %p0) { ; SSE2-LABEL: trunc_usat_v32i32_v32i8: ; SSE2: # %bb.0: -; SSE2-NEXT: movdqa %xmm1, %xmm8 -; SSE2-NEXT: movdqa {{.*#+}} xmm10 = [255,255,255,255] -; SSE2-NEXT: movdqa {{.*#+}} xmm11 = [2147483648,2147483648,2147483648,2147483648] -; SSE2-NEXT: movdqa %xmm5, %xmm1 -; SSE2-NEXT: pxor %xmm11, %xmm1 -; SSE2-NEXT: movdqa {{.*#+}} xmm9 = [2147483903,2147483903,2147483903,2147483903] -; SSE2-NEXT: movdqa %xmm9, %xmm12 -; SSE2-NEXT: pcmpgtd %xmm1, %xmm12 -; SSE2-NEXT: pand %xmm12, 
%xmm5 -; SSE2-NEXT: pandn %xmm10, %xmm12 -; SSE2-NEXT: por %xmm5, %xmm12 +; SSE2-NEXT: movdqa (%rdi), %xmm11 +; SSE2-NEXT: movdqa 16(%rdi), %xmm12 +; SSE2-NEXT: movdqa 32(%rdi), %xmm9 +; SSE2-NEXT: movdqa 48(%rdi), %xmm10 +; SSE2-NEXT: movdqa 96(%rdi), %xmm0 +; SSE2-NEXT: movdqa 112(%rdi), %xmm2 +; SSE2-NEXT: movdqa 64(%rdi), %xmm5 +; SSE2-NEXT: movdqa 80(%rdi), %xmm7 +; SSE2-NEXT: movdqa {{.*#+}} xmm8 = [255,255,255,255] +; SSE2-NEXT: movdqa {{.*#+}} xmm6 = [2147483648,2147483648,2147483648,2147483648] +; SSE2-NEXT: movdqa %xmm7, %xmm1 +; SSE2-NEXT: pxor %xmm6, %xmm1 +; SSE2-NEXT: movdqa {{.*#+}} xmm4 = [2147483903,2147483903,2147483903,2147483903] +; SSE2-NEXT: movdqa %xmm4, %xmm3 +; SSE2-NEXT: pcmpgtd %xmm1, %xmm3 +; SSE2-NEXT: pand %xmm3, %xmm7 +; SSE2-NEXT: pandn %xmm8, %xmm3 +; SSE2-NEXT: por %xmm7, %xmm3 +; SSE2-NEXT: movdqa %xmm5, %xmm7 +; SSE2-NEXT: pxor %xmm6, %xmm7 +; SSE2-NEXT: movdqa %xmm4, %xmm1 +; SSE2-NEXT: pcmpgtd %xmm7, %xmm1 +; SSE2-NEXT: pand %xmm1, %xmm5 +; SSE2-NEXT: pandn %xmm8, %xmm1 +; SSE2-NEXT: por %xmm5, %xmm1 +; SSE2-NEXT: packuswb %xmm3, %xmm1 +; SSE2-NEXT: movdqa %xmm2, %xmm3 +; SSE2-NEXT: pxor %xmm6, %xmm3 ; SSE2-NEXT: movdqa %xmm4, %xmm5 -; SSE2-NEXT: pxor %xmm11, %xmm5 -; SSE2-NEXT: movdqa %xmm9, %xmm1 -; SSE2-NEXT: pcmpgtd %xmm5, %xmm1 -; SSE2-NEXT: pand %xmm1, %xmm4 -; SSE2-NEXT: pandn %xmm10, %xmm1 -; SSE2-NEXT: por %xmm4, %xmm1 -; SSE2-NEXT: packuswb %xmm12, %xmm1 -; SSE2-NEXT: movdqa %xmm7, %xmm4 -; SSE2-NEXT: pxor %xmm11, %xmm4 -; SSE2-NEXT: movdqa %xmm9, %xmm5 -; SSE2-NEXT: pcmpgtd %xmm4, %xmm5 -; SSE2-NEXT: pand %xmm5, %xmm7 -; SSE2-NEXT: pandn %xmm10, %xmm5 -; SSE2-NEXT: por %xmm7, %xmm5 -; SSE2-NEXT: movdqa %xmm6, %xmm4 -; SSE2-NEXT: pxor %xmm11, %xmm4 -; SSE2-NEXT: movdqa %xmm9, %xmm7 -; SSE2-NEXT: pcmpgtd %xmm4, %xmm7 -; SSE2-NEXT: pand %xmm7, %xmm6 -; SSE2-NEXT: pandn %xmm10, %xmm7 -; SSE2-NEXT: por %xmm6, %xmm7 -; SSE2-NEXT: packuswb %xmm5, %xmm7 -; SSE2-NEXT: packuswb %xmm7, %xmm1 -; SSE2-NEXT: movdqa %xmm8, %xmm4 -; 
SSE2-NEXT: pxor %xmm11, %xmm4 -; SSE2-NEXT: movdqa %xmm9, %xmm5 -; SSE2-NEXT: pcmpgtd %xmm4, %xmm5 -; SSE2-NEXT: pand %xmm5, %xmm8 -; SSE2-NEXT: pandn %xmm10, %xmm5 -; SSE2-NEXT: por %xmm8, %xmm5 -; SSE2-NEXT: movdqa %xmm0, %xmm4 -; SSE2-NEXT: pxor %xmm11, %xmm4 -; SSE2-NEXT: movdqa %xmm9, %xmm6 -; SSE2-NEXT: pcmpgtd %xmm4, %xmm6 -; SSE2-NEXT: pand %xmm6, %xmm0 -; SSE2-NEXT: pandn %xmm10, %xmm6 -; SSE2-NEXT: por %xmm6, %xmm0 -; SSE2-NEXT: packuswb %xmm5, %xmm0 -; SSE2-NEXT: movdqa %xmm3, %xmm4 -; SSE2-NEXT: pxor %xmm11, %xmm4 -; SSE2-NEXT: movdqa %xmm9, %xmm5 -; SSE2-NEXT: pcmpgtd %xmm4, %xmm5 -; SSE2-NEXT: pand %xmm5, %xmm3 -; SSE2-NEXT: pandn %xmm10, %xmm5 -; SSE2-NEXT: por %xmm3, %xmm5 -; SSE2-NEXT: pxor %xmm2, %xmm11 -; SSE2-NEXT: pcmpgtd %xmm11, %xmm9 -; SSE2-NEXT: pand %xmm9, %xmm2 -; SSE2-NEXT: pandn %xmm10, %xmm9 -; SSE2-NEXT: por %xmm2, %xmm9 -; SSE2-NEXT: packuswb %xmm5, %xmm9 -; SSE2-NEXT: packuswb %xmm9, %xmm0 +; SSE2-NEXT: pcmpgtd %xmm3, %xmm5 +; SSE2-NEXT: pand %xmm5, %xmm2 +; SSE2-NEXT: pandn %xmm8, %xmm5 +; SSE2-NEXT: por %xmm2, %xmm5 +; SSE2-NEXT: movdqa %xmm0, %xmm2 +; SSE2-NEXT: pxor %xmm6, %xmm2 +; SSE2-NEXT: movdqa %xmm4, %xmm3 +; SSE2-NEXT: pcmpgtd %xmm2, %xmm3 +; SSE2-NEXT: pand %xmm3, %xmm0 +; SSE2-NEXT: pandn %xmm8, %xmm3 +; SSE2-NEXT: por %xmm0, %xmm3 +; SSE2-NEXT: packuswb %xmm5, %xmm3 +; SSE2-NEXT: packuswb %xmm3, %xmm1 +; SSE2-NEXT: movdqa %xmm12, %xmm0 +; SSE2-NEXT: pxor %xmm6, %xmm0 +; SSE2-NEXT: movdqa %xmm4, %xmm2 +; SSE2-NEXT: pcmpgtd %xmm0, %xmm2 +; SSE2-NEXT: pand %xmm2, %xmm12 +; SSE2-NEXT: pandn %xmm8, %xmm2 +; SSE2-NEXT: por %xmm12, %xmm2 +; SSE2-NEXT: movdqa %xmm11, %xmm3 +; SSE2-NEXT: pxor %xmm6, %xmm3 +; SSE2-NEXT: movdqa %xmm4, %xmm0 +; SSE2-NEXT: pcmpgtd %xmm3, %xmm0 +; SSE2-NEXT: pand %xmm0, %xmm11 +; SSE2-NEXT: pandn %xmm8, %xmm0 +; SSE2-NEXT: por %xmm11, %xmm0 +; SSE2-NEXT: packuswb %xmm2, %xmm0 +; SSE2-NEXT: movdqa %xmm10, %xmm2 +; SSE2-NEXT: pxor %xmm6, %xmm2 +; SSE2-NEXT: movdqa %xmm4, %xmm3 +; SSE2-NEXT: pcmpgtd 
%xmm2, %xmm3 +; SSE2-NEXT: pand %xmm3, %xmm10 +; SSE2-NEXT: pandn %xmm8, %xmm3 +; SSE2-NEXT: por %xmm10, %xmm3 +; SSE2-NEXT: pxor %xmm9, %xmm6 +; SSE2-NEXT: pcmpgtd %xmm6, %xmm4 +; SSE2-NEXT: pand %xmm4, %xmm9 +; SSE2-NEXT: pandn %xmm8, %xmm4 +; SSE2-NEXT: por %xmm9, %xmm4 +; SSE2-NEXT: packuswb %xmm3, %xmm4 +; SSE2-NEXT: packuswb %xmm4, %xmm0 ; SSE2-NEXT: retq ; ; SSSE3-LABEL: trunc_usat_v32i32_v32i8: ; SSSE3: # %bb.0: -; SSSE3-NEXT: movdqa %xmm1, %xmm8 -; SSSE3-NEXT: movdqa {{.*#+}} xmm10 = [255,255,255,255] -; SSSE3-NEXT: movdqa {{.*#+}} xmm11 = [2147483648,2147483648,2147483648,2147483648] -; SSSE3-NEXT: movdqa %xmm5, %xmm1 -; SSSE3-NEXT: pxor %xmm11, %xmm1 -; SSSE3-NEXT: movdqa {{.*#+}} xmm9 = [2147483903,2147483903,2147483903,2147483903] -; SSSE3-NEXT: movdqa %xmm9, %xmm12 -; SSSE3-NEXT: pcmpgtd %xmm1, %xmm12 -; SSSE3-NEXT: pand %xmm12, %xmm5 -; SSSE3-NEXT: pandn %xmm10, %xmm12 -; SSSE3-NEXT: por %xmm5, %xmm12 +; SSSE3-NEXT: movdqa (%rdi), %xmm11 +; SSSE3-NEXT: movdqa 16(%rdi), %xmm12 +; SSSE3-NEXT: movdqa 32(%rdi), %xmm9 +; SSSE3-NEXT: movdqa 48(%rdi), %xmm10 +; SSSE3-NEXT: movdqa 96(%rdi), %xmm0 +; SSSE3-NEXT: movdqa 112(%rdi), %xmm2 +; SSSE3-NEXT: movdqa 64(%rdi), %xmm5 +; SSSE3-NEXT: movdqa 80(%rdi), %xmm7 +; SSSE3-NEXT: movdqa {{.*#+}} xmm8 = [255,255,255,255] +; SSSE3-NEXT: movdqa {{.*#+}} xmm6 = [2147483648,2147483648,2147483648,2147483648] +; SSSE3-NEXT: movdqa %xmm7, %xmm1 +; SSSE3-NEXT: pxor %xmm6, %xmm1 +; SSSE3-NEXT: movdqa {{.*#+}} xmm4 = [2147483903,2147483903,2147483903,2147483903] +; SSSE3-NEXT: movdqa %xmm4, %xmm3 +; SSSE3-NEXT: pcmpgtd %xmm1, %xmm3 +; SSSE3-NEXT: pand %xmm3, %xmm7 +; SSSE3-NEXT: pandn %xmm8, %xmm3 +; SSSE3-NEXT: por %xmm7, %xmm3 +; SSSE3-NEXT: movdqa %xmm5, %xmm7 +; SSSE3-NEXT: pxor %xmm6, %xmm7 +; SSSE3-NEXT: movdqa %xmm4, %xmm1 +; SSSE3-NEXT: pcmpgtd %xmm7, %xmm1 +; SSSE3-NEXT: pand %xmm1, %xmm5 +; SSSE3-NEXT: pandn %xmm8, %xmm1 +; SSSE3-NEXT: por %xmm5, %xmm1 +; SSSE3-NEXT: packuswb %xmm3, %xmm1 +; SSSE3-NEXT: movdqa 
%xmm2, %xmm3 +; SSSE3-NEXT: pxor %xmm6, %xmm3 ; SSSE3-NEXT: movdqa %xmm4, %xmm5 -; SSSE3-NEXT: pxor %xmm11, %xmm5 -; SSSE3-NEXT: movdqa %xmm9, %xmm1 -; SSSE3-NEXT: pcmpgtd %xmm5, %xmm1 -; SSSE3-NEXT: pand %xmm1, %xmm4 -; SSSE3-NEXT: pandn %xmm10, %xmm1 -; SSSE3-NEXT: por %xmm4, %xmm1 -; SSSE3-NEXT: packuswb %xmm12, %xmm1 -; SSSE3-NEXT: movdqa %xmm7, %xmm4 -; SSSE3-NEXT: pxor %xmm11, %xmm4 -; SSSE3-NEXT: movdqa %xmm9, %xmm5 -; SSSE3-NEXT: pcmpgtd %xmm4, %xmm5 -; SSSE3-NEXT: pand %xmm5, %xmm7 -; SSSE3-NEXT: pandn %xmm10, %xmm5 -; SSSE3-NEXT: por %xmm7, %xmm5 -; SSSE3-NEXT: movdqa %xmm6, %xmm4 -; SSSE3-NEXT: pxor %xmm11, %xmm4 -; SSSE3-NEXT: movdqa %xmm9, %xmm7 -; SSSE3-NEXT: pcmpgtd %xmm4, %xmm7 -; SSSE3-NEXT: pand %xmm7, %xmm6 -; SSSE3-NEXT: pandn %xmm10, %xmm7 -; SSSE3-NEXT: por %xmm6, %xmm7 -; SSSE3-NEXT: packuswb %xmm5, %xmm7 -; SSSE3-NEXT: packuswb %xmm7, %xmm1 -; SSSE3-NEXT: movdqa %xmm8, %xmm4 -; SSSE3-NEXT: pxor %xmm11, %xmm4 -; SSSE3-NEXT: movdqa %xmm9, %xmm5 -; SSSE3-NEXT: pcmpgtd %xmm4, %xmm5 -; SSSE3-NEXT: pand %xmm5, %xmm8 -; SSSE3-NEXT: pandn %xmm10, %xmm5 -; SSSE3-NEXT: por %xmm8, %xmm5 -; SSSE3-NEXT: movdqa %xmm0, %xmm4 -; SSSE3-NEXT: pxor %xmm11, %xmm4 -; SSSE3-NEXT: movdqa %xmm9, %xmm6 -; SSSE3-NEXT: pcmpgtd %xmm4, %xmm6 -; SSSE3-NEXT: pand %xmm6, %xmm0 -; SSSE3-NEXT: pandn %xmm10, %xmm6 -; SSSE3-NEXT: por %xmm6, %xmm0 -; SSSE3-NEXT: packuswb %xmm5, %xmm0 -; SSSE3-NEXT: movdqa %xmm3, %xmm4 -; SSSE3-NEXT: pxor %xmm11, %xmm4 -; SSSE3-NEXT: movdqa %xmm9, %xmm5 -; SSSE3-NEXT: pcmpgtd %xmm4, %xmm5 -; SSSE3-NEXT: pand %xmm5, %xmm3 -; SSSE3-NEXT: pandn %xmm10, %xmm5 -; SSSE3-NEXT: por %xmm3, %xmm5 -; SSSE3-NEXT: pxor %xmm2, %xmm11 -; SSSE3-NEXT: pcmpgtd %xmm11, %xmm9 -; SSSE3-NEXT: pand %xmm9, %xmm2 -; SSSE3-NEXT: pandn %xmm10, %xmm9 -; SSSE3-NEXT: por %xmm2, %xmm9 -; SSSE3-NEXT: packuswb %xmm5, %xmm9 -; SSSE3-NEXT: packuswb %xmm9, %xmm0 +; SSSE3-NEXT: pcmpgtd %xmm3, %xmm5 +; SSSE3-NEXT: pand %xmm5, %xmm2 +; SSSE3-NEXT: pandn %xmm8, %xmm5 +; SSSE3-NEXT: 
por %xmm2, %xmm5 +; SSSE3-NEXT: movdqa %xmm0, %xmm2 +; SSSE3-NEXT: pxor %xmm6, %xmm2 +; SSSE3-NEXT: movdqa %xmm4, %xmm3 +; SSSE3-NEXT: pcmpgtd %xmm2, %xmm3 +; SSSE3-NEXT: pand %xmm3, %xmm0 +; SSSE3-NEXT: pandn %xmm8, %xmm3 +; SSSE3-NEXT: por %xmm0, %xmm3 +; SSSE3-NEXT: packuswb %xmm5, %xmm3 +; SSSE3-NEXT: packuswb %xmm3, %xmm1 +; SSSE3-NEXT: movdqa %xmm12, %xmm0 +; SSSE3-NEXT: pxor %xmm6, %xmm0 +; SSSE3-NEXT: movdqa %xmm4, %xmm2 +; SSSE3-NEXT: pcmpgtd %xmm0, %xmm2 +; SSSE3-NEXT: pand %xmm2, %xmm12 +; SSSE3-NEXT: pandn %xmm8, %xmm2 +; SSSE3-NEXT: por %xmm12, %xmm2 +; SSSE3-NEXT: movdqa %xmm11, %xmm3 +; SSSE3-NEXT: pxor %xmm6, %xmm3 +; SSSE3-NEXT: movdqa %xmm4, %xmm0 +; SSSE3-NEXT: pcmpgtd %xmm3, %xmm0 +; SSSE3-NEXT: pand %xmm0, %xmm11 +; SSSE3-NEXT: pandn %xmm8, %xmm0 +; SSSE3-NEXT: por %xmm11, %xmm0 +; SSSE3-NEXT: packuswb %xmm2, %xmm0 +; SSSE3-NEXT: movdqa %xmm10, %xmm2 +; SSSE3-NEXT: pxor %xmm6, %xmm2 +; SSSE3-NEXT: movdqa %xmm4, %xmm3 +; SSSE3-NEXT: pcmpgtd %xmm2, %xmm3 +; SSSE3-NEXT: pand %xmm3, %xmm10 +; SSSE3-NEXT: pandn %xmm8, %xmm3 +; SSSE3-NEXT: por %xmm10, %xmm3 +; SSSE3-NEXT: pxor %xmm9, %xmm6 +; SSSE3-NEXT: pcmpgtd %xmm6, %xmm4 +; SSSE3-NEXT: pand %xmm4, %xmm9 +; SSSE3-NEXT: pandn %xmm8, %xmm4 +; SSSE3-NEXT: por %xmm9, %xmm4 +; SSSE3-NEXT: packuswb %xmm3, %xmm4 +; SSSE3-NEXT: packuswb %xmm4, %xmm0 ; SSSE3-NEXT: retq ; ; SSE41-LABEL: trunc_usat_v32i32_v32i8: ; SSE41: # %bb.0: -; SSE41-NEXT: movdqa {{.*#+}} xmm8 = [255,255,255,255] -; SSE41-NEXT: pminud %xmm8, %xmm5 -; SSE41-NEXT: pminud %xmm8, %xmm4 -; SSE41-NEXT: packusdw %xmm5, %xmm4 -; SSE41-NEXT: pminud %xmm8, %xmm7 -; SSE41-NEXT: pminud %xmm8, %xmm6 -; SSE41-NEXT: packusdw %xmm7, %xmm6 -; SSE41-NEXT: packuswb %xmm6, %xmm4 -; SSE41-NEXT: pminud %xmm8, %xmm1 -; SSE41-NEXT: pminud %xmm8, %xmm0 -; SSE41-NEXT: packusdw %xmm1, %xmm0 -; SSE41-NEXT: pminud %xmm8, %xmm3 -; SSE41-NEXT: pminud %xmm8, %xmm2 +; SSE41-NEXT: movdqa {{.*#+}} xmm2 = [255,255,255,255] +; SSE41-NEXT: movdqa 80(%rdi), %xmm0 +; 
SSE41-NEXT: pminud %xmm2, %xmm0 +; SSE41-NEXT: movdqa 64(%rdi), %xmm1 +; SSE41-NEXT: pminud %xmm2, %xmm1 +; SSE41-NEXT: packusdw %xmm0, %xmm1 +; SSE41-NEXT: movdqa 112(%rdi), %xmm0 +; SSE41-NEXT: pminud %xmm2, %xmm0 +; SSE41-NEXT: movdqa 96(%rdi), %xmm3 +; SSE41-NEXT: pminud %xmm2, %xmm3 +; SSE41-NEXT: packusdw %xmm0, %xmm3 +; SSE41-NEXT: packuswb %xmm3, %xmm1 +; SSE41-NEXT: movdqa 16(%rdi), %xmm3 +; SSE41-NEXT: pminud %xmm2, %xmm3 +; SSE41-NEXT: movdqa (%rdi), %xmm0 +; SSE41-NEXT: pminud %xmm2, %xmm0 +; SSE41-NEXT: packusdw %xmm3, %xmm0 +; SSE41-NEXT: movdqa 48(%rdi), %xmm3 +; SSE41-NEXT: pminud %xmm2, %xmm3 +; SSE41-NEXT: pminud 32(%rdi), %xmm2 ; SSE41-NEXT: packusdw %xmm3, %xmm2 ; SSE41-NEXT: packuswb %xmm2, %xmm0 -; SSE41-NEXT: movdqa %xmm4, %xmm1 ; SSE41-NEXT: retq ; ; AVX1-LABEL: trunc_usat_v32i32_v32i8: ; AVX1: # %bb.0: -; AVX1-NEXT: vextractf128 $1, %ymm0, %xmm4 -; AVX1-NEXT: vmovdqa {{.*#+}} xmm5 = [255,255,255,255] -; AVX1-NEXT: vpminud %xmm5, %xmm4, %xmm4 -; AVX1-NEXT: vpminud %xmm5, %xmm0, %xmm0 -; AVX1-NEXT: vpackusdw %xmm4, %xmm0, %xmm0 -; AVX1-NEXT: vextractf128 $1, %ymm1, %xmm4 -; AVX1-NEXT: vpminud %xmm5, %xmm4, %xmm4 -; AVX1-NEXT: vpminud %xmm5, %xmm1, %xmm1 -; AVX1-NEXT: vpackusdw %xmm4, %xmm1, %xmm1 -; AVX1-NEXT: vpackuswb %xmm1, %xmm0, %xmm0 -; AVX1-NEXT: vextractf128 $1, %ymm2, %xmm1 -; AVX1-NEXT: vpminud %xmm5, %xmm1, %xmm1 -; AVX1-NEXT: vpminud %xmm5, %xmm2, %xmm2 +; AVX1-NEXT: vmovdqa {{.*#+}} xmm0 = [255,255,255,255] +; AVX1-NEXT: vpminud 16(%rdi), %xmm0, %xmm1 +; AVX1-NEXT: vpminud (%rdi), %xmm0, %xmm2 ; AVX1-NEXT: vpackusdw %xmm1, %xmm2, %xmm1 -; AVX1-NEXT: vextractf128 $1, %ymm3, %xmm2 -; AVX1-NEXT: vpminud %xmm5, %xmm2, %xmm2 -; AVX1-NEXT: vpminud %xmm5, %xmm3, %xmm3 +; AVX1-NEXT: vpminud 48(%rdi), %xmm0, %xmm2 +; AVX1-NEXT: vpminud 32(%rdi), %xmm0, %xmm3 ; AVX1-NEXT: vpackusdw %xmm2, %xmm3, %xmm2 ; AVX1-NEXT: vpackuswb %xmm2, %xmm1, %xmm1 -; AVX1-NEXT: vinsertf128 $1, %xmm1, %ymm0, %ymm0 +; AVX1-NEXT: vpminud 80(%rdi), %xmm0, %xmm2 +; 
AVX1-NEXT: vpminud 64(%rdi), %xmm0, %xmm3 +; AVX1-NEXT: vpackusdw %xmm2, %xmm3, %xmm2 +; AVX1-NEXT: vpminud 112(%rdi), %xmm0, %xmm3 +; AVX1-NEXT: vpminud 96(%rdi), %xmm0, %xmm0 +; AVX1-NEXT: vpackusdw %xmm3, %xmm0, %xmm0 +; AVX1-NEXT: vpackuswb %xmm0, %xmm2, %xmm0 +; AVX1-NEXT: vinsertf128 $1, %xmm0, %ymm1, %ymm0 ; AVX1-NEXT: retq ; ; AVX2-LABEL: trunc_usat_v32i32_v32i8: ; AVX2: # %bb.0: -; AVX2-NEXT: vpbroadcastd {{.*#+}} ymm4 = [255,255,255,255,255,255,255,255] -; AVX2-NEXT: vpminud %ymm4, %ymm1, %ymm1 -; AVX2-NEXT: vpminud %ymm4, %ymm0, %ymm0 -; AVX2-NEXT: vpackusdw %ymm1, %ymm0, %ymm0 -; AVX2-NEXT: vpminud %ymm4, %ymm3, %ymm1 -; AVX2-NEXT: vpminud %ymm4, %ymm2, %ymm2 +; AVX2-NEXT: vpbroadcastd {{.*#+}} ymm0 = [255,255,255,255,255,255,255,255] +; AVX2-NEXT: vpminud 32(%rdi), %ymm0, %ymm1 +; AVX2-NEXT: vpminud (%rdi), %ymm0, %ymm2 ; AVX2-NEXT: vpackusdw %ymm1, %ymm2, %ymm1 -; AVX2-NEXT: vpermq {{.*#+}} ymm1 = ymm1[0,2,1,3] +; AVX2-NEXT: vpminud 96(%rdi), %ymm0, %ymm2 +; AVX2-NEXT: vpminud 64(%rdi), %ymm0, %ymm0 +; AVX2-NEXT: vpackusdw %ymm2, %ymm0, %ymm0 ; AVX2-NEXT: vpermq {{.*#+}} ymm0 = ymm0[0,2,1,3] -; AVX2-NEXT: vpackuswb %ymm1, %ymm0, %ymm0 +; AVX2-NEXT: vpermq {{.*#+}} ymm1 = ymm1[0,2,1,3] +; AVX2-NEXT: vpackuswb %ymm0, %ymm1, %ymm0 ; AVX2-NEXT: vpermq {{.*#+}} ymm0 = ymm0[0,2,1,3] ; AVX2-NEXT: retq ; ; AVX512-LABEL: trunc_usat_v32i32_v32i8: ; AVX512: # %bb.0: +; AVX512-NEXT: vmovdqa64 (%rdi), %zmm0 +; AVX512-NEXT: vmovdqa64 64(%rdi), %zmm1 ; AVX512-NEXT: vpmovusdb %zmm0, %xmm0 ; AVX512-NEXT: vpmovusdb %zmm1, %xmm1 ; AVX512-NEXT: vinserti128 $1, %xmm1, %ymm0, %ymm0 ; AVX512-NEXT: retq +; +; SKX-LABEL: trunc_usat_v32i32_v32i8: +; SKX: # %bb.0: +; SKX-NEXT: vmovdqa64 (%rdi), %zmm0 +; SKX-NEXT: vmovdqa64 64(%rdi), %zmm1 +; SKX-NEXT: vpmovusdb %zmm0, %xmm0 +; SKX-NEXT: vpmovusdb %zmm1, %xmm1 +; SKX-NEXT: vinserti128 $1, %xmm1, %ymm0, %ymm0 +; SKX-NEXT: retq + %a0 = load <32 x i32>, <32 x i32>* %p0 %1 = icmp ult <32 x i32> %a0, %2 = select <32 x i1> %1, <32 x 
i32> %a0, <32 x i32> %3 = trunc <32 x i32> %2 to <32 x i8> From llvm-commits at lists.llvm.org Sat Oct 12 00:59:29 2019 From: llvm-commits at lists.llvm.org (Craig Topper via llvm-commits) Date: Sat, 12 Oct 2019 07:59:29 -0000 Subject: [llvm] r374643 - [X86] Use pack instructions for packus/ssat truncate patterns when 256-bit is the largest legal vector and the result type is at least 256 bits. Message-ID: <20191012075929.67D079339B@lists.llvm.org> Author: ctopper Date: Sat Oct 12 00:59:29 2019 New Revision: 374643 URL: http://llvm.org/viewvc/llvm-project?rev=374643&view=rev Log: [X86] Use pack instructions for packus/ssat truncate patterns when 256-bit is the largest legal vector and the result type is at least 256 bits. Since the input type is larger than 256 bits we'll need to do some concatenating to reassemble the results. The pack instructions' ability to concatenate while packing makes this a shorter/faster sequence. Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp llvm/trunk/test/CodeGen/X86/vector-trunc-packus.ll llvm/trunk/test/CodeGen/X86/vector-trunc-ssat.ll Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=374643&r1=374642&r2=374643&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp (original) +++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Sat Oct 12 00:59:29 2019 @@ -39869,9 +39869,12 @@ static SDValue combineTruncateWithSat(SD // vXi16 truncate instructions are only available with AVX512BW. // For 256-bit or smaller vectors, we require VLX. // FIXME: We could widen truncates to 512 to remove the VLX restriction. + // If the result type is 256-bits or larger and we have disable 512-bit + // registers, we should go ahead and use the pack instructions if possible. 
bool PreferAVX512 = ((Subtarget.hasAVX512() && InSVT == MVT::i32) || (Subtarget.hasBWI() && InSVT == MVT::i16)) && - (Subtarget.hasVLX() || InVT.getSizeInBits() > 256); + (Subtarget.hasVLX() || InVT.getSizeInBits() > 256) && + !(!Subtarget.useAVX512Regs() && VT.getSizeInBits() >= 256); if (VT.isVector() && isPowerOf2_32(VT.getVectorNumElements()) && !PreferAVX512 && Modified: llvm/trunk/test/CodeGen/X86/vector-trunc-packus.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/vector-trunc-packus.ll?rev=374643&r1=374642&r2=374643&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/vector-trunc-packus.ll (original) +++ llvm/trunk/test/CodeGen/X86/vector-trunc-packus.ll Sat Oct 12 00:59:29 2019 @@ -2095,13 +2095,8 @@ define <16 x i16> @trunc_packus_v16i32_v ; ; SKX-LABEL: trunc_packus_v16i32_v16i16: ; SKX: # %bb.0: -; SKX-NEXT: vpbroadcastd {{.*#+}} ymm0 = [65535,65535,65535,65535,65535,65535,65535,65535] -; SKX-NEXT: vpminsd (%rdi), %ymm0, %ymm1 -; SKX-NEXT: vpminsd 32(%rdi), %ymm0, %ymm0 -; SKX-NEXT: vpxor %xmm2, %xmm2, %xmm2 -; SKX-NEXT: vpmaxsd %ymm2, %ymm0, %ymm0 -; SKX-NEXT: vpmaxsd %ymm2, %ymm1, %ymm1 -; SKX-NEXT: vpackusdw %ymm0, %ymm1, %ymm0 +; SKX-NEXT: vmovdqa (%rdi), %ymm0 +; SKX-NEXT: vpackusdw 32(%rdi), %ymm0, %ymm0 ; SKX-NEXT: vpermq {{.*#+}} ymm0 = ymm0[0,2,1,3] ; SKX-NEXT: retq %a0 = load <16 x i32>, <16 x i32>* %p0 @@ -4943,13 +4938,8 @@ define <32 x i8> @trunc_packus_v32i16_v3 ; ; SKX-LABEL: trunc_packus_v32i16_v32i8: ; SKX: # %bb.0: -; SKX-NEXT: vmovdqa {{.*#+}} ymm0 = [255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255] -; SKX-NEXT: vpminsw (%rdi), %ymm0, %ymm1 -; SKX-NEXT: vpminsw 32(%rdi), %ymm0, %ymm0 -; SKX-NEXT: vpxor %xmm2, %xmm2, %xmm2 -; SKX-NEXT: vpmaxsw %ymm2, %ymm0, %ymm0 -; SKX-NEXT: vpmaxsw %ymm2, %ymm1, %ymm1 -; SKX-NEXT: vpackuswb %ymm0, %ymm1, %ymm0 +; SKX-NEXT: vmovdqa (%rdi), %ymm0 +; SKX-NEXT: vpackuswb 32(%rdi), %ymm0, %ymm0 ; 
SKX-NEXT: vpermq {{.*#+}} ymm0 = ymm0[0,2,1,3] ; SKX-NEXT: retq %a0 = load <32 x i16>, <32 x i16>* %p0 @@ -5015,18 +5005,14 @@ define <32 x i8> @trunc_packus_v32i32_v3 ; ; SKX-LABEL: trunc_packus_v32i32_v32i8: ; SKX: # %bb.0: -; SKX-NEXT: vpxor %xmm0, %xmm0, %xmm0 -; SKX-NEXT: vpmaxsd 96(%rdi), %ymm0, %ymm1 -; SKX-NEXT: vpmovusdb %ymm1, %xmm1 -; SKX-NEXT: vpmaxsd 64(%rdi), %ymm0, %ymm2 -; SKX-NEXT: vpmovusdb %ymm2, %xmm2 -; SKX-NEXT: vpunpcklqdq {{.*#+}} xmm1 = xmm2[0],xmm1[0] -; SKX-NEXT: vpmaxsd 32(%rdi), %ymm0, %ymm2 -; SKX-NEXT: vpmovusdb %ymm2, %xmm2 -; SKX-NEXT: vpmaxsd (%rdi), %ymm0, %ymm0 -; SKX-NEXT: vpmovusdb %ymm0, %xmm0 -; SKX-NEXT: vpunpcklqdq {{.*#+}} xmm0 = xmm0[0],xmm2[0] -; SKX-NEXT: vinserti128 $1, %xmm1, %ymm0, %ymm0 +; SKX-NEXT: vmovdqa (%rdi), %ymm0 +; SKX-NEXT: vmovdqa 64(%rdi), %ymm1 +; SKX-NEXT: vpackssdw 96(%rdi), %ymm1, %ymm1 +; SKX-NEXT: vpermq {{.*#+}} ymm1 = ymm1[0,2,1,3] +; SKX-NEXT: vpackssdw 32(%rdi), %ymm0, %ymm0 +; SKX-NEXT: vpermq {{.*#+}} ymm0 = ymm0[0,2,1,3] +; SKX-NEXT: vpackuswb %ymm1, %ymm0, %ymm0 +; SKX-NEXT: vpermq {{.*#+}} ymm0 = ymm0[0,2,1,3] ; SKX-NEXT: retq %a0 = load <32 x i32>, <32 x i32>* %p0 %1 = icmp slt <32 x i32> %a0, Modified: llvm/trunk/test/CodeGen/X86/vector-trunc-ssat.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/vector-trunc-ssat.ll?rev=374643&r1=374642&r2=374643&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/vector-trunc-ssat.ll (original) +++ llvm/trunk/test/CodeGen/X86/vector-trunc-ssat.ll Sat Oct 12 00:59:29 2019 @@ -1878,13 +1878,8 @@ define <16 x i16> @trunc_ssat_v16i32_v16 ; ; SKX-LABEL: trunc_ssat_v16i32_v16i16: ; SKX: # %bb.0: -; SKX-NEXT: vpbroadcastd {{.*#+}} ymm0 = [32767,32767,32767,32767,32767,32767,32767,32767] -; SKX-NEXT: vpminsd (%rdi), %ymm0, %ymm1 -; SKX-NEXT: vpminsd 32(%rdi), %ymm0, %ymm0 -; SKX-NEXT: vpbroadcastd {{.*#+}} ymm2 = 
[4294934528,4294934528,4294934528,4294934528,4294934528,4294934528,4294934528,4294934528] -; SKX-NEXT: vpmaxsd %ymm2, %ymm0, %ymm0 -; SKX-NEXT: vpmaxsd %ymm2, %ymm1, %ymm1 -; SKX-NEXT: vpackssdw %ymm0, %ymm1, %ymm0 +; SKX-NEXT: vmovdqa (%rdi), %ymm0 +; SKX-NEXT: vpackssdw 32(%rdi), %ymm0, %ymm0 ; SKX-NEXT: vpermq {{.*#+}} ymm0 = ymm0[0,2,1,3] ; SKX-NEXT: retq %a0 = load <16 x i32>, <16 x i32>* %p0 @@ -4823,13 +4818,8 @@ define <32 x i8> @trunc_ssat_v32i16_v32i ; ; SKX-LABEL: trunc_ssat_v32i16_v32i8: ; SKX: # %bb.0: -; SKX-NEXT: vmovdqa {{.*#+}} ymm0 = [127,127,127,127,127,127,127,127,127,127,127,127,127,127,127,127] -; SKX-NEXT: vpminsw (%rdi), %ymm0, %ymm1 -; SKX-NEXT: vpminsw 32(%rdi), %ymm0, %ymm0 -; SKX-NEXT: vmovdqa {{.*#+}} ymm2 = [65408,65408,65408,65408,65408,65408,65408,65408,65408,65408,65408,65408,65408,65408,65408,65408] -; SKX-NEXT: vpmaxsw %ymm2, %ymm0, %ymm0 -; SKX-NEXT: vpmaxsw %ymm2, %ymm1, %ymm1 -; SKX-NEXT: vpacksswb %ymm0, %ymm1, %ymm0 +; SKX-NEXT: vmovdqa (%rdi), %ymm0 +; SKX-NEXT: vpacksswb 32(%rdi), %ymm0, %ymm0 ; SKX-NEXT: vpermq {{.*#+}} ymm0 = ymm0[0,2,1,3] ; SKX-NEXT: retq %a0 = load <32 x i16>, <32 x i16>* %p0 @@ -4895,16 +4885,13 @@ define <32 x i8> @trunc_ssat_v32i32_v32i ; SKX-LABEL: trunc_ssat_v32i32_v32i8: ; SKX: # %bb.0: ; SKX-NEXT: vmovdqa (%rdi), %ymm0 -; SKX-NEXT: vmovdqa 32(%rdi), %ymm1 -; SKX-NEXT: vmovdqa 64(%rdi), %ymm2 -; SKX-NEXT: vmovdqa 96(%rdi), %ymm3 -; SKX-NEXT: vpmovsdb %ymm3, %xmm3 -; SKX-NEXT: vpmovsdb %ymm2, %xmm2 -; SKX-NEXT: vpunpcklqdq {{.*#+}} xmm2 = xmm2[0],xmm3[0] -; SKX-NEXT: vpmovsdb %ymm1, %xmm1 -; SKX-NEXT: vpmovsdb %ymm0, %xmm0 -; SKX-NEXT: vpunpcklqdq {{.*#+}} xmm0 = xmm0[0],xmm1[0] -; SKX-NEXT: vinserti128 $1, %xmm2, %ymm0, %ymm0 +; SKX-NEXT: vmovdqa 64(%rdi), %ymm1 +; SKX-NEXT: vpackssdw 96(%rdi), %ymm1, %ymm1 +; SKX-NEXT: vpermq {{.*#+}} ymm1 = ymm1[0,2,1,3] +; SKX-NEXT: vpackssdw 32(%rdi), %ymm0, %ymm0 +; SKX-NEXT: vpermq {{.*#+}} ymm0 = ymm0[0,2,1,3] +; SKX-NEXT: vpacksswb %ymm1, %ymm0, %ymm0 +; 
SKX-NEXT: vpermq {{.*#+}} ymm0 = ymm0[0,2,1,3]
 ; SKX-NEXT: retq
   %a0 = load <32 x i32>, <32 x i32>* %p0
   %1 = icmp slt <32 x i32> %a0, 


From llvm-commits at lists.llvm.org Sat Oct 12 01:14:38 2019
From: llvm-commits at lists.llvm.org (Fangrui Song via Phabricator via llvm-commits)
Date: Sat, 12 Oct 2019 08:14:38 +0000 (UTC)
Subject: [PATCH] D68906: [llvm-size] Tidy up error messages
In-Reply-To: 
References: 
Message-ID: <6f5721e0c2eda9b862ef0a5808892399@localhost.localdomain>

MaskRay added inline comments.


================
Comment at: llvm/tools/llvm-size/llvm-size.cpp:533
                     })) {
-      error(Filename + ": No architecture specified");
+      error(Filename, "No architecture specified");
       return false;
----------------
`No` -> `no`

Most llvm binary utilities have switched to lowercase error messages.


Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68906/new/

https://reviews.llvm.org/D68906



From llvm-commits at lists.llvm.org Sat Oct 12 01:23:35 2019
From: llvm-commits at lists.llvm.org (Fangrui Song via Phabricator via llvm-commits)
Date: Sat, 12 Oct 2019 08:23:35 +0000 (UTC)
Subject: [PATCH] D68906: [llvm-size] Tidy up error messages
In-Reply-To: 
References: 
Message-ID: <2a9c6c7fce6efdbd15649b5f87e0df04@localhost.localdomain>

MaskRay added inline comments.


================
Comment at: llvm/tools/llvm-size/llvm-size.cpp:110
 
-static bool error(Twine Message) {
+static bool error(StringRef File, Twine Message) {
   HadError = true;
----------------
I'd prefer `error(Twine Message, StringRef File)`. Note the error below places the error (for error message) at the first argument position.


Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68906/new/

https://reviews.llvm.org/D68906



From llvm-commits at lists.llvm.org Sat Oct 12 01:50:54 2019
From: llvm-commits at lists.llvm.org (Thomas Preud'homme via Phabricator via llvm-commits)
Date: Sat, 12 Oct 2019 08:50:54 +0000 (UTC)
Subject: [PATCH] D68903: [LNT] NFC: Fix order of globals and locals on exec
In-Reply-To: 
References: 
Message-ID: <28ae321928f6b2f1db131978c4508b48@localhost.localdomain>

thopre accepted this revision.
thopre added a comment.
This revision is now accepted and ready to land.

LGTM but would suggest this altered description:

Per https://docs.python.org/3/library/functions.html#exec, globals parameter comes before locals. Since globals and locals refer to the same object for the call in question, we can remove locals, which will make globals used for both the global and the local variables, thus keeping the same behavior.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68903/new/

https://reviews.llvm.org/D68903



From llvm-commits at lists.llvm.org Sat Oct 12 02:18:02 2019
From: llvm-commits at lists.llvm.org (Nikita Popov via Phabricator via llvm-commits)
Date: Sat, 12 Oct 2019 09:18:02 +0000 (UTC)
Subject: [PATCH] D68651: [InstCombine] Signed saturation patterns
In-Reply-To: 
References: 
Message-ID: <7440268e4bcfa00fb24ca8a3b97c3bdb@localhost.localdomain>

nikic added a comment.

Generally looks good to me, I'm only wondering whether the `trunc` is the right place to start the match. Starting from the min/max we could match a larger set of patterns, in particular those where the result of the saturation is still extended to a larger type -- for example doing a 16-bit saturating add but continuing with a 32-bit result.


================
Comment at: llvm/include/llvm/IR/PatternMatch.h:663
+/// Match a specified integer value or vector of all elements of that
+// value.
+struct specific_apintval {
----------------
nit: `///`


================
Comment at: llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp:729
+  if (A->getType() != Ty || B->getType() != Ty)
+    return nullptr;
+
----------------
Rather than exact type equality, we could require that the original type is <= the trunc type and sext to the trunc type. This would allow also matching a saturating add between 16-bit and 8-bit number, for example. Not sure how practically relevant that would be though.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68651/new/

https://reviews.llvm.org/D68651



From llvm-commits at lists.llvm.org Sat Oct 12 03:20:39 2019
From: llvm-commits at lists.llvm.org (Nikita Popov via Phabricator via llvm-commits)
Date: Sat, 12 Oct 2019 10:20:39 +0000 (UTC)
Subject: [PATCH] D68717: [Codegen] More add_sat and sub_sat promotion
In-Reply-To: 
References: 
Message-ID: 

nikic accepted this revision.
nikic added a comment.
This revision is now accepted and ready to land.

LGTM

The signed cases are a bit of a mixed bag in isolation, but will probably do better inside a loop or with adjacent instructions.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68717/new/

https://reviews.llvm.org/D68717



From llvm-commits at lists.llvm.org Sat Oct 12 03:29:34 2019
From: llvm-commits at lists.llvm.org (George Rimar via Phabricator via llvm-commits)
Date: Sat, 12 Oct 2019 10:29:34 +0000 (UTC)
Subject: [PATCH] D68906: [llvm-size] Tidy up error messages
In-Reply-To: 
References: 
Message-ID: 

grimar added inline comments.


================
Comment at: llvm/tools/llvm-size/llvm-size.cpp:110
 
-static bool error(Twine Message) {
+static bool error(StringRef File, Twine Message) {
   HadError = true;
----------------
MaskRay wrote:
> I'd prefer `error(Twine Message, StringRef File)`. Note the error below places the error (for error message) at the first argument position.
Yeah.
But it should be `const Twine &` ("Twines should only be used accepted as const references in arguments" https://llvm.org/doxygen/classllvm_1_1Twine.html#details)

Also, why does it return `bool`? Can it be `void`?


Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68906/new/

https://reviews.llvm.org/D68906



From llvm-commits at lists.llvm.org Sat Oct 12 03:38:33 2019
From: llvm-commits at lists.llvm.org (Nikita Popov via Phabricator via llvm-commits)
Date: Sat, 12 Oct 2019 10:38:33 +0000 (UTC)
Subject: [PATCH] D68844: [SCEV] Compute exit count for simple floating point IVs
In-Reply-To: 
References: 
Message-ID: <51ea823ffaabd214d0a40a1950c2b77e@localhost.localdomain>

nikic added a comment.

I'm wondering if we can't extend Float2Int to convert these to operations on integers. I'm assuming it currently doesn't due to a `FIXME: Handle select and phi nodes`.


Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68844/new/

https://reviews.llvm.org/D68844



From llvm-commits at lists.llvm.org Sat Oct 12 03:47:28 2019
From: llvm-commits at lists.llvm.org (kamlesh kumar via Phabricator via llvm-commits)
Date: Sat, 12 Oct 2019 10:47:28 +0000 (UTC)
Subject: [PATCH] D68907: [6/7/trunk] -fno-plt generates wrong relocation for std::ios_base::Init leading to segmentation fault
Message-ID: 

kamleshbhalui created this revision.
kamleshbhalui added a reviewer: craig.topper.
kamleshbhalui added a project: LLVM.
Herald added a subscriber: hiraditya.

Fixes this https://bugs.llvm.org/show_bug.cgi?id=39252


Repository:
  rL LLVM

https://reviews.llvm.org/D68907

Files:
  llvm/lib/Target/X86/X86Subtarget.cpp


Index: llvm/lib/Target/X86/X86Subtarget.cpp
===================================================================
--- llvm/lib/Target/X86/X86Subtarget.cpp
+++ llvm/lib/Target/X86/X86Subtarget.cpp
@@ -337,10 +337,10 @@
       InstrInfo(initializeSubtargetDependencies(CPU, FS)), TLInfo(TM, *this),
       FrameLowering(*this, getStackAlignment()) {
   // Determine the PICStyle based on the target selected.
-  if (!isPositionIndependent())
+  if (is64Bit())
+    setPICStyle(PICStyles::RIPRel);
+  else if (!isPositionIndependent())
     setPICStyle(PICStyles::None);
-  else if (is64Bit())
-    setPICStyle(PICStyles::RIPRel);
   else if (isTargetCOFF())
     setPICStyle(PICStyles::None);
   else if (isTargetDarwin())


-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68907.224733.patch
Type: text/x-patch
Size: 720 bytes
Desc: not available
URL: 

From llvm-commits at lists.llvm.org Sat Oct 12 03:57:22 2019
From: llvm-commits at lists.llvm.org (Benjamin Kramer via llvm-commits)
Date: Sat, 12 Oct 2019 10:57:22 -0000
Subject: [llvm] r374646 - [LV] Merge LLVM_DEBUG blocks.
Message-ID: <20191012105723.0297A8588B@lists.llvm.org>

Author: d0k
Date: Sat Oct 12 03:57:22 2019
New Revision: 374646

URL: http://llvm.org/viewvc/llvm-project?rev=374646&view=rev
Log:
[LV] Merge LLVM_DEBUG blocks.

Avoids unused variable warnings about the range-based for loops in
there. NFCI.

Modified:
    llvm/trunk/lib/Transforms/Vectorize/LoopVectorize.cpp

Modified: llvm/trunk/lib/Transforms/Vectorize/LoopVectorize.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Vectorize/LoopVectorize.cpp?rev=374646&r1=374645&r2=374646&view=diff
==============================================================================
--- llvm/trunk/lib/Transforms/Vectorize/LoopVectorize.cpp (original)
+++ llvm/trunk/lib/Transforms/Vectorize/LoopVectorize.cpp Sat Oct 12 03:57:22 2019
@@ -5462,21 +5462,23 @@ LoopVectorizationCostModel::calculateReg
       Invariant[ClassID] += Usage;
     }
 
-    LLVM_DEBUG(dbgs() << "LV(REG): VF = " << VFs[i] << '\n');
-    LLVM_DEBUG(dbgs() << "LV(REG): Found max usage: "
-                      << MaxUsages[i].size() << " item\n");
-    for (const auto& pair : MaxUsages[i]) {
-      LLVM_DEBUG(dbgs() << "LV(REG): RegisterClass: "
-                        << TTI.getRegisterClassName(pair.first)
-                        << ", " << pair.second << " registers \n");
-    }
-    LLVM_DEBUG(dbgs() << "LV(REG): Found invariant usage: "
-                      << Invariant.size() << " item\n");
-    for (const auto& pair : Invariant) {
-      LLVM_DEBUG(dbgs() << "LV(REG): RegisterClass: "
-                        << TTI.getRegisterClassName(pair.first)
-                        << ", " << pair.second << " registers \n");
-    }
+    LLVM_DEBUG({
+      dbgs() << "LV(REG): VF = " << VFs[i] << '\n';
+      dbgs() << "LV(REG): Found max usage: " << MaxUsages[i].size()
+             << " item\n";
+      for (const auto &pair : MaxUsages[i]) {
+        dbgs() << "LV(REG): RegisterClass: "
+               << TTI.getRegisterClassName(pair.first) << ", " << pair.second
+               << " registers\n";
+      }
+      dbgs() << "LV(REG): Found invariant usage: " << Invariant.size()
+             << " item\n";
+      for (const auto &pair : Invariant) {
+        dbgs() << "LV(REG): RegisterClass: "
+               << TTI.getRegisterClassName(pair.first) << ", " << pair.second
+               << " registers\n";
+      }
+    });
 
     RU.LoopInvariantRegs = Invariant;
     RU.MaxLocalUsers = MaxUsages[i];


From llvm-commits at lists.llvm.org Sat Oct 12 04:01:53 2019
From: llvm-commits at lists.llvm.org (Benjamin Kramer via llvm-commits)
Date: Sat, 12 Oct 2019
11:01:53 -0000
Subject: [llvm] r374647 - [Attributor] Extend anonymous namespace. NFC.
Message-ID: <20191012110153.466AA88A29@lists.llvm.org>

Author: d0k
Date: Sat Oct 12 04:01:52 2019
New Revision: 374647

URL: http://llvm.org/viewvc/llvm-project?rev=374647&view=rev
Log:
[Attributor] Extend anonymous namespace. NFC.

Modified:
    llvm/trunk/lib/Transforms/IPO/Attributor.cpp

Modified: llvm/trunk/lib/Transforms/IPO/Attributor.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/IPO/Attributor.cpp?rev=374647&r1=374646&r2=374647&view=diff
==============================================================================
--- llvm/trunk/lib/Transforms/IPO/Attributor.cpp (original)
+++ llvm/trunk/lib/Transforms/IPO/Attributor.cpp Sat Oct 12 04:01:52 2019
@@ -3667,7 +3667,6 @@ struct AAHeapToStackFunction final : pub
     BUILD_STAT_NAME(MallocCalls, Function) += MallocCalls.size();
   }
 };
-} // namespace
 
 /// -------------------- Memory Behavior Attributes ----------------------------
 /// Includes read-none, read-only, and write-only.
@@ -3940,6 +3939,7 @@ struct AAMemoryBehaviorCallSite final :
     STATS_DECLTRACK_CS_ATTR(writeonly)
   }
 };
+} // namespace
 
 ChangeStatus AAMemoryBehaviorFunction::updateImpl(Attributor &A) {
 


From llvm-commits at lists.llvm.org Sat Oct 12 04:16:35 2019
From: llvm-commits at lists.llvm.org (George Rimar via Phabricator via llvm-commits)
Date: Sat, 12 Oct 2019 11:16:35 +0000 (UTC)
Subject: [PATCH] D68657: Update MinidumpYAML to use minidump::Exception for exception stream
In-Reply-To: 
References: 
Message-ID: <5882c0fff7d9cc473e3d6602aa3a8834@localhost.localdomain>

grimar added inline comments.


================
Comment at: llvm/include/llvm/ObjectYAML/MinidumpYAML.h:113-114
+  ExceptionStream()
+      : Stream(StreamKind::Exception, minidump::StreamType::Exception) {
+    memset(&MDExceptionStream, 0, sizeof(minidump::ExceptionStream));
+  }
----------------
I'd avoid memset:

```
  ExceptionStream()
      : Stream(StreamKind::Exception, minidump::StreamType::Exception),
        MDExceptionStream({}) {
```


================
Comment at: llvm/lib/ObjectYAML/MinidumpYAML.cpp:525
+  case StreamKind::Exception: {
+    auto ExpectedExceptionStream = File.getExceptionStream();
+    if (!ExpectedExceptionStream)
----------------
We often avoid using `auto` when return type is not obvious.


================
Comment at: llvm/unittests/ObjectYAML/MinidumpYAMLTest.cpp:143
+
+TEST(MinidumpYAML, ExceptionStream) {
+  SmallString<0> Storage;
----------------
I'd add a comment for each test to describe what they do.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68657/new/

https://reviews.llvm.org/D68657



From llvm-commits at lists.llvm.org Sat Oct 12 04:25:39 2019
From: llvm-commits at lists.llvm.org (George Rimar via Phabricator via llvm-commits)
Date: Sat, 12 Oct 2019 11:25:39 +0000 (UTC)
Subject: [PATCH] D68730: [llvm-objdump] Adjust spacing and field width for --section-headers
In-Reply-To: 
References: 
Message-ID: <092a5a05ab0d745cabc22d91bc295737@localhost.localdomain>

grimar accepted this revision.
grimar added a comment.
This revision is now accepted and ready to land.

LGTM


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68730/new/

https://reviews.llvm.org/D68730



From llvm-commits at lists.llvm.org Sat Oct 12 04:56:57 2019
From: llvm-commits at lists.llvm.org (Joel E.
Denny via llvm-commits) Date: Sat, 12 Oct 2019 11:56:57 -0000 Subject: [llvm] r374648 - Reland r374388: [lit] Make internal diff work in pipelines Message-ID: <20191012115657.87DE18665C@lists.llvm.org> Author: jdenny Date: Sat Oct 12 04:56:57 2019 New Revision: 374648 URL: http://llvm.org/viewvc/llvm-project?rev=374648&view=rev Log: Reland r374388: [lit] Make internal diff work in pipelines To avoid breaking some tests, D66574, D68664, D67643, and D68668 landed together. However, D68664 introduced an issue now addressed by D68839, with which these are now all relanding. Differential Revision: https://reviews.llvm.org/D66574 Added: llvm/trunk/utils/lit/lit/builtin_commands/diff.py llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-pipes.txt Removed: llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-error-0.txt Modified: llvm/trunk/utils/lit/lit/TestRunner.py llvm/trunk/utils/lit/tests/shtest-shell.py Modified: llvm/trunk/utils/lit/lit/TestRunner.py URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/lit/TestRunner.py?rev=374648&r1=374647&r2=374648&view=diff ============================================================================== --- llvm/trunk/utils/lit/lit/TestRunner.py (original) +++ llvm/trunk/utils/lit/lit/TestRunner.py Sat Oct 12 04:56:57 2019 @@ -1,7 +1,5 @@ from __future__ import absolute_import -import difflib import errno -import functools import io import itertools import getopt @@ -361,218 +359,6 @@ def executeBuiltinMkdir(cmd, cmd_shenv): exitCode = 1 return ShellCommandResult(cmd, "", stderr.getvalue(), exitCode, False) -def executeBuiltinDiff(cmd, cmd_shenv): - """executeBuiltinDiff - Compare files line by line.""" - args = expand_glob_expressions(cmd.args, cmd_shenv.cwd)[1:] - try: - opts, args = getopt.gnu_getopt(args, "wbur", ["strip-trailing-cr"]) - except getopt.GetoptError as err: - raise InternalShellError(cmd, "Unsupported: 'diff': %s" % str(err)) - - filelines, filepaths, dir_trees = ([] for i in range(3)) - ignore_all_space 
= False - ignore_space_change = False - unified_diff = False - recursive_diff = False - strip_trailing_cr = False - for o, a in opts: - if o == "-w": - ignore_all_space = True - elif o == "-b": - ignore_space_change = True - elif o == "-u": - unified_diff = True - elif o == "-r": - recursive_diff = True - elif o == "--strip-trailing-cr": - strip_trailing_cr = True - else: - assert False, "unhandled option" - - if len(args) != 2: - raise InternalShellError(cmd, "Error: missing or extra operand") - - def getDirTree(path, basedir=""): - # Tree is a tuple of form (dirname, child_trees). - # An empty dir has child_trees = [], a file has child_trees = None. - child_trees = [] - for dirname, child_dirs, files in os.walk(os.path.join(basedir, path)): - for child_dir in child_dirs: - child_trees.append(getDirTree(child_dir, dirname)) - for filename in files: - child_trees.append((filename, None)) - return path, sorted(child_trees) - - def compareTwoFiles(filepaths): - compare_bytes = False - encoding = None - filelines = [] - for file in filepaths: - try: - with open(file, 'r') as f: - filelines.append(f.readlines()) - except UnicodeDecodeError: - try: - with io.open(file, 'r', encoding="utf-8") as f: - filelines.append(f.readlines()) - encoding = "utf-8" - except: - compare_bytes = True - - if compare_bytes: - return compareTwoBinaryFiles(filepaths) - else: - return compareTwoTextFiles(filepaths, encoding) - - def compareTwoBinaryFiles(filepaths): - filelines = [] - for file in filepaths: - with open(file, 'rb') as f: - filelines.append(f.readlines()) - - exitCode = 0 - if hasattr(difflib, 'diff_bytes'): - # python 3.5 or newer - diffs = difflib.diff_bytes(difflib.unified_diff, filelines[0], filelines[1], filepaths[0].encode(), filepaths[1].encode()) - diffs = [diff.decode() for diff in diffs] - else: - # python 2.7 - func = difflib.unified_diff if unified_diff else difflib.context_diff - diffs = func(filelines[0], filelines[1], filepaths[0], filepaths[1]) - - for diff in 
diffs: - stdout.write(diff) - exitCode = 1 - return exitCode - - def compareTwoTextFiles(filepaths, encoding): - filelines = [] - for file in filepaths: - if encoding is None: - with open(file, 'r') as f: - filelines.append(f.readlines()) - else: - with io.open(file, 'r', encoding=encoding) as f: - filelines.append(f.readlines()) - - exitCode = 0 - def compose2(f, g): - return lambda x: f(g(x)) - - f = lambda x: x - if strip_trailing_cr: - f = compose2(lambda line: line.rstrip('\r'), f) - if ignore_all_space or ignore_space_change: - ignoreSpace = lambda line, separator: separator.join(line.split()) - ignoreAllSpaceOrSpaceChange = functools.partial(ignoreSpace, separator='' if ignore_all_space else ' ') - f = compose2(ignoreAllSpaceOrSpaceChange, f) - - for idx, lines in enumerate(filelines): - filelines[idx]= [f(line) for line in lines] - - func = difflib.unified_diff if unified_diff else difflib.context_diff - for diff in func(filelines[0], filelines[1], filepaths[0], filepaths[1]): - stdout.write(diff) - exitCode = 1 - return exitCode - - def printDirVsFile(dir_path, file_path): - if os.path.getsize(file_path): - msg = "File %s is a directory while file %s is a regular file" - else: - msg = "File %s is a directory while file %s is a regular empty file" - stdout.write(msg % (dir_path, file_path) + "\n") - - def printFileVsDir(file_path, dir_path): - if os.path.getsize(file_path): - msg = "File %s is a regular file while file %s is a directory" - else: - msg = "File %s is a regular empty file while file %s is a directory" - stdout.write(msg % (file_path, dir_path) + "\n") - - def printOnlyIn(basedir, path, name): - stdout.write("Only in %s: %s\n" % (os.path.join(basedir, path), name)) - - def compareDirTrees(dir_trees, base_paths=["", ""]): - # Dirnames of the trees are not checked, it's caller's responsibility, - # as top-level dirnames are always different. 
Base paths are important - # for doing os.walk, but we don't put it into tree's dirname in order - # to speed up string comparison below and while sorting in getDirTree. - left_tree, right_tree = dir_trees[0], dir_trees[1] - left_base, right_base = base_paths[0], base_paths[1] - - # Compare two files or report file vs. directory mismatch. - if left_tree[1] is None and right_tree[1] is None: - return compareTwoFiles([os.path.join(left_base, left_tree[0]), - os.path.join(right_base, right_tree[0])]) - - if left_tree[1] is None and right_tree[1] is not None: - printFileVsDir(os.path.join(left_base, left_tree[0]), - os.path.join(right_base, right_tree[0])) - return 1 - - if left_tree[1] is not None and right_tree[1] is None: - printDirVsFile(os.path.join(left_base, left_tree[0]), - os.path.join(right_base, right_tree[0])) - return 1 - - # Compare two directories via recursive use of compareDirTrees. - exitCode = 0 - left_names = [node[0] for node in left_tree[1]] - right_names = [node[0] for node in right_tree[1]] - l, r = 0, 0 - while l < len(left_names) and r < len(right_names): - # Names are sorted in getDirTree, rely on that order. - if left_names[l] < right_names[r]: - exitCode = 1 - printOnlyIn(left_base, left_tree[0], left_names[l]) - l += 1 - elif left_names[l] > right_names[r]: - exitCode = 1 - printOnlyIn(right_base, right_tree[0], right_names[r]) - r += 1 - else: - exitCode |= compareDirTrees([left_tree[1][l], right_tree[1][r]], - [os.path.join(left_base, left_tree[0]), - os.path.join(right_base, right_tree[0])]) - l += 1 - r += 1 - - # At least one of the trees has ended. Report names from the other tree. 
- while l < len(left_names): - exitCode = 1 - printOnlyIn(left_base, left_tree[0], left_names[l]) - l += 1 - while r < len(right_names): - exitCode = 1 - printOnlyIn(right_base, right_tree[0], right_names[r]) - r += 1 - return exitCode - - stderr = StringIO() - stdout = StringIO() - exitCode = 0 - try: - for file in args: - if not os.path.isabs(file): - file = os.path.realpath(os.path.join(cmd_shenv.cwd, file)) - - if recursive_diff: - dir_trees.append(getDirTree(file)) - else: - filepaths.append(file) - - if not recursive_diff: - exitCode = compareTwoFiles(filepaths) - else: - exitCode = compareDirTrees(dir_trees) - - except IOError as err: - stderr.write("Error: 'diff' command failed, %s\n" % str(err)) - exitCode = 1 - - return ShellCommandResult(cmd, stdout.getvalue(), stderr.getvalue(), exitCode, False) - def executeBuiltinRm(cmd, cmd_shenv): """executeBuiltinRm - Removes (deletes) files or directories.""" args = expand_glob_expressions(cmd.args, cmd_shenv.cwd)[1:] @@ -838,14 +624,6 @@ def _executeShCmd(cmd, shenv, results, t results.append(cmdResult) return cmdResult.exitCode - if cmd.commands[0].args[0] == 'diff': - if len(cmd.commands) != 1: - raise InternalShellError(cmd.commands[0], "Unsupported: 'diff' " - "cannot be part of a pipeline") - cmdResult = executeBuiltinDiff(cmd.commands[0], shenv) - results.append(cmdResult) - return cmdResult.exitCode - if cmd.commands[0].args[0] == 'rm': if len(cmd.commands) != 1: raise InternalShellError(cmd.commands[0], "Unsupported: 'rm' " @@ -866,7 +644,7 @@ def _executeShCmd(cmd, shenv, results, t stderrTempFiles = [] opened_files = [] named_temp_files = [] - builtin_commands = set(['cat']) + builtin_commands = set(['cat', 'diff']) builtin_commands_dir = os.path.join(os.path.dirname(os.path.abspath(__file__)), "builtin_commands") # To avoid deadlock, we use a single stderr stream for piped # output. 
This is null until we have seen some output using Added: llvm/trunk/utils/lit/lit/builtin_commands/diff.py URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/lit/builtin_commands/diff.py?rev=374648&view=auto ============================================================================== --- llvm/trunk/utils/lit/lit/builtin_commands/diff.py (added) +++ llvm/trunk/utils/lit/lit/builtin_commands/diff.py Sat Oct 12 04:56:57 2019 @@ -0,0 +1,228 @@ +import difflib +import functools +import getopt +import os +import sys + +class DiffFlags(): + def __init__(self): + self.ignore_all_space = False + self.ignore_space_change = False + self.unified_diff = False + self.recursive_diff = False + self.strip_trailing_cr = False + +def getDirTree(path, basedir=""): + # Tree is a tuple of form (dirname, child_trees). + # An empty dir has child_trees = [], a file has child_trees = None. + child_trees = [] + for dirname, child_dirs, files in os.walk(os.path.join(basedir, path)): + for child_dir in child_dirs: + child_trees.append(getDirTree(child_dir, dirname)) + for filename in files: + child_trees.append((filename, None)) + return path, sorted(child_trees) + +def compareTwoFiles(flags, filepaths): + compare_bytes = False + encoding = None + filelines = [] + for file in filepaths: + try: + with open(file, 'r') as f: + filelines.append(f.readlines()) + except UnicodeDecodeError: + try: + with io.open(file, 'r', encoding="utf-8") as f: + filelines.append(f.readlines()) + encoding = "utf-8" + except: + compare_bytes = True + + if compare_bytes: + return compareTwoBinaryFiles(flags, filepaths) + else: + return compareTwoTextFiles(flags, filepaths, encoding) + +def compareTwoBinaryFiles(flags, filepaths): + filelines = [] + for file in filepaths: + with open(file, 'rb') as f: + filelines.append(f.readlines()) + + exitCode = 0 + if hasattr(difflib, 'diff_bytes'): + # python 3.5 or newer + diffs = difflib.diff_bytes(difflib.unified_diff, filelines[0], filelines[1], 
filepaths[0].encode(), filepaths[1].encode()) + diffs = [diff.decode() for diff in diffs] + else: + # python 2.7 + if flags.unified_diff: + func = difflib.unified_diff + else: + func = difflib.context_diff + diffs = func(filelines[0], filelines[1], filepaths[0], filepaths[1]) + + for diff in diffs: + sys.stdout.write(diff) + exitCode = 1 + return exitCode + +def compareTwoTextFiles(flags, filepaths, encoding): + filelines = [] + for file in filepaths: + if encoding is None: + with open(file, 'r') as f: + filelines.append(f.readlines()) + else: + with io.open(file, 'r', encoding=encoding) as f: + filelines.append(f.readlines()) + + exitCode = 0 + def compose2(f, g): + return lambda x: f(g(x)) + + f = lambda x: x + if flags.strip_trailing_cr: + f = compose2(lambda line: line.rstrip('\r'), f) + if flags.ignore_all_space or flags.ignore_space_change: + ignoreSpace = lambda line, separator: separator.join(line.split()) + ignoreAllSpaceOrSpaceChange = functools.partial(ignoreSpace, separator='' if flags.ignore_all_space else ' ') + f = compose2(ignoreAllSpaceOrSpaceChange, f) + + for idx, lines in enumerate(filelines): + filelines[idx]= [f(line) for line in lines] + + func = difflib.unified_diff if flags.unified_diff else difflib.context_diff + for diff in func(filelines[0], filelines[1], filepaths[0], filepaths[1]): + sys.stdout.write(diff) + exitCode = 1 + return exitCode + +def printDirVsFile(dir_path, file_path): + if os.path.getsize(file_path): + msg = "File %s is a directory while file %s is a regular file" + else: + msg = "File %s is a directory while file %s is a regular empty file" + sys.stdout.write(msg % (dir_path, file_path) + "\n") + +def printFileVsDir(file_path, dir_path): + if os.path.getsize(file_path): + msg = "File %s is a regular file while file %s is a directory" + else: + msg = "File %s is a regular empty file while file %s is a directory" + sys.stdout.write(msg % (file_path, dir_path) + "\n") + +def printOnlyIn(basedir, path, name): + 
sys.stdout.write("Only in %s: %s\n" % (os.path.join(basedir, path), name)) + +def compareDirTrees(flags, dir_trees, base_paths=["", ""]): + # Dirnames of the trees are not checked, it's caller's responsibility, + # as top-level dirnames are always different. Base paths are important + # for doing os.walk, but we don't put it into tree's dirname in order + # to speed up string comparison below and while sorting in getDirTree. + left_tree, right_tree = dir_trees[0], dir_trees[1] + left_base, right_base = base_paths[0], base_paths[1] + + # Compare two files or report file vs. directory mismatch. + if left_tree[1] is None and right_tree[1] is None: + return compareTwoFiles(flags, + [os.path.join(left_base, left_tree[0]), + os.path.join(right_base, right_tree[0])]) + + if left_tree[1] is None and right_tree[1] is not None: + printFileVsDir(os.path.join(left_base, left_tree[0]), + os.path.join(right_base, right_tree[0])) + return 1 + + if left_tree[1] is not None and right_tree[1] is None: + printDirVsFile(os.path.join(left_base, left_tree[0]), + os.path.join(right_base, right_tree[0])) + return 1 + + # Compare two directories via recursive use of compareDirTrees. + exitCode = 0 + left_names = [node[0] for node in left_tree[1]] + right_names = [node[0] for node in right_tree[1]] + l, r = 0, 0 + while l < len(left_names) and r < len(right_names): + # Names are sorted in getDirTree, rely on that order. + if left_names[l] < right_names[r]: + exitCode = 1 + printOnlyIn(left_base, left_tree[0], left_names[l]) + l += 1 + elif left_names[l] > right_names[r]: + exitCode = 1 + printOnlyIn(right_base, right_tree[0], right_names[r]) + r += 1 + else: + exitCode |= compareDirTrees(flags, + [left_tree[1][l], right_tree[1][r]], + [os.path.join(left_base, left_tree[0]), + os.path.join(right_base, right_tree[0])]) + l += 1 + r += 1 + + # At least one of the trees has ended. Report names from the other tree. 
+ while l < len(left_names): + exitCode = 1 + printOnlyIn(left_base, left_tree[0], left_names[l]) + l += 1 + while r < len(right_names): + exitCode = 1 + printOnlyIn(right_base, right_tree[0], right_names[r]) + r += 1 + return exitCode + +def main(argv): + args = argv[1:] + try: + opts, args = getopt.gnu_getopt(args, "wbur", ["strip-trailing-cr"]) + except getopt.GetoptError as err: + sys.stderr.write("Unsupported: 'diff': %s\n" % str(err)) + sys.exit(1) + + flags = DiffFlags() + filelines, filepaths, dir_trees = ([] for i in range(3)) + for o, a in opts: + if o == "-w": + flags.ignore_all_space = True + elif o == "-b": + flags.ignore_space_change = True + elif o == "-u": + flags.unified_diff = True + elif o == "-r": + flags.recursive_diff = True + elif o == "--strip-trailing-cr": + flags.strip_trailing_cr = True + else: + assert False, "unhandled option" + + if len(args) != 2: + sys.stderr.write("Error: missing or extra operand\n") + sys.exit(1) + + exitCode = 0 + try: + for file in args: + if not os.path.isabs(file): + file = os.path.realpath(os.path.join(os.getcwd(), file)) + + if flags.recursive_diff: + dir_trees.append(getDirTree(file)) + else: + filepaths.append(file) + + if not flags.recursive_diff: + exitCode = compareTwoFiles(flags, filepaths) + else: + exitCode = compareDirTrees(flags, dir_trees) + + except IOError as err: + sys.stderr.write("Error: 'diff' command failed, %s\n" % str(err)) + exitCode = 1 + + sys.exit(exitCode) + +if __name__ == "__main__": + main(sys.argv) Removed: llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-error-0.txt URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-error-0.txt?rev=374647&view=auto ============================================================================== --- llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-error-0.txt (original) +++ llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-error-0.txt (removed) @@ -1,3 +0,0 @@ -# Check error on a unsupported diff 
(cannot be part of a pipeline). -# -# RUN: diff diff-error-0.txt diff-error-0.txt | echo Output Added: llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-pipes.txt URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-pipes.txt?rev=374648&view=auto ============================================================================== --- llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-pipes.txt (added) +++ llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-pipes.txt Sat Oct 12 04:56:57 2019 @@ -0,0 +1,15 @@ +# RUN: echo foo > %t.foo +# RUN: echo bar > %t.bar + +# Check output pipe. +# RUN: diff %t.foo %t.foo | FileCheck -allow-empty -check-prefix=EMPTY %s +# RUN: diff -u %t.foo %t.bar | FileCheck %s && false || true + +# Fail so lit will print output. +# RUN: false + +# CHECK: @@ +# CHECK-NEXT: -foo +# CHECK-NEXT: +bar + +# EMPTY-NOT: {{.}} Modified: llvm/trunk/utils/lit/tests/shtest-shell.py URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/shtest-shell.py?rev=374648&r1=374647&r2=374648&view=diff ============================================================================== --- llvm/trunk/utils/lit/tests/shtest-shell.py (original) +++ llvm/trunk/utils/lit/tests/shtest-shell.py Sat Oct 12 04:56:57 2019 @@ -34,28 +34,20 @@ # CHECK: error: command failed with exit status: 127 # CHECK: *** -# CHECK: FAIL: shtest-shell :: diff-error-0.txt -# CHECK: *** TEST 'shtest-shell :: diff-error-0.txt' FAILED *** -# CHECK: $ "diff" "diff-error-0.txt" "diff-error-0.txt" -# CHECK: # command stderr: -# CHECK: Unsupported: 'diff' cannot be part of a pipeline -# CHECK: error: command failed with exit status: 127 -# CHECK: *** - # CHECK: FAIL: shtest-shell :: diff-error-1.txt # CHECK: *** TEST 'shtest-shell :: diff-error-1.txt' FAILED *** # CHECK: $ "diff" "-B" "temp1.txt" "temp2.txt" # CHECK: # command stderr: # CHECK: Unsupported: 'diff': option -B not recognized -# CHECK: error: command failed with exit status: 127 +# CHECK: 
error: command failed with exit status: 1 # CHECK: *** # CHECK: FAIL: shtest-shell :: diff-error-2.txt # CHECK: *** TEST 'shtest-shell :: diff-error-2.txt' FAILED *** # CHECK: $ "diff" "temp.txt" # CHECK: # command stderr: -# CHECK: Error: missing or extra operand -# CHECK: error: command failed with exit status: 127 +# CHECK: Error: missing or extra operand +# CHECK: error: command failed with exit status: 1 # CHECK: *** # CHECK: FAIL: shtest-shell :: diff-error-3.txt @@ -82,18 +74,43 @@ # CHECK: *** TEST 'shtest-shell :: diff-error-5.txt' FAILED *** # CHECK: $ "diff" # CHECK: # command stderr: -# CHECK: Error: missing or extra operand -# CHECK: error: command failed with exit status: 127 +# CHECK: Error: missing or extra operand +# CHECK: error: command failed with exit status: 1 # CHECK: *** # CHECK: FAIL: shtest-shell :: diff-error-6.txt # CHECK: *** TEST 'shtest-shell :: diff-error-6.txt' FAILED *** # CHECK: $ "diff" # CHECK: # command stderr: -# CHECK: Error: missing or extra operand -# CHECK: error: command failed with exit status: 127 +# CHECK: Error: missing or extra operand +# CHECK: error: command failed with exit status: 1 # CHECK: *** + +# CHECK: FAIL: shtest-shell :: diff-pipes.txt + +# CHECK: *** TEST 'shtest-shell :: diff-pipes.txt' FAILED *** + +# CHECK: $ "diff" "{{[^"]*}}.foo" "{{[^"]*}}.foo" +# CHECK-NOT: note +# CHECK-NOT: error +# CHECK: $ "FileCheck" +# CHECK-NOT: note +# CHECK-NOT: error + +# CHECK: $ "diff" "-u" "{{[^"]*}}.foo" "{{[^"]*}}.bar" +# CHECK: note: command had no output on stdout or stderr +# CHECK: error: command failed with exit status: 1 +# CHECK: $ "FileCheck" +# CHECK-NOT: note +# CHECK-NOT: error +# CHECK: $ "true" + +# CHECK: $ "false" + +# CHECK: *** + + # CHECK: FAIL: shtest-shell :: diff-r-error-0.txt # CHECK: *** TEST 'shtest-shell :: diff-r-error-0.txt' FAILED *** # CHECK: $ "diff" "-r" From llvm-commits at lists.llvm.org Sat Oct 12 04:57:20 2019 From: llvm-commits at lists.llvm.org (Joel E. 
Denny via llvm-commits) Date: Sat, 12 Oct 2019 11:57:20 -0000 Subject: [llvm] r374649 - Reland r374389: [lit] Clean up internal diff's encoding handling Message-ID: <20191012115720.6371D86678@lists.llvm.org> Author: jdenny Date: Sat Oct 12 04:57:20 2019 New Revision: 374649 URL: http://llvm.org/viewvc/llvm-project?rev=374649&view=rev Log: Reland r374389: [lit] Clean up internal diff's encoding handling To avoid breaking some tests, D66574, D68664, D67643, and D68668 landed together. However, D68664 introduced an issue now addressed by D68839, with which these are now all relanding. Differential Revision: https://reviews.llvm.org/D68664 Added: llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-encodings.txt llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-in.bin (with props) llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-in.utf16 (with props) llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-in.utf8 Modified: llvm/trunk/utils/lit/lit/builtin_commands/diff.py llvm/trunk/utils/lit/tests/max-failures.py llvm/trunk/utils/lit/tests/shtest-shell.py Modified: llvm/trunk/utils/lit/lit/builtin_commands/diff.py URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/lit/builtin_commands/diff.py?rev=374649&r1=374648&r2=374649&view=diff ============================================================================== --- llvm/trunk/utils/lit/lit/builtin_commands/diff.py (original) +++ llvm/trunk/utils/lit/lit/builtin_commands/diff.py Sat Oct 12 04:57:20 2019 @@ -1,6 +1,7 @@ import difflib import functools import getopt +import locale import os import sys @@ -24,37 +25,26 @@ def getDirTree(path, basedir=""): return path, sorted(child_trees) def compareTwoFiles(flags, filepaths): - compare_bytes = False - encoding = None filelines = [] for file in filepaths: - try: - with open(file, 'r') as f: - filelines.append(f.readlines()) - except UnicodeDecodeError: - try: - with io.open(file, 'r', encoding="utf-8") as f: - filelines.append(f.readlines()) - encoding = "utf-8" 
- except: - compare_bytes = True - - if compare_bytes: - return compareTwoBinaryFiles(flags, filepaths) - else: - return compareTwoTextFiles(flags, filepaths, encoding) + with open(file, 'rb') as file_bin: + filelines.append(file_bin.readlines()) -def compareTwoBinaryFiles(flags, filepaths): - filelines = [] - for file in filepaths: - with open(file, 'rb') as f: - filelines.append(f.readlines()) + try: + return compareTwoTextFiles(flags, filepaths, filelines, + locale.getpreferredencoding(False)) + except UnicodeDecodeError: + try: + return compareTwoTextFiles(flags, filepaths, filelines, "utf-8") + except: + return compareTwoBinaryFiles(flags, filepaths, filelines) +def compareTwoBinaryFiles(flags, filepaths, filelines): exitCode = 0 if hasattr(difflib, 'diff_bytes'): # python 3.5 or newer diffs = difflib.diff_bytes(difflib.unified_diff, filelines[0], filelines[1], filepaths[0].encode(), filepaths[1].encode()) - diffs = [diff.decode() for diff in diffs] + diffs = [diff.decode(errors="backslashreplace") for diff in diffs] else: # python 2.7 if flags.unified_diff: @@ -68,15 +58,14 @@ def compareTwoBinaryFiles(flags, filepat exitCode = 1 return exitCode -def compareTwoTextFiles(flags, filepaths, encoding): +def compareTwoTextFiles(flags, filepaths, filelines_bin, encoding): filelines = [] - for file in filepaths: - if encoding is None: - with open(file, 'r') as f: - filelines.append(f.readlines()) - else: - with io.open(file, 'r', encoding=encoding) as f: - filelines.append(f.readlines()) + for lines_bin in filelines_bin: + lines = [] + for line_bin in lines_bin: + line = line_bin.decode(encoding=encoding) + lines.append(line) + filelines.append(lines) exitCode = 0 def compose2(f, g): Added: llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-encodings.txt URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-encodings.txt?rev=374649&view=auto ============================================================================== --- 
llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-encodings.txt (added) +++ llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-encodings.txt Sat Oct 12 04:57:20 2019 @@ -0,0 +1,9 @@ +# Check that diff falls back to binary mode if it cannot decode a file. + +# RUN: diff -u diff-in.bin diff-in.bin +# RUN: diff -u diff-in.utf16 diff-in.bin && false || true +# RUN: diff -u diff-in.utf8 diff-in.bin && false || true +# RUN: diff -u diff-in.bin diff-in.utf8 && false || true + +# Fail so lit will print output. +# RUN: false Added: llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-in.bin URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-in.bin?rev=374649&view=auto ============================================================================== Binary file - no diff available. Propchange: llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-in.bin ------------------------------------------------------------------------------ svn:mime-type = application/octet-stream Added: llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-in.utf16 URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-in.utf16?rev=374649&view=auto ============================================================================== Binary file - no diff available. 
Propchange: llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-in.utf16 ------------------------------------------------------------------------------ svn:mime-type = application/octet-stream Added: llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-in.utf8 URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-in.utf8?rev=374649&view=auto ============================================================================== --- llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-in.utf8 (added) +++ llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-in.utf8 Sat Oct 12 04:57:20 2019 @@ -0,0 +1,3 @@ +foo +bar +baz Modified: llvm/trunk/utils/lit/tests/max-failures.py URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/max-failures.py?rev=374649&r1=374648&r2=374649&view=diff ============================================================================== --- llvm/trunk/utils/lit/tests/max-failures.py (original) +++ llvm/trunk/utils/lit/tests/max-failures.py Sat Oct 12 04:57:20 2019 @@ -8,7 +8,7 @@ # # END. 
-# CHECK: Failing Tests (27) +# CHECK: Failing Tests (28) # CHECK: Failing Tests (1) # CHECK: Failing Tests (2) # CHECK: error: argument --max-failures: requires positive integer, but found '0' Modified: llvm/trunk/utils/lit/tests/shtest-shell.py URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/shtest-shell.py?rev=374649&r1=374648&r2=374649&view=diff ============================================================================== --- llvm/trunk/utils/lit/tests/shtest-shell.py (original) +++ llvm/trunk/utils/lit/tests/shtest-shell.py Sat Oct 12 04:57:20 2019 @@ -34,6 +34,58 @@ # CHECK: error: command failed with exit status: 127 # CHECK: *** + +# CHECK: FAIL: shtest-shell :: diff-encodings.txt +# CHECK: *** TEST 'shtest-shell :: diff-encodings.txt' FAILED *** + +# CHECK: $ "diff" "-u" "diff-in.bin" "diff-in.bin" +# CHECK-NOT: error + +# CHECK: $ "diff" "-u" "diff-in.utf16" "diff-in.bin" +# CHECK: # command output: +# CHECK-NEXT: --- +# CHECK-NEXT: +++ +# CHECK-NEXT: @@ +# CHECK-NEXT: {{^ .f.o.o.$}} +# CHECK-NEXT: {{^-.b.a.r.$}} +# CHECK-NEXT: {{^\+.b.a.r..}} +# CHECK-NEXT: {{^ .b.a.z.$}} +# CHECK: error: command failed with exit status: 1 +# CHECK: $ "true" + +# CHECK: $ "diff" "-u" "diff-in.utf8" "diff-in.bin" +# CHECK: # command output: +# CHECK-NEXT: --- +# CHECK-NEXT: +++ +# CHECK-NEXT: @@ +# CHECK-NEXT: -foo +# CHECK-NEXT: -bar +# CHECK-NEXT: -baz +# CHECK-NEXT: {{^\+.f.o.o.$}} +# CHECK-NEXT: {{^\+.b.a.r..}} +# CHECK-NEXT: {{^\+.b.a.z.$}} +# CHECK: error: command failed with exit status: 1 +# CHECK: $ "true" + +# CHECK: $ "diff" "-u" "diff-in.bin" "diff-in.utf8" +# CHECK: # command output: +# CHECK-NEXT: --- +# CHECK-NEXT: +++ +# CHECK-NEXT: @@ +# CHECK-NEXT: {{^\-.f.o.o.$}} +# CHECK-NEXT: {{^\-.b.a.r..}} +# CHECK-NEXT: {{^\-.b.a.z.$}} +# CHECK-NEXT: +foo +# CHECK-NEXT: +bar +# CHECK-NEXT: +baz +# CHECK: error: command failed with exit status: 1 +# CHECK: $ "true" + +# CHECK: $ "false" + +# CHECK: *** + + # CHECK: FAIL: shtest-shell :: 
diff-error-1.txt # CHECK: *** TEST 'shtest-shell :: diff-error-1.txt' FAILED *** # CHECK: $ "diff" "-B" "temp1.txt" "temp2.txt" @@ -245,4 +297,4 @@ # CHECK: PASS: shtest-shell :: sequencing-0.txt # CHECK: XFAIL: shtest-shell :: sequencing-1.txt # CHECK: PASS: shtest-shell :: valid-shell.txt -# CHECK: Failing Tests (27) +# CHECK: Failing Tests (28) From llvm-commits at lists.llvm.org Sat Oct 12 04:57:42 2019 From: llvm-commits at lists.llvm.org (Joel E. Denny via llvm-commits) Date: Sat, 12 Oct 2019 11:57:42 -0000 Subject: [llvm] r374650 - Reland r374390: [lit] Extend internal diff to support `-` argument Message-ID: <20191012115742.2B99C86678@lists.llvm.org> Author: jdenny Date: Sat Oct 12 04:57:41 2019 New Revision: 374650 URL: http://llvm.org/viewvc/llvm-project?rev=374650&view=rev Log: Reland r374390: [lit] Extend internal diff to support `-` argument To avoid breaking some tests, D66574, D68664, D67643, and D68668 landed together. However, D68664 introduced an issue now addressed by D68839, with which these are now all relanding. 
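For readers skimming the log, the idea behind accepting `-` can be sketched as below. This is an illustrative sketch only, not the committed code: the helper name `read_lines_bin` and its `stdin` parameter are hypothetical, introduced here so the behavior can be exercised without touching the real standard input.

```python
import os
import sys


def read_lines_bin(path, stdin=None):
    """Return the lines of `path` as bytes; "-" means standard input.

    `stdin` is a hypothetical hook (not part of lit) that lets a caller
    substitute another stream for sys.stdin.
    """
    if path == "-":
        stream = sys.stdin if stdin is None else stdin
        # Duplicate the descriptor so that closing the binary wrapper
        # does not close the caller's original stream.
        with os.fdopen(os.dup(stream.fileno()), "rb") as stdin_bin:
            return stdin_bin.readlines()
    with open(path, "rb") as file_bin:
        return file_bin.readlines()
```

Reading both operands as bytes keeps the `-` case on the same path as ordinary files, so any later decoding or binary fallback can treat the two inputs uniformly.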
Differential Revision: https://reviews.llvm.org/D67643 Added: llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-r-error-7.txt llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-r-error-8.txt Modified: llvm/trunk/utils/lit/lit/builtin_commands/diff.py llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-encodings.txt llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-pipes.txt llvm/trunk/utils/lit/tests/max-failures.py llvm/trunk/utils/lit/tests/shtest-shell.py Modified: llvm/trunk/utils/lit/lit/builtin_commands/diff.py URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/lit/builtin_commands/diff.py?rev=374650&r1=374649&r2=374650&view=diff ============================================================================== --- llvm/trunk/utils/lit/lit/builtin_commands/diff.py (original) +++ llvm/trunk/utils/lit/lit/builtin_commands/diff.py Sat Oct 12 04:57:41 2019 @@ -27,8 +27,13 @@ def getDirTree(path, basedir=""): def compareTwoFiles(flags, filepaths): filelines = [] for file in filepaths: - with open(file, 'rb') as file_bin: - filelines.append(file_bin.readlines()) + if file == "-": + stdin_fileno = sys.stdin.fileno() + with os.fdopen(os.dup(stdin_fileno), 'rb') as stdin_bin: + filelines.append(stdin_bin.readlines()) + else: + with open(file, 'rb') as file_bin: + filelines.append(file_bin.readlines()) try: return compareTwoTextFiles(flags, filepaths, filelines, @@ -194,10 +199,13 @@ def main(argv): exitCode = 0 try: for file in args: - if not os.path.isabs(file): + if file != "-" and not os.path.isabs(file): file = os.path.realpath(os.path.join(os.getcwd(), file)) if flags.recursive_diff: + if file == "-": + sys.stderr.write("Error: cannot recursively compare '-'\n") + sys.exit(1) dir_trees.append(getDirTree(file)) else: filepaths.append(file) Modified: llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-encodings.txt URL: 
http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-encodings.txt?rev=374650&r1=374649&r2=374650&view=diff ============================================================================== --- llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-encodings.txt (original) +++ llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-encodings.txt Sat Oct 12 04:57:41 2019 @@ -5,5 +5,11 @@ # RUN: diff -u diff-in.utf8 diff-in.bin && false || true # RUN: diff -u diff-in.bin diff-in.utf8 && false || true +# RUN: cat diff-in.bin | diff -u - diff-in.bin +# RUN: cat diff-in.bin | diff -u diff-in.bin - +# RUN: cat diff-in.bin | diff -u diff-in.utf16 - && false || true +# RUN: cat diff-in.bin | diff -u diff-in.utf8 - && false || true +# RUN: cat diff-in.bin | diff -u - diff-in.utf8 && false || true + # Fail so lit will print output. # RUN: false Modified: llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-pipes.txt URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-pipes.txt?rev=374650&r1=374649&r2=374650&view=diff ============================================================================== --- llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-pipes.txt (original) +++ llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-pipes.txt Sat Oct 12 04:57:41 2019 @@ -5,6 +5,16 @@ # RUN: diff %t.foo %t.foo | FileCheck -allow-empty -check-prefix=EMPTY %s # RUN: diff -u %t.foo %t.bar | FileCheck %s && false || true +# Check input pipe. +# RUN: echo foo | diff -u - %t.foo +# RUN: echo foo | diff -u %t.foo - +# RUN: echo bar | diff -u %t.foo - && false || true +# RUN: echo bar | diff -u - %t.foo && false || true + +# Check output and input pipes at the same time. +# RUN: echo foo | diff - %t.foo | FileCheck -allow-empty -check-prefix=EMPTY %s +# RUN: echo bar | diff -u %t.foo - | FileCheck %s && false || true + # Fail so lit will print output. 
# RUN: false Added: llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-r-error-7.txt URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-r-error-7.txt?rev=374650&view=auto ============================================================================== --- llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-r-error-7.txt (added) +++ llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-r-error-7.txt Sat Oct 12 04:57:41 2019 @@ -0,0 +1,2 @@ +# diff -r currently cannot handle stdin. +# RUN: diff -r - %t Added: llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-r-error-8.txt URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-r-error-8.txt?rev=374650&view=auto ============================================================================== --- llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-r-error-8.txt (added) +++ llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-r-error-8.txt Sat Oct 12 04:57:41 2019 @@ -0,0 +1,2 @@ +# diff -r currently cannot handle stdin. +# RUN: diff -r %t - Modified: llvm/trunk/utils/lit/tests/max-failures.py URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/max-failures.py?rev=374650&r1=374649&r2=374650&view=diff ============================================================================== --- llvm/trunk/utils/lit/tests/max-failures.py (original) +++ llvm/trunk/utils/lit/tests/max-failures.py Sat Oct 12 04:57:41 2019 @@ -8,7 +8,7 @@ # # END. 
-# CHECK: Failing Tests (28) +# CHECK: Failing Tests (30) # CHECK: Failing Tests (1) # CHECK: Failing Tests (2) # CHECK: error: argument --max-failures: requires positive integer, but found '0' Modified: llvm/trunk/utils/lit/tests/shtest-shell.py URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/shtest-shell.py?rev=374650&r1=374649&r2=374650&view=diff ============================================================================== --- llvm/trunk/utils/lit/tests/shtest-shell.py (original) +++ llvm/trunk/utils/lit/tests/shtest-shell.py Sat Oct 12 04:57:41 2019 @@ -81,6 +81,60 @@ # CHECK: error: command failed with exit status: 1 # CHECK: $ "true" +# CHECK: $ "cat" "diff-in.bin" +# CHECK-NOT: error +# CHECK: $ "diff" "-u" "-" "diff-in.bin" +# CHECK-NOT: error + +# CHECK: $ "cat" "diff-in.bin" +# CHECK-NOT: error +# CHECK: $ "diff" "-u" "diff-in.bin" "-" +# CHECK-NOT: error + +# CHECK: $ "cat" "diff-in.bin" +# CHECK-NOT: error +# CHECK: $ "diff" "-u" "diff-in.utf16" "-" +# CHECK: # command output: +# CHECK-NEXT: --- +# CHECK-NEXT: +++ +# CHECK-NEXT: @@ +# CHECK-NEXT: {{^ .f.o.o.$}} +# CHECK-NEXT: {{^-.b.a.r.$}} +# CHECK-NEXT: {{^\+.b.a.r..}} +# CHECK-NEXT: {{^ .b.a.z.$}} +# CHECK: error: command failed with exit status: 1 +# CHECK: $ "true" + +# CHECK: $ "cat" "diff-in.bin" +# CHECK-NOT: error +# CHECK: $ "diff" "-u" "diff-in.utf8" "-" +# CHECK: # command output: +# CHECK-NEXT: --- +# CHECK-NEXT: +++ +# CHECK-NEXT: @@ +# CHECK-NEXT: -foo +# CHECK-NEXT: -bar +# CHECK-NEXT: -baz +# CHECK-NEXT: {{^\+.f.o.o.$}} +# CHECK-NEXT: {{^\+.b.a.r..}} +# CHECK-NEXT: {{^\+.b.a.z.$}} +# CHECK: error: command failed with exit status: 1 +# CHECK: $ "true" + +# CHECK: $ "diff" "-u" "-" "diff-in.utf8" +# CHECK: # command output: +# CHECK-NEXT: --- +# CHECK-NEXT: +++ +# CHECK-NEXT: @@ +# CHECK-NEXT: {{^\-.f.o.o.$}} +# CHECK-NEXT: {{^\-.b.a.r..}} +# CHECK-NEXT: {{^\-.b.a.z.$}} +# CHECK-NEXT: +foo +# CHECK-NEXT: +bar +# CHECK-NEXT: +baz +# CHECK: error: command failed with 
exit status: 1 +# CHECK: $ "true" + # CHECK: $ "false" # CHECK: *** @@ -158,6 +212,51 @@ # CHECK-NOT: error # CHECK: $ "true" +# CHECK: $ "echo" "foo" +# CHECK: $ "diff" "-u" "-" "{{[^"]*}}.foo" +# CHECK-NOT: note +# CHECK-NOT: error + +# CHECK: $ "echo" "foo" +# CHECK: $ "diff" "-u" "{{[^"]*}}.foo" "-" +# CHECK-NOT: note +# CHECK-NOT: error + +# CHECK: $ "echo" "bar" +# CHECK: $ "diff" "-u" "{{[^"]*}}.foo" "-" +# CHECK: # command output: +# CHECK: @@ +# CHECK-NEXT: -foo +# CHECK-NEXT: +bar +# CHECK: error: command failed with exit status: 1 +# CHECK: $ "true" + +# CHECK: $ "echo" "bar" +# CHECK: $ "diff" "-u" "-" "{{[^"]*}}.foo" +# CHECK: # command output: +# CHECK: @@ +# CHECK-NEXT: -bar +# CHECK-NEXT: +foo +# CHECK: error: command failed with exit status: 1 +# CHECK: $ "true" + +# CHECK: $ "echo" "foo" +# CHECK: $ "diff" "-" "{{[^"]*}}.foo" +# CHECK-NOT: note +# CHECK-NOT: error +# CHECK: $ "FileCheck" +# CHECK-NOT: note +# CHECK-NOT: error + +# CHECK: $ "echo" "bar" +# CHECK: $ "diff" "-u" "{{[^"]*}}.foo" "-" +# CHECK: note: command had no output on stdout or stderr +# CHECK: error: command failed with exit status: 1 +# CHECK: $ "FileCheck" +# CHECK-NOT: note +# CHECK-NOT: error +# CHECK: $ "true" + # CHECK: $ "false" # CHECK: *** @@ -216,6 +315,20 @@ # CHECK: File {{.*}}dir1{{.*}}extra_file is a regular empty file while file {{.*}}dir2{{.*}}extra_file is a directory # CHECK: error: command failed with exit status: 1 +# CHECK: FAIL: shtest-shell :: diff-r-error-7.txt +# CHECK: *** TEST 'shtest-shell :: diff-r-error-7.txt' FAILED *** +# CHECK: $ "diff" "-r" "-" "{{[^"]*}}" +# CHECK: # command stderr: +# CHECK: Error: cannot recursively compare '-' +# CHECK: error: command failed with exit status: 1 + +# CHECK: FAIL: shtest-shell :: diff-r-error-8.txt +# CHECK: *** TEST 'shtest-shell :: diff-r-error-8.txt' FAILED *** +# CHECK: $ "diff" "-r" "{{[^"]*}}" "-" +# CHECK: # command stderr: +# CHECK: Error: cannot recursively compare '-' +# CHECK: error: command failed 
with exit status: 1 + # CHECK: PASS: shtest-shell :: diff-r.txt # CHECK: FAIL: shtest-shell :: error-0.txt @@ -297,4 +410,4 @@ # CHECK: PASS: shtest-shell :: sequencing-0.txt # CHECK: XFAIL: shtest-shell :: sequencing-1.txt # CHECK: PASS: shtest-shell :: valid-shell.txt -# CHECK: Failing Tests (28) +# CHECK: Failing Tests (30) From llvm-commits at lists.llvm.org Sat Oct 12 04:58:03 2019 From: llvm-commits at lists.llvm.org (Joel E. Denny via llvm-commits) Date: Sat, 12 Oct 2019 11:58:03 -0000 Subject: [llvm] r374651 - Reland r374392: [lit] Extend internal diff to support -U Message-ID: <20191012115803.90CB988C94@lists.llvm.org> Author: jdenny Date: Sat Oct 12 04:58:03 2019 New Revision: 374651 URL: http://llvm.org/viewvc/llvm-project?rev=374651&view=rev Log: Reland r374392: [lit] Extend internal diff to support -U To avoid breaking some tests, D66574, D68664, D67643, and D68668 landed together. However, D68664 introduced an issue now addressed by D68839, with which these are now all relanding. 
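As a rough illustration of what `-U` support involves, the sketch below parses `-u`/`-U N` with `getopt` and forwards the context size to `difflib.unified_diff` through its `n` parameter. The function names `parse_context` and `unified` are hypothetical and exist only for this example; the error handling is an assumption about reasonable behavior, not a copy of the patch.

```python
import difflib
import getopt


def parse_context(argv, default=3):
    """Return (unified, num_context_lines, operands) for a diff-like argv."""
    opts, operands = getopt.gnu_getopt(argv, "uU:")
    unified_diff, num_context = False, default
    for opt, arg in opts:
        if opt == "-u":
            unified_diff = True
        elif opt == "-U":
            unified_diff = True
            # int() raises ValueError for input such as "30.1".
            num_context = int(arg)
            if num_context < 0:
                raise ValueError("invalid '-U' argument: %s" % arg)
    return unified_diff, num_context, operands


def unified(old, new, num_context):
    """Return the unified diff of two line lists as a list of lines."""
    return list(difflib.unified_diff(old, new, "a", "b", n=num_context))
```

With eleven numbered lines that differ only on line 6, `-U 2` should leave two lines of context on each side of the change, and `-U0` should leave none, which is what the new diff-unified.txt test checks for.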
Differential Revision: https://reviews.llvm.org/D68668 Added: llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-unified.txt Modified: llvm/trunk/utils/lit/lit/builtin_commands/diff.py llvm/trunk/utils/lit/tests/max-failures.py llvm/trunk/utils/lit/tests/shtest-shell.py Modified: llvm/trunk/utils/lit/lit/builtin_commands/diff.py URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/lit/builtin_commands/diff.py?rev=374651&r1=374650&r2=374651&view=diff ============================================================================== --- llvm/trunk/utils/lit/lit/builtin_commands/diff.py (original) +++ llvm/trunk/utils/lit/lit/builtin_commands/diff.py Sat Oct 12 04:58:03 2019 @@ -10,6 +10,7 @@ class DiffFlags(): self.ignore_all_space = False self.ignore_space_change = False self.unified_diff = False + self.num_context_lines = 3 self.recursive_diff = False self.strip_trailing_cr = False @@ -48,7 +49,10 @@ def compareTwoBinaryFiles(flags, filepat exitCode = 0 if hasattr(difflib, 'diff_bytes'): # python 3.5 or newer - diffs = difflib.diff_bytes(difflib.unified_diff, filelines[0], filelines[1], filepaths[0].encode(), filepaths[1].encode()) + diffs = difflib.diff_bytes(difflib.unified_diff, filelines[0], + filelines[1], filepaths[0].encode(), + filepaths[1].encode(), + n = flags.num_context_lines) diffs = [diff.decode(errors="backslashreplace") for diff in diffs] else: # python 2.7 @@ -56,7 +60,8 @@ def compareTwoBinaryFiles(flags, filepat func = difflib.unified_diff else: func = difflib.context_diff - diffs = func(filelines[0], filelines[1], filepaths[0], filepaths[1]) + diffs = func(filelines[0], filelines[1], filepaths[0], filepaths[1], + n = flags.num_context_lines) for diff in diffs: sys.stdout.write(diff) @@ -88,7 +93,8 @@ def compareTwoTextFiles(flags, filepaths filelines[idx]= [f(line) for line in lines] func = difflib.unified_diff if flags.unified_diff else difflib.context_diff - for diff in func(filelines[0], filelines[1], filepaths[0], filepaths[1]): + for 
diff in func(filelines[0], filelines[1], filepaths[0], filepaths[1], + n = flags.num_context_lines): sys.stdout.write(diff) exitCode = 1 return exitCode @@ -171,7 +177,7 @@ def compareDirTrees(flags, dir_trees, ba def main(argv): args = argv[1:] try: - opts, args = getopt.gnu_getopt(args, "wbur", ["strip-trailing-cr"]) + opts, args = getopt.gnu_getopt(args, "wbuU:r", ["strip-trailing-cr"]) except getopt.GetoptError as err: sys.stderr.write("Unsupported: 'diff': %s\n" % str(err)) sys.exit(1) @@ -185,6 +191,16 @@ def main(argv): flags.ignore_space_change = True elif o == "-u": flags.unified_diff = True + elif o.startswith("-U"): + flags.unified_diff = True + try: + flags.num_context_lines = int(a) + if flags.num_context_lines < 0: + raise ValueException + except: + sys.stderr.write("Error: invalid '-U' argument: {}\n" + .format(a)) + sys.exit(1) elif o == "-r": flags.recursive_diff = True elif o == "--strip-trailing-cr": Added: llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-unified.txt URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-unified.txt?rev=374651&view=auto ============================================================================== --- llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-unified.txt (added) +++ llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-unified.txt Sat Oct 12 04:58:03 2019 @@ -0,0 +1,38 @@ +# RUN: echo 1 > %t.foo +# RUN: echo 2 >> %t.foo +# RUN: echo 3 >> %t.foo +# RUN: echo 4 >> %t.foo +# RUN: echo 5 >> %t.foo +# RUN: echo 6 foo >> %t.foo +# RUN: echo 7 >> %t.foo +# RUN: echo 8 >> %t.foo +# RUN: echo 9 >> %t.foo +# RUN: echo 10 >> %t.foo +# RUN: echo 11 >> %t.foo + +# RUN: echo 1 > %t.bar +# RUN: echo 2 >> %t.bar +# RUN: echo 3 >> %t.bar +# RUN: echo 4 >> %t.bar +# RUN: echo 5 >> %t.bar +# RUN: echo 6 bar >> %t.bar +# RUN: echo 7 >> %t.bar +# RUN: echo 8 >> %t.bar +# RUN: echo 9 >> %t.bar +# RUN: echo 10 >> %t.bar +# RUN: echo 11 >> %t.bar + +# Default is 3 lines of context. 
+# RUN: diff -u %t.foo %t.bar && false || true + +# Override default of 3 lines of context. +# RUN: diff -U 2 %t.foo %t.bar && false || true +# RUN: diff -U4 %t.foo %t.bar && false || true +# RUN: diff -U0 %t.foo %t.bar && false || true + +# Check bad -U argument. +# RUN: diff -U 30.1 %t.foo %t.foo && false || true +# RUN: diff -U-1 %t.foo %t.foo && false || true + +# Fail so lit will print output. +# RUN: false Modified: llvm/trunk/utils/lit/tests/max-failures.py URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/max-failures.py?rev=374651&r1=374650&r2=374651&view=diff ============================================================================== --- llvm/trunk/utils/lit/tests/max-failures.py (original) +++ llvm/trunk/utils/lit/tests/max-failures.py Sat Oct 12 04:58:03 2019 @@ -8,7 +8,7 @@ # # END. -# CHECK: Failing Tests (30) +# CHECK: Failing Tests (31) # CHECK: Failing Tests (1) # CHECK: Failing Tests (2) # CHECK: error: argument --max-failures: requires positive integer, but found '0' Modified: llvm/trunk/utils/lit/tests/shtest-shell.py URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/shtest-shell.py?rev=374651&r1=374650&r2=374651&view=diff ============================================================================== --- llvm/trunk/utils/lit/tests/shtest-shell.py (original) +++ llvm/trunk/utils/lit/tests/shtest-shell.py Sat Oct 12 04:58:03 2019 @@ -331,6 +331,82 @@ # CHECK: PASS: shtest-shell :: diff-r.txt + +# CHECK: FAIL: shtest-shell :: diff-unified.txt + +# CHECK: *** TEST 'shtest-shell :: diff-unified.txt' FAILED *** + +# CHECK: $ "diff" "-u" "{{[^"]*}}.foo" "{{[^"]*}}.bar" +# CHECK: # command output: +# CHECK: @@ {{.*}} @@ +# CHECK-NEXT: 3 +# CHECK-NEXT: 4 +# CHECK-NEXT: 5 +# CHECK-NEXT: -6 foo +# CHECK-NEXT: +6 bar +# CHECK-NEXT: 7 +# CHECK-NEXT: 8 +# CHECK-NEXT: 9 +# CHECK-EMPTY: +# CHECK-NEXT: error: command failed with exit status: 1 +# CHECK-NEXT: $ "true" + +# CHECK: $ "diff" "-U" "2" "{{[^"]*}}.foo" 
"{{[^"]*}}.bar" +# CHECK: # command output: +# CHECK: @@ {{.*}} @@ +# CHECK-NEXT: 4 +# CHECK-NEXT: 5 +# CHECK-NEXT: -6 foo +# CHECK-NEXT: +6 bar +# CHECK-NEXT: 7 +# CHECK-NEXT: 8 +# CHECK-EMPTY: +# CHECK-NEXT: error: command failed with exit status: 1 +# CHECK-NEXT: $ "true" + +# CHECK: $ "diff" "-U4" "{{[^"]*}}.foo" "{{[^"]*}}.bar" +# CHECK: # command output: +# CHECK: @@ {{.*}} @@ +# CHECK-NEXT: 2 +# CHECK-NEXT: 3 +# CHECK-NEXT: 4 +# CHECK-NEXT: 5 +# CHECK-NEXT: -6 foo +# CHECK-NEXT: +6 bar +# CHECK-NEXT: 7 +# CHECK-NEXT: 8 +# CHECK-NEXT: 9 +# CHECK-NEXT: 10 +# CHECK-EMPTY: +# CHECK-NEXT: error: command failed with exit status: 1 +# CHECK-NEXT: $ "true" + +# CHECK: $ "diff" "-U0" "{{[^"]*}}.foo" "{{[^"]*}}.bar" +# CHECK: # command output: +# CHECK: @@ {{.*}} @@ +# CHECK-NEXT: -6 foo +# CHECK-NEXT: +6 bar +# CHECK-EMPTY: +# CHECK-NEXT: error: command failed with exit status: 1 +# CHECK-NEXT: $ "true" + +# CHECK: $ "diff" "-U" "30.1" "{{[^"]*}}" "{{[^"]*}}" +# CHECK: # command stderr: +# CHECK: Error: invalid '-U' argument: 30.1 +# CHECK: error: command failed with exit status: 1 +# CHECK: $ "true" + +# CHECK: $ "diff" "-U-1" "{{[^"]*}}" "{{[^"]*}}" +# CHECK: # command stderr: +# CHECK: Error: invalid '-U' argument: -1 +# CHECK: error: command failed with exit status: 1 +# CHECK: $ "true" + +# CHECK: $ "false" + +# CHECK: *** + + # CHECK: FAIL: shtest-shell :: error-0.txt # CHECK: *** TEST 'shtest-shell :: error-0.txt' FAILED *** # CHECK: $ "not-a-real-command" @@ -410,4 +486,4 @@ # CHECK: PASS: shtest-shell :: sequencing-0.txt # CHECK: XFAIL: shtest-shell :: sequencing-1.txt # CHECK: PASS: shtest-shell :: valid-shell.txt -# CHECK: Failing Tests (30) +# CHECK: Failing Tests (31) From llvm-commits at lists.llvm.org Sat Oct 12 04:58:31 2019 From: llvm-commits at lists.llvm.org (Joel E. 
Denny via llvm-commits) Date: Sat, 12 Oct 2019 11:58:31 -0000 Subject: [llvm] r374652 - [lit] Fix internal diff's --strip-trailing-cr and use it Message-ID: <20191012115831.395A488E6D@lists.llvm.org> Author: jdenny Date: Sat Oct 12 04:58:30 2019 New Revision: 374652 URL: http://llvm.org/viewvc/llvm-project?rev=374652&view=rev Log: [lit] Fix internal diff's --strip-trailing-cr and use it Using GNU diff, `--strip-trailing-cr` removes a `\r` appearing before a `\n` at the end of a line. Without this patch, lit's internal diff only removes `\r` if it appears as the last character. That seems useless. This patch fixes that. This patch also adds `--strip-trailing-cr` to some tests that fail on Windows bots when D68664 is applied. Based on what I see in the bot logs, I think the following is happening. In each test there, lit diff is comparing a file with `\r\n` line endings to a file with `\n` line endings. Without D68664, lit diff reads those files with Python's universal newlines support activated, causing `\r` to be dropped. However, with D68664, lit diff reads the files in binary mode instead and thus reports that every line is different, just as GNU diff does (at least under Ubuntu). Adding `--strip-trailing-cr` to those tests restores the previous behavior while permitting the behavior of lit diff to be more like GNU diff. 
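To make the distinction concrete, here is a small sketch of the two behaviors described above. It is an assumption-laden illustration, not the code from the patch: `strip_trailing_cr` is a hypothetical helper, and the `rstrip` call merely stands in for "remove `\r` only when it is the last character".

```python
def strip_trailing_cr(line):
    """Drop a carriage return that immediately precedes the final newline."""
    if line.endswith("\r\n"):
        return line[:-2] + "\n"
    return line


dos_lines = ["foo\r\n", "bar\r\n"]
unix_lines = ["foo\n", "bar\n"]

# Stripping "\r" only at the very end of the string is a no-op here,
# because each line read from a DOS file ends with "\n", not "\r".
unchanged = [line.rstrip("\r") for line in dos_lines]

# Normalizing "\r\n" to "\n" makes the DOS and Unix inputs compare equal.
normalized = [strip_trailing_cr(line) for line in dos_lines]
```

A carriage return in the middle of a line is left alone by this helper, since only the one directly before the newline is treated as part of the line ending.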
Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D68839 Added: llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-in.dos llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-in.unix llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-strip-trailing-cr.txt Modified: llvm/trunk/test/MC/AsmParser/preserve-comments.s llvm/trunk/test/tools/llvm-cxxmap/remap.test llvm/trunk/test/tools/llvm-profdata/profile-symbol-list.test llvm/trunk/test/tools/llvm-profdata/roundtrip.test llvm/trunk/test/tools/llvm-profdata/sample-remap.test llvm/trunk/utils/lit/lit/builtin_commands/diff.py llvm/trunk/utils/lit/tests/max-failures.py llvm/trunk/utils/lit/tests/shtest-shell.py Modified: llvm/trunk/test/MC/AsmParser/preserve-comments.s URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/MC/AsmParser/preserve-comments.s?rev=374652&r1=374651&r2=374652&view=diff ============================================================================== --- llvm/trunk/test/MC/AsmParser/preserve-comments.s (original) +++ llvm/trunk/test/MC/AsmParser/preserve-comments.s Sat Oct 12 04:58:30 2019 @@ -1,5 +1,5 @@ #RUN: llvm-mc -preserve-comments -n -triple i386-linux-gnu < %s > %t - #RUN: diff %s %t + #RUN: diff --strip-trailing-cr %s %t .text foo: #Comment here Modified: llvm/trunk/test/tools/llvm-cxxmap/remap.test URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/llvm-cxxmap/remap.test?rev=374652&r1=374651&r2=374652&view=diff ============================================================================== --- llvm/trunk/test/tools/llvm-cxxmap/remap.test (original) +++ llvm/trunk/test/tools/llvm-cxxmap/remap.test Sat Oct 12 04:58:30 2019 @@ -1,5 +1,5 @@ RUN: llvm-cxxmap %S/Inputs/before.sym %S/Inputs/after.sym -r %S/Inputs/remap.map -o %t.output -Wambiguous -Wincomplete 2>&1 | FileCheck %s --allow-empty -RUN: diff %S/Inputs/expected %t.output +RUN: diff --strip-trailing-cr %S/Inputs/expected %t.output CHECK-NOT: warning CHECK-NOT: error Modified: 
llvm/trunk/test/tools/llvm-profdata/profile-symbol-list.test URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/llvm-profdata/profile-symbol-list.test?rev=374652&r1=374651&r2=374652&view=diff ============================================================================== --- llvm/trunk/test/tools/llvm-profdata/profile-symbol-list.test (original) +++ llvm/trunk/test/tools/llvm-profdata/profile-symbol-list.test Sat Oct 12 04:58:30 2019 @@ -2,4 +2,4 @@ ; RUN: llvm-profdata merge -sample -extbinary -prof-sym-list=%S/Inputs/profile-symbol-list-2.text %S/Inputs/sample-profile.proftext -o %t.2.output ; RUN: llvm-profdata merge -sample -extbinary %t.1.output %t.2.output -o %t.3.output ; RUN: llvm-profdata show -sample -show-prof-sym-list %t.3.output > %t.4.output -; RUN: diff %S/Inputs/profile-symbol-list.expected %t.4.output +; RUN: diff --strip-trailing-cr %S/Inputs/profile-symbol-list.expected %t.4.output Modified: llvm/trunk/test/tools/llvm-profdata/roundtrip.test URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/llvm-profdata/roundtrip.test?rev=374652&r1=374651&r2=374652&view=diff ============================================================================== --- llvm/trunk/test/tools/llvm-profdata/roundtrip.test (original) +++ llvm/trunk/test/tools/llvm-profdata/roundtrip.test Sat Oct 12 04:58:30 2019 @@ -1,18 +1,18 @@ RUN: llvm-profdata merge -o %t.0.profdata %S/Inputs/IR_profile.proftext RUN: llvm-profdata show -o %t.0.proftext -all-functions -text %t.0.profdata -RUN: diff %t.0.proftext %S/Inputs/IR_profile.proftext +RUN: diff --strip-trailing-cr %t.0.proftext %S/Inputs/IR_profile.proftext RUN: llvm-profdata merge -o %t.1.profdata %t.0.proftext RUN: llvm-profdata show -o %t.1.proftext -all-functions -text %t.1.profdata -RUN: diff %t.1.proftext %S/Inputs/IR_profile.proftext +RUN: diff --strip-trailing-cr %t.1.proftext %S/Inputs/IR_profile.proftext RUN: llvm-profdata merge --sample --binary -output=%t.2.profdata 
%S/Inputs/sample-profile.proftext RUN: llvm-profdata merge --sample --text -output=%t.2.proftext %t.2.profdata -RUN: diff %t.2.proftext %S/Inputs/sample-profile.proftext +RUN: diff --strip-trailing-cr %t.2.proftext %S/Inputs/sample-profile.proftext # Round trip from text --> extbinary --> text RUN: llvm-profdata merge --sample --extbinary -output=%t.3.profdata %S/Inputs/sample-profile.proftext RUN: llvm-profdata merge --sample --text -output=%t.3.proftext %t.3.profdata -RUN: diff %t.3.proftext %S/Inputs/sample-profile.proftext +RUN: diff --strip-trailing-cr %t.3.proftext %S/Inputs/sample-profile.proftext # Round trip from text --> binary --> extbinary --> text RUN: llvm-profdata merge --sample --binary -output=%t.4.profdata %S/Inputs/sample-profile.proftext RUN: llvm-profdata merge --sample --extbinary -output=%t.5.profdata %t.4.profdata RUN: llvm-profdata merge --sample --text -output=%t.4.proftext %t.5.profdata -RUN: diff %t.4.proftext %S/Inputs/sample-profile.proftext +RUN: diff --strip-trailing-cr %t.4.proftext %S/Inputs/sample-profile.proftext Modified: llvm/trunk/test/tools/llvm-profdata/sample-remap.test URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/llvm-profdata/sample-remap.test?rev=374652&r1=374651&r2=374652&view=diff ============================================================================== --- llvm/trunk/test/tools/llvm-profdata/sample-remap.test (original) +++ llvm/trunk/test/tools/llvm-profdata/sample-remap.test Sat Oct 12 04:58:30 2019 @@ -1,2 +1,2 @@ ; RUN: llvm-profdata merge -sample -text %S/Inputs/sample-remap.proftext -r %S/Inputs/sample-remap.remap -o %t.output -; RUN: diff %S/Inputs/sample-remap.expected %t.output +; RUN: diff --strip-trailing-cr %S/Inputs/sample-remap.expected %t.output Modified: llvm/trunk/utils/lit/lit/builtin_commands/diff.py URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/lit/builtin_commands/diff.py?rev=374652&r1=374651&r2=374652&view=diff 
============================================================================== --- llvm/trunk/utils/lit/lit/builtin_commands/diff.py (original) +++ llvm/trunk/utils/lit/lit/builtin_commands/diff.py Sat Oct 12 04:58:30 2019 @@ -83,7 +83,7 @@ def compareTwoTextFiles(flags, filepaths f = lambda x: x if flags.strip_trailing_cr: - f = compose2(lambda line: line.rstrip('\r'), f) + f = compose2(lambda line: line.replace('\r\n', '\n'), f) if flags.ignore_all_space or flags.ignore_space_change: ignoreSpace = lambda line, separator: separator.join(line.split()) ignoreAllSpaceOrSpaceChange = functools.partial(ignoreSpace, separator='' if flags.ignore_all_space else ' ') Added: llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-in.dos URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-in.dos?rev=374652&view=auto ============================================================================== --- llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-in.dos (added) +++ llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-in.dos Sat Oct 12 04:58:30 2019 @@ -0,0 +1,3 @@ +In this file, the +sequence "\r\n" +terminates lines. Added: llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-in.unix URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-in.unix?rev=374652&view=auto ============================================================================== --- llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-in.unix (added) +++ llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-in.unix Sat Oct 12 04:58:30 2019 @@ -0,0 +1,3 @@ +In this file, the +sequence "\n" +terminates lines. 
Added: llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-strip-trailing-cr.txt URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-strip-trailing-cr.txt?rev=374652&view=auto ============================================================================== --- llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-strip-trailing-cr.txt (added) +++ llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-strip-trailing-cr.txt Sat Oct 12 04:58:30 2019 @@ -0,0 +1,10 @@ +# Check behavior of --strip-trailing-cr. + +# RUN: diff -u diff-in.dos diff-in.unix && false || true +# RUN: diff -u diff-in.unix diff-in.dos && false || true + +# RUN: diff -u --strip-trailing-cr diff-in.dos diff-in.unix && false || true +# RUN: diff -u --strip-trailing-cr diff-in.unix diff-in.dos && false || true + +# Fail so lit will print output. +# RUN: false Modified: llvm/trunk/utils/lit/tests/max-failures.py URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/max-failures.py?rev=374652&r1=374651&r2=374652&view=diff ============================================================================== --- llvm/trunk/utils/lit/tests/max-failures.py (original) +++ llvm/trunk/utils/lit/tests/max-failures.py Sat Oct 12 04:58:30 2019 @@ -8,7 +8,7 @@ # # END. -# CHECK: Failing Tests (31) +# CHECK: Failing Tests (32) # CHECK: Failing Tests (1) # CHECK: Failing Tests (2) # CHECK: error: argument --max-failures: requires positive integer, but found '0' Modified: llvm/trunk/utils/lit/tests/shtest-shell.py URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/shtest-shell.py?rev=374652&r1=374651&r2=374652&view=diff ============================================================================== --- llvm/trunk/utils/lit/tests/shtest-shell.py (original) +++ llvm/trunk/utils/lit/tests/shtest-shell.py Sat Oct 12 04:58:30 2019 @@ -4,7 +4,7 @@ # FIXME: Temporarily dump test output so we can debug failing tests on # buildbots. 
# RUN: cat %t.out -# RUN: FileCheck --input-file %t.out %s +# RUN: FileCheck --dump-input=fail --color -vv --input-file %t.out %s # # END. @@ -332,6 +332,59 @@ # CHECK: PASS: shtest-shell :: diff-r.txt +# CHECK: FAIL: shtest-shell :: diff-strip-trailing-cr.txt + +# CHECK: *** TEST 'shtest-shell :: diff-strip-trailing-cr.txt' FAILED *** + +# CHECK: $ "diff" "-u" "diff-in.dos" "diff-in.unix" +# CHECK: # command output: +# CHECK: @@ +# CHECK-NEXT: -In this file, the +# CHECK-NEXT: -sequence "\r\n" +# CHECK-NEXT: -terminates lines. +# CHECK-NEXT: +In this file, the +# CHECK-NEXT: +sequence "\n" +# CHECK-NEXT: +terminates lines. +# CHECK: error: command failed with exit status: 1 +# CHECK: $ "true" + +# CHECK: $ "diff" "-u" "diff-in.unix" "diff-in.dos" +# CHECK: # command output: +# CHECK: @@ +# CHECK-NEXT: -In this file, the +# CHECK-NEXT: -sequence "\n" +# CHECK-NEXT: -terminates lines. +# CHECK-NEXT: +In this file, the +# CHECK-NEXT: +sequence "\r\n" +# CHECK-NEXT: +terminates lines. +# CHECK: error: command failed with exit status: 1 +# CHECK: $ "true" + +# CHECK: $ "diff" "-u" "--strip-trailing-cr" "diff-in.dos" "diff-in.unix" +# CHECK: # command output: +# CHECK: @@ +# CHECK-NEXT: In this file, the +# CHECK-NEXT: -sequence "\r\n" +# CHECK-NEXT: +sequence "\n" +# CHECK-NEXT: terminates lines. +# CHECK: error: command failed with exit status: 1 +# CHECK: $ "true" + +# CHECK: $ "diff" "-u" "--strip-trailing-cr" "diff-in.unix" "diff-in.dos" +# CHECK: # command output: +# CHECK: @@ +# CHECK-NEXT: In this file, the +# CHECK-NEXT: -sequence "\n" +# CHECK-NEXT: +sequence "\r\n" +# CHECK-NEXT: terminates lines. 
+# CHECK: error: command failed with exit status: 1 +# CHECK: $ "true" + +# CHECK: $ "false" + +# CHECK: *** + + # CHECK: FAIL: shtest-shell :: diff-unified.txt # CHECK: *** TEST 'shtest-shell :: diff-unified.txt' FAILED *** @@ -486,4 +539,4 @@ # CHECK: PASS: shtest-shell :: sequencing-0.txt # CHECK: XFAIL: shtest-shell :: sequencing-1.txt # CHECK: PASS: shtest-shell :: valid-shell.txt -# CHECK: Failing Tests (31) +# CHECK: Failing Tests (32) From llvm-commits at lists.llvm.org Sat Oct 12 04:58:49 2019 From: llvm-commits at lists.llvm.org (Simon Pilgrim via llvm-commits) Date: Sat, 12 Oct 2019 12:58:49 +0100 Subject: [llvm] r374579 - [X86][SSE] Add support for v4i8 add reduction In-Reply-To: References: <20191011175415.6D02392577@lists.llvm.org> Message-ID: Definitely unnecessary, but this is coming from a scalar_to_vector unfortunately. I'm going to investigate doing an explicit zextload from v4i8 to a v4i32 and then perform as a v16i8 reduction. Simon. On 11/10/2019 19:51, Craig Topper wrote: > Why do the load cases use a movzxdq after the movd? That seems > unnecessary. The movd should have generated 0s already. 
> > ~Craig > > > On Fri, Oct 11, 2019 at 10:51 AM Simon Pilgrim via llvm-commits > > wrote: > > Author: rksimon > Date: Fri Oct 11 10:54:15 2019 > New Revision: 374579 > > URL: http://llvm.org/viewvc/llvm-project?rev=374579&view=rev > Log: > [X86][SSE] Add support for v4i8 add reduction > > Modified: >     llvm/trunk/lib/Target/X86/X86ISelLowering.cpp >     llvm/trunk/test/CodeGen/X86/vector-reduce-add.ll > > Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp > URL: > http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=374579&r1=374578&r2=374579&view=diff > ============================================================================== > --- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp (original) > +++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Fri Oct 11 > 10:54:15 2019 > @@ -36239,10 +36239,15 @@ static SDValue combineReductionToHorizon > >    SDLoc DL(ExtElt); > > -  if (VecVT == MVT::v8i8) { > +  // vXi8 reduction - sub 128-bit vector. > +  if (VecVT == MVT::v4i8 || VecVT == MVT::v8i8) { > +    // Pad with zero. > +    if (VecVT == MVT::v4i8) > +      Rdx = DAG.getNode(ISD::CONCAT_VECTORS, DL, MVT::v8i8, Rdx, > +                        DAG.getConstant(0, DL, VecVT)); >      // Pad with undef. 
>      Rdx = DAG.getNode(ISD::CONCAT_VECTORS, DL, MVT::v16i8, Rdx, > -                      DAG.getUNDEF(VecVT)); > +                      DAG.getUNDEF(MVT::v8i8)); >      Rdx = DAG.getNode(X86ISD::PSADBW, DL, MVT::v2i64, Rdx, >                        DAG.getConstant(0, DL, MVT::v16i8)); >      Rdx = DAG.getBitcast(MVT::v16i8, Rdx); > > Modified: llvm/trunk/test/CodeGen/X86/vector-reduce-add.ll > URL: > http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/vector-reduce-add.ll?rev=374579&r1=374578&r2=374579&view=diff > ============================================================================== > --- llvm/trunk/test/CodeGen/X86/vector-reduce-add.ll (original) > +++ llvm/trunk/test/CodeGen/X86/vector-reduce-add.ll Fri Oct 11 > 10:54:15 2019 > @@ -1029,44 +1029,36 @@ define i8 @test_v2i8_load(<2 x i8>* %p) >  define i8 @test_v4i8(<4 x i8> %a0) { >  ; SSE2-LABEL: test_v4i8: >  ; SSE2:       # %bb.0: > -; SSE2-NEXT:    movdqa %xmm0, %xmm1 > -; SSE2-NEXT:    psrld $16, %xmm1 > -; SSE2-NEXT:    paddb %xmm0, %xmm1 > -; SSE2-NEXT:    movdqa %xmm1, %xmm0 > -; SSE2-NEXT:    psrlw $8, %xmm0 > -; SSE2-NEXT:    paddb %xmm1, %xmm0 > +; SSE2-NEXT:    pxor %xmm1, %xmm1 > +; SSE2-NEXT:    punpckldq {{.*#+}} xmm0 = > xmm0[0],xmm1[0],xmm0[1],xmm1[1] > +; SSE2-NEXT:    psadbw %xmm1, %xmm0 >  ; SSE2-NEXT:    movd %xmm0, %eax >  ; SSE2-NEXT:    # kill: def $al killed $al killed $eax >  ; SSE2-NEXT:    retq >  ; >  ; SSE41-LABEL: test_v4i8: >  ; SSE41:       # %bb.0: > -; SSE41-NEXT:    movdqa %xmm0, %xmm1 > -; SSE41-NEXT:    psrld $16, %xmm1 > -; SSE41-NEXT:    paddb %xmm0, %xmm1 > -; SSE41-NEXT:    movdqa %xmm1, %xmm0 > -; SSE41-NEXT:    psrlw $8, %xmm0 > -; SSE41-NEXT:    paddb %xmm1, %xmm0 > -; SSE41-NEXT:    pextrb $0, %xmm0, %eax > +; SSE41-NEXT:    pmovzxdq {{.*#+}} xmm0 = xmm0[0],zero,xmm0[1],zero > +; SSE41-NEXT:    pxor %xmm1, %xmm1 > +; SSE41-NEXT:    psadbw %xmm0, %xmm1 > +; SSE41-NEXT:    pextrb $0, %xmm1, %eax >  ; SSE41-NEXT:    # kill: def $al killed $al 
killed $eax >  ; SSE41-NEXT:    retq >  ; >  ; AVX-LABEL: test_v4i8: >  ; AVX:       # %bb.0: > -; AVX-NEXT:    vpsrld $16, %xmm0, %xmm1 > -; AVX-NEXT:    vpaddb %xmm1, %xmm0, %xmm0 > -; AVX-NEXT:    vpsrlw $8, %xmm0, %xmm1 > -; AVX-NEXT:    vpaddb %xmm1, %xmm0, %xmm0 > +; AVX-NEXT:    vpmovzxdq {{.*#+}} xmm0 = xmm0[0],zero,xmm0[1],zero > +; AVX-NEXT:    vpxor %xmm1, %xmm1, %xmm1 > +; AVX-NEXT:    vpsadbw %xmm1, %xmm0, %xmm0 >  ; AVX-NEXT:    vpextrb $0, %xmm0, %eax >  ; AVX-NEXT:    # kill: def $al killed $al killed $eax >  ; AVX-NEXT:    retq >  ; >  ; AVX512-LABEL: test_v4i8: >  ; AVX512:       # %bb.0: > -; AVX512-NEXT:    vpsrld $16, %xmm0, %xmm1 > -; AVX512-NEXT:    vpaddb %xmm1, %xmm0, %xmm0 > -; AVX512-NEXT:    vpsrlw $8, %xmm0, %xmm1 > -; AVX512-NEXT:    vpaddb %xmm1, %xmm0, %xmm0 > +; AVX512-NEXT:    vpmovzxdq {{.*#+}} xmm0 = xmm0[0],zero,xmm0[1],zero > +; AVX512-NEXT:    vpxor %xmm1, %xmm1, %xmm1 > +; AVX512-NEXT:    vpsadbw %xmm1, %xmm0, %xmm0 >  ; AVX512-NEXT:    vpextrb $0, %xmm0, %eax >  ; AVX512-NEXT:    # kill: def $al killed $al killed $eax >  ; AVX512-NEXT:    retq > @@ -1078,36 +1070,28 @@ define i8 @test_v4i8_load(<4 x i8>* %p) >  ; SSE2-LABEL: test_v4i8_load: >  ; SSE2:       # %bb.0: >  ; SSE2-NEXT:    movd {{.*#+}} xmm0 = mem[0],zero,zero,zero > -; SSE2-NEXT:    movdqa %xmm0, %xmm1 > -; SSE2-NEXT:    psrld $16, %xmm1 > -; SSE2-NEXT:    paddb %xmm0, %xmm1 > -; SSE2-NEXT:    movdqa %xmm1, %xmm0 > -; SSE2-NEXT:    psrlw $8, %xmm0 > -; SSE2-NEXT:    paddb %xmm1, %xmm0 > -; SSE2-NEXT:    movd %xmm0, %eax > +; SSE2-NEXT:    pxor %xmm1, %xmm1 > +; SSE2-NEXT:    psadbw %xmm0, %xmm1 > +; SSE2-NEXT:    movd %xmm1, %eax >  ; SSE2-NEXT:    # kill: def $al killed $al killed $eax >  ; SSE2-NEXT:    retq >  ; >  ; SSE41-LABEL: test_v4i8_load: >  ; SSE41:       # %bb.0: >  ; SSE41-NEXT:    movd {{.*#+}} xmm0 = mem[0],zero,zero,zero > -; SSE41-NEXT:    movdqa %xmm0, %xmm1 > -; SSE41-NEXT:    psrld $16, %xmm1 > -; SSE41-NEXT:    paddb %xmm0, %xmm1 > -; 
SSE41-NEXT:    movdqa %xmm1, %xmm0 > -; SSE41-NEXT:    psrlw $8, %xmm0 > -; SSE41-NEXT:    paddb %xmm1, %xmm0 > -; SSE41-NEXT:    pextrb $0, %xmm0, %eax > +; SSE41-NEXT:    pmovzxdq {{.*#+}} xmm0 = xmm0[0],zero,xmm0[1],zero > +; SSE41-NEXT:    pxor %xmm1, %xmm1 > +; SSE41-NEXT:    psadbw %xmm0, %xmm1 > +; SSE41-NEXT:    pextrb $0, %xmm1, %eax >  ; SSE41-NEXT:    # kill: def $al killed $al killed $eax >  ; SSE41-NEXT:    retq >  ; >  ; AVX-LABEL: test_v4i8_load: >  ; AVX:       # %bb.0: >  ; AVX-NEXT:    vmovd {{.*#+}} xmm0 = mem[0],zero,zero,zero > -; AVX-NEXT:    vpsrld $16, %xmm0, %xmm1 > -; AVX-NEXT:    vpaddb %xmm1, %xmm0, %xmm0 > -; AVX-NEXT:    vpsrlw $8, %xmm0, %xmm1 > -; AVX-NEXT:    vpaddb %xmm1, %xmm0, %xmm0 > +; AVX-NEXT:    vpmovzxdq {{.*#+}} xmm0 = xmm0[0],zero,xmm0[1],zero > +; AVX-NEXT:    vpxor %xmm1, %xmm1, %xmm1 > +; AVX-NEXT:    vpsadbw %xmm1, %xmm0, %xmm0 >  ; AVX-NEXT:    vpextrb $0, %xmm0, %eax >  ; AVX-NEXT:    # kill: def $al killed $al killed $eax >  ; AVX-NEXT:    retq > @@ -1115,10 +1099,9 @@ define i8 @test_v4i8_load(<4 x i8>* %p) >  ; AVX512-LABEL: test_v4i8_load: >  ; AVX512:       # %bb.0: >  ; AVX512-NEXT:    vmovd {{.*#+}} xmm0 = mem[0],zero,zero,zero > -; AVX512-NEXT:    vpsrld $16, %xmm0, %xmm1 > -; AVX512-NEXT:    vpaddb %xmm1, %xmm0, %xmm0 > -; AVX512-NEXT:    vpsrlw $8, %xmm0, %xmm1 > -; AVX512-NEXT:    vpaddb %xmm1, %xmm0, %xmm0 > +; AVX512-NEXT:    vpmovzxdq {{.*#+}} xmm0 = xmm0[0],zero,xmm0[1],zero > +; AVX512-NEXT:    vpxor %xmm1, %xmm1, %xmm1 > +; AVX512-NEXT:    vpsadbw %xmm1, %xmm0, %xmm0 >  ; AVX512-NEXT:    vpextrb $0, %xmm0, %eax >  ; AVX512-NEXT:    # kill: def $al killed $al killed $eax >  ; AVX512-NEXT:    retq > > > _______________________________________________ > llvm-commits mailing list > llvm-commits at lists.llvm.org > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From llvm-commits at lists.llvm.org Sat Oct 12 05:01:13 2019 From: llvm-commits at lists.llvm.org (Joel E. Denny via Phabricator via llvm-commits) Date: Sat, 12 Oct 2019 12:01:13 +0000 (UTC) Subject: [PATCH] D68839: [lit] Fix internal diff's --strip-trailing-cr and use it In-Reply-To: References: Message-ID: <32b9fe400ccf3ebf32bd25de83b60e82@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rG0f80927316c7: [lit] Fix internal diff's --strip-trailing-cr and use it (authored by jdenny). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68839/new/ https://reviews.llvm.org/D68839 Files: llvm/test/MC/AsmParser/preserve-comments.s llvm/test/tools/llvm-cxxmap/remap.test llvm/test/tools/llvm-profdata/profile-symbol-list.test llvm/test/tools/llvm-profdata/roundtrip.test llvm/test/tools/llvm-profdata/sample-remap.test llvm/utils/lit/lit/builtin_commands/diff.py llvm/utils/lit/tests/Inputs/shtest-shell/diff-in.dos llvm/utils/lit/tests/Inputs/shtest-shell/diff-in.unix llvm/utils/lit/tests/Inputs/shtest-shell/diff-strip-trailing-cr.txt llvm/utils/lit/tests/max-failures.py llvm/utils/lit/tests/shtest-shell.py -------------- next part -------------- A non-text attachment was scrubbed... Name: D68839.224738.patch Type: text/x-patch Size: 8677 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Sat Oct 12 05:20:19 2019 From: llvm-commits at lists.llvm.org (George Rimar via Phabricator via llvm-commits) Date: Sat, 12 Oct 2019 12:20:19 +0000 (UTC) Subject: [PATCH] D68848: [llvm-objdump] Use a counter for llvm-objdump -h instead of the section index. In-Reply-To: References: Message-ID: <8f6488d636a5fece490ceaf8d5c75f93@localhost.localdomain> grimar added inline comments. 
================
Comment at: llvm/tools/llvm-objdump/llvm-objdump.cpp:1699
+ uint64_t Idx;
+ for (const SectionRef &Section : ToolSectionFilter(*Obj, &Idx)) {
  StringRef Name = unwrapOrError(Section.getName(), Obj->getFileName());
----------------
rupprecht wrote:
> grimar wrote:
> > Looking at this,
> > should `ToolSectionFilter` just return `[(SectionRef&)Ref, (uint64_t )Index]` struct/pair instead? It seems that could make the whole logic simpler.
> That's one of the paths I considered while writing this patch. A couple thoughts:
> 1) This is the only place that needs this counter value, and loop iteration in every other case (7 other places in this file, 1 in the MachO dumper) would now be a little more complicated even when callers don't care about the index, e.g. it would now be:
> ```
> for (const SomeWrapperType &Foo : ToolSectionFilter(*Obj)) {
> const SectionRef &Section = Foo.Section;
> ```
> 2) llvm has `make_filter_range` which probably did not exist at the time this code was first written, and it would be nice to completely remove `SectionFilter` and `SectionFilterIterator` from llvm-objdump.h in favor of those standard llvm libraries, but AIUI in order to use that, the return type would need to be `SectionRef`, not some wrapper type. (I'm trying to do that in a separate branch, but I'm dealing with template woes).
> 3) But on the plus side, it does avoid out parameters which would be very nice.
>
> I think 1&2 outweigh 3, so I'm leaning towards this approach. But I'm still exploring alternatives.

I see. I have no better or different ideas at the moment. Perhaps the current approach is OK for now.

================
Comment at: llvm/tools/llvm-objdump/llvm-objdump.cpp:376
+ // increment so the indexing is stable.
+ return {/*Keep=*/is_contained(FilterSections, SecName),
+ /*IncrementIndex=*/true};
----------------
Can we have a test for this logic? I.e. for a case when `Keep=false`, `IncrementIndex=true`.
(Doesn't seem we have it) Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68848/new/ https://reviews.llvm.org/D68848 From llvm-commits at lists.llvm.org Sat Oct 12 05:32:01 2019 From: llvm-commits at lists.llvm.org (Joel E. Denny via llvm-commits) Date: Sat, 12 Oct 2019 12:32:01 -0000 Subject: [llvm] r374653 - [lit] Fix a few oversights in r374651 that broke some bots Message-ID: <20191012123201.19CC381E2E@lists.llvm.org> Author: jdenny Date: Sat Oct 12 05:32:00 2019 New Revision: 374653 URL: http://llvm.org/viewvc/llvm-project?rev=374653&view=rev Log: [lit] Fix a few oversights in r374651 that broke some bots Modified: llvm/trunk/test/MC/ARM/preserve-comments-arm.s llvm/trunk/utils/lit/tests/shtest-shell.py Modified: llvm/trunk/test/MC/ARM/preserve-comments-arm.s URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/MC/ARM/preserve-comments-arm.s?rev=374653&r1=374652&r2=374653&view=diff ============================================================================== --- llvm/trunk/test/MC/ARM/preserve-comments-arm.s (original) +++ llvm/trunk/test/MC/ARM/preserve-comments-arm.s Sat Oct 12 05:32:00 2019 @@ -1,6 +1,6 @@ @RUN: llvm-mc -preserve-comments -n -triple arm-eabi < %s > %t @RUN: sed 's/#[C]omment/@Comment/g' %s > %t2 - @RUN: diff %t %t2 + @RUN: diff --strip-trailing-cr %t %t2 .text mov r0, r0 Modified: llvm/trunk/utils/lit/tests/shtest-shell.py URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/shtest-shell.py?rev=374653&r1=374652&r2=374653&view=diff ============================================================================== --- llvm/trunk/utils/lit/tests/shtest-shell.py (original) +++ llvm/trunk/utils/lit/tests/shtest-shell.py Sat Oct 12 05:32:00 2019 @@ -4,7 +4,7 @@ # FIXME: Temporarily dump test output so we can debug failing tests on # buildbots. # RUN: cat %t.out -# RUN: FileCheck --dump-input=fail --color -vv --input-file %t.out %s +# RUN: FileCheck --input-file %t.out %s # # END. 
From llvm-commits at lists.llvm.org Sat Oct 12 05:44:52 2019 From: llvm-commits at lists.llvm.org (Alex Cameron via Phabricator via llvm-commits) Date: Sat, 12 Oct 2019 12:44:52 +0000 (UTC) Subject: [PATCH] D68906: [llvm-size] Tidy up error messages In-Reply-To: References: Message-ID: <7e2204cf5436e3f20bfd3acbde38e69d@localhost.localdomain> tetsuo-cpp updated this revision to Diff 224740. tetsuo-cpp added a comment. Address review comments. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68906/new/ https://reviews.llvm.org/D68906 Files: llvm/test/tools/llvm-size/invalid-input.test llvm/test/tools/llvm-size/no-input.test llvm/tools/llvm-size/llvm-size.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D68906.224740.patch Type: text/x-patch Size: 6813 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Sat Oct 12 06:08:21 2019 From: llvm-commits at lists.llvm.org (Joel E. Denny via llvm-commits) Date: Sat, 12 Oct 2019 13:08:21 -0000 Subject: [llvm] r374654 - [lit] Try to fix new tests that fail on Windows bots Message-ID: <20191012130821.7CB60878FA@lists.llvm.org> Author: jdenny Date: Sat Oct 12 06:08:21 2019 New Revision: 374654 URL: http://llvm.org/viewvc/llvm-project?rev=374654&view=rev Log: [lit] Try to fix new tests that fail on Windows bots Modified: llvm/trunk/utils/lit/tests/shtest-shell.py Modified: llvm/trunk/utils/lit/tests/shtest-shell.py URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/shtest-shell.py?rev=374654&r1=374653&r2=374654&view=diff ============================================================================== --- llvm/trunk/utils/lit/tests/shtest-shell.py (original) +++ llvm/trunk/utils/lit/tests/shtest-shell.py Sat Oct 12 06:08:21 2019 @@ -46,10 +46,10 @@ # CHECK-NEXT: --- # CHECK-NEXT: +++ # CHECK-NEXT: @@ -# CHECK-NEXT: {{^ .f.o.o.$}} -# CHECK-NEXT: {{^-.b.a.r.$}} -# CHECK-NEXT: {{^\+.b.a.r..}} -# CHECK-NEXT: {{^ .b.a.z.$}} +# CHECK-NEXT: {{^ .+f.+o.+o.+$}} +# 
CHECK-NEXT: {{^-.+b.+a.+r.+$}} +# CHECK-NEXT: {{^\+.+b.+a.+r.+$}} +# CHECK-NEXT: {{^ .+b.+a.+z.+$}} # CHECK: error: command failed with exit status: 1 # CHECK: $ "true" @@ -61,9 +61,9 @@ # CHECK-NEXT: -foo # CHECK-NEXT: -bar # CHECK-NEXT: -baz -# CHECK-NEXT: {{^\+.f.o.o.$}} -# CHECK-NEXT: {{^\+.b.a.r..}} -# CHECK-NEXT: {{^\+.b.a.z.$}} +# CHECK-NEXT: {{^\+.+f.+o.+o.+$}} +# CHECK-NEXT: {{^\+.+b.+a.+r.+$}} +# CHECK-NEXT: {{^\+.+b.+a.+z.+$}} # CHECK: error: command failed with exit status: 1 # CHECK: $ "true" @@ -72,9 +72,9 @@ # CHECK-NEXT: --- # CHECK-NEXT: +++ # CHECK-NEXT: @@ -# CHECK-NEXT: {{^\-.f.o.o.$}} -# CHECK-NEXT: {{^\-.b.a.r..}} -# CHECK-NEXT: {{^\-.b.a.z.$}} +# CHECK-NEXT: {{^\-.+f.+o.+o.+$}} +# CHECK-NEXT: {{^\-.+b.+a.+r.+$}} +# CHECK-NEXT: {{^\-.+b.+a.+z.+$}} # CHECK-NEXT: +foo # CHECK-NEXT: +bar # CHECK-NEXT: +baz @@ -98,10 +98,10 @@ # CHECK-NEXT: --- # CHECK-NEXT: +++ # CHECK-NEXT: @@ -# CHECK-NEXT: {{^ .f.o.o.$}} -# CHECK-NEXT: {{^-.b.a.r.$}} -# CHECK-NEXT: {{^\+.b.a.r..}} -# CHECK-NEXT: {{^ .b.a.z.$}} +# CHECK-NEXT: {{^ .+f.+o.+o.+$}} +# CHECK-NEXT: {{^-.+b.+a.+r.+$}} +# CHECK-NEXT: {{^\+.+b.+a.+r.+$}} +# CHECK-NEXT: {{^ .+b.+a.+z.+$}} # CHECK: error: command failed with exit status: 1 # CHECK: $ "true" @@ -115,9 +115,9 @@ # CHECK-NEXT: -foo # CHECK-NEXT: -bar # CHECK-NEXT: -baz -# CHECK-NEXT: {{^\+.f.o.o.$}} -# CHECK-NEXT: {{^\+.b.a.r..}} -# CHECK-NEXT: {{^\+.b.a.z.$}} +# CHECK-NEXT: {{^\+.+f.+o.+o.+$}} +# CHECK-NEXT: {{^\+.+b.+a.+r.+$}} +# CHECK-NEXT: {{^\+.+b.+a.+z.+$}} # CHECK: error: command failed with exit status: 1 # CHECK: $ "true" @@ -126,9 +126,9 @@ # CHECK-NEXT: --- # CHECK-NEXT: +++ # CHECK-NEXT: @@ -# CHECK-NEXT: {{^\-.f.o.o.$}} -# CHECK-NEXT: {{^\-.b.a.r..}} -# CHECK-NEXT: {{^\-.b.a.z.$}} +# CHECK-NEXT: {{^\-.+f.+o.+o.+$}} +# CHECK-NEXT: {{^\-.+b.+a.+r.+$}} +# CHECK-NEXT: {{^\-.+b.+a.+z.+$}} # CHECK-NEXT: +foo # CHECK-NEXT: +bar # CHECK-NEXT: +baz From llvm-commits at lists.llvm.org Sat Oct 12 06:21:50 2019 From: llvm-commits at 
lists.llvm.org (Simon Pilgrim via llvm-commits) Date: Sat, 12 Oct 2019 13:21:50 -0000 Subject: [llvm] r374655 - [CostModel][X86] Improve sum reduction costs. Message-ID: <20191012132150.64ACA88F9B@lists.llvm.org> Author: rksimon Date: Sat Oct 12 06:21:50 2019 New Revision: 374655 URL: http://llvm.org/viewvc/llvm-project?rev=374655&view=rev Log: [CostModel][X86] Improve sum reduction costs. I can't see any notable differences in costs between SSE2 and SSE42 arches for FADD/ADD reduction, so I've lowered the target to just SSE2. I've also added vXi8 sum reduction costs in line with the PSADBW codegen and discussions on PR42674. Modified: llvm/trunk/lib/Target/X86/X86TargetTransformInfo.cpp llvm/trunk/test/Analysis/CostModel/X86/reduce-add.ll llvm/trunk/test/Analysis/CostModel/X86/reduction.ll llvm/trunk/test/Transforms/SLPVectorizer/X86/remark_horcost.ll Modified: llvm/trunk/lib/Target/X86/X86TargetTransformInfo.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86TargetTransformInfo.cpp?rev=374655&r1=374654&r2=374655&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86TargetTransformInfo.cpp (original) +++ llvm/trunk/lib/Target/X86/X86TargetTransformInfo.cpp Sat Oct 12 06:21:50 2019 @@ -2488,7 +2488,7 @@ int X86TTIImpl::getArithmeticReductionCo // We use the Intel Architecture Code Analyzer(IACA) to measure the throughput // and make it as the cost. - static const CostTblEntry SSE42CostTblPairWise[] = { + static const CostTblEntry SSE2CostTblPairWise[] = { { ISD::FADD, MVT::v2f64, 2 }, { ISD::FADD, MVT::v4f32, 4 }, { ISD::ADD, MVT::v2i64, 2 }, // The data reported by the IACA tool is "1.6". 
@@ -2497,23 +2497,23 @@ int X86TTIImpl::getArithmeticReductionCo { ISD::ADD, MVT::v2i16, 3 }, // FIXME: chosen to be less than v4i16 { ISD::ADD, MVT::v4i16, 4 }, // FIXME: chosen to be less than v8i16 { ISD::ADD, MVT::v8i16, 5 }, + { ISD::ADD, MVT::v2i8, 2 }, + { ISD::ADD, MVT::v4i8, 2 }, + { ISD::ADD, MVT::v8i8, 2 }, + { ISD::ADD, MVT::v16i8, 3 }, }; static const CostTblEntry AVX1CostTblPairWise[] = { - { ISD::FADD, MVT::v4f32, 4 }, { ISD::FADD, MVT::v4f64, 5 }, { ISD::FADD, MVT::v8f32, 7 }, { ISD::ADD, MVT::v2i64, 1 }, // The data reported by the IACA tool is "1.5". - { ISD::ADD, MVT::v2i32, 2 }, // FIXME: chosen to be less than v4i32 - { ISD::ADD, MVT::v4i32, 3 }, // The data reported by the IACA tool is "3.5". { ISD::ADD, MVT::v4i64, 5 }, // The data reported by the IACA tool is "4.8". - { ISD::ADD, MVT::v2i16, 3 }, // FIXME: chosen to be less than v4i16 - { ISD::ADD, MVT::v4i16, 4 }, // FIXME: chosen to be less than v8i16 - { ISD::ADD, MVT::v8i16, 5 }, { ISD::ADD, MVT::v8i32, 5 }, + { ISD::ADD, MVT::v16i16, 6 }, + { ISD::ADD, MVT::v32i8, 4 }, }; - static const CostTblEntry SSE42CostTblNoPairWise[] = { + static const CostTblEntry SSE2CostTblNoPairWise[] = { { ISD::FADD, MVT::v2f64, 2 }, { ISD::FADD, MVT::v4f32, 4 }, { ISD::ADD, MVT::v2i64, 2 }, // The data reported by the IACA tool is "1.6". @@ -2522,20 +2522,21 @@ int X86TTIImpl::getArithmeticReductionCo { ISD::ADD, MVT::v2i16, 2 }, // The data reported by the IACA tool is "4.3". { ISD::ADD, MVT::v4i16, 3 }, // The data reported by the IACA tool is "4.3". { ISD::ADD, MVT::v8i16, 4 }, // The data reported by the IACA tool is "4.3". + { ISD::ADD, MVT::v2i8, 2 }, + { ISD::ADD, MVT::v4i8, 2 }, + { ISD::ADD, MVT::v8i8, 2 }, + { ISD::ADD, MVT::v16i8, 3 }, }; static const CostTblEntry AVX1CostTblNoPairWise[] = { - { ISD::FADD, MVT::v4f32, 3 }, { ISD::FADD, MVT::v4f64, 3 }, + { ISD::FADD, MVT::v4f32, 3 }, { ISD::FADD, MVT::v8f32, 4 }, { ISD::ADD, MVT::v2i64, 1 }, // The data reported by the IACA tool is "1.5". 
- { ISD::ADD, MVT::v2i32, 2 }, // FIXME: chosen to be less than v4i32 - { ISD::ADD, MVT::v4i32, 3 }, // The data reported by the IACA tool is "2.8". { ISD::ADD, MVT::v4i64, 3 }, - { ISD::ADD, MVT::v2i16, 2 }, // The data reported by the IACA tool is "4.3". - { ISD::ADD, MVT::v4i16, 3 }, // The data reported by the IACA tool is "4.3". - { ISD::ADD, MVT::v8i16, 4 }, { ISD::ADD, MVT::v8i32, 5 }, + { ISD::ADD, MVT::v16i16, 5 }, + { ISD::ADD, MVT::v32i8, 4 }, }; int ISD = TLI->InstructionOpcodeToISD(Opcode); @@ -2552,16 +2553,16 @@ int X86TTIImpl::getArithmeticReductionCo if (const auto *Entry = CostTableLookup(AVX1CostTblPairWise, ISD, MTy)) return Entry->Cost; - if (ST->hasSSE42()) - if (const auto *Entry = CostTableLookup(SSE42CostTblPairWise, ISD, MTy)) + if (ST->hasSSE2()) + if (const auto *Entry = CostTableLookup(SSE2CostTblPairWise, ISD, MTy)) return Entry->Cost; } else { if (ST->hasAVX()) if (const auto *Entry = CostTableLookup(AVX1CostTblNoPairWise, ISD, MTy)) return Entry->Cost; - if (ST->hasSSE42()) - if (const auto *Entry = CostTableLookup(SSE42CostTblNoPairWise, ISD, MTy)) + if (ST->hasSSE2()) + if (const auto *Entry = CostTableLookup(SSE2CostTblNoPairWise, ISD, MTy)) return Entry->Cost; } } @@ -2575,16 +2576,16 @@ int X86TTIImpl::getArithmeticReductionCo if (const auto *Entry = CostTableLookup(AVX1CostTblPairWise, ISD, MTy)) return LT.first * Entry->Cost; - if (ST->hasSSE42()) - if (const auto *Entry = CostTableLookup(SSE42CostTblPairWise, ISD, MTy)) + if (ST->hasSSE2()) + if (const auto *Entry = CostTableLookup(SSE2CostTblPairWise, ISD, MTy)) return LT.first * Entry->Cost; } else { if (ST->hasAVX()) if (const auto *Entry = CostTableLookup(AVX1CostTblNoPairWise, ISD, MTy)) return LT.first * Entry->Cost; - if (ST->hasSSE42()) - if (const auto *Entry = CostTableLookup(SSE42CostTblNoPairWise, ISD, MTy)) + if (ST->hasSSE2()) + if (const auto *Entry = CostTableLookup(SSE2CostTblNoPairWise, ISD, MTy)) return LT.first * Entry->Cost; } Modified: 
llvm/trunk/test/Analysis/CostModel/X86/reduce-add.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Analysis/CostModel/X86/reduce-add.ll?rev=374655&r1=374654&r2=374655&view=diff ============================================================================== --- llvm/trunk/test/Analysis/CostModel/X86/reduce-add.ll (original) +++ llvm/trunk/test/Analysis/CostModel/X86/reduce-add.ll Sat Oct 12 06:21:50 2019 @@ -9,29 +9,13 @@ ; RUN: opt < %s -cost-model -mtriple=x86_64-apple-darwin -analyze -mattr=+avx512f,+avx512dq | FileCheck %s --check-prefixes=CHECK,AVX512,AVX512DQ define i32 @reduce_i64(i32 %arg) { -; SSE2-LABEL: 'reduce_i64' -; SSE2-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %V1 = call i64 @llvm.experimental.vector.reduce.add.v1i64(<1 x i64> undef) -; SSE2-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %V2 = call i64 @llvm.experimental.vector.reduce.add.v2i64(<2 x i64> undef) -; SSE2-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V4 = call i64 @llvm.experimental.vector.reduce.add.v4i64(<4 x i64> undef) -; SSE2-NEXT: Cost Model: Found an estimated cost of 6 for instruction: %V8 = call i64 @llvm.experimental.vector.reduce.add.v8i64(<8 x i64> undef) -; SSE2-NEXT: Cost Model: Found an estimated cost of 10 for instruction: %V16 = call i64 @llvm.experimental.vector.reduce.add.v16i64(<16 x i64> undef) -; SSE2-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef -; -; SSSE3-LABEL: 'reduce_i64' -; SSSE3-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %V1 = call i64 @llvm.experimental.vector.reduce.add.v1i64(<1 x i64> undef) -; SSSE3-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %V2 = call i64 @llvm.experimental.vector.reduce.add.v2i64(<2 x i64> undef) -; SSSE3-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V4 = call i64 @llvm.experimental.vector.reduce.add.v4i64(<4 x i64> undef) -; SSSE3-NEXT: Cost Model: Found an estimated cost of 
6 for instruction: %V8 = call i64 @llvm.experimental.vector.reduce.add.v8i64(<8 x i64> undef) -; SSSE3-NEXT: Cost Model: Found an estimated cost of 10 for instruction: %V16 = call i64 @llvm.experimental.vector.reduce.add.v16i64(<16 x i64> undef) -; SSSE3-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef -; -; SSE42-LABEL: 'reduce_i64' -; SSE42-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %V1 = call i64 @llvm.experimental.vector.reduce.add.v1i64(<1 x i64> undef) -; SSE42-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V2 = call i64 @llvm.experimental.vector.reduce.add.v2i64(<2 x i64> undef) -; SSE42-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V4 = call i64 @llvm.experimental.vector.reduce.add.v4i64(<4 x i64> undef) -; SSE42-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %V8 = call i64 @llvm.experimental.vector.reduce.add.v8i64(<8 x i64> undef) -; SSE42-NEXT: Cost Model: Found an estimated cost of 16 for instruction: %V16 = call i64 @llvm.experimental.vector.reduce.add.v16i64(<16 x i64> undef) -; SSE42-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef +; SSE-LABEL: 'reduce_i64' +; SSE-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %V1 = call i64 @llvm.experimental.vector.reduce.add.v1i64(<1 x i64> undef) +; SSE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V2 = call i64 @llvm.experimental.vector.reduce.add.v2i64(<2 x i64> undef) +; SSE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V4 = call i64 @llvm.experimental.vector.reduce.add.v4i64(<4 x i64> undef) +; SSE-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %V8 = call i64 @llvm.experimental.vector.reduce.add.v8i64(<8 x i64> undef) +; SSE-NEXT: Cost Model: Found an estimated cost of 16 for instruction: %V16 = call i64 @llvm.experimental.vector.reduce.add.v16i64(<16 x i64> undef) +; SSE-NEXT: Cost Model: Found an 
estimated cost of 0 for instruction: ret i32 undef ; ; AVX-LABEL: 'reduce_i64' ; AVX-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %V1 = call i64 @llvm.experimental.vector.reduce.add.v1i64(<1 x i64> undef) @@ -58,29 +42,13 @@ define i32 @reduce_i64(i32 %arg) { } define i32 @reduce_i32(i32 %arg) { -; SSE2-LABEL: 'reduce_i32' -; SSE2-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %V2 = call i32 @llvm.experimental.vector.reduce.add.v2i32(<2 x i32> undef) -; SSE2-NEXT: Cost Model: Found an estimated cost of 5 for instruction: %V4 = call i32 @llvm.experimental.vector.reduce.add.v4i32(<4 x i32> undef) -; SSE2-NEXT: Cost Model: Found an estimated cost of 6 for instruction: %V8 = call i32 @llvm.experimental.vector.reduce.add.v8i32(<8 x i32> undef) -; SSE2-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %V16 = call i32 @llvm.experimental.vector.reduce.add.v16i32(<16 x i32> undef) -; SSE2-NEXT: Cost Model: Found an estimated cost of 12 for instruction: %V32 = call i32 @llvm.experimental.vector.reduce.add.v32i32(<32 x i32> undef) -; SSE2-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef -; -; SSSE3-LABEL: 'reduce_i32' -; SSSE3-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %V2 = call i32 @llvm.experimental.vector.reduce.add.v2i32(<2 x i32> undef) -; SSSE3-NEXT: Cost Model: Found an estimated cost of 5 for instruction: %V4 = call i32 @llvm.experimental.vector.reduce.add.v4i32(<4 x i32> undef) -; SSSE3-NEXT: Cost Model: Found an estimated cost of 6 for instruction: %V8 = call i32 @llvm.experimental.vector.reduce.add.v8i32(<8 x i32> undef) -; SSSE3-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %V16 = call i32 @llvm.experimental.vector.reduce.add.v16i32(<16 x i32> undef) -; SSSE3-NEXT: Cost Model: Found an estimated cost of 12 for instruction: %V32 = call i32 @llvm.experimental.vector.reduce.add.v32i32(<32 x i32> undef) -; SSSE3-NEXT: Cost Model: Found an estimated 
cost of 0 for instruction: ret i32 undef -; -; SSE42-LABEL: 'reduce_i32' -; SSE42-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V2 = call i32 @llvm.experimental.vector.reduce.add.v2i32(<2 x i32> undef) -; SSE42-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %V4 = call i32 @llvm.experimental.vector.reduce.add.v4i32(<4 x i32> undef) -; SSE42-NEXT: Cost Model: Found an estimated cost of 6 for instruction: %V8 = call i32 @llvm.experimental.vector.reduce.add.v8i32(<8 x i32> undef) -; SSE42-NEXT: Cost Model: Found an estimated cost of 12 for instruction: %V16 = call i32 @llvm.experimental.vector.reduce.add.v16i32(<16 x i32> undef) -; SSE42-NEXT: Cost Model: Found an estimated cost of 24 for instruction: %V32 = call i32 @llvm.experimental.vector.reduce.add.v32i32(<32 x i32> undef) -; SSE42-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef +; SSE-LABEL: 'reduce_i32' +; SSE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V2 = call i32 @llvm.experimental.vector.reduce.add.v2i32(<2 x i32> undef) +; SSE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %V4 = call i32 @llvm.experimental.vector.reduce.add.v4i32(<4 x i32> undef) +; SSE-NEXT: Cost Model: Found an estimated cost of 6 for instruction: %V8 = call i32 @llvm.experimental.vector.reduce.add.v8i32(<8 x i32> undef) +; SSE-NEXT: Cost Model: Found an estimated cost of 12 for instruction: %V16 = call i32 @llvm.experimental.vector.reduce.add.v16i32(<16 x i32> undef) +; SSE-NEXT: Cost Model: Found an estimated cost of 24 for instruction: %V32 = call i32 @llvm.experimental.vector.reduce.add.v32i32(<32 x i32> undef) +; SSE-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef ; ; AVX-LABEL: 'reduce_i32' ; AVX-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V2 = call i32 @llvm.experimental.vector.reduce.add.v2i32(<2 x i32> undef) @@ -107,65 +75,38 @@ define i32 @reduce_i32(i32 %arg) { } define i32 
@reduce_i16(i32 %arg) { -; SSE2-LABEL: 'reduce_i16' -; SSE2-NEXT: Cost Model: Found an estimated cost of 7 for instruction: %V2 = call i16 @llvm.experimental.vector.reduce.add.v2i16(<2 x i16> undef) -; SSE2-NEXT: Cost Model: Found an estimated cost of 13 for instruction: %V4 = call i16 @llvm.experimental.vector.reduce.add.v4i16(<4 x i16> undef) -; SSE2-NEXT: Cost Model: Found an estimated cost of 19 for instruction: %V8 = call i16 @llvm.experimental.vector.reduce.add.v8i16(<8 x i16> undef) -; SSE2-NEXT: Cost Model: Found an estimated cost of 20 for instruction: %V16 = call i16 @llvm.experimental.vector.reduce.add.v16i16(<16 x i16> undef) -; SSE2-NEXT: Cost Model: Found an estimated cost of 22 for instruction: %V32 = call i16 @llvm.experimental.vector.reduce.add.v32i16(<32 x i16> undef) -; SSE2-NEXT: Cost Model: Found an estimated cost of 26 for instruction: %V64 = call i16 @llvm.experimental.vector.reduce.add.v64i16(<64 x i16> undef) -; SSE2-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef -; -; SSSE3-LABEL: 'reduce_i16' -; SSSE3-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %V2 = call i16 @llvm.experimental.vector.reduce.add.v2i16(<2 x i16> undef) -; SSSE3-NEXT: Cost Model: Found an estimated cost of 5 for instruction: %V4 = call i16 @llvm.experimental.vector.reduce.add.v4i16(<4 x i16> undef) -; SSSE3-NEXT: Cost Model: Found an estimated cost of 7 for instruction: %V8 = call i16 @llvm.experimental.vector.reduce.add.v8i16(<8 x i16> undef) -; SSSE3-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %V16 = call i16 @llvm.experimental.vector.reduce.add.v16i16(<16 x i16> undef) -; SSSE3-NEXT: Cost Model: Found an estimated cost of 10 for instruction: %V32 = call i16 @llvm.experimental.vector.reduce.add.v32i16(<32 x i16> undef) -; SSSE3-NEXT: Cost Model: Found an estimated cost of 14 for instruction: %V64 = call i16 @llvm.experimental.vector.reduce.add.v64i16(<64 x i16> undef) -; SSSE3-NEXT: Cost Model: Found 
an estimated cost of 0 for instruction: ret i32 undef -; -; SSE42-LABEL: 'reduce_i16' -; SSE42-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V2 = call i16 @llvm.experimental.vector.reduce.add.v2i16(<2 x i16> undef) -; SSE42-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %V4 = call i16 @llvm.experimental.vector.reduce.add.v4i16(<4 x i16> undef) -; SSE42-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V8 = call i16 @llvm.experimental.vector.reduce.add.v8i16(<8 x i16> undef) -; SSE42-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %V16 = call i16 @llvm.experimental.vector.reduce.add.v16i16(<16 x i16> undef) -; SSE42-NEXT: Cost Model: Found an estimated cost of 16 for instruction: %V32 = call i16 @llvm.experimental.vector.reduce.add.v32i16(<32 x i16> undef) -; SSE42-NEXT: Cost Model: Found an estimated cost of 32 for instruction: %V64 = call i16 @llvm.experimental.vector.reduce.add.v64i16(<64 x i16> undef) -; SSE42-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef -; -; AVX1-LABEL: 'reduce_i16' -; AVX1-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V2 = call i16 @llvm.experimental.vector.reduce.add.v2i16(<2 x i16> undef) -; AVX1-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %V4 = call i16 @llvm.experimental.vector.reduce.add.v4i16(<4 x i16> undef) -; AVX1-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V8 = call i16 @llvm.experimental.vector.reduce.add.v8i16(<8 x i16> undef) -; AVX1-NEXT: Cost Model: Found an estimated cost of 49 for instruction: %V16 = call i16 @llvm.experimental.vector.reduce.add.v16i16(<16 x i16> undef) -; AVX1-NEXT: Cost Model: Found an estimated cost of 53 for instruction: %V32 = call i16 @llvm.experimental.vector.reduce.add.v32i16(<32 x i16> undef) -; AVX1-NEXT: Cost Model: Found an estimated cost of 61 for instruction: %V64 = call i16 @llvm.experimental.vector.reduce.add.v64i16(<64 x i16> undef) -; 
AVX1-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef -; -; AVX2-LABEL: 'reduce_i16' -; AVX2-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V2 = call i16 @llvm.experimental.vector.reduce.add.v2i16(<2 x i16> undef) -; AVX2-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %V4 = call i16 @llvm.experimental.vector.reduce.add.v4i16(<4 x i16> undef) -; AVX2-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V8 = call i16 @llvm.experimental.vector.reduce.add.v8i16(<8 x i16> undef) -; AVX2-NEXT: Cost Model: Found an estimated cost of 21 for instruction: %V16 = call i16 @llvm.experimental.vector.reduce.add.v16i16(<16 x i16> undef) -; AVX2-NEXT: Cost Model: Found an estimated cost of 22 for instruction: %V32 = call i16 @llvm.experimental.vector.reduce.add.v32i16(<32 x i16> undef) -; AVX2-NEXT: Cost Model: Found an estimated cost of 24 for instruction: %V64 = call i16 @llvm.experimental.vector.reduce.add.v64i16(<64 x i16> undef) -; AVX2-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef +; SSE-LABEL: 'reduce_i16' +; SSE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V2 = call i16 @llvm.experimental.vector.reduce.add.v2i16(<2 x i16> undef) +; SSE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %V4 = call i16 @llvm.experimental.vector.reduce.add.v4i16(<4 x i16> undef) +; SSE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V8 = call i16 @llvm.experimental.vector.reduce.add.v8i16(<8 x i16> undef) +; SSE-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %V16 = call i16 @llvm.experimental.vector.reduce.add.v16i16(<16 x i16> undef) +; SSE-NEXT: Cost Model: Found an estimated cost of 16 for instruction: %V32 = call i16 @llvm.experimental.vector.reduce.add.v32i16(<32 x i16> undef) +; SSE-NEXT: Cost Model: Found an estimated cost of 32 for instruction: %V64 = call i16 @llvm.experimental.vector.reduce.add.v64i16(<64 x 
i16> undef) +; SSE-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef +; +; AVX-LABEL: 'reduce_i16' +; AVX-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V2 = call i16 @llvm.experimental.vector.reduce.add.v2i16(<2 x i16> undef) +; AVX-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %V4 = call i16 @llvm.experimental.vector.reduce.add.v4i16(<4 x i16> undef) +; AVX-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V8 = call i16 @llvm.experimental.vector.reduce.add.v8i16(<8 x i16> undef) +; AVX-NEXT: Cost Model: Found an estimated cost of 5 for instruction: %V16 = call i16 @llvm.experimental.vector.reduce.add.v16i16(<16 x i16> undef) +; AVX-NEXT: Cost Model: Found an estimated cost of 10 for instruction: %V32 = call i16 @llvm.experimental.vector.reduce.add.v32i16(<32 x i16> undef) +; AVX-NEXT: Cost Model: Found an estimated cost of 20 for instruction: %V64 = call i16 @llvm.experimental.vector.reduce.add.v64i16(<64 x i16> undef) +; AVX-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef ; ; AVX512F-LABEL: 'reduce_i16' ; AVX512F-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V2 = call i16 @llvm.experimental.vector.reduce.add.v2i16(<2 x i16> undef) ; AVX512F-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %V4 = call i16 @llvm.experimental.vector.reduce.add.v4i16(<4 x i16> undef) ; AVX512F-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V8 = call i16 @llvm.experimental.vector.reduce.add.v8i16(<8 x i16> undef) -; AVX512F-NEXT: Cost Model: Found an estimated cost of 21 for instruction: %V16 = call i16 @llvm.experimental.vector.reduce.add.v16i16(<16 x i16> undef) -; AVX512F-NEXT: Cost Model: Found an estimated cost of 22 for instruction: %V32 = call i16 @llvm.experimental.vector.reduce.add.v32i16(<32 x i16> undef) -; AVX512F-NEXT: Cost Model: Found an estimated cost of 24 for instruction: %V64 = call i16 
@llvm.experimental.vector.reduce.add.v64i16(<64 x i16> undef) +; AVX512F-NEXT: Cost Model: Found an estimated cost of 5 for instruction: %V16 = call i16 @llvm.experimental.vector.reduce.add.v16i16(<16 x i16> undef) +; AVX512F-NEXT: Cost Model: Found an estimated cost of 10 for instruction: %V32 = call i16 @llvm.experimental.vector.reduce.add.v32i16(<32 x i16> undef) +; AVX512F-NEXT: Cost Model: Found an estimated cost of 20 for instruction: %V64 = call i16 @llvm.experimental.vector.reduce.add.v64i16(<64 x i16> undef) ; AVX512F-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef ; ; AVX512BW-LABEL: 'reduce_i16' ; AVX512BW-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V2 = call i16 @llvm.experimental.vector.reduce.add.v2i16(<2 x i16> undef) ; AVX512BW-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %V4 = call i16 @llvm.experimental.vector.reduce.add.v4i16(<4 x i16> undef) ; AVX512BW-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V8 = call i16 @llvm.experimental.vector.reduce.add.v8i16(<8 x i16> undef) -; AVX512BW-NEXT: Cost Model: Found an estimated cost of 9 for instruction: %V16 = call i16 @llvm.experimental.vector.reduce.add.v16i16(<16 x i16> undef) +; AVX512BW-NEXT: Cost Model: Found an estimated cost of 5 for instruction: %V16 = call i16 @llvm.experimental.vector.reduce.add.v16i16(<16 x i16> undef) ; AVX512BW-NEXT: Cost Model: Found an estimated cost of 11 for instruction: %V32 = call i16 @llvm.experimental.vector.reduce.add.v32i16(<32 x i16> undef) ; AVX512BW-NEXT: Cost Model: Found an estimated cost of 12 for instruction: %V64 = call i16 @llvm.experimental.vector.reduce.add.v64i16(<64 x i16> undef) ; AVX512BW-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef @@ -174,9 +115,9 @@ define i32 @reduce_i16(i32 %arg) { ; AVX512DQ-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V2 = call i16 @llvm.experimental.vector.reduce.add.v2i16(<2 x i16> 
undef) ; AVX512DQ-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %V4 = call i16 @llvm.experimental.vector.reduce.add.v4i16(<4 x i16> undef) ; AVX512DQ-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V8 = call i16 @llvm.experimental.vector.reduce.add.v8i16(<8 x i16> undef) -; AVX512DQ-NEXT: Cost Model: Found an estimated cost of 21 for instruction: %V16 = call i16 @llvm.experimental.vector.reduce.add.v16i16(<16 x i16> undef) -; AVX512DQ-NEXT: Cost Model: Found an estimated cost of 22 for instruction: %V32 = call i16 @llvm.experimental.vector.reduce.add.v32i16(<32 x i16> undef) -; AVX512DQ-NEXT: Cost Model: Found an estimated cost of 24 for instruction: %V64 = call i16 @llvm.experimental.vector.reduce.add.v64i16(<64 x i16> undef) +; AVX512DQ-NEXT: Cost Model: Found an estimated cost of 5 for instruction: %V16 = call i16 @llvm.experimental.vector.reduce.add.v16i16(<16 x i16> undef) +; AVX512DQ-NEXT: Cost Model: Found an estimated cost of 10 for instruction: %V32 = call i16 @llvm.experimental.vector.reduce.add.v32i16(<32 x i16> undef) +; AVX512DQ-NEXT: Cost Model: Found an estimated cost of 20 for instruction: %V64 = call i16 @llvm.experimental.vector.reduce.add.v64i16(<64 x i16> undef) ; AVX512DQ-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef ; %V2 = call i16 @llvm.experimental.vector.reduce.add.v2i16(<2 x i16> undef) @@ -189,84 +130,54 @@ define i32 @reduce_i16(i32 %arg) { } define i32 @reduce_i8(i32 %arg) { -; SSE2-LABEL: 'reduce_i8' -; SSE2-NEXT: Cost Model: Found an estimated cost of 12 for instruction: %V2 = call i8 @llvm.experimental.vector.reduce.add.v2i8(<2 x i8> undef) -; SSE2-NEXT: Cost Model: Found an estimated cost of 23 for instruction: %V4 = call i8 @llvm.experimental.vector.reduce.add.v4i8(<4 x i8> undef) -; SSE2-NEXT: Cost Model: Found an estimated cost of 34 for instruction: %V8 = call i8 @llvm.experimental.vector.reduce.add.v8i8(<8 x i8> undef) -; SSE2-NEXT: Cost Model: Found an 
estimated cost of 45 for instruction: %V16 = call i8 @llvm.experimental.vector.reduce.add.v16i8(<16 x i8> undef) -; SSE2-NEXT: Cost Model: Found an estimated cost of 46 for instruction: %V32 = call i8 @llvm.experimental.vector.reduce.add.v32i8(<32 x i8> undef) -; SSE2-NEXT: Cost Model: Found an estimated cost of 48 for instruction: %V64 = call i8 @llvm.experimental.vector.reduce.add.v64i8(<64 x i8> undef) -; SSE2-NEXT: Cost Model: Found an estimated cost of 52 for instruction: %V128 = call i8 @llvm.experimental.vector.reduce.add.v128i8(<128 x i8> undef) -; SSE2-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef -; -; SSSE3-LABEL: 'reduce_i8' -; SSSE3-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %V2 = call i8 @llvm.experimental.vector.reduce.add.v2i8(<2 x i8> undef) -; SSSE3-NEXT: Cost Model: Found an estimated cost of 5 for instruction: %V4 = call i8 @llvm.experimental.vector.reduce.add.v4i8(<4 x i8> undef) -; SSSE3-NEXT: Cost Model: Found an estimated cost of 7 for instruction: %V8 = call i8 @llvm.experimental.vector.reduce.add.v8i8(<8 x i8> undef) -; SSSE3-NEXT: Cost Model: Found an estimated cost of 9 for instruction: %V16 = call i8 @llvm.experimental.vector.reduce.add.v16i8(<16 x i8> undef) -; SSSE3-NEXT: Cost Model: Found an estimated cost of 10 for instruction: %V32 = call i8 @llvm.experimental.vector.reduce.add.v32i8(<32 x i8> undef) -; SSSE3-NEXT: Cost Model: Found an estimated cost of 12 for instruction: %V64 = call i8 @llvm.experimental.vector.reduce.add.v64i8(<64 x i8> undef) -; SSSE3-NEXT: Cost Model: Found an estimated cost of 16 for instruction: %V128 = call i8 @llvm.experimental.vector.reduce.add.v128i8(<128 x i8> undef) -; SSSE3-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef -; -; SSE42-LABEL: 'reduce_i8' -; SSE42-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %V2 = call i8 @llvm.experimental.vector.reduce.add.v2i8(<2 x i8> undef) -; SSE42-NEXT: Cost Model: 
Found an estimated cost of 5 for instruction: %V4 = call i8 @llvm.experimental.vector.reduce.add.v4i8(<4 x i8> undef) -; SSE42-NEXT: Cost Model: Found an estimated cost of 7 for instruction: %V8 = call i8 @llvm.experimental.vector.reduce.add.v8i8(<8 x i8> undef) -; SSE42-NEXT: Cost Model: Found an estimated cost of 9 for instruction: %V16 = call i8 @llvm.experimental.vector.reduce.add.v16i8(<16 x i8> undef) -; SSE42-NEXT: Cost Model: Found an estimated cost of 10 for instruction: %V32 = call i8 @llvm.experimental.vector.reduce.add.v32i8(<32 x i8> undef) -; SSE42-NEXT: Cost Model: Found an estimated cost of 12 for instruction: %V64 = call i8 @llvm.experimental.vector.reduce.add.v64i8(<64 x i8> undef) -; SSE42-NEXT: Cost Model: Found an estimated cost of 16 for instruction: %V128 = call i8 @llvm.experimental.vector.reduce.add.v128i8(<128 x i8> undef) -; SSE42-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef -; -; AVX1-LABEL: 'reduce_i8' -; AVX1-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %V2 = call i8 @llvm.experimental.vector.reduce.add.v2i8(<2 x i8> undef) -; AVX1-NEXT: Cost Model: Found an estimated cost of 5 for instruction: %V4 = call i8 @llvm.experimental.vector.reduce.add.v4i8(<4 x i8> undef) -; AVX1-NEXT: Cost Model: Found an estimated cost of 7 for instruction: %V8 = call i8 @llvm.experimental.vector.reduce.add.v8i8(<8 x i8> undef) -; AVX1-NEXT: Cost Model: Found an estimated cost of 9 for instruction: %V16 = call i8 @llvm.experimental.vector.reduce.add.v16i8(<16 x i8> undef) -; AVX1-NEXT: Cost Model: Found an estimated cost of 61 for instruction: %V32 = call i8 @llvm.experimental.vector.reduce.add.v32i8(<32 x i8> undef) -; AVX1-NEXT: Cost Model: Found an estimated cost of 65 for instruction: %V64 = call i8 @llvm.experimental.vector.reduce.add.v64i8(<64 x i8> undef) -; AVX1-NEXT: Cost Model: Found an estimated cost of 73 for instruction: %V128 = call i8 @llvm.experimental.vector.reduce.add.v128i8(<128 x i8> 
undef) -; AVX1-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef -; -; AVX2-LABEL: 'reduce_i8' -; AVX2-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %V2 = call i8 @llvm.experimental.vector.reduce.add.v2i8(<2 x i8> undef) -; AVX2-NEXT: Cost Model: Found an estimated cost of 5 for instruction: %V4 = call i8 @llvm.experimental.vector.reduce.add.v4i8(<4 x i8> undef) -; AVX2-NEXT: Cost Model: Found an estimated cost of 7 for instruction: %V8 = call i8 @llvm.experimental.vector.reduce.add.v8i8(<8 x i8> undef) -; AVX2-NEXT: Cost Model: Found an estimated cost of 9 for instruction: %V16 = call i8 @llvm.experimental.vector.reduce.add.v16i8(<16 x i8> undef) -; AVX2-NEXT: Cost Model: Found an estimated cost of 26 for instruction: %V32 = call i8 @llvm.experimental.vector.reduce.add.v32i8(<32 x i8> undef) -; AVX2-NEXT: Cost Model: Found an estimated cost of 27 for instruction: %V64 = call i8 @llvm.experimental.vector.reduce.add.v64i8(<64 x i8> undef) -; AVX2-NEXT: Cost Model: Found an estimated cost of 29 for instruction: %V128 = call i8 @llvm.experimental.vector.reduce.add.v128i8(<128 x i8> undef) -; AVX2-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef +; SSE-LABEL: 'reduce_i8' +; SSE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V2 = call i8 @llvm.experimental.vector.reduce.add.v2i8(<2 x i8> undef) +; SSE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V4 = call i8 @llvm.experimental.vector.reduce.add.v4i8(<4 x i8> undef) +; SSE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V8 = call i8 @llvm.experimental.vector.reduce.add.v8i8(<8 x i8> undef) +; SSE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %V16 = call i8 @llvm.experimental.vector.reduce.add.v16i8(<16 x i8> undef) +; SSE-NEXT: Cost Model: Found an estimated cost of 6 for instruction: %V32 = call i8 @llvm.experimental.vector.reduce.add.v32i8(<32 x i8> undef) +; SSE-NEXT: Cost 
Model: Found an estimated cost of 12 for instruction: %V64 = call i8 @llvm.experimental.vector.reduce.add.v64i8(<64 x i8> undef) +; SSE-NEXT: Cost Model: Found an estimated cost of 24 for instruction: %V128 = call i8 @llvm.experimental.vector.reduce.add.v128i8(<128 x i8> undef) +; SSE-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef +; +; AVX-LABEL: 'reduce_i8' +; AVX-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V2 = call i8 @llvm.experimental.vector.reduce.add.v2i8(<2 x i8> undef) +; AVX-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V4 = call i8 @llvm.experimental.vector.reduce.add.v4i8(<4 x i8> undef) +; AVX-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V8 = call i8 @llvm.experimental.vector.reduce.add.v8i8(<8 x i8> undef) +; AVX-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %V16 = call i8 @llvm.experimental.vector.reduce.add.v16i8(<16 x i8> undef) +; AVX-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V32 = call i8 @llvm.experimental.vector.reduce.add.v32i8(<32 x i8> undef) +; AVX-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %V64 = call i8 @llvm.experimental.vector.reduce.add.v64i8(<64 x i8> undef) +; AVX-NEXT: Cost Model: Found an estimated cost of 16 for instruction: %V128 = call i8 @llvm.experimental.vector.reduce.add.v128i8(<128 x i8> undef) +; AVX-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef ; ; AVX512F-LABEL: 'reduce_i8' -; AVX512F-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %V2 = call i8 @llvm.experimental.vector.reduce.add.v2i8(<2 x i8> undef) -; AVX512F-NEXT: Cost Model: Found an estimated cost of 5 for instruction: %V4 = call i8 @llvm.experimental.vector.reduce.add.v4i8(<4 x i8> undef) -; AVX512F-NEXT: Cost Model: Found an estimated cost of 7 for instruction: %V8 = call i8 @llvm.experimental.vector.reduce.add.v8i8(<8 x i8> undef) -; AVX512F-NEXT: Cost Model: 
Found an estimated cost of 9 for instruction: %V16 = call i8 @llvm.experimental.vector.reduce.add.v16i8(<16 x i8> undef) -; AVX512F-NEXT: Cost Model: Found an estimated cost of 26 for instruction: %V32 = call i8 @llvm.experimental.vector.reduce.add.v32i8(<32 x i8> undef) -; AVX512F-NEXT: Cost Model: Found an estimated cost of 27 for instruction: %V64 = call i8 @llvm.experimental.vector.reduce.add.v64i8(<64 x i8> undef) -; AVX512F-NEXT: Cost Model: Found an estimated cost of 29 for instruction: %V128 = call i8 @llvm.experimental.vector.reduce.add.v128i8(<128 x i8> undef) +; AVX512F-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V2 = call i8 @llvm.experimental.vector.reduce.add.v2i8(<2 x i8> undef) +; AVX512F-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V4 = call i8 @llvm.experimental.vector.reduce.add.v4i8(<4 x i8> undef) +; AVX512F-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V8 = call i8 @llvm.experimental.vector.reduce.add.v8i8(<8 x i8> undef) +; AVX512F-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %V16 = call i8 @llvm.experimental.vector.reduce.add.v16i8(<16 x i8> undef) +; AVX512F-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V32 = call i8 @llvm.experimental.vector.reduce.add.v32i8(<32 x i8> undef) +; AVX512F-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %V64 = call i8 @llvm.experimental.vector.reduce.add.v64i8(<64 x i8> undef) +; AVX512F-NEXT: Cost Model: Found an estimated cost of 16 for instruction: %V128 = call i8 @llvm.experimental.vector.reduce.add.v128i8(<128 x i8> undef) ; AVX512F-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef ; ; AVX512BW-LABEL: 'reduce_i8' -; AVX512BW-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %V2 = call i8 @llvm.experimental.vector.reduce.add.v2i8(<2 x i8> undef) -; AVX512BW-NEXT: Cost Model: Found an estimated cost of 5 for instruction: %V4 = call i8 
@llvm.experimental.vector.reduce.add.v4i8(<4 x i8> undef) -; AVX512BW-NEXT: Cost Model: Found an estimated cost of 7 for instruction: %V8 = call i8 @llvm.experimental.vector.reduce.add.v8i8(<8 x i8> undef) -; AVX512BW-NEXT: Cost Model: Found an estimated cost of 9 for instruction: %V16 = call i8 @llvm.experimental.vector.reduce.add.v16i8(<16 x i8> undef) -; AVX512BW-NEXT: Cost Model: Found an estimated cost of 21 for instruction: %V32 = call i8 @llvm.experimental.vector.reduce.add.v32i8(<32 x i8> undef) +; AVX512BW-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V2 = call i8 @llvm.experimental.vector.reduce.add.v2i8(<2 x i8> undef) +; AVX512BW-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V4 = call i8 @llvm.experimental.vector.reduce.add.v4i8(<4 x i8> undef) +; AVX512BW-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V8 = call i8 @llvm.experimental.vector.reduce.add.v8i8(<8 x i8> undef) +; AVX512BW-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %V16 = call i8 @llvm.experimental.vector.reduce.add.v16i8(<16 x i8> undef) +; AVX512BW-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V32 = call i8 @llvm.experimental.vector.reduce.add.v32i8(<32 x i8> undef) ; AVX512BW-NEXT: Cost Model: Found an estimated cost of 55 for instruction: %V64 = call i8 @llvm.experimental.vector.reduce.add.v64i8(<64 x i8> undef) ; AVX512BW-NEXT: Cost Model: Found an estimated cost of 56 for instruction: %V128 = call i8 @llvm.experimental.vector.reduce.add.v128i8(<128 x i8> undef) ; AVX512BW-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef ; ; AVX512DQ-LABEL: 'reduce_i8' -; AVX512DQ-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %V2 = call i8 @llvm.experimental.vector.reduce.add.v2i8(<2 x i8> undef) -; AVX512DQ-NEXT: Cost Model: Found an estimated cost of 5 for instruction: %V4 = call i8 @llvm.experimental.vector.reduce.add.v4i8(<4 x i8> undef) -; AVX512DQ-NEXT: Cost 
Model: Found an estimated cost of 7 for instruction: %V8 = call i8 @llvm.experimental.vector.reduce.add.v8i8(<8 x i8> undef) -; AVX512DQ-NEXT: Cost Model: Found an estimated cost of 9 for instruction: %V16 = call i8 @llvm.experimental.vector.reduce.add.v16i8(<16 x i8> undef) -; AVX512DQ-NEXT: Cost Model: Found an estimated cost of 26 for instruction: %V32 = call i8 @llvm.experimental.vector.reduce.add.v32i8(<32 x i8> undef) -; AVX512DQ-NEXT: Cost Model: Found an estimated cost of 27 for instruction: %V64 = call i8 @llvm.experimental.vector.reduce.add.v64i8(<64 x i8> undef) -; AVX512DQ-NEXT: Cost Model: Found an estimated cost of 29 for instruction: %V128 = call i8 @llvm.experimental.vector.reduce.add.v128i8(<128 x i8> undef) +; AVX512DQ-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V2 = call i8 @llvm.experimental.vector.reduce.add.v2i8(<2 x i8> undef) +; AVX512DQ-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V4 = call i8 @llvm.experimental.vector.reduce.add.v4i8(<4 x i8> undef) +; AVX512DQ-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V8 = call i8 @llvm.experimental.vector.reduce.add.v8i8(<8 x i8> undef) +; AVX512DQ-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %V16 = call i8 @llvm.experimental.vector.reduce.add.v16i8(<16 x i8> undef) +; AVX512DQ-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V32 = call i8 @llvm.experimental.vector.reduce.add.v32i8(<32 x i8> undef) +; AVX512DQ-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %V64 = call i8 @llvm.experimental.vector.reduce.add.v64i8(<64 x i8> undef) +; AVX512DQ-NEXT: Cost Model: Found an estimated cost of 16 for instruction: %V128 = call i8 @llvm.experimental.vector.reduce.add.v128i8(<128 x i8> undef) ; AVX512DQ-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef ; %V2 = call i8 @llvm.experimental.vector.reduce.add.v2i8(<2 x i8> undef) Modified: 
llvm/trunk/test/Analysis/CostModel/X86/reduction.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Analysis/CostModel/X86/reduction.ll?rev=374655&r1=374654&r2=374655&view=diff ============================================================================== --- llvm/trunk/test/Analysis/CostModel/X86/reduction.ll (original) +++ llvm/trunk/test/Analysis/CostModel/X86/reduction.ll Sat Oct 12 06:21:50 2019 @@ -15,7 +15,7 @@ define fastcc float @reduction_cost_floa ; SSE2-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %bin.rdx = fadd <4 x float> %rdx, %rdx.shuf ; SSE2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %rdx.shuf7 = shufflevector <4 x float> %bin.rdx, <4 x float> undef, <4 x i32> ; SSE2-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %bin.rdx8 = fadd <4 x float> %bin.rdx, %rdx.shuf7 -; SSE2-NEXT: Cost Model: Found an estimated cost of 6 for instruction: %r = extractelement <4 x float> %bin.rdx8, i32 0 +; SSE2-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %r = extractelement <4 x float> %bin.rdx8, i32 0 ; SSE2-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret float %r ; ; SSSE3-LABEL: 'reduction_cost_float' @@ -23,7 +23,7 @@ define fastcc float @reduction_cost_floa ; SSSE3-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %bin.rdx = fadd <4 x float> %rdx, %rdx.shuf ; SSSE3-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %rdx.shuf7 = shufflevector <4 x float> %bin.rdx, <4 x float> undef, <4 x i32> ; SSSE3-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %bin.rdx8 = fadd <4 x float> %bin.rdx, %rdx.shuf7 -; SSSE3-NEXT: Cost Model: Found an estimated cost of 6 for instruction: %r = extractelement <4 x float> %bin.rdx8, i32 0 +; SSSE3-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %r = extractelement <4 x float> %bin.rdx8, i32 0 ; SSSE3-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret float %r 
; ; SSE42-LABEL: 'reduction_cost_float' @@ -107,7 +107,7 @@ define fastcc float @pairwise_hadd(<4 x ; SSE2-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %rdx.shuf.1.0 = shufflevector <4 x float> %bin.rdx.0, <4 x float> undef, <4 x i32> ; SSE2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %rdx.shuf.1.1 = shufflevector <4 x float> %bin.rdx.0, <4 x float> undef, <4 x i32> ; SSE2-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %bin.rdx.1 = fadd <4 x float> %rdx.shuf.1.0, %rdx.shuf.1.1 -; SSE2-NEXT: Cost Model: Found an estimated cost of 7 for instruction: %r = extractelement <4 x float> %bin.rdx.1, i32 0 +; SSE2-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %r = extractelement <4 x float> %bin.rdx.1, i32 0 ; SSE2-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %r2 = fadd float %r, %f1 ; SSE2-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret float %r2 ; @@ -118,7 +118,7 @@ define fastcc float @pairwise_hadd(<4 x ; SSSE3-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %rdx.shuf.1.0 = shufflevector <4 x float> %bin.rdx.0, <4 x float> undef, <4 x i32> ; SSSE3-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %rdx.shuf.1.1 = shufflevector <4 x float> %bin.rdx.0, <4 x float> undef, <4 x i32> ; SSSE3-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %bin.rdx.1 = fadd <4 x float> %rdx.shuf.1.0, %rdx.shuf.1.1 -; SSSE3-NEXT: Cost Model: Found an estimated cost of 7 for instruction: %r = extractelement <4 x float> %bin.rdx.1, i32 0 +; SSSE3-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %r = extractelement <4 x float> %bin.rdx.1, i32 0 ; SSSE3-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %r2 = fadd float %r, %f1 ; SSSE3-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret float %r2 ; @@ -168,7 +168,7 @@ define fastcc float @pairwise_hadd_assoc ; SSE2-NEXT: Cost Model: Found an 
estimated cost of 0 for instruction: %rdx.shuf.1.0 = shufflevector <4 x float> %bin.rdx.0, <4 x float> undef, <4 x i32> ; SSE2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %rdx.shuf.1.1 = shufflevector <4 x float> %bin.rdx.0, <4 x float> undef, <4 x i32> ; SSE2-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %bin.rdx.1 = fadd <4 x float> %rdx.shuf.1.0, %rdx.shuf.1.1 -; SSE2-NEXT: Cost Model: Found an estimated cost of 7 for instruction: %r = extractelement <4 x float> %bin.rdx.1, i32 0 +; SSE2-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %r = extractelement <4 x float> %bin.rdx.1, i32 0 ; SSE2-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %r2 = fadd float %r, %f1 ; SSE2-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret float %r2 ; @@ -179,7 +179,7 @@ define fastcc float @pairwise_hadd_assoc ; SSSE3-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %rdx.shuf.1.0 = shufflevector <4 x float> %bin.rdx.0, <4 x float> undef, <4 x i32> ; SSSE3-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %rdx.shuf.1.1 = shufflevector <4 x float> %bin.rdx.0, <4 x float> undef, <4 x i32> ; SSSE3-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %bin.rdx.1 = fadd <4 x float> %rdx.shuf.1.0, %rdx.shuf.1.1 -; SSSE3-NEXT: Cost Model: Found an estimated cost of 7 for instruction: %r = extractelement <4 x float> %bin.rdx.1, i32 0 +; SSSE3-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %r = extractelement <4 x float> %bin.rdx.1, i32 0 ; SSSE3-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %r2 = fadd float %r, %f1 ; SSSE3-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret float %r2 ; @@ -228,7 +228,7 @@ define fastcc float @pairwise_hadd_skip_ ; SSE2-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %bin.rdx.0 = fadd <4 x float> %rdx.shuf.0.0, %rdx.shuf.0.1 ; SSE2-NEXT: Cost Model: Found an estimated 
cost of 1 for instruction: %rdx.shuf.1.1 = shufflevector <4 x float> %bin.rdx.0, <4 x float> undef, <4 x i32> ; SSE2-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %bin.rdx.1 = fadd <4 x float> %bin.rdx.0, %rdx.shuf.1.1 -; SSE2-NEXT: Cost Model: Found an estimated cost of 7 for instruction: %r = extractelement <4 x float> %bin.rdx.1, i32 0 +; SSE2-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %r = extractelement <4 x float> %bin.rdx.1, i32 0 ; SSE2-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %r2 = fadd float %r, %f1 ; SSE2-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret float %r2 ; @@ -238,7 +238,7 @@ define fastcc float @pairwise_hadd_skip_ ; SSSE3-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %bin.rdx.0 = fadd <4 x float> %rdx.shuf.0.0, %rdx.shuf.0.1 ; SSSE3-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %rdx.shuf.1.1 = shufflevector <4 x float> %bin.rdx.0, <4 x float> undef, <4 x i32> ; SSSE3-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %bin.rdx.1 = fadd <4 x float> %bin.rdx.0, %rdx.shuf.1.1 -; SSSE3-NEXT: Cost Model: Found an estimated cost of 7 for instruction: %r = extractelement <4 x float> %bin.rdx.1, i32 0 +; SSSE3-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %r = extractelement <4 x float> %bin.rdx.1, i32 0 ; SSSE3-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %r2 = fadd float %r, %f1 ; SSSE3-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret float %r2 ; @@ -280,13 +280,13 @@ define fastcc double @no_pairwise_reduct ; SSE2-LABEL: 'no_pairwise_reduction2double' ; SSE2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %rdx.shuf = shufflevector <2 x double> %rdx, <2 x double> undef, <2 x i32> ; SSE2-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %bin.rdx = fadd <2 x double> %rdx, %rdx.shuf -; SSE2-NEXT: Cost Model: Found an estimated cost of 3 for 
instruction: %r = extractelement <2 x double> %bin.rdx, i32 0 +; SSE2-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %r = extractelement <2 x double> %bin.rdx, i32 0 ; SSE2-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret double %r ; ; SSSE3-LABEL: 'no_pairwise_reduction2double' ; SSSE3-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %rdx.shuf = shufflevector <2 x double> %rdx, <2 x double> undef, <2 x i32> ; SSSE3-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %bin.rdx = fadd <2 x double> %rdx, %rdx.shuf -; SSSE3-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %r = extractelement <2 x double> %bin.rdx, i32 0 +; SSSE3-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %r = extractelement <2 x double> %bin.rdx, i32 0 ; SSSE3-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret double %r ; ; SSE42-LABEL: 'no_pairwise_reduction2double' @@ -314,7 +314,7 @@ define fastcc float @no_pairwise_reducti ; SSE2-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %bin.rdx = fadd <4 x float> %rdx, %rdx.shuf ; SSE2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %rdx.shuf7 = shufflevector <4 x float> %bin.rdx, <4 x float> undef, <4 x i32> ; SSE2-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %bin.rdx8 = fadd <4 x float> %bin.rdx, %rdx.shuf7 -; SSE2-NEXT: Cost Model: Found an estimated cost of 6 for instruction: %r = extractelement <4 x float> %bin.rdx8, i32 0 +; SSE2-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %r = extractelement <4 x float> %bin.rdx8, i32 0 ; SSE2-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret float %r ; ; SSSE3-LABEL: 'no_pairwise_reduction4float' @@ -322,7 +322,7 @@ define fastcc float @no_pairwise_reducti ; SSSE3-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %bin.rdx = fadd <4 x float> %rdx, %rdx.shuf ; SSSE3-NEXT: Cost Model: Found an estimated 
cost of 1 for instruction: %rdx.shuf7 = shufflevector <4 x float> %bin.rdx, <4 x float> undef, <4 x i32> ; SSSE3-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %bin.rdx8 = fadd <4 x float> %bin.rdx, %rdx.shuf7 -; SSSE3-NEXT: Cost Model: Found an estimated cost of 6 for instruction: %r = extractelement <4 x float> %bin.rdx8, i32 0 +; SSSE3-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %r = extractelement <4 x float> %bin.rdx8, i32 0 ; SSSE3-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret float %r ; ; SSE42-LABEL: 'no_pairwise_reduction4float' @@ -356,7 +356,7 @@ define fastcc double @no_pairwise_reduct ; SSE2-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %bin.rdx = fadd <4 x double> %rdx, %rdx.shuf ; SSE2-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %rdx.shuf7 = shufflevector <4 x double> %bin.rdx, <4 x double> undef, <4 x i32> ; SSE2-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %bin.rdx8 = fadd <4 x double> %bin.rdx, %rdx.shuf7 -; SSE2-NEXT: Cost Model: Found an estimated cost of 5 for instruction: %r = extractelement <4 x double> %bin.rdx8, i32 0 +; SSE2-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %r = extractelement <4 x double> %bin.rdx8, i32 0 ; SSE2-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret double %r ; ; SSSE3-LABEL: 'no_pairwise_reduction4double' @@ -364,7 +364,7 @@ define fastcc double @no_pairwise_reduct ; SSSE3-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %bin.rdx = fadd <4 x double> %rdx, %rdx.shuf ; SSSE3-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %rdx.shuf7 = shufflevector <4 x double> %bin.rdx, <4 x double> undef, <4 x i32> ; SSSE3-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %bin.rdx8 = fadd <4 x double> %bin.rdx, %rdx.shuf7 -; SSSE3-NEXT: Cost Model: Found an estimated cost of 5 for instruction: %r = extractelement <4 x double> %bin.rdx8, 
i32 0 +; SSSE3-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %r = extractelement <4 x double> %bin.rdx8, i32 0 ; SSSE3-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret double %r ; ; SSE42-LABEL: 'no_pairwise_reduction4double' @@ -463,23 +463,11 @@ define fastcc float @no_pairwise_reducti } define fastcc i64 @no_pairwise_reduction2i64(<2 x i64> %rdx, i64 %f1) { -; SSE2-LABEL: 'no_pairwise_reduction2i64' -; SSE2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %rdx.shuf = shufflevector <2 x i64> %rdx, <2 x i64> undef, <2 x i32> -; SSE2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %bin.rdx = add <2 x i64> %rdx, %rdx.shuf -; SSE2-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %r = extractelement <2 x i64> %bin.rdx, i32 0 -; SSE2-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i64 %r -; -; SSSE3-LABEL: 'no_pairwise_reduction2i64' -; SSSE3-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %rdx.shuf = shufflevector <2 x i64> %rdx, <2 x i64> undef, <2 x i32> -; SSSE3-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %bin.rdx = add <2 x i64> %rdx, %rdx.shuf -; SSSE3-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %r = extractelement <2 x i64> %bin.rdx, i32 0 -; SSSE3-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i64 %r -; -; SSE42-LABEL: 'no_pairwise_reduction2i64' -; SSE42-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %rdx.shuf = shufflevector <2 x i64> %rdx, <2 x i64> undef, <2 x i32> -; SSE42-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %bin.rdx = add <2 x i64> %rdx, %rdx.shuf -; SSE42-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %r = extractelement <2 x i64> %bin.rdx, i32 0 -; SSE42-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i64 %r +; SSE-LABEL: 'no_pairwise_reduction2i64' +; SSE-NEXT: Cost Model: Found an estimated 
cost of 1 for instruction: %rdx.shuf = shufflevector <2 x i64> %rdx, <2 x i64> undef, <2 x i32> +; SSE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %bin.rdx = add <2 x i64> %rdx, %rdx.shuf +; SSE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %r = extractelement <2 x i64> %bin.rdx, i32 0 +; SSE-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i64 %r ; ; AVX-LABEL: 'no_pairwise_reduction2i64' ; AVX-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %rdx.shuf = shufflevector <2 x i64> %rdx, <2 x i64> undef, <2 x i32> @@ -495,37 +483,13 @@ define fastcc i64 @no_pairwise_reduction } define fastcc i32 @no_pairwise_reduction4i32(<4 x i32> %rdx, i32 %f1) { -; SSE2-LABEL: 'no_pairwise_reduction4i32' -; SSE2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %rdx.shuf = shufflevector <4 x i32> %rdx, <4 x i32> undef, <4 x i32> -; SSE2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %bin.rdx = add <4 x i32> %rdx, %rdx.shuf -; SSE2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %rdx.shuf7 = shufflevector <4 x i32> %bin.rdx, <4 x i32> undef, <4 x i32> -; SSE2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %bin.rdx8 = add <4 x i32> %bin.rdx, %rdx.shuf7 -; SSE2-NEXT: Cost Model: Found an estimated cost of 5 for instruction: %r = extractelement <4 x i32> %bin.rdx8, i32 0 -; SSE2-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 %r -; -; SSSE3-LABEL: 'no_pairwise_reduction4i32' -; SSSE3-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %rdx.shuf = shufflevector <4 x i32> %rdx, <4 x i32> undef, <4 x i32> -; SSSE3-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %bin.rdx = add <4 x i32> %rdx, %rdx.shuf -; SSSE3-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %rdx.shuf7 = shufflevector <4 x i32> %bin.rdx, <4 x i32> undef, <4 x i32> -; SSSE3-NEXT: Cost Model: Found an estimated cost of 1 for 
instruction: %bin.rdx8 = add <4 x i32> %bin.rdx, %rdx.shuf7 -; SSSE3-NEXT: Cost Model: Found an estimated cost of 5 for instruction: %r = extractelement <4 x i32> %bin.rdx8, i32 0 -; SSSE3-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 %r -; -; SSE42-LABEL: 'no_pairwise_reduction4i32' -; SSE42-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %rdx.shuf = shufflevector <4 x i32> %rdx, <4 x i32> undef, <4 x i32> -; SSE42-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %bin.rdx = add <4 x i32> %rdx, %rdx.shuf -; SSE42-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %rdx.shuf7 = shufflevector <4 x i32> %bin.rdx, <4 x i32> undef, <4 x i32> -; SSE42-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %bin.rdx8 = add <4 x i32> %bin.rdx, %rdx.shuf7 -; SSE42-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %r = extractelement <4 x i32> %bin.rdx8, i32 0 -; SSE42-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 %r -; -; AVX-LABEL: 'no_pairwise_reduction4i32' -; AVX-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %rdx.shuf = shufflevector <4 x i32> %rdx, <4 x i32> undef, <4 x i32> -; AVX-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %bin.rdx = add <4 x i32> %rdx, %rdx.shuf -; AVX-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %rdx.shuf7 = shufflevector <4 x i32> %bin.rdx, <4 x i32> undef, <4 x i32> -; AVX-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %bin.rdx8 = add <4 x i32> %bin.rdx, %rdx.shuf7 -; AVX-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %r = extractelement <4 x i32> %bin.rdx8, i32 0 -; AVX-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 %r +; CHECK-LABEL: 'no_pairwise_reduction4i32' +; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %rdx.shuf = shufflevector <4 x i32> %rdx, <4 x i32> undef, <4 x i32> +; CHECK-NEXT: 
Cost Model: Found an estimated cost of 1 for instruction: %bin.rdx = add <4 x i32> %rdx, %rdx.shuf +; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %rdx.shuf7 = shufflevector <4 x i32> %bin.rdx, <4 x i32> undef, <4 x i32> +; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %bin.rdx8 = add <4 x i32> %bin.rdx, %rdx.shuf7 +; CHECK-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %r = extractelement <4 x i32> %bin.rdx8, i32 0 +; CHECK-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 %r ; %rdx.shuf = shufflevector <4 x i32> %rdx, <4 x i32> undef, <4 x i32> %bin.rdx = add <4 x i32> %rdx, %rdx.shuf @@ -578,7 +542,7 @@ define fastcc i16 @no_pairwise_reduction ; SSE2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %bin.rdx = add <8 x i16> %bin.rdx4, %rdx.shuf ; SSE2-NEXT: Cost Model: Found an estimated cost of 5 for instruction: %rdx.shuf7 = shufflevector <8 x i16> %bin.rdx, <8 x i16> undef, <8 x i32> ; SSE2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %bin.rdx8 = add <8 x i16> %bin.rdx, %rdx.shuf7 -; SSE2-NEXT: Cost Model: Found an estimated cost of 19 for instruction: %r = extractelement <8 x i16> %bin.rdx8, i32 0 +; SSE2-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %r = extractelement <8 x i16> %bin.rdx8, i32 0 ; SSE2-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i16 %r ; ; SSSE3-LABEL: 'no_pairwise_reduction8i16' @@ -588,7 +552,7 @@ define fastcc i16 @no_pairwise_reduction ; SSSE3-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %bin.rdx = add <8 x i16> %bin.rdx4, %rdx.shuf ; SSSE3-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %rdx.shuf7 = shufflevector <8 x i16> %bin.rdx, <8 x i16> undef, <8 x i32> ; SSSE3-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %bin.rdx8 = add <8 x i16> %bin.rdx, %rdx.shuf7 -; SSSE3-NEXT: Cost Model: Found an estimated cost of 7 for 
instruction: %r = extractelement <8 x i16> %bin.rdx8, i32 0 +; SSSE3-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %r = extractelement <8 x i16> %bin.rdx8, i32 0 ; SSSE3-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i16 %r ; ; SSE42-LABEL: 'no_pairwise_reduction8i16' @@ -669,14 +633,14 @@ define fastcc double @pairwise_reduction ; SSE2-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %rdx.shuf.1.0 = shufflevector <2 x double> %rdx, <2 x double> undef, <2 x i32> ; SSE2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %rdx.shuf.1.1 = shufflevector <2 x double> %rdx, <2 x double> undef, <2 x i32> ; SSE2-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %bin.rdx8 = fadd <2 x double> %rdx.shuf.1.0, %rdx.shuf.1.1 -; SSE2-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %r = extractelement <2 x double> %bin.rdx8, i32 0 +; SSE2-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %r = extractelement <2 x double> %bin.rdx8, i32 0 ; SSE2-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret double %r ; ; SSSE3-LABEL: 'pairwise_reduction2double' ; SSSE3-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %rdx.shuf.1.0 = shufflevector <2 x double> %rdx, <2 x double> undef, <2 x i32> ; SSSE3-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %rdx.shuf.1.1 = shufflevector <2 x double> %rdx, <2 x double> undef, <2 x i32> ; SSSE3-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %bin.rdx8 = fadd <2 x double> %rdx.shuf.1.0, %rdx.shuf.1.1 -; SSSE3-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %r = extractelement <2 x double> %bin.rdx8, i32 0 +; SSSE3-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %r = extractelement <2 x double> %bin.rdx8, i32 0 ; SSSE3-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret double %r ; ; SSE42-LABEL: 'pairwise_reduction2double' @@ -709,7 
+673,7 @@ define fastcc float @pairwise_reduction4 ; SSE2-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %rdx.shuf.1.0 = shufflevector <4 x float> %bin.rdx, <4 x float> undef, <4 x i32> ; SSE2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %rdx.shuf.1.1 = shufflevector <4 x float> %bin.rdx, <4 x float> undef, <4 x i32> ; SSE2-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %bin.rdx8 = fadd <4 x float> %rdx.shuf.1.0, %rdx.shuf.1.1 -; SSE2-NEXT: Cost Model: Found an estimated cost of 7 for instruction: %r = extractelement <4 x float> %bin.rdx8, i32 0 +; SSE2-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %r = extractelement <4 x float> %bin.rdx8, i32 0 ; SSE2-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret float %r ; ; SSSE3-LABEL: 'pairwise_reduction4float' @@ -719,7 +683,7 @@ define fastcc float @pairwise_reduction4 ; SSSE3-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %rdx.shuf.1.0 = shufflevector <4 x float> %bin.rdx, <4 x float> undef, <4 x i32> ; SSSE3-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %rdx.shuf.1.1 = shufflevector <4 x float> %bin.rdx, <4 x float> undef, <4 x i32> ; SSSE3-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %bin.rdx8 = fadd <4 x float> %rdx.shuf.1.0, %rdx.shuf.1.1 -; SSSE3-NEXT: Cost Model: Found an estimated cost of 7 for instruction: %r = extractelement <4 x float> %bin.rdx8, i32 0 +; SSSE3-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %r = extractelement <4 x float> %bin.rdx8, i32 0 ; SSSE3-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret float %r ; ; SSE42-LABEL: 'pairwise_reduction4float' @@ -761,7 +725,7 @@ define fastcc double @pairwise_reduction ; SSE2-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %rdx.shuf.1.0 = shufflevector <4 x double> %bin.rdx, <4 x double> undef, <4 x i32> ; SSE2-NEXT: Cost Model: Found an estimated cost of 2 for 
instruction: %rdx.shuf.1.1 = shufflevector <4 x double> %bin.rdx, <4 x double> undef, <4 x i32> ; SSE2-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %bin.rdx8 = fadd <4 x double> %rdx.shuf.1.0, %rdx.shuf.1.1 -; SSE2-NEXT: Cost Model: Found an estimated cost of 5 for instruction: %r = extractelement <4 x double> %bin.rdx8, i32 0 +; SSE2-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %r = extractelement <4 x double> %bin.rdx8, i32 0 ; SSE2-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret double %r ; ; SSSE3-LABEL: 'pairwise_reduction4double' @@ -771,7 +735,7 @@ define fastcc double @pairwise_reduction ; SSSE3-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %rdx.shuf.1.0 = shufflevector <4 x double> %bin.rdx, <4 x double> undef, <4 x i32> ; SSSE3-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %rdx.shuf.1.1 = shufflevector <4 x double> %bin.rdx, <4 x double> undef, <4 x i32> ; SSSE3-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %bin.rdx8 = fadd <4 x double> %rdx.shuf.1.0, %rdx.shuf.1.1 -; SSSE3-NEXT: Cost Model: Found an estimated cost of 5 for instruction: %r = extractelement <4 x double> %bin.rdx8, i32 0 +; SSSE3-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %r = extractelement <4 x double> %bin.rdx8, i32 0 ; SSSE3-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret double %r ; ; SSE42-LABEL: 'pairwise_reduction4double' @@ -826,7 +790,7 @@ define fastcc float @pairwise_reduction8 ; SSE2-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %rdx.shuf.2.0 = shufflevector <8 x float> %bin.rdx8, <8 x float> undef, <8 x i32> ; SSE2-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %rdx.shuf.2.1 = shufflevector <8 x float> %bin.rdx8, <8 x float> undef, <8 x i32> ; SSE2-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %bin.rdx9 = fadd <8 x float> %rdx.shuf.2.0, %rdx.shuf.2.1 -; SSE2-NEXT: Cost 
Model: Found an estimated cost of 9 for instruction: %r = extractelement <8 x float> %bin.rdx9, i32 0 +; SSE2-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %r = extractelement <8 x float> %bin.rdx9, i32 0 ; SSE2-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret float %r ; ; SSSE3-LABEL: 'pairwise_reduction8float' @@ -839,7 +803,7 @@ define fastcc float @pairwise_reduction8 ; SSSE3-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %rdx.shuf.2.0 = shufflevector <8 x float> %bin.rdx8, <8 x float> undef, <8 x i32> ; SSSE3-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %rdx.shuf.2.1 = shufflevector <8 x float> %bin.rdx8, <8 x float> undef, <8 x i32> ; SSSE3-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %bin.rdx9 = fadd <8 x float> %rdx.shuf.2.0, %rdx.shuf.2.1 -; SSSE3-NEXT: Cost Model: Found an estimated cost of 9 for instruction: %r = extractelement <8 x float> %bin.rdx9, i32 0 +; SSSE3-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %r = extractelement <8 x float> %bin.rdx9, i32 0 ; SSSE3-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret float %r ; ; SSE42-LABEL: 'pairwise_reduction8float' @@ -896,26 +860,12 @@ define fastcc float @pairwise_reduction8 } define fastcc i64 @pairwise_reduction2i64(<2 x i64> %rdx, i64 %f1) { -; SSE2-LABEL: 'pairwise_reduction2i64' -; SSE2-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %rdx.shuf.1.0 = shufflevector <2 x i64> %rdx, <2 x i64> undef, <2 x i32> -; SSE2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %rdx.shuf.1.1 = shufflevector <2 x i64> %rdx, <2 x i64> undef, <2 x i32> -; SSE2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %bin.rdx8 = add <2 x i64> %rdx.shuf.1.0, %rdx.shuf.1.1 -; SSE2-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %r = extractelement <2 x i64> %bin.rdx8, i32 0 -; SSE2-NEXT: Cost Model: Found an estimated cost of 0 for 
instruction: ret i64 %r -; -; SSSE3-LABEL: 'pairwise_reduction2i64' -; SSSE3-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %rdx.shuf.1.0 = shufflevector <2 x i64> %rdx, <2 x i64> undef, <2 x i32> -; SSSE3-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %rdx.shuf.1.1 = shufflevector <2 x i64> %rdx, <2 x i64> undef, <2 x i32> -; SSSE3-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %bin.rdx8 = add <2 x i64> %rdx.shuf.1.0, %rdx.shuf.1.1 -; SSSE3-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %r = extractelement <2 x i64> %bin.rdx8, i32 0 -; SSSE3-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i64 %r -; -; SSE42-LABEL: 'pairwise_reduction2i64' -; SSE42-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %rdx.shuf.1.0 = shufflevector <2 x i64> %rdx, <2 x i64> undef, <2 x i32> -; SSE42-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %rdx.shuf.1.1 = shufflevector <2 x i64> %rdx, <2 x i64> undef, <2 x i32> -; SSE42-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %bin.rdx8 = add <2 x i64> %rdx.shuf.1.0, %rdx.shuf.1.1 -; SSE42-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %r = extractelement <2 x i64> %bin.rdx8, i32 0 -; SSE42-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i64 %r +; SSE-LABEL: 'pairwise_reduction2i64' +; SSE-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %rdx.shuf.1.0 = shufflevector <2 x i64> %rdx, <2 x i64> undef, <2 x i32> +; SSE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %rdx.shuf.1.1 = shufflevector <2 x i64> %rdx, <2 x i64> undef, <2 x i32> +; SSE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %bin.rdx8 = add <2 x i64> %rdx.shuf.1.0, %rdx.shuf.1.1 +; SSE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %r = extractelement <2 x i64> %bin.rdx8, i32 0 +; SSE-NEXT: Cost Model: Found an estimated cost of 0 for 
instruction: ret i64 %r ; ; AVX-LABEL: 'pairwise_reduction2i64' ; AVX-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %rdx.shuf.1.0 = shufflevector <2 x i64> %rdx, <2 x i64> undef, <2 x i32> @@ -933,45 +883,15 @@ define fastcc i64 @pairwise_reduction2i6 } define fastcc i32 @pairwise_reduction4i32(<4 x i32> %rdx, i32 %f1) { -; SSE2-LABEL: 'pairwise_reduction4i32' -; SSE2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %rdx.shuf.0.0 = shufflevector <4 x i32> %rdx, <4 x i32> undef, <4 x i32> -; SSE2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %rdx.shuf.0.1 = shufflevector <4 x i32> %rdx, <4 x i32> undef, <4 x i32> -; SSE2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %bin.rdx = add <4 x i32> %rdx.shuf.0.0, %rdx.shuf.0.1 -; SSE2-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %rdx.shuf.1.0 = shufflevector <4 x i32> %bin.rdx, <4 x i32> undef, <4 x i32> -; SSE2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %rdx.shuf.1.1 = shufflevector <4 x i32> %bin.rdx, <4 x i32> undef, <4 x i32> -; SSE2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %bin.rdx8 = add <4 x i32> %rdx.shuf.1.0, %rdx.shuf.1.1 -; SSE2-NEXT: Cost Model: Found an estimated cost of 6 for instruction: %r = extractelement <4 x i32> %bin.rdx8, i32 0 -; SSE2-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 %r -; -; SSSE3-LABEL: 'pairwise_reduction4i32' -; SSSE3-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %rdx.shuf.0.0 = shufflevector <4 x i32> %rdx, <4 x i32> undef, <4 x i32> -; SSSE3-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %rdx.shuf.0.1 = shufflevector <4 x i32> %rdx, <4 x i32> undef, <4 x i32> -; SSSE3-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %bin.rdx = add <4 x i32> %rdx.shuf.0.0, %rdx.shuf.0.1 -; SSSE3-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %rdx.shuf.1.0 = shufflevector <4 x i32> 
%bin.rdx, <4 x i32> undef, <4 x i32> -; SSSE3-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %rdx.shuf.1.1 = shufflevector <4 x i32> %bin.rdx, <4 x i32> undef, <4 x i32> -; SSSE3-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %bin.rdx8 = add <4 x i32> %rdx.shuf.1.0, %rdx.shuf.1.1 -; SSSE3-NEXT: Cost Model: Found an estimated cost of 6 for instruction: %r = extractelement <4 x i32> %bin.rdx8, i32 0 -; SSSE3-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 %r -; -; SSE42-LABEL: 'pairwise_reduction4i32' -; SSE42-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %rdx.shuf.0.0 = shufflevector <4 x i32> %rdx, <4 x i32> undef, <4 x i32> -; SSE42-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %rdx.shuf.0.1 = shufflevector <4 x i32> %rdx, <4 x i32> undef, <4 x i32> -; SSE42-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %bin.rdx = add <4 x i32> %rdx.shuf.0.0, %rdx.shuf.0.1 -; SSE42-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %rdx.shuf.1.0 = shufflevector <4 x i32> %bin.rdx, <4 x i32> undef, <4 x i32> -; SSE42-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %rdx.shuf.1.1 = shufflevector <4 x i32> %bin.rdx, <4 x i32> undef, <4 x i32> -; SSE42-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %bin.rdx8 = add <4 x i32> %rdx.shuf.1.0, %rdx.shuf.1.1 -; SSE42-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %r = extractelement <4 x i32> %bin.rdx8, i32 0 -; SSE42-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 %r -; -; AVX-LABEL: 'pairwise_reduction4i32' -; AVX-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %rdx.shuf.0.0 = shufflevector <4 x i32> %rdx, <4 x i32> undef, <4 x i32> -; AVX-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %rdx.shuf.0.1 = shufflevector <4 x i32> %rdx, <4 x i32> undef, <4 x i32> -; AVX-NEXT: Cost Model: Found an estimated cost of 1 
for instruction: %bin.rdx = add <4 x i32> %rdx.shuf.0.0, %rdx.shuf.0.1 -; AVX-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %rdx.shuf.1.0 = shufflevector <4 x i32> %bin.rdx, <4 x i32> undef, <4 x i32> -; AVX-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %rdx.shuf.1.1 = shufflevector <4 x i32> %bin.rdx, <4 x i32> undef, <4 x i32> -; AVX-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %bin.rdx8 = add <4 x i32> %rdx.shuf.1.0, %rdx.shuf.1.1 -; AVX-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %r = extractelement <4 x i32> %bin.rdx8, i32 0 -; AVX-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 %r +; CHECK-LABEL: 'pairwise_reduction4i32' +; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %rdx.shuf.0.0 = shufflevector <4 x i32> %rdx, <4 x i32> undef, <4 x i32> +; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %rdx.shuf.0.1 = shufflevector <4 x i32> %rdx, <4 x i32> undef, <4 x i32> +; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %bin.rdx = add <4 x i32> %rdx.shuf.0.0, %rdx.shuf.0.1 +; CHECK-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %rdx.shuf.1.0 = shufflevector <4 x i32> %bin.rdx, <4 x i32> undef, <4 x i32> +; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %rdx.shuf.1.1 = shufflevector <4 x i32> %bin.rdx, <4 x i32> undef, <4 x i32> +; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %bin.rdx8 = add <4 x i32> %rdx.shuf.1.0, %rdx.shuf.1.1 +; CHECK-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %r = extractelement <4 x i32> %bin.rdx8, i32 0 +; CHECK-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 %r ; %rdx.shuf.0.0 = shufflevector <4 x i32> %rdx, <4 x i32> undef, <4 x i32> %rdx.shuf.0.1 = shufflevector <4 x i32> %rdx, <4 x i32> undef, <4 x i32> @@ -1037,7 +957,7 @@ define fastcc i16 @pairwise_reduction8i1 ; 
SSE2-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %rdx.shuf.2.0 = shufflevector <8 x i16> %bin.rdx8, <8 x i16> undef, <8 x i32> ; SSE2-NEXT: Cost Model: Found an estimated cost of 5 for instruction: %rdx.shuf.2.1 = shufflevector <8 x i16> %bin.rdx8, <8 x i16> undef, <8 x i32> ; SSE2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %bin.rdx9 = add <8 x i16> %rdx.shuf.2.0, %rdx.shuf.2.1 -; SSE2-NEXT: Cost Model: Found an estimated cost of 29 for instruction: %r = extractelement <8 x i16> %bin.rdx9, i32 0 +; SSE2-NEXT: Cost Model: Found an estimated cost of 5 for instruction: %r = extractelement <8 x i16> %bin.rdx9, i32 0 ; SSE2-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i16 %r ; ; SSSE3-LABEL: 'pairwise_reduction8i16' @@ -1050,7 +970,7 @@ define fastcc i16 @pairwise_reduction8i1 ; SSSE3-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %rdx.shuf.2.0 = shufflevector <8 x i16> %bin.rdx8, <8 x i16> undef, <8 x i32> ; SSSE3-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %rdx.shuf.2.1 = shufflevector <8 x i16> %bin.rdx8, <8 x i16> undef, <8 x i32> ; SSSE3-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %bin.rdx9 = add <8 x i16> %rdx.shuf.2.0, %rdx.shuf.2.1 -; SSSE3-NEXT: Cost Model: Found an estimated cost of 9 for instruction: %r = extractelement <8 x i16> %bin.rdx9, i32 0 +; SSSE3-NEXT: Cost Model: Found an estimated cost of 5 for instruction: %r = extractelement <8 x i16> %bin.rdx9, i32 0 ; SSSE3-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i16 %r ; ; SSE42-LABEL: 'pairwise_reduction8i16' @@ -1094,44 +1014,18 @@ define fastcc i16 @pairwise_reduction8i1 } define fastcc i32 @pairwise_reduction8i32(<8 x i32> %rdx, i32 %f1) { -; SSE2-LABEL: 'pairwise_reduction8i32' -; SSE2-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %rdx.shuf.0.0 = shufflevector <8 x i32> %rdx, <8 x i32> undef, <8 x i32> -; SSE2-NEXT: Cost Model: Found an 
estimated cost of 4 for instruction: %rdx.shuf.0.1 = shufflevector <8 x i32> %rdx, <8 x i32> undef, <8 x i32> -; SSE2-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %bin.rdx = add <8 x i32> %rdx.shuf.0.0, %rdx.shuf.0.1 -; SSE2-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %rdx.shuf.1.0 = shufflevector <8 x i32> %bin.rdx, <8 x i32> undef, <8 x i32> -; SSE2-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %rdx.shuf.1.1 = shufflevector <8 x i32> %bin.rdx, <8 x i32> undef, <8 x i32> -; SSE2-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %bin.rdx8 = add <8 x i32> %rdx.shuf.1.0, %rdx.shuf.1.1 -; SSE2-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %rdx.shuf.2.0 = shufflevector <8 x i32> %bin.rdx8, <8 x i32> undef, <8 x i32> -; SSE2-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %rdx.shuf.2.1 = shufflevector <8 x i32> %bin.rdx8, <8 x i32> undef, <8 x i32> -; SSE2-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %bin.rdx9 = add <8 x i32> %rdx.shuf.2.0, %rdx.shuf.2.1 -; SSE2-NEXT: Cost Model: Found an estimated cost of 7 for instruction: %r = extractelement <8 x i32> %bin.rdx9, i32 0 -; SSE2-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 %r -; -; SSSE3-LABEL: 'pairwise_reduction8i32' -; SSSE3-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %rdx.shuf.0.0 = shufflevector <8 x i32> %rdx, <8 x i32> undef, <8 x i32> -; SSSE3-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %rdx.shuf.0.1 = shufflevector <8 x i32> %rdx, <8 x i32> undef, <8 x i32> -; SSSE3-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %bin.rdx = add <8 x i32> %rdx.shuf.0.0, %rdx.shuf.0.1 -; SSSE3-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %rdx.shuf.1.0 = shufflevector <8 x i32> %bin.rdx, <8 x i32> undef, <8 x i32> -; SSSE3-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %rdx.shuf.1.1 = 
shufflevector <8 x i32> %bin.rdx, <8 x i32> undef, <8 x i32> -; SSSE3-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %bin.rdx8 = add <8 x i32> %rdx.shuf.1.0, %rdx.shuf.1.1 -; SSSE3-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %rdx.shuf.2.0 = shufflevector <8 x i32> %bin.rdx8, <8 x i32> undef, <8 x i32> -; SSSE3-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %rdx.shuf.2.1 = shufflevector <8 x i32> %bin.rdx8, <8 x i32> undef, <8 x i32> -; SSSE3-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %bin.rdx9 = add <8 x i32> %rdx.shuf.2.0, %rdx.shuf.2.1 -; SSSE3-NEXT: Cost Model: Found an estimated cost of 7 for instruction: %r = extractelement <8 x i32> %bin.rdx9, i32 0 -; SSSE3-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 %r -; -; SSE42-LABEL: 'pairwise_reduction8i32' -; SSE42-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %rdx.shuf.0.0 = shufflevector <8 x i32> %rdx, <8 x i32> undef, <8 x i32> -; SSE42-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %rdx.shuf.0.1 = shufflevector <8 x i32> %rdx, <8 x i32> undef, <8 x i32> -; SSE42-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %bin.rdx = add <8 x i32> %rdx.shuf.0.0, %rdx.shuf.0.1 -; SSE42-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %rdx.shuf.1.0 = shufflevector <8 x i32> %bin.rdx, <8 x i32> undef, <8 x i32> -; SSE42-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %rdx.shuf.1.1 = shufflevector <8 x i32> %bin.rdx, <8 x i32> undef, <8 x i32> -; SSE42-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %bin.rdx8 = add <8 x i32> %rdx.shuf.1.0, %rdx.shuf.1.1 -; SSE42-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %rdx.shuf.2.0 = shufflevector <8 x i32> %bin.rdx8, <8 x i32> undef, <8 x i32> -; SSE42-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %rdx.shuf.2.1 = shufflevector <8 x i32> %bin.rdx8, <8 x i32> 
undef, <8 x i32> -; SSE42-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %bin.rdx9 = add <8 x i32> %rdx.shuf.2.0, %rdx.shuf.2.1 -; SSE42-NEXT: Cost Model: Found an estimated cost of 6 for instruction: %r = extractelement <8 x i32> %bin.rdx9, i32 0 -; SSE42-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 %r +; SSE-LABEL: 'pairwise_reduction8i32' +; SSE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %rdx.shuf.0.0 = shufflevector <8 x i32> %rdx, <8 x i32> undef, <8 x i32> +; SSE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %rdx.shuf.0.1 = shufflevector <8 x i32> %rdx, <8 x i32> undef, <8 x i32> +; SSE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %bin.rdx = add <8 x i32> %rdx.shuf.0.0, %rdx.shuf.0.1 +; SSE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %rdx.shuf.1.0 = shufflevector <8 x i32> %bin.rdx, <8 x i32> undef, <8 x i32> +; SSE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %rdx.shuf.1.1 = shufflevector <8 x i32> %bin.rdx, <8 x i32> undef, <8 x i32> +; SSE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %bin.rdx8 = add <8 x i32> %rdx.shuf.1.0, %rdx.shuf.1.1 +; SSE-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %rdx.shuf.2.0 = shufflevector <8 x i32> %bin.rdx8, <8 x i32> undef, <8 x i32> +; SSE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %rdx.shuf.2.1 = shufflevector <8 x i32> %bin.rdx8, <8 x i32> undef, <8 x i32> +; SSE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %bin.rdx9 = add <8 x i32> %rdx.shuf.2.0, %rdx.shuf.2.1 +; SSE-NEXT: Cost Model: Found an estimated cost of 6 for instruction: %r = extractelement <8 x i32> %bin.rdx9, i32 0 +; SSE-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 %r ; ; AVX1-LABEL: 'pairwise_reduction8i32' ; AVX1-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %rdx.shuf.0.0 = shufflevector <8 x i32> 
%rdx, <8 x i32> undef, <8 x i32> Modified: llvm/trunk/test/Transforms/SLPVectorizer/X86/remark_horcost.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/SLPVectorizer/X86/remark_horcost.ll?rev=374655&r1=374654&r2=374655&view=diff ============================================================================== --- llvm/trunk/test/Transforms/SLPVectorizer/X86/remark_horcost.ll (original) +++ llvm/trunk/test/Transforms/SLPVectorizer/X86/remark_horcost.ll Sat Oct 12 06:21:50 2019 @@ -120,7 +120,7 @@ for.body: ; YAML-NEXT: Function: foo ; YAML-NEXT: Args: ; YAML-NEXT: - String: 'Vectorized horizontal reduction with cost ' - ; YAML-NEXT: - Cost: '-2' + ; YAML-NEXT: - Cost: '-4' ; YAML-NEXT: - String: ' and with tree size ' ; YAML-NEXT: - TreeSize: '1' From llvm-commits at lists.llvm.org Sat Oct 12 07:15:48 2019 From: llvm-commits at lists.llvm.org (Fangrui Song via Phabricator via llvm-commits) Date: Sat, 12 Oct 2019 14:15:48 +0000 (UTC) Subject: [PATCH] D68906: [llvm-size] Tidy up error messages In-Reply-To: References: Message-ID: <20bcffaa7f6706732884bb0858c67461@localhost.localdomain> MaskRay accepted this revision. MaskRay added a comment. This revision is now accepted and ready to land. LGTM, but please wait for @grimar's opinion. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68906/new/ https://reviews.llvm.org/D68906 From llvm-commits at lists.llvm.org Sat Oct 12 07:26:55 2019 From: llvm-commits at lists.llvm.org (Sanjay Patel via Phabricator via llvm-commits) Date: Sat, 12 Oct 2019 14:26:55 +0000 (UTC) Subject: [PATCH] D68911: [AArch64] enable (v)select to math TLI hook (WIP) Message-ID: spatel created this revision. spatel added reviewers: greened, t.p.northover, sebpop, SjoerdMeijer, kristof.beyls. Herald added subscribers: hiraditya, mcrosier. Herald added a project: LLVM. 
I added more select-to-shift folds to DAGCombiner in:
rL374397
rL374555
...and noticed that AArch64 does not override a TLI hook that was added with:
rL296977

Not sure if that is intentional or oversight, so I flipped the setting completely and updated some auto-generated regression tests. It's not a universal win, but likely worth doing? If so, there are a few more regression tests that need updating/inspection. I don't intend to do more with this patch, so feel free to commandeer/close.

https://reviews.llvm.org/D68911

Files:
  llvm/lib/Target/AArch64/AArch64ISelLowering.h
  llvm/test/CodeGen/AArch64/select_const.ll
  llvm/test/CodeGen/AArch64/selectcc-to-shiftand.ll
  llvm/test/CodeGen/AArch64/signbit-shift.ll
  llvm/test/CodeGen/AArch64/vselect-constants.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68911.224741.patch
Type: text/x-patch
Size: 16060 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Sat Oct 12 07:42:44 2019
From: llvm-commits at lists.llvm.org (Sanjay Patel via Phabricator via llvm-commits)
Date: Sat, 12 Oct 2019 14:42:44 +0000 (UTC)
Subject: [PATCH] D63382: [InstCombine] fold a shifted zext to a select
In-Reply-To:
References:
Message-ID:

spatel added a subscriber: joanlluch.
spatel added a comment.

@joanlluch - this is a case where we could transform incoming shift to select and benefit small targets as you've noted.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D63382/new/

https://reviews.llvm.org/D63382

From llvm-commits at lists.llvm.org Sat Oct 12 07:42:44 2019
From: llvm-commits at lists.llvm.org (Sanjay Patel via Phabricator via llvm-commits)
Date: Sat, 12 Oct 2019 14:42:44 +0000 (UTC)
Subject: [PATCH] D63382: [InstCombine] fold a shifted zext to a select
In-Reply-To:
References:
Message-ID:

spatel added a comment.

I haven't tested this on trunk, but I added the DAGCombiner reversals:
rL374397
rL374555
...so this should be good to go. Should I commandeer?
There may still be regressions because most in-trunk targets don't enable the guarding hook. Example: D68911 ...but any target has the ability to change that as needed. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D63382/new/ https://reviews.llvm.org/D63382 From llvm-commits at lists.llvm.org Sat Oct 12 07:58:30 2019 From: llvm-commits at lists.llvm.org (Joel E. Denny via llvm-commits) Date: Sat, 12 Oct 2019 14:58:30 -0000 Subject: [llvm] r374656 - Revert r374654: "[lit] Try to fix new tests that fail on Windows bots" Message-ID: <20191012145830.94C7685660@lists.llvm.org> Author: jdenny Date: Sat Oct 12 07:58:30 2019 New Revision: 374656 URL: http://llvm.org/viewvc/llvm-project?rev=374656&view=rev Log: Revert r374654: "[lit] Try to fix new tests that fail on Windows bots" Modified: llvm/trunk/utils/lit/tests/shtest-shell.py Modified: llvm/trunk/utils/lit/tests/shtest-shell.py URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/shtest-shell.py?rev=374656&r1=374655&r2=374656&view=diff ============================================================================== --- llvm/trunk/utils/lit/tests/shtest-shell.py (original) +++ llvm/trunk/utils/lit/tests/shtest-shell.py Sat Oct 12 07:58:30 2019 @@ -46,10 +46,10 @@ # CHECK-NEXT: --- # CHECK-NEXT: +++ # CHECK-NEXT: @@ -# CHECK-NEXT: {{^ .+f.+o.+o.+$}} -# CHECK-NEXT: {{^-.+b.+a.+r.+$}} -# CHECK-NEXT: {{^\+.+b.+a.+r.+$}} -# CHECK-NEXT: {{^ .+b.+a.+z.+$}} +# CHECK-NEXT: {{^ .f.o.o.$}} +# CHECK-NEXT: {{^-.b.a.r.$}} +# CHECK-NEXT: {{^\+.b.a.r..}} +# CHECK-NEXT: {{^ .b.a.z.$}} # CHECK: error: command failed with exit status: 1 # CHECK: $ "true" @@ -61,9 +61,9 @@ # CHECK-NEXT: -foo # CHECK-NEXT: -bar # CHECK-NEXT: -baz -# CHECK-NEXT: {{^\+.+f.+o.+o.+$}} -# CHECK-NEXT: {{^\+.+b.+a.+r.+$}} -# CHECK-NEXT: {{^\+.+b.+a.+z.+$}} +# CHECK-NEXT: {{^\+.f.o.o.$}} +# CHECK-NEXT: {{^\+.b.a.r..}} +# CHECK-NEXT: {{^\+.b.a.z.$}} # CHECK: error: command failed with exit status: 1 # CHECK: $ "true" @@ -72,9 +72,9 @@ # 
CHECK-NEXT: --- # CHECK-NEXT: +++ # CHECK-NEXT: @@ -# CHECK-NEXT: {{^\-.+f.+o.+o.+$}} -# CHECK-NEXT: {{^\-.+b.+a.+r.+$}} -# CHECK-NEXT: {{^\-.+b.+a.+z.+$}} +# CHECK-NEXT: {{^\-.f.o.o.$}} +# CHECK-NEXT: {{^\-.b.a.r..}} +# CHECK-NEXT: {{^\-.b.a.z.$}} # CHECK-NEXT: +foo # CHECK-NEXT: +bar # CHECK-NEXT: +baz @@ -98,10 +98,10 @@ # CHECK-NEXT: --- # CHECK-NEXT: +++ # CHECK-NEXT: @@ -# CHECK-NEXT: {{^ .+f.+o.+o.+$}} -# CHECK-NEXT: {{^-.+b.+a.+r.+$}} -# CHECK-NEXT: {{^\+.+b.+a.+r.+$}} -# CHECK-NEXT: {{^ .+b.+a.+z.+$}} +# CHECK-NEXT: {{^ .f.o.o.$}} +# CHECK-NEXT: {{^-.b.a.r.$}} +# CHECK-NEXT: {{^\+.b.a.r..}} +# CHECK-NEXT: {{^ .b.a.z.$}} # CHECK: error: command failed with exit status: 1 # CHECK: $ "true" @@ -115,9 +115,9 @@ # CHECK-NEXT: -foo # CHECK-NEXT: -bar # CHECK-NEXT: -baz -# CHECK-NEXT: {{^\+.+f.+o.+o.+$}} -# CHECK-NEXT: {{^\+.+b.+a.+r.+$}} -# CHECK-NEXT: {{^\+.+b.+a.+z.+$}} +# CHECK-NEXT: {{^\+.f.o.o.$}} +# CHECK-NEXT: {{^\+.b.a.r..}} +# CHECK-NEXT: {{^\+.b.a.z.$}} # CHECK: error: command failed with exit status: 1 # CHECK: $ "true" @@ -126,9 +126,9 @@ # CHECK-NEXT: --- # CHECK-NEXT: +++ # CHECK-NEXT: @@ -# CHECK-NEXT: {{^\-.+f.+o.+o.+$}} -# CHECK-NEXT: {{^\-.+b.+a.+r.+$}} -# CHECK-NEXT: {{^\-.+b.+a.+z.+$}} +# CHECK-NEXT: {{^\-.f.o.o.$}} +# CHECK-NEXT: {{^\-.b.a.r..}} +# CHECK-NEXT: {{^\-.b.a.z.$}} # CHECK-NEXT: +foo # CHECK-NEXT: +bar # CHECK-NEXT: +baz From llvm-commits at lists.llvm.org Sat Oct 12 07:58:43 2019 From: llvm-commits at lists.llvm.org (Joel E. 
Denny via llvm-commits)
Date: Sat, 12 Oct 2019 14:58:43 -0000
Subject: [llvm] r374657 - [lit] Try again to fix new tests that fail on Windows bots
Message-ID: <20191012145843.401908570F@lists.llvm.org>

Author: jdenny
Date: Sat Oct 12 07:58:43 2019
New Revision: 374657

URL: http://llvm.org/viewvc/llvm-project?rev=374657&view=rev
Log:
[lit] Try again to fix new tests that fail on Windows bots

Based on the bot logs, when lit's internal diff runs on Windows, it
looks like binary diffs must be decoded also for Python 2.7.
Otherwise, writing the diff to stdout fails with:

```
UnicodeEncodeError: 'ascii' codec can't encode characters in position 7-8: ordinal not in range(128)
```

I did not need to decode using Python 2.7.15 under Ubuntu. When I do
it anyway in that case, `errors="backslashreplace"` fails for me:

```
TypeError: don't know how to handle UnicodeDecodeError in error callback
```

However, `errors="ignore"` works, so this patch uses that, hoping
it'll work on Windows as well.

This patch leaves `errors="backslashreplace"` for Python >= 3.5 as
there's no evidence yet that doesn't work and it produces more
informative binary diffs. This patch also adjusts some lit tests to
succeed for either error handler.

This patch adjusts changes introduced by D68664.
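To make the difference between the two error handlers in the log above concrete, here is a minimal sketch. It is not lit's `diff.py`; the sample bytes, the `decode_line` helper, and the explicit UTF-8 encoding are assumptions chosen for illustration, and it targets Python >= 3.5, where `backslashreplace` is accepted for decoding.

```python
# Illustrative sketch only; not taken from lit's diff.py.
# Assumption: a diff line is held as bytes and contains one byte (0xff)
# that is not valid UTF-8.
raw = b"bar\xff\n"


def decode_line(data, handler):
    """Decode bytes as UTF-8 using the named codec error handler."""
    return data.decode("utf-8", errors=handler)


# "ignore" silently drops the undecodable byte.
ignored = decode_line(raw, "ignore")

# "backslashreplace" keeps a visible \xff escape in the decoded text
# (supported for decoding on Python >= 3.5).
escaped = decode_line(raw, "backslashreplace")

print(repr(ignored))
print(repr(escaped))
```

Because the decoded line has a different length depending on the handler, a test that must pass under both cannot pin down the exact tail of such a line, which is consistent with the log's note about adjusting the lit tests.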
Modified: llvm/trunk/utils/lit/lit/builtin_commands/diff.py llvm/trunk/utils/lit/tests/shtest-shell.py Modified: llvm/trunk/utils/lit/lit/builtin_commands/diff.py URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/lit/builtin_commands/diff.py?rev=374657&r1=374656&r2=374657&view=diff ============================================================================== --- llvm/trunk/utils/lit/lit/builtin_commands/diff.py (original) +++ llvm/trunk/utils/lit/lit/builtin_commands/diff.py Sat Oct 12 07:58:43 2019 @@ -62,6 +62,7 @@ def compareTwoBinaryFiles(flags, filepat func = difflib.context_diff diffs = func(filelines[0], filelines[1], filepaths[0], filepaths[1], n = flags.num_context_lines) + diffs = [diff.decode(errors="ignore") for diff in diffs] for diff in diffs: sys.stdout.write(diff) Modified: llvm/trunk/utils/lit/tests/shtest-shell.py URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/shtest-shell.py?rev=374657&r1=374656&r2=374657&view=diff ============================================================================== --- llvm/trunk/utils/lit/tests/shtest-shell.py (original) +++ llvm/trunk/utils/lit/tests/shtest-shell.py Sat Oct 12 07:58:43 2019 @@ -48,7 +48,7 @@ # CHECK-NEXT: @@ # CHECK-NEXT: {{^ .f.o.o.$}} # CHECK-NEXT: {{^-.b.a.r.$}} -# CHECK-NEXT: {{^\+.b.a.r..}} +# CHECK-NEXT: {{^\+.b.a.r.}} # CHECK-NEXT: {{^ .b.a.z.$}} # CHECK: error: command failed with exit status: 1 # CHECK: $ "true" @@ -62,7 +62,7 @@ # CHECK-NEXT: -bar # CHECK-NEXT: -baz # CHECK-NEXT: {{^\+.f.o.o.$}} -# CHECK-NEXT: {{^\+.b.a.r..}} +# CHECK-NEXT: {{^\+.b.a.r.}} # CHECK-NEXT: {{^\+.b.a.z.$}} # CHECK: error: command failed with exit status: 1 # CHECK: $ "true" @@ -73,7 +73,7 @@ # CHECK-NEXT: +++ # CHECK-NEXT: @@ # CHECK-NEXT: {{^\-.f.o.o.$}} -# CHECK-NEXT: {{^\-.b.a.r..}} +# CHECK-NEXT: {{^\-.b.a.r.}} # CHECK-NEXT: {{^\-.b.a.z.$}} # CHECK-NEXT: +foo # CHECK-NEXT: +bar @@ -100,7 +100,7 @@ # CHECK-NEXT: @@ # CHECK-NEXT: {{^ .f.o.o.$}} # CHECK-NEXT: {{^-.b.a.r.$}} -# 
CHECK-NEXT: {{^\+.b.a.r..}} +# CHECK-NEXT: {{^\+.b.a.r.}} # CHECK-NEXT: {{^ .b.a.z.$}} # CHECK: error: command failed with exit status: 1 # CHECK: $ "true" @@ -116,7 +116,7 @@ # CHECK-NEXT: -bar # CHECK-NEXT: -baz # CHECK-NEXT: {{^\+.f.o.o.$}} -# CHECK-NEXT: {{^\+.b.a.r..}} +# CHECK-NEXT: {{^\+.b.a.r.}} # CHECK-NEXT: {{^\+.b.a.z.$}} # CHECK: error: command failed with exit status: 1 # CHECK: $ "true" @@ -127,7 +127,7 @@ # CHECK-NEXT: +++ # CHECK-NEXT: @@ # CHECK-NEXT: {{^\-.f.o.o.$}} -# CHECK-NEXT: {{^\-.b.a.r..}} +# CHECK-NEXT: {{^\-.b.a.r.}} # CHECK-NEXT: {{^\-.b.a.z.$}} # CHECK-NEXT: +foo # CHECK-NEXT: +bar From llvm-commits at lists.llvm.org Sat Oct 12 08:19:14 2019 From: llvm-commits at lists.llvm.org (Simon Pilgrim via llvm-commits) Date: Sat, 12 Oct 2019 15:19:14 -0000 Subject: [llvm] r374658 - [X86][SSE] Avoid unnecessary PMOVZX in v4i8 sum reduction Message-ID: <20191012151914.1CD9C85E93@lists.llvm.org> Author: rksimon Date: Sat Oct 12 08:19:13 2019 New Revision: 374658 URL: http://llvm.org/viewvc/llvm-project?rev=374658&view=rev Log: [X86][SSE] Avoid unnecessary PMOVZX in v4i8 sum reduction This should go away once D66004 has landed and we can simplify shuffle chains using demanded elts. Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp llvm/trunk/test/CodeGen/X86/vector-reduce-add.ll Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=374658&r1=374657&r2=374658&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp (original) +++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Sat Oct 12 08:19:13 2019 @@ -36241,13 +36241,24 @@ static SDValue combineReductionToHorizon // vXi8 reduction - sub 128-bit vector. if (VecVT == MVT::v4i8 || VecVT == MVT::v8i8) { - // Pad with zero. 
- if (VecVT == MVT::v4i8) - Rdx = DAG.getNode(ISD::CONCAT_VECTORS, DL, MVT::v8i8, Rdx, - DAG.getConstant(0, DL, VecVT)); - // Pad with undef. - Rdx = DAG.getNode(ISD::CONCAT_VECTORS, DL, MVT::v16i8, Rdx, - DAG.getUNDEF(MVT::v8i8)); + if (VecVT == MVT::v4i8) { + // Pad with zero. + if (Subtarget.hasSSE41()) { + Rdx = DAG.getBitcast(MVT::i32, Rdx); + Rdx = DAG.getNode(ISD::INSERT_VECTOR_ELT, DL, MVT::v4i32, + DAG.getConstant(0, DL, MVT::v4i32), Rdx, + DAG.getIntPtrConstant(0, DL)); + Rdx = DAG.getBitcast(MVT::v16i8, Rdx); + } else { + Rdx = DAG.getNode(ISD::CONCAT_VECTORS, DL, MVT::v8i8, Rdx, + DAG.getConstant(0, DL, VecVT)); + } + } + if (Rdx.getValueType() == MVT::v8i8) { + // Pad with undef. + Rdx = DAG.getNode(ISD::CONCAT_VECTORS, DL, MVT::v16i8, Rdx, + DAG.getUNDEF(MVT::v8i8)); + } Rdx = DAG.getNode(X86ISD::PSADBW, DL, MVT::v2i64, Rdx, DAG.getConstant(0, DL, MVT::v16i8)); Rdx = DAG.getBitcast(MVT::v16i8, Rdx); Modified: llvm/trunk/test/CodeGen/X86/vector-reduce-add.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/vector-reduce-add.ll?rev=374658&r1=374657&r2=374658&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/vector-reduce-add.ll (original) +++ llvm/trunk/test/CodeGen/X86/vector-reduce-add.ll Sat Oct 12 08:19:13 2019 @@ -1038,17 +1038,17 @@ define i8 @test_v4i8(<4 x i8> %a0) { ; ; SSE41-LABEL: test_v4i8: ; SSE41: # %bb.0: -; SSE41-NEXT: pmovzxdq {{.*#+}} xmm0 = xmm0[0],zero,xmm0[1],zero ; SSE41-NEXT: pxor %xmm1, %xmm1 -; SSE41-NEXT: psadbw %xmm0, %xmm1 -; SSE41-NEXT: pextrb $0, %xmm1, %eax +; SSE41-NEXT: pblendw {{.*#+}} xmm0 = xmm0[0,1],xmm1[2,3,4,5,6,7] +; SSE41-NEXT: psadbw %xmm1, %xmm0 +; SSE41-NEXT: pextrb $0, %xmm0, %eax ; SSE41-NEXT: # kill: def $al killed $al killed $eax ; SSE41-NEXT: retq ; ; AVX-LABEL: test_v4i8: ; AVX: # %bb.0: -; AVX-NEXT: vpmovzxdq {{.*#+}} xmm0 = xmm0[0],zero,xmm0[1],zero ; AVX-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; AVX-NEXT: vpblendw 
{{.*#+}} xmm0 = xmm0[0,1],xmm1[2,3,4,5,6,7] ; AVX-NEXT: vpsadbw %xmm1, %xmm0, %xmm0 ; AVX-NEXT: vpextrb $0, %xmm0, %eax ; AVX-NEXT: # kill: def $al killed $al killed $eax @@ -1056,7 +1056,8 @@ define i8 @test_v4i8(<4 x i8> %a0) { ; ; AVX512-LABEL: test_v4i8: ; AVX512: # %bb.0: -; AVX512-NEXT: vpmovzxdq {{.*#+}} xmm0 = xmm0[0],zero,xmm0[1],zero +; AVX512-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; AVX512-NEXT: vpblendw {{.*#+}} xmm0 = xmm0[0,1],xmm1[2,3,4,5,6,7] ; AVX512-NEXT: vpxor %xmm1, %xmm1, %xmm1 ; AVX512-NEXT: vpsadbw %xmm1, %xmm0, %xmm0 ; AVX512-NEXT: vpextrb $0, %xmm0, %eax @@ -1079,7 +1080,6 @@ define i8 @test_v4i8_load(<4 x i8>* %p) ; SSE41-LABEL: test_v4i8_load: ; SSE41: # %bb.0: ; SSE41-NEXT: movd {{.*#+}} xmm0 = mem[0],zero,zero,zero -; SSE41-NEXT: pmovzxdq {{.*#+}} xmm0 = xmm0[0],zero,xmm0[1],zero ; SSE41-NEXT: pxor %xmm1, %xmm1 ; SSE41-NEXT: psadbw %xmm0, %xmm1 ; SSE41-NEXT: pextrb $0, %xmm1, %eax @@ -1089,7 +1089,6 @@ define i8 @test_v4i8_load(<4 x i8>* %p) ; AVX-LABEL: test_v4i8_load: ; AVX: # %bb.0: ; AVX-NEXT: vmovd {{.*#+}} xmm0 = mem[0],zero,zero,zero -; AVX-NEXT: vpmovzxdq {{.*#+}} xmm0 = xmm0[0],zero,xmm0[1],zero ; AVX-NEXT: vpxor %xmm1, %xmm1, %xmm1 ; AVX-NEXT: vpsadbw %xmm1, %xmm0, %xmm0 ; AVX-NEXT: vpextrb $0, %xmm0, %eax @@ -1099,7 +1098,6 @@ define i8 @test_v4i8_load(<4 x i8>* %p) ; AVX512-LABEL: test_v4i8_load: ; AVX512: # %bb.0: ; AVX512-NEXT: vmovd {{.*#+}} xmm0 = mem[0],zero,zero,zero -; AVX512-NEXT: vpmovzxdq {{.*#+}} xmm0 = xmm0[0],zero,xmm0[1],zero ; AVX512-NEXT: vpxor %xmm1, %xmm1, %xmm1 ; AVX512-NEXT: vpsadbw %xmm1, %xmm0, %xmm0 ; AVX512-NEXT: vpextrb $0, %xmm0, %eax From llvm-commits at lists.llvm.org Sat Oct 12 08:19:01 2019 From: llvm-commits at lists.llvm.org (Simon Pilgrim via Phabricator via llvm-commits) Date: Sat, 12 Oct 2019 15:19:01 +0000 (UTC) Subject: [PATCH] D66004: [WIP][X86][SSE] SimplifyDemandedVectorEltsForTargetNode - add general shuffle combining support In-Reply-To: References: Message-ID: RKSimon planned changes to 
this revision.
RKSimon added a comment.

WIP - PR27854 and PR43024 need to be finished first.

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D66004/new/

https://reviews.llvm.org/D66004

From llvm-commits at lists.llvm.org Sat Oct 12 08:28:10 2019
From: llvm-commits at lists.llvm.org (Juneyoung Lee via Phabricator via llvm-commits)
Date: Sat, 12 Oct 2019 15:28:10 +0000 (UTC)
Subject: [PATCH] D29011: [IR] Add Freeze instruction
In-Reply-To:
References:
Message-ID: <46ab9536095767f06e4da0db906bac1f@localhost.localdomain>

aqjune updated this revision to Diff 224745.
aqjune marked 3 inline comments as done.
aqjune added a comment.

- Support freeze constexpr
- Make freeze's type checking consistent with LangRef.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D29011/new/

https://reviews.llvm.org/D29011

Files:
  include/llvm-c/Core.h
  include/llvm/Bitcode/LLVMBitCodes.h
  include/llvm/CodeGen/GlobalISel/IRTranslator.h
  include/llvm/IR/IRBuilder.h
  include/llvm/IR/Instruction.def
  include/llvm/IR/PatternMatch.h
  lib/AsmParser/LLLexer.cpp
  lib/AsmParser/LLParser.cpp
  lib/AsmParser/LLToken.h
  lib/Bitcode/Reader/BitcodeReader.cpp
  lib/Bitcode/Writer/BitcodeWriter.cpp
  lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
  lib/CodeGen/SelectionDAG/SelectionDAGBuilder.h
  lib/CodeGen/TargetLoweringBase.cpp
  lib/IR/ConstantFold.cpp
  lib/IR/Core.cpp
  lib/IR/Instruction.cpp
  lib/IR/Instructions.cpp
  lib/IR/Verifier.cpp
  test/Bindings/llvm-c/freeze.ll
  test/Bitcode/compatibility.ll
  tools/llvm-c-test/echo.cpp

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D29011.224745.patch Type: text/x-patch Size: 25350 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Sat Oct 12 08:35:09 2019 From: llvm-commits at lists.llvm.org (Roman Lebedev via llvm-commits) Date: Sat, 12 Oct 2019 15:35:09 -0000 Subject: [llvm] r374660 - [NFC][LoopIdiom] Move one bcmp test into the proper place Message-ID: <20191012153509.3C07380AFF@lists.llvm.org> Author: lebedevri Date: Sat Oct 12 08:35:09 2019 New Revision: 374660 URL: http://llvm.org/viewvc/llvm-project?rev=374660&view=rev Log: [NFC][LoopIdiom] Move one bcmp test into the proper place Modified: llvm/trunk/test/Transforms/LoopIdiom/bcmp-basic.ll llvm/trunk/test/Transforms/LoopIdiom/bcmp-negative-tests.ll Modified: llvm/trunk/test/Transforms/LoopIdiom/bcmp-basic.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/LoopIdiom/bcmp-basic.ll?rev=374660&r1=374659&r2=374660&view=diff ============================================================================== --- llvm/trunk/test/Transforms/LoopIdiom/bcmp-basic.ll (original) +++ llvm/trunk/test/Transforms/LoopIdiom/bcmp-basic.ll Sat Oct 12 08:35:09 2019 @@ -1894,3 +1894,50 @@ cleanup4: %res = phi i1 [ true, %entry ], [ true, %for.inc ], [ false, %for.body ] ret i1 %res } + +define i1 @exit_block_is_not_dedicated(i8* %ptr0, i8* %ptr1) { +; CHECK-LABEL: @exit_block_is_not_dedicated( +; CHECK-NEXT: entry: +; CHECK-NEXT: br i1 true, label [[FOR_BODY_PREHEADER:%.*]], label [[CLEANUP:%.*]] +; CHECK: for.body.preheader: +; CHECK-NEXT: br label [[FOR_BODY:%.*]] +; CHECK: for.body: +; CHECK-NEXT: [[I_08:%.*]] = phi i64 [ [[INC:%.*]], [[FOR_COND:%.*]] ], [ 0, [[FOR_BODY_PREHEADER]] ] +; CHECK-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i8, i8* [[PTR0:%.*]], i64 [[I_08]] +; CHECK-NEXT: [[V0:%.*]] = load i8, i8* [[ARRAYIDX]] +; CHECK-NEXT: [[ARRAYIDX1:%.*]] = getelementptr inbounds i8, i8* [[PTR1:%.*]], i64 [[I_08]] +; CHECK-NEXT: [[V1:%.*]] = load i8, i8* [[ARRAYIDX1]] +; CHECK-NEXT: [[CMP3:%.*]] = icmp eq i8 
[[V0]], [[V1]] +; CHECK-NEXT: [[INC]] = add nuw nsw i64 [[I_08]], 1 +; CHECK-NEXT: br i1 [[CMP3]], label [[FOR_COND]], label [[CLEANUP_LOOPEXIT:%.*]] +; CHECK: for.cond: +; CHECK-NEXT: [[CMP:%.*]] = icmp ult i64 [[INC]], 8 +; CHECK-NEXT: br i1 [[CMP]], label [[FOR_BODY]], label [[CLEANUP_LOOPEXIT]] +; CHECK: cleanup.loopexit: +; CHECK-NEXT: [[RES_PH:%.*]] = phi i1 [ true, [[FOR_COND]] ], [ false, [[FOR_BODY]] ] +; CHECK-NEXT: br label [[CLEANUP]] +; CHECK: cleanup: +; CHECK-NEXT: [[RES:%.*]] = phi i1 [ false, [[ENTRY:%.*]] ], [ [[RES_PH]], [[CLEANUP_LOOPEXIT]] ] +; CHECK-NEXT: ret i1 [[RES]] +; +entry: + br i1 true, label %for.body, label %cleanup + +for.body: + %i.08 = phi i64 [ 0, %entry ], [ %inc, %for.cond ] + %arrayidx = getelementptr inbounds i8, i8* %ptr0, i64 %i.08 + %v0 = load i8, i8* %arrayidx + %arrayidx1 = getelementptr inbounds i8, i8* %ptr1, i64 %i.08 + %v1 = load i8, i8* %arrayidx1 + %cmp3 = icmp eq i8 %v0, %v1 + %inc = add nuw nsw i64 %i.08, 1 + br i1 %cmp3, label %for.cond, label %cleanup + +for.cond: + %cmp = icmp ult i64 %inc, 8 + br i1 %cmp, label %for.body, label %cleanup + +cleanup: + %res = phi i1 [ false, %for.body ], [ true, %for.cond ], [ false, %entry ] + ret i1 %res +} Modified: llvm/trunk/test/Transforms/LoopIdiom/bcmp-negative-tests.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/LoopIdiom/bcmp-negative-tests.ll?rev=374660&r1=374659&r2=374660&view=diff ============================================================================== --- llvm/trunk/test/Transforms/LoopIdiom/bcmp-negative-tests.ll (original) +++ llvm/trunk/test/Transforms/LoopIdiom/bcmp-negative-tests.ll Sat Oct 12 08:35:09 2019 @@ -59,29 +59,6 @@ cleanup: ret i1 %res } -define i1 @exit_block_is_not_dedicated(i8* %ptr0, i8* %ptr1) { -entry: - br i1 true, label %for.body, label %cleanup - -for.body: - %i.08 = phi i64 [ 0, %entry ], [ %inc, %for.cond ] - %arrayidx = getelementptr inbounds i8, i8* %ptr0, i64 %i.08 - %v0 = load i8, i8* %arrayidx - 
%arrayidx1 = getelementptr inbounds i8, i8* %ptr1, i64 %i.08 - %v1 = load i8, i8* %arrayidx1 - %cmp3 = icmp eq i8 %v0, %v1 - %inc = add nuw nsw i64 %i.08, 1 - br i1 %cmp3, label %for.cond, label %cleanup - -for.cond: - %cmp = icmp ult i64 %inc, 8 - br i1 %cmp, label %for.body, label %cleanup - -cleanup: - %res = phi i1 [ false, %for.body ], [ true, %for.cond ], [ false, %entry ] - ret i1 %res -} - define i1 @body_cmp_is_not_equality(i8* %ptr0, i8* %ptr1) { entry: br label %for.body From llvm-commits at lists.llvm.org Sat Oct 12 08:35:16 2019 From: llvm-commits at lists.llvm.org (Roman Lebedev via llvm-commits) Date: Sat, 12 Oct 2019 15:35:16 -0000 Subject: [llvm] r374661 - [NFC][LoopIdiom] Add bcmp loop idiom miscompile test from PR43206. Message-ID: <20191012153516.B7B6888A07@lists.llvm.org> Author: lebedevri Date: Sat Oct 12 08:35:16 2019 New Revision: 374661 URL: http://llvm.org/viewvc/llvm-project?rev=374661&view=rev Log: [NFC][LoopIdiom] Add bcmp loop idiom miscompile test from PR43206. The transform forgot to check SCEV loop scopes. 
https://bugs.llvm.org/show_bug.cgi?id=43206 Modified: llvm/trunk/test/Transforms/LoopIdiom/bcmp-negative-tests.ll Modified: llvm/trunk/test/Transforms/LoopIdiom/bcmp-negative-tests.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/LoopIdiom/bcmp-negative-tests.ll?rev=374661&r1=374660&r2=374661&view=diff ============================================================================== --- llvm/trunk/test/Transforms/LoopIdiom/bcmp-negative-tests.ll (original) +++ llvm/trunk/test/Transforms/LoopIdiom/bcmp-negative-tests.ll Sat Oct 12 08:35:16 2019 @@ -1,9 +1,7 @@ ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py -; RUN: opt -loop-idiom < %s -S | FileCheck %s +; RUN: opt -loop-idiom -verify -verify-each -verify-dom-info -verify-loop-info < %s -S | FileCheck %s --implicit-check-not=bcmp --implicit-check-not=memcmp ; CHECK: source_filename -; CHECK-NOT; bcmp -; CHECK-NOT; memcmp target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64" @@ -478,3 +476,56 @@ cleanup: %res = phi i1 [ false, %for.body ], [ true, %for.inc ] ret i1 %res } + +; See https://bugs.llvm.org/show_bug.cgi?id=43206 for original reduced (but runnable) test: +; +; bool do_check(int i_max, int j_max, int**bptr, int* fillch) { +; for (int i = 0; i < i_max; i++) +; for (int j = 0; j < j_max; j++) +; if (bptr[i][j] != fillch[i]) +; return 1; +; return 0; +; } +; The loads proceed differently here - fillch[i] changes once per outer loop, +; while bptr[i][j] changes both in inner loop, and in outer loop. 
+define i1 @pr43206_different_loops(i32 %i_max, i32 %j_max, i32** %bptr, i32* %fillch) { +entry: + %cmp31 = icmp sgt i32 %i_max, 0 + %cmp229 = icmp sgt i32 %j_max, 0 + %or.cond = and i1 %cmp31, %cmp229 + br i1 %or.cond, label %for.cond1.preheader.us.preheader, label %cleanup12 + +for.cond1.preheader.us.preheader: ; preds = %entry + %wide.trip.count38 = zext i32 %i_max to i64 + %wide.trip.count = zext i32 %j_max to i64 + br label %for.cond1.preheader.us + +for.cond1.preheader.us: ; preds = %for.cond1.for.inc10_crit_edge.us, %for.cond1.preheader.us.preheader + %indvars.iv36 = phi i64 [ 0, %for.cond1.preheader.us.preheader ], [ %indvars.iv.next37, %for.cond1.for.inc10_crit_edge.us ] + %arrayidx.us = getelementptr inbounds i32*, i32** %bptr, i64 %indvars.iv36 + %v0 = load i32*, i32** %arrayidx.us, align 8 + %arrayidx8.us = getelementptr inbounds i32, i32* %fillch, i64 %indvars.iv36 + %v1 = load i32, i32* %arrayidx8.us, align 4 + br label %for.body4.us + +for.cond1.us: ; preds = %for.body4.us + %exitcond = icmp eq i64 %indvars.iv.next, %wide.trip.count + br i1 %exitcond, label %for.cond1.for.inc10_crit_edge.us, label %for.body4.us + +for.body4.us: ; preds = %for.cond1.us, %for.cond1.preheader.us + %indvars.iv = phi i64 [ 0, %for.cond1.preheader.us ], [ %indvars.iv.next, %for.cond1.us ] + %arrayidx6.us = getelementptr inbounds i32, i32* %v0, i64 %indvars.iv + %v2 = load i32, i32* %arrayidx6.us, align 4 + %cmp9.us = icmp eq i32 %v2, %v1 + %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1 + br i1 %cmp9.us, label %for.cond1.us, label %cleanup12 + +for.cond1.for.inc10_crit_edge.us: ; preds = %for.cond1.us + %indvars.iv.next37 = add nuw nsw i64 %indvars.iv36, 1 + %exitcond39 = icmp eq i64 %indvars.iv.next37, %wide.trip.count38 + br i1 %exitcond39, label %cleanup12, label %for.cond1.preheader.us + +cleanup12: ; preds = %for.cond1.for.inc10_crit_edge.us, %for.body4.us, %entry + %v3 = phi i1 [ false, %entry ], [ true, %for.body4.us ], [ false, %for.cond1.for.inc10_crit_edge.us ] 
+ ret i1 %v3 +} From llvm-commits at lists.llvm.org Sat Oct 12 08:35:32 2019 From: llvm-commits at lists.llvm.org (Roman Lebedev via llvm-commits) Date: Sat, 12 Oct 2019 15:35:32 -0000 Subject: [llvm] r374662 - [LoopIdiomRecognize] Recommit: BCmp loop idiom recognition Message-ID: <20191012153532.C4E528AB60@lists.llvm.org> Author: lebedevri Date: Sat Oct 12 08:35:32 2019 New Revision: 374662 URL: http://llvm.org/viewvc/llvm-project?rev=374662&view=rev Log: [LoopIdiomRecognize] Recommit: BCmp loop idiom recognition Summary: This is a recommit; this originally landed in rL370454 but was subsequently reverted in rL370788 due to https://bugs.llvm.org/show_bug.cgi?id=43206 The reduced testcase was added to bcmp-negative-tests.ll as @pr43206_different_loops - we must ensure that the SCEVs we got are both for the same loop we are currently investigating. Original commit message: @mclow.lists brought this issue up in IRC. It is a reasonably common problem to compare two values for equality. Those may be just some integers, strings or arrays of integers. In C, there are the `memcmp()` and `bcmp()` functions. In C++, there is the `std::equal()` algorithm. One can also write that function manually. libstdc++'s `std::equal()` is specialized to directly call `memcmp()` for various types, but not `std::byte` from C++2a. https://godbolt.org/z/mx2ejJ libc++ does not do anything like that, it simply relies on simple C++'s `operator==()`. https://godbolt.org/z/er0Zwf (GOOD!) So likely, there are certain performance opportunities. Let's compare performance of naive `std::equal()` (no `memcmp()`) with one that is using `memcmp()` (in this case, compiled with a modified compiler). 
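To make the two variants concrete, here is a rough sketch of the loop shape in question and of the single-call form it can be replaced with. The helper names `naive_equal`, `memcmp_equal` and `checked_equal` are made up for illustration and are not part of the patch or of the benchmark; the sketch also assumes plain integer element types, where element equality and bytewise equality coincide.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical helper: element-by-element equality loop, i.e. the "naive"
// variant with one load from each buffer and an early exit on mismatch.
template <class T>
bool naive_equal(const T *a, const T *a_end, const T *b) noexcept {
  for (; a != a_end; ++a, ++b) {
    if (*a != *b)
      return false;
  }
  return true;
}

// Hypothetical helper: the same answer from one memcmp() call over the whole
// byte range. Restricted to integer types, where comparing bytes is the same
// as comparing values.
template <class T>
bool memcmp_equal(const T *a, const T *a_end, const T *b) noexcept {
  static_assert(std::is_integral<T>::value,
                "bytewise comparison is only assumed valid for integers");
  const std::size_t n_bytes = static_cast<std::size_t>(a_end - a) * sizeof(T);
  return n_bytes == 0 || std::memcmp(a, b, n_bytes) == 0;
}

// Hypothetical helper: runs both forms and asserts that they agree.
template <class T>
bool checked_equal(const T *a, const T *a_end, const T *b) noexcept {
  const bool expected = naive_equal(a, a_end, b);
  assert(memcmp_equal(a, a_end, b) == expected);
  return expected;
}
```

The byte count passed to `memcmp()` is the element count times `sizeof(T)`, which is why the loop must step by exactly the size of the compared element for the rewrite to be valid.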
{F8768213} ``` #include #include #include #include #include #include #include #include #include #include "benchmark/benchmark.h" template bool equal(T* a, T* a_end, T* b) noexcept { for (; a != a_end; ++a, ++b) { if (*a != *b) return false; } return true; } template std::vector getVectorOfRandomNumbers(size_t count) { std::random_device rd; std::mt19937 gen(rd()); std::uniform_int_distribution dis(std::numeric_limits::min(), std::numeric_limits::max()); std::vector v; v.reserve(count); std::generate_n(std::back_inserter(v), count, [&dis, &gen]() { return dis(gen); }); assert(v.size() == count); return v; } struct Identical { template static std::pair, std::vector> Gen(size_t count) { auto Tmp = getVectorOfRandomNumbers(count); return std::make_pair(Tmp, std::move(Tmp)); } }; struct InequalHalfway { template static std::pair, std::vector> Gen(size_t count) { auto V0 = getVectorOfRandomNumbers(count); auto V1 = V0; V1[V1.size() / size_t(2)]++; // just change the value. return std::make_pair(std::move(V0), std::move(V1)); } }; template void BM_bcmp(benchmark::State& state) { const size_t Length = state.range(0); const std::pair, std::vector> Data = Gen::template Gen(Length); const std::vector& a = Data.first; const std::vector& b = Data.second; assert(a.size() == Length && b.size() == a.size()); benchmark::ClobberMemory(); benchmark::DoNotOptimize(a); benchmark::DoNotOptimize(a.data()); benchmark::DoNotOptimize(b); benchmark::DoNotOptimize(b.data()); for (auto _ : state) { const bool is_equal = equal(a.data(), a.data() + a.size(), b.data()); benchmark::DoNotOptimize(is_equal); } state.SetComplexityN(Length); state.counters["eltcnt"] = benchmark::Counter(Length, benchmark::Counter::kIsIterationInvariant); state.counters["eltcnt/sec"] = benchmark::Counter(Length, benchmark::Counter::kIsIterationInvariantRate); const size_t BytesRead = 2 * sizeof(T) * Length; state.counters["bytes_read/iteration"] = benchmark::Counter(BytesRead, benchmark::Counter::kDefaults, 
benchmark::Counter::OneK::kIs1024); state.counters["bytes_read/sec"] = benchmark::Counter( BytesRead, benchmark::Counter::kIsIterationInvariantRate, benchmark::Counter::OneK::kIs1024); } template static void CustomArguments(benchmark::internal::Benchmark* b) { const size_t L2SizeBytes = []() { for (const benchmark::CPUInfo::CacheInfo& I : benchmark::CPUInfo::Get().caches) { if (I.level == 2) return I.size; } return 0; }(); // What is the largest range we can check to always fit within given L2 cache? const size_t MaxLen = L2SizeBytes / /*total bufs*/ 2 / /*maximal elt size*/ sizeof(T) / /*safety margin*/ 2; b->RangeMultiplier(2)->Range(1, MaxLen)->Complexity(benchmark::oN); } BENCHMARK_TEMPLATE(BM_bcmp, uint8_t, Identical) ->Apply(CustomArguments); BENCHMARK_TEMPLATE(BM_bcmp, uint16_t, Identical) ->Apply(CustomArguments); BENCHMARK_TEMPLATE(BM_bcmp, uint32_t, Identical) ->Apply(CustomArguments); BENCHMARK_TEMPLATE(BM_bcmp, uint64_t, Identical) ->Apply(CustomArguments); BENCHMARK_TEMPLATE(BM_bcmp, uint8_t, InequalHalfway) ->Apply(CustomArguments); BENCHMARK_TEMPLATE(BM_bcmp, uint16_t, InequalHalfway) ->Apply(CustomArguments); BENCHMARK_TEMPLATE(BM_bcmp, uint32_t, InequalHalfway) ->Apply(CustomArguments); BENCHMARK_TEMPLATE(BM_bcmp, uint64_t, InequalHalfway) ->Apply(CustomArguments); ``` {F8768210} ``` $ ~/src/googlebenchmark/tools/compare.py --no-utest benchmarks build-{old,new}/test/llvm-bcmp-bench RUNNING: build-old/test/llvm-bcmp-bench --benchmark_out=/tmp/tmpb6PEUx 2019-04-25 21:17:11 Running build-old/test/llvm-bcmp-bench Run on (8 X 4000 MHz CPU s) CPU Caches: L1 Data 16K (x8) L1 Instruction 64K (x4) L2 Unified 2048K (x4) L3 Unified 8192K (x1) Load Average: 0.65, 3.90, 4.14 --------------------------------------------------------------------------------------------------- Benchmark Time CPU Iterations UserCounters... 
--------------------------------------------------------------------------------------------------- <...> BM_bcmp/512000 432131 ns 432101 ns 1613 bytes_read/iteration=1000k bytes_read/sec=2.20706G/s eltcnt=825.856M eltcnt/sec=1.18491G/s BM_bcmp_BigO 0.86 N 0.86 N BM_bcmp_RMS 8 % 8 % <...> BM_bcmp/256000 161408 ns 161409 ns 4027 bytes_read/iteration=1000k bytes_read/sec=5.90843G/s eltcnt=1030.91M eltcnt/sec=1.58603G/s BM_bcmp_BigO 0.67 N 0.67 N BM_bcmp_RMS 25 % 25 % <...> BM_bcmp/128000 81497 ns 81488 ns 8415 bytes_read/iteration=1000k bytes_read/sec=11.7032G/s eltcnt=1077.12M eltcnt/sec=1.57078G/s BM_bcmp_BigO 0.71 N 0.71 N BM_bcmp_RMS 42 % 42 % <...> BM_bcmp/64000 50138 ns 50138 ns 10909 bytes_read/iteration=1000k bytes_read/sec=19.0209G/s eltcnt=698.176M eltcnt/sec=1.27647G/s BM_bcmp_BigO 0.84 N 0.84 N BM_bcmp_RMS 27 % 27 % <...> BM_bcmp/512000 192405 ns 192392 ns 3638 bytes_read/iteration=1000k bytes_read/sec=4.95694G/s eltcnt=1.86266G eltcnt/sec=2.66124G/s BM_bcmp_BigO 0.38 N 0.38 N BM_bcmp_RMS 3 % 3 % <...> BM_bcmp/256000 127858 ns 127860 ns 5477 bytes_read/iteration=1000k bytes_read/sec=7.45873G/s eltcnt=1.40211G eltcnt/sec=2.00219G/s BM_bcmp_BigO 0.50 N 0.50 N BM_bcmp_RMS 0 % 0 % <...> BM_bcmp/128000 49140 ns 49140 ns 14281 bytes_read/iteration=1000k bytes_read/sec=19.4072G/s eltcnt=1.82797G eltcnt/sec=2.60478G/s BM_bcmp_BigO 0.40 N 0.40 N BM_bcmp_RMS 18 % 18 % <...> BM_bcmp/64000 32101 ns 32099 ns 21786 bytes_read/iteration=1000k bytes_read/sec=29.7101G/s eltcnt=1.3943G eltcnt/sec=1.99381G/s BM_bcmp_BigO 0.50 N 0.50 N BM_bcmp_RMS 1 % 1 % RUNNING: build-new/test/llvm-bcmp-bench --benchmark_out=/tmp/tmpQ46PP0 2019-04-25 21:19:29 Running build-new/test/llvm-bcmp-bench Run on (8 X 4000 MHz CPU s) CPU Caches: L1 Data 16K (x8) L1 Instruction 64K (x4) L2 Unified 2048K (x4) L3 Unified 8192K (x1) Load Average: 1.01, 2.85, 3.71 --------------------------------------------------------------------------------------------------- Benchmark Time CPU Iterations 
UserCounters... --------------------------------------------------------------------------------------------------- <...> BM_bcmp/512000 18593 ns 18590 ns 37565 bytes_read/iteration=1000k bytes_read/sec=51.2991G/s eltcnt=19.2333G eltcnt/sec=27.541G/s BM_bcmp_BigO 0.04 N 0.04 N BM_bcmp_RMS 37 % 37 % <...> BM_bcmp/256000 18950 ns 18948 ns 37223 bytes_read/iteration=1000k bytes_read/sec=50.3324G/s eltcnt=9.52909G eltcnt/sec=13.511G/s BM_bcmp_BigO 0.08 N 0.08 N BM_bcmp_RMS 34 % 34 % <...> BM_bcmp/128000 18627 ns 18627 ns 37895 bytes_read/iteration=1000k bytes_read/sec=51.198G/s eltcnt=4.85056G eltcnt/sec=6.87168G/s BM_bcmp_BigO 0.16 N 0.16 N BM_bcmp_RMS 35 % 35 % <...> BM_bcmp/64000 18855 ns 18855 ns 37458 bytes_read/iteration=1000k bytes_read/sec=50.5791G/s eltcnt=2.39731G eltcnt/sec=3.3943G/s BM_bcmp_BigO 0.32 N 0.32 N BM_bcmp_RMS 33 % 33 % <...> BM_bcmp/512000 9570 ns 9569 ns 73500 bytes_read/iteration=1000k bytes_read/sec=99.6601G/s eltcnt=37.632G eltcnt/sec=53.5046G/s BM_bcmp_BigO 0.02 N 0.02 N BM_bcmp_RMS 29 % 29 % <...> BM_bcmp/256000 9547 ns 9547 ns 74343 bytes_read/iteration=1000k bytes_read/sec=99.8971G/s eltcnt=19.0318G eltcnt/sec=26.8159G/s BM_bcmp_BigO 0.04 N 0.04 N BM_bcmp_RMS 29 % 29 % <...> BM_bcmp/128000 9396 ns 9394 ns 73521 bytes_read/iteration=1000k bytes_read/sec=101.518G/s eltcnt=9.41069G eltcnt/sec=13.6255G/s BM_bcmp_BigO 0.08 N 0.08 N BM_bcmp_RMS 30 % 30 % <...> BM_bcmp/64000 9499 ns 9498 ns 73802 bytes_read/iteration=1000k bytes_read/sec=100.405G/s eltcnt=4.72333G eltcnt/sec=6.73808G/s BM_bcmp_BigO 0.16 N 0.16 N BM_bcmp_RMS 28 % 28 % Comparing build-old/test/llvm-bcmp-bench to build-new/test/llvm-bcmp-bench Benchmark Time CPU Time Old Time New CPU Old CPU New --------------------------------------------------------------------------------------------------------------------------------------- <...> BM_bcmp/512000 -0.9570 -0.9570 432131 18593 432101 18590 <...> BM_bcmp/256000 -0.8826 -0.8826 161408 18950 161409 18948 <...> BM_bcmp/128000 -0.7714 
-0.7714 81497 18627 81488 18627 <...> BM_bcmp/64000 -0.6239 -0.6239 50138 18855 50138 18855 <...> BM_bcmp/512000 -0.9503 -0.9503 192405 9570 192392 9569 <...> BM_bcmp/256000 -0.9253 -0.9253 127858 9547 127860 9547 <...> BM_bcmp/128000 -0.8088 -0.8088 49140 9396 49140 9394 <...> BM_bcmp/64000 -0.7041 -0.7041 32101 9499 32099 9498 ``` What can we tell from the benchmark? * Performance of the naive equality check somewhat improves with element size, maxing out at eltcnt/sec=1.58603G/s for uint16_t, or bytes_read/sec=19.0209G/s for uint64_t. I think that instability implies performance problems. * Performance of the `memcmp()`-aware benchmark always maxes out at around bytes_read/sec=51.2991G/s for every type. That is 2.6x the throughput of the naive variant! * The eltcnt/sec metric for the `memcmp()`-aware benchmark maxes out at eltcnt/sec=27.541G/s for uint8_t (was: eltcnt/sec=1.18491G/s, so 24x) and linearly decreases with element size. For uint64_t, it's ~4x+ the elements/second. * The call is obviously more expensive than the loop for small element counts. As can be seen from the full output {F8768210}, the `memcmp()` is almost universally worse, independent of the element size (and thus buffer size) when the element count is less than 8. So all in all, the bcmp idiom does indeed offer untapped performance headroom. This diff does implement said idiom recognition. I think a reasonable test coverage is present, but do tell if there is anything obviously missing. Now, quality. This does succeed to build and pass the test-suite, at least without any non-bundled elements. 
{F8768216} {F8768217} This transform fires 91 times: ``` $ /build/test-suite/utils/compare.py -m loop-idiom.NumBCmp result-new.json Tests: 1149 Metric: loop-idiom.NumBCmp Program result-new MultiSourc...Benchmarks/7zip/7zip-benchmark 79.00 MultiSource/Applications/d/make_dparser 3.00 SingleSource/UnitTests/vla 2.00 MultiSource/Applications/Burg/burg 1.00 MultiSourc.../Applications/JM/lencod/lencod 1.00 MultiSource/Applications/lemon/lemon 1.00 MultiSource/Benchmarks/Bullet/bullet 1.00 MultiSourc...e/Benchmarks/MallocBench/gs/gs 1.00 MultiSourc...gs-C/TimberWolfMC/timberwolfmc 1.00 MultiSourc...Prolangs-C/simulator/simulator 1.00 ``` The size changes are: I'm not sure what's going on with SingleSource/UnitTests/vla.test yet, did not look. ``` $ /build/test-suite/utils/compare.py -m size..text result-{old,new}.json --filter-hash Tests: 1149 Same hash: 907 (filtered out) Remaining: 242 Metric: size..text Program result-old result-new diff test-suite...ingleSource/UnitTests/vla.test 753.00 833.00 10.6% test-suite...marks/7zip/7zip-benchmark.test 1001697.00 966657.00 -3.5% test-suite...ngs-C/simulator/simulator.test 32369.00 32321.00 -0.1% test-suite...plications/d/make_dparser.test 89585.00 89505.00 -0.1% test-suite...ce/Applications/Burg/burg.test 40817.00 40785.00 -0.1% test-suite.../Applications/lemon/lemon.test 47281.00 47249.00 -0.1% test-suite...TimberWolfMC/timberwolfmc.test 250065.00 250113.00 0.0% test-suite...chmarks/MallocBench/gs/gs.test 149889.00 149873.00 -0.0% test-suite...ications/JM/lencod/lencod.test 769585.00 769569.00 -0.0% test-suite.../Benchmarks/Bullet/bullet.test 770049.00 770049.00 0.0% test-suite...HMARK_ANISTROPIC_DIFFUSION/128 NaN NaN nan% test-suite...HMARK_ANISTROPIC_DIFFUSION/256 NaN NaN nan% test-suite...CHMARK_ANISTROPIC_DIFFUSION/64 NaN NaN nan% test-suite...CHMARK_ANISTROPIC_DIFFUSION/32 NaN NaN nan% test-suite...ENCHMARK_BILATERAL_FILTER/64/4 NaN NaN nan% Geomean difference nan% result-old result-new diff count 1.000000e+01 10.00000 
10.000000 mean 3.152090e+05 311695.40000 0.006749 std 3.790398e+05 372091.42232 0.036605 min 7.530000e+02 833.00000 -0.034981 25% 4.243300e+04 42401.00000 -0.000866 50% 1.197370e+05 119689.00000 -0.000392 75% 6.397050e+05 639705.00000 -0.000005 max 1.001697e+06 966657.00000 0.106242 ``` I don't have timings though. And now to the code. The basic idea is to completely replace the whole loop. If we can't fully kill it, don't transform. I have left one or two comments in the code, so hopefully it can be understood. Also, there are a few TODOs that I have left for follow-ups: * widening of `memcmp()`/`bcmp()` * step smaller than the comparison size * Metadata propagation * more than two blocks as long as there is still a single backedge? * ??? Reviewers: reames, fhahn, mkazantsev, chandlerc, craig.topper, courbet Reviewed By: courbet Subscribers: miyuki, hiraditya, xbolva00, nikic, jfb, gchatelet, courbet, llvm-commits, mclow.lists Tags: #llvm Differential Revision: https://reviews.llvm.org/D61144 Modified: llvm/trunk/docs/ReleaseNotes.rst llvm/trunk/lib/Transforms/Scalar/LoopIdiomRecognize.cpp llvm/trunk/test/Transforms/LoopIdiom/bcmp-basic.ll llvm/trunk/test/Transforms/LoopIdiom/bcmp-debugify-remarks.ll llvm/trunk/test/Transforms/LoopIdiom/bcmp-widening.ll Modified: llvm/trunk/docs/ReleaseNotes.rst URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/docs/ReleaseNotes.rst?rev=374662&r1=374661&r2=374662&view=diff ============================================================================== --- llvm/trunk/docs/ReleaseNotes.rst (original) +++ llvm/trunk/docs/ReleaseNotes.rst Sat Oct 12 08:35:32 2019 @@ -66,6 +66,9 @@ Non-comprehensive list of changes in thi Undefined Behaviour Sanitizer ``-fsanitize=pointer-overflow`` check will now catch such cases. +* The Loop Idiom Recognition (``-loop-idiom``) pass has learned to recognize + ``bcmp`` pattern, and convert it into a call to ``bcmp`` (or ``memcmp``) + function. 
Changes to the LLVM IR ---------------------- Modified: llvm/trunk/lib/Transforms/Scalar/LoopIdiomRecognize.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/LoopIdiomRecognize.cpp?rev=374662&r1=374661&r2=374662&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Scalar/LoopIdiomRecognize.cpp (original) +++ llvm/trunk/lib/Transforms/Scalar/LoopIdiomRecognize.cpp Sat Oct 12 08:35:32 2019 @@ -41,6 +41,7 @@ #include "llvm/ADT/ArrayRef.h" #include "llvm/ADT/DenseMap.h" #include "llvm/ADT/MapVector.h" +#include "llvm/ADT/STLExtras.h" #include "llvm/ADT/SetVector.h" #include "llvm/ADT/SmallPtrSet.h" #include "llvm/ADT/SmallVector.h" @@ -77,16 +78,20 @@ #include "llvm/IR/LLVMContext.h" #include "llvm/IR/Module.h" #include "llvm/IR/PassManager.h" +#include "llvm/IR/PatternMatch.h" #include "llvm/IR/Type.h" #include "llvm/IR/User.h" #include "llvm/IR/Value.h" #include "llvm/IR/ValueHandle.h" +#include "llvm/IR/Verifier.h" #include "llvm/Pass.h" #include "llvm/Support/Casting.h" #include "llvm/Support/CommandLine.h" #include "llvm/Support/Debug.h" #include "llvm/Support/raw_ostream.h" #include "llvm/Transforms/Scalar.h" +#include "llvm/Transforms/Scalar/LoopPassManager.h" +#include "llvm/Transforms/Utils/BasicBlockUtils.h" #include "llvm/Transforms/Utils/BuildLibCalls.h" #include "llvm/Transforms/Utils/Local.h" #include "llvm/Transforms/Utils/LoopUtils.h" @@ -102,6 +107,7 @@ using namespace llvm; STATISTIC(NumMemSet, "Number of memset's formed from loop stores"); STATISTIC(NumMemCpy, "Number of memcpy's formed from loop load+stores"); +STATISTIC(NumBCmp, "Number of memcmp's formed from loop 2xload+eq-compare"); static cl::opt UseLIRCodeSizeHeurs( "use-lir-code-size-heurs", @@ -111,6 +117,26 @@ static cl::opt UseLIRCodeSizeHeurs namespace { +// FIXME: reinventing the wheel much? Is there a cleaner solution? 
+struct PMAbstraction { + virtual void markLoopAsDeleted(Loop *L) = 0; + virtual ~PMAbstraction() = default; +}; +struct LegacyPMAbstraction : PMAbstraction { + LPPassManager &LPM; + LegacyPMAbstraction(LPPassManager &LPM) : LPM(LPM) {} + virtual ~LegacyPMAbstraction() = default; + void markLoopAsDeleted(Loop *L) override { LPM.markLoopAsDeleted(*L); } +}; +struct NewPMAbstraction : PMAbstraction { + LPMUpdater &Updater; + NewPMAbstraction(LPMUpdater &Updater) : Updater(Updater) {} + virtual ~NewPMAbstraction() = default; + void markLoopAsDeleted(Loop *L) override { + Updater.markLoopAsDeleted(*L, L->getName()); + } +}; + class LoopIdiomRecognize { Loop *CurLoop = nullptr; AliasAnalysis *AA; @@ -120,6 +146,7 @@ class LoopIdiomRecognize { TargetLibraryInfo *TLI; const TargetTransformInfo *TTI; const DataLayout *DL; + PMAbstraction &LoopDeleter; OptimizationRemarkEmitter &ORE; bool ApplyCodeSizeHeuristics; @@ -128,9 +155,10 @@ public: LoopInfo *LI, ScalarEvolution *SE, TargetLibraryInfo *TLI, const TargetTransformInfo *TTI, - const DataLayout *DL, + const DataLayout *DL, PMAbstraction &LoopDeleter, OptimizationRemarkEmitter &ORE) - : AA(AA), DT(DT), LI(LI), SE(SE), TLI(TLI), TTI(TTI), DL(DL), ORE(ORE) {} + : AA(AA), DT(DT), LI(LI), SE(SE), TLI(TLI), TTI(TTI), DL(DL), + LoopDeleter(LoopDeleter), ORE(ORE) {} bool runOnLoop(Loop *L); @@ -144,6 +172,8 @@ private: bool HasMemset; bool HasMemsetPattern; bool HasMemcpy; + bool HasMemCmp; + bool HasBCmp; /// Return code for isLegalStore() enum LegalStoreKind { @@ -186,6 +216,32 @@ private: bool runOnNoncountableLoop(); + struct CmpLoopStructure { + Value *BCmpValue, *LatchCmpValue; + BasicBlock *HeaderBrEqualBB, *HeaderBrUnequalBB; + BasicBlock *LatchBrFinishBB, *LatchBrContinueBB; + }; + bool matchBCmpLoopStructure(CmpLoopStructure &CmpLoop) const; + struct CmpOfLoads { + ICmpInst::Predicate BCmpPred; + Value *LoadSrcA, *LoadSrcB; + Value *LoadA, *LoadB; + }; + bool matchBCmpOfLoads(Value *BCmpValue, CmpOfLoads &CmpOfLoads) 
const; + bool recognizeBCmpLoopControlFlow(const CmpOfLoads &CmpOfLoads, + CmpLoopStructure &CmpLoop) const; + bool recognizeBCmpLoopSCEV(uint64_t BCmpTyBytes, CmpOfLoads &CmpOfLoads, + const SCEV *&SrcA, const SCEV *&SrcB, + const SCEV *&Iterations) const; + bool detectBCmpIdiom(ICmpInst *&BCmpInst, CmpInst *&LatchCmpInst, + LoadInst *&LoadA, LoadInst *&LoadB, const SCEV *&SrcA, + const SCEV *&SrcB, const SCEV *&NBytes) const; + BasicBlock *transformBCmpControlFlow(ICmpInst *ComparedEqual); + void transformLoopToBCmp(ICmpInst *BCmpInst, CmpInst *LatchCmpInst, + LoadInst *LoadA, LoadInst *LoadB, const SCEV *SrcA, + const SCEV *SrcB, const SCEV *NBytes); + bool recognizeBCmp(); + bool recognizePopcount(); void transformLoopToPopcount(BasicBlock *PreCondBB, Instruction *CntInst, PHINode *CntPhi, Value *Var); @@ -223,13 +279,14 @@ public: &getAnalysis().getTTI( *L->getHeader()->getParent()); const DataLayout *DL = &L->getHeader()->getModule()->getDataLayout(); + LegacyPMAbstraction LoopDeleter(LPM); // For the old PM, we can't use OptimizationRemarkEmitter as an analysis // pass. Function analyses need to be preserved across loop transformations // but ORE cannot be preserved (see comment before the pass definition). 
OptimizationRemarkEmitter ORE(L->getHeader()->getParent()); - LoopIdiomRecognize LIR(AA, DT, LI, SE, TLI, TTI, DL, ORE); + LoopIdiomRecognize LIR(AA, DT, LI, SE, TLI, TTI, DL, LoopDeleter, ORE); return LIR.runOnLoop(L); } @@ -248,7 +305,7 @@ char LoopIdiomRecognizeLegacyPass::ID = PreservedAnalyses LoopIdiomRecognizePass::run(Loop &L, LoopAnalysisManager &AM, LoopStandardAnalysisResults &AR, - LPMUpdater &) { + LPMUpdater &Updater) { const auto *DL = &L.getHeader()->getModule()->getDataLayout(); const auto &FAM = @@ -262,8 +319,9 @@ PreservedAnalyses LoopIdiomRecognizePass "LoopIdiomRecognizePass: OptimizationRemarkEmitterAnalysis not cached " "at a higher level"); + NewPMAbstraction LoopDeleter(Updater); LoopIdiomRecognize LIR(&AR.AA, &AR.DT, &AR.LI, &AR.SE, &AR.TLI, &AR.TTI, DL, - *ORE); + LoopDeleter, *ORE); if (!LIR.runOnLoop(&L)) return PreservedAnalyses::all(); @@ -300,7 +358,8 @@ bool LoopIdiomRecognize::runOnLoop(Loop // Disable loop idiom recognition if the function's name is a common idiom. StringRef Name = L->getHeader()->getParent()->getName(); - if (Name == "memset" || Name == "memcpy") + if (Name == "memset" || Name == "memcpy" || Name == "memcmp" || + Name == "bcmp") return false; // Determine if code size heuristics need to be applied. 
@@ -310,8 +369,10 @@ bool LoopIdiomRecognize::runOnLoop(Loop HasMemset = TLI->has(LibFunc_memset); HasMemsetPattern = TLI->has(LibFunc_memset_pattern16); HasMemcpy = TLI->has(LibFunc_memcpy); + HasMemCmp = TLI->has(LibFunc_memcmp); + HasBCmp = TLI->has(LibFunc_bcmp); - if (HasMemset || HasMemsetPattern || HasMemcpy) + if (HasMemset || HasMemsetPattern || HasMemcpy || HasMemCmp || HasBCmp) if (SE->hasLoopInvariantBackedgeTakenCount(L)) return runOnCountableLoop(); @@ -1150,7 +1211,7 @@ bool LoopIdiomRecognize::runOnNoncountab << "] Noncountable Loop %" << CurLoop->getHeader()->getName() << "\n"); - return recognizePopcount() || recognizeAndInsertFFS(); + return recognizeBCmp() || recognizePopcount() || recognizeAndInsertFFS(); } /// Check if the given conditional branch is based on the comparison between @@ -1824,3 +1885,804 @@ void LoopIdiomRecognize::transformLoopTo // loop. The loop would otherwise not be deleted even if it becomes empty. SE->forgetLoop(CurLoop); } + +bool LoopIdiomRecognize::matchBCmpLoopStructure( + CmpLoopStructure &CmpLoop) const { + ICmpInst::Predicate BCmpPred; + + // We are looking for the following basic layout: + // PreheaderBB: ; preds = ??? + // <...> + // br label %LoopHeaderBB + // LoopHeaderBB: ; preds = %PreheaderBB,%LoopLatchBB + // <...> + // %BCmpValue = icmp <...> + // br i1 %BCmpValue, label %LoopLatchBB, label %Successor0 + // LoopLatchBB: ; preds = %LoopHeaderBB + // <...> + // %LatchCmpValue = + // br i1 %LatchCmpValue, label %Successor1, label %LoopHeaderBB + // Successor0: ; preds = %LoopHeaderBB + // <...> + // Successor1: ; preds = %LoopLatchBB + // <...> + // + // Successor0 and Successor1 may or may not be the same basic block. + + // Match basic frame-work of this supposedly-comparison loop. 
+ using namespace PatternMatch; + if (!match(CurLoop->getHeader()->getTerminator(), + m_Br(m_CombineAnd(m_ICmp(BCmpPred, m_Value(), m_Value()), + m_Value(CmpLoop.BCmpValue)), + CmpLoop.HeaderBrEqualBB, CmpLoop.HeaderBrUnequalBB)) || + !match(CurLoop->getLoopLatch()->getTerminator(), + m_Br(m_CombineAnd(m_Cmp(), m_Value(CmpLoop.LatchCmpValue)), + CmpLoop.LatchBrFinishBB, CmpLoop.LatchBrContinueBB))) { + LLVM_DEBUG(dbgs() << "Basic control-flow layout unrecognized.\n"); + return false; + } + LLVM_DEBUG(dbgs() << "Recognized basic control-flow layout.\n"); + return true; +} + +bool LoopIdiomRecognize::matchBCmpOfLoads(Value *BCmpValue, + CmpOfLoads &CmpOfLoads) const { + using namespace PatternMatch; + LLVM_DEBUG(dbgs() << "Analyzing header icmp " << *BCmpValue + << " as bcmp pattern.\n"); + + // Match bcmp-style loop header cmp. It must be an eq-icmp of loads. Example: + // %v0 = load <...>, <...>* %LoadSrcA + // %v1 = load <...>, <...>* %LoadSrcB + // %CmpLoop.BCmpValue = icmp eq <...> %v0, %v1 + // There won't be any no-op bitcasts between load and icmp, + // they would have been transformed into a load of bitcast. + // FIXME: {b,mem}cmp() calls have the same semantics as icmp. Match them too. + if (!match(BCmpValue, + m_ICmp(CmpOfLoads.BCmpPred, + m_CombineAnd(m_Load(m_Value(CmpOfLoads.LoadSrcA)), + m_Value(CmpOfLoads.LoadA)), + m_CombineAnd(m_Load(m_Value(CmpOfLoads.LoadSrcB)), + m_Value(CmpOfLoads.LoadB)))) || + !ICmpInst::isEquality(CmpOfLoads.BCmpPred)) { + LLVM_DEBUG(dbgs() << "Loop header icmp did not match bcmp pattern.\n"); + return false; + } + LLVM_DEBUG(dbgs() << "Recognized header icmp as bcmp pattern with loads:\n\t" + << *CmpOfLoads.LoadA << "\n\t" << *CmpOfLoads.LoadB + << "\n"); + // FIXME: handle memcmp pattern? 
+ return true; +} + +bool LoopIdiomRecognize::recognizeBCmpLoopControlFlow( + const CmpOfLoads &CmpOfLoads, CmpLoopStructure &CmpLoop) const { + BasicBlock *LoopHeaderBB = CurLoop->getHeader(); + BasicBlock *LoopLatchBB = CurLoop->getLoopLatch(); + + // Be wary, comparisons can be inverted, canonicalize order. + // If this 'element' comparison passed, we expect to proceed to the next elt. + if (CmpOfLoads.BCmpPred != ICmpInst::Predicate::ICMP_EQ) + std::swap(CmpLoop.HeaderBrEqualBB, CmpLoop.HeaderBrUnequalBB); + // The predicate on loop latch does not matter, just canonicalize some order. + if (CmpLoop.LatchBrContinueBB != LoopHeaderBB) + std::swap(CmpLoop.LatchBrFinishBB, CmpLoop.LatchBrContinueBB); + + // Check that control-flow between blocks is as expected. + if (CmpLoop.HeaderBrEqualBB != LoopLatchBB || + CmpLoop.LatchBrContinueBB != LoopHeaderBB) { + LLVM_DEBUG(dbgs() << "Loop control-flow not recognized.\n"); + return false; + } + + SmallVector ExitBlocks; + CurLoop->getUniqueExitBlocks(ExitBlocks); + assert(ExitBlocks.size() <= 2U && "Can't have more than two exit blocks."); + + assert(!is_contained(ExitBlocks, CmpLoop.HeaderBrEqualBB) && + is_contained(ExitBlocks, CmpLoop.HeaderBrUnequalBB) && + !is_contained(ExitBlocks, CmpLoop.LatchBrContinueBB) && + is_contained(ExitBlocks, CmpLoop.LatchBrFinishBB) && + "Unexpected exit edges."); + + LLVM_DEBUG(dbgs() << "Recognized loop control-flow.\n"); + + LLVM_DEBUG(dbgs() << "Performing side-effect analysis on the loop.\n"); + assert(CurLoop->isLCSSAForm(*DT) && "Should only get LCSSA-form loops here."); + // No loop instructions must be used outside of the loop. Since we are in + // LCSSA form, we only need to check successor block's PHI nodes's incoming + // values for incoming blocks that are the loop basic blocks. 
+ for (const BasicBlock *ExitBB : ExitBlocks) { + for (const PHINode &PHI : ExitBB->phis()) { + for (const BasicBlock *LoopBB : + make_filter_range(PHI.blocks(), [this](BasicBlock *PredecessorBB) { + return CurLoop->contains(PredecessorBB); + })) { + const auto *I = + dyn_cast(PHI.getIncomingValueForBlock(LoopBB)); + if (I && CurLoop->contains(I)) { + LLVM_DEBUG(dbgs() + << "Loop contains instruction " << *I + << " which is used outside of the loop in basic block " + << ExitBB->getName() << " in phi node " << PHI << "\n"); + return false; + } + } + } + } + // Similarly, the loop should not have any other observable side-effects + // other than the final comparison result. + for (BasicBlock *LoopBB : CurLoop->blocks()) { + for (Instruction &I : *LoopBB) { + if (isa(I)) // Ignore dbginfo. + continue; // FIXME: anything else? lifetime info? + if ((I.mayHaveSideEffects() || I.isAtomic() || I.isFenceLike()) && + &I != CmpOfLoads.LoadA && &I != CmpOfLoads.LoadB) { + LLVM_DEBUG( + dbgs() << "Loop contains instruction with potential side-effects: " + << I << "\n"); + return false; + } + } + } + LLVM_DEBUG(dbgs() << "No loop instructions deemed to have side-effects.\n"); + return true; +} + +bool LoopIdiomRecognize::recognizeBCmpLoopSCEV(uint64_t BCmpTyBytes, + CmpOfLoads &CmpOfLoads, + const SCEV *&SrcA, + const SCEV *&SrcB, + const SCEV *&Iterations) const { + // Try to compute SCEV of the loads, for this loop's scope. 
+ const auto *ScevForSrcA = dyn_cast<SCEVAddRecExpr>( + SE->getSCEVAtScope(CmpOfLoads.LoadSrcA, CurLoop)); + const auto *ScevForSrcB = dyn_cast<SCEVAddRecExpr>( + SE->getSCEVAtScope(CmpOfLoads.LoadSrcB, CurLoop)); + if (!ScevForSrcA || !ScevForSrcB) { + LLVM_DEBUG(dbgs() << "Failed to get SCEV expressions for load sources.\n"); + return false; + } + + LLVM_DEBUG(dbgs() << "Got SCEV expressions (at loop scope) for loads:\n\t" + << *ScevForSrcA << "\n\t" << *ScevForSrcB << "\n"); + + // Loads must have folloving SCEV exprs: {%ptr,+,BCmpTyBytes}<%LoopHeaderBB> + const SCEV *RecStepForA = ScevForSrcA->getStepRecurrence(*SE); + const SCEV *RecStepForB = ScevForSrcB->getStepRecurrence(*SE); + if (!ScevForSrcA->isAffine() || !ScevForSrcB->isAffine() || + ScevForSrcA->getLoop() != CurLoop || ScevForSrcB->getLoop() != CurLoop || + RecStepForA != RecStepForB || !isa<SCEVConstant>(RecStepForA) || + cast<SCEVConstant>(RecStepForA)->getAPInt() != BCmpTyBytes) { + LLVM_DEBUG(dbgs() << "Unsupported SCEV expressions for loads. Only support " + "affine SCEV expressions originating in the loop we " + "are analysing with identical constant positive step, " + "equal to the count of bytes compared. Got:\n\t" + << *RecStepForA << "\n\t" << *RecStepForB << "\n"); + return false; + // FIXME: can support BCmpTyBytes > Step. + // But will need to account for the extra bytes compared at the end. + } + + SrcA = ScevForSrcA->getStart(); + SrcB = ScevForSrcB->getStart(); + LLVM_DEBUG(dbgs() << "Got SCEV expressions for load sources:\n\t" << *SrcA + << "\n\t" << *SrcB << "\n"); + + // The load sources must be loop-invants that dominate the loop header.
+ if (SrcA == SE->getCouldNotCompute() || SrcB == SE->getCouldNotCompute() || + !SE->isAvailableAtLoopEntry(SrcA, CurLoop) || + !SE->isAvailableAtLoopEntry(SrcB, CurLoop)) { + LLVM_DEBUG(dbgs() << "Unsupported SCEV expressions for loads, unavaliable " + "prior to loop header.\n"); + return false; + } + + LLVM_DEBUG(dbgs() << "SCEV expressions for loads are acceptable.\n"); + + // For how many iterations is loop guaranteed not to exit via LoopLatch? + // This is one less than the maximal number of comparisons,and is: n + -1 + const SCEV *LoopExitCount = + SE->getExitCount(CurLoop, CurLoop->getLoopLatch()); + LLVM_DEBUG(dbgs() << "Got SCEV expression for loop latch exit count: " + << *LoopExitCount << "\n"); + // Exit count, similarly, must be loop-invant that dominates the loop header. + if (LoopExitCount == SE->getCouldNotCompute() || + !LoopExitCount->getType()->isIntOrPtrTy() || + !SE->isAvailableAtLoopEntry(LoopExitCount, CurLoop)) { + LLVM_DEBUG(dbgs() << "Unsupported SCEV expression for loop latch exit.\n"); + return false; + } + + // LoopExitCount is always one less than the actual count of iterations. + // Do this before cast, else we will be stuck with 1 + zext(-1 + n) + Iterations = SE->getAddExpr( + LoopExitCount, SE->getOne(LoopExitCount->getType()), SCEV::FlagNUW); + assert(Iterations != SE->getCouldNotCompute() && + "Shouldn't fail to increment by one."); + + LLVM_DEBUG(dbgs() << "Computed iteration count: " << *Iterations << "\n"); + return true; +} + +/// Return true iff the bcmp idiom is detected in the loop. +/// +/// Additionally: +/// 1) \p BCmpInst is set to the root byte-comparison instruction. +/// 2) \p LatchCmpInst is set to the comparison that controls the latch. +/// 3) \p LoadA is set to the first LoadInst. +/// 4) \p LoadB is set to the second LoadInst. +/// 5) \p SrcA is set to the first source location that is being compared. +/// 6) \p SrcB is set to the second source location that is being compared. 
+/// 7) \p NBytes is set to the number of bytes to compare. +bool LoopIdiomRecognize::detectBCmpIdiom(ICmpInst *&BCmpInst, + CmpInst *&LatchCmpInst, + LoadInst *&LoadA, LoadInst *&LoadB, + const SCEV *&SrcA, const SCEV *&SrcB, + const SCEV *&NBytes) const { + LLVM_DEBUG(dbgs() << "Recognizing bcmp idiom\n"); + + // Give up if the loop is not in normal form, or has more than 2 blocks. + if (!CurLoop->isLoopSimplifyForm() || CurLoop->getNumBlocks() > 2) { + LLVM_DEBUG(dbgs() << "Basic loop structure unrecognized.\n"); + return false; + } + LLVM_DEBUG(dbgs() << "Recognized basic loop structure.\n"); + + CmpLoopStructure CmpLoop; + if (!matchBCmpLoopStructure(CmpLoop)) + return false; + + CmpOfLoads CmpOfLoads; + if (!matchBCmpOfLoads(CmpLoop.BCmpValue, CmpOfLoads)) + return false; + + if (!recognizeBCmpLoopControlFlow(CmpOfLoads, CmpLoop)) + return false; + + BCmpInst = cast(CmpLoop.BCmpValue); // FIXME: is there no + LatchCmpInst = cast(CmpLoop.LatchCmpValue); // way to combine + LoadA = cast(CmpOfLoads.LoadA); // these cast with + LoadB = cast(CmpOfLoads.LoadB); // m_Value() matcher? + + Type *BCmpValTy = BCmpInst->getOperand(0)->getType(); + LLVMContext &Context = BCmpValTy->getContext(); + uint64_t BCmpTyBits = DL->getTypeSizeInBits(BCmpValTy); + static constexpr uint64_t ByteTyBits = 8; + + LLVM_DEBUG(dbgs() << "Got comparison between values of type " << *BCmpValTy + << " of size " << BCmpTyBits + << " bits (while byte = " << ByteTyBits << " bits).\n"); + // bcmp()/memcmp() minimal unit of work is a byte. Therefore we must check + // that we are dealing with a multiple of a byte here. + if (BCmpTyBits % ByteTyBits != 0) { + LLVM_DEBUG(dbgs() << "Value size is not a multiple of byte.\n"); + return false; + // FIXME: could still be done under a run-time check that the total bit + // count is a multiple of a byte i guess? Or handle remainder separately? + } + + // Each comparison is done on this many bytes. 
+ uint64_t BCmpTyBytes = BCmpTyBits / ByteTyBits; + LLVM_DEBUG(dbgs() << "Size is exactly " << BCmpTyBytes + << " bytes, eligible for bcmp conversion.\n"); + + const SCEV *Iterations; + if (!recognizeBCmpLoopSCEV(BCmpTyBytes, CmpOfLoads, SrcA, SrcB, Iterations)) + return false; + + // bcmp / memcmp take length argument as size_t, do promotion now. + Type *CmpFuncSizeTy = DL->getIntPtrType(Context); + Iterations = SE->getNoopOrZeroExtend(Iterations, CmpFuncSizeTy); + assert(Iterations != SE->getCouldNotCompute() && "Promotion failed."); + // Note that it didn't do ptrtoint cast, we will need to do it manually. + + // We will be comparing *bytes*, not BCmpTy, we need to recalculate size. + // It's a multiplication, and it *could* overflow. But for it to overflow + // we'd want to compare more bytes than could be represented by size_t, But + // allocation functions also take size_t. So how'd you produce such buffer? + // FIXME: we likely need to actually check that we know this won't overflow, + // via llvm::computeOverflowForUnsignedMul(). 
+ NBytes = SE->getMulExpr( + Iterations, SE->getConstant(CmpFuncSizeTy, BCmpTyBytes), SCEV::FlagNUW); + assert(NBytes != SE->getCouldNotCompute() && + "Shouldn't fail to increment by one."); + + LLVM_DEBUG(dbgs() << "Computed total byte count: " << *NBytes << "\n"); + + if (LoadA->getPointerAddressSpace() != LoadB->getPointerAddressSpace() || + LoadA->getPointerAddressSpace() != 0 || !LoadA->isSimple() || + !LoadB->isSimple()) { + StringLiteral L("Unsupported loads in idiom - only support identical, " + "simple loads from address space 0.\n"); + LLVM_DEBUG(dbgs() << L); + ORE.emit([&]() { + return OptimizationRemarkMissed(DEBUG_TYPE, "BCmpIdiomUnsupportedLoads", + BCmpInst->getDebugLoc(), + CurLoop->getHeader()) + << L; + }); + return false; // FIXME + } + + LLVM_DEBUG(dbgs() << "Recognized bcmp idiom\n"); + ORE.emit([&]() { + return OptimizationRemarkAnalysis(DEBUG_TYPE, "RecognizedBCmpIdiom", + CurLoop->getStartLoc(), + CurLoop->getHeader()) + << "Loop recognized as a bcmp idiom"; + }); + + return true; +} + +BasicBlock * +LoopIdiomRecognize::transformBCmpControlFlow(ICmpInst *ComparedEqual) { + LLVM_DEBUG(dbgs() << "Transforming control-flow.\n"); + SmallVector DTUpdates; + + BasicBlock *PreheaderBB = CurLoop->getLoopPreheader(); + BasicBlock *HeaderBB = CurLoop->getHeader(); + BasicBlock *LoopLatchBB = CurLoop->getLoopLatch(); + SmallString<32> LoopName = CurLoop->getName(); + Function *Func = PreheaderBB->getParent(); + LLVMContext &Context = Func->getContext(); + + // Before doing anything, drop SCEV info. + SE->forgetLoop(CurLoop); + + // Here we start with: (0/6) + // PreheaderBB: ; preds = ??? 
+ // <...> + // %memcmp = call i32 @memcmp(i8* %LoadSrcA, i8* %LoadSrcB, i64 %Nbytes) + // %ComparedEqual = icmp eq <...> %memcmp, 0 + // br label %LoopHeaderBB + // LoopHeaderBB: ; preds = %PreheaderBB,%LoopLatchBB + // <...> + // br i1 %<...>, label %LoopLatchBB, label %Successor0BB + // LoopLatchBB: ; preds = %LoopHeaderBB + // <...> + // br i1 %<...>, label %Successor1BB, label %LoopHeaderBB + // Successor0BB: ; preds = %LoopHeaderBB + // %S0PHI = phi <...> [ <...>, %LoopHeaderBB ] + // <...> + // Successor1BB: ; preds = %LoopLatchBB + // %S1PHI = phi <...> [ <...>, %LoopLatchBB ] + // <...> + // + // Successor0 and Successor1 may or may not be the same basic block. + + // Decouple the edge between loop preheader basic block and loop header basic + // block. Thus the loop has become unreachable. + assert(cast(PreheaderBB->getTerminator())->isUnconditional() && + PreheaderBB->getTerminator()->getSuccessor(0) == HeaderBB && + "Preheader bb must end with an unconditional branch to header bb."); + PreheaderBB->getTerminator()->eraseFromParent(); + DTUpdates.push_back({DominatorTree::Delete, PreheaderBB, HeaderBB}); + + // Create a new preheader basic block before loop header basic block. + auto *PhonyPreheaderBB = BasicBlock::Create( + Context, LoopName + ".phonypreheaderbb", Func, HeaderBB); + // And insert an unconditional branch from phony preheader basic block to + // loop header basic block. + IRBuilder<>(PhonyPreheaderBB).CreateBr(HeaderBB); + DTUpdates.push_back({DominatorTree::Insert, PhonyPreheaderBB, HeaderBB}); + + // Create a *single* new empty block that we will substitute as a + // successor basic block for the loop's exits. This one is temporary. + // Much like phony preheader basic block, it is not connected. 
+ auto *PhonySuccessorBB = + BasicBlock::Create(Context, LoopName + ".phonysuccessorbb", Func, + LoopLatchBB->getNextNode()); + // That block must have *some* non-PHI instruction, or else deleteDeadLoop() + // will mess up cleanup of dbginfo, and verifier will complain. + IRBuilder<>(PhonySuccessorBB).CreateUnreachable(); + + // Create two new empty blocks that we will use to preserve the original + // loop exit control-flow, and preserve the incoming values in the PHI nodes + // in loop's successor exit blocks. These will live one. + auto *ComparedUnequalBB = + BasicBlock::Create(Context, ComparedEqual->getName() + ".unequalbb", Func, + PhonySuccessorBB->getNextNode()); + auto *ComparedEqualBB = + BasicBlock::Create(Context, ComparedEqual->getName() + ".equalbb", Func, + PhonySuccessorBB->getNextNode()); + + // By now we have: (1/6) + // PreheaderBB: ; preds = ??? + // <...> + // %memcmp = call i32 @memcmp(i8* %LoadSrcA, i8* %LoadSrcB, i64 %Nbytes) + // %ComparedEqual = icmp eq <...> %memcmp, 0 + // [no terminator instruction!] + // PhonyPreheaderBB: ; No preds, UNREACHABLE! + // br label %LoopHeaderBB + // LoopHeaderBB: ; preds = %PhonyPreheaderBB, %LoopLatchBB + // <...> + // br i1 %<...>, label %LoopLatchBB, label %Successor0BB + // LoopLatchBB: ; preds = %LoopHeaderBB + // <...> + // br i1 %<...>, label %Successor1BB, label %LoopHeaderBB + // PhonySuccessorBB: ; No preds, UNREACHABLE! + // unreachable + // EqualBB: ; No preds, UNREACHABLE! + // [no terminator instruction!] + // UnequalBB: ; No preds, UNREACHABLE! + // [no terminator instruction!] + // Successor0BB: ; preds = %LoopHeaderBB + // %S0PHI = phi <...> [ <...>, %LoopHeaderBB ] + // <...> + // Successor1BB: ; preds = %LoopLatchBB + // %S1PHI = phi <...> [ <...>, %LoopLatchBB ] + // <...> + + // What is the mapping/replacement basic block for exiting out of the loop + // from either of old's loop basic blocks? 
+ auto GetReplacementBB = [this, ComparedEqualBB, + ComparedUnequalBB](const BasicBlock *OldBB) { + assert(CurLoop->contains(OldBB) && "Only for loop's basic blocks."); + if (OldBB == CurLoop->getLoopLatch()) // "all elements compared equal". + return ComparedEqualBB; + if (OldBB == CurLoop->getHeader()) // "element compared unequal". + return ComparedUnequalBB; + llvm_unreachable("Only had two basic blocks in loop."); + }; + + // What are the exits out of this loop? + SmallVector LoopExitEdges; + CurLoop->getExitEdges(LoopExitEdges); + assert(LoopExitEdges.size() == 2 && "Should have only to two exit edges."); + + // Populate new basic blocks, update the exiting control-flow, PHI nodes. + for (const Loop::Edge &Edge : LoopExitEdges) { + auto *OldLoopBB = const_cast(Edge.first); + auto *SuccessorBB = const_cast(Edge.second); + assert(CurLoop->contains(OldLoopBB) && !CurLoop->contains(SuccessorBB) && + "Unexpected edge."); + + // If we would exit the loop from this loop's basic block, + // what semantically would that mean? Did comparison succeed or fail? + BasicBlock *NewBB = GetReplacementBB(OldLoopBB); + assert(NewBB->empty() && "Should not get same new basic block here twice."); + IRBuilder<> Builder(NewBB); + Builder.SetCurrentDebugLocation(OldLoopBB->getTerminator()->getDebugLoc()); + Builder.CreateBr(SuccessorBB); + DTUpdates.push_back({DominatorTree::Insert, NewBB, SuccessorBB}); + // Also, be *REALLY* careful with PHI nodes in successor basic block, + // update them to recieve the same input value, but not from current loop's + // basic block, but from new basic block instead. + SuccessorBB->replacePhiUsesWith(OldLoopBB, NewBB); + // Also, change loop control-flow. This loop's basic block shall no longer + // exit from the loop to it's original successor basic block, but to our new + // phony successor basic block. Note that new successor will be unique exit. 
+ OldLoopBB->getTerminator()->replaceSuccessorWith(SuccessorBB, + PhonySuccessorBB); + DTUpdates.push_back({DominatorTree::Delete, OldLoopBB, SuccessorBB}); + DTUpdates.push_back({DominatorTree::Insert, OldLoopBB, PhonySuccessorBB}); + } + + // Inform DomTree about edge changes. Note that LoopInfo is still out-of-date. + assert(DTUpdates.size() == 8 && "Update count prediction failed."); + DomTreeUpdater DTU(DT, DomTreeUpdater::UpdateStrategy::Eager); + DTU.applyUpdates(DTUpdates); + DTUpdates.clear(); + + // By now we have: (2/6) + // PreheaderBB: ; preds = ??? + // <...> + // %memcmp = call i32 @memcmp(i8* %LoadSrcA, i8* %LoadSrcB, i64 %Nbytes) + // %ComparedEqual = icmp eq <...> %memcmp, 0 + // [no terminator instruction!] + // PhonyPreheaderBB: ; No preds, UNREACHABLE! + // br label %LoopHeaderBB + // LoopHeaderBB: ; preds = %PhonyPreheaderBB, %LoopLatchBB + // <...> + // br i1 %<...>, label %LoopLatchBB, label %PhonySuccessorBB + // LoopLatchBB: ; preds = %LoopHeaderBB + // <...> + // br i1 %<...>, label %PhonySuccessorBB, label %LoopHeaderBB + // PhonySuccessorBB: ; preds = %LoopHeaderBB, %LoopLatchBB + // unreachable + // EqualBB: ; No preds, UNREACHABLE! + // br label %Successor1BB + // UnequalBB: ; No preds, UNREACHABLE! + // br label %Successor0BB + // Successor0BB: ; preds = %UnequalBB + // %S0PHI = phi <...> [ <...>, %UnequalBB ] + // <...> + // Successor1BB: ; preds = %EqualBB + // %S0PHI = phi <...> [ <...>, %EqualBB ] + // <...> + + // *Finally*, zap the original loop. Record it's parent loop though. + Loop *ParentLoop = CurLoop->getParentLoop(); + LLVM_DEBUG(dbgs() << "Deleting old loop.\n"); + LoopDeleter.markLoopAsDeleted(CurLoop); // Mark as deleted *BEFORE* deleting! + deleteDeadLoop(CurLoop, DT, SE, LI); // And actually delete the loop. + CurLoop = nullptr; + + // By now we have: (3/6) + // PreheaderBB: ; preds = ??? 
+ // <...> + // %memcmp = call i32 @memcmp(i8* %LoadSrcA, i8* %LoadSrcB, i64 %Nbytes) + // %ComparedEqual = icmp eq <...> %memcmp, 0 + // [no terminator instruction!] + // PhonyPreheaderBB: ; No preds, UNREACHABLE! + // br label %PhonySuccessorBB + // PhonySuccessorBB: ; preds = %PhonyPreheaderBB + // unreachable + // EqualBB: ; No preds, UNREACHABLE! + // br label %Successor1BB + // UnequalBB: ; No preds, UNREACHABLE! + // br label %Successor0BB + // Successor0BB: ; preds = %UnequalBB + // %S0PHI = phi <...> [ <...>, %UnequalBB ] + // <...> + // Successor1BB: ; preds = %EqualBB + // %S0PHI = phi <...> [ <...>, %EqualBB ] + // <...> + + // Now, actually restore the CFG. + + // Insert an unconditional branch from an actual preheader basic block to + // phony preheader basic block. + IRBuilder<>(PreheaderBB).CreateBr(PhonyPreheaderBB); + DTUpdates.push_back({DominatorTree::Insert, PhonyPreheaderBB, HeaderBB}); + // Insert proper conditional branch from phony successor basic block to the + // "dispatch" basic blocks, which were used to preserve incoming values in + // original loop's successor basic blocks. + assert(isa(PhonySuccessorBB->getTerminator()) && + "Yep, that's the one we created to keep deleteDeadLoop() happy."); + PhonySuccessorBB->getTerminator()->eraseFromParent(); + { + IRBuilder<> Builder(PhonySuccessorBB); + Builder.SetCurrentDebugLocation(ComparedEqual->getDebugLoc()); + Builder.CreateCondBr(ComparedEqual, ComparedEqualBB, ComparedUnequalBB); + } + DTUpdates.push_back( + {DominatorTree::Insert, PhonySuccessorBB, ComparedEqualBB}); + DTUpdates.push_back( + {DominatorTree::Insert, PhonySuccessorBB, ComparedUnequalBB}); + + BasicBlock *DispatchBB = PhonySuccessorBB; + DispatchBB->setName(LoopName + ".bcmpdispatchbb"); + + assert(DTUpdates.size() == 3 && "Update count prediction failed."); + DTU.applyUpdates(DTUpdates); + DTUpdates.clear(); + + // By now we have: (4/6) + // PreheaderBB: ; preds = ??? 
+ // <...> + // %memcmp = call i32 @memcmp(i8* %LoadSrcA, i8* %LoadSrcB, i64 %Nbytes) + // %ComparedEqual = icmp eq <...> %memcmp, 0 + // br label %PhonyPreheaderBB + // PhonyPreheaderBB: ; preds = %PreheaderBB + // br label %DispatchBB + // DispatchBB: ; preds = %PhonyPreheaderBB + // br i1 %ComparedEqual, label %EqualBB, label %UnequalBB + // EqualBB: ; preds = %DispatchBB + // br label %Successor1BB + // UnequalBB: ; preds = %DispatchBB + // br label %Successor0BB + // Successor0BB: ; preds = %UnequalBB + // %S0PHI = phi <...> [ <...>, %UnequalBB ] + // <...> + // Successor1BB: ; preds = %EqualBB + // %S0PHI = phi <...> [ <...>, %EqualBB ] + // <...> + + // The basic CFG has been restored! Now let's merge redundant basic blocks. + + // Merge phony successor basic block into it's only predecessor, + // phony preheader basic block. It is fully pointlessly redundant. + MergeBasicBlockIntoOnlyPred(DispatchBB, &DTU); + + // By now we have: (5/6) + // PreheaderBB: ; preds = ??? + // <...> + // %memcmp = call i32 @memcmp(i8* %LoadSrcA, i8* %LoadSrcB, i64 %Nbytes) + // %ComparedEqual = icmp eq <...> %memcmp, 0 + // br label %DispatchBB + // DispatchBB: ; preds = %PreheaderBB + // br i1 %ComparedEqual, label %EqualBB, label %UnequalBB + // EqualBB: ; preds = %DispatchBB + // br label %Successor1BB + // UnequalBB: ; preds = %DispatchBB + // br label %Successor0BB + // Successor0BB: ; preds = %UnequalBB + // %S0PHI = phi <...> [ <...>, %UnequalBB ] + // <...> + // Successor1BB: ; preds = %EqualBB + // %S0PHI = phi <...> [ <...>, %EqualBB ] + // <...> + + // Was this loop nested? + if (!ParentLoop) { + // If the loop was *NOT* nested, then let's also merge phony successor + // basic block into it's only predecessor, preheader basic block. + // Also, here we need to update LoopInfo. + LI->removeBlock(PreheaderBB); + MergeBasicBlockIntoOnlyPred(DispatchBB, &DTU); + + // By now we have: (6/6) + // DispatchBB: ; preds = ??? 
+ // <...> + // %memcmp = call i32 @memcmp(i8* %LoadSrcA, i8* %LoadSrcB, i64 %Nbytes) + // %ComparedEqual = icmp eq <...> %memcmp, 0 + // br i1 %ComparedEqual, label %EqualBB, label %UnequalBB + // EqualBB: ; preds = %DispatchBB + // br label %Successor1BB + // UnequalBB: ; preds = %DispatchBB + // br label %Successor0BB + // Successor0BB: ; preds = %UnequalBB + // %S0PHI = phi <...> [ <...>, %UnequalBB ] + // <...> + // Successor1BB: ; preds = %EqualBB + // %S0PHI = phi <...> [ <...>, %EqualBB ] + // <...> + + return DispatchBB; + } + + // Otherwise, we need to "preserve" the LoopSimplify form of the deleted loop. + // To achieve that, we shall keep the preheader basic block (mainly so that + // the loop header block will be guaranteed to have a predecessor outside of + // the loop), and create a phony loop with all these new three basic blocks. + Loop *PhonyLoop = LI->AllocateLoop(); + ParentLoop->addChildLoop(PhonyLoop); + PhonyLoop->addBasicBlockToLoop(DispatchBB, *LI); + PhonyLoop->addBasicBlockToLoop(ComparedEqualBB, *LI); + PhonyLoop->addBasicBlockToLoop(ComparedUnequalBB, *LI); + + // But we only have a preheader basic block, a header basic block block and + // two exiting basic blocks. For a proper loop we also need a backedge from + // non-header basic block to header bb. + // Let's just add a never-taken branch from both of the exiting basic blocks. + for (BasicBlock *BB : {ComparedEqualBB, ComparedUnequalBB}) { + BranchInst *OldTerminator = cast(BB->getTerminator()); + assert(OldTerminator->isUnconditional() && "That's the one we created."); + BasicBlock *SuccessorBB = OldTerminator->getSuccessor(0); + + IRBuilder<> Builder(OldTerminator); + Builder.SetCurrentDebugLocation(OldTerminator->getDebugLoc()); + Builder.CreateCondBr(ConstantInt::getTrue(Context), SuccessorBB, + DispatchBB); + OldTerminator->eraseFromParent(); + // Yes, the backedge will never be taken. The control-flow is redundant. 
+ // If it can be simplified further, other passes will take care. + DTUpdates.push_back({DominatorTree::Delete, BB, SuccessorBB}); + DTUpdates.push_back({DominatorTree::Insert, BB, SuccessorBB}); + DTUpdates.push_back({DominatorTree::Insert, BB, DispatchBB}); + } + assert(DTUpdates.size() == 6 && "Update count prediction failed."); + DTU.applyUpdates(DTUpdates); + DTUpdates.clear(); + + // By now we have: (6/6) + // PreheaderBB: ; preds = ??? + // <...> + // %memcmp = call i32 @memcmp(i8* %LoadSrcA, i8* %LoadSrcB, i64 %Nbytes) + // %ComparedEqual = icmp eq <...> %memcmp, 0 + // br label %BCmpDispatchBB + // BCmpDispatchBB:
    ; preds = %PreheaderBB + // br i1 %ComparedEqual, label %EqualBB, label %UnequalBB + // EqualBB: ; preds = %BCmpDispatchBB + // br i1 %true, label %Successor1BB, label %BCmpDispatchBB + // UnequalBB: ; preds = %BCmpDispatchBB + // br i1 %true, label %Successor0BB, label %BCmpDispatchBB + // Successor0BB: ; preds = %UnequalBB + // %S0PHI = phi <...> [ <...>, %UnequalBB ] + // <...> + // Successor1BB: ; preds = %EqualBB + // %S0PHI = phi <...> [ <...>, %EqualBB ] + // <...> + + // Finally fully DONE! + return DispatchBB; +} + +void LoopIdiomRecognize::transformLoopToBCmp(ICmpInst *BCmpInst, + CmpInst *LatchCmpInst, + LoadInst *LoadA, LoadInst *LoadB, + const SCEV *SrcA, const SCEV *SrcB, + const SCEV *NBytes) { + // We will be inserting before the terminator instruction of preheader block. + IRBuilder<> Builder(CurLoop->getLoopPreheader()->getTerminator()); + + LLVM_DEBUG(dbgs() << "Transforming bcmp loop idiom into a call.\n"); + LLVM_DEBUG(dbgs() << "Emitting new instructions.\n"); + + // Expand the SCEV expressions for both sources to compare, and produce value + // for the byte len (beware of Iterations potentially being a pointer, and + // account for element size being BCmpTyBytes bytes, which may be not 1 byte) + Value *PtrA, *PtrB, *Len; + { + SCEVExpander SExp(*SE, *DL, "LoopToBCmp"); + SExp.setInsertPoint(&*Builder.GetInsertPoint()); + + auto HandlePtr = [&SExp](LoadInst *Load, const SCEV *Src) { + SExp.SetCurrentDebugLocation(DebugLoc()); + // If the pointer operand of original load had dbgloc - use it. + if (const auto *I = dyn_cast(Load->getPointerOperand())) + SExp.SetCurrentDebugLocation(I->getDebugLoc()); + return SExp.expandCodeFor(Src); + }; + PtrA = HandlePtr(LoadA, SrcA); + PtrB = HandlePtr(LoadB, SrcB); + + // For len calculation let's use dbgloc for the loop's latch condition. 
+ Builder.SetCurrentDebugLocation(LatchCmpInst->getDebugLoc()); + SExp.SetCurrentDebugLocation(LatchCmpInst->getDebugLoc()); + Len = SExp.expandCodeFor(NBytes); + + Type *CmpFuncSizeTy = DL->getIntPtrType(Builder.getContext()); + assert(SE->getTypeSizeInBits(Len->getType()) == + DL->getTypeSizeInBits(CmpFuncSizeTy) && + "Len should already have the correct size."); + + // Make sure that iteration count is a number, insert ptrtoint cast if not. + if (Len->getType()->isPointerTy()) + Len = Builder.CreatePtrToInt(Len, CmpFuncSizeTy); + assert(Len->getType() == CmpFuncSizeTy && "Should have correct type now."); + + Len->setName(Len->getName() + ".bytecount"); + + // There is no legality check needed. We want to compare that the memory + // regions [PtrA, PtrA+Len) and [PtrB, PtrB+Len) are fully identical, equal. + // For them to be fully equal, they must match bit-by-bit. And likewise, + // for them to *NOT* be fully equal, they have to differ just by one bit. + // The step of comparison (bits compared at once) simply does not matter. + } + + // For the rest of new instructions, dbgloc should point at the value cmp. + Builder.SetCurrentDebugLocation(BCmpInst->getDebugLoc()); + + // Emit the comparison itself. + auto *CmpCall = + cast(HasBCmp ? emitBCmp(PtrA, PtrB, Len, Builder, *DL, TLI) + : emitMemCmp(PtrA, PtrB, Len, Builder, *DL, TLI)); + // FIXME: add {B,Mem}CmpInst with MemoryCompareInst + // (based on MemIntrinsicBase) as base? + // FIXME: propagate metadata from loads? (alignments, AS, TBAA, ...) + + // {b,mem}cmp returned 0 if they were equal, or non-zero if not equal. + auto *ComparedEqual = cast(Builder.CreateICmpEQ( + CmpCall, ConstantInt::get(CmpCall->getType(), 0), + PtrA->getName() + ".vs." + PtrB->getName() + ".eqcmp")); + + BasicBlock *BB = transformBCmpControlFlow(ComparedEqual); + Builder.ClearInsertionPoint(); + + // We're done. 
+ LLVM_DEBUG(dbgs() << "Transformed loop bcmp idiom into a call.\n"); + ORE.emit([&]() { + return OptimizationRemark(DEBUG_TYPE, "TransformedBCmpIdiomToCall", + CmpCall->getDebugLoc(), BB) + << "Transformed bcmp idiom into a call to " + << ore::NV("NewFunction", CmpCall->getCalledFunction()) + << "() function"; + }); + ++NumBCmp; +} + +/// Recognizes a bcmp idiom in a non-countable loop. +/// +/// If detected, transforms the relevant code to issue the bcmp (or memcmp) +/// intrinsic function call, and returns true; otherwise, returns false. +bool LoopIdiomRecognize::recognizeBCmp() { + if (!HasMemCmp && !HasBCmp) + return false; + + ICmpInst *BCmpInst; + CmpInst *LatchCmpInst; + LoadInst *LoadA, *LoadB; + const SCEV *SrcA, *SrcB, *NBytes; + if (!detectBCmpIdiom(BCmpInst, LatchCmpInst, LoadA, LoadB, SrcA, SrcB, + NBytes)) { + LLVM_DEBUG(dbgs() << "bcmp idiom recognition failed.\n"); + return false; + } + + transformLoopToBCmp(BCmpInst, LatchCmpInst, LoadA, LoadB, SrcA, SrcB, NBytes); + return true; +} Modified: llvm/trunk/test/Transforms/LoopIdiom/bcmp-basic.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/LoopIdiom/bcmp-basic.ll?rev=374662&r1=374661&r2=374662&view=diff ============================================================================== --- llvm/trunk/test/Transforms/LoopIdiom/bcmp-basic.ll (original) +++ llvm/trunk/test/Transforms/LoopIdiom/bcmp-basic.ll Sat Oct 12 08:35:32 2019 @@ -1,5 +1,5 @@ ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py -; RUN: opt -loop-idiom < %s -S | FileCheck %s +; RUN: opt -loop-idiom -verify -verify-each -verify-dom-info -verify-loop-info < %s -S | FileCheck %s target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64" @@ -239,24 +239,17 @@ target datalayout = "e-p:64:64:64-i1:8:8 define i1 @_Z39pointer_iteration_const_size_no_overlapPKc(i8* %ptr) { ; CHECK-LABEL: 
@_Z39pointer_iteration_const_size_no_overlapPKc( -; CHECK-NEXT: entry: +; CHECK-NEXT: for.body.i.i.bcmpdispatchbb: ; CHECK-NEXT: [[ADD_PTR:%.*]] = getelementptr inbounds i8, i8* [[PTR:%.*]], i64 8 -; CHECK-NEXT: br label [[FOR_BODY_I_I:%.*]] -; CHECK: for.body.i.i: -; CHECK-NEXT: [[__FIRST2_ADDR_07_I_I:%.*]] = phi i8* [ [[INCDEC_PTR1_I_I:%.*]], [[FOR_INC_I_I:%.*]] ], [ [[ADD_PTR]], [[ENTRY:%.*]] ] -; CHECK-NEXT: [[__FIRST1_ADDR_06_I_I_IDX:%.*]] = phi i64 [ [[__FIRST1_ADDR_06_I_I_ADD:%.*]], [[FOR_INC_I_I]] ], [ 0, [[ENTRY]] ] -; CHECK-NEXT: [[__FIRST1_ADDR_06_I_I_PTR:%.*]] = getelementptr inbounds i8, i8* [[PTR]], i64 [[__FIRST1_ADDR_06_I_I_IDX]] -; CHECK-NEXT: [[V0:%.*]] = load i8, i8* [[__FIRST1_ADDR_06_I_I_PTR]] -; CHECK-NEXT: [[V1:%.*]] = load i8, i8* [[__FIRST2_ADDR_07_I_I]] -; CHECK-NEXT: [[CMP_I_I_I:%.*]] = icmp eq i8 [[V0]], [[V1]] -; CHECK-NEXT: br i1 [[CMP_I_I_I]], label [[FOR_INC_I_I]], label [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT:%.*]] -; CHECK: for.inc.i.i: -; CHECK-NEXT: [[__FIRST1_ADDR_06_I_I_ADD]] = add nuw nsw i64 [[__FIRST1_ADDR_06_I_I_IDX]], 1 -; CHECK-NEXT: [[INCDEC_PTR1_I_I]] = getelementptr inbounds i8, i8* [[__FIRST2_ADDR_07_I_I]], i64 1 -; CHECK-NEXT: [[CMP_I_I:%.*]] = icmp eq i64 [[__FIRST1_ADDR_06_I_I_ADD]], 8 -; CHECK-NEXT: br i1 [[CMP_I_I]], label [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT]], label [[FOR_BODY_I_I]] +; CHECK-NEXT: [[MEMCMP:%.*]] = call i32 @memcmp(i8* [[PTR]], i8* [[ADD_PTR]], i64 8) +; CHECK-NEXT: [[PTR_VS_ADD_PTR_EQCMP:%.*]] = icmp eq i32 [[MEMCMP]], 0 +; CHECK-NEXT: br i1 [[PTR_VS_ADD_PTR_EQCMP]], label [[PTR_VS_ADD_PTR_EQCMP_EQUALBB:%.*]], label [[PTR_VS_ADD_PTR_EQCMP_UNEQUALBB:%.*]] +; CHECK: ptr.vs.add.ptr.eqcmp.equalbb: +; CHECK-NEXT: br label [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT:%.*]] +; CHECK: ptr.vs.add.ptr.eqcmp.unequalbb: +; CHECK-NEXT: br label [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT]] ; CHECK: _ZNSt3__15equalIPKcS2_EEbT_S3_T0_.exit: -; CHECK-NEXT: [[RETVAL_0_I_I:%.*]] = phi i1 [ false, [[FOR_BODY_I_I]] 
], [ true, [[FOR_INC_I_I]] ] +; CHECK-NEXT: [[RETVAL_0_I_I:%.*]] = phi i1 [ false, [[PTR_VS_ADD_PTR_EQCMP_UNEQUALBB]] ], [ true, [[PTR_VS_ADD_PTR_EQCMP_EQUALBB]] ] ; CHECK-NEXT: ret i1 [[RETVAL_0_I_I]] ; entry: @@ -285,24 +278,17 @@ _ZNSt3__15equalIPKcS2_EEbT_S3_T0_.exit: define i1 @_Z44pointer_iteration_const_size_partial_overlapPKc(i8* %ptr) { ; CHECK-LABEL: @_Z44pointer_iteration_const_size_partial_overlapPKc( -; CHECK-NEXT: entry: +; CHECK-NEXT: for.body.i.i.bcmpdispatchbb: ; CHECK-NEXT: [[ADD_PTR1:%.*]] = getelementptr inbounds i8, i8* [[PTR:%.*]], i64 8 -; CHECK-NEXT: br label [[FOR_BODY_I_I:%.*]] -; CHECK: for.body.i.i: -; CHECK-NEXT: [[__FIRST2_ADDR_07_I_I:%.*]] = phi i8* [ [[INCDEC_PTR1_I_I:%.*]], [[FOR_INC_I_I:%.*]] ], [ [[ADD_PTR1]], [[ENTRY:%.*]] ] -; CHECK-NEXT: [[__FIRST1_ADDR_06_I_I_IDX:%.*]] = phi i64 [ [[__FIRST1_ADDR_06_I_I_ADD:%.*]], [[FOR_INC_I_I]] ], [ 0, [[ENTRY]] ] -; CHECK-NEXT: [[__FIRST1_ADDR_06_I_I_PTR:%.*]] = getelementptr inbounds i8, i8* [[PTR]], i64 [[__FIRST1_ADDR_06_I_I_IDX]] -; CHECK-NEXT: [[V0:%.*]] = load i8, i8* [[__FIRST1_ADDR_06_I_I_PTR]] -; CHECK-NEXT: [[V1:%.*]] = load i8, i8* [[__FIRST2_ADDR_07_I_I]] -; CHECK-NEXT: [[CMP_I_I_I:%.*]] = icmp eq i8 [[V0]], [[V1]] -; CHECK-NEXT: br i1 [[CMP_I_I_I]], label [[FOR_INC_I_I]], label [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT:%.*]] -; CHECK: for.inc.i.i: -; CHECK-NEXT: [[__FIRST1_ADDR_06_I_I_ADD]] = add nuw nsw i64 [[__FIRST1_ADDR_06_I_I_IDX]], 1 -; CHECK-NEXT: [[INCDEC_PTR1_I_I]] = getelementptr inbounds i8, i8* [[__FIRST2_ADDR_07_I_I]], i64 1 -; CHECK-NEXT: [[CMP_I_I:%.*]] = icmp eq i64 [[__FIRST1_ADDR_06_I_I_ADD]], 16 -; CHECK-NEXT: br i1 [[CMP_I_I]], label [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT]], label [[FOR_BODY_I_I]] +; CHECK-NEXT: [[MEMCMP:%.*]] = call i32 @memcmp(i8* [[PTR]], i8* [[ADD_PTR1]], i64 16) +; CHECK-NEXT: [[PTR_VS_ADD_PTR1_EQCMP:%.*]] = icmp eq i32 [[MEMCMP]], 0 +; CHECK-NEXT: br i1 [[PTR_VS_ADD_PTR1_EQCMP]], label [[PTR_VS_ADD_PTR1_EQCMP_EQUALBB:%.*]], label 
[[PTR_VS_ADD_PTR1_EQCMP_UNEQUALBB:%.*]] +; CHECK: ptr.vs.add.ptr1.eqcmp.equalbb: +; CHECK-NEXT: br label [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT:%.*]] +; CHECK: ptr.vs.add.ptr1.eqcmp.unequalbb: +; CHECK-NEXT: br label [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT]] ; CHECK: _ZNSt3__15equalIPKcS2_EEbT_S3_T0_.exit: -; CHECK-NEXT: [[RETVAL_0_I_I:%.*]] = phi i1 [ false, [[FOR_BODY_I_I]] ], [ true, [[FOR_INC_I_I]] ] +; CHECK-NEXT: [[RETVAL_0_I_I:%.*]] = phi i1 [ false, [[PTR_VS_ADD_PTR1_EQCMP_UNEQUALBB]] ], [ true, [[PTR_VS_ADD_PTR1_EQCMP_EQUALBB]] ] ; CHECK-NEXT: ret i1 [[RETVAL_0_I_I]] ; entry: @@ -331,23 +317,16 @@ _ZNSt3__15equalIPKcS2_EEbT_S3_T0_.exit: define i1 @_Z44pointer_iteration_const_size_overlap_unknownPKcS0_(i8* %ptr0, i8* %ptr1) { ; CHECK-LABEL: @_Z44pointer_iteration_const_size_overlap_unknownPKcS0_( -; CHECK-NEXT: entry: -; CHECK-NEXT: br label [[FOR_BODY_I_I:%.*]] -; CHECK: for.body.i.i: -; CHECK-NEXT: [[__FIRST2_ADDR_07_I_I:%.*]] = phi i8* [ [[INCDEC_PTR1_I_I:%.*]], [[FOR_INC_I_I:%.*]] ], [ [[PTR1:%.*]], [[ENTRY:%.*]] ] -; CHECK-NEXT: [[__FIRST1_ADDR_06_I_I_IDX:%.*]] = phi i64 [ [[__FIRST1_ADDR_06_I_I_ADD:%.*]], [[FOR_INC_I_I]] ], [ 0, [[ENTRY]] ] -; CHECK-NEXT: [[__FIRST1_ADDR_06_I_I_PTR:%.*]] = getelementptr inbounds i8, i8* [[PTR0:%.*]], i64 [[__FIRST1_ADDR_06_I_I_IDX]] -; CHECK-NEXT: [[V0:%.*]] = load i8, i8* [[__FIRST1_ADDR_06_I_I_PTR]] -; CHECK-NEXT: [[V1:%.*]] = load i8, i8* [[__FIRST2_ADDR_07_I_I]] -; CHECK-NEXT: [[CMP_I_I_I:%.*]] = icmp eq i8 [[V0]], [[V1]] -; CHECK-NEXT: br i1 [[CMP_I_I_I]], label [[FOR_INC_I_I]], label [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT:%.*]] -; CHECK: for.inc.i.i: -; CHECK-NEXT: [[__FIRST1_ADDR_06_I_I_ADD]] = add nuw nsw i64 [[__FIRST1_ADDR_06_I_I_IDX]], 1 -; CHECK-NEXT: [[INCDEC_PTR1_I_I]] = getelementptr inbounds i8, i8* [[__FIRST2_ADDR_07_I_I]], i64 1 -; CHECK-NEXT: [[CMP_I_I:%.*]] = icmp eq i64 [[__FIRST1_ADDR_06_I_I_ADD]], 8 -; CHECK-NEXT: br i1 [[CMP_I_I]], label [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT]], label 
[[FOR_BODY_I_I]] +; CHECK-NEXT: for.body.i.i.bcmpdispatchbb: +; CHECK-NEXT: [[MEMCMP:%.*]] = call i32 @memcmp(i8* [[PTR0:%.*]], i8* [[PTR1:%.*]], i64 8) +; CHECK-NEXT: [[PTR0_VS_PTR1_EQCMP:%.*]] = icmp eq i32 [[MEMCMP]], 0 +; CHECK-NEXT: br i1 [[PTR0_VS_PTR1_EQCMP]], label [[PTR0_VS_PTR1_EQCMP_EQUALBB:%.*]], label [[PTR0_VS_PTR1_EQCMP_UNEQUALBB:%.*]] +; CHECK: ptr0.vs.ptr1.eqcmp.equalbb: +; CHECK-NEXT: br label [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT:%.*]] +; CHECK: ptr0.vs.ptr1.eqcmp.unequalbb: +; CHECK-NEXT: br label [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT]] ; CHECK: _ZNSt3__15equalIPKcS2_EEbT_S3_T0_.exit: -; CHECK-NEXT: [[RETVAL_0_I_I:%.*]] = phi i1 [ false, [[FOR_BODY_I_I]] ], [ true, [[FOR_INC_I_I]] ] +; CHECK-NEXT: [[RETVAL_0_I_I:%.*]] = phi i1 [ false, [[PTR0_VS_PTR1_EQCMP_UNEQUALBB]] ], [ true, [[PTR0_VS_PTR1_EQCMP_EQUALBB]] ] ; CHECK-NEXT: ret i1 [[RETVAL_0_I_I]] ; entry: @@ -376,25 +355,19 @@ _ZNSt3__15equalIPKcS2_EEbT_S3_T0_.exit: define i1 @_Z42pointer_iteration_variable_size_no_overlapPKcm(i8* %ptr, i64 %count) { ; CHECK-LABEL: @_Z42pointer_iteration_variable_size_no_overlapPKcm( ; CHECK-NEXT: entry: -; CHECK-NEXT: [[ADD_PTR:%.*]] = getelementptr inbounds i8, i8* [[PTR:%.*]], i64 [[COUNT:%.*]] -; CHECK-NEXT: [[CMP5_I_I:%.*]] = icmp eq i64 [[COUNT]], 0 -; CHECK-NEXT: br i1 [[CMP5_I_I]], label [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT:%.*]], label [[FOR_BODY_I_I_PREHEADER:%.*]] -; CHECK: for.body.i.i.preheader: -; CHECK-NEXT: br label [[FOR_BODY_I_I:%.*]] -; CHECK: for.body.i.i: -; CHECK-NEXT: [[__FIRST2_ADDR_07_I_I:%.*]] = phi i8* [ [[INCDEC_PTR1_I_I:%.*]], [[FOR_INC_I_I:%.*]] ], [ [[ADD_PTR]], [[FOR_BODY_I_I_PREHEADER]] ] -; CHECK-NEXT: [[__FIRST1_ADDR_06_I_I:%.*]] = phi i8* [ [[INCDEC_PTR_I_I:%.*]], [[FOR_INC_I_I]] ], [ [[PTR]], [[FOR_BODY_I_I_PREHEADER]] ] -; CHECK-NEXT: [[V0:%.*]] = load i8, i8* [[__FIRST1_ADDR_06_I_I]] -; CHECK-NEXT: [[V1:%.*]] = load i8, i8* [[__FIRST2_ADDR_07_I_I]] -; CHECK-NEXT: [[CMP_I_I_I:%.*]] = icmp eq i8 [[V0]], [[V1]] -; 
CHECK-NEXT: br i1 [[CMP_I_I_I]], label [[FOR_INC_I_I]], label [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT_LOOPEXIT:%.*]] -; CHECK: for.inc.i.i: -; CHECK-NEXT: [[INCDEC_PTR_I_I]] = getelementptr inbounds i8, i8* [[__FIRST1_ADDR_06_I_I]], i64 1 -; CHECK-NEXT: [[INCDEC_PTR1_I_I]] = getelementptr inbounds i8, i8* [[__FIRST2_ADDR_07_I_I]], i64 1 -; CHECK-NEXT: [[CMP_I_I:%.*]] = icmp eq i8* [[INCDEC_PTR_I_I]], [[ADD_PTR]] -; CHECK-NEXT: br i1 [[CMP_I_I]], label [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT_LOOPEXIT]], label [[FOR_BODY_I_I]] +; CHECK-NEXT: [[ADD_PTR:%.*]] = getelementptr inbounds i8, i8* [[PTR:%.*]], i64 [[COUNT_BYTECOUNT:%.*]] +; CHECK-NEXT: [[CMP5_I_I:%.*]] = icmp eq i64 [[COUNT_BYTECOUNT]], 0 +; CHECK-NEXT: br i1 [[CMP5_I_I]], label [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT:%.*]], label [[FOR_BODY_I_I_BCMPDISPATCHBB:%.*]] +; CHECK: for.body.i.i.bcmpdispatchbb: +; CHECK-NEXT: [[MEMCMP:%.*]] = call i32 @memcmp(i8* [[PTR]], i8* [[ADD_PTR]], i64 [[COUNT_BYTECOUNT]]) +; CHECK-NEXT: [[PTR_VS_ADD_PTR_EQCMP:%.*]] = icmp eq i32 [[MEMCMP]], 0 +; CHECK-NEXT: br i1 [[PTR_VS_ADD_PTR_EQCMP]], label [[PTR_VS_ADD_PTR_EQCMP_EQUALBB:%.*]], label [[PTR_VS_ADD_PTR_EQCMP_UNEQUALBB:%.*]] +; CHECK: ptr.vs.add.ptr.eqcmp.equalbb: +; CHECK-NEXT: br label [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT_LOOPEXIT:%.*]] +; CHECK: ptr.vs.add.ptr.eqcmp.unequalbb: +; CHECK-NEXT: br label [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT_LOOPEXIT]] ; CHECK: _ZNSt3__15equalIPKcS2_EEbT_S3_T0_.exit.loopexit: -; CHECK-NEXT: [[RETVAL_0_I_I_PH:%.*]] = phi i1 [ false, [[FOR_BODY_I_I]] ], [ true, [[FOR_INC_I_I]] ] +; CHECK-NEXT: [[RETVAL_0_I_I_PH:%.*]] = phi i1 [ false, [[PTR_VS_ADD_PTR_EQCMP_UNEQUALBB]] ], [ true, [[PTR_VS_ADD_PTR_EQCMP_EQUALBB]] ] ; CHECK-NEXT: br label [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT]] ; CHECK: _ZNSt3__15equalIPKcS2_EEbT_S3_T0_.exit: ; CHECK-NEXT: [[RETVAL_0_I_I:%.*]] = phi i1 [ true, [[ENTRY:%.*]] ], [ [[RETVAL_0_I_I_PH]], [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT_LOOPEXIT]] ] @@ -427,27 
+400,21 @@ _ZNSt3__15equalIPKcS2_EEbT_S3_T0_.exit: define i1 @_Z47pointer_iteration_variable_size_partial_overlapPKcm(i8* %ptr, i64 %count) { ; CHECK-LABEL: @_Z47pointer_iteration_variable_size_partial_overlapPKcm( ; CHECK-NEXT: entry: -; CHECK-NEXT: [[MUL:%.*]] = shl i64 [[COUNT:%.*]], 1 -; CHECK-NEXT: [[ADD_PTR:%.*]] = getelementptr inbounds i8, i8* [[PTR:%.*]], i64 [[MUL]] -; CHECK-NEXT: [[CMP5_I_I:%.*]] = icmp eq i64 [[MUL]], 0 -; CHECK-NEXT: br i1 [[CMP5_I_I]], label [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT:%.*]], label [[FOR_BODY_I_I_PREHEADER:%.*]] -; CHECK: for.body.i.i.preheader: +; CHECK-NEXT: [[MUL_BYTECOUNT:%.*]] = shl i64 [[COUNT:%.*]], 1 +; CHECK-NEXT: [[ADD_PTR:%.*]] = getelementptr inbounds i8, i8* [[PTR:%.*]], i64 [[MUL_BYTECOUNT]] +; CHECK-NEXT: [[CMP5_I_I:%.*]] = icmp eq i64 [[MUL_BYTECOUNT]], 0 +; CHECK-NEXT: br i1 [[CMP5_I_I]], label [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT:%.*]], label [[FOR_BODY_I_I_BCMPDISPATCHBB:%.*]] +; CHECK: for.body.i.i.bcmpdispatchbb: ; CHECK-NEXT: [[ADD_PTR1:%.*]] = getelementptr inbounds i8, i8* [[PTR]], i64 [[COUNT]] -; CHECK-NEXT: br label [[FOR_BODY_I_I:%.*]] -; CHECK: for.body.i.i: -; CHECK-NEXT: [[__FIRST2_ADDR_07_I_I:%.*]] = phi i8* [ [[INCDEC_PTR1_I_I:%.*]], [[FOR_INC_I_I:%.*]] ], [ [[ADD_PTR1]], [[FOR_BODY_I_I_PREHEADER]] ] -; CHECK-NEXT: [[__FIRST1_ADDR_06_I_I:%.*]] = phi i8* [ [[INCDEC_PTR_I_I:%.*]], [[FOR_INC_I_I]] ], [ [[PTR]], [[FOR_BODY_I_I_PREHEADER]] ] -; CHECK-NEXT: [[V0:%.*]] = load i8, i8* [[__FIRST1_ADDR_06_I_I]] -; CHECK-NEXT: [[V1:%.*]] = load i8, i8* [[__FIRST2_ADDR_07_I_I]] -; CHECK-NEXT: [[CMP_I_I_I:%.*]] = icmp eq i8 [[V0]], [[V1]] -; CHECK-NEXT: br i1 [[CMP_I_I_I]], label [[FOR_INC_I_I]], label [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT_LOOPEXIT:%.*]] -; CHECK: for.inc.i.i: -; CHECK-NEXT: [[INCDEC_PTR_I_I]] = getelementptr inbounds i8, i8* [[__FIRST1_ADDR_06_I_I]], i64 1 -; CHECK-NEXT: [[INCDEC_PTR1_I_I]] = getelementptr inbounds i8, i8* [[__FIRST2_ADDR_07_I_I]], i64 1 -; CHECK-NEXT: 
[[CMP_I_I:%.*]] = icmp eq i8* [[INCDEC_PTR_I_I]], [[ADD_PTR]] -; CHECK-NEXT: br i1 [[CMP_I_I]], label [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT_LOOPEXIT]], label [[FOR_BODY_I_I]] +; CHECK-NEXT: [[MEMCMP:%.*]] = call i32 @memcmp(i8* [[PTR]], i8* [[ADD_PTR1]], i64 [[MUL_BYTECOUNT]]) +; CHECK-NEXT: [[PTR_VS_ADD_PTR1_EQCMP:%.*]] = icmp eq i32 [[MEMCMP]], 0 +; CHECK-NEXT: br i1 [[PTR_VS_ADD_PTR1_EQCMP]], label [[PTR_VS_ADD_PTR1_EQCMP_EQUALBB:%.*]], label [[PTR_VS_ADD_PTR1_EQCMP_UNEQUALBB:%.*]] +; CHECK: ptr.vs.add.ptr1.eqcmp.equalbb: +; CHECK-NEXT: br label [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT_LOOPEXIT:%.*]] +; CHECK: ptr.vs.add.ptr1.eqcmp.unequalbb: +; CHECK-NEXT: br label [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT_LOOPEXIT]] ; CHECK: _ZNSt3__15equalIPKcS2_EEbT_S3_T0_.exit.loopexit: -; CHECK-NEXT: [[RETVAL_0_I_I_PH:%.*]] = phi i1 [ false, [[FOR_BODY_I_I]] ], [ true, [[FOR_INC_I_I]] ] +; CHECK-NEXT: [[RETVAL_0_I_I_PH:%.*]] = phi i1 [ false, [[PTR_VS_ADD_PTR1_EQCMP_UNEQUALBB]] ], [ true, [[PTR_VS_ADD_PTR1_EQCMP_EQUALBB]] ] ; CHECK-NEXT: br label [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT]] ; CHECK: _ZNSt3__15equalIPKcS2_EEbT_S3_T0_.exit: ; CHECK-NEXT: [[RETVAL_0_I_I:%.*]] = phi i1 [ true, [[ENTRY:%.*]] ], [ [[RETVAL_0_I_I_PH]], [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT_LOOPEXIT]] ] @@ -485,25 +452,19 @@ _ZNSt3__15equalIPKcS2_EEbT_S3_T0_.exit: define i1 @_Z47pointer_iteration_variable_size_overlap_unknownPKcS0_m(i8* %ptr0, i8* %ptr1, i64 %count) { ; CHECK-LABEL: @_Z47pointer_iteration_variable_size_overlap_unknownPKcS0_m( ; CHECK-NEXT: entry: -; CHECK-NEXT: [[ADD_PTR:%.*]] = getelementptr inbounds i8, i8* [[PTR0:%.*]], i64 [[COUNT:%.*]] -; CHECK-NEXT: [[CMP5_I_I:%.*]] = icmp eq i64 [[COUNT]], 0 -; CHECK-NEXT: br i1 [[CMP5_I_I]], label [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT:%.*]], label [[FOR_BODY_I_I_PREHEADER:%.*]] -; CHECK: for.body.i.i.preheader: -; CHECK-NEXT: br label [[FOR_BODY_I_I:%.*]] -; CHECK: for.body.i.i: -; CHECK-NEXT: [[__FIRST2_ADDR_07_I_I:%.*]] = phi i8* [ 
[[INCDEC_PTR1_I_I:%.*]], [[FOR_INC_I_I:%.*]] ], [ [[PTR1:%.*]], [[FOR_BODY_I_I_PREHEADER]] ] -; CHECK-NEXT: [[__FIRST1_ADDR_06_I_I:%.*]] = phi i8* [ [[INCDEC_PTR_I_I:%.*]], [[FOR_INC_I_I]] ], [ [[PTR0]], [[FOR_BODY_I_I_PREHEADER]] ] -; CHECK-NEXT: [[V0:%.*]] = load i8, i8* [[__FIRST1_ADDR_06_I_I]] -; CHECK-NEXT: [[V1:%.*]] = load i8, i8* [[__FIRST2_ADDR_07_I_I]] -; CHECK-NEXT: [[CMP_I_I_I:%.*]] = icmp eq i8 [[V0]], [[V1]] -; CHECK-NEXT: br i1 [[CMP_I_I_I]], label [[FOR_INC_I_I]], label [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT_LOOPEXIT:%.*]] -; CHECK: for.inc.i.i: -; CHECK-NEXT: [[INCDEC_PTR_I_I]] = getelementptr inbounds i8, i8* [[__FIRST1_ADDR_06_I_I]], i64 1 -; CHECK-NEXT: [[INCDEC_PTR1_I_I]] = getelementptr inbounds i8, i8* [[__FIRST2_ADDR_07_I_I]], i64 1 -; CHECK-NEXT: [[CMP_I_I:%.*]] = icmp eq i8* [[INCDEC_PTR_I_I]], [[ADD_PTR]] -; CHECK-NEXT: br i1 [[CMP_I_I]], label [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT_LOOPEXIT]], label [[FOR_BODY_I_I]] +; CHECK-NEXT: [[ADD_PTR:%.*]] = getelementptr inbounds i8, i8* [[PTR0:%.*]], i64 [[COUNT_BYTECOUNT:%.*]] +; CHECK-NEXT: [[CMP5_I_I:%.*]] = icmp eq i64 [[COUNT_BYTECOUNT]], 0 +; CHECK-NEXT: br i1 [[CMP5_I_I]], label [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT:%.*]], label [[FOR_BODY_I_I_BCMPDISPATCHBB:%.*]] +; CHECK: for.body.i.i.bcmpdispatchbb: +; CHECK-NEXT: [[MEMCMP:%.*]] = call i32 @memcmp(i8* [[PTR0]], i8* [[PTR1:%.*]], i64 [[COUNT_BYTECOUNT]]) +; CHECK-NEXT: [[PTR0_VS_PTR1_EQCMP:%.*]] = icmp eq i32 [[MEMCMP]], 0 +; CHECK-NEXT: br i1 [[PTR0_VS_PTR1_EQCMP]], label [[PTR0_VS_PTR1_EQCMP_EQUALBB:%.*]], label [[PTR0_VS_PTR1_EQCMP_UNEQUALBB:%.*]] +; CHECK: ptr0.vs.ptr1.eqcmp.equalbb: +; CHECK-NEXT: br label [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT_LOOPEXIT:%.*]] +; CHECK: ptr0.vs.ptr1.eqcmp.unequalbb: +; CHECK-NEXT: br label [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT_LOOPEXIT]] ; CHECK: _ZNSt3__15equalIPKcS2_EEbT_S3_T0_.exit.loopexit: -; CHECK-NEXT: [[RETVAL_0_I_I_PH:%.*]] = phi i1 [ false, [[FOR_BODY_I_I]] ], [ true, 
[[FOR_INC_I_I]] ] +; CHECK-NEXT: [[RETVAL_0_I_I_PH:%.*]] = phi i1 [ false, [[PTR0_VS_PTR1_EQCMP_UNEQUALBB]] ], [ true, [[PTR0_VS_PTR1_EQCMP_EQUALBB]] ] ; CHECK-NEXT: br label [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT]] ; CHECK: _ZNSt3__15equalIPKcS2_EEbT_S3_T0_.exit: ; CHECK-NEXT: [[RETVAL_0_I_I:%.*]] = phi i1 [ true, [[ENTRY:%.*]] ], [ [[RETVAL_0_I_I_PH]], [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT_LOOPEXIT]] ] @@ -535,23 +496,17 @@ _ZNSt3__15equalIPKcS2_EEbT_S3_T0_.exit: define i1 @_Z40index_iteration_eq_const_size_no_overlapPKc(i8* %ptr) { ; CHECK-LABEL: @_Z40index_iteration_eq_const_size_no_overlapPKc( -; CHECK-NEXT: entry: +; CHECK-NEXT: for.body.bcmpdispatchbb: ; CHECK-NEXT: [[ADD_PTR:%.*]] = getelementptr inbounds i8, i8* [[PTR:%.*]], i64 8 -; CHECK-NEXT: br label [[FOR_BODY:%.*]] -; CHECK: for.cond: -; CHECK-NEXT: [[CMP:%.*]] = icmp ult i64 [[INC:%.*]], 8 -; CHECK-NEXT: br i1 [[CMP]], label [[FOR_BODY]], label [[CLEANUP:%.*]] -; CHECK: for.body: -; CHECK-NEXT: [[I_013:%.*]] = phi i64 [ 0, [[ENTRY:%.*]] ], [ [[INC]], [[FOR_COND:%.*]] ] -; CHECK-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i8, i8* [[PTR]], i64 [[I_013]] -; CHECK-NEXT: [[V0:%.*]] = load i8, i8* [[ARRAYIDX]] -; CHECK-NEXT: [[ARRAYIDX1:%.*]] = getelementptr inbounds i8, i8* [[ADD_PTR]], i64 [[I_013]] -; CHECK-NEXT: [[V1:%.*]] = load i8, i8* [[ARRAYIDX1]] -; CHECK-NEXT: [[CMP3:%.*]] = icmp eq i8 [[V0]], [[V1]] -; CHECK-NEXT: [[INC]] = add nuw nsw i64 [[I_013]], 1 -; CHECK-NEXT: br i1 [[CMP3]], label [[FOR_COND]], label [[CLEANUP]] +; CHECK-NEXT: [[MEMCMP:%.*]] = call i32 @memcmp(i8* [[PTR]], i8* [[ADD_PTR]], i64 8) +; CHECK-NEXT: [[PTR_VS_ADD_PTR_EQCMP:%.*]] = icmp eq i32 [[MEMCMP]], 0 +; CHECK-NEXT: br i1 [[PTR_VS_ADD_PTR_EQCMP]], label [[PTR_VS_ADD_PTR_EQCMP_EQUALBB:%.*]], label [[PTR_VS_ADD_PTR_EQCMP_UNEQUALBB:%.*]] +; CHECK: ptr.vs.add.ptr.eqcmp.equalbb: +; CHECK-NEXT: br label [[CLEANUP:%.*]] +; CHECK: ptr.vs.add.ptr.eqcmp.unequalbb: +; CHECK-NEXT: br label [[CLEANUP]] ; CHECK: cleanup: -; 
CHECK-NEXT: [[RES:%.*]] = phi i1 [ false, [[FOR_BODY]] ], [ true, [[FOR_COND]] ] +; CHECK-NEXT: [[RES:%.*]] = phi i1 [ false, [[PTR_VS_ADD_PTR_EQCMP_UNEQUALBB]] ], [ true, [[PTR_VS_ADD_PTR_EQCMP_EQUALBB]] ] ; CHECK-NEXT: ret i1 [[RES]] ; entry: @@ -579,23 +534,17 @@ cleanup: define i1 @_Z45index_iteration_eq_const_size_partial_overlapPKc(i8* %ptr) { ; CHECK-LABEL: @_Z45index_iteration_eq_const_size_partial_overlapPKc( -; CHECK-NEXT: entry: +; CHECK-NEXT: for.body.bcmpdispatchbb: ; CHECK-NEXT: [[ADD_PTR:%.*]] = getelementptr inbounds i8, i8* [[PTR:%.*]], i64 8 -; CHECK-NEXT: br label [[FOR_BODY:%.*]] -; CHECK: for.cond: -; CHECK-NEXT: [[CMP:%.*]] = icmp ult i64 [[INC:%.*]], 16 -; CHECK-NEXT: br i1 [[CMP]], label [[FOR_BODY]], label [[CLEANUP:%.*]] -; CHECK: for.body: -; CHECK-NEXT: [[I_013:%.*]] = phi i64 [ 0, [[ENTRY:%.*]] ], [ [[INC]], [[FOR_COND:%.*]] ] -; CHECK-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i8, i8* [[PTR]], i64 [[I_013]] -; CHECK-NEXT: [[V0:%.*]] = load i8, i8* [[ARRAYIDX]] -; CHECK-NEXT: [[ARRAYIDX1:%.*]] = getelementptr inbounds i8, i8* [[ADD_PTR]], i64 [[I_013]] -; CHECK-NEXT: [[V1:%.*]] = load i8, i8* [[ARRAYIDX1]] -; CHECK-NEXT: [[CMP3:%.*]] = icmp eq i8 [[V0]], [[V1]] -; CHECK-NEXT: [[INC]] = add nuw nsw i64 [[I_013]], 1 -; CHECK-NEXT: br i1 [[CMP3]], label [[FOR_COND]], label [[CLEANUP]] +; CHECK-NEXT: [[MEMCMP:%.*]] = call i32 @memcmp(i8* [[PTR]], i8* [[ADD_PTR]], i64 16) +; CHECK-NEXT: [[PTR_VS_ADD_PTR_EQCMP:%.*]] = icmp eq i32 [[MEMCMP]], 0 +; CHECK-NEXT: br i1 [[PTR_VS_ADD_PTR_EQCMP]], label [[PTR_VS_ADD_PTR_EQCMP_EQUALBB:%.*]], label [[PTR_VS_ADD_PTR_EQCMP_UNEQUALBB:%.*]] +; CHECK: ptr.vs.add.ptr.eqcmp.equalbb: +; CHECK-NEXT: br label [[CLEANUP:%.*]] +; CHECK: ptr.vs.add.ptr.eqcmp.unequalbb: +; CHECK-NEXT: br label [[CLEANUP]] ; CHECK: cleanup: -; CHECK-NEXT: [[RES:%.*]] = phi i1 [ false, [[FOR_BODY]] ], [ true, [[FOR_COND]] ] +; CHECK-NEXT: [[RES:%.*]] = phi i1 [ false, [[PTR_VS_ADD_PTR_EQCMP_UNEQUALBB]] ], [ true, 
[[PTR_VS_ADD_PTR_EQCMP_EQUALBB]] ] ; CHECK-NEXT: ret i1 [[RES]] ; entry: @@ -623,22 +572,16 @@ cleanup: define i1 @_Z45index_iteration_eq_const_size_overlap_unknownPKcS0_(i8* %ptr0, i8* %ptr1) { ; CHECK-LABEL: @_Z45index_iteration_eq_const_size_overlap_unknownPKcS0_( -; CHECK-NEXT: entry: -; CHECK-NEXT: br label [[FOR_BODY:%.*]] -; CHECK: for.cond: -; CHECK-NEXT: [[CMP:%.*]] = icmp ult i64 [[INC:%.*]], 8 -; CHECK-NEXT: br i1 [[CMP]], label [[FOR_BODY]], label [[CLEANUP:%.*]] -; CHECK: for.body: -; CHECK-NEXT: [[I_08:%.*]] = phi i64 [ 0, [[ENTRY:%.*]] ], [ [[INC]], [[FOR_COND:%.*]] ] -; CHECK-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i8, i8* [[PTR0:%.*]], i64 [[I_08]] -; CHECK-NEXT: [[V0:%.*]] = load i8, i8* [[ARRAYIDX]] -; CHECK-NEXT: [[ARRAYIDX1:%.*]] = getelementptr inbounds i8, i8* [[PTR1:%.*]], i64 [[I_08]] -; CHECK-NEXT: [[V1:%.*]] = load i8, i8* [[ARRAYIDX1]] -; CHECK-NEXT: [[CMP3:%.*]] = icmp eq i8 [[V0]], [[V1]] -; CHECK-NEXT: [[INC]] = add nuw nsw i64 [[I_08]], 1 -; CHECK-NEXT: br i1 [[CMP3]], label [[FOR_COND]], label [[CLEANUP]] +; CHECK-NEXT: for.body.bcmpdispatchbb: +; CHECK-NEXT: [[MEMCMP:%.*]] = call i32 @memcmp(i8* [[PTR0:%.*]], i8* [[PTR1:%.*]], i64 8) +; CHECK-NEXT: [[PTR0_VS_PTR1_EQCMP:%.*]] = icmp eq i32 [[MEMCMP]], 0 +; CHECK-NEXT: br i1 [[PTR0_VS_PTR1_EQCMP]], label [[PTR0_VS_PTR1_EQCMP_EQUALBB:%.*]], label [[PTR0_VS_PTR1_EQCMP_UNEQUALBB:%.*]] +; CHECK: ptr0.vs.ptr1.eqcmp.equalbb: +; CHECK-NEXT: br label [[CLEANUP:%.*]] +; CHECK: ptr0.vs.ptr1.eqcmp.unequalbb: +; CHECK-NEXT: br label [[CLEANUP]] ; CHECK: cleanup: -; CHECK-NEXT: [[RES:%.*]] = phi i1 [ false, [[FOR_BODY]] ], [ true, [[FOR_COND]] ] +; CHECK-NEXT: [[RES:%.*]] = phi i1 [ false, [[PTR0_VS_PTR1_EQCMP_UNEQUALBB]] ], [ true, [[PTR0_VS_PTR1_EQCMP_EQUALBB]] ] ; CHECK-NEXT: ret i1 [[RES]] ; entry: @@ -666,25 +609,19 @@ cleanup: define i1 @_Z43index_iteration_eq_variable_size_no_overlapPKcm(i8* %ptr, i64 %count) { ; CHECK-LABEL: @_Z43index_iteration_eq_variable_size_no_overlapPKcm( ; 
CHECK-NEXT: entry: -; CHECK-NEXT: [[ADD_PTR:%.*]] = getelementptr inbounds i8, i8* [[PTR:%.*]], i64 [[COUNT:%.*]] -; CHECK-NEXT: [[CMP14:%.*]] = icmp eq i64 [[COUNT]], 0 -; CHECK-NEXT: br i1 [[CMP14]], label [[CLEANUP:%.*]], label [[FOR_BODY_PREHEADER:%.*]] -; CHECK: for.body.preheader: -; CHECK-NEXT: br label [[FOR_BODY:%.*]] -; CHECK: for.cond: -; CHECK-NEXT: [[CMP:%.*]] = icmp ult i64 [[INC:%.*]], [[COUNT]] -; CHECK-NEXT: br i1 [[CMP]], label [[FOR_BODY]], label [[CLEANUP_LOOPEXIT:%.*]] -; CHECK: for.body: -; CHECK-NEXT: [[I_015:%.*]] = phi i64 [ [[INC]], [[FOR_COND:%.*]] ], [ 0, [[FOR_BODY_PREHEADER]] ] -; CHECK-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i8, i8* [[PTR]], i64 [[I_015]] -; CHECK-NEXT: [[V0:%.*]] = load i8, i8* [[ARRAYIDX]] -; CHECK-NEXT: [[ARRAYIDX1:%.*]] = getelementptr inbounds i8, i8* [[ADD_PTR]], i64 [[I_015]] -; CHECK-NEXT: [[V1:%.*]] = load i8, i8* [[ARRAYIDX1]] -; CHECK-NEXT: [[CMP3:%.*]] = icmp eq i8 [[V0]], [[V1]] -; CHECK-NEXT: [[INC]] = add nuw i64 [[I_015]], 1 -; CHECK-NEXT: br i1 [[CMP3]], label [[FOR_COND]], label [[CLEANUP_LOOPEXIT]] +; CHECK-NEXT: [[ADD_PTR:%.*]] = getelementptr inbounds i8, i8* [[PTR:%.*]], i64 [[COUNT_BYTECOUNT:%.*]] +; CHECK-NEXT: [[CMP14:%.*]] = icmp eq i64 [[COUNT_BYTECOUNT]], 0 +; CHECK-NEXT: br i1 [[CMP14]], label [[CLEANUP:%.*]], label [[FOR_BODY_BCMPDISPATCHBB:%.*]] +; CHECK: for.body.bcmpdispatchbb: +; CHECK-NEXT: [[MEMCMP:%.*]] = call i32 @memcmp(i8* [[PTR]], i8* [[ADD_PTR]], i64 [[COUNT_BYTECOUNT]]) +; CHECK-NEXT: [[PTR_VS_ADD_PTR_EQCMP:%.*]] = icmp eq i32 [[MEMCMP]], 0 +; CHECK-NEXT: br i1 [[PTR_VS_ADD_PTR_EQCMP]], label [[PTR_VS_ADD_PTR_EQCMP_EQUALBB:%.*]], label [[PTR_VS_ADD_PTR_EQCMP_UNEQUALBB:%.*]] +; CHECK: ptr.vs.add.ptr.eqcmp.equalbb: +; CHECK-NEXT: br label [[CLEANUP_LOOPEXIT:%.*]] +; CHECK: ptr.vs.add.ptr.eqcmp.unequalbb: +; CHECK-NEXT: br label [[CLEANUP_LOOPEXIT]] ; CHECK: cleanup.loopexit: -; CHECK-NEXT: [[RES_PH:%.*]] = phi i1 [ false, [[FOR_BODY]] ], [ true, [[FOR_COND]] ] +; 
CHECK-NEXT: [[RES_PH:%.*]] = phi i1 [ false, [[PTR_VS_ADD_PTR_EQCMP_UNEQUALBB]] ], [ true, [[PTR_VS_ADD_PTR_EQCMP_EQUALBB]] ] ; CHECK-NEXT: br label [[CLEANUP]] ; CHECK: cleanup: ; CHECK-NEXT: [[RES:%.*]] = phi i1 [ true, [[ENTRY:%.*]] ], [ [[RES_PH]], [[CLEANUP_LOOPEXIT]] ] @@ -718,25 +655,19 @@ define i1 @_Z48index_iteration_eq_variab ; CHECK-LABEL: @_Z48index_iteration_eq_variable_size_partial_overlapPKcm( ; CHECK-NEXT: entry: ; CHECK-NEXT: [[ADD_PTR:%.*]] = getelementptr inbounds i8, i8* [[PTR:%.*]], i64 [[COUNT:%.*]] -; CHECK-NEXT: [[MUL:%.*]] = shl i64 [[COUNT]], 1 -; CHECK-NEXT: [[CMP14:%.*]] = icmp eq i64 [[MUL]], 0 -; CHECK-NEXT: br i1 [[CMP14]], label [[CLEANUP:%.*]], label [[FOR_BODY_PREHEADER:%.*]] -; CHECK: for.body.preheader: -; CHECK-NEXT: br label [[FOR_BODY:%.*]] -; CHECK: for.cond: -; CHECK-NEXT: [[CMP:%.*]] = icmp ult i64 [[INC:%.*]], [[MUL]] -; CHECK-NEXT: br i1 [[CMP]], label [[FOR_BODY]], label [[CLEANUP_LOOPEXIT:%.*]] -; CHECK: for.body: -; CHECK-NEXT: [[I_015:%.*]] = phi i64 [ [[INC]], [[FOR_COND:%.*]] ], [ 0, [[FOR_BODY_PREHEADER]] ] -; CHECK-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i8, i8* [[PTR]], i64 [[I_015]] -; CHECK-NEXT: [[V0:%.*]] = load i8, i8* [[ARRAYIDX]] -; CHECK-NEXT: [[ARRAYIDX1:%.*]] = getelementptr inbounds i8, i8* [[ADD_PTR]], i64 [[I_015]] -; CHECK-NEXT: [[V1:%.*]] = load i8, i8* [[ARRAYIDX1]] -; CHECK-NEXT: [[CMP3:%.*]] = icmp eq i8 [[V0]], [[V1]] -; CHECK-NEXT: [[INC]] = add nuw i64 [[I_015]], 1 -; CHECK-NEXT: br i1 [[CMP3]], label [[FOR_COND]], label [[CLEANUP_LOOPEXIT]] +; CHECK-NEXT: [[MUL_BYTECOUNT:%.*]] = shl i64 [[COUNT]], 1 +; CHECK-NEXT: [[CMP14:%.*]] = icmp eq i64 [[MUL_BYTECOUNT]], 0 +; CHECK-NEXT: br i1 [[CMP14]], label [[CLEANUP:%.*]], label [[FOR_BODY_BCMPDISPATCHBB:%.*]] +; CHECK: for.body.bcmpdispatchbb: +; CHECK-NEXT: [[MEMCMP:%.*]] = call i32 @memcmp(i8* [[PTR]], i8* [[ADD_PTR]], i64 [[MUL_BYTECOUNT]]) +; CHECK-NEXT: [[PTR_VS_ADD_PTR_EQCMP:%.*]] = icmp eq i32 [[MEMCMP]], 0 +; CHECK-NEXT: br i1 
[[PTR_VS_ADD_PTR_EQCMP]], label [[PTR_VS_ADD_PTR_EQCMP_EQUALBB:%.*]], label [[PTR_VS_ADD_PTR_EQCMP_UNEQUALBB:%.*]] +; CHECK: ptr.vs.add.ptr.eqcmp.equalbb: +; CHECK-NEXT: br label [[CLEANUP_LOOPEXIT:%.*]] +; CHECK: ptr.vs.add.ptr.eqcmp.unequalbb: +; CHECK-NEXT: br label [[CLEANUP_LOOPEXIT]] ; CHECK: cleanup.loopexit: -; CHECK-NEXT: [[RES_PH:%.*]] = phi i1 [ false, [[FOR_BODY]] ], [ true, [[FOR_COND]] ] +; CHECK-NEXT: [[RES_PH:%.*]] = phi i1 [ false, [[PTR_VS_ADD_PTR_EQCMP_UNEQUALBB]] ], [ true, [[PTR_VS_ADD_PTR_EQCMP_EQUALBB]] ] ; CHECK-NEXT: br label [[CLEANUP]] ; CHECK: cleanup: ; CHECK-NEXT: [[RES:%.*]] = phi i1 [ true, [[ENTRY:%.*]] ], [ [[RES_PH]], [[CLEANUP_LOOPEXIT]] ] @@ -770,24 +701,18 @@ cleanup: define i1 @_Z48index_iteration_eq_variable_size_overlap_unknownPKcS0_m(i8* %ptr0, i8* %ptr1, i64 %count) { ; CHECK-LABEL: @_Z48index_iteration_eq_variable_size_overlap_unknownPKcS0_m( ; CHECK-NEXT: entry: -; CHECK-NEXT: [[CMP8:%.*]] = icmp eq i64 [[COUNT:%.*]], 0 -; CHECK-NEXT: br i1 [[CMP8]], label [[CLEANUP:%.*]], label [[FOR_BODY_PREHEADER:%.*]] -; CHECK: for.body.preheader: -; CHECK-NEXT: br label [[FOR_BODY:%.*]] -; CHECK: for.cond: -; CHECK-NEXT: [[CMP:%.*]] = icmp ult i64 [[INC:%.*]], [[COUNT]] -; CHECK-NEXT: br i1 [[CMP]], label [[FOR_BODY]], label [[CLEANUP_LOOPEXIT:%.*]] -; CHECK: for.body: -; CHECK-NEXT: [[I_09:%.*]] = phi i64 [ [[INC]], [[FOR_COND:%.*]] ], [ 0, [[FOR_BODY_PREHEADER]] ] -; CHECK-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i8, i8* [[PTR0:%.*]], i64 [[I_09]] -; CHECK-NEXT: [[V0:%.*]] = load i8, i8* [[ARRAYIDX]] -; CHECK-NEXT: [[ARRAYIDX1:%.*]] = getelementptr inbounds i8, i8* [[PTR1:%.*]], i64 [[I_09]] -; CHECK-NEXT: [[V1:%.*]] = load i8, i8* [[ARRAYIDX1]] -; CHECK-NEXT: [[CMP3:%.*]] = icmp eq i8 [[V0]], [[V1]] -; CHECK-NEXT: [[INC]] = add nuw i64 [[I_09]], 1 -; CHECK-NEXT: br i1 [[CMP3]], label [[FOR_COND]], label [[CLEANUP_LOOPEXIT]] +; CHECK-NEXT: [[CMP8:%.*]] = icmp eq i64 [[COUNT_BYTECOUNT:%.*]], 0 +; CHECK-NEXT: br i1 [[CMP8]], 
label [[CLEANUP:%.*]], label [[FOR_BODY_BCMPDISPATCHBB:%.*]] +; CHECK: for.body.bcmpdispatchbb: +; CHECK-NEXT: [[MEMCMP:%.*]] = call i32 @memcmp(i8* [[PTR0:%.*]], i8* [[PTR1:%.*]], i64 [[COUNT_BYTECOUNT]]) +; CHECK-NEXT: [[PTR0_VS_PTR1_EQCMP:%.*]] = icmp eq i32 [[MEMCMP]], 0 +; CHECK-NEXT: br i1 [[PTR0_VS_PTR1_EQCMP]], label [[PTR0_VS_PTR1_EQCMP_EQUALBB:%.*]], label [[PTR0_VS_PTR1_EQCMP_UNEQUALBB:%.*]] +; CHECK: ptr0.vs.ptr1.eqcmp.equalbb: +; CHECK-NEXT: br label [[CLEANUP_LOOPEXIT:%.*]] +; CHECK: ptr0.vs.ptr1.eqcmp.unequalbb: +; CHECK-NEXT: br label [[CLEANUP_LOOPEXIT]] ; CHECK: cleanup.loopexit: -; CHECK-NEXT: [[RES_PH:%.*]] = phi i1 [ false, [[FOR_BODY]] ], [ true, [[FOR_COND]] ] +; CHECK-NEXT: [[RES_PH:%.*]] = phi i1 [ false, [[PTR0_VS_PTR1_EQCMP_UNEQUALBB]] ], [ true, [[PTR0_VS_PTR1_EQCMP_EQUALBB]] ] ; CHECK-NEXT: br label [[CLEANUP]] ; CHECK: cleanup: ; CHECK-NEXT: [[RES:%.*]] = phi i1 [ true, [[ENTRY:%.*]] ], [ [[RES_PH]], [[CLEANUP_LOOPEXIT]] ] @@ -818,22 +743,18 @@ cleanup: define i1 @_Z38index_iteration_starting_from_negativePKcS0_(i8* %ptr0, i8* %ptr1) { ; CHECK-LABEL: @_Z38index_iteration_starting_from_negativePKcS0_( -; CHECK-NEXT: entry: -; CHECK-NEXT: br label [[FOR_BODY:%.*]] -; CHECK: for.cond: -; CHECK-NEXT: [[CMP:%.*]] = icmp slt i64 [[INDVARS_IV_NEXT:%.*]], 4 -; CHECK-NEXT: br i1 [[CMP]], label [[FOR_BODY]], label [[CLEANUP:%.*]] -; CHECK: for.body: -; CHECK-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ -4, [[ENTRY:%.*]] ], [ [[INDVARS_IV_NEXT]], [[FOR_COND:%.*]] ] -; CHECK-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i8, i8* [[PTR0:%.*]], i64 [[INDVARS_IV]] -; CHECK-NEXT: [[V0:%.*]] = load i8, i8* [[ARRAYIDX]] -; CHECK-NEXT: [[ARRAYIDX2:%.*]] = getelementptr inbounds i8, i8* [[PTR1:%.*]], i64 [[INDVARS_IV]] -; CHECK-NEXT: [[V1:%.*]] = load i8, i8* [[ARRAYIDX2]] -; CHECK-NEXT: [[CMP4:%.*]] = icmp eq i8 [[V0]], [[V1]] -; CHECK-NEXT: [[INDVARS_IV_NEXT]] = add nsw i64 [[INDVARS_IV]], 1 -; CHECK-NEXT: br i1 [[CMP4]], label [[FOR_COND]], label [[CLEANUP]] 
+; CHECK-NEXT: for.body.bcmpdispatchbb: +; CHECK-NEXT: [[SCEVGEP:%.*]] = getelementptr i8, i8* [[PTR0:%.*]], i64 -4 +; CHECK-NEXT: [[SCEVGEP1:%.*]] = getelementptr i8, i8* [[PTR1:%.*]], i64 -4 +; CHECK-NEXT: [[MEMCMP:%.*]] = call i32 @memcmp(i8* [[SCEVGEP]], i8* [[SCEVGEP1]], i64 8) +; CHECK-NEXT: [[SCEVGEP_VS_SCEVGEP1_EQCMP:%.*]] = icmp eq i32 [[MEMCMP]], 0 +; CHECK-NEXT: br i1 [[SCEVGEP_VS_SCEVGEP1_EQCMP]], label [[SCEVGEP_VS_SCEVGEP1_EQCMP_EQUALBB:%.*]], label [[SCEVGEP_VS_SCEVGEP1_EQCMP_UNEQUALBB:%.*]] +; CHECK: scevgep.vs.scevgep1.eqcmp.equalbb: +; CHECK-NEXT: br label [[CLEANUP:%.*]] +; CHECK: scevgep.vs.scevgep1.eqcmp.unequalbb: +; CHECK-NEXT: br label [[CLEANUP]] ; CHECK: cleanup: -; CHECK-NEXT: [[RET:%.*]] = phi i1 [ false, [[FOR_BODY]] ], [ true, [[FOR_COND]] ] +; CHECK-NEXT: [[RET:%.*]] = phi i1 [ false, [[SCEVGEP_VS_SCEVGEP1_EQCMP_UNEQUALBB]] ], [ true, [[SCEVGEP_VS_SCEVGEP1_EQCMP_EQUALBB]] ] ; CHECK-NEXT: ret i1 [[RET]] ; entry: @@ -860,25 +781,17 @@ cleanup: define i1 @_Z43combined_iteration_eq_const_size_no_overlapPKc(i8* %ptr) { ; CHECK-LABEL: @_Z43combined_iteration_eq_const_size_no_overlapPKc( -; CHECK-NEXT: entry: +; CHECK-NEXT: for.body.bcmpdispatchbb: ; CHECK-NEXT: [[ADD_PTR:%.*]] = getelementptr inbounds i8, i8* [[PTR:%.*]], i64 8 -; CHECK-NEXT: br label [[FOR_BODY:%.*]] -; CHECK: for.body: -; CHECK-NEXT: [[I_015:%.*]] = phi i64 [ 0, [[ENTRY:%.*]] ], [ [[INC:%.*]], [[FOR_INC:%.*]] ] -; CHECK-NEXT: [[PTR1_014:%.*]] = phi i8* [ [[ADD_PTR]], [[ENTRY]] ], [ [[INCDEC_PTR3:%.*]], [[FOR_INC]] ] -; CHECK-NEXT: [[PTR0_013:%.*]] = phi i8* [ [[PTR]], [[ENTRY]] ], [ [[INCDEC_PTR:%.*]], [[FOR_INC]] ] -; CHECK-NEXT: [[V0:%.*]] = load i8, i8* [[PTR0_013]] -; CHECK-NEXT: [[V1:%.*]] = load i8, i8* [[PTR1_014]] -; CHECK-NEXT: [[CMP2:%.*]] = icmp eq i8 [[V0]], [[V1]] -; CHECK-NEXT: br i1 [[CMP2]], label [[FOR_INC]], label [[CLEANUP:%.*]] -; CHECK: for.inc: -; CHECK-NEXT: [[INC]] = add nuw nsw i64 [[I_015]], 1 -; CHECK-NEXT: [[INCDEC_PTR]] = getelementptr 
inbounds i8, i8* [[PTR0_013]], i64 1 -; CHECK-NEXT: [[INCDEC_PTR3]] = getelementptr inbounds i8, i8* [[PTR1_014]], i64 1 -; CHECK-NEXT: [[CMP:%.*]] = icmp ult i64 [[INC]], 8 -; CHECK-NEXT: br i1 [[CMP]], label [[FOR_BODY]], label [[CLEANUP]] +; CHECK-NEXT: [[MEMCMP:%.*]] = call i32 @memcmp(i8* [[PTR]], i8* [[ADD_PTR]], i64 8) +; CHECK-NEXT: [[PTR_VS_ADD_PTR_EQCMP:%.*]] = icmp eq i32 [[MEMCMP]], 0 +; CHECK-NEXT: br i1 [[PTR_VS_ADD_PTR_EQCMP]], label [[PTR_VS_ADD_PTR_EQCMP_EQUALBB:%.*]], label [[PTR_VS_ADD_PTR_EQCMP_UNEQUALBB:%.*]] +; CHECK: ptr.vs.add.ptr.eqcmp.equalbb: +; CHECK-NEXT: br label [[CLEANUP:%.*]] +; CHECK: ptr.vs.add.ptr.eqcmp.unequalbb: +; CHECK-NEXT: br label [[CLEANUP]] ; CHECK: cleanup: -; CHECK-NEXT: [[RES:%.*]] = phi i1 [ false, [[FOR_BODY]] ], [ true, [[FOR_INC]] ] +; CHECK-NEXT: [[RES:%.*]] = phi i1 [ false, [[PTR_VS_ADD_PTR_EQCMP_UNEQUALBB]] ], [ true, [[PTR_VS_ADD_PTR_EQCMP_EQUALBB]] ] ; CHECK-NEXT: ret i1 [[RES]] ; entry: @@ -908,25 +821,17 @@ cleanup: define i1 @_Z48combined_iteration_eq_const_size_partial_overlapPKc(i8* %ptr) { ; CHECK-LABEL: @_Z48combined_iteration_eq_const_size_partial_overlapPKc( -; CHECK-NEXT: entry: +; CHECK-NEXT: for.body.bcmpdispatchbb: ; CHECK-NEXT: [[ADD_PTR:%.*]] = getelementptr inbounds i8, i8* [[PTR:%.*]], i64 8 -; CHECK-NEXT: br label [[FOR_BODY:%.*]] -; CHECK: for.body: -; CHECK-NEXT: [[I_015:%.*]] = phi i64 [ 0, [[ENTRY:%.*]] ], [ [[INC:%.*]], [[FOR_INC:%.*]] ] -; CHECK-NEXT: [[PTR1_014:%.*]] = phi i8* [ [[ADD_PTR]], [[ENTRY]] ], [ [[INCDEC_PTR3:%.*]], [[FOR_INC]] ] -; CHECK-NEXT: [[PTR0_013:%.*]] = phi i8* [ [[PTR]], [[ENTRY]] ], [ [[INCDEC_PTR:%.*]], [[FOR_INC]] ] -; CHECK-NEXT: [[V0:%.*]] = load i8, i8* [[PTR0_013]] -; CHECK-NEXT: [[V1:%.*]] = load i8, i8* [[PTR1_014]] -; CHECK-NEXT: [[CMP2:%.*]] = icmp eq i8 [[V0]], [[V1]] -; CHECK-NEXT: br i1 [[CMP2]], label [[FOR_INC]], label [[CLEANUP:%.*]] -; CHECK: for.inc: -; CHECK-NEXT: [[INC]] = add nuw nsw i64 [[I_015]], 1 -; CHECK-NEXT: [[INCDEC_PTR]] = 
getelementptr inbounds i8, i8* [[PTR0_013]], i64 1 -; CHECK-NEXT: [[INCDEC_PTR3]] = getelementptr inbounds i8, i8* [[PTR1_014]], i64 1 -; CHECK-NEXT: [[CMP:%.*]] = icmp ult i64 [[INC]], 16 -; CHECK-NEXT: br i1 [[CMP]], label [[FOR_BODY]], label [[CLEANUP]] +; CHECK-NEXT: [[MEMCMP:%.*]] = call i32 @memcmp(i8* [[PTR]], i8* [[ADD_PTR]], i64 16) +; CHECK-NEXT: [[PTR_VS_ADD_PTR_EQCMP:%.*]] = icmp eq i32 [[MEMCMP]], 0 +; CHECK-NEXT: br i1 [[PTR_VS_ADD_PTR_EQCMP]], label [[PTR_VS_ADD_PTR_EQCMP_EQUALBB:%.*]], label [[PTR_VS_ADD_PTR_EQCMP_UNEQUALBB:%.*]] +; CHECK: ptr.vs.add.ptr.eqcmp.equalbb: +; CHECK-NEXT: br label [[CLEANUP:%.*]] +; CHECK: ptr.vs.add.ptr.eqcmp.unequalbb: +; CHECK-NEXT: br label [[CLEANUP]] ; CHECK: cleanup: -; CHECK-NEXT: [[RES:%.*]] = phi i1 [ false, [[FOR_BODY]] ], [ true, [[FOR_INC]] ] +; CHECK-NEXT: [[RES:%.*]] = phi i1 [ false, [[PTR_VS_ADD_PTR_EQCMP_UNEQUALBB]] ], [ true, [[PTR_VS_ADD_PTR_EQCMP_EQUALBB]] ] ; CHECK-NEXT: ret i1 [[RES]] ; entry: @@ -956,24 +861,16 @@ cleanup: define i1 @_Z48combined_iteration_eq_const_size_overlap_unknownPKcS0_(i8* %ptr0, i8* %ptr1) { ; CHECK-LABEL: @_Z48combined_iteration_eq_const_size_overlap_unknownPKcS0_( -; CHECK-NEXT: entry: -; CHECK-NEXT: br label [[FOR_BODY:%.*]] -; CHECK: for.body: -; CHECK-NEXT: [[I_010:%.*]] = phi i64 [ 0, [[ENTRY:%.*]] ], [ [[INC:%.*]], [[FOR_INC:%.*]] ] -; CHECK-NEXT: [[PTR1_ADDR_09:%.*]] = phi i8* [ [[PTR1:%.*]], [[ENTRY]] ], [ [[INCDEC_PTR3:%.*]], [[FOR_INC]] ] -; CHECK-NEXT: [[PTR0_ADDR_08:%.*]] = phi i8* [ [[PTR0:%.*]], [[ENTRY]] ], [ [[INCDEC_PTR:%.*]], [[FOR_INC]] ] -; CHECK-NEXT: [[V0:%.*]] = load i8, i8* [[PTR0_ADDR_08]] -; CHECK-NEXT: [[V1:%.*]] = load i8, i8* [[PTR1_ADDR_09]] -; CHECK-NEXT: [[CMP2:%.*]] = icmp eq i8 [[V0]], [[V1]] -; CHECK-NEXT: br i1 [[CMP2]], label [[FOR_INC]], label [[CLEANUP:%.*]] -; CHECK: for.inc: -; CHECK-NEXT: [[INC]] = add nuw nsw i64 [[I_010]], 1 -; CHECK-NEXT: [[INCDEC_PTR]] = getelementptr inbounds i8, i8* [[PTR0_ADDR_08]], i64 1 -; CHECK-NEXT: 
[[INCDEC_PTR3]] = getelementptr inbounds i8, i8* [[PTR1_ADDR_09]], i64 1 -; CHECK-NEXT: [[CMP:%.*]] = icmp ult i64 [[INC]], 8 -; CHECK-NEXT: br i1 [[CMP]], label [[FOR_BODY]], label [[CLEANUP]] +; CHECK-NEXT: for.body.bcmpdispatchbb: +; CHECK-NEXT: [[MEMCMP:%.*]] = call i32 @memcmp(i8* [[PTR0:%.*]], i8* [[PTR1:%.*]], i64 8) +; CHECK-NEXT: [[PTR0_VS_PTR1_EQCMP:%.*]] = icmp eq i32 [[MEMCMP]], 0 +; CHECK-NEXT: br i1 [[PTR0_VS_PTR1_EQCMP]], label [[PTR0_VS_PTR1_EQCMP_EQUALBB:%.*]], label [[PTR0_VS_PTR1_EQCMP_UNEQUALBB:%.*]] +; CHECK: ptr0.vs.ptr1.eqcmp.equalbb: +; CHECK-NEXT: br label [[CLEANUP:%.*]] +; CHECK: ptr0.vs.ptr1.eqcmp.unequalbb: +; CHECK-NEXT: br label [[CLEANUP]] ; CHECK: cleanup: -; CHECK-NEXT: [[RES:%.*]] = phi i1 [ false, [[FOR_BODY]] ], [ true, [[FOR_INC]] ] +; CHECK-NEXT: [[RES:%.*]] = phi i1 [ false, [[PTR0_VS_PTR1_EQCMP_UNEQUALBB]] ], [ true, [[PTR0_VS_PTR1_EQCMP_EQUALBB]] ] ; CHECK-NEXT: ret i1 [[RES]] ; entry: @@ -1003,27 +900,19 @@ cleanup: define i1 @_Z46combined_iteration_eq_variable_size_no_overlapPKcm(i8* %ptr, i64 %count) { ; CHECK-LABEL: @_Z46combined_iteration_eq_variable_size_no_overlapPKcm( ; CHECK-NEXT: entry: -; CHECK-NEXT: [[CMP14:%.*]] = icmp eq i64 [[COUNT:%.*]], 0 -; CHECK-NEXT: br i1 [[CMP14]], label [[CLEANUP:%.*]], label [[FOR_BODY_PREHEADER:%.*]] -; CHECK: for.body.preheader: -; CHECK-NEXT: [[ADD_PTR:%.*]] = getelementptr inbounds i8, i8* [[PTR:%.*]], i64 [[COUNT]] -; CHECK-NEXT: br label [[FOR_BODY:%.*]] -; CHECK: for.body: -; CHECK-NEXT: [[I_017:%.*]] = phi i64 [ [[INC:%.*]], [[FOR_INC:%.*]] ], [ 0, [[FOR_BODY_PREHEADER]] ] -; CHECK-NEXT: [[PTR1_016:%.*]] = phi i8* [ [[INCDEC_PTR3:%.*]], [[FOR_INC]] ], [ [[ADD_PTR]], [[FOR_BODY_PREHEADER]] ] -; CHECK-NEXT: [[PTR0_015:%.*]] = phi i8* [ [[INCDEC_PTR:%.*]], [[FOR_INC]] ], [ [[PTR]], [[FOR_BODY_PREHEADER]] ] -; CHECK-NEXT: [[V0:%.*]] = load i8, i8* [[PTR0_015]] -; CHECK-NEXT: [[V1:%.*]] = load i8, i8* [[PTR1_016]] -; CHECK-NEXT: [[CMP2:%.*]] = icmp eq i8 [[V0]], [[V1]] -; 
CHECK-NEXT: br i1 [[CMP2]], label [[FOR_INC]], label [[CLEANUP_LOOPEXIT:%.*]] -; CHECK: for.inc: -; CHECK-NEXT: [[INC]] = add nuw i64 [[I_017]], 1 -; CHECK-NEXT: [[INCDEC_PTR]] = getelementptr inbounds i8, i8* [[PTR0_015]], i64 1 -; CHECK-NEXT: [[INCDEC_PTR3]] = getelementptr inbounds i8, i8* [[PTR1_016]], i64 1 -; CHECK-NEXT: [[CMP:%.*]] = icmp ult i64 [[INC]], [[COUNT]] -; CHECK-NEXT: br i1 [[CMP]], label [[FOR_BODY]], label [[CLEANUP_LOOPEXIT]] +; CHECK-NEXT: [[CMP14:%.*]] = icmp eq i64 [[COUNT_BYTECOUNT:%.*]], 0 +; CHECK-NEXT: br i1 [[CMP14]], label [[CLEANUP:%.*]], label [[FOR_BODY_BCMPDISPATCHBB:%.*]] +; CHECK: for.body.bcmpdispatchbb: +; CHECK-NEXT: [[ADD_PTR:%.*]] = getelementptr inbounds i8, i8* [[PTR:%.*]], i64 [[COUNT_BYTECOUNT]] +; CHECK-NEXT: [[MEMCMP:%.*]] = call i32 @memcmp(i8* [[PTR]], i8* [[ADD_PTR]], i64 [[COUNT_BYTECOUNT]]) +; CHECK-NEXT: [[PTR_VS_ADD_PTR_EQCMP:%.*]] = icmp eq i32 [[MEMCMP]], 0 +; CHECK-NEXT: br i1 [[PTR_VS_ADD_PTR_EQCMP]], label [[PTR_VS_ADD_PTR_EQCMP_EQUALBB:%.*]], label [[PTR_VS_ADD_PTR_EQCMP_UNEQUALBB:%.*]] +; CHECK: ptr.vs.add.ptr.eqcmp.equalbb: +; CHECK-NEXT: br label [[CLEANUP_LOOPEXIT:%.*]] +; CHECK: ptr.vs.add.ptr.eqcmp.unequalbb: +; CHECK-NEXT: br label [[CLEANUP_LOOPEXIT]] ; CHECK: cleanup.loopexit: -; CHECK-NEXT: [[RES_PH:%.*]] = phi i1 [ false, [[FOR_BODY]] ], [ true, [[FOR_INC]] ] +; CHECK-NEXT: [[RES_PH:%.*]] = phi i1 [ false, [[PTR_VS_ADD_PTR_EQCMP_UNEQUALBB]] ], [ true, [[PTR_VS_ADD_PTR_EQCMP_EQUALBB]] ] ; CHECK-NEXT: br label [[CLEANUP]] ; CHECK: cleanup: ; CHECK-NEXT: [[RES:%.*]] = phi i1 [ true, [[ENTRY:%.*]] ], [ [[RES_PH]], [[CLEANUP_LOOPEXIT]] ] @@ -1061,28 +950,20 @@ cleanup: define i1 @_Z51combined_iteration_eq_variable_size_partial_overlapPKcm(i8* %ptr, i64 %count) { ; CHECK-LABEL: @_Z51combined_iteration_eq_variable_size_partial_overlapPKcm( ; CHECK-NEXT: entry: -; CHECK-NEXT: [[MUL:%.*]] = shl i64 [[COUNT:%.*]], 1 -; CHECK-NEXT: [[CMP14:%.*]] = icmp eq i64 [[MUL]], 0 -; CHECK-NEXT: br i1 [[CMP14]], 
label [[CLEANUP:%.*]], label [[FOR_BODY_PREHEADER:%.*]] -; CHECK: for.body.preheader: +; CHECK-NEXT: [[MUL_BYTECOUNT:%.*]] = shl i64 [[COUNT:%.*]], 1 +; CHECK-NEXT: [[CMP14:%.*]] = icmp eq i64 [[MUL_BYTECOUNT]], 0 +; CHECK-NEXT: br i1 [[CMP14]], label [[CLEANUP:%.*]], label [[FOR_BODY_BCMPDISPATCHBB:%.*]] +; CHECK: for.body.bcmpdispatchbb: ; CHECK-NEXT: [[ADD_PTR:%.*]] = getelementptr inbounds i8, i8* [[PTR:%.*]], i64 [[COUNT]] -; CHECK-NEXT: br label [[FOR_BODY:%.*]] -; CHECK: for.body: -; CHECK-NEXT: [[I_017:%.*]] = phi i64 [ [[INC:%.*]], [[FOR_INC:%.*]] ], [ 0, [[FOR_BODY_PREHEADER]] ] -; CHECK-NEXT: [[PTR1_016:%.*]] = phi i8* [ [[INCDEC_PTR3:%.*]], [[FOR_INC]] ], [ [[ADD_PTR]], [[FOR_BODY_PREHEADER]] ] -; CHECK-NEXT: [[PTR0_015:%.*]] = phi i8* [ [[INCDEC_PTR:%.*]], [[FOR_INC]] ], [ [[PTR]], [[FOR_BODY_PREHEADER]] ] -; CHECK-NEXT: [[V0:%.*]] = load i8, i8* [[PTR0_015]] -; CHECK-NEXT: [[V1:%.*]] = load i8, i8* [[PTR1_016]] -; CHECK-NEXT: [[CMP2:%.*]] = icmp eq i8 [[V0]], [[V1]] -; CHECK-NEXT: br i1 [[CMP2]], label [[FOR_INC]], label [[CLEANUP_LOOPEXIT:%.*]] -; CHECK: for.inc: -; CHECK-NEXT: [[INC]] = add nuw i64 [[I_017]], 1 -; CHECK-NEXT: [[INCDEC_PTR]] = getelementptr inbounds i8, i8* [[PTR0_015]], i64 1 -; CHECK-NEXT: [[INCDEC_PTR3]] = getelementptr inbounds i8, i8* [[PTR1_016]], i64 1 -; CHECK-NEXT: [[CMP:%.*]] = icmp ult i64 [[INC]], [[MUL]] -; CHECK-NEXT: br i1 [[CMP]], label [[FOR_BODY]], label [[CLEANUP_LOOPEXIT]] +; CHECK-NEXT: [[MEMCMP:%.*]] = call i32 @memcmp(i8* [[PTR]], i8* [[ADD_PTR]], i64 [[MUL_BYTECOUNT]]) +; CHECK-NEXT: [[PTR_VS_ADD_PTR_EQCMP:%.*]] = icmp eq i32 [[MEMCMP]], 0 +; CHECK-NEXT: br i1 [[PTR_VS_ADD_PTR_EQCMP]], label [[PTR_VS_ADD_PTR_EQCMP_EQUALBB:%.*]], label [[PTR_VS_ADD_PTR_EQCMP_UNEQUALBB:%.*]] +; CHECK: ptr.vs.add.ptr.eqcmp.equalbb: +; CHECK-NEXT: br label [[CLEANUP_LOOPEXIT:%.*]] +; CHECK: ptr.vs.add.ptr.eqcmp.unequalbb: +; CHECK-NEXT: br label [[CLEANUP_LOOPEXIT]] ; CHECK: cleanup.loopexit: -; CHECK-NEXT: [[RES_PH:%.*]] = phi i1 
[ false, [[FOR_BODY]] ], [ true, [[FOR_INC]] ] +; CHECK-NEXT: [[RES_PH:%.*]] = phi i1 [ false, [[PTR_VS_ADD_PTR_EQCMP_UNEQUALBB]] ], [ true, [[PTR_VS_ADD_PTR_EQCMP_EQUALBB]] ] ; CHECK-NEXT: br label [[CLEANUP]] ; CHECK: cleanup: ; CHECK-NEXT: [[RES:%.*]] = phi i1 [ true, [[ENTRY:%.*]] ], [ [[RES_PH]], [[CLEANUP_LOOPEXIT]] ] @@ -1121,26 +1002,18 @@ cleanup: define i1 @_Z51combined_iteration_eq_variable_size_overlap_unknownPKcS0_m(i8* %ptr0, i8* %ptr1, i64 %count) { ; CHECK-LABEL: @_Z51combined_iteration_eq_variable_size_overlap_unknownPKcS0_m( ; CHECK-NEXT: entry: -; CHECK-NEXT: [[CMP8:%.*]] = icmp eq i64 [[COUNT:%.*]], 0 -; CHECK-NEXT: br i1 [[CMP8]], label [[CLEANUP:%.*]], label [[FOR_BODY_PREHEADER:%.*]] -; CHECK: for.body.preheader: -; CHECK-NEXT: br label [[FOR_BODY:%.*]] -; CHECK: for.body: -; CHECK-NEXT: [[I_011:%.*]] = phi i64 [ [[INC:%.*]], [[FOR_INC:%.*]] ], [ 0, [[FOR_BODY_PREHEADER]] ] -; CHECK-NEXT: [[PTR1_ADDR_010:%.*]] = phi i8* [ [[INCDEC_PTR3:%.*]], [[FOR_INC]] ], [ [[PTR1:%.*]], [[FOR_BODY_PREHEADER]] ] -; CHECK-NEXT: [[PTR0_ADDR_09:%.*]] = phi i8* [ [[INCDEC_PTR:%.*]], [[FOR_INC]] ], [ [[PTR0:%.*]], [[FOR_BODY_PREHEADER]] ] -; CHECK-NEXT: [[V0:%.*]] = load i8, i8* [[PTR0_ADDR_09]] -; CHECK-NEXT: [[V1:%.*]] = load i8, i8* [[PTR1_ADDR_010]] -; CHECK-NEXT: [[CMP2:%.*]] = icmp eq i8 [[V0]], [[V1]] -; CHECK-NEXT: br i1 [[CMP2]], label [[FOR_INC]], label [[CLEANUP_LOOPEXIT:%.*]] -; CHECK: for.inc: -; CHECK-NEXT: [[INC]] = add nuw i64 [[I_011]], 1 -; CHECK-NEXT: [[INCDEC_PTR]] = getelementptr inbounds i8, i8* [[PTR0_ADDR_09]], i64 1 -; CHECK-NEXT: [[INCDEC_PTR3]] = getelementptr inbounds i8, i8* [[PTR1_ADDR_010]], i64 1 -; CHECK-NEXT: [[CMP:%.*]] = icmp ult i64 [[INC]], [[COUNT]] -; CHECK-NEXT: br i1 [[CMP]], label [[FOR_BODY]], label [[CLEANUP_LOOPEXIT]] +; CHECK-NEXT: [[CMP8:%.*]] = icmp eq i64 [[COUNT_BYTECOUNT:%.*]], 0 +; CHECK-NEXT: br i1 [[CMP8]], label [[CLEANUP:%.*]], label [[FOR_BODY_BCMPDISPATCHBB:%.*]] +; CHECK: for.body.bcmpdispatchbb: +; 
CHECK-NEXT: [[MEMCMP:%.*]] = call i32 @memcmp(i8* [[PTR0:%.*]], i8* [[PTR1:%.*]], i64 [[COUNT_BYTECOUNT]]) +; CHECK-NEXT: [[PTR0_VS_PTR1_EQCMP:%.*]] = icmp eq i32 [[MEMCMP]], 0 +; CHECK-NEXT: br i1 [[PTR0_VS_PTR1_EQCMP]], label [[PTR0_VS_PTR1_EQCMP_EQUALBB:%.*]], label [[PTR0_VS_PTR1_EQCMP_UNEQUALBB:%.*]] +; CHECK: ptr0.vs.ptr1.eqcmp.equalbb: +; CHECK-NEXT: br label [[CLEANUP_LOOPEXIT:%.*]] +; CHECK: ptr0.vs.ptr1.eqcmp.unequalbb: +; CHECK-NEXT: br label [[CLEANUP_LOOPEXIT]] ; CHECK: cleanup.loopexit: -; CHECK-NEXT: [[RES_PH:%.*]] = phi i1 [ false, [[FOR_BODY]] ], [ true, [[FOR_INC]] ] +; CHECK-NEXT: [[RES_PH:%.*]] = phi i1 [ false, [[PTR0_VS_PTR1_EQCMP_UNEQUALBB]] ], [ true, [[PTR0_VS_PTR1_EQCMP_EQUALBB]] ] ; CHECK-NEXT: br label [[CLEANUP]] ; CHECK: cleanup: ; CHECK-NEXT: [[RES:%.*]] = phi i1 [ true, [[ENTRY:%.*]] ], [ [[RES_PH]], [[CLEANUP_LOOPEXIT]] ] @@ -1174,25 +1047,19 @@ cleanup: define i1 @_Z55negated_pointer_iteration_variable_size_overlap_unknownPKcS0_m(i8* %ptr0, i8* %ptr1, i64 %count) { ; CHECK-LABEL: @_Z55negated_pointer_iteration_variable_size_overlap_unknownPKcS0_m( ; CHECK-NEXT: entry: -; CHECK-NEXT: [[ADD_PTR:%.*]] = getelementptr inbounds i8, i8* [[PTR0:%.*]], i64 [[COUNT:%.*]] -; CHECK-NEXT: [[CMP5_I_I:%.*]] = icmp eq i64 [[COUNT]], 0 -; CHECK-NEXT: br i1 [[CMP5_I_I]], label [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT:%.*]], label [[FOR_BODY_I_I_PREHEADER:%.*]] -; CHECK: for.body.i.i.preheader: -; CHECK-NEXT: br label [[FOR_BODY_I_I:%.*]] -; CHECK: for.body.i.i: -; CHECK-NEXT: [[__FIRST2_ADDR_07_I_I:%.*]] = phi i8* [ [[INCDEC_PTR1_I_I:%.*]], [[FOR_INC_I_I:%.*]] ], [ [[PTR1:%.*]], [[FOR_BODY_I_I_PREHEADER]] ] -; CHECK-NEXT: [[__FIRST1_ADDR_06_I_I:%.*]] = phi i8* [ [[INCDEC_PTR_I_I:%.*]], [[FOR_INC_I_I]] ], [ [[PTR0]], [[FOR_BODY_I_I_PREHEADER]] ] -; CHECK-NEXT: [[T0:%.*]] = load i8, i8* [[__FIRST1_ADDR_06_I_I]] -; CHECK-NEXT: [[T1:%.*]] = load i8, i8* [[__FIRST2_ADDR_07_I_I]] -; CHECK-NEXT: [[CMP_I_I_I:%.*]] = icmp eq i8 [[T0]], [[T1]] -; CHECK-NEXT: 
br i1 [[CMP_I_I_I]], label [[FOR_INC_I_I]], label [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT_LOOPEXIT:%.*]] -; CHECK: for.inc.i.i: -; CHECK-NEXT: [[INCDEC_PTR_I_I]] = getelementptr inbounds i8, i8* [[__FIRST1_ADDR_06_I_I]], i64 1 -; CHECK-NEXT: [[INCDEC_PTR1_I_I]] = getelementptr inbounds i8, i8* [[__FIRST2_ADDR_07_I_I]], i64 1 -; CHECK-NEXT: [[CMP_I_I:%.*]] = icmp eq i8* [[INCDEC_PTR_I_I]], [[ADD_PTR]] -; CHECK-NEXT: br i1 [[CMP_I_I]], label [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT_LOOPEXIT]], label [[FOR_BODY_I_I]] +; CHECK-NEXT: [[ADD_PTR:%.*]] = getelementptr inbounds i8, i8* [[PTR0:%.*]], i64 [[COUNT_BYTECOUNT:%.*]] +; CHECK-NEXT: [[CMP5_I_I:%.*]] = icmp eq i64 [[COUNT_BYTECOUNT]], 0 +; CHECK-NEXT: br i1 [[CMP5_I_I]], label [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT:%.*]], label [[FOR_BODY_I_I_BCMPDISPATCHBB:%.*]] +; CHECK: for.body.i.i.bcmpdispatchbb: +; CHECK-NEXT: [[MEMCMP:%.*]] = call i32 @memcmp(i8* [[PTR0]], i8* [[PTR1:%.*]], i64 [[COUNT_BYTECOUNT]]) +; CHECK-NEXT: [[PTR0_VS_PTR1_EQCMP:%.*]] = icmp eq i32 [[MEMCMP]], 0 +; CHECK-NEXT: br i1 [[PTR0_VS_PTR1_EQCMP]], label [[PTR0_VS_PTR1_EQCMP_EQUALBB:%.*]], label [[PTR0_VS_PTR1_EQCMP_UNEQUALBB:%.*]] +; CHECK: ptr0.vs.ptr1.eqcmp.equalbb: +; CHECK-NEXT: br label [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT_LOOPEXIT:%.*]] +; CHECK: ptr0.vs.ptr1.eqcmp.unequalbb: +; CHECK-NEXT: br label [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT_LOOPEXIT]] ; CHECK: _ZNSt3__15equalIPKcS2_EEbT_S3_T0_.exit.loopexit: -; CHECK-NEXT: [[RETVAL_0_I_I_PH:%.*]] = phi i1 [ true, [[FOR_BODY_I_I]] ], [ false, [[FOR_INC_I_I]] ] +; CHECK-NEXT: [[RETVAL_0_I_I_PH:%.*]] = phi i1 [ true, [[PTR0_VS_PTR1_EQCMP_UNEQUALBB]] ], [ false, [[PTR0_VS_PTR1_EQCMP_EQUALBB]] ] ; CHECK-NEXT: br label [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT]] ; CHECK: _ZNSt3__15equalIPKcS2_EEbT_S3_T0_.exit: ; CHECK-NEXT: [[RETVAL_0_I_I:%.*]] = phi i1 [ false, [[ENTRY:%.*]] ], [ [[RETVAL_0_I_I_PH]], [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT_LOOPEXIT]] ] @@ -1227,23 +1094,24 @@ define i1 
@_Z55integer_pointer_iteration ; CHECK-NEXT: entry: ; CHECK-NEXT: [[ADD_PTR:%.*]] = getelementptr inbounds i32, i32* [[PTR0:%.*]], i64 [[COUNT:%.*]] ; CHECK-NEXT: [[CMP5_I_I:%.*]] = icmp eq i64 [[COUNT]], 0 -; CHECK-NEXT: br i1 [[CMP5_I_I]], label [[_ZNST3__15EQUALIPKIS2_EEBT_S3_T0__EXIT:%.*]], label [[FOR_BODY_I_I_PREHEADER:%.*]] -; CHECK: for.body.i.i.preheader: -; CHECK-NEXT: br label [[FOR_BODY_I_I:%.*]] -; CHECK: for.body.i.i: -; CHECK-NEXT: [[__FIRST2_ADDR_07_I_I:%.*]] = phi i32* [ [[INCDEC_PTR1_I_I:%.*]], [[FOR_INC_I_I:%.*]] ], [ [[PTR1:%.*]], [[FOR_BODY_I_I_PREHEADER]] ] -; CHECK-NEXT: [[__FIRST1_ADDR_06_I_I:%.*]] = phi i32* [ [[INCDEC_PTR_I_I:%.*]], [[FOR_INC_I_I]] ], [ [[PTR0]], [[FOR_BODY_I_I_PREHEADER]] ] -; CHECK-NEXT: [[T0:%.*]] = load i32, i32* [[__FIRST1_ADDR_06_I_I]] -; CHECK-NEXT: [[T1:%.*]] = load i32, i32* [[__FIRST2_ADDR_07_I_I]] -; CHECK-NEXT: [[CMP_I_I_I:%.*]] = icmp eq i32 [[T0]], [[T1]] -; CHECK-NEXT: br i1 [[CMP_I_I_I]], label [[FOR_INC_I_I]], label [[_ZNST3__15EQUALIPKIS2_EEBT_S3_T0__EXIT_LOOPEXIT:%.*]] -; CHECK: for.inc.i.i: -; CHECK-NEXT: [[INCDEC_PTR_I_I]] = getelementptr inbounds i32, i32* [[__FIRST1_ADDR_06_I_I]], i64 1 -; CHECK-NEXT: [[INCDEC_PTR1_I_I]] = getelementptr inbounds i32, i32* [[__FIRST2_ADDR_07_I_I]], i64 1 -; CHECK-NEXT: [[CMP_I_I:%.*]] = icmp eq i32* [[INCDEC_PTR_I_I]], [[ADD_PTR]] -; CHECK-NEXT: br i1 [[CMP_I_I]], label [[_ZNST3__15EQUALIPKIS2_EEBT_S3_T0__EXIT_LOOPEXIT]], label [[FOR_BODY_I_I]] +; CHECK-NEXT: br i1 [[CMP5_I_I]], label [[_ZNST3__15EQUALIPKIS2_EEBT_S3_T0__EXIT:%.*]], label [[FOR_BODY_I_I_BCMPDISPATCHBB:%.*]] +; CHECK: for.body.i.i.bcmpdispatchbb: +; CHECK-NEXT: [[TMP0:%.*]] = shl nsw i64 [[COUNT]], 2 +; CHECK-NEXT: [[TMP1:%.*]] = add i64 [[TMP0]], -4 +; CHECK-NEXT: [[TMP2:%.*]] = lshr i64 [[TMP1]], 2 +; CHECK-NEXT: [[TMP3:%.*]] = shl nuw i64 [[TMP2]], 2 +; CHECK-NEXT: [[DOTBYTECOUNT:%.*]] = add i64 [[TMP3]], 4 +; CHECK-NEXT: [[CSTR:%.*]] = bitcast i32* [[PTR0]] to i8* +; CHECK-NEXT: [[CSTR1:%.*]] = 
bitcast i32* [[PTR1:%.*]] to i8* +; CHECK-NEXT: [[MEMCMP:%.*]] = call i32 @memcmp(i8* [[CSTR]], i8* [[CSTR1]], i64 [[DOTBYTECOUNT]]) +; CHECK-NEXT: [[PTR0_VS_PTR1_EQCMP:%.*]] = icmp eq i32 [[MEMCMP]], 0 +; CHECK-NEXT: br i1 [[PTR0_VS_PTR1_EQCMP]], label [[PTR0_VS_PTR1_EQCMP_EQUALBB:%.*]], label [[PTR0_VS_PTR1_EQCMP_UNEQUALBB:%.*]] +; CHECK: ptr0.vs.ptr1.eqcmp.equalbb: +; CHECK-NEXT: br label [[_ZNST3__15EQUALIPKIS2_EEBT_S3_T0__EXIT_LOOPEXIT:%.*]] +; CHECK: ptr0.vs.ptr1.eqcmp.unequalbb: +; CHECK-NEXT: br label [[_ZNST3__15EQUALIPKIS2_EEBT_S3_T0__EXIT_LOOPEXIT]] ; CHECK: _ZNSt3__15equalIPKiS2_EEbT_S3_T0_.exit.loopexit: -; CHECK-NEXT: [[RETVAL_0_I_I_PH:%.*]] = phi i1 [ false, [[FOR_BODY_I_I]] ], [ true, [[FOR_INC_I_I]] ] +; CHECK-NEXT: [[RETVAL_0_I_I_PH:%.*]] = phi i1 [ false, [[PTR0_VS_PTR1_EQCMP_UNEQUALBB]] ], [ true, [[PTR0_VS_PTR1_EQCMP_EQUALBB]] ] ; CHECK-NEXT: br label [[_ZNST3__15EQUALIPKIS2_EEBT_S3_T0__EXIT]] ; CHECK: _ZNSt3__15equalIPKiS2_EEbT_S3_T0_.exit: ; CHECK-NEXT: [[RETVAL_0_I_I:%.*]] = phi i1 [ true, [[ENTRY:%.*]] ], [ [[RETVAL_0_I_I_PH]], [[_ZNST3__15EQUALIPKIS2_EEBT_S3_T0__EXIT_LOOPEXIT]] ] @@ -1277,25 +1145,18 @@ define i1 @_Z21small_index_iterationPKcS ; CHECK-LABEL: @_Z21small_index_iterationPKcS0_i( ; CHECK-NEXT: entry: ; CHECK-NEXT: [[CMP8:%.*]] = icmp sgt i32 [[COUNT:%.*]], 0 -; CHECK-NEXT: br i1 [[CMP8]], label [[FOR_BODY_PREHEADER:%.*]], label [[CLEANUP:%.*]] -; CHECK: for.body.preheader: -; CHECK-NEXT: br label [[FOR_BODY:%.*]] -; CHECK: for.body: -; CHECK-NEXT: [[I_011:%.*]] = phi i32 [ [[INC:%.*]], [[FOR_INC:%.*]] ], [ 0, [[FOR_BODY_PREHEADER]] ] -; CHECK-NEXT: [[PTR1_ADDR_010:%.*]] = phi i8* [ [[INCDEC_PTR3:%.*]], [[FOR_INC]] ], [ [[PTR1:%.*]], [[FOR_BODY_PREHEADER]] ] -; CHECK-NEXT: [[PTR0_ADDR_09:%.*]] = phi i8* [ [[INCDEC_PTR:%.*]], [[FOR_INC]] ], [ [[PTR0:%.*]], [[FOR_BODY_PREHEADER]] ] -; CHECK-NEXT: [[T0:%.*]] = load i8, i8* [[PTR0_ADDR_09]] -; CHECK-NEXT: [[T1:%.*]] = load i8, i8* [[PTR1_ADDR_010]] -; CHECK-NEXT: [[CMP2:%.*]] = 
icmp eq i8 [[T0]], [[T1]] -; CHECK-NEXT: br i1 [[CMP2]], label [[FOR_INC]], label [[CLEANUP_LOOPEXIT:%.*]] -; CHECK: for.inc: -; CHECK-NEXT: [[INC]] = add nuw nsw i32 [[I_011]], 1 -; CHECK-NEXT: [[INCDEC_PTR]] = getelementptr inbounds i8, i8* [[PTR0_ADDR_09]], i64 1 -; CHECK-NEXT: [[INCDEC_PTR3]] = getelementptr inbounds i8, i8* [[PTR1_ADDR_010]], i64 1 -; CHECK-NEXT: [[CMP:%.*]] = icmp slt i32 [[INC]], [[COUNT]] -; CHECK-NEXT: br i1 [[CMP]], label [[FOR_BODY]], label [[CLEANUP_LOOPEXIT]] +; CHECK-NEXT: br i1 [[CMP8]], label [[FOR_BODY_BCMPDISPATCHBB:%.*]], label [[CLEANUP:%.*]] +; CHECK: for.body.bcmpdispatchbb: +; CHECK-NEXT: [[DOTBYTECOUNT:%.*]] = zext i32 [[COUNT]] to i64 +; CHECK-NEXT: [[MEMCMP:%.*]] = call i32 @memcmp(i8* [[PTR0:%.*]], i8* [[PTR1:%.*]], i64 [[DOTBYTECOUNT]]) +; CHECK-NEXT: [[PTR0_VS_PTR1_EQCMP:%.*]] = icmp eq i32 [[MEMCMP]], 0 +; CHECK-NEXT: br i1 [[PTR0_VS_PTR1_EQCMP]], label [[PTR0_VS_PTR1_EQCMP_EQUALBB:%.*]], label [[PTR0_VS_PTR1_EQCMP_UNEQUALBB:%.*]] +; CHECK: ptr0.vs.ptr1.eqcmp.equalbb: +; CHECK-NEXT: br label [[CLEANUP_LOOPEXIT:%.*]] +; CHECK: ptr0.vs.ptr1.eqcmp.unequalbb: +; CHECK-NEXT: br label [[CLEANUP_LOOPEXIT]] ; CHECK: cleanup.loopexit: -; CHECK-NEXT: [[T2_PH:%.*]] = phi i1 [ false, [[FOR_BODY]] ], [ true, [[FOR_INC]] ] +; CHECK-NEXT: [[T2_PH:%.*]] = phi i1 [ false, [[PTR0_VS_PTR1_EQCMP_UNEQUALBB]] ], [ true, [[PTR0_VS_PTR1_EQCMP_EQUALBB]] ] ; CHECK-NEXT: br label [[CLEANUP]] ; CHECK: cleanup: ; CHECK-NEXT: [[T2:%.*]] = phi i1 [ true, [[ENTRY:%.*]] ], [ [[T2_PH]], [[CLEANUP_LOOPEXIT]] ] @@ -1329,24 +1190,22 @@ cleanup: define i1 @_Z23three_pointer_iterationPKcS0_S0_(i8* %ptr0, i8* %ptr0_end, i8* %ptr1) { ; CHECK-LABEL: @_Z23three_pointer_iterationPKcS0_S0_( ; CHECK-NEXT: entry: -; CHECK-NEXT: [[CMP5_I_I:%.*]] = icmp eq i8* [[PTR0:%.*]], [[PTR0_END:%.*]] -; CHECK-NEXT: br i1 [[CMP5_I_I]], label [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT:%.*]], label [[FOR_BODY_I_I_PREHEADER:%.*]] -; CHECK: for.body.i.i.preheader: -; CHECK-NEXT: br 
label [[FOR_BODY_I_I:%.*]] -; CHECK: for.body.i.i: -; CHECK-NEXT: [[__FIRST2_ADDR_07_I_I:%.*]] = phi i8* [ [[INCDEC_PTR1_I_I:%.*]], [[FOR_INC_I_I:%.*]] ], [ [[PTR1:%.*]], [[FOR_BODY_I_I_PREHEADER]] ] -; CHECK-NEXT: [[__FIRST1_ADDR_06_I_I:%.*]] = phi i8* [ [[INCDEC_PTR_I_I:%.*]], [[FOR_INC_I_I]] ], [ [[PTR0]], [[FOR_BODY_I_I_PREHEADER]] ] -; CHECK-NEXT: [[T0:%.*]] = load i8, i8* [[__FIRST1_ADDR_06_I_I]] -; CHECK-NEXT: [[T1:%.*]] = load i8, i8* [[__FIRST2_ADDR_07_I_I]] -; CHECK-NEXT: [[CMP_I_I_I:%.*]] = icmp eq i8 [[T0]], [[T1]] -; CHECK-NEXT: br i1 [[CMP_I_I_I]], label [[FOR_INC_I_I]], label [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT_LOOPEXIT:%.*]] -; CHECK: for.inc.i.i: -; CHECK-NEXT: [[INCDEC_PTR_I_I]] = getelementptr inbounds i8, i8* [[__FIRST1_ADDR_06_I_I]], i64 1 -; CHECK-NEXT: [[INCDEC_PTR1_I_I]] = getelementptr inbounds i8, i8* [[__FIRST2_ADDR_07_I_I]], i64 1 -; CHECK-NEXT: [[CMP_I_I:%.*]] = icmp eq i8* [[INCDEC_PTR_I_I]], [[PTR0_END]] -; CHECK-NEXT: br i1 [[CMP_I_I]], label [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT_LOOPEXIT]], label [[FOR_BODY_I_I]] +; CHECK-NEXT: [[PTR01:%.*]] = ptrtoint i8* [[PTR0:%.*]] to i64 +; CHECK-NEXT: [[CMP5_I_I:%.*]] = icmp eq i8* [[PTR0]], [[PTR0_END:%.*]] +; CHECK-NEXT: br i1 [[CMP5_I_I]], label [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT:%.*]], label [[FOR_BODY_I_I_BCMPDISPATCHBB:%.*]] +; CHECK: for.body.i.i.bcmpdispatchbb: +; CHECK-NEXT: [[TMP0:%.*]] = sub i64 0, [[PTR01]] +; CHECK-NEXT: [[SCEVGEP:%.*]] = getelementptr i8, i8* [[PTR0_END]], i64 [[TMP0]] +; CHECK-NEXT: [[DOTBYTECOUNT:%.*]] = ptrtoint i8* [[SCEVGEP]] to i64 +; CHECK-NEXT: [[MEMCMP:%.*]] = call i32 @memcmp(i8* [[PTR0]], i8* [[PTR1:%.*]], i64 [[DOTBYTECOUNT]]) +; CHECK-NEXT: [[PTR0_VS_PTR1_EQCMP:%.*]] = icmp eq i32 [[MEMCMP]], 0 +; CHECK-NEXT: br i1 [[PTR0_VS_PTR1_EQCMP]], label [[PTR0_VS_PTR1_EQCMP_EQUALBB:%.*]], label [[PTR0_VS_PTR1_EQCMP_UNEQUALBB:%.*]] +; CHECK: ptr0.vs.ptr1.eqcmp.equalbb: +; CHECK-NEXT: br label 
[[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT_LOOPEXIT:%.*]] +; CHECK: ptr0.vs.ptr1.eqcmp.unequalbb: +; CHECK-NEXT: br label [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT_LOOPEXIT]] ; CHECK: _ZNSt3__15equalIPKcS2_EEbT_S3_T0_.exit.loopexit: -; CHECK-NEXT: [[RETVAL_0_I_I_PH:%.*]] = phi i1 [ false, [[FOR_BODY_I_I]] ], [ true, [[FOR_INC_I_I]] ] +; CHECK-NEXT: [[RETVAL_0_I_I_PH:%.*]] = phi i1 [ false, [[PTR0_VS_PTR1_EQCMP_UNEQUALBB]] ], [ true, [[PTR0_VS_PTR1_EQCMP_EQUALBB]] ] ; CHECK-NEXT: br label [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT]] ; CHECK: _ZNSt3__15equalIPKcS2_EEbT_S3_T0_.exit: ; CHECK-NEXT: [[RETVAL_0_I_I:%.*]] = phi i1 [ true, [[ENTRY:%.*]] ], [ [[RETVAL_0_I_I_PH]], [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT_LOOPEXIT]] ] @@ -1378,25 +1237,19 @@ _ZNSt3__15equalIPKcS2_EEbT_S3_T0_.exit: define i32 @_Z17value_propagationPKcS0_mii(i8* %ptr0, i8* %ptr1, i64 %count, i32 %on_equal, i32 %on_unequal) { ; CHECK-LABEL: @_Z17value_propagationPKcS0_mii( ; CHECK-NEXT: entry: -; CHECK-NEXT: [[ADD_PTR:%.*]] = getelementptr inbounds i8, i8* [[PTR0:%.*]], i64 [[COUNT:%.*]] -; CHECK-NEXT: [[CMP5_I_I:%.*]] = icmp eq i64 [[COUNT]], 0 -; CHECK-NEXT: br i1 [[CMP5_I_I]], label [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT:%.*]], label [[FOR_BODY_I_I_PREHEADER:%.*]] -; CHECK: for.body.i.i.preheader: -; CHECK-NEXT: br label [[FOR_BODY_I_I:%.*]] -; CHECK: for.body.i.i: -; CHECK-NEXT: [[__FIRST2_ADDR_07_I_I:%.*]] = phi i8* [ [[INCDEC_PTR1_I_I:%.*]], [[FOR_INC_I_I:%.*]] ], [ [[PTR1:%.*]], [[FOR_BODY_I_I_PREHEADER]] ] -; CHECK-NEXT: [[__FIRST1_ADDR_06_I_I:%.*]] = phi i8* [ [[INCDEC_PTR_I_I:%.*]], [[FOR_INC_I_I]] ], [ [[PTR0]], [[FOR_BODY_I_I_PREHEADER]] ] -; CHECK-NEXT: [[T0:%.*]] = load i8, i8* [[__FIRST1_ADDR_06_I_I]] -; CHECK-NEXT: [[T1:%.*]] = load i8, i8* [[__FIRST2_ADDR_07_I_I]] -; CHECK-NEXT: [[CMP_I_I_I:%.*]] = icmp eq i8 [[T0]], [[T1]] -; CHECK-NEXT: br i1 [[CMP_I_I_I]], label [[FOR_INC_I_I]], label [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT_LOOPEXIT:%.*]] -; CHECK: for.inc.i.i: -; CHECK-NEXT: 
[[INCDEC_PTR_I_I]] = getelementptr inbounds i8, i8* [[__FIRST1_ADDR_06_I_I]], i64 1 -; CHECK-NEXT: [[INCDEC_PTR1_I_I]] = getelementptr inbounds i8, i8* [[__FIRST2_ADDR_07_I_I]], i64 1 -; CHECK-NEXT: [[CMP_I_I:%.*]] = icmp eq i8* [[INCDEC_PTR_I_I]], [[ADD_PTR]] -; CHECK-NEXT: br i1 [[CMP_I_I]], label [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT_LOOPEXIT]], label [[FOR_BODY_I_I]] +; CHECK-NEXT: [[ADD_PTR:%.*]] = getelementptr inbounds i8, i8* [[PTR0:%.*]], i64 [[COUNT_BYTECOUNT:%.*]] +; CHECK-NEXT: [[CMP5_I_I:%.*]] = icmp eq i64 [[COUNT_BYTECOUNT]], 0 +; CHECK-NEXT: br i1 [[CMP5_I_I]], label [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT:%.*]], label [[FOR_BODY_I_I_BCMPDISPATCHBB:%.*]] +; CHECK: for.body.i.i.bcmpdispatchbb: +; CHECK-NEXT: [[MEMCMP:%.*]] = call i32 @memcmp(i8* [[PTR0]], i8* [[PTR1:%.*]], i64 [[COUNT_BYTECOUNT]]) +; CHECK-NEXT: [[PTR0_VS_PTR1_EQCMP:%.*]] = icmp eq i32 [[MEMCMP]], 0 +; CHECK-NEXT: br i1 [[PTR0_VS_PTR1_EQCMP]], label [[PTR0_VS_PTR1_EQCMP_EQUALBB:%.*]], label [[PTR0_VS_PTR1_EQCMP_UNEQUALBB:%.*]] +; CHECK: ptr0.vs.ptr1.eqcmp.equalbb: +; CHECK-NEXT: br label [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT_LOOPEXIT:%.*]] +; CHECK: ptr0.vs.ptr1.eqcmp.unequalbb: +; CHECK-NEXT: br label [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT_LOOPEXIT]] ; CHECK: _ZNSt3__15equalIPKcS2_EEbT_S3_T0_.exit.loopexit: -; CHECK-NEXT: [[T2_PH:%.*]] = phi i32 [ [[ON_UNEQUAL:%.*]], [[FOR_BODY_I_I]] ], [ [[ON_EQUAL:%.*]], [[FOR_INC_I_I]] ] +; CHECK-NEXT: [[T2_PH:%.*]] = phi i32 [ [[ON_UNEQUAL:%.*]], [[PTR0_VS_PTR1_EQCMP_UNEQUALBB]] ], [ [[ON_EQUAL:%.*]], [[PTR0_VS_PTR1_EQCMP_EQUALBB]] ] ; CHECK-NEXT: br label [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT]] ; CHECK: _ZNSt3__15equalIPKcS2_EEbT_S3_T0_.exit: ; CHECK-NEXT: [[T2:%.*]] = phi i32 [ [[ON_EQUAL]], [[ENTRY:%.*]] ], [ [[T2_PH]], [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT_LOOPEXIT]] ] @@ -1429,23 +1282,17 @@ _ZNSt3__15equalIPKcS2_EEbT_S3_T0_.exit: define void @_Z20multiple_exit_blocksPKcS0_m(i8* %ptr0, i8* %ptr1, i64 %count) { ; CHECK-LABEL: 
@_Z20multiple_exit_blocksPKcS0_m( ; CHECK-NEXT: entry: -; CHECK-NEXT: [[ADD_PTR:%.*]] = getelementptr inbounds i8, i8* [[PTR0:%.*]], i64 [[COUNT:%.*]] -; CHECK-NEXT: [[CMP5_I_I:%.*]] = icmp eq i64 [[COUNT]], 0 -; CHECK-NEXT: br i1 [[CMP5_I_I]], label [[IF_END:%.*]], label [[FOR_BODY_I_I_PREHEADER:%.*]] -; CHECK: for.body.i.i.preheader: -; CHECK-NEXT: br label [[FOR_BODY_I_I:%.*]] -; CHECK: for.body.i.i: -; CHECK-NEXT: [[__FIRST2_ADDR_07_I_I:%.*]] = phi i8* [ [[INCDEC_PTR1_I_I:%.*]], [[FOR_INC_I_I:%.*]] ], [ [[PTR1:%.*]], [[FOR_BODY_I_I_PREHEADER]] ] -; CHECK-NEXT: [[__FIRST1_ADDR_06_I_I:%.*]] = phi i8* [ [[INCDEC_PTR_I_I:%.*]], [[FOR_INC_I_I]] ], [ [[PTR0]], [[FOR_BODY_I_I_PREHEADER]] ] -; CHECK-NEXT: [[T0:%.*]] = load i8, i8* [[__FIRST1_ADDR_06_I_I]] -; CHECK-NEXT: [[T1:%.*]] = load i8, i8* [[__FIRST2_ADDR_07_I_I]] -; CHECK-NEXT: [[CMP_I_I_I:%.*]] = icmp eq i8 [[T0]], [[T1]] -; CHECK-NEXT: br i1 [[CMP_I_I_I]], label [[FOR_INC_I_I]], label [[IF_THEN:%.*]] -; CHECK: for.inc.i.i: -; CHECK-NEXT: [[INCDEC_PTR_I_I]] = getelementptr inbounds i8, i8* [[__FIRST1_ADDR_06_I_I]], i64 1 -; CHECK-NEXT: [[INCDEC_PTR1_I_I]] = getelementptr inbounds i8, i8* [[__FIRST2_ADDR_07_I_I]], i64 1 -; CHECK-NEXT: [[CMP_I_I:%.*]] = icmp eq i8* [[INCDEC_PTR_I_I]], [[ADD_PTR]] -; CHECK-NEXT: br i1 [[CMP_I_I]], label [[IF_END_LOOPEXIT:%.*]], label [[FOR_BODY_I_I]] +; CHECK-NEXT: [[ADD_PTR:%.*]] = getelementptr inbounds i8, i8* [[PTR0:%.*]], i64 [[COUNT_BYTECOUNT:%.*]] +; CHECK-NEXT: [[CMP5_I_I:%.*]] = icmp eq i64 [[COUNT_BYTECOUNT]], 0 +; CHECK-NEXT: br i1 [[CMP5_I_I]], label [[IF_END:%.*]], label [[FOR_BODY_I_I_BCMPDISPATCHBB:%.*]] +; CHECK: for.body.i.i.bcmpdispatchbb: +; CHECK-NEXT: [[MEMCMP:%.*]] = call i32 @memcmp(i8* [[PTR0]], i8* [[PTR1:%.*]], i64 [[COUNT_BYTECOUNT]]) +; CHECK-NEXT: [[PTR0_VS_PTR1_EQCMP:%.*]] = icmp eq i32 [[MEMCMP]], 0 +; CHECK-NEXT: br i1 [[PTR0_VS_PTR1_EQCMP]], label [[PTR0_VS_PTR1_EQCMP_EQUALBB:%.*]], label [[PTR0_VS_PTR1_EQCMP_UNEQUALBB:%.*]] +; CHECK: 
ptr0.vs.ptr1.eqcmp.equalbb: +; CHECK-NEXT: br label [[IF_END_LOOPEXIT:%.*]] +; CHECK: ptr0.vs.ptr1.eqcmp.unequalbb: +; CHECK-NEXT: br label [[IF_THEN:%.*]] ; CHECK: if.then: ; CHECK-NEXT: tail call void @_Z17callee_on_unequalv() ; CHECK-NEXT: br label [[RETURN:%.*]] @@ -1493,26 +1340,20 @@ declare void @_Z17callee_on_successv() define void @_Z13multiple_phisPKcS0_mS0_S0_S0_S0_PS0_S1_(i8* %ptr0, i8* %ptr1, i64 %count, i8* %v0, i8* %v1, i8* %v2, i8* %v3, i8** %out0, i8** %out1) { ; CHECK-LABEL: @_Z13multiple_phisPKcS0_mS0_S0_S0_S0_PS0_S1_( ; CHECK-NEXT: entry: -; CHECK-NEXT: [[ADD_PTR:%.*]] = getelementptr inbounds i8, i8* [[PTR0:%.*]], i64 [[COUNT:%.*]] -; CHECK-NEXT: [[CMP5_I_I:%.*]] = icmp eq i64 [[COUNT]], 0 -; CHECK-NEXT: br i1 [[CMP5_I_I]], label [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT:%.*]], label [[FOR_BODY_I_I_PREHEADER:%.*]] -; CHECK: for.body.i.i.preheader: -; CHECK-NEXT: br label [[FOR_BODY_I_I:%.*]] -; CHECK: for.body.i.i: -; CHECK-NEXT: [[__FIRST2_ADDR_07_I_I:%.*]] = phi i8* [ [[INCDEC_PTR1_I_I:%.*]], [[FOR_INC_I_I:%.*]] ], [ [[PTR1:%.*]], [[FOR_BODY_I_I_PREHEADER]] ] -; CHECK-NEXT: [[__FIRST1_ADDR_06_I_I:%.*]] = phi i8* [ [[INCDEC_PTR_I_I:%.*]], [[FOR_INC_I_I]] ], [ [[PTR0]], [[FOR_BODY_I_I_PREHEADER]] ] -; CHECK-NEXT: [[T0:%.*]] = load i8, i8* [[__FIRST1_ADDR_06_I_I]] -; CHECK-NEXT: [[T1:%.*]] = load i8, i8* [[__FIRST2_ADDR_07_I_I]] -; CHECK-NEXT: [[CMP_I_I_I:%.*]] = icmp eq i8 [[T0]], [[T1]] -; CHECK-NEXT: br i1 [[CMP_I_I_I]], label [[FOR_INC_I_I]], label [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT_LOOPEXIT:%.*]] -; CHECK: for.inc.i.i: -; CHECK-NEXT: [[INCDEC_PTR_I_I]] = getelementptr inbounds i8, i8* [[__FIRST1_ADDR_06_I_I]], i64 1 -; CHECK-NEXT: [[INCDEC_PTR1_I_I]] = getelementptr inbounds i8, i8* [[__FIRST2_ADDR_07_I_I]], i64 1 -; CHECK-NEXT: [[CMP_I_I:%.*]] = icmp eq i8* [[INCDEC_PTR_I_I]], [[ADD_PTR]] -; CHECK-NEXT: br i1 [[CMP_I_I]], label [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT_LOOPEXIT]], label [[FOR_BODY_I_I]] +; CHECK-NEXT: [[ADD_PTR:%.*]] = 
getelementptr inbounds i8, i8* [[PTR0:%.*]], i64 [[COUNT_BYTECOUNT:%.*]] +; CHECK-NEXT: [[CMP5_I_I:%.*]] = icmp eq i64 [[COUNT_BYTECOUNT]], 0 +; CHECK-NEXT: br i1 [[CMP5_I_I]], label [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT:%.*]], label [[FOR_BODY_I_I_BCMPDISPATCHBB:%.*]] +; CHECK: for.body.i.i.bcmpdispatchbb: +; CHECK-NEXT: [[MEMCMP:%.*]] = call i32 @memcmp(i8* [[PTR0]], i8* [[PTR1:%.*]], i64 [[COUNT_BYTECOUNT]]) +; CHECK-NEXT: [[PTR0_VS_PTR1_EQCMP:%.*]] = icmp eq i32 [[MEMCMP]], 0 +; CHECK-NEXT: br i1 [[PTR0_VS_PTR1_EQCMP]], label [[PTR0_VS_PTR1_EQCMP_EQUALBB:%.*]], label [[PTR0_VS_PTR1_EQCMP_UNEQUALBB:%.*]] +; CHECK: ptr0.vs.ptr1.eqcmp.equalbb: +; CHECK-NEXT: br label [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT_LOOPEXIT:%.*]] +; CHECK: ptr0.vs.ptr1.eqcmp.unequalbb: +; CHECK-NEXT: br label [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT_LOOPEXIT]] ; CHECK: _ZNSt3__15equalIPKcS2_EEbT_S3_T0_.exit.loopexit: -; CHECK-NEXT: [[T2_PH:%.*]] = phi i8* [ [[V2:%.*]], [[FOR_BODY_I_I]] ], [ [[V0:%.*]], [[FOR_INC_I_I]] ] -; CHECK-NEXT: [[T3_PH:%.*]] = phi i8* [ [[V3:%.*]], [[FOR_BODY_I_I]] ], [ [[V1:%.*]], [[FOR_INC_I_I]] ] +; CHECK-NEXT: [[T2_PH:%.*]] = phi i8* [ [[V2:%.*]], [[PTR0_VS_PTR1_EQCMP_UNEQUALBB]] ], [ [[V0:%.*]], [[PTR0_VS_PTR1_EQCMP_EQUALBB]] ] +; CHECK-NEXT: [[T3_PH:%.*]] = phi i8* [ [[V3:%.*]], [[PTR0_VS_PTR1_EQCMP_UNEQUALBB]] ], [ [[V1:%.*]], [[PTR0_VS_PTR1_EQCMP_EQUALBB]] ] ; CHECK-NEXT: br label [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT]] ; CHECK: _ZNSt3__15equalIPKcS2_EEbT_S3_T0_.exit: ; CHECK-NEXT: [[T2:%.*]] = phi i8* [ [[V0]], [[ENTRY:%.*]] ], [ [[T2_PH]], [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT_LOOPEXIT]] ] @@ -1564,28 +1405,24 @@ define void @_Z16loop_within_loopmPPKcS1 ; CHECK-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i8*, i8** [[PTR0:%.*]], i64 [[I_012]] ; CHECK-NEXT: [[T0:%.*]] = load i8*, i8** [[ARRAYIDX]] ; CHECK-NEXT: [[ARRAYIDX2:%.*]] = getelementptr inbounds i64, i64* [[COUNT:%.*]], i64 [[I_012]] -; CHECK-NEXT: [[T1:%.*]] = load i64, i64* 
[[ARRAYIDX2]] -; CHECK-NEXT: [[ADD_PTR:%.*]] = getelementptr inbounds i8, i8* [[T0]], i64 [[T1]] -; CHECK-NEXT: [[CMP5_I_I:%.*]] = icmp eq i64 [[T1]], 0 +; CHECK-NEXT: [[T1_BYTECOUNT:%.*]] = load i64, i64* [[ARRAYIDX2]] +; CHECK-NEXT: [[ADD_PTR:%.*]] = getelementptr inbounds i8, i8* [[T0]], i64 [[T1_BYTECOUNT]] +; CHECK-NEXT: [[CMP5_I_I:%.*]] = icmp eq i64 [[T1_BYTECOUNT]], 0 ; CHECK-NEXT: br i1 [[CMP5_I_I]], label [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT]], label [[FOR_BODY_I_I_PREHEADER:%.*]] ; CHECK: for.body.i.i.preheader: ; CHECK-NEXT: [[ARRAYIDX3:%.*]] = getelementptr inbounds i8*, i8** [[PTR1:%.*]], i64 [[I_012]] ; CHECK-NEXT: [[T2:%.*]] = load i8*, i8** [[ARRAYIDX3]] -; CHECK-NEXT: br label [[FOR_BODY_I_I:%.*]] -; CHECK: for.body.i.i: -; CHECK-NEXT: [[__FIRST2_ADDR_07_I_I:%.*]] = phi i8* [ [[INCDEC_PTR1_I_I:%.*]], [[FOR_INC_I_I:%.*]] ], [ [[T2]], [[FOR_BODY_I_I_PREHEADER]] ] -; CHECK-NEXT: [[__FIRST1_ADDR_06_I_I:%.*]] = phi i8* [ [[INCDEC_PTR_I_I:%.*]], [[FOR_INC_I_I]] ], [ [[T0]], [[FOR_BODY_I_I_PREHEADER]] ] -; CHECK-NEXT: [[T3:%.*]] = load i8, i8* [[__FIRST1_ADDR_06_I_I]] -; CHECK-NEXT: [[T4:%.*]] = load i8, i8* [[__FIRST2_ADDR_07_I_I]] -; CHECK-NEXT: [[CMP_I_I_I:%.*]] = icmp eq i8 [[T3]], [[T4]] -; CHECK-NEXT: br i1 [[CMP_I_I_I]], label [[FOR_INC_I_I]], label [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT_LOOPEXIT:%.*]] -; CHECK: for.inc.i.i: -; CHECK-NEXT: [[INCDEC_PTR_I_I]] = getelementptr inbounds i8, i8* [[__FIRST1_ADDR_06_I_I]], i64 1 -; CHECK-NEXT: [[INCDEC_PTR1_I_I]] = getelementptr inbounds i8, i8* [[__FIRST2_ADDR_07_I_I]], i64 1 -; CHECK-NEXT: [[CMP_I_I:%.*]] = icmp eq i8* [[INCDEC_PTR_I_I]], [[ADD_PTR]] -; CHECK-NEXT: br i1 [[CMP_I_I]], label [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT_LOOPEXIT]], label [[FOR_BODY_I_I]] +; CHECK-NEXT: [[MEMCMP:%.*]] = call i32 @memcmp(i8* [[T0]], i8* [[T2]], i64 [[T1_BYTECOUNT]]) +; CHECK-NEXT: [[T0_VS_T2_EQCMP:%.*]] = icmp eq i32 [[MEMCMP]], 0 +; CHECK-NEXT: br label [[FOR_BODY_I_I_BCMPDISPATCHBB:%.*]] +; CHECK: 
for.body.i.i.bcmpdispatchbb: +; CHECK-NEXT: br i1 [[T0_VS_T2_EQCMP]], label [[T0_VS_T2_EQCMP_EQUALBB:%.*]], label [[T0_VS_T2_EQCMP_UNEQUALBB:%.*]] +; CHECK: t0.vs.t2.eqcmp.equalbb: +; CHECK-NEXT: br i1 true, label [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT_LOOPEXIT:%.*]], label [[FOR_BODY_I_I_BCMPDISPATCHBB]] +; CHECK: t0.vs.t2.eqcmp.unequalbb: +; CHECK-NEXT: br i1 true, label [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT_LOOPEXIT]], label [[FOR_BODY_I_I_BCMPDISPATCHBB]] ; CHECK: _ZNSt3__15equalIPKcS2_EEbT_S3_T0_.exit.loopexit: -; CHECK-NEXT: [[RETVAL_0_I_I_PH:%.*]] = phi i1 [ false, [[FOR_BODY_I_I]] ], [ true, [[FOR_INC_I_I]] ] +; CHECK-NEXT: [[RETVAL_0_I_I_PH:%.*]] = phi i1 [ false, [[T0_VS_T2_EQCMP_UNEQUALBB]] ], [ true, [[T0_VS_T2_EQCMP_EQUALBB]] ] ; CHECK-NEXT: br label [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT]] ; CHECK: _ZNSt3__15equalIPKcS2_EEbT_S3_T0_.exit: ; CHECK-NEXT: [[RETVAL_0_I_I:%.*]] = phi i1 [ true, [[FOR_BODY]] ], [ [[RETVAL_0_I_I_PH]], [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT_LOOPEXIT]] ] @@ -1651,26 +1488,22 @@ define void @_Z42loop_within_loop_with_m ; CHECK-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i8*, i8** [[PTR0:%.*]], i64 [[I_012]] ; CHECK-NEXT: [[T0:%.*]] = load i8*, i8** [[ARRAYIDX]] ; CHECK-NEXT: [[ARRAYIDX2:%.*]] = getelementptr inbounds i64, i64* [[COUNT:%.*]], i64 [[I_012]] -; CHECK-NEXT: [[T1:%.*]] = load i64, i64* [[ARRAYIDX2]] -; CHECK-NEXT: [[ADD_PTR:%.*]] = getelementptr inbounds i8, i8* [[T0]], i64 [[T1]] -; CHECK-NEXT: [[CMP5_I_I:%.*]] = icmp eq i64 [[T1]], 0 +; CHECK-NEXT: [[T1_BYTECOUNT:%.*]] = load i64, i64* [[ARRAYIDX2]] +; CHECK-NEXT: [[ADD_PTR:%.*]] = getelementptr inbounds i8, i8* [[T0]], i64 [[T1_BYTECOUNT]] +; CHECK-NEXT: [[CMP5_I_I:%.*]] = icmp eq i64 [[T1_BYTECOUNT]], 0 ; CHECK-NEXT: br i1 [[CMP5_I_I]], label [[IF_END]], label [[FOR_BODY_I_I_PREHEADER:%.*]] ; CHECK: for.body.i.i.preheader: ; CHECK-NEXT: [[ARRAYIDX3:%.*]] = getelementptr inbounds i8*, i8** [[PTR1:%.*]], i64 [[I_012]] ; CHECK-NEXT: [[T2:%.*]] = load 
i8*, i8** [[ARRAYIDX3]] -; CHECK-NEXT: br label [[FOR_BODY_I_I:%.*]] -; CHECK: for.body.i.i: -; CHECK-NEXT: [[__FIRST2_ADDR_07_I_I:%.*]] = phi i8* [ [[INCDEC_PTR1_I_I:%.*]], [[FOR_INC_I_I:%.*]] ], [ [[T2]], [[FOR_BODY_I_I_PREHEADER]] ] -; CHECK-NEXT: [[__FIRST1_ADDR_06_I_I:%.*]] = phi i8* [ [[INCDEC_PTR_I_I:%.*]], [[FOR_INC_I_I]] ], [ [[T0]], [[FOR_BODY_I_I_PREHEADER]] ] -; CHECK-NEXT: [[T3:%.*]] = load i8, i8* [[__FIRST1_ADDR_06_I_I]] -; CHECK-NEXT: [[T4:%.*]] = load i8, i8* [[__FIRST2_ADDR_07_I_I]] -; CHECK-NEXT: [[CMP_I_I_I:%.*]] = icmp eq i8 [[T3]], [[T4]] -; CHECK-NEXT: br i1 [[CMP_I_I_I]], label [[FOR_INC_I_I]], label [[IF_THEN:%.*]] -; CHECK: for.inc.i.i: -; CHECK-NEXT: [[INCDEC_PTR_I_I]] = getelementptr inbounds i8, i8* [[__FIRST1_ADDR_06_I_I]], i64 1 -; CHECK-NEXT: [[INCDEC_PTR1_I_I]] = getelementptr inbounds i8, i8* [[__FIRST2_ADDR_07_I_I]], i64 1 -; CHECK-NEXT: [[CMP_I_I:%.*]] = icmp eq i8* [[INCDEC_PTR_I_I]], [[ADD_PTR]] -; CHECK-NEXT: br i1 [[CMP_I_I]], label [[IF_END_LOOPEXIT:%.*]], label [[FOR_BODY_I_I]] +; CHECK-NEXT: [[MEMCMP:%.*]] = call i32 @memcmp(i8* [[T0]], i8* [[T2]], i64 [[T1_BYTECOUNT]]) +; CHECK-NEXT: [[T0_VS_T2_EQCMP:%.*]] = icmp eq i32 [[MEMCMP]], 0 +; CHECK-NEXT: br label [[FOR_BODY_I_I_BCMPDISPATCHBB:%.*]] +; CHECK: for.body.i.i.bcmpdispatchbb: +; CHECK-NEXT: br i1 [[T0_VS_T2_EQCMP]], label [[T0_VS_T2_EQCMP_EQUALBB:%.*]], label [[T0_VS_T2_EQCMP_UNEQUALBB:%.*]] +; CHECK: t0.vs.t2.eqcmp.equalbb: +; CHECK-NEXT: br i1 true, label [[IF_END_LOOPEXIT:%.*]], label [[FOR_BODY_I_I_BCMPDISPATCHBB]] +; CHECK: t0.vs.t2.eqcmp.unequalbb: +; CHECK-NEXT: br i1 true, label [[IF_THEN:%.*]], label [[FOR_BODY_I_I_BCMPDISPATCHBB]] ; CHECK: if.then: ; CHECK-NEXT: tail call void @_Z17callee_on_unequalv() ; CHECK-NEXT: br label [[CLEANUP]] @@ -1740,19 +1573,17 @@ define void @_Z21endless_loop_if_equalPi ; CHECK: for.cond.loopexit: ; CHECK-NEXT: br label [[FOR_COND]] ; CHECK: for.cond: -; CHECK-NEXT: br label [[FOR_BODY:%.*]] -; CHECK: for.cond1: -; CHECK-NEXT: 
[[CMP:%.*]] = icmp ult i64 [[INDVARS_IV_NEXT:%.*]], 4 -; CHECK-NEXT: br i1 [[CMP]], label [[FOR_BODY]], label [[FOR_COND_LOOPEXIT:%.*]] -; CHECK: for.body: -; CHECK-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ 0, [[FOR_COND]] ], [ [[INDVARS_IV_NEXT]], [[FOR_COND1:%.*]] ] -; CHECK-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i32, i32* [[A:%.*]], i64 [[INDVARS_IV]] -; CHECK-NEXT: [[TMP0:%.*]] = load i32, i32* [[ARRAYIDX]] -; CHECK-NEXT: [[ARRAYIDX3:%.*]] = getelementptr inbounds i32, i32* [[B:%.*]], i64 [[INDVARS_IV]] -; CHECK-NEXT: [[TMP1:%.*]] = load i32, i32* [[ARRAYIDX3]] -; CHECK-NEXT: [[CMP4:%.*]] = icmp eq i32 [[TMP0]], [[TMP1]] -; CHECK-NEXT: [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1 -; CHECK-NEXT: br i1 [[CMP4]], label [[FOR_COND1]], label [[RETURN:%.*]] +; CHECK-NEXT: [[CSTR:%.*]] = bitcast i32* [[A:%.*]] to i8* +; CHECK-NEXT: [[CSTR1:%.*]] = bitcast i32* [[B:%.*]] to i8* +; CHECK-NEXT: [[MEMCMP:%.*]] = call i32 @memcmp(i8* [[CSTR]], i8* [[CSTR1]], i64 16) +; CHECK-NEXT: [[A_VS_B_EQCMP:%.*]] = icmp eq i32 [[MEMCMP]], 0 +; CHECK-NEXT: br label [[FOR_BODY_BCMPDISPATCHBB:%.*]] +; CHECK: for.body.bcmpdispatchbb: +; CHECK-NEXT: br i1 [[A_VS_B_EQCMP]], label [[A_VS_B_EQCMP_EQUALBB:%.*]], label [[A_VS_B_EQCMP_UNEQUALBB:%.*]] +; CHECK: a.vs.b.eqcmp.equalbb: +; CHECK-NEXT: br i1 true, label [[FOR_COND_LOOPEXIT:%.*]], label [[FOR_BODY_BCMPDISPATCHBB]] +; CHECK: a.vs.b.eqcmp.unequalbb: +; CHECK-NEXT: br i1 true, label [[RETURN:%.*]], label [[FOR_BODY_BCMPDISPATCHBB]] ; CHECK: return: ; CHECK-NEXT: ret void ; @@ -1784,27 +1615,19 @@ define i1 @_Z21load_of_bitcastsPKcPKfm(i ; CHECK-LABEL: @_Z21load_of_bitcastsPKcPKfm( ; CHECK-NEXT: entry: ; CHECK-NEXT: [[CMP13:%.*]] = icmp eq i64 [[COUNT:%.*]], 0 -; CHECK-NEXT: br i1 [[CMP13]], label [[CLEANUP3:%.*]], label [[FOR_BODY_PREHEADER:%.*]] -; CHECK: for.body.preheader: -; CHECK-NEXT: br label [[FOR_BODY:%.*]] -; CHECK: for.body: -; CHECK-NEXT: [[PTR0_ADDR_016:%.*]] = phi i8* [ [[ADD_PTR:%.*]], [[FOR_INC:%.*]] ], [ 
[[PTR0:%.*]], [[FOR_BODY_PREHEADER]] ] -; CHECK-NEXT: [[I_015:%.*]] = phi i64 [ [[INC:%.*]], [[FOR_INC]] ], [ 0, [[FOR_BODY_PREHEADER]] ] -; CHECK-NEXT: [[PTR1_ADDR_014:%.*]] = phi float* [ [[INCDEC_PTR:%.*]], [[FOR_INC]] ], [ [[PTR1:%.*]], [[FOR_BODY_PREHEADER]] ] -; CHECK-NEXT: [[V0_0__SROA_CAST:%.*]] = bitcast i8* [[PTR0_ADDR_016]] to i32* -; CHECK-NEXT: [[V0_0_COPYLOAD:%.*]] = load i32, i32* [[V0_0__SROA_CAST]] -; CHECK-NEXT: [[V1_0__SROA_CAST:%.*]] = bitcast float* [[PTR1_ADDR_014]] to i32* -; CHECK-NEXT: [[V1_0_COPYLOAD:%.*]] = load i32, i32* [[V1_0__SROA_CAST]] -; CHECK-NEXT: [[CMP1:%.*]] = icmp eq i32 [[V0_0_COPYLOAD]], [[V1_0_COPYLOAD]] -; CHECK-NEXT: br i1 [[CMP1]], label [[FOR_INC]], label [[CLEANUP3_LOOPEXIT:%.*]] -; CHECK: for.inc: -; CHECK-NEXT: [[INC]] = add nuw i64 [[I_015]], 1 -; CHECK-NEXT: [[ADD_PTR]] = getelementptr inbounds i8, i8* [[PTR0_ADDR_016]], i64 4 -; CHECK-NEXT: [[INCDEC_PTR]] = getelementptr inbounds float, float* [[PTR1_ADDR_014]], i64 1 -; CHECK-NEXT: [[CMP:%.*]] = icmp ult i64 [[INC]], [[COUNT]] -; CHECK-NEXT: br i1 [[CMP]], label [[FOR_BODY]], label [[CLEANUP3_LOOPEXIT]] +; CHECK-NEXT: br i1 [[CMP13]], label [[CLEANUP3:%.*]], label [[FOR_BODY_BCMPDISPATCHBB:%.*]] +; CHECK: for.body.bcmpdispatchbb: +; CHECK-NEXT: [[DOTBYTECOUNT:%.*]] = shl nuw i64 [[COUNT]], 2 +; CHECK-NEXT: [[CSTR:%.*]] = bitcast float* [[PTR1:%.*]] to i8* +; CHECK-NEXT: [[MEMCMP:%.*]] = call i32 @memcmp(i8* [[PTR0:%.*]], i8* [[CSTR]], i64 [[DOTBYTECOUNT]]) +; CHECK-NEXT: [[PTR0_VS_PTR1_EQCMP:%.*]] = icmp eq i32 [[MEMCMP]], 0 +; CHECK-NEXT: br i1 [[PTR0_VS_PTR1_EQCMP]], label [[PTR0_VS_PTR1_EQCMP_EQUALBB:%.*]], label [[PTR0_VS_PTR1_EQCMP_UNEQUALBB:%.*]] +; CHECK: ptr0.vs.ptr1.eqcmp.equalbb: +; CHECK-NEXT: br label [[CLEANUP3_LOOPEXIT:%.*]] +; CHECK: ptr0.vs.ptr1.eqcmp.unequalbb: +; CHECK-NEXT: br label [[CLEANUP3_LOOPEXIT]] ; CHECK: cleanup3.loopexit: -; CHECK-NEXT: [[RES_PH:%.*]] = phi i1 [ false, [[FOR_BODY]] ], [ true, [[FOR_INC]] ] +; CHECK-NEXT: 
[[RES_PH:%.*]] = phi i1 [ false, [[PTR0_VS_PTR1_EQCMP_UNEQUALBB]] ], [ true, [[PTR0_VS_PTR1_EQCMP_EQUALBB]] ] ; CHECK-NEXT: br label [[CLEANUP3]] ; CHECK: cleanup3: ; CHECK-NEXT: [[RES:%.*]] = phi i1 [ true, [[ENTRY:%.*]] ], [ [[RES_PH]], [[CLEANUP3_LOOPEXIT]] ] @@ -1898,23 +1721,17 @@ cleanup4: define i1 @exit_block_is_not_dedicated(i8* %ptr0, i8* %ptr1) { ; CHECK-LABEL: @exit_block_is_not_dedicated( ; CHECK-NEXT: entry: -; CHECK-NEXT: br i1 true, label [[FOR_BODY_PREHEADER:%.*]], label [[CLEANUP:%.*]] -; CHECK: for.body.preheader: -; CHECK-NEXT: br label [[FOR_BODY:%.*]] -; CHECK: for.body: -; CHECK-NEXT: [[I_08:%.*]] = phi i64 [ [[INC:%.*]], [[FOR_COND:%.*]] ], [ 0, [[FOR_BODY_PREHEADER]] ] -; CHECK-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i8, i8* [[PTR0:%.*]], i64 [[I_08]] -; CHECK-NEXT: [[V0:%.*]] = load i8, i8* [[ARRAYIDX]] -; CHECK-NEXT: [[ARRAYIDX1:%.*]] = getelementptr inbounds i8, i8* [[PTR1:%.*]], i64 [[I_08]] -; CHECK-NEXT: [[V1:%.*]] = load i8, i8* [[ARRAYIDX1]] -; CHECK-NEXT: [[CMP3:%.*]] = icmp eq i8 [[V0]], [[V1]] -; CHECK-NEXT: [[INC]] = add nuw nsw i64 [[I_08]], 1 -; CHECK-NEXT: br i1 [[CMP3]], label [[FOR_COND]], label [[CLEANUP_LOOPEXIT:%.*]] -; CHECK: for.cond: -; CHECK-NEXT: [[CMP:%.*]] = icmp ult i64 [[INC]], 8 -; CHECK-NEXT: br i1 [[CMP]], label [[FOR_BODY]], label [[CLEANUP_LOOPEXIT]] +; CHECK-NEXT: br i1 true, label [[FOR_BODY_BCMPDISPATCHBB:%.*]], label [[CLEANUP:%.*]] +; CHECK: for.body.bcmpdispatchbb: +; CHECK-NEXT: [[MEMCMP:%.*]] = call i32 @memcmp(i8* [[PTR0:%.*]], i8* [[PTR1:%.*]], i64 8) +; CHECK-NEXT: [[PTR0_VS_PTR1_EQCMP:%.*]] = icmp eq i32 [[MEMCMP]], 0 +; CHECK-NEXT: br i1 [[PTR0_VS_PTR1_EQCMP]], label [[PTR0_VS_PTR1_EQCMP_EQUALBB:%.*]], label [[PTR0_VS_PTR1_EQCMP_UNEQUALBB:%.*]] +; CHECK: ptr0.vs.ptr1.eqcmp.equalbb: +; CHECK-NEXT: br label [[CLEANUP_LOOPEXIT:%.*]] +; CHECK: ptr0.vs.ptr1.eqcmp.unequalbb: +; CHECK-NEXT: br label [[CLEANUP_LOOPEXIT]] ; CHECK: cleanup.loopexit: -; CHECK-NEXT: [[RES_PH:%.*]] = phi i1 [ true, 
[[FOR_COND]] ], [ false, [[FOR_BODY]] ] +; CHECK-NEXT: [[RES_PH:%.*]] = phi i1 [ true, [[PTR0_VS_PTR1_EQCMP_EQUALBB]] ], [ false, [[PTR0_VS_PTR1_EQCMP_UNEQUALBB]] ] ; CHECK-NEXT: br label [[CLEANUP]] ; CHECK: cleanup: ; CHECK-NEXT: [[RES:%.*]] = phi i1 [ false, [[ENTRY:%.*]] ], [ [[RES_PH]], [[CLEANUP_LOOPEXIT]] ] Modified: llvm/trunk/test/Transforms/LoopIdiom/bcmp-debugify-remarks.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/LoopIdiom/bcmp-debugify-remarks.ll?rev=374662&r1=374661&r2=374662&view=diff ============================================================================== --- llvm/trunk/test/Transforms/LoopIdiom/bcmp-debugify-remarks.ll (original) +++ llvm/trunk/test/Transforms/LoopIdiom/bcmp-debugify-remarks.ll Sat Oct 12 08:35:32 2019 @@ -1,5 +1,5 @@ ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py -; RUN: opt -debugify -loop-idiom < %s -S 2>&1 | FileCheck %s +; RUN: opt -debugify -loop-idiom -pass-remarks=loop-idiom -pass-remarks-analysis=loop-idiom -verify -verify-each -verify-dom-info -verify-loop-info < %s -S 2>&1 | FileCheck %s target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64" @@ -23,38 +23,37 @@ target datalayout = "e-p:64:64:64-i1:8:8 ; sink(std::equal(ptr0[i], ptr0[i] + count[i], ptr1[i])); ; } +; CHECK: remark: :13:1: Loop recognized as a bcmp idiom +; CHECK: remark: :11:1: Transformed bcmp idiom into a call to memcmp() function +; CHECK: remark: :29:1: Loop recognized as a bcmp idiom +; CHECK: remark: :34:1: Transformed bcmp idiom into a call to memcmp() function + define i1 @_Z43index_iteration_eq_variable_size_no_overlapPKcm(i8* nocapture %ptr, i64 %count) { ; CHECK-LABEL: @_Z43index_iteration_eq_variable_size_no_overlapPKcm( ; CHECK-NEXT: entry: -; CHECK-NEXT: [[ADD_PTR:%.*]] = getelementptr inbounds i8, i8* [[PTR:%.*]], i64 [[COUNT:%.*]], !dbg !22 +; CHECK-NEXT: [[ADD_PTR:%.*]] = 
getelementptr inbounds i8, i8* [[PTR:%.*]], i64 [[COUNT_BYTECOUNT:%.*]], !dbg !22 ; CHECK-NEXT: call void @llvm.dbg.value(metadata i8* [[ADD_PTR]], metadata !9, metadata !DIExpression()), !dbg !22 -; CHECK-NEXT: [[CMP14:%.*]] = icmp eq i64 [[COUNT]], 0, !dbg !23 +; CHECK-NEXT: [[CMP14:%.*]] = icmp eq i64 [[COUNT_BYTECOUNT]], 0, !dbg !23 ; CHECK-NEXT: call void @llvm.dbg.value(metadata i1 [[CMP14]], metadata !11, metadata !DIExpression()), !dbg !23 -; CHECK-NEXT: br i1 [[CMP14]], label [[CLEANUP:%.*]], label [[FOR_BODY_PREHEADER:%.*]], !dbg !24 -; CHECK: for.body.preheader: -; CHECK-NEXT: br label [[FOR_BODY:%.*]], !dbg !25 -; CHECK: for.cond: -; CHECK-NEXT: [[CMP:%.*]] = icmp ult i64 [[INC:%.*]], [[COUNT]], !dbg !26 -; CHECK-NEXT: call void @llvm.dbg.value(metadata i1 [[CMP]], metadata !13, metadata !DIExpression()), !dbg !26 -; CHECK-NEXT: br i1 [[CMP]], label [[FOR_BODY]], label [[CLEANUP_LOOPEXIT:%.*]], !dbg !27 -; CHECK: for.body: -; CHECK-NEXT: [[I_015:%.*]] = phi i64 [ [[INC]], [[FOR_COND:%.*]] ], [ 0, [[FOR_BODY_PREHEADER]] ], !dbg !28 -; CHECK-NEXT: call void @llvm.dbg.value(metadata i64 [[I_015]], metadata !14, metadata !DIExpression()), !dbg !28 -; CHECK-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i8, i8* [[PTR]], i64 [[I_015]], !dbg !29 -; CHECK-NEXT: call void @llvm.dbg.value(metadata i8* [[ARRAYIDX]], metadata !15, metadata !DIExpression()), !dbg !29 -; CHECK-NEXT: [[V0:%.*]] = load i8, i8* [[ARRAYIDX]], !dbg !30 -; CHECK-NEXT: call void @llvm.dbg.value(metadata i8 [[V0]], metadata !16, metadata !DIExpression()), !dbg !30 -; CHECK-NEXT: [[ARRAYIDX1:%.*]] = getelementptr inbounds i8, i8* [[ADD_PTR]], i64 [[I_015]], !dbg !31 -; CHECK-NEXT: call void @llvm.dbg.value(metadata i8* [[ARRAYIDX1]], metadata !17, metadata !DIExpression()), !dbg !31 -; CHECK-NEXT: [[V1:%.*]] = load i8, i8* [[ARRAYIDX1]], !dbg !32 -; CHECK-NEXT: call void @llvm.dbg.value(metadata i8 [[V1]], metadata !18, metadata !DIExpression()), !dbg !32 -; CHECK-NEXT: [[CMP3:%.*]] = icmp 
eq i8 [[V0]], [[V1]], !dbg !33 -; CHECK-NEXT: call void @llvm.dbg.value(metadata i1 [[CMP3]], metadata !19, metadata !DIExpression()), !dbg !33 -; CHECK-NEXT: [[INC]] = add nuw i64 [[I_015]], 1, !dbg !34 -; CHECK-NEXT: call void @llvm.dbg.value(metadata i64 [[INC]], metadata !20, metadata !DIExpression()), !dbg !34 -; CHECK-NEXT: br i1 [[CMP3]], label [[FOR_COND]], label [[CLEANUP_LOOPEXIT]], !dbg !25 +; CHECK-NEXT: br i1 [[CMP14]], label [[CLEANUP:%.*]], label [[FOR_BODY_BCMPDISPATCHBB:%.*]], !dbg !24 +; CHECK: for.body.bcmpdispatchbb: +; CHECK-NEXT: [[MEMCMP:%.*]] = call i32 @memcmp(i8* [[PTR]], i8* [[ADD_PTR]], i64 [[COUNT_BYTECOUNT]]), !dbg !25 +; CHECK-NEXT: [[PTR_VS_ADD_PTR_EQCMP:%.*]] = icmp eq i32 [[MEMCMP]], 0, !dbg !25 +; CHECK-NEXT: call void @llvm.dbg.value(metadata i32 undef, metadata !14, metadata !DIExpression()), !dbg !26 +; CHECK-NEXT: call void @llvm.dbg.value(metadata i32 undef, metadata !15, metadata !DIExpression()), !dbg !27 +; CHECK-NEXT: call void @llvm.dbg.value(metadata i32 undef, metadata !16, metadata !DIExpression()), !dbg !28 +; CHECK-NEXT: call void @llvm.dbg.value(metadata i32 undef, metadata !17, metadata !DIExpression()), !dbg !29 +; CHECK-NEXT: call void @llvm.dbg.value(metadata i32 undef, metadata !18, metadata !DIExpression()), !dbg !30 +; CHECK-NEXT: call void @llvm.dbg.value(metadata i32 undef, metadata !19, metadata !DIExpression()), !dbg !25 +; CHECK-NEXT: call void @llvm.dbg.value(metadata i32 undef, metadata !20, metadata !DIExpression()), !dbg !31 +; CHECK-NEXT: call void @llvm.dbg.value(metadata i32 undef, metadata !13, metadata !DIExpression()), !dbg !32 +; CHECK-NEXT: br i1 [[PTR_VS_ADD_PTR_EQCMP]], label [[PTR_VS_ADD_PTR_EQCMP_EQUALBB:%.*]], label [[PTR_VS_ADD_PTR_EQCMP_UNEQUALBB:%.*]], !dbg !25 +; CHECK: ptr.vs.add.ptr.eqcmp.equalbb: +; CHECK-NEXT: br label [[CLEANUP_LOOPEXIT:%.*]], !dbg !33 +; CHECK: ptr.vs.add.ptr.eqcmp.unequalbb: +; CHECK-NEXT: br label [[CLEANUP_LOOPEXIT]], !dbg !34 ; CHECK: cleanup.loopexit: -; 
CHECK-NEXT: [[RES_PH:%.*]] = phi i1 [ false, [[FOR_BODY]] ], [ true, [[FOR_COND]] ] +; CHECK-NEXT: [[RES_PH:%.*]] = phi i1 [ false, [[PTR_VS_ADD_PTR_EQCMP_UNEQUALBB]] ], [ true, [[PTR_VS_ADD_PTR_EQCMP_EQUALBB]] ] ; CHECK-NEXT: br label [[CLEANUP]], !dbg !35 ; CHECK: cleanup: ; CHECK-NEXT: [[RES:%.*]] = phi i1 [ true, [[ENTRY:%.*]] ], [ [[RES_PH]], [[CLEANUP_LOOPEXIT]] ], !dbg !36 @@ -106,11 +105,11 @@ define void @_Z16loop_within_loopmPPKcS1 ; CHECK-NEXT: call void @llvm.dbg.value(metadata i8* [[T0]], metadata !42, metadata !DIExpression()), !dbg !66 ; CHECK-NEXT: [[ARRAYIDX2:%.*]] = getelementptr inbounds i64, i64* [[COUNT:%.*]], i64 [[I_012]], !dbg !67 ; CHECK-NEXT: call void @llvm.dbg.value(metadata i64* [[ARRAYIDX2]], metadata !43, metadata !DIExpression()), !dbg !67 -; CHECK-NEXT: [[T1:%.*]] = load i64, i64* [[ARRAYIDX2]], !dbg !68 -; CHECK-NEXT: call void @llvm.dbg.value(metadata i64 [[T1]], metadata !44, metadata !DIExpression()), !dbg !68 -; CHECK-NEXT: [[ADD_PTR:%.*]] = getelementptr inbounds i8, i8* [[T0]], i64 [[T1]], !dbg !69 +; CHECK-NEXT: [[T1_BYTECOUNT:%.*]] = load i64, i64* [[ARRAYIDX2]], !dbg !68 +; CHECK-NEXT: call void @llvm.dbg.value(metadata i64 [[T1_BYTECOUNT]], metadata !44, metadata !DIExpression()), !dbg !68 +; CHECK-NEXT: [[ADD_PTR:%.*]] = getelementptr inbounds i8, i8* [[T0]], i64 [[T1_BYTECOUNT]], !dbg !69 ; CHECK-NEXT: call void @llvm.dbg.value(metadata i8* [[ADD_PTR]], metadata !45, metadata !DIExpression()), !dbg !69 -; CHECK-NEXT: [[CMP5_I_I:%.*]] = icmp eq i64 [[T1]], 0, !dbg !70 +; CHECK-NEXT: [[CMP5_I_I:%.*]] = icmp eq i64 [[T1_BYTECOUNT]], 0, !dbg !70 ; CHECK-NEXT: call void @llvm.dbg.value(metadata i1 [[CMP5_I_I]], metadata !46, metadata !DIExpression()), !dbg !70 ; CHECK-NEXT: br i1 [[CMP5_I_I]], label [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT]], label [[FOR_BODY_I_I_PREHEADER:%.*]], !dbg !62 ; CHECK: for.body.i.i.preheader: @@ -118,39 +117,35 @@ define void @_Z16loop_within_loopmPPKcS1 ; CHECK-NEXT: call void 
@llvm.dbg.value(metadata i8** [[ARRAYIDX3]], metadata !47, metadata !DIExpression()), !dbg !71 ; CHECK-NEXT: [[T2:%.*]] = load i8*, i8** [[ARRAYIDX3]], !dbg !72 ; CHECK-NEXT: call void @llvm.dbg.value(metadata i8* [[T2]], metadata !48, metadata !DIExpression()), !dbg !72 -; CHECK-NEXT: br label [[FOR_BODY_I_I:%.*]], !dbg !73 -; CHECK: for.body.i.i: -; CHECK-NEXT: [[__FIRST2_ADDR_07_I_I:%.*]] = phi i8* [ [[INCDEC_PTR1_I_I:%.*]], [[FOR_INC_I_I:%.*]] ], [ [[T2]], [[FOR_BODY_I_I_PREHEADER]] ], !dbg !74 -; CHECK-NEXT: [[__FIRST1_ADDR_06_I_I:%.*]] = phi i8* [ [[INCDEC_PTR_I_I:%.*]], [[FOR_INC_I_I]] ], [ [[T0]], [[FOR_BODY_I_I_PREHEADER]] ], !dbg !75 -; CHECK-NEXT: call void @llvm.dbg.value(metadata i8* [[__FIRST2_ADDR_07_I_I]], metadata !49, metadata !DIExpression()), !dbg !74 -; CHECK-NEXT: call void @llvm.dbg.value(metadata i8* [[__FIRST1_ADDR_06_I_I]], metadata !50, metadata !DIExpression()), !dbg !75 -; CHECK-NEXT: [[T3:%.*]] = load i8, i8* [[__FIRST1_ADDR_06_I_I]], !dbg !76 -; CHECK-NEXT: call void @llvm.dbg.value(metadata i8 [[T3]], metadata !51, metadata !DIExpression()), !dbg !76 -; CHECK-NEXT: [[T4:%.*]] = load i8, i8* [[__FIRST2_ADDR_07_I_I]], !dbg !77 -; CHECK-NEXT: call void @llvm.dbg.value(metadata i8 [[T4]], metadata !52, metadata !DIExpression()), !dbg !77 -; CHECK-NEXT: [[CMP_I_I_I:%.*]] = icmp eq i8 [[T3]], [[T4]], !dbg !78 -; CHECK-NEXT: call void @llvm.dbg.value(metadata i1 [[CMP_I_I_I]], metadata !53, metadata !DIExpression()), !dbg !78 -; CHECK-NEXT: br i1 [[CMP_I_I_I]], label [[FOR_INC_I_I]], label [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT_LOOPEXIT:%.*]], !dbg !79 -; CHECK: for.inc.i.i: -; CHECK-NEXT: [[INCDEC_PTR_I_I]] = getelementptr inbounds i8, i8* [[__FIRST1_ADDR_06_I_I]], i64 1, !dbg !80 -; CHECK-NEXT: call void @llvm.dbg.value(metadata i8* [[INCDEC_PTR_I_I]], metadata !54, metadata !DIExpression()), !dbg !80 -; CHECK-NEXT: [[INCDEC_PTR1_I_I]] = getelementptr inbounds i8, i8* [[__FIRST2_ADDR_07_I_I]], i64 1, !dbg !81 -; CHECK-NEXT: call void 
@llvm.dbg.value(metadata i8* [[INCDEC_PTR1_I_I]], metadata !55, metadata !DIExpression()), !dbg !81 -; CHECK-NEXT: [[CMP_I_I:%.*]] = icmp eq i8* [[INCDEC_PTR_I_I]], [[ADD_PTR]], !dbg !82 -; CHECK-NEXT: call void @llvm.dbg.value(metadata i1 [[CMP_I_I]], metadata !56, metadata !DIExpression()), !dbg !82 -; CHECK-NEXT: br i1 [[CMP_I_I]], label [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT_LOOPEXIT]], label [[FOR_BODY_I_I]], !dbg !83 +; CHECK-NEXT: [[MEMCMP:%.*]] = call i32 @memcmp(i8* [[T0]], i8* [[T2]], i64 [[T1_BYTECOUNT]]), !dbg !73 +; CHECK-NEXT: [[T0_VS_T2_EQCMP:%.*]] = icmp eq i32 [[MEMCMP]], 0, !dbg !73 +; CHECK-NEXT: br label [[FOR_BODY_I_I_BCMPDISPATCHBB:%.*]] +; CHECK: for.body.i.i.bcmpdispatchbb: +; CHECK-NEXT: call void @llvm.dbg.value(metadata i32 undef, metadata !49, metadata !DIExpression()), !dbg !74 +; CHECK-NEXT: call void @llvm.dbg.value(metadata i32 undef, metadata !50, metadata !DIExpression()), !dbg !75 +; CHECK-NEXT: call void @llvm.dbg.value(metadata i32 undef, metadata !51, metadata !DIExpression()), !dbg !76 +; CHECK-NEXT: call void @llvm.dbg.value(metadata i32 undef, metadata !52, metadata !DIExpression()), !dbg !77 +; CHECK-NEXT: call void @llvm.dbg.value(metadata i32 undef, metadata !53, metadata !DIExpression()), !dbg !73 +; CHECK-NEXT: call void @llvm.dbg.value(metadata i32 undef, metadata !54, metadata !DIExpression()), !dbg !78 +; CHECK-NEXT: call void @llvm.dbg.value(metadata i32 undef, metadata !55, metadata !DIExpression()), !dbg !79 +; CHECK-NEXT: call void @llvm.dbg.value(metadata i32 undef, metadata !56, metadata !DIExpression()), !dbg !80 +; CHECK-NEXT: br i1 [[T0_VS_T2_EQCMP]], label [[T0_VS_T2_EQCMP_EQUALBB:%.*]], label [[T0_VS_T2_EQCMP_UNEQUALBB:%.*]], !dbg !73 +; CHECK: t0.vs.t2.eqcmp.equalbb: +; CHECK-NEXT: br i1 true, label [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT_LOOPEXIT:%.*]], label [[FOR_BODY_I_I_BCMPDISPATCHBB]], !dbg !81 +; CHECK: t0.vs.t2.eqcmp.unequalbb: +; CHECK-NEXT: br i1 true, label 
[[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT_LOOPEXIT]], label [[FOR_BODY_I_I_BCMPDISPATCHBB]], !dbg !82 ; CHECK: _ZNSt3__15equalIPKcS2_EEbT_S3_T0_.exit.loopexit: -; CHECK-NEXT: [[RETVAL_0_I_I_PH:%.*]] = phi i1 [ false, [[FOR_BODY_I_I]] ], [ true, [[FOR_INC_I_I]] ] -; CHECK-NEXT: br label [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT]], !dbg !84 +; CHECK-NEXT: [[RETVAL_0_I_I_PH:%.*]] = phi i1 [ false, [[T0_VS_T2_EQCMP_UNEQUALBB]] ], [ true, [[T0_VS_T2_EQCMP_EQUALBB]] ] +; CHECK-NEXT: br label [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT]], !dbg !83 ; CHECK: _ZNSt3__15equalIPKcS2_EEbT_S3_T0_.exit: -; CHECK-NEXT: [[RETVAL_0_I_I:%.*]] = phi i1 [ true, [[FOR_BODY]] ], [ [[RETVAL_0_I_I_PH]], [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT_LOOPEXIT]] ], !dbg !85 -; CHECK-NEXT: call void @llvm.dbg.value(metadata i1 [[RETVAL_0_I_I]], metadata !57, metadata !DIExpression()), !dbg !85 -; CHECK-NEXT: tail call void @_Z4sinkb(i1 [[RETVAL_0_I_I]]), !dbg !84 -; CHECK-NEXT: [[INC]] = add nuw i64 [[I_012]], 1, !dbg !86 -; CHECK-NEXT: call void @llvm.dbg.value(metadata i64 [[INC]], metadata !58, metadata !DIExpression()), !dbg !86 -; CHECK-NEXT: [[CMP:%.*]] = icmp eq i64 [[INC]], [[OUTER_COUNT]], !dbg !87 -; CHECK-NEXT: call void @llvm.dbg.value(metadata i1 [[CMP]], metadata !59, metadata !DIExpression()), !dbg !87 -; CHECK-NEXT: br i1 [[CMP]], label [[FOR_COND_CLEANUP_LOOPEXIT:%.*]], label [[FOR_BODY]], !dbg !88 +; CHECK-NEXT: [[RETVAL_0_I_I:%.*]] = phi i1 [ true, [[FOR_BODY]] ], [ [[RETVAL_0_I_I_PH]], [[_ZNST3__15EQUALIPKCS2_EEBT_S3_T0__EXIT_LOOPEXIT]] ], !dbg !84 +; CHECK-NEXT: call void @llvm.dbg.value(metadata i1 [[RETVAL_0_I_I]], metadata !57, metadata !DIExpression()), !dbg !84 +; CHECK-NEXT: tail call void @_Z4sinkb(i1 [[RETVAL_0_I_I]]), !dbg !83 +; CHECK-NEXT: [[INC]] = add nuw i64 [[I_012]], 1, !dbg !85 +; CHECK-NEXT: call void @llvm.dbg.value(metadata i64 [[INC]], metadata !58, metadata !DIExpression()), !dbg !85 +; CHECK-NEXT: [[CMP:%.*]] = icmp eq i64 [[INC]], [[OUTER_COUNT]], !dbg !86 +; 
CHECK-NEXT: call void @llvm.dbg.value(metadata i1 [[CMP]], metadata !59, metadata !DIExpression()), !dbg !86 +; CHECK-NEXT: br i1 [[CMP]], label [[FOR_COND_CLEANUP_LOOPEXIT:%.*]], label [[FOR_BODY]], !dbg !87 ; entry: %cmp11 = icmp eq i64 %outer_count, 0 Modified: llvm/trunk/test/Transforms/LoopIdiom/bcmp-widening.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/LoopIdiom/bcmp-widening.ll?rev=374662&r1=374661&r2=374662&view=diff ============================================================================== --- llvm/trunk/test/Transforms/LoopIdiom/bcmp-widening.ll (original) +++ llvm/trunk/test/Transforms/LoopIdiom/bcmp-widening.ll Sat Oct 12 08:35:32 2019 @@ -1,5 +1,5 @@ ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py -; RUN: opt -loop-idiom < %s -S | FileCheck %s +; RUN: opt -loop-idiom -verify -verify-each -verify-dom-info -verify-loop-info < %s -S | FileCheck %s target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64" From llvm-commits at lists.llvm.org Sat Oct 12 08:37:48 2019 From: llvm-commits at lists.llvm.org (Roman Lebedev via Phabricator via llvm-commits) Date: Sat, 12 Oct 2019 15:37:48 +0000 (UTC) Subject: [PATCH] D61144: [LoopIdiomRecognize] BCmp loop idiom recognition In-Reply-To: References: Message-ID: This revision was not accepted when it landed; it landed in state "Needs Review". This revision was automatically updated to reflect the committed changes. Closed by commit rG76cdcf25b883: [LoopIdiomRecognize] Recommit: BCmp loop idiom recognition (authored by lebedev.ri). 
Changed prior to commit:
  https://reviews.llvm.org/D61144?vs=218055&id=224747#toc

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D61144/new/

https://reviews.llvm.org/D61144

Files:
  llvm/docs/ReleaseNotes.rst
  llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
  llvm/test/Transforms/LoopIdiom/bcmp-basic.ll
  llvm/test/Transforms/LoopIdiom/bcmp-debugify-remarks.ll
  llvm/test/Transforms/LoopIdiom/bcmp-widening.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D61144.224747.patch
Type: text/x-patch
Size: 142297 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Sat Oct 12 09:00:26 2019
From: llvm-commits at lists.llvm.org (Joel E. Denny via llvm-commits)
Date: Sat, 12 Oct 2019 16:00:26 -0000
Subject: [llvm] r374664 - Revert r374657: "[lit] Try again to fix new tests that fail on Windows bots"
Message-ID: <20191012160026.1D15B8A9A9@lists.llvm.org>

Author: jdenny
Date: Sat Oct 12 09:00:25 2019
New Revision: 374664

URL: http://llvm.org/viewvc/llvm-project?rev=374664&view=rev
Log:
Revert r374657: "[lit] Try again to fix new tests that fail on Windows bots"

Modified:
    llvm/trunk/utils/lit/lit/builtin_commands/diff.py
    llvm/trunk/utils/lit/tests/shtest-shell.py

Modified: llvm/trunk/utils/lit/lit/builtin_commands/diff.py
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/lit/builtin_commands/diff.py?rev=374664&r1=374663&r2=374664&view=diff
==============================================================================
--- llvm/trunk/utils/lit/lit/builtin_commands/diff.py (original)
+++ llvm/trunk/utils/lit/lit/builtin_commands/diff.py Sat Oct 12 09:00:25 2019
@@ -62,7 +62,6 @@ def compareTwoBinaryFiles(flags, filepat
         func = difflib.context_diff
     diffs = func(filelines[0], filelines[1], filepaths[0], filepaths[1],
                  n = flags.num_context_lines)
-    diffs = [diff.decode(errors="ignore") for diff in diffs]

     for diff in diffs:
         sys.stdout.write(diff)

Modified: llvm/trunk/utils/lit/tests/shtest-shell.py
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/shtest-shell.py?rev=374664&r1=374663&r2=374664&view=diff
==============================================================================
--- llvm/trunk/utils/lit/tests/shtest-shell.py (original)
+++ llvm/trunk/utils/lit/tests/shtest-shell.py Sat Oct 12 09:00:25 2019
@@ -48,7 +48,7 @@
 # CHECK-NEXT: @@
 # CHECK-NEXT: {{^ .f.o.o.$}}
 # CHECK-NEXT: {{^-.b.a.r.$}}
-# CHECK-NEXT: {{^\+.b.a.r.}}
+# CHECK-NEXT: {{^\+.b.a.r..}}
 # CHECK-NEXT: {{^ .b.a.z.$}}
 # CHECK: error: command failed with exit status: 1
 # CHECK: $ "true"
@@ -62,7 +62,7 @@
 # CHECK-NEXT: -bar
 # CHECK-NEXT: -baz
 # CHECK-NEXT: {{^\+.f.o.o.$}}
-# CHECK-NEXT: {{^\+.b.a.r.}}
+# CHECK-NEXT: {{^\+.b.a.r..}}
 # CHECK-NEXT: {{^\+.b.a.z.$}}
 # CHECK: error: command failed with exit status: 1
 # CHECK: $ "true"
@@ -73,7 +73,7 @@
 # CHECK-NEXT: +++
 # CHECK-NEXT: @@
 # CHECK-NEXT: {{^\-.f.o.o.$}}
-# CHECK-NEXT: {{^\-.b.a.r.}}
+# CHECK-NEXT: {{^\-.b.a.r..}}
 # CHECK-NEXT: {{^\-.b.a.z.$}}
 # CHECK-NEXT: +foo
 # CHECK-NEXT: +bar
@@ -100,7 +100,7 @@
 # CHECK-NEXT: @@
 # CHECK-NEXT: {{^ .f.o.o.$}}
 # CHECK-NEXT: {{^-.b.a.r.$}}
-# CHECK-NEXT: {{^\+.b.a.r.}}
+# CHECK-NEXT: {{^\+.b.a.r..}}
 # CHECK-NEXT: {{^ .b.a.z.$}}
 # CHECK: error: command failed with exit status: 1
 # CHECK: $ "true"
@@ -116,7 +116,7 @@
 # CHECK-NEXT: -bar
 # CHECK-NEXT: -baz
 # CHECK-NEXT: {{^\+.f.o.o.$}}
-# CHECK-NEXT: {{^\+.b.a.r.}}
+# CHECK-NEXT: {{^\+.b.a.r..}}
 # CHECK-NEXT: {{^\+.b.a.z.$}}
 # CHECK: error: command failed with exit status: 1
 # CHECK: $ "true"
@@ -127,7 +127,7 @@
 # CHECK-NEXT: +++
 # CHECK-NEXT: @@
 # CHECK-NEXT: {{^\-.f.o.o.$}}
-# CHECK-NEXT: {{^\-.b.a.r.}}
+# CHECK-NEXT: {{^\-.b.a.r..}}
 # CHECK-NEXT: {{^\-.b.a.z.$}}
 # CHECK-NEXT: +foo
 # CHECK-NEXT: +bar

From llvm-commits at lists.llvm.org Sat Oct 12 09:00:35 2019
From: llvm-commits at lists.llvm.org (Joel E. Denny via llvm-commits)
Date: Sat, 12 Oct 2019 16:00:35 -0000
Subject: [llvm] r374665 - [lit] Try yet again to fix new tests that fail on Windows bots
Message-ID: <20191012160035.E2E098ADC7@lists.llvm.org>

Author: jdenny
Date: Sat Oct 12 09:00:35 2019
New Revision: 374665

URL: http://llvm.org/viewvc/llvm-project?rev=374665&view=rev
Log:
[lit] Try yet again to fix new tests that fail on Windows bots

I seem to have misread the bot logs on my last attempt. When lit's
internal diff runs on Windows under Python 2.7, it's text diffs not
binary diffs that need decoding to avoid this error when writing the
diff to stdout:

```
UnicodeEncodeError: 'ascii' codec can't encode characters in position 7-8: ordinal not in range(128)
```

There is no `decode` attribute in this case under Python 3.6.8 under
Ubuntu, so this patch checks for the `decode` attribute before using
it here. Hopefully nothing else is needed when `decode` isn't
available.

It might take a couple more attempts to figure out what error
handling, if any, is needed for this decoding.

Modified:
    llvm/trunk/utils/lit/lit/builtin_commands/diff.py

Modified: llvm/trunk/utils/lit/lit/builtin_commands/diff.py
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/lit/builtin_commands/diff.py?rev=374665&r1=374664&r2=374665&view=diff
==============================================================================
--- llvm/trunk/utils/lit/lit/builtin_commands/diff.py (original)
+++ llvm/trunk/utils/lit/lit/builtin_commands/diff.py Sat Oct 12 09:00:35 2019
@@ -95,6 +95,9 @@ def compareTwoTextFiles(flags, filepaths
     func = difflib.unified_diff if flags.unified_diff else difflib.context_diff
     for diff in func(filelines[0], filelines[1], filepaths[0], filepaths[1],
                      n = flags.num_context_lines):
+        if hasattr(diff, 'decode'):
+            # python 2.7
+            diff = diff.decode()
         sys.stdout.write(diff)
         exitCode = 1
     return exitCode

From llvm-commits at lists.llvm.org Sat Oct 12 09:25:46 2019
From: llvm-commits at lists.llvm.org (Joel E. Denny via llvm-commits)
Date: Sat, 12 Oct 2019 16:25:46 -0000
Subject: [llvm] r374666 - [lit] Adjust error handling for decode introduced by r374665
Message-ID: <20191012162546.3B681857A7@lists.llvm.org>

Author: jdenny
Date: Sat Oct 12 09:25:46 2019
New Revision: 374666

URL: http://llvm.org/viewvc/llvm-project?rev=374666&view=rev
Log:
[lit] Adjust error handling for decode introduced by r374665

On that decode, Windows bots fail with:

```
UnicodeEncodeError: 'ascii' codec can't encode characters in position 7-8: ordinal not in range(128)
```

That's the same error as before r374665 except it's now at the decode
before the write to stdout.

Modified:
    llvm/trunk/utils/lit/lit/builtin_commands/diff.py

Modified: llvm/trunk/utils/lit/lit/builtin_commands/diff.py
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/lit/builtin_commands/diff.py?rev=374666&r1=374665&r2=374666&view=diff
==============================================================================
--- llvm/trunk/utils/lit/lit/builtin_commands/diff.py (original)
+++ llvm/trunk/utils/lit/lit/builtin_commands/diff.py Sat Oct 12 09:25:46 2019
@@ -97,7 +97,7 @@ def compareTwoTextFiles(flags, filepaths
                      n = flags.num_context_lines):
         if hasattr(diff, 'decode'):
             # python 2.7
-            diff = diff.decode()
+            diff = diff.decode(errors="backslashreplace")
         sys.stdout.write(diff)
         exitCode = 1
     return exitCode

From llvm-commits at lists.llvm.org Sat Oct 12 09:36:44 2019
From: llvm-commits at lists.llvm.org (Simon Pilgrim via llvm-commits)
Date: Sat, 12 Oct 2019 16:36:44 -0000
Subject: [llvm] r374667 - [X86] Use any_of/all_of patterns in shuffle mask pattern recognisers. NFCI.
Message-ID: <20191012163644.5FB7C836E2@lists.llvm.org>

Author: rksimon
Date: Sat Oct 12 09:36:44 2019
New Revision: 374667

URL: http://llvm.org/viewvc/llvm-project?rev=374667&view=rev
Log:
[X86] Use any_of/all_of patterns in shuffle mask pattern recognisers. NFCI.
Modified:
    llvm/trunk/lib/Target/X86/X86ISelLowering.cpp

Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=374667&r1=374666&r2=374667&view=diff
==============================================================================
--- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp (original)
+++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Sat Oct 12 09:36:44 2019
@@ -5189,10 +5189,8 @@ static bool isUndefOrZero(int Val) {
 /// Return true if every element in Mask, beginning from position Pos and ending
 /// in Pos+Size is the undef sentinel value.
 static bool isUndefInRange(ArrayRef<int> Mask, unsigned Pos, unsigned Size) {
-  for (unsigned i = Pos, e = Pos + Size; i != e; ++i)
-    if (Mask[i] != SM_SentinelUndef)
-      return false;
-  return true;
+  return llvm::all_of(Mask.slice(Pos, Size),
+                      [](int M) { return M == SM_SentinelUndef; });
 }

 /// Return true if the mask creates a vector whose lower half is undefined.
@@ -5215,10 +5213,7 @@ static bool isInRange(int Val, int Low,
 /// Return true if the value of any element in Mask falls within the specified
 /// range (L, H].
 static bool isAnyInRange(ArrayRef<int> Mask, int Low, int Hi) {
-  for (int M : Mask)
-    if (isInRange(M, Low, Hi))
-      return true;
-  return false;
+  return llvm::any_of(Mask, [Low, Hi](int M) { return isInRange(M, Low, Hi); });
 }

 /// Return true if Val is undef or if its value falls within the
@@ -5229,12 +5224,9 @@ static bool isUndefOrInRange(int Val, in

 /// Return true if every element in Mask is undef or if its value
 /// falls within the specified range (L, H].
-static bool isUndefOrInRange(ArrayRef<int> Mask,
-                             int Low, int Hi) {
-  for (int M : Mask)
-    if (!isUndefOrInRange(M, Low, Hi))
-      return false;
-  return true;
+static bool isUndefOrInRange(ArrayRef<int> Mask, int Low, int Hi) {
+  return llvm::all_of(
+      Mask, [Low, Hi](int M) { return isUndefOrInRange(M, Low, Hi); });
 }

 /// Return true if Val is undef, zero or if its value falls within the
@@ -5246,10 +5238,8 @@ static bool isUndefOrZeroOrInRange(int V
 /// Return true if every element in Mask is undef, zero or if its value
 /// falls within the specified range (L, H].
 static bool isUndefOrZeroOrInRange(ArrayRef<int> Mask, int Low, int Hi) {
-  for (int M : Mask)
-    if (!isUndefOrZeroOrInRange(M, Low, Hi))
-      return false;
-  return true;
+  return llvm::all_of(
+      Mask, [Low, Hi](int M) { return isUndefOrZeroOrInRange(M, Low, Hi); });
 }

 /// Return true if every element in Mask, beginning
@@ -5267,8 +5257,9 @@ static bool isSequentialOrUndefInRange(A
 /// from position Pos and ending in Pos+Size, falls within the specified
 /// sequential range (Low, Low+Size], or is undef or is zero.
 static bool isSequentialOrUndefOrZeroInRange(ArrayRef<int> Mask, unsigned Pos,
-                                             unsigned Size, int Low) {
-  for (unsigned i = Pos, e = Pos + Size; i != e; ++i, ++Low)
+                                             unsigned Size, int Low,
+                                             int Step = 1) {
+  for (unsigned i = Pos, e = Pos + Size; i != e; ++i, Low += Step)
     if (!isUndefOrZero(Mask[i]) && Mask[i] != Low)
       return false;
   return true;
@@ -5278,10 +5269,8 @@ static bool isSequentialOrUndefOrZeroInR
 /// from position Pos and ending in Pos+Size is undef or is zero.
 static bool isUndefOrZeroInRange(ArrayRef<int> Mask, unsigned Pos,
                                  unsigned Size) {
-  for (unsigned i = Pos, e = Pos + Size; i != e; ++i)
-    if (!isUndefOrZero(Mask[i]))
-      return false;
-  return true;
+  return llvm::all_of(Mask.slice(Pos, Size),
+                      [](int M) { return isUndefOrZero(M); });
 }

 /// Helper function to test whether a shuffle mask could be

From llvm-commits at lists.llvm.org Sat Oct 12 09:36:52 2019
From: llvm-commits at lists.llvm.org (Simon Pilgrim via llvm-commits)
Date: Sat, 12 Oct 2019 16:36:52 -0000
Subject: [llvm] r374668 - Fix cppcheck shadow variable name warnings. NFCI.
Message-ID: <20191012163653.088C88ACD9@lists.llvm.org>

Author: rksimon
Date: Sat Oct 12 09:36:52 2019
New Revision: 374668

URL: http://llvm.org/viewvc/llvm-project?rev=374668&view=rev
Log:
Fix cppcheck shadow variable name warnings. NFCI.

Modified:
    llvm/trunk/lib/Target/X86/X86ISelLowering.cpp

Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=374668&r1=374667&r2=374668&view=diff
==============================================================================
--- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp (original)
+++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Sat Oct 12 09:36:52 2019
@@ -34259,17 +34259,17 @@ bool X86TargetLowering::SimplifyDemanded
     if (Src.getOpcode() == X86ISD::KSHIFTR) {
       if (!DemandedElts.intersects(APInt::getLowBitsSet(NumElts, ShiftAmt))) {
         unsigned C1 = Src.getConstantOperandVal(1);
-        unsigned Opc = X86ISD::KSHIFTL;
+        unsigned NewOpc = X86ISD::KSHIFTL;
         int Diff = ShiftAmt - C1;
         if (Diff < 0) {
           Diff = -Diff;
-          Opc = X86ISD::KSHIFTR;
+          NewOpc = X86ISD::KSHIFTR;
         }

         SDLoc dl(Op);
         SDValue NewSA = TLO.DAG.getTargetConstant(Diff, dl, MVT::i8);
         return TLO.CombineTo(
-            Op, TLO.DAG.getNode(Opc, dl, VT, Src.getOperand(0), NewSA));
+            Op, TLO.DAG.getNode(NewOpc, dl, VT, Src.getOperand(0), NewSA));
       }
     }

@@ -34298,17 +34298,17 @@ bool X86TargetLowering::SimplifyDemanded
     if (Src.getOpcode() == X86ISD::KSHIFTL) {
       if (!DemandedElts.intersects(APInt::getHighBitsSet(NumElts, ShiftAmt))) {
         unsigned C1 = Src.getConstantOperandVal(1);
-        unsigned Opc = X86ISD::KSHIFTR;
+        unsigned NewOpc = X86ISD::KSHIFTR;
         int Diff = ShiftAmt - C1;
         if (Diff < 0) {
           Diff = -Diff;
-          Opc = X86ISD::KSHIFTL;
+          NewOpc = X86ISD::KSHIFTL;
         }

         SDLoc dl(Op);
         SDValue NewSA = TLO.DAG.getTargetConstant(Diff, dl, MVT::i8);
         return TLO.CombineTo(
-            Op, TLO.DAG.getNode(Opc, dl, VT, Src.getOperand(0), NewSA));
+            Op, TLO.DAG.getNode(NewOpc, dl, VT, Src.getOperand(0), NewSA));
       }
     }


From llvm-commits at lists.llvm.org Sat Oct 12 09:37:02 2019
From: llvm-commits at lists.llvm.org (Simon Pilgrim via llvm-commits)
Date: Sat, 12 Oct 2019 16:37:02 -0000
Subject: [llvm] r374669 - Replace for-loop of SmallVector::push_back with SmallVector::append. NFCI.
Message-ID: <20191012163702.4395188A0B@lists.llvm.org>

Author: rksimon
Date: Sat Oct 12 09:37:02 2019
New Revision: 374669

URL: http://llvm.org/viewvc/llvm-project?rev=374669&view=rev
Log:
Replace for-loop of SmallVector::push_back with SmallVector::append. NFCI.

Modified:
    llvm/trunk/lib/Target/X86/X86ISelLowering.cpp

Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=374669&r1=374668&r2=374669&view=diff
==============================================================================
--- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp (original)
+++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Sat Oct 12 09:37:02 2019
@@ -6936,10 +6936,8 @@ static bool getFauxShuffleMask(SDValue N
       else
         return false;
     }
-    for (SDValue &Op : SrcInputs0)
-      Ops.push_back(Op);
-    for (SDValue &Op : SrcInputs1)
-      Ops.push_back(Op);
+    Ops.append(SrcInputs0.begin(), SrcInputs0.end());
+    Ops.append(SrcInputs1.begin(), SrcInputs1.end());
     return true;
   }
   case ISD::INSERT_SUBVECTOR: {

From llvm-commits at lists.llvm.org Sat Oct 12 09:48:16 2019
From: llvm-commits at lists.llvm.org (Roman Lebedev via llvm-commits)
Date: Sat, 12 Oct 2019 16:48:16 -0000
Subject: [llvm] r374670 - [NFC][LoopIdiom] Adjust FIXME to be self-explanatory
Message-ID: <20191012164816.3B45988F3D@lists.llvm.org>

Author: lebedevri
Date: Sat Oct 12 09:48:16 2019
New Revision: 374670

URL: http://llvm.org/viewvc/llvm-project?rev=374670&view=rev
Log:
[NFC][LoopIdiom] Adjust FIXME to be self-explanatory

Modified:
    llvm/trunk/lib/Transforms/Scalar/LoopIdiomRecognize.cpp

Modified: llvm/trunk/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/LoopIdiomRecognize.cpp?rev=374670&r1=374669&r2=374670&view=diff
==============================================================================
--- llvm/trunk/lib/Transforms/Scalar/LoopIdiomRecognize.cpp (original)
+++ llvm/trunk/lib/Transforms/Scalar/LoopIdiomRecognize.cpp Sat Oct 12 09:48:16 2019
@@ -2202,7 +2202,7 @@ bool LoopIdiomRecognize::detectBCmpIdiom
                              CurLoop->getHeader())
              << L;
     });
-    return false; // FIXME
+    return false; // FIXME: support non-simple loads.
   }
 
   LLVM_DEBUG(dbgs() << "Recognized bcmp idiom\n");

From llvm-commits at lists.llvm.org Sat Oct 12 09:50:59 2019
From: llvm-commits at lists.llvm.org (Stefan Stipanovic via Phabricator via llvm-commits)
Date: Sat, 12 Oct 2019 16:50:59 +0000 (UTC)
Subject: [PATCH] D68626: [Attributor] Use undef for calls with unused arguments.
In-Reply-To:
References:
Message-ID: <90c177b01f57dc283dfdedbe07054a6b@localhost.localdomain>

sstefan1 added a comment.

I have no problem with this going in as is.
I have one question though. Why do we have to wait for ValueSimplify to finish? Wouldn't it be useful for AAIsDead to have `isDeadArg(Arg)` and then ValueSimplify could use that to decide whether to simplify or not?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68626/new/

https://reviews.llvm.org/D68626

From llvm-commits at lists.llvm.org Sat Oct 12 10:23:25 2019
From: llvm-commits at lists.llvm.org (Joel E. Denny via llvm-commits)
Date: Sat, 12 Oct 2019 17:23:25 -0000
Subject: [llvm] r374671 - [lit] Try errors="ignore" for decode introduced by r374665
Message-ID: <20191012172325.BC29981D95@lists.llvm.org>

Author: jdenny
Date: Sat Oct 12 10:23:25 2019
New Revision: 374671

URL: http://llvm.org/viewvc/llvm-project?rev=374671&view=rev
Log:
[lit] Try errors="ignore" for decode introduced by r374665

Still trying to fix the same error as in r374666.
Modified: llvm/trunk/utils/lit/lit/builtin_commands/diff.py Modified: llvm/trunk/utils/lit/lit/builtin_commands/diff.py URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/lit/builtin_commands/diff.py?rev=374671&r1=374670&r2=374671&view=diff ============================================================================== --- llvm/trunk/utils/lit/lit/builtin_commands/diff.py (original) +++ llvm/trunk/utils/lit/lit/builtin_commands/diff.py Sat Oct 12 10:23:25 2019 @@ -97,7 +97,7 @@ def compareTwoTextFiles(flags, filepaths n = flags.num_context_lines): if hasattr(diff, 'decode'): # python 2.7 - diff = diff.decode(errors="backslashreplace") + diff = diff.decode(errors="ignore") sys.stdout.write(diff) exitCode = 1 return exitCode From llvm-commits at lists.llvm.org Sat Oct 12 10:55:01 2019 From: llvm-commits at lists.llvm.org (Simon Pilgrim via llvm-commits) Date: Sat, 12 Oct 2019 17:55:01 -0000 Subject: [llvm] r374672 - SymbolRecord - fix uninitialized variable warnings. NFCI. Message-ID: <20191012175501.711DE862C0@lists.llvm.org> Author: rksimon Date: Sat Oct 12 10:55:01 2019 New Revision: 374672 URL: http://llvm.org/viewvc/llvm-project?rev=374672&view=rev Log: SymbolRecord - fix uninitialized variable warnings. NFCI. 
Modified: llvm/trunk/include/llvm/DebugInfo/CodeView/SymbolRecord.h Modified: llvm/trunk/include/llvm/DebugInfo/CodeView/SymbolRecord.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/DebugInfo/CodeView/SymbolRecord.h?rev=374672&r1=374671&r2=374672&view=diff ============================================================================== --- llvm/trunk/include/llvm/DebugInfo/CodeView/SymbolRecord.h (original) +++ llvm/trunk/include/llvm/DebugInfo/CodeView/SymbolRecord.h Sat Oct 12 10:55:01 2019 @@ -73,17 +73,17 @@ public: Thunk32Sym(SymbolRecordKind Kind, uint32_t RecordOffset) : SymbolRecord(Kind), RecordOffset(RecordOffset) {} - uint32_t Parent; - uint32_t End; - uint32_t Next; - uint32_t Offset; - uint16_t Segment; - uint16_t Length; + uint32_t Parent = 0; + uint32_t End = 0; + uint32_t Next = 0; + uint32_t Offset = 0; + uint16_t Segment = 0; + uint16_t Length = 0; ThunkOrdinal Thunk; StringRef Name; ArrayRef VariantData; - uint32_t RecordOffset; + uint32_t RecordOffset = 0; }; // S_TRAMPOLINE @@ -94,13 +94,13 @@ public: : SymbolRecord(Kind), RecordOffset(RecordOffset) {} TrampolineType Type; - uint16_t Size; - uint32_t ThunkOffset; - uint32_t TargetOffset; - uint16_t ThunkSection; - uint16_t TargetSection; + uint16_t Size = 0; + uint32_t ThunkOffset = 0; + uint32_t TargetOffset = 0; + uint16_t ThunkSection = 0; + uint16_t TargetSection = 0; - uint32_t RecordOffset; + uint32_t RecordOffset = 0; }; // S_SECTION @@ -110,14 +110,14 @@ public: SectionSym(SymbolRecordKind Kind, uint32_t RecordOffset) : SymbolRecord(Kind), RecordOffset(RecordOffset) {} - uint16_t SectionNumber; - uint8_t Alignment; - uint32_t Rva; - uint32_t Length; - uint32_t Characteristics; + uint16_t SectionNumber = 0; + uint8_t Alignment = 0; + uint32_t Rva = 0; + uint32_t Length = 0; + uint32_t Characteristics = 0; StringRef Name; - uint32_t RecordOffset; + uint32_t RecordOffset = 0; }; // S_COFFGROUP @@ -127,13 +127,13 @@ public: CoffGroupSym(SymbolRecordKind Kind, uint32_t 
RecordOffset) : SymbolRecord(Kind), RecordOffset(RecordOffset) {} - uint32_t Size; - uint32_t Characteristics; - uint32_t Offset; - uint16_t Segment; + uint32_t Size = 0; + uint32_t Characteristics = 0; + uint32_t Offset = 0; + uint16_t Segment = 0; StringRef Name; - uint32_t RecordOffset; + uint32_t RecordOffset = 0; }; class ScopeEndSym : public SymbolRecord { @@ -142,7 +142,7 @@ public: ScopeEndSym(SymbolRecordKind Kind, uint32_t RecordOffset) : SymbolRecord(Kind), RecordOffset(RecordOffset) {} - uint32_t RecordOffset; + uint32_t RecordOffset = 0; }; class CallerSym : public SymbolRecord { @@ -153,7 +153,7 @@ public: std::vector Indices; - uint32_t RecordOffset; + uint32_t RecordOffset = 0; }; struct DecodedAnnotation { @@ -342,12 +342,12 @@ public: BinaryAnnotationIterator()); } - uint32_t Parent; - uint32_t End; + uint32_t Parent = 0; + uint32_t End = 0; TypeIndex Inlinee; std::vector AnnotationData; - uint32_t RecordOffset; + uint32_t RecordOffset = 0; }; // S_PUB32 @@ -379,7 +379,7 @@ public: RegisterId Register; StringRef Name; - uint32_t RecordOffset; + uint32_t RecordOffset = 0; }; // S_PROCREF, S_LPROCREF @@ -390,13 +390,13 @@ public: : SymbolRecord(SymbolRecordKind::ProcRefSym), RecordOffset(RecordOffset) { } - uint32_t SumName; - uint32_t SymOffset; - uint16_t Module; + uint32_t SumName = 0; + uint32_t SymOffset = 0; + uint16_t Module = 0; StringRef Name; uint16_t modi() const { return Module - 1; } - uint32_t RecordOffset; + uint32_t RecordOffset = 0; }; // S_LOCAL @@ -410,7 +410,7 @@ public: LocalSymFlags Flags; StringRef Name; - uint32_t RecordOffset; + uint32_t RecordOffset = 0; }; struct LocalVariableAddrRange { @@ -440,11 +440,11 @@ public: return RecordOffset + RelocationOffset; } - uint32_t Program; + uint32_t Program = 0; LocalVariableAddrRange Range; std::vector Gaps; - uint32_t RecordOffset; + uint32_t RecordOffset = 0; }; // S_DEFRANGE_SUBFIELD @@ -461,12 +461,12 @@ public: return RecordOffset + RelocationOffset; } - uint32_t Program; - 
uint16_t OffsetInParent; + uint32_t Program = 0; + uint16_t OffsetInParent = 0; LocalVariableAddrRange Range; std::vector Gaps; - uint32_t RecordOffset; + uint32_t RecordOffset = 0; }; // S_DEFRANGE_REGISTER @@ -488,7 +488,7 @@ public: LocalVariableAddrRange Range; std::vector Gaps; - uint32_t RecordOffset; + uint32_t RecordOffset = 0; }; // S_DEFRANGE_SUBFIELD_REGISTER @@ -512,7 +512,7 @@ public: LocalVariableAddrRange Range; std::vector Gaps; - uint32_t RecordOffset; + uint32_t RecordOffset = 0; }; // S_DEFRANGE_FRAMEPOINTER_REL @@ -538,7 +538,7 @@ public: LocalVariableAddrRange Range; std::vector Gaps; - uint32_t RecordOffset; + uint32_t RecordOffset = 0; }; // S_DEFRANGE_REGISTER_REL @@ -573,7 +573,7 @@ public: LocalVariableAddrRange Range; std::vector Gaps; - uint32_t RecordOffset; + uint32_t RecordOffset = 0; }; // S_DEFRANGE_FRAMEPOINTER_REL_FULL_SCOPE @@ -585,9 +585,9 @@ public: : SymbolRecord(SymbolRecordKind::DefRangeFramePointerRelFullScopeSym), RecordOffset(RecordOffset) {} - int32_t Offset; + int32_t Offset = 0; - uint32_t RecordOffset; + uint32_t RecordOffset = 0; }; // S_BLOCK32 @@ -603,14 +603,14 @@ public: return RecordOffset + RelocationOffset; } - uint32_t Parent; - uint32_t End; - uint32_t CodeSize; - uint32_t CodeOffset; - uint16_t Segment; + uint32_t Parent = 0; + uint32_t End = 0; + uint32_t CodeSize = 0; + uint32_t CodeOffset = 0; + uint16_t Segment = 0; StringRef Name; - uint32_t RecordOffset; + uint32_t RecordOffset = 0; }; // S_LABEL32 @@ -626,12 +626,12 @@ public: return RecordOffset + RelocationOffset; } - uint32_t CodeOffset; - uint16_t Segment; + uint32_t CodeOffset = 0; + uint16_t Segment = 0; ProcSymFlags Flags; StringRef Name; - uint32_t RecordOffset; + uint32_t RecordOffset = 0; }; // S_OBJNAME @@ -643,10 +643,10 @@ public: : SymbolRecord(SymbolRecordKind::ObjNameSym), RecordOffset(RecordOffset) { } - uint32_t Signature; + uint32_t Signature = 0; StringRef Name; - uint32_t RecordOffset; + uint32_t RecordOffset = 0; }; // 
S_ENVBLOCK @@ -659,7 +659,7 @@ public: std::vector Fields; - uint32_t RecordOffset; + uint32_t RecordOffset = 0; }; // S_EXPORT @@ -669,11 +669,11 @@ public: ExportSym(uint32_t RecordOffset) : SymbolRecord(SymbolRecordKind::ExportSym), RecordOffset(RecordOffset) {} - uint16_t Ordinal; + uint16_t Ordinal = 0; ExportFlags Flags; StringRef Name; - uint32_t RecordOffset; + uint32_t RecordOffset = 0; }; // S_FILESTATIC @@ -685,11 +685,11 @@ public: RecordOffset(RecordOffset) {} TypeIndex Index; - uint32_t ModFilenameOffset; + uint32_t ModFilenameOffset = 0; LocalSymFlags Flags; StringRef Name; - uint32_t RecordOffset; + uint32_t RecordOffset = 0; }; // S_COMPILE2 @@ -702,19 +702,19 @@ public: CompileSym2Flags Flags; CPUType Machine; - uint16_t VersionFrontendMajor; - uint16_t VersionFrontendMinor; - uint16_t VersionFrontendBuild; - uint16_t VersionBackendMajor; - uint16_t VersionBackendMinor; - uint16_t VersionBackendBuild; + uint16_t VersionFrontendMajor = 0; + uint16_t VersionFrontendMinor = 0; + uint16_t VersionFrontendBuild = 0; + uint16_t VersionBackendMajor = 0; + uint16_t VersionBackendMinor = 0; + uint16_t VersionBackendBuild = 0; StringRef Version; std::vector ExtraStrings; uint8_t getLanguage() const { return static_cast(Flags) & 0xFF; } uint32_t getFlags() const { return static_cast(Flags) & ~0xFF; } - uint32_t RecordOffset; + uint32_t RecordOffset = 0; }; // S_COMPILE3 @@ -728,14 +728,14 @@ public: CompileSym3Flags Flags; CPUType Machine; - uint16_t VersionFrontendMajor; - uint16_t VersionFrontendMinor; - uint16_t VersionFrontendBuild; - uint16_t VersionFrontendQFE; - uint16_t VersionBackendMajor; - uint16_t VersionBackendMinor; - uint16_t VersionBackendBuild; - uint16_t VersionBackendQFE; + uint16_t VersionFrontendMajor = 0; + uint16_t VersionFrontendMinor = 0; + uint16_t VersionFrontendBuild = 0; + uint16_t VersionFrontendQFE = 0; + uint16_t VersionBackendMajor = 0; + uint16_t VersionBackendMinor = 0; + uint16_t VersionBackendBuild = 0; + uint16_t 
VersionBackendQFE = 0; StringRef Version; void setLanguage(SourceLanguage Lang) { @@ -754,7 +754,7 @@ public: (getFlags() & (CompileSym3Flags::PGO | CompileSym3Flags::LTCG)); } - uint32_t RecordOffset; + uint32_t RecordOffset = 0; }; // S_FRAMEPROC @@ -765,12 +765,12 @@ public: : SymbolRecord(SymbolRecordKind::FrameProcSym), RecordOffset(RecordOffset) {} - uint32_t TotalFrameBytes; - uint32_t PaddingFrameBytes; - uint32_t OffsetToPadding; - uint32_t BytesOfCalleeSavedRegisters; - uint32_t OffsetOfExceptionHandler; - uint16_t SectionIdOfExceptionHandler; + uint32_t TotalFrameBytes = 0; + uint32_t PaddingFrameBytes = 0; + uint32_t OffsetToPadding = 0; + uint32_t BytesOfCalleeSavedRegisters = 0; + uint32_t OffsetOfExceptionHandler = 0; + uint16_t SectionIdOfExceptionHandler = 0; FrameProcedureOptions Flags; /// Extract the register this frame uses to refer to local variables. @@ -785,7 +785,7 @@ public: EncodedFramePtrReg((uint32_t(Flags) >> 16U) & 0x3U), CPU); } - uint32_t RecordOffset; + uint32_t RecordOffset = 0; private: }; @@ -803,11 +803,11 @@ public: return RecordOffset + RelocationOffset; } - uint32_t CodeOffset; - uint16_t Segment; + uint32_t CodeOffset = 0; + uint16_t Segment = 0; TypeIndex Type; - uint32_t RecordOffset; + uint32_t RecordOffset = 0; }; // S_HEAPALLOCSITE @@ -824,12 +824,12 @@ public: return RecordOffset + RelocationOffset; } - uint32_t CodeOffset; - uint16_t Segment; - uint16_t CallInstructionSize; + uint32_t CodeOffset = 0; + uint16_t Segment = 0; + uint16_t CallInstructionSize = 0; TypeIndex Type; - uint32_t RecordOffset; + uint32_t RecordOffset = 0; }; // S_FRAMECOOKIE @@ -845,12 +845,12 @@ public: return RecordOffset + RelocationOffset; } - uint32_t CodeOffset; - uint16_t Register; + uint32_t CodeOffset = 0; + uint16_t Register = 0; FrameCookieKind CookieKind; - uint8_t Flags; + uint8_t Flags = 0; - uint32_t RecordOffset; + uint32_t RecordOffset = 0; }; // S_UDT, S_COBOLUDT @@ -863,7 +863,7 @@ public: TypeIndex Type; StringRef Name; - 
uint32_t RecordOffset; + uint32_t RecordOffset = 0; }; // S_BUILDINFO @@ -876,7 +876,7 @@ public: TypeIndex BuildId; - uint32_t RecordOffset; + uint32_t RecordOffset = 0; }; // S_BPREL32 @@ -887,11 +887,11 @@ public: : SymbolRecord(SymbolRecordKind::BPRelativeSym), RecordOffset(RecordOffset) {} - int32_t Offset; + int32_t Offset = 0; TypeIndex Type; StringRef Name; - uint32_t RecordOffset; + uint32_t RecordOffset = 0; }; // S_REGREL32 @@ -902,12 +902,12 @@ public: : SymbolRecord(SymbolRecordKind::RegRelativeSym), RecordOffset(RecordOffset) {} - uint32_t Offset; + uint32_t Offset = 0; TypeIndex Type; RegisterId Register; StringRef Name; - uint32_t RecordOffset; + uint32_t RecordOffset = 0; }; // S_CONSTANT, S_MANCONSTANT @@ -922,7 +922,7 @@ public: APSInt Value; StringRef Name; - uint32_t RecordOffset; + uint32_t RecordOffset = 0; }; // S_LDATA32, S_GDATA32, S_LMANDATA, S_GMANDATA @@ -939,11 +939,11 @@ public: } TypeIndex Type; - uint32_t DataOffset; - uint16_t Segment; + uint32_t DataOffset = 0; + uint16_t Segment = 0; StringRef Name; - uint32_t RecordOffset; + uint32_t RecordOffset = 0; }; // S_LTHREAD32, S_GTHREAD32 @@ -961,11 +961,11 @@ public: } TypeIndex Type; - uint32_t DataOffset; - uint16_t Segment; + uint32_t DataOffset = 0; + uint16_t Segment = 0; StringRef Name; - uint32_t RecordOffset; + uint32_t RecordOffset = 0; }; // S_UNAMESPACE @@ -978,7 +978,7 @@ public: StringRef Name; - uint32_t RecordOffset; + uint32_t RecordOffset = 0; }; // S_ANNOTATION @@ -993,7 +993,7 @@ public: uint16_t Segment = 0; std::vector Strings; - uint32_t RecordOffset; + uint32_t RecordOffset = 0; }; using CVSymbol = CVRecord; From llvm-commits at lists.llvm.org Sat Oct 12 10:55:09 2019 From: llvm-commits at lists.llvm.org (Simon Pilgrim via llvm-commits) Date: Sat, 12 Oct 2019 17:55:09 -0000 Subject: [llvm] r374673 - SymbolRecord - consistently use explicit for single operand constructors Message-ID: <20191012175509.E8B2287813@lists.llvm.org> Author: rksimon Date: Sat Oct 12 
10:55:09 2019 New Revision: 374673 URL: http://llvm.org/viewvc/llvm-project?rev=374673&view=rev Log: SymbolRecord - consistently use explicit for single operand constructors Modified: llvm/trunk/include/llvm/DebugInfo/CodeView/SymbolRecord.h Modified: llvm/trunk/include/llvm/DebugInfo/CodeView/SymbolRecord.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/DebugInfo/CodeView/SymbolRecord.h?rev=374673&r1=374672&r2=374673&view=diff ============================================================================== --- llvm/trunk/include/llvm/DebugInfo/CodeView/SymbolRecord.h (original) +++ llvm/trunk/include/llvm/DebugInfo/CodeView/SymbolRecord.h Sat Oct 12 10:55:09 2019 @@ -333,7 +333,7 @@ private: class InlineSiteSym : public SymbolRecord { public: explicit InlineSiteSym(SymbolRecordKind Kind) : SymbolRecord(Kind) {} - InlineSiteSym(uint32_t RecordOffset) + explicit InlineSiteSym(uint32_t RecordOffset) : SymbolRecord(SymbolRecordKind::InlineSiteSym), RecordOffset(RecordOffset) {} @@ -371,7 +371,7 @@ public: class RegisterSym : public SymbolRecord { public: explicit RegisterSym(SymbolRecordKind Kind) : SymbolRecord(Kind) {} - RegisterSym(uint32_t RecordOffset) + explicit RegisterSym(uint32_t RecordOffset) : SymbolRecord(SymbolRecordKind::RegisterSym), RecordOffset(RecordOffset) {} @@ -453,7 +453,7 @@ class DefRangeSubfieldSym : public Symbo public: explicit DefRangeSubfieldSym(SymbolRecordKind Kind) : SymbolRecord(Kind) {} - DefRangeSubfieldSym(uint32_t RecordOffset) + explicit DefRangeSubfieldSym(uint32_t RecordOffset) : SymbolRecord(SymbolRecordKind::DefRangeSubfieldSym), RecordOffset(RecordOffset) {} @@ -478,7 +478,7 @@ public: }; explicit DefRangeRegisterSym(SymbolRecordKind Kind) : SymbolRecord(Kind) {} - DefRangeRegisterSym(uint32_t RecordOffset) + explicit DefRangeRegisterSym(uint32_t RecordOffset) : SymbolRecord(SymbolRecordKind::DefRangeRegisterSym), RecordOffset(RecordOffset) {} @@ -502,7 +502,7 @@ public: explicit 
DefRangeSubfieldRegisterSym(SymbolRecordKind Kind) : SymbolRecord(Kind) {} - DefRangeSubfieldRegisterSym(uint32_t RecordOffset) + explicit DefRangeSubfieldRegisterSym(uint32_t RecordOffset) : SymbolRecord(SymbolRecordKind::DefRangeSubfieldRegisterSym), RecordOffset(RecordOffset) {} @@ -526,7 +526,7 @@ public: explicit DefRangeFramePointerRelSym(SymbolRecordKind Kind) : SymbolRecord(Kind) {} - DefRangeFramePointerRelSym(uint32_t RecordOffset) + explicit DefRangeFramePointerRelSym(uint32_t RecordOffset) : SymbolRecord(SymbolRecordKind::DefRangeFramePointerRelSym), RecordOffset(RecordOffset) {} @@ -639,7 +639,7 @@ class ObjNameSym : public SymbolRecord { public: explicit ObjNameSym() : SymbolRecord(SymbolRecordKind::ObjNameSym) {} explicit ObjNameSym(SymbolRecordKind Kind) : SymbolRecord(Kind) {} - ObjNameSym(uint32_t RecordOffset) + explicit ObjNameSym(uint32_t RecordOffset) : SymbolRecord(SymbolRecordKind::ObjNameSym), RecordOffset(RecordOffset) { } @@ -653,7 +653,7 @@ public: class EnvBlockSym : public SymbolRecord { public: explicit EnvBlockSym(SymbolRecordKind Kind) : SymbolRecord(Kind) {} - EnvBlockSym(uint32_t RecordOffset) + explicit EnvBlockSym(uint32_t RecordOffset) : SymbolRecord(SymbolRecordKind::EnvBlockSym), RecordOffset(RecordOffset) {} @@ -666,7 +666,7 @@ public: class ExportSym : public SymbolRecord { public: explicit ExportSym(SymbolRecordKind Kind) : SymbolRecord(Kind) {} - ExportSym(uint32_t RecordOffset) + explicit ExportSym(uint32_t RecordOffset) : SymbolRecord(SymbolRecordKind::ExportSym), RecordOffset(RecordOffset) {} uint16_t Ordinal = 0; @@ -680,7 +680,7 @@ public: class FileStaticSym : public SymbolRecord { public: explicit FileStaticSym(SymbolRecordKind Kind) : SymbolRecord(Kind) {} - FileStaticSym(uint32_t RecordOffset) + explicit FileStaticSym(uint32_t RecordOffset) : SymbolRecord(SymbolRecordKind::FileStaticSym), RecordOffset(RecordOffset) {} @@ -696,7 +696,7 @@ public: class Compile2Sym : public SymbolRecord { public: explicit 
Compile2Sym(SymbolRecordKind Kind) : SymbolRecord(Kind) {} - Compile2Sym(uint32_t RecordOffset) + explicit Compile2Sym(uint32_t RecordOffset) : SymbolRecord(SymbolRecordKind::Compile2Sym), RecordOffset(RecordOffset) {} @@ -722,7 +722,7 @@ class Compile3Sym : public SymbolRecord public: Compile3Sym() : SymbolRecord(SymbolRecordKind::Compile3Sym) {} explicit Compile3Sym(SymbolRecordKind Kind) : SymbolRecord(Kind) {} - Compile3Sym(uint32_t RecordOffset) + explicit Compile3Sym(uint32_t RecordOffset) : SymbolRecord(SymbolRecordKind::Compile3Sym), RecordOffset(RecordOffset) {} @@ -870,7 +870,7 @@ public: class BuildInfoSym : public SymbolRecord { public: explicit BuildInfoSym(SymbolRecordKind Kind) : SymbolRecord(Kind) {} - BuildInfoSym(uint32_t RecordOffset) + explicit BuildInfoSym(uint32_t RecordOffset) : SymbolRecord(SymbolRecordKind::BuildInfoSym), RecordOffset(RecordOffset) {} @@ -914,7 +914,7 @@ public: class ConstantSym : public SymbolRecord { public: explicit ConstantSym(SymbolRecordKind Kind) : SymbolRecord(Kind) {} - ConstantSym(uint32_t RecordOffset) + explicit ConstantSym(uint32_t RecordOffset) : SymbolRecord(SymbolRecordKind::ConstantSym), RecordOffset(RecordOffset) {} @@ -931,7 +931,7 @@ class DataSym : public SymbolRecord { public: explicit DataSym(SymbolRecordKind Kind) : SymbolRecord(Kind) {} - DataSym(uint32_t RecordOffset) + explicit DataSym(uint32_t RecordOffset) : SymbolRecord(SymbolRecordKind::DataSym), RecordOffset(RecordOffset) {} uint32_t getRelocationOffset() const { From llvm-commits at lists.llvm.org Sat Oct 12 11:04:12 2019 From: llvm-commits at lists.llvm.org (Johannes Doerfert via Phabricator via llvm-commits) Date: Sat, 12 Oct 2019 18:04:12 +0000 (UTC) Subject: [PATCH] D68626: [Attributor] Use undef for calls with unused arguments. In-Reply-To: References: Message-ID: <23bbe01c4bcfb5a0fb3e898305f8956d@localhost.localdomain> jdoerfert added a comment. In D68626#1707223 , @sstefan1 wrote: > I have no problem with this going in as is. 
> I have one question though. Why do we have to wait for ValueSimplify to finish? Wouldn't it be useful for AAIsDead to have `isDeadArg(Arg)` and then ValueSimplify could use that to decide whether to simplify or not?

You are right. I'll replace it with an `AAIsDead` solution instead shortly.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68626/new/

https://reviews.llvm.org/D68626

From llvm-commits at lists.llvm.org Sat Oct 12 11:13:22 2019
From: llvm-commits at lists.llvm.org (Wenlei He via Phabricator via llvm-commits)
Date: Sat, 12 Oct 2019 18:13:22 +0000 (UTC)
Subject: [PATCH] D52845: Update entry count for cold calls
In-Reply-To:
References:
Message-ID: <14121d9520f1cffdeae9340453ac17f3@localhost.localdomain>

wenlei added a comment.
Herald added a project: LLVM.

@wmi @davidxl FYI, we have an internal patch to address this issue further. This patch only scales up the entry count of the outlined function for cold call sites, thus losing context sensitivity. Ideally, we could reuse the entire profile from that inline context and merge it all back into the outlined function.

We changed the function annotation/inlining order to be top-down. When we decide not to inline a cold call site, we merge the entire FunctionSamples of that inlinee back into its outlined counterpart. Since we process functions in top-down order (ignoring recursion for a second), annotation of the outlined callee can use the post-merge FunctionSamples, which is more accurate than simple scaling of the entry count. (Doing context-sensitive inlining top-down has other benefits too, as it allows specialization while inlining, which helps maximize the benefit of having a context-sensitive profile.)

We'll upstream the changes later if you think it will be useful.

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D52845/new/

https://reviews.llvm.org/D52845

From llvm-commits at lists.llvm.org Sat Oct 12 11:13:21 2019
From: llvm-commits at lists.llvm.org (Hubert Tong via Phabricator via llvm-commits)
Date: Sat, 12 Oct 2019 18:13:21 +0000 (UTC)
Subject: [PATCH] D67822: [LNT] Python 3 support: adapt to removal of execfile
In-Reply-To:
References:
Message-ID: <18aa93dd392687b67c330babba03105d@localhost.localdomain>

hubert.reinterpretcast accepted this revision.
hubert.reinterpretcast added a comment.
This revision is now accepted and ready to land.

LGTM.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D67822/new/

https://reviews.llvm.org/D67822

From llvm-commits at lists.llvm.org Sat Oct 12 11:22:32 2019
From: llvm-commits at lists.llvm.org (Momchil Velikov via Phabricator via llvm-commits)
Date: Sat, 12 Oct 2019 18:22:32 +0000 (UTC)
Subject: [PATCH] D68916: [ARM] Accept ldrb.w mnemonic for certain addressing modes (PR43382)
Message-ID:

chill created this revision.
chill added reviewers: efriedma, grosbach.
Herald added subscribers: llvm-commits, hiraditya, kristof.beyls.
Herald added a project: LLVM.

Encoding T3 of `ldrb` in A7.7.46 (Armv7-M ARM Revision E.d) offset allows the ".w" mnemonic suffix, even though the preferred disassembly is without the suffix. We did not accept the suffix; this patch fixes it by adding a few instruction aliases.

Is there a less hackish way of doing it?

https://reviews.llvm.org/D68916

Files:
  llvm/lib/Target/ARM/ARMInstrThumb2.td
  llvm/lib/Target/ARM/AsmParser/ARMAsmParser.cpp
  llvm/test/MC/ARM/basic-thumb2-instructions.s

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68916.224755.patch
Type: text/x-patch
Size: 6259 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Sat Oct 12 11:22:32 2019
From: llvm-commits at lists.llvm.org (Johannes Doerfert via Phabricator via llvm-commits)
Date: Sat, 12 Oct 2019 18:22:32 +0000 (UTC)
Subject: [PATCH] D29011: [IR] Add Freeze instruction
In-Reply-To:
References:
Message-ID: <9a7cb740805d78f982d19489f6124564@localhost.localdomain>

jdoerfert added a comment.

Two minor comments from my side.

(I would like someone to include the keyword in the editor lists `llvm/utils/{emacs,vim}` after this has landed.)

================
Comment at: lib/CodeGen/SelectionDAG/SelectionDAGBuilder.h:671
   void visitFNeg(const User &I) { visitUnary(I, ISD::FNEG); }
+  void visitFreeze(const User &I);
 
----------------
The lady of the lake says this should be:
`void visitFreeze(const User &I) { visitUnary(I, ISD::FREEZE); }`
If you have reason not to do it this way, also replace `visitUnary` with `visitFNeg`, though I'd prefer not to.
================ Comment at: test/Bindings/llvm-c/freeze.ll:12 + %6 = freeze <2 x float> %arg4 + %7 = freeze i8* %arg5 + ret i32 %1 ---------------- Missing types, here and elsewhere I think: - array - struct w/ definition - struct w/o definition (opaque) - non-standard integer size (i666) Missing inputs: - undef - null CHANGES SINCE LAST ACTION https://reviews.llvm.org/D29011/new/ https://reviews.llvm.org/D29011 From llvm-commits at lists.llvm.org Sat Oct 12 11:33:47 2019 From: llvm-commits at lists.llvm.org (Simon Pilgrim via llvm-commits) Date: Sat, 12 Oct 2019 18:33:47 -0000 Subject: [llvm] r374674 - [X86] scaleShuffleMask - use size_t Scale to avoid overflow warnings Message-ID: <20191012183348.074F68356F@lists.llvm.org> Author: rksimon Date: Sat Oct 12 11:33:47 2019 New Revision: 374674 URL: http://llvm.org/viewvc/llvm-project?rev=374674&view=rev Log: [X86] scaleShuffleMask - use size_t Scale to avoid overflow warnings Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp llvm/trunk/lib/Target/X86/X86ISelLowering.h Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=374674&r1=374673&r2=374674&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp (original) +++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Sat Oct 12 11:33:47 2019 @@ -6920,11 +6920,11 @@ static bool getFauxShuffleMask(SDValue N !getTargetShuffleInputs(N1, SrcInputs1, SrcMask1, DAG, Depth + 1, ResolveZero)) return false; - int MaskSize = std::max(SrcMask0.size(), SrcMask1.size()); + size_t MaskSize = std::max(SrcMask0.size(), SrcMask1.size()); SmallVector Mask0, Mask1; scaleShuffleMask(MaskSize / SrcMask0.size(), SrcMask0, Mask0); scaleShuffleMask(MaskSize / SrcMask1.size(), SrcMask1, Mask1); - for (int i = 0; i != MaskSize; ++i) { + for (size_t i = 0; i != MaskSize; ++i) { if (Mask0[i] == SM_SentinelUndef && 
Mask1[i] == SM_SentinelUndef) Mask.push_back(SM_SentinelUndef); else if (Mask0[i] == SM_SentinelZero && Mask1[i] == SM_SentinelZero) @@ -6932,7 +6932,7 @@ static bool getFauxShuffleMask(SDValue N else if (Mask1[i] == SM_SentinelZero) Mask.push_back(Mask0[i]); else if (Mask0[i] == SM_SentinelZero) - Mask.push_back(Mask1[i] + (MaskSize * SrcInputs0.size())); + Mask.push_back(Mask1[i] + (int)(MaskSize * SrcInputs0.size())); else return false; } Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.h?rev=374674&r1=374673&r2=374674&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86ISelLowering.h (original) +++ llvm/trunk/lib/Target/X86/X86ISelLowering.h Sat Oct 12 11:33:47 2019 @@ -1672,24 +1672,24 @@ namespace llvm { /// mask. This is the reverse process to canWidenShuffleElements, but can /// always succeed. template - void scaleShuffleMask(int Scale, ArrayRef Mask, + void scaleShuffleMask(size_t Scale, ArrayRef Mask, SmallVectorImpl &ScaledMask) { assert(0 < Scale && "Unexpected scaling factor"); size_t NumElts = Mask.size(); ScaledMask.assign(NumElts * Scale, -1); - for (int i = 0; i != (int)NumElts; ++i) { + for (size_t i = 0; i != NumElts; ++i) { int M = Mask[i]; // Repeat sentinel values in every mask element. if (M < 0) { - for (int s = 0; s != Scale; ++s) + for (size_t s = 0; s != Scale; ++s) ScaledMask[(Scale * i) + s] = M; continue; } // Scale mask element and increment across each mask element. 
- for (int s = 0; s != Scale; ++s) + for (size_t s = 0; s != Scale; ++s) ScaledMask[(Scale * i) + s] = (Scale * M) + s; } } From llvm-commits at lists.llvm.org Sat Oct 12 11:31:32 2019 From: llvm-commits at lists.llvm.org (Joan LLuch via Phabricator via llvm-commits) Date: Sat, 12 Oct 2019 18:31:32 +0000 (UTC) Subject: [PATCH] D63382: [InstCombine] fold a shifted zext to a select In-Reply-To: References: Message-ID: <803e390155a3bf1049d7d0c9f0937ee4@localhost.localdomain> joanlluch added a comment. In D63382#1707153 , @spatel wrote: > @joanlluch - this is a case where we could transform incoming shift to select and benefit small targets as you've noted. Indeed, this seems to replace a shift that would be executed in all cases by a shift that will execute only if the incoming value was 1. I also agree with your comment above about "prefer 'select' in IR over bithacks". CHANGES SINCE LAST ACTION https://reviews.llvm.org/D63382/new/ https://reviews.llvm.org/D63382 From llvm-commits at lists.llvm.org Sat Oct 12 11:31:33 2019 From: llvm-commits at lists.llvm.org (Roman Lebedev via Phabricator via llvm-commits) Date: Sat, 12 Oct 2019 18:31:33 +0000 (UTC) Subject: [PATCH] D29011: [IR] Add Freeze instruction In-Reply-To: References: Message-ID: <12ec1fb2515fafe5d9a9d1c35c7d70e2@localhost.localdomain> lebedev.ri added a comment. In D29011#1707274 , @jdoerfert wrote: > Two minor comments from my side. Looks good to me with those two fixed. Should you add `llvm::Freze` here by inheriting from `UnaryOperator` to make `isa(Op)` possible? > (I would like someone to include the keyword in the editor lists `llvm/utils/{emacs,vim}` after this landed) There is also `llvm/utils/{kate}`. ================ Comment at: include/llvm/Bitcode/LLVMBitCodes.h:394 enum UnaryOpcodes { - UNOP_NEG = 0 + UNOP_NEG = 0, + UNOP_FREEZE = 1 ---------------- I think you want to rebase. 
CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D29011/new/

https://reviews.llvm.org/D29011


From llvm-commits at lists.llvm.org Sat Oct 12 11:40:34 2019
From: llvm-commits at lists.llvm.org (Itay Bookstein via Phabricator via llvm-commits)
Date: Sat, 12 Oct 2019 18:40:34 +0000 (UTC)
Subject: [PATCH] D68123: [CodeGen][SelectionDAG] Fix tiny bug in ExpandIntRes_UADDSUBO
In-Reply-To:
References:
Message-ID: <35d9cba6b3c2f9c8ff03f90189300dbf@localhost.localdomain>

ibookstein added a comment.

I failed to mention during the review that I don't have commit access, so could anyone commit this on my behalf?

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68123/new/

https://reviews.llvm.org/D68123


From llvm-commits at lists.llvm.org Sat Oct 12 11:50:57 2019
From: llvm-commits at lists.llvm.org (Joel E. Denny via llvm-commits)
Date: Sat, 12 Oct 2019 18:50:57 -0000
Subject: [llvm] r374675 - Revert r374671: "[lit] Try errors="ignore" for decode introduced by r374665"
Message-ID: <20191012185057.B54BB85325@lists.llvm.org>

Author: jdenny
Date: Sat Oct 12 11:50:57 2019
New Revision: 374675

URL: http://llvm.org/viewvc/llvm-project?rev=374675&view=rev
Log:
Revert r374671: "[lit] Try errors="ignore" for decode introduced by r374665"

This series of patches still breaks a Windows bot.
Modified:
    llvm/trunk/utils/lit/lit/builtin_commands/diff.py

Modified: llvm/trunk/utils/lit/lit/builtin_commands/diff.py
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/lit/builtin_commands/diff.py?rev=374675&r1=374674&r2=374675&view=diff
==============================================================================
--- llvm/trunk/utils/lit/lit/builtin_commands/diff.py (original)
+++ llvm/trunk/utils/lit/lit/builtin_commands/diff.py Sat Oct 12 11:50:57 2019
@@ -97,7 +97,7 @@ def compareTwoTextFiles(flags, filepaths
                      n = flags.num_context_lines):
         if hasattr(diff, 'decode'):
             # python 2.7
-            diff = diff.decode(errors="ignore")
+            diff = diff.decode(errors="backslashreplace")
         sys.stdout.write(diff)
         exitCode = 1
     return exitCode


From llvm-commits at lists.llvm.org Sat Oct 12 11:51:08 2019
From: llvm-commits at lists.llvm.org (Joel E. Denny via llvm-commits)
Date: Sat, 12 Oct 2019 18:51:08 -0000
Subject: [llvm] r374676 - Revert r374666: "[lit] Adjust error handling for decode introduced by r374665"
Message-ID: <20191012185108.4ABFB853BF@lists.llvm.org>

Author: jdenny
Date: Sat Oct 12 11:51:08 2019
New Revision: 374676

URL: http://llvm.org/viewvc/llvm-project?rev=374676&view=rev
Log:
Revert r374666: "[lit] Adjust error handling for decode introduced by r374665"

This series of patches still breaks a Windows bot.
Modified:
    llvm/trunk/utils/lit/lit/builtin_commands/diff.py

Modified: llvm/trunk/utils/lit/lit/builtin_commands/diff.py
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/lit/builtin_commands/diff.py?rev=374676&r1=374675&r2=374676&view=diff
==============================================================================
--- llvm/trunk/utils/lit/lit/builtin_commands/diff.py (original)
+++ llvm/trunk/utils/lit/lit/builtin_commands/diff.py Sat Oct 12 11:51:08 2019
@@ -97,7 +97,7 @@ def compareTwoTextFiles(flags, filepaths
                      n = flags.num_context_lines):
         if hasattr(diff, 'decode'):
             # python 2.7
-            diff = diff.decode(errors="backslashreplace")
+            diff = diff.decode()
         sys.stdout.write(diff)
         exitCode = 1
     return exitCode


From llvm-commits at lists.llvm.org Sat Oct 12 11:51:18 2019
From: llvm-commits at lists.llvm.org (Joel E. Denny via llvm-commits)
Date: Sat, 12 Oct 2019 18:51:18 -0000
Subject: [llvm] r374677 - Revert r374665: "[lit] Try yet again to fix new tests that fail on Windows bots"
Message-ID: <20191012185118.A5CE9853BF@lists.llvm.org>

Author: jdenny
Date: Sat Oct 12 11:51:18 2019
New Revision: 374677

URL: http://llvm.org/viewvc/llvm-project?rev=374677&view=rev
Log:
Revert r374665: "[lit] Try yet again to fix new tests that fail on Windows bots"

This series of patches still breaks a Windows bot.
Modified:
    llvm/trunk/utils/lit/lit/builtin_commands/diff.py

Modified: llvm/trunk/utils/lit/lit/builtin_commands/diff.py
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/lit/builtin_commands/diff.py?rev=374677&r1=374676&r2=374677&view=diff
==============================================================================
--- llvm/trunk/utils/lit/lit/builtin_commands/diff.py (original)
+++ llvm/trunk/utils/lit/lit/builtin_commands/diff.py Sat Oct 12 11:51:18 2019
@@ -95,9 +95,6 @@ def compareTwoTextFiles(flags, filepaths
     func = difflib.unified_diff if flags.unified_diff else difflib.context_diff
     for diff in func(filelines[0], filelines[1], filepaths[0], filepaths[1],
                      n = flags.num_context_lines):
-        if hasattr(diff, 'decode'):
-            # python 2.7
-            diff = diff.decode()
         sys.stdout.write(diff)
         exitCode = 1
     return exitCode


From llvm-commits at lists.llvm.org Sat Oct 12 11:51:34 2019
From: llvm-commits at lists.llvm.org (Joel E. Denny via llvm-commits)
Date: Sat, 12 Oct 2019 18:51:34 -0000
Subject: [llvm] r374678 - Revert r374653: "[lit] Fix a few oversights in r374651 that broke some bots"
Message-ID: <20191012185134.6AB618AF4E@lists.llvm.org>

Author: jdenny
Date: Sat Oct 12 11:51:34 2019
New Revision: 374678

URL: http://llvm.org/viewvc/llvm-project?rev=374678&view=rev
Log:
Revert r374653: "[lit] Fix a few oversights in r374651 that broke some bots"

This series of patches still breaks a Windows bot.
Modified: llvm/trunk/test/MC/ARM/preserve-comments-arm.s llvm/trunk/utils/lit/tests/shtest-shell.py Modified: llvm/trunk/test/MC/ARM/preserve-comments-arm.s URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/MC/ARM/preserve-comments-arm.s?rev=374678&r1=374677&r2=374678&view=diff ============================================================================== --- llvm/trunk/test/MC/ARM/preserve-comments-arm.s (original) +++ llvm/trunk/test/MC/ARM/preserve-comments-arm.s Sat Oct 12 11:51:34 2019 @@ -1,6 +1,6 @@ @RUN: llvm-mc -preserve-comments -n -triple arm-eabi < %s > %t @RUN: sed 's/#[C]omment/@Comment/g' %s > %t2 - @RUN: diff --strip-trailing-cr %t %t2 + @RUN: diff %t %t2 .text mov r0, r0 Modified: llvm/trunk/utils/lit/tests/shtest-shell.py URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/shtest-shell.py?rev=374678&r1=374677&r2=374678&view=diff ============================================================================== --- llvm/trunk/utils/lit/tests/shtest-shell.py (original) +++ llvm/trunk/utils/lit/tests/shtest-shell.py Sat Oct 12 11:51:34 2019 @@ -4,7 +4,7 @@ # FIXME: Temporarily dump test output so we can debug failing tests on # buildbots. # RUN: cat %t.out -# RUN: FileCheck --input-file %t.out %s +# RUN: FileCheck --dump-input=fail --color -vv --input-file %t.out %s # # END. From llvm-commits at lists.llvm.org Sat Oct 12 11:51:51 2019 From: llvm-commits at lists.llvm.org (Joel E. Denny via llvm-commits) Date: Sat, 12 Oct 2019 18:51:51 -0000 Subject: [llvm] r374679 - Revert r374652: "[lit] Fix internal diff's --strip-trailing-cr and use it" Message-ID: <20191012185151.A58968AFD3@lists.llvm.org> Author: jdenny Date: Sat Oct 12 11:51:51 2019 New Revision: 374679 URL: http://llvm.org/viewvc/llvm-project?rev=374679&view=rev Log: Revert r374652: "[lit] Fix internal diff's --strip-trailing-cr and use it" This series of patches still breaks a Windows bot. 
Removed: llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-in.dos llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-in.unix llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-strip-trailing-cr.txt Modified: llvm/trunk/test/MC/AsmParser/preserve-comments.s llvm/trunk/test/tools/llvm-cxxmap/remap.test llvm/trunk/test/tools/llvm-profdata/profile-symbol-list.test llvm/trunk/test/tools/llvm-profdata/roundtrip.test llvm/trunk/test/tools/llvm-profdata/sample-remap.test llvm/trunk/utils/lit/lit/builtin_commands/diff.py llvm/trunk/utils/lit/tests/max-failures.py llvm/trunk/utils/lit/tests/shtest-shell.py Modified: llvm/trunk/test/MC/AsmParser/preserve-comments.s URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/MC/AsmParser/preserve-comments.s?rev=374679&r1=374678&r2=374679&view=diff ============================================================================== --- llvm/trunk/test/MC/AsmParser/preserve-comments.s (original) +++ llvm/trunk/test/MC/AsmParser/preserve-comments.s Sat Oct 12 11:51:51 2019 @@ -1,5 +1,5 @@ #RUN: llvm-mc -preserve-comments -n -triple i386-linux-gnu < %s > %t - #RUN: diff --strip-trailing-cr %s %t + #RUN: diff %s %t .text foo: #Comment here Modified: llvm/trunk/test/tools/llvm-cxxmap/remap.test URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/llvm-cxxmap/remap.test?rev=374679&r1=374678&r2=374679&view=diff ============================================================================== --- llvm/trunk/test/tools/llvm-cxxmap/remap.test (original) +++ llvm/trunk/test/tools/llvm-cxxmap/remap.test Sat Oct 12 11:51:51 2019 @@ -1,5 +1,5 @@ RUN: llvm-cxxmap %S/Inputs/before.sym %S/Inputs/after.sym -r %S/Inputs/remap.map -o %t.output -Wambiguous -Wincomplete 2>&1 | FileCheck %s --allow-empty -RUN: diff --strip-trailing-cr %S/Inputs/expected %t.output +RUN: diff %S/Inputs/expected %t.output CHECK-NOT: warning CHECK-NOT: error Modified: llvm/trunk/test/tools/llvm-profdata/profile-symbol-list.test URL: 
http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/llvm-profdata/profile-symbol-list.test?rev=374679&r1=374678&r2=374679&view=diff ============================================================================== --- llvm/trunk/test/tools/llvm-profdata/profile-symbol-list.test (original) +++ llvm/trunk/test/tools/llvm-profdata/profile-symbol-list.test Sat Oct 12 11:51:51 2019 @@ -2,4 +2,4 @@ ; RUN: llvm-profdata merge -sample -extbinary -prof-sym-list=%S/Inputs/profile-symbol-list-2.text %S/Inputs/sample-profile.proftext -o %t.2.output ; RUN: llvm-profdata merge -sample -extbinary %t.1.output %t.2.output -o %t.3.output ; RUN: llvm-profdata show -sample -show-prof-sym-list %t.3.output > %t.4.output -; RUN: diff --strip-trailing-cr %S/Inputs/profile-symbol-list.expected %t.4.output +; RUN: diff %S/Inputs/profile-symbol-list.expected %t.4.output Modified: llvm/trunk/test/tools/llvm-profdata/roundtrip.test URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/llvm-profdata/roundtrip.test?rev=374679&r1=374678&r2=374679&view=diff ============================================================================== --- llvm/trunk/test/tools/llvm-profdata/roundtrip.test (original) +++ llvm/trunk/test/tools/llvm-profdata/roundtrip.test Sat Oct 12 11:51:51 2019 @@ -1,18 +1,18 @@ RUN: llvm-profdata merge -o %t.0.profdata %S/Inputs/IR_profile.proftext RUN: llvm-profdata show -o %t.0.proftext -all-functions -text %t.0.profdata -RUN: diff --strip-trailing-cr %t.0.proftext %S/Inputs/IR_profile.proftext +RUN: diff %t.0.proftext %S/Inputs/IR_profile.proftext RUN: llvm-profdata merge -o %t.1.profdata %t.0.proftext RUN: llvm-profdata show -o %t.1.proftext -all-functions -text %t.1.profdata -RUN: diff --strip-trailing-cr %t.1.proftext %S/Inputs/IR_profile.proftext +RUN: diff %t.1.proftext %S/Inputs/IR_profile.proftext RUN: llvm-profdata merge --sample --binary -output=%t.2.profdata %S/Inputs/sample-profile.proftext RUN: llvm-profdata merge --sample --text -output=%t.2.proftext 
%t.2.profdata -RUN: diff --strip-trailing-cr %t.2.proftext %S/Inputs/sample-profile.proftext +RUN: diff %t.2.proftext %S/Inputs/sample-profile.proftext # Round trip from text --> extbinary --> text RUN: llvm-profdata merge --sample --extbinary -output=%t.3.profdata %S/Inputs/sample-profile.proftext RUN: llvm-profdata merge --sample --text -output=%t.3.proftext %t.3.profdata -RUN: diff --strip-trailing-cr %t.3.proftext %S/Inputs/sample-profile.proftext +RUN: diff %t.3.proftext %S/Inputs/sample-profile.proftext # Round trip from text --> binary --> extbinary --> text RUN: llvm-profdata merge --sample --binary -output=%t.4.profdata %S/Inputs/sample-profile.proftext RUN: llvm-profdata merge --sample --extbinary -output=%t.5.profdata %t.4.profdata RUN: llvm-profdata merge --sample --text -output=%t.4.proftext %t.5.profdata -RUN: diff --strip-trailing-cr %t.4.proftext %S/Inputs/sample-profile.proftext +RUN: diff %t.4.proftext %S/Inputs/sample-profile.proftext Modified: llvm/trunk/test/tools/llvm-profdata/sample-remap.test URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/llvm-profdata/sample-remap.test?rev=374679&r1=374678&r2=374679&view=diff ============================================================================== --- llvm/trunk/test/tools/llvm-profdata/sample-remap.test (original) +++ llvm/trunk/test/tools/llvm-profdata/sample-remap.test Sat Oct 12 11:51:51 2019 @@ -1,2 +1,2 @@ ; RUN: llvm-profdata merge -sample -text %S/Inputs/sample-remap.proftext -r %S/Inputs/sample-remap.remap -o %t.output -; RUN: diff --strip-trailing-cr %S/Inputs/sample-remap.expected %t.output +; RUN: diff %S/Inputs/sample-remap.expected %t.output Modified: llvm/trunk/utils/lit/lit/builtin_commands/diff.py URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/lit/builtin_commands/diff.py?rev=374679&r1=374678&r2=374679&view=diff ============================================================================== --- llvm/trunk/utils/lit/lit/builtin_commands/diff.py (original) 
+++ llvm/trunk/utils/lit/lit/builtin_commands/diff.py Sat Oct 12 11:51:51 2019 @@ -83,7 +83,7 @@ def compareTwoTextFiles(flags, filepaths f = lambda x: x if flags.strip_trailing_cr: - f = compose2(lambda line: line.replace('\r\n', '\n'), f) + f = compose2(lambda line: line.rstrip('\r'), f) if flags.ignore_all_space or flags.ignore_space_change: ignoreSpace = lambda line, separator: separator.join(line.split()) ignoreAllSpaceOrSpaceChange = functools.partial(ignoreSpace, separator='' if flags.ignore_all_space else ' ') Removed: llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-in.dos URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-in.dos?rev=374678&view=auto ============================================================================== --- llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-in.dos (original) +++ llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-in.dos (removed) @@ -1,3 +0,0 @@ -In this file, the -sequence "\r\n" -terminates lines. Removed: llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-in.unix URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-in.unix?rev=374678&view=auto ============================================================================== --- llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-in.unix (original) +++ llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-in.unix (removed) @@ -1,3 +0,0 @@ -In this file, the -sequence "\n" -terminates lines. 
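Note on the `--strip-trailing-cr` hunk in diff.py above: the following is a minimal sketch, not part of the commit, of how the two lambda bodies appear to differ. It assumes each line is a `str` that still carries its trailing "\n", as `readlines()` returns it; the helper names are hypothetical and only the two expressions are taken from the hunk.

```python
# Hypothetical helper names; only the two expressions come from the diff.py hunk.

def strip_with_rstrip(line):
    # Form restored by this revert: removes '\r' only at the very end of the string.
    return line.rstrip('\r')


def strip_with_replace(line):
    # Form that r374652 had introduced: rewrites a DOS line ending to a Unix one.
    return line.replace('\r\n', '\n')


# Assumption: lines keep their terminator, as readlines() returns them.
dos_line = 'sequence\r\n'
unix_line = 'sequence\n'

# With the terminator still attached, '\n' is the last character, so rstrip('\r')
# leaves the '\r' in place and the DOS and Unix lines still compare unequal.
print(strip_with_rstrip(dos_line) == unix_line)

# replace() targets the '\r\n' pair directly, so the two lines compare equal.
print(strip_with_replace(dos_line) == unix_line)
```

Under that assumption, the restored `rstrip('\r')` form would only strip a carriage return from a line that has no trailing newline, which is presumably what r374652 set out to change.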
Removed: llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-strip-trailing-cr.txt URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-strip-trailing-cr.txt?rev=374678&view=auto ============================================================================== --- llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-strip-trailing-cr.txt (original) +++ llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-strip-trailing-cr.txt (removed) @@ -1,10 +0,0 @@ -# Check behavior of --strip-trailing-cr. - -# RUN: diff -u diff-in.dos diff-in.unix && false || true -# RUN: diff -u diff-in.unix diff-in.dos && false || true - -# RUN: diff -u --strip-trailing-cr diff-in.dos diff-in.unix && false || true -# RUN: diff -u --strip-trailing-cr diff-in.unix diff-in.dos && false || true - -# Fail so lit will print output. -# RUN: false Modified: llvm/trunk/utils/lit/tests/max-failures.py URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/max-failures.py?rev=374679&r1=374678&r2=374679&view=diff ============================================================================== --- llvm/trunk/utils/lit/tests/max-failures.py (original) +++ llvm/trunk/utils/lit/tests/max-failures.py Sat Oct 12 11:51:51 2019 @@ -8,7 +8,7 @@ # # END. -# CHECK: Failing Tests (32) +# CHECK: Failing Tests (31) # CHECK: Failing Tests (1) # CHECK: Failing Tests (2) # CHECK: error: argument --max-failures: requires positive integer, but found '0' Modified: llvm/trunk/utils/lit/tests/shtest-shell.py URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/shtest-shell.py?rev=374679&r1=374678&r2=374679&view=diff ============================================================================== --- llvm/trunk/utils/lit/tests/shtest-shell.py (original) +++ llvm/trunk/utils/lit/tests/shtest-shell.py Sat Oct 12 11:51:51 2019 @@ -4,7 +4,7 @@ # FIXME: Temporarily dump test output so we can debug failing tests on # buildbots. 
# RUN: cat %t.out -# RUN: FileCheck --dump-input=fail --color -vv --input-file %t.out %s +# RUN: FileCheck --input-file %t.out %s # # END. @@ -332,59 +332,6 @@ # CHECK: PASS: shtest-shell :: diff-r.txt -# CHECK: FAIL: shtest-shell :: diff-strip-trailing-cr.txt - -# CHECK: *** TEST 'shtest-shell :: diff-strip-trailing-cr.txt' FAILED *** - -# CHECK: $ "diff" "-u" "diff-in.dos" "diff-in.unix" -# CHECK: # command output: -# CHECK: @@ -# CHECK-NEXT: -In this file, the -# CHECK-NEXT: -sequence "\r\n" -# CHECK-NEXT: -terminates lines. -# CHECK-NEXT: +In this file, the -# CHECK-NEXT: +sequence "\n" -# CHECK-NEXT: +terminates lines. -# CHECK: error: command failed with exit status: 1 -# CHECK: $ "true" - -# CHECK: $ "diff" "-u" "diff-in.unix" "diff-in.dos" -# CHECK: # command output: -# CHECK: @@ -# CHECK-NEXT: -In this file, the -# CHECK-NEXT: -sequence "\n" -# CHECK-NEXT: -terminates lines. -# CHECK-NEXT: +In this file, the -# CHECK-NEXT: +sequence "\r\n" -# CHECK-NEXT: +terminates lines. -# CHECK: error: command failed with exit status: 1 -# CHECK: $ "true" - -# CHECK: $ "diff" "-u" "--strip-trailing-cr" "diff-in.dos" "diff-in.unix" -# CHECK: # command output: -# CHECK: @@ -# CHECK-NEXT: In this file, the -# CHECK-NEXT: -sequence "\r\n" -# CHECK-NEXT: +sequence "\n" -# CHECK-NEXT: terminates lines. -# CHECK: error: command failed with exit status: 1 -# CHECK: $ "true" - -# CHECK: $ "diff" "-u" "--strip-trailing-cr" "diff-in.unix" "diff-in.dos" -# CHECK: # command output: -# CHECK: @@ -# CHECK-NEXT: In this file, the -# CHECK-NEXT: -sequence "\n" -# CHECK-NEXT: +sequence "\r\n" -# CHECK-NEXT: terminates lines. 
-# CHECK: error: command failed with exit status: 1 -# CHECK: $ "true" - -# CHECK: $ "false" - -# CHECK: *** - - # CHECK: FAIL: shtest-shell :: diff-unified.txt # CHECK: *** TEST 'shtest-shell :: diff-unified.txt' FAILED *** @@ -539,4 +486,4 @@ # CHECK: PASS: shtest-shell :: sequencing-0.txt # CHECK: XFAIL: shtest-shell :: sequencing-1.txt # CHECK: PASS: shtest-shell :: valid-shell.txt -# CHECK: Failing Tests (32) +# CHECK: Failing Tests (31) From llvm-commits at lists.llvm.org Sat Oct 12 11:52:05 2019 From: llvm-commits at lists.llvm.org (Joel E. Denny via llvm-commits) Date: Sat, 12 Oct 2019 18:52:05 -0000 Subject: [llvm] r374680 - Revert 374651: "Reland r374392: [lit] Extend internal diff to support -U" Message-ID: <20191012185205.3FF908B048@lists.llvm.org> Author: jdenny Date: Sat Oct 12 11:52:05 2019 New Revision: 374680 URL: http://llvm.org/viewvc/llvm-project?rev=374680&view=rev Log: Revert 374651: "Reland r374392: [lit] Extend internal diff to support -U" This series of patches still breaks a Windows bot. 
Removed: llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-unified.txt Modified: llvm/trunk/utils/lit/lit/builtin_commands/diff.py llvm/trunk/utils/lit/tests/max-failures.py llvm/trunk/utils/lit/tests/shtest-shell.py Modified: llvm/trunk/utils/lit/lit/builtin_commands/diff.py URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/lit/builtin_commands/diff.py?rev=374680&r1=374679&r2=374680&view=diff ============================================================================== --- llvm/trunk/utils/lit/lit/builtin_commands/diff.py (original) +++ llvm/trunk/utils/lit/lit/builtin_commands/diff.py Sat Oct 12 11:52:05 2019 @@ -10,7 +10,6 @@ class DiffFlags(): self.ignore_all_space = False self.ignore_space_change = False self.unified_diff = False - self.num_context_lines = 3 self.recursive_diff = False self.strip_trailing_cr = False @@ -49,10 +48,7 @@ def compareTwoBinaryFiles(flags, filepat exitCode = 0 if hasattr(difflib, 'diff_bytes'): # python 3.5 or newer - diffs = difflib.diff_bytes(difflib.unified_diff, filelines[0], - filelines[1], filepaths[0].encode(), - filepaths[1].encode(), - n = flags.num_context_lines) + diffs = difflib.diff_bytes(difflib.unified_diff, filelines[0], filelines[1], filepaths[0].encode(), filepaths[1].encode()) diffs = [diff.decode(errors="backslashreplace") for diff in diffs] else: # python 2.7 @@ -60,8 +56,7 @@ def compareTwoBinaryFiles(flags, filepat func = difflib.unified_diff else: func = difflib.context_diff - diffs = func(filelines[0], filelines[1], filepaths[0], filepaths[1], - n = flags.num_context_lines) + diffs = func(filelines[0], filelines[1], filepaths[0], filepaths[1]) for diff in diffs: sys.stdout.write(diff) @@ -93,8 +88,7 @@ def compareTwoTextFiles(flags, filepaths filelines[idx]= [f(line) for line in lines] func = difflib.unified_diff if flags.unified_diff else difflib.context_diff - for diff in func(filelines[0], filelines[1], filepaths[0], filepaths[1], - n = flags.num_context_lines): + for diff in 
func(filelines[0], filelines[1], filepaths[0], filepaths[1]): sys.stdout.write(diff) exitCode = 1 return exitCode @@ -177,7 +171,7 @@ def compareDirTrees(flags, dir_trees, ba def main(argv): args = argv[1:] try: - opts, args = getopt.gnu_getopt(args, "wbuU:r", ["strip-trailing-cr"]) + opts, args = getopt.gnu_getopt(args, "wbur", ["strip-trailing-cr"]) except getopt.GetoptError as err: sys.stderr.write("Unsupported: 'diff': %s\n" % str(err)) sys.exit(1) @@ -191,16 +185,6 @@ def main(argv): flags.ignore_space_change = True elif o == "-u": flags.unified_diff = True - elif o.startswith("-U"): - flags.unified_diff = True - try: - flags.num_context_lines = int(a) - if flags.num_context_lines < 0: - raise ValueException - except: - sys.stderr.write("Error: invalid '-U' argument: {}\n" - .format(a)) - sys.exit(1) elif o == "-r": flags.recursive_diff = True elif o == "--strip-trailing-cr": Removed: llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-unified.txt URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-unified.txt?rev=374679&view=auto ============================================================================== --- llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-unified.txt (original) +++ llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-unified.txt (removed) @@ -1,38 +0,0 @@ -# RUN: echo 1 > %t.foo -# RUN: echo 2 >> %t.foo -# RUN: echo 3 >> %t.foo -# RUN: echo 4 >> %t.foo -# RUN: echo 5 >> %t.foo -# RUN: echo 6 foo >> %t.foo -# RUN: echo 7 >> %t.foo -# RUN: echo 8 >> %t.foo -# RUN: echo 9 >> %t.foo -# RUN: echo 10 >> %t.foo -# RUN: echo 11 >> %t.foo - -# RUN: echo 1 > %t.bar -# RUN: echo 2 >> %t.bar -# RUN: echo 3 >> %t.bar -# RUN: echo 4 >> %t.bar -# RUN: echo 5 >> %t.bar -# RUN: echo 6 bar >> %t.bar -# RUN: echo 7 >> %t.bar -# RUN: echo 8 >> %t.bar -# RUN: echo 9 >> %t.bar -# RUN: echo 10 >> %t.bar -# RUN: echo 11 >> %t.bar - -# Default is 3 lines of context. 
-# RUN: diff -u %t.foo %t.bar && false || true - -# Override default of 3 lines of context. -# RUN: diff -U 2 %t.foo %t.bar && false || true -# RUN: diff -U4 %t.foo %t.bar && false || true -# RUN: diff -U0 %t.foo %t.bar && false || true - -# Check bad -U argument. -# RUN: diff -U 30.1 %t.foo %t.foo && false || true -# RUN: diff -U-1 %t.foo %t.foo && false || true - -# Fail so lit will print output. -# RUN: false Modified: llvm/trunk/utils/lit/tests/max-failures.py URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/max-failures.py?rev=374680&r1=374679&r2=374680&view=diff ============================================================================== --- llvm/trunk/utils/lit/tests/max-failures.py (original) +++ llvm/trunk/utils/lit/tests/max-failures.py Sat Oct 12 11:52:05 2019 @@ -8,7 +8,7 @@ # # END. -# CHECK: Failing Tests (31) +# CHECK: Failing Tests (30) # CHECK: Failing Tests (1) # CHECK: Failing Tests (2) # CHECK: error: argument --max-failures: requires positive integer, but found '0' Modified: llvm/trunk/utils/lit/tests/shtest-shell.py URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/shtest-shell.py?rev=374680&r1=374679&r2=374680&view=diff ============================================================================== --- llvm/trunk/utils/lit/tests/shtest-shell.py (original) +++ llvm/trunk/utils/lit/tests/shtest-shell.py Sat Oct 12 11:52:05 2019 @@ -331,82 +331,6 @@ # CHECK: PASS: shtest-shell :: diff-r.txt - -# CHECK: FAIL: shtest-shell :: diff-unified.txt - -# CHECK: *** TEST 'shtest-shell :: diff-unified.txt' FAILED *** - -# CHECK: $ "diff" "-u" "{{[^"]*}}.foo" "{{[^"]*}}.bar" -# CHECK: # command output: -# CHECK: @@ {{.*}} @@ -# CHECK-NEXT: 3 -# CHECK-NEXT: 4 -# CHECK-NEXT: 5 -# CHECK-NEXT: -6 foo -# CHECK-NEXT: +6 bar -# CHECK-NEXT: 7 -# CHECK-NEXT: 8 -# CHECK-NEXT: 9 -# CHECK-EMPTY: -# CHECK-NEXT: error: command failed with exit status: 1 -# CHECK-NEXT: $ "true" - -# CHECK: $ "diff" "-U" "2" "{{[^"]*}}.foo" 
"{{[^"]*}}.bar" -# CHECK: # command output: -# CHECK: @@ {{.*}} @@ -# CHECK-NEXT: 4 -# CHECK-NEXT: 5 -# CHECK-NEXT: -6 foo -# CHECK-NEXT: +6 bar -# CHECK-NEXT: 7 -# CHECK-NEXT: 8 -# CHECK-EMPTY: -# CHECK-NEXT: error: command failed with exit status: 1 -# CHECK-NEXT: $ "true" - -# CHECK: $ "diff" "-U4" "{{[^"]*}}.foo" "{{[^"]*}}.bar" -# CHECK: # command output: -# CHECK: @@ {{.*}} @@ -# CHECK-NEXT: 2 -# CHECK-NEXT: 3 -# CHECK-NEXT: 4 -# CHECK-NEXT: 5 -# CHECK-NEXT: -6 foo -# CHECK-NEXT: +6 bar -# CHECK-NEXT: 7 -# CHECK-NEXT: 8 -# CHECK-NEXT: 9 -# CHECK-NEXT: 10 -# CHECK-EMPTY: -# CHECK-NEXT: error: command failed with exit status: 1 -# CHECK-NEXT: $ "true" - -# CHECK: $ "diff" "-U0" "{{[^"]*}}.foo" "{{[^"]*}}.bar" -# CHECK: # command output: -# CHECK: @@ {{.*}} @@ -# CHECK-NEXT: -6 foo -# CHECK-NEXT: +6 bar -# CHECK-EMPTY: -# CHECK-NEXT: error: command failed with exit status: 1 -# CHECK-NEXT: $ "true" - -# CHECK: $ "diff" "-U" "30.1" "{{[^"]*}}" "{{[^"]*}}" -# CHECK: # command stderr: -# CHECK: Error: invalid '-U' argument: 30.1 -# CHECK: error: command failed with exit status: 1 -# CHECK: $ "true" - -# CHECK: $ "diff" "-U-1" "{{[^"]*}}" "{{[^"]*}}" -# CHECK: # command stderr: -# CHECK: Error: invalid '-U' argument: -1 -# CHECK: error: command failed with exit status: 1 -# CHECK: $ "true" - -# CHECK: $ "false" - -# CHECK: *** - - # CHECK: FAIL: shtest-shell :: error-0.txt # CHECK: *** TEST 'shtest-shell :: error-0.txt' FAILED *** # CHECK: $ "not-a-real-command" @@ -486,4 +410,4 @@ # CHECK: PASS: shtest-shell :: sequencing-0.txt # CHECK: XFAIL: shtest-shell :: sequencing-1.txt # CHECK: PASS: shtest-shell :: valid-shell.txt -# CHECK: Failing Tests (31) +# CHECK: Failing Tests (30) From llvm-commits at lists.llvm.org Sat Oct 12 11:52:18 2019 From: llvm-commits at lists.llvm.org (Joel E. 
Denny via llvm-commits) Date: Sat, 12 Oct 2019 18:52:18 -0000 Subject: [llvm] r374681 - Revert r374650: "Reland r374390: [lit] Extend internal diff to support `-` argument" Message-ID: <20191012185218.555C08B06A@lists.llvm.org> Author: jdenny Date: Sat Oct 12 11:52:18 2019 New Revision: 374681 URL: http://llvm.org/viewvc/llvm-project?rev=374681&view=rev Log: Revert r374650: "Reland r374390: [lit] Extend internal diff to support `-` argument" This series of patches still breaks a Windows bot. Removed: llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-r-error-7.txt llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-r-error-8.txt Modified: llvm/trunk/utils/lit/lit/builtin_commands/diff.py llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-encodings.txt llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-pipes.txt llvm/trunk/utils/lit/tests/max-failures.py llvm/trunk/utils/lit/tests/shtest-shell.py Modified: llvm/trunk/utils/lit/lit/builtin_commands/diff.py URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/lit/builtin_commands/diff.py?rev=374681&r1=374680&r2=374681&view=diff ============================================================================== --- llvm/trunk/utils/lit/lit/builtin_commands/diff.py (original) +++ llvm/trunk/utils/lit/lit/builtin_commands/diff.py Sat Oct 12 11:52:18 2019 @@ -27,13 +27,8 @@ def getDirTree(path, basedir=""): def compareTwoFiles(flags, filepaths): filelines = [] for file in filepaths: - if file == "-": - stdin_fileno = sys.stdin.fileno() - with os.fdopen(os.dup(stdin_fileno), 'rb') as stdin_bin: - filelines.append(stdin_bin.readlines()) - else: - with open(file, 'rb') as file_bin: - filelines.append(file_bin.readlines()) + with open(file, 'rb') as file_bin: + filelines.append(file_bin.readlines()) try: return compareTwoTextFiles(flags, filepaths, filelines, @@ -199,13 +194,10 @@ def main(argv): exitCode = 0 try: for file in args: - if file != "-" and not os.path.isabs(file): + if not os.path.isabs(file): file = 
os.path.realpath(os.path.join(os.getcwd(), file)) if flags.recursive_diff: - if file == "-": - sys.stderr.write("Error: cannot recursively compare '-'\n") - sys.exit(1) dir_trees.append(getDirTree(file)) else: filepaths.append(file) Modified: llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-encodings.txt URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-encodings.txt?rev=374681&r1=374680&r2=374681&view=diff ============================================================================== --- llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-encodings.txt (original) +++ llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-encodings.txt Sat Oct 12 11:52:18 2019 @@ -5,11 +5,5 @@ # RUN: diff -u diff-in.utf8 diff-in.bin && false || true # RUN: diff -u diff-in.bin diff-in.utf8 && false || true -# RUN: cat diff-in.bin | diff -u - diff-in.bin -# RUN: cat diff-in.bin | diff -u diff-in.bin - -# RUN: cat diff-in.bin | diff -u diff-in.utf16 - && false || true -# RUN: cat diff-in.bin | diff -u diff-in.utf8 - && false || true -# RUN: cat diff-in.bin | diff -u - diff-in.utf8 && false || true - # Fail so lit will print output. # RUN: false Modified: llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-pipes.txt URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-pipes.txt?rev=374681&r1=374680&r2=374681&view=diff ============================================================================== --- llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-pipes.txt (original) +++ llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-pipes.txt Sat Oct 12 11:52:18 2019 @@ -5,16 +5,6 @@ # RUN: diff %t.foo %t.foo | FileCheck -allow-empty -check-prefix=EMPTY %s # RUN: diff -u %t.foo %t.bar | FileCheck %s && false || true -# Check input pipe. 
-# RUN: echo foo | diff -u - %t.foo -# RUN: echo foo | diff -u %t.foo - -# RUN: echo bar | diff -u %t.foo - && false || true -# RUN: echo bar | diff -u - %t.foo && false || true - -# Check output and input pipes at the same time. -# RUN: echo foo | diff - %t.foo | FileCheck -allow-empty -check-prefix=EMPTY %s -# RUN: echo bar | diff -u %t.foo - | FileCheck %s && false || true - # Fail so lit will print output. # RUN: false Removed: llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-r-error-7.txt URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-r-error-7.txt?rev=374680&view=auto ============================================================================== --- llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-r-error-7.txt (original) +++ llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-r-error-7.txt (removed) @@ -1,2 +0,0 @@ -# diff -r currently cannot handle stdin. -# RUN: diff -r - %t Removed: llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-r-error-8.txt URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-r-error-8.txt?rev=374680&view=auto ============================================================================== --- llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-r-error-8.txt (original) +++ llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-r-error-8.txt (removed) @@ -1,2 +0,0 @@ -# diff -r currently cannot handle stdin. -# RUN: diff -r %t - Modified: llvm/trunk/utils/lit/tests/max-failures.py URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/max-failures.py?rev=374681&r1=374680&r2=374681&view=diff ============================================================================== --- llvm/trunk/utils/lit/tests/max-failures.py (original) +++ llvm/trunk/utils/lit/tests/max-failures.py Sat Oct 12 11:52:18 2019 @@ -8,7 +8,7 @@ # # END. 
-# CHECK: Failing Tests (30) +# CHECK: Failing Tests (28) # CHECK: Failing Tests (1) # CHECK: Failing Tests (2) # CHECK: error: argument --max-failures: requires positive integer, but found '0' Modified: llvm/trunk/utils/lit/tests/shtest-shell.py URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/shtest-shell.py?rev=374681&r1=374680&r2=374681&view=diff ============================================================================== --- llvm/trunk/utils/lit/tests/shtest-shell.py (original) +++ llvm/trunk/utils/lit/tests/shtest-shell.py Sat Oct 12 11:52:18 2019 @@ -81,60 +81,6 @@ # CHECK: error: command failed with exit status: 1 # CHECK: $ "true" -# CHECK: $ "cat" "diff-in.bin" -# CHECK-NOT: error -# CHECK: $ "diff" "-u" "-" "diff-in.bin" -# CHECK-NOT: error - -# CHECK: $ "cat" "diff-in.bin" -# CHECK-NOT: error -# CHECK: $ "diff" "-u" "diff-in.bin" "-" -# CHECK-NOT: error - -# CHECK: $ "cat" "diff-in.bin" -# CHECK-NOT: error -# CHECK: $ "diff" "-u" "diff-in.utf16" "-" -# CHECK: # command output: -# CHECK-NEXT: --- -# CHECK-NEXT: +++ -# CHECK-NEXT: @@ -# CHECK-NEXT: {{^ .f.o.o.$}} -# CHECK-NEXT: {{^-.b.a.r.$}} -# CHECK-NEXT: {{^\+.b.a.r..}} -# CHECK-NEXT: {{^ .b.a.z.$}} -# CHECK: error: command failed with exit status: 1 -# CHECK: $ "true" - -# CHECK: $ "cat" "diff-in.bin" -# CHECK-NOT: error -# CHECK: $ "diff" "-u" "diff-in.utf8" "-" -# CHECK: # command output: -# CHECK-NEXT: --- -# CHECK-NEXT: +++ -# CHECK-NEXT: @@ -# CHECK-NEXT: -foo -# CHECK-NEXT: -bar -# CHECK-NEXT: -baz -# CHECK-NEXT: {{^\+.f.o.o.$}} -# CHECK-NEXT: {{^\+.b.a.r..}} -# CHECK-NEXT: {{^\+.b.a.z.$}} -# CHECK: error: command failed with exit status: 1 -# CHECK: $ "true" - -# CHECK: $ "diff" "-u" "-" "diff-in.utf8" -# CHECK: # command output: -# CHECK-NEXT: --- -# CHECK-NEXT: +++ -# CHECK-NEXT: @@ -# CHECK-NEXT: {{^\-.f.o.o.$}} -# CHECK-NEXT: {{^\-.b.a.r..}} -# CHECK-NEXT: {{^\-.b.a.z.$}} -# CHECK-NEXT: +foo -# CHECK-NEXT: +bar -# CHECK-NEXT: +baz -# CHECK: error: command failed with 
exit status: 1 -# CHECK: $ "true" - # CHECK: $ "false" # CHECK: *** @@ -212,51 +158,6 @@ # CHECK-NOT: error # CHECK: $ "true" -# CHECK: $ "echo" "foo" -# CHECK: $ "diff" "-u" "-" "{{[^"]*}}.foo" -# CHECK-NOT: note -# CHECK-NOT: error - -# CHECK: $ "echo" "foo" -# CHECK: $ "diff" "-u" "{{[^"]*}}.foo" "-" -# CHECK-NOT: note -# CHECK-NOT: error - -# CHECK: $ "echo" "bar" -# CHECK: $ "diff" "-u" "{{[^"]*}}.foo" "-" -# CHECK: # command output: -# CHECK: @@ -# CHECK-NEXT: -foo -# CHECK-NEXT: +bar -# CHECK: error: command failed with exit status: 1 -# CHECK: $ "true" - -# CHECK: $ "echo" "bar" -# CHECK: $ "diff" "-u" "-" "{{[^"]*}}.foo" -# CHECK: # command output: -# CHECK: @@ -# CHECK-NEXT: -bar -# CHECK-NEXT: +foo -# CHECK: error: command failed with exit status: 1 -# CHECK: $ "true" - -# CHECK: $ "echo" "foo" -# CHECK: $ "diff" "-" "{{[^"]*}}.foo" -# CHECK-NOT: note -# CHECK-NOT: error -# CHECK: $ "FileCheck" -# CHECK-NOT: note -# CHECK-NOT: error - -# CHECK: $ "echo" "bar" -# CHECK: $ "diff" "-u" "{{[^"]*}}.foo" "-" -# CHECK: note: command had no output on stdout or stderr -# CHECK: error: command failed with exit status: 1 -# CHECK: $ "FileCheck" -# CHECK-NOT: note -# CHECK-NOT: error -# CHECK: $ "true" - # CHECK: $ "false" # CHECK: *** @@ -315,20 +216,6 @@ # CHECK: File {{.*}}dir1{{.*}}extra_file is a regular empty file while file {{.*}}dir2{{.*}}extra_file is a directory # CHECK: error: command failed with exit status: 1 -# CHECK: FAIL: shtest-shell :: diff-r-error-7.txt -# CHECK: *** TEST 'shtest-shell :: diff-r-error-7.txt' FAILED *** -# CHECK: $ "diff" "-r" "-" "{{[^"]*}}" -# CHECK: # command stderr: -# CHECK: Error: cannot recursively compare '-' -# CHECK: error: command failed with exit status: 1 - -# CHECK: FAIL: shtest-shell :: diff-r-error-8.txt -# CHECK: *** TEST 'shtest-shell :: diff-r-error-8.txt' FAILED *** -# CHECK: $ "diff" "-r" "{{[^"]*}}" "-" -# CHECK: # command stderr: -# CHECK: Error: cannot recursively compare '-' -# CHECK: error: command failed 
with exit status: 1 - # CHECK: PASS: shtest-shell :: diff-r.txt # CHECK: FAIL: shtest-shell :: error-0.txt @@ -410,4 +297,4 @@ # CHECK: PASS: shtest-shell :: sequencing-0.txt # CHECK: XFAIL: shtest-shell :: sequencing-1.txt # CHECK: PASS: shtest-shell :: valid-shell.txt -# CHECK: Failing Tests (30) +# CHECK: Failing Tests (28) From llvm-commits at lists.llvm.org Sat Oct 12 11:52:31 2019 From: llvm-commits at lists.llvm.org (Joel E. Denny via llvm-commits) Date: Sat, 12 Oct 2019 18:52:31 -0000 Subject: [llvm] r374682 - Revert r374649: "Reland r374389: [lit] Clean up internal diff's encoding handling" Message-ID: <20191012185231.D48E48B02F@lists.llvm.org> Author: jdenny Date: Sat Oct 12 11:52:31 2019 New Revision: 374682 URL: http://llvm.org/viewvc/llvm-project?rev=374682&view=rev Log: Revert r374649: "Reland r374389: [lit] Clean up internal diff's encoding handling" This series of patches still breaks a Windows bot. Removed: llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-encodings.txt llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-in.bin llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-in.utf16 llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-in.utf8 Modified: llvm/trunk/utils/lit/lit/builtin_commands/diff.py llvm/trunk/utils/lit/tests/max-failures.py llvm/trunk/utils/lit/tests/shtest-shell.py Modified: llvm/trunk/utils/lit/lit/builtin_commands/diff.py URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/lit/builtin_commands/diff.py?rev=374682&r1=374681&r2=374682&view=diff ============================================================================== --- llvm/trunk/utils/lit/lit/builtin_commands/diff.py (original) +++ llvm/trunk/utils/lit/lit/builtin_commands/diff.py Sat Oct 12 11:52:31 2019 @@ -1,7 +1,6 @@ import difflib import functools import getopt -import locale import os import sys @@ -25,26 +24,37 @@ def getDirTree(path, basedir=""): return path, sorted(child_trees) def compareTwoFiles(flags, filepaths): + compare_bytes = False + 
encoding = None filelines = [] for file in filepaths: - with open(file, 'rb') as file_bin: - filelines.append(file_bin.readlines()) - - try: - return compareTwoTextFiles(flags, filepaths, filelines, - locale.getpreferredencoding(False)) - except UnicodeDecodeError: try: - return compareTwoTextFiles(flags, filepaths, filelines, "utf-8") - except: - return compareTwoBinaryFiles(flags, filepaths, filelines) + with open(file, 'r') as f: + filelines.append(f.readlines()) + except UnicodeDecodeError: + try: + with io.open(file, 'r', encoding="utf-8") as f: + filelines.append(f.readlines()) + encoding = "utf-8" + except: + compare_bytes = True + + if compare_bytes: + return compareTwoBinaryFiles(flags, filepaths) + else: + return compareTwoTextFiles(flags, filepaths, encoding) + +def compareTwoBinaryFiles(flags, filepaths): + filelines = [] + for file in filepaths: + with open(file, 'rb') as f: + filelines.append(f.readlines()) -def compareTwoBinaryFiles(flags, filepaths, filelines): exitCode = 0 if hasattr(difflib, 'diff_bytes'): # python 3.5 or newer diffs = difflib.diff_bytes(difflib.unified_diff, filelines[0], filelines[1], filepaths[0].encode(), filepaths[1].encode()) - diffs = [diff.decode(errors="backslashreplace") for diff in diffs] + diffs = [diff.decode() for diff in diffs] else: # python 2.7 if flags.unified_diff: @@ -58,14 +68,15 @@ def compareTwoBinaryFiles(flags, filepat exitCode = 1 return exitCode -def compareTwoTextFiles(flags, filepaths, filelines_bin, encoding): +def compareTwoTextFiles(flags, filepaths, encoding): filelines = [] - for lines_bin in filelines_bin: - lines = [] - for line_bin in lines_bin: - line = line_bin.decode(encoding=encoding) - lines.append(line) - filelines.append(lines) + for file in filepaths: + if encoding is None: + with open(file, 'r') as f: + filelines.append(f.readlines()) + else: + with io.open(file, 'r', encoding=encoding) as f: + filelines.append(f.readlines()) exitCode = 0 def compose2(f, g): Removed: 
llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-encodings.txt URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-encodings.txt?rev=374681&view=auto ============================================================================== --- llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-encodings.txt (original) +++ llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-encodings.txt (removed) @@ -1,9 +0,0 @@ -# Check that diff falls back to binary mode if it cannot decode a file. - -# RUN: diff -u diff-in.bin diff-in.bin -# RUN: diff -u diff-in.utf16 diff-in.bin && false || true -# RUN: diff -u diff-in.utf8 diff-in.bin && false || true -# RUN: diff -u diff-in.bin diff-in.utf8 && false || true - -# Fail so lit will print output. -# RUN: false Removed: llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-in.bin URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-in.bin?rev=374681&view=auto ============================================================================== Binary file - no diff available. Removed: llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-in.utf16 URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-in.utf16?rev=374681&view=auto ============================================================================== Binary file - no diff available. 
Removed: llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-in.utf8 URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-in.utf8?rev=374681&view=auto ============================================================================== --- llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-in.utf8 (original) +++ llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-in.utf8 (removed) @@ -1,3 +0,0 @@ -foo -bar -baz Modified: llvm/trunk/utils/lit/tests/max-failures.py URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/max-failures.py?rev=374682&r1=374681&r2=374682&view=diff ============================================================================== --- llvm/trunk/utils/lit/tests/max-failures.py (original) +++ llvm/trunk/utils/lit/tests/max-failures.py Sat Oct 12 11:52:31 2019 @@ -8,7 +8,7 @@ # # END. -# CHECK: Failing Tests (28) +# CHECK: Failing Tests (27) # CHECK: Failing Tests (1) # CHECK: Failing Tests (2) # CHECK: error: argument --max-failures: requires positive integer, but found '0' Modified: llvm/trunk/utils/lit/tests/shtest-shell.py URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/shtest-shell.py?rev=374682&r1=374681&r2=374682&view=diff ============================================================================== --- llvm/trunk/utils/lit/tests/shtest-shell.py (original) +++ llvm/trunk/utils/lit/tests/shtest-shell.py Sat Oct 12 11:52:31 2019 @@ -34,58 +34,6 @@ # CHECK: error: command failed with exit status: 127 # CHECK: *** - -# CHECK: FAIL: shtest-shell :: diff-encodings.txt -# CHECK: *** TEST 'shtest-shell :: diff-encodings.txt' FAILED *** - -# CHECK: $ "diff" "-u" "diff-in.bin" "diff-in.bin" -# CHECK-NOT: error - -# CHECK: $ "diff" "-u" "diff-in.utf16" "diff-in.bin" -# CHECK: # command output: -# CHECK-NEXT: --- -# CHECK-NEXT: +++ -# CHECK-NEXT: @@ -# CHECK-NEXT: {{^ .f.o.o.$}} -# CHECK-NEXT: {{^-.b.a.r.$}} -# CHECK-NEXT: {{^\+.b.a.r..}} -# CHECK-NEXT: {{^ .b.a.z.$}} -# CHECK: error: 
command failed with exit status: 1 -# CHECK: $ "true" - -# CHECK: $ "diff" "-u" "diff-in.utf8" "diff-in.bin" -# CHECK: # command output: -# CHECK-NEXT: --- -# CHECK-NEXT: +++ -# CHECK-NEXT: @@ -# CHECK-NEXT: -foo -# CHECK-NEXT: -bar -# CHECK-NEXT: -baz -# CHECK-NEXT: {{^\+.f.o.o.$}} -# CHECK-NEXT: {{^\+.b.a.r..}} -# CHECK-NEXT: {{^\+.b.a.z.$}} -# CHECK: error: command failed with exit status: 1 -# CHECK: $ "true" - -# CHECK: $ "diff" "-u" "diff-in.bin" "diff-in.utf8" -# CHECK: # command output: -# CHECK-NEXT: --- -# CHECK-NEXT: +++ -# CHECK-NEXT: @@ -# CHECK-NEXT: {{^\-.f.o.o.$}} -# CHECK-NEXT: {{^\-.b.a.r..}} -# CHECK-NEXT: {{^\-.b.a.z.$}} -# CHECK-NEXT: +foo -# CHECK-NEXT: +bar -# CHECK-NEXT: +baz -# CHECK: error: command failed with exit status: 1 -# CHECK: $ "true" - -# CHECK: $ "false" - -# CHECK: *** - - # CHECK: FAIL: shtest-shell :: diff-error-1.txt # CHECK: *** TEST 'shtest-shell :: diff-error-1.txt' FAILED *** # CHECK: $ "diff" "-B" "temp1.txt" "temp2.txt" @@ -297,4 +245,4 @@ # CHECK: PASS: shtest-shell :: sequencing-0.txt # CHECK: XFAIL: shtest-shell :: sequencing-1.txt # CHECK: PASS: shtest-shell :: valid-shell.txt -# CHECK: Failing Tests (28) +# CHECK: Failing Tests (27) From llvm-commits at lists.llvm.org Sat Oct 12 11:52:46 2019 From: llvm-commits at lists.llvm.org (Joel E. Denny via llvm-commits) Date: Sat, 12 Oct 2019 18:52:46 -0000 Subject: [llvm] r374683 - Revert r374648: "Reland r374388: [lit] Make internal diff work in pipelines" Message-ID: <20191012185246.518CF8B07B@lists.llvm.org> Author: jdenny Date: Sat Oct 12 11:52:46 2019 New Revision: 374683 URL: http://llvm.org/viewvc/llvm-project?rev=374683&view=rev Log: Revert r374648: "Reland r374388: [lit] Make internal diff work in pipelines" This series of patches still breaks a Windows bot. 
Added: llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-error-0.txt Removed: llvm/trunk/utils/lit/lit/builtin_commands/diff.py llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-pipes.txt Modified: llvm/trunk/utils/lit/lit/TestRunner.py llvm/trunk/utils/lit/tests/shtest-shell.py Modified: llvm/trunk/utils/lit/lit/TestRunner.py URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/lit/TestRunner.py?rev=374683&r1=374682&r2=374683&view=diff ============================================================================== --- llvm/trunk/utils/lit/lit/TestRunner.py (original) +++ llvm/trunk/utils/lit/lit/TestRunner.py Sat Oct 12 11:52:46 2019 @@ -1,5 +1,7 @@ from __future__ import absolute_import +import difflib import errno +import functools import io import itertools import getopt @@ -359,6 +361,218 @@ def executeBuiltinMkdir(cmd, cmd_shenv): exitCode = 1 return ShellCommandResult(cmd, "", stderr.getvalue(), exitCode, False) +def executeBuiltinDiff(cmd, cmd_shenv): + """executeBuiltinDiff - Compare files line by line.""" + args = expand_glob_expressions(cmd.args, cmd_shenv.cwd)[1:] + try: + opts, args = getopt.gnu_getopt(args, "wbur", ["strip-trailing-cr"]) + except getopt.GetoptError as err: + raise InternalShellError(cmd, "Unsupported: 'diff': %s" % str(err)) + + filelines, filepaths, dir_trees = ([] for i in range(3)) + ignore_all_space = False + ignore_space_change = False + unified_diff = False + recursive_diff = False + strip_trailing_cr = False + for o, a in opts: + if o == "-w": + ignore_all_space = True + elif o == "-b": + ignore_space_change = True + elif o == "-u": + unified_diff = True + elif o == "-r": + recursive_diff = True + elif o == "--strip-trailing-cr": + strip_trailing_cr = True + else: + assert False, "unhandled option" + + if len(args) != 2: + raise InternalShellError(cmd, "Error: missing or extra operand") + + def getDirTree(path, basedir=""): + # Tree is a tuple of form (dirname, child_trees). 
+ # An empty dir has child_trees = [], a file has child_trees = None. + child_trees = [] + for dirname, child_dirs, files in os.walk(os.path.join(basedir, path)): + for child_dir in child_dirs: + child_trees.append(getDirTree(child_dir, dirname)) + for filename in files: + child_trees.append((filename, None)) + return path, sorted(child_trees) + + def compareTwoFiles(filepaths): + compare_bytes = False + encoding = None + filelines = [] + for file in filepaths: + try: + with open(file, 'r') as f: + filelines.append(f.readlines()) + except UnicodeDecodeError: + try: + with io.open(file, 'r', encoding="utf-8") as f: + filelines.append(f.readlines()) + encoding = "utf-8" + except: + compare_bytes = True + + if compare_bytes: + return compareTwoBinaryFiles(filepaths) + else: + return compareTwoTextFiles(filepaths, encoding) + + def compareTwoBinaryFiles(filepaths): + filelines = [] + for file in filepaths: + with open(file, 'rb') as f: + filelines.append(f.readlines()) + + exitCode = 0 + if hasattr(difflib, 'diff_bytes'): + # python 3.5 or newer + diffs = difflib.diff_bytes(difflib.unified_diff, filelines[0], filelines[1], filepaths[0].encode(), filepaths[1].encode()) + diffs = [diff.decode() for diff in diffs] + else: + # python 2.7 + func = difflib.unified_diff if unified_diff else difflib.context_diff + diffs = func(filelines[0], filelines[1], filepaths[0], filepaths[1]) + + for diff in diffs: + stdout.write(diff) + exitCode = 1 + return exitCode + + def compareTwoTextFiles(filepaths, encoding): + filelines = [] + for file in filepaths: + if encoding is None: + with open(file, 'r') as f: + filelines.append(f.readlines()) + else: + with io.open(file, 'r', encoding=encoding) as f: + filelines.append(f.readlines()) + + exitCode = 0 + def compose2(f, g): + return lambda x: f(g(x)) + + f = lambda x: x + if strip_trailing_cr: + f = compose2(lambda line: line.rstrip('\r'), f) + if ignore_all_space or ignore_space_change: + ignoreSpace = lambda line, separator: 
separator.join(line.split()) + ignoreAllSpaceOrSpaceChange = functools.partial(ignoreSpace, separator='' if ignore_all_space else ' ') + f = compose2(ignoreAllSpaceOrSpaceChange, f) + + for idx, lines in enumerate(filelines): + filelines[idx]= [f(line) for line in lines] + + func = difflib.unified_diff if unified_diff else difflib.context_diff + for diff in func(filelines[0], filelines[1], filepaths[0], filepaths[1]): + stdout.write(diff) + exitCode = 1 + return exitCode + + def printDirVsFile(dir_path, file_path): + if os.path.getsize(file_path): + msg = "File %s is a directory while file %s is a regular file" + else: + msg = "File %s is a directory while file %s is a regular empty file" + stdout.write(msg % (dir_path, file_path) + "\n") + + def printFileVsDir(file_path, dir_path): + if os.path.getsize(file_path): + msg = "File %s is a regular file while file %s is a directory" + else: + msg = "File %s is a regular empty file while file %s is a directory" + stdout.write(msg % (file_path, dir_path) + "\n") + + def printOnlyIn(basedir, path, name): + stdout.write("Only in %s: %s\n" % (os.path.join(basedir, path), name)) + + def compareDirTrees(dir_trees, base_paths=["", ""]): + # Dirnames of the trees are not checked, it's caller's responsibility, + # as top-level dirnames are always different. Base paths are important + # for doing os.walk, but we don't put it into tree's dirname in order + # to speed up string comparison below and while sorting in getDirTree. + left_tree, right_tree = dir_trees[0], dir_trees[1] + left_base, right_base = base_paths[0], base_paths[1] + + # Compare two files or report file vs. directory mismatch. 
+ if left_tree[1] is None and right_tree[1] is None: + return compareTwoFiles([os.path.join(left_base, left_tree[0]), + os.path.join(right_base, right_tree[0])]) + + if left_tree[1] is None and right_tree[1] is not None: + printFileVsDir(os.path.join(left_base, left_tree[0]), + os.path.join(right_base, right_tree[0])) + return 1 + + if left_tree[1] is not None and right_tree[1] is None: + printDirVsFile(os.path.join(left_base, left_tree[0]), + os.path.join(right_base, right_tree[0])) + return 1 + + # Compare two directories via recursive use of compareDirTrees. + exitCode = 0 + left_names = [node[0] for node in left_tree[1]] + right_names = [node[0] for node in right_tree[1]] + l, r = 0, 0 + while l < len(left_names) and r < len(right_names): + # Names are sorted in getDirTree, rely on that order. + if left_names[l] < right_names[r]: + exitCode = 1 + printOnlyIn(left_base, left_tree[0], left_names[l]) + l += 1 + elif left_names[l] > right_names[r]: + exitCode = 1 + printOnlyIn(right_base, right_tree[0], right_names[r]) + r += 1 + else: + exitCode |= compareDirTrees([left_tree[1][l], right_tree[1][r]], + [os.path.join(left_base, left_tree[0]), + os.path.join(right_base, right_tree[0])]) + l += 1 + r += 1 + + # At least one of the trees has ended. Report names from the other tree. 
+ while l < len(left_names): + exitCode = 1 + printOnlyIn(left_base, left_tree[0], left_names[l]) + l += 1 + while r < len(right_names): + exitCode = 1 + printOnlyIn(right_base, right_tree[0], right_names[r]) + r += 1 + return exitCode + + stderr = StringIO() + stdout = StringIO() + exitCode = 0 + try: + for file in args: + if not os.path.isabs(file): + file = os.path.realpath(os.path.join(cmd_shenv.cwd, file)) + + if recursive_diff: + dir_trees.append(getDirTree(file)) + else: + filepaths.append(file) + + if not recursive_diff: + exitCode = compareTwoFiles(filepaths) + else: + exitCode = compareDirTrees(dir_trees) + + except IOError as err: + stderr.write("Error: 'diff' command failed, %s\n" % str(err)) + exitCode = 1 + + return ShellCommandResult(cmd, stdout.getvalue(), stderr.getvalue(), exitCode, False) + def executeBuiltinRm(cmd, cmd_shenv): """executeBuiltinRm - Removes (deletes) files or directories.""" args = expand_glob_expressions(cmd.args, cmd_shenv.cwd)[1:] @@ -624,6 +838,14 @@ def _executeShCmd(cmd, shenv, results, t results.append(cmdResult) return cmdResult.exitCode + if cmd.commands[0].args[0] == 'diff': + if len(cmd.commands) != 1: + raise InternalShellError(cmd.commands[0], "Unsupported: 'diff' " + "cannot be part of a pipeline") + cmdResult = executeBuiltinDiff(cmd.commands[0], shenv) + results.append(cmdResult) + return cmdResult.exitCode + if cmd.commands[0].args[0] == 'rm': if len(cmd.commands) != 1: raise InternalShellError(cmd.commands[0], "Unsupported: 'rm' " @@ -644,7 +866,7 @@ def _executeShCmd(cmd, shenv, results, t stderrTempFiles = [] opened_files = [] named_temp_files = [] - builtin_commands = set(['cat', 'diff']) + builtin_commands = set(['cat']) builtin_commands_dir = os.path.join(os.path.dirname(os.path.abspath(__file__)), "builtin_commands") # To avoid deadlock, we use a single stderr stream for piped # output. 
This is null until we have seen some output using Removed: llvm/trunk/utils/lit/lit/builtin_commands/diff.py URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/lit/builtin_commands/diff.py?rev=374682&view=auto ============================================================================== --- llvm/trunk/utils/lit/lit/builtin_commands/diff.py (original) +++ llvm/trunk/utils/lit/lit/builtin_commands/diff.py (removed) @@ -1,228 +0,0 @@ -import difflib -import functools -import getopt -import os -import sys - -class DiffFlags(): - def __init__(self): - self.ignore_all_space = False - self.ignore_space_change = False - self.unified_diff = False - self.recursive_diff = False - self.strip_trailing_cr = False - -def getDirTree(path, basedir=""): - # Tree is a tuple of form (dirname, child_trees). - # An empty dir has child_trees = [], a file has child_trees = None. - child_trees = [] - for dirname, child_dirs, files in os.walk(os.path.join(basedir, path)): - for child_dir in child_dirs: - child_trees.append(getDirTree(child_dir, dirname)) - for filename in files: - child_trees.append((filename, None)) - return path, sorted(child_trees) - -def compareTwoFiles(flags, filepaths): - compare_bytes = False - encoding = None - filelines = [] - for file in filepaths: - try: - with open(file, 'r') as f: - filelines.append(f.readlines()) - except UnicodeDecodeError: - try: - with io.open(file, 'r', encoding="utf-8") as f: - filelines.append(f.readlines()) - encoding = "utf-8" - except: - compare_bytes = True - - if compare_bytes: - return compareTwoBinaryFiles(flags, filepaths) - else: - return compareTwoTextFiles(flags, filepaths, encoding) - -def compareTwoBinaryFiles(flags, filepaths): - filelines = [] - for file in filepaths: - with open(file, 'rb') as f: - filelines.append(f.readlines()) - - exitCode = 0 - if hasattr(difflib, 'diff_bytes'): - # python 3.5 or newer - diffs = difflib.diff_bytes(difflib.unified_diff, filelines[0], filelines[1], filepaths[0].encode(), 
filepaths[1].encode()) - diffs = [diff.decode() for diff in diffs] - else: - # python 2.7 - if flags.unified_diff: - func = difflib.unified_diff - else: - func = difflib.context_diff - diffs = func(filelines[0], filelines[1], filepaths[0], filepaths[1]) - - for diff in diffs: - sys.stdout.write(diff) - exitCode = 1 - return exitCode - -def compareTwoTextFiles(flags, filepaths, encoding): - filelines = [] - for file in filepaths: - if encoding is None: - with open(file, 'r') as f: - filelines.append(f.readlines()) - else: - with io.open(file, 'r', encoding=encoding) as f: - filelines.append(f.readlines()) - - exitCode = 0 - def compose2(f, g): - return lambda x: f(g(x)) - - f = lambda x: x - if flags.strip_trailing_cr: - f = compose2(lambda line: line.rstrip('\r'), f) - if flags.ignore_all_space or flags.ignore_space_change: - ignoreSpace = lambda line, separator: separator.join(line.split()) - ignoreAllSpaceOrSpaceChange = functools.partial(ignoreSpace, separator='' if flags.ignore_all_space else ' ') - f = compose2(ignoreAllSpaceOrSpaceChange, f) - - for idx, lines in enumerate(filelines): - filelines[idx]= [f(line) for line in lines] - - func = difflib.unified_diff if flags.unified_diff else difflib.context_diff - for diff in func(filelines[0], filelines[1], filepaths[0], filepaths[1]): - sys.stdout.write(diff) - exitCode = 1 - return exitCode - -def printDirVsFile(dir_path, file_path): - if os.path.getsize(file_path): - msg = "File %s is a directory while file %s is a regular file" - else: - msg = "File %s is a directory while file %s is a regular empty file" - sys.stdout.write(msg % (dir_path, file_path) + "\n") - -def printFileVsDir(file_path, dir_path): - if os.path.getsize(file_path): - msg = "File %s is a regular file while file %s is a directory" - else: - msg = "File %s is a regular empty file while file %s is a directory" - sys.stdout.write(msg % (file_path, dir_path) + "\n") - -def printOnlyIn(basedir, path, name): - sys.stdout.write("Only in %s: %s\n" 
% (os.path.join(basedir, path), name)) - -def compareDirTrees(flags, dir_trees, base_paths=["", ""]): - # Dirnames of the trees are not checked, it's caller's responsibility, - # as top-level dirnames are always different. Base paths are important - # for doing os.walk, but we don't put it into tree's dirname in order - # to speed up string comparison below and while sorting in getDirTree. - left_tree, right_tree = dir_trees[0], dir_trees[1] - left_base, right_base = base_paths[0], base_paths[1] - - # Compare two files or report file vs. directory mismatch. - if left_tree[1] is None and right_tree[1] is None: - return compareTwoFiles(flags, - [os.path.join(left_base, left_tree[0]), - os.path.join(right_base, right_tree[0])]) - - if left_tree[1] is None and right_tree[1] is not None: - printFileVsDir(os.path.join(left_base, left_tree[0]), - os.path.join(right_base, right_tree[0])) - return 1 - - if left_tree[1] is not None and right_tree[1] is None: - printDirVsFile(os.path.join(left_base, left_tree[0]), - os.path.join(right_base, right_tree[0])) - return 1 - - # Compare two directories via recursive use of compareDirTrees. - exitCode = 0 - left_names = [node[0] for node in left_tree[1]] - right_names = [node[0] for node in right_tree[1]] - l, r = 0, 0 - while l < len(left_names) and r < len(right_names): - # Names are sorted in getDirTree, rely on that order. - if left_names[l] < right_names[r]: - exitCode = 1 - printOnlyIn(left_base, left_tree[0], left_names[l]) - l += 1 - elif left_names[l] > right_names[r]: - exitCode = 1 - printOnlyIn(right_base, right_tree[0], right_names[r]) - r += 1 - else: - exitCode |= compareDirTrees(flags, - [left_tree[1][l], right_tree[1][r]], - [os.path.join(left_base, left_tree[0]), - os.path.join(right_base, right_tree[0])]) - l += 1 - r += 1 - - # At least one of the trees has ended. Report names from the other tree. 
- while l < len(left_names): - exitCode = 1 - printOnlyIn(left_base, left_tree[0], left_names[l]) - l += 1 - while r < len(right_names): - exitCode = 1 - printOnlyIn(right_base, right_tree[0], right_names[r]) - r += 1 - return exitCode - -def main(argv): - args = argv[1:] - try: - opts, args = getopt.gnu_getopt(args, "wbur", ["strip-trailing-cr"]) - except getopt.GetoptError as err: - sys.stderr.write("Unsupported: 'diff': %s\n" % str(err)) - sys.exit(1) - - flags = DiffFlags() - filelines, filepaths, dir_trees = ([] for i in range(3)) - for o, a in opts: - if o == "-w": - flags.ignore_all_space = True - elif o == "-b": - flags.ignore_space_change = True - elif o == "-u": - flags.unified_diff = True - elif o == "-r": - flags.recursive_diff = True - elif o == "--strip-trailing-cr": - flags.strip_trailing_cr = True - else: - assert False, "unhandled option" - - if len(args) != 2: - sys.stderr.write("Error: missing or extra operand\n") - sys.exit(1) - - exitCode = 0 - try: - for file in args: - if not os.path.isabs(file): - file = os.path.realpath(os.path.join(os.getcwd(), file)) - - if flags.recursive_diff: - dir_trees.append(getDirTree(file)) - else: - filepaths.append(file) - - if not flags.recursive_diff: - exitCode = compareTwoFiles(flags, filepaths) - else: - exitCode = compareDirTrees(flags, dir_trees) - - except IOError as err: - sys.stderr.write("Error: 'diff' command failed, %s\n" % str(err)) - exitCode = 1 - - sys.exit(exitCode) - -if __name__ == "__main__": - main(sys.argv) Added: llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-error-0.txt URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-error-0.txt?rev=374683&view=auto ============================================================================== --- llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-error-0.txt (added) +++ llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-error-0.txt Sat Oct 12 11:52:46 2019 @@ -0,0 +1,3 @@ +# Check error on a 
unsupported diff (cannot be part of a pipeline). +# +# RUN: diff diff-error-0.txt diff-error-0.txt | echo Output Removed: llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-pipes.txt URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-pipes.txt?rev=374682&view=auto ============================================================================== --- llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-pipes.txt (original) +++ llvm/trunk/utils/lit/tests/Inputs/shtest-shell/diff-pipes.txt (removed) @@ -1,15 +0,0 @@ -# RUN: echo foo > %t.foo -# RUN: echo bar > %t.bar - -# Check output pipe. -# RUN: diff %t.foo %t.foo | FileCheck -allow-empty -check-prefix=EMPTY %s -# RUN: diff -u %t.foo %t.bar | FileCheck %s && false || true - -# Fail so lit will print output. -# RUN: false - -# CHECK: @@ -# CHECK-NEXT: -foo -# CHECK-NEXT: +bar - -# EMPTY-NOT: {{.}} Modified: llvm/trunk/utils/lit/tests/shtest-shell.py URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/lit/tests/shtest-shell.py?rev=374683&r1=374682&r2=374683&view=diff ============================================================================== --- llvm/trunk/utils/lit/tests/shtest-shell.py (original) +++ llvm/trunk/utils/lit/tests/shtest-shell.py Sat Oct 12 11:52:46 2019 @@ -34,20 +34,28 @@ # CHECK: error: command failed with exit status: 127 # CHECK: *** +# CHECK: FAIL: shtest-shell :: diff-error-0.txt +# CHECK: *** TEST 'shtest-shell :: diff-error-0.txt' FAILED *** +# CHECK: $ "diff" "diff-error-0.txt" "diff-error-0.txt" +# CHECK: # command stderr: +# CHECK: Unsupported: 'diff' cannot be part of a pipeline +# CHECK: error: command failed with exit status: 127 +# CHECK: *** + # CHECK: FAIL: shtest-shell :: diff-error-1.txt # CHECK: *** TEST 'shtest-shell :: diff-error-1.txt' FAILED *** # CHECK: $ "diff" "-B" "temp1.txt" "temp2.txt" # CHECK: # command stderr: # CHECK: Unsupported: 'diff': option -B not recognized -# CHECK: error: command failed with exit status: 1 +# 
CHECK: error: command failed with exit status: 127 # CHECK: *** # CHECK: FAIL: shtest-shell :: diff-error-2.txt # CHECK: *** TEST 'shtest-shell :: diff-error-2.txt' FAILED *** # CHECK: $ "diff" "temp.txt" # CHECK: # command stderr: -# CHECK: Error: missing or extra operand -# CHECK: error: command failed with exit status: 1 +# CHECK: Error: missing or extra operand +# CHECK: error: command failed with exit status: 127 # CHECK: *** # CHECK: FAIL: shtest-shell :: diff-error-3.txt @@ -74,43 +82,18 @@ # CHECK: *** TEST 'shtest-shell :: diff-error-5.txt' FAILED *** # CHECK: $ "diff" # CHECK: # command stderr: -# CHECK: Error: missing or extra operand -# CHECK: error: command failed with exit status: 1 +# CHECK: Error: missing or extra operand +# CHECK: error: command failed with exit status: 127 # CHECK: *** # CHECK: FAIL: shtest-shell :: diff-error-6.txt # CHECK: *** TEST 'shtest-shell :: diff-error-6.txt' FAILED *** # CHECK: $ "diff" # CHECK: # command stderr: -# CHECK: Error: missing or extra operand -# CHECK: error: command failed with exit status: 1 -# CHECK: *** - - -# CHECK: FAIL: shtest-shell :: diff-pipes.txt - -# CHECK: *** TEST 'shtest-shell :: diff-pipes.txt' FAILED *** - -# CHECK: $ "diff" "{{[^"]*}}.foo" "{{[^"]*}}.foo" -# CHECK-NOT: note -# CHECK-NOT: error -# CHECK: $ "FileCheck" -# CHECK-NOT: note -# CHECK-NOT: error - -# CHECK: $ "diff" "-u" "{{[^"]*}}.foo" "{{[^"]*}}.bar" -# CHECK: note: command had no output on stdout or stderr -# CHECK: error: command failed with exit status: 1 -# CHECK: $ "FileCheck" -# CHECK-NOT: note -# CHECK-NOT: error -# CHECK: $ "true" - -# CHECK: $ "false" - +# CHECK: Error: missing or extra operand +# CHECK: error: command failed with exit status: 127 # CHECK: *** - # CHECK: FAIL: shtest-shell :: diff-r-error-0.txt # CHECK: *** TEST 'shtest-shell :: diff-r-error-0.txt' FAILED *** # CHECK: $ "diff" "-r" From llvm-commits at lists.llvm.org Sat Oct 12 12:27:09 2019 From: llvm-commits at lists.llvm.org (David Li via Phabricator 
via llvm-commits) Date: Sat, 12 Oct 2019 19:27:09 +0000 (UTC) Subject: [PATCH] D52845: Update entry count for cold calls In-Reply-To: References: Message-ID: <1b5708c84237957dd5f8e1edc481f50b@localhost.localdomain> davidxl added a comment. Wenlei, this sounds like a good idea. Patches are welcome! Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D52845/new/ https://reviews.llvm.org/D52845 From llvm-commits at lists.llvm.org Sat Oct 12 13:23:16 2019 From: llvm-commits at lists.llvm.org (Hubert Tong via llvm-commits) Date: Sat, 12 Oct 2019 20:23:16 -0000 Subject: [LNT] r374685 - [LNT] NFC: Fix order of globals and locals on exec Message-ID: <20191012202317.06D9683FA7@lists.llvm.org> Author: hubert.reinterpretcast Date: Sat Oct 12 13:23:16 2019 New Revision: 374685 URL: http://llvm.org/viewvc/llvm-project?rev=374685&view=rev Log: [LNT] NFC: Fix order of globals and locals on exec Summary: Per https://docs.python.org/3/library/functions.html#exec, the globals parameter comes before the locals one. Since `globals` and `locals` refer to the same object for the call in question, we can remove `locals`, which will cause the globals parameter to be used for both the global and the local variables, thus keeping the same behavior. Reviewers: cmatthews, thopre, kristof.beyls Reviewed By: thopre Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D68903 Modified: lnt/trunk/lnt/tests/nt.py Modified: lnt/trunk/lnt/tests/nt.py URL: http://llvm.org/viewvc/llvm-project/lnt/trunk/lnt/tests/nt.py?rev=374685&r1=374684&r2=374685&view=diff ============================================================================== --- lnt/trunk/lnt/tests/nt.py (original) +++ lnt/trunk/lnt/tests/nt.py Sat Oct 12 13:23:16 2019 @@ -536,7 +536,7 @@ def execute_test_modules(test_log, test_ results = [] for name in test_modules: # First, load the test module file. 
- locals = globals = {} + globals = {} test_path = os.path.join(config.test_suite_root, 'LNTBased', name) # This is where shared code between test modules should go. sys.path.append(os.path.join(config.test_suite_root, 'LNTBased/lib')) @@ -544,7 +544,7 @@ def execute_test_modules(test_log, test_ module_path = os.path.join(test_path, 'TestModule') module_file = open(module_path) try: - exec(module_file, locals, globals) + exec(module_file, globals) except Exception: info = traceback.format_exc() fatal("unable to import test module: %r\n%s" % ( From llvm-commits at lists.llvm.org Sat Oct 12 13:30:39 2019 From: llvm-commits at lists.llvm.org (Hubert Tong via Phabricator via llvm-commits) Date: Sat, 12 Oct 2019 20:30:39 +0000 (UTC) Subject: [PATCH] D68903: [LNT] NFC: Fix order of globals and locals on exec In-Reply-To: References: Message-ID: hubert.reinterpretcast closed this revision. hubert.reinterpretcast added a comment. Committed: https://github.com/llvm/llvm-lnt/commit/f805c3e1eece29e502a88a095c710f32473ca3c5 CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68903/new/ https://reviews.llvm.org/D68903 From llvm-commits at lists.llvm.org Sat Oct 12 13:48:46 2019 From: llvm-commits at lists.llvm.org (=?utf-8?q?Martin_Storsj=C3=B6_via_Phabricator?= via llvm-commits) Date: Sat, 12 Oct 2019 20:48:46 +0000 (UTC) Subject: [PATCH] D67954: [LLDB] [Windows] Initial support for ARM64 register contexts In-Reply-To: References: Message-ID: <959b906457dc653b31037f36a73abde6@localhost.localdomain> mstorsjo updated this revision to Diff 224757. mstorsjo retitled this revision from "[LLDB] [Windows] Initial support for ARM64 debugging" to "[LLDB] [Windows] Initial support for ARM64 register contexts". mstorsjo edited the summary of this revision. mstorsjo added a reviewer: aleksandr.urakov. mstorsjo added a comment. Herald added subscribers: llvm-commits, delcypher. Herald added a project: LLVM. Added two lit/shell based tests that pass on both linux/arm64 and windows/arm64. 
I've managed to set up some sort of hacked up environment where I can run lit/shell based tests (even though the main python test driver runs in WSL, but executing native windows binaries for the tests). I also added a NativeRegisterContext for arm64, for lldb-server. For the RegisterInfoInterface for NativeRegisterContext, I reused RegisterInfoPOSIX_arm64 instead of creating a new copy similar to it, since I didn't really see anything OS specific in there. The tests pass both with and without use of lldb-server. However, when using lldb-server with NativeRegisterContext, while the register values are correct, I don't get a correct working backtrace with it. Without lldb-server, I get a perfect backtrace. (The tested binary uses SEH unwind tables, but DWARF debug info.) Any clues about what might be going wrong there? CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67954/new/ https://reviews.llvm.org/D67954 Files: lldb/source/Plugins/Process/Windows/Common/CMakeLists.txt lldb/source/Plugins/Process/Windows/Common/NativeRegisterContextWindows_arm64.cpp lldb/source/Plugins/Process/Windows/Common/NativeRegisterContextWindows_arm64.h lldb/source/Plugins/Process/Windows/Common/TargetThreadWindows.cpp lldb/source/Plugins/Process/Windows/Common/arm64/RegisterContextWindows_arm64.cpp lldb/source/Plugins/Process/Windows/Common/arm64/RegisterContextWindows_arm64.h lldb/test/Shell/Register/Inputs/aarch64-fp-read.cpp lldb/test/Shell/Register/Inputs/aarch64-gp-read.cpp lldb/test/Shell/Register/aarch64-fp-read.test lldb/test/Shell/Register/aarch64-gp-read.test llvm/utils/lit/lit/llvm/config.py -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D67954.224757.patch Type: text/x-patch Size: 46446 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Sat Oct 12 14:16:05 2019 From: llvm-commits at lists.llvm.org (Dave Green via Phabricator via llvm-commits) Date: Sat, 12 Oct 2019 21:16:05 +0000 (UTC) Subject: [PATCH] D68651: [InstCombine] Signed saturation patterns In-Reply-To: References: Message-ID: <362050a7063edad1d05575ab72547c4e@localhost.localdomain> dmgreen added a comment. In D68651#1703888 , @dmgreen wrote: > At some point when we get into enough of the details the code becomes the best documentation, but I'm guessing you will disagree with that... First up, sorry about this line. I was in a bit of a mood over emails not working and someone distracted me, so I pressed submit before I had properly constructed what I really wanted to say. Sorry if it came across as accusatory or passive aggressive. What I really wanted to say is that strong specifications, whilst they can be very useful, can also be stifling to innovation and bog down forward progress. We should try to keep a balance between writing things down and getting things done :) Secondly, I had to revert D68643 because it wasn't sign extending the values as it should. Simple cases were working, but more complex examples were not. This will mostly affect illegal types (like i4's or i8/i16 on arm), but will make the results look less impressive. I will put together another patch showing the correct results, which will contain more extends in places. I don't think it changes the decision of whether to use the intrinsics, just moves a few cases from looking like "an improvement", to just being "the same". And still better for many cases like vectorisation. In D68651#1706810 , @hsaito wrote: > I don't want to hijack this review. Let me have the rest of the discussion in the form of RFC on llvm-dev. Vector idiom discussions resulted in ~20 idioms.
It would be nice if we can come up with a basic guideline on how to think and how to make a case for the new canonical form. For example, TI said they are interested in saturating mul. If saturating add/sub have a canonical form, I don't see why saturating mul should not. I was going to say yeah, sounds great to me, let me look into adding it.. But I took another look at the MVE spec and we apparently don't have that instruction. There are some that do multiplying + doubling and saturate the high half. Apparently that's useful to someone! I don't believe it would be a detriment to have a saturating mul though, so sounds good to me if someone wants to add it. Do you happen to know if that discussion you spoke of was written down? It sounds very interesting. One thing I would like to try is to kick llvm's use of reductions up to 11, and it might provide useful insights on the best way to make that work. In D68651#1707017 , @nikic wrote: > I'm only wondering whether the trunc is the right place to start the match. Starting from the min/max we could match a larger set of patterns, in particular those where the result of the saturation is still extended to a larger type -- for example doing a 16-bit saturating add but continuing with a 32-bit result. Yeah I was wondering if that would be better. So long as it doesn't create strange types, I think it sounds like a good idea. I'll give it a try and let you know how it looks. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68651/new/ https://reviews.llvm.org/D68651 From llvm-commits at lists.llvm.org Sat Oct 12 14:43:03 2019 From: llvm-commits at lists.llvm.org (=?utf-8?q?Martin_Storsj=C3=B6_via_Phabricator?= via llvm-commits) Date: Sat, 12 Oct 2019 21:43:03 +0000 (UTC) Subject: [PATCH] D68917: [Demangle] Add a few more options to the microsoft demangler Message-ID: mstorsjo created this revision. mstorsjo added reviewers: thakis, rnk, zturner, ruiu. Herald added subscribers: erik.pilkington, hiraditya. 
Herald added a project: LLVM. This corresponds to commonly used options to UnDecorateSymbolName within llvm. Add them as hidden options in llvm-undname. MS undname.exe takes numeric flags, corresponding to the UNDNAME_* constants, but instead of hardcoding mappings for those numbers, just add textual options, as the use of them here is primarily intended for testing. This should allow replacing UnDecorateSymbolName from dbghelp with the llvm demangler mostly without changing the output. Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D68917 Files: llvm/include/llvm/Demangle/Demangle.h llvm/include/llvm/Demangle/MicrosoftDemangleNodes.h llvm/lib/Demangle/MicrosoftDemangle.cpp llvm/lib/Demangle/MicrosoftDemangleNodes.cpp llvm/test/Demangle/ms-options.test llvm/tools/llvm-undname/llvm-undname.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D68917.224758.patch Type: text/x-patch Size: 7929 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Sat Oct 12 15:10:21 2019 From: llvm-commits at lists.llvm.org (Nikita Popov via Phabricator via llvm-commits) Date: Sat, 12 Oct 2019 22:10:21 +0000 (UTC) Subject: [PATCH] D68672: [APInt] Rounding right-shifts In-Reply-To: References: Message-ID: nikic added a comment. > I'd like to try to extend ConstantRange::makeGuaranteedNoWrapRegion() > to deal with Instruction::Shl so i believe i need rounding right shifts. I don't think rounding shifts are strictly necessary for this purpose; the correct behavior should fall out when applying the normal lshr/ashr operations to -1 / signed_min / signed_max.
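One way to read the suggestion above, as a small brute-force sketch rather than anything taken from the patch: for a fixed shift amount, the values that can be shifted left without wrapping are bounded by plain right shifts of the extreme values. The 8-bit width and every function name below are assumptions made for illustration; none of this is LLVM API.

```python
# Illustrative 8-bit model; BITS and all helper names are hypothetical.
BITS = 8
UMAX = (1 << BITS) - 1          # all-ones, i.e. -1 reinterpreted as unsigned
SMIN = -(1 << (BITS - 1))       # signed_min
SMAX = (1 << (BITS - 1)) - 1    # signed_max


def nuw_shl_bounds(shamt):
    """Unsigned no-wrap bounds: logical shift right of all-ones."""
    return 0, UMAX >> shamt


def nsw_shl_bounds(shamt):
    """Signed no-wrap bounds: arithmetic shift right of signed_min/signed_max.

    Python's >> on a negative int rounds toward negative infinity, which
    matches an arithmetic shift right.
    """
    return SMIN >> shamt, SMAX >> shamt


def brute_force_nuw(shamt):
    """Smallest and largest unsigned x for which x << shamt does not wrap."""
    ok = [x for x in range(UMAX + 1) if (x << shamt) <= UMAX]
    return ok[0], ok[-1]


def brute_force_nsw(shamt):
    """Smallest and largest signed x for which x << shamt does not overflow."""
    ok = [x for x in range(SMIN, SMAX + 1) if SMIN <= (x << shamt) <= SMAX]
    return ok[0], ok[-1]
```

Comparing the two pairs of functions for every shift amount below the bit width is a quick way to check the claim at a small width; it says nothing about how the ConstantRange code itself should be structured.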
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68672/new/ https://reviews.llvm.org/D68672 From llvm-commits at lists.llvm.org Sat Oct 12 15:24:56 2019 From: llvm-commits at lists.llvm.org (Nico Weber via llvm-commits) Date: Sat, 12 Oct 2019 22:24:56 -0000 Subject: [llvm] r374686 - gn build: (manually) merge r374663 Message-ID: <20191012222456.E9BAB84E28@lists.llvm.org> Author: nico Date: Sat Oct 12 15:24:56 2019 New Revision: 374686 URL: http://llvm.org/viewvc/llvm-project?rev=374686&view=rev Log: gn build: (manually) merge r374663 Modified: llvm/trunk/utils/gn/secondary/clang/tools/clang-format/BUILD.gn Modified: llvm/trunk/utils/gn/secondary/clang/tools/clang-format/BUILD.gn URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/gn/secondary/clang/tools/clang-format/BUILD.gn?rev=374686&r1=374685&r2=374686&view=diff ============================================================================== --- llvm/trunk/utils/gn/secondary/clang/tools/clang-format/BUILD.gn (original) +++ llvm/trunk/utils/gn/secondary/clang/tools/clang-format/BUILD.gn Sat Oct 12 15:24:56 2019 @@ -3,6 +3,7 @@ executable("clang-format") { deps = [ "//clang/lib/Basic", "//clang/lib/Format", + "//clang/lib/Frontend", "//clang/lib/Rewrite", "//clang/lib/Tooling/Core", "//llvm/lib/Support", From llvm-commits at lists.llvm.org Sat Oct 12 15:46:32 2019 From: llvm-commits at lists.llvm.org (Thomas Preud'homme via Phabricator via llvm-commits) Date: Sat, 12 Oct 2019 22:46:32 +0000 (UTC) Subject: [PATCH] D68919: [LNT] Python 3 support: import LNT report as text Message-ID: thopre created this revision. thopre added reviewers: cmatthews, hubert.reinterpretcast, kristof.beyls. thopre added a parent revision: D68863: [LNT] Python 3 support: don't assume order of cmake args. thopre added a child revision: D68920: [LNT] Python 3 support: specify how to sort dict. As per LNT documentation, all fields in the JSON that is a LNT report file format are strings. 
Yet, the code responsible for adding a new run to the LNT database reads the Flask data property holding the corresponding LNT report file, which defaults to returning binary data. The code then fails when invoking lnt.util.ImportData.import_from_string because that function uses string methods. This commit changes the access to the data to use the get_data() getter with the as_text parameter set to True, thereby asking Flask to return the data as a string. https://reviews.llvm.org/D68919 Files: lnt/server/ui/api.py Index: lnt/server/ui/api.py =================================================================== --- lnt/server/ui/api.py +++ lnt/server/ui/api.py @@ -306,7 +306,7 @@ """Add a new run into the lnt database""" session = request.session db = request.get_db() - data = request.data + data = request.get_data(as_text=True) select_machine = request.values.get('select_machine', 'match') merge = request.values.get('merge', None) result = lnt.util.ImportData.import_from_string( -------------- next part -------------- A non-text attachment was scrubbed... Name: D68919.224760.patch Type: text/x-patch Size: 540 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Sat Oct 12 15:46:32 2019 From: llvm-commits at lists.llvm.org (Thomas Preud'homme via Phabricator via llvm-commits) Date: Sat, 12 Oct 2019 22:46:32 +0000 (UTC) Subject: [PATCH] D68920: [LNT] Python 3 support: specify how to sort dict Message-ID: thopre created this revision. thopre added reviewers: cmatthews, hubert.reinterpretcast, kristof.beyls. thopre added a parent revision: D68919: [LNT] Python 3 support: import LNT report as text. To be able to compare 2 lists of dictionaries, test server/ui/test_roundtrip.py calls sorted on those lists so that they are comparable. However, no comparison function is given, so it fails on Python 3 because dictionaries are not comparable. This commit adds a key parameter to sort on the id key of these dictionaries.
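The failure mode described above can be reproduced in a few lines. This is a minimal sketch on made-up data, not the LNT test itself: the dictionaries and the helper name are hypothetical, and the 'id' key is chosen to match the key function used in the diff that follows.

```python
# Hypothetical stand-ins for the 'tests' entries compared by the round-trip test.
tests = [
    {'id': 3, 'name': 'c'},
    {'id': 1, 'name': 'a'},
    {'id': 2, 'name': 'b'},
]


def sorts_without_key(items):
    """Return True if sorted() accepts items without a key function."""
    try:
        sorted(items)
    except TypeError:
        # Python 3: dicts define no ordering, so '<' between them fails.
        return False
    return True


# Supplying a key makes the ordering explicit and works on Python 2 and 3.
by_id = sorted(tests, key=lambda test: test['id'])
```

On Python 3, sorts_without_key(tests) reports that the bare sorted() call is rejected, while the keyed call orders the entries by their id.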
https://reviews.llvm.org/D68920 Files: tests/server/ui/test_roundtrip.py Index: tests/server/ui/test_roundtrip.py =================================================================== --- tests/server/ui/test_roundtrip.py +++ tests/server/ui/test_roundtrip.py @@ -94,8 +94,10 @@ for k in ['machine', 'run']: self.assertEqual(before_submit_run[k], after_submit_run[k]) # The order of the tests might have changed, so sort before they are compared. - before_submit_tests = sorted(before_submit_run['tests']) - after_submit_tests = sorted(after_submit_run['tests']) + before_submit_tests = sorted(before_submit_run['tests'], + key=lambda test: test['id']) + after_submit_tests = sorted(after_submit_run['tests'], + key=lambda test: test['id']) for i, _ in enumerate(before_submit_tests): before_submit_tests[i]['run_id'] = 1234 after_submit_tests[i]['run_id'] = 1234 -------------- next part -------------- A non-text attachment was scrubbed... Name: D68920.224761.patch Type: text/x-patch Size: 954 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Sat Oct 12 15:46:45 2019 From: llvm-commits at lists.llvm.org (Thomas Preud'homme via Phabricator via llvm-commits) Date: Sat, 12 Oct 2019 22:46:45 +0000 (UTC) Subject: [PATCH] D68921: [LNT] Python 3 support: fix server/ui/statsTester.py test discovery Message-ID: thopre created this revision. thopre added reviewers: cmatthews, hubert.reinterpretcast, kristof.beyls. Unit test server/ui/statsTester.py is invoked with an extra parameter, which confuses unittest test discovery. Indeed, when the defaultTest parameter of unittest.main() is left at its default value, the test to run is taken from the argv parameter if it is not empty. The argv parameter in turn defaults to sys.argv, and thus unittest takes the first command-line argument as the name of the test to execute. Under Python 2 this would raise an AttributeError, which would propagate all the way up to the test itself, which contains an except clause that removes the parameter and tries again.
Note that the exception is not documented. Under Python 3 the exception is caught by the unittest framework, which exits with an error. Since the parameter that confuses the test discovery is a temporary LNT instance directory which is not used by the test, this commit simply removes the LIT steps that create it and only calls unittest without parameters. https://reviews.llvm.org/D68921 Files: tests/server/ui/statsTester.py Index: tests/server/ui/statsTester.py =================================================================== --- tests/server/ui/statsTester.py +++ tests/server/ui/statsTester.py @@ -1,13 +1,4 @@ -# -# create temporary instance -# Cleanup temporary directory in case one remained from a previous run - also -# see PR9904. -# RUN: rm -rf %t.instance -# RUN: python %{shared_inputs}/create_temp_instance.py \ -# RUN: %s %{shared_inputs}/SmallInstance %t.instance \ -# RUN: %S/Inputs/V4Pages_extra_records.sql -# -# RUN: python %s %t.instance +# RUN: python %s import unittest @@ -54,13 +45,4 @@ if __name__ == '__main__': - try: - unittest.main() - except AttributeError: - # Command line parameters are treated as test cases, when \ - # running with lit rather than python directly. - import sys - if len(sys.argv) != 2: - sys.exit("Something went horribly wrong. You need parameters.") - del sys.argv[1:] - unittest.main() + unittest.main() -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68921.224762.patch Type: text/x-patch Size: 1021 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Sat Oct 12 15:53:51 2019 From: llvm-commits at lists.llvm.org (Thomas Preud'homme via llvm-commits) Date: Sat, 12 Oct 2019 22:53:51 -0000 Subject: [LNT] r374687 - [LNT] Python 3 support: adapt to removal of execfile Message-ID: <20191012225351.95EE083EDC@lists.llvm.org> Author: thopre Date: Sat Oct 12 15:53:51 2019 New Revision: 374687 URL: http://llvm.org/viewvc/llvm-project?rev=374687&view=rev Log: [LNT] Python 3 support: adapt to removal of execfile Replace calls to execfile by calling exec on the result of calling compile on the result of calling open().read(). Reviewers: cmatthews, hubert.reinterpretcast, kristof.beyls Reviewed By: hubert.reinterpretcast Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D67822 Modified: lnt/trunk/lnt/server/db/migrate.py lnt/trunk/lnt/server/db/rules_manager.py Modified: lnt/trunk/lnt/server/db/migrate.py URL: http://llvm.org/viewvc/llvm-project/lnt/trunk/lnt/server/db/migrate.py?rev=374687&r1=374686&r2=374687&view=diff ============================================================================== --- lnt/trunk/lnt/server/db/migrate.py (original) +++ lnt/trunk/lnt/server/db/migrate.py Sat Oct 12 15:53:51 2019 @@ -162,7 +162,8 @@ def update_schema(engine, versions, avai upgrade_script = schema_migrations[db_version] globals = {} - execfile(upgrade_script, globals) + exec(compile(open(upgrade_script).read(), upgrade_script, 'exec'), + globals) upgrade_method = globals['upgrade'] # Execute the upgrade. 
Modified: lnt/trunk/lnt/server/db/rules_manager.py URL: http://llvm.org/viewvc/llvm-project/lnt/trunk/lnt/server/db/rules_manager.py?rev=374687&r1=374686&r2=374687&view=diff ============================================================================== --- lnt/trunk/lnt/server/db/rules_manager.py (original) +++ lnt/trunk/lnt/server/db/rules_manager.py Sat Oct 12 15:53:51 2019 @@ -66,7 +66,7 @@ def register_hooks(): global HOOKS_LOADED for name, path in load_rules().items(): globals = {} - execfile(path, globals) + exec(compile(open(path).read(), path, 'exec'), globals) DESCRIPTIONS[name] = globals['__doc__'] for hook_name in HOOKS.keys(): if hook_name in globals: From llvm-commits at lists.llvm.org Sat Oct 12 15:55:38 2019 From: llvm-commits at lists.llvm.org (Thomas Preud'homme via Phabricator via llvm-commits) Date: Sat, 12 Oct 2019 22:55:38 +0000 (UTC) Subject: [PATCH] D68922: [LNT] Python 3 support: read machine deletion page as text Message-ID: thopre created this revision. thopre added reviewers: cmatthews, hubert.reinterpretcast, kristof.beyls. thopre added a parent revision: D68921: [LNT] Python 3 support: fix server/ui/statsTester.py test discovery. Test server/ui/test_api_modify.py deletes some machines via the REST interface and checks the content of the page displayed upon that action. Since that content is text, it is compared against a string. However, by default Flask returns the content as binary data, which leads to an invalid comparison of str against bytes. This commit fixes it by requesting the content as a string from Flask using the as_text parameter of the get_data() method.
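The str-versus-bytes mismatch described above can be shown without Flask at all. In this sketch a plain bytes literal stands in for the undecoded response body, and decoding it as UTF-8 is an assumption about the encoding; the variable names are illustrative only.

```python
# Stand-in for an undecoded response body (what a default data read returns).
raw_body = b'Deleted machine machine2:2\n'

# The string the test expects to see.
expected = 'Deleted machine machine2:2\n'

# On Python 3 a bytes object never compares equal to a str, even when the
# characters match, so an assertEqual on these two values fails.
bytes_equal_str = (raw_body == expected)

# Decoding first yields a str, which is what requesting text amounts to.
text_body = raw_body.decode('utf-8')
```

On Python 2 the same comparison succeeds because str is a byte string there, which is why the problem only surfaces once the test runs under Python 3.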
https://reviews.llvm.org/D68922 Files: tests/server/ui/test_api_modify.py Index: tests/server/ui/test_api_modify.py =================================================================== --- tests/server/ui/test_api_modify.py +++ tests/server/ui/test_api_modify.py @@ -148,7 +148,7 @@ resp = client.delete('api/db_default/v4/nts/machines/2', headers={'AuthToken': 'test_token'}) self.assertEqual(resp.status_code, 200) - self.assertEqual(resp.get_data(), + self.assertEqual(resp.get_data(as_text=True), '''Deleting runs 3 5 6 7 8 9 (6/6) Deleted machine machine2:2 ''') -------------- next part -------------- A non-text attachment was scrubbed... Name: D68922.224763.patch Type: text/x-patch Size: 583 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Sat Oct 12 15:58:34 2019 From: llvm-commits at lists.llvm.org (Nico Weber via llvm-commits) Date: Sat, 12 Oct 2019 22:58:34 -0000 Subject: [llvm] r374688 - Revert r374663 "[clang-format] Proposal for clang-format to give compiler style warnings" Message-ID: <20191012225834.BF921816C4@lists.llvm.org> Author: nico Date: Sat Oct 12 15:58:34 2019 New Revision: 374688 URL: http://llvm.org/viewvc/llvm-project?rev=374688&view=rev Log: Revert r374663 "[clang-format] Proposal for clang-format to give compiler style warnings" The test fails on macOS and looks a bit wrong, see comments on the review. Also revert follow-up r374686. 
Modified: llvm/trunk/utils/gn/secondary/clang/tools/clang-format/BUILD.gn Modified: llvm/trunk/utils/gn/secondary/clang/tools/clang-format/BUILD.gn URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/gn/secondary/clang/tools/clang-format/BUILD.gn?rev=374688&r1=374687&r2=374688&view=diff ============================================================================== --- llvm/trunk/utils/gn/secondary/clang/tools/clang-format/BUILD.gn (original) +++ llvm/trunk/utils/gn/secondary/clang/tools/clang-format/BUILD.gn Sat Oct 12 15:58:34 2019 @@ -3,7 +3,6 @@ executable("clang-format") { deps = [ "//clang/lib/Basic", "//clang/lib/Format", - "//clang/lib/Frontend", "//clang/lib/Rewrite", "//clang/lib/Tooling/Core", "//llvm/lib/Support", From llvm-commits at lists.llvm.org Sat Oct 12 18:58:19 2019 From: llvm-commits at lists.llvm.org (Aditya Kumar via Phabricator via llvm-commits) Date: Sun, 13 Oct 2019 01:58:19 +0000 (UTC) Subject: [PATCH] D68924: CodeExtractor: NFC: Use Range based loop Message-ID: hiraditya created this revision. hiraditya added reviewers: vsk, tejohnson, fhahn. Herald added a project: LLVM. Herald added a subscriber: llvm-commits. Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D68924 Files: llvm/lib/Transforms/Utils/CodeExtractor.cpp Index: llvm/lib/Transforms/Utils/CodeExtractor.cpp =================================================================== --- llvm/lib/Transforms/Utils/CodeExtractor.cpp +++ llvm/lib/Transforms/Utils/CodeExtractor.cpp @@ -961,12 +961,12 @@ // within the new function. This must be done before we lose track of which // blocks were originally in the code region. 
std::vector Users(header->user_begin(), header->user_end()); - for (unsigned i = 0, e = Users.size(); i != e; ++i) + for (auto &U : Users) // The BasicBlock which contains the branch is not in the region // modify the branch target to a new block - if (Instruction *I = dyn_cast(Users[i])) - if (I->isTerminator() && !Blocks.count(I->getParent()) && - I->getParent()->getParent() == oldFunction) + if (Instruction *I = dyn_cast(U)) + if (I->isTerminator() && I->getFunction() == oldFunction && + !Blocks.count(I->getParent())) I->replaceUsesOfWith(header, newHeader); return newFunction; -------------- next part -------------- A non-text attachment was scrubbed... Name: D68924.224766.patch Type: text/x-patch Size: 1050 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Sat Oct 12 19:06:33 2019 From: llvm-commits at lists.llvm.org (Galina Kistanova via llvm-commits) Date: Sun, 13 Oct 2019 02:06:33 -0000 Subject: [zorg] r374689 - Do not set default cmake options in CmakeCommand. Message-ID: <20191013020633.CA6D285B84@lists.llvm.org> Author: gkistanova Date: Sat Oct 12 19:06:33 2019 New Revision: 374689 URL: http://llvm.org/viewvc/llvm-project?rev=374689&view=rev Log: Do not set default cmake options in CmakeCommand. Modified: zorg/trunk/zorg/buildbot/commands/CmakeCommand.py Modified: zorg/trunk/zorg/buildbot/commands/CmakeCommand.py URL: http://llvm.org/viewvc/llvm-project/zorg/trunk/zorg/buildbot/commands/CmakeCommand.py?rev=374689&r1=374688&r2=374689&view=diff ============================================================================== --- zorg/trunk/zorg/buildbot/commands/CmakeCommand.py (original) +++ zorg/trunk/zorg/buildbot/commands/CmakeCommand.py Sat Oct 12 19:06:33 2019 @@ -135,13 +135,6 @@ class CmakeCommand(WarningCountingShellC command += ["cmake"] - # Set some default options. 
- CmakeCommand.applyDefaultOptions(self.options, [ - ('-DCMAKE_BUILD_TYPE=', 'Release'), - ('-DLLVM_ENABLE_WERROR=', 'ON'), - ('-DLLVM_OPTIMIZED_TABLEGEN=', 'ON'), - ]) - if self.options: command += self.options From llvm-commits at lists.llvm.org Sat Oct 12 19:11:28 2019 From: llvm-commits at lists.llvm.org (Galina Kistanova via llvm-commits) Date: Sun, 13 Oct 2019 02:11:28 -0000 Subject: [zorg] r374690 - Removed some default cmake options which doesn't seem worth being default from UnifiedTreeBuilder.addCmakeSteps. Message-ID: <20191013021128.2294985DB6@lists.llvm.org> Author: gkistanova Date: Sat Oct 12 19:11:27 2019 New Revision: 374690 URL: http://llvm.org/viewvc/llvm-project?rev=374690&view=rev Log: Removed some default cmake options which doesn't seem worth being default from UnifiedTreeBuilder.addCmakeSteps. Modified: zorg/trunk/zorg/buildbot/builders/UnifiedTreeBuilder.py Modified: zorg/trunk/zorg/buildbot/builders/UnifiedTreeBuilder.py URL: http://llvm.org/viewvc/llvm-project/zorg/trunk/zorg/buildbot/builders/UnifiedTreeBuilder.py?rev=374690&r1=374689&r2=374690&view=diff ============================================================================== --- zorg/trunk/zorg/buildbot/builders/UnifiedTreeBuilder.py (original) +++ zorg/trunk/zorg/buildbot/builders/UnifiedTreeBuilder.py Sat Oct 12 19:11:27 2019 @@ -100,10 +100,7 @@ def addCmakeSteps( # Set proper defaults. CmakeCommand.applyDefaultOptions(cmake_args, [ ('-DCMAKE_BUILD_TYPE=', 'Release'), - ('-DCLANG_BUILD_EXAMPLES=', 'OFF'), - ('-DLLVM_BUILD_TESTS=', 'ON'), ('-DLLVM_ENABLE_ASSERTIONS=', 'ON'), - ('-DLLVM_OPTIMIZED_TABLEGEN=', 'ON'), ('-DLLVM_LIT_ARGS=', '"-v"'), ]) From llvm-commits at lists.llvm.org Sat Oct 12 19:20:23 2019 From: llvm-commits at lists.llvm.org (Galina Kistanova via llvm-commits) Date: Sun, 13 Oct 2019 02:20:23 -0000 Subject: [zorg] r374691 - Changed clang-x86_64-debian-fast builder to use UnifiedTreeBuilder. 
Message-ID: <20191013022023.EB446883B0@lists.llvm.org> Author: gkistanova Date: Sat Oct 12 19:20:23 2019 New Revision: 374691 URL: http://llvm.org/viewvc/llvm-project?rev=374691&view=rev Log: Changed clang-x86_64-debian-fast builder to use UnifiedTreeBuilder. Modified: zorg/trunk/buildbot/osuosl/master/config/builders.py Modified: zorg/trunk/buildbot/osuosl/master/config/builders.py URL: http://llvm.org/viewvc/llvm-project/zorg/trunk/buildbot/osuosl/master/config/builders.py?rev=374691&r1=374690&r2=374691&view=diff ============================================================================== --- zorg/trunk/buildbot/osuosl/master/config/builders.py (original) +++ zorg/trunk/buildbot/osuosl/master/config/builders.py Sat Oct 12 19:20:23 2019 @@ -94,17 +94,22 @@ def _get_clang_fast_builders(): {'name': "clang-x86_64-debian-fast", 'slavenames':["gribozavr4"], 'builddir':"clang-x86_64-debian-fast", - 'factory': ClangAndLLDBuilder.getClangAndLLDBuildFactory( - withLLD=False, - extraCmakeOptions=[ - "-DCOMPILER_RT_BUILD_BUILTINS:BOOL=OFF", - "-DCOMPILER_RT_BUILD_SANITIZERS:BOOL=OFF", - "-DCOMPILER_RT_BUILD_XRAY:BOOL=OFF", - "-DCOMPILER_RT_CAN_EXECUTE_TESTS:BOOL=OFF", - "-DCOMPILER_RT_INCLUDE_TESTS:BOOL=OFF"], - prefixCommand=None, # This is a designated builder, so no need to be nice. 
- env={'PATH':'/home/llvmbb/bin/clang-latest/bin:/home/llvmbb/bin:/usr/local/bin:/usr/local/bin:/usr/bin:/bin', - 'CC': 'ccache clang', 'CXX': 'ccache clang++', 'CCACHE_CPP2': 'yes'})}, + 'factory': UnifiedTreeBuilder.getCmakeWithNinjaBuildFactory( + llvm_srcdir="llvm.src", + obj_dir="llvm.obj", + depends_on_projects=['llvm','clang','clang-tools-extra','compiler-rt'], + extra_configure_args=[ + "-DCOMPILER_RT_BUILD_BUILTINS:BOOL=OFF", + "-DCOMPILER_RT_BUILD_SANITIZERS:BOOL=OFF", + "-DCOMPILER_RT_BUILD_XRAY:BOOL=OFF", + "-DCOMPILER_RT_INCLUDE_TESTS:BOOL=OFF", + "-DCMAKE_C_FLAGS=-Wdocumentation -Wno-documentation-deprecated-sync", + "-DCMAKE_CXX_FLAGS=-std=c++11 -Wdocumentation -Wno-documentation-deprecated-sync", + ], + env={ + 'PATH':'/home/llvmbb/bin/clang-latest/bin:/home/llvmbb/bin:/usr/local/bin:/usr/local/bin:/usr/bin:/bin', + 'CC': 'ccache clang', 'CXX': 'ccache clang++', 'CCACHE_CPP2': 'yes', + })}, {'name': "llvm-clang-lld-x86_64-scei-ps4-ubuntu-fast", 'mergeRequests': False, From llvm-commits at lists.llvm.org Sat Oct 12 19:21:23 2019 From: llvm-commits at lists.llvm.org (Johannes Doerfert via llvm-commits) Date: Sun, 13 Oct 2019 02:21:23 -0000 Subject: [llvm] r374692 - [SROA] Reuse existing lifetime markers if possible Message-ID: <20191013022123.54E92883B0@lists.llvm.org> Author: jdoerfert Date: Sat Oct 12 19:21:23 2019 New Revision: 374692 URL: http://llvm.org/viewvc/llvm-project?rev=374692&view=rev Log: [SROA] Reuse existing lifetime markers if possible Summary: If the underlying alloca did not change, we do not necessarily need new lifetime markers. This patch adds a check and reuses the old ones if possible. 
Reviewers: reames, ssarda, t.p.northover, hfinkel Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68900 Added: llvm/trunk/test/Transforms/SROA/reuse_lifetime_markers.ll Modified: llvm/trunk/lib/Transforms/Scalar/SROA.cpp Modified: llvm/trunk/lib/Transforms/Scalar/SROA.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/SROA.cpp?rev=374692&r1=374691&r2=374692&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Scalar/SROA.cpp (original) +++ llvm/trunk/lib/Transforms/Scalar/SROA.cpp Sat Oct 12 19:21:23 2019 @@ -3072,6 +3072,13 @@ private: LLVM_DEBUG(dbgs() << " original: " << II << "\n"); assert(II.getArgOperand(1) == OldPtr); + bool EntireRange = (NewBeginOffset == NewAllocaBeginOffset && + NewEndOffset == NewAllocaEndOffset); + + // If the new lifetime marker would not differ from the old, just keep it. + if (&OldAI == &NewAI && EntireRange) + return true; + // Record this instruction for deletion. Pass.DeadInsts.insert(&II); @@ -3082,8 +3089,7 @@ private: // promoted, but PromoteMemToReg doesn't handle that case.) // FIXME: Check whether the alloca is promotable before dropping the // lifetime intrinsics? 
- if (NewBeginOffset != NewAllocaBeginOffset || - NewEndOffset != NewAllocaEndOffset) + if (!EntireRange) return true; ConstantInt *Size = Added: llvm/trunk/test/Transforms/SROA/reuse_lifetime_markers.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/SROA/reuse_lifetime_markers.ll?rev=374692&view=auto ============================================================================== --- llvm/trunk/test/Transforms/SROA/reuse_lifetime_markers.ll (added) +++ llvm/trunk/test/Transforms/SROA/reuse_lifetime_markers.ll Sat Oct 12 19:21:23 2019 @@ -0,0 +1,69 @@ +; NOTE: Assertions have been autogenerated by utils/update_test_checks.py +; RUN: opt < %s -sroa -S | FileCheck %s +; +; Make sure we reuse the lifetime marker and do not create a new one that looks the same but without the call site attributes. +target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128" + +; Function Attrs: argmemonly nounwind willreturn +declare void @llvm.lifetime.start.p0i8(i64 immarg, i8* nocapture) #0 + +; Function Attrs: argmemonly nounwind willreturn +declare void @llvm.memcpy.p0i8.p0i8.i64(i8* noalias nocapture writeonly, i8* noalias nocapture readonly, i64, i1 immarg) #0 + +define hidden void @old_markers() { +; +; CHECK-LABEL: define {{[^@]+}}@old_markers( +; CHECK-NEXT: entry: +; CHECK-NEXT: [[VV_SROA_4:%.*]] = alloca [3 x i32*] +; CHECK-NEXT: [[VV_SROA_4_0__SROA_CAST61:%.*]] = bitcast [3 x i32*]* [[VV_SROA_4]] to i8* +; CHECK-NEXT: [[VV_SROA_4_0__SROA_CAST94:%.*]] = bitcast [3 x i32*]* [[VV_SROA_4]] to i8* +; CHECK-NEXT: call void @llvm.lifetime.start.p0i8(i64 24, i8* nonnull align 8 dereferenceable(24) [[VV_SROA_4_0__SROA_CAST94]]) +; CHECK-NEXT: br i1 undef, label [[DO_BODY:%.*]], label [[IF_END31:%.*]] +; CHECK: do.body: +; CHECK-NEXT: ret void +; CHECK: if.end31: +; CHECK-NEXT: call void @llvm.memcpy.p0i8.p0i8.i64(i8* noalias nonnull align 8 dereferenceable(24) [[VV_SROA_4_0__SROA_CAST61]], i8* noalias nonnull align 8 undef, i64 
24, i1 false) +; CHECK-NEXT: unreachable +; +entry: + %vv.sroa.4 = alloca [3 x i32*] + %vv.sroa.4.0..sroa_cast61 = bitcast [3 x i32*]* %vv.sroa.4 to i8* + %vv.sroa.4.0..sroa_cast94 = bitcast [3 x i32*]* %vv.sroa.4 to i8* + call void @llvm.lifetime.start.p0i8(i64 24, i8* nonnull align 8 dereferenceable(24) %vv.sroa.4.0..sroa_cast94) + br i1 undef, label %do.body, label %if.end31 + +do.body: ; preds = %entry + ret void + +if.end31: ; preds = %entry + call void @llvm.memcpy.p0i8.p0i8.i64(i8* noalias nonnull align 8 dereferenceable(24) %vv.sroa.4.0..sroa_cast61, i8* noalias nonnull align 8 undef, i64 24, i1 false) + unreachable +} + +define hidden void @new_markers() { +; +; CHECK-LABEL: define {{[^@]+}}@new_markers( +; CHECK-NEXT: entry: +; CHECK-NEXT: [[VV_SROA_4:%.*]] = alloca [3 x i32*] +; CHECK-NEXT: [[VV_SROA_4_0__SROA_CAST61:%.*]] = bitcast [3 x i32*]* [[VV_SROA_4]] to i8* +; CHECK-NEXT: br i1 undef, label [[DO_BODY:%.*]], label [[IF_END31:%.*]] +; CHECK: do.body: +; CHECK-NEXT: ret void +; CHECK: if.end31: +; CHECK-NEXT: call void @llvm.memcpy.p0i8.p0i8.i64(i8* noalias nonnull align 8 dereferenceable(24) [[VV_SROA_4_0__SROA_CAST61]], i8* noalias nonnull align 8 undef, i64 24, i1 false) +; CHECK-NEXT: unreachable +; +entry: + %vv.sroa.4 = alloca [3 x i32*] + %vv.sroa.4.0..sroa_cast61 = bitcast [3 x i32*]* %vv.sroa.4 to i8* + %vv.sroa.4.0..sroa_cast94 = bitcast [3 x i32*]* %vv.sroa.4 to i8* + call void @llvm.lifetime.start.p0i8(i64 8, i8* nonnull align 8 dereferenceable(24) %vv.sroa.4.0..sroa_cast94) + br i1 undef, label %do.body, label %if.end31 + +do.body: ; preds = %entry + ret void + +if.end31: ; preds = %entry + call void @llvm.memcpy.p0i8.p0i8.i64(i8* noalias nonnull align 8 dereferenceable(24) %vv.sroa.4.0..sroa_cast61, i8* noalias nonnull align 8 undef, i64 24, i1 false) + unreachable +} From llvm-commits at lists.llvm.org Sat Oct 12 19:23:22 2019 From: llvm-commits at lists.llvm.org (Galina Kistanova via llvm-commits) Date: Sun, 13 Oct 2019 02:23:22 
-0000 Subject: [zorg] r374693 - NFC. Few cosmetic changes. Message-ID: <20191013022322.CB99B83810@lists.llvm.org> Author: gkistanova Date: Sat Oct 12 19:23:22 2019 New Revision: 374693 URL: http://llvm.org/viewvc/llvm-project?rev=374693&view=rev Log: NFC. Few cosmetic changes. Modified: zorg/trunk/buildbot/osuosl/master/config/builders.py Modified: zorg/trunk/buildbot/osuosl/master/config/builders.py URL: http://llvm.org/viewvc/llvm-project/zorg/trunk/buildbot/osuosl/master/config/builders.py?rev=374693&r1=374692&r2=374693&view=diff ============================================================================== --- zorg/trunk/buildbot/osuosl/master/config/builders.py (original) +++ zorg/trunk/buildbot/osuosl/master/config/builders.py Sat Oct 12 19:23:22 2019 @@ -130,9 +130,9 @@ def _get_clang_fast_builders(): "-DCLANG_BUILD_EXAMPLES=ON", "-DLLVM_TARGETS_TO_BUILD=X86", "-DLLVM_DEFAULT_TARGET_TRIPLE=x86_64-scei-ps4", - "-DCMAKE_C_FLAGS='-Wdocumentation -Wno-documentation-deprecated-sync'", - "-DCMAKE_CXX_FLAGS='-std=c++11 -Wdocumentation -Wno-documentation-deprecated-sync'", - "-DLLVM_LIT_ARGS='-v -j36'"], + "-DCMAKE_C_FLAGS=-Wdocumentation -Wno-documentation-deprecated-sync", + "-DCMAKE_CXX_FLAGS=-std=c++11 -Wdocumentation -Wno-documentation-deprecated-sync", + "-DLLVM_LIT_ARGS=\"-v -j36\""], env={'PATH':'/opt/llvm_37/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin'})}, {'name': "llvm-clang-lld-x86_64-scei-ps4-windows10pro-fast", @@ -149,7 +149,7 @@ def _get_clang_fast_builders(): "-DCLANG_BUILD_EXAMPLES=ON", "-DLLVM_TARGETS_TO_BUILD=X86", "-DLLVM_DEFAULT_TARGET_TRIPLE=x86_64-scei-ps4", - "-DLLVM_LIT_ARGS='-v -j80'"])}, + "-DLLVM_LIT_ARGS="-v -j80\""])}, {'name': "llvm-clang-x86_64-expensive-checks-win", 'slavenames':["ps4-buildslave2"], From llvm-commits at lists.llvm.org Sat Oct 12 19:24:02 2019 From: llvm-commits at lists.llvm.org (Johannes Doerfert via llvm-commits) Date: Sun, 13 Oct 2019 02:24:02 -0000 Subject: [llvm] r374694 - [Attributor][FIX] 
Avoid modifying naked/optnone functions Message-ID: <20191013022402.9AC4A8B04A@lists.llvm.org> Author: jdoerfert Date: Sat Oct 12 19:24:02 2019 New Revision: 374694 URL: http://llvm.org/viewvc/llvm-project?rev=374694&view=rev Log: [Attributor][FIX] Avoid modifying naked/optnone functions The check for naked/optnone was insufficient for different reasons. We now check before we initialize an abstract attribute and we do it for all abstract attributes. Modified: llvm/trunk/include/llvm/Transforms/IPO/Attributor.h llvm/trunk/lib/Transforms/IPO/Attributor.cpp llvm/trunk/test/Transforms/FunctionAttrs/nonnull.ll Modified: llvm/trunk/include/llvm/Transforms/IPO/Attributor.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Transforms/IPO/Attributor.h?rev=374694&r1=374693&r2=374694&view=diff ============================================================================== --- llvm/trunk/include/llvm/Transforms/IPO/Attributor.h (original) +++ llvm/trunk/include/llvm/Transforms/IPO/Attributor.h Sat Oct 12 19:24:02 2019 @@ -913,15 +913,23 @@ private: // Use the static create method. auto &AA = AAType::createForPosition(IRP, *this); registerAA(AA); - AA.initialize(*this); + + // For now we ignore naked and optnone functions. + bool Invalidate = Whitelist && !Whitelist->count(&AAType::ID); + if (const Function *Fn = IRP.getAnchorScope()) + Invalidate |= Fn->hasFnAttribute(Attribute::Naked) || + Fn->hasFnAttribute(Attribute::OptimizeNone); // Bootstrap the new attribute with an initial update to propagate // information, e.g., function -> call site. If it is not on a given // whitelist we will not perform updates at all. 
- if (Whitelist && !Whitelist->count(&AAType::ID)) + if (Invalidate) { AA.getState().indicatePessimisticFixpoint(); - else - AA.update(*this); + return AA; + } + + AA.initialize(*this); + AA.update(*this); if (TrackDependence && AA.getState().isValidState()) QueryMap[&AA].insert(const_cast(QueryingAA)); Modified: llvm/trunk/lib/Transforms/IPO/Attributor.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/IPO/Attributor.cpp?rev=374694&r1=374693&r2=374694&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/IPO/Attributor.cpp (original) +++ llvm/trunk/lib/Transforms/IPO/Attributor.cpp Sat Oct 12 19:24:02 2019 @@ -4847,11 +4847,6 @@ static bool runAttributorOnModule(Module else NumFnWithoutExactDefinition++; - // For now we ignore naked and optnone functions. - if (F.hasFnAttribute(Attribute::Naked) || - F.hasFnAttribute(Attribute::OptimizeNone)) - continue; - // We look at internal functions only on-demand but if any use is not a // direct call, we have to do it eagerly. 
if (F.hasLocalLinkage()) { Modified: llvm/trunk/test/Transforms/FunctionAttrs/nonnull.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/FunctionAttrs/nonnull.ll?rev=374694&r1=374693&r2=374694&view=diff ============================================================================== --- llvm/trunk/test/Transforms/FunctionAttrs/nonnull.ll (original) +++ llvm/trunk/test/Transforms/FunctionAttrs/nonnull.ll Sat Oct 12 19:24:02 2019 @@ -545,6 +545,30 @@ define weak_odr void @weak_caller(i32* n ret void } +; Expect nonnull +; ATTRIBUTOR: define internal void @control(i32* nocapture nonnull readnone align 16 dereferenceable(8) %a) +define internal void @control(i32* dereferenceable(4) %a) { + call void @use_i32_ptr(i32* %a) + ret void +} +; Avoid nonnull as we do not touch naked functions +; ATTRIBUTOR: define internal void @naked(i32* dereferenceable(4) %a) +define internal void @naked(i32* dereferenceable(4) %a) naked { + call void @use_i32_ptr(i32* %a) + ret void +} +; Avoid nonnull as we do not touch optnone +; ATTRIBUTOR: define internal void @optnone(i32* dereferenceable(4) %a) +define internal void @optnone(i32* dereferenceable(4) %a) optnone noinline { + call void @use_i32_ptr(i32* %a) + ret void +} +define void @make_live(i32* nonnull dereferenceable(8) %a) { + call void @naked(i32* nonnull dereferenceable(8) align 16 %a) + call void @control(i32* nonnull dereferenceable(8) align 16 %a) + call void @optnone(i32* nonnull dereferenceable(8) align 16 %a) + ret void +} attributes #0 = { "null-pointer-is-valid"="true" } attributes #1 = { nounwind willreturn} From llvm-commits at lists.llvm.org Sat Oct 12 19:25:42 2019 From: llvm-commits at lists.llvm.org (Johannes Doerfert via Phabricator via llvm-commits) Date: Sun, 13 Oct 2019 02:25:42 +0000 (UTC) Subject: [PATCH] D68900: [SROA] Reuse existing lifetime markers if possible In-Reply-To: References: Message-ID: <7caa1c72191800f05737340f3ab905c4@localhost.localdomain> This revision was automatically 
updated to reflect the committed changes.
Closed by commit rG92694eba933e: [SROA] Reuse existing lifetime markers if possible (authored by jdoerfert).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68900/new/

https://reviews.llvm.org/D68900

Files:
  llvm/lib/Transforms/Scalar/SROA.cpp
  llvm/test/Transforms/SROA/reuse_lifetime_markers.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68900.224768.patch
Type: text/x-patch
Size: 4740 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Sat Oct 12 19:30:19 2019
From: llvm-commits at lists.llvm.org (Galina Kistanova via llvm-commits)
Date: Sun, 13 Oct 2019 02:30:19 -0000
Subject: [zorg] r374695 - Incremental.
Message-ID: <20191013023019.B089D83675@lists.llvm.org>

Author: gkistanova
Date: Sat Oct 12 19:30:19 2019
New Revision: 374695

URL: http://llvm.org/viewvc/llvm-project?rev=374695&view=rev
Log:
Incremental.

Modified:
    zorg/trunk/buildbot/osuosl/master/config/builders.py

Modified: zorg/trunk/buildbot/osuosl/master/config/builders.py
URL: http://llvm.org/viewvc/llvm-project/zorg/trunk/buildbot/osuosl/master/config/builders.py?rev=374695&r1=374694&r2=374695&view=diff
==============================================================================
--- zorg/trunk/buildbot/osuosl/master/config/builders.py (original)
+++ zorg/trunk/buildbot/osuosl/master/config/builders.py Sat Oct 12 19:30:19 2019
@@ -149,7 +149,7 @@ def _get_clang_fast_builders():
                         "-DCLANG_BUILD_EXAMPLES=ON",
                         "-DLLVM_TARGETS_TO_BUILD=X86",
                         "-DLLVM_DEFAULT_TARGET_TRIPLE=x86_64-scei-ps4",
-                        "-DLLVM_LIT_ARGS="-v -j80\""])},
+                        "-DLLVM_LIT_ARGS=\"-v -j80\""])},

         {'name': "llvm-clang-x86_64-expensive-checks-win",
          'slavenames':["ps4-buildslave2"],

From llvm-commits at lists.llvm.org Sat Oct 12 19:42:09 2019
From: llvm-commits at lists.llvm.org (Johannes Doerfert via llvm-commits)
Date: Sun, 13 Oct 2019 02:42:09 -0000
Subject: [llvm] r374696 - [Attributor][FIX] Add missing
 function declaration in test case
Message-ID: <20191013024209.B4DB285B8A@lists.llvm.org>

Author: jdoerfert
Date: Sat Oct 12 19:42:09 2019
New Revision: 374696

URL: http://llvm.org/viewvc/llvm-project?rev=374696&view=rev
Log:
[Attributor][FIX] Add missing function declaration in test case

Modified:
    llvm/trunk/test/Transforms/FunctionAttrs/nonnull.ll

Modified: llvm/trunk/test/Transforms/FunctionAttrs/nonnull.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/FunctionAttrs/nonnull.ll?rev=374696&r1=374695&r2=374696&view=diff
==============================================================================
--- llvm/trunk/test/Transforms/FunctionAttrs/nonnull.ll (original)
+++ llvm/trunk/test/Transforms/FunctionAttrs/nonnull.ll Sat Oct 12 19:42:09 2019
@@ -533,8 +533,10 @@ define i32* @g1() {
   ret i32* %c
 }

+declare void @use_i32_ptr(i32*) readnone nounwind
 ; ATTRIBUTOR: define internal void @called_by_weak(i32* nocapture nonnull readnone %a)
 define internal void @called_by_weak(i32* %a) {
+  call void @use_i32_ptr(i32* %a)
   ret void
 }


From llvm-commits at lists.llvm.org Sat Oct 12 20:30:03 2019
From: llvm-commits at lists.llvm.org (Galina Kistanova via llvm-commits)
Date: Sun, 13 Oct 2019 03:30:03 -0000
Subject: [zorg] r374697 - UnifiedTreeBuilder code cleaning. NFC.
Message-ID: <20191013033003.2A2B686B6B@lists.llvm.org>

Author: gkistanova
Date: Sat Oct 12 20:30:02 2019
New Revision: 374697

URL: http://llvm.org/viewvc/llvm-project?rev=374697&view=rev
Log:
UnifiedTreeBuilder code cleaning. NFC.
Modified: zorg/trunk/zorg/buildbot/builders/UnifiedTreeBuilder.py Modified: zorg/trunk/zorg/buildbot/builders/UnifiedTreeBuilder.py URL: http://llvm.org/viewvc/llvm-project/zorg/trunk/zorg/buildbot/builders/UnifiedTreeBuilder.py?rev=374697&r1=374696&r2=374697&view=diff ============================================================================== --- zorg/trunk/zorg/buildbot/builders/UnifiedTreeBuilder.py (original) +++ zorg/trunk/zorg/buildbot/builders/UnifiedTreeBuilder.py Sat Oct 12 20:30:02 2019 @@ -20,25 +20,21 @@ def getLLVMBuildFactoryAndSVNSteps( **kwargs): def cleanBuildRequestedByProperty(step): - return step.build.getProperty("clean") - - # Set defaults - if not depends_on_projects: - depends_on_projects=['llvm', 'clang'] + return step.build.getProperty("clean") if cleanBuildRequested is None: - # We want a clean checkout only if requested by the property. - cleanBuildRequested = cleanBuildRequestedByProperty + # We want a clean checkout only if requested by the property. + cleanBuildRequested = cleanBuildRequestedByProperty f = LLVMBuildFactory( depends_on_projects=depends_on_projects, - llvm_srcdir=llvm_srcdir or "llvm", - obj_dir=obj_dir or "build", + llvm_srcdir=llvm_srcdir, + obj_dir=obj_dir, install_dir=install_dir, cleanBuildRequested=cleanBuildRequested, **kwargs) # Pass through all the extra arguments. - # Do a clean checkout if requested by a build property. + # Remove the source code for a clean checkout if requested by property. # TODO: Some Windows slaves do not handle RemoveDirectory command well. # So, consider running "rmdir /S /Q " if the build runs on Windows. f.addStep(RemoveDirectory(name='clean-src-dir', @@ -118,7 +114,6 @@ def addCmakeSteps( description=["Cmake", "configure", stage_name], options=cmake_args, path=src_dir, - haltOnFailure=kwargs.get('haltOnFailure', True), env=env, workdir=obj_dir, **kwargs # Pass through all the extra arguments. 
@@ -147,7 +142,6 @@ def addNinjaSteps( f.addStep(NinjaCommand(name="build-%sunified-tree" % step_name, targets=targets, description=["Build", stage_name, "unified", "tree"], - haltOnFailure=kwargs.get('haltOnFailure', True), env=env, workdir=obj_dir, **kwargs # Pass through all the extra arguments. @@ -158,7 +152,6 @@ def addNinjaSteps( f.addStep(NinjaCommand(name="test-%s%s" % (step_name,"-".join(checks)), targets=checks, description=["Test", "just", "built", "components"], - haltOnFailure=kwargs.get('haltOnFailure', True), env=env, workdir=obj_dir, **kwargs # Pass through all the extra arguments. @@ -169,7 +162,6 @@ def addNinjaSteps( f.addStep(NinjaCommand(name="install-%sall" % step_name, targets=["install"], description=["Install", "just", "built", "components"], - haltOnFailure=kwargs.get('haltOnFailure', True), env=env, workdir=obj_dir, **kwargs # Pass through all the extra arguments. From llvm-commits at lists.llvm.org Sat Oct 12 20:48:59 2019 From: llvm-commits at lists.llvm.org (Teresa Johnson via Phabricator via llvm-commits) Date: Sun, 13 Oct 2019 03:48:59 +0000 (UTC) Subject: [PATCH] D68924: CodeExtractor: NFC: Use Range based loop In-Reply-To: References: Message-ID: <495fbdd2e51cfcfccc033736b6a4df93@localhost.localdomain> tejohnson added inline comments. ================ Comment at: llvm/lib/Transforms/Utils/CodeExtractor.cpp:963 // blocks were originally in the code region. 
   std::vector<User *> Users(header->user_begin(), header->user_end());
+  for (auto &U : Users)
----------------
You can use a range iterator over the users accessed from header directly, rather than creating a vector:

  for (auto &U : header->users())

Looks like the code earlier at line 944-5 can be transformed similarly

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68924/new/

https://reviews.llvm.org/D68924


From llvm-commits at lists.llvm.org Sat Oct 12 20:54:08 2019
From: llvm-commits at lists.llvm.org (Johannes Doerfert via llvm-commits)
Date: Sun, 13 Oct 2019 03:54:08 -0000
Subject: [llvm] r374698 - [Attributor][FIX] Do not apply h2s for arbitrary mallocs
Message-ID: <20191013035408.D64F685B8A@lists.llvm.org>

Author: jdoerfert
Date: Sat Oct 12 20:54:08 2019
New Revision: 374698

URL: http://llvm.org/viewvc/llvm-project?rev=374698&view=rev
Log:
[Attributor][FIX] Do not apply h2s for arbitrary mallocs

H2S did apply to mallocs of non-constant sizes if the uses were OK. This
is now forbidden through reordering of the "good" and "bad" cases in the
conditional.

Modified: llvm/trunk/lib/Transforms/IPO/Attributor.cpp llvm/trunk/test/Transforms/FunctionAttrs/heap_to_stack.ll Modified: llvm/trunk/lib/Transforms/IPO/Attributor.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/IPO/Attributor.cpp?rev=374698&r1=374697&r2=374698&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/IPO/Attributor.cpp (original) +++ llvm/trunk/lib/Transforms/IPO/Attributor.cpp Sat Oct 12 20:54:08 2019 @@ -3620,30 +3620,36 @@ ChangeStatus AAHeapToStackImpl::updateIm }; auto MallocCallocCheck = [&](Instruction &I) { - if (isMallocLikeFn(&I, TLI)) { + if (BadMallocCalls.count(&I)) + return true; + + bool IsMalloc = isMallocLikeFn(&I, TLI); + bool IsCalloc = !IsMalloc && isCallocLikeFn(&I, TLI); + if (!IsMalloc && !IsCalloc) { + BadMallocCalls.insert(&I); + return true; + } + + if (IsMalloc) { if (auto *Size = dyn_cast(I.getOperand(0))) - if (!Size->getValue().sle(MaxHeapToStackSize)) - return true; - } else if (isCallocLikeFn(&I, TLI)) { + if (Size->getValue().sle(MaxHeapToStackSize)) + if (UsesCheck(I)) { + MallocCalls.insert(&I); + return true; + } + } else if (IsCalloc) { bool Overflow = false; if (auto *Num = dyn_cast(I.getOperand(0))) if (auto *Size = dyn_cast(I.getOperand(1))) - if (!(Size->getValue().umul_ov(Num->getValue(), Overflow)) + if ((Size->getValue().umul_ov(Num->getValue(), Overflow)) .sle(MaxHeapToStackSize)) - if (!Overflow) + if (!Overflow && UsesCheck(I)) { + MallocCalls.insert(&I); return true; - } else { - BadMallocCalls.insert(&I); - return true; + } } - if (BadMallocCalls.count(&I)) - return true; - - if (UsesCheck(I)) - MallocCalls.insert(&I); - else - BadMallocCalls.insert(&I); + BadMallocCalls.insert(&I); return true; }; Modified: llvm/trunk/test/Transforms/FunctionAttrs/heap_to_stack.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/FunctionAttrs/heap_to_stack.ll?rev=374698&r1=374697&r2=374698&view=diff 
==============================================================================
--- llvm/trunk/test/Transforms/FunctionAttrs/heap_to_stack.ll (original)
+++ llvm/trunk/test/Transforms/FunctionAttrs/heap_to_stack.ll Sat Oct 12 20:54:08 2019
@@ -316,3 +316,8 @@ define void @test14() {
   ; CHECK: tail call void @free(i8* noalias %1)
   ret void
 }
+
+define void @test15(i64 %S) {
+  %1 = tail call noalias i8* @malloc(i64 %S)
+  ret void
+}


From llvm-commits at lists.llvm.org Sat Oct 12 21:14:15 2019
From: llvm-commits at lists.llvm.org (Johannes Doerfert via llvm-commits)
Date: Sun, 13 Oct 2019 04:14:15 -0000
Subject: [llvm] r374699 - [Attributor][FIX] Ensure h2s doesn't trigger on escaped pointers
Message-ID: <20191013041415.56C3E87970@lists.llvm.org>

Author: jdoerfert
Date: Sat Oct 12 21:14:15 2019
New Revision: 374699

URL: http://llvm.org/viewvc/llvm-project?rev=374699&view=rev
Log:
[Attributor][FIX] Ensure h2s doesn't trigger on escaped pointers

We do not yet perform h2s because we know something is free'ed but we do
it because we know the pointer does not escape. Storing the pointer
allows it to escape so we have to prevent that.

Modified: llvm/trunk/lib/Transforms/IPO/Attributor.cpp llvm/trunk/test/Transforms/FunctionAttrs/heap_to_stack.ll Modified: llvm/trunk/lib/Transforms/IPO/Attributor.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/IPO/Attributor.cpp?rev=374699&r1=374698&r2=374699&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/IPO/Attributor.cpp (original) +++ llvm/trunk/lib/Transforms/IPO/Attributor.cpp Sat Oct 12 21:14:15 2019 @@ -3569,8 +3569,16 @@ ChangeStatus AAHeapToStackImpl::updateIm auto *UserI = U->getUser(); - if (isa(UserI) || isa(UserI)) + if (isa(UserI)) continue; + if (auto *SI = dyn_cast(UserI)) { + if (SI->getValueOperand() == U->get()) { + LLVM_DEBUG(dbgs() << "[H2S] escaping store to memory: " << *UserI << "\n"); + return false; + } + // A store into the malloc'ed memory is fine. + continue; + } // NOTE: Right now, if a function that has malloc pointer as an argument // frees memory, we assume that the malloc pointer is freed. 
Modified: llvm/trunk/test/Transforms/FunctionAttrs/heap_to_stack.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/FunctionAttrs/heap_to_stack.ll?rev=374699&r1=374698&r2=374699&view=diff ============================================================================== --- llvm/trunk/test/Transforms/FunctionAttrs/heap_to_stack.ll (original) +++ llvm/trunk/test/Transforms/FunctionAttrs/heap_to_stack.ll Sat Oct 12 21:14:15 2019 @@ -215,60 +215,46 @@ define void @test11() { ; TEST 12 define i32 @irreducible_cfg(i32 %0) { - %2 = alloca i32, align 4 - %3 = alloca i32*, align 8 - %4 = alloca i32, align 4 - store i32 %0, i32* %2, align 4 - %5 = call noalias i8* @malloc(i64 4) #2 ; CHECK: alloca i8, i64 4 - ; CHECK-NEXT: %6 = bitcast - %6 = bitcast i8* %5 to i32* - store i32* %6, i32** %3, align 8 - %7 = load i32*, i32** %3, align 8 - store i32 10, i32* %7, align 4 - %8 = load i32, i32* %2, align 4 - %9 = icmp eq i32 %8, 1 - br i1 %9, label %10, label %13 - -10: ; preds = %1 - %11 = load i32, i32* %2, align 4 - %12 = add nsw i32 %11, 5 - store i32 %12, i32* %2, align 4 - br label %20 - -13: ; preds = %1 - store i32 1, i32* %2, align 4 - br label %14 - -14: ; preds = %20, %13 - %15 = load i32*, i32** %3, align 8 - %16 = load i32, i32* %15, align 4 - %17 = add nsw i32 %16, -1 - store i32 %17, i32* %15, align 4 - %18 = icmp ne i32 %16, 0 - br i1 %18, label %19, label %23 - -19: ; preds = %14 - br label %20 - -20: ; preds = %19, %10 - %21 = load i32, i32* %2, align 4 - %22 = add nsw i32 %21, 1 - store i32 %22, i32* %2, align 4 - br label %14 - -23: ; preds = %14 - %24 = load i32*, i32** %3, align 8 - %25 = load i32, i32* %24, align 4 - store i32 %25, i32* %4, align 4 - %26 = load i32*, i32** %3, align 8 - %27 = bitcast i32* %26 to i8* - call void @free(i8* %27) #2 - %28 = load i32*, i32** %3, align 8 - %29 = load i32, i32* %28, align 4 - ret i32 %29 + ; CHECK-NEXT: %3 = bitcast + %2 = call noalias i8* @malloc(i64 4) + %3 = bitcast i8* %2 to i32* + store i32 
10, i32* %3, align 4 + %4 = icmp eq i32 %0, 1 + br i1 %4, label %5, label %7 + +5: ; preds = %1 + %6 = add nsw i32 %0, 5 + br label %13 + +7: ; preds = %1 + br label %8 + +8: ; preds = %13, %7 + %.0 = phi i32 [ %14, %13 ], [ 1, %7 ] + %9 = load i32, i32* %3, align 4 + %10 = add nsw i32 %9, -1 + store i32 %10, i32* %3, align 4 + %11 = icmp ne i32 %9, 0 + br i1 %11, label %12, label %15 + +12: ; preds = %8 + br label %13 + +13: ; preds = %12, %5 + %.1 = phi i32 [ %6, %5 ], [ %.0, %12 ] + %14 = add nsw i32 %.1, 1 + br label %8 + +15: ; preds = %8 + %16 = load i32, i32* %3, align 4 + %17 = bitcast i32* %3 to i8* + call void @free(i8* %17) + %18 = load i32, i32* %3, align 4 + ret i32 %18 } + define i32 @malloc_in_loop(i32 %0) { %2 = alloca i32, align 4 %3 = alloca i32*, align 8 @@ -286,7 +272,7 @@ define i32 @malloc_in_loop(i32 %0) { %9 = call noalias i8* @malloc(i64 4) ; CHECK: alloca i8, i64 4 %10 = bitcast i8* %9 to i32* - store i32* %10, i32** %3, align 8 + store i32 1, i32* %10, align 8 br label %4 11: ; preds = %4 @@ -318,6 +304,35 @@ define void @test14() { } define void @test15(i64 %S) { + ; CHECK: %1 = tail call noalias i8* @malloc(i64 %S) %1 = tail call noalias i8* @malloc(i64 %S) + ; CHECK-NEXT: @no_sync_func(i8* noalias %1) + tail call void @no_sync_func(i8* %1) + ; CHECK-NEXT: @free(i8* noalias %1) + tail call void @free(i8* %1) + ret void +} + +define void @test16a(i8 %v, i8** %P) { + ; CHECK: %1 = alloca + %1 = tail call noalias i8* @malloc(i64 4) + ; CHECK-NEXT: store i8 %v, i8* %1 + store i8 %v, i8* %1 + ; CHECK-NEXT: @no_sync_func(i8* noalias nocapture %1) + tail call void @no_sync_func(i8* %1) + ; CHECK-NOT: @free(i8* %1) + tail call void @free(i8* %1) + ret void +} + +define void @test16b(i8 %v, i8** %P) { + ; CHECK: %1 = tail call noalias i8* @malloc(i64 4) + %1 = tail call noalias i8* @malloc(i64 4) + ; CHECK-NEXT: store i8* %1, i8** %P + store i8* %1, i8** %P + ; CHECK-NEXT: @no_sync_func(i8* %1) + tail call void @no_sync_func(i8* %1) + ; 
CHECK-NEXT: @free(i8* %1) + tail call void @free(i8* %1) ret void } From llvm-commits at lists.llvm.org Sat Oct 12 21:16:02 2019 From: llvm-commits at lists.llvm.org (Johannes Doerfert via llvm-commits) Date: Sun, 13 Oct 2019 04:16:02 -0000 Subject: [llvm] r374700 - [Attributor][NFC] Expose call site traversal without QueryingAA Message-ID: <20191013041602.DDE1E877FC@lists.llvm.org> Author: jdoerfert Date: Sat Oct 12 21:16:02 2019 New Revision: 374700 URL: http://llvm.org/viewvc/llvm-project?rev=374700&view=rev Log: [Attributor][NFC] Expose call site traversal without QueryingAA Modified: llvm/trunk/include/llvm/Transforms/IPO/Attributor.h llvm/trunk/lib/Transforms/IPO/Attributor.cpp Modified: llvm/trunk/include/llvm/Transforms/IPO/Attributor.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Transforms/IPO/Attributor.h?rev=374700&r1=374699&r2=374700&view=diff ============================================================================== --- llvm/trunk/include/llvm/Transforms/IPO/Attributor.h (original) +++ llvm/trunk/include/llvm/Transforms/IPO/Attributor.h Sat Oct 12 21:16:02 2019 @@ -899,6 +899,15 @@ struct Attributor { const DataLayout &getDataLayout() const { return InfoCache.DL; } private: + /// Check \p Pred on all call sites of \p Fn. + /// + /// This method will evaluate \p Pred on call sites and return + /// true if \p Pred holds in every call sites. However, this is only possible + /// all call sites are known, hence the function has internal linkage. + bool checkForAllCallSites(const function_ref &Pred, + const Function &Fn, bool RequireAllCallSites, + const AbstractAttribute *QueryingAA); + /// The private version of getAAFor that allows to omit a querying abstract /// attribute. See also the public getAAFor method. 
template Modified: llvm/trunk/lib/Transforms/IPO/Attributor.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/IPO/Attributor.cpp?rev=374700&r1=374699&r2=374700&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/IPO/Attributor.cpp (original) +++ llvm/trunk/lib/Transforms/IPO/Attributor.cpp Sat Oct 12 21:16:02 2019 @@ -4167,19 +4167,26 @@ bool Attributor::checkForAllCallSites( return false; } - if (RequireAllCallSites && !AssociatedFunction->hasLocalLinkage()) { + return checkForAllCallSites(Pred, *AssociatedFunction, RequireAllCallSites, + &QueryingAA); +} + +bool Attributor::checkForAllCallSites( + const function_ref &Pred, const Function &Fn, + bool RequireAllCallSites, const AbstractAttribute *QueryingAA) { + if (RequireAllCallSites && !Fn.hasLocalLinkage()) { LLVM_DEBUG( dbgs() - << "[Attributor] Function " << AssociatedFunction->getName() + << "[Attributor] Function " << Fn.getName() << " has no internal linkage, hence not all call sites are known\n"); return false; } - for (const Use &U : AssociatedFunction->uses()) { + for (const Use &U : Fn.uses()) { AbstractCallSite ACS(&U); if (!ACS) { LLVM_DEBUG(dbgs() << "[Attributor] Function " - << AssociatedFunction->getName() + << Fn.getName() << " has non call site use " << *U.get() << " in " << *U.getUser() << "\n"); return false; @@ -4188,15 +4195,16 @@ bool Attributor::checkForAllCallSites( Instruction *I = ACS.getInstruction(); Function *Caller = I->getFunction(); - const auto &LivenessAA = - getAAFor(QueryingAA, IRPosition::function(*Caller), + const auto *LivenessAA = + lookupAAFor(IRPosition::function(*Caller), QueryingAA, /* TrackDependence */ false); // Skip dead calls. - if (LivenessAA.isAssumedDead(I)) { + if (LivenessAA && LivenessAA->isAssumedDead(I)) { // We actually used liveness information so we have to record a // dependence. 
-      recordDependence(LivenessAA, QueryingAA);
+      if (QueryingAA)
+        recordDependence(*LivenessAA, *QueryingAA);
       continue;
     }

@@ -4207,7 +4215,7 @@ bool Attributor::checkForAllCallSites(
         continue;
       LLVM_DEBUG(dbgs() << "[Attributor] User " << EffectiveUse->getUser()
                         << " is an invalid use of "
-                        << AssociatedFunction->getName() << "\n");
+                        << Fn.getName() << "\n");
       return false;
     }


From llvm-commits at lists.llvm.org Sat Oct 12 22:01:21 2019
From: llvm-commits at lists.llvm.org (Johannes Doerfert via Phabricator via llvm-commits)
Date: Sun, 13 Oct 2019 05:01:21 +0000 (UTC)
Subject: [PATCH] D68008: [Attributor] Use abstract call sites to determine associated arguments
In-Reply-To:
References:
Message-ID: <66658a4f22c79c12b6a72cfbc579e00d@localhost.localdomain>

jdoerfert updated this revision to Diff 224769.
jdoerfert edited the summary of this revision.
jdoerfert added a comment.

Fixes and check line update

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68008/new/

https://reviews.llvm.org/D68008

Files:
  llvm/include/llvm/IR/CallSite.h
  llvm/include/llvm/Transforms/IPO/Attributor.h
  llvm/lib/IR/AbstractCallSite.cpp
  llvm/lib/Transforms/IPO/Attributor.cpp
  llvm/test/Transforms/FunctionAttrs/callbacks.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68008.224769.patch
Type: text/x-patch
Size: 18647 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Sat Oct 12 22:01:21 2019
From: llvm-commits at lists.llvm.org (Wei Mi via Phabricator via llvm-commits)
Date: Sun, 13 Oct 2019 05:01:21 +0000 (UTC)
Subject: [PATCH] D68898: JumpThreading: enhance JT to handle BB with no successor and address comparison
In-Reply-To:
References:
Message-ID: <44817bd8833cc3ea7504d754012207f2@localhost.localdomain>

wmi added a comment.

I changed the testcase a little so the terminator won't be ret, but the generated code pattern is the same. Should it be handled as well?

------------------------------------
#include
#include
constexpr std::array x = {1, 7, 17};
bool global, cond;
void Contains(int i) {
  global = std::find(x.begin(), x.end(), i) != x.end();
  if (cond)
    __builtin_printf("hello\n");
}
------------------------------------


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68898/new/

https://reviews.llvm.org/D68898



From llvm-commits at lists.llvm.org  Sat Oct 12 22:01:22 2019
From: llvm-commits at lists.llvm.org (Juneyoung Lee via Phabricator via llvm-commits)
Date: Sun, 13 Oct 2019 05:01:22 +0000 (UTC)
Subject: [PATCH] D29011: [IR] Add Freeze instruction
In-Reply-To: 
References: 
Message-ID: <32cb9680746fc3a6365ed18a94caa8d1@localhost.localdomain>

aqjune marked an inline comment as done.
aqjune added a comment.

In D29011#1707278 , @lebedev.ri wrote:

> Should you add `llvm::Freeze` here by inheriting from `UnaryOperator` to make `isa(Op)` possible?


Couldn't you kindly point which place is good to update?


================
Comment at: lib/CodeGen/SelectionDAG/SelectionDAGBuilder.h:671
   void visitFNeg(const User &I) { visitUnary(I, ISD::FNEG); }
+  void visitFreeze(const User &I);
 
----------------
jdoerfert wrote:
> The lady of the lake says this should be:
> `void visitFreeze(const User &I) { visitUnary(I, ISD::FREEZE); }`
> If you have reason not to do it this way, also replace `visitUnary` with `visitFNeg`, though I'd prefer not to.
ISD::FREEZE will be added in the next patch - https://reviews.llvm.org/D29014 . Do you want to move the definition of ISD::FREEZE to this patch?
@lebedev.ri


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D29011/new/

https://reviews.llvm.org/D29011



From llvm-commits at lists.llvm.org  Sat Oct 12 22:07:00 2019
From: llvm-commits at lists.llvm.org (Johannes Doerfert via llvm-commits)
Date: Sun, 13 Oct 2019 05:07:00 -0000
Subject: [llvm] r374701 - [Attributor] Remove unused verification flag
Message-ID: <20191013050700.B074184FA8@lists.llvm.org>

Author: jdoerfert
Date: Sat Oct 12 22:07:00 2019
New Revision: 374701

URL: http://llvm.org/viewvc/llvm-project?rev=374701&view=rev
Log:
[Attributor] Remove unused verification flag

We use the verify max iteration now which is more reliable.

Modified:
    llvm/trunk/lib/Transforms/IPO/Attributor.cpp

Modified: llvm/trunk/lib/Transforms/IPO/Attributor.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/IPO/Attributor.cpp?rev=374701&r1=374700&r2=374701&view=diff
==============================================================================
--- llvm/trunk/lib/Transforms/IPO/Attributor.cpp (original)
+++ llvm/trunk/lib/Transforms/IPO/Attributor.cpp Sat Oct 12 22:07:00 2019
@@ -125,12 +125,6 @@ static cl::opt ManifestInternal(
     cl::desc("Manifest Attributor internal string attributes."),
     cl::init(false));
 
-static cl::opt VerifyAttributor(
-    "attributor-verify", cl::Hidden,
-    cl::desc("Verify the Attributor deduction and "
-             "manifestation of attributes -- may issue false-positive errors"),
-    cl::init(false));
-
 static cl::opt DepRecInterval(
     "attributor-dependence-recompute-interval", cl::Hidden,
     cl::desc("Number of iterations until dependences are recomputed."),
@@ -4501,24 +4495,6 @@ ChangeStatus Attributor::run(Module &M)
                     << " arguments while " << NumAtFixpoint
                     << " were in a valid fixpoint state\n");
 
-  // If verification is requested, we finished this run at a fixpoint, and the
-  // IR was changed, we re-run the whole fixpoint analysis, starting at
-  // re-initialization of the arguments. This re-run should not result in an IR
-  // change.
Though, the (virtual) state of attributes at the end of the re-run - // might be more optimistic than the known state or the IR state if the better - // state cannot be manifested. - if (VerifyAttributor && FinishedAtFixpoint && - ManifestChange == ChangeStatus::CHANGED) { - VerifyAttributor = false; - ChangeStatus VerifyStatus = run(M); - if (VerifyStatus != ChangeStatus::UNCHANGED) - llvm_unreachable( - "Attributor verification failed, re-run did result in an IR change " - "even after a fixpoint was reached in the original run. (False " - "positives possible!)"); - VerifyAttributor = true; - } - NumAttributesManifested += NumManifested; NumAttributesValidFixpoint += NumAtFixpoint; From llvm-commits at lists.llvm.org Sat Oct 12 22:19:17 2019 From: llvm-commits at lists.llvm.org (Johannes Doerfert via llvm-commits) Date: Sun, 13 Oct 2019 05:19:17 -0000 Subject: [llvm] r374702 - [Attributor][FIX] Remove leftover, now unused, variable Message-ID: <20191013051917.2DE9487553@lists.llvm.org> Author: jdoerfert Date: Sat Oct 12 22:19:17 2019 New Revision: 374702 URL: http://llvm.org/viewvc/llvm-project?rev=374702&view=rev Log: [Attributor][FIX] Remove leftover, now unused, variable Modified: llvm/trunk/lib/Transforms/IPO/Attributor.cpp Modified: llvm/trunk/lib/Transforms/IPO/Attributor.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/IPO/Attributor.cpp?rev=374702&r1=374701&r2=374702&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/IPO/Attributor.cpp (original) +++ llvm/trunk/lib/Transforms/IPO/Attributor.cpp Sat Oct 12 22:19:17 2019 @@ -4428,8 +4428,6 @@ ChangeStatus Attributor::run(Module &M) size_t NumFinalAAs = AllAbstractAttributes.size(); - bool FinishedAtFixpoint = Worklist.empty(); - // Reset abstract arguments not settled in a sound fixpoint by now. This // happens when we stopped the fixpoint iteration early. 
Note that only the
   // ones marked as "changed" *and* the ones transitively depending on them


From llvm-commits at lists.llvm.org  Sat Oct 12 22:27:09 2019
From: llvm-commits at lists.llvm.org (Johannes Doerfert via llvm-commits)
Date: Sun, 13 Oct 2019 05:27:09 -0000
Subject: [llvm] r374703 - [Attributor][FIX] Avoid splitting blocks if possible
Message-ID: <20191013052709.5320B87553@lists.llvm.org>

Author: jdoerfert
Date: Sat Oct 12 22:27:09 2019
New Revision: 374703

URL: http://llvm.org/viewvc/llvm-project?rev=374703&view=rev
Log:
[Attributor][FIX] Avoid splitting blocks if possible

Before, we eagerly split blocks even if it was not necessary, e.g., they
had a single unreachable instruction and only a single predecessor.

Modified:
    llvm/trunk/lib/Transforms/IPO/Attributor.cpp
    llvm/trunk/test/Transforms/FunctionAttrs/liveness.ll
    llvm/trunk/test/Transforms/FunctionAttrs/noreturn_async.ll
    llvm/trunk/test/Transforms/FunctionAttrs/noreturn_sync.ll

Modified: llvm/trunk/lib/Transforms/IPO/Attributor.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/IPO/Attributor.cpp?rev=374703&r1=374702&r2=374703&view=diff
==============================================================================
--- llvm/trunk/lib/Transforms/IPO/Attributor.cpp (original)
+++ llvm/trunk/lib/Transforms/IPO/Attributor.cpp Sat Oct 12 22:27:09 2019
@@ -2139,8 +2139,6 @@ struct AAIsDeadImpl : public AAIsDead {
       BasicBlock *BB = I->getParent();
       Instruction *SplitPos = I->getNextNode();
       // TODO: mark stuff before unreachable instructions as dead.
-      if (isa_and_nonnull(SplitPos))
-        continue;
 
       if (auto *II = dyn_cast(I)) {
         // If we keep the invoke the split position is at the beginning of the
@@ -2183,15 +2181,23 @@ struct AAIsDeadImpl : public AAIsDead {
           // also manifest.
assert(!NormalDestBB->isLandingPad() && "Expected the normal destination not to be a landingpad!"); - BasicBlock *SplitBB = - SplitBlockPredecessors(NormalDestBB, {BB}, ".dead"); - // The split block is live even if it contains only an unreachable - // instruction at the end. - assumeLive(A, *SplitBB); - SplitPos = SplitBB->getTerminator(); + if (NormalDestBB->getUniquePredecessor() == BB) { + assumeLive(A, *NormalDestBB); + } else { + BasicBlock *SplitBB = + SplitBlockPredecessors(NormalDestBB, {BB}, ".dead"); + // The split block is live even if it contains only an unreachable + // instruction at the end. + assumeLive(A, *SplitBB); + SplitPos = SplitBB->getTerminator(); + HasChanged = ChangeStatus::CHANGED; + } } } + if (isa_and_nonnull(SplitPos)) + continue; + BB = SplitPos->getParent(); SplitBlock(BB, SplitPos); changeToUnreachable(BB->getTerminator(), /* UseLLVMTrap */ false); Modified: llvm/trunk/test/Transforms/FunctionAttrs/liveness.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/FunctionAttrs/liveness.ll?rev=374703&r1=374702&r2=374703&view=diff ============================================================================== --- llvm/trunk/test/Transforms/FunctionAttrs/liveness.ll (original) +++ llvm/trunk/test/Transforms/FunctionAttrs/liveness.ll Sat Oct 12 22:27:09 2019 @@ -177,7 +177,7 @@ cond.true: %call = invoke i32 @foo_noreturn() to label %continue unwind label %cleanup ; CHECK: %call = invoke i32 @foo_noreturn() - ; CHECK-NEXT: to label %continue.dead unwind label %cleanup + ; CHECK-NEXT: to label %continue unwind label %cleanup cond.false: ; preds = %entry call void @normal_call() @@ -189,7 +189,7 @@ cond.end: ret i32 %cond continue: - ; CHECK: continue.dead: + ; CHECK: continue: ; CHECK-NEXT: unreachable br label %cond.end Modified: llvm/trunk/test/Transforms/FunctionAttrs/noreturn_async.ll URL: 
http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/FunctionAttrs/noreturn_async.ll?rev=374703&r1=374702&r2=374703&view=diff ============================================================================== --- llvm/trunk/test/Transforms/FunctionAttrs/noreturn_async.ll (original) +++ llvm/trunk/test/Transforms/FunctionAttrs/noreturn_async.ll Sat Oct 12 22:27:09 2019 @@ -42,12 +42,12 @@ entry: %retval = alloca i32, align 4 %__exception_code = alloca i32, align 4 ; CHECK: invoke void @"?overflow@@YAXXZ"() -; CHECK: to label %invoke.cont.dead unwind label %catch.dispatch +; CHECK: to label %invoke.cont unwind label %catch.dispatch invoke void @"?overflow@@YAXXZ"() to label %invoke.cont unwind label %catch.dispatch invoke.cont: ; preds = %entry -; CHECK: invoke.cont.dead: +; CHECK: invoke.cont: ; CHECK-NEXT: unreachable br label %invoke.cont1 @@ -101,12 +101,12 @@ entry: %retval = alloca i32, align 4 %__exception_code = alloca i32, align 4 ; CHECK: invoke void @"?overflow@@YAXXZ_may_throw"() -; CHECK: to label %invoke.cont.dead unwind label %catch.dispatch +; CHECK: to label %invoke.cont unwind label %catch.dispatch invoke void @"?overflow@@YAXXZ_may_throw"() to label %invoke.cont unwind label %catch.dispatch invoke.cont: ; preds = %entry -; CHECK: invoke.cont.dead: +; CHECK: invoke.cont: ; CHECK-NEXT: unreachable br label %invoke.cont1 Modified: llvm/trunk/test/Transforms/FunctionAttrs/noreturn_sync.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/FunctionAttrs/noreturn_sync.ll?rev=374703&r1=374702&r2=374703&view=diff ============================================================================== --- llvm/trunk/test/Transforms/FunctionAttrs/noreturn_sync.ll (original) +++ llvm/trunk/test/Transforms/FunctionAttrs/noreturn_sync.ll Sat Oct 12 22:27:09 2019 @@ -97,12 +97,12 @@ entry: %retval = alloca i32, align 4 %__exception_code = alloca i32, align 4 ; CHECK: invoke void @"?overflow@@YAXXZ_may_throw"() -; CHECK: to label %invoke.cont.dead 
unwind label %catch.dispatch +; CHECK: to label %invoke.cont unwind label %catch.dispatch invoke void @"?overflow@@YAXXZ_may_throw"() to label %invoke.cont unwind label %catch.dispatch invoke.cont: ; preds = %entry -; CHECK: invoke.cont.dead: +; CHECK: invoke.cont: ; CHECK-NEXT: unreachable br label %invoke.cont1 From llvm-commits at lists.llvm.org Sat Oct 12 22:47:42 2019 From: llvm-commits at lists.llvm.org (Craig Topper via llvm-commits) Date: Sun, 13 Oct 2019 05:47:42 -0000 Subject: [llvm] r374704 - [X86] Add v2i64->v2i32/v2i16/v2i8 test cases to the trunc packus/ssat/usat tests. NFC Message-ID: <20191013054742.6A800803DE@lists.llvm.org> Author: ctopper Date: Sat Oct 12 22:47:42 2019 New Revision: 374704 URL: http://llvm.org/viewvc/llvm-project?rev=374704&view=rev Log: [X86] Add v2i64->v2i32/v2i16/v2i8 test cases to the trunc packus/ssat/usat tests. NFC Modified: llvm/trunk/test/CodeGen/X86/vector-trunc-packus.ll llvm/trunk/test/CodeGen/X86/vector-trunc-ssat.ll llvm/trunk/test/CodeGen/X86/vector-trunc-usat.ll Modified: llvm/trunk/test/CodeGen/X86/vector-trunc-packus.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/vector-trunc-packus.ll?rev=374704&r1=374703&r2=374704&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/vector-trunc-packus.ll (original) +++ llvm/trunk/test/CodeGen/X86/vector-trunc-packus.ll Sat Oct 12 22:47:42 2019 @@ -15,6 +15,318 @@ ; PACKUS saturation truncation to vXi32 ; +define <2 x i32> @trunc_packus_v2i64_v2i32(<2 x i64> %a0) { +; SSE2-LABEL: trunc_packus_v2i64_v2i32: +; SSE2: # %bb.0: +; SSE2-NEXT: movdqa {{.*#+}} xmm1 = [2147483648,2147483648] +; SSE2-NEXT: movdqa %xmm0, %xmm2 +; SSE2-NEXT: pxor %xmm1, %xmm2 +; SSE2-NEXT: movdqa {{.*#+}} xmm3 = [2147483647,2147483647] +; SSE2-NEXT: movdqa %xmm3, %xmm4 +; SSE2-NEXT: pcmpgtd %xmm2, %xmm4 +; SSE2-NEXT: pshufd {{.*#+}} xmm5 = xmm4[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm3, %xmm2 +; SSE2-NEXT: pshufd 
{{.*#+}} xmm2 = xmm2[1,1,3,3] +; SSE2-NEXT: pand %xmm5, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm4[1,1,3,3] +; SSE2-NEXT: por %xmm2, %xmm3 +; SSE2-NEXT: pand %xmm3, %xmm0 +; SSE2-NEXT: pandn {{.*}}(%rip), %xmm3 +; SSE2-NEXT: por %xmm0, %xmm3 +; SSE2-NEXT: movdqa %xmm3, %xmm0 +; SSE2-NEXT: pxor %xmm1, %xmm0 +; SSE2-NEXT: movdqa %xmm0, %xmm2 +; SSE2-NEXT: pcmpgtd %xmm1, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm2[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm1, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] +; SSE2-NEXT: pand %xmm4, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm2[1,1,3,3] +; SSE2-NEXT: por %xmm0, %xmm1 +; SSE2-NEXT: pand %xmm3, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm1[0,2,2,3] +; SSE2-NEXT: retq +; +; SSSE3-LABEL: trunc_packus_v2i64_v2i32: +; SSSE3: # %bb.0: +; SSSE3-NEXT: movdqa {{.*#+}} xmm1 = [2147483648,2147483648] +; SSSE3-NEXT: movdqa %xmm0, %xmm2 +; SSSE3-NEXT: pxor %xmm1, %xmm2 +; SSSE3-NEXT: movdqa {{.*#+}} xmm3 = [2147483647,2147483647] +; SSSE3-NEXT: movdqa %xmm3, %xmm4 +; SSSE3-NEXT: pcmpgtd %xmm2, %xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm4[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm3, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] +; SSSE3-NEXT: pand %xmm5, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm4[1,1,3,3] +; SSSE3-NEXT: por %xmm2, %xmm3 +; SSSE3-NEXT: pand %xmm3, %xmm0 +; SSSE3-NEXT: pandn {{.*}}(%rip), %xmm3 +; SSSE3-NEXT: por %xmm0, %xmm3 +; SSSE3-NEXT: movdqa %xmm3, %xmm0 +; SSSE3-NEXT: pxor %xmm1, %xmm0 +; SSSE3-NEXT: movdqa %xmm0, %xmm2 +; SSSE3-NEXT: pcmpgtd %xmm1, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm2[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm1, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] +; SSSE3-NEXT: pand %xmm4, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm2[1,1,3,3] +; SSSE3-NEXT: por %xmm0, %xmm1 +; SSSE3-NEXT: pand %xmm3, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm1[0,2,2,3] +; SSSE3-NEXT: retq +; +; SSE41-LABEL: trunc_packus_v2i64_v2i32: +; SSE41: # %bb.0: +; 
SSE41-NEXT: movdqa %xmm0, %xmm1 +; SSE41-NEXT: movapd {{.*#+}} xmm2 = [4294967295,4294967295] +; SSE41-NEXT: movdqa {{.*#+}} xmm3 = [2147483648,2147483648] +; SSE41-NEXT: pxor %xmm3, %xmm0 +; SSE41-NEXT: movdqa {{.*#+}} xmm4 = [2147483647,2147483647] +; SSE41-NEXT: movdqa %xmm4, %xmm5 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm5 +; SSE41-NEXT: pcmpgtd %xmm0, %xmm4 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm4[0,0,2,2] +; SSE41-NEXT: pand %xmm5, %xmm0 +; SSE41-NEXT: por %xmm4, %xmm0 +; SSE41-NEXT: blendvpd %xmm0, %xmm1, %xmm2 +; SSE41-NEXT: xorpd %xmm1, %xmm1 +; SSE41-NEXT: movapd %xmm2, %xmm4 +; SSE41-NEXT: xorpd %xmm3, %xmm4 +; SSE41-NEXT: movapd %xmm4, %xmm5 +; SSE41-NEXT: pcmpeqd %xmm3, %xmm5 +; SSE41-NEXT: pcmpgtd %xmm3, %xmm4 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm4[0,0,2,2] +; SSE41-NEXT: pand %xmm5, %xmm0 +; SSE41-NEXT: por %xmm4, %xmm0 +; SSE41-NEXT: blendvpd %xmm0, %xmm2, %xmm1 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm1[0,2,2,3] +; SSE41-NEXT: retq +; +; AVX-LABEL: trunc_packus_v2i64_v2i32: +; AVX: # %bb.0: +; AVX-NEXT: vmovdqa {{.*#+}} xmm1 = [4294967295,4294967295] +; AVX-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm2 +; AVX-NEXT: vblendvpd %xmm2, %xmm0, %xmm1, %xmm0 +; AVX-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; AVX-NEXT: vpcmpgtq %xmm1, %xmm0, %xmm1 +; AVX-NEXT: vpand %xmm0, %xmm1, %xmm0 +; AVX-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3] +; AVX-NEXT: retq +; +; AVX512F-LABEL: trunc_packus_v2i64_v2i32: +; AVX512F: # %bb.0: +; AVX512F-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 +; AVX512F-NEXT: vmovdqa {{.*#+}} xmm1 = [4294967295,4294967295] +; AVX512F-NEXT: vpminsq %zmm1, %zmm0, %zmm0 +; AVX512F-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; AVX512F-NEXT: vpmaxsq %zmm1, %zmm0, %zmm0 +; AVX512F-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3] +; AVX512F-NEXT: vzeroupper +; AVX512F-NEXT: retq +; +; AVX512VL-LABEL: trunc_packus_v2i64_v2i32: +; AVX512VL: # %bb.0: +; AVX512VL-NEXT: vpminsq {{.*}}(%rip), %xmm0, %xmm0 +; AVX512VL-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; AVX512VL-NEXT: vpmaxsq %xmm1, %xmm0, 
%xmm0 +; AVX512VL-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3] +; AVX512VL-NEXT: retq +; +; AVX512BW-LABEL: trunc_packus_v2i64_v2i32: +; AVX512BW: # %bb.0: +; AVX512BW-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 +; AVX512BW-NEXT: vmovdqa {{.*#+}} xmm1 = [4294967295,4294967295] +; AVX512BW-NEXT: vpminsq %zmm1, %zmm0, %zmm0 +; AVX512BW-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; AVX512BW-NEXT: vpmaxsq %zmm1, %zmm0, %zmm0 +; AVX512BW-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3] +; AVX512BW-NEXT: vzeroupper +; AVX512BW-NEXT: retq +; +; AVX512BWVL-LABEL: trunc_packus_v2i64_v2i32: +; AVX512BWVL: # %bb.0: +; AVX512BWVL-NEXT: vpminsq {{.*}}(%rip), %xmm0, %xmm0 +; AVX512BWVL-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; AVX512BWVL-NEXT: vpmaxsq %xmm1, %xmm0, %xmm0 +; AVX512BWVL-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3] +; AVX512BWVL-NEXT: retq +; +; SKX-LABEL: trunc_packus_v2i64_v2i32: +; SKX: # %bb.0: +; SKX-NEXT: vpminsq {{.*}}(%rip), %xmm0, %xmm0 +; SKX-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; SKX-NEXT: vpmaxsq %xmm1, %xmm0, %xmm0 +; SKX-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3] +; SKX-NEXT: retq + %1 = icmp slt <2 x i64> %a0, + %2 = select <2 x i1> %1, <2 x i64> %a0, <2 x i64> + %3 = icmp sgt <2 x i64> %2, zeroinitializer + %4 = select <2 x i1> %3, <2 x i64> %2, <2 x i64> zeroinitializer + %5 = trunc <2 x i64> %4 to <2 x i32> + ret <2 x i32> %5 +} + +define void @trunc_packus_v2i64_v2i32_store(<2 x i64> %a0, <2 x i32>* %p1) { +; SSE2-LABEL: trunc_packus_v2i64_v2i32_store: +; SSE2: # %bb.0: +; SSE2-NEXT: movdqa {{.*#+}} xmm1 = [2147483648,2147483648] +; SSE2-NEXT: movdqa %xmm0, %xmm2 +; SSE2-NEXT: pxor %xmm1, %xmm2 +; SSE2-NEXT: movdqa {{.*#+}} xmm3 = [2147483647,2147483647] +; SSE2-NEXT: movdqa %xmm3, %xmm4 +; SSE2-NEXT: pcmpgtd %xmm2, %xmm4 +; SSE2-NEXT: pshufd {{.*#+}} xmm5 = xmm4[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm3, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] +; SSE2-NEXT: pand %xmm5, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm4[1,1,3,3] +; SSE2-NEXT: por %xmm2, %xmm3 +; 
SSE2-NEXT: pand %xmm3, %xmm0 +; SSE2-NEXT: pandn {{.*}}(%rip), %xmm3 +; SSE2-NEXT: por %xmm0, %xmm3 +; SSE2-NEXT: movdqa %xmm3, %xmm0 +; SSE2-NEXT: pxor %xmm1, %xmm0 +; SSE2-NEXT: movdqa %xmm0, %xmm2 +; SSE2-NEXT: pcmpgtd %xmm1, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm2[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm1, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] +; SSE2-NEXT: pand %xmm4, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm2[1,1,3,3] +; SSE2-NEXT: por %xmm0, %xmm1 +; SSE2-NEXT: pand %xmm3, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm1[0,2,2,3] +; SSE2-NEXT: movq %xmm0, (%rdi) +; SSE2-NEXT: retq +; +; SSSE3-LABEL: trunc_packus_v2i64_v2i32_store: +; SSSE3: # %bb.0: +; SSSE3-NEXT: movdqa {{.*#+}} xmm1 = [2147483648,2147483648] +; SSSE3-NEXT: movdqa %xmm0, %xmm2 +; SSSE3-NEXT: pxor %xmm1, %xmm2 +; SSSE3-NEXT: movdqa {{.*#+}} xmm3 = [2147483647,2147483647] +; SSSE3-NEXT: movdqa %xmm3, %xmm4 +; SSSE3-NEXT: pcmpgtd %xmm2, %xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm4[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm3, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] +; SSSE3-NEXT: pand %xmm5, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm4[1,1,3,3] +; SSSE3-NEXT: por %xmm2, %xmm3 +; SSSE3-NEXT: pand %xmm3, %xmm0 +; SSSE3-NEXT: pandn {{.*}}(%rip), %xmm3 +; SSSE3-NEXT: por %xmm0, %xmm3 +; SSSE3-NEXT: movdqa %xmm3, %xmm0 +; SSSE3-NEXT: pxor %xmm1, %xmm0 +; SSSE3-NEXT: movdqa %xmm0, %xmm2 +; SSSE3-NEXT: pcmpgtd %xmm1, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm2[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm1, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] +; SSSE3-NEXT: pand %xmm4, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm2[1,1,3,3] +; SSSE3-NEXT: por %xmm0, %xmm1 +; SSSE3-NEXT: pand %xmm3, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm1[0,2,2,3] +; SSSE3-NEXT: movq %xmm0, (%rdi) +; SSSE3-NEXT: retq +; +; SSE41-LABEL: trunc_packus_v2i64_v2i32_store: +; SSE41: # %bb.0: +; SSE41-NEXT: movdqa %xmm0, %xmm1 +; SSE41-NEXT: movapd {{.*#+}} xmm2 = 
[4294967295,4294967295] +; SSE41-NEXT: movdqa {{.*#+}} xmm3 = [2147483648,2147483648] +; SSE41-NEXT: pxor %xmm3, %xmm0 +; SSE41-NEXT: movdqa {{.*#+}} xmm4 = [2147483647,2147483647] +; SSE41-NEXT: movdqa %xmm4, %xmm5 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm5 +; SSE41-NEXT: pcmpgtd %xmm0, %xmm4 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm4[0,0,2,2] +; SSE41-NEXT: pand %xmm5, %xmm0 +; SSE41-NEXT: por %xmm4, %xmm0 +; SSE41-NEXT: blendvpd %xmm0, %xmm1, %xmm2 +; SSE41-NEXT: xorpd %xmm1, %xmm1 +; SSE41-NEXT: movapd %xmm2, %xmm4 +; SSE41-NEXT: xorpd %xmm3, %xmm4 +; SSE41-NEXT: movapd %xmm4, %xmm5 +; SSE41-NEXT: pcmpeqd %xmm3, %xmm5 +; SSE41-NEXT: pcmpgtd %xmm3, %xmm4 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm4[0,0,2,2] +; SSE41-NEXT: pand %xmm5, %xmm0 +; SSE41-NEXT: por %xmm4, %xmm0 +; SSE41-NEXT: blendvpd %xmm0, %xmm2, %xmm1 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm1[0,2,2,3] +; SSE41-NEXT: movq %xmm0, (%rdi) +; SSE41-NEXT: retq +; +; AVX-LABEL: trunc_packus_v2i64_v2i32_store: +; AVX: # %bb.0: +; AVX-NEXT: vmovdqa {{.*#+}} xmm1 = [4294967295,4294967295] +; AVX-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm2 +; AVX-NEXT: vblendvpd %xmm2, %xmm0, %xmm1, %xmm0 +; AVX-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; AVX-NEXT: vpcmpgtq %xmm1, %xmm0, %xmm1 +; AVX-NEXT: vpand %xmm0, %xmm1, %xmm0 +; AVX-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3] +; AVX-NEXT: vmovq %xmm0, (%rdi) +; AVX-NEXT: retq +; +; AVX512F-LABEL: trunc_packus_v2i64_v2i32_store: +; AVX512F: # %bb.0: +; AVX512F-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 +; AVX512F-NEXT: vmovdqa {{.*#+}} xmm1 = [4294967295,4294967295] +; AVX512F-NEXT: vpminsq %zmm1, %zmm0, %zmm0 +; AVX512F-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; AVX512F-NEXT: vpmaxsq %zmm1, %zmm0, %zmm0 +; AVX512F-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3] +; AVX512F-NEXT: vmovq %xmm0, (%rdi) +; AVX512F-NEXT: vzeroupper +; AVX512F-NEXT: retq +; +; AVX512VL-LABEL: trunc_packus_v2i64_v2i32_store: +; AVX512VL: # %bb.0: +; AVX512VL-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; AVX512VL-NEXT: vpmaxsq %xmm1, %xmm0, 
%xmm0 +; AVX512VL-NEXT: vpmovusqd %xmm0, (%rdi) +; AVX512VL-NEXT: retq +; +; AVX512BW-LABEL: trunc_packus_v2i64_v2i32_store: +; AVX512BW: # %bb.0: +; AVX512BW-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 +; AVX512BW-NEXT: vmovdqa {{.*#+}} xmm1 = [4294967295,4294967295] +; AVX512BW-NEXT: vpminsq %zmm1, %zmm0, %zmm0 +; AVX512BW-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; AVX512BW-NEXT: vpmaxsq %zmm1, %zmm0, %zmm0 +; AVX512BW-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3] +; AVX512BW-NEXT: vmovq %xmm0, (%rdi) +; AVX512BW-NEXT: vzeroupper +; AVX512BW-NEXT: retq +; +; AVX512BWVL-LABEL: trunc_packus_v2i64_v2i32_store: +; AVX512BWVL: # %bb.0: +; AVX512BWVL-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; AVX512BWVL-NEXT: vpmaxsq %xmm1, %xmm0, %xmm0 +; AVX512BWVL-NEXT: vpmovusqd %xmm0, (%rdi) +; AVX512BWVL-NEXT: retq +; +; SKX-LABEL: trunc_packus_v2i64_v2i32_store: +; SKX: # %bb.0: +; SKX-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; SKX-NEXT: vpmaxsq %xmm1, %xmm0, %xmm0 +; SKX-NEXT: vpmovusqd %xmm0, (%rdi) +; SKX-NEXT: retq + %1 = icmp slt <2 x i64> %a0, + %2 = select <2 x i1> %1, <2 x i64> %a0, <2 x i64> + %3 = icmp sgt <2 x i64> %2, zeroinitializer + %4 = select <2 x i1> %3, <2 x i64> %2, <2 x i64> zeroinitializer + %5 = trunc <2 x i64> %4 to <2 x i32> + store <2 x i32> %5, <2 x i32>* %p1 + ret void +} + define <4 x i32> @trunc_packus_v4i64_v4i32(<4 x i64> %a0) { ; SSE2-LABEL: trunc_packus_v4i64_v4i32: ; SSE2: # %bb.0: @@ -619,76 +931,446 @@ define <8 x i32> @trunc_packus_v8i64_v8i ; AVX1-NEXT: vinsertf128 $1, %xmm2, %ymm0, %ymm0 ; AVX1-NEXT: retq ; -; AVX2-SLOW-LABEL: trunc_packus_v8i64_v8i32: +; AVX2-SLOW-LABEL: trunc_packus_v8i64_v8i32: +; AVX2-SLOW: # %bb.0: +; AVX2-SLOW-NEXT: vmovdqa (%rdi), %ymm0 +; AVX2-SLOW-NEXT: vmovdqa 32(%rdi), %ymm1 +; AVX2-SLOW-NEXT: vpbroadcastq {{.*#+}} ymm2 = [4294967295,4294967295,4294967295,4294967295] +; AVX2-SLOW-NEXT: vpcmpgtq %ymm1, %ymm2, %ymm3 +; AVX2-SLOW-NEXT: vblendvpd %ymm3, %ymm1, %ymm2, %ymm1 +; AVX2-SLOW-NEXT: vpcmpgtq %ymm0, %ymm2, %ymm3 +; AVX2-SLOW-NEXT: 
vblendvpd %ymm3, %ymm0, %ymm2, %ymm0 +; AVX2-SLOW-NEXT: vpxor %xmm2, %xmm2, %xmm2 +; AVX2-SLOW-NEXT: vpcmpgtq %ymm2, %ymm0, %ymm3 +; AVX2-SLOW-NEXT: vpand %ymm0, %ymm3, %ymm0 +; AVX2-SLOW-NEXT: vpcmpgtq %ymm2, %ymm1, %ymm2 +; AVX2-SLOW-NEXT: vpand %ymm1, %ymm2, %ymm1 +; AVX2-SLOW-NEXT: vextracti128 $1, %ymm1, %xmm2 +; AVX2-SLOW-NEXT: vshufps {{.*#+}} xmm1 = xmm1[0,2],xmm2[0,2] +; AVX2-SLOW-NEXT: vextracti128 $1, %ymm0, %xmm2 +; AVX2-SLOW-NEXT: vshufps {{.*#+}} xmm0 = xmm0[0,2],xmm2[0,2] +; AVX2-SLOW-NEXT: vinsertf128 $1, %xmm1, %ymm0, %ymm0 +; AVX2-SLOW-NEXT: retq +; +; AVX2-FAST-LABEL: trunc_packus_v8i64_v8i32: +; AVX2-FAST: # %bb.0: +; AVX2-FAST-NEXT: vmovdqa (%rdi), %ymm0 +; AVX2-FAST-NEXT: vmovdqa 32(%rdi), %ymm1 +; AVX2-FAST-NEXT: vpbroadcastq {{.*#+}} ymm2 = [4294967295,4294967295,4294967295,4294967295] +; AVX2-FAST-NEXT: vpcmpgtq %ymm0, %ymm2, %ymm3 +; AVX2-FAST-NEXT: vblendvpd %ymm3, %ymm0, %ymm2, %ymm0 +; AVX2-FAST-NEXT: vpcmpgtq %ymm1, %ymm2, %ymm3 +; AVX2-FAST-NEXT: vblendvpd %ymm3, %ymm1, %ymm2, %ymm1 +; AVX2-FAST-NEXT: vpxor %xmm2, %xmm2, %xmm2 +; AVX2-FAST-NEXT: vpcmpgtq %ymm2, %ymm1, %ymm3 +; AVX2-FAST-NEXT: vpand %ymm1, %ymm3, %ymm1 +; AVX2-FAST-NEXT: vpcmpgtq %ymm2, %ymm0, %ymm2 +; AVX2-FAST-NEXT: vpand %ymm0, %ymm2, %ymm0 +; AVX2-FAST-NEXT: vmovdqa {{.*#+}} ymm2 = [0,2,4,6,4,6,6,7] +; AVX2-FAST-NEXT: vpermd %ymm0, %ymm2, %ymm0 +; AVX2-FAST-NEXT: vpermd %ymm1, %ymm2, %ymm1 +; AVX2-FAST-NEXT: vinserti128 $1, %xmm1, %ymm0, %ymm0 +; AVX2-FAST-NEXT: retq +; +; AVX512-LABEL: trunc_packus_v8i64_v8i32: +; AVX512: # %bb.0: +; AVX512-NEXT: vpxor %xmm0, %xmm0, %xmm0 +; AVX512-NEXT: vpmaxsq (%rdi), %zmm0, %zmm0 +; AVX512-NEXT: vpmovusqd %zmm0, %ymm0 +; AVX512-NEXT: retq +; +; SKX-LABEL: trunc_packus_v8i64_v8i32: +; SKX: # %bb.0: +; SKX-NEXT: vpxor %xmm0, %xmm0, %xmm0 +; SKX-NEXT: vpmaxsq (%rdi), %ymm0, %ymm1 +; SKX-NEXT: vpmovusqd %ymm1, %xmm1 +; SKX-NEXT: vpmaxsq 32(%rdi), %ymm0, %ymm0 +; SKX-NEXT: vpmovusqd %ymm0, %xmm0 +; SKX-NEXT: vinserti128 $1, %xmm0, 
%ymm1, %ymm0 +; SKX-NEXT: retq + %a0 = load <8 x i64>, <8 x i64>* %p0 + %1 = icmp slt <8 x i64> %a0, + %2 = select <8 x i1> %1, <8 x i64> %a0, <8 x i64> + %3 = icmp sgt <8 x i64> %2, zeroinitializer + %4 = select <8 x i1> %3, <8 x i64> %2, <8 x i64> zeroinitializer + %5 = trunc <8 x i64> %4 to <8 x i32> + ret <8 x i32> %5 +} + +; +; PACKUS saturation truncation to vXi16 +; + +define <2 x i16> @trunc_packus_v2i64_v2i16(<2 x i64> %a0) { +; SSE2-LABEL: trunc_packus_v2i64_v2i16: +; SSE2: # %bb.0: +; SSE2-NEXT: movdqa {{.*#+}} xmm1 = [2147483648,2147483648] +; SSE2-NEXT: movdqa %xmm0, %xmm2 +; SSE2-NEXT: pxor %xmm1, %xmm2 +; SSE2-NEXT: movdqa {{.*#+}} xmm3 = [2147549183,2147549183] +; SSE2-NEXT: movdqa %xmm3, %xmm4 +; SSE2-NEXT: pcmpgtd %xmm2, %xmm4 +; SSE2-NEXT: pshufd {{.*#+}} xmm5 = xmm4[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm3, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] +; SSE2-NEXT: pand %xmm5, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm4[1,1,3,3] +; SSE2-NEXT: por %xmm2, %xmm3 +; SSE2-NEXT: pand %xmm3, %xmm0 +; SSE2-NEXT: pandn {{.*}}(%rip), %xmm3 +; SSE2-NEXT: por %xmm0, %xmm3 +; SSE2-NEXT: movdqa %xmm3, %xmm0 +; SSE2-NEXT: pxor %xmm1, %xmm0 +; SSE2-NEXT: movdqa %xmm0, %xmm2 +; SSE2-NEXT: pcmpgtd %xmm1, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm2[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm1, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] +; SSE2-NEXT: pand %xmm4, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm2[1,1,3,3] +; SSE2-NEXT: por %xmm0, %xmm1 +; SSE2-NEXT: pand %xmm3, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm1[0,2,2,3] +; SSE2-NEXT: pshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7] +; SSE2-NEXT: retq +; +; SSSE3-LABEL: trunc_packus_v2i64_v2i16: +; SSSE3: # %bb.0: +; SSSE3-NEXT: movdqa {{.*#+}} xmm1 = [2147483648,2147483648] +; SSSE3-NEXT: movdqa %xmm0, %xmm2 +; SSSE3-NEXT: pxor %xmm1, %xmm2 +; SSSE3-NEXT: movdqa {{.*#+}} xmm3 = [2147549183,2147549183] +; SSSE3-NEXT: movdqa %xmm3, %xmm4 +; SSSE3-NEXT: pcmpgtd %xmm2, %xmm4 +; SSSE3-NEXT: 
pshufd {{.*#+}} xmm5 = xmm4[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm3, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] +; SSSE3-NEXT: pand %xmm5, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm4[1,1,3,3] +; SSSE3-NEXT: por %xmm2, %xmm3 +; SSSE3-NEXT: pand %xmm3, %xmm0 +; SSSE3-NEXT: pandn {{.*}}(%rip), %xmm3 +; SSSE3-NEXT: por %xmm0, %xmm3 +; SSSE3-NEXT: movdqa %xmm3, %xmm0 +; SSSE3-NEXT: pxor %xmm1, %xmm0 +; SSSE3-NEXT: movdqa %xmm0, %xmm2 +; SSSE3-NEXT: pcmpgtd %xmm1, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm2[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm1, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] +; SSSE3-NEXT: pand %xmm4, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm2[1,1,3,3] +; SSSE3-NEXT: por %xmm0, %xmm1 +; SSSE3-NEXT: pand %xmm3, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm1[0,2,2,3] +; SSSE3-NEXT: pshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7] +; SSSE3-NEXT: retq +; +; SSE41-LABEL: trunc_packus_v2i64_v2i16: +; SSE41: # %bb.0: +; SSE41-NEXT: movdqa %xmm0, %xmm1 +; SSE41-NEXT: movapd {{.*#+}} xmm2 = [65535,65535] +; SSE41-NEXT: movdqa {{.*#+}} xmm3 = [2147483648,2147483648] +; SSE41-NEXT: pxor %xmm3, %xmm0 +; SSE41-NEXT: movdqa {{.*#+}} xmm4 = [2147549183,2147549183] +; SSE41-NEXT: movdqa %xmm4, %xmm5 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm5 +; SSE41-NEXT: pcmpgtd %xmm0, %xmm4 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm4[0,0,2,2] +; SSE41-NEXT: pand %xmm5, %xmm0 +; SSE41-NEXT: por %xmm4, %xmm0 +; SSE41-NEXT: blendvpd %xmm0, %xmm1, %xmm2 +; SSE41-NEXT: xorpd %xmm1, %xmm1 +; SSE41-NEXT: movapd %xmm2, %xmm4 +; SSE41-NEXT: xorpd %xmm3, %xmm4 +; SSE41-NEXT: movapd %xmm4, %xmm5 +; SSE41-NEXT: pcmpeqd %xmm3, %xmm5 +; SSE41-NEXT: pcmpgtd %xmm3, %xmm4 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm4[0,0,2,2] +; SSE41-NEXT: pand %xmm5, %xmm0 +; SSE41-NEXT: por %xmm4, %xmm0 +; SSE41-NEXT: blendvpd %xmm0, %xmm2, %xmm1 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm1[0,2,2,3] +; SSE41-NEXT: pshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7] +; SSE41-NEXT: retq +; +; 
AVX1-LABEL: trunc_packus_v2i64_v2i16: +; AVX1: # %bb.0: +; AVX1-NEXT: vmovdqa {{.*#+}} xmm1 = [65535,65535] +; AVX1-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm2 +; AVX1-NEXT: vblendvpd %xmm2, %xmm0, %xmm1, %xmm0 +; AVX1-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; AVX1-NEXT: vpcmpgtq %xmm1, %xmm0, %xmm1 +; AVX1-NEXT: vpand %xmm0, %xmm1, %xmm0 +; AVX1-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3] +; AVX1-NEXT: vpshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7] +; AVX1-NEXT: retq +; +; AVX2-SLOW-LABEL: trunc_packus_v2i64_v2i16: +; AVX2-SLOW: # %bb.0: +; AVX2-SLOW-NEXT: vmovdqa {{.*#+}} xmm1 = [65535,65535] +; AVX2-SLOW-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm2 +; AVX2-SLOW-NEXT: vblendvpd %xmm2, %xmm0, %xmm1, %xmm0 +; AVX2-SLOW-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; AVX2-SLOW-NEXT: vpcmpgtq %xmm1, %xmm0, %xmm1 +; AVX2-SLOW-NEXT: vpand %xmm0, %xmm1, %xmm0 +; AVX2-SLOW-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3] +; AVX2-SLOW-NEXT: vpshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7] +; AVX2-SLOW-NEXT: retq +; +; AVX2-FAST-LABEL: trunc_packus_v2i64_v2i16: +; AVX2-FAST: # %bb.0: +; AVX2-FAST-NEXT: vmovdqa {{.*#+}} xmm1 = [65535,65535] +; AVX2-FAST-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm2 +; AVX2-FAST-NEXT: vblendvpd %xmm2, %xmm0, %xmm1, %xmm0 +; AVX2-FAST-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; AVX2-FAST-NEXT: vpcmpgtq %xmm1, %xmm0, %xmm1 +; AVX2-FAST-NEXT: vpand %xmm0, %xmm1, %xmm0 +; AVX2-FAST-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,1,8,9,8,9,10,11,8,9,10,11,12,13,14,15] +; AVX2-FAST-NEXT: retq +; +; AVX512F-LABEL: trunc_packus_v2i64_v2i16: +; AVX512F: # %bb.0: +; AVX512F-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 +; AVX512F-NEXT: vmovdqa {{.*#+}} xmm1 = [65535,65535] +; AVX512F-NEXT: vpminsq %zmm1, %zmm0, %zmm0 +; AVX512F-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; AVX512F-NEXT: vpmaxsq %zmm1, %zmm0, %zmm0 +; AVX512F-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3] +; AVX512F-NEXT: vpshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7] +; AVX512F-NEXT: vzeroupper +; AVX512F-NEXT: retq +; +; AVX512VL-LABEL: trunc_packus_v2i64_v2i16: +; 
AVX512VL: # %bb.0: +; AVX512VL-NEXT: vpminsq {{.*}}(%rip), %xmm0, %xmm0 +; AVX512VL-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; AVX512VL-NEXT: vpmaxsq %xmm1, %xmm0, %xmm0 +; AVX512VL-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,1,8,9,8,9,10,11,8,9,10,11,12,13,14,15] +; AVX512VL-NEXT: retq +; +; AVX512BW-LABEL: trunc_packus_v2i64_v2i16: +; AVX512BW: # %bb.0: +; AVX512BW-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 +; AVX512BW-NEXT: vmovdqa {{.*#+}} xmm1 = [65535,65535] +; AVX512BW-NEXT: vpminsq %zmm1, %zmm0, %zmm0 +; AVX512BW-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; AVX512BW-NEXT: vpmaxsq %zmm1, %zmm0, %zmm0 +; AVX512BW-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,1,8,9,8,9,10,11,8,9,10,11,12,13,14,15] +; AVX512BW-NEXT: vzeroupper +; AVX512BW-NEXT: retq +; +; AVX512BWVL-LABEL: trunc_packus_v2i64_v2i16: +; AVX512BWVL: # %bb.0: +; AVX512BWVL-NEXT: vpminsq {{.*}}(%rip), %xmm0, %xmm0 +; AVX512BWVL-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; AVX512BWVL-NEXT: vpmaxsq %xmm1, %xmm0, %xmm0 +; AVX512BWVL-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,1,8,9,8,9,10,11,8,9,10,11,12,13,14,15] +; AVX512BWVL-NEXT: retq +; +; SKX-LABEL: trunc_packus_v2i64_v2i16: +; SKX: # %bb.0: +; SKX-NEXT: vpminsq {{.*}}(%rip), %xmm0, %xmm0 +; SKX-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; SKX-NEXT: vpmaxsq %xmm1, %xmm0, %xmm0 +; SKX-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,1,8,9,8,9,10,11,8,9,10,11,12,13,14,15] +; SKX-NEXT: retq + %1 = icmp slt <2 x i64> %a0, + %2 = select <2 x i1> %1, <2 x i64> %a0, <2 x i64> + %3 = icmp sgt <2 x i64> %2, zeroinitializer + %4 = select <2 x i1> %3, <2 x i64> %2, <2 x i64> zeroinitializer + %5 = trunc <2 x i64> %4 to <2 x i16> + ret <2 x i16> %5 +} + +define void @trunc_packus_v2i64_v2i16_store(<2 x i64> %a0, <2 x i16> *%p1) { +; SSE2-LABEL: trunc_packus_v2i64_v2i16_store: +; SSE2: # %bb.0: +; SSE2-NEXT: movdqa {{.*#+}} xmm1 = [2147483648,2147483648] +; SSE2-NEXT: movdqa %xmm0, %xmm2 +; SSE2-NEXT: pxor %xmm1, %xmm2 +; SSE2-NEXT: movdqa {{.*#+}} xmm3 = [2147549183,2147549183] +; SSE2-NEXT: movdqa %xmm3, %xmm4 +; SSE2-NEXT: 
pcmpgtd %xmm2, %xmm4 +; SSE2-NEXT: pshufd {{.*#+}} xmm5 = xmm4[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm3, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] +; SSE2-NEXT: pand %xmm5, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm4[1,1,3,3] +; SSE2-NEXT: por %xmm2, %xmm3 +; SSE2-NEXT: pand %xmm3, %xmm0 +; SSE2-NEXT: pandn {{.*}}(%rip), %xmm3 +; SSE2-NEXT: por %xmm0, %xmm3 +; SSE2-NEXT: movdqa %xmm3, %xmm0 +; SSE2-NEXT: pxor %xmm1, %xmm0 +; SSE2-NEXT: movdqa %xmm0, %xmm2 +; SSE2-NEXT: pcmpgtd %xmm1, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm2[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm1, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] +; SSE2-NEXT: pand %xmm4, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm2[1,1,3,3] +; SSE2-NEXT: por %xmm0, %xmm1 +; SSE2-NEXT: pand %xmm3, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm1[0,2,2,3] +; SSE2-NEXT: pshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7] +; SSE2-NEXT: movd %xmm0, (%rdi) +; SSE2-NEXT: retq +; +; SSSE3-LABEL: trunc_packus_v2i64_v2i16_store: +; SSSE3: # %bb.0: +; SSSE3-NEXT: movdqa {{.*#+}} xmm1 = [2147483648,2147483648] +; SSSE3-NEXT: movdqa %xmm0, %xmm2 +; SSSE3-NEXT: pxor %xmm1, %xmm2 +; SSSE3-NEXT: movdqa {{.*#+}} xmm3 = [2147549183,2147549183] +; SSSE3-NEXT: movdqa %xmm3, %xmm4 +; SSSE3-NEXT: pcmpgtd %xmm2, %xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm4[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm3, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] +; SSSE3-NEXT: pand %xmm5, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm4[1,1,3,3] +; SSSE3-NEXT: por %xmm2, %xmm3 +; SSSE3-NEXT: pand %xmm3, %xmm0 +; SSSE3-NEXT: pandn {{.*}}(%rip), %xmm3 +; SSSE3-NEXT: por %xmm0, %xmm3 +; SSSE3-NEXT: movdqa %xmm3, %xmm0 +; SSSE3-NEXT: pxor %xmm1, %xmm0 +; SSSE3-NEXT: movdqa %xmm0, %xmm2 +; SSSE3-NEXT: pcmpgtd %xmm1, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm2[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm1, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] +; SSSE3-NEXT: pand %xmm4, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm1 
= xmm2[1,1,3,3] +; SSSE3-NEXT: por %xmm0, %xmm1 +; SSSE3-NEXT: pand %xmm3, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm1[0,2,2,3] +; SSSE3-NEXT: pshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7] +; SSSE3-NEXT: movd %xmm0, (%rdi) +; SSSE3-NEXT: retq +; +; SSE41-LABEL: trunc_packus_v2i64_v2i16_store: +; SSE41: # %bb.0: +; SSE41-NEXT: movdqa %xmm0, %xmm1 +; SSE41-NEXT: movapd {{.*#+}} xmm2 = [65535,65535] +; SSE41-NEXT: movdqa {{.*#+}} xmm3 = [2147483648,2147483648] +; SSE41-NEXT: pxor %xmm3, %xmm0 +; SSE41-NEXT: movdqa {{.*#+}} xmm4 = [2147549183,2147549183] +; SSE41-NEXT: movdqa %xmm4, %xmm5 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm5 +; SSE41-NEXT: pcmpgtd %xmm0, %xmm4 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm4[0,0,2,2] +; SSE41-NEXT: pand %xmm5, %xmm0 +; SSE41-NEXT: por %xmm4, %xmm0 +; SSE41-NEXT: blendvpd %xmm0, %xmm1, %xmm2 +; SSE41-NEXT: xorpd %xmm1, %xmm1 +; SSE41-NEXT: movapd %xmm2, %xmm4 +; SSE41-NEXT: xorpd %xmm3, %xmm4 +; SSE41-NEXT: movapd %xmm4, %xmm5 +; SSE41-NEXT: pcmpeqd %xmm3, %xmm5 +; SSE41-NEXT: pcmpgtd %xmm3, %xmm4 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm4[0,0,2,2] +; SSE41-NEXT: pand %xmm5, %xmm0 +; SSE41-NEXT: por %xmm4, %xmm0 +; SSE41-NEXT: blendvpd %xmm0, %xmm2, %xmm1 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm1[0,2,2,3] +; SSE41-NEXT: pshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7] +; SSE41-NEXT: movd %xmm0, (%rdi) +; SSE41-NEXT: retq +; +; AVX1-LABEL: trunc_packus_v2i64_v2i16_store: +; AVX1: # %bb.0: +; AVX1-NEXT: vmovdqa {{.*#+}} xmm1 = [65535,65535] +; AVX1-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm2 +; AVX1-NEXT: vblendvpd %xmm2, %xmm0, %xmm1, %xmm0 +; AVX1-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; AVX1-NEXT: vpcmpgtq %xmm1, %xmm0, %xmm1 +; AVX1-NEXT: vpand %xmm0, %xmm1, %xmm0 +; AVX1-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3] +; AVX1-NEXT: vpshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7] +; AVX1-NEXT: vmovd %xmm0, (%rdi) +; AVX1-NEXT: retq +; +; AVX2-SLOW-LABEL: trunc_packus_v2i64_v2i16_store: ; AVX2-SLOW: # %bb.0: -; AVX2-SLOW-NEXT: vmovdqa (%rdi), %ymm0 -; 
AVX2-SLOW-NEXT: vmovdqa 32(%rdi), %ymm1 -; AVX2-SLOW-NEXT: vpbroadcastq {{.*#+}} ymm2 = [4294967295,4294967295,4294967295,4294967295] -; AVX2-SLOW-NEXT: vpcmpgtq %ymm1, %ymm2, %ymm3 -; AVX2-SLOW-NEXT: vblendvpd %ymm3, %ymm1, %ymm2, %ymm1 -; AVX2-SLOW-NEXT: vpcmpgtq %ymm0, %ymm2, %ymm3 -; AVX2-SLOW-NEXT: vblendvpd %ymm3, %ymm0, %ymm2, %ymm0 -; AVX2-SLOW-NEXT: vpxor %xmm2, %xmm2, %xmm2 -; AVX2-SLOW-NEXT: vpcmpgtq %ymm2, %ymm0, %ymm3 -; AVX2-SLOW-NEXT: vpand %ymm0, %ymm3, %ymm0 -; AVX2-SLOW-NEXT: vpcmpgtq %ymm2, %ymm1, %ymm2 -; AVX2-SLOW-NEXT: vpand %ymm1, %ymm2, %ymm1 -; AVX2-SLOW-NEXT: vextracti128 $1, %ymm1, %xmm2 -; AVX2-SLOW-NEXT: vshufps {{.*#+}} xmm1 = xmm1[0,2],xmm2[0,2] -; AVX2-SLOW-NEXT: vextracti128 $1, %ymm0, %xmm2 -; AVX2-SLOW-NEXT: vshufps {{.*#+}} xmm0 = xmm0[0,2],xmm2[0,2] -; AVX2-SLOW-NEXT: vinsertf128 $1, %xmm1, %ymm0, %ymm0 +; AVX2-SLOW-NEXT: vmovdqa {{.*#+}} xmm1 = [65535,65535] +; AVX2-SLOW-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm2 +; AVX2-SLOW-NEXT: vblendvpd %xmm2, %xmm0, %xmm1, %xmm0 +; AVX2-SLOW-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; AVX2-SLOW-NEXT: vpcmpgtq %xmm1, %xmm0, %xmm1 +; AVX2-SLOW-NEXT: vpand %xmm0, %xmm1, %xmm0 +; AVX2-SLOW-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3] +; AVX2-SLOW-NEXT: vpshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7] +; AVX2-SLOW-NEXT: vmovd %xmm0, (%rdi) ; AVX2-SLOW-NEXT: retq ; -; AVX2-FAST-LABEL: trunc_packus_v8i64_v8i32: +; AVX2-FAST-LABEL: trunc_packus_v2i64_v2i16_store: ; AVX2-FAST: # %bb.0: -; AVX2-FAST-NEXT: vmovdqa (%rdi), %ymm0 -; AVX2-FAST-NEXT: vmovdqa 32(%rdi), %ymm1 -; AVX2-FAST-NEXT: vpbroadcastq {{.*#+}} ymm2 = [4294967295,4294967295,4294967295,4294967295] -; AVX2-FAST-NEXT: vpcmpgtq %ymm0, %ymm2, %ymm3 -; AVX2-FAST-NEXT: vblendvpd %ymm3, %ymm0, %ymm2, %ymm0 -; AVX2-FAST-NEXT: vpcmpgtq %ymm1, %ymm2, %ymm3 -; AVX2-FAST-NEXT: vblendvpd %ymm3, %ymm1, %ymm2, %ymm1 -; AVX2-FAST-NEXT: vpxor %xmm2, %xmm2, %xmm2 -; AVX2-FAST-NEXT: vpcmpgtq %ymm2, %ymm1, %ymm3 -; AVX2-FAST-NEXT: vpand %ymm1, %ymm3, %ymm1 -; 
AVX2-FAST-NEXT: vpcmpgtq %ymm2, %ymm0, %ymm2 -; AVX2-FAST-NEXT: vpand %ymm0, %ymm2, %ymm0 -; AVX2-FAST-NEXT: vmovdqa {{.*#+}} ymm2 = [0,2,4,6,4,6,6,7] -; AVX2-FAST-NEXT: vpermd %ymm0, %ymm2, %ymm0 -; AVX2-FAST-NEXT: vpermd %ymm1, %ymm2, %ymm1 -; AVX2-FAST-NEXT: vinserti128 $1, %xmm1, %ymm0, %ymm0 +; AVX2-FAST-NEXT: vmovdqa {{.*#+}} xmm1 = [65535,65535] +; AVX2-FAST-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm2 +; AVX2-FAST-NEXT: vblendvpd %xmm2, %xmm0, %xmm1, %xmm0 +; AVX2-FAST-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; AVX2-FAST-NEXT: vpcmpgtq %xmm1, %xmm0, %xmm1 +; AVX2-FAST-NEXT: vpand %xmm0, %xmm1, %xmm0 +; AVX2-FAST-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,1,8,9,8,9,10,11,8,9,10,11,12,13,14,15] +; AVX2-FAST-NEXT: vmovd %xmm0, (%rdi) ; AVX2-FAST-NEXT: retq ; -; AVX512-LABEL: trunc_packus_v8i64_v8i32: -; AVX512: # %bb.0: -; AVX512-NEXT: vpxor %xmm0, %xmm0, %xmm0 -; AVX512-NEXT: vpmaxsq (%rdi), %zmm0, %zmm0 -; AVX512-NEXT: vpmovusqd %zmm0, %ymm0 -; AVX512-NEXT: retq +; AVX512F-LABEL: trunc_packus_v2i64_v2i16_store: +; AVX512F: # %bb.0: +; AVX512F-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 +; AVX512F-NEXT: vmovdqa {{.*#+}} xmm1 = [65535,65535] +; AVX512F-NEXT: vpminsq %zmm1, %zmm0, %zmm0 +; AVX512F-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; AVX512F-NEXT: vpmaxsq %zmm1, %zmm0, %zmm0 +; AVX512F-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3] +; AVX512F-NEXT: vpshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7] +; AVX512F-NEXT: vmovd %xmm0, (%rdi) +; AVX512F-NEXT: vzeroupper +; AVX512F-NEXT: retq ; -; SKX-LABEL: trunc_packus_v8i64_v8i32: +; AVX512VL-LABEL: trunc_packus_v2i64_v2i16_store: +; AVX512VL: # %bb.0: +; AVX512VL-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; AVX512VL-NEXT: vpmaxsq %xmm1, %xmm0, %xmm0 +; AVX512VL-NEXT: vpmovusqw %xmm0, (%rdi) +; AVX512VL-NEXT: retq +; +; AVX512BW-LABEL: trunc_packus_v2i64_v2i16_store: +; AVX512BW: # %bb.0: +; AVX512BW-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 +; AVX512BW-NEXT: vmovdqa {{.*#+}} xmm1 = [65535,65535] +; AVX512BW-NEXT: vpminsq %zmm1, %zmm0, %zmm0 +; 
AVX512BW-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; AVX512BW-NEXT: vpmaxsq %zmm1, %zmm0, %zmm0 +; AVX512BW-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,1,8,9,8,9,10,11,8,9,10,11,12,13,14,15] +; AVX512BW-NEXT: vmovd %xmm0, (%rdi) +; AVX512BW-NEXT: vzeroupper +; AVX512BW-NEXT: retq +; +; AVX512BWVL-LABEL: trunc_packus_v2i64_v2i16_store: +; AVX512BWVL: # %bb.0: +; AVX512BWVL-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; AVX512BWVL-NEXT: vpmaxsq %xmm1, %xmm0, %xmm0 +; AVX512BWVL-NEXT: vpmovusqw %xmm0, (%rdi) +; AVX512BWVL-NEXT: retq +; +; SKX-LABEL: trunc_packus_v2i64_v2i16_store: ; SKX: # %bb.0: -; SKX-NEXT: vpxor %xmm0, %xmm0, %xmm0 -; SKX-NEXT: vpmaxsq (%rdi), %ymm0, %ymm1 -; SKX-NEXT: vpmovusqd %ymm1, %xmm1 -; SKX-NEXT: vpmaxsq 32(%rdi), %ymm0, %ymm0 -; SKX-NEXT: vpmovusqd %ymm0, %xmm0 -; SKX-NEXT: vinserti128 $1, %xmm0, %ymm1, %ymm0 +; SKX-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; SKX-NEXT: vpmaxsq %xmm1, %xmm0, %xmm0 +; SKX-NEXT: vpmovusqw %xmm0, (%rdi) ; SKX-NEXT: retq - %a0 = load <8 x i64>, <8 x i64>* %p0 - %1 = icmp slt <8 x i64> %a0, - %2 = select <8 x i1> %1, <8 x i64> %a0, <8 x i64> - %3 = icmp sgt <8 x i64> %2, zeroinitializer - %4 = select <8 x i1> %3, <8 x i64> %2, <8 x i64> zeroinitializer - %5 = trunc <8 x i64> %4 to <8 x i32> - ret <8 x i32> %5 + %1 = icmp slt <2 x i64> %a0, + %2 = select <2 x i1> %1, <2 x i64> %a0, <2 x i64> + %3 = icmp sgt <2 x i64> %2, zeroinitializer + %4 = select <2 x i1> %3, <2 x i64> %2, <2 x i64> zeroinitializer + %5 = trunc <2 x i64> %4 to <2 x i16> + store <2 x i16> %5, <2 x i16> *%p1 + ret void } -; -; PACKUS saturation truncation to vXi16 -; - define <4 x i16> @trunc_packus_v4i64_v4i16(<4 x i64> %a0) { ; SSE2-LABEL: trunc_packus_v4i64_v4i16: ; SSE2: # %bb.0: @@ -2112,6 +2794,327 @@ define <16 x i16> @trunc_packus_v16i32_v ; PACKUS saturation truncation to vXi8 ; +define <2 x i8> @trunc_packus_v2i64_v2i8(<2 x i64> %a0) { +; SSE2-LABEL: trunc_packus_v2i64_v2i8: +; SSE2: # %bb.0: +; SSE2-NEXT: movdqa {{.*#+}} xmm1 = [2147483648,2147483648] +; SSE2-NEXT: 
movdqa %xmm0, %xmm2 +; SSE2-NEXT: pxor %xmm1, %xmm2 +; SSE2-NEXT: movdqa {{.*#+}} xmm3 = [2147483903,2147483903] +; SSE2-NEXT: movdqa %xmm3, %xmm4 +; SSE2-NEXT: pcmpgtd %xmm2, %xmm4 +; SSE2-NEXT: pshufd {{.*#+}} xmm5 = xmm4[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm3, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] +; SSE2-NEXT: pand %xmm5, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm4[1,1,3,3] +; SSE2-NEXT: por %xmm2, %xmm3 +; SSE2-NEXT: pand %xmm3, %xmm0 +; SSE2-NEXT: pandn {{.*}}(%rip), %xmm3 +; SSE2-NEXT: por %xmm0, %xmm3 +; SSE2-NEXT: movdqa %xmm3, %xmm0 +; SSE2-NEXT: pxor %xmm1, %xmm0 +; SSE2-NEXT: movdqa %xmm0, %xmm2 +; SSE2-NEXT: pcmpgtd %xmm1, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm2[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm1, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm0[1,1,3,3] +; SSE2-NEXT: pand %xmm4, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm2[1,1,3,3] +; SSE2-NEXT: por %xmm1, %xmm0 +; SSE2-NEXT: pand %xmm3, %xmm0 +; SSE2-NEXT: pand {{.*}}(%rip), %xmm0 +; SSE2-NEXT: packuswb %xmm0, %xmm0 +; SSE2-NEXT: packuswb %xmm0, %xmm0 +; SSE2-NEXT: packuswb %xmm0, %xmm0 +; SSE2-NEXT: retq +; +; SSSE3-LABEL: trunc_packus_v2i64_v2i8: +; SSSE3: # %bb.0: +; SSSE3-NEXT: movdqa {{.*#+}} xmm1 = [2147483648,2147483648] +; SSSE3-NEXT: movdqa %xmm0, %xmm2 +; SSSE3-NEXT: pxor %xmm1, %xmm2 +; SSSE3-NEXT: movdqa {{.*#+}} xmm3 = [2147483903,2147483903] +; SSSE3-NEXT: movdqa %xmm3, %xmm4 +; SSSE3-NEXT: pcmpgtd %xmm2, %xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm4[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm3, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] +; SSSE3-NEXT: pand %xmm5, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm4[1,1,3,3] +; SSSE3-NEXT: por %xmm2, %xmm3 +; SSSE3-NEXT: pand %xmm3, %xmm0 +; SSSE3-NEXT: pandn {{.*}}(%rip), %xmm3 +; SSSE3-NEXT: por %xmm0, %xmm3 +; SSSE3-NEXT: movdqa %xmm3, %xmm0 +; SSSE3-NEXT: pxor %xmm1, %xmm0 +; SSSE3-NEXT: movdqa %xmm0, %xmm2 +; SSSE3-NEXT: pcmpgtd %xmm1, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm2[0,0,2,2] 
+; SSSE3-NEXT: pcmpeqd %xmm1, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm0[1,1,3,3] +; SSSE3-NEXT: pand %xmm4, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm2[1,1,3,3] +; SSSE3-NEXT: por %xmm1, %xmm0 +; SSSE3-NEXT: pand %xmm3, %xmm0 +; SSSE3-NEXT: pshufb {{.*#+}} xmm0 = xmm0[0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u] +; SSSE3-NEXT: retq +; +; SSE41-LABEL: trunc_packus_v2i64_v2i8: +; SSE41: # %bb.0: +; SSE41-NEXT: movdqa %xmm0, %xmm1 +; SSE41-NEXT: movapd {{.*#+}} xmm2 = [255,255] +; SSE41-NEXT: movdqa {{.*#+}} xmm3 = [2147483648,2147483648] +; SSE41-NEXT: pxor %xmm3, %xmm0 +; SSE41-NEXT: movdqa {{.*#+}} xmm4 = [2147483903,2147483903] +; SSE41-NEXT: movdqa %xmm4, %xmm5 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm5 +; SSE41-NEXT: pcmpgtd %xmm0, %xmm4 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm4[0,0,2,2] +; SSE41-NEXT: pand %xmm5, %xmm0 +; SSE41-NEXT: por %xmm4, %xmm0 +; SSE41-NEXT: blendvpd %xmm0, %xmm1, %xmm2 +; SSE41-NEXT: xorpd %xmm1, %xmm1 +; SSE41-NEXT: movapd %xmm2, %xmm4 +; SSE41-NEXT: xorpd %xmm3, %xmm4 +; SSE41-NEXT: movapd %xmm4, %xmm5 +; SSE41-NEXT: pcmpeqd %xmm3, %xmm5 +; SSE41-NEXT: pcmpgtd %xmm3, %xmm4 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm4[0,0,2,2] +; SSE41-NEXT: pand %xmm5, %xmm0 +; SSE41-NEXT: por %xmm4, %xmm0 +; SSE41-NEXT: blendvpd %xmm0, %xmm2, %xmm1 +; SSE41-NEXT: pshufb {{.*#+}} xmm1 = xmm1[0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u] +; SSE41-NEXT: movdqa %xmm1, %xmm0 +; SSE41-NEXT: retq +; +; AVX-LABEL: trunc_packus_v2i64_v2i8: +; AVX: # %bb.0: +; AVX-NEXT: vmovdqa {{.*#+}} xmm1 = [255,255] +; AVX-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm2 +; AVX-NEXT: vblendvpd %xmm2, %xmm0, %xmm1, %xmm0 +; AVX-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; AVX-NEXT: vpcmpgtq %xmm1, %xmm0, %xmm1 +; AVX-NEXT: vpand %xmm0, %xmm1, %xmm0 +; AVX-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX-NEXT: retq +; +; AVX512F-LABEL: trunc_packus_v2i64_v2i8: +; AVX512F: # %bb.0: +; AVX512F-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 +; AVX512F-NEXT: vmovdqa {{.*#+}} xmm1 = [255,255] +; 
AVX512F-NEXT: vpminsq %zmm1, %zmm0, %zmm0 +; AVX512F-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; AVX512F-NEXT: vpmaxsq %zmm1, %zmm0, %zmm0 +; AVX512F-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX512F-NEXT: vzeroupper +; AVX512F-NEXT: retq +; +; AVX512VL-LABEL: trunc_packus_v2i64_v2i8: +; AVX512VL: # %bb.0: +; AVX512VL-NEXT: vpminsq {{.*}}(%rip), %xmm0, %xmm0 +; AVX512VL-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; AVX512VL-NEXT: vpmaxsq %xmm1, %xmm0, %xmm0 +; AVX512VL-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX512VL-NEXT: retq +; +; AVX512BW-LABEL: trunc_packus_v2i64_v2i8: +; AVX512BW: # %bb.0: +; AVX512BW-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 +; AVX512BW-NEXT: vmovdqa {{.*#+}} xmm1 = [255,255] +; AVX512BW-NEXT: vpminsq %zmm1, %zmm0, %zmm0 +; AVX512BW-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; AVX512BW-NEXT: vpmaxsq %zmm1, %zmm0, %zmm0 +; AVX512BW-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX512BW-NEXT: vzeroupper +; AVX512BW-NEXT: retq +; +; AVX512BWVL-LABEL: trunc_packus_v2i64_v2i8: +; AVX512BWVL: # %bb.0: +; AVX512BWVL-NEXT: vpminsq {{.*}}(%rip), %xmm0, %xmm0 +; AVX512BWVL-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; AVX512BWVL-NEXT: vpmaxsq %xmm1, %xmm0, %xmm0 +; AVX512BWVL-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX512BWVL-NEXT: retq +; +; SKX-LABEL: trunc_packus_v2i64_v2i8: +; SKX: # %bb.0: +; SKX-NEXT: vpminsq {{.*}}(%rip), %xmm0, %xmm0 +; SKX-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; SKX-NEXT: vpmaxsq %xmm1, %xmm0, %xmm0 +; SKX-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u] +; SKX-NEXT: retq + %1 = icmp slt <2 x i64> %a0, + %2 = select <2 x i1> %1, <2 x i64> %a0, <2 x i64> + %3 = icmp sgt <2 x i64> %2, zeroinitializer + %4 = select <2 x i1> %3, <2 x i64> %2, <2 x i64> zeroinitializer + %5 = trunc <2 x i64> %4 to <2 x i8> + ret <2 x i8> %5 +} + +define void @trunc_packus_v2i64_v2i8_store(<2 x i64> %a0, <2 x i8> *%p1) { +; SSE2-LABEL: 
trunc_packus_v2i64_v2i8_store: +; SSE2: # %bb.0: +; SSE2-NEXT: movdqa {{.*#+}} xmm1 = [2147483648,2147483648] +; SSE2-NEXT: movdqa %xmm0, %xmm2 +; SSE2-NEXT: pxor %xmm1, %xmm2 +; SSE2-NEXT: movdqa {{.*#+}} xmm3 = [2147483903,2147483903] +; SSE2-NEXT: movdqa %xmm3, %xmm4 +; SSE2-NEXT: pcmpgtd %xmm2, %xmm4 +; SSE2-NEXT: pshufd {{.*#+}} xmm5 = xmm4[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm3, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] +; SSE2-NEXT: pand %xmm5, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm4[1,1,3,3] +; SSE2-NEXT: por %xmm2, %xmm3 +; SSE2-NEXT: pand %xmm3, %xmm0 +; SSE2-NEXT: pandn {{.*}}(%rip), %xmm3 +; SSE2-NEXT: por %xmm0, %xmm3 +; SSE2-NEXT: movdqa %xmm3, %xmm0 +; SSE2-NEXT: pxor %xmm1, %xmm0 +; SSE2-NEXT: movdqa %xmm0, %xmm2 +; SSE2-NEXT: pcmpgtd %xmm1, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm2[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm1, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] +; SSE2-NEXT: pand %xmm4, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm2[1,1,3,3] +; SSE2-NEXT: por %xmm0, %xmm1 +; SSE2-NEXT: pand %xmm3, %xmm1 +; SSE2-NEXT: pand {{.*}}(%rip), %xmm1 +; SSE2-NEXT: packuswb %xmm1, %xmm1 +; SSE2-NEXT: packuswb %xmm0, %xmm1 +; SSE2-NEXT: packuswb %xmm0, %xmm1 +; SSE2-NEXT: movd %xmm1, %eax +; SSE2-NEXT: movw %ax, (%rdi) +; SSE2-NEXT: retq +; +; SSSE3-LABEL: trunc_packus_v2i64_v2i8_store: +; SSSE3: # %bb.0: +; SSSE3-NEXT: movdqa {{.*#+}} xmm1 = [2147483648,2147483648] +; SSSE3-NEXT: movdqa %xmm0, %xmm2 +; SSSE3-NEXT: pxor %xmm1, %xmm2 +; SSSE3-NEXT: movdqa {{.*#+}} xmm3 = [2147483903,2147483903] +; SSSE3-NEXT: movdqa %xmm3, %xmm4 +; SSSE3-NEXT: pcmpgtd %xmm2, %xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm4[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm3, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] +; SSSE3-NEXT: pand %xmm5, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm4[1,1,3,3] +; SSSE3-NEXT: por %xmm2, %xmm3 +; SSSE3-NEXT: pand %xmm3, %xmm0 +; SSSE3-NEXT: pandn {{.*}}(%rip), %xmm3 +; SSSE3-NEXT: por %xmm0, 
%xmm3 +; SSSE3-NEXT: movdqa %xmm3, %xmm0 +; SSSE3-NEXT: pxor %xmm1, %xmm0 +; SSSE3-NEXT: movdqa %xmm0, %xmm2 +; SSSE3-NEXT: pcmpgtd %xmm1, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm2[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm1, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3] +; SSSE3-NEXT: pand %xmm4, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm2[1,1,3,3] +; SSSE3-NEXT: por %xmm0, %xmm1 +; SSSE3-NEXT: pand %xmm3, %xmm1 +; SSSE3-NEXT: pshufb {{.*#+}} xmm1 = xmm1[0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u] +; SSSE3-NEXT: movd %xmm1, %eax +; SSSE3-NEXT: movw %ax, (%rdi) +; SSSE3-NEXT: retq +; +; SSE41-LABEL: trunc_packus_v2i64_v2i8_store: +; SSE41: # %bb.0: +; SSE41-NEXT: movdqa %xmm0, %xmm1 +; SSE41-NEXT: movapd {{.*#+}} xmm2 = [255,255] +; SSE41-NEXT: movdqa {{.*#+}} xmm3 = [2147483648,2147483648] +; SSE41-NEXT: pxor %xmm3, %xmm0 +; SSE41-NEXT: movdqa {{.*#+}} xmm4 = [2147483903,2147483903] +; SSE41-NEXT: movdqa %xmm4, %xmm5 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm5 +; SSE41-NEXT: pcmpgtd %xmm0, %xmm4 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm4[0,0,2,2] +; SSE41-NEXT: pand %xmm5, %xmm0 +; SSE41-NEXT: por %xmm4, %xmm0 +; SSE41-NEXT: blendvpd %xmm0, %xmm1, %xmm2 +; SSE41-NEXT: xorpd %xmm1, %xmm1 +; SSE41-NEXT: movapd %xmm2, %xmm4 +; SSE41-NEXT: xorpd %xmm3, %xmm4 +; SSE41-NEXT: movapd %xmm4, %xmm5 +; SSE41-NEXT: pcmpeqd %xmm3, %xmm5 +; SSE41-NEXT: pcmpgtd %xmm3, %xmm4 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm4[0,0,2,2] +; SSE41-NEXT: pand %xmm5, %xmm0 +; SSE41-NEXT: por %xmm4, %xmm0 +; SSE41-NEXT: blendvpd %xmm0, %xmm2, %xmm1 +; SSE41-NEXT: pshufb {{.*#+}} xmm1 = xmm1[0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u] +; SSE41-NEXT: pextrw $0, %xmm1, (%rdi) +; SSE41-NEXT: retq +; +; AVX-LABEL: trunc_packus_v2i64_v2i8_store: +; AVX: # %bb.0: +; AVX-NEXT: vmovdqa {{.*#+}} xmm1 = [255,255] +; AVX-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm2 +; AVX-NEXT: vblendvpd %xmm2, %xmm0, %xmm1, %xmm0 +; AVX-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; AVX-NEXT: vpcmpgtq %xmm1, %xmm0, %xmm1 +; AVX-NEXT: vpand %xmm0, %xmm1, 
%xmm0 +; AVX-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX-NEXT: vpextrw $0, %xmm0, (%rdi) +; AVX-NEXT: retq +; +; AVX512F-LABEL: trunc_packus_v2i64_v2i8_store: +; AVX512F: # %bb.0: +; AVX512F-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 +; AVX512F-NEXT: vmovdqa {{.*#+}} xmm1 = [255,255] +; AVX512F-NEXT: vpminsq %zmm1, %zmm0, %zmm0 +; AVX512F-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; AVX512F-NEXT: vpmaxsq %zmm1, %zmm0, %zmm0 +; AVX512F-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX512F-NEXT: vpextrw $0, %xmm0, (%rdi) +; AVX512F-NEXT: vzeroupper +; AVX512F-NEXT: retq +; +; AVX512VL-LABEL: trunc_packus_v2i64_v2i8_store: +; AVX512VL: # %bb.0: +; AVX512VL-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; AVX512VL-NEXT: vpmaxsq %xmm1, %xmm0, %xmm0 +; AVX512VL-NEXT: vpmovusqb %xmm0, (%rdi) +; AVX512VL-NEXT: retq +; +; AVX512BW-LABEL: trunc_packus_v2i64_v2i8_store: +; AVX512BW: # %bb.0: +; AVX512BW-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 +; AVX512BW-NEXT: vmovdqa {{.*#+}} xmm1 = [255,255] +; AVX512BW-NEXT: vpminsq %zmm1, %zmm0, %zmm0 +; AVX512BW-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; AVX512BW-NEXT: vpmaxsq %zmm1, %zmm0, %zmm0 +; AVX512BW-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX512BW-NEXT: vpextrw $0, %xmm0, (%rdi) +; AVX512BW-NEXT: vzeroupper +; AVX512BW-NEXT: retq +; +; AVX512BWVL-LABEL: trunc_packus_v2i64_v2i8_store: +; AVX512BWVL: # %bb.0: +; AVX512BWVL-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; AVX512BWVL-NEXT: vpmaxsq %xmm1, %xmm0, %xmm0 +; AVX512BWVL-NEXT: vpmovusqb %xmm0, (%rdi) +; AVX512BWVL-NEXT: retq +; +; SKX-LABEL: trunc_packus_v2i64_v2i8_store: +; SKX: # %bb.0: +; SKX-NEXT: vpxor %xmm1, %xmm1, %xmm1 +; SKX-NEXT: vpmaxsq %xmm1, %xmm0, %xmm0 +; SKX-NEXT: vpmovusqb %xmm0, (%rdi) +; SKX-NEXT: retq + %1 = icmp slt <2 x i64> %a0, + %2 = select <2 x i1> %1, <2 x i64> %a0, <2 x i64> + %3 = icmp sgt <2 x i64> %2, zeroinitializer + %4 = select <2 x i1> %3, <2 x i64> %2, <2 x i64> zeroinitializer + %5 = 
trunc <2 x i64> %4 to <2 x i8> + store <2 x i8> %5, <2 x i8> *%p1 + ret void +} + define <4 x i8> @trunc_packus_v4i64_v4i8(<4 x i64> %a0) { ; SSE2-LABEL: trunc_packus_v4i64_v4i8: ; SSE2: # %bb.0: Modified: llvm/trunk/test/CodeGen/X86/vector-trunc-ssat.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/vector-trunc-ssat.ll?rev=374704&r1=374703&r2=374704&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/vector-trunc-ssat.ll (original) +++ llvm/trunk/test/CodeGen/X86/vector-trunc-ssat.ll Sat Oct 12 22:47:42 2019 @@ -15,6 +15,317 @@ ; Signed saturation truncation to vXi32 ; +define <2 x i32> @trunc_ssat_v2i64_v2i32(<2 x i64> %a0) { +; SSE2-LABEL: trunc_ssat_v2i64_v2i32: +; SSE2: # %bb.0: +; SSE2-NEXT: movdqa {{.*#+}} xmm1 = [2147483648,2147483648] +; SSE2-NEXT: movdqa %xmm0, %xmm2 +; SSE2-NEXT: pxor %xmm1, %xmm2 +; SSE2-NEXT: movdqa {{.*#+}} xmm3 = [4294967295,4294967295] +; SSE2-NEXT: movdqa %xmm3, %xmm4 +; SSE2-NEXT: pcmpgtd %xmm2, %xmm4 +; SSE2-NEXT: pshufd {{.*#+}} xmm5 = xmm4[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm3, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] +; SSE2-NEXT: pand %xmm5, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm4[1,1,3,3] +; SSE2-NEXT: por %xmm2, %xmm3 +; SSE2-NEXT: pand %xmm3, %xmm0 +; SSE2-NEXT: pandn {{.*}}(%rip), %xmm3 +; SSE2-NEXT: por %xmm0, %xmm3 +; SSE2-NEXT: pxor %xmm3, %xmm1 +; SSE2-NEXT: movdqa {{.*#+}} xmm0 = [18446744069414584320,18446744069414584320] +; SSE2-NEXT: movdqa %xmm1, %xmm2 +; SSE2-NEXT: pcmpgtd %xmm0, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm2[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm0, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm1[1,1,3,3] +; SSE2-NEXT: pand %xmm4, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm2[1,1,3,3] +; SSE2-NEXT: por %xmm0, %xmm1 +; SSE2-NEXT: pand %xmm1, %xmm3 +; SSE2-NEXT: pandn {{.*}}(%rip), %xmm1 +; SSE2-NEXT: por %xmm3, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm1[0,2,2,3] +; SSE2-NEXT: 
retq +; +; SSSE3-LABEL: trunc_ssat_v2i64_v2i32: +; SSSE3: # %bb.0: +; SSSE3-NEXT: movdqa {{.*#+}} xmm1 = [2147483648,2147483648] +; SSSE3-NEXT: movdqa %xmm0, %xmm2 +; SSSE3-NEXT: pxor %xmm1, %xmm2 +; SSSE3-NEXT: movdqa {{.*#+}} xmm3 = [4294967295,4294967295] +; SSSE3-NEXT: movdqa %xmm3, %xmm4 +; SSSE3-NEXT: pcmpgtd %xmm2, %xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm4[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm3, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] +; SSSE3-NEXT: pand %xmm5, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm4[1,1,3,3] +; SSSE3-NEXT: por %xmm2, %xmm3 +; SSSE3-NEXT: pand %xmm3, %xmm0 +; SSSE3-NEXT: pandn {{.*}}(%rip), %xmm3 +; SSSE3-NEXT: por %xmm0, %xmm3 +; SSSE3-NEXT: pxor %xmm3, %xmm1 +; SSSE3-NEXT: movdqa {{.*#+}} xmm0 = [18446744069414584320,18446744069414584320] +; SSSE3-NEXT: movdqa %xmm1, %xmm2 +; SSSE3-NEXT: pcmpgtd %xmm0, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm2[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm0, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm1[1,1,3,3] +; SSSE3-NEXT: pand %xmm4, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm2[1,1,3,3] +; SSSE3-NEXT: por %xmm0, %xmm1 +; SSSE3-NEXT: pand %xmm1, %xmm3 +; SSSE3-NEXT: pandn {{.*}}(%rip), %xmm1 +; SSSE3-NEXT: por %xmm3, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm1[0,2,2,3] +; SSSE3-NEXT: retq +; +; SSE41-LABEL: trunc_ssat_v2i64_v2i32: +; SSE41: # %bb.0: +; SSE41-NEXT: movdqa %xmm0, %xmm1 +; SSE41-NEXT: movapd {{.*#+}} xmm2 = [2147483647,2147483647] +; SSE41-NEXT: movdqa {{.*#+}} xmm3 = [2147483648,2147483648] +; SSE41-NEXT: pxor %xmm3, %xmm0 +; SSE41-NEXT: movdqa {{.*#+}} xmm4 = [4294967295,4294967295] +; SSE41-NEXT: movdqa %xmm4, %xmm5 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm5 +; SSE41-NEXT: pcmpgtd %xmm0, %xmm4 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm4[0,0,2,2] +; SSE41-NEXT: pand %xmm5, %xmm0 +; SSE41-NEXT: por %xmm4, %xmm0 +; SSE41-NEXT: blendvpd %xmm0, %xmm1, %xmm2 +; SSE41-NEXT: movapd {{.*#+}} xmm1 = [18446744071562067968,18446744071562067968] +; SSE41-NEXT: pxor 
%xmm2, %xmm3 +; SSE41-NEXT: movdqa {{.*#+}} xmm0 = [18446744069414584320,18446744069414584320] +; SSE41-NEXT: movdqa %xmm3, %xmm4 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm4 +; SSE41-NEXT: pcmpgtd %xmm0, %xmm3 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm3[0,0,2,2] +; SSE41-NEXT: pand %xmm4, %xmm0 +; SSE41-NEXT: por %xmm3, %xmm0 +; SSE41-NEXT: blendvpd %xmm0, %xmm2, %xmm1 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm1[0,2,2,3] +; SSE41-NEXT: retq +; +; AVX-LABEL: trunc_ssat_v2i64_v2i32: +; AVX: # %bb.0: +; AVX-NEXT: vmovdqa {{.*#+}} xmm1 = [2147483647,2147483647] +; AVX-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm2 +; AVX-NEXT: vblendvpd %xmm2, %xmm0, %xmm1, %xmm0 +; AVX-NEXT: vmovdqa {{.*#+}} xmm1 = [18446744071562067968,18446744071562067968] +; AVX-NEXT: vpcmpgtq %xmm1, %xmm0, %xmm2 +; AVX-NEXT: vblendvpd %xmm2, %xmm0, %xmm1, %xmm0 +; AVX-NEXT: vpermilps {{.*#+}} xmm0 = xmm0[0,2,2,3] +; AVX-NEXT: retq +; +; AVX512F-LABEL: trunc_ssat_v2i64_v2i32: +; AVX512F: # %bb.0: +; AVX512F-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 +; AVX512F-NEXT: vmovdqa {{.*#+}} xmm1 = [2147483647,2147483647] +; AVX512F-NEXT: vpminsq %zmm1, %zmm0, %zmm0 +; AVX512F-NEXT: vmovdqa {{.*#+}} xmm1 = [18446744071562067968,18446744071562067968] +; AVX512F-NEXT: vpmaxsq %zmm1, %zmm0, %zmm0 +; AVX512F-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3] +; AVX512F-NEXT: vzeroupper +; AVX512F-NEXT: retq +; +; AVX512VL-LABEL: trunc_ssat_v2i64_v2i32: +; AVX512VL: # %bb.0: +; AVX512VL-NEXT: vpminsq {{.*}}(%rip), %xmm0, %xmm0 +; AVX512VL-NEXT: vpmaxsq {{.*}}(%rip), %xmm0, %xmm0 +; AVX512VL-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3] +; AVX512VL-NEXT: retq +; +; AVX512BW-LABEL: trunc_ssat_v2i64_v2i32: +; AVX512BW: # %bb.0: +; AVX512BW-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 +; AVX512BW-NEXT: vmovdqa {{.*#+}} xmm1 = [2147483647,2147483647] +; AVX512BW-NEXT: vpminsq %zmm1, %zmm0, %zmm0 +; AVX512BW-NEXT: vmovdqa {{.*#+}} xmm1 = [18446744071562067968,18446744071562067968] +; AVX512BW-NEXT: vpmaxsq %zmm1, %zmm0, %zmm0 +; 
AVX512BW-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3] +; AVX512BW-NEXT: vzeroupper +; AVX512BW-NEXT: retq +; +; AVX512BWVL-LABEL: trunc_ssat_v2i64_v2i32: +; AVX512BWVL: # %bb.0: +; AVX512BWVL-NEXT: vpminsq {{.*}}(%rip), %xmm0, %xmm0 +; AVX512BWVL-NEXT: vpmaxsq {{.*}}(%rip), %xmm0, %xmm0 +; AVX512BWVL-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3] +; AVX512BWVL-NEXT: retq +; +; SKX-LABEL: trunc_ssat_v2i64_v2i32: +; SKX: # %bb.0: +; SKX-NEXT: vpminsq {{.*}}(%rip), %xmm0, %xmm0 +; SKX-NEXT: vpmaxsq {{.*}}(%rip), %xmm0, %xmm0 +; SKX-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3] +; SKX-NEXT: retq + %1 = icmp slt <2 x i64> %a0, + %2 = select <2 x i1> %1, <2 x i64> %a0, <2 x i64> + %3 = icmp sgt <2 x i64> %2, + %4 = select <2 x i1> %3, <2 x i64> %2, <2 x i64> + %5 = trunc <2 x i64> %4 to <2 x i32> + ret <2 x i32> %5 +} + +define void @trunc_ssat_v2i64_v2i32_store(<2 x i64> %a0, <2 x i32>* %p1) { +; SSE2-LABEL: trunc_ssat_v2i64_v2i32_store: +; SSE2: # %bb.0: +; SSE2-NEXT: movdqa {{.*#+}} xmm1 = [2147483648,2147483648] +; SSE2-NEXT: movdqa %xmm0, %xmm2 +; SSE2-NEXT: pxor %xmm1, %xmm2 +; SSE2-NEXT: movdqa {{.*#+}} xmm3 = [4294967295,4294967295] +; SSE2-NEXT: movdqa %xmm3, %xmm4 +; SSE2-NEXT: pcmpgtd %xmm2, %xmm4 +; SSE2-NEXT: pshufd {{.*#+}} xmm5 = xmm4[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm3, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] +; SSE2-NEXT: pand %xmm5, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm4[1,1,3,3] +; SSE2-NEXT: por %xmm2, %xmm3 +; SSE2-NEXT: pand %xmm3, %xmm0 +; SSE2-NEXT: pandn {{.*}}(%rip), %xmm3 +; SSE2-NEXT: por %xmm0, %xmm3 +; SSE2-NEXT: pxor %xmm3, %xmm1 +; SSE2-NEXT: movdqa {{.*#+}} xmm0 = [18446744069414584320,18446744069414584320] +; SSE2-NEXT: movdqa %xmm1, %xmm2 +; SSE2-NEXT: pcmpgtd %xmm0, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm2[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm0, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm1[1,1,3,3] +; SSE2-NEXT: pand %xmm4, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm2[1,1,3,3] +; SSE2-NEXT: por %xmm0, 
%xmm1 +; SSE2-NEXT: pand %xmm1, %xmm3 +; SSE2-NEXT: pandn {{.*}}(%rip), %xmm1 +; SSE2-NEXT: por %xmm3, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm1[0,2,2,3] +; SSE2-NEXT: movq %xmm0, (%rdi) +; SSE2-NEXT: retq +; +; SSSE3-LABEL: trunc_ssat_v2i64_v2i32_store: +; SSSE3: # %bb.0: +; SSSE3-NEXT: movdqa {{.*#+}} xmm1 = [2147483648,2147483648] +; SSSE3-NEXT: movdqa %xmm0, %xmm2 +; SSSE3-NEXT: pxor %xmm1, %xmm2 +; SSSE3-NEXT: movdqa {{.*#+}} xmm3 = [4294967295,4294967295] +; SSSE3-NEXT: movdqa %xmm3, %xmm4 +; SSSE3-NEXT: pcmpgtd %xmm2, %xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm4[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm3, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] +; SSSE3-NEXT: pand %xmm5, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm4[1,1,3,3] +; SSSE3-NEXT: por %xmm2, %xmm3 +; SSSE3-NEXT: pand %xmm3, %xmm0 +; SSSE3-NEXT: pandn {{.*}}(%rip), %xmm3 +; SSSE3-NEXT: por %xmm0, %xmm3 +; SSSE3-NEXT: pxor %xmm3, %xmm1 +; SSSE3-NEXT: movdqa {{.*#+}} xmm0 = [18446744069414584320,18446744069414584320] +; SSSE3-NEXT: movdqa %xmm1, %xmm2 +; SSSE3-NEXT: pcmpgtd %xmm0, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm2[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm0, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm1[1,1,3,3] +; SSSE3-NEXT: pand %xmm4, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm2[1,1,3,3] +; SSSE3-NEXT: por %xmm0, %xmm1 +; SSSE3-NEXT: pand %xmm1, %xmm3 +; SSSE3-NEXT: pandn {{.*}}(%rip), %xmm1 +; SSSE3-NEXT: por %xmm3, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm1[0,2,2,3] +; SSSE3-NEXT: movq %xmm0, (%rdi) +; SSSE3-NEXT: retq +; +; SSE41-LABEL: trunc_ssat_v2i64_v2i32_store: +; SSE41: # %bb.0: +; SSE41-NEXT: movdqa %xmm0, %xmm1 +; SSE41-NEXT: movapd {{.*#+}} xmm2 = [2147483647,2147483647] +; SSE41-NEXT: movdqa {{.*#+}} xmm3 = [2147483648,2147483648] +; SSE41-NEXT: pxor %xmm3, %xmm0 +; SSE41-NEXT: movdqa {{.*#+}} xmm4 = [4294967295,4294967295] +; SSE41-NEXT: movdqa %xmm4, %xmm5 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm5 +; SSE41-NEXT: pcmpgtd %xmm0, %xmm4 +; 
SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm4[0,0,2,2] +; SSE41-NEXT: pand %xmm5, %xmm0 +; SSE41-NEXT: por %xmm4, %xmm0 +; SSE41-NEXT: blendvpd %xmm0, %xmm1, %xmm2 +; SSE41-NEXT: movapd {{.*#+}} xmm1 = [18446744071562067968,18446744071562067968] +; SSE41-NEXT: pxor %xmm2, %xmm3 +; SSE41-NEXT: movdqa {{.*#+}} xmm0 = [18446744069414584320,18446744069414584320] +; SSE41-NEXT: movdqa %xmm3, %xmm4 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm4 +; SSE41-NEXT: pcmpgtd %xmm0, %xmm3 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm3[0,0,2,2] +; SSE41-NEXT: pand %xmm4, %xmm0 +; SSE41-NEXT: por %xmm3, %xmm0 +; SSE41-NEXT: blendvpd %xmm0, %xmm2, %xmm1 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm1[0,2,2,3] +; SSE41-NEXT: movq %xmm0, (%rdi) +; SSE41-NEXT: retq +; +; AVX-LABEL: trunc_ssat_v2i64_v2i32_store: +; AVX: # %bb.0: +; AVX-NEXT: vmovdqa {{.*#+}} xmm1 = [2147483647,2147483647] +; AVX-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm2 +; AVX-NEXT: vblendvpd %xmm2, %xmm0, %xmm1, %xmm0 +; AVX-NEXT: vmovdqa {{.*#+}} xmm1 = [18446744071562067968,18446744071562067968] +; AVX-NEXT: vpcmpgtq %xmm1, %xmm0, %xmm2 +; AVX-NEXT: vblendvpd %xmm2, %xmm0, %xmm1, %xmm0 +; AVX-NEXT: vpermilps {{.*#+}} xmm0 = xmm0[0,2,2,3] +; AVX-NEXT: vmovlpd %xmm0, (%rdi) +; AVX-NEXT: retq +; +; AVX512F-LABEL: trunc_ssat_v2i64_v2i32_store: +; AVX512F: # %bb.0: +; AVX512F-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 +; AVX512F-NEXT: vmovdqa {{.*#+}} xmm1 = [2147483647,2147483647] +; AVX512F-NEXT: vpminsq %zmm1, %zmm0, %zmm0 +; AVX512F-NEXT: vmovdqa {{.*#+}} xmm1 = [18446744071562067968,18446744071562067968] +; AVX512F-NEXT: vpmaxsq %zmm1, %zmm0, %zmm0 +; AVX512F-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3] +; AVX512F-NEXT: vmovq %xmm0, (%rdi) +; AVX512F-NEXT: vzeroupper +; AVX512F-NEXT: retq +; +; AVX512VL-LABEL: trunc_ssat_v2i64_v2i32_store: +; AVX512VL: # %bb.0: +; AVX512VL-NEXT: vpmovsqd %xmm0, (%rdi) +; AVX512VL-NEXT: retq +; +; AVX512BW-LABEL: trunc_ssat_v2i64_v2i32_store: +; AVX512BW: # %bb.0: +; AVX512BW-NEXT: # kill: def $xmm0 killed $xmm0 
def $zmm0 +; AVX512BW-NEXT: vmovdqa {{.*#+}} xmm1 = [2147483647,2147483647] +; AVX512BW-NEXT: vpminsq %zmm1, %zmm0, %zmm0 +; AVX512BW-NEXT: vmovdqa {{.*#+}} xmm1 = [18446744071562067968,18446744071562067968] +; AVX512BW-NEXT: vpmaxsq %zmm1, %zmm0, %zmm0 +; AVX512BW-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3] +; AVX512BW-NEXT: vmovq %xmm0, (%rdi) +; AVX512BW-NEXT: vzeroupper +; AVX512BW-NEXT: retq +; +; AVX512BWVL-LABEL: trunc_ssat_v2i64_v2i32_store: +; AVX512BWVL: # %bb.0: +; AVX512BWVL-NEXT: vpmovsqd %xmm0, (%rdi) +; AVX512BWVL-NEXT: retq +; +; SKX-LABEL: trunc_ssat_v2i64_v2i32_store: +; SKX: # %bb.0: +; SKX-NEXT: vpmovsqd %xmm0, (%rdi) +; SKX-NEXT: retq + %1 = icmp slt <2 x i64> %a0, + %2 = select <2 x i1> %1, <2 x i64> %a0, <2 x i64> + %3 = icmp sgt <2 x i64> %2, + %4 = select <2 x i1> %3, <2 x i64> %2, <2 x i64> + %5 = trunc <2 x i64> %4 to <2 x i32> + store <2 x i32> %5, <2 x i32>* %p1 + ret void +} + define <4 x i32> @trunc_ssat_v4i64_v4i32(<4 x i64> %a0) { ; SSE2-LABEL: trunc_ssat_v4i64_v4i32: ; SSE2: # %bb.0: @@ -707,6 +1018,375 @@ define <8 x i32> @trunc_ssat_v8i64_v8i32 ; Signed saturation truncation to vXi16 ; +define <2 x i16> @trunc_ssat_v2i64_v2i16(<2 x i64> %a0) { +; SSE2-LABEL: trunc_ssat_v2i64_v2i16: +; SSE2: # %bb.0: +; SSE2-NEXT: movdqa {{.*#+}} xmm1 = [2147483648,2147483648] +; SSE2-NEXT: movdqa %xmm0, %xmm2 +; SSE2-NEXT: pxor %xmm1, %xmm2 +; SSE2-NEXT: movdqa {{.*#+}} xmm3 = [2147516415,2147516415] +; SSE2-NEXT: movdqa %xmm3, %xmm4 +; SSE2-NEXT: pcmpgtd %xmm2, %xmm4 +; SSE2-NEXT: pshufd {{.*#+}} xmm5 = xmm4[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm3, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] +; SSE2-NEXT: pand %xmm5, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm4[1,1,3,3] +; SSE2-NEXT: por %xmm2, %xmm3 +; SSE2-NEXT: pand %xmm3, %xmm0 +; SSE2-NEXT: pandn {{.*}}(%rip), %xmm3 +; SSE2-NEXT: por %xmm0, %xmm3 +; SSE2-NEXT: pxor %xmm3, %xmm1 +; SSE2-NEXT: movdqa {{.*#+}} xmm0 = [18446744071562035200,18446744071562035200] +; SSE2-NEXT: movdqa 
%xmm1, %xmm2 +; SSE2-NEXT: pcmpgtd %xmm0, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm2[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm0, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm1[1,1,3,3] +; SSE2-NEXT: pand %xmm4, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm2[1,1,3,3] +; SSE2-NEXT: por %xmm0, %xmm1 +; SSE2-NEXT: pand %xmm1, %xmm3 +; SSE2-NEXT: pandn {{.*}}(%rip), %xmm1 +; SSE2-NEXT: por %xmm3, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm1[0,2,2,3] +; SSE2-NEXT: pshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7] +; SSE2-NEXT: retq +; +; SSSE3-LABEL: trunc_ssat_v2i64_v2i16: +; SSSE3: # %bb.0: +; SSSE3-NEXT: movdqa {{.*#+}} xmm1 = [2147483648,2147483648] +; SSSE3-NEXT: movdqa %xmm0, %xmm2 +; SSSE3-NEXT: pxor %xmm1, %xmm2 +; SSSE3-NEXT: movdqa {{.*#+}} xmm3 = [2147516415,2147516415] +; SSSE3-NEXT: movdqa %xmm3, %xmm4 +; SSSE3-NEXT: pcmpgtd %xmm2, %xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm4[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm3, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] +; SSSE3-NEXT: pand %xmm5, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm4[1,1,3,3] +; SSSE3-NEXT: por %xmm2, %xmm3 +; SSSE3-NEXT: pand %xmm3, %xmm0 +; SSSE3-NEXT: pandn {{.*}}(%rip), %xmm3 +; SSSE3-NEXT: por %xmm0, %xmm3 +; SSSE3-NEXT: pxor %xmm3, %xmm1 +; SSSE3-NEXT: movdqa {{.*#+}} xmm0 = [18446744071562035200,18446744071562035200] +; SSSE3-NEXT: movdqa %xmm1, %xmm2 +; SSSE3-NEXT: pcmpgtd %xmm0, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm2[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm0, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm1[1,1,3,3] +; SSSE3-NEXT: pand %xmm4, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm2[1,1,3,3] +; SSSE3-NEXT: por %xmm0, %xmm1 +; SSSE3-NEXT: pand %xmm1, %xmm3 +; SSSE3-NEXT: pandn {{.*}}(%rip), %xmm1 +; SSSE3-NEXT: por %xmm3, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm1[0,2,2,3] +; SSSE3-NEXT: pshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7] +; SSSE3-NEXT: retq +; +; SSE41-LABEL: trunc_ssat_v2i64_v2i16: +; SSE41: # %bb.0: +; SSE41-NEXT: movdqa %xmm0, %xmm1 
+; SSE41-NEXT: movapd {{.*#+}} xmm2 = [32767,32767] +; SSE41-NEXT: movdqa {{.*#+}} xmm3 = [2147483648,2147483648] +; SSE41-NEXT: pxor %xmm3, %xmm0 +; SSE41-NEXT: movdqa {{.*#+}} xmm4 = [2147516415,2147516415] +; SSE41-NEXT: movdqa %xmm4, %xmm5 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm5 +; SSE41-NEXT: pcmpgtd %xmm0, %xmm4 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm4[0,0,2,2] +; SSE41-NEXT: pand %xmm5, %xmm0 +; SSE41-NEXT: por %xmm4, %xmm0 +; SSE41-NEXT: blendvpd %xmm0, %xmm1, %xmm2 +; SSE41-NEXT: movapd {{.*#+}} xmm1 = [18446744073709518848,18446744073709518848] +; SSE41-NEXT: pxor %xmm2, %xmm3 +; SSE41-NEXT: movdqa {{.*#+}} xmm0 = [18446744071562035200,18446744071562035200] +; SSE41-NEXT: movdqa %xmm3, %xmm4 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm4 +; SSE41-NEXT: pcmpgtd %xmm0, %xmm3 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm3[0,0,2,2] +; SSE41-NEXT: pand %xmm4, %xmm0 +; SSE41-NEXT: por %xmm3, %xmm0 +; SSE41-NEXT: blendvpd %xmm0, %xmm2, %xmm1 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm1[0,2,2,3] +; SSE41-NEXT: pshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7] +; SSE41-NEXT: retq +; +; AVX1-LABEL: trunc_ssat_v2i64_v2i16: +; AVX1: # %bb.0: +; AVX1-NEXT: vmovdqa {{.*#+}} xmm1 = [32767,32767] +; AVX1-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm2 +; AVX1-NEXT: vblendvpd %xmm2, %xmm0, %xmm1, %xmm0 +; AVX1-NEXT: vmovdqa {{.*#+}} xmm1 = [18446744073709518848,18446744073709518848] +; AVX1-NEXT: vpcmpgtq %xmm1, %xmm0, %xmm2 +; AVX1-NEXT: vblendvpd %xmm2, %xmm0, %xmm1, %xmm0 +; AVX1-NEXT: vpermilps {{.*#+}} xmm0 = xmm0[0,2,2,3] +; AVX1-NEXT: vpshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7] +; AVX1-NEXT: retq +; +; AVX2-SLOW-LABEL: trunc_ssat_v2i64_v2i16: +; AVX2-SLOW: # %bb.0: +; AVX2-SLOW-NEXT: vmovdqa {{.*#+}} xmm1 = [32767,32767] +; AVX2-SLOW-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm2 +; AVX2-SLOW-NEXT: vblendvpd %xmm2, %xmm0, %xmm1, %xmm0 +; AVX2-SLOW-NEXT: vmovdqa {{.*#+}} xmm1 = [18446744073709518848,18446744073709518848] +; AVX2-SLOW-NEXT: vpcmpgtq %xmm1, %xmm0, %xmm2 +; AVX2-SLOW-NEXT: vblendvpd %xmm2, %xmm0, 
%xmm1, %xmm0 +; AVX2-SLOW-NEXT: vpermilps {{.*#+}} xmm0 = xmm0[0,2,2,3] +; AVX2-SLOW-NEXT: vpshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7] +; AVX2-SLOW-NEXT: retq +; +; AVX2-FAST-LABEL: trunc_ssat_v2i64_v2i16: +; AVX2-FAST: # %bb.0: +; AVX2-FAST-NEXT: vmovdqa {{.*#+}} xmm1 = [32767,32767] +; AVX2-FAST-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm2 +; AVX2-FAST-NEXT: vblendvpd %xmm2, %xmm0, %xmm1, %xmm0 +; AVX2-FAST-NEXT: vmovdqa {{.*#+}} xmm1 = [18446744073709518848,18446744073709518848] +; AVX2-FAST-NEXT: vpcmpgtq %xmm1, %xmm0, %xmm2 +; AVX2-FAST-NEXT: vblendvpd %xmm2, %xmm0, %xmm1, %xmm0 +; AVX2-FAST-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,1,8,9,8,9,10,11,8,9,10,11,12,13,14,15] +; AVX2-FAST-NEXT: retq +; +; AVX512F-LABEL: trunc_ssat_v2i64_v2i16: +; AVX512F: # %bb.0: +; AVX512F-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 +; AVX512F-NEXT: vmovdqa {{.*#+}} xmm1 = [32767,32767] +; AVX512F-NEXT: vpminsq %zmm1, %zmm0, %zmm0 +; AVX512F-NEXT: vmovdqa {{.*#+}} xmm1 = [18446744073709518848,18446744073709518848] +; AVX512F-NEXT: vpmaxsq %zmm1, %zmm0, %zmm0 +; AVX512F-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3] +; AVX512F-NEXT: vpshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7] +; AVX512F-NEXT: vzeroupper +; AVX512F-NEXT: retq +; +; AVX512VL-LABEL: trunc_ssat_v2i64_v2i16: +; AVX512VL: # %bb.0: +; AVX512VL-NEXT: vpminsq {{.*}}(%rip), %xmm0, %xmm0 +; AVX512VL-NEXT: vpmaxsq {{.*}}(%rip), %xmm0, %xmm0 +; AVX512VL-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,1,8,9,8,9,10,11,8,9,10,11,12,13,14,15] +; AVX512VL-NEXT: retq +; +; AVX512BW-LABEL: trunc_ssat_v2i64_v2i16: +; AVX512BW: # %bb.0: +; AVX512BW-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 +; AVX512BW-NEXT: vmovdqa {{.*#+}} xmm1 = [32767,32767] +; AVX512BW-NEXT: vpminsq %zmm1, %zmm0, %zmm0 +; AVX512BW-NEXT: vmovdqa {{.*#+}} xmm1 = [18446744073709518848,18446744073709518848] +; AVX512BW-NEXT: vpmaxsq %zmm1, %zmm0, %zmm0 +; AVX512BW-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,1,8,9,8,9,10,11,8,9,10,11,12,13,14,15] +; AVX512BW-NEXT: vzeroupper +; 
AVX512BW-NEXT: retq +; +; AVX512BWVL-LABEL: trunc_ssat_v2i64_v2i16: +; AVX512BWVL: # %bb.0: +; AVX512BWVL-NEXT: vpminsq {{.*}}(%rip), %xmm0, %xmm0 +; AVX512BWVL-NEXT: vpmaxsq {{.*}}(%rip), %xmm0, %xmm0 +; AVX512BWVL-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,1,8,9,8,9,10,11,8,9,10,11,12,13,14,15] +; AVX512BWVL-NEXT: retq +; +; SKX-LABEL: trunc_ssat_v2i64_v2i16: +; SKX: # %bb.0: +; SKX-NEXT: vpminsq {{.*}}(%rip), %xmm0, %xmm0 +; SKX-NEXT: vpmaxsq {{.*}}(%rip), %xmm0, %xmm0 +; SKX-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,1,8,9,8,9,10,11,8,9,10,11,12,13,14,15] +; SKX-NEXT: retq + %1 = icmp slt <2 x i64> %a0, + %2 = select <2 x i1> %1, <2 x i64> %a0, <2 x i64> + %3 = icmp sgt <2 x i64> %2, + %4 = select <2 x i1> %3, <2 x i64> %2, <2 x i64> + %5 = trunc <2 x i64> %4 to <2 x i16> + ret <2 x i16> %5 +} + +define void @trunc_ssat_v2i64_v2i16_store(<2 x i64> %a0, <2 x i16> *%p1) { +; SSE2-LABEL: trunc_ssat_v2i64_v2i16_store: +; SSE2: # %bb.0: +; SSE2-NEXT: movdqa {{.*#+}} xmm1 = [2147483648,2147483648] +; SSE2-NEXT: movdqa %xmm0, %xmm2 +; SSE2-NEXT: pxor %xmm1, %xmm2 +; SSE2-NEXT: movdqa {{.*#+}} xmm3 = [2147516415,2147516415] +; SSE2-NEXT: movdqa %xmm3, %xmm4 +; SSE2-NEXT: pcmpgtd %xmm2, %xmm4 +; SSE2-NEXT: pshufd {{.*#+}} xmm5 = xmm4[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm3, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] +; SSE2-NEXT: pand %xmm5, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm4[1,1,3,3] +; SSE2-NEXT: por %xmm2, %xmm3 +; SSE2-NEXT: pand %xmm3, %xmm0 +; SSE2-NEXT: pandn {{.*}}(%rip), %xmm3 +; SSE2-NEXT: por %xmm0, %xmm3 +; SSE2-NEXT: pxor %xmm3, %xmm1 +; SSE2-NEXT: movdqa {{.*#+}} xmm0 = [18446744071562035200,18446744071562035200] +; SSE2-NEXT: movdqa %xmm1, %xmm2 +; SSE2-NEXT: pcmpgtd %xmm0, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm2[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm0, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm1[1,1,3,3] +; SSE2-NEXT: pand %xmm4, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm2[1,1,3,3] +; SSE2-NEXT: por %xmm0, %xmm1 +; SSE2-NEXT: 
pand %xmm1, %xmm3 +; SSE2-NEXT: pandn {{.*}}(%rip), %xmm1 +; SSE2-NEXT: por %xmm3, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm1[0,2,2,3] +; SSE2-NEXT: pshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7] +; SSE2-NEXT: movd %xmm0, (%rdi) +; SSE2-NEXT: retq +; +; SSSE3-LABEL: trunc_ssat_v2i64_v2i16_store: +; SSSE3: # %bb.0: +; SSSE3-NEXT: movdqa {{.*#+}} xmm1 = [2147483648,2147483648] +; SSSE3-NEXT: movdqa %xmm0, %xmm2 +; SSSE3-NEXT: pxor %xmm1, %xmm2 +; SSSE3-NEXT: movdqa {{.*#+}} xmm3 = [2147516415,2147516415] +; SSSE3-NEXT: movdqa %xmm3, %xmm4 +; SSSE3-NEXT: pcmpgtd %xmm2, %xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm4[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm3, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] +; SSSE3-NEXT: pand %xmm5, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm4[1,1,3,3] +; SSSE3-NEXT: por %xmm2, %xmm3 +; SSSE3-NEXT: pand %xmm3, %xmm0 +; SSSE3-NEXT: pandn {{.*}}(%rip), %xmm3 +; SSSE3-NEXT: por %xmm0, %xmm3 +; SSSE3-NEXT: pxor %xmm3, %xmm1 +; SSSE3-NEXT: movdqa {{.*#+}} xmm0 = [18446744071562035200,18446744071562035200] +; SSSE3-NEXT: movdqa %xmm1, %xmm2 +; SSSE3-NEXT: pcmpgtd %xmm0, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm2[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm0, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm1[1,1,3,3] +; SSSE3-NEXT: pand %xmm4, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm2[1,1,3,3] +; SSSE3-NEXT: por %xmm0, %xmm1 +; SSSE3-NEXT: pand %xmm1, %xmm3 +; SSSE3-NEXT: pandn {{.*}}(%rip), %xmm1 +; SSSE3-NEXT: por %xmm3, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm1[0,2,2,3] +; SSSE3-NEXT: pshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7] +; SSSE3-NEXT: movd %xmm0, (%rdi) +; SSSE3-NEXT: retq +; +; SSE41-LABEL: trunc_ssat_v2i64_v2i16_store: +; SSE41: # %bb.0: +; SSE41-NEXT: movdqa %xmm0, %xmm1 +; SSE41-NEXT: movapd {{.*#+}} xmm2 = [32767,32767] +; SSE41-NEXT: movdqa {{.*#+}} xmm3 = [2147483648,2147483648] +; SSE41-NEXT: pxor %xmm3, %xmm0 +; SSE41-NEXT: movdqa {{.*#+}} xmm4 = [2147516415,2147516415] +; SSE41-NEXT: movdqa 
%xmm4, %xmm5 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm5 +; SSE41-NEXT: pcmpgtd %xmm0, %xmm4 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm4[0,0,2,2] +; SSE41-NEXT: pand %xmm5, %xmm0 +; SSE41-NEXT: por %xmm4, %xmm0 +; SSE41-NEXT: blendvpd %xmm0, %xmm1, %xmm2 +; SSE41-NEXT: movapd {{.*#+}} xmm1 = [18446744073709518848,18446744073709518848] +; SSE41-NEXT: pxor %xmm2, %xmm3 +; SSE41-NEXT: movdqa {{.*#+}} xmm0 = [18446744071562035200,18446744071562035200] +; SSE41-NEXT: movdqa %xmm3, %xmm4 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm4 +; SSE41-NEXT: pcmpgtd %xmm0, %xmm3 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm3[0,0,2,2] +; SSE41-NEXT: pand %xmm4, %xmm0 +; SSE41-NEXT: por %xmm3, %xmm0 +; SSE41-NEXT: blendvpd %xmm0, %xmm2, %xmm1 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm1[0,2,2,3] +; SSE41-NEXT: pshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7] +; SSE41-NEXT: movd %xmm0, (%rdi) +; SSE41-NEXT: retq +; +; AVX1-LABEL: trunc_ssat_v2i64_v2i16_store: +; AVX1: # %bb.0: +; AVX1-NEXT: vmovdqa {{.*#+}} xmm1 = [32767,32767] +; AVX1-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm2 +; AVX1-NEXT: vblendvpd %xmm2, %xmm0, %xmm1, %xmm0 +; AVX1-NEXT: vmovdqa {{.*#+}} xmm1 = [18446744073709518848,18446744073709518848] +; AVX1-NEXT: vpcmpgtq %xmm1, %xmm0, %xmm2 +; AVX1-NEXT: vblendvpd %xmm2, %xmm0, %xmm1, %xmm0 +; AVX1-NEXT: vpermilps {{.*#+}} xmm0 = xmm0[0,2,2,3] +; AVX1-NEXT: vpshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7] +; AVX1-NEXT: vmovd %xmm0, (%rdi) +; AVX1-NEXT: retq +; +; AVX2-SLOW-LABEL: trunc_ssat_v2i64_v2i16_store: +; AVX2-SLOW: # %bb.0: +; AVX2-SLOW-NEXT: vmovdqa {{.*#+}} xmm1 = [32767,32767] +; AVX2-SLOW-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm2 +; AVX2-SLOW-NEXT: vblendvpd %xmm2, %xmm0, %xmm1, %xmm0 +; AVX2-SLOW-NEXT: vmovdqa {{.*#+}} xmm1 = [18446744073709518848,18446744073709518848] +; AVX2-SLOW-NEXT: vpcmpgtq %xmm1, %xmm0, %xmm2 +; AVX2-SLOW-NEXT: vblendvpd %xmm2, %xmm0, %xmm1, %xmm0 +; AVX2-SLOW-NEXT: vpermilps {{.*#+}} xmm0 = xmm0[0,2,2,3] +; AVX2-SLOW-NEXT: vpshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7] +; 
AVX2-SLOW-NEXT: vmovd %xmm0, (%rdi) +; AVX2-SLOW-NEXT: retq +; +; AVX2-FAST-LABEL: trunc_ssat_v2i64_v2i16_store: +; AVX2-FAST: # %bb.0: +; AVX2-FAST-NEXT: vmovdqa {{.*#+}} xmm1 = [32767,32767] +; AVX2-FAST-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm2 +; AVX2-FAST-NEXT: vblendvpd %xmm2, %xmm0, %xmm1, %xmm0 +; AVX2-FAST-NEXT: vmovdqa {{.*#+}} xmm1 = [18446744073709518848,18446744073709518848] +; AVX2-FAST-NEXT: vpcmpgtq %xmm1, %xmm0, %xmm2 +; AVX2-FAST-NEXT: vblendvpd %xmm2, %xmm0, %xmm1, %xmm0 +; AVX2-FAST-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,1,8,9,8,9,10,11,8,9,10,11,12,13,14,15] +; AVX2-FAST-NEXT: vmovd %xmm0, (%rdi) +; AVX2-FAST-NEXT: retq +; +; AVX512F-LABEL: trunc_ssat_v2i64_v2i16_store: +; AVX512F: # %bb.0: +; AVX512F-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 +; AVX512F-NEXT: vmovdqa {{.*#+}} xmm1 = [32767,32767] +; AVX512F-NEXT: vpminsq %zmm1, %zmm0, %zmm0 +; AVX512F-NEXT: vmovdqa {{.*#+}} xmm1 = [18446744073709518848,18446744073709518848] +; AVX512F-NEXT: vpmaxsq %zmm1, %zmm0, %zmm0 +; AVX512F-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3] +; AVX512F-NEXT: vpshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7] +; AVX512F-NEXT: vmovd %xmm0, (%rdi) +; AVX512F-NEXT: vzeroupper +; AVX512F-NEXT: retq +; +; AVX512VL-LABEL: trunc_ssat_v2i64_v2i16_store: +; AVX512VL: # %bb.0: +; AVX512VL-NEXT: vpmovsqw %xmm0, (%rdi) +; AVX512VL-NEXT: retq +; +; AVX512BW-LABEL: trunc_ssat_v2i64_v2i16_store: +; AVX512BW: # %bb.0: +; AVX512BW-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 +; AVX512BW-NEXT: vmovdqa {{.*#+}} xmm1 = [32767,32767] +; AVX512BW-NEXT: vpminsq %zmm1, %zmm0, %zmm0 +; AVX512BW-NEXT: vmovdqa {{.*#+}} xmm1 = [18446744073709518848,18446744073709518848] +; AVX512BW-NEXT: vpmaxsq %zmm1, %zmm0, %zmm0 +; AVX512BW-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,1,8,9,8,9,10,11,8,9,10,11,12,13,14,15] +; AVX512BW-NEXT: vmovd %xmm0, (%rdi) +; AVX512BW-NEXT: vzeroupper +; AVX512BW-NEXT: retq +; +; AVX512BWVL-LABEL: trunc_ssat_v2i64_v2i16_store: +; AVX512BWVL: # %bb.0: +; AVX512BWVL-NEXT: vpmovsqw 
%xmm0, (%rdi) +; AVX512BWVL-NEXT: retq +; +; SKX-LABEL: trunc_ssat_v2i64_v2i16_store: +; SKX: # %bb.0: +; SKX-NEXT: vpmovsqw %xmm0, (%rdi) +; SKX-NEXT: retq + %1 = icmp slt <2 x i64> %a0, + %2 = select <2 x i1> %1, <2 x i64> %a0, <2 x i64> + %3 = icmp sgt <2 x i64> %2, + %4 = select <2 x i1> %3, <2 x i64> %2, <2 x i64> + %5 = trunc <2 x i64> %4 to <2 x i16> + store <2 x i16> %5, <2 x i16> *%p1 + ret void +} + define <4 x i16> @trunc_ssat_v4i64_v4i16(<4 x i64> %a0) { ; SSE2-LABEL: trunc_ssat_v4i64_v4i16: ; SSE2: # %bb.0: @@ -1895,6 +2575,326 @@ define <16 x i16> @trunc_ssat_v16i32_v16 ; Signed saturation truncation to vXi8 ; +define <2 x i8> @trunc_ssat_v2i64_v2i8(<2 x i64> %a0) { +; SSE2-LABEL: trunc_ssat_v2i64_v2i8: +; SSE2: # %bb.0: +; SSE2-NEXT: movdqa {{.*#+}} xmm1 = [2147483648,2147483648] +; SSE2-NEXT: movdqa %xmm0, %xmm2 +; SSE2-NEXT: pxor %xmm1, %xmm2 +; SSE2-NEXT: movdqa {{.*#+}} xmm3 = [2147483775,2147483775] +; SSE2-NEXT: movdqa %xmm3, %xmm4 +; SSE2-NEXT: pcmpgtd %xmm2, %xmm4 +; SSE2-NEXT: pshufd {{.*#+}} xmm5 = xmm4[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm3, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] +; SSE2-NEXT: pand %xmm5, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm4[1,1,3,3] +; SSE2-NEXT: por %xmm2, %xmm3 +; SSE2-NEXT: pand %xmm3, %xmm0 +; SSE2-NEXT: pandn {{.*}}(%rip), %xmm3 +; SSE2-NEXT: por %xmm0, %xmm3 +; SSE2-NEXT: pxor %xmm3, %xmm1 +; SSE2-NEXT: movdqa {{.*#+}} xmm0 = [18446744071562067840,18446744071562067840] +; SSE2-NEXT: movdqa %xmm1, %xmm2 +; SSE2-NEXT: pcmpgtd %xmm0, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm2[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm0, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] +; SSE2-NEXT: pand %xmm4, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm2[1,1,3,3] +; SSE2-NEXT: por %xmm1, %xmm0 +; SSE2-NEXT: pand %xmm0, %xmm3 +; SSE2-NEXT: pandn {{.*}}(%rip), %xmm0 +; SSE2-NEXT: por %xmm3, %xmm0 +; SSE2-NEXT: pand {{.*}}(%rip), %xmm0 +; SSE2-NEXT: packuswb %xmm0, %xmm0 +; SSE2-NEXT: packuswb %xmm0, 
%xmm0 +; SSE2-NEXT: packuswb %xmm0, %xmm0 +; SSE2-NEXT: retq +; +; SSSE3-LABEL: trunc_ssat_v2i64_v2i8: +; SSSE3: # %bb.0: +; SSSE3-NEXT: movdqa {{.*#+}} xmm1 = [2147483648,2147483648] +; SSSE3-NEXT: movdqa %xmm0, %xmm2 +; SSSE3-NEXT: pxor %xmm1, %xmm2 +; SSSE3-NEXT: movdqa {{.*#+}} xmm3 = [2147483775,2147483775] +; SSSE3-NEXT: movdqa %xmm3, %xmm4 +; SSSE3-NEXT: pcmpgtd %xmm2, %xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm4[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm3, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] +; SSSE3-NEXT: pand %xmm5, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm4[1,1,3,3] +; SSSE3-NEXT: por %xmm2, %xmm3 +; SSSE3-NEXT: pand %xmm3, %xmm0 +; SSSE3-NEXT: pandn {{.*}}(%rip), %xmm3 +; SSSE3-NEXT: por %xmm0, %xmm3 +; SSSE3-NEXT: pxor %xmm3, %xmm1 +; SSSE3-NEXT: movdqa {{.*#+}} xmm0 = [18446744071562067840,18446744071562067840] +; SSSE3-NEXT: movdqa %xmm1, %xmm2 +; SSSE3-NEXT: pcmpgtd %xmm0, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm2[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm0, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] +; SSSE3-NEXT: pand %xmm4, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm2[1,1,3,3] +; SSSE3-NEXT: por %xmm1, %xmm0 +; SSSE3-NEXT: pand %xmm0, %xmm3 +; SSSE3-NEXT: pandn {{.*}}(%rip), %xmm0 +; SSSE3-NEXT: por %xmm3, %xmm0 +; SSSE3-NEXT: pshufb {{.*#+}} xmm0 = xmm0[0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u] +; SSSE3-NEXT: retq +; +; SSE41-LABEL: trunc_ssat_v2i64_v2i8: +; SSE41: # %bb.0: +; SSE41-NEXT: movdqa %xmm0, %xmm1 +; SSE41-NEXT: movapd {{.*#+}} xmm2 = [127,127] +; SSE41-NEXT: movdqa {{.*#+}} xmm3 = [2147483648,2147483648] +; SSE41-NEXT: pxor %xmm3, %xmm0 +; SSE41-NEXT: movdqa {{.*#+}} xmm4 = [2147483775,2147483775] +; SSE41-NEXT: movdqa %xmm4, %xmm5 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm5 +; SSE41-NEXT: pcmpgtd %xmm0, %xmm4 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm4[0,0,2,2] +; SSE41-NEXT: pand %xmm5, %xmm0 +; SSE41-NEXT: por %xmm4, %xmm0 +; SSE41-NEXT: blendvpd %xmm0, %xmm1, %xmm2 +; SSE41-NEXT: movapd {{.*#+}} xmm1 = 
[18446744073709551488,18446744073709551488] +; SSE41-NEXT: pxor %xmm2, %xmm3 +; SSE41-NEXT: movdqa {{.*#+}} xmm0 = [18446744071562067840,18446744071562067840] +; SSE41-NEXT: movdqa %xmm3, %xmm4 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm4 +; SSE41-NEXT: pcmpgtd %xmm0, %xmm3 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm3[0,0,2,2] +; SSE41-NEXT: pand %xmm4, %xmm0 +; SSE41-NEXT: por %xmm3, %xmm0 +; SSE41-NEXT: blendvpd %xmm0, %xmm2, %xmm1 +; SSE41-NEXT: pshufb {{.*#+}} xmm1 = xmm1[0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u] +; SSE41-NEXT: movdqa %xmm1, %xmm0 +; SSE41-NEXT: retq +; +; AVX-LABEL: trunc_ssat_v2i64_v2i8: +; AVX: # %bb.0: +; AVX-NEXT: vmovdqa {{.*#+}} xmm1 = [127,127] +; AVX-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm2 +; AVX-NEXT: vblendvpd %xmm2, %xmm0, %xmm1, %xmm0 +; AVX-NEXT: vmovdqa {{.*#+}} xmm1 = [18446744073709551488,18446744073709551488] +; AVX-NEXT: vpcmpgtq %xmm1, %xmm0, %xmm2 +; AVX-NEXT: vblendvpd %xmm2, %xmm0, %xmm1, %xmm0 +; AVX-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX-NEXT: retq +; +; AVX512F-LABEL: trunc_ssat_v2i64_v2i8: +; AVX512F: # %bb.0: +; AVX512F-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 +; AVX512F-NEXT: vmovdqa {{.*#+}} xmm1 = [127,127] +; AVX512F-NEXT: vpminsq %zmm1, %zmm0, %zmm0 +; AVX512F-NEXT: vmovdqa {{.*#+}} xmm1 = [18446744073709551488,18446744073709551488] +; AVX512F-NEXT: vpmaxsq %zmm1, %zmm0, %zmm0 +; AVX512F-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX512F-NEXT: vzeroupper +; AVX512F-NEXT: retq +; +; AVX512VL-LABEL: trunc_ssat_v2i64_v2i8: +; AVX512VL: # %bb.0: +; AVX512VL-NEXT: vpminsq {{.*}}(%rip), %xmm0, %xmm0 +; AVX512VL-NEXT: vpmaxsq {{.*}}(%rip), %xmm0, %xmm0 +; AVX512VL-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX512VL-NEXT: retq +; +; AVX512BW-LABEL: trunc_ssat_v2i64_v2i8: +; AVX512BW: # %bb.0: +; AVX512BW-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 +; AVX512BW-NEXT: vmovdqa {{.*#+}} xmm1 = [127,127] +; AVX512BW-NEXT: vpminsq %zmm1, %zmm0, %zmm0 
+; AVX512BW-NEXT: vmovdqa {{.*#+}} xmm1 = [18446744073709551488,18446744073709551488] +; AVX512BW-NEXT: vpmaxsq %zmm1, %zmm0, %zmm0 +; AVX512BW-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX512BW-NEXT: vzeroupper +; AVX512BW-NEXT: retq +; +; AVX512BWVL-LABEL: trunc_ssat_v2i64_v2i8: +; AVX512BWVL: # %bb.0: +; AVX512BWVL-NEXT: vpminsq {{.*}}(%rip), %xmm0, %xmm0 +; AVX512BWVL-NEXT: vpmaxsq {{.*}}(%rip), %xmm0, %xmm0 +; AVX512BWVL-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX512BWVL-NEXT: retq +; +; SKX-LABEL: trunc_ssat_v2i64_v2i8: +; SKX: # %bb.0: +; SKX-NEXT: vpminsq {{.*}}(%rip), %xmm0, %xmm0 +; SKX-NEXT: vpmaxsq {{.*}}(%rip), %xmm0, %xmm0 +; SKX-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u] +; SKX-NEXT: retq + %1 = icmp slt <2 x i64> %a0, + %2 = select <2 x i1> %1, <2 x i64> %a0, <2 x i64> + %3 = icmp sgt <2 x i64> %2, + %4 = select <2 x i1> %3, <2 x i64> %2, <2 x i64> + %5 = trunc <2 x i64> %4 to <2 x i8> + ret <2 x i8> %5 +} + +define void @trunc_ssat_v2i64_v2i8_store(<2 x i64> %a0, <2 x i8> *%p1) { +; SSE2-LABEL: trunc_ssat_v2i64_v2i8_store: +; SSE2: # %bb.0: +; SSE2-NEXT: movdqa {{.*#+}} xmm1 = [2147483648,2147483648] +; SSE2-NEXT: movdqa %xmm0, %xmm2 +; SSE2-NEXT: pxor %xmm1, %xmm2 +; SSE2-NEXT: movdqa {{.*#+}} xmm3 = [2147483775,2147483775] +; SSE2-NEXT: movdqa %xmm3, %xmm4 +; SSE2-NEXT: pcmpgtd %xmm2, %xmm4 +; SSE2-NEXT: pshufd {{.*#+}} xmm5 = xmm4[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm3, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] +; SSE2-NEXT: pand %xmm5, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm3 = xmm4[1,1,3,3] +; SSE2-NEXT: por %xmm2, %xmm3 +; SSE2-NEXT: pand %xmm3, %xmm0 +; SSE2-NEXT: pandn {{.*}}(%rip), %xmm3 +; SSE2-NEXT: por %xmm0, %xmm3 +; SSE2-NEXT: pxor %xmm3, %xmm1 +; SSE2-NEXT: movdqa {{.*#+}} xmm0 = [18446744071562067840,18446744071562067840] +; SSE2-NEXT: movdqa %xmm1, %xmm2 +; SSE2-NEXT: pcmpgtd %xmm0, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm4 = 
xmm2[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm0, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm1[1,1,3,3] +; SSE2-NEXT: pand %xmm4, %xmm0 +; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm2[1,1,3,3] +; SSE2-NEXT: por %xmm0, %xmm1 +; SSE2-NEXT: pand %xmm1, %xmm3 +; SSE2-NEXT: pandn {{.*}}(%rip), %xmm1 +; SSE2-NEXT: por %xmm3, %xmm1 +; SSE2-NEXT: pand {{.*}}(%rip), %xmm1 +; SSE2-NEXT: packuswb %xmm1, %xmm1 +; SSE2-NEXT: packuswb %xmm0, %xmm1 +; SSE2-NEXT: packuswb %xmm0, %xmm1 +; SSE2-NEXT: movd %xmm1, %eax +; SSE2-NEXT: movw %ax, (%rdi) +; SSE2-NEXT: retq +; +; SSSE3-LABEL: trunc_ssat_v2i64_v2i8_store: +; SSSE3: # %bb.0: +; SSSE3-NEXT: movdqa {{.*#+}} xmm1 = [2147483648,2147483648] +; SSSE3-NEXT: movdqa %xmm0, %xmm2 +; SSSE3-NEXT: pxor %xmm1, %xmm2 +; SSSE3-NEXT: movdqa {{.*#+}} xmm3 = [2147483775,2147483775] +; SSSE3-NEXT: movdqa %xmm3, %xmm4 +; SSSE3-NEXT: pcmpgtd %xmm2, %xmm4 +; SSSE3-NEXT: pshufd {{.*#+}} xmm5 = xmm4[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm3, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm2[1,1,3,3] +; SSSE3-NEXT: pand %xmm5, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm3 = xmm4[1,1,3,3] +; SSSE3-NEXT: por %xmm2, %xmm3 +; SSSE3-NEXT: pand %xmm3, %xmm0 +; SSSE3-NEXT: pandn {{.*}}(%rip), %xmm3 +; SSSE3-NEXT: por %xmm0, %xmm3 +; SSSE3-NEXT: pxor %xmm3, %xmm1 +; SSSE3-NEXT: movdqa {{.*#+}} xmm0 = [18446744071562067840,18446744071562067840] +; SSSE3-NEXT: movdqa %xmm1, %xmm2 +; SSSE3-NEXT: pcmpgtd %xmm0, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm2[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm0, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm1[1,1,3,3] +; SSSE3-NEXT: pand %xmm4, %xmm0 +; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm2[1,1,3,3] +; SSSE3-NEXT: por %xmm0, %xmm1 +; SSSE3-NEXT: pand %xmm1, %xmm3 +; SSSE3-NEXT: pandn {{.*}}(%rip), %xmm1 +; SSSE3-NEXT: por %xmm3, %xmm1 +; SSSE3-NEXT: pshufb {{.*#+}} xmm1 = xmm1[0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u] +; SSSE3-NEXT: movd %xmm1, %eax +; SSSE3-NEXT: movw %ax, (%rdi) +; SSSE3-NEXT: retq +; +; SSE41-LABEL: trunc_ssat_v2i64_v2i8_store: +; 
SSE41: # %bb.0: +; SSE41-NEXT: movdqa %xmm0, %xmm1 +; SSE41-NEXT: movapd {{.*#+}} xmm2 = [127,127] +; SSE41-NEXT: movdqa {{.*#+}} xmm3 = [2147483648,2147483648] +; SSE41-NEXT: pxor %xmm3, %xmm0 +; SSE41-NEXT: movdqa {{.*#+}} xmm4 = [2147483775,2147483775] +; SSE41-NEXT: movdqa %xmm4, %xmm5 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm5 +; SSE41-NEXT: pcmpgtd %xmm0, %xmm4 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm4[0,0,2,2] +; SSE41-NEXT: pand %xmm5, %xmm0 +; SSE41-NEXT: por %xmm4, %xmm0 +; SSE41-NEXT: blendvpd %xmm0, %xmm1, %xmm2 +; SSE41-NEXT: movapd {{.*#+}} xmm1 = [18446744073709551488,18446744073709551488] +; SSE41-NEXT: pxor %xmm2, %xmm3 +; SSE41-NEXT: movdqa {{.*#+}} xmm0 = [18446744071562067840,18446744071562067840] +; SSE41-NEXT: movdqa %xmm3, %xmm4 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm4 +; SSE41-NEXT: pcmpgtd %xmm0, %xmm3 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm3[0,0,2,2] +; SSE41-NEXT: pand %xmm4, %xmm0 +; SSE41-NEXT: por %xmm3, %xmm0 +; SSE41-NEXT: blendvpd %xmm0, %xmm2, %xmm1 +; SSE41-NEXT: pshufb {{.*#+}} xmm1 = xmm1[0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u] +; SSE41-NEXT: pextrw $0, %xmm1, (%rdi) +; SSE41-NEXT: retq +; +; AVX-LABEL: trunc_ssat_v2i64_v2i8_store: +; AVX: # %bb.0: +; AVX-NEXT: vmovdqa {{.*#+}} xmm1 = [127,127] +; AVX-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm2 +; AVX-NEXT: vblendvpd %xmm2, %xmm0, %xmm1, %xmm0 +; AVX-NEXT: vmovdqa {{.*#+}} xmm1 = [18446744073709551488,18446744073709551488] +; AVX-NEXT: vpcmpgtq %xmm1, %xmm0, %xmm2 +; AVX-NEXT: vblendvpd %xmm2, %xmm0, %xmm1, %xmm0 +; AVX-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX-NEXT: vpextrw $0, %xmm0, (%rdi) +; AVX-NEXT: retq +; +; AVX512F-LABEL: trunc_ssat_v2i64_v2i8_store: +; AVX512F: # %bb.0: +; AVX512F-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 +; AVX512F-NEXT: vmovdqa {{.*#+}} xmm1 = [127,127] +; AVX512F-NEXT: vpminsq %zmm1, %zmm0, %zmm0 +; AVX512F-NEXT: vmovdqa {{.*#+}} xmm1 = [18446744073709551488,18446744073709551488] +; AVX512F-NEXT: vpmaxsq %zmm1, %zmm0, %zmm0 +; 
AVX512F-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX512F-NEXT: vpextrw $0, %xmm0, (%rdi) +; AVX512F-NEXT: vzeroupper +; AVX512F-NEXT: retq +; +; AVX512VL-LABEL: trunc_ssat_v2i64_v2i8_store: +; AVX512VL: # %bb.0: +; AVX512VL-NEXT: vpmovsqb %xmm0, (%rdi) +; AVX512VL-NEXT: retq +; +; AVX512BW-LABEL: trunc_ssat_v2i64_v2i8_store: +; AVX512BW: # %bb.0: +; AVX512BW-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 +; AVX512BW-NEXT: vmovdqa {{.*#+}} xmm1 = [127,127] +; AVX512BW-NEXT: vpminsq %zmm1, %zmm0, %zmm0 +; AVX512BW-NEXT: vmovdqa {{.*#+}} xmm1 = [18446744073709551488,18446744073709551488] +; AVX512BW-NEXT: vpmaxsq %zmm1, %zmm0, %zmm0 +; AVX512BW-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX512BW-NEXT: vpextrw $0, %xmm0, (%rdi) +; AVX512BW-NEXT: vzeroupper +; AVX512BW-NEXT: retq +; +; AVX512BWVL-LABEL: trunc_ssat_v2i64_v2i8_store: +; AVX512BWVL: # %bb.0: +; AVX512BWVL-NEXT: vpmovsqb %xmm0, (%rdi) +; AVX512BWVL-NEXT: retq +; +; SKX-LABEL: trunc_ssat_v2i64_v2i8_store: +; SKX: # %bb.0: +; SKX-NEXT: vpmovsqb %xmm0, (%rdi) +; SKX-NEXT: retq + %1 = icmp slt <2 x i64> %a0, + %2 = select <2 x i1> %1, <2 x i64> %a0, <2 x i64> + %3 = icmp sgt <2 x i64> %2, + %4 = select <2 x i1> %3, <2 x i64> %2, <2 x i64> + %5 = trunc <2 x i64> %4 to <2 x i8> + store <2 x i8> %5, <2 x i8> *%p1 + ret void +} + define <4 x i8> @trunc_ssat_v4i64_v4i8(<4 x i64> %a0) { ; SSE2-LABEL: trunc_ssat_v4i64_v4i8: ; SSE2: # %bb.0: Modified: llvm/trunk/test/CodeGen/X86/vector-trunc-usat.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/vector-trunc-usat.ll?rev=374704&r1=374703&r2=374704&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/vector-trunc-usat.ll (original) +++ llvm/trunk/test/CodeGen/X86/vector-trunc-usat.ll Sat Oct 12 22:47:42 2019 @@ -15,6 +15,224 @@ ; Unsigned saturation truncation to vXi32 ; +define <2 x i32> @trunc_usat_v2i64_v2i32(<2 x i64> 
%a0) { +; SSE2-LABEL: trunc_usat_v2i64_v2i32: +; SSE2: # %bb.0: +; SSE2-NEXT: movdqa {{.*#+}} xmm1 = [9223372039002259456,9223372039002259456] +; SSE2-NEXT: pxor %xmm0, %xmm1 +; SSE2-NEXT: movdqa {{.*#+}} xmm2 = [9223372039002259455,9223372039002259455] +; SSE2-NEXT: movdqa %xmm2, %xmm3 +; SSE2-NEXT: pcmpgtd %xmm1, %xmm3 +; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm3[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm2, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] +; SSE2-NEXT: pand %xmm4, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm3[1,1,3,3] +; SSE2-NEXT: por %xmm1, %xmm2 +; SSE2-NEXT: pand %xmm2, %xmm0 +; SSE2-NEXT: pandn {{.*}}(%rip), %xmm2 +; SSE2-NEXT: por %xmm0, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm2[0,2,2,3] +; SSE2-NEXT: retq +; +; SSSE3-LABEL: trunc_usat_v2i64_v2i32: +; SSSE3: # %bb.0: +; SSSE3-NEXT: movdqa {{.*#+}} xmm1 = [9223372039002259456,9223372039002259456] +; SSSE3-NEXT: pxor %xmm0, %xmm1 +; SSSE3-NEXT: movdqa {{.*#+}} xmm2 = [9223372039002259455,9223372039002259455] +; SSSE3-NEXT: movdqa %xmm2, %xmm3 +; SSSE3-NEXT: pcmpgtd %xmm1, %xmm3 +; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm3[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm2, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] +; SSSE3-NEXT: pand %xmm4, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm3[1,1,3,3] +; SSSE3-NEXT: por %xmm1, %xmm2 +; SSSE3-NEXT: pand %xmm2, %xmm0 +; SSSE3-NEXT: pandn {{.*}}(%rip), %xmm2 +; SSSE3-NEXT: por %xmm0, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm2[0,2,2,3] +; SSSE3-NEXT: retq +; +; SSE41-LABEL: trunc_usat_v2i64_v2i32: +; SSE41: # %bb.0: +; SSE41-NEXT: movdqa %xmm0, %xmm1 +; SSE41-NEXT: movapd {{.*#+}} xmm2 = [4294967295,4294967295] +; SSE41-NEXT: movdqa {{.*#+}} xmm0 = [9223372039002259456,9223372039002259456] +; SSE41-NEXT: pxor %xmm1, %xmm0 +; SSE41-NEXT: movdqa {{.*#+}} xmm3 = [9223372039002259455,9223372039002259455] +; SSE41-NEXT: movdqa %xmm3, %xmm4 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm4 +; SSE41-NEXT: pcmpgtd %xmm0, %xmm3 +; SSE41-NEXT: pshufd {{.*#+}} 
xmm0 = xmm3[0,0,2,2] +; SSE41-NEXT: pand %xmm4, %xmm0 +; SSE41-NEXT: por %xmm3, %xmm0 +; SSE41-NEXT: blendvpd %xmm0, %xmm1, %xmm2 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm2[0,2,2,3] +; SSE41-NEXT: retq +; +; AVX-LABEL: trunc_usat_v2i64_v2i32: +; AVX: # %bb.0: +; AVX-NEXT: vmovapd {{.*#+}} xmm1 = [4294967295,4294967295] +; AVX-NEXT: vpxor {{.*}}(%rip), %xmm0, %xmm2 +; AVX-NEXT: vmovdqa {{.*#+}} xmm3 = [9223372041149743103,9223372041149743103] +; AVX-NEXT: vpcmpgtq %xmm2, %xmm3, %xmm2 +; AVX-NEXT: vblendvpd %xmm2, %xmm0, %xmm1, %xmm0 +; AVX-NEXT: vpermilps {{.*#+}} xmm0 = xmm0[0,2,2,3] +; AVX-NEXT: retq +; +; AVX512F-LABEL: trunc_usat_v2i64_v2i32: +; AVX512F: # %bb.0: +; AVX512F-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 +; AVX512F-NEXT: vmovdqa {{.*#+}} xmm1 = [4294967295,4294967295] +; AVX512F-NEXT: vpminuq %zmm1, %zmm0, %zmm0 +; AVX512F-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3] +; AVX512F-NEXT: vzeroupper +; AVX512F-NEXT: retq +; +; AVX512VL-LABEL: trunc_usat_v2i64_v2i32: +; AVX512VL: # %bb.0: +; AVX512VL-NEXT: vpminuq {{.*}}(%rip), %xmm0, %xmm0 +; AVX512VL-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3] +; AVX512VL-NEXT: retq +; +; AVX512BW-LABEL: trunc_usat_v2i64_v2i32: +; AVX512BW: # %bb.0: +; AVX512BW-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 +; AVX512BW-NEXT: vmovdqa {{.*#+}} xmm1 = [4294967295,4294967295] +; AVX512BW-NEXT: vpminuq %zmm1, %zmm0, %zmm0 +; AVX512BW-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3] +; AVX512BW-NEXT: vzeroupper +; AVX512BW-NEXT: retq +; +; AVX512BWVL-LABEL: trunc_usat_v2i64_v2i32: +; AVX512BWVL: # %bb.0: +; AVX512BWVL-NEXT: vpminuq {{.*}}(%rip), %xmm0, %xmm0 +; AVX512BWVL-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3] +; AVX512BWVL-NEXT: retq +; +; SKX-LABEL: trunc_usat_v2i64_v2i32: +; SKX: # %bb.0: +; SKX-NEXT: vpminuq {{.*}}(%rip), %xmm0, %xmm0 +; SKX-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3] +; SKX-NEXT: retq + %1 = icmp ult <2 x i64> %a0, + %2 = select <2 x i1> %1, <2 x i64> %a0, <2 x i64> + %3 = trunc <2 x i64> %2 to <2 x 
i32> + ret <2 x i32> %3 +} + +define void @trunc_usat_v2i64_v2i32_store(<2 x i64> %a0, <2 x i32>* %p1) { +; SSE2-LABEL: trunc_usat_v2i64_v2i32_store: +; SSE2: # %bb.0: +; SSE2-NEXT: movdqa {{.*#+}} xmm1 = [9223372039002259456,9223372039002259456] +; SSE2-NEXT: pxor %xmm0, %xmm1 +; SSE2-NEXT: movdqa {{.*#+}} xmm2 = [9223372039002259455,9223372039002259455] +; SSE2-NEXT: movdqa %xmm2, %xmm3 +; SSE2-NEXT: pcmpgtd %xmm1, %xmm3 +; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm3[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm2, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] +; SSE2-NEXT: pand %xmm4, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm3[1,1,3,3] +; SSE2-NEXT: por %xmm1, %xmm2 +; SSE2-NEXT: pand %xmm2, %xmm0 +; SSE2-NEXT: pandn {{.*}}(%rip), %xmm2 +; SSE2-NEXT: por %xmm0, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm2[0,2,2,3] +; SSE2-NEXT: movq %xmm0, (%rdi) +; SSE2-NEXT: retq +; +; SSSE3-LABEL: trunc_usat_v2i64_v2i32_store: +; SSSE3: # %bb.0: +; SSSE3-NEXT: movdqa {{.*#+}} xmm1 = [9223372039002259456,9223372039002259456] +; SSSE3-NEXT: pxor %xmm0, %xmm1 +; SSSE3-NEXT: movdqa {{.*#+}} xmm2 = [9223372039002259455,9223372039002259455] +; SSSE3-NEXT: movdqa %xmm2, %xmm3 +; SSSE3-NEXT: pcmpgtd %xmm1, %xmm3 +; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm3[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm2, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] +; SSSE3-NEXT: pand %xmm4, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm3[1,1,3,3] +; SSSE3-NEXT: por %xmm1, %xmm2 +; SSSE3-NEXT: pand %xmm2, %xmm0 +; SSSE3-NEXT: pandn {{.*}}(%rip), %xmm2 +; SSSE3-NEXT: por %xmm0, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm2[0,2,2,3] +; SSSE3-NEXT: movq %xmm0, (%rdi) +; SSSE3-NEXT: retq +; +; SSE41-LABEL: trunc_usat_v2i64_v2i32_store: +; SSE41: # %bb.0: +; SSE41-NEXT: movdqa %xmm0, %xmm1 +; SSE41-NEXT: movapd {{.*#+}} xmm2 = [4294967295,4294967295] +; SSE41-NEXT: movdqa {{.*#+}} xmm0 = [9223372039002259456,9223372039002259456] +; SSE41-NEXT: pxor %xmm1, %xmm0 +; SSE41-NEXT: movdqa {{.*#+}} xmm3 
= [9223372039002259455,9223372039002259455] +; SSE41-NEXT: movdqa %xmm3, %xmm4 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm4 +; SSE41-NEXT: pcmpgtd %xmm0, %xmm3 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm3[0,0,2,2] +; SSE41-NEXT: pand %xmm4, %xmm0 +; SSE41-NEXT: por %xmm3, %xmm0 +; SSE41-NEXT: blendvpd %xmm0, %xmm1, %xmm2 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm2[0,2,2,3] +; SSE41-NEXT: movq %xmm0, (%rdi) +; SSE41-NEXT: retq +; +; AVX-LABEL: trunc_usat_v2i64_v2i32_store: +; AVX: # %bb.0: +; AVX-NEXT: vmovapd {{.*#+}} xmm1 = [4294967295,4294967295] +; AVX-NEXT: vpxor {{.*}}(%rip), %xmm0, %xmm2 +; AVX-NEXT: vmovdqa {{.*#+}} xmm3 = [9223372041149743103,9223372041149743103] +; AVX-NEXT: vpcmpgtq %xmm2, %xmm3, %xmm2 +; AVX-NEXT: vblendvpd %xmm2, %xmm0, %xmm1, %xmm0 +; AVX-NEXT: vpermilps {{.*#+}} xmm0 = xmm0[0,2,2,3] +; AVX-NEXT: vmovlpd %xmm0, (%rdi) +; AVX-NEXT: retq +; +; AVX512F-LABEL: trunc_usat_v2i64_v2i32_store: +; AVX512F: # %bb.0: +; AVX512F-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 +; AVX512F-NEXT: vmovdqa {{.*#+}} xmm1 = [4294967295,4294967295] +; AVX512F-NEXT: vpminuq %zmm1, %zmm0, %zmm0 +; AVX512F-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3] +; AVX512F-NEXT: vmovq %xmm0, (%rdi) +; AVX512F-NEXT: vzeroupper +; AVX512F-NEXT: retq +; +; AVX512VL-LABEL: trunc_usat_v2i64_v2i32_store: +; AVX512VL: # %bb.0: +; AVX512VL-NEXT: vpmovusqd %xmm0, (%rdi) +; AVX512VL-NEXT: retq +; +; AVX512BW-LABEL: trunc_usat_v2i64_v2i32_store: +; AVX512BW: # %bb.0: +; AVX512BW-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 +; AVX512BW-NEXT: vmovdqa {{.*#+}} xmm1 = [4294967295,4294967295] +; AVX512BW-NEXT: vpminuq %zmm1, %zmm0, %zmm0 +; AVX512BW-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3] +; AVX512BW-NEXT: vmovq %xmm0, (%rdi) +; AVX512BW-NEXT: vzeroupper +; AVX512BW-NEXT: retq +; +; AVX512BWVL-LABEL: trunc_usat_v2i64_v2i32_store: +; AVX512BWVL: # %bb.0: +; AVX512BWVL-NEXT: vpmovusqd %xmm0, (%rdi) +; AVX512BWVL-NEXT: retq +; +; SKX-LABEL: trunc_usat_v2i64_v2i32_store: +; SKX: # %bb.0: +; 
SKX-NEXT: vpmovusqd %xmm0, (%rdi) +; SKX-NEXT: retq + %1 = icmp ult <2 x i64> %a0, + %2 = select <2 x i1> %1, <2 x i64> %a0, <2 x i64> + %3 = trunc <2 x i64> %2 to <2 x i32> + store <2 x i32> %3, <2 x i32>* %p1 + ret void +} + define <4 x i32> @trunc_usat_v4i64_v4i32(<4 x i64> %a0) { ; SSE2-LABEL: trunc_usat_v4i64_v4i32: ; SSE2: # %bb.0: @@ -479,6 +697,278 @@ define <8 x i32> @trunc_usat_v8i64_v8i32 ; Unsigned saturation truncation to vXi16 ; +define <2 x i16> @trunc_usat_v2i64_v2i16(<2 x i64> %a0) { +; SSE2-LABEL: trunc_usat_v2i64_v2i16: +; SSE2: # %bb.0: +; SSE2-NEXT: movdqa {{.*#+}} xmm1 = [9223372039002259456,9223372039002259456] +; SSE2-NEXT: pxor %xmm0, %xmm1 +; SSE2-NEXT: movdqa {{.*#+}} xmm2 = [9223372039002324991,9223372039002324991] +; SSE2-NEXT: movdqa %xmm2, %xmm3 +; SSE2-NEXT: pcmpgtd %xmm1, %xmm3 +; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm3[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm2, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] +; SSE2-NEXT: pand %xmm4, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm3[1,1,3,3] +; SSE2-NEXT: por %xmm1, %xmm2 +; SSE2-NEXT: pand %xmm2, %xmm0 +; SSE2-NEXT: pandn {{.*}}(%rip), %xmm2 +; SSE2-NEXT: por %xmm0, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm2[0,2,2,3] +; SSE2-NEXT: pshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7] +; SSE2-NEXT: retq +; +; SSSE3-LABEL: trunc_usat_v2i64_v2i16: +; SSSE3: # %bb.0: +; SSSE3-NEXT: movdqa {{.*#+}} xmm1 = [9223372039002259456,9223372039002259456] +; SSSE3-NEXT: pxor %xmm0, %xmm1 +; SSSE3-NEXT: movdqa {{.*#+}} xmm2 = [9223372039002324991,9223372039002324991] +; SSSE3-NEXT: movdqa %xmm2, %xmm3 +; SSSE3-NEXT: pcmpgtd %xmm1, %xmm3 +; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm3[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm2, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] +; SSSE3-NEXT: pand %xmm4, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm3[1,1,3,3] +; SSSE3-NEXT: por %xmm1, %xmm2 +; SSSE3-NEXT: pand %xmm2, %xmm0 +; SSSE3-NEXT: pandn {{.*}}(%rip), %xmm2 +; SSSE3-NEXT: por %xmm0, %xmm2 +; 
SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm2[0,2,2,3] +; SSSE3-NEXT: pshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7] +; SSSE3-NEXT: retq +; +; SSE41-LABEL: trunc_usat_v2i64_v2i16: +; SSE41: # %bb.0: +; SSE41-NEXT: movdqa %xmm0, %xmm1 +; SSE41-NEXT: movapd {{.*#+}} xmm2 = [65535,65535] +; SSE41-NEXT: movdqa {{.*#+}} xmm0 = [9223372039002259456,9223372039002259456] +; SSE41-NEXT: pxor %xmm1, %xmm0 +; SSE41-NEXT: movdqa {{.*#+}} xmm3 = [9223372039002324991,9223372039002324991] +; SSE41-NEXT: movdqa %xmm3, %xmm4 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm4 +; SSE41-NEXT: pcmpgtd %xmm0, %xmm3 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm3[0,0,2,2] +; SSE41-NEXT: pand %xmm4, %xmm0 +; SSE41-NEXT: por %xmm3, %xmm0 +; SSE41-NEXT: blendvpd %xmm0, %xmm1, %xmm2 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm2[0,2,2,3] +; SSE41-NEXT: pshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7] +; SSE41-NEXT: retq +; +; AVX1-LABEL: trunc_usat_v2i64_v2i16: +; AVX1: # %bb.0: +; AVX1-NEXT: vmovapd {{.*#+}} xmm1 = [65535,65535] +; AVX1-NEXT: vpxor {{.*}}(%rip), %xmm0, %xmm2 +; AVX1-NEXT: vmovdqa {{.*#+}} xmm3 = [9223372036854841343,9223372036854841343] +; AVX1-NEXT: vpcmpgtq %xmm2, %xmm3, %xmm2 +; AVX1-NEXT: vblendvpd %xmm2, %xmm0, %xmm1, %xmm0 +; AVX1-NEXT: vpermilps {{.*#+}} xmm0 = xmm0[0,2,2,3] +; AVX1-NEXT: vpshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7] +; AVX1-NEXT: retq +; +; AVX2-SLOW-LABEL: trunc_usat_v2i64_v2i16: +; AVX2-SLOW: # %bb.0: +; AVX2-SLOW-NEXT: vmovapd {{.*#+}} xmm1 = [65535,65535] +; AVX2-SLOW-NEXT: vpxor {{.*}}(%rip), %xmm0, %xmm2 +; AVX2-SLOW-NEXT: vmovdqa {{.*#+}} xmm3 = [9223372036854841343,9223372036854841343] +; AVX2-SLOW-NEXT: vpcmpgtq %xmm2, %xmm3, %xmm2 +; AVX2-SLOW-NEXT: vblendvpd %xmm2, %xmm0, %xmm1, %xmm0 +; AVX2-SLOW-NEXT: vpermilps {{.*#+}} xmm0 = xmm0[0,2,2,3] +; AVX2-SLOW-NEXT: vpshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7] +; AVX2-SLOW-NEXT: retq +; +; AVX2-FAST-LABEL: trunc_usat_v2i64_v2i16: +; AVX2-FAST: # %bb.0: +; AVX2-FAST-NEXT: vmovapd {{.*#+}} xmm1 = [65535,65535] +; 
AVX2-FAST-NEXT: vpxor {{.*}}(%rip), %xmm0, %xmm2 +; AVX2-FAST-NEXT: vmovdqa {{.*#+}} xmm3 = [9223372036854841343,9223372036854841343] +; AVX2-FAST-NEXT: vpcmpgtq %xmm2, %xmm3, %xmm2 +; AVX2-FAST-NEXT: vblendvpd %xmm2, %xmm0, %xmm1, %xmm0 +; AVX2-FAST-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,1,8,9,8,9,10,11,8,9,10,11,12,13,14,15] +; AVX2-FAST-NEXT: retq +; +; AVX512F-LABEL: trunc_usat_v2i64_v2i16: +; AVX512F: # %bb.0: +; AVX512F-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 +; AVX512F-NEXT: vmovdqa {{.*#+}} xmm1 = [65535,65535] +; AVX512F-NEXT: vpminuq %zmm1, %zmm0, %zmm0 +; AVX512F-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3] +; AVX512F-NEXT: vpshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7] +; AVX512F-NEXT: vzeroupper +; AVX512F-NEXT: retq +; +; AVX512VL-LABEL: trunc_usat_v2i64_v2i16: +; AVX512VL: # %bb.0: +; AVX512VL-NEXT: vpminuq {{.*}}(%rip), %xmm0, %xmm0 +; AVX512VL-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,1,8,9,8,9,10,11,8,9,10,11,12,13,14,15] +; AVX512VL-NEXT: retq +; +; AVX512BW-LABEL: trunc_usat_v2i64_v2i16: +; AVX512BW: # %bb.0: +; AVX512BW-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 +; AVX512BW-NEXT: vmovdqa {{.*#+}} xmm1 = [65535,65535] +; AVX512BW-NEXT: vpminuq %zmm1, %zmm0, %zmm0 +; AVX512BW-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,1,8,9,8,9,10,11,8,9,10,11,12,13,14,15] +; AVX512BW-NEXT: vzeroupper +; AVX512BW-NEXT: retq +; +; AVX512BWVL-LABEL: trunc_usat_v2i64_v2i16: +; AVX512BWVL: # %bb.0: +; AVX512BWVL-NEXT: vpminuq {{.*}}(%rip), %xmm0, %xmm0 +; AVX512BWVL-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,1,8,9,8,9,10,11,8,9,10,11,12,13,14,15] +; AVX512BWVL-NEXT: retq +; +; SKX-LABEL: trunc_usat_v2i64_v2i16: +; SKX: # %bb.0: +; SKX-NEXT: vpminuq {{.*}}(%rip), %xmm0, %xmm0 +; SKX-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,1,8,9,8,9,10,11,8,9,10,11,12,13,14,15] +; SKX-NEXT: retq + %1 = icmp ult <2 x i64> %a0, + %2 = select <2 x i1> %1, <2 x i64> %a0, <2 x i64> + %3 = trunc <2 x i64> %2 to <2 x i16> + ret <2 x i16> %3 +} + +define void @trunc_usat_v2i64_v2i16_store(<2 x i64> %a0, 
<2 x i16>* %p1) { +; SSE2-LABEL: trunc_usat_v2i64_v2i16_store: +; SSE2: # %bb.0: +; SSE2-NEXT: movdqa {{.*#+}} xmm1 = [9223372039002259456,9223372039002259456] +; SSE2-NEXT: pxor %xmm0, %xmm1 +; SSE2-NEXT: movdqa {{.*#+}} xmm2 = [9223372039002324991,9223372039002324991] +; SSE2-NEXT: movdqa %xmm2, %xmm3 +; SSE2-NEXT: pcmpgtd %xmm1, %xmm3 +; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm3[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm2, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] +; SSE2-NEXT: pand %xmm4, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm3[1,1,3,3] +; SSE2-NEXT: por %xmm1, %xmm2 +; SSE2-NEXT: pand %xmm2, %xmm0 +; SSE2-NEXT: pandn {{.*}}(%rip), %xmm2 +; SSE2-NEXT: por %xmm0, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm2[0,2,2,3] +; SSE2-NEXT: pshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7] +; SSE2-NEXT: movd %xmm0, (%rdi) +; SSE2-NEXT: retq +; +; SSSE3-LABEL: trunc_usat_v2i64_v2i16_store: +; SSSE3: # %bb.0: +; SSSE3-NEXT: movdqa {{.*#+}} xmm1 = [9223372039002259456,9223372039002259456] +; SSSE3-NEXT: pxor %xmm0, %xmm1 +; SSSE3-NEXT: movdqa {{.*#+}} xmm2 = [9223372039002324991,9223372039002324991] +; SSSE3-NEXT: movdqa %xmm2, %xmm3 +; SSSE3-NEXT: pcmpgtd %xmm1, %xmm3 +; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm3[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm2, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] +; SSSE3-NEXT: pand %xmm4, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm3[1,1,3,3] +; SSSE3-NEXT: por %xmm1, %xmm2 +; SSSE3-NEXT: pand %xmm2, %xmm0 +; SSSE3-NEXT: pandn {{.*}}(%rip), %xmm2 +; SSSE3-NEXT: por %xmm0, %xmm2 +; SSSE3-NEXT: pshufd {{.*#+}} xmm0 = xmm2[0,2,2,3] +; SSSE3-NEXT: pshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7] +; SSSE3-NEXT: movd %xmm0, (%rdi) +; SSSE3-NEXT: retq +; +; SSE41-LABEL: trunc_usat_v2i64_v2i16_store: +; SSE41: # %bb.0: +; SSE41-NEXT: movdqa %xmm0, %xmm1 +; SSE41-NEXT: movapd {{.*#+}} xmm2 = [65535,65535] +; SSE41-NEXT: movdqa {{.*#+}} xmm0 = [9223372039002259456,9223372039002259456] +; SSE41-NEXT: pxor %xmm1, %xmm0 +; 
SSE41-NEXT: movdqa {{.*#+}} xmm3 = [9223372039002324991,9223372039002324991] +; SSE41-NEXT: movdqa %xmm3, %xmm4 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm4 +; SSE41-NEXT: pcmpgtd %xmm0, %xmm3 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm3[0,0,2,2] +; SSE41-NEXT: pand %xmm4, %xmm0 +; SSE41-NEXT: por %xmm3, %xmm0 +; SSE41-NEXT: blendvpd %xmm0, %xmm1, %xmm2 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm2[0,2,2,3] +; SSE41-NEXT: pshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7] +; SSE41-NEXT: movd %xmm0, (%rdi) +; SSE41-NEXT: retq +; +; AVX1-LABEL: trunc_usat_v2i64_v2i16_store: +; AVX1: # %bb.0: +; AVX1-NEXT: vmovapd {{.*#+}} xmm1 = [65535,65535] +; AVX1-NEXT: vpxor {{.*}}(%rip), %xmm0, %xmm2 +; AVX1-NEXT: vmovdqa {{.*#+}} xmm3 = [9223372036854841343,9223372036854841343] +; AVX1-NEXT: vpcmpgtq %xmm2, %xmm3, %xmm2 +; AVX1-NEXT: vblendvpd %xmm2, %xmm0, %xmm1, %xmm0 +; AVX1-NEXT: vpermilps {{.*#+}} xmm0 = xmm0[0,2,2,3] +; AVX1-NEXT: vpshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7] +; AVX1-NEXT: vmovd %xmm0, (%rdi) +; AVX1-NEXT: retq +; +; AVX2-SLOW-LABEL: trunc_usat_v2i64_v2i16_store: +; AVX2-SLOW: # %bb.0: +; AVX2-SLOW-NEXT: vmovapd {{.*#+}} xmm1 = [65535,65535] +; AVX2-SLOW-NEXT: vpxor {{.*}}(%rip), %xmm0, %xmm2 +; AVX2-SLOW-NEXT: vmovdqa {{.*#+}} xmm3 = [9223372036854841343,9223372036854841343] +; AVX2-SLOW-NEXT: vpcmpgtq %xmm2, %xmm3, %xmm2 +; AVX2-SLOW-NEXT: vblendvpd %xmm2, %xmm0, %xmm1, %xmm0 +; AVX2-SLOW-NEXT: vpermilps {{.*#+}} xmm0 = xmm0[0,2,2,3] +; AVX2-SLOW-NEXT: vpshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7] +; AVX2-SLOW-NEXT: vmovd %xmm0, (%rdi) +; AVX2-SLOW-NEXT: retq +; +; AVX2-FAST-LABEL: trunc_usat_v2i64_v2i16_store: +; AVX2-FAST: # %bb.0: +; AVX2-FAST-NEXT: vmovapd {{.*#+}} xmm1 = [65535,65535] +; AVX2-FAST-NEXT: vpxor {{.*}}(%rip), %xmm0, %xmm2 +; AVX2-FAST-NEXT: vmovdqa {{.*#+}} xmm3 = [9223372036854841343,9223372036854841343] +; AVX2-FAST-NEXT: vpcmpgtq %xmm2, %xmm3, %xmm2 +; AVX2-FAST-NEXT: vblendvpd %xmm2, %xmm0, %xmm1, %xmm0 +; AVX2-FAST-NEXT: vpshufb {{.*#+}} xmm0 = 
xmm0[0,1,8,9,8,9,10,11,8,9,10,11,12,13,14,15] +; AVX2-FAST-NEXT: vmovd %xmm0, (%rdi) +; AVX2-FAST-NEXT: retq +; +; AVX512F-LABEL: trunc_usat_v2i64_v2i16_store: +; AVX512F: # %bb.0: +; AVX512F-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 +; AVX512F-NEXT: vmovdqa {{.*#+}} xmm1 = [65535,65535] +; AVX512F-NEXT: vpminuq %zmm1, %zmm0, %zmm0 +; AVX512F-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3] +; AVX512F-NEXT: vpshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7] +; AVX512F-NEXT: vmovd %xmm0, (%rdi) +; AVX512F-NEXT: vzeroupper +; AVX512F-NEXT: retq +; +; AVX512VL-LABEL: trunc_usat_v2i64_v2i16_store: +; AVX512VL: # %bb.0: +; AVX512VL-NEXT: vpmovusqw %xmm0, (%rdi) +; AVX512VL-NEXT: retq +; +; AVX512BW-LABEL: trunc_usat_v2i64_v2i16_store: +; AVX512BW: # %bb.0: +; AVX512BW-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 +; AVX512BW-NEXT: vmovdqa {{.*#+}} xmm1 = [65535,65535] +; AVX512BW-NEXT: vpminuq %zmm1, %zmm0, %zmm0 +; AVX512BW-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,1,8,9,8,9,10,11,8,9,10,11,12,13,14,15] +; AVX512BW-NEXT: vmovd %xmm0, (%rdi) +; AVX512BW-NEXT: vzeroupper +; AVX512BW-NEXT: retq +; +; AVX512BWVL-LABEL: trunc_usat_v2i64_v2i16_store: +; AVX512BWVL: # %bb.0: +; AVX512BWVL-NEXT: vpmovusqw %xmm0, (%rdi) +; AVX512BWVL-NEXT: retq +; +; SKX-LABEL: trunc_usat_v2i64_v2i16_store: +; SKX: # %bb.0: +; SKX-NEXT: vpmovusqw %xmm0, (%rdi) +; SKX-NEXT: retq + %1 = icmp ult <2 x i64> %a0, + %2 = select <2 x i1> %1, <2 x i64> %a0, <2 x i64> + %3 = trunc <2 x i64> %2 to <2 x i16> + store <2 x i16> %3, <2 x i16>* %p1 + ret void +} + define <4 x i16> @trunc_usat_v4i64_v4i16(<4 x i64> %a0) { ; SSE2-LABEL: trunc_usat_v4i64_v4i16: ; SSE2: # %bb.0: @@ -1592,6 +2082,234 @@ define <16 x i16> @trunc_usat_v16i32_v16 ; Unsigned saturation truncation to vXi8 ; +define <2 x i8> @trunc_usat_v2i64_v2i8(<2 x i64> %a0) { +; SSE2-LABEL: trunc_usat_v2i64_v2i8: +; SSE2: # %bb.0: +; SSE2-NEXT: movdqa {{.*#+}} xmm1 = [9223372039002259456,9223372039002259456] +; SSE2-NEXT: pxor %xmm0, %xmm1 +; SSE2-NEXT: 
movdqa {{.*#+}} xmm2 = [9223372039002259711,9223372039002259711] +; SSE2-NEXT: movdqa %xmm2, %xmm3 +; SSE2-NEXT: pcmpgtd %xmm1, %xmm3 +; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm3[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm2, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm1[1,1,3,3] +; SSE2-NEXT: pand %xmm4, %xmm2 +; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm3[1,1,3,3] +; SSE2-NEXT: por %xmm2, %xmm1 +; SSE2-NEXT: pand %xmm1, %xmm0 +; SSE2-NEXT: pandn {{.*}}(%rip), %xmm1 +; SSE2-NEXT: por %xmm0, %xmm1 +; SSE2-NEXT: pand {{.*}}(%rip), %xmm1 +; SSE2-NEXT: packuswb %xmm1, %xmm1 +; SSE2-NEXT: packuswb %xmm1, %xmm1 +; SSE2-NEXT: packuswb %xmm1, %xmm1 +; SSE2-NEXT: movdqa %xmm1, %xmm0 +; SSE2-NEXT: retq +; +; SSSE3-LABEL: trunc_usat_v2i64_v2i8: +; SSSE3: # %bb.0: +; SSSE3-NEXT: movdqa {{.*#+}} xmm1 = [9223372039002259456,9223372039002259456] +; SSSE3-NEXT: pxor %xmm0, %xmm1 +; SSSE3-NEXT: movdqa {{.*#+}} xmm2 = [9223372039002259711,9223372039002259711] +; SSSE3-NEXT: movdqa %xmm2, %xmm3 +; SSSE3-NEXT: pcmpgtd %xmm1, %xmm3 +; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm3[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm2, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] +; SSSE3-NEXT: pand %xmm4, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm3[1,1,3,3] +; SSSE3-NEXT: por %xmm1, %xmm2 +; SSSE3-NEXT: pand %xmm2, %xmm0 +; SSSE3-NEXT: pandn {{.*}}(%rip), %xmm2 +; SSSE3-NEXT: por %xmm2, %xmm0 +; SSSE3-NEXT: pshufb {{.*#+}} xmm0 = xmm0[0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u] +; SSSE3-NEXT: retq +; +; SSE41-LABEL: trunc_usat_v2i64_v2i8: +; SSE41: # %bb.0: +; SSE41-NEXT: movdqa %xmm0, %xmm1 +; SSE41-NEXT: movapd {{.*#+}} xmm2 = [255,255] +; SSE41-NEXT: movdqa {{.*#+}} xmm0 = [9223372039002259456,9223372039002259456] +; SSE41-NEXT: pxor %xmm1, %xmm0 +; SSE41-NEXT: movdqa {{.*#+}} xmm3 = [9223372039002259711,9223372039002259711] +; SSE41-NEXT: movdqa %xmm3, %xmm4 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm4 +; SSE41-NEXT: pcmpgtd %xmm0, %xmm3 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm3[0,0,2,2] +; SSE41-NEXT: pand %xmm4, 
%xmm0 +; SSE41-NEXT: por %xmm3, %xmm0 +; SSE41-NEXT: blendvpd %xmm0, %xmm1, %xmm2 +; SSE41-NEXT: pshufb {{.*#+}} xmm2 = xmm2[0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u] +; SSE41-NEXT: movdqa %xmm2, %xmm0 +; SSE41-NEXT: retq +; +; AVX-LABEL: trunc_usat_v2i64_v2i8: +; AVX: # %bb.0: +; AVX-NEXT: vmovapd {{.*#+}} xmm1 = [255,255] +; AVX-NEXT: vpxor {{.*}}(%rip), %xmm0, %xmm2 +; AVX-NEXT: vmovdqa {{.*#+}} xmm3 = [9223372036854776063,9223372036854776063] +; AVX-NEXT: vpcmpgtq %xmm2, %xmm3, %xmm2 +; AVX-NEXT: vblendvpd %xmm2, %xmm0, %xmm1, %xmm0 +; AVX-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX-NEXT: retq +; +; AVX512F-LABEL: trunc_usat_v2i64_v2i8: +; AVX512F: # %bb.0: +; AVX512F-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 +; AVX512F-NEXT: vmovdqa {{.*#+}} xmm1 = [255,255] +; AVX512F-NEXT: vpminuq %zmm1, %zmm0, %zmm0 +; AVX512F-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX512F-NEXT: vzeroupper +; AVX512F-NEXT: retq +; +; AVX512VL-LABEL: trunc_usat_v2i64_v2i8: +; AVX512VL: # %bb.0: +; AVX512VL-NEXT: vpminuq {{.*}}(%rip), %xmm0, %xmm0 +; AVX512VL-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX512VL-NEXT: retq +; +; AVX512BW-LABEL: trunc_usat_v2i64_v2i8: +; AVX512BW: # %bb.0: +; AVX512BW-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 +; AVX512BW-NEXT: vmovdqa {{.*#+}} xmm1 = [255,255] +; AVX512BW-NEXT: vpminuq %zmm1, %zmm0, %zmm0 +; AVX512BW-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX512BW-NEXT: vzeroupper +; AVX512BW-NEXT: retq +; +; AVX512BWVL-LABEL: trunc_usat_v2i64_v2i8: +; AVX512BWVL: # %bb.0: +; AVX512BWVL-NEXT: vpminuq {{.*}}(%rip), %xmm0, %xmm0 +; AVX512BWVL-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX512BWVL-NEXT: retq +; +; SKX-LABEL: trunc_usat_v2i64_v2i8: +; SKX: # %bb.0: +; SKX-NEXT: vpminuq {{.*}}(%rip), %xmm0, %xmm0 +; SKX-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u] +; SKX-NEXT: retq + %1 = icmp 
ult <2 x i64> %a0, + %2 = select <2 x i1> %1, <2 x i64> %a0, <2 x i64> + %3 = trunc <2 x i64> %2 to <2 x i8> + ret <2 x i8> %3 +} + +define void @trunc_usat_v2i64_v2i8_store(<2 x i64> %a0, <2 x i8>* %p1) { +; SSE2-LABEL: trunc_usat_v2i64_v2i8_store: +; SSE2: # %bb.0: +; SSE2-NEXT: movdqa {{.*#+}} xmm1 = [9223372039002259456,9223372039002259456] +; SSE2-NEXT: pxor %xmm0, %xmm1 +; SSE2-NEXT: movdqa {{.*#+}} xmm2 = [9223372039002259711,9223372039002259711] +; SSE2-NEXT: movdqa %xmm2, %xmm3 +; SSE2-NEXT: pcmpgtd %xmm1, %xmm3 +; SSE2-NEXT: pshufd {{.*#+}} xmm4 = xmm3[0,0,2,2] +; SSE2-NEXT: pcmpeqd %xmm2, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] +; SSE2-NEXT: pand %xmm4, %xmm1 +; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm3[1,1,3,3] +; SSE2-NEXT: por %xmm1, %xmm2 +; SSE2-NEXT: pand %xmm2, %xmm0 +; SSE2-NEXT: pandn {{.*}}(%rip), %xmm2 +; SSE2-NEXT: por %xmm0, %xmm2 +; SSE2-NEXT: pand {{.*}}(%rip), %xmm2 +; SSE2-NEXT: packuswb %xmm2, %xmm2 +; SSE2-NEXT: packuswb %xmm0, %xmm2 +; SSE2-NEXT: packuswb %xmm0, %xmm2 +; SSE2-NEXT: movd %xmm2, %eax +; SSE2-NEXT: movw %ax, (%rdi) +; SSE2-NEXT: retq +; +; SSSE3-LABEL: trunc_usat_v2i64_v2i8_store: +; SSSE3: # %bb.0: +; SSSE3-NEXT: movdqa {{.*#+}} xmm1 = [9223372039002259456,9223372039002259456] +; SSSE3-NEXT: pxor %xmm0, %xmm1 +; SSSE3-NEXT: movdqa {{.*#+}} xmm2 = [9223372039002259711,9223372039002259711] +; SSSE3-NEXT: movdqa %xmm2, %xmm3 +; SSSE3-NEXT: pcmpgtd %xmm1, %xmm3 +; SSSE3-NEXT: pshufd {{.*#+}} xmm4 = xmm3[0,0,2,2] +; SSSE3-NEXT: pcmpeqd %xmm2, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3] +; SSSE3-NEXT: pand %xmm4, %xmm1 +; SSSE3-NEXT: pshufd {{.*#+}} xmm2 = xmm3[1,1,3,3] +; SSSE3-NEXT: por %xmm1, %xmm2 +; SSSE3-NEXT: pand %xmm2, %xmm0 +; SSSE3-NEXT: pandn {{.*}}(%rip), %xmm2 +; SSSE3-NEXT: por %xmm0, %xmm2 +; SSSE3-NEXT: pshufb {{.*#+}} xmm2 = xmm2[0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u] +; SSSE3-NEXT: movd %xmm2, %eax +; SSSE3-NEXT: movw %ax, (%rdi) +; SSSE3-NEXT: retq +; +; SSE41-LABEL: 
trunc_usat_v2i64_v2i8_store: +; SSE41: # %bb.0: +; SSE41-NEXT: movdqa %xmm0, %xmm1 +; SSE41-NEXT: movapd {{.*#+}} xmm2 = [255,255] +; SSE41-NEXT: movdqa {{.*#+}} xmm0 = [9223372039002259456,9223372039002259456] +; SSE41-NEXT: pxor %xmm1, %xmm0 +; SSE41-NEXT: movdqa {{.*#+}} xmm3 = [9223372039002259711,9223372039002259711] +; SSE41-NEXT: movdqa %xmm3, %xmm4 +; SSE41-NEXT: pcmpeqd %xmm0, %xmm4 +; SSE41-NEXT: pcmpgtd %xmm0, %xmm3 +; SSE41-NEXT: pshufd {{.*#+}} xmm0 = xmm3[0,0,2,2] +; SSE41-NEXT: pand %xmm4, %xmm0 +; SSE41-NEXT: por %xmm3, %xmm0 +; SSE41-NEXT: blendvpd %xmm0, %xmm1, %xmm2 +; SSE41-NEXT: pshufb {{.*#+}} xmm2 = xmm2[0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u] +; SSE41-NEXT: pextrw $0, %xmm2, (%rdi) +; SSE41-NEXT: retq +; +; AVX-LABEL: trunc_usat_v2i64_v2i8_store: +; AVX: # %bb.0: +; AVX-NEXT: vmovapd {{.*#+}} xmm1 = [255,255] +; AVX-NEXT: vpxor {{.*}}(%rip), %xmm0, %xmm2 +; AVX-NEXT: vmovdqa {{.*#+}} xmm3 = [9223372036854776063,9223372036854776063] +; AVX-NEXT: vpcmpgtq %xmm2, %xmm3, %xmm2 +; AVX-NEXT: vblendvpd %xmm2, %xmm0, %xmm1, %xmm0 +; AVX-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX-NEXT: vpextrw $0, %xmm0, (%rdi) +; AVX-NEXT: retq +; +; AVX512F-LABEL: trunc_usat_v2i64_v2i8_store: +; AVX512F: # %bb.0: +; AVX512F-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 +; AVX512F-NEXT: vmovdqa {{.*#+}} xmm1 = [255,255] +; AVX512F-NEXT: vpminuq %zmm1, %zmm0, %zmm0 +; AVX512F-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX512F-NEXT: vpextrw $0, %xmm0, (%rdi) +; AVX512F-NEXT: vzeroupper +; AVX512F-NEXT: retq +; +; AVX512VL-LABEL: trunc_usat_v2i64_v2i8_store: +; AVX512VL: # %bb.0: +; AVX512VL-NEXT: vpmovusqb %xmm0, (%rdi) +; AVX512VL-NEXT: retq +; +; AVX512BW-LABEL: trunc_usat_v2i64_v2i8_store: +; AVX512BW: # %bb.0: +; AVX512BW-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 +; AVX512BW-NEXT: vmovdqa {{.*#+}} xmm1 = [255,255] +; AVX512BW-NEXT: vpminuq %zmm1, %zmm0, %zmm0 +; AVX512BW-NEXT: vpshufb {{.*#+}} xmm0 = 
xmm0[0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX512BW-NEXT: vpextrw $0, %xmm0, (%rdi) +; AVX512BW-NEXT: vzeroupper +; AVX512BW-NEXT: retq +; +; AVX512BWVL-LABEL: trunc_usat_v2i64_v2i8_store: +; AVX512BWVL: # %bb.0: +; AVX512BWVL-NEXT: vpmovusqb %xmm0, (%rdi) +; AVX512BWVL-NEXT: retq +; +; SKX-LABEL: trunc_usat_v2i64_v2i8_store: +; SKX: # %bb.0: +; SKX-NEXT: vpmovusqb %xmm0, (%rdi) +; SKX-NEXT: retq + %1 = icmp ult <2 x i64> %a0, + %2 = select <2 x i1> %1, <2 x i64> %a0, <2 x i64> + %3 = trunc <2 x i64> %2 to <2 x i8> + store <2 x i8> %3, <2 x i8>* %p1 + ret void +} + define <4 x i8> @trunc_usat_v4i64_v4i8(<4 x i64> %a0) { ; SSE2-LABEL: trunc_usat_v4i64_v4i8: ; SSE2: # %bb.0: From llvm-commits at lists.llvm.org Sat Oct 12 22:47:47 2019 From: llvm-commits at lists.llvm.org (Craig Topper via llvm-commits) Date: Sun, 13 Oct 2019 05:47:47 -0000 Subject: [llvm] r374705 - [X86] Enable v4i32->v4i16 and v8i16->v8i8 saturating truncates to use pack instructions with avx512. Message-ID: <20191013054747.7DD368385B@lists.llvm.org> Author: ctopper Date: Sat Oct 12 22:47:47 2019 New Revision: 374705 URL: http://llvm.org/viewvc/llvm-project?rev=374705&view=rev Log: [X86] Enable v4i32->v4i16 and v8i16->v8i8 saturating truncates to use pack instructions with avx512. Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp llvm/trunk/test/CodeGen/X86/vector-trunc-packus.ll llvm/trunk/test/CodeGen/X86/vector-trunc-ssat.ll Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=374705&r1=374704&r2=374705&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp (original) +++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Sat Oct 12 22:47:47 2019 @@ -39871,6 +39871,7 @@ static SDValue combineTruncateWithSat(SD // registers, we should go ahead and use the pack instructions if possible. 
bool PreferAVX512 = ((Subtarget.hasAVX512() && InSVT == MVT::i32) || (Subtarget.hasBWI() && InSVT == MVT::i16)) && + (InVT.getSizeInBits() > 128) && (Subtarget.hasVLX() || InVT.getSizeInBits() > 256) && !(!Subtarget.useAVX512Regs() && VT.getSizeInBits() >= 256); Modified: llvm/trunk/test/CodeGen/X86/vector-trunc-packus.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/vector-trunc-packus.ll?rev=374705&r1=374704&r2=374705&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/vector-trunc-packus.ll (original) +++ llvm/trunk/test/CodeGen/X86/vector-trunc-packus.ll Sat Oct 12 22:47:47 2019 @@ -2392,37 +2392,13 @@ define <4 x i16> @trunc_packus_v4i32_v4i ; AVX-NEXT: vpackusdw %xmm0, %xmm0, %xmm0 ; AVX-NEXT: retq ; -; AVX512F-LABEL: trunc_packus_v4i32_v4i16: -; AVX512F: # %bb.0: -; AVX512F-NEXT: vpackusdw %xmm0, %xmm0, %xmm0 -; AVX512F-NEXT: retq -; -; AVX512VL-LABEL: trunc_packus_v4i32_v4i16: -; AVX512VL: # %bb.0: -; AVX512VL-NEXT: vpminsd {{.*}}(%rip){1to4}, %xmm0, %xmm0 -; AVX512VL-NEXT: vpxor %xmm1, %xmm1, %xmm1 -; AVX512VL-NEXT: vpmaxsd %xmm1, %xmm0, %xmm0 -; AVX512VL-NEXT: vpackusdw %xmm0, %xmm0, %xmm0 -; AVX512VL-NEXT: retq -; -; AVX512BW-LABEL: trunc_packus_v4i32_v4i16: -; AVX512BW: # %bb.0: -; AVX512BW-NEXT: vpackusdw %xmm0, %xmm0, %xmm0 -; AVX512BW-NEXT: retq -; -; AVX512BWVL-LABEL: trunc_packus_v4i32_v4i16: -; AVX512BWVL: # %bb.0: -; AVX512BWVL-NEXT: vpminsd {{.*}}(%rip){1to4}, %xmm0, %xmm0 -; AVX512BWVL-NEXT: vpxor %xmm1, %xmm1, %xmm1 -; AVX512BWVL-NEXT: vpmaxsd %xmm1, %xmm0, %xmm0 -; AVX512BWVL-NEXT: vpackusdw %xmm0, %xmm0, %xmm0 -; AVX512BWVL-NEXT: retq +; AVX512-LABEL: trunc_packus_v4i32_v4i16: +; AVX512: # %bb.0: +; AVX512-NEXT: vpackusdw %xmm0, %xmm0, %xmm0 +; AVX512-NEXT: retq ; ; SKX-LABEL: trunc_packus_v4i32_v4i16: ; SKX: # %bb.0: -; SKX-NEXT: vpminsd {{.*}}(%rip){1to4}, %xmm0, %xmm0 -; SKX-NEXT: vpxor %xmm1, %xmm1, %xmm1 -; SKX-NEXT: vpmaxsd %xmm1, %xmm0, %xmm0 ; 
SKX-NEXT: vpackusdw %xmm0, %xmm0, %xmm0 ; SKX-NEXT: retq %1 = icmp slt <4 x i32> %a0, @@ -5731,34 +5707,13 @@ define <8 x i8> @trunc_packus_v8i16_v8i8 ; AVX-NEXT: vpackuswb %xmm0, %xmm0, %xmm0 ; AVX-NEXT: retq ; -; AVX512F-LABEL: trunc_packus_v8i16_v8i8: -; AVX512F: # %bb.0: -; AVX512F-NEXT: vpackuswb %xmm0, %xmm0, %xmm0 -; AVX512F-NEXT: retq -; -; AVX512VL-LABEL: trunc_packus_v8i16_v8i8: -; AVX512VL: # %bb.0: -; AVX512VL-NEXT: vpackuswb %xmm0, %xmm0, %xmm0 -; AVX512VL-NEXT: retq -; -; AVX512BW-LABEL: trunc_packus_v8i16_v8i8: -; AVX512BW: # %bb.0: -; AVX512BW-NEXT: vpackuswb %xmm0, %xmm0, %xmm0 -; AVX512BW-NEXT: retq -; -; AVX512BWVL-LABEL: trunc_packus_v8i16_v8i8: -; AVX512BWVL: # %bb.0: -; AVX512BWVL-NEXT: vpminsw {{.*}}(%rip), %xmm0, %xmm0 -; AVX512BWVL-NEXT: vpxor %xmm1, %xmm1, %xmm1 -; AVX512BWVL-NEXT: vpmaxsw %xmm1, %xmm0, %xmm0 -; AVX512BWVL-NEXT: vpackuswb %xmm0, %xmm0, %xmm0 -; AVX512BWVL-NEXT: retq +; AVX512-LABEL: trunc_packus_v8i16_v8i8: +; AVX512: # %bb.0: +; AVX512-NEXT: vpackuswb %xmm0, %xmm0, %xmm0 +; AVX512-NEXT: retq ; ; SKX-LABEL: trunc_packus_v8i16_v8i8: ; SKX: # %bb.0: -; SKX-NEXT: vpminsw {{.*}}(%rip), %xmm0, %xmm0 -; SKX-NEXT: vpxor %xmm1, %xmm1, %xmm1 -; SKX-NEXT: vpmaxsw %xmm1, %xmm0, %xmm0 ; SKX-NEXT: vpackuswb %xmm0, %xmm0, %xmm0 ; SKX-NEXT: retq %1 = icmp slt <8 x i16> %a0, Modified: llvm/trunk/test/CodeGen/X86/vector-trunc-ssat.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/vector-trunc-ssat.ll?rev=374705&r1=374704&r2=374705&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/vector-trunc-ssat.ll (original) +++ llvm/trunk/test/CodeGen/X86/vector-trunc-ssat.ll Sat Oct 12 22:47:47 2019 @@ -2380,34 +2380,13 @@ define <4 x i16> @trunc_ssat_v4i32_v4i16 ; AVX-NEXT: vpackssdw %xmm0, %xmm0, %xmm0 ; AVX-NEXT: retq ; -; AVX512F-LABEL: trunc_ssat_v4i32_v4i16: -; AVX512F: # %bb.0: -; AVX512F-NEXT: vpackssdw %xmm0, %xmm0, %xmm0 -; AVX512F-NEXT: retq -; -; 
AVX512VL-LABEL: trunc_ssat_v4i32_v4i16: -; AVX512VL: # %bb.0: -; AVX512VL-NEXT: vpminsd {{.*}}(%rip){1to4}, %xmm0, %xmm0 -; AVX512VL-NEXT: vpmaxsd {{.*}}(%rip){1to4}, %xmm0, %xmm0 -; AVX512VL-NEXT: vpackssdw %xmm0, %xmm0, %xmm0 -; AVX512VL-NEXT: retq -; -; AVX512BW-LABEL: trunc_ssat_v4i32_v4i16: -; AVX512BW: # %bb.0: -; AVX512BW-NEXT: vpackssdw %xmm0, %xmm0, %xmm0 -; AVX512BW-NEXT: retq -; -; AVX512BWVL-LABEL: trunc_ssat_v4i32_v4i16: -; AVX512BWVL: # %bb.0: -; AVX512BWVL-NEXT: vpminsd {{.*}}(%rip){1to4}, %xmm0, %xmm0 -; AVX512BWVL-NEXT: vpmaxsd {{.*}}(%rip){1to4}, %xmm0, %xmm0 -; AVX512BWVL-NEXT: vpackssdw %xmm0, %xmm0, %xmm0 -; AVX512BWVL-NEXT: retq +; AVX512-LABEL: trunc_ssat_v4i32_v4i16: +; AVX512: # %bb.0: +; AVX512-NEXT: vpackssdw %xmm0, %xmm0, %xmm0 +; AVX512-NEXT: retq ; ; SKX-LABEL: trunc_ssat_v4i32_v4i16: ; SKX: # %bb.0: -; SKX-NEXT: vpminsd {{.*}}(%rip){1to4}, %xmm0, %xmm0 -; SKX-NEXT: vpmaxsd {{.*}}(%rip){1to4}, %xmm0, %xmm0 ; SKX-NEXT: vpackssdw %xmm0, %xmm0, %xmm0 ; SKX-NEXT: retq %1 = icmp slt <4 x i32> %a0, @@ -5620,32 +5599,13 @@ define <8 x i8> @trunc_ssat_v8i16_v8i8(< ; AVX-NEXT: vpacksswb %xmm0, %xmm0, %xmm0 ; AVX-NEXT: retq ; -; AVX512F-LABEL: trunc_ssat_v8i16_v8i8: -; AVX512F: # %bb.0: -; AVX512F-NEXT: vpacksswb %xmm0, %xmm0, %xmm0 -; AVX512F-NEXT: retq -; -; AVX512VL-LABEL: trunc_ssat_v8i16_v8i8: -; AVX512VL: # %bb.0: -; AVX512VL-NEXT: vpacksswb %xmm0, %xmm0, %xmm0 -; AVX512VL-NEXT: retq -; -; AVX512BW-LABEL: trunc_ssat_v8i16_v8i8: -; AVX512BW: # %bb.0: -; AVX512BW-NEXT: vpacksswb %xmm0, %xmm0, %xmm0 -; AVX512BW-NEXT: retq -; -; AVX512BWVL-LABEL: trunc_ssat_v8i16_v8i8: -; AVX512BWVL: # %bb.0: -; AVX512BWVL-NEXT: vpminsw {{.*}}(%rip), %xmm0, %xmm0 -; AVX512BWVL-NEXT: vpmaxsw {{.*}}(%rip), %xmm0, %xmm0 -; AVX512BWVL-NEXT: vpacksswb %xmm0, %xmm0, %xmm0 -; AVX512BWVL-NEXT: retq +; AVX512-LABEL: trunc_ssat_v8i16_v8i8: +; AVX512: # %bb.0: +; AVX512-NEXT: vpacksswb %xmm0, %xmm0, %xmm0 +; AVX512-NEXT: retq ; ; SKX-LABEL: trunc_ssat_v8i16_v8i8: ; SKX: 
# %bb.0: -; SKX-NEXT: vpminsw {{.*}}(%rip), %xmm0, %xmm0 -; SKX-NEXT: vpmaxsw {{.*}}(%rip), %xmm0, %xmm0 ; SKX-NEXT: vpacksswb %xmm0, %xmm0, %xmm0 ; SKX-NEXT: retq %1 = icmp slt <8 x i16> %a0, From llvm-commits at lists.llvm.org Sat Oct 12 23:48:05 2019 From: llvm-commits at lists.llvm.org (Craig Topper via llvm-commits) Date: Sun, 13 Oct 2019 06:48:05 -0000 Subject: [llvm] r374706 - [X86] Add a one use check on the setcc to the min/max canonicalization code in combineSelect. Message-ID: <20191013064805.8A53285CE8@lists.llvm.org> Author: ctopper Date: Sat Oct 12 23:48:05 2019 New Revision: 374706 URL: http://llvm.org/viewvc/llvm-project?rev=374706&view=rev Log: [X86] Add a one use check on the setcc to the min/max canonicalization code in combineSelect. This seems to improve std::midpoint code where we have a min and a max with the same condition. If we split the setcc we can end up with two compares if one of the operands is a constant, since we aggressively canonicalize compares with constants. For non-constants it can interfere with our ability to share control flow if we need to expand cmovs into control flow. I'm also not sure I understand this min/max canonicalization code. The motivating case talks about comparing with 0, but we don't check for 0 explicitly. Removes one instruction from the codegen for PR43658.
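For readers without the test file at hand, the pattern the log message describes (a min and a max selected on the same compare) can be sketched in plain C++ roughly as follows. This is only an illustration modeled on the `icmp sgt` / `select` sequence visible in midpoint-int.ll; the function name `midpointSigned` is hypothetical and not part of the commit.

```cpp
#include <cstdint>

// Hypothetical C++ rendering of the midpoint pattern; not code from r374706.
// One compare feeds three selects, so the setcc has more than one use and
// the new hasOneUse() check would leave it alone.
int32_t midpointSigned(int32_t a1, int32_t a2) {
  bool gt = a1 > a2;           // the single compare (setcc)
  int32_t sign = gt ? -1 : 1;  // select 1: direction of the step
  int32_t lo = gt ? a2 : a1;   // select 2: min(a1, a2)
  int32_t hi = gt ? a1 : a2;   // select 3: max(a1, a2)
  // Unsigned arithmetic keeps the subtraction and multiply well defined.
  uint32_t diff = static_cast<uint32_t>(hi) - static_cast<uint32_t>(lo);
  uint32_t half = diff >> 1;   // logical shift right by one
  uint32_t res = static_cast<uint32_t>(a1) + half * static_cast<uint32_t>(sign);
  return static_cast<int32_t>(res);
}
```

Because `lo` and `hi` share the condition `gt`, rewriting only one of the selects with a different condition code would require a second compare, which is presumably the cost the commit avoids.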
Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp llvm/trunk/test/CodeGen/X86/midpoint-int.ll Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=374706&r1=374705&r2=374706&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp (original) +++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Sat Oct 12 23:48:05 2019 @@ -37009,6 +37009,7 @@ static SDValue combineSelect(SDNode *N, // subl %esi, $edi // cmovsl %eax, %edi if (N->getOpcode() == ISD::SELECT && Cond.getOpcode() == ISD::SETCC && + Cond.hasOneUse() && DAG.isEqualTo(LHS, Cond.getOperand(0)) && DAG.isEqualTo(RHS, Cond.getOperand(1))) { ISD::CondCode CC = cast(Cond.getOperand(2))->get(); Modified: llvm/trunk/test/CodeGen/X86/midpoint-int.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/midpoint-int.ll?rev=374706&r1=374705&r2=374706&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/midpoint-int.ll (original) +++ llvm/trunk/test/CodeGen/X86/midpoint-int.ll Sat Oct 12 23:48:05 2019 @@ -20,7 +20,7 @@ define i32 @scalar_i32_signed_reg_reg(i3 ; X64-NEXT: leal -1(%rax,%rax), %eax ; X64-NEXT: movl %edi, %ecx ; X64-NEXT: cmovgl %esi, %ecx -; X64-NEXT: cmovgel %edi, %esi +; X64-NEXT: cmovgl %edi, %esi ; X64-NEXT: subl %ecx, %esi ; X64-NEXT: shrl %esi ; X64-NEXT: imull %esi, %eax @@ -29,30 +29,26 @@ define i32 @scalar_i32_signed_reg_reg(i3 ; ; X32-LABEL: scalar_i32_signed_reg_reg: ; X32: # %bb.0: -; X32-NEXT: pushl %edi ; X32-NEXT: pushl %esi -; X32-NEXT: movl {{[0-9]+}}(%esp), %edx +; X32-NEXT: movl {{[0-9]+}}(%esp), %eax ; X32-NEXT: movl {{[0-9]+}}(%esp), %ecx -; X32-NEXT: xorl %eax, %eax -; X32-NEXT: cmpl %edx, %ecx -; X32-NEXT: setle %al -; X32-NEXT: movl %edx, %esi -; X32-NEXT: jg .LBB0_2 -; X32-NEXT: # %bb.1: +; X32-NEXT: xorl %edx, %edx +; 
X32-NEXT: cmpl %eax, %ecx +; X32-NEXT: setle %dl +; X32-NEXT: leal -1(%edx,%edx), %edx +; X32-NEXT: jg .LBB0_1 +; X32-NEXT: # %bb.2: ; X32-NEXT: movl %ecx, %esi -; X32-NEXT: .LBB0_2: -; X32-NEXT: leal -1(%eax,%eax), %edi +; X32-NEXT: jmp .LBB0_3 +; X32-NEXT: .LBB0_1: +; X32-NEXT: movl %eax, %esi ; X32-NEXT: movl %ecx, %eax -; X32-NEXT: jge .LBB0_4 -; X32-NEXT: # %bb.3: -; X32-NEXT: movl %edx, %eax -; X32-NEXT: .LBB0_4: +; X32-NEXT: .LBB0_3: ; X32-NEXT: subl %esi, %eax ; X32-NEXT: shrl %eax -; X32-NEXT: imull %edi, %eax +; X32-NEXT: imull %edx, %eax ; X32-NEXT: addl %ecx, %eax ; X32-NEXT: popl %esi -; X32-NEXT: popl %edi ; X32-NEXT: retl %t3 = icmp sgt i32 %a1, %a2 ; signed %t4 = select i1 %t3, i32 -1, i32 1 @@ -127,7 +123,7 @@ define i32 @scalar_i32_signed_mem_reg(i3 ; X64-NEXT: leal -1(%rax,%rax), %eax ; X64-NEXT: movl %ecx, %edx ; X64-NEXT: cmovgl %esi, %edx -; X64-NEXT: cmovgel %ecx, %esi +; X64-NEXT: cmovgl %ecx, %esi ; X64-NEXT: subl %edx, %esi ; X64-NEXT: shrl %esi ; X64-NEXT: imull %esi, %eax @@ -136,31 +132,27 @@ define i32 @scalar_i32_signed_mem_reg(i3 ; ; X32-LABEL: scalar_i32_signed_mem_reg: ; X32: # %bb.0: -; X32-NEXT: pushl %edi ; X32-NEXT: pushl %esi -; X32-NEXT: movl {{[0-9]+}}(%esp), %edx ; X32-NEXT: movl {{[0-9]+}}(%esp), %eax -; X32-NEXT: movl (%eax), %ecx -; X32-NEXT: xorl %eax, %eax -; X32-NEXT: cmpl %edx, %ecx -; X32-NEXT: setle %al -; X32-NEXT: movl %edx, %esi -; X32-NEXT: jg .LBB2_2 -; X32-NEXT: # %bb.1: +; X32-NEXT: movl {{[0-9]+}}(%esp), %ecx +; X32-NEXT: movl (%ecx), %ecx +; X32-NEXT: xorl %edx, %edx +; X32-NEXT: cmpl %eax, %ecx +; X32-NEXT: setle %dl +; X32-NEXT: leal -1(%edx,%edx), %edx +; X32-NEXT: jg .LBB2_1 +; X32-NEXT: # %bb.2: ; X32-NEXT: movl %ecx, %esi -; X32-NEXT: .LBB2_2: -; X32-NEXT: leal -1(%eax,%eax), %edi +; X32-NEXT: jmp .LBB2_3 +; X32-NEXT: .LBB2_1: +; X32-NEXT: movl %eax, %esi ; X32-NEXT: movl %ecx, %eax -; X32-NEXT: jge .LBB2_4 -; X32-NEXT: # %bb.3: -; X32-NEXT: movl %edx, %eax -; X32-NEXT: .LBB2_4: +; X32-NEXT: .LBB2_3: 
; X32-NEXT: subl %esi, %eax ; X32-NEXT: shrl %eax -; X32-NEXT: imull %edi, %eax +; X32-NEXT: imull %edx, %eax ; X32-NEXT: addl %ecx, %eax ; X32-NEXT: popl %esi -; X32-NEXT: popl %edi ; X32-NEXT: retl %a1 = load i32, i32* %a1_addr %t3 = icmp sgt i32 %a1, %a2 ; signed @@ -184,7 +176,7 @@ define i32 @scalar_i32_signed_reg_mem(i3 ; X64-NEXT: leal -1(%rcx,%rcx), %ecx ; X64-NEXT: movl %edi, %edx ; X64-NEXT: cmovgl %eax, %edx -; X64-NEXT: cmovgel %edi, %eax +; X64-NEXT: cmovgl %edi, %eax ; X64-NEXT: subl %edx, %eax ; X64-NEXT: shrl %eax ; X64-NEXT: imull %ecx, %eax @@ -193,31 +185,27 @@ define i32 @scalar_i32_signed_reg_mem(i3 ; ; X32-LABEL: scalar_i32_signed_reg_mem: ; X32: # %bb.0: -; X32-NEXT: pushl %edi ; X32-NEXT: pushl %esi ; X32-NEXT: movl {{[0-9]+}}(%esp), %ecx ; X32-NEXT: movl {{[0-9]+}}(%esp), %eax -; X32-NEXT: movl (%eax), %edx -; X32-NEXT: xorl %eax, %eax -; X32-NEXT: cmpl %edx, %ecx -; X32-NEXT: setle %al -; X32-NEXT: movl %edx, %esi -; X32-NEXT: jg .LBB3_2 -; X32-NEXT: # %bb.1: +; X32-NEXT: movl (%eax), %eax +; X32-NEXT: xorl %edx, %edx +; X32-NEXT: cmpl %eax, %ecx +; X32-NEXT: setle %dl +; X32-NEXT: leal -1(%edx,%edx), %edx +; X32-NEXT: jg .LBB3_1 +; X32-NEXT: # %bb.2: ; X32-NEXT: movl %ecx, %esi -; X32-NEXT: .LBB3_2: -; X32-NEXT: leal -1(%eax,%eax), %edi +; X32-NEXT: jmp .LBB3_3 +; X32-NEXT: .LBB3_1: +; X32-NEXT: movl %eax, %esi ; X32-NEXT: movl %ecx, %eax -; X32-NEXT: jge .LBB3_4 -; X32-NEXT: # %bb.3: -; X32-NEXT: movl %edx, %eax -; X32-NEXT: .LBB3_4: +; X32-NEXT: .LBB3_3: ; X32-NEXT: subl %esi, %eax ; X32-NEXT: shrl %eax -; X32-NEXT: imull %edi, %eax +; X32-NEXT: imull %edx, %eax ; X32-NEXT: addl %ecx, %eax ; X32-NEXT: popl %esi -; X32-NEXT: popl %edi ; X32-NEXT: retl %a2 = load i32, i32* %a2_addr %t3 = icmp sgt i32 %a1, %a2 ; signed @@ -242,7 +230,7 @@ define i32 @scalar_i32_signed_mem_mem(i3 ; X64-NEXT: leal -1(%rdx,%rdx), %edx ; X64-NEXT: movl %ecx, %esi ; X64-NEXT: cmovgl %eax, %esi -; X64-NEXT: cmovgel %ecx, %eax +; X64-NEXT: cmovgl %ecx, %eax ; 
X64-NEXT: subl %esi, %eax ; X64-NEXT: shrl %eax ; X64-NEXT: imull %edx, %eax @@ -251,32 +239,28 @@ define i32 @scalar_i32_signed_mem_mem(i3 ; ; X32-LABEL: scalar_i32_signed_mem_mem: ; X32: # %bb.0: -; X32-NEXT: pushl %edi ; X32-NEXT: pushl %esi ; X32-NEXT: movl {{[0-9]+}}(%esp), %eax ; X32-NEXT: movl {{[0-9]+}}(%esp), %ecx ; X32-NEXT: movl (%ecx), %ecx -; X32-NEXT: movl (%eax), %edx -; X32-NEXT: xorl %eax, %eax -; X32-NEXT: cmpl %edx, %ecx -; X32-NEXT: setle %al -; X32-NEXT: movl %edx, %esi -; X32-NEXT: jg .LBB4_2 -; X32-NEXT: # %bb.1: +; X32-NEXT: movl (%eax), %eax +; X32-NEXT: xorl %edx, %edx +; X32-NEXT: cmpl %eax, %ecx +; X32-NEXT: setle %dl +; X32-NEXT: leal -1(%edx,%edx), %edx +; X32-NEXT: jg .LBB4_1 +; X32-NEXT: # %bb.2: ; X32-NEXT: movl %ecx, %esi -; X32-NEXT: .LBB4_2: -; X32-NEXT: leal -1(%eax,%eax), %edi +; X32-NEXT: jmp .LBB4_3 +; X32-NEXT: .LBB4_1: +; X32-NEXT: movl %eax, %esi ; X32-NEXT: movl %ecx, %eax -; X32-NEXT: jge .LBB4_4 -; X32-NEXT: # %bb.3: -; X32-NEXT: movl %edx, %eax -; X32-NEXT: .LBB4_4: +; X32-NEXT: .LBB4_3: ; X32-NEXT: subl %esi, %eax ; X32-NEXT: shrl %eax -; X32-NEXT: imull %edi, %eax +; X32-NEXT: imull %edx, %eax ; X32-NEXT: addl %ecx, %eax ; X32-NEXT: popl %esi -; X32-NEXT: popl %edi ; X32-NEXT: retl %a1 = load i32, i32* %a1_addr %a2 = load i32, i32* %a2_addr @@ -306,7 +290,7 @@ define i64 @scalar_i64_signed_reg_reg(i6 ; X64-NEXT: leaq -1(%rax,%rax), %rax ; X64-NEXT: movq %rdi, %rcx ; X64-NEXT: cmovgq %rsi, %rcx -; X64-NEXT: cmovgeq %rdi, %rsi +; X64-NEXT: cmovgq %rdi, %rsi ; X64-NEXT: subq %rcx, %rsi ; X64-NEXT: shrq %rsi ; X64-NEXT: imulq %rsi, %rax @@ -319,48 +303,38 @@ define i64 @scalar_i64_signed_reg_reg(i6 ; X32-NEXT: pushl %ebx ; X32-NEXT: pushl %edi ; X32-NEXT: pushl %esi -; X32-NEXT: pushl %eax -; X32-NEXT: movl {{[0-9]+}}(%esp), %esi ; X32-NEXT: movl {{[0-9]+}}(%esp), %ecx -; X32-NEXT: movl {{[0-9]+}}(%esp), %edx -; X32-NEXT: movl {{[0-9]+}}(%esp), %ebp -; X32-NEXT: cmpl %esi, %edx -; X32-NEXT: movl %ebp, %eax -; X32-NEXT: 
sbbl %ecx, %eax -; X32-NEXT: movl %edx, %eax -; X32-NEXT: movl $-1, %edi +; X32-NEXT: movl {{[0-9]+}}(%esp), %eax +; X32-NEXT: movl {{[0-9]+}}(%esp), %edi +; X32-NEXT: cmpl %ecx, %eax +; X32-NEXT: movl %edi, %edx +; X32-NEXT: sbbl {{[0-9]+}}(%esp), %edx ; X32-NEXT: movl $-1, %ebx -; X32-NEXT: jl .LBB5_2 -; X32-NEXT: # %bb.1: -; X32-NEXT: xorl %ebx, %ebx -; X32-NEXT: movl $1, %edi -; X32-NEXT: movl %ecx, %ebp -; X32-NEXT: movl %esi, %edx -; X32-NEXT: .LBB5_2: -; X32-NEXT: movl %edi, (%esp) # 4-byte Spill -; X32-NEXT: cmpl %eax, %esi -; X32-NEXT: movl %ecx, %eax +; X32-NEXT: jl .LBB5_1 +; X32-NEXT: # %bb.2: +; X32-NEXT: xorl %ebp, %ebp +; X32-NEXT: movl $1, %ebx +; X32-NEXT: movl {{[0-9]+}}(%esp), %edx +; X32-NEXT: movl %ecx, %esi +; X32-NEXT: jmp .LBB5_3 +; X32-NEXT: .LBB5_1: +; X32-NEXT: movl $-1, %ebp +; X32-NEXT: movl %edi, %edx +; X32-NEXT: movl %eax, %esi ; X32-NEXT: movl {{[0-9]+}}(%esp), %edi -; X32-NEXT: sbbl %edi, %eax -; X32-NEXT: movl %esi, %eax -; X32-NEXT: jge .LBB5_4 -; X32-NEXT: # %bb.3: -; X32-NEXT: movl %edi, %ecx -; X32-NEXT: movl {{[0-9]+}}(%esp), %eax -; X32-NEXT: .LBB5_4: -; X32-NEXT: subl %edx, %eax -; X32-NEXT: sbbl %ebp, %ecx -; X32-NEXT: shrdl $1, %ecx, %eax -; X32-NEXT: imull %eax, %ebx -; X32-NEXT: movl (%esp), %esi # 4-byte Reload -; X32-NEXT: mull %esi -; X32-NEXT: addl %ebx, %edx -; X32-NEXT: shrl %ecx -; X32-NEXT: imull %esi, %ecx -; X32-NEXT: addl %ecx, %edx -; X32-NEXT: addl {{[0-9]+}}(%esp), %eax +; X32-NEXT: movl %ecx, %eax +; X32-NEXT: .LBB5_3: +; X32-NEXT: subl %esi, %eax +; X32-NEXT: sbbl %edx, %edi +; X32-NEXT: shrdl $1, %edi, %eax +; X32-NEXT: imull %eax, %ebp +; X32-NEXT: mull %ebx +; X32-NEXT: addl %ebp, %edx +; X32-NEXT: shrl %edi +; X32-NEXT: imull %ebx, %edi +; X32-NEXT: addl %edi, %edx +; X32-NEXT: addl %ecx, %eax ; X32-NEXT: adcl {{[0-9]+}}(%esp), %edx -; X32-NEXT: addl $4, %esp ; X32-NEXT: popl %esi ; X32-NEXT: popl %edi ; X32-NEXT: popl %ebx @@ -459,7 +433,7 @@ define i64 @scalar_i64_signed_mem_reg(i6 ; X64-NEXT: leaq 
-1(%rax,%rax), %rax ; X64-NEXT: movq %rcx, %rdx ; X64-NEXT: cmovgq %rsi, %rdx -; X64-NEXT: cmovgeq %rcx, %rsi +; X64-NEXT: cmovgq %rcx, %rsi ; X64-NEXT: subq %rdx, %rsi ; X64-NEXT: shrq %rsi ; X64-NEXT: imulq %rsi, %rax @@ -473,48 +447,40 @@ define i64 @scalar_i64_signed_mem_reg(i6 ; X32-NEXT: pushl %edi ; X32-NEXT: pushl %esi ; X32-NEXT: pushl %eax -; X32-NEXT: movl {{[0-9]+}}(%esp), %ecx -; X32-NEXT: movl {{[0-9]+}}(%esp), %edx ; X32-NEXT: movl {{[0-9]+}}(%esp), %eax -; X32-NEXT: movl (%eax), %esi -; X32-NEXT: movl 4(%eax), %ebp -; X32-NEXT: cmpl %esi, %ecx -; X32-NEXT: movl %edx, %eax -; X32-NEXT: sbbl %ebp, %eax -; X32-NEXT: movl $-1, %eax +; X32-NEXT: movl {{[0-9]+}}(%esp), %edi +; X32-NEXT: movl {{[0-9]+}}(%esp), %ecx +; X32-NEXT: movl (%ecx), %esi +; X32-NEXT: movl 4(%ecx), %ecx +; X32-NEXT: cmpl %esi, %eax +; X32-NEXT: movl %edi, %edx +; X32-NEXT: sbbl %ecx, %edx ; X32-NEXT: movl $-1, %ebx +; X32-NEXT: jl .LBB7_1 +; X32-NEXT: # %bb.2: +; X32-NEXT: xorl %ebp, %ebp +; X32-NEXT: movl $1, %ebx +; X32-NEXT: movl %ecx, (%esp) # 4-byte Spill +; X32-NEXT: movl %esi, %edx +; X32-NEXT: jmp .LBB7_3 +; X32-NEXT: .LBB7_1: +; X32-NEXT: movl $-1, %ebp +; X32-NEXT: movl %edi, (%esp) # 4-byte Spill +; X32-NEXT: movl %eax, %edx ; X32-NEXT: movl %ecx, %edi -; X32-NEXT: jl .LBB7_2 -; X32-NEXT: # %bb.1: -; X32-NEXT: xorl %ebx, %ebx -; X32-NEXT: movl $1, %eax -; X32-NEXT: movl %ebp, %edx -; X32-NEXT: movl %esi, %edi -; X32-NEXT: .LBB7_2: -; X32-NEXT: movl %eax, (%esp) # 4-byte Spill -; X32-NEXT: cmpl %ecx, %esi -; X32-NEXT: movl %ebp, %eax -; X32-NEXT: movl {{[0-9]+}}(%esp), %ecx -; X32-NEXT: sbbl %ecx, %eax -; X32-NEXT: movl %ebp, %ecx ; X32-NEXT: movl %esi, %eax -; X32-NEXT: jge .LBB7_4 -; X32-NEXT: # %bb.3: -; X32-NEXT: movl {{[0-9]+}}(%esp), %ecx -; X32-NEXT: movl {{[0-9]+}}(%esp), %eax -; X32-NEXT: .LBB7_4: -; X32-NEXT: subl %edi, %eax -; X32-NEXT: sbbl %edx, %ecx -; X32-NEXT: shrdl $1, %ecx, %eax -; X32-NEXT: imull %eax, %ebx -; X32-NEXT: movl (%esp), %edi # 4-byte Reload 
-; X32-NEXT: mull %edi -; X32-NEXT: addl %ebx, %edx -; X32-NEXT: shrl %ecx -; X32-NEXT: imull %edi, %ecx -; X32-NEXT: addl %ecx, %edx +; X32-NEXT: .LBB7_3: +; X32-NEXT: subl %edx, %eax +; X32-NEXT: sbbl (%esp), %edi # 4-byte Folded Reload +; X32-NEXT: shrdl $1, %edi, %eax +; X32-NEXT: imull %eax, %ebp +; X32-NEXT: mull %ebx +; X32-NEXT: addl %ebp, %edx +; X32-NEXT: shrl %edi +; X32-NEXT: imull %ebx, %edi +; X32-NEXT: addl %edi, %edx ; X32-NEXT: addl %esi, %eax -; X32-NEXT: adcl %ebp, %edx +; X32-NEXT: adcl %ecx, %edx ; X32-NEXT: addl $4, %esp ; X32-NEXT: popl %esi ; X32-NEXT: popl %edi @@ -543,7 +509,7 @@ define i64 @scalar_i64_signed_reg_mem(i6 ; X64-NEXT: leaq -1(%rcx,%rcx), %rcx ; X64-NEXT: movq %rdi, %rdx ; X64-NEXT: cmovgq %rax, %rdx -; X64-NEXT: cmovgeq %rdi, %rax +; X64-NEXT: cmovgq %rdi, %rax ; X64-NEXT: subq %rdx, %rax ; X64-NEXT: shrq %rax ; X64-NEXT: imulq %rcx, %rax @@ -556,49 +522,39 @@ define i64 @scalar_i64_signed_reg_mem(i6 ; X32-NEXT: pushl %ebx ; X32-NEXT: pushl %edi ; X32-NEXT: pushl %esi -; X32-NEXT: subl $8, %esp -; X32-NEXT: movl {{[0-9]+}}(%esp), %esi ; X32-NEXT: movl {{[0-9]+}}(%esp), %ecx -; X32-NEXT: movl {{[0-9]+}}(%esp), %eax -; X32-NEXT: movl (%eax), %edx -; X32-NEXT: movl 4(%eax), %ebp -; X32-NEXT: cmpl %esi, %edx -; X32-NEXT: movl %ebp, %eax -; X32-NEXT: sbbl %ecx, %eax -; X32-NEXT: movl $-1, %eax +; X32-NEXT: movl {{[0-9]+}}(%esp), %edx +; X32-NEXT: movl (%edx), %eax +; X32-NEXT: movl 4(%edx), %edi +; X32-NEXT: cmpl %ecx, %eax +; X32-NEXT: movl %edi, %edx +; X32-NEXT: sbbl {{[0-9]+}}(%esp), %edx ; X32-NEXT: movl $-1, %ebx -; X32-NEXT: movl %ebp, (%esp) # 4-byte Spill -; X32-NEXT: movl %edx, %edi -; X32-NEXT: jl .LBB8_2 -; X32-NEXT: # %bb.1: -; X32-NEXT: xorl %ebx, %ebx -; X32-NEXT: movl $1, %eax -; X32-NEXT: movl %ecx, (%esp) # 4-byte Spill -; X32-NEXT: movl %esi, %edi -; X32-NEXT: .LBB8_2: -; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill -; X32-NEXT: cmpl %edx, %esi +; X32-NEXT: jl .LBB8_1 +; X32-NEXT: # %bb.2: +; 
X32-NEXT: xorl %ebp, %ebp +; X32-NEXT: movl $1, %ebx +; X32-NEXT: movl {{[0-9]+}}(%esp), %edx +; X32-NEXT: movl %ecx, %esi +; X32-NEXT: jmp .LBB8_3 +; X32-NEXT: .LBB8_1: +; X32-NEXT: movl $-1, %ebp +; X32-NEXT: movl %edi, %edx +; X32-NEXT: movl %eax, %esi +; X32-NEXT: movl {{[0-9]+}}(%esp), %edi ; X32-NEXT: movl %ecx, %eax -; X32-NEXT: sbbl %ebp, %eax -; X32-NEXT: jge .LBB8_4 -; X32-NEXT: # %bb.3: -; X32-NEXT: movl %ebp, %ecx -; X32-NEXT: movl %edx, %esi -; X32-NEXT: .LBB8_4: -; X32-NEXT: subl %edi, %esi -; X32-NEXT: sbbl (%esp), %ecx # 4-byte Folded Reload -; X32-NEXT: shrdl $1, %ecx, %esi -; X32-NEXT: imull %esi, %ebx -; X32-NEXT: movl %esi, %eax -; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %esi # 4-byte Reload -; X32-NEXT: mull %esi -; X32-NEXT: addl %ebx, %edx -; X32-NEXT: shrl %ecx -; X32-NEXT: imull %esi, %ecx -; X32-NEXT: addl %ecx, %edx -; X32-NEXT: addl {{[0-9]+}}(%esp), %eax +; X32-NEXT: .LBB8_3: +; X32-NEXT: subl %esi, %eax +; X32-NEXT: sbbl %edx, %edi +; X32-NEXT: shrdl $1, %edi, %eax +; X32-NEXT: imull %eax, %ebp +; X32-NEXT: mull %ebx +; X32-NEXT: addl %ebp, %edx +; X32-NEXT: shrl %edi +; X32-NEXT: imull %ebx, %edi +; X32-NEXT: addl %edi, %edx +; X32-NEXT: addl %ecx, %eax ; X32-NEXT: adcl {{[0-9]+}}(%esp), %edx -; X32-NEXT: addl $8, %esp ; X32-NEXT: popl %esi ; X32-NEXT: popl %edi ; X32-NEXT: popl %ebx @@ -627,7 +583,7 @@ define i64 @scalar_i64_signed_mem_mem(i6 ; X64-NEXT: leaq -1(%rdx,%rdx), %rdx ; X64-NEXT: movq %rcx, %rsi ; X64-NEXT: cmovgq %rax, %rsi -; X64-NEXT: cmovgeq %rcx, %rax +; X64-NEXT: cmovgq %rcx, %rax ; X64-NEXT: subq %rsi, %rax ; X64-NEXT: shrq %rax ; X64-NEXT: imulq %rdx, %rax @@ -640,52 +596,43 @@ define i64 @scalar_i64_signed_mem_mem(i6 ; X32-NEXT: pushl %ebx ; X32-NEXT: pushl %edi ; X32-NEXT: pushl %esi -; X32-NEXT: subl $12, %esp +; X32-NEXT: pushl %eax +; X32-NEXT: movl {{[0-9]+}}(%esp), %edx ; X32-NEXT: movl {{[0-9]+}}(%esp), %eax -; X32-NEXT: movl {{[0-9]+}}(%esp), %ecx -; X32-NEXT: movl (%ecx), %esi -; X32-NEXT: movl 4(%ecx), 
%edi -; X32-NEXT: movl (%eax), %edx -; X32-NEXT: movl 4(%eax), %ebp -; X32-NEXT: cmpl %esi, %edx -; X32-NEXT: movl %ebp, %eax -; X32-NEXT: sbbl %edi, %eax -; X32-NEXT: movl $-1, %eax +; X32-NEXT: movl (%eax), %esi +; X32-NEXT: movl 4(%eax), %ecx +; X32-NEXT: movl (%edx), %eax +; X32-NEXT: movl 4(%edx), %edi +; X32-NEXT: cmpl %esi, %eax +; X32-NEXT: movl %edi, %edx +; X32-NEXT: sbbl %ecx, %edx ; X32-NEXT: movl $-1, %ebx -; X32-NEXT: movl %ebp, %ecx -; X32-NEXT: movl %edx, (%esp) # 4-byte Spill -; X32-NEXT: jl .LBB9_2 -; X32-NEXT: # %bb.1: -; X32-NEXT: xorl %ebx, %ebx -; X32-NEXT: movl $1, %eax -; X32-NEXT: movl %edi, %ecx -; X32-NEXT: movl %esi, (%esp) # 4-byte Spill -; X32-NEXT: .LBB9_2: -; X32-NEXT: movl %ecx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill -; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill -; X32-NEXT: cmpl %edx, %esi -; X32-NEXT: movl %edi, %eax -; X32-NEXT: sbbl %ebp, %eax -; X32-NEXT: movl %edi, %ecx +; X32-NEXT: jl .LBB9_1 +; X32-NEXT: # %bb.2: +; X32-NEXT: xorl %ebp, %ebp +; X32-NEXT: movl $1, %ebx +; X32-NEXT: movl %ecx, (%esp) # 4-byte Spill +; X32-NEXT: movl %esi, %edx +; X32-NEXT: jmp .LBB9_3 +; X32-NEXT: .LBB9_1: +; X32-NEXT: movl $-1, %ebp +; X32-NEXT: movl %edi, (%esp) # 4-byte Spill +; X32-NEXT: movl %eax, %edx +; X32-NEXT: movl %ecx, %edi ; X32-NEXT: movl %esi, %eax -; X32-NEXT: jge .LBB9_4 -; X32-NEXT: # %bb.3: -; X32-NEXT: movl %ebp, %ecx -; X32-NEXT: movl %edx, %eax -; X32-NEXT: .LBB9_4: -; X32-NEXT: subl (%esp), %eax # 4-byte Folded Reload -; X32-NEXT: sbbl {{[-0-9]+}}(%e{{[sb]}}p), %ecx # 4-byte Folded Reload -; X32-NEXT: shrdl $1, %ecx, %eax -; X32-NEXT: imull %eax, %ebx -; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %ebp # 4-byte Reload -; X32-NEXT: mull %ebp -; X32-NEXT: addl %ebx, %edx -; X32-NEXT: shrl %ecx -; X32-NEXT: imull %ebp, %ecx -; X32-NEXT: addl %ecx, %edx +; X32-NEXT: .LBB9_3: +; X32-NEXT: subl %edx, %eax +; X32-NEXT: sbbl (%esp), %edi # 4-byte Folded Reload +; X32-NEXT: shrdl $1, %edi, %eax +; X32-NEXT: imull 
%eax, %ebp +; X32-NEXT: mull %ebx +; X32-NEXT: addl %ebp, %edx +; X32-NEXT: shrl %edi +; X32-NEXT: imull %ebx, %edi +; X32-NEXT: addl %edi, %edx ; X32-NEXT: addl %esi, %eax -; X32-NEXT: adcl %edi, %edx -; X32-NEXT: addl $12, %esp +; X32-NEXT: adcl %ecx, %edx +; X32-NEXT: addl $4, %esp ; X32-NEXT: popl %esi ; X32-NEXT: popl %edi ; X32-NEXT: popl %ebx @@ -719,7 +666,7 @@ define i16 @scalar_i16_signed_reg_reg(i1 ; X64-NEXT: leal -1(%rax,%rax), %ecx ; X64-NEXT: movl %edi, %eax ; X64-NEXT: cmovgl %esi, %eax -; X64-NEXT: cmovgel %edi, %esi +; X64-NEXT: cmovgl %edi, %esi ; X64-NEXT: subl %eax, %esi ; X64-NEXT: movzwl %si, %eax ; X64-NEXT: shrl %eax @@ -730,32 +677,28 @@ define i16 @scalar_i16_signed_reg_reg(i1 ; ; X32-LABEL: scalar_i16_signed_reg_reg: ; X32: # %bb.0: -; X32-NEXT: pushl %edi ; X32-NEXT: pushl %esi ; X32-NEXT: movl {{[0-9]+}}(%esp), %eax ; X32-NEXT: movl {{[0-9]+}}(%esp), %ecx ; X32-NEXT: xorl %edx, %edx ; X32-NEXT: cmpw %ax, %cx ; X32-NEXT: setle %dl -; X32-NEXT: movl %eax, %esi -; X32-NEXT: jg .LBB10_2 -; X32-NEXT: # %bb.1: -; X32-NEXT: movl %ecx, %esi -; X32-NEXT: .LBB10_2: ; X32-NEXT: leal -1(%edx,%edx), %edx -; X32-NEXT: movl %ecx, %edi -; X32-NEXT: jge .LBB10_4 -; X32-NEXT: # %bb.3: -; X32-NEXT: movl %eax, %edi -; X32-NEXT: .LBB10_4: -; X32-NEXT: subl %esi, %edi -; X32-NEXT: movzwl %di, %eax +; X32-NEXT: jg .LBB10_1 +; X32-NEXT: # %bb.2: +; X32-NEXT: movl %ecx, %esi +; X32-NEXT: jmp .LBB10_3 +; X32-NEXT: .LBB10_1: +; X32-NEXT: movl %eax, %esi +; X32-NEXT: movl %ecx, %eax +; X32-NEXT: .LBB10_3: +; X32-NEXT: subl %esi, %eax +; X32-NEXT: movzwl %ax, %eax ; X32-NEXT: shrl %eax ; X32-NEXT: imull %edx, %eax ; X32-NEXT: addl %ecx, %eax ; X32-NEXT: # kill: def $ax killed $ax killed $eax ; X32-NEXT: popl %esi -; X32-NEXT: popl %edi ; X32-NEXT: retl %t3 = icmp sgt i16 %a1, %a2 ; signed %t4 = select i1 %t3, i16 -1, i16 1 @@ -834,7 +777,7 @@ define i16 @scalar_i16_signed_mem_reg(i1 ; X64-NEXT: leal -1(%rax,%rax), %edx ; X64-NEXT: movl %ecx, %eax ; X64-NEXT: 
cmovgl %esi, %eax -; X64-NEXT: cmovgel %ecx, %esi +; X64-NEXT: cmovgl %ecx, %esi ; X64-NEXT: subl %eax, %esi ; X64-NEXT: movzwl %si, %eax ; X64-NEXT: shrl %eax @@ -845,7 +788,6 @@ define i16 @scalar_i16_signed_mem_reg(i1 ; ; X32-LABEL: scalar_i16_signed_mem_reg: ; X32: # %bb.0: -; X32-NEXT: pushl %edi ; X32-NEXT: pushl %esi ; X32-NEXT: movl {{[0-9]+}}(%esp), %eax ; X32-NEXT: movl {{[0-9]+}}(%esp), %ecx @@ -853,25 +795,22 @@ define i16 @scalar_i16_signed_mem_reg(i1 ; X32-NEXT: xorl %edx, %edx ; X32-NEXT: cmpw %ax, %cx ; X32-NEXT: setle %dl -; X32-NEXT: movl %eax, %esi -; X32-NEXT: jg .LBB12_2 -; X32-NEXT: # %bb.1: -; X32-NEXT: movl %ecx, %esi -; X32-NEXT: .LBB12_2: ; X32-NEXT: leal -1(%edx,%edx), %edx -; X32-NEXT: movl %ecx, %edi -; X32-NEXT: jge .LBB12_4 -; X32-NEXT: # %bb.3: -; X32-NEXT: movl %eax, %edi -; X32-NEXT: .LBB12_4: -; X32-NEXT: subl %esi, %edi -; X32-NEXT: movzwl %di, %eax +; X32-NEXT: jg .LBB12_1 +; X32-NEXT: # %bb.2: +; X32-NEXT: movl %ecx, %esi +; X32-NEXT: jmp .LBB12_3 +; X32-NEXT: .LBB12_1: +; X32-NEXT: movl %eax, %esi +; X32-NEXT: movl %ecx, %eax +; X32-NEXT: .LBB12_3: +; X32-NEXT: subl %esi, %eax +; X32-NEXT: movzwl %ax, %eax ; X32-NEXT: shrl %eax ; X32-NEXT: imull %edx, %eax ; X32-NEXT: addl %ecx, %eax ; X32-NEXT: # kill: def $ax killed $ax killed $eax ; X32-NEXT: popl %esi -; X32-NEXT: popl %edi ; X32-NEXT: retl %a1 = load i16, i16* %a1_addr %t3 = icmp sgt i16 %a1, %a2 ; signed @@ -895,7 +834,7 @@ define i16 @scalar_i16_signed_reg_mem(i1 ; X64-NEXT: leal -1(%rcx,%rcx), %ecx ; X64-NEXT: movl %edi, %edx ; X64-NEXT: cmovgl %eax, %edx -; X64-NEXT: cmovgel %edi, %eax +; X64-NEXT: cmovgl %edi, %eax ; X64-NEXT: subl %edx, %eax ; X64-NEXT: movzwl %ax, %eax ; X64-NEXT: shrl %eax @@ -906,7 +845,6 @@ define i16 @scalar_i16_signed_reg_mem(i1 ; ; X32-LABEL: scalar_i16_signed_reg_mem: ; X32: # %bb.0: -; X32-NEXT: pushl %edi ; X32-NEXT: pushl %esi ; X32-NEXT: movl {{[0-9]+}}(%esp), %ecx ; X32-NEXT: movl {{[0-9]+}}(%esp), %eax @@ -914,25 +852,22 @@ define i16 
@scalar_i16_signed_reg_mem(i1 ; X32-NEXT: xorl %edx, %edx ; X32-NEXT: cmpw %ax, %cx ; X32-NEXT: setle %dl -; X32-NEXT: movl %eax, %esi -; X32-NEXT: jg .LBB13_2 -; X32-NEXT: # %bb.1: -; X32-NEXT: movl %ecx, %esi -; X32-NEXT: .LBB13_2: ; X32-NEXT: leal -1(%edx,%edx), %edx -; X32-NEXT: movl %ecx, %edi -; X32-NEXT: jge .LBB13_4 -; X32-NEXT: # %bb.3: -; X32-NEXT: movl %eax, %edi -; X32-NEXT: .LBB13_4: -; X32-NEXT: subl %esi, %edi -; X32-NEXT: movzwl %di, %eax +; X32-NEXT: jg .LBB13_1 +; X32-NEXT: # %bb.2: +; X32-NEXT: movl %ecx, %esi +; X32-NEXT: jmp .LBB13_3 +; X32-NEXT: .LBB13_1: +; X32-NEXT: movl %eax, %esi +; X32-NEXT: movl %ecx, %eax +; X32-NEXT: .LBB13_3: +; X32-NEXT: subl %esi, %eax +; X32-NEXT: movzwl %ax, %eax ; X32-NEXT: shrl %eax ; X32-NEXT: imull %edx, %eax ; X32-NEXT: addl %ecx, %eax ; X32-NEXT: # kill: def $ax killed $ax killed $eax ; X32-NEXT: popl %esi -; X32-NEXT: popl %edi ; X32-NEXT: retl %a2 = load i16, i16* %a2_addr %t3 = icmp sgt i16 %a1, %a2 ; signed @@ -957,7 +892,7 @@ define i16 @scalar_i16_signed_mem_mem(i1 ; X64-NEXT: leal -1(%rdx,%rdx), %edx ; X64-NEXT: movl %ecx, %esi ; X64-NEXT: cmovgl %eax, %esi -; X64-NEXT: cmovgel %ecx, %eax +; X64-NEXT: cmovgl %ecx, %eax ; X64-NEXT: subl %esi, %eax ; X64-NEXT: movzwl %ax, %eax ; X64-NEXT: shrl %eax @@ -968,7 +903,6 @@ define i16 @scalar_i16_signed_mem_mem(i1 ; ; X32-LABEL: scalar_i16_signed_mem_mem: ; X32: # %bb.0: -; X32-NEXT: pushl %edi ; X32-NEXT: pushl %esi ; X32-NEXT: movl {{[0-9]+}}(%esp), %eax ; X32-NEXT: movl {{[0-9]+}}(%esp), %ecx @@ -977,25 +911,22 @@ define i16 @scalar_i16_signed_mem_mem(i1 ; X32-NEXT: xorl %edx, %edx ; X32-NEXT: cmpw %ax, %cx ; X32-NEXT: setle %dl -; X32-NEXT: movl %eax, %esi -; X32-NEXT: jg .LBB14_2 -; X32-NEXT: # %bb.1: -; X32-NEXT: movl %ecx, %esi -; X32-NEXT: .LBB14_2: ; X32-NEXT: leal -1(%edx,%edx), %edx -; X32-NEXT: movl %ecx, %edi -; X32-NEXT: jge .LBB14_4 -; X32-NEXT: # %bb.3: -; X32-NEXT: movl %eax, %edi -; X32-NEXT: .LBB14_4: -; X32-NEXT: subl %esi, %edi -; 
X32-NEXT: movzwl %di, %eax +; X32-NEXT: jg .LBB14_1 +; X32-NEXT: # %bb.2: +; X32-NEXT: movl %ecx, %esi +; X32-NEXT: jmp .LBB14_3 +; X32-NEXT: .LBB14_1: +; X32-NEXT: movl %eax, %esi +; X32-NEXT: movl %ecx, %eax +; X32-NEXT: .LBB14_3: +; X32-NEXT: subl %esi, %eax +; X32-NEXT: movzwl %ax, %eax ; X32-NEXT: shrl %eax ; X32-NEXT: imull %edx, %eax ; X32-NEXT: addl %ecx, %eax ; X32-NEXT: # kill: def $ax killed $ax killed $eax ; X32-NEXT: popl %esi -; X32-NEXT: popl %edi ; X32-NEXT: retl %a1 = load i16, i16* %a1_addr %a2 = load i16, i16* %a2_addr @@ -1024,7 +955,7 @@ define i8 @scalar_i8_signed_reg_reg(i8 % ; X64-NEXT: setle %cl ; X64-NEXT: movl %edi, %edx ; X64-NEXT: cmovgl %esi, %edx -; X64-NEXT: cmovgel %edi, %eax +; X64-NEXT: cmovgl %edi, %eax ; X64-NEXT: addb %cl, %cl ; X64-NEXT: decb %cl ; X64-NEXT: subb %dl, %al @@ -1036,21 +967,19 @@ define i8 @scalar_i8_signed_reg_reg(i8 % ; ; X32-LABEL: scalar_i8_signed_reg_reg: ; X32: # %bb.0: -; X32-NEXT: movb {{[0-9]+}}(%esp), %ah +; X32-NEXT: movb {{[0-9]+}}(%esp), %al ; X32-NEXT: movb {{[0-9]+}}(%esp), %cl -; X32-NEXT: cmpb %ah, %cl +; X32-NEXT: cmpb %al, %cl ; X32-NEXT: setle %dl -; X32-NEXT: movb %ah, %ch -; X32-NEXT: jg .LBB15_2 -; X32-NEXT: # %bb.1: -; X32-NEXT: movb %cl, %ch -; X32-NEXT: .LBB15_2: +; X32-NEXT: jg .LBB15_1 +; X32-NEXT: # %bb.2: +; X32-NEXT: movb %cl, %ah +; X32-NEXT: jmp .LBB15_3 +; X32-NEXT: .LBB15_1: +; X32-NEXT: movb %al, %ah ; X32-NEXT: movb %cl, %al -; X32-NEXT: jge .LBB15_4 -; X32-NEXT: # %bb.3: -; X32-NEXT: movb %ah, %al -; X32-NEXT: .LBB15_4: -; X32-NEXT: subb %ch, %al +; X32-NEXT: .LBB15_3: +; X32-NEXT: subb %ah, %al ; X32-NEXT: addb %dl, %dl ; X32-NEXT: decb %dl ; X32-NEXT: shrb %al @@ -1129,7 +1058,7 @@ define i8 @scalar_i8_signed_mem_reg(i8* ; X64-NEXT: movl %ecx, %edi ; X64-NEXT: cmovgl %esi, %edi ; X64-NEXT: movl %ecx, %eax -; X64-NEXT: cmovll %esi, %eax +; X64-NEXT: cmovlel %esi, %eax ; X64-NEXT: addb %dl, %dl ; X64-NEXT: decb %dl ; X64-NEXT: subb %dil, %al @@ -1141,22 +1070,20 @@ define i8 
@scalar_i8_signed_mem_reg(i8* ; ; X32-LABEL: scalar_i8_signed_mem_reg: ; X32: # %bb.0: -; X32-NEXT: movb {{[0-9]+}}(%esp), %ah +; X32-NEXT: movb {{[0-9]+}}(%esp), %al ; X32-NEXT: movl {{[0-9]+}}(%esp), %ecx ; X32-NEXT: movb (%ecx), %cl -; X32-NEXT: cmpb %ah, %cl +; X32-NEXT: cmpb %al, %cl ; X32-NEXT: setle %dl -; X32-NEXT: movb %ah, %ch -; X32-NEXT: jg .LBB17_2 -; X32-NEXT: # %bb.1: -; X32-NEXT: movb %cl, %ch -; X32-NEXT: .LBB17_2: +; X32-NEXT: jg .LBB17_1 +; X32-NEXT: # %bb.2: +; X32-NEXT: movb %cl, %ah +; X32-NEXT: jmp .LBB17_3 +; X32-NEXT: .LBB17_1: +; X32-NEXT: movb %al, %ah ; X32-NEXT: movb %cl, %al -; X32-NEXT: jge .LBB17_4 -; X32-NEXT: # %bb.3: -; X32-NEXT: movb %ah, %al -; X32-NEXT: .LBB17_4: -; X32-NEXT: subb %ch, %al +; X32-NEXT: .LBB17_3: +; X32-NEXT: subb %ah, %al ; X32-NEXT: addb %dl, %dl ; X32-NEXT: decb %dl ; X32-NEXT: shrb %al @@ -1183,7 +1110,7 @@ define i8 @scalar_i8_signed_reg_mem(i8 % ; X64-NEXT: setle %cl ; X64-NEXT: movl %edi, %edx ; X64-NEXT: cmovgl %eax, %edx -; X64-NEXT: cmovgel %edi, %eax +; X64-NEXT: cmovgl %edi, %eax ; X64-NEXT: addb %cl, %cl ; X64-NEXT: decb %cl ; X64-NEXT: subb %dl, %al @@ -1197,20 +1124,18 @@ define i8 @scalar_i8_signed_reg_mem(i8 % ; X32: # %bb.0: ; X32-NEXT: movb {{[0-9]+}}(%esp), %cl ; X32-NEXT: movl {{[0-9]+}}(%esp), %eax -; X32-NEXT: movb (%eax), %ah -; X32-NEXT: cmpb %ah, %cl +; X32-NEXT: movb (%eax), %al +; X32-NEXT: cmpb %al, %cl ; X32-NEXT: setle %dl -; X32-NEXT: movb %ah, %ch -; X32-NEXT: jg .LBB18_2 -; X32-NEXT: # %bb.1: -; X32-NEXT: movb %cl, %ch -; X32-NEXT: .LBB18_2: +; X32-NEXT: jg .LBB18_1 +; X32-NEXT: # %bb.2: +; X32-NEXT: movb %cl, %ah +; X32-NEXT: jmp .LBB18_3 +; X32-NEXT: .LBB18_1: +; X32-NEXT: movb %al, %ah ; X32-NEXT: movb %cl, %al -; X32-NEXT: jge .LBB18_4 -; X32-NEXT: # %bb.3: -; X32-NEXT: movb %ah, %al -; X32-NEXT: .LBB18_4: -; X32-NEXT: subb %ch, %al +; X32-NEXT: .LBB18_3: +; X32-NEXT: subb %ah, %al ; X32-NEXT: addb %dl, %dl ; X32-NEXT: decb %dl ; X32-NEXT: shrb %al @@ -1238,7 +1163,7 @@ 
define i8 @scalar_i8_signed_mem_mem(i8* ; X64-NEXT: setle %dl ; X64-NEXT: movl %ecx, %esi ; X64-NEXT: cmovgl %eax, %esi -; X64-NEXT: cmovgel %ecx, %eax +; X64-NEXT: cmovgl %ecx, %eax ; X64-NEXT: addb %dl, %dl ; X64-NEXT: decb %dl ; X64-NEXT: subb %sil, %al @@ -1253,20 +1178,18 @@ define i8 @scalar_i8_signed_mem_mem(i8* ; X32-NEXT: movl {{[0-9]+}}(%esp), %eax ; X32-NEXT: movl {{[0-9]+}}(%esp), %ecx ; X32-NEXT: movb (%ecx), %cl -; X32-NEXT: movb (%eax), %ah -; X32-NEXT: cmpb %ah, %cl +; X32-NEXT: movb (%eax), %al +; X32-NEXT: cmpb %al, %cl ; X32-NEXT: setle %dl -; X32-NEXT: movb %ah, %ch -; X32-NEXT: jg .LBB19_2 -; X32-NEXT: # %bb.1: -; X32-NEXT: movb %cl, %ch -; X32-NEXT: .LBB19_2: +; X32-NEXT: jg .LBB19_1 +; X32-NEXT: # %bb.2: +; X32-NEXT: movb %cl, %ah +; X32-NEXT: jmp .LBB19_3 +; X32-NEXT: .LBB19_1: +; X32-NEXT: movb %al, %ah ; X32-NEXT: movb %cl, %al -; X32-NEXT: jge .LBB19_4 -; X32-NEXT: # %bb.3: -; X32-NEXT: movb %ah, %al -; X32-NEXT: .LBB19_4: -; X32-NEXT: subb %ch, %al +; X32-NEXT: .LBB19_3: +; X32-NEXT: subb %ah, %al ; X32-NEXT: addb %dl, %dl ; X32-NEXT: decb %dl ; X32-NEXT: shrb %al From llvm-commits at lists.llvm.org Sun Oct 13 00:11:52 2019 From: llvm-commits at lists.llvm.org (Roman Lebedev via Phabricator via llvm-commits) Date: Sun, 13 Oct 2019 07:11:52 +0000 (UTC) Subject: [PATCH] D29011: [IR] Add Freeze instruction In-Reply-To: References: Message-ID: lebedev.ri marked an inline comment as done. lebedev.ri added a comment. In D29011#1707396 , @aqjune wrote: > In D29011#1707278 , @lebedev.ri wrote: > > > Should you add `llvm::Freeze` here by inheriting from `UnaryOperator` to make `isa(Op)` possible? > > > Couldn't you kindly point which place is good to update? Right after where the `llvm::UnaryOperator` is defined in `llvm/include/llvm/IR/InstrTypes.h` i think. See `OverflowingBinaryOperator` for an example. 
================ Comment at: lib/CodeGen/SelectionDAG/SelectionDAGBuilder.h:671 void visitFNeg(const User &I) { visitUnary(I, ISD::FNEG); } + void visitFreeze(const User &I); ---------------- aqjune wrote: > jdoerfert wrote: > > The lady of the lake says this should be: > > `void visitFreeze(const User &I) { visitUnary(I, ISD::FREEZE); }` > > If you have reason not to do it this way, also replace `visitUnary` with `visitFNeg`, though I'd prefer not to. > ISD::FREEZE will be added in the next patch - https://reviews.llvm.org/D29014 . > Do you want to move the definition of ISD::FREEZE to this patch? @lebedev.ri Oh good point, let's leave as is. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D29011/new/ https://reviews.llvm.org/D29011 From llvm-commits at lists.llvm.org Sun Oct 13 00:30:01 2019 From: llvm-commits at lists.llvm.org (Roman Lebedev via Phabricator via llvm-commits) Date: Sun, 13 Oct 2019 07:30:01 +0000 (UTC) Subject: [PATCH] D68672: [APInt] Rounding right-shifts In-Reply-To: References: Message-ID: <20095e8ee2a8fe503d4b2911d847c738@localhost.localdomain> lebedev.ri added a comment. In D68672#1707312, @nikic wrote: > > I'd like to try to extend ConstantRange::makeGuaranteedNoWrapRegion() > > to deal with Instruction::Shl so I believe I need rounding right shifts. > > I don't think rounding shifts are strictly necessary for this purpose, > the correct behavior should fall out when applying the normal lshr/ashr operations to -1 / signed_min / signed_max. I'm not sure I can parse that; what do you mean by "fall out"?
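One possible reading of the quoted suggestion, offered as an assumption rather than as what either reviewer had in mind: for a fixed shift amount, the operands for which `shl` does not wrap are exactly those bounded by the plain shifted extremes. The sketch below checks that exhaustively for 8-bit values in standalone C++; it uses no LLVM API, and all names are hypothetical.

```cpp
#include <cstdint>

// Arithmetic shift right, written as floor division to stay portable.
static int ashr(int v, unsigned s) {
  int d = 1 << s;
  int q = v / d;
  if (v % d != 0 && v < 0)
    --q;  // round toward negative infinity, as ashr does
  return q;
}

// Checks, for every 8-bit x and every shift 0..7, that "x << s does not
// wrap" is equivalent to x lying within the normally shifted extremes.
bool shiftedExtremesBoundNoWrapShl() {
  for (unsigned s = 0; s < 8; ++s) {
    unsigned umax = 0xFFu >> s;  // lshr of all-ones (-1)
    int smin = ashr(-128, s);    // ashr of signed_min
    int smax = ashr(127, s);     // ashr of signed_max
    for (int x = 0; x <= 255; ++x) {
      bool noUnsignedWrap = (static_cast<unsigned>(x) << s) <= 0xFFu;
      if (noUnsignedWrap != (static_cast<unsigned>(x) <= umax))
        return false;
    }
    for (int x = -128; x <= 127; ++x) {
      int wide = x * (1 << s);   // exact result in a wider type
      bool noSignedWrap = wide >= -128 && wide <= 127;
      if (noSignedWrap != (x >= smin && x <= smax))
        return false;
    }
  }
  return true;
}
```

If this reading is right, no rounding variant of the shift is needed for these bounds, at least at 8 bits; whether it carries over to the actual ConstantRange code is not something this sketch establishes.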
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68672/new/ https://reviews.llvm.org/D68672 From llvm-commits at lists.llvm.org Sun Oct 13 00:48:14 2019 From: llvm-commits at lists.llvm.org (Chandler Carruth via Phabricator via llvm-commits) Date: Sun, 13 Oct 2019 07:48:14 +0000 (UTC) Subject: [PATCH] D65280: Add a pass to lower is.constant and objectsize intrinsics In-Reply-To: References: Message-ID: chandlerc accepted this revision. chandlerc added a comment. This revision is now accepted and ready to land. FWIW, the adjustments I'm suggesting around tightening the logic can easily be in a follow-up patch if you like. I think generally the code LGTM and I'd just like us to pin down exactly what changes we expect to happen w/ the handles as much as possible to avoid subtle latent bugs creeping in and never getting noticed. The other two are trivial, feel free to land w/ those fixed. ================ Comment at: lib/Transforms/Scalar/LowerConstantIntrinsics.cpp:112-117 + if (!II) + continue; + Value *NewValue; + switch (II->getIntrinsicID()) { + default: + continue; ---------------- joerg wrote: > chandlerc wrote: > > For both the `II` thing and the `default` case -- do we really expect these to ever fail? > > > > I would expect either the VH to be null, or for it to definitively be one of the two intrinsics we added. Maybe switch to `cast_or_null` above with `VN.get()` or some such, and llvm_unreachable on the default case. > Yes, the same concerns as with the earlier version still apply. The recursive simplification can change the instruction type in place or remove it. The logic is still simpler since no new instructions can appear. I'm really surprised that it can *change* the value handle in this way. I guess because we're using a tracking value handle (is that really necessary?) 
they may be moved onto the constant, but IMO that'd be more cleanly handled by checking for the value handle being either null or a non-instruction value. If it's an instruction, it should really only be one of these two intrinsics or something deeply wrong has happened elsewhere, no? I'm mostly suggesting we assert on that to track down the strange behavior and make sure the overall logic is actually still correct if it comes up rather than potentially hiding a deeper bug. ================ Comment at: test/Transforms/LowerConstantIntrinsics/crash-on-large-allocas.ll:1 -; RUN: opt -S -codegenprepare %s -o - | FileCheck %s +; RUN: opt -S --lower-constant-intrinsics %s -o - | FileCheck %s ; ---------------- Probably just one `-` is fine? ================ Comment at: test/Transforms/LowerConstantIntrinsics/objectsize_basic.ll:1 -; RUN: opt -codegenprepare -S < %s | FileCheck %s +; RUN: opt --lower-constant-intrinsics -S < %s | FileCheck %s ---------------- Probably just one `-` is fine? CHANGES SINCE LAST ACTION https://reviews.llvm.org/D65280/new/ https://reviews.llvm.org/D65280 From llvm-commits at lists.llvm.org Sun Oct 13 01:15:47 2019 From: llvm-commits at lists.llvm.org (Johannes Doerfert via Phabricator via llvm-commits) Date: Sun, 13 Oct 2019 08:15:47 +0000 (UTC) Subject: [PATCH] D68925: [Attributor] Liveness for values Message-ID: jdoerfert created this revision. jdoerfert added reviewers: uenoku, sstefan1. Herald added subscribers: bollu, hiraditya. Herald added a project: LLVM. This patch introduces liveness (AAIsDead) for all positions, thus for all kinds of values. For now, we say an instruction is dead if it would be removed assuming all users are dead. A call site return is different as we just look at the users. If all call site returns have been eliminated, the return values can return undef instead of their original value, eliminating uses.
We try to recursively delete dead instructions now and we introduce a simple check-like interface for use-traversal. More explicit tests will be added. This is the idea tried out in D68626 but implemented in the right way. Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D68925 Files: llvm/include/llvm/Transforms/IPO/Attributor.h llvm/lib/Transforms/IPO/Attributor.cpp llvm/test/Transforms/FunctionAttrs/align.ll llvm/test/Transforms/FunctionAttrs/liveness.ll llvm/test/Transforms/FunctionAttrs/new_attributes.ll llvm/test/Transforms/FunctionAttrs/noalias_returned.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D68925.224770.patch Type: text/x-patch Size: 28053 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Sun Oct 13 01:15:48 2019 From: llvm-commits at lists.llvm.org (Johannes Doerfert via Phabricator via llvm-commits) Date: Sun, 13 Oct 2019 08:15:48 +0000 (UTC) Subject: [PATCH] D68626: [Attributor] Use undef for calls with unused arguments. In-Reply-To: References: Message-ID: <401c20d4fc03203821d027ab137ee23f@localhost.localdomain> jdoerfert abandoned this revision. jdoerfert added a comment. Dropped in favor of D68925 . Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68626/new/ https://reviews.llvm.org/D68626 From llvm-commits at lists.llvm.org Sun Oct 13 01:15:48 2019 From: llvm-commits at lists.llvm.org (Juneyoung Lee via Phabricator via llvm-commits) Date: Sun, 13 Oct 2019 08:15:48 +0000 (UTC) Subject: [PATCH] D29011: [IR] Add Freeze instruction In-Reply-To: References: Message-ID: <3ff8357e189dafa03c0c4354ca30a491@localhost.localdomain> aqjune updated this revision to Diff 224771. aqjune added a comment. 
- Rebase - Add more tests to freeze.ll - Define FreezeOperator CHANGES SINCE LAST ACTION https://reviews.llvm.org/D29011/new/ https://reviews.llvm.org/D29011 Files: include/llvm-c/Core.h include/llvm/Bitcode/LLVMBitCodes.h include/llvm/CodeGen/GlobalISel/IRTranslator.h include/llvm/IR/IRBuilder.h include/llvm/IR/Instruction.def include/llvm/IR/Operator.h include/llvm/IR/PatternMatch.h lib/AsmParser/LLLexer.cpp lib/AsmParser/LLParser.cpp lib/AsmParser/LLToken.h lib/Bitcode/Reader/BitcodeReader.cpp lib/Bitcode/Writer/BitcodeWriter.cpp lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp lib/CodeGen/SelectionDAG/SelectionDAGBuilder.h lib/CodeGen/TargetLoweringBase.cpp lib/IR/ConstantFold.cpp lib/IR/Core.cpp lib/IR/Instruction.cpp lib/IR/Instructions.cpp lib/IR/Verifier.cpp test/Bindings/llvm-c/freeze.ll test/Bitcode/compatibility.ll tools/llvm-c-test/echo.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D29011.224771.patch Type: text/x-patch Size: 25772 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Sun Oct 13 01:24:50 2019 From: llvm-commits at lists.llvm.org (Juneyoung Lee via Phabricator via llvm-commits) Date: Sun, 13 Oct 2019 08:24:50 +0000 (UTC) Subject: [PATCH] D29011: [IR] Add Freeze instruction In-Reply-To: References: Message-ID: aqjune marked 2 inline comments as done. aqjune added inline comments. ================ Comment at: include/llvm/IR/Operator.h:594 +{}; + } // end namespace llvm ---------------- I added FreezeOperator here, as other operators including OverflowingBinaryOperator were in this file. 
================ Comment at: test/Bindings/llvm-c/freeze.ll:12 + %6 = freeze <2 x float> %arg4 + %7 = freeze i8* %arg5 + ret i32 %1 ---------------- jdoerfert wrote: > Missing types, here and elsewhere I think: > - array > - struct w/ definition > - struct w/o definition (opaque) > - non-standard integer size (i666) > > Missing inputs: > - undef > - null > I found that `freeze i8* null` raised an error due to a bug in `tools/llvm-c-test/echo.cpp`: ``` // Try null if (LLVMIsNull(Cst)) { check_value_kind(Cst, LLVMConstantTokenNoneValueKind); LLVMTypeRef Ty = TypeCloner(M).Clone(Cst); return LLVMConstNull(Ty); } ``` Here, Cst can not only be `LLVMConstantTokenNoneValueKind` but also `LLVMConstantPointerNullValueKind`. I'll make a separate patch that resolves this error, then I'll be able to add `freeze i8* null` test. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D29011/new/ https://reviews.llvm.org/D29011 From llvm-commits at lists.llvm.org Sun Oct 13 01:24:50 2019 From: llvm-commits at lists.llvm.org (Johannes Doerfert via Phabricator via llvm-commits) Date: Sun, 13 Oct 2019 08:24:50 +0000 (UTC) Subject: [PATCH] D68925: [Attributor] Liveness for values In-Reply-To: References: Message-ID: <6d55461370ebe157dcd9e5eeded733cb@localhost.localdomain> jdoerfert updated this revision to Diff 224772. jdoerfert added a comment. Non-exact functions fix + more tests Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68925/new/ https://reviews.llvm.org/D68925 Files: llvm/include/llvm/Transforms/IPO/Attributor.h llvm/lib/Transforms/IPO/Attributor.cpp llvm/test/Transforms/FunctionAttrs/align.ll llvm/test/Transforms/FunctionAttrs/liveness.ll llvm/test/Transforms/FunctionAttrs/new_attributes.ll llvm/test/Transforms/FunctionAttrs/noalias_returned.ll -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68925.224772.patch Type: text/x-patch Size: 29560 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Sun Oct 13 01:33:14 2019 From: llvm-commits at lists.llvm.org (GN Sync Bot via llvm-commits) Date: Sun, 13 Oct 2019 08:33:14 -0000 Subject: [llvm] r374708 - gn build: Merge r374707 Message-ID: <20191013083314.2E55E85171@lists.llvm.org> Author: gnsyncbot Date: Sun Oct 13 01:33:14 2019 New Revision: 374708 URL: http://llvm.org/viewvc/llvm-project?rev=374708&view=rev Log: gn build: Merge r374707 Modified: llvm/trunk/utils/gn/secondary/clang-tools-extra/clang-tidy/bugprone/BUILD.gn Modified: llvm/trunk/utils/gn/secondary/clang-tools-extra/clang-tidy/bugprone/BUILD.gn URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/gn/secondary/clang-tools-extra/clang-tidy/bugprone/BUILD.gn?rev=374708&r1=374707&r2=374708&view=diff ============================================================================== --- llvm/trunk/utils/gn/secondary/clang-tools-extra/clang-tidy/bugprone/BUILD.gn (original) +++ llvm/trunk/utils/gn/secondary/clang-tools-extra/clang-tidy/bugprone/BUILD.gn Sun Oct 13 01:33:14 2019 @@ -37,6 +37,7 @@ static_library("bugprone") { "MisplacedWideningCastCheck.cpp", "MoveForwardingReferenceCheck.cpp", "MultipleStatementMacroCheck.cpp", + "NotNullTerminatedResultCheck.cpp", "ParentVirtualCallCheck.cpp", "PosixReturnCheck.cpp", "SizeofContainerCheck.cpp", From llvm-commits at lists.llvm.org Sun Oct 13 01:33:52 2019 From: llvm-commits at lists.llvm.org (Dave Green via Phabricator via llvm-commits) Date: Sun, 13 Oct 2019 08:33:52 +0000 (UTC) Subject: [PATCH] D68530: [AArch64] Don't combine callee-save and local stack adjustment when optimizing for size In-Reply-To: References: Message-ID: <10f26b5b35d0d9374feb705d06de7754@localhost.localdomain> dmgreen added a reviewer: dmgreen. dmgreen accepted this revision. dmgreen added a comment. This revision is now accepted and ready to land. LGTM, Thanks! 
Do you want me to commit this, or do you have commit access already? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68530/new/ https://reviews.llvm.org/D68530 From llvm-commits at lists.llvm.org Sun Oct 13 01:33:52 2019 From: llvm-commits at lists.llvm.org (Roman Lebedev via Phabricator via llvm-commits) Date: Sun, 13 Oct 2019 08:33:52 +0000 (UTC) Subject: [PATCH] D68342: [Analysis] Don't assume that overflow can't happen in EmitGEPOffset In-Reply-To: References: Message-ID: <315203cd8faa2f29763613fc9f305062@localhost.localdomain> lebedev.ri added a comment. In D68342#1692607 , @miyuki wrote: > > So clang is perfectly correct here. > > Yes, Clang is correct. EmitGEPOffset is doing the wrong thing. > `nuw` is incorrect because negative offsets are allowed. `nsw` would also be incorrect because of the quote you mentioned before: > > > If the inbounds keyword is present, the result value of the getelementptr is a poison value > > if the base pointer is not an in bounds address of an allocated object, or if any of the > > addresses that would be formed by successive addition of the offsets implied by the indices > > to the base address **with infinitely precise signed arithmetic** are not an in bounds address > > of that allocated object. <...> > > `nsw` would imply that signed overflow must not occur when computing the offset in the integer type of the same width as the pointer type. But LangRef is talking about infinitely precise arithmetic. I'll rephrase. I don't think this case is defined for address space 0 - i don't believe you can ever have an object e.g. occupying `[i8 128, i8 8]` (i.e. including null pointer). It is likely not so for other address spaces. So the likely solution is to use `NSW` iff address space = 0.
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68342/new/ https://reviews.llvm.org/D68342 From llvm-commits at lists.llvm.org Sun Oct 13 01:33:53 2019 From: llvm-commits at lists.llvm.org (Roman Lebedev via Phabricator via llvm-commits) Date: Sun, 13 Oct 2019 08:33:53 +0000 (UTC) Subject: [PATCH] D29011: [IR] Add Freeze instruction In-Reply-To: References: Message-ID: lebedev.ri accepted this revision. lebedev.ri added a comment. This revision is now accepted and ready to land. Thanks, LG. Anyone else has any comments here? ================ Comment at: include/llvm/IR/PatternMatch.h:830-834 + auto *I = dyn_cast(V); + if (!I) return false; + + if (I->getOpcode() == Instruction::Freeze) + return X.match(I->getOperand(0)); ---------------- Let's use `FreezeOperator` here then? CHANGES SINCE LAST ACTION https://reviews.llvm.org/D29011/new/ https://reviews.llvm.org/D29011 From llvm-commits at lists.llvm.org Sun Oct 13 01:33:58 2019 From: llvm-commits at lists.llvm.org (Dave Green via Phabricator via llvm-commits) Date: Sun, 13 Oct 2019 08:33:58 +0000 (UTC) Subject: [PATCH] D68877: [AArch64][SVE] Implement masked load intrinsics In-Reply-To: References: Message-ID: <82e26a43eb417d50d923c1c32026d0e5@localhost.localdomain> dmgreen added subscribers: samparker, dmgreen. dmgreen added a comment. Sam has been looking at extending masked loads and stores in D68337 and related patches. There looks like there would be some overlap with this, especially in the target independent parts. Make sure you co-ordinate with him. ================ Comment at: llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp:10393 + ((!LegalOperations && !cast(N0)->isVolatile()) || + TLI.isLoadExtLegal(ISD::SEXTLOAD, VT, EVT))) { + MaskedLoadSDNode *LN0 = cast(N0); ---------------- I'm not convinced that just because a sext load is legal and a masked load is legal, that a sext masked load is always legal. 
================ Comment at: llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td:1077 + + def _default_z : Pat<(Ty (Load GPR64:$base, (PredTy PPR:$gp), (SVEUndef))), + (RegImmInst PPR:$gp, GPR64:$base, (i64 0))>; ---------------- What if the passthru isn't undef? ================ Comment at: llvm/lib/Target/AArch64/AArch64TargetTransformInfo.h:151 + bool isLegalMaskedLoad(Type *DataType) { + return ST->hasSVE(); + } ---------------- This can handle all masked loads? Of any type, extended into any other type, with any alignment? ================ Comment at: llvm/lib/Target/AArch64/AArch64TargetTransformInfo.h:153 + } + bool isLegalMaskedStore(Type *DataType) { + return ST->hasSVE(); ---------------- This patch doesn't handle stores yet. ================ Comment at: llvm/lib/Target/AArch64/SVEInstrFormats.td:296 +def SVEUndef : ComplexPattern; + ---------------- Can this just use "undef"? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68877/new/ https://reviews.llvm.org/D68877 From llvm-commits at lists.llvm.org Sun Oct 13 02:11:01 2019 From: llvm-commits at lists.llvm.org (Dave Green via Phabricator via llvm-commits) Date: Sun, 13 Oct 2019 09:11:01 +0000 (UTC) Subject: [PATCH] D68926: [Codegen] Alter the default promotion for add_sat and sub_sat Message-ID: dmgreen created this revision. dmgreen added reviewers: nikic, leonardchan, craig.topper, RKSimon, efriedma. Herald added a subscriber: hiraditya. Herald added a project: LLVM. This is round 2 of D68643 . The values were not being sign extended or zero extended correctly, which could lead to incorrect results when the incoming values were not already extended. They are needed because the min/max need no superfluous values in the higher bits. I've fixed that and added some extra tests. Not everything here is an improvement (although most of it still is). 
Some of the i4 cases look slightly larger, but this may be improved in cases where the extend can become free (from a load, for example). https://reviews.llvm.org/D68926 Files: llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp llvm/test/CodeGen/AArch64/sadd_sat.ll llvm/test/CodeGen/AArch64/sadd_sat_plus.ll llvm/test/CodeGen/AArch64/sadd_sat_vec.ll llvm/test/CodeGen/AArch64/ssub_sat.ll llvm/test/CodeGen/AArch64/ssub_sat_plus.ll llvm/test/CodeGen/AArch64/ssub_sat_vec.ll llvm/test/CodeGen/AArch64/uadd_sat.ll llvm/test/CodeGen/AArch64/uadd_sat_plus.ll llvm/test/CodeGen/AArch64/uadd_sat_vec.ll llvm/test/CodeGen/AArch64/usub_sat.ll llvm/test/CodeGen/AArch64/usub_sat_plus.ll llvm/test/CodeGen/AArch64/usub_sat_vec.ll llvm/test/CodeGen/ARM/sadd_sat.ll llvm/test/CodeGen/ARM/sadd_sat_plus.ll llvm/test/CodeGen/ARM/ssub_sat.ll llvm/test/CodeGen/ARM/ssub_sat_plus.ll llvm/test/CodeGen/ARM/uadd_sat.ll llvm/test/CodeGen/ARM/uadd_sat_plus.ll llvm/test/CodeGen/ARM/usub_sat.ll llvm/test/CodeGen/ARM/usub_sat_plus.ll llvm/test/CodeGen/X86/sadd_sat.ll llvm/test/CodeGen/X86/sadd_sat_plus.ll llvm/test/CodeGen/X86/ssub_sat.ll llvm/test/CodeGen/X86/ssub_sat_plus.ll llvm/test/CodeGen/X86/uadd_sat.ll llvm/test/CodeGen/X86/uadd_sat_plus.ll llvm/test/CodeGen/X86/usub_sat.ll llvm/test/CodeGen/X86/usub_sat_plus.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D68926.224774.patch Type: text/x-patch Size: 121389 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Sun Oct 13 03:32:31 2019 From: llvm-commits at lists.llvm.org (Ayal Zaks via Phabricator via llvm-commits) Date: Sun, 13 Oct 2019 10:32:31 +0000 (UTC) Subject: [PATCH] D68814: [LV] Allow assume calls in predicated blocks. In-Reply-To: References: Message-ID: Ayal added a comment. 
If conditional assumes are to be dropped, better do so on entry to VPlan, as in DeadInstructions, rather than representing them in ReplicateRecipe (as do unconditional assumes) and silencing their code generation. To retain conditional assumes along with their control flow, they could be marked under isScalarWithPredication; but this complicates vectorization, plus what use are such assumes when all else is if-converted(?) Conditional assumes under uniform control flow could be retained, along with the uniform control flow they depend upon; this may be mostly relevant for outerloop vectorization. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68814/new/ https://reviews.llvm.org/D68814 From llvm-commits at lists.llvm.org Sun Oct 13 03:33:02 2019 From: llvm-commits at lists.llvm.org (Joel Klinghed via Phabricator via llvm-commits) Date: Sun, 13 Oct 2019 10:33:02 +0000 (UTC) Subject: [PATCH] D67322: [LLD][ThinLTO] Handle GUID collision in import global processing In-Reply-To: References: Message-ID: <0375b467b039e344914a788f6b60fe2f@localhost.localdomain> the_jk added a comment. Another place that already handles multiple locals with the same GUID is in the same file: bool FunctionImportGlobalProcessing::shouldPromoteLocalToGlobal(), from a comment there: // When exporting, consult the index. We can have more than one local // with the same GUID, in the case of same-named locals in different but // same-named source files that were compiled in their respective directories // (so the source file name and resulting GUID is the same). Find the one // in this module. so I think raising an error would cause surprises for different setups. I'll gladly add more testcases if you can give me some ideas/tips on possible problems. 
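To make the collision scenario in that comment concrete, here is a small self-contained sketch of how two unrelated locals can end up with the same identifier. It assumes, for illustration, that a local symbol's identifier is built from the source file name as recorded at compile time plus the symbol name, and it uses `std::hash` as a stand-in for the real hash. The function names are hypothetical and are not the LLVM API.

```cpp
#include <cstdint>
#include <functional>
#include <string>

// Illustrative stand-in: local symbols are qualified by the source file name
// exactly as it was passed to the compiler; external symbols are not.
inline std::string globalIdentifier(const std::string &Name, bool IsLocal,
                                    const std::string &SourceFile) {
  return IsLocal ? SourceFile + ":" + Name : Name;
}

// std::hash stands in for the real GUID hash; equal identifiers always
// produce equal GUIDs, which is all the collision needs.
inline std::uint64_t guid(const std::string &Identifier) {
  return std::hash<std::string>{}(Identifier);
}
```

If `dirA/util.c` and `dirB/util.c` are each compiled from inside their own directory, both are recorded as `util.c`, so a `static` function `helper` in each yields the identifier `util.c:helper` and therefore the same GUID. Compiling from a common parent directory would record distinct paths and distinct identifiers.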
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67322/new/ https://reviews.llvm.org/D67322 From llvm-commits at lists.llvm.org Sun Oct 13 03:50:34 2019 From: llvm-commits at lists.llvm.org (Gil Rapaport via Phabricator via llvm-commits) Date: Sun, 13 Oct 2019 10:50:34 +0000 (UTC) Subject: [PATCH] D68577: [LV] Apply sink-after & interleave-groups as VPlan transformations (NFC) In-Reply-To: References: Message-ID: <9e5ad9d3596922288f80b05d3866351d@localhost.localdomain> gilr marked 2 inline comments as done. gilr added inline comments. ================ Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:7071 + // --------------------------------------------------------------------------- + // Transform initial VPlan: Apply previously taken decisions, in order, to ---------------- fhahn wrote: > Not sure how others feel, but I think it would be great if we could move this transform out of LoopVectorize.cpp, to group together VP2VP transforms. I think it would fit well into llvm/lib/Transforms/Vectorize/VPlanHCFGTransforms.h (although the name mentions HCFGTransforms, maybe it should be just VplanToVplanTransforms.h/cpp). > > I could not spot anything that would prevent moving it to a different file on first glance. This is currently still ingredient-based, i.e. not a pure VPlan2VPlan transformation, and as you mention the VPlan2VPlan part is basically just a moveAfter(). A VPlan-based sinkAfter() should not be based on ingredients (as stated in D46826) and instead might take a Recipe2Recipe map, but that seems a bit of an overkill for this patch. Modelling VPlan-based transformations is definitely worth a larger discussion.
================ Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:7079 + VPRecipeBase *Sink = RecipeBuilder.getRecipe(Entry.first); + Sink->removeFromParent(); + Sink->insertAfter(RecipeBuilder.getRecipe(Entry.second)); ---------------- fhahn wrote: > This could just be `Sink->moveAfter(RecipeBuilder.getRecipe(Entry.second)) `. I've added it in D46825 and now finally have a reason to commit it ;) Right. Will use it instead. Seems it doesn't update Parent, though. Will rewrite as a composition of the more basic removeFromParent(), insertAfter() to avoid duplicating that and the assertions. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68577/new/ https://reviews.llvm.org/D68577 From llvm-commits at lists.llvm.org Sun Oct 13 03:51:16 2019 From: llvm-commits at lists.llvm.org (Gil Rapaport via Phabricator via llvm-commits) Date: Sun, 13 Oct 2019 10:51:16 +0000 (UTC) Subject: [PATCH] D68577: [LV] Apply sink-after & interleave-groups as VPlan transformations (NFC) In-Reply-To: References: Message-ID: <614b426bfdad86d65feff23f6a516334@localhost.localdomain> gilr updated this revision to Diff 224777. gilr added a comment. Applied review comments. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68577/new/ https://reviews.llvm.org/D68577 Files: llvm/lib/Transforms/Vectorize/LoopVectorizationPlanner.h llvm/lib/Transforms/Vectorize/LoopVectorize.cpp llvm/lib/Transforms/Vectorize/VPRecipeBuilder.h llvm/lib/Transforms/Vectorize/VPlan.cpp llvm/lib/Transforms/Vectorize/VPlan.h llvm/unittests/Transforms/Vectorize/VPlanTest.cpp -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68577.224777.patch Type: text/x-patch Size: 20131 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Sun Oct 13 04:26:46 2019 From: llvm-commits at lists.llvm.org (Ayal Zaks via Phabricator via llvm-commits) Date: Sun, 13 Oct 2019 11:26:46 +0000 (UTC) Subject: [PATCH] D68577: [LV] Apply sink-after & interleave-groups as VPlan transformations (NFC) In-Reply-To: References: Message-ID: <25a8856178ec72422463c3006c25348d@localhost.localdomain> Ayal added inline comments. ================ Comment at: llvm/lib/Transforms/Vectorize/VPlan.cpp:290 + Parent = InsertPos->getParent(); + InsertPos->getParent()->getRecipeList().insertAfter(InsertPos->getIterator(), + this); ---------------- nit: use `Parent`, as in insertBefore above. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68577/new/ https://reviews.llvm.org/D68577 From llvm-commits at lists.llvm.org Sun Oct 13 04:26:47 2019 From: llvm-commits at lists.llvm.org (Juneyoung Lee via Phabricator via llvm-commits) Date: Sun, 13 Oct 2019 11:26:47 +0000 (UTC) Subject: [PATCH] D29011: [IR] Add Freeze instruction In-Reply-To: References: Message-ID: <480e1b95187f32eb36f332ffc8e73a24@localhost.localdomain> aqjune updated this revision to Diff 224778. aqjune added a comment. 
- Use FreezeOperator inside Freeze_match CHANGES SINCE LAST ACTION https://reviews.llvm.org/D29011/new/ https://reviews.llvm.org/D29011 Files: include/llvm-c/Core.h include/llvm/Bitcode/LLVMBitCodes.h include/llvm/CodeGen/GlobalISel/IRTranslator.h include/llvm/IR/IRBuilder.h include/llvm/IR/Instruction.def include/llvm/IR/Operator.h include/llvm/IR/PatternMatch.h lib/AsmParser/LLLexer.cpp lib/AsmParser/LLParser.cpp lib/AsmParser/LLToken.h lib/Bitcode/Reader/BitcodeReader.cpp lib/Bitcode/Writer/BitcodeWriter.cpp lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp lib/CodeGen/SelectionDAG/SelectionDAGBuilder.h lib/CodeGen/TargetLoweringBase.cpp lib/IR/ConstantFold.cpp lib/IR/Core.cpp lib/IR/Instruction.cpp lib/IR/Instructions.cpp lib/IR/Verifier.cpp test/Bindings/llvm-c/freeze.ll test/Bitcode/compatibility.ll tools/llvm-c-test/echo.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D29011.224778.patch Type: text/x-patch Size: 25757 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Sun Oct 13 04:29:35 2019 From: llvm-commits at lists.llvm.org (Simon Pilgrim via llvm-commits) Date: Sun, 13 Oct 2019 11:29:35 -0000 Subject: [llvm] r374716 - IRTranslator - silence static analyzer null dereference warnings. NFCI. Message-ID: <20191013112935.ACEAE8B15E@lists.llvm.org> Author: rksimon Date: Sun Oct 13 04:29:35 2019 New Revision: 374716 URL: http://llvm.org/viewvc/llvm-project?rev=374716&view=rev Log: IRTranslator - silence static analyzer null dereference warnings. NFCI. The CmpInst::getType() calls can be replaced by just using User::getType() that it was dyn_cast from, and we then need to assert that any default predicate cases came from the CmpInst. 
Modified: llvm/trunk/lib/CodeGen/GlobalISel/IRTranslator.cpp Modified: llvm/trunk/lib/CodeGen/GlobalISel/IRTranslator.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/GlobalISel/IRTranslator.cpp?rev=374716&r1=374715&r2=374716&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/GlobalISel/IRTranslator.cpp (original) +++ llvm/trunk/lib/CodeGen/GlobalISel/IRTranslator.cpp Sun Oct 13 04:29:35 2019 @@ -335,7 +335,7 @@ bool IRTranslator::translateFNeg(const U bool IRTranslator::translateCompare(const User &U, MachineIRBuilder &MIRBuilder) { - const CmpInst *CI = dyn_cast(&U); + auto *CI = dyn_cast(&U); Register Op0 = getOrCreateVReg(*U.getOperand(0)); Register Op1 = getOrCreateVReg(*U.getOperand(1)); Register Res = getOrCreateVReg(U); @@ -346,11 +346,12 @@ bool IRTranslator::translateCompare(cons MIRBuilder.buildICmp(Pred, Res, Op0, Op1); else if (Pred == CmpInst::FCMP_FALSE) MIRBuilder.buildCopy( - Res, getOrCreateVReg(*Constant::getNullValue(CI->getType()))); + Res, getOrCreateVReg(*Constant::getNullValue(U.getType()))); else if (Pred == CmpInst::FCMP_TRUE) MIRBuilder.buildCopy( - Res, getOrCreateVReg(*Constant::getAllOnesValue(CI->getType()))); + Res, getOrCreateVReg(*Constant::getAllOnesValue(U.getType()))); else { + assert(CI && "Instruction should be CmpInst"); MIRBuilder.buildInstr(TargetOpcode::G_FCMP, {Res}, {Pred, Op0, Op1}, MachineInstr::copyFlagsFromInstruction(*CI)); } From llvm-commits at lists.llvm.org Sun Oct 13 04:35:47 2019 From: llvm-commits at lists.llvm.org (Roman Lebedev via Phabricator via llvm-commits) Date: Sun, 13 Oct 2019 11:35:47 +0000 (UTC) Subject: [PATCH] D29014: [SelDag] Implement FREEZE node In-Reply-To: References: Message-ID: <3f87021273b0f7b3b06f5dc7a198e164@localhost.localdomain> lebedev.ri added a comment. I guess this is the next patch in the queue? This needs rebasing, and more tests to match the langref wording. 
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D29014/new/ https://reviews.llvm.org/D29014 From llvm-commits at lists.llvm.org Sun Oct 13 04:35:48 2019 From: llvm-commits at lists.llvm.org (Simon Pilgrim via Phabricator via llvm-commits) Date: Sun, 13 Oct 2019 11:35:48 +0000 (UTC) Subject: [PATCH] D68550: [X86] Teach X86MCInstLower to swap operands of commutable instructions to enable 2-byte VEX encoding. In-Reply-To: References: Message-ID: <467d37f4683ebc418b72ab335863edc0@localhost.localdomain> RKSimon accepted this revision. RKSimon added a comment. This revision is now accepted and ready to land. LGTM with one minor query. ================ Comment at: llvm/lib/Target/X86/X86MCInstLower.cpp:915 + MI->getOpcode() != X86::VMOVHLPSrr && + MI->getOpcode() != X86::VUNPCKHPDrr) { + if (!X86II::isX86_64ExtendedReg(OutMI.getOperand(1).getReg()) && ---------------- The need to exclude specific opcodes here is unfortunate - maybe add a comment explaining why? Maybe even put them under a separate 'no doing' case statement? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68550/new/ https://reviews.llvm.org/D68550 From llvm-commits at lists.llvm.org Sun Oct 13 04:35:48 2019 From: llvm-commits at lists.llvm.org (Simon Pilgrim via Phabricator via llvm-commits) Date: Sun, 13 Oct 2019 11:35:48 +0000 (UTC) Subject: [PATCH] D68871: [X86][BtVer2] Improved latency and throughput of float/vector loads and stores. In-Reply-To: References: Message-ID: <2afce1816d977f1b786c99948fb0bfdb@localhost.localdomain> RKSimon accepted this revision. RKSimon added a comment. This revision is now accepted and ready to land. 
LGTM - thanks for looking into this CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68871/new/ https://reviews.llvm.org/D68871 From llvm-commits at lists.llvm.org Sun Oct 13 04:44:50 2019 From: llvm-commits at lists.llvm.org (Gil Rapaport via Phabricator via llvm-commits) Date: Sun, 13 Oct 2019 11:44:50 +0000 (UTC) Subject: [PATCH] D68577: [LV] Apply sink-after & interleave-groups as VPlan transformations (NFC) In-Reply-To: References: Message-ID: <55d488fce922a4b34a6df90bc655047f@localhost.localdomain> gilr updated this revision to Diff 224779. gilr added a comment. Applied review comments. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68577/new/ https://reviews.llvm.org/D68577 Files: llvm/lib/Transforms/Vectorize/LoopVectorizationPlanner.h llvm/lib/Transforms/Vectorize/LoopVectorize.cpp llvm/lib/Transforms/Vectorize/VPRecipeBuilder.h llvm/lib/Transforms/Vectorize/VPlan.cpp llvm/lib/Transforms/Vectorize/VPlan.h llvm/unittests/Transforms/Vectorize/VPlanTest.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D68577.224779.patch Type: text/x-patch Size: 20060 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Sun Oct 13 05:23:57 2019 From: llvm-commits at lists.llvm.org (kamlesh kumar via Phabricator via llvm-commits) Date: Sun, 13 Oct 2019 12:23:57 +0000 (UTC) Subject: [PATCH] D68907: [6/7/trunk] -fno-plt generates wrong relocation for std::ios_base::Init leading to segmentation fault In-Reply-To: References: Message-ID: kamleshbhalui updated this revision to Diff 224780. kamleshbhalui edited the summary of this revision. 
Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68907/new/ https://reviews.llvm.org/D68907 Files: llvm/lib/Target/X86/X86FastISel.cpp Index: llvm/lib/Target/X86/X86FastISel.cpp =================================================================== --- llvm/lib/Target/X86/X86FastISel.cpp +++ llvm/lib/Target/X86/X86FastISel.cpp @@ -745,6 +745,12 @@ AM.Base.Reg = getInstrInfo()->getGlobalBaseReg(FuncInfo.MF); } + bool NeedLoad = GVFlags == X86II::MO_GOTPCREL; + if(NeedLoad) { + assert(AM.Base.Reg == 0 && AM.IndexReg == 0); + AM.Base.Reg = X86::RIP; + } + // Unless the ABI requires an extra load, return a direct reference to // the global. if (!isGlobalStubReference(GVFlags)) { -------------- next part -------------- A non-text attachment was scrubbed... Name: D68907.224780.patch Type: text/x-patch Size: 617 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Sun Oct 13 06:18:07 2019 From: llvm-commits at lists.llvm.org (Simon Pilgrim via llvm-commits) Date: Sun, 13 Oct 2019 13:18:07 -0000 Subject: [llvm] r374719 - [X86][AVX] Add i686 avx splat tests Message-ID: <20191013131807.313AB86942@lists.llvm.org> Author: rksimon Date: Sun Oct 13 06:18:07 2019 New Revision: 374719 URL: http://llvm.org/viewvc/llvm-project?rev=374719&view=rev Log: [X86][AVX] Add i686 avx splat tests Modified: llvm/trunk/test/CodeGen/X86/avx-splat.ll Modified: llvm/trunk/test/CodeGen/X86/avx-splat.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/avx-splat.ll?rev=374719&r1=374718&r2=374719&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/avx-splat.ll (original) +++ llvm/trunk/test/CodeGen/X86/avx-splat.ll Sun Oct 13 06:18:07 2019 @@ -1,12 +1,13 @@ ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py -; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=+avx | FileCheck %s +; RUN: llc < %s -mtriple=i686-unknown-unknown -mattr=+avx | FileCheck %s 
--check-prefixes=CHECK,X86 +; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=+avx | FileCheck %s --check-prefixes=CHECK,X64 define <32 x i8> @funcA(<32 x i8> %a) nounwind uwtable readnone ssp { ; CHECK-LABEL: funcA: ; CHECK: # %bb.0: # %entry ; CHECK-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5] ; CHECK-NEXT: vinsertf128 $1, %xmm0, %ymm0, %ymm0 -; CHECK-NEXT: retq +; CHECK-NEXT: ret{{[l|q]}} entry: %shuffle = shufflevector <32 x i8> %a, <32 x i8> undef, <32 x i32> ret <32 x i8> %shuffle @@ -18,19 +19,24 @@ define <16 x i16> @funcB(<16 x i16> %a) ; CHECK-NEXT: vpshufhw {{.*#+}} xmm0 = xmm0[0,1,2,3,5,5,6,7] ; CHECK-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[2,2,2,2] ; CHECK-NEXT: vinsertf128 $1, %xmm0, %ymm0, %ymm0 -; CHECK-NEXT: retq +; CHECK-NEXT: ret{{[l|q]}} entry: %shuffle = shufflevector <16 x i16> %a, <16 x i16> undef, <16 x i32> ret <16 x i16> %shuffle } define <4 x i64> @funcC(i64 %q) nounwind uwtable readnone ssp { -; CHECK-LABEL: funcC: -; CHECK: # %bb.0: # %entry -; CHECK-NEXT: vmovq %rdi, %xmm0 -; CHECK-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,1,0,1] -; CHECK-NEXT: vinsertf128 $1, %xmm0, %ymm0, %ymm0 -; CHECK-NEXT: retq +; X86-LABEL: funcC: +; X86: # %bb.0: # %entry +; X86-NEXT: vbroadcastsd {{[0-9]+}}(%esp), %ymm0 +; X86-NEXT: retl +; +; X64-LABEL: funcC: +; X64: # %bb.0: # %entry +; X64-NEXT: vmovq %rdi, %xmm0 +; X64-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,1,0,1] +; X64-NEXT: vinsertf128 $1, %xmm0, %ymm0, %ymm0 +; X64-NEXT: retq entry: %vecinit.i = insertelement <4 x i64> undef, i64 %q, i32 0 %vecinit2.i = insertelement <4 x i64> %vecinit.i, i64 %q, i32 1 @@ -40,11 +46,16 @@ entry: } define <4 x double> @funcD(double %q) nounwind uwtable readnone ssp { -; CHECK-LABEL: funcD: -; CHECK: # %bb.0: # %entry -; CHECK-NEXT: vmovddup {{.*#+}} xmm0 = xmm0[0,0] -; CHECK-NEXT: vinsertf128 $1, %xmm0, %ymm0, %ymm0 -; CHECK-NEXT: retq +; X86-LABEL: funcD: +; X86: # %bb.0: # %entry +; X86-NEXT: vbroadcastsd {{[0-9]+}}(%esp), %ymm0 +; X86-NEXT: retl +; +; 
X64-LABEL: funcD: +; X64: # %bb.0: # %entry +; X64-NEXT: vmovddup {{.*#+}} xmm0 = xmm0[0,0] +; X64-NEXT: vinsertf128 $1, %xmm0, %ymm0, %ymm0 +; X64-NEXT: retq entry: %vecinit.i = insertelement <4 x double> undef, double %q, i32 0 %vecinit2.i = insertelement <4 x double> %vecinit.i, double %q, i32 1 @@ -57,22 +68,39 @@ entry: ; shuffle (scalar_to_vector (load (ptr + 4))), undef, <0, 0, 0, 0> ; define <8 x float> @funcE() nounwind { -; CHECK-LABEL: funcE: -; CHECK: # %bb.0: # %allocas -; CHECK-NEXT: xorl %eax, %eax -; CHECK-NEXT: testb %al, %al -; CHECK-NEXT: # implicit-def: $ymm0 -; CHECK-NEXT: jne .LBB4_2 -; CHECK-NEXT: # %bb.1: # %load.i1247 -; CHECK-NEXT: pushq %rbp -; CHECK-NEXT: movq %rsp, %rbp -; CHECK-NEXT: andq $-32, %rsp -; CHECK-NEXT: subq $1312, %rsp # imm = 0x520 -; CHECK-NEXT: vbroadcastss {{[0-9]+}}(%rsp), %ymm0 -; CHECK-NEXT: movq %rbp, %rsp -; CHECK-NEXT: popq %rbp -; CHECK-NEXT: .LBB4_2: # %__load_and_broadcast_32.exit1249 -; CHECK-NEXT: retq +; X86-LABEL: funcE: +; X86: # %bb.0: # %allocas +; X86-NEXT: xorl %eax, %eax +; X86-NEXT: testb %al, %al +; X86-NEXT: # implicit-def: $ymm0 +; X86-NEXT: jne .LBB4_2 +; X86-NEXT: # %bb.1: # %load.i1247 +; X86-NEXT: pushl %ebp +; X86-NEXT: movl %esp, %ebp +; X86-NEXT: andl $-32, %esp +; X86-NEXT: subl $1312, %esp # imm = 0x520 +; X86-NEXT: vbroadcastss {{[0-9]+}}(%esp), %ymm0 +; X86-NEXT: movl %ebp, %esp +; X86-NEXT: popl %ebp +; X86-NEXT: .LBB4_2: # %__load_and_broadcast_32.exit1249 +; X86-NEXT: retl +; +; X64-LABEL: funcE: +; X64: # %bb.0: # %allocas +; X64-NEXT: xorl %eax, %eax +; X64-NEXT: testb %al, %al +; X64-NEXT: # implicit-def: $ymm0 +; X64-NEXT: jne .LBB4_2 +; X64-NEXT: # %bb.1: # %load.i1247 +; X64-NEXT: pushq %rbp +; X64-NEXT: movq %rsp, %rbp +; X64-NEXT: andq $-32, %rsp +; X64-NEXT: subq $1312, %rsp # imm = 0x520 +; X64-NEXT: vbroadcastss {{[0-9]+}}(%rsp), %ymm0 +; X64-NEXT: movq %rbp, %rsp +; X64-NEXT: popq %rbp +; X64-NEXT: .LBB4_2: # %__load_and_broadcast_32.exit1249 +; X64-NEXT: retq allocas: 
%udx495 = alloca [18 x [18 x float]], align 32 br label %for_test505.preheader @@ -98,12 +126,17 @@ __load_and_broadcast_32.exit1249: } define <8 x float> @funcF(i32 %val) nounwind { -; CHECK-LABEL: funcF: -; CHECK: # %bb.0: -; CHECK-NEXT: vmovd %edi, %xmm0 -; CHECK-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,1,0,0] -; CHECK-NEXT: vinsertf128 $1, %xmm0, %ymm0, %ymm0 -; CHECK-NEXT: retq +; X86-LABEL: funcF: +; X86: # %bb.0: +; X86-NEXT: vbroadcastss {{[0-9]+}}(%esp), %ymm0 +; X86-NEXT: retl +; +; X64-LABEL: funcF: +; X64: # %bb.0: +; X64-NEXT: vmovd %edi, %xmm0 +; X64-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,1,0,0] +; X64-NEXT: vinsertf128 $1, %xmm0, %ymm0, %ymm0 +; X64-NEXT: retq %ret6 = insertelement <8 x i32> undef, i32 %val, i32 6 %ret7 = insertelement <8 x i32> %ret6, i32 %val, i32 7 %tmp = bitcast <8 x i32> %ret7 to <8 x float> @@ -115,7 +148,7 @@ define <8 x float> @funcG(<8 x float> %a ; CHECK: # %bb.0: # %entry ; CHECK-NEXT: vpermilps {{.*#+}} xmm0 = xmm0[0,0,0,0] ; CHECK-NEXT: vinsertf128 $1, %xmm0, %ymm0, %ymm0 -; CHECK-NEXT: retq +; CHECK-NEXT: ret{{[l|q]}} entry: %shuffle = shufflevector <8 x float> %a, <8 x float> undef, <8 x i32> ret <8 x float> %shuffle @@ -126,47 +159,71 @@ define <8 x float> @funcH(<8 x float> %a ; CHECK: # %bb.0: # %entry ; CHECK-NEXT: vpermilps {{.*#+}} ymm0 = ymm0[1,1,1,1,5,5,5,5] ; CHECK-NEXT: vperm2f128 {{.*#+}} ymm0 = ymm0[2,3,2,3] -; CHECK-NEXT: retq +; CHECK-NEXT: ret{{[l|q]}} entry: %shuffle = shufflevector <8 x float> %a, <8 x float> undef, <8 x i32> ret <8 x float> %shuffle } define <2 x double> @splat_load_2f64_11(<2 x double>* %ptr) { -; CHECK-LABEL: splat_load_2f64_11: -; CHECK: # %bb.0: -; CHECK-NEXT: vmovddup {{.*#+}} xmm0 = mem[0,0] -; CHECK-NEXT: retq +; X86-LABEL: splat_load_2f64_11: +; X86: # %bb.0: +; X86-NEXT: movl {{[0-9]+}}(%esp), %eax +; X86-NEXT: vmovddup {{.*#+}} xmm0 = mem[0,0] +; X86-NEXT: retl +; +; X64-LABEL: splat_load_2f64_11: +; X64: # %bb.0: +; X64-NEXT: vmovddup {{.*#+}} xmm0 = mem[0,0] +; X64-NEXT: retq %x 
= load <2 x double>, <2 x double>* %ptr %x1 = shufflevector <2 x double> %x, <2 x double> undef, <2 x i32> ret <2 x double> %x1 } define <4 x double> @splat_load_4f64_2222(<4 x double>* %ptr) { -; CHECK-LABEL: splat_load_4f64_2222: -; CHECK: # %bb.0: -; CHECK-NEXT: vbroadcastsd 16(%rdi), %ymm0 -; CHECK-NEXT: retq +; X86-LABEL: splat_load_4f64_2222: +; X86: # %bb.0: +; X86-NEXT: movl {{[0-9]+}}(%esp), %eax +; X86-NEXT: vbroadcastsd 16(%eax), %ymm0 +; X86-NEXT: retl +; +; X64-LABEL: splat_load_4f64_2222: +; X64: # %bb.0: +; X64-NEXT: vbroadcastsd 16(%rdi), %ymm0 +; X64-NEXT: retq %x = load <4 x double>, <4 x double>* %ptr %x1 = shufflevector <4 x double> %x, <4 x double> undef, <4 x i32> ret <4 x double> %x1 } define <4 x float> @splat_load_4f32_0000(<4 x float>* %ptr) { -; CHECK-LABEL: splat_load_4f32_0000: -; CHECK: # %bb.0: -; CHECK-NEXT: vbroadcastss (%rdi), %xmm0 -; CHECK-NEXT: retq +; X86-LABEL: splat_load_4f32_0000: +; X86: # %bb.0: +; X86-NEXT: movl {{[0-9]+}}(%esp), %eax +; X86-NEXT: vbroadcastss (%eax), %xmm0 +; X86-NEXT: retl +; +; X64-LABEL: splat_load_4f32_0000: +; X64: # %bb.0: +; X64-NEXT: vbroadcastss (%rdi), %xmm0 +; X64-NEXT: retq %x = load <4 x float>, <4 x float>* %ptr %x1 = shufflevector <4 x float> %x, <4 x float> undef, <4 x i32> ret <4 x float> %x1 } define <8 x float> @splat_load_8f32_77777777(<8 x float>* %ptr) { -; CHECK-LABEL: splat_load_8f32_77777777: -; CHECK: # %bb.0: -; CHECK-NEXT: vbroadcastss 28(%rdi), %ymm0 -; CHECK-NEXT: retq +; X86-LABEL: splat_load_8f32_77777777: +; X86: # %bb.0: +; X86-NEXT: movl {{[0-9]+}}(%esp), %eax +; X86-NEXT: vbroadcastss 28(%eax), %ymm0 +; X86-NEXT: retl +; +; X64-LABEL: splat_load_8f32_77777777: +; X64: # %bb.0: +; X64-NEXT: vbroadcastss 28(%rdi), %ymm0 +; X64-NEXT: retq %x = load <8 x float>, <8 x float>* %ptr %x1 = shufflevector <8 x float> %x, <8 x float> undef, <8 x i32> ret <8 x float> %x1
From llvm-commits at lists.llvm.org Sun Oct 13 07:52:40 2019 From: llvm-commits at lists.llvm.org (Dávid Bolvanský via Phabricator via llvm-commits) Date: Sun, 13 Oct 2019 14:52:40 +0000 (UTC) Subject: [PATCH] D68907: [6/7/trunk] -fno-plt generates wrong relocation for std::ios_base::Init leading to segmentation fault In-Reply-To: References: Message-ID: xbolva00 added a comment. Please add a test and clang-format your patch.
Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68907/new/ https://reviews.llvm.org/D68907 From llvm-commits at lists.llvm.org Sun Oct 13 08:25:13 2019 From: llvm-commits at lists.llvm.org (Nico Weber via llvm-commits) Date: Sun, 13 Oct 2019 15:25:13 -0000 Subject: [llvm] r374721 - gn build: (manually) merge r374720 Message-ID: <20191013152513.5CA2F82CDF@lists.llvm.org> Author: nico Date: Sun Oct 13 08:25:13 2019 New Revision: 374721 URL: http://llvm.org/viewvc/llvm-project?rev=374721&view=rev Log: gn build: (manually) merge r374720 Modified: llvm/trunk/utils/gn/secondary/clang/tools/clang-format/BUILD.gn Modified: llvm/trunk/utils/gn/secondary/clang/tools/clang-format/BUILD.gn URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/gn/secondary/clang/tools/clang-format/BUILD.gn?rev=374721&r1=374720&r2=374721&view=diff ============================================================================== --- llvm/trunk/utils/gn/secondary/clang/tools/clang-format/BUILD.gn (original) +++ llvm/trunk/utils/gn/secondary/clang/tools/clang-format/BUILD.gn Sun Oct 13 08:25:13 2019 @@ -3,6 +3,7 @@ executable("clang-format") { deps = [ "//clang/lib/Basic", "//clang/lib/Format", + "//clang/lib/Frontend", "//clang/lib/Rewrite", "//clang/lib/Tooling/Core", "//llvm/lib/Support", From llvm-commits at lists.llvm.org Sun Oct 13 09:06:36 2019 From: llvm-commits at lists.llvm.org (Juneyoung Lee via Phabricator via llvm-commits) Date: Sun, 13 Oct 2019 16:06:36 +0000 (UTC) Subject: [PATCH] D68928: Fix clone_constant_impl to correctly deal with null pointers Message-ID: aqjune created this revision. aqjune added reviewers: jdoerfert, CodaFi, deadalnix. Herald added a project: LLVM. Herald added a subscriber: llvm-commits. This patch resolves llvm-c-test's following error LLVM ERROR: LLVMGetValueKind returned incorrect type which arises when the input bitcode contains a null pointer. 
Repository: rL LLVM https://reviews.llvm.org/D68928 Files: test/Bindings/llvm-c/echo.ll tools/llvm-c-test/echo.cpp Index: tools/llvm-c-test/echo.cpp =================================================================== --- tools/llvm-c-test/echo.cpp +++ tools/llvm-c-test/echo.cpp @@ -326,6 +326,13 @@ EltCount, LLVMIsPackedStruct(Ty)); } + // Try ConstantPointerNull + if (LLVMIsAConstantPointerNull(Cst)) { + check_value_kind(Cst, LLVMConstantPointerNullValueKind); + LLVMTypeRef Ty = TypeCloner(M).Clone(Cst); + return LLVMConstNull(Ty); + } + // Try undef if (LLVMIsUndef(Cst)) { check_value_kind(Cst, LLVMUndefValueValueKind); Index: test/Bindings/llvm-c/echo.ll =================================================================== --- test/Bindings/llvm-c/echo.ll +++ test/Bindings/llvm-c/echo.ll @@ -21,6 +21,7 @@ @protected = protected global i32 23 @section = global i32 27, section ".custom" @align = global i32 31, align 4 +@nullptr = global i32* null @aliased1 = alias i32, i32* @var @aliased2 = internal alias i32, i32* @var -------------- next part -------------- A non-text attachment was scrubbed... Name: D68928.224784.patch Type: text/x-patch Size: 992 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Sun Oct 13 10:03:02 2019 From: llvm-commits at lists.llvm.org (Simon Pilgrim via llvm-commits) Date: Sun, 13 Oct 2019 17:03:02 -0000 Subject: [llvm] r374724 - [X86] getTargetShuffleInputs - add KnownUndef/Zero output support Message-ID: <20191013170303.1841F88E4F@lists.llvm.org> Author: rksimon Date: Sun Oct 13 10:03:02 2019 New Revision: 374724 URL: http://llvm.org/viewvc/llvm-project?rev=374724&view=rev Log: [X86] getTargetShuffleInputs - add KnownUndef/Zero output support Adjust SimplifyDemandedVectorEltsForTargetNode to use the known elts masks instead of recomputing them locally.
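As a rough illustration of the bookkeeping this log message describes, the sketch below derives known-undef and known-zero element masks from a shuffle mask that uses sentinel values. It is a standalone model rather than LLVM code: plain `std::uint64_t` bitmasks stand in for `APInt`, `KnownElts` and `computeKnownElts` are hypothetical names, and the sentinel constants are defined locally on the assumption that they match the X86 backend's values (-1 for undef, -2 for zero).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumption: these mirror the X86 backend's shuffle-mask sentinels.
// They are redefined here so the sketch does not depend on LLVM headers.
constexpr int SM_SentinelUndef = -1;
constexpr int SM_SentinelZero = -2;

// Hypothetical stand-in for the pair of APInt outputs (KnownUndef, KnownZero).
struct KnownElts {
  std::uint64_t Undef = 0; // bit i set => result element i is known undef
  std::uint64_t Zero = 0;  // bit i set => result element i is known zero
};

// Hypothetical helper: walk the mask once and record which result elements
// are known undef or known zero, as the new loop in the patch does.
KnownElts computeKnownElts(const std::vector<int> &Mask) {
  assert(Mask.size() <= 64 && "sketch only models up to 64 elements");
  KnownElts Known;
  for (std::size_t i = 0, e = Mask.size(); i != e; ++i) {
    if (Mask[i] == SM_SentinelUndef)
      Known.Undef |= std::uint64_t(1) << i;
    if (Mask[i] == SM_SentinelZero)
      Known.Zero |= std::uint64_t(1) << i;
  }
  return Known;
}
```

For the mask `{0, SM_SentinelUndef, SM_SentinelZero, 3}` this sets bit 1 of `Undef` and bit 2 of `Zero`. Returning both masks from one place is what lets callers reuse them instead of rescanning the shuffle mask.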
Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=374724&r1=374723&r2=374724&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp (original) +++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Sun Oct 13 10:03:02 2019 @@ -7248,13 +7248,13 @@ static void resolveTargetShuffleInputsAn static bool getTargetShuffleInputs(SDValue Op, const APInt &DemandedElts, SmallVectorImpl &Inputs, SmallVectorImpl &Mask, + APInt &KnownUndef, APInt &KnownZero, SelectionDAG &DAG, unsigned Depth, bool ResolveZero) { EVT VT = Op.getValueType(); if (!VT.isSimple() || !VT.isVector()) return false; - APInt KnownUndef, KnownZero; if (getTargetShuffleAndZeroables(Op, Mask, Inputs, KnownUndef, KnownZero)) { for (int i = 0, e = Mask.size(); i != e; ++i) { int &M = Mask[i]; @@ -7267,8 +7267,19 @@ static bool getTargetShuffleInputs(SDVal } return true; } - return getFauxShuffleMask(Op, DemandedElts, Mask, Inputs, DAG, Depth, - ResolveZero); + if (getFauxShuffleMask(Op, DemandedElts, Mask, Inputs, DAG, Depth, + ResolveZero)) { + KnownUndef = KnownZero = APInt::getNullValue(Mask.size()); + for (int i = 0, e = Mask.size(); i != e; ++i) { + int M = Mask[i]; + if (SM_SentinelUndef == M) + KnownUndef.setBit(i); + if (SM_SentinelZero == M) + KnownZero.setBit(i); + } + return true; + } + return false; } static bool getTargetShuffleInputs(SDValue Op, SmallVectorImpl &Inputs, @@ -7279,10 +7290,11 @@ static bool getTargetShuffleInputs(SDVal if (!VT.isSimple() || !VT.isVector()) return false; + APInt KnownUndef, KnownZero; unsigned NumElts = Op.getValueType().getVectorNumElements(); APInt DemandedElts = APInt::getAllOnesValue(NumElts); - return getTargetShuffleInputs(Op, DemandedElts, Inputs, Mask, DAG, Depth, - ResolveZero); + return getTargetShuffleInputs(Op, DemandedElts, 
Inputs, Mask, KnownUndef, + KnownZero, DAG, Depth, ResolveZero); } /// Returns the scalar element that will make up the ith @@ -34572,10 +34584,11 @@ bool X86TargetLowering::SimplifyDemanded } // Get target/faux shuffle mask. + APInt OpUndef, OpZero; SmallVector OpMask; SmallVector OpInputs; - if (!getTargetShuffleInputs(Op, DemandedElts, OpInputs, OpMask, TLO.DAG, - Depth, false)) + if (!getTargetShuffleInputs(Op, DemandedElts, OpInputs, OpMask, OpUndef, + OpZero, TLO.DAG, Depth, false)) return false; // Shuffle inputs must be the same size as the result. @@ -34586,19 +34599,14 @@ bool X86TargetLowering::SimplifyDemanded })) return false; - // Clear known elts that might have been set above. - KnownZero.clearAllBits(); - KnownUndef.clearAllBits(); + KnownZero = OpZero; + KnownUndef = OpUndef; // Check if shuffle mask can be simplified to undef/zero/identity. int NumSrcs = OpInputs.size(); - for (int i = 0; i != NumElts; ++i) { - int &M = OpMask[i]; + for (int i = 0; i != NumElts; ++i) if (!DemandedElts[i]) - M = SM_SentinelUndef; - else if (0 <= M && OpInputs[M / NumElts].isUndef()) - M = SM_SentinelUndef; - } + OpMask[i] = SM_SentinelUndef; if (isUndefInRange(OpMask, 0, NumElts)) { KnownUndef.setAllBits(); @@ -34628,21 +34636,13 @@ bool X86TargetLowering::SimplifyDemanded SrcElts.setBit(M); } + // TODO - Propagate input undef/zero elts. APInt SrcUndef, SrcZero; if (SimplifyDemandedVectorElts(OpInputs[Src], SrcElts, SrcUndef, SrcZero, TLO, Depth + 1)) return true; } - // Extract known zero/undef elements. - // TODO - Propagate input undef/zero elts. 
- for (int i = 0; i != NumElts; ++i) { - if (OpMask[i] == SM_SentinelUndef) - KnownUndef.setBit(i); - if (OpMask[i] == SM_SentinelZero) - KnownZero.setBit(i); - } - return false; } From llvm-commits at lists.llvm.org Sun Oct 13 10:03:11 2019 From: llvm-commits at lists.llvm.org (Simon Pilgrim via llvm-commits) Date: Sun, 13 Oct 2019 17:03:11 -0000 Subject: [llvm] r374725 - [X86] SimplifyMultipleUseDemandedBitsForTargetNode - use getTargetShuffleInputs with KnownUndef/Zero results. Message-ID: <20191013170311.B50E08B331@lists.llvm.org> Author: rksimon Date: Sun Oct 13 10:03:11 2019 New Revision: 374725 URL: http://llvm.org/viewvc/llvm-project?rev=374725&view=rev Log: [X86] SimplifyMultipleUseDemandedBitsForTargetNode - use getTargetShuffleInputs with KnownUndef/Zero results. Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=374725&r1=374724&r2=374725&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp (original) +++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Sun Oct 13 10:03:11 2019 @@ -34952,9 +34952,11 @@ SDValue X86TargetLowering::SimplifyMulti } } + APInt ShuffleUndef, ShuffleZero; SmallVector ShuffleMask; SmallVector ShuffleOps; - if (getTargetShuffleInputs(Op, ShuffleOps, ShuffleMask, DAG, Depth)) { + if (getTargetShuffleInputs(Op, DemandedElts, ShuffleOps, ShuffleMask, + ShuffleUndef, ShuffleZero, DAG, Depth, false)) { // If all the demanded elts are from one operand and are inline, // then we can use the operand directly. 
int NumOps = ShuffleOps.size(); @@ -34963,15 +34965,17 @@ SDValue X86TargetLowering::SimplifyMulti return VT.getSizeInBits() == V.getValueSizeInBits(); })) { + if (DemandedElts.isSubsetOf(ShuffleUndef)) + return DAG.getUNDEF(VT); + if (DemandedElts.isSubsetOf(ShuffleUndef | ShuffleZero)) + return getZeroVector(VT.getSimpleVT(), Subtarget, DAG, SDLoc(Op)); + // Bitmask that indicates which ops have only been accessed 'inline'. APInt IdentityOp = APInt::getAllOnesValue(NumOps); - bool AllUndef = true; - for (int i = 0; i != NumElts; ++i) { int M = ShuffleMask[i]; - if (SM_SentinelUndef == M || !DemandedElts[i]) + if (!DemandedElts[i] || ShuffleUndef[i]) continue; - AllUndef = false; int Op = M / NumElts; int Index = M % NumElts; if (M < 0 || Index != i) { @@ -34982,16 +34986,11 @@ SDValue X86TargetLowering::SimplifyMulti if (IdentityOp == 0) break; } - - if (AllUndef) - return DAG.getUNDEF(VT); - assert((IdentityOp == 0 || IdentityOp.countPopulation() == 1) && "Multiple identity shuffles detected"); - for (int i = 0; i != NumOps; ++i) - if (IdentityOp[i]) - return DAG.getBitcast(VT, ShuffleOps[i]); + if (IdentityOp != 0) + return DAG.getBitcast(VT, ShuffleOps[IdentityOp.countTrailingZeros()]); } } From llvm-commits at lists.llvm.org Sun Oct 13 10:11:17 2019 From: llvm-commits at lists.llvm.org (Roman Lebedev via llvm-commits) Date: Sun, 13 Oct 2019 17:11:17 -0000 Subject: [llvm] r374726 - [NFC][InstCombine] More test for "sign bit test via shifts" pattern (PR43595) Message-ID: <20191013171117.2CD94835D0@lists.llvm.org> Author: lebedevri Date: Sun Oct 13 10:11:16 2019 New Revision: 374726 URL: http://llvm.org/viewvc/llvm-project?rev=374726&view=rev Log: [NFC][InstCombine] More test for "sign bit test via shifts" pattern (PR43595) While that pattern is indirectly handled via reassociateShiftAmtsOfTwoSameDirectionShifts(), that incurs a one-use restriction on truncation, which is pointless since we know that we'll produce a single instruction.
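The "sign bit test via shifts" pattern named in this log can be modelled outside LLVM. The sketch below is only an illustration under stated assumptions: `signBitViaShifts` and `signBitDirect` are hypothetical helper names, a 32-bit unsigned integer stands in for `i32`, and `NBits` is restricted to 1..32 so that both shift amounts stay below the bit width, as the IR requires.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the IR pattern:
//   %t0 = sub i32 32, %nbits
//   %t1 = lshr i32 %data, %t0
//   %t2 = sub i32 %nbits, 1
//   %t3 = lshr i32 %t1, %t2
//   %r  = icmp ne i32 %t3, 0
bool signBitViaShifts(std::uint32_t Data, unsigned NBits) {
  assert(NBits >= 1 && NBits <= 32 && "both shift amounts must stay below 32");
  std::uint32_t HighBits = Data >> (32 - NBits);   // keep the top NBits bits
  std::uint32_t SignBit = HighBits >> (NBits - 1); // drop all but the top bit
  return SignBit != 0;
}

// The single comparison the pattern is expected to fold to; testing the top
// bit corresponds to `icmp slt i32 %data, 0`.
bool signBitDirect(std::uint32_t Data) {
  return (Data & 0x80000000u) != 0;
}
```

The two logical shifts always add up to 31 bit positions, so only the top bit of `Data` survives regardless of `NBits`, and the two helpers agree on every input in the allowed range.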
Additionally, *if* we are only looking for sign bit, we don't need shifts to be identical, which isn't the case in general, and is the blocker for me in bug in question: https://bugs.llvm.org/show_bug.cgi?id=43595 Modified: llvm/trunk/test/Transforms/InstCombine/shift-amount-reassociation-in-bittest.ll llvm/trunk/test/Transforms/InstCombine/shift-amount-reassociation-with-truncation-ashr.ll llvm/trunk/test/Transforms/InstCombine/shift-amount-reassociation-with-truncation-lshr.ll llvm/trunk/test/Transforms/InstCombine/shift-amount-reassociation-with-truncation-shl.ll llvm/trunk/test/Transforms/InstCombine/shift-amount-reassociation.ll llvm/trunk/test/Transforms/InstCombine/sign-bit-test-via-right-shifting-all-other-bits.ll Modified: llvm/trunk/test/Transforms/InstCombine/shift-amount-reassociation-in-bittest.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/InstCombine/shift-amount-reassociation-in-bittest.ll?rev=374726&r1=374725&r2=374726&view=diff ============================================================================== --- llvm/trunk/test/Transforms/InstCombine/shift-amount-reassociation-in-bittest.ll (original) +++ llvm/trunk/test/Transforms/InstCombine/shift-amount-reassociation-in-bittest.ll Sun Oct 13 10:11:16 2019 @@ -671,6 +671,14 @@ define <2 x i1> @n38_overshift(<2 x i32> ; As usual, don't crash given constantexpr's :/ @f.a = internal global i16 0 define i1 @constantexpr() { +; CHECK-LABEL: @constantexpr( +; CHECK-NEXT: entry: +; CHECK-NEXT: [[TMP0:%.*]] = load i16, i16* @f.a, align 2 +; CHECK-NEXT: [[TMP1:%.*]] = lshr i16 [[TMP0]], 1 +; CHECK-NEXT: [[TMP2:%.*]] = and i16 [[TMP1]], shl (i16 1, i16 zext (i1 icmp ne (i16 ptrtoint (i16* @f.a to i16), i16 1) to i16)) +; CHECK-NEXT: [[TOBOOL:%.*]] = icmp ne i16 [[TMP2]], 0 +; CHECK-NEXT: ret i1 [[TOBOOL]] +; entry: %0 = load i16, i16* @f.a %shr = ashr i16 %0, 1 Modified: llvm/trunk/test/Transforms/InstCombine/shift-amount-reassociation-with-truncation-ashr.ll URL: 
http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/InstCombine/shift-amount-reassociation-with-truncation-ashr.ll?rev=374726&r1=374725&r2=374726&view=diff ============================================================================== --- llvm/trunk/test/Transforms/InstCombine/shift-amount-reassociation-with-truncation-ashr.ll (original) +++ llvm/trunk/test/Transforms/InstCombine/shift-amount-reassociation-with-truncation-ashr.ll Sun Oct 13 10:11:16 2019 @@ -166,7 +166,9 @@ define i16 @t9_ashr(i32 %x, i16 %y) { ; CHECK-NEXT: [[T1:%.*]] = zext i16 [[T0]] to i32 ; CHECK-NEXT: [[T2:%.*]] = ashr i32 [[X:%.*]], [[T1]] ; CHECK-NEXT: [[T3:%.*]] = trunc i32 [[T2]] to i16 -; CHECK-NEXT: ret i16 [[T3]] +; CHECK-NEXT: [[T4:%.*]] = add i16 [[Y]], -2 +; CHECK-NEXT: [[T5:%.*]] = ashr i16 [[T3]], [[T4]] +; CHECK-NEXT: ret i16 [[T5]] ; %t0 = sub i16 32, %y %t1 = zext i16 %t0 to i32 @@ -174,5 +176,25 @@ define i16 @t9_ashr(i32 %x, i16 %y) { %t3 = trunc i32 %t2 to i16 %t4 = add i16 %y, -2 %t5 = ashr i16 %t3, %t4 - ret i16 %t3 + ret i16 %t5 +} + +; If we have different right-shifts, in general, we can't do anything with it. 
+define i16 @n10_lshr_ashr(i32 %x, i16 %y) { +; CHECK-LABEL: @n10_lshr_ashr( +; CHECK-NEXT: [[T0:%.*]] = sub i16 32, [[Y:%.*]] +; CHECK-NEXT: [[T1:%.*]] = zext i16 [[T0]] to i32 +; CHECK-NEXT: [[T2:%.*]] = lshr i32 [[X:%.*]], [[T1]] +; CHECK-NEXT: [[T3:%.*]] = trunc i32 [[T2]] to i16 +; CHECK-NEXT: [[T4:%.*]] = add i16 [[Y]], -1 +; CHECK-NEXT: [[T5:%.*]] = ashr i16 [[T3]], [[T4]] +; CHECK-NEXT: ret i16 [[T5]] +; + %t0 = sub i16 32, %y + %t1 = zext i16 %t0 to i32 + %t2 = lshr i32 %x, %t1 + %t3 = trunc i32 %t2 to i16 + %t4 = add i16 %y, -1 + %t5 = ashr i16 %t3, %t4 + ret i16 %t5 } Modified: llvm/trunk/test/Transforms/InstCombine/shift-amount-reassociation-with-truncation-lshr.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/InstCombine/shift-amount-reassociation-with-truncation-lshr.ll?rev=374726&r1=374725&r2=374726&view=diff ============================================================================== --- llvm/trunk/test/Transforms/InstCombine/shift-amount-reassociation-with-truncation-lshr.ll (original) +++ llvm/trunk/test/Transforms/InstCombine/shift-amount-reassociation-with-truncation-lshr.ll Sun Oct 13 10:11:16 2019 @@ -166,7 +166,9 @@ define i16 @t9_lshr(i32 %x, i16 %y) { ; CHECK-NEXT: [[T1:%.*]] = zext i16 [[T0]] to i32 ; CHECK-NEXT: [[T2:%.*]] = lshr i32 [[X:%.*]], [[T1]] ; CHECK-NEXT: [[T3:%.*]] = trunc i32 [[T2]] to i16 -; CHECK-NEXT: ret i16 [[T3]] +; CHECK-NEXT: [[T4:%.*]] = add i16 [[Y]], -2 +; CHECK-NEXT: [[T5:%.*]] = lshr i16 [[T3]], [[T4]] +; CHECK-NEXT: ret i16 [[T5]] ; %t0 = sub i16 32, %y %t1 = zext i16 %t0 to i32 @@ -174,5 +176,25 @@ define i16 @t9_lshr(i32 %x, i16 %y) { %t3 = trunc i32 %t2 to i16 %t4 = add i16 %y, -2 %t5 = lshr i16 %t3, %t4 - ret i16 %t3 + ret i16 %t5 +} + +; If we have different right-shifts, in general, we can't do anything with it. 
+define i16 @n10_ashr_lshr(i32 %x, i16 %y) { +; CHECK-LABEL: @n10_ashr_lshr( +; CHECK-NEXT: [[T0:%.*]] = sub i16 32, [[Y:%.*]] +; CHECK-NEXT: [[T1:%.*]] = zext i16 [[T0]] to i32 +; CHECK-NEXT: [[T2:%.*]] = ashr i32 [[X:%.*]], [[T1]] +; CHECK-NEXT: [[T3:%.*]] = trunc i32 [[T2]] to i16 +; CHECK-NEXT: [[T4:%.*]] = add i16 [[Y]], -1 +; CHECK-NEXT: [[T5:%.*]] = lshr i16 [[T3]], [[T4]] +; CHECK-NEXT: ret i16 [[T5]] +; + %t0 = sub i16 32, %y + %t1 = zext i16 %t0 to i32 + %t2 = ashr i32 %x, %t1 + %t3 = trunc i32 %t2 to i16 + %t4 = add i16 %y, -1 + %t5 = lshr i16 %t3, %t4 + ret i16 %t5 } Modified: llvm/trunk/test/Transforms/InstCombine/shift-amount-reassociation-with-truncation-shl.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/InstCombine/shift-amount-reassociation-with-truncation-shl.ll?rev=374726&r1=374725&r2=374726&view=diff ============================================================================== --- llvm/trunk/test/Transforms/InstCombine/shift-amount-reassociation-with-truncation-shl.ll (original) +++ llvm/trunk/test/Transforms/InstCombine/shift-amount-reassociation-with-truncation-shl.ll Sun Oct 13 10:11:16 2019 @@ -181,15 +181,17 @@ define i16 @n11(i32 %x, i16 %y) { ; CHECK-NEXT: [[T1:%.*]] = zext i16 [[T0]] to i32 ; CHECK-NEXT: [[T2:%.*]] = shl i32 [[X:%.*]], [[T1]] ; CHECK-NEXT: [[T3:%.*]] = trunc i32 [[T2]] to i16 -; CHECK-NEXT: ret i16 [[T3]] +; CHECK-NEXT: [[T4:%.*]] = add i16 [[Y]], -31 +; CHECK-NEXT: [[T5:%.*]] = shl i16 [[T3]], [[T4]] +; CHECK-NEXT: ret i16 [[T5]] ; %t0 = sub i16 30, %y %t1 = zext i16 %t0 to i32 %t2 = shl i32 %x, %t1 %t3 = trunc i32 %t2 to i16 - %t4 = add i16 %y, -24 + %t4 = add i16 %y, -31 %t5 = shl i16 %t3, %t4 - ret i16 %t3 + ret i16 %t5 } ; Bit width mismatch of shit amount Modified: llvm/trunk/test/Transforms/InstCombine/shift-amount-reassociation.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/InstCombine/shift-amount-reassociation.ll?rev=374726&r1=374725&r2=374726&view=diff 
============================================================================== --- llvm/trunk/test/Transforms/InstCombine/shift-amount-reassociation.ll (original) +++ llvm/trunk/test/Transforms/InstCombine/shift-amount-reassociation.ll Sun Oct 13 10:11:16 2019 @@ -203,3 +203,119 @@ define <2 x i32> @t13_vec(<2 x i32> %x, %t3 = lshr <2 x i32> %t1, %t2 ret <2 x i32> %t3 } + +; If we have different right-shifts, in general, we can't do anything with it. +define i32 @n13(i32 %x, i32 %y) { +; CHECK-LABEL: @n13( +; CHECK-NEXT: [[T0:%.*]] = sub i32 32, [[Y:%.*]] +; CHECK-NEXT: [[T1:%.*]] = lshr i32 [[X:%.*]], [[T0]] +; CHECK-NEXT: [[T2:%.*]] = add i32 [[Y]], -2 +; CHECK-NEXT: [[T3:%.*]] = ashr i32 [[T1]], [[T2]] +; CHECK-NEXT: ret i32 [[T3]] +; + %t0 = sub i32 32, %y + %t1 = lshr i32 %x, %t0 + %t2 = add i32 %y, -2 + %t3 = ashr i32 %t1, %t2 + ret i32 %t3 +} +define i32 @n14(i32 %x, i32 %y) { +; CHECK-LABEL: @n14( +; CHECK-NEXT: [[T0:%.*]] = sub i32 32, [[Y:%.*]] +; CHECK-NEXT: [[T1:%.*]] = lshr i32 [[X:%.*]], [[T0]] +; CHECK-NEXT: [[T2:%.*]] = add i32 [[Y]], -1 +; CHECK-NEXT: [[T3:%.*]] = ashr i32 [[T1]], [[T2]] +; CHECK-NEXT: ret i32 [[T3]] +; + %t0 = sub i32 32, %y + %t1 = lshr i32 %x, %t0 + %t2 = add i32 %y, -1 + %t3 = ashr i32 %t1, %t2 + ret i32 %t3 +} +define i32 @n15(i32 %x, i32 %y) { +; CHECK-LABEL: @n15( +; CHECK-NEXT: [[T0:%.*]] = sub i32 32, [[Y:%.*]] +; CHECK-NEXT: [[T1:%.*]] = ashr i32 [[X:%.*]], [[T0]] +; CHECK-NEXT: [[T2:%.*]] = add i32 [[Y]], -2 +; CHECK-NEXT: [[T3:%.*]] = lshr i32 [[T1]], [[T2]] +; CHECK-NEXT: ret i32 [[T3]] +; + %t0 = sub i32 32, %y + %t1 = ashr i32 %x, %t0 + %t2 = add i32 %y, -2 + %t3 = lshr i32 %t1, %t2 + ret i32 %t3 +} +define i32 @n16(i32 %x, i32 %y) { +; CHECK-LABEL: @n16( +; CHECK-NEXT: [[T0:%.*]] = sub i32 32, [[Y:%.*]] +; CHECK-NEXT: [[T1:%.*]] = ashr i32 [[X:%.*]], [[T0]] +; CHECK-NEXT: [[T2:%.*]] = add i32 [[Y]], -1 +; CHECK-NEXT: [[T3:%.*]] = lshr i32 [[T1]], [[T2]] +; CHECK-NEXT: ret i32 [[T3]] +; + %t0 = sub i32 32, %y + %t1 = 
ashr i32 %x, %t0 + %t2 = add i32 %y, -1 + %t3 = lshr i32 %t1, %t2 + ret i32 %t3 +} + +; If the shift direction is different, then this should be handled elsewhere. +define i32 @n17(i32 %x, i32 %y) { +; CHECK-LABEL: @n17( +; CHECK-NEXT: [[T0:%.*]] = sub i32 32, [[Y:%.*]] +; CHECK-NEXT: [[T1:%.*]] = shl i32 [[X:%.*]], [[T0]] +; CHECK-NEXT: [[T2:%.*]] = add i32 [[Y]], -1 +; CHECK-NEXT: [[T3:%.*]] = lshr i32 [[T1]], [[T2]] +; CHECK-NEXT: ret i32 [[T3]] +; + %t0 = sub i32 32, %y + %t1 = shl i32 %x, %t0 + %t2 = add i32 %y, -1 + %t3 = lshr i32 %t1, %t2 + ret i32 %t3 +} +define i32 @n18(i32 %x, i32 %y) { +; CHECK-LABEL: @n18( +; CHECK-NEXT: [[T0:%.*]] = sub i32 32, [[Y:%.*]] +; CHECK-NEXT: [[T1:%.*]] = shl i32 [[X:%.*]], [[T0]] +; CHECK-NEXT: [[T2:%.*]] = add i32 [[Y]], -1 +; CHECK-NEXT: [[T3:%.*]] = ashr i32 [[T1]], [[T2]] +; CHECK-NEXT: ret i32 [[T3]] +; + %t0 = sub i32 32, %y + %t1 = shl i32 %x, %t0 + %t2 = add i32 %y, -1 + %t3 = ashr i32 %t1, %t2 + ret i32 %t3 +} +define i32 @n19(i32 %x, i32 %y) { +; CHECK-LABEL: @n19( +; CHECK-NEXT: [[T0:%.*]] = sub i32 32, [[Y:%.*]] +; CHECK-NEXT: [[T1:%.*]] = lshr i32 [[X:%.*]], [[T0]] +; CHECK-NEXT: [[T2:%.*]] = add i32 [[Y]], -1 +; CHECK-NEXT: [[T3:%.*]] = shl i32 [[T1]], [[T2]] +; CHECK-NEXT: ret i32 [[T3]] +; + %t0 = sub i32 32, %y + %t1 = lshr i32 %x, %t0 + %t2 = add i32 %y, -1 + %t3 = shl i32 %t1, %t2 + ret i32 %t3 +} +define i32 @n20(i32 %x, i32 %y) { +; CHECK-LABEL: @n20( +; CHECK-NEXT: [[T0:%.*]] = sub i32 32, [[Y:%.*]] +; CHECK-NEXT: [[T1:%.*]] = ashr i32 [[X:%.*]], [[T0]] +; CHECK-NEXT: [[T2:%.*]] = add i32 [[Y]], -1 +; CHECK-NEXT: [[T3:%.*]] = shl i32 [[T1]], [[T2]] +; CHECK-NEXT: ret i32 [[T3]] +; + %t0 = sub i32 32, %y + %t1 = ashr i32 %x, %t0 + %t2 = add i32 %y, -1 + %t3 = shl i32 %t1, %t2 + ret i32 %t3 +} Modified: llvm/trunk/test/Transforms/InstCombine/sign-bit-test-via-right-shifting-all-other-bits.ll URL: 
http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/InstCombine/sign-bit-test-via-right-shifting-all-other-bits.ll?rev=374726&r1=374725&r2=374726&view=diff ============================================================================== --- llvm/trunk/test/Transforms/InstCombine/sign-bit-test-via-right-shifting-all-other-bits.ll (original) +++ llvm/trunk/test/Transforms/InstCombine/sign-bit-test-via-right-shifting-all-other-bits.ll Sun Oct 13 10:11:16 2019 @@ -1,22 +1,51 @@ ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py ; RUN: opt < %s -instcombine -S | FileCheck %s +declare void @use32(i32) +declare void @use64(i64) + define i1 @highest_bit_test_via_lshr(i32 %data, i32 %nbits) { ; CHECK-LABEL: @highest_bit_test_via_lshr( -; CHECK-NEXT: [[ISNEG:%.*]] = icmp slt i32 [[DATA:%.*]], 0 +; CHECK-NEXT: [[NUM_LOW_BITS_TO_SKIP:%.*]] = sub i32 32, [[NBITS:%.*]] +; CHECK-NEXT: [[HIGH_BITS_EXTRACTED:%.*]] = lshr i32 [[DATA:%.*]], [[NUM_LOW_BITS_TO_SKIP]] +; CHECK-NEXT: [[SKIP_ALL_BITS_TILL_SIGNBIT:%.*]] = add i32 [[NBITS]], -1 +; CHECK-NEXT: [[SIGNBIT:%.*]] = lshr i32 [[DATA]], 31 +; CHECK-NEXT: call void @use32(i32 [[NUM_LOW_BITS_TO_SKIP]]) +; CHECK-NEXT: call void @use32(i32 [[HIGH_BITS_EXTRACTED]]) +; CHECK-NEXT: call void @use32(i32 [[SKIP_ALL_BITS_TILL_SIGNBIT]]) +; CHECK-NEXT: call void @use32(i32 [[SIGNBIT]]) +; CHECK-NEXT: [[ISNEG:%.*]] = icmp slt i32 [[DATA]], 0 ; CHECK-NEXT: ret i1 [[ISNEG]] ; %num_low_bits_to_skip = sub i32 32, %nbits %high_bits_extracted = lshr i32 %data, %num_low_bits_to_skip %skip_all_bits_till_signbit = sub i32 %nbits, 1 %signbit = lshr i32 %high_bits_extracted, %skip_all_bits_till_signbit + + call void @use32(i32 %num_low_bits_to_skip) + call void @use32(i32 %high_bits_extracted) + call void @use32(i32 %skip_all_bits_till_signbit) + call void @use32(i32 %signbit) + %isneg = icmp ne i32 %signbit, 0 ret i1 %isneg } define i1 @highest_bit_test_via_lshr_with_truncation(i64 %data, i32 %nbits) { ; CHECK-LABEL: 
@highest_bit_test_via_lshr_with_truncation( -; CHECK-NEXT: [[ISNEG:%.*]] = icmp slt i64 [[DATA:%.*]], 0 +; CHECK-NEXT: [[NUM_LOW_BITS_TO_SKIP:%.*]] = sub i32 64, [[NBITS:%.*]] +; CHECK-NEXT: [[NUM_LOW_BITS_TO_SKIP_WIDE:%.*]] = zext i32 [[NUM_LOW_BITS_TO_SKIP]] to i64 +; CHECK-NEXT: [[HIGH_BITS_EXTRACTED:%.*]] = lshr i64 [[DATA:%.*]], [[NUM_LOW_BITS_TO_SKIP_WIDE]] +; CHECK-NEXT: [[HIGH_BITS_EXTRACTED_NARROW:%.*]] = trunc i64 [[HIGH_BITS_EXTRACTED]] to i32 +; CHECK-NEXT: [[SKIP_ALL_BITS_TILL_SIGNBIT:%.*]] = add i32 [[NBITS]], -1 +; CHECK-NEXT: [[SIGNBIT:%.*]] = lshr i32 [[HIGH_BITS_EXTRACTED_NARROW]], [[SKIP_ALL_BITS_TILL_SIGNBIT]] +; CHECK-NEXT: call void @use32(i32 [[NUM_LOW_BITS_TO_SKIP]]) +; CHECK-NEXT: call void @use64(i64 [[NUM_LOW_BITS_TO_SKIP_WIDE]]) +; CHECK-NEXT: call void @use64(i64 [[HIGH_BITS_EXTRACTED]]) +; CHECK-NEXT: call void @use32(i32 [[HIGH_BITS_EXTRACTED_NARROW]]) +; CHECK-NEXT: call void @use32(i32 [[SKIP_ALL_BITS_TILL_SIGNBIT]]) +; CHECK-NEXT: call void @use32(i32 [[SIGNBIT]]) +; CHECK-NEXT: [[ISNEG:%.*]] = icmp ne i32 [[SIGNBIT]], 0 ; CHECK-NEXT: ret i1 [[ISNEG]] ; %num_low_bits_to_skip = sub i32 64, %nbits @@ -25,26 +54,60 @@ define i1 @highest_bit_test_via_lshr_wit %high_bits_extracted_narrow = trunc i64 %high_bits_extracted to i32 %skip_all_bits_till_signbit = sub i32 %nbits, 1 %signbit = lshr i32 %high_bits_extracted_narrow, %skip_all_bits_till_signbit + + call void @use32(i32 %num_low_bits_to_skip) + call void @use64(i64 %num_low_bits_to_skip_wide) + call void @use64(i64 %high_bits_extracted) + call void @use32(i32 %high_bits_extracted_narrow) + call void @use32(i32 %skip_all_bits_till_signbit) + call void @use32(i32 %signbit) + %isneg = icmp ne i32 %signbit, 0 ret i1 %isneg } define i1 @highest_bit_test_via_ashr(i32 %data, i32 %nbits) { ; CHECK-LABEL: @highest_bit_test_via_ashr( -; CHECK-NEXT: [[ISNEG:%.*]] = icmp slt i32 [[DATA:%.*]], 0 +; CHECK-NEXT: [[NUM_LOW_BITS_TO_SKIP:%.*]] = sub i32 32, [[NBITS:%.*]] +; CHECK-NEXT: 
[[HIGH_BITS_EXTRACTED:%.*]] = ashr i32 [[DATA:%.*]], [[NUM_LOW_BITS_TO_SKIP]] +; CHECK-NEXT: [[SKIP_ALL_BITS_TILL_SIGNBIT:%.*]] = add i32 [[NBITS]], -1 +; CHECK-NEXT: [[SIGNBIT:%.*]] = ashr i32 [[DATA]], 31 +; CHECK-NEXT: call void @use32(i32 [[NUM_LOW_BITS_TO_SKIP]]) +; CHECK-NEXT: call void @use32(i32 [[HIGH_BITS_EXTRACTED]]) +; CHECK-NEXT: call void @use32(i32 [[SKIP_ALL_BITS_TILL_SIGNBIT]]) +; CHECK-NEXT: call void @use32(i32 [[SIGNBIT]]) +; CHECK-NEXT: [[ISNEG:%.*]] = icmp slt i32 [[DATA]], 0 ; CHECK-NEXT: ret i1 [[ISNEG]] ; %num_low_bits_to_skip = sub i32 32, %nbits %high_bits_extracted = ashr i32 %data, %num_low_bits_to_skip %skip_all_bits_till_signbit = sub i32 %nbits, 1 %signbit = ashr i32 %high_bits_extracted, %skip_all_bits_till_signbit + + call void @use32(i32 %num_low_bits_to_skip) + call void @use32(i32 %high_bits_extracted) + call void @use32(i32 %skip_all_bits_till_signbit) + call void @use32(i32 %signbit) + %isneg = icmp ne i32 %signbit, 0 ret i1 %isneg } define i1 @highest_bit_test_via_ashr_with_truncation(i64 %data, i32 %nbits) { ; CHECK-LABEL: @highest_bit_test_via_ashr_with_truncation( -; CHECK-NEXT: [[ISNEG:%.*]] = icmp slt i64 [[DATA:%.*]], 0 +; CHECK-NEXT: [[NUM_LOW_BITS_TO_SKIP:%.*]] = sub i32 64, [[NBITS:%.*]] +; CHECK-NEXT: [[NUM_LOW_BITS_TO_SKIP_WIDE:%.*]] = zext i32 [[NUM_LOW_BITS_TO_SKIP]] to i64 +; CHECK-NEXT: [[HIGH_BITS_EXTRACTED:%.*]] = ashr i64 [[DATA:%.*]], [[NUM_LOW_BITS_TO_SKIP_WIDE]] +; CHECK-NEXT: [[HIGH_BITS_EXTRACTED_NARROW:%.*]] = trunc i64 [[HIGH_BITS_EXTRACTED]] to i32 +; CHECK-NEXT: [[SKIP_ALL_BITS_TILL_SIGNBIT:%.*]] = add i32 [[NBITS]], -1 +; CHECK-NEXT: [[SIGNBIT:%.*]] = ashr i32 [[HIGH_BITS_EXTRACTED_NARROW]], [[SKIP_ALL_BITS_TILL_SIGNBIT]] +; CHECK-NEXT: call void @use32(i32 [[NUM_LOW_BITS_TO_SKIP]]) +; CHECK-NEXT: call void @use64(i64 [[NUM_LOW_BITS_TO_SKIP_WIDE]]) +; CHECK-NEXT: call void @use64(i64 [[HIGH_BITS_EXTRACTED]]) +; CHECK-NEXT: call void @use32(i32 [[HIGH_BITS_EXTRACTED_NARROW]]) +; CHECK-NEXT: call 
void @use32(i32 [[SKIP_ALL_BITS_TILL_SIGNBIT]]) +; CHECK-NEXT: call void @use32(i32 [[SIGNBIT]]) +; CHECK-NEXT: [[ISNEG:%.*]] = icmp ne i32 [[SIGNBIT]], 0 ; CHECK-NEXT: ret i1 [[ISNEG]] ; %num_low_bits_to_skip = sub i32 64, %nbits @@ -53,12 +116,143 @@ define i1 @highest_bit_test_via_ashr_wit %high_bits_extracted_narrow = trunc i64 %high_bits_extracted to i32 %skip_all_bits_till_signbit = sub i32 %nbits, 1 %signbit = ashr i32 %high_bits_extracted_narrow, %skip_all_bits_till_signbit + + call void @use32(i32 %num_low_bits_to_skip) + call void @use64(i64 %num_low_bits_to_skip_wide) + call void @use64(i64 %high_bits_extracted) + call void @use32(i32 %high_bits_extracted_narrow) + call void @use32(i32 %skip_all_bits_till_signbit) + call void @use32(i32 %signbit) + %isneg = icmp ne i32 %signbit, 0 ret i1 %isneg } -declare void @use32(i32) -declare void @use64(i64) +define i1 @highest_bit_test_via_lshr_ashr(i32 %data, i32 %nbits) { +; CHECK-LABEL: @highest_bit_test_via_lshr_ashr( +; CHECK-NEXT: [[NUM_LOW_BITS_TO_SKIP:%.*]] = sub i32 32, [[NBITS:%.*]] +; CHECK-NEXT: [[HIGH_BITS_EXTRACTED:%.*]] = lshr i32 [[DATA:%.*]], [[NUM_LOW_BITS_TO_SKIP]] +; CHECK-NEXT: [[SKIP_ALL_BITS_TILL_SIGNBIT:%.*]] = add i32 [[NBITS]], -1 +; CHECK-NEXT: [[SIGNBIT:%.*]] = ashr i32 [[HIGH_BITS_EXTRACTED]], [[SKIP_ALL_BITS_TILL_SIGNBIT]] +; CHECK-NEXT: call void @use32(i32 [[NUM_LOW_BITS_TO_SKIP]]) +; CHECK-NEXT: call void @use32(i32 [[HIGH_BITS_EXTRACTED]]) +; CHECK-NEXT: call void @use32(i32 [[SKIP_ALL_BITS_TILL_SIGNBIT]]) +; CHECK-NEXT: call void @use32(i32 [[SIGNBIT]]) +; CHECK-NEXT: [[ISNEG:%.*]] = icmp ne i32 [[SIGNBIT]], 0 +; CHECK-NEXT: ret i1 [[ISNEG]] +; + %num_low_bits_to_skip = sub i32 32, %nbits + %high_bits_extracted = lshr i32 %data, %num_low_bits_to_skip + %skip_all_bits_till_signbit = sub i32 %nbits, 1 + %signbit = ashr i32 %high_bits_extracted, %skip_all_bits_till_signbit + + call void @use32(i32 %num_low_bits_to_skip) + call void @use32(i32 %high_bits_extracted) + call void 
@use32(i32 %skip_all_bits_till_signbit) + call void @use32(i32 %signbit) + + %isneg = icmp ne i32 %signbit, 0 + ret i1 %isneg +} + +define i1 @highest_bit_test_via_lshr_ashe_with_truncation(i64 %data, i32 %nbits) { +; CHECK-LABEL: @highest_bit_test_via_lshr_ashe_with_truncation( +; CHECK-NEXT: [[NUM_LOW_BITS_TO_SKIP:%.*]] = sub i32 64, [[NBITS:%.*]] +; CHECK-NEXT: [[NUM_LOW_BITS_TO_SKIP_WIDE:%.*]] = zext i32 [[NUM_LOW_BITS_TO_SKIP]] to i64 +; CHECK-NEXT: [[HIGH_BITS_EXTRACTED:%.*]] = lshr i64 [[DATA:%.*]], [[NUM_LOW_BITS_TO_SKIP_WIDE]] +; CHECK-NEXT: [[HIGH_BITS_EXTRACTED_NARROW:%.*]] = trunc i64 [[HIGH_BITS_EXTRACTED]] to i32 +; CHECK-NEXT: [[SKIP_ALL_BITS_TILL_SIGNBIT:%.*]] = add i32 [[NBITS]], -1 +; CHECK-NEXT: [[SIGNBIT:%.*]] = ashr i32 [[HIGH_BITS_EXTRACTED_NARROW]], [[SKIP_ALL_BITS_TILL_SIGNBIT]] +; CHECK-NEXT: call void @use32(i32 [[NUM_LOW_BITS_TO_SKIP]]) +; CHECK-NEXT: call void @use64(i64 [[NUM_LOW_BITS_TO_SKIP_WIDE]]) +; CHECK-NEXT: call void @use64(i64 [[HIGH_BITS_EXTRACTED]]) +; CHECK-NEXT: call void @use32(i32 [[HIGH_BITS_EXTRACTED_NARROW]]) +; CHECK-NEXT: call void @use32(i32 [[SKIP_ALL_BITS_TILL_SIGNBIT]]) +; CHECK-NEXT: call void @use32(i32 [[SIGNBIT]]) +; CHECK-NEXT: [[ISNEG:%.*]] = icmp ne i32 [[SIGNBIT]], 0 +; CHECK-NEXT: ret i1 [[ISNEG]] +; + %num_low_bits_to_skip = sub i32 64, %nbits + %num_low_bits_to_skip_wide = zext i32 %num_low_bits_to_skip to i64 + %high_bits_extracted = lshr i64 %data, %num_low_bits_to_skip_wide + %high_bits_extracted_narrow = trunc i64 %high_bits_extracted to i32 + %skip_all_bits_till_signbit = sub i32 %nbits, 1 + %signbit = ashr i32 %high_bits_extracted_narrow, %skip_all_bits_till_signbit + + call void @use32(i32 %num_low_bits_to_skip) + call void @use64(i64 %num_low_bits_to_skip_wide) + call void @use64(i64 %high_bits_extracted) + call void @use32(i32 %high_bits_extracted_narrow) + call void @use32(i32 %skip_all_bits_till_signbit) + call void @use32(i32 %signbit) + + %isneg = icmp ne i32 %signbit, 0 + ret i1 %isneg +} 
+ +define i1 @highest_bit_test_via_ashr_lshr(i32 %data, i32 %nbits) { +; CHECK-LABEL: @highest_bit_test_via_ashr_lshr( +; CHECK-NEXT: [[NUM_LOW_BITS_TO_SKIP:%.*]] = sub i32 32, [[NBITS:%.*]] +; CHECK-NEXT: [[HIGH_BITS_EXTRACTED:%.*]] = ashr i32 [[DATA:%.*]], [[NUM_LOW_BITS_TO_SKIP]] +; CHECK-NEXT: [[SKIP_ALL_BITS_TILL_SIGNBIT:%.*]] = add i32 [[NBITS]], -1 +; CHECK-NEXT: [[SIGNBIT:%.*]] = lshr i32 [[HIGH_BITS_EXTRACTED]], [[SKIP_ALL_BITS_TILL_SIGNBIT]] +; CHECK-NEXT: call void @use32(i32 [[NUM_LOW_BITS_TO_SKIP]]) +; CHECK-NEXT: call void @use32(i32 [[HIGH_BITS_EXTRACTED]]) +; CHECK-NEXT: call void @use32(i32 [[SKIP_ALL_BITS_TILL_SIGNBIT]]) +; CHECK-NEXT: call void @use32(i32 [[SIGNBIT]]) +; CHECK-NEXT: [[ISNEG:%.*]] = icmp ne i32 [[SIGNBIT]], 0 +; CHECK-NEXT: ret i1 [[ISNEG]] +; + %num_low_bits_to_skip = sub i32 32, %nbits + %high_bits_extracted = ashr i32 %data, %num_low_bits_to_skip + %skip_all_bits_till_signbit = sub i32 %nbits, 1 + %signbit = lshr i32 %high_bits_extracted, %skip_all_bits_till_signbit + + call void @use32(i32 %num_low_bits_to_skip) + call void @use32(i32 %high_bits_extracted) + call void @use32(i32 %skip_all_bits_till_signbit) + call void @use32(i32 %signbit) + + %isneg = icmp ne i32 %signbit, 0 + ret i1 %isneg +} + +define i1 @highest_bit_test_via_ashr_lshr_with_truncation(i64 %data, i32 %nbits) { +; CHECK-LABEL: @highest_bit_test_via_ashr_lshr_with_truncation( +; CHECK-NEXT: [[NUM_LOW_BITS_TO_SKIP:%.*]] = sub i32 64, [[NBITS:%.*]] +; CHECK-NEXT: [[NUM_LOW_BITS_TO_SKIP_WIDE:%.*]] = zext i32 [[NUM_LOW_BITS_TO_SKIP]] to i64 +; CHECK-NEXT: [[HIGH_BITS_EXTRACTED:%.*]] = ashr i64 [[DATA:%.*]], [[NUM_LOW_BITS_TO_SKIP_WIDE]] +; CHECK-NEXT: [[HIGH_BITS_EXTRACTED_NARROW:%.*]] = trunc i64 [[HIGH_BITS_EXTRACTED]] to i32 +; CHECK-NEXT: [[SKIP_ALL_BITS_TILL_SIGNBIT:%.*]] = add i32 [[NBITS]], -1 +; CHECK-NEXT: [[SIGNBIT:%.*]] = lshr i32 [[HIGH_BITS_EXTRACTED_NARROW]], [[SKIP_ALL_BITS_TILL_SIGNBIT]] +; CHECK-NEXT: call void @use32(i32 [[NUM_LOW_BITS_TO_SKIP]]) 
+; CHECK-NEXT: call void @use64(i64 [[NUM_LOW_BITS_TO_SKIP_WIDE]]) +; CHECK-NEXT: call void @use64(i64 [[HIGH_BITS_EXTRACTED]]) +; CHECK-NEXT: call void @use32(i32 [[HIGH_BITS_EXTRACTED_NARROW]]) +; CHECK-NEXT: call void @use32(i32 [[SKIP_ALL_BITS_TILL_SIGNBIT]]) +; CHECK-NEXT: call void @use32(i32 [[SIGNBIT]]) +; CHECK-NEXT: [[ISNEG:%.*]] = icmp ne i32 [[SIGNBIT]], 0 +; CHECK-NEXT: ret i1 [[ISNEG]] +; + %num_low_bits_to_skip = sub i32 64, %nbits + %num_low_bits_to_skip_wide = zext i32 %num_low_bits_to_skip to i64 + %high_bits_extracted = ashr i64 %data, %num_low_bits_to_skip_wide + %high_bits_extracted_narrow = trunc i64 %high_bits_extracted to i32 + %skip_all_bits_till_signbit = sub i32 %nbits, 1 + %signbit = lshr i32 %high_bits_extracted_narrow, %skip_all_bits_till_signbit + + call void @use32(i32 %num_low_bits_to_skip) + call void @use64(i64 %num_low_bits_to_skip_wide) + call void @use64(i64 %high_bits_extracted) + call void @use32(i32 %high_bits_extracted_narrow) + call void @use32(i32 %skip_all_bits_till_signbit) + call void @use32(i32 %signbit) + + %isneg = icmp ne i32 %signbit, 0 + ret i1 %isneg +} + +;------------------------------------------------------------------------------; define i1 @unsigned_sign_bit_extract(i32 %x) { ; CHECK-LABEL: @unsigned_sign_bit_extract( From llvm-commits at lists.llvm.org Sun Oct 13 10:10:26 2019 From: llvm-commits at lists.llvm.org (kamlesh kumar via Phabricator via llvm-commits) Date: Sun, 13 Oct 2019 17:10:26 +0000 (UTC) Subject: [PATCH] D68907: [6/7/trunk] -fno-plt generates wrong relocation for std::ios_base::Init leading to segmentation fault In-Reply-To: References: Message-ID: kamleshbhalui updated this revision to Diff 224785. 
Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68907/new/ https://reviews.llvm.org/D68907 Files: llvm/lib/Target/X86/X86FastISel.cpp llvm/test/CodeGen/X86/pr39252.ll Index: llvm/test/CodeGen/X86/pr39252.ll =================================================================== --- /dev/null +++ llvm/test/CodeGen/X86/pr39252.ll @@ -0,0 +1,27 @@ +; RUN: llc < %s -mtriple=x86_64-unknown-linux-gnu -fast-isel=1 | FileCheck %s -check-prefix=FASTISEL + +%"g1" = type { i8 } + +@g2 = internal global %"g1" zeroinitializer, align 1 +@__dso_handle = external hidden global i8 + +define internal void @foobar() #0 { +entry: + call void @foo(%"g1"* @g2) + %0 = call i32 @func(void (i8*)* bitcast (void (%"g1"*)* @bar to void (i8*)*), i8* getelementptr inbounds (%"g1", %"g1"* @g2, i32 0, i32 0), i8* @__dso_handle) #3 + ret void +; FASTISEL: movq bar@GOTPCREL(%rip), %rdi +} + +declare void @foo(%"g1"*) unnamed_addr #1 + +declare void @bar(%"g1"*) unnamed_addr #2 + +declare i32 @func(void (i8*)*, i8*, i8*) #3 + +attributes #0 = { noinline uwtable } +attributes #1 = { nonlazybind } +attributes #2 = { nounwind nonlazybind } +attributes #3 = { nounwind } + + Index: llvm/lib/Target/X86/X86FastISel.cpp =================================================================== --- llvm/lib/Target/X86/X86FastISel.cpp +++ llvm/lib/Target/X86/X86FastISel.cpp @@ -745,6 +745,12 @@ AM.Base.Reg = getInstrInfo()->getGlobalBaseReg(FuncInfo.MF); } + bool NeedLoad = GVFlags == X86II::MO_GOTPCREL; + if (NeedLoad) { + assert(AM.Base.Reg == 0 && AM.IndexReg == 0); + AM.Base.Reg = X86::RIP; + } + // Unless the ABI requires an extra load, return a direct reference to // the global. if (!isGlobalStubReference(GVFlags)) { -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68907.224785.patch Type: text/x-patch Size: 1608 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Sun Oct 13 10:19:08 2019 From: llvm-commits at lists.llvm.org (Sanjay Patel via llvm-commits) Date: Sun, 13 Oct 2019 17:19:08 -0000 Subject: [llvm] r374728 - [InstCombine] don't assume 'inbounds' for bitcast deref or null pointer in non-default address space Message-ID: <20191013171908.3C37C8B478@lists.llvm.org> Author: spatel Date: Sun Oct 13 10:19:08 2019 New Revision: 374728 URL: http://llvm.org/viewvc/llvm-project?rev=374728&view=rev Log: [InstCombine] don't assume 'inbounds' for bitcast deref or null pointer in non-default address space Follow-up to D68244 to account for a corner case discussed in: https://bugs.llvm.org/show_bug.cgi?id=43501 Add one more restriction: if the pointer is deref-or-null and in a non-default (non-zero) address space, we can't assume inbounds. Differential Revision: https://reviews.llvm.org/D68706 Modified: llvm/trunk/lib/Transforms/InstCombine/InstCombineCasts.cpp llvm/trunk/test/Transforms/InstCombine/load-bitcast-vec.ll Modified: llvm/trunk/lib/Transforms/InstCombine/InstCombineCasts.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/InstCombine/InstCombineCasts.cpp?rev=374728&r1=374727&r2=374728&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/InstCombine/InstCombineCasts.cpp (original) +++ llvm/trunk/lib/Transforms/InstCombine/InstCombineCasts.cpp Sun Oct 13 10:19:08 2019 @@ -2344,8 +2344,16 @@ Instruction *InstCombiner::visitBitCast( // If the source pointer is dereferenceable, then assume it points to an // allocated object and apply "inbounds" to the GEP. 
bool CanBeNull; - if (Src->getPointerDereferenceableBytes(DL, CanBeNull)) - GEP->setIsInBounds(); + if (Src->getPointerDereferenceableBytes(DL, CanBeNull)) { + // In a non-default address space (not 0), a null pointer can not be + // assumed inbounds, so ignore that case (dereferenceable_or_null). + // The reason is that 'null' is not treated differently in these address + // spaces, and we consequently ignore the 'gep inbounds' special case + // for 'null' which allows 'inbounds' on 'null' if the indices are + // zeros. + if (SrcPTy->getAddressSpace() == 0 || !CanBeNull) + GEP->setIsInBounds(); + } return GEP; } } Modified: llvm/trunk/test/Transforms/InstCombine/load-bitcast-vec.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/InstCombine/load-bitcast-vec.ll?rev=374728&r1=374727&r2=374728&view=diff ============================================================================== --- llvm/trunk/test/Transforms/InstCombine/load-bitcast-vec.ll (original) +++ llvm/trunk/test/Transforms/InstCombine/load-bitcast-vec.ll Sun Oct 13 10:19:08 2019 @@ -100,11 +100,11 @@ define float @matching_scalar_smallest_d ret float %r } -; TODO: Is a null pointer inbounds in any address space? +; A null pointer can't be assumed inbounds in a non-default address space. 
define float @matching_scalar_smallest_deref_or_null_addrspace(<4 x float> addrspace(4)* dereferenceable_or_null(1) %p) { ; CHECK-LABEL: @matching_scalar_smallest_deref_or_null_addrspace( -; CHECK-NEXT: [[BC:%.*]] = getelementptr inbounds <4 x float>, <4 x float> addrspace(4)* [[P:%.*]], i64 0, i64 0 +; CHECK-NEXT: [[BC:%.*]] = getelementptr <4 x float>, <4 x float> addrspace(4)* [[P:%.*]], i64 0, i64 0 ; CHECK-NEXT: [[R:%.*]] = load float, float addrspace(4)* [[BC]], align 16 ; CHECK-NEXT: ret float [[R]] ; From llvm-commits at lists.llvm.org Sun Oct 13 10:19:52 2019 From: llvm-commits at lists.llvm.org (Sanjay Patel via Phabricator via llvm-commits) Date: Sun, 13 Oct 2019 17:19:52 +0000 (UTC) Subject: [PATCH] D68706: [InstCombine] don't assume 'inbounds' for bitcast deref or null pointer in non-default address space In-Reply-To: References: Message-ID: <104e7226bdf1cfcd7fa4b56a127780a7@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rGf90728c3227d: [InstCombine] don't assume 'inbounds' for bitcast deref or null pointer in non… (authored by spatel). Changed prior to commit: https://reviews.llvm.org/D68706?vs=224108&id=224786#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68706/new/ https://reviews.llvm.org/D68706 Files: llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp llvm/test/Transforms/InstCombine/load-bitcast-vec.ll Index: llvm/test/Transforms/InstCombine/load-bitcast-vec.ll =================================================================== --- llvm/test/Transforms/InstCombine/load-bitcast-vec.ll +++ llvm/test/Transforms/InstCombine/load-bitcast-vec.ll @@ -100,11 +100,11 @@ ret float %r } -; TODO: Is a null pointer inbounds in any address space? +; A null pointer can't be assumed inbounds in a non-default address space. 
define float @matching_scalar_smallest_deref_or_null_addrspace(<4 x float> addrspace(4)* dereferenceable_or_null(1) %p) { ; CHECK-LABEL: @matching_scalar_smallest_deref_or_null_addrspace( -; CHECK-NEXT: [[BC:%.*]] = getelementptr inbounds <4 x float>, <4 x float> addrspace(4)* [[P:%.*]], i64 0, i64 0 +; CHECK-NEXT: [[BC:%.*]] = getelementptr <4 x float>, <4 x float> addrspace(4)* [[P:%.*]], i64 0, i64 0 ; CHECK-NEXT: [[R:%.*]] = load float, float addrspace(4)* [[BC]], align 16 ; CHECK-NEXT: ret float [[R]] ; Index: llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp =================================================================== --- llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp +++ llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp @@ -2344,8 +2344,16 @@ // If the source pointer is dereferenceable, then assume it points to an // allocated object and apply "inbounds" to the GEP. bool CanBeNull; - if (Src->getPointerDereferenceableBytes(DL, CanBeNull)) - GEP->setIsInBounds(); + if (Src->getPointerDereferenceableBytes(DL, CanBeNull)) { + // In a non-default address space (not 0), a null pointer can not be + // assumed inbounds, so ignore that case (dereferenceable_or_null). + // The reason is that 'null' is not treated differently in these address + // spaces, and we consequently ignore the 'gep inbounds' special case + // for 'null' which allows 'inbounds' on 'null' if the indices are + // zeros. + if (SrcPTy->getAddressSpace() == 0 || !CanBeNull) + GEP->setIsInBounds(); + } return GEP; } } -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68706.224786.patch Type: text/x-patch Size: 2082 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Sun Oct 13 10:28:47 2019 From: llvm-commits at lists.llvm.org (Aditya Kumar via Phabricator via llvm-commits) Date: Sun, 13 Oct 2019 17:28:47 +0000 (UTC) Subject: [PATCH] D68924: CodeExtractor: NFC: Use Range based loop In-Reply-To: References: Message-ID: <1557d845b04fb43ac2437a4c0189e6e3@localhost.localdomain> hiraditya added a comment. Tried the following, I get test failures and crashes. diff --git a/llvm/lib/Transforms/Utils/CodeExtractor.cpp b/llvm/lib/Transforms/Utils/CodeExtractor.cpp index 3e1bea77f6c..c4661fde0fb 100644 --- a/llvm/lib/Transforms/Utils/CodeExtractor.cpp +++ b/llvm/lib/Transforms/Utils/CodeExtractor.cpp @@ -929,8 +929,7 @@ Function *CodeExtractor::constructFunction(const ValueSet &inputs, // Rewrite branches to basic blocks outside of the loop to new dummy blocks // within the new function. This must be done before we lose track of which // blocks were originally in the code region. - std::vector<User *> Users(header->user_begin(), header->user_end()); - for (auto &U : Users) + for (auto U : header->users()) // The BasicBlock which contains the branch is not in the region // modify the branch target to a new block if (Instruction *I = dyn_cast<Instruction>(U)) Failing Tests (6): LLVM :: Transforms/CodeExtractor/PartialInlineAndOr.ll LLVM :: Transforms/CodeExtractor/PartialInlineOr.ll LLVM :: Transforms/CodeExtractor/PartialInlineOrAnd.ll // Crashed: Assertion `(Flags & RF_IgnoreMissingLocals) && "Referenced value not in value map!"' failed. 
LLVM :: Transforms/HotColdSplit/eh-pads.ll LLVM :: Transforms/HotColdSplit/outline-multiple-entry-region.ll LLVM :: Transforms/HotColdSplit/unwind.ll Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68924/new/ https://reviews.llvm.org/D68924 From llvm-commits at lists.llvm.org Sun Oct 13 10:34:08 2019 From: llvm-commits at lists.llvm.org (Sanjay Patel via llvm-commits) Date: Sun, 13 Oct 2019 17:34:08 -0000 Subject: [llvm] r374729 - [ConstantFold] fix inconsistent handling of extractelement with undef index (PR42689) Message-ID: <20191013173408.865B286017@lists.llvm.org> Author: spatel Date: Sun Oct 13 10:34:08 2019 New Revision: 374729 URL: http://llvm.org/viewvc/llvm-project?rev=374729&view=rev Log: [ConstantFold] fix inconsistent handling of extractelement with undef index (PR42689) Any constant other than zero was already folded to undef if the index is undef. https://bugs.llvm.org/show_bug.cgi?id=42689 Modified: llvm/trunk/lib/IR/ConstantFold.cpp llvm/trunk/test/Transforms/ConstProp/InsertElement.ll Modified: llvm/trunk/lib/IR/ConstantFold.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/IR/ConstantFold.cpp?rev=374729&r1=374728&r2=374729&view=diff ============================================================================== --- llvm/trunk/lib/IR/ConstantFold.cpp (original) +++ llvm/trunk/lib/IR/ConstantFold.cpp Sun Oct 13 10:34:08 2019 @@ -787,12 +787,9 @@ Constant *llvm::ConstantFoldSelectInstru Constant *llvm::ConstantFoldExtractElementInstruction(Constant *Val, Constant *Idx) { - if (isa<UndefValue>(Val)) // ee(undef, x) -> undef - return UndefValue::get(Val->getType()->getVectorElementType()); - if (Val->isNullValue()) // ee(zero, x) -> zero - return Constant::getNullValue(Val->getType()->getVectorElementType()); - // ee({w,x,y,z}, undef) -> undef - if (isa<UndefValue>(Idx)) + // extractelt undef, C -> undef + // extractelt C, undef -> undef + if (isa<UndefValue>(Val) || isa<UndefValue>(Idx)) return 
UndefValue::get(Val->getType()->getVectorElementType()); if (ConstantInt *CIdx = dyn_cast<ConstantInt>(Idx)) { Modified: llvm/trunk/test/Transforms/ConstProp/InsertElement.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/ConstProp/InsertElement.ll?rev=374729&r1=374728&r2=374729&view=diff ============================================================================== --- llvm/trunk/test/Transforms/ConstProp/InsertElement.ll (original) +++ llvm/trunk/test/Transforms/ConstProp/InsertElement.ll Sun Oct 13 10:34:08 2019 @@ -38,7 +38,7 @@ define <4 x i64> @insertelement_undef() define i64 @extract_undef_index_from_zero_vec() { ; CHECK-LABEL: @extract_undef_index_from_zero_vec( -; CHECK-NEXT: ret i64 0 +; CHECK-NEXT: ret i64 undef ; %E = extractelement <2 x i64> zeroinitializer, i64 undef ret i64 %E From llvm-commits at lists.llvm.org Sun Oct 13 11:14:16 2019 From: llvm-commits at lists.llvm.org (Johannes Doerfert via Phabricator via llvm-commits) Date: Sun, 13 Oct 2019 18:14:16 +0000 (UTC) Subject: [PATCH] D68929: [Attributor][FIX] Use check line that is actually tested Message-ID: jdoerfert created this revision. jdoerfert added reviewers: sstefan1, uenoku. Herald added subscribers: bollu, hiraditya. Herald added a project: LLVM. This changes "CHECK" check lines to "ATTRIBUTOR" check lines where necessary and also fixes the now exposed, mostly minor, problems. Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D68929 Files: llvm/lib/Transforms/IPO/Attributor.cpp llvm/test/Transforms/FunctionAttrs/arg_returned.ll llvm/test/Transforms/FunctionAttrs/dereferenceable.ll llvm/test/Transforms/FunctionAttrs/nocapture.ll -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68929.224787.patch Type: text/x-patch Size: 6314 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Sun Oct 13 11:23:19 2019 From: llvm-commits at lists.llvm.org (Johannes Doerfert via Phabricator via llvm-commits) Date: Sun, 13 Oct 2019 18:23:19 +0000 (UTC) Subject: [PATCH] D68929: [Attributor][FIX] Use check line that is actually tested In-Reply-To: References: Message-ID: <8caff4026720fde2ec89b7db5358396d@localhost.localdomain> jdoerfert updated this revision to Diff 224790. jdoerfert added a comment. Handle pointers with address spaces and make sure uses are pointer operands Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68929/new/ https://reviews.llvm.org/D68929 Files: llvm/lib/Transforms/IPO/Attributor.cpp llvm/test/Transforms/FunctionAttrs/arg_returned.ll llvm/test/Transforms/FunctionAttrs/dereferenceable.ll llvm/test/Transforms/FunctionAttrs/nocapture.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D68929.224790.patch Type: text/x-patch Size: 7263 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Sun Oct 13 11:32:21 2019 From: llvm-commits at lists.llvm.org (Roman Lebedev via Phabricator via llvm-commits) Date: Sun, 13 Oct 2019 18:32:21 +0000 (UTC) Subject: [PATCH] D68930: [InstCombine] Shift amount reassociation in shifty sign bit test (PR43595) Message-ID: lebedev.ri created this revision. lebedev.ri added reviewers: spatel, efriedma. lebedev.ri added a project: LLVM. Herald added a subscriber: hiraditya. This problem consists of several parts: - Basic sign bit extraction - `trunc? (?shr %x, (bitwidth(x)-1))`. This is trivial, and easy to do, we have a fold for it. - Shift amount reassociation - if we have two identical shifts, and we can simplify-add their shift amounts together, then we likely can just perform them as a single shift. But this is finicky, has one-use restrictions, and shift opcodes must be identical. 
But there is a super-pattern where both of these work together to produce a sign bit test from two shifts + comparison. We do indeed already handle this in most cases. But since we get that fold transitively, it has one-use restrictions. And what's worse, in this case the right-shifts aren't required to be identical, and we can't handle that transitively: If the total shift amount is bitwidth-1, only a sign bit will remain in the output value. But if we look at this from the perspective of two shifts, we can't fold - we can't possibly know what bit pattern we'd produce via two shifts, it will be *some* kind of a mask produced from the original sign bit, but we just can't tell its shape: https://rise4fun.com/Alive/cM0 https://rise4fun.com/Alive/9IN But it will *only* contain the sign bit and zeros. So from the perspective of a sign bit test, we're good: https://rise4fun.com/Alive/FRz https://rise4fun.com/Alive/qBU Superb! So the simplest solution is to extend `reassociateShiftAmtsOfTwoSameDirectionShifts()` to also have a pseudo-analysis mode that will ignore extra uses, and will only check whether a) those are two right shifts and b) they end up with a bitwidth(x)-1 shift amount, and return either the original value that we are sign-checking, or null. This does not have any functionality change for the existing `reassociateShiftAmtsOfTwoSameDirectionShifts()`. https://bugs.llvm.org/show_bug.cgi?id=43595 Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D68930 Files: llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp llvm/lib/Transforms/InstCombine/InstCombineInternal.h llvm/lib/Transforms/InstCombine/InstCombineShifts.cpp llvm/test/Transforms/InstCombine/sign-bit-test-via-right-shifting-all-other-bits.ll -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68930.224788.patch Type: text/x-patch Size: 10058 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Sun Oct 13 11:32:21 2019 From: llvm-commits at lists.llvm.org (Florian Hahn via Phabricator via llvm-commits) Date: Sun, 13 Oct 2019 18:32:21 +0000 (UTC) Subject: [PATCH] D68924: CodeExtractor: NFC: Use Range based loop In-Reply-To: References: Message-ID: <08f5e5da9af127073af83fa4c8503a00@localhost.localdomain> fhahn added a comment. In D68924#1707545, @hiraditya wrote: > Tried the following, I get test failures and crashes. > > diff --git a/llvm/lib/Transforms/Utils/CodeExtractor.cpp b/llvm/lib/Transforms/Utils/CodeExtractor.cpp > index 3e1bea77f6c..c4661fde0fb 100644 > --- a/llvm/lib/Transforms/Utils/CodeExtractor.cpp > +++ b/llvm/lib/Transforms/Utils/CodeExtractor.cpp > @@ -929,8 +929,7 @@ Function *CodeExtractor::constructFunction(const ValueSet &inputs, > // Rewrite branches to basic blocks outside of the loop to new dummy blocks > // within the new function. This must be done before we lose track of which > // blocks were originally in the code region. > - std::vector<User *> Users(header->user_begin(), header->user_end()); > - for (auto &U : Users) > + for (auto U : header->users()) > // The BasicBlock which contains the branch is not in the region > // modify the branch target to a new block > if (Instruction *I = dyn_cast<Instruction>(U)) > > > Failing Tests (6): > > LLVM :: Transforms/CodeExtractor/PartialInlineAndOr.ll > LLVM :: Transforms/CodeExtractor/PartialInlineOr.ll > LLVM :: Transforms/CodeExtractor/PartialInlineOrAnd.ll // Crashed: Assertion `(Flags & RF_IgnoreMissingLocals) && "Referenced value not in value map!"' failed. > LLVM :: Transforms/HotColdSplit/eh-pads.ll > LLVM :: Transforms/HotColdSplit/outline-multiple-entry-region.ll > LLVM :: Transforms/HotColdSplit/unwind.ll I guess you are seeing some segfaults, right? 
The problem is that in the loop, you change header's users (I->replaceUsesOfWith()), which invalidates the range iterator. 
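A minimal standalone sketch of the invalidation problem described here, and of two ways around it. It does not use LLVM's classes: `UseList` and `retarget()` are hypothetical stand-ins for a value's use list and for `I->replaceUsesOfWith()`, which unlinks the user from the old value's list.

```cpp
#include <cassert>
#include <list>
#include <string>
#include <vector>

// Hypothetical stand-in for a value's use list: each entry names one user.
using UseList = std::list<std::string>;

// Hypothetical stand-in for I->replaceUsesOfWith(header, newBlock): the user
// stops referring to the old value, so it is unlinked from the old use list.
static void retarget(UseList &Uses, const std::string &User) {
  Uses.remove(User);
}

// Pattern of the existing code: snapshot the users first, then mutate.
// The loop walks the copy, so unlinking entries from Uses cannot break it.
static int rewriteViaSnapshot(UseList &Uses) {
  std::vector<std::string> Snapshot(Uses.begin(), Uses.end());
  int Rewritten = 0;
  for (const std::string &U : Snapshot) {
    retarget(Uses, U);
    ++Rewritten;
  }
  return Rewritten;
}

// Early-increment pattern: step past the current entry before it is unlinked.
static int rewriteViaEarlyInc(UseList &Uses) {
  int Rewritten = 0;
  for (auto It = Uses.begin(), End = Uses.end(); It != End;) {
    std::string Current = *It; // copy, because the node is destroyed below
    ++It;
    // Safe only while retarget() unlinks nothing but the current entry.
    retarget(Uses, Current);
    ++Rewritten;
  }
  return Rewritten;
}
```

The snapshot variant mirrors the `std::vector` copy in the existing code. The early-increment variant only holds up if each rewrite unlinks just the entry being visited, an assumption that would need checking for users that reference the header more than once.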
I am not entirely sure how the users() iterator range is implemented, but you might be able to use `make_early_inc_range(header->users())`.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68924/new/

https://reviews.llvm.org/D68924

From llvm-commits at lists.llvm.org Sun Oct 13 11:32:21 2019
From: llvm-commits at lists.llvm.org (Sourabh Singh Tomar via Phabricator via llvm-commits)
Date: Sun, 13 Oct 2019 18:32:21 +0000 (UTC)
Subject: [PATCH] D68117: [DWARF-5] Support for C++11 defaulted, deleted member functions.
In-Reply-To: 
References: 
Message-ID: <43ff10c1029241cbc9133185f0283c3a@localhost.localdomain>

SouraVX added a comment.

In D68117#1702595, @probinson wrote:

> We really do want to pack the four mutually exclusive cases into two bits. I have tried to give more explicit comments inline to explain how you would do this. It really should work fine, recognizing that the "not defaulted" case is not explicitly represented in the textual IR because it uses a zero value in the defaulted/deleted subfield of SPFlags.

Thanks Paul, for suggesting this. Your approach works fine. But while I was working on some llvm-dwarfdump test cases, we seem to have missed one corner case. Consider this test case:

  class foo {
    foo() = default;
    ~foo() = default;
    void not_special() {}
  };
  void not_a_member_of_foo() {}

Now I'm getting DW_AT_defaulted emitted with value DW_DEFAULTED_no for functions "not_special" and "not_a_member_of_foo". This behavior is undesirable, since the DW_AT_defaulted attribute is only valid for C++ special member functions (constructors, destructors, ...). Please correct me if I'm wrong.

This is attributable to the implicitly defined "0" NotDefaulted value, which is checked (that's fine as long as we have dedicated bits for distinguishing it) and is true for every subprogram or function in a CU.

  void DwarfUnit::applySubprogramAttributes( ... ...
  else if (SP->isNotDefaulted())
    addUInt(SPDie, dwarf::DW_AT_defaulted, dwarf::DW_FORM_data1, dwarf::DW_DEFAULTED_no);
  ...

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68117/new/

https://reviews.llvm.org/D68117

From llvm-commits at lists.llvm.org Sun Oct 13 11:59:38 2019
From: llvm-commits at lists.llvm.org (Florian Hahn via Phabricator via llvm-commits)
Date: Sun, 13 Oct 2019 18:59:38 +0000 (UTC)
Subject: [PATCH] D68577: [LV] Apply sink-after & interleave-groups as VPlan transformations (NFC)
In-Reply-To: 
References: 
Message-ID: <876c5af3d5512647524c34c75e757976@localhost.localdomain>

fhahn added inline comments.

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:7071
+  // ---------------------------------------------------------------------------
+  // Transform initial VPlan: Apply previously taken decisions, in order, to
----------------
gilr wrote:
> fhahn wrote:
> > Not sure how others feel, but I think it would be great if we could move this transform out of LoopVectorize.cpp, to group together VP2VP transforms. I think it would fit well into llvm/lib/Transforms/Vectorize/VPlanHCFGTransforms.h (although the name mentions HCFGTransforms, maybe it should be just VplanToVplanTransforms.h/cpp).
> >
> > I could not spot anything that would prevent moving it to a different file at first glance.
> This is currently still ingredient-based, i.e. not a pure VPlan2VPlan transformation, and as you mention the VPlan2VPlan part is basically just a moveAfter(). A VPlan-based sinkAfter() should not be based on ingredients (as stated in D46826) and instead might take a Recipe2Recipe map, but that seems a bit of an overkill for this patch. Modelling VPlan-based transformations is definitely worth a larger discussion.
Sounds good to me, let's not overcomplicate things. Yep, D46826 is still not really suitable here yet, which I only realised after accidentally submitting the comment here. Sorry for the confusion.
================ Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:7079 + VPRecipeBase *Sink = RecipeBuilder.getRecipe(Entry.first); + Sink->removeFromParent(); + Sink->insertAfter(RecipeBuilder.getRecipe(Entry.second)); ---------------- gilr wrote: > fhahn wrote: > > This could just be `Sink->moveAfter(RecipeBuilder.getRecipe(Entry.second)) `. I've added it in D46825 and now finally have a reason to commit it ;) > Right. Will use it instead. > Seems it doesn't update Parent, though. Will rewrite as a composition of the more basic removeFromParent(), insertAfter() to avoid duplicating that and the assertions. Makes sense, thanks! Originally I tried to mirror the implementation of Instruction::moveAfter, but it seems like it is only supposed to move instructions inside a basic block, although that's not really clear from the documentation. ================ Comment at: llvm/lib/Transforms/Vectorize/VPlan.h:985 + VPValue *getMask() { + // Mask is the last operand. ---------------- nit: it would be great to have a doc comment. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68577/new/ https://reviews.llvm.org/D68577 From llvm-commits at lists.llvm.org Sun Oct 13 12:07:28 2019 From: llvm-commits at lists.llvm.org (Craig Topper via llvm-commits) Date: Sun, 13 Oct 2019 19:07:28 -0000 Subject: [llvm] r374731 - [X86] Enable use of avx512 saturating truncate instructions in more cases. Message-ID: <20191013190728.E307885399@lists.llvm.org> Author: ctopper Date: Sun Oct 13 12:07:28 2019 New Revision: 374731 URL: http://llvm.org/viewvc/llvm-project?rev=374731&view=rev Log: [X86] Enable use of avx512 saturating truncate instructions in more cases. This enables use of the saturating truncate instructions when the result type is less than 128 bits. It also enables the use of saturating truncate instructions on KNL when the input is less than 512 bits. We can do this by widening the input and then extracting the result. 
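As a rough illustration of the widening described in the log above, the type arithmetic can be modeled in standalone C++. This is only a sketch under stated assumptions: `TruncPlan` and `planSaturatingTrunc` are hypothetical names, not LLVM APIs, and the model assumes a legal 128-, 256- or 512-bit input vector.

```cpp
#include <cassert>

// Hypothetical, simplified model of the type arithmetic only; not LLVM API.
struct TruncPlan {
  unsigned WideInputBits;   // width of the (possibly widened) truncate input
  unsigned TruncResultElts; // element count of the truncate node's result
};

// InElts/InEltBits describe the input vector, OutEltBits the narrowed
// element type. Assumes InElts * InEltBits is 128, 256 or 512.
TruncPlan planSaturatingTrunc(unsigned InElts, unsigned InEltBits,
                              unsigned OutEltBits, bool HasVLX) {
  unsigned InBits = InElts * InEltBits;
  unsigned ResElts = InElts;

  // Without VLX only the 512-bit form is usable, so the input is padded
  // with undefined lanes up to 512 bits.
  if (!HasVLX && InBits != 512) {
    unsigned NumConcats = 512 / InBits;
    ResElts *= NumConcats;
    InBits *= NumConcats;
  }

  // A result narrower than 128 bits is widened to a full 128-bit vector.
  if (ResElts * OutEltBits < 128)
    ResElts = 128 / OutEltBits;

  return {InBits, ResElts};
}
```

In this model, a v4i64 to v4i32 saturating truncate without VLX becomes a 512-bit input with 8 result elements, and the original 4 elements are then extracted from the low part of that result.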
Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp llvm/trunk/test/CodeGen/X86/avx512-trunc.ll llvm/trunk/test/CodeGen/X86/masked_store_trunc_ssat.ll llvm/trunk/test/CodeGen/X86/masked_store_trunc_usat.ll llvm/trunk/test/CodeGen/X86/vector-trunc-packus.ll llvm/trunk/test/CodeGen/X86/vector-trunc-ssat.ll llvm/trunk/test/CodeGen/X86/vector-trunc-usat.ll Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=374731&r1=374730&r2=374731&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp (original) +++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Sun Oct 13 12:07:28 2019 @@ -39713,26 +39713,6 @@ static SDValue foldVectorXorShiftIntoCmp return DAG.getNode(X86ISD::PCMPGT, SDLoc(N), VT, Shift.getOperand(0), Ones); } -/// Check if truncation with saturation form type \p SrcVT to \p DstVT -/// is valid for the given \p Subtarget. -static bool isSATValidOnAVX512Subtarget(EVT SrcVT, EVT DstVT, - const X86Subtarget &Subtarget) { - if (!Subtarget.hasAVX512()) - return false; - - // FIXME: Scalar type may be supported if we move it to vector register. - if (!SrcVT.isVector()) - return false; - - EVT SrcElVT = SrcVT.getScalarType(); - EVT DstElVT = DstVT.getScalarType(); - if (DstElVT != MVT::i8 && DstElVT != MVT::i16 && DstElVT != MVT::i32) - return false; - if (SrcVT.is512BitVector() || Subtarget.hasVLX()) - return SrcElVT.getSizeInBits() >= 32 || Subtarget.hasBWI(); - return false; -} - /// Detect patterns of truncation with unsigned saturation: /// /// 1. (truncate (umin (x, unsigned_max_of_dest_type)) to dest_type). 
@@ -39833,20 +39813,12 @@ static SDValue detectSSatPattern(SDValue static SDValue combineTruncateWithSat(SDValue In, EVT VT, const SDLoc &DL, SelectionDAG &DAG, const X86Subtarget &Subtarget) { - if (!Subtarget.hasSSE2()) + if (!Subtarget.hasSSE2() || !VT.isVector()) return SDValue(); - EVT SVT = VT.getScalarType(); + EVT SVT = VT.getVectorElementType(); EVT InVT = In.getValueType(); - EVT InSVT = InVT.getScalarType(); - const TargetLowering &TLI = DAG.getTargetLoweringInfo(); - if (TLI.isTypeLegal(InVT) && TLI.isTypeLegal(VT) && - isSATValidOnAVX512Subtarget(InVT, VT, Subtarget)) { - if (auto SSatVal = detectSSatPattern(In, VT)) - return DAG.getNode(X86ISD::VTRUNCS, DL, VT, SSatVal); - if (auto USatVal = detectUSatPattern(In, VT, DAG, DL)) - return DAG.getNode(X86ISD::VTRUNCUS, DL, VT, USatVal); - } + EVT InSVT = InVT.getVectorElementType(); // If we're clamping a signed 32-bit vector to 0-255 and the 32-bit vector is // split across two registers. We can use a packusdw+perm to clamp to 0-65535 @@ -39875,16 +39847,15 @@ static SDValue combineTruncateWithSat(SD (Subtarget.hasVLX() || InVT.getSizeInBits() > 256) && !(!Subtarget.useAVX512Regs() && VT.getSizeInBits() >= 256); - if (VT.isVector() && isPowerOf2_32(VT.getVectorNumElements()) && - !PreferAVX512 && + if (isPowerOf2_32(VT.getVectorNumElements()) && !PreferAVX512 && + VT.getSizeInBits() >= 64 && (SVT == MVT::i8 || SVT == MVT::i16) && (InSVT == MVT::i16 || InSVT == MVT::i32)) { if (auto USatVal = detectSSatPattern(In, VT, true)) { // vXi32 -> vXi8 must be performed as PACKUSWB(PACKSSDW,PACKSSDW). // Only do this when the result is at least 64 bits or we'll leaving // dangling PACKSSDW nodes. 
- if (SVT == MVT::i8 && InSVT == MVT::i32 && - VT.getVectorNumElements() >= 8) { + if (SVT == MVT::i8 && InSVT == MVT::i32) { EVT MidVT = EVT::getVectorVT(*DAG.getContext(), MVT::i16, VT.getVectorNumElements()); SDValue Mid = truncateVectorWithPACK(X86ISD::PACKSS, MidVT, USatVal, DL, @@ -39902,6 +39873,42 @@ static SDValue combineTruncateWithSat(SD return truncateVectorWithPACK(X86ISD::PACKSS, VT, SSatVal, DL, DAG, Subtarget); } + + const TargetLowering &TLI = DAG.getTargetLoweringInfo(); + if (TLI.isTypeLegal(InVT) && InVT.isVector() && SVT != MVT::i1 && + Subtarget.hasAVX512() && (InSVT != MVT::i16 || Subtarget.hasBWI())) { + unsigned TruncOpc; + SDValue SatVal; + if (auto SSatVal = detectSSatPattern(In, VT)) { + SatVal = SSatVal; + TruncOpc = X86ISD::VTRUNCS; + } else if (auto USatVal = detectUSatPattern(In, VT, DAG, DL)) { + SatVal = USatVal; + TruncOpc = X86ISD::VTRUNCUS; + } + if (SatVal) { + unsigned ResElts = VT.getVectorNumElements(); + // If the input type is less than 512 bits and we don't have VLX, we need + // to widen to 512 bits. + if (!Subtarget.hasVLX() && !InVT.is512BitVector()) { + unsigned NumConcats = 512 / InVT.getSizeInBits(); + ResElts *= NumConcats; + SmallVector ConcatOps(NumConcats, DAG.getUNDEF(InVT)); + ConcatOps[0] = SatVal; + InVT = EVT::getVectorVT(*DAG.getContext(), InSVT, + NumConcats * InVT.getVectorNumElements()); + SatVal = DAG.getNode(ISD::CONCAT_VECTORS, DL, InVT, ConcatOps); + } + // Widen the result if its narrower than 128 bits. 
+ if (ResElts * SVT.getSizeInBits() < 128) + ResElts = 128 / SVT.getSizeInBits(); + EVT TruncVT = EVT::getVectorVT(*DAG.getContext(), SVT, ResElts); + SDValue Res = DAG.getNode(TruncOpc, DL, TruncVT, SatVal); + return DAG.getNode(ISD::EXTRACT_SUBVECTOR, DL, VT, Res, + DAG.getIntPtrConstant(0, DL)); + } + } + return SDValue(); } Modified: llvm/trunk/test/CodeGen/X86/avx512-trunc.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/avx512-trunc.ll?rev=374731&r1=374730&r2=374731&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/avx512-trunc.ll (original) +++ llvm/trunk/test/CodeGen/X86/avx512-trunc.ll Sun Oct 13 12:07:28 2019 @@ -713,11 +713,16 @@ define <16 x i16> @usat_trunc_dw_512(<16 } define <8 x i8> @usat_trunc_wb_128(<8 x i16> %i) { -; ALL-LABEL: usat_trunc_wb_128: -; ALL: ## %bb.0: -; ALL-NEXT: vpminuw {{.*}}(%rip), %xmm0, %xmm0 -; ALL-NEXT: vpackuswb %xmm0, %xmm0, %xmm0 -; ALL-NEXT: retq +; KNL-LABEL: usat_trunc_wb_128: +; KNL: ## %bb.0: +; KNL-NEXT: vpminuw {{.*}}(%rip), %xmm0, %xmm0 +; KNL-NEXT: vpackuswb %xmm0, %xmm0, %xmm0 +; KNL-NEXT: retq +; +; SKX-LABEL: usat_trunc_wb_128: +; SKX: ## %bb.0: +; SKX-NEXT: vpmovuswb %xmm0, %xmm0 +; SKX-NEXT: retq %x3 = icmp ult <8 x i16> %i, %x5 = select <8 x i1> %x3, <8 x i16> %i, <8 x i16> %x6 = trunc <8 x i16> %x5 to <8 x i8> @@ -740,9 +745,8 @@ define <16 x i16> @usat_trunc_qw_1024(<1 define <16 x i8> @usat_trunc_db_256(<8 x i32> %x) { ; KNL-LABEL: usat_trunc_db_256: ; KNL: ## %bb.0: -; KNL-NEXT: vpbroadcastd {{.*#+}} ymm1 = [255,255,255,255,255,255,255,255] -; KNL-NEXT: vpminud %ymm1, %ymm0, %ymm0 -; KNL-NEXT: vpmovdb %zmm0, %xmm0 +; KNL-NEXT: ## kill: def $ymm0 killed $ymm0 def $zmm0 +; KNL-NEXT: vpmovusdb %zmm0, %xmm0 ; KNL-NEXT: vzeroupper ; KNL-NEXT: retq ; Modified: llvm/trunk/test/CodeGen/X86/masked_store_trunc_ssat.ll URL: 
http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/masked_store_trunc_ssat.ll?rev=374731&r1=374730&r2=374731&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/masked_store_trunc_ssat.ll (original) +++ llvm/trunk/test/CodeGen/X86/masked_store_trunc_ssat.ll Sun Oct 13 12:07:28 2019 @@ -1717,9 +1717,7 @@ define void @truncstore_v4i64_v4i32(<4 x ; AVX512F-NEXT: vptestmd %zmm1, %zmm1, %k0 ; AVX512F-NEXT: kshiftlw $12, %k0, %k0 ; AVX512F-NEXT: kshiftrw $12, %k0, %k1 -; AVX512F-NEXT: vpminsq {{.*}}(%rip){1to8}, %zmm0, %zmm0 -; AVX512F-NEXT: vpmaxsq {{.*}}(%rip){1to8}, %zmm0, %zmm0 -; AVX512F-NEXT: vpmovqd %zmm0, %ymm0 +; AVX512F-NEXT: vpmovsqd %zmm0, %ymm0 ; AVX512F-NEXT: vmovdqu32 %zmm0, (%rdi) {%k1} ; AVX512F-NEXT: vzeroupper ; AVX512F-NEXT: retq @@ -1740,9 +1738,7 @@ define void @truncstore_v4i64_v4i32(<4 x ; AVX512BW-NEXT: vptestmd %zmm1, %zmm1, %k0 ; AVX512BW-NEXT: kshiftlw $12, %k0, %k0 ; AVX512BW-NEXT: kshiftrw $12, %k0, %k1 -; AVX512BW-NEXT: vpminsq {{.*}}(%rip){1to8}, %zmm0, %zmm0 -; AVX512BW-NEXT: vpmaxsq {{.*}}(%rip){1to8}, %zmm0, %zmm0 -; AVX512BW-NEXT: vpmovqd %zmm0, %ymm0 +; AVX512BW-NEXT: vpmovsqd %zmm0, %ymm0 ; AVX512BW-NEXT: vmovdqu32 %zmm0, (%rdi) {%k1} ; AVX512BW-NEXT: vzeroupper ; AVX512BW-NEXT: retq @@ -2021,9 +2017,7 @@ define void @truncstore_v4i64_v4i16(<4 x ; AVX512F-NEXT: # kill: def $xmm1 killed $xmm1 def $zmm1 ; AVX512F-NEXT: # kill: def $ymm0 killed $ymm0 def $zmm0 ; AVX512F-NEXT: vptestmd %zmm1, %zmm1, %k0 -; AVX512F-NEXT: vpminsq {{.*}}(%rip){1to8}, %zmm0, %zmm0 -; AVX512F-NEXT: vpmaxsq {{.*}}(%rip){1to8}, %zmm0, %zmm0 -; AVX512F-NEXT: vpmovqw %zmm0, %xmm0 +; AVX512F-NEXT: vpmovsqw %zmm0, %xmm0 ; AVX512F-NEXT: kmovw %k0, %eax ; AVX512F-NEXT: testb $1, %al ; AVX512F-NEXT: jne .LBB4_1 @@ -2063,9 +2057,7 @@ define void @truncstore_v4i64_v4i16(<4 x ; AVX512BW-NEXT: vptestmd %zmm1, %zmm1, %k0 ; AVX512BW-NEXT: kshiftld $28, %k0, %k0 ; AVX512BW-NEXT: kshiftrd $28, 
%k0, %k1 -; AVX512BW-NEXT: vpminsq {{.*}}(%rip){1to8}, %zmm0, %zmm0 -; AVX512BW-NEXT: vpmaxsq {{.*}}(%rip){1to8}, %zmm0, %zmm0 -; AVX512BW-NEXT: vpmovqw %zmm0, %xmm0 +; AVX512BW-NEXT: vpmovsqw %zmm0, %xmm0 ; AVX512BW-NEXT: vmovdqu16 %zmm0, (%rdi) {%k1} ; AVX512BW-NEXT: vzeroupper ; AVX512BW-NEXT: retq @@ -2351,9 +2343,7 @@ define void @truncstore_v4i64_v4i8(<4 x ; AVX512F-NEXT: # kill: def $xmm1 killed $xmm1 def $zmm1 ; AVX512F-NEXT: # kill: def $ymm0 killed $ymm0 def $zmm0 ; AVX512F-NEXT: vptestmd %zmm1, %zmm1, %k0 -; AVX512F-NEXT: vpminsq {{.*}}(%rip){1to8}, %zmm0, %zmm0 -; AVX512F-NEXT: vpmaxsq {{.*}}(%rip){1to8}, %zmm0, %zmm0 -; AVX512F-NEXT: vpmovqb %zmm0, %xmm0 +; AVX512F-NEXT: vpmovsqb %zmm0, %xmm0 ; AVX512F-NEXT: kmovw %k0, %eax ; AVX512F-NEXT: testb $1, %al ; AVX512F-NEXT: jne .LBB5_1 @@ -2393,9 +2383,7 @@ define void @truncstore_v4i64_v4i8(<4 x ; AVX512BW-NEXT: vptestmd %zmm1, %zmm1, %k0 ; AVX512BW-NEXT: kshiftlq $60, %k0, %k0 ; AVX512BW-NEXT: kshiftrq $60, %k0, %k1 -; AVX512BW-NEXT: vpminsq {{.*}}(%rip){1to8}, %zmm0, %zmm0 -; AVX512BW-NEXT: vpmaxsq {{.*}}(%rip){1to8}, %zmm0, %zmm0 -; AVX512BW-NEXT: vpmovqb %zmm0, %xmm0 +; AVX512BW-NEXT: vpmovsqb %zmm0, %xmm0 ; AVX512BW-NEXT: vmovdqu8 %zmm0, (%rdi) {%k1} ; AVX512BW-NEXT: vzeroupper ; AVX512BW-NEXT: retq @@ -2544,11 +2532,7 @@ define void @truncstore_v2i64_v2i32(<2 x ; AVX512F-NEXT: vptestmq %zmm1, %zmm1, %k0 ; AVX512F-NEXT: kshiftlw $14, %k0, %k0 ; AVX512F-NEXT: kshiftrw $14, %k0, %k1 -; AVX512F-NEXT: vmovdqa {{.*#+}} xmm1 = [2147483647,2147483647] -; AVX512F-NEXT: vpminsq %zmm1, %zmm0, %zmm0 -; AVX512F-NEXT: vmovdqa {{.*#+}} xmm1 = [18446744071562067968,18446744071562067968] -; AVX512F-NEXT: vpmaxsq %zmm1, %zmm0, %zmm0 -; AVX512F-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3] +; AVX512F-NEXT: vpmovsqd %zmm0, %ymm0 ; AVX512F-NEXT: vmovdqu32 %zmm0, (%rdi) {%k1} ; AVX512F-NEXT: vzeroupper ; AVX512F-NEXT: retq @@ -2568,11 +2552,7 @@ define void @truncstore_v2i64_v2i32(<2 x ; AVX512BW-NEXT: vptestmq %zmm1, 
%zmm1, %k0 ; AVX512BW-NEXT: kshiftlw $14, %k0, %k0 ; AVX512BW-NEXT: kshiftrw $14, %k0, %k1 -; AVX512BW-NEXT: vmovdqa {{.*#+}} xmm1 = [2147483647,2147483647] -; AVX512BW-NEXT: vpminsq %zmm1, %zmm0, %zmm0 -; AVX512BW-NEXT: vmovdqa {{.*#+}} xmm1 = [18446744071562067968,18446744071562067968] -; AVX512BW-NEXT: vpmaxsq %zmm1, %zmm0, %zmm0 -; AVX512BW-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3] +; AVX512BW-NEXT: vpmovsqd %zmm0, %ymm0 ; AVX512BW-NEXT: vmovdqu32 %zmm0, (%rdi) {%k1} ; AVX512BW-NEXT: vzeroupper ; AVX512BW-NEXT: retq @@ -2708,12 +2688,7 @@ define void @truncstore_v2i64_v2i16(<2 x ; AVX512F-NEXT: # kill: def $xmm1 killed $xmm1 def $zmm1 ; AVX512F-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 ; AVX512F-NEXT: vptestmq %zmm1, %zmm1, %k0 -; AVX512F-NEXT: vmovdqa {{.*#+}} xmm1 = [32767,32767] -; AVX512F-NEXT: vpminsq %zmm1, %zmm0, %zmm0 -; AVX512F-NEXT: vmovdqa {{.*#+}} xmm1 = [18446744073709518848,18446744073709518848] -; AVX512F-NEXT: vpmaxsq %zmm1, %zmm0, %zmm0 -; AVX512F-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3] -; AVX512F-NEXT: vpshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7] +; AVX512F-NEXT: vpmovsqw %zmm0, %xmm0 ; AVX512F-NEXT: kmovw %k0, %eax ; AVX512F-NEXT: testb $1, %al ; AVX512F-NEXT: jne .LBB7_1 @@ -2739,12 +2714,7 @@ define void @truncstore_v2i64_v2i16(<2 x ; AVX512BW-NEXT: vptestmq %zmm1, %zmm1, %k0 ; AVX512BW-NEXT: kshiftld $30, %k0, %k0 ; AVX512BW-NEXT: kshiftrd $30, %k0, %k1 -; AVX512BW-NEXT: vmovdqa {{.*#+}} xmm1 = [32767,32767] -; AVX512BW-NEXT: vpminsq %zmm1, %zmm0, %zmm0 -; AVX512BW-NEXT: vmovdqa {{.*#+}} xmm1 = [18446744073709518848,18446744073709518848] -; AVX512BW-NEXT: vpmaxsq %zmm1, %zmm0, %zmm0 -; AVX512BW-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3] -; AVX512BW-NEXT: vpshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7] +; AVX512BW-NEXT: vpmovsqw %zmm0, %xmm0 ; AVX512BW-NEXT: vmovdqu16 %zmm0, (%rdi) {%k1} ; AVX512BW-NEXT: vzeroupper ; AVX512BW-NEXT: retq @@ -2887,11 +2857,7 @@ define void @truncstore_v2i64_v2i8(<2 x ; AVX512F-NEXT: # kill: def 
$xmm1 killed $xmm1 def $zmm1 ; AVX512F-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 ; AVX512F-NEXT: vptestmq %zmm1, %zmm1, %k0 -; AVX512F-NEXT: vmovdqa {{.*#+}} xmm1 = [127,127] -; AVX512F-NEXT: vpminsq %zmm1, %zmm0, %zmm0 -; AVX512F-NEXT: vmovdqa {{.*#+}} xmm1 = [18446744073709551488,18446744073709551488] -; AVX512F-NEXT: vpmaxsq %zmm1, %zmm0, %zmm0 -; AVX512F-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX512F-NEXT: vpmovsqb %zmm0, %xmm0 ; AVX512F-NEXT: kmovw %k0, %eax ; AVX512F-NEXT: testb $1, %al ; AVX512F-NEXT: jne .LBB8_1 @@ -2917,11 +2883,7 @@ define void @truncstore_v2i64_v2i8(<2 x ; AVX512BW-NEXT: vptestmq %zmm1, %zmm1, %k0 ; AVX512BW-NEXT: kshiftlq $62, %k0, %k0 ; AVX512BW-NEXT: kshiftrq $62, %k0, %k1 -; AVX512BW-NEXT: vmovdqa {{.*#+}} xmm1 = [127,127] -; AVX512BW-NEXT: vpminsq %zmm1, %zmm0, %zmm0 -; AVX512BW-NEXT: vmovdqa {{.*#+}} xmm1 = [18446744073709551488,18446744073709551488] -; AVX512BW-NEXT: vpmaxsq %zmm1, %zmm0, %zmm0 -; AVX512BW-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX512BW-NEXT: vpmovsqb %zmm0, %xmm0 ; AVX512BW-NEXT: vmovdqu8 %zmm0, (%rdi) {%k1} ; AVX512BW-NEXT: vzeroupper ; AVX512BW-NEXT: retq @@ -5417,12 +5379,9 @@ define void @truncstore_v4i32_v4i8(<4 x ; AVX512F-LABEL: truncstore_v4i32_v4i8: ; AVX512F: # %bb.0: ; AVX512F-NEXT: # kill: def $xmm1 killed $xmm1 def $zmm1 +; AVX512F-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 ; AVX512F-NEXT: vptestmd %zmm1, %zmm1, %k0 -; AVX512F-NEXT: vpbroadcastd {{.*#+}} xmm1 = [127,127,127,127] -; AVX512F-NEXT: vpminsd %xmm1, %xmm0, %xmm0 -; AVX512F-NEXT: vpbroadcastd {{.*#+}} xmm1 = [4294967168,4294967168,4294967168,4294967168] -; AVX512F-NEXT: vpmaxsd %xmm1, %xmm0, %xmm0 -; AVX512F-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,4,8,12,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX512F-NEXT: vpmovsdb %zmm0, %xmm0 ; AVX512F-NEXT: kmovw %k0, %eax ; AVX512F-NEXT: testb $1, %al ; AVX512F-NEXT: jne .LBB14_1 @@ -5458,14 +5417,11 @@ define void @truncstore_v4i32_v4i8(<4 x ; 
AVX512BW-LABEL: truncstore_v4i32_v4i8: ; AVX512BW: # %bb.0: ; AVX512BW-NEXT: # kill: def $xmm1 killed $xmm1 def $zmm1 +; AVX512BW-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 ; AVX512BW-NEXT: vptestmd %zmm1, %zmm1, %k0 ; AVX512BW-NEXT: kshiftlq $60, %k0, %k0 ; AVX512BW-NEXT: kshiftrq $60, %k0, %k1 -; AVX512BW-NEXT: vpbroadcastd {{.*#+}} xmm1 = [127,127,127,127] -; AVX512BW-NEXT: vpminsd %xmm1, %xmm0, %xmm0 -; AVX512BW-NEXT: vpbroadcastd {{.*#+}} xmm1 = [4294967168,4294967168,4294967168,4294967168] -; AVX512BW-NEXT: vpmaxsd %xmm1, %xmm0, %xmm0 -; AVX512BW-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,4,8,12,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX512BW-NEXT: vpmovsdb %zmm0, %xmm0 ; AVX512BW-NEXT: vmovdqu8 %zmm0, (%rdi) {%k1} ; AVX512BW-NEXT: vzeroupper ; AVX512BW-NEXT: retq Modified: llvm/trunk/test/CodeGen/X86/masked_store_trunc_usat.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/masked_store_trunc_usat.ll?rev=374731&r1=374730&r2=374731&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/masked_store_trunc_usat.ll (original) +++ llvm/trunk/test/CodeGen/X86/masked_store_trunc_usat.ll Sun Oct 13 12:07:28 2019 @@ -1464,8 +1464,7 @@ define void @truncstore_v4i64_v4i32(<4 x ; AVX512F-NEXT: vptestmd %zmm1, %zmm1, %k0 ; AVX512F-NEXT: kshiftlw $12, %k0, %k0 ; AVX512F-NEXT: kshiftrw $12, %k0, %k1 -; AVX512F-NEXT: vpminuq {{.*}}(%rip){1to8}, %zmm0, %zmm0 -; AVX512F-NEXT: vpmovqd %zmm0, %ymm0 +; AVX512F-NEXT: vpmovusqd %zmm0, %ymm0 ; AVX512F-NEXT: vmovdqu32 %zmm0, (%rdi) {%k1} ; AVX512F-NEXT: vzeroupper ; AVX512F-NEXT: retq @@ -1485,8 +1484,7 @@ define void @truncstore_v4i64_v4i32(<4 x ; AVX512BW-NEXT: vptestmd %zmm1, %zmm1, %k0 ; AVX512BW-NEXT: kshiftlw $12, %k0, %k0 ; AVX512BW-NEXT: kshiftrw $12, %k0, %k1 -; AVX512BW-NEXT: vpminuq {{.*}}(%rip){1to8}, %zmm0, %zmm0 -; AVX512BW-NEXT: vpmovqd %zmm0, %ymm0 +; AVX512BW-NEXT: vpmovusqd %zmm0, %ymm0 ; AVX512BW-NEXT: vmovdqu32 %zmm0, (%rdi) {%k1} ; 
AVX512BW-NEXT: vzeroupper ; AVX512BW-NEXT: retq @@ -1731,8 +1729,7 @@ define void @truncstore_v4i64_v4i16(<4 x ; AVX512F-NEXT: # kill: def $xmm1 killed $xmm1 def $zmm1 ; AVX512F-NEXT: # kill: def $ymm0 killed $ymm0 def $zmm0 ; AVX512F-NEXT: vptestmd %zmm1, %zmm1, %k0 -; AVX512F-NEXT: vpminuq {{.*}}(%rip){1to8}, %zmm0, %zmm0 -; AVX512F-NEXT: vpmovqw %zmm0, %xmm0 +; AVX512F-NEXT: vpmovusqw %zmm0, %xmm0 ; AVX512F-NEXT: kmovw %k0, %eax ; AVX512F-NEXT: testb $1, %al ; AVX512F-NEXT: jne .LBB4_1 @@ -1772,8 +1769,7 @@ define void @truncstore_v4i64_v4i16(<4 x ; AVX512BW-NEXT: vptestmd %zmm1, %zmm1, %k0 ; AVX512BW-NEXT: kshiftld $28, %k0, %k0 ; AVX512BW-NEXT: kshiftrd $28, %k0, %k1 -; AVX512BW-NEXT: vpminuq {{.*}}(%rip){1to8}, %zmm0, %zmm0 -; AVX512BW-NEXT: vpmovqw %zmm0, %xmm0 +; AVX512BW-NEXT: vpmovusqw %zmm0, %xmm0 ; AVX512BW-NEXT: vmovdqu16 %zmm0, (%rdi) {%k1} ; AVX512BW-NEXT: vzeroupper ; AVX512BW-NEXT: retq @@ -2023,8 +2019,7 @@ define void @truncstore_v4i64_v4i8(<4 x ; AVX512F-NEXT: # kill: def $xmm1 killed $xmm1 def $zmm1 ; AVX512F-NEXT: # kill: def $ymm0 killed $ymm0 def $zmm0 ; AVX512F-NEXT: vptestmd %zmm1, %zmm1, %k0 -; AVX512F-NEXT: vpminuq {{.*}}(%rip){1to8}, %zmm0, %zmm0 -; AVX512F-NEXT: vpmovqb %zmm0, %xmm0 +; AVX512F-NEXT: vpmovusqb %zmm0, %xmm0 ; AVX512F-NEXT: kmovw %k0, %eax ; AVX512F-NEXT: testb $1, %al ; AVX512F-NEXT: jne .LBB5_1 @@ -2064,8 +2059,7 @@ define void @truncstore_v4i64_v4i8(<4 x ; AVX512BW-NEXT: vptestmd %zmm1, %zmm1, %k0 ; AVX512BW-NEXT: kshiftlq $60, %k0, %k0 ; AVX512BW-NEXT: kshiftrq $60, %k0, %k1 -; AVX512BW-NEXT: vpminuq {{.*}}(%rip){1to8}, %zmm0, %zmm0 -; AVX512BW-NEXT: vpmovqb %zmm0, %xmm0 +; AVX512BW-NEXT: vpmovusqb %zmm0, %xmm0 ; AVX512BW-NEXT: vmovdqu8 %zmm0, (%rdi) {%k1} ; AVX512BW-NEXT: vzeroupper ; AVX512BW-NEXT: retq @@ -2193,9 +2187,7 @@ define void @truncstore_v2i64_v2i32(<2 x ; AVX512F-NEXT: vptestmq %zmm1, %zmm1, %k0 ; AVX512F-NEXT: kshiftlw $14, %k0, %k0 ; AVX512F-NEXT: kshiftrw $14, %k0, %k1 -; AVX512F-NEXT: vmovdqa 
{{.*#+}} xmm1 = [4294967295,4294967295] -; AVX512F-NEXT: vpminuq %zmm1, %zmm0, %zmm0 -; AVX512F-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3] +; AVX512F-NEXT: vpmovusqd %zmm0, %ymm0 ; AVX512F-NEXT: vmovdqu32 %zmm0, (%rdi) {%k1} ; AVX512F-NEXT: vzeroupper ; AVX512F-NEXT: retq @@ -2214,9 +2206,7 @@ define void @truncstore_v2i64_v2i32(<2 x ; AVX512BW-NEXT: vptestmq %zmm1, %zmm1, %k0 ; AVX512BW-NEXT: kshiftlw $14, %k0, %k0 ; AVX512BW-NEXT: kshiftrw $14, %k0, %k1 -; AVX512BW-NEXT: vmovdqa {{.*#+}} xmm1 = [4294967295,4294967295] -; AVX512BW-NEXT: vpminuq %zmm1, %zmm0, %zmm0 -; AVX512BW-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3] +; AVX512BW-NEXT: vpmovusqd %zmm0, %ymm0 ; AVX512BW-NEXT: vmovdqu32 %zmm0, (%rdi) {%k1} ; AVX512BW-NEXT: vzeroupper ; AVX512BW-NEXT: retq @@ -2333,10 +2323,7 @@ define void @truncstore_v2i64_v2i16(<2 x ; AVX512F-NEXT: # kill: def $xmm1 killed $xmm1 def $zmm1 ; AVX512F-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 ; AVX512F-NEXT: vptestmq %zmm1, %zmm1, %k0 -; AVX512F-NEXT: vmovdqa {{.*#+}} xmm1 = [65535,65535] -; AVX512F-NEXT: vpminuq %zmm1, %zmm0, %zmm0 -; AVX512F-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3] -; AVX512F-NEXT: vpshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7] +; AVX512F-NEXT: vpmovusqw %zmm0, %xmm0 ; AVX512F-NEXT: kmovw %k0, %eax ; AVX512F-NEXT: testb $1, %al ; AVX512F-NEXT: jne .LBB7_1 @@ -2362,10 +2349,7 @@ define void @truncstore_v2i64_v2i16(<2 x ; AVX512BW-NEXT: vptestmq %zmm1, %zmm1, %k0 ; AVX512BW-NEXT: kshiftld $30, %k0, %k0 ; AVX512BW-NEXT: kshiftrd $30, %k0, %k1 -; AVX512BW-NEXT: vmovdqa {{.*#+}} xmm1 = [65535,65535] -; AVX512BW-NEXT: vpminuq %zmm1, %zmm0, %zmm0 -; AVX512BW-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3] -; AVX512BW-NEXT: vpshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7] +; AVX512BW-NEXT: vpmovusqw %zmm0, %xmm0 ; AVX512BW-NEXT: vmovdqu16 %zmm0, (%rdi) {%k1} ; AVX512BW-NEXT: vzeroupper ; AVX512BW-NEXT: retq @@ -2488,9 +2472,7 @@ define void @truncstore_v2i64_v2i8(<2 x ; AVX512F-NEXT: # kill: def $xmm1 killed $xmm1 def 
$zmm1 ; AVX512F-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 ; AVX512F-NEXT: vptestmq %zmm1, %zmm1, %k0 -; AVX512F-NEXT: vmovdqa {{.*#+}} xmm1 = [255,255] -; AVX512F-NEXT: vpminuq %zmm1, %zmm0, %zmm0 -; AVX512F-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX512F-NEXT: vpmovusqb %zmm0, %xmm0 ; AVX512F-NEXT: kmovw %k0, %eax ; AVX512F-NEXT: testb $1, %al ; AVX512F-NEXT: jne .LBB8_1 @@ -2516,9 +2498,7 @@ define void @truncstore_v2i64_v2i8(<2 x ; AVX512BW-NEXT: vptestmq %zmm1, %zmm1, %k0 ; AVX512BW-NEXT: kshiftlq $62, %k0, %k0 ; AVX512BW-NEXT: kshiftrq $62, %k0, %k1 -; AVX512BW-NEXT: vmovdqa {{.*#+}} xmm1 = [255,255] -; AVX512BW-NEXT: vpminuq %zmm1, %zmm0, %zmm0 -; AVX512BW-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX512BW-NEXT: vpmovusqb %zmm0, %xmm0 ; AVX512BW-NEXT: vmovdqu8 %zmm0, (%rdi) {%k1} ; AVX512BW-NEXT: vzeroupper ; AVX512BW-NEXT: retq @@ -4279,10 +4259,9 @@ define void @truncstore_v8i32_v8i16(<8 x ; AVX512F-LABEL: truncstore_v8i32_v8i16: ; AVX512F: # %bb.0: ; AVX512F-NEXT: # kill: def $ymm1 killed $ymm1 def $zmm1 +; AVX512F-NEXT: # kill: def $ymm0 killed $ymm0 def $zmm0 ; AVX512F-NEXT: vptestmd %zmm1, %zmm1, %k0 -; AVX512F-NEXT: vpbroadcastd {{.*#+}} ymm1 = [65535,65535,65535,65535,65535,65535,65535,65535] -; AVX512F-NEXT: vpminud %ymm1, %ymm0, %ymm0 -; AVX512F-NEXT: vpmovdw %zmm0, %ymm0 +; AVX512F-NEXT: vpmovusdw %zmm0, %ymm0 ; AVX512F-NEXT: kmovw %k0, %eax ; AVX512F-NEXT: testb $1, %al ; AVX512F-NEXT: jne .LBB11_1 @@ -4346,12 +4325,11 @@ define void @truncstore_v8i32_v8i16(<8 x ; AVX512BW-LABEL: truncstore_v8i32_v8i16: ; AVX512BW: # %bb.0: ; AVX512BW-NEXT: # kill: def $ymm1 killed $ymm1 def $zmm1 +; AVX512BW-NEXT: # kill: def $ymm0 killed $ymm0 def $zmm0 ; AVX512BW-NEXT: vptestmd %zmm1, %zmm1, %k0 ; AVX512BW-NEXT: kshiftld $24, %k0, %k0 ; AVX512BW-NEXT: kshiftrd $24, %k0, %k1 -; AVX512BW-NEXT: vpbroadcastd {{.*#+}} ymm1 = [65535,65535,65535,65535,65535,65535,65535,65535] -; AVX512BW-NEXT: vpminud 
%ymm1, %ymm0, %ymm0 -; AVX512BW-NEXT: vpmovdw %zmm0, %ymm0 +; AVX512BW-NEXT: vpmovusdw %zmm0, %ymm0 ; AVX512BW-NEXT: vmovdqu16 %zmm0, (%rdi) {%k1} ; AVX512BW-NEXT: vzeroupper ; AVX512BW-NEXT: retq @@ -4684,10 +4662,9 @@ define void @truncstore_v8i32_v8i8(<8 x ; AVX512F-LABEL: truncstore_v8i32_v8i8: ; AVX512F: # %bb.0: ; AVX512F-NEXT: # kill: def $ymm1 killed $ymm1 def $zmm1 +; AVX512F-NEXT: # kill: def $ymm0 killed $ymm0 def $zmm0 ; AVX512F-NEXT: vptestmd %zmm1, %zmm1, %k0 -; AVX512F-NEXT: vpbroadcastd {{.*#+}} ymm1 = [255,255,255,255,255,255,255,255] -; AVX512F-NEXT: vpminud %ymm1, %ymm0, %ymm0 -; AVX512F-NEXT: vpmovdb %zmm0, %xmm0 +; AVX512F-NEXT: vpmovusdb %zmm0, %xmm0 ; AVX512F-NEXT: kmovw %k0, %eax ; AVX512F-NEXT: testb $1, %al ; AVX512F-NEXT: jne .LBB12_1 @@ -4751,12 +4728,11 @@ define void @truncstore_v8i32_v8i8(<8 x ; AVX512BW-LABEL: truncstore_v8i32_v8i8: ; AVX512BW: # %bb.0: ; AVX512BW-NEXT: # kill: def $ymm1 killed $ymm1 def $zmm1 +; AVX512BW-NEXT: # kill: def $ymm0 killed $ymm0 def $zmm0 ; AVX512BW-NEXT: vptestmd %zmm1, %zmm1, %k0 ; AVX512BW-NEXT: kshiftlq $56, %k0, %k0 ; AVX512BW-NEXT: kshiftrq $56, %k0, %k1 -; AVX512BW-NEXT: vpbroadcastd {{.*#+}} ymm1 = [255,255,255,255,255,255,255,255] -; AVX512BW-NEXT: vpminud %ymm1, %ymm0, %ymm0 -; AVX512BW-NEXT: vpmovdb %zmm0, %xmm0 +; AVX512BW-NEXT: vpmovusdb %zmm0, %xmm0 ; AVX512BW-NEXT: vmovdqu8 %zmm0, (%rdi) {%k1} ; AVX512BW-NEXT: vzeroupper ; AVX512BW-NEXT: retq @@ -4941,10 +4917,9 @@ define void @truncstore_v4i32_v4i16(<4 x ; AVX512F-LABEL: truncstore_v4i32_v4i16: ; AVX512F: # %bb.0: ; AVX512F-NEXT: # kill: def $xmm1 killed $xmm1 def $zmm1 +; AVX512F-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 ; AVX512F-NEXT: vptestmd %zmm1, %zmm1, %k0 -; AVX512F-NEXT: vpbroadcastd {{.*#+}} xmm1 = [65535,65535,65535,65535] -; AVX512F-NEXT: vpminud %xmm1, %xmm0, %xmm0 -; AVX512F-NEXT: vpackusdw %xmm0, %xmm0, %xmm0 +; AVX512F-NEXT: vpmovusdw %zmm0, %ymm0 ; AVX512F-NEXT: kmovw %k0, %eax ; AVX512F-NEXT: testb $1, %al ; 
AVX512F-NEXT: jne .LBB13_1 @@ -4980,12 +4955,11 @@ define void @truncstore_v4i32_v4i16(<4 x ; AVX512BW-LABEL: truncstore_v4i32_v4i16: ; AVX512BW: # %bb.0: ; AVX512BW-NEXT: # kill: def $xmm1 killed $xmm1 def $zmm1 +; AVX512BW-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 ; AVX512BW-NEXT: vptestmd %zmm1, %zmm1, %k0 ; AVX512BW-NEXT: kshiftld $28, %k0, %k0 ; AVX512BW-NEXT: kshiftrd $28, %k0, %k1 -; AVX512BW-NEXT: vpbroadcastd {{.*#+}} xmm1 = [65535,65535,65535,65535] -; AVX512BW-NEXT: vpminud %xmm1, %xmm0, %xmm0 -; AVX512BW-NEXT: vpackusdw %xmm0, %xmm0, %xmm0 +; AVX512BW-NEXT: vpmovusdw %zmm0, %ymm0 ; AVX512BW-NEXT: vmovdqu16 %zmm0, (%rdi) {%k1} ; AVX512BW-NEXT: vzeroupper ; AVX512BW-NEXT: retq @@ -5169,10 +5143,9 @@ define void @truncstore_v4i32_v4i8(<4 x ; AVX512F-LABEL: truncstore_v4i32_v4i8: ; AVX512F: # %bb.0: ; AVX512F-NEXT: # kill: def $xmm1 killed $xmm1 def $zmm1 +; AVX512F-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 ; AVX512F-NEXT: vptestmd %zmm1, %zmm1, %k0 -; AVX512F-NEXT: vpbroadcastd {{.*#+}} xmm1 = [255,255,255,255] -; AVX512F-NEXT: vpminud %xmm1, %xmm0, %xmm0 -; AVX512F-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,4,8,12,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX512F-NEXT: vpmovusdb %zmm0, %xmm0 ; AVX512F-NEXT: kmovw %k0, %eax ; AVX512F-NEXT: testb $1, %al ; AVX512F-NEXT: jne .LBB14_1 @@ -5208,12 +5181,11 @@ define void @truncstore_v4i32_v4i8(<4 x ; AVX512BW-LABEL: truncstore_v4i32_v4i8: ; AVX512BW: # %bb.0: ; AVX512BW-NEXT: # kill: def $xmm1 killed $xmm1 def $zmm1 +; AVX512BW-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 ; AVX512BW-NEXT: vptestmd %zmm1, %zmm1, %k0 ; AVX512BW-NEXT: kshiftlq $60, %k0, %k0 ; AVX512BW-NEXT: kshiftrq $60, %k0, %k1 -; AVX512BW-NEXT: vpbroadcastd {{.*#+}} xmm1 = [255,255,255,255] -; AVX512BW-NEXT: vpminud %xmm1, %xmm0, %xmm0 -; AVX512BW-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,4,8,12,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX512BW-NEXT: vpmovusdb %zmm0, %xmm0 ; AVX512BW-NEXT: vmovdqu8 %zmm0, (%rdi) {%k1} ; AVX512BW-NEXT: vzeroupper ; AVX512BW-NEXT: retq 
@@ -7070,10 +7042,10 @@ define void @truncstore_v16i16_v16i8(<16 ; AVX512BW-LABEL: truncstore_v16i16_v16i8: ; AVX512BW: # %bb.0: ; AVX512BW-NEXT: # kill: def $xmm1 killed $xmm1 def $zmm1 +; AVX512BW-NEXT: # kill: def $ymm0 killed $ymm0 def $zmm0 ; AVX512BW-NEXT: vptestmb %zmm1, %zmm1, %k0 ; AVX512BW-NEXT: kmovw %k0, %k1 -; AVX512BW-NEXT: vpminuw {{.*}}(%rip), %ymm0, %ymm0 -; AVX512BW-NEXT: vpmovwb %zmm0, %ymm0 +; AVX512BW-NEXT: vpmovuswb %zmm0, %ymm0 ; AVX512BW-NEXT: vmovdqu8 %zmm0, (%rdi) {%k1} ; AVX512BW-NEXT: vzeroupper ; AVX512BW-NEXT: retq @@ -7370,11 +7342,11 @@ define void @truncstore_v8i16_v8i8(<8 x ; AVX512BW-LABEL: truncstore_v8i16_v8i8: ; AVX512BW: # %bb.0: ; AVX512BW-NEXT: # kill: def $xmm1 killed $xmm1 def $zmm1 +; AVX512BW-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 ; AVX512BW-NEXT: vptestmw %zmm1, %zmm1, %k0 ; AVX512BW-NEXT: kshiftlq $56, %k0, %k0 ; AVX512BW-NEXT: kshiftrq $56, %k0, %k1 -; AVX512BW-NEXT: vpminuw {{.*}}(%rip), %xmm0, %xmm0 -; AVX512BW-NEXT: vpackuswb %xmm0, %xmm0, %xmm0 +; AVX512BW-NEXT: vpmovuswb %zmm0, %ymm0 ; AVX512BW-NEXT: vmovdqu8 %zmm0, (%rdi) {%k1} ; AVX512BW-NEXT: vzeroupper ; AVX512BW-NEXT: retq Modified: llvm/trunk/test/CodeGen/X86/vector-trunc-packus.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/vector-trunc-packus.ll?rev=374731&r1=374730&r2=374731&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/vector-trunc-packus.ll (original) +++ llvm/trunk/test/CodeGen/X86/vector-trunc-packus.ll Sun Oct 13 12:07:28 2019 @@ -119,47 +119,42 @@ define <2 x i32> @trunc_packus_v2i64_v2i ; AVX512F-LABEL: trunc_packus_v2i64_v2i32: ; AVX512F: # %bb.0: ; AVX512F-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 -; AVX512F-NEXT: vmovdqa {{.*#+}} xmm1 = [4294967295,4294967295] -; AVX512F-NEXT: vpminsq %zmm1, %zmm0, %zmm0 ; AVX512F-NEXT: vpxor %xmm1, %xmm1, %xmm1 ; AVX512F-NEXT: vpmaxsq %zmm1, %zmm0, %zmm0 -; AVX512F-NEXT: vpshufd {{.*#+}} xmm0 = 
xmm0[0,2,2,3] +; AVX512F-NEXT: vpmovusqd %zmm0, %ymm0 +; AVX512F-NEXT: # kill: def $xmm0 killed $xmm0 killed $ymm0 ; AVX512F-NEXT: vzeroupper ; AVX512F-NEXT: retq ; ; AVX512VL-LABEL: trunc_packus_v2i64_v2i32: ; AVX512VL: # %bb.0: -; AVX512VL-NEXT: vpminsq {{.*}}(%rip), %xmm0, %xmm0 ; AVX512VL-NEXT: vpxor %xmm1, %xmm1, %xmm1 ; AVX512VL-NEXT: vpmaxsq %xmm1, %xmm0, %xmm0 -; AVX512VL-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3] +; AVX512VL-NEXT: vpmovusqd %xmm0, %xmm0 ; AVX512VL-NEXT: retq ; ; AVX512BW-LABEL: trunc_packus_v2i64_v2i32: ; AVX512BW: # %bb.0: ; AVX512BW-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 -; AVX512BW-NEXT: vmovdqa {{.*#+}} xmm1 = [4294967295,4294967295] -; AVX512BW-NEXT: vpminsq %zmm1, %zmm0, %zmm0 ; AVX512BW-NEXT: vpxor %xmm1, %xmm1, %xmm1 ; AVX512BW-NEXT: vpmaxsq %zmm1, %zmm0, %zmm0 -; AVX512BW-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3] +; AVX512BW-NEXT: vpmovusqd %zmm0, %ymm0 +; AVX512BW-NEXT: # kill: def $xmm0 killed $xmm0 killed $ymm0 ; AVX512BW-NEXT: vzeroupper ; AVX512BW-NEXT: retq ; ; AVX512BWVL-LABEL: trunc_packus_v2i64_v2i32: ; AVX512BWVL: # %bb.0: -; AVX512BWVL-NEXT: vpminsq {{.*}}(%rip), %xmm0, %xmm0 ; AVX512BWVL-NEXT: vpxor %xmm1, %xmm1, %xmm1 ; AVX512BWVL-NEXT: vpmaxsq %xmm1, %xmm0, %xmm0 -; AVX512BWVL-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3] +; AVX512BWVL-NEXT: vpmovusqd %xmm0, %xmm0 ; AVX512BWVL-NEXT: retq ; ; SKX-LABEL: trunc_packus_v2i64_v2i32: ; SKX: # %bb.0: -; SKX-NEXT: vpminsq {{.*}}(%rip), %xmm0, %xmm0 ; SKX-NEXT: vpxor %xmm1, %xmm1, %xmm1 ; SKX-NEXT: vpmaxsq %xmm1, %xmm0, %xmm0 -; SKX-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3] +; SKX-NEXT: vpmovusqd %xmm0, %xmm0 ; SKX-NEXT: retq %1 = icmp slt <2 x i64> %a0, %2 = select <2 x i1> %1, <2 x i64> %a0, <2 x i64> @@ -277,11 +272,9 @@ define void @trunc_packus_v2i64_v2i32_st ; AVX512F-LABEL: trunc_packus_v2i64_v2i32_store: ; AVX512F: # %bb.0: ; AVX512F-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 -; AVX512F-NEXT: vmovdqa {{.*#+}} xmm1 = [4294967295,4294967295] -; 
AVX512F-NEXT: vpminsq %zmm1, %zmm0, %zmm0 ; AVX512F-NEXT: vpxor %xmm1, %xmm1, %xmm1 ; AVX512F-NEXT: vpmaxsq %zmm1, %zmm0, %zmm0 -; AVX512F-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3] +; AVX512F-NEXT: vpmovusqd %zmm0, %ymm0 ; AVX512F-NEXT: vmovq %xmm0, (%rdi) ; AVX512F-NEXT: vzeroupper ; AVX512F-NEXT: retq @@ -296,11 +289,9 @@ define void @trunc_packus_v2i64_v2i32_st ; AVX512BW-LABEL: trunc_packus_v2i64_v2i32_store: ; AVX512BW: # %bb.0: ; AVX512BW-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 -; AVX512BW-NEXT: vmovdqa {{.*#+}} xmm1 = [4294967295,4294967295] -; AVX512BW-NEXT: vpminsq %zmm1, %zmm0, %zmm0 ; AVX512BW-NEXT: vpxor %xmm1, %xmm1, %xmm1 ; AVX512BW-NEXT: vpmaxsq %zmm1, %zmm0, %zmm0 -; AVX512BW-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3] +; AVX512BW-NEXT: vpmovusqd %zmm0, %ymm0 ; AVX512BW-NEXT: vmovq %xmm0, (%rdi) ; AVX512BW-NEXT: vzeroupper ; AVX512BW-NEXT: retq @@ -536,10 +527,9 @@ define <4 x i32> @trunc_packus_v4i64_v4i ; AVX512F-LABEL: trunc_packus_v4i64_v4i32: ; AVX512F: # %bb.0: ; AVX512F-NEXT: # kill: def $ymm0 killed $ymm0 def $zmm0 -; AVX512F-NEXT: vpminsq {{.*}}(%rip){1to8}, %zmm0, %zmm0 ; AVX512F-NEXT: vpxor %xmm1, %xmm1, %xmm1 ; AVX512F-NEXT: vpmaxsq %zmm1, %zmm0, %zmm0 -; AVX512F-NEXT: vpmovqd %zmm0, %ymm0 +; AVX512F-NEXT: vpmovusqd %zmm0, %ymm0 ; AVX512F-NEXT: # kill: def $xmm0 killed $xmm0 killed $ymm0 ; AVX512F-NEXT: vzeroupper ; AVX512F-NEXT: retq @@ -555,10 +545,9 @@ define <4 x i32> @trunc_packus_v4i64_v4i ; AVX512BW-LABEL: trunc_packus_v4i64_v4i32: ; AVX512BW: # %bb.0: ; AVX512BW-NEXT: # kill: def $ymm0 killed $ymm0 def $zmm0 -; AVX512BW-NEXT: vpminsq {{.*}}(%rip){1to8}, %zmm0, %zmm0 ; AVX512BW-NEXT: vpxor %xmm1, %xmm1, %xmm1 ; AVX512BW-NEXT: vpmaxsq %zmm1, %zmm0, %zmm0 -; AVX512BW-NEXT: vpmovqd %zmm0, %ymm0 +; AVX512BW-NEXT: vpmovusqd %zmm0, %ymm0 ; AVX512BW-NEXT: # kill: def $xmm0 killed $xmm0 killed $ymm0 ; AVX512BW-NEXT: vzeroupper ; AVX512BW-NEXT: retq @@ -1132,48 +1121,40 @@ define <2 x i16> @trunc_packus_v2i64_v2i ; AVX512F-LABEL: 
trunc_packus_v2i64_v2i16: ; AVX512F: # %bb.0: ; AVX512F-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 -; AVX512F-NEXT: vmovdqa {{.*#+}} xmm1 = [65535,65535] -; AVX512F-NEXT: vpminsq %zmm1, %zmm0, %zmm0 ; AVX512F-NEXT: vpxor %xmm1, %xmm1, %xmm1 ; AVX512F-NEXT: vpmaxsq %zmm1, %zmm0, %zmm0 -; AVX512F-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3] -; AVX512F-NEXT: vpshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7] +; AVX512F-NEXT: vpmovusqw %zmm0, %xmm0 ; AVX512F-NEXT: vzeroupper ; AVX512F-NEXT: retq ; ; AVX512VL-LABEL: trunc_packus_v2i64_v2i16: ; AVX512VL: # %bb.0: -; AVX512VL-NEXT: vpminsq {{.*}}(%rip), %xmm0, %xmm0 ; AVX512VL-NEXT: vpxor %xmm1, %xmm1, %xmm1 ; AVX512VL-NEXT: vpmaxsq %xmm1, %xmm0, %xmm0 -; AVX512VL-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,1,8,9,8,9,10,11,8,9,10,11,12,13,14,15] +; AVX512VL-NEXT: vpmovusqw %xmm0, %xmm0 ; AVX512VL-NEXT: retq ; ; AVX512BW-LABEL: trunc_packus_v2i64_v2i16: ; AVX512BW: # %bb.0: ; AVX512BW-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 -; AVX512BW-NEXT: vmovdqa {{.*#+}} xmm1 = [65535,65535] -; AVX512BW-NEXT: vpminsq %zmm1, %zmm0, %zmm0 ; AVX512BW-NEXT: vpxor %xmm1, %xmm1, %xmm1 ; AVX512BW-NEXT: vpmaxsq %zmm1, %zmm0, %zmm0 -; AVX512BW-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,1,8,9,8,9,10,11,8,9,10,11,12,13,14,15] +; AVX512BW-NEXT: vpmovusqw %zmm0, %xmm0 ; AVX512BW-NEXT: vzeroupper ; AVX512BW-NEXT: retq ; ; AVX512BWVL-LABEL: trunc_packus_v2i64_v2i16: ; AVX512BWVL: # %bb.0: -; AVX512BWVL-NEXT: vpminsq {{.*}}(%rip), %xmm0, %xmm0 ; AVX512BWVL-NEXT: vpxor %xmm1, %xmm1, %xmm1 ; AVX512BWVL-NEXT: vpmaxsq %xmm1, %xmm0, %xmm0 -; AVX512BWVL-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,1,8,9,8,9,10,11,8,9,10,11,12,13,14,15] +; AVX512BWVL-NEXT: vpmovusqw %xmm0, %xmm0 ; AVX512BWVL-NEXT: retq ; ; SKX-LABEL: trunc_packus_v2i64_v2i16: ; SKX: # %bb.0: -; SKX-NEXT: vpminsq {{.*}}(%rip), %xmm0, %xmm0 ; SKX-NEXT: vpxor %xmm1, %xmm1, %xmm1 ; SKX-NEXT: vpmaxsq %xmm1, %xmm0, %xmm0 -; SKX-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,1,8,9,8,9,10,11,8,9,10,11,12,13,14,15] +; 
SKX-NEXT: vpmovusqw %xmm0, %xmm0 ; SKX-NEXT: retq %1 = icmp slt <2 x i64> %a0, %2 = select <2 x i1> %1, <2 x i64> %a0, <2 x i64> @@ -1320,12 +1301,9 @@ define void @trunc_packus_v2i64_v2i16_st ; AVX512F-LABEL: trunc_packus_v2i64_v2i16_store: ; AVX512F: # %bb.0: ; AVX512F-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 -; AVX512F-NEXT: vmovdqa {{.*#+}} xmm1 = [65535,65535] -; AVX512F-NEXT: vpminsq %zmm1, %zmm0, %zmm0 ; AVX512F-NEXT: vpxor %xmm1, %xmm1, %xmm1 ; AVX512F-NEXT: vpmaxsq %zmm1, %zmm0, %zmm0 -; AVX512F-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3] -; AVX512F-NEXT: vpshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7] +; AVX512F-NEXT: vpmovusqw %zmm0, %xmm0 ; AVX512F-NEXT: vmovd %xmm0, (%rdi) ; AVX512F-NEXT: vzeroupper ; AVX512F-NEXT: retq @@ -1340,11 +1318,9 @@ define void @trunc_packus_v2i64_v2i16_st ; AVX512BW-LABEL: trunc_packus_v2i64_v2i16_store: ; AVX512BW: # %bb.0: ; AVX512BW-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 -; AVX512BW-NEXT: vmovdqa {{.*#+}} xmm1 = [65535,65535] -; AVX512BW-NEXT: vpminsq %zmm1, %zmm0, %zmm0 ; AVX512BW-NEXT: vpxor %xmm1, %xmm1, %xmm1 ; AVX512BW-NEXT: vpmaxsq %zmm1, %zmm0, %zmm0 -; AVX512BW-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,1,8,9,8,9,10,11,8,9,10,11,12,13,14,15] +; AVX512BW-NEXT: vpmovusqw %zmm0, %xmm0 ; AVX512BW-NEXT: vmovd %xmm0, (%rdi) ; AVX512BW-NEXT: vzeroupper ; AVX512BW-NEXT: retq @@ -1602,10 +1578,9 @@ define <4 x i16> @trunc_packus_v4i64_v4i ; AVX512F-LABEL: trunc_packus_v4i64_v4i16: ; AVX512F: # %bb.0: ; AVX512F-NEXT: # kill: def $ymm0 killed $ymm0 def $zmm0 -; AVX512F-NEXT: vpminsq {{.*}}(%rip){1to8}, %zmm0, %zmm0 ; AVX512F-NEXT: vpxor %xmm1, %xmm1, %xmm1 ; AVX512F-NEXT: vpmaxsq %zmm1, %zmm0, %zmm0 -; AVX512F-NEXT: vpmovqw %zmm0, %xmm0 +; AVX512F-NEXT: vpmovusqw %zmm0, %xmm0 ; AVX512F-NEXT: vzeroupper ; AVX512F-NEXT: retq ; @@ -1620,10 +1595,9 @@ define <4 x i16> @trunc_packus_v4i64_v4i ; AVX512BW-LABEL: trunc_packus_v4i64_v4i16: ; AVX512BW: # %bb.0: ; AVX512BW-NEXT: # kill: def $ymm0 killed $ymm0 def $zmm0 -; 
AVX512BW-NEXT: vpminsq {{.*}}(%rip){1to8}, %zmm0, %zmm0 ; AVX512BW-NEXT: vpxor %xmm1, %xmm1, %xmm1 ; AVX512BW-NEXT: vpmaxsq %zmm1, %zmm0, %zmm0 -; AVX512BW-NEXT: vpmovqw %zmm0, %xmm0 +; AVX512BW-NEXT: vpmovusqw %zmm0, %xmm0 ; AVX512BW-NEXT: vzeroupper ; AVX512BW-NEXT: retq ; @@ -1887,10 +1861,9 @@ define void @trunc_packus_v4i64_v4i16_st ; AVX512F-LABEL: trunc_packus_v4i64_v4i16_store: ; AVX512F: # %bb.0: ; AVX512F-NEXT: # kill: def $ymm0 killed $ymm0 def $zmm0 -; AVX512F-NEXT: vpminsq {{.*}}(%rip){1to8}, %zmm0, %zmm0 ; AVX512F-NEXT: vpxor %xmm1, %xmm1, %xmm1 ; AVX512F-NEXT: vpmaxsq %zmm1, %zmm0, %zmm0 -; AVX512F-NEXT: vpmovqw %zmm0, %xmm0 +; AVX512F-NEXT: vpmovusqw %zmm0, %xmm0 ; AVX512F-NEXT: vmovq %xmm0, (%rdi) ; AVX512F-NEXT: vzeroupper ; AVX512F-NEXT: retq @@ -1906,10 +1879,9 @@ define void @trunc_packus_v4i64_v4i16_st ; AVX512BW-LABEL: trunc_packus_v4i64_v4i16_store: ; AVX512BW: # %bb.0: ; AVX512BW-NEXT: # kill: def $ymm0 killed $ymm0 def $zmm0 -; AVX512BW-NEXT: vpminsq {{.*}}(%rip){1to8}, %zmm0, %zmm0 ; AVX512BW-NEXT: vpxor %xmm1, %xmm1, %xmm1 ; AVX512BW-NEXT: vpmaxsq %zmm1, %zmm0, %zmm0 -; AVX512BW-NEXT: vpmovqw %zmm0, %xmm0 +; AVX512BW-NEXT: vpmovusqw %zmm0, %xmm0 ; AVX512BW-NEXT: vmovq %xmm0, (%rdi) ; AVX512BW-NEXT: vzeroupper ; AVX512BW-NEXT: retq @@ -2878,47 +2850,40 @@ define <2 x i8> @trunc_packus_v2i64_v2i8 ; AVX512F-LABEL: trunc_packus_v2i64_v2i8: ; AVX512F: # %bb.0: ; AVX512F-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 -; AVX512F-NEXT: vmovdqa {{.*#+}} xmm1 = [255,255] -; AVX512F-NEXT: vpminsq %zmm1, %zmm0, %zmm0 ; AVX512F-NEXT: vpxor %xmm1, %xmm1, %xmm1 ; AVX512F-NEXT: vpmaxsq %zmm1, %zmm0, %zmm0 -; AVX512F-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX512F-NEXT: vpmovusqb %zmm0, %xmm0 ; AVX512F-NEXT: vzeroupper ; AVX512F-NEXT: retq ; ; AVX512VL-LABEL: trunc_packus_v2i64_v2i8: ; AVX512VL: # %bb.0: -; AVX512VL-NEXT: vpminsq {{.*}}(%rip), %xmm0, %xmm0 ; AVX512VL-NEXT: vpxor %xmm1, %xmm1, %xmm1 ; AVX512VL-NEXT: vpmaxsq 
%xmm1, %xmm0, %xmm0 -; AVX512VL-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX512VL-NEXT: vpmovusqb %xmm0, %xmm0 ; AVX512VL-NEXT: retq ; ; AVX512BW-LABEL: trunc_packus_v2i64_v2i8: ; AVX512BW: # %bb.0: ; AVX512BW-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 -; AVX512BW-NEXT: vmovdqa {{.*#+}} xmm1 = [255,255] -; AVX512BW-NEXT: vpminsq %zmm1, %zmm0, %zmm0 ; AVX512BW-NEXT: vpxor %xmm1, %xmm1, %xmm1 ; AVX512BW-NEXT: vpmaxsq %zmm1, %zmm0, %zmm0 -; AVX512BW-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX512BW-NEXT: vpmovusqb %zmm0, %xmm0 ; AVX512BW-NEXT: vzeroupper ; AVX512BW-NEXT: retq ; ; AVX512BWVL-LABEL: trunc_packus_v2i64_v2i8: ; AVX512BWVL: # %bb.0: -; AVX512BWVL-NEXT: vpminsq {{.*}}(%rip), %xmm0, %xmm0 ; AVX512BWVL-NEXT: vpxor %xmm1, %xmm1, %xmm1 ; AVX512BWVL-NEXT: vpmaxsq %xmm1, %xmm0, %xmm0 -; AVX512BWVL-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX512BWVL-NEXT: vpmovusqb %xmm0, %xmm0 ; AVX512BWVL-NEXT: retq ; ; SKX-LABEL: trunc_packus_v2i64_v2i8: ; SKX: # %bb.0: -; SKX-NEXT: vpminsq {{.*}}(%rip), %xmm0, %xmm0 ; SKX-NEXT: vpxor %xmm1, %xmm1, %xmm1 ; SKX-NEXT: vpmaxsq %xmm1, %xmm0, %xmm0 -; SKX-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u] +; SKX-NEXT: vpmovusqb %xmm0, %xmm0 ; SKX-NEXT: retq %1 = icmp slt <2 x i64> %a0, %2 = select <2 x i1> %1, <2 x i64> %a0, <2 x i64> @@ -3041,11 +3006,9 @@ define void @trunc_packus_v2i64_v2i8_sto ; AVX512F-LABEL: trunc_packus_v2i64_v2i8_store: ; AVX512F: # %bb.0: ; AVX512F-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 -; AVX512F-NEXT: vmovdqa {{.*#+}} xmm1 = [255,255] -; AVX512F-NEXT: vpminsq %zmm1, %zmm0, %zmm0 ; AVX512F-NEXT: vpxor %xmm1, %xmm1, %xmm1 ; AVX512F-NEXT: vpmaxsq %zmm1, %zmm0, %zmm0 -; AVX512F-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX512F-NEXT: vpmovusqb %zmm0, %xmm0 ; AVX512F-NEXT: vpextrw $0, %xmm0, (%rdi) ; AVX512F-NEXT: vzeroupper ; AVX512F-NEXT: retq @@ -3060,11 +3023,9 
@@ define void @trunc_packus_v2i64_v2i8_sto ; AVX512BW-LABEL: trunc_packus_v2i64_v2i8_store: ; AVX512BW: # %bb.0: ; AVX512BW-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 -; AVX512BW-NEXT: vmovdqa {{.*#+}} xmm1 = [255,255] -; AVX512BW-NEXT: vpminsq %zmm1, %zmm0, %zmm0 ; AVX512BW-NEXT: vpxor %xmm1, %xmm1, %xmm1 ; AVX512BW-NEXT: vpmaxsq %zmm1, %zmm0, %zmm0 -; AVX512BW-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX512BW-NEXT: vpmovusqb %zmm0, %xmm0 ; AVX512BW-NEXT: vpextrw $0, %xmm0, (%rdi) ; AVX512BW-NEXT: vzeroupper ; AVX512BW-NEXT: retq @@ -3303,10 +3264,9 @@ define <4 x i8> @trunc_packus_v4i64_v4i8 ; AVX512F-LABEL: trunc_packus_v4i64_v4i8: ; AVX512F: # %bb.0: ; AVX512F-NEXT: # kill: def $ymm0 killed $ymm0 def $zmm0 -; AVX512F-NEXT: vpminsq {{.*}}(%rip){1to8}, %zmm0, %zmm0 ; AVX512F-NEXT: vpxor %xmm1, %xmm1, %xmm1 ; AVX512F-NEXT: vpmaxsq %zmm1, %zmm0, %zmm0 -; AVX512F-NEXT: vpmovqb %zmm0, %xmm0 +; AVX512F-NEXT: vpmovusqb %zmm0, %xmm0 ; AVX512F-NEXT: vzeroupper ; AVX512F-NEXT: retq ; @@ -3321,10 +3281,9 @@ define <4 x i8> @trunc_packus_v4i64_v4i8 ; AVX512BW-LABEL: trunc_packus_v4i64_v4i8: ; AVX512BW: # %bb.0: ; AVX512BW-NEXT: # kill: def $ymm0 killed $ymm0 def $zmm0 -; AVX512BW-NEXT: vpminsq {{.*}}(%rip){1to8}, %zmm0, %zmm0 ; AVX512BW-NEXT: vpxor %xmm1, %xmm1, %xmm1 ; AVX512BW-NEXT: vpmaxsq %zmm1, %zmm0, %zmm0 -; AVX512BW-NEXT: vpmovqb %zmm0, %xmm0 +; AVX512BW-NEXT: vpmovusqb %zmm0, %xmm0 ; AVX512BW-NEXT: vzeroupper ; AVX512BW-NEXT: retq ; @@ -3567,10 +3526,9 @@ define void @trunc_packus_v4i64_v4i8_sto ; AVX512F-LABEL: trunc_packus_v4i64_v4i8_store: ; AVX512F: # %bb.0: ; AVX512F-NEXT: # kill: def $ymm0 killed $ymm0 def $zmm0 -; AVX512F-NEXT: vpminsq {{.*}}(%rip){1to8}, %zmm0, %zmm0 ; AVX512F-NEXT: vpxor %xmm1, %xmm1, %xmm1 ; AVX512F-NEXT: vpmaxsq %zmm1, %zmm0, %zmm0 -; AVX512F-NEXT: vpmovqb %zmm0, %xmm0 +; AVX512F-NEXT: vpmovusqb %zmm0, %xmm0 ; AVX512F-NEXT: vmovd %xmm0, (%rdi) ; AVX512F-NEXT: vzeroupper ; AVX512F-NEXT: retq @@ -3586,10 
+3544,9 @@ define void @trunc_packus_v4i64_v4i8_sto ; AVX512BW-LABEL: trunc_packus_v4i64_v4i8_store: ; AVX512BW: # %bb.0: ; AVX512BW-NEXT: # kill: def $ymm0 killed $ymm0 def $zmm0 -; AVX512BW-NEXT: vpminsq {{.*}}(%rip){1to8}, %zmm0, %zmm0 ; AVX512BW-NEXT: vpxor %xmm1, %xmm1, %xmm1 ; AVX512BW-NEXT: vpmaxsq %zmm1, %zmm0, %zmm0 -; AVX512BW-NEXT: vpmovqb %zmm0, %xmm0 +; AVX512BW-NEXT: vpmovusqb %zmm0, %xmm0 ; AVX512BW-NEXT: vmovd %xmm0, (%rdi) ; AVX512BW-NEXT: vzeroupper ; AVX512BW-NEXT: retq @@ -5279,44 +5236,39 @@ define <4 x i8> @trunc_packus_v4i32_v4i8 ; ; AVX512F-LABEL: trunc_packus_v4i32_v4i8: ; AVX512F: # %bb.0: -; AVX512F-NEXT: vpbroadcastd {{.*#+}} xmm1 = [255,255,255,255] -; AVX512F-NEXT: vpminsd %xmm1, %xmm0, %xmm0 ; AVX512F-NEXT: vpxor %xmm1, %xmm1, %xmm1 ; AVX512F-NEXT: vpmaxsd %xmm1, %xmm0, %xmm0 -; AVX512F-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,4,8,12,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX512F-NEXT: vpmovusdb %zmm0, %xmm0 +; AVX512F-NEXT: vzeroupper ; AVX512F-NEXT: retq ; ; AVX512VL-LABEL: trunc_packus_v4i32_v4i8: ; AVX512VL: # %bb.0: -; AVX512VL-NEXT: vpminsd {{.*}}(%rip){1to4}, %xmm0, %xmm0 ; AVX512VL-NEXT: vpxor %xmm1, %xmm1, %xmm1 ; AVX512VL-NEXT: vpmaxsd %xmm1, %xmm0, %xmm0 -; AVX512VL-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,4,8,12,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX512VL-NEXT: vpmovusdb %xmm0, %xmm0 ; AVX512VL-NEXT: retq ; ; AVX512BW-LABEL: trunc_packus_v4i32_v4i8: ; AVX512BW: # %bb.0: -; AVX512BW-NEXT: vpbroadcastd {{.*#+}} xmm1 = [255,255,255,255] -; AVX512BW-NEXT: vpminsd %xmm1, %xmm0, %xmm0 ; AVX512BW-NEXT: vpxor %xmm1, %xmm1, %xmm1 ; AVX512BW-NEXT: vpmaxsd %xmm1, %xmm0, %xmm0 -; AVX512BW-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,4,8,12,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX512BW-NEXT: vpmovusdb %zmm0, %xmm0 +; AVX512BW-NEXT: vzeroupper ; AVX512BW-NEXT: retq ; ; AVX512BWVL-LABEL: trunc_packus_v4i32_v4i8: ; AVX512BWVL: # %bb.0: -; AVX512BWVL-NEXT: vpminsd {{.*}}(%rip){1to4}, %xmm0, %xmm0 ; AVX512BWVL-NEXT: vpxor %xmm1, %xmm1, %xmm1 ; AVX512BWVL-NEXT: vpmaxsd %xmm1, %xmm0, 
%xmm0 -; AVX512BWVL-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,4,8,12,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX512BWVL-NEXT: vpmovusdb %xmm0, %xmm0 ; AVX512BWVL-NEXT: retq ; ; SKX-LABEL: trunc_packus_v4i32_v4i8: ; SKX: # %bb.0: -; SKX-NEXT: vpminsd {{.*}}(%rip){1to4}, %xmm0, %xmm0 ; SKX-NEXT: vpxor %xmm1, %xmm1, %xmm1 ; SKX-NEXT: vpmaxsd %xmm1, %xmm0, %xmm0 -; SKX-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,4,8,12,u,u,u,u,u,u,u,u,u,u,u,u] +; SKX-NEXT: vpmovusdb %xmm0, %xmm0 ; SKX-NEXT: retq %1 = icmp slt <4 x i32> %a0, %2 = select <4 x i1> %1, <4 x i32> %a0, <4 x i32> @@ -5391,12 +5343,11 @@ define void @trunc_packus_v4i32_v4i8_sto ; ; AVX512F-LABEL: trunc_packus_v4i32_v4i8_store: ; AVX512F: # %bb.0: -; AVX512F-NEXT: vpbroadcastd {{.*#+}} xmm1 = [255,255,255,255] -; AVX512F-NEXT: vpminsd %xmm1, %xmm0, %xmm0 ; AVX512F-NEXT: vpxor %xmm1, %xmm1, %xmm1 ; AVX512F-NEXT: vpmaxsd %xmm1, %xmm0, %xmm0 -; AVX512F-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,4,8,12,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX512F-NEXT: vpmovusdb %zmm0, %xmm0 ; AVX512F-NEXT: vmovd %xmm0, (%rdi) +; AVX512F-NEXT: vzeroupper ; AVX512F-NEXT: retq ; ; AVX512VL-LABEL: trunc_packus_v4i32_v4i8_store: @@ -5408,12 +5359,11 @@ define void @trunc_packus_v4i32_v4i8_sto ; ; AVX512BW-LABEL: trunc_packus_v4i32_v4i8_store: ; AVX512BW: # %bb.0: -; AVX512BW-NEXT: vpbroadcastd {{.*#+}} xmm1 = [255,255,255,255] -; AVX512BW-NEXT: vpminsd %xmm1, %xmm0, %xmm0 ; AVX512BW-NEXT: vpxor %xmm1, %xmm1, %xmm1 ; AVX512BW-NEXT: vpmaxsd %xmm1, %xmm0, %xmm0 -; AVX512BW-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,4,8,12,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX512BW-NEXT: vpmovusdb %zmm0, %xmm0 ; AVX512BW-NEXT: vmovd %xmm0, (%rdi) +; AVX512BW-NEXT: vzeroupper ; AVX512BW-NEXT: retq ; ; AVX512BWVL-LABEL: trunc_packus_v4i32_v4i8_store: Modified: llvm/trunk/test/CodeGen/X86/vector-trunc-ssat.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/vector-trunc-ssat.ll?rev=374731&r1=374730&r2=374731&view=diff 
============================================================================== --- llvm/trunk/test/CodeGen/X86/vector-trunc-ssat.ll (original) +++ llvm/trunk/test/CodeGen/X86/vector-trunc-ssat.ll Sun Oct 13 12:07:28 2019 @@ -123,44 +123,32 @@ define <2 x i32> @trunc_ssat_v2i64_v2i32 ; AVX512F-LABEL: trunc_ssat_v2i64_v2i32: ; AVX512F: # %bb.0: ; AVX512F-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 -; AVX512F-NEXT: vmovdqa {{.*#+}} xmm1 = [2147483647,2147483647] -; AVX512F-NEXT: vpminsq %zmm1, %zmm0, %zmm0 -; AVX512F-NEXT: vmovdqa {{.*#+}} xmm1 = [18446744071562067968,18446744071562067968] -; AVX512F-NEXT: vpmaxsq %zmm1, %zmm0, %zmm0 -; AVX512F-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3] +; AVX512F-NEXT: vpmovsqd %zmm0, %ymm0 +; AVX512F-NEXT: # kill: def $xmm0 killed $xmm0 killed $ymm0 ; AVX512F-NEXT: vzeroupper ; AVX512F-NEXT: retq ; ; AVX512VL-LABEL: trunc_ssat_v2i64_v2i32: ; AVX512VL: # %bb.0: -; AVX512VL-NEXT: vpminsq {{.*}}(%rip), %xmm0, %xmm0 -; AVX512VL-NEXT: vpmaxsq {{.*}}(%rip), %xmm0, %xmm0 -; AVX512VL-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3] +; AVX512VL-NEXT: vpmovsqd %xmm0, %xmm0 ; AVX512VL-NEXT: retq ; ; AVX512BW-LABEL: trunc_ssat_v2i64_v2i32: ; AVX512BW: # %bb.0: ; AVX512BW-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 -; AVX512BW-NEXT: vmovdqa {{.*#+}} xmm1 = [2147483647,2147483647] -; AVX512BW-NEXT: vpminsq %zmm1, %zmm0, %zmm0 -; AVX512BW-NEXT: vmovdqa {{.*#+}} xmm1 = [18446744071562067968,18446744071562067968] -; AVX512BW-NEXT: vpmaxsq %zmm1, %zmm0, %zmm0 -; AVX512BW-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3] +; AVX512BW-NEXT: vpmovsqd %zmm0, %ymm0 +; AVX512BW-NEXT: # kill: def $xmm0 killed $xmm0 killed $ymm0 ; AVX512BW-NEXT: vzeroupper ; AVX512BW-NEXT: retq ; ; AVX512BWVL-LABEL: trunc_ssat_v2i64_v2i32: ; AVX512BWVL: # %bb.0: -; AVX512BWVL-NEXT: vpminsq {{.*}}(%rip), %xmm0, %xmm0 -; AVX512BWVL-NEXT: vpmaxsq {{.*}}(%rip), %xmm0, %xmm0 -; AVX512BWVL-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3] +; AVX512BWVL-NEXT: vpmovsqd %xmm0, %xmm0 ; 
AVX512BWVL-NEXT: retq ; ; SKX-LABEL: trunc_ssat_v2i64_v2i32: ; SKX: # %bb.0: -; SKX-NEXT: vpminsq {{.*}}(%rip), %xmm0, %xmm0 -; SKX-NEXT: vpmaxsq {{.*}}(%rip), %xmm0, %xmm0 -; SKX-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3] +; SKX-NEXT: vpmovsqd %xmm0, %xmm0 ; SKX-NEXT: retq %1 = icmp slt <2 x i64> %a0, %2 = select <2 x i1> %1, <2 x i64> %a0, <2 x i64> @@ -282,11 +270,7 @@ define void @trunc_ssat_v2i64_v2i32_stor ; AVX512F-LABEL: trunc_ssat_v2i64_v2i32_store: ; AVX512F: # %bb.0: ; AVX512F-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 -; AVX512F-NEXT: vmovdqa {{.*#+}} xmm1 = [2147483647,2147483647] -; AVX512F-NEXT: vpminsq %zmm1, %zmm0, %zmm0 -; AVX512F-NEXT: vmovdqa {{.*#+}} xmm1 = [18446744071562067968,18446744071562067968] -; AVX512F-NEXT: vpmaxsq %zmm1, %zmm0, %zmm0 -; AVX512F-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3] +; AVX512F-NEXT: vpmovsqd %zmm0, %ymm0 ; AVX512F-NEXT: vmovq %xmm0, (%rdi) ; AVX512F-NEXT: vzeroupper ; AVX512F-NEXT: retq @@ -299,11 +283,7 @@ define void @trunc_ssat_v2i64_v2i32_stor ; AVX512BW-LABEL: trunc_ssat_v2i64_v2i32_store: ; AVX512BW: # %bb.0: ; AVX512BW-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 -; AVX512BW-NEXT: vmovdqa {{.*#+}} xmm1 = [2147483647,2147483647] -; AVX512BW-NEXT: vpminsq %zmm1, %zmm0, %zmm0 -; AVX512BW-NEXT: vmovdqa {{.*#+}} xmm1 = [18446744071562067968,18446744071562067968] -; AVX512BW-NEXT: vpmaxsq %zmm1, %zmm0, %zmm0 -; AVX512BW-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3] +; AVX512BW-NEXT: vpmovsqd %zmm0, %ymm0 ; AVX512BW-NEXT: vmovq %xmm0, (%rdi) ; AVX512BW-NEXT: vzeroupper ; AVX512BW-NEXT: retq @@ -545,9 +525,7 @@ define <4 x i32> @trunc_ssat_v4i64_v4i32 ; AVX512F-LABEL: trunc_ssat_v4i64_v4i32: ; AVX512F: # %bb.0: ; AVX512F-NEXT: # kill: def $ymm0 killed $ymm0 def $zmm0 -; AVX512F-NEXT: vpminsq {{.*}}(%rip){1to8}, %zmm0, %zmm0 -; AVX512F-NEXT: vpmaxsq {{.*}}(%rip){1to8}, %zmm0, %zmm0 -; AVX512F-NEXT: vpmovqd %zmm0, %ymm0 +; AVX512F-NEXT: vpmovsqd %zmm0, %ymm0 ; AVX512F-NEXT: # kill: def $xmm0 killed $xmm0 
killed $ymm0 ; AVX512F-NEXT: vzeroupper ; AVX512F-NEXT: retq @@ -561,9 +539,7 @@ define <4 x i32> @trunc_ssat_v4i64_v4i32 ; AVX512BW-LABEL: trunc_ssat_v4i64_v4i32: ; AVX512BW: # %bb.0: ; AVX512BW-NEXT: # kill: def $ymm0 killed $ymm0 def $zmm0 -; AVX512BW-NEXT: vpminsq {{.*}}(%rip){1to8}, %zmm0, %zmm0 -; AVX512BW-NEXT: vpmaxsq {{.*}}(%rip){1to8}, %zmm0, %zmm0 -; AVX512BW-NEXT: vpmovqd %zmm0, %ymm0 +; AVX512BW-NEXT: vpmovsqd %zmm0, %ymm0 ; AVX512BW-NEXT: # kill: def $xmm0 killed $xmm0 killed $ymm0 ; AVX512BW-NEXT: vzeroupper ; AVX512BW-NEXT: retq @@ -1153,45 +1129,30 @@ define <2 x i16> @trunc_ssat_v2i64_v2i16 ; AVX512F-LABEL: trunc_ssat_v2i64_v2i16: ; AVX512F: # %bb.0: ; AVX512F-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 -; AVX512F-NEXT: vmovdqa {{.*#+}} xmm1 = [32767,32767] -; AVX512F-NEXT: vpminsq %zmm1, %zmm0, %zmm0 -; AVX512F-NEXT: vmovdqa {{.*#+}} xmm1 = [18446744073709518848,18446744073709518848] -; AVX512F-NEXT: vpmaxsq %zmm1, %zmm0, %zmm0 -; AVX512F-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3] -; AVX512F-NEXT: vpshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7] +; AVX512F-NEXT: vpmovsqw %zmm0, %xmm0 ; AVX512F-NEXT: vzeroupper ; AVX512F-NEXT: retq ; ; AVX512VL-LABEL: trunc_ssat_v2i64_v2i16: ; AVX512VL: # %bb.0: -; AVX512VL-NEXT: vpminsq {{.*}}(%rip), %xmm0, %xmm0 -; AVX512VL-NEXT: vpmaxsq {{.*}}(%rip), %xmm0, %xmm0 -; AVX512VL-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,1,8,9,8,9,10,11,8,9,10,11,12,13,14,15] +; AVX512VL-NEXT: vpmovsqw %xmm0, %xmm0 ; AVX512VL-NEXT: retq ; ; AVX512BW-LABEL: trunc_ssat_v2i64_v2i16: ; AVX512BW: # %bb.0: ; AVX512BW-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 -; AVX512BW-NEXT: vmovdqa {{.*#+}} xmm1 = [32767,32767] -; AVX512BW-NEXT: vpminsq %zmm1, %zmm0, %zmm0 -; AVX512BW-NEXT: vmovdqa {{.*#+}} xmm1 = [18446744073709518848,18446744073709518848] -; AVX512BW-NEXT: vpmaxsq %zmm1, %zmm0, %zmm0 -; AVX512BW-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,1,8,9,8,9,10,11,8,9,10,11,12,13,14,15] +; AVX512BW-NEXT: vpmovsqw %zmm0, %xmm0 ; AVX512BW-NEXT: 
vzeroupper ; AVX512BW-NEXT: retq ; ; AVX512BWVL-LABEL: trunc_ssat_v2i64_v2i16: ; AVX512BWVL: # %bb.0: -; AVX512BWVL-NEXT: vpminsq {{.*}}(%rip), %xmm0, %xmm0 -; AVX512BWVL-NEXT: vpmaxsq {{.*}}(%rip), %xmm0, %xmm0 -; AVX512BWVL-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,1,8,9,8,9,10,11,8,9,10,11,12,13,14,15] +; AVX512BWVL-NEXT: vpmovsqw %xmm0, %xmm0 ; AVX512BWVL-NEXT: retq ; ; SKX-LABEL: trunc_ssat_v2i64_v2i16: ; SKX: # %bb.0: -; SKX-NEXT: vpminsq {{.*}}(%rip), %xmm0, %xmm0 -; SKX-NEXT: vpmaxsq {{.*}}(%rip), %xmm0, %xmm0 -; SKX-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,1,8,9,8,9,10,11,8,9,10,11,12,13,14,15] +; SKX-NEXT: vpmovsqw %xmm0, %xmm0 ; SKX-NEXT: retq %1 = icmp slt <2 x i64> %a0, %2 = select <2 x i1> %1, <2 x i64> %a0, <2 x i64> @@ -1342,12 +1303,7 @@ define void @trunc_ssat_v2i64_v2i16_stor ; AVX512F-LABEL: trunc_ssat_v2i64_v2i16_store: ; AVX512F: # %bb.0: ; AVX512F-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 -; AVX512F-NEXT: vmovdqa {{.*#+}} xmm1 = [32767,32767] -; AVX512F-NEXT: vpminsq %zmm1, %zmm0, %zmm0 -; AVX512F-NEXT: vmovdqa {{.*#+}} xmm1 = [18446744073709518848,18446744073709518848] -; AVX512F-NEXT: vpmaxsq %zmm1, %zmm0, %zmm0 -; AVX512F-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3] -; AVX512F-NEXT: vpshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7] +; AVX512F-NEXT: vpmovsqw %zmm0, %xmm0 ; AVX512F-NEXT: vmovd %xmm0, (%rdi) ; AVX512F-NEXT: vzeroupper ; AVX512F-NEXT: retq @@ -1360,11 +1316,7 @@ define void @trunc_ssat_v2i64_v2i16_stor ; AVX512BW-LABEL: trunc_ssat_v2i64_v2i16_store: ; AVX512BW: # %bb.0: ; AVX512BW-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 -; AVX512BW-NEXT: vmovdqa {{.*#+}} xmm1 = [32767,32767] -; AVX512BW-NEXT: vpminsq %zmm1, %zmm0, %zmm0 -; AVX512BW-NEXT: vmovdqa {{.*#+}} xmm1 = [18446744073709518848,18446744073709518848] -; AVX512BW-NEXT: vpmaxsq %zmm1, %zmm0, %zmm0 -; AVX512BW-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,1,8,9,8,9,10,11,8,9,10,11,12,13,14,15] +; AVX512BW-NEXT: vpmovsqw %zmm0, %xmm0 ; AVX512BW-NEXT: vmovd %xmm0, (%rdi) ; AVX512BW-NEXT: 
vzeroupper ; AVX512BW-NEXT: retq @@ -1628,9 +1580,7 @@ define <4 x i16> @trunc_ssat_v4i64_v4i16 ; AVX512F-LABEL: trunc_ssat_v4i64_v4i16: ; AVX512F: # %bb.0: ; AVX512F-NEXT: # kill: def $ymm0 killed $ymm0 def $zmm0 -; AVX512F-NEXT: vpminsq {{.*}}(%rip){1to8}, %zmm0, %zmm0 -; AVX512F-NEXT: vpmaxsq {{.*}}(%rip){1to8}, %zmm0, %zmm0 -; AVX512F-NEXT: vpmovqw %zmm0, %xmm0 +; AVX512F-NEXT: vpmovsqw %zmm0, %xmm0 ; AVX512F-NEXT: vzeroupper ; AVX512F-NEXT: retq ; @@ -1643,9 +1593,7 @@ define <4 x i16> @trunc_ssat_v4i64_v4i16 ; AVX512BW-LABEL: trunc_ssat_v4i64_v4i16: ; AVX512BW: # %bb.0: ; AVX512BW-NEXT: # kill: def $ymm0 killed $ymm0 def $zmm0 -; AVX512BW-NEXT: vpminsq {{.*}}(%rip){1to8}, %zmm0, %zmm0 -; AVX512BW-NEXT: vpmaxsq {{.*}}(%rip){1to8}, %zmm0, %zmm0 -; AVX512BW-NEXT: vpmovqw %zmm0, %xmm0 +; AVX512BW-NEXT: vpmovsqw %zmm0, %xmm0 ; AVX512BW-NEXT: vzeroupper ; AVX512BW-NEXT: retq ; @@ -1915,9 +1863,7 @@ define void @trunc_ssat_v4i64_v4i16_stor ; AVX512F-LABEL: trunc_ssat_v4i64_v4i16_store: ; AVX512F: # %bb.0: ; AVX512F-NEXT: # kill: def $ymm0 killed $ymm0 def $zmm0 -; AVX512F-NEXT: vpminsq {{.*}}(%rip){1to8}, %zmm0, %zmm0 -; AVX512F-NEXT: vpmaxsq {{.*}}(%rip){1to8}, %zmm0, %zmm0 -; AVX512F-NEXT: vpmovqw %zmm0, %xmm0 +; AVX512F-NEXT: vpmovsqw %zmm0, %xmm0 ; AVX512F-NEXT: vmovq %xmm0, (%rdi) ; AVX512F-NEXT: vzeroupper ; AVX512F-NEXT: retq @@ -1931,9 +1877,7 @@ define void @trunc_ssat_v4i64_v4i16_stor ; AVX512BW-LABEL: trunc_ssat_v4i64_v4i16_store: ; AVX512BW: # %bb.0: ; AVX512BW-NEXT: # kill: def $ymm0 killed $ymm0 def $zmm0 -; AVX512BW-NEXT: vpminsq {{.*}}(%rip){1to8}, %zmm0, %zmm0 -; AVX512BW-NEXT: vpmaxsq {{.*}}(%rip){1to8}, %zmm0, %zmm0 -; AVX512BW-NEXT: vpmovqw %zmm0, %xmm0 +; AVX512BW-NEXT: vpmovsqw %zmm0, %xmm0 ; AVX512BW-NEXT: vmovq %xmm0, (%rdi) ; AVX512BW-NEXT: vzeroupper ; AVX512BW-NEXT: retq @@ -2666,44 +2610,30 @@ define <2 x i8> @trunc_ssat_v2i64_v2i8(< ; AVX512F-LABEL: trunc_ssat_v2i64_v2i8: ; AVX512F: # %bb.0: ; AVX512F-NEXT: # kill: def $xmm0 killed $xmm0 
def $zmm0 -; AVX512F-NEXT: vmovdqa {{.*#+}} xmm1 = [127,127] -; AVX512F-NEXT: vpminsq %zmm1, %zmm0, %zmm0 -; AVX512F-NEXT: vmovdqa {{.*#+}} xmm1 = [18446744073709551488,18446744073709551488] -; AVX512F-NEXT: vpmaxsq %zmm1, %zmm0, %zmm0 -; AVX512F-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX512F-NEXT: vpmovsqb %zmm0, %xmm0 ; AVX512F-NEXT: vzeroupper ; AVX512F-NEXT: retq ; ; AVX512VL-LABEL: trunc_ssat_v2i64_v2i8: ; AVX512VL: # %bb.0: -; AVX512VL-NEXT: vpminsq {{.*}}(%rip), %xmm0, %xmm0 -; AVX512VL-NEXT: vpmaxsq {{.*}}(%rip), %xmm0, %xmm0 -; AVX512VL-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX512VL-NEXT: vpmovsqb %xmm0, %xmm0 ; AVX512VL-NEXT: retq ; ; AVX512BW-LABEL: trunc_ssat_v2i64_v2i8: ; AVX512BW: # %bb.0: ; AVX512BW-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 -; AVX512BW-NEXT: vmovdqa {{.*#+}} xmm1 = [127,127] -; AVX512BW-NEXT: vpminsq %zmm1, %zmm0, %zmm0 -; AVX512BW-NEXT: vmovdqa {{.*#+}} xmm1 = [18446744073709551488,18446744073709551488] -; AVX512BW-NEXT: vpmaxsq %zmm1, %zmm0, %zmm0 -; AVX512BW-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX512BW-NEXT: vpmovsqb %zmm0, %xmm0 ; AVX512BW-NEXT: vzeroupper ; AVX512BW-NEXT: retq ; ; AVX512BWVL-LABEL: trunc_ssat_v2i64_v2i8: ; AVX512BWVL: # %bb.0: -; AVX512BWVL-NEXT: vpminsq {{.*}}(%rip), %xmm0, %xmm0 -; AVX512BWVL-NEXT: vpmaxsq {{.*}}(%rip), %xmm0, %xmm0 -; AVX512BWVL-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX512BWVL-NEXT: vpmovsqb %xmm0, %xmm0 ; AVX512BWVL-NEXT: retq ; ; SKX-LABEL: trunc_ssat_v2i64_v2i8: ; SKX: # %bb.0: -; SKX-NEXT: vpminsq {{.*}}(%rip), %xmm0, %xmm0 -; SKX-NEXT: vpmaxsq {{.*}}(%rip), %xmm0, %xmm0 -; SKX-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u] +; SKX-NEXT: vpmovsqb %xmm0, %xmm0 ; SKX-NEXT: retq %1 = icmp slt <2 x i64> %a0, %2 = select <2 x i1> %1, <2 x i64> %a0, <2 x i64> @@ -2830,11 +2760,7 @@ define void @trunc_ssat_v2i64_v2i8_store ; AVX512F-LABEL: 
trunc_ssat_v2i64_v2i8_store: ; AVX512F: # %bb.0: ; AVX512F-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 -; AVX512F-NEXT: vmovdqa {{.*#+}} xmm1 = [127,127] -; AVX512F-NEXT: vpminsq %zmm1, %zmm0, %zmm0 -; AVX512F-NEXT: vmovdqa {{.*#+}} xmm1 = [18446744073709551488,18446744073709551488] -; AVX512F-NEXT: vpmaxsq %zmm1, %zmm0, %zmm0 -; AVX512F-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX512F-NEXT: vpmovsqb %zmm0, %xmm0 ; AVX512F-NEXT: vpextrw $0, %xmm0, (%rdi) ; AVX512F-NEXT: vzeroupper ; AVX512F-NEXT: retq @@ -2847,11 +2773,7 @@ define void @trunc_ssat_v2i64_v2i8_store ; AVX512BW-LABEL: trunc_ssat_v2i64_v2i8_store: ; AVX512BW: # %bb.0: ; AVX512BW-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 -; AVX512BW-NEXT: vmovdqa {{.*#+}} xmm1 = [127,127] -; AVX512BW-NEXT: vpminsq %zmm1, %zmm0, %zmm0 -; AVX512BW-NEXT: vmovdqa {{.*#+}} xmm1 = [18446744073709551488,18446744073709551488] -; AVX512BW-NEXT: vpmaxsq %zmm1, %zmm0, %zmm0 -; AVX512BW-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX512BW-NEXT: vpmovsqb %zmm0, %xmm0 ; AVX512BW-NEXT: vpextrw $0, %xmm0, (%rdi) ; AVX512BW-NEXT: vzeroupper ; AVX512BW-NEXT: retq @@ -3097,9 +3019,7 @@ define <4 x i8> @trunc_ssat_v4i64_v4i8(< ; AVX512F-LABEL: trunc_ssat_v4i64_v4i8: ; AVX512F: # %bb.0: ; AVX512F-NEXT: # kill: def $ymm0 killed $ymm0 def $zmm0 -; AVX512F-NEXT: vpminsq {{.*}}(%rip){1to8}, %zmm0, %zmm0 -; AVX512F-NEXT: vpmaxsq {{.*}}(%rip){1to8}, %zmm0, %zmm0 -; AVX512F-NEXT: vpmovqb %zmm0, %xmm0 +; AVX512F-NEXT: vpmovsqb %zmm0, %xmm0 ; AVX512F-NEXT: vzeroupper ; AVX512F-NEXT: retq ; @@ -3112,9 +3032,7 @@ define <4 x i8> @trunc_ssat_v4i64_v4i8(< ; AVX512BW-LABEL: trunc_ssat_v4i64_v4i8: ; AVX512BW: # %bb.0: ; AVX512BW-NEXT: # kill: def $ymm0 killed $ymm0 def $zmm0 -; AVX512BW-NEXT: vpminsq {{.*}}(%rip){1to8}, %zmm0, %zmm0 -; AVX512BW-NEXT: vpmaxsq {{.*}}(%rip){1to8}, %zmm0, %zmm0 -; AVX512BW-NEXT: vpmovqb %zmm0, %xmm0 +; AVX512BW-NEXT: vpmovsqb %zmm0, %xmm0 ; AVX512BW-NEXT: 
vzeroupper ; AVX512BW-NEXT: retq ; @@ -3364,9 +3282,7 @@ define void @trunc_ssat_v4i64_v4i8_store ; AVX512F-LABEL: trunc_ssat_v4i64_v4i8_store: ; AVX512F: # %bb.0: ; AVX512F-NEXT: # kill: def $ymm0 killed $ymm0 def $zmm0 -; AVX512F-NEXT: vpminsq {{.*}}(%rip){1to8}, %zmm0, %zmm0 -; AVX512F-NEXT: vpmaxsq {{.*}}(%rip){1to8}, %zmm0, %zmm0 -; AVX512F-NEXT: vpmovqb %zmm0, %xmm0 +; AVX512F-NEXT: vpmovsqb %zmm0, %xmm0 ; AVX512F-NEXT: vmovd %xmm0, (%rdi) ; AVX512F-NEXT: vzeroupper ; AVX512F-NEXT: retq @@ -3380,9 +3296,7 @@ define void @trunc_ssat_v4i64_v4i8_store ; AVX512BW-LABEL: trunc_ssat_v4i64_v4i8_store: ; AVX512BW: # %bb.0: ; AVX512BW-NEXT: # kill: def $ymm0 killed $ymm0 def $zmm0 -; AVX512BW-NEXT: vpminsq {{.*}}(%rip){1to8}, %zmm0, %zmm0 -; AVX512BW-NEXT: vpmaxsq {{.*}}(%rip){1to8}, %zmm0, %zmm0 -; AVX512BW-NEXT: vpmovqb %zmm0, %xmm0 +; AVX512BW-NEXT: vpmovsqb %zmm0, %xmm0 ; AVX512BW-NEXT: vmovd %xmm0, (%rdi) ; AVX512BW-NEXT: vzeroupper ; AVX512BW-NEXT: retq @@ -5189,41 +5103,31 @@ define <4 x i8> @trunc_ssat_v4i32_v4i8(< ; ; AVX512F-LABEL: trunc_ssat_v4i32_v4i8: ; AVX512F: # %bb.0: -; AVX512F-NEXT: vpbroadcastd {{.*#+}} xmm1 = [127,127,127,127] -; AVX512F-NEXT: vpminsd %xmm1, %xmm0, %xmm0 -; AVX512F-NEXT: vpbroadcastd {{.*#+}} xmm1 = [4294967168,4294967168,4294967168,4294967168] -; AVX512F-NEXT: vpmaxsd %xmm1, %xmm0, %xmm0 -; AVX512F-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,4,8,12,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX512F-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 +; AVX512F-NEXT: vpmovsdb %zmm0, %xmm0 +; AVX512F-NEXT: vzeroupper ; AVX512F-NEXT: retq ; ; AVX512VL-LABEL: trunc_ssat_v4i32_v4i8: ; AVX512VL: # %bb.0: -; AVX512VL-NEXT: vpminsd {{.*}}(%rip){1to4}, %xmm0, %xmm0 -; AVX512VL-NEXT: vpmaxsd {{.*}}(%rip){1to4}, %xmm0, %xmm0 -; AVX512VL-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,4,8,12,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX512VL-NEXT: vpmovsdb %xmm0, %xmm0 ; AVX512VL-NEXT: retq ; ; AVX512BW-LABEL: trunc_ssat_v4i32_v4i8: ; AVX512BW: # %bb.0: -; AVX512BW-NEXT: vpbroadcastd {{.*#+}} 
xmm1 = [127,127,127,127] -; AVX512BW-NEXT: vpminsd %xmm1, %xmm0, %xmm0 -; AVX512BW-NEXT: vpbroadcastd {{.*#+}} xmm1 = [4294967168,4294967168,4294967168,4294967168] -; AVX512BW-NEXT: vpmaxsd %xmm1, %xmm0, %xmm0 -; AVX512BW-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,4,8,12,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX512BW-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 +; AVX512BW-NEXT: vpmovsdb %zmm0, %xmm0 +; AVX512BW-NEXT: vzeroupper ; AVX512BW-NEXT: retq ; ; AVX512BWVL-LABEL: trunc_ssat_v4i32_v4i8: ; AVX512BWVL: # %bb.0: -; AVX512BWVL-NEXT: vpminsd {{.*}}(%rip){1to4}, %xmm0, %xmm0 -; AVX512BWVL-NEXT: vpmaxsd {{.*}}(%rip){1to4}, %xmm0, %xmm0 -; AVX512BWVL-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,4,8,12,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX512BWVL-NEXT: vpmovsdb %xmm0, %xmm0 ; AVX512BWVL-NEXT: retq ; ; SKX-LABEL: trunc_ssat_v4i32_v4i8: ; SKX: # %bb.0: -; SKX-NEXT: vpminsd {{.*}}(%rip){1to4}, %xmm0, %xmm0 -; SKX-NEXT: vpmaxsd {{.*}}(%rip){1to4}, %xmm0, %xmm0 -; SKX-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,4,8,12,u,u,u,u,u,u,u,u,u,u,u,u] +; SKX-NEXT: vpmovsdb %xmm0, %xmm0 ; SKX-NEXT: retq %1 = icmp slt <4 x i32> %a0, %2 = select <4 x i1> %1, <4 x i32> %a0, <4 x i32> @@ -5300,12 +5204,10 @@ define void @trunc_ssat_v4i32_v4i8_store ; ; AVX512F-LABEL: trunc_ssat_v4i32_v4i8_store: ; AVX512F: # %bb.0: -; AVX512F-NEXT: vpbroadcastd {{.*#+}} xmm1 = [127,127,127,127] -; AVX512F-NEXT: vpminsd %xmm1, %xmm0, %xmm0 -; AVX512F-NEXT: vpbroadcastd {{.*#+}} xmm1 = [4294967168,4294967168,4294967168,4294967168] -; AVX512F-NEXT: vpmaxsd %xmm1, %xmm0, %xmm0 -; AVX512F-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,4,8,12,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX512F-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 +; AVX512F-NEXT: vpmovsdb %zmm0, %xmm0 ; AVX512F-NEXT: vmovd %xmm0, (%rdi) +; AVX512F-NEXT: vzeroupper ; AVX512F-NEXT: retq ; ; AVX512VL-LABEL: trunc_ssat_v4i32_v4i8_store: @@ -5315,12 +5217,10 @@ define void @trunc_ssat_v4i32_v4i8_store ; ; AVX512BW-LABEL: trunc_ssat_v4i32_v4i8_store: ; AVX512BW: # %bb.0: -; AVX512BW-NEXT: 
vpbroadcastd {{.*#+}} xmm1 = [127,127,127,127] -; AVX512BW-NEXT: vpminsd %xmm1, %xmm0, %xmm0 -; AVX512BW-NEXT: vpbroadcastd {{.*#+}} xmm1 = [4294967168,4294967168,4294967168,4294967168] -; AVX512BW-NEXT: vpmaxsd %xmm1, %xmm0, %xmm0 -; AVX512BW-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,4,8,12,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX512BW-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 +; AVX512BW-NEXT: vpmovsdb %zmm0, %xmm0 ; AVX512BW-NEXT: vmovd %xmm0, (%rdi) +; AVX512BW-NEXT: vzeroupper ; AVX512BW-NEXT: retq ; ; AVX512BWVL-LABEL: trunc_ssat_v4i32_v4i8_store: Modified: llvm/trunk/test/CodeGen/X86/vector-trunc-usat.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/vector-trunc-usat.ll?rev=374731&r1=374730&r2=374731&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/vector-trunc-usat.ll (original) +++ llvm/trunk/test/CodeGen/X86/vector-trunc-usat.ll Sun Oct 13 12:07:28 2019 @@ -84,37 +84,32 @@ define <2 x i32> @trunc_usat_v2i64_v2i32 ; AVX512F-LABEL: trunc_usat_v2i64_v2i32: ; AVX512F: # %bb.0: ; AVX512F-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 -; AVX512F-NEXT: vmovdqa {{.*#+}} xmm1 = [4294967295,4294967295] -; AVX512F-NEXT: vpminuq %zmm1, %zmm0, %zmm0 -; AVX512F-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3] +; AVX512F-NEXT: vpmovusqd %zmm0, %ymm0 +; AVX512F-NEXT: # kill: def $xmm0 killed $xmm0 killed $ymm0 ; AVX512F-NEXT: vzeroupper ; AVX512F-NEXT: retq ; ; AVX512VL-LABEL: trunc_usat_v2i64_v2i32: ; AVX512VL: # %bb.0: -; AVX512VL-NEXT: vpminuq {{.*}}(%rip), %xmm0, %xmm0 -; AVX512VL-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3] +; AVX512VL-NEXT: vpmovusqd %xmm0, %xmm0 ; AVX512VL-NEXT: retq ; ; AVX512BW-LABEL: trunc_usat_v2i64_v2i32: ; AVX512BW: # %bb.0: ; AVX512BW-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 -; AVX512BW-NEXT: vmovdqa {{.*#+}} xmm1 = [4294967295,4294967295] -; AVX512BW-NEXT: vpminuq %zmm1, %zmm0, %zmm0 -; AVX512BW-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3] +; 
AVX512BW-NEXT: vpmovusqd %zmm0, %ymm0 +; AVX512BW-NEXT: # kill: def $xmm0 killed $xmm0 killed $ymm0 ; AVX512BW-NEXT: vzeroupper ; AVX512BW-NEXT: retq ; ; AVX512BWVL-LABEL: trunc_usat_v2i64_v2i32: ; AVX512BWVL: # %bb.0: -; AVX512BWVL-NEXT: vpminuq {{.*}}(%rip), %xmm0, %xmm0 -; AVX512BWVL-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3] +; AVX512BWVL-NEXT: vpmovusqd %xmm0, %xmm0 ; AVX512BWVL-NEXT: retq ; ; SKX-LABEL: trunc_usat_v2i64_v2i32: ; SKX: # %bb.0: -; SKX-NEXT: vpminuq {{.*}}(%rip), %xmm0, %xmm0 -; SKX-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3] +; SKX-NEXT: vpmovusqd %xmm0, %xmm0 ; SKX-NEXT: retq %1 = icmp ult <2 x i64> %a0, %2 = select <2 x i1> %1, <2 x i64> %a0, <2 x i64> @@ -195,9 +190,7 @@ define void @trunc_usat_v2i64_v2i32_stor ; AVX512F-LABEL: trunc_usat_v2i64_v2i32_store: ; AVX512F: # %bb.0: ; AVX512F-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 -; AVX512F-NEXT: vmovdqa {{.*#+}} xmm1 = [4294967295,4294967295] -; AVX512F-NEXT: vpminuq %zmm1, %zmm0, %zmm0 -; AVX512F-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3] +; AVX512F-NEXT: vpmovusqd %zmm0, %ymm0 ; AVX512F-NEXT: vmovq %xmm0, (%rdi) ; AVX512F-NEXT: vzeroupper ; AVX512F-NEXT: retq @@ -210,9 +203,7 @@ define void @trunc_usat_v2i64_v2i32_stor ; AVX512BW-LABEL: trunc_usat_v2i64_v2i32_store: ; AVX512BW: # %bb.0: ; AVX512BW-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 -; AVX512BW-NEXT: vmovdqa {{.*#+}} xmm1 = [4294967295,4294967295] -; AVX512BW-NEXT: vpminuq %zmm1, %zmm0, %zmm0 -; AVX512BW-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3] +; AVX512BW-NEXT: vpmovusqd %zmm0, %ymm0 ; AVX512BW-NEXT: vmovq %xmm0, (%rdi) ; AVX512BW-NEXT: vzeroupper ; AVX512BW-NEXT: retq @@ -791,38 +782,30 @@ define <2 x i16> @trunc_usat_v2i64_v2i16 ; AVX512F-LABEL: trunc_usat_v2i64_v2i16: ; AVX512F: # %bb.0: ; AVX512F-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 -; AVX512F-NEXT: vmovdqa {{.*#+}} xmm1 = [65535,65535] -; AVX512F-NEXT: vpminuq %zmm1, %zmm0, %zmm0 -; AVX512F-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3] -; AVX512F-NEXT: 
vpshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7] +; AVX512F-NEXT: vpmovusqw %zmm0, %xmm0 ; AVX512F-NEXT: vzeroupper ; AVX512F-NEXT: retq ; ; AVX512VL-LABEL: trunc_usat_v2i64_v2i16: ; AVX512VL: # %bb.0: -; AVX512VL-NEXT: vpminuq {{.*}}(%rip), %xmm0, %xmm0 -; AVX512VL-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,1,8,9,8,9,10,11,8,9,10,11,12,13,14,15] +; AVX512VL-NEXT: vpmovusqw %xmm0, %xmm0 ; AVX512VL-NEXT: retq ; ; AVX512BW-LABEL: trunc_usat_v2i64_v2i16: ; AVX512BW: # %bb.0: ; AVX512BW-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 -; AVX512BW-NEXT: vmovdqa {{.*#+}} xmm1 = [65535,65535] -; AVX512BW-NEXT: vpminuq %zmm1, %zmm0, %zmm0 -; AVX512BW-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,1,8,9,8,9,10,11,8,9,10,11,12,13,14,15] +; AVX512BW-NEXT: vpmovusqw %zmm0, %xmm0 ; AVX512BW-NEXT: vzeroupper ; AVX512BW-NEXT: retq ; ; AVX512BWVL-LABEL: trunc_usat_v2i64_v2i16: ; AVX512BWVL: # %bb.0: -; AVX512BWVL-NEXT: vpminuq {{.*}}(%rip), %xmm0, %xmm0 -; AVX512BWVL-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,1,8,9,8,9,10,11,8,9,10,11,12,13,14,15] +; AVX512BWVL-NEXT: vpmovusqw %xmm0, %xmm0 ; AVX512BWVL-NEXT: retq ; ; SKX-LABEL: trunc_usat_v2i64_v2i16: ; SKX: # %bb.0: -; SKX-NEXT: vpminuq {{.*}}(%rip), %xmm0, %xmm0 -; SKX-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,1,8,9,8,9,10,11,8,9,10,11,12,13,14,15] +; SKX-NEXT: vpmovusqw %xmm0, %xmm0 ; SKX-NEXT: retq %1 = icmp ult <2 x i64> %a0, %2 = select <2 x i1> %1, <2 x i64> %a0, <2 x i64> @@ -930,10 +913,7 @@ define void @trunc_usat_v2i64_v2i16_stor ; AVX512F-LABEL: trunc_usat_v2i64_v2i16_store: ; AVX512F: # %bb.0: ; AVX512F-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 -; AVX512F-NEXT: vmovdqa {{.*#+}} xmm1 = [65535,65535] -; AVX512F-NEXT: vpminuq %zmm1, %zmm0, %zmm0 -; AVX512F-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3] -; AVX512F-NEXT: vpshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7] +; AVX512F-NEXT: vpmovusqw %zmm0, %xmm0 ; AVX512F-NEXT: vmovd %xmm0, (%rdi) ; AVX512F-NEXT: vzeroupper ; AVX512F-NEXT: retq @@ -946,9 +926,7 @@ define void @trunc_usat_v2i64_v2i16_stor ; 
AVX512BW-LABEL: trunc_usat_v2i64_v2i16_store: ; AVX512BW: # %bb.0: ; AVX512BW-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 -; AVX512BW-NEXT: vmovdqa {{.*#+}} xmm1 = [65535,65535] -; AVX512BW-NEXT: vpminuq %zmm1, %zmm0, %zmm0 -; AVX512BW-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,1,8,9,8,9,10,11,8,9,10,11,12,13,14,15] +; AVX512BW-NEXT: vpmovusqw %zmm0, %xmm0 ; AVX512BW-NEXT: vmovd %xmm0, (%rdi) ; AVX512BW-NEXT: vzeroupper ; AVX512BW-NEXT: retq @@ -1131,8 +1109,7 @@ define <4 x i16> @trunc_usat_v4i64_v4i16 ; AVX512F-LABEL: trunc_usat_v4i64_v4i16: ; AVX512F: # %bb.0: ; AVX512F-NEXT: # kill: def $ymm0 killed $ymm0 def $zmm0 -; AVX512F-NEXT: vpminuq {{.*}}(%rip){1to8}, %zmm0, %zmm0 -; AVX512F-NEXT: vpmovqw %zmm0, %xmm0 +; AVX512F-NEXT: vpmovusqw %zmm0, %xmm0 ; AVX512F-NEXT: vzeroupper ; AVX512F-NEXT: retq ; @@ -1145,8 +1122,7 @@ define <4 x i16> @trunc_usat_v4i64_v4i16 ; AVX512BW-LABEL: trunc_usat_v4i64_v4i16: ; AVX512BW: # %bb.0: ; AVX512BW-NEXT: # kill: def $ymm0 killed $ymm0 def $zmm0 -; AVX512BW-NEXT: vpminuq {{.*}}(%rip){1to8}, %zmm0, %zmm0 -; AVX512BW-NEXT: vpmovqw %zmm0, %xmm0 +; AVX512BW-NEXT: vpmovusqw %zmm0, %xmm0 ; AVX512BW-NEXT: vzeroupper ; AVX512BW-NEXT: retq ; @@ -1335,8 +1311,7 @@ define void @trunc_usat_v4i64_v4i16_stor ; AVX512F-LABEL: trunc_usat_v4i64_v4i16_store: ; AVX512F: # %bb.0: ; AVX512F-NEXT: # kill: def $ymm0 killed $ymm0 def $zmm0 -; AVX512F-NEXT: vpminuq {{.*}}(%rip){1to8}, %zmm0, %zmm0 -; AVX512F-NEXT: vpmovqw %zmm0, %xmm0 +; AVX512F-NEXT: vpmovusqw %zmm0, %xmm0 ; AVX512F-NEXT: vmovq %xmm0, (%rdi) ; AVX512F-NEXT: vzeroupper ; AVX512F-NEXT: retq @@ -1350,8 +1325,7 @@ define void @trunc_usat_v4i64_v4i16_stor ; AVX512BW-LABEL: trunc_usat_v4i64_v4i16_store: ; AVX512BW: # %bb.0: ; AVX512BW-NEXT: # kill: def $ymm0 killed $ymm0 def $zmm0 -; AVX512BW-NEXT: vpminuq {{.*}}(%rip){1to8}, %zmm0, %zmm0 -; AVX512BW-NEXT: vpmovqw %zmm0, %xmm0 +; AVX512BW-NEXT: vpmovusqw %zmm0, %xmm0 ; AVX512BW-NEXT: vmovq %xmm0, (%rdi) ; AVX512BW-NEXT: vzeroupper ; 
AVX512BW-NEXT: retq @@ -1691,34 +1665,33 @@ define <4 x i16> @trunc_usat_v4i32_v4i16 ; ; AVX512F-LABEL: trunc_usat_v4i32_v4i16: ; AVX512F: # %bb.0: -; AVX512F-NEXT: vpbroadcastd {{.*#+}} xmm1 = [65535,65535,65535,65535] -; AVX512F-NEXT: vpminud %xmm1, %xmm0, %xmm0 -; AVX512F-NEXT: vpackusdw %xmm0, %xmm0, %xmm0 +; AVX512F-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 +; AVX512F-NEXT: vpmovusdw %zmm0, %ymm0 +; AVX512F-NEXT: # kill: def $xmm0 killed $xmm0 killed $ymm0 +; AVX512F-NEXT: vzeroupper ; AVX512F-NEXT: retq ; ; AVX512VL-LABEL: trunc_usat_v4i32_v4i16: ; AVX512VL: # %bb.0: -; AVX512VL-NEXT: vpminud {{.*}}(%rip){1to4}, %xmm0, %xmm0 -; AVX512VL-NEXT: vpackusdw %xmm0, %xmm0, %xmm0 +; AVX512VL-NEXT: vpmovusdw %xmm0, %xmm0 ; AVX512VL-NEXT: retq ; ; AVX512BW-LABEL: trunc_usat_v4i32_v4i16: ; AVX512BW: # %bb.0: -; AVX512BW-NEXT: vpbroadcastd {{.*#+}} xmm1 = [65535,65535,65535,65535] -; AVX512BW-NEXT: vpminud %xmm1, %xmm0, %xmm0 -; AVX512BW-NEXT: vpackusdw %xmm0, %xmm0, %xmm0 +; AVX512BW-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 +; AVX512BW-NEXT: vpmovusdw %zmm0, %ymm0 +; AVX512BW-NEXT: # kill: def $xmm0 killed $xmm0 killed $ymm0 +; AVX512BW-NEXT: vzeroupper ; AVX512BW-NEXT: retq ; ; AVX512BWVL-LABEL: trunc_usat_v4i32_v4i16: ; AVX512BWVL: # %bb.0: -; AVX512BWVL-NEXT: vpminud {{.*}}(%rip){1to4}, %xmm0, %xmm0 -; AVX512BWVL-NEXT: vpackusdw %xmm0, %xmm0, %xmm0 +; AVX512BWVL-NEXT: vpmovusdw %xmm0, %xmm0 ; AVX512BWVL-NEXT: retq ; ; SKX-LABEL: trunc_usat_v4i32_v4i16: ; SKX: # %bb.0: -; SKX-NEXT: vpminud {{.*}}(%rip){1to4}, %xmm0, %xmm0 -; SKX-NEXT: vpackusdw %xmm0, %xmm0, %xmm0 +; SKX-NEXT: vpmovusdw %xmm0, %xmm0 ; SKX-NEXT: retq %1 = icmp ult <4 x i32> %a0, %2 = select <4 x i1> %1, <4 x i32> %a0, <4 x i32> @@ -1779,10 +1752,10 @@ define void @trunc_usat_v4i32_v4i16_stor ; ; AVX512F-LABEL: trunc_usat_v4i32_v4i16_store: ; AVX512F: # %bb.0: -; AVX512F-NEXT: vpbroadcastd {{.*#+}} xmm1 = [65535,65535,65535,65535] -; AVX512F-NEXT: vpminud %xmm1, %xmm0, %xmm0 -; AVX512F-NEXT: 
vpackusdw %xmm0, %xmm0, %xmm0 +; AVX512F-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 +; AVX512F-NEXT: vpmovusdw %zmm0, %ymm0 ; AVX512F-NEXT: vmovq %xmm0, (%rdi) +; AVX512F-NEXT: vzeroupper ; AVX512F-NEXT: retq ; ; AVX512VL-LABEL: trunc_usat_v4i32_v4i16_store: @@ -1792,10 +1765,10 @@ define void @trunc_usat_v4i32_v4i16_stor ; ; AVX512BW-LABEL: trunc_usat_v4i32_v4i16_store: ; AVX512BW: # %bb.0: -; AVX512BW-NEXT: vpbroadcastd {{.*#+}} xmm1 = [65535,65535,65535,65535] -; AVX512BW-NEXT: vpminud %xmm1, %xmm0, %xmm0 -; AVX512BW-NEXT: vpackusdw %xmm0, %xmm0, %xmm0 +; AVX512BW-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 +; AVX512BW-NEXT: vpmovusdw %zmm0, %ymm0 ; AVX512BW-NEXT: vmovq %xmm0, (%rdi) +; AVX512BW-NEXT: vzeroupper ; AVX512BW-NEXT: retq ; ; AVX512BWVL-LABEL: trunc_usat_v4i32_v4i16_store: @@ -1891,9 +1864,8 @@ define <8 x i16> @trunc_usat_v8i32_v8i16 ; ; AVX512F-LABEL: trunc_usat_v8i32_v8i16: ; AVX512F: # %bb.0: -; AVX512F-NEXT: vpbroadcastd {{.*#+}} ymm1 = [65535,65535,65535,65535,65535,65535,65535,65535] -; AVX512F-NEXT: vpminud %ymm1, %ymm0, %ymm0 -; AVX512F-NEXT: vpmovdw %zmm0, %ymm0 +; AVX512F-NEXT: # kill: def $ymm0 killed $ymm0 def $zmm0 +; AVX512F-NEXT: vpmovusdw %zmm0, %ymm0 ; AVX512F-NEXT: # kill: def $xmm0 killed $xmm0 killed $ymm0 ; AVX512F-NEXT: vzeroupper ; AVX512F-NEXT: retq @@ -1906,9 +1878,8 @@ define <8 x i16> @trunc_usat_v8i32_v8i16 ; ; AVX512BW-LABEL: trunc_usat_v8i32_v8i16: ; AVX512BW: # %bb.0: -; AVX512BW-NEXT: vpbroadcastd {{.*#+}} ymm1 = [65535,65535,65535,65535,65535,65535,65535,65535] -; AVX512BW-NEXT: vpminud %ymm1, %ymm0, %ymm0 -; AVX512BW-NEXT: vpmovdw %zmm0, %ymm0 +; AVX512BW-NEXT: # kill: def $ymm0 killed $ymm0 def $zmm0 +; AVX512BW-NEXT: vpmovusdw %zmm0, %ymm0 ; AVX512BW-NEXT: # kill: def $xmm0 killed $xmm0 killed $ymm0 ; AVX512BW-NEXT: vzeroupper ; AVX512BW-NEXT: retq @@ -2156,37 +2127,30 @@ define <2 x i8> @trunc_usat_v2i64_v2i8(< ; AVX512F-LABEL: trunc_usat_v2i64_v2i8: ; AVX512F: # %bb.0: ; AVX512F-NEXT: # kill: def $xmm0 
killed $xmm0 def $zmm0 -; AVX512F-NEXT: vmovdqa {{.*#+}} xmm1 = [255,255] -; AVX512F-NEXT: vpminuq %zmm1, %zmm0, %zmm0 -; AVX512F-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX512F-NEXT: vpmovusqb %zmm0, %xmm0 ; AVX512F-NEXT: vzeroupper ; AVX512F-NEXT: retq ; ; AVX512VL-LABEL: trunc_usat_v2i64_v2i8: ; AVX512VL: # %bb.0: -; AVX512VL-NEXT: vpminuq {{.*}}(%rip), %xmm0, %xmm0 -; AVX512VL-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX512VL-NEXT: vpmovusqb %xmm0, %xmm0 ; AVX512VL-NEXT: retq ; ; AVX512BW-LABEL: trunc_usat_v2i64_v2i8: ; AVX512BW: # %bb.0: ; AVX512BW-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 -; AVX512BW-NEXT: vmovdqa {{.*#+}} xmm1 = [255,255] -; AVX512BW-NEXT: vpminuq %zmm1, %zmm0, %zmm0 -; AVX512BW-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX512BW-NEXT: vpmovusqb %zmm0, %xmm0 ; AVX512BW-NEXT: vzeroupper ; AVX512BW-NEXT: retq ; ; AVX512BWVL-LABEL: trunc_usat_v2i64_v2i8: ; AVX512BWVL: # %bb.0: -; AVX512BWVL-NEXT: vpminuq {{.*}}(%rip), %xmm0, %xmm0 -; AVX512BWVL-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX512BWVL-NEXT: vpmovusqb %xmm0, %xmm0 ; AVX512BWVL-NEXT: retq ; ; SKX-LABEL: trunc_usat_v2i64_v2i8: ; SKX: # %bb.0: -; SKX-NEXT: vpminuq {{.*}}(%rip), %xmm0, %xmm0 -; SKX-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u] +; SKX-NEXT: vpmovusqb %xmm0, %xmm0 ; SKX-NEXT: retq %1 = icmp ult <2 x i64> %a0, %2 = select <2 x i1> %1, <2 x i64> %a0, <2 x i64> @@ -2272,9 +2236,7 @@ define void @trunc_usat_v2i64_v2i8_store ; AVX512F-LABEL: trunc_usat_v2i64_v2i8_store: ; AVX512F: # %bb.0: ; AVX512F-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 -; AVX512F-NEXT: vmovdqa {{.*#+}} xmm1 = [255,255] -; AVX512F-NEXT: vpminuq %zmm1, %zmm0, %zmm0 -; AVX512F-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX512F-NEXT: vpmovusqb %zmm0, %xmm0 ; AVX512F-NEXT: vpextrw $0, %xmm0, (%rdi) ; AVX512F-NEXT: vzeroupper ; 
AVX512F-NEXT: retq @@ -2287,9 +2249,7 @@ define void @trunc_usat_v2i64_v2i8_store ; AVX512BW-LABEL: trunc_usat_v2i64_v2i8_store: ; AVX512BW: # %bb.0: ; AVX512BW-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 -; AVX512BW-NEXT: vmovdqa {{.*#+}} xmm1 = [255,255] -; AVX512BW-NEXT: vpminuq %zmm1, %zmm0, %zmm0 -; AVX512BW-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,8,u,u,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX512BW-NEXT: vpmovusqb %zmm0, %xmm0 ; AVX512BW-NEXT: vpextrw $0, %xmm0, (%rdi) ; AVX512BW-NEXT: vzeroupper ; AVX512BW-NEXT: retq @@ -2453,8 +2413,7 @@ define <4 x i8> @trunc_usat_v4i64_v4i8(< ; AVX512F-LABEL: trunc_usat_v4i64_v4i8: ; AVX512F: # %bb.0: ; AVX512F-NEXT: # kill: def $ymm0 killed $ymm0 def $zmm0 -; AVX512F-NEXT: vpminuq {{.*}}(%rip){1to8}, %zmm0, %zmm0 -; AVX512F-NEXT: vpmovqb %zmm0, %xmm0 +; AVX512F-NEXT: vpmovusqb %zmm0, %xmm0 ; AVX512F-NEXT: vzeroupper ; AVX512F-NEXT: retq ; @@ -2467,8 +2426,7 @@ define <4 x i8> @trunc_usat_v4i64_v4i8(< ; AVX512BW-LABEL: trunc_usat_v4i64_v4i8: ; AVX512BW: # %bb.0: ; AVX512BW-NEXT: # kill: def $ymm0 killed $ymm0 def $zmm0 -; AVX512BW-NEXT: vpminuq {{.*}}(%rip){1to8}, %zmm0, %zmm0 -; AVX512BW-NEXT: vpmovqb %zmm0, %xmm0 +; AVX512BW-NEXT: vpmovusqb %zmm0, %xmm0 ; AVX512BW-NEXT: vzeroupper ; AVX512BW-NEXT: retq ; @@ -2636,8 +2594,7 @@ define void @trunc_usat_v4i64_v4i8_store ; AVX512F-LABEL: trunc_usat_v4i64_v4i8_store: ; AVX512F: # %bb.0: ; AVX512F-NEXT: # kill: def $ymm0 killed $ymm0 def $zmm0 -; AVX512F-NEXT: vpminuq {{.*}}(%rip){1to8}, %zmm0, %zmm0 -; AVX512F-NEXT: vpmovqb %zmm0, %xmm0 +; AVX512F-NEXT: vpmovusqb %zmm0, %xmm0 ; AVX512F-NEXT: vmovd %xmm0, (%rdi) ; AVX512F-NEXT: vzeroupper ; AVX512F-NEXT: retq @@ -2651,8 +2608,7 @@ define void @trunc_usat_v4i64_v4i8_store ; AVX512BW-LABEL: trunc_usat_v4i64_v4i8_store: ; AVX512BW: # %bb.0: ; AVX512BW-NEXT: # kill: def $ymm0 killed $ymm0 def $zmm0 -; AVX512BW-NEXT: vpminuq {{.*}}(%rip){1to8}, %zmm0, %zmm0 -; AVX512BW-NEXT: vpmovqb %zmm0, %xmm0 +; AVX512BW-NEXT: vpmovusqb %zmm0, %xmm0 ; 
AVX512BW-NEXT: vmovd %xmm0, (%rdi) ; AVX512BW-NEXT: vzeroupper ; AVX512BW-NEXT: retq @@ -3758,34 +3714,31 @@ define <4 x i8> @trunc_usat_v4i32_v4i8(< ; ; AVX512F-LABEL: trunc_usat_v4i32_v4i8: ; AVX512F: # %bb.0: -; AVX512F-NEXT: vpbroadcastd {{.*#+}} xmm1 = [255,255,255,255] -; AVX512F-NEXT: vpminud %xmm1, %xmm0, %xmm0 -; AVX512F-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,4,8,12,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX512F-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 +; AVX512F-NEXT: vpmovusdb %zmm0, %xmm0 +; AVX512F-NEXT: vzeroupper ; AVX512F-NEXT: retq ; ; AVX512VL-LABEL: trunc_usat_v4i32_v4i8: ; AVX512VL: # %bb.0: -; AVX512VL-NEXT: vpminud {{.*}}(%rip){1to4}, %xmm0, %xmm0 -; AVX512VL-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,4,8,12,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX512VL-NEXT: vpmovusdb %xmm0, %xmm0 ; AVX512VL-NEXT: retq ; ; AVX512BW-LABEL: trunc_usat_v4i32_v4i8: ; AVX512BW: # %bb.0: -; AVX512BW-NEXT: vpbroadcastd {{.*#+}} xmm1 = [255,255,255,255] -; AVX512BW-NEXT: vpminud %xmm1, %xmm0, %xmm0 -; AVX512BW-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,4,8,12,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX512BW-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 +; AVX512BW-NEXT: vpmovusdb %zmm0, %xmm0 +; AVX512BW-NEXT: vzeroupper ; AVX512BW-NEXT: retq ; ; AVX512BWVL-LABEL: trunc_usat_v4i32_v4i8: ; AVX512BWVL: # %bb.0: -; AVX512BWVL-NEXT: vpminud {{.*}}(%rip){1to4}, %xmm0, %xmm0 -; AVX512BWVL-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,4,8,12,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX512BWVL-NEXT: vpmovusdb %xmm0, %xmm0 ; AVX512BWVL-NEXT: retq ; ; SKX-LABEL: trunc_usat_v4i32_v4i8: ; SKX: # %bb.0: -; SKX-NEXT: vpminud {{.*}}(%rip){1to4}, %xmm0, %xmm0 -; SKX-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,4,8,12,u,u,u,u,u,u,u,u,u,u,u,u] +; SKX-NEXT: vpmovusdb %xmm0, %xmm0 ; SKX-NEXT: retq %1 = icmp ult <4 x i32> %a0, %2 = select <4 x i1> %1, <4 x i32> %a0, <4 x i32> @@ -3846,10 +3799,10 @@ define void @trunc_usat_v4i32_v4i8_store ; ; AVX512F-LABEL: trunc_usat_v4i32_v4i8_store: ; AVX512F: # %bb.0: -; AVX512F-NEXT: vpbroadcastd {{.*#+}} xmm1 = 
[255,255,255,255] -; AVX512F-NEXT: vpminud %xmm1, %xmm0, %xmm0 -; AVX512F-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,4,8,12,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX512F-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 +; AVX512F-NEXT: vpmovusdb %zmm0, %xmm0 ; AVX512F-NEXT: vmovd %xmm0, (%rdi) +; AVX512F-NEXT: vzeroupper ; AVX512F-NEXT: retq ; ; AVX512VL-LABEL: trunc_usat_v4i32_v4i8_store: @@ -3859,10 +3812,10 @@ define void @trunc_usat_v4i32_v4i8_store ; ; AVX512BW-LABEL: trunc_usat_v4i32_v4i8_store: ; AVX512BW: # %bb.0: -; AVX512BW-NEXT: vpbroadcastd {{.*#+}} xmm1 = [255,255,255,255] -; AVX512BW-NEXT: vpminud %xmm1, %xmm0, %xmm0 -; AVX512BW-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[0,4,8,12,u,u,u,u,u,u,u,u,u,u,u,u] +; AVX512BW-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 +; AVX512BW-NEXT: vpmovusdb %zmm0, %xmm0 ; AVX512BW-NEXT: vmovd %xmm0, (%rdi) +; AVX512BW-NEXT: vzeroupper ; AVX512BW-NEXT: retq ; ; AVX512BWVL-LABEL: trunc_usat_v4i32_v4i8_store: @@ -3965,9 +3918,8 @@ define <8 x i8> @trunc_usat_v8i32_v8i8(< ; ; AVX512F-LABEL: trunc_usat_v8i32_v8i8: ; AVX512F: # %bb.0: -; AVX512F-NEXT: vpbroadcastd {{.*#+}} ymm1 = [255,255,255,255,255,255,255,255] -; AVX512F-NEXT: vpminud %ymm1, %ymm0, %ymm0 -; AVX512F-NEXT: vpmovdb %zmm0, %xmm0 +; AVX512F-NEXT: # kill: def $ymm0 killed $ymm0 def $zmm0 +; AVX512F-NEXT: vpmovusdb %zmm0, %xmm0 ; AVX512F-NEXT: vzeroupper ; AVX512F-NEXT: retq ; @@ -3979,9 +3931,8 @@ define <8 x i8> @trunc_usat_v8i32_v8i8(< ; ; AVX512BW-LABEL: trunc_usat_v8i32_v8i8: ; AVX512BW: # %bb.0: -; AVX512BW-NEXT: vpbroadcastd {{.*#+}} ymm1 = [255,255,255,255,255,255,255,255] -; AVX512BW-NEXT: vpminud %ymm1, %ymm0, %ymm0 -; AVX512BW-NEXT: vpmovdb %zmm0, %xmm0 +; AVX512BW-NEXT: # kill: def $ymm0 killed $ymm0 def $zmm0 +; AVX512BW-NEXT: vpmovusdb %zmm0, %xmm0 ; AVX512BW-NEXT: vzeroupper ; AVX512BW-NEXT: retq ; @@ -4090,9 +4041,8 @@ define void @trunc_usat_v8i32_v8i8_store ; ; AVX512F-LABEL: trunc_usat_v8i32_v8i8_store: ; AVX512F: # %bb.0: -; AVX512F-NEXT: vpbroadcastd {{.*#+}} ymm1 = 
[255,255,255,255,255,255,255,255] -; AVX512F-NEXT: vpminud %ymm1, %ymm0, %ymm0 -; AVX512F-NEXT: vpmovdb %zmm0, %xmm0 +; AVX512F-NEXT: # kill: def $ymm0 killed $ymm0 def $zmm0 +; AVX512F-NEXT: vpmovusdb %zmm0, %xmm0 ; AVX512F-NEXT: vmovq %xmm0, (%rdi) ; AVX512F-NEXT: vzeroupper ; AVX512F-NEXT: retq @@ -4105,9 +4055,8 @@ define void @trunc_usat_v8i32_v8i8_store ; ; AVX512BW-LABEL: trunc_usat_v8i32_v8i8_store: ; AVX512BW: # %bb.0: -; AVX512BW-NEXT: vpbroadcastd {{.*#+}} ymm1 = [255,255,255,255,255,255,255,255] -; AVX512BW-NEXT: vpminud %ymm1, %ymm0, %ymm0 -; AVX512BW-NEXT: vpmovdb %zmm0, %xmm0 +; AVX512BW-NEXT: # kill: def $ymm0 killed $ymm0 def $zmm0 +; AVX512BW-NEXT: vpmovusdb %zmm0, %xmm0 ; AVX512BW-NEXT: vmovq %xmm0, (%rdi) ; AVX512BW-NEXT: vzeroupper ; AVX512BW-NEXT: retq @@ -4444,16 +4393,34 @@ define <8 x i8> @trunc_usat_v8i16_v8i8(< ; AVX-NEXT: vpackuswb %xmm0, %xmm0, %xmm0 ; AVX-NEXT: retq ; -; AVX512-LABEL: trunc_usat_v8i16_v8i8: -; AVX512: # %bb.0: -; AVX512-NEXT: vpminuw {{.*}}(%rip), %xmm0, %xmm0 -; AVX512-NEXT: vpackuswb %xmm0, %xmm0, %xmm0 -; AVX512-NEXT: retq +; AVX512F-LABEL: trunc_usat_v8i16_v8i8: +; AVX512F: # %bb.0: +; AVX512F-NEXT: vpminuw {{.*}}(%rip), %xmm0, %xmm0 +; AVX512F-NEXT: vpackuswb %xmm0, %xmm0, %xmm0 +; AVX512F-NEXT: retq +; +; AVX512VL-LABEL: trunc_usat_v8i16_v8i8: +; AVX512VL: # %bb.0: +; AVX512VL-NEXT: vpminuw {{.*}}(%rip), %xmm0, %xmm0 +; AVX512VL-NEXT: vpackuswb %xmm0, %xmm0, %xmm0 +; AVX512VL-NEXT: retq +; +; AVX512BW-LABEL: trunc_usat_v8i16_v8i8: +; AVX512BW: # %bb.0: +; AVX512BW-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 +; AVX512BW-NEXT: vpmovuswb %zmm0, %ymm0 +; AVX512BW-NEXT: # kill: def $xmm0 killed $xmm0 killed $ymm0 +; AVX512BW-NEXT: vzeroupper +; AVX512BW-NEXT: retq +; +; AVX512BWVL-LABEL: trunc_usat_v8i16_v8i8: +; AVX512BWVL: # %bb.0: +; AVX512BWVL-NEXT: vpmovuswb %xmm0, %xmm0 +; AVX512BWVL-NEXT: retq ; ; SKX-LABEL: trunc_usat_v8i16_v8i8: ; SKX: # %bb.0: -; SKX-NEXT: vpminuw {{.*}}(%rip), %xmm0, %xmm0 -; SKX-NEXT: 
vpackuswb %xmm0, %xmm0, %xmm0 +; SKX-NEXT: vpmovuswb %xmm0, %xmm0 ; SKX-NEXT: retq %1 = icmp ult <8 x i16> %a0, %2 = select <8 x i1> %1, <8 x i16> %a0, <8 x i16> @@ -4509,9 +4476,10 @@ define void @trunc_usat_v8i16_v8i8_store ; ; AVX512BW-LABEL: trunc_usat_v8i16_v8i8_store: ; AVX512BW: # %bb.0: -; AVX512BW-NEXT: vpminuw {{.*}}(%rip), %xmm0, %xmm0 -; AVX512BW-NEXT: vpackuswb %xmm0, %xmm0, %xmm0 +; AVX512BW-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0 +; AVX512BW-NEXT: vpmovuswb %zmm0, %ymm0 ; AVX512BW-NEXT: vmovq %xmm0, (%rdi) +; AVX512BW-NEXT: vzeroupper ; AVX512BW-NEXT: retq ; ; AVX512BWVL-LABEL: trunc_usat_v8i16_v8i8_store: @@ -4601,8 +4569,8 @@ define <16 x i8> @trunc_usat_v16i16_v16i ; ; AVX512BW-LABEL: trunc_usat_v16i16_v16i8: ; AVX512BW: # %bb.0: -; AVX512BW-NEXT: vpminuw {{.*}}(%rip), %ymm0, %ymm0 -; AVX512BW-NEXT: vpmovwb %zmm0, %ymm0 +; AVX512BW-NEXT: # kill: def $ymm0 killed $ymm0 def $zmm0 +; AVX512BW-NEXT: vpmovuswb %zmm0, %ymm0 ; AVX512BW-NEXT: # kill: def $xmm0 killed $xmm0 killed $ymm0 ; AVX512BW-NEXT: vzeroupper ; AVX512BW-NEXT: retq From llvm-commits at lists.llvm.org Sun Oct 13 12:18:04 2019 From: llvm-commits at lists.llvm.org (Joan LLuch via Phabricator via llvm-commits) Date: Sun, 13 Oct 2019 19:18:04 +0000 (UTC) Subject: [PATCH] D63382: [InstCombine] fold a shifted zext to a select In-Reply-To: References: Message-ID: joanlluch added a comment. @spatel I want to express my support for preferring selects over bit-manipulation instructions in IR, as stated above, so that such optimisations move to DAGCombine. Ideally, this should involve removing some of the existing InstCombineSelect transformations, particularly most of the ones in foldSelectInstWithICmp. However, as I explained earlier on llvm-dev, the DAGCombine code should incorporate hooks that let targets decide whether such bit hacks are actually profitable or whether it is best to keep them as selects.
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D63382/new/ https://reviews.llvm.org/D63382 From llvm-commits at lists.llvm.org Sun Oct 13 12:27:05 2019 From: llvm-commits at lists.llvm.org (Aditya Kumar via Phabricator via llvm-commits) Date: Sun, 13 Oct 2019 19:27:05 +0000 (UTC) Subject: [PATCH] D68924: CodeExtractor: NFC: Use Range based loop In-Reply-To: References: Message-ID: <7fe360bc8611e7ada6f94b53941c843d@localhost.localdomain> hiraditya added a comment. Good point; something about updating the users may make copying necessary. make_early_inc_range didn't work. I searched for use cases of replaceUsesOfWith and couldn't find any instance where we use 'users'. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68924/new/ https://reviews.llvm.org/D68924 From llvm-commits at lists.llvm.org Sun Oct 13 12:35:35 2019 From: llvm-commits at lists.llvm.org (Simon Pilgrim via llvm-commits) Date: Sun, 13 Oct 2019 19:35:35 -0000 Subject: [llvm] r374732 - [X86] getTargetShuffleInputs - Control KnownUndef mask element resolution as well as KnownZero. Message-ID: <20191013193536.062F283C7A@lists.llvm.org> Author: rksimon Date: Sun Oct 13 12:35:35 2019 New Revision: 374732 URL: http://llvm.org/viewvc/llvm-project?rev=374732&view=rev Log: [X86] getTargetShuffleInputs - Control KnownUndef mask element resolution as well as KnownZero. We were already controlling whether the KnownZero elements were being written to the target mask; this extends it to the KnownUndef elements as well, so we can prevent the target shuffle mask from being manipulated at all.
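For readers skimming the log, the behaviour described above can be sketched outside of LLVM roughly as follows. This is only an illustration modelled on the loop in the diff below: the function name resolveKnownElts, the use of std::vector in place of SmallVectorImpl and APInt, and the numeric sentinel values are all assumptions made for this standalone sketch, not the actual X86ISelLowering.cpp code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-ins for SM_SentinelUndef / SM_SentinelZero. The numeric values
// are assumptions chosen for this sketch.
constexpr int SentinelUndef = -1;
constexpr int SentinelZero = -2;

// Hypothetical helper: takes a copy of the shuffle mask and returns it with
// known-undef and known-zero elements replaced by sentinels. When
// ResolveKnownElts is false, the mask is returned exactly as it came in.
std::vector<int> resolveKnownElts(std::vector<int> Mask,
                                  const std::vector<bool> &KnownUndef,
                                  const std::vector<bool> &KnownZero,
                                  bool ResolveKnownElts) {
  assert(Mask.size() == KnownUndef.size() && Mask.size() == KnownZero.size());
  for (std::size_t I = 0; I != Mask.size(); ++I) {
    int &M = Mask[I];
    // Elements that are already sentinels are skipped, as is every element
    // when resolution is disabled.
    if (M < 0 || !ResolveKnownElts)
      continue;
    if (KnownUndef[I])
      M = SentinelUndef; // undef takes priority over zero
    else if (KnownZero[I])
      M = SentinelZero;
  }
  return Mask;
}
```

Under the old ResolveZero flag, only the zero branch was guarded, so undef elements were always rewritten; with the single ResolveKnownElts guard, passing false leaves every mask element alone.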
Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=374732&r1=374731&r2=374732&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp (original) +++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Sun Oct 13 12:35:35 2019 @@ -6812,7 +6812,7 @@ static bool getTargetShuffleAndZeroables static bool getTargetShuffleInputs(SDValue Op, SmallVectorImpl &Inputs, SmallVectorImpl &Mask, SelectionDAG &DAG, unsigned Depth, - bool ResolveZero); + bool ResolveKnownElts); // Attempt to decode ops that could be represented as a shuffle mask. // The decoded shuffle mask may contain a different number of elements to the @@ -6821,7 +6821,7 @@ static bool getFauxShuffleMask(SDValue N SmallVectorImpl &Mask, SmallVectorImpl &Ops, SelectionDAG &DAG, unsigned Depth, - bool ResolveZero) { + bool ResolveKnownElts) { Mask.clear(); Ops.clear(); @@ -6916,9 +6916,9 @@ static bool getFauxShuffleMask(SDValue N SmallVector SrcMask0, SrcMask1; SmallVector SrcInputs0, SrcInputs1; if (!getTargetShuffleInputs(N0, SrcInputs0, SrcMask0, DAG, Depth + 1, - ResolveZero) || + ResolveKnownElts) || !getTargetShuffleInputs(N1, SrcInputs1, SrcMask1, DAG, Depth + 1, - ResolveZero)) + ResolveKnownElts)) return false; size_t MaskSize = std::max(SrcMask0.size(), SrcMask1.size()); SmallVector Mask0, Mask1; @@ -6966,7 +6966,7 @@ static bool getFauxShuffleMask(SDValue N SmallVector SubMask; SmallVector SubInputs; if (!getTargetShuffleInputs(peekThroughOneUseBitcasts(Sub), SubInputs, - SubMask, DAG, Depth + 1, ResolveZero)) + SubMask, DAG, Depth + 1, ResolveKnownElts)) return false; if (SubMask.size() != NumSubElts) { assert(((SubMask.size() % NumSubElts) == 0 || @@ -7250,7 +7250,7 @@ static bool getTargetShuffleInputs(SDVal SmallVectorImpl &Mask, APInt &KnownUndef, APInt &KnownZero, 
SelectionDAG &DAG, unsigned Depth, - bool ResolveZero) { + bool ResolveKnownElts) { EVT VT = Op.getValueType(); if (!VT.isSimple() || !VT.isVector()) return false; @@ -7258,17 +7258,17 @@ static bool getTargetShuffleInputs(SDVal if (getTargetShuffleAndZeroables(Op, Mask, Inputs, KnownUndef, KnownZero)) { for (int i = 0, e = Mask.size(); i != e; ++i) { int &M = Mask[i]; - if (M < 0) + if (M < 0 || !ResolveKnownElts) continue; if (KnownUndef[i]) M = SM_SentinelUndef; - else if (ResolveZero && KnownZero[i]) + else if (KnownZero[i]) M = SM_SentinelZero; } return true; } if (getFauxShuffleMask(Op, DemandedElts, Mask, Inputs, DAG, Depth, - ResolveZero)) { + ResolveKnownElts)) { KnownUndef = KnownZero = APInt::getNullValue(Mask.size()); for (int i = 0, e = Mask.size(); i != e; ++i) { int M = Mask[i]; @@ -7285,7 +7285,7 @@ static bool getTargetShuffleInputs(SDVal static bool getTargetShuffleInputs(SDValue Op, SmallVectorImpl &Inputs, SmallVectorImpl &Mask, SelectionDAG &DAG, unsigned Depth = 0, - bool ResolveZero = true) { + bool ResolveKnownElts = true) { EVT VT = Op.getValueType(); if (!VT.isSimple() || !VT.isVector()) return false; @@ -7294,7 +7294,7 @@ static bool getTargetShuffleInputs(SDVal unsigned NumElts = Op.getValueType().getVectorNumElements(); APInt DemandedElts = APInt::getAllOnesValue(NumElts); return getTargetShuffleInputs(Op, DemandedElts, Inputs, Mask, KnownUndef, - KnownZero, DAG, Depth, ResolveZero); + KnownZero, DAG, Depth, ResolveKnownElts); } /// Returns the scalar element that will make up the ith From llvm-commits at lists.llvm.org Sun Oct 13 13:15:00 2019 From: llvm-commits at lists.llvm.org (Roman Lebedev via llvm-commits) Date: Sun, 13 Oct 2019 20:15:00 -0000 Subject: [llvm] r374734 - [NFC][InstCombine] Some preparatory cleanup in dropRedundantMaskingOfLeftShiftInput() Message-ID: <20191013201500.A5E9B83279@lists.llvm.org> Author: lebedevri Date: Sun Oct 13 13:15:00 2019 New Revision: 374734 URL: 
http://llvm.org/viewvc/llvm-project?rev=374734&view=rev Log: [NFC][InstCombine] Some preparatory cleanup in dropRedundantMaskingOfLeftShiftInput() Modified: llvm/trunk/lib/Transforms/InstCombine/InstCombineShifts.cpp Modified: llvm/trunk/lib/Transforms/InstCombine/InstCombineShifts.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/InstCombine/InstCombineShifts.cpp?rev=374734&r1=374733&r2=374734&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/InstCombine/InstCombineShifts.cpp (original) +++ llvm/trunk/lib/Transforms/InstCombine/InstCombineShifts.cpp Sun Oct 13 13:15:00 2019 @@ -161,6 +161,12 @@ dropRedundantMaskingOfLeftShiftInput(Bin Value *Masked = OuterShift->getOperand(0); Value *ShiftShAmt = OuterShift->getOperand(1); + Type *NarrowestTy = OuterShift->getType(); + Type *WidestTy = Masked->getType(); + // The mask must be computed in a type twice as wide to ensure + // that no bits are lost if the sum-of-shifts is wider than the base type. + Type *ExtendedTy = WidestTy->getExtendedType(); + Value *MaskShAmt; // ((1 << MaskShAmt) - 1) @@ -175,6 +181,7 @@ dropRedundantMaskingOfLeftShiftInput(Bin Value *X; Constant *NewMask; + if (match(Masked, m_c_And(m_CombineOr(MaskA, MaskB), m_Value(X)))) { // Can we simplify (MaskShAmt+ShiftShAmt) ? auto *SumOfShAmts = dyn_cast_or_null(SimplifyAddInst( @@ -184,26 +191,19 @@ dropRedundantMaskingOfLeftShiftInput(Bin // In this pattern SumOfShAmts correlates with the number of low bits // that shall remain in the root value (OuterShift). - Type *Ty = X->getType(); - - // The mask must be computed in a type twice as wide to ensure - // that no bits are lost if the sum-of-shifts is wider than the base type. - Type *ExtendedTy = Ty->getExtendedType(); // An extend of an undef value becomes zero because the high bits are never // completely unknown. 
Replace the the `undef` shift amounts with final // shift bitwidth to ensure that the value remains undef when creating the // subsequent shift op. SumOfShAmts = replaceUndefsWith( - SumOfShAmts, - ConstantInt::get(SumOfShAmts->getType()->getScalarType(), - ExtendedTy->getScalarType()->getScalarSizeInBits())); + SumOfShAmts, ConstantInt::get(SumOfShAmts->getType()->getScalarType(), + ExtendedTy->getScalarSizeInBits())); auto *ExtendedSumOfShAmts = ConstantExpr::getZExt(SumOfShAmts, ExtendedTy); // And compute the mask as usual: ~(-1 << (SumOfShAmts)) auto *ExtendedAllOnes = ConstantExpr::getAllOnesValue(ExtendedTy); auto *ExtendedInvertedMask = ConstantExpr::getShl(ExtendedAllOnes, ExtendedSumOfShAmts); - auto *ExtendedMask = ConstantExpr::getNot(ExtendedInvertedMask); - NewMask = ConstantExpr::getTrunc(ExtendedMask, Ty); + NewMask = ConstantExpr::getNot(ExtendedInvertedMask); } else if (match(Masked, m_c_And(m_CombineOr(MaskC, MaskD), m_Value(X))) || match(Masked, m_Shr(m_Shl(m_Value(X), m_Value(MaskShAmt)), m_Deferred(MaskShAmt)))) { @@ -215,32 +215,29 @@ dropRedundantMaskingOfLeftShiftInput(Bin // In this pattern ShAmtsDiff correlates with the number of high bits that // shall be unset in the root value (OuterShift). - Type *Ty = X->getType(); - unsigned BitWidth = Ty->getScalarSizeInBits(); - - // The mask must be computed in a type twice as wide to ensure - // that no bits are lost if the sum-of-shifts is wider than the base type. - Type *ExtendedTy = Ty->getExtendedType(); // An extend of an undef value becomes zero because the high bits are never // completely unknown. Replace the the `undef` shift amounts with negated - // shift bitwidth to ensure that the value remains undef when creating the - // subsequent shift op. + // bitwidth of innermost shift to ensure that the value remains undef when + // creating the subsequent shift op. 
+ unsigned WidestTyBitWidth = WidestTy->getScalarSizeInBits(); ShAmtsDiff = replaceUndefsWith( - ShAmtsDiff, - ConstantInt::get(ShAmtsDiff->getType()->getScalarType(), -BitWidth)); + ShAmtsDiff, ConstantInt::get(ShAmtsDiff->getType()->getScalarType(), + -WidestTyBitWidth)); auto *ExtendedNumHighBitsToClear = ConstantExpr::getZExt( - ConstantExpr::getSub(ConstantInt::get(ShAmtsDiff->getType(), BitWidth, + ConstantExpr::getSub(ConstantInt::get(ShAmtsDiff->getType(), + WidestTyBitWidth, /*isSigned=*/false), ShAmtsDiff), ExtendedTy); // And compute the mask as usual: (-1 l>> (NumHighBitsToClear)) auto *ExtendedAllOnes = ConstantExpr::getAllOnesValue(ExtendedTy); - auto *ExtendedMask = + NewMask = ConstantExpr::getLShr(ExtendedAllOnes, ExtendedNumHighBitsToClear); - NewMask = ConstantExpr::getTrunc(ExtendedMask, Ty); } else return nullptr; // Don't know anything about this pattern. + NewMask = ConstantExpr::getTrunc(NewMask, NarrowestTy); + // Does this mask has any unset bits? If not then we can just not apply it. bool NeedMask = !match(NewMask, m_AllOnes()); @@ -257,6 +254,7 @@ dropRedundantMaskingOfLeftShiftInput(Bin // No 'NUW'/'NSW'! We no longer know that we won't shift-out non-0 bits. auto *NewShift = BinaryOperator::Create(OuterShift->getOpcode(), X, ShiftShAmt); + if (!NeedMask) return NewShift; From llvm-commits at lists.llvm.org Sun Oct 13 13:12:49 2019 From: llvm-commits at lists.llvm.org (=?utf-8?q?Martin_Storsj=C3=B6_via_Phabricator?= via llvm-commits) Date: Sun, 13 Oct 2019 20:12:49 +0000 (UTC) Subject: [PATCH] D68135: [lit] Set the target-windows feature for any windows environment In-Reply-To: References: Message-ID: mstorsjo abandoned this revision. mstorsjo added a comment. Made redundant by D68450 . 
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68135/new/ https://reviews.llvm.org/D68135 From llvm-commits at lists.llvm.org Sun Oct 13 13:12:51 2019 From: llvm-commits at lists.llvm.org (Gil Rapaport via Phabricator via llvm-commits) Date: Sun, 13 Oct 2019 20:12:51 +0000 (UTC) Subject: [PATCH] D68577: [LV] Apply sink-after & interleave-groups as VPlan transformations (NFC) In-Reply-To: References: Message-ID: <1485f65fa43af6eb5f9d29a5d7ff8e13@localhost.localdomain> gilr updated this revision to Diff 224792. gilr added a comment. Applied review comments. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68577/new/ https://reviews.llvm.org/D68577 Files: llvm/lib/Transforms/Vectorize/LoopVectorizationPlanner.h llvm/lib/Transforms/Vectorize/LoopVectorize.cpp llvm/lib/Transforms/Vectorize/VPRecipeBuilder.h llvm/lib/Transforms/Vectorize/VPlan.cpp llvm/lib/Transforms/Vectorize/VPlan.h llvm/unittests/Transforms/Vectorize/VPlanTest.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D68577.224792.patch Type: text/x-patch Size: 20121 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Sun Oct 13 13:30:48 2019 From: llvm-commits at lists.llvm.org (Stefan Stipanovic via Phabricator via llvm-commits) Date: Sun, 13 Oct 2019 20:30:48 +0000 (UTC) Subject: [PATCH] D68929: [Attributor][FIX] Use check line that is actually tested In-Reply-To: References: Message-ID: <4d513eed5f42783b2f478740b41928d8@localhost.localdomain> sstefan1 accepted this revision. sstefan1 added a comment. This revision is now accepted and ready to land. lgtm. 
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68929/new/ https://reviews.llvm.org/D68929 From llvm-commits at lists.llvm.org Thu Oct 10 09:19:56 2019 From: llvm-commits at lists.llvm.org (Roman Lebedev via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 16:19:56 +0000 (UTC) Subject: [PATCH] D67122: [UBSan][clang][compiler-rt] Applying non-zero offset to nullptr is undefined behaviour In-Reply-To: References: Message-ID: <68d2c0deb40e388accda4727d52f3f07@localhost.localdomain> lebedev.ri added a comment. In D67122#1703616 , @lebedev.ri wrote: > http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-bootstrap-ubsan/builds/15329/steps/annotate/logs/stdio appears to contain "only" 3 distinct issues. > I've pushed those 3 fixes, but maybe that is not enough, we'll see. And we're green. Thanks for everyone's patience and lack of reverts :) As per that bot(!) there were "only" 3 places in LLVM that had that UB. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67122/new/ https://reviews.llvm.org/D67122 From llvm-commits at lists.llvm.org Thu Oct 10 09:47:56 2019 From: llvm-commits at lists.llvm.org (Stanislav Mekhanoshin via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 16:47:56 +0000 (UTC) Subject: [PATCH] D68813: [AMDGPU] Handle undef old operand in DPP combine Message-ID: rampitec created this revision. rampitec added reviewers: arsenm, vpykhtin, kzhuravl. Herald added subscribers: MaskRay, kbarton, hiraditya, t-tye, tpr, dstuttard, yaxunl, nhaehnle, wdng, jvesely, nemanjai. Herald added a project: LLVM. It was missing an undef flag. 
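For readers following along, the idea behind the fix can be sketched outside of LLVM: when the register used as the old operand has no defining instruction, the operand is added with an undef flag instead of flag 0. The sketch below is a standalone model under that assumption; RegFlag, Operand, hasDef and makeOldOperand are hypothetical names made up for illustration and are not part of the LLVM API (in the patch itself the lookup is done by getVRegSubRegDef and the flag is RegState::Undef).

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins for illustration only; these are NOT LLVM types.
enum RegFlag : unsigned { None = 0, Undef = 1u << 0 };

struct Operand {
  unsigned Reg;    // virtual register number
  unsigned Flags;  // None or Undef
  unsigned SubReg; // subregister index, 0 if unused
};

// Models the question "does this register have a defining instruction?".
// Here it is a plain lookup in a list of registers known to be defined.
inline bool hasDef(unsigned Reg, const std::vector<unsigned> &DefinedRegs) {
  for (unsigned R : DefinedRegs)
    if (R == Reg)
      return true;
  return false;
}

// Builds the "old" operand: flag 0 when a def exists, Undef otherwise.
inline Operand makeOldOperand(unsigned Reg, unsigned SubReg,
                              const std::vector<unsigned> &DefinedRegs) {
  unsigned Flags = hasDef(Reg, DefinedRegs) ? None : Undef;
  return Operand{Reg, Flags, SubReg};
}

// Usage: register 1 has a def, register 2 does not.
inline void example() {
  const std::vector<unsigned> Defined = {1};
  assert(makeOldOperand(1, 0, Defined).Flags == None);
  assert(makeOldOperand(2, 0, Defined).Flags == Undef);
}
```

The subregister index is passed through unchanged in both cases; only the flag depends on whether a def was found.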
https://reviews.llvm.org/D68813 Files: llvm/lib/Target/AMDGPU/GCNDPPCombine.cpp llvm/test/CodeGen/AMDGPU/dpp_combine.mir Index: llvm/test/CodeGen/AMDGPU/dpp_combine.mir =================================================================== --- llvm/test/CodeGen/AMDGPU/dpp_combine.mir +++ llvm/test/CodeGen/AMDGPU/dpp_combine.mir @@ -512,7 +512,7 @@ ... # CHECK-LABEL: name: add_old_subreg_undef -# CHECK: %5:vgpr_32 = V_ADD_U32_dpp %3.sub1, %1, %0.sub1, 1, 15, 15, 1, implicit $exec +# CHECK: %5:vgpr_32 = V_ADD_U32_dpp undef %3.sub1, %1, %0.sub1, 1, 15, 15, 1, implicit $exec name: add_old_subreg_undef tracksRegLiveness: true @@ -551,3 +551,14 @@ %2:vgpr_32 = V_MOV_B32_dpp %1:vgpr_32, undef %0:vgpr_32, 1, 15, 15, 1, implicit $exec %4:vgpr_32 = V_MIN_F32_e32 %2, undef %3:vgpr_32, implicit $exec ... + +# Test an undef old operand +# CHECK-LABEL: name: dpp_undef_old +# CHECK: %3:vgpr_32 = V_CEIL_F32_dpp undef %1:vgpr_32, 0, undef %2:vgpr_32, 1, 15, 15, 1, implicit $exec +name: dpp_undef_old +tracksRegLiveness: true +body: | + bb.0: + %2:vgpr_32 = V_MOV_B32_dpp undef %1:vgpr_32, undef %0:vgpr_32, 1, 15, 15, 1, implicit $exec + %3:vgpr_32 = V_CEIL_F32_e32 %2, implicit $exec +... Index: llvm/lib/Target/AMDGPU/GCNDPPCombine.cpp =================================================================== --- llvm/lib/Target/AMDGPU/GCNDPPCombine.cpp +++ llvm/lib/Target/AMDGPU/GCNDPPCombine.cpp @@ -178,7 +178,9 @@ if (OldIdx != -1) { assert(OldIdx == NumOperands); assert(isOfRegClass(CombOldVGPR, AMDGPU::VGPR_32RegClass, *MRI)); - DPPInst.addReg(CombOldVGPR.Reg, 0, CombOldVGPR.SubReg); + auto *Def = getVRegSubRegDef(CombOldVGPR, *MRI); + DPPInst.addReg(CombOldVGPR.Reg, Def ? 0 : RegState::Undef, + CombOldVGPR.SubReg); ++NumOperands; } else { // TODO: this discards MAC/FMA instructions for now, let's add it later -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68813.224393.patch Type: text/x-patch Size: 1806 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Thu Oct 10 13:26:00 2019 From: llvm-commits at lists.llvm.org (Stanislav Mekhanoshin via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 20:26:00 +0000 (UTC) Subject: [PATCH] D68828: [AMDGPU] Allow DPP combiner to work with REG_SEQUENCE Message-ID: rampitec created this revision. rampitec added reviewers: vpykhtin, arsenm, kzhuravl. Herald added subscribers: MaskRay, kbarton, hiraditya, t-tye, tpr, dstuttard, yaxunl, nhaehnle, wdng, jvesely, nemanjai. Herald added a project: LLVM. https://reviews.llvm.org/D68828 Files: llvm/lib/Target/AMDGPU/GCNDPPCombine.cpp llvm/test/CodeGen/AMDGPU/dpp_combine.mir -------------- next part -------------- A non-text attachment was scrubbed... Name: D68828.224457.patch Type: text/x-patch Size: 10886 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Thu Oct 10 14:05:50 2019 From: llvm-commits at lists.llvm.org (Valery Pykhtin via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 21:05:50 +0000 (UTC) Subject: [PATCH] D68813: [AMDGPU] Handle undef old operand in DPP combine In-Reply-To: References: Message-ID: vpykhtin accepted this revision. vpykhtin added a comment. This revision is now accepted and ready to land. Herald added a subscriber: wuzish. LGTM. ================ Comment at: llvm/lib/Target/AMDGPU/GCNDPPCombine.cpp:181 assert(isOfRegClass(CombOldVGPR, AMDGPU::VGPR_32RegClass, *MRI)); - DPPInst.addReg(CombOldVGPR.Reg, 0, CombOldVGPR.SubReg); + auto *Def = getVRegSubRegDef(CombOldVGPR, *MRI); + DPPInst.addReg(CombOldVGPR.Reg, Def ? 0 : RegState::Undef, ---------------- OK, I missed the case where there is no defining instruction. It is not ideal that the def is searched for again, but let's submit this as the fix. 
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68813/new/ https://reviews.llvm.org/D68813 From llvm-commits at lists.llvm.org Thu Oct 10 14:06:43 2019 From: llvm-commits at lists.llvm.org (Tim Gymnich via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 21:06:43 +0000 (UTC) Subject: [PATCH] D68557: PR41162 Implement LKK remainder and divisibility algorithms [srem] In-Reply-To: References: Message-ID: TG908 updated this revision to Diff 224471. Herald added subscribers: pzheng, s.egerton, lenary, wuzish, jocewei, PkmX, the_o, brucehoult, MartinMosbeck, rogfer01, edward-jones, zzheng, niosHD, sabuasal, apazos, simoncook, johnrusso, rbar, asb. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68557/new/ https://reviews.llvm.org/D68557 Files: llvm/include/llvm/CodeGen/TargetLowering.h llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp llvm/test/CodeGen/AArch64/srem-lkk.ll llvm/test/CodeGen/AArch64/srem-seteq.ll llvm/test/CodeGen/AArch64/srem-vector-lkk.ll llvm/test/CodeGen/ARM/urem-opt-size.ll llvm/test/CodeGen/PowerPC/machine-pre.ll llvm/test/CodeGen/PowerPC/srem-lkk.ll llvm/test/CodeGen/PowerPC/srem-vector-lkk.ll llvm/test/CodeGen/RISCV/srem-lkk.ll llvm/test/CodeGen/RISCV/srem-vector-lkk.ll llvm/test/CodeGen/X86/load-scalar-as-vector.ll llvm/test/CodeGen/X86/pr14088.ll llvm/test/CodeGen/X86/srem-lkk.ll llvm/test/CodeGen/X86/srem-vector-lkk.ll llvm/test/CodeGen/X86/urem-vector-lkk.ll llvm/test/CodeGen/X86/vector-idiv-sdiv-128.ll llvm/test/CodeGen/X86/vector-idiv-sdiv-256.ll llvm/test/CodeGen/X86/vector-idiv-sdiv-512.ll llvm/test/CodeGen/X86/vector-intrinsics.ll llvm/test/CodeGen/X86/vector-rem.ll llvm/test/CodeGen/X86/vector-truncate-combine.ll llvm/test/CodeGen/X86/vector-variable-idx.ll llvm/test/CodeGen/X86/vector-variable-idx2.ll llvm/test/CodeGen/X86/vector-width-store-merge.ll -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68557.224471.patch Type: text/x-patch Size: 209099 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Thu Oct 10 14:15:11 2019 From: llvm-commits at lists.llvm.org (Hiroshi Yamauchi via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 21:15:11 +0000 (UTC) Subject: [PATCH] D67120: [PGO] Profile guided code size optimization (continued). In-Reply-To: References: Message-ID: <9a900504cf2cdaaf6171abd33c62c3f7@localhost.localdomain> yamauchi updated this revision to Diff 224474. yamauchi marked an inline comment as done. yamauchi added a comment. Herald added subscribers: pzheng, s.egerton, lenary, jocewei, PkmX, the_o, brucehoult, MartinMosbeck, rogfer01, edward-jones, zzheng, jrtc27, niosHD, sabuasal, apazos, simoncook, johnrusso, rbar, asb. Added tests. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67120/new/ https://reviews.llvm.org/D67120 Files: llvm/include/llvm/Analysis/TargetTransformInfo.h llvm/include/llvm/Analysis/TargetTransformInfoImpl.h llvm/include/llvm/CodeGen/AsmPrinter.h llvm/include/llvm/CodeGen/BasicTTIImpl.h llvm/include/llvm/CodeGen/ExecutionDomainFix.h llvm/include/llvm/CodeGen/LiveRangeEdit.h llvm/include/llvm/CodeGen/MachineSizeOpts.h llvm/include/llvm/CodeGen/SelectionDAG.h llvm/include/llvm/CodeGen/SelectionDAGISel.h llvm/include/llvm/CodeGen/SwitchLoweringUtils.h llvm/include/llvm/CodeGen/TailDuplicator.h llvm/include/llvm/CodeGen/TargetInstrInfo.h llvm/include/llvm/CodeGen/TargetLowering.h llvm/include/llvm/Transforms/Utils/SizeOpts.h llvm/lib/Analysis/InlineCost.cpp llvm/lib/Analysis/TargetTransformInfo.cpp llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp llvm/lib/CodeGen/BranchFolding.cpp llvm/lib/CodeGen/BranchFolding.h llvm/lib/CodeGen/CMakeLists.txt llvm/lib/CodeGen/CodeGenPrepare.cpp llvm/lib/CodeGen/ExecutionDomainFix.cpp llvm/lib/CodeGen/ExpandMemCmp.cpp llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp llvm/lib/CodeGen/IfConversion.cpp llvm/lib/CodeGen/InlineSpiller.cpp 
llvm/lib/CodeGen/LiveRangeEdit.cpp llvm/lib/CodeGen/MachineBlockPlacement.cpp llvm/lib/CodeGen/MachineCSE.cpp llvm/lib/CodeGen/MachineCombiner.cpp llvm/lib/CodeGen/MachineSizeOpts.cpp llvm/lib/CodeGen/PeepholeOptimizer.cpp llvm/lib/CodeGen/RegAllocBasic.cpp llvm/lib/CodeGen/RegAllocGreedy.cpp llvm/lib/CodeGen/RegAllocPBQP.cpp llvm/lib/CodeGen/RegisterCoalescer.cpp llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp llvm/lib/CodeGen/SplitKit.cpp llvm/lib/CodeGen/SplitKit.h llvm/lib/CodeGen/SwitchLoweringUtils.cpp llvm/lib/CodeGen/TailDuplication.cpp llvm/lib/CodeGen/TailDuplicator.cpp llvm/lib/CodeGen/TargetInstrInfo.cpp llvm/lib/CodeGen/TwoAddressInstructionPass.cpp llvm/lib/Target/AArch64/AArch64InstrInfo.cpp llvm/lib/Target/AArch64/AArch64InstrInfo.h llvm/lib/Target/AMDGPU/GCNDPPCombine.cpp llvm/lib/Target/AMDGPU/SIFoldOperands.cpp llvm/lib/Target/AMDGPU/SIInsertSkips.cpp llvm/lib/Target/AMDGPU/SIInstrInfo.cpp llvm/lib/Target/AMDGPU/SIInstrInfo.h llvm/lib/Target/AMDGPU/SIShrinkInstructions.cpp llvm/lib/Target/ARM/ARMBaseInstrInfo.cpp llvm/lib/Target/ARM/ARMBaseInstrInfo.h llvm/lib/Target/ARM/Thumb2SizeReduction.cpp llvm/lib/Target/PowerPC/PPCInstrInfo.cpp llvm/lib/Target/PowerPC/PPCInstrInfo.h llvm/lib/Target/SystemZ/SystemZInstrInfo.cpp llvm/lib/Target/SystemZ/SystemZInstrInfo.h llvm/lib/Target/SystemZ/SystemZPostRewrite.cpp llvm/lib/Target/SystemZ/SystemZShortenInst.cpp llvm/lib/Target/WebAssembly/WebAssemblyInstrInfo.cpp llvm/lib/Target/WebAssembly/WebAssemblyInstrInfo.h llvm/lib/Target/WebAssembly/WebAssemblyRegStackify.cpp llvm/lib/Target/X86/X86FastISel.cpp llvm/lib/Target/X86/X86FixupBWInsts.cpp llvm/lib/Target/X86/X86ISelDAGToDAG.cpp llvm/lib/Target/X86/X86ISelLowering.cpp llvm/lib/Target/X86/X86InstrInfo.cpp llvm/lib/Target/X86/X86InstrInfo.h llvm/lib/Target/X86/X86InstrInfo.td 
llvm/lib/Target/X86/X86OptimizeLEAs.cpp llvm/lib/Target/X86/X86PadShortFunction.cpp llvm/lib/Transforms/Utils/SizeOpts.cpp llvm/test/CodeGen/AArch64/O0-pipeline.ll llvm/test/CodeGen/AArch64/O3-pipeline.ll llvm/test/CodeGen/AArch64/arm64-memset-to-bzero-pgso.ll llvm/test/CodeGen/AArch64/arm64-opt-remarks-lazy-bfi.ll llvm/test/CodeGen/AArch64/max-jump-table.ll llvm/test/CodeGen/ARM/O3-pipeline.ll llvm/test/CodeGen/ARM/constantpool-align.ll llvm/test/CodeGen/RISCV/tail-calls.ll llvm/test/CodeGen/X86/O0-pipeline.ll llvm/test/CodeGen/X86/O3-pipeline.ll llvm/test/CodeGen/X86/atom-pad-short-functions.ll llvm/test/CodeGen/X86/avx-cvt.ll llvm/test/CodeGen/X86/avx512-mask-op.ll llvm/test/CodeGen/X86/bypass-slow-division-tune.ll llvm/test/CodeGen/X86/cmov-into-branch.ll llvm/test/CodeGen/X86/conditional-tailcall-pgso.ll llvm/test/CodeGen/X86/fixup-lea.ll llvm/test/CodeGen/X86/fold-load-unops.ll llvm/test/CodeGen/X86/fshl.ll llvm/test/CodeGen/X86/fshr.ll llvm/test/CodeGen/X86/haddsub.ll llvm/test/CodeGen/X86/immediate_merging.ll llvm/test/CodeGen/X86/immediate_merging64.ll llvm/test/CodeGen/X86/loop-blocks.ll llvm/test/CodeGen/X86/materialize.ll llvm/test/CodeGen/X86/memcmp-pgso.ll llvm/test/CodeGen/X86/memcpy.ll llvm/test/CodeGen/X86/powi.ll llvm/test/CodeGen/X86/rounding-ops.ll llvm/test/CodeGen/X86/shrink-compare-pgso.ll llvm/test/CodeGen/X86/slow-incdec.ll llvm/test/CodeGen/X86/splat-for-size.ll llvm/test/CodeGen/X86/sse2-intrinsics-x86-upgrade.ll llvm/test/CodeGen/X86/sse41.ll llvm/test/CodeGen/X86/store-zero-and-minus-one.ll llvm/test/CodeGen/X86/switch-density.ll llvm/test/CodeGen/X86/tail-opts.ll llvm/test/CodeGen/X86/test-vs-bittest.ll llvm/test/CodeGen/X86/vector-shuffle-256-v4.ll llvm/test/CodeGen/X86/x86-64-bittest-logic.ll llvm/test/CodeGen/X86/x86-64-double-shifts-Oz-Os-O2.ll llvm/test/CodeGen/X86/x86-repmov-copy-eflags.ll llvm/test/Transforms/CodeGenPrepare/X86/sink-addrmode.ll llvm/unittests/CodeGen/AArch64SelectionDAGTest.cpp 
llvm/utils/TableGen/GlobalISelEmitter.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D67120.224474.patch Type: text/x-patch Size: 372063 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Thu Oct 10 14:34:14 2019 From: llvm-commits at lists.llvm.org (Hiroshi Yamauchi via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 21:34:14 +0000 (UTC) Subject: [PATCH] D67120: [PGO] Profile guided code size optimization (continued). In-Reply-To: References: Message-ID: <77eb6135f612b92528255ef4ffcffe95@localhost.localdomain> yamauchi added inline comments. ================ Comment at: llvm/include/llvm/Analysis/TargetTransformInfoImpl.h:118 + unsigned &JTSize, + ProfileSummaryInfo *PSI, + BlockFrequencyInfo *BFI) { ---------------- davidxl wrote: > Mark as unused args? Done (note according to the comment for LLVM_ATTRIBUTE_UNUSED, a cast-to-void is preferred for unused variables.) Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67120/new/ https://reviews.llvm.org/D67120 From llvm-commits at lists.llvm.org Thu Oct 10 14:34:46 2019 From: llvm-commits at lists.llvm.org (Stanislav Mekhanoshin via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 21:34:46 +0000 (UTC) Subject: [PATCH] D68813: [AMDGPU] Handle undef old operand in DPP combine In-Reply-To: References: Message-ID: <19dc083bf2bb108db558858809448700@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rG19a1a739b15d: [AMDGPU] Handle undef old operand in DPP combine (authored by rampitec). 
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68813/new/ https://reviews.llvm.org/D68813 Files: llvm/lib/Target/AMDGPU/GCNDPPCombine.cpp llvm/test/CodeGen/AMDGPU/dpp_combine.mir Index: llvm/test/CodeGen/AMDGPU/dpp_combine.mir =================================================================== --- llvm/test/CodeGen/AMDGPU/dpp_combine.mir +++ llvm/test/CodeGen/AMDGPU/dpp_combine.mir @@ -512,7 +512,7 @@ ... # CHECK-LABEL: name: add_old_subreg_undef -# CHECK: %5:vgpr_32 = V_ADD_U32_dpp %3.sub1, %1, %0.sub1, 1, 15, 15, 1, implicit $exec +# CHECK: %5:vgpr_32 = V_ADD_U32_dpp undef %3.sub1, %1, %0.sub1, 1, 15, 15, 1, implicit $exec name: add_old_subreg_undef tracksRegLiveness: true @@ -551,3 +551,14 @@ %2:vgpr_32 = V_MOV_B32_dpp %1:vgpr_32, undef %0:vgpr_32, 1, 15, 15, 1, implicit $exec %4:vgpr_32 = V_MIN_F32_e32 %2, undef %3:vgpr_32, implicit $exec ... + +# Test an undef old operand +# CHECK-LABEL: name: dpp_undef_old +# CHECK: %3:vgpr_32 = V_CEIL_F32_dpp undef %1:vgpr_32, 0, undef %2:vgpr_32, 1, 15, 15, 1, implicit $exec +name: dpp_undef_old +tracksRegLiveness: true +body: | + bb.0: + %2:vgpr_32 = V_MOV_B32_dpp undef %1:vgpr_32, undef %0:vgpr_32, 1, 15, 15, 1, implicit $exec + %3:vgpr_32 = V_CEIL_F32_e32 %2, implicit $exec +... Index: llvm/lib/Target/AMDGPU/GCNDPPCombine.cpp =================================================================== --- llvm/lib/Target/AMDGPU/GCNDPPCombine.cpp +++ llvm/lib/Target/AMDGPU/GCNDPPCombine.cpp @@ -178,7 +178,9 @@ if (OldIdx != -1) { assert(OldIdx == NumOperands); assert(isOfRegClass(CombOldVGPR, AMDGPU::VGPR_32RegClass, *MRI)); - DPPInst.addReg(CombOldVGPR.Reg, 0, CombOldVGPR.SubReg); + auto *Def = getVRegSubRegDef(CombOldVGPR, *MRI); + DPPInst.addReg(CombOldVGPR.Reg, Def ? 0 : RegState::Undef, + CombOldVGPR.SubReg); ++NumOperands; } else { // TODO: this discards MAC/FMA instructions for now, let's add it later -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68813.224482.patch Type: text/x-patch Size: 1806 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Thu Oct 10 15:21:31 2019 From: llvm-commits at lists.llvm.org (Matt Arsenault via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 22:21:31 +0000 (UTC) Subject: [PATCH] D68828: [AMDGPU] Allow DPP combiner to work with REG_SEQUENCE In-Reply-To: References: Message-ID: arsenm added inline comments. ================ Comment at: llvm/lib/Target/AMDGPU/GCNDPPCombine.cpp:420 SmallVector OrigMIs, DPPMIs; + SmallSetVector RegSeqs; auto CombOldVGPR = getRegSubRegPair(*OldOpnd); ---------------- Why is this a SetVector? ================ Comment at: llvm/lib/Target/AMDGPU/GCNDPPCombine.cpp:459-462 + if (OrigMI.getOperand(I).getReg() == DPPMovReg) { + FwdSubReg = OrigMI.getOperand(I + 1).getImm(); + break; + } ---------------- I think this won't work in the case where the operand itself has a subregister. Can you add a test with something like %0:vreg_64 = REG_SEQUENCE %vreg_64.sub0, sub1, %vreg_64.1, sub0 I think you can use composeSubRegIndices here ================ Comment at: llvm/lib/Target/AMDGPU/GCNDPPCombine.cpp:471 + continue; + } else if (TII->isVOP3(OrigOp)) { if (!TII->hasVALU32BitEncoding(OrigOp)) { ---------------- No else after continue CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68828/new/ https://reviews.llvm.org/D68828 From llvm-commits at lists.llvm.org Thu Oct 10 15:30:40 2019 From: llvm-commits at lists.llvm.org (Stanislav Mekhanoshin via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 22:30:40 +0000 (UTC) Subject: [PATCH] D68828: [AMDGPU] Allow DPP combiner to work with REG_SEQUENCE In-Reply-To: References: Message-ID: <711cec7cc07ead2cc6255bb1c201dcab@localhost.localdomain> rampitec marked an inline comment as done. rampitec added inline comments. 
================ Comment at: llvm/lib/Target/AMDGPU/GCNDPPCombine.cpp:420 SmallVector OrigMIs, DPPMIs; + SmallSetVector RegSeqs; auto CombOldVGPR = getRegSubRegPair(*OldOpnd); ---------------- arsenm wrote: > Why is this a SetVector? I will try to add if several times, for each subreg. Thus set. I guess it can be just SmallSet though. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68828/new/ https://reviews.llvm.org/D68828 From llvm-commits at lists.llvm.org Thu Oct 10 15:30:41 2019 From: llvm-commits at lists.llvm.org (Hiroshi Yamauchi via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 22:30:41 +0000 (UTC) Subject: [PATCH] D67120: [PGO] Profile guided code size optimization (continued). In-Reply-To: References: Message-ID: yamauchi updated this revision to Diff 224491. yamauchi added a comment. Rebased. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67120/new/ https://reviews.llvm.org/D67120 Files: llvm/include/llvm/Analysis/TargetTransformInfo.h llvm/include/llvm/Analysis/TargetTransformInfoImpl.h llvm/include/llvm/CodeGen/AsmPrinter.h llvm/include/llvm/CodeGen/BasicTTIImpl.h llvm/include/llvm/CodeGen/ExecutionDomainFix.h llvm/include/llvm/CodeGen/LiveRangeEdit.h llvm/include/llvm/CodeGen/MachineSizeOpts.h llvm/include/llvm/CodeGen/SelectionDAG.h llvm/include/llvm/CodeGen/SelectionDAGISel.h llvm/include/llvm/CodeGen/SwitchLoweringUtils.h llvm/include/llvm/CodeGen/TailDuplicator.h llvm/include/llvm/CodeGen/TargetInstrInfo.h llvm/include/llvm/CodeGen/TargetLowering.h llvm/include/llvm/Transforms/Utils/SizeOpts.h llvm/lib/Analysis/InlineCost.cpp llvm/lib/Analysis/TargetTransformInfo.cpp llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp llvm/lib/CodeGen/BranchFolding.cpp llvm/lib/CodeGen/BranchFolding.h llvm/lib/CodeGen/CMakeLists.txt llvm/lib/CodeGen/CodeGenPrepare.cpp llvm/lib/CodeGen/ExecutionDomainFix.cpp llvm/lib/CodeGen/ExpandMemCmp.cpp llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp llvm/lib/CodeGen/IfConversion.cpp 
llvm/lib/CodeGen/InlineSpiller.cpp llvm/lib/CodeGen/LiveRangeEdit.cpp llvm/lib/CodeGen/MachineBlockPlacement.cpp llvm/lib/CodeGen/MachineCSE.cpp llvm/lib/CodeGen/MachineCombiner.cpp llvm/lib/CodeGen/MachineSizeOpts.cpp llvm/lib/CodeGen/PeepholeOptimizer.cpp llvm/lib/CodeGen/RegAllocBasic.cpp llvm/lib/CodeGen/RegAllocGreedy.cpp llvm/lib/CodeGen/RegAllocPBQP.cpp llvm/lib/CodeGen/RegisterCoalescer.cpp llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp llvm/lib/CodeGen/SplitKit.cpp llvm/lib/CodeGen/SplitKit.h llvm/lib/CodeGen/SwitchLoweringUtils.cpp llvm/lib/CodeGen/TailDuplication.cpp llvm/lib/CodeGen/TailDuplicator.cpp llvm/lib/CodeGen/TargetInstrInfo.cpp llvm/lib/CodeGen/TwoAddressInstructionPass.cpp llvm/lib/Target/AArch64/AArch64InstrInfo.cpp llvm/lib/Target/AArch64/AArch64InstrInfo.h llvm/lib/Target/AMDGPU/GCNDPPCombine.cpp llvm/lib/Target/AMDGPU/SIFoldOperands.cpp llvm/lib/Target/AMDGPU/SIInsertSkips.cpp llvm/lib/Target/AMDGPU/SIInstrInfo.cpp llvm/lib/Target/AMDGPU/SIInstrInfo.h llvm/lib/Target/AMDGPU/SIShrinkInstructions.cpp llvm/lib/Target/ARM/ARMBaseInstrInfo.cpp llvm/lib/Target/ARM/ARMBaseInstrInfo.h llvm/lib/Target/ARM/Thumb2SizeReduction.cpp llvm/lib/Target/PowerPC/PPCInstrInfo.cpp llvm/lib/Target/PowerPC/PPCInstrInfo.h llvm/lib/Target/SystemZ/SystemZInstrInfo.cpp llvm/lib/Target/SystemZ/SystemZInstrInfo.h llvm/lib/Target/SystemZ/SystemZPostRewrite.cpp llvm/lib/Target/SystemZ/SystemZShortenInst.cpp llvm/lib/Target/WebAssembly/WebAssemblyInstrInfo.cpp llvm/lib/Target/WebAssembly/WebAssemblyInstrInfo.h llvm/lib/Target/WebAssembly/WebAssemblyRegStackify.cpp llvm/lib/Target/X86/X86FastISel.cpp llvm/lib/Target/X86/X86FixupBWInsts.cpp llvm/lib/Target/X86/X86ISelDAGToDAG.cpp llvm/lib/Target/X86/X86ISelLowering.cpp llvm/lib/Target/X86/X86InstrInfo.cpp llvm/lib/Target/X86/X86InstrInfo.h 
llvm/lib/Target/X86/X86InstrInfo.td llvm/lib/Target/X86/X86OptimizeLEAs.cpp llvm/lib/Target/X86/X86PadShortFunction.cpp llvm/lib/Transforms/Utils/SizeOpts.cpp llvm/test/CodeGen/AArch64/O0-pipeline.ll llvm/test/CodeGen/AArch64/O3-pipeline.ll llvm/test/CodeGen/AArch64/arm64-memset-to-bzero-pgso.ll llvm/test/CodeGen/AArch64/arm64-opt-remarks-lazy-bfi.ll llvm/test/CodeGen/AArch64/max-jump-table.ll llvm/test/CodeGen/ARM/O3-pipeline.ll llvm/test/CodeGen/ARM/constantpool-align.ll llvm/test/CodeGen/RISCV/tail-calls.ll llvm/test/CodeGen/X86/O0-pipeline.ll llvm/test/CodeGen/X86/O3-pipeline.ll llvm/test/CodeGen/X86/atom-pad-short-functions.ll llvm/test/CodeGen/X86/avx-cvt.ll llvm/test/CodeGen/X86/avx512-mask-op.ll llvm/test/CodeGen/X86/bypass-slow-division-tune.ll llvm/test/CodeGen/X86/cmov-into-branch.ll llvm/test/CodeGen/X86/conditional-tailcall-pgso.ll llvm/test/CodeGen/X86/fixup-lea.ll llvm/test/CodeGen/X86/fold-load-unops.ll llvm/test/CodeGen/X86/fshl.ll llvm/test/CodeGen/X86/fshr.ll llvm/test/CodeGen/X86/haddsub.ll llvm/test/CodeGen/X86/immediate_merging.ll llvm/test/CodeGen/X86/immediate_merging64.ll llvm/test/CodeGen/X86/loop-blocks.ll llvm/test/CodeGen/X86/materialize.ll llvm/test/CodeGen/X86/memcmp-pgso.ll llvm/test/CodeGen/X86/memcpy.ll llvm/test/CodeGen/X86/powi.ll llvm/test/CodeGen/X86/rounding-ops.ll llvm/test/CodeGen/X86/shrink-compare-pgso.ll llvm/test/CodeGen/X86/slow-incdec.ll llvm/test/CodeGen/X86/splat-for-size.ll llvm/test/CodeGen/X86/sse2-intrinsics-x86-upgrade.ll llvm/test/CodeGen/X86/sse41.ll llvm/test/CodeGen/X86/store-zero-and-minus-one.ll llvm/test/CodeGen/X86/switch-density.ll llvm/test/CodeGen/X86/tail-opts.ll llvm/test/CodeGen/X86/test-vs-bittest.ll llvm/test/CodeGen/X86/vector-shuffle-256-v4.ll llvm/test/CodeGen/X86/x86-64-bittest-logic.ll llvm/test/CodeGen/X86/x86-64-double-shifts-Oz-Os-O2.ll llvm/test/CodeGen/X86/x86-repmov-copy-eflags.ll llvm/test/Transforms/CodeGenPrepare/X86/sink-addrmode.ll 
llvm/unittests/CodeGen/AArch64SelectionDAGTest.cpp llvm/utils/TableGen/GlobalISelEmitter.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D67120.224491.patch Type: text/x-patch Size: 371837 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Thu Oct 10 15:39:54 2019 From: llvm-commits at lists.llvm.org (Hiroshi Yamauchi via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 22:39:54 +0000 (UTC) Subject: [PATCH] D67120: [PGO] Profile guided code size optimization (continued). In-Reply-To: References: Message-ID: <88dfa118cacfb1e6fbff4bd44071de30@localhost.localdomain> yamauchi updated this revision to Diff 224495. yamauchi added a comment. Fix the summary. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67120/new/ https://reviews.llvm.org/D67120 Files: llvm/include/llvm/Analysis/TargetTransformInfo.h llvm/include/llvm/Analysis/TargetTransformInfoImpl.h llvm/include/llvm/CodeGen/AsmPrinter.h llvm/include/llvm/CodeGen/BasicTTIImpl.h llvm/include/llvm/CodeGen/ExecutionDomainFix.h llvm/include/llvm/CodeGen/LiveRangeEdit.h llvm/include/llvm/CodeGen/MachineSizeOpts.h llvm/include/llvm/CodeGen/SelectionDAG.h llvm/include/llvm/CodeGen/SelectionDAGISel.h llvm/include/llvm/CodeGen/SwitchLoweringUtils.h llvm/include/llvm/CodeGen/TailDuplicator.h llvm/include/llvm/CodeGen/TargetInstrInfo.h llvm/include/llvm/CodeGen/TargetLowering.h llvm/include/llvm/Transforms/Utils/SizeOpts.h llvm/lib/Analysis/InlineCost.cpp llvm/lib/Analysis/TargetTransformInfo.cpp llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp llvm/lib/CodeGen/BranchFolding.cpp llvm/lib/CodeGen/BranchFolding.h llvm/lib/CodeGen/CMakeLists.txt llvm/lib/CodeGen/CodeGenPrepare.cpp llvm/lib/CodeGen/ExecutionDomainFix.cpp llvm/lib/CodeGen/ExpandMemCmp.cpp llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp llvm/lib/CodeGen/IfConversion.cpp llvm/lib/CodeGen/InlineSpiller.cpp llvm/lib/CodeGen/LiveRangeEdit.cpp 
llvm/lib/CodeGen/MachineBlockPlacement.cpp llvm/lib/CodeGen/MachineCSE.cpp llvm/lib/CodeGen/MachineCombiner.cpp llvm/lib/CodeGen/MachineSizeOpts.cpp llvm/lib/CodeGen/PeepholeOptimizer.cpp llvm/lib/CodeGen/RegAllocBasic.cpp llvm/lib/CodeGen/RegAllocGreedy.cpp llvm/lib/CodeGen/RegAllocPBQP.cpp llvm/lib/CodeGen/RegisterCoalescer.cpp llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp llvm/lib/CodeGen/SplitKit.cpp llvm/lib/CodeGen/SplitKit.h llvm/lib/CodeGen/SwitchLoweringUtils.cpp llvm/lib/CodeGen/TailDuplication.cpp llvm/lib/CodeGen/TailDuplicator.cpp llvm/lib/CodeGen/TargetInstrInfo.cpp llvm/lib/CodeGen/TwoAddressInstructionPass.cpp llvm/lib/Target/AArch64/AArch64InstrInfo.cpp llvm/lib/Target/AArch64/AArch64InstrInfo.h llvm/lib/Target/AMDGPU/GCNDPPCombine.cpp llvm/lib/Target/AMDGPU/SIFoldOperands.cpp llvm/lib/Target/AMDGPU/SIInsertSkips.cpp llvm/lib/Target/AMDGPU/SIInstrInfo.cpp llvm/lib/Target/AMDGPU/SIInstrInfo.h llvm/lib/Target/AMDGPU/SIShrinkInstructions.cpp llvm/lib/Target/ARM/ARMBaseInstrInfo.cpp llvm/lib/Target/ARM/ARMBaseInstrInfo.h llvm/lib/Target/ARM/Thumb2SizeReduction.cpp llvm/lib/Target/PowerPC/PPCInstrInfo.cpp llvm/lib/Target/PowerPC/PPCInstrInfo.h llvm/lib/Target/SystemZ/SystemZInstrInfo.cpp llvm/lib/Target/SystemZ/SystemZInstrInfo.h llvm/lib/Target/SystemZ/SystemZPostRewrite.cpp llvm/lib/Target/SystemZ/SystemZShortenInst.cpp llvm/lib/Target/WebAssembly/WebAssemblyInstrInfo.cpp llvm/lib/Target/WebAssembly/WebAssemblyInstrInfo.h llvm/lib/Target/WebAssembly/WebAssemblyRegStackify.cpp llvm/lib/Target/X86/X86FastISel.cpp llvm/lib/Target/X86/X86FixupBWInsts.cpp llvm/lib/Target/X86/X86ISelDAGToDAG.cpp llvm/lib/Target/X86/X86ISelLowering.cpp llvm/lib/Target/X86/X86InstrInfo.cpp llvm/lib/Target/X86/X86InstrInfo.h llvm/lib/Target/X86/X86InstrInfo.td llvm/lib/Target/X86/X86OptimizeLEAs.cpp 
llvm/lib/Target/X86/X86PadShortFunction.cpp llvm/lib/Transforms/Utils/SizeOpts.cpp llvm/test/CodeGen/AArch64/O0-pipeline.ll llvm/test/CodeGen/AArch64/O3-pipeline.ll llvm/test/CodeGen/AArch64/arm64-memset-to-bzero-pgso.ll llvm/test/CodeGen/AArch64/arm64-opt-remarks-lazy-bfi.ll llvm/test/CodeGen/AArch64/max-jump-table.ll llvm/test/CodeGen/ARM/O3-pipeline.ll llvm/test/CodeGen/ARM/constantpool-align.ll llvm/test/CodeGen/RISCV/tail-calls.ll llvm/test/CodeGen/X86/O0-pipeline.ll llvm/test/CodeGen/X86/O3-pipeline.ll llvm/test/CodeGen/X86/atom-pad-short-functions.ll llvm/test/CodeGen/X86/avx-cvt.ll llvm/test/CodeGen/X86/avx512-mask-op.ll llvm/test/CodeGen/X86/bypass-slow-division-tune.ll llvm/test/CodeGen/X86/cmov-into-branch.ll llvm/test/CodeGen/X86/conditional-tailcall-pgso.ll llvm/test/CodeGen/X86/fixup-lea.ll llvm/test/CodeGen/X86/fold-load-unops.ll llvm/test/CodeGen/X86/fshl.ll llvm/test/CodeGen/X86/fshr.ll llvm/test/CodeGen/X86/haddsub.ll llvm/test/CodeGen/X86/immediate_merging.ll llvm/test/CodeGen/X86/immediate_merging64.ll llvm/test/CodeGen/X86/loop-blocks.ll llvm/test/CodeGen/X86/materialize.ll llvm/test/CodeGen/X86/memcmp-pgso.ll llvm/test/CodeGen/X86/memcpy.ll llvm/test/CodeGen/X86/powi.ll llvm/test/CodeGen/X86/rounding-ops.ll llvm/test/CodeGen/X86/shrink-compare-pgso.ll llvm/test/CodeGen/X86/slow-incdec.ll llvm/test/CodeGen/X86/splat-for-size.ll llvm/test/CodeGen/X86/sse2-intrinsics-x86-upgrade.ll llvm/test/CodeGen/X86/sse41.ll llvm/test/CodeGen/X86/store-zero-and-minus-one.ll llvm/test/CodeGen/X86/switch-density.ll llvm/test/CodeGen/X86/tail-opts.ll llvm/test/CodeGen/X86/test-vs-bittest.ll llvm/test/CodeGen/X86/vector-shuffle-256-v4.ll llvm/test/CodeGen/X86/x86-64-bittest-logic.ll llvm/test/CodeGen/X86/x86-64-double-shifts-Oz-Os-O2.ll llvm/test/CodeGen/X86/x86-repmov-copy-eflags.ll llvm/test/Transforms/CodeGenPrepare/X86/sink-addrmode.ll llvm/unittests/CodeGen/AArch64SelectionDAGTest.cpp llvm/utils/TableGen/GlobalISelEmitter.cpp -------------- next part 
-------------- A non-text attachment was scrubbed... Name: D67120.224495.patch Type: text/x-patch Size: 371837 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Thu Oct 10 15:58:13 2019 From: llvm-commits at lists.llvm.org (Stanislav Mekhanoshin via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 22:58:13 +0000 (UTC) Subject: [PATCH] D68828: [AMDGPU] Allow DPP combiner to work with REG_SEQUENCE In-Reply-To: References: Message-ID: <7404ce84e0488396704ccbb13dd34186@localhost.localdomain> rampitec marked 4 inline comments as done. rampitec added inline comments. ================ Comment at: llvm/lib/Target/AMDGPU/GCNDPPCombine.cpp:459-462 + if (OrigMI.getOperand(I).getReg() == DPPMovReg) { + FwdSubReg = OrigMI.getOperand(I + 1).getImm(); + break; + } ---------------- arsenm wrote: > I think this won't work in the case where the operand itself has a subregister. > Can you add a test with something like > > %0:vreg_64 = REG_SEQUENCE %vreg_64.sub0, sub1, %vreg_64.1, sub0 > > I think you can use composeSubRegIndices here It cannot directly happen because we are in SSA and def must be a result of mov_dpp, i.e. defining the whole register. This can however happen if yet another reg_sequence is composed out of the first one. I have added checks and test. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68828/new/ https://reviews.llvm.org/D68828 From llvm-commits at lists.llvm.org Thu Oct 10 15:58:15 2019 From: llvm-commits at lists.llvm.org (Stanislav Mekhanoshin via Phabricator via llvm-commits) Date: Thu, 10 Oct 2019 22:58:15 +0000 (UTC) Subject: [PATCH] D68828: [AMDGPU] Allow DPP combiner to work with REG_SEQUENCE In-Reply-To: References: Message-ID: <97c4c660bb43dbee5da9e764109f84f8@localhost.localdomain> rampitec updated this revision to Diff 224498. rampitec marked an inline comment as done. rampitec added a comment. Addressed comments. 
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68828/new/ https://reviews.llvm.org/D68828 Files: llvm/lib/Target/AMDGPU/GCNDPPCombine.cpp llvm/test/CodeGen/AMDGPU/dpp_combine.mir -------------- next part -------------- A non-text attachment was scrubbed... Name: D68828.224498.patch Type: text/x-patch Size: 12566 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Fri Oct 11 02:23:49 2019 From: llvm-commits at lists.llvm.org (Roger Ferrer Ibanez via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 09:23:49 +0000 (UTC) Subject: [PATCH] D68685: [RISCV] Scheduler description for Rocket Core In-Reply-To: References: Message-ID: <029ec376d03c1ee31d80503fe6977340@localhost.localdomain> rogfer01 added subscribers: javedabsar, javed.absar. rogfer01 added a comment. @javedabsar (or @javed.absar) I seem to recall you have experience with schedulers. If you could give us a hand here that'd be great! :) Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68685/new/ https://reviews.llvm.org/D68685 From llvm-commits at lists.llvm.org Fri Oct 11 02:34:35 2019 From: llvm-commits at lists.llvm.org (Zhang Kang via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 09:34:35 +0000 (UTC) Subject: [PATCH] D66576: [Regalloc][WIP] Increase CSR cost in RegAllocGreedy to favour splitting/spill over CSR first use In-Reply-To: References: Message-ID: ZhangKang added a comment. In D66576#1705534, @lebedev.ri wrote: > In D66576#1660216, @steven.zhang wrote: > > > .AMDGPU also override the getCSRFirstUseCost() but your patch didn't catch that. And would you please post some improve number for powerpc of this patch ? > > > Not done; would be good to have some perf numbers here, for ppc and x86 We have tested this patch (with `getCSRFirstUseCost()` returning 1) on PowerPC. For SPEC base, 6 cases improved by more than 1%; the largest improvement is 3.63%, and no case degraded by more than 1%. 
For SPEC peak, 5 cases improved by more than 1%; the largest improvement is 5.9%, and only one case degraded by more than 1% (1.76%). Overall, the base and peak results improved with this patch on PPC. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D66576/new/ https://reviews.llvm.org/D66576 From llvm-commits at lists.llvm.org Fri Oct 11 03:22:30 2019 From: llvm-commits at lists.llvm.org (Javed Absar via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 10:22:30 +0000 (UTC) Subject: [PATCH] D68685: [RISCV] Scheduler description for Rocket Core In-Reply-To: References: Message-ID: <3988a1ae803a9e2ba4fa243b913a983f@localhost.localdomain> javed.absar added a comment. In D68685#1705570, @rogfer01 wrote: > @javedabsar (or @javed.absar) I seem to recall you have experience with schedulers. If you could give us a hand here that'd be great! :) Sure, no problem, Roger :) Could you please point me to some doc which describes the pipeline model of RISCVRocket64, i.e. what kind of processing units are available and how each instruction flows through the pipeline (fully pipelined or partially, latencies, resource dependences)? That would be my starting point to match against the schedules defined in schedule*.td. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68685/new/ https://reviews.llvm.org/D68685 From llvm-commits at lists.llvm.org Fri Oct 11 03:22:30 2019 From: llvm-commits at lists.llvm.org (Dávid Bolvanský via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 10:22:30 +0000 (UTC) Subject: [PATCH] D66576: [Regalloc][WIP] Increase CSR cost in RegAllocGreedy to favour splitting/spill over CSR first use In-Reply-To: References: Message-ID: <1ea763652589b7bc226b9f8212e3f5a2@localhost.localdomain> xbolva00 added a comment. Do you have numbers also for x86? 
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D66576/new/ https://reviews.llvm.org/D66576 From llvm-commits at lists.llvm.org Fri Oct 11 08:24:16 2019 From: llvm-commits at lists.llvm.org (Mirko Brkusanin via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 15:24:16 +0000 (UTC) Subject: [PATCH] D66795: [Mips] Use appropriate private label prefix based on Mips ABI In-Reply-To: References: Message-ID: mbrkusanin updated this revision to Diff 224603. mbrkusanin added a comment. - Rebase - Ping @echristo @craig.topper @tstellar @dylanmckay @petecoup If there are no objections then I'll split this into llvm, clang and lldb patches and commit them next week. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D66795/new/ https://reviews.llvm.org/D66795 Files: clang/lib/Parse/ParseStmtAsm.cpp clang/tools/driver/cc1as_main.cpp lldb/source/Plugins/Disassembler/llvm/DisassemblerLLVMC.cpp lldb/source/Plugins/Instruction/MIPS/EmulateInstructionMIPS.cpp lldb/source/Plugins/Instruction/MIPS64/EmulateInstructionMIPS64.cpp llvm/include/llvm/Support/TargetRegistry.h llvm/lib/CodeGen/LLVMTargetMachine.cpp llvm/lib/MC/MCDisassembler/Disassembler.cpp llvm/lib/Object/ModuleSymbolTable.cpp llvm/lib/Target/AArch64/MCTargetDesc/AArch64MCTargetDesc.cpp llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUMCAsmInfo.cpp llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUMCAsmInfo.h llvm/lib/Target/ARC/MCTargetDesc/ARCMCTargetDesc.cpp llvm/lib/Target/ARM/MCTargetDesc/ARMMCTargetDesc.cpp llvm/lib/Target/AVR/MCTargetDesc/AVRMCAsmInfo.cpp llvm/lib/Target/AVR/MCTargetDesc/AVRMCAsmInfo.h llvm/lib/Target/BPF/MCTargetDesc/BPFMCAsmInfo.h llvm/lib/Target/Hexagon/MCTargetDesc/HexagonMCTargetDesc.cpp llvm/lib/Target/Lanai/MCTargetDesc/LanaiMCAsmInfo.cpp llvm/lib/Target/Lanai/MCTargetDesc/LanaiMCAsmInfo.h llvm/lib/Target/MSP430/MCTargetDesc/MSP430MCAsmInfo.cpp llvm/lib/Target/MSP430/MCTargetDesc/MSP430MCAsmInfo.h llvm/lib/Target/Mips/MCTargetDesc/MipsMCAsmInfo.cpp llvm/lib/Target/Mips/MCTargetDesc/MipsMCAsmInfo.h 
llvm/lib/Target/Mips/MCTargetDesc/MipsMCTargetDesc.cpp llvm/lib/Target/NVPTX/MCTargetDesc/NVPTXMCAsmInfo.cpp llvm/lib/Target/NVPTX/MCTargetDesc/NVPTXMCAsmInfo.h llvm/lib/Target/PowerPC/MCTargetDesc/PPCMCTargetDesc.cpp llvm/lib/Target/RISCV/MCTargetDesc/RISCVMCTargetDesc.cpp llvm/lib/Target/Sparc/MCTargetDesc/SparcMCTargetDesc.cpp llvm/lib/Target/SystemZ/MCTargetDesc/SystemZMCTargetDesc.cpp llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyMCAsmInfo.cpp llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyMCAsmInfo.h llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyMCTargetDesc.cpp llvm/lib/Target/X86/MCTargetDesc/X86MCTargetDesc.cpp llvm/lib/Target/XCore/MCTargetDesc/XCoreMCTargetDesc.cpp llvm/test/CodeGen/Mips/compactbranches/no-beqzc-bnezc.ll llvm/test/MC/Mips/macro-li.d.s llvm/test/MC/Mips/macro-li.s.s llvm/test/MC/Mips/private-prefix.s llvm/tools/dsymutil/DwarfStreamer.cpp llvm/tools/llvm-cfi-verify/lib/FileAnalysis.cpp llvm/tools/llvm-dwp/llvm-dwp.cpp llvm/tools/llvm-exegesis/lib/Analysis.cpp llvm/tools/llvm-jitlink/llvm-jitlink.cpp llvm/tools/llvm-mc-assemble-fuzzer/llvm-mc-assemble-fuzzer.cpp llvm/tools/llvm-mc/Disassembler.cpp llvm/tools/llvm-mc/Disassembler.h llvm/tools/llvm-mc/llvm-mc.cpp llvm/tools/llvm-mca/llvm-mca.cpp llvm/tools/llvm-objdump/MachODump.cpp llvm/tools/llvm-objdump/llvm-objdump.cpp llvm/tools/llvm-rtdyld/llvm-rtdyld.cpp llvm/tools/sancov/sancov.cpp llvm/unittests/DebugInfo/DWARF/DwarfGenerator.cpp llvm/unittests/ExecutionEngine/JITLink/JITLinkTestCommon.cpp llvm/unittests/MC/DwarfLineTables.cpp llvm/unittests/MC/MCInstPrinter.cpp -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D66795.224603.patch Type: text/x-patch Size: 52616 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Fri Oct 11 09:01:06 2019 From: llvm-commits at lists.llvm.org (Zhang Kang via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 16:01:06 +0000 (UTC) Subject: [PATCH] D66576: [Regalloc][WIP] Increase CSR cost in RegAllocGreedy to favour splitting/spill over CSR first use In-Reply-To: References: Message-ID: <842a499a3ec4b7da9d94fd7729cbca69@localhost.localdomain> ZhangKang added a comment. In D66576#1705646, @xbolva00 wrote: > Do you have numbers also for x86? No, I don't. I'm sorry, but I have no x86 test machine. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D66576/new/ https://reviews.llvm.org/D66576 From llvm-commits at lists.llvm.org Fri Oct 11 10:24:02 2019 From: llvm-commits at lists.llvm.org (David Li via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 17:24:02 +0000 (UTC) Subject: [PATCH] D67120: [PGO] Profile guided code size optimization (continued). In-Reply-To: References: Message-ID: <3e458381634619537c13b6df89689f6c@localhost.localdomain> davidxl added a comment. The SizeOpts and MachineSizeOpts changes can also be extracted into their own patch. After this is done, the TTI change has one fewer dependency and we might figure out a way to isolate that part too. ================ Comment at: llvm/lib/CodeGen/MachineSizeOpts.cpp:28 +/// Like ProfileSummaryInfo::isColdBlock but for MachineBasicBlock. +static bool isColdBlock(const MachineBasicBlock *MBB, + ProfileSummaryInfo *PSI, ---------------- I've looked at the SizeOpts.h, SizeOpts.cpp, and MachineSizeOpts.cpp files. I think we should refactor the code using templates to avoid duplicated logic. This should be done similarly to BlockFrequencyInfoImpl: template bool shouldOptimizeForSize(FuncT *F, ....) { } isColdBlock should probably be refactored in a similar way. 
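A minimal, self-contained sketch of the template refactoring suggested above, not the code from D67120. Every type here (ProfileSummary, IRBlock, IRFunction, MachineBlock, MachineFunc) is a hypothetical stand-in for the real LLVM classes, and the "every block is cold" policy is an assumption chosen only to keep the example small. It shows the IR-level and machine-level queries sharing one templated implementation, with thin overloads as the public entry points.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for ProfileSummaryInfo: a flag saying whether a
// profile is available, plus a single cold-count threshold.
struct ProfileSummary {
  bool HasProfile;
  uint64_t ColdCountThreshold;
};

// Hypothetical stand-ins for the IR-level and machine-level CFG types.
struct IRBlock { uint64_t ProfileCount; };
struct IRFunction { std::vector<IRBlock> Blocks; };
struct MachineBlock { uint64_t ProfileCount; };
struct MachineFunc { std::vector<MachineBlock> Blocks; };

// Shared logic, written once for any block type exposing ProfileCount.
template <typename BlockT>
bool isColdBlockImpl(const BlockT &BB, const ProfileSummary &PSI) {
  return PSI.HasProfile && BB.ProfileCount <= PSI.ColdCountThreshold;
}

// Shared logic for any function type exposing a Blocks container.
// Assumed policy for this sketch: optimize for size only when a profile
// exists and every block of the function is cold.
template <typename FuncT>
bool shouldOptimizeForSizeImpl(const FuncT &F, const ProfileSummary &PSI) {
  if (!PSI.HasProfile || F.Blocks.empty())
    return false;
  for (const auto &BB : F.Blocks)
    if (!isColdBlockImpl(BB, PSI))
      return false;
  return true;
}

// Thin non-template entry points; both forward to the same implementation.
bool shouldOptimizeForSize(const IRFunction &F, const ProfileSummary &PSI) {
  return shouldOptimizeForSizeImpl(F, PSI);
}
bool shouldOptimizeForSize(const MachineFunc &MF, const ProfileSummary &PSI) {
  return shouldOptimizeForSizeImpl(MF, PSI);
}
```

Keeping the overloads non-template means callers do not need to see the Impl templates, which is roughly how BlockFrequencyInfo wraps BlockFrequencyInfoImpl.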
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67120/new/ https://reviews.llvm.org/D67120 From llvm-commits at lists.llvm.org Fri Oct 11 10:51:56 2019 From: llvm-commits at lists.llvm.org (David Li via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 17:51:56 +0000 (UTC) Subject: [PATCH] D67120: [PGO] Profile guided code size optimization (continued). In-Reply-To: References: Message-ID: <44e736957fd92c9bc2e84dd6b33c6721@localhost.localdomain> davidxl added a comment. The code can be broken down and contributed in the following order: 1. SizeOpts related change 2. TargetLowering change (isSuitableForJumpTable) 3. TargetTransformation related changes -- depending on 1) and 2) 4. SwitchLoweringUtils (findJumpTable depends on isSuitableForJumpTable in 2) 5. the rest of the changes (can be further broken down per-pass). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67120/new/ https://reviews.llvm.org/D67120 From llvm-commits at lists.llvm.org Fri Oct 11 11:01:18 2019 From: llvm-commits at lists.llvm.org (Matt Arsenault via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 18:01:18 +0000 (UTC) Subject: [PATCH] D68873: [AMDGPU] Amend target loop unroll defaults In-Reply-To: References: Message-ID: <4f0024759fd40dc72c3d05027c2e7b4e@localhost.localdomain> arsenm added a comment. Could use a test ================ Comment at: llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp:93 TTI::UnrollingPreferences &UP) { UP.Threshold = 300; // Twice the default. 
UP.MaxCount = std::numeric_limits<unsigned>::max(); ---------------- This would now be dead ================ Comment at: llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp:100 + // Set more aggressive defaults for PAL shaders + if (TargetTriple.getOS() == Triple::AMDPAL) { + UP.MaxPercentThresholdBoost = 1000; ---------------- These should probably be the same for all OSes ================ Comment at: llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.h:52 + AMDGPUSubtarget::Generation Gen; + ---------------- You don't need to add this field. You already have the subtarget available here, you just need to change the type Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68873/new/ https://reviews.llvm.org/D68873 From llvm-commits at lists.llvm.org Fri Oct 11 11:10:25 2019 From: llvm-commits at lists.llvm.org (Stanislav Mekhanoshin via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 18:10:25 +0000 (UTC) Subject: [PATCH] D68873: [AMDGPU] Amend target loop unroll defaults In-Reply-To: References: Message-ID: rampitec requested changes to this revision. rampitec added a comment. This revision now requires changes to proceed. How big was the performance testing? ================ Comment at: llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp:62 static cl::opt<unsigned> UnrollThresholdLocal( "amdgpu-unroll-threshold-local", ---------------- This change penalizes loops which should have unroll boosted instead. Your new default thresholds are now higher than boosted. 
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68873/new/ https://reviews.llvm.org/D68873 From llvm-commits at lists.llvm.org Fri Oct 11 14:04:33 2019 From: llvm-commits at lists.llvm.org (Stanislav Mekhanoshin via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 21:04:33 +0000 (UTC) Subject: [PATCH] D68888: [AMDGPU] link dpp pseudos and real instructions on gfx10 Message-ID: rampitec created this revision. 
rampitec added reviewers: arsenm, mjbedy, kzhuravl. Herald added subscribers: jfb, hiraditya, t-tye, tpr, dstuttard, yaxunl, nhaehnle, wdng, jvesely. Herald added a project: LLVM. This defaults to zero fi operand, but we do not expose it anyway. Should we expose it later it needs to be added to the pseudo. This enables dpp combining on gfx10. https://reviews.llvm.org/D68888 Files: llvm/lib/Target/AMDGPU/AMDGPUMCInstLower.cpp llvm/lib/Target/AMDGPU/VOP1Instructions.td llvm/lib/Target/AMDGPU/VOP2Instructions.td llvm/test/CodeGen/AMDGPU/atomic_optimizations_local_pointer.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D68888.224674.patch Type: text/x-patch Size: 251802 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Fri Oct 11 14:31:59 2019 From: llvm-commits at lists.llvm.org (Amy Huang via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 21:31:59 +0000 (UTC) Subject: [PATCH] D67723: [CodeView] Add option to disable inline line tables. In-Reply-To: References: Message-ID: <6025d8ff47b4ca4f55098d6ed8add36b@localhost.localdomain> akhuang updated this revision to Diff 224681. akhuang marked 2 inline comments as done. akhuang added a comment. Herald added a subscriber: ormris. -Remove intrinsics debug info -Add inliner test -Add to function attribute description Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67723/new/ https://reviews.llvm.org/D67723 Files: clang/include/clang/Basic/CodeGenOptions.def clang/include/clang/Driver/Options.td clang/lib/CodeGen/CodeGenFunction.cpp clang/lib/Driver/ToolChains/Clang.cpp clang/lib/Frontend/CompilerInvocation.cpp clang/test/CodeGen/debug-info-no-inline-line-tables.c llvm/docs/LangRef.rst llvm/include/llvm/IR/Attributes.td llvm/lib/Transforms/Utils/InlineFunction.cpp llvm/test/Transforms/Inline/no-inline-line-tables.ll -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D67723.224681.patch Type: text/x-patch Size: 10855 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Fri Oct 11 14:50:11 2019 From: llvm-commits at lists.llvm.org (Reid Kleckner via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 21:50:11 +0000 (UTC) Subject: [PATCH] D67723: [CodeView] Add option to disable inline line tables. In-Reply-To: References: Message-ID: rnk added inline comments. ================ Comment at: llvm/lib/Transforms/Utils/InlineFunction.cpp:1427 + // Remove debug info intrinsics. + if (auto *DbgInst = dyn_cast(BI)) { + BI = --(DbgInst->eraseFromParent()); ---------------- Each of these inherit from DbgVariableIntrinsic, so you should be able to dyn_cast to that, and handle them all with one if. ================ Comment at: llvm/test/Transforms/Inline/no-inline-line-tables.ll:31 +; CHECK-NOT: @f +; CHECK-NOT: @llvm.dbg.declare +; CHECK: %{{[0-9]+}} = load i32, i32* %i.addr.i, align 4, !dbg ![[VAR:[0-9]+]] ---------------- Test looks good Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67723/new/ https://reviews.llvm.org/D67723 From llvm-commits at lists.llvm.org Fri Oct 11 14:50:11 2019 From: llvm-commits at lists.llvm.org (Konstantin Zhuravlyov via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 21:50:11 +0000 (UTC) Subject: [PATCH] D68888: [AMDGPU] link dpp pseudos and real instructions on gfx10 In-Reply-To: References: Message-ID: <9197bfdd25876ee6ec5bbd6140393030@localhost.localdomain> kzhuravl accepted this revision. kzhuravl added a comment. This revision is now accepted and ready to land. 
LGTM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68888/new/ https://reviews.llvm.org/D68888 From llvm-commits at lists.llvm.org Thu Oct 10 17:31:23 2019 From: llvm-commits at lists.llvm.org (Stanislav Mekhanoshin via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 00:31:23 +0000 (UTC) Subject: [PATCH] D68673: [AMDGPU] Support mov dpp with 64 bit operands In-Reply-To: References: Message-ID: <097f82cf5aee923a0594998dede5bb3b@localhost.localdomain> rampitec updated this revision to Diff 224506. rampitec marked an inline comment as done. rampitec added a comment. Herald added subscribers: MaskRay, kbarton, nemanjai. GCNDPPCombiner can split the new pseudo and then handle the split. Post-RA split is needed anyway since combining is an optimization. Tests are updated to handle case w/o optimization. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68673/new/ https://reviews.llvm.org/D68673 Files: llvm/lib/Target/AMDGPU/AMDGPUSubtarget.h llvm/lib/Target/AMDGPU/GCNDPPCombine.cpp llvm/lib/Target/AMDGPU/SIInstrInfo.cpp llvm/lib/Target/AMDGPU/SIInstrInfo.h llvm/lib/Target/AMDGPU/SIInstructions.td llvm/test/CodeGen/AMDGPU/dpp_combine.mir llvm/test/CodeGen/AMDGPU/llvm.amdgcn.mov.dpp.ll llvm/test/CodeGen/AMDGPU/llvm.amdgcn.update.dpp.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D68673.224506.patch Type: text/x-patch Size: 16876 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Thu Oct 10 22:42:55 2019 From: llvm-commits at lists.llvm.org (Zixuan Wu (Zeson) via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 05:42:55 +0000 (UTC) Subject: [PATCH] D67148: [LoopVectorize][PowerPC] Estimate int and float register pressure separately in loop-vectorize In-Reply-To: References: Message-ID: wuzish updated this revision to Diff 224542. wuzish added a comment. 
update test case CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67148/new/ https://reviews.llvm.org/D67148 Files: llvm/include/llvm/Analysis/TargetTransformInfo.h llvm/include/llvm/Analysis/TargetTransformInfoImpl.h llvm/include/llvm/CodeGen/BasicTTIImpl.h llvm/lib/Analysis/TargetTransformInfo.cpp llvm/lib/Target/AArch64/AArch64TargetTransformInfo.h llvm/lib/Target/ARM/ARMTargetTransformInfo.h llvm/lib/Target/PowerPC/PPCTargetTransformInfo.cpp llvm/lib/Target/PowerPC/PPCTargetTransformInfo.h llvm/lib/Target/SystemZ/SystemZTargetTransformInfo.cpp llvm/lib/Target/SystemZ/SystemZTargetTransformInfo.h llvm/lib/Target/WebAssembly/WebAssemblyTargetTransformInfo.cpp llvm/lib/Target/WebAssembly/WebAssemblyTargetTransformInfo.h llvm/lib/Target/X86/X86TargetTransformInfo.cpp llvm/lib/Target/X86/X86TargetTransformInfo.h llvm/lib/Target/XCore/XCoreTargetTransformInfo.h llvm/lib/Transforms/Scalar/LoopStrengthReduce.cpp llvm/lib/Transforms/Vectorize/LoopVectorize.cpp llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp llvm/test/Transforms/LoopVectorize/PowerPC/reg-usage.ll llvm/test/Transforms/LoopVectorize/X86/reg-usage-debug.ll llvm/test/Transforms/LoopVectorize/X86/reg-usage.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D67148.224542.patch Type: text/x-patch Size: 32549 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Fri Oct 11 01:18:20 2019 From: llvm-commits at lists.llvm.org (Roman Lebedev via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 08:18:20 +0000 (UTC) Subject: [PATCH] D66576: [Regalloc][WIP] Increase CSR cost in RegAllocGreedy to favour splitting/spill over CSR first use In-Reply-To: References: Message-ID: <832efe838ed0e1ba5c79e63f7779b17e@localhost.localdomain> lebedev.ri added a comment. In D66576#1660216 , @steven.zhang wrote: > .AMDGPU also override the getCSRFirstUseCost() but your patch didn't catch that. And would you please post some improve number for powerpc of this patch ? 
Not done; would be good to have some perf numbers here, for ppc and x86 CHANGES SINCE LAST ACTION https://reviews.llvm.org/D66576/new/ https://reviews.llvm.org/D66576 From llvm-commits at lists.llvm.org Fri Oct 11 14:50:14 2019 From: llvm-commits at lists.llvm.org (Stanislav Mekhanoshin via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 21:50:14 +0000 (UTC) Subject: [PATCH] D68888: [AMDGPU] link dpp pseudos and real instructions on gfx10 In-Reply-To: References: Message-ID: rampitec updated this revision to Diff 224683. rampitec added a comment. Added test. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68888/new/ https://reviews.llvm.org/D68888 Files: llvm/lib/Target/AMDGPU/AMDGPUMCInstLower.cpp llvm/lib/Target/AMDGPU/VOP1Instructions.td llvm/lib/Target/AMDGPU/VOP2Instructions.td llvm/test/CodeGen/AMDGPU/atomic_optimizations_local_pointer.ll llvm/test/CodeGen/AMDGPU/dpp_combine.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D68888.224683.patch Type: text/x-patch Size: 254409 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Fri Oct 11 14:59:26 2019 From: llvm-commits at lists.llvm.org (Amy Huang via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 21:59:26 +0000 (UTC) Subject: [PATCH] D67723: [CodeView] Add option to disable inline line tables. In-Reply-To: References: Message-ID: <831a42f1565b29500c0da8e09e3f4da0@localhost.localdomain> akhuang updated this revision to Diff 224687. akhuang added a comment. - Remove extra ifs. 
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67723/new/ https://reviews.llvm.org/D67723 Files: clang/include/clang/Basic/CodeGenOptions.def clang/include/clang/Driver/Options.td clang/lib/CodeGen/CodeGenFunction.cpp clang/lib/Driver/ToolChains/Clang.cpp clang/lib/Frontend/CompilerInvocation.cpp clang/test/CodeGen/debug-info-no-inline-line-tables.c llvm/docs/LangRef.rst llvm/include/llvm/IR/Attributes.td llvm/lib/Transforms/Utils/InlineFunction.cpp llvm/test/Transforms/Inline/no-inline-line-tables.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D67723.224687.patch Type: text/x-patch Size: 10579 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Fri Oct 11 15:10:48 2019 From: llvm-commits at lists.llvm.org (Stanislav Mekhanoshin via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 22:10:48 +0000 (UTC) Subject: [PATCH] D68888: [AMDGPU] link dpp pseudos and real instructions on gfx10 In-Reply-To: References: Message-ID: This revision was automatically updated to reflect the committed changes. Closed by commit rGe2d104f64ca8: [AMDGPU] link dpp pseudos and real instructions on gfx10 (authored by rampitec). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68888/new/ https://reviews.llvm.org/D68888 Files: llvm/lib/Target/AMDGPU/AMDGPUMCInstLower.cpp llvm/lib/Target/AMDGPU/VOP1Instructions.td llvm/lib/Target/AMDGPU/VOP2Instructions.td llvm/test/CodeGen/AMDGPU/atomic_optimizations_local_pointer.ll llvm/test/CodeGen/AMDGPU/dpp_combine.ll -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68888.224694.patch Type: text/x-patch Size: 254409 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Fri Oct 11 15:20:03 2019 From: llvm-commits at lists.llvm.org (Reid Kleckner via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 22:20:03 +0000 (UTC) Subject: [PATCH] D67723: [CodeView] Add option to disable inline line tables. In-Reply-To: References: Message-ID: <51289a5b69f7fd74968b2b3be82e0884@localhost.localdomain> rnk added a comment. I guess the commit message shouldn't say "[CodeView] Add option to disable inline line tables." It's really an option for all debug info. You could put "[DebugInfo]" on there, or just drop the tag. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67723/new/ https://reviews.llvm.org/D67723 From llvm-commits at lists.llvm.org Fri Oct 11 15:29:08 2019 From: llvm-commits at lists.llvm.org (Adrian Prantl via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 22:29:08 +0000 (UTC) Subject: [PATCH] D67723: [DebugInfo] Add option to disable inline line tables. In-Reply-To: References: Message-ID: aprantl accepted this revision. aprantl added a comment. This revision is now accepted and ready to land. I would still prefer no-inline-info or no-inline-debuginfo over no-inline-linetables and a line 0 location for the inlined instructions. Other than that the patch is now safe. ================ Comment at: llvm/lib/Transforms/Utils/InlineFunction.cpp:1431 + } + BI->setDebugLoc(TheCallDL); + continue; ---------------- I still think an artificial (line 0) location would be less misleading for debuggers, profilers, and optimization remarks. 
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67723/new/ https://reviews.llvm.org/D67723 From llvm-commits at lists.llvm.org Fri Oct 11 15:29:10 2019 From: llvm-commits at lists.llvm.org (Amy Huang via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 22:29:10 +0000 (UTC) Subject: [PATCH] D67723: [DebugInfo] Add option to disable inline line tables. In-Reply-To: References: Message-ID: akhuang updated this revision to Diff 224697. akhuang marked an inline comment as done. akhuang added a comment. Fix code so that -gno-inline-line-tables works when not codeview Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67723/new/ https://reviews.llvm.org/D67723 Files: clang/include/clang/Basic/CodeGenOptions.def clang/include/clang/Driver/Options.td clang/lib/CodeGen/CodeGenFunction.cpp clang/lib/Driver/ToolChains/Clang.cpp clang/lib/Frontend/CompilerInvocation.cpp clang/test/CodeGen/debug-info-no-inline-line-tables.c llvm/docs/LangRef.rst llvm/include/llvm/IR/Attributes.td llvm/lib/Transforms/Utils/InlineFunction.cpp llvm/test/Transforms/Inline/no-inline-line-tables.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D67723.224697.patch Type: text/x-patch Size: 10508 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Fri Oct 11 15:38:26 2019 From: llvm-commits at lists.llvm.org (Stanislav Mekhanoshin via Phabricator via llvm-commits) Date: Fri, 11 Oct 2019 22:38:26 +0000 (UTC) Subject: [PATCH] D68828: [AMDGPU] Allow DPP combiner to work with REG_SEQUENCE In-Reply-To: References: Message-ID: <20d4a7674599c2978a1681cc8c2492fe@localhost.localdomain> rampitec updated this revision to Diff 224700. rampitec added a comment. Rebased. 
CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68828/new/

https://reviews.llvm.org/D68828

Files:
  llvm/lib/Target/AMDGPU/GCNDPPCombine.cpp
  llvm/test/CodeGen/AMDGPU/dpp_combine.mir

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68828.224700.patch
Type: text/x-patch
Size: 12454 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Fri Oct 11 15:56:28 2019
From: llvm-commits at lists.llvm.org (Stanislav Mekhanoshin via Phabricator via llvm-commits)
Date: Fri, 11 Oct 2019 22:56:28 +0000 (UTC)
Subject: [PATCH] D68673: [AMDGPU] Support mov dpp with 64 bit operands
In-Reply-To:
References:
Message-ID: <75e1547f9a49f719957351b1ca579023@localhost.localdomain>

rampitec updated this revision to Diff 224701.
rampitec added a comment.

Rebased.
Removed special handling of gfx10, it uses the same pseudo now.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68673/new/

https://reviews.llvm.org/D68673

Files:
  llvm/lib/Target/AMDGPU/GCNDPPCombine.cpp
  llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
  llvm/lib/Target/AMDGPU/SIInstrInfo.h
  llvm/lib/Target/AMDGPU/SIInstructions.td
  llvm/test/CodeGen/AMDGPU/dpp_combine.mir
  llvm/test/CodeGen/AMDGPU/llvm.amdgcn.mov.dpp.ll
  llvm/test/CodeGen/AMDGPU/llvm.amdgcn.update.dpp.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68673.224701.patch
Type: text/x-patch
Size: 16217 bytes
Desc: not available
URL:

From llvm-commits at lists.llvm.org Fri Oct 11 16:05:47 2019
From: llvm-commits at lists.llvm.org (Matt Arsenault via Phabricator via llvm-commits)
Date: Fri, 11 Oct 2019 23:05:47 +0000 (UTC)
Subject: [PATCH] D68828: [AMDGPU] Allow DPP combiner to work with REG_SEQUENCE
In-Reply-To:
References:
Message-ID: <0c86b6724973eb4c2ab252edbbc3cc0f@localhost.localdomain>

arsenm added inline comments.

================
Comment at: llvm/lib/Target/AMDGPU/GCNDPPCombine.cpp:539-543
+    for (unsigned I = 1, E = S->getNumOperands(); I < E; I += 2) {
+      MachineOperand &Op = S->getOperand(I);
+      if (!MRI->getVRegDef(Op.getReg()))
+        Op.setIsUndef(true);
+    }
----------------
This seems like a questionable way to preserve undefs. Can you avoid doing this by checking getVRegDef?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68828/new/

https://reviews.llvm.org/D68828



From llvm-commits at lists.llvm.org Fri Oct 11 16:14:52 2019
From: llvm-commits at lists.llvm.org (Stanislav Mekhanoshin via Phabricator via llvm-commits)
Date: Fri, 11 Oct 2019 23:14:52 +0000 (UTC)
Subject: [PATCH] D68828: [AMDGPU] Allow DPP combiner to work with REG_SEQUENCE
In-Reply-To:
References:
Message-ID: <9adf71b01875227122985905235b6a70@localhost.localdomain>

rampitec marked an inline comment as done.
rampitec added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/GCNDPPCombine.cpp:539-543
+    for (unsigned I = 1, E = S->getNumOperands(); I < E; I += 2) {
+      MachineOperand &Op = S->getOperand(I);
+      if (!MRI->getVRegDef(Op.getReg()))
+        Op.setIsUndef(true);
+    }
----------------
arsenm wrote:
> This seems like a questionable way to preserve undefs. Can you avoid doing this by checking getVRegDef?
What's wrong with it? A register either has def or not. The other way would be to keep all reg_sequences along with the info about all subregs, if they were combined or not.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68828/new/

https://reviews.llvm.org/D68828



From llvm-commits at lists.llvm.org Fri Oct 11 16:51:36 2019
From: llvm-commits at lists.llvm.org (Amy Huang via Phabricator via llvm-commits)
Date: Fri, 11 Oct 2019 23:51:36 +0000 (UTC)
Subject: [PATCH] D67723: [DebugInfo] Add option to disable inline line tables.
In-Reply-To:
References:
Message-ID: <1480606a721c86bb8cca0a2883508867@localhost.localdomain>

akhuang updated this revision to Diff 224711.
akhuang added a comment.

- Set location to line 0 with getMergedLocation Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67723/new/ https://reviews.llvm.org/D67723 Files: clang/include/clang/Basic/CodeGenOptions.def clang/include/clang/Driver/Options.td clang/lib/CodeGen/CodeGenFunction.cpp clang/lib/Driver/ToolChains/Clang.cpp clang/lib/Frontend/CompilerInvocation.cpp clang/test/CodeGen/debug-info-no-inline-line-tables.c llvm/docs/LangRef.rst llvm/include/llvm/IR/Attributes.td llvm/lib/Transforms/Utils/InlineFunction.cpp llvm/test/Transforms/Inline/no-inline-line-tables.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D67723.224711.patch Type: text/x-patch Size: 10572 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Sat Oct 12 23:35:02 2019 From: llvm-commits at lists.llvm.org (Shiva Chen via Phabricator via llvm-commits) Date: Sun, 13 Oct 2019 06:35:02 +0000 (UTC) Subject: [PATCH] D62190: [RISCV] Allow shrink wrapping for RISC-V In-Reply-To: References: Message-ID: <50b2691b9af17fcb3ffad85ec65c315e@localhost.localdomain> shiva0217 added inline comments. ================ Comment at: llvm/lib/Target/RISCV/RISCVFrameLowering.cpp:230 + MachineBasicBlock::iterator MBBI = MBB.getFirstTerminator(); + if (MBBI == MBB.end()) + MBBI = MBB.getLastNonDebugInstr(); ---------------- apazos wrote: > I have been verifying the pending patches at Oz, Os, and also O2. This is helping with uncovering issues. > You can run it with 'llc test.ll -enable-shrink-wrap > > With the latest standalone patch, it seems we have a few crashes left. 
Below is a bugpoint reduced test: > > define dso_local void @test() local_unnamed_addr { > entry: > br i1 undef, label %T.exit, label %for.body.i > > for.body.i: ; preds = %for.body.i, %entry > store i32 0, i32* undef > %incdec.ptr.i.i = getelementptr inbounds i32, i32* null, i32 1 > %cmp.i.i = icmp eq i32* undef, undef > br i1 %cmp.i.i, label %T.exit.loopexit, label %for.body.i > > T.exit.loopexit: ; preds = %for.body.i > %0 = ptrtoint i32* %incdec.ptr.i.i to i32 > br label %T.exit > > T.exit: ; preds = %T.exit.loopexit, %entry > ret void > } > > It seems that shrink wrapping may choose an empty basic block to insert epilogue. We might need to add empty block detection for DL initialization and MBBI advance. Something like: DebugLoc DL = !MBB.empty() ? MBBI->getDebugLoc() : DebugLoc(); if (!MBB.empty() && !MBBI->isTerminator()) MBBI = std::next(MBBI); Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D62190/new/ https://reviews.llvm.org/D62190 From llvm-commits at lists.llvm.org Sun Oct 13 08:48:10 2019 From: llvm-commits at lists.llvm.org (Lewis Revill via Phabricator via llvm-commits) Date: Sun, 13 Oct 2019 15:48:10 +0000 (UTC) Subject: [PATCH] D62190: [RISCV] Allow shrink wrapping for RISC-V In-Reply-To: References: Message-ID: <01f59d21020f58455eacc2a1770f82ee@localhost.localdomain> lewis-revill added inline comments. ================ Comment at: llvm/lib/Target/RISCV/RISCVFrameLowering.cpp:230 + MachineBasicBlock::iterator MBBI = MBB.getFirstTerminator(); + if (MBBI == MBB.end()) + MBBI = MBB.getLastNonDebugInstr(); ---------------- shiva0217 wrote: > apazos wrote: > > I have been verifying the pending patches at Oz, Os, and also O2. This is helping with uncovering issues. > > You can run it with 'llc test.ll -enable-shrink-wrap > > > > With the latest standalone patch, it seems we have a few crashes left. 
Below is a bugpoint reduced test: > > > > define dso_local void @test() local_unnamed_addr { > > entry: > > br i1 undef, label %T.exit, label %for.body.i > > > > for.body.i: ; preds = %for.body.i, %entry > > store i32 0, i32* undef > > %incdec.ptr.i.i = getelementptr inbounds i32, i32* null, i32 1 > > %cmp.i.i = icmp eq i32* undef, undef > > br i1 %cmp.i.i, label %T.exit.loopexit, label %for.body.i > > > > T.exit.loopexit: ; preds = %for.body.i > > %0 = ptrtoint i32* %incdec.ptr.i.i to i32 > > br label %T.exit > > > > T.exit: ; preds = %T.exit.loopexit, %entry > > ret void > > } > > > > > It seems that shrink wrapping may choose an empty basic block to insert epilogue. We might need to add empty block detection for DL initialization and MBBI advance. Something like: > DebugLoc DL = !MBB.empty() ? MBBI->getDebugLoc() : DebugLoc(); > if (!MBB.empty() && !MBBI->isTerminator()) > MBBI = std::next(MBBI); Thanks Ana, it looks like this is a case of the shrink wrapping pass choosing a basic block which is empty as the place to insert the prologue. I didn't realise it was possible, but I'll push an updated patch that takes this into account. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D62190/new/ https://reviews.llvm.org/D62190 From llvm-commits at lists.llvm.org Sun Oct 13 09:06:38 2019 From: llvm-commits at lists.llvm.org (=?utf-8?q?Lu=C3=ADs_Marques_via_Phabricator?= via llvm-commits) Date: Sun, 13 Oct 2019 16:06:38 +0000 (UTC) Subject: [PATCH] D67185: [RISCV] Add support for -ffixed-xX flags In-Reply-To: References: Message-ID: luismarques accepted this revision. luismarques added a comment. This revision is now accepted and ready to land. Overall LGTM. Caveats: - Address the issues in the inline comments; - Shouldn't the TLS lowering also complain when `-ffixed-x4` is used? - Is there a way to ensure we don't forget to check any such reserved reg uses? I'm not quite confident we haven't overlooked anything. 
- (Remember to check for the `-ffixed-xX` flags when implementing the callee-saved regs via libcalls (D62686 ), etc.) Apologies for the delayed review. ================ Comment at: clang/include/clang/Driver/Options.td:2224 HelpText<"Don't workaround Cortex-A53 erratum 835769 (AArch64 only)">; -foreach i = {1-7,9-15,18,20-28} in - def ffixed_x#i : Flag<["-"], "ffixed-x"#i>, Group, - HelpText<"Reserve the "#i#" register (AArch64 only)">; +foreach i = {1-31} in + def ffixed_x#i : Flag<["-"], "ffixed-x"#i>, Group, ---------------- Given the expansion of the flags here, the AArch64 driver should probably detect and reject the flags `-ffixed-x[8,16-17,19,29-31]`, to preserve the old behavior where passing those flags would be an error and to ensure that erroneous flags are not silently accepted. ================ Comment at: llvm/lib/Target/RISCV/RISCVISelLowering.cpp:2412 + })) + F.getContext().diagnose(DiagnosticInfoUnsupported{F, "Argument register" + " required, but has been reserved."}); ---------------- clang-format indicates another formatting style here. ================ Comment at: llvm/lib/Target/RISCV/RISCVSubtarget.cpp:53 : RISCVGenSubtargetInfo(TT, CPU, FS), + UserReservedRegister(RISCV::NUM_TARGET_REGS), FrameLowering(initializeSubtargetDependencies(TT, CPU, FS, ABIName)), ---------------- This includes more than the x0 - x31 registers. If the intent is to only allow reserving the GPRs then this should be tightened. ================ Comment at: llvm/lib/Target/RISCV/RISCVSubtarget.h:98 + bool isRegisterReservedByUser(size_t i) const { + return UserReservedRegister[i]; + } ---------------- Consider adding a bounds checking assert. 
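The bounds check suggested in the last inline comment could look roughly like the following standalone sketch. It is written outside LLVM under stated assumptions: `ReservedRegs` is a hypothetical stand-in for the subtarget class, `std::vector<bool>` stands in for whatever container the patch actually uses for `UserReservedRegister`, and `reserve` is an illustrative helper that is not part of the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the subtarget's user-reserved register set.
class ReservedRegs {
public:
  // NumRegs plays the role of the register count passed in the patch.
  explicit ReservedRegs(std::size_t NumRegs)
      : UserReservedRegister(NumRegs, false) {}

  // Illustrative helper (not in the patch): mark register i as reserved.
  void reserve(std::size_t i) {
    assert(i < UserReservedRegister.size() && "Register out of range");
    UserReservedRegister[i] = true;
  }

  // Same shape as the accessor under review, with the bounds check added.
  bool isRegisterReservedByUser(std::size_t i) const {
    assert(i < UserReservedRegister.size() && "Register out of range");
    return UserReservedRegister[i];
  }

private:
  std::vector<bool> UserReservedRegister;
};
```

Sizing the container to only the GPRs, as the previous comment suggests, would make the same assert catch any non-GPR index as well.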
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67185/new/ https://reviews.llvm.org/D67185 From llvm-commits at lists.llvm.org Sun Oct 13 13:40:10 2019 From: llvm-commits at lists.llvm.org (Johannes Doerfert via llvm-commits) Date: Sun, 13 Oct 2019 20:40:10 -0000 Subject: [llvm] r374735 - [Attributor][FIX] Use check prefix that is actually tested Message-ID: <20191013204010.B2B168569F@lists.llvm.org> Author: jdoerfert Date: Sun Oct 13 13:40:10 2019 New Revision: 374735 URL: http://llvm.org/viewvc/llvm-project?rev=374735&view=rev Log: [Attributor][FIX] Use check prefix that is actually tested Summary: This changes "CHECK" check lines to "ATTRIBUTOR" check lines where necessary and also fixes the now exposed, mostly minor, problems. Reviewers: sstefan1, uenoku Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68929 Modified: llvm/trunk/lib/Transforms/IPO/Attributor.cpp llvm/trunk/test/Transforms/FunctionAttrs/arg_returned.ll llvm/trunk/test/Transforms/FunctionAttrs/dereferenceable.ll llvm/trunk/test/Transforms/FunctionAttrs/nocapture.ll Modified: llvm/trunk/lib/Transforms/IPO/Attributor.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/IPO/Attributor.cpp?rev=374735&r1=374734&r2=374735&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/IPO/Attributor.cpp (original) +++ llvm/trunk/lib/Transforms/IPO/Attributor.cpp Sun Oct 13 13:40:10 2019 @@ -593,8 +593,9 @@ struct AAComposeTwoGenericDeduction /// See AbstractAttribute::updateImpl(...). 
ChangeStatus updateImpl(Attributor &A) override { - return F, StateType>::updateImpl(A) | - G::updateImpl(A); + ChangeStatus ChangedF = F, StateType>::updateImpl(A); + ChangeStatus ChangedG = G::updateImpl(A); + return ChangedF | ChangedG; } }; @@ -1535,11 +1536,16 @@ struct AANoFreeCallSite final : AANoFree static int64_t getKnownNonNullAndDerefBytesForUse( Attributor &A, AbstractAttribute &QueryingAA, Value &AssociatedValue, const Use *U, const Instruction *I, bool &IsNonNull, bool &TrackUse) { - // TODO: Add GEP support TrackUse = false; + const Value *UseV = U->get(); + if (!UseV->getType()->isPointerTy()) + return 0; + + Type *PtrTy = UseV->getType(); const Function *F = I->getFunction(); - bool NullPointerIsDefined = F ? F->nullPointerIsDefined() : true; + bool NullPointerIsDefined = + F ? llvm::NullPointerIsDefined(F, PtrTy->getPointerAddressSpace()) : true; const DataLayout &DL = A.getInfoCache().getDL(); if (ImmutableCallSite ICS = ImmutableCallSite(I)) { if (ICS.isBundleOperand(U)) @@ -1559,19 +1565,28 @@ static int64_t getKnownNonNullAndDerefBy int64_t Offset; if (const Value *Base = getBasePointerOfAccessPointerOperand(I, Offset, DL)) { - if (Base == &AssociatedValue) { + if (Base == &AssociatedValue && getPointerOperand(I) == UseV) { int64_t DerefBytes = - Offset + - (int64_t)DL.getTypeStoreSize( - getPointerOperand(I)->getType()->getPointerElementType()); + Offset + (int64_t)DL.getTypeStoreSize(PtrTy->getPointerElementType()); IsNonNull |= !NullPointerIsDefined; return DerefBytes; } } + if (const Value *Base = + GetPointerBaseWithConstantOffset(UseV, Offset, DL, + /*AllowNonInbounds*/ false)) { + auto &DerefAA = + A.getAAFor(QueryingAA, IRPosition::value(*Base)); + IsNonNull |= (!NullPointerIsDefined && DerefAA.isKnownNonNull()); + IsNonNull |= (!NullPointerIsDefined && (Offset != 0)); + int64_t DerefBytes = DerefAA.getKnownDereferenceableBytes(); + return std::max(int64_t(0), DerefBytes - Offset); + } return 0; } + struct AANonNullImpl : AANonNull { 
AANonNullImpl(const IRPosition &IRP) : AANonNull(IRP) {} @@ -2539,7 +2554,7 @@ struct AADereferenceableFloating // for overflows of the dereferenceable bytes. int64_t OffsetSExt = Offset.getSExtValue(); if (OffsetSExt < 0) - Offset = 0; + OffsetSExt = 0; T.takeAssumedDerefBytesMinimum( std::max(int64_t(0), DerefBytes - OffsetSExt)); Modified: llvm/trunk/test/Transforms/FunctionAttrs/arg_returned.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/FunctionAttrs/arg_returned.ll?rev=374735&r1=374734&r2=374735&view=diff ============================================================================== --- llvm/trunk/test/Transforms/FunctionAttrs/arg_returned.ll (original) +++ llvm/trunk/test/Transforms/FunctionAttrs/arg_returned.ll Sun Oct 13 13:40:10 2019 @@ -370,11 +370,11 @@ define i32* @calls_unknown_fn(i32* %r) # ; ; Verify the maybe-redefined function is not annotated: ; -; CHECK: Function Attrs: noinline nounwind uwtable -; CHECK: define linkonce_odr i32* @maybe_redefined_fn(i32* %r) +; ATTRIBUTOR: Function Attrs: noinline nounwind uwtable +; ATTRIBUTOR: define linkonce_odr i32* @maybe_redefined_fn(i32* %r) ; -; CHECK: Function Attrs: noinline nounwind uwtable -; CHECK: define i32* @calls_maybe_redefined_fn(i32* returned %r) +; ATTRIBUTOR: Function Attrs: noinline nounwind uwtable +; ATTRIBUTOR: define i32* @calls_maybe_redefined_fn(i32* returned %r) ; ; BOTH: Function Attrs: noinline nounwind uwtable ; BOTH-NEXT: define linkonce_odr i32* @maybe_redefined_fn(i32* %r) @@ -808,12 +808,12 @@ define i32 @exact(i32* %a) { %c3 = call i32* @non_exact_3(i32* %a) ; We can use the information of the weak function non_exact_3 because it was ; given to us and not derived (the alignment of the returned argument). -; CHECK: %c4 = load i32, i32* %c3, align 32 +; ATTRIBUTOR: %c4 = load i32, i32* %c3, align 32 %c4 = load i32, i32* %c3 ; FIXME: %c2 and %c3 should be replaced but not %c0 or %c1! 
-; CHECK: %add1 = add i32 %c0, %c1 -; CHECK: %add2 = add i32 %add1, %c2 -; CHECK: %add3 = add i32 %add2, %c3 +; ATTRIBUTOR: %add1 = add i32 %c0, %c1 +; ATTRIBUTOR: %add2 = add i32 %add1, %c2 +; ATTRIBUTOR: %add3 = add i32 %add2, %c4 %add1 = add i32 %c0, %c1 %add2 = add i32 %add1, %c2 %add3 = add i32 %add2, %c4 @@ -827,12 +827,12 @@ define i32* @ret_const() #0 { } define i32* @use_const() #0 { %c = call i32* @ret_const() - ; CHECK: ret i32* bitcast (i8* @G to i32*) + ; ATTRIBUTOR: ret i32* bitcast (i8* @G to i32*) ret i32* %c } define i32* @dont_use_const() #0 { %c = musttail call i32* @ret_const() - ; CHECK: ret i32* %c + ; ATTRIBUTOR: ret i32* %c ret i32* %c } Modified: llvm/trunk/test/Transforms/FunctionAttrs/dereferenceable.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/FunctionAttrs/dereferenceable.ll?rev=374735&r1=374734&r2=374735&view=diff ============================================================================== --- llvm/trunk/test/Transforms/FunctionAttrs/dereferenceable.ll (original) +++ llvm/trunk/test/Transforms/FunctionAttrs/dereferenceable.ll Sun Oct 13 13:40:10 2019 @@ -1,4 +1,4 @@ -; RUN: opt -attributor -attributor-manifest-internal --attributor-disable=false -attributor-max-iterations-verify -attributor-max-iterations=2 -S < %s | FileCheck %s --check-prefixes=ATTRIBUTOR +; RUN: opt -attributor -attributor-manifest-internal --attributor-disable=false -attributor-max-iterations-verify -attributor-max-iterations=2 -S < %s | FileCheck %s --check-prefix=ATTRIBUTOR declare void @deref_phi_user(i32* %a); @@ -61,7 +61,7 @@ entry: for.cond: ; preds = %for.inc, %entry %i.0 = phi i32 [ 0, %entry ], [ %inc, %for.inc ] %a.addr.0 = phi i32* [ %a, %entry ], [ %incdec.ptr, %for.inc ] -; CHECK: call void @deref_phi_user(i32* dereferenceable(4000) %a.addr.0) +; ATTRIBUTOR: call void @deref_phi_user(i32* nonnull dereferenceable(4000) %a.addr.0) call void @deref_phi_user(i32* %a.addr.0) %tmp = load i32, i32* %a.addr.0, align 4 %cmp = icmp 
slt i32 %i.0, %tmp @@ -91,7 +91,7 @@ entry: for.cond: ; preds = %for.inc, %entry %i.0 = phi i32 [ 0, %entry ], [ %inc, %for.inc ] %a.addr.0 = phi i32* [ %a, %entry ], [ %incdec.ptr, %for.inc ] -; CHECK: call void @deref_phi_user(i32* %a.addr.0) +; ATTRIBUTOR: call void @deref_phi_user(i32* nonnull %a.addr.0) call void @deref_phi_user(i32* %a.addr.0) %tmp = load i32, i32* %a.addr.0, align 4 %cmp = icmp slt i32 %i.0, %tmp Modified: llvm/trunk/test/Transforms/FunctionAttrs/nocapture.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/FunctionAttrs/nocapture.ll?rev=374735&r1=374734&r2=374735&view=diff ============================================================================== --- llvm/trunk/test/Transforms/FunctionAttrs/nocapture.ll (original) +++ llvm/trunk/test/Transforms/FunctionAttrs/nocapture.ll Sun Oct 13 13:40:10 2019 @@ -323,8 +323,8 @@ define i1 @captureDereferenceableOrNullI declare void @unknown(i8*) define void @test_callsite() { entry: -; We know that 'null' in AS 0 does not alias anything and cannot be captured -; CHECK: call void @unknown(i8* noalias nocapture null) +; We know that 'null' in AS 0 does not alias anything and cannot be captured. Though the latter is not qurried -> derived atm. +; ATTRIBUTOR: call void @unknown(i8* noalias null) call void @unknown(i8* null) ret void } From llvm-commits at lists.llvm.org Sun Oct 13 13:40:04 2019 From: llvm-commits at lists.llvm.org (Johannes Doerfert via Phabricator via llvm-commits) Date: Sun, 13 Oct 2019 20:40:04 +0000 (UTC) Subject: [PATCH] D68929: [Attributor][FIX] Use check line that is actually tested In-Reply-To: References: Message-ID: <27b63c2114c4236c2fad833ea73f206b@localhost.localdomain> This revision was automatically updated to reflect the committed changes. Closed by commit rGdb6efb017f24: [Attributor][FIX] Use check prefix that is actually tested (authored by jdoerfert). 
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68929/new/ https://reviews.llvm.org/D68929 Files: llvm/lib/Transforms/IPO/Attributor.cpp llvm/test/Transforms/FunctionAttrs/arg_returned.ll llvm/test/Transforms/FunctionAttrs/dereferenceable.ll llvm/test/Transforms/FunctionAttrs/nocapture.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D68929.224793.patch Type: text/x-patch Size: 7263 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Sun Oct 13 13:47:16 2019 From: llvm-commits at lists.llvm.org (Johannes Doerfert via llvm-commits) Date: Sun, 13 Oct 2019 20:47:16 -0000 Subject: [llvm] r374736 - [Attributor][MemBehavior] Fallback to the function state for arguments Message-ID: <20191013204716.BB4BB833FE@lists.llvm.org> Author: jdoerfert Date: Sun Oct 13 13:47:16 2019 New Revision: 374736 URL: http://llvm.org/viewvc/llvm-project?rev=374736&view=rev Log: [Attributor][MemBehavior] Fallback to the function state for arguments Even if an argument is captured, we cannot have an effect the function does not have. This is fine except for the special case of `inalloca` as it does not behave by the rules. TODO: Maybe the special rule for `inalloca` is wrong after all. 
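The fallback described in the log message can be illustrated with a small standalone model. This is a sketch under assumptions, not the Attributor implementation: `BitState` is a simplified stand-in for `IntegerState`, the bit values chosen for `NO_READS` and `NO_WRITES` are illustrative, and `fallbackToFunctionState` and `applyInAllocaRule` are hypothetical helper names.

```cpp
#include <cstdint>

// Illustrative encoding: a set bit means "this kind of access does not happen".
constexpr std::uint32_t NO_READS = 1u << 0;
constexpr std::uint32_t NO_WRITES = 1u << 1;
constexpr std::uint32_t NO_ACCESSES = NO_READS | NO_WRITES;

// Simplified model of a known/assumed bit state.
struct BitState {
  std::uint32_t Known = 0; // bits proven to hold
  std::uint32_t Assumed;   // bits optimistically assumed to hold

  explicit BitState(std::uint32_t Best) : Assumed(Best) {}

  // Keep only assumed bits that are also in BitsEncoding; never drop known bits.
  BitState &intersectAssumedBits(std::uint32_t BitsEncoding) {
    Assumed = (Assumed & BitsEncoding) | Known;
    return *this;
  }
  // Drop the given assumed bits; known bits are kept in this model.
  BitState &removeAssumedBits(std::uint32_t BitsEncoding) {
    Assumed = (Assumed & ~BitsEncoding) | Known;
    return *this;
  }
  // Drop the given bits from the known set.
  BitState &removeKnownBits(std::uint32_t BitsEncoding) {
    Known &= ~BitsEncoding;
    return *this;
  }
};

// A captured argument cannot be described precisely, but it also cannot have
// an effect the function does not have, so clamp it to the function state.
BitState fallbackToFunctionState(BitState Arg, const BitState &Fn) {
  Arg.intersectAssumedBits(Fn.Assumed);
  return Arg;
}

// inalloca arguments are always treated as written. In this model the known
// bit has to be removed first, otherwise it is OR-ed back into Assumed.
void applyInAllocaRule(BitState &S) {
  S.removeKnownBits(NO_WRITES);
  S.removeAssumedBits(NO_WRITES);
}
```

For a function whose state is "no writes", a captured argument ends up with "no writes" as well in this model, rather than being pushed to the fully pessimistic state.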
Modified: llvm/trunk/include/llvm/Transforms/IPO/Attributor.h llvm/trunk/lib/Transforms/IPO/Attributor.cpp llvm/trunk/test/Transforms/FunctionAttrs/nocapture.ll llvm/trunk/test/Transforms/FunctionAttrs/nonnull.ll llvm/trunk/test/Transforms/FunctionAttrs/readattrs.ll Modified: llvm/trunk/include/llvm/Transforms/IPO/Attributor.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Transforms/IPO/Attributor.h?rev=374736&r1=374735&r2=374736&view=diff ============================================================================== --- llvm/trunk/include/llvm/Transforms/IPO/Attributor.h (original) +++ llvm/trunk/include/llvm/Transforms/IPO/Attributor.h Sun Oct 13 13:47:16 2019 @@ -1128,6 +1128,12 @@ struct IntegerState : public AbstractSta return *this; } + /// Remove the bits in \p BitsEncoding from the "known bits". + IntegerState &removeKnownBits(base_t BitsEncoding) { + Known = (Known & ~BitsEncoding); + return *this; + } + /// Keep only "assumed bits" also set in \p BitsEncoding but all known ones. IntegerState &intersectAssumedBits(base_t BitsEncoding) { // Make sure we never loose any "known bits". Modified: llvm/trunk/lib/Transforms/IPO/Attributor.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/IPO/Attributor.cpp?rev=374736&r1=374735&r2=374736&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/IPO/Attributor.cpp (original) +++ llvm/trunk/lib/Transforms/IPO/Attributor.cpp Sun Oct 13 13:47:16 2019 @@ -3838,17 +3838,23 @@ struct AAMemoryBehaviorArgument : AAMemo void initialize(Attributor &A) override { AAMemoryBehaviorFloating::initialize(A); - // TODO: From readattrs.ll: "inalloca parameters are always - // considered written" - if (hasAttr({Attribute::InAlloca})) - removeAssumedBits(NO_WRITES); - // Initialize the use vector with all direct uses of the associated value. 
Argument *Arg = getAssociatedArgument(); if (!Arg || !Arg->getParent()->hasExactDefinition()) indicatePessimisticFixpoint(); } + ChangeStatus manifest(Attributor &A) override { + // TODO: From readattrs.ll: "inalloca parameters are always + // considered written" + if (hasAttr({Attribute::InAlloca})) { + removeKnownBits(NO_WRITES); + removeAssumedBits(NO_WRITES); + } + return AAMemoryBehaviorFloating::manifest(A); + } + + /// See AbstractAttribute::trackStatistics() void trackStatistics() const override { if (isAssumedReadNone()) @@ -4017,10 +4023,13 @@ ChangeStatus AAMemoryBehaviorFloating::u // Make sure the value is not captured (except through "return"), if // it is, any information derived would be irrelevant anyway as we cannot - // check the potential aliases introduced by the capture. + // check the potential aliases introduced by the capture. However, no need + // to fall back to anythign less optimistic than the function state. const auto &ArgNoCaptureAA = A.getAAFor(*this, IRP); - if (!ArgNoCaptureAA.isAssumedNoCaptureMaybeReturned()) - return indicatePessimisticFixpoint(); + if (!ArgNoCaptureAA.isAssumedNoCaptureMaybeReturned()) { + S.intersectAssumedBits(FnMemAA.getAssumed()); + return ChangeStatus::CHANGED; + } // The current assumed state used to determine a change. 
auto AssumedState = S.getAssumed(); Modified: llvm/trunk/test/Transforms/FunctionAttrs/nocapture.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/FunctionAttrs/nocapture.ll?rev=374736&r1=374735&r2=374736&view=diff ============================================================================== --- llvm/trunk/test/Transforms/FunctionAttrs/nocapture.ll (original) +++ llvm/trunk/test/Transforms/FunctionAttrs/nocapture.ll Sun Oct 13 13:47:16 2019 @@ -11,14 +11,16 @@ define i32* @c1(i32* %q) { ret i32* %q } -; EITHER: define void @c2(i32* %q) +; FNATTR: define void @c2(i32* %q) +; ATTRIBUTOR: define void @c2(i32* writeonly %q) ; It would also be acceptable to mark %q as readnone. Update @c3 too. define void @c2(i32* %q) { store i32* %q, i32** @g ret void } -; EITHER: define void @c3(i32* %q) +; FNATTR: define void @c3(i32* %q) +; ATTRIBUTOR: define void @c3(i32* writeonly %q) define void @c3(i32* %q) { call void @c2(i32* %q) ret void @@ -39,7 +41,8 @@ l1: @lookup_table = global [2 x i1] [ i1 0, i1 1 ] -; EITHER: define i1 @c5(i32* %q, i32 %bitno) +; FNATTR: define i1 @c5(i32* %q, i32 %bitno) +; ATTRIBUTOR: define i1 @c5(i32* readonly %q, i32 %bitno) define i1 @c5(i32* %q, i32 %bitno) { %tmp = ptrtoint i32* %q to i32 %tmp2 = lshr i32 %tmp, %bitno @@ -52,8 +55,7 @@ define i1 @c5(i32* %q, i32 %bitno) { declare void @throw_if_bit_set(i8*, i8) readonly -; FNATTR: define i1 @c6(i8* readonly %q, i8 %bit) -; ATTRIBUTOR: define i1 @c6(i8* %q, i8 %bit) +; EITHER: define i1 @c6(i8* readonly %q, i8 %bit) define i1 @c6(i8* %q, i8 %bit) personality i32 (...)* @__gxx_personality_v0 { invoke void @throw_if_bit_set(i8* %q, i8 %bit) to label %ret0 unwind label %ret1 @@ -75,8 +77,7 @@ define i1* @lookup_bit(i32* %q, i32 %bit ret i1* %lookup } -; FNATTR: define i1 @c7(i32* readonly %q, i32 %bitno) -; ATTRIBUTOR: define i1 @c7(i32* %q, i32 %bitno) +; EITHER: define i1 @c7(i32* readonly %q, i32 %bitno) define i1 @c7(i32* %q, i32 %bitno) { %ptr = call i1* @lookup_bit(i32* 
%q, i32 %bitno) %val = load i1, i1* %ptr @@ -271,7 +272,8 @@ entry: } @g3 = global i8* null -; EITHER: define void @captureStrip(i8* %p) +; FNATTR: define void @captureStrip(i8* %p) +; ATTRIBUTOR: define void @captureStrip(i8* writeonly %p) define void @captureStrip(i8* %p) { %b = call i8* @llvm.strip.invariant.group.p0i8(i8* %p) store i8* %b, i8** @g3 Modified: llvm/trunk/test/Transforms/FunctionAttrs/nonnull.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/FunctionAttrs/nonnull.ll?rev=374736&r1=374735&r2=374736&view=diff ============================================================================== --- llvm/trunk/test/Transforms/FunctionAttrs/nonnull.ll (original) +++ llvm/trunk/test/Transforms/FunctionAttrs/nonnull.ll Sun Oct 13 13:47:16 2019 @@ -198,7 +198,7 @@ bb4: bb6: ; preds = %bb1 ; FIXME: missing nonnull. It should be @f2(i32* nonnull %arg) -; ATTRIBUTOR: %tmp7 = tail call nonnull i32* @f2(i32* %arg) +; ATTRIBUTOR: %tmp7 = tail call nonnull i32* @f2(i32* readonly %arg) %tmp7 = tail call i32* @f2(i32* %arg) ret i32* %tmp7 @@ -209,7 +209,7 @@ bb9: define internal i32* @f2(i32* %arg) { ; FIXME: missing nonnull. It should be nonnull @f2(i32* nonnull %arg) -; ATTRIBUTOR: define internal nonnull i32* @f2(i32* %arg) +; ATTRIBUTOR: define internal nonnull i32* @f2(i32* readonly %arg) bb: ; FIXME: missing nonnull. 
It should be @f1(i32* nonnull readonly %arg) Modified: llvm/trunk/test/Transforms/FunctionAttrs/readattrs.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/FunctionAttrs/readattrs.ll?rev=374736&r1=374735&r2=374736&view=diff ============================================================================== --- llvm/trunk/test/Transforms/FunctionAttrs/readattrs.ll (original) +++ llvm/trunk/test/Transforms/FunctionAttrs/readattrs.ll Sun Oct 13 13:47:16 2019 @@ -39,7 +39,7 @@ define void @test4_2(i8* %p) { } ; FNATTR: define void @test5(i8** nocapture %p, i8* %q) -; ATTRIBUTOR: define void @test5(i8** nocapture nonnull writeonly dereferenceable(8) %p, i8* %q) +; ATTRIBUTOR: define void @test5(i8** nocapture nonnull writeonly dereferenceable(8) %p, i8* writeonly %q) ; Missed optz'n: we could make %q readnone, but don't break test6! define void @test5(i8** %p, i8* %q) { store i8* %q, i8** %p From llvm-commits at lists.llvm.org Sun Oct 13 13:48:26 2019 From: llvm-commits at lists.llvm.org (Johannes Doerfert via llvm-commits) Date: Sun, 13 Oct 2019 20:48:26 -0000 Subject: [llvm] r374737 - [Attributor][FIX] NullPointerIsDefined needs the pointer AS (AANonNull) Message-ID: <20191013204826.B380580772@lists.llvm.org> Author: jdoerfert Date: Sun Oct 13 13:48:26 2019 New Revision: 374737 URL: http://llvm.org/viewvc/llvm-project?rev=374737&view=rev Log: [Attributor][FIX] NullPointerIsDefined needs the pointer AS (AANonNull) Also includes a shortcut via AADereferenceable if possible. 
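Why the address space matters can be shown with a standalone model. The sketch assumes the usual rule that null is a defined address when the function is marked as allowing null pointers or when the pointer lives in a non-zero address space; `nullPointerIsDefined` and `derefImpliesNonNull` are hypothetical helpers written for illustration, not the LLVM functions.

```cpp
// Assumed rule for illustration: null may be a valid address if the function
// says so, or if the pointer is in a non-default (non-zero) address space.
bool nullPointerIsDefined(bool FnAllowsNull, unsigned AddrSpace) {
  return FnAllowsNull || AddrSpace != 0;
}

// Dereferenceable bytes only imply nonnull when null is not a defined address.
bool derefImpliesNonNull(unsigned DerefBytes, bool FnAllowsNull,
                         unsigned AddrSpace) {
  return DerefBytes > 0 && !nullPointerIsDefined(FnAllowsNull, AddrSpace);
}
```

In this model a `dereferenceable(4)` pointer in address space 0 is nonnull, while the same attribute on an `addrspace(3)` pointer yields no nonnull conclusion.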
Modified: llvm/trunk/lib/Transforms/IPO/Attributor.cpp llvm/trunk/test/Transforms/FunctionAttrs/noalias_returned.ll llvm/trunk/test/Transforms/FunctionAttrs/nonnull.ll llvm/trunk/test/Transforms/FunctionAttrs/nounwind.ll llvm/trunk/test/Transforms/FunctionAttrs/read_write_returned_arguments_scc.ll Modified: llvm/trunk/lib/Transforms/IPO/Attributor.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/IPO/Attributor.cpp?rev=374737&r1=374736&r2=374737&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/IPO/Attributor.cpp (original) +++ llvm/trunk/lib/Transforms/IPO/Attributor.cpp Sun Oct 13 13:48:26 2019 @@ -1588,11 +1588,16 @@ static int64_t getKnownNonNullAndDerefBy } struct AANonNullImpl : AANonNull { - AANonNullImpl(const IRPosition &IRP) : AANonNull(IRP) {} + AANonNullImpl(const IRPosition &IRP) + : AANonNull(IRP), + NullIsDefined(NullPointerIsDefined( + getAnchorScope(), + getAssociatedValue().getType()->getPointerAddressSpace())) {} /// See AbstractAttribute::initialize(...). void initialize(Attributor &A) override { - if (hasAttr({Attribute::NonNull, Attribute::Dereferenceable})) + if (!NullIsDefined && + hasAttr({Attribute::NonNull, Attribute::Dereferenceable})) indicateOptimisticFixpoint(); else AANonNull::initialize(A); @@ -1612,6 +1617,10 @@ struct AANonNullImpl : AANonNull { const std::string getAsStr() const override { return getAssumed() ? "nonnull" : "may-null"; } + + /// Flag to determine if the underlying value can be null and still allow + /// valid accesses. + const bool NullIsDefined; }; /// NonNull attribute for a floating value. 
@@ -1644,6 +1653,12 @@ struct AANonNullFloating if (isKnownNonNull()) return Change; + if (!NullIsDefined) { + const auto &DerefAA = A.getAAFor(*this, getIRPosition()); + if (DerefAA.getAssumedDereferenceableBytes()) + return Change; + } + const DataLayout &DL = A.getDataLayout(); auto VisitValueCB = [&](Value &V, AAAlign::StateType &T, @@ -1651,7 +1666,7 @@ struct AANonNullFloating const auto &AA = A.getAAFor(*this, IRPosition::value(V)); if (!Stripped && this == &AA) { if (!isKnownNonZero(&V, DL, 0, /* TODO: AC */ nullptr, - /* TODO: CtxI */ nullptr, + /* CtxI */ getCtxI(), /* TODO: DT */ nullptr)) T.indicatePessimisticFixpoint(); } else { Modified: llvm/trunk/test/Transforms/FunctionAttrs/noalias_returned.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/FunctionAttrs/noalias_returned.ll?rev=374737&r1=374736&r2=374737&view=diff ============================================================================== --- llvm/trunk/test/Transforms/FunctionAttrs/noalias_returned.ll (original) +++ llvm/trunk/test/Transforms/FunctionAttrs/noalias_returned.ll Sun Oct 13 13:48:26 2019 @@ -1,4 +1,4 @@ -; RUN: opt -S -passes=attributor -aa-pipeline='basic-aa' -attributor-disable=false -attributor-max-iterations-verify -attributor-max-iterations=2 < %s | FileCheck %s +; RUN: opt -S -passes=attributor -aa-pipeline='basic-aa' -attributor-disable=false -attributor-max-iterations-verify -attributor-max-iterations=3 < %s | FileCheck %s ; TEST 1 - negative. 

Modified: llvm/trunk/test/Transforms/FunctionAttrs/nonnull.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/FunctionAttrs/nonnull.ll?rev=374737&r1=374736&r2=374737&view=diff
==============================================================================
--- llvm/trunk/test/Transforms/FunctionAttrs/nonnull.ll (original)
+++ llvm/trunk/test/Transforms/FunctionAttrs/nonnull.ll Sun Oct 13 13:48:26 2019
@@ -523,6 +523,13 @@ define i32 addrspace(3)* @gep2(i32 addrs
   ret i32 addrspace(3)* %q
 }

+; FNATTR: define i32 addrspace(3)* @as(i32 addrspace(3)* readnone returned dereferenceable(4) %p)
+; FIXME: We should propagate dereferenceable here but *not* nonnull
+; ATTRIBUTOR: define dereferenceable_or_null(4) i32 addrspace(3)* @as(i32 addrspace(3)* readnone returned dereferenceable(4) dereferenceable_or_null(4) %p)
+define i32 addrspace(3)* @as(i32 addrspace(3)* dereferenceable(4) %p) {
+  ret i32 addrspace(3)* %p
+}
+
 ; BOTH: define internal nonnull i32* @g2()
 define internal i32* @g2() {
   ret i32* inttoptr (i64 4 to i32*)

Modified: llvm/trunk/test/Transforms/FunctionAttrs/nounwind.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/FunctionAttrs/nounwind.ll?rev=374737&r1=374736&r2=374737&view=diff
==============================================================================
--- llvm/trunk/test/Transforms/FunctionAttrs/nounwind.ll (original)
+++ llvm/trunk/test/Transforms/FunctionAttrs/nounwind.ll Sun Oct 13 13:48:26 2019
@@ -1,5 +1,5 @@
 ; RUN: opt < %s -functionattrs -S | FileCheck %s
-; RUN: opt < %s -attributor -attributor-disable=false -attributor-max-iterations-verify -attributor-max-iterations=2 -S | FileCheck %s --check-prefix=ATTRIBUTOR
+; RUN: opt < %s -attributor -attributor-disable=false -attributor-max-iterations-verify -attributor-max-iterations=3 -S | FileCheck %s --check-prefix=ATTRIBUTOR

 ; TEST 1
 ; CHECK: Function Attrs: norecurse nounwind readnone

Modified: llvm/trunk/test/Transforms/FunctionAttrs/read_write_returned_arguments_scc.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/FunctionAttrs/read_write_returned_arguments_scc.ll?rev=374737&r1=374736&r2=374737&view=diff
==============================================================================
--- llvm/trunk/test/Transforms/FunctionAttrs/read_write_returned_arguments_scc.ll (original)
+++ llvm/trunk/test/Transforms/FunctionAttrs/read_write_returned_arguments_scc.ll Sun Oct 13 13:48:26 2019
@@ -1,4 +1,4 @@
-; RUN: opt -functionattrs -enable-nonnull-arg-prop -attributor -attributor-manifest-internal -attributor-disable=false -attributor-max-iterations-verify -attributor-max-iterations=6 -S < %s | FileCheck %s
+; RUN: opt -functionattrs -enable-nonnull-arg-prop -attributor -attributor-manifest-internal -attributor-disable=false -attributor-max-iterations-verify -attributor-max-iterations=7 -S < %s | FileCheck %s
 ;
 ; This is an evolved example to stress test SCC parameter attribute propagation.
 ; The SCC in this test is made up of the following six function, three of which


From llvm-commits at lists.llvm.org Sun Oct 13 13:59:47 2019
From: llvm-commits at lists.llvm.org (Galina Kistanova via llvm-commits)
Date: Sun, 13 Oct 2019 20:59:47 -0000
Subject: [zorg] r374738 - Remove build directory for each build on clang-x86_64-debian-fast.
Message-ID: <20191013205947.8173B82EB6@lists.llvm.org>

Author: gkistanova
Date: Sun Oct 13 13:59:47 2019
New Revision: 374738

URL: http://llvm.org/viewvc/llvm-project?rev=374738&view=rev
Log:
Remove build directory for each build on clang-x86_64-debian-fast.

Modified:
    zorg/trunk/buildbot/osuosl/master/config/builders.py

Modified: zorg/trunk/buildbot/osuosl/master/config/builders.py
URL: http://llvm.org/viewvc/llvm-project/zorg/trunk/buildbot/osuosl/master/config/builders.py?rev=374738&r1=374737&r2=374738&view=diff
==============================================================================
--- zorg/trunk/buildbot/osuosl/master/config/builders.py (original)
+++ zorg/trunk/buildbot/osuosl/master/config/builders.py Sun Oct 13 13:59:47 2019
@@ -97,6 +97,7 @@ def _get_clang_fast_builders():
          'factory': UnifiedTreeBuilder.getCmakeWithNinjaBuildFactory(
                         llvm_srcdir="llvm.src",
                         obj_dir="llvm.obj",
+                        clean=True,
                         depends_on_projects=['llvm','clang','clang-tools-extra','compiler-rt'],
                         extra_configure_args=[
                             "-DCOMPILER_RT_BUILD_BUILTINS:BOOL=OFF",


From llvm-commits at lists.llvm.org Sun Oct 13 14:17:28 2019
From: llvm-commits at lists.llvm.org (David Li via Phabricator via llvm-commits)
Date: Sun, 13 Oct 2019 21:17:28 +0000 (UTC)
Subject: [PATCH] D68898: JumpThreading: enhance JT to handle BB with no successor and address comparison
Message-ID: <2d76644c4114afcb8074377b14a6e799@localhost.localdomain>

davidxl added a comment.

Handling Wei's case would be a nice thing to have, but it may require more significant change in JT. Currently the JT candidate BB selection is based on checking the conditional value used by branch or return value of ret instr (with this patch). To handle this case, it requires checking use values of arbitrary instructions (value of store in the example).

Another thing to consider is the cost model difference. In Wei's case, cloning really becomes tail dup with increased complexity of control flow (handling Ret instruction on the other hand does not have the issue).


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68898/new/

https://reviews.llvm.org/D68898


From llvm-commits at lists.llvm.org Sun Oct 13 14:17:29 2019
From: llvm-commits at lists.llvm.org (Ayal Zaks via Phabricator via llvm-commits)
Date: Sun, 13 Oct 2019 21:17:29 +0000 (UTC)
Subject: [PATCH] D68577: [LV] Apply sink-after & interleave-groups as VPlan transformations (NFC)
Message-ID: <51432a205904b7f574031e55bbee4525@localhost.localdomain>

Ayal added inline comments.


================
Comment at: llvm/lib/Transforms/Vectorize/VPlan.h:985
+  /// Return the mask used by this recipe, nullptr if none.
+  VPValue *getMask() {
----------------
While you're at it... worth adding that a full mask (potentially used by this recipe) is represented by nullptr, as well.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68577/new/

https://reviews.llvm.org/D68577


From llvm-commits at lists.llvm.org Sun Oct 13 14:17:29 2019
From: llvm-commits at lists.llvm.org (Stefan Stipanovic via Phabricator via llvm-commits)
Date: Sun, 13 Oct 2019 21:17:29 +0000 (UTC)
Subject: [PATCH] D67886: NoFree argument attribute.
Message-ID: <29840798cb05d2aba3c5cdda4e2ef6d8@localhost.localdomain>

sstefan1 updated this revision to Diff 224794.
sstefan1 added a comment.

- addressing comments


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D67886/new/

https://reviews.llvm.org/D67886

Files:
  llvm/docs/LangRef.rst
  llvm/lib/AsmParser/LLParser.cpp
  llvm/lib/IR/Verifier.cpp
  llvm/lib/Transforms/IPO/Attributor.cpp
  llvm/test/Transforms/FunctionAttrs/align.ll
  llvm/test/Transforms/FunctionAttrs/arg_nocapture.ll
  llvm/test/Transforms/FunctionAttrs/arg_returned.ll
  llvm/test/Transforms/FunctionAttrs/dereferenceable.ll
  llvm/test/Transforms/FunctionAttrs/heap_to_stack.ll
  llvm/test/Transforms/FunctionAttrs/internal-noalias.ll
  llvm/test/Transforms/FunctionAttrs/liveness.ll
  llvm/test/Transforms/FunctionAttrs/noalias_returned.ll
  llvm/test/Transforms/FunctionAttrs/nocapture.ll
  llvm/test/Transforms/FunctionAttrs/nofree-attributor.ll
  llvm/test/Transforms/FunctionAttrs/nonnull.ll
  llvm/test/Transforms/FunctionAttrs/nosync.ll
  llvm/test/Transforms/FunctionAttrs/read_write_returned_arguments_scc.ll
  llvm/test/Transforms/FunctionAttrs/willreturn.ll


From llvm-commits at lists.llvm.org Sun Oct 13 14:25:53 2019
From: llvm-commits at lists.llvm.org (Johannes Doerfert via llvm-commits)
Date: Sun, 13 Oct 2019 21:25:53 -0000
Subject: [llvm] r374739 - [Attributor] Shortcut no-return through will-return
Message-ID: <20191013212553.77F3183D41@lists.llvm.org>

Author: jdoerfert
Date: Sun Oct 13 14:25:53 2019
New Revision: 374739

URL: http://llvm.org/viewvc/llvm-project?rev=374739&view=rev
Log:
[Attributor] Shortcut no-return through will-return

No-return and will-return are exclusive, assuming the latter is more
prominent we can avoid updates of the former unless will-return is not
known for sure.
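
The shortcut described in this log can be sketched outside of LLVM as a small toy model. This is only an illustration of the control flow, assuming a simplified optimistic state; `ToyNoReturnState`, `ToyFunctionFacts`, `initialize`, `update`, and `deriveNoReturn` are hypothetical names and not part of the Attributor API.

```cpp
// Hypothetical stand-in for an optimistic attribute state. This is not the
// Attributor API, only a model of the decision order described in the log.
struct ToyNoReturnState {
  bool Assumed = true;     // optimistically assume "noreturn"
  bool AtFixpoint = false; // no further updates once this is set

  void indicatePessimisticFixpoint() {
    Assumed = false;
    AtFixpoint = true;
  }
};

// Assumed inputs for the sketch; a real analysis would compute these.
struct ToyFunctionFacts {
  bool HasWillReturnAttr; // `willreturn` is already present on the function
  bool WillReturnIsKnown; // will-return was proven during the fixpoint run
  bool HasReachableRet;   // a live `ret` instruction exists
};

// A function already marked `willreturn` can never be `noreturn`, so the
// state gives up immediately.
void initialize(ToyNoReturnState &S, const ToyFunctionFacts &F) {
  if (F.HasWillReturnAttr)
    S.indicatePessimisticFixpoint();
}

// Consult will-return first; the scan for `ret` instructions only matters
// when will-return is not known.
void update(ToyNoReturnState &S, const ToyFunctionFacts &F) {
  if (S.AtFixpoint)
    return;
  if (F.WillReturnIsKnown || F.HasReachableRet)
    S.indicatePessimisticFixpoint();
}

// Runs one initialize/update round and reports whether "noreturn" survives.
bool deriveNoReturn(const ToyFunctionFacts &F) {
  ToyNoReturnState S;
  initialize(S, F);
  update(S, F);
  return S.Assumed;
}
```

In this model, a function that loops forever but carries `willreturn` (as in the new TEST 6 of fn_noreturn.ll) is not treated as no-return, because the will-return check fires before any instruction scan.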

Modified:
    llvm/trunk/lib/Transforms/IPO/Attributor.cpp
    llvm/trunk/test/Transforms/FunctionAttrs/fn_noreturn.ll
    llvm/trunk/test/Transforms/FunctionAttrs/internal-noalias.ll
    llvm/trunk/test/Transforms/FunctionAttrs/liveness.ll
    llvm/trunk/test/Transforms/FunctionAttrs/norecurse.ll
    llvm/trunk/test/Transforms/FunctionAttrs/noreturn_async.ll
    llvm/trunk/test/Transforms/FunctionAttrs/noreturn_sync.ll

Modified: llvm/trunk/lib/Transforms/IPO/Attributor.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/IPO/Attributor.cpp?rev=374739&r1=374738&r2=374739&view=diff
==============================================================================
--- llvm/trunk/lib/Transforms/IPO/Attributor.cpp (original)
+++ llvm/trunk/lib/Transforms/IPO/Attributor.cpp Sun Oct 13 14:25:53 2019
@@ -2859,6 +2859,14 @@ struct AAAlignCallSiteReturned final : A
 struct AANoReturnImpl : public AANoReturn {
   AANoReturnImpl(const IRPosition &IRP) : AANoReturn(IRP) {}

+  /// See AbstractAttribute::initialize(...).
+  void initialize(Attributor &A) override {
+    AANoReturn::initialize(A);
+    Function *F = getAssociatedFunction();
+    if (!F || F->hasFnAttribute(Attribute::WillReturn))
+      indicatePessimisticFixpoint();
+  }
+
   /// See AbstractAttribute::getAsStr().
   const std::string getAsStr() const override {
     return getAssumed() ? "noreturn" : "may-return";
@@ -2866,6 +2874,9 @@ struct AANoReturnImpl : public AANoRetur

   /// See AbstractAttribute::updateImpl(Attributor &A).
   virtual ChangeStatus updateImpl(Attributor &A) override {
+    const auto &WillReturnAA = A.getAAFor<AAWillReturn>(*this, getIRPosition());
+    if (WillReturnAA.isKnownWillReturn())
+      return indicatePessimisticFixpoint();
     auto CheckForNoReturn = [](Instruction &) { return false; };
     if (!A.checkForAllInstructions(CheckForNoReturn, *this,
                                    {(unsigned)Instruction::Ret}))
@@ -2885,14 +2896,6 @@ struct AANoReturnFunction final : AANoRe
 struct AANoReturnCallSite final : AANoReturnImpl {
   AANoReturnCallSite(const IRPosition &IRP) : AANoReturnImpl(IRP) {}

-  /// See AbstractAttribute::initialize(...).
-  void initialize(Attributor &A) override {
-    AANoReturnImpl::initialize(A);
-    Function *F = getAssociatedFunction();
-    if (!F)
-      indicatePessimisticFixpoint();
-  }
-
   /// See AbstractAttribute::updateImpl(...).
   ChangeStatus updateImpl(Attributor &A) override {
     // TODO: Once we have call site specific value information we can provide

Modified: llvm/trunk/test/Transforms/FunctionAttrs/fn_noreturn.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/FunctionAttrs/fn_noreturn.ll?rev=374739&r1=374738&r2=374739&view=diff
==============================================================================
--- llvm/trunk/test/Transforms/FunctionAttrs/fn_noreturn.ll (original)
+++ llvm/trunk/test/Transforms/FunctionAttrs/fn_noreturn.ll Sun Oct 13 14:25:53 2019
@@ -1,4 +1,4 @@
-; RUN: opt -functionattrs -attributor -attributor-disable=false -attributor-max-iterations-verify -attributor-max-iterations=2 -S < %s | FileCheck %s
+; RUN: opt -functionattrs -attributor -attributor-disable=false -attributor-max-iterations-verify -attributor-max-iterations=3 -S < %s | FileCheck %s
 ;
 ; Test cases specifically designed for the "no-return" function attribute.
 ; We use FIXME's to indicate problems and missing attributes.
@@ -124,4 +124,16 @@ cond.end:
   ret i32 %cond
 }

+
+ ; TEST 6: willreturn means *not* no-return
+; CHECK: Function Attrs: nofree norecurse nosync nounwind readnone willreturn
+; CHECK-NEXT: define i32 @endless_loop_but_willreturn
+define i32 @endless_loop_but_willreturn(i32 %a) willreturn {
+entry:
+  br label %while.body
+
+while.body:                                       ; preds = %entry, %while.body
+  br label %while.body
+}
+
 attributes #0 = { noinline nounwind uwtable }

Modified: llvm/trunk/test/Transforms/FunctionAttrs/internal-noalias.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/FunctionAttrs/internal-noalias.ll?rev=374739&r1=374738&r2=374739&view=diff
==============================================================================
--- llvm/trunk/test/Transforms/FunctionAttrs/internal-noalias.ll (original)
+++ llvm/trunk/test/Transforms/FunctionAttrs/internal-noalias.ll Sun Oct 13 14:25:53 2019
@@ -1,4 +1,4 @@
-; RUN: opt -S -passes=attributor -aa-pipeline='basic-aa' -attributor-disable=false -attributor-max-iterations-verify -attributor-max-iterations=5 < %s | FileCheck %s
+; RUN: opt -S -passes=attributor -aa-pipeline='basic-aa' -attributor-disable=false -attributor-max-iterations-verify -attributor-max-iterations=3 < %s | FileCheck %s

 define dso_local i32 @visible(i32* noalias %A, i32* noalias %B) #0 {
 entry:

Modified: llvm/trunk/test/Transforms/FunctionAttrs/liveness.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/FunctionAttrs/liveness.ll?rev=374739&r1=374738&r2=374739&view=diff
==============================================================================
--- llvm/trunk/test/Transforms/FunctionAttrs/liveness.ll (original)
+++ llvm/trunk/test/Transforms/FunctionAttrs/liveness.ll Sun Oct 13 14:25:53 2019
@@ -1,4 +1,4 @@
-; RUN: opt -attributor --attributor-disable=false -attributor-max-iterations-verify -attributor-max-iterations=2 -S < %s | FileCheck %s
+; RUN: opt -attributor --attributor-disable=false -attributor-max-iterations-verify -attributor-max-iterations=3 -S < %s | FileCheck %s

 declare void @no_return_call() nofree noreturn nounwind readnone


Modified: llvm/trunk/test/Transforms/FunctionAttrs/norecurse.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/FunctionAttrs/norecurse.ll?rev=374739&r1=374738&r2=374739&view=diff
==============================================================================
--- llvm/trunk/test/Transforms/FunctionAttrs/norecurse.ll (original)
+++ llvm/trunk/test/Transforms/FunctionAttrs/norecurse.ll Sun Oct 13 14:25:53 2019
@@ -1,6 +1,6 @@
 ; RUN: opt < %s -basicaa -functionattrs -rpo-functionattrs -S | FileCheck %s --check-prefixes=CHECK,BOTH
 ; RUN: opt < %s -aa-pipeline=basic-aa -passes='cgscc(function-attrs),rpo-functionattrs' -S | FileCheck %s --check-prefixes=CHECK,BOTH
-; RUN: opt -passes=attributor --attributor-disable=false -attributor-max-iterations-verify -attributor-max-iterations=2 -S < %s | FileCheck %s --check-prefixes=ATTRIBUTOR,BOTH
+; RUN: opt -passes=attributor --attributor-disable=false -attributor-max-iterations-verify -attributor-max-iterations=4 -S < %s | FileCheck %s --check-prefixes=ATTRIBUTOR,BOTH

 ; CHECK: Function Attrs
 ; CHECK-SAME: norecurse nounwind readnone

Modified: llvm/trunk/test/Transforms/FunctionAttrs/noreturn_async.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/FunctionAttrs/noreturn_async.ll?rev=374739&r1=374738&r2=374739&view=diff
==============================================================================
--- llvm/trunk/test/Transforms/FunctionAttrs/noreturn_async.ll (original)
+++ llvm/trunk/test/Transforms/FunctionAttrs/noreturn_async.ll Sun Oct 13 14:25:53 2019
@@ -1,4 +1,4 @@
-; RUN: opt -functionattrs -attributor -attributor-disable=false -attributor-max-iterations-verify -attributor-max-iterations=2 -S < %s | FileCheck %s
+; RUN: opt -functionattrs -attributor -attributor-disable=false -attributor-max-iterations-verify -attributor-max-iterations=3 -S < %s | FileCheck %s
 ;
 ; This file is the same as noreturn_sync.ll but with a personality which
 ; indicates that the exception handler *can* catch asynchronous exceptions. As

Modified: llvm/trunk/test/Transforms/FunctionAttrs/noreturn_sync.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/FunctionAttrs/noreturn_sync.ll?rev=374739&r1=374738&r2=374739&view=diff
==============================================================================
--- llvm/trunk/test/Transforms/FunctionAttrs/noreturn_sync.ll (original)
+++ llvm/trunk/test/Transforms/FunctionAttrs/noreturn_sync.ll Sun Oct 13 14:25:53 2019
@@ -1,4 +1,4 @@
-; RUN: opt -functionattrs -attributor -attributor-disable=false -attributor-max-iterations-verify -attributor-max-iterations=2 -S < %s | FileCheck %s
+; RUN: opt -functionattrs -attributor -attributor-disable=false -attributor-max-iterations-verify -attributor-max-iterations=3 -S < %s | FileCheck %s
 ;
 ; This file is the same as noreturn_async.ll but with a personality which
 ; indicates that the exception handler *cannot* catch asynchronous exceptions.


From llvm-commits at lists.llvm.org Sun Oct 13 14:39:00 2019
From: llvm-commits at lists.llvm.org (Galina Kistanova via llvm-commits)
Date: Sun, 13 Oct 2019 21:39:00 -0000
Subject: [zorg] r374740 - Set a default build directory in the LLVMBuildFactory and then properly use it.
Message-ID: <20191013213900.A028481A74@lists.llvm.org>

Author: gkistanova
Date: Sun Oct 13 14:39:00 2019
New Revision: 374740

URL: http://llvm.org/viewvc/llvm-project?rev=374740&view=rev
Log:
Set a default build directory in the LLVMBuildFactory and then properly use it.

Modified:
    zorg/trunk/zorg/buildbot/builders/UnifiedTreeBuilder.py
    zorg/trunk/zorg/buildbot/process/factory.py

Modified: zorg/trunk/zorg/buildbot/builders/UnifiedTreeBuilder.py
URL: http://llvm.org/viewvc/llvm-project/zorg/trunk/zorg/buildbot/builders/UnifiedTreeBuilder.py?rev=374740&r1=374739&r2=374740&view=diff
==============================================================================
--- zorg/trunk/zorg/buildbot/builders/UnifiedTreeBuilder.py (original)
+++ zorg/trunk/zorg/buildbot/builders/UnifiedTreeBuilder.py Sun Oct 13 14:39:00 2019
@@ -65,6 +65,9 @@ def addCmakeSteps(
     else:
         cmake_args = list()

+    if obj_dir is None:
+        obj_dir = f.obj_dir
+
     # This is an incremental build, unless otherwise has been requested.
     # Remove obj and install dirs for a clean build.
     # TODO: Some Windows slaves do not handle RemoveDirectory command well.
@@ -242,7 +245,7 @@ def getCmakeWithNinjaBuildFactory(

     addNinjaSteps(
            f,
-           obj_dir=obj_dir,
+           obj_dir=f.obj_dir,
            checks=checks,
            install_dir=f.install_dir,
            env=env,

Modified: zorg/trunk/zorg/buildbot/process/factory.py
URL: http://llvm.org/viewvc/llvm-project/zorg/trunk/zorg/buildbot/process/factory.py?rev=374740&r1=374739&r2=374740&view=diff
==============================================================================
--- zorg/trunk/zorg/buildbot/process/factory.py (original)
+++ zorg/trunk/zorg/buildbot/process/factory.py Sun Oct 13 14:39:00 2019
@@ -44,6 +44,10 @@ class LLVMBuildFactory(BuildFactory):
         if kwargs.get('llvm_srcdir', None) is None:
             self.llvm_srcdir = "llvm"

+        # Default build directory.
+        if kwargs.get('obj_dir', None) is None:
+            self.obj_dir = "build"
+

     @staticmethod
     def pathRelativeToBuild(path, buildPath):


From llvm-commits at lists.llvm.org Sun Oct 13 14:44:35 2019
From: llvm-commits at lists.llvm.org (Johannes Doerfert via Phabricator via llvm-commits)
Date: Sun, 13 Oct 2019 21:44:35 +0000 (UTC)
Subject: [PATCH] D67886: NoFree argument attribute.
Message-ID: <2d7af5dd9267a25799ff277d28abd28f@localhost.localdomain>

jdoerfert added a comment.

quick feedback


================
Comment at: llvm/lib/Transforms/IPO/Attributor.cpp:1546
+  AANoFreeCallSiteReturned(const IRPosition &IRP) : AANoFreeFloating(IRP) {}
+
+  /// See AbstractAttribute::trackStatistics()
----------------
jdoerfert wrote:
> overwrite `manifest` here to ensure we do not add "no-free" to the return value even if we derive it. We can actually derive it as we do not need to restrict it to arguments in the `AANoFreeFloating::update`.
This is still open, see my "why arguments" comment above.

Btw. later we can actually add "nofree" to return values (and call site returns) as it can help to keep dereferenceable. But that is for a follow up patch after we get `dereferenceable_globally`


================
Comment at: llvm/lib/Transforms/IPO/Attributor.cpp:1433
+  for (Use &U : Arg->uses())
+    Worklist.push_back(&U);
+
----------------
Why do we need an argument and not just the associated value?


================
Comment at: llvm/lib/Transforms/IPO/Attributor.cpp:1478
+  }
+
+  // Unknown user.
----------------
Allow PHI nodes and selects as well. Same as bitcast above.


================
Comment at: llvm/test/Transforms/FunctionAttrs/nofree-attributor.ll:267
+
+define void @test14(i8* nocapture %0, i8* nocapture %1) {
+  tail call void @free(i8* %0) #1
----------------
ATTRIBUTOR: check line missing

Also replace the CHECK with ATTRIBUTOR


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D67886/new/

https://reviews.llvm.org/D67886


From llvm-commits at lists.llvm.org Sun Oct 13 14:44:36 2019
From: llvm-commits at lists.llvm.org (Johannes Doerfert via Phabricator via llvm-commits)
Date: Sun, 13 Oct 2019 21:44:36 +0000 (UTC)
Subject: [PATCH] D68766: [NFC][ArgPromo][Tests] Run update_test_checks on all ArgumentPromotion tests
Message-ID: <52eaa113e674c965e2cc373372e0f5ed@localhost.localdomain>

jdoerfert updated this revision to Diff 224795.
jdoerfert added a comment.

Rerun with fixed script to use argument variable names in body


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68766/new/

https://reviews.llvm.org/D68766

Files:
  llvm/test/Transforms/ArgumentPromotion/2008-02-01-ReturnAttrs.ll
  llvm/test/Transforms/ArgumentPromotion/2008-07-02-array-indexing.ll
  llvm/test/Transforms/ArgumentPromotion/2008-09-07-CGUpdate.ll
  llvm/test/Transforms/ArgumentPromotion/2008-09-08-CGUpdateSelfEdge.ll
  llvm/test/Transforms/ArgumentPromotion/X86/attributes.ll
  llvm/test/Transforms/ArgumentPromotion/X86/min-legal-vector-width.ll
  llvm/test/Transforms/ArgumentPromotion/X86/thiscall.ll
  llvm/test/Transforms/ArgumentPromotion/aggregate-promote.ll
  llvm/test/Transforms/ArgumentPromotion/attrs.ll
  llvm/test/Transforms/ArgumentPromotion/basictest.ll
  llvm/test/Transforms/ArgumentPromotion/byval-2.ll
  llvm/test/Transforms/ArgumentPromotion/byval.ll
  llvm/test/Transforms/ArgumentPromotion/chained.ll
  llvm/test/Transforms/ArgumentPromotion/control-flow.ll
  llvm/test/Transforms/ArgumentPromotion/control-flow2.ll
  llvm/test/Transforms/ArgumentPromotion/crash.ll
  llvm/test/Transforms/ArgumentPromotion/dbg.ll
  llvm/test/Transforms/ArgumentPromotion/fp80.ll
  llvm/test/Transforms/ArgumentPromotion/inalloca.ll
  llvm/test/Transforms/ArgumentPromotion/invalidation.ll
  llvm/test/Transforms/ArgumentPromotion/musttail.ll
  llvm/test/Transforms/ArgumentPromotion/naked_functions.ll
  llvm/test/Transforms/ArgumentPromotion/nonzero-address-spaces.ll
  llvm/test/Transforms/ArgumentPromotion/pr27568.ll
  llvm/test/Transforms/ArgumentPromotion/pr3085.ll
  llvm/test/Transforms/ArgumentPromotion/pr32917.ll
  llvm/test/Transforms/ArgumentPromotion/pr33641_remove_arg_dbgvalue.ll
  llvm/test/Transforms/ArgumentPromotion/profile.ll
  llvm/test/Transforms/ArgumentPromotion/reserve-tbaa.ll
  llvm/test/Transforms/ArgumentPromotion/sret.ll
  llvm/test/Transforms/ArgumentPromotion/tail.ll
  llvm/test/Transforms/ArgumentPromotion/variadic.ll


From llvm-commits at lists.llvm.org Sun Oct 13 15:10:06 2019
From: llvm-commits at lists.llvm.org (Simon Atanasyan via llvm-commits)
Date: Sun, 13 Oct 2019 22:10:06 -0000
Subject: [llvm] r374741 - merge-request.sh: Update 9.0 metabug for 9.0.1
Message-ID: <20191013221006.CD80084236@lists.llvm.org>

Author: atanasyan
Date: Sun Oct 13 15:10:06 2019
New Revision: 374741

URL: http://llvm.org/viewvc/llvm-project?rev=374741&view=rev
Log:
merge-request.sh: Update 9.0 metabug for 9.0.1

Modified:
    llvm/trunk/utils/release/merge-request.sh

Modified: llvm/trunk/utils/release/merge-request.sh
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/release/merge-request.sh?rev=374741&r1=374740&r2=374741&view=diff
==============================================================================
--- llvm/trunk/utils/release/merge-request.sh (original)
+++ llvm/trunk/utils/release/merge-request.sh Sun Oct 13 15:10:06 2019
@@ -104,7 +104,7 @@ case $stable_version in
         release_metabug="41221"
         ;;
     9.0)
-        release_metabug="42474"
+        release_metabug="43360"
         ;;
     *)
         echo "error: invalid stable version"


From llvm-commits at lists.llvm.org Sun Oct 13 16:00:15 2019
From: llvm-commits at lists.llvm.org (Joerg Sonnenberger via llvm-commits)
Date: Sun, 13 Oct 2019 23:00:15 -0000
Subject: [llvm] r374743 - Add a pass to lower is.constant and objectsize intrinsics
Message-ID: <20191013230016.36324877F4@lists.llvm.org>

Author: joerg
Date: Sun Oct 13 16:00:15 2019
New Revision: 374743

URL: http://llvm.org/viewvc/llvm-project?rev=374743&view=rev
Log:
Add a pass to lower is.constant and objectsize intrinsics

This pass lowers is.constant and objectsize intrinsics not simplified by
earlier constant folding, i.e. if the object given is not constant or if
not using the optimized pass chain. The result is recursively simplified
and constant conditionals are pruned, so that dead blocks are removed
even for -O0. This allows inline asm blocks with operand constraints to
work all the time.

The new pass replaces the existing lowering in the codegen-prepare pass
and fallbacks in SDAG/GlobalISEL and FastISel. The latter now assert
on the intrinsics.
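
For readers less familiar with these intrinsics, here is a small source-level sketch of where they usually come from. In Clang, `__builtin_constant_p` and `__builtin_object_size` are emitted as `llvm.is.constant` and `llvm.objectsize` calls when the front end cannot fold them itself. The functions `scaleBy` and `remainingBytes` are hypothetical examples and are not taken from this commit.

```cpp
#include <cstddef>

// Fast path guarded by __builtin_constant_p. If the compiler cannot prove
// that Factor is a constant, the guard evaluates to false and only the
// generic multiplication remains; both paths compute the same value.
inline unsigned scaleBy(unsigned X, unsigned Factor) {
  if (__builtin_constant_p(Factor) && Factor == 2)
    return X << 1;
  return X * Factor;
}

// Type 0 asks for the maximum number of bytes reachable from P.
// When the size is unknown, the builtin yields (size_t)-1.
inline std::size_t remainingBytes(const char *P) {
  return __builtin_object_size(P, 0);
}
```

The guarded branch in `scaleBy` is the kind of constant conditional the log refers to: once the guard has been replaced by a constant, the branch that can no longer be taken can be pruned. Whether `remainingBytes` reports the real buffer size or the "unknown" value depends on the compiler and optimization level.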

Differential Revision: https://reviews.llvm.org/D65280

Added:
    llvm/trunk/include/llvm/Transforms/Scalar/LowerConstantIntrinsics.h
    llvm/trunk/lib/Transforms/Scalar/LowerConstantIntrinsics.cpp
    llvm/trunk/test/Transforms/LowerConstantIntrinsics/
    llvm/trunk/test/Transforms/LowerConstantIntrinsics/constant-intrinsics.ll
      - copied, changed from r374742, llvm/trunk/test/CodeGen/X86/is-constant.ll
    llvm/trunk/test/Transforms/LowerConstantIntrinsics/crash-on-large-allocas.ll
      - copied, changed from r374742, llvm/trunk/test/Transforms/CodeGenPrepare/crash-on-large-allocas.ll
    llvm/trunk/test/Transforms/LowerConstantIntrinsics/objectsize_basic.ll
      - copied, changed from r374742, llvm/trunk/test/Transforms/CodeGenPrepare/basic.ll
Removed:
    llvm/trunk/test/CodeGen/Generic/is-constant.ll
    llvm/trunk/test/CodeGen/X86/is-constant.ll
    llvm/trunk/test/CodeGen/X86/object-size.ll
    llvm/trunk/test/Transforms/CodeGenPrepare/basic.ll
    llvm/trunk/test/Transforms/CodeGenPrepare/builtin-condition.ll
    llvm/trunk/test/Transforms/CodeGenPrepare/crash-on-large-allocas.ll
Modified:
    llvm/trunk/bindings/ocaml/transforms/scalar_opts/llvm_scalar_opts.mli
    llvm/trunk/bindings/ocaml/transforms/scalar_opts/scalar_opts_ocaml.c
    llvm/trunk/include/llvm-c/Transforms/Scalar.h
    llvm/trunk/include/llvm/InitializePasses.h
    llvm/trunk/include/llvm/LinkAllPasses.h
    llvm/trunk/include/llvm/Transforms/Scalar.h
    llvm/trunk/lib/CodeGen/CodeGenPrepare.cpp
    llvm/trunk/lib/CodeGen/GlobalISel/IRTranslator.cpp
    llvm/trunk/lib/CodeGen/SelectionDAG/FastISel.cpp
    llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
    llvm/trunk/lib/CodeGen/TargetPassConfig.cpp
    llvm/trunk/lib/Passes/PassBuilder.cpp
    llvm/trunk/lib/Passes/PassRegistry.def
    llvm/trunk/lib/Transforms/IPO/PassManagerBuilder.cpp
    llvm/trunk/lib/Transforms/Scalar/CMakeLists.txt
    llvm/trunk/lib/Transforms/Scalar/Scalar.cpp
    llvm/trunk/test/CodeGen/AArch64/GlobalISel/arm64-irtranslator.ll
    llvm/trunk/test/CodeGen/AArch64/O0-pipeline.ll
    llvm/trunk/test/CodeGen/AArch64/O3-pipeline.ll
    llvm/trunk/test/CodeGen/ARM/O3-pipeline.ll
    llvm/trunk/test/CodeGen/X86/O0-pipeline.ll
    llvm/trunk/test/CodeGen/X86/O3-pipeline.ll
    llvm/trunk/test/Other/new-pm-defaults.ll
    llvm/trunk/test/Other/new-pm-thinlto-defaults.ll
    llvm/trunk/test/Other/opt-O2-pipeline.ll
    llvm/trunk/test/Other/opt-O3-pipeline.ll
    llvm/trunk/test/Other/opt-Os-pipeline.ll
    llvm/trunk/test/Transforms/CodeGenPrepare/X86/overflow-intrinsics.ll
    llvm/trunk/utils/gn/secondary/llvm/lib/Transforms/Scalar/BUILD.gn

Modified: llvm/trunk/bindings/ocaml/transforms/scalar_opts/llvm_scalar_opts.mli
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/bindings/ocaml/transforms/scalar_opts/llvm_scalar_opts.mli?rev=374743&r1=374742&r2=374743&view=diff
==============================================================================
--- llvm/trunk/bindings/ocaml/transforms/scalar_opts/llvm_scalar_opts.mli (original)
+++ llvm/trunk/bindings/ocaml/transforms/scalar_opts/llvm_scalar_opts.mli Sun Oct 13 16:00:15 2019
@@ -191,6 +191,11 @@ external add_lower_expect_intrinsic
   : [< Llvm.PassManager.any ] Llvm.PassManager.t -> unit
   = "llvm_add_lower_expect_intrinsic"

+(** See the [llvm::createLowerConstantIntrinsicsPass] function. *)
+external add_lower_constant_intrinsics
+  : [< Llvm.PassManager.any ] Llvm.PassManager.t -> unit
+  = "llvm_add_lower_constant_intrinsics"
+
 (** See the [llvm::createTypeBasedAliasAnalysisPass] function. *)
 external add_type_based_alias_analysis
   : [< Llvm.PassManager.any ] Llvm.PassManager.t -> unit

Modified: llvm/trunk/bindings/ocaml/transforms/scalar_opts/scalar_opts_ocaml.c
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/bindings/ocaml/transforms/scalar_opts/scalar_opts_ocaml.c?rev=374743&r1=374742&r2=374743&view=diff
==============================================================================
--- llvm/trunk/bindings/ocaml/transforms/scalar_opts/scalar_opts_ocaml.c (original)
+++ llvm/trunk/bindings/ocaml/transforms/scalar_opts/scalar_opts_ocaml.c Sun Oct 13 16:00:15 2019
@@ -237,6 +237,12 @@ CAMLprim value llvm_add_lower_expect_int
 }

 /* [<Llvm.PassManager.any] Llvm.PassManager.t -> unit */
+CAMLprim value llvm_add_lower_constant_intrinsics(LLVMPassManagerRef PM) {
+  LLVMAddLowerConstantIntrinsicsPass(PM);
+  return Val_unit;
+}
+
+/* [<Llvm.PassManager.any] Llvm.PassManager.t -> unit */
 CAMLprim value llvm_add_type_based_alias_analysis(LLVMPassManagerRef PM) {
   LLVMAddTypeBasedAliasAnalysisPass(PM);
   return Val_unit;

Modified: llvm/trunk/include/llvm-c/Transforms/Scalar.h
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm-c/Transforms/Scalar.h?rev=374743&r1=374742&r2=374743&view=diff
==============================================================================
--- llvm/trunk/include/llvm-c/Transforms/Scalar.h (original)
+++ llvm/trunk/include/llvm-c/Transforms/Scalar.h Sun Oct 13 16:00:15 2019
@@ -147,6 +147,9 @@ void LLVMAddEarlyCSEMemSSAPass(LLVMPassM
 /** See llvm::createLowerExpectIntrinsicPass function */
 void LLVMAddLowerExpectIntrinsicPass(LLVMPassManagerRef PM);

+/** See llvm::createLowerConstantIntrinsicsPass function */
+void LLVMAddLowerConstantIntrinsicsPass(LLVMPassManagerRef PM);
+
 /** See llvm::createTypeBasedAliasAnalysisPass function */
 void LLVMAddTypeBasedAliasAnalysisPass(LLVMPassManagerRef PM);


Modified: llvm/trunk/include/llvm/InitializePasses.h
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/InitializePasses.h?rev=374743&r1=374742&r2=374743&view=diff
============================================================================== --- llvm/trunk/include/llvm/InitializePasses.h (original) +++ llvm/trunk/include/llvm/InitializePasses.h Sun Oct 13 16:00:15 2019 @@ -243,6 +243,7 @@ void initializeLoopVectorizePass(PassReg void initializeLoopVersioningLICMPass(PassRegistry&); void initializeLoopVersioningPassPass(PassRegistry&); void initializeLowerAtomicLegacyPassPass(PassRegistry&); +void initializeLowerConstantIntrinsicsPass(PassRegistry&); void initializeLowerEmuTLSPass(PassRegistry&); void initializeLowerExpectIntrinsicPass(PassRegistry&); void initializeLowerGuardIntrinsicLegacyPassPass(PassRegistry&); Modified: llvm/trunk/include/llvm/LinkAllPasses.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/LinkAllPasses.h?rev=374743&r1=374742&r2=374743&view=diff ============================================================================== --- llvm/trunk/include/llvm/LinkAllPasses.h (original) +++ llvm/trunk/include/llvm/LinkAllPasses.h Sun Oct 13 16:00:15 2019 @@ -140,6 +140,7 @@ namespace { (void) llvm::createLoopVersioningLICMPass(); (void) llvm::createLoopIdiomPass(); (void) llvm::createLoopRotatePass(); + (void) llvm::createLowerConstantIntrinsicsPass(); (void) llvm::createLowerExpectIntrinsicPass(); (void) llvm::createLowerInvokePass(); (void) llvm::createLowerSwitchPass(); Modified: llvm/trunk/include/llvm/Transforms/Scalar.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Transforms/Scalar.h?rev=374743&r1=374742&r2=374743&view=diff ============================================================================== --- llvm/trunk/include/llvm/Transforms/Scalar.h (original) +++ llvm/trunk/include/llvm/Transforms/Scalar.h Sun Oct 13 16:00:15 2019 @@ -397,6 +397,13 @@ FunctionPass *createLowerExpectIntrinsic //===----------------------------------------------------------------------===// // +// LowerConstantIntrinsicss - Expand any remaining llvm.objectsize and +// llvm.is.constant 
intrinsic calls, even for the unknown cases. +// +FunctionPass *createLowerConstantIntrinsicsPass(); + +//===----------------------------------------------------------------------===// +// // PartiallyInlineLibCalls - Tries to inline the fast path of library // calls such as sqrt. // Added: llvm/trunk/include/llvm/Transforms/Scalar/LowerConstantIntrinsics.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Transforms/Scalar/LowerConstantIntrinsics.h?rev=374743&view=auto ============================================================================== --- llvm/trunk/include/llvm/Transforms/Scalar/LowerConstantIntrinsics.h (added) +++ llvm/trunk/include/llvm/Transforms/Scalar/LowerConstantIntrinsics.h Sun Oct 13 16:00:15 2019 @@ -0,0 +1,41 @@ +//===- LowerConstantIntrinsics.h - Lower constant int. pass -*- C++ -*-========// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +/// \file +/// +/// The header file for the LowerConstantIntrinsics pass as used by the new pass +/// manager. +/// +//===----------------------------------------------------------------------===// + +#ifndef LLVM_TRANSFORMS_SCALAR_LOWERCONSTANTINTRINSICS_H +#define LLVM_TRANSFORMS_SCALAR_LOWERCONSTANTINTRINSICS_H + +#include "llvm/IR/Function.h" +#include "llvm/IR/PassManager.h" + +namespace llvm { + +struct LowerConstantIntrinsicsPass : + PassInfoMixin<LowerConstantIntrinsicsPass> { +public: + explicit LowerConstantIntrinsicsPass() {} + + /// Run the pass over the function. + /// + /// This will lower all remaining 'objectsize' and 'is.constant'` + /// intrinsic calls in this function, even when the argument has no known + /// size or is not a constant respectively. The resulting constant is + /// propagated and conditional branches are resolved where possible.
+ /// This complements the Instruction Simplification and + /// Instruction Combination passes of the optimized pass chain. + PreservedAnalyses run(Function &F, FunctionAnalysisManager &); +}; + +} + +#endif Modified: llvm/trunk/lib/CodeGen/CodeGenPrepare.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/CodeGenPrepare.cpp?rev=374743&r1=374742&r2=374743&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/CodeGenPrepare.cpp (original) +++ llvm/trunk/lib/CodeGen/CodeGenPrepare.cpp Sun Oct 13 16:00:15 2019 @@ -1868,24 +1868,10 @@ bool CodeGenPrepare::optimizeCallInst(Ca }); return true; } - case Intrinsic::objectsize: { - // Lower all uses of llvm.objectsize.* - Value *RetVal = - lowerObjectSizeCall(II, *DL, TLInfo, /*MustSucceed=*/true); - - resetIteratorIfInvalidatedWhileCalling(BB, [&]() { - replaceAndRecursivelySimplify(CI, RetVal, TLInfo, nullptr); - }); - return true; - } - case Intrinsic::is_constant: { - // If is_constant hasn't folded away yet, lower it to false now. 
- Constant *RetVal = ConstantInt::get(II->getType(), 0); - resetIteratorIfInvalidatedWhileCalling(BB, [&]() { - replaceAndRecursivelySimplify(CI, RetVal, TLInfo, nullptr); - }); - return true; - } + case Intrinsic::objectsize: + llvm_unreachable("llvm.objectsize.* should have been lowered already"); + case Intrinsic::is_constant: + llvm_unreachable("llvm.is.constant.* should have been lowered already"); case Intrinsic::aarch64_stlxr: case Intrinsic::aarch64_stxr: { ZExtInst *ExtVal = dyn_cast<ZExtInst>(CI->getArgOperand(0)); Modified: llvm/trunk/lib/CodeGen/GlobalISel/IRTranslator.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/GlobalISel/IRTranslator.cpp?rev=374743&r1=374742&r2=374743&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/GlobalISel/IRTranslator.cpp (original) +++ llvm/trunk/lib/CodeGen/GlobalISel/IRTranslator.cpp Sun Oct 13 16:00:15 2019 @@ -1437,18 +1437,12 @@ bool IRTranslator::translateKnownIntrins MIRBuilder.buildConstant(Reg, TypeID); return true; } - case Intrinsic::objectsize: { - // If we don't know by now, we're never going to know. - const ConstantInt *Min = cast<ConstantInt>(CI.getArgOperand(1)); + case Intrinsic::objectsize: + llvm_unreachable("llvm.objectsize.* should have been lowered already"); - MIRBuilder.buildConstant(getOrCreateVReg(CI), Min->isZero() ? -1ULL : 0); - return true; - } case Intrinsic::is_constant: - // If this wasn't constant-folded away by now, then it's not a - // constant.
- MIRBuilder.buildConstant(getOrCreateVReg(CI), 0); - return true; + llvm_unreachable("llvm.is.constant.* should have been lowered already"); + case Intrinsic::stackguard: getStackGuard(getOrCreateVReg(CI), MIRBuilder); return true; Modified: llvm/trunk/lib/CodeGen/SelectionDAG/FastISel.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/FastISel.cpp?rev=374743&r1=374742&r2=374743&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/SelectionDAG/FastISel.cpp (original) +++ llvm/trunk/lib/CodeGen/SelectionDAG/FastISel.cpp Sun Oct 13 16:00:15 2019 @@ -1454,24 +1454,12 @@ bool FastISel::selectIntrinsicCall(const TII.get(TargetOpcode::DBG_LABEL)).addMetadata(DI->getLabel()); return true; } - case Intrinsic::objectsize: { - ConstantInt *CI = cast<ConstantInt>(II->getArgOperand(1)); - unsigned long long Res = CI->isZero() ? -1ULL : 0; - Constant *ResCI = ConstantInt::get(II->getType(), Res); - unsigned ResultReg = getRegForValue(ResCI); - if (!ResultReg) - return false; - updateValueMap(II, ResultReg); - return true; - } - case Intrinsic::is_constant: { - Constant *ResCI = ConstantInt::get(II->getType(), 0); - unsigned ResultReg = getRegForValue(ResCI); - if (!ResultReg) - return false; - updateValueMap(II, ResultReg); - return true; - } + case Intrinsic::objectsize: + llvm_unreachable("llvm.objectsize.* should have been lowered already"); + + case Intrinsic::is_constant: + llvm_unreachable("llvm.is.constant.* should have been lowered already"); + case Intrinsic::launder_invariant_group: case Intrinsic::strip_invariant_group: case Intrinsic::expect: { Modified: llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp?rev=374743&r1=374742&r2=374743&view=diff ============================================================================== ---
llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp (original) +++ llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp Sun Oct 13 16:00:15 2019 @@ -6388,29 +6388,11 @@ void SelectionDAGBuilder::visitIntrinsic DAG.setRoot(Res); return; } - case Intrinsic::objectsize: { - // If we don't know by now, we're never going to know. - ConstantInt *CI = dyn_cast<ConstantInt>(I.getArgOperand(1)); - - assert(CI && "Non-constant type in __builtin_object_size?"); - - SDValue Arg = getValue(I.getCalledValue()); - EVT Ty = Arg.getValueType(); - - if (CI->isZero()) - Res = DAG.getConstant(-1ULL, sdl, Ty); - else - Res = DAG.getConstant(0, sdl, Ty); - - setValue(&I, Res); - return; - } + case Intrinsic::objectsize: + llvm_unreachable("llvm.objectsize.* should have been lowered already"); case Intrinsic::is_constant: - // If this wasn't constant-folded away by now, then it's not a - // constant. - setValue(&I, DAG.getConstant(0, sdl, MVT::i1)); - return; + llvm_unreachable("llvm.is.constant.* should have been lowered already"); case Intrinsic::annotation: case Intrinsic::ptr_annotation: Modified: llvm/trunk/lib/CodeGen/TargetPassConfig.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/TargetPassConfig.cpp?rev=374743&r1=374742&r2=374743&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/TargetPassConfig.cpp (original) +++ llvm/trunk/lib/CodeGen/TargetPassConfig.cpp Sun Oct 13 16:00:15 2019 @@ -657,6 +657,7 @@ void TargetPassConfig::addIRPasses() { // TODO: add a pass insertion point here addPass(createGCLoweringPass()); addPass(createShadowStackGCLoweringPass()); + addPass(createLowerConstantIntrinsicsPass()); // Make sure that no unreachable blocks are instruction selected.
addPass(createUnreachableBlockEliminationPass()); Modified: llvm/trunk/lib/Passes/PassBuilder.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Passes/PassBuilder.cpp?rev=374743&r1=374742&r2=374743&view=diff ============================================================================== --- llvm/trunk/lib/Passes/PassBuilder.cpp (original) +++ llvm/trunk/lib/Passes/PassBuilder.cpp Sun Oct 13 16:00:15 2019 @@ -142,6 +142,7 @@ #include "llvm/Transforms/Scalar/LoopUnrollAndJamPass.h" #include "llvm/Transforms/Scalar/LoopUnrollPass.h" #include "llvm/Transforms/Scalar/LowerAtomic.h" +#include "llvm/Transforms/Scalar/LowerConstantIntrinsics.h" #include "llvm/Transforms/Scalar/LowerExpectIntrinsic.h" #include "llvm/Transforms/Scalar/LowerGuardIntrinsic.h" #include "llvm/Transforms/Scalar/LowerWidenableCondition.h" @@ -891,6 +892,8 @@ ModulePassManager PassBuilder::buildModu FunctionPassManager OptimizePM(DebugLogging); OptimizePM.addPass(Float2IntPass()); + OptimizePM.addPass(LowerConstantIntrinsicsPass()); + // FIXME: We need to run some loop optimizations to re-rotate loops after // simplify-cfg and others undo their rotation. 
Modified: llvm/trunk/lib/Passes/PassRegistry.def URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Passes/PassRegistry.def?rev=374743&r1=374742&r2=374743&view=diff ============================================================================== --- llvm/trunk/lib/Passes/PassRegistry.def (original) +++ llvm/trunk/lib/Passes/PassRegistry.def Sun Oct 13 16:00:15 2019 @@ -187,6 +187,7 @@ FUNCTION_PASS("libcalls-shrinkwrap", Lib FUNCTION_PASS("loweratomic", LowerAtomicPass()) FUNCTION_PASS("lower-expect", LowerExpectIntrinsicPass()) FUNCTION_PASS("lower-guard-intrinsic", LowerGuardIntrinsicPass()) +FUNCTION_PASS("lower-constant-intrinsics", LowerConstantIntrinsicsPass()) FUNCTION_PASS("lower-widenable-condition", LowerWidenableConditionPass()) FUNCTION_PASS("guard-widening", GuardWideningPass()) FUNCTION_PASS("gvn", GVN()) Modified: llvm/trunk/lib/Transforms/IPO/PassManagerBuilder.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/IPO/PassManagerBuilder.cpp?rev=374743&r1=374742&r2=374743&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/IPO/PassManagerBuilder.cpp (original) +++ llvm/trunk/lib/Transforms/IPO/PassManagerBuilder.cpp Sun Oct 13 16:00:15 2019 @@ -654,6 +654,7 @@ void PassManagerBuilder::populateModuleP MPM.add(createGlobalsAAWrapperPass()); MPM.add(createFloat2IntPass()); + MPM.add(createLowerConstantIntrinsicsPass()); addExtensionsToPM(EP_VectorizerStart, MPM); Modified: llvm/trunk/lib/Transforms/Scalar/CMakeLists.txt URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/CMakeLists.txt?rev=374743&r1=374742&r2=374743&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Scalar/CMakeLists.txt (original) +++ llvm/trunk/lib/Transforms/Scalar/CMakeLists.txt Sun Oct 13 16:00:15 2019 @@ -44,6 +44,7 @@ add_llvm_library(LLVMScalarOpts LoopUnswitch.cpp LoopVersioningLICM.cpp LowerAtomic.cpp + 
LowerConstantIntrinsics.cpp LowerExpectIntrinsic.cpp LowerGuardIntrinsic.cpp LowerWidenableCondition.cpp Added: llvm/trunk/lib/Transforms/Scalar/LowerConstantIntrinsics.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/LowerConstantIntrinsics.cpp?rev=374743&view=auto ============================================================================== --- llvm/trunk/lib/Transforms/Scalar/LowerConstantIntrinsics.cpp (added) +++ llvm/trunk/lib/Transforms/Scalar/LowerConstantIntrinsics.cpp Sun Oct 13 16:00:15 2019 @@ -0,0 +1,170 @@ +//===- LowerConstantIntrinsics.cpp - Lower constant intrinsic calls -------===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// +// This pass lowers all remaining 'objectsize' 'is.constant' intrinsic calls +// and provides constant propagation and basic CFG cleanup on the result. 
+// +//===----------------------------------------------------------------------===// + +#include "llvm/Transforms/Scalar/LowerConstantIntrinsics.h" +#include "llvm/ADT/PostOrderIterator.h" +#include "llvm/ADT/Statistic.h" +#include "llvm/Analysis/InstructionSimplify.h" +#include "llvm/Analysis/MemoryBuiltins.h" +#include "llvm/Analysis/TargetLibraryInfo.h" +#include "llvm/IR/BasicBlock.h" +#include "llvm/IR/Constants.h" +#include "llvm/IR/Function.h" +#include "llvm/IR/Instructions.h" +#include "llvm/IR/IntrinsicInst.h" +#include "llvm/IR/Intrinsics.h" +#include "llvm/IR/PatternMatch.h" +#include "llvm/Pass.h" +#include "llvm/Support/Debug.h" +#include "llvm/Transforms/Scalar.h" +#include "llvm/Transforms/Utils/Local.h" + +using namespace llvm; +using namespace llvm::PatternMatch; + +#define DEBUG_TYPE "lower-is-constant-intrinsic" + +STATISTIC(IsConstantIntrinsicsHandled, + "Number of 'is.constant' intrinsic calls handled"); +STATISTIC(ObjectSizeIntrinsicsHandled, + "Number of 'objectsize' intrinsic calls handled"); + +static Value *lowerIsConstantIntrinsic(IntrinsicInst *II) { + Value *Op = II->getOperand(0); + + return isa<Constant>(Op) ?
ConstantInt::getTrue(II->getType()) + : ConstantInt::getFalse(II->getType()); +} + +static bool replaceConditionalBranchesOnConstant(Instruction *II, + Value *NewValue) { + bool HasDeadBlocks = false; + SmallSetVector<Instruction *, 8> Worklist; + replaceAndRecursivelySimplify(II, NewValue, nullptr, nullptr, nullptr, + &Worklist); + for (auto I : Worklist) { + BranchInst *BI = dyn_cast<BranchInst>(I); + if (!BI) + continue; + if (BI->isUnconditional()) + continue; + + BasicBlock *Target, *Other; + if (match(BI->getOperand(0), m_Zero())) { + Target = BI->getSuccessor(1); + Other = BI->getSuccessor(0); + } else if (match(BI->getOperand(0), m_One())) { + Target = BI->getSuccessor(0); + Other = BI->getSuccessor(1); + } else { + Target = nullptr; + Other = nullptr; + } + if (Target && Target != Other) { + BasicBlock *Source = BI->getParent(); + Other->removePredecessor(Source); + BI->eraseFromParent(); + BranchInst::Create(Target, Source); + if (pred_begin(Other) == pred_end(Other)) + HasDeadBlocks = true; + } + } + return HasDeadBlocks; +} + +static bool lowerConstantIntrinsics(Function &F, const TargetLibraryInfo *TLI) { + bool HasDeadBlocks = false; + const auto &DL = F.getParent()->getDataLayout(); + SmallVector<WeakTrackingVH, 8> Worklist; + + ReversePostOrderTraversal<Function *> RPOT(&F); + for (BasicBlock *BB : RPOT) { + for (Instruction &I: *BB) { + IntrinsicInst *II = dyn_cast<IntrinsicInst>(&I); + if (!II) + continue; + switch (II->getIntrinsicID()) { + default: + break; + case Intrinsic::is_constant: + case Intrinsic::objectsize: + Worklist.push_back(WeakTrackingVH(&I)); + break; + } + } + } + for (WeakTrackingVH &VH: Worklist) { + // Items on the worklist can be mutated by earlier recursive replaces. + // This can remove the intrinsic as dead (VH == null), but also replace + // the intrinsic in place.
+ if (!VH) + continue; + IntrinsicInst *II = dyn_cast<IntrinsicInst>(&*VH); + if (!II) + continue; + Value *NewValue; + switch (II->getIntrinsicID()) { + default: + continue; + case Intrinsic::is_constant: + NewValue = lowerIsConstantIntrinsic(II); + IsConstantIntrinsicsHandled++; + break; + case Intrinsic::objectsize: + NewValue = lowerObjectSizeCall(II, DL, TLI, true); + ObjectSizeIntrinsicsHandled++; + break; + } + HasDeadBlocks |= replaceConditionalBranchesOnConstant(II, NewValue); + } + if (HasDeadBlocks) + removeUnreachableBlocks(F); + return !Worklist.empty(); +} + +PreservedAnalyses +LowerConstantIntrinsicsPass::run(Function &F, FunctionAnalysisManager &AM) { + if (lowerConstantIntrinsics(F, AM.getCachedResult<TargetLibraryAnalysis>(F))) + return PreservedAnalyses::none(); + + return PreservedAnalyses::all(); +} + +namespace { +/// Legacy pass for lowering is.constant intrinsics out of the IR. +/// +/// When this pass is run over a function it converts is.constant intrinsics +/// into 'true' or 'false'. This is completements the normal constand folding +/// to 'true' as part of Instruction Simplify passes. +class LowerConstantIntrinsics : public FunctionPass { +public: + static char ID; + LowerConstantIntrinsics() : FunctionPass(ID) { + initializeLowerConstantIntrinsicsPass(*PassRegistry::getPassRegistry()); + } + + bool runOnFunction(Function &F) override { + auto *TLIP = getAnalysisIfAvailable<TargetLibraryInfoWrapperPass>(); + const TargetLibraryInfo *TLI = TLIP ?
&TLIP->getTLI(F) : nullptr; + return lowerConstantIntrinsics(F, TLI); + } +}; +} // namespace + +char LowerConstantIntrinsics::ID = 0; +INITIALIZE_PASS(LowerConstantIntrinsics, "lower-constant-intrinsics", + "Lower constant intrinsics", false, false) + +FunctionPass *llvm::createLowerConstantIntrinsicsPass() { + return new LowerConstantIntrinsics(); +} Modified: llvm/trunk/lib/Transforms/Scalar/Scalar.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/Scalar.cpp?rev=374743&r1=374742&r2=374743&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Scalar/Scalar.cpp (original) +++ llvm/trunk/lib/Transforms/Scalar/Scalar.cpp Sun Oct 13 16:00:15 2019 @@ -79,6 +79,7 @@ void llvm::initializeScalarOpts(PassRegi initializeLoopVersioningLICMPass(Registry); initializeLoopIdiomRecognizeLegacyPassPass(Registry); initializeLowerAtomicLegacyPassPass(Registry); + initializeLowerConstantIntrinsicsPass(Registry); initializeLowerExpectIntrinsicPass(Registry); initializeLowerGuardIntrinsicLegacyPassPass(Registry); initializeLowerWidenableConditionLegacyPassPass(Registry); @@ -284,6 +285,10 @@ void LLVMAddBasicAliasAnalysisPass(LLVMP unwrap(PM)->add(createBasicAAWrapperPass()); } +void LLVMAddLowerConstantIntrinsicsPass(LLVMPassManagerRef PM) { + unwrap(PM)->add(createLowerConstantIntrinsicsPass()); +} + void LLVMAddLowerExpectIntrinsicPass(LLVMPassManagerRef PM) { unwrap(PM)->add(createLowerExpectIntrinsicPass()); } Modified: llvm/trunk/test/CodeGen/AArch64/GlobalISel/arm64-irtranslator.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AArch64/GlobalISel/arm64-irtranslator.ll?rev=374743&r1=374742&r2=374743&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/AArch64/GlobalISel/arm64-irtranslator.ll (original) +++ llvm/trunk/test/CodeGen/AArch64/GlobalISel/arm64-irtranslator.ll Sun Oct 13 16:00:15 2019 @@ -1183,23 
+1183,6 @@ define void @test_memset(i8* %dst, i8 %v ret void } -declare i64 @llvm.objectsize.i64(i8*, i1) -declare i32 @llvm.objectsize.i32(i8*, i1) -define void @test_objectsize(i8* %addr0, i8* %addr1) { -; CHECK-LABEL: name: test_objectsize -; CHECK: [[ADDR0:%[0-9]+]]:_(p0) = COPY $x0 -; CHECK: [[ADDR1:%[0-9]+]]:_(p0) = COPY $x1 -; CHECK: {{%[0-9]+}}:_(s64) = G_CONSTANT i64 -1 -; CHECK: {{%[0-9]+}}:_(s64) = G_CONSTANT i64 0 -; CHECK: {{%[0-9]+}}:_(s32) = G_CONSTANT i32 -1 -; CHECK: {{%[0-9]+}}:_(s32) = G_CONSTANT i32 0 - %size64.0 = call i64 @llvm.objectsize.i64(i8* %addr0, i1 0) - %size64.intmin = call i64 @llvm.objectsize.i64(i8* %addr0, i1 1) - %size32.0 = call i32 @llvm.objectsize.i32(i8* %addr0, i1 0) - %size32.intmin = call i32 @llvm.objectsize.i32(i8* %addr0, i1 1) - ret void -} - define void @test_large_const(i128* %addr) { ; CHECK-LABEL: name: test_large_const ; CHECK: [[ADDR:%[0-9]+]]:_(p0) = COPY $x0 Modified: llvm/trunk/test/CodeGen/AArch64/O0-pipeline.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AArch64/O0-pipeline.ll?rev=374743&r1=374742&r2=374743&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/AArch64/O0-pipeline.ll (original) +++ llvm/trunk/test/CodeGen/AArch64/O0-pipeline.ll Sun Oct 13 16:00:15 2019 @@ -21,6 +21,7 @@ ; CHECK-NEXT: Module Verifier ; CHECK-NEXT: Lower Garbage Collection Instructions ; CHECK-NEXT: Shadow Stack GC Lowering +; CHECK-NEXT: Lower constant intrinsics ; CHECK-NEXT: Remove unreachable blocks from the CFG ; CHECK-NEXT: Instrument function entry/exit with calls to e.g. 
mcount() (post inlining) ; CHECK-NEXT: Scalarize Masked Memory Intrinsics Modified: llvm/trunk/test/CodeGen/AArch64/O3-pipeline.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AArch64/O3-pipeline.ll?rev=374743&r1=374742&r2=374743&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/AArch64/O3-pipeline.ll (original) +++ llvm/trunk/test/CodeGen/AArch64/O3-pipeline.ll Sun Oct 13 16:00:15 2019 @@ -38,6 +38,7 @@ ; CHECK-NEXT: Expand memcmp() to load/stores ; CHECK-NEXT: Lower Garbage Collection Instructions ; CHECK-NEXT: Shadow Stack GC Lowering +; CHECK-NEXT: Lower constant intrinsics ; CHECK-NEXT: Remove unreachable blocks from the CFG ; CHECK-NEXT: Dominator Tree Construction ; CHECK-NEXT: Natural Loop Information Modified: llvm/trunk/test/CodeGen/ARM/O3-pipeline.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/ARM/O3-pipeline.ll?rev=374743&r1=374742&r2=374743&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/ARM/O3-pipeline.ll (original) +++ llvm/trunk/test/CodeGen/ARM/O3-pipeline.ll Sun Oct 13 16:00:15 2019 @@ -22,6 +22,7 @@ ; CHECK-NEXT: Expand memcmp() to load/stores ; CHECK-NEXT: Lower Garbage Collection Instructions ; CHECK-NEXT: Shadow Stack GC Lowering +; CHECK-NEXT: Lower constant intrinsics ; CHECK-NEXT: Remove unreachable blocks from the CFG ; CHECK-NEXT: Dominator Tree Construction ; CHECK-NEXT: Natural Loop Information Removed: llvm/trunk/test/CodeGen/Generic/is-constant.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/Generic/is-constant.ll?rev=374742&view=auto ============================================================================== --- llvm/trunk/test/CodeGen/Generic/is-constant.ll (original) +++ llvm/trunk/test/CodeGen/Generic/is-constant.ll (removed) @@ -1,114 +0,0 @@ -; RUN: opt -O2 -S < %s | FileCheck %s -; RUN: llc -o /dev/null 2>&1 < %s -; RUN: llc -O0 -o 
/dev/null 2>&1 < %s - -;; The llc runs above are just to ensure it doesn't blow up upon -;; seeing an is_constant intrinsic. - -declare i1 @llvm.is.constant.i32(i32 %a) -declare i1 @llvm.is.constant.i64(i64 %a) -declare i1 @llvm.is.constant.i256(i256 %a) -declare i1 @llvm.is.constant.v2i64(<2 x i64> %a) -declare i1 @llvm.is.constant.f32(float %a) -declare i1 @llvm.is.constant.sl_i32i32s({i32, i32} %a) -declare i1 @llvm.is.constant.a2i64([2 x i64] %a) -declare i1 @llvm.is.constant.p0i64(i64* %a) - -;; Basic test that optimization folds away the is.constant when given -;; a constant. -define i1 @test_constant() #0 { -; CHECK-LABEL: @test_constant( -; CHECK-NOT: llvm.is.constant -; CHECK: ret i1 true -%y = call i1 @llvm.is.constant.i32(i32 44) - ret i1 %y -} - -;; And test that the intrinsic sticks around when given a -;; non-constant. -define i1 @test_nonconstant(i32 %x) #0 { -; CHECK-LABEL: @test_nonconstant( -; CHECK: @llvm.is.constant - %y = call i1 @llvm.is.constant.i32(i32 %x) - ret i1 %y -} - -;; Ensure that nested is.constants fold. 
-define i32 @test_nested() #0 { -; CHECK-LABEL: @test_nested( -; CHECK-NOT: llvm.is.constant -; CHECK: ret i32 13 - %val1 = call i1 @llvm.is.constant.i32(i32 27) - %val2 = zext i1 %val1 to i32 - %val3 = add i32 %val2, 12 - %1 = call i1 @llvm.is.constant.i32(i32 %val3) - %2 = zext i1 %1 to i32 - %3 = add i32 %2, 12 - ret i32 %3 -} - - at G = global [2 x i64] zeroinitializer -define i1 @test_global() #0 { -; CHECK-LABEL: @test_global( -; CHECK: llvm.is.constant - %ret = call i1 @llvm.is.constant.p0i64(i64* getelementptr ([2 x i64], [2 x i64]* @G, i32 0, i32 0)) - ret i1 %ret -} - -define i1 @test_diff() #0 { -; CHECK-LABEL: @test_diff( - %ret = call i1 @llvm.is.constant.i64(i64 sub ( - i64 ptrtoint (i64* getelementptr inbounds ([2 x i64], [2 x i64]* @G, i64 0, i64 1) to i64), - i64 ptrtoint ([2 x i64]* @G to i64))) - ret i1 %ret -} - -define i1 @test_various_types(i256 %int, float %float, <2 x i64> %vec, {i32, i32} %struct, [2 x i64] %arr, i64* %ptr) #0 { -; CHECK-LABEL: @test_various_types( -; CHECK: llvm.is.constant -; CHECK: llvm.is.constant -; CHECK: llvm.is.constant -; CHECK: llvm.is.constant -; CHECK: llvm.is.constant -; CHECK: llvm.is.constant -; CHECK-NOT: llvm.is.constant - %v1 = call i1 @llvm.is.constant.i256(i256 %int) - %v2 = call i1 @llvm.is.constant.f32(float %float) - %v3 = call i1 @llvm.is.constant.v2i64(<2 x i64> %vec) - %v4 = call i1 @llvm.is.constant.sl_i32i32s({i32, i32} %struct) - %v5 = call i1 @llvm.is.constant.a2i64([2 x i64] %arr) - %v6 = call i1 @llvm.is.constant.p0i64(i64* %ptr) - - %c1 = call i1 @llvm.is.constant.i256(i256 -1) - %c2 = call i1 @llvm.is.constant.f32(float 17.0) - %c3 = call i1 @llvm.is.constant.v2i64(<2 x i64> ) - %c4 = call i1 @llvm.is.constant.sl_i32i32s({i32, i32} {i32 -1, i32 32}) - %c5 = call i1 @llvm.is.constant.a2i64([2 x i64] [i64 -1, i64 32]) - %c6 = call i1 @llvm.is.constant.p0i64(i64* inttoptr (i32 42 to i64*)) - - %x1 = add i1 %v1, %c1 - %x2 = add i1 %v2, %c2 - %x3 = add i1 %v3, %c3 - %x4 = add i1 %v4, %c4 - %x5 = 
add i1 %v5, %c5 - %x6 = add i1 %v6, %c6 - - %res2 = add i1 %x1, %x2 - %res3 = add i1 %res2, %x3 - %res4 = add i1 %res3, %x4 - %res5 = add i1 %res4, %x5 - %res6 = add i1 %res5, %x6 - - ret i1 %res6 -} - -define i1 @test_various_types2() #0 { -; CHECK-LABEL: @test_various_types2( -; CHECK: ret i1 false - %r = call i1 @test_various_types(i256 -1, float 22.0, <2 x i64> , - {i32, i32} {i32 -1, i32 55}, [2 x i64] [i64 -1, i64 55], - i64* inttoptr (i64 42 to i64*)) - ret i1 %r -} - -attributes #0 = { nounwind uwtable } Modified: llvm/trunk/test/CodeGen/X86/O0-pipeline.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/O0-pipeline.ll?rev=374743&r1=374742&r2=374743&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/O0-pipeline.ll (original) +++ llvm/trunk/test/CodeGen/X86/O0-pipeline.ll Sun Oct 13 16:00:15 2019 @@ -24,6 +24,7 @@ ; CHECK-NEXT: Module Verifier ; CHECK-NEXT: Lower Garbage Collection Instructions ; CHECK-NEXT: Shadow Stack GC Lowering +; CHECK-NEXT: Lower constant intrinsics ; CHECK-NEXT: Remove unreachable blocks from the CFG ; CHECK-NEXT: Instrument function entry/exit with calls to e.g. 
mcount() (post inlining) ; CHECK-NEXT: Scalarize Masked Memory Intrinsics Modified: llvm/trunk/test/CodeGen/X86/O3-pipeline.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/O3-pipeline.ll?rev=374743&r1=374742&r2=374743&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/O3-pipeline.ll (original) +++ llvm/trunk/test/CodeGen/X86/O3-pipeline.ll Sun Oct 13 16:00:15 2019 @@ -35,6 +35,7 @@ ; CHECK-NEXT: Expand memcmp() to load/stores ; CHECK-NEXT: Lower Garbage Collection Instructions ; CHECK-NEXT: Shadow Stack GC Lowering +; CHECK-NEXT: Lower constant intrinsics ; CHECK-NEXT: Remove unreachable blocks from the CFG ; CHECK-NEXT: Dominator Tree Construction ; CHECK-NEXT: Natural Loop Information Removed: llvm/trunk/test/CodeGen/X86/is-constant.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/is-constant.ll?rev=374742&view=auto ============================================================================== --- llvm/trunk/test/CodeGen/X86/is-constant.ll (original) +++ llvm/trunk/test/CodeGen/X86/is-constant.ll (removed) @@ -1,50 +0,0 @@ -; RUN: llc -O2 < %s | FileCheck %s --check-prefix=CHECK-O2 --check-prefix=CHECK -; RUN: llc -O0 -fast-isel < %s | FileCheck %s --check-prefix=CHECK-O0 --check-prefix=CHECK -; RUN: llc -O0 -fast-isel=0 < %s | FileCheck %s --check-prefix=CHECK-O0 --check-prefix=CHECK -; RUN: llc -O0 -global-isel < %s | FileCheck %s --check-prefix=CHECK-O0 --check-prefix=CHECK - -;; Ensure that an unfoldable is.constant gets lowered reasonably in -;; optimized codegen, in particular, that the "true" branch is -;; eliminated. -;; -;; This isn't asserting any specific output from non-optimized runs, -;; (e.g., currently the not-taken branch does not get eliminated). But -;; it does ensure that lowering succeeds in all 3 codegen paths. 
- -target triple = "x86_64-unknown-linux-gnu" - -declare i1 @llvm.is.constant.i32(i32 %a) nounwind readnone -declare i1 @llvm.is.constant.i64(i64 %a) nounwind readnone -declare i64 @llvm.objectsize.i64.p0i8(i8*, i1, i1, i1) nounwind readnone - -declare i32 @subfun_1() -declare i32 @subfun_2() - -define i32 @test_branch(i32 %in) nounwind { -; CHECK-LABEL: test_branch: -; CHECK-O2: %bb.0: -; CHECK-O2-NEXT: jmp subfun_2 - %v = call i1 @llvm.is.constant.i32(i32 %in) - br i1 %v, label %True, label %False - -True: - %call1 = tail call i32 @subfun_1() - ret i32 %call1 - -False: - %call2 = tail call i32 @subfun_2() - ret i32 %call2 -} - -;; llvm.objectsize is another tricky case which gets folded to -1 very -;; late in the game. We'd like to ensure that llvm.is.constant of -;; llvm.objectsize is true. -define i1 @test_objectsize(i8* %obj) nounwind { -; CHECK-LABEL: test_objectsize: -; CHECK-O2: %bb.0: -; CHECK-O2: movb $1, %al -; CHECK-O2-NEXT: retq - %os = call i64 @llvm.objectsize.i64.p0i8(i8* %obj, i1 false, i1 false, i1 false) - %v = call i1 @llvm.is.constant.i64(i64 %os) - ret i1 %v -} Removed: llvm/trunk/test/CodeGen/X86/object-size.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/object-size.ll?rev=374742&view=auto ============================================================================== --- llvm/trunk/test/CodeGen/X86/object-size.ll (original) +++ llvm/trunk/test/CodeGen/X86/object-size.ll (removed) @@ -1,55 +0,0 @@ -; RUN: llc -O0 < %s | FileCheck %s - -; ModuleID = 'ts.c' -target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64" -target triple = "x86_64-apple-darwin10.0" - - at p = common global i8* null, align 8 ; [#uses=4] - at .str = private constant [3 x i8] c"Hi\00" ; <[3 x i8]*> [#uses=1] - -define void @bar() nounwind ssp { -entry: - %tmp = load i8*, i8** @p ; [#uses=1] - %0 = call i64 @llvm.objectsize.i64.p0i8(i8* %tmp, i1 
0) ; [#uses=1] - %cmp = icmp ne i64 %0, -1 ; [#uses=1] -; CHECK: movq $-1, [[RAX:%r..]] -; CHECK: cmpq $-1, [[RAX]] - br i1 %cmp, label %cond.true, label %cond.false - -cond.true: ; preds = %entry - %tmp1 = load i8*, i8** @p ; [#uses=1] - %tmp2 = load i8*, i8** @p ; [#uses=1] - %1 = call i64 @llvm.objectsize.i64.p0i8(i8* %tmp2, i1 1) ; [#uses=1] - %call = call i8* @__strcpy_chk(i8* %tmp1, i8* getelementptr inbounds ([3 x i8], [3 x i8]* @.str, i32 0, i32 0), i64 %1) ssp ; [#uses=1] - br label %cond.end - -cond.false: ; preds = %entry - %tmp3 = load i8*, i8** @p ; [#uses=1] - %call4 = call i8* @__inline_strcpy_chk(i8* %tmp3, i8* getelementptr inbounds ([3 x i8], [3 x i8]* @.str, i32 0, i32 0)) ssp ; [#uses=1] - br label %cond.end - -cond.end: ; preds = %cond.false, %cond.true - %cond = phi i8* [ %call, %cond.true ], [ %call4, %cond.false ] ; [#uses=0] - ret void -} - -declare i64 @llvm.objectsize.i64.p0i8(i8*, i1) nounwind readonly - -declare i8* @__strcpy_chk(i8*, i8*, i64) ssp - -define internal i8* @__inline_strcpy_chk(i8* %__dest, i8* %__src) nounwind ssp { -entry: - %retval = alloca i8* ; [#uses=2] - %__dest.addr = alloca i8* ; [#uses=3] - %__src.addr = alloca i8* ; [#uses=2] - store i8* %__dest, i8** %__dest.addr - store i8* %__src, i8** %__src.addr - %tmp = load i8*, i8** %__dest.addr ; [#uses=1] - %tmp1 = load i8*, i8** %__src.addr ; [#uses=1] - %tmp2 = load i8*, i8** %__dest.addr ; [#uses=1] - %0 = call i64 @llvm.objectsize.i64.p0i8(i8* %tmp2, i1 1) ; [#uses=1] - %call = call i8* @__strcpy_chk(i8* %tmp, i8* %tmp1, i64 %0) ssp ; [#uses=1] - store i8* %call, i8** %retval - %1 = load i8*, i8** %retval ; [#uses=1] - ret i8* %1 -} Modified: llvm/trunk/test/Other/new-pm-defaults.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Other/new-pm-defaults.ll?rev=374743&r1=374742&r2=374743&view=diff ============================================================================== --- llvm/trunk/test/Other/new-pm-defaults.ll (original) +++ 
llvm/trunk/test/Other/new-pm-defaults.ll Sun Oct 13 16:00:15 2019 @@ -231,6 +231,7 @@ ; CHECK-O-NEXT: Running pass: ModuleToFunctionPassAdaptor<{{.*}}PassManager{{.*}}> ; CHECK-O-NEXT: Starting llvm::Function pass manager run. ; CHECK-O-NEXT: Running pass: Float2IntPass +; CHECK-O-NEXT: Running pass: LowerConstantIntrinsicsPass on foo ; CHECK-EP-VECTORIZER-START-NEXT: Running pass: NoOpFunctionPass ; CHECK-O-NEXT: Running pass: FunctionToLoopPassAdaptor<{{.*}}LoopRotatePass ; CHECK-O-NEXT: Starting llvm::Function pass manager run. Modified: llvm/trunk/test/Other/new-pm-thinlto-defaults.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Other/new-pm-thinlto-defaults.ll?rev=374743&r1=374742&r2=374743&view=diff ============================================================================== --- llvm/trunk/test/Other/new-pm-thinlto-defaults.ll (original) +++ llvm/trunk/test/Other/new-pm-thinlto-defaults.ll Sun Oct 13 16:00:15 2019 @@ -205,6 +205,7 @@ ; CHECK-POSTLINK-O-NEXT: Running pass: ModuleToFunctionPassAdaptor<{{.*}}PassManager{{.*}}> ; CHECK-POSTLINK-O-NEXT: Starting llvm::Function pass manager run. 
; CHECK-POSTLINK-O-NEXT: Running pass: Float2IntPass +; CHECK-POSTLINK-O-NEXT: Running pass: LowerConstantIntrinsicsPass ; CHECK-POSTLINK-O-NEXT: Running pass: FunctionToLoopPassAdaptor<{{.*}}LoopRotatePass ; CHECK-POSTLINK-O-NEXT: Starting llvm::Function pass manager run ; CHECK-POSTLINK-O-NEXT: Running pass: LoopSimplifyPass Modified: llvm/trunk/test/Other/opt-O2-pipeline.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Other/opt-O2-pipeline.ll?rev=374743&r1=374742&r2=374743&view=diff ============================================================================== --- llvm/trunk/test/Other/opt-O2-pipeline.ll (original) +++ llvm/trunk/test/Other/opt-O2-pipeline.ll Sun Oct 13 16:00:15 2019 @@ -187,6 +187,8 @@ ; CHECK-NEXT: FunctionPass Manager ; CHECK-NEXT: Dominator Tree Construction ; CHECK-NEXT: Float to int +; CHECK-NEXT: Lower constant intrinsics +; CHECK-NEXT: Dominator Tree Construction ; CHECK-NEXT: Basic Alias Analysis (stateless AA impl) ; CHECK-NEXT: Function Alias Analysis Results ; CHECK-NEXT: Memory SSA Modified: llvm/trunk/test/Other/opt-O3-pipeline.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Other/opt-O3-pipeline.ll?rev=374743&r1=374742&r2=374743&view=diff ============================================================================== --- llvm/trunk/test/Other/opt-O3-pipeline.ll (original) +++ llvm/trunk/test/Other/opt-O3-pipeline.ll Sun Oct 13 16:00:15 2019 @@ -192,6 +192,8 @@ ; CHECK-NEXT: FunctionPass Manager ; CHECK-NEXT: Dominator Tree Construction ; CHECK-NEXT: Float to int +; CHECK-NEXT: Lower constant intrinsics +; CHECK-NEXT: Dominator Tree Construction ; CHECK-NEXT: Basic Alias Analysis (stateless AA impl) ; CHECK-NEXT: Function Alias Analysis Results ; CHECK-NEXT: Memory SSA Modified: llvm/trunk/test/Other/opt-Os-pipeline.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Other/opt-Os-pipeline.ll?rev=374743&r1=374742&r2=374743&view=diff 
============================================================================== --- llvm/trunk/test/Other/opt-Os-pipeline.ll (original) +++ llvm/trunk/test/Other/opt-Os-pipeline.ll Sun Oct 13 16:00:15 2019 @@ -174,6 +174,8 @@ ; CHECK-NEXT: FunctionPass Manager ; CHECK-NEXT: Dominator Tree Construction ; CHECK-NEXT: Float to int +; CHECK-NEXT: Lower constant intrinsics +; CHECK-NEXT: Dominator Tree Construction ; CHECK-NEXT: Basic Alias Analysis (stateless AA impl) ; CHECK-NEXT: Function Alias Analysis Results ; CHECK-NEXT: Memory SSA Modified: llvm/trunk/test/Transforms/CodeGenPrepare/X86/overflow-intrinsics.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/CodeGenPrepare/X86/overflow-intrinsics.ll?rev=374743&r1=374742&r2=374743&view=diff ============================================================================== --- llvm/trunk/test/Transforms/CodeGenPrepare/X86/overflow-intrinsics.ll (original) +++ llvm/trunk/test/Transforms/CodeGenPrepare/X86/overflow-intrinsics.ll Sun Oct 13 16:00:15 2019 @@ -514,26 +514,6 @@ exit: ret void } -; This was crashing when trying to delay instruction removal/deletion. - -declare i64 @llvm.objectsize.i64.p0i8(i8*, i1 immarg, i1 immarg, i1 immarg) #0 - -define hidden fastcc void @crash() { -; CHECK-LABEL: @crash( -; CHECK-NEXT: [[TMP1:%.*]] = call { i64, i1 } @llvm.uadd.with.overflow.i64(i64 undef, i64 undef) -; CHECK-NEXT: [[MATH:%.*]] = extractvalue { i64, i1 } [[TMP1]], 0 -; CHECK-NEXT: [[OV:%.*]] = extractvalue { i64, i1 } [[TMP1]], 1 -; CHECK-NEXT: [[T2:%.*]] = select i1 undef, i1 undef, i1 [[OV]] -; CHECK-NEXT: unreachable -; - %t0 = add i64 undef, undef - %t1 = icmp ult i64 %t0, undef - %t2 = select i1 undef, i1 undef, i1 %t1 - %t3 = call i64 @llvm.objectsize.i64.p0i8(i8* nonnull undef, i1 false, i1 false, i1 false) - %t4 = icmp ugt i64 %t3, 7 - unreachable -} - ; Check that every instruction inserted by -codegenprepare has a debug location. 
; DEBUG: CheckModuleDebugify: PASS Removed: llvm/trunk/test/Transforms/CodeGenPrepare/basic.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/CodeGenPrepare/basic.ll?rev=374742&view=auto ============================================================================== --- llvm/trunk/test/Transforms/CodeGenPrepare/basic.ll (original) +++ llvm/trunk/test/Transforms/CodeGenPrepare/basic.ll (removed) @@ -1,86 +0,0 @@ -; RUN: opt -codegenprepare -S < %s | FileCheck %s - -target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64" -target triple = "x86_64-apple-darwin10.0.0" - -; CHECK-LABEL: @test1( -; objectsize should fold to a constant, which causes the branch to fold to an -; uncond branch. Next, we fold the control flow alltogether. -; rdar://8785296 -define i32 @test1(i8* %ptr) nounwind ssp noredzone align 2 { -entry: - %0 = tail call i64 @llvm.objectsize.i64(i8* %ptr, i1 false, i1 false, i1 false) - %1 = icmp ugt i64 %0, 3 - br i1 %1, label %T, label %trap - -; CHECK: entry: -; CHECK-NOT: br label % - -trap: ; preds = %0, %entry - tail call void @llvm.trap() noreturn nounwind - unreachable - -T: -; CHECK: ret i32 4 - ret i32 4 -} - -; CHECK-LABEL: @test_objectsize_null_flag( -define i64 @test_objectsize_null_flag(i8* %ptr) { -entry: - ; CHECK: ret i64 -1 - %0 = tail call i64 @llvm.objectsize.i64(i8* null, i1 false, i1 true, i1 false) - ret i64 %0 -} - -; CHECK-LABEL: @test_objectsize_null_flag_min( -define i64 @test_objectsize_null_flag_min(i8* %ptr) { -entry: - ; CHECK: ret i64 0 - %0 = tail call i64 @llvm.objectsize.i64(i8* null, i1 true, i1 true, i1 false) - ret i64 %0 -} - -; Test foldable null pointers because we evaluate them with non-exact modes in -; CodeGenPrepare. 
-; CHECK-LABEL: @test_objectsize_null_flag_noas0( -define i64 @test_objectsize_null_flag_noas0() { -entry: - ; CHECK: ret i64 -1 - %0 = tail call i64 @llvm.objectsize.i64.p1i8(i8 addrspace(1)* null, i1 false, - i1 true, i1 false) - ret i64 %0 -} - -; CHECK-LABEL: @test_objectsize_null_flag_min_noas0( -define i64 @test_objectsize_null_flag_min_noas0() { -entry: - ; CHECK: ret i64 0 - %0 = tail call i64 @llvm.objectsize.i64.p1i8(i8 addrspace(1)* null, i1 true, - i1 true, i1 false) - ret i64 %0 -} - -; CHECK-LABEL: @test_objectsize_null_known_flag_noas0 -define i64 @test_objectsize_null_known_flag_noas0() { -entry: - ; CHECK: ret i64 -1 - %0 = tail call i64 @llvm.objectsize.i64.p1i8(i8 addrspace(1)* null, i1 false, - i1 false, i1 false) - ret i64 %0 -} - -; CHECK-LABEL: @test_objectsize_null_known_flag_min_noas0 -define i64 @test_objectsize_null_known_flag_min_noas0() { -entry: - ; CHECK: ret i64 0 - %0 = tail call i64 @llvm.objectsize.i64.p1i8(i8 addrspace(1)* null, i1 true, - i1 false, i1 false) - ret i64 %0 -} - - -declare i64 @llvm.objectsize.i64(i8*, i1, i1, i1) nounwind readonly -declare i64 @llvm.objectsize.i64.p1i8(i8 addrspace(1)*, i1, i1, i1) nounwind readonly - -declare void @llvm.trap() nounwind Removed: llvm/trunk/test/Transforms/CodeGenPrepare/builtin-condition.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/CodeGenPrepare/builtin-condition.ll?rev=374742&view=auto ============================================================================== --- llvm/trunk/test/Transforms/CodeGenPrepare/builtin-condition.ll (original) +++ llvm/trunk/test/Transforms/CodeGenPrepare/builtin-condition.ll (removed) @@ -1,123 +0,0 @@ -; RUN: opt -codegenprepare -S < %s | FileCheck %s - -; Ensure we act sanely on overflow. 
-; CHECK-LABEL: define i32 @bar -define i32 @bar() { -entry: - ; CHECK: ret i32 -1 - %az = alloca [2147483649 x i32], align 16 - %a = alloca i8*, align 8 - %arraydecay = getelementptr inbounds [2147483649 x i32], [2147483649 x i32]* %az, i32 0, i32 0 - %0 = bitcast i32* %arraydecay to i8* - store i8* %0, i8** %a, align 8 - %1 = load i8*, i8** %a, align 8 - %2 = call i32 @llvm.objectsize.i32.p0i8(i8* %1, i1 false) - ret i32 %2 -} - -; CHECK-LABEL: define i32 @baz -define i32 @baz(i32 %n) { -entry: - ; CHECK: ret i32 -1 - %az = alloca [1 x i32], align 16 - %bz = alloca [4294967297 x i32], align 16 - %tobool = icmp ne i32 %n, 0 - %arraydecay = getelementptr inbounds [1 x i32], [1 x i32]* %az, i64 0, i64 0 - %arraydecay1 = getelementptr inbounds [4294967297 x i32], [4294967297 x i32]* %bz, i64 0, i64 0 - %cond = select i1 %tobool, i32* %arraydecay, i32* %arraydecay1 - %0 = bitcast i32* %cond to i8* - %1 = call i32 @llvm.objectsize.i32.p0i8(i8* %0, i1 false) - ret i32 %1 -} - -declare i32 @llvm.objectsize.i32.p0i8(i8*, i1) - -; The following tests were generated by: -; #include -; #define STATIC_BUF_SIZE 10 -; #define LARGER_BUF_SIZE 30 -; -; size_t foo1(int flag) { -; char *cptr; -; char chararray[LARGER_BUF_SIZE]; -; char chararray2[STATIC_BUF_SIZE]; -; if(flag) -; cptr = chararray2; -; else -; cptr = chararray; -; -; return __builtin_object_size(cptr, 2); -; } -; -; size_t foo2(int n) { -; char Small[10]; -; char Large[20]; -; char *Ptr = n ? 
Small : Large + 19; -; return __builtin_object_size(Ptr, 0); -; } -; -; void foo() { -; size_t ret; -; size_t ret1; -; ret = foo1(0); -; ret1 = foo2(0); -; printf("\n%d %d\n", ret, ret1); -; } - -target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128" -target triple = "x86_64-unknown-linux-gnu" - - at .str = private unnamed_addr constant [8 x i8] c"\0A%d %d\0A\00", align 1 - -define i64 @foo1(i32 %flag) { -entry: - %chararray = alloca [30 x i8], align 16 - %chararray2 = alloca [10 x i8], align 1 - %0 = getelementptr inbounds [30 x i8], [30 x i8]* %chararray, i64 0, i64 0 - call void @llvm.lifetime.start.p0i8(i64 30, i8* %0) - %1 = getelementptr inbounds [10 x i8], [10 x i8]* %chararray2, i64 0, i64 0 - call void @llvm.lifetime.start.p0i8(i64 10, i8* %1) - %tobool = icmp eq i32 %flag, 0 - %cptr.0 = select i1 %tobool, i8* %0, i8* %1 - %2 = call i64 @llvm.objectsize.i64.p0i8(i8* %cptr.0, i1 true) - call void @llvm.lifetime.end.p0i8(i64 10, i8* %1) - call void @llvm.lifetime.end.p0i8(i64 30, i8* %0) - ret i64 %2 -; CHECK-LABEL: foo1 -; CHECK: ret i64 10 -} - -declare void @llvm.lifetime.start.p0i8(i64, i8* nocapture) - -declare i64 @llvm.objectsize.i64.p0i8(i8*, i1) - -declare void @llvm.lifetime.end.p0i8(i64, i8* nocapture) - -define i64 @foo2(i32 %n) { -entry: - %Small = alloca [10 x i8], align 1 - %Large = alloca [20 x i8], align 16 - %0 = getelementptr inbounds [10 x i8], [10 x i8]* %Small, i64 0, i64 0 - call void @llvm.lifetime.start.p0i8(i64 10, i8* %0) - %1 = getelementptr inbounds [20 x i8], [20 x i8]* %Large, i64 0, i64 0 - call void @llvm.lifetime.start.p0i8(i64 20, i8* %1) - %tobool = icmp ne i32 %n, 0 - %add.ptr = getelementptr inbounds [20 x i8], [20 x i8]* %Large, i64 0, i64 19 - %cond = select i1 %tobool, i8* %0, i8* %add.ptr - %2 = call i64 @llvm.objectsize.i64.p0i8(i8* %cond, i1 false) - call void @llvm.lifetime.end.p0i8(i64 20, i8* %1) - call void @llvm.lifetime.end.p0i8(i64 10, i8* %0) - ret i64 %2 -; CHECK-LABEL: foo2 -; CHECK: ret i64 10 -} - 
-define void @foo() { -entry: - %call = tail call i64 @foo1(i32 0) - %call1 = tail call i64 @foo2(i32 0) - %call2 = tail call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([8 x i8], [8 x i8]* @.str, i64 0, i64 0), i64 %call, i64 %call1) - ret void -} - -declare i32 @printf(i8* nocapture readonly, ...) Removed: llvm/trunk/test/Transforms/CodeGenPrepare/crash-on-large-allocas.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/CodeGenPrepare/crash-on-large-allocas.ll?rev=374742&view=auto ============================================================================== --- llvm/trunk/test/Transforms/CodeGenPrepare/crash-on-large-allocas.ll (original) +++ llvm/trunk/test/Transforms/CodeGenPrepare/crash-on-large-allocas.ll (removed) @@ -1,16 +0,0 @@ -; RUN: opt -S -codegenprepare %s -o - | FileCheck %s -; -; Ensure that we don't {crash,return a bad value} when given an alloca larger -; than what a pointer can represent. - -target datalayout = "p:16:16" - -; CHECK-LABEL: @alloca_overflow_is_unknown( -define i16 @alloca_overflow_is_unknown() { - %i = alloca i8, i32 65537 - %j = call i16 @llvm.objectsize.i16.p0i8(i8* %i, i1 false, i1 false, i1 false) - ; CHECK: ret i16 -1 - ret i16 %j -} - -declare i16 @llvm.objectsize.i16.p0i8(i8*, i1, i1, i1) Copied: llvm/trunk/test/Transforms/LowerConstantIntrinsics/constant-intrinsics.ll (from r374742, llvm/trunk/test/CodeGen/X86/is-constant.ll) URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/LowerConstantIntrinsics/constant-intrinsics.ll?p2=llvm/trunk/test/Transforms/LowerConstantIntrinsics/constant-intrinsics.ll&p1=llvm/trunk/test/CodeGen/X86/is-constant.ll&r1=374742&r2=374743&rev=374743&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/is-constant.ll (original) +++ llvm/trunk/test/Transforms/LowerConstantIntrinsics/constant-intrinsics.ll Sun Oct 13 16:00:15 2019 @@ -1,29 +1,30 @@ -; RUN: llc -O2 < %s | FileCheck %s 
--check-prefix=CHECK-O2 --check-prefix=CHECK -; RUN: llc -O0 -fast-isel < %s | FileCheck %s --check-prefix=CHECK-O0 --check-prefix=CHECK -; RUN: llc -O0 -fast-isel=0 < %s | FileCheck %s --check-prefix=CHECK-O0 --check-prefix=CHECK -; RUN: llc -O0 -global-isel < %s | FileCheck %s --check-prefix=CHECK-O0 --check-prefix=CHECK +; RUN: opt -lower-constant-intrinsics -S < %s | FileCheck %s ;; Ensure that an unfoldable is.constant gets lowered reasonably in ;; optimized codegen, in particular, that the "true" branch is ;; eliminated. -;; -;; This isn't asserting any specific output from non-optimized runs, -;; (e.g., currently the not-taken branch does not get eliminated). But -;; it does ensure that lowering succeeds in all 3 codegen paths. -target triple = "x86_64-unknown-linux-gnu" +;; Also ensure that any unfoldable objectsize is resolved in order. + +;; CHECK-NOT: tail call i32 @subfun_1() +;; CHECK: tail call i32 @subfun_2() +;; CHECK-NOT: tail call i32 @subfun_1() declare i1 @llvm.is.constant.i32(i32 %a) nounwind readnone declare i1 @llvm.is.constant.i64(i64 %a) nounwind readnone +declare i1 @llvm.is.constant.i256(i256 %a) nounwind readnone +declare i1 @llvm.is.constant.v2i64(<2 x i64> %a) nounwind readnone +declare i1 @llvm.is.constant.f32(float %a) nounwind readnone +declare i1 @llvm.is.constant.sl_i32i32s({i32, i32} %a) nounwind readnone +declare i1 @llvm.is.constant.a2i64([2 x i64] %a) nounwind readnone +declare i1 @llvm.is.constant.p0i64(i64* %a) nounwind readnone + declare i64 @llvm.objectsize.i64.p0i8(i8*, i1, i1, i1) nounwind readnone declare i32 @subfun_1() declare i32 @subfun_2() define i32 @test_branch(i32 %in) nounwind { -; CHECK-LABEL: test_branch: -; CHECK-O2: %bb.0: -; CHECK-O2-NEXT: jmp subfun_2 %v = call i1 @llvm.is.constant.i32(i32 %in) br i1 %v, label %True, label %False @@ -40,11 +41,74 @@ False: ;; late in the game. We'd like to ensure that llvm.is.constant of ;; llvm.objectsize is true. 
define i1 @test_objectsize(i8* %obj) nounwind { -; CHECK-LABEL: test_objectsize: -; CHECK-O2: %bb.0: -; CHECK-O2: movb $1, %al -; CHECK-O2-NEXT: retq +;; CHECK-LABEL: test_objectsize +;; CHECK-NOT: llvm.objectsize +;; CHECK-NOT: llvm.is.constant +;; CHECK: ret i1 true %os = call i64 @llvm.objectsize.i64.p0i8(i8* %obj, i1 false, i1 false, i1 false) - %v = call i1 @llvm.is.constant.i64(i64 %os) + %os1 = add i64 %os, 1 + %v = call i1 @llvm.is.constant.i64(i64 %os1) ret i1 %v } + + at test_phi_a = dso_local global i32 0, align 4 +declare dso_local i32 @test_phi_b(...) + +; Function Attrs: nounwind uwtable +define dso_local i32 @test_phi() { +entry: + %0 = load i32, i32* @test_phi_a, align 4 + %1 = tail call i1 @llvm.is.constant.i32(i32 %0) + br i1 %1, label %cond.end, label %cond.false + +cond.false: ; preds = %entry + %call = tail call i32 bitcast (i32 (...)* @test_phi_b to i32 ()*)() #3 + %.pre = load i32, i32* @test_phi_a, align 4 + br label %cond.end + +cond.end: ; preds = %entry, %cond.false + %2 = phi i32 [ %.pre, %cond.false ], [ %0, %entry ] + %cond = phi i32 [ %call, %cond.false ], [ 1, %entry ] + %cmp = icmp eq i32 %cond, %2 + br i1 %cmp, label %cond.true1, label %cond.end4 + +cond.true1: ; preds = %cond.end + %call2 = tail call i32 bitcast (i32 (...)* @test_phi_b to i32 ()*)() #3 + br label %cond.end4 + +cond.end4: ; preds = %cond.end, %cond.true1 + ret i32 undef +} + +define i1 @test_various_types(i256 %int, float %float, <2 x i64> %vec, {i32, i32} %struct, [2 x i64] %arr, i64* %ptr) #0 { +; CHECK-LABEL: @test_various_types( +; CHECK-NOT: llvm.is.constant + %v1 = call i1 @llvm.is.constant.i256(i256 %int) + %v2 = call i1 @llvm.is.constant.f32(float %float) + %v3 = call i1 @llvm.is.constant.v2i64(<2 x i64> %vec) + %v4 = call i1 @llvm.is.constant.sl_i32i32s({i32, i32} %struct) + %v5 = call i1 @llvm.is.constant.a2i64([2 x i64] %arr) + %v6 = call i1 @llvm.is.constant.p0i64(i64* %ptr) + + %c1 = call i1 @llvm.is.constant.i256(i256 -1) + %c2 = call i1 
@llvm.is.constant.f32(float 17.0) + %c3 = call i1 @llvm.is.constant.v2i64(<2 x i64> ) + %c4 = call i1 @llvm.is.constant.sl_i32i32s({i32, i32} {i32 -1, i32 32}) + %c5 = call i1 @llvm.is.constant.a2i64([2 x i64] [i64 -1, i64 32]) + %c6 = call i1 @llvm.is.constant.p0i64(i64* inttoptr (i32 42 to i64*)) + + %x1 = add i1 %v1, %c1 + %x2 = add i1 %v2, %c2 + %x3 = add i1 %v3, %c3 + %x4 = add i1 %v4, %c4 + %x5 = add i1 %v5, %c5 + %x6 = add i1 %v6, %c6 + + %res2 = add i1 %x1, %x2 + %res3 = add i1 %res2, %x3 + %res4 = add i1 %res3, %x4 + %res5 = add i1 %res4, %x5 + %res6 = add i1 %res5, %x6 + + ret i1 %res6 +} Copied: llvm/trunk/test/Transforms/LowerConstantIntrinsics/crash-on-large-allocas.ll (from r374742, llvm/trunk/test/Transforms/CodeGenPrepare/crash-on-large-allocas.ll) URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/LowerConstantIntrinsics/crash-on-large-allocas.ll?p2=llvm/trunk/test/Transforms/LowerConstantIntrinsics/crash-on-large-allocas.ll&p1=llvm/trunk/test/Transforms/CodeGenPrepare/crash-on-large-allocas.ll&r1=374742&r2=374743&rev=374743&view=diff ============================================================================== --- llvm/trunk/test/Transforms/CodeGenPrepare/crash-on-large-allocas.ll (original) +++ llvm/trunk/test/Transforms/LowerConstantIntrinsics/crash-on-large-allocas.ll Sun Oct 13 16:00:15 2019 @@ -1,4 +1,4 @@ -; RUN: opt -S -codegenprepare %s -o - | FileCheck %s +; RUN: opt -S -lower-constant-intrinsics %s -o - | FileCheck %s ; ; Ensure that we don't {crash,return a bad value} when given an alloca larger ; than what a pointer can represent. 
Copied: llvm/trunk/test/Transforms/LowerConstantIntrinsics/objectsize_basic.ll (from r374742, llvm/trunk/test/Transforms/CodeGenPrepare/basic.ll) URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/LowerConstantIntrinsics/objectsize_basic.ll?p2=llvm/trunk/test/Transforms/LowerConstantIntrinsics/objectsize_basic.ll&p1=llvm/trunk/test/Transforms/CodeGenPrepare/basic.ll&r1=374742&r2=374743&rev=374743&view=diff ============================================================================== --- llvm/trunk/test/Transforms/CodeGenPrepare/basic.ll (original) +++ llvm/trunk/test/Transforms/LowerConstantIntrinsics/objectsize_basic.ll Sun Oct 13 16:00:15 2019 @@ -1,12 +1,15 @@ -; RUN: opt -codegenprepare -S < %s | FileCheck %s +; RUN: opt -lower-constant-intrinsics -S < %s | FileCheck %s target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64" target triple = "x86_64-apple-darwin10.0.0" +declare i64 @llvm.objectsize.i64(i8*, i1, i1, i1) nounwind readonly +declare i64 @llvm.objectsize.i64.p1i8(i8 addrspace(1)*, i1, i1, i1) nounwind readonly +declare void @llvm.trap() nounwind + ; CHECK-LABEL: @test1( ; objectsize should fold to a constant, which causes the branch to fold to an -; uncond branch. Next, we fold the control flow alltogether. -; rdar://8785296 +; uncond branch. 
define i32 @test1(i8* %ptr) nounwind ssp noredzone align 2 { entry: %0 = tail call i64 @llvm.objectsize.i64(i8* %ptr, i1 false, i1 false, i1 false) @@ -14,7 +17,7 @@ entry: br i1 %1, label %T, label %trap ; CHECK: entry: -; CHECK-NOT: br label % +; CHECK-NOT: label %trap trap: ; preds = %0, %entry tail call void @llvm.trap() noreturn nounwind @@ -78,9 +81,3 @@ entry: i1 false, i1 false) ret i64 %0 } - - -declare i64 @llvm.objectsize.i64(i8*, i1, i1, i1) nounwind readonly -declare i64 @llvm.objectsize.i64.p1i8(i8 addrspace(1)*, i1, i1, i1) nounwind readonly - -declare void @llvm.trap() nounwind Modified: llvm/trunk/utils/gn/secondary/llvm/lib/Transforms/Scalar/BUILD.gn URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/gn/secondary/llvm/lib/Transforms/Scalar/BUILD.gn?rev=374743&r1=374742&r2=374743&view=diff ============================================================================== --- llvm/trunk/utils/gn/secondary/llvm/lib/Transforms/Scalar/BUILD.gn (original) +++ llvm/trunk/utils/gn/secondary/llvm/lib/Transforms/Scalar/BUILD.gn Sun Oct 13 16:00:15 2019 @@ -57,6 +57,7 @@ static_library("Scalar") { "LowerAtomic.cpp", "LowerExpectIntrinsic.cpp", "LowerGuardIntrinsic.cpp", + "LowerConstantIntrinsics.cpp", "LowerWidenableCondition.cpp", "MakeGuardsExplicit.cpp", "MemCpyOptimizer.cpp", From llvm-commits at lists.llvm.org Sun Oct 13 15:59:36 2019 From: llvm-commits at lists.llvm.org (Joerg Sonnenberger via Phabricator via llvm-commits) Date: Sun, 13 Oct 2019 22:59:36 +0000 (UTC) Subject: [PATCH] D65280: Add a pass to lower is.constant and objectsize intrinsics In-Reply-To: References: Message-ID: This revision was automatically updated to reflect the committed changes. Closed by commit rGe4300c392de2: Add a pass to lower is.constant and objectsize intrinsics (authored by joerg). 
Changed prior to commit: https://reviews.llvm.org/D65280?vs=224563&id=224797#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D65280/new/ https://reviews.llvm.org/D65280 Files: llvm/bindings/ocaml/transforms/scalar_opts/llvm_scalar_opts.mli llvm/bindings/ocaml/transforms/scalar_opts/scalar_opts_ocaml.c llvm/include/llvm-c/Transforms/Scalar.h llvm/include/llvm/InitializePasses.h llvm/include/llvm/LinkAllPasses.h llvm/include/llvm/Transforms/Scalar.h llvm/include/llvm/Transforms/Scalar/LowerConstantIntrinsics.h llvm/lib/CodeGen/CodeGenPrepare.cpp llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp llvm/lib/CodeGen/SelectionDAG/FastISel.cpp llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp llvm/lib/CodeGen/TargetPassConfig.cpp llvm/lib/Passes/PassBuilder.cpp llvm/lib/Passes/PassRegistry.def llvm/lib/Transforms/IPO/PassManagerBuilder.cpp llvm/lib/Transforms/Scalar/CMakeLists.txt llvm/lib/Transforms/Scalar/LowerConstantIntrinsics.cpp llvm/lib/Transforms/Scalar/Scalar.cpp llvm/test/CodeGen/AArch64/GlobalISel/arm64-irtranslator.ll llvm/test/CodeGen/AArch64/O0-pipeline.ll llvm/test/CodeGen/AArch64/O3-pipeline.ll llvm/test/CodeGen/ARM/O3-pipeline.ll llvm/test/CodeGen/Generic/is-constant.ll llvm/test/CodeGen/X86/O0-pipeline.ll llvm/test/CodeGen/X86/O3-pipeline.ll llvm/test/CodeGen/X86/is-constant.ll llvm/test/CodeGen/X86/object-size.ll llvm/test/Other/new-pm-defaults.ll llvm/test/Other/new-pm-thinlto-defaults.ll llvm/test/Other/opt-O2-pipeline.ll llvm/test/Other/opt-O3-pipeline.ll llvm/test/Other/opt-Os-pipeline.ll llvm/test/Transforms/CodeGenPrepare/X86/overflow-intrinsics.ll llvm/test/Transforms/CodeGenPrepare/basic.ll llvm/test/Transforms/CodeGenPrepare/builtin-condition.ll llvm/test/Transforms/CodeGenPrepare/crash-on-large-allocas.ll llvm/test/Transforms/LowerConstantIntrinsics/constant-intrinsics.ll llvm/test/Transforms/LowerConstantIntrinsics/crash-on-large-allocas.ll 
llvm/test/Transforms/LowerConstantIntrinsics/objectsize_basic.ll llvm/utils/gn/secondary/llvm/lib/Transforms/Scalar/BUILD.gn -------------- next part -------------- A non-text attachment was scrubbed... Name: D65280.224797.patch Type: text/x-patch Size: 49089 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Sun Oct 13 16:34:13 2019 From: llvm-commits at lists.llvm.org (Florian Hahn via llvm-commits) Date: Sun, 13 Oct 2019 23:34:13 -0000 Subject: [llvm] r374744 - [NewGVN] Use m_Br to simplify code a bit. (NFC) Message-ID: <20191013233413.8B223830AA@lists.llvm.org> Author: fhahn Date: Sun Oct 13 16:34:13 2019 New Revision: 374744 URL: http://llvm.org/viewvc/llvm-project?rev=374744&view=rev Log: [NewGVN] Use m_Br to simplify code a bit. (NFC) Modified: llvm/trunk/lib/Transforms/Scalar/NewGVN.cpp Modified: llvm/trunk/lib/Transforms/Scalar/NewGVN.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/NewGVN.cpp?rev=374744&r1=374743&r2=374744&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Scalar/NewGVN.cpp (original) +++ llvm/trunk/lib/Transforms/Scalar/NewGVN.cpp Sun Oct 13 16:34:13 2019 @@ -89,6 +89,7 @@ #include "llvm/IR/IntrinsicInst.h" #include "llvm/IR/Intrinsics.h" #include "llvm/IR/LLVMContext.h" +#include "llvm/IR/PatternMatch.h" #include "llvm/IR/Type.h" #include "llvm/IR/Use.h" #include "llvm/IR/User.h" @@ -122,6 +123,7 @@ using namespace llvm; using namespace llvm::GVNExpression; using namespace llvm::VNCoercion; +using namespace llvm::PatternMatch; #define DEBUG_TYPE "newgvn" @@ -2464,9 +2466,9 @@ Value *NewGVN::findConditionEquivalence( // Process the outgoing edges of a block for reachability. void NewGVN::processOutgoingEdges(Instruction *TI, BasicBlock *B) { // Evaluate reachability of terminator instruction. 
- BranchInst *BR; - if ((BR = dyn_cast<BranchInst>(TI)) && BR->isConditional()) { - Value *Cond = BR->getCondition(); + Value *Cond; + BasicBlock *TrueSucc, *FalseSucc; + if (match(TI, m_Br(m_Value(Cond), TrueSucc, FalseSucc))) { Value *CondEvaluated = findConditionEquivalence(Cond); if (!CondEvaluated) { if (auto *I = dyn_cast<Instruction>(Cond)) { @@ -2479,8 +2481,6 @@ void NewGVN::processOutgoingEdges(Instru } } ConstantInt *CI; - BasicBlock *TrueSucc = BR->getSuccessor(0); - BasicBlock *FalseSucc = BR->getSuccessor(1); if (CondEvaluated && (CI = dyn_cast<ConstantInt>(CondEvaluated))) { if (CI->isOne()) { LLVM_DEBUG(dbgs() << "Condition for Terminator " << *TI From llvm-commits at lists.llvm.org Sun Oct 13 17:36:21 2019 From: llvm-commits at lists.llvm.org (Frank Derry Wanye via Phabricator via llvm-commits) Date: Mon, 14 Oct 2019 00:36:21 +0000 (UTC) Subject: [PATCH] D66564: [clang-tidy] new performance struct pack align check In-Reply-To: References: Message-ID: <6dfa01b453d7f91e77815b512c2ccae9@localhost.localdomain> ffrankies updated this revision to Diff 224799. ffrankies retitled this revision from "[clang-tidy] new FPGA struct pack align check" to "[clang-tidy] new performance struct pack align check". ffrankies edited the summary of this revision. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D66564/new/ https://reviews.llvm.org/D66564 Files: clang-tidy/performance/CMakeLists.txt clang-tidy/performance/PerformanceTidyModule.cpp clang-tidy/performance/StructPackAlignCheck.cpp clang-tidy/performance/StructPackAlignCheck.h docs/ReleaseNotes.rst docs/clang-tidy/checks/list.rst docs/clang-tidy/checks/performance-struct-pack-align.rst test/clang-tidy/performance-struct-pack-align.cpp -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D66564.224799.patch Type: text/x-patch Size: 14193 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Sun Oct 13 17:54:50 2019 From: llvm-commits at lists.llvm.org (Joseph Tremoulet via Phabricator via llvm-commits) Date: Mon, 14 Oct 2019 00:54:50 +0000 (UTC) Subject: [PATCH] D68657: Update MinidumpYAML to use minidump::Exception for exception stream In-Reply-To: References: Message-ID: JosephTremoulet updated this revision to Diff 224800. JosephTremoulet marked an inline comment as done. JosephTremoulet added a comment. - Apply review feedback (-auto, -memset, +comments) Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68657/new/ https://reviews.llvm.org/D68657 Files: lldb/packages/Python/lldbsuite/test/functionalities/postmortem/minidump-new/linux-x86_64.yaml llvm/include/llvm/ObjectYAML/MinidumpYAML.h llvm/lib/ObjectYAML/MinidumpEmitter.cpp llvm/lib/ObjectYAML/MinidumpYAML.cpp llvm/test/tools/obj2yaml/basic-minidump.yaml llvm/unittests/ObjectYAML/MinidumpYAMLTest.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D68657.224800.patch Type: text/x-patch Size: 19982 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Sun Oct 13 18:04:51 2019 From: llvm-commits at lists.llvm.org (Frank Derry Wanye via Phabricator via llvm-commits) Date: Mon, 14 Oct 2019 01:04:51 +0000 (UTC) Subject: [PATCH] D66564: [clang-tidy] new performance struct pack align check In-Reply-To: References: Message-ID: ffrankies updated this revision to Diff 224802. 
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D66564/new/ https://reviews.llvm.org/D66564 Files: clang-tidy/performance/CMakeLists.txt clang-tidy/performance/PerformanceTidyModule.cpp clang-tidy/performance/StructPackAlignCheck.cpp clang-tidy/performance/StructPackAlignCheck.h docs/ReleaseNotes.rst docs/clang-tidy/checks/list.rst docs/clang-tidy/checks/performance-struct-pack-align.rst test/clang-tidy/performance-struct-pack-align.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D66564.224802.patch Type: text/x-patch Size: 14117 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Sun Oct 13 18:04:52 2019 From: llvm-commits at lists.llvm.org (Nico Weber via Phabricator via llvm-commits) Date: Mon, 14 Oct 2019 01:04:52 +0000 (UTC) Subject: [PATCH] D65387: [clangd] Add a callback mechanism for handling responses from client. In-Reply-To: References: Message-ID: thakis added inline comments. Herald added a subscriber: usaxena95. ================ Comment at: clang-tools-extra/trunk/clangd/test/request-reply.test:6 +--- +{"jsonrpc":"2.0","id":4,"method":"workspace/executeCommand","params":{"command":"clangd.applyTweak","arguments":[{"file":"file:///clangd-test/main.cpp","selection":{"end":{"character":4,"line":0},"start":{"character":0,"line":0}},"tweakID":"ExpandAutoType"}]}} +# CHECK: "id": 0, ---------------- FYI, referring to a file opened as test:///foo.cpp as file:///clangd-test/file.cpp is wrong on Windows. I fixed this in rL374746. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D65387/new/ https://reviews.llvm.org/D65387 From llvm-commits at lists.llvm.org Sun Oct 13 18:04:52 2019 From: llvm-commits at lists.llvm.org (Nico Weber via Phabricator via llvm-commits) Date: Mon, 14 Oct 2019 01:04:52 +0000 (UTC) Subject: [PATCH] D62855: [clangd] Implementation of auto type expansion. 
In-Reply-To: References: Message-ID: <977a001c1e9cc8d2628e39ceff357a16@localhost.localdomain> thakis added inline comments. Herald added a subscriber: usaxena95. ================ Comment at: clang-tools-extra/trunk/clangd/test/code-action-request.test:54 +--- +{"jsonrpc":"2.0","id":4,"method":"workspace/executeCommand","params":{"command":"clangd.applyTweak","arguments":[{"file":"file:///clangd-test/main.cpp","selection":{"end":{"character":4,"line":0},"start":{"character":0,"line":0}},"tweakID":"ExpandAutoType"}]}} +# CHECK: "newText": "int", ---------------- FYI, referring to a file opened as test:///foo.cpp as file:///clangd-test/file.cpp is wrong on Windows. I fixed this in rL374746. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D62855/new/ https://reviews.llvm.org/D62855 From llvm-commits at lists.llvm.org Sun Oct 13 18:04:55 2019 From: llvm-commits at lists.llvm.org (Frank Derry Wanye via Phabricator via llvm-commits) Date: Mon, 14 Oct 2019 01:04:55 +0000 (UTC) Subject: [PATCH] D66564: [clang-tidy] new performance struct pack align check In-Reply-To: References: Message-ID: ffrankies updated this revision to Diff 224803. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D66564/new/ https://reviews.llvm.org/D66564 Files: clang-tidy/performance/CMakeLists.txt clang-tidy/performance/PerformanceTidyModule.cpp clang-tidy/performance/StructPackAlignCheck.cpp clang-tidy/performance/StructPackAlignCheck.h docs/ReleaseNotes.rst docs/clang-tidy/checks/list.rst docs/clang-tidy/checks/performance-struct-pack-align.rst test/clang-tidy/performance-struct-pack-align.cpp -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D66564.224803.patch Type: text/x-patch Size: 14109 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Sun Oct 13 18:14:37 2019 From: llvm-commits at lists.llvm.org (Joseph Tremoulet via Phabricator via llvm-commits) Date: Mon, 14 Oct 2019 01:14:37 +0000 (UTC) Subject: [PATCH] D68657: Update MinidumpYAML to use minidump::Exception for exception stream In-Reply-To: References: Message-ID: <58a85aa8ee5b98c5d8aba4502bb3f365@localhost.localdomain> JosephTremoulet updated this revision to Diff 224804. JosephTremoulet added a comment. - Fix Expected<> types Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68657/new/ https://reviews.llvm.org/D68657 Files: lldb/packages/Python/lldbsuite/test/functionalities/postmortem/minidump-new/linux-x86_64.yaml llvm/include/llvm/ObjectYAML/MinidumpYAML.h llvm/lib/ObjectYAML/MinidumpEmitter.cpp llvm/lib/ObjectYAML/MinidumpYAML.cpp llvm/test/tools/obj2yaml/basic-minidump.yaml llvm/unittests/ObjectYAML/MinidumpYAMLTest.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D68657.224804.patch Type: text/x-patch Size: 19994 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Sun Oct 13 18:14:37 2019 From: llvm-commits at lists.llvm.org (Frank Derry Wanye via Phabricator via llvm-commits) Date: Mon, 14 Oct 2019 01:14:37 +0000 (UTC) Subject: [PATCH] D66564: [clang-tidy] new performance struct pack align check In-Reply-To: References: Message-ID: <60b7d3af881b4a5dba36289181f4f085@localhost.localdomain> ffrankies added a comment. As per the previous discussion, the check has been moved into the `performance` module. In D66564#1670640 , @lebedev.ri wrote: > I, too, don't believe this is FPGA specific; it should likely go into `misc-` or even `performance-`. > The wording of the diags seems weird to me, it would be good to 1. add more explanation to the docs and 2. reword the diags. 
Implemented the requested code refactoring changes, and reworded the diags to now say `accessing the fields in struct 'name' is inefficient due to padding/poor alignment;`.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D66564/new/

https://reviews.llvm.org/D66564

From llvm-commits at lists.llvm.org Sun Oct 13 18:14:38 2019
From: llvm-commits at lists.llvm.org (Luís Marques via Phabricator via llvm-commits)
Date: Mon, 14 Oct 2019 01:14:38 +0000 (UTC)
Subject: [PATCH] D67698: [RISCV] Remove RA from reserved register to use as callee saved register
In-Reply-To:
References:
Message-ID:

luismarques added a comment.

In CoreMark-Pro, when the execution model is 1 instr == 1 cycle, the result I get with this patch is that sha-test improves by +1.43% in RV64 (GC, LP64D). For all other sub-benchmarks the performance differences round to 0.00%. There's no change to sha-test in RV32 before and after the patch.

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D67698/new/

https://reviews.llvm.org/D67698

From llvm-commits at lists.llvm.org Sun Oct 13 18:14:38 2019
From: llvm-commits at lists.llvm.org (Luís Marques via Phabricator via llvm-commits)
Date: Mon, 14 Oct 2019 01:14:38 +0000 (UTC)
Subject: [PATCH] D68393: [RISCV] Add riscv{32,64} to ALL_CRT_SUPPORTED_ARCH list
In-Reply-To:
References:
Message-ID: <8242f656d94c17ee771cbf0be2c1188c@localhost.localdomain>

luismarques added a comment.

@edward-jones: what LLVM cmake options are you using to test this?
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68393/new/ https://reviews.llvm.org/D68393 From llvm-commits at lists.llvm.org Sun Oct 13 18:23:48 2019 From: llvm-commits at lists.llvm.org (Joseph Tremoulet via Phabricator via llvm-commits) Date: Mon, 14 Oct 2019 01:23:48 +0000 (UTC) Subject: [PATCH] D68657: Update MinidumpYAML to use minidump::Exception for exception stream In-Reply-To: References: Message-ID: JosephTremoulet updated this revision to Diff 224805. JosephTremoulet added a comment. - ...and fix namespace... Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68657/new/ https://reviews.llvm.org/D68657 Files: lldb/packages/Python/lldbsuite/test/functionalities/postmortem/minidump-new/linux-x86_64.yaml llvm/include/llvm/ObjectYAML/MinidumpYAML.h llvm/lib/ObjectYAML/MinidumpEmitter.cpp llvm/lib/ObjectYAML/MinidumpYAML.cpp llvm/test/tools/obj2yaml/basic-minidump.yaml llvm/unittests/ObjectYAML/MinidumpYAMLTest.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D68657.224805.patch Type: text/x-patch Size: 20004 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Sun Oct 13 18:41:04 2019 From: llvm-commits at lists.llvm.org (Craig Topper via llvm-commits) Date: Mon, 14 Oct 2019 01:41:04 -0000 Subject: [llvm] r374748 - [X86] Autogenerate complete checks. NFC Message-ID: <20191014014104.3DA2187C53@lists.llvm.org> Author: ctopper Date: Sun Oct 13 18:41:04 2019 New Revision: 374748 URL: http://llvm.org/viewvc/llvm-project?rev=374748&view=rev Log: [X86] Autogenerate complete checks. 
NFC Modified: llvm/trunk/test/CodeGen/X86/h-registers-0.ll llvm/trunk/test/CodeGen/X86/h-registers-3.ll Modified: llvm/trunk/test/CodeGen/X86/h-registers-0.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/h-registers-0.ll?rev=374748&r1=374747&r2=374748&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/h-registers-0.ll (original) +++ llvm/trunk/test/CodeGen/X86/h-registers-0.ll Sun Oct 13 18:41:04 2019 @@ -1,5 +1,6 @@ -; RUN: llc < %s -mattr=-bmi -mtriple=x86_64-linux | FileCheck %s -check-prefix=X86-64 -; RUN: llc < %s -mattr=-bmi -mtriple=x86_64-linux-gnux32 | FileCheck %s -check-prefix=X86-64 +; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py +; RUN: llc < %s -mattr=-bmi -mtriple=x86_64-linux | FileCheck %s -check-prefix=X86-64 -check-prefix=X64 +; RUN: llc < %s -mattr=-bmi -mtriple=x86_64-linux-gnux32 | FileCheck %s -check-prefix=X86-64 -check-prefix=X32 ; RUN: llc < %s -mattr=-bmi -mtriple=x86_64-win32 | FileCheck %s -check-prefix=WIN64 ; RUN: llc < %s -mattr=-bmi -mtriple=i686-- | FileCheck %s -check-prefix=X86-32 @@ -7,18 +8,36 @@ ; of h registers yet, due to x86 encoding complications. define void @bar64(i64 inreg %x, i8* inreg %p) nounwind { -; X86-64-LABEL: bar64: -; X86-64: shrq $8, %rdi -; X86-64: incb %dil +; X64-LABEL: bar64: +; X64: # %bb.0: +; X64-NEXT: shrq $8, %rdi +; X64-NEXT: incb %dil +; X64-NEXT: movb %dil, (%rsi) +; X64-NEXT: retq +; +; X32-LABEL: bar64: +; X32: # %bb.0: +; X32-NEXT: shrq $8, %rdi +; X32-NEXT: incb %dil +; X32-NEXT: movb %dil, (%esi) +; X32-NEXT: retq +; +; WIN64-LABEL: bar64: +; WIN64: # %bb.0: +; WIN64-NEXT: shrq $8, %rcx +; WIN64-NEXT: incb %cl +; WIN64-NEXT: movb %cl, (%rdx) +; WIN64-NEXT: retq +; +; X86-32-LABEL: bar64: +; X86-32: # %bb.0: +; X86-32-NEXT: incb %ah +; X86-32-NEXT: movb %ah, (%ecx) +; X86-32-NEXT: retl ; See FIXME: on regclass GR8. 
; It could be optimally transformed like; incb %ch; movb %ch, (%rdx) -; WIN64-LABEL: bar64: -; WIN64: shrq $8, %rcx -; WIN64: incb %cl -; X86-32-LABEL: bar64: -; X86-32: incb %ah %t0 = lshr i64 %x, 8 %t1 = trunc i64 %t0 to i8 %t2 = add i8 %t1, 1 @@ -27,16 +46,34 @@ define void @bar64(i64 inreg %x, i8* inr } define void @bar32(i32 inreg %x, i8* inreg %p) nounwind { -; X86-64-LABEL: bar32: -; X86-64: shrl $8, %edi -; X86-64: incb %dil - -; WIN64-LABEL: bar32: -; WIN64: shrl $8, %ecx -; WIN64: incb %cl - +; X64-LABEL: bar32: +; X64: # %bb.0: +; X64-NEXT: shrl $8, %edi +; X64-NEXT: incb %dil +; X64-NEXT: movb %dil, (%rsi) +; X64-NEXT: retq +; +; X32-LABEL: bar32: +; X32: # %bb.0: +; X32-NEXT: shrl $8, %edi +; X32-NEXT: incb %dil +; X32-NEXT: movb %dil, (%esi) +; X32-NEXT: retq +; +; WIN64-LABEL: bar32: +; WIN64: # %bb.0: +; WIN64-NEXT: shrl $8, %ecx +; WIN64-NEXT: incb %cl +; WIN64-NEXT: movb %cl, (%rdx) +; WIN64-NEXT: retq +; ; X86-32-LABEL: bar32: -; X86-32: incb %ah +; X86-32: # %bb.0: +; X86-32-NEXT: incb %ah +; X86-32-NEXT: movb %ah, (%edx) +; X86-32-NEXT: retl + + %t0 = lshr i32 %x, 8 %t1 = trunc i32 %t0 to i8 %t2 = add i8 %t1, 1 @@ -45,16 +82,35 @@ define void @bar32(i32 inreg %x, i8* inr } define void @bar16(i16 inreg %x, i8* inreg %p) nounwind { -; X86-64-LABEL: bar16: -; X86-64: shrl $8, %edi -; X86-64: incb %dil - -; WIN64-LABEL: bar16: -; WIN64: shrl $8, %ecx -; WIN64: incb %cl - +; X64-LABEL: bar16: +; X64: # %bb.0: +; X64-NEXT: shrl $8, %edi +; X64-NEXT: incb %dil +; X64-NEXT: movb %dil, (%rsi) +; X64-NEXT: retq +; +; X32-LABEL: bar16: +; X32: # %bb.0: +; X32-NEXT: shrl $8, %edi +; X32-NEXT: incb %dil +; X32-NEXT: movb %dil, (%esi) +; X32-NEXT: retq +; +; WIN64-LABEL: bar16: +; WIN64: # %bb.0: +; WIN64-NEXT: # kill: def $cx killed $cx def $ecx +; WIN64-NEXT: shrl $8, %ecx +; WIN64-NEXT: incb %cl +; WIN64-NEXT: movb %cl, (%rdx) +; WIN64-NEXT: retq +; ; X86-32-LABEL: bar16: -; X86-32: incb %ah +; X86-32: # %bb.0: +; X86-32-NEXT: incb %ah +; X86-32-NEXT: 
movb %ah, (%edx) +; X86-32-NEXT: retl + + %t0 = lshr i16 %x, 8 %t1 = trunc i16 %t0 to i8 %t2 = add i8 %t1, 1 @@ -64,14 +120,23 @@ define void @bar16(i16 inreg %x, i8* inr define i64 @qux64(i64 inreg %x) nounwind { ; X86-64-LABEL: qux64: -; X86-64: movq %rdi, %rax -; X86-64: movzbl %ah, %eax +; X86-64: # %bb.0: +; X86-64-NEXT: movq %rdi, %rax +; X86-64-NEXT: movzbl %ah, %eax +; X86-64-NEXT: retq +; +; WIN64-LABEL: qux64: +; WIN64: # %bb.0: +; WIN64-NEXT: movzbl %ch, %eax +; WIN64-NEXT: retq +; +; X86-32-LABEL: qux64: +; X86-32: # %bb.0: +; X86-32-NEXT: movzbl %ah, %eax +; X86-32-NEXT: xorl %edx, %edx +; X86-32-NEXT: retl -; WIN64-LABEL: qux64: -; WIN64: movzbl %ch, %eax -; X86-32-LABEL: qux64: -; X86-32: movzbl %ah, %eax %t0 = lshr i64 %x, 8 %t1 = and i64 %t0, 255 ret i64 %t1 @@ -79,14 +144,22 @@ define i64 @qux64(i64 inreg %x) nounwind define i32 @qux32(i32 inreg %x) nounwind { ; X86-64-LABEL: qux32: -; X86-64: movl %edi, %eax -; X86-64: movzbl %ah, %eax +; X86-64: # %bb.0: +; X86-64-NEXT: movl %edi, %eax +; X86-64-NEXT: movzbl %ah, %eax +; X86-64-NEXT: retq +; +; WIN64-LABEL: qux32: +; WIN64: # %bb.0: +; WIN64-NEXT: movzbl %ch, %eax +; WIN64-NEXT: retq +; +; X86-32-LABEL: qux32: +; X86-32: # %bb.0: +; X86-32-NEXT: movzbl %ah, %eax +; X86-32-NEXT: retl -; WIN64-LABEL: qux32: -; WIN64: movzbl %ch, %eax -; X86-32-LABEL: qux32: -; X86-32: movzbl %ah, %eax %t0 = lshr i32 %x, 8 %t1 = and i32 %t0, 255 ret i32 %t1 @@ -94,15 +167,26 @@ define i32 @qux32(i32 inreg %x) nounwind define i16 @qux16(i16 inreg %x) nounwind { ; X86-64-LABEL: qux16: -; X86-64: movl %edi, %eax -; X86-64: movzbl %ah, %eax +; X86-64: # %bb.0: +; X86-64-NEXT: movl %edi, %eax +; X86-64-NEXT: movzbl %ah, %eax +; X86-64-NEXT: # kill: def $ax killed $ax killed $eax +; X86-64-NEXT: retq +; +; WIN64-LABEL: qux16: +; WIN64: # %bb.0: +; WIN64-NEXT: movzwl %cx, %eax +; WIN64-NEXT: shrl $8, %eax +; WIN64-NEXT: # kill: def $ax killed $ax killed $eax +; WIN64-NEXT: retq +; +; X86-32-LABEL: qux16: +; X86-32: # 
%bb.0: +; X86-32-NEXT: movzbl %ah, %eax +; X86-32-NEXT: # kill: def $ax killed $ax killed $eax +; X86-32-NEXT: retl -; WIN64-LABEL: qux16: -; WIN64: movzwl %cx, %eax -; WIN64: shrl $8, %eax -; X86-32-LABEL: qux16: -; X86-32: movzbl %ah, %eax %t0 = lshr i16 %x, 8 ret i16 %t0 } Modified: llvm/trunk/test/CodeGen/X86/h-registers-3.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/h-registers-3.ll?rev=374748&r1=374747&r2=374748&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/h-registers-3.ll (original) +++ llvm/trunk/test/CodeGen/X86/h-registers-3.ll Sun Oct 13 18:41:04 2019 @@ -1,35 +1,46 @@ +; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py ; RUN: llc < %s -mtriple=i686-unknown-linux-gnu | FileCheck %s -check-prefix=X86 ; RUN: llc < %s -mtriple=x86_64-unknown-linux-gnu | FileCheck %s -check-prefix=X64 ; RUN: llc < %s -mtriple=x86_64-linux-gnux32 | FileCheck %s -check-prefix=X32 define zeroext i8 @foo() nounwind ssp { +; X86-LABEL: foo: +; X86: # %bb.0: # %entry +; X86-NEXT: subl $12, %esp +; X86-NEXT: calll bar +; X86-NEXT: movb %ah, %al +; X86-NEXT: addl $12, %esp +; X86-NEXT: retl +; +; X64-LABEL: foo: +; X64: # %bb.0: # %entry +; X64-NEXT: pushq %rax +; X64-NEXT: xorl %eax, %eax +; X64-NEXT: callq bar +; X64-NEXT: # kill: def $ax killed $ax def $eax +; X64-NEXT: shrl $8, %eax +; X64-NEXT: # kill: def $al killed $al killed $eax +; X64-NEXT: popq %rcx +; X64-NEXT: retq +; +; X32-LABEL: foo: +; X32: # %bb.0: # %entry +; X32-NEXT: pushq %rax +; X32-NEXT: xorl %eax, %eax +; X32-NEXT: callq bar +; X32-NEXT: # kill: def $ax killed $ax def $eax +; X32-NEXT: shrl $8, %eax +; X32-NEXT: # kill: def $al killed $al killed $eax +; X32-NEXT: popq %rcx +; X32-NEXT: retq entry: %0 = tail call zeroext i16 (...) 
@bar() nounwind
   %1 = lshr i16 %0, 8
   %2 = trunc i16 %1 to i8
   ret i8 %2
-; X86-LABEL: foo
-; X86: calll
-; X86-NEXT: movb %ah, %al
-; X86-NEXT: addl $12, %esp
-; X86-NEXT: retl
-; X64-LABEL: foo
-; X64: callq
-; X64-NEXT: # kill
-; X64-NEXT: shrl $8, %eax
-; X64-NEXT: # kill
-; X64-NEXT: popq
-; X64-NEXT: retq
-; X32-LABEL: foo
-; X32: callq
-; X32-NEXT: # kill
-; X32-NEXT: shrl $8, %eax
-; X32-NEXT: # kill
-; X32-NEXT: popq
-; X32-NEXT: retq
 }

 declare zeroext i16 @bar(...)

From llvm-commits at lists.llvm.org Sun Oct 13 18:41:59 2019
From: llvm-commits at lists.llvm.org (David Blaikie via Phabricator via llvm-commits)
Date: Mon, 14 Oct 2019 01:41:59 +0000 (UTC)
Subject: [PATCH] D68117: [DWARF-5] Support for C++11 defaulted, deleted member functions.
In-Reply-To:
References:
Message-ID: <0c3cea9aa5f3f168df2a546aeae37c55@localhost.localdomain>

dblaikie added a comment.

In D68117#1707578, @SouraVX wrote:

> In D68117#1702595, @probinson wrote:
>
> > We really do want to pack the four mutually exclusive cases into two bits. I have tried to give more explicit comments inline to explain how you would do this. It really should work fine, recognizing that the "not defaulted" case is not explicitly represented in the textual IR because it uses a zero value in the defaulted/deleted subfield of SPFlags.
>
>
> Thanks Paul, for suggesting this. Your approach works fine. But while I was working on some llvm-dwarfdump test cases, it turned out that we seem to miss one corner case.
> Consider this test case:
>
>   class foo {
>     foo() = default;
>     ~foo() = default;
>     void not_special() {}
>   };
>   void not_a_member_of_foo() {}
>
> Now I'm getting DW_AT_defaulted emitted with value DW_DEFAULTED_no for the functions "not_special" and "not_a_member_of_foo". This behavior is undesirable, since the DW_AT_defaulted attribute is only valid for C++ special member functions (constructors, destructors, ...).
>
> Please correct me if I'm wrong: this is attributable to the implicitly defined "0" NotDefaulted bit, which is getting checked (that's fine as long as we have dedicated bits for distinguishing) and is true for every subprogram or function in a CU.
>
>   void DwarfUnit::applySubprogramAttributes( ...
>   ...
>   else if (SP->isNotDefaulted())
>     addUInt(SPDie, dwarf::DW_AT_defaulted, dwarf::DW_FORM_data1,
>             dwarf::DW_DEFAULTED_no);
>   ...

Perhaps we should only emit DEFAULTED_yes, and assume anything that's not DEFAULTED_yes is... not defaulted?

Also: What features is anyone planning to build with this information? I'm sort of inclined not to implement features without some use-case in mind/planned.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68117/new/

https://reviews.llvm.org/D68117

From llvm-commits at lists.llvm.org Sun Oct 13 20:23:13 2019
From: llvm-commits at lists.llvm.org (Pengfei Wang via Phabricator via llvm-commits)
Date: Mon, 14 Oct 2019 03:23:13 +0000 (UTC)
Subject: [PATCH] D68757: [X86] Add strict fp support for instructions fadd/fsub/fmul/fdiv
In-Reply-To:
References:
Message-ID: <710dd66a56caf582c7599a5eb28ce6f4@localhost.localdomain>

pengfei updated this revision to Diff 224806.
pengfei added a comment.

Add the missing `attributes #0 = { strictfp }`

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68757/new/

https://reviews.llvm.org/D68757

Files:
  llvm/lib/Target/X86/X86ISelLowering.cpp
  llvm/lib/Target/X86/X86InstrAVX512.td
  llvm/lib/Target/X86/X86InstrSSE.td
  llvm/test/CodeGen/X86/fp-strict-scalar.ll
  llvm/test/CodeGen/X86/vec-strict-128.ll
  llvm/test/CodeGen/X86/vec-strict-256.ll
  llvm/test/CodeGen/X86/vec-strict-512.ll
  llvm/test/CodeGen/X86/vector-constrained-fp-intrinsics.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68757.224806.patch Type: text/x-patch Size: 41356 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Sun Oct 13 23:23:59 2019 From: llvm-commits at lists.llvm.org (Johannes Doerfert via Phabricator via llvm-commits) Date: Mon, 14 Oct 2019 06:23:59 +0000 (UTC) Subject: [PATCH] D68925: [Attributor] Liveness for values In-Reply-To: References: Message-ID: <2216b674dc3661b6000aa11f92c79bed@localhost.localdomain> jdoerfert updated this revision to Diff 224809. jdoerfert added a comment. Minor updates and fixes Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68925/new/ https://reviews.llvm.org/D68925 Files: llvm/include/llvm/Transforms/IPO/Attributor.h llvm/lib/Transforms/IPO/Attributor.cpp llvm/test/Transforms/FunctionAttrs/align.ll llvm/test/Transforms/FunctionAttrs/arg_nocapture.ll llvm/test/Transforms/FunctionAttrs/arg_returned.ll llvm/test/Transforms/FunctionAttrs/liveness.ll llvm/test/Transforms/FunctionAttrs/new_attributes.ll llvm/test/Transforms/FunctionAttrs/noalias_returned.ll llvm/test/Transforms/FunctionAttrs/nonnull.ll -------------- next part -------------- A non-text attachment was scrubbed... Name: D68925.224809.patch Type: text/x-patch Size: 33179 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Sun Oct 13 23:23:59 2019 From: llvm-commits at lists.llvm.org (Gil Rapaport via Phabricator via llvm-commits) Date: Mon, 14 Oct 2019 06:23:59 +0000 (UTC) Subject: [PATCH] D68577: [LV] Apply sink-after & interleave-groups as VPlan transformations (NFC) In-Reply-To: References: Message-ID: gilr updated this revision to Diff 224808. gilr added a comment. Applied review comments. Reused VPWidenMemoryInstructionRecipe's new getMask() in its execute(). 
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68577/new/ https://reviews.llvm.org/D68577 Files: llvm/lib/Transforms/Vectorize/LoopVectorizationPlanner.h llvm/lib/Transforms/Vectorize/LoopVectorize.cpp llvm/lib/Transforms/Vectorize/VPRecipeBuilder.h llvm/lib/Transforms/Vectorize/VPlan.cpp llvm/lib/Transforms/Vectorize/VPlan.h llvm/unittests/Transforms/Vectorize/VPlanTest.cpp -------------- next part -------------- A non-text attachment was scrubbed... Name: D68577.224808.patch Type: text/x-patch Size: 20719 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Sun Oct 13 23:25:47 2019 From: llvm-commits at lists.llvm.org (Johannes Doerfert via Phabricator via llvm-commits) Date: Mon, 14 Oct 2019 06:25:47 +0000 (UTC) Subject: [PATCH] D68932: [Attributor][MustExecute] Use optimistic information in MustBeExecutedContextExplorer Message-ID: jdoerfert created this revision. jdoerfert added reviewers: sstefan1, uenoku. Herald added subscribers: bollu, hiraditya. Herald added a project: LLVM. NOTE: This is a prototype. We need a way to keep MustBeExecutedContextExplorer updates separate from other updates so they can continue on, then this should help. This allows optimistic information to be used to answer the question if a call returns. This will also record dependences between attributes using the explorer and AAWillReturn/AANoUnwind that are used to improve the IR information. If the latter ever get fixed, the updates of the former are triggered and we have better known information available. Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D68932 Files: llvm/include/llvm/Analysis/MustExecute.h llvm/include/llvm/Transforms/IPO/Attributor.h llvm/lib/Analysis/MustExecute.cpp llvm/lib/Transforms/IPO/Attributor.cpp llvm/test/Transforms/FunctionAttrs/arg_nocapture.ll llvm/test/Transforms/FunctionAttrs/liveness.ll llvm/test/Transforms/FunctionAttrs/nonnull.ll -------------- next part -------------- A non-text attachment was scrubbed... 
Name: D68932.224810.patch Type: text/x-patch Size: 8694 bytes Desc: not available URL:

From llvm-commits at lists.llvm.org Sun Oct 13 23:34:36 2019
From: llvm-commits at lists.llvm.org (Johannes Doerfert via Phabricator via llvm-commits)
Date: Mon, 14 Oct 2019 06:34:36 +0000 (UTC)
Subject: [PATCH] D68933: [MustExecute] Forward iterate over conditional branches
Message-ID:

jdoerfert created this revision.
jdoerfert added reviewers: uenoku, sstefan1, hfinkel.
Herald added subscribers: bollu, hiraditya.
Herald added a project: LLVM.

If a conditional branch is encountered we can try to find a join block where the execution is known to continue. This means finding a suitable block, e.g., the immediate post dominator of the conditional branch, and proving that control will always reach that block. This patch implements different techniques that work with and without provided analysis.

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D68933

Files:
  llvm/include/llvm/Analysis/MustExecute.h
  llvm/lib/Analysis/MustExecute.cpp
  llvm/test/Analysis/MustExecute/must_be_executed_context.ll
  llvm/test/Transforms/FunctionAttrs/nonnull.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68933.224811.patch Type: text/x-patch Size: 11145 bytes Desc: not available URL:

From llvm-commits at lists.llvm.org Sun Oct 13 23:43:38 2019
From: llvm-commits at lists.llvm.org (Johannes Doerfert via Phabricator via llvm-commits)
Date: Mon, 14 Oct 2019 06:43:38 +0000 (UTC)
Subject: [PATCH] D68934: [Attributor] Make value simplify stronger
Message-ID:

jdoerfert created this revision.
jdoerfert added reviewers: uenoku, sstefan1, hfinkel.
Herald added subscribers: bollu, hiraditya.
Herald added a project: LLVM.

NOTE: This is still under testing on the test-suite

This patch makes value simplify stronger and consolidates features previously present in AAReturnedValuesImpl.

The new algorithm does not try to find a single value which is "simplified" but instead collects "assumed equivalent" values. If we find an inconsistency in our assumed equivalent value set, we revert it back to the original value. The trick is that a value might not look equivalent early, but as long as its AAValueSimplify object is not in a fixpoint state we can optimistically assume we have not yet found the link that will make the equivalence obvious. At manifest time we have to verify that all links are present, as we know all AAValueSimplify objects are in a fixpoint state now, even if they do not know it yet.

In addition, this patch adds simple constant folding and instruction folding logic.

The reason for this patch was the Adobe-C++/loop_unroll.cpp benchmark in the test suite. The compile time with the old implementation was bad. A test case in the spirit of loop_unroll.cpp was added to value_simplify.ll.

TODO: Other AAs and Attributor routines could use AAValueSimplify when they look at values.

Use update_test_checks for value_simplify.ll with the following patches applied and function signature printing enabled: D68851, D68850, D68819

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D68934

Files:
  llvm/include/llvm/Transforms/IPO/Attributor.h
  llvm/lib/Transforms/IPO/Attributor.cpp
  llvm/test/Transforms/FunctionAttrs/arg_nocapture.ll
  llvm/test/Transforms/FunctionAttrs/arg_returned.ll
  llvm/test/Transforms/FunctionAttrs/value-simplify.ll
  llvm/test/Transforms/FunctionAttrs/willreturn.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D68934.224812.patch Type: text/x-patch Size: 49256 bytes Desc: not available URL: From llvm-commits at lists.llvm.org Sun Oct 13 23:47:56 2019 From: llvm-commits at lists.llvm.org (Craig Topper via llvm-commits) Date: Mon, 14 Oct 2019 06:47:56 -0000 Subject: [llvm] r374755 - [X86] Teach EmitTest to handle ISD::SSUBO/USUBO in order to use the Z flag from the subtract directly during isel. Message-ID: <20191014064757.156608BDFC@lists.llvm.org> Author: ctopper Date: Sun Oct 13 23:47:56 2019 New Revision: 374755 URL: http://llvm.org/viewvc/llvm-project?rev=374755&view=rev Log: [X86] Teach EmitTest to handle ISD::SSUBO/USUBO in order to use the Z flag from the subtract directly during isel. This prevents isel from emitting a TEST instruction that optimizeCompareInstr will need to remove later. In some of the modified tests, the SUB gets duplicated due to the flags being needed in two places and being clobbered in between. optimizeCompareInstr was able to optimize away the TEST that was using the result of one of them, but optimizeCompareInstr doesn't know to turn SUB into CMP after removing the TEST. It only knows how to turn SUB into CMP if the result was already dead. With this change the TEST never exists, so optimizeCompareInstr doesn't have to remove it. Then it can just turn the SUB into CMP immediately. Fixes PR43649. 
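
For readers who want the shape of the computation without reading the assembly: the signed saturating subtract that the affected ssub_sat tests exercise needs two facts from one subtraction, namely whether it overflowed and the sign of the wrapped result. The following is only a sketch of that idea in C++, not code from this commit. `ssub_sat32` is a hypothetical name, and `__builtin_sub_overflow` is the GCC/Clang builtin, used here as a stand-in for a flag-producing subtract.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper (not from the commit): signed saturating subtract.
// __builtin_sub_overflow is a GCC/Clang builtin; it stores the wrapped
// (two's complement) difference and returns true if the exact result
// does not fit in int32_t.
int32_t ssub_sat32(int32_t x, int32_t y) {
  int32_t wrapped;
  bool overflow = __builtin_sub_overflow(x, y, &wrapped);

  // On overflow the wrapped result has the opposite sign of the exact one:
  // a non-negative wrapped value means the exact difference was below
  // INT32_MIN, a negative one means it was above INT32_MAX.
  int32_t saturated = wrapped >= 0 ? std::numeric_limits<int32_t>::min()
                                   : std::numeric_limits<int32_t>::max();

  return overflow ? saturated : wrapped;
}
```

Both the sign test and the overflow test read the outcome of the same `x - y`, which is presumably why the generated code ends up with one subtract feeding the flags and one feeding the result.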
Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp llvm/trunk/test/CodeGen/X86/known-bits.ll llvm/trunk/test/CodeGen/X86/ssub_sat.ll llvm/trunk/test/CodeGen/X86/ssub_sat_vec.ll Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=374755&r1=374754&r2=374755&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp (original) +++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Sun Oct 13 23:47:56 2019 @@ -20079,6 +20079,13 @@ static SDValue EmitTest(SDValue Op, unsi case X86ISD::XOR: case X86ISD::AND: return SDValue(Op.getNode(), 1); + case ISD::SSUBO: + case ISD::USUBO: { + // /USUBO/SSUBO will become a X86ISD::SUB and we can use its Z flag. + SDVTList VTs = DAG.getVTList(Op.getValueType(), MVT::i32); + return DAG.getNode(X86ISD::SUB, dl, VTs, Op->getOperand(0), + Op->getOperand(1)).getValue(1); + } default: default_case: break; Modified: llvm/trunk/test/CodeGen/X86/known-bits.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/known-bits.ll?rev=374755&r1=374754&r2=374755&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/known-bits.ll (original) +++ llvm/trunk/test/CodeGen/X86/known-bits.ll Sun Oct 13 23:47:56 2019 @@ -190,26 +190,22 @@ define {i32, i1} @knownbits_uaddo_saddo( define {i32, i1} @knownbits_usubo_ssubo(i64 %a0, i64 %a1) nounwind { ; X32-LABEL: knownbits_usubo_ssubo: ; X32: # %bb.0: -; X32-NEXT: pushl %ebx ; X32-NEXT: movl {{[0-9]+}}(%esp), %eax ; X32-NEXT: movl {{[0-9]+}}(%esp), %ecx -; X32-NEXT: movl %ecx, %edx -; X32-NEXT: subl %eax, %edx -; X32-NEXT: setb %bl -; X32-NEXT: testl %eax, %eax -; X32-NEXT: setns %al +; X32-NEXT: cmpl %eax, %ecx +; X32-NEXT: setb %dh +; X32-NEXT: setns %dl ; X32-NEXT: testl %ecx, %ecx ; X32-NEXT: setns %cl -; X32-NEXT: cmpb %al, %cl -; X32-NEXT: setne %al -; 
X32-NEXT: testl %edx, %edx -; X32-NEXT: setns %dl ; X32-NEXT: cmpb %dl, %cl +; X32-NEXT: setne %ch +; X32-NEXT: testl %eax, %eax +; X32-NEXT: setns %al +; X32-NEXT: cmpb %al, %cl ; X32-NEXT: setne %dl -; X32-NEXT: andb %al, %dl -; X32-NEXT: orb %bl, %dl +; X32-NEXT: andb %ch, %dl +; X32-NEXT: orb %dh, %dl ; X32-NEXT: xorl %eax, %eax -; X32-NEXT: popl %ebx ; X32-NEXT: retl ; ; X64-LABEL: knownbits_usubo_ssubo: Modified: llvm/trunk/test/CodeGen/X86/ssub_sat.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/ssub_sat.ll?rev=374755&r1=374754&r2=374755&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/ssub_sat.ll (original) +++ llvm/trunk/test/CodeGen/X86/ssub_sat.ll Sun Oct 13 23:47:56 2019 @@ -12,24 +12,20 @@ declare <4 x i32> @llvm.ssub.sat.v4i32(< define i32 @func(i32 %x, i32 %y) nounwind { ; X86-LABEL: func: ; X86: # %bb.0: -; X86-NEXT: pushl %esi ; X86-NEXT: movl {{[0-9]+}}(%esp), %eax ; X86-NEXT: movl {{[0-9]+}}(%esp), %edx ; X86-NEXT: xorl %ecx, %ecx -; X86-NEXT: movl %eax, %esi -; X86-NEXT: subl %edx, %esi +; X86-NEXT: cmpl %edx, %eax ; X86-NEXT: setns %cl ; X86-NEXT: addl $2147483647, %ecx # imm = 0x7FFFFFFF ; X86-NEXT: subl %edx, %eax ; X86-NEXT: cmovol %ecx, %eax -; X86-NEXT: popl %esi ; X86-NEXT: retl ; ; X64-LABEL: func: ; X64: # %bb.0: ; X64-NEXT: xorl %eax, %eax -; X64-NEXT: movl %edi, %ecx -; X64-NEXT: subl %esi, %ecx +; X64-NEXT: cmpl %esi, %edi ; X64-NEXT: setns %al ; X64-NEXT: addl $2147483647, %eax # imm = 0x7FFFFFFF ; X64-NEXT: subl %esi, %edi @@ -79,8 +75,7 @@ define i64 @func2(i64 %x, i64 %y) nounwi ; X64-LABEL: func2: ; X64: # %bb.0: ; X64-NEXT: xorl %ecx, %ecx -; X64-NEXT: movq %rdi, %rax -; X64-NEXT: subq %rsi, %rax +; X64-NEXT: cmpq %rsi, %rdi ; X64-NEXT: setns %cl ; X64-NEXT: movabsq $9223372036854775807, %rax # imm = 0x7FFFFFFFFFFFFFFF ; X64-NEXT: addq %rcx, %rax @@ -94,25 +89,21 @@ define i64 @func2(i64 %x, i64 %y) nounwi define i16 @func16(i16 %x, 
i16 %y) nounwind { ; X86-LABEL: func16: ; X86: # %bb.0: -; X86-NEXT: pushl %esi ; X86-NEXT: movzwl {{[0-9]+}}(%esp), %eax ; X86-NEXT: movzwl {{[0-9]+}}(%esp), %edx ; X86-NEXT: xorl %ecx, %ecx -; X86-NEXT: movl %eax, %esi -; X86-NEXT: subw %dx, %si +; X86-NEXT: cmpw %dx, %ax ; X86-NEXT: setns %cl ; X86-NEXT: addl $32767, %ecx # imm = 0x7FFF ; X86-NEXT: subw %dx, %ax ; X86-NEXT: cmovol %ecx, %eax ; X86-NEXT: # kill: def $ax killed $ax killed $eax -; X86-NEXT: popl %esi ; X86-NEXT: retl ; ; X64-LABEL: func16: ; X64: # %bb.0: ; X64-NEXT: xorl %eax, %eax -; X64-NEXT: movl %edi, %ecx -; X64-NEXT: subw %si, %cx +; X64-NEXT: cmpw %si, %di ; X64-NEXT: setns %al ; X64-NEXT: addl $32767, %eax # imm = 0x7FFF ; X64-NEXT: subw %si, %di @@ -129,8 +120,7 @@ define i8 @func8(i8 %x, i8 %y) nounwind ; X86-NEXT: movb {{[0-9]+}}(%esp), %al ; X86-NEXT: movb {{[0-9]+}}(%esp), %dl ; X86-NEXT: xorl %ecx, %ecx -; X86-NEXT: movb %al, %ah -; X86-NEXT: subb %dl, %ah +; X86-NEXT: cmpb %dl, %al ; X86-NEXT: setns %cl ; X86-NEXT: addl $127, %ecx ; X86-NEXT: subb %dl, %al @@ -142,8 +132,7 @@ define i8 @func8(i8 %x, i8 %y) nounwind ; X64-LABEL: func8: ; X64: # %bb.0: ; X64-NEXT: xorl %ecx, %ecx -; X64-NEXT: movl %edi, %eax -; X64-NEXT: subb %sil, %al +; X64-NEXT: cmpb %sil, %dil ; X64-NEXT: setns %cl ; X64-NEXT: addl $127, %ecx ; X64-NEXT: subb %sil, %dil @@ -163,8 +152,7 @@ define i4 @func3(i4 %x, i4 %y) nounwind ; X86-NEXT: shlb $4, %dl ; X86-NEXT: shlb $4, %al ; X86-NEXT: xorl %ecx, %ecx -; X86-NEXT: movb %al, %ah -; X86-NEXT: subb %dl, %ah +; X86-NEXT: cmpb %dl, %al ; X86-NEXT: setns %cl ; X86-NEXT: addl $127, %ecx ; X86-NEXT: subb %dl, %al @@ -179,8 +167,7 @@ define i4 @func3(i4 %x, i4 %y) nounwind ; X64-NEXT: shlb $4, %sil ; X64-NEXT: shlb $4, %dil ; X64-NEXT: xorl %ecx, %ecx -; X64-NEXT: movl %edi, %eax -; X64-NEXT: subb %sil, %al +; X64-NEXT: cmpb %sil, %dil ; X64-NEXT: setns %cl ; X64-NEXT: addl $127, %ecx ; X64-NEXT: subb %sil, %dil @@ -196,15 +183,13 @@ define i4 @func3(i4 %x, i4 %y) 
nounwind define <4 x i32> @vec(<4 x i32> %x, <4 x i32> %y) nounwind { ; X86-LABEL: vec: ; X86: # %bb.0: -; X86-NEXT: pushl %ebp ; X86-NEXT: pushl %ebx ; X86-NEXT: pushl %edi ; X86-NEXT: pushl %esi ; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx ; X86-NEXT: movl {{[0-9]+}}(%esp), %edx ; X86-NEXT: xorl %eax, %eax -; X86-NEXT: movl %ecx, %esi -; X86-NEXT: subl %edx, %esi +; X86-NEXT: cmpl %edx, %ecx ; X86-NEXT: setns %al ; X86-NEXT: addl $2147483647, %eax # imm = 0x7FFFFFFF ; X86-NEXT: subl %edx, %ecx @@ -212,8 +197,7 @@ define <4 x i32> @vec(<4 x i32> %x, <4 x ; X86-NEXT: cmovol %eax, %ecx ; X86-NEXT: movl {{[0-9]+}}(%esp), %esi ; X86-NEXT: xorl %eax, %eax -; X86-NEXT: movl %edx, %edi -; X86-NEXT: subl %esi, %edi +; X86-NEXT: cmpl %esi, %edx ; X86-NEXT: setns %al ; X86-NEXT: addl $2147483647, %eax # imm = 0x7FFFFFFF ; X86-NEXT: subl %esi, %edx @@ -221,8 +205,7 @@ define <4 x i32> @vec(<4 x i32> %x, <4 x ; X86-NEXT: cmovol %eax, %edx ; X86-NEXT: movl {{[0-9]+}}(%esp), %edi ; X86-NEXT: xorl %eax, %eax -; X86-NEXT: movl %esi, %ebx -; X86-NEXT: subl %edi, %ebx +; X86-NEXT: cmpl %edi, %esi ; X86-NEXT: setns %al ; X86-NEXT: addl $2147483647, %eax # imm = 0x7FFFFFFF ; X86-NEXT: subl %edi, %esi @@ -230,8 +213,7 @@ define <4 x i32> @vec(<4 x i32> %x, <4 x ; X86-NEXT: cmovol %eax, %esi ; X86-NEXT: movl {{[0-9]+}}(%esp), %eax ; X86-NEXT: xorl %ebx, %ebx -; X86-NEXT: movl %edi, %ebp -; X86-NEXT: subl %eax, %ebp +; X86-NEXT: cmpl %eax, %edi ; X86-NEXT: setns %bl ; X86-NEXT: addl $2147483647, %ebx # imm = 0x7FFFFFFF ; X86-NEXT: subl %eax, %edi @@ -244,7 +226,6 @@ define <4 x i32> @vec(<4 x i32> %x, <4 x ; X86-NEXT: popl %esi ; X86-NEXT: popl %edi ; X86-NEXT: popl %ebx -; X86-NEXT: popl %ebp ; X86-NEXT: retl $4 ; ; X64-LABEL: vec: Modified: llvm/trunk/test/CodeGen/X86/ssub_sat_vec.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/ssub_sat_vec.ll?rev=374755&r1=374754&r2=374755&view=diff ============================================================================== --- 
llvm/trunk/test/CodeGen/X86/ssub_sat_vec.ll (original)
+++ llvm/trunk/test/CodeGen/X86/ssub_sat_vec.ll Sun Oct 13 23:47:56 2019
@@ -408,30 +408,28 @@ define void @v12i16(<12 x i16>* %px, <12
 define void @v1i8(<1 x i8>* %px, <1 x i8>* %py, <1 x i8>* %pz) nounwind {
 ; SSE-LABEL: v1i8:
 ; SSE: # %bb.0:
-; SSE-NEXT: movb (%rdi), %cl
-; SSE-NEXT: movb (%rsi), %dil
+; SSE-NEXT: movb (%rdi), %al
+; SSE-NEXT: movb (%rsi), %cl
 ; SSE-NEXT: xorl %esi, %esi
-; SSE-NEXT: movl %ecx, %eax
-; SSE-NEXT: subb %dil, %al
+; SSE-NEXT: cmpb %cl, %al
 ; SSE-NEXT: setns %sil
 ; SSE-NEXT: addl $127, %esi
-; SSE-NEXT: subb %dil, %cl
-; SSE-NEXT: movzbl %cl, %eax
+; SSE-NEXT: subb %cl, %al
+; SSE-NEXT: movzbl %al, %eax
 ; SSE-NEXT: cmovol %esi, %eax
 ; SSE-NEXT: movb %al, (%rdx)
 ; SSE-NEXT: retq
 ;
 ; AVX-LABEL: v1i8:
 ; AVX: # %bb.0:
-; AVX-NEXT: movb (%rdi), %cl
-; AVX-NEXT: movb (%rsi), %dil
+; AVX-NEXT: movb (%rdi), %al
+; AVX-NEXT: movb (%rsi), %cl
 ; AVX-NEXT: xorl %esi, %esi
-; AVX-NEXT: movl %ecx, %eax
-; AVX-NEXT: subb %dil, %al
+; AVX-NEXT: cmpb %cl, %al
 ; AVX-NEXT: setns %sil
 ; AVX-NEXT: addl $127, %esi
-; AVX-NEXT: subb %dil, %cl
-; AVX-NEXT: movzbl %cl, %eax
+; AVX-NEXT: subb %cl, %al
+; AVX-NEXT: movzbl %al, %eax
 ; AVX-NEXT: cmovol %esi, %eax
 ; AVX-NEXT: movb %al, (%rdx)
 ; AVX-NEXT: retq
@@ -448,8 +446,7 @@ define void @v1i16(<1 x i16>* %px, <1 x
 ; SSE-NEXT: movzwl (%rdi), %eax
 ; SSE-NEXT: movzwl (%rsi), %ecx
 ; SSE-NEXT: xorl %esi, %esi
-; SSE-NEXT: movl %eax, %edi
-; SSE-NEXT: subw %cx, %di
+; SSE-NEXT: cmpw %cx, %ax
 ; SSE-NEXT: setns %sil
 ; SSE-NEXT: addl $32767, %esi # imm = 0x7FFF
 ; SSE-NEXT: subw %cx, %ax
@@ -462,8 +459,7 @@ define void @v1i16(<1 x i16>* %px, <1 x
 ; AVX-NEXT: movzwl (%rdi), %eax
 ; AVX-NEXT: movzwl (%rsi), %ecx
 ; AVX-NEXT: xorl %esi, %esi
-; AVX-NEXT: movl %eax, %edi
-; AVX-NEXT: subw %cx, %di
+; AVX-NEXT: cmpw %cx, %ax
 ; AVX-NEXT: setns %sil
 ; AVX-NEXT: addl $32767, %esi # imm = 0x7FFF
 ; AVX-NEXT: subw %cx, %ax

From llvm-commits at lists.llvm.org Sun Oct 13 
13:49:24 2019
From: llvm-commits at lists.llvm.org (Luís Marques via Phabricator via llvm-commits)
Date: Sun, 13 Oct 2019 20:49:24 +0000 (UTC)
Subject: [PATCH] D62686: [RISCV] Add support for save/restore of callee-saved registers via libcalls
In-Reply-To:
References:
Message-ID: <79d76d1219822ba45bd3cec1d9cce5e3@localhost.localdomain>

luismarques added a comment.

The priority for this patch is to address the issues reported by @apazos, but after that please check the clang-format output. In some cases in this patch it might make sense to use different formatting than clang-format indicates, but the remaining ones should be addressed.

@apazos Have you considered tweaking the patch code to not do a tail call, just to check if that's what's causing the remaining failures? I'm not sure if that's too hard, but it could turn out to be easier than drilling into the failing cases.

================
Comment at: llvm/lib/Target/RISCV/RISCVFrameLowering.cpp:27
+// registers.
+static int getLibCallID(const MachineFunction &MF,
+ const std::vector &CSI) {
----------------
The return value isn't used as just an opaque index; it also reflects the frame size and is used for that purpose. The function comment should probably reflect that.

================
Comment at: llvm/lib/Target/RISCV/RISCVFrameLowering.cpp:34
+
+ unsigned MaxReg = 0;
+ for (auto &CS : CSI)
----------------
Use `Register` and `RISCV::NoRegister`. (You'll have to use `MaxReg.id()` instead in the call to `max`).

================
Comment at: llvm/lib/Target/RISCV/RISCVFrameLowering.cpp:36
+ for (auto &CS : CSI)
+ if (CS.getFrameIdx() < 0)
+ MaxReg = std::max(MaxReg, CS.getReg());
----------------
Might be worth adding a small comment explaining how this serves as a filter for the registers we are interested in. Or point to a later relevant comment?

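For readers following the comments on lines 34 and 36 above, here is a minimal stand-alone sketch of the filter-and-maximum loop under review. It is not the patch's code: `SavedReg`, `NoReg` and `maxFixedSlotReg` are hypothetical names standing in for `CalleeSavedInfo`, `RISCV::NoRegister` and the body of `getLibCallID`, and it assumes that a negative frame index marks a fixed spill slot, which is what makes the `< 0` test act as a filter.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-in for llvm::CalleeSavedInfo: a register number plus
// the frame index assigned to its spill slot.
struct SavedReg {
  unsigned Reg;
  int FrameIdx;
};

// Register number 0 plays the role of "no register" in this sketch.
constexpr unsigned NoReg = 0;

// Returns the highest-numbered register whose spill slot has a negative
// frame index (assumed here to mean a fixed slot), or NoReg if no entry
// passes the filter.
unsigned maxFixedSlotReg(const std::vector<SavedReg> &CSI) {
  unsigned MaxReg = NoReg;
  for (const SavedReg &CS : CSI)
    if (CS.FrameIdx < 0) // the filter the review asks to document
      MaxReg = std::max(MaxReg, CS.Reg);
  return MaxReg;
}
```

With entries `{8, -3}`, `{9, 2}` and `{1, -1}`, only registers 8 and 1 pass the filter, so the sketch returns 8 even though register 9 is numbered higher.
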
================
Comment at: llvm/lib/Target/RISCV/RISCVFrameLowering.cpp:39
+
+ if (MaxReg == 0)
+ return -1;
----------------
Ditto `NoRegister`.

================
Comment at: llvm/lib/Target/RISCV/RISCVFrameLowering.cpp:66
+ const std::vector &CSI) {
+ static const char *const spillLibCalls[] = {
+ "__riscv_save_0",
----------------
Check LLVM naming convention capitalization. Ditto other vars here.

================
Comment at: llvm/lib/Target/RISCV/RISCVFrameLowering.cpp:93
+ const std::vector &CSI) {
+ static const char *const restoreLibCalls[] = {
+ "__riscv_restore_0",
----------------
Check LLVM naming convention capitalization. Ditto other vars here.

================
Comment at: llvm/lib/Target/RISCV/RISCVFrameLowering.cpp:190
+static std::vector
+getNonLibcallCSI(const std::vector &CSI) {
----------------
This could probably use `SmallVector`.

================
Comment at: llvm/lib/Target/RISCV/RISCVFrameLowering.cpp:706
+ for (auto &CS : reverse(NonLibcallCSI)) {
+ unsigned Reg = CS.getReg();
+ const TargetRegisterClass *RC = TRI->getMinimalPhysRegClass(Reg);
----------------
Ditto `Register`.

================
Comment at: llvm/lib/Target/RISCV/RISCVRegisterInfo.cpp:95
+// by save/restore libcalls.
+static const std::map FixedCSRFIMap = {
+ {/*ra*/ RISCV::X1, -1},
----------------
Use `IndexedMap` instead?

================
Comment at: llvm/test/CodeGen/RISCV/saverestore.ll:348
+
+; Check that functions with varargs do not use save/restore code
+
----------------
Maybe for these tests just put a -NOT check that __riscv_save_ isn't called?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D62686/new/

https://reviews.llvm.org/D62686

From llvm-commits at lists.llvm.org Sun Oct 13 16:15:20 2019
From: llvm-commits at lists.llvm.org (Luís Marques via Phabricator via llvm-commits)
Date: Sun, 13 Oct 2019 23:15:20 +0000 (UTC)
Subject: [PATCH] D67397: [RISCV] Add MachineInstr immediate verification
In-Reply-To:
References:
Message-ID:

luismarques updated this revision to Diff 224798.
luismarques added a comment.

Rebase and address iteration issue (hat tip to both commenters!).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D67397/new/

https://reviews.llvm.org/D67397

Files:
  llvm/lib/Target/RISCV/MCTargetDesc/RISCVMCTargetDesc.cpp
  llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
  llvm/lib/Target/RISCV/RISCVInstrInfo.h
  llvm/lib/Target/RISCV/RISCVInstrInfo.td
  llvm/lib/Target/RISCV/RISCVSubtarget.cpp
  llvm/lib/Target/RISCV/Utils/RISCVBaseInfo.h
  llvm/test/CodeGen/RISCV/verify-instr.mir

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D67397.224798.patch
Type: text/x-patch
Size: 8082 bytes
Desc: not available
URL: